Figure 1.

Figure 2.

Figure 3.

Comparison of image inpainting methods
| Feature | Progressive Image Inpainting | CNN | Transformer | Diffusion Models |
|---|---|---|---|---|
| Core Principles | Multi-stage processing (e.g., structure recovery followed by detail refinement, as in EdgeConnect's edge-prediction-and-filling stages) | Local feature extraction via convolutional kernels (e.g., Partial Conv's mask-aware convolution) | Global dependency modeling via self-attention (e.g., MAT's long-range reasoning) | Iterative denoising process for image generation (e.g., RePaint's stepwise restoration) |
| Key Strengths | High structural integrity | Strong local feature extraction | Robust global semantics | Highest generation quality |
| Key Weaknesses | High computational complexity | Limited receptive field | High resource consumption | Slow inference |
| Typical Use Cases | Complex structural restoration (e.g., artifact crack repair) | Small-area fast restoration (e.g., watermark removal from phone photos) | Large-area semantic restoration (e.g., street view occlusion removal) | High-fidelity detail generation (e.g., medical image super-resolution) |
| Computational Efficiency | Moderate (requires multiple forward passes) | High (parallelizable computations) | Low (quadratic attention complexity) | Very Low (hundreds of denoising steps) |
| Training Data Needs | Moderate (requires structural annotations like edge maps) | Moderate (millions of images) | Very High (billion-scale pretraining) | Very High (massive high-quality datasets) |
| Representative Methods | EdgeConnect, RFR-Net | Partial Conv, DeepFill | MAT, SwinIR | RePaint, DiffBIR |
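To make the CNN column's "mask-aware convolution" concrete, the following is a minimal single-channel NumPy sketch of the partial-convolution idea: only valid (unmasked) pixels contribute to each output, the result is renormalized by the fraction of valid pixels in the window, and the mask is updated for the next layer. The function name `partial_conv2d` and the plain-loop formulation are illustrative assumptions, not the original library implementation.

```python
import numpy as np

def partial_conv2d(image, mask, kernel):
    """Mask-aware (partial) convolution sketch.

    image  : 2-D array of pixel values
    mask   : 2-D array, 1 = valid pixel, 0 = hole
    kernel : 2-D convolution kernel

    Returns (output, updated_mask). Outputs over windows with at least
    one valid pixel are renormalized by (kernel area / valid count);
    windows with no valid pixels stay zero and remain masked.
    """
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    new_mask = np.zeros_like(out)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kh, j:j + kw]
            mpatch = mask[i:i + kh, j:j + kw]
            valid = mpatch.sum()
            if valid > 0:
                # Renormalize so partially masked windows are not dimmed.
                out[i, j] = (patch * mpatch * kernel).sum() * (kh * kw / valid)
                new_mask[i, j] = 1.0  # window now considered filled
    return out, new_mask
```

With an all-ones image and a small hole in the mask, the renormalization keeps the output at the same level as in fully valid regions, which is exactly why partial convolutions avoid the darkening artifacts of naive zero-filled convolution near hole boundaries.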