This document presents a method for image completion using a novel neural network trained with generative adversarial networks. The network can fill in arbitrary image regions and generates globally and locally consistent completions. It uses a completion network trained with two auxiliary discriminators at the global and local level to produce realistic inpainting results. The method is able to generate novel textures and objects and outperforms previous approaches.
2. 2
Image Completion (Inpainting)
Filling in a part of images
Object removal, texture generation for occluded regions, …
Input Mask Image completion
3. 3
Related Work
Patch-based inpainting [Criminisi+ ’04; Wexler+ ’07; Simakov+ ’08;
Barnes+ ’09; Darabi+ ’12; Huang+ ’14]
Synthesize texture by collecting small image patches
Cannot preserve global structures
Cannot generate novel objects
Input
Patch-based inpainting [Barnes+ ’09]
Output OutputInput
4. 4
Related Work
Learning-based inpainting [Pathak+ ’16]
Learning inpainting with Generative Adversarial Networks (GAN) [Goodfellow+ ’14]
Fixed image resolution (128 × 128 pixels)
Fixed mask position and size (center, 64 × 64 pixels)
Tends to generate texture that is inconsistent with an input image
Input Output Input Output
5. 5
Our Method
Novel network for globally and locally consistent image completion
Completion network that is able to inpaint arbitrary regions
Adversarial training with two auxiliary networks
Can generate novel objects
Input Output Input Output
8. 8
Completion Network
Composed of 17 convolutional layers
Output an image filled in holes
Dilated Convolution Layer
Convolution Layer
Model output
OutputInput
Mask
9. 9
Importance of Spatial Support
Hole
𝑝1
𝑝2
𝑝1
𝑝2
Ω1
Ω2
Ω1
Ω2
Influencing region
𝑝1 can be computed
𝑝2 cannot be computed
Both can be computed
10. 10
Global and Local Discriminators
Recognize whether an input image is real or inpainted
Only use to train the completion network
Global Discriminator
Local Discriminator
Real image
or
Inpainted image
Inpainted Image
Inpainted region
11. 11
Training
Alternately update the completion network and discriminators
Based on Generative Adversarial Networks [Goodfellow+ ’14]
MSE(Mean Squared Error) is also used
Completion
Network
Global and Local
Discriminators
Trained to fool
the discriminators
Trained to recognize
whether an image is
inpainted or real
12. 12
Dataset
Places2 dataset [Zhou+ ’16]
About 8 million images with various scenes
Randomly generate a hole for training
Places2 dataset [Zhou+ ’16]
Input Ground truth
13. 13
Computational Time
Inpaint within a few seconds
Image Size Pixels CPU(sec) GPU(sec) Speedup
512 × 512 409,600 2.286 0.141 16.2 ×
768 × 768 589,824 4.933 0.312 15.8 ×
1024 × 1024 1,048,576 8.262 0.561 14.7 ×
18. 18
Training with Different Discriminator Configurations
Input
Mean Squared Error (MSE) MSE + Global discriminator
MSE + Local discriminator Full method
19. 19
Application to Specific Dataset
Fine-tuning the model using a specific dataset
Achieves more complicated inpainting
Face dataset [Liu+ ’15]
200,000 training images
2499 test images
Large-scale CelebFaces Attributes Dataset [Liu+ ’15]
23. 23
Conclusion
Novel network for globally and locally consistent image completion
Completion network for arbitrary region inpainting
Adversarial training with global and local
discriminators