[SIGGRAPH 2017] Globally and Locally Consistent Image Completion

Satoshi Iizuka Edgar Simo-Serra Hiroshi Ishikawa
Waseda University

2
Image Completion (Inpainting)
 Filling in a part of images
 Object removal, texture generation for occluded regions, …
Input Mask Image completion

3
Related Work
 Patch-based inpainting [Criminisi+ ’04; Wexler+ ’07; Simakov+ ’08;
Barnes+ ’09; Darabi+ ’12; Huang+ ’14]
 Synthesize texture by collecting small image patches
 Cannot preserve global structures
 Cannot generate novel objects
Input
Patch-based inpainting [Barnes+ ’09]
Output OutputInput

4
Related Work
 Learning-based inpainting [Pathak+ ’16]
 Learning inpainting with Generative Adversarial Networks (GAN) [Goodfellow+ ’14]
 Fixed image resolution (128 × 128 pixels)
 Fixed mask position and size (center, 64 × 64 pixels)
 Tends to generate texture that is inconsistent with an input image
Input Output Input Output

5
Our Method
 Novel network for globally and locally consistent image completion
 Completion network that is able to inpaint arbitrary regions
 Adversarial training with two auxiliary networks
 Can generate novel objects
Input Output Input Output

7
Our Model
𝐻
2
×
𝑊
2
Global Discriminator
Local Discriminator
Completion Network
Image
+
Mask
Real
or
Inpainting
Model output
Output

8
Completion Network
 Composed of 17 convolutional layers
 Output an image filled in holes
Dilated Convolution Layer
Convolution Layer
Model output
OutputInput
Mask

9
Importance of Spatial Support
Hole
𝑝1
𝑝2
𝑝1
𝑝2
Ω1
Ω2
Ω1
Ω2
Influencing region
𝑝1 can be computed
𝑝2 cannot be computed
Both can be computed

10
Global and Local Discriminators
 Recognize whether an input image is real or inpainted
 Only use to train the completion network
Global Discriminator
Local Discriminator
Real image
or
Inpainted image
Inpainted Image
Inpainted region

11
Training
 Alternately update the completion network and discriminators
 Based on Generative Adversarial Networks [Goodfellow+ ’14]
 MSE（Mean Squared Error） is also used
Completion
Network
Global and Local
Discriminators
Trained to fool
the discriminators
Trained to recognize
whether an image is
inpainted or real

12
Dataset
 Places2 dataset [Zhou+ ’16]
 About 8 million images with various scenes
 Randomly generate a hole for training
Places2 dataset [Zhou+ ’16]
Input Ground truth

13
Computational Time
 Inpaint within a few seconds
Image Size Pixels CPU(sec) GPU(sec) Speedup
512 × 512 409,600 2.286 0.141 16.2 ×
768 × 768 589,824 4.933 0.312 15.8 ×
1024 × 1024 1,048,576 8.262 0.561 14.7 ×

15
Results: Object Removal
Input Mask Output

16
Comparison
Input
[Barnes+ ’09] [Darabi+ ’12]
Ours[Huang+ ’14]

17
Comparison
Input
[Barnes+ ’09] [Darabi+ ’12]
[Huang+ ’14] Ours

18
Training with Different Discriminator Configurations
Input
Mean Squared Error (MSE) MSE + Global discriminator
MSE + Local discriminator Full method

19
Application to Specific Dataset
 Fine-tuning the model using a specific dataset
 Achieves more complicated inpainting
 Face dataset [Liu+ ’15]
 200,000 training images
 2499 test images
Large-scale CelebFaces Attributes Dataset [Liu+ ’15]

21
Results: Removing Sunglasses
Original Input Output

22
Failure Case
Input Ours Ground truth

23
Conclusion
 Novel network for globally and locally consistent image completion
 Completion network for arbitrary region inpainting
 Adversarial training with global and local
discriminators

[SIGGRAPH 2017] Globally and Locally Consistent Image Completion

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to [SIGGRAPH 2017] Globally and Locally Consistent Image Completion

Similar to [SIGGRAPH 2017] Globally and Locally Consistent Image Completion (20)

Recently uploaded

Recently uploaded (20)

[SIGGRAPH 2017] Globally and Locally Consistent Image Completion