[SIGGRAPH 2017] Globally and Locally Consistent Image Completion

This deck presents an image completion (inpainting) method built on a convolutional completion network that is trained adversarially, in the style of generative adversarial networks (GANs). The network fills in arbitrary regions of an image and produces completions that are consistent both globally and locally. Two auxiliary discriminators, one global and one local, are used during training to make the inpainted results look realistic. The slides show the method generating novel textures and objects and compare it against earlier patch-based approaches.

Satoshi Iizuka, Edgar Simo-Serra, Hiroshi Ishikawa
Waseda University

2. Image Completion (Inpainting)
• Filling in a missing part of an image
• Applications: object removal, texture generation for occluded regions, …
[Figure: Input, Mask, Image completion]

3. Related Work
• Patch-based inpainting [Criminisi+ ’04; Wexler+ ’07; Simakov+ ’08; Barnes+ ’09; Darabi+ ’12; Huang+ ’14]
• Synthesizes texture by collecting small image patches
• Cannot preserve global structures
• Cannot generate novel objects
[Figure: Input/Output pairs for patch-based inpainting [Barnes+ ’09]]

4. Related Work
• Learning-based inpainting [Pathak+ ’16]
• Learns inpainting with Generative Adversarial Networks (GAN) [Goodfellow+ ’14]
• Fixed image resolution (128 × 128 pixels)
• Fixed mask position and size (center, 64 × 64 pixels)
• Tends to generate texture that is inconsistent with the input image
[Figure: Input/Output pairs]

5. Our Method
• Novel network for globally and locally consistent image completion
• Completion network that can inpaint arbitrary regions
• Adversarial training with two auxiliary networks
• Can generate novel objects
[Figure: Input/Output pairs]

7. Our Model
[Figure: architecture diagram. Labels: Image + Mask, Completion Network (with an H/2 × W/2 annotation), Model output, Output, Global Discriminator, Local Discriminator, Real or Inpainting]

8. Completion Network
• Composed of 17 convolutional layers
• Outputs an image with the holes filled in
[Figure: network diagram with Input, Mask, Model output, and Output; legend: Dilated Convolution Layer, Convolution Layer]
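
The slide separates the network's raw "Model output" from the final "Output". One plausible way to get from one to the other, sketched below under the assumption that the final image keeps the input pixels outside the hole and takes the network's pixels inside it. The function name `composite`, the mask convention (1 inside the hole), and the tiny list-of-lists images are hypothetical and only for illustration.

```python
def composite(image, mask, model_output):
    """Keep known pixels from `image`; take hole pixels from `model_output`.

    All three arguments are equally sized 2-D lists. `mask` holds 1 inside
    the hole and 0 elsewhere (an assumed convention, not from the slides).
    """
    return [
        [m * o + (1 - m) * x for x, m, o in zip(img_row, mask_row, out_row)]
        for img_row, mask_row, out_row in zip(image, mask, model_output)
    ]


# Toy single-channel example: only the masked positions change.
image = [[10, 20, 30], [40, 50, 60]]
mask = [[0, 1, 0], [0, 1, 1]]
model_output = [[11, 22, 33], [44, 55, 66]]
result = composite(image, mask, model_output)  # [[10, 22, 30], [40, 55, 66]]
```

With this kind of blend, pixels outside the hole are guaranteed to match the input exactly, whatever the network produces there.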

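9. Importance of Spatial Support
• A pixel inside the hole can be computed only if its influencing region reaches known pixels outside the hole
[Figure: a Hole containing pixels p1 and p2, with their influencing regions Ω1 and Ω2, shown for two cases. In the first, p1 can be computed but p2 cannot. In the second, both can be computed.]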
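
To make the idea of an influencing region concrete, the sketch below computes the receptive field of a stack of convolution layers using the standard recurrence for kernel size, stride, and dilation. The two layer stacks are hypothetical examples, not the configuration of the network in the slides. They only illustrate that dilation enlarges the influencing region without adding layers or changing the kernel size.

```python
def receptive_field(layers):
    """Return the receptive field (in input pixels) of a stack of conv layers.

    Each layer is a (kernel_size, stride, dilation) tuple.
    """
    rf, jump = 1, 1  # jump: distance in input pixels between adjacent outputs
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf


# Hypothetical stacks of four 3x3 layers (not the slides' architecture).
plain = [(3, 1, 1)] * 4
dilated = [(3, 1, 1), (3, 1, 2), (3, 1, 4), (3, 1, 8)]

rf_plain = receptive_field(plain)      # 9
rf_dilated = receptive_field(dilated)  # 31
```

In this toy setting, a hole wider than the receptive field would leave some pixels with no known context, which is the p2 case in the figure.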

10. Global and Local Discriminators
• Recognize whether an input image is real or inpainted
• Used only to train the completion network
[Figure: Global Discriminator and Local Discriminator; labels: Real image or Inpainted image, Inpainted image, Inpainted region]
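
The local discriminator looks at the inpainted region rather than the whole image. A minimal sketch of how such a patch could be located, assuming a fixed-size square window centered on the hole and shifted to stay inside the image. The function `local_crop_box`, the 256 × 256 image size, and the 128-pixel patch are illustrative assumptions, not values taken from the slides.

```python
def local_crop_box(hole, image_size, patch_size):
    """Return (top, left) of a patch_size square centered on the hole.

    `hole` is (top, left, height, width); `image_size` is (height, width).
    The box is shifted as needed so that it stays inside the image.
    Assumes patch_size is no larger than either image dimension.
    """
    top, left, hole_h, hole_w = hole
    img_h, img_w = image_size
    center_y = top + hole_h // 2
    center_x = left + hole_w // 2
    box_top = min(max(center_y - patch_size // 2, 0), img_h - patch_size)
    box_left = min(max(center_x - patch_size // 2, 0), img_w - patch_size)
    return box_top, box_left


# Hole near the left edge: the window is clamped to column 0.
near_edge = local_crop_box((100, 20, 40, 60), (256, 256), 128)    # (56, 0)
# Hole near the bottom-right corner: the window is clamped on both axes.
near_corner = local_crop_box((200, 200, 50, 50), (256, 256), 128)  # (128, 128)
```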

11. Training
• Alternately update the completion network and the discriminators
• Based on Generative Adversarial Networks [Goodfellow+ ’14]
• MSE (Mean Squared Error) is also used
[Figure: Completion Network: trained to fool the discriminators. Global and Local Discriminators: trained to recognize whether an image is inpainted or real.]
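
The two objectives named on this slide can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation. The MSE is assumed to be taken over hole pixels only. A single probability stands in for the combined judgment of the global and local discriminators, since the slide does not say how the two are combined. The weight `alpha` and all function names are hypothetical.

```python
import math


def masked_mse(output, target, mask):
    """Mean squared error over pixels where mask == 1 (flat lists).

    Assumes the mask selects at least one pixel.
    """
    squared = [(o - t) ** 2 for o, t, m in zip(output, target, mask) if m]
    return sum(squared) / len(squared)


def bce(prob, label):
    """Binary cross-entropy for one discriminator probability in (0, 1)."""
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))


def discriminator_loss(p_real, p_fake):
    """Discriminator step: push p_real toward 1 (real), p_fake toward 0."""
    return bce(p_real, 1) + bce(p_fake, 0)


def completion_loss(output, target, mask, p_fake, alpha):
    """Completion step: reconstruct the hole and fool the discriminator."""
    return masked_mse(output, target, mask) + alpha * bce(p_fake, 1)


# Toy values: three pixels, the last two inside the hole.
mse = masked_mse([0.5, 0.2, 0.9], [0.5, 0.4, 0.5], [0, 1, 1])
d_loss = discriminator_loss(p_real=0.5, p_fake=0.5)
c_loss = completion_loss([0.5, 0.2, 0.9], [0.5, 0.4, 0.5], [0, 1, 1],
                         p_fake=0.5, alpha=0.1)
```

In an alternating scheme, one step would lower `discriminator_loss` with the completion network held fixed, and the next would lower `completion_loss` with the discriminators held fixed.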

12. Dataset
• Places2 dataset [Zhou+ ’16]
• About 8 million images of various scenes
• Holes are generated randomly for training
[Figure: Places2 dataset [Zhou+ ’16]; Input, Ground truth]
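
Random hole generation could look like the sketch below. The slide does not state the hole shape or size range, so the single rectangular hole, the size bounds, and the function `random_hole_mask` are all assumptions made for illustration.

```python
import random


def random_hole_mask(height, width, min_size, max_size, rng):
    """Return (mask, box): a 0/1 mask with one random rectangle of 1s.

    `box` is (top, left, hole_height, hole_width). Assumes max_size does
    not exceed either image dimension.
    """
    hole_h = rng.randint(min_size, max_size)  # randint is inclusive
    hole_w = rng.randint(min_size, max_size)
    top = rng.randint(0, height - hole_h)
    left = rng.randint(0, width - hole_w)
    mask = [[0] * width for _ in range(height)]
    for y in range(top, top + hole_h):
        for x in range(left, left + hole_w):
            mask[y][x] = 1
    return mask, (top, left, hole_h, hole_w)


rng = random.Random(0)  # fixed seed so the example is repeatable
mask, (top, left, hole_h, hole_w) = random_hole_mask(32, 32, 8, 16, rng)
```

Drawing a fresh mask for every training sample, rather than reusing a fixed center square, is what lets a network learn to fill holes at arbitrary positions.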

13. Computational Time
• Inpaints an image within a few seconds

Image Size     Pixels      CPU (sec)   GPU (sec)   Speedup
512 × 512      262,144     2.286       0.141       16.2×
768 × 768      589,824     4.933       0.312       15.8×
1024 × 1024    1,048,576   8.262       0.561       14.7×
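
The arithmetic behind the table can be checked with a few lines. The sketch below recomputes the pixel count from each listed width and height, the CPU-to-GPU speedup, and the GPU time per pixel. The image sizes and timings are copied from the table and everything else is derived.

```python
rows = [
    # (width, height, cpu_seconds, gpu_seconds), as listed in the table
    (512, 512, 2.286, 0.141),
    (768, 768, 4.933, 0.312),
    (1024, 1024, 8.262, 0.561),
]

summary = []
for width, height, cpu, gpu in rows:
    pixels = width * height
    speedup = cpu / gpu
    gpu_us_per_pixel = gpu / pixels * 1e6  # microseconds per pixel
    summary.append((pixels, round(speedup, 1), gpu_us_per_pixel))

pixel_counts = [row[0] for row in summary]  # [262144, 589824, 1048576]
speedups = [row[1] for row in summary]      # [16.2, 15.8, 14.7]
```

The GPU time per pixel comes out nearly the same for all three rows, which suggests that the running time grows roughly linearly with the number of pixels.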

15. Results: Object Removal
[Figure: Input, Mask, Output]

16. Comparison
[Figure: Input, [Barnes+ ’09], [Darabi+ ’12], [Huang+ ’14], Ours]

17. Comparison
[Figure: Input, [Barnes+ ’09], [Darabi+ ’12], [Huang+ ’14], Ours]

18. Training with Different Discriminator Configurations
[Figure: Input, Mean Squared Error (MSE), MSE + Global discriminator, MSE + Local discriminator, Full method]

19. Application to Specific Dataset
• Fine-tuning the model on a specific dataset
• Achieves more complicated inpainting
• Face dataset [Liu+ ’15]
• 200,000 training images
• 2,499 test images
[Figure: Large-scale CelebFaces Attributes Dataset [Liu+ ’15]]

21. Results: Removing Sunglasses
[Figure: Original, Input, Output]

23. Conclusion
• Novel network for globally and locally consistent image completion
• Completion network for arbitrary region inpainting
• Adversarial training with global and local discriminators

6. Demo

14. Results: Image Completion

20. Results: Face Completion

22. Failure Case
[Figure: Input, Ours, Ground truth]
