
Semantic Image Inpainting

Mostly paper review of Semantic Image Inpainting with Deep Generative Models, R Yeh et al. CVPR 2017.
Prepared for Lab Seminar at SNU Datamining Center on 20180213.


  1. Semantic Image Inpainting: Semantic Image Inpainting with Deep Generative Models, R. Yeh et al., CVPR 2017. LAB SEMINAR, 2018.02.13, SNU DATAMINING CENTER, MINKI CHUNG
  2. TABLE OF CONTENTS ▸ Motivation ▸ What is image inpainting ▸ Problem statement ▸ Baseline ▸ Semantic Image Inpainting with Deep Generative Models ▸ My work ▸ Discussion
  3. MOTIVATION
  4. MOTIVATION ▸ What is image inpainting? https://www.youtube.com/watch?v=1F-6iRrgh1s
  5. MOTIVATION ▸ Objective: make an attentive inpainter. IF THE BACKGROUND OF THE OBJECT TO BE REMOVED IS SIMPLE, EXISTING METHODS WORK FINE; HOWEVER, IF THE BACKGROUND IS COMPLEX, ANOTHER METHOD IS NEEDED
  6. BASELINE ▸ Semantic Image Inpainting with Deep Generative Models, R. Yeh et al., CVPR 2017
  7. SEMANTIC IMAGE INPAINTING WITH DEEP GENERATIVE MODELS ▸ DCGAN-based ▸ Not end-to-end: ▸ 1. Train the generator first (on uncorrupted data) ▸ 2. Find z_hat for inpainting, using a contextual loss and a prior loss https://arxiv.org/abs/1607.07539
  8. SEMANTIC IMAGE INPAINTING WITH DEEP GENERATIVE MODELS ▸ Hypothesis: a trained G is efficient, so an image not from p_data (e.g., corrupted data) should not lie on the learned manifold of encodings z ▸ Objective: find the encoding z_hat "closest" to the corrupted image while being constrained to the manifold: z_hat = argmin_z { L_contextual(z | y, M) + L_prior(z) } ▸ y: corrupted image; M: binary mask (same size as the image) https://arxiv.org/abs/1607.07539
  9. SEMANTIC IMAGE INPAINTING WITH DEEP GENERATIVE MODELS ▸ Contextual loss: not simply the l1 norm between G(z) and the uncorrupted portion of the input image y; the corrupted area is taken into account through a weighting term W: L_contextual(z | y, M) = || W ⊙ (G(z) − y) ||_1 ▸ W_i = Σ_{j ∈ N(i)} (1 − M_j) / |N(i)| if M_i ≠ 0, and W_i = 0 otherwise, so uncorrupted pixels near a hole get a BIGGER WEIGHT ▸ W_i: importance weight at pixel location i; N(i): set of neighbors of pixel i in a local window ▸ y: corrupted image; M: binary mask (same size as the image) https://arxiv.org/abs/1607.07539
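The weighting scheme on this slide can be sketched in NumPy: each uncorrupted pixel is weighted by the fraction of corrupted pixels in its local window, and corrupted pixels get weight zero. A minimal sketch, assuming a 2-D 0/1 mask; the function name `importance_weights` and the default window size are illustrative, not from the paper's code.

```python
import numpy as np

def importance_weights(mask, window=7):
    """Per-pixel importance weights W for the contextual loss.

    mask: 2-D float array, 1 = uncorrupted pixel, 0 = corrupted (hole).
    A pixel bordering many hole pixels gets a larger weight; pixels far
    from any hole get weight ~0, and holes themselves get exactly 0.
    """
    pad = window // 2
    holes = 1.0 - mask                       # 1 where corrupted
    padded = np.pad(holes, pad, mode="constant")
    # fraction of corrupted neighbors in the local window around each pixel
    w = np.zeros_like(mask, dtype=float)
    for dy in range(window):
        for dx in range(window):
            w += padded[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
    w /= window * window
    return w * mask                          # zero out corrupted pixels
```

With this, pixels right at the hole boundary dominate the weighted l1 term, matching the "bigger weight near the hole" intuition on the slide.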
  10. SEMANTIC IMAGE INPAINTING WITH DEEP GENERATIVE MODELS ▸ Prior loss: measures how realistic the generated image is ▸ Identical to the GAN loss used for training the discriminator D: L_prior(z) = λ log(1 − D(G(z))) ▸ Without L_prior, the mapping from y to z may converge to a perceptually implausible result https://arxiv.org/abs/1607.07539
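Putting the two losses together, the z-optimization stage (step 2 on slide 7) can be sketched with PyTorch autograd. This is a hedged sketch, not the authors' implementation: `G` and `D` are assumed to be a pretrained, frozen DCGAN generator and discriminator, `G.z_dim` is an assumed attribute, and `W` is the importance-weight map from the contextual-loss slide.

```python
import torch

def find_z_hat(G, D, y, M, W, lam=0.003, steps=1000, lr=0.01):
    """Sketch of the z-optimization stage: minimize the contextual loss
    plus lam * prior loss over z, with G and D kept frozen.

    y: corrupted image (1, C, H, W); M: binary mask, 1 = known pixel;
    W: importance weights with the same spatial size as y.
    """
    z = torch.randn(1, G.z_dim, requires_grad=True)  # G.z_dim: assumed attribute
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        Gz = G(z)
        L_c = (W * (Gz - y).abs()).sum()        # weighted l1 on the known region
        L_p = torch.log(1.0 - D(Gz)).mean()     # minimizing pushes D(G(z)) toward "real"
        (L_c + lam * L_p).backward()
        opt.step()
    with torch.no_grad():
        # blend: keep known pixels from y, fill the hole from G(z_hat)
        return M * y + (1 - M) * G(z)
```

The final blend keeps the uncorrupted pixels of y untouched and only pastes generated content into the hole.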
  11. SEMANTIC IMAGE INPAINTING WITH DEEP GENERATIVE MODELS ▸ Tackling points: ▸ 1. Object-level occlusion: narrowing down to object removal ▸ 2. Contextual loss: a pixel that is very far away from any hole plays very little role in the inpainting process; what if it did? ▸ 3. Interpretation: want to see which pixels play the key role in deciding z_hat → attention
  12. MY WORK
  13. MY WORK ▸ 1. Object-level occlusion: narrowing down to object removal ▸ MS-COCO dataset, train set: 118,287 images ▸ COCO API: get instance annotations ▸ Use images that have a person instance smaller than 1/4 and bigger than 1/20 of the image ▸ 30,830 images remain (rescaled to 256x256)
  14. MY WORK ▸ 2. Limitation of the contextual loss: parts farther from the hole have less influence on inpainting ▸ Naive approach: for each grid cell of the image, find the pixel influence (attention_ratio) on finding the optimal z_hat ▸ Do this sequentially, grid cell by grid cell [figure: example attention_ratio grid (values 0.1–0.8) over an occluded image]
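One way to make the per-grid "attention_ratio" idea concrete (a hypothetical reading, not necessarily the exact method on the slide) is to score each grid cell by the mean absolute gradient of the inpainting loss with respect to that cell's pixels; `grid_influence` and its arguments are illustrative names.

```python
import torch

def grid_influence(loss_fn, y, grid=4):
    """Score each grid cell by the mean |gradient| of the loss w.r.t.
    that cell's pixels, normalized so the scores sum to 1.

    loss_fn: maps an image tensor (1, C, H, W) to a scalar loss
    (e.g., the contextual + prior loss evaluated at a fixed z).
    Returns a (grid, grid) tensor of influence scores.
    """
    y = y.clone().requires_grad_(True)
    loss_fn(y).backward()
    g = y.grad.abs().mean(dim=1)                 # (1, H, W), averaged over channels
    H, W = g.shape[-2:]
    cells = g.view(1, grid, H // grid, grid, W // grid)
    scores = cells.mean(dim=(0, 2, 4))           # (grid, grid)
    return scores / scores.sum()
```

Cells whose pixels barely move the loss get scores near zero, which is exactly the "far pixels play little role" effect the slide wants to visualize.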
  15. MY WORK ▸ 3. After finding the optimal attention_ratio for each grid cell ▸ Find the noise z_hat based on the "original × attn_ratio" image to reconstruct the image ▸ Visualization of pixel influence on inpainting [figure: ORIGINAL / MASKED / ORIGINAL×ATTN_RATIO]
  16. MY WORK ▸ However, because of computational inefficiency, the model is unable to learn ▸ (Current status) Rethinking the attention method [figure: WITHOUT ATTENTION, 1000 EPOCHS vs. WITH ATTENTION, 20 EPOCHS]
  17. ANY Q?
  18. REFERENCES ▸ Semantic Image Inpainting with Deep Generative Models, Raymond A. Yeh, Chen Chen, Teck Yian Lim, Alexander G. Schwing, Mark Hasegawa-Johnson, Minh N. Do, CVPR 2017, https://arxiv.org/abs/1607.07539 ▸ MS COCO dataset, http://cocodataset.org/#home
  19. END OF DOCUMENT
