Advances in Super-Resolution with Deep Learning

A summary of deep-learning-based super-resolution methods

Published in: Technology

  1. Advances in Super-Resolution with Deep Learning. Copyright © DeNA Co.,Ltd. All Rights Reserved.
  2. About the speaker: Hiroto Honda (@hirotomusiker)
     • Manufacturer research lab → joined DeNA in January 2017
     • Visiting researcher at ETH Zurich CVL (2013-2014)
     • CVPR NTIRE Workshop Program Committee member
     • AI R&D engineer at DeNA
     • Current work: Object Detection (OSS: https://github.com/DeNA/Chainer_Mask_R-CNN )
     • Previous work: Low-Level Vision, Computational, Sensor LSI
  3. Contents
     • Super-resolution is easy to try
     • Early SISR networks
       ⁃ SRCNN, ESPCN, VDSR
       ⁃ Upsampling methods: deconv or pixel shuffle
     • Baseline method: SRResNet
       ⁃ SRResNet, SRGAN, and EDSR
     • Super-resolution and perception
       ⁃ How the loss function affects the restored result
       ⁃ Perception-Distortion Tradeoff
     • Summary
  4. What is super-resolution? Restoring a high-resolution image from a low-resolution image.
  5. Super-resolution is easy to try! Resize the original (HR) image to obtain the LR image, then train on the pair. No annotation is needed: it is a form of self-supervised learning.
  6. Progress in super-resolution: accuracy has improved year after year.
     [Chart, 2014-2017: PSNR* gain over bicubic in dB on the Set5 dataset, x4. bicubic 0.0, A+ +1.86, SRCNN +2.06, ESPCN +2.48, VDSR +2.93, SRResNet +3.63, EDSR +4.20. PSNR data from: 5). See also https://github.com/jbhuang0604/SelfExSR]
     * PSNR = 10 log10(255^2 / MSE) when the max value is 255
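
The PSNR definition on this slide can be written out directly. The following is a minimal sketch in plain Python; the function names and the four sample pixel values are made up for illustration and are not from the slides.

```python
import math


def mse(a, b):
    """Mean squared error between two equally sized flat pixel sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(a, b, max_value=255.0):
    """PSNR in dB: 10 * log10(max_value^2 / MSE)."""
    err = mse(a, b)
    if err == 0:
        return math.inf  # identical images have no error
    return 10.0 * math.log10(max_value ** 2 / err)


# Hypothetical 4-pixel example (illustrative values only).
reference = [52, 55, 61, 59]
restored = [54, 55, 60, 57]
print(f"{psnr(reference, restored):.2f} dB")
```

Because PSNR is logarithmic in the MSE, a gain of 1 dB corresponds to reducing the MSE by a factor of about 1.26.
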
  7. Training a super-resolution network
     • Crop a patch from the ground-truth image: HR
     • Downsample the patch (e.g. bicubic down-sampling): LR = g(HR)
     • Assemble batches: {LR}, {HR}
     • Train the network f with the loss function MSE(HR, f(LR))
     • ...and that's it!
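
The data side of this recipe can be sketched in a few lines. This is an illustration under stated assumptions, not the speaker's pipeline: `downsample` uses block averaging as a stand-in for bicubic g, `nearest_upsample` is a placeholder for the network f (no training happens here), and the synthetic image and all function names are hypothetical.

```python
import random


def crop_patch(image, top, left, size):
    """Crop a size x size patch from a 2D list of pixel values."""
    return [row[left:left + size] for row in image[top:top + size]]


def downsample(patch, scale):
    """g(HR): average each scale x scale block (stand-in for bicubic)."""
    size = len(patch) // scale
    return [
        [
            sum(patch[y * scale + dy][x * scale + dx]
                for dy in range(scale) for dx in range(scale)) / scale ** 2
            for x in range(size)
        ]
        for y in range(size)
    ]


def nearest_upsample(patch, scale):
    """Placeholder for the network f: repeat each pixel scale x scale times."""
    return [[patch[y // scale][x // scale]
             for x in range(len(patch[0]) * scale)]
            for y in range(len(patch) * scale)]


def mse(a, b):
    """MSE(HR, f(LR)) over two 2D patches of the same shape."""
    diffs = [(p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def make_batch(image, batch_size, patch_size, scale, rng):
    """Build {LR}, {HR} from random crops of one ground-truth image."""
    lr_batch, hr_batch = [], []
    for _ in range(batch_size):
        top = rng.randrange(len(image) - patch_size + 1)
        left = rng.randrange(len(image[0]) - patch_size + 1)
        hr = crop_patch(image, top, left, patch_size)
        hr_batch.append(hr)
        lr_batch.append(downsample(hr, scale))
    return lr_batch, hr_batch


rng = random.Random(0)
image = [[(3 * x + 5 * y) % 256 for x in range(32)] for y in range(32)]
lr_batch, hr_batch = make_batch(image, batch_size=4, patch_size=8, scale=2, rng=rng)
loss = sum(mse(hr, nearest_upsample(lr, 2))
           for lr, hr in zip(lr_batch, hr_batch)) / len(hr_batch)
print(f"batch MSE with the placeholder f: {loss:.3f}")
```

In a real setup, f would be a trainable network and this loss would be minimized by gradient descent; only the pair construction carries over unchanged.
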
  8. Non-deep methods: dictionary-based algorithms. Baseline: A+ (2014) http://www.vision.ee.ethz.ch/~timofter/publications/Timofte-ACCV-2014.pdf
     [Figure: the LR patch is written as a weighted sum of learned dictionary atoms, and the coefficients (0, 0.8, 0.05 in the example) are optimized; the HR patch is the sum of the corresponding HR atoms with the same coefficients.]
  9. Early SISR networks
     ⁃ SRCNN, ESPCN, VDSR
     ⁃ Upsampling methods: deconv or pixel shuffle
  10. The first deep super-resolution network: SRCNN. The input is first upscaled with bicubic x2. Kernel sizes: 9-1-5, 9-3-5, or 9-5-5. Very simple, with low computational cost. from: 1)
  11. VDSR: a deeper SRCNN. 3x3 convolutions, 64 ch, depth D = 5 to 20. from: 3)
  12. Efficient sub-pixel CNN (ESPCN). Unlike SRCNN, it applies the convolutions to the LR image, which makes it efficient. Kernel sizes: 5-3-3. from: 2)
  13. Difference between SRCNN / VDSR and ESPCN. SRCNN and VDSR upsample the input first (bicubic x2); ESPCN upsamples at the output (pixel shuffle x2, rearranging ch into h and w). Post-upsampling is more efficient, but it cannot handle non-integer upsampling factors such as 1.6x.
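
The efficiency gap comes down to how many positions each convolution visits. The sketch below counts multiplications for a single layer; the 3x3, 64-channel layer is an assumed example, and the helper names are hypothetical.

```python
def conv_mults_per_output_pixel(kernel, in_ch, out_ch):
    """Multiplications one conv layer spends on each output position."""
    return kernel * kernel * in_ch * out_ch


def mults_per_lr_pixel(kernel, in_ch, out_ch, scale, runs_on_hr_grid):
    """Cost per LR input pixel; an HR-grid layer visits scale^2 positions."""
    positions = scale * scale if runs_on_hr_grid else 1
    return positions * conv_mults_per_output_pixel(kernel, in_ch, out_ch)


scale = 2
# Pre-upsampling (SRCNN / VDSR style): the layer runs on the upscaled grid.
pre = mults_per_lr_pixel(3, 64, 64, scale, runs_on_hr_grid=True)
# Post-upsampling (ESPCN style): the layer runs on the LR grid.
post = mults_per_lr_pixel(3, 64, 64, scale, runs_on_hr_grid=False)
print(pre, post, pre // post)
```

For an integer factor r, every layer that runs before the upsampling step is r^2 times cheaper per input pixel than the same layer run after it.
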
  14. Upscaling with a CNN: deconvolution or PixelShuffle? Deconvolution: the number of input pixels contributing to each output position is not uniform, so a checkerboard pattern appears. https://distill.pub/2016/deconv-checkerboard/
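
The uneven overlap is easy to see in one dimension. This sketch counts how many input pixels write to each output position of a strided deconvolution; the sizes are arbitrary examples and the function name is hypothetical.

```python
def contribution_counts(num_inputs, kernel, stride):
    """Count how many input pixels write to each output position (1D)."""
    out_len = (num_inputs - 1) * stride + kernel
    counts = [0] * out_len
    for i in range(num_inputs):
        for k in range(kernel):
            # Input i spreads its kernel over outputs i*stride .. i*stride+kernel-1.
            counts[i * stride + k] += 1
    return counts


uneven = contribution_counts(num_inputs=5, kernel=3, stride=2)
even = contribution_counts(num_inputs=5, kernel=4, stride=2)
print(uneven)  # alternates between 1 and 2 in the interior
print(even)    # constant in the interior
```

With kernel size 3 and stride 2 the interior alternates between one and two contributions, which is the periodic pattern the linked article describes; when the kernel size is divisible by the stride the interior overlap is uniform.
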
  15. Upscaling with a CNN: deconvolution or PixelShuffle? What about resize followed by convolution? The checkerboard pattern disappears. Because resizing (a low-pass operation) may lose information, filling with nearest neighbor is another option.
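
A one-dimensional sketch of resize-then-convolve, assuming nearest-neighbor resizing and a fixed smoothing kernel in place of learned weights (both choices, and the function names, are illustrative only):

```python
def nearest_resize(signal, scale):
    """Nearest-neighbor upsampling: repeat every sample `scale` times."""
    return [v for v in signal for _ in range(scale)]


def conv1d_same(signal, kernel):
    """Stride-1 convolution with zero padding; expects an odd kernel length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(padded[i + k] * w for k, w in enumerate(kernel))
        for i in range(len(signal))
    ]


lr = [1.0, 1.0, 1.0, 1.0]
up = nearest_resize(lr, 2)
out = conv1d_same(up, [0.25, 0.5, 0.25])
print(out)
```

Since the convolution runs at stride 1, every interior output position receives the same number of taps, so a flat input stays flat; only the zero-padded borders differ.
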
  16. Upscaling with a CNN: deconvolution or PixelShuffle? Sub-pixel convolution (a.k.a. PixelShuffle): the channel information at each position is tiled spatially, e.g. 9 channels -> 3x3 sub-pixels. It is not free of checkerboard noise. from: 2)
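
The tiling step can be sketched on nested lists. The channel ordering used here (channel index = c*r*r + row offset*r + column offset) is one common convention and is an assumption; frameworks may order channels differently, and the function name is hypothetical.

```python
def pixel_shuffle(x, r):
    """Rearrange [C*r*r][H][W] nested lists into [C][H*r][W*r]."""
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    if channels % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    out_ch = channels // (r * r)
    return [
        [
            [
                # Output pixel (oy, ox) reads LR position (oy // r, ox // r)
                # from the channel selected by its sub-pixel offset.
                x[c * r * r + (oy % r) * r + (ox % r)][oy // r][ox // r]
                for ox in range(width * r)
            ]
            for oy in range(height * r)
        ]
        for c in range(out_ch)
    ]


# 9 channels at one spatial position become one 3x3 block of sub-pixels.
x = [[[float(ch)]] for ch in range(9)]
y = pixel_shuffle(x, 3)
for row in y[0]:
    print(row)
```

The operation only moves values around; all learned computation sits in the convolution that produces the C*r*r channels on the LR grid.
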
  17. Baseline method: SRResNet
      ⁃ SRResNet, SRGAN, and EDSR
  18. SRResNet and SRGAN (Twitter, CVPR'17). 24 residual blocks, 64 ch, with a skip connection and pixel shuffle x2 upsampling (rearranging ch into h and w). from: 4)
      • Three loss functions: MSE loss, perceptual loss (trained VGG), and discriminator loss
      • When only MSE is used, the network is called SRResNet
  19. SRResNet* and SRGAN: network details. Residual blocks with skip connections; pixel shuffle upsampling. from: 4)
  20. Enhanced Deep Super Resolution (EDSR), Seoul National University: tuned for even higher accuracy. 32 residual blocks, 256 ch, skip connection, two x2 upsampling steps, l1 loss. from: 5)
  21. PSNR and visual appearance. In the 20 dB range, a 1 dB difference clearly changes how the image looks. from: 5)
  22. Super-resolution and perception
      ⁃ How the loss function affects the restored result
      ⁃ Perception-Distortion Tradeoff
  23. Subjective quality and PSNR. Method → PSNR: bicubic 21.59 dB, SRResNet 25.53 dB, SRGAN 21.15 dB, shown next to the original. from: 4)
  24. SRResNet and SRGAN: how much the result changes with the loss. [Table of loss combinations per result: MSE loss ● ●; perceptual loss using VGG ●; discriminator loss ● ●. One result is marked as having the highest PSNR.] from: 4)
  25. Three types of loss function, ordered from low distortion to good perception:
      • (1) l1/l2 loss: compares the generated image with the ground truth directly
      • (2) perceptual loss: multi-scale feature matching between the generated image and the ground truth through VGG
      • (3) GAN loss: a discriminator classifies the generated image and the ground truth as real / fake
  26. Perception-Distortion Tradeoff. No method can achieve low distortion and good perceptual quality at the same time → understanding the tradeoff is important. from: 8)
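
One way to see why a distortion loss pulls away from natural-looking output is a toy case where a single LR input has two equally plausible HR answers. The pixel values and names below are invented for illustration; this is a sketch of the intuition, not a result from reference 8).

```python
def l2_loss(pred, target):
    """Mean squared error over flat pixel sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


# Two equally plausible ground truths for the same LR input (toy values).
candidates = [[0.0, 255.0], [255.0, 0.0]]


def expected_l2(pred):
    """Average l2 loss when either candidate may be the true HR patch."""
    return sum(l2_loss(pred, c) for c in candidates) / len(candidates)


sharp = [0.0, 255.0]     # commits to one plausible answer
blurry = [127.5, 127.5]  # pixel-wise average of the plausible answers
print(expected_l2(sharp), expected_l2(blurry))
```

The averaged prediction has the lower expected l2 loss even though it matches neither plausible answer, which is consistent with MSE-trained outputs looking smooth while scoring well on PSNR.
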
  27. What is the goal of super-resolution? Accurate (faithful restoration) or plausible (natural-looking restoration): which one to choose depends on the application! Source: 4)
  28. Summary
  29. Progress on SISR: accuracy and speed.
      [Chart, 2014-2017: PSNR gain over bicubic in dB on the Set5 dataset, x4. bicubic 0.0, A+ +1.86, SRCNN +2.06, ESPCN +2.48, VDSR +2.93, SRResNet +3.63, EDSR +4.20. Mega-multiplications per input pixel for x2 restoration: 0.44, 0.04, 0.74, 1.33, 40.7. PSNR data source: 5)]
      The computational cost changes greatly with
      • the image size that passes through the CNN
      • the number of channels in the intermediate layers
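
The effect of channel count on cost can be checked with a rough count for EDSR at x2. The block and channel numbers come from the EDSR slide; the layer layout (3x3 kernels everywhere, a head conv, one conv after the blocks, one conv feeding pixel shuffle, a tail conv on the HR grid) is an assumption made for this sketch, and the function names are hypothetical.

```python
def conv_mults(kernel, in_ch, out_ch, positions=1):
    """Multiplications for one conv layer per LR input pixel."""
    return kernel * kernel * in_ch * out_ch * positions


def edsr_x2_mults(blocks=32, ch=256, rgb=3, scale=2):
    """Count multiplications per LR input pixel for the assumed layout."""
    total = conv_mults(3, rgb, ch)                   # head conv
    total += blocks * 2 * conv_mults(3, ch, ch)      # residual blocks, 2 convs each
    total += conv_mults(3, ch, ch)                   # conv after the blocks
    total += conv_mults(3, ch, ch * scale * scale)   # expands channels for pixel shuffle
    total += conv_mults(3, ch, rgb, positions=scale * scale)  # tail conv on the HR grid
    return total


total = edsr_x2_mults()
print(f"{total / 1e6:.1f} mega-multiplications per input pixel")
```

Under these assumptions the count comes to about 40.7 million, in line with the largest value on the chart, and the residual blocks account for most of it because their cost grows with the square of the channel count.
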
  30. Detailed benchmark from the NTIRE 2017 super-resolution challenge: EDSR, SRResNet, VDSR, ESPCN, SRCNN, A+. from: 9)
  31. Summary
      • Deep methods are now the mainstream in super-resolution: high accuracy, but high computational cost
      • Chained residual blocks with skip connections and pixel shuffle upsampling are important
      • SRResNet-based methods are the baseline
      • 'Accurate' or 'plausible' depends on the application
  32. Appendix: Residual Dense Network for Super-Resolution. A DenseNet-based SRResNet. from: 6)
  33. Appendix: Deep Back-Projection Networks For Super-Resolution (best PSNR in NTIRE '18 x8 bicubic downsampling track). from: 7)
  34. Datasets
      • DIV2K dataset (train, val) https://data.vision.ee.ethz.ch/cvl/DIV2K/
      • Set5 dataset (test) http://people.rennes.inria.fr/Aline.Roumy/results/SR_BMVC12.html
      • B100 dataset (test) https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/
      • Urban100 dataset (test) https://sites.google.com/site/jbhuang0604/publications/struct_sr
  35. Competitions
      • NTIRE2017: New Trends in Image Restoration and Enhancement workshop and challenge on image super-resolution, in conjunction with CVPR 2017. http://www.vision.ee.ethz.ch/ntire17/ report: http://www.vision.ee.ethz.ch/~timofter/publications/Timofte-CVPRW-2017.pdf
      • NTIRE2018: New Trends in Image Restoration and Enhancement workshop and challenge on super-resolution, dehazing, and spectral reconstruction, in conjunction with CVPR 2018. http://www.vision.ee.ethz.ch/ntire18/ report: http://openaccess.thecvf.com/content_cvpr_2018_workshops/papers/w13/Timofte_NTIRE_2018_Challenge_CVPR_2018_paper.pdf
      • PIRM2018: Workshop and Challenge on Perceptual Image Restoration and Manipulation, in conjunction with ECCV 2018. https://www.pirm2018.org/
  36. References
      1) Dong et al., Image Super-Resolution Using Deep Convolutional Networks, https://arxiv.org/abs/1501.00092
      2) Shi et al., Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network, https://arxiv.org/abs/1609.05158
      3) Kim et al., Accurate Image Super-Resolution Using Very Deep Convolutional Networks, https://arxiv.org/pdf/1511.04587
      4) Ledig et al., Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network, https://arxiv.org/abs/1609.04802
      5) Lim et al., Enhanced Deep Residual Networks for Single Image Super-Resolution, https://arxiv.org/abs/1707.02921
      6) Zhang et al., Residual Dense Network for Image Super-Resolution, https://arxiv.org/abs/1802.08797
      7) Haris et al., Deep Back-Projection Networks For Super-Resolution, https://arxiv.org/pdf/1803.02735.pdf
      8) Blau et al., Perception Distortion Tradeoff, https://arxiv.org/abs/1711.06077
      9) Timofte et al., NTIRE 2017 Challenge on Single Image Super-Resolution: Methods and Results, http://www.vision.ee.ethz.ch/~timofter/publications/Timofte-CVPRW-2017.pdf
