CVPR 2018 On-Site Report

A report on attending CVPR 2018, with introductions to selected papers.
Slides presented at the 46th Computer Vision Study Group @ Kanto (part 1), July 1, 2018.

  1. CVPR 2018 On-Site Report. July 1, 2018. Hiroto Honda, AI System Dept., DeNA Co., Ltd. Copyright © DeNA Co.,Ltd. All Rights Reserved.
  2. Self-introduction
     • Hiroto Honda (@hirotomusiker)
     • Manufacturer's research lab → joined DeNA in January 2017
     • Visiting researcher at ETH Zurich CVL (2013-2014)
     • CVPR NTIRE Workshop Program Committee
     • AI R&D engineer at DeNA
     • Current work: Object Detection (OSS: https://github.com/DeNA/Chainer_Mask_R-CNN )
     • Previous work: Low-Level Vision, Computational, Sensor LSI (CVPRW'15)
  3. Outline: 1. CVPR overview; 2. Awarded Papers; 3. Yes, we GAN! – GANs and perception; 4. Dense Detection & Recognition; 5. Knowledge Transfer; 6. Basic Nets; 7. Conclusion
  5. Section 1: CVPR Overview. A top conference in computer vision, held this year in Salt Lake City, Utah. http://cvpr2018.thecvf.com/
  6. Salt Lake City
  7. Social Event
  8. Number of attendees: more than 6,000 people attended. The registration line stretched outside the venue. http://cvpr2018.thecvf.com/files/CVPR%202018%20Opening%202.pdf
  9. Number of papers: acceptance 29.6%; orals (12 min) 2.1%; spotlights (4 min) 6.8%. http://cvpr2018.thecvf.com/files/CVPR%202018%20Opening%202.pdf
  10. Workshops (a selection)
     • 1st International Workshop on Deep Learning for Visual SLAM
     • New Trends in Image Restoration and Enhancement workshop and challenges
     • Workshop on Autonomous Driving
     • Language and Vision
     • Perception Beyond the Visible Spectrum (PBVS 2018)
     • Efficient Deep Learning for Computer Vision
     • The First Workshop on Joint Detection, Tracking, and Prediction in the wild
     The workshops covered a wide range of topics, including SLAM, autonomous driving, image restoration, and edge devices.
  11. Competitions
     • Low-Power Image Recognition Challenge
     • NVIDIA AI City Challenge
     • DeepGlobe: A Challenge for Parsing the Earth through Satellite Images
     • Workshop on Autonomous Driving
     • VQA Challenge and Visual Dialog Workshop
     • Visual Understanding of Humans in Crowd Scene and the 2nd Look Into Person (LIP) Challenge
     • Robust Vision Challenge
     • Workshop and Challenge on Learnt Image Compression
     • Large-Scale Landmark Recognition: A Challenge
     • The DAVIS Challenge on Video Object Segmentation 2018
     • Bridging the Gap between Computational Photography and Visual Recognition: the UG^2 Prize Challenge
     • NTIRE: 3rd New Trends in Image Restoration and Enhancement workshop and challenges
     • ActivityNet Large Scale Activity Recognition Challenge 2018
     • The 2nd CVPR Workshop on Visual Understanding by Learning from Web Data (WebVision)
     Topics include video segmentation, satellite imagery, image compression, and mobile implementation.
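  12. How we attended this time: we tried real-time "Slacking". Orals: shared the key slides and the Q&A content. Posters: took questions from the team over Slack, approached the authors and discussed with them, then posted the answers to Slack on the spot. This let us share the atmosphere of the venue and draw out fine details. Authors are happy to explain even the murky parts of an implementation if you ask!
  13. Research trends: compared with ICCV'17, 3D and image generation appear more prominent. Categories in the chart: 3D; recognition and detection; Language & Vision; image generation; video; networks and learning; computational; image restoration and compression. Tallied from 287 oral / spotlight papers.
  14. Research trends: counted the number of paper titles containing each keyword. Code: http://jponttuset.cat/are-gans-the-new-deep/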
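The keyword count on this slide can be sketched in a few lines. This is a minimal illustration, not the code from the linked post: the function name `count_titles_with_keywords` is hypothetical, and the example titles are taken from this deck's reference list rather than from the full CVPR title list.

```python
def count_titles_with_keywords(titles, keywords):
    """Return {keyword: number of titles containing it}, case-insensitively."""
    counts = {keyword: 0 for keyword in keywords}
    for title in titles:
        lowered = title.lower()
        for keyword in keywords:
            # Plain substring match: "gan" matches "StarGAN", but it would
            # also match unrelated words such as "organ" (assumed acceptable here).
            if keyword.lower() in lowered:
                counts[keyword] += 1
    return counts


# Example titles borrowed from the reference list of this deck.
example_titles = [
    "StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation",
    "PairedCycleGAN: Asymmetric Style Transfer for Applying and Removing Makeup",
    "Deep Learning for Graph Matching",
    "MegDet: A Large Mini-Batch Object Detector",
]
example_counts = count_titles_with_keywords(example_titles, ["gan", "deep", "3d"])
print(example_counts)
```

Substring matching keeps the sketch short; a real tally would probably want word-boundary handling to avoid false positives.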
  15. This talk focuses on presentations that drew attention on site: orals, crowded posters, and papers people were talking about. (Please forgive the bias toward the presenter's own areas of expertise.)
  16. Also, the papers can be found by searching for their titles at http://openaccess.thecvf.com/CVPR2018.py
  18. Section 2: Awarded Papers. Best Paper Award: Taskonomy: Disentangling Task Transfer Learning [1]. Task Taxonomy: the relationships among 26 tasks are obtained through transfer learning.
  19. Best Paper Award: Taskonomy [1] (continued). Procedure: train an encoder-decoder for each task; freeze the encoder and train with the task swapped; normalize the transfer effectiveness. Useful for designing multi-task networks. Took 47,000 GPU hours.
  20. An example of multi-task learning: Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics [2]. Estimates the uncertainty of each task and automatically adjusts the loss weights.
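The automatic loss weighting can be sketched as follows. This assumes the commonly used regression-style form, in which each task i has a learnable log-variance s_i and contributes 0.5 * exp(-s_i) * L_i + 0.5 * s_i to the total. The function name is hypothetical and the loss values are made up; in practice the s_i would be trained jointly with the network.

```python
import math


def uncertainty_weighted_loss(task_losses, log_variances):
    """Combine per-task losses using learnable log-variances s_i = log(sigma_i^2).

    Each task contributes 0.5 * exp(-s_i) * L_i + 0.5 * s_i (assumed form).
    A large s_i down-weights a noisy task, while the + 0.5 * s_i term
    stops the weights from collapsing to zero.
    """
    total = 0.0
    for loss, log_var in zip(task_losses, log_variances):
        total += 0.5 * math.exp(-log_var) * loss + 0.5 * log_var
    return total


# With all log-variances at zero, the result is half the plain sum of losses.
equal_weight_total = uncertainty_weighted_loss([2.0, 4.0], [0.0, 0.0])
print(equal_weight_total)
```

For a fixed loss value L_i, the per-task term is minimized at s_i = log(L_i), which is why tasks with larger losses end up with smaller weights.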
  21. Section 2: Awarded Papers, Honorable Mention
     • Deep Learning for Graph Matching [3]: node (keypoint) matching between images using a CNN
     • SPLATNet: Sparse Lattice Networks for Point Cloud Processing [4]: a method for handling point cloud input with a neural network
     • CodeSLAM - Learning a Compact, Optimisable Representation for Dense Visual SLAM [5]: CNN-based embedding of depth maps for monocular real-time SLAM. See also the 1st 3D Study Group @ Kanto: https://togetter.com/li/1231482?page=5
     • Efficient Optimization for Rank-Based Loss Functions [6]: a rank loss for image retrieval
     • Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies [7] (Student Paper Award): high-accuracy markerless motion capture of the whole body
  22. SPLATNet: Sparse Lattice Networks for Point Cloud Processing [4] (to be covered in part 2 of the reading group)
  24. Section 3: Yes, we GAN! – GANs and perception. Topics: GAN overview and topologies; GANs and perception; paper introductions (generation conditioned on semantic labels, domain translation, applying SRGAN to object detection). Source: http://jponttuset.cat/are-gans-the-new-deep/
  25. GANs and perception, image synthesis system. Topology 1: single-domain image generation (e.g. DCGAN). [Diagram: a random latent vector feeds the generator, which outputs a generated image; a discriminator compares it with the ground truth and outputs real / fake; loss: ② GAN loss. Legend: trainable / fixed.]
  26. GANs and perception, image synthesis system. Topology 2: conditional image translation (e.g. pix2pix, Pose-Guided). [Diagram: a reference and a condition feed the generator; the generated image is compared with the paired ground truth via ① l1/l2 loss, ② GAN loss from a discriminator (real or fake), and ③ perceptual loss on multi-scale VGG feature maps. Legend: trainable / fixed.] https://arxiv.org/abs/1611.07004 https://arxiv.org/abs/1705.09368
  27. GANs and perception, image synthesis system. Topology 3: unpaired domain translation (e.g. CycleGAN). [Diagram: real A, fake B, fake A, and real B, with two generators and a discriminator (real / fake); losses: ① l1/l2 loss, ② GAN loss. Legend: trainable / fixed.]
  28. Image generation from semantic label maps (Topology 2). [8]: perceptual loss. [9]: GAN loss, perceptual loss, progressive. [10]: instance query, perceptual loss. https://www.youtube.com/watch?v=U4Q98lenGLQ
  29. Domain to Domain Image Transfer (Topology 3): PairedCycleGAN [11]. non-makeup <-> makeup 1, makeup 2, makeup 3, ..., makeup N. G: conditional makeup application. F: makeup removal.
  30. PairedCycleGAN [11] makes CycleGAN possible between one domain and N domains. [Figure columns: input, makeup style, result.]
  31. Domain to Domain Image Transfer (Topology 3): StarGAN [12]. A single network can translate images into multiple domains. The discriminator classifies real-or-fake and which-domain. The generator takes a source image and a target domain label as input. [Diagram: condition 'blond hair'; check whether the condition matches; l1 loss.]
  32. Domain to Domain Image Transfer [12]
  33. SRGAN for Face Detection (Topology 2) [13]. The goal is to reduce false positives. Trained on the WIDER FACE dataset. An example of a GAN used in an auxiliary role.
  34. 3 types of loss functions: ① l1/l2 loss (generated image vs. ground truth); ② perceptual loss (multi-scale feature matching between generated image and ground truth using VGG); ③ GAN loss (a discriminator judges real / fake). Axis: Low Distortion to Good Perception.
  35. SRGAN (CVPR'17) revisited. Losses compared: l2 loss; perceptual loss using VGG; GAN (discriminator) loss. Source: https://arxiv.org/abs/1609.04802
  36. The Perception-Distortion Tradeoff [14]: no method can achieve low distortion and good perceptual quality at the same time, so it is important to understand the tradeoff.
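The three loss types from this section can be sketched on toy 1-D "images" (plain lists of floats). This is only an illustration under stated assumptions: `toy_features` is a hypothetical stand-in for fixed VGG feature maps, the GAN term uses the common non-saturating generator loss -log D(G(z)), and none of the names come from the papers cited here.

```python
import math


def l1_loss(generated, target):
    """Mean absolute error: a pixel-wise distortion measure."""
    return sum(abs(g - t) for g, t in zip(generated, target)) / len(target)


def l2_loss(generated, target):
    """Mean squared error: a pixel-wise distortion measure."""
    return sum((g - t) ** 2 for g, t in zip(generated, target)) / len(target)


def toy_features(image):
    """Hypothetical fixed feature extractor (stand-in for VGG): neighbour differences."""
    return [b - a for a, b in zip(image, image[1:])]


def perceptual_loss(generated, target, feature_fn=toy_features):
    """Compare images in feature space instead of pixel space."""
    return l2_loss(feature_fn(generated), feature_fn(target))


def generator_gan_loss(disc_prob_real):
    """Non-saturating generator loss, given the discriminator's probability
    that the generated image is real (must be in (0, 1])."""
    return -math.log(disc_prob_real)


# A uniformly brightened image: pixel-wise distortion is non-zero,
# but the toy feature-space distance is zero.
target_image = [1.0, 2.0, 3.0]
brightened = [2.0, 3.0, 4.0]
print(l1_loss(brightened, target_image), perceptual_loss(brightened, target_image))
```

The brightened example hints at why the losses can disagree: an output may be far from the ground truth pixel by pixel while matching it in feature space, which is one way to read the tradeoff described above.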
  38. Section 4: Image Recognition – from bounding boxes to dense recognition. Many papers use 2-stage detectors: they are flexible and easy to train on multiple tasks, and speed is not a particular concern. State-of-the-art bounding box detector: MegDet. Dense recognition: dense keypoints (DensePose); recognizing 3D shape from 2D images (3D-RCNN).
  39. MegDet: A Large Mini-Batch Object Detector [15]. Batch size up to 256, BN size = 32. 128 GPUs with cross-GPU batch normalization. Won the COCO 2017 Challenge with FPN plus bells and whistles.
  40. DensePose: Dense Human Pose Estimation in the Wild [16]. Reference on the UV coordinate system: https://entry.cgworld.jp/terms/UV%E5%BA%A7%E6%A8%99%E7%B3%BB.html
  41. DensePose: Dense Human Pose Estimation in the Wild [16] (continued). DensePose-COCO is now available: https://github.com/facebookresearch/DensePose [Figure panels: Patch, U, V.]
  42. 3D-RCNN: Instance-Level 3D Object Reconstruction via Render-and-Compare [17]. Uses Mask R-CNN heads to obtain the information needed for 3D de-rendering. Renders the estimate and computes the loss against 2D ground truth such as KITTI.
  43. 3D-RCNN: Instance-Level 3D Object Reconstruction via Render-and-Compare [17] (continued). Because it does not require a 3D dataset, it is highly versatile.
  45. Section 5: Knowledge Transfer for weakly-supervised learning. Goal: learn effectively from datasets that differ in the amount and type of supervision. [Diagram: teacher A and teacher B, with annotation types bounding box, class, instance segmentation, and keypoints.]
  46. Learning to Segment Every Thing [18]: while training on teacher A, the model learns the relationship between box weights and mask weights. COCO dataset: 80 classes + Visual Genome: 3000 classes.
  47. Data Distillation [19]: applies geometric transformations to an image, runs inference on each version with a single model, then ensembles the results.
  48. Data Distillation [19]
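The transform, infer, and ensemble step can be sketched on a toy 1-D "image". Everything here is an assumption for illustration: `toy_model` is a made-up per-pixel predictor that is deliberately not flip-equivariant, and horizontal flipping is the only geometric transform besides the identity.

```python
def identity(row):
    """No-op transform; it is its own inverse."""
    return list(row)


def hflip(row):
    """Horizontal flip; it is its own inverse."""
    return row[::-1]


def toy_model(row):
    """Hypothetical per-pixel predictor that looks at each pixel and its left neighbour."""
    return [0.5 * row[i] + 0.5 * (row[i - 1] if i > 0 else 0.0) for i in range(len(row))]


def multi_transform_inference(image, model, transforms):
    """Run one model on several transformed copies of the image, map each
    prediction back to the original frame, and average the results."""
    summed = [0.0] * len(image)
    for forward, inverse in transforms:
        prediction = inverse(model(forward(image)))
        summed = [s + p for s, p in zip(summed, prediction)]
    return [s / len(transforms) for s in summed]


ensembled = multi_transform_inference(
    [1.0, 0.0, 0.0], toy_model, [(identity, identity), (hflip, hflip)]
)
print(ensembled)  # [0.5, 0.25, 0.0]
```

Each prediction is mapped back through the inverse transform before averaging, so the ensemble is computed in the coordinate frame of the original image.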
  49. 3D Human Pose Estimation in the Wild by Adversarial Learning [20]: inference results on the 2D/3D datasets are distinguished from the 3D dataset (adversarial learning). [Diagram labels: few, many.]
  51. Section 6: Basic Nets. Efficient, high-performing basic networks: ShuffleNet [21] and MobileNetV2 [22] (presented today). [Diagram labels: 3x3 depthwise conv, 1x1 conv, 1x1 conv, group conv, channel cross-talk, channel shuffle.]
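The channel shuffle named on this slide can be sketched on a flat list of channels. This is a minimal illustration with a hypothetical function name; it mirrors the usual reshape to (groups, channels per group), transpose, and flatten, and assumes the channel count is divisible by the number of groups.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups so the next group convolution
    sees channels that originated in every group."""
    if len(channels) % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = len(channels) // groups
    # Equivalent to reshaping to (groups, per_group), transposing, and flattening.
    return [
        channels[group * per_group + index]
        for index in range(per_group)
        for group in range(groups)
    ]


# Channels 0-2 belong to group 0 and channels 3-5 to group 1.
shuffled = channel_shuffle([0, 1, 2, 3, 4, 5], groups=2)
print(shuffled)  # [0, 3, 1, 4, 2, 5]
```

After the shuffle, each consecutive block of channels contains members of both original groups, which is how cross-talk between groups is restored.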
  52. Section 6: Basic Nets (continued). Efficient, high-performing basic networks: SENet [23], which captures global context information, and Shift [24], which lets 1x1 convolutions also cover the spatial dimension.
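How a shift followed by a 1x1 convolution can cover the spatial dimension is sketched below in 1-D. This is a toy illustration, not the implementation from [24]: the function names, offsets, and weights are made up, each channel is a plain list, and positions shifted in from outside the row are zero-filled.

```python
def shift_channel(row, offset):
    """Shift one channel by `offset` positions (positive = to the right), zero-filled."""
    width = len(row)
    return [row[i - offset] if 0 <= i - offset < width else 0.0 for i in range(width)]


def shift_then_pointwise(feature_map, offsets, weights):
    """Shift each input channel by its own offset, then mix channels per position.

    `weights[o][c]` is the 1x1-convolution weight from input channel c to
    output channel o. The shift has no parameters and needs no multiplications.
    """
    shifted = [shift_channel(row, offset) for row, offset in zip(feature_map, offsets)]
    width = len(feature_map[0])
    return [
        [sum(w * channel[x] for w, channel in zip(weight_row, shifted)) for x in range(width)]
        for weight_row in weights
    ]


# Three copies of the same signal shifted left, not at all, and right:
# the 1x1 mixing then behaves like a 3-tap spatial filter.
signal = [1.0, 2.0, 3.0]
mixed = shift_then_pointwise([signal, signal, signal], [-1, 0, 1], [[1.0, 1.0, 1.0]])
print(mixed)  # [[3.0, 6.0, 5.0]]
```

In this toy case the output at each position is the sum of the pixel and its two neighbours, even though the only learned operation is a per-position channel mix.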
  53. Conclusion – take-home keywords
     • GANs are getting over THE WALL: image synthesis is popular, and it is important to choose losses with the perception-distortion tradeoff in mind.
     • Denser than bboxes: 2-stage detectors are used to recognize the contents of bounding boxes in detail.
     • Knowledge transfer across datasets: using different datasets effectively.
     • Next year's CVPR is in Long Beach. If you get the chance, do go!
  54. References
     [1] Taskonomy: Disentangling Task Transfer Learning
     [2] Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics
     [3] Deep Learning for Graph Matching
     [4] SPLATNet: Sparse Lattice Networks for Point Cloud Processing
     [5] CodeSLAM - Learning a Compact, Optimisable Representation for Dense Visual SLAM
     [6] Efficient Optimization for Rank-Based Loss Functions
     [7] Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies
     [8] Photographic Image Synthesis with Cascaded Refinement Networks (*ICCV2017)
     [9] High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs
     [10] Semi-parametric Image Synthesis
     [11] PairedCycleGAN: Asymmetric Style Transfer for Applying and Removing Makeup
     [12] StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation
     [13] Finding Tiny Faces in the Wild With Generative Adversarial Network
     [14] The Perception-Distortion Tradeoff
     [15] MegDet: A Large Mini-Batch Object Detector
     [16] DensePose: Dense Human Pose Estimation in the Wild
     [17] 3D-RCNN: Instance-Level 3D Object Reconstruction via Render-and-Compare
     [18] Learning to Segment Every Thing
     [19] Data Distillation: Towards Omni-Supervised Learning
     [20] 3D Human Pose Estimation in the Wild by Adversarial Learning
     [21] ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
     [22] MobileNetV2: Inverted Residuals and Linear Bottlenecks
     [23] Squeeze-and-Excitation Networks
     [24] Shift: A Zero FLOP, Zero Parameter Alternative to Spatial Convolutions
     Please search titles at: http://openaccess.thecvf.com/CVPR2018.py
