Taskonomy: Disentangling Task Transfer Learning -- Scouty Meetup 2018 Feb., 14th

Talk given at scouty Knowledge Graph and Embedding Night.
I read the following awesome paper:
"Taskonomy: Disentangling Task Transfer Learning", CVPR 2018

Published in: Technology

  1. Taskonomy: Disentangling Task Transfer Learning. 2019 Feb. 14th. Tatsuya Shirakawa (ABEJA, Inc.)
  2. Self Introduction: ABEJA, Inc. (Researcher)
     - Deep Learning (CV, Graph, NLP)
     - Machine Learning
     - Mathematical Optimization
     - https://github.com/TatsuyaShirakawa
     - Tech blog: http://tech-blog.abeja.asia/ (Poincaré Embeddings, Graph Convolution, Annotation, Hyperbolic)
  3. Today's Paper: Exploring the Structure among Visual Tasks by Measuring Transferability
     (Taskonomy = Task + Taxonomy) http://taskonomy.stanford.edu/ http://taskonomy.vision/
     Super Thorough Analysis + Potentially Promising Research Direction + Super Large Dataset with 26 Task Annotations = Super Interesting CVPR 2018 Best Paper!
  4. Paper Introduction • Considering transferability among visual tasks • Analysis of the transferability by means of AHP (Analytic Hierarchy Process) • Combinatorial Optimization for extracting visual Taskonomy • Massive Dataset & Experiments (4.5M images, 26 tasks, 47,886 GPU hours) http://taskonomy.stanford.edu/
  5. http://taskonomy.vision/
  6. Disclaimer: The paper, slides, live demos, and web pages are great already.
     So, in this talk, let's focus on understanding
     - the motivation,
     - the task,
     - the method, and
     - some experimental results
     of Taskonomy.

     In the following, I extensively quote slides from
     https://storage.googleapis.com/taskonomy_slides/taskonomy_slides.html
  7. Contents • Motivation & Task • Dataset • Method • Experiments
  8. Question: Vision problems: related or independent? (Zamir et al., Taskonomy, 2018) [Figure labels: Image, Layout, Objects, Depth, Normals, "?"]
  9. Question: Vision problems: related or independent? • Can be computationally measured • Unified model for transfer learning • Task relationships exist • Tasks belonging to a structured space [Figure labels: Image, Depth, Normals, Layout, Objects; edge annotations "derivative" and "spatial prior"]
  10. Goal: Task Transferability Structure [Figure: transfer graph over the task dictionary: Autoencoding, Object Class., Scene Class., Curvature, Denoising, Occlusion Edges, Egomotion, Cam. Pose (fix), 2D Keypoint, 3D Keypoint, Cam. Pose (nonfix), Matching, Reshading, Distance, Z-Depth, Normals, Layout, 2.5D Segm., 2D Segm., Semantic Segm., Vanishing Pts., plus Novel Task 1, Novel Task 2, Novel Task 3]
  11. Contents • Motivation & Task • Dataset • Method • Experiments
  12. • Task Bank • 26 tasks (semantic, 2D, 3D) • Dataset • 4 million real images • Each image has the GT label for all tasks • Task-Specific Networks • 26 x
     [Figure: a query image with the outputs of the task-specific networks: Autoencoding, In-painting, Object Class. (top 5 prediction: sliding door; home theater, home theatre; studio couch, day bed; china cabinet, china closet; entertainment center), Scene Class. (top 2 prediction: living room, television room), Jigsaw puzzle, Colorization, 2D Segm., 2.5D Segm., Semantic Segm., Vanishing Points, 2D Edges, 3D Edges, 2D Keypoints, 3D Keypoints, 3D Curvature, Image Reshading, Denoising, Cam. Pose (non-fixated), Cam. Pose (fixated), Triplet Cam. Pose, Room Layout, Point Matching, Eucl. Distance, Surface Normals]
     https://storage.googleapis.com/taskonomy_slides/taskonomy_slides.html
  13. Dataset Creation • Semantic tasks (e.g. scene classification)
     => "Knowledge distillation" from known methods
     = predictions of trained models are used as labels • Non-semantic labels
     => Programmatically computed from images from multiple RGB-D cameras
  14. Contents • Motivation & Task • Dataset • Method • Experiments
  15. Modeling: I: Task-Specific Modeling, II: Transfer Modeling, III: Normalization (AHP), IV: Taxonomy Extraction (BIP) [Figure: the four-stage pipeline drawn over the task dictionary of slide 10, with 2D Edges added]
  16. I: Task-Specific Modeling [Figure: Image -> Source Output (normals), with its training data]
     Same Image Resolution
     Same Network Architecture
     => Same Latent Representation
  17. II: Transfer Modeling [Figure: Image -> Source Output (normals), with its training data; Target Output (Curvature), with its training data]
  18. II: Transfer Modeling [Figure: Image -> Target Output (Curvature), with its training data]
  19. II: Transfer Modeling [Figure: Image -> Target Output (Curvature), with its training data] + Higher Order Transfers (Beam Search)
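The transfer step above can be sketched roughly as follows. This is a toy NumPy illustration, not the authors' code: the "encoders" are random fixed projections standing in for trained source networks, the readout is a plain least-squares fit rather than a shallow network, and all names (`encode_a`, `encode_b`, `fit_readout`) are hypothetical. It only shows the idea that the source representation stays frozen while a small readout is fit for the target, and that a higher-order transfer feeds several source representations to one readout. The beam search over source combinations is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen encoders standing in for two trained source networks
# (e.g. normals and reshading). Random weights, for illustration only.
W_enc_a = rng.normal(size=(16, 8))
W_enc_b = rng.normal(size=(16, 8))

def encode_a(x):
    return np.tanh(x @ W_enc_a)  # W_enc_a is never updated

def encode_b(x):
    return np.tanh(x @ W_enc_b)  # W_enc_b is never updated

# A small labelled set for the target task. Synthetic assumption: the labels
# are a linear function of encoder A's representation.
X = rng.normal(size=(200, 16))
Y = encode_a(X) @ rng.normal(size=(8, 3))

def fit_readout(Z, Y):
    """Fit only the readout on top of frozen features; return (weights, MSE)."""
    weights, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    return weights, float(np.mean((Z @ weights - Y) ** 2))

# First-order transfer: one source representation -> target.
_, loss_first = fit_readout(encode_a(X), Y)

# Higher-order transfer: several source representations, concatenated.
Z_multi = np.concatenate([encode_a(X), encode_b(X)], axis=1)
_, loss_multi = fit_readout(Z_multi, Y)
```

The resulting losses (one per source or source combination, per target) are the raw entries that the normalization step then has to make comparable.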
  20. III: Normalization. Adjacency Matrix (pre-normalization). Adjacency Matrix W: the (i, j)-th element is the raw loss/evaluation when the i-th and j-th tasks are taken as the source and target task, respectively. • Problematic (scale and space mismatch)
     => a proper normalization is needed
  21. III: Normalization. Adjacency Matrix (pre-normalization). Adjacency Matrix W_t (t: target task): the (i, j)-th element is the ratio (a) / (b), where
     (a) = number of images on which the i-th task transferred to target task t better than the j-th task did
     (b) = number of images on which the j-th task transferred to target task t better than the i-th task did
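A minimal sketch of building such a win-ratio matrix for one target task, assuming we already have a per-image loss for every source. The function name `win_ratio_matrix`, the toy loss values, and the clipping constant `eps` (used here only to avoid dividing by zero when one source never wins) are assumptions for illustration.

```python
import numpy as np

def win_ratio_matrix(losses, eps=1e-3):
    """losses: array of shape (num_sources, num_images), the per-image loss on
    target t when transferring from each source (lower is better).
    Returns W_t with W_t[i, j] = (#images where i beats j) / (#images where j beats i)."""
    num_images = losses.shape[1]
    # wins[i, j] = number of images on which source i has a lower loss than source j
    wins = (losses[:, None, :] < losses[None, :, :]).sum(axis=2)
    frac = np.clip(wins / num_images, eps, 1.0 - eps)  # assumed clipping
    return frac / frac.T

# Toy example: 3 sources, 4 images (made-up numbers).
losses = np.array([
    [0.1, 0.2, 0.3, 0.4],
    [0.2, 0.1, 0.4, 0.5],
    [0.9, 0.9, 0.9, 0.9],
])
W_t = win_ratio_matrix(losses)
```

Because only "who wins on each image" is counted, the matrix no longer depends on the scale or units of each task's loss, which is the point of the ordinal normalization.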
  22. III: Normalization. Adjacency Matrix (pre-normalization) -> Adjacency Matrix (post-normalization). Ordinal Normalization: Analytic Hierarchy Process (AHP)
  23. AHP (Analytic Hierarchy Process): Mathematical Background. Let us consider the ranking of n items {1, 2, …, n}. Let A = (a_ij), where a_ij measures how much the i-th item is superior to the j-th item. Assume the matrix A = (a_ij) has the form a_ij = u_i / u_j. Then,

     (1) A is rank 1
     (2) Au = nu (u is, up to scale, the unique eigenvector with a non-zero eigenvalue) => u: importance vector
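The two properties above are easy to check numerically. A small sketch with made-up importance scores `u` (illustrative values, not from the paper):

```python
import numpy as np

# Hypothetical importance scores for n = 4 items (illustrative values only).
u = np.array([4.0, 2.0, 1.0, 1.0])
n = len(u)

# Pairwise comparison matrix with a_ij = u_i / u_j.
A = np.outer(u, 1.0 / u)

# (1) A is an outer product of two vectors, so its rank is 1.
rank = np.linalg.matrix_rank(A)

# (2) A @ u = n * u, because sum_j (u_i / u_j) * u_j = n * u_i.
Au = A @ u

# Going the other way: recover the importance vector from A alone
# as the eigenvector of the largest eigenvalue, normalized to sum to 1.
eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
v = np.abs(eigvecs[:, k].real)
v = v / v.sum()
```

In practice the measured comparison matrix is only approximately of the form u_i / u_j, and the dominant eigenvector is taken as the best-fitting importance vector.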
  24. AHP for Taskonomy
     1. For each target t, take the win-lose ratio between (a) transfer s_i -> t and (b) transfer s_j -> t
     2. Take the principal eigenvector (normalized to sum to 1) of the resulting matrix
     3. Create the final matrix by stacking the principal eigenvectors of all targets
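Steps 2 and 3 can be sketched as below, assuming the per-target win-ratio matrices are already available. The power iteration, the function name `principal_eigenvector`, and the two toy matrices (built from made-up score vectors so the expected result is known) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def principal_eigenvector(W, max_iter=1000, tol=1e-12):
    """Dominant eigenvector of a positive matrix via power iteration,
    normalized so that its entries sum to 1."""
    v = np.full(W.shape[0], 1.0 / W.shape[0])
    for _ in range(max_iter):
        nxt = W @ v
        nxt = nxt / nxt.sum()
        converged = np.allclose(nxt, v, rtol=0.0, atol=tol)
        v = nxt
        if converged:
            break
    return v

# Toy win-ratio matrices for two targets over the same three sources,
# built from made-up score vectors (a_ij = u_i / u_j).
W_target_a = np.outer([3.0, 2.0, 1.0], 1.0 / np.array([3.0, 2.0, 1.0]))
W_target_b = np.outer([1.0, 1.0, 2.0], 1.0 / np.array([1.0, 1.0, 2.0]))

# Step 2: one normalized eigenvector per target.
# Step 3: stack them as columns, so affinity[i, t] scores the transfer s_i -> t.
affinity = np.stack(
    [principal_eigenvector(W_target_a), principal_eigenvector(W_target_b)],
    axis=1,
)
```

Each column sums to 1, so scores for different targets live on the same scale and can be compared in the taxonomy extraction step.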

  25. IV: Taxonomy Extraction • Taxonomical structure: • Sparsified • What are best source tasks • What sources for each target • Out-of-dictionary tasks • Maximize performance while constrained by some budget [Figure: transfer graph over the task dictionary of slide 10]
  26. IV: Taxonomy Extraction. Source tasks, Target tasks. Dictionary = Sources ∪ Targets. Legend: source-only, source/target, target-only (small data) [Figure: transfer graph over the task dictionary of slide 10]
  27. IV: Taxonomy Extraction. Source tasks, Target tasks. Dictionary = Sources ∪ Targets. Legend: source-only, source/target, target-only (small data) [Figure: transfer graph over the task dictionary of slide 10]
  28. 28. Zamir et al. Taskonomy 2018 IV: Taxonomy Extraction [Figure: transfer graph over the task dictionary. Nodes: Autoencoding, Object Class., Scene Class., Curvature, Denoising, 2D Edges, Occlusion Edges, Egomotion, Cam. Pose (fix), 2D Keypoint, 3D Keypoint, Cam. Pose (nonfix), Matching, Reshading, Distance, Z-Depth, Normals, Layout, 2.5D Segm., 2D Segm., Semantic Segm., Vanishing Pts., Novel Task 1, Novel Task 2, Novel Task 3. Legend: target-only (small data), source/target, source-only] Source tasks, Target tasks, Dictionary = Sources ∪ Targets • Constraint I: only transfer from sources. • Constraint II: all targets are transferred to. • Constraint III: not exceed budget. • Binary Integer Program (BIP)
  29. 29. Taxonomy Extraction • Boolean Integer Programming (BIP)
 - Finding the subgraph composed of tasks (nodes) and transfers (edges) 
 that solves all the tasks at minimum cost
 • Constraint I: if a transfer is in the subgraph, all of its source nodes/tasks must be included too
 • Constraint II: each target task has exactly one transfer in
 • Constraint III: the supervision budget is not exceeded
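The three constraints above can be illustrated on a toy instance. The sketch below is not the paper's solver: it replaces a real BIP solver with brute-force enumeration over source subsets, which gives the same optimum only because the instance is tiny. The task names, supervision costs, transfer losses, and the function name `extract_taxonomy` are all hypothetical, and "cost" is assumed to be a per-transfer loss where lower is better.

```python
from itertools import combinations

# Hypothetical toy instance (values are made up, not taken from the paper).
# Supervision cost of training each candidate source task.
sources = {"normals": 1, "reshading": 1, "autoencoding": 1}
targets = ["depth", "layout"]
# Each transfer (edge): (tuple of source tasks, target task, loss). Lower loss is better.
transfers = [
    (("normals",), "depth", 30),
    (("reshading",), "depth", 35),
    (("normals", "reshading"), "depth", 20),  # a higher-order (two-source) transfer
    (("normals",), "layout", 40),
    (("autoencoding",), "layout", 70),
]


def extract_taxonomy(sources, targets, transfers, budget):
    """Return (total_loss, {target: chosen sources}) or None if infeasible."""
    best = None
    names = sorted(sources)
    for k in range(len(names) + 1):
        for chosen in combinations(names, k):
            # Constraint III: the supervision budget is not exceeded.
            if sum(sources[s] for s in chosen) > budget:
                continue
            chosen_set = set(chosen)
            plan, total = {}, 0
            for target in targets:
                # Constraint I: a transfer is usable only if all its sources are selected.
                feasible = [
                    (loss, srcs)
                    for srcs, tgt, loss in transfers
                    if tgt == target and set(srcs) <= chosen_set
                ]
                if not feasible:
                    break  # this source subset cannot serve every target
                # Constraint II: exactly one incoming transfer per target.
                loss, srcs = min(feasible)
                plan[target] = srcs
                total += loss
            else:
                if best is None or total < best[0]:
                    best = (total, plan)
    return best


if __name__ == "__main__":
    for budget in (0, 1, 2):
        print(budget, extract_taxonomy(sources, targets, transfers, budget))
```

Raising the budget lets the selection add a second source and switch "depth" to the two-source transfer, which is the kind of budget-dependent taxonomy the slides describe. For realistic problem sizes the enumeration is exponential in the number of sources, so an integer-programming solver would be needed instead.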
  30. 30. Contents • Motivation & Task • Dataset • Method • Experiments
  31. 31. Zamir et al. Taskonomy 2018 Experimental Results • 26 Task-Specific Networks • 3000 Transfer Networks • 47,829 GPU hours • Transfer training data: 8x-120x less than task-specific
  32. 32. (“Normals”, the derivative of “Depth”, looks like a quite strong source, but many of these tasks can be computed directly once 3D reconstruction is done …)
  33. 33. Gain Quality
  34. 34. Gain Quality
