Do Better ImageNet Models Transfer
Better … for Image Recommendation?
Felipe del Río, Pablo Messina, Vicente Dominguez, Denis Parra
CS Department
School of Engineering
Pontificia Universidad Católica de Chile
KTL RecSys Workshop, October 6, 2018
Artwork Recommendation
• Online artwork market: growing since 2008, despite
global crises!
– In 2011, art generated $11.57 billion in total global annual
revenue, over $2 billion more than in 2010 [forbes]
• Earlier recommendation projects date back as far as
2007, such as the CHIP project to recommend paintings
from the Rijksmuseum.
• Little use so far of recent advances in Deep Neural Networks
for Computer Vision.
October 6th, 2018 del Rio et al ~ RecSysKTL 2018 2
[forbes] The World’s Strongest Economy? The Global Art Market. https://www.forbes.com/sites/abigailesman/2012/02/29/the-worlds-strongest-economy-the-global-art-market/ (2012)
Image Recommendation
• Since 2017 we have been working on
recommending art images, using data from the
online store UGallery.
• Two papers published:
– DLRS 2017: Dominguez, V., Messina, P., Parra, D., Mery, D., Trattner,
C., & Soto, A. (2017, August). Comparing Neural and Attractiveness-based
Visual Features for Artwork Recommendation. In Proceedings of
the 2nd Workshop on Deep Learning for Recommender Systems (pp. 55-59).
ACM.
– UMUAI 2018: Messina, P., Dominguez, V., Parra, D., Trattner, C., &
Soto, A. (2018). Content-based artwork recommendation: integrating
painting metadata with neural and manually-engineered visual features.
User Modeling and User-Adapted Interaction, 1-40.
Data: UGallery
• Online artwork store based in California, USA.
• Mostly sells one-of-a-kind physical artworks.
Image Recommendation
• Our top approach is a hybrid recommender, based
on metadata and visual features from Deep
Convolutional Neural Networks.
Motivation
• When submitting our work we usually received
criticism for not using the latest DNN model.
• An actual review from a previous article submission
(2017):
<< Overall an interesting paper although … the
choice of AlexNet is rather odd as there are better
pre-trained networks available e.g. VGG16 >>
• Is it always the case that deep convolutional models with
better pre-trained performance on the ImageNet Challenge
produce better results in a transfer learning setting?
ImageNet:
Crowdsourcing a Large Dataset
of Image Labels
Datasets in Computer Vision
• 1996: faces and cars, 14,000 images of 10,000 people
• 1998: MNIST, 70,000 images of handwritten digits
• 2004: Caltech 101, 9,146 images of 101 categories
• 2005: PASCAL VOC, 20,000 images with 20 classes
• ImageNet: presented in 2009 at CVPR
• Crowdsourced
• 14,197,122 images
• 21,841 categories (non-empty synsets)
• Categories based on WordNet taxonomy
WordNet
• WordNet: Miller’s project, started in 1980 at
Princeton; a hierarchy for the English language
• Prof. Fei-Fei Li (UIUC, Princeton, Stanford)
worked on populating WordNet with many images.
Crowdsourced
• Amazon Mechanical Turk
• Labeling took 2.5 years to complete. The dataset originally had
3.2 million images in 5,247 categories (mammal, vehicle, etc.)
ImageNet Challenge
• The dataset has been used to run an image classification
competition from 2010 on.
• In 2012, a team using deep learning (Hinton et al.) achieved an
error rate below 25%: a 10.8-point margin, 41% better than the
next best entry.
Transfer Learning
• The 2012 model was called AlexNet: a Convolutional
Neural Network
• The features it learned (fc6, fc7) have been used
successfully in other tasks, transferring what the network
learned on ImageNet.
Recent ImageNet results
https://github.com/tensorflow/models/tree/master/research/slim#pre-trained-models
Method               Top-1 Accuracy (%)   Top-5 Accuracy (%)
NASNet Large         82.7                 96.2
InceptionResNetV2    80.4                 95.3
InceptionV3          78.0                 93.9
ResNet50             75.6                 92.8
VGG19                71.1                 89.8
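For readers less familiar with the two metrics: top-k accuracy counts a prediction as correct when the true label is among the k highest-scoring classes. A minimal plain-Python sketch follows; the function name and the toy scores are illustrative and do not come from the slides.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of examples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy scores for 3 examples over 3 classes (illustrative only).
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
labels = [1, 1, 0]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
```

Top-5 accuracy is always at least as high as top-1, which is why the second column of the table is uniformly larger than the first.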
Inspiration
• Simon Kornblith, Jonathon Shlens, and Quoc V. Le.
2018. Do Better ImageNet Models Transfer Better?
https://arxiv.org/abs/1805.08974
• Without tuning, ResNet outperforms NASNet (SOTA)
Evaluation 1
• Does the ImageNet performance of a pre-trained model
correlate with its image recommendation
performance?
UGallery Data and Evaluation
• 1,371 users / 3,940 items / 2,846 transactions
Recommendation
• Items are scored by the cosine similarity between the
user model and the item model.
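The scoring formula from this slide did not survive extraction, so the following is only a sketch of cosine-similarity scoring. It assumes the item model is an image embedding and the user model is the mean of the embeddings of items the user has bought; the averaging step, the function names, and the toy vectors are assumptions, not the authors' stated method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def user_model(purchased_vectors):
    """Assumed user model: element-wise mean of the purchased items' embeddings."""
    n = len(purchased_vectors)
    return [sum(column) / n for column in zip(*purchased_vectors)]


def recommend(user_vec, catalog, k):
    """Return the ids of the k catalog items most similar to the user vector."""
    ranked = sorted(catalog.items(),
                    key=lambda pair: cosine(user_vec, pair[1]),
                    reverse=True)
    return [item_id for item_id, _ in ranked[:k]]


# Toy 2-dimensional embeddings (real CNN embeddings have hundreds or thousands of dimensions).
catalog = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
user_vec = user_model([[1.0, 0.0], [0.8, 0.2]])
top_items = recommend(user_vec, catalog, k=2)
```

In practice the already-purchased items would be excluded from the candidate list before ranking.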
Experiment 1: Results
• No correlation between ImageNet performance and
image recommendation performance.
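The slides do not say which statistic was used to test for correlation. One reasonable choice with only a handful of models is Spearman's rank correlation, sketched here in plain Python under that assumption. The function names are illustrative, and the only numbers used are the two accuracy columns from the ImageNet results table in this deck.

```python
def ranks(values):
    """Rank positions (1 = smallest). Assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation for tie-free data: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Top-1 and top-5 accuracy for NASNet Large, InceptionResNetV2,
# InceptionV3, ResNet50 and VGG19, as listed in the results table.
top1 = [82.7, 80.4, 78.0, 75.6, 71.1]
top5 = [96.2, 95.3, 93.9, 92.8, 89.8]

# Sanity check: both columns rank the five models identically.
rho = spearman(top1, top5)
```

To reproduce the experiment, the second argument would be replaced with each model's recommendation metric; a value of rho near zero would match the finding stated above.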
Experiment 2
• What is the effect of fine-tuning?
• How should fine-tuning be performed?
Tuning I: Shallow vs. Deep
Shallow fine-tuning
Deep fine-tuning
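The diagrams for these two slides were lost in extraction. Assuming the usual meaning of the terms, shallow fine-tuning updates only the final layer(s) on top of a frozen pre-trained network, while deep fine-tuning updates every layer. The toy below illustrates the difference with a two-weight model y = w2 * (w1 * x), where w1 stands in for the pre-trained backbone and w2 for the new head; the model, names, and hyperparameters are all illustrative assumptions.

```python
def mse(w1, w2, data):
    """Mean squared error of the toy model y = w2 * (w1 * x)."""
    return sum((w2 * w1 * x - t) ** 2 for x, t in data) / len(data)


def fine_tune(w1, w2, data, mode, lr=0.01, epochs=200):
    """Per-example gradient descent; mode is 'shallow' (head only) or 'deep' (all weights)."""
    for _ in range(epochs):
        for x, t in data:
            hidden = w1 * x                # output of the "backbone"
            err = w2 * hidden - t          # prediction error
            grad_w2 = 2 * err * hidden     # d(err^2)/d(w2)
            grad_w1 = 2 * err * w2 * x     # d(err^2)/d(w1)
            w2 -= lr * grad_w2             # the head is always trained
            if mode == "deep":
                w1 -= lr * grad_w1         # the backbone is trained only in deep mode
    return w1, w2


# Toy target task: t = 3 * x, starting from "pre-trained" weights w1 = w2 = 1.
data = [(0.5, 1.5), (1.0, 3.0), (1.5, 4.5)]
shallow_w1, shallow_w2 = fine_tune(1.0, 1.0, data, mode="shallow")
deep_w1, deep_w2 = fine_tune(1.0, 1.0, data, mode="deep")
```

In shallow mode the backbone weight stays at its initial value, which is the defining property of the regime. In a real CNN the same effect comes from excluding the backbone parameters from the optimizer.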
Learning: Multitask vs. Single Task
• Dataset 1: OmniArt
– 432,217 images
– Target classes: artist, artwork type, year
• Dataset 2: UGallery
– 3,940 images
– Target classes: artist, medium (oil, acrylic, etc.)
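A common way to train on several target classes at once is to attach one classification head per task and sum the per-head cross-entropy losses, whereas single-task training keeps only one head. The slides do not state the loss actually used, so the sketch below is an assumption; the equal task weights, the function names, and the toy logits are illustrative.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one example, computed stably via log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def multitask_loss(head_logits, targets, weights=None):
    """Weighted sum of per-task cross-entropy losses (equal weights are an assumption)."""
    if weights is None:
        weights = {task: 1.0 for task in head_logits}
    return sum(weights[task] * cross_entropy(head_logits[task], targets[task])
               for task in head_logits)


# Hypothetical heads: "artist" with 4 classes and "medium" with 2 classes.
logits = {"artist": [0.0, 0.0, 0.0, 0.0], "medium": [0.0, 0.0]}
targets = {"artist": 2, "medium": 0}

single_task = cross_entropy(logits["artist"], targets["artist"])  # artist head only
multi_task = multitask_loss(logits, targets)                      # artist + medium
```

With uniform logits each head contributes the log of its number of classes, which gives a quick check that the heads are being summed as intended.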
OmniArt Dataset
• http://isis-data.science.uva.nl/strezoski/#3
Results 1
• Deep fine-tuning worked better than shallow fine-tuning
Results 2
• ResNet was better than shallow fine-tuning
• Consistent with Kornblith et al., ResNet is the best generic
visual feature extractor
Results 3
• Training with a smaller but focused target dataset results in better
transfer learning performance
Results 4
• There was no clear winner between multitask and single-task learning,
probably because the artist category is already highly descriptive
Conclusion
• Pre-trained neural image embeddings are great, but
do not assume that performance on the original task
is correlated with performance on your recommendation task.
• If you are still going to use a pre-trained ImageNet
visual embedding, ResNet is a good option,
although it is not the current SOTA in ILSVRC.
• Fine-tuning is strongly suggested, even if your
dataset is small.
THANKS!
dparra@ing.puc.cl

