Building a Deep Learning-powered Search Engine
Koby Karp
Deep Learning Paris Meetup #7
I’m Koby - Data Scientist @ Equancy
★ Robotics Engineer (2007-2011)
★ Computer Visioner (2011-2012)
★ Data Scientist, Data Engineer, Data Miner, Data Analyst, ... (2011-2016)
★ Deep Learner (2016-)
★ ?
E-Commerce ♥ Images
★ Catalogue
★ Social Network
★ Marketplace
Three use cases for FASHION:
★ Visual Search Engine
➹ Take pictures with your phone
➹ Search through catalogue using your images
➹ Return most similar or exact products
★ Fashion Object Detection
★ Data Quality
Big City Life = High Exposure to Fashion Daily
Visual Search Engine at a glance
★ Batch Phase: Build
➢ Describe - Encode an image into a numeric description (vector)
➢ Index - Apply the Describe transformation to all catalogue images and store the vectors in a DB
★ Online Phase: Deploy
➢ Measure Distance - Apply a distance metric between the vectors in the DB and the vector of a new (unseen) image
➢ Ranking - Sort by distance and return the first N results
Describe: Catalogue Image → Numerical Representation (0.672, 0.510, 0.741, ..., 0.919)
Index: Catalogue Images → matrix of numerical representations stored in the DB
0.672 0.435 0.482 ... 0.141
0.510 0.525 0.810 ... 0.241
0.741 0.526 0.210 ... 0.571
... ... ... ... 0.816
0.919 0.552 0.161 0.622 0.412
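The Index step can be sketched roughly as follows. This is a minimal illustration, assuming NumPy; `build_index` and `toy_describe` are hypothetical names, and the toy descriptor (mean of each colour channel) only stands in for a real Describe step. The talk does not say which storage layer was used, so the "DB" here is just an in-memory matrix.

```python
import numpy as np

def build_index(images, describe):
    """Apply `describe` to every image and stack the vectors into one matrix.

    Returns an array of shape (n_images, dim): one row per catalogue image.
    """
    vectors = [np.asarray(describe(img), dtype=np.float32) for img in images]
    return np.stack(vectors)

def toy_describe(image):
    # Hypothetical stand-in descriptor: mean value of each colour channel.
    return image.reshape(-1, image.shape[-1]).mean(axis=0)

# Fake catalogue of 5 RGB images (32x32), just to exercise the code.
rng = np.random.default_rng(0)
catalogue = [rng.random((32, 32, 3)) for _ in range(5)]
index = build_index(catalogue, toy_describe)
print(index.shape)  # (5, 3)
```

Because the batch phase runs once, the resulting matrix could be persisted (for example with `np.save`) and reloaded by the online phase.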
Measure Distance: User’s Image → vector (0.672, 0.510, 0.741, ..., 0.919), compared against every vector in the DB
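The slides do not say which distance metric was used. As a hedged sketch, here are two common choices, Euclidean and cosine distance, computed between one query vector and every row of an index matrix. The function names and the tiny example arrays are illustrative assumptions.

```python
import numpy as np

def euclidean_distances(index, query):
    """L2 distance between `query` (dim,) and every row of `index` (n, dim)."""
    return np.linalg.norm(index - query, axis=1)

def cosine_distances(index, query, eps=1e-12):
    """1 - cosine similarity between `query` and every row of `index`."""
    index_norm = np.linalg.norm(index, axis=1)
    query_norm = np.linalg.norm(query)
    # eps guards against division by zero for all-zero vectors.
    similarity = (index @ query) / np.maximum(index_norm * query_norm, eps)
    return 1.0 - similarity

index = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 4.0]])
query = np.array([1.0, 0.0])
print(euclidean_distances(index, query))
print(cosine_distances(index, query))
```

Cosine distance ignores vector length and compares direction only, which can matter when descriptors differ in overall magnitude; which metric works better is an empirical question for a given catalogue.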
Ranking: User’s Image → Top 5 results, sorted by distance
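A possible way to implement the Ranking step, assuming the distances are already in a NumPy array. `top_n` is a hypothetical helper; it uses `np.argpartition` so that only the N best candidates are fully sorted rather than the whole catalogue.

```python
import numpy as np

def top_n(distances, n=5):
    """Return the indices of the n smallest distances, closest first."""
    distances = np.asarray(distances)
    n = min(n, distances.size)
    if n == 0:
        return np.empty(0, dtype=int)
    # argpartition places the n smallest values first, in arbitrary order...
    candidates = np.argpartition(distances, n - 1)[:n]
    # ...so only those n candidates need a real sort.
    return candidates[np.argsort(distances[candidates])]

distances = np.array([0.9, 0.1, 0.5, 0.3, 0.7, 0.2])
print(top_n(distances, n=3))  # [1 5 3]
```

The returned indices are row numbers in the index matrix, so they can be mapped back to catalogue items.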
Focus on the Describe step
Three attributes that we need to describe
Shape Color Texture
How is it done with “classic” Computer Vision?
Edge Detectors
Image Moment
HOG / HOF / SIFT
Fourier / Wavelet
Color Histograms
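To make the "classic" approach concrete, here is a minimal sketch of one of the listed descriptors, a colour histogram, assuming NumPy and 8-bit RGB input. The function name and the choice of per-channel histograms are assumptions for illustration, not the speaker's implementation.

```python
import numpy as np

def color_histogram(image, bins=8):
    """Describe an RGB image (H, W, 3, values 0-255) by per-channel histograms.

    Returns a vector of length 3 * bins whose entries sum to 1.
    """
    image = np.asarray(image)
    parts = []
    for channel in range(3):
        counts, _ = np.histogram(image[..., channel], bins=bins, range=(0, 256))
        parts.append(counts)
    hist = np.concatenate(parts).astype(np.float64)
    return hist / hist.sum()

# A pure red 4x4 image: red values land in the top bin, green and blue in the bottom one.
red = np.zeros((4, 4, 3), dtype=np.uint8)
red[..., 0] = 255
descriptor = color_histogram(red, bins=4)
print(descriptor.shape)  # (12,)
```

Note that `bins` is a hand-tuned parameter, and this vector captures colour only; shape and texture would each need their own descriptor.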
Problems with this approach:
1. Too many parameters (difficult to tune)
2. Multiple methods (how to weight them?)
3. Slow (many transformations)
4. Does not generalize
Solution: Pre-Trained Convolutional Neural Network (CNN)
Enter: Convolutional Neural Network (CNN)
AlexNet (2012)
1. “The Beatles of the CNNs” -Me
2. Trained on the ILSVRC subset of ImageNet (about 1.2 million training images; the full ImageNet dataset has over 15 million)
3. Used for classification into 1000 categories (Animals, Plants, Urban - No Fashion)
4. Made robust to translations and horizontal reflections through data augmentation
5. We also tried other models, such as VGG16.
Convolutional Neural Network (CNN)
AlexNet (simplified visualization)
❖ We remove the last fully connected layer (the 1000-way Soft-Max classifier)
❖ We feed in our images and generate CNN codes of size 4096
❖ The weights of the trained CNN contain the feature-engineering mapping that was necessary to discriminate between the 1000 classes
❖ We use the network as a general-purpose descriptor.
Test Time ...
Dataset
M. Manfredi, C. Grana, S. Calderara, R. Cucchiara, "A complete system for garment segmentation and color classification", Machine Vision and Applications, vol. 25, pp. 955-969, 2014
Mix of various clothing items and accessories:
❖ 60,000 items
❖ Medium Quality
❖ Grey background
❖ Used as a benchmark for garment classification
Image Clustering
❖ Using t-SNE to reduce the image vectors to 2D
❖ Randomly selected 10% of the images for visualization
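A minimal sketch of this visualization step, assuming scikit-learn's `TSNE` and a matrix of image vectors with one row per image. `embed_sample` is a hypothetical helper, and the perplexity rule below is an illustrative choice rather than the setting used in the talk.

```python
import numpy as np
from sklearn.manifold import TSNE

def embed_sample(codes, fraction=0.1, seed=0):
    """Pick a random fraction of the rows and project them to 2D with t-SNE.

    Returns (chosen_row_indices, points) where points has shape (n_sample, 2).
    """
    rng = np.random.default_rng(seed)
    n_sample = max(2, int(len(codes) * fraction))
    chosen = rng.choice(len(codes), size=n_sample, replace=False)
    # t-SNE requires perplexity < number of samples; 30 is the library default.
    perplexity = min(30.0, (n_sample - 1) / 3)
    tsne = TSNE(n_components=2, perplexity=perplexity, init="pca", random_state=seed)
    return chosen, tsne.fit_transform(codes[chosen])

# Random vectors stand in for real image descriptors.
codes = np.random.default_rng(0).random((300, 64)).astype(np.float32)
chosen, points = embed_sample(codes)
print(points.shape)  # (30, 2)
```

Plotting each image thumbnail at its 2D point is what makes groups of visually similar items visible.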
Image Clustering: groups visible in the 2D map
❖ Jewelry & Accessories
❖ T-Shirts
❖ Shoes
❖ Shorts
❖ Jeans, Khakis & Chinos
❖ Trousers
❖ Bags
❖ Jackets
❖ Funky Tops
Search Results ...
EQUANCY R&D Program (Equancy R&D Initiative)
★ Built with our customers: We propose that our customers collaborate, using their data, to build a first prototype
★ Leveraging smart data: Selected topics look for an innovative way of using existing data
★ For industrial applications: Topics must lead to real, operational applications, with added value for the business
★ Cutting-Edge Topics: Equancy selects several topics we consider worth investigating for our yearly program
★ Co-investment: Depending on how speculative we judge each topic, Equancy will bear a significant part of the consultants' time costs
Thanks!
You were great :)
Equancy is recruiting:
❖ Data Scientist Intern
❖ Data Engineer
kkarp@equancy.com
