Building a Deep Learning-powered Search Engine
Koby Karp
Deep Learning Paris Meetup #7
I’m Koby - Data Scientist @ Equancy
★ Robotics Engineer (2007-2011)
★ Computer Visioner (2011-2012)
★ Data Scientist, Data Engineer, Data Miner, Data Analyst, ... (2011-2016)
★ Deep Learner (2016-)
★ ?
E-Commerce ♥ Images
★ Catalogue
★ Social Network
★ Marketplace
Three use cases for FASHION:
★ Visual Search Engine
★ Fashion Object Detection
★ Data Quality
Three use cases for FASHION:
★ Visual Search Engine
➹ Take pictures with your phone
➹ Search through catalogue using your images
➹ Return most similar or exact products
Big City Life = High Exposure to Fashion Daily
Visual Search Engine at a glance
★ Batch Phase: Build
➢ Describe - Encode image into a numeric description (vector)
➢ Index - Apply transformation to all images and store in a DB
★ Online Phase: Deploy
➢ Measure Distance - Apply a distance metric between DB and a new (unseen) image
➢ Ranking - Sort by distance and return first N results
Visual Search Engine at a glance: Describe
Encode image into a numeric description (vector)
[Figure: Catalogue Image → Describe → Numerical Representation (0.672, 0.510, 0.741, ..., 0.919)]
Visual Search Engine at a glance: Index
Apply transformation to all images and store in a DB
[Figure: Catalogue Images → Index → matrix of vectors, one per catalogue image]
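A minimal sketch of the Index step, assuming the catalogue is a mapping from product id to image. The `describe` function here is a hypothetical stand-in that averages the RGB channels; in the talk the descriptor comes from a CNN. The index is a plain dictionary of vectors, where a real system would likely use a database.

```python
from typing import Callable, Dict, List, Tuple

Pixel = Tuple[int, int, int]   # one RGB pixel
Image = List[Pixel]            # toy image: a flat list of pixels
Vector = List[float]


def describe(image: Image) -> Vector:
    """Hypothetical stand-in descriptor: mean of each RGB channel, scaled to [0, 1]."""
    n = len(image)
    return [sum(pixel[c] for pixel in image) / (255.0 * n) for c in range(3)]


def build_index(catalogue: Dict[str, Image],
                encoder: Callable[[Image], Vector]) -> Dict[str, Vector]:
    """Batch phase: encode every catalogue image once and store the vectors."""
    return {product_id: encoder(image) for product_id, image in catalogue.items()}


# Made-up two-item catalogue, two pixels per image.
catalogue = {
    "red-shirt": [(255, 0, 0), (255, 0, 0)],
    "grey-bag": [(128, 128, 128), (128, 128, 128)],
}
index = build_index(catalogue, describe)
print(index["red-shirt"])
```

Passing the encoder as an argument keeps the indexing code unchanged when the toy descriptor is swapped for a real one.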
Visual Search Engine at a glance: Measure Distance
Apply a distance metric between DB and a new (unseen) image
[Figure: User’s Image → vector (0.672, 0.510, 0.741, ..., 0.919), compared against the indexed catalogue vectors]
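The slides do not say which distance metric is used, so the sketch below shows two common candidates, Euclidean and cosine distance, applied between a query vector and every vector in a small index. The vectors and product ids are made up for illustration.

```python
import math
from typing import Dict, List

Vector = List[float]


def euclidean_distance(a: Vector, b: Vector) -> float:
    """Straight-line distance between two vectors of equal length."""
    return math.dist(a, b)


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cosine similarity; 0 means the vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Made-up index (product id -> vector) and query vector for the user's image.
index: Dict[str, Vector] = {
    "item-a": [1.0, 0.0, 0.0],
    "item-b": [0.0, 1.0, 0.0],
    "item-c": [0.9, 0.1, 0.0],
}
query: Vector = [1.0, 0.0, 0.0]

distances = {pid: euclidean_distance(query, vec) for pid, vec in index.items()}
print(distances)
```

Cosine distance ignores vector length and compares direction only, which can matter if descriptors are not normalized; which metric works better here is an empirical question.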
Visual Search Engine at a glance: Ranking
Sort by distance and return first N results
[Figure: User’s Image → Top 5 results]
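A minimal sketch of the Ranking step, assuming the distances to the query have already been computed. The product ids and distance values are made up; `top_n` is a hypothetical helper name.

```python
import heapq
from typing import Dict, List, Tuple


def top_n(distances: Dict[str, float], n: int = 5) -> List[Tuple[str, float]]:
    """Return the n items with the smallest distance, closest first."""
    return heapq.nsmallest(n, distances.items(), key=lambda item: item[1])


# Made-up distances between the user's image and six catalogue items.
distances = {
    "item-a": 0.42, "item-b": 0.07, "item-c": 0.91,
    "item-d": 0.15, "item-e": 0.33, "item-f": 0.58,
}
results = top_n(distances, n=5)
print([product_id for product_id, _ in results])
```

`heapq.nsmallest` avoids sorting the whole catalogue when N is small compared with the number of items.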
Focus on the Describe step
Three attributes that we need to describe
Shape Color Texture
How is it done with “classic” Computer Vision?
Edge Detectors
Image Moment
HOG / HOF / SIFT
Fourier / Wavelet
Color Histograms
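As one concrete example of the classic descriptors listed above, here is a minimal sketch of a color histogram, assuming an image given as a flat list of RGB pixels. The function name and the default of 4 bins per channel are illustrative choices, not taken from the talk.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def color_histogram(pixels: List[Pixel], bins: int = 4) -> List[float]:
    """Concatenate per-channel histograms (R, G, B), each normalized to sum to 1."""
    bin_width = 256 // bins
    histogram = [0.0] * (3 * bins)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            # Clamp so that values near 255 fall in the last bin.
            histogram[channel * bins + min(value // bin_width, bins - 1)] += 1.0
    return [count / len(pixels) for count in histogram]


# Made-up four-pixel image: three red pixels and one white pixel.
image = [(255, 0, 0), (255, 0, 0), (255, 0, 0), (255, 255, 255)]
descriptor = color_histogram(image, bins=4)
print(descriptor)
```

This descriptor captures color only; shape and texture would each need their own method, and the number of bins is a parameter that has to be tuned by hand.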
Problems with this approach:
1. Too many parameters (difficult to tune)
2. Multiple methods (how to weigh?)
3. Slow (many transformations)
4. Ungeneralizable
Solution: Pre-Trained Convolutional Neural Network (CNN)
Enter: Convolutional Neural Network (CNN)
AlexNet (2012)
1. “The Beatles of the CNNs” -Me
2. Trained on the ILSVRC subset of ImageNet (about 1.2 million training images; the full ImageNet holds about 15 million)
3. Used for classification of 1000 categories (Animals, Plants, Urban - No Fashion)
4. Robust to translations and horizontal reflections (used as data augmentation during training)
5. We also tried other models, such as VGG16.
Convolutional Neural Network (CNN)
AlexNet (simplified visualization)
[Figure: simplified AlexNet architecture]
❖ We remove the last fully connected layer (the 1000-way classifier that feeds the softmax)
❖ We feed our images through the network and generate CNN codes of size 4096
❖ The weights of the trained CNN encode the feature mapping it had to learn in order to discriminate between the 1000 classes
❖ We use the network as a general-purpose descriptor.
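The idea of truncating a classifier and reusing its penultimate activations as a descriptor can be illustrated on a toy network. This is only a sketch: the two-layer network and its weights below are made up, whereas in the talk the truncated network is a pre-trained AlexNet and the code has 4096 entries.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def dense_relu(weights: Matrix, x: Vector) -> Vector:
    """Fully connected layer followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def dense(weights: Matrix, x: Vector) -> Vector:
    """Fully connected layer without activation (class scores)."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


# Made-up weights: 3 inputs -> 4 hidden units -> 2 classes.
W_HIDDEN: Matrix = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5], [-0.6, 0.1, 0.9], [0.2, 0.2, 0.2]]
W_CLASSES: Matrix = [[1.0, -1.0, 0.5, 0.0], [-0.5, 0.7, 0.2, 1.0]]


def classify(x: Vector) -> Vector:
    """Full network: hidden layer, then the final classification layer."""
    return dense(W_CLASSES, dense_relu(W_HIDDEN, x))


def encode(x: Vector) -> Vector:
    """Truncated network: stop before the classification layer."""
    return dense_relu(W_HIDDEN, x)


x = [1.0, 0.5, 0.25]
code = encode(x)
print(len(code), code)
```

`encode` returns a vector whose length is fixed by the hidden layer, regardless of how many classes the original network was trained to separate, which is what makes it usable as a general-purpose descriptor.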
Test Time ...
Dataset
M. Manfredi, C. Grana, S. Calderara, R. Cucchiara, "A complete system for garment segmentation and color classification", Machine Vision and Applications, vol. 25, pp. 955-969, 2014
Mix of various clothing and accessories:
❖ 60,000 items
❖ Medium Quality
❖ Grey background
❖ Used as a benchmark for garment classification
Image Clustering
❖ Using t-SNE to reduce the image vectors to 2D
❖ Randomly selected 10% of the images for visualization
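The two steps above can be sketched with scikit-learn's `TSNE`, assuming NumPy and scikit-learn are installed. The input here is random data standing in for the real image vectors, and the perplexity and initialization settings are illustrative defaults rather than the values used in the talk.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Stand-in for the image vectors: 1,000 random vectors of dimension 64.
# In the talk these would be the 4096-dimensional CNN codes.
codes = rng.normal(size=(1000, 64))

# Keep a random 10% of the items, sampled without replacement.
sample_idx = rng.choice(len(codes), size=len(codes) // 10, replace=False)
sample = codes[sample_idx]

# Project the sampled vectors to 2D; perplexity must be below the sample count.
tsne = TSNE(n_components=2, perplexity=30.0, init="pca", random_state=0)
points_2d = tsne.fit_transform(sample)

print(points_2d.shape)
```

Each row of `points_2d` can then be plotted as a point, or as an image thumbnail, to inspect which items end up close together.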
Image Clustering: Jewelry & Accessories
Image Clustering: T-Shirts
Image Clustering: Shoes
Image Clustering: Shorts
Image Clustering: Jeans, Khakis & Chinos
Image Clustering: Trousers
Image Clustering: Bags
Image Clustering: Jackets
Image Clustering: Funky Tops
Search Results ...
Equancy R&D Initiative
EQUANCY R&D Program
❖ Cutting-Edge Topics: Equancy selects several topics we consider worth investigating for our yearly program
❖ Leveraging smart data: Selected topics look for an innovative way of using existing data
❖ For industrial applications: Topics must lead to real, operational applications, with added value for the business
❖ Built with our customers: We invite our customers to collaborate, using their data, to build a first prototype
❖ Co-investment: Depending on how speculative we judge each topic, Equancy will support significant time costs of consultants
Thanks!
You were great :)
Equancy is recruiting:
❖ Data Scientist Intern
❖ Data Engineer
kkarp@equancy.com