Deep Learning Meetup 7 - Building a Deep Learning-powered Search Engine

Building a Deep Learning-powered Search Engine
Koby Karp
Deep Learning Paris Meetup #7

I’m Koby - Data Scientist @ Equancy
★ Robotics Engineer (2007-2011)
★ Computer Visioner (2011-2012)
★ Data Scientist, Data Engineer, Data Miner, Data Analyst, ... (2011-2016)
★ Deep Learner (2016-)
★ ?

E-Commerce ♥ Images
★ Catalogue
★ Social Network
★ Marketplace

Three use cases for FASHION:
★ Visual Search Engine
★ Fashion Object Detection
★ Data Quality

Use case: Visual Search Engine
➹ Take pictures with your phone
➹ Search through the catalogue using your images
➹ Return the most similar or exact products

Big City Life = High Exposure to Fashion Daily

Visual Search Engine at a glance

★ Batch Phase: Build
➢ Describe - Encode image into a numeric description (vector)
➢ Index - Apply transformation to all images and store in a DB
★ Online Phase: Deploy
➢ Measure Distance - Apply a distance metric between DB and a new (unseen) image
➢ Ranking - Sort by distance and return first N results

Describe
➢ Encode image into a numeric description (vector)
Catalogue Image → Numerical Representation: [0.672, 0.510, 0.741, ..., 0.919]

Index
➢ Apply transformation to all images and store in a DB
Catalogue Images → one column per image:
0.672 0.435 0.482 ... 0.141
0.510 0.525 0.810 ... 0.241
0.741 0.526 0.210 ... 0.571
... ... ... ... 0.816
0.919 0.552 0.161 0.622 0.412
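
A minimal sketch of the Index step, under stated assumptions: `toy_describe` is a hypothetical stand-in for the real descriptor, the product ids are made up, and a plain dict plays the role of the DB. The slides do not say which storage was used.

```python
from typing import Callable, Dict, List, Sequence

Image = Sequence[Sequence[float]]
Vector = List[float]


def toy_describe(image: Image) -> Vector:
    """Hypothetical stand-in for the real descriptor: the mean of each image row."""
    return [sum(row) / len(row) for row in image]


def build_index(
    catalogue: Dict[str, Image],
    describe: Callable[[Image], Vector] = toy_describe,
) -> Dict[str, Vector]:
    """Apply the same transformation to every catalogue image and store the vectors."""
    return {image_id: describe(image) for image_id, image in catalogue.items()}


# Toy catalogue: two 2x2 "images" keyed by a made-up product id.
catalogue = {
    "sku-1": [[0.0, 1.0], [1.0, 1.0]],
    "sku-2": [[0.5, 0.5], [0.0, 0.0]],
}
index = build_index(catalogue)
```

This runs once, in the batch phase; the online phase only reads from `index`.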

Measure Distance
➢ Apply a distance metric between DB and a new (unseen) image
User’s Image → [0.672, 0.510, 0.741, ..., 0.919], compared against every vector stored in the DB
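
The slides do not name the distance metric, so the sketch below shows two common candidates as an assumption: Euclidean distance and cosine distance. Function names are illustrative.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line (L2) distance; 0.0 means the vectors are identical."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; compares direction only and ignores vector length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)
```

Either function can be applied between the user's vector and each stored vector; which one ranks better is something to check on the actual catalogue.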

Ranking
➢ Sort by distance and return first N results
User’s Image → Top 5 closest catalogue images
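
A minimal sketch of the Ranking step as a brute-force scan, assuming Euclidean distance and an in-memory dict as the index (both are assumptions, not details from the talk). The ids and vectors are toy values.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple


def top_n(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    n: int = 5,
) -> List[Tuple[str, float]]:
    """Return the n catalogue ids closest to the query, nearest first."""
    scored = ((math.dist(query, vector), image_id) for image_id, vector in index.items())
    # nsmallest keeps only n candidates in memory instead of sorting everything.
    return [(image_id, distance) for distance, image_id in heapq.nsmallest(n, scored)]


# Toy index with made-up product ids.
index = {"sku-1": [0.0, 0.0], "sku-2": [1.0, 1.0], "sku-3": [0.2, 0.1]}
results = top_n([0.0, 0.1], index, n=2)
```

Here `results` lists `sku-1` first and `sku-3` second, each paired with its distance to the query.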

Three attributes that we need to describe
Shape Color Texture

How is it done with “classic” Computer Vision?
Shape, Color and Texture are described with:
➢ Edge Detectors
➢ Image Moments
➢ HOG / HOF / SIFT
➢ Fourier / Wavelet
➢ Color Histograms
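
To make one of these classic descriptors concrete, here is a minimal sketch of a color histogram, assuming the image is already available as a list of RGB tuples. The function name and the `bins_per_channel` parameter are illustrative.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def color_histogram(pixels: Sequence[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Concatenate one normalized histogram per RGB channel."""
    if not pixels:
        raise ValueError("need at least one pixel")
    bin_width = 256 / bins_per_channel
    histogram = [0.0] * (3 * bins_per_channel)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            # Clamp so that value 255 falls into the last bin.
            bin_index = min(int(value / bin_width), bins_per_channel - 1)
            histogram[channel * bins_per_channel + bin_index] += 1.0
    return [count / len(pixels) for count in histogram]


# One pure-red and one pure-blue pixel, two bins per channel.
descriptor = color_histogram([(255, 0, 0), (0, 0, 255)], bins_per_channel=2)
```

Even this small descriptor already has a parameter to tune (`bins_per_channel`), and it captures color only; shape and texture each need their own method.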

Problems with this approach:
1. Too many parameters (difficult to tune)
2. Multiple methods (how to weigh?)
3. Slow (many transformations)
4. Does not generalize

Solution: Pre-Trained Convolutional Neural Network (CNN)

Entering: Convolutional Neural Network (CNN)
AlexNet (2012)
1. “The Beatles of the CNNs” -Me
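2. Trained on ImageNet (the full dataset has about 15 million images; the ILSVRC subset used for training has about 1.2 million)
3. Used for classification of 1000 categories (Animals, Plants, Urban - No Fashion)
4. Robust to translations and horizontal reflections (both are used as training-time augmentation)
5. We also tried other models, such as VGG16.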

Convolutional Neural Network (CNN)
AlexNet (simplified visualization)
❖ We remove the last fully connected layer (the 1000-way softmax classifier)
❖ We feed in our images and generate CNN codes of size 4096
❖ The weights of the trained CNN contain the feature-engineering mapping that was necessary to discriminate between the 1000 classes
❖ We use the network as a general-purpose descriptor.

Dataset
M. Manfredi, C. Grana, S. Calderara, R. Cucchiara, "A complete system for garment segmentation and color classification", Machine Vision and Applications, vol. 25, pp. 955-969, 2014
Mix of various clothing and accessories:
❖ 60000 items
❖ Medium Quality
❖ Grey background
❖ Used as a benchmark for garment classification

Image Clustering
❖ Using t-SNE for compression to 2D
❖ Randomly selected 10% of the images for visualization
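
A minimal sketch of these two steps, assuming NumPy and scikit-learn's `TSNE`; the talk does not say which implementation was used. `project_sample` is a hypothetical helper, and `codes` stands for an array with one descriptor per row.

```python
import numpy as np
from sklearn.manifold import TSNE


def project_sample(codes: np.ndarray, fraction: float = 0.1, seed: int = 0):
    """Pick a random fraction of the rows and embed them in 2D with t-SNE."""
    rng = np.random.default_rng(seed)
    sample_size = max(2, int(len(codes) * fraction))
    chosen = rng.choice(len(codes), size=sample_size, replace=False)
    # scikit-learn requires perplexity to be smaller than the number of samples.
    perplexity = min(30.0, sample_size - 1)
    tsne = TSNE(n_components=2, perplexity=perplexity, random_state=seed)
    return chosen, tsne.fit_transform(codes[chosen])
```

The returned indices say which catalogue images were sampled, so each 2D point can be drawn as its thumbnail.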

Image Clustering: Jewelry & Accessories

Image Clustering: Jeans, Khakis & Chinos

Equancy R&D Initiative
EQUANCY R&D Program
★ Built with our customers
➢ We invite our customers to collaborate, using their data, to build a first prototype
★ Leveraging smart data
➢ Selected topics look for an innovative way of using existing data
★ For industrial applications
➢ Topics must lead to real, operational applications, with added value for the business
★ Cutting-Edge Topics
➢ Equancy selects several topics we consider worth investigating for our yearly program
★ Co-investment
➢ Depending on how speculative we judge each topic, Equancy will bear significant consultant time costs

Thanks!
You were great :)
Equancy is recruiting:
❖ Data Scientist Intern
❖ Data Engineer
kkarp@equancy.com
