Image-Based E-Commerce Product Discovery: A Deep Learning Case Study
Denis Kamotsky, Peter Gazaryan
@Macys
#Activate18 #ActivateSearch
Agenda
• About Macy's
• Where we are with search right now
• 'More Like This' feature overview
• MLT implementation overview
• Transfer learning and model tuning
• Triplet loss approach
• Scoring with vector similarity metrics in Lucene
• Vector-space retrieval in Lucene
Macy’s and macys.com
 1998 Macys.com is launched and operates out of
New York and San Francisco.
 Over 800 locations across the U.S
 2013 Macys.com launches Keyword Search
running on Apache Solr
 Mobile is driving e-commerce growth
Image-Based Discovery Use Cases
Visual Search, Visual Similarity, Visual Filtering, Visual Attribution
Visual Search
• Visual Search
• Image Auto-Mapping (Shop the Look)
Visual Similarity
• More Like This
• Visual Recommendation Signal
Visual Filtering
• Visual Feature Facets
• Likeness Filtering
Visual Attribution
• Second Opinion
• New Product Onboarding
‘More Like This’ Feature
Serving Model
Serving Architecture
Spark-Based Pipeline
GPU Pipeline: Fast Experimentation
Deep Learning Similarity: Decisions at each Stage
Train
  Retraining
  • Fine-Tuning
  • Deep Retraining
  • Extra Layers
  Deep Image Hashing
  Loss
  • Classifiers
  • Regressors
  • Triplet Loss
Vectorize
  Models
  • Pre-Trained
  • Re-Trained
  Shallow Embeddings
  Deep Embeddings
Pack
  Dimensionality Reduction
  Feature Fusion
Index
  Distance Metrics
  • Metric Spaces
  • Non-Metric Spaces
  Indexing Method
  • Partitioning
  • LSH
  • Projections
Search
  KNN Search
  Radius Search
  Exact Methods
  Approximate Methods
Sample Product: Search by Attributes
▪ Sample product: red patterned dress
▪ Attribute vectors: TF-IDF (effectively just IDF, because TF == 1)
▪ 22,615 unique attribute values across the catalog subset used in experiments
▪ Topic model with 1024 dimensions
▪ 24-NN search results
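A minimal sketch of IDF-only attribute vectors, assuming a toy catalog and cosine similarity for ranking. The product ids, attribute values, and helper names are hypothetical, and the topic-model step from the slide is not shown.

```python
import math
from collections import Counter

# Hypothetical toy catalog: product id -> set of attribute values.
catalog = {
    "dress_red_pattern": {"color:red", "pattern:floral", "type:dress"},
    "dress_red_plain": {"color:red", "pattern:solid", "type:dress"},
    "dress_blue_pattern": {"color:blue", "pattern:floral", "type:dress"},
    "shirt_blue_plain": {"color:blue", "pattern:solid", "type:shirt"},
}

def idf_weights(catalog):
    """IDF per attribute value: log(N / number of products carrying it)."""
    n = len(catalog)
    df = Counter(a for attrs in catalog.values() for a in attrs)
    return {a: math.log(n / count) for a, count in df.items()}

def vectorize(attrs, idf):
    """Sparse vector as a dict; TF is 1 for every attribute that is present."""
    return {a: idf[a] for a in attrs}

def cosine(u, v):
    """Cosine similarity between two sparse dict vectors."""
    dot = sum(w * v[a] for a, w in u.items() if a in v)
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def nearest(query_id, catalog, k):
    """Return the ids of the k products most similar to query_id."""
    idf = idf_weights(catalog)
    vecs = {pid: vectorize(attrs, idf) for pid, attrs in catalog.items()}
    q = vecs[query_id]
    scored = [(cosine(q, v), pid) for pid, v in vecs.items() if pid != query_id]
    return [pid for _, pid in sorted(scored, reverse=True)[:k]]

neighbors = nearest("dress_red_pattern", catalog, k=2)
```

Rare attribute values get larger weights, so sharing an uncommon attribute counts for more than sharing a common one such as `type:dress`.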
Train and Vectorize
Anatomy of a CNN Model
Decision: Choose Model Architecture
Decision: Deep Image Hashing
Decision: Retraining
▪ Loss
• Classification (multiclass model per attribute, or multilabel for ease of operations)
• Regression (BOW, TF-IDF, other targets)
• Triplet
▪ Training Method
• Fine-Tuning
• Deep Retraining
• Extra Layers
Example: Triplet Loss Training
[Figure: an easy triplet and a hard triplet]
[Figure: Triplet DNN. Anchor Image, Positive Image, and Negative Image each map to an Embedding. Labels: Black Box: MobileNet2; Booster; Tower Weights]
• Transfer Learning
• Extra Layers
• Custom Loss
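A minimal sketch of a hinge-style triplet loss on embedding vectors, to make the easy/hard distinction concrete. The margin value, the use of plain (not squared) Euclidean distance, and the toy embeddings are assumptions, not details taken from the slides.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """max(0, d(anchor, positive) - d(anchor, negative) + margin)."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# Hypothetical 2-D embeddings.
anchor = [0.0, 0.0]
positive = [0.1, 0.0]
easy_negative = [1.0, 0.0]   # far from the anchor: contributes no loss
hard_negative = [0.05, 0.0]  # closer to the anchor than the positive

easy = triplet_loss(anchor, positive, easy_negative)
hard = triplet_loss(anchor, positive, hard_negative)
```

An easy triplet already satisfies the margin, so its loss is zero and it produces no gradient. Hard triplets are the ones that move the embedding.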
Results: Re-Training with Triplet Loss
Decision: Deep vs Shallow Embeddings
VGG19 Layer        Gram Size    Flat Vector Size   Accounting for Symmetry
input channels     3 x 3        9                  6
block1_conv1       64 x 64      4096               2080
block2_conv1       128 x 128    16384              8256
block3_conv1       256 x 256    65536              32896
block4_conv1       512 x 512    262144             131328
block5_conv1       512 x 512    262144             131328
Total Vector Size               610313             305894
▪ Direct use of convolutional data
▪ Flattened Gram matrices
▪ Dimensionality Reduction Problem
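The vector sizes in the table above follow from the channel count C of each layer: a Gram matrix has C x C entries, and because it is symmetric only C(C+1)/2 of them are distinct. A short sketch that reproduces the totals; the unnormalized `gram_matrix` helper is an illustrative assumption.

```python
# Channel counts per layer, as listed in the table.
channels = {
    "input channels": 3,
    "block1_conv1": 64,
    "block2_conv1": 128,
    "block3_conv1": 256,
    "block4_conv1": 512,
    "block5_conv1": 512,
}

def flat_size(c):
    """Number of entries in a flattened C x C Gram matrix."""
    return c * c

def symmetric_size(c):
    """Distinct entries of a symmetric C x C matrix (upper triangle)."""
    return c * (c + 1) // 2

total_flat = sum(flat_size(c) for c in channels.values())
total_symmetric = sum(symmetric_size(c) for c in channels.values())

def gram_matrix(features):
    """Unnormalized Gram matrix; features is a list of C rows, one
    flattened activation map per channel."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
            for fi in features]

def upper_triangle(gram):
    """Flatten only the distinct entries of a symmetric matrix."""
    n = len(gram)
    return [gram[i][j] for i in range(n) for j in range(i, n)]
```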
Decision: Choosing Convolutional Layers
[Figure: Receptive Field Size]
Pack
Feature Fusion: Naïve Approach
▪ Concatenation
$C = [A, B]$
▪ Equalization
$C = [f(A), f(B)]$
▪ Properties of $f$ in Euclidean space
Constraint 1: Preserve pairwise distances in the partial vector spaces
$\mathrm{dist}(f(A_i), f(A_j)) = O(\mathrm{dist}(A_i, A_j))$
$\mathrm{dist}(f(B_i), f(B_j)) = O(\mathrm{dist}(B_i, B_j))$
Constraint 2: Ratio of squared partial pairwise distances equals 1.0
$\frac{\mathrm{dist}^2(f(A_i), f(A_j))}{\mathrm{dist}^2(f(B_i), f(B_j))} = 1.0$
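One possible reading of the equalization step, sketched under two assumptions: f is a uniform rescaling of each partial space, and the ratio constraint is satisfied on average over all pairs rather than for every pair. The function names and sample data are hypothetical.

```python
import math
from itertools import combinations

def mean_sq_pairwise_dist(rows):
    """Mean squared Euclidean distance over all pairs of rows."""
    pairs = list(combinations(rows, 2))
    total = sum(sum((a - b) ** 2 for a, b in zip(u, v)) for u, v in pairs)
    return total / len(pairs)

def equalize(rows):
    """Rescale a partial space so its mean squared pairwise distance is 1.
    Uniform scaling keeps every pairwise distance proportional to the
    original one, which is what constraint 1 asks for."""
    scale = 1.0 / math.sqrt(mean_sq_pairwise_dist(rows))
    return [[scale * x for x in row] for row in rows]

def fuse(a_rows, b_rows):
    """Equalize each partial space, then concatenate row by row."""
    fa, fb = equalize(a_rows), equalize(b_rows)
    return [ra + rb for ra, rb in zip(fa, fb)]

# Hypothetical partial spaces on very different scales.
A = [[0.0, 0.0], [100.0, 0.0], [0.0, 100.0]]
B = [[0.0], [0.1], [0.3]]
C = fuse(A, B)
```

Without equalization, distances in `C` would be dominated by `A`. After it, both partial spaces contribute equally on average to squared distances in the fused space.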
Concatenation: Adjusting Color
Feature Fusion: Canonical Correlation Analysis
Feature Fusion: Deep CCA
• Hyperparameters
• Loss Function
• Target Dimensions
• FC Stack Depth
• Add vs Concatenate Projections
• Performance
• Differentiable Loss
• GPU-Placeable Ops
Results: Fusing Deep Embeddings with Product Attributes
Results: Fusing Deep Image Hashes with Product Attributes
Index
Decision: Distance Metric
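▪ Classic: L1 (Manhattan), L2 (Euclidean)
$\|x\|_1 = \sum_i |x_i| \implies L_1(x, y) = \|x - y\|_1 = \sum_i |x_i - y_i|$
$\|x\|_2 = \left(\sum_i x_i^2\right)^{1/2} \implies L_2(x, y) = \|x - y\|_2 = \left(\sum_i (x_i - y_i)^2\right)^{1/2}$
▪ Fractional Distances ($f > 1$): the triangle inequality is violated
$L_{1/f}(x, y) = \|x - y\|_{1/f} = \left(\sum_i |x_i - y_i|^{1/f}\right)^{f}$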
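A small sketch of the three distance families with one Minkowski-style function, where the exponent p corresponds to 1/f in the slide's notation. The sample points are chosen to show the triangle inequality failing for p = 0.5 (f = 2).

```python
def minkowski(x, y, p):
    """(sum_i |x_i - y_i|^p)^(1/p).
    p = 1 gives L1, p = 2 gives L2, 0 < p < 1 gives a fractional distance."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

x, y, z = (0.0, 0.0), (1.0, 1.0), (1.0, 0.0)

for p in (1, 2, 0.5):
    direct = minkowski(x, y, p)
    detour = minkowski(x, z, p) + minkowski(z, y, p)
    # The triangle inequality requires direct <= detour.
    print(p, direct, detour, direct <= detour)
```

For p = 0.5 the direct distance from x to y is 4.0 while the detour through z costs 2.0, so the fractional distance is not a metric. This matters for index structures that rely on the triangle inequality to prune candidates.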
Results: Cosine vs Euclidean
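▪ Cosine distance is a special case of Euclidean distance: half the squared L2 distance between L2-normalized vectors
$\mathrm{cos\_dist}(x, y) = 1 - \cos(x, y) = 1 - \frac{\sum_i x_i y_i}{\|x\|_2 \|y\|_2} = \frac{1}{2} L_2^2\left(\frac{x}{\|x\|_2}, \frac{y}{\|y\|_2}\right)$
▪ Fast to compute for sparse vectors; ranges over [0, 1] for all-positive vectors ⟹ popular in NLP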
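A numeric check of the identity between cosine distance and Euclidean distance on normalized vectors. The sample vectors are arbitrary.

```python
import math

def norm(x):
    """L2 norm of a vector."""
    return math.sqrt(sum(a * a for a in x))

def normalize(x):
    """Scale a vector to unit L2 norm."""
    n = norm(x)
    return [a / n for a in x]

def cosine_distance(x, y):
    """1 - cos(x, y)."""
    dot = sum(a * b for a, b in zip(x, y))
    return 1.0 - dot / (norm(x) * norm(y))

def half_squared_l2(x, y):
    """Half of the squared Euclidean distance."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(x, y))

x = [3.0, 0.0, 4.0]
y = [1.0, 2.0, 2.0]

lhs = cosine_distance(x, y)
rhs = half_squared_l2(normalize(x), normalize(y))
```

Since the two quantities agree, ranking by cosine distance gives the same order as ranking by Euclidean distance on L2-normalized vectors, so a Euclidean KNN index can serve cosine queries once the vectors are normalized at indexing time.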
Search
KNN Search
▪ Space Partitioning
▪ Locality Sensitive Hashing
▪ Projection to Lower Dimensions
▪ Scikit-Learn
▪ Annoy
▪ NMSLib
▪ Lucene?!...
Values are relative to Scikit-Learn KD-Tree (= 100%).
Index Type             Indexing Time (incl. persisting raw arrays)   Search Time   Index Size (excl. raw array data)
Scikit-Learn KD-Tree   100%                                          100%          100%
Annoy                  125%                                          68%           6%
NMSLib HNSW            395%                                          68%           6%
NMSLib Brute-Force     51%                                           85%           0%
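For reference, a minimal sketch of exact KNN and radius search by linear scan, the baseline that approximate indexes are measured against. It is a pure-Python illustration with hypothetical names and data, not the implementation used by any of the libraries in the table.

```python
import heapq
import math

def l2(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def knn_search(index, query, k):
    """Exact k-nearest-neighbor search: score every vector, keep the k best.
    Returns (distance, key) pairs, nearest first."""
    scored = ((l2(vec, query), key) for key, vec in index.items())
    return heapq.nsmallest(k, scored)

def radius_search(index, query, radius):
    """Exact radius search: every vector within the given distance."""
    hits = []
    for key, vec in index.items():
        d = l2(vec, query)
        if d <= radius:
            hits.append((d, key))
    return sorted(hits)

# Hypothetical 2-D index: key -> vector.
index = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.0, 2.0), "d": (5.0, 5.0)}
query = (0.1, 0.0)

top2 = [key for _, key in knn_search(index, query, k=2)]
within = [key for _, key in radius_search(index, query, radius=2.5)]
```

A linear scan needs no index structure beyond the raw vectors, which is consistent with the 0% index size reported for the brute-force row, but every query touches every vector.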
Conclusion
▪ Would like an arbitrary tensor similarity signal in the search engine
▪ When all input tensors are known ahead of time, results can be pre-computed
▪ When input tensors are not known ahead of time, GPU integration is needed
▪ The model training pipeline needs to integrate with the indexing pipeline
▪ Fast KNN search on vectors of up to 2048 dimensions is a desirable feature
Thank you!

Zilliz
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
Claudio Di Ciccio
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
Zilliz
 


Image-Based E-Commerce Product Discovery: A Deep Learning Case Study - Denis Kamotsky & Peter Gazaryan, Macy's Inc

  • 1. Image-Based E-Commerce Product Discovery: A Deep Learning Case Study Denis Kamotsky, Peter Gazaryan @Macys #Activate18 #ActivateSearch
  • 2. Agenda • About Macy's • Where we are with search right now • 'More Like This' feature overview • MLT implementation overview • Transfer Learning and Model Tune-Up • Triplet loss approach • Scoring with vector similarity metrics in Lucene • Vector-space retrieval in Lucene
  • 3. Macy’s and macys.com
      • 1998: Macys.com is launched and operates out of New York and San Francisco
      • Over 800 locations across the U.S.
      • 2013: Macys.com launches Keyword Search running on Apache Solr
      • Mobile is driving e-commerce growth
  • 4. Image-Based Discovery Use Cases Visual Search Visual Similarity Visual Filtering Visual Attribution
  • 5. Visual Search • Visual Search • Image Auto-Mapping (Shop the Look)
  • 6. Visual Similarity • More Like This • Visual Recommendation Signal
  • 7. Visual Filtering • Visual Feature Facets • Likeness Filtering
  • 8. Visual Attribution • Second Opinion • New Product Onboarding
  • 14. GPU Pipeline: Fast Experimentation
  • 15. Deep Learning Similarity: Decisions at each Stage
      • Train
          • Retraining: Fine-Tuning, Deep Retraining, Extra Layers
          • Deep Image Hashing
          • Loss: Classifiers, Regressors, Triplet Loss
      • Vectorize
          • Models: Pre-Trained, Re-Trained
          • Shallow Embeddings
          • Deep Embeddings
      • Pack
          • Dimensionality Reduction
          • Feature Fusion
      • Index
          • Distance Metrics: Metric Spaces, Non-Metric Spaces
          • Indexing Method: Partitioning, LSH, Projections
      • Search
          • KNN Search
          • Radius Search
          • Exact Methods
          • Approximate Methods
  • 16. Sample Product: Search by Attributes
      • Sample product: red patterned dress
      • Attribute vectors: TF-IDF (actually just IDF, because TF == 1)
      • 22,615 unique attribute values across the catalog subset used in experiments
      • Topic Model with 1024 dimensions
      • 24-NN search results
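
The attribute-vector idea on slide 16 can be sketched in a few lines. Because each attribute value occurs at most once per product (TF == 1), the TF-IDF weight reduces to the IDF weight. The toy catalog, the attribute strings, and the helper names below are hypothetical, and the sketch ranks by cosine similarity on raw IDF vectors; it does not reproduce the 1024-dimension topic model mentioned on the slide.

```python
import math
from collections import Counter

# Hypothetical toy catalog: product id -> attribute values (each appears once, so TF == 1).
catalog = {
    "dress_red_pattern": ["color:red", "pattern:floral", "type:dress"],
    "dress_red_plain": ["color:red", "pattern:solid", "type:dress"],
    "dress_blue_pattern": ["color:blue", "pattern:floral", "type:dress"],
    "shirt_blue_plain": ["color:blue", "pattern:solid", "type:shirt"],
}

def idf_weights(products):
    """IDF per attribute value: log(N / document frequency)."""
    n = len(products)
    df = Counter(attr for attrs in products.values() for attr in set(attrs))
    return {attr: math.log(n / count) for attr, count in df.items()}

def idf_vector(attrs, weights):
    """Sparse vector as a dict; with TF == 1 the TF-IDF weight is just the IDF."""
    return {attr: weights[attr] for attr in set(attrs)}

def cosine(u, v):
    """Cosine similarity between two sparse dict vectors."""
    dot = sum(w * v[a] for a, w in u.items() if a in v)
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def nearest(query_id, vectors, k):
    """Exact k-nearest-neighbour search by cosine similarity (brute force)."""
    q = vectors[query_id]
    scored = [(cosine(q, v), pid) for pid, v in vectors.items() if pid != query_id]
    scored.sort(reverse=True)
    return [pid for _, pid in scored[:k]]

weights = idf_weights(catalog)
vectors = {pid: idf_vector(attrs, weights) for pid, attrs in catalog.items()}
print(nearest("dress_red_pattern", vectors, k=2))
```

Rare attribute values (here `type:shirt`) receive a larger weight than common ones (`type:dress`), so sharing a rare attribute counts for more in the ranking.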
  • 18. Anatomy of a CNN Model
  • 19. Decision: Choose Model Architecture
  • 21. Decision: Retraining
      • Loss
          • Classification (multiclass model per attribute, or multilabel for ease of operations)
          • Regression (BOW, TF-IDF, other targets)
          • Triplet
      • Training Method
          • Fine-Tuning
          • Deep Retraining
          • Extra Layers
  • 22. Example: Triplet Loss Training
      • Figure panels: Easy Triplet, Hard Triplet
      • Diagram labels: Black Box: MobileNet2 Embedding; Triplet DNN; Anchor Image, Positive Image, Negative Image; Embedding; Booster Tower; Weights
      • Techniques: Transfer Learning • Extra Layers • Custom Loss
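
As a rough illustration of the loss behind slide 22, here is a minimal sketch of the commonly used margin-based triplet loss. The slides do not state which variant was used (Euclidean or squared distance, the margin value), so the margin of 0.2 and the 2-D toy embeddings are assumptions. "Easy" and "hard" follow the usual convention: an easy triplet already satisfies the margin, while in a hard triplet the negative lies closer to the anchor than the positive does.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Margin-based triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The margin value is an assumption made for illustration only.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Hypothetical 2-D embeddings.
anchor = [0.0, 0.0]
positive = [0.1, 0.0]
easy_negative = [1.0, 0.0]   # far from the anchor: margin already satisfied
hard_negative = [0.05, 0.0]  # closer to the anchor than the positive

easy_loss = triplet_loss(anchor, positive, easy_negative)
hard_loss = triplet_loss(anchor, positive, hard_negative)
print(easy_loss, hard_loss)
```

Easy triplets contribute zero loss and therefore no gradient, which is why training schemes often concentrate on mining harder triplets.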
  • 24. Decision: Deep vs Shallow Embeddings
      • Direct use of convolutional data
      • Flattened Gram matrices
      • Dimensionality Reduction Problem

      VGG19 Layer        Gram Size    Flat Vector Size    Accounting for Symmetry
      input channels     3 x 3                       9                          6
      block1_conv1       64 x 64                  4096                       2080
      block2_conv1       128 x 128               16384                       8256
      block3_conv1       256 x 256               65536                      32896
      block4_conv1       512 x 512              262144                     131328
      block5_conv1       512 x 512              262144                     131328
      Total Vector Size                         610313                     305894
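
The "Accounting for Symmetry" column on slide 24 follows from the fact that a Gram matrix is symmetric, so only the upper triangle (including the diagonal) carries information: C(C+1)/2 values for C channels. The sketch below recomputes the column from the channel counts listed on the slide; the function names and the tiny example feature map are hypothetical.

```python
def gram_matrix(features):
    """Gram matrix of a feature map given as C rows of flattened activations."""
    c = len(features)
    return [
        [sum(a * b for a, b in zip(features[i], features[j])) for j in range(c)]
        for i in range(c)
    ]

def flatten_symmetric(gram):
    """Keep only the upper triangle (with diagonal) of a symmetric matrix."""
    n = len(gram)
    return [gram[i][j] for i in range(n) for j in range(i, n)]

def symmetric_size(channels):
    """Number of distinct entries in a channels x channels symmetric matrix."""
    return channels * (channels + 1) // 2

# Channel counts as listed on the slide.
layer_channels = {
    "input channels": 3,
    "block1_conv1": 64,
    "block2_conv1": 128,
    "block3_conv1": 256,
    "block4_conv1": 512,
    "block5_conv1": 512,
}

full_total = sum(c * c for c in layer_channels.values())
symmetric_total = sum(symmetric_size(c) for c in layer_channels.values())
print(full_total, symmetric_total)

# Tiny hypothetical feature map: 3 channels, 2 spatial positions each.
tiny = flatten_symmetric(gram_matrix([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]))
print(len(tiny))
```

Even after halving, the concatenated vector has more than 300,000 dimensions, which is the dimensionality reduction problem the slide points to.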
  • 25. Decision: Choosing Convolutional Layers (figure; axis label: Receptive Field Size)
  • 26. Decision: Choosing Convolutional Layers (figure; axis label: Receptive Field Size)
  • 27. Pack
  • 28. Feature Fusion: Naïve Approach
      • Concatenation: $C = [A, B]$
      • Equalization: $C = [f(A), f(B)]$
      • Properties of $f$ in Euclidean space
          • Constraint 1: preserve pairwise distances in the partial vector spaces
              $\mathrm{dist}(f(A_i), f(A_j)) = O(\mathrm{dist}(A_i, A_j))$
              $\mathrm{dist}(f(B_i), f(B_j)) = O(\mathrm{dist}(B_i, B_j))$
          • Constraint 2: ratio of squared partial pairwise distances equals 1.0
              $\dfrac{\mathrm{dist}^2(f(A_i), f(A_j))}{\mathrm{dist}^2(f(B_i), f(B_j))} = 1.0$
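
One simple way to approximate the equalization on slide 28 is to rescale each partial space by a single scalar so that its mean squared pairwise distance becomes 1. This is an assumption made for illustration, not the presenters' stated choice of $f$: a scalar rescaling preserves pairwise distances up to a constant factor (Constraint 1), but it satisfies Constraint 2 only on average across pairs, not for every pair. The sample matrices `A` and `B` are hypothetical.

```python
import math
from itertools import combinations

def sq_dist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def mean_sq_pairwise(rows):
    """Mean squared distance over all unordered pairs of rows."""
    pairs = list(combinations(rows, 2))
    return sum(sq_dist(u, v) for u, v in pairs) / len(pairs)

def equalize(rows):
    """Scale a partial space so its mean squared pairwise distance is 1.

    Assumes at least two rows that are not all identical.
    """
    scale = 1.0 / math.sqrt(mean_sq_pairwise(rows))
    return [[x * scale for x in row] for row in rows]

def fuse(a_rows, b_rows):
    """Concatenate the equalized partial vectors row by row: C = [f(A), f(B)]."""
    return [a + b for a, b in zip(equalize(a_rows), equalize(b_rows))]

# Hypothetical partial spaces on very different scales.
A = [[0.0, 0.0], [100.0, 0.0], [0.0, 100.0]]  # e.g. image embedding
B = [[0.1], [0.2], [0.4]]                     # e.g. attribute vector
C = fuse(A, B)
print(mean_sq_pairwise(equalize(A)), mean_sq_pairwise(equalize(B)))
```

Without the rescaling, distances in `A` would dominate the fused space by several orders of magnitude and `B` would have almost no influence on neighbour ranking.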
  • 30. Feature Fusion: Canonical Correlation Analysis
  • 31. Feature Fusion: Deep CCA • Hyperparameters • Loss Function • Target Dimensions • FC Stack Depth • Add vs Concatenate Projections • Performance • Differentiable Loss • GPU-Placeable Ops
  • 32. Results: Fusing Deep Embeddings with Product Attributes
  • 33. Results: Fusing Deep Image Hashes with Product Attributes
  • 34. Index
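  • 35. Decision: Distance Metric
      • Classic: L1 (Manhattan), L2 (Euclidean)
          $\|x\|_1 = \sum_i |x_i| \implies L_1(x, y) = \|x - y\|_1 = \sum_i |x_i - y_i|$
          $\|x\|_2 = \left(\sum_i x_i^2\right)^{1/2} \implies L_2(x, y) = \|x - y\|_2 = \left(\sum_i (x_i - y_i)^2\right)^{1/2}$
      • Fractional Distances: the triangle inequality is violated
          $L_{1/f}(x, y) = \|x - y\|_{1/f} = \left(\sum_i |x_i - y_i|^{1/f}\right)^{f}$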
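
The claim on slide 35 that fractional distances violate the triangle inequality can be checked with a small example. The sketch below uses a general Minkowski-style distance with exponent `p`; in the slide's notation `p = 1/f`, so `p = 0.5` corresponds to `f = 2`. The three points are hypothetical.

```python
def minkowski(x, y, p):
    """(sum_i |x_i - y_i|^p)^(1/p); a true metric only for p >= 1."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

x = [0.0, 0.0]
y = [1.0, 0.0]
z = [1.0, 1.0]

for p in (2.0, 1.0, 0.5):
    direct = minkowski(x, z, p)
    via_y = minkowski(x, y, p) + minkowski(y, z, p)
    # Triangle inequality requires direct <= via_y.
    print(p, direct, via_y, direct <= via_y)
```

For `p = 0.5` the direct distance from `x` to `z` is 4.0 while the detour through `y` costs only 2.0, so the space is non-metric. This matters for indexing, because methods that prune by the triangle inequality are no longer exact there.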
  • 36. Results: Cosine vs Euclidean
      • Cosine distance is a special case of Euclidean distance
          $\mathrm{cos\_dist}(x, y) = 1 - \cos(x, y) = 1 - \dfrac{\sum_i x_i y_i}{\|x\|_2 \|y\|_2} = \dfrac{1}{2} L_2^2\!\left(\dfrac{x}{\|x\|_2}, \dfrac{y}{\|y\|_2}\right)$
      • Fast to compute for sparse vectors; ranges over [0, 1] for all-positive vectors
      • Popular in NLP
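
The relationship on slide 36 can be verified numerically: cosine distance equals half the squared Euclidean distance between the L2-normalized vectors. The helper names and the sample vectors below are chosen for illustration.

```python
import math

def l2_norm(x):
    return math.sqrt(sum(a * a for a in x))

def cosine_distance(x, y):
    """1 - cos(x, y)."""
    dot = sum(a * b for a, b in zip(x, y))
    return 1.0 - dot / (l2_norm(x) * l2_norm(y))

def normalized_sq_l2(x, y):
    """Squared Euclidean distance between the unit-length versions of x and y."""
    nx, ny = l2_norm(x), l2_norm(y)
    return sum((a / nx - b / ny) ** 2 for a, b in zip(x, y))

x = [3.0, 4.0]
y = [1.0, 0.0]

lhs = cosine_distance(x, y)
rhs = normalized_sq_l2(x, y) / 2.0
print(lhs, rhs)
```

A practical consequence is that an index built for Euclidean distance can serve cosine ranking if the vectors are normalized to unit length before indexing, since the two orderings then agree.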
  • 38. KNN Search
      • Methods: Space Partitioning, Locality Sensitive Hashing, Projection to Lower Dimensions
      • Libraries: Scikit-Learn, Annoy, NMSLib, Lucene?!...

      Index Type              Indexing Time (1)    Search Time    Index Size (2)
      Scikit-Learn KD-Tree                 100%           100%              100%
      Annoy                                125%            68%                6%
      NMSLib HNSW                          395%            68%                6%
      NMSLib Brute-Force                    51%            85%                0%

      (1) Including time to persist raw arrays
      (2) Excluding raw array data size
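
For reference, exact (brute-force) KNN search and radius search, the two query types named in the pipeline overview, can be sketched in plain Python as below. This is only an illustrative baseline: it does not use or imitate the Scikit-Learn, Annoy, or NMSLib APIs, and the sample vectors are hypothetical.

```python
import heapq
import math

def l2(u, v):
    """Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def knn_search(query, vectors, k):
    """Indices of the k vectors closest to the query, nearest first (exact scan)."""
    return heapq.nsmallest(k, range(len(vectors)), key=lambda i: l2(query, vectors[i]))

def radius_search(query, vectors, radius):
    """Indices of all vectors within the given distance of the query."""
    return [i for i, v in enumerate(vectors) if l2(query, v) <= radius]

vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]]
query = [0.9, 0.1]

print(knn_search(query, vectors, k=2))
print(radius_search(query, vectors, radius=1.0))
```

An exact scan needs no index structure at all, which is consistent with the 0% index size reported for the brute-force row; its cost grows linearly with catalog size, which is what partitioning, hashing, and graph-based indexes try to avoid.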
  • 39. Conclusion
      • We would like an arbitrary tensor-similarity signal in the search engine
      • When all input tensors are known ahead of time, results can be pre-computed
      • When input tensors are not known ahead of time, GPU integration is needed
      • The model training pipeline needs to integrate with the indexing pipeline
      • Fast KNN search on vectors of up to 2048 dimensions is a desirable feature
  • 40. Thank you! Denis Kamotsky, Peter Gazaryan @Macys #Activate18 #ActivateSearch

Editor's Notes

  1. 1858: Entrepreneur R.H. Macy opens R.H. Macy & Company, a small dry goods store. (Candidate for removal.)
  2. Both the model and the requests are highly cacheable.
  3. Because we use catalog images, we can serve the model from cache.
  4. Fully-Connected Transfer Learning; Convolutional Transfer Learning; Triplet Learning; Feature Fusion; Distance Metrics; KNN Search