In this talk we detail the steps for creating a visual search engine for 1M Amazon products using MXNet Gluon and the k-nearest-neighbor search library HNSW.
For implementation details, check this repository: https://github.com/ThomasDelteil/VisualSearch_MXNet
Video available here:
https://www.youtube.com/watch?v=9a8MAtfFVwI
Demo website available here:
https://thomasdelteil.github.io/VisualSearch_MXNet/
6. Growing Market
• Angel.co: 74 visual search startups
• Syte.ai: raised $8M
• Slyce.it
“By 2020, more than half of all mobile searches will be voice or visual searches.”
IPG Media Lab
9. 2012 - ImageNet Classification with Deep Convolutional Neural Networks
ImageNet Classification with Deep Convolutional Neural Networks, Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, Advances in Neural Information Processing Systems, 2012
AlexNet architecture
14. GPUs!
Nvidia V100, float16 ops: ~120 TFLOPS, 5000+ CUDA cores
(#1 supercomputer in 2005: 135 TFLOPS)
Source: Mathworks
15. Sea/Land segmentation via satellite images
DeepUNet: A Deep Fully Convolutional Network for Pixel-level Sea-Land Segmentation, Ruirui Li et al, 2017
16. Automatic Galaxy classification
Deep Galaxy: Classification of Galaxies based on Deep Convolutional Neural Networks, Nour Eldeen M. Khalifa, 2017
17. Medical Imaging, MRI, X-ray, surgical cameras
Review of MRI-based Brain Tumor Image Segmentation Using Deep Learning Methods, Ali Işın et al., 2016
18. What is a convolution ?
It is the cross-channel sum of the element-wise multiplication
of a convolutional filter (kernel/mask) computed over a sliding
window on an input tensor given a certain stride and padding,
plus a bias term. The result is called a feature map.
Input matrix (3x3), no padding, 1 channel:
2 2 1
3 1 -1
4 3 2
Kernel (2x2), stride 1, bias = 2:
1 -1
-1 0
Feature map (2x2):
-1 2
0 1
1*2 - 1*2 - 1*3 + 0*1 + 2 = -1
1*2 - 1*1 - 1*1 + 0*(-1) + 2 = 2
1*3 - 1*1 - 1*4 + 0*3 + 2 = 0
1*1 - 1*(-1) - 1*3 + 0*2 + 2 = 1
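The arithmetic on this slide can be checked with a few lines of plain Python. This is a minimal sketch for a single-channel input, not how MXNet implements the operation; the function name `conv2d` and its parameters are illustrative choices. As in the slide, the kernel is applied without flipping.

```python
def conv2d(image, kernel, bias=0, stride=1, padding=0):
    """Slide a kernel over a single-channel 2-D input and return the feature map."""
    # Zero-pad the input on all four sides.
    width = len(image[0]) + 2 * padding
    padded = [[0] * width for _ in range(padding)]
    padded += [[0] * padding + list(row) + [0] * padding for row in image]
    padded += [[0] * width for _ in range(padding)]

    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (len(padded) - k_h) // stride + 1
    out_w = (width - k_w) // stride + 1

    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Element-wise multiply the window by the kernel, sum, add the bias.
            total = bias
            for a in range(k_h):
                for b in range(k_w):
                    total += kernel[a][b] * padded[i * stride + a][j * stride + b]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Values taken from the slide.
image = [[2, 2, 1], [3, 1, -1], [4, 3, 2]]
kernel = [[1, -1], [-1, 0]]
feature_map = conv2d(image, kernel, bias=2)
print(feature_map)  # [[-1, 2], [0, 1]]
```

The `stride` and `padding` arguments are left at the slide's settings here; changing them only changes which windows are visited.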
19. What is a convolution ? Padding
Source: Machine Learning guru - Neural Networks CNN
20. What is a convolution ? Stride = 2
Source: Machine Learning guru - Neural Networks CNN
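Padding and stride together determine the size of the feature map. The helper below is a small sketch of the usual size formula, floor((n + 2p - k) / s) + 1, for one spatial dimension; the function name is hypothetical and the example sizes other than the 3x3 / 2x2 case from slide 18 are made up for illustration.

```python
def conv_output_size(n, k, padding=0, stride=1):
    """Output length along one dimension for input size n and kernel size k."""
    return (n + 2 * padding - k) // stride + 1


# Slide 18: 3x3 input, 2x2 kernel, no padding, stride 1 gives a 2x2 feature map.
print(conv_output_size(3, 2))                        # 2
# Illustrative: padding of 1 keeps a 5x5 input at 5x5 with a 3x3 kernel.
print(conv_output_size(5, 3, padding=1))             # 5
# Illustrative: stride 2 roughly halves the output size.
print(conv_output_size(5, 3, stride=2))              # 2
```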
21. What is a convolution ? Multi Channel
1 convolutional filter
(3)x(3x3)
Source: Machine Learning guru - Neural Networks CNN
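With several input channels, one filter holds one kernel per channel and the per-channel results are summed into a single feature map. The sketch below assumes stride 1 and no padding; the input and filter values are invented so that the result is easy to verify by hand.

```python
def conv2d_multichannel(image, kernel, bias=0):
    """image: [C][H][W], kernel: [C][kH][kW] -> one feature map [H-kH+1][W-kW+1]."""
    channels = len(image)
    k_h, k_w = len(kernel[0]), len(kernel[0][0])
    out_h = len(image[0]) - k_h + 1
    out_w = len(image[0][0]) - k_w + 1
    return [
        [
            # Sum over channels and over the kernel window, then add the bias.
            bias + sum(
                kernel[c][a][b] * image[c][i + a][j + b]
                for c in range(channels)
                for a in range(k_h)
                for b in range(k_w)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 3-channel 3x3 input and one (3)x(3x3) filter.
image = [
    [[1, 1, 1], [1, 1, 1], [1, 1, 1]],  # channel 0: all ones
    [[2, 2, 2], [2, 2, 2], [2, 2, 2]],  # channel 1: all twos
    [[0, 0, 0], [0, 1, 0], [0, 0, 0]],  # channel 2: a single one in the centre
]
kernel = [
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],  # picks the diagonal: contributes 3
    [[0, 0, 0], [0, 1, 0], [0, 0, 0]],  # picks the centre: contributes 2
    [[1, 1, 1], [1, 1, 1], [1, 1, 1]],  # sums everything: contributes 1
]
result = conv2d_multichannel(image, kernel, bias=1)
print(result)  # [[7]]
```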
29. Why does it work?
- Detect patterns at larger and larger scales by stacking convolution layers on top of each other to grow the receptive field
- Applicable to spatially correlated data
Source: the first 96 filters learned by AlexNet, represented in RGB space
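How quickly the receptive field grows can be computed directly. The following is a sketch of the standard recurrence for a stack of convolution (or pooling) layers: each layer adds (kernel size - 1) times the product of the strides of the layers before it. The function name and the example layer stacks are illustrative, not taken from the talk.

```python
def receptive_field(layers):
    """layers: list of (kernel_size, stride) pairs, ordered from input to output."""
    field, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for kernel_size, stride in layers:
        field += (kernel_size - 1) * jump
        jump *= stride
    return field


print(receptive_field([(3, 1)]))                  # 3
print(receptive_field([(3, 1), (3, 1)]))          # 5
print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # 7
print(receptive_field([(3, 2), (3, 1)]))          # 7
```

Striding makes the field grow faster: two layers with a stride of 2 in the first one see as much of the input as three stride-1 layers.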
36. Hierarchical Navigable Small World
“Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs”
Yu. A. Malkov, D. A. Yashunin, arXiv preprint arXiv:1603.09320 (2016)
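The core idea can be illustrated with the greedy routing step on a single graph layer: start at an entry point and keep moving to whichever neighbor is closest to the query until no neighbor improves. This is a deliberately simplified sketch, not the hnsw library's implementation; the full algorithm in the paper stacks several such layers and keeps a list of candidates on the bottom layer. The tiny hand-built graph, including its one long-range link, is hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def greedy_search(vectors, neighbors, query, entry_point):
    """Walk the graph, always moving to the neighbor closest to the query."""
    current = entry_point
    current_dist = squared_distance(vectors[current], query)
    while True:
        best, best_dist = current, current_dist
        for candidate in neighbors[current]:
            dist = squared_distance(vectors[candidate], query)
            if dist < best_dist:
                best, best_dist = candidate, dist
        if best == current:  # local minimum: no neighbor is closer
            return current, current_dist
        current, current_dist = best, best_dist


# Hypothetical data: five points on a line, linked as a chain,
# plus one long-range "small world" link between nodes 0 and 3.
vectors = {0: (0, 0), 1: (1, 0), 2: (2, 0), 3: (3, 0), 4: (4, 0)}
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2, 4], 4: [3]}

found = greedy_search(vectors, neighbors, query=(4, 1), entry_point=0)
print(found)  # (4, 1): node 4, at squared distance 1
```

In this example the search visits nodes 0, 3, and 4 only; the long-range link lets it skip nodes 1 and 2, which is what keeps the number of distance computations low on large graphs.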
41. Workflow and Operationalization
- Develop model using a Jupyter notebook
- Train model on GPU instance
- Package model behind a web API in a Docker container, e.g. using MXNet Model Server
- Upload container to a container registry
- Deploy container to an elastic container service / Fargate task
- Enjoy quick and linear scaling
- Put the API behind a load balancer with SSL termination
- Enjoy
Diagram labels: GPU instance, Container Registry, Elastic Container Service, Container, Auto-scaling, Load Balancer, HTTPS request, HTTPS response
{
"prediction": {
{
"item1":
…
}
}
}
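The "model behind a web API" step could look roughly like the sketch below. It uses Python's standard-library `http.server` as a stand-in for MXNet Model Server, and everything in it is an assumption made for illustration: the `search_similar` placeholder, the port, and the exact shape of the JSON payload are not taken from the talk.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def search_similar(image_bytes, k=3):
    # Hypothetical placeholder for "featurize the image, then query the index".
    # It ignores its input and returns dummy item ids.
    return [f"item{i + 1}" for i in range(k)]


def build_response(image_bytes):
    """Serialize the search result as a JSON string."""
    return json.dumps({"prediction": search_similar(image_bytes)})


class SearchHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the uploaded image from the request body.
        length = int(self.headers.get("Content-Length", 0))
        body = build_response(self.rfile.read(length)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    # Plain HTTP inside the container: SSL is terminated at the load balancer.
    HTTPServer(("0.0.0.0", port), SearchHandler).serve_forever()


print(build_response(b"raw image bytes"))
```

Calling `serve()` as the container's entry point would start the server. Keeping the handler stateless is what allows the container service to add or remove copies behind the load balancer.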
42. Demo Resource
• Python Notebook
• Apache MXNet
• KNN Search: Hierarchical Navigable Small Worlds (hnsw)
• Dataset:
• 8M Amazon products (2013)
• Image-based recommendations on styles and substitutes, J. McAuley, C. Targett, J. Shi, A. van den Hengel, SIGIR, 2015
43. Amazon resources:
• Featurized product images:
• Embeddings for each ASIN
• Product Image Embedding wiki
• Contact Jianfeng, ljianfen@
• K-Nearest Neighbor Search:
• Nearest Neighbor Service wiki
• Contact Pracheer, guptapra@