Deep Learning in E-Commerce:
Applications and Challenges
A guide by Houda Bakir
houda.bakir@datavora.com
DL and ML Trends
Reebok Classic Leather Purple
$90.80
http://www.ebay.co.uk/itm/Reebok-Classic-Leather-Womens-Suede-Purple-Red-Trainers-New-Shoes-All-Sizes-/381019249763
AI, ML, DL
What a trend
AI, ML and DL
Evolution of ML
Classes of ML/DL
Product Matching
Preprocessing the Data
Use text classification:
- Support Vector Machine (SVM)
- Convolutional Neural Network (CNN)
in order to filter the useful data.
Build time series of targeted products.
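A minimal sketch of the SVM option with scikit-learn: TF-IDF features plus a linear SVM, trained to keep the offers that matter. The product titles and labels below are hypothetical toy data, not Datavora's.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical labeled product titles: 1 = relevant to the target category, 0 = noise
titles = [
    "Reebok Classic Leather Purple trainers",
    "Nike Air Max running shoes size 9",
    "Adidas Superstar white sneakers",
    "Samsung Galaxy S8 64Gb smartphone",
    "iPhone 8 case silicone cover",
    "USB charging cable 2m",
]
labels = [1, 1, 1, 0, 0, 0]

# TF-IDF features + linear SVM: the classic text-classification baseline
clf = make_pipeline(TfidfVectorizer(), LinearSVC())
clf.fit(titles, labels)

# New, unseen title: classify before adding it to a time series
print(clf.predict(["Puma suede trainers purple"]))
```

The same pattern scales to real feeds: fit on labeled titles, then filter the incoming stream before building the time series of targeted products.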
Samsung Galaxy S8 64Gb
2. Prediction
H. Bakir, G. Chniti, H. Zaher, "E-Commerce price forecasting using LSTM neural networks," International Journal of Machine Learning and Computing 8(2), 169-174.
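The cited LSTM forecasting approach can be sketched in Keras roughly as follows; the window size, layer width, and synthetic price series here are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from tensorflow import keras

# Synthetic daily price series for one product (illustrative only)
prices = 90.0 + 5.0 * np.sin(np.arange(200) / 10.0)

# Turn the series into (window -> next price) supervised pairs
window = 14
X = np.array([prices[i:i + window] for i in range(len(prices) - window)])
y = prices[window:]
X = X[..., np.newaxis]  # LSTM expects (samples, timesteps, features)

model = keras.Sequential([
    keras.layers.Input(shape=(window, 1)),
    keras.layers.LSTM(32),
    keras.layers.Dense(1),  # one-step-ahead price forecast
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, batch_size=16, verbose=0)

# Forecast the next price from the most recent window
next_price = model.predict(X[-1:], verbose=0)
print(next_price.shape)  # one sample, one forecast value
```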
Use case: Shoes
Convolutional Neural Network
Clothes Classification
[Convolutional Neural Networks for Clothes Categories, September 2015, doi: 10.1007/978-3-662-48570-5_12]
Use Case: Clothes Classification
Computer Vision For E-Commerce
Convolutional Neural Networks
➔ Introduction
➔ Convolutional Neural Network
➔ The different layers
➔ TensorFlow and Keras
Object Detection
Agenda
- You will learn the Convolutional Neural Network
- How to implement it with Keras and TensorFlow
Deep Neural Network
Fully Connected (FC) Neural Network
Inspiration for Convolutional Networks
"Receptive fields, binocular interaction and functional architecture in the cat's visual cortex," 1962 - D. H. Hubel and T. N. Wiesel
New Idea: Known unknowns => unknown unknowns
Neocognitron - Fukushima 1980
Convolutional Neural Network
Example of a CNN architecture:
LeNet-5 by LeCun in 1998
Convolutional Neural Networks (CNNs)
CNNs are similar to ordinary neural networks: they have weights and biases, and produce outputs through a nonlinear activation.
A convolutional layer
Convolutional Layer
A filter slides over the 5 x 5 input image with stride = 1; this is repeated for each filter (Filter 1, Filter 2, ...).
Convolution
The 6 x 6 image:
1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
1 0 0 0 1 0
0 1 0 0 1 0
0 0 1 0 1 0

Filter 1 (3 x 3):
 1 -1 -1
-1  1 -1
-1 -1  1

Filter 2 (3 x 3):
-1  1 -1
-1  1 -1
-1  1 -1

These filters are the network parameters to be learned. Each filter detects a small (3 x 3) pattern.
Convolution
With stride = 1, Filter 1 takes the dot product with each 3 x 3 patch of the 6 x 6 image; the first two results are 3 and -1.
Convolution
With stride = 2, the filter moves two pixels at a time; the first two results are 3 and -3.
Convolution
Sliding Filter 1 over the whole image with stride = 1 gives a 4 x 4 output:
 3 -1 -3 -1
-3  1  0 -3
-3 -3  0  1
 3 -2 -2 -1
Convolution
Filter 2 gives a second 4 x 4 output:
-1 -1 -1 -1
-1 -1 -2  1
-1 -1 -2  1
-1  0 -4  3
The two 4 x 4 outputs together form a 2 x 4 x 4 feature map.
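The two feature maps above can be reproduced in a few lines of NumPy. This is a plain sliding-window sketch, not an optimized implementation:

```python
import numpy as np

image = np.array([
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
])
filter1 = np.array([[1, -1, -1], [-1, 1, -1], [-1, -1, 1]])
filter2 = np.array([[-1, 1, -1], [-1, 1, -1], [-1, 1, -1]])

def conv2d(img, filt, stride=1):
    k = filt.shape[0]
    size = (img.shape[0] - k) // stride + 1
    out = np.zeros((size, size), dtype=int)
    for i in range(size):
        for j in range(size):
            # dot product of the filter with one k x k patch
            patch = img[i * stride:i * stride + k, j * stride:j * stride + k]
            out[i, j] = int(np.sum(patch * filt))
    return out

print(conv2d(image, filter1))  # first row: 3 -1 -3 -1
print(conv2d(image, filter2))  # first row: -1 -1 -1 -1
```

With `stride=2` the same function returns the 2 x 2 output whose first row is 3, -3, matching the stride-2 slide.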
Convolution vs. Fully Connected
Convolution can be viewed as a fully connected layer with most connections removed: each output value depends only on one 3 x 3 patch of the image, not on all 36 pixels.
Flattening the 6 x 6 image into 36 inputs, the first output value (3) connects to only 9 of them (the pixels of its 3 x 3 patch), not to all 36: fewer parameters!
Moving the filter by one stride gives the next output value (-1), which connects to a different set of 9 inputs but uses the same 9 weights: shared weights, even fewer parameters!
Color image: RGB, 3 channels
A color image has three 6 x 6 channels (R, G, B), so each filter is also 3 channels deep: a 3 x 3 x 3 block of weights, one 3 x 3 kernel per input channel.
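A sketch of the 3-channel case in plain NumPy, reusing the same 6 x 6 grid for every channel as the slide does (so each output value is simply three times the single-channel result):

```python
import numpy as np

# The slide reuses the same 6 x 6 grid for all three color channels
channel = np.array([
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
])
image = np.stack([channel, channel, channel])  # 3 x 6 x 6 (R, G, B)

# A 3-channel filter: one 3 x 3 kernel per input channel
kernel = np.array([[1, -1, -1], [-1, 1, -1], [-1, -1, 1]])
filt = np.stack([kernel, kernel, kernel])      # 3 x 3 x 3

def conv2d_multi(img, f):
    size = img.shape[1] - f.shape[1] + 1
    out = np.zeros((size, size), dtype=int)
    for i in range(size):
        for j in range(size):
            # multiply patch and filter across all channels, then sum everything
            out[i, j] = int(np.sum(img[:, i:i + 3, j:j + 3] * f))
    return out

print(conv2d_multi(image, filt)[0, 0])  # 9: the single-channel value 3, summed over 3 channels
```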
The whole CNN
Image → Convolution → Max Pooling → Convolution → Max Pooling (can repeat many times) → Flattened → Fully Connected Feedforward network → cat, dog, ...
Max Pooling
Each 4 x 4 feature map (one per filter) is divided into 2 x 2 blocks, and each block is reduced to its maximum value.
After Conv and Max Pooling, the 6 x 6 image becomes two 2 x 2 maps, one per filter (each filter is a channel):
Filter 1:
3 0
3 1
Filter 2:
-1 1
 0 3
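The 2 x 2 max pooling step can be sketched the same way, again in plain NumPy, applied to the 4 x 4 feature map that Filter 1 produced in the convolution step:

```python
import numpy as np

# 4 x 4 feature map produced by Filter 1 in the convolution step
feature_map = np.array([
    [ 3, -1, -3, -1],
    [-3,  1,  0, -3],
    [-3, -3,  0,  1],
    [ 3, -2, -2, -1],
])

def max_pool(fmap, size=2):
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    out = np.zeros((h, w), dtype=int)
    for i in range(h):
        for j in range(w):
            # keep only the strongest activation in each size x size block
            block = fmap[i * size:(i + 1) * size, j * size:(j + 1) * size]
            out[i, j] = int(block.max())
    return out

print(max_pool(feature_map))  # [[3 0] [3 1]]
```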
Convolution → Max Pooling can repeat many times; each round produces a new image that is smaller than the original, with as many channels as there are filters.
Flattening
The pooled 2 x 2 x 2 maps are flattened into a single vector (3, 0, 3, 1, -1, 1, 0, 3) and fed to a Fully Connected Feedforward network.
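Flattening is just a reshape; a one-line sketch with two pooled 2 x 2 maps as input:

```python
import numpy as np

# Two pooled 2 x 2 maps, one per filter (channel-first: 2 x 2 x 2)
pooled = np.array([
    [[3, 0], [3, 1]],    # channel from Filter 1
    [[-1, 1], [0, 3]],   # channel from Filter 2
])

flat = pooled.reshape(-1)  # one vector for the fully connected network
print(flat)
```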
Visualizing Convolutional Neural Networks
Visualizing Neural Networks
Region-based Convolutional Networks (R-CNN)
Source: Mask R-CNN paper.
CNN improvement
TensorFlow
1. A powerful deep learning framework
2. Keras
3. Eager execution is optional (TensorFlow Eager)
4. TensorBoard
CNN in Keras
Input (1 x 28 x 28) → Convolution (25 x 26 x 26) → Max Pooling (25 x 13 x 13) → Convolution (50 x 11 x 11) → Max Pooling (50 x 5 x 5)
How many parameters per filter? First convolution: 9 (each filter is 3 x 3 x 1). Second convolution: 225 = 25 x 9 (each filter is 3 x 3 x 25).
CNN in Keras
Input (1 x 28 x 28) → Convolution (25 x 26 x 26) → Max Pooling (25 x 13 x 13) → Convolution (50 x 11 x 11) → Max Pooling (50 x 5 x 5) → Flattened (1250) → Fully connected feedforward network → Output
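This architecture can be written as a Keras model. A minimal sketch: the 3 x 3 kernels and 2 x 2 pools are implied by the shapes on the slide, and the 10-class softmax output is an assumption.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),         # 1 x 28 x 28 input
    layers.Conv2D(25, (3, 3)),               # -> 26 x 26 with 25 channels
    layers.MaxPooling2D((2, 2)),             # -> 13 x 13 with 25 channels
    layers.Conv2D(50, (3, 3)),               # -> 11 x 11 with 50 channels
    layers.MaxPooling2D((2, 2)),             # -> 5 x 5 with 50 channels
    layers.Flatten(),                        # -> 1250
    layers.Dense(10, activation="softmax"),  # output layer (10 classes assumed)
])
print(model.count_params())
```

Note that Keras uses channels-last shapes (28, 28, 1), whereas the slide writes channels first. The first Conv2D has 9 weights per filter (3 x 3 x 1) and the second has 225 (3 x 3 x 25), matching the counts on the slide (biases excluded).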
CNN Loss Functions
These can be grouped into the following categories:
1. Binary classification (SVM hinge loss, squared hinge loss).
2. Identity verification (contrastive loss).
3. Multi-class classification (softmax loss, expectation loss).
4. Regression (SSIM, ℓ1 error, Euclidean loss).
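For instance, the multi-class softmax loss (category 3) is cross-entropy applied to softmax outputs; a direct NumPy computation on a hypothetical 3-class example:

```python
import numpy as np

logits = np.array([2.0, 1.0, 0.1])  # raw network outputs for one example
target = np.array([1.0, 0.0, 0.0])  # one-hot label: true class is class 0

probs = np.exp(logits) / np.sum(np.exp(logits))  # softmax
loss = -np.sum(target * np.log(probs))           # cross-entropy

print(round(float(loss), 3))
```

The result is about 0.417; Keras computes the same quantity via its `categorical_crossentropy` loss.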
Questions?
