Product categories are the structural backbone of every online shop. To attract customers and facilitate navigation, categories need to be easily understandable, logical and consistent. As the volume of product data grows, it is becoming increasingly difficult for retailers to match products to appropriate categories; large product catalogs and the need to adapt quickly to changes often lead to costly misclassifications. In this talk, Amadeus presents the approach taken at commercetools to build a category recommendation system using machine learning methods. He talks about the use of deep neural nets and transfer learning to build an image classifier, word2vec and tf-idf to build a text classifier, and how to integrate these models into a REST API.
4. 11/2017 4
Machine Learning for Category Recommendations
Goal: Use machine learning to automatically recommend categories for products
[Figure: example category tree with the nodes Fashion, Men, Women, Sports, Business, Shoes and Pants]
5. 11/2017 5
Challenges
Challenge: Every online store has a different category structure.
[Figure: two example category trees. Store 1: Fashion, with the nodes Men, Women and Jeans. Store 2: Clothing, with the nodes Pants, Shirts and Shoes.]
8. 11/2017 8
Challenges
Challenge: Every online store has a different category structure.
Option 1: multiple store-specific models (Store 1, Store 2 and Store 3 each get their own model, Model 1 to Model 3, which predicts that store's categories directly)
• Better accuracies for stores with very specific categories
• No category matching necessary
Option 2: one general model (a single model predicts General Categories, which are then matched to the categories of Store 1, Store 2 and Store 3)
• More data per model
• More flexible
• Easier to deploy
• Also works for stores with little data
• Can also recommend categories that are not yet defined in the store
9. 11/2017 9
Challenges
Challenge: Product data is diverse and unbalanced, which complicates feature selection.
Available product features:
• Product names
• Images
• Prices
• Descriptions
• Sizes
• Brands
• Colors
• Expiration Dates
• …
Approach:
→ Focus on three features: names, images and descriptions
• carry most information
• available for most products
10. 11/2017 10
Challenges
Challenge: Very large class set
• Amazon/eBay list 50,000+ categories
• Tradeoff: Coverage vs. Accuracy
Approach:
→ select broad model categories to cover main use cases
→ rely on the category matching procedure to catch more specialized categories
→ current version has a selection of 723 model categories
11. 11/2017 11
Overview of Approach
[Diagram: the Image Model, Name Model and Description Model each predict over the 723 General Categories; the predicted general category is then matched to the categories of Store 1, Store 2 and Store 3.]
13. 11/2017 13
Model for Product Images
• Model: Convolutional Neural Network (Deep Learning)
• Similar to mechanisms in the brain: builds complex representations by combining simple representations
• Trained via transfer learning from the well-known image recognition network Inception v3 (TensorFlow, Google Cloud ML Engine)
15. 11/2017 15
Model for Product Names
Examples:
“Mens Heavyweight 6.1-ounce, 100% cotton T-Shirts in Regular, Big and Tall Sizes”
“Gala Apples Fresh Fruit, 3 LB Bag”
“Carhartt Men's Maddock Pocket T-Shirt Size M”
“Samsung SM-G900V - Galaxy S5 - 16GB Android Smartphone Verizon + GSM - Black”
Preprocessing: (spacy, re, gensim, Google Translate, pyenchant)
• spellchecker (smartwathc → smartwatch)
• translation (German → English)
• tokenization (complete names → words)
• normalization (lowercasing, deleting special characters)
• lemmatization (apples → apple)
• phrasing (louis vuitton → louis_vuitton)
• word removal (stop words, blacklist)
Models: (scikit-learn)
• Logistic Regression
• Naive Bayes
• Random Forest
• XGBoost
• Support Vector Machine
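A few of the preprocessing steps listed on this slide (normalization, tokenization, phrasing and word removal) can be sketched with the standard library alone. This is only an illustration, not the pipeline used in the talk: the stop-word list, the blacklist and the phrase table below are hypothetical, and spell checking, translation and lemmatization are left out because they need external tools such as the ones named on the slide.

```python
import re

# Hypothetical word lists for illustration; real lists would be much longer.
STOP_WORDS = {"in", "and", "the", "of", "for", "with"}
BLACKLIST = {"size", "sizes"}
# Hypothetical phrase table; a library such as gensim could learn such pairs from data.
PHRASES = {("louis", "vuitton"), ("t", "shirt")}


def normalize(name):
    """Lowercase, drop apostrophes, and turn other special characters into spaces."""
    text = name.lower()
    text = re.sub(r"['’]", "", text)
    return re.sub(r"[^a-z0-9]+", " ", text)


def tokenize(text):
    """Split a normalized name into words."""
    return text.split()


def join_phrases(tokens):
    """Merge adjacent tokens that form a known phrase, e.g. louis vuitton -> louis_vuitton."""
    merged = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in PHRASES:
            merged.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def remove_words(tokens):
    """Drop stop words and blacklisted words."""
    return [t for t in tokens if t not in STOP_WORDS and t not in BLACKLIST]


def preprocess(name):
    """Run normalization, tokenization, phrasing and word removal in sequence."""
    return remove_words(join_phrases(tokenize(normalize(name))))


tokens = preprocess("Louis Vuitton Bag for the Office!")
```

Here `tokens` contains the phrase token `louis_vuitton` followed by `bag` and `office`; the stop words and the exclamation mark are gone.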
18. 11/2017 18
Model for Product Names
Vectorization methods: (text → numbers)
bag-of-words:
• Simple approach, but sparse representation and blind to context
word2vec:
• Trains a two-layer neural network that predicts the context words of a word
• Results in a dense and context-sensitive representation
tf-idf:
• Similar to bag-of-words, but weights words more heavily when they occur rarely in the dataset
• Intuition: “the” has less predictive value than “iPhone”
• TF(w) = (number of times word w appears in name) / (total number of words in name)
• IDF(w) = log_e(total number of names / number of names containing word w)
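The TF and IDF definitions on this slide can be written out directly. The sketch below follows those two formulas literally and multiplies them to get a tf-idf weight per word; the three tokenized product names are made up for illustration. Library implementations often use smoothed variants, so their numbers can differ from this version.

```python
import math
from collections import Counter


def tf(name_tokens):
    """TF(w) = occurrences of w in the name / total number of words in the name."""
    total = len(name_tokens)
    if total == 0:
        return {}
    return {w: c / total for w, c in Counter(name_tokens).items()}


def idf(all_names):
    """IDF(w) = log_e(total number of names / number of names containing w)."""
    doc_freq = Counter()
    for tokens in all_names:
        doc_freq.update(set(tokens))  # count each word once per name
    n = len(all_names)
    return {w: math.log(n / df) for w, df in doc_freq.items()}


def tf_idf(all_names):
    """Return one {word: tf * idf} dictionary per product name."""
    idf_scores = idf(all_names)
    return [
        {w: t * idf_scores[w] for w, t in tf(tokens).items()}
        for tokens in all_names
    ]


# Made-up, already tokenized product names.
names = [
    ["the", "iphone", "case"],
    ["the", "apple", "bag"],
    ["the", "running", "shoes"],
]
weights = tf_idf(names)
```

Because "the" occurs in every name, its IDF is log_e(3/3) = 0 and it gets weight 0, while "iphone" occurs in one of three names and gets (1/3) · log_e(3). This matches the intuition stated on the slide.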
19. 11/2017 19
Model for Product Names
Model for Product Descriptions
20. 11/2017 20
Category Matching
Model categories are matched to store-specific categories via a word2vec model trained on a news dataset
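A minimal sketch of matching by word-vector similarity follows. It assumes that a category name is embedded by averaging the vectors of its words and that candidates are compared by cosine similarity; both are assumptions for illustration, since the slide only states that a word2vec model trained on a news dataset is used. The 3-dimensional vectors are made up and stand in for real word2vec vectors.

```python
import math

# Hypothetical toy vectors; real word2vec vectors have hundreds of dimensions.
TOY_VECTORS = {
    "trousers": [0.9, 0.1, 0.0],
    "pants": [0.8, 0.2, 0.1],
    "shirts": [0.1, 0.9, 0.0],
    "shoes": [0.0, 0.1, 0.9],
}


def embed(category_name, vectors):
    """Average the vectors of all known words in a category name (assumed strategy)."""
    words = [w for w in category_name.lower().split() if w in vectors]
    if not words:
        return None
    dim = len(vectors[words[0]])
    return [sum(vectors[w][i] for w in words) / len(words) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_category(model_category, store_categories, vectors):
    """Return the store category whose embedding is most similar to the model category."""
    source = embed(model_category, vectors)
    if source is None:
        return None
    best, best_score = None, -1.0
    for candidate in store_categories:
        target = embed(candidate, vectors)
        if target is None:
            continue  # no known words in this store category
        score = cosine(source, target)
        if score > best_score:
            best, best_score = candidate, score
    return best


best = match_category("Trousers", ["Pants", "Shirts", "Shoes"], TOY_VECTORS)
```

With these toy vectors the general category "Trousers" is matched to the store category "Pants", even though the two names share no characters, which is the point of using embeddings rather than string comparison.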
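[Diagram: the Image Model, Name Model and Description Model each predict over the 723 General Categories; their predictions are combined by averaging class probabilities, and the resulting general category is matched to the categories of Store 1, Store 2 and Store 3 via word2vec similarity.]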
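The step of averaging class probabilities can be sketched as follows. The example assumes that every model returns a probability for the same set of categories and that all models get equal weight; the slide does not say whether weights are used. The category names and probabilities are made up for illustration.

```python
def average_probabilities(predictions):
    """Average per-category probabilities across models (equal weights assumed)."""
    categories = predictions[0].keys()
    return {
        c: sum(p[c] for p in predictions) / len(predictions)
        for c in categories
    }


# Made-up outputs of the three models for one product.
image_probs = {"Shoes": 0.6, "Pants": 0.3, "Shirts": 0.1}
name_probs = {"Shoes": 0.2, "Pants": 0.7, "Shirts": 0.1}
description_probs = {"Shoes": 0.4, "Pants": 0.5, "Shirts": 0.1}

combined = average_probabilities([image_probs, name_probs, description_probs])
top_category = max(combined, key=combined.get)
```

In this example the image model alone would pick "Shoes", but the averaged probabilities favour "Pants", so `top_category` is "Pants".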
24. 11/2017 24
Thank you!
Amadeus Magrabi
@amadeusmagrabi
amadeus.magrabi@commercetools.com
[Diagram: the Image Model (TensorFlow, Inception), Name Model (tf-idf, LogReg) and Description Model (tf-idf, LogReg) each predict over the 723 General Categories; their predictions are combined by averaging class probabilities, and the result is matched to the categories of Store 1, Store 2 and Store 3 via word2vec similarity.]