Amadeus Magrabi
@amadeusmagrabi
BOOSTING PRODUCT CATEGORIZATION WITH MACHINE LEARNING
…
11/2017 2
Company:
Customers: People who want to sell something online
…
Main product: REST API to manage online shops
User Interface
Machine Learning for Category Recommendations
Goal: Use machine learning to automatically recommend categories for products.
[Diagram: example category tree with the nodes Fashion, Men, Women, Sports, Shoes, Pants, Business]
Challenges
Challenge: Every online store has a different category structure.
[Diagram: two example category trees, labeled Store 1 and Store 2, with the nodes Fashion, Men, Women, Jeans and Clothing, Pants, Shirts, Shoes]
Option 1: multiple store-specific models
[Diagram: Store 1, Store 2 and Store 3 each have their own model (Model 1, Model 2, Model 3) that predicts that store's categories]
• Better accuracies for stores with very specific categories
• No category matching necessary
Option 2: one general model
[Diagram: a single model (Model 1) predicts General Categories, which are then matched to the categories of Store 1, Store 2 and Store 3]
• More data-per-model
• More flexible
• Easier to deploy
• Also works for stores with little data
• Can also recommend categories that are not yet defined in the store
Challenges
Challenge: Product data is diverse and unbalanced, which complicates feature selection.
• Product names
• Images
• Prices
• Descriptions
• Sizes
• Brands
• Colors
• Expiration Dates
• …
Approach:
→ Focus on names, images and descriptions as features
• carry most information
• available for most products
Challenges
Challenge: Very large class set
• Amazon and eBay list 50,000+ categories
• Tradeoff: Coverage vs. Accuracy
Approach:
→ select broad model categories to cover main use cases
→ rely on the category matching procedure to catch more specialized categories
→ current version has a selection of 723 model categories
Overview of Approach
[Diagram: the Name Model, Description Model and Image Model each predict over the 723 General Categories; the general categories are then matched to the categories of Store 1, Store 2 and Store 3]
Model for Product Images
• Model: Convolutional Neural Network (Deep Learning)
• Similar to mechanisms in the brain: idea of building complex representations by combining simple representations
• Trained via transfer learning on the well-known image recognition network Inception v3 (TensorFlow, Google Cloud ML Engine)
Model for Product Names
Preprocessing: (spacy, re, gensim, Google Translate, pyenchant)
• spellchecker (smartwathc → smartwatch)
• translation (German → English)
• tokenization (complete names → words)
• normalization (lowercasing, deleting special characters)
• lemmatization (apples → apple)
• phrasing (louis vuitton → louis_vuitton)
• word removal (stop words, blacklist)
Examples:
“Mens Heavyweight 6.1-ounce, 100% cotton T-Shirts in Regular, Big and Tall Sizes”
“Gala Apples Fresh Fruit, 3 LB Bag”
“Carhartt Men's Maddock Pocket T-Shirt Size M”
“Samsung SM-G900V - Galaxy S5 - 16GB Android Smartphone Verizon + GSM - Black”
Models: (scikit-learn)
• Logistic Regression
• Naive Bayes
• Random Forest
• XGBoost
• Support Vector Machine
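A rough sketch of some of the preprocessing steps for product names, using only Python's standard library. The stop-word set, phrase table and lemma dictionary are tiny hypothetical stand-ins for what spacy, gensim and a real blacklist would provide, and spellchecking and translation are left out because they need external resources.

```python
import re

# Hypothetical, tiny stand-ins for the real resources (stop-word list,
# phrase list, lemma dictionary); not the data used in the talk.
STOP_WORDS = {"in", "and", "the", "of", "for", "with"}
PHRASES = {("louis", "vuitton"): "louis_vuitton"}
LEMMAS = {"apples": "apple", "shirts": "shirt", "sizes": "size"}


def preprocess(name):
    """Turn a raw product name into a list of cleaned tokens."""
    # normalization: lowercase and replace special characters with spaces
    text = re.sub(r"[^a-z0-9\s]", " ", name.lower())
    # tokenization: split the name into words
    tokens = text.split()
    # lemmatization: dictionary lookup only, as a placeholder
    tokens = [LEMMAS.get(tok, tok) for tok in tokens]
    # phrasing: merge known two-word phrases into a single token
    merged, i = [], 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in PHRASES:
            merged.append(PHRASES[pair])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    # word removal: drop stop words
    return [tok for tok in merged if tok not in STOP_WORDS]


apple_tokens = preprocess("Gala Apples Fresh Fruit, 3 LB Bag")
bag_tokens = preprocess("Louis Vuitton Bag for Women")
```

Lemmatization runs before phrasing here so that phrases are matched on normalized words; the actual order used in the talk's pipeline is not stated beyond the list above.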
Model for Product Names
Vectorization methods: (text → numbers)
bag-of-words:
• Simple approach, but sparse representation and blind to context
tf-idf:
• Similar to bag-of-words, but weights words more heavily when they occur rarely in the dataset
• Intuition: “the” has less predictive value than “iPhone”
• TF(w) = (number of times word w appears in name) / (total number of words in name)
• IDF(w) = log_e(total number of names / number of names containing word w)
word2vec:
• Trains a two-layer neural network that predicts the context words of a word
• Results in a dense and context-sensitive representation
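The TF and IDF formulas can be tried out on a toy corpus. This is a minimal sketch in plain Python, assuming the usual definition of the tf-idf weight as the product TF(w) × IDF(w); the three tokenized names are made up for illustration, and in practice a library vectorizer would be used instead.

```python
import math
from collections import Counter


def tf(word, name_tokens):
    """TF(w): occurrences of the word divided by the number of words in the name."""
    return Counter(name_tokens)[word] / len(name_tokens)


def idf(word, all_names):
    """IDF(w): natural log of (number of names / number of names containing the word)."""
    containing = sum(1 for tokens in all_names if word in tokens)
    return math.log(len(all_names) / containing)


def tfidf_vector(name_tokens, all_names, vocabulary):
    """One tf-idf weight per vocabulary word; words absent from the name get 0.

    Assumes the name is part of all_names, so every word it contains
    occurs in at least one name and idf never divides by zero.
    """
    return [
        tf(word, name_tokens) * idf(word, all_names) if word in name_tokens else 0.0
        for word in vocabulary
    ]


# Toy corpus of already tokenized product names (made up for illustration).
names = [
    ["the", "iphone", "case"],
    ["the", "apple", "bag"],
    ["the", "cotton", "shirt"],
]
vocabulary = sorted({word for tokens in names for word in tokens})
vector = tfidf_vector(names[0], names, vocabulary)
```

In this corpus "the" occurs in every name, so its IDF is log_e(3/3) = 0 and it contributes nothing, while "iphone" occurs in one name only and receives a positive weight.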
Model for Product Names 

Model for Product Descriptions
Category Matching
Model categories are matched to store-specific categories via a word2vec model trained on a news dataset
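As an illustration of the matching step, the sketch below picks the store category whose word vector is most similar to the predicted model category. Cosine similarity is assumed as the similarity measure, and the 3-dimensional vectors and category names are hypothetical; a real word2vec model trained on a news corpus would supply the vectors.

```python
import math

# Hypothetical 3-dimensional word vectors, for illustration only.
EMBEDDINGS = {
    "trousers": [0.9, 0.1, 0.0],
    "pants": [0.8, 0.2, 0.1],
    "shirts": [0.3, 0.9, 0.1],
    "shoes": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def match_category(model_category, store_categories):
    """Return the store category whose vector is closest to the model category."""
    target = EMBEDDINGS[model_category]
    return max(
        store_categories,
        key=lambda cat: cosine_similarity(target, EMBEDDINGS[cat]),
    )


best = match_category("trousers", ["pants", "shirts", "shoes"])
```

With these toy vectors, the general category "trousers" is matched to the store category "pants".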
[Diagram: the Name Model, Description Model and Image Model each predict over the 723 General Categories; their class probabilities are averaged, and the general categories are matched to the categories of Store 1, Store 2 and Store 3 via word2vec similarity]
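Averaging class probabilities across the three models could look roughly like this. The sketch assumes a plain unweighted mean and that all models score the same set of categories; the probabilities are made up for illustration.

```python
def average_probabilities(model_outputs):
    """Average per-category probabilities across models.

    model_outputs: list of dicts mapping category name -> probability,
    one dict per model, all over the same set of categories.
    """
    categories = model_outputs[0].keys()
    n_models = len(model_outputs)
    return {
        cat: sum(output[cat] for output in model_outputs) / n_models
        for cat in categories
    }


# Made-up probabilities for one product over three general categories.
name_model = {"shoes": 0.6, "pants": 0.3, "shirts": 0.1}
description_model = {"shoes": 0.5, "pants": 0.4, "shirts": 0.1}
image_model = {"shoes": 0.1, "pants": 0.8, "shirts": 0.1}

averaged = average_probabilities([name_model, description_model, image_model])
recommended = max(averaged, key=averaged.get)
```

Here the name and description models favor "shoes", but the image model is confident enough about "pants" that the averaged recommendation is "pants".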
REST API: General API
REST API: Store-Specific API
GUI Integration
Thank you!
Amadeus Magrabi
@amadeusmagrabi
amadeus.magrabi@commercetools.com
[Diagram: the Image Model (TensorFlow, Inception), Name Model (tf-idf, LogReg) and Description Model (tf-idf, LogReg) each predict over the 723 General Categories; their class probabilities are averaged, and the general categories are matched to the categories of Store 1, Store 2 and Store 3 via word2vec similarity]

More Related Content

Similar to Boosting Product Categorization with Machine Learning Models

Calin Constantinov - Neo4j - Keyboards and Mice - Craiova 2016
Calin Constantinov - Neo4j - Keyboards and Mice - Craiova 2016Calin Constantinov - Neo4j - Keyboards and Mice - Craiova 2016
Calin Constantinov - Neo4j - Keyboards and Mice - Craiova 2016Calin Constantinov
 
Fishing Graphs in a Hadoop Data Lake
Fishing Graphs in a Hadoop Data LakeFishing Graphs in a Hadoop Data Lake
Fishing Graphs in a Hadoop Data LakeArangoDB Database
 
Fai[ Away with Dynamo, Bigtabte, and Cassandra194 cHArlrEF.docx
Fai[ Away with Dynamo, Bigtabte, and Cassandra194 cHArlrEF.docxFai[ Away with Dynamo, Bigtabte, and Cassandra194 cHArlrEF.docx
Fai[ Away with Dynamo, Bigtabte, and Cassandra194 cHArlrEF.docxssuser454af01
 
Clear AppSec Visibility with AppSpider and ThreadFix
 Clear AppSec Visibility with AppSpider and ThreadFix Clear AppSec Visibility with AppSpider and ThreadFix
Clear AppSec Visibility with AppSpider and ThreadFixDenim Group
 
Semantic Search Component
Semantic Search ComponentSemantic Search Component
Semantic Search ComponentMario Flecha
 
Introduction to question answering for linked data & big data
Introduction to question answering for linked data & big dataIntroduction to question answering for linked data & big data
Introduction to question answering for linked data & big dataAndre Freitas
 
Polyglot Persistence with MongoDB and Neo4j
Polyglot Persistence with MongoDB and Neo4jPolyglot Persistence with MongoDB and Neo4j
Polyglot Persistence with MongoDB and Neo4jCorie Pollock
 
Advanced SEO - Digital Content Creators
Advanced SEO - Digital Content CreatorsAdvanced SEO - Digital Content Creators
Advanced SEO - Digital Content CreatorsAndrea Berberich
 
Advanced SEO for Digital Content Creators
Advanced SEO for Digital Content CreatorsAdvanced SEO for Digital Content Creators
Advanced SEO for Digital Content CreatorsAndrea Berberich
 
2018 NYC Localogy: Using Data to Build Exceptional Local Pages
2018 NYC Localogy: Using Data to Build Exceptional Local Pages2018 NYC Localogy: Using Data to Build Exceptional Local Pages
2018 NYC Localogy: Using Data to Build Exceptional Local PagesLocalogy
 
Is multi-model the future of NoSQL?
Is multi-model the future of NoSQL?Is multi-model the future of NoSQL?
Is multi-model the future of NoSQL?Max Neunhöffer
 
Webinar: Scaling MongoDB
Webinar: Scaling MongoDBWebinar: Scaling MongoDB
Webinar: Scaling MongoDBMongoDB
 
Introduction to MySQL Document Store
Introduction to MySQL Document StoreIntroduction to MySQL Document Store
Introduction to MySQL Document StoreFrederic Descamps
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsJean Silva
 
Human computer interaction research at ibm t
Human computer interaction research at ibm tHuman computer interaction research at ibm t
Human computer interaction research at ibm tJohn Thomas
 
Towards a Distributional Semantic Web Stack
Towards a Distributional Semantic Web StackTowards a Distributional Semantic Web Stack
Towards a Distributional Semantic Web StackAndre Freitas
 
Building Large Sustainable Apps
Building Large Sustainable AppsBuilding Large Sustainable Apps
Building Large Sustainable AppsBuğra Oral
 
Discovering User's Topics of Interest in Recommender Systems
Discovering User's Topics of Interest in Recommender SystemsDiscovering User's Topics of Interest in Recommender Systems
Discovering User's Topics of Interest in Recommender SystemsGabriel Moreira
 

Similar to Boosting Product Categorization with Machine Learning Models (20)

Calin Constantinov - Neo4j - Keyboards and Mice - Craiova 2016
Calin Constantinov - Neo4j - Keyboards and Mice - Craiova 2016Calin Constantinov - Neo4j - Keyboards and Mice - Craiova 2016
Calin Constantinov - Neo4j - Keyboards and Mice - Craiova 2016
 
Fishing Graphs in a Hadoop Data Lake
Fishing Graphs in a Hadoop Data LakeFishing Graphs in a Hadoop Data Lake
Fishing Graphs in a Hadoop Data Lake
 
Fai[ Away with Dynamo, Bigtabte, and Cassandra194 cHArlrEF.docx
Fai[ Away with Dynamo, Bigtabte, and Cassandra194 cHArlrEF.docxFai[ Away with Dynamo, Bigtabte, and Cassandra194 cHArlrEF.docx
Fai[ Away with Dynamo, Bigtabte, and Cassandra194 cHArlrEF.docx
 
DataHub
DataHubDataHub
DataHub
 
Clear AppSec Visibility with AppSpider and ThreadFix
 Clear AppSec Visibility with AppSpider and ThreadFix Clear AppSec Visibility with AppSpider and ThreadFix
Clear AppSec Visibility with AppSpider and ThreadFix
 
Architecting for Data Science
Architecting for Data ScienceArchitecting for Data Science
Architecting for Data Science
 
Semantic Search Component
Semantic Search ComponentSemantic Search Component
Semantic Search Component
 
Introduction to question answering for linked data & big data
Introduction to question answering for linked data & big dataIntroduction to question answering for linked data & big data
Introduction to question answering for linked data & big data
 
Polyglot Persistence with MongoDB and Neo4j
Polyglot Persistence with MongoDB and Neo4jPolyglot Persistence with MongoDB and Neo4j
Polyglot Persistence with MongoDB and Neo4j
 
Advanced SEO - Digital Content Creators
Advanced SEO - Digital Content CreatorsAdvanced SEO - Digital Content Creators
Advanced SEO - Digital Content Creators
 
Advanced SEO for Digital Content Creators
Advanced SEO for Digital Content CreatorsAdvanced SEO for Digital Content Creators
Advanced SEO for Digital Content Creators
 
2018 NYC Localogy: Using Data to Build Exceptional Local Pages
2018 NYC Localogy: Using Data to Build Exceptional Local Pages2018 NYC Localogy: Using Data to Build Exceptional Local Pages
2018 NYC Localogy: Using Data to Build Exceptional Local Pages
 
Is multi-model the future of NoSQL?
Is multi-model the future of NoSQL?Is multi-model the future of NoSQL?
Is multi-model the future of NoSQL?
 
Webinar: Scaling MongoDB
Webinar: Scaling MongoDBWebinar: Scaling MongoDB
Webinar: Scaling MongoDB
 
Introduction to MySQL Document Store
Introduction to MySQL Document StoreIntroduction to MySQL Document Store
Introduction to MySQL Document Store
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero results
 
Human computer interaction research at ibm t
Human computer interaction research at ibm tHuman computer interaction research at ibm t
Human computer interaction research at ibm t
 
Towards a Distributional Semantic Web Stack
Towards a Distributional Semantic Web StackTowards a Distributional Semantic Web Stack
Towards a Distributional Semantic Web Stack
 
Building Large Sustainable Apps
Building Large Sustainable AppsBuilding Large Sustainable Apps
Building Large Sustainable Apps
 
Discovering User's Topics of Interest in Recommender Systems
Discovering User's Topics of Interest in Recommender SystemsDiscovering User's Topics of Interest in Recommender Systems
Discovering User's Topics of Interest in Recommender Systems
 

More from Dataconomy Media

Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & David An...
Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & 	David An...Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & 	David An...
Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & David An...Dataconomy Media
 
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...Dataconomy Media
 
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...Dataconomy Media
 
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...Dataconomy Media
 
Data Natives meets DataRobot | "Build and deploy an anti-money laundering mo...
Data Natives meets DataRobot |  "Build and deploy an anti-money laundering mo...Data Natives meets DataRobot |  "Build and deploy an anti-money laundering mo...
Data Natives meets DataRobot | "Build and deploy an anti-money laundering mo...Dataconomy Media
 
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...Dataconomy Media
 
Data Natives Vienna v 7.0 | "Building Kubernetes Operators with KUDO for Dat...
Data Natives Vienna v 7.0  | "Building Kubernetes Operators with KUDO for Dat...Data Natives Vienna v 7.0  | "Building Kubernetes Operators with KUDO for Dat...
Data Natives Vienna v 7.0 | "Building Kubernetes Operators with KUDO for Dat...Dataconomy Media
 
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...Dataconomy Media
 
Data Natives Cologne v 4.0 | "The Data Lorax: Planting the Seeds of Fairness...
Data Natives Cologne v 4.0  | "The Data Lorax: Planting the Seeds of Fairness...Data Natives Cologne v 4.0  | "The Data Lorax: Planting the Seeds of Fairness...
Data Natives Cologne v 4.0 | "The Data Lorax: Planting the Seeds of Fairness...Dataconomy Media
 
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...Dataconomy Media
 
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...Dataconomy Media
 
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...Dataconomy Media
 
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...Dataconomy Media
 
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...Dataconomy Media
 
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...Dataconomy Media
 
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...Dataconomy Media
 
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...Dataconomy Media
 
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...Dataconomy Media
 
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...Dataconomy Media
 
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...Dataconomy Media
 

More from Dataconomy Media (20)

Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & David An...
Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & 	David An...Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & 	David An...
Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & David An...
 
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
 
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
 
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
 
Data Natives meets DataRobot | "Build and deploy an anti-money laundering mo...
Data Natives meets DataRobot |  "Build and deploy an anti-money laundering mo...Data Natives meets DataRobot |  "Build and deploy an anti-money laundering mo...
Data Natives meets DataRobot | "Build and deploy an anti-money laundering mo...
 
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
 
Data Natives Vienna v 7.0 | "Building Kubernetes Operators with KUDO for Dat...
Data Natives Vienna v 7.0  | "Building Kubernetes Operators with KUDO for Dat...Data Natives Vienna v 7.0  | "Building Kubernetes Operators with KUDO for Dat...
Data Natives Vienna v 7.0 | "Building Kubernetes Operators with KUDO for Dat...
 
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
 
Data Natives Cologne v 4.0 | "The Data Lorax: Planting the Seeds of Fairness...
Data Natives Cologne v 4.0  | "The Data Lorax: Planting the Seeds of Fairness...Data Natives Cologne v 4.0  | "The Data Lorax: Planting the Seeds of Fairness...
Data Natives Cologne v 4.0 | "The Data Lorax: Planting the Seeds of Fairness...
 
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
 
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
 
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
 
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
 
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
 
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
 
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
 
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
 
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
 
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
 
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
 

Recently uploaded

Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
Boosting Product Categorization with Machine Learning Models

  • 1. Boosting Product Categorization with Machine Learning. Amadeus Magrabi (@amadeusmagrabi)
  • 2. … 11/2017 2 Company: Customers: People who want to sell something online … Main product: REST API to manage online shops
  • 4. Machine Learning for Category Recommendations. Goal: Use machine learning to automatically recommend categories for products. [Diagram: example category tree with nodes Fashion, Men, Women, Sports, Shoes, Pants, Business]
  • 5. Challenges. Challenge: Every online store has a different category structure. [Diagram: example category trees for Store 1 and Store 2, with nodes Fashion, Men, Women, Jeans and Clothing, Pants, Shirts, Shoes]
  • 6. Challenges. Challenge: Every online store has a different category structure. Option 1: multiple store-specific models. [Diagram: Model 1, Model 2, and Model 3 each predict categories for Store 1, Store 2, and Store 3]
  • 7. Challenges. Challenge: Every online store has a different category structure. Option 1: multiple store-specific models. Option 2: one general model. [Diagram: a single Model 1 predicts General Categories, which are then matched to the categories of Store 1, Store 2, and Store 3]
  • 8. Challenges. Challenge: Every online store has a different category structure. Option 1: multiple store-specific models. Advantages: better accuracies for stores with very specific categories; no category matching necessary. Option 2: one general model. Advantages: more data per model; more flexible; easier to deploy; also works for stores with little data; can also recommend categories that are not yet defined in the store.
  • 9. Challenges. Challenge: Product data is diverse and unbalanced, which complicates feature selection. Available features: product names, images, prices, descriptions, sizes, brands, colors, expiration dates, … Approach: focus on the features names, images, and descriptions, because they carry the most information and are available for most products.
  • 10. Challenges. Challenge: Very large class set. Amazon and eBay list 50,000+ categories. Tradeoff: coverage vs. accuracy. Approach: select broad model categories to cover the main use cases; rely on the category matching procedure to catch more specialized categories. The current version has a selection of 723 model categories.
  • 11. Overview of Approach. [Diagram: the Name Model, Description Model, and Image Model each predict over the 723 General Categories, which are then matched to the categories of Store 1, Store 2, and Store 3]
  • 12. Model for Product Images. Model: Convolutional Neural Network (Deep Learning). Similar to mechanisms in the brain: the idea of building complex representations by combining simple representations.
  • 13. Model for Product Images. Model: Convolutional Neural Network (Deep Learning), as on slide 12. Trained via transfer learning on the well-known image recognition network Inception v3 (TensorFlow, Google Cloud ML Engine).
  • 14. Model for Product Names. Examples: “Mens Heavyweight 6.1-ounce, 100% cotton T-Shirts in Regular, Big and Tall Sizes”; “Gala Apples Fresh Fruit, 3 LB Bag”; “Carhartt Men's Maddock Pocket T-Shirt Size M”; “Samsung SM-G900V - Galaxy S5 - 16GB Android Smartphone Verizon + GSM - Black”. Preprocessing (spacy, re, gensim, Google Translate, pyenchant): spellchecker (smartwathc → smartwatch); translation (German → English); tokenization (complete names → words); normalization (lowercasing, deleting special characters); lemmatization (apples → apple); phrasing (louis vuitton → louis_vuitton); word removal (stop words, blacklist).
  • 15. Model for Product Names. Examples and preprocessing as on slide 14. Models (scikit-learn): Logistic Regression, Naive Bayes, Random Forest, XGBoost, Support Vector Machine.
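The preprocessing steps listed on slides 14 and 15 can be sketched with the standard library alone. This is a simplified illustration, not the pipeline from the talk: it covers only normalization, tokenization, phrasing, and word removal, and it leaves out spellchecking, translation, and lemmatization, which need the external libraries named on the slide. The stop-word list, blacklist, and phrase table are made-up placeholders.

```python
import re

# Hypothetical word lists; the slides do not show the real ones.
STOP_WORDS = {"in", "and", "the", "of", "with", "for"}
BLACKLIST = {"size", "sizes"}
# Hypothetical phrase table; the talk mentions gensim for this step.
PHRASES = {("louis", "vuitton"): "louis_vuitton"}


def normalize(name):
    """Lowercase the name and replace special characters with spaces."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def tokenize(name):
    """Split a normalized name into words."""
    return name.split()


def merge_phrases(tokens):
    """Join known two-word phrases into a single token."""
    merged = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in PHRASES:
            merged.append(PHRASES[pair])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def remove_words(tokens):
    """Drop stop words and blacklisted words."""
    return [t for t in tokens if t not in STOP_WORDS and t not in BLACKLIST]


def preprocess(name):
    """Run the simplified pipeline on one product name."""
    return remove_words(merge_phrases(tokenize(normalize(name))))


print(preprocess("Gala Apples Fresh Fruit, 3 LB Bag"))
print(preprocess("Louis Vuitton Bag for Women"))
```

Normalization runs before tokenization here only because it keeps the sketch short; the slide lists the two steps in the opposite order.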
  • 16. Model for Product Names. Vectorization methods (text → numbers). Bag-of-words: a simple approach, but the representation is sparse and blind to context.
  • 17. Model for Product Names. Vectorization methods, continued. tf-idf: similar to bag-of-words, but weights words higher when they do not occur frequently in the dataset. Intuition: “the” has less predictive value than “iPhone”. TF(w) = (number of times word w appears in name) / (total number of words in name). IDF(w) = log_e(total number of names / number of names with word w in it).
  • 18. Model for Product Names. Vectorization methods, continued. word2vec: trains a two-layer neural network that predicts the context words of a word, which results in a dense and context-sensitive representation.
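The TF and IDF definitions from the tf-idf slide translate directly into a few lines of Python. The sketch below follows those two formulas literally; the three-name corpus is invented for illustration, and library implementations often use smoothed variants that give slightly different numbers.

```python
import math
from collections import Counter


def tf(word, tokens):
    """Times the word appears in the name / total words in the name."""
    return Counter(tokens)[word] / len(tokens)


def idf(word, corpus):
    """log_e(total number of names / number of names containing the word).

    Assumes the word occurs in at least one name of the corpus.
    """
    names_with_word = sum(1 for tokens in corpus if word in tokens)
    return math.log(len(corpus) / names_with_word)


def tf_idf(word, tokens, corpus):
    """Weight of one word in one name."""
    return tf(word, tokens) * idf(word, corpus)


# Made-up, already tokenized product names.
corpus = [
    ["the", "iphone", "case"],
    ["the", "red", "shirt"],
    ["the", "gala", "apple"],
]

print(tf_idf("the", corpus[0], corpus))     # occurs in every name
print(tf_idf("iphone", corpus[0], corpus))  # occurs in one name only
```

Because “the” occurs in every name, its IDF is log_e(3/3) = 0 and its weight vanishes, while “iphone” gets (1/3) · log_e(3), which matches the intuition stated on the slide.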
  • 19. Model for Product Names. Model for Product Descriptions.
  • 20. Category Matching. Model categories are matched to store-specific categories via a word2vec model trained on a news dataset. [Diagram: the Name Model, Description Model, and Image Model each predict over the 723 General Categories; their class probabilities are averaged, and the result is matched to the categories of Store 1, Store 2, and Store 3 by word2vec similarity]
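One way to picture the category matching step is as a nearest-neighbour lookup by cosine similarity between word vectors. The sketch below is an assumption about how such a matcher could look, not the implementation from the talk: the three-dimensional vectors are made-up stand-ins for real word2vec embeddings, and averaging the word vectors of a multi-word label is a choice made here for simplicity.

```python
import math

# Toy vectors standing in for word2vec embeddings (made-up values).
TOY_VECTORS = {
    "pants": [0.9, 0.1, 0.0],
    "trousers": [0.8, 0.2, 0.1],
    "shoes": [0.1, 0.9, 0.0],
    "sneakers": [0.2, 0.8, 0.1],
}


def embed(label):
    """Average the vectors of the known words in a category label.

    Assumes at least one word of the label has a vector.
    """
    vectors = [TOY_VECTORS[w] for w in label.lower().split() if w in TOY_VECTORS]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def match_category(model_category, store_categories):
    """Return the store category most similar to the model category."""
    target = embed(model_category)
    return max(store_categories, key=lambda c: cosine(target, embed(c)))


store_categories = ["Pants", "Shoes"]
print(match_category("trousers", store_categories))
print(match_category("sneakers", store_categories))
```

With these toy vectors, the general category “trousers” maps to the store category “Pants” and “sneakers” maps to “Shoes”, even though the labels share no characters.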
  • 24. Thank you! Amadeus Magrabi, @amadeusmagrabi, amadeus.magrabi@commercetools.com. [Diagram: summary of the approach. The Image Model (TensorFlow, Inception), the Name Model (tf-idf, LogReg), and the Description Model (tf-idf, LogReg) each predict over the 723 General Categories; their class probabilities are averaged, and the result is matched to the categories of Store 1, Store 2, and Store 3 by word2vec similarity]
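The "averaging class probabilities" step in the summary diagram can be sketched as an unweighted mean over the three models' outputs. The probabilities and the three-category set below are invented for illustration, and equal weighting is an assumption; the slides do not say whether the models are weighted.

```python
def average_probabilities(model_outputs):
    """Average per-category probabilities across models.

    Each element of model_outputs maps category -> probability, and all
    elements are assumed to cover the same categories.
    """
    categories = model_outputs[0].keys()
    n_models = len(model_outputs)
    return {c: sum(out[c] for out in model_outputs) / n_models for c in categories}


# Made-up outputs of the name, description, and image models for one product.
name_probs = {"Shoes": 0.6, "Pants": 0.3, "Shirts": 0.1}
description_probs = {"Shoes": 0.5, "Pants": 0.4, "Shirts": 0.1}
image_probs = {"Shoes": 0.1, "Pants": 0.7, "Shirts": 0.2}

combined = average_probabilities([name_probs, description_probs, image_probs])
best = max(combined, key=combined.get)
print(best, combined)
```

In this example the two text models favour “Shoes”, but the image model's strong vote for “Pants” tips the average, which shows how a confident model can outweigh two moderately confident ones.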