Machine Learning Projects @ commercetools
- 2. All Rights Reserved © 2017 2
Overview
• Category Recommendations
• Detecting Similar Products (duplicate detection)
• Attribute Normalization
• Detecting Anomalous Orders (fraud detection)
• Detecting Missing Data (attributes, prices, images)
Category Recommendations
• Goal: Predict which categories fit a given product based on its images,
name or description.
• Two API versions:
• Project-specific API recommends only categories defined in a particular
commercetools project.
• General API recommends from a broad set of categories for any image, name or
description.
• Tech: Convolutional neural networks (TensorFlow), transfer learning (Inception v3),
natural language processing (spaCy), tf-idf (scikit-learn), word2vec (gensim), logistic
regression (scikit-learn), microservices (Flask), Google Cloud Compute Engine
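The text-based part of this approach can be illustrated with a minimal, dependency-free sketch. It is not the production pipeline: it uses binary bag-of-words features instead of tf-idf, a hand-written one-vs-rest logistic regression instead of scikit-learn, and the function names and training examples are made up for illustration.

```python
import math
from collections import defaultdict

# Hypothetical training data: (product name, category) pairs.
EXAMPLES = [
    ("red cotton t-shirt", "clothing"),
    ("blue denim jeans", "clothing"),
    ("wireless bluetooth headphones", "electronics"),
    ("usb charging cable", "electronics"),
]


def tokenize(text):
    """Lowercase a product name and split it on whitespace."""
    return text.lower().split()


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_one_vs_rest(examples, epochs=200, learning_rate=0.5):
    """Train one binary logistic regression per category with SGD.

    Returns {category: (weights, bias)}, where weights maps token -> float.
    """
    categories = sorted({category for _, category in examples})
    models = {}
    for category in categories:
        weights = defaultdict(float)
        bias = 0.0
        for _ in range(epochs):
            for name, label in examples:
                tokens = set(tokenize(name))
                target = 1.0 if label == category else 0.0
                predicted = sigmoid(bias + sum(weights[t] for t in tokens))
                error = target - predicted
                # Gradient step on the log-loss for binary features.
                for t in tokens:
                    weights[t] += learning_rate * error
                bias += learning_rate * error
        models[category] = (dict(weights), bias)
    return models


def recommend_categories(models, product_name):
    """Return (category, probability) pairs, most likely category first."""
    tokens = set(tokenize(product_name))
    scores = {
        category: sigmoid(bias + sum(weights.get(t, 0.0) for t in tokens))
        for category, (weights, bias) in models.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


models = train_one_vs_rest(EXAMPLES)
print(recommend_categories(models, "cotton jeans"))
```

A project-specific variant would train on the names and categories of one project only, while a general variant would train on a broad, shared category set.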
Product Similarity
• Goal: Identify the most similar products, either within a project or between two
projects.
• Use cases:
• Detect and clean up duplicate products.
• Product matching: Check whether a product in one project already exists in
another project.
• Use information about product similarity to improve search engine optimization
(e.g. make product descriptions more distinctive).
• Analyze how similar two projects are.
• Tech: Convolutional neural networks (Keras, ResNet), cosine similarity (NumPy),
numeric scaling (scikit-learn), string matching (fuzzywuzzy), tf-idf (scikit-learn)
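The name-based similarity measure can be sketched as tf-idf vectors compared with cosine similarity. The sketch below is a standard-library stand-in for the scikit-learn and NumPy versions named above; the function names and product names are hypothetical, and image or variant-based measures are not covered.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one sparse tf-idf vector (term -> weight dict) per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    document_frequency = Counter(t for tokens in tokenized for t in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Smoothed idf, so a term found in every document keeps a non-zero weight.
            t: count * (math.log((1 + n_docs) / (1 + document_frequency[t])) + 1)
            for t, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_pairs(names, top_n=3):
    """Return the top_n (score, name_a, name_b) tuples, most similar first."""
    vectors = tfidf_vectors(names)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            score = cosine_similarity(vectors[i], vectors[j])
            pairs.append((score, names[i], names[j]))
    pairs.sort(key=lambda pair: pair[0], reverse=True)
    return pairs[:top_n]


# Made-up product names; the first two differ only in letter case.
names = ["Red Cotton T-Shirt", "red cotton t-shirt", "Blue Denim Jeans", "Leather Wallet"]
for score, first, second in most_similar_pairs(names, top_n=1):
    print(f"{score:.2f}  {first!r} ~ {second!r}")
```

Pairs scoring close to 1.0 are duplicate candidates. Comparing every pair is quadratic in the number of products, so a real service would likely need a more scalable search.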
Product Similarity
/similarities/products/example-store-name?region=EU&staged=true&similarityMeasures=name
Product Similarity
/similarities/products/example-store-name-2?region=EU&staged=true&similarityMeasures=name,image,variantNumber
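The two request paths above differ only in the store name and the list of similarity measures. A small helper can assemble them; this is a sketch, the function and argument names are hypothetical, and the host name is left out because the slides do not show it.

```python
from urllib.parse import urlencode


def similarity_request_path(store_name, region="EU", staged=True,
                            similarity_measures=("name",)):
    """Build the path and query string for a product similarity request.

    The query parameter names (region, staged, similarityMeasures) are
    taken from the example requests; everything else is an assumption.
    """
    query = urlencode(
        {
            "region": region,
            "staged": str(staged).lower(),  # True -> "true"
            "similarityMeasures": ",".join(similarity_measures),
        },
        safe=",",  # keep the commas between measures unescaped
    )
    return f"/similarities/products/{store_name}?{query}"


print(similarity_request_path("example-store-name"))
print(similarity_request_path(
    "example-store-name-2",
    similarity_measures=("name", "image", "variantNumber"),
))
```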
Attribute Normalization
• Goal:
• Attribute values can be quite inconsistent when projects have low data quality
(e.g. lowercase vs. uppercase-style, occasional spelling mistakes, inconsistent
use of abbreviations, etc.).
• This API predicts how attribute values can be normalized to match a cleanly
defined set.
• Tech: tf-idf (scikit-learn), cosine similarity (NumPy), affinity propagation (scikit-learn)
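The core idea, mapping a messy value to the closest entry of a clean set, can be sketched with character bigrams and cosine similarity. This is a simplified stand-in: it uses raw bigram counts instead of tf-idf weights and skips the affinity propagation clustering step. The function names, the similarity threshold and the sample values are assumptions.

```python
import math
from collections import Counter


def char_ngrams(value, n=2):
    """Count character n-grams of a trimmed, lowercased, space-padded value."""
    padded = f" {value.strip().lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity of two n-gram count vectors."""
    dot = sum(count * b.get(gram, 0) for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def normalize(value, canonical_values, min_similarity=0.5):
    """Return the closest canonical value, or None if nothing is close enough."""
    grams = char_ngrams(value)
    best, best_score = None, 0.0
    for candidate in canonical_values:
        score = cosine(grams, char_ngrams(candidate))
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= min_similarity else None


# Made-up clean set and raw attribute values.
clean_set = ["small", "medium", "large"]
for raw in [" MEDIUM", "Medum", "Larg", "unknown"]:
    print(repr(raw), "->", normalize(raw, clean_set))
```

Returning `None` for values that match nothing keeps unclear cases visible for manual review instead of forcing a guess.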
Attribute Normalization
/normalizations/attributes/example-store?attributeName=sizes&attributeValueSet=xs,s,m,l,xl,xxl
Missing Data Analysis
• Goal: Direct merchants' attention to products with a lot of missing data.
• Currently supported:
• how many attributes of the corresponding product type are covered in a
product and whether they contain valid attribute values
• whether product images are missing (takes into account how many images per
product are common in a project)
• whether prices are defined and still valid for selected time frames
• Planned extension:
• Not only detect missing data, but also automatically recommend how it should
be filled.
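The attribute coverage check can be sketched as follows. The data shapes are hypothetical (a list of attribute names per product type and a plain dict per product) and do not reflect the actual API schema; validation of attribute values is reduced to an emptiness check.

```python
def attribute_coverage(product_type_attributes, product_attributes):
    """Return (coverage ratio, names of missing attributes) for one product.

    An attribute counts as missing if it is absent, None, an empty string,
    or an empty list.
    """
    missing = [
        name for name in product_type_attributes
        if product_attributes.get(name) in (None, "", [])
    ]
    total = len(product_type_attributes)
    coverage = 1.0 if total == 0 else (total - len(missing)) / total
    return coverage, missing


def products_by_coverage(product_type_attributes, products):
    """Sort product ids so the products with the most missing data come first."""
    scored = [
        (attribute_coverage(product_type_attributes, attributes)[0], product_id)
        for product_id, attributes in products.items()
    ]
    return [product_id for _, product_id in sorted(scored)]


# Made-up product type and products.
type_attributes = ["color", "size", "material"]
products = {
    "shirt-1": {"color": "red", "size": "m", "material": "cotton"},
    "shirt-2": {"color": "red", "size": ""},
    "shirt-3": {},
}
print(attribute_coverage(type_attributes, products["shirt-2"]))
print(products_by_coverage(type_attributes, products))
```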
Missing Data Analysis
/missing-data/attributes/example-store?staged=false&region=EU&productSetLimit=5000&limit=2
Order Anomalies
• Goal: Detect any unusual orders that should be checked for potential fraud.
• Currently supported cases:
• Unusual total cost of an order
• Unusual number of products in an order
• Unusual time between orders by the same user
• Unusual number of orders by the same user
• Machine learning ensures that the context of individual projects is
automatically taken into account when checking for unusual cases (e.g. orders in a
grocery store and a luxury jewelry store naturally have very different patterns).
• Tech: IsolationForest (scikit-learn)
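Why per-project context matters can be shown with a much simpler stand-in for IsolationForest: a robust z-score based on the median and the median absolute deviation (MAD), computed separately for each project. This is not the isolation forest algorithm, it covers only the "unusual total cost" case, and the order totals, function name and threshold are made up.

```python
from statistics import median


def unusual_order_totals(order_totals, threshold=3.5):
    """Return the totals whose robust z-score exceeds the threshold.

    The score is 0.6745 * |x - median| / MAD, a common outlier heuristic.
    If MAD is zero (at least half the totals are identical), nothing is flagged.
    """
    center = median(order_totals)
    mad = median([abs(total - center) for total in order_totals])
    if mad == 0:
        return []
    return [
        total for total in order_totals
        if 0.6745 * abs(total - center) / mad > threshold
    ]


# Made-up order totals for two projects with very different price levels.
grocery_orders = [20, 25, 30, 22, 28, 26, 24, 900]
jewelry_orders = [850, 900, 950, 1000, 875, 925, 20000]

print(unusual_order_totals(grocery_orders))  # [900]
print(unusual_order_totals(jewelry_orders))  # [20000]
```

An order total of 900 is flagged in the grocery project but is ordinary in the jewelry project, because each project is scored against its own distribution.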
Order Anomalies
/anomalies/orders/example-store?region=EU&orderSetLimit=10000
Thank you!
techblog.commercetools.com
amadeus.magrabi@commercetools.com
www.commercetools.com