1. Machine Learning (ML) for
eCommerce and Retail
Dr. Andrei Lopatenko
Director of Engineering,
Recruit Institute of Technology
Recruit Holdings
former Walmart Labs, Google (twice), Apple (twice)
andrei@recruit.ai
2. ML for eCommerce
• Search and browse for commerce sites and
applications
• Help users to find and discover items they
will purchase
• Maximize revenue/profit per user session
9. Search data size
• Catalogue items
• 8 M items now, compared with ~400 M at
Amazon / eBay
• ×10 in the near future
• 2 K of text description per item, plus images
• Several hundred structured attributes
per catalog
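A rough back-of-envelope for the description text these numbers imply. This is only a sketch: it assumes "2 K" means about 2,000 bytes per item (the slide does not state the unit) and it ignores images and structured attributes. The function name is hypothetical.

```python
# Back-of-envelope catalog text size.
# Assumption: "2 K" is read as ~2,000 bytes of description per item.
BYTES_PER_ITEM = 2_000  # assumed unit, not stated on the slide


def catalog_text_gb(num_items, bytes_per_item=BYTES_PER_ITEM):
    """Return the raw description text size in decimal gigabytes."""
    return num_items * bytes_per_item / 1e9


current = catalog_text_gb(8_000_000)              # 8 M items now
near_future = catalog_text_gb(80_000_000)         # x10 growth
large_marketplace = catalog_text_gb(400_000_000)  # ~400 M items

print(current, near_future, large_marketplace)
```

Under that assumption the raw text grows from roughly 16 GB to 160 GB with the ×10 catalog, before counting images.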
10. Search – user searches
• Tens of millions per day
• Tens of billions of sessions per year
• Online sales 13.2 B per year
(http://fortune.com/2015/11/17/walmart-ecommerce/)
• 500 B per year in offline store sales (8% of the
USA economy) in ~11K stores
• The number of transactions ~ 10B (public
data)
11. ML addressable problems
• Learning to rank
• Given a query, what is the list of items
with the highest probability of conversion
(purchase), ATC (add to cart), or page view?
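As an illustration of ranking by conversion probability, here is a minimal counting baseline, not a learned ranking model: it scores each item by a smoothed purchase rate estimated from logged impressions. The log format, the prior values, and the function name are assumptions made for the sketch.

```python
from collections import defaultdict

# Hypothetical log rows: (query, item_id, purchased) for each impression.
LOG = [
    ("tv", "tv-a", 1), ("tv", "tv-a", 0), ("tv", "tv-a", 1), ("tv", "tv-a", 0),
    ("tv", "tv-b", 0), ("tv", "tv-b", 0), ("tv", "tv-b", 1), ("tv", "tv-b", 0),
    ("tv", "tv-c", 1),
]


def rank_items(log, query, prior_rate=0.1, prior_weight=5.0):
    """Rank items shown for `query` by smoothed purchase rate, best first."""
    shown = defaultdict(int)
    bought = defaultdict(int)
    for q, item, purchased in log:
        if q == query:
            shown[item] += 1
            bought[item] += purchased

    def score(item):
        # Additive smoothing pulls items with few impressions toward the prior.
        return (bought[item] + prior_rate * prior_weight) / (shown[item] + prior_weight)

    return sorted(shown, key=score, reverse=True)


print(rank_items(LOG, "tv"))
```

The smoothing keeps `tv-c`, with a single impression and a single purchase, from outranking `tv-a`, which has more evidence behind its rate. A real system would replace the counts with a model over query, item, and user features.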
12. ML addressable problems
• Typeahead
• Given a sequence of characters typed by the
user, what are the most probable completions,
and which items does the user most probably
want to buy?
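A minimal sketch of the completion half of this problem, assuming a log of past queries with counts: completions of a prefix are ranked by how often they were issued. The data and function name are hypothetical, and item suggestion is not covered.

```python
from collections import Counter

# Hypothetical query log; counts stand in for how often each query was issued.
QUERY_COUNTS = Counter({
    "coffee mugs": 120,
    "coffee maker": 300,
    "coffee table": 90,
    "cot": 15,
    "travel mugs": 40,
})


def complete(prefix, query_counts=QUERY_COUNTS, k=3):
    """Return up to k logged queries starting with `prefix`, most frequent first."""
    prefix = prefix.lower()
    matches = [(q, n) for q, n in query_counts.items() if q.startswith(prefix)]
    # Sort by descending count, then alphabetically for a stable order.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return [q for q, _ in matches[:k]]


print(complete("co"))
```

A linear scan is fine for a sketch; at tens of millions of searches per day a trie or a sorted prefix index would be the usual choice.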
13. ML addressable problems
• Spell correction
• Given a user query, what is the query the user
actually wanted to type?
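One common baseline for this, shown here as a sketch rather than the author's method: generate every string within one edit of each query word and keep the candidate that is most frequent in a vocabulary. The vocabulary and its counts are hypothetical.

```python
import string
from collections import Counter

# Hypothetical vocabulary with query-log frequencies.
VOCAB = Counter({"samsung": 500, "tv": 900, "queen": 200, "bed": 400, "mug": 150})


def edits1(word):
    """All strings one delete, transpose, replace, or insert away from `word`."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def correct_word(word, vocab=VOCAB):
    """Return `word` if known, else its most frequent one-edit neighbour."""
    if word in vocab:
        return word
    candidates = [w for w in edits1(word) if w in vocab]
    return max(candidates, key=lambda w: vocab[w]) if candidates else word


def correct_query(query, vocab=VOCAB):
    """Correct each whitespace-separated word of the query independently."""
    return " ".join(correct_word(w, vocab) for w in query.lower().split())


print(correct_query("samsng tv"))
```

Words with no known neighbour are left unchanged. Correcting words independently ignores context, so a production corrector would also score the whole query.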
14. ML addressable problems
• Cold start
• Given a new item with its set of
attributes and no history of sales or
exposure on the site, predict the item's sales
and its sales per query
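One simple way to approach cold start, sketched under assumptions: predict a new item's sales as the similarity-weighted average of the sales of the most similar existing items, with similarity measured by attribute overlap (Jaccard). The catalog data, attribute encoding, and function names are hypothetical, and per-query sales are not modelled.

```python
# Hypothetical history: item -> (attribute set, observed weekly sales).
HISTORY = {
    "mug-1": ({"type:mug", "color:blue", "brand:acme"}, 40),
    "mug-2": ({"type:mug", "color:red", "brand:acme"}, 60),
    "sofa-1": ({"type:sofa", "color:green", "brand:zen"}, 5),
}


def jaccard(a, b):
    """Size of the intersection over size of the union of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict_sales(new_attrs, history=HISTORY, k=2):
    """Similarity-weighted mean sales of the k most similar known items."""
    scored = sorted(
        ((jaccard(new_attrs, attrs), sales) for attrs, sales in history.values()),
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in scored)
    if total == 0:
        return 0.0  # nothing similar in the catalog; no basis for a prediction
    return sum(sim * sales for sim, sales in scored) / total


new_item = {"type:mug", "color:blue", "brand:nova"}
print(predict_sales(new_item))
```

Here the new blue mug borrows its estimate from the two existing mugs and ignores the sofa, so the prediction falls between their sales figures.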
15. ML addressable problems
• Prediction of LHN (left-hand navigation)
• Given a user query, what is the best set of
facets and facet values, i.e. the set with the
highest probability of users interacting with
them and finally buying an item?
16. ML addressable problems
• Query understanding
• Given a query, build a semantic parse of the
query and tag tokens with attributes: blue
tshirts for teenagers -> blue:color
tshirts:type for:opt
teenagers:agerestriction10-20
• Classification: blue tshirts for teenagers ->
type:apparel, price preference: 10-30,
releaseyearpreference: 2014-2016
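The token-tagging step can be sketched with a lookup table that reproduces the slide's example. This is only an illustration: the lexicon is hypothetical and hand-written, whereas a real tagger would be learned from data and would handle ambiguity and unseen tokens.

```python
# Hypothetical lexicon mapping tokens to attribute tags, following the
# slide's example. A production system would learn these tags.
LEXICON = {
    "blue": "color",
    "tshirts": "type",
    "for": "opt",
    "teenagers": "agerestriction10-20",
}


def tag_query(query, lexicon=LEXICON, unknown="unk"):
    """Return (token, tag) pairs; tokens missing from the lexicon get `unknown`."""
    return [(tok, lexicon.get(tok, unknown)) for tok in query.lower().split()]


def format_parse(tagged):
    """Render tagged tokens in the token:tag notation used on the slide."""
    return " ".join(f"{tok}:{tag}" for tok, tag in tagged)


print(format_parse(tag_query("blue tshirts for teenagers")))
```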
17. ML addressable problems
• Related searches
• Given a query, which queries are either
semantically close to it or
represent co-occurring user interests?
• Nike shoes -> adidas shoes, sport shoes
• Coffee mugs -> travel mugs, photo coffee
mugs, cappuccino cups
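A minimal sketch of one way to find such queries, assuming search sessions are available: count how often two queries appear in the same session and suggest the most frequent partners. The session data and function names are hypothetical. Semantic closeness, for example via embeddings, is not covered.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sessions: each is the list of queries one user issued.
SESSIONS = [
    ["nike shoes", "adidas shoes", "sport shoes"],
    ["nike shoes", "adidas shoes"],
    ["coffee mugs", "travel mugs"],
    ["coffee mugs", "travel mugs", "cappuccino cups"],
    ["nike shoes", "sport shoes", "coffee mugs"],
]


def build_cooccurrence(sessions):
    """Count, for each ordered query pair, the sessions containing both."""
    pairs = Counter()
    for session in sessions:
        for a, b in combinations(sorted(set(session)), 2):
            pairs[(a, b)] += 1
            pairs[(b, a)] += 1
    return pairs


def related(query, pairs, k=2):
    """Return the k queries that most often share a session with `query`."""
    scored = [(other, n) for (q, other), n in pairs.items() if q == query]
    # Sort by descending count, then alphabetically to break ties.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [other for other, _ in scored[:k]]


PAIRS = build_cooccurrence(SESSIONS)
print(related("nike shoes", PAIRS))
```

Raw counts favour popular queries; normalising by each query's overall frequency is a common refinement.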
18. ML addressable problems
• Product discovery
• Help users to explore the product assortment
• Drive users to diverse products
• Reduce the risk of selecting irrelevant items
• Help to find alternatives by price, quality,
brand, etc.
• Reduce pigeonhole risk
• Provide relevant data to make a decision
19. ML addressable problems
• Image similarity
• Given images of an item, return other
items whose images are
visually appealing to users who like
the original item (appealing by shape?
color? texture?) -> causing high conversion
in recommendations
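The similarity lookup can be sketched as nearest-neighbour search over image embeddings with cosine similarity. The vectors below are made up; how the embeddings are produced, and whether visual similarity actually drives conversion, are outside this sketch.

```python
import math

# Hypothetical image embeddings; in practice these might come from a
# vision model, which this sketch does not include.
EMBEDDINGS = {
    "red-mug": [0.9, 0.1, 0.0],
    "orange-mug": [0.8, 0.2, 0.1],
    "blue-sofa": [0.0, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(item_id, embeddings=EMBEDDINGS, k=1):
    """Return the k other items whose embeddings are closest to `item_id`."""
    query = embeddings[item_id]
    others = [
        (other, cosine(query, vec))
        for other, vec in embeddings.items()
        if other != item_id
    ]
    others.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in others[:k]]


print(most_similar("red-mug"))
```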
20. ML addressable problems
• Voice search
• Given voice input, reply with a list of the
best items
• “what are the cheapest samsung tvs in the
store”
• “what is best deal on queen bed today?”
21. ML addressable problems
• Extraction of item attributes
• Given an item, what are its attributes:
brand, color, size (wheel, screen, height,
S/M/XL, Queen/Twin/King/Full), gender,
pattern, shape, features
22. ML addressable problems
• Representations of users: actions on
websites/apps -> searches, clicks,
browsing behaviour; products -> purchase
preferences, reviews, ratings, return rates
23. ML addressable problems
• Title generation: how to generate the title
that will produce the maximum conversion
rate
• Which product attributes to select for the
title?
29. Inventory management
• Customers want to buy products
• Customers have diverse needs
• Products should be in stock, ideally in
warehouses close to customers
• but it’s expensive to store products
• Problem: how many products of each type
should be stored, and when should the product
supply be refilled?
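To make the two questions concrete, here is a sketch using two classic textbook formulas, the reorder point and the economic order quantity (EOQ). They are swapped in purely as an illustration; the slide does not say which method the author has in mind, and in an ML setting the demand figures would typically come from a forecast. Function and parameter names are hypothetical.

```python
import math


def reorder_point(daily_demand, lead_time_days, safety_stock=0.0):
    """Stock level at which to reorder: demand during lead time plus a buffer."""
    return daily_demand * lead_time_days + safety_stock


def economic_order_quantity(annual_demand, order_cost, holding_cost_per_unit):
    """Classic EOQ: order size balancing fixed order cost against holding cost."""
    return math.sqrt(2 * annual_demand * order_cost / holding_cost_per_unit)


# Illustrative numbers only: 20 units/day, 5-day lead time, 30 units of buffer.
print(reorder_point(20, 5, safety_stock=30))
# 1,200 units/year, 50 per order placed, 3 per unit per year to hold.
print(economic_order_quantity(1200, 50, 3))
```

The reorder point addresses when supply should be refilled and the EOQ addresses how much to order each time; both assume steady, known demand, which is the assumption a demand forecast would relax.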