Real-time personalized recommendations using product embeddings
About me - Jakub Macina @dmacjam
● MSc in Information Systems @ Slovak University of Technology
○ collaboration with edX
○ research paper at ACM RecSys 17
● open source contributor @ Discourse
○ Google Summer of Code
● Spine Hero - Microsoft Imagine Cup, Brilliant Young Entrepreneurs
● AI Engineer @ Exponea
Agenda
1. Motivation for recommender systems
2. Collaborative filtering
3. Challenges
4. Text-based recommenders
○ Term weighting
○ Word2vec
5. Product embeddings
○ Usage
○ Training
○ Examples
6. Conclusions
Information overload
More than 400 hours of video are uploaded every minute
More than 35 million songs available
Recommender system
● provides suggestions to users for items they might be interested in consuming or items that meet their needs
● more formally:
○ Estimate a utility function that automatically predicts how much a user will like an item.
Recommendations are everywhere
Value of recommendation
“Our recommender system is used on most screens of the Netflix product beyond
the homepage, and in total influences choice for about 80% of hours
streamed at Netflix. The remaining 20% comes from search [...]”
Carlos A. Gomez-Uribe and Neil Hunt. 2016. The Netflix recommender system: Algorithms, business value, and innovation.
ACM Transactions on Management Information Systems (TMIS) 6, 4 (2016), 13.
Value of recommendation
60% of video clicks come from homepage recommendations
James Davidson, Benjamin Liebald, Junning Liu, Palash Nandy, Taylor Van Vleet, Ullas Gargi, Sujoy Gupta, Yu He, Mike Lambert, Blake Livingston,
and Dasarathi Sampath. 2010. The YouTube Video Recommendation System. In Proceedings of the Fourth ACM Conference on Recommender
Systems (RecSys ’10). ACM, New York, NY, USA, 293–296.
Collaborative filtering
● Based on user’s past behaviour
Image by Erik Bernhardsson
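A minimal sketch of one classic flavour of the idea - item-item neighbourhood filtering over an explicit user-item rating matrix. The matrix and numbers below are illustrative, not from the talk:

import numpy as np

# Toy user-item rating matrix: rows = users, columns = items, 0 = not rated (illustrative).
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
    [1, 0, 4, 5],
], dtype=float)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

# Item-item similarities computed from the rating columns.
n_items = R.shape[1]
S = np.array([[cosine(R[:, i], R[:, j]) for j in range(n_items)] for i in range(n_items)])

def predict(user, item):
    """Similarity-weighted average of the items this user already rated."""
    rated = R[user] > 0
    weights = S[item, rated]
    return weights @ R[user, rated] / (weights.sum() + 1e-9)

print(round(predict(user=0, item=2), 2))  # estimate of how user 0 would rate item 2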
Challenges
● Customers are not logged in while browsing a website - no history available
● Buying intent and preferences might change from visit to visit

Dataset | Users | Items | Matrix density
MovieLens 10M | 69 878 | 10 681 | 1.340%
Average fashion e-commerce | 100 000 - 500 000 | 1 000 - 50 000 | 0.012% - 0.155%
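Matrix density is simply the share of user-item pairs with an observed interaction. A quick sanity check of the MovieLens row, assuming its roughly 10 million ratings:

ratings = 10_000_054          # number of ratings in MovieLens 10M
users, items = 69_878, 10_681
density = ratings / (users * items)
print(f"{density:.3%}")       # ~1.340%, matching the table above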
1. Content-based recommendation
● Find similar items by analyzing content (texts, images, music, ...)
● Text analysis
1. Content-based recommendation
● Project each product into a low-dimensional space
● Compute similarities between them
Text preprocessing
● Stopword removal
● Part-of-speech (POS) tagging
import nltk
from nltk.corpus import stopwords  # requires nltk.download('stopwords')

review = "Great local atmosphere, tasty tapas and great selection of beers."
words = review.lower().split(" ")
print([word for word in words if word not in stopwords.words('english')])
>> ['great', 'local', 'atmosphere,', 'tasty', 'tapas', 'great', 'selection', 'beers.']

for sent in nltk.sent_tokenize(review):  # requires nltk.download('punkt')
    print(list(nltk.pos_tag(nltk.word_tokenize(sent))))  # and 'averaged_perceptron_tagger'
>> [('Great', 'NNP'), ('local', 'JJ'), ('atmosphere', 'NN'), (',', ','), ('tasty', 'JJ'),
    ('tapas', 'NN'), ('and', 'CC'), ('great', 'JJ'), ('selection', 'NN'), ('of', 'IN'),
    ('beers', 'NNS')]
Term weighting
● Weight = importance indicator of a term regarding content
Example: "Great local atmosphere, tasty tapas and great beer selection." becomes a term-count
vector of length |V| over the vocabulary, e.g. great = 2, local = 1, tapas = 1, beer = 1,
blue = 0, wine = 0, ...
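A minimal sketch of building such term-count vectors with scikit-learn, plus the common TF-IDF weighting (which down-weights terms that occur in many documents); the two documents are just the review snippets used on these slides:

from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = [
    "Great local atmosphere, tasty tapas and great beer selection.",
    "Wide variety of beers from all around the world.",
]

# Raw term counts over the vocabulary V - the representation sketched above.
counts = CountVectorizer().fit_transform(docs)

# TF-IDF weighting of the same vocabulary.
tfidf = TfidfVectorizer()
X = tfidf.fit_transform(docs)
print(tfidf.get_feature_names_out())  # vocabulary (scikit-learn >= 1.0)
print(X.toarray().round(2))           # one weighted vector per document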
Term weighting
● compares exact words across documents
● ignores word order
● what about synonyms and related words?
“Great selection of local beers.” vs. “Wide variety of beers from all around the world.” - similar meaning, but almost no exact words in common.
Word embeddings
Word embeddings
● Distributional hypothesis:
“You shall know a word by the company it keeps” (J. R. Firth 1957)
Word embeddings
● capture similarity between words, analogies, general syntactic and semantic
information
● unsupervised learning
● representing each word as a numeric vector = embedding
○ dense vectors - dimensionality is usually between 100 and 300
YOUNG, Tom, et al. Recent trends in deep learning based natural language processing. arXiv preprint arXiv:1708.02709, 2017.
Word2vec
Mikolov, Tomas, et al. "Distributed representations of words and phrases and their compositionality." Advances in neural
information processing systems. 2013.
(Diagram: the context words "We", "tried", "food", "yesterday" and the target word "Italian" -
word2vec learns to predict one from the other over a sliding window.)
Word2vec with gensim
print(reviews[:3])
>>> [
['this', 'place', 'is', 'horrible'],
['i', 'was', 'impressed', 'there'],
['i', 'decided', 'to', 'try', 'it', 'turned', 'out', 'is', 'cheap', 'eat']
]

from gensim.models import Word2Vec

# sg=1 selects skip-gram; in gensim >= 4.0 the size and iter parameters are
# named vector_size and epochs.
word2vec_model = Word2Vec(reviews, sg=1, iter=10, size=100, window=5,
                          min_count=2, workers=4)

word2vec_model.wv['impressed']
>> array([ 0.2790776 , -0.3456704 , 0.23330563, ..., -0.11152197], dtype=float32)
Word2vec example
● Yelp Open Dataset - https://www.yelp.com/dataset
○ Businesses with reviews
○ Available for personal, educational, and academic purposes
Word2vec exploration
● http://projector.tensorflow.org/
● https://radimrehurek.com/gensim/scripts/word2vec2tensor.html
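One way to get the trained Yelp model into the Embedding Projector is to export it with gensim's word2vec2tensor script; a sketch with arbitrary file names, assuming the word2vec_model trained above:

# Save the trained vectors in plain word2vec text format.
word2vec_model.wv.save_word2vec_format('yelp_word2vec.txt', binary=False)

# Convert them to the projector's TSV format (run from the shell):
#   python -m gensim.scripts.word2vec2tensor -i yelp_word2vec.txt -o yelp
# This writes yelp_tensor.tsv and yelp_metadata.tsv, which can be uploaded
# at http://projector.tensorflow.org/ via "Load data".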
Examples - similar terms
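The similar-term examples come straight from the trained model's nearest neighbours; the query word is just an illustration:

# Top-5 terms whose vectors are closest to "beer" in the Yelp model.
print(word2vec_model.wv.most_similar('beer', topn=5))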
Examples - word analogies
Breakfast + lunch =
Wines - french + belgian =
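The analogy queries above map directly onto vector arithmetic via positive and negative terms; a sketch whose exact output depends on the trained model:

# breakfast + lunch  ->  nearest neighbours of the summed vectors
print(word2vec_model.wv.most_similar(positive=['breakfast', 'lunch'], topn=3))

# wines - french + belgian  ->  analogy-style query
print(word2vec_model.wv.most_similar(positive=['wines', 'belgian'], negative=['french'], topn=3))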
Word embeddings
● Word2vec
○ Google News (about 100 billion words)
○ https://code.google.com/archive/p/word2vec/
● fastText
○ character n-gram embeddings - handle out-of-vocabulary words
○ Wikipedia
○ https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md
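Pretrained vectors can be loaded directly with gensim; a sketch assuming the standard Google News file name from the link above (the file is several gigabytes):

from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
print(vectors['recommendation'][:5])  # first few dimensions of a 300-d vector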
Word embeddings
● Works great with texts
● Able to capture synonyms, words used in the same context
● Can find products similar to a given product
● But not very personalized
● Can we use this idea to provide more personalized recommendations?
(Example products: Green sweater Ma&Mi, Red sweater Ma&Mi, Khaki sweater Originals)
2. Product embeddings
● represent each product as a numeric vector
● products appearing in similar contexts get similar vectors
Example 100-dimensional product vector: [-0.6, 0.1, 0.3, 0.6, -0.3, ..., 0.7]
Usage
● Calculate similarity between any two products
Product A: [0.6, 2.1, 1.4, 0.1, 4.2, ..., 3.3]
Product B: [0.4, 1.7, 0.7, 0.3, 5.6, ..., 2.1]
Cosine similarity = 0.823
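Computing the similarity is a one-liner; a sketch with two made-up 100-dimensional product vectors:

import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

rng = np.random.default_rng(0)
a, b = rng.normal(size=100), rng.normal(size=100)  # stand-ins for real product embeddings
print(round(cosine_similarity(a, b), 3))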
In-session personalization using product embeddings
(Diagram: the last products viewed in a session are combined into a single session vector, and
candidate products are scored against it - e.g. 0.86, 0.24, 0.61.)
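A minimal sketch of that flow, assuming the embeddings live in a dict keyed by product id (all ids and vectors below are illustrative): average the vectors of the last viewed products and rank the rest of the catalogue by cosine similarity to that session vector.

import numpy as np

def session_vector(viewed_ids, embeddings):
    """Combine the last viewed products into one vector (simple average)."""
    return np.mean([embeddings[pid] for pid in viewed_ids], axis=0)

def recommend(viewed_ids, embeddings, top_n=3):
    """Score every not-yet-viewed product against the session vector."""
    s = session_vector(viewed_ids, embeddings)
    scores = {
        pid: float(v @ s / (np.linalg.norm(v) * np.linalg.norm(s)))
        for pid, v in embeddings.items() if pid not in viewed_ids
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

rng = np.random.default_rng(1)
embeddings = {pid: rng.normal(size=100) for pid in ['5846', '8743', '9635', '8745', '8011']}
print(recommend(['5846', '8743'], embeddings))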
Training - product views
● Views by users ordered in time
user 1: 5846 8743 9635 8745
user 2: 8011 1239 2310
user 3: 3324 9803
user 4: 6798 7129 5989
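Since these sessions have exactly the shape of sentences (ordered sequences of tokens), the same gensim call used for the Yelp reviews can be trained on product ids instead of words - the item2vec idea. A sketch with the illustrative sessions above, keeping the gensim 3.x parameter names from the earlier slide:

from gensim.models import Word2Vec

# Each "sentence" is one user's product views, ordered in time.
sessions = [
    ['5846', '8743', '9635', '8745'],
    ['8011', '1239', '2310'],
    ['3324', '9803'],
    ['6798', '7129', '5989'],
]

# Skip-gram over product ids yields one embedding per product.
product_model = Word2Vec(sessions, sg=1, size=100, window=5, min_count=1, iter=10, workers=4)

# Products that tend to be viewed in similar contexts.
print(product_model.wv.most_similar('5846', topn=3))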
Neural network architecture
Clustering of products
(2D projection of the product embeddings: one axis separates Women's from Men's products, the
other Casual from Formal, with example points at (-0.4, 0.7) and (0.8, 0.6).)
Model tuning
● filter out short clicks (accidental, not interesting) - see the sketch below
● negative sampling - instead of the default random sampling, draw negative examples from products of the same category
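A sketch of the first point, using a hypothetical dwell-time field on the raw view events (field names and the threshold are illustrative, not Exponea's schema); the category-aware negative sampling in the second point is not something gensim offers out of the box and typically needs a custom training loop.

MIN_DWELL_SECONDS = 5  # illustrative threshold for an "intentional" view

def build_session(events):
    """Keep only the product views the customer actually spent time on."""
    return [e['product_id'] for e in events if e['dwell_seconds'] >= MIN_DWELL_SECONDS]

events = [
    {'product_id': '5846', 'dwell_seconds': 42},
    {'product_id': '8743', 'dwell_seconds': 1},   # accidental click, dropped
    {'product_id': '9635', 'dwell_seconds': 18},
]
print(build_session(events))  # ['5846', '9635']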
Comparison with collaborative filtering
BARKAN, Oren; KOENIGSTEIN, Noam. Item2vec: neural item embedding for collaborative filtering. In: Machine Learning for Signal Processing (MLSP), 2016 IEEE 26th International Workshop on. IEEE, 2016. p. 1-6.
Recommendation using product embeddings
● utilize session data about how customers browse your website
● represent each product as a dense numeric vector / embedding
● real-time - retrieve vectors, combine them and compute similarities
● able to capture a product's style, color, category or price level
● contact me:
○ jakub.macina@exponea.com
○ @dmacjam
Resources
● https://www.slideshare.net/xamat/recommender-systems-machine-learning-summer-school-2014-cmu
● BARKAN, Oren; KOENIGSTEIN, Noam. Item2vec: neural item embedding for collaborative filtering. In: Machine Learning for Signal Processing (MLSP), 2016 IEEE 26th International Workshop on. IEEE, 2016. p. 1-6.
● YOUNG, Tom, et al. Recent trends in deep learning based natural language processing. arXiv preprint arXiv:1708.02709, 2017.