§ Netflix: 2/3 of the movies watched are recommended
§ Google News: recommendations generate 38% more clickthrough
§ Amazon: 35% of sales come from recommendations
§ ChoiceStream: 28% of people would buy more music if they found what they liked
Recommender Systems, 25.08.2017
§ In search, the user already knows about the item
§ In recommendation, the user may not even know whether the item exists
§ Content-Based Filtering
§ Collaborative Filtering
§ Learning to Rank
§ Context-Aware Recommendations
§ Tensor Factorization
§ Factorization Machines
§ Deep Learning
§ Similarity
§ Social Approaches
§ Collaborative Filtering: Recommend items based only on the user's past
behavior
§ User-based: Find similar users to me and recommend what they liked
§ Item-based: Find similar items to those that I have previously liked
§ Content-based: Recommend based on item features
§ Personalized Learning to Rank: Treat recommendation as a ranking
problem
§ Demographic: Recommend based on user features
§ Social recommendations (trust-based)
§ Hybrid: Combine any of the above
Item1 Item2 Item3 Item4 Item5
Alice 5 3 4 4 ?
User1 3 1 2 3 3
User2 4 3 4 3 5
User3 3 3 1 5 4
User4 1 5 5 2 1
USER-MOVIE RATINGS
§ A list of m users and a list of n items
§ Each user has a list of items with associated opinions
§ Explicit opinion – a rating score
§ Sometimes the opinion is implicit – purchase records or listened-to tracks
§ An active user for whom the CF prediction task is performed
§ A metric for measuring similarity between users
§ A method for selecting a subset of neighbors
§ A method for predicting a rating for items not yet rated by the active user
§ Identify set of ratings for the target/active user
§ Identify set of users most similar to the target/active user
according to a similarity function (neighborhood
formation)
§ Identify the products these similar users liked
§ Generate a prediction – the rating the target user would give to the
product – for each of these products
§ Based on these predicted ratings, recommend a set of top-N
products
§ Pros:
§ Requires minimal knowledge-engineering effort
§ Users and products are symbols without any internal structure or
characteristics
§ Produces good-enough results in most cases
§ Cons:
§ Requires a large number of reliable “user feedback data points”
to bootstrap
§ Requires products to be standardized (users should have bought
exactly the same product)
§ Assumes that prior behavior determines current behavior without
taking into account “contextual” knowledge (session-level)
§ CF recommendations are personalized since the “prediction” is
based on the ratings expressed by similar users
§ Those neighbors are different for each target user
§ A non-personalized collaborative-based recommendation can be
generated by averaging the recommendations of ALL the users
Item1 Item2 Item3 Item4 Item5
Alice 5 3 4 4 ?
User1 3 1 2 3 3
User2 4 3 4 3 5
User3 3 3 1 5 4
User4 1 5 5 2 1
sim(Alice, User1) = 0.85
sim(Alice, User2) = 0.70
sim(Alice, User4) = -0.79
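The similarity values above can be reproduced with Pearson correlation over the co-rated items, and Alice's missing Item5 rating then follows from the two nearest neighbors. A minimal Python sketch, assuming (the slide does not say) that user means are taken over Items 1–4; note the 0.70 shown is 0.707 truncated:

```python
# Pearson user-user similarity and a neighborhood prediction for the
# user-movie table above. Assumption: means over Items 1-4 only.
from math import sqrt

ratings = {
    "Alice": [5, 3, 4, 4],
    "User1": [3, 1, 2, 3],
    "User2": [4, 3, 4, 3],
    "User3": [3, 3, 1, 5],
    "User4": [1, 5, 5, 2],
}
item5 = {"User1": 3, "User2": 5, "User3": 4, "User4": 1}

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [x - ma for x in a]
    db = [x - mb for x in b]
    num = sum(x * y for x, y in zip(da, db))
    den = sqrt(sum(x * x for x in da)) * sqrt(sum(x * x for x in db))
    return num / den if den else 0.0

sims = {u: pearson(ratings["Alice"], r)
        for u, r in ratings.items() if u != "Alice"}

# Predict Alice's rating for Item5 from the two most similar users,
# weighting each neighbor's deviation from its own mean.
neighbors = ["User1", "User2"]
mean_alice = sum(ratings["Alice"]) / 4
pred = mean_alice + sum(
    sims[u] * (item5[u] - sum(ratings[u]) / 4) for u in neighbors
) / sum(sims[u] for u in neighbors)
```

With exact similarities the prediction lands slightly above 5, i.e. Item5 is a strong recommendation for Alice.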
Item1 Item2 Item3 Item4 Item5
Alice 5 3 4 4 ?
User1 3 1 2 3 3
User2 4 3 4 3 5
User3 3 3 1 5 4
User4 1 5 5 2 1
§ Methods of dimensionality reduction
§ Matrix Factorization
§ Clustering
§ Projection (PCA, SVD …)
§ Memory-based
§ Use the entire user-item database to generate a prediction
§ Use statistical techniques to find the neighbors – e.g. nearest-neighbor
§ Model-based
§ First develop a model of the user
§ Types of model:
§ Probabilistic (e.g. Bayesian networks)
§ Clustering
§ Rule-based approaches (e.g. association rules)
§ Classification, regression
§ LDA
§ Common for recommending text-based products (web pages,
Usenet news messages, …)
§ Items to recommend are “described” by their associated
features (e.g. keywords)
§ The user model is structured in a way “similar” to the content:
features/keywords more likely to occur in the preferred
documents (lazy approach)
§ Text documents are recommended based on a comparison between their
content (the words appearing) and the user model (a set of preferred words)
§ The user model can also be a classifier built with any
technique (neural networks, Naïve Bayes, ...)
§ Let Content(s) be an item profile, i.e. a set of attributes
characterizing item s
• Content is usually described with keywords
• The “importance” (or “informativeness”) of word k_i in document d_j
is determined by some weighting measure w_ij
• One of the best-known measures in IR is term
frequency / inverse document frequency (TF-IDF)
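A toy illustration of TF-IDF weighting; the corpus and the term-frequency normalization are illustrative assumptions, the core idea being w_ij = tf(k_i, d_j) × log(N / df(k_i)):

```python
# TF-IDF on a tiny toy corpus: a rare term outweighs a common one
# even when it occurs less often in the document.
from math import log

docs = [
    "sci fi space opera space battles".split(),
    "romantic comedy in space".split(),
    "courtroom drama legal thriller".split(),
]
N = len(docs)

# document frequency: in how many documents does each term appear?
df = {}
for d in docs:
    for term in set(d):
        df[term] = df.get(term, 0) + 1

def tfidf(term, doc):
    tf = doc.count(term) / len(doc)        # normalized term frequency
    return tf * log(N / df[term])          # weight = tf * idf

# "space" occurs in 2 of 3 documents, "opera" in only 1, so in doc 0
# the rarer term gets the higher weight despite its lower tf.
w_space = tfidf("space", docs[0])
w_opera = tfidf("opera", docs[0])
```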
§ Linear Regression
§ SVD
§ SVD++
§ Funk SVD
§ Matrix Factorization
§ Neural Networks
§ Locality-Sensitive Hashing (LSH)
§ Clustering
§ Association rules
SVD example (k = 2):

U_k:
        Dim1   Dim2
Alice   0.47  -0.30
Bob    -0.44   0.23
Mary    0.70  -0.06
Sue     0.31   0.93

Σ_k:
       Dim1  Dim2
Dim1   5.63     0
Dim2      0  3.23

V_k^T:
       Item1  Item2  Item3  Item4  Item5
Dim1   -0.44  -0.57   0.06   0.38   0.57
Dim2    0.58  -0.66   0.26   0.18  -0.36

• SVD: M_k = U_k × Σ_k × V_k^T
• Prediction: r̂_ui = r̄_u + U_k(Alice) × Σ_k × V_k^T(EPL) = 3 + 0.84 = 3.84
HOSVD
§ Personalized recommendations—user-based
§ Similar items—item-based
§ Viewed this bought that—item-based cross-action
§ Popular Items and User-defined ranking
§ Item-set recommendations for complementary purchases or
shopping carts—item-set-based
§ Hybrid collaborative filtering and content based
recommendations—limited content-based
§ Business rules
r = recommendations
hp = a user's history of some action (purchase, for instance)
P = the history of all users' primary action; rows are users, columns are items
(PtP) = compares column to column using a log-likelihood-based correlation test
r = (PtP)hp
§ Let’s call (PtP) an indicator matrix for some primary action
like purchase
§ Rows = items, columns = items, element = similarity/correlation score
§ The score is row compared to column using a “similarity” or
“correlation” metric
§ Log-Likelihood Ratio (LLR) finds important/correlating co-
occurrences and filters out the rest—a major improvement in
quality over simple co-occurrence or other similarity
metrics.
§ Experiments on real-world data show LLR is significantly
better than other similarity metrics
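A sketch of the log-likelihood ratio for a 2×2 co-occurrence table, in the form commonly used for this purpose (e.g. in Mahout); entropy(...) here is the unnormalized quantity N·ln N − Σ x·ln x over its arguments:

```python
# Dunning's log-likelihood ratio for a 2x2 contingency table:
# LLR = 2 * (H(row sums) + H(col sums) - H(cells)).
from math import log

def xlogx(x):
    return x * log(x) if x > 0 else 0.0

def entropy(*counts):
    return xlogx(sum(counts)) - sum(xlogx(c) for c in counts)

def llr(k11, k12, k21, k22):
    """k11: both events co-occur; k12/k21: only one occurs; k22: neither."""
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    return max(0.0, 2.0 * (row + col - mat))
```

Independent events score near zero, strongly associated events score high, which is exactly what lets LLR filter out uninteresting co-occurrences.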
§ This actually means to take the user’s
history hp and compare it to rows of the co-
occurrence matrix (PtP)
§ TF-IDF weighting of co-occurrence would
be nice to mitigate the undue influence
of popular items
§ Find items nearest to the user’s history
§ Sort these by similarity strength and
keep only the highest —you have
recommendations
§ hp
§ user1: [item2, item3]
§ (PtP)
§ item1: [item2, item3]
§ item2: [item1, item3, item95]
§ item3: […]
§ find item that most closely matches the
user’s history
§ item1 !
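The toy example above, sketched in Python: the user's history hp is intersected with each row of (PtP), and the row item with the largest overlap wins. item3's elided row is filled in here purely as an assumption:

```python
# Match a user's history against the rows of the co-occurrence
# matrix (PtP); score = size of the overlap.
cooccurrence = {
    "item1": {"item2", "item3"},
    "item2": {"item1", "item3", "item95"},
    "item3": {"item1", "item2"},   # assumed: the slide elides this row
}
hp = {"item2", "item3"}            # user1's history

# don't recommend items the user already has
scores = {item: len(row & hp)
          for item, row in cooccurrence.items()
          if item not in hp}
best = max(scores, key=scores.get)
```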
[Figure: a single user's (user-i's) history of multi-modal behavior – action matrices with users as rows: A (buys, over products), B (views, over products), C (category preferences, over categories), D (shares, over products), E (search terms, over terms), …]
[Figure: all users' buys and their co-occurrence – At × A turns the users-×-products matrix A into a products-×-products co-occurrence matrix. Product-j had 2 other products that were bought in common; we replace the co-occurrence magnitude with an LLR score, which adds the "correlation test" to simple co-occurrence.]
Cooccurrence Analysis
How often do items co-occur?
// compute co-occurrence matrix
val C = A.t %*% A
Which co-occurrences are interesting?
// compute some statistics
val interactionsPerItem = drmBroadcast(A.colSums)
// convert to indicator matrix
val I = C.mapBlock() {
// compute LLR scores from cooccurrences and statistics
...
// only keep interesting cooccurrences
...
}
// save indicator matrix
I.writeDrm(...)
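The same pipeline sketched end-to-end in plain Python instead of Mahout-Samsara: build C = AtA from a binary user-item matrix, score each nonzero co-occurrence with LLR, and keep only pairs above a threshold. The data and threshold are illustrative:

```python
# Co-occurrence -> LLR-filtered indicator matrix, in pure Python.
from math import log

def xlogx(x):
    return x * log(x) if x > 0 else 0.0

def entropy(*counts):
    return xlogx(sum(counts)) - sum(xlogx(c) for c in counts)

def llr(k11, k12, k21, k22):
    return max(0.0, 2.0 * (entropy(k11 + k12, k21 + k22)
                           + entropy(k11 + k21, k12 + k22)
                           - entropy(k11, k12, k21, k22)))

# users x items interaction matrix A (1 = user bought item)
A = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
n_users, n_items = len(A), len(A[0])
col = [sum(row[j] for row in A) for j in range(n_items)]

# C = AtA: C[i][j] = number of users who interacted with both i and j
C = [[sum(row[i] * row[j] for row in A) for j in range(n_items)]
     for i in range(n_items)]

# indicator matrix: keep only co-occurring pairs whose LLR is high
THRESHOLD = 5.0   # illustrative cutoff
indicators = {}
for i in range(n_items):
    for j in range(n_items):
        if i == j or C[i][j] == 0:
            continue
        k11 = C[i][j]
        k12 = col[i] - k11
        k21 = col[j] - k11
        k22 = n_users - col[i] - col[j] + k11
        if llr(k11, k12, k21, k22) > THRESHOLD:
            indicators.setdefault(i, set()).add(j)
```

Items 0 and 1 are always bought together, so only that pair survives the cutoff; the weaker item 2/3 association is filtered out.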
§ Virtually all existing collaborative-filtering recommenders
use only one indicator of preference
§ Virtually anything we know about the user can be used to
improve recommendations—purchase, view, category
preference, location preference, device preference, gender…
r = (PtP)hp + (PtV)hv + (PtC)hc + …
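The multi-modal formula can be sketched as a sum of per-indicator scores: each term of r = (PtP)hp + (PtV)hv + … contributes an overlap count between the user's history for that action and the corresponding indicator rows. All data here is illustrative:

```python
# Combine a purchase indicator (PtP) and a view cross-occurrence
# indicator (PtV) into one score per candidate item.
ptp = {"item1": {"item2", "item3"}}                        # purchase rows
ptv = {"item1": {"item7"}, "item4": {"item7", "item8"}}    # view rows

hp = {"item2"}   # user's purchase history
hv = {"item7"}   # user's view history

items = set(ptp) | set(ptv)
score = {i: len(ptp.get(i, set()) & hp) + len(ptv.get(i, set()) & hv)
         for i in items}
best = max(score, key=score.get)
```

item1 is supported by both indicators, so it outranks item4, which only the view indicator supports.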
CROSS-OCCURRENCE
[Figure: cross-occurrence of all users' buys with search terms – At × E multiplies the transposed buy matrix (products × users) by the search-term matrix E (users × terms), yielding a products-×-terms cross-occurrence matrix. Product-j had 3 terms that were searched for in common; we replace the cross-occurrence magnitude with an LLR score, which adds the "correlation test" to simple cross-occurrence.]
§ Comparing the history of the primary action to other actions finds
actions that lead to the one you want to recommend
§ Given strong data about user preferences on a general population we
can also use
§ items clicked
§ terms searched
§ categories viewed
§ items shared
§ people followed
§ items disliked (yes dislikes may predict likes)
§ location
§ device preference
§ gender
§ age bracket
§ Virtually anything we know about the population can be tested for correlation
and used to predict a particular user's preferences
§ Collaborative Topic Filtering
§ Use Latent Dirichlet Allocation (LDA) to model topics directly from the textual
content
§ Calculate based on Word2Vec type word vectors instead of bag-of-words
analysis to boost quality
§ Create cross-occurrence indicators from topics the user has preferred
§ Repeat periodically
§ Entity Preferences:
§ Use a Named Entity Recognition (NER) system to find entities in textual content
§ Create cross-occurrence indicators for these entities
§ Entities and topics are long-lived and richly describe user interests;
they are very good for use in the Universal Recommender.
§ The final calculation uses hp as the query on the Cooccurrence Matrix
(PtP), returns a ranked set of items
§ The query is a “similarity” query, not a relational or key-based fetch
§ Uses Search Engine as Cosine-based K-Nearest Neighbor (KNN)
Engine with norms and TF-IDF weighting
§ Highly optimized for serving these queries in realtime
§ Several (Solr, Elasticsearch) have High Availability, massively
scalable clustered auto-sharding features like the best of NoSQL DBs.
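The serving step can be sketched as a cosine-similarity ranking: each item's indicator list forms a "document", the user's recent history is the query, and the nearest items are returned. A real search engine adds TF-IDF weighting and field norms on top; this sketch uses unweighted term sets:

```python
# Cosine KNN over indicator "documents": rank indexed items by
# similarity to the user's history query.
from math import sqrt

index = {                       # item -> indicator terms (illustrative)
    "item1": {"item2", "item3"},
    "item2": {"item1", "item3", "item95"},
}
query = {"item2", "item3"}      # user's recent history

def cosine(a, b):
    if not a or not b:
        return 0.0
    return len(a & b) / (sqrt(len(a)) * sqrt(len(b)))

ranked = sorted(index, key=lambda i: cosine(index[i], query), reverse=True)
```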
Indicators can also be based on content similarity:
(TTt) is a calculation that compares every two documents to
each other and finds the most similar – based on content
alone
r = (TTt)ht + l*L …
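A sketch of the (TTt) content-similarity computation: with T a documents-×-terms matrix, (T·Tt)[i][j] counts the terms documents i and j share. A real system would use TF-IDF-weighted vectors rather than raw binary counts:

```python
# Content similarity via T * T^t on a binary term matrix.
T = [
    [1, 1, 0, 0],   # doc0: terms {action, space}
    [1, 1, 1, 0],   # doc1: terms {action, space, alien}
    [0, 0, 0, 1],   # doc2: term  {romance}
]
n = len(T)
TTt = [[sum(a * b for a, b in zip(T[i], T[j])) for j in range(n)]
       for i in range(n)]

# doc0's most similar other document: the one sharing the most terms
sims0 = [(TTt[0][j], j) for j in range(n) if j != 0]
most_similar = max(sims0)[1]
```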
§ Cooccurrence
§ Find the best indicator of a user preference for the item type to be
recommended: examples are “buy”, “read”, “video_watch”,
“share”, “follow”, “like”
§ Cross-occurrence
§ Item metadata as “user” preference, for example: treat item
category as a user category-preference
§ Calculated from user actions on any data that may give information
about user— category-preferences, search terms, gender, location
§ Create with Mahout-Samsara SimilarityAnalysis.cooccurrence
§ Content or metadata
§ Content text, tags, categories, description text, anything describing
an item
§ Create with Mahout-Samsara SimilarityAnalysis.rowSimilarity
§ Intrinsic
§ Popularity rank, geo-location, anything describing an item
§ Some may be derived from usage data like popularity rank, or
hotness
§ Is a known or specially calculated property of the item
“Universal” means one query on all indicators at once
Unified query:
purchase-correlator: users-history-of-purchases
view-correlator: users-history-of-views
category-correlator: users-history-of-categories-viewed
tags-correlator: users-history-of-purchases
geo-location-correlator: users-location
r = (PtP)hp + (PtV)hv + (PtC)hc + …
(TTt)ht + l*L …
§ Dataset
25.08.2017Recommender Systems 48
25.08.2017Recommender Systems 49
25.08.2017Recommender Systems 50
25.08.2017Recommender Systems 51
25.08.2017Recommender Systems 52
25.08.2017Recommender Systems 53
25.08.2017Recommender Systems 54
§ getBiasedSimilarItems
§ Get similar items for an item; these are already in the action correlators in ES
§ getBoostedMetadata
§ Get all metadata fields that potentially have boosts (not filters)
§ getFilteringMetadata
§ Get all metadata fields that are filters (not boosts)
§ getFilteringDateRange
§ Get part of query for dates and date ranges
§ getExcludedItems
§ Create a list of item ids that the user has already interacted with, or that are otherwise
not to be included in recommendations
§ getBiasedRecentUserActions
§ Get recent events of the user on items to create the recommendations query from
§ getExcludingMetadata
§ Get all metadata fields that are filters (not boosts)
§ Results
Cenk Bircanoğlu


Editor's Notes

  • #38 Figure 1: Formula used to calculate geospatial statistic (a modified log-likelihood ratio [LLR]) on the basis of geographic distribution of Mycobacterium tuberculosis genotype clusters, Washington, USA. Variables are classified as follows: a = number of tuberculosis (TB) cases with the genotype of interest in the selected county; b = number of cases with the genotype of interest in the United States; c = number of cases without the genotype of interest in the selected county; d = number of cases without the genotype of interest in the United States; N = total number of TB cases.