Demystifying Recommendation Systems


About Rumman
• Senior Data Scientist and Instructor at Metis
• Practicing Data Scientist
• Find me on Twitter @ruchowdh
• Visit my website at rummanchowdhury.com
• Check out my jobs page
• …and my blog

About Metis
• Data Science Bootcamp
• Part of Kaplan
• Accredited by ACCET
• 12 weeks, full-time, including 60 hours of
online pre-work
• Evening and weekend training courses
• Third party financing options
• $3,000 scholarship for women,
underrepresented minority groups, and
veterans or members of the U.S. military

Overview
• What is a recommendation engine?
• What are the types of recommendation
systems?
• What are the drawbacks of the most
common recommendation engines and how
do I deal with them?
• How do I fine-tune my model?

What are recommendation
systems?

What are recommendation systems?
Automated systems that seek to predict whether a
given item (product, event, movie, song, etc.) will be
desirable to a user.
Or, more data science-y: predict what a user’s
review will be for items that they have not yet
reviewed.

Where does a recommendation system lie in the
space of data science and analytics?
• Descriptive
• Averages, percentages, etc.
• Explains what happened, during or after the event
• Predictive
• Uses modeling of past behavior to make
predictions about the future
• Prescriptive
• Informs decisions about which actions
to take, based on data

How do I pick the best kind of recommender system
for my data?
• What is your existing data?
• How quickly does your inventory change?
• How much information can you get on a
user? (explicit and implicit)
• Does your model need to scale well?

What are the kinds of
recommendation systems?

What are the kinds of recommender systems?
• Search (knowledge-based)
• Pros: items will be close matches to
expressed needs, no cold-start issues
• Cons: Static, manual tagging, will not
work well with very similar inventories
or rapidly changing inventories
• Example: Amazon’s basic search

• Content-based
• Items are mapped based on characteristics into
an item-feature space, and recommendations
are based on specified characteristics
• Pros: Easier comparison between items
• Cons: Cold start problem, need good content
descriptions, need item ratings
• Example: Search for ‘ai’ vs ‘AI’,
‘mit’ vs ‘MIT’

• Collaborative filtering: based on user and
item similarities
• Pros: can provide less-obvious matches
• Cons: cold-start problem for new users and
new items, requires a feedback rating

Limitations, or: ask yourself, do you really need a
recommendation engine?
• Recommendation systems have to update immediately.
• You have to have a sufficiently inexpensive
model and have the bandwidth to return results
fast.
• You have more information than you think:
• existing item popularity
• geography based on IP address
• cookies

How does Content-Based recommendation work?
• Users and items are represented by vectors
in a feature space
• Approaches:
• Map users and items to the same
feature space, compute distance
between a user and an item.

Example: Content-Based Recommendation
Features = (big box office, aimed at kids, famous actors)
Items (movies): 
 
Finding Nemo = (5, 5, 2)
Mission Impossible = (3, -5, 5)
Jiro Dreams of Sushi = (-4, -5, -5)
Predicted ratings*:
Finding Nemo: (-3*5 + 2*5 + 2*2) = -1
Mission Impossible: (-3*3 + 2*(-5) + 2*5) = -9
Jiro Dreams of Sushi: (-3*(-4) + 2*(-5) + 2*(-5)) = -8
* Ratings for user with a described
preference of (-3, 2, 2) for these features
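
A minimal Python sketch of the calculation above, assuming the predicted rating is simply the dot product of the user's preference vector with each movie's feature vector. The function and variable names are illustrative and not from the talk.

```python
# Feature order: (big box office, aimed at kids, famous actors)
items = {
    "Finding Nemo": (5, 5, 2),
    "Mission Impossible": (3, -5, 5),
    "Jiro Dreams of Sushi": (-4, -5, -5),
}
user_preference = (-3, 2, 2)


def predicted_rating(user, item):
    """Dot product of a user vector and an item vector in the same feature space."""
    return sum(u * i for u, i in zip(user, item))


scores = {title: predicted_rating(user_preference, vec) for title, vec in items.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

# Print the movies from most to least recommended for this user.
for title in ranked:
    print(title, scores[title])
```

A larger score means a closer match between the user's stated preferences and the item's characteristics; the ranking, not the absolute number, is what drives the recommendation.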

How does Content-Based Recommendation work?
• Another option is to create features from
user+item pairs and use an algorithm
(e.g., a classifier) to predict like/dislike
• Each user/item pair has a labeled outcome,
such as purchased/not purchased. You can
train a model to predict purchase behavior.
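
One way to picture the user+item pair approach is a tiny logistic regression trained on purchased/not-purchased labels. This is only a sketch: the users, items, labels, and the choice of element-wise products as pair features are all made up for illustration, and the hand-rolled training loop stands in for whatever classifier you would actually use.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def pair_features(user, item):
    """Hypothetical pair features: element-wise product of user and item vectors."""
    return [u * i for u, i in zip(user, item)]


def train_logistic(X, y, lr=0.1, epochs=500):
    """Plain stochastic gradient descent on the logistic loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict_proba(w, b, x):
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Made-up feature vectors: (likes kids' content, likes action)
users = {"kid_fan": (1.0, 0.0), "action_fan": (0.0, 1.0)}
items = {"cartoon": (1.0, 0.0), "thriller": (0.0, 1.0)}
# Made-up labeled outcomes: 1 = purchased, 0 = not purchased
observed = [
    ("kid_fan", "cartoon", 1),
    ("kid_fan", "thriller", 0),
    ("action_fan", "cartoon", 0),
    ("action_fan", "thriller", 1),
]

X = [pair_features(users[u], items[i]) for u, i, _ in observed]
y = [label for _, _, label in observed]
w, b = train_logistic(X, y)

p_match = predict_proba(w, b, pair_features(users["kid_fan"], items["cartoon"]))
p_mismatch = predict_proba(w, b, pair_features(users["kid_fan"], items["thriller"]))
```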

How does Collaborative Filtering work?
• Collaborative filtering refers to a family of
methods for predicting ratings where instead of
thinking about users and items in terms of a
feature space, we are only interested in the
existing user-item ratings themselves. 
• In this case, our dataset is a ratings matrix whose
columns correspond to items, and whose rows
correspond to users.
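
As a rough illustration, the ratings matrix can be built from a list of (user, item, rating) records, with `None` marking the ratings a user has not given. The user names and ratings below are invented for the example.

```python
# Invented (user, item, rating) records
ratings = [
    ("ana", "Finding Nemo", 5),
    ("ana", "Mission Impossible", 2),
    ("ben", "Mission Impossible", 4),
    ("ben", "Jiro Dreams of Sushi", 5),
    ("cho", "Finding Nemo", 4),
]

users = sorted({u for u, _, _ in ratings})   # one row per user
items = sorted({i for _, i, _ in ratings})   # one column per item
row = {u: n for n, u in enumerate(users)}
col = {i: n for n, i in enumerate(items)}

# None marks a missing (unobserved) rating
matrix = [[None] * len(items) for _ in users]
for u, i, r in ratings:
    matrix[row[u]][col[i]] = r

for u in users:
    print(u, matrix[row[u]])
```

The `None` entries are exactly what a collaborative filtering method tries to predict.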

Example: Netflix movie recommendations

How does collaborative filtering work?
• Method 1: Item-based CF, a.k.a. neighborhood
methods or memory-based CF
• Ratings data are used to create an item-item
similarity matrix.
• Recommendations are made based on the items
most similar to those a user has already rated
highly.
• This method does not scale well.
• Why? You need a fully populated matrix of
item-item similarity, which is costly to compute
and store if you have a lot of items or if your
items change a lot.
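
A small sketch of item-based CF under some assumptions: similarity is cosine similarity computed over the users who rated both items, and a user's predicted rating for an unseen item is the similarity-weighted average of the ratings they have already given. The ratings are invented, and other similarity measures or weighting schemes are equally valid.

```python
import math

# Invented ratings: user -> {item: rating}
ratings = {
    "u1": {"A": 5, "B": 4, "C": 1, "D": 1},
    "u2": {"A": 4, "B": 5, "C": 2, "D": 2},
    "u3": {"A": 5, "C": 1},
    "u4": {"A": 1, "B": 2, "C": 5, "D": 5},
}


def item_similarity(a, b):
    """Cosine similarity between items a and b over users who rated both."""
    common = [u for u, r in ratings.items() if a in r and b in r]
    if not common:
        return 0.0
    dot = sum(ratings[u][a] * ratings[u][b] for u in common)
    norm_a = math.sqrt(sum(ratings[u][a] ** 2 for u in common))
    norm_b = math.sqrt(sum(ratings[u][b] ** 2 for u in common))
    return dot / (norm_a * norm_b)


all_items = sorted({i for r in ratings.values() for i in r})
# The full item-item matrix: this is the part that grows quadratically with inventory.
similarity = {a: {b: item_similarity(a, b) for b in all_items if b != a} for a in all_items}


def recommend(user):
    """Rank the items the user has not rated by similarity-weighted average rating."""
    rated = ratings[user]
    scores = {}
    for candidate in all_items:
        if candidate in rated:
            continue
        weighted = [(similarity[candidate][i], r) for i, r in rated.items()]
        total = sum(abs(s) for s, _ in weighted)
        if total > 0:
            scores[candidate] = sum(s * r for s, r in weighted) / total
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


print(recommend("u3"))
```

Note that `similarity` holds one entry per pair of items, which is why this approach becomes expensive as the catalogue grows or changes.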

How does CF work?
• Method 2: Model-based CF uses matrix
decomposition via singular value
decomposition (SVD) to reduce
dimensionality and extract latent variables.
• We express users and items in terms of
these variables.
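
To make the latent-variable idea concrete, here is a sketch that swaps in a related technique: instead of a true SVD (which needs a fully filled matrix), it learns a small vector of latent factors for each user and each item by stochastic gradient descent on the observed ratings only. The data, the number of factors `k`, and the learning-rate and regularization values are all assumptions chosen for illustration.

```python
import math
import random

# Invented (user_index, item_index, rating) triples; user 3 has not rated items 1 and 3.
observed = [
    (0, 0, 5), (0, 1, 5), (0, 2, 1), (0, 3, 1),
    (1, 0, 4), (1, 1, 4), (1, 2, 2), (1, 3, 2),
    (2, 0, 1), (2, 1, 1), (2, 2, 5), (2, 3, 5),
    (3, 0, 5), (3, 2, 1),
]


def factorize(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02, epochs=2000, seed=0):
    """Learn k latent factors per user (P) and per item (Q) from observed ratings."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(pf * qf for pf, qf in zip(P[u], Q[i]))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted rating = dot product of the user's and item's latent vectors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


P, Q = factorize(observed, n_users=4, n_items=4)
rmse = math.sqrt(sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in observed) / len(observed))
print("training RMSE:", round(rmse, 3))
print("user 3, item 1:", round(predict(P, Q, 3, 1), 2))
print("user 3, item 3:", round(predict(P, Q, 3, 3), 2))
```

Each user and item is now described by `k` numbers rather than by a full row or column of the ratings matrix, which is the dimensionality reduction the slide refers to.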

Why is model-based CF preferred?
• Scalable, flexible, accurate, domain
independent, and requires no explicit
information.

What are the drawbacks, and
how can I address them?

Let’s discuss the drawbacks
• Cold-start problem!
• Data is typically very sparse
• Need granularity in your data

Drawback: Cold Start problem
• Build an initial profile based on implicit
data, evolve based on explicit feedback as it
comes.
• Sometimes called a ‘hybrid’ filtering
method, you can use content-based
information to ease cold-start and data
sparsity problems.
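
One simple way to sketch a hybrid is to blend a content-based score with a CF score, leaning on the content-based score while a user has little explicit feedback. The linear blend and the `full_trust_at` threshold are assumptions made for this example, not something the slides prescribe.

```python
def hybrid_score(cf_score, content_score, n_explicit_ratings, full_trust_at=20):
    """Blend two scores, trusting CF more as explicit feedback accumulates.

    full_trust_at is a hypothetical threshold: the number of explicit ratings
    after which the CF score is used on its own.
    """
    alpha = min(1.0, n_explicit_ratings / full_trust_at)
    return alpha * cf_score + (1.0 - alpha) * content_score


# Brand-new user: falls back entirely on the content-based score.
print(hybrid_score(cf_score=4.0, content_score=2.0, n_explicit_ratings=0))
# Established user: relies entirely on the CF score.
print(hybrid_score(cf_score=4.0, content_score=2.0, n_explicit_ratings=20))
```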

Drawback: Sparsity of Data
• In the famous Netflix Prize dataset, ~99% of
possible ratings were missing.
• Data is skewed and sparse
• or, most people don’t rate a lot and
most items aren’t rated
• the items that are rated tend to be
rated constantly
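
A quick way to check how sparse and skewed your own data is: compute the fraction of missing user-item ratings and count ratings per item and per user. The records below are invented, and the catalogue is assumed to contain three items, one of which has never been rated.

```python
from collections import Counter


def sparsity(n_users, n_items, n_ratings):
    """Fraction of possible user-item ratings that are missing."""
    return 1.0 - n_ratings / (n_users * n_items)


# Invented (user, item) rating events
ratings = [("u1", "A"), ("u2", "A"), ("u3", "A"), ("u4", "A"), ("u1", "B")]
catalogue = ["A", "B", "C"]  # assumed full inventory; "C" has no ratings

per_item = Counter(item for _, item in ratings)
per_user = Counter(user for user, _ in ratings)

print("sparsity:", round(sparsity(len(per_user), len(catalogue), len(ratings)), 3))
print("ratings per item:", per_item.most_common())
print("unrated items:", [i for i in catalogue if i not in per_item])
```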

Drawback: Granularity of data
• Traditional model-based CF works well for
non-binary data (e.g., a 5-star rating). It doesn’t
work well for binary data (e.g., click/no click,
purchased/did not purchase)
• You will need to tweak your
measurements of item similarities

Quick overview of measurement
• Non-binary rating:
• Pearson correlation coefficient
• Euclidean distance
• Manhattan distance
• Binary ratings:
• Jaccard similarity
• Cosine similarity
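
For reference, the measures listed above can be written in a few lines each. This sketch uses standard textbook definitions and plain Python lists and sets; note that Euclidean and Manhattan are distances (smaller means more similar), while Pearson, Jaccard, and cosine are similarities (larger means more similar).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length rating vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def jaccard(a, b):
    """Overlap between two sets, e.g. the sets of items two users purchased."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return dot / (nx * ny) if nx and ny else 0.0


print(pearson([1, 2, 3], [2, 4, 6]))
print(jaccard({"A", "B"}, {"B", "C"}))
```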

Normalization
• Some items are significantly higher rated
(e.g., blockbuster movies, Oscar winners)
• Some users rate lower (or higher) than
the norm
• Ratings can change over time

Normalization
• Need to offset per user
• Need to offset per item
• Example: The mean rating across all users for item x
is some value. How does it differ from the mean
rating across all items? How does my rating
differ from the mean rating of that item?
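
The offsets can be sketched as a simple baseline: the global mean rating, plus how far a user's average sits from it, plus how far an item's average sits from it. This additive form is one common choice and an assumption here; the ratings are invented.

```python
from collections import defaultdict

# Invented (user, item, rating) records
ratings = [("u1", "A", 5), ("u1", "B", 3), ("u2", "A", 4), ("u2", "B", 2)]

mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating

by_user, by_item = defaultdict(list), defaultdict(list)
for u, i, r in ratings:
    by_user[u].append(r)
    by_item[i].append(r)

# Offsets: how far each user's / item's mean rating sits from the global mean
user_offset = {u: sum(rs) / len(rs) - mu for u, rs in by_user.items()}
item_offset = {i: sum(rs) / len(rs) - mu for i, rs in by_item.items()}


def baseline(user, item):
    """Expected rating from the offsets alone; unknown users or items get no offset."""
    return mu + user_offset.get(user, 0.0) + item_offset.get(item, 0.0)


# Residuals are what remains for the recommender itself to explain.
residuals = {(u, i): r - baseline(u, i) for u, i, r in ratings}
print(mu, user_offset, item_offset)
```

Working with the residuals rather than the raw ratings keeps generous raters and universally popular items from dominating the similarity calculations.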

Capturing data trends
• Rating distributions:
• ratings aren’t random, they follow a
distribution - model this distribution
• Feature importance: You can regress on your
feature vectors to get an understanding of what
values impact ratings
• Feature generation: Characterize your users and
create one-hot features (this can save a lot of time,
and help with cold-start problems)
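
As a small illustration of feature generation, a categorical user attribute can be turned into one-hot features that a content-based or hybrid model can use before any ratings exist. The segment names are hypothetical.

```python
SEGMENTS = ["student", "parent", "retiree"]  # hypothetical user segments


def one_hot(value, categories):
    """Return a 0/1 vector with a 1 in the position of the matching category."""
    return [1 if value == c else 0 for c in categories]


print(one_hot("parent", SEGMENTS))
print(one_hot("unknown", SEGMENTS))  # unseen category -> all zeros
```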

Temporal factors
• There can be an upward trend of ratings
over time
• Seasonal shifts due to holidays, awards, etc.
• Anchoring (i.e., an item is rated relative to a
previous iteration or version of that item)

Conclusions
• Think about your data, your capabilities,
and your needs prior to creating a
recommendation system
• Consider the pros and cons of each type
• Refine your model thoughtfully

Questions?
www.rummanchowdhury.com
@ruchowdh
