Recommender Systems

A survey on current recommendation technologies, including the author's latest research at IMPCA, Curtin University of Technology.

Transcript

  • 1. RecSys: Recommender Systems Tran The Truyen http://truyen.vietlabs.com
  • 2. The world is an over-crowded place
  • 3. They all want to get our attention
  • 4. We are overloaded • Thousands of news articles and blog posts each day • Millions of movies, books and music tracks online • In Hanoi, > 50 TV channels and thousands of programs each day • In New York, several thousand ad messages sent to us per day
  • 5. But we really need and consume only a few of them!
  • 6. Sometimes, all we need is this
  • 7. Or, just this! (a "DON'T DISTURB" sign)
  • 8. Help me!
  • 9. Can Google help? • Yes, but only when we really know what we are looking for • What if I just want some interesting music tracks? – Btw, what does "interesting" even mean?
  • 10. Can Facebook help? • Yes, I tend to find my friends' stuff interesting • What if I have only a few friends, and what they like does not always attract me?
  • 11. Can experts help? • Yes, but it won't scale well – Everyone receives exactly the same advice! • It is what they like, not what I like! – As with movies, what gets expert approval does not guarantee the attention of the masses
  • 12. OK, here is the idea called RecSys: I like these bits • To recommend to us something we may like – It may not be popular – The world is long-tailed • How? – Based on our history of using services – Based on other people like us – Ever heard of “collective intelligence”?
  • 13. Hang on, what is long-tailed? • Popularised by Chris Anderson, Wired 2004 (the slide contrasts three curves: the short-tailed distribution, the bell-shaped distribution and the long-tailed distribution)
  • 14. Ever heard of • GroupLens? • Amazon recommendation? • Netflix Cinematch? • Google News personalization? • Netflix Prize $1mil challenge? • Strands? • TiVo? • Findory?
  • 15. Want some evidence? (Celma & Lamere, ISMIR 2007) • Netflix: – 2/3 of rented movies come from recommendations • Google News – 38% more click-throughs are due to recommendations • Amazon – 35% of sales come from recommendations
  • 16. What can be recommended? • Advertising messages • Tags • Investment choices • News articles • Restaurants • Online mates (dating services) • Cafes • Future friends (social network sites) • Music tracks • Courses in e-learning • Movies • Drug components • TV programs • Research papers • Books • Citations • Clothes • Code modules • Supermarket goods • Programmers
  • 17. But, what do recommender systems do, exactly? 1. Predict how much you may like a certain product/service 2. Compose a list of N best items for you 3. Compose a list of N best users for a certain product/service 4. Explain to you why these items are recommended to you 5. Adjust the prediction and recommendation based on your feedback and other people
  • 18. Graph representation (the slide shows a bipartite graph: users Me, My friend, You and Another guy linked to the items Titanic, Taken and Panda, with a "?" marking the rating to be predicted)
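For concreteness, the user-item graph on this slide can be held in a plain adjacency structure; the following is a minimal sketch with made-up names and ratings:

    # The user-item graph as a simple adjacency structure: an edge means
    # "this user rated/watched this item"; the task is to score the missing
    # edge (the "?") between "me" and an unseen item.
    graph = {
        "me":          {"titanic": 5, "taken": 4},
        "my_friend":   {"titanic": 5, "panda": 4},
        "you":         {"taken": 3, "panda": 5},
        "another_guy": {"titanic": 2},
    }
    # Users who co-rated an item with "me" are two hops away in the bipartite graph.
    neighbours = {u for u, items in graph.items()
                  if u != "me" and set(items) & set(graph["me"])}
    print(neighbours)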
  • 19. We must also take good care of • Data normalisation • Removal or reduction of noise • Protection of users' privacy • Attacks: someone just doesn't like your system
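As an illustration of the normalisation point above, one common step is to mean-centre each user's ratings so that generous and harsh raters become comparable; a toy sketch on made-up data:

    # Per-user mean-centring: subtract each user's average rating so that
    # ratings express relative preference rather than absolute generosity.
    ratings = {"alice": {"titanic": 5, "taken": 4}, "bob": {"titanic": 2, "taken": 1}}

    normalised = {}
    for user, items in ratings.items():
        mean = sum(items.values()) / len(items)
        normalised[user] = {item: r - mean for item, r in items.items()}
    print(normalised)   # both users now express the same relative preference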
  • 20. Task 1: Preference prediction • Collaborative filtering – User-based method – Item-based method – Matrix Factorization • Content-based filtering • Hybrid: – Linear/sequential/switching combination – Semi-Restricted Boltzmann Machines
  • 21. Collaborative filtering (1) • User-based method (1994, GroupLens) – Many people liked "Kungfu Panda" – Can you tell how much I will like it? – The idea is to pick about 20-50 people who share a similar taste with me; how much I like it then depends on how much THEY liked it – In short: you may like it because your "friends" liked it (illustrated on the slide with a small user-item rating matrix)
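A minimal sketch of the user-based neighbourhood idea on slide 21, assuming a small in-memory rating dictionary (all names and numbers are made up):

    # User-based collaborative filtering: predict my rating for an item from
    # the ratings of the most similar users (toy, in-memory sketch).
    import math

    ratings = {                      # user -> {item: rating}, toy data
        "me":    {"titanic": 5, "taken": 3},
        "alice": {"titanic": 5, "taken": 3, "kungfu_panda": 4},
        "bob":   {"titanic": 1, "taken": 2, "kungfu_panda": 2},
    }

    def similarity(u, v):
        """Cosine similarity over the items both users rated."""
        common = set(ratings[u]) & set(ratings[v])
        if not common:
            return 0.0
        dot = sum(ratings[u][i] * ratings[v][i] for i in common)
        nu = math.sqrt(sum(ratings[u][i] ** 2 for i in common))
        nv = math.sqrt(sum(ratings[v][i] ** 2 for i in common))
        return dot / (nu * nv)

    def predict(user, item, k=20):
        """Weighted average of the ratings given to `item` by the k most similar users."""
        neighbours = [(similarity(user, v), r[item])
                      for v, r in ratings.items()
                      if v != user and item in r]
        neighbours.sort(reverse=True)
        top = neighbours[:k]
        denom = sum(abs(s) for s, _ in top)
        return sum(s * r for s, r in top) / denom if denom else None

    print(predict("me", "kungfu_panda"))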
  • 22. Collaborative filtering (2) • Item-based method (2001, deployed at Amazon) – I have watched so many good & bad movies – Would you recommend me "Taken"? – The idea is to pick from my previous list the 20-50 movies that share a similar audience with "Taken"; how much I will like it then depends on how much I liked those earlier movies – In short: I tend to watch this movie because I have watched those movies … or – People who have watched those movies also liked this movie (Amazon style) (illustrated on the slide with the same user-item rating matrix)
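The item-based variant from slide 22 looks like this in a toy sketch; the data is again invented, and in practice the item-item similarities would be precomputed offline:

    # Item-based collaborative filtering: score an unseen item by comparing it
    # to the items this user has already rated.
    import math

    item_ratings = {                 # item -> {user: rating}, toy data
        "taken":        {"u1": 5, "u2": 4, "u3": 2},
        "titanic":      {"u1": 4, "u2": 5, "u3": 1},
        "kungfu_panda": {"u1": 5, "u3": 2},
    }
    my_ratings = {"titanic": 5, "kungfu_panda": 4}   # what I have already rated

    def item_similarity(a, b):
        """Cosine similarity between two items over their common raters."""
        common = set(item_ratings[a]) & set(item_ratings[b])
        if not common:
            return 0.0
        dot = sum(item_ratings[a][u] * item_ratings[b][u] for u in common)
        na = math.sqrt(sum(item_ratings[a][u] ** 2 for u in common))
        nb = math.sqrt(sum(item_ratings[b][u] ** 2 for u in common))
        return dot / (na * nb)

    def predict(target):
        """Average of my own ratings, weighted by each item's similarity to the target."""
        pairs = [(item_similarity(target, i), r) for i, r in my_ratings.items()]
        denom = sum(abs(s) for s, _ in pairs)
        return sum(s * r for s, r in pairs) / denom if denom else None

    print(predict("taken"))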
  • 23. Collaborative filtering (3) • Matrix Factorization (2006, Netflix challenge) – You may have watched thousands of movies – But perhaps I can tell these movies belong to 10 groups, like Action, Sci-Fi, Animation, etc. – So 10 numbers are enough to describe your taste, e.g. [0.1 0.3 0.2 0.9 0.5 0.4 0.7 0.3 0.8 1.5] – Likewise, "Titanic" has been watched by millions of people, but perhaps … 10 numbers are enough to describe its features – Magic: these hidden aspects can be discovered automatically by Matrix Factorization!
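A small illustration of the matrix-factorization idea on slide 23, using plain stochastic gradient descent; the latent dimension, learning rate, regularisation and toy ratings below are arbitrary choices for the sketch:

    # Matrix factorization by SGD: learn a short latent vector per user and per
    # item so that their dot product approximates the observed rating.
    import numpy as np

    def factorize(triples, n_users, n_items, k=10, lr=0.01, reg=0.05, epochs=50):
        rng = np.random.default_rng(0)
        P = 0.1 * rng.standard_normal((n_users, k))   # user factors
        Q = 0.1 * rng.standard_normal((n_items, k))   # item factors
        for _ in range(epochs):
            for u, i, r in triples:                   # (user, item, rating)
                pu, qi = P[u].copy(), Q[i].copy()
                err = r - pu @ qi
                P[u] += lr * (err * qi - reg * pu)
                Q[i] += lr * (err * pu - reg * qi)
        return P, Q

    # Toy ratings: 3 users x 3 items.
    data = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 4), (2, 2, 5)]
    P, Q = factorize(data, n_users=3, n_items=3, k=2)
    print(float(P[0] @ Q[2]))   # predicted rating of user 0 for item 2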
  • 24. Problems with collaborative filtering • Scale – Netflix (2007): 5M users, 50K movies, 1.4B ratings • Sparse data – I have rated only one book at Amazon! • Cold-start – New users and items have no history • Popularity bias – Everyone reads "Harry Potter" • Hacking – fake profiles can make everyone who reads "Harry Potter" appear to also read "Kama Sutra"
  • 25. Content-based method • Web page: words, hyperlinks, images, tags, comments, titles, URL, topic • Music: genre, rhythm, melody, harmony, lyrics, meta data, artists, bands, press releases, expert reviews, loudness, energy, time, spectrum, duration, frequency, pitch, key, mode, mood, style, tempo • User: age, sex, job, location, time, income, education, language, family status, hobbies, general interests, Web usage, computer usage, fan club membership, opinion, comments, tags, mobile usage • Context: time, location, mobility, activity, socializing, emotion
  • 26. Content-based method (2) • Can we acquire those content pieces automatically? – Fairly easy for text – Difficult for music and video, except via digital signal analysis, e.g. music genre classification reaches 60-80% accuracy – A lot of noise, e.g. misplaced tags – Attacks • What can we do with these? – Compute similarity between items or users – Query items that are similar to a given item – Match an item's content against a user's profile
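To make the "match item's content and user's profile" point concrete, here is a toy content-based sketch: the user profile is the average word vector of liked items, and unseen items are ranked by cosine similarity to it (vocabulary, items and counts are all made up):

    # Content-based matching: build a user profile from the content of liked
    # items, then score unseen items against that profile.
    import numpy as np

    vocab = ["action", "romance", "ship", "spy", "panda"]
    item_words = {                                  # item -> bag-of-words counts
        "titanic": np.array([0, 3, 4, 0, 0], float),
        "taken":   np.array([4, 0, 0, 3, 0], float),
        "panda":   np.array([3, 0, 0, 0, 4], float),
    }
    liked = ["titanic"]                             # items in the user's history

    profile = np.mean([item_words[i] for i in liked], axis=0)

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    for item, vec in item_words.items():
        if item not in liked:
            print(item, round(cosine(profile, vec), 3))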
  • 27. Content-based method (3) • Measuring similarity – Cosine, TF-IDF as in standard Information Retrieval – KL-divergence for probability-oriented guys – Euclidean, with dimensionality reduction if you want – Anything you can imagine!
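Two of the measures listed on slide 27, sketched on arbitrary feature vectors (the vectors stand in for word or tag counts and are not taken from the slide; cosine is already shown in the earlier sketches):

    # KL-divergence (after normalising count vectors into distributions) and
    # plain Euclidean distance as content-similarity measures.
    import numpy as np

    def kl_divergence(p, q, eps=1e-12):
        """KL(p || q) between two count vectors, normalised to distributions."""
        p = p / p.sum()
        q = q / q.sum()
        return float(np.sum(p * np.log((p + eps) / (q + eps))))

    def euclidean(a, b):
        return float(np.linalg.norm(a - b))

    a = np.array([4.0, 1.0, 0.0, 2.0])   # e.g. tag counts for item A (made up)
    b = np.array([3.0, 2.0, 1.0, 1.0])   # tag counts for item B (made up)
    print(kl_divergence(a, b), euclidean(a, b))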
  • 28. Hybrid: Semi-Restricted Boltzmann Machines (2009, IMPCA) • A probabilistic combination of – Item-based method – User-based method – Matrix Factorization – (Maybe) content-based method • It looks like a Neural Network – But it is not really one ☺ • It really is a type of Markov random field, which is, in turn, a type of Graphical Model – Self-advertising: I work on these things for a living! (The slide's diagram shows layers of binary units for User A, User B and User C connected to Item X.)
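The Semi-Restricted Boltzmann Machine itself is not spelled out on the slide, so the sketch below only shows the standard RBM building block that such models extend: hidden "taste" units are Bernoulli variables whose activation probability is a logistic function of the visible ratings. The weights and the input vector here are random placeholders, not parameters from the author's model:

    # Standard RBM hidden-unit activation: p(h_j = 1 | v) = sigmoid(b_j + sum_i v_i * W_ij).
    import numpy as np

    def hidden_activations(v, W, b):
        """Probability that each hidden unit switches on, given visible vector v."""
        return 1.0 / (1.0 + np.exp(-(b + v @ W)))

    rng = np.random.default_rng(0)
    v = np.array([1.0, 0.0, 1.0, 1.0])          # one user's binary watch/like vector (toy)
    W = 0.1 * rng.standard_normal((4, 3))       # visible-to-hidden weights (random placeholder)
    b = np.zeros(3)                             # hidden biases
    print(hidden_activations(v, W, b))          # the user's hidden "taste" representation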
  • 29. But, what do recommender systems do, exactly? 1. Predict how much you may like a certain product/service 2. Compose a list of N best items for you 3. Compose a list of N best users for a certain product/service 4. Explain to you why these items are recommended to you 5. Adjust the prediction and recommendation based on your feedback and other people
  • 30. Task 2,3: Top-N recommendation • Top-N item list: – Find similar users, collect what they like – Filter out those the user has rated – Rank the remaining items by considering • The number of times each item is liked by those users • The popularity of the item • The associated ratings • The similarity between each item in the list and what the user has rated • Switching the role of item to user, we may have top-N user list
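A rough sketch of the top-N recipe on slide 30: collect items liked by similar users, filter out what the target user has already rated, and rank by a simple rating-weighted count. The neighbour list and ratings are made up; a real system would also fold in popularity and item-item similarity as the slide says:

    # Top-N item recommendation from a precomputed list of similar users.
    from collections import Counter

    ratings = {
        "me":    {"titanic": 5},
        "alice": {"titanic": 5, "taken": 4, "kungfu_panda": 5},
        "bob":   {"titanic": 4, "kungfu_panda": 3},
    }
    similar_users = ["alice", "bob"]     # e.g. found with the user-similarity sketch above

    def top_n(user, n=5):
        scores = Counter()
        for v in similar_users:
            for item, r in ratings[v].items():
                if item not in ratings[user]:      # filter out already-rated items
                    scores[item] += r              # count each liked item, weighted by rating
        return scores.most_common(n)

    print(top_n("me"))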
  • 31. But, what do recommender systems do, exactly? 1. Predict how much you may like a certain product/service 2. Compose a list of N best items for you 3. Compose a list of N best users for a certain product/service 4. Explain to you why these items are recommended to you 5. Adjust the prediction and recommendation based on your feedback and other people
  • 32. Task 4: Explanation • This is a current hit … • More on this artist … • Try something from similar artists … • Someone similar to you also likes this … • As you listened to that, you may want this … • These two go together … • This is most popular in your group … • This is highly rated … • Try something new …
  • 33. Task 4: Explanation (2) • Examples from Strands.com – Welcome back (recently viewed) – For you today – New for you – Hot / Most popular of this type – Other people also do this … – Similar or related products – Complementary accessories – This goes with this … – Gift idea – Shopping assistant
  • 34. But, what do recommender systems do, exactly? 1. Predict how much you may like a certain product/service 2. Compose a list of N best items for you 3. Compose a list of N best users for a certain product/service 4. Explain to you why these items are recommended to you 5. Adjust the prediction and recommendation based on your feedback and other people
  • 35. Task 5: Online updating • New items and users come each hour or minute • The two worlds: – Most songs and books are still interesting for a long time (the tail is really long) – Most news articles are read on the day and forgotten next day • But tracking back is useful to follow an event or scandal • Online updating large-scale neighbour-based systems is NOT easy at all
  • 36. Evaluation • How do we know the recommendation is good? – How good is good? – Measures should be automated • Practice: training/testing split (e.g. 80/20) • Popular criteria – Prediction error: ZOE, MAE, RMSE – Hit recall/precision/F-measure, rank utility, ROC curve
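The prediction-error and hit-based criteria on slide 36 are straightforward to compute; a minimal sketch with invented predictions, ground truth and a toy top-N list:

    # MAE and RMSE on a held-out split, plus precision/recall of a top-N list
    # against the items the user actually liked.
    import math

    def mae(pred, truth):
        return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)

    def rmse(pred, truth):
        return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))

    def precision_recall(recommended, relevant):
        hits = len(set(recommended) & set(relevant))
        return hits / len(recommended), hits / len(relevant)

    print(mae([4.1, 2.9], [5, 3]), rmse([4.1, 2.9], [5, 3]))
    print(precision_recall(["taken", "panda"], ["taken", "titanic"]))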
  • 37. Evaluation (2) • Yet little on – Relevance – Usefulness – % Increase in purchase – % Reduction in cost – Novelty/surprise/long-tails – Diversity – Coverage – Explainability
  • 38. A question: Can we make use of these information sources? • Blogs • Social Media • Online comments • Online stores • Review sites • Locations • Mobility
  • 39. A case-study: Strands • Services for any online retailer – Retailers send product and purchase information to the Strands server (one retailer per account) through APIs – Strands returns recommendations for each visitor • The same logic for social media servers • moneyStrands for personal financial management (e.g. investment recommendation) • MyStrands for music personalization
  • 40. Want more practical hints? • New books: – Toby Segaran, Programming Collective Intelligence, O'Reilly, 2007 – Satnam Alag, Collective Intelligence in Action, Manning Publications, 2009 • For real deployments, check out: – TechCrunch – ReadWriteWeb
  • 41. Want more state-of-the-art? • Research in Recommender Systems is becoming mainstream, as evidenced by the recent ACM RecSys conference. • Other places: – ICWSM: Weblog and Social Media – WebKDD: Web Knowledge Discovery and Data Mining – WWW: The original WWW conference – SIGIR: Information Retrieval – ACM KDD: Knowledge Discovery and Data Mining – ICML: Machine Learning
  • 42. Questions left to you • Will you trust such Recommender Systems? • Will you implement and deploy one here? • Will you do research? – PhD scholarships available (as of 19/4/09) – See http://truyen.vietlabs.com/scholarship.html – Warning: you are going to waste 3-5 years of your youth!
