2. The Plan
• Examples
• Why bother?
• Long tail
• Recommender systems
• Zvooq Recommender Platform
3. Disclaimer
• There is plenty of information on RS
• The technology is quite mature; you can get an RS out of the box in almost any programming framework
• It's easy to use something as a black box and fail simply because you didn't think through certain things
• This talk is about these things
5. Examples - Amazon
• Consumer goods of all types
• Suggest items based on
– past activity (buying, browsing)
– news
– similarity
• Support catalogue exploration
• Explain recommendations
7. Examples - Netflix
• The interface is a set of rows, one row per recommender system (signal)
• Based mostly on movie ratings
• Predicts ratings for unseen films
• Accounts for UX specifics (multiple users of one home cinema)
13. Why bother?
• Consumer perspective
– what to buy/use?
– user satisfaction
• Producer perspective
– promote things and get attention of consumers
– increase demand, compete with other producers
• Business perspective
– optimize for core business values: costs, revenue, or overall improvement
– business goals vary and aren't always aligned with those of consumers or producers
14. Any "default" interface may be optimized
• Consumers optimize for satisfaction
– may be satisfied with just the popular items
• Producers optimize for demand
– ideally, they would like to lock consumers and the business onto themselves and game the system
• Business:
– a business optimizes to reduce negative scale factors (e.g. the number of deals) and increase positive ones
– a marketplace business optimizes for market volume and growth
16. The Long Tail
“Forget squeezing millions from a few megahits
at the top of the charts. The future of entertainment
is in the millions of niche markets at the shallow end
of the bitstream.” Chris Anderson, Wired, 2004
[Chart: items ranked by popularity, roughly SELECT item, COUNT(buys) FROM items GROUP BY item ORDER BY COUNT(buys) DESC; the vertical cut-off marks the physical shelf restriction. A toy sketch of the same idea follows.]
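A toy illustration of that chart, with made-up item names and counts (nothing here comes from the talk), ranks items by purchases and shows how much demand the head captures:

```python
# Toy sketch of the long-tail chart: rank items by purchase count and
# compute what share of all purchases the most popular items capture.
# All names and counts below are made up for illustration.
buys = {"megahit": 500, "hit": 200, "album_a": 30, "album_b": 20,
        "niche_c": 10, "niche_d": 5, "niche_e": 3, "niche_f": 2}

ranked = sorted(buys.items(), key=lambda kv: kv[1], reverse=True)
total = sum(buys.values())
shelf = 2  # a "physical shelf" that only fits the top items
head_share = sum(count for _, count in ranked[:shelf]) / total
print(f"top {shelf} items capture {head_share:.0%} of purchases")
```

With these toy numbers the two head items already cover about 90% of purchases, which is exactly the shape the shelf restriction reinforces.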
17. The Long Tail
• Supply-driven factors
– Distribution channels (limited space on physical shelves)
• Demand-driven factors
– Discovery channels (mass-media, limited attention
span, interfaces with a limited viewport)
– Preferences / taste
– Quality of content
• It is not possible to solve all of them
19. The Long Tail
• Consumers hardly suffer from a thin tail; producers suffer a lot
• In media, where the producer/consumer border is blurred, the whole ecosystem suffers
• Helping users discover new things and elicit their preferences can create a lot of niche communities/movements
22. Search to Discover
• One needs to formulate the question
– known unknowns only
• When the search paradigm fails:
– lack of preferences
– lack of domain knowledge
– lack of query-result relevance
23. Possible shortcuts
• Suggest a query
• Mine social layer
• Apply non-relevance scoring
• Recommender systems are all about non-relevance scoring
24. Recommender model
• Allows solving problems without knowing the domain, and even without stated preferences (unknown unknowns)
[Diagram: users and items feed into the recommender system, which outputs a list of recommendations]
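As a rough sketch of that black box (the function name and the scoring callback are placeholders, not anything from the talk):

```python
# Minimal sketch of the recommender model as a black box: users and items
# go in, a ranked list of recommendations comes out. The scoring function
# is a stand-in for whatever algorithm sits inside.
from typing import Callable, Iterable, List, Tuple

def recommend(user: str,
              items: Iterable[str],
              score: Callable[[str, str], float],
              k: int = 10) -> List[Tuple[str, float]]:
    """Return the top-k items for the user, ordered by predicted score."""
    ranked = sorted(((item, score(user, item)) for item in items),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Usage with a trivial stand-in scoring function.
toy_scores = {"song_a": 0.9, "song_b": 0.4}
print(recommend("alice", toy_scores, lambda u, i: toy_scores[i], k=2))
```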
25. IR vs. RS
• IR is more about recalling what you know you don't know and finding an answer to a question; RS is more about discovering what you aren't even aware of.
• The current web is biased towards search (thanks, Google): people start by thinking up a question instead of looking around.
26. Recommender Systems and Interfaces
• RS and interfaces solve the same problem: providing access to data given the restrictions of the device and the human.
• Just as there is no 'no interface' setting, there is no 'no RS' setting, since the viewport is limited anyway: whatever is shown by default is effectively 'recommended'.
• If you don't know or don't think about RS, you still have the problem.
• Better know!
27. Decisions to make
• What data to mine?
• How to build the recommendations?
– That is, how to pick a subset and order it
• How to evaluate?
– That is, how to tune and optimize
• How to present the results?
28. What data to mine?
[Diagram of data sources around users and items:
– preferences, explicit or implicit
– item metadata and content → item features
– demographic and social data → user features
– social connections between users
– context, explicit or implicit
– interaction history over time (evolution-based)]
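For illustration, the data sources above could be captured by record types like these; the field names are assumptions, not the actual Zvooq schema:

```python
# Illustrative record types for the data sources on this slide.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Item:
    item_id: str
    metadata: dict                                   # genre, artist, year, ...
    content_features: List[float] = field(default_factory=list)

@dataclass
class User:
    user_id: str
    demographic: dict                                # age, country, ...
    social_connections: List[str] = field(default_factory=list)

@dataclass
class Interaction:
    user_id: str
    item_id: str
    timestamp: float
    explicit_rating: Optional[float] = None          # explicit preference
    play_count: int = 0                              # implicit preference
    context: Optional[str] = None                    # e.g. "morning commute"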
29. How to recommend?
[Diagram mapping data sources to methods:
– preferences (explicit or implicit) → collaborative filtering: CF-based user similarity, CF-based item similarity, model-based prediction
– item metadata and content (item features) → content-based item similarity
– demographic and social data (user features) → content-based user similarity
– also labelled: the cold start problem]
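One box from the diagram, content-based item similarity, might look roughly like this sketch over made-up metadata tags (track names and tags are illustrative only):

```python
# Rough sketch of content-based item similarity: items represented by
# metadata tags, compared with cosine similarity over binary tag sets.
from math import sqrt

tracks = {
    "track_a": {"rock", "guitar", "90s"},
    "track_b": {"rock", "guitar", "live"},
    "track_c": {"electronic", "ambient"},
}

def tag_cosine(a: set, b: set) -> float:
    """Cosine similarity between two binary tag sets."""
    return len(a & b) / (sqrt(len(a)) * sqrt(len(b))) if a and b else 0.0

seed = "track_a"
ranked = sorted(((t, tag_cosine(tracks[seed], tags))
                 for t, tags in tracks.items() if t != seed),
                key=lambda pair: pair[1], reverse=True)
print(ranked)  # track_b ranks above track_c for the seed track
```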
30. Collaborative Filtering Example!
          oranges  celery  meat
Alice        1        1      0
Bob          1        0      1
John         ?        ?      1
• User-based CF: Bob is more similar to John than Alice is => John likes oranges, but not celery.
• Item-based CF: Celery is unlike meat, oranges are somewhere in between => John doesn't like celery, maybe 0.5 for oranges.
• Model-based CF: Apparently, for John, meat > oranges >> celery.
[Slide figures: the completed rating matrix (≈ [[1, 1, 0], [1, 0, 1], [0.5, 0.5, 1]]) and its approximate factorization into user-factor and item-factor matrices. A code sketch of the first two approaches follows.]
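A minimal sketch of the user-based and item-based variants on this toy matrix; the helper names and the similarity-weighted scoring are illustrative, not the talk's implementation:

```python
# User-based and item-based CF on the toy oranges/celery/meat matrix.
# Scores are raw similarity-weighted sums, used only to rank John's
# unknown items.
import numpy as np

items = ["oranges", "celery", "meat"]
R = np.array([
    [1.0, 1.0, 0.0],  # Alice
    [1.0, 0.0, 1.0],  # Bob
])
john = {"meat": 1.0}  # John's only known preference

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

# User-based CF: weight each known user's rating of the target item by
# how similar that user is to John (unknowns treated as 0 for simplicity).
john_vec = np.array([john.get(i, 0.0) for i in items])
for j, item in enumerate(items):
    if item in john:
        continue
    score = sum(cosine(john_vec, R[u]) * R[u, j] for u in range(len(R)))
    print(f"user-based  {item}: {score:.2f}")

# Item-based CF: weight John's known ratings by how similar each rated
# item's column is to the target item's column.
for j, item in enumerate(items):
    if item in john:
        continue
    score = sum(cosine(R[:, j], R[:, items.index(k)]) * r
                for k, r in john.items())
    print(f"item-based  {item}: {score:.2f}")
```

Both variants rank oranges above celery for John, matching the slide's reasoning; a model-based variant would instead factorize the matrix, as in the figure above.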
31. Summary
• General or personalized recommendations
• Collaborative filtering
– what do people similar to you use?
– what items are similar to items you use?
– model-based methods
• Cold start problem
– how to assess new items?
– what to recommend to new users?
• Exploration/Exploitation
– accuracy on history vs. discovery
[Diagram notes: kNN for each request vs. heavy offline computation]
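For the exploration/exploitation bullet, a hedged epsilon-greedy sketch; the names and the 0.1 value are arbitrary, not from the talk:

```python
# Epsilon-greedy exploration/exploitation: with probability epsilon
# recommend a random item (exploration), otherwise the best-scoring
# one according to history (exploitation).
import random

def pick_recommendation(scored_items, epsilon=0.1):
    """scored_items: list of (item, predicted_score) pairs."""
    if random.random() < epsilon:
        return random.choice(scored_items)[0]          # explore
    return max(scored_items, key=lambda x: x[1])[0]    # exploit

print(pick_recommendation([("hit song", 0.9), ("niche song", 0.4)]))
```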
32. More things to keep in mind
(AKA “a very long slide”)
• Data sparsity and aggregation
• Popularity bias
• Filter bubble problem
• Hubness
• Choosing between good options is hard and
dissatisfying
• Preference/Quality problem
• Robustness
• A sense of control
• Discoverability
33. How to present results?
• Interface:
– explicit: easy to attract attention and to explain, lots of WTF moments, doesn't work as a discovery channel
– hidden: hard to explain, low trust per se, but augments existing discovery channels
• Explaining recommendations:
– important not only to increase user trust, but also because of the difference between expected and perceived utility
• Interface matters:
– only a small share of actual user satisfaction depends on the algorithms
34. How to evaluate and optimize?
• Evaluation is what drives algorithm selection and parameter optimization
• Different evaluation settings result in different
algorithms used
• Offline evaluation
– historical data
• Online evaluation
– A/B testing on live users
35. Offline evaluation
• Rating prediction and top-K recommenders
• Cross-validation vs. backtesting
• Caveat: we try to make the long tail thicker while at the same time fitting to the historically thin tail
• Additional diversity, freshness and long-tail
distribution metrics may apply
• Primary goal: tune algorithm parameters
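A small sketch of what such offline evaluation might look like, with a backtesting-style temporal split and precision@K on toy data; all names, numbers, and the placeholder recommender are made up:

```python
# Offline top-K evaluation with a temporal (backtesting) split.
from collections import defaultdict

# (user, item, timestamp) interaction log -- toy data.
log = [("u1", "a", 1), ("u1", "b", 2), ("u1", "c", 3),
       ("u2", "a", 1), ("u2", "c", 2), ("u2", "d", 3)]

split_time = 2  # train on the past, test on the future
train = [(u, i) for u, i, t in log if t <= split_time]
test = defaultdict(set)
for u, i, t in log:
    if t > split_time:
        test[u].add(i)

def recommend(user, k=2):
    """Placeholder recommender: most popular training items the user
    has not interacted with yet (stand-in for a real model)."""
    counts = defaultdict(int)
    for _, i in train:
        counts[i] += 1
    seen = {i for u, i in train if u == user}
    ranked = sorted(counts, key=counts.get, reverse=True)
    return [i for i in ranked if i not in seen][:k]

def precision_at_k(k=2):
    hits, total = 0, 0
    for user, held_out in test.items():
        recs = recommend(user, k)
        hits += len(set(recs) & held_out)
        total += len(recs)
    return hits / total if total else 0.0

print(f"precision@2 = {precision_at_k(2):.2f}")
```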
36. Online evaluation
• Primary goal: make decisions on algorithms
• Within-subjects and Between-subjects
• Metrics to optimize:
– retention, ARPU, taste evolution
• Statistical significance
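For the statistical-significance bullet, a sketch of a two-proportion z-test on retention in an A/B test; the counts are made up:

```python
# Two-proportion z-test on retention for control (A) vs. new recommender (B).
from math import sqrt, erf

retained_a, users_a = 420, 1000   # control group
retained_b, users_b = 465, 1000   # new recommender

p_a, p_b = retained_a / users_a, retained_b / users_b
p_pool = (retained_a + retained_b) / (users_a + users_b)
se = sqrt(p_pool * (1 - p_pool) * (1 / users_a + 1 / users_b))
z = (p_b - p_a) / se
# Two-sided p-value from the normal CDF.
p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
print(f"uplift = {p_b - p_a:.3f}, z = {z:.2f}, p = {p_value:.3f}")
```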
37. Domain-specific recommendation
• Music
– augmentative (a lot of contexts)
– cheap to discover and fail
– too cheap to bother making ratings
• Videos
– quite reliable rating systems
– expected/experienced utility may be different
• Books
– huge time investment, expensive to fail and discover
– evolution is more important than preference
• News and events
– unique objects; metadata and proper aggregation are more important than pure CF
41. If you listened to this, you may also be interested in…
• The Long Tail: Why the Future of Business Is Selling Less of More by Chris Anderson
• Recommender Systems: An Introduction
• Music Recommendation and Discovery: The
Long Tail, Long Fail and Long Play by Oscar
Celma
• Recommender Systems Handbook
• http://recommenderbook.net
42. Next talk
• Thursday 08.08.2013, 20:00
• Speaker: Vladimir Belikov
• The more technical side
• The decisions we made and how to make them better