Recommendation Challenges in Web Media Settings

     Ronny Lempel
 Yahoo! Labs, Haifa, Israel
Recommender Systems
 • Pioneered in the mid/late 90s by Amazon




 • Today applied “everywhere”
    • Shopping sites
    • Content sites (news, sports, gossip, …)
    • Multimedia streaming services (videos, music)
    • Social networks
 • Easily merit a dedicated academic course

                            -1-           Bangalore/MumbaiConfidential
                                                     Yahoo! 2013
Recommendation in Social Networks




Recommender Systems – Example of Effectiveness
  • 1988: Random House releases
    “Touching the Void”, a book by a
    mountain climber detailing a harrowing
    account of near death in the Andes
     – It got good reviews but modest commercial
       success
  • 1999: “Into Thin Air”, another mountain-climbing tragedy
    book, becomes a best-seller
  • By virtue of Amazon’s recommender system, “Touching
    the Void” started to sell again, prompting Random House
    to rush out a new edition
     – A revised paperback edition spent 14 weeks on the New York
       Times bestseller list
                               From “The Long Tail”, by Chris Anderson

The Netflix Challenge




Slides 4-6 courtesy of
Yehuda Koren, member
of Challenge winners
“Bellkor’s Pragmatic
Chaos”




“We’re quite curious, really. To the tune of
one million dollars.” – Netflix Prize rules
  • Goal was to improve on Netflix’ existing movie
    recommendation technology
  • The open-to-the-public contest began October 2, 2006;
    winners announced September 2009
  • Prize
     – Based on reduction in root mean squared error (RMSE) on test data
     – $1 million grand prize for 10% improvement on Cinematch result
     – $50K 2007 progress prize for 8.43% improvement
     – $50K 2008 progress prize for 9.44% improvement
  • Netflix gets full rights to use IP developed by the winners
     – Example of Crowdsourcing – Netflix basically got over 100
       researcher years (and good publicity) for $1.1M
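
The prize criterion above can be made concrete with a small sketch. The function names and the toy numbers are illustrative assumptions, not part of the contest materials; only the RMSE definition and the "percent improvement over a baseline" idea come from the slide.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equally long rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))

def relative_improvement(baseline_rmse, new_rmse):
    """Percent reduction in RMSE relative to a baseline system."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse

# Toy ratings, invented for illustration.
actual = [4, 3, 5, 1]
baseline = [3, 3, 4, 2]
improved = [4, 3, 4, 1]
print(relative_improvement(rmse(baseline, actual), rmse(improved, actual)))
```

Under this reading, the grand prize threshold corresponds to `relative_improvement(...) >= 10` when the baseline is Cinematch.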


Netflix Movie Ratings Data
• Training data
   – 100 million ratings
   – 480,000 users
   – 17,770 movies
   – 6 years of data: 2000-2005
• Test data
   – Last few ratings of each user (2.8 million)
• Dates of ratings are given

Training data                  Test data
user   movie   score           user   movie   score
1      21      1               1      62      ?
1      213     5               1      96      ?
2      345     4               2      7       ?
2      123     4               2      3       ?
2      768     3               3      47      ?
3      76      5               3      15      ?
4      45      4               4      41      ?
5      568     1               4      28      ?
5      342     2               5      93      ?
5      234     2               5      74      ?
6      76      5               6      69      ?
6      56      4               6      83      ?



Recommender Systems – Mathematical Abstraction
   • Consider a matrix R of users and the items they’ve
     consumed
      – Users correspond to the rows of R, items to its columns, with
        r_{i,j} = 1 whenever user i consumed item j
      – In other cases, r_{i,j} might be the rating given by user i to item j
   • The matrix R is typically very sparse
      – …and often very large
   • Real-life task: top-k recommendation
      – From among the items that weren’t consumed by each user,
        predict which ones the user would most enjoy
   • Related task on ratings data: matrix completion
      – Predict users’ ratings for items they have
        yet to rate, i.e. “complete” missing values
   [Figure: R is a |U| x |I| matrix; rows are users, columns are items]
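
A minimal sketch of the abstraction above, assuming a dict-of-dicts as the sparse representation of R and a plain popularity count as a stand-in scorer. The data, the function names and the scorer are all hypothetical; a real system would plug in a learned model for `score`.

```python
import heapq

# Sparse R as a dict of dicts: only observed entries are stored.
# Toy data, invented for illustration.
R = {
    "u1": {"a": 1, "b": 1},
    "u2": {"b": 1, "c": 1},
    "u3": {"b": 1, "c": 1, "d": 1},
}

def top_k(R, user, score, k):
    """Return the k best-scoring items that `user` has not consumed."""
    consumed = set(R.get(user, {}))
    all_items = {item for row in R.values() for item in row}
    candidates = sorted(all_items - consumed)  # sorted for deterministic ties
    return heapq.nlargest(k, candidates, key=lambda item: score(user, item))

def popularity(user, item):
    """Placeholder scorer: number of users who consumed the item."""
    return sum(1 for row in R.values() if item in row)

print(top_k(R, "u1", popularity, 2))
```

Storing only observed entries is what makes a very large, very sparse R tractable: memory grows with the number of interactions rather than with |U| x |I|.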
Types of Recommender Systems
 At a high level, two main techniques:
 • Content-based recommendation: characterizes the
    affinity of users to certain features (content, metadata)
    of their preferred items
    – Lots of classification technology under the hood
 • Collaborative Filtering: exploits similar consumption
   and preference patterns between users
    – See next slides
 • Many state of the art systems combine both techniques




Collaborative Filtering – Neighborhood Models

   • Compute the similarity of items [users] to each other
      – Items are considered similar when users tend to rate them
        similarly or to co-consume them
      – Users are considered similar when they tend to co-consume
        items or rate items similarly
   • Recommend to a user:
      – Items similar to items he/she has already consumed [rated
        highly]
      – Items consumed [rated highly] by similar users
   • Key questions:
      – How exactly to define pair-wise similarities?
      – How to combine them into quality recommendations?
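
One possible answer to the two key questions, sketched under explicit assumptions: cosine similarity over the sets of users who consumed each item as the pair-wise similarity, and a plain sum of similarities to the user's consumed items as the combination rule. Neither choice is prescribed by the slide, and the toy data is invented.

```python
import math
from collections import defaultdict

# Toy consumption data: user -> set of consumed items.
R = {
    "u1": {"a", "b"},
    "u2": {"a", "b", "c"},
    "u3": {"c", "d"},
    "u4": {"a"},
}

def users_by_item(R):
    """Invert R: item -> set of users who consumed it."""
    users_of = defaultdict(set)
    for user, items in R.items():
        for item in items:
            users_of[item].add(user)
    return users_of

def cosine(users_a, users_b):
    """Cosine similarity of two items viewed as binary user vectors."""
    if not users_a or not users_b:
        return 0.0
    return len(users_a & users_b) / math.sqrt(len(users_a) * len(users_b))

def recommend(R, user, k):
    """Rank unconsumed items by summed similarity to the user's items."""
    users_of = users_by_item(R)
    consumed = R[user]
    scores = {
        candidate: sum(cosine(users_of[candidate], users_of[seen]) for seen in consumed)
        for candidate in users_of
        if candidate not in consumed
    }
    return sorted(scores, key=lambda item: (-scores[item], item))[:k]

print(recommend(R, "u1", 2))
```

The user-based variant is symmetric: compute similarities between rows of R instead of columns and score items by the similarity of the users who consumed them.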



Collaborative Filtering – Matrix Factorization

  • Latent factor models (LFM):
     – Map both users and items to some f-dimensional space R^f, i.e.
       produce f-dimensional vectors v_i for each user i and w_j for each item j
     – Define rating estimates as inner products: q_{ij} = <v_i, w_j>
     – Main problem: finding a mapping of users and items to the latent
       factor space that produces “good” estimates
     – Closely related to dimensionality reduction techniques of the
       ratings matrix R (e.g. Singular Value Decomposition)
  [Figure: R (|U| x |I|; rows users, columns items) ≈ V (|U| x f) · W (f x |I|)]
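
A minimal sketch of one way to find such a mapping, assuming stochastic gradient descent on squared error with L2 regularization. The slide does not name a training procedure, so the optimizer, the hyperparameters and the toy ratings are all assumptions. Item vectors are stored one per row, i.e. as the transpose of the f x |I| matrix W.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def factorize(ratings, n_users, n_items, f=2, epochs=500, lr=0.02, reg=0.02, seed=0):
    """Learn user vectors V and item vectors W from (user, item, rating) triples."""
    rng = random.Random(seed)
    V = [[rng.uniform(-0.1, 0.1) for _ in range(f)] for _ in range(n_users)]
    W = [[rng.uniform(-0.1, 0.1) for _ in range(f)] for _ in range(n_items)]
    for _ in range(epochs):
        for i, j, r in ratings:
            err = r - dot(V[i], W[j])  # residual of the inner-product estimate
            for d in range(f):
                v, w = V[i][d], W[j][d]
                V[i][d] += lr * (err * w - reg * v)
                W[j][d] += lr * (err * v - reg * w)
    return V, W

def squared_error(ratings, V, W):
    """Total squared error of the estimates on the given ratings."""
    return sum((r - dot(V[i], W[j])) ** 2 for i, j, r in ratings)

# Toy ratings (user index, item index, rating), invented for illustration.
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
V, W = factorize(ratings, n_users=3, n_items=3)
estimate_for_missing_cell = dot(V[0], W[2])  # "completes" an unobserved entry
```

Only observed entries drive the updates, which is the practical difference from running a plain SVD on a matrix whose missing values have been filled with zeros.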
Web Media Sites




Challenge: Cold Start Problems
  • Good recommendations require observed data on the user
    being recommended to [the items being recommended]
     – What did the user consume/enjoy before?
     – Which users consumed/enjoyed this item before?
  • User cold start: what happens when a new user arrives at a
    system?
     – How can the system make a good “first impression”?
  • Item cold start: how do we recommend newly arrived items
    with little historic consumption?
  • In certain settings, items are
    ephemeral – a significant portion of
    their lifetime is spent in cold-start state
     – E.g. news recommendation


Low False-Positive Costs
  False positive: recommending an irrelevant item
  • Consequence, in media sites: a bit of lost time
     – As opposed to lots of lost time or money in other settings
  • Opportunity: better address cold-start issues
  • Item cold-start: show a new item to a select group of users
    whose feedback should help in modeling it for everyone
     – Note the very short item lifetimes in news cycles
  • User cold-start: more aggressive exploration
     – Vs. playing it safe and perpetuating popular items
  • Search: injecting randomization into the ranking of search
    results (Pandey et al., VLDB 2005)
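
The slide does not name an exploration scheme. As one simple illustration, the sketch below assumes an epsilon-greedy policy: each slot of a recommendation module is given to a randomly chosen cold-start item with probability `epsilon`, and otherwise to the next best established item. The function and parameter names are hypothetical.

```python
import random

def fill_slate(established, cold, slots, epsilon, rng):
    """Fill `slots` positions, exploring cold-start items with probability epsilon.

    `established` is ordered best first; `cold` holds items with little history.
    """
    established = list(established)
    cold = list(cold)
    slate = []
    while len(slate) < slots and (established or cold):
        explore = bool(cold) and (not established or rng.random() < epsilon)
        if explore:
            slate.append(cold.pop(rng.randrange(len(cold))))
        else:
            slate.append(established.pop(0))
    return slate

rng = random.Random(0)
print(fill_slate(["a", "b", "c"], ["x", "y"], slots=3, epsilon=0.3, rng=rng))
```

Because a wasted slot costs the user only a little time in media settings, `epsilon` can plausibly be set higher than in settings where a bad recommendation costs money.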



Challenge: Inferring Negative Feedback
  • In many recommendation settings we only know which
    items users have consumed, not whether they liked them
     – I.e. no explicit ratings data
  • What can we infer about satisfaction of consumed items
    from observing other interactions with the content?
     – Web pages: what happens after the initial click?
     – Short online videos: what happens after pressing “play”?
     – TV programs: zapping patterns
  • What can we infer about items the user did not consume?
  • Was the user even aware of the items he/she did not
    consume?
     – What items did the recommender system expose the user to?



Presentation Bias’ Effect on Media Consumption
  • Pop Culture: items’ longevity creates familiarity




  • Media sites: items are ephemeral, and users are mostly
    unaware of items the site did not expose them to
  • Presentation bias obscures users’ true taste – they
    essentially select the best of the little that was shown
  • Must correctly account for presentation bias when
    modeling: seen and not selected ≠ not seen and not
    selected
  • Search: negative interpretation of “skipped” search results
    (Joachims, KDD’2002)
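
A sketch of one common reading of “skipped” items in a vertically ranked list, in the spirit of the search work cited above: a clicked item is taken to be preferred over every unclicked item shown above it, while items below the click yield no evidence because the user may never have seen them. The function name and data are illustrative.

```python
def skip_preferences(shown, clicked):
    """Derive (preferred, less_preferred) pairs from one ranked impression.

    `shown` is the list of items in display order, top first;
    `clicked` is the collection of items the user selected.
    """
    clicked = set(clicked)
    pairs = []
    for position, item in enumerate(shown):
        if item in clicked:
            for above in shown[:position]:
                if above not in clicked:
                    pairs.append((item, above))  # seen and skipped
    return pairs

# "d" sits below the click, so nothing is inferred about it.
print(skip_preferences(["a", "b", "c", "d"], {"c"}))
```

This keeps “seen and not selected” (a, b) separate from “not seen and not selected” (d), which is the distinction the slide asks models to respect.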
Layouts of Recommendation Modules




  • Interpreting interactions in vertical layouts is “easy” using
    the “skips” paradigm
  • What about 2D, tabbed, horizontal layouts?

Layouts of Recommendation Modules




                              • What about multiple
                                presentation formats?




[Figure labels: Personalized, Popular, Contextual]




Contextual, Personalized, Popular
   • Web media sites often display links to additional stories
     on each article page
      – Matching the article’s context, matching the user, consumed by
        the user’s friends, popular
   • When creating a unified list for a given user reading a
     specific page, what should be the relative importance of
     matching the additional stories to the page vs. matching
     to the user?
   • Ignoring story context might create offensive
     recommendations
   • Related direction: Tensor Factorization, Karatzoglou et
     al., RecSys’2010
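
The relative-importance question can be made concrete with the simplest possible blend, a convex combination of a page-match score and a user-match score. This is an assumption for illustration only (the slide leaves the question open and points to tensor factorization as a related direction); the names and scores are invented.

```python
def unified_list(context_score, user_score, alpha):
    """Rank stories by alpha * context match + (1 - alpha) * user match."""
    items = set(context_score) | set(user_score)
    blended = {
        item: alpha * context_score.get(item, 0.0)
        + (1.0 - alpha) * user_score.get(item, 0.0)
        for item in items
    }
    return sorted(blended, key=lambda item: (-blended[item], item))

context_score = {"a": 1.0, "b": 0.2}  # how well each story matches the page
user_score = {"a": 0.1, "b": 0.9}     # how well each story matches the reader
print(unified_list(context_score, user_score, alpha=0.5))
```

Setting `alpha` to 1 ignores the reader and setting it to 0 ignores the page; the latter is the regime in which context-insensitive recommendations can appear next to a story.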




Challenge: Incremental Collaborative Filtering
  •   In a live system, we often cannot afford to recompute
      recommendations regularly over the entire history
  •   Problem: neither neighborhood models nor matrix
      factorization models easily lend themselves to faithful
      incremental processing

  [Figure: timeline T split into consecutive batches of user-item
   interactions t1, t2, t3, …, with a model Mi = CF-ALG(ti) built per batch]
  ∀f: f(M1, M2) ≠ CF-ALG(t1 ∪ t2)
  •   Is there a model aggregation function f(Mprev, Mcurr) that is
      “good enough”?
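
A small demonstration of why naive aggregation is not faithful, assuming item co-occurrence counts as the per-batch model and summation as the candidate aggregation function f. It shows one failing f on toy data rather than the general claim: when a user's history spans two batches, the pairs that cross the batch boundary are missing from the summed model.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(interactions):
    """Model: for each item pair, the number of users who consumed both."""
    items_of = {}
    for user, item in interactions:
        items_of.setdefault(user, set()).add(item)
    counts = Counter()
    for items in items_of.values():
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return counts

# Toy batches; u1 consumes "a" in the first batch and "b" in the second.
t1 = [("u1", "a"), ("u2", "a"), ("u2", "b")]
t2 = [("u1", "b"), ("u3", "a"), ("u3", "b")]

merged = cooccurrence(t1) + cooccurrence(t2)  # f(M1, M2) = M1 + M2
full = cooccurrence(t1 + t2)                  # CF-ALG(t1 ∪ t2)
print(merged[("a", "b")], full[("a", "b")])
```

Whether the gap matters in practice is exactly the “good enough” question: the shorter the batches relative to user sessions, the more cross-batch pairs are lost.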


Challenge: Repeated Recommendations
  • One typically doesn’t buy the same book twice, nor do
    people typically read the same news story twice
  • But people listen to the songs they like over and over
    again, and watch movies they like multiple times as well
  • When and how frequently is it ok to recommend an item
    that was already consumed?
  • On the other hand, when should we stop showing a
    recommendation if the user doesn’t act upon it?
  • Implication: a recommendation system may not only need
    to track aggregated consumption to-date,
     – It may need to track consumption timelines
     – It may need to track recommendation history
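
A sketch of the bookkeeping this implies, under assumed policy rules: a per-user, per-item record of consumption times and of times the item was shown, plus a check that blocks non-repeatable items once consumed, enforces a minimum gap before re-recommending repeatable ones, and gives up after too many ignored impressions. The class, the parameters and the thresholds are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class ItemHistory:
    consumed_at: list = field(default_factory=list)  # consumption timeline
    shown_at: list = field(default_factory=list)     # recommendation history

def may_recommend(history, now, repeatable, min_gap, max_ignored):
    """Decide whether the item may be shown to the user at time `now`."""
    last_consumed = max(history.consumed_at, default=None)
    if last_consumed is not None:
        if not repeatable:
            return False  # e.g. a book or a news story
        if now - last_consumed < min_gap:
            return False  # e.g. a song played a moment ago
    # Impressions since the last consumption that the user did not act upon.
    ignored = [t for t in history.shown_at
               if last_consumed is None or t > last_consumed]
    return len(ignored) < max_ignored

song = ItemHistory(consumed_at=[0])
print(may_recommend(song, now=10, repeatable=True, min_gap=5, max_ignored=3))
```

An aggregated “consumed / not consumed” flag cannot support either rule, which is why the timelines need to be kept.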




Challenge: Recommending Sets & Sequences of
Items
  • In some domains, users consume multiple items in rapid
    succession (e.g. music playlists)
     – Recent works: WWW’2012 (Aizenberg et al., sets) and KDD’2012
       (Chen et al., sequences)
  • From independent utility of recommendations to set or
    sequence utility, predicting items that “go well together”
     – Sometimes need to respect constraints
  • Tiling recommendations: in TV Watchlist generation, the
    broadcast schedule further complicates matters due to
    program overlaps
  • Perhaps a new domain of constrained recommendations?
  • Search: result set attributes such as diversity
    (Agrawal et al., WSDM’2009)
  • Netflix tutorial at RecSys’2012: diversity is key @Netflix
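
To illustrate the move from independent utility to set utility, the sketch below assumes a greedy selection in which each candidate's relevance is discounted by how many already chosen items share its category. This is a generic diversification heuristic chosen for brevity, not the method of any of the cited works; the names, categories and scores are invented.

```python
def diversified(candidates, relevance, category, k, penalty):
    """Greedily pick k items, penalizing categories that are already covered."""
    remaining = list(candidates)
    chosen = []
    used = {}  # category -> number of chosen items in it
    while remaining and len(chosen) < k:
        best = max(
            remaining,
            key=lambda c: relevance[c] - penalty * used.get(category[c], 0),
        )
        chosen.append(best)
        remaining.remove(best)
        used[category[best]] = used.get(category[best], 0) + 1
    return chosen

relevance = {"a": 0.9, "b": 0.8, "c": 0.5}
category = {"a": "sports", "b": "sports", "c": "news"}
print(diversified(["a", "b", "c"], relevance, category, k=2, penalty=0.5))
```

With `penalty=0` the selection reduces to ranking by independent utility; a positive penalty makes the value of each item depend on what else is in the set.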

Social Networks and Recommendation
Computation
  • Some are hailing social networks as a
    silver bullet for recommender systems
     – Tell me who your friends are and we’ll tell
       you what you like
  • Is it really the case that we like the
    same media as our friends?
  • Affinity trumps friendship!
     – There are people out there who are “more
       like us” than our limited set of friends
     – Once affinity is considered, the marginal
       value of social connections is often
       negligible
  • Not to be confused with non-friendship social networks,
    where connections are affinity related (Epinions)


Social Networks and Recommendation
Consumption
  • Previous slide notwithstanding, “social” is a great
    motivator for consuming recommendations
     – People like you rate “Lincoln” very highly     vs.
     – Your friends Alice and Bob saw “Lincoln” last night and loved it
  • Explaining recommendations for motivating and increasing
    consumption is an emerging practice
  • Some commercial systems completely separate their
    explanation generation from their recommendation generation
     – So Alice and Bob may not be why the system recommended
       “Lincoln” to you, but they will be leveraged to get you to watch it
  • Privacy in the face of joint consumption of a personalized
    experience?




Questions, Comments?




                Thank you!

        rlempel (at) yahoo-inc dot com




