
Ronny lempelyahooindiabigthinkerapril2013


Published in: Education, Technology

  1. Recommendation Challenges in Web Media Settings. Ronny Lempel, Yahoo! Labs, Haifa, Israel
  2. Recommender Systems
     • Pioneered in the mid/late 90s by Amazon
     • Today applied “everywhere”
       – Shopping sites
       – Content sites (news, sports, gossip, …)
       – Multimedia streaming services (videos, music)
       – Social networks
     • Easily merit a dedicated academic course
     Bangalore/Mumbai, Confidential, Yahoo! 2013
  3. Recommendation in Social Networks
  4. Recommender Systems – Example of Effectiveness
     • 1988: Random House releases “Touching the Void”, a book by a mountain climber detailing a harrowing account of near death in the Andes
       – It got good reviews but modest commercial success
     • 1999: “Into Thin Air”, another mountain-climbing tragedy book, becomes a best-seller
     • By virtue of Amazon’s recommender system, “Touching the Void” started to sell again, prompting Random House to rush out a new edition
       – A revised paperback edition spent 14 weeks on the New York Times bestseller list
     From “The Long Tail”, by Chris Anderson
  5. The Netflix Challenge. Slides 4-6 courtesy of Yehuda Koren, member of Challenge winners “BellKor’s Pragmatic Chaos”
  6. “We’re quite curious, really. To the tune of one million dollars.” – Netflix Prize rules
     • Goal was to improve on Netflix’s existing movie recommendation technology
     • The open-to-the-public contest began October 2, 2006; winners announced September 2009
     • Prize
       – Based on reduction in root mean squared error (RMSE) on test data
       – $1 million grand prize for 10% improvement on Cinematch result
       – $50K 2007 progress prize for 8.43% improvement
       – $50K 2008 progress prize for 9.44% improvement
     • Netflix gets full rights to use IP developed by the winners
       – Example of crowdsourcing
       – Netflix basically got over 100 researcher years (and good publicity) for $1.1M
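The prize criterion above is a relative reduction in RMSE. A minimal sketch of that arithmetic follows, assuming predictions and true ratings arrive as equal-length lists; the function names `rmse` and `improvement_pct` and the toy numbers are illustrative, not taken from the contest rules.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(actual))

def improvement_pct(baseline_rmse, new_rmse):
    """Relative RMSE reduction over a baseline, in percent."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse

# Toy data (made up for illustration): true ratings and two sets of predictions.
actual = [4, 3, 5, 1]
baseline_predictions = [3, 3, 3, 3]
better_predictions = [4, 3, 4, 2]

baseline = rmse(baseline_predictions, actual)
better = rmse(better_predictions, actual)
print(f"baseline={baseline:.4f} better={better:.4f} "
      f"improvement={improvement_pct(baseline, better):.2f}%")
```

Under this reading, the grand prize threshold corresponds to `improvement_pct(cinematch_rmse, new_rmse) >= 10` on the held-out test set.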
  7. Netflix Movie Ratings Data
     • Training data
       – 100 million ratings
       – 480,000 users
       – 17,770 movies
       – 6 years of data: 2000-2005
     • Test data
       – Last few ratings of each user (2.8 million)
     • Dates of ratings are given

     Training data              Test data
     user   movie   score       user   movie   score
     1      21      1           1      62      ?
     1      213     5           1      96      ?
     2      345     4           2      7       ?
     2      123     4           2      3       ?
     2      768     3           3      47      ?
     3      76      5           3      15      ?
     4      45      4           4      41      ?
     5      568     1           4      28      ?
     5      342     2           5      93      ?
     5      234     2           5      74      ?
     6      76      5           6      69      ?
     6      56      4           6      83      ?
  8. Recommender Systems – Mathematical Abstraction
     • Consider a matrix R of users and the items they’ve consumed
       – Users correspond to the rows of R, items to its columns, with r_{i,j} = 1 whenever user i consumed item j
       – In other cases, r_{i,j} might be the rating given by user i to item j
     • The matrix R is typically very sparse
       – …and often very large
     • Real-life task: top-k recommendation
       – From among the items that weren’t consumed by each user, predict which ones the user would most enjoy
     • Related task on ratings data: matrix completion
       – Predict users’ ratings for items they have yet to rate, i.e. “complete” missing values
     [Figure: R drawn as a |U| x |I| matrix, with users as rows and items as columns]
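To make the abstraction concrete, here is a small sketch that stores only the observed entries of R and ranks a user's unconsumed items. The dict-of-dicts layout, the function names, and the popularity scorer are assumptions chosen for illustration; any scoring function could be plugged in.

```python
from collections import defaultdict

def build_sparse_matrix(interactions):
    """Turn (user, item, value) triples into a dict-of-dicts holding only observed entries of R."""
    matrix = defaultdict(dict)
    for user, item, value in interactions:
        matrix[user][item] = value
    return dict(matrix)

def top_k_unconsumed(matrix, user, score, k):
    """Rank items the user has not consumed by score(user, item); ties broken by item id."""
    all_items = {item for row in matrix.values() for item in row}
    candidates = all_items - set(matrix.get(user, {}))
    return sorted(candidates, key=lambda item: (-score(user, item), item))[:k]

# Toy consumption data (hypothetical): r = 1 means "user consumed item".
R = build_sparse_matrix([
    ("u1", "a", 1), ("u1", "b", 1),
    ("u2", "b", 1), ("u2", "c", 1),
    ("u3", "c", 1), ("u3", "d", 1),
])

def popularity(user, item):
    """Placeholder scorer: number of users who consumed the item (ignores the user)."""
    return sum(1 for row in R.values() if item in row)

print(top_k_unconsumed(R, "u1", popularity, k=2))
```

The popularity scorer is only a stand-in; the point is that the top-k task never scores items already present in the user's row.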
  9. Types of Recommender Systems
     At a high level, two main techniques:
     • Content-based recommendation: characterizes the affinity of users to certain features (content, metadata) of their preferred items
       – Lots of classification technology under the hood
     • Collaborative Filtering: exploits similar consumption and preference patterns between users
       – See next slides
     • Many state-of-the-art systems combine both techniques
  10. Collaborative Filtering – Neighborhood Models
      • Compute the similarity of items [users] to each other
        – Items are considered similar when users tend to rate them similarly or to co-consume them
        – Users are considered similar when they tend to co-consume items or rate items similarly
      • Recommend to a user:
        – Items similar to items he/she has already consumed [rated highly]
        – Items consumed [rated highly] by similar users
      • Key questions:
        – How exactly to define pair-wise similarities?
        – How to combine them into quality recommendations?
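One possible answer to the two key questions, sketched under stated assumptions: cosine similarity over binary co-consumption as the pair-wise similarity, and a plain sum of similarities to already-consumed items as the combination rule. Both choices are illustrative; the slides do not prescribe them, and the function names are hypothetical.

```python
import math
from collections import defaultdict

def item_cosine_similarities(consumed):
    """consumed: dict user -> set of items. Returns {(a, b): cosine} for co-consumed item pairs."""
    item_users = defaultdict(set)
    for user, items in consumed.items():
        for item in items:
            item_users[item].add(user)
    sims = {}
    for a in item_users:
        for b in item_users:
            if a == b:
                continue
            common = len(item_users[a] & item_users[b])
            if common:
                sims[(a, b)] = common / math.sqrt(len(item_users[a]) * len(item_users[b]))
    return sims

def recommend(consumed, sims, user, k):
    """Score each unconsumed item by its summed similarity to the user's consumed items."""
    scores = defaultdict(float)
    for (a, b), similarity in sims.items():
        if a in consumed[user] and b not in consumed[user]:
            scores[b] += similarity
    return sorted(scores, key=lambda item: (-scores[item], item))[:k]

# Toy data (hypothetical users and items).
consumed = {
    "u1": {"a", "b"},
    "u2": {"a", "b", "c"},
    "u3": {"b", "c"},
    "u4": {"c", "d"},
}
sims = item_cosine_similarities(consumed)
print(recommend(consumed, sims, "u1", k=2))
```

Item "d" never co-occurs with anything u1 consumed, so it receives no score and is not recommended; this is the neighborhood model's blind spot for weakly connected items.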
  11. Collaborative Filtering – Matrix Factorization
      • Latent factor models (LFM):
        – Map both users and items to some f-dimensional space R^f, i.e. produce f-dimensional vectors v_i for each user i and w_j for each item j
        – Define rating estimates as inner products: q_{ij} = <v_i, w_j>
        – Main problem: finding a mapping of users and items to the latent factor space that produces “good” estimates
        – Closely related to dimensionality reduction techniques of the ratings matrix R (e.g. Singular Value Decomposition)
      [Figure: R (|U| x |I|) ≈ V (|U| x f) times W (f x |I|), with users as rows and items as columns]
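A minimal sketch of fitting such a latent factor model, assuming stochastic gradient descent on squared error with L2 regularization. This is one common training procedure, not necessarily the one behind the slides; the hyperparameters (`f`, `epochs`, `lr`, `reg`) and the toy ratings are made up for illustration.

```python
import math
import random

def factorize(ratings, num_users, num_items, f=2, epochs=500, lr=0.02, reg=0.02, seed=0):
    """Learn user vectors V and item vectors W so that <V[i], W[j]> approximates rating r_ij."""
    rng = random.Random(seed)
    V = [[rng.uniform(-0.1, 0.1) for _ in range(f)] for _ in range(num_users)]
    W = [[rng.uniform(-0.1, 0.1) for _ in range(f)] for _ in range(num_items)]
    for _ in range(epochs):
        for i, j, r in ratings:
            err = r - sum(a * b for a, b in zip(V[i], W[j]))
            for d in range(f):
                v, w = V[i][d], W[j][d]
                # Gradient step on the regularized squared error for this one rating.
                V[i][d] += lr * (err * w - reg * v)
                W[j][d] += lr * (err * v - reg * w)
    return V, W

def predict(V, W, i, j):
    """Rating estimate q_ij as the inner product of user and item vectors."""
    return sum(a * b for a, b in zip(V[i], W[j]))

def train_rmse(V, W, ratings):
    """RMSE of the model on the observed ratings."""
    return math.sqrt(sum((r - predict(V, W, i, j)) ** 2 for i, j, r in ratings) / len(ratings))

# Toy observed entries of R as (user index, item index, rating).
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
V0, W0 = factorize(ratings, num_users=3, num_items=3, epochs=0)  # untrained starting point
V, W = factorize(ratings, num_users=3, num_items=3)
print(train_rmse(V0, W0, ratings), train_rmse(V, W, ratings))
print(predict(V, W, 0, 2))  # estimate for an entry that was never observed
```

Only observed entries drive the updates, which is what distinguishes this from running a plain SVD on a matrix with missing values filled in.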
  12. Web Media Sites
  13. Challenge: Cold Start Problems
      • Good recommendations require observed data on the user being recommended to [the items being recommended]
        – What did the user consume/enjoy before?
        – Which users consumed/enjoyed this item before?
      • User cold start: what happens when a new user arrives at a system?
        – How can the system make a good “first impression”?
      • Item cold start: how do we recommend newly arrived items with little historic consumption?
      • In certain settings, items are ephemeral – a significant portion of their lifetime is spent in cold-start state
        – E.g. news recommendation
  14. Low False-Positive Costs
      False positive: recommending an irrelevant item
      • Consequence, in media sites: a bit of lost time
        – As opposed to lots of lost time or money in other settings
      • Opportunity: better address cold-start issues
      • Item cold-start: show a new item to a select group of users whose feedback should help in modeling it for everyone
        – Note the very short item lifetimes in news cycles
      • User cold-start: more aggressive exploration
        – Vs. playing it safe and perpetuating popular items
      • Search: injecting randomization into the ranking of search results (Pandey et al., VLDB 2005)
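Because a wrong recommendation costs little here, some recommendation slots can be spent on exploration. The sketch below uses epsilon-greedy selection, a standard bandit heuristic that the slides do not name; `pick_item`, `epsilon`, and the item lists are hypothetical.

```python
import random

def pick_item(popular_items, new_items, epsilon, rng=random):
    """With probability epsilon show a random cold-start item; otherwise the top popular item."""
    if new_items and rng.random() < epsilon:
        return rng.choice(new_items)
    return popular_items[0]

# Hypothetical slates: established stories ranked by popularity, plus fresh stories.
popular = ["story_17", "story_4", "story_9"]
fresh = ["story_101", "story_102"]

rng = random.Random(42)
shown = [pick_item(popular, fresh, epsilon=0.2, rng=rng) for _ in range(1000)]
print(sum(item in fresh for item in shown), "of 1000 impressions went to new items")
```

A larger `epsilon` gathers feedback on new items faster at the price of more false positives, which is the trade-off the slide argues is cheap on media sites.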
  15. Challenge: Inferring Negative Feedback
      • In many recommendation settings we only know which items users have consumed, not whether they liked them
        – I.e. no explicit ratings data
      • What can we infer about satisfaction of consumed items from observing other interactions with the content?
        – Web pages: what happens after the initial click?
        – Short online videos: what happens after pressing “play”?
        – TV programs: zapping patterns
      • What can we infer about items the user did not consume?
      • Was the user even aware of the items he/she did not consume?
        – What items did the recommender system expose the user to?
  16. Presentation Bias’ Effect on Media Consumption
      • Pop culture: items’ longevity creates familiarity
      • Media sites: items are ephemeral, and users are mostly unaware of items the site did not expose them to
      • Presentation bias obscures users’ true taste – they essentially select the best of the little that was shown
      • Must correctly account for presentation bias when modeling: seen and not selected ≠ not seen and not selected
      • Search: negative interpretation of “skipped” search results (Joachims, KDD’2002)
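The distinction between "seen and not selected" and "not seen" can be sketched as a labeling step over impression logs. The version below assumes a vertical list scanned top to bottom and treats unclicked items above the last click as skipped, in the spirit of the skips interpretation cited above; the label names and the `label_impressions` helper are illustrative assumptions.

```python
def label_impressions(shown, clicked, catalog):
    """Label every catalog item for one page view.

    shown:   items in the order displayed, top to bottom
    clicked: set of items the user selected
    catalog: all items that could have been shown
    """
    click_positions = [pos for pos, item in enumerate(shown) if item in clicked]
    last_click = max(click_positions) if click_positions else -1
    labels = {}
    for pos, item in enumerate(shown):
        if item in clicked:
            labels[item] = "positive"
        elif pos < last_click:
            labels[item] = "skipped"    # presumably seen, and not selected
        else:
            labels[item] = "unknown"    # displayed, but possibly never examined
    for item in catalog:
        labels.setdefault(item, "not_shown")  # no evidence about taste either way
    return labels

labels = label_impressions(
    shown=["a", "b", "c", "d"],
    clicked={"c"},
    catalog=["a", "b", "c", "d", "e", "f"],
)
print(labels)
```

Only the "skipped" items would count as negative evidence in training; "not_shown" items stay out of the negatives, which is the correction for presentation bias.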
  17. Layouts of Recommendation Modules
      • Interpreting interactions in vertical layouts is “easy” using the “skips” paradigm
      • What about 2D, tabbed, horizontal layouts?
  18. Layouts of Recommendation Modules
      • What about multiple presentation formats?
  19. Personalized, Popular, Contextual
  20. Contextualized, Personalized, Popular
      • Web media sites often display links to additional stories on each article page
        – Matching the article’s context, matching the user, consumed by the user’s friends, popular
      • When creating a unified list for a given user reading a specific page, what should be the relative importance of matching the additional stories to the page vs. matching to the user?
      • Ignoring story context might create offensive recommendations
      • Related direction: Tensor Factorization, Karatzoglou et al., RecSys’2010
  21. Challenge: Incremental Collaborative Filtering
      • In a live system, we often cannot afford to recompute recommendations regularly over the entire history
      • Problem: neither neighborhood models nor matrix factorization models easily lend themselves to faithful incremental processing
      [Figure: timeline T split into consecutive batches of user-item interactions t1, t2, t3, …, with a model Mi = CF_ALG(ti) computed per batch; ∀f, f(M1, M2) ≠ CF_ALG(t1 ∪ t2)]
      • Is there a model aggregation function f(Mprev, Mcurr) that is “good enough”?
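As a toy illustration of why per-batch models do not merge faithfully, the sketch below takes the "model" to be item co-occurrence counts and the aggregation f to be a decayed sum. Both are assumptions made for this example rather than the talk's definitions. A user whose consumption spans two batches contributes a co-occurrence that neither per-batch model sees.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_model(batch):
    """batch: list of (user, item). Count item pairs consumed by the same user within the batch."""
    by_user = {}
    for user, item in batch:
        by_user.setdefault(user, set()).add(item)
    counts = Counter()
    for items in by_user.values():
        for a, b in combinations(sorted(items), 2):
            counts[(a, b)] += 1
    return counts

def aggregate(m_prev, m_curr, decay=1.0):
    """Candidate f(M_prev, M_curr): decay the old counts, then add the new ones."""
    merged = Counter()
    for pair, count in m_prev.items():
        merged[pair] += decay * count
    for pair, count in m_curr.items():
        merged[pair] += count
    return merged

# u2 consumes "a" in the first batch and "b" in the second.
t1 = [("u1", "a"), ("u1", "b"), ("u2", "a")]
t2 = [("u2", "b"), ("u3", "a"), ("u3", "b")]

incremental = aggregate(cooccurrence_model(t1), cooccurrence_model(t2))
batch = cooccurrence_model(t1 + t2)
print(incremental[("a", "b")], batch[("a", "b")])
```

Whether such an approximation is "good enough" depends on how much cross-batch signal is lost; a `decay` below 1 additionally favors recent behavior, moving further from the batch result by design.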
  22. Challenge: Repeated Recommendations
      • One typically doesn’t buy the same book twice, nor do people typically read the same news story twice
      • But people listen to the songs they like over and over again, and watch movies they like multiple times as well
      • When and how frequently is it ok to recommend an item that was already consumed?
      • On the other hand, when should we stop showing a recommendation if the user doesn’t act upon it?
      • Implication: a recommendation system may need to track more than aggregated consumption to date
        – It may need to track consumption timelines
        – It may need to track recommendation history
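A sketch of the bookkeeping this implies, assuming two simple policies: stop showing an item after a fixed number of ignored impressions, and allow repeats of consumed items only for repeatable content after a minimum gap. The class name, the thresholds, and both policies are hypothetical choices, not rules from the talk.

```python
from collections import defaultdict

class RecommendationHistory:
    """Tracks per-(user, item) consumption timelines and ignored impressions."""

    def __init__(self, max_ignored=3):
        self.max_ignored = max_ignored
        self.ignored = defaultdict(int)       # impressions since the last consumption
        self.consumed_at = defaultdict(list)  # consumption timestamps, oldest first

    def record_impression(self, user, item):
        self.ignored[(user, item)] += 1

    def record_consumption(self, user, item, timestamp):
        self.consumed_at[(user, item)].append(timestamp)
        self.ignored[(user, item)] = 0

    def should_show(self, user, item, now, repeatable=False, min_gap=0):
        key = (user, item)
        if self.ignored[key] >= self.max_ignored:
            return False  # the user keeps ignoring it; stop recommending
        timeline = self.consumed_at[key]
        if not timeline:
            return True   # never consumed, not yet over-exposed
        return repeatable and now - timeline[-1] >= min_gap

history = RecommendationHistory(max_ignored=2)
history.record_impression("u1", "news_story")
history.record_impression("u1", "news_story")
history.record_consumption("u1", "song", timestamp=10)
print(history.should_show("u1", "news_story", now=20))
print(history.should_show("u1", "song", now=25, repeatable=True, min_gap=10))
```

The `repeatable` flag stands in for domain knowledge (songs versus news stories); a real system would likely learn the acceptable repeat interval from the consumption timelines themselves.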
  23. Challenge: Recommending Sets & Sequences of Items
      • In some domains, users consume multiple items in rapid succession (e.g. music playlists)
        – Recent works: WWW’2012 (Aizenberg et al., sets) and KDD’2012 (Chen et al., sequences)
      • From independent utility of recommendations to set or sequence utility, predicting items that “go well together”
        – Sometimes need to respect constraints
      • Tiling recommendations: in TV watchlist generation, broadcast schedules further complicate matters due to program overlaps
      • Perhaps a new domain of constrained recommendations?
      • Search: result set attributes (e.g. diversity) in Search (Agrawal et al., WSDM’2009)
      • Netflix tutorial at RecSys’2012: diversity is key @Netflix
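One way to formalize the tiling constraint, offered as an assumption rather than the talk's method: treat each program as an interval with a predicted utility and pick the non-overlapping subset with maximum total utility (classic weighted interval scheduling). Utilities are taken as independent here, which ignores the "go well together" aspect; the program data is made up.

```python
from bisect import bisect_right

def best_watchlist(programs):
    """programs: list of (start, end, utility, title).

    Returns (total_utility, titles) for the non-overlapping subset with the
    highest total utility. A program may start exactly when another ends.
    """
    ordered = sorted(programs, key=lambda p: p[1])  # by end time
    ends = [p[1] for p in ordered]
    best = [(0.0, [])]  # best[i]: optimum using only the first i programs
    for i, (start, end, utility, title) in enumerate(ordered):
        # Count of earlier programs that end no later than this one starts.
        compatible = bisect_right(ends, start, 0, i)
        take = (best[compatible][0] + utility, best[compatible][1] + [title])
        skip = best[i]
        best.append(take if take[0] > skip[0] else skip)
    return best[-1]

# Hypothetical evening schedule: (start hour, end hour, predicted utility, title).
programs = [
    (20.0, 21.0, 0.9, "News"),
    (20.5, 22.0, 1.5, "Movie"),
    (21.0, 22.0, 0.8, "Quiz"),
    (22.0, 23.0, 0.4, "Talk show"),
]
total, titles = best_watchlist(programs)
print(total, titles)
```

The highest-utility single program ("Movie") is left out because it blocks two others whose combined utility is higher, which is exactly how schedule overlaps change the recommendation.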
  24. Social Networks and Recommendation Computation
      • Some are hailing social networks as a silver bullet for recommender systems
        – Tell me who your friends are and we’ll tell you what you like
      • Is it really the case that we like the same media as our friends?
      • Affinity trumps friendship!
        – There are people out there who are “more like us” than our limited set of friends
        – Once affinity is considered, the marginal value of social connections is often negligible
      • Not to be confused with non-friendship social networks, where connections are affinity related (Epinions)
      RecSys 202
  25. Social Networks and Recommendation Consumption
      • Previous slide notwithstanding, “social” is a great motivator for consuming recommendations
        – People like you rate “Lincoln” very highly vs.
        – Your friends Alice and Bob saw “Lincoln” last night and loved it
      • Explaining recommendations for motivating and increasing consumption is an emerging practice
      • Some commercial systems completely separate their explanation generation from their recommendation generation
        – So Alice and Bob may not be why the system recommended “Lincoln” to you, but they will be leveraged to get you to watch it
      • Privacy in the face of joint consumption of a personalized experience?
  26. Questions, Comments? Thank you! rlempel (at) yahoo-inc dot com