Buzz words-dunning-multi-modal-recommendation

Multi-modal recommendation engines use multiple kinds of behavior as input and can be implemented using standard search engine technology. I show how and why, starting with basic recommendations and working up to full multi-modal systems.

Published in: Technology


  1. Multi-Modal Recommendations (©MapR Technologies - Confidential)
  2. Multiple Kinds of Behavior for Recommending Multiple Kinds of Things
  3. What’s Up
      • What is this multi-modal stuff?
      • A simple recommendation architecture
      • Some scary math
      • Putting it into a deployable architecture
      • Final thoughts
  4. Contact
      • Contact:
          - tdunning@maprtech.com
          - @ted_dunning
          - @apachemahout
          - user-subscribe@mahout.apache.org
      • Slides and such (available late tonight):
          - http://www.slideshare.net/tdunning
      • Hash tags: #bbuzz #mapr #recommendations
  5. Recommendations
      • Often known (inaccurately) as collaborative filtering
      • Actors interact with items
          - observe successful interaction
      • We want to suggest additional successful interactions
      • Observations inherently very sparse
  6. Examples of Recommendations
      • Customers buying books (Linden et al.)
      • Web visitors rating music (Shardanand and Maes) or movies (Riedl et al.), (Netflix)
      • Internet radio listeners not skipping songs (Musicmatch)
      • Internet video watchers watching >30 s (Veoh)
  7. What is this multi-modal stuff?
      • But people don’t just do one thing
      • One kind of behavior is useful for predicting other kinds
      • Having a complete picture is important for accuracy
      • What has the user said, viewed, clicked, closed, bought lately?
  8. A simple recommendation architecture
      • Look at the history of interactions
      • Find significant item cooccurrence in user histories
      • Use these cooccurring items as “indicators”
      • For all indicators in user history, add up scores
  9. Recommendation Basics
      • History:

          User  Thing
          1     3
          2     4
          3     4
          2     3
          3     2
          1     1
          2     1
  10. Recommendation Basics
      • History as matrix:

                t1  t2  t3  t4
          u1     1   0   1   0
          u2     1   0   1   1
          u3     0   1   0   1

      • (t1, t3) cooccur 2 times, (t1, t4) once, (t2, t4) once, (t3, t4) once
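The step from the history list to the matrix and the cooccurrence counts can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: it assumes the history reads as the (user, thing) pairs (1,3), (2,4), (3,4), (2,3), (3,2), (1,1), (2,1), and the function names `history_matrix` and `cooccurrence_counts` are made up for the example.

```python
from collections import Counter
from itertools import combinations

# (user, thing) pairs, read off the History slide (ids are 1-based)
history = [(1, 3), (2, 4), (3, 4), (2, 3), (3, 2), (1, 1), (2, 1)]

def history_matrix(pairs, n_users, n_things):
    """Return a binary users x things matrix as a list of rows."""
    matrix = [[0] * n_things for _ in range(n_users)]
    for user, thing in pairs:
        matrix[user - 1][thing - 1] = 1
    return matrix

def cooccurrence_counts(pairs):
    """Count, for each pair of distinct things, how many users touched both."""
    by_user = {}
    for user, thing in pairs:
        by_user.setdefault(user, set()).add(thing)
    counts = Counter()
    for things in by_user.values():
        for a, b in combinations(sorted(things), 2):
            counts[(a, b)] += 1
    return counts

A = history_matrix(history, n_users=3, n_things=4)
pair_counts = cooccurrence_counts(history)

for row in A:
    print(row)
print(dict(pair_counts))
```

Counting pairs per user is the same computation as the off-diagonal part of AᵀA, just without materializing the matrix, which matters when the observations are as sparse as the deck says they are.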
  11. A Quick Simplification
      • Users who do $h$: $A h$
      • Also do $r$:
          - $r = A^T (A h)$ (user-centric recommendations)
          - $r = (A^T A) h$ (item-centric recommendations)
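The two groupings give the same vector because matrix multiplication is associative; only the order of work changes. A small sketch, using the 3 x 4 history matrix from the earlier slide and a hypothetical user whose history `h` contains only t1 (the helper names are illustrative):

```python
# Users x things history matrix from the "History as matrix" slide
A = [[1, 0, 1, 0],
     [1, 0, 1, 1],
     [0, 1, 0, 1]]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(x, y):
    y_t = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in y_t] for row in x]

h = [1, 0, 0, 0]  # hypothetical user: has interacted with t1 only

# User-centric: find users who overlap with h, then sum what they did
user_centric = matvec(transpose(A), matvec(A, h))

# Item-centric: precompute the things x things matrix, then apply it to h
item_centric = matvec(matmul(transpose(A), A), h)

print(user_centric, item_centric)
```

The item-centric form is the one that deploys well: AᵀA can be computed offline, and applying it to a short history is a lookup-and-sum at query time.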
  12. Recommendation Basics
      • Cooccurrence ($A^T A$ for the history matrix):

                t1  t2  t3  t4
          t1     2   0   2   1
          t2     0   1   0   1
          t3     2   0   2   1
          t4     1   1   1   2
  13. Problems with Raw Cooccurrence
      • Very popular items co-occur with everything
          - Welcome document
          - Elevator music
      • That isn’t interesting
          - We want anomalous cooccurrence
  14. Recommendation Basics
      • Cooccurrence:

                t1  t2  t3  t4
          t1     2   0   2   1
          t2     0   1   0   1
          t3     2   0   2   1
          t4     1   1   1   2

      • Contingency table for (t1, t3):

                    t3  not t3
          t1         2       1
          not t1     1       1
  15. Spot the Anomaly
      • Root LLR is roughly like standard deviations

                   A      not A
          B        13     1000
          not B    1000   100,000       root LLR = 0.44

                   A      not A
          B        1      0
          not B    0      2             root LLR = 0.98

                   A      not A
          B        1      0
          not B    0      10,000        root LLR = 2.26

                   A      not A
          B        10     0
          not B    0      100,000       root LLR = 7.15
  16. Root LLR Details
      • In R:

          # unnormalized entropy of a table of counts;
          # the (k==0) term makes zero cells contribute 0 instead of NaN
          entropy = function(k) {
            -sum(k*log((k==0)+(k/sum(k))))
          }
          # k is a 2x2 matrix of counts
          rootLLr = function(k) {
            sign = …
            sign * sqrt((entropy(rowSums(k))+entropy(colSums(k)) - entropy(k))/2)
          }

      • Like sqrt(mutual information * N/2)
      • See http://bit.ly/16DvLVK
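For readers who do not use R, here is a rough Python equivalent of the same computation. It is a sketch rather than the talk's code. In particular, the slide elides how `sign` is computed, so the rule used below (negative when A and B occur together less often than independence would suggest) is an assumption of this example, as are the function names.

```python
import math

def entropy(counts):
    """Unnormalized entropy, -sum(k * log(k / N)); zero cells contribute 0."""
    total = sum(counts)
    return -sum(k * math.log(k / total) for k in counts if k > 0)

def root_llr(k11, k12, k21, k22):
    """Signed root log-likelihood ratio for a 2x2 table of counts.

    k11 = A and B, k12 = B without A, k21 = A without B, k22 = neither.
    """
    rows = [k11 + k12, k21 + k22]
    cols = [k11 + k21, k12 + k22]
    cells = [k11, k12, k21, k22]
    g = entropy(rows) + entropy(cols) - entropy(cells)
    g = max(g, 0.0)  # guard against tiny negative values from rounding
    # ASSUMPTION: sign rule is not given on the slide; positive means the
    # (A, B) cell is over-represented relative to the rest of the table.
    sign = 1.0 if k11 * (k21 + k22) >= k21 * (k11 + k12) else -1.0
    return sign * math.sqrt(g / 2)

tables = [(13, 1000, 1000, 100000), (1, 0, 0, 2), (1, 0, 0, 10000), (10, 0, 0, 100000)]
scores = [root_llr(*t) for t in tables]
for table, score in zip(tables, scores):
    print(table, score)
```

Run on the four tables from the "Spot the Anomaly" slide, the scores come out close to the slide's 0.44, 0.98, 2.26 and 7.15. The large table with 13 cooccurrences scores lowest because 13 is about what chance alone would produce at those marginals.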
  17. Threshold by Score
      • Cooccurrence:

                t1  t2  t3  t4
          t1     2   0   2   1
          t2     0   1   0   1
          t3     2   0   2   1
          t4     1   1   1   2
  18. Threshold by Score
      • Significant cooccurrence => Indicators:

                t1  t2  t3  t4
          t1     1   0   0   1
          t2     0   1   0   1
          t3     0   0   1   1
          t4     1   0   0   1
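Once the indicators exist, scoring a user is the last step of the simple architecture: for every indicator that appears in the user's history, add up scores. The sketch below makes two assumptions that the slides do not state: each row of the indicator matrix is read as "the indicators for that item", and every indicator carries a weight of 1. The function name `recommend` is illustrative.

```python
items = ["t1", "t2", "t3", "t4"]

# Indicator matrix from the "Threshold by Score" slide.
# ASSUMPTION: row = item, columns marked 1 = that item's indicators.
indicator_rows = [
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
]

indicators = {
    item: {items[j] for j, flag in enumerate(row) if flag}
    for item, row in zip(items, indicator_rows)
}

def recommend(history, indicators):
    """Score each unseen item by how many of its indicators are in the history."""
    scores = {}
    for item, item_indicators in indicators.items():
        if item in history:
            continue  # do not recommend what the user already did
        score = len(item_indicators & history)
        if score > 0:
            scores[item] = score
    # highest score first, ties broken by item name
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

print(recommend({"t1"}, indicators))
print(recommend({"t4"}, indicators))
```

A search engine does the same thing at scale: the indicators are stored as a field on each item's document and the user's history is the query.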
  19. So Far, So Good
      • Classic recommendation systems based on these approaches
          - Musicmatch (ca 2000)
          - Veoh Networks (ca 2005)
      • Currently available in Mahout
          - See RowSimilarityJob
      • Very simple to deploy
          - Compute indicators
          - Store in search engine
          - Works very well with enough data
  20. What’s right about this?
  21. Virtues of Current State of the Art
      • Lots of well publicized history
          - Musicmatch, Veoh, Netflix, Amazon, Overstock
      • Lots of support
          - Mahout, commercial offerings like Myrrix
      • Lots of existing code
          - Mahout, commercial codes
      • Proven track record
      • Well socialized solution
  22. What’s wrong about this?
  23. Too Limited
      • People do more than one kind of thing
      • Different kinds of behaviors give different quality, quantity and kind of information
      • We don’t have to do co-occurrence
      • We can do cross-occurrence
      • Result is cross-recommendation
  24. Heh?
  25. Symmetry Gives Cross Recommendations
      • Why just dyadic learning?
      • Why not triadic learning?
      • Why not cross learning?
      • Cooccurrence: $(A^T A) h$; cross-occurrence: $(B^T A) h$
  26. For example
      • Users enter queries (A)
          - (actor = user, item = query)
      • Users view videos (B)
          - (actor = user, item = video)
      • A’A gives query recommendation
          - “did you mean to ask for”
      • B’B gives video recommendation
          - “you might like these videos”
  27. The punch-line
      • B’A recommends videos in response to a query
          - (isn’t that a search engine?)
          - (not quite, it doesn’t look at content or meta-data)
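A toy version of B’A makes the punch-line concrete. Everything in this sketch is invented for illustration: the query and video names, the two small matrices, and the helper functions. It also uses raw cross-occurrence counts; a real system would keep only the anomalous cells, for example by thresholding on root LLR.

```python
queries = ["flamenco", "guitar"]               # hypothetical query vocabulary
videos = ["v_dance", "v_classical", "v_rock"]  # hypothetical video ids

# Rows are the same three users in both matrices.
A = [[1, 0],      # users x queries: which queries each user entered
     [1, 1],
     [0, 1]]
B = [[1, 0, 0],   # users x videos: which videos each user viewed
     [1, 1, 0],
     [0, 1, 1]]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    y_t = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in y_t] for row in x]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

# videos x queries cross-occurrence: how many users did both
cross = matmul(transpose(B), A)

h = [1, 0]  # the current query is "flamenco"
scores = matvec(cross, h)
ranking = [v for _, v in sorted(zip(scores, videos), key=lambda p: (-p[0], p[1]))]

print(cross)
print(ranking)
```

Note that nothing here looks at video titles or descriptions; the ranking comes entirely from what users who typed that query went on to watch.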
  28. Real-life example
      • Query: “Paco de Lucia”
      • Conventional meta-data search results:
          - “hombres del paco” times 400
          - not much else
      • Recommendation based search:
          - Flamenco guitar and dancers
          - Spanish and classical guitar
          - Van Halen doing a classical/flamenco riff
  29. Real-life example
  30. Hypothetical Example
      • Want a navigational ontology?
      • Just put labels on a web page with traffic
          - This gives A = users x label clicks
      • Remember viewing history
          - This gives B = users x items
      • Cross recommend
          - B’A = label to item mapping
      • After several users click, results are whatever users think they should be
  31.
  32. Nice. But we can do better?
  33. [Figure: matrix $A$; rows = users, columns = things]
  34. [Figure: block matrix $[A_1 \; A_2]$; rows = users, columns = thing type 1 and thing type 2]
  35. [Figure: block matrix $[A_1 \; A_2]$; rows = users, columns = action 1 on item type 1 and action 2 on item type 2]
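  36. Block form of the cooccurrence matrix

          $$ \begin{bmatrix} A_1 & A_2 \end{bmatrix}^T \begin{bmatrix} A_1 & A_2 \end{bmatrix} = \begin{bmatrix} A_1^T \\ A_2^T \end{bmatrix} \begin{bmatrix} A_1 & A_2 \end{bmatrix} = \begin{bmatrix} A_1^T A_1 & A_1^T A_2 \\ A_2^T A_1 & A_2^T A_2 \end{bmatrix} $$

          $$ \begin{bmatrix} r_1 \\ r_2 \end{bmatrix} = \begin{bmatrix} A_1^T A_1 & A_1^T A_2 \\ A_2^T A_1 & A_2^T A_2 \end{bmatrix} \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} $$

          $$ r_1 = \begin{bmatrix} A_1^T A_1 & A_1^T A_2 \end{bmatrix} \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} $$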
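The last line of the block algebra says that recommendations for type-1 things combine ordinary cooccurrence on the user's type-1 history with cross-occurrence on the type-2 history. The sketch below checks this numerically on invented data; the matrices, histories, and helper names are assumptions made for the example.

```python
# Hypothetical data: three users, three type-1 items, two type-2 items
A1 = [[1, 0, 0],
      [1, 1, 0],
      [0, 1, 1]]
A2 = [[1, 0],
      [1, 1],
      [0, 1]]
h1 = [1, 0, 0]  # user's history over type-1 items
h2 = [0, 1]     # user's history over type-2 items

def transpose(m):
    return [list(col) for col in zip(*m)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

# Block form: r1 = A1' A1 h1 + A1' A2 h2
r1_blocks = add(matvec(transpose(A1), matvec(A1, h1)),
                matvec(transpose(A1), matvec(A2, h2)))

# Full form: stack the columns, compute [A1 A2]' [A1 A2] h, keep the first block
full = [row1 + row2 for row1, row2 in zip(A1, A2)]
r_full = matvec(transpose(full), matvec(full, h1 + h2))
r1_full = r_full[:len(h1)]

print(r1_blocks, r1_full)
```

In practice the two terms would be thresholded and weighted separately, since different behaviors carry different amounts of signal; the plain sum here just mirrors the algebra.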
  37. Summary
      • Input: Multiple kinds of behavior on one set of things
      • Output: Recommendations for one kind of behavior with a different set of things
      • Cross recommendation is a special case
  38. Now again, without the scary math
  39. Input Data
      • User transactions
          - user id, merchant id
          - SIC code, amount
          - Descriptions, cuisine, …
      • Offer transactions
          - user id, offer id
          - vendor id, merchant id’s,
          - offers, views, accepts
  40. Input Data
      • User transactions
          - user id, merchant id
          - SIC code, amount
          - Descriptions, cuisine, …
      • Offer transactions
          - user id, offer id
          - vendor id, merchant id’s,
          - offers, views, accepts
      • Derived user data
          - merchant id’s
          - anomalous descriptor terms
          - offer & vendor id’s
      • Derived merchant data
          - local top40
          - SIC code
          - vendor code
          - amount distribution
  41. Cross-recommendation
      • Per merchant indicators
          - merchant id’s
          - chain id’s
          - SIC codes
          - indicator terms from text
          - offer vendor id’s
      • Computed by finding anomalous (indicator => merchant) rates
  42. Search-based Recommendations
      • Sample document
          - Merchant Id
          - Field for text description
          - Phone
          - Address
          - Location
  43. Search-based Recommendations
      • Sample document
          - Merchant Id
          - Field for text description
          - Phone
          - Address
          - Location
          - Indicator merchant id’s
          - Indicator industry (SIC) id’s
          - Indicator offers
          - Indicator text
          - Local top40
  44. Search-based Recommendations
      • Sample document
          - Merchant Id
          - Field for text description
          - Phone
          - Address
          - Location
          - Indicator merchant id’s
          - Indicator industry (SIC) id’s
          - Indicator offers
          - Indicator text
          - Local top40
      • Sample query
          - Current location
          - Recent merchant descriptions
          - Recent merchant id’s
          - Recent SIC codes
          - Recent accepted offers
          - Local top40
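The pairing of sample document and sample query can be mimicked in memory to show the idea: each merchant document carries indicator fields, the user's recent behavior is the query, and the score is the overlap. This is only a stand-in for what a search engine does (real relevance scoring also weights rare terms more heavily), and every field name, id, and weight below is hypothetical.

```python
# Hypothetical merchant documents; only the indicator fields are shown.
docs = [
    {"merchant_id": "m1",
     "indicator_merchants": {"m2", "m3"},
     "indicator_sic": {"5812"},
     "indicator_offers": {"o9"}},
    {"merchant_id": "m2",
     "indicator_merchants": {"m1"},
     "indicator_sic": {"5411"},
     "indicator_offers": set()},
    {"merchant_id": "m3",
     "indicator_merchants": {"m4"},
     "indicator_sic": {"5812", "5814"},
     "indicator_offers": {"o7"}},
]

# The query is the user's recent behavior, keyed by the field it should match.
query = {
    "indicator_merchants": {"m2", "m4"},  # recent merchant ids
    "indicator_sic": {"5812"},            # recent SIC codes
    "indicator_offers": {"o7"},           # recently accepted offers
}

def score(doc, query, weights=None):
    """Sum, over fields, of weight * number of query terms found in the field."""
    weights = weights or {}
    total = 0.0
    for field, recent in query.items():
        overlap = doc.get(field, set()) & recent
        total += weights.get(field, 1.0) * len(overlap)
    return total

def recommend(docs, query, top_n=10):
    scored = [(doc["merchant_id"], score(doc, query)) for doc in docs]
    scored = [(m, s) for m, s in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]

results = recommend(docs, query)
print(results)
```

Keeping each kind of indicator in its own field is what makes the system multi-modal: merchant visits, SIC codes, and accepted offers each contribute evidence, and per-field weights can be tuned without recomputing the indicators.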
  45. [Diagram, offline indexing. Components: complete history; cooccurrence (Mahout); item meta-data; Solr indexing (two Solr indexers); index shards]
  46. [Diagram, online serving. Components: user history; web tier; Solr search; item meta-data; two Solr indexers; index shards]
  47. Contact
      • Contact:
          - tdunning@maprtech.com
          - @ted_dunning
          - @apachemahout
          - user-subscribe@mahout.apache.org
      • Slides and such (available late tonight):
          - http://www.slideshare.net/tdunning
      • Hash tags: #bbuzz #mapr #recommendations
      • We are hiring!
  48. Objective Results
      • At a very large credit card company
      • History is all transactions, all web interaction
      • Processing time cut from 20 hours per day to 3
      • Recommendation engine load time decreased from 8 hours to 3 minutes
      • Recommendation quality increased visibly
  49. Thank You
