Buzz words-dunning-multi-modal-recommendation


Multi-modal recommendation engines use multiple kinds of behavior as input and can be implemented using standard search engine technology. I show how and why, starting with basic recommendations and working all the way through to full multi-modal systems.

Published in: Technology

Transcript

  • 1. Multi-Modal Recommendations (©MapR Technologies - Confidential)
  • 2. Multiple Kinds of Behavior for Recommending Multiple Kinds of Things
  • 3. What’s Up
      – What is this multi-modal stuff?
      – A simple recommendation architecture
      – Some scary math
      – Putting it into a deployable architecture
      – Final thoughts
  • 4. Contact
      – tdunning@maprtech.com
      – @ted_dunning
      – @apachemahout
      – user-subscribe@mahout.apache.org
      – Slides and such (available late tonight): http://www.slideshare.net/tdunning
      – Hash tags: #bbuzz #mapr #recommendations
  • 5. Recommendations
      – Often known (inaccurately) as collaborative filtering
      – Actors interact with items
          – observe successful interaction
      – We want to suggest additional successful interactions
      – Observations inherently very sparse
  • 6. Examples of Recommendations
      – Customers buying books (Linden et al)
      – Web visitors rating music (Shardanand and Maes) or movies (Riedl et al), (Netflix)
      – Internet radio listeners not skipping songs (Musicmatch)
      – Internet video watchers watching >30 s (Veoh)
  • 7. What is this multi-modal stuff?
      – But people don’t just do one thing
      – One kind of behavior is useful for predicting other kinds
      – Having a complete picture is important for accuracy
      – What has the user said, viewed, clicked, closed, bought lately?
  • 8. A simple recommendation architecture
      – Look at the history of interactions
      – Find significant item cooccurrence in user histories
      – Use these cooccurring items as “indicators”
      – For all indicators in user history, add up scores
  • 9. Recommendation Basics
      – History:

            User   Thing
            1      3
            2      4
            3      4
            2      3
            3      2
            1      1
            2      1

  • 10. Recommendation Basics
      – History as matrix:

                 t1   t2   t3   t4
            u1   1    0    1    0
            u2   1    0    1    1
            u3   0    1    0    1

      – (t1, t3) cooccur 2 times, (t1, t4) once, (t2, t4) once, (t3, t4) once
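The counting step on slides 9 and 10 can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: the function names are made up here, and only the (user, thing) pairs come from the slide. It counts, for every pair of distinct things, how many user histories contain both.

```python
from collections import Counter
from itertools import combinations

# (user, thing) pairs copied from the history table on slide 9.
history = [(1, 3), (2, 4), (3, 4), (2, 3), (3, 2), (1, 1), (2, 1)]


def items_by_user(pairs):
    """Group the things each user interacted with (hypothetical helper)."""
    users = {}
    for user, thing in pairs:
        users.setdefault(user, set()).add(thing)
    return users


def cooccurrence_counts(pairs):
    """Count how many users have both things of each distinct pair."""
    counts = Counter()
    for things in items_by_user(pairs).values():
        for a, b in combinations(sorted(things), 2):
            counts[(a, b)] += 1
    return counts


counts = cooccurrence_counts(history)
for (a, b), n in sorted(counts.items()):
    print(f"(t{a}, t{b}) cooccur {n} time(s)")
```

Only off-diagonal pairs are counted, which is the part of the cooccurrence matrix that slide 10 spells out.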
  • 11. A Quick Simplification
      – Users who do $h$: $A h$
      – Also do $r$:
          – User-centric recommendations: $r = A^T (A h)$
          – Item-centric recommendations: $r = (A^T A) h$
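The two groupings on slide 11 give the same vector because matrix multiplication is associative. The sketch below checks that with the history matrix from slide 10 and a history vector `h` for a user who has only touched t1. The helper names are illustrative and plain lists are used so that no library is assumed.

```python
# History matrix A from slide 10 (rows u1..u3, columns t1..t4).
A = [
    [1, 0, 1, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 1],
]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def matmul(a, b):
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


h = [1, 0, 0, 0]  # hypothetical user history: only t1

# User-centric: find users who did h (A h), then what those users did (A^T ...).
user_centric = matvec(transpose(A), matvec(A, h))

# Item-centric: precompute item-item cooccurrence (A^T A), then apply it to h.
item_centric = matvec(matmul(transpose(A), A), h)

print(user_centric, item_centric)
```

The item-centric form is the one that lets the cooccurrence matrix be computed offline and reused for every user.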
  • 12. Recommendation Basics
      – Cooccurrence:

                 t1   t2   t3   t4
            t1   2    0    2    1
            t2   0    1    0    1
            t3   2    0    1    1
            t4   1    1    1    2

  • 13. Problems with Raw Cooccurrence
      – Very popular items co-occur with everything
          – Welcome document
          – Elevator music
      – That isn’t interesting
          – We want anomalous cooccurrence
  • 14. Recommendation Basics
      – Cooccurrence:

                 t1   t2   t3   t4
            t1   2    0    2    1
            t2   0    1    0    1
            t3   2    0    1    1
            t4   1    1    1    2

      – Contingency table for t1 and t3:

                     t3   not t3
            t1       2    1
            not t1   1    1
  • 15. Spot the Anomaly
      – Root LLR is roughly like standard deviations
      – Four contingency tables and their root LLR scores:

                     A       not A
            B        13      1000
            not B    1000    100,000        score: 0.44

                     A       not A
            B        1       0
            not B    0       2              score: 0.98

                     A       not A
            B        1       0
            not B    0       10,000         score: 2.26

                     A       not A
            B        10      0
            not B    0       100,000        score: 7.15
  • 16. Root LLR Details
      – In R:

            entropy = function(k) {-sum(k*log((k==0)+(k/sum(k))))}
            rootLLr = function(k) {
              sign = …
              sign * sqrt((entropy(rowSums(k))+entropy(colSums(k))-entropy(k))/2)
            }

      – Like sqrt(mutual information * N/2)
      – See http://bit.ly/16DvLVK
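A Python transcription of the R snippet on slide 16 may make the formula easier to experiment with. This is a sketch under stated assumptions: the slide elides how `sign` is computed, so the rule used here (positive when the top-left cell exceeds its expected count under independence) is an assumption, and the function names are illustrative. The division by 2 follows the slide as written.

```python
import math


def entropy(counts):
    """Unnormalized entropy, -sum(k * log(k / sum(k))), skipping zero cells."""
    total = sum(counts)
    return -sum(k * math.log(k / total) for k in counts if k > 0)


def root_llr(k11, k12, k21, k22):
    """Root LLR of a 2x2 contingency table, following the slide's formula."""
    rows = [k11 + k12, k21 + k22]
    cols = [k11 + k21, k12 + k22]
    cells = [k11, k12, k21, k22]
    value = (entropy(rows) + entropy(cols) - entropy(cells)) / 2
    magnitude = math.sqrt(max(value, 0.0))  # guard against tiny negative rounding
    # ASSUMPTION: sign rule is not given on the slide; this compares the
    # observed top-left count with its expectation under independence.
    total = sum(cells)
    return magnitude if k11 * total >= rows[0] * cols[0] else -magnitude


# The four tables from slide 15, in order.
tables = [(13, 1000, 1000, 100000), (1, 0, 0, 2), (1, 0, 0, 10000), (10, 0, 0, 100000)]
for table in tables:
    print(table, round(root_llr(*table), 2))
```

Run against the four tables on slide 15, the scores should land close to the values listed there, with the large but unremarkable first table scoring lowest.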
  • 17. Threshold by Score
      – Cooccurrence:

                 t1   t2   t3   t4
            t1   2    0    2    1
            t2   0    1    0    1
            t3   2    0    1    1
            t4   1    1    1    2

  • 18. Threshold by Score
      – Significant cooccurrence => Indicators:

                 t1   t2   t3   t4
            t1   1    0    0    1
            t2   0    1    0    1
            t3   0    0    1    1
            t4   1    0    0    1
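Slide 8 describes the scoring step as "for all indicators in user history, add up scores". A minimal sketch of that step using the indicator matrix from slide 18 follows. Two things here are assumptions rather than statements from the slides: each row is read as an item and the 1s in that row as its indicators, and items already in the user's history are left out of the result.

```python
# Indicator sets read off the rows of the slide 18 matrix
# (ASSUMPTION: row = item, columns marked 1 = that item's indicators).
indicators = {
    "t1": {"t1", "t4"},
    "t2": {"t2", "t4"},
    "t3": {"t3", "t4"},
    "t4": {"t1", "t4"},
}


def recommend(history, indicators):
    """Score each unseen item by how many of its indicators are in the history."""
    scores = {}
    for item, item_indicators in indicators.items():
        if item in history:
            continue  # ASSUMPTION: do not recommend what the user already has
        score = len(item_indicators & history)
        if score > 0:
            scores[item] = score
    # Highest score first, ties broken by item name for a stable order.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


print(recommend({"t1"}, indicators))
print(recommend({"t4"}, indicators))
```

Each indicator contributes a weight of 1 here. A search engine would typically apply its own term weighting instead, which this sketch does not model.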
  • 19. So Far, So Good
      – Classic recommendation systems based on these approaches
          – Musicmatch (ca 2000)
          – Veoh Networks (ca 2005)
      – Currently available in Mahout
          – See RowSimilarityJob
      – Very simple to deploy
          – Compute indicators
          – Store in search engine
          – Works very well with enough data
  • 20. What’s right about this?
  • 21. Virtues of Current State of the Art
      – Lots of well publicized history
          – Musicmatch, Veoh, Netflix, Amazon, Overstock
      – Lots of support
          – Mahout, commercial offerings like Myrrix
      – Lots of existing code
          – Mahout, commercial codes
      – Proven track record
      – Well socialized solution
  • 22. What’s wrong about this?
  • 23. Too Limited
      – People do more than one kind of thing
      – Different kinds of behaviors give different quality, quantity and kind of information
      – We don’t have to do co-occurrence
      – We can do cross-occurrence
      – Result is cross-recommendation
  • 24. Heh?
  • 25. Symmetry Gives Cross Recommendations
      – Why just dyadic learning?
      – Why not triadic learning?
      – Why not cross learning?
      – $(A^T A) h$ versus $(B^T A) h$
  • 26. For example
      – Users enter queries (A)
          – (actor = user, item = query)
      – Users view videos (B)
          – (actor = user, item = video)
      – A’A gives query recommendation
          – “did you mean to ask for”
      – B’B gives video recommendation
          – “you might like these videos”
  • 27. The punch-line
      – B’A recommends videos in response to a query
          – (isn’t that a search engine?)
          – (not quite, it doesn’t look at content or meta-data)
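The B’A idea on slides 26 and 27 can be sketched as counting, per user, which videos were viewed by people who entered a given query. The data below is invented purely for illustration, the function names are hypothetical, and raw counts are used where a real system would keep only anomalous cross-occurrences (for example by root LLR).

```python
from collections import Counter

# Hypothetical behavior logs: A = user x query, B = user x video.
queries = {
    "u1": {"paco de lucia"},
    "u2": {"paco de lucia", "flamenco"},
    "u3": {"van halen"},
}
views = {
    "u1": {"flamenco guitar"},
    "u2": {"flamenco guitar", "spanish guitar"},
    "u3": {"van halen riff"},
}


def cross_occurrence(queries_by_user, videos_by_user):
    """Entries of B'A: how many users both entered a query and viewed a video."""
    counts = Counter()
    for user, user_queries in queries_by_user.items():
        for query in user_queries:
            for video in videos_by_user.get(user, ()):
                counts[(video, query)] += 1
    return counts


def videos_for_query(counts, query):
    """Rank videos for a query using behavior only, no content or meta-data."""
    hits = [(video, n) for (video, q), n in counts.items() if q == query]
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))


counts = cross_occurrence(queries, views)
print(videos_for_query(counts, "paco de lucia"))
```

Nothing in the ranking looks at the text of the query or the video titles, which is the distinction slide 27 draws against a conventional search engine.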
  • 28. Real-life example
      – Query: “Paco de Lucia”
      – Conventional meta-data search results:
          – “hombres del paco” times 400
          – not much else
      – Recommendation based search:
          – Flamenco guitar and dancers
          – Spanish and classical guitar
          – Van Halen doing a classical/flamenco riff
  • 29. Real-life example
  • 30. Hypothetical Example
      – Want a navigational ontology?
      – Just put labels on a web page with traffic
          – This gives A = users x label clicks
      – Remember viewing history
          – This gives B = users x items
      – Cross recommend
          – B’A = label to item mapping
      – After several users click, results are whatever users think they should be
  • 31. [slide without text]
  • 32. Nice. But we can do better?
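  • 33. [Diagram: matrix $A$, rows labeled users, columns labeled things]
  • 34. [Diagram: block matrix $\begin{bmatrix} A_1 & A_2 \end{bmatrix}$, rows labeled users, column blocks labeled thing type 1 and thing type 2]
  • 35. [Diagram: block matrix $\begin{bmatrix} A_1 & A_2 \end{bmatrix}$, rows labeled users, column blocks labeled action 1 / item type 1 and action 2 / item type 2]
  • 36. Block form of the cooccurrence computation:

      $$\begin{bmatrix} A_1 & A_2 \end{bmatrix}^T \begin{bmatrix} A_1 & A_2 \end{bmatrix} = \begin{bmatrix} A_1^T \\ A_2^T \end{bmatrix} \begin{bmatrix} A_1 & A_2 \end{bmatrix} = \begin{bmatrix} A_1^T A_1 & A_1^T A_2 \\ A_2^T A_1 & A_2^T A_2 \end{bmatrix}$$

      $$\begin{bmatrix} r_1 \\ r_2 \end{bmatrix} = \begin{bmatrix} A_1^T A_1 & A_1^T A_2 \\ A_2^T A_1 & A_2^T A_2 \end{bmatrix} \begin{bmatrix} h_1 \\ h_2 \end{bmatrix}$$

      $$r_1 = \begin{bmatrix} A_1^T A_1 & A_1^T A_2 \end{bmatrix} \begin{bmatrix} h_1 \\ h_2 \end{bmatrix}$$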
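The last equation on slide 36 says that recommendations for the first kind of thing combine ordinary cooccurrence ($A_1^T A_1 h_1$) with cross-occurrence ($A_1^T A_2 h_2$). A small numeric check is sketched below. $A_1$ reuses the matrix from slide 10, while $A_2$, $h_1$ and $h_2$ are made-up values for illustration, and the helper names are not from the slides.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def add(u, v):
    return [x + y for x, y in zip(u, v)]


A1 = [[1, 0, 1, 0], [1, 0, 1, 1], [0, 1, 0, 1]]  # users x things (slide 10)
A2 = [[1, 0], [0, 1], [1, 1]]                    # users x second kind of thing (hypothetical)
h1 = [1, 0, 0, 0]                                # user's history over the first kind
h2 = [0, 1]                                      # user's history over the second kind

# r1 = A1^T A1 h1 + A1^T A2 h2, computed block by block.
r1_blocks = add(matvec(transpose(A1), matvec(A1, h1)),
                matvec(transpose(A1), matvec(A2, h2)))

# The same thing via the concatenated matrix [A1 A2] and stacked history [h1; h2].
A = [row1 + row2 for row1, row2 in zip(A1, A2)]
h = h1 + h2
r = matvec(transpose(A), matvec(A, h))
r1_full = r[:len(h1)]

print(r1_blocks, r1_full)
```

Both routes give the same $r_1$, so each extra behavior type only adds one more cross-occurrence block to compute.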
  • 37. Summary
      – Input: Multiple kinds of behavior on one set of things
      – Output: Recommendations for one kind of behavior with a different set of things
      – Cross recommendation is a special case
  • 38. Now again, without the scary math
  • 39. Input Data
      – User transactions
          – user id, merchant id
          – SIC code, amount
          – Descriptions, cuisine, …
      – Offer transactions
          – user id, offer id
          – vendor id, merchant id’s,
          – offers, views, accepts
  • 40. Input Data
      – User transactions
          – user id, merchant id
          – SIC code, amount
          – Descriptions, cuisine, …
      – Offer transactions
          – user id, offer id
          – vendor id, merchant id’s,
          – offers, views, accepts
      – Derived user data
          – merchant id’s
          – anomalous descriptor terms
          – offer & vendor id’s
      – Derived merchant data
          – local top40
          – SIC code
          – vendor code
          – amount distribution
  • 41. Cross-recommendation
      – Per merchant indicators
          – merchant id’s
          – chain id’s
          – SIC codes
          – indicator terms from text
          – offer vendor id’s
      – Computed by finding anomalous (indicator => merchant) rates
  • 42. Search-based Recommendations
      – Sample document
          – Merchant Id
          – Field for text description
          – Phone
          – Address
          – Location
  • 43. Search-based Recommendations
      – Sample document
          – Merchant Id
          – Field for text description
          – Phone
          – Address
          – Location
          – Indicator merchant id’s
          – Indicator industry (SIC) id’s
          – Indicator offers
          – Indicator text
          – Local top40
  • 44. Search-based Recommendations
      – Sample document
          – Merchant Id
          – Field for text description
          – Phone
          – Address
          – Location
          – Indicator merchant id’s
          – Indicator industry (SIC) id’s
          – Indicator offers
          – Indicator text
          – Local top40
      – Sample query
          – Current location
          – Recent merchant descriptions
          – Recent merchant id’s
          – Recent SIC codes
          – Recent accepted offers
          – Local top40
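The document and query shapes on slides 42 to 44 can be mimicked with plain dictionaries to show why a search engine is enough to serve recommendations. Everything in this sketch is a stand-in: the field names, ids and codes are hypothetical, and the overlap count replaces whatever relevance scoring a real search engine such as Solr would apply. No search engine API is used or implied.

```python
# Hypothetical merchant documents carrying indicator fields (slide 43).
documents = [
    {"merchant_id": "m1",
     "indicator_merchant_ids": {"m2", "m3"},
     "indicator_sic_ids": {"5812"},
     "indicator_offers": {"o7"}},
    {"merchant_id": "m2",
     "indicator_merchant_ids": {"m1"},
     "indicator_sic_ids": {"5411"},
     "indicator_offers": set()},
    {"merchant_id": "m3",
     "indicator_merchant_ids": {"m4"},
     "indicator_sic_ids": {"5812"},
     "indicator_offers": {"o9"}},
]

# Hypothetical query built from a user's recent behavior (slide 44).
query = {
    "indicator_merchant_ids": {"m2"},   # recent merchant id's
    "indicator_sic_ids": {"5812"},      # recent SIC codes
    "indicator_offers": {"o1"},         # recent accepted offers
}


def score(document, query):
    """Count query terms that match the document's indicator fields."""
    return sum(len(document.get(field, set()) & terms)
               for field, terms in query.items())


def search(documents, query, limit=10):
    """Return (merchant_id, score) for matching documents, best first."""
    scored = [(doc["merchant_id"], score(doc, query)) for doc in documents]
    hits = [pair for pair in scored if pair[1] > 0]
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))[:limit]


print(search(documents, query))
```

The user's recent history becomes the query and the precomputed indicators become indexed fields, so each behavior type is just another field to match against.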
  • 45. [Architecture diagram, indexing side. Labels: Complete history, Cooccurrence (Mahout), Item meta-data, Solr indexing, SolR Indexer (x2), Index shards]
  • 46. [Architecture diagram, serving side. Labels: User history, Web tier, Item meta-data, Solr search, SolR Indexer (x2), Index shards]
  • 47. Contact
      – tdunning@maprtech.com
      – @ted_dunning
      – @apachemahout
      – user-subscribe@mahout.apache.org
      – Slides and such (available late tonight): http://www.slideshare.net/tdunning
      – Hash tags: #bbuzz #mapr #recommendations
      – We are hiring!
  • 48. Objective Results
      – At a very large credit card company
      – History is all transactions, all web interaction
      – Processing time cut from 20 hours per day to 3
      – Recommendation engine load time decreased from 8 hours to 3 minutes
      – Recommendation quality increased visibly
  • 49. Thank You
