Polyvalent recommendations

Recent work on recommendation makes implementations remarkably simple while extending the inputs to multiple kinds of interactions, including interactions with items other than the ones being recommended.

Published in: Technology, Business

  1. Polyvalent Recommendations
      ©MapR Technologies - Confidential
  2. Multiple Kinds of Behavior for Recommending Multiple Kinds of Things
  3. Contact:
          – tdunning@maprtech.com
          – @ted_dunning
      Slides and such (available late tonight):
          – http://www.slideshare.net/tdunning
      Hash tags: #mapr #recommendations
  4. A new approach to recommendation, polyvalent recommendation, that is both simpler and much more powerful than traditional approaches. The idea is that you can combine user, item and content recommendations into a single query that you can implement using a very simple architecture.
  5. Recommendations
      • Often known (inaccurately) as collaborative filtering
      • Actors interact with items
          – observe successful interaction
      • We want to suggest additional successful interactions
      • Observations inherently very sparse
  6. Examples
      • Customers buying books (Linden et al)
      • Web visitors rating music (Shardanand and Maes) or movies (Riedl, et al), (Netflix)
      • Internet radio listeners not skipping songs (Musicmatch)
      • Internet video watchers watching >30 s
  7. Dyadic Structure
      • Functional
          – Interaction: actor -> item*
      • Relational
          – Interaction ⊆ Actors x Items
      • Matrix
          – Rows indexed by actor, columns by item
          – Value is count of interactions
      • Predict missing observations
  8. Recommendation Basics
      • History:

          User  Thing
          1     3
          2     4
          3     4
          2     3
          3     2
          1     1
          2     1
  9. Recommendation Basics
      • History as matrix:
      • (t1, t3) cooccur 2 times, (t1, t4) once, (t2, t4) once

              t1  t2  t3  t4
          u1   1   0   1   0
          u2   1   0   1   1
          u3   0   1   0   1
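
As a minimal sketch of the step from slide 8 to slide 9, the history pairs can be turned into the users x things matrix as follows. The function name `history_to_matrix` is made up for this illustration; the data is the history listed on slide 8.

```python
# History from slide 8 as (user, thing) pairs; ids are 1-based as on the slide.
history = [(1, 3), (2, 4), (3, 4), (2, 3), (3, 2), (1, 1), (2, 1)]


def history_to_matrix(pairs, n_users, n_things):
    """Return a users x things matrix whose entries count interactions."""
    matrix = [[0] * n_things for _ in range(n_users)]
    for user, thing in pairs:
        # Convert the 1-based ids to 0-based row and column positions.
        matrix[user - 1][thing - 1] += 1
    return matrix


A = history_to_matrix(history, n_users=3, n_things=4)
for row in A:
    print(row)
```

The three printed rows are `[1, 0, 1, 0]`, `[1, 0, 1, 1]` and `[0, 1, 0, 1]`, matching rows u1 to u3 of the matrix on slide 9.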
  10. A Quick Simplification
      • Users who do h: $A h$
      • Also do r:
          – User-centric recommendations: $r = A^T (A h)$
          – Item-centric recommendations: $r = (A^T A) h$
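
The two groupings on this slide give the same result because matrix multiplication is associative. The sketch below checks this on the matrix from slide 9, using plain Python lists; the history vector `h` (a user who has interacted only with t1) is a made-up example, and the helper names are illustrative.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matvec(m, v):
    """Multiply matrix m by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(x, y):
    """Multiply matrix x by matrix y."""
    y_cols = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in y_cols] for row in x]


# Users x things matrix from slide 9.
A = [[1, 0, 1, 0],
     [1, 0, 1, 1],
     [0, 1, 0, 1]]
h = [1, 0, 0, 0]  # hypothetical history: this user has interacted with t1 only

# User-centric: find the users who share the history, then sum what they did.
user_centric = matvec(transpose(A), matvec(A, h))
# Item-centric: precompute the item x item cooccurrence, then apply it to h.
item_centric = matvec(matmul(transpose(A), A), h)

print(user_centric, item_centric)
```

Both vectors come out as `[2, 0, 2, 1]`. The item-centric form is the one that allows the items x items matrix to be computed ahead of time.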
  11. Recommendation Basics
      • Cooccurrence

              t1  t2  t3  t4
          t1   2   0   2   1
          t2   0   1   0   1
          t3   2   0   1   1
          t4   1   1   1   2
  12. Problems with Raw Cooccurrence
      • Very popular items co-occur with everything
          – Welcome document
          – Elevator music
      • That isn’t interesting
          – We want anomalous cooccurrence
  13. Recommendation Basics
      • Cooccurrence

              t1  t2  t3  t4
          t1   2   0   2   1
          t2   0   1   0   1
          t3   2   0   1   1
          t4   1   1   1   2

                  t3  not t3
          t1       2    1
          not t1   1    1
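  14. Root LLR Details
      • In R:

          # k is a vector or matrix of counts; (k == 0) adds 1 inside the
          # log for empty cells, so they contribute 0 instead of NaN
          entropy = function(k) {
            -sum(k * log((k == 0) + (k / sum(k))))
          }
          # k is a 2 x 2 contingency table of counts
          rootLLr = function(k) {
            sqrt((entropy(rowSums(k)) + entropy(colSums(k)) - entropy(k)) / 2)
          }

      • Like sqrt(mutual information * N/2)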
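
For readers who do not use R, here is a rough Python port of the two functions above, using only the standard library. The names `entropy` and `root_llr` mirror the R code. The `max(0.0, ...)` guard is an addition of this sketch, there to keep floating-point rounding from producing a tiny negative number under the square root.

```python
from math import log, sqrt


def entropy(counts):
    """Unnormalized entropy of a list of counts: -sum(k * log(k / N))."""
    total = sum(counts)
    # Empty cells are skipped; the R version gets the same effect
    # by adding (k == 0) inside the log.
    return -sum(k * log(k / total) for k in counts if k > 0)


def root_llr(table):
    """Score a 2 x 2 contingency table given as a list of two rows."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    cells = [k for row in table for k in row]
    value = (entropy(row_sums) + entropy(col_sums) - entropy(cells)) / 2
    return sqrt(max(0.0, value))  # guard against rounding just below zero


print(round(root_llr([[1, 0], [0, 2]]), 2))
print(round(root_llr([[1, 0], [0, 10000]]), 2))
```

The two calls print 0.98 and 2.26, which agree with two of the scores quoted on the next slide.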
  15. Spot the Anomaly
      • Root LLR is roughly like standard deviations

                  A      not A
          B       13     1000
          not B   1000   100,000        root LLR = 0.44

                  A      not A
          B       1      0
          not B   0      2              root LLR = 0.98

                  A      not A
          B       1      0
          not B   0      10,000         root LLR = 2.26

                  A      not A
          B       10     0
          not B   0      100,000        root LLR = 7.15
  16. Threshold by Score
      • Cooccurrence

              t1  t2  t3  t4
          t1   2   0   2   1
          t2   0   1   0   1
          t3   2   0   1   1
          t4   1   1   1   2
  17. Threshold by Score
      • Significant cooccurrence => Indicators

              t1  t2  t3  t4
          t1   1   0   0   1
          t2   0   1   0   1
          t3   0   0   1   1
          t4   1   0   0   1
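
One way to sketch the thresholding step: build a 2 x 2 contingency table for every pair of items from the users x things matrix, score it with root LLR, and keep the pairs that score above a threshold. Several things here are assumptions of the sketch and not taken from the slides: the function name `indicators`, the threshold value, the choice to skip self-pairs, and the check that keeps only pairs that cooccur more often than expected. With only three users the scores are tiny, so the output is not meant to reproduce the indicator matrix on the slide.

```python
from math import log, sqrt


def entropy(counts):
    total = sum(counts)
    return -sum(k * log(k / total) for k in counts if k > 0)


def root_llr(table):
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    cells = [k for row in table for k in row]
    value = (entropy(row_sums) + entropy(col_sums) - entropy(cells)) / 2
    return sqrt(max(0.0, value))


def indicators(A, threshold):
    """For each item, list the other items that cooccur with it anomalously often."""
    n_users, n_items = len(A), len(A[0])
    result = []
    for i in range(n_items):
        kept = []
        for j in range(n_items):
            if i == j:
                continue  # assumption: self-pairs are not scored
            both = sum(1 for u in A if u[i] and u[j])
            only_i = sum(1 for u in A if u[i] and not u[j])
            only_j = sum(1 for u in A if u[j] and not u[i])
            neither = n_users - both - only_i - only_j
            # Cooccurrence count expected if items i and j were independent.
            expected = (both + only_i) * (both + only_j) / n_users
            table = [[both, only_i], [only_j, neither]]
            if both > expected and root_llr(table) > threshold:
                kept.append(j)
        result.append(kept)
    return result


A = [[1, 0, 1, 0],
     [1, 0, 1, 1],
     [0, 1, 0, 1]]
print(indicators(A, threshold=0.9))
```

With this data and threshold only the pair (t1, t3) survives, so the printed result is `[[2], [], [0], []]` (0-based item positions).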
  18. Decomposition for Cooccurrence
      • Can use SVD for cooccurrence
      • But first one or two singular vectors just encode popularity … ignore those
      • $V^T$ projects items into concept space, $V$ projects back into item space
      • Thresholding reconstructed cooccurrence matrix is another way to get indicators

          $$A^T A = (U S V^T)^T (U S V^T) = V S^2 V^T$$
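
The identity on this slide relies on two standard properties of the (thin) SVD: the columns of $U$ are orthonormal, so $U^T U = I$, and $S$ is diagonal, so $S^T = S$. Spelled out step by step:

```latex
\begin{align*}
A^T A &= (U S V^T)^T (U S V^T) \\
      &= V S^T U^T \, U S V^T \\
      &= V S \,(U^T U)\, S V^T \\
      &= V S^2 V^T
\end{align*}
```

Note that $U$ drops out entirely, which is why only $V$ and $S$ are needed to reconstruct the cooccurrence matrix.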
  19. What’s right about this?
  20. Virtues of Current State of the Art
      • Lots of well publicized history
          – Netflix, Amazon, Overstock
      • Lots of support
          – Mahout, commercial offerings like Myrrix
      • Lots of existing code
          – Mahout, commercial codes
      • Proven track record
      • Well socialized solution
  21. What’s wrong about this?
  22. Cross Occurrence
      • We don’t have to do co-occurrence
      • We can do cross-occurrence
      • Result is cross-recommendation
  23. Fundamental Algorithmics
      • Cooccurrence: $K = A^T A$
      • A is users x items, K is items x items
      • Product has general shape of matrix
      • K tells us “users who interacted with x also interacted with y”
  24. Fundamental Algorithmic Structure
      • Cooccurrence: $K = A^T A$
      • Matrix approximation by factoring: $A \approx U S V^T$, $K \approx V S^2 V^T$, $r = V S^2 V^T h$
      • LLR: $r = \mathrm{sparsify}(A^T A)\, h$
  25. But Wait ... Does it have to be that way?
  26. But why not ...
      • Why just dyadic learning?
      • Why not triadic learning?
      • Why not cross learning?
      • $(A^T A)\, h$, $(B^T A)\, h$
  27. For example
      • Users enter queries (A)
          – (actor = user, item = query)
      • Users view videos (B)
          – (actor = user, item = video)
      • A’A gives query recommendation
          – “did you mean to ask for”
      • B’B gives video recommendation
          – “you might like these videos”
  28. The punch-line
      • B’A recommends videos in response to a query
          – (isn’t that a search engine?)
          – (not quite, it doesn’t look at content or meta-data)
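
A small sketch of the cross-occurrence idea, with made-up data: `A` records which users entered which queries, `B` records which users viewed which videos, and B’A (written `B^T A` in the comments) maps a query to videos. The query and video labels, the helper names, and the use of raw counts without any LLR sparsification are all simplifications of this illustration.

```python
def cross_occurrence(B, A):
    """Return B^T A: rows are the items of B, columns are the items of A."""
    return [[sum(b * a for b, a in zip(b_col, a_col)) for a_col in zip(*A)]
            for b_col in zip(*B)]


def recommend(cross, h, labels):
    """Score each row item against history h; return labels with a positive score, best first."""
    scores = [sum(c * x for c, x in zip(row, h)) for row in cross]
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    return [label for score, label in ranked if score > 0]


queries = ["flamenco", "guitar"]  # hypothetical query vocabulary
videos = ["v1", "v2", "v3"]       # hypothetical video ids

A = [[1, 0],     # user 1 searched "flamenco"
     [1, 1],     # user 2 searched both
     [0, 1]]     # user 3 searched "guitar"
B = [[1, 0, 0],  # user 1 watched v1
     [1, 1, 0],  # user 2 watched v1 and v2
     [0, 0, 1]]  # user 3 watched v3

cross = cross_occurrence(B, A)  # videos x queries
h = [1, 0]                      # the current query is "flamenco"
print(recommend(cross, h, videos))
```

This prints `['v1', 'v2']`: the videos watched by users who entered the same query, without looking at any video content or meta-data.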
  29. Real-life example
      • Query: “Paco de Lucia”
      • Conventional meta-data search results:
          – “hombres del paco” times 400
          – not much else
      • Recommendation based search:
          – Flamenco guitar and dancers
          – Spanish and classical guitar
          – Van Halen doing a classical/flamenco riff
  30. Real-life example
  31. Hypothetical Example
      • Want a navigational ontology?
      • Just put labels on a web page with traffic
          – This gives A = users x label clicks
      • Remember viewing history
          – This gives B = users x items
      • Cross recommend
          – B’A = label to item mapping
      • After several users click, results are whatever users think they should be
  32. But wait, there’s more!
  33. [Figure: matrix $A$ with rows indexed by users and columns by things]
  34. [Figure: block matrix $[A_1\ A_2]$ with rows indexed by users and column blocks for thing type 1 and thing type 2]
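  35. $$[A_1\ A_2]^T\,[B_1\ B_2] = \begin{bmatrix} A_1^T \\ A_2^T \end{bmatrix} [B_1\ B_2] = \begin{bmatrix} A_1^T B_1 & A_1^T B_2 \\ A_2^T B_1 & A_2^T B_2 \end{bmatrix}$$

      $$\begin{bmatrix} r_1 \\ r_2 \end{bmatrix} = \begin{bmatrix} A_1^T B_1 & A_1^T B_2 \\ A_2^T B_1 & A_2^T B_2 \end{bmatrix} \begin{bmatrix} h_1 \\ h_2 \end{bmatrix}$$

      $$r_1 = \begin{bmatrix} A_1^T B_1 & A_1^T B_2 \end{bmatrix} \begin{bmatrix} h_1 \\ h_2 \end{bmatrix}$$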
  36. Summary
      • Input: Multiple kinds of behavior on one set of things
      • Output: Recommendations for one kind of behavior with a different set of things
      • Cross recommendation is a special case
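
The last equation on slide 35 says that recommendations for one kind of thing are a sum of contributions, one per kind of history. The sketch below follows that slide's notation: `A1` is the behavior to be recommended, and `B1` and `B2` are two kinds of history. The interpretation of the matrices (videos, queries, label clicks), the data, and the helper names are invented for illustration, and raw counts are used without any sparsification.

```python
def cross(X, Y):
    """Return X^T Y for matrices stored as lists of rows with the same users."""
    return [[sum(x * y for x, y in zip(x_col, y_col)) for y_col in zip(*Y)]
            for x_col in zip(*X)]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


A1 = [[1, 0, 0],  # users x videos: the behavior we want to recommend
      [1, 1, 0],
      [0, 0, 1]]
B1 = [[1, 0],     # users x queries: first kind of history
      [1, 1],
      [0, 1]]
B2 = [[1, 0],     # users x label clicks: second kind of history
      [0, 1],
      [0, 1]]

h1 = [1, 0]  # current user entered query 0
h2 = [0, 1]  # current user clicked label 1

# r1 = [A1^T B1  A1^T B2] [h1; h2] = (A1^T B1) h1 + (A1^T B2) h2
from_queries = matvec(cross(A1, B1), h1)
from_labels = matvec(cross(A1, B2), h2)
r1 = [p + q for p, q in zip(from_queries, from_labels)]
print(r1)
```

This prints `[3, 2, 1]`, the sum of `[2, 1, 0]` from the query history and `[1, 1, 1]` from the label-click history. Setting `h2` to all zeros reduces it to plain cross recommendation, which is the special case mentioned in the summary.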
  37. Now again, without the scary math
  38. Input Data
      • User transactions
          – user id, merchant id
          – SIC code, amount
          – Descriptions, cuisine, …
      • Offer transactions
          – user id, offer id
          – vendor id, merchant id’s,
          – offers, views, accepts
  39. Input Data
      • User transactions
          – user id, merchant id
          – SIC code, amount
          – Descriptions, cuisine, …
      • Offer transactions
          – user id, offer id
          – vendor id, merchant id’s,
          – offers, views, accepts
      • Derived user data
          – merchant id’s
          – anomalous descriptor terms
          – offer & vendor id’s
      • Derived merchant data
          – local top40
          – SIC code
          – vendor code
          – amount distribution
  40. Cross-recommendation
      • Per merchant indicators
          – merchant id’s
          – chain id’s
          – SIC codes
          – indicator terms from text
          – offer vendor id’s
      • Computed by finding anomalous (indicator => merchant) rates
  41. Search-based Recommendations
      • Sample document
          – Merchant Id
          – Field for text description
          – Phone
          – Address
          – Location
  42. Search-based Recommendations
      • Sample document
          – Merchant Id
          – Field for text description
          – Phone
          – Address
          – Location
          – Indicator merchant id’s
          – Indicator industry (SIC) id’s
          – Indicator offers
          – Indicator text
          – Local top40
  43. Search-based Recommendations
      • Sample document
          – Merchant Id
          – Field for text description
          – Phone
          – Address
          – Location
          – Indicator merchant id’s
          – Indicator industry (SIC) id’s
          – Indicator offers
          – Indicator text
          – Local top40
      • Sample query
          – Current location
          – Recent merchant descriptions
          – Recent merchant id’s
          – Recent SIC codes
          – Recent accepted offers
          – Local top40
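
To make the document and query shapes above concrete, here is a toy in-memory stand-in for the search step. It is not Solr and uses no Solr API. The field names (`indicator_merchant_ids`, `indicator_sic_ids`), the merchant ids and the scoring rule (one point per matching term) are assumptions of this sketch; a real search engine would apply its own relevance weighting.

```python
# Each document describes one merchant, with indicator fields filled in offline.
documents = [
    {"merchant_id": "m1",
     "indicator_merchant_ids": {"m2", "m3"},
     "indicator_sic_ids": {"5812"}},
    {"merchant_id": "m2",
     "indicator_merchant_ids": {"m1"},
     "indicator_sic_ids": {"5411"}},
    {"merchant_id": "m3",
     "indicator_merchant_ids": {"m4"},
     "indicator_sic_ids": {"5812", "5813"}},
]

# The query is built from the user's recent history, one entry per indicator field.
query = {
    "indicator_merchant_ids": {"m3", "m4"},  # recent merchant ids
    "indicator_sic_ids": {"5812"},           # recent SIC codes
}


def score(doc, query):
    """Count the query terms that match the document, field by field."""
    return sum(len(doc.get(field, set()) & terms) for field, terms in query.items())


def search(documents, query, exclude=()):
    """Return merchant ids with a positive score, best first, skipping excluded ids."""
    scored = [(score(doc, query), doc["merchant_id"])
              for doc in documents if doc["merchant_id"] not in exclude]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [merchant_id for _, merchant_id in scored]


# Exclude merchants the user has just visited so they are not recommended back.
print(search(documents, query, exclude={"m3", "m4"}))
```

This prints `['m1']`. The point of the design is that recommendation becomes an ordinary search query against indicator fields, so the serving side needs nothing beyond a search engine.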
  44. [Figure: architecture diagram with components labeled Complete history, Cooccurrence (Mahout), Item meta-data, Solr indexing, two Solr indexers, and Index shards]
  45. [Figure: architecture diagram with components labeled User history, Web tier, Item meta-data, Solr search, two Solr indexers, and Index shards]
  46. Objective Results
      • At a very large credit card company
      • History is all transactions, all web interaction
      • Processing time cut from 20 hours per day to 3
      • Recommendation engine load time decreased from 8 hours to 3 minutes
      • Recommendation quality increased visibly
  47. Contact:
          – tdunning@maprtech.com
          – @ted_dunning
      Slides and such (available late tonight):
          – http://www.slideshare.net/tdunning
      Hash tags: #mapr #recommendations
      We are hiring!
  48. Thank You
