
Buzz Words Dunning Multi Modal Recommendations


Multi-modal recommendation engines use multiple kinds of behavior as input and can be implemented using standard search engine technology. I show how and why, starting with basic recommendations and going all the way through full multi-modal systems.


Transcript of "Buzz Words Dunning Multi Modal Recommendations"

  1. ©MapR Technologies - Confidential
     Multi-Modal Recommendations
  2. Multiple Kinds of Behavior for Recommending Multiple Kinds of Things
  3. What’s Up
       • What is this multi-modal stuff?
       • A simple recommendation architecture
       • Some scary math
       • Putting it into a deployable architecture
       • Final thoughts
  4. Contact and links
       • Contact:
           – tdunning@maprtech.com
           – @ted_dunning
           – @apachemahout
           – @user-subscribe@mahout.apache.org
       • Slides and such (available late tonight):
           – http://www.slideshare.net/tdunning
       • Hash tags: #bbuzz #mapr #recommendations
  5. Recommendations
       • Often known (inaccurately) as collaborative filtering
       • Actors interact with items
           – observe successful interaction
       • We want to suggest additional successful interactions
       • Observations inherently very sparse
  6. Examples of Recommendations
       • Customers buying books (Linden et al)
       • Web visitors rating music (Shardanand and Maes) or movies (Riedl, et al), (Netflix)
       • Internet radio listeners not skipping songs (Musicmatch)
       • Internet video watchers watching >30 s (Veoh)
  7. What is this multi-modal stuff?
       • But people don’t just do one thing
       • One kind of behavior is useful for predicting other kinds
       • Having a complete picture is important for accuracy
       • What has the user said, viewed, clicked, closed, bought lately?
  8. A simple recommendation architecture
       • Look at the history of interactions
       • Find significant item cooccurrence in user histories
       • Use these cooccurring items as “indicators”
       • For all indicators in user history, add up scores
  9. Recommendation Basics
       • History:

          User  Thing
             1      3
             2      4
             3      4
             2      3
             3      2
             1      1
             2      1

  10. Recommendation Basics
       • History as matrix:

               t1  t2  t3  t4
          u1    1   0   1   0
          u2    1   0   1   1
          u3    0   1   0   1

       • (t1, t3) cooccur 2 times,
       • (t1, t4) once,
       • (t2, t4) once,
       • (t3, t4) once
  11. A Quick Simplification
       • Users who do h: $A h$
       • Also do r: $A^T (A h) = (A^T A) h$
       • User-centric recommendations: $A^T (A h)$
       • Item-centric recommendations: $(A^T A) h$
  12. Recommendation Basics
       • Cooccurrence ($A^T A$ for the history matrix on the previous slide; each diagonal entry counts the users who touched that item):

               t1  t2  t3  t4
          t1    2   0   2   1
          t2    0   1   0   1
          t3    2   0   2   1
          t4    1   1   1   2
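The history, matrix, and cooccurrence slides can be traced end to end in a few lines. The following is a minimal sketch in plain Python, not the Mahout implementation; the helper names (`history_matrix`, `matmul`, `matvec`) and the one-item history `h` are made up for illustration. It builds the binary matrix A from the (user, thing) pairs, forms AᵀA, and checks that the user-centric and item-centric forms give the same scores. Under this definition the diagonal of AᵀA is the number of users per item, so the (t3, t3) entry evaluates to 2.

```python
# History from the slides as (user, thing) pairs; ids are 1-based.
history = [(1, 3), (2, 4), (3, 4), (2, 3), (3, 2), (1, 1), (2, 1)]


def history_matrix(pairs, n_users, n_things):
    """Binary users-by-things matrix A."""
    a = [[0] * n_things for _ in range(n_users)]
    for user, thing in pairs:
        a[user - 1][thing - 1] = 1
    return a


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    """Plain nested-list matrix product x @ y."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


A = history_matrix(history, n_users=3, n_things=4)
cooccurrence = matmul(transpose(A), A)  # A^T A, things by things

# Hypothetical history vector: someone who has only touched t1.
h = [1, 0, 0, 0]
user_centric = matvec(transpose(A), matvec(A, h))  # A^T (A h)
item_centric = matvec(cooccurrence, h)             # (A^T A) h

for row in cooccurrence:
    print(row)
print(user_centric, item_centric)
```

The two orderings are the same product by associativity; the item-centric form is the one that lets AᵀA be computed offline and reused for every user.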
  13. Problems with Raw Cooccurrence
       • Very popular items co-occur with everything
           – Welcome document
           – Elevator music
       • That isn’t interesting
           – We want anomalous cooccurrence
  14. Recommendation Basics
       • Cooccurrence

               t1  t2  t3  t4
          t1    2   0   2   1
          t2    0   1   0   1
          t3    2   0   2   1
          t4    1   1   1   2

       • Contingency table for the pair (t1, t3):

                   t3  not t3
          t1        2       1
          not t1    1       1
  15. Spot the Anomaly
       • Root LLR is roughly like standard deviations

                     A    not A
          B         13     1000
          not B   1000  100,000      root LLR: 0.44

                     A    not A
          B          1        0
          not B      0        2      root LLR: 0.98

                     A    not A
          B          1        0
          not B      0   10,000      root LLR: 2.26

                     A    not A
          B         10        0
          not B      0  100,000      root LLR: 7.15
  16. Root LLR Details
       • In R

          entropy = function(k) {
            -sum(k*log((k==0)+(k/sum(k))))
          }
          rootLLr = function(k) {
            sign = …
            sign * sqrt(
              (entropy(rowSums(k))+entropy(colSums(k))
                - entropy(k))/2)
          }

       • Like sqrt(mutual information * N/2)
       • See http://bit.ly/16DvLVK
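For readers who do not use R, here is a rough Python transliteration of the formula on this slide, offered as a sketch rather than as the author's code. It keeps the slide's division by 2. The slide elides how `sign` is computed, so the sign rule below is an assumption: the score is made negative when A and B occur together less often than the rest of the table would suggest. The function and variable names are invented for this example.

```python
import math


def entropy(counts):
    """Unnormalized entropy, mirroring the R helper: -sum(k * log(k / sum(k)))."""
    total = sum(counts)
    # Zero counts contribute nothing, which is what the (k==0) trick does in R.
    return -sum(k * math.log(k / total) for k in counts if k > 0)


def root_llr(k11, k12, k21, k22):
    """Signed root LLR for the 2x2 table [[k11, k12], [k21, k22]]."""
    rows = [k11 + k12, k21 + k22]
    cols = [k11 + k21, k12 + k22]
    cells = [k11, k12, k21, k22]
    # Same expression as the R code on the slide, including the division by 2.
    value = (entropy(rows) + entropy(cols) - entropy(cells)) / 2
    root = math.sqrt(max(value, 0.0))  # guard against tiny negative round-off
    # ASSUMPTION (the slide shows only "sign = ..."): negative when the first
    # row's share of k11 is below the second row's share of k21.
    if k11 * (k21 + k22) < k21 * (k11 + k12):
        return -root
    return root


for table in [(1, 0, 0, 2), (1, 0, 0, 10000), (10, 0, 0, 100000)]:
    print(table, round(root_llr(*table), 2))
```

Under these assumptions the three diagonal tables from the "Spot the Anomaly" slide come out at about 0.98, 2.26 and 7.15, matching the scores shown there.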
  17. Threshold by Score
       • Cooccurrence

               t1  t2  t3  t4
          t1    2   0   2   1
          t2    0   1   0   1
          t3    2   0   2   1
          t4    1   1   1   2
  18. Threshold by Score
       • Significant cooccurrence => Indicators

               t1  t2  t3  t4
          t1    1   0   0   1
          t2    0   1   0   1
          t3    0   0   1   1
          t4    1   0   0   1
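Once indicators exist, the last step of the simple architecture ("for all indicators in user history, add up scores") is a lookup and a sum. The sketch below uses the indicator matrix from this slide and assumes that each row lists the indicators for one item; the slide does not state the orientation, so that reading is an assumption, as are the function names `score` and `recommend`.

```python
items = ["t1", "t2", "t3", "t4"]

# Indicator matrix from the slide.
# ASSUMPTION: row = item being scored, columns = that item's indicators.
indicators = [
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
]


def score(history):
    """For each item, count how many of its indicators appear in the history."""
    seen = {items.index(name) for name in history}
    return {item: sum(row[j] for j in seen) for item, row in zip(items, indicators)}


def recommend(history, top_n=2):
    """Highest-scoring items that the user has not already interacted with."""
    scores = score(history)
    candidates = [i for i in items if i not in history and scores[i] > 0]
    return sorted(candidates, key=lambda i: -scores[i])[:top_n]


print(score(["t1"]))
print(recommend(["t1"]))
```

Every indicator counts as 1 here; a search engine would typically weight rarer indicators more heavily, which this sketch does not attempt.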
  19. So Far, So Good
       • Classic recommendation systems based on these approaches
           – Musicmatch (ca 2000)
           – Veoh Networks (ca 2005)
       • Currently available in Mahout
           – See RowSimilarityJob
       • Very simple to deploy
           – Compute indicators
           – Store in search engine
           – Works very well with enough data
  20. What’s right about this?
  21. Virtues of Current State of the Art
       • Lots of well publicized history
           – Musicmatch, Veoh, Netflix, Amazon, Overstock
       • Lots of support
           – Mahout, commercial offerings like Myrrix
       • Lots of existing code
           – Mahout, commercial codes
       • Proven track record
       • Well socialized solution
  22. What’s wrong about this?
  23. Too Limited
       • People do more than one kind of thing
       • Different kinds of behaviors give different quality, quantity and kind of information
       • We don’t have to do co-occurrence
       • We can do cross-occurrence
       • Result is cross-recommendation
  24. Heh?
  25. Symmetry Gives Cross Recommendations
       • Why just dyadic learning? $(A^T A) h$
       • Why not triadic learning?
       • Why not cross learning? $(B^T A) h$
  26. For example
       • Users enter queries (A)
           – (actor = user, item = query)
       • Users view videos (B)
           – (actor = user, item = video)
       • A’A gives query recommendation
           – “did you mean to ask for”
       • B’B gives video recommendation
           – “you might like these videos”
  27. The punch-line
       • B’A recommends videos in response to a query
           – (isn’t that a search engine?)
           – (not quite, it doesn’t look at content or meta-data)
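To make B’A concrete, here is a small sketch with invented data: four users, three queries and three videos, none of which come from the talk. A is users by queries, B is users by videos, and B’A counts how often a video was viewed by users who also entered a given query. The counts are raw cross-occurrences; the significance filtering described for cooccurrence is left out to keep the example short.

```python
# Hypothetical labels and data, invented for illustration only.
queries = ["flamenco", "guitar lessons", "cat videos"]
videos = ["paco_live", "rumba_tutorial", "kitten_compilation"]

# A: users x queries (1 = user entered the query).
A = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]
# B: users x videos (1 = user viewed the video).
B = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
]


def cross_occurrence(b, a):
    """B^T A: rows are videos, columns are queries."""
    return [[sum(x * y for x, y in zip(b_col, a_col)) for a_col in zip(*a)]
            for b_col in zip(*b)]


cross = cross_occurrence(B, A)


def videos_for_query(query):
    """Rank videos by (B^T A) h, where h is the one-hot vector for the query."""
    q = queries.index(query)
    scored = [(row[q], name) for row, name in zip(cross, videos) if row[q] > 0]
    return [name for count, name in sorted(scored, key=lambda pair: -pair[0])]


print(cross)
print(videos_for_query("flamenco"))
```

Nothing in this ranking looks at video titles or descriptions, which is the sense in which the result behaves like a search engine without being one.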
  28. Real-life example
       • Query: “Paco de Lucia”
       • Conventional meta-data search results:
           – “hombres del paco” times 400
           – not much else
       • Recommendation based search:
           – Flamenco guitar and dancers
           – Spanish and classical guitar
           – Van Halen doing a classical/flamenco riff
  29. Real-life example
  30. Hypothetical Example
       • Want a navigational ontology?
       • Just put labels on a web page with traffic
           – This gives A = users x label clicks
       • Remember viewing history
           – This gives B = users x items
       • Cross recommend
           – B’A = label to item mapping
       • After several users click, results are whatever users think they should be
  31. [no text]
  32. Nice. But we can do better?
  33. [Diagram: matrix $A$; rows = users, columns = things]
  34. [Diagram: block matrix $[A_1 \; A_2]$; rows = users, columns of $A_1$ = thing type 1, columns of $A_2$ = thing type 2]
  35. [Diagram: block matrix $[A_1 \; A_2]$; rows = users, $A_1$ = action1 on item type1, $A_2$ = action2 on item type2]
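  36. Block form of the cooccurrence product

      \[
      \begin{bmatrix} A_1 & A_2 \end{bmatrix}^T
      \begin{bmatrix} A_1 & A_2 \end{bmatrix}
      =
      \begin{bmatrix} A_1^T \\ A_2^T \end{bmatrix}
      \begin{bmatrix} A_1 & A_2 \end{bmatrix}
      =
      \begin{bmatrix}
        A_1^T A_1 & A_1^T A_2 \\
        A_2^T A_1 & A_2^T A_2
      \end{bmatrix}
      \]

      \[
      \begin{bmatrix} r_1 \\ r_2 \end{bmatrix}
      =
      \begin{bmatrix}
        A_1^T A_1 & A_1^T A_2 \\
        A_2^T A_1 & A_2^T A_2
      \end{bmatrix}
      \begin{bmatrix} h_1 \\ h_2 \end{bmatrix}
      \]

      \[
      r_1 =
      \begin{bmatrix} A_1^T A_1 & A_1^T A_2 \end{bmatrix}
      \begin{bmatrix} h_1 \\ h_2 \end{bmatrix}
      \]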
  37. Summary
       • Input: Multiple kinds of behavior on one set of things
       • Output: Recommendations for one kind of behavior with a different set of things
       • Cross recommendation is a special case
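The block-matrix result r1 = [A1ᵀA1  A1ᵀA2][h1; h2] says that recommendations for the first kind of item are the sum of a cooccurrence term and a cross-occurrence term. The sketch below checks that numerically on small made-up matrices; all values and helper names are hypothetical, and no significance filtering is applied.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


# Hypothetical data: 3 users, 3 items of type 1, 2 items of type 2.
A1 = [[1, 1, 0], [1, 0, 0], [0, 0, 1]]  # users x item type 1 (action 1)
A2 = [[1, 0], [1, 0], [0, 1]]           # users x item type 2 (action 2)
h1 = [0, 1, 0]                          # user's history for action 1
h2 = [1, 0]                             # user's history for action 2

# r1 = A1^T A1 h1 + A1^T A2 h2
cooccurrence_part = matvec(matmul(transpose(A1), A1), h1)
cross_part = matvec(matmul(transpose(A1), A2), h2)
r1 = [a + b for a, b in zip(cooccurrence_part, cross_part)]

# Same result via the stacked matrix [A1 A2]: take the first block of rows.
combined = [row1 + row2 for row1, row2 in zip(A1, A2)]
full = matvec(matmul(transpose(combined), combined), h1 + h2)
r1_from_blocks = full[:len(h1)]

print(r1, r1_from_blocks)
```

Setting h1 to all zeros leaves only the cross-occurrence term, which is the "special case" of pure cross recommendation.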
  38. Now again, without the scary math
  39. Input Data
       • User transactions
           – user id, merchant id
           – SIC code, amount
           – Descriptions, cuisine, …
       • Offer transactions
           – user id, offer id
           – vendor id, merchant id’s,
           – offers, views, accepts
  40. Input Data
       • User transactions
           – user id, merchant id
           – SIC code, amount
           – Descriptions, cuisine, …
       • Offer transactions
           – user id, offer id
           – vendor id, merchant id’s,
           – offers, views, accepts
       • Derived user data
           – merchant id’s
           – anomalous descriptor terms
           – offer & vendor id’s
       • Derived merchant data
           – local top40
           – SIC code
           – vendor code
           – amount distribution
  41. Cross-recommendation
       • Per merchant indicators
           – merchant id’s
           – chain id’s
           – SIC codes
           – indicator terms from text
           – offer vendor id’s
       • Computed by finding anomalous (indicator => merchant) rates
  42. Search-based Recommendations
       • Sample document
           – Merchant Id
           – Field for text description
           – Phone
           – Address
           – Location
  43. Search-based Recommendations
       • Sample document
           – Merchant Id
           – Field for text description
           – Phone
           – Address
           – Location
           – Indicator merchant id’s
           – Indicator industry (SIC) id’s
           – Indicator offers
           – Indicator text
           – Local top40
  44. Search-based Recommendations
       • Sample document
           – Merchant Id
           – Field for text description
           – Phone
           – Address
           – Location
           – Indicator merchant id’s
           – Indicator industry (SIC) id’s
           – Indicator offers
           – Indicator text
           – Local top40
       • Sample query
           – Current location
           – Recent merchant descriptions
           – Recent merchant id’s
           – Recent SIC codes
           – Recent accepted offers
           – Local top40
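The sample document and sample query pair up field by field: the user's recent merchant ids are matched against each merchant's indicator merchant ids, recent SIC codes against indicator SIC ids, and so on. The toy below imitates that matching in plain Python so the idea can be run without a search engine. It is a stand-in, not Solr: field names, ids and values are all hypothetical, and the score is a bare overlap count where a real engine would apply its own term weighting.

```python
# Hypothetical merchant documents; each indicator field holds a set of tokens.
documents = [
    {"merchant_id": "m1",
     "indicator_merchant_ids": {"m7", "m9"},
     "indicator_sic_ids": {"5812"},
     "indicator_offers": {"o3"},
     "local_top40": {"berlin"}},
    {"merchant_id": "m2",
     "indicator_merchant_ids": {"m4"},
     "indicator_sic_ids": {"5411"},
     "indicator_offers": set(),
     "local_top40": {"berlin"}},
    {"merchant_id": "m3",
     "indicator_merchant_ids": {"m7"},
     "indicator_sic_ids": {"5812", "5411"},
     "indicator_offers": {"o3", "o5"},
     "local_top40": {"munich"}},
]

# Hypothetical query assembled from one user's recent history.
query = {
    "indicator_merchant_ids": {"m7"},  # recent merchant ids
    "indicator_sic_ids": {"5812"},     # recent SIC codes
    "indicator_offers": {"o3"},        # recently accepted offers
    "local_top40": {"berlin"},         # current location bucket
}


def overlap_score(doc, q):
    """Number of query tokens that appear in the matching document field."""
    return sum(len(doc.get(field, set()) & terms) for field, terms in q.items())


def search(q, docs):
    """Merchant ids ranked by overlap score, dropping documents with no match."""
    scored = [(overlap_score(d, q), d["merchant_id"]) for d in docs]
    return [(mid, s) for s, mid in sorted(scored, key=lambda p: -p[0]) if s > 0]


print(search(query, documents))
```

Because recommendation becomes an ordinary query against indexed fields, the serving side needs nothing beyond the search engine and the user's recent history.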
  45. [Architecture diagram, indexing side. Components: complete history, cooccurrence (Mahout), item meta-data, Solr indexers, Solr indexing, index shards]
  46. [Architecture diagram, serving side. Components: user history, web tier, item meta-data, Solr indexers, Solr search, index shards]
  47. Contact and links
       • Contact:
           – tdunning@maprtech.com
           – @ted_dunning
           – @apachemahout
           – @user-subscribe@mahout.apache.org
       • Slides and such (available late tonight):
           – http://www.slideshare.net/tdunning
       • Hash tags: #bbuzz #mapr #recommendations
       • We are hiring!
  48. Objective Results
       • At a very large credit card company
       • History is all transactions, all web interaction
       • Processing time cut from 20 hours per day to 3
       • Recommendation engine load time decreased from 8 hours to 3 minutes
       • Recommendation quality increased visibly
  49. Thank You