Polyvalent Recommendations


Recent work in recommendations allows some really amazing simplicity of implementation while extending the inputs handled to multiple kinds of interactions against items different from the ones being recommended.

Published in: Technology

Polyvalent Recommendations

  1. 1. 1©MapR Technologies - Confidential Polyvalent Recommendations
  2. Multiple Kinds of Behavior for Recommending Multiple Kinds of Things
  3. Contact:
       – tdunning@maprtech.com
       – @ted_dunning
       – @apachemahout
       – user-subscribe@mahout.apache.org
     Slides and such (available late tonight):
       – http://www.slideshare.net/tdunning
     Hash tags: #mapr #recommendations
  4. Polyvalent recommendation is a new approach to recommendation that is both simpler and much more powerful than traditional approaches. The idea is to combine user, item, and content recommendations into a single query that can be implemented with a very simple architecture.
  5. Recommendations
     • Often known (inaccurately) as collaborative filtering
     • Actors interact with items
       – observe successful interaction
     • We want to suggest additional successful interactions
     • Observations inherently very sparse
  6. Examples
     • Customers buying books (Linden et al.)
     • Web visitors rating music (Shardanand and Maes) or movies (Riedl et al.; Netflix)
     • Internet radio listeners not skipping songs (Musicmatch)
     • Internet video watchers watching >30 s
  7. Dyadic Structure
     • Functional
       – Interaction: actor -> item*
     • Relational
       – Interaction ⊆ Actors x Items
     • Matrix
       – Rows indexed by actor, columns by item
       – Value is count of interactions
     • Predict missing observations
  8. Recommendation Basics
     • History:

         User  Thing
         1     3
         2     4
         3     4
         2     3
         3     2
         1     1
         2     1
  9. Recommendation Basics
     • History as matrix:

              t1  t2  t3  t4
         u1    1   0   1   0
         u2    1   0   1   1
         u3    0   1   0   1

     • (t1, t3) cooccur 2 times, (t1, t4) once, (t2, t4) once
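As a minimal sketch (Python is used here purely for illustration, and the helper name `history_matrix` is hypothetical rather than anything from the talk), the (user, thing) pairs from the History slide can be turned into the users x things matrix like this:

```python
# (user, thing) pairs exactly as listed on the History slide
history = [(1, 3), (2, 4), (3, 4), (2, 3), (3, 2), (1, 1), (2, 1)]


def history_matrix(pairs, n_users, n_things):
    """Return a users x things matrix of interaction counts."""
    a = [[0] * n_things for _ in range(n_users)]
    for user, thing in pairs:
        a[user - 1][thing - 1] += 1  # ids on the slide are 1-based
    return a


A = history_matrix(history, n_users=3, n_things=4)
for row in A:
    print(row)
```

Each row is one user and each column one thing, so the observations stay sparse: most entries are zero.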
  10. A Quick Simplification
      • Users who do $h$: $A h$
      • Also do $r$:
        – User-centric recommendations: $r = A^T (A h)$
        – Item-centric recommendations: $r = (A^T A) h$
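The two groupings give the same vector because matrix multiplication is associative. A small check in plain Python, assuming the history matrix from the earlier slide and a hypothetical history vector `h` for someone who has touched only t1:

```python
A = [[1, 0, 1, 0],
     [1, 0, 1, 1],
     [0, 1, 0, 1]]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


h = [1, 0, 0, 0]  # hypothetical history: touched t1 only

# User-centric: find users who did h, then sum what those users did
user_centric = matvec(transpose(A), matvec(A, h))
# Item-centric: precompute item-item cooccurrence, then apply it to h
item_centric = matvec(matmul(transpose(A), A), h)

print(user_centric, item_centric)
```

The item-centric form is the one that allows the cooccurrence matrix to be computed offline and reused for every query.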
  11. Recommendation Basics
      • Cooccurrence

               t1  t2  t3  t4
          t1    2   0   2   1
          t2    0   1   0   1
          t3    2   0   2   1
          t4    1   1   1   2
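Cooccurrence counts can also be accumulated directly from per-user item lists without forming the matrix product. The sketch below assumes each user touched each thing at most once; `cooccurrence` is a hypothetical helper name. Computed this way from the history matrix, the diagonal entry for t3 is 2, since both u1 and u2 touched t3.

```python
from collections import Counter
from itertools import product

# Things touched by each user, taken from the history matrix
user_items = {1: [1, 3], 2: [1, 3, 4], 3: [2, 4]}


def cooccurrence(user_items):
    """Count, for every ordered pair (i, j), the users who touched both."""
    counts = Counter()
    for items in user_items.values():
        for i, j in product(items, repeat=2):
            counts[(i, j)] += 1
    return counts


K = cooccurrence(user_items)
for i in range(1, 5):
    print([K[(i, j)] for j in range(1, 5)])
```

Pairs that never occur together are simply absent from the counter, which is how the sparsity of the history carries over to the cooccurrence data.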
  12. Problems with Raw Cooccurrence
      • Very popular items co-occur with everything
        – Welcome document
        – Elevator music
      • That isn’t interesting
        – We want anomalous cooccurrence
  13. Recommendation Basics
      • Cooccurrence

               t1  t2  t3  t4
          t1    2   0   2   1
          t2    0   1   0   1
          t3    2   0   2   1
          t4    1   1   1   2

      • Contingency table for t1 and t3:

                   t3  not t3
          t1        2   1
          not t1    1   1
  14. Root LLR Details
      • In R:

          # Unnormalized entropy of a table of counts; (k==0) makes zero cells contribute 0
          entropy = function(k) {
            -sum(k*log((k==0)+(k/sum(k))))
          }
          # Root LLR score of a contingency table k
          rootLLr = function(k) {
            sqrt(
              (entropy(rowSums(k))+entropy(colSums(k)) - entropy(k))/2)
          }

      • Like sqrt(mutual information * N/2)
  15. Spot the Anomaly
      • Root LLR is roughly like standard deviations

                    A      not A      root LLR
          B         13     1000       0.44
          not B     1000   100,000

                    A      not A      root LLR
          B         1      0          0.98
          not B     0      2

                    A      not A      root LLR
          B         1      0          2.26
          not B     0      10,000

                    A      not A      root LLR
          B         10     0          7.15
          not B     0      100,000
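For readers without R, here is a direct Python port of the `entropy` and `rootLLr` functions from the Root LLR Details slide, written as a sketch. The `max(..., 0.0)` guard is an addition to protect against tiny negative values from floating-point rounding. The tables are three of the four shown on the Spot the Anomaly slide.

```python
import math


def entropy(counts):
    """Unnormalized entropy: -sum(k * log(k / N)), skipping zero cells."""
    total = sum(counts)
    return -sum(k * math.log(k / total) for k in counts if k > 0)


def root_llr(table):
    """Root LLR of a contingency table, using the same formula as the R code."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    cells = [k for r in table for k in r]
    score = (entropy(rows) + entropy(cols) - entropy(cells)) / 2
    return math.sqrt(max(score, 0.0))  # guard added for rounding error


tables = [
    [[1, 0], [0, 2]],
    [[1, 0], [0, 10000]],
    [[10, 0], [0, 100000]],
]
for t in tables:
    print(t, round(root_llr(t), 2))
```

The same perfect association scores higher as it is backed by more data, which is why the score behaves roughly like a number of standard deviations.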
  16. Threshold by Score
      • Cooccurrence

               t1  t2  t3  t4
          t1    2   0   2   1
          t2    0   1   0   1
          t3    2   0   2   1
          t4    1   1   1   2
  17. Threshold by Score
      • Significant cooccurrence => Indicators

               t1  t2  t3  t4
          t1    1   0   0   1
          t2    0   1   0   1
          t3    0   0   1   1
          t4    1   0   0   1
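A hedged sketch of how thresholding might be carried out: build a 2x2 contingency table for every pair of things, score it with root LLR, and keep the pairs above a cut-off. The threshold value and the "more often than chance" sign check are assumptions made for this illustration. With only three users the result is not expected to reproduce the illustrative indicator matrix on the slide.

```python
import math

# Things touched by each user (from the history slide)
user_items = {1: {1, 3}, 2: {1, 3, 4}, 3: {2, 4}}
THRESHOLD = 0.9  # hypothetical cut-off, chosen only for this toy data


def entropy(counts):
    total = sum(counts)
    return -sum(k * math.log(k / total) for k in counts if k > 0)


def root_llr(k11, k12, k21, k22):
    rows, cols = [k11 + k12, k21 + k22], [k11 + k21, k12 + k22]
    score = (entropy(rows) + entropy(cols) - entropy([k11, k12, k21, k22])) / 2
    return math.sqrt(max(score, 0.0))


def indicators(user_items, threshold):
    things = sorted(set().union(*user_items.values()))
    n = len(user_items)
    result = {t: [] for t in things}
    for i in things:
        for j in things:
            if i == j:
                continue
            k11 = sum(1 for s in user_items.values() if i in s and j in s)
            k12 = sum(1 for s in user_items.values() if i in s and j not in s)
            k21 = sum(1 for s in user_items.values() if j in s and i not in s)
            k22 = n - k11 - k12 - k21
            # Assumption: keep only pairs seen together MORE often than chance
            more_than_chance = k11 * n > (k11 + k12) * (k11 + k21)
            if more_than_chance and root_llr(k11, k12, k21, k22) >= threshold:
                result[i].append(j)
    return result


print(indicators(user_items, THRESHOLD))
```

The sign check matters because the unsigned score is just as large for pairs that avoid each other (such as t1 and t2 here) as for pairs that go together.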
  18. Decomposition for Cooccurrence
      • Can use SVD for cooccurrence
      • But first one or two singular vectors just encode popularity … ignore those
      • $V^T$ projects items into concept space, $V$ projects back into item space
      • Thresholding reconstructed cooccurrence matrix is another way to get indicators

        $$A^T A = (U S V^T)^T (U S V^T) = V S^2 V^T$$
  19. What’s right about this?
  20. Virtues of Current State of the Art
      • Lots of well-publicized history
        – Netflix, Amazon, Overstock
      • Lots of support
        – Mahout, commercial offerings like Myrrix
      • Lots of existing code
        – Mahout, commercial codes
      • Proven track record
      • Well-socialized solution
  21. What’s wrong about this?
  22. Cross Occurrence
      • We don’t have to do co-occurrence
      • We can do cross-occurrence
      • Result is cross-recommendation
  23. Fundamental Algorithmics
      • Cooccurrence: $K = A^T A$
      • $A$ is users x items, $K$ is items x items
      • Product has general shape of matrix
      • $K$ tells us “users who interacted with x also interacted with y”
  24. Fundamental Algorithmic Structure
      • Cooccurrence: $K = A^T A$
      • Matrix approximation by factoring: $A \approx U S V^T$, $K \approx V S^2 V^T$, $r = V S^2 V^T h$
      • LLR: $r = \mathrm{sparsify}(A^T A)\, h$
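To make the LLR form concrete, the sketch below applies the indicator matrix from the Threshold by Score slide to a hypothetical history vector. The `recommend` helper, and its choice to drop things the user has already touched, are assumptions for illustration.

```python
# Indicator matrix from the "Threshold by Score" slide (rows and columns t1..t4)
INDICATORS = [[1, 0, 0, 1],
              [0, 1, 0, 1],
              [0, 0, 1, 1],
              [1, 0, 0, 1]]


def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def recommend(indicators, h):
    """Rank things by r = indicators * h, skipping things already in h."""
    r = matvec(indicators, h)
    order = sorted(range(len(r)), key=lambda i: -r[i])
    return [i + 1 for i in order if r[i] > 0 and h[i] == 0]


h = [1, 0, 0, 0]  # hypothetical user who has touched t1 only
print(recommend(INDICATORS, h))
```

Because the indicator matrix is sparse and binary, scoring reduces to counting how many of a thing's indicators appear in the user's history, which is the kind of work a text search engine already does.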
  25. But Wait ... Does it have to be that way?
  26. But why not ...
      • Why just dyadic learning?
      • Why not triadic learning?
      • Why not cross learning?
      • Cooccurrence: $(A^T A) h$; cross-occurrence: $(B^T A) h$
  27. For example
      • Users enter queries ($A$)
        – (actor = user, item = query)
      • Users view videos ($B$)
        – (actor = user, item = video)
      • $A^T A$ gives query recommendation
        – “did you mean to ask for”
      • $B^T B$ gives video recommendation
        – “you might like these videos”
  28. The punch-line
      • $B^T A$ recommends videos in response to a query
        – (isn’t that a search engine?)
        – (not quite, it doesn’t look at content or meta-data)
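A toy cross-occurrence computation, with entirely made-up users, queries, and videos, sketches how $B^T A$ maps a query history onto video scores. Raw counts are used here for brevity; in the approach described in the talk the cross-occurrence matrix would be sparsified by LLR first.

```python
queries = ["paco de lucia", "van halen"]
videos = ["flamenco guitar", "rock solo", "cooking"]

# Hypothetical behavior of four users
A = [[1, 0], [1, 0], [0, 1], [1, 1]]              # users x queries
B = [[1, 0, 0], [1, 0, 1], [0, 1, 0], [1, 1, 0]]  # users x videos


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]


cross = matmul(transpose(B), A)  # videos x queries
h = [1, 0]                       # the user just entered the first query
r = matvec(cross, h)             # one score per video

best = videos[max(range(len(r)), key=lambda i: r[i])]
print(cross, r, best)
```

Nothing in this calculation looks at video titles or other meta-data; the mapping from query to video comes purely from what the same users did.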
  29. Real-life example
      • Query: “Paco de Lucia”
      • Conventional meta-data search results:
        – “hombres del paco” times 400
        – not much else
      • Recommendation based search:
        – Flamenco guitar and dancers
        – Spanish and classical guitar
        – Van Halen doing a classical/flamenco riff
  30. Real-life example
  31. Hypothetical Example
      • Want a navigational ontology?
      • Just put labels on a web page with traffic
        – This gives $A$ = users x label clicks
      • Remember viewing history
        – This gives $B$ = users x items
      • Cross recommend
        – $B^T A$ = label to item mapping
      • After several users click, results are whatever users think they should be
  32. But wait, there’s more!
  33. Diagram: matrix $A$, with rows indexed by users and columns by things
  34. Diagram: block matrix $[A_1 \; A_2]$, with rows indexed by users and columns by things of type 1 ($A_1$) and things of type 2 ($A_2$)
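  35. $$\begin{bmatrix} A_1 & A_2 \end{bmatrix}^T \begin{bmatrix} B_1 & B_2 \end{bmatrix} = \begin{bmatrix} A_1^T \\ A_2^T \end{bmatrix} \begin{bmatrix} B_1 & B_2 \end{bmatrix} = \begin{bmatrix} A_1^T B_1 & A_1^T B_2 \\ A_2^T B_1 & A_2^T B_2 \end{bmatrix}$$

      $$\begin{bmatrix} r_1 \\ r_2 \end{bmatrix} = \begin{bmatrix} A_1^T B_1 & A_1^T B_2 \\ A_2^T B_1 & A_2^T B_2 \end{bmatrix} \begin{bmatrix} h_1 \\ h_2 \end{bmatrix}$$

      $$r_1 = \begin{bmatrix} A_1^T B_1 & A_1^T B_2 \end{bmatrix} \begin{bmatrix} h_1 \\ h_2 \end{bmatrix}$$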
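The block identity can be checked numerically. In the sketch below all four blocks and both history vectors are made-up values for three users. It confirms that the first block of the full recommendation equals $A_1^T B_1 h_1 + A_1^T B_2 h_2$.

```python
# Hypothetical blocks for three users
A1 = [[1, 0], [1, 1], [0, 1]]  # users x type-1 things being recommended
A2 = [[1], [0], [1]]           # users x type-2 things being recommended
B1 = [[1], [1], [0]]           # users x type-1 behavior
B2 = [[0, 1], [1, 0], [1, 1]]  # users x type-2 behavior
h1, h2 = [1], [0, 1]           # the two parts of one user's history


def transpose(m):
    return [list(col) for col in zip(*m)]


def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def hstack(a, b):
    return [ra + rb for ra, rb in zip(a, b)]


# Full product: [A1 A2]^T [B1 B2] [h1; h2]
A, B = hstack(A1, A2), hstack(B1, B2)
r_full = matvec(transpose(A), matvec(B, h1 + h2))

# First block only: A1^T B1 h1 + A1^T B2 h2
partial = [x + y for x, y in zip(matvec(B1, h1), matvec(B2, h2))]
r1 = matvec(transpose(A1), partial)

print(r_full, r1)
```

Each kind of behavior contributes its own additive term to the score, so a new kind of behavior can be added without changing the terms already in place.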
  36. Summary
      • Input: Multiple kinds of behavior on one set of things
      • Output: Recommendations for one kind of behavior with a different set of things
      • Cross recommendation is a special case
  37. Now again, without the scary math
  38. Input Data
      • User transactions
        – user id, merchant id
        – SIC code, amount
        – Descriptions, cuisine, …
      • Offer transactions
        – user id, offer id
        – vendor id, merchant id’s,
        – offers, views, accepts
  39. Input Data
      • User transactions
        – user id, merchant id
        – SIC code, amount
        – Descriptions, cuisine, …
      • Offer transactions
        – user id, offer id
        – vendor id, merchant id’s,
        – offers, views, accepts
      • Derived user data
        – merchant id’s
        – anomalous descriptor terms
        – offer & vendor id’s
      • Derived merchant data
        – local top40
        – SIC code
        – vendor code
        – amount distribution
  40. Cross-recommendation
      • Per merchant indicators
        – merchant id’s
        – chain id’s
        – SIC codes
        – indicator terms from text
        – offer vendor id’s
      • Computed by finding anomalous (indicator => merchant) rates
  41. Search-based Recommendations
      • Sample document
        – Merchant Id
        – Field for text description
        – Phone
        – Address
        – Location
  42. Search-based Recommendations
      • Sample document
        – Merchant Id
        – Field for text description
        – Phone
        – Address
        – Location
        – Indicator merchant id’s
        – Indicator industry (SIC) id’s
        – Indicator offers
        – Indicator text
        – Local top40
  43. Search-based Recommendations
      • Sample document
        – Merchant Id
        – Field for text description
        – Phone
        – Address
        – Location
        – Indicator merchant id’s
        – Indicator industry (SIC) id’s
        – Indicator offers
        – Indicator text
        – Local top40
      • Sample query
        – Current location
        – Recent merchant descriptions
        – Recent merchant id’s
        – Recent SIC codes
        – Recent accepted offers
        – Local top40
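To illustrate the idea of recommendation as search, the toy below scores merchant documents by how many of their indicator values appear in the matching part of a user's recent history. It is not Solr and does not imitate Solr's scoring. The field names, ids, and codes are all made up for this sketch.

```python
# Hypothetical merchant documents; field names are illustrative only
docs = [
    {"merchant_id": "m1",
     "indicator_merchant_ids": {"m7", "m9"},
     "indicator_sic_ids": {"5812"},
     "indicator_offers": {"o3"}},
    {"merchant_id": "m2",
     "indicator_merchant_ids": {"m4"},
     "indicator_sic_ids": {"5411"},
     "indicator_offers": set()},
]

# Hypothetical query built from one user's recent history
query = {
    "indicator_merchant_ids": {"m7"},       # recent merchant ids
    "indicator_sic_ids": {"5812", "5411"},  # recent SIC codes
    "indicator_offers": {"o3"},             # recent accepted offers
}


def score(doc, query):
    """Count indicator values shared between the document and the query."""
    return sum(len(doc.get(field, set()) & values)
               for field, values in query.items())


ranked = sorted(docs, key=lambda d: score(d, query), reverse=True)
print([(d["merchant_id"], score(d, query)) for d in ranked])
```

A real search engine would weight the matches by field and term rarity, but the shape of the request is the same: recent behavior goes in as query terms against the indicator fields.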
  44. Diagram: Solr indexing. Components shown: complete history, cooccurrence (Mahout), item meta-data, two Solr indexers, index shards
  45. Diagram: Solr search. Components shown: user history, web tier, item meta-data, two Solr indexers, index shards
  46. Objective Results
      • At a very large credit card company
      • History is all transactions, all web interaction
      • Processing time cut from 20 hours per day to 3 hours
      • Recommendation engine load time decreased from 8 hours to 3 minutes
      • Recommendation quality increased visibly
  47. Contact:
        – tdunning@maprtech.com
        – @ted_dunning
      Slides and such (available late tonight):
        – http://www.slideshare.net/tdunning
      Hash tags: #mapr #recommendations
      We are hiring!
  48. Thank You