Polyvalent Recommendations

Recent work in recommendations allows some really amazing simplicity of implementation while extending the inputs handled to multiple kinds of interactions against items different from the ones being recommended.

    Polyvalent Recommendations: Presentation Transcript

    • 1. Polyvalent Recommendations (©MapR Technologies - Confidential)
    • 2. Multiple Kinds of Behavior for Recommending Multiple Kinds of Things
    • 3.
        • Contact:
            – tdunning@maprtech.com
            – @ted_dunning
            – @apachemahout
            – user-subscribe@mahout.apache.org
        • Slides and such (available late tonight):
            – http://www.slideshare.net/tdunning
        • Hash tags: #mapr #recommendations
    • 4. Polyvalent recommendation is a new approach to recommendation that is both simpler and much more powerful than traditional approaches. The idea is that you can combine user, item and content recommendations into a single query that you can implement using a very simple architecture.
    • 5. Recommendations
        • Often known (inaccurately) as collaborative filtering
        • Actors interact with items
            – observe successful interaction
        • We want to suggest additional successful interactions
        • Observations inherently very sparse
    • 6. Examples
        • Customers buying books (Linden et al.)
        • Web visitors rating music (Shardanand and Maes) or movies (Riedl et al.; Netflix)
        • Internet radio listeners not skipping songs (Musicmatch)
        • Internet video watchers watching >30 s
    • 7. Dyadic Structure
        • Functional
            – Interaction: actor -> item*
        • Relational
            – Interaction ⊆ Actors x Items
        • Matrix
            – Rows indexed by actor, columns by item
            – Value is count of interactions
        • Predict missing observations
    • 8. Recommendation Basics
        • History:
              User  Thing
              1     3
              2     4
              3     4
              2     3
              3     2
              1     1
              2     1
    • 9. Recommendation Basics
        • History as matrix:
                  t1  t2  t3  t4
              u1   1   0   1   0
              u2   1   0   1   1
              u3   0   1   0   1
        • (t1, t3) cooccur 2 times, (t1, t4) once, (t2, t4) once
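The step from the history list to the matrix and its cooccurrence counts can be sketched in a few lines of plain Python. This is a minimal illustration, not the deck's implementation; the function names `history_matrix` and `cooccurrence` are made up for this example, and the history pairs are the (user, thing) rows from the previous slide.

```python
# (user, thing) pairs, read row by row from the history table.
history = [(1, 3), (2, 4), (3, 4), (2, 3), (3, 2), (1, 1), (2, 1)]

def history_matrix(pairs, n_users, n_things):
    """Build the users x things count matrix A (ids are 1-based)."""
    a = [[0] * n_things for _ in range(n_users)]
    for user, thing in pairs:
        a[user - 1][thing - 1] += 1
    return a

def cooccurrence(a):
    """Compute K = A^T A: K[i][j] sums, over users, the product of the
    counts for thing i and thing j (diagonal entries are usage counts)."""
    n_things = len(a[0])
    return [[sum(row[i] * row[j] for row in a) for j in range(n_things)]
            for i in range(n_things)]

A = history_matrix(history, n_users=3, n_things=4)
K = cooccurrence(A)

for name, row in zip(["t1", "t2", "t3", "t4"], K):
    print(name, row)
```

Because every entry of A here is 0 or 1, each off-diagonal entry of K is simply the number of users who interacted with both things.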
    • 10. A Quick Simplification
        • Users who do h: $Ah$
        • Also do r: $A^T (Ah)$ (user-centric recommendations)
        • Regrouped: $(A^T A) h$ (item-centric recommendations)
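The regrouping on this slide is just associativity of matrix multiplication, which a small check makes concrete. The sketch below assumes the 3-user history matrix from the "History as matrix" slide and a hypothetical history vector `h` for a user who has only touched t4; the helper names are illustrative.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def matmul(a, b):
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols]
            for row in a]

# users x things matrix from the "History as matrix" slide
A = [[1, 0, 1, 0],
     [1, 0, 1, 1],
     [0, 1, 0, 1]]

h = [0, 0, 0, 1]  # hypothetical history: only t4

# User-centric: find users who did h, then see what else they did.
user_centric = matvec(transpose(A), matvec(A, h))

# Item-centric: precompute the item-item matrix once, then apply it to h.
item_centric = matvec(matmul(transpose(A), A), h)

print(user_centric, item_centric)
```

Both orders give the same scores; the item-centric form is attractive because $A^T A$ can be computed offline and reused for every query.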
    • 11. Recommendation Basics
        • Cooccurrence
                  t1  t2  t3  t4
              t1   2   0   2   1
              t2   0   1   0   1
              t3   2   0   2   1
              t4   1   1   1   2
    • 12. Problems with Raw Cooccurrence
        • Very popular items co-occur with everything
            – Welcome document
            – Elevator music
        • That isn’t interesting
            – We want anomalous cooccurrence
    • 13. Recommendation Basics
        • Cooccurrence
                  t1  t2  t3  t4
              t1   2   0   2   1
              t2   0   1   0   1
              t3   2   0   2   1
              t4   1   1   1   2

                      t3  not t3
              t1       2    1
              not t1   1    1
    • 14. Root LLR Details
        • In R
              # k is a matrix of counts; (k==0) turns log(0) into log(1) = 0 for empty cells
              entropy = function(k) {
                -sum(k*log((k==0)+(k/sum(k))))
              }
              # row entropy + column entropy - matrix entropy, halved, then square root
              rootLLr = function(k) {
                sqrt((entropy(rowSums(k)) + entropy(colSums(k)) - entropy(k))/2)
              }
        • Like sqrt(mutual information * N/2)
    • 15. Spot the Anomaly
        • Root LLR is roughly like standard deviations
              (A, B)  (not A, B)  (A, not B)  (not A, not B)  root LLR
              13      1000        1000        100,000         0.44
              1       0           0           2               0.98
              1       0           0           10,000          2.26
              10      0           0           100,000         7.15
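For readers who do not use R, here is a rough Python port of the two R functions from the "Root LLR Details" slide, applied to the four tables on this slide. It keeps the slide's scaling (the division by 2 inside the square root); the names `entropy` and `root_llr` and the flat four-argument signature are choices made for this sketch.

```python
import math

def entropy(counts):
    """Unnormalized entropy: -sum(k * log(k / N)); zero cells contribute 0."""
    total = sum(counts)
    return -sum(k * math.log(k / total) for k in counts if k > 0)

def root_llr(k11, k12, k21, k22):
    """Root LLR for a 2x2 table [[k11, k12], [k21, k22]], slide's scaling."""
    row = entropy([k11 + k12, k21 + k22])
    col = entropy([k11 + k21, k12 + k22])
    mat = entropy([k11, k12, k21, k22])
    # max() guards against a tiny negative value from floating-point rounding
    return math.sqrt(max(0.0, row + col - mat) / 2)

tables = [
    (13, 1000, 1000, 100000),
    (1, 0, 0, 2),
    (1, 0, 0, 10000),
    (10, 0, 0, 100000),
]
scores = [root_llr(*t) for t in tables]
for t, s in zip(tables, scores):
    print(t, round(s, 2))
```

The first table has by far the largest counts but the lowest score, because 13 joint occurrences is close to what independence would predict given how common A and B each are.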
    • 16. Threshold by Score
        • Cooccurrence
                  t1  t2  t3  t4
              t1   2   0   2   1
              t2   0   1   0   1
              t3   2   0   2   1
              t4   1   1   1   2
    • 17. Threshold by Score
        • Significant cooccurrence => Indicators
                  t1  t2  t3  t4
              t1   1   0   0   1
              t2   0   1   0   1
              t3   0   0   1   1
              t4   1   0   0   1
    • 18. Decomposition for Cooccurrence
        • Can use SVD for cooccurrence
        • But first one or two singular vectors just encode popularity … ignore those
        • $V^T$ projects items into concept space, $V$ projects back into item space
        • Thresholding reconstructed cooccurrence matrix is another way to get indicators
        • $A^T A = (U S V^T)^T (U S V^T) = V S^2 V^T$
    • 19. What’s right about this?
    • 20. Virtues of Current State of the Art
        • Lots of well publicized history
            – Netflix, Amazon, Overstock
        • Lots of support
            – Mahout, commercial offerings like Myrrix
        • Lots of existing code
            – Mahout, commercial codes
        • Proven track record
        • Well socialized solution
    • 21. What’s wrong about this?
    • 22. Cross Occurrence
        • We don’t have to do co-occurrence
        • We can do cross-occurrence
        • Result is cross-recommendation
    • 23. Fundamental Algorithmics
        • Cooccurrence: $K = A^T A$
        • A is users x items, K is items x items
        • Product has general shape of matrix
        • K tells us “users who interacted with x also interacted with y”
    • 24. Fundamental Algorithmic Structure
        • Cooccurrence: $K = A^T A$
        • Matrix approximation by factoring: $A \approx U S V^T$, $K \approx V S^2 V^T$, $r = V S^2 V^T h$
        • LLR: $r = \mathrm{sparsify}(A^T A)\, h$
    • 25. But Wait ... Does it have to be that way?
    • 26. But why not ...
        • Why just dyadic learning?
        • Why not triadic learning?
        • Why not cross learning?
        • $(A^T A) h$, $(B^T A) h$
    • 27. For example
        • Users enter queries (A)
            – (actor = user, item = query)
        • Users view videos (B)
            – (actor = user, item = video)
        • A’A gives query recommendation
            – “did you mean to ask for”
        • B’B gives video recommendation
            – “you might like these videos”
    • 28. The punch-line
        • B’A recommends videos in response to a query
            – (isn’t that a search engine?)
            – (not quite, it doesn’t look at content or meta-data)
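To make B’A concrete, here is a toy sketch with invented data: three users, two queries and three videos. The matrices, the query vector `h` and the helper names are all hypothetical, and the raw cross-occurrence counts are used directly, without the LLR thresholding described on the earlier slides.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols]
            for row in a]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

# A: users x queries (hypothetical): users 0 and 1 entered query 0,
# user 2 entered query 1.
A = [[1, 0],
     [1, 0],
     [0, 1]]

# B: users x videos (hypothetical viewing history).
B = [[1, 1, 0],
     [1, 0, 0],
     [0, 0, 1]]

# Cross-occurrence: videos x queries. Entry [v][q] counts users who both
# entered query q and viewed video v.
cross = matmul(transpose(B), A)

h = [1, 0]  # the current user has just entered query 0
scores = matvec(cross, h)
ranked_videos = sorted(range(len(scores)), key=lambda v: -scores[v])
print(cross, scores, ranked_videos)
```

Nothing here inspects the text of the query or the video meta-data; the ranking comes entirely from the behavior of users who issued the same query.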
    • 29. Real-life example
        • Query: “Paco de Lucia”
        • Conventional meta-data search results:
            – “hombres del paco” times 400
            – not much else
        • Recommendation based search:
            – Flamenco guitar and dancers
            – Spanish and classical guitar
            – Van Halen doing a classical/flamenco riff
    • 30. Real-life example
    • 31. Hypothetical Example
        • Want a navigational ontology?
        • Just put labels on a web page with traffic
            – This gives A = users x label clicks
        • Remember viewing history
            – This gives B = users x items
        • Cross recommend
            – B’A = label to item mapping
        • After several users click, results are whatever users think they should be
    • 32. But wait, there’s more!
    • 33. [Diagram: matrix $A$, rows indexed by users, columns by things]
    • 34. [Diagram: block matrix $[A_1 \; A_2]$, rows indexed by users, column blocks for thing type 1 and thing type 2]
    • 35.
        $$[A_1 \; A_2]^T [B_1 \; B_2] = \begin{bmatrix} A_1^T \\ A_2^T \end{bmatrix} [B_1 \; B_2] = \begin{bmatrix} A_1^T B_1 & A_1^T B_2 \\ A_2^T B_1 & A_2^T B_2 \end{bmatrix}$$
        $$\begin{bmatrix} r_1 \\ r_2 \end{bmatrix} = \begin{bmatrix} A_1^T B_1 & A_1^T B_2 \\ A_2^T B_1 & A_2^T B_2 \end{bmatrix} \begin{bmatrix} h_1 \\ h_2 \end{bmatrix}$$
        $$r_1 = \begin{bmatrix} A_1^T B_1 & A_1^T B_2 \end{bmatrix} \begin{bmatrix} h_1 \\ h_2 \end{bmatrix}$$
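The last line of this slide says that recommendations for one kind of thing are a sum of cross-occurrence terms, one per kind of observed behavior. A small numeric sketch, with made-up matrices for three users, shows that summing the per-block terms matches multiplying by the concatenated matrix; all data and names below are hypothetical.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols]
            for row in a]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

# Hypothetical data for 3 users.
A1 = [[1, 0], [1, 1], [0, 1]]   # users x things we want to recommend
B1 = [[1, 0], [0, 1], [0, 1]]   # users x items for behavior type 1
B2 = [[1], [1], [0]]            # users x items for behavior type 2

h1 = [1, 0]                     # current user's history for behavior type 1
h2 = [1]                        # current user's history for behavior type 2

A1t = transpose(A1)

# Block form: r1 = (A1^T B1) h1 + (A1^T B2) h2
term1 = matvec(matmul(A1t, B1), h1)
term2 = matvec(matmul(A1t, B2), h2)
r1 = [x + y for x, y in zip(term1, term2)]

# Same result from the concatenated matrix [B1 B2] and stacked history [h1; h2].
B = [row1 + row2 for row1, row2 in zip(B1, B2)]
r1_concat = matvec(matmul(A1t, B), h1 + h2)

print(r1, r1_concat)
```

In practice each block could be sparsified separately before the sum, which is one reason to keep the behaviors as separate fields rather than one wide matrix.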
    • 36. Summary
        • Input: Multiple kinds of behavior on one set of things
        • Output: Recommendations for one kind of behavior with a different set of things
        • Cross recommendation is a special case
    • 37. Now again, without the scary math
    • 38. Input Data
        • User transactions
            – user id, merchant id
            – SIC code, amount
            – Descriptions, cuisine, …
        • Offer transactions
            – user id, offer id
            – vendor id, merchant ids
            – offers, views, accepts
    • 39. Input Data
        • User transactions
            – user id, merchant id
            – SIC code, amount
            – Descriptions, cuisine, …
        • Offer transactions
            – user id, offer id
            – vendor id, merchant ids
            – offers, views, accepts
        • Derived user data
            – merchant ids
            – anomalous descriptor terms
            – offer & vendor ids
        • Derived merchant data
            – local top40
            – SIC code
            – vendor code
            – amount distribution
    • 40. Cross-recommendation
        • Per merchant indicators
            – merchant ids
            – chain ids
            – SIC codes
            – indicator terms from text
            – offer vendor ids
        • Computed by finding anomalous (indicator => merchant) rates
    • 41. Search-based Recommendations
        • Sample document
            – Merchant Id
            – Field for text description
            – Phone
            – Address
            – Location
    • 42. Search-based Recommendations
        • Sample document
            – Merchant Id
            – Field for text description
            – Phone
            – Address
            – Location
            – Indicator merchant ids
            – Indicator industry (SIC) ids
            – Indicator offers
            – Indicator text
            – Local top40
    • 43. Search-based Recommendations
        • Sample document
            – Merchant Id
            – Field for text description
            – Phone
            – Address
            – Location
            – Indicator merchant ids
            – Indicator industry (SIC) ids
            – Indicator offers
            – Indicator text
            – Local top40
        • Sample query
            – Current location
            – Recent merchant descriptions
            – Recent merchant ids
            – Recent SIC codes
            – Recent accepted offers
            – Local top40
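The document-and-query pairing on this slide can be mimicked in plain Python to show the idea: each merchant document carries indicator fields, and the query fills the same fields with the user's recent history. This is only a sketch of the matching logic, not of any search engine's API or scoring; the field names, ids, codes and the overlap-count score are all invented for illustration.

```python
# Hypothetical merchant documents with indicator fields.
documents = [
    {
        "merchant_id": "m1",
        "indicator_merchant_ids": {"m7", "m9"},
        "indicator_sic_codes": {"5812"},
        "indicator_offers": {"o3"},
    },
    {
        "merchant_id": "m2",
        "indicator_merchant_ids": {"m4"},
        "indicator_sic_codes": {"5411"},
        "indicator_offers": set(),
    },
]

# Hypothetical query: the user's recent behavior, keyed by the indicator
# field each kind of behavior should be matched against.
query = {
    "indicator_merchant_ids": {"m7", "m4"},  # recent merchant ids
    "indicator_sic_codes": {"5812"},         # recent SIC codes
    "indicator_offers": {"o3"},              # recent accepted offers
}

def score(doc, query):
    """Count indicator terms shared between the document and the query."""
    return sum(len(doc.get(field, set()) & terms)
               for field, terms in query.items())

ranked = sorted(documents, key=lambda d: score(d, query), reverse=True)
print([(d["merchant_id"], score(d, query)) for d in ranked])
```

A real search engine would weight fields and terms rather than count raw overlaps, but the shape is the same: one query spanning several behavior fields returns one ranked list of merchants.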
    • 44. [Diagram: Solr indexing. Labels: Complete history, Cooccurrence (Mahout), Item meta-data, Solr Indexer (x2), Index shards]
    • 45. [Diagram: Solr search. Labels: User history, Web tier, Item meta-data, Solr Indexer (x2), Index shards]
    • 46. Objective Results
        • At a very large credit card company
        • History is all transactions, all web interaction
        • Processing time cut from 20 hours per day to 3
        • Recommendation engine load time decreased from 8 hours to 3 minutes
        • Recommendation quality increased visibly
    • 47.
        • Contact:
            – tdunning@maprtech.com
            – @ted_dunning
        • Slides and such (available late tonight):
            – http://www.slideshare.net/tdunning
        • Hash tags: #mapr #recommendations
        • We are hiring!
    • 48. Thank You