Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Building multi-modal recommendation engines using search engines

14,467 views

Published on

This is my strata NY talk about how to build recommendation engines using common items. In particular, I show how multi-modal recommendations can be built using the same framework.

Published in: Technology, Design
  • Go here now to unlock your dog's natural intelligence today. ➤➤ https://bit.ly/38b79Wm
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
  • DOWNLOAD THIS BOOKS INTO AVAILABLE FORMAT (2019 Update) ......................................................................................................................... ......................................................................................................................... Download Full PDF EBOOK here { https://soo.gd/irt2 } ......................................................................................................................... Download Full EPUB Ebook here { https://soo.gd/irt2 } ......................................................................................................................... Download Full doc Ebook here { https://soo.gd/irt2 } ......................................................................................................................... Download PDF EBOOK here { https://soo.gd/irt2 } ......................................................................................................................... Download EPUB Ebook here { https://soo.gd/irt2 } ......................................................................................................................... Download doc Ebook here { https://soo.gd/irt2 } ......................................................................................................................... ......................................................................................................................... ................................................................................................................................... eBook is an electronic version of a traditional print book THIS can be read by using a personal computer or by using an eBook reader. (An eBook reader can be a software application for use on a computer such as Microsoft's free Reader application, or a book-sized computer THIS is used solely as a reading device such as Nuvomedia's Rocket eBook.) Users can purchase an eBook on diskette or CD, but the most popular method of getting an eBook is to purchase a downloadable file of the eBook (or other reading material) from a Web site (such as Barnes and Noble) to be read from the user's computer or reading device. Generally, an eBook can be downloaded in five minutes or less ......................................................................................................................... .............. Browse by Genre Available eBooks .............................................................................................................................. Art, Biography, Business, Chick Lit, Children's, Christian, Classics, Comics, Contemporary, Cookbooks, Manga, Memoir, Music, Mystery, Non Fiction, Paranormal, Philosophy, Poetry, Psychology, Religion, Romance, Science, Science Fiction, Self Help, Suspense, Spirituality, Sports, Thriller, Travel, Young Adult, Crime, Ebooks, Fantasy, Fiction, Graphic Novels, Historical Fiction, History, Horror, Humor And Comedy, ......................................................................................................................... ......................................................................................................................... .....BEST SELLER FOR EBOOK RECOMMEND............................................................. ......................................................................................................................... Blowout: Corrupted Democracy, Rogue State Russia, and the Richest, Most Destructive Industry on Earth,-- The Ride of a Lifetime: Lessons Learned from 15 Years as CEO of the Walt Disney Company,-- Call Sign Chaos: Learning to Lead,-- StrengthsFinder 2.0,-- Stillness Is the Key,-- She Said: Breaking the Sexual Harassment Story THIS Helped Ignite a Movement,-- Atomic Habits: An Easy & Proven Way to Build Good Habits & Break Bad Ones,-- Everything Is Figureoutable,-- What It Takes: Lessons in the Pursuit of Excellence,-- Rich Dad Poor Dad: What the Rich Teach Their Kids About Money THIS the Poor and Middle Class Do Not!,-- The Total Money Makeover: Classic Edition: A Proven Plan for Financial Fitness,-- Shut Up and Listen!: Hard Business Truths THIS Will Help You Succeed, ......................................................................................................................... .........................................................................................................................
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here

Building multi-modal recommendation engines using search engines

  1. 1. Introduction to Mahout And How To Build a Recommender ©MapR Technologies 2013- Confidential 1
  2. 2. Topic For Today  What is recommendation?  What makes it different?  What is multi-model recommendation?  How can I build it using common household items? ©MapR Technologies 2013- Confidential 2
  3. 3. Oh … Also This  Detailed break-down of a live machine learning system running with Mahout on MapR  With code examples ©MapR Technologies 2013- Confidential 3
  4. 4. I may have to summarize ©MapR Technologies 2013- Confidential 4
  5. 5. I may have to summarize just a bit ©MapR Technologies 2013- Confidential 5
  6. 6. Part 1: 5 minutes of background ©MapR Technologies 2013- Confidential 6
  7. 7. Part 2: 5 minutes: I want a pony ©MapR Technologies 2013- Confidential 7
  8. 8. ©MapR Technologies 2013- Confidential 8
  9. 9. Part 1: 5 minutes of background ©MapR Technologies 2013- Confidential 9
  10. 10. What Does Machine Learning Look Like? ©MapR Technologies 2013- Confidential 10
  11. 11. What Does Machine Learning Look Like? é T ù T é A A ù é A A ù = ê A1 úé 2 û ë 1 2 û ë 1 ê AT úë ë 2 û é T A1 A1 =ê T ê A 2 A1 ë é r ù é AT A ê 1 ú=ê 1 1 ê r2 ú ê AT A1 ë û ë 2 k3 O(k2 k3 O(κ k d + d) = d log n + d) for small k, high quality O(κ d log k) or O(d log κ log k) for larger k, looser quality A1 A2 ù û ù T A1 A 2 ú AT A 2 ú 2 û ù T A1 A 2 úé h1 ù ê ú T úê h 2 ú A 2 A 2 ûë û é T T r1 = ê A1 A1 A1 A 2 ë é ù ùê h1 ú úê h ú û ë 2 û But tonight we’re going to show you how to keep it simple yet powerful… ©MapR Technologies 2013- Confidential 11
  12. 12. Recommendations as Machine Learning  Recommendation: – – – Involves observation of interactions between people taking action (users) and items for input data to the recommender model Goal is to suggest additional appropriate or desirable interactions Applications include: movie, music or map-based restaurant choices; suggesting sale items for e-stores or via cash-register receipts ©MapR Technologies 2013- Confidential 12
  13. 13. ©MapR Technologies 2013- Confidential 13
  14. 14. ©MapR Technologies 2013- Confidential 14
  15. 15. Part 2: How recommenders work (I still want a pony) ©MapR Technologies 2013- Confidential 15
  16. 16. Recommendations Recap: Behavior of a crowd helps us understand what individuals will do ©MapR Technologies 2013- Confidential 16
  17. 17. Recommendations Alice Charles ©MapR Technologies 2013- Confidential Alice got an apple and a puppy Charles got a bicycle 17
  18. 18. Recommendations Alice Bob Charles ©MapR Technologies 2013- Confidential Alice got an apple and a puppy Bob got an apple Charles got a bicycle 18
  19. 19. Recommendations Alice Bob ? What else would Bob like? Charles ©MapR Technologies 2013- Confidential 19
  20. 20. Recommendations Alice Bob A puppy, of course! Charles ©MapR Technologies 2013- Confidential 20
  21. 21. You get the idea of how recommenders work… (By the way, like me, Bob also wants a pony) ©MapR Technologies 2013- Confidential 21
  22. 22. Recommendations Alice What if everybody gets a pony? Bob Amelia ? What else would you recommend for Amelia? Charles ©MapR Technologies 2013- Confidential 22
  23. 23. Recommendations Alice Bob Amelia ? If everybody gets a pony, it’s not a very good indicator of what to else predict... Charles ©MapR Technologies 2013- Confidential 23
  24. 24. Problems with Raw Co-occurrence  Very popular items co-occur with everything (or why it’s not very helpful to know that everybody wants a pony…) –  Very widespread occurrence is not interesting as a way to generate indicators –  Examples: Welcome document; Elevator music Unless you want to offer an item that is constantly desired, such as razor blades (or ponies) What we want is anomalous co-occurrence – This is the source of interesting indicators of preference on which to base recommendation ©MapR Technologies 2013- Confidential 24
  25. 25. Get Useful Indicators from Behaviors Use log files to build history matrix of users x items 1. – Remember: this history of interactions will be sparse compared to all potential combinations 2. Transform to a co-occurrence matrix of items x items 3. Look for useful co-occurrence by looking for anomalous cooccurrences to make an indicator matrix – – Log Likelihood Ratio (LLR) can be helpful to judge which co-occurrences can with confidence be used as indicators of preference RowSimilarityJob in Apache Mahout uses LLR ©MapR Technologies 2013- Confidential 25
  26. 26. Log Files Alice Charles Charles Alice Alice Bob Bob ©MapR Technologies 2013- Confidential 26
  27. 27. Log Files u1 u2 t4 u2 t3 u1 t2 u1 t3 u3 t3 u3 ©MapR Technologies 2013- Confidential t1 t1 27
  28. 28. Log Files and Dimensions u1 t1 u2 t4 u2 t3 u1 t2 t1 u1 t3 t2 u3 t3 u3 t1 ©MapR Technologies 2013- Confidential Things Users u1 Alice u2 Charles u3 Bob 28 t3 t4
  29. 29. History Matrix: Users by Items Alice ✔ Bob ✔ Charles ©MapR Technologies 2013- Confidential ✔ ✔ ✔ ✔ 29 ✔
  30. 30. Co-occurrence Matrix: Items by Items How do you tell which co-occurrences are useful?. 1 2 1 1 2 ©MapR Technologies 2013- Confidential 1 0 - 0 1 1 30 0 0
  31. 31. Co-occurrence Matrix: Items by Items Use LLR test to turn co-occurrence into indicators… 1 2 1 1 2 ©MapR Technologies 2013- Confidential 1 0 - 0 1 1 31 0 0
  32. 32. Co-occurrence Binary Matrix not not ©MapR Technologies 2013- Confidential 1 1 32 1
  33. 33. Spot the Anomaly What conclusion do you draw from each situation? A not A B 13 1000 not B 1000 100,000 A not A B 1 0 not B 0 10,000 ©MapR Technologies 2013- Confidential A B 1 0 not B 0 2 A not A B 10 0 not B 33 not A 0 100,000
  34. 34. Spot the Anomaly What conclusion do you draw from each situation? A not A B 13 1000 not B 1000 100,000 A not A B 1 0 not B 0 10,000 0.90 4.52 A not A B 1 0 not B 0 2 A not A B 10 0 not B 0 100,000 1.95 14.3 Root LLR is roughly like standard deviations  In Apache Mahout, RowSimilarityJob uses LLR  ©MapR Technologies 2013- Confidential 34
  35. 35. Co-occurrence Matrix Recap: Use LLR test to turn co-occurrence into indicators 1 2 1 1 2 ©MapR Technologies 2013- Confidential 1 0 - 0 1 1 35 0 0
  36. 36. Indicator Matrix: Anomalous Co-Occurrence Result: The marked row will be added to the indicator field in the item document… ✔ ✔ ©MapR Technologies 2013- Confidential 36
  37. 37. Indicator Matrix That one row from indicator matrix becomes the indicator field in the Solr document used to deploy the recommendation engine. ✔ id: t4 title: puppy desc: The sweetest little puppy ever. keywords: puppy, dog, pet indicators: (t1) Note: data for the indicator field is added directly to meta-data for a document in Solr index. You don’t need to create a separate index for the indicators. ©MapR Technologies 2013- Confidential 37
  38. 38. Internals of the Recommender Engine 38 ©MapR Technologies 2013- Confidential 38
  39. 39. Internals of the Recommender Engine 39 ©MapR Technologies 2013- Confidential 39
  40. 40. Looking Inside LucidWorks Real-time recommendation query and results: Evaluation What to recommend if new user listened to 2122: Fats Domino & 303: Beatles? Recommendation is “1710 : Chuck Berry” 40 ©MapR Technologies 2013- Confidential 40
  41. 41. Search-based Recommendations  Sample document – – – – – Merchant Id Field for text description Phone Address Location ©MapR Technologies 2013- Confidential 41
  42. 42. Search-based Recommendations  Sample document – – – – – – – – – – Merchant Id Field for text description Phone Address Location Indicator merchant id’s Indicator industry (SIC) id’s Indicator offers Indicator text Local top40 ©MapR Technologies 2013- Confidential 42
  43. 43. Search-based Recommendations  Sample document – – – – –  Merchant Id Field for text description Phone Address Location Sample query – – – – – – – – – – – Indicator merchant id’s Indicator industry (SIC) id’s Indicator offers Indicator text Local top40 ©MapR Technologies 2013- Confidential 43 Current location Recent merchant descriptions Recent merchant id’s Recent SIC codes Recent accepted offers Local top40
  44. 44. Search-based Recommendations  Original data Sample document and meta-data – Merchant Id – – – –  Sample query – Field for text description Phone Address Location – – – – – – – – – – Current location Recent merchant descriptions Recent merchant id’s Recent SIC codes Recent accepted offers Local top40 Indicator merchant id’s Recommendation Indicator industry (SIC) id’s query Indicator offers Indicator text Derived from cooccurrence Local top40 and cross-occurrence analysis ©MapR Technologies 2013- Confidential 44
  45. 45. For example  Users enter queries (A) –  Users view videos (B) –  (actor = user, item=video) ATA gives query recommendation –  (actor = user, item=query) “did you mean to ask for” BTB gives video recommendation – “you might like these videos” ©MapR Technologies 2013- Confidential 45
  46. 46. The punch-line  BTA recommends videos in response to a query – – (isn’t that a search engine?) (not quite, it doesn’t look at content or meta-data) ©MapR Technologies 2013- Confidential 46
  47. 47. Real-life example  Query: “Paco de Lucia”  Conventional meta-data search results: – –  “hombres del paco” times 400 not much else Recommendation based search: – – – Flamenco guitar and dancers Spanish and classical guitar Van Halen doing a classical/flamenco riff ©MapR Technologies 2013- Confidential 47
  48. 48. Real-life example ©MapR Technologies 2013- Confidential 48
  49. 49. Hypothetical Example  Want a navigational ontology?  Just put labels on a web page with traffic –  Remember viewing history –  This gives B = users x items Cross recommend –  This gives A = users x label clicks B’A = label to item mapping After several users click, results are whatever users think they should be ©MapR Technologies 2013- Confidential 49
  50. 50. Nice. But we can do better? ©MapR Technologies 2013- Confidential 50
  51. 51. A Quick Simplification  Users who do h (a vector of things a user has done) Ah  A translates things into users Also do r A ( Ah) User-centric recommendations (transpose translates back to things) ( A A) h Item-centric recommendations (change the order of operations) T T ©MapR Technologies 2013- Confidential 51
  52. 52. Symmetry Gives Cross Recommentations ( A A) h T ( ) BT A h ©MapR Technologies 2013- Confidential Conventional recommendations with off-line learning Cross recommendations 52
  53. 53. things users ©MapR Technologies 2013- Confidential A 53
  54. 54. thing thing type 1 type 2 users ©MapR Technologies 2013- Confidential é A A ù 2 û ë 1 54
  55. 55. é A ë 1 é A 2 ù é A1 A 2 ù = ê û ë û ê ë é =ê ê ë é r ù é ê 1 ú=ê ê r2 ú ê ë û ë T T ù A1 úé A1 T úë A2 û A2 ù û ù T T A1 A1 A1 A 2 ú AT A1 AT A 2 ú 2 2 û ù T T A1 A1 A1 A 2 úé h1 ê T T A 2 A1 A 2 A 2 úê h 2 ûë é h é T ù 1 T r1 = ê A1 A1 A1 A 2 úê ë ûê h 2 ë ©MapR Technologies 2013- Confidential 55 ù ú ú û ù ú ú û
  56. 56. Part 3: What about that worked example? ©MapR Technologies 2013- Confidential 56
  57. 57. History collector (6) User behavior generator (1) Presentation tier (2) Diagnostic browsing (9) Cooccurrence analysis (7) Post to search engine (8) Search engine (4) Session collector (3) http://bit.ly/18vbbaT ©MapR Technologies 2013- Confidential 57 Metrics and logs (5)
  58. 58. Analyze with Map-Reduce Complete history SolR SolR Indexer Solr Indexer indexing Cooccurrence (Mahout) Item metadata ©MapR Technologies 2013- Confidential Index shards 58
  59. 59. Deploy with Conventional Search System User history SolR SolR Indexer Solr Indexer search Web tier Item metadata ©MapR Technologies 2013- Confidential Index shards 59
  60. 60. Me, Us  Ted Dunning, Chief Application Architect, MapR Committer PMC member, Mahout, Zookeeper, Drill Bought the beer at the first HUG  MapR Distributes more open source components for Hadoop Adds major technology for performance, HA, industry standard API’s  Info Hash tag - #mapr See also - @ApacheMahout @ApacheDrill @ted_dunning and @mapR ©MapR Technologies 2013- Confidential 60

×