
A recommendation engine for your PHP application

Nowadays a lot of websites try to guess what we might like: "Recommendations for you in books", "People you may like". Sound familiar? Wouldn't it be cool if you could do the same in your application? Well, this session is for you! The first part of this talk introduces recommender systems, focusing on collaborative filtering (CF) algorithms. After that we'll dive into PredictionIO, an open source machine learning server that lets software developers create predictive features such as personalization, recommendation and content discovery. The last part covers the details of integrating it with a PHP application.

Published in: Software

  1. 1. A recommendation engine for your PHP apps
  2. 2. 1) Intro to recommender systems 2) PredictionIO 3) Case Study
  3. 3. Definition: a system that helps people find things when finding what they need is hard because there are many choices or alternatives
  4. 4. So… it’s a search engine!
  5. 5. Search Engines Document base is (almost) static Queries are dynamic
  6. 6. Search Engines Create an index analyzing the documents Calculate relevance for a query: tf*idf
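The tf*idf relevance mentioned on this slide can be sketched as follows. This is a minimal illustration, assuming raw term counts and an unsmoothed `log(N / df)` weighting; real search engines use more refined variants. The three-document corpus and the function name `tf_idf` are made up for the example.

```python
import math
from collections import Counter

def tf_idf(term, doc_tokens, corpus):
    """Score `term` for one document: raw term count times log(N / df)."""
    tf = Counter(doc_tokens)[term]
    # df = number of documents in the corpus that contain the term
    df = sum(1 for doc in corpus if term in doc)
    if tf == 0 or df == 0:
        return 0.0
    return tf * math.log(len(corpus) / df)

# Hypothetical corpus, already tokenized.
corpus = [
    ["php", "recommendation", "engine"],
    ["php", "search", "engine"],
    ["cooking", "recipes"],
]

rare = tf_idf("recommendation", corpus[0], corpus)
common = tf_idf("php", corpus[0], corpus)
print(rare > common)  # a term found in fewer documents weighs more
```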
  7. 7. Recommender systems Document base is growing (eg: Netflix) Query is static: find something I like
  8. 8. Classification
  9. 9. Domain: news, products, … Helps define what can be suggested
  10. 10. Purpose: sales, information, education, build a community What is TripAdvisor's purpose?
  11. 11. Personalization levels • Non personalized: best sellers • Demographic: age, location • Ephemeral: based on current activities • Persistent
  12. 12. Types of input • Explicit: ask user to rate something • Implicit: inferred from user behaviour
  13. 13. Output • Prediction: predicted rating, evaluation • Recommendations: suggestion list, top-n, offers, promotion • Filtering: email filters, news articles
  14. 14. A model for comparison
  15. 15. Users: people with preferences Items: subjects of rating Rating: expression of opinion (Community: space where opinions make sense)
  16. 16. Non-personalized
  17. 17. Best seller Most popular Trending Summary of community ratings: eg best hotel in town
  18. 18. Hotel
  19. 19. Visitor Hotel
  21. 21.
         Hotel A  Hotel B  Hotel C
  John   3        5        -
  Jane   -        3        -
  Fred   -        1        0
  Tom    4        -        -
  AVG    3.5      3        0
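A non-personalized summary like the AVG row can be computed with a few lines of code. The sketch below assumes the placement of ratings that is consistent with the averages on the slide (John rated Hotel A and B, Jane rated B, Fred rated B and C, Tom rated A); the function name is illustrative.

```python
# Ratings per visitor; the cell placement is an assumption derived from
# the averages shown on the slide (3.5, 3, 0).
ratings = {
    "John": {"Hotel A": 3, "Hotel B": 5},
    "Jane": {"Hotel B": 3},
    "Fred": {"Hotel B": 1, "Hotel C": 0},
    "Tom": {"Hotel A": 4},
}

def average_ratings(ratings):
    """Average every item's ratings over the users who rated it."""
    per_item = {}
    for user_ratings in ratings.values():
        for item, value in user_ratings.items():
            per_item.setdefault(item, []).append(value)
    return {item: sum(values) / len(values) for item, values in per_item.items()}

averages = average_ratings(ratings)
print(averages)
```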
  22. 22. Content based
  23. 23. Users rate items We build a model of user preferences Look for similar items based on the model
  24. 24. Action 0.7 Sci Fi 3.2 Vin Diesel 1.2 … … https://www.amazon.com/Relevant-Search-applications-Solr-Elasticsearch/dp/161729277X http://www.slideshare.net/treygrainger/building-a-real-time-solrpowered-recommendation-engine
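One way to read the weights on this slide is as a user preference profile that is matched against item features. The sketch below is a simplified illustration, not the method of the linked resources: it scores each item by summing the profile weights of the features it has. The catalog and the movie titles are hypothetical.

```python
# Profile weights taken from the slide.
profile = {"Action": 0.7, "Sci Fi": 3.2, "Vin Diesel": 1.2}

# Hypothetical catalog: each item is described by a set of features.
catalog = {
    "Movie X": {"Action", "Vin Diesel"},
    "Movie Y": {"Sci Fi"},
    "Movie Z": {"Romance"},
}

def score(profile, features):
    """Sum the user's weights for the features an item has."""
    return sum(profile.get(feature, 0.0) for feature in features)

def rank(profile, catalog):
    """Order item titles from best to worst match for the profile."""
    return sorted(catalog, key=lambda title: score(profile, catalog[title]), reverse=True)

ranking = rank(profile, catalog)
print(ranking)
```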
  25. 25. Problems/Limitations
  26. 26. Need to know item content User cold start: it takes time to learn which features matter to the user What if the user's interests change? Lack of serendipity: accidentally discovering something you like
  27. 27. Collaborative filtering
  28. 28. No need to analyze (index) content Can capture more subtle things Serendipity
  29. 29. User-User Select people of my neighborhood with similar taste. If other people share my taste I want their opinion combined
  30. 30. E.T 2 4 Joe 2 2 3 ? 1 5 5 2 4 … Tom 3 3 2 4 1 User-User: which users have similar tastes?
  32. 32. Item-Item Find an item I have expressed an opinion about and look at how other people felt about it. Precompute similarities between items
  33. 33. E.T 2 4 Joe 2 2 3 ? 1 5 5 2 4 … Tom 3 3 4 1 Item-Item: which items are similar?
  34. 34. Problems/Limitations
  35. 35. Sparsity When recommending from a large item set, users will have rated only some of the items
  36. 36. User Cold start Not enough known about new user to decide who is similar
  37. 37. Item cold start Cannot predict ratings for new item till some similar users have rated it [No problem for content-based]
  38. 38. Scalability With millions of ratings, computations become slow
  39. 39. Dimensionality reduction
  40. 40. Express my opinions as a set of tastes Compact representation of the matrix with relevant features
  41. 41. Rogue One 1 3 5 Joe 1 2 3
  42. 42. An example
  43. 43.
         Item1  Item2  Item3  Item4  Item5
  Joe    8      1      ?      2      7
  Tom    2      ?      5      7      5
  Alice  5      4      7      4      7
  Bob    7      1      7      3      8
  How similar are Joe and Tom? How similar are Joe and Bob?
  44. 44. Only consider items both users have rated
  For each item:
  - Calculate the difference between the users' ratings
  - Take the average of this difference over the items
  45. 45. Sim(Joe, Tom) = (|8-2| + |2-7| + |7-5|)/3 = 13/3 ≈ 4.33
  Sim(Alice, Bob) = (|5-7| + |4-1| + |7-7| + |4-3| + |7-8|)/5 = 7/5 = 1.4
  The score is an average rating difference, so lower values mean more similar tastes
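The user-user calculation can be sketched in code as a mean absolute difference over the items both users rated. Note that counting all five items Alice and Bob share, including Item3 where the difference is 0, gives 7/5 = 1.4. The function name and the dictionary layout are illustrative choices; unknown ratings ("?") are simply left out.

```python
# Rating matrix from the slides; "?" cells are omitted.
RATINGS = {
    "Joe": {"Item1": 8, "Item2": 1, "Item4": 2, "Item5": 7},
    "Tom": {"Item1": 2, "Item3": 5, "Item4": 7, "Item5": 5},
    "Alice": {"Item1": 5, "Item2": 4, "Item3": 7, "Item4": 4, "Item5": 7},
    "Bob": {"Item1": 7, "Item2": 1, "Item3": 7, "Item4": 3, "Item5": 8},
}

def user_distance(a, b, ratings=RATINGS):
    """Mean absolute rating difference over items rated by both users.

    Lower means more similar. Returns None when the users share no item.
    """
    common = ratings[a].keys() & ratings[b].keys()
    if not common:
        return None
    return sum(abs(ratings[a][i] - ratings[b][i]) for i in common) / len(common)

print(user_distance("Joe", "Tom"))
print(user_distance("Alice", "Bob"))
print(user_distance("Joe", "Bob"))
```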
  46. 46. Now we have a score or weight for each user
  47. 47. Recommend what similar users have rated highly To calculate the rating of an item to recommend, weight each user's ratings by how similar that user is to you.
  48. 48. Use the entire matrix, or use a k-NN algorithm to pick the people who historically have the same tastes as me Aggregate their ratings with a weighted sum; the weights depend on similarity
  49. 49.
         Item1  Item2  Item3  Item4  Item5
  Joe    8      1      ?      2      7
  Tom    2      ?      5      7      5
  Alice  5      4      7      4      7
  Bob    7      1      7      3      8
  How similar are Item1 and Item2? How similar are Item1 and Item3?
  50. 50. Only consider users who have rated both items
  For each user:
  - Calculate the difference in ratings for the 2 items
  - Take the average of this difference over the users
  51. 51. Sim(I1, I2) = (|8-1| + |5-4| + |7-1|)/3 = 14/3 ≈ 4.67
  Sim(I1, I3) = (|2-5| + |5-7| + |7-7|)/3 = 5/3 ≈ 1.67
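The item-item variant applies the same average difference to columns instead of rows: only users who rated both items contribute. A minimal sketch, with illustrative names:

```python
# Rating matrix from the slides; "?" cells are omitted.
RATINGS = {
    "Joe": {"Item1": 8, "Item2": 1, "Item4": 2, "Item5": 7},
    "Tom": {"Item1": 2, "Item3": 5, "Item4": 7, "Item5": 5},
    "Alice": {"Item1": 5, "Item2": 4, "Item3": 7, "Item4": 4, "Item5": 7},
    "Bob": {"Item1": 7, "Item2": 1, "Item3": 7, "Item4": 3, "Item5": 8},
}

def item_distance(item_a, item_b, ratings=RATINGS):
    """Mean absolute rating difference over users who rated both items."""
    diffs = [
        abs(user_ratings[item_a] - user_ratings[item_b])
        for user_ratings in ratings.values()
        if item_a in user_ratings and item_b in user_ratings
    ]
    if not diffs:
        return None
    return sum(diffs) / len(diffs)

print(item_distance("Item1", "Item2"))
print(item_distance("Item1", "Item3"))
```

Because items usually change less often than users, these pairwise values are good candidates for precomputation.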
  52. 52. As with user-user, use the whole matrix or identify neighbors
  53. 53. Cosine similarity [3,5] [2,7] [0,0]
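Cosine similarity measures the angle between two rating vectors rather than the difference between their values. The sketch below uses the three vectors from the slide; how the slide intended them to be compared is not stated, and returning 0.0 for a zero vector is a convention chosen here because the cosine is undefined in that case.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 when either vector has zero length (an assumption:
    the value is mathematically undefined there).
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

print(cosine_similarity([3, 5], [2, 7]))
print(cosine_similarity([3, 5], [0, 0]))
```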
  54. 54. Our domain
  55. 55. Domain: online book shop, both paper and digital Recommend titles, old and new - Who bought this also bought - You might like
  56. 56. Choosing the tool
  57. 57. PredictionIO
  58. 58. Under the Apache umbrella Based on a solid open source stack Customizable engine templates SDK for PHP
  59. 59. Installation http://actionml.com/docs/pio_by_actionml Pre-baked Amazon AMIs
  60. 60. Installation via source code http://predictionio.incubator.apache.org/install/install-sourcecode/
  61. 61. You can choose the storage: MySQL/PostgreSQL vs Elasticsearch + HBase
  62. 62. The event server
  63. 63. Pattern: user -- action -- item User 1 purchased product X User 2 viewed product Y User 1 added product Z in the cart
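The user -- action -- item pattern maps onto a small JSON document per event. The sketch below builds such a payload, assuming the field names of the PredictionIO event API (`event`, `entityType`, `entityId`, `targetEntityType`, `targetEntityId`, `eventTime`); the helper `make_event` is hypothetical, and nothing is sent to a server here.

```python
import json
from datetime import datetime, timezone

def make_event(event, user_id, item_id, event_time=None):
    """Build one "user did something to an item" event as a dict."""
    event_time = event_time or datetime.now(timezone.utc)
    return {
        "event": event,               # e.g. "buy", "view"
        "entityType": "user",
        "entityId": str(user_id),
        "targetEntityType": "item",
        "targetEntityId": str(item_id),
        "eventTime": event_time.isoformat(),
    }

# "User 1 purchased product X"
purchase = make_event("buy", 1, "X", datetime(2017, 1, 1, tzinfo=timezone.utc))
print(json.dumps(purchase, indent=2))
```

A payload like this would typically be POSTed to the event server's `events.json` endpoint together with the app's access key.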
  64. 64. $ pio app new MyApp1
 [INFO] [App$] Initialized Event Store for this app ID: 1.
 [INFO] [App$] Created new app:
 [INFO] [App$] Name: MyApp1
 [INFO] [App$] ID: 1
 [INFO] [App$] Access Key: 3mZWDzci2D5YsqAnqNnXH9SB6Rg3dsTBs8iHkK6X2i54IQsIZI1eEeQQyMfs7b3F
 $ pio eventserver
  65. 65. Server runs on port 7070 by default
 $ curl -i -X GET http://localhost:7070
 {"status":"alive"}
  66. 66. $ curl -i -X GET "http://localhost:7070/events.json?accessKey=$ACCESS_KEY"
  67. 67. Event modeling: what can/should we model? Rate, like, buy, view, depending on the algorithm
  68. 68. $set, $unset and $delete; _pio* names are reserved
  69. 69. setUser($uid, array $properties=array(), $eventTime=null)
 unsetUser($uid, array $properties, $eventTime=null)
 deleteUser($uid, $eventTime=null)
 setItem($iid, array $properties=array(), $eventTime=null)
 unsetItem($iid, array $properties, $eventTime=null)
 deleteItem($iid, $eventTime=null)
 recordUserActionOnItem($event, $uid, $iid, array $properties=array(), $eventTime=null)
 createEvent(array $data)
 getEvent($eventId)
  70. 70. Engines
  71. 71. D.A.S.E Architecture Data Source and Preparation Algorithm Serving Evaluation
  72. 72. $ pio template get apache/incubator-predictionio-template-recommender MyRecommendation
 $ cd MyRecommendation
  73. 73. engine.json
 "datasource": {
   "params": {
     "appName": "MyApp1",
     "eventNames": ["buy", "view"]
   }
 },
  74. 74. $ pio build --verbose
 $ pio train
 $ pio deploy
  75. 75. Getting recommendations
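Once an engine is deployed, recommendations are obtained by sending it a JSON query. The sketch below only builds the request and parses a response; it assumes the recommender template's query shape (`user`, `num`), an `itemScores` list in the reply, and a deployed engine listening on `localhost:8000`. The sample response is made up for illustration and is not real output.

```python
import json
import urllib.request

ENGINE_URL = "http://localhost:8000/queries.json"  # assumed deploy address

def build_query(user_id, num):
    """Prepare (but do not send) a POST request asking for `num` items."""
    body = json.dumps({"user": str(user_id), "num": num}).encode("utf-8")
    return urllib.request.Request(
        ENGINE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def parse_item_scores(response_body):
    """Turn the engine's JSON reply into a list of (item, score) pairs."""
    payload = json.loads(response_body)
    return [(entry["item"], entry["score"]) for entry in payload.get("itemScores", [])]

# Made-up reply, only to show the assumed shape.
sample_response = '{"itemScores": [{"item": "22", "score": 4.07}, {"item": "62", "score": 4.05}]}'

request = build_query(1, 4)
print(request.get_method(), request.full_url)
print(parse_item_scores(sample_response))
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` while the engine is running.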
  76. 76. Implementation
  77. 77. 2 kinds of suggestions - who bought this also bought (recommendation) - you may like (similarities)
  78. 78. View Like (add to basket, add to wishlist) Conversion (buy) Recorded in batch
  79. 79. 4 engines: 2 for books, 2 for ebooks (not needed now) Retrained every night with new data
  80. 80. recordLike($user, array $item) recordConversion($user, array $item) recordView($user, array $item) createUser($uid)
  81. 81. getRecommendation($uid, $itype, $n = self::N_SUGGESTION) getSimilarity($iid, $itype, $n = self::N_SUGGESTION)
  82. 82. User cold start/item cold start: if we don't get enough suggestions, switch to non-personalized (also for non-logged-in users)
  83. 83. User cold start/item cold start: if we don't get enough suggestions, switch to non-personalized (best sellers)
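The cold start fallback can be sketched as a small helper. This is one possible reading of the slide: personalized suggestions are kept and the list is topped up with best sellers until it reaches the requested size, so a user with no history (or a visitor who is not logged in) gets best sellers only. The function name and the sample titles are hypothetical.

```python
def suggestions_with_fallback(personalized, best_sellers, n):
    """Return n suggestions, filling gaps with non-personalized best sellers."""
    result = list(personalized[:n])
    for item in best_sellers:
        if len(result) >= n:
            break
        if item not in result:  # avoid suggesting the same title twice
            result.append(item)
    return result

best_sellers = ["book-1", "book-2", "book-3", "book-4"]

# New user: the engine returned nothing.
print(suggestions_with_fallback([], best_sellers, 3))
# Known user with too few suggestions.
print(suggestions_with_fallback(["book-9", "book-2"], best_sellers, 4))
```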
  84. 84. Michele Orselli CTO@Ideato _orso_ micheleorselli / ideatosrl mo@ideato.it https://joind.in/talk/93d2d
  85. 85. Links
 • http://www.slideshare.net/NYCPredictiveAnalytics/building-a-recommendation-engine-an-example-of-a-product-recommendation-engine?next_slideshow=1
 • https://www.coursera.org/learn/recommender-systems-introduction
 • http://actionml.com/
 • https://github.com/grahamjenson/ger
