Wakoopa Recommendations Engine on AWS

Menno van der Sman, Lead Engineer of Wakoopa, presents at the AWS Start-Up Event in Amsterdam on Wakoopa's use of Amazon EC2 and S3 for its recommendations engine.

Transcript

  • 1. Menno van der Sman, Lead Developer; Coen Stevens, Recommendation Engineer
  • 2. Mission: Discover software & games
  • 3. Updates
  • 4. Searching powered by
  • 5. Recommendations Codename: Ludwig
  • 6. How to get started? Research: Amazon, Netflix, etc. Mathemagicians: Peter Tegelaar & Coen Stevens. Ludwig: created recommender system in Ruby running on EC2
  • 7. Challenges when building your first recommender system
  • 8. Data: what do we have? Usage (implicit): noisy; only positive feedback; easy to collect. Ratings (explicit): accurate; positive and negative feedback; hard to collect
  • 9. Item-Based Collaborative Filtering. User software usage matrix (axes: software items, users; values in slide order): 220 90 180 22 280 12 42 80 175 210 210 45 165 14 35 195 13 25 100 50 185 35 190 60 65 185
  • 10. Classified user software usage matrix, classes 1, 2, 3 (axes: software items, users; values in slide order): 3 2 2 2 3 2 1 2 3 3 2 3 2 1 2 2 3 2 3 2 2 2 3 1 2 3
  • 11. How do we predict the probability that I would like to use Gmail? Same matrix with the unknown Gmail cell marked "?": 3 2 2 2 3 2 1 2 3 3 ? 2 3 2 1 2 2 3 2 3 2 2 2 3 1 2 3
  • 12. Calculate the similarities between Gmail and the other software items, e.g. Similarity(Firefox, Gmail). Matrix values in slide order: 3 2 2 2 3 2 1 2 3 3 2 3 2 1 2 2 3 2 3 2 2 2 3 1 2 3
  • 13. Calculate the similarities between Gmail and the other software items. Gmail similarities interleaved with matrix values, in slide order: 0.6 3 2 2 2 0.8 3 2 1 2 1.0 3 3 2 3 0.4 2 1 2 2 3 2 0.4 3 2 2 2 3 0.3 1 2 3 0.3
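  • 14. Calculate the predicted value for Gmail. Gmail similarities, with the user's usage class in parentheses where one exists: 0.6 (3), 0.8 (3), 1.0, 0.4 (2), 0.4, 0.3 (3), 0.3
  • 15. Calculate the predicted value for Gmail. We take only the ‘K’ most similar items (say 2). Gmail similarities, with the user's usage class in parentheses where one exists: 0.6 (3), 0.8 (3), 1.0, 0.4 (2), 0.4, 0.3 (3), 0.3. Worked figures on the slide: numerator 0.6*3 + 0.8*3, denominator 0.6 + 0.8 + 0.4 + 0.3, result 2.8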
  • 16. Calculate all unknown values and show the Top-N recommendations to each user. Matrix with unknown cells marked "?" (axes: software items, users; values in slide order): 3 2 ? 2 ? ? 2 3 2 1 ? 2 ? ? 3 3 ? 2 ? 3 ? 2 1 2 2 3 2 ? ? 3 2 2 ? 2 3 ? 1 ? 2 ? ? 3
  • 17. Metrics: measure for success. Space complexity: O(m + Kn). Computational complexity: O(m + n²). Performance: Root Mean Squared Error
  • 18. Evaluating the approach: maximize (performance / cost). This is easy with EC2
  • 19. Why EC2? Low cost; flexibility; ease of use
  • 20. Infrastructure (diagram labels): Wakoopa; EC2; checkout; Repository; Application; Database; Computing power; ssh tunnel; Big Database
  • 21. Want more? http://recked.org Time & place TBD
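
Slides 12 and 13 compute a similarity between Gmail and every other software item, but the transcript does not name the measure used. The following is a minimal sketch assuming cosine similarity over the users who have a usage class for both items. The function name `item_similarity`, the user ids, and the toy data are hypothetical and not taken from Ludwig.

```python
import math


def item_similarity(a, b):
    """Cosine similarity between two items' usage-class columns.

    `a` and `b` map user id -> usage class (1, 2 or 3). Users missing
    from either dict have not used that item and are skipped.
    Cosine similarity is an assumption; the slides do not name the measure.
    """
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = math.sqrt(sum(a[u] ** 2 for u in shared))
    norm_b = math.sqrt(sum(b[u] ** 2 for u in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy columns: only u1 and u2 have used both items.
firefox = {"u1": 3, "u2": 2, "u3": 1}
gmail = {"u1": 3, "u2": 2, "u4": 2}
sim = item_similarity(firefox, gmail)
```

Because usage classes are all positive, cosine similarity over shared users tends to sit close to 1; a mean-centred variant might separate items better, but that is a design choice the slides do not settle.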
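
The prediction on slides 14 and 15 reads as a similarity-weighted average of the usage classes the user already has. The sketch below assumes that reading, and it assumes the pairing suggested by the flattened slide: similarities 0.6, 0.8, 0.4 and 0.3 belong to the items with usage classes 3, 3, 2 and 3. The item names `a` to `f` and the function `predict` are hypothetical placeholders.

```python
def predict(similarities, usage, k=None):
    """Similarity-weighted average of the classes the user already has.

    similarities: item -> similarity to the target item
    usage: item -> usage class for this user (only items the user has used)
    k: keep only the k most similar used items; None keeps all of them
    """
    neighbours = sorted(
        ((similarities[item], cls) for item, cls in usage.items()),
        reverse=True,
    )
    if k is not None:
        neighbours = neighbours[:k]
    total = sum(sim for sim, _ in neighbours)
    if total == 0:
        return None  # no usable neighbours, so no prediction
    return sum(sim * cls for sim, cls in neighbours) / total


# Values read from slides 14 and 15; item names are placeholders.
gmail_sims = {"a": 0.6, "b": 0.8, "c": 0.4, "d": 0.4, "e": 0.3, "f": 0.3}
user_usage = {"a": 3, "b": 3, "c": 2, "e": 3}

all_used = predict(gmail_sims, user_usage)      # 5.9 / 2.1
top_two = predict(gmail_sims, user_usage, k=2)  # 4.2 / 1.4
```

Under these assumptions, averaging over all four used items gives about 2.81, which rounds to the 2.8 shown on the slide, while restricting to K = 2 gives 3.0.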
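
Slide 16 fills in every unknown cell and then shows each user the Top-N items. Once predictions exist, the ranking step can be sketched as below. The function name `top_n` and the example scores are hypothetical, apart from the 2.8 for Gmail taken from slide 15.

```python
import heapq


def top_n(predictions, n):
    """Return the n items with the highest predicted value, best first.

    predictions: item -> predicted usage class for items the user
    has not used yet.
    """
    return heapq.nlargest(n, predictions, key=predictions.get)


# Hypothetical predicted values for one user's unknown cells.
predicted = {"Gmail": 2.8, "item_x": 1.2, "item_y": 3.0}
recommendations = top_n(predicted, 2)
```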
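
Slide 17 names Root Mean Squared Error as the performance metric. A common way to apply it, assumed here rather than described in the slides, is to hide some known usage classes, predict them, and compare. The function name `rmse` and the sample values are hypothetical.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))


# Hypothetical held-out classes and the predictions made for them.
held_out = [3, 2, 1]
guesses = [2.8, 2.0, 1.5]
error = rmse(guesses, held_out)
```

A lower value means the predicted classes sit closer to the held-out ones, which gives the "performance" side of the performance / cost ratio on slide 18.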