Acts As Recommendable

Acts As Recommendable - Presentation Transcript

  1. Recommendations in Production Alex MacCaw
  2. Netflix Prize
  3. Amazon.com, Facebook, Last.fm, StumbleUpon, Google Suggest, iTunes, Rotten Tomatoes, Yelp
  4. Google Search
  5. Chicken or Egg
  6. • Google Reader • IMDB
  7. Acts As Recommendable
  8. Types of recommendations • Content Based • User Based • Item Based
  9. Programming Collective Intelligence
  10. Has Many Through Relationship
  11. User has many Books through UserBooks (User has many UserBooks; Book has many UserBooks). UserBooks can have a score (rating).
  12. User
      class User < ActiveRecord::Base
        has_many :user_books
        has_many :books, :through => :user_books
        acts_as_recommendable :books, :through => :user_books
      end
  13. Gives you: User#similar_users, User#recommended_books, Book#similar_books
  14. The algorithms • Manhattan Distance • Euclidean distance • Cosine • Pearson correlation coefficient • Jaccard • Levenshtein
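Four of the measures named on slide 14 can be sketched in a few lines of plain Ruby. This is an illustration only, not the plugin's code: the method names are hypothetical, vectors are assumed to be equal-length arrays of numbers, and sets are assumed to be arrays of ids.

```ruby
# Hypothetical helpers; vectors are equal-length numeric arrays.

# Sum of absolute coordinate differences.
def manhattan_distance(a, b)
  a.zip(b).sum { |x, y| (x - y).abs }
end

# Straight-line distance between two points.
def euclidean_distance(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# Cosine of the angle between two vectors; 0.0 if either vector is all zeros.
def cosine_similarity(a, b)
  dot  = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |y| y * y })
  norm.zero? ? 0.0 : dot / norm
end

# Size of the intersection divided by size of the union of two id sets.
def jaccard_index(a, b)
  union = (a | b).size
  union.zero? ? 0.0 : (a & b).size.to_f / union
end

puts manhattan_distance([4, 5], [5, 3])   # ratings of two users for two films
puts euclidean_distance([4, 5], [5, 3])
puts jaccard_index([1, 2, 3], [2, 3, 4])  # book ids owned by two users
```

The first three compare rating vectors; Jaccard only needs to know which items two users share, so it also works when there are no scores.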
  15. How does it work?
  16. Strategy • Map data into Euclidean Space • Calculate similarity • Use similarities to recommend
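The last step of the strategy on slide 16, using similarities to recommend, could look roughly like the similarity-weighted average below. The slides do not confirm that the plugin computes recommendations this way; the method name `recommend`, the user names, the book names, and the similarity values are all made up for illustration.

```ruby
# ratings:      user => { item => score }
# similarities: other user => similarity to the target user (assumed precomputed)
# Returns [[item, predicted_score], ...] for items the target has not rated,
# best prediction first.
def recommend(target, ratings, similarities)
  totals  = Hash.new(0.0) # item => sum of (similarity * score)
  weights = Hash.new(0.0) # item => sum of similarities

  similarities.each do |other, sim|
    next if sim <= 0 || other == target
    ratings.fetch(other, {}).each do |item, score|
      next if ratings.fetch(target, {}).key?(item) # skip items already rated
      totals[item]  += sim * score
      weights[item] += sim
    end
  end

  totals.map { |item, total| [item, total / weights[item]] }
        .sort_by { |_, predicted| -predicted }
end

# Hypothetical data.
ratings = {
  "alex"  => { "book_a" => 4 },
  "james" => { "book_a" => 4, "book_b" => 5, "book_c" => 1 },
  "jonah" => { "book_a" => 3, "book_b" => 2, "book_c" => 4 }
}
similarities = { "james" => 0.75, "jonah" => 0.25 }

p recommend("alex", ratings, similarities)
# => [["book_b", 4.25], ["book_c", 1.75]]
```

Dividing by the summed similarities keeps a prediction on the same scale as the ratings, so an item rated by many users does not win on volume alone.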
  17.           The Black Knight   John Tucker Must Die
      James     4                  5
      Jonah     3                  2
      George    5                  3
      Alex      4                  2
  18. [Plot: users placed by their ratings. x-axis: John Tucker Must Die (0 to 5.00); y-axis: The Black Knight (0 to 5.00)]
  19. [Plot repeated from slide 18; same axes]
  20. { 1 => { 1 => 1.0, 2 => 0.0, ... }, ... } (outer key: item id; inner key: user id; value: score)
  21. [[1, 0.5554], [2, 0.888], [3, 0.8843], ...]
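To make the steps from slides 17 to 21 concrete, here is a small sketch that ranks users by closeness in the two-film space. It assumes the ratings on slide 17 read as (The Black Knight, John Tucker Must Die) per user, keys the data by user name for readability (the plugin's dataset on slide 20 is keyed by item id), and uses 1 / (1 + Euclidean distance) as the similarity, which is an assumption and not necessarily what the plugin does.

```ruby
# Ratings as read from slide 17: [The Black Knight, John Tucker Must Die].
RATINGS = {
  "James"  => [4, 5],
  "Jonah"  => [3, 2],
  "George" => [5, 3],
  "Alex"   => [4, 2]
}

# Maps a Euclidean distance into (0, 1]; identical ratings give 1.0.
def similarity(a, b)
  distance = Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
  1.0 / (1.0 + distance)
end

# Returns [[name, similarity], ...] for everyone except `name`,
# most similar first (the same shape as the list on slide 21).
def similar_to(name, ratings)
  ratings.reject { |other, _| other == name }
         .map { |other, scores| [other, similarity(ratings[name], scores)] }
         .sort_by { |_, score| -score }
end

p similar_to("Alex", RATINGS)
# Jonah ranks first (0.5), then George, then James (0.25).
```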
  22. Problem 1: It was far too slow to calculate on the fly (obvious)
  23. SELECT * FROM "users" WHERE ("users"."id" = 2)
      SELECT * FROM "books"
      SELECT * FROM "users"
      SELECT "user_books".* FROM "user_books" WHERE ("user_books".user_id IN (1,2,3,4,5,6,7,8,9,10))
      SELECT * FROM "books" WHERE ("books"."id" IN (11,6,12,7,13,8,14,9,15,1,2,19,20,3,10,4,5))
      SELECT * FROM "books" WHERE ("books"."id" IN (20,3,19,6))
      Annotations: All books, All user_books
  24. Solution: Cache the dataset. Build it offline: rake recommendations:build
  25. SELECT * FROM "user_books" WHERE ("user_books".user_id = 2)
      SELECT * FROM "books" WHERE ("books"."id" = 5)
      SELECT * FROM "books" WHERE ("books"."id" = 4)
      SELECT * FROM "books" WHERE ("books"."id" = 8)
      SELECT * FROM "books" WHERE ("books"."id" = 7)
      SELECT * FROM "books" WHERE ("books"."id" = 2)
      SELECT * FROM "books" WHERE ("books"."id" = 1)
  26. Problem 2: Fetching the dataset took too long because it was so massive
  27. Solution: Split up the cache by item
  28. Rails.cache.write("aar_books_1", scores)
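Splitting the cache by item, as in slides 27 and 28, means a lookup only loads the scores for the one item being asked about. The sketch below shows the idea with a tiny in-memory stand-in for `Rails.cache`, so it runs without Rails. The `MemoryCache` class and the helper names are hypothetical, and the key pattern `aar_<table>_<id>` is inferred from the single key shown on slide 28.

```ruby
# Minimal in-memory stand-in for Rails.cache (only write/read).
class MemoryCache
  def initialize
    @store = {}
  end

  def write(key, value)
    @store[key] = value
  end

  def read(key)
    @store[key]
  end
end

CACHE = MemoryCache.new

# Key pattern assumed from the "aar_books_1" example on slide 28.
def cache_key(table_name, item_id)
  "aar_#{table_name}_#{item_id}"
end

# Offline build step: one cache entry per item instead of one huge blob.
def build_cache(table_name, dataset)
  dataset.each do |item_id, scores|
    CACHE.write(cache_key(table_name, item_id), scores)
  end
end

# Online lookup: fetch only the scores for the requested item.
def scores_for(table_name, item_id)
  CACHE.read(cache_key(table_name, item_id)) || {}
end

build_cache("books", { 1 => { 1 => 1.0, 3 => 0.5 }, 2 => { 2 => 1.0 } })
p scores_for("books", 1) # => {1=>1.0, 3=>0.5}
```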
  29. Problem 3: The dataset was so big it crashed Ruby!
  30. Solution: Get rid of ActiveRecord. Only deal with integers.
  31. items = options[:on_class].connection.select_values(
        "SELECT id from #{options[:on_class].table_name}"
      ).collect(&:to_i)
  32. Problem 4: It still crashed Ruby!
  33. { 1 => { 1 => 1.0, 2 => 0.0, ... }, ... }
  34. Solution: Remove unnecessary cruft from the dataset
  35. { 1 => { 1 => 1.0, ... }, ... }
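Comparing slides 33 and 35, the cruft being removed appears to be the zero scores. A sketch of that pruning step, with a hypothetical method name, might look like this. Readers of the pruned dataset then have to treat a missing key as 0.0.

```ruby
# Drops zero scores from an { item_id => { user_id => score } } dataset.
# A missing user id is later read as 0.0, so no information is lost.
def prune_zero_scores(dataset)
  dataset.transform_values do |scores|
    scores.reject { |_, score| score.zero? }
  end
end

dataset = { 1 => { 1 => 1.0, 2 => 0.0 }, 2 => { 1 => 0.0, 2 => 0.5 } }
pruned  = prune_zero_scores(dataset)

p pruned                # => {1=>{1=>1.0}, 2=>{2=>0.5}}
p pruned[1].fetch(2, 0.0) # absent entries default to 0.0
```

For sparse data, where most users have not rated most items, this removes the bulk of the hash entries.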
  36. Problem 5: It was too slow
  37. Solution: Rewrite the slow bits in C
  38. Details • RubyInline • Implemented Pearson • Monkey patched original Ruby methods • Very fast
  39. InlineC = Module.new do
        inline do |builder|
          builder.c '
            #include <math.h>
            #include "ruby.h"
            double c_sim_pearson(VALUE items) {
      Annotation: Ruby Object
  40. Same code as slide 39, annotated: No Floats :(
  41. Hash Lookup
      if (!st_lookup(RHASH(prefs1)->tbl, items_a[i], &prefs1_item_ob)) {
        prefs1_item = 0.0;
      } else {
        prefs1_item = NUM2DBL(prefs1_item_ob);
      }
  42. Conversion: return num / den;
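For reference, the Pearson calculation that the C fragments on slides 39 to 42 speed up can be written in plain Ruby along these lines. This is a sketch, not the plugin's source: the signature is hypothetical, the choice to iterate over the union of both key sets is an assumption, and only the 0.0 default for missing entries and the final `num / den` are taken from the slides.

```ruby
# Pearson correlation between two { id => score } hashes.
# Missing entries count as 0.0, mirroring the failed st_lookup branch on slide 41.
def sim_pearson(prefs1, prefs2)
  keys = prefs1.keys | prefs2.keys # assumption: compare over the union of ids
  n = keys.size
  return 0.0 if n.zero?

  sum1 = sum2 = sum1_sq = sum2_sq = product_sum = 0.0
  keys.each do |key|
    x = prefs1.fetch(key, 0.0).to_f
    y = prefs2.fetch(key, 0.0).to_f
    sum1        += x
    sum2        += y
    sum1_sq     += x * x
    sum2_sq     += y * y
    product_sum += x * y
  end

  num = product_sum - (sum1 * sum2 / n)
  variance_product = (sum1_sq - sum1**2 / n) * (sum2_sq - sum2**2 / n)
  return 0.0 if variance_product <= 0 # constant scores: correlation undefined

  num / Math.sqrt(variance_product)
end

p sim_pearson({ 1 => 1.0, 2 => 2.0, 3 => 3.0 }, { 1 => 2.0, 2 => 4.0, 3 => 6.0 })
p sim_pearson({ 1 => 1.0, 2 => 2.0, 3 => 3.0 }, { 1 => 3.0, 2 => 2.0, 3 => 1.0 })
```

The first call compares scores that rise together and the second compares scores that move in opposite directions, so the results sit at the two ends of the range from -1 to 1.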
  43. Design Designs • Not too many relationships • Not too many ‘items’ • Similarity matrix for items, not users
  44. Changing data
  45. Scaling Even Further • K Means clustering • Split cluster by category
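Slide 45 names K-means clustering without showing how it would be applied. As a rough illustration of the technique, the sketch below groups rating vectors around a fixed set of starting centroids, so that similarities could then be computed within each cluster only. The function names are hypothetical, the starting centroids are passed in explicitly to keep the example deterministic (implementations usually pick them at random), and the points reuse the ratings from slide 17.

```ruby
def squared_distance(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Coordinate-wise mean of a non-empty list of points.
def mean_point(points)
  points.transpose.map { |coords| coords.sum.to_f / coords.size }
end

# Returns [assignments, centroids]; assignments[i] is the index of the
# centroid that points[i] belongs to.
def k_means(points, centroids, max_iterations: 100)
  assignments = nil

  max_iterations.times do
    # Assignment step: attach each point to its nearest centroid.
    new_assignments = points.map do |point|
      centroids.each_index.min_by { |i| squared_distance(point, centroids[i]) }
    end
    break if new_assignments == assignments # converged
    assignments = new_assignments

    # Update step: move each centroid to the mean of its members.
    centroids = centroids.each_index.map do |i|
      members = points.each_index.select { |j| assignments[j] == i }
                      .map { |j| points[j] }
      members.empty? ? centroids[i] : mean_point(members)
    end
  end

  [assignments, centroids]
end

points = [[4, 5], [3, 2], [5, 3], [4, 2]] # James, Jonah, George, Alex
assignments, centroids = k_means(points, [[4, 5], [4, 2]])
p assignments # => [0, 1, 1, 1]
p centroids
```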
  46. Adding ratings
      ActiveRecord::Schema.define(:version => 1) do
        create_table "books", :force => true do |t|
          t.string   "name"
          t.datetime "created_at"
          t.datetime "updated_at"
        end

        create_table "user_books", :force => true do |t|
          t.integer "user_id", :null => false
          t.integer "book_id", :null => false
          t.integer "rating",  :default => 0
        end

        create_table "users", :force => true do |t|
          t.string   "name"
          t.datetime "created_at"
          t.datetime "updated_at"
        end
      end
  47. class User < ActiveRecord::Base
        has_many :user_books
        has_many :books, :through => :user_books
        acts_as_recommendable :books, :through => :user_books, :score => :rating
      end
  48. That’s it
  49. Improvements? • Better API • Perform calculations over a cluster (EC2) using Map/Nanite
  50. class AARN < Nanite::Actor
        expose :sim_pearson

        def sim_pearson(item1, item2)
          Optimizations.c_sim_pearson(item1, item2)
        end
      end
  51. Questions?
      http://eribium.org/blog
      twitter: maccman
      email/jabber: maccman@gmail.com
      http://github.com/maccman/acts_as_recommendable
      http://rubyurl.com/kUpk

maccman, 2 years ago

RubyManor talk on using Recommendation systems in p…

© All Rights Reserved
