Recommender Systems with Ruby (adding machine learning, statistics, etc)

Talk given at the Frevo on Rails Ruby meeting in Recife, Pernambuco, on 14/09/2013

Recommender Systems with Ruby (adding machine learning, statistics, etc) Presentation Transcript

  • 1. Ruby in the world of recommendations (also machine learning, statistics and visualizations)
    Marcel Caraciolo (@marcelcaraciolo)
    Developer, scientist, contributor to the Crab recsys project; has worked with Python for 6 years; interested in mobile, education, machine learning and data!
    Recife, Brazil - http://aimotion.blogspot.com
    Saturday, September 14, 2013
  • 2. BACK UP YOUR FILES! NEVER run:
        find . -type f -not -name '*pyc' | xargs rm
    (it deletes every file except those whose name ends in "pyc")
  • 3. Scientific Environment
    Presentation & Visualization; Experimentation (Re-Design); Data Acquisition; Data Analysis
  • 4. Where is Ruby?
    Presentation & Visualization; Experimentation (Re-Design); Data Acquisition; Data Analysis
  • 5. Where is Ruby?
    Presentation & Visualization; Experimentation (Re-Design); Data Acquisition; Data Analysis
  • 6. Where is Ruby?
    Presentation & Visualization; Experimentation (Re-Design); Data Acquisition; Data Analysis
  • 7. Where is Ruby?
    Presentation & Visualization; Experimentation (Re-Design); Data Acquisition; Data Analysis
  • 8. Where is Ruby?
    Python was released in 1991; Ruby was released in 1995.
    Python was widely adopted and promoted by much of Google's research and development team.
  • 9. Where is Ruby?
    Python was released in 1991; Ruby was released in 1995.
    Python became hugely popular after being officially adopted by a large part of Google's research team.
    "Python has been an important part of Google since its beginning, and it still is as our infrastructure grows; we are always looking for more people with skills in this language." Peter Norvig, Google, Inc.
  • 10. Where is Ruby?
    Python was already well known even in some older scientific articles.
  • 11. Where is Ruby?
    Ruby's popularity exploded in 2004, with a focus on the web.
    Python: Django - 2005; NumPy - 2005; BioPython - 2001; SAGE - 2005; Matplotlib - 2000
  • 12. Where is Ruby?
    "Programming comes second to researchers, not first like us." - a Ruby developer's answer
    Python:
        [(x, x*x) for x in [1,2,3,4] if x != 3]
    Ruby:
        [1,2,3,4].map { |x| [x, x*x] if x != 3 }.compact
    Result:
        [(1,1), (2,4), (4,16)]
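    A `map` whose block ends in a trailing `if` returns `nil` for every skipped element, so the Ruby one-liner only matches the Python comprehension once the `nil` entries are dropped. The sketch below shows three plain-Ruby ways to get the same pairs; the variable names are illustrative and not taken from the talk.

```ruby
numbers = [1, 2, 3, 4]

# map + compact: build the pair or nil, then drop the nils
with_compact = numbers.map { |x| [x, x * x] if x != 3 }.compact

# reject + map: filter first, then transform
with_reject = numbers.reject { |x| x == 3 }.map { |x| [x, x * x] }

# each_with_object: a single pass that appends only the wanted pairs
with_object = numbers.each_with_object([]) do |x, acc|
  acc << [x, x * x] unless x == 3
end

p with_compact
```

    All three produce arrays of two-element arrays, which is the closest Ruby counterpart to Python's list of tuples.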
  • 13. Where is Ruby? (chart: Ruby vs. Python)
  • 14. Hey, Ruby has options!
  • 15. Hey, Ruby has options!
  • 16. :(
  • 17. :D
  • 18. Installing NMatrix:
        # install the released gem
        gem install nmatrix
        # or build it from source
        git clone https://github.com/SciRuby/nmatrix.git
        cd nmatrix/
        bundle install
        rake compile
        rake repackage
        gem install pkg/nmatrix-*.gem
  • 19. Using NMatrix:
        >> NMatrix.new([2, 3], [0, 1, 2, 3, 4, 5], :int64).pp
        [0, 1, 2]
        [3, 4, 5]
        => nil
        >> m = N[ [2, 3, 4], [7, 8, 9] ]
        => #<NMatrix:0x007f8e121b6cf8 shape:[2,3] dtype:int32 stype:dense>
        >> m.pp
        [2, 3, 4]
        [7, 8, 9]
    Depends on ATLAS/CBLAS and is written mostly in C and C++.
    https://github.com/SciRuby/nmatrix/wiki/Getting-started
  • 20. Hey, Ruby has options!
  • 21. Data Visualization
    • R
    • Gnuplot
    • Google Charts API
    • JFreeChart
    • Scruffy
    • Timetric
    • Tioga
    • RChart
  • 22. Data Visualization
    R commands prepared for RSRuby:
        require 'rsruby'
        cmd = %Q(
          pdf(file = "r_directly.pdf")
          boxplot(c(1,2,3,4),c(5,6,7,8))
          dev.off()
        )
    Piping commands to Gnuplot:
        def gnuplot(commands)
          IO.popen("gnuplot", "w") { |io| io.puts commands }
        end
        commands = %Q(
          set terminal svg
          set output "curves.svg"
          plot [-10:10] sin(x), atan(x), cos(atan(x))
        )
        gnuplot(commands)
    http://effectif.com/ruby/manor/data-visualisation-with-ruby
    https://github.com/glejeune/Ruby-Graphviz/
  • 23. Other tools
    • BioRuby
        #!/usr/bin/env ruby
        require 'bio'
        # create a DNA sequence object from a String
        dna = Bio::Sequence::NA.new("atcggtcggctta")
        # create a RNA sequence object from a String
        rna = Bio::Sequence::NA.new("auugccuacauaggc")
        # create a Protein sequence from a String
        aa = Bio::Sequence::AA.new("AGFAVENDSA")
        # you can check if the sequence contains illegal characters
        # that is not an accepted IUB character for that symbol
        # (should prepare a Bio::Sequence::AA#illegal_symbols method also)
        puts dna.illegal_bases
        # translate and concatenate a DNA sequence to Protein sequence
        newseq = aa + dna.translate
        puts newseq # => "AGFAVENDSAIGRL"
    http://bioruby.org/
  • 24. Other tools
    • RubyDoop (uses JRuby)
        module WordCount
          class Mapper
            def map(key, value, context)
              value.to_s.split.each do |word|
                # strip non-word characters, then normalise the case
                word.gsub!(/\W/, '')
                word.downcase!
                unless word.empty?
                  context.write(Hadoop::Io::Text.new(word), Hadoop::Io::IntWritable.new(1))
                end
              end
            end
          end
        end
        module WordCount
          class Reducer
            def reduce(key, values, context)
              sum = 0
              values.each { |value| sum += value.get }
              context.write(key, Hadoop::Io::IntWritable.new(sum))
            end
          end
        end
    https://github.com/iconara/rubydoop
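    To see what the mapper and reducer compute without a Hadoop cluster, the same word count can be sketched in plain Ruby. This is only a stand-in under stated assumptions: `map_words`, `reduce_counts` and `word_count` are hypothetical names, and `group_by` plays the role of Hadoop's shuffle phase.

```ruby
# Map step: emit a [word, 1] pair for every non-empty, normalised word.
def map_words(line)
  line.split
      .map { |word| word.gsub(/\W/, '').downcase }
      .reject(&:empty?)
      .map { |word| [word, 1] }
end

# Reduce step: sum the counts emitted for a single key.
def reduce_counts(_word, counts)
  counts.sum
end

# Driver: map every line, group the pairs by word (a stand-in for the
# shuffle), then reduce each group to a total.
def word_count(lines)
  pairs = lines.flat_map { |line| map_words(line) }
  pairs.group_by(&:first)
       .map { |word, group| [word, reduce_counts(word, group.map(&:last))] }
       .to_h
end

p word_count(['Ruby has options!', 'ruby, has gems'])
```

    Keeping the map and reduce steps as separate functions mirrors the Mapper and Reducer classes, so the logic could be moved into RubyDoop classes with little change.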
  • 25. Coming back to the world of recommenders
    The world is an over-crowded place
  • 26. Coming back to the world of recommenders
  • 27. Recommendation Systems
    Systems designed to recommend to me something I may like
  • 28. Recommendation Systems
    Graph Representation
  • 29. And how does it work?
  • 30. What do recommenders really do?
    1. Predict how much you may like a certain product or service.
    2. Suggest a list of N items ordered by your level of interest.
    3. Suggest a list of N users for a product or service.
    4. Explain why those items were recommended to you.
    5. Adjust the predictions and recommendations based on your feedback and that of other users.
  • 31. Content Based Filtering
    Diagram labels: Users: Marcel. Items: Gone with the Wind, Die Hard, Armageddon, Toy Story. Relations: likes, similar, recommends.
  • 32. Problems with Content Recommenders
    1. Restricted Data Analysis - Items and users are poorly described; it is even worse for audio and images.
    2. Specialized Data - A person with no experience of sushi never gets a recommendation for the best sushi in town.
    3. Portfolio Effect - Just because I watched one Xuxa movie as a child, the system insists on recommending all of her movies ("só para baixinhos!", i.e. "only for little ones!").
  • 33. Collaborative Filtering
    Diagram labels: Users: Marcel, Rafael, Amanda. Items: Gone with the Wind, Thor, Armageddon, Toy Story. Relations: like, similar, recommend.
  • 34. Problems with Collaborative Filtering
    1. Scalability - Amazon with 5M users, 50K items, 1.4B ratings
    2. Sparse Data - I only rated one book at Amazon!
    3. Cold Start - New users and items with no records
    4. Popularity - Everyone reads Harry Potter!
    5. Hacking - The person who reads 'Harry Potter' also reads 'Kama Sutra'
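    Taking the Amazon figures quoted on this slide at face value (5M users, 50K items, 1.4B ratings), a few lines of arithmetic show why scalability and sparsity come up together. The variable names are illustrative only.

```ruby
users   = 5_000_000
items   = 50_000
ratings = 1_400_000_000

cells            = users * items        # every possible user-item pair
density          = ratings.fdiv(cells)  # fraction of pairs that carry a rating
ratings_per_user = ratings / users      # average; the division is exact here

puts format('%d possible pairs, %.2f%% filled, %d ratings per user on average',
            cells, density * 100, ratings_per_user)
```

    With these numbers the rating matrix has 250 billion cells and well under 1% of them are filled, so a dense in-memory matrix is not a realistic representation.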
  • 35. How is it shown?
    • Highlights
    • More about this artist...
    • Listen to similar songs
    • Someone similar to you also liked this...
    • Since you listened to this, you may like this one...
    • These items go together...
    • The most popular in your group...
    • New Releases
  • 36. Recommendable
    Quickly add a recommender engine for Likes and Dislikes to your Ruby app
    http://davidcel.is/recommendable/
  • 37. Recommendable
  • 38. Recommendable
    Add to your Gemfile:
        gem 'recommendable'
  • 39. Recommendable
    Create a configuration initializer:
        require 'redis'
        Recommendable.configure do |config|
          # Recommendable's connection to Redis
          config.redis = Redis.new(:host => 'localhost', :port => 6379, :db => 0)
          # A prefix for all keys Recommendable uses
          config.redis_namespace = :recommendable
          # Whether or not to automatically enqueue users to have their recommendations
          # refreshed after they like/dislike an item
          config.auto_enqueue = true
          # The name of the queue that background jobs will be placed in
          config.queue_name = :recommendable
          # The number of nearest neighbors (k-NN) to check when updating
          # recommendations for a user. Set to `nil` if you want to check all
          # other users as opposed to a subset of the nearest ones.
          config.nearest_neighbors = nil
        end
  • 40. Recommendable
    In your ONE model that will be receiving the recommendations:
        class User
          recommends :movies, :books, :minerals, :other_things
          # ...
        end
  • 41. Recommendable
    You can chain your queries
        >> current_user.liked_movies.limit(10)
        >> current_user.bookmarked_books.where(:author => "Cormac McCarthy")
        >> current_user.disliked_movies.joins(:cast_members).where('cast_members.name = ?', 'Kim Kardashian')
  • 42. Recommendable
    You can chain your queries
        >> current_user.hidden_minerals.order('density DESC')
        >> current_user.recommended_movies.where('year < 2010')
        >> book.liked_by.order('age DESC').limit(20)
        >> movie.disliked_by.where('age > 18')
  • 43. Recommendable
    You can also like your recommendable objects
        >> user.like(movie)
        => true
        >> user.likes?(movie)
        => true
        >> user.rated?(movie)
        => true # also true if user.dislikes?(movie)
        >> user.liked_movies
        => [#<Movie id: 23, name: "2001: A Space Odyssey">]
        >> user.liked_movie_ids
        => ["23"]
        >> user.like(book)
        => true
        >> user.likes
        => [#<Movie id: 23, name: "2001: A Space Odyssey">, #<Book id: 42, title: "100 Years of Solitude">]
        >> user.likes_count
        => 2
        >> user.liked_movies_count
        => 1
        >> user.likes_in_common_with(friend)
        => [#<Movie id: 23, name: "2001: A Space Odyssey">, #<Book id: 42, title: "100 Years of Solitude">]
        >> user.liked_movies_in_common_with(friend)
        => [#<Movie id: 23, name: "2001: A Space Odyssey">]
        >> movie.liked_by_count
        => 2
        >> movie.liked_by
        => [#<User username: 'davidbowman'>, #<User username: 'frankpoole'>]
  • 44. Recommendable
    Obviously, you can also DISLIKE your recommendable objects
        >> user.dislike(movie)
        >> user.dislikes?(movie)
        >> user.disliked_movies
        >> user.disliked_movie_ids
        >> user.dislikes
        >> user.dislikes_count
        >> user.disliked_movies_count
        >> user.dislikes_in_common_with(friend)
        >> user.disliked_movies_in_common_with(friend)
        >> movie.disliked_by_count
        >> movie.disliked_by
  • 45. Recommendable
    Recommendations
        >> friend.like(Movie.where(:name => "2001: A Space Odyssey").first)
        >> friend.like(Book.where(:title => "A Clockwork Orange").first)
        >> friend.like(Book.where(:title => "Brave New World").first)
        >> friend.like(Book.where(:title => "One Flew Over the Cuckoo's Nest").first)
        >> user.like(Book.where(:title => "A Clockwork Orange").first)
        >> user.recommended_books # Defaults to 10 recommendations
        => [#<Book title: "Brave New World">, #<Book title: "One Flew Over the Cuckoo's Nest">]
        >> user.similar_raters # Defaults to 10 similar users
        => [#<User username: "frankpoole">, #<User username: "davidbowman">, ...]
        >> user.recommended_movies(10, 30) # 10 Recommendations, offset by 30 (i.e. page 4)
        => [#<Movie name: "A Clockwork Orange">, #<Movie name: "Chinatown">, ...]
        >> user.similar_raters(25, 50) # 25 similar users, offset by 50 (i.e. page 3)
        => [#<User username: "frankpoole">, #<User username: "davidbowman">, ...]
  • 46. Recommendable
    Jaccard Similarity
    Marcel likes A, B, C and dislikes D
    Amanda likes A, B and dislikes C
    Guilherme likes C, D and dislikes A
    Flavio likes B, C, E and dislikes D
    J(Marcel, Amanda) = ([A,B].size + [].size - [C].size - [].size) / [A,B,C,D].size
    J(Marcel, Amanda) = (2 + 0 - 1 - 0) / 4 = 1/4 = 0.25
  • 47. Recommendable
    Jaccard Similarity (same ratings as slide 46)
    J(Marcel, Guilherme) = ([C].size + [].size - [A].size - [D].size) / [A,B,C,D].size
    J(Marcel, Guilherme) = (1 + 0 - 1 - 1) / 4 = -1/4 = -0.25
  • 48. Recommendable
    Jaccard Similarity (same ratings as slide 46)
    J(Marcel, Flavio) = ([B,C].size + [D].size - [].size - [].size) / [A,B,C,D,E].size
    J(Marcel, Flavio) = (2 + 1 - 0 - 0) / 5 = 3/5 = 0.6
  • 49. Recommendable
    Jaccard Similarity (same ratings as slide 46)
    MostSimilar(Marcel) = [(Flavio, 0.6), (Amanda, 0.25), (Guilherme, -0.25)]
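    The like/dislike similarity from these slides can be checked with a short plain-Ruby sketch. It is an illustration under stated assumptions rather than Recommendable's own code: the `RATINGS` hash, `similarity` and `most_similar` are hypothetical names, and the data is copied from the four example users. Counting the shared dislike of D as an agreement gives Marcel and Flavio (2 + 1) / 5 = 0.6.

```ruby
require 'set'

# Hypothetical in-memory ratings, copied from the slides' example.
RATINGS = {
  'Marcel'    => { likes: Set['A', 'B', 'C'], dislikes: Set['D'] },
  'Amanda'    => { likes: Set['A', 'B'],      dislikes: Set['C'] },
  'Guilherme' => { likes: Set['C', 'D'],      dislikes: Set['A'] },
  'Flavio'    => { likes: Set['B', 'C', 'E'], dislikes: Set['D'] }
}.freeze

# (shared likes + shared dislikes - conflicting ratings) / items rated by either
def similarity(a, b)
  agreements    = (a[:likes] & b[:likes]).size + (a[:dislikes] & b[:dislikes]).size
  disagreements = (a[:likes] & b[:dislikes]).size + (a[:dislikes] & b[:likes]).size
  rated         = (a[:likes] | a[:dislikes] | b[:likes] | b[:dislikes]).size
  return 0.0 if rated.zero?

  (agreements - disagreements).fdiv(rated)
end

# Every other user paired with a score, most similar first.
def most_similar(name, ratings = RATINGS)
  me = ratings.fetch(name)
  ratings.reject { |other, _| other == name }
         .map { |other, rating| [other, similarity(me, rating)] }
         .sort_by { |_, score| -score }
end

p most_similar('Marcel')
```

    Sets keep the intersections and unions explicit, which is close to how the formula is written on the slides.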
  • 50. Recommendable
    Recommendations: the best of your recommendable models
        >> Movie.top
        => #<Movie name: "2001: A Space Odyssey">
        >> Movie.top(3)
        => [#<Movie name: "2001: A Space Odyssey">, #<Movie name: "A Clockwork Orange">, #<Movie name: "The Shining">]
    Wilson score confidence - Reddit Algorithm
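    The Wilson score named on this slide can be sketched as the lower bound of the Wilson confidence interval for the fraction of likes. This is a minimal illustration, not Recommendable's implementation: `wilson_lower_bound` is a hypothetical name, and z = 1.96 (roughly 95% confidence) is an assumed default.

```ruby
# Lower bound of the Wilson score interval for `likes` positive ratings
# out of `total` ratings. z = 1.96 is an assumed ~95% confidence level.
def wilson_lower_bound(likes, total, z = 1.96)
  return 0.0 if total.zero?

  phat = likes.fdiv(total)   # observed fraction of likes
  z2   = z * z
  centre = phat + z2 / (2 * total)
  margin = z * Math.sqrt((phat * (1 - phat) + z2 / (4 * total)) / total)
  (centre - margin) / (1 + z2 / total)
end

few_ratings  = wilson_lower_bound(10, 10)   # 100% liked, little evidence
many_ratings = wilson_lower_bound(90, 100)  # 90% liked, much more evidence

puts format('10/10 -> %.3f, 90/100 -> %.3f', few_ratings, many_ratings)
```

    Ranking by the lower bound instead of the raw like ratio means an item with 90 likes out of 100 can outrank one with 10 out of 10, because the larger sample supports more confidence.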
  • 51. Recommendable
    Callbacks
        class User < ActiveRecord::Base
          has_one :feed
          recommends :movies
          after_like :update_feed
          def update_feed(obj)
            feed.update "liked #{obj.name}"
          end
        end
    Uses apotonick/hooks to implement callbacks for liking, disliking, etc.
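    How an `after_like` hook of this kind might be wired can be sketched in plain Ruby. This is neither the apotonick/hooks gem nor Recommendable's code: the `LikeCallbacks` mixin, the array-backed feed and the `Movie` struct are all hypothetical, kept in memory so the example runs without Rails or Redis.

```ruby
# Hypothetical mixin: registers method names per class and calls them
# after every like.
module LikeCallbacks
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Register a method to run after an object is liked.
    def after_like(method_name)
      after_like_callbacks << method_name
    end

    def after_like_callbacks
      @after_like_callbacks ||= []
    end
  end

  def like(obj)
    liked << obj
    self.class.after_like_callbacks.each { |name| send(name, obj) }
    true
  end

  def liked
    @liked ||= []
  end
end

Movie = Struct.new(:name)

class User
  include LikeCallbacks
  after_like :update_feed

  def feed
    @feed ||= []
  end

  def update_feed(obj)
    feed << "liked #{obj.name}"
  end
end

user = User.new
user.like(Movie.new('The Shining'))
p user.feed
```

    Storing the callback names on the class keeps the declaration (`after_like :update_feed`) separate from the moment the like happens, which is the same split the slide's model code relies on.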
  • 52. Recommendable
    Manual recommendations
        Recommendable::Helpers::Calculations.update_similarities_for(user.id)
        Recommendable::Helpers::Calculations.update_recommendations_for(user.id)
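    The second of these two steps, turning similarities into recommendations, can be illustrated with a small plain-Ruby sketch. It assumes one plausible scoring rule (sum the similarities of the users who liked an item, subtract those of the users who disliked it, divide by the number of raters); the gem's actual calculation is not shown in the slides. The `recommend` function and all data below are hypothetical.

```ruby
# similarities: other user => similarity to the target user
# likes / dislikes: other user => array of item names
# already_rated: items the target user has rated and should not be offered
def recommend(similarities, likes, dislikes, already_rated, limit = 10)
  candidates = (likes.values + dislikes.values).flatten.uniq - already_rated

  scored = candidates.map do |item|
    likers    = likes.select    { |_, items| items.include?(item) }.keys
    dislikers = dislikes.select { |_, items| items.include?(item) }.keys
    total = likers.sum { |u| similarities.fetch(u, 0.0) } -
            dislikers.sum { |u| similarities.fetch(u, 0.0) }
    [item, total.fdiv(likers.size + dislikers.size)]
  end

  # Keep only positively scored items, best first.
  scored.select { |_, score| score > 0 }
        .sort_by { |_, score| -score }
        .first(limit)
end

similarities = { 'amanda' => 0.5, 'rafael' => -0.25, 'flavio' => 0.75 }
likes        = { 'amanda' => ['Thor', 'Armageddon'], 'rafael' => ['Toy Story'], 'flavio' => ['Thor'] }
dislikes     = { 'amanda' => [], 'rafael' => ['Thor'], 'flavio' => ['Toy Story'] }

p recommend(similarities, likes, dislikes, ['Armageddon'])
```

    A dislike from a user with negative similarity counts in an item's favour here, which is why "Thor" scores higher than its two likes alone would give.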
  • 53. Manual recommendations
    Redis makes the magic!
  • 54. Manual recommendations
    Redis makes the magic!
  • 55. Recommendable
    Recommendations over a Queueing System
    Put the workers to do the job! (Sidekiq, Resque, DelayedJob)
        module Recommendable
          module Workers
            class Resque
              include ::Resque::Plugins::UniqueJob if defined?(::Resque::Plugins::UniqueJob)
              @queue = :recommendable
              def self.perform(user_id)
                Recommendable::Helpers::Calculations.update_similarities_for(user_id)
                Recommendable::Helpers::Calculations.update_recommendations_for(user_id)
              end
            end
          end
        end
  • 56. Recommended Books
    Satnam Alag, Collective Intelligence in Action, Manning Publications, 2009
    Toby Segaran, Programming Collective Intelligence, O'Reilly, 2007
  • 57. Recommended Books
    Sau Sheong Chang, Exploring Everyday Things with R and Ruby, O'Reilly, 2012
  • 58. Recommended Course
    https://www.coursera.org/course/recsys
  • 59. Ruby developers, It does exist Web
  • 60. Ruby in the world of recommendations (also machine learning, statistics and visualizations)
    Marcel Caraciolo (@marcelcaraciolo)
    Developer, scientist, contributor to the Crab recsys project; has worked with Python for 6 years; interested in mobile, education, machine learning and data!
    Recife, Brazil - http://aimotion.blogspot.com