Ruby in the world of
recommendations
(also machine learning, statistics and visualizations..)
Marcel Caraciolo
@marcelcaraciolo
Developer, Scientist, contributor to the Crab recsys project,
has worked with Python for 6 years, interested in mobile,
education, machine learning and dataaaaa!
Recife, Brazil - http://aimotion.blogspot.com
Saturday, September 14, 2013
BACK UP YOUR FILES!
NEVER:
  find . -type f -not -name '*pyc' | xargs rm
Scientific Environment
Presentation & Visualization
Experimentation
(Re-Design)
Data Acquisition
Data Analysis
Where is Ruby?
Presentation & Visualization
Experimentation
(Re-Design)
Data Acquisition
Data Analysis
Where is Ruby?
Python was released in 1991; Ruby was released in 1995
Python was widely adopted and promoted by much of
Google's research and development team
Where is Ruby?
Python was released in 1991; Ruby was released in 1995
Python gained wide popularity after its official adoption by
much of Google's research team
"Python has been an important part of Google since the
beginning, and it remains so as our infrastructure grows;
we are always looking for more people with skills in this
language."
Peter Norvig, Google, Inc.
Where is Ruby?
Python was already well known, even in some older
scientific articles
Where is Ruby?
Ruby’s popularity exploded in 2004.
Focus on the web
Python:
Django - 2005; NumPy - 2005;
BioPython - 2001; SAGE - 2005;
Matplotlib - 2000
Where is Ruby?
For researchers, programming comes second, not
first as it does for us. - “A Ruby developer's answer”
Python
    [(x, x*x) for x in [1,2,3,4] if x != 3]
vs Ruby
`[1,2,3,4].map { |x| [x, x*x] if x != 3 }.compact`
vs Result
    [(1,1), (2,4), (4,16)]
Where is Ruby?
Ruby
Python
Hey, Ruby has options!
:(
:D
gem install nmatrix
git clone https://github.com/SciRuby/nmatrix.git
cd nmatrix/
bundle install
rake compile
rake repackage
gem install pkg/nmatrix-*.gem
>> NMatrix.new([2, 3], [0, 1, 2, 3, 4, 5], :int64).pp
[0, 1, 2]
[3, 4, 5]
=> nil
>> m = N[ [2, 3, 4], [7, 8, 9] ]
=> #<NMatrix:0x007f8e121b6cf8 shape:[2,3] dtype:int32 stype:dense>
>> m.pp
[2, 3, 4]
[7, 8, 9]
Depends on ATLAS/CBLAS
and is written mostly in C and C++
https://github.com/SciRuby/nmatrix/wiki/Getting-started
Data Visualization
•R
•Gnuplot
•Google Charts API
•JFreeChart
•Scruffy
•Timetric
•Tioga
•RChart
Data Visualization
require 'rsruby'
# R commands that draw a box plot into a PDF file
cmd = %Q(
  pdf(file = "r_directly.pdf")
  boxplot(c(1,2,3,4),c(5,6,7,8))
  dev.off()
)
# run the commands in the embedded R interpreter (assumes rsruby's eval_R)
RSRuby.instance.eval_R(cmd)
# pipe a script to the gnuplot command-line tool
def gnuplot(commands)
  IO.popen("gnuplot", "w") { |io| io.puts commands }
end
commands = %Q(
  set terminal svg
  set output "curves.svg"
  plot [-10:10] sin(x), atan(x), cos(atan(x))
)
gnuplot(commands)
http://effectif.com/ruby/manor/data-visualisation-with-ruby
https://github.com/glejeune/Ruby-Graphviz/
Other tools
•BioRuby
#!/usr/bin/env ruby
 
require 'bio'
 
# create a DNA sequence object from a String
dna = Bio::Sequence::NA.new("atcggtcggctta")
 
# create a RNA sequence object from a String
rna = Bio::Sequence::NA.new("auugccuacauaggc")
 
# create a Protein sequence from a String
aa = Bio::Sequence::AA.new("AGFAVENDSA")
 
# you can check if the sequence contains illegal characters
# that is not an accepted IUB character for that symbol
# (should prepare a Bio::Sequence::AA#illegal_symbols method also)
puts dna.illegal_bases
 
# translate and concatenate a DNA sequence to Protein sequence
newseq = aa + dna.translate
puts newseq # => "AGFAVENDSAIGRL"
http://bioruby.org/
Other tools
•RubyDoop (uses JRuby)
module WordCount
  class Reducer
    def reduce(key, values, context)
      sum = 0
      values.each { |value| sum += value.get }
      context.write(key, Hadoop::Io::IntWritable.new(sum))
    end
  end
end
https://github.com/iconara/rubydoop
module WordCount
  class Mapper
    def map(key, value, context)
      value.to_s.split.each do |word|
        word.gsub!(/\W/, '')  # strip non-word characters
        word.downcase!
        unless word.empty?
          context.write(Hadoop::Io::Text.new(word), Hadoop::Io::IntWritable.new(1))
        end
      end
    end
  end
end
Coming back to the
world of recommenders
The world is an over-crowded place
Recommendation Systems
Systems designed to recommend to me something I may like
Recommendation Systems
Graph Representation
And how does it work?
What do recommenders really do?
1. Predict how much you may like a certain product or service.
2. Suggest a list of N items ordered by your level of interest.
3. Suggest a list of N users for a product or service.
4. Explain why those items were recommended.
5. Adjust predictions and recommendations based on your
feedback and that of other users.
Content-Based Filtering
[Diagram]
Users: Marcel
Items: Gone with the Wind, Die Hard, Armageddon, Toy Story
Edges: likes, similar, recommends
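The content-based idea can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: the genre tags in ITEM_TAGS are invented for the example, and tag_overlap and content_recommendations are hypothetical helper names, not part of any library mentioned in this talk.

```ruby
require 'set'

# Hypothetical item descriptions: the genre tags are made up for illustration.
ITEM_TAGS = {
  'Die Hard'           => Set['action', 'thriller'],
  'Armageddon'         => Set['action', 'sci-fi', 'thriller'],
  'Toy Story'          => Set['animation', 'family'],
  'Gone with the Wind' => Set['drama', 'romance']
}

# Share of tags two items have in common (0.0 = nothing shared, 1.0 = identical).
def tag_overlap(a, b)
  union = a | b
  return 0.0 if union.empty?
  (a & b).size.to_f / union.size
end

# Rank the items the user has not liked yet by their best overlap with a liked item.
def content_recommendations(liked, item_tags)
  candidates = item_tags.keys - liked
  scored = candidates.map do |item|
    score = liked.map { |l| tag_overlap(item_tags[l], item_tags[item]) }.max || 0.0
    [item, score]
  end
  scored.select { |_, s| s > 0 }.sort_by { |_, s| -s }
end

recs = content_recommendations(['Die Hard'], ITEM_TAGS)
recs.each { |item, score| puts "#{item}: #{score.round(2)}" }
```

Only the item descriptions are used here, never other users' opinions, which is exactly what makes this approach sensitive to poorly described items.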
Problems with Content
Recommenders
1. Restricted Data Analysis
- Items and users are poorly described. Even worse for audio and images.
2. Specialized Data
- A person who has no experience with sushi never gets
the recommendation of the best sushi in town.
3. Portfolio Effect
- Just because I saw one Xuxa movie when I was a child, it insists on
recommending all of her movies (só para baixinhos!)
Collaborative Filtering
[Diagram]
Users: Marcel, Rafael, Amanda
Items: Gone with the Wind, Thor, Armageddon, Toy Story
Edges: like, similar, recommend
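A minimal user-based collaborative filtering sketch in Ruby, for illustration only. The likes in LIKES are invented, and user_similarity and collaborative_recommendations are hypothetical names: the idea is simply that items liked by users similar to me, which I have not rated yet, become my recommendations.

```ruby
require 'set'

# Hypothetical likes, made up for illustration.
LIKES = {
  'Marcel' => Set['Gone with the Wind', 'Thor'],
  'Rafael' => Set['Thor', 'Armageddon'],
  'Amanda' => Set['Toy Story']
}

# Overlap between two users' liked sets (plain Jaccard index).
def user_similarity(a, b)
  union = a | b
  return 0.0 if union.empty?
  (a & b).size.to_f / union.size
end

# Score each unseen item by the summed similarity of the users who like it.
def collaborative_recommendations(user, likes)
  scores = Hash.new(0.0)
  likes.each do |other, items|
    next if other == user
    sim = user_similarity(likes[user], items)
    next if sim <= 0
    (items - likes[user]).each { |item| scores[item] += sim }
  end
  scores.sort_by { |_, s| -s }
end

cf_recs = collaborative_recommendations('Marcel', LIKES)
cf_recs.each { |item, score| puts "#{item}: #{score.round(2)}" }
```

Note that no item description is needed at all; the price is that every user has to be compared with every other user, which is where the scalability problem comes from.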
Problems with Collaborative Filtering
1. Scalability
- Amazon with 5M users, 50K items, 1.4B ratings
2. Sparse Data
- I only rated one book at Amazon!
3. Cold Start
- New users and items with no records
4. Popularity
- Everyone reads Harry Potter!
5. Hacking
- The person who reads ‘Harry Potter’ also reads ‘Kama Sutra’
How does it show up?
Highlights
More about this artist...
Listen to similar songs
Someone similar to you also liked this...
Since you listened to this, you may like this one...
These items go together...
The most popular in your group...
New Releases
Recommendable
Quickly add a recommender engine for Likes and
Dislikes to your Ruby app
http://davidcel.is/recommendable/
Recommendable
Add to your Gemfile:
  gem 'recommendable'
Recommendable
Create a configuration initializer:
require 'redis'
Recommendable.configure do |config|
  # Recommendable's connection to Redis
  config.redis = Redis.new(:host => 'localhost', :port => 6379, :db => 0)
  # A prefix for all keys Recommendable uses
  config.redis_namespace = :recommendable
  # Whether or not to automatically enqueue users to have their
  # recommendations refreshed after they like/dislike an item
  config.auto_enqueue = true
  # The name of the queue that background jobs will be placed in
  config.queue_name = :recommendable
  # The number of nearest neighbors (k-NN) to check when updating
  # recommendations for a user. Set to `nil` if you want to check all
  # other users as opposed to a subset of the nearest ones.
  config.nearest_neighbors = nil
end
Recommendable
In the ONE model that will receive the
recommendations:
class User
  recommends :movies, :books, :minerals, :other_things
  # ...
end
Recommendable
You can chain your queries
>> current_user.liked_movies.limit(10)
>> current_user.bookmarked_books.where(:author => "Cormac McCarthy")
>> current_user.disliked_movies.joins(:cast_members).where('cast_members.name = ?', 'Kim Kardashian')
Recommendable
>> current_user.hidden_minerals.order('density DESC')
>> current_user.recommended_movies.where('year < 2010')
>> book.liked_by.order('age DESC').limit(20)
>> movie.disliked_by.where('age > 18')
Recommendable
You can also like your recommendable objects
>> user.like(movie)
=> true
>> user.likes?(movie)
=> true
>> user.rated?(movie)
=> true # also true if user.dislikes?(movie)
>> user.liked_movies
=> [#<Movie id: 23, name: "2001: A Space Odyssey">]
>> user.liked_movie_ids
=> ["23"]
>> user.like(book)
=> true
>> user.likes
=> [#<Movie id: 23, name: "2001: A Space Odyssey">, #<Book id: 42, title: "100 Years of Solitude">]
>> user.likes_count
=> 2
>> user.liked_movies_count
=> 1
>> user.likes_in_common_with(friend)
=> [#<Movie id: 23, name: "2001: A Space Odyssey">, #<Book id: 42, title: "100 Years of Solitude">]
>> user.liked_movies_in_common_with(friend)
=> [#<Movie id: 23, name: "2001: A Space Odyssey">]
>> movie.liked_by_count
=> 2
>> movie.liked_by
=> [#<User username: 'davidbowman'>, #<User username: 'frankpoole'>]
Recommendable
Naturally, you can also DISLIKE your recommendable
objects
>> user.dislike(movie)
>> user.dislikes?(movie)
>> user.disliked_movies
>> user.disliked_movie_ids
>> user.dislikes
>> user.dislikes_count
>> user.disliked_movies_count
>> user.dislikes_in_common_with(friend)
>> user.disliked_movies_in_common_with(friend)
>> movie.disliked_by_count
>> movie.disliked_by
Recommendable
Recommendations
>> friend.like(Movie.where(:name => "2001: A Space Odyssey").first)
>> friend.like(Book.where(:title => "A Clockwork Orange").first)
>> friend.like(Book.where(:title => "Brave New World").first)
>> friend.like(Book.where(:title => "One Flew Over the Cuckoo's Nest").first)
>> user.like(Book.where(:title => "A Clockwork Orange").first)
>> user.recommended_books # Defaults to 10 recommendations
=> [#<Book title: "Brave New World">, #<Book title: "One Flew Over the Cuckoo's Nest">]
>> user.similar_raters # Defaults to 10 similar users
=> [#<User username: "frankpoole">, #<User username: "davidbowman">, ...]
>> user.recommended_movies(10, 30) # 10 recommendations, offset by 30 (i.e. page 4)
=> [#<Movie name: "A Clockwork Orange">, #<Movie name: "Chinatown">, ...]
>> user.similar_raters(25, 50) # 25 similar users, offset by 50 (i.e. page 3)
=> [#<User username: "frankpoole">, #<User username: "davidbowman">, ...]
Recommendable
Jaccard Similarity
Marcel likes A, B, C and dislikes D
Amanda likes A, B and dislikes C
Guilherme likes C, D and dislikes A
Flavio likes B, C, E and dislikes D
J(u, v) = (likes in common + dislikes in common
- liked by u but disliked by v - disliked by u but liked by v) / items rated by either
J(Marcel, Amanda) =
([A,B].size + [].size - [C].size - [].size) / [A,B,C,D].size
= (2 + 0 - 1 - 0) / 4 = 1/4 = 0.25
J(Marcel, Guilherme) =
([C].size + [].size - [A].size - [D].size) / [A,B,C,D].size
= (1 + 0 - 1 - 1) / 4 = -1/4 = -0.25
J(Marcel, Flavio) =
([B,C].size + [D].size - [].size - [].size) / [A,B,C,D,E].size
= (2 + 1 - 0 - 0) / 5 = 3/5 = 0.6
MostSimilar(Marcel) = [(Flavio, 0.6), (Amanda, 0.25), (Guilherme, -0.25)]
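The similarity above can be checked with a short plain-Ruby sketch. It uses the slide's own data; jaccard_similarity and most_similar are hypothetical helper names written for this example and are not Recommendable's internal API. With this data it ranks Flavio first, then Amanda, then Guilherme.

```ruby
require 'set'

# Ratings taken from the slide: each person has a set of likes and dislikes.
RATINGS = {
  'Marcel'    => { likes: Set['A', 'B', 'C'], dislikes: Set['D'] },
  'Amanda'    => { likes: Set['A', 'B'],      dislikes: Set['C'] },
  'Guilherme' => { likes: Set['C', 'D'],      dislikes: Set['A'] },
  'Flavio'    => { likes: Set['B', 'C', 'E'], dislikes: Set['D'] }
}

# Agreements minus disagreements, divided by everything either person rated.
def jaccard_similarity(a, b)
  agreements    = (a[:likes] & b[:likes]).size + (a[:dislikes] & b[:dislikes]).size
  disagreements = (a[:likes] & b[:dislikes]).size + (a[:dislikes] & b[:likes]).size
  rated = a[:likes] | a[:dislikes] | b[:likes] | b[:dislikes]
  return 0.0 if rated.empty?
  (agreements - disagreements).to_f / rated.size
end

# Other people ranked from most to least similar to the given person.
def most_similar(name, ratings)
  others = ratings.keys - [name]
  others.map { |o| [o, jaccard_similarity(ratings[name], ratings[o])] }
        .sort_by { |_, score| -score }
end

ranking = most_similar('Marcel', RATINGS)
ranking.each { |person, score| puts "#{person}: #{score}" }
```

Because disagreements are subtracted, the score ranges from -1.0 (opposite tastes) to 1.0 (identical ratings), unlike the classic Jaccard index, which never goes below zero.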
Recommendable
Recommendations
The best of your recommendable models
Wilson score confidence - Reddit Algorithm
>> Movie.top
=> #<Movie name: "2001: A Space Odyssey">
>> Movie.top(3)
=> [#<Movie name: "2001: A Space Odyssey">, #<Movie name: "A Clockwork Orange">, #<Movie name: "The Shining">]
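The slide only names the Wilson score, so the following is a sketch of the standard lower-bound formula rather than Recommendable's actual code. The method name wilson_lower_bound, the default z = 1.96 (roughly 95% confidence) and the vote counts are assumptions made for the example.

```ruby
# Lower bound of the Wilson score interval for the fraction of likes.
# z = 1.96 corresponds to roughly 95% confidence (an assumed default).
def wilson_lower_bound(likes, dislikes, z = 1.96)
  n = likes + dislikes
  return 0.0 if n.zero?
  p_hat  = likes.to_f / n
  centre = p_hat + z * z / (2 * n)
  margin = z * Math.sqrt((p_hat * (1 - p_hat) + z * z / (4 * n)) / n)
  (centre - margin) / (1 + z * z / n)
end

# Hypothetical [likes, dislikes] counts, made up for illustration.
votes = {
  'well liked, many votes' => [9, 1],
  'one single like'        => [1, 0],
  'divisive'               => [50, 50]
}

top = votes.sort_by { |_, (up, down)| -wilson_lower_bound(up, down) }
top.each { |name, (up, down)| puts "#{name}: #{wilson_lower_bound(up, down).round(3)}" }
```

The point of the lower bound is that an item with a single like does not outrank one with nine likes and one dislike, even though its raw like ratio is higher.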
Recommendable
Callbacks
Uses apotonick/hooks to implement callbacks for liking,
disliking, etc.
class User < ActiveRecord::Base
  has_one :feed
  recommends :movies
  after_like :update_feed
  def update_feed(obj)
    feed.update "liked #{obj.name}"
  end
end
Recommendable
Manual recommendations
Recommendable::Helpers::Calculations.update_similarities_for(user.id)
Recommendable::Helpers::Calculations.update_recommendations_for(user.id)
Redis does the magic!
Recommendable
Recommendations over a queueing system
Let the workers do the job! (Sidekiq, Resque, DelayedJob)
module Recommendable
  module Workers
    class Resque
      include ::Resque::Plugins::UniqueJob if defined?(::Resque::Plugins::UniqueJob)
      @queue = :recommendable
      def self.perform(user_id)
        Recommendable::Helpers::Calculations.update_similarities_for(user_id)
        Recommendable::Helpers::Calculations.update_recommendations_for(user_id)
      end
    end
  end
end
Recommended Books
Satnam Alag, Collective Intelligence in
Action, Manning Publications, 2009
Toby Segaran, Programming Collective
Intelligence, O'Reilly, 2007
Sau Sheong Chang, Exploring Everyday Things
with R and Ruby, O’Reilly, 2012
Recommended Course
https://www.coursera.org/course/recsys
Ruby developers, It does
exist
Web