Coen Stevens
Lead Recommendation Engineer
How to build a recommender system?
           Wakoopa use case
Mission:
Discover software & games
Software tracker
Windows         Mac          Linux
Your profile
Updates
Software pages
Recommendations
Building a recommender system
       Approach and challenges
Data
                      what do we have?

Usage (implicit)              vs.    Ratings (explicit)

•   Noisy                            •   Accurate

•   Only positive feedback           •   Positive and negative feedback

•   Easy to collect                  •   Hard to collect
Data
                       what do we use?


•   Active users (Tracker activity in the past month): ~9,000

•   Actively used software items (in the past month): ~10,000

•   We calculate recommendations separately for each OS,
    each combined with Web applications
Recommender system methods
Collaborative recommendations: The user will be
recommended items that people with similar tastes and
preferences liked (used) in the past

•   Item-based collaborative filtering

•   User-based collaborative filtering (we only use for
    calculating user similarities to find people like you)

•   Combining both methods
Item-Based Collaborative Filtering
           User software usage matrix
                     Software items
                     (a dash marks no recorded usage)

             220    90     -   180     -    22     -

             280    12    42     -    80     -     -

   Users     175   210     -   210     -    45     -

             165     -    35   195    13    25     -

               -   100    50   185     -    35   190

               -    60     -    65     -     -   185
User software usage matrix [0, 1]
                      Software items




              1   1      0     1       0   1   0

              1   1      1     0       1   0   0

Users         1   1      0     1       0   1   0

              1   0      1     1       1   1   0

              0   1      1     1       0   1   1

              0   1      0     1       0   0   1
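The step from raw usage to the [0, 1] matrix above can be sketched in a few lines. This is only an illustration: the placement of the raw values in columns is an assumption inferred from the [0, 1] matrix, 0 stands for "no recorded usage", and the names `RAW_USAGE` and `binarize` are hypothetical.

```python
# Raw usage per user (rows) and software item (columns).
# Column placement is an assumption inferred from the [0, 1] matrix;
# 0 stands for "no recorded usage".
RAW_USAGE = [
    [220,  90,  0, 180,  0, 22,   0],
    [280,  12, 42,   0, 80,  0,   0],
    [175, 210,  0, 210,  0, 45,   0],
    [165,   0, 35, 195, 13, 25,   0],
    [  0, 100, 50, 185,  0, 35, 190],
    [  0,  60,  0,  65,  0,  0, 185],
]


def binarize(matrix):
    """Map any positive usage to 1 and everything else to 0."""
    return [[1 if value > 0 else 0 for value in row] for row in matrix]


BINARY_USAGE = binarize(RAW_USAGE)
for row in BINARY_USAGE:
    print(row)
```

Binarizing throws away how much an item is used; it only keeps whether it is used at all.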
How do we predict the probability that I would like to use Gmail?
                              Software items

                      1   1      0     1       0   1   0

                      1   1      1     0       1   0   0

         Users        1   1      ?     1       0   1   0

                      1   0      1     1       1   1   0

                      0   1      1     1       0   1   1

                      0   1      0     1       0   0   1
Calculate the similarities between Gmail and the other software items.
                                            Software items




                                1       1       0       1    0   1   0

                                1       1       1       0    1   0   0

            Users               1       1       0       1    0   1   0

                                1       0       1       1    1   1   0

                                0       1       1       1    0   1   1

                                0       1       0       1    0   0   1


                    Cosine Similarity(Firefox, Gmail)
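A minimal sketch of the cosine similarity between two item columns of the [0, 1] matrix. The column positions chosen for Firefox and Gmail are assumptions (the slides show icons, not names), and the helper names are hypothetical.

```python
import math

# Binary user x item usage matrix from the slides.
USAGE = [
    [1, 1, 0, 1, 0, 1, 0],
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 0, 1, 1],
    [0, 1, 0, 1, 0, 0, 1],
]

# Assumed column positions; the transcript does not name the columns.
FIREFOX, GMAIL = 0, 2


def column(matrix, index):
    """Return one item column as a list with one entry per user."""
    return [row[index] for row in matrix]


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an item nobody uses is similar to nothing
    return dot / (norm_a * norm_b)


similarity = cosine_similarity(column(USAGE, FIREFOX), column(USAGE, GMAIL))
print(round(similarity, 3))
```

With binary data the dot product is simply the number of users who use both items.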
•   Popularity correction: we put less trust in popular software
Item-item correlation matrix



    1    0.1   0.6   0.1   0.1   0.1   0.7

   0.2   1     0.8   0.5   0.8   0.1   0.9

   0.1   0.6   1     0.5   0.7   0.2   0.3

   0.2   0.6   0.4   1     0.8   0.2   0.3

   0.5   0.4   0.4   0.4   1     0.1   0.2

   0.5   0.5   0.3   0.5   0.3   1     0.3

   0.2   0.6   0.3   0.8   0.7   0.7   1
Gmail similarities
          (third column of the item-item correlation matrix,
           without the self-similarity of 1)

          0.6

          0.8

          0.4

          0.4

          0.3

          0.3
K-nearest neighbor approach

•   Performance vs quality

•   We take only the ‘K’ most similar items (say 4)

•   Space complexity: O(m + Kn)

•   Computational complexity: O(m + n²)

Gmail similarities: 0.6, 0.8, 0.4, 0.4, 0.3, 0.3
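Keeping only the K most similar items can be sketched as a sort followed by a cut. The similarity values are the ones listed for Gmail; the item names and the function name `k_nearest` are hypothetical placeholders.

```python
# Similarities between Gmail and six other items (values from the slide).
# The item names are hypothetical placeholders.
GMAIL_SIMILARITIES = {
    "item_1": 0.6,
    "item_2": 0.8,
    "item_4": 0.4,
    "item_5": 0.4,
    "item_6": 0.3,
    "item_7": 0.3,
}


def k_nearest(similarities, k=4):
    """Return the k (item, similarity) pairs with the highest similarity."""
    ranked = sorted(similarities.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


neighbours = k_nearest(GMAIL_SIMILARITIES, k=4)
print(neighbours)
```

Storing K neighbours per item instead of the full item-item matrix is what keeps the neighbour table small.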
Calculate the predicted value for Gmail

Gmail similarities   User usage

          0.6               1

          0.8               1

          0.4               1

          0.4               -

           -                1
Calculate the predicted value for Gmail

Gmail similarities   User usage

          0.6              0.9

          0.8              0.8

          0.4              0.6

          0.4               -

           -               0.2

Usage correction: more usage results in a higher score [0,1]
Calculate the predicted value for Gmail

    (0.6 * 0.9) + (0.8 * 0.8) + (0.4 * 0.6)     1.42
    --------------------------------------- =  ------ ≈ 0.65
             0.6 + 0.8 + 0.4 + 0.4               2.2

•   User feedback

•   Contacts usage

•   Commercial vs Free
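The weighted average can be checked with a short sketch. It assumes the fourth neighbour (similarity 0.4) is not used by this user, so it contributes 0 to the numerator but still counts in the denominator; the function name `predict` is hypothetical. For the listed values the expression evaluates to 1.42 / 2.2 ≈ 0.65, while plain binary usage (all ones) for the same neighbours gives 1.8 / 2.2 ≈ 0.82.

```python
def predict(similarities, usage):
    """Similarity-weighted average of the user's usage over the K neighbours."""
    total = sum(similarities)
    if total == 0:
        return 0.0
    return sum(s * u for s, u in zip(similarities, usage)) / total


# K = 4 nearest neighbours of Gmail and this user's usage of each of them.
SIMILARITIES = [0.6, 0.8, 0.4, 0.4]
USAGE_CORRECTED = [0.9, 0.8, 0.6, 0.0]  # usage-corrected scores in [0, 1]
USAGE_BINARY = [1, 1, 1, 0]             # plain used / not used

corrected = predict(SIMILARITIES, USAGE_CORRECTED)
binary = predict(SIMILARITIES, USAGE_BINARY)
print(round(corrected, 2), round(binary, 2))
```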
Calculate all unknown values and
show the Top-N recommendations to each user
                    Software items

            1   1   ?   1   ?   1   ?

            1   1   1   ?   1   ?   ?

Users       1   1   ?   1   ?   1   ?

            1   ?   1   1   1   1   ?

            ?   1   1   1   ?   1   1

            ?   1   ?   1   ?   ?   1
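Putting the pieces together, a rough sketch of Top-N recommendation for one user: score every unused item with the similarity-weighted average over its K nearest items, then keep the N best. It uses plain cosine similarity on the binary matrix without the popularity and usage corrections, and all function names are hypothetical.

```python
import math

# Binary user x item usage matrix from the slides (0 = unknown).
USAGE = [
    [1, 1, 0, 1, 0, 1, 0],
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 0, 1, 1],
    [0, 1, 0, 1, 0, 0, 1],
]


def cosine(a, b):
    """Cosine similarity of two equally long vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_n(matrix, user, n=2, k=4):
    """Return the n best (item index, score) pairs for items the user does not use."""
    n_items = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_items)]
    scores = {}
    for j in range(n_items):
        if matrix[user][j]:
            continue  # already used, nothing to predict
        # K most similar items to item j, as (similarity, item index) pairs.
        neighbours = sorted(
            ((cosine(columns[j], columns[i]), i) for i in range(n_items) if i != j),
            reverse=True,
        )[:k]
        total = sum(sim for sim, _ in neighbours)
        weighted = sum(sim * matrix[user][i] for sim, i in neighbours)
        scores[j] = weighted / total if total else 0.0
    best = sorted(scores, key=scores.get, reverse=True)[:n]
    return [(j, scores[j]) for j in best]


recommendations = top_n(USAGE, user=2, n=2, k=4)
print(recommendations)
```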
Explainability
             Why did I get this recommendation?


•   Overlap between the item’s (K) neighbors and your usage
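One way to sketch such an explanation: intersect the recommended item's K neighbours with the items the user already uses. The item names and the function name `explain` are hypothetical; the similarity values are the Gmail neighbours from the earlier slides.

```python
# K nearest neighbours of the recommended item, most similar first.
# Item names are hypothetical placeholders.
NEIGHBOURS = [("item_2", 0.8), ("item_1", 0.6), ("item_4", 0.4), ("item_5", 0.4)]

# Items this user already uses.
USED_ITEMS = {"item_1", "item_2", "item_4", "item_6"}


def explain(neighbours, used_items):
    """Return the neighbours the user already uses, most similar first."""
    return [name for name, _ in neighbours if name in used_items]


reasons = explain(NEIGHBOURS, USED_ITEMS)
print("Recommended because you use:", ", ".join(reasons))
```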
User-Based Collaborative Filtering
                                 Finding people like you

                                  1   1   0   1   0   1    0

                                  1   1   1   0   1   0    0

                                  1   1   0   1   0   1    0

                                  1   1   1   1   1   1    0

                                  0   1   1   1   0   1    1

                                  0   1   0   1   0   0    1

                     Cosine Similarity(Coen, Menno)
Applying inverse user frequency

        log(n / n_i), where n_i is the number of users that use item i
                  and n is the total number of users in the database

                                    0.1   0.2   0     0.4   0     0.4   0

                                    0.1   0.2   0.6   0     0.8   0     0

                                    0.1   0.2   0     0.4   0     0.4   0

                                    0.1   0.2   0.6   0.4   0.8   0.4   0

                                    0     0.2   0.6   0.4   0     0.4   0.2

                                    0     0.2   0     0.4   0     0     0.2

                     Cosine Similarity(Coen, Menno)

        The fact that you both use TextMate tells you more than
                       when you both use Firefox
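A small sketch of inverse user frequency applied before the user-user cosine similarity. It assumes the natural logarithm and treats the first two rows as the two users being compared (both assumptions), so the numbers it prints differ from the illustrative weights shown on the slide. Function names are hypothetical.

```python
import math

# Binary user x item matrix from the user-based slide.
USERS = [
    [1, 1, 0, 1, 0, 1, 0],
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 0, 1, 1],
    [0, 1, 0, 1, 0, 0, 1],
]


def inverse_user_frequency(matrix):
    """Weight log(n / n_i) per item; items used by everyone get weight 0."""
    n = len(matrix)
    weights = []
    for j in range(len(matrix[0])):
        n_i = sum(row[j] for row in matrix)
        weights.append(math.log(n / n_i) if n_i else 0.0)
    return weights


def cosine_similarity(a, b):
    """Cosine similarity of two equally long vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


WEIGHTS = inverse_user_frequency(USERS)
WEIGHTED = [[value * w for value, w in zip(row, WEIGHTS)] for row in USERS]

# Rows 0 and 1 stand in for the two users being compared (assumption).
similarity = cosine_similarity(WEIGHTED[0], WEIGHTED[1])
print([round(w, 2) for w in WEIGHTS], round(similarity, 3))
```

An item that every user has carries no information about taste, so its weight drops to zero, while a rarely used item dominates the similarity.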
User-user correlation matrix



     1     0.8   0.6   0.5   0.7   0.2

     0.8   1     0.4   0.7   0.5   0.5

     0.6   0.4   1     0.4   0.9   0.1

     0.5   0.8   0.4   1     0.6   0.4

     0.8   0.5   0.9   0.6   1     0.2

     0.2   0.5   0.1   0.4   0.2   1
Performance
                 measure for success

•   Cross-validation: Train-Test split (80-20)

•   Precision and Recall:
    - precision = size(hit set) / size(total given recs)
    - recall = size(hit set) / size(test set)

•   Root mean squared error (RMSE)
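The three measures above can be sketched directly from their definitions. The example recommendations, held-out items and scores are made up for illustration, and the function names are hypothetical.

```python
import math


def precision(recommended, test_items):
    """size(hit set) / size(total given recs)."""
    hits = set(recommended) & set(test_items)
    return len(hits) / len(recommended) if recommended else 0.0


def recall(recommended, test_items):
    """size(hit set) / size(test set)."""
    hits = set(recommended) & set(test_items)
    return len(hits) / len(test_items) if test_items else 0.0


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual values."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up example: four recommendations, three held-out items (the 20% test part).
RECOMMENDED = ["item_3", "item_5", "item_7", "item_2"]
HELD_OUT = {"item_5", "item_2", "item_6"}

print(precision(RECOMMENDED, HELD_OUT))  # 2 hits out of 4 recommendations
print(recall(RECOMMENDED, HELD_OUT))     # 2 hits out of 3 held-out items
print(rmse([0.65, 0.2, 0.9], [1, 0, 1]))
```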
Implementation

•   Ruby Enterprise Edition (garbage collection)

•   MySQL database

•   Built our own C libraries

•   Amazon EC2:
    - Low cost
    - Flexibility
    - Ease of use

•   Open source
Future challenges


•   What is the best algorithm for Wakoopa? (or you)

•   Reducing space-time complexity (scalability):
    - Parallelization (Clojure)
    - Distributed computing (Hadoop)
1 evening, 3 speakers, 100 developers
           www.recked.org

More Related Content

What's hot

Recent advances in deep recommender systems
Recent advances in deep recommender systemsRecent advances in deep recommender systems
Recent advances in deep recommender systems
NAVER Engineering
 
Recommender Systems In Industry
Recommender Systems In IndustryRecommender Systems In Industry
Recommender Systems In Industry
Xavier Amatriain
 
Overview of recommender system
Overview of recommender systemOverview of recommender system
Overview of recommender system
Stanley Wang
 
Context Aware Recommendations at Netflix
Context Aware Recommendations at NetflixContext Aware Recommendations at Netflix
Context Aware Recommendations at Netflix
Linas Baltrunas
 
Tutorial on Deep Learning in Recommender System, Lars summer school 2019
Tutorial on Deep Learning in Recommender System, Lars summer school 2019Tutorial on Deep Learning in Recommender System, Lars summer school 2019
Tutorial on Deep Learning in Recommender System, Lars summer school 2019
Anoop Deoras
 
Calibrated Recommendations
Calibrated RecommendationsCalibrated Recommendations
Calibrated Recommendations
Harald Steck
 
Recent Trends in Personalization: A Netflix Perspective
Recent Trends in Personalization: A Netflix PerspectiveRecent Trends in Personalization: A Netflix Perspective
Recent Trends in Personalization: A Netflix Perspective
Justin Basilico
 
Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial
Alexandros Karatzoglou
 
Recommender system
Recommender systemRecommender system
Recommender system
Nilotpal Pramanik
 
Recommendation System Explained
Recommendation System ExplainedRecommendation System Explained
Recommendation System Explained
Crossing Minds
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
Yves Raimond
 
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Xavier Amatriain
 
Recommendation system
Recommendation systemRecommendation system
Recommendation system
Akshat Thakar
 
Netflix Recommendations - Beyond the 5 Stars
Netflix Recommendations - Beyond the 5 StarsNetflix Recommendations - Beyond the 5 Stars
Netflix Recommendations - Beyond the 5 Stars
Xavier Amatriain
 
Recommender systems using collaborative filtering
Recommender systems using collaborative filteringRecommender systems using collaborative filtering
Recommender systems using collaborative filtering
D Yogendra Rao
 
An introduction to Recommender Systems
An introduction to Recommender SystemsAn introduction to Recommender Systems
An introduction to Recommender Systems
David Zibriczky
 
Sequential Decision Making in Recommendations
Sequential Decision Making in RecommendationsSequential Decision Making in Recommendations
Sequential Decision Making in Recommendations
Jaya Kawale
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
Girish Khanzode
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
MLconf
 
Missing values in recommender models
Missing values in recommender modelsMissing values in recommender models
Missing values in recommender models
Parmeshwar Khurd
 

What's hot (20)

Recent advances in deep recommender systems
Recent advances in deep recommender systemsRecent advances in deep recommender systems
Recent advances in deep recommender systems
 
Recommender Systems In Industry
Recommender Systems In IndustryRecommender Systems In Industry
Recommender Systems In Industry
 
Overview of recommender system
Overview of recommender systemOverview of recommender system
Overview of recommender system
 
Context Aware Recommendations at Netflix
Context Aware Recommendations at NetflixContext Aware Recommendations at Netflix
Context Aware Recommendations at Netflix
 
Tutorial on Deep Learning in Recommender System, Lars summer school 2019
Tutorial on Deep Learning in Recommender System, Lars summer school 2019Tutorial on Deep Learning in Recommender System, Lars summer school 2019
Tutorial on Deep Learning in Recommender System, Lars summer school 2019
 
Calibrated Recommendations
Calibrated RecommendationsCalibrated Recommendations
Calibrated Recommendations
 
Recent Trends in Personalization: A Netflix Perspective
Recent Trends in Personalization: A Netflix PerspectiveRecent Trends in Personalization: A Netflix Perspective
Recent Trends in Personalization: A Netflix Perspective
 
Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial
 
Recommender system
Recommender systemRecommender system
Recommender system
 
Recommendation System Explained
Recommendation System ExplainedRecommendation System Explained
Recommendation System Explained
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
 
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
 
Recommendation system
Recommendation systemRecommendation system
Recommendation system
 
Netflix Recommendations - Beyond the 5 Stars
Netflix Recommendations - Beyond the 5 StarsNetflix Recommendations - Beyond the 5 Stars
Netflix Recommendations - Beyond the 5 Stars
 
Recommender systems using collaborative filtering
Recommender systems using collaborative filteringRecommender systems using collaborative filtering
Recommender systems using collaborative filtering
 
An introduction to Recommender Systems
An introduction to Recommender SystemsAn introduction to Recommender Systems
An introduction to Recommender Systems
 
Sequential Decision Making in Recommendations
Sequential Decision Making in RecommendationsSequential Decision Making in Recommendations
Sequential Decision Making in Recommendations
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
 
Missing values in recommender models
Missing values in recommender modelsMissing values in recommender models
Missing values in recommender models
 

Viewers also liked

Design of recommender systems
Design of recommender systemsDesign of recommender systems
Design of recommender systems
Rashmi Sinha
 
How to build a Recommender System
How to build a Recommender SystemHow to build a Recommender System
How to build a Recommender System
Võ Duy Tuấn
 
Building a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineBuilding a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engine
NYC Predictive Analytics
 
Recommendation system
Recommendation system Recommendation system
Recommendation system
Vikrant Arya
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
T212
 
Collaborative Filtering Recommendation System
Collaborative Filtering Recommendation SystemCollaborative Filtering Recommendation System
Collaborative Filtering Recommendation System
Milind Gokhale
 
Recommender Engines Seminar Paper
Recommender Engines Seminar PaperRecommender Engines Seminar Paper
Recommender Engines Seminar Paper
Thomas Hess
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
Lior Rokach
 
How to Build Recommender System with Content based Filtering
How to Build Recommender System with Content based FilteringHow to Build Recommender System with Content based Filtering
How to Build Recommender System with Content based Filtering
Võ Duy Tuấn
 
genetic algorithm based music recommender system
genetic algorithm based music recommender systemgenetic algorithm based music recommender system
genetic algorithm based music recommender system
neha pevekar
 
Buidling large scale recommendation engine
Buidling large scale recommendation engineBuidling large scale recommendation engine
Buidling large scale recommendation engine
Keeyong Han
 
How to Build a Recommendation Engine on Spark
How to Build a Recommendation Engine on SparkHow to Build a Recommendation Engine on Spark
How to Build a Recommendation Engine on Spark
Caserta
 
Wakoopa Recommendation Engine on AWS
Wakoopa Recommendation Engine on AWSWakoopa Recommendation Engine on AWS
Wakoopa Recommendation Engine on AWS
Menno van der Sman
 
Impact of web 2.0 on evaluation and select solutions
Impact of web 2.0 on evaluation and select solutionsImpact of web 2.0 on evaluation and select solutions
Impact of web 2.0 on evaluation and select solutionssarvenaz arianfar
 
Interaction Design Patterns in Recommender Systems
Interaction Design Patterns in Recommender SystemsInteraction Design Patterns in Recommender Systems
Interaction Design Patterns in Recommender Systems
University of Bergen
 
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
Lucidworks
 
Recommender Systems in E-Commerce
Recommender Systems in E-CommerceRecommender Systems in E-Commerce
Recommender Systems in E-Commerce
Roger Chen
 
recommender_systems
recommender_systemsrecommender_systems
recommender_systems
Ramin Anushiravani
 
H2O World - Intro to Data Science with Erin Ledell
H2O World - Intro to Data Science with Erin LedellH2O World - Intro to Data Science with Erin Ledell
H2O World - Intro to Data Science with Erin Ledell
Sri Ambati
 
Recommender system
Recommender systemRecommender system
Recommender system
Jie Jin
 

Viewers also liked (20)

Design of recommender systems
Design of recommender systemsDesign of recommender systems
Design of recommender systems
 
How to build a Recommender System
How to build a Recommender SystemHow to build a Recommender System
How to build a Recommender System
 
Building a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineBuilding a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engine
 
Recommendation system
Recommendation system Recommendation system
Recommendation system
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Collaborative Filtering Recommendation System
Collaborative Filtering Recommendation SystemCollaborative Filtering Recommendation System
Collaborative Filtering Recommendation System
 
Recommender Engines Seminar Paper
Recommender Engines Seminar PaperRecommender Engines Seminar Paper
Recommender Engines Seminar Paper
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
How to Build Recommender System with Content based Filtering
How to Build Recommender System with Content based FilteringHow to Build Recommender System with Content based Filtering
How to Build Recommender System with Content based Filtering
 
genetic algorithm based music recommender system
genetic algorithm based music recommender systemgenetic algorithm based music recommender system
genetic algorithm based music recommender system
 
Buidling large scale recommendation engine
Buidling large scale recommendation engineBuidling large scale recommendation engine
Buidling large scale recommendation engine
 
How to Build a Recommendation Engine on Spark
How to Build a Recommendation Engine on SparkHow to Build a Recommendation Engine on Spark
How to Build a Recommendation Engine on Spark
 
Wakoopa Recommendation Engine on AWS
Wakoopa Recommendation Engine on AWSWakoopa Recommendation Engine on AWS
Wakoopa Recommendation Engine on AWS
 
Impact of web 2.0 on evaluation and select solutions
Impact of web 2.0 on evaluation and select solutionsImpact of web 2.0 on evaluation and select solutions
Impact of web 2.0 on evaluation and select solutions
 
Interaction Design Patterns in Recommender Systems
Interaction Design Patterns in Recommender SystemsInteraction Design Patterns in Recommender Systems
Interaction Design Patterns in Recommender Systems
 
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
 
Recommender Systems in E-Commerce
Recommender Systems in E-CommerceRecommender Systems in E-Commerce
Recommender Systems in E-Commerce
 
recommender_systems
recommender_systemsrecommender_systems
recommender_systems
 
H2O World - Intro to Data Science with Erin Ledell
H2O World - Intro to Data Science with Erin LedellH2O World - Intro to Data Science with Erin Ledell
H2O World - Intro to Data Science with Erin Ledell
 
Recommender system
Recommender systemRecommender system
Recommender system
 

Similar to How to build a recommender system?

Open Social Tech Talk Beijing
Open Social Tech Talk   BeijingOpen Social Tech Talk   Beijing
Open Social Tech Talk Beijing
Arne Roomann-Kurrik
 
DevOps goes Mobile (daho.am)
DevOps goes Mobile (daho.am)DevOps goes Mobile (daho.am)
DevOps goes Mobile (daho.am)
Wooga
 
Wakoopa Recommendations Engine on AWS
Wakoopa Recommendations Engine on AWSWakoopa Recommendations Engine on AWS
Wakoopa Recommendations Engine on AWS
Amazon Web Services
 
Oslo Schibsted Performance Gathering
Oslo Schibsted Performance GatheringOslo Schibsted Performance Gathering
Oslo Schibsted Performance Gathering
Almudena Vivanco
 
Spil games konrad
Spil games konradSpil games konrad
Spil games konrad
BigDataExpo
 
Building native apps with web components
Building native apps with web componentsBuilding native apps with web components
Building native apps with web components
Denis Radin
 
Keeping Swift Apps Small
Keeping Swift Apps SmallKeeping Swift Apps Small
Keeping Swift Apps Small
Bruno Rocha
 
IoT Analytics Workshop (IOT314-R1) - AWS re:Invent 2018
IoT Analytics Workshop (IOT314-R1) - AWS re:Invent 2018IoT Analytics Workshop (IOT314-R1) - AWS re:Invent 2018
IoT Analytics Workshop (IOT314-R1) - AWS re:Invent 2018
Amazon Web Services
 
Customer Showcase for AWS IoT Analytics (IOT219) - AWS re:Invent 2018
Customer Showcase for AWS IoT Analytics (IOT219) - AWS re:Invent 2018Customer Showcase for AWS IoT Analytics (IOT219) - AWS re:Invent 2018
Customer Showcase for AWS IoT Analytics (IOT219) - AWS re:Invent 2018
Amazon Web Services
 
Our Favorite Admin Features in Cognos Analytics 11.1
Our Favorite Admin Features in Cognos Analytics 11.1Our Favorite Admin Features in Cognos Analytics 11.1
Our Favorite Admin Features in Cognos Analytics 11.1
Senturus
 
PredictionIO - Building Applications That Predict User Behavior Through Big D...
PredictionIO - Building Applications That Predict User Behavior Through Big D...PredictionIO - Building Applications That Predict User Behavior Through Big D...
PredictionIO - Building Applications That Predict User Behavior Through Big D...
predictionio
 
UCLA HACKU'11
UCLA HACKU'11UCLA HACKU'11
UCLA HACKU'11
Gopal Venkatesan
 
Introduction to GluonNLP
Introduction to GluonNLPIntroduction to GluonNLP
Introduction to GluonNLP
Apache MXNet
 
Windows10TipsandTricksBooklet
Windows10TipsandTricksBookletWindows10TipsandTricksBooklet
Windows10TipsandTricksBooklet
William Sung - CISA, CISSP, ITIL
 
systemd and configuration management
systemd and configuration managementsystemd and configuration management
systemd and configuration management
Julien Pivotto
 
Presentation at Hong Kong Start-Up Association Event
Presentation at Hong Kong Start-Up Association EventPresentation at Hong Kong Start-Up Association Event
Presentation at Hong Kong Start-Up Association Event
Ben Cheng
 
Feature Based Opinion Mining from Amazon Reviews
Feature Based Opinion Mining from Amazon ReviewsFeature Based Opinion Mining from Amazon Reviews
Feature Based Opinion Mining from Amazon Reviews
Ravi Kiran Holur Vijay
 
IoT State of the Union
IoT State of the UnionIoT State of the Union
IoT State of the Union
Amazon Web Services
 
الفصل الثالث البرمجيات التطبيقية - د. خالد بكرو Application Soft - Dr. Khale...
الفصل الثالث  البرمجيات التطبيقية - د. خالد بكرو Application Soft - Dr. Khale...الفصل الثالث  البرمجيات التطبيقية - د. خالد بكرو Application Soft - Dr. Khale...
الفصل الثالث البرمجيات التطبيقية - د. خالد بكرو Application Soft - Dr. Khale...
Dr. Khaled Bakro
 
Static Software Watermarking
Static Software WatermarkingStatic Software Watermarking
Static Software Watermarking
James Hamilton
 

Similar to How to build a recommender system? (20)

Open Social Tech Talk Beijing
Open Social Tech Talk   BeijingOpen Social Tech Talk   Beijing
Open Social Tech Talk Beijing
 
DevOps goes Mobile (daho.am)
DevOps goes Mobile (daho.am)DevOps goes Mobile (daho.am)
DevOps goes Mobile (daho.am)
 
Wakoopa Recommendations Engine on AWS
Wakoopa Recommendations Engine on AWSWakoopa Recommendations Engine on AWS
Wakoopa Recommendations Engine on AWS
 
Oslo Schibsted Performance Gathering
Oslo Schibsted Performance GatheringOslo Schibsted Performance Gathering
Oslo Schibsted Performance Gathering
 
Spil games konrad
Spil games konradSpil games konrad
Spil games konrad
 
Building native apps with web components
Building native apps with web componentsBuilding native apps with web components
Building native apps with web components
 
Keeping Swift Apps Small
Keeping Swift Apps SmallKeeping Swift Apps Small
Keeping Swift Apps Small
 
How to build a recommender system?

  • 3. How to build a recommender system? Wakoopa use case
  • 10. Building a recommender system Approach and challenges
  • 11. Data: what do we have? Usage (implicit) vs. ratings (explicit)
        Usage (implicit)            Ratings (explicit)
        Noisy                       Accurate
        Only positive feedback      Positive and negative feedback
        Easy to collect             Hard to collect
  • 12. Data: what do we use?
        • Active users (Tracker activity in the past month): ~9,000
        • Actively used software items (in the past month): ~10,000
        • We calculate recommendations separately for each OS, in each case together with Web applications
  • 13. Recommender system methods
        Collaborative recommendations: the user is recommended items that people with similar tastes and preferences liked (used) in the past.
        • Item-based collaborative filtering
        • User-based collaborative filtering (used only to calculate user similarities, to find people like you)
        • Combining both methods
  • 14. Item-Based Collaborative Filtering: user software usage matrix (rows: users, columns: software items, "-": no usage)
          220    90     -   180     -    22     -
          280    12    42     -    80     -     -
          175   210     -   210     -    45     -
          165     -    35   195    13    25     -
            -   100    50   185     -    35   190
            -    60     -    65     -     -   185
  • 15. User software usage matrix [0, 1] (rows: users, columns: software items)
          1 1 0 1 0 1 0
          1 1 1 0 1 0 0
          1 1 0 1 0 1 0
          1 0 1 1 1 1 0
          0 1 1 1 0 1 1
          0 1 0 1 0 0 1
  • 16. How do we predict the probability that I would like to use Gmail? (rows: users, columns: software items, "?": value to predict)
          1 1 0 1 0 1 0
          1 1 1 0 1 0 0
          1 1 ? 1 0 1 0
          1 0 1 1 1 1 0
          0 1 1 1 0 1 1
          0 1 0 1 0 0 1
  • 17. Calculate the similarities between Gmail and the other software items: Cosine Similarity(Firefox, Gmail) (rows: users, columns: software items)
          1 1 0 1 0 1 0
          1 1 1 0 1 0 0
          1 1 0 1 0 1 0
          1 0 1 1 1 1 0
          0 1 1 1 0 1 1
          0 1 0 1 0 0 1
  • 18. Calculate the similarities between Gmail and the other software items: Cosine Similarity(Firefox, Gmail) (rows: users, columns: software items)
          1 1 0 1 0 1 0
          1 1 1 0 1 0 0
          1 1 0 1 0 1 0
          1 0 1 1 1 1 0
          0 1 1 1 0 1 1
          0 1 0 1 0 0 1
  • 19. Calculate the similarities between Gmail and the other software items: Cosine Similarity(Firefox, Gmail) (rows: users, columns: software items)
        Popularity correction: we put less trust in popular software.
          1 1 0 1 0 1 0
          1 1 1 0 1 0 0
          1 1 0 1 0 1 0
          1 0 1 1 1 1 0
          0 1 1 1 0 1 1
          0 1 0 1 0 0 1
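The item-to-item cosine similarity named on these slides can be sketched as follows. This is a minimal illustration rather than Wakoopa's implementation: the matrix is the binary usage matrix from the slides, the column indices `ITEM_A` and `ITEM_B` are hypothetical (the transcript does not say which column is Firefox or Gmail), and the popularity correction is left out because the slides do not give its formula.

```python
import math

# Binary usage matrix from the slides: rows are users, columns are software items.
USAGE = [
    [1, 1, 0, 1, 0, 1, 0],
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 0, 1, 1],
    [0, 1, 0, 1, 0, 0, 1],
]


def column(matrix, j):
    """Return column j of the matrix, i.e. one item's usage vector over all users."""
    return [row[j] for row in matrix]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical column choice: which columns stand for which products is an assumption.
ITEM_A, ITEM_B = 0, 2
similarity = cosine_similarity(column(USAGE, ITEM_A), column(USAGE, ITEM_B))
print(round(similarity, 3))  # 0.577
```

Repeating this for every pair of columns would fill an item-item matrix like the one on the following slides.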
  • 20. Item-item correlation matrix
          1    0.1  0.6  0.1  0.1  0.1  0.7
          0.2  1    0.8  0.5  0.8  0.1  0.9
          0.1  0.6  1    0.5  0.7  0.2  0.3
          0.2  0.6  0.4  1    0.8  0.2  0.3
          0.5  0.4  0.4  0.4  1    0.1  0.2
          0.5  0.5  0.3  0.5  0.3  1    0.3
          0.2  0.6  0.3  0.8  0.7  0.7  1
  • 21. Item-item correlation matrix
          1    0.1  0.6  0.1  0.1  0.1  0.7
          0.2  1    0.8  0.5  0.8  0.1  0.9
          0.1  0.6  1    0.5  0.7  0.2  0.3
          0.2  0.6  0.4  1    0.8  0.2  0.3
          0.5  0.4  0.4  0.4  1    0.1  0.2
          0.5  0.5  0.3  0.5  0.3  1    0.3
          0.2  0.6  0.3  0.8  0.7  0.7  1
        Gmail similarities: 0.6 0.8 0.4 0.4 0.3 0.3
  • 22. K-nearest neighbor approach
        • Performance vs quality
        • We take only the ‘K’ most similar items (say 4)
        • Space complexity: O(m + Kn)
        • Computational complexity: O(m + n²)
        Gmail similarities: 0.6 0.8 0.4 0.4 0.3 0.3
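Keeping only the K most similar items can be sketched with a heap-based selection. The similarity values are the six Gmail similarities from the slide; the item labels (`item_1` and so on) are hypothetical placeholders, since the transcript does not name the neighbors.

```python
import heapq

# Gmail's similarity to the six other items, as listed on the slide.
# The keys are hypothetical labels, not names from the deck.
gmail_similarities = {
    "item_1": 0.6,
    "item_2": 0.8,
    "item_4": 0.4,
    "item_5": 0.4,
    "item_6": 0.3,
    "item_7": 0.3,
}


def k_nearest(similarities, k):
    """Return the k (item, similarity) pairs with the highest similarity."""
    return heapq.nlargest(k, similarities.items(), key=lambda pair: pair[1])


neighbors = k_nearest(gmail_similarities, 4)
for name, score in neighbors:
    print(name, score)
```

Storing only K neighbors per item is what keeps the per-item storage proportional to K rather than to the full number of items.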
  • 23. Calculate the predicted value for Gmail ("-": no value shown)
        Gmail similarities   User usage
        0.6                  1
        0.8                  1
        0.4                  1
        0.4                  -
        -                    1
  • 24. Calculate the predicted value for Gmail ("-": no value shown)
        Usage correction: more usage results in a higher score [0, 1]
        Gmail similarities   User usage
        0.6                  0.9
        0.8                  0.8
        0.4                  0.6
        0.4                  -
        -                    0.2
  • 25. Calculate the predicted value for Gmail ("-": no value shown)
        Gmail similarities   User usage
        0.6                  0.9
        0.8                  0.8
        0.4                  0.6
        0.4                  -
        -                    0.2
        ((0.6 * 0.9) + (0.8 * 0.8) + (0.4 * 0.6)) / (0.6 + 0.8 + 0.4 + 0.4) = 1.42 / 2.2 ≈ 0.65
  • 26. Calculate the predicted value for Gmail ("-": no value shown)
        • User feedback
        • Contacts usage
        • Commercial vs Free
        Gmail similarities   User usage
        0.6                  0.9
        0.8                  0.8
        0.4                  0.6
        0.4                  -
        -                    0.2
        ((0.6 * 0.9) + (0.8 * 0.8) + (0.4 * 0.6)) / (0.6 + 0.8 + 0.4 + 0.4) = 1.42 / 2.2 ≈ 0.65
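The prediction step is a similarity-weighted average of the user's corrected usage over the K nearest items. The sketch below uses the four similarity and usage pairs from the slide and assumes that a neighbor the user has not used counts as usage 0.0. The function name `predict` is hypothetical, and the extra signals listed on the slide (user feedback, contacts usage, commercial vs free) are not modelled.

```python
def predict(neighbors):
    """Weighted average of usage scores, weighted by item similarity.

    neighbors: list of (similarity, usage) pairs for the K nearest items,
    with usage 0.0 for items the user has not used (an assumption).
    """
    total_similarity = sum(similarity for similarity, _ in neighbors)
    if total_similarity == 0:
        return 0.0
    weighted_usage = sum(similarity * usage for similarity, usage in neighbors)
    return weighted_usage / total_similarity


# (similarity to Gmail, corrected usage) for the K = 4 nearest items on the slide.
gmail_neighbors = [(0.6, 0.9), (0.8, 0.8), (0.4, 0.6), (0.4, 0.0)]
prediction = predict(gmail_neighbors)
print(round(prediction, 3))  # 0.645
```

The unused neighbor still adds its similarity to the denominator, which pulls the score down: the user skipped an item that is close to Gmail.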
  • 27. Calculate all unknown values and show the Top-N recommendations to each user (rows: users, columns: software items, "?": value to predict)
          1 1 ? 1 ? 1 ?
          1 1 1 ? 1 ? ?
          1 1 ? 1 ? 1 ?
          1 ? 1 1 1 1 ?
          ? 1 1 1 ? 1 1
          ? 1 ? 1 ? ? 1
  • 28. Explainability Why did I get this recommendation? • Overlap between the item’s (K) neighbors and your usage
  • 29. User-Based Collaborative Filtering: finding people like you, Cosine Similarity(Coen, Menno) (rows: users, columns: software items)
          1 1 0 1 0 1 0
          1 1 1 0 1 0 0
          1 1 0 1 0 1 0
          1 1 1 1 1 1 0
          0 1 1 1 0 1 1
          0 1 0 1 0 0 1
  • 30. Applying inverse user frequency log(n/ni): ni is the number of users that use item i and n is the total number of users in the database. The fact that you both use Textmate tells you more than the fact that you both use Firefox.
        Cosine Similarity(Coen, Menno) (rows: users, columns: software items)
          0.1  0.2  0    0.4  0    0.4  0
          0.1  0.2  0.6  0    0.8  0    0
          0.1  0.2  0    0.4  0    0.4  0
          0.1  0.2  0.6  0.4  0.8  0.4  0
          0    0.2  0.6  0.4  0    0.4  0.2
          0    0.2  0    0.4  0    0    0.2
  • 31. Cosine Similarity(Coen, Menno) (rows: users, columns: software items)
          0.1  0.2  0    0.4  0    0.4  0
          0.1  0.2  0.6  0    0.8  0    0
          0.1  0.2  0    0.4  0    0.4  0
          0.1  0.2  0.6  0.4  0.8  0.4  0
          0    0.2  0.6  0.4  0    0.4  0.2
          0    0.2  0    0.4  0    0    0.2
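Inverse user frequency weighting followed by a user-to-user cosine similarity can be sketched as below. The binary matrix is the one shown for user-based filtering. The natural logarithm is an assumption, since the slides do not state the base, and the row indices used for the comparison are hypothetical because the transcript does not say which rows belong to Coen and Menno. The weights on the slides appear to be illustrative, so the values computed here will differ from them.

```python
import math

# Binary usage matrix from the user-based slides: rows are users, columns are items.
USAGE = [
    [1, 1, 0, 1, 0, 1, 0],
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 0, 1, 1],
    [0, 1, 0, 1, 0, 0, 1],
]


def inverse_user_frequency(matrix):
    """Weight per item: log(n / n_i), with n users in total and n_i users of item i."""
    n = len(matrix)
    weights = []
    for j in range(len(matrix[0])):
        n_i = sum(row[j] for row in matrix)
        # Natural log is an assumption; items nobody uses get weight 0.0.
        weights.append(math.log(n / n_i) if n_i else 0.0)
    return weights


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


weights = inverse_user_frequency(USAGE)
weighted = [[x * w for x, w in zip(row, weights)] for row in USAGE]

# Hypothetical choice of rows for the two users being compared.
similarity = cosine_similarity(weighted[0], weighted[1])
print(round(similarity, 3))
```

An item used by every user gets weight log(1) = 0 and stops contributing to the similarity, which is the effect the Textmate versus Firefox remark describes.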
  • 32. User-user correlation matrix
          1    0.8  0.6  0.5  0.7  0.2
          0.8  1    0.4  0.7  0.5  0.5
          0.6  0.4  1    0.4  0.9  0.1
          0.5  0.8  0.4  1    0.6  0.4
          0.8  0.5  0.9  0.6  1    0.2
          0.2  0.5  0.1  0.4  0.2  1
  • 33. Performance measure for success
        • Cross-validation: train-test split (80-20)
        • Precision and recall:
          - precision = size(hit set) / size(total given recs)
          - recall = size(hit set) / size(test set)
        • Root mean squared error (RMSE)
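The precision, recall and RMSE definitions above can be sketched as follows. The hit set is taken to be the recommended items that also appear in the held-out test set. The item names and the sample scores are made up for illustration and do not come from the deck.

```python
import math


def precision_recall(recommended, test_items):
    """Precision and recall of a recommendation list against held-out test items."""
    hits = set(recommended) & set(test_items)
    precision = len(hits) / len(recommended) if recommended else 0.0
    recall = len(hits) / len(test_items) if test_items else 0.0
    return precision, recall


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual scores."""
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up example data: four recommendations, three held-out items, two hits.
recommended = ["app_a", "app_b", "app_c", "app_d"]
held_out = ["app_b", "app_d", "app_e"]
precision, recall = precision_recall(recommended, held_out)
error = rmse([0.9, 0.4], [1.0, 0.0])
print(precision, round(recall, 3), round(error, 3))
```

With an 80-20 split, `held_out` would be the 20% of each user's items withheld from training, and the recommendations would be computed from the remaining 80%.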
  • 34. Implementation
        • Ruby Enterprise Edition (garbage collection)
        • MySQL database
        • Built our own C libraries
        • Amazon EC2:
          - Low cost
          - Flexibility
          - Ease of use
        • Open source
  • 35. Future challenges
        • What is the best algorithm for Wakoopa? (or you)
        • Reducing space-time complexity (scalability):
          - Parallelization (Clojure)
          - Distributed computing (Hadoop)
  • 37. 1 evening, 3 speakers, 100 developers www.recked.org