Incremental Item-based Collaborative Filtering

João Marques da Silva
Palco Workshop - May 13, 2009
Item Similarity

                          Clã       Xutos      Gift    DaWeasel
            Ana            1          1         0           0
            Miguel         1          1         1           0
            Ivo            0          1         0           1
            Paula          0          0         1           0
            Joana          1          0         0           0

Take columns as vectors:

  $v_{\text{Clã}} = (1, 1, 0, 0, 1)$  and  $v_{\text{Gift}} = (0, 1, 0, 1, 0)$

Similarity between Clã and Gift (cosine measure):

  $\mathrm{sim}(\text{Clã}, \text{Gift}) = \cos(v_{\text{Clã}}, v_{\text{Gift}}) = \dfrac{v_{\text{Clã}} \cdot v_{\text{Gift}}}{\|v_{\text{Clã}}\| \, \|v_{\text{Gift}}\|} = \dfrac{1}{\sqrt{3}\,\sqrt{2}} \approx 0.41$

Similarity Matrix
S matrix: M×M, with M = number of items

             Clã     Xutos    Gift    DaWeasel
  Clã         1
  Xutos      ...       1
  Gift       0.41     ...      1
  DaWeasel    0       ...      0         1


How do we keep S up-to-date?

•  Rebuild S at each new session:
   O(m²n) for m items and n users.
•  Incrementally update S with session data:
   O(km) for k items in the session.
Algorithm
Cosine measure for binary ratings:

  $\cos(i, j) = \dfrac{\#(I \cap J)}{\sqrt{\#I \times \#J}}$   where $I$, $J$ are the sets of users that rated items $i$, $j$

A cache matrix Int stores #(I ∩ J) for all item pairs (i, j):

  $Int_{i,j} = \#(I \cap J)$
  $Int_{i,i} = \#I$

For each new session:
   –  Increment $Int_{i,j}$ by 1 for each item pair (i, j) in the session.
   –  For each item $i$ in the session, update the corresponding row/column of S:

  $S_{i,j} = \dfrac{Int_{i,j}}{\sqrt{Int_{i,i}} \, \sqrt{Int_{j,j}}}$   for all items $j$
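A sketch of this update in R, under assumptions not stated in the slides: Int is a dense M×M count matrix, session is a vector of item indices, and each item occurs at most once per session (names are illustrative, not the original Palco code):

  # Absorb one session: bump co-rating counts, then refresh the
  # affected rows/columns of the similarity matrix S.
  update_similarity <- function(Int, S, session) {
    # every pair of session items, and each diagonal entry, gains one count
    Int[session, session] <- Int[session, session] + 1
    d <- sqrt(diag(Int))                 # sqrt(#I) for every item
    d[d == 0] <- 1                       # items never rated: avoid 0/0
    for (i in session) {
      S[i, ] <- Int[i, ] / (d[i] * d)    # S[i,j] = Int[i,j] / (sqrt(Int[i,i]) * sqrt(Int[j,j]))
      S[, i] <- S[i, ]                   # keep S symmetric
    }
    list(Int = Int, S = S)
  }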
Forgetting

•  Usage and content change!
   –  News content quickly becomes obsolete
   –  Music/Movies/Books: popularity is often volatile

•  How can CF adapt to change?
   –  Forget older data
   –  Two methods: sliding windows and fading factors
Forgetting: Sliding Windows

[Figure: session weight vs. session index. A window of fixed length covers the most recent sessions; data inside the window (up to the current session) keeps full weight, older sessions are dropped.]

Good for the non-incremental approach:
rebuild S with the data in the window (a sketch follows).
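A sketch of the windowed rebuild in R, assuming sessions is a time-ordered list of item-index vectors and M is the number of items (a hypothetical layout, not the original code):

  # Rebuild S from the last w sessions only
  rebuild_windowed <- function(sessions, M, w) {
    recent <- tail(sessions, w)
    R <- matrix(0, nrow = length(recent), ncol = M)   # session-by-item matrix
    for (s in seq_along(recent)) R[s, recent[[s]]] <- 1
    Int <- crossprod(R)                               # Int[i,j] = #(I ∩ J) within the window
    d <- sqrt(diag(Int))
    d[d == 0] <- 1                                    # items unseen in the window
    Int / outer(d, d)                                 # cosine similarity matrix S
  }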
Forgetting: Fading Factors

[Figure: session weight vs. session index. Session weight decays gradually with age; the current session carries full weight, older sessions count progressively less.]

Good for the incremental approach. Before updating S:

    S = αS,  0 < α < 1

α = 1 corresponds to no forgetting (the non-fading case). A sketch follows.
Implementation


•  Implementation in R
   –  Code available from previous work (C. Miranda)
   –  Algorithms adapted to use the forgetting mechanisms
   –  Improvements: sparse matrix handling (see the sketch below)
   –  Limitations of R: speed
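One possible shape of the sparse-matrix improvement, using R's Matrix package (illustrative; not the original code):

  library(Matrix)

  M   <- 1285                         # number of items, as in PALCO
  Int <- sparseMatrix(i = integer(0), j = integer(0), x = numeric(0),
                      dims = c(M, M)) # co-rating counts, initially all zero
  session <- c(3, 41, 97)             # hypothetical item indices from one session
  Int[session, session] <- Int[session, session] + 1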




Experiments

•  Aims
   –  Forgetting: is it useful?
   –  Sliding windows vs. fading factors
   –  Is item-based better than user-based?

•  Evaluation method
   –  All-but-one protocol (training, test and hidden sets)
   –  Artificial disturbances in the datasets
   –  Accuracy: precision/recall for binary ratings (a sketch of the protocol follows)
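A sketch of the protocol in R, under assumptions not in the slides: one item per test session is hidden, candidate items are scored by their summed similarity to the observed items, and a hit means the hidden item appears among the top-N recommendations:

  # All-but-one evaluation with precision/recall at N
  evaluate_all_but_one <- function(S, test_sessions, N = 5) {
    hits <- 0
    for (session in test_sessions) {
      if (length(session) < 2) next
      hidden   <- session[length(session)]            # hold one item out
      observed <- session[-length(session)]
      scores <- colSums(S[observed, , drop = FALSE])  # similarity to observed items
      scores[observed] <- -Inf                        # don't re-recommend seen items
      topN <- order(scores, decreasing = TRUE)[1:N]
      hits <- hits + (hidden %in% topN)
    }
    # with one hidden item per session, recall@N equals the hit rate
    c(recall = hits / length(test_sessions),
      precision = hits / (N * length(test_sessions)))
  }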
Experiments: datasets

2 sequential datasets:

  Dataset    Origin            # sessions    # items
  PALCO      Palco Principal          725       1285
  ART        Artificial              1500          4



PALCO: listened tracks on Palco Principal
ART: artificial dataset with an abrupt change, {a,b,c} → {a,b,d} at session 500 (a possible generator is sketched below)
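One plausible generator for such a stream in R (an illustrative assumption: every session contains the full item set of the current regime):

  # 1500 sessions over 4 items, with an abrupt change at session 500
  make_art <- function(n = 1500, change_at = 500) {
    lapply(seq_len(n), function(s)
      if (s < change_at) c("a", "b", "c") else c("a", "b", "d"))
  }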



Results (so far)
•  Matrix update time
   –  Update time < rebuild time
   –  Item-based is better when #users > #items
   –  PALCO: user-based performs better
   –  Non-incremental is good with small windows
•  Recommendation time
   –  Item-based is faster
•  Recovery from drifts
   –  ART: α < 1 recovers faster than α = 1 (as expected)
   –  PALCO: α = 1 still better, even with 90% drift!
Accuracy IBFF w/ ART

[Figure: accuracy of IBFF (item-based, fading factors) on the ART dataset.]
Accuracy UBSW, UBFF w/ PALCO

[Figure: accuracy of UBSW and UBFF (user-based, sliding windows / fading factors) on the PALCO dataset.]
Issues

•  Forgetting
   –  Not good for the PALCO dataset?
   –  Good on the ART dataset, but ART is not realistic
   –  Other datasets (e.g. news)?
•  Long-term effects → larger-scale experiments
   –  Better hardware: on the way
   –  Other implementations (Java, C, SQL…)
•  Palco Principal
   –  More items than users!
   –  Item-based is possibly better for artist recommendations.
Thank you!





More Related Content

Similar to Incremental Item-based Collaborative Filtering:

  H2O.ai's Distributed Deep Learning by Arno Candel 04/03/14 (Sri Ambati)
  H2O Open Source Deep Learning, Arno Candel 03-20-14 (Sri Ambati)
  San Francisco Hadoop User Group Meetup Deep Learning (Sri Ambati)
  H2O Distributed Deep Learning by Arno Candel 071614 (Sri Ambati)
  [PR12] PR-036 Learning to Remember Rare Events (Taegyun Jeon)
  SVD and the Netflix Dataset (Ben Mabey)
  H2O Deep Learning at Next.ML (Sri Ambati)
  Fast Distributed Online Classification (Prasad Chalasani)
  The Back Propagation Learning Algorithm (ESCOM)
  H2ODeepLearningThroughExamples021215 (Sri Ambati)
  Yulia Honcharenko "Application of metric learning for logo recognition" (Fwdays)
  Single shot multibox detectors (지현 백)
  Mining Adaptively Frequent Closed Unlabeled Rooted Trees in Data Streams (Albert Bifet)
  Large data with Scikit-learn - Boston Data Mining Meetup - Alex Perrier (Alexis Perrier)
  Efficient Data Stream Classification via Probabilistic Adaptive Windows (Albert Bifet)
  Efficient Similarity Computation for Collaborative Filtering in Dynamic Envir... (Olivier Jeunen)
  A scalable collaborative filtering framework based on co clustering (AllenWu)
  Making BIG DATA smaller (Tony Tran)
  Kaggle Otto Challenge: How we achieved 85th out of 3,514 and what we learnt (Eugene Yan Ziyou)
