• Share
  • Email
  • Embed
  • Like
  • Save
  • Private Content

Loading…

Flash Player 9 (or above) is needed to view presentations.
We have detected that you do not have it on your computer. To install it, go here.

Like this presentation? Why not share!

A scalable collaborative filtering framework based on co-clustering

on

  • 1,163 views

 

Statistics

Views

Total Views
1,163
Views on SlideShare
1,134
Embed Views
29

Actions

Likes
0
Downloads
11
Comments
0

2 Embeds 29

http://web204seminar.blogspot.com 24
http://web204seminar.blogspot.tw 5

Accessibility

Categories

Upload Details

Uploaded via as Microsoft PowerPoint

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

    A scalable collaborative filtering framework based on co-clustering A scalable collaborative filtering framework based on co-clustering Presentation Transcript

    • A Scalable Collaborative Filtering Framework based on Co-clustering
      Author: Thomas George, SrujanaMeruguin ICDM’05.
      Presenter: Rei-Zhe Liu.Date: 2010/10/26.
    • Outline
      Introduction
      System architecture
      Experiments and result
      Conclusion
      2
    • Introduction
      We propose a dynamic collaborative filtering approach that can support the entry of new users, items and ratings using a hybrid of incremental and batch versions of the co-clustering algorithm.
      Empirical comparison of our approach with SVD, NNMF and correlation-based collaborative filtering techniques indicates comparable accuracy at a much lower computational effort.
      3
    • System architecture
    • Problem definition(1/2)
      The approximate matrix for prediction is given by
      5
    • Problem definition(2/2)
      We can now pose the prediction of unknown ratings as a co-clustering problem where we seek to find the optimal user and item clustering such that the approximation error with respect to the known ratings of A is minimized,
      where ensures that only the known ratings contribute to the loss function.
      6
    • Algorithm(1/3)
      7
    • Algorithm(2/3)
      8
    • Algorithm(3/3)
      9
    • System description
      P1 handles the prediction and incremental training.
      P2 is responsible for the static training.
      During incremental training P1, also updates the raw ratings.
      P2 performs co-clustering repeatedly by reading A(the current ratings matrix) and updating S(summary statistics) when done.
      Data Objects A and S are stored at 2 parts: (a)stable part (b)increment part.
      At the end of each co-clustering run, the two parts are merged to obtain a new set of stable values.
      10
    • Experiments and results
    • Data sets and Exp. Settings(1/2)
      Data set
      MovieLens: 943-1882 user-by-movie matrix. Totally 100,000ratings. Rated from 1 to 5.
      Evaluation methodology
      The prediction accuracy was measured using the mean absolute error (MAE),which is the average of the absolute values of the errors over all the predictions.
      The static training time was estimated in terms of the CPU time taken for the core training routines (viz. co-clustering and SVD).
      The prediction time was estimated by averaging over the response time taken for all the predictions.
      12
    • Data sets and Exp. Settings(2/2)
      For evaluating the prediction accuracy, we created ten 80-20% random train-test splits of the datasets and averaged the results over the various splits.
      We considered two scenarios, —(i) static testing, where the known ratings do not change, and (ii) dynamic testing, where the ratings are updated incrementally.
      Algorithms
      We compared the performance of our co-clustering based approach with SVD [13], NNMF [10] and classic correlation-based collaborative filtering [12].
      An incremental SVD-based approach [14] using a folding in technique was also implemented in order to evaluate the prediction accuracy in dynamic scenarios with changing ratings.
      13
    • Evaluation(1/3)
      k = l = SVD rank =NNMF rank=3
      k = l = SVD rank=3
      14
    • Evaluation(2/3)
      Dataset: Mov1
      Dataset: MovieLens
      15
      CoC: (m+n+kl-k-l)
      NNMF, SVD: (m+n)(k+l)
    • Evaluation(3/3)
      16
      Dataset: MovieLens
    • Conclusion
    • Conclusion
      In this paper, we presented a new dynamic collaborative filtering approach based on simultaneous clustering of users and items.
      Empirical results indicate that our approach can provide high quality predictions at a much lower computational cost compared to traditional correlation and SVD-based approaches.
      18