A SCALABLE
    COLLABORATIVE
    FILTERING FRAMEWORK
    BASED ON CO-CLUSTERING
    Authors/ Thomas George and Srujana Merugu
    Source/ ICDM’05, pp. 625-628
    Presenter/ Allen
OUTLINE
 Introduction
 Related Work

 Problem Definition

 Collaborative Filtering via Co-clustering

 Scalable Collaborative Filtering System

 Experimental Results

 Conclusion




INTRODUCTION
   Due to the overwhelming increase in web-based
    activities, users are often forced to choose from a large
    number of products or content items.

   To aid users in the decision making process, it has
    become increasingly important to design recommender
    systems.

   Collaborative filtering identifies the likely preferences of a
    user based on the known preferences of other users.
INTRODUCTION (CONT.)
   Existing collaborative filtering methods
      Correlation-based criteria
      Singular value decomposition (SVD)
      Non-negative matrix factorization (NNMF)
           Drawback:
              The training component is computationally expensive.




   Practical scenarios such as real-time news personalization
    require dynamic collaborative filtering.

   The key idea
      Simultaneously obtaining user and item neighborhoods via co-clustering.
      Generating predictions based on average ratings.
INTRODUCTION (CONT.)
   Two new contributions:
     Dynamic collaborative filtering approach
          Supporting the entry of new users, items and ratings via a hybrid of
           incremental and batch versions of the co-clustering algorithm.

     A scalable, real-time collaborative filtering system
          Developing parallel versions of co-clustering, prediction and
           incremental training routines.


   Notation:
     $A$: matrix, with $A_{ij}$ denoting the corresponding matrix elements.
     $\chi$: set, enumerated as $\{x_i\}_{i=1}^{n}$, where the $x_i$ are the elements of
      the set.
RELATED WORK
   Recommender System
     Content-based filtering system
     Collaborative filtering system



   Co-clustering
     SVD and NNMF-based filtering techniques predict the
      unknown ratings based on a low-rank approximation of the
      original ratings matrix.
          The missing values are filled with the average ratings.
     Incremental versions of SVD have been proposed to reduce the
      computational cost (SDM 2003).
PROBLEM DEFINITION
   Let $U=\{u_i\}_{i=1}^{m}$ be the set of users such that $|U|=m$ and
    $P=\{p_j\}_{j=1}^{n}$ be the set of items such that $|P|=n$.

   Let $A$ be the $m \times n$ ratings matrix such that $A_{ij}$ is the rating
    of the user $u_i$ for the item $p_j$.
     Let $W$ be the $m \times n$ matrix corresponding to the confidence of
       the ratings in $A$.
          $W_{ij}=1$ if the rating is known and 0 otherwise.


   Let the user clustering be $\rho: \{1, \dots, m\} \to \{1, \dots, k\}$ and the item
    clustering be $\gamma: \{1, \dots, n\} \to \{1, \dots, l\}$.
     $k$: # user clusters; $l$: # item clusters
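The definitions above can be held in a few NumPy arrays. The following is only a minimal sketch: the toy ratings, the use of 0 to mark an unknown rating, and the 0-based cluster labels are assumptions made for illustration and do not come from the slides.

```python
import numpy as np

# Toy ratings matrix A (m = 4 users, n = 3 items).
# Assumption of this sketch: 0 marks an unknown rating.
A = np.array([
    [5.0, 0.0, 3.0],
    [4.0, 2.0, 0.0],
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 4.0],
])

# W_ij = 1 where the rating is known and 0 otherwise.
W = (A > 0).astype(float)

m, n = A.shape
k, l = 2, 2                      # number of user clusters, item clusters

# rho[i] is the cluster of user i, gamma[j] the cluster of item j
# (0-based here, whereas the slides number clusters from 1).
rho = np.array([0, 0, 1, 1])
gamma = np.array([0, 1, 1])
```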
PROBLEM DEFINITION (CONT.)
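   The approximate matrix $\hat{A}$ is given by

     $\hat{A}_{ij} = A^{COC}_{gh} + (A^{R}_{i} - A^{RC}_{g}) + (A^{C}_{j} - A^{CC}_{h})$

     where $g=\rho(i)$, $h=\gamma(j)$.
     $A^{R}_{i}$, $A^{C}_{j}$ are the average ratings of user $u_i$ and item $p_j$.
     $A^{COC}_{gh}$, $A^{RC}_{g}$ and $A^{CC}_{h}$ are the average ratings of the corresponding
       co-cluster, user-cluster and item-cluster.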
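As a rough illustration of how the five averages and the resulting prediction could be computed, here is a small NumPy sketch. The function names, the toy data, and the choice to give an empty group an average of 0 are assumptions of this sketch. The prediction combines the co-cluster average with the user and item offsets from their cluster averages, following the definitions above.

```python
import numpy as np

def _safe_div(num, den):
    # Assumption of this sketch: a group with no known ratings gets average 0.
    return np.divide(num, den, out=np.zeros_like(num, dtype=float), where=den > 0)

def cocluster_averages(A, W, rho, gamma, k, l):
    """Return (A_R, A_C, A_RC, A_CC, A_COC), averaged over known ratings only."""
    AW = A * W
    A_R = _safe_div(AW.sum(axis=1), W.sum(axis=1))    # per-user average
    A_C = _safe_div(AW.sum(axis=0), W.sum(axis=0))    # per-item average
    R = np.eye(k)[rho]                                # m x k one-hot membership
    C = np.eye(l)[gamma]                              # n x l one-hot membership
    coc_sum, coc_cnt = R.T @ AW @ C, R.T @ W @ C      # k x l sums and counts
    A_COC = _safe_div(coc_sum, coc_cnt)               # co-cluster average
    A_RC = _safe_div(coc_sum.sum(axis=1), coc_cnt.sum(axis=1))  # user-cluster
    A_CC = _safe_div(coc_sum.sum(axis=0), coc_cnt.sum(axis=0))  # item-cluster
    return A_R, A_C, A_RC, A_CC, A_COC

def predict(i, j, rho, gamma, A_R, A_C, A_RC, A_CC, A_COC):
    """Co-cluster average plus the user and item offsets from their clusters."""
    g, h = rho[i], gamma[j]
    return A_COC[g, h] + (A_R[i] - A_RC[g]) + (A_C[j] - A_CC[h])

# Toy data (hypothetical): 0 marks an unknown rating.
A = np.array([[5.0, 0.0, 3.0],
              [4.0, 2.0, 0.0],
              [0.0, 1.0, 5.0],
              [1.0, 0.0, 4.0]])
W = (A > 0).astype(float)
rho, gamma = np.array([0, 0, 1, 1]), np.array([0, 1, 1])

avgs = cocluster_averages(A, W, rho, gamma, k=2, l=2)
print(predict(0, 1, rho, gamma, *avgs))   # 1.5 for this toy example
```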




COLLABORATIVE FILTERING VIA
CO-CLUSTERING
   Static training (co-clustering): the goal is to minimize
     $\sum_{i=1}^{m} \sum_{j=1}^{n} W_{ij}\,(A_{ij} - \hat{A}_{ij})^2$


   The row and column assignment steps can be
    implemented efficiently by pre-computing the invariant
    parts of the update cost functions.
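     Required info.
     Row updating: assign $\rho(i)$ to the user cluster $g$ minimizing
        $\sum_{j=1}^{n} W_{ij}\,(A^{tmp}_{ij} - A^{COC}_{g\gamma(j)} + A^{RC}_{g})^2$,
        where $A^{tmp}_{ij} = A_{ij} - A^{R}_{i} - A^{C}_{j} + A^{CC}_{\gamma(j)}$ does not depend on $g$.
     Column updating: assign $\gamma(j)$ to the item cluster $h$ minimizing
        $\sum_{i=1}^{m} W_{ij}\,(A^{tmp}_{ij} - A^{COC}_{\rho(i)h} + A^{CC}_{h})^2$,
        where $A^{tmp}_{ij} = A_{ij} - A^{R}_{i} - A^{C}_{j} + A^{RC}_{\rho(i)}$ does not depend on $h$.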
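The alternating row and column updates might be sketched as follows. This is an illustrative reading of the static training step, not the authors' implementation. The random initialization, the fixed iteration count `n_iter`, the handling of empty clusters (average 0), and all function names are assumptions. The variable `tmp` holds the part of the residual that does not depend on the candidate cluster, so it is computed once per step.

```python
import numpy as np

def _div(num, den):
    # Assumption: groups without known ratings get average 0.
    return np.divide(num, den, out=np.zeros_like(num, dtype=float), where=den > 0)

def averages(A, W, rho, gamma, k, l):
    """Return (A_R, A_C, A_RC, A_CC, A_COC) over known ratings."""
    AW = A * W
    R, C = np.eye(k)[rho], np.eye(l)[gamma]
    s, c = R.T @ AW @ C, R.T @ W @ C
    return (_div(AW.sum(1), W.sum(1)), _div(AW.sum(0), W.sum(0)),
            _div(s.sum(1), c.sum(1)), _div(s.sum(0), c.sum(0)), _div(s, c))

def approx(rho, gamma, A_R, A_C, A_RC, A_CC, A_COC):
    """Dense m x n matrix of approximations for the current clustering."""
    return (A_COC[rho][:, gamma]
            + (A_R - A_RC[rho])[:, None]
            + (A_C - A_CC[gamma])[None, :])

def objective(A, W, rho, gamma, k, l):
    """Weighted squared error between A and its approximation."""
    A_hat = approx(rho, gamma, *averages(A, W, rho, gamma, k, l))
    return float((W * (A - A_hat) ** 2).sum())

def cocluster(A, W, k, l, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    m, n = A.shape
    rho, gamma = rng.integers(k, size=m), rng.integers(l, size=n)
    for _ in range(n_iter):
        # Row step: try every user cluster g for every user.
        A_R, A_C, A_RC, A_CC, A_COC = averages(A, W, rho, gamma, k, l)
        tmp = A - A_R[:, None] - A_C[None, :] + A_CC[gamma][None, :]
        row_cost = np.stack(
            [(W * (tmp - A_COC[g][gamma][None, :] + A_RC[g]) ** 2).sum(axis=1)
             for g in range(k)], axis=1)              # m x k
        rho = row_cost.argmin(axis=1)
        # Column step: try every item cluster h for every item.
        A_R, A_C, A_RC, A_CC, A_COC = averages(A, W, rho, gamma, k, l)
        tmp = A - A_R[:, None] - A_C[None, :] + A_RC[rho][:, None]
        col_cost = np.stack(
            [(W * (tmp - A_COC[rho, h][:, None] + A_CC[h]) ** 2).sum(axis=0)
             for h in range(l)], axis=1)              # n x l
        gamma = col_cost.argmin(axis=1)
    return rho, gamma
```

A practical implementation would likely stop once the assignments no longer change rather than running a fixed number of iterations.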
STATIC TRAINING: CO-CLUSTERING




PREDICTION




INCREMENTAL TRAINING




SCALABLE COLLABORATIVE
FILTERING SYSTEM
   The system uses a distributed-memory representation for the data
    objects; each of the processors P1 and P2 is in
    fact a cluster of processors.
     P1 handles the prediction and incremental training.
     P2 is responsible for the static training.




PARALLEL CO-CLUSTERING




EXPERIMENTAL RESULTS
   Datasets and algorithm
     Movie-lens (100K): 100,000 ratings (1-5) by 943 users on 1682 movies.
     BookCrossing: 269392 ratings (1-10) by 470034 users on 133438 books.
     Movie1-Movie10: subsets containing 10-100% of the ratings of Movie-lens 100K.


 80% training and 20% testing for all the datasets.
 Evaluation metrics: Mean Absolute Error (MAE)
     The experiments evaluated the effectiveness and efficiency in
      terms of MAE and execution time.
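For reference, the 80/20 split and the MAE metric can be written in a few lines. This is a generic sketch: the function names, the `(user, item, rating)` row layout, and the random shuffling with a fixed seed are assumptions, since the slides do not say how the split was drawn.

```python
import numpy as np

def split_ratings(ratings, test_fraction=0.2, seed=0):
    """Randomly split an array of (user, item, rating) rows into train and test."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(ratings))
    n_test = int(round(test_fraction * len(ratings)))
    return ratings[perm[n_test:]], ratings[perm[:n_test]]

def mean_absolute_error(y_true, y_pred):
    """MAE: the mean of |true rating - predicted rating| over the test pairs."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

# Hypothetical example: three test ratings and their predictions.
print(mean_absolute_error([4, 2, 5], [3, 2, 3]))   # (1 + 0 + 2) / 3 = 1.0
```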
MAE COMPARISON
 Mov1: movie-lens
 Mov2: BookCrossing

 Mov3: 10 subsets of movie-lens




                K=3




VARIATION OF MAE WITH #
PARAMETERS
   # prediction parameters:
     COCLUST:(m+n+kl-k-l) values
     SVD, NNMF: (m+n)(k+l) values
   Movie3 dataset
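To get a feel for the difference in model size, the two counts can be evaluated for the Movie-lens 100K dimensions. The formulas are taken from the slide as stated. The choice k = l = 3 and the function names are assumptions made only for this worked example.

```python
def coclust_params(m, n, k, l):
    """Number of prediction parameters for COCLUST, as stated on the slide."""
    return m + n + k * l - k - l

def factorization_params(m, n, k, l):
    """Number of prediction parameters for SVD / NNMF, as stated on the slide."""
    return (m + n) * (k + l)

m, n = 943, 1682      # users and movies in Movie-lens 100K
k = l = 3             # assumed cluster counts for this example

print(coclust_params(m, n, k, l))        # 2628
print(factorization_params(m, n, k, l))  # 15750
```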




EFFICIENCY
   Time needed for prediction on each given test pair
    of Movie-lens.




   Training time (co-clustering) vs. data size
     Movie-lens dataset
     Experimental devices
        AMD 1.4 GHz processors on 128 compute
       nodes with 384 MB RAM

TRAINING TIME VS. # OF
PROCESSORS
 Movie-lens dataset
 Experimental devices
     AMD 1.4 GHz on different # of processors with 384 MB RAM




CONCLUSION
   Recommender systems are proving to be extremely useful
    for a number of online activities such as e-commerce.

   In dynamic scenarios, both efficiency and
    effectiveness are concerns.
     New users, items and ratings enter the system at a rapid rate.

   This paper proposed a new dynamic CF approach based
    on co-clustering.

   Empirical results indicate high-quality predictions at
    a much lower computational cost.

More Related Content

What's hot

Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Universitat Politècnica de Catalunya
 
Generative Models and Adversarial Training (D2L3 Insight@DCU Machine Learning...
Generative Models and Adversarial Training (D2L3 Insight@DCU Machine Learning...Generative Models and Adversarial Training (D2L3 Insight@DCU Machine Learning...
Generative Models and Adversarial Training (D2L3 Insight@DCU Machine Learning...Universitat Politècnica de Catalunya
 
Dear - 딥러닝 논문읽기 모임 김창연님
Dear - 딥러닝 논문읽기 모임 김창연님Dear - 딥러닝 논문읽기 모임 김창연님
Dear - 딥러닝 논문읽기 모임 김창연님taeseon ryu
 
Dimension Reduction And Visualization Of Large High Dimensional Data Via Inte...
Dimension Reduction And Visualization Of Large High Dimensional Data Via Inte...Dimension Reduction And Visualization Of Large High Dimensional Data Via Inte...
Dimension Reduction And Visualization Of Large High Dimensional Data Via Inte...wl820609
 
Deformable DETR Review [CDM]
Deformable DETR Review [CDM]Deformable DETR Review [CDM]
Deformable DETR Review [CDM]Dongmin Choi
 
Convolutional Neural Network (CNN) presentation from theory to code in Theano
Convolutional Neural Network (CNN) presentation from theory to code in TheanoConvolutional Neural Network (CNN) presentation from theory to code in Theano
Convolutional Neural Network (CNN) presentation from theory to code in TheanoSeongwon Hwang
 
Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)
Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)
Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
Lecture 6: Convolutional Neural Networks
Lecture 6: Convolutional Neural NetworksLecture 6: Convolutional Neural Networks
Lecture 6: Convolutional Neural NetworksSang Jun Lee
 
On Sampling Strategies for Sampling Strategies-based Collaborative Filtering
On Sampling Strategies for Sampling Strategies-based Collaborative FilteringOn Sampling Strategies for Sampling Strategies-based Collaborative Filtering
On Sampling Strategies for Sampling Strategies-based Collaborative FilteringTing Chen
 
Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)
Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)
Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
Machine learning applications in aerospace domain
Machine learning applications in aerospace domainMachine learning applications in aerospace domain
Machine learning applications in aerospace domain홍배 김
 
Naver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltcNaver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltcNAVER Engineering
 
Workload-aware materialization for efficient variable elimination on Bayesian...
Workload-aware materialization for efficient variable elimination on Bayesian...Workload-aware materialization for efficient variable elimination on Bayesian...
Workload-aware materialization for efficient variable elimination on Bayesian...Cigdem Aslay
 
safe and efficient off policy reinforcement learning
safe and efficient off policy reinforcement learningsafe and efficient off policy reinforcement learning
safe and efficient off policy reinforcement learningRyo Iwaki
 
Learning Convolutional Neural Networks for Graphs
Learning Convolutional Neural Networks for GraphsLearning Convolutional Neural Networks for Graphs
Learning Convolutional Neural Networks for GraphsMathias Niepert
 
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)Universitat Politècnica de Catalunya
 

What's hot (20)

Self-organizing map
Self-organizing mapSelf-organizing map
Self-organizing map
 
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
 
Generative Models and Adversarial Training (D2L3 Insight@DCU Machine Learning...
Generative Models and Adversarial Training (D2L3 Insight@DCU Machine Learning...Generative Models and Adversarial Training (D2L3 Insight@DCU Machine Learning...
Generative Models and Adversarial Training (D2L3 Insight@DCU Machine Learning...
 
Dear - 딥러닝 논문읽기 모임 김창연님
Dear - 딥러닝 논문읽기 모임 김창연님Dear - 딥러닝 논문읽기 모임 김창연님
Dear - 딥러닝 논문읽기 모임 김창연님
 
Dimension Reduction And Visualization Of Large High Dimensional Data Via Inte...
Dimension Reduction And Visualization Of Large High Dimensional Data Via Inte...Dimension Reduction And Visualization Of Large High Dimensional Data Via Inte...
Dimension Reduction And Visualization Of Large High Dimensional Data Via Inte...
 
Deformable DETR Review [CDM]
Deformable DETR Review [CDM]Deformable DETR Review [CDM]
Deformable DETR Review [CDM]
 
Transfer Learning (D2L4 Insight@DCU Machine Learning Workshop 2017)
Transfer Learning (D2L4 Insight@DCU Machine Learning Workshop 2017)Transfer Learning (D2L4 Insight@DCU Machine Learning Workshop 2017)
Transfer Learning (D2L4 Insight@DCU Machine Learning Workshop 2017)
 
Convolutional Neural Network (CNN) presentation from theory to code in Theano
Convolutional Neural Network (CNN) presentation from theory to code in TheanoConvolutional Neural Network (CNN) presentation from theory to code in Theano
Convolutional Neural Network (CNN) presentation from theory to code in Theano
 
Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)
Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)
Optimizing Deep Networks (D1L6 Insight@DCU Machine Learning Workshop 2017)
 
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
 
Lecture 6: Convolutional Neural Networks
Lecture 6: Convolutional Neural NetworksLecture 6: Convolutional Neural Networks
Lecture 6: Convolutional Neural Networks
 
On Sampling Strategies for Sampling Strategies-based Collaborative Filtering
On Sampling Strategies for Sampling Strategies-based Collaborative FilteringOn Sampling Strategies for Sampling Strategies-based Collaborative Filtering
On Sampling Strategies for Sampling Strategies-based Collaborative Filtering
 
Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)
Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)
Unsupervised Deep Learning (D2L1 Insight@DCU Machine Learning Workshop 2017)
 
Machine learning applications in aerospace domain
Machine learning applications in aerospace domainMachine learning applications in aerospace domain
Machine learning applications in aerospace domain
 
Naver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltcNaver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltc
 
Workload-aware materialization for efficient variable elimination on Bayesian...
Workload-aware materialization for efficient variable elimination on Bayesian...Workload-aware materialization for efficient variable elimination on Bayesian...
Workload-aware materialization for efficient variable elimination on Bayesian...
 
safe and efficient off policy reinforcement learning
safe and efficient off policy reinforcement learningsafe and efficient off policy reinforcement learning
safe and efficient off policy reinforcement learning
 
Learning Convolutional Neural Networks for Graphs
Learning Convolutional Neural Networks for GraphsLearning Convolutional Neural Networks for Graphs
Learning Convolutional Neural Networks for Graphs
 
Deep Learning for Computer Vision: Visualization (UPC 2016)
Deep Learning for Computer Vision: Visualization (UPC 2016)Deep Learning for Computer Vision: Visualization (UPC 2016)
Deep Learning for Computer Vision: Visualization (UPC 2016)
 
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
 

Viewers also liked

Using support vector machine with a hybrid feature selection method to the st...
Using support vector machine with a hybrid feature selection method to the st...Using support vector machine with a hybrid feature selection method to the st...
Using support vector machine with a hybrid feature selection method to the st...lolokikipipi
 
Transfer learning in heterogeneous collaborative filtering domains
Transfer learning in heterogeneous collaborative filtering domainsTransfer learning in heterogeneous collaborative filtering domains
Transfer learning in heterogeneous collaborative filtering domainsAllen Wu
 
Friends of Solr - Nutch & HDFS
Friends of Solr - Nutch & HDFSFriends of Solr - Nutch & HDFS
Friends of Solr - Nutch & HDFSSaumitra Srivastav
 
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...Lucidworks
 
01 Introduction to Data Mining
01 Introduction to Data Mining01 Introduction to Data Mining
01 Introduction to Data MiningValerii Klymchuk
 
05 Clustering in Data Mining
05 Clustering in Data Mining05 Clustering in Data Mining
05 Clustering in Data MiningValerii Klymchuk
 
Apache Solr crash course
Apache Solr crash courseApache Solr crash course
Apache Solr crash courseTommaso Teofili
 
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...Lucidworks
 
Building a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineBuilding a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineNYC Predictive Analytics
 
Building a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engineBuilding a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engineTrey Grainger
 

Viewers also liked (11)

Using support vector machine with a hybrid feature selection method to the st...
Using support vector machine with a hybrid feature selection method to the st...Using support vector machine with a hybrid feature selection method to the st...
Using support vector machine with a hybrid feature selection method to the st...
 
Transfer learning in heterogeneous collaborative filtering domains
Transfer learning in heterogeneous collaborative filtering domainsTransfer learning in heterogeneous collaborative filtering domains
Transfer learning in heterogeneous collaborative filtering domains
 
Friends of Solr - Nutch & HDFS
Friends of Solr - Nutch & HDFSFriends of Solr - Nutch & HDFS
Friends of Solr - Nutch & HDFS
 
Scaling search with SolrCloud
Scaling search with SolrCloudScaling search with SolrCloud
Scaling search with SolrCloud
 
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
 
01 Introduction to Data Mining
01 Introduction to Data Mining01 Introduction to Data Mining
01 Introduction to Data Mining
 
05 Clustering in Data Mining
05 Clustering in Data Mining05 Clustering in Data Mining
05 Clustering in Data Mining
 
Apache Solr crash course
Apache Solr crash courseApache Solr crash course
Apache Solr crash course
 
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
 
Building a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineBuilding a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engine
 
Building a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engineBuilding a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engine
 

Similar to A scalable collaborative filtering framework based on co clustering

EFFICIENT USE OF HYBRID ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM COMBINED WITH N...
EFFICIENT USE OF HYBRID ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM COMBINED WITH N...EFFICIENT USE OF HYBRID ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM COMBINED WITH N...
EFFICIENT USE OF HYBRID ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM COMBINED WITH N...csandit
 
AN IMPROVED METHOD FOR IDENTIFYING WELL-TEST INTERPRETATION MODEL BASED ON AG...
AN IMPROVED METHOD FOR IDENTIFYING WELL-TEST INTERPRETATION MODEL BASED ON AG...AN IMPROVED METHOD FOR IDENTIFYING WELL-TEST INTERPRETATION MODEL BASED ON AG...
AN IMPROVED METHOD FOR IDENTIFYING WELL-TEST INTERPRETATION MODEL BASED ON AG...IAEME Publication
 
Using HOG Descriptors on Superpixels for Human Detection of UAV Imagery
Using HOG Descriptors on Superpixels for Human Detection of UAV ImageryUsing HOG Descriptors on Superpixels for Human Detection of UAV Imagery
Using HOG Descriptors on Superpixels for Human Detection of UAV ImageryWai Nwe Tun
 
Towards explanations for Data-Centric AI using provenance records
Towards explanations for Data-Centric AI using provenance recordsTowards explanations for Data-Centric AI using provenance records
Towards explanations for Data-Centric AI using provenance recordsPaolo Missier
 
SVD and the Netflix Dataset
SVD and the Netflix DatasetSVD and the Netflix Dataset
SVD and the Netflix DatasetBen Mabey
 
Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recom...
Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recom...Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recom...
Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recom...Daniel Valcarce
 
Performance Evaluation: A Comparative Study of Various Classifiers
Performance Evaluation: A Comparative Study of Various ClassifiersPerformance Evaluation: A Comparative Study of Various Classifiers
Performance Evaluation: A Comparative Study of Various Classifiersamreshkr19
 
A scalable collaborative filtering framework based on co-clustering
A scalable collaborative filtering framework based on co-clusteringA scalable collaborative filtering framework based on co-clustering
A scalable collaborative filtering framework based on co-clusteringlau
 
Parallel k nn on gpu architecture using opencl
Parallel k nn on gpu architecture using openclParallel k nn on gpu architecture using opencl
Parallel k nn on gpu architecture using opencleSAT Publishing House
 
Parallel knn on gpu architecture using opencl
Parallel knn on gpu architecture using openclParallel knn on gpu architecture using opencl
Parallel knn on gpu architecture using opencleSAT Journals
 
Machine learning in Dynamic Adaptive Streaming over HTTP (DASH)
Machine learning in Dynamic Adaptive Streaming over HTTP (DASH)Machine learning in Dynamic Adaptive Streaming over HTTP (DASH)
Machine learning in Dynamic Adaptive Streaming over HTTP (DASH)Eswar Publications
 
COMPARATIVE PERFORMANCE ANALYSIS OF RNSC AND MCL ALGORITHMS ON POWER-LAW DIST...
COMPARATIVE PERFORMANCE ANALYSIS OF RNSC AND MCL ALGORITHMS ON POWER-LAW DIST...COMPARATIVE PERFORMANCE ANALYSIS OF RNSC AND MCL ALGORITHMS ON POWER-LAW DIST...
COMPARATIVE PERFORMANCE ANALYSIS OF RNSC AND MCL ALGORITHMS ON POWER-LAW DIST...acijjournal
 
RunPool: A Dynamic Pooling Layer for Convolution Neural Network
RunPool: A Dynamic Pooling Layer for Convolution Neural NetworkRunPool: A Dynamic Pooling Layer for Convolution Neural Network
RunPool: A Dynamic Pooling Layer for Convolution Neural NetworkPutra Wanda
 
Efficient de cvpr_2020_paper
Efficient de cvpr_2020_paperEfficient de cvpr_2020_paper
Efficient de cvpr_2020_papershanullah3
 
A PSO-Based Subtractive Data Clustering Algorithm
A PSO-Based Subtractive Data Clustering AlgorithmA PSO-Based Subtractive Data Clustering Algorithm
A PSO-Based Subtractive Data Clustering AlgorithmIJORCS
 
Large Scale Kernel Learning using Block Coordinate Descent
Large Scale Kernel Learning using Block Coordinate DescentLarge Scale Kernel Learning using Block Coordinate Descent
Large Scale Kernel Learning using Block Coordinate DescentShaleen Kumar Gupta
 
Tutorial-on-DNN-09A-Co-design-Sparsity.pdf
Tutorial-on-DNN-09A-Co-design-Sparsity.pdfTutorial-on-DNN-09A-Co-design-Sparsity.pdf
Tutorial-on-DNN-09A-Co-design-Sparsity.pdfDuy-Hieu Bui
 
Adaptive check-pointing and replication strategy to tolerate faults in comput...
Adaptive check-pointing and replication strategy to tolerate faults in comput...Adaptive check-pointing and replication strategy to tolerate faults in comput...
Adaptive check-pointing and replication strategy to tolerate faults in comput...IOSR Journals
 

Similar to A scalable collaborative filtering framework based on co clustering (20)

EFFICIENT USE OF HYBRID ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM COMBINED WITH N...
EFFICIENT USE OF HYBRID ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM COMBINED WITH N...EFFICIENT USE OF HYBRID ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM COMBINED WITH N...
EFFICIENT USE OF HYBRID ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM COMBINED WITH N...
 
AN IMPROVED METHOD FOR IDENTIFYING WELL-TEST INTERPRETATION MODEL BASED ON AG...
AN IMPROVED METHOD FOR IDENTIFYING WELL-TEST INTERPRETATION MODEL BASED ON AG...AN IMPROVED METHOD FOR IDENTIFYING WELL-TEST INTERPRETATION MODEL BASED ON AG...
AN IMPROVED METHOD FOR IDENTIFYING WELL-TEST INTERPRETATION MODEL BASED ON AG...
 
Using HOG Descriptors on Superpixels for Human Detection of UAV Imagery
Using HOG Descriptors on Superpixels for Human Detection of UAV ImageryUsing HOG Descriptors on Superpixels for Human Detection of UAV Imagery
Using HOG Descriptors on Superpixels for Human Detection of UAV Imagery
 
Towards explanations for Data-Centric AI using provenance records
Towards explanations for Data-Centric AI using provenance recordsTowards explanations for Data-Centric AI using provenance records
Towards explanations for Data-Centric AI using provenance records
 
SVD and the Netflix Dataset
SVD and the Netflix DatasetSVD and the Netflix Dataset
SVD and the Netflix Dataset
 
Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recom...
Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recom...Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recom...
Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recom...
 
Performance Evaluation: A Comparative Study of Various Classifiers
Performance Evaluation: A Comparative Study of Various ClassifiersPerformance Evaluation: A Comparative Study of Various Classifiers
Performance Evaluation: A Comparative Study of Various Classifiers
 
A scalable collaborative filtering framework based on co-clustering
A scalable collaborative filtering framework based on co-clusteringA scalable collaborative filtering framework based on co-clustering
A scalable collaborative filtering framework based on co-clustering
 
Parallel k nn on gpu architecture using opencl
Parallel k nn on gpu architecture using openclParallel k nn on gpu architecture using opencl
Parallel k nn on gpu architecture using opencl
 
Parallel knn on gpu architecture using opencl
Parallel knn on gpu architecture using openclParallel knn on gpu architecture using opencl
Parallel knn on gpu architecture using opencl
 
Machine learning in Dynamic Adaptive Streaming over HTTP (DASH)
Machine learning in Dynamic Adaptive Streaming over HTTP (DASH)Machine learning in Dynamic Adaptive Streaming over HTTP (DASH)
Machine learning in Dynamic Adaptive Streaming over HTTP (DASH)
 
COMPARATIVE PERFORMANCE ANALYSIS OF RNSC AND MCL ALGORITHMS ON POWER-LAW DIST...
COMPARATIVE PERFORMANCE ANALYSIS OF RNSC AND MCL ALGORITHMS ON POWER-LAW DIST...COMPARATIVE PERFORMANCE ANALYSIS OF RNSC AND MCL ALGORITHMS ON POWER-LAW DIST...
COMPARATIVE PERFORMANCE ANALYSIS OF RNSC AND MCL ALGORITHMS ON POWER-LAW DIST...
 
RunPool: A Dynamic Pooling Layer for Convolution Neural Network
RunPool: A Dynamic Pooling Layer for Convolution Neural NetworkRunPool: A Dynamic Pooling Layer for Convolution Neural Network
RunPool: A Dynamic Pooling Layer for Convolution Neural Network
 
Efficient de cvpr_2020_paper
Efficient de cvpr_2020_paperEfficient de cvpr_2020_paper
Efficient de cvpr_2020_paper
 
A PSO-Based Subtractive Data Clustering Algorithm
A PSO-Based Subtractive Data Clustering AlgorithmA PSO-Based Subtractive Data Clustering Algorithm
A PSO-Based Subtractive Data Clustering Algorithm
 
Large Scale Kernel Learning using Block Coordinate Descent
Large Scale Kernel Learning using Block Coordinate DescentLarge Scale Kernel Learning using Block Coordinate Descent
Large Scale Kernel Learning using Block Coordinate Descent
 
Tutorial-on-DNN-09A-Co-design-Sparsity.pdf
Tutorial-on-DNN-09A-Co-design-Sparsity.pdfTutorial-on-DNN-09A-Co-design-Sparsity.pdf
Tutorial-on-DNN-09A-Co-design-Sparsity.pdf
 
Adaptive check-pointing and replication strategy to tolerate faults in comput...
Adaptive check-pointing and replication strategy to tolerate faults in comput...Adaptive check-pointing and replication strategy to tolerate faults in comput...
Adaptive check-pointing and replication strategy to tolerate faults in comput...
 
E01113138
E01113138E01113138
E01113138
 
Pay-as-you-go Reconciliation in Schema Matching Networks
Pay-as-you-go Reconciliation in Schema Matching NetworksPay-as-you-go Reconciliation in Schema Matching Networks
Pay-as-you-go Reconciliation in Schema Matching Networks
 

A scalable collaborative filtering framework based on co clustering

  • 1. A SCALABLE COLLABORATIVE FILTERING FRAMEWORK BASED ON CO-CLUSTERING. Authors/ Thomas George and Srujana Merugu. Source/ ICDM’05, pp. 628-628. Presenter/ Allen
  • 2. OUTLINE: Introduction; Related Work; Problem Definition; Collaborative Filtering via Co-clustering; Scalable Collaborative Filtering System; Experimental Results; Conclusion
  • 3. INTRODUCTION: Due to the overwhelming increase in web-based activities, users are often forced to choose from a large number of products or content items. To aid users in the decision-making process, it has become increasingly important to design recommender systems. Collaborative filtering identifies the likely preferences of a user based on the known preferences of other users.
  • 4. INTRODUCTION (CONT.): Existing collaborative filtering methods are based on correlation criteria, singular value decomposition (SVD), or non-negative matrix factorization (NNMF). Drawback: the training component is computationally expensive. Practical scenarios such as real-time news personalization require dynamic collaborative filtering. The key idea: simultaneously obtain user and item neighborhoods via co-clustering, then generate predictions based on average ratings.
  • 5. INTRODUCTION (CONT.): Two new contributions: (1) a dynamic collaborative filtering approach that supports the entry of new users, items and ratings via a hybrid of incremental and batch versions of the co-clustering algorithm; (2) a scalable, real-time collaborative filtering system built on parallel versions of the co-clustering, prediction and incremental training routines. Notation: $A$ denotes a matrix, with $A_{ij}$ its elements; $\mathcal{X}$ denotes a set, enumerated as $\{x_i\}_{i=1}^{n}$, where the $x_i$ are its elements.
  • 6. RELATED WORK: Recommender systems: content-based filtering systems and collaborative filtering systems. Co-clustering. SVD- and NNMF-based filtering techniques predict the unknown ratings from a low-rank approximation of the original ratings matrix; the missing values are filled with the average ratings. Incremental versions of SVD have been proposed to address the computational cost (SDM 2003).
  • 7. PROBLEM DEFINITION: Let $U=\{u_i\}_{i=1}^{m}$ be the set of users, with $|U|=m$, and $P=\{p_j\}_{j=1}^{n}$ be the set of items, with $|P|=n$. Let $A$ be the $m \times n$ ratings matrix such that $A_{ij}$ is the rating given by user $u_i$ to item $p_j$. Let $W$ be the $m \times n$ matrix of confidences of the ratings in $A$: $W_{ij}=1$ if the rating is known and $0$ otherwise. Let $\rho: \{1, \dots, m\} \to \{1, \dots, k\}$ be the user clustering and $\gamma: \{1, \dots, n\} \to \{1, \dots, l\}$ the item clustering, where $k$ is the number of user clusters and $l$ the number of item clusters.
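The definitions on slide 7 can be made concrete with a small sketch. It assumes ratings arrive as `(user, item, rating)` triples with 0-based indices and stores $A$ and $W$ as plain nested lists; the helper name `build_matrices` and the toy data are hypothetical, not part of the slides.

```python
def build_matrices(triples, m, n):
    """Return (A, W): m x n ratings and 0/1 confidence matrices as nested lists."""
    A = [[0.0] * n for _ in range(m)]
    W = [[0] * n for _ in range(m)]
    for i, j, rating in triples:
        A[i][j] = float(rating)  # known rating of user i for item j
        W[i][j] = 1              # W_ij = 1 marks the rating as known
    return A, W


# Toy data (assumed): 3 users, 3 items, 4 known ratings.
triples = [(0, 0, 5), (0, 2, 3), (1, 1, 4), (2, 0, 1)]
A, W = build_matrices(triples, m=3, n=3)

# Clusterings as index lists: rho[i] is the user cluster of user i,
# gamma[j] the item cluster of item j (here k = 2 and l = 2).
rho = [0, 0, 1]
gamma = [0, 1, 1]
```

Unknown entries of `A` are left at 0.0 here; they are never read as ratings because `W` masks them out.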
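  • 8. PROBLEM DEFINITION (CONT.): The approximate matrix $\hat{A}$ is given by $\hat{A}_{ij} = A^{COC}_{gh} + (A^{R}_{i} - A^{RC}_{g}) + (A^{C}_{j} - A^{CC}_{h})$, where $g=\rho(i)$ and $h=\gamma(j)$. $A^{R}_{i}$ and $A^{C}_{j}$ are the average ratings of user $u_i$ and item $p_j$. $A^{COC}_{gh}$, $A^{RC}_{g}$ and $A^{CC}_{h}$ are the average ratings of the corresponding co-cluster, user cluster and item cluster.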
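A minimal sketch of the prediction on slide 8, assuming the approximation is the co-cluster average plus the user's and the item's deviation from their cluster averages, i.e. $\hat{A}_{ij} = A^{COC}_{gh} + (A^{R}_{i} - A^{RC}_{g}) + (A^{C}_{j} - A^{CC}_{h})$. The function names are hypothetical, and falling back to the global mean for an empty group is an assumption made here, not something the slides state.

```python
def averages(A, W, rho, gamma, k, l):
    """Compute the five averages over known ratings only (W_ij = 1)."""
    m, n = len(A), len(A[0])
    known = [(i, j) for i in range(m) for j in range(n) if W[i][j]]
    global_mean = sum(A[i][j] for i, j in known) / len(known)

    def mean(keep):
        vals = [A[i][j] for i, j in known if keep(i, j)]
        # Assumption: an empty group falls back to the global mean.
        return sum(vals) / len(vals) if vals else global_mean

    return {
        "R": [mean(lambda i, j, u=u: i == u) for u in range(m)],          # per user
        "C": [mean(lambda i, j, p=p: j == p) for p in range(n)],          # per item
        "RC": [mean(lambda i, j, g=g: rho[i] == g) for g in range(k)],    # per user cluster
        "CC": [mean(lambda i, j, h=h: gamma[j] == h) for h in range(l)],  # per item cluster
        "COC": [[mean(lambda i, j, g=g, h=h: rho[i] == g and gamma[j] == h)
                 for h in range(l)] for g in range(k)],                   # per co-cluster
    }


def predict(avg, rho, gamma, i, j):
    """Co-cluster average plus user bias plus item bias."""
    g, h = rho[i], gamma[j]
    return (avg["COC"][g][h]
            + (avg["R"][i] - avg["RC"][g])
            + (avg["C"][j] - avg["CC"][h]))


# Toy data (assumed): zeros in A are unknown ratings, masked by W.
A = [[5, 4, 0], [4, 0, 1], [0, 2, 1]]
W = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]
rho, gamma = [0, 0, 1], [0, 0, 1]
avg = averages(A, W, rho, gamma, k=2, l=2)
print(predict(avg, rho, gamma, 0, 2))  # 2.0
print(predict(avg, rho, gamma, 2, 0))  # 2.75
```

Once the averages are stored, each prediction is a handful of lookups, which is what makes this form attractive for real-time use.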
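  • 9. COLLABORATIVE FILTERING VIA CO-CLUSTERING: Static training (co-clustering): the goal is to minimize $\sum_{i=1}^{m}\sum_{j=1}^{n} W_{ij}\,(A_{ij}-\hat{A}_{ij})^2$ over $(\rho, \gamma)$. The row and column assignment steps can be implemented efficiently by pre-computing the invariant parts of the update cost functions. Required info: the invariant part $A^{tmp}_{ij} = A_{ij} - A^{R}_{i} - A^{C}_{j}$. Row updating: assign $\rho(i)$ to the $g$ minimizing $\sum_{j} W_{ij}\,(A^{tmp}_{ij} + A^{CC}_{\gamma(j)} - A^{COC}_{g\gamma(j)} + A^{RC}_{g})^2$. Column updating: assign $\gamma(j)$ to the $h$ minimizing $\sum_{i} W_{ij}\,(A^{tmp}_{ij} + A^{RC}_{\rho(i)} - A^{COC}_{\rho(i)h} + A^{CC}_{h})^2$.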
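The alternating row and column updates of slide 9 can be sketched as below. This is an illustrative, unoptimized version under stated assumptions: it minimizes the weighted squared error between $A$ and the approximation $A^{COC}_{gh} + (A^{R}_{i} - A^{RC}_{g}) + (A^{C}_{j} - A^{CC}_{h})$, starts from a random assignment, recomputes the averages after each half-step, and falls back to the global mean for empty groups. None of these choices, nor the function names, are taken from the slides, and the pre-computation of invariant parts is omitted for brevity.

```python
import random


def averages(A, W, rho, gamma, k, l):
    """Averages over known ratings; empty groups use the global mean (assumption)."""
    m, n = len(A), len(A[0])
    known = [(i, j) for i in range(m) for j in range(n) if W[i][j]]
    global_mean = sum(A[i][j] for i, j in known) / len(known)

    def mean(keep):
        vals = [A[i][j] for i, j in known if keep(i, j)]
        return sum(vals) / len(vals) if vals else global_mean

    return {
        "R": [mean(lambda i, j, u=u: i == u) for u in range(m)],
        "C": [mean(lambda i, j, p=p: j == p) for p in range(n)],
        "RC": [mean(lambda i, j, g=g: rho[i] == g) for g in range(k)],
        "CC": [mean(lambda i, j, h=h: gamma[j] == h) for h in range(l)],
        "COC": [[mean(lambda i, j, g=g, h=h: rho[i] == g and gamma[j] == h)
                 for h in range(l)] for g in range(k)],
    }


def residual(A, avg, i, j, g, h):
    """A_ij minus its approximation if user i were in cluster g and item j in h."""
    approx = (avg["COC"][g][h] + avg["R"][i] - avg["RC"][g]
              + avg["C"][j] - avg["CC"][h])
    return A[i][j] - approx


def total_error(A, W, rho, gamma, k, l):
    """Weighted squared error of the current co-clustering."""
    avg = averages(A, W, rho, gamma, k, l)
    return sum(W[i][j] * residual(A, avg, i, j, rho[i], gamma[j]) ** 2
               for i in range(len(A)) for j in range(len(A[0])))


def cocluster(A, W, k, l, iters=20, seed=0):
    """Alternate row and column reassignments until nothing changes."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    rho = [rng.randrange(k) for _ in range(m)]
    gamma = [rng.randrange(l) for _ in range(n)]
    for _ in range(iters):
        avg = averages(A, W, rho, gamma, k, l)
        # Row update: move each user to the cluster with the lowest row cost.
        new_rho = [min(range(k), key=lambda g: sum(
            W[i][j] * residual(A, avg, i, j, g, gamma[j]) ** 2
            for j in range(n))) for i in range(m)]
        avg = averages(A, W, new_rho, gamma, k, l)
        # Column update: move each item to the cluster with the lowest column cost.
        new_gamma = [min(range(l), key=lambda h: sum(
            W[i][j] * residual(A, avg, i, j, new_rho[i], h) ** 2
            for i in range(m))) for j in range(n)]
        if new_rho == rho and new_gamma == gamma:
            break
        rho, gamma = new_rho, new_gamma
    return rho, gamma


# Toy data (assumed): a fully observed 4 x 4 matrix with two blocks of taste.
A = [[5, 5, 1, 1], [5, 4, 1, 2], [1, 1, 5, 5], [2, 1, 4, 5]]
W = [[1] * 4 for _ in range(4)]
rho, gamma = cocluster(A, W, k=2, l=2)
error = total_error(A, W, rho, gamma, k=2, l=2)
```

Each candidate assignment here recomputes the full residual, so the cost per iteration is far higher than in a version that caches the parts of the cost that do not depend on the candidate cluster.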
  • 13. SCALABLE COLLABORATIVE FILTERING SYSTEM: The data objects use a distributed-memory representation, so each of the processors P1 and P2 is in fact a cluster of processors. P1 handles prediction and incremental training. P2 is responsible for static training.
  • 15. EXPERIMENTAL RESULTS: Datasets and algorithm: Movie-lens (100K): 100,000 ratings (1-5) from 943 users on 1682 movies. BookCrossing: 269392 ratings (1-10) from 470034 users on 133438 books. Movie1-Movie10: 10-100% of the ratings of Movie-lens 100K. All datasets are split 80% training and 20% testing. Evaluation metric: Mean Absolute Error (MAE). The experiments evaluate effectiveness and efficiency in terms of MAE and execution time.
  • 16. MAE COMPARISON: Mov1: Movie-lens; Mov2: BookCrossing; Mov3: 10 subsets of Movie-lens; K=3.
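The evaluation protocol on slide 15 (an 80/20 split scored by MAE) can be sketched as follows. The random shuffling, the seed, and the function names are assumptions for illustration; the slides do not say how the split was drawn.

```python
import random


def split_ratings(triples, train_fraction=0.8, seed=0):
    """Shuffle (user, item, rating) triples and split them into train and test."""
    rng = random.Random(seed)
    shuffled = list(triples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def mean_absolute_error(predicted, actual):
    """MAE: the average of |prediction - true rating| over the test pairs."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Toy data (assumed): ten ratings on a 1-5 scale.
triples = [(u, u % 3, 1 + u % 5) for u in range(10)]
train, test = split_ratings(triples)
print(len(train), len(test))                              # 8 2
print(mean_absolute_error([4.0, 2.5, 3.0], [5, 2, 3]))    # 0.5
```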
  • 17. VARIATION OF MAE WITH # PARAMETERS: Number of prediction parameters: COCLUST: $(m+n+kl-k-l)$ values; SVD, NNMF: $(m+n)(k+l)$ values. Movie3 dataset.
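To give a feel for the two counts on slide 17, the sketch below evaluates both formulas exactly as written on the slide. The sizes are those of Movie-lens 100K (943 users, 1682 movies); taking $k = l = 3$ is an assumed setting for illustration, and the function names are hypothetical.

```python
def coclust_params(m, n, k, l):
    """Parameter count given on the slide for COCLUST."""
    return m + n + k * l - k - l


def factorization_params(m, n, k, l):
    """Parameter count given on the slide for SVD and NNMF."""
    return (m + n) * (k + l)


m, n = 943, 1682   # Movie-lens 100K users and movies
k = l = 3          # assumed number of user and item clusters
print(coclust_params(m, n, k, l))        # 2628
print(factorization_params(m, n, k, l))  # 15750
```

Under these assumptions the co-clustering model stores roughly one sixth as many values as the factorization-based models.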
  • 18. EFFICIENCY: Time needed for prediction on each given test pair of Movie-lens. Training time (co-clustering) vs. data size on the Movie-lens dataset. Experimental devices: AMD 1.4 GHz on 128 computer nodes with 384 MB RAM.
  • 19. TRAINING TIME VS. # OF PROCESSORS: Movie-lens dataset. Experimental devices: AMD 1.4 GHz on different numbers of processors with 384 MB RAM.
  • 20. CONCLUSION: Recommender systems are proving to be extremely useful for a number of online activities such as e-commerce. In the dynamic scenario, where new users, items and ratings enter the system at a rapid rate, both efficiency and effectiveness must be addressed. This paper proposed a new dynamic CF approach based on co-clustering. Empirical results indicate high-quality predictions at a much lower computational cost.