ICTAI 2011, Boca Raton
November 7, 2011

Collaborative Filtering Based on Star Users

Qiang Liu
with Bingfei Cheng and Congfu Xu

College of Computer Science and Technology
Zhejiang University
Hangzhou, Zhejiang 310027, China
2012dtd@gmail.com

Outline
 Introduction
 Star-user-based Collaborative Filtering
 Experimental Results
 Conclusion
INTRODUCTION
Collaborative Filtering
 Collaborative Filtering (CF)
    ◦ Neighborhood-based: User-based, Item-based
    ◦ Model-based: Bayesian Model, Factorization Model, Maximum Entropy, Classification or Clustering, ……
Motivation
   To improve the most widely used
    technology in real-life recommender
    systems.
Neighborhood Model
 Similarity between users:
    ◦ Pearson: $\dfrac{\mathrm{cov}(a, b)}{\sigma_a \sigma_b}$
    ◦ Cosine: $\dfrac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}$
    ◦ Other similarity measures
 Weighted sum of neighbors' ratings:
    ◦ $\hat{r}_{u,i} = \bar{r}_u + \dfrac{\sum_{v \in N} (r_{v,i} - \bar{r}_v) \cdot w_{u,v}}{\sum_{v \in N} w_{u,v}}$

Common items: 1, 4, 6
Rating vectors of common items:
          a=[1,4,5]
          b=[2,2,5]
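
As a worked illustration of the two similarity measures, the sketch below evaluates Pearson and cosine similarity on the slide's example vectors a and b. The function names and the choice to return 0 for a constant or zero vector are assumptions made for this example, not something the slides specify.

```python
import math


def pearson(a, b):
    """Pearson correlation: cov(a, b) / (sigma_a * sigma_b)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: a constant vector is treated as uncorrelated
    return cov / math.sqrt(var_a * var_b)


def cosine(a, b):
    """Cosine similarity: (a . b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: a zero vector has no defined direction
    return dot / (norm_a * norm_b)


# Rating vectors of the common items 1, 4, 6 from the slide.
a = [1, 4, 5]
b = [2, 2, 5]
print(round(pearson(a, b), 4))  # 0.6934
print(round(cosine(a, b), 4))   # 0.9401
```

Cosine comes out much higher than Pearson here because it does not subtract each user's mean rating before comparing.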
Challenges faced by traditional methods

 Matching similar users (computing similarities):
       Sparsity and noise
       Scalability
       ……
STAR-USER-BASED CF
The MPN users
 Let areas A, B, C, D be the neighbor sets of users A, B, C, D respectively.
 Then area E is the set of the most popular neighbors (MPN).
What is a star user
 Star users are special users who have rated all items by a relatively stable standard.
 We maintain a small set of star users and treat them as fixed neighbors of every general user.
Problem Formulation
 Filling the following matrix $\mathcal{R} \in \mathbb{R}^{H \times N}$ (rows: star users, H; columns: items, N):

                        Items (N)
                  i_1    …    i      …    i_N
          s_1      ?     .    .      .     ?
          …        .     .    .      .     .
          s        .     .  r_{s,i}  .     .
          …        .     .    .      .     .
          s_H      ?     .    .      .     ?
Prediction Model
 Selecting Star Neighbors: the Relationship Matrix W links General Users (M), $u_1, \dots, u_M$, to Star Users (H), $s_1, \dots, s_H$, with entries $w_{u,s}$.
 Generate predictions based on star users' ratings:
    $\hat{r}_{u,i} = \bar{r}_u + \dfrac{\sum_{s \in S} (r_{s,i} - \bar{r}_s) \cdot w_{u,s}}{\sum_{s \in S} w_{u,s}}$
 The parameters are $r_{s,i}$ and $w_{u,s}$.
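
The prediction rule on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `predict`, the argument layout, the fallback to the user's mean when the weights sum to zero, and the numbers in the usage line are all assumptions. The denominator follows the slide's form (a plain sum of weights); many neighborhood implementations use absolute values there instead, which matters when some Pearson weights are negative.

```python
def predict(user_mean, star_ratings, star_means, weights):
    """Predict one rating for a general user from the star users.

    star_ratings[s] : star user s's rating of the target item (r_{s,i})
    star_means[s]   : star user s's mean rating
    weights[s]      : similarity w_{u,s} between the user and star user s
    """
    denom = sum(weights)
    if denom == 0:
        # Assumption: with no usable weights, fall back to the user's mean.
        return user_mean
    num = sum((r - m) * w
              for r, m, w in zip(star_ratings, star_means, weights))
    return user_mean + num / denom


# Hypothetical numbers: two star users, one general user with mean 3.5.
print(predict(3.5, [4.0, 2.0], [3.0, 3.0], [0.8, 0.2]))
```

Because every general user shares the same small set of star users, no per-user neighbor search is needed at prediction time.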
How we get star users(1)

 Training Stage:
    1. Initialize the star user matrix ℛ.
    2. Predict each rating $\hat{r}_{u,i}$ in the training set:
       $\hat{r}_{u,i} = \bar{r}_u + \dfrac{\sum_{s \in S} (r_{s,i} - \bar{r}_s) \times w_{u,s}}{\sum_{s \in S} w_{u,s}}$
    3. The residual is $e_{u,i} = r_{u,i} - \hat{r}_{u,i}$, and the gradient of $e_{u,i}^2$ is:
       $\dfrac{\partial e_{u,i}^2}{\partial r_{s,i}} = -2 e_{u,i} \cdot \dfrac{N-1}{N} \cdot \dfrac{w_{u,s}}{\sum_{s' \in S} w_{u,s'}}$
How we get star users(2)

 Training Stage:
    4. Update each element of matrix ℛ:
       $r_{s,i} \leftarrow r_{s,i} + \eta \cdot e_{u,i} \cdot \dfrac{w_{u,s}}{\sum_{s' \in S} w_{u,s'}}$
    5. Repeat steps 2 to 4 until convergence.
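
The five training steps can be sketched as a small stochastic-gradient loop. Everything concrete here is an assumption for illustration: the toy ratings, the fixed positive weights in `W`, the learning rate, the fixed number of epochs standing in for a convergence check, and recomputing each star user's mean on every prediction. It is a sketch of the update rule, not the authors' code.

```python
import math


def row_mean(row):
    return sum(row) / len(row)


def predict(u, i, R, W, user_means):
    """Step 2: star-user-based prediction for user u and item i."""
    denom = sum(W[u])
    num = sum((R[s][i] - row_mean(R[s])) * W[u][s] for s in range(len(R)))
    return user_means[u] + num / denom


def rmse(ratings, R, W, user_means):
    se = sum((r - predict(u, i, R, W, user_means)) ** 2
             for u, i, r in ratings)
    return math.sqrt(se / len(ratings))


def train(ratings, R, W, user_means, lr=0.05, epochs=200):
    """Steps 2 to 4, repeated for a fixed number of epochs (assumption)."""
    for _ in range(epochs):
        for u, i, r in ratings:
            e = r - predict(u, i, R, W, user_means)  # step 3: residual
            denom = sum(W[u])
            for s in range(len(R)):                  # step 4: update R[s][i]
                R[s][i] += lr * e * W[u][s] / denom
    return R


# Hypothetical toy data: 2 general users, 2 star users, 3 items.
ratings = [(0, 0, 5.0), (0, 1, 1.0), (0, 2, 3.0), (1, 0, 2.0), (1, 2, 4.0)]
user_means = [3.0, 3.0]                 # mean rating of each general user
W = [[0.9, 0.1], [0.2, 0.8]]            # assumed fixed, positive weights
R = [[3.0, 3.0, 3.0], [3.0, 3.0, 3.0]]  # step 1: initialize star matrix

before = rmse(ratings, R, W, user_means)
train(ratings, R, W, user_means)
after = rmse(ratings, R, W, user_means)
print(before, after)
```

On this toy data the training RMSE should fall below its starting value, since the star ratings are the only quantities being fitted.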
How we get star users(3)

 Parameters:
    ◦ Update frequency (users): how often the mean rating $\bar{r}_u$ is updated.
    ◦ Update frequency (similarity): how often $w_{u,s} \in W$ is updated for each u and s.
 Maintain the relationship matrix W, $W \in \mathbb{R}^{M \times H}$, until the recommending stage.
    ◦ $w_{u,s}$ is computed using Pearson Correlation.
EXPERIMENTAL RESULTS
Results on MovieLens Dataset
[Figure: RMSE of our approach against various H and comparison with kNN]
[Figure: Time requirement comparison]
Item-based Model
 We first train a small set of star items instead of star users.
 Predictions are computed as:
    $\hat{r}_{u,i} = \bar{r}_i + \dfrac{\sum_{j \in S'} (r_{u,j} - \bar{r}_j) \times w_{i,j}}{\sum_{j \in S'} w_{i,j}}$
Results on Netflix Dataset
[Figure: Our approach with different values of learning rate]
[Figure: Our approach with different values of H]
Discussion
 Comparison with kNN
    ◦ Accuracy
    ◦ Data Sparsity
    ◦ Scalability: $O(M^2 \times N') \rightarrow O(M \times H \times N')$, where $H \ll M$.
 Comparison with SVD
    ◦ Scientific explanation
    ◦ Parameters
    ◦ Updating
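
To make the scalability comparison concrete, the sketch below plugs numbers into the two cost expressions from this slide. The values of M, H and N′ are hypothetical and chosen only for illustration, and the function names are not from the slides. The slides do not define N′, so it is treated here simply as the per-pair comparison cost that appears in both expressions.

```python
def knn_similarity_cost(M, N_prime):
    """Cost of comparing every user with every other user: M^2 * N'."""
    return M * M * N_prime


def star_similarity_cost(M, H, N_prime):
    """Cost of comparing every user with H star users: M * H * N'."""
    return M * H * N_prime


# Hypothetical sizes: 10,000 general users, 50 star users, N' = 20.
M, H, N_prime = 10_000, 50, 20
speedup = knn_similarity_cost(M, N_prime) / star_similarity_cost(M, H, N_prime)
print(speedup)  # 200.0
```

The ratio reduces to M / H, so the saving grows with the number of general users as long as the star set stays small.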
CONCLUSION
Summary
 We proposed a novel CF model based on star users.
 The original intention is to improve the traditional neighborhood-based CF model.
 Experimental results on two datasets
  verified the effectiveness of our approach.
Future work
 Incorporating contextual information into
  our model.
 Validating our approach in practical
  applications.
THANK YOU
