On Top-k Recommendation using Social Networks

Xiwang Yang, Harald Steck*+, Yang Guo* and Yong Liu

Polytechnic Institute of NYU
* Bell Labs
+ Netflix Inc.

Outline
     Background & Motivation
       Social network based top-k recommendation
       Related Work: AllRank, SoRec, STE, SocialMF, Trust-cf


     Top-k recommender using social networks
       Top-k MF using Social Networks
       Nearest Neighbor Methods


     Evaluation
     Conclusion

Social Recommenders Everywhere




Social network based top-k recommendation


[Figure: the recommender produces a list of top movies (??) for the target customer]

Social network based top-k recommendation is not well studied
Social Top-K Recommendation
 Top-k recommendation:
     More realistic RS task
 Integrate social network information into RS
     Matrix Factorization (MF)
       • SoRec, STE, SocialMF – optimize RMSE
       • AllRank – optimizes top-k recommendation, but without social
         network information
       • Our approach directly optimizes social network based
         top-k recommendation
     Nearest Neighbor (NN)
       • Trust-cf (RecSys'09)
           – Combines the CF neighborhood with the social neighborhood;
             items rated by the combined neighborhood are considered,
             each rating is predicted as the neighbors' average rating,
             and items are ranked by predicted rating to form the
             top-k recommendation
       • Our approach employs a new neighborhood construction and
         a voting mechanism
AllRank (Steck, KDD'10)
 Use AllRank to optimize top-k recommendation
 Users' selection bias causes the observed feedback (e.g. ratings,
  purchases, clicks) in the data to be missing not at random (MNAR)
  (RecSys'09)
    Lower ratings are missing with higher probability
    Missing ratings tend to indicate that a user does not like the item
 Prediction: $\hat{R}_{u,i} = r_m + Q_u P_i^T$
 Objective:

    $$\sum_{\text{all } u} \sum_{\text{all } i} W_{u,i} \left( R^{o\&i}_{u,i} - \hat{R}_{u,i} \right)^2 + \lambda \left( \|P\|_F^2 + \|Q\|_F^2 \right)$$

    $$W_{u,i} = \begin{cases} 1 & \text{if } R_{u,i} \text{ observed} \\ w_m & \text{otherwise} \end{cases} \qquad R^{o\&i}_{u,i} = \begin{cases} R_{u,i} & \text{if } R_{u,i} \text{ observed} \\ r_m & \text{otherwise} \end{cases}$$

 $w_m > 0$: training on all items
 BaseMF: $w_m = 0$, training on observed ratings only
 Rank items by predicted rating to form the top-k list
 Tailor existing social-trust enhanced MF models for top-k
  recommendation
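
The AllRank objective above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `allrank_loss`, the dense-matrix layout, and the default values of `r_m`, `w_m` and `lam` are all chosen here for illustration only.

```python
import numpy as np

def allrank_loss(R, observed, P, Q, r_m=2.0, w_m=0.05, lam=0.1):
    """Weighted squared error over ALL (user, item) pairs plus an L2 penalty.

    R        : (n_users, n_items) ratings; entries with observed == False are ignored
    observed : boolean mask with the same shape as R
    Q, P     : user and item latent factors, shapes (n_users, d) and (n_items, d)
    Default hyperparameter values are illustrative assumptions.
    """
    R_hat = r_m + Q @ P.T                    # prediction r_m + Q_u P_i^T
    target = np.where(observed, R, r_m)      # R^{o&i}: impute r_m where missing
    W = np.where(observed, 1.0, w_m)         # weight 1 if observed, w_m otherwise
    fit = np.sum(W * (target - R_hat) ** 2)
    reg = lam * (np.sum(P ** 2) + np.sum(Q ** 2))
    return fit + reg
```

Setting `w_m=0` drops every missing entry from the fit term, which corresponds to the BaseMF case on the slide; any `w_m > 0` makes the missing entries pull predictions toward `r_m`.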
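SoRec
      Prediction:

              $\hat{R}_{u,i} = r_m + Q_u P_i^T$, $\hat{S}^*_{u,v} = s_m + Q_u Z_v^T$

      Objective (optimizes RMSE):

              $$\sum_{(u,i)\ \text{obs.}} \left( R_{u,i} - \hat{R}_{u,i} \right)^2 + \gamma \sum_{(u,v)\ \text{obs.}} \left( S^*_{u,v} - \hat{S}^*_{u,v} \right)^2 + \lambda \left( \|P\|_F^2 + \|Q\|_F^2 + \|Z\|_F^2 \right)$$

      Modified objective (optimizes top-k hit rate):

              $$\sum_{\text{all } u} \sum_{\text{all } i} W_{u,i} \left( R^{o\&i}_{u,i} - \hat{R}_{u,i} \right)^2 + \sum_{\text{all } u} \sum_{\text{all } v} W^{(S)}_{u,v} \left( S^{*(o\&i)}_{u,v} - \hat{S}^*_{u,v} \right)^2 + \lambda \left( \|P\|_F^2 + \|Q\|_F^2 + \|Z\|_F^2 \right)$$

              $$W_{u,i} = \begin{cases} 1 & \text{if } R_{u,i} \text{ observed} \\ w_m > 0 & \text{otherwise} \end{cases} \qquad R^{o\&i}_{u,i} = \begin{cases} R_{u,i} & \text{if } R_{u,i} \text{ observed} \\ r_m & \text{otherwise} \end{cases}$$

              $$W^{(S)}_{u,v} = \gamma \begin{cases} 1 & \text{if } S^*_{u,v} \text{ observed} \\ w^{(S)}_m > 0 & \text{otherwise} \end{cases} \qquad S^{*(o\&i)}_{u,v} = \begin{cases} S^*_{u,v} & \text{if } S^*_{u,v} \text{ observed} \\ s_m & \text{otherwise} \end{cases}$$

   Top-k list generated based on ranking of predicted ratings of all items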
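
A minimal NumPy sketch of the modified SoRec objective, assuming dense matrices and hypothetical names (`sorec_topk_loss`, `r_obs`, `s_obs`, `w_m_s`); the default hyperparameter values are illustrative and not taken from the slides.

```python
import numpy as np

def sorec_topk_loss(R, r_obs, S, s_obs, P, Q, Z,
                    r_m=2.0, s_m=0.0, w_m=0.05, w_m_s=0.05,
                    gamma=1.0, lam=0.1):
    """Modified SoRec loss: rating term and social term both run over ALL pairs.

    R, r_obs : ratings and their boolean observed-mask, shape (n_users, n_items)
    S, s_obs : social values S* and their observed-mask, shape (n_users, n_users)
    Q, P, Z  : user, item and social latent factors, each with d columns
    """
    R_hat = r_m + Q @ P.T                          # predicted ratings
    S_hat = s_m + Q @ Z.T                          # predicted social values
    W_r = np.where(r_obs, 1.0, w_m)                # W_{u,i}
    W_s = gamma * np.where(s_obs, 1.0, w_m_s)      # W^{(S)}_{u,v}
    rating_term = np.sum(W_r * (np.where(r_obs, R, r_m) - R_hat) ** 2)
    social_term = np.sum(W_s * (np.where(s_obs, S, s_m) - S_hat) ** 2)
    reg = lam * (np.sum(P ** 2) + np.sum(Q ** 2) + np.sum(Z ** 2))
    return rating_term + social_term + reg
```

With `w_m=0` and `w_m_s=0` the two sums reduce to the observed pairs only, which recovers the RMSE-oriented objective.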
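 STE: $\hat{R}_{u,i} = r_m + \alpha Q_u P_i^T + (1 - \alpha) \sum_v S_{u,v} Q_v P_i^T$
   Modified objective (optimizes top-k hit rate):

    $$\sum_{\text{all } u} \sum_{\text{all } i} W_{u,i} \left( R^{o\&i}_{u,i} - \hat{R}_{u,i} \right)^2 + \lambda \left( \|P\|_F^2 + \|Q\|_F^2 \right)$$

    $$W_{u,i} = \begin{cases} 1 & \text{if } R_{u,i} \text{ observed} \\ w_m > 0 & \text{otherwise} \end{cases} \qquad R^{o\&i}_{u,i} = \begin{cases} R_{u,i} & \text{if } R_{u,i} \text{ observed} \\ r_m & \text{otherwise} \end{cases}$$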
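
The STE prediction rule can be written in matrix form. The sketch below assumes a dense trust matrix `S` with entries $S_{u,v}$; the function name `ste_predict` and the default values are hypothetical.

```python
import numpy as np

def ste_predict(Q, P, S, r_m=2.0, alpha=0.5):
    """STE prediction: blend a user's own taste with that of trusted users.

    Q : (n_users, d) user factors, P : (n_items, d) item factors
    S : (n_users, n_users) trust weights S_{u,v}
    """
    own = Q @ P.T                  # Q_u P_i^T for every (u, i)
    friends = (S @ Q) @ P.T        # sum_v S_{u,v} Q_v P_i^T for every (u, i)
    return r_m + alpha * own + (1.0 - alpha) * friends
```

With `alpha=1.0` the social part vanishes and the prediction reduces to the plain MF form $r_m + Q_u P_i^T$.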


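 SocialMF: $\hat{R}_{u,i} = r_m + Q_u P_i^T$
   Modified objective (optimizes top-k hit rate):

    $$\sum_{\text{all } u} \sum_{\text{all } i} W_{u,i} \left( R^{o\&i}_{u,i} - \hat{R}_{u,i} \right)^2 + \beta \sum_{\text{all } u} \left( \Big( Q_u - \sum_v S^*_{u,v} Q_v \Big) \Big( Q_u - \sum_v S^*_{u,v} Q_v \Big)^T \right) + \lambda \left( \|P\|_F^2 + \|Q\|_F^2 \right)$$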
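
The social term of the SocialMF objective penalizes the distance between each user's factor vector and the trust-weighted combination of the factors of the users they trust. A minimal sketch, assuming a dense matrix `S_star` holding $S^*_{u,v}$ and a hypothetical function name:

```python
import numpy as np

def socialmf_social_penalty(Q, S_star, beta=20.0):
    """beta * sum_u || Q_u - sum_v S*_{u,v} Q_v ||^2.

    Q      : (n_users, d) user latent factors (one row vector per user)
    S_star : (n_users, n_users) trust weights S*_{u,v}
    """
    diff = Q - S_star @ Q          # row u holds Q_u - sum_v S*_{u,v} Q_v
    return beta * np.sum(diff ** 2)
```

Because each `Q_u` is a row vector, the product of the difference with its own transpose is simply its squared Euclidean norm, which is what `np.sum(diff ** 2)` accumulates over all users.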
Nearest Neighbor Methods
 CF-ULF approach
     Use AllRank to obtain user latent features
     Cluster users by PCC in the latent feature space
     Select the k1 nearest neighbors of target user u
     Relevant items of these nearest neighbors are voted to the
      target user; the voting weight is the PCC similarity

      $$\text{Vote}_{u,i} = \sum_{v \in N_u} \text{sim}(u, v)\, \delta_{i \in I_v}$$

   Top-k list is generated based on voting value
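
The voting step can be sketched as below. This is an illustration under assumptions, not the authors' code: `Q` stands for the user latent features obtained from AllRank, `relevant` is a 0/1 matrix marking each user's relevant items (how "relevant" is decided is not specified on the slide), and neighbors are found by a brute-force similarity scan rather than any clustering structure.

```python
import numpy as np

def pcc(a, b):
    """Pearson correlation between two latent feature vectors."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def cf_ulf_votes(Q, relevant, u, k1):
    """Vote_{u,i}: PCC-weighted votes from the k1 nearest neighbors of user u.

    Q        : (n_users, d) user latent features
    relevant : (n_users, n_items) 0/1 matrix, 1 if the item is relevant to the user
    """
    n_users = len(Q)
    sims = np.array([pcc(Q[u], Q[v]) for v in range(n_users)])
    sims[u] = -np.inf                        # never pick the target user itself
    k1 = min(k1, n_users - 1)
    neighbours = np.argsort(-sims)[:k1]      # indices of the k1 most similar users
    votes = np.zeros(relevant.shape[1])
    for v in neighbours:
        votes += sims[v] * relevant[v]       # each relevant item of v gets sim(u, v)
    return votes
```

Sorting items by `votes` in descending order and keeping the first k yields the top-k list.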
Nearest Neighbor Methods
 PureTrust approach
   Breadth-first search (BFS) in the social network to
    find the k2 trusted users of the target user u
   Relevant items of these trusted users are voted to the
    target user; the voting weight is proportional to $1/d_v$

     $$\text{Vote}_{u,i} = \sum_{v \in N_u^{(t)}} w_t(u, v)\, \delta_{i \in I_v}$$

   $N_u^{(t)}$ is the set of trusted users of u
   $w_t(u, v)$ is the voting weight from user v: $w_t(u, v) = 1 / d_v$
   $d_v$ is the depth of user v in the BFS tree rooted at
    user u
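
A minimal sketch of PureTrust, assuming the trust network is given as an adjacency dictionary `trust` (user -> list of directly trusted users) and each user's relevant items as a dictionary `relevant`; both structures and the function names are assumptions made for illustration.

```python
from collections import deque
import numpy as np

def bfs_trusted_users(trust, u, k2):
    """Return up to k2 (user, depth) pairs in BFS order from target user u."""
    depth = {u: 0}
    queue = deque([u])
    found = []
    while queue and len(found) < k2:
        x = queue.popleft()
        for v in trust.get(x, []):
            if v not in depth:               # first visit fixes the BFS depth d_v
                depth[v] = depth[x] + 1
                found.append((v, depth[v]))
                queue.append(v)
                if len(found) == k2:
                    break
    return found

def pure_trust_votes(trust, relevant, u, k2, n_items):
    """Vote_{u,i}: each trusted user v votes 1/d_v for each of its relevant items."""
    votes = np.zeros(n_items)
    for v, d_v in bfs_trusted_users(trust, u, k2):
        for i in relevant.get(v, ()):
            votes[i] += 1.0 / d_v
    return votes
```

Direct friends (depth 1) vote with weight 1, friends of friends with weight 1/2, and so on, so closer users in the trust network have more influence.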
Nearest Neighbor Methods
 Trust-CF-ULF approach
    Combination of the CF-ULF approach and PureTrust
    Find k1 nearest neighbors from the CF-ULF neighborhood
    Find k2 nearest neighbors from the trust neighborhood that
     are not already in the k1 set (k2 = k1)
    Relevant items of these users are voted to the target user
    Top-k list is generated based on voting value


 Trust-CF-ULF-best approach
    Given the total neighborhood size, dynamically tune the values of
     k1 and k2 to obtain the best recall result
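
The neighborhood combination can be sketched as follows. The inputs are assumed to be lists of `(user, weight)` pairs already ranked by each method; the slides do not say how CF weights and trust weights are scaled against each other, so this sketch simply uses both as given, which is an assumption.

```python
def combine_neighbourhoods(cf_ranked, trust_ranked, k1, k2):
    """Take k1 CF-ULF neighbors, then k2 trust neighbors not already chosen."""
    cf_part = cf_ranked[:k1]
    chosen = {v for v, _ in cf_part}
    trust_part = [(v, w) for v, w in trust_ranked if v not in chosen][:k2]
    return cf_part + trust_part

def vote(neighbours, relevant, n_items):
    """Accumulate each neighbor's weight on every item relevant to that neighbor."""
    votes = [0.0] * n_items
    for v, w in neighbours:
        for i in relevant.get(v, ()):
            votes[i] += w
    return votes

def top_k(votes, k):
    """Indices of the k items with the highest vote values."""
    return sorted(range(len(votes)), key=lambda i: -votes[i])[:k]
```

For the Trust-CF-ULF-best variant, one could loop over all splits `(k1, k2)` with a fixed total and keep the split with the highest recall on held-out data; that tuning loop is omitted here.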
Evaluation Metrics
  Top-k hit rate (Recall)
      The fraction of relevant items in the test set that are in the
       top-k of the ranking list


  RMSE

     $$\text{RMSE} = \sqrt{ \frac{ \sum_{(u,i) \in R_{test}} \left( R_{u,i} - \hat{R}_{u,i} \right)^2 }{ |R_{test}| } }$$
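
Both metrics are short to compute. The sketch below evaluates recall for a single user's ranked list and RMSE over a flat list of test ratings; the function names and the per-user formulation of recall are assumptions made for illustration.

```python
import numpy as np

def recall_at_k(ranked_items, relevant_test_items, k):
    """Fraction of the relevant test items that appear in the top-k of the ranking."""
    if not relevant_test_items:
        return 0.0
    hits = len(set(ranked_items[:k]) & set(relevant_test_items))
    return hits / len(relevant_test_items)

def rmse(R_true, R_pred):
    """Root mean squared error over the test ratings."""
    R_true = np.asarray(R_true, dtype=float)
    R_pred = np.asarray(R_pred, dtype=float)
    return float(np.sqrt(np.mean((R_true - R_pred) ** 2)))
```

Recall depends only on the order of the items, while RMSE depends only on the predicted values of the observed test ratings, so a model can do well on one and poorly on the other.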




Top-k hit rate on Epinions Dataset
  71K users, 104K items, 571K item reviews, 509K trust statements

  Up to ~10× improvement compared with training on observed ratings only
  The social network is very helpful for top-k recommendation,
   especially for recommendations to cold-start users
  Modified SoRec outperforms modified No Trust (AllRank) by 23.1% in
   terms of overall recall and 101.8% in terms of cold user recall
  In SoRec, recall for cold users is better than recall for all users
  An item rated by a cold user has received 102 ratings on average
  An item rated by all users has received 93 ratings on average
RMSE on Epinions Dataset
  Set j0 = 10, λ = 0.1, $r_m$ = 4.0, $w_m$ = 0
  RMSE = 1.174 for BaseMF
  RMSE = 1.095 for SocialMF (β = 20)
  RMSE = 1.157 for STE (α = 0.5)
  RMSE = 1.117 for SoRec (γ = 50 and $w_m^{(S)}$ = 0)

  Consistent with RMSE results in the published literature
  SocialMF performs best in terms of RMSE but
   worst in terms of top-k hit rate
Experiments on Epinions Dataset-NN




  Greatly outperforms existing work (Trust-cf)
      Trust-cf predicts the target user's rating as the
       average rating of the user's neighbors, which is
       based on the observed ratings only
      Our CF neighbors are derived from user latent features obtained
       from AllRank, which accounts for MNAR data by training on all items
      Voting is the simplest possible way of accounting for all
       ratings, i.e. counting 0 for an absent rating and counting 1
       for an observed relevant rating
Experiments on Flixster Dataset
  ~1M Users, 49K movies, 8.2M ratings,
   26.7M connections
  Results are similar




Impact of Dimensionality and Top-k




  Top-k hit rate on the Flixster data is much better than on the
   Epinions data
      The Epinions dataset has about twice as many items as the
       Flixster dataset, while recall on Flixster is more than twice
       that on Epinions for top-5 to top-500 recommendations
      Epinions is a multi-category dataset (cars, movies, books, etc.)
      Users in the Flixster dataset have, on average, more
       social connections and item ratings
Conclusion
  Comprehensive study on improving the accuracy of
   top-k recommendation using social networks
      Tailored existing social-trust enhanced MF models for top-k
       recommendation by considering missing ratings

  Proposed an NN-based top-k recommendation method
   that combines users' neighborhoods in the trust network with
   their neighborhoods in the latent feature space and uses
   voting instead of average rating to consider all ratings

  Among social recommenders that consider missing feedback,
   the one that works best for minimizing RMSE works worst for
   maximizing the hit rate, and vice versa
      First developing a good RMSE approach and then modifying
       the training for top-k is not necessarily a viable strategy for
       obtaining a good top-k approach
Thanks!

     Q&A

On Top-k Recommendation using Social Networks

  • 1. On Top-k Recommendation using Social Networks Xiwang Yang, Harald Steck*+, Yang Guo* and Yong Liu Polytechnic Institute of NYU *Bell Labs + Netflix Inc. 1
  • 2. Outline  Background & Motivation  Social network based top-k recommendation  Related Work: AllRank, SoRec, STE, SocialMF, Trust-cf  Top-k recommender using social networks  Top-k MF using Social Networks  Nearest Neighbor Methods  Evaluation  Conclusion 2
  • 4. Social network based top-k recommendation Target Customer List of Top Movies ?? Recommender Social network based top-k recommendation is not well studied 4
  • 5. Social Top-K Recommendation  Top-k recommendation:  More realistic RS task  Integrate social network information into RS  Matrix Factorization(MF) • SoRec, STE, SocialMF – optimzie RMSE • AllRank - without social network information • Our approach directly optimize social network based top-k recommendation  Nearest Neighbor(NN) • Trust-cf (recsys’09) – Combine CF neighborhood with social neighborhood, items rated by the combined neighborhood are considered, average rating, rank item based on predicted rating to form top-k recommendation • Our approach employs new neighborhood construction + 5 using voting mechanism
  • 6. AllRank-(Steck kdd’10)  Use AllRank to optimize top-k recommendation  user’s selection bias causes the observed feedback (e.g. ratings, purchases, clicks) in the data to be missing not at random (MNAR)— (Recsys’09)  Lower ratings missed with higher probability  missing ratings tend to indicate that a user does not like the item ˆ  Prediction: Ru ,i = rm + Qu PiT  Objective: ∑∑ all u all i i ˆ Wu ,i ( Ruo,&i − Ru ,i ) 2 + λ (|| P ||2 + || Q ||2 ) F F  1 if Ru ,i observed R if Ru ,i observed Wu ,i =  Ruo,&i =  u ,i  wm otherwise i  rm otherwise  Wm > 0, training on all items  BaseMF: Wm = 0, training on observed ratings only  Rank items based on predicted rating to form top-k list  Tailor existing social-trust enhanced MF model for top-k recommendation 6
  • 7. Outline
      • Background & Motivation
          - Social network based top-k recommendation
          - Related Work: AllRank, SoRec, STE, SocialMF
      • Top-k recommender using social networks
          - Top-k MF using Social Networks
          - Nearest Neighbor Methods
      • Evaluation
      • Conclusion
  • 8. SoRec
      • Prediction: $\hat{R}_{u,i} = r_m + Q_u P_i^T$ and $\hat{S}^*_{u,v} = s_m + Q_u Z_v^T$
      • Objective (optimizes RMSE): $\sum_{(u,i)\,\text{obs.}} \left( R_{u,i} - \hat{R}_{u,i} \right)^2 + \gamma \sum_{(u,v)\,\text{obs.}} \left( S^*_{u,v} - \hat{S}^*_{u,v} \right)^2 + \lambda \left( \|P\|_F^2 + \|Q\|_F^2 + \|Z\|_F^2 \right)$
      • Modified objective (optimizes top-k hit rate): $\sum_{\text{all } u} \sum_{\text{all } i} W_{u,i} \left( R^{o\&i}_{u,i} - \hat{R}_{u,i} \right)^2 + \sum_{\text{all } u} \sum_{\text{all } v} W^{(S)}_{u,v} \left( S^{*(o\&i)}_{u,v} - \hat{S}^*_{u,v} \right)^2 + \lambda \left( \|P\|_F^2 + \|Q\|_F^2 + \|Z\|_F^2 \right)$
          - $W_{u,i} = 1$ if $R_{u,i}$ is observed, $w_m > 0$ otherwise
          - $R^{o\&i}_{u,i} = R_{u,i}$ if $R_{u,i}$ is observed, $r_m$ otherwise
          - $W^{(S)}_{u,v} = \gamma$ if $S^*_{u,v}$ is observed, $\gamma \, w^{(S)}_m$ with $w^{(S)}_m > 0$ otherwise
          - $S^{*(o\&i)}_{u,v} = S^*_{u,v}$ if $S^*_{u,v}$ is observed, $s_m$ otherwise
      • The top-k list is generated by ranking the predicted ratings of all items
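  • 9.
      • STE: $\hat{R}_{u,i} = r_m + \alpha Q_u P_i^T + (1 - \alpha) \sum_v S_{u,v} Q_v P_i^T$
          - Modified objective (optimizes top-k hit rate): $\sum_{\text{all } u} \sum_{\text{all } i} W_{u,i} \left( R^{o\&i}_{u,i} - \hat{R}_{u,i} \right)^2 + \lambda \left( \|P\|_F^2 + \|Q\|_F^2 \right)$
          - $W_{u,i} = 1$ if $R_{u,i}$ is observed, $w_m > 0$ otherwise
          - $R^{o\&i}_{u,i} = R_{u,i}$ if $R_{u,i}$ is observed, $r_m$ otherwise
      • SocialMF: $\hat{R}_{u,i} = r_m + Q_u P_i^T$
          - Modified objective (optimizes top-k hit rate): $\sum_{\text{all } u} \sum_{\text{all } i} W_{u,i} \left( R^{o\&i}_{u,i} - \hat{R}_{u,i} \right)^2 + \beta \sum_{\text{all } u} \left( \left( Q_u - \sum_v S^*_{u,v} Q_v \right) \left( Q_u - \sum_v S^*_{u,v} Q_v \right)^T \right) + \lambda \left( \|P\|_F^2 + \|Q\|_F^2 \right)$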
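
As a rough illustration of how the two models on this slide use the social matrix, the sketch below computes the STE prediction and the SocialMF social penalty with NumPy. The function names and the dense matrix `S` (row u holds the trust weights of user u toward other users) are assumptions made for the example, not names from the slides.

```python
import numpy as np

def ste_predict(Q, P, S, r_m, alpha):
    """STE: blend a user's own taste with the tastes of the users they trust."""
    own = Q @ P.T               # Q_u P_i^T for every (u, i)
    social = (S @ Q) @ P.T      # sum_v S_{u,v} Q_v P_i^T
    return r_m + alpha * own + (1.0 - alpha) * social

def socialmf_penalty(Q, S, beta):
    """SocialMF: penalize the distance between Q_u and the trust-weighted
    average of the latent vectors of the users u trusts."""
    diff = Q - S @ Q            # row u is Q_u - sum_v S_{u,v} Q_v
    return beta * np.sum(diff ** 2)
```

In STE the social network changes the prediction itself, while in SocialMF it only enters through the regularization term, so the prediction formula stays that of plain MF.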
  • 10. Nearest Neighbor Methods
      • CF-ULF approach
          - Use AllRank to obtain user latent features
          - Cluster users by PCC (Pearson correlation coefficient) in the latent feature space
          - Select the k1 nearest neighbors of target user u
          - Relevant items of these nearest neighbors are voted to the target user; the voting weight is the PCC similarity: $Vote_{u,i} = \sum_{v \in N_u} sim(u, v) \, \delta_{i \in I_v}$
          - The top-k list is generated based on the voting value
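
The CF-ULF voting step can be sketched as follows. This is a minimal example under assumptions: `Q` stands for user latent features that would come from AllRank, `relevant` is a hypothetical boolean user-by-item matrix marking relevant items, and neighbors are chosen by directly ranking PCC values rather than by any specific clustering procedure.

```python
import numpy as np

def cf_ulf_votes(Q, relevant, u, k1):
    """Vote_{u,i} = sum over the k1 nearest neighbors v of sim(u, v) * [i in I_v]."""
    sim = np.corrcoef(Q)[u]               # PCC between user u and every user
    sim[u] = -np.inf                      # never pick the target user itself
    neighbors = np.argsort(-sim)[:k1]     # k1 users with the highest PCC
    votes = sim[neighbors] @ relevant[neighbors].astype(float)
    return votes, neighbors
```

Sorting the returned `votes` in descending order gives the ranked item list from which the top-k items are taken.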
  • 11. Nearest Neighbor Methods
      • PureTrust approach
          - Breadth-first search (BFS) in the social network finds k2 trusted users for the target user u
          - Relevant items of these trusted users are voted to the target user; the voting weight is proportional to $1/d_v$: $Vote_{u,i} = \sum_{v \in N_u^{(t)}} w_t(u, v) \, \delta_{i \in I_v}$
          - $N_u^{(t)}$ is the set of trusted users of u
          - $w_t(u, v) = 1/d_v$ is the voting weight from user v
          - $d_v$ is the depth of user v in the BFS tree rooted at user u
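
A small sketch of the PureTrust idea, assuming the trust network is given as an adjacency dictionary `trust` (user to the list of users they trust) and relevant items as a dictionary `relevant_items`; both structures and the function name are hypothetical. BFS stops once k2 trusted users are found, and each one votes with weight 1/d_v.

```python
from collections import deque

def pure_trust_votes(trust, relevant_items, u, k2):
    """Collect up to k2 trusted users by BFS from u and let them vote with weight 1/depth."""
    depth = {u: 0}
    queue = deque([u])
    trusted = []
    while queue and len(trusted) < k2:
        cur = queue.popleft()
        for v in trust.get(cur, []):
            if v not in depth:                 # first visit fixes the BFS depth d_v
                depth[v] = depth[cur] + 1
                trusted.append(v)
                queue.append(v)
                if len(trusted) == k2:
                    break
    votes = {}
    for v in trusted:
        for i in relevant_items.get(v, ()):
            votes[i] = votes.get(i, 0.0) + 1.0 / depth[v]
    return votes
```

Directly trusted users (depth 1) therefore count twice as much as users reached through one intermediate user (depth 2).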
  • 12. Nearest Neighbor Methods
      • Trust-CF-ULF approach
          - Combination of the CF-ULF approach and PureTrust
          - Find k1 nearest neighbors from the CF-ULF neighborhood
          - Find k2 nearest neighbors from the trust neighborhood which are not in the k1 set (k2 = k1)
          - Relevant items of these users are voted to the target user
          - The top-k list is generated based on the voting value
      • Trust-CF-ULF-best approach
          - Given the total neighborhood size, dynamically tune the values of k1 and k2 to obtain the best recall result
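
The neighborhood construction of Trust-CF-ULF can be illustrated with a short helper. It assumes two already ranked candidate lists (CF neighbors by decreasing PCC, trusted users in BFS order); the function name and inputs are hypothetical, and how the two groups' votes are weighted against each other is not specified on the slide, so the sketch stops at selecting the neighbors.

```python
def combined_neighborhood(cf_ranked, trust_ranked, k1, k2):
    """Take k1 CF-ULF neighbors, then k2 trusted users not already in the k1 set."""
    cf_part = list(cf_ranked[:k1])
    chosen = set(cf_part)
    trust_part = [v for v in trust_ranked if v not in chosen][:k2]
    return cf_part, trust_part
```

For example, `combined_neighborhood([3, 1, 4, 2], [1, 5, 3, 6], 2, 2)` skips users 1 and 3 in the trust list because they are already CF neighbors.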
  • 13. Outline
      • Background & Motivation
          - Social network based top-k recommendation
          - Related Work: AllRank, SoRec, STE, SocialMF
      • Top-k recommender using social networks
          - Top-k MF using Social Networks
          - Nearest Neighbor Methods
      • Evaluation
      • Conclusion
  • 14. Evaluation Metrics
      • Top-k hit rate (Recall)
          - The fraction of relevant items in the test set that are in the top-k of the ranking list
      • RMSE: $\mathrm{RMSE} = \sqrt{ \frac{ \sum_{(u,i) \in R_{test}} \left( R_{u,i} - \hat{R}_{u,i} \right)^2 }{ |R_{test}| } }$
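
Both metrics are easy to state in code. The sketch below computes them for a single user or a single list of test ratings; the function names are hypothetical, and details such as averaging recall over users or excluding training items from the ranking are left out.

```python
import numpy as np

def recall_at_k(scores, relevant_test, k):
    """Fraction of a user's relevant test items that appear in the top-k ranking."""
    top_k = np.argsort(-np.asarray(scores))[:k]      # indices of the k highest scores
    hits = len(set(top_k.tolist()) & set(relevant_test))
    return hits / len(relevant_test)

def rmse(r_true, r_pred):
    """Root mean square error over the test ratings."""
    r_true = np.asarray(r_true, dtype=float)
    r_pred = np.asarray(r_pred, dtype=float)
    return float(np.sqrt(np.mean((r_true - r_pred) ** 2)))
```

Recall depends only on the order of the scores, while RMSE depends on their values, which is why a model can do well on one metric and poorly on the other.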
  • 15. Top-k hit rate on Epinions Dataset
      • 71K users, 104K items, 571K item reviews, 509K trust statements
      • Up to ~10× increase compared with training on observed ratings only
      • The social network is very helpful for top-k recommendation, especially for cold start users
      • Modified SoRec outperforms the modified No Trust model (AllRank) by 23.1% in overall recall and by 101.8% in cold user recall
      • Recall for cold users in SoRec is better than recall for all users
          - An item rated by a cold user has received 102 ratings on average
          - An item rated by any user has received 93 ratings on average
  • 16. RMSE on Epinions Dataset
      • Set j0 = 10, λ = 0.1, r_m = 4.0, w_m = 0
      • RMSE = 1.174 for BaseMF
      • RMSE = 1.095 for SocialMF (β = 20)
      • RMSE = 1.157 for STE (α = 0.5)
      • RMSE = 1.117 for SoRec (γ = 50 and w_m^{(S)} = 0)
      • Consistent with RMSE results in the published literature
      • SocialMF performs best in terms of RMSE but worst in terms of top-k hit rate
  • 17. Experiments on Epinions Dataset: NN
      • Greatly outperforms existing work (Trust-cf)
      • Trust-cf predicts the rating of the target user as the average rating of the user's neighbors, which is based on the observed ratings only
      • Our CF neighbors are derived from user latent features obtained from AllRank, which accounts for MNAR data by training on all items
      • Voting is the simplest possible way of accounting for all ratings, i.e. counting 0 for an absent rating and 1 for an observed relevant rating
  • 18. Experiments on Flixster Dataset
      • ~1M users, 49K movies, 8.2M ratings, 26.7M connections
      • Results are similar
  • 19. Impact of Dimensionality and Top-k
      • The top-k hit rate on the Flixster data is much better than on the Epinions data
      • The number of items in the Epinions dataset is about twice that of the Flixster dataset, while recall on Flixster is more than twice that on Epinions for top-5 to top-500 recommendations
      • Epinions is multi-category data (cars, movies, books, etc.)
      • Users in the Flixster dataset have more social connections and item ratings on average
  • 20. Conclusion
      • Comprehensive study on improving the accuracy of top-k recommendation using social networks
      • Tailored existing social-trust enhanced MF models for top-k recommendation by considering missing ratings
      • Proposed an NN based top-k recommendation method that combines users' neighborhoods in the trust network with their neighborhoods in the latent feature space, and uses voting instead of average rating to consider all ratings
      • Among social recommenders that consider missing feedback, the approach that works best for minimizing RMSE works worst for maximizing the hit rate, and vice versa
      • First developing a good RMSE approach and then modifying its training for top-k is not necessarily a viable strategy for obtaining a good top-k approach
  • 21. Thanks! Q&A

Editor's Notes

  1. Brief introduction to social network based top-k recommendation.
  2. Due to their great commercial value, social recommender systems have been widely deployed in industry, for example social photo sharing at Pinterest and the social music community site last.fm. Epinions is a social product review website where users review various items, such as cars, movies, books, software, etc., and assign ratings to the items. Users also assign trust values to other users if they find those users' product reviews or ratings valuable. Flixster is a social network site where users can rate movies and share movie reviews.
  3. What is social network based top-k recommendation? Most recent work on social network based recommendation focuses on minimizing RMSE, while social network based top-k recommendation is not well studied. What we study in this paper is social network based top-k recommendation.
  4. Why should we study top-k recommendation instead of RMSE-oriented recommendation? When we look at recommendations at Netflix or Amazon, they provide users with a list of items that they may like. Top-k recommendation is a more realistic recommendation task. Top-k is the more relevant task; existing top-k approaches in social networks: MF and NN. Structure clear.
  5. Let us look at this part: w_m is the weight for missing ratings and r_m is the imputed value for missing ratings. The idea of AllRank is to impute a low value for missing ratings, but with low confidence. It is crucial that the weight assigned to the imputed ratings is positive. In contrast, the usual optimization of the RMSE (Root Mean Square Error) is obtained by training with w_m = 0. This seemingly small difference has the important effect that the AllRank model is trained on all items, while RMSE approaches are trained only on the observed ratings. That users tend not to be interested in items with missing ratings is captured by an imputed value r_m below the global average rating; that the imputed value carries less confidence than an observed rating is captured by a missing-rating weight 0 < w_m < 1.
  6. The matrix Q is shared between the two equations. Due to this constraint, Q (i.e. the user profiles Q_u for each user u) reflects information from both the ratings and the social network, so as to achieve accurate predictions for both. Gamma ≥ 0 determines the weight of the social network information compared to the rating data. Gamma = 0 corresponds to the extreme case where the social network is ignored when learning the matrices P and Q. As Gamma increases, the influence of the social network increases. By adding W_{u,i} with w_m > 0, we train on all user-item pairs. By adding W_{u,v}^{(S)} with w_m^{(S)} > 0, we train on all user-user pairs. For a missing user-item rating, we impute the value r_m. For a missing user-user interaction, we impute the value s_m. Note that here we are training not only on all user-item interactions, but also on all user-user interactions.
  7. For the STE model, the predicted rating of user u for item i consists of two parts: the first part is Q_u·P_i, which is inferred from user u's own taste; the other part is determined by u's followees, whose weighted average latent factors contribute to the final result. The parameter alpha controls the contribution of the two parts. For the modified STE and SocialMF models, the optimization procedure easily gets stuck at a local minimum, and we proposed some tricks to escape local minima; the details can be found in our technical report.
  8. Trust-CF-ULF-best approach: given the neighborhood size, dynamically tune the values of k1 and k2 so as to obtain the best recall results. Recommendation of items to the user is the same as in the CF-ULF approach.
  9. N(u): number of relevant items of user u. N(k,u): number of relevant items in the top-k list for user u.
  10. Epinions: social product review website. In Epinions, users review various items, such as cars, movies, books, software, etc., and assign ratings to the items. Users also assign trust values to other users whose reviews and/or ratings they find valuable. Compare with original training.
  11. This illustrates that approaches that work well for the vastly popular RMSE are not necessarily useful for optimizing the more realistic top-k hit ratio or recall.
  12. Despite the different properties of the Epinions and Flixster data sets, the results on the Flixster data confirm our results on the Epinions data.
  13. Epinions is a multi-category dataset containing items from many categories (cars, movies, books, software, etc.), while the items in Flixster are all movies, which makes recommendation easier in general. Furthermore, users in the Flixster dataset have more social connections and item ratings on average than users in the Epinions dataset.
  14. Existing social-trust enhanced Matrix Factorization (MF) models can be tailored for top-k recommendation by including both observed and missing ratings in their training objective functions. We found that the technical approach for combining feedback data (e.g. ratings) with social network information that works best for minimizing RMSE works poorly for maximizing the hit ratio, and vice versa.