SlideShare a Scribd company logo
1 of 30
PERSONALIZED RETWEET PREDICTION IN TWITTER


        Liangjie Hong*, Lehigh University
        Aziz Doumith, Lehigh University
  1
        Brian D. Davison, Lehigh University
OVERVIEW
 Motivation
 Related Work

 Our Method

 Experimental Results




                         2
MOTIVATIONS
Social information platforms




                               3
MOTIVATIONS
Information overload




                       4
MOTIVATIONS
Information shortage




                       5
MOTIVATIONS




                                                                                          6


              Photo from: http://www.jenful.com/2011/06/google-1-and-the-filter-bubble/
MOTIVATIONS




                                                 7


              Photo from: http://graphlab.org/
MOTIVATIONS




                                               8


              Photo from: http://kexino.com/
MOTIVATIONS




                                                                        9


              Photo from: http://performancemarketingassociation.com/
TASKS
Given a target user and his/her friends, provide a ranked
list of tweets from these friends such that the tweets that
are potentially retweeted will be ranked higher.




                                                                              10


                    Photo from: http://performancemarketingassociation.com/
RELATED WORK
Generic Popular Tweets Analysis/Prediction
 [Suh et al., SocialCom 2010]

 [Y. Kim and K. Shim, ICDM, 2011]

 [Uysal and W. B. Croft, CIKM 2011]

 [Hong et al., WWW 2011]

Personalized Tweets Prediction
 [Chen et al., SIGIR 2012]

 [Peng et al., ICDM Workshop 2011]



                                             11
RELATED WORK
Generic Popular Tweets Analysis/Prediction
 [Suh et al., SocialCom 2010]

 [Y. Kim and K. Shim, ICDM, 2011]

 [Uysal and W. B. Croft, CIKM 2011]

 [Hong et al., WWW 2011]

Personalized Tweets Analysis/Prediction
 [Chen et al., SIGIR 2012]

 [Peng et al., ICDM Workshop 2011]


       Understanding users’ behaviors & content modeling
                                                           12
OUR METHOD
Design requirements
 Utilize users’ historical behaviors

 Collaborative filtering

 Incorporating a rich-set of features

 Coupled modeling with content

 Learning a correct objective function

 Scalability




                                          13
OUR METHOD
Design requirements
 Utilize users’ historical behaviors

 Collaborative filtering




   Latent factor models



                                        14
OUR METHOD
Design requirements
 Utilize users’ historical behaviors

 Collaborative filtering

 Incorporating a rich-set of features




   Latent factor models
       Factorization Machines [Rendle, ACM TIST 2012]
                                                         15
OUR METHOD
Factorization Machines
 Generic enough
       matrix factorization
       pairwise interaction tensor factorization
       SVD++
       neighborhood models
       …
   Technically mature
       [Rendle, ICDM 2010]
       [Rendle et al., SIGIR 2011]
       [Freudenthaler et al., NIPS Workshop 2011]
       [Rendle et al., WSDM 2012]
                                                     16
       [Rendle, ACM TIST 2012]
OUR METHOD
Extending Factorization Machines
 Non-negative decomposition of term-tweet matrix
       Compatible to standard topic models
   Co-Factorization Machines
       Multiple aspects of the dataset
         Shared feature paradigm
         Shared latent space paradigm

         Regularized latent space paradigm




                                                    17
OUR METHOD
Design requirements
 Utilize users’ historical behaviors

 Collaborative filtering

 Incorporating a rich-set of features

 Coupled modeling with content

 Learning a correct objective function

 Scalability




                                          18
OUR METHOD
Design requirements
 Learning objective functions for different aspects
     User decisions
         Ranking-based loss
          Weighted Approximately Rank Pairwise loss (WARP)
     Content modeling
       Log-Poisson loss
       Logistic loss




                                                             19
OUR METHOD
WARP loss
 Proposed by [Usunier et al., ICML 2009]

 Image retrieval tasks and IR tasks
    [Weston et al., Machine Learning 2010]
    [Weston et al., ICML 2012]
    [Weston et al., UAI 2012]
    [Bordes et al, AISTATS 2012]
   Can mimic many ranking measures
       NDCG, MAP, Precision@k


   Applied to collaborative filtering
                                             20
OUR METHOD
Design requirements
 Utilize users’ historical behaviors

 Collaborative filtering

 Incorporating a rich-set of features

 Coupled modeling with content

 Learning a correct objective function

 Scalability (Stochastic Gradient Descent)




                                              21
EXPERIMENTS
Twitter data
 0.7M target users with 11M tweets

 4.3M neighbor users with 27M tweets

 “Complete” sample for each target user

 Mean Average Precision (MAP) as measure

 Train/test on consecutive time periods




                                            22
EXPERIMENTS
Comparisons
 Matrix factorization (MF)

 Matrix factorization with attributes (MFA)

 CPTR [Chen et al, SIGIR 2012]

 Factorization machines with attributes (FMA)

 CoFM with shared features (CoFM-SF)

 CoFM with shared latent spaces (CoFM-SL)

 CoFM with latent space regularization (CoFM-REG)



                                                     23
EXPERIMENTS
Comparisons
 Matrix factorization (MF)

 Matrix factorization with attributes (MFA)

 CPTR [Chen et al, SIGIR 2012]

 Factorization machines with attributes (FMA)

 CoFM with shared features (CoFM-SF)

 CoFM with shared latent spaces (CoFM-SL)

 CoFM with latent space regularization (CoFM-REG)



                                                     24
EXPERIMENTS




              25
EXPERIMENTS




              26
EXPERIMENTS




              27
EXPERIMENTS
Examples of topics are shown. The terms are top ranked terms in
each topic. The topic names in bold are given by the authors.
Entertainment


album music lady artist video listen itunes apple produced movies #bieber bieber new songs

Finance

percent billion bank financial debt banks euro crisis rates greece bailout spain economy


Politics


party election budget tax president million obama money pay bill federal increase cuts
                                                                                             28
CONCLUSIONS
   Main contributions
     Propose Co-Factorization Machines (CoFM) to handle
      two (multiple) aspects of the dataset.
     Apply FM to text data with constraints to mimic topic
      models
     Introduce WARP loss into collaborative filtering/recsys
      models
     Explore a wide range of features and demonstrate the
      effectiveness of feature sets with significant
      improvement over several non-trival baselines.


                                                                29
THANK YOU.




   Liangjie Hong
   PhD candidate
   WUME Lab
   Lehigh University
   lih307@cse.lehigh.edu   30

More Related Content

Similar to Personalized Retweet Prediction in Twitter

Towards Method Engineering of Model-Driven User Interface Development
Towards Method Engineering ofModel-Driven User Interface Development Towards Method Engineering ofModel-Driven User Interface Development
Towards Method Engineering of Model-Driven User Interface Development Jean Vanderdonckt
 
Automating Software Development Using Artificial Intelligence (AI)
Automating Software Development Using Artificial Intelligence (AI)Automating Software Development Using Artificial Intelligence (AI)
Automating Software Development Using Artificial Intelligence (AI)Jeremy Bradbury
 
A Context-aware Model for the Analysis of User Interaction and QoE in Mobile ...
A Context-aware Model for the Analysis of User Interaction and QoE in Mobile ...A Context-aware Model for the Analysis of User Interaction and QoE in Mobile ...
A Context-aware Model for the Analysis of User Interaction and QoE in Mobile ...Pedro Luis Mateo Navarro
 
REQUIREMENTS VARIABILITY SPECIFICATION FOR DATA INTENSIVE SOFTWARE
REQUIREMENTS VARIABILITY SPECIFICATION FOR DATA INTENSIVE SOFTWARE REQUIREMENTS VARIABILITY SPECIFICATION FOR DATA INTENSIVE SOFTWARE
REQUIREMENTS VARIABILITY SPECIFICATION FOR DATA INTENSIVE SOFTWARE mathsjournal
 
Requirements Variability Specification for Data Intensive Software
Requirements Variability Specification for Data Intensive Software Requirements Variability Specification for Data Intensive Software
Requirements Variability Specification for Data Intensive Software ijseajournal
 
Discreate eventsimulation idef
Discreate eventsimulation idefDiscreate eventsimulation idef
Discreate eventsimulation idefMandar Trivedi
 
Productivity mdd mdb_code_centric
Productivity mdd mdb_code_centricProductivity mdd mdb_code_centric
Productivity mdd mdb_code_centricSantiago Meliá
 
OO Development 1 - Introduction to Object-Oriented Development
OO Development 1 - Introduction to Object-Oriented DevelopmentOO Development 1 - Introduction to Object-Oriented Development
OO Development 1 - Introduction to Object-Oriented DevelopmentRandy Connolly
 
PresentationTest
PresentationTestPresentationTest
PresentationTestbolu804
 
Large Graph Mining
Large Graph MiningLarge Graph Mining
Large Graph MiningSabri Skhiri
 
Fundamentals of Deep Recommender Systems
 Fundamentals of Deep Recommender Systems Fundamentals of Deep Recommender Systems
Fundamentals of Deep Recommender SystemsWQ Fan
 
Enhanced Feature Analysis Framework for Comparative Analysis & Evaluation of ...
Enhanced Feature Analysis Framework for Comparative Analysis & Evaluation of ...Enhanced Feature Analysis Framework for Comparative Analysis & Evaluation of ...
Enhanced Feature Analysis Framework for Comparative Analysis & Evaluation of ...IJCSIS Research Publications
 
Systems variability modeling a textual model mixing class and feature concepts
Systems variability modeling a textual model mixing class and feature conceptsSystems variability modeling a textual model mixing class and feature concepts
Systems variability modeling a textual model mixing class and feature conceptsijcsit
 
A Model To Compare The Degree Of Refactoring Opportunities Of Three Projects ...
A Model To Compare The Degree Of Refactoring Opportunities Of Three Projects ...A Model To Compare The Degree Of Refactoring Opportunities Of Three Projects ...
A Model To Compare The Degree Of Refactoring Opportunities Of Three Projects ...acijjournal
 
A MODEL TO COMPARE THE DEGREE OF REFACTORING OPPORTUNITIES OF THREE PROJECTS ...
A MODEL TO COMPARE THE DEGREE OF REFACTORING OPPORTUNITIES OF THREE PROJECTS ...A MODEL TO COMPARE THE DEGREE OF REFACTORING OPPORTUNITIES OF THREE PROJECTS ...
A MODEL TO COMPARE THE DEGREE OF REFACTORING OPPORTUNITIES OF THREE PROJECTS ...acijjournal
 
Self-adaptation Driven by goals in SysML Models
Self-adaptation Driven by goals in SysML ModelsSelf-adaptation Driven by goals in SysML Models
Self-adaptation Driven by goals in SysML Modelsamalanda1
 
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...Dr. Haxel Consult
 

Similar to Personalized Retweet Prediction in Twitter (20)

Towards Method Engineering of Model-Driven User Interface Development
Towards Method Engineering ofModel-Driven User Interface Development Towards Method Engineering ofModel-Driven User Interface Development
Towards Method Engineering of Model-Driven User Interface Development
 
Automating Software Development Using Artificial Intelligence (AI)
Automating Software Development Using Artificial Intelligence (AI)Automating Software Development Using Artificial Intelligence (AI)
Automating Software Development Using Artificial Intelligence (AI)
 
A Context-aware Model for the Analysis of User Interaction and QoE in Mobile ...
A Context-aware Model for the Analysis of User Interaction and QoE in Mobile ...A Context-aware Model for the Analysis of User Interaction and QoE in Mobile ...
A Context-aware Model for the Analysis of User Interaction and QoE in Mobile ...
 
REQUIREMENTS VARIABILITY SPECIFICATION FOR DATA INTENSIVE SOFTWARE
REQUIREMENTS VARIABILITY SPECIFICATION FOR DATA INTENSIVE SOFTWARE REQUIREMENTS VARIABILITY SPECIFICATION FOR DATA INTENSIVE SOFTWARE
REQUIREMENTS VARIABILITY SPECIFICATION FOR DATA INTENSIVE SOFTWARE
 
Requirements Variability Specification for Data Intensive Software
Requirements Variability Specification for Data Intensive Software Requirements Variability Specification for Data Intensive Software
Requirements Variability Specification for Data Intensive Software
 
Discreate eventsimulation idef
Discreate eventsimulation idefDiscreate eventsimulation idef
Discreate eventsimulation idef
 
Productivity mdd mdb_code_centric
Productivity mdd mdb_code_centricProductivity mdd mdb_code_centric
Productivity mdd mdb_code_centric
 
OO Development 1 - Introduction to Object-Oriented Development
OO Development 1 - Introduction to Object-Oriented DevelopmentOO Development 1 - Introduction to Object-Oriented Development
OO Development 1 - Introduction to Object-Oriented Development
 
Sub1583
Sub1583Sub1583
Sub1583
 
PresentationTest
PresentationTestPresentationTest
PresentationTest
 
Large Graph Mining
Large Graph MiningLarge Graph Mining
Large Graph Mining
 
2008.11560v2.pdf
2008.11560v2.pdf2008.11560v2.pdf
2008.11560v2.pdf
 
Fundamentals of Deep Recommender Systems
 Fundamentals of Deep Recommender Systems Fundamentals of Deep Recommender Systems
Fundamentals of Deep Recommender Systems
 
Introduction to MDE
Introduction to MDEIntroduction to MDE
Introduction to MDE
 
Enhanced Feature Analysis Framework for Comparative Analysis & Evaluation of ...
Enhanced Feature Analysis Framework for Comparative Analysis & Evaluation of ...Enhanced Feature Analysis Framework for Comparative Analysis & Evaluation of ...
Enhanced Feature Analysis Framework for Comparative Analysis & Evaluation of ...
 
Systems variability modeling a textual model mixing class and feature concepts
Systems variability modeling a textual model mixing class and feature conceptsSystems variability modeling a textual model mixing class and feature concepts
Systems variability modeling a textual model mixing class and feature concepts
 
A Model To Compare The Degree Of Refactoring Opportunities Of Three Projects ...
A Model To Compare The Degree Of Refactoring Opportunities Of Three Projects ...A Model To Compare The Degree Of Refactoring Opportunities Of Three Projects ...
A Model To Compare The Degree Of Refactoring Opportunities Of Three Projects ...
 
A MODEL TO COMPARE THE DEGREE OF REFACTORING OPPORTUNITIES OF THREE PROJECTS ...
A MODEL TO COMPARE THE DEGREE OF REFACTORING OPPORTUNITIES OF THREE PROJECTS ...A MODEL TO COMPARE THE DEGREE OF REFACTORING OPPORTUNITIES OF THREE PROJECTS ...
A MODEL TO COMPARE THE DEGREE OF REFACTORING OPPORTUNITIES OF THREE PROJECTS ...
 
Self-adaptation Driven by goals in SysML Models
Self-adaptation Driven by goals in SysML ModelsSelf-adaptation Driven by goals in SysML Models
Self-adaptation Driven by goals in SysML Models
 
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
 

Recently uploaded

Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024SynarionITSolutions
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 

Recently uploaded (20)

Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 

Personalized Retweet Prediction in Twitter

  • 1. PERSONALIZED RETWEET PREDICTION IN TWITTER Liangjie Hong*, Lehigh University Aziz Doumith, Lehigh University 1 Brian D. Davison, Lehigh University
  • 2. OVERVIEW  Motivation  Related Work  Our Method  Experimental Results 2
  • 6. MOTIVATIONS 6 Photo from: http://www.jenful.com/2011/06/google-1-and-the-filter-bubble/
  • 7. MOTIVATIONS 7 Photo from: http://graphlab.org/
  • 8. MOTIVATIONS 8 Photo from: http://kexino.com/
  • 9. MOTIVATIONS 9 Photo from: http://performancemarketingassociation.com/
  • 10. TASKS Given a target user and his/her friends, provide a ranked list of tweets from these friends such that the tweets that are potentially retweeted will be ranked higher. 10 Photo from: http://performancemarketingassociation.com/
  • 11. RELATED WORK Generic Popular Tweets Analysis/Prediction  [Suh et al., SocialCom 2010]  [Y. Kim and K. Shim, ICDM, 2011]  [Uysal and W. B. Croft, CIKM 2011]  [Hong et al., WWW 2011] Personalized Tweets Prediction  [Chen et al., SIGIR 2012]  [Peng et al., ICDM Workshop 2011] 11
  • 12. RELATED WORK Generic Popular Tweets Analysis/Prediction  [Suh et al., SocialCom 2010]  [Y. Kim and K. Shim, ICDM, 2011]  [Uysal and W. B. Croft, CIKM 2011]  [Hong et al., WWW 2011] Personalized Tweets Analysis/Prediction  [Chen et al., SIGIR 2012]  [Peng et al., ICDM Workshop 2011] Understanding users’ behaviors & content modeling 12
  • 13. OUR METHOD Design requirements  Utilize users’ historical behaviors  Collaborative filtering  Incorporating a rich-set of features  Coupled modeling with content  Learning a correct objective function  Scalability 13
  • 14. OUR METHOD Design requirements  Utilize users’ historical behaviors  Collaborative filtering  Latent factor models 14
  • 15. OUR METHOD Design requirements  Utilize users’ historical behaviors  Collaborative filtering  Incorporating a rich-set of features  Latent factor models  Factorization Machines [Rendle, ACM TIST 2012] 15
  • 16. OUR METHOD Factorization Machines  Generic enough  matrix factorization  pairwise interaction tensor factorization  SVD++  neighborhood models  …  Technically mature  [Rendle, ICDM 2010]  [Rendle et al., SIGIR 2011]  [Freudenthaler et al., NIPS Workshop 2011]  [Rendle et al., WSDM 2012] 16  [Rendle, ACM TIST 2012]
  • 17. OUR METHOD Extending Factorization Machines  Non-negative decomposition of term-tweet matrix  Compatible to standard topic models  Co-Factorization Machines  Multiple aspects of the dataset  Shared feature paradigm  Shared latent space paradigm  Regularized latent space paradigm 17
  • 18. OUR METHOD Design requirements  Utilize users’ historical behaviors  Collaborative filtering  Incorporating a rich-set of features  Coupled modeling with content  Learning a correct objective function  Scalability 18
  • 19. OUR METHOD Design requirements  Learning objective functions for different aspects  User decisions  Ranking-based loss Weighted Approximately Rank Pairwise loss (WARP)  Content modeling  Log-Poisson loss  Logistic loss 19
  • 20. OUR METHOD WARP loss  Proposed by [Usunier et al., ICML 2009]  Image retrieval tasks and IR tasks [Weston et al., Machine Learning 2010] [Weston et al., ICML 2012] [Weston et al., UAI 2012] [Bordes et al, AISTATS 2012]  Can mimic many ranking measures  NDCG, MAP, Precision@k  Applied to collaborative filtering 20
  • 21. OUR METHOD Design requirements  Utilize users’ historical behaviors  Collaborative filtering  Incorporating a rich-set of features  Coupled modeling with content  Learning a correct objective function  Scalability (Stochastic Gradient Descent) 21
  • 22. EXPERIMENTS Twitter data  0.7M target users with 11M tweets  4.3M neighbor users with 27M tweets  “Complete” sample for each target user  Mean Average Precision (MAP) as measure  Train/test on consecutive time periods 22
  • 23. EXPERIMENTS Comparisons  Matrix factorization (MF)  Matrix factorization with attributes (MFA)  CPTR [Chen et al, SIGIR 2012]  Factorization machines with attributes (FMA)  CoFM with shared features (CoFM-SF)  CoFM with shared latent spaces (CoFM-SL)  CoFM with latent space regularization (CoFM-REG) 23
  • 24. EXPERIMENTS Comparisons  Matrix factorization (MF)  Matrix factorization with attributes (MFA)  CPTR [Chen et al, SIGIR 2012]  Factorization machines with attributes (FMA)  CoFM with shared features (CoFM-SF)  CoFM with shared latent spaces (CoFM-SL)  CoFM with latent space regularization (CoFM-REG) 24
  • 28. EXPERIMENTS Examples of topics are shown. The terms are top ranked terms in each topic. The topic names in bold are given by the authors. Entertainment album music lady artist video listen itunes apple produced movies #bieber bieber new songs Finance percent billion bank financial debt banks euro crisis rates greece bailout spain economy Politics party election budget tax president million obama money pay bill federal increase cuts 28
  • 29. CONCLUSIONS  Main contributions  Propose Co-Factorization Machines (CoFM) to handle two (multiple) aspects of the dataset.  Apply FM to text data with constraints to mimic topic models  Introduce WARP loss into collaborative filtering/recsys models  Explore a wide range of features and demonstrate the effectiveness of feature sets with significant improvement over several non-trival baselines. 29
  • 30. THANK YOU. Liangjie Hong PhD candidate WUME Lab Lehigh University lih307@cse.lehigh.edu 30