SlideShare a Scribd company logo
Who Will Follow You
                    Back? Reciprocal
                    Relationship
1
1
                2
                    Prediction*  3
John Hopcroft, Tiancheng Lou, Jie Tang
Department of Computer Science, Cornell University,
2Institutefor Interdisciplinary Information Sciences, Tsinghua
University
3Department of Computer Science, Tsinghua University
reciprocal
                                                       parasocial

Motivation                                                               v2
                                                                                          v3
                                                             v1




   Two kinds of relationships in social network,                              v4

                                                                   v5
      one-way(called parasocial) relationship and,
      two-way(called reciprocal) relationship
                                                                                    v6
   Two-way(reciprocal) relationship
      usually developed from a one-way relationship
      more trustful.                                    after 3 days               prediction
   Try to understand(predict) the formation of two-
    way relationships
      micro-level dynamics of the social network.
                                                                          v2
                                                                                           v3
      underlying community structure?                        v1


      how users influence each other?
                                                                                v4

                                                                    v5




                                                                                     v6
Example : real friend relationship

   On Twitter : Who Will Follow You Back?

                                     30%
                                       ?

                100%
                  ?
                                      60%
                                        ?          JimmyQiao

   Ladygaga           1%
                       ?
                           Shiteng




              Obama                        Huwei
Several key challenges

   How to model the formation of
    two-way relationships?                                        y2= ?
                                                                  y2= ?
                                                                y2
                                                                y2
        SVM & LRC                                                                  y4 = 0
                                                                                     4


   How to combine many social                                                       y4
                                                                                      4


    theories into the prediction              y1= 1 y1
                                              y1= 1 y1

    model?                                                                y3
                                                                          y3
                                                                          y3 = ?
                                                                          y3 = ?

                          v2
                                         v3
               v1




                               v4

                     v5
                                                              (v1,, v2)
                                                              (v1 v2)
                                                                                   (v3, v4)
                                                                                     3   4

                                    v6            (v1,, v5)
                                                  (v1 v5)

                                                                     (v2,, v4)
                                                                     (v2 v4)
Outline

   Previous works
   Our approach
   Experimental results
   Conclusion & future works
Link prediction

   Unsupervised link prediction
        Scores & intution, such as preferential attachment [N01].
   Supervised link prediction
        supervised random walks [BL11].
        logistic regression model to predict positive and negative links [L10].
   Main differences:
        We predict a directed link instead of only handles undirected social networks.
        Our model is dynamic and learned from the evolution of the Twitter network.
Social behavior analysis

   Existing works on social behavior analysis:
      The difference of the social influence on difference topics and to model the topic-
       level social influence in social networks. [T09]
      How social actions evolve in a dynamic social network? [T10]
   Main differences:
        The proposed methods in previous work can be used here
        but the problem is fundamentally different.
Twitter study

   The twitter network.
        The topological and geographical properties. [J07]
        Twittersphere and some notable properties, such as a non-power-law follower
         distribution, and low reciprocity. [K10]
   The twitter users.
        Influential users.
        Tweeting behaviors of users.
   The tweets.
        Utilize the real-time nature to detect a target event. [S10]
        TwitterMonitor, to detect emerging topics. [M10]
Outline

   Previous works
   Our approach
   Experimental results
   Conclusion & future works
Factor graph model

   Problem definition
        Given a network at time t, i.e., Gt = (Vt, Et, Xt, Yt)
        Variables y are partially labeled.
        Goal : infer unknown variables.
   Factor graph model
        P(Y | X, G) = P(X, G|Y) P(Y) / P(X, G) = C0 P(X | Y) P(Y | G)
        In P(X | Y), assuming that the generative probability is conditionally independent,
        P(Y | X, G) = C0P(Y | G)ΠP(xi|yi)
        How to deal with P(Y | G) ?
Random Field

                              Given an undirected graph G=(V,
     Ya                  Yb      E) such that Y={Yv|v∈V}, if

                              the probability of Yv given X and
          Yc                     those random variables
                                 corresponding to nodes
                                 neighboring v in G. Then (X, Y)
                    Ye           is a conditional random field.
    Yd
                              undirected graphical model
               Yf                globally conditioned on X
Conditional random field(CRF)

   Model them in a Markov random field, by the Hammersley-Clifford theorem,
      P(xi|yi) = 1/Z1 * exp {Σαj fj (xij, yi)}
      P(Y|G) = 1/Z2 * exp {ΣcΣkμkhk(Yc)}
      Z1 and Z2 are normalization factors.
   So, the likelihood probability is :
      P(Y | X, G) = 1/Z0 * exp {Σαj fj (xij, yi) + ΣcΣkμkhk(Yc)}
      Z0 is normalization factors.
Maximize likelihood

      Objective function
         O(θ) = log Pθ(Y | X, G) = ΣiΣjαj fj (xij, yi) + ΣΣμk hk(Yc) – log Z

      Learning the model to
         estimate a parameter configuration θ= {α, μ} to maximize the objective
          function :
         that is, the goal is to compute θ* = argmax O(θ)
Learning algorithm

           θ*
    Goal : = argmax O(θ)                     L                                    (k )        (k)       ( Z ( x ( k ) ))'
                                                                     Fj ( y              ,x         )
                                                 j          k                                              Z ( x(k ) )
   Newton methold.
   The gradient of each μk with           Z ( x(k ) )                   exp             F ( y, x ( k ) )
                                                                     y
    regard to the objective
                                                                                  exp(              F ( y, x ( k ) )) F j ( y, x ( k ) )
    function.                              (Z ( x    (k )
                                                            ))   '
                                                                          y
        dθ/ dμk= E[hk(Yc)] – EPμ (Yc|X,
                                   k         Z (x    (k )
                                                            )                                       exp       F ( y, x ( k ) )
         G)[hk(Yc)]
                                                                                               y


                                                                                         exp(           F ( y, x ( k ) )
                                                                                                                     (k )
                                                                                                                                ) F j ( y, x ( k ) )
                                                                              y              exp          F ( y, x          )
                                                                                         y

                                                                                     p ( y | x ( k ) ) F j ( y, x ( k ) )
                                                                              y

                                                                          E p (Y | x( k ) ) F j (Y , x ( k ) )
Learning algorithm

   One challenge : how to calculate the marginal distribution Pμ (yc|x, G).
                                                                      k


      Intractable to directly calculate the marginal distribution.
      Use approximation algorithm : Belief propagation.
Belief propagation

   Sum-product
     The algorithm works by passing real values functions called “messages” along
      the edges between the nodes.
     Extend edges to triangles
Belief propagation

   A “message” from a variable node v to a factor node u is
    μv->u(xv) = Πu*∈N(v){u}μu*->v(xv)
   A “message” from a factor node u to a variable node v is
    μu->v(xu) = Σx’ux’v=xvΠu*∈N(u){v}μv*->u(x’u)
Learning algorithm(TriFG model)

Input : network Gt, learning rateη
Output : estimated parametersθ

Initalize θ= 0;
Repeat
     Perform LBP to calculate marginal distribution of unknown variables P(yi|xi, G);
     Perform LBP to calculate marginal distribution of triad c, i.e. P(yc|Xc, G);
     Calculate the gradient of μk according to :
          dθ/ dμk= E[hk(Yc)] – EPμ (Yc|X, G)[hk(Yc)]
                                    k


     Update parameter θ with the learning rate η:
          θ new = θold + ηd θ
Until Convergence;
Prediction features

   Geographic distance
        Global vs Local
   Homophily
        Link homophily
        Status homophily
   Implicit structure
        Retweet or reply
                                      (A) and (B) are balanced, but (C) and (D) are not.
        Retweeting seems to be     Users who share                     Elite users have a
         more helpful                             Global
                                    common links will                      Local
                                                                        much stronger
   Structural balance              have a tendency to                  tendency to
      Two-way relationships are    follow each other.                  follow each other
       balanced (88%),
      But, one-way relationships
       are not (only 29%).
Our approach : TriFG

   TriFG model
     Features based on observations
     Partially labeled
     Conditional random field
     Triad correlation factors
Outline

   Previous works
   Our approach
   Experimental results
   Conclusion & future works
Data collection

   Huge sub-network of twitter
      13,442,659 users and 56,893,234 following links.
      Extracted 35,746,366 tweets.
   Dynamic networks
      With an average of 728,509 new links per day.
      Averagely 3,337 new follow-back links per day.
      13 time stamps by viewing every four days as a time stamp
Prediction performance

   Baseline algorithms
         SVM & LRC & CRF
   Accurately infer 90% of reciprocal relationships in twitter.
         Data     Algotithm     Precision   Recall   F1Measure   Accuracy
                    SVM          0.6908     0.6129    0.6495      0.9590
      Test          LRC          0.6957     0.2581    0.3765      0.9510
      Case
                    CRF          1.0000     0.6290    0.7723      0.9770
       1
                   TriFG         1.0000     0.8548    0.9217      0.9910
                    SVM          0.7323     0.6212    0.6722      0.9534
      Test          LRC          0.8333     0.3030    0.4444      0.9417
      Case
                    CRF          1.0000     0.6333    0.7755      0.9717
       2
                   TriFG         1.0000     0.8788    0.9355      0.9907
Convergence property

   BLP-based learning algorithm can converge in less than 10 iterations.
   After only 7 learning iterations, the prediction performance of TriFG
    becomes stable.
Effect of Time Span

   Distribution of time span in which follow back action is performed.
      60% of follow-backs are performed in the next time stamp,
      37% of follow-backs would be still performed in the following 3 time stamps.
   How different settings of the time span.
      The prediction performance of TriFG drops sharply when setting the time span
       as two or less.
      The performance is acceptable, when setting it as three time stamps.
Factor contribution analysis

   We consider five different factor functions
        Geographic distance(G)
        Link homophily(L)
        Status homophily(S)
        Implicit network correlation(I)
        Structural balance correlation(B)
Outline

   Previous works
   Our approach
   Experimental results
   Conclusion & future works
Conclusion

   Reciprocal relationship prediction in social network
   Incorporates social theories into prediction model.
   Several interesting phenomena.
      Elite users tend to follow each other.
      Two-way relationships on Twitter are balanced, but one-way relationships are
       not.
      Social networks are going global, but also stay local.
Future works

   Other social theories for reciprocal relationship prediction.
   User feedback.
   Incorporating user interactions.
   Building a theory for different kinds of networks.
   Thanks!
   Q&A
Reference
    [BL11] L.Backstrom and J.Leskovec. Supervised random walks :
     predicting and recommending links in social networks. In WSDM’11
    [C10] D.J.Crandall, L.Backstrom, D. Cosley, S.Suri, D.Huttenlocher, and J.
     Kleinberg. Inferring social ties from geographic coincidences. PNAS, Dec.
     2010
    [W10] C.Wang, J. Han, Y.Jia, J.Tang, D.Zhang, Y. Yu and J.Guo. Mining
     advisor-advisee relationships from research publication networks. In
     KDD’10.
    [N01]M.E.J. Newman. Clustering and preferential attachment in growing
     networks. Phys. Rev. E, 2001
    [L10] J.Leskovec, D.Huttenlocher, and J.Kleinberg. Predicting positive and
     negative links in online social networks. In WWW10.
    [T10] C.Tan, J. Tang, J. Sun, Q.Lin, and F.Wang. Social action tracking
     via noise tolerant time-varying factor graphs. In KDD10
    [T09] J. Tang, J. Sun, C. Wang, and Z. Yang. Social influence analysis in
     large-scale networks. In KDD09.
Reference
    [J07]A. Java, X.Song, T.Finin, and B.L. Tseng. Why we twitter : An
     analysis of a microblogging community. In KDD2007.
    [K10]H. Kwak, C.Lee, H.Park, and S.B. Moon. What is twitter, a social
     network or a news media? In WWW2010.
    [M10]M.Mathioudakis and N.Koudas. Twittermonitor : trend detection over
     the twitter stream. In SIGMOD10.
    [S10]T. Sakaki, M. Okazaki, and Y.Matsuo. Earthquake shakes twitter
     users: real-time event detection by social sensors. In WWW10.

More Related Content

Recently uploaded

GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 

Recently uploaded (20)

GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 

Featured

Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
Skeleton Technologies
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
Neil Kimberley
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
Kurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
SpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Lily Ray
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
Rajiv Jayarajah, MAppComm, ACC
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
Christy Abraham Joy
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
Vit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
MindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
RachelPearson36
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Applitools
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
GetSmarter
 
More than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike RoutesMore than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike Routes
Project for Public Spaces & National Center for Biking and Walking
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
DevGAMM Conference
 

Featured (20)

Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
 
ChatGPT webinar slides
ChatGPT webinar slidesChatGPT webinar slides
ChatGPT webinar slides
 
More than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike RoutesMore than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike Routes
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
 

Who will-follow-you-back-reciprocal-relationship-prediction

  • 1. Who Will Follow You Back? Reciprocal Relationship 1 1 2 Prediction* 3 John Hopcroft, Tiancheng Lou, Jie Tang Department of Computer Science, Cornell University, 2Institutefor Interdisciplinary Information Sciences, Tsinghua University 3Department of Computer Science, Tsinghua University
  • 2. reciprocal parasocial Motivation v2 v3 v1  Two kinds of relationships in social network, v4 v5  one-way(called parasocial) relationship and,  two-way(called reciprocal) relationship v6  Two-way(reciprocal) relationship  usually developed from a one-way relationship  more trustful. after 3 days prediction  Try to understand(predict) the formation of two- way relationships  micro-level dynamics of the social network. v2 v3  underlying community structure? v1  how users influence each other? v4 v5 v6
  • 3. Example : real friend relationship On Twitter : Who Will Follow You Back? 30% ? 100% ? 60% ? JimmyQiao Ladygaga 1% ? Shiteng Obama Huwei
  • 4. Several key challenges  How to model the formation of two-way relationships? y2= ? y2= ? y2 y2  SVM & LRC y4 = 0 4  How to combine many social y4 4 theories into the prediction y1= 1 y1 y1= 1 y1 model? y3 y3 y3 = ? y3 = ? v2 v3 v1 v4 v5 (v1,, v2) (v1 v2) (v3, v4) 3 4 v6 (v1,, v5) (v1 v5) (v2,, v4) (v2 v4)
  • 5. Outline  Previous works  Our approach  Experimental results  Conclusion & future works
  • 6. Link prediction  Unsupervised link prediction  Scores & intution, such as preferential attachment [N01].  Supervised link prediction  supervised random walks [BL11].  logistic regression model to predict positive and negative links [L10].  Main differences:  We predict a directed link instead of only handles undirected social networks.  Our model is dynamic and learned from the evolution of the Twitter network.
  • 7. Social behavior analysis  Existing works on social behavior analysis:  The difference of the social influence on difference topics and to model the topic- level social influence in social networks. [T09]  How social actions evolve in a dynamic social network? [T10]  Main differences:  The proposed methods in previous work can be used here  but the problem is fundamentally different.
  • 8. Twitter study  The twitter network.  The topological and geographical properties. [J07]  Twittersphere and some notable properties, such as a non-power-law follower distribution, and low reciprocity. [K10]  The twitter users.  Influential users.  Tweeting behaviors of users.  The tweets.  Utilize the real-time nature to detect a target event. [S10]  TwitterMonitor, to detect emerging topics. [M10]
  • 9. Outline  Previous works  Our approach  Experimental results  Conclusion & future works
  • 10. Factor graph model  Problem definition  Given a network at time t, i.e., Gt = (Vt, Et, Xt, Yt)  Variables y are partially labeled.  Goal : infer unknown variables.  Factor graph model  P(Y | X, G) = P(X, G|Y) P(Y) / P(X, G) = C0 P(X | Y) P(Y | G)  In P(X | Y), assuming that the generative probability is conditionally independent,  P(Y | X, G) = C0P(Y | G)ΠP(xi|yi)  How to deal with P(Y | G) ?
  • 11. Random Field Given an undirected graph G=(V, Ya Yb E) such that Y={Yv|v∈V}, if the probability of Yv given X and Yc those random variables corresponding to nodes neighboring v in G. Then (X, Y) Ye is a conditional random field. Yd undirected graphical model Yf globally conditioned on X
  • 12. Conditional random field(CRF)  Model them in a Markov random field, by the Hammersley-Clifford theorem,  P(xi|yi) = 1/Z1 * exp {Σαj fj (xij, yi)}  P(Y|G) = 1/Z2 * exp {ΣcΣkμkhk(Yc)}  Z1 and Z2 are normalization factors.  So, the likelihood probability is :  P(Y | X, G) = 1/Z0 * exp {Σαj fj (xij, yi) + ΣcΣkμkhk(Yc)}  Z0 is normalization factors.
  • 13. Maximize likelihood  Objective function  O(θ) = log Pθ(Y | X, G) = ΣiΣjαj fj (xij, yi) + ΣΣμk hk(Yc) – log Z  Learning the model to  estimate a parameter configuration θ= {α, μ} to maximize the objective function :  that is, the goal is to compute θ* = argmax O(θ)
  • 14. Learning algorithm  θ* Goal : = argmax O(θ) L (k ) (k) ( Z ( x ( k ) ))' Fj ( y ,x ) j k Z ( x(k ) )  Newton methold.  The gradient of each μk with Z ( x(k ) ) exp F ( y, x ( k ) ) y regard to the objective exp( F ( y, x ( k ) )) F j ( y, x ( k ) ) function. (Z ( x (k ) )) ' y  dθ/ dμk= E[hk(Yc)] – EPμ (Yc|X, k Z (x (k ) ) exp F ( y, x ( k ) ) G)[hk(Yc)] y exp( F ( y, x ( k ) ) (k ) ) F j ( y, x ( k ) ) y exp F ( y, x ) y p ( y | x ( k ) ) F j ( y, x ( k ) ) y E p (Y | x( k ) ) F j (Y , x ( k ) )
  • 15. Learning algorithm  One challenge : how to calculate the marginal distribution Pμ (yc|x, G). k  Intractable to directly calculate the marginal distribution.  Use approximation algorithm : Belief propagation.
  • 16. Belief propagation  Sum-product  The algorithm works by passing real values functions called “messages” along the edges between the nodes.  Extend edges to triangles
  • 17. Belief propagation  A “message” from a variable node v to a factor node u is μv->u(xv) = Πu*∈N(v){u}μu*->v(xv)  A “message” from a factor node u to a variable node v is μu->v(xu) = Σx’ux’v=xvΠu*∈N(u){v}μv*->u(x’u)
  • 18. Learning algorithm(TriFG model) Input : network Gt, learning rateη Output : estimated parametersθ Initalize θ= 0; Repeat Perform LBP to calculate marginal distribution of unknown variables P(yi|xi, G); Perform LBP to calculate marginal distribution of triad c, i.e. P(yc|Xc, G); Calculate the gradient of μk according to : dθ/ dμk= E[hk(Yc)] – EPμ (Yc|X, G)[hk(Yc)] k Update parameter θ with the learning rate η: θ new = θold + ηd θ Until Convergence;
  • 19. Prediction features  Geographic distance  Global vs Local  Homophily  Link homophily  Status homophily  Implicit structure  Retweet or reply (A) and (B) are balanced, but (C) and (D) are not.  Retweeting seems to be Users who share Elite users have a more helpful Global common links will Local much stronger  Structural balance have a tendency to tendency to  Two-way relationships are follow each other. follow each other balanced (88%),  But, one-way relationships are not (only 29%).
  • 20. Our approach : TriFG  TriFG model  Features based on observations  Partially labeled  Conditional random field  Triad correlation factors
  • 21. Outline  Previous works  Our approach  Experimental results  Conclusion & future works
  • 22. Data collection  Huge sub-network of twitter  13,442,659 users and 56,893,234 following links.  Extracted 35,746,366 tweets.  Dynamic networks  With an average of 728,509 new links per day.  Averagely 3,337 new follow-back links per day.  13 time stamps by viewing every four days as a time stamp
  • 23. Prediction performance  Baseline algorithms  SVM & LRC & CRF  Accurately infer 90% of reciprocal relationships in twitter. Data Algotithm Precision Recall F1Measure Accuracy SVM 0.6908 0.6129 0.6495 0.9590 Test LRC 0.6957 0.2581 0.3765 0.9510 Case CRF 1.0000 0.6290 0.7723 0.9770 1 TriFG 1.0000 0.8548 0.9217 0.9910 SVM 0.7323 0.6212 0.6722 0.9534 Test LRC 0.8333 0.3030 0.4444 0.9417 Case CRF 1.0000 0.6333 0.7755 0.9717 2 TriFG 1.0000 0.8788 0.9355 0.9907
  • 24. Convergence property  BLP-based learning algorithm can converge in less than 10 iterations.  After only 7 learning iterations, the prediction performance of TriFG becomes stable.
  • 25. Effect of Time Span  Distribution of time span in which follow back action is performed.  60% of follow-backs are performed in the next time stamp,  37% of follow-backs would be still performed in the following 3 time stamps.  How different settings of the time span.  The prediction performance of TriFG drops sharply when setting the time span as two or less.  The performance is acceptable, when setting it as three time stamps.
  • 26. Factor contribution analysis  We consider five different factor functions  Geographic distance(G)  Link homophily(L)  Status homophily(S)  Implicit network correlation(I)  Structural balance correlation(B)
  • 27. Outline  Previous works  Our approach  Experimental results  Conclusion & future works
  • 28. Conclusion  Reciprocal relationship prediction in social network  Incorporates social theories into prediction model.  Several interesting phenomena.  Elite users tend to follow each other.  Two-way relationships on Twitter are balanced, but one-way relationships are not.  Social networks are going global, but also stay local.
  • 29. Future works  Other social theories for reciprocal relationship prediction.  User feedback.  Incorporating user interactions.  Building a theory for different kinds of networks.
  • 30. Thanks!  Q&A
  • 31. Reference  [BL11] L.Backstrom and J.Leskovec. Supervised random walks : predicting and recommending links in social networks. In WSDM’11  [C10] D.J.Crandall, L.Backstrom, D. Cosley, S.Suri, D.Huttenlocher, and J. Kleinberg. Inferring social ties from geographic coincidences. PNAS, Dec. 2010  [W10] C.Wang, J. Han, Y.Jia, J.Tang, D.Zhang, Y. Yu and J.Guo. Mining advisor-advisee relationships from research publication networks. In KDD’10.  [N01]M.E.J. Newman. Clustering and preferential attachment in growing networks. Phys. Rev. E, 2001  [L10] J.Leskovec, D.Huttenlocher, and J.Kleinberg. Predicting positive and negative links in online social networks. In WWW10.  [T10] C.Tan, J. Tang, J. Sun, Q.Lin, and F.Wang. Social action tracking via noise tolerant time-varying factor graphs. In KDD10  [T09] J. Tang, J. Sun, C. Wang, and Z. Yang. Social influence analysis in large-scale networks. In KDD09.
  • 32. Reference  [J07]A. Java, X.Song, T.Finin, and B.L. Tseng. Why we twitter : An analysis of a microblogging community. In KDD2007.  [K10]H. Kwak, C.Lee, H.Park, and S.B. Moon. What is twitter, a social network or a news media? In WWW2010.  [M10]M.Mathioudakis and N.Koudas. Twittermonitor : trend detection over the twitter stream. In SIGMOD10.  [S10]T. Sakaki, M. Okazaki, and Y.Matsuo. Earthquake shakes twitter users: real-time event detection by social sensors. In WWW10.