Scalable Influence Maximization for
    Prevalent Viral Marketing in Large-Scale
                Social Networks

                                  Wei Chen
                            Microsoft Research Asia

                                   In collaboration with
                            Chi Wang     University of Illinois at Urbana-Champaign
                            Yajun Wang   Microsoft Research Asia


KDD'10, July 27, 2010
Outline
     Background and problem definition
     Maximum Influence Arborescence (MIA) heuristic
     Experimental evaluations
     Related work and future directions




Ubiquitous Social Networks




A Hypothetical Example of Viral Marketing
     [Figure: the message "Avatar is great" passing from person to person across a social network]
Effectiveness of Viral Marketing
     Level of trust in different types of ads*
     [Figure: chart of trust levels, with one entry annotated "very effective"]

     *Source: Forrester Research and Intelliseek


The Problem of Influence Maximization
     Social influence graph
       vertices are individuals
       links are social relationships
       number p(u,v) on a directed link from u to v is the probability
       that v is activated by u after u is activated
     Independent cascade model
       initially some seed nodes are activated
       at each step, each newly activated node u gets one chance to
       activate each inactive neighbor v, succeeding with probability p(u,v)
       influence spread: expected number of nodes activated
     Influence maximization:
       find k seeds that generate the largest influence spread
     [Figure: example influence graph with edge probabilities such as 0.3 and 0.1]
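
The independent cascade model described on this slide can be sketched as a small Monte Carlo simulation. This is only an illustration: the function names, the dict-of-dicts graph layout, and the toy graph are assumptions made for the example and do not come from the talk.

```python
import random

def simulate_ic(graph, seeds, rng):
    """Run one independent-cascade diffusion; return the set of activated nodes.

    graph maps each node u to a dict {v: p(u, v)} of its out-neighbors.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, {}).items():
                # A newly activated u gets a single chance to activate v.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active

def estimate_spread(graph, seeds, runs=10_000, seed=0):
    """Estimate influence spread as the average number of activated nodes."""
    rng = random.Random(seed)
    total = sum(len(simulate_ic(graph, seeds, rng)) for _ in range(runs))
    return total / runs

# Hypothetical toy graph (not the one pictured on the slide).
toy = {"a": {"b": 0.3, "c": 0.1}, "b": {"c": 0.5}, "c": {}}
print(estimate_spread(toy, ["a"]))
```

For this toy graph the exact spread of seed a is 1 + 0.3 + (1 - 0.9 * 0.85) = 1.535, so the printed estimate should land close to that value.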


Research Background
     Influence maximization as a discrete optimization problem,
     proposed by Kempe, Kleinberg, and Tardos in KDD 2003
       Finding the optimal solution is provably hard (NP-hard)
       Greedy approximation algorithm: guarantees at least (1 - 1/e), about
       63%, of the optimal influence spread
          Repeat k rounds: in the i-th round, select a node v that provides the
          largest marginal increase in influence spread
          Requires evaluating the influence spread of a given seed set, which
          is hard and slow
     Several subsequent studies improved the running time
     Serious drawback:
       still very slow and not scalable: > 3 hrs on a 30k node graph for
       50 seeds
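
The greedy procedure above can be sketched as follows, with the influence spread of each candidate seed set estimated by Monte Carlo simulation. This is a minimal illustration rather than the implementation evaluated in the talk; `ic_spread`, `greedy_seeds`, the `runs` parameter, and the example graph are all assumptions of this sketch.

```python
import random

def ic_spread(graph, seeds, runs, rng):
    """Monte Carlo estimate of influence spread under independent cascade."""
    total = 0
    for _ in range(runs):
        active, frontier = set(seeds), list(seeds)
        while frontier:
            # Processing order does not change which nodes end up active,
            # because each edge is still tried at most once.
            u = frontier.pop()
            for v, p in graph.get(u, {}).items():
                if v not in active and rng.random() < p:
                    active.add(v)
                    frontier.append(v)
        total += len(active)
    return total / runs

def greedy_seeds(graph, k, runs=1000, seed=0):
    """Pick k seeds, each round adding the node with the best estimated spread."""
    rng = random.Random(seed)
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    seeds = []
    for _ in range(k):
        best_node, best_spread = None, -1.0
        for v in sorted(nodes - set(seeds)):
            spread = ic_spread(graph, seeds + [v], runs, rng)
            if spread > best_spread:
                best_node, best_spread = v, spread
        if best_node is None:      # fewer than k nodes in the graph
            break
        seeds.append(best_node)
    return seeds

# Hypothetical example: two disjoint stars whose edges always fire.
stars = {"h1": {"a": 1.0, "b": 1.0, "c": 1.0}, "h2": {"d": 1.0, "e": 1.0}}
print(greedy_seeds(stars, k=2, runs=5))   # -> ['h1', 'h2']
```

Each round runs `runs` simulations for every remaining node, which is the cost the slide calls hard and slow: the work grows with k, the number of nodes, and the number of simulations.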
Our Work
     Design new heuristics
       MIA (maximum influence arborescence) heuristic
          for general independent cascade model
          10^3 speedup: from hours to seconds (or days to minutes)
          influence spread close to that of the greedy algorithm of [KKT’03]
     We also show that computing the exact influence spread of a given
     seed set is #P-hard (counting hardness)
       resolves an open problem in [KKT’03]
       indicates the intrinsic difficulty of computing influence spread




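Maximum Influence Arborescence (MIA) Heuristic I: Maximum Influence Paths (MIPs)
     For any pair of nodes u and v, find the maximum influence path (MIP)
     from u to v, i.e. the path with the largest propagation probability
     ignore MIPs with too small probabilities ( < parameter θ )
     [Figure: example graph with nodes u and v and edge probabilities 0.3 and 0.1]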
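
To make the idea concrete, the sketch below scores a path by the product of its edge probabilities and discards the best candidate if it falls below the threshold. It assumes the candidate paths are listed explicitly and uses a made-up three-node graph; the names `path_probability`, `best_path`, and `theta` are illustrative only.

```python
from math import prod

def path_probability(graph, path):
    """Propagation probability of a path: the product of its edge probabilities."""
    return prod(graph[u][v] for u, v in zip(path, path[1:]))

def best_path(graph, paths, theta):
    """Return the most probable candidate path, or None if it is below theta."""
    best = max(paths, key=lambda path: path_probability(graph, path))
    return best if path_probability(graph, best) >= theta else None

# Hypothetical graph: a direct edge u -> v and a detour through a.
toy = {"u": {"a": 0.3, "v": 0.1}, "a": {"v": 0.5}}
candidates = [["u", "v"], ["u", "a", "v"]]

print(best_path(toy, candidates, theta=0.05))  # the detour, probability 0.15
print(best_path(toy, candidates, theta=0.2))   # None: even the best path is too weak
```

Here the two-hop path wins (0.3 * 0.5 = 0.15 against 0.1 for the direct edge), which shows that the maximum influence path is not necessarily the one with the fewest hops.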




MIA Heuristic II: Maximum Influence In- (Out-) Arborescences
     Local influence regions
       for every node v, all MIPs to v form its maximum influence
       in-arborescence (MIIA)
       for every node u, all MIPs from u form its maximum influence
       out-arborescence (MIOA)
       These MIIAs and MIOAs can be computed efficiently using
       Dijkstra's shortest-path algorithm
     [Figure: example graph with nodes u and v and edge probabilities 0.3 and 0.1]
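
One way to apply Dijkstra here is to give each edge the length -log p(u,v), so that the shortest path is the most probable one, and to search backward from the root v. The sketch below builds an in-arborescence this way. It is an illustration under assumptions: the function name `miia`, the threshold argument `theta`, the returned layout, and the toy graph are not taken from the slides.

```python
import heapq
from math import exp, log

def miia(graph, v, theta):
    """Return {u: (pp, next_hop)} for every u whose MIP to v has pp >= theta.

    graph maps u -> {w: p(u, w)}; next_hop is u's successor on its MIP
    toward v (None for the root v itself). Requires 0 < theta <= 1.
    """
    # Reverse adjacency so the search can grow backward from the root.
    reverse = {}
    for u, nbrs in graph.items():
        for w, p in nbrs.items():
            reverse.setdefault(w, {})[u] = p

    max_dist = -log(theta)   # pp >= theta  <=>  path length <= -log(theta)
    dist, next_hop, done = {v: 0.0}, {v: None}, set()
    heap = [(0.0, v)]
    while heap:
        d, w = heapq.heappop(heap)
        if w in done:
            continue
        done.add(w)
        for u, p in reverse.get(w, {}).items():
            if p <= 0.0:
                continue
            nd = d - log(p)
            if nd <= max_dist and nd < dist.get(u, float("inf")):
                dist[u], next_hop[u] = nd, w
                heapq.heappush(heap, (nd, u))
    return {u: (exp(-dist[u]), next_hop[u]) for u in done}

# Hypothetical graph: a direct edge u -> v and a detour through a.
toy = {"u": {"a": 0.3, "v": 0.1}, "a": {"v": 0.5}}
print(miia(toy, "v", theta=0.05))
```

Pruning at the threshold is safe because every prefix or suffix of a path is at least as probable as the whole path, so a node cut off by `max_dist` cannot lie on a qualifying path. An out-arborescence can be obtained the same way by searching forward from u instead of backward from v.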

MIA Heuristic III: Computing Influence through the MIA Structure
     Recursive computation of activation probability ap(u) of a
     node u in its in-arborescence, given a seed set S




     Can be used in the greedy algorithm for selecting k seeds,
     but not efficient enough
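
The formula shown on this slide did not survive extraction. A recursion of the following shape is consistent with the description: a seed has activation probability 1, and any other node is activated unless every one of its in-neighbors in the arborescence fails to activate it. The sketch assumes exactly that; the names `in_nbrs` and `prob` and the example tree are hypothetical.

```python
def activation_probability(u, seeds, in_nbrs, prob):
    """ap(u) inside an in-arborescence, given the seed set.

    in_nbrs[u] lists the in-neighbors of u within the arborescence and
    prob[(w, u)] is the edge probability p(w, u).
    """
    if u in seeds:
        return 1.0
    not_activated = 1.0
    for w in in_nbrs.get(u, []):
        ap_w = activation_probability(w, seeds, in_nbrs, prob)
        # w activates u only if w is itself active and the edge fires.
        not_activated *= 1.0 - ap_w * prob[(w, u)]
    return 1.0 - not_activated

# Hypothetical arborescence rooted at v:  u -> a -> v  and  b -> v.
in_nbrs = {"v": ["a", "b"], "a": ["u"]}
prob = {("a", "v"): 0.5, ("b", "v"): 0.2, ("u", "a"): 0.3}

print(activation_probability("v", {"u"}, in_nbrs, prob))       # about 0.15
print(activation_probability("v", {"u", "b"}, in_nbrs, prob))  # about 0.32
```

The product is exact on a tree because the subtrees below different in-neighbors share no nodes, so their activations are independent. Plugging this into the greedy loop means recomputing it for every candidate seed, which is why the slide calls it not efficient enough.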



MIA Heuristic IV: Efficient Updates on Activation Probabilities
     If v is the root of a MIIA, and u is a node in the MIIA, then
     their activation probabilities have a linear relationship:
       $ap(v) = \alpha(v,u) \cdot ap(u) + \beta(v,u)$
     All $\alpha(v,u)$'s in a MIIA can be recursively computed
       time reduced from quadratic to linear time
     If u is selected as a seed, its marginal influence increase to v
     is $\alpha(v,u) \cdot (1 - ap(u))$
     Summing up the above marginal influence over all nodes v,
     we obtain the marginal influence of u
     Select the u with the largest marginal influence
     Update the linear coefficients for all w’s that are in the same MIIAs as u
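
The linear relationship can be checked numerically on a small tree. The sketch below assumes the coefficient of ap(u) in ap(v), called alpha here, is obtained top-down: it is 1 at the root, 0 below a seed, and otherwise the parent's coefficient times the edge probability times the chance that the parent's other in-neighbors all fail. The function names, data layout, and example are assumptions, and the code does not attempt the linear-time bookkeeping the slide mentions.

```python
def ap_all(root, seeds, in_nbrs, prob):
    """Activation probability of every node in the in-arborescence."""
    ap = {}
    def visit(u):
        miss = 1.0
        for w in in_nbrs.get(u, []):
            visit(w)
            miss *= 1.0 - ap[w] * prob[(w, u)]
        ap[u] = 1.0 if u in seeds else 1.0 - miss
    visit(root)
    return ap

def alpha_all(root, seeds, in_nbrs, prob, ap):
    """Coefficient of ap(u) in ap(root), for every node u, computed top-down."""
    alpha = {root: 1.0}
    stack = [root]
    while stack:
        w = stack.pop()
        for u in in_nbrs.get(w, []):
            if w in seeds:
                alpha[u] = 0.0      # a seed parent blocks any further gain
            else:
                others = 1.0
                for x in in_nbrs[w]:
                    if x != u:
                        others *= 1.0 - ap[x] * prob[(x, w)]
                alpha[u] = alpha[w] * prob[(u, w)] * others
            stack.append(u)
    return alpha

# Hypothetical arborescence rooted at v:  u -> a -> v  and  b -> v, seed set {b}.
in_nbrs = {"v": ["a", "b"], "a": ["u"]}
prob = {("a", "v"): 0.5, ("b", "v"): 0.2, ("u", "a"): 0.3}
seeds = {"b"}

ap = ap_all("v", seeds, in_nbrs, prob)
alpha = alpha_all("v", seeds, in_nbrs, prob, ap)
for u in ("u", "a"):
    gain = alpha[u] * (1.0 - ap[u])
    recomputed = ap_all("v", seeds | {u}, in_nbrs, prob)["v"] - ap["v"]
    print(u, round(gain, 6), round(recomputed, 6))
```

For each candidate the two printed numbers agree, which is the point of the linear form: the marginal gain of a new seed toward v can be read off from the stored coefficient instead of recomputing activation probabilities from scratch.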


MIA Heuristic IV: Summary
     Iterate the following two steps until k seeds are found
       Select the node u giving the largest marginal influence
       Update MIAs (linear coefficients) after selecting u as the seed
     Key features:
       updates are local, and linear in the arborescence size
       tunable with parameter θ: tradeoff between running time and
       influence spread




Experiment Results on MIA Heuristic
     [Figure: influence spread vs. seed set size]

     NetHEPT dataset:
     • collaboration network from physics archive
     • 15K nodes, 31K edges

     Epinions dataset:
     • who-trust-whom network of Epinions.com
     • 76K nodes, 509K edges

     weighted cascade model:
     • influence probability to a node v = 1 / (# of in-neighbors of v)
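
The weighted cascade assignment can be written in a few lines. The sketch assumes the input is a list of distinct directed edges and returns the dict-of-dicts layout used for illustration here; the function name and example edges are hypothetical.

```python
def weighted_cascade(edges):
    """Assign p(u, v) = 1 / in-degree(v) to every directed edge (u, v)."""
    in_degree = {}
    for _, v in edges:
        in_degree[v] = in_degree.get(v, 0) + 1
    graph = {}
    for u, v in edges:
        graph.setdefault(u, {})[v] = 1.0 / in_degree[v]
    return graph

# Hypothetical edge list: c has two in-neighbors, b has one.
edges = [("a", "c"), ("b", "c"), ("a", "b")]
print(weighted_cascade(edges))
# -> {'a': {'c': 0.5, 'b': 1.0}, 'b': {'c': 0.5}}
```

Under this model the incoming probabilities of every node sum to 1, so nodes with many in-neighbors are harder for any single neighbor to influence.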

Experiment Results on MIA Heuristic
     [Figure: running time; plot annotations: "> 10^3 times speedup" and "10^4 times speedup"]

     Running time is for selecting 50 seeds

Scalability of MIA Heuristic




     •    synthesized graphs of different sizes generated from power-law graph model
     •    weighted cascade model
     •    running time is for selecting 50 seeds



Related Work
     Greedy approximation algorithms
       Original greedy algorithm [Kempe, Kleinberg, and Tardos, 2003]
       Lazy-forward optimization [Leskovec, Krause, Guestrin, Faloutsos, VanBriesen, and
       Glance, 2007]
       Edge sampling and reachable sets [Kimura, Saito and Nakano, 2007; C., Wang, and
       Yang, 2009]
       reduced seed selection from days to hours (with 30K nodes), but still not scalable
     Heuristic algorithms
       SPM/SP1M based on shortest paths [Kimura and Saito, 2006], not scalable
       SPIN based on Shapley values [Narayanam and Narahari, 2008], not scalable
       Degree discounts [C., Wang, and Yang, 2009], designed for the uniform IC model
       CGA based on community partitions [Wang, Cong, Song, and Xie 2010]
          complementary
          our local MIAs naturally adapt to the community structure, including overlapping communities


Future Directions
     Theoretical problem: efficient approximation algorithms:
       How to efficiently approximate influence spread given a seed set?
     Practical problem: Influence analysis from online social media
       How to mine the influence graph?




Thanks!
               and
               questions?





More Related Content

What's hot

Network centrality measures and their effectiveness
Network centrality measures and their effectivenessNetwork centrality measures and their effectiveness
Network centrality measures and their effectivenessemapesce
 
Federated learning
Federated learningFederated learning
Federated learningMindos Cheng
 
DBSCAN (2014_11_25 06_21_12 UTC)
DBSCAN (2014_11_25 06_21_12 UTC)DBSCAN (2014_11_25 06_21_12 UTC)
DBSCAN (2014_11_25 06_21_12 UTC)Cory Cook
 
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...Jonas Traub
 
PhD Thesis Proposal
PhD Thesis Proposal PhD Thesis Proposal
PhD Thesis Proposal Ziqiang Feng
 
Explainable AI - making ML and DL models more interpretable
Explainable AI - making ML and DL models more interpretableExplainable AI - making ML and DL models more interpretable
Explainable AI - making ML and DL models more interpretableAditya Bhattacharya
 
Sim-to-Real Transfer in Deep Reinforcement Learning
Sim-to-Real Transfer in Deep Reinforcement LearningSim-to-Real Transfer in Deep Reinforcement Learning
Sim-to-Real Transfer in Deep Reinforcement Learningatulshah16
 
Community Detection
Community Detection Community Detection
Community Detection Kanika Kanwal
 
Real time machine learning
Real time machine learningReal time machine learning
Real time machine learningVinoth Kannan
 
Community detection in social networks
Community detection in social networksCommunity detection in social networks
Community detection in social networksFrancisco Restivo
 
Social network analysis
Social network analysisSocial network analysis
Social network analysisCaleb Jones
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural networkmustafa aadel
 
Arm A64fx and Post-K: Game-Changing CPU & Supercomputer for HPC, Big Data, & AI
Arm A64fx and Post-K: Game-Changing CPU & Supercomputer for HPC, Big Data, & AIArm A64fx and Post-K: Game-Changing CPU & Supercomputer for HPC, Big Data, & AI
Arm A64fx and Post-K: Game-Changing CPU & Supercomputer for HPC, Big Data, & AIinside-BigData.com
 
01 Add Health Network Data Challenges: IRB and Security Issues
01 Add Health Network Data Challenges: IRB and Security Issues01 Add Health Network Data Challenges: IRB and Security Issues
01 Add Health Network Data Challenges: IRB and Security IssuesDuke Network Analysis Center
 
Scalable community detection with the louvain algorithm
Scalable community detection with the louvain algorithmScalable community detection with the louvain algorithm
Scalable community detection with the louvain algorithmNavid Sedighpour
 

What's hot (20)

Network centrality measures and their effectiveness
Network centrality measures and their effectivenessNetwork centrality measures and their effectiveness
Network centrality measures and their effectiveness
 
Machine learning yearning
Machine learning yearningMachine learning yearning
Machine learning yearning
 
Random walk on Graphs
Random walk on GraphsRandom walk on Graphs
Random walk on Graphs
 
Federated learning
Federated learningFederated learning
Federated learning
 
DBSCAN (2014_11_25 06_21_12 UTC)
DBSCAN (2014_11_25 06_21_12 UTC)DBSCAN (2014_11_25 06_21_12 UTC)
DBSCAN (2014_11_25 06_21_12 UTC)
 
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive W...
 
PhD Thesis Proposal
PhD Thesis Proposal PhD Thesis Proposal
PhD Thesis Proposal
 
Explainable AI - making ML and DL models more interpretable
Explainable AI - making ML and DL models more interpretableExplainable AI - making ML and DL models more interpretable
Explainable AI - making ML and DL models more interpretable
 
Sim-to-Real Transfer in Deep Reinforcement Learning
Sim-to-Real Transfer in Deep Reinforcement LearningSim-to-Real Transfer in Deep Reinforcement Learning
Sim-to-Real Transfer in Deep Reinforcement Learning
 
Resnet
ResnetResnet
Resnet
 
Community Detection
Community Detection Community Detection
Community Detection
 
Real time machine learning
Real time machine learningReal time machine learning
Real time machine learning
 
Community detection in social networks
Community detection in social networksCommunity detection in social networks
Community detection in social networks
 
Social network analysis
Social network analysisSocial network analysis
Social network analysis
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
 
Arm A64fx and Post-K: Game-Changing CPU & Supercomputer for HPC, Big Data, & AI
Arm A64fx and Post-K: Game-Changing CPU & Supercomputer for HPC, Big Data, & AIArm A64fx and Post-K: Game-Changing CPU & Supercomputer for HPC, Big Data, & AI
Arm A64fx and Post-K: Game-Changing CPU & Supercomputer for HPC, Big Data, & AI
 
01 Add Health Network Data Challenges: IRB and Security Issues
01 Add Health Network Data Challenges: IRB and Security Issues01 Add Health Network Data Challenges: IRB and Security Issues
01 Add Health Network Data Challenges: IRB and Security Issues
 
storm at twitter
storm at twitterstorm at twitter
storm at twitter
 
Open mp
Open mpOpen mp
Open mp
 
Scalable community detection with the louvain algorithm
Scalable community detection with the louvain algorithmScalable community detection with the louvain algorithm
Scalable community detection with the louvain algorithm
 

Similar to Socable Influence Maximization

Time Critical Influence Maximization
Time Critical Influence MaximizationTime Critical Influence Maximization
Time Critical Influence MaximizationWei Lu
 
Scalable and Parallelizable Processing of Influence Maximization for Large-S...
Scalable and Parallelizable Processing of Influence Maximization  for Large-S...Scalable and Parallelizable Processing of Influence Maximization  for Large-S...
Scalable and Parallelizable Processing of Influence Maximization for Large-S...Jinha Kim
 
Detecting root of the rumor in social network using GSSS
Detecting root of the rumor in social network using GSSSDetecting root of the rumor in social network using GSSS
Detecting root of the rumor in social network using GSSSIRJET Journal
 
Study on the Effect of Network Dynamics on Opportunistic Routing
Study on the Effect of Network Dynamics on Opportunistic RoutingStudy on the Effect of Network Dynamics on Opportunistic Routing
Study on the Effect of Network Dynamics on Opportunistic RoutingWaldir Moreira
 
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Universitat Politècnica de Catalunya
 
Redundancy avoidance for big data in data centers a conventional neural netwo...
Redundancy avoidance for big data in data centers a conventional neural netwo...Redundancy avoidance for big data in data centers a conventional neural netwo...
Redundancy avoidance for big data in data centers a conventional neural netwo...nexgentechnology
 
Bayesian Network 을 활용한 예측 분석
Bayesian Network 을 활용한 예측 분석Bayesian Network 을 활용한 예측 분석
Bayesian Network 을 활용한 예측 분석datasciencekorea
 
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’S
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’SDiscovering Influential User by Coupling Multiplex Heterogeneous OSN’S
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’SIRJET Journal
 
Jeffrey xu yu large graph processing
Jeffrey xu yu large graph processingJeffrey xu yu large graph processing
Jeffrey xu yu large graph processingjins0618
 
2 gamarnik
2 gamarnik2 gamarnik
2 gamarnikYandex
 
User Identity Linkage: Data Collection, DataSet Biases, Method, Control and A...
User Identity Linkage: Data Collection, DataSet Biases, Method, Control and A...User Identity Linkage: Data Collection, DataSet Biases, Method, Control and A...
User Identity Linkage: Data Collection, DataSet Biases, Method, Control and A...IIIT Hyderabad
 
モデルアーキテクチャ観点からの高速化2019
モデルアーキテクチャ観点からの高速化2019モデルアーキテクチャ観点からの高速化2019
モデルアーキテクチャ観点からの高速化2019Yusuke Uchida
 
Joint InBand Backhauling and Interference Mitigation in 5G Heterogeneous Netw...
Joint InBand Backhauling and Interference Mitigation in 5G Heterogeneous Netw...Joint InBand Backhauling and Interference Mitigation in 5G Heterogeneous Netw...
Joint InBand Backhauling and Interference Mitigation in 5G Heterogeneous Netw...TrungKienVu3
 
Why ∆Q is the ideal network metric
Why ∆Q is the ideal network metricWhy ∆Q is the ideal network metric
Why ∆Q is the ideal network metricMartin Geddes
 
A Novel Target Marketing Approach based on Influence Maximization
A Novel Target Marketing Approach based on Influence MaximizationA Novel Target Marketing Approach based on Influence Maximization
A Novel Target Marketing Approach based on Influence MaximizationSurendra Gadwal
 
Collaborative Research with UK MOD - an Academic's Experience ((John Fitzgerald)
Collaborative Research with UK MOD - an Academic's Experience ((John Fitzgerald)Collaborative Research with UK MOD - an Academic's Experience ((John Fitzgerald)
Collaborative Research with UK MOD - an Academic's Experience ((John Fitzgerald)scirexcenter
 
Smart manufacturing through cloud based-r-nabati--dr abdulbaghi ghaderzadeh
Smart manufacturing through cloud based-r-nabati--dr abdulbaghi ghaderzadehSmart manufacturing through cloud based-r-nabati--dr abdulbaghi ghaderzadeh
Smart manufacturing through cloud based-r-nabati--dr abdulbaghi ghaderzadehnabati
 
Can we predict the quality of spectrum-based fault localization?
Can we predict the quality of spectrum-based fault localization?Can we predict the quality of spectrum-based fault localization?
Can we predict the quality of spectrum-based fault localization?Lionel Briand
 
Utilization of Super Pixel Based Microarray Image Segmentation
Utilization of Super Pixel Based Microarray Image SegmentationUtilization of Super Pixel Based Microarray Image Segmentation
Utilization of Super Pixel Based Microarray Image Segmentationijtsrd
 

Similar to Socable Influence Maximization (20)

Time Critical Influence Maximization
Time Critical Influence MaximizationTime Critical Influence Maximization
Time Critical Influence Maximization
 
Scalable and Parallelizable Processing of Influence Maximization for Large-S...
Scalable and Parallelizable Processing of Influence Maximization  for Large-S...Scalable and Parallelizable Processing of Influence Maximization  for Large-S...
Scalable and Parallelizable Processing of Influence Maximization for Large-S...
 
Detecting root of the rumor in social network using GSSS
Detecting root of the rumor in social network using GSSSDetecting root of the rumor in social network using GSSS
Detecting root of the rumor in social network using GSSS
 
Study on the Effect of Network Dynamics on Opportunistic Routing
Study on the Effect of Network Dynamics on Opportunistic RoutingStudy on the Effect of Network Dynamics on Opportunistic Routing
Study on the Effect of Network Dynamics on Opportunistic Routing
 
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
 
Redundancy avoidance for big data in data centers a conventional neural netwo...
Redundancy avoidance for big data in data centers a conventional neural netwo...Redundancy avoidance for big data in data centers a conventional neural netwo...
Redundancy avoidance for big data in data centers a conventional neural netwo...
 
Bayesian Network 을 활용한 예측 분석
Bayesian Network 을 활용한 예측 분석Bayesian Network 을 활용한 예측 분석
Bayesian Network 을 활용한 예측 분석
 
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’S
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’SDiscovering Influential User by Coupling Multiplex Heterogeneous OSN’S
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’S
 
Jeffrey xu yu large graph processing
Jeffrey xu yu large graph processingJeffrey xu yu large graph processing
Jeffrey xu yu large graph processing
 
2 gamarnik
2 gamarnik2 gamarnik
2 gamarnik
 
User Identity Linkage: Data Collection, DataSet Biases, Method, Control and A...
User Identity Linkage: Data Collection, DataSet Biases, Method, Control and A...User Identity Linkage: Data Collection, DataSet Biases, Method, Control and A...
User Identity Linkage: Data Collection, DataSet Biases, Method, Control and A...
 
モデルアーキテクチャ観点からの高速化2019
モデルアーキテクチャ観点からの高速化2019モデルアーキテクチャ観点からの高速化2019
モデルアーキテクチャ観点からの高速化2019
 
Joint InBand Backhauling and Interference Mitigation in 5G Heterogeneous Netw...
Joint InBand Backhauling and Interference Mitigation in 5G Heterogeneous Netw...Joint InBand Backhauling and Interference Mitigation in 5G Heterogeneous Netw...
Joint InBand Backhauling and Interference Mitigation in 5G Heterogeneous Netw...
 
Why ∆Q is the ideal network metric
Why ∆Q is the ideal network metricWhy ∆Q is the ideal network metric
Why ∆Q is the ideal network metric
 
A Novel Target Marketing Approach based on Influence Maximization
A Novel Target Marketing Approach based on Influence MaximizationA Novel Target Marketing Approach based on Influence Maximization
A Novel Target Marketing Approach based on Influence Maximization
 
Collaborative Research with UK MOD - an Academic's Experience ((John Fitzgerald)
Collaborative Research with UK MOD - an Academic's Experience ((John Fitzgerald)Collaborative Research with UK MOD - an Academic's Experience ((John Fitzgerald)
Collaborative Research with UK MOD - an Academic's Experience ((John Fitzgerald)
 
Smart manufacturing through cloud based-r-nabati--dr abdulbaghi ghaderzadeh
Smart manufacturing through cloud based-r-nabati--dr abdulbaghi ghaderzadehSmart manufacturing through cloud based-r-nabati--dr abdulbaghi ghaderzadeh
Smart manufacturing through cloud based-r-nabati--dr abdulbaghi ghaderzadeh
 
Can we predict the quality of spectrum-based fault localization?
Can we predict the quality of spectrum-based fault localization?Can we predict the quality of spectrum-based fault localization?
Can we predict the quality of spectrum-based fault localization?
 
CrowDM system
CrowDM systemCrowDM system
CrowDM system
 
Utilization of Super Pixel Based Microarray Image Segmentation
Utilization of Super Pixel Based Microarray Image SegmentationUtilization of Super Pixel Based Microarray Image Segmentation
Utilization of Super Pixel Based Microarray Image Segmentation
 

Recently uploaded

08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 

Recently uploaded (20)


Scalable Influence Maximization

  • 1. Scalable Influence Maximization for Prevalent Viral Marketing in Large-Scale Social Networks. Wei Chen (Microsoft Research Asia), in collaboration with Chi Wang (University of Illinois at Urbana-Champaign) and Yajun Wang (Microsoft Research Asia). KDD'10, July 27, 2010.
  • 2. Outline: background and problem definition; Maximum Influence Arborescence (MIA) heuristic; experimental evaluations; related work and future directions.
  • 3. Ubiquitous Social Networks
  • 4. A Hypothetical Example of Viral Marketing (figure: the message "Avatar is great" repeated across the network).
  • 5. Effectiveness of Viral Marketing: level of trust in different types of ads (chart, with the annotation "very effective"). Source: Forrester Research and Intelliseek.
  • 6. The Problem of Influence Maximization. Social influence graph: vertices are individuals; links are social relationships; the number p(u,v) on a directed link from u to v is the probability that v is activated by u after u is activated. Independent cascade model: initially, some seed nodes are activated; at each step, each newly activated node u activates its neighbor v with probability p(u,v). Influence spread: the expected number of nodes activated. Influence maximization: find k seeds that generate the largest influence spread.
  • 7. Research Background. Influence maximization as a discrete optimization problem was proposed by Kempe, Kleinberg, and Tardos in KDD’2003. Finding the optimal solution is provably hard (NP-hard). Greedy approximation algorithm, a 63% approximation of the optimal solution: repeat k rounds; in the i-th round, select a node v that provides the largest marginal increase in influence spread. This requires evaluating the influence spread of a given seed set, which is hard and slow. Several subsequent studies improved the running time. Serious drawback: still very slow and not scalable, taking more than 3 hours on a 30K-node graph for 50 seeds.
  • 8. Our Work. Design new heuristics: the MIA (maximum influence arborescence) heuristic for the general independent cascade model; 10^3 speedup, from hours to seconds (or days to minutes); influence spread close to that of the greedy algorithm of [KKT’03]. We also show that computing the exact influence spread given a seed set is #P-hard (counting hardness); this resolves an open problem in [KKT’03] and indicates the intrinsic difficulty of computing influence spread.
  • 9. Maximum Influence Arborescence (MIA) Heuristic I: Maximum Influence Paths (MIPs). For any pair of nodes u and v, find the maximum influence path (MIP) from u to v; ignore MIPs whose probability is too small (below a threshold parameter).
  • 10. MIA Heuristic II: Maximum Influence in- (out-) Arborescences. Local influence regions: for every node v, all MIPs to v form its maximum influence in-arborescence (MIIA).
  • 11. MIA Heuristic II: Maximum Influence in- (out-) Arborescences. Local influence regions: for every node v, all MIPs to v form its maximum influence in-arborescence (MIIA); for every node u, all MIPs from u form its maximum influence out-arborescence (MIOA). These MIIAs and MIOAs can be computed efficiently using Dijkstra's shortest-path algorithm.
  • 12. MIA Heuristic III: Computing Influence through the MIA Structure. The activation probability ap(u) of a node u in its in-arborescence, given a seed set S, is computed recursively. This can be used in the greedy algorithm for selecting k seeds, but it is not efficient enough.
  • 13. MIA Heuristic IV: Efficient Updates on Activation Probabilities. If v is the root of a MIIA and u is a node in that MIIA, then their activation probabilities have a linear relationship. All of the linear coefficients in a MIIA can be computed recursively, which reduces the time from quadratic to linear. If u is selected as a seed, its marginal influence increase on v follows from this linear relationship; summing that marginal influence over all nodes v gives the marginal influence of u. Select the u with the largest marginal influence, then update the linear coefficients for all nodes w that are in the same MIIAs as u.
  • 14. MIA Heuristic IV: Summary. Iterate the following two steps until k seeds are found: select the node u giving the largest marginal influence; update the MIAs (linear coefficients) after selecting u as a seed. Key features: updates are local and linear in the arborescence size; tunable with a threshold parameter, which trades off running time against influence spread.
  • 15. Experiment Results on MIA Heuristic: influence spread vs. seed set size. NetHEPT dataset: collaboration network from the physics archive; 15K nodes, 31K edges. Epinions dataset: who-trust-whom network of Epinions.com; 76K nodes, 509K edges. Weighted cascade model: influence probability to a node v = 1 / (number of in-neighbors of v).
  • 16. Experiment Results on MIA Heuristic: running time. The plots are annotated with speedups of >10^3 times and 10^4 times. Running time is for selecting 50 seeds.
  • 17. Scalability of MIA Heuristic: synthesized graphs of different sizes generated from a power-law graph model; weighted cascade model; running time is for selecting 50 seeds.
  • 18. Related Work. Greedy approximation algorithms: original greedy algorithm [Kempe, Kleinberg, and Tardos, 2003]; lazy-forward optimization [Leskovec, Krause, Guestrin, Faloutsos, VanBriesen, and Glance, 2007]; edge sampling and reachable sets [Kimura, Saito, and Nakano, 2007; C., Wang, and Yang, 2009]; these reduced seed selection from days to hours (with 30K nodes), but are still not scalable. Heuristic algorithms: SPM/SP1M based on shortest paths [Kimura and Saito, 2006], not scalable; SPIN based on Shapley values [Narayanam and Narahari, 2008], not scalable; degree discounts [C., Wang, and Yang, 2009], designed for the uniform IC model; CGA based on community partitions [Wang, Cong, Song, and Xie, 2010], complementary: our local MIAs naturally adapt to the community structure, including overlapping communities.
  • 19. Future Directions. Theoretical problem: efficient approximation algorithms. How can influence spread be approximated efficiently given a seed set? Practical problem: influence analysis from online social media. How can the influence graph be mined?
  • 20. Thanks! And questions?
  • 21. Experiment Results on MIA Heuristic: influence spread vs. seed set size, and running time. The running-time plots are annotated with speedups of 10^4 times and >10^3 times; running time is for selecting 50 seeds. NetHEPT dataset: collaboration network from the physics archive; 15K nodes, 31K edges. Epinions dataset: who-trust-whom network of Epinions.com; 76K nodes, 509K edges. Weighted cascade model: influence probability to a node v = 1 / (number of in-neighbors of v).
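
The MIA steps on slides 9 to 12 can be illustrated with a small Python sketch. It is not the authors' implementation. The function names `build_miia` and `activation_probability`, the edge-dictionary representation, and the toy graph are all hypothetical. The sketch assumes that a maximum influence path is found by running Dijkstra's algorithm on edge weights -log p(u,v), and that the activation probability in a tree follows the usual independent cascade rule ap(u) = 1 - Π (1 - ap(w) · p(w,u)) over the in-neighbors w of u in the arborescence; the transcript itself only says that the computation is recursive.

```python
import heapq
import math


def build_miia(edges, root, theta):
    """Sketch of an in-arborescence for `root` (hypothetical helper).

    edges maps (u, v) -> p(u, v). Returns {node: (next_hop_toward_root,
    path_probability)} for every node whose best path to `root` has
    probability at least `theta`.
    """
    in_nbrs = {}
    for (u, v), p in edges.items():
        in_nbrs.setdefault(v, []).append((u, p))

    limit = -math.log(theta)      # longest allowed distance in -log space
    best = {root: 0.0}            # node -> smallest -log(path probability)
    next_hop = {root: None}
    done = set()
    heap = [(0.0, root)]
    while heap:
        dist, v = heapq.heappop(heap)
        if v in done:
            continue
        done.add(v)
        for u, p in in_nbrs.get(v, []):
            if p <= 0.0 or u in done:
                continue
            cand = dist - math.log(p)
            # Keep the path only if it is above the threshold and improves u.
            if cand <= limit + 1e-12 and cand < best.get(u, math.inf):
                best[u] = cand
                next_hop[u] = v
                heapq.heappush(heap, (cand, u))
    return {u: (next_hop[u], math.exp(-best[u])) for u in done}


def activation_probability(miia, edges, root, seeds):
    """Recursive ap(root) inside the arborescence, under the assumed tree rule."""
    children = {}
    for u, (nxt, _) in miia.items():
        if nxt is not None:
            children.setdefault(nxt, []).append(u)

    def ap(u):
        if u in seeds:
            return 1.0
        miss = 1.0                # probability that no in-neighbor activates u
        for w in children.get(u, []):
            miss *= 1.0 - ap(w) * edges[(w, u)]
        return 1.0 - miss

    return ap(root)


# Toy graph (made up for illustration only).
edges = {
    ("a", "v"): 0.5,
    ("b", "v"): 0.4,
    ("c", "a"): 0.5,
    ("c", "b"): 0.1,
    ("d", "c"): 0.2,
}
miia = build_miia(edges, "v", theta=0.1)
prob = activation_probability(miia, edges, "v", seeds={"b", "c"})
print(sorted(miia), round(prob, 4))
```

In this toy graph the best path from d to v has probability 0.05, which is below the threshold of 0.1, so d is left out of the arborescence. Raising the threshold shrinks each arborescence, which is the running time versus influence spread tradeoff mentioned on slide 14.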