Time-Critical Influence Maximization in Social
Networks with Time-Delayed Diffusion Process
Wei Chen (Microsoft Research Asia)
Wei Lu (U. of British Columbia)
Ning Zhang (U. of Sci and Tech of China)
This work was done during the internships of Wei Lu and Ning Zhang at Microsoft Research Asia.
Influence in Social Networks
Thursday, July 26, 2012. AAAI 2012, Toronto, Ontario.
 We live in communities
and interact with social
acquaintances
 This forms social
networks
 In social interactions, we
influence each other
Influence Diffusion & Viral Marketing
[Figure: the message "Kinect is great" repeated across the members of a social network]
Word-of-mouth effects
Social Networks as Directed Graphs
 Nodes: Individuals in the network
 Edges: Links between individuals
 Edge weight: Influence probability p(u,v) – the probability that v will be
influenced by u
[Figure: example directed graph with an influence probability on each edge]
A Classical Influence Propagation Model
 Independent Cascade (IC) (Kempe, Kleinberg, and Tardos 2003)
 Initially some seed nodes are activated
 At each time step (discrete), each newly-activated node u
activates its neighbor v independently with probability p(u,v)
 Influence spread: Expected number of nodes activated
 Other propagation models
 Linear Threshold (LT)
 General Threshold
 etc.
Influence Maximization
Problem: Select k individuals such that by activating them, the expected
spread of influence is maximized.
Input: A directed graph representing a social network, with influence
probabilities on edges
Output: Seed set of size k
NP-hard; #P-hard to compute exact influence
Temporal Aspects in Influence Diffusion
 Influence diffusion can be time-delayed
 Heterogeneity of human activities and interactions (Iribarren and
Moro, 2009)
 Network topology and burstiness (Karsai et al., 2011)
 NOT captured in classical propagation models
Temporal Aspects in Influence Diffusion
 The task of influence maximization may be time-critical.
 Xbox 360 + Kinect is on sale in Vancouver-area Best Buys, for
3 days only!
 Alice has grabbed this great deal, and wants to inform Bob
 But Bob has been away on a road trip in Banff National Park
 No stable Internet or cellphone access (Rocky Mountains!) and an
uncertain return time
 Viral marketing campaigns may have limited time horizon,
which affects the spread of word-of-mouth.
 NOT captured in current propagation models
Our Work
 Extend the influence maximization problem to have a deadline
constraint: Time-Critical Influence Maximization
 Propose a new propagation model to reflect temporal delays
of influence diffusion: Independent Cascade with Meeting
events (IC-M)
Independent Cascade with Meeting Events (The IC-M Model)
 Model parameters
 Social networks modeled as directed graphs
 Influence probability p(u,v)
 Meeting probability m(u,v), the probability that u and v
meet in each time step
 Diffusion dynamics
 Initially, a seed set is targeted and activated
 At each time step, u and v meet with probability m(u,v).
 If u is active and is meeting v for the first time, u influences v with
probability p(u,v)
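The diffusion dynamics above can be sketched as a single randomized run. This is a minimal illustration, not the authors' implementation: the function name `simulate_icm` and the edge dictionary layout are hypothetical, and it assumes that a node activated at step t can first meet its out-neighbors at step t+1, and that "first time" means the first meeting after u became active.

```python
import random

def simulate_icm(edges, seeds, T, rng):
    """Run one IC-M diffusion and return the nodes active by the end of step T.

    edges: {(u, v): (p_uv, m_uv)} with influence and meeting probabilities.
    Assumption: a node activated at step t starts meeting neighbors at t + 1.
    """
    out = {}
    for (u, v), (p, m) in edges.items():
        out.setdefault(u, []).append((v, p, m))

    active = set(seeds)
    # Edges whose source is active but has not yet met the target.
    waiting = [(u, v, p, m) for u in active for (v, p, m) in out.get(u, [])]

    for _ in range(T):
        newly_active = set()
        still_waiting = []
        for (u, v, p, m) in waiting:
            if rng.random() < m:
                # First meeting: u gets exactly one chance to influence v.
                if v not in active and rng.random() < p:
                    newly_active.add(v)
            else:
                still_waiting.append((u, v, p, m))
        active |= newly_active
        for u in newly_active:
            still_waiting.extend((u, v, p, m) for (v, p, m) in out.get(u, []))
        waiting = still_waiting
    return active

# Toy chain a -> b -> c in which every meeting and every attempt succeeds.
chain = {("a", "b"): (1.0, 1.0), ("b", "c"): (1.0, 1.0)}
reached = simulate_icm(chain, {"a"}, 2, random.Random(0))
```

Averaging `len(simulate_icm(...))` over many runs would give a Monte Carlo estimate of the influence spread within the deadline T.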
Time-Critical Influence Maximization
 Original Influence Maximization
 Diffusion ends only when no more nodes can be activated
 Unconstrained time horizon
 Meeting probabilities not essential
 Time-Critical Influence Maximization
 Given an integer T << |V|, we only consider influence spread
within T time steps
 T: the deadline constraint.
 Representing limited time horizon.
Time-Critical Influence Maximization
 Problem Formulation
 Input: G=(V, E), k, T
 Objective: find k seeds such that the spread of influence by the
end of time step T is maximized (under the IC-M model)
 NP-hard
Greedy Approximation Algorithm
 Under the IC-M model, our objective function (spread of
influence) is monotone and submodular in the seed set.
 Monotonicity: As the seed set grows, influence is non-decreasing
 Submodularity: The law of diminishing marginal returns
 Greedy approximation algorithm
 Repeat k rounds
 In each round, select the node v that provides the largest marginal
gain in influence spread
 Approximation ratio = 1-1/e ≈ 63% (Nemhauser et al., 1978)
 #P-hard to compute influence spread exactly for IC-M
 Can use Monte Carlo simulations to estimate, but very costly
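The greedy loop described above can be sketched generically. This is an illustrative sketch under assumptions: `greedy_seeds` and the `spread` callable are hypothetical names, and the toy coverage function below merely stands in for a Monte Carlo estimate of influence spread under IC-M.

```python
def greedy_seeds(nodes, k, spread):
    """Pick k seeds, each round adding the node with the largest marginal gain.

    spread: callable mapping a list of seeds to an (estimated) influence spread.
    The 1 - 1/e guarantee relies on spread being monotone and submodular.
    """
    seeds = []
    current = spread(seeds)
    for _ in range(k):
        best_node, best_gain = None, float("-inf")
        for v in nodes:
            if v in seeds:
                continue
            gain = spread(seeds + [v]) - current
            if gain > best_gain:
                best_node, best_gain = v, gain
        if best_node is None:
            break  # fewer than k candidate nodes
        seeds.append(best_node)
        current += best_gain
    return seeds

# Toy monotone submodular objective: number of items covered by the seeds.
reach = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}}

def coverage(seed_list):
    return len(set().union(*(reach[v] for v in seed_list)))

chosen = greedy_seeds(["a", "b", "c"], 2, coverage)
```

Each round evaluates `spread` once per remaining node, so with a simulation-based estimator the cost is roughly k times the number of nodes times the number of simulation runs, which is why the slide calls this approach very costly.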
Overcome the Inefficiency of Greedy
 MIA: Maximum Influence Arborescence (Chen et al., 2010)
 Heuristic No.1 (MIA-M algorithm)
 Design an efficient algorithm to compute influence spread
exactly in tree structures
 Leverage it to design scalable heuristics for time-critical
influence maximization in general graphs
 Heuristic No.2 (MIA-C algorithm)
 For each pair of nodes (u,v), estimate the probability that
influence will propagate from u to v by combining p(u,v),
m(u,v), and the deadline T
 Convert the problem to one in the classical IC model, and solve
it using MIA
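To illustrate how p(u,v), m(u,v), and a deadline can be combined, consider the simplest case: a single edge whose source u is already active, with a given number of time steps remaining. The sketch below is not the MIA-C conversion itself (the slides do not state that formula); `edge_success_within_deadline` is a hypothetical helper that only covers this single-edge case.

```python
def edge_success_within_deadline(p, m, steps):
    """Probability that an active u activates v within `steps` time steps.

    u must meet v at least once (probability 1 - (1 - m)**steps) and the
    single influence attempt at that first meeting must succeed (probability p).
    """
    return p * (1.0 - (1.0 - m) ** steps)

def edge_success_by_summation(p, m, steps):
    """Same quantity, summed over the step at which the first meeting occurs."""
    return sum(p * m * (1.0 - m) ** (t - 1) for t in range(1, steps + 1))

closed_form = edge_success_within_deadline(0.5, 0.5, 3)
summed = edge_success_by_summation(0.5, 0.5, 3)
```

With p = 0.5, m = 0.5 and three steps, both forms give 0.4375, compared with 0.5 when there is no deadline; a tight deadline or a low meeting probability discounts the edge.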
Compute Influence in Directed Trees (In-Arborescences)
 Activation probability ap(u,t): the probability that u
becomes active right at time step t
Calculating Activation Probability
 Step 1: For any seed set S, compute ap(u,t) given S via
dynamic programming
 Base cases
 The recursion: ap(u,t) = [equation not preserved in the extracted slide text]
 Step 2: By linearity of expectation, for a given seed set S,
inf(S) = Σ_{u ∈ V} Σ_{t=0}^{T} ap(u,t)
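A dynamic program of this kind can be sketched as follows. The recursion in the code is derived here from stated assumptions rather than copied from the slide: seeds have ap(u,0) = 1, non-seeds have ap(u,0) = 0, a node activated at step s can first meet its parent at step s+1, and the subtrees below different in-neighbors are disjoint, so their influence on u is independent. The names `activation_probs` and `expected_spread` are hypothetical.

```python
def activation_probs(children, p, m, seeds, root, T):
    """Return {u: [ap(u,0), ..., ap(u,T)]} for an in-arborescence.

    children[u]: in-neighbors of u (edges point from child to parent).
    p[(w, u)], m[(w, u)]: influence and meeting probability on edge w -> u.
    """
    ap = {}

    def visit(u):
        kids = children.get(u, [])
        for w in kids:
            visit(w)
        if u in seeds:
            ap[u] = [1.0] + [0.0] * T
            return

        def inactive_through(t):
            # P(no in-neighbor has activated u by the end of step t).
            result = 1.0
            for w in kids:
                hit = sum(
                    ap[w][s] * p[(w, u)] * (1.0 - (1.0 - m[(w, u)]) ** (t - s))
                    for s in range(t)
                )
                result *= 1.0 - hit
            return result

        # u becomes active exactly at t: inactive through t - 1, not through t.
        ap[u] = [0.0] + [
            inactive_through(t - 1) - inactive_through(t) for t in range(1, T + 1)
        ]

    visit(root)
    return ap

def expected_spread(ap):
    """Linearity of expectation: sum ap(u,t) over all nodes and t = 0..T."""
    return sum(sum(row) for row in ap.values())

# Single edge s -> u with p = m = 0.5 and deadline T = 3.
ap = activation_probs({"u": ["s"]}, {("s", "u"): 0.5}, {("s", "u"): 0.5}, {"s"}, "u", 3)
spread = expected_spread(ap)
```

On this single edge the sketch gives ap(u,1) = 0.25, ap(u,2) = 0.125, ap(u,3) = 0.0625, and an expected spread of 1.4375. It uses recursion for brevity, so deep trees would need an iterative traversal.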
Computing Influence in General Graphs
 Utilize the dynamic programming algorithm for trees
 Restrict incoming influence to a node u in a local region
 Influence from nodes far away can be ignored
 “Sparsify” the local region of a node u to an in-arborescence, by
including only the strongest influence path from other nodes
to u
 Dijkstra’s algorithm
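The sparsification step can be sketched with Dijkstra's algorithm on edge weights -log p, so that the shortest path to u is the path with the largest product of probabilities. This is an illustrative sketch under assumptions: the function name, the `in_edges` layout, and the cutoff parameter `theta` (paths weaker than `theta` are dropped as "far away") are hypothetical, and the path strength here uses influence probabilities only.

```python
import heapq
from math import exp, log

def max_influence_in_arborescence(in_edges, root, theta):
    """Build an in-arborescence of strongest influence paths into `root`.

    in_edges[v]: list of (w, p_wv) for edges w -> v.
    Returns (parent, strength): parent[w] is the next hop from w toward root,
    strength[w] is the probability product along that path.
    """
    limit = -log(theta)          # longest admissible path in -log space
    dist = {root: 0.0}
    parent = {root: None}
    heap = [(0.0, root)]
    done = set()
    while heap:
        d, v = heapq.heappop(heap)
        if v in done:
            continue
        done.add(v)
        for w, p_wv in in_edges.get(v, []):
            if p_wv <= 0.0:
                continue
            nd = d - log(p_wv)
            if nd <= limit and nd < dist.get(w, float("inf")):
                dist[w] = nd
                parent[w] = v
                heapq.heappush(heap, (nd, w))
    return parent, {v: exp(-d) for v, d in dist.items()}

# b reaches u more strongly through a (0.5 * 0.5) than directly (0.1);
# c's best path (0.01 * 0.5) falls below theta and is pruned.
in_edges = {"u": [("a", 0.5), ("b", 0.1)], "a": [("b", 0.5), ("c", 0.01)]}
parent, strength = max_influence_in_arborescence(in_edges, "u", theta=0.05)
```

Because every kept node has exactly one next hop toward the root, the result is a tree on which a tree-only dynamic program can be run.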
Experiments: Network Datasets
 NetHEPT: A co-authorship network from the arxiv.org High Energy
Physics Theory section.
 WikiVote: A who-voted-whom network from Wikipedia
 Epinions: A who-trusts-whom network from the customer
reviews site Epinions.com
 DBLP: A co-authorship network from DBLP
              NetHEPT   WikiVote   Epinions   DBLP
# Nodes       15K       7.1K       75K        655K
# Edges       62K       101K       509K       2.0M
Avg. degree   4.12      26.6       13.4       6.1
Max. degree   64        1065       3079       588
Experimental Results
[Figure: spread of influence (Y-axis) vs. seed set size (X-axis), T = 5 and 15; panels (a) NetHEPT, (b) Epinions]
Graph parameters:
• Influence probability: p(u,v) = 1.0 / in-degree(v)
• Meeting probability: m(u,v) = 1.0 / out-degree(u)
Experimental Results
[Figure: spread of influence (Y-axis) vs. seed set size (X-axis), T = 5 and 15; panels (a) NetHEPT, (b) Epinions]
Graph parameters:
• Influence probability: p(u,v) = 1.0 / in-degree(v)
• Meeting probability: m(u,v) chosen uniformly at random from
{0.2, 0.3, … , 0.7, 0.8}
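The two parameter settings used in the experiments can be sketched as below. This is a minimal illustration with hypothetical function names; it assumes the graph is given as a list of distinct directed edges (u, v) and reads the set {0.2, 0.3, …, 0.7, 0.8} as the seven values from 0.2 to 0.8 in steps of 0.1.

```python
import random
from collections import Counter

def weighted_parameters(edges):
    """p(u,v) = 1 / in-degree(v) and m(u,v) = 1 / out-degree(u)."""
    in_deg = Counter(v for _, v in edges)
    out_deg = Counter(u for u, _ in edges)
    p = {(u, v): 1.0 / in_deg[v] for u, v in edges}
    m = {(u, v): 1.0 / out_deg[u] for u, v in edges}
    return p, m

def random_meeting_probabilities(edges, rng):
    """m(u,v) drawn uniformly at random from {0.2, 0.3, ..., 0.8}."""
    choices = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
    return {(u, v): rng.choice(choices) for u, v in edges}

edges = [("a", "c"), ("b", "c"), ("a", "b")]
p, m = weighted_parameters(edges)
m_random = random_meeting_probabilities(edges, random.Random(42))
```

In the toy graph, c has two in-edges, so each carries influence probability 0.5, and a has two out-edges, so each carries meeting probability 0.5.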
Running Time Comparisons
 T = 5, Weighted meeting probabilities. DNP = Did Not Complete
(within 72 hours)
 In general, Greedy is too slow to use in practice
 Cannot scale to large graphs
 MIA-M, MIA-C are 2-3 orders of magnitude faster
 MIA-M is somewhat slower than MIA-C and MIA
 But has higher spread of influence
         NetHEPT   WikiVote   Epinions   DBLP
Greedy   40m       22m        DNP        DNP
MIA-M    1.6s      7.9s       41s        6.6m
MIA-C    0.3s      0.4s       2.7s       24s
MIA      0.3s      1.4s       12s        40s
Conclusions, Discussions & Future Work
 Conclusions
 Time-Critical Influence Maximization Problem
 Independent Cascade model with Meeting events (IC-M)
 Approximation algorithm & heuristic solutions
 Extensions & Refinements
 Linear Threshold model with Meeting events (LT-M)
 More efficient computation of activation probabilities in tree
structures
 Details available in our full technical report: arXiv 1204.3074
 Future Work
 Extend classical propagation models to incorporate login events
 Extend to more general propagation models
Thanks!!! Questions???
KDD 2012 tutorial on Information and Influence Spread in
Social Networks (Aug 12, Beijing, China)
Carlos Castillo (Qatar Computing Research Institute)
Wei Chen (Microsoft Research Asia)
Laks V.S. Lakshmanan (University of British Columbia)