2. @estebanmoro
First Lecture
• Motivation
• Social dynamical processes
• Ties
• Tie activity
• Tie dynamics
• Community dynamics
• Network dynamics http://bit.ly/LesHouches
9. @estebanmoro
Dynamics at different timescales:
• Nodes appear/disappear (t1, t2, t3): Barabasi et al., Physica A (2002); Holme et al., Soc. Net. (2004)
• Ties form/decay (t1, t2, t3): Hidalgo et al., Physica A (2008); Burt, Soc. Net. (2000)
• Tie activity is bursty: Barabasi, Nature (2005)
• Groups of conversation (t1, t1+dt): Kovanen et al., J. Stat. Mech. (2011); Zhao et al., NetMob (2011)
• Communities form/change/decay (t1, t2): Palla et al., Proc. of SPIE (2007)
• Networks form/change/decay (t1, t2): Kossinets and Watts, Science (2006)
10. 1 Social dynamical processes
11. @estebanmoro
• Cognitive limits
• Dunbar's number
• There is a cognitive limit to the number of people with whom one can maintain stable social relationships (Dunbar 1992)
• The Magical Number Seven, Plus or Minus Two
• The number of objects an average human can hold in working memory is 7 ± 2 (Miller 1956)
[Figure: $\langle w_{ij} \mid k_i \rangle$ versus $k_i$]
Miritello, G. et al., 2013. Time as a limited resource: Communication
strategy in mobile phone networks. Social Networks.
12. @estebanmoro
[Figure: network diagram with labels: weak tie, structural hole, bridge, strong tie]
• Embeddedness / clustering / triadic closure / weak ties
• Embeddedness, clustering: two people who each spend time with a third person are likely to encounter each other (triadic closure). This minimizes conflict, maximizes trust, …
• Bridges, structural holes (Burt): bridges have structural advantages since they have access to non-redundant information
• Weak ties (Granovetter): weak ties tend to connect different areas of the network (they are more likely to be sources of novel information)
Rivera, M.T., Soderstrom, S.B. & Uzzi, B., 2010. Dynamics of Dyads in Social Networks: Assortative, Relational, and Proximity Mechanisms. Annual Review of Sociology, 36(1), pp.91–115.
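The notions of embeddedness and bridges above can be made concrete on a toy graph. The following is a minimal sketch, not the method of any cited paper: the function names, the toy edge list and the particular overlap formula (common neighbours divided by the union of the two nodes' other neighbours) are assumptions chosen for illustration.

```python
def embeddedness(adj, u, v):
    """Number of neighbours shared by u and v (triangles closed around the tie)."""
    return len(adj[u] & adj[v])


def overlap(adj, u, v):
    """Shared neighbours divided by all other neighbours of u or v (assumed definition)."""
    common = adj[u] & adj[v]
    others = (adj[u] | adj[v]) - {u, v}
    return len(common) / len(others) if others else 0.0


# Toy graph (hypothetical): two triangles joined by the single tie c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

for u, v in edges:
    print(u, v, embeddedness(adj, u, v), round(overlap(adj, u, v), 2))
```

In this toy graph the tie c-d has no common neighbours, which is what makes it a bridge between the two triangles; ties inside a triangle have positive embeddedness.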
13. @estebanmoro
• Contagion
• Human behaviors spread on the network
• Dynamics too
• Homophily
• The greater the similarity between individuals the more likely they are to
establish a connection
[Figure: (a) probability versus number of churner neighbours (0 to 25), for May, June and July churners; second panel: probability versus proportion of pairs adjacent (0 to 1), for 3, 4, 5 and 6 churners.]
Correlation coefficient:
Attribute   Random    Communicate
Age         -0.0001    0.297
Gender       0.0001   -0.032
ZIP         -0.0003    0.557
County       0.0005    0.704
Language    -0.0001    0.694
Table 5: Correlation coefficients for random pairs of people and pairs of people who communicate. We compare the degree of homophily of random pairs of users with pairs of users that communicate.
Number of pairs of people at different ages:
[Figure: two heat maps of the ages (10 to 80) of the two people in each pair; (a) Random, (b) Communicate.]
Figure 21: Number of pairs of people of different ages. We plot ages of two people and color corresponds to the number of such pairs. (a) Ages of randomly selected pairs of people; we note there is little correlation. (b) Ages of people who communicate with one another, i.e., ages of people at the endpoints of links in the communication network. The high correlation is captured by the diagonal trend.
Leskovec, J. & Horvitz, E., 2008. Planetary-scale views
on a large instant-messaging network. pp.915–924.
Dasgupta, K. et al., 2008. Social ties and their relevance
to churn in mobile telecom networks.
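The comparison in Table 5 (attribute correlation for random pairs versus communicating pairs) can be sketched on synthetic data. Everything below is an assumption made for illustration: the ages are random, and "communicating" partners are generated by adding Gaussian noise to a person's age, which is not how the cited study obtained its pairs.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


rng = random.Random(0)
n_pairs = 2000

# Random pairs: both ages drawn independently.
random_a = [rng.uniform(10, 80) for _ in range(n_pairs)]
random_b = [rng.uniform(10, 80) for _ in range(n_pairs)]

# Synthetic "communicating" pairs: the partner's age is close to one's own
# (noise of 15 years is an arbitrary illustrative choice).
talk_a = [rng.uniform(10, 80) for _ in range(n_pairs)]
talk_b = [a + rng.gauss(0, 15) for a in talk_a]

r_random = pearson(random_a, random_b)
r_talk = pearson(talk_a, talk_b)
print(f"random pairs: {r_random:.3f}, communicating pairs: {r_talk:.3f}")
```

A correlation near zero for random pairs and a clearly positive one for connected pairs is the signature of homophily in such a table; on its own it does not say whether similarity caused the tie or the tie caused the similarity.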
14. @estebanmoro
• Contagion = Homophily?
• Influence and homophily are usually confounded in observational social
network studies
[Excerpt, cropped on the right in the slide; cropped line ends are marked […]]
network registered >14 billion page views and sent 3 […] messages over 89.3 million distinct relationships. For deta[…] the service, the data, and descriptive statistics see the Da[…] of the SI.
Evidence of Assortative Mixing and Temporal Clusterin[…]
We observe strong evidence of both assortative mixing […] poral clustering in Go adoption. At the end of the 5-mont[…] adopters have a 5-fold higher percentage of adopters in t[…] networks (t-stat = 100.12, p < 0.001; k.s.-stat = 0.06, p […] and receive a 5-fold higher percentage of messages from […] than nonadopters (t-stat = 88.30, p < 0.001; k.s.-sta[…] p < 0.001). Both the number and percentage of one's loca[…] who have adopted are highly predictive of one's propensity […] (Logistic: (#) = 0.153, p < 0.001; (%) = 1.268, p < 0.001 […] adopt earlier (Hazard Rate: (#) = 0.10, p < 0.001; (%) […] p < 0.001). The likelihood of adoption increases dramati[…] the number of adopter friends (Fig. 2C), and corresp[…] adopters are more likely to have more adopter friends ([…] mirroring prior evidence on product adoption in networ[…]
Adoption decisions among friends also cluster in t[…] randomly reassigned all Go adoption times (while mainta[…] adoption frequency distribution over time) and compared […]
Fig. 1. Diffusion of Yahoo! Go over time. (A–C and D–F) Two subgraphs of the
Yahoo! IM network colored by adoption states on July 4 (the Go launch date),
August 10, and October 29, 2007. For animations of the diffusion of Yahoo! Go
over time see Movies S1 and S2.
Fig. 3. Distinguishing homophily and influence. (A and B) The fraction of observed treated to untreated adopters ($n_+/n_-$) under random (A) and propensity score (B) matching over time. The dotted line shows a ratio of 1, when treatment has no effect. The Right Inset in B graphs the average marginal influence effects of having 1, 2, 3, or 4 adopter friends implied by random (open circles) and propensity score (filled circles) matching. The Left Inset graphs the average cosine distance of attribute and behavior vectors of adopters to adopter friends as the number of adopters in the local network increases ($\sum_{i,j}^{n} \cos(x_i^a, x_j^a)/n$). (C) Graphs the cosine distances of adopters to their adopter friends $\cos(x_{it}^a, x_{jt}^a)$, their nonadopter friends $\cos(x_{it}^a, x_{jt})$, and a random alter $\cos(x_{it}^a, x_{rt})$ over time with trend lines fitted by ordinary least squares. (D) The fraction of treated and untreated adopters, where treatment is defined as having a friend who adopted within a certain time period (or recency) ($\Delta t \equiv t_i^a - t_j^a = R$), under random matching (open circles) and propensity score matching (filled circles). The Inset graphs the cosine distances of dyads of adopters $\cos(x_{it}^a, x_{jt}^a)$ by the time […]
Aral, S., et al. 2009. Distinguishing influence-based contagion from homophily-driven
diffusion in dynamic networks. Proceedings of the National Academy of Sciences,
106(51), p.21544.
15. 2 Tie activity
[Diagram: ties form/decay (t1, t2, t3), tie activity is bursty, groups of conversation (t1, t1+dt); communities form/change/decay (t1, t2); networks form/change/decay (t1, t2)]
16. @estebanmoro
• Bursty human dynamics: inter-event times between activities follow a heavy-tailed distribution
[Excerpt, cropped in the slide:] […] at which point the individual's behavior is again governed by a homogeneous Poisson process with rate $r_i$ (25). We refer to the resulting model as a cascading Poisson process. To compare the predictions of the cascading Poisson process (26) to the empirical data, we […]
[Figure: probability density versus inter-event time τ (d) on log-log axes, for the periods 1896-1925, 1925-1955 and 1896-1955; panels A and B.]
Malmgren, R. et al., 2009. On universality in
human correspondence activity. Science,
325(5948), p.1696.
17. @estebanmoro
• Bursty human dynamics: inter-event times between activities follow a heavy-tailed distribution
[Excerpt, cropped in the slide; gaps are marked […]]
These findings have important implications, ranging from resource management to service allocation, in both communications and retail.
Humans participate on a daily basis in a large number of distinct activities, ranging from electronic communication (such as sending e-mails or making telephone calls) to browsing the Internet, initiating financial transactions, or engaging in entertainment and sports. Given the number of factors that determine the timing of each action, ranging from work and sleep patterns to resource availability, it seems impossible to seek regularities in human dynamics, apart from the obvious daily and seasonal periodicities. Therefore, in contrast with the accurate predictive tools common in […] estimate the number of congestion-caused blocked calls in mobile communication [4]. Yet, an increasing number of recent measurements indicate that the timing of many human actions systematically deviates from the Poisson prediction, the waiting or inter-event times being better approximated by a heavy tailed or Pareto distribution (Fig. 1d–f). The differences between Poisson and heavy-tailed behaviour are striking: a Poisson distribution decreases exponentially, forcing the consecutive events to follow each other at relatively regular time intervals and forbidding very long waiting times. In contrast, the slowly decaying, heavy-tailed processes allow for very long periods of inactivity that separate bursts of intensive activity (Fig. 1).
Figure 1 The difference between the activity patterns predicted by a Poisson process and the heavy-tailed distributions observed in human dynamics. a, Succession of events predicted by a Poisson process, which assumes that in any moment an event takes place with probability q. The horizontal axis denotes time, each vertical line corresponding to an […] events displayed in a, b. d, The succession of events for a heavy-tailed distribution. e, The waiting time τ of 1,000 consecutive events, where the mean event time was chosen to coincide with the mean event time of the Poisson process shown in a–c. Note the large spikes in the plot, corresponding to very long delay times. b and e have the same […]
Barabasi, A.-L., 2005. The origin of bursts and heavy tails in
human dynamics. Nature, 435(7039), pp.207–211.
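The contrast between Poisson and heavy-tailed inter-event times described above can be reproduced with a few lines of simulation. This is a rough sketch under stated assumptions: the Pareto exponent 1.5, the sample size and the use of the coefficient of variation as a summary are illustrative choices, not values from the cited paper.

```python
import random
import statistics


def coefficient_of_variation(samples):
    """Standard deviation divided by the mean (about 1 for a Poisson process)."""
    return statistics.pstdev(samples) / statistics.fmean(samples)


rng = random.Random(42)
n_events = 50_000

# Poisson process: exponential inter-event times with mean 1.
poisson_gaps = [rng.expovariate(1.0) for _ in range(n_events)]

# Heavy-tailed gaps: Pareto samples (exponent 1.5 is an assumption),
# rescaled so that both sequences have a mean close to 1.
raw = [rng.paretovariate(1.5) for _ in range(n_events)]
raw_mean = statistics.fmean(raw)
heavy_gaps = [g / raw_mean for g in raw]

cv_poisson = coefficient_of_variation(poisson_gaps)
cv_heavy = coefficient_of_variation(heavy_gaps)
longest_poisson = max(poisson_gaps)
longest_heavy = max(heavy_gaps)
print(f"CV: Poisson {cv_poisson:.2f}, heavy-tailed {cv_heavy:.2f}")
print(f"longest gap: Poisson {longest_poisson:.1f}, heavy-tailed {longest_heavy:.1f}")
```

With the same mean gap, the heavy-tailed sequence shows a much longer longest gap, i.e. long silences separating bursts, while the exponential gaps stay within a few multiples of the mean.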
18. @estebanmoro
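• Why? Managing tasks as a queue (queueing theory): the waiting time τ is the time a task spends in the queue
[Diagram: user's list of tasks 1, 2, …, L with priorities x1, x2, …, xL and an incoming task L+1 with priority xL+1; labels ρ(x), xmax and τ.]
$$\tau(x) = \sum_{t=1}^{\infty} t\, f(x,t) = \dots$$
Figure 3 The waiting time distribution predicted b[…]
Barabasi, A.-L., 2005. The origin of bursts and heavy tails in human dynamics. Nature, 435(7039), pp.207–211.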
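A priority-queue mechanism of this kind can be simulated directly. The sketch below is one possible formulation and is labeled as an assumption throughout: a fixed-length list, uniform random priorities, execution of the highest-priority task with probability `p` (otherwise a random task), and the values `list_length=2` and `p=0.99` are illustrative choices rather than details taken from the slide.

```python
import random
import statistics


def simulate_priority_queue(steps, list_length=2, p=0.99, seed=0):
    """Return the waiting time (in steps) of every executed task."""
    rng = random.Random(seed)
    # Each task is (priority, step at which it entered the list).
    tasks = [(rng.random(), 0) for _ in range(list_length)]
    waits = []
    for step in range(1, steps + 1):
        if rng.random() < p:
            # Execute the task with the highest priority.
            idx = max(range(list_length), key=lambda i: tasks[i][0])
        else:
            # Occasionally execute a task chosen at random.
            idx = rng.randrange(list_length)
        waits.append(step - tasks[idx][1])
        # The executed task is replaced by a new task with random priority.
        tasks[idx] = (rng.random(), step)
    return waits


waits = simulate_priority_queue(50_000)
print("median wait:", statistics.median(waits))
print("longest wait:", max(waits))
```

Most tasks are executed right away, but low-priority tasks can stay in the list for a long time, which is the qualitative origin of the broad waiting-time distribution.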
19. @estebanmoro
• Bursty contacts: inter-event times on ties also follow a heavy-tailed distribution
[Excerpt, cropped in the slide:] […] distribution around 20 seconds is found. This peak is due to event correlations between links. The power law indicates the non-Poissonian, bursty character of the events. Both the characteristics vanish for the time-shuffled null model BCW, and the inter-event time is well described by an exponential function (see inset of Fig. 4), i.e., the process is Poissonian.
FIG. 4: (color online) Scaled inter-event time distributions for the MCN data. Edges were binned (log bins with base 1.3) according to their weights and for every second bin the inter-event time distribution of the events occurring in the corre[…]
type of reasoning has its limitations, neverthel
trates the mechanisms of slowing down becaus
the residual waiting time increases as the chan
waiting times after getting infected increases
tailed waiting time distribution.
In conclusion, the spreading phenomena in s
communication networks are slow mainly for t
First, the community structure and its corre
link weights have already a considerable effec
the inhomogeneous and bursty activity patte
links result in an additional slowing down.
misleading to emphasize only one of these re
as shown here, by using proper null models th
tions of different factors can be distinguished.
surprisingly, the daily pattern and event corr
tween links seem to play only a minor role
spreading speed.
Karsai, M. et al., 2011. Small But Slow World: How
Network Topology and Burstiness Slow Down
Spreading. Physical Review E, 83(2), p.025102.
Miritello, G., Moro, E. & Lara, R., 2011. Dynamical
strength of social ties in information spreading.
Physical Review E, 83(4), p.045102.
22. @estebanmoro
• Bursty contacts: impact on the waiting time
• How long should I wait for the next call from a friend?
• When is the next bus coming?
• Given the inter-event time distribution $P(\delta t)$, calculate the waiting-time distribution $P(\tau)$
[Diagram: timeline with consecutive events at $t$ and $t + \delta t$; τ is the waiting time from a random arrival to the next event.]
$$P(\tau) = \int_{\tau}^{\infty} d\delta t \; \frac{\delta t\, P(\delta t)}{\overline{\delta t}} \; \frac{1}{\delta t}$$
$$\overline{\tau} = \frac{\overline{\delta t}}{2} \left( 1 + \frac{\sigma_{\delta t}^{2}}{\overline{\delta t}^{2}} \right)$$
[Table and chart from the bus punctuality report cited below; contents not recoverable from the extraction.]
Bus Punctuality Statistics GB 2007. Dept. of Transport
Task/bus/call arrives at random time
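The mean waiting-time formula above can be checked numerically. The sketch below compares the formula with a direct simulation in which an observer arrives at a uniformly random time and waits for the next event; the function names, the mean gap of 10 and the sample sizes are assumptions for illustration.

```python
import bisect
import itertools
import random
import statistics


def mean_wait_formula(gaps):
    """Mean waiting time from the mean and variance of the inter-event gaps."""
    mean_gap = statistics.fmean(gaps)
    variance = statistics.pvariance(gaps, mean_gap)
    return 0.5 * mean_gap * (1 + variance / mean_gap**2)


def mean_wait_sampled(gaps, n_arrivals, rng):
    """Mean waiting time measured by dropping observers at random times."""
    event_times = list(itertools.accumulate(gaps))
    total = event_times[-1]
    waits = []
    for _ in range(n_arrivals):
        arrival = rng.uniform(0.0, total)
        # First event at or after the arrival time.
        next_event = event_times[bisect.bisect_left(event_times, arrival)]
        waits.append(next_event - arrival)
    return statistics.fmean(waits)


rng = random.Random(1)
regular_gaps = [10.0] * 2000                                     # a perfectly regular bus
random_gaps = [rng.expovariate(1 / 10.0) for _ in range(20000)]  # Poisson arrivals, mean gap 10

regular_wait = mean_wait_formula(regular_gaps)
random_wait = mean_wait_formula(random_gaps)
random_wait_sampled = mean_wait_sampled(random_gaps, 20000, rng)
print(regular_wait, round(random_wait, 2), round(random_wait_sampled, 2))
```

For regular gaps the mean wait is half the gap; for exponential gaps with the same mean it is roughly the full mean gap, and it grows further as the variance of the gaps increases, which is why bursty ties slow down spreading.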
23. @estebanmoro
• Is that all? No: bursts are correlated in time
• To find correlations, detect sequences of events whose consecutive inter-event times satisfy $\delta t \le \Delta t$
• If activity is a renewal process, the probability $P(E = n)$ of finding n such events in a row is determined by $P(\delta t)$ alone
• P(E) then decays exponentially
• However, in real data it decays like a power law
Results
Correlated events. A sequence of discrete temporal events can be interpreted as a time-dependent point process, $X(t)$, where $X(t_i) = 1$ at each time step $t_i$ when an event takes place, otherwise $X(t_i) = 0$. To detect bursty clusters in this binary event sequence we have to identify those events we consider correlated. The smallest temporal scale at which correlations can emerge in the dynamics is between consecutive events. If only $X(t)$ is known, we can assume two consecutive actions at $t_i$ and $t_{i+1}$ to be related if they follow each other within a short time interval, $t_{i+1} - t_i \le \Delta t$ [30,38]. For events with the duration $d_i$ this condition is slightly modified: $t_{i+1} - (t_i + d_i) \le \Delta t$.
This definition allows us to detect bursty periods, defined as a sequence of events where each event follows the previous one within a time interval $\Delta t$. By counting the number of events, E, that belong to the same bursty period, we can calculate their distribution P(E) in a signal. For a sequence of independent events, P(E) is uniquely determined by the inter-event time distribution $P(t_{ie})$ as follows:
$$P(E = n) = \left( \int_0^{\Delta t} P(t_{ie})\, dt_{ie} \right)^{n-1} \left( 1 - \int_0^{\Delta t} P(t_{ie})\, dt_{ie} \right) \qquad (1)$$
for $n > 0$. Here the integral $\int_0^{\Delta t} P(t_{ie})\, dt_{ie}$ defines the probability to draw an inter-event time $P(t_{ie}) \le \Delta t$ randomly from an arbitrary […]
randomly selected users with maxim
Fig. 2.a and b (right bottom panel
sequences strong temporal correl
exponents see Table 1). The power
after a short period denoting the
corresponding channel and lasts u
natural rhythm of human activities
bottom panels) long term correlatio
which reflects a typical office hour
includes internal email communicati
The broad shape of P(tie) and A(t)
communication dynamics is inhom
trivial correlations up to finite time sc
event-event correlations by shuffli
sequences (see Methods) the autoco
slow power-law like decay (empty sy
indicating spurious unexpected depe
strates the disability of A(t) to chara
geneous signals (for further results se
measure of such correlations is prov
distribution for various Dt windows,
following scale invariant behavior
P Eð Þ*E
Figure 1 | Activity of single entities with color-coded inter-event times. (a): Sequence of earthquakes with magnitude
(South of Chishima Island, 8th–9th October 1994) (b): Firing sequence of a single neuron (from rat’s hippocampal)
sequence of an individual. Shorter the time between the consecutive events darker the color.
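Condition for consecutive events in the same burst: $\delta t \le \Delta t$
Renewal-process prediction:
$$P(E = n) = \left( \int_0^{\Delta t} P(\delta t)\, d\delta t \right)^{n-1} \left( 1 - \int_0^{\Delta t} P(\delta t)\, d\delta t \right)$$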
Karsai, M. et al., 2012. Universal features of correlated
bursty behaviour. Scientific Reports, 2.
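The burst-detection rule and the renewal-process prediction can be sketched as follows. The function names and the toy event times are hypothetical; the probability that a gap is at most Δt is estimated here as an empirical fraction, which is an assumption standing in for the integral of P(δt).

```python
def burst_train_sizes(times, window):
    """Sizes E of bursty trains: runs of events with consecutive gaps <= window."""
    if not times:
        return []
    sizes = []
    current = 1
    for previous, following in zip(times, times[1:]):
        if following - previous <= window:
            current += 1
        else:
            sizes.append(current)
            current = 1
    sizes.append(current)
    return sizes


def renewal_train_probability(n, gaps, window):
    """P(E = n) for independent gaps: q**(n-1) * (1 - q), with q = P(gap <= window)."""
    q = sum(gap <= window for gap in gaps) / len(gaps)
    return q ** (n - 1) * (1 - q)


times = [0, 1, 2, 10, 11, 30]  # toy, sorted event times
gaps = [b - a for a, b in zip(times, times[1:])]
sizes = burst_train_sizes(times, window=2)
print(sizes)
print([round(renewal_train_probability(n, gaps, 2), 3) for n in (1, 2, 3)])
```

Comparing the histogram of `sizes` from real data with the geometric prediction is the test described above: a renewal process gives an exponential decay in n, while correlated bursts give a much slower decay.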
24. @estebanmoro
• Is that all? No
• Adjacent tie contacts are correlated in time
[Diagram: individual i with incoming events * → i and outgoing events i → j on a time axis, marking $\tau_{ij}$ and $\delta t_{ij}$.]
FIG. 1. (color online) Schematic view of communications events around individual i: each horizontal segment indicates an event between i → j (top) and * → i (bottom). At each $t_\alpha$ in the * → i time series, $\tau_{ij}$ is the time elapsed to the next i → j event, which is different from the inter-event time $\delta t_{ij}$ in the i → j time series. The red shaded area represents the recover time window $T_i$ after $t_\alpha$.
[Excerpt, cropped on the left in the slide: it compares $P(\delta t_{ij})$ and $P(\tau_{ij})$ for the real data with two shufflings of the CDR time-stamps, one destroying the temporal correlation between * → i and i → j events and one destroying both tie temporal patterns and correlation between ties.]
[Figure: $P(\tau_{ij}/\overline{\delta t_{ij}})$ and $P(\delta t_{ij}/\overline{\delta t_{ij}})$ versus $\tau_{ij}/\overline{\delta t_{ij}}$ on log-log axes.]
FIG. 2. (color online) Distribution of the relay time inter-[…]
$$P(\tau) = \int_{\tau}^{\infty} d\delta t \; \frac{\delta t\, P(\delta t)}{\overline{\delta t}} \; \frac{1}{\delta t}$$
Miritello, G., Moro, E. Lara, R., 2011. Dynamical
strength of social ties in information spreading.
Physical Review E, 83(4), p.045102.
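The relay time described in FIG. 1 (time from each incoming event * → i to the next outgoing event i → j) can be computed with a sorted search. This is a minimal sketch; the function name and the toy time stamps are hypothetical, and incoming events with no later outgoing event are simply skipped, which is an assumption.

```python
import bisect


def relay_times(incoming, outgoing):
    """For each incoming time, the delay until the next strictly later outgoing time.

    Both lists must be sorted in increasing order.
    """
    delays = []
    for t_alpha in incoming:
        k = bisect.bisect_right(outgoing, t_alpha)
        if k < len(outgoing):
            delays.append(outgoing[k] - t_alpha)
    return delays


incoming = [1, 5, 9, 25]   # toy times of * -> i events
outgoing = [2, 6, 7, 20]   # toy times of i -> j events
print(relay_times(incoming, outgoing))
```

Comparing the distribution of these delays with the one obtained after shuffling the incoming time stamps is the kind of null-model test the excerpt refers to.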
25. @estebanmoro
• Is that all? No
• Temporal motifs:
• Two interactions are Δt-connected if they share a node and happen within a time difference of at most Δt
Temporal motifs in time-dependent networks, p. 5:
Figure 1. (a) An example event data set E with six events. Durations have been omitted for simplicity. With Δt = 10 there are two maximal subgraphs, shown in (b) and (c). (d) Valid subgraphs contained in the maximal subgraph in (b). In addition to these the maximal subgraph itself and all unit subgra[…] maximal subgraph in (c) does not contain other valid sub[…] unit subgraphs. (e) Event sets that are contained in (b) […] the upper one because it is not Δt-connected, the lower on[…] all consecutive Δt-connected events of node c.
[…]he presented definition for temporal subgraph is meaning[…] ved in at most one event at a time. When overlapping […] umber of different situations that can arise in the mo[…]
Kovanen, L. et al., 2011. Temporal motifs in time-
dependent networks. Journal Of Statistical
Mechanics-Theory And Experiment, 2011(11),
p.P11005.
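The basic building block of temporal motifs, pairs of events that share a node and are close in time, can be sketched as below. This is a simplification and is labeled as such: event durations are ignored, the extra validity conditions of the cited paper are not implemented, and the function name and toy events are hypothetical.

```python
from itertools import combinations


def dt_adjacent_pairs(events, dt):
    """Pairs of events that share a node and are at most dt apart in time.

    events: iterable of (time, source, target) tuples.
    """
    pairs = []
    for first, second in combinations(sorted(events), 2):
        shares_node = bool({first[1], first[2]} & {second[1], second[2]})
        if shares_node and second[0] - first[0] <= dt:
            pairs.append((first, second))
    return pairs


events = [(1, "a", "b"), (3, "b", "c"), (20, "c", "a"), (4, "d", "e")]
close_pairs = dt_adjacent_pairs(events, dt=10)
print(close_pairs)
```

Chaining such pairs together gives connected groups of events, from which the ordered patterns (repeated contact, causal chain, and so on) can then be counted.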
26. @estebanmoro
• Is that all? No
• Temporal motifs:
• Most temporal motifs involve only two nodes
• Motifs that allow causal interpretations are more common
Zhao, Q. et al., 2010. Communication motifs: a tool to characterize social
communications. Proceedings of the 19th ACM international conference
on Information and knowledge management, pp.1645–1648.
27. @estebanmoro
• Is that all?
• Temporal motifs depend on sex
• All-female/all-male cases are overrepresented/underrepresented
[Figure: an example temporal subgraph (Δt = 3) on nodes a, b and c and the corresponding temporal motif with ordered events 1., 2., 3.; panel C shows the two-event motifs: repeated contact, returned contact, causal chain, non-causal chain, out-star, in-star.]
Kovanen, L. et al., 2013. Temporal motifs reveal homophily,
gender-specific patterns, and group talk in call sequences.
PNAS, 110(45), pp.18070–18075.
28. @estebanmoro
• Wrap-up
• Activity within a single tie is bursty
• P(dt) is heavy-tailed
• Bursts are correlated
• Activity across adjacent ties is correlated
• Two adjacent ties
• Group conversations
• Impact on the waiting time (spreading)
• More adjacent ties
• Temporal motifs
30. 3 Tie dynamics
[Diagram: ties form/decay (t1, t2, t3), tie activity is bursty, groups of conversation (t1, t1+dt); communities form/change/decay (t1, t2); networks form/change/decay (t1, t2)]
31. @estebanmoro
• Ties are formed and decay
• Why?
• Creation
• Node creation/decay
• Assortative: Homophily/Heterophily
• Relational: Reciprocity, Triadic Closure, Degree
• Proximity: Proximity and Social Foci
• Decay
• Same mechanisms as for creation
• How?
• Social limitations / strategies
35. @estebanmoro
• Tie decay: predictors
• Links with large embeddedness and reciprocity are more likely to persist
C.A. Hidalgo, C. Rodriguez-Sickert / Physica A 387 (2008) 3017–3024, pp. 3020–3021:
Fig. 2. Persistence across a cellular phone network (a) Distribution of persistence for all links (b) Fraction of surviving ties as a function of time. The inset shows the same plot in a double logarithmic scale. The continuous line is $t^{-1/4}$.
Table 1. Persistence of ties and link attributes
Pearson's correlation     C       k       r       R       TO      Persistence
C                         1       0.023   0.15    0.11    0.23    0.15
k                                 1       0.02    0.13    0.19    0.16
r                                         1       0.68    0.073   0.033
R                                                 1       0.2964  0.5886
TO                                                        1       0.3537
Regression coefficients   0.09    0.002   0.15    0.35    0.56
Partial correlations      0.0027  0.0032  0.007   0.26    0.034
Fig. 3(d) shows the distribution of persistence divided by clustering coefficient categories, indicating that highly clustered nodes tend to have relatively large cores. In the core periphery context, this means that persevering nodes are located in dense parts of the social network (Fig. 3(a) I) while those in sparser parts tend to have nonpersistent ties acting as bridges which interruptedly connect different parts of the network (Fig. 3(a) II). Finally, we split the distribution of persistence by reciprocity (Fig. 3(e)) and observe that nodes with more reciprocated ties tend to be […]
R.S. Burt / Social Networks 22 (2000) 1–28, p. 11:
Fig. 1. Decay functions.
[…] variation. The hazard rate for a relationship is the probability that it will be gone next year. Hazard rates for the colleague relations are given in Table 3, predicted by logit equations in Table 4, and graphed in Fig. 1B. Test statistics in Table 4 are adjusted (down) for autocorrelation between relations cited by the same respondent (e.g., Kish and Frankel, 1974).
Decay is high on average. Three in four of the 22,709 colleague relations at risk of […]
Burt, R., 2000. Decay functions.
Social Networks, 22(1), pp.1–28.
Hidalgo, C. Rodriguez-Sickert, C., 2008. The
dynamics of a mobile phone network.
Physica A, 387(12), pp.3017–3024.
36. @estebanmoro
• Two paradoxes
• Ties bridging distant parts of the network (the ones important for information
diffusion, achievement) are not only the least likely to be created, but also the
most likely to decay
R.S. Burt / Social Networks 24 (2002) 333–363 335
37. @estebanmoro
• Two paradoxes
• Tie formation tends to close triangles, but ties embedded in triangles are less likely to decay. Thus, the network should become ever more clustered
Fig. 3. Network-level properties over time, for three choices of smoothing window τ = 30 days (dashes), 60 days (solid lines), and 90 days (dots). (A) Mean vertex degree $\langle k \rangle$. (B) Fractional size of the largest component S. (C) Mean shortest path length in the largest component L. (D) Clustering coefficient C.
38. @estebanmoro
• How are ties formed and destroyed?
Is there any strategy?
• Cognitive limitations, time limitations
• Dunbar number: there is a limit to
the number of people with whom
one can maintain stable social
relationships.
• Time/attention is limited: how do
we manage relationships if our
time is limited?
39. @estebanmoro
• How are ties formed and destroyed?
• Disentangling tie burstiness and formation/decay
[Diagram: activity of a tie i ↔ j relative to the observation window from 0 to T, in five situations (a) to (e).]
Links of type (a): 15%
Links of type (b): 19%
Links of type (c): 24%
Links of type (d): 42%
Links of type (e): 3.5%
Figure S2: (Color online) Schematic view of the different situations of tie formation/decay and the […]
[Timeline: a 7-month observation window Ω with 6-month windows before and after.]
Miritello, G. et al., 2013. Limited communication
capacity unveils strategies for human
interaction. Scientific Reports, 3.
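One way to separate tie formation and decay from mere burstiness is to require silence in a window before (or after) the observation period. The sketch below illustrates that idea only; the classification labels, the function name and the month-based toy data are assumptions and do not reproduce the paper's types (a) to (e).

```python
def classify_tie(event_times, window_start, window_end):
    """Classify a tie by where its events fall relative to the observation window."""
    before = any(t < window_start for t in event_times)
    inside = any(window_start <= t < window_end for t in event_times)
    after = any(t >= window_end for t in event_times)
    if not inside:
        return "inactive in window"
    if before and after:
        return "persistent"
    if before:
        return "decays"
    if after:
        return "forms"
    return "forms and decays"


# Toy timeline in months: 6 months before, a 7-month window [6, 13), 6 months after.
examples = {
    "persistent": [1, 8, 15],
    "forms": [7, 14],
    "decays": [2, 7],
    "forms and decays": [8, 9],
}
for expected, times in examples.items():
    print(expected, "->", classify_tie(times, 6, 13))
```

A long silent margin on each side reduces the chance that a long but ordinary inter-event gap on a bursty tie is mistaken for the creation or the end of the tie.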
40. @estebanmoro
• How are ties formed and destroyed?
• Very heterogeneous tie evolution
• Mean: $\langle n_{\alpha,i} \rangle \simeq \langle n_{\omega,i} \rangle \simeq 8$, $\langle k_i \rangle \simeq 16$
• But for 20% of nodes $n_{\alpha,i}$ […] 15
[Excerpt, cropped in the slide:] […] $k_i(T)/2$ (see Fig. 3a) for the observation period, which suggests that a large fraction of the revealed aggregated social connectivity $k_i(T)$ is given by newly formed or removed connections. Thus, $k_i(T)$ usually overestimates the instantaneous human social capacity of maintaining social ties.
The imbalance between the number of added/removed ties measures how social capacity changes. At the end of […]
[Figure: panels B and C; number of ties versus time in days for one user, showing the aggregated number of ties formed, the aggregated number of ties destroyed, the number of ties open and $k_i(t)$.]
41. @estebanmoro
• How are ties formed and destroyed?
• Tie formation/decay is bursty
3.3 Dynamical communication strategies, p. 57:
[Figure: log-log plots of $P(x/\bar{x})$ versus $x/\bar{x}$, with (a) $x = t_{k \to k+1}$ and (b) $x = t_{k \to k-1}$.]
Figure 3.10: (Rescaled) Distribution of the time gap between edge creation (a) and edge removal (b). Each curve refers to a group of nodes with a different activity rate $\alpha_i$, where groups have been obtained according to the quartiles of $\alpha_i$ for the whole population. The dashed line in both panels corresponds to an exponential distribution with the same mean.
We then calculate the aggregated number of events $\hat{n}_{i,\alpha}$ and $\hat{n}_{i,\omega}$ and fit them to linear models to obtain the simulated $\hat{\alpha}_i$ and $\hat{\omega}_i$ for those cases in which $\hat{n}_{i,\alpha}$ […] 5 and $\hat{n}_{i,\omega}$ […] 5. As shown in Fig. 3.9 (c) the empirical values of $n_{i,\alpha}$ observed in our data can be well explained by the simulations, suggesting that our model works well at the particular scale considered. In addition, we also find a good agreement between the measured values of $\alpha_i$ and $\omega_i$ and the simulations (Fig. 3.9 (d)), despite the small number of outliers that cannot be explained by our model.
... indicating the number of days in each month when the user used this service.
• Time stamped events of link addition and deletion of each user.
In the Skype network, when a user adds a friend to his/her contact list, the friend may confirm the contact invitation or not. Also, at any point in time a user may delete a “friend” from their contact list [32]. Thus, the network evolves by means of the following events: contact addition, contact confirmation and contact deletion. In our study, to take into account only trusted social links, we retained only confirmed edges, meaning edges where both parties accepted the connection. Failure to do so would lead to mixing undesired with desired connections.
For the present study we employed two subsets of the above dataset. The first dataset (DS1) includes every active user as of the end of 2010, all confirmed edges between these users, and the date of confirmation of each edge. In this context, we define an active user as one who connected to the Skype network at least in two different months during the first year after their registration date. In order to consider users with a realistic number of friends we selected only those with degree between 2 and 1000. Users with more connections are suspected to be bots or are business accounts and their behavior differs from the majority who use Skype for personal communication. This filtering led to a set of more than 150 million users.
The second dataset (DS2) includes the set of edge addition events and edge confirmation events for the period of 2009–2011 as well as edge deletion events recorded for ...
... addition and deletion of an edge e. If this distribution follows a power law as

$P(\tau) \sim \tau^{-\gamma}$    (2)

it indicates strong temporal heterogeneities and burstiness, or otherwise if it decays exponentially it reflects regular dynamical features. Bursty temporal evolution of human dynamics was confirmed in various systems ranging from library loans to human communication [7, 14] or recently for the evolution of social networks [6].
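The contrast between exponentially decaying and heavy-tailed inter-event times can be illustrated with a small synthetic experiment. The sketch below is not the analysis from the cited papers: it uses the coefficient of variation of the gaps (about 1 for exponential gaps, larger for heavy-tailed ones) as a simple stand-in diagnostic, and the function names and synthetic data are assumptions made for illustration.

```python
import random
import statistics
from itertools import accumulate

def inter_event_times(timestamps):
    """Gaps between consecutive events; the input need not be sorted."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]

def coefficient_of_variation(gaps):
    """Standard deviation of the gaps divided by their mean."""
    return statistics.pstdev(gaps) / statistics.mean(gaps)

random.seed(0)  # fixed seed so the synthetic example is reproducible

# Synthetic event times (in days): exponential gaps versus heavy-tailed gaps.
regular_events = list(accumulate(random.expovariate(1.0) for _ in range(5000)))
bursty_events = list(accumulate(random.paretovariate(1.5) for _ in range(5000)))

cv_regular = coefficient_of_variation(inter_event_times(regular_events))
cv_bursty = coefficient_of_variation(inter_event_times(bursty_events))
print(f"exponential gaps:  CV = {cv_regular:.2f}")
print(f"heavy-tailed gaps: CV = {cv_bursty:.2f}")
```

A full analysis would fit the exponent $\gamma$ of the tail rather than rely on a single summary number.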
[Figure: inter-event time distributions on log-log axes, times in days. (a) $P(\tau_a)$ and $P(\tau_d)$, with $\gamma = 0.85$. (b) $P(\tau_{ad})$, with $\gamma = 0.82$.]
FIG. 1: Inter-event time distributions of (a) edge addition ...
Kikas, R., Dumas, M. & Karsai, M., 2012. Bursty egocentric network evolution in Skype. arXiv.org, physics.soc-ph.
Miritello, G., 2013. Temporal Patterns of Communication in Social Networks. Springer.
42. @estebanmoro
• How are ties formed and destroyed?
• Linear tie formation/decay and conserved capacity
... are often used to collect as many friends and followers as possible. Note that on average $n_{\alpha,i}(T)$ and $n_{\omega,i}(T)$ almost equal $k_i(T)/2$ (see Fig. 3a) for the observation period, which suggests that a large fraction of the revealed aggregated social connectivity $k_i(T)$ is given by newly formed or removed connections. Thus, $k_i(T)$ usually overestimates the instantaneous human social capacity of maintaining social ties. The imbalance between the number of added/removed ties measures how social capacity changes. At the end of ...
[Figure, panels A, B and C; the caption below is cropped at both margins in the slide.]
... characterization of social dynamical strategies. (A) Probability distribution function (pdf) of the aggregated social conne... and number of deleted ties $n_{\omega,i}$ at $t = T$, compared with the pdf for the average social capacity $\kappa_i$ over the observation window. ... between the number of created $n_{\alpha,i}$ and removed $n_{\omega,i}$ connections with a linear correlation coefficient of 0.87. The results from the PC... variation can be explained by the first component with a standard deviation of 1.81 in the (0.70, 0.71) direction. This result is show... and the top of the boxes correspond to the 25th and the 75th quantiles respectively, while the band near the middle is the 50th per... down and top of the whiskers represent the 5th and 95th percentiles. The line $y = x$ lies between the 9th and the 91st percentiles, ... in correspondence of each box. The blue solid lines refer to the 5th and 95th percentiles of random generated $n_\alpha$ and $n_\omega$ from a P... expected number of events in a given time interval of length $T$. (c) Density plot of $\alpha_i$ as a function of $\omega_i$. We observe ...
$n_{\alpha,i}(t) \simeq \alpha_i t$, $n_{\omega,i}(t) \simeq \omega_i t$, $\alpha_i \simeq \omega_i$
$\kappa_i(t) \simeq \kappa_i(0)$
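The two relations above (linear growth of the cumulative counts and a roughly conserved capacity) can be checked on event data with a few lines of code. This is a minimal sketch on made-up event days for one hypothetical user: the helper names, the least-squares fit through the origin and the initial capacity of 8 are assumptions for illustration, not the fitting procedure of the original study.

```python
def cumulative_counts(event_days, horizon):
    """n(t) for t = 0..horizon: number of events that happened on or before day t."""
    events = sorted(event_days)
    counts, seen = [], 0
    for t in range(horizon + 1):
        while seen < len(events) and events[seen] <= t:
            seen += 1
        counts.append(seen)
    return counts

def slope_through_origin(counts):
    """Least-squares rate r in the model n(t) ~ r * t."""
    numerator = sum(t * n for t, n in enumerate(counts))
    denominator = sum(t * t for t in range(len(counts)))
    return numerator / denominator

# Made-up days on which one user formed and destroyed ties.
formed = [5, 15, 25, 35, 45, 55, 65, 75, 85, 95]
destroyed = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
horizon = 100

n_alpha = cumulative_counts(formed, horizon)
n_omega = cumulative_counts(destroyed, horizon)
alpha_i = slope_through_origin(n_alpha)   # tie formation rate
omega_i = slope_through_origin(n_omega)   # tie decay rate

kappa_0 = 8  # assumed number of open ties at t = 0
kappa = [kappa_0 + a - o for a, o in zip(n_alpha, n_omega)]

print(f"alpha_i = {alpha_i:.3f}, omega_i = {omega_i:.3f}")
print(f"capacity stays within [{min(kappa)}, {max(kappa)}]")
```

When the two rates are close, the capacity $\kappa_i(t) = \kappa_i(0) + n_{\alpha,i}(t) - n_{\omega,i}(t)$ fluctuates around its initial value instead of drifting.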
43. @estebanmoro
• How are ties formed and destroyed? Linear tie formation/decay
[Figure: neighbordid (0 to 700) versus Year (3.5 to 5.5); color scale from 2 to 2980.]
44. @estebanmoro
• How are ties formed and destroyed? Linear tie formation/decay
[Figure: neighbordid (0 to 700) versus Year (3.5 to 5.5); color scale from 2 to 2980. Annotated slopes: 1.1 persons/... and 0.6 persons/...]
45. @estebanmoro
• How are ties formed and destroyed?
• Social capacity and activity are not independent: $n_{\alpha,i} \propto \kappa_i$
• For a given $k_i$ we have
• Social explorers (A): $n_{\alpha,i} \gg \kappa_i$
• Balanced (-): $n_{\alpha,i} \simeq \kappa_i$
• Social keepers (B): $n_{\alpha,i} \ll \kappa_i$
[Figure: density plot of $\log n_{\alpha,i}$ versus $\log \kappa_i$, with regions A and B marked.]
Miritello, Lara, Cebrián and EM
Scientific Reports 3, 1950 (2013)
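The explorer/keeper distinction compares the number of ties a user forms with the number of ties the user keeps open. A minimal sketch of such a labelling is given below; the function name and the cut-off `factor=2.0` are hypothetical choices made here for illustration, since the slide only states the qualitative regimes.

```python
def classify_strategy(n_alpha, kappa, factor=2.0):
    """Label a user by comparing ties formed (n_alpha) with social capacity (kappa).

    `factor` is an illustrative threshold, not a value taken from the paper.
    """
    if kappa <= 0:
        raise ValueError("kappa must be positive")
    ratio = n_alpha / kappa
    if ratio >= factor:
        return "explorer"   # forms many more ties than are kept open
    if ratio <= 1 / factor:
        return "keeper"     # keeps many more ties open than are formed
    return "balanced"

print(classify_strategy(23, 4))   # a user with many new ties and a small capacity
print(classify_strategy(3, 24))   # a user with few new ties and a large capacity
print(classify_strategy(10, 10))
```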
46. @estebanmoro
• How are ties formed and destroyed?
• Social explorer: $n_{\alpha,i} = 23$, $\kappa_i = 4$
• Social keeper: $n_{\alpha,i} = 3$, $\kappa_i = 24$
49. @estebanmoro
• Wrap-up
• Triadic closure and reciprocity are predictors for the formation of a link
• Embeddedness and reciprocity are predictors for the persistence of a link
• Tie formation/decay is bursty
• Tie formation/decay strategy:
• Heterogeneous
• Linear in time
• Social explorers / social keepers
50. [Timescale diagram: tie activity is bursty (t); groups of conversation (t1, t1+dt); ties form/decay (t1, t2, t3); communities form/change/decay (t1, t2); networks form/change/decay (t1, t2).]
4 Community activity
51. @estebanmoro
• What structural properties influence community evolution?
• Communities = dense areas of the network
• Detection of communities: clique percolation algorithm http://cfinder.org
... dominated by single links, whereas the co-authorship data have many dense, highly connected neighbourhoods. Furthermore, the links in the phone network correspond to instant communication events, capturing a relationship as it happens. In contrast, the co-authorship data ...
[Figure, panels a to d: (a) Co-authorship, (b) Phone call; panels (c) and (d) show zip-code data.]
Palla, G., Barabasi, A.-L. & Vicsek, T., 2007. Quantifying social group evolution. Nature, 446(7), pp.664–667.
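The idea behind clique percolation can be sketched for the simplest case, k = 3: two triangles belong to the same community when they share an edge, and a node may end up in several communities. The code below is a small illustration written for this transcript, not the CFinder implementation; the function names and the toy edge list are assumptions.

```python
from itertools import combinations

def triangles(adj):
    """All 3-cliques as sorted tuples; adj maps node -> set of neighbours."""
    found = set()
    for u in adj:
        for v, w in combinations(sorted(adj[u]), 2):
            if w in adj[v]:
                found.add(tuple(sorted((u, v, w))))
    return sorted(found)

def clique_percolation_k3(edges):
    """Communities as unions of triangles chained through shared edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    tris = triangles(adj)
    parent = list(range(len(tris)))  # union-find over triangle indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Two triangles are adjacent when they share k - 1 = 2 nodes.
    for i, j in combinations(range(len(tris)), 2):
        if len(set(tris[i]) & set(tris[j])) == 2:
            parent[find(i)] = find(j)

    groups = {}
    for i, tri in enumerate(tris):
        groups.setdefault(find(i), set()).update(tri)
    return sorted(groups.values(), key=sorted)

toy_edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4),
             (4, 5), (4, 6), (5, 6), (6, 7)]
communities = clique_percolation_k3(toy_edges)
print(communities)
```

In the toy graph, node 4 belongs to both communities and node 7 belongs to none, which shows the overlapping nature of this definition.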
52. @estebanmoro
• Dynamics of communities:
• For each interval of time calculate the communities
• Compare them at different time intervals
[Figure: possible events in community evolution between t and t + 1: growth, contraction, merging, splitting, birth, death.]
53. @estebanmoro
• Community evolution depends on its size:
• Larger communities are on average older
• Membership of large communities is changing at a higher rate
... However, as Fig. 1d shows, the situation is more complex: on average, the smaller communities are more homogeneous in respect of both the zip-code and the age, but there is still a noticeable peak at $s \approx 30$–$35$ for the zip-code. In summary, the phone-call communities uncovered by the CPM tend to contain individuals living in the same ...

The basic events that may occur in the life of a community are shown in Fig. 1e: a community can grow or contract; groups may merge or split; new communities are born while others may disappear. We have developed a method for the appropriate matching (between the subsequent states of the evolving communities) from the information available for relatively distant points in time only (see Methods). After determining the dynamically changing community structure, we first consider two basic quantities characterizing a community: its size $s$ and its age $\tau$, representing the time passed since its birth. The quantities $s$ and $\tau$ are positively correlated: larger communities are on average older (Fig. 2a). Next we used the auto-correlation function, $C(t)$, to quantify the relative overlap between two states of the same community $A(t)$ at $t$ time steps apart:

$C(t) \equiv \frac{|A(t_0) \cap A(t_0 + t)|}{|A(t_0) \cup A(t_0 + t)|}$

where $|A(t_0) \cap A(t_0 + t)|$ is the number of common nodes (members) in $A(t_0)$ and $A(t_0 + t)$, and $|A(t_0) \cup A(t_0 + t)|$ is the number of nodes in the union of $A(t_0)$ and $A(t_0 + t)$. Figure 2b shows the average time-dependent auto-correlation function for communities born with different sizes. The data indicate that the collaboration network is more ‘dynamic’ ($\langle C(t) \rangle$ decays faster). We also find that in both networks the auto-correlation function decays faster for the larger communities, showing that the membership of the larger communities is changing at a higher rate. In contrast, small communities change at a smaller rate, their composition being more or less static. To quantify this aspect of community evolution, we define the stationarity $\zeta$ of a community as the average correlation between subsequent states:

$\zeta \equiv \frac{\sum_{t = t_0}^{t_{max} - 1} C(t, t + 1)}{t_{max} - t_0 - 1}$

where $t_0$ denotes the birth of the community, and $t_{max}$ is the last step before the extinction of the community. Thus, $1 - \zeta$ represents the average ratio of members changed in one step.

[Figure 2, panels a to d: (a) $\langle \tau(s) \rangle / \langle \tau \rangle$ versus community size $s$ for the co-authorship and phone-call networks; (b) $\langle C(t) \rangle$ versus $t$ for communities of size $s$ = 6, 12 and 18 in both networks; (c, d) $\langle \tau^* \rangle$ as a function of $\zeta$ and $s$.]
Figure 2 | Characteristic features of community evolution. a, The age $\tau$ of communities with a given size (number of people) $s$, averaged over the set of available communities and the time steps, divided by the average age of all ...
Slide annotations: time since birth of the community; community size; unit time = 2 weeks.
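The auto-correlation and stationarity measures used for this slide are simple set operations on successive membership lists. The sketch below is an illustration with made-up communities; the function names are assumptions, and the stationarity is computed as a plain average over consecutive pairs of snapshots, which may differ from the normalization printed in the paper by one term.

```python
def autocorrelation(state_a, state_b):
    """Relative overlap |A ∩ B| / |A ∪ B| between two states of a community."""
    a, b = set(state_a), set(state_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def stationarity(states):
    """Average overlap between subsequent states, from birth to the last step."""
    if len(states) < 2:
        raise ValueError("need at least two snapshots")
    overlaps = [autocorrelation(x, y) for x, y in zip(states, states[1:])]
    return sum(overlaps) / len(overlaps)

# Made-up membership snapshots of two communities.
static_group = [{1, 2, 3}, {1, 2, 3}, {1, 2, 3}]
changing_group = [{1, 2, 3, 4}, {2, 3, 4, 5}, {3, 4, 5, 6}]

print(stationarity(static_group))
print(stationarity(changing_group))
```

A value close to 1 means the membership barely changes from one step to the next; lower values mean a higher turnover.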
54. @estebanmoro
• Community evolution depends on its size:
• Large communities are on average older
• Membership of large communities is changing at a higher rate
• Thus: large communities survive by continually changing membership
... average life-span is at a stationarity value very close to one, indicating that a typical small and stationary community undergoes minor changes, but lives for a long time. This is well illustrated by the snapshots of the community structure, showing that the community’s stability is conferred by a core of three individuals representing a collaborative group spanning over 52 months. In contrast, a small community with high turnover of its members has a lifetime of nine time steps only (Fig. 3b). The opposite is seen for large communities: a large stationary community disintegrates after four time steps (Fig. 3c). In contrast, a large non-stationary community whose members change dynamically, resulting in significant fluctuations in both size and composition, has a quite extended lifetime (Fig. 3d). The different stability rules followed by the small and large communities raise an important question: could the inspection ...

[Figure 3, panels a to d: community size $s$ versus age $\tau$ for a small stationary, a small non-stationary, a large stationary and a large non-stationary community, with snapshots at selected $\tau$; legend: old, new, leaving in next step.]
Figure 3 | Evolution of four types of communities in the co-authorship network. The height of the columns corresponds to the actual community size, $s$, and within one column the yellow colour indicates the number of ‘old’ nodes (that have been present in the community at least in the previous time step as well), while newcomers are shown with green. The members ...
55. @estebanmoro
• Is there any structural property that predicts community evolution?
• YES: how “committed” members are to the community
• Commitment = F(strong links, embeddedness, etc.) = I/O ratio
[Figure 4, panels a and b: $p_l$ and $p_d$ for the phone-call and co-authorship networks, plotted against $w_{out}/(w_{in} + w_{out})$ and $W_{out}/(W_{in} + W_{out})$; insets show $\langle \tau_n \rangle$ and $\langle \tau^* \rangle$.]
Figure 4 | Effects of links between communities. a, The probability $p_l$ of a member abandoning the community in the next step as a function of the ...
Slide annotation: $\frac{W_{out}}{W_{in} + W_{out}}$
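The ratio on this slide compares the weight of a member's links pointing outside the community with the member's total link weight. A minimal sketch of that computation follows; the data layout (a dictionary of weighted pairs) and the function name are assumptions for illustration.

```python
def outside_weight_fraction(member, community, weights):
    """w_out / (w_in + w_out) for one member.

    weights maps (u, v) pairs to link weights; each undirected link appears once.
    """
    w_in = w_out = 0.0
    for (u, v), w in weights.items():
        if member == u:
            other = v
        elif member == v:
            other = u
        else:
            continue
        if other in community:
            w_in += w
        else:
            w_out += w
    total = w_in + w_out
    return w_out / total if total else 0.0

# Toy example: member "a" splits its activity between the group and an outsider.
community = {"a", "b", "c"}
weights = {("a", "b"): 3.0, ("a", "c"): 1.0, ("a", "x"): 4.0}
print(outside_weight_fraction("a", community, weights))
print(outside_weight_fraction("b", community, weights))
```

A member whose fraction is close to 0 has almost all of its activity inside the community; a value close to 1 indicates weak commitment.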
56. @estebanmoro
• What makes a node join a community?
• Explicit definition of communities
• LiveJournal
• Free on-line community with ~ 10M members
• 300,000 update their content in a 24-hour period
• Maintaining journals, individual and group blogs
• Declaring who are their friends and to which communities they belong
• DBLP
• On-line database of computer science publications (about 400,000 papers)
• Friendship network = co-authorship of a paper
• Community = conference
Backstrom, L. et al., 2006. Group formation in large
social networks: membership, growth, and evolution.
57. @estebanmoro
• Is membership a behavior spreading on the network? (contagion/influence)
• Yes: the probability of joining a community depends on the number of friends already in the community
[Figure 1: The probability p of joining a LiveJournal community when k friends are already members; probability (0 to 0.025) versus k (0 to 50).]
Diminishing returns
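A curve like the one on this slide can be estimated by grouping observations by the number of friends k already in the community and taking the fraction that joined. The sketch below uses invented observations; the input format and the function name are assumptions, not the procedure of the cited paper.

```python
from collections import Counter

def join_probability_by_k(observations):
    """Estimate p(k) from (k, joined) pairs.

    k is the number of friends already in the community at the first snapshot;
    joined says whether the user was a member at the second snapshot.
    """
    exposed, joined = Counter(), Counter()
    for k, did_join in observations:
        exposed[k] += 1
        if did_join:
            joined[k] += 1
    return {k: joined[k] / exposed[k] for k in sorted(exposed)}

# Invented observations: 10 users with one friend inside, 10 users with two.
observations = ([(1, False)] * 9 + [(1, True)]
                + [(2, False)] * 8 + [(2, True)] * 2)
p_k = join_probability_by_k(observations)
print(p_k)
```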
58. @estebanmoro
• Is membership a behavior diffusing on the network?
• Beyond the dyadic approximation:
• Probability to join depends on the clustering of friends in the community
[Diagram: network at t and t+1.]
Figure 3: The top two levels of decision tree splits for predicting single individuals joining communities in LiveJournal. The overall rate of joining is 8.48e-4.
[Figure: probability of joining a community versus adjacent pairs of friends in the community; probability (0.002 to 0.009) versus proportion of pairs adjacent (0 to 1), with curves for 3, 4 and 5 friends.]
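The horizontal axis of the plot for this slide, the proportion of adjacent pairs, is the fraction of pairs among a user's friends inside the community that are themselves linked. A minimal sketch, with a hypothetical function name and a toy friendship graph:

```python
from itertools import combinations

def adjacent_pair_fraction(friends_in_community, adj):
    """Fraction of friend pairs that are directly linked to each other.

    adj maps each node to the set of its neighbours (undirected graph).
    """
    pairs = list(combinations(sorted(friends_in_community), 2))
    if not pairs:
        return 0.0
    linked = sum(1 for u, v in pairs if v in adj.get(u, set()))
    return linked / len(pairs)

# Toy graph: of the three friends inside the community, only a and b know each other.
adj = {"a": {"b"}, "b": {"a"}, "c": set()}
print(adjacent_pair_fraction({"a", "b", "c"}, adj))
```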
59. @estebanmoro
• Wrap-up
• Community evolution
• Larger communities last longer
• Larger communities change at higher rate
• Communities with larger I/O ratio are more stable
• Node dynamics
• Probability to join a community depends on the number of friends already in the community
• Probability to join depends on the clustering of friends in the community
60. [Timescale diagram: tie activity is bursty (t); groups of conversation (t1, t1+dt); ties form/decay (t1, t2, t3); communities form/change/decay (t1, t2); networks form/change/decay (t1, t2).]
4 Network activity
61. @estebanmoro
• How does a network grow?
• Node arrival is not linear, and neither is edge arrival
[Figure 9, two panels: gap parameters α and β versus current degree d (0 to 100) for Flickr, LinkedIn, Answers and Delicious.]
Figure 9: Evolution of the α and β parameters with the current node degree d. α remains constant, and β linearly increases.

[Figure 10, four panels: number of nodes versus time, with fitted curves.]
(a) FLICKR: $N(t) \propto e^{0.25 t}$ (t in months)
(b) DELICIOUS: $N(t) = 16 t^2 + 3 \times 10^3 t + 4 \times 10^4$ (t in months)
(c) ANSWERS: $N(t) = -284 t^2 + 4 \times 10^4 t - 2.5 \times 10^3$ (t in weeks)
(d) LINKEDIN: $N(t) = 3900 t^2 + 7600 t - 1.3 \times 10^5$ (t in months)
Figure 10: Number of nodes over time.
Given the above observation, a natural hypothesis would be that ...

degree d       power law   power law exp. cutoff   log normal   stretched exp.
1              9.84        12.50                   11.65        12.10
2              11.55       13.85                   13.02        13.40
3              10.53       13.00                   12.15        12.59
4              9.82        12.40                   11.55        12.05
5              8.87        11.62                   10.77        11.28
avg., d ≤ 20   8.27        11.12                   10.23        10.76
Table 3: Edge gap distribution: percent improvement of the log-likelihood at MLE over the exponential distribution.

[Figure 8, four panels (a) FLICKR, (b) DELICIOUS, (c) ANSWERS, (d) LINKEDIN: gap probability p(δ(1)) versus gap δ(1) on log-log axes.]
Figure 8: Edge gap distribution for a node to obtain the second edge, δ(1), and MLE power law with exponential cutoff fits.

6.1.2 Time gap between the edges

Network                        T      N           E
FLICKR (03/2003–09/2005)       621    584,207     3,554,130
DELICIOUS (05/2006–02/2007)    292    203,234     430,707
ANSWERS (03/2007–06/2007)      121    598,314     1,834,217
LINKEDIN (05/2003–10/2006)     1294   7,550,955   30,682,028
Table 1: Network dataset statistics (remaining columns cropped in the slide).

[Figure 1, six panels: edge probability $p_e(d)$ versus destination node degree d on log-log axes, with fits (a) Gnp: $p_e(d) \propto d^{0}$; (b) PA: $p_e(d) \propto d^{1}$; (c) FLICKR: $p_e(d) \propto d^{1}$; (d) DELICIOUS: $p_e(d) \propto d^{1}$; (e) ANSWERS: $p_e(d) \propto d^{0.9}$; (f) LINKEDIN: $p_e(d) \propto d^{0.6}$.]
Leskovec, J. et al., 2008. Microscopic evolution of social networks. In KDD '08: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM.
62. @estebanmoro
• How does a network grow?
• Edge dynamics: Preferential attachment
• Triadic closure
Network                        T      N           E            Eb           Eu           triangle-closing edges   %       ρ      κ
DELICIOUS (05/2006–02/2007)    292    203,234     430,707      348,437      348,437      96,387                   27.66   1.15   0...
ANSWERS (03/2007–06/2007)      121    598,314     1,834,217    1,067,021    1,300,698    303,858                  23.36   1.25   0...
LINKEDIN (05/2003–10/2006)     1294   7,550,955   30,682,028   30,682,028   30,682,028   15,201,596               49.55   1.14   1...
Table 1: Network dataset statistics. Eb is the number of bidirectional edges, Eu is the number of edges in undirected network, ... the number of edges that close triangles, % is the fraction of triangle-closing edges, ρ is the densification exponent ($E(t) \propto N(t)^{\rho}$) and κ is the decay exponent ($E_h \propto \exp(-\kappa h)$) of the number of edges $E_h$ closing h hop paths (see Section 5 and Figure 4).

[Figure 1, six panels: edge probability $p_e(d)$ versus destination node degree d on log-log axes, with fits (a) Gnp: $p_e(d) \propto d^{0}$; (b) PA: $p_e(d) \propto d^{1}$; (c) FLICKR: $p_e(d) \propto d^{1}$; (d) DELICIOUS: $p_e(d) \propto d^{1}$; (e) ANSWERS: $p_e(d) \propto d^{0.9}$; (f) LINKEDIN: $p_e(d) \propto d^{0.6}$.]
Figure 1: Probability $p_e(d)$ of a new edge e choosing a destination at a node of degree d.
... and for every edge that arrives into the network, we measure the likelihood that the particular edge endpoints would be chosen under ...

[Figure 2, four panels (a) FLICKR, (b) DELICIOUS, (c) ANSWERS, (d) LINKEDIN: average number of created edges e(a) versus node age a in weeks.]
Figure 2: Average number of edges created by a node of ...

Next we turn to our four networks and fit the function $p_e(d) \propto d^{\tau}$. In FLICKR, Figure 1(c), degree 1 nodes have lower probability of being linked as in the PA model; the rest of the edges could be explained well by PA. In DELICIOUS, Figure 1(d), the fit nicely follows PA. In ANSWERS, Figure 1(e), the presence of PA is slightly weaker, with $p_e(d) \propto d^{0.9}$. LINKEDIN has a very different pattern: edges to the low degree nodes do not attach preferentially (the fit is $d^{0.6}$), whereas edges to higher degree nodes are more “sticky” (the fit is $d^{1.2}$). This suggests that high-degree nodes in LINKEDIN get super-preferential treatment. To summarize, even though there are minor differences in the exponents τ for each of the four networks, ...

[Figure 4, six panels: number of edges versus hops h, with fits (c) FLICKR: $\propto e^{-1.45 h}$; (d) DELICIOUS: $\propto e^{-0.8 h}$; (e) ANSWERS: $\propto e^{-0.95 h}$; (f) LINKEDIN: $\propto e^{-1.04 h}$; panels (a) Gnp and (b) PA shown for comparison.]
Figure 4: Number of edges $E_h$ created to nodes h hops away. h = 0 counts the number of edges that connected previously disconnected components.
... number of edges decays exponentially with the hop distance between the nodes (see Table 1 for fitted decay exponents κ). This ...

[Figure 5, six panels (a) Gnp, (b) PA, (c) FLICKR, (d) DELICIOUS, (e) ANSWERS, (f) LINKEDIN: edge probability $p_e(h)$ versus hops h.]
Figure 5: Probability of linking to a random node at h hops from source node. Value at h = 0 hops is for edges that connect previously disconnected components.
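The quantity $p_e(d)$ can be estimated from a time-ordered edge list by counting how often a new edge lands on a node of degree d and dividing by how many nodes of degree d were available at those moments. The sketch below is a simplified reading of that normalization, written for this transcript; the function name and the handling of newly arriving nodes (they enter with degree 0 just before their first edge) are assumptions.

```python
from collections import Counter

def edge_destination_probability(edge_stream):
    """Estimate p_e(d) from (source, destination) pairs in arrival order."""
    degree = Counter()             # current degree of every node seen so far
    nodes_with_degree = Counter()  # how many nodes currently have each degree
    hits, exposure = Counter(), Counter()
    for src, dst in edge_stream:
        for node in (src, dst):
            if node not in degree:         # register newcomers with degree 0
                degree[node] = 0
                nodes_with_degree[0] += 1
        hits[degree[dst]] += 1             # degree of the chosen destination
        for deg, count in nodes_with_degree.items():
            exposure[deg] += count         # nodes of each degree that could be chosen
        for node in (src, dst):            # apply the new edge
            nodes_with_degree[degree[node]] -= 1
            degree[node] += 1
            nodes_with_degree[degree[node]] += 1
    return {d: hits[d] / exposure[d] for d in sorted(hits)}

# Toy stream in which every new node attaches to the same hub "b".
stream = [("a", "b"), ("c", "b"), ("d", "b")]
print(edge_destination_probability(stream))
```

Under pure preferential attachment the estimate should grow roughly linearly with d; on real data it would be fitted to $d^{\tau}$ on log-log axes.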
63. @estebanmoro
• But how do people get to the network?
• Adoption: contagion + mass media = Bass model
• Homophily: geography
... the creators in the form of advertising and marketing or done implicitly by the users through word-of-mouth. As explained above, we use Google news and search volumes as a proxy for these media influences. Importantly, we note that media coverage ...
Figure 1. Plots of weekly national adoption. (a.) The number of new U.S. Twitter users is plotted for each week, norma... weekly increase during the entire period of data collection. (b.) The cumulative total number of U.S. Twitter users is plotted ... period. Google search and news volumes are normalized such that the maximum value is 1. doi:10.1371/journal.pone.0029528.g001

$\frac{dn}{dt} = p(t)(1 - n) + q\,n(1 - n)$

The first term, $p(t)(1 - n)$, is the mass media contribution; the second term, $q\,n(1 - n)$, is contagion.
Toole, J.L., Cha, M. & Gonzalez, M.C., 2011. Modeling the adoption of innovations in the presence of geographic and media influences. PLoS ONE, 7(1), pp.e29528–e29528.
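The adoption equation $dn/dt = p(t)(1 - n) + q\,n(1 - n)$ can be integrated numerically to see how the media term and the contagion term interact. The sketch below uses a plain forward Euler step; the parameter values, the step size and the function name are illustrative assumptions, not values fitted in the cited paper.

```python
def simulate_bass(p, q, n0=0.0, dt=0.1, steps=1000):
    """Forward-Euler integration of dn/dt = p(t)(1 - n) + q n (1 - n).

    p is a function of time (media influence); q is the contagion rate;
    n is the adopted fraction of the population.
    """
    n, path = n0, [n0]
    for step in range(steps):
        t = step * dt
        rate = p(t) * (1 - n) + q * n * (1 - n)
        n = min(1.0, n + dt * rate)
        path.append(n)
    return path

# Constant weak media influence plus contagion: the classic S-shaped curve.
with_media = simulate_bass(lambda t: 0.01, q=0.5)
# No media and no initial adopters: contagion alone cannot start.
no_seed = simulate_bass(lambda t: 0.0, q=0.5)

print(f"adopted fraction with media at the end: {with_media[-1]:.3f}")
print(f"adopted fraction without media or seed: {no_seed[-1]:.3f}")
```

The second run shows why the media term matters in the model: with $n = 0$ and $p = 0$ the right-hand side vanishes, so adoption never begins.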
64. @estebanmoro
• Users' leaving behavior (churn) is also contagious
... two users A and B, representing the proportion of their common friends, as $O_{AB} = N_{AB} / ((K_A - 1) + (K_B - 1) - N_{AB})$, where $N_{AB}$ is the number of common neighbors of A and B, and $K_A$ ($K_B$) denotes the degree of node A (B). Fig. 3(d) demonstrates the effect of removing links in order of strongest (or weakest) overlaps. In both cases, we find that removing ties in rank order of weakest to strongest ties will lead to a sudden disintegration of the network. In contrast, reversing the order shrinks the network without precipitously breaking it apart.

... links will delete bridges that connect different communities, leading to a network collapse. Further, we believe [...] observed local relationship, between network topology [...] strength affects any global information diffusion process [...] churn). In fact, we opine that churn as a behavior can be [...] less as a dyadic phenomenon (affected only by strong [...] churner ties), but more as a diffusion process where both [...] and weak ties play a significant role in spreading the [...] through the network topology.

4. PREDICTING CHURNERS IN THE CALL GRAPH
We next discuss how to exploit social ties to identify [...] churners in an operator’s network. Our approach is as follows: [...] start with a set of churners (e.g. for April) and the [...] relationships (ties) captured in the call graph (for March). [...] the underlying topology of the call graph, we then [...] diffusion process with the churners as seeds. Effectively, [...] model a “word-of-mouth” scenario where a churner influences one of his neighbors to churn, from where the influence spreads to some other neighbor, and so on. At the end of the [...] process, we inspect the amount of influence received [...] node. Using a threshold-based technique, a node that is [...] not a churner can be declared to be a potential future [...] on the influence that has been accumulated. Finally, we [...] the number of correct predictions by tallying with the actual churners that were recorded for a subsequent month [...] May). The diffusion model is based on Spreading Activation (SPA) techniques proposed in cognitive psychology and [...] for trust metric computations [32]. In essence, SPA is [...] performing a breadth-first search on the call graph [...]. The basic steps are outlined below:
Node Activation: During each iterative step i, there is [...] active nodes. Let X be an active node which has associat[...]

[Figure 2, panels (a) and (b): (a) probability of churning versus number of churner neighbours (0 to 25) for May, June and July churners; (b) probability versus proportion of pairs adjacent (0 to 1) for 3, 4, 5 and 6 churners.]
Figure 2. Probability of churning when (a) k friends have already churned (b) adjacent pairs of friends have already churned
This result is broadly consistent with the strength of weak ties hypothesis [5], offering one of its first confirmations in mobile ...
Dasgupta, K. et al., 2008. Social ties and their relevance
to churn in mobile telecom networks.
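The seed-and-spread idea in the excerpt above (churners as seeds, influence diffusing over call-graph ties, a threshold deciding who is flagged) can be illustrated with a minimal sketch. This is a simplified stand-in, not the exact SPA procedure of Dasgupta et al.: the function names, the `spreading_factor` parameter, the fixed number of steps and the weight-proportional split are all assumptions made for illustration.

```python
from collections import defaultdict


def spread_influence(graph, seeds, spreading_factor=0.5, steps=3):
    """Diffuse influence from seed churners over a weighted call graph.

    graph: dict mapping node -> dict of neighbor -> tie weight (hypothetical format).
    seeds: iterable of nodes known to have churned.
    Returns a dict node -> accumulated influence.
    """
    energy = defaultdict(float)
    for s in seeds:
        energy[s] = 1.0
    active = dict(energy)  # energy that each active node spreads in this step
    for _ in range(steps):
        next_active = defaultdict(float)
        for node, e in active.items():
            neighbors = graph.get(node, {})
            total_weight = sum(neighbors.values())
            if total_weight == 0:
                continue
            for nbr, w in neighbors.items():
                # Assumption: a fixed fraction of the energy is passed on,
                # split among neighbors in proportion to tie weight.
                share = spreading_factor * e * w / total_weight
                energy[nbr] += share
                next_active[nbr] += share
        active = next_active
    return dict(energy)


def predict_churners(graph, seeds, threshold, **kwargs):
    """Flag non-seed nodes whose accumulated influence reaches the threshold."""
    seeds = set(seeds)
    influence = spread_influence(graph, seeds, **kwargs)
    return {n for n, e in influence.items() if n not in seeds and e >= threshold}


# Toy call graph: a - b - c, all ties with weight 1.
toy_graph = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0},
}
predicted = predict_churners(toy_graph, {"a"}, threshold=0.3, steps=2)
```

Predictions would then be compared with the churners actually recorded in a later month, as the excerpt describes; the threshold and spreading factor would have to be tuned on real data.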
66. @estebanmoro
• Users leave once they have fewer than k neighbors. Thus the relevant structure is
the k-core of the network: the part of the network that remains once nodes with
degree smaller than k are iteratively removed
• There is a critical k-coreness Ks in each network
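The k-core definition above can be made concrete with a short peeling sketch. It assumes an undirected network stored as a dict of neighbor sets; the names `core_numbers` and `k_core` are illustrative and not taken from the slides.

```python
def core_numbers(adj):
    """Return the k-coreness of every node by iterative peeling.

    adj: dict mapping node -> set of neighbors (undirected, no self-loops).
    """
    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Nodes whose current degree is at most k cannot belong to the (k+1)-core.
        peel = [n for n in remaining if degree[n] <= k]
        if not peel:
            k += 1
            continue
        while peel:
            n = peel.pop()
            if n not in remaining:
                continue  # already removed via another path
            remaining.discard(n)
            core[n] = k
            for m in adj[n]:
                if m in remaining:
                    degree[m] -= 1
                    if degree[m] <= k:
                        peel.append(m)  # removal cascades to this neighbor
    return core


def k_core(adj, k):
    """Nodes that survive once all nodes with fewer than k neighbors are peeled away."""
    return {n for n, c in core_numbers(adj).items() if c >= k}


# Toy network: a triangle (a, b, c) with one pendant node d attached to a.
toy = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
coreness = core_numbers(toy)
```

In the toy network the pendant node leaves first, while the triangle survives as the 2-core: the cascade stops where every remaining node still has at least k neighbors.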
67. @estebanmoro
• Thus: are unsuccessful networks those with small k-cores?
David Garcia Chair of Systems Design www.sg.ethz.ch
The Autopsy of Friendster Measuring Social Resilience
Empirical K-core decompositions
Conference in Online Social Networks. Boston, US October 7th, 2013
NO!
k-coreness
Colored area is proportional to the number of nodes
A large proportion of nodes have a large k-coreness
68. @estebanmoro
• Assumption: Ks increases with time (linearly)
• Network decay is a much more complex process
To conclude our analysis, we explored how the spread of departures captured in the k-core
decomposition (see Section 3.3) can describe the collapse of Friendster as an OSN. As we do
not have access to the precise amount of active users of Friendster, we proxy its value through
the Google search volume of www.friendster.com. The inset of Figure 6 shows the relative weekly
search volume from 2004, where the increase of popularity of Friendster is evident. At some
point in 2009, Friendster introduced changes in its user interface, coinciding with some technical
problems, and the rise of popularity of Facebook. This led to the fast decrease of active users
in the community, ending on its discontinuation in 2011.
Figure 6: Weekly Google search trend volume for Friendster. The red line shows the estimation of the remaining users in a process of unraveling. Inset: time series of fraction of nodes with ks 6. (Axes: Google search volume (%) vs. date, 2 Jul 2009 to 1 Jan 2011; inset axis P(ks6); legend: Model.)
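The linear-Ks assumption on this slide can be turned into a toy unraveling curve. The sketch below is only an illustration of that assumption, not the fitted model from the Friendster paper: the function name, the parameters `k0` and `rate`, and the convention that users whose coreness equals the threshold stay are all hypothetical.

```python
def remaining_fraction(coreness_values, k0, rate, t):
    """Fraction of users still present at time t.

    coreness_values: list of per-user k-coreness values.
    Assumption: the critical coreness grows linearly, Ks(t) = k0 + rate * t,
    and a user stays only while their coreness is at least Ks(t).
    """
    threshold = k0 + rate * t
    alive = sum(1 for c in coreness_values if c >= threshold)
    return alive / len(coreness_values)


# Toy coreness sample; real values would come from a k-core decomposition.
sample = [1, 2, 2, 5, 8]
curve = [remaining_fraction(sample, k0=1, rate=1, t=t) for t in range(0, 9)]
```

Under this assumption the shape of the decay is set entirely by the coreness distribution: a network where most nodes have large k-coreness loses few users until Ks(t) reaches the bulk of the distribution.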
69. @estebanmoro
• Wrap up
• Network growth
• Nodes and edges arrive in the network through a complex process (mass media)
• Edges are created through preferential attachment and triadic closure
• Joining the network is also a contagious process
• Network death
• Users' leaving behavior is also contagious and depends on their embeddedness
• Network resilience has an impact through the k-core of the network