1. Outline Social Influence Community Detection
Social Influence & Community Detection
V.A. Traag
February 13, 2009
2. Outline Social Influence Community Detection
Outline
1 Social Influence
Introduction
BA-model
Social influence model
Empirical results
Further research
2 Community Detection
Introduction
Modularity & Potts model
Negative links
Empirical example
Further research
4. Outline Social Influence Community Detection
Introduction
• What items (e.g. movies, books) become popular?
• Based on an extension of the BA-model.
(Social influence balancing parameter)
• The idea emerged from the web-based experiment of Salganik et al.
(Science, 2006).
5. Outline Social Influence Community Detection
Experiment from Salganik et al.
[Diagram: arriving users are spread over three conditions, each with worlds 1 to 8: no social influence, social influence, and more social influence.]
More social influence leads to more inequality and uncertainty.
7. Outline Social Influence Community Detection
BA-model
• Rich-get-richer effect.
• Web sites (items) attract links (votes) proportional to the
number of links (votes).
$$\dot{k}_i = m\,\frac{k_i}{\sum_j k_j}$$
• Yields stationary degree distribution.
$$\Pr(X = k) = 2m^2 k^{-3}$$
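A brief sketch of how the power law follows from the growth equation. It relies on the usual continuum assumptions, which are not stated on the slide: one node arrives per time step with $m$ links, so that $\sum_j k_j \approx 2mt$; node $i$ starts with $k_i(t_i) = m$; and arrival times $t_i$ are uniform on $[0, t]$.

```latex
\begin{align*}
\dot{k}_i &= m\,\frac{k_i}{2mt} = \frac{k_i}{2t}
  \quad\Rightarrow\quad k_i(t) = m\left(\frac{t}{t_i}\right)^{1/2},\\
\Pr\big(k_i(t) < k\big) &= \Pr\!\left(t_i > \frac{m^2 t}{k^2}\right) = 1 - \frac{m^2}{k^2},\\
\Pr(X = k) &= \frac{\partial}{\partial k}\left(1 - \frac{m^2}{k^2}\right) = 2m^2 k^{-3}.
\end{align*}
```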
8. Outline Social Influence Community Detection
Social influence
• Additional good-get-richer effect.
• Introduce quality φ ≥ 0 with mean quality µ and variance σ.
• Balance quality and popularity through parameter 0 ≤ λ ≤ 1.
• New differential equation
$$\dot{k}_i = m\left[(1-\lambda)\,\frac{\phi_i}{\sum_j \phi_j} + \lambda\,\frac{k_i}{\sum_j k_j}\right]$$
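The growth rule can be read as a mixture: with probability λ a vote follows popularity, otherwise it follows quality. The following is a minimal simulation sketch under that reading. The function name `simulate`, the `quality` callback, and the fallback to quality while no votes exist are assumptions of this example, not part of the slides.

```python
import random

def simulate(steps, m, lam, quality, seed=0):
    """Simulate vote dynamics: one new item per step, followed by m votes.

    Each vote picks item i with probability
    (1 - lam) * phi_i / sum(phi) + lam * k_i / sum(k).
    `quality` is a hypothetical callback rng -> phi > 0.
    """
    rng = random.Random(seed)
    phi, votes = [], []
    for _ in range(steps):
        phi.append(quality(rng))   # new item arrives with its quality
        votes.append(0)
        for _ in range(m):
            # With no votes cast yet, popularity carries no information,
            # so fall back to quality alone (an assumption of this sketch).
            use_popularity = sum(votes) > 0 and rng.random() < lam
            weights = votes if use_popularity else phi
            i = rng.choices(range(len(phi)), weights=weights)[0]
            votes[i] += 1
    return phi, votes

# Example run with exponentially distributed quality (mean 1).
phi, votes = simulate(steps=300, m=3, lam=0.5,
                      quality=lambda rng: rng.expovariate(1.0))
```

Every step adds exactly `m` votes, so the total is `steps * m`; comparing `votes[:150]` with `votes[150:]` gives a quick look at the advantage of older items.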
9. Outline Social Influence Community Detection
Theoretical results
The result is
$$k_i(t) = \left[\left(\frac{t}{t_i}\right)^{\lambda} - 1\right]\frac{(1-\lambda)\,m\,\phi_i}{\mu\lambda},$$
from which we can see that:
• Votes increase with time
• Older items obtain more votes
• Better items obtain more votes (might catch up with older,
but worse, items)
• Higher social influence changes the growth pattern: slower growth
at introduction, but stronger sustained growth later.
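The closed form for $k_i(t)$ can be recovered from the differential equation of the social influence model. The sketch below assumes one new item and $m$ votes per time step, so that $\sum_j \phi_j \approx \mu t$ and $\sum_j k_j \approx mt$, and that item $i$ starts without votes, $k_i(t_i) = 0$; these assumptions are inferred from the form of the result rather than stated on the slides.

```latex
\begin{align*}
\dot{k}_i &= \frac{(1-\lambda)\,m\,\phi_i}{\mu t} + \frac{\lambda k_i}{t},\\
\frac{d}{dt}\!\left(k_i\,t^{-\lambda}\right) &= \frac{(1-\lambda)\,m\,\phi_i}{\mu}\,t^{-\lambda-1}
\;\Rightarrow\;
k_i(t) = C\,t^{\lambda} - \frac{(1-\lambda)\,m\,\phi_i}{\mu\lambda},\\
k_i(t_i) = 0 &\;\Rightarrow\;
k_i(t) = \left[\left(\frac{t}{t_i}\right)^{\lambda} - 1\right]\frac{(1-\lambda)\,m\,\phi_i}{\mu\lambda}.
\end{align*}
```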
10. Outline Social Influence Community Detection
Theoretical results
• For invariant quality, the “uncertainty” distribution is
$$\Pr(X = k \mid \phi) = \frac{\mu\,\big((1-\lambda)\,m\,\phi\big)^{1/\lambda}}{\big(k\lambda\mu + (1-\lambda)\,m\,\phi\big)^{1 + 1/\lambda}}.$$
• Mean popularity and variance
$$E(X \mid \phi) = \frac{m\phi}{\mu} \quad\text{and}\quad \operatorname{Var}(X \mid \phi) = \frac{E(X \mid \phi)^2}{1 - 2\lambda}.$$
• The expected number of votes rises with quality.
• Uncertainty rises with quality and with social influence.
• This agrees with the experiment of Salganik et al.
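The stated mean and variance can be checked numerically. The sketch below integrates the density to the CDF $F(k) = 1 - (1 + k\lambda\mu/a)^{-1/\lambda}$ with $a = (1-\lambda)m\phi$, inverts it on a fine grid of probabilities, and averages. The function name and the parameter values are illustrative assumptions. The variance is finite only for λ < 1/2, which is why the example uses λ = 0.2.

```python
import numpy as np

def conditional_moments(m, phi, mu, lam, n=1_000_000):
    """Approximate E(X|phi) and Var(X|phi) by inverting the CDF on a grid.

    CDF implied by the density: F(k) = 1 - (1 + k*lam*mu/a)**(-1/lam),
    with a = (1 - lam) * m * phi.
    """
    a = (1.0 - lam) * m * phi
    u = (np.arange(n) + 0.5) / n                        # midpoints of n equal-probability bins
    k = a / (lam * mu) * ((1.0 - u) ** (-lam) - 1.0)    # k = F^{-1}(u)
    mean = k.mean()
    var = (k ** 2).mean() - mean ** 2
    return mean, var

# Illustrative parameters: closed forms give E = m*phi/mu = 6 and Var = 36/(1 - 0.4) = 60.
mean, var = conditional_moments(m=3, phi=2.0, mu=1.0, lam=0.2)
```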
11. Outline Social Influence Community Detection
Theoretical results
• Quality distribution is ρ(φ) with mean µ and variance σ.
• The “popularity” distribution can be deduced as
$$\Pr(X = k) = \int_{\phi_{\min}}^{\phi_{\max}} \rho(\phi)\,\Pr(X = k \mid \phi)\,d\phi.$$
• In general, mean popularity and variance are
$$E(X) = m \quad\text{and}\quad \operatorname{Var}(X) = \frac{m^2\left(2\sigma(1-\lambda) + \mu^2\right)}{\mu^2(1 - 2\lambda)}.$$
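These unconditional moments follow from the conditional ones via the laws of total expectation and total variance, using $E(\phi) = \mu$ and $E(\phi^2) = \sigma + \mu^2$ (with $\sigma$ denoting the variance of quality, as on the slides). A short worked derivation:

```latex
\begin{align*}
E(X) &= E\big(E(X\mid\phi)\big) = \frac{m}{\mu}\,E(\phi) = m,\\
\operatorname{Var}(X) &= E\big(\operatorname{Var}(X\mid\phi)\big) + \operatorname{Var}\big(E(X\mid\phi)\big)
 = \frac{m^2(\sigma + \mu^2)}{\mu^2(1-2\lambda)} + \frac{m^2\sigma}{\mu^2}\\
 &= \frac{m^2\left(2\sigma(1-\lambda) + \mu^2\right)}{\mu^2(1-2\lambda)}.
\end{align*}
```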
12. Outline Social Influence Community Detection
Empirical results
• Quality is usually the problem: how can it be estimated?
• Workaround: assume a quality distribution (e.g. Dirac,
Exponential).
• Compare empirical popularity distribution (#views, #sales) to
theoretical distribution.
• Estimate social influence parameter λ using MLE.
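As an illustration of the estimation step, here is a minimal sketch for the Dirac case, where every item has quality φ = µ and the density reduces to $p(k) = ((1-\lambda)m)^{1/\lambda} / (k\lambda + (1-\lambda)m)^{1+1/\lambda}$. The grid search, the function names, the synthetic data, and treating $m$ as known are assumptions of this example; the slides do not say how the MLE was computed.

```python
import numpy as np

def log_likelihood(k, lam, m):
    """Log-likelihood of popularities k under the Dirac-quality density."""
    a = (1.0 - lam) * m
    return np.sum(np.log(a) / lam - (1.0 + 1.0 / lam) * np.log(k * lam + a))

def estimate_lambda(k, m, grid=None):
    """Grid-search MLE of lam in (0, 1); m is taken as known here."""
    if grid is None:
        grid = np.linspace(0.01, 0.99, 99)
    scores = [log_likelihood(k, lam, m) for lam in grid]
    return float(grid[int(np.argmax(scores))])

def sample(n, lam, m, seed=0):
    """Draw synthetic popularities by inverse-transform sampling."""
    u = np.random.default_rng(seed).random(n)
    return (1.0 - lam) * m / lam * ((1.0 - u) ** (-lam) - 1.0)

# Synthetic check: recover a known lam from simulated data.
data = sample(20_000, lam=0.3, m=5.0)
lam_hat = estimate_lambda(data, m=5.0)
```

With real data, `m` could be replaced by the sample mean, since E(X) = m under the model.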
13. Outline Social Influence Community Detection
[Figure: log-log plot of the cumulative popularity distribution Pr(x > k) against k for Hollywood and YouTube, each with a fitted curve.]
Estimated social influence, assuming an exponential quality distribution:
YouTube λ ≈ 0.878
Hollywood λ ≈ 0.663
14. Outline Social Influence Community Detection
Other results
• Other research from Pennock et al. shows additional results.
• Hyperlink distribution fitted per category of websites.
• Fitted parameter relatively high for companies (0.950) and newspapers (0.948).
• Fitted parameter relatively low for universities (0.612) and scientists (0.602).
• Might be used as a rough estimate of the amount of social
influence.
15. Outline Social Influence Community Detection
Social Influence
• Introduce the social influence parameter λ on a network.
• Balance between own preferences and preferences of others.
• Spreading (cascading) of preferences.
• Updating of exclusive preferences might result in community
detection algorithm.
• Popularity of items = size of communities?
One separate topic: estimate social influence in citation
distributions over the last few years. Has it increased?
16. Outline Social Influence Community Detection
Social Balance Theory
[Figure: example network with labels A, B, C, D, E1, E2.]
• Triads (sets of three nodes) are balanced
if their relationships are “symmetric”.
• Triad i, j, k is balanced if $A_{ij}A_{ik}A_{jk} = 1$.
• If a network is balanced, it can be split into
two communities (Harary, 1953).
• Social balance can be extended to
k-balanced: a k-cycle does not contain
exactly one negative edge.
• For unbalanced (or k-balanced) networks,
how can communities be assigned such
that nodes form cohesive groups?
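The triad condition is easy to check mechanically. The sketch below assumes a fully connected signed network stored as a matrix with entries +1 and −1 off the diagonal; the function names are illustrative.

```python
from itertools import combinations

def triad_is_balanced(A, i, j, k):
    """A triad is balanced when the product of its three signs is +1."""
    return A[i][j] * A[i][k] * A[j][k] == 1

def unbalanced_triads(A):
    """Return all triads (i, j, k) whose sign product is -1."""
    return [t for t in combinations(range(len(A)), 3)
            if A[t[0]][t[1]] * A[t[0]][t[2]] * A[t[1]][t[2]] == -1]

# Two friends with a common enemy: balanced.
friends_vs_enemy = [[0, 1, -1],
                    [1, 0, -1],
                    [-1, -1, 0]]
# Three mutual enemies: unbalanced.
all_enemies = [[0, -1, -1],
               [-1, 0, -1],
               [-1, -1, 0]]
```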
17. Outline Social Influence Community Detection
Modularity
Definition
Modularity
$$Q = \frac{1}{m}\sum_{ij}\left(A_{ij} - p_{ij}\right)\delta(\sigma_i, \sigma_j).$$
Newman & Girvan.
• Modularity can also be expressed as
$$Q = \frac{1}{m}\sum_c \left(a_c - e_c\right).$$
• Optimising modularity yields a good community assignment.
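A minimal sketch of the modularity computation for a given assignment. It assumes the null model $p_{ij} = k_i k_j / m$ with $m = \sum_{ij} A_{ij}$, as used in the negative-links example elsewhere in this deck; the function name and the test graph are illustrative.

```python
import numpy as np

def modularity(A, sigma):
    """Q = (1/m) * sum_ij (A_ij - p_ij) * delta(sigma_i, sigma_j),
    with m = sum_ij A_ij and p_ij = k_i * k_j / m (assumed null model)."""
    A = np.asarray(A, dtype=float)
    sigma = np.asarray(sigma)
    k = A.sum(axis=1)                          # degrees
    m = A.sum()                                # total weight (twice the edge count)
    p = np.outer(k, k) / m                     # expected weights
    same = sigma[:, None] == sigma[None, :]    # delta(sigma_i, sigma_j)
    return float(((A - p) * same).sum() / m)

# Example: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

q_split = modularity(A, [0, 0, 0, 1, 1, 1])    # one community per triangle
q_single = modularity(A, [0, 0, 0, 0, 0, 0])   # everything in one community
```

For this graph the split into the two triangles gives Q = 5/14, while placing all nodes in one community gives Q = 0.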
18. Outline Social Influence Community Detection
Potts approach
• Potts approach by Reichardt and Bornholdt: reward “allowed”
links, penalise “forbidden” links.
Allowed • Links within communities
(reward $a_{ij} = 1 - \gamma p_{ij}$).
Forbidden • Absent links within communities
(penalty $b_{ij} = \gamma p_{ij}$).
• Formulated as an “energy/cost” function (Hamiltonian):
$$H = \sum_{ij}\left[-a_{ij}A_{ij} + b_{ij}(1 - A_{ij})\right]\delta(\sigma_i, \sigma_j)$$
• Reformulated, this equals modularity up to a factor $-m$ (if γ = 1):
$$-mQ = H = -\sum_{ij}\left(A_{ij} - \gamma p_{ij}\right)\delta(\sigma_i, \sigma_j)$$
• Results in a tuneable (γ) version of modularity.
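The equivalence between the reward/penalty form and the compact form can be verified numerically. This sketch assumes reward $a_{ij} = 1 - \gamma p_{ij}$ and penalty $b_{ij} = \gamma p_{ij}$ (the choice for which the two forms agree term by term), together with the null model $p_{ij} = k_i k_j / m$; names and the test graph are illustrative.

```python
import numpy as np

def _setup(A, sigma):
    A = np.asarray(A, dtype=float)
    sigma = np.asarray(sigma)
    k, m = A.sum(axis=1), A.sum()
    p = np.outer(k, k) / m                     # assumed null model
    same = sigma[:, None] == sigma[None, :]    # delta(sigma_i, sigma_j)
    return A, p, same

def potts_hamiltonian(A, sigma, gamma=1.0):
    """Reward present links and penalise absent links within communities."""
    A, p, same = _setup(A, sigma)
    a = 1.0 - gamma * p     # reward for a present link
    b = gamma * p           # penalty for an absent link
    return float(((-a * A + b * (1.0 - A)) * same).sum())

def compact_hamiltonian(A, sigma, gamma=1.0):
    """H = -sum_ij (A_ij - gamma * p_ij) * delta(sigma_i, sigma_j)."""
    A, p, same = _setup(A, sigma)
    return float(-((A - gamma * p) * same).sum())

# Two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
sigma = [0, 0, 0, 1, 1, 1]
```

For γ = 1 both functions return −5 on this graph, which equals −mQ with m = 14 and Q = 5/14. Larger γ raises the cost of grouping nodes together, so it tends to favour smaller communities.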
19. Outline Social Influence Community Detection
Problem with negative links
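[Figure: three nodes i, j, k with signed degrees $k_i = 1$, $k_j = 1$, $k_k = -1$.]
Negative links pose a problem for
modularity: the probabilities $p_{ij}$ are not well
defined.
$$A = \begin{pmatrix} + & + & - \\ + & + & - \\ - & - & + \end{pmatrix}$$
$$Q = \frac{1}{m}\sum_{ij}\left(A_{ij} - \frac{k_i k_j}{m}\right)\delta(\sigma_i, \sigma_j) = 0$$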
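The degenerate outcome can be reproduced directly. The sketch takes the matrix as printed (entries ±1, diagonal included) and assumes $m = \sum_{ij} A_{ij}$, which gives signed degrees (1, 1, −1) and m = 1. Under these assumptions $A_{ij} = k_i k_j / m$ for every pair, so Q vanishes for every possible assignment, not only the natural one.

```python
from itertools import product
import numpy as np

A = np.array([[ 1,  1, -1],
              [ 1,  1, -1],
              [-1, -1,  1]], dtype=float)

def modularity(A, sigma):
    k = A.sum(axis=1)                     # signed degrees: [1, 1, -1]
    m = A.sum()                           # signed total weight: 1
    same = np.equal.outer(sigma, sigma)   # delta(sigma_i, sigma_j)
    return float(((A - np.outer(k, k) / m) * same).sum() / m)

# Evaluate Q for every assignment of the three nodes to up to three communities.
values = {sigma: modularity(A, np.array(sigma))
          for sigma in product(range(3), repeat=3)}
```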
20. Outline Social Influence Community Detection
Negative links
• Solution is to change “allowed” and “forbidden” links:
Allowed • Positive links within communities
(reward $a_{ij} = 1 - \gamma p^+_{ij}$).
• Absent negative links within communities
(reward $d_{ij} = \lambda p^-_{ij}$).
Forbidden • Absent positive links within communities
(penalty $b_{ij} = \gamma p^+_{ij}$).
• Negative links within communities
(penalty $c_{ij} = 1 - \lambda p^-_{ij}$).
• Results in two separate Hamiltonians:
$$H^+ = -\sum_{ij}\left(A^+_{ij} - \gamma p^+_{ij}\right)\delta(\sigma_i, \sigma_j) \quad\text{and}\quad H^- = -\sum_{ij}\left(A^-_{ij} - \lambda p^-_{ij}\right)\delta(\sigma_i, \sigma_j).$$
21. Outline Social Influence Community Detection
Hamiltonian
• We weigh both Hamiltonians equally.
• This results in
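$$-mQ = H^+ - H^- = -\sum_{ij}\left[A_{ij} - \left(\gamma p^+_{ij} - \lambda p^-_{ij}\right)\right]\delta(\sigma_i, \sigma_j), \qquad A_{ij} = A^+_{ij} - A^-_{ij}$$
• Changing the expected values in modularity allows
community detection in networks with negative links.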
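A minimal sketch of the combined Hamiltonian for a signed network. It assumes separate null models for the positive and negative parts, $p^\pm_{ij} = k^\pm_i k^\pm_j / m^\pm$ with $m^\pm = \sum_{ij} A^\pm_{ij}$; this choice, the function name, and the example network are assumptions of the example rather than something stated on the slides.

```python
import numpy as np

def signed_hamiltonian(A, sigma, gamma=1.0, lam=1.0):
    """H = H+ - H-, where A = A+ - A- holds the signed link weights."""
    A = np.asarray(A, dtype=float)
    sigma = np.asarray(sigma)
    A_pos, A_neg = np.maximum(A, 0.0), np.maximum(-A, 0.0)

    def null_model(B):
        # Assumed null model: p_ij = k_i * k_j / m, computed per sign.
        k, m = B.sum(axis=1), B.sum()
        return np.outer(k, k) / m if m > 0 else np.zeros_like(B)

    same = sigma[:, None] == sigma[None, :]    # delta(sigma_i, sigma_j)
    H_pos = -((A_pos - gamma * null_model(A_pos)) * same).sum()
    H_neg = -((A_neg - lam * null_model(A_neg)) * same).sum()
    return float(H_pos - H_neg)

# Four nodes: positive links 0-1, 1-2, 2-3; negative links 0-2, 1-3, 0-3.
A = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
for i, j in [(0, 2), (1, 3), (0, 3)]:
    A[i, j] = A[j, i] = -1.0

h_split = signed_hamiltonian(A, [0, 0, 1, 1])    # negative links run between groups
h_single = signed_hamiltonian(A, [0, 0, 0, 0])   # everything in one community
```

For γ = λ = 1 the split {0, 1}, {2, 3} gives H = −4 and the single community gives H = 0, so the assignment that keeps all negative links between communities has the lower cost.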
25. Outline Social Influence Community Detection
Further research
• Apply community detection scheme to citation networks.
• Communities in unsigned networks are ’thematic’ clusters.
• Communities in signed networks are ’positional’ clusters.
• For example: Dutch opinion makers.