Machine Learning Social Network Analysis
- 1. Machine Learning
Application II: Social Network Analysis
Eric Xing
Eric Xing © Eric Xing @ CMU, 2006-2010 1
Lecture 19, August 16, 2010
Reading:
- 6. Global topological measures
Indicate the gross topological structure of the network
Connectivity (degree), path length, clustering coefficient
[Barabasi]
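The three global measures named on this slide can be computed directly on a small graph. The following is a minimal sketch using only the Python standard library; the adjacency-dictionary representation, the function names, and the toy graph are assumptions made for illustration and do not come from the lecture.

```python
from collections import deque

def degrees(adj):
    """Degree of every node; adj maps node -> set of neighbors (undirected)."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def mean_path_length(adj):
    """Average shortest-path length over all connected ordered pairs (BFS per node)."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

def clustering_coefficient(adj, v):
    """Fraction of pairs of v's neighbors that are themselves linked."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return 2.0 * links / (k * (k - 1))

# Hypothetical toy graph: triangle 0-1-2 with a pendant node 3 attached to 2
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(degrees(toy), mean_path_length(toy), clustering_coefficient(toy, 2))
```

Breadth-first search is enough here because the graph is unweighted; for large networks a graph library would normally replace these loops.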
- 7. Local network motif profiles
Regulatory modules within the network
SIM, MIM, FFL, FBL
[Alon]
- 8. Models for Global Network Analysis
A. Random Networks [Erdős and Rényi (1959, 1960)]
$P(k) = \frac{e^{-\langle k \rangle} \langle k \rangle^{k}}{k!}$
Mean path length ~ ln(k)
Phase transition: connected if $p \ge \ln(k)/k$
B. Scale Free [Price, 1965 & Barabasi, 1999]
$P(k) \sim k^{-\gamma}$, $k \gg 1$, $2 < \gamma$
Mean path length ~ lnln(k)
Preferential attachment: add proportionally to connectedness
C. Hierarchical
Copy smaller graphs and let them keep their connections.
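The first two generative models on this slide are easy to simulate. The sketch below is an illustration under stated assumptions, not the lecture's code: `erdos_renyi` draws each possible edge independently, and `preferential_attachment` is a simplified growth process in which every new node adds a single edge to an existing node chosen in proportion to its degree. All function names, the seed, and the graph size are hypothetical.

```python
import math
import random

def erdos_renyi(n, p, rng):
    """G(n, p): include each of the n(n-1)/2 possible edges independently with prob. p."""
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]

def preferential_attachment(n, rng):
    """Grow a graph node by node; each new node links to one existing node
    chosen with probability proportional to that node's current degree."""
    edges = [(0, 1)]
    endpoints = [0, 1]                    # each node appears once per incident edge
    for new in range(2, n):
        target = rng.choice(endpoints)    # degree-proportional choice
        edges.append((new, target))
        endpoints.extend([new, target])
    return edges

def degree_counts(n, edges):
    """Degree of every node 0..n-1 given an edge list."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg

def poisson_pmf(k, mean):
    """P(k) = exp(-mean) * mean**k / k!, the random-network degree law."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

rng = random.Random(0)                    # fixed seed, chosen arbitrarily
n = 200
er_deg = degree_counts(n, erdos_renyi(n, 4.0 / (n - 1), rng))
pa_deg = degree_counts(n, preferential_attachment(n, rng))
print("max degree, random network:", max(er_deg))
print("max degree, preferential attachment:", max(pa_deg))
```

Comparing the two degree lists (for example, their maxima or histograms) is one way to see the contrast between a Poisson-like and a heavy-tailed degree distribution.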
- 9. Models for Meta Analysis of Networks
A. Stochastic Block Models [Holland, Laskey, and Leinhardt 1983]
[Figure: example network with nodes labeled by actor name]
B. Exponential Random Graph Models [Bahadur 1961, Besag 1974, Frank & Strauss 1986]
C. Latent Space Models [Hoff, Raftery, and Handcock, 2002]
- 10. Summary
Impressive graphs!
Seemingly impressive statements about network properties: scale free, small world, …
Can even fit data with some models
But … what about details?
Can we infer …
e.g., the role of every node, the meaning of every edge?
the network topology itself?
Can we predict …
Can we simulate …
Most current analyses tell BIG stories, or obvious stories, but not so useful to serious detail-hunters
- 11. I: Network tomography
Micro-inference vs. meso- or macro-inference
Multi-role of every node
Context dependent role-instantiation
Role dynamics
- 12. Evolving Social Networks
Can I get his vote?
Cooperativity, antagonism, cliques, … over time?
[Figure: network snapshots for March 2005, January 2006, August 2006]
- 13. II: Travel time tomography
- 14. III: Where do networks come from?
Existing work:
Assuming iid sample, and a single static network
Assuming networks or network time series are observable and given
Then model/analyze the generative and/or dynamic mechanisms
- 15. This Talk:
Dynamic Network Model
Temporal exponential random graph model (tERGM)
[Hanneke and Xing, ICML 06; Fan, Hanneke, Fu and Xing, ICML 07; Hanneke, Fu, and Xing, EJS 10]
Reverse Engineer Latent Time-Evolving Networks
The TESLA and KELLER algorithms, and beyond
[Fan, Hanneke, Fu, and Xing, ICML 07; Ahmed and Xing, PNAS 09; Song, Mladen and Xing, ISMB 09; Mladen, Song, Ahmed and Xing, AOAS 09; Mladen, Song, and Xing, NIPS 09; Song, Mladen and Xing, NIPS 09]
Network tomography: modeling latent "multi-role" of vertices
Mixed Membership Stochastic Blockmodel (MMSB)
[Airoldi, Blei, Xing and Fienberg, LinkKDD 05; Airoldi, Blei, Fienberg and Xing, JMLR 08]
Dynamic tomography underlying evolving network
dynamic MMSBs
[Xing, Fu, and Song, AOAS 09; Ho, Le, and Xing, submitted 10]
- 16. Modeling Evolving Networks
We observe the network at discrete, evenly spaced time points t = 1,2,…,T.
The observed network at time t: $A^t$.
We want to design a class of statistical models for the evolution of networks over a fixed set of nodes.
- 17. Markov Assumption
To simplify things, assume the network observed at timestep t (1 ≤ t ≤ T) is independent of the rest of history given the knowledge of the network at timestep t-1, then
$P(A^2, A^3, \ldots, A^t \mid A^1) = P(A^t \mid A^{t-1}) \, P(A^{t-1} \mid A^{t-2}) \cdots P(A^2 \mid A^1)$
What should the conditional look like?
- 18. Exponential Random Graphs
[Holland and Leinhardt 1981, Wasserman and Pattison 1996]
Very general families for modeling a single static network observation.
$P(A) = \exp\{\theta' u(A) - \ln Z(\theta)\}$
$u(A)$ is known as a potential, and $\theta$ its weight
Can estimate the parameters by MCMC MLE
- 19. ERGM Example
A classic example: (Frank & Strauss 1986)
$u_1(A)$ = # edges in A
$u_2(A)$ = # 2-stars in A
$u_3(A)$ = # triangles in A
$P(A) \propto \exp\{\theta_1 u_1(A) + \theta_2 u_2(A) + \theta_3 u_3(A)\}$
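The three statistics of this example (edges, 2-stars, triangles) and the resulting unnormalized log-probability can be sketched as follows. This is an illustration only: the function names, the weight vector `theta`, and the toy graph are assumptions, and the brute-force normalizer enumerates every graph, so it is usable only for a handful of nodes.

```python
import math
from itertools import combinations

def ergm_statistics(adj):
    """Return (#edges, #2-stars, #triangles) of an undirected graph.
    adj is a symmetric 0/1 matrix (list of lists) with a zero diagonal."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    edges = sum(deg) // 2
    two_stars = sum(d * (d - 1) // 2 for d in deg)   # pairs of edges sharing a node
    triangles = sum(1 for i, j, k in combinations(range(n), 3)
                    if adj[i][j] and adj[j][k] and adj[i][k])
    return edges, two_stars, triangles

def log_potential(adj, theta):
    """Unnormalized log-probability: weighted sum of the three statistics."""
    return sum(t * u for t, u in zip(theta, ergm_statistics(adj)))

def log_partition(n, theta):
    """Brute-force ln Z over all 2^(n choose 2) graphs; feasible only for tiny n."""
    pairs = list(combinations(range(n), 2))
    terms = []
    for mask in range(1 << len(pairs)):
        adj = [[0] * n for _ in range(n)]
        for b, (i, j) in enumerate(pairs):
            if (mask >> b) & 1:
                adj[i][j] = adj[j][i] = 1
        terms.append(log_potential(adj, theta))
    m = max(terms)                                   # log-sum-exp for stability
    return m + math.log(sum(math.exp(t - m) for t in terms))

# Hypothetical toy graph: triangle 0-1-2 plus a pendant edge 2-3
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
theta = (1.0, 0.5, -2.0)                             # arbitrary illustrative weights
print(ergm_statistics(A), log_potential(A, theta) - log_partition(4, theta))
```

The exponential cost of `log_partition` is the reason the parameters are estimated by MCMC in practice rather than by exact enumeration.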
- 20. Temporal Extension of ERGMs
[Hanneke, Fu and Xing, EJS 2010]
Can we build on all the work on ERGMs when designing a temporal model?
$P(A^t \mid A^{t-1}) = \exp\{\theta' \Psi(A^t, A^{t-1}) - \ln Z(\theta, A^{t-1})\}$
$\Psi$ is a temporal potential
Say the network has a single relation, and its value is either 0 or 1 (e.g., “friends” or “not friends”).
Let $A_{ij}$ denote the value of the relation between the ith actor and the jth actor.
Then we can use $\Psi$ to capture the dynamic properties of all $A_{ij}$'s
- 21. An Example
$P(A^t \mid A^{t-1}) = \exp\{\theta' \Psi(A^t, A^{t-1}) - \ln Z(\theta, A^{t-1})\}$
“Continuity”: $\Psi_1(A^t, A^{t-1}) = \sum_{ij} \left[ A^t_{ij} A^{t-1}_{ij} + (1 - A^t_{ij})(1 - A^{t-1}_{ij}) \right]$
“Reciprocity”: $\Psi_2(A^t, A^{t-1}) = \sum_{ij} A^t_{ji} A^{t-1}_{ij}$
“Transitivity”: $\Psi_3(A^t, A^{t-1}) = \frac{\sum_{ijk} A^t_{ij} A^{t-1}_{ik} A^{t-1}_{kj}}{\sum_{ijk} A^{t-1}_{ik} A^{t-1}_{kj}}$
“Density”: $\Psi_4(A^t, A^{t-1}) = \sum_{ij} A^t_{ij}$
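The four temporal statistics named on this slide (continuity, reciprocity, transitivity, density) can be evaluated on a pair of consecutive snapshots. The sketch below assumes directed 0/1 adjacency matrices with a zero diagonal and unnormalized sums, with transitivity taken as the fraction of two-paths at time t-1 that are closed at time t; the normalization used in the published model may differ, and all names and the toy matrices are hypothetical.

```python
def continuity(A_t, A_prev):
    """Number of ordered pairs (i, j), i != j, whose tie status is unchanged."""
    n = len(A_t)
    return sum(A_t[i][j] * A_prev[i][j] + (1 - A_t[i][j]) * (1 - A_prev[i][j])
               for i in range(n) for j in range(n) if i != j)

def reciprocity(A_t, A_prev):
    """Number of ties i->j at time t-1 that are returned as j->i at time t."""
    n = len(A_t)
    return sum(A_t[j][i] * A_prev[i][j] for i in range(n) for j in range(n))

def transitivity(A_t, A_prev):
    """Fraction of two-paths i->k->j at time t-1 closed by a tie i->j at time t."""
    n = len(A_t)
    closed = paths = 0
    for i in range(n):
        for k in range(n):
            for j in range(n):
                if A_prev[i][k] and A_prev[k][j]:
                    paths += 1
                    closed += A_t[i][j]
    return closed / paths if paths else 0.0

def density(A_t, A_prev):
    """Number of ties present at time t (A_prev is unused, kept for a uniform signature)."""
    return sum(map(sum, A_t))

def transition_score(theta, A_t, A_prev):
    """Unnormalized log-probability of the step A_prev -> A_t."""
    stats = (continuity(A_t, A_prev), reciprocity(A_t, A_prev),
             transitivity(A_t, A_prev), density(A_t, A_prev))
    return sum(th * s for th, s in zip(theta, stats))

# Hypothetical snapshots on three actors
A_prev = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]   # ties 0->1, 1->2
A_t    = [[0, 1, 1], [1, 0, 1], [0, 0, 0]]   # ties 0->1, 0->2, 1->0, 1->2
print(transition_score((1.0, 1.0, 1.0, 1.0), A_t, A_prev))
```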
- 22. Degeneracy [Handcock et al. 2008]
For many ERGMs, most of the parameter space is populated by
distributions that place almost all of the probability mass on a subset
of the sample space containing networks that bear no resemblance
to the observed networks (typically the complete or empty graphs).
For such models, an MLE does not exist, resulting in poor fit
e.g., when the observed statistics do not lie inside of the convex hull of the set of all realizable u(A).
- 23. A tERGM is non-degenerate
[Hanneke, Fu and Xing, EJS 2010]
Theorem 1: when the transition distribution factors over the edges, a tERGM is non-degenerate:
Maximum likelihood estimator exists!
(should not be taken for granted for arbitrary models, be careful!)
- 24. Assessing statistic importance and quality of fit: A case study
Senate network – 109th congress
Voting records from 109th congress (2005 - 2006)
There are 100 senators whose votes were recorded on the 542 bills, each vote is a binary outcome
- 26. Quality of fit
Statistics of $A^t$ sampled from $P(A^t \mid A^{t-1})$ versus those from the true $A^t$
- 27. Quality of fit
Able to predict sharp changes
- 28. What’s it good for?
Hypothesis Testing
Data Exploration (e.g., node-classification)
Foundation for Learning
- 30. Where do networks come from?
Existing work:
Assuming iid sample, and a single static network
Assuming networks or network time series are observable and given
Then model/analyze the generative and/or dynamic mechanisms
We assume:
Non-iid, Varying Coefficient, Varying Structure
- 32. Reverse engineer "rewiring" social networks
[Figure: networks along a timeline from T0 to TN, with a time point t*; n=1 or some small #]
Drosophila development
- 34. Modeling Time-Varying Graphs
The hidden temporal exponential graph models [Fan et al. ICML 2007]
Transition Model: $p(A^t \mid A^{t-1}) = \frac{1}{Z(\theta, A^{t-1})} \exp\left\{ \sum_i \theta_i \Psi_i(A^t, A^{t-1}) \right\}$
Emission Model: $p(x^t \mid A^t) = \frac{1}{Z(\eta, A^t)} \exp\left\{ \sum_{ij} \Lambda_{ij}(x^t_i, x^t_j, A^t_{ij}, \eta) \right\}$
- 35. Inference (0)
Gibbs sampling: [Fan et al. ICML 2007]
P(Network|Data) ?
Need to evaluate the log-odds
Difficulty: evaluate the ratio of partition functions $Z(A') = \sum_A \exp(\theta' \Psi(A, A'))$
So far scale to ~20 nodes !!!
Straightforward: tractable transition model; the partition function is the product of per-edge terms
Computation is non-trivial: given the graphical structure, run variable elimination algorithms; works well only for small graphs
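The remark that the partition function is a product of per-edge terms can be checked numerically on a tiny example. The sketch below assumes a transition potential made only of edge-wise terms (a continuity term and a density term with hypothetical weights `theta_c` and `theta_d`); under that assumption the brute-force sum over all graphs and the per-edge product agree. It illustrates the factorization only, not the Gibbs sampler.

```python
import math
from itertools import product

def edge_score(a, a_prev, theta_c, theta_d):
    """Contribution of one ordered pair: continuity term plus density term."""
    return theta_c * (a * a_prev + (1 - a) * (1 - a_prev)) + theta_d * a

def log_z_brute_force(A_prev, theta_c, theta_d):
    """ln Z by summing over every directed graph on n nodes (2^(n(n-1)) terms)."""
    n = len(A_prev)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    total = 0.0
    for values in product((0, 1), repeat=len(pairs)):
        score = sum(edge_score(a, A_prev[i][j], theta_c, theta_d)
                    for a, (i, j) in zip(values, pairs))
        total += math.exp(score)
    return math.log(total)

def log_z_factored(A_prev, theta_c, theta_d):
    """ln Z as a sum of per-edge log terms; cost is linear in the number of pairs."""
    n = len(A_prev)
    return sum(math.log(math.exp(edge_score(0, A_prev[i][j], theta_c, theta_d)) +
                        math.exp(edge_score(1, A_prev[i][j], theta_c, theta_d)))
               for i in range(n) for j in range(n) if i != j)

# Hypothetical previous snapshot on three nodes (a directed 3-cycle)
A_prev = [[0, 1, 0],
          [0, 0, 1],
          [1, 0, 0]]
print(log_z_brute_force(A_prev, 1.5, -0.7), log_z_factored(A_prev, 1.5, -0.7))
```

Statistics that couple several edges of the new snapshot would break this factorization, which is where the brute-force cost becomes the bottleneck.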
- 37. Inference I [Song, Kolar and Xing, Bioinformatics 09]
KELLER: Kernel Weighted L1-regularized Logistic Regression
Lasso:
Constrained convex optimization
Estimate time-specific nets one by one, based on "virtual iid" samples
Could scale to ~10^4 genes, but under stronger smoothness assumptions
- 38. Algorithm - neighborhood selection
Conditional likelihood
Neighborhood Selection:
Time-specific graph regression:
Estimate at
Where
and
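The formulas for this slide did not survive in the text, so the following is only a sketch of what a kernel-weighted L1-regularized logistic regression for one node's neighborhood could look like. Everything specific is an assumption: the Gaussian kernel, the ±1 coding of node states, the proximal-gradient (ISTA) solver, and all function and parameter names (`h` for the bandwidth, `lam` for the penalty). It should not be read as the KELLER implementation.

```python
import math

def kernel_weights(times, t_star, h):
    """Normalized Gaussian kernel weights centred on the query time t_star."""
    raw = [math.exp(-((t - t_star) ** 2) / (2.0 * h * h)) for t in times]
    s = sum(raw)
    return [r / s for r in raw]

def soft_threshold(z, thr):
    """Proximal operator of thr * |z|."""
    if z > thr:
        return z - thr
    if z < -thr:
        return z + thr
    return 0.0

def neighborhood_at(X, times, node, t_star, h, lam, iters=500):
    """Regress one node on all others with kernel-weighted logistic loss + L1 penalty.
    X[s][v] in {-1, +1} is the state of node v in the sample taken at times[s].
    Returns {other_node: coefficient}; non-zero entries are the estimated neighbors."""
    others = [v for v in range(len(X[0])) if v != node]
    w = kernel_weights(times, t_star, h)
    step = 4.0 / len(others)      # 1/L for +-1 features and weights summing to one
    theta = {v: 0.0 for v in others}
    for _ in range(iters):
        grad = {v: 0.0 for v in others}
        for ws, x in zip(w, X):
            y = 1.0 if x[node] > 0 else 0.0
            z = sum(theta[v] * x[v] for v in others)
            p = 1.0 / (1.0 + math.exp(-z))
            for v in others:
                grad[v] += ws * (p - y) * x[v]
        theta = {v: soft_threshold(theta[v] - step * grad[v], step * lam)
                 for v in others}
    return theta

# Hypothetical data: node 1 always copies node 0, node 2 alternates independently
x0 = [1, 1, -1, -1, 1, 1, -1, -1]
x2 = [1, -1, 1, -1, 1, -1, 1, -1]
times = list(range(8))
X = [[a, a, b] for a, b in zip(x0, x2)]
theta = neighborhood_at(X, times, node=0, t_star=3.5, h=2.0, lam=0.05)
print(theta)
```

Running the regression once per node and per query time gives one graph estimate per time point, which matches the "one by one" description of the approach.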
- 41. Inference II [Ahmed and Xing, PNAS 09]
TESLA: Temporally Smoothed L1-regularized logistic regression
Constrained convex optimization
Estimate time-specific nets jointly, based on original "non-iid" samples
Scale to ~5000 nodes, does not need smoothness assumption, can accommodate abrupt changes.
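The objective itself is not reproduced in this text. As a hedged sketch of the general form suggested by the name (a logistic loss, an L1 sparsity penalty, and an L1 penalty on temporal differences), the joint problem for node $i$ could be written as below. The symbols $\ell$, $\theta^t_{\setminus i}$, $\lambda_1$, and $\lambda_2$ are notation assumed here; the exact loss weighting and norms in the published method may differ.

```latex
\hat{\theta}^{1}_{i}, \ldots, \hat{\theta}^{T}_{i}
  = \arg\min_{\theta^{1}_{i}, \ldots, \theta^{T}_{i}}
    \sum_{t=1}^{T} \ell\!\left(\theta^{t}_{i};\, x^{t}\right)
    + \lambda_1 \sum_{t=1}^{T} \left\lVert \theta^{t}_{\setminus i} \right\rVert_1
    + \lambda_2 \sum_{t=2}^{T} \left\lVert \theta^{t}_{\setminus i} - \theta^{t-1}_{\setminus i} \right\rVert_1
```

Here $\ell$ stands for the negative logistic conditional log-likelihood of node $i$ given the other nodes at time $t$. The $\lambda_2$ term penalizes the number of coefficient changes rather than their size, which is one reason such a penalty can accommodate abrupt changes.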
- 42. Structural Consistency of TESLA
[Kolar, Le and Xing, NIPS 09; Kolar and Xing, 2010]
I. It can be shown that, by applying the results for model selection of the Lasso on a temporal difference of beta, the blocks are estimated consistently
II. Then it can be further shown that, by applying Lasso within the blocks, the neighborhood of each node on each of the estimated blocks is estimated consistently
Further advantages of the two step procedure
choosing parameters easier
faster optimization procedure
- 43. Comparison of KELLER and TESLA
[Figure panels: smoothly varying; abruptly varying]
- 44. Senate network – 109th congress
Voting records from 109th congress (2005 - 2006)
There are 100 senators whose votes were recorded on the 542 bills, each vote is a binary outcome
Estimating parameters:
KELLER: bandwidth parameter $h_n$ = 0.174, and the penalty parameter $\lambda_1$ = 0.195
TESLA: $\lambda_1$ = 0.24 and $\lambda_2$ = 0.28
- 45. Senate network – 109th congress
[Figure: estimated networks for March 2005, January 2006, August 2006]
- 48. Dynamic Gene Interactions Networks of Drosophila Melanogaster
[Figure panels: biological process; molecular function; cellular component]
- 55. Dynamic tomography
How to model dynamics in a simplex?
Project an individual/stock in network into a "tomographic" space
Trajectory of an individual/stock in the "tomographic" space
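One common way to get dynamics on a simplex is to let an unconstrained vector evolve as a Gaussian random walk and map it to the simplex with a softmax (a logistic-normal construction). The sketch below assumes that construction purely for illustration; the text of these slides does not spell out the model, and the function names, step count, and noise scale are hypothetical.

```python
import math
import random

def softmax(gamma):
    """Map an unconstrained vector to the probability simplex."""
    m = max(gamma)
    exps = [math.exp(g - m) for g in gamma]
    s = sum(exps)
    return [e / s for e in exps]

def role_trajectory(gamma0, steps, sigma, rng):
    """Gaussian random walk in R^K, projected to the simplex at every step."""
    gamma = list(gamma0)
    path = [softmax(gamma)]
    for _ in range(steps):
        gamma = [g + rng.gauss(0.0, sigma) for g in gamma]
        path.append(softmax(gamma))
    return path

rng = random.Random(0)                   # fixed seed, chosen arbitrarily
path = role_trajectory([0.0, 0.0, 0.0], steps=5, sigma=0.5, rng=rng)
for point in path:
    print([round(p, 3) for p in point])
```

Each point of `path` is a mixed-membership vector over K = 3 roles, so the sequence traces a trajectory of the kind described on this slide.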
- 57. Dynamic MMSB (dMMSB) [Xing, Fu, and Song, AOAS 2009]
[Figure: graphical model with N×N and K×K plates repeated across time steps]
- 58. Dynamic Mixture of MMSB (dM3SB) [Ho, Le, and Xing, submitted 2010]
[Figure: graphical model. Components: time-varying role prior, cluster selection prior, time-varying network model, role compatibility matrix. Legend: hidden role prior, actor hidden roles, observed interactions, role compatibility matrix]
- 59. Algorithm: Generalized Mean Field (Xing et al. 2004)
Inference via variational EM
Generalized mean field
Laplace approximation
Kalman filter & RTS smoother
- 63. Case Study 1: Sampson’s Monk Network
Dataset Description
18 monks (junior members in a monastery)
Liking relations recorded
3 time-points in one year period
Timing: before a major conflict outbreak
Recall static analysis:
- 64. Sampson’s Monk Network: role trajectories
The trajectories of the varying role-vectors over time
- 65. Case Study 2: The 109th congress
[Figure: networks for March 2005, January 2006, August 2006]
US senator voting records
100 senators, 109th Congress (Jan 2005 – Dec 2006) in 8 epochs
- 66. Senate Network: role trajectories
Voting data preprocessed into a network graph using (Kolar et al., 2008)
Role Compatibility Matrix B
Role 1 = Passive, 2/4 = Democratic clique, 3 = Republican clique
Colored bars: Estimated latent space vector
Numbers under bars: Estimated cluster
Letters beside actor index: Political party and State
- 67. Senate Network: role trajectories
Cluster legend
Jon Corzine’s seat (#28, Democrat, New Jersey) was taken over by Bob Menendez from t=5 onwards. Corzine was especially left-wing, so much that his views did not align with the majority of Democrats (t=1 to 4). Once Menendez took over, the latent space vector for senator #28 shifted towards role 4, corresponding to the main Democratic voting clique.
Ben Nelson (#75) is a right-wing Democrat (Nebraska), whose views are more consistent with the Republican party. Observe that as the 109th Congress proceeds into 2006, Nelson’s latent space vector includes more of role 3, corresponding to the main Republican voting clique. This coincides with Nelson’s re-election as the Senator from Nebraska in late 2006, during which a high proportion of Republicans voted for him.
- 68. Summary
Non-degenerate model for network evolution
Efficient and sparsistent algorithms for recovering latent time-evolving networks from nodal attributes
Multi-role estimation from network topology
Roles undertaken by actors are not independent of each other; they can have internal dependency structures
An actor in the network can be fractionally assigned to multiple roles
Mixed memberships of actors vary temporally
Visualization and analysis tool for dynamic networks
Discovered interesting patterns from Email/Vote/Gene networks
- 69. Discussion: future directions
How about estimating time-varying "causal" graphs, or general time-varying graphical models?
The time-varying DBN based on a nonstationary auto-regressive model (NIPS 2009) entails 1st-order "Granger causality"
Consistency proof is difficult (samples no longer conditionally independent), but still possible under assumption of local stationarity
Estimation of arbitrary TV-GM remains an open problem in both algorithm and theory
Web-scale inference/modeling
Scalability to million-node network problems remains hard
Is nodal/edge-level inference still interesting in mega networks?
Predictive models for large graph evolution, community organization, information diffusion
Socio-media modeling and prediction
Facebook, Twitter, Flickr: many interesting problems
Data integration: Graph + Text + image + …