bio+stats vbem/networks hierarchical
variational and hierarchical modeling
for biological data
chris wiggins
columbia
april 23, 2012
chris.wiggins@columbia.edu 4/23/12
Chris Wiggins
• APAM: Department of Applied Physics and Applied Mathematics;
• C2B2: Center for Computational Biology and Bioinformatics;
• CISB: Columbia University Initiative in Systems Biology
• ISDE: Institute for Data Sciences and Engineering
Columbia University
September 28, 2012
thanks. . .
- jake hofman (vbmod,vbfret)
- jonathan bronson (vbfret)
- jan-willem van de meent (hfret)
- ruben gonzalez (vbfret, hfret)
for more info:
- vbfret.sourceforge.net
- vbmod.sourceforge.net
- hfret.sourceforge.net (soon)
BMC bioinformatics, 2010;
PNAS 2009;
Biophysical Journal 2009;
1 biology and statistics
genomics
generative modeling
2 variational/biological networks
variational Bayesian expectation maximization
inference
model selection
3 hierarchical/time series
biological challenges
inference
model selection
biology and statistics:
genomics:
plos comp bio '10, nyas '07, bmc bioinfo '06a, bmc bioinfo '06b, bioinfo '04, regulatory genomics '04, IEEE '05; NIPS (MLCB '03, '06)
generative modeling:
bmc bioinfo '10, PNAS '09, biophys j '09, PNAS '07, PNAS '06, NIPS (MLCB '09, '10, '11), IEEE sig. proc. '12; plos one '08, cell '07, prl '06, jcs '06, biophys j '06
pami '08, prl '08, NIPS (MLCB '08), PNAS '05, bmc bioinfo '04
variational/biological networks:
- variational bayesian expectation maximization
- inference
- model selection
introduction formulation results extensions motivation history
introduction:
chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net
Hartwell, Hopfield, Leibler and Murray, Nature, vol. 402 (suppl.), 2 December 1999, www.nature.com
motivation:
community detection in networks
social networks
biological networks
problem: over-fitting/resolution limit
history:
by community
math/cs: spectral methods (Fiedler ’74, Shi + Malik ’00)
math/cs: clustering generally (Taskar, Koller, Getoor)
physics: modularity
common thread: test w/ stochastic block model (’76, ’83)
ergo: use as inference tool (Hastings 0604429, Newman+Leicht 061148)
introduction formulation results extensions generative model max likelihood max evidence algo
formulation:
generative model
maximum likelihood
maximum evidence
complexity control. . .
variational/mft. . .
algorithm
in physics: “test hamiltonian”
in ML “variational bayesian methods” (Jordan, Mackay)
generative model:
for each node, roll a K-sided die with bias π to choose z_i ∈ {1, . . . , K}
for each pair of nodes i ≠ j, flip a coin with bias ϑ+ if z_i = z_j, else ϑ−
draw an edge (A_ij = 1) if the coin lands heads up
Stochastic block models (Holland, Laskey, Leinhardt 1983; Wang and Wong, 1987)
[graphical model: π → z_i, z_j → A_ij ← θ, plate over pairs i ≠ j]
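The die-rolling and coin-flipping recipe above can be sketched in a few lines of NumPy. This is a minimal illustration, not the vbmod code: the function name `sample_constrained_sbm` and the arguments `theta_in` and `theta_out` (standing in for ϑ+ and ϑ−) are hypothetical, and the example parameter values are made up.

```python
import numpy as np

def sample_constrained_sbm(n_nodes, pi, theta_in, theta_out, seed=0):
    """Sample module labels z and a symmetric 0/1 adjacency matrix A."""
    rng = np.random.default_rng(seed)
    pi = np.asarray(pi, dtype=float)
    # Roll a K-sided die with bias pi for every node.
    z = rng.choice(len(pi), size=n_nodes, p=pi)
    # Coin bias for each pair: theta_in if same module, else theta_out.
    same = z[:, None] == z[None, :]
    bias = np.where(same, theta_in, theta_out)
    # Flip one coin per unordered pair i < j, then symmetrize.
    flips = rng.random((n_nodes, n_nodes)) < bias
    upper = np.triu(flips, k=1)
    A = (upper | upper.T).astype(int)
    return z, A

# Illustrative values only (assumed, not from the slides).
z, A = sample_constrained_sbm(60, pi=[0.5, 0.3, 0.2], theta_in=0.6, theta_out=0.05)
```

Only the upper triangle is sampled, so each unordered pair gets exactly one coin flip and the graph has no self-loops.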
generative model. . . (bis)
Die rolling, coin flipping, and priors, where the counts are:
- edges within modules
- non-edges within modules
- edges between modules
- non-edges between modules
- nodes in each module
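The five counts listed above are all the likelihood needs. A minimal sketch, assuming an undirected graph with no self-loops and 0-based module labels; the names `sbm_counts`, `log_joint`, `theta_in` and `theta_out` are hypothetical, and the log joint shown is the standard product of Bernoulli and multinomial factors for this model rather than a formula copied from the slide.

```python
import numpy as np

def sbm_counts(A, z, K):
    """Count edges/non-edges within and between modules, and module sizes."""
    A = np.asarray(A)
    z = np.asarray(z)
    same = z[:, None] == z[None, :]
    iu = np.triu_indices(len(z), k=1)      # unordered pairs i < j
    edge, within = A[iu].astype(bool), same[iu]
    return {
        "c_plus": int(np.sum(edge & within)),     # edges within modules
        "c_minus": int(np.sum(~edge & within)),   # non-edges within modules
        "d_plus": int(np.sum(edge & ~within)),    # edges between modules
        "d_minus": int(np.sum(~edge & ~within)),  # non-edges between modules
        "n": np.bincount(z, minlength=K),         # nodes in each module
    }

def log_joint(A, z, pi, theta_in, theta_out):
    """ln p(A, z | pi, theta) for the constrained block model."""
    c = sbm_counts(A, z, len(pi))
    return float(
        c["c_plus"] * np.log(theta_in) + c["c_minus"] * np.log(1 - theta_in)
        + c["d_plus"] * np.log(theta_out) + c["d_minus"] * np.log(1 - theta_out)
        + np.sum(c["n"] * np.log(pi))
    )

# Toy graph: two 2-node modules, one edge inside each, one edge between them.
A_toy = np.zeros((4, 4), dtype=int)
for i, j in [(0, 1), (2, 3), (1, 2)]:
    A_toy[i, j] = A_toy[j, i] = 1
z_toy = np.array([0, 0, 1, 1])
counts = sbm_counts(A_toy, z_toy, K=2)
value = log_joint(A_toy, z_toy, pi=[0.5, 0.5], theta_in=0.9, theta_out=0.1)
```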
max likelihood:
•Die rolling, coin flipping <-> infinite-range spin-glass Potts model:
$H \equiv -\ln p(A, \vec{z} \mid \vec{\pi}, \vec{\theta}) = -\sum_{i>j} (J_L A_{ij} - J_G)\,\delta_{z_i,z_j} + \sum_{\mu=1}^{K} h_\mu \sum_{i=1}^{N} \delta_{z_i,\mu}$ (up to terms independent of $\vec{z}$)
$J_G \equiv \ln[(1-\theta_d)/(1-\theta_c)]$
$J_L \equiv \ln(\theta_c/\theta_d) + J_G$
$h_\mu \equiv -\ln \pi_\mu$
Extends Newman (2004, 2006), Hastings (2006), Bornholdt & Reichardt (2006)
max evidence:
Increasing complexity
http://research.microsoft.com/~minka/statlearn/demo/
cf. “BIC” Schwarz, 1978
max evidence:
Extends Newman (2004, 2006), Hastings (2006), Bornholdt & Reichardt (2004 & 2006)
•Die rolling, coin flipping <-> infinite-range spin-glass Potts model, with H as defined under max likelihood
•Infer distributions over spin assignments, coupling constants, and chemical potentials and find number of occupied spin states
$p(A \mid K) = \sum_{\vec{z}} \int d\vec{\theta} \int d\vec{\pi}\; p(A, \vec{z}, \vec{\pi}, \vec{\theta}) = \sum_{\vec{z}} \int d\vec{\theta} \int d\vec{\pi}\; e^{-H}\, p(\vec{\theta})\, p(\vec{\pi})$
Can do integrals, but sum is intractable, O(K^N); use mean-field
why would you do this? (A1, A2, A3): Beal, 2003
• Gibbs’/Jensen’s inequality (log of expected value bounds expected value of log) for any distribution q
Variational Bayes (MacKay, Jordan, Ghahramani, Jaakkola, Saul 1999; cf. Feynman 1972)
• F is a functional of q; find approximation to posterior by optimizing approximation to evidence
• Take q(z, π, θ) = q(z)q(π)q(θ); Q_iμ is the probability that node i is in module μ
algo:
[update equations, written in terms of expected counts, not recovered]
suggests hard limit in step 3; sparse in step 1
introduction formulation results extensions run time consistency good vs easy real data
results:
run time
consistency
required plot: good vs. easy
real data
karate
biology
american football
run time:
• Main loop runtime for 10^4 nodes in MATLAB: ~30 seconds
consistency:
[figure: p(θ+) and p(θ−) versus θ; N=8, K=2, distribution after 2 iterations]
• K=4?
• Automatic complexity control: probability of occupation for extraneous modules goes to zero
The “resolution limit” problem
[figure: 60-node network partitioned by Variational Bayes versus Girvan-Newman modularity]
[figure: resolution limit problem on ring of 4-node cliques. One panel: K* versus Ktrue (10 to 20), with the line K* = Ktrue, for Variational Bayes and modularity optimization. Other panel: GN modularity (Clauset’s algorithm) versus Ktrue for single-clique communities (correct) and double-clique communities (incorrect)]
Girvan-Newman modularity or Potts model w/ fixed parameters suffers from a resolution limit, where size of detected modules depends on network size
Fortunato et al. (2007), Kumpula et al. (2007)
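The ring-of-cliques comparison can be checked numerically. The sketch below is an assumption-laden illustration, not the code behind the plot: it builds a ring of 4-node cliques joined by single edges, then evaluates the standard Newman-Girvan modularity for the single-clique and double-clique partitions. The helper names `ring_of_cliques`, `modularity` and `compare` are hypothetical.

```python
import numpy as np

def ring_of_cliques(n_cliques, clique_size=4):
    """Adjacency matrix of cliques joined in a ring by one edge each."""
    n = n_cliques * clique_size
    A = np.zeros((n, n), dtype=int)
    for c in range(n_cliques):
        lo = c * clique_size
        A[lo:lo + clique_size, lo:lo + clique_size] = 1
        # Link the last node of this clique to the first node of the next.
        i = lo + clique_size - 1
        j = ((c + 1) % n_cliques) * clique_size
        A[i, j] = A[j, i] = 1
    np.fill_diagonal(A, 0)
    return A

def modularity(A, labels):
    """Q = (1/2m) * sum_ij (A_ij - k_i k_j / 2m) * [labels_i == labels_j]."""
    labels = np.asarray(labels)
    degree = A.sum(axis=1)
    two_m = degree.sum()
    same = labels[:, None] == labels[None, :]
    return float(np.sum((A - np.outer(degree, degree) / two_m) * same) / two_m)

def compare(n_cliques, clique_size=4):
    """Modularity of single-clique vs. double-clique partitions (n_cliques even)."""
    A = ring_of_cliques(n_cliques, clique_size)
    clique_id = np.arange(A.shape[0]) // clique_size
    return modularity(A, clique_id), modularity(A, clique_id // 2)

q_single_10, q_double_10 = compare(10)
q_single_20, q_double_20 = compare(20)
```

For this construction the single-clique modularity works out to 6/7 − 1/K and the double-clique one to 13/14 − 2/K, so the incorrect merged partition scores higher once the ring has more than 14 cliques.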
good vs easy:
real data:
• Correctly infer K=12 conferences
Validation: NCAA football schedule
[figure: network of 115 teams; nodes: teams, edges: games, shape: conference, color: inferred module]
APS march meeting 2008
[figure: inferred modules labeled superconductivity (experimentalists); nanotubes, graphene; superconductivity (theorists)]
extensions:
model extensions
- full SBM (done)
- hierarchical model p(Aij = 1 | zi = zj ± 1)
- hierarchical modeling (ensemble of graphs): $p(D \mid u, K) = \prod_{i}^{L} \int d\vartheta_i \, p(D \mid \vartheta_i)\, p(\vartheta_i \mid u, K)$
- R^d embedding (latent are real)
- more ‘rigorous’ SOM?
- ‘correct’ for degree (allow variable affinity) (cf. Bader, Karrer)
algorithm extensions
- BP (see earlier talks)
- map-reduce
- cvmod (model selection via cross validation)
full SBM. . .
probability of edge depends only on block membership:
$p(A_{ij} = 1 \mid z_i = \mu, z_j = \nu) = \vartheta_{\mu\nu}$
• Nodes belong to “blocks” of varying size
• Roll die for assignment of nodes to blocks
• Probability of edge between two nodes depends only on block membership
• Flip (one of K^2) coins for edges
• Result: mixture of Erdős-Rényi graphs
[figures: 120 × 120 adjacency matrices, nz = 2275 vs nz = 2803]
>> vbsbm_vs_vbmod(0)
running vbmod ...
Elapsed time is 1.136925 seconds.
running vbsbm ...
Elapsed time is 1.398904 seconds.
Fmod=13089.158019 Fsbm=13144.445782
vbmod wins
>> vbsbm_vs_vbmod(0.25)
running vbmod ...
Elapsed time is 1.557298 seconds.
running vbsbm ...
Elapsed time is 1.759527 seconds.
Fmod=20457.142416 Fsbm=19457.306022
vbsbm wins
>> vbsbm_vs_vbmod(0.5)
running vbmod ...
Elapsed time is 2.624886 seconds.
running vbsbm ...
Elapsed time is 1.440242 seconds.
Fmod=26133.351210 Fsbm=23921.797625
vbsbm wins
• Using same framework we can compare the constrained and full (unconstrained) stochastic block models via p(D|M,K*)
[figure: win percentage for unconstrained model versus perturbation to constrained model (0 to 0.1); 120 × 120 adjacency matrices with nz = 2100, 2048, 2108]
for more info. . .
code: MATLAB & python (inc. “full” SBM) (vbmod.sf.net)
paper: arxiv 08 / prl 08
Hofman soon to come (not by me)
code in C++, inc. full ‘vblabel propagation’ algo
twitter-scale analysis
hierarchical/time series:
- biological challenges
- inference
- model selection
hfret. . .
Jan-Willem van de Meent, Ruben Gonzalez, Chris Wiggins
Columbia University
Ramakrishnan et al – http://www.mrc-lmb.cam.ac.uk/ribo/
FRET = cy5 / (cy3 + cy5)
Tinoco and Gonzalez, Genes Dev, 2011
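The FRET ratio above is a one-line computation. A minimal sketch, assuming `cy3` and `cy5` are background-corrected donor and acceptor intensities in the same units; the function name `fret_ratio` is hypothetical.

```python
def fret_ratio(cy3, cy5):
    """FRET = cy5 / (cy3 + cy5) for a single time point."""
    total = cy3 + cy5
    if total == 0:
        raise ValueError("total intensity is zero; FRET ratio is undefined")
    return cy5 / total

example = fret_ratio(300.0, 100.0)  # donor-dominated signal gives a low ratio
```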
Unbound
EF-G bound
Tinoco and Gonzalez, Genes Dev, 2011; Fei et al, PNAS, 2009
(short-lived GS1 states correspond to an EF-G + GDPNP binding event)
1. Identify states
2. Estimate Kinetic Rates
3. Average over many time series
4. Detect subpopulations
[figure: FRET signal trace with histogram]
Idea: Find probability of belonging to each state
Log-Likelihood
$L = \log p(x \mid \theta) = \log \sum_z p(x, z \mid \theta)$
Expectation Maximization
1. calculate p(z | x, θ_i)
2. calculate θ_{i+1} from p(z | x, θ_i)
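The two EM steps can be sketched for a one-dimensional Gaussian mixture, which is one plausible reading of the histogram-based state assignment. Everything here is an assumption for illustration: the function name `em_gmm_1d`, the quantile-based initialization, the fixed iteration count, and the synthetic data with made-up FRET levels of 0.2 and 0.8.

```python
import numpy as np

def em_gmm_1d(x, K, n_iter=100):
    """EM for a 1-D Gaussian mixture; returns weights, means, widths, responsibilities."""
    x = np.asarray(x, dtype=float)
    # Initialize means at evenly spaced quantiles of the data.
    mu = np.quantile(x, (np.arange(K) + 0.5) / K)
    sigma = np.full(K, x.std())
    pi = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # Step 1: responsibilities p(z = k | x, theta_i), computed in log space.
        log_r = np.log(pi) - np.log(sigma) - 0.5 * ((x[:, None] - mu) / sigma) ** 2
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # Step 2: theta_{i+1} from the responsibilities.
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    return pi, mu, sigma, r

# Synthetic two-state signal (assumed values, not experimental data).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.2, 0.05, 600), rng.normal(0.8, 0.05, 400)])
pi_hat, mu_hat, sigma_hat, resp = em_gmm_1d(x, K=2)
```

Each row of `resp` is the probability of a data point belonging to each state, which is the quantity the slide asks for.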
[figure: learned vs. truth]
Accurate for occupancy of states, not so good for rate estimates
$p(x, z \mid \mu, \sigma, \pi) = p(x \mid z, \mu, \sigma)\, p(z \mid \pi)$
probability of state depends on previous state
$p(z_{t+1} = l \mid z_t = k) = A_{kl}$
$p(z_1 = k) = \pi_k$
$p(x_t \mid z_t = k) = \mathcal{N}(x_t \mid \mu_k, \sigma_k)$
[figure: learned vs. real]
We’ve learned:
parameters: θ = {µ, σ, π, A}
states: p(z | x, θ)
[figure: fits with 2 states vs. 3 states]
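With initial probabilities π, transition matrix A and Gaussian emissions, the log-likelihood log Σ_z p(x, z | θ) can be computed by the standard forward recursion instead of summing over all state paths. The sketch below is illustrative and not the vbFRET implementation; the function names and the toy parameter values are assumptions. It checks the recursion against brute-force enumeration on a short trace.

```python
import itertools
import numpy as np

def gaussian_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def hmm_log_likelihood(x, pi, A, mu, sigma):
    """Forward algorithm with per-step rescaling; A[k, l] = p(z_{t+1}=l | z_t=k)."""
    pi, A, mu, sigma = (np.asarray(v, dtype=float) for v in (pi, A, mu, sigma))
    alpha = pi * gaussian_pdf(x[0], mu, sigma)
    norm = alpha.sum()
    log_like = np.log(norm)
    alpha = alpha / norm
    for xt in x[1:]:
        alpha = (alpha @ A) * gaussian_pdf(xt, mu, sigma)
        norm = alpha.sum()
        log_like += np.log(norm)
        alpha = alpha / norm
    return float(log_like)

def brute_force_log_likelihood(x, pi, A, mu, sigma):
    """Sum p(x, z | theta) over every state path; only feasible for tiny T."""
    total = 0.0
    for path in itertools.product(range(len(pi)), repeat=len(x)):
        p = pi[path[0]] * gaussian_pdf(x[0], mu[path[0]], sigma[path[0]])
        for t in range(1, len(x)):
            p *= A[path[t - 1]][path[t]] * gaussian_pdf(x[t], mu[path[t]], sigma[path[t]])
        total += p
    return float(np.log(total))

# Toy two-state parameters and a five-point trace (made-up values).
pi0 = [0.5, 0.5]
A0 = [[0.9, 0.1], [0.2, 0.8]]
mu0 = [0.2, 0.8]
sigma0 = [0.1, 0.1]
trace = [0.2, 0.25, 0.8, 0.75, 0.2]
ll_forward = hmm_log_likelihood(trace, pi0, A0, mu0, sigma0)
ll_brute = brute_force_log_likelihood(trace, pi0, A0, mu0, sigma0)
```

The brute-force sum visits K^T paths, while the forward recursion costs O(T K^2), which is why the latter is used inside EM for time series.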
Log-Likelihood
$L = \log p(x \mid \theta) = \log \sum_z p(x, z \mid \theta)$
Log-Evidence
$L = \log p(x \mid u) = \log \sum_z \int d\theta \, p(x, z \mid \theta)\, p(\theta \mid u)$
p(θ | u): prior (ensemble)
best model has highest average likelihood
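A coin-flip toy problem, swapped in here purely for illustration and not taken from the talk, shows why averaging the likelihood over the prior penalizes unnecessary flexibility. Model M0 is a fair coin with no free parameter; model M1 has an unknown bias with a uniform prior, whose evidence is the Beta function B(h+1, t+1). All function names are hypothetical.

```python
from math import lgamma, log

def log_evidence_fair(heads, tails):
    """M0: fair coin, no free parameters (for one specific sequence)."""
    return (heads + tails) * log(0.5)

def log_evidence_biased(heads, tails):
    """M1: integral of theta^h (1-theta)^t over a uniform prior = B(h+1, t+1)."""
    return lgamma(heads + 1) + lgamma(tails + 1) - lgamma(heads + tails + 2)

def max_log_likelihood_biased(heads, tails):
    """M1 evaluated at its maximum-likelihood bias theta = h / (h + t)."""
    n = heads + tails
    out = 0.0
    if heads:
        out += heads * log(heads / n)
    if tails:
        out += tails * log(tails / n)
    return out

balanced = (log_evidence_fair(10, 10), log_evidence_biased(10, 10))
skewed = (log_evidence_fair(18, 2), log_evidence_biased(18, 2))
```

Maximum likelihood can never prefer the simpler model, since M1 contains M0 as a special case; the evidence prefers M0 for balanced data and M1 for skewed data.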
Log-Evidence
$L = \log p(x \mid u) = \log \sum_z \int d\theta \, p(x, z \mid \theta)\, p(\theta \mid u)$
Lower Bound
$L = \sum_z \int d\theta \, q(z)\, q(\theta \mid w) \log\left[\frac{p(x, z, \theta \mid u)}{q(z)\, q(\theta \mid w)}\right] \le \log p(x \mid u)$
$q(z)\, q(\theta \mid w) \simeq p(z, \theta \mid x)$
Lower bound tight for true posterior
$L = \sum_z \int d\theta \, p(z, \theta \mid x) \log\left[\frac{p(x, z, \theta \mid u)}{p(z, \theta \mid x)}\right] = \sum_z \int d\theta \, p(z, \theta \mid x) \log[p(x \mid u)] = \log p(x \mid u)$
$L = \log p(x \mid u) - D_{kl}[\, q(z)\, q(\theta \mid w) \,\|\, p(z, \theta \mid x)\,]$
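The relation between the bound, the log-evidence and the KL divergence can be verified numerically on a model small enough to enumerate. This sketch assumes a single discrete latent variable with three values and no continuous parameters, which is a simplification of the setting above; the joint probabilities are made-up numbers.

```python
import numpy as np

def lower_bound(joint, q):
    """sum_z q(z) log[p(x, z) / q(z)] for a fixed observation x."""
    return float(np.sum(q * (np.log(joint) - np.log(q))))

def kl_divergence(q, p):
    return float(np.sum(q * (np.log(q) - np.log(p))))

# p(x, z) for the observed x and z = 0, 1, 2 (illustrative values only).
joint = np.array([0.10, 0.05, 0.02])
log_evidence = float(np.log(joint.sum()))
posterior = joint / joint.sum()

q = np.array([0.5, 0.3, 0.2])          # an arbitrary trial distribution
bound_q = lower_bound(joint, q)
bound_posterior = lower_bound(joint, posterior)
gap = kl_divergence(q, posterior)
```

Here `bound_q` falls short of `log_evidence` by exactly `gap`, and the bound evaluated at the true posterior recovers the log-evidence.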
We’ve learned:
parameters: q(θ | w)
states: p(z | x, θ)
VBEM Updates
$\frac{\delta L_n}{\delta q(z_n)} = 0, \qquad \frac{\delta L_n}{\delta q(\theta_n)} = 0$
variability: photophysical/experimental
Hierarchical Updates
$\frac{\partial}{\partial u} \sum_n L_n = 0$
“two-stage PEB model/CIHM” - Kass & Steffey, JASA 1989
2. Update p(θ | u)
Until Σ Ln converges
Until Ln converges
• Update q(zn)
• Update q(θn | wn)
1. Run VBEM on each trace
δLn
δq(zn)
= 
Hierarchical UpdatesVBEM Updates
δLn
δq(θn)
= 
∂
∂u

n
Ln =
bio+stats vbem/networks hierarchical biological challenges inference model selection
hfret. . .
chris.wiggins@columbia.edu 4/23/12
2. Update p(θ | u)
Until Σ Ln converges
Until Ln converges
• Update q(zn)
• Update q(θn | wn)
1. Run VBEM on each trace
We’ve learned:
p(θn, zn | xn) ≃ q(θn) q(zn)
(for each trace)
p(θ | u)
(for ensemble)
Figure: Unbound vs. EF-G bound
Tinoco and Gonzalez, Genes Dev, 2011; Fei et al, PNAS, 2009
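$\xi_{ntkl} = p(z_{nt} = k,\, z_{n,t+1} = l \mid x_n)$
1. Run mixture model on posterior counts
$p(\xi_n \mid A) = \prod_{t,k,l} A_{kl}^{\xi_{ntkl}}$
$p(\xi_n \mid u_m) = \int dA\; p(A \mid u_m)\, p(\xi_n \mid A)$
2. Rerun with M x K block-diagonal form
$u^A = \begin{pmatrix} u^A_1 & & & \\ & u^A_2 & & \\ & & \ddots & \\ & & & u^A_M \end{pmatrix}$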
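Step 1 can be sketched in a few lines, under one assumption the slide does not state: that p(A | u_m) is a product of independent Dirichlet distributions over the rows of A. With that choice the integral over A has a closed form (a Dirichlet-multinomial marginal), and each trace can be softly assigned to a subpopulation m. The function names, the mixture weights, and all numbers below are hypothetical.

```python
import math


def log_marginal(counts, u):
    """log p(xi_n | u_m), assuming independent Dirichlet rows with parameters u.

    counts[k][l] = sum_t xi_ntkl (expected k -> l transition counts).
    """
    total = 0.0
    for c_row, u_row in zip(counts, u):
        total += math.lgamma(sum(u_row)) - math.lgamma(sum(u_row) + sum(c_row))
        total += sum(math.lgamma(a + c) - math.lgamma(a)
                     for a, c in zip(u_row, c_row))
    return total


def responsibilities(counts, priors, weights):
    """Posterior probability that a trace belongs to each subpopulation m."""
    logs = [math.log(w) + log_marginal(counts, u)
            for u, w in zip(priors, weights)]
    top = max(logs)  # subtract the max before exponentiating, for stability
    unnorm = [math.exp(v - top) for v in logs]
    z = sum(unnorm)
    return [v / z for v in unnorm]


# Two made-up subpopulations for a 2-state model: slow vs. fast switching.
u_slow = [[9.5, 0.5], [0.5, 9.5]]
u_fast = [[6.0, 4.0], [4.0, 6.0]]

# Made-up posterior transition counts for one trace that rarely switches.
counts = [[48.2, 1.8], [2.1, 47.9]]

r = responsibilities(counts, [u_slow, u_fast], [0.5, 0.5])
print(r)
```

Working in log space with `math.lgamma` avoids overflow in the gamma functions when traces are long and the counts are large.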
Figure: axes τfast and τslow, tick values 2, 4, 8, 16, 32, 64; the second build of the slide adds the label "reality".
Figure panels: no EF-G, 50 nM EF-G, 500 nM EF-G
Fei, Bronson, Hofman, Srinivas, Wiggins, Gonzalez, PNAS, 2009
$p(z_k) \sim e^{-G_k / k_B T}$
$\log p(z_k) - \log p(z_l) = -(G_k - G_l)/k_B T + \text{cst.}$
$\Delta\Delta G = \mathrm{logit}(p)$
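As a small worked example of the Boltzmann relation above, the sketch below converts state occupancies into a free-energy difference in units of k_B T. It assumes the additive constant is zero and a two-state system, in which case the difference reduces to a logit of the occupancy (up to sign convention). The occupancy value and the function names are hypothetical.

```python
import math


def delta_g(p_k, p_l):
    """G_k - G_l in units of k_B T, from occupancies p(z_k) and p(z_l)."""
    return -(math.log(p_k) - math.log(p_l))


def logit(p):
    return math.log(p / (1.0 - p))


p_bound = 0.8  # made-up occupancy of state k in a two-state system
dG = delta_g(p_bound, 1.0 - p_bound)

# For two states, G_k - G_l = -logit(p_k): the more populated state has lower G.
print(dG, -logit(p_bound))
```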
bound fraction and life-times (one figure per condition):
• no EF-G
• 5 nM EF-G
• 50 nM EF-G
• 250 nM EF-G
• 500 nM EF-G
model selection:
Figure panels (Inf Out - Inf In):
• Low Noise, Underfitted
• Low Noise, Correct (Out vs In)
• Low Noise, Overfitted
• High Noise, Underfitted
• High Noise, Correct
• High Noise, Overfitted
the future, in progress:
thanks. . .
- jake hofman (vbmod,vbfret)
- jonathan bronson (vbfret)
- jan-willem van de meent (hfret)
- ruben gonzalez (vbfret, hfret)
for more info:
- vbfret.sourceforge.net
- vbmod.sourceforge.net
- hfret.sourceforge.net (soon)
BMC bioinformatics, 2010;
PNAS 2009;
Biophysical Journal 2009;
traditional role of statistics in biophysics
“if your experiment needs statistics, you ought to have done a better experiment” - lord rutherford

variational bayes in biophysics

  • 1. bio+stats vbem/networks hierarchical variational and hierarchical modeling for biological data chris wiggins columbia april 23, 2012 chris.wiggins@columbia.edu 4/23/12 Chris Wiggins • APAM: Department of Applied Physics and Applied Mathematics; • C2B2: Center for Computational Biology and Bioinformatics; • CISB: Columbia University Initiative in Systems Biology • ISDE: Institute for Data Sciences and Engineering Columbia University September 28, 2012
  • 2. thanks: jake hofman (vbmod, vbfret); jonathan bronson (vbfret); jan-willem van de meent (hfret); ruben gonzalez (vbfret, hfret). for more info: vbfret.sourceforge.net; vbmod.sourceforge.net; hfret.sourceforge.net (soon). BMC Bioinformatics, 2010; PNAS, 2009; Biophysical Journal, 2009.
  • 3. 1. biology and statistics: genomics; generative modeling. 2. variational/biological networks: variational Bayesian expectation maximization; inference; model selection. 3. hierarchical/time series: biological challenges; inference; model selection.
  • 4. biology and statistics:
  • 5. genomics: plos comp bio '10, nyas '07, bmc bioinfo '06a, bmc bioinfo '06b, bioinfo '04, regulatory genomics '04, IEEE '05; NIPS (MLCB '03, '06)
  • 6. generative modeling: bmc bioinfo '10, PNAS '09, biophys j '09, PNAS '07, PNAS '06, NIPS (MLCB '09, '10, '11), IEEE sig. proc. '12; plos one '08, cell '07, prl '06, jcs '06, biophys j '06; pami '08, prl '08, NIPS (MLCB '08), PNAS '05, bmc bioinfo '04
  • 7. variational/biological networks: variational bayesian expectation maximization; inference; model selection
  • 8. introduction formulation results extensions motivation history introduction: chris.wiggins@columbia.edu 22.2.12 vbmod.sourceforge.net Hartwell, Hopfield, Leibler and Murray NATURE|VOL 402 | SUPP | 2 DECEMBER 1999 | www.nature.com
  • 9. motivation: community detection in networks (social networks, biological networks). problem: over-fitting/resolution limit
  • 10. history, by community. math/cs: spectral methods (Fiedler '74, Shi + Malik '00). math/cs: clustering generally (Taskar, Koller, Getoor). physics: modularity. common thread: test w/ stochastic block model ('76, '83). ergo: use as inference tool (Hastings 0604429, Newman+Leicht 061148)
  • 11. formulation: generative model; maximum likelihood; maximum evidence; complexity control. . .; variational/mft. . .; algorithm. in physics: "test hamiltonian"; in ML: "variational bayesian methods" (Jordan, MacKay)
  • 12. generative model: foreach node, roll a K-sided die with bias π to choose $z_i \in \{1, \ldots, K\}$. foreach edge, flip a coin with bias ϑ+ if $z_i = z_j$, else ϑ−; draw the edge if the coin lands heads up. Stochastic block models (Holland, Laskey, Leinhardt 1983; Wang and Wong, 1987). [graphical model with nodes π, z_i, z_j, A_ij, θ; plate over i≠j]
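The die-rolling and coin-flipping recipe on this slide can be sketched in a few lines of Python. This is an illustrative sketch, not the vbmod code: the function name `sample_sbm`, the parameter names `theta_in` and `theta_out` (standing in for ϑ+ and ϑ−), and the choice of an undirected graph with one coin per unordered pair are all assumptions.

```python
import random


def sample_sbm(n_nodes, pi, theta_in, theta_out, seed=0):
    """Sample module labels z and a symmetric 0/1 adjacency matrix A."""
    rng = random.Random(seed)
    n_modules = len(pi)
    # Roll a K-sided die with bias pi once per node.
    z = rng.choices(range(n_modules), weights=pi, k=n_nodes)
    A = [[0] * n_nodes for _ in range(n_nodes)]
    # Flip one coin per unordered pair (assumption: undirected, no self-loops).
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            bias = theta_in if z[i] == z[j] else theta_out
            if rng.random() < bias:  # coin lands heads up: draw the edge
                A[i][j] = A[j][i] = 1
    return z, A


z, A = sample_sbm(20, [0.5, 0.5], theta_in=0.8, theta_out=0.05)
```

With `theta_in` well above `theta_out` the sampled graph is assortative, which is the regime the community-detection problem is concerned with.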
  • 13. generative model. . . (bis): die rolling, coin flipping, and priors, where counts are: non-edges within modules; edges within modules; edges between modules; non-edges between modules; nodes in each module
  • 14. max likelihood: die rolling, coin flipping <-> infinite-range spin-glass Potts model: $H \equiv -\ln p(A, \vec{z} \mid \vec{\pi}, \vec{\theta}) = -\Big[\sum_{i,j} (J_L A_{ij} - J_G)\,\delta_{z_i,z_j} + \sum_{\mu=1}^{K} h_\mu \sum_{i=1}^{N} \delta_{z_i,\mu}\Big]$ (up to an additive constant independent of $\vec{z}$), with $J_G \equiv \ln[(1-\theta_d)/(1-\theta_c)]$, $J_L \equiv \ln(\theta_c/\theta_d) + J_G$, $h_\mu \equiv \ln \pi_\mu$. Extends Newman (2004, 2006), Hastings (2006), Bornholdt & Reichardt (2006)
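The correspondence between the coin-flip likelihood and a Potts-style Hamiltonian can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the pair sum is assumed to run over unordered pairs i < j, and the couplings are taken as J_G = ln[(1−θd)/(1−θc)] and J_L = ln(θc/θd) + J_G, which is what the within-module (θc) and between-module (θd) coin biases imply. The z-independent constant is added back so the two expressions can be compared exactly.

```python
import math


def log_joint(A, z, pi, theta_c, theta_d):
    """Direct ln p(A, z | pi, theta): one die roll per node, one coin per pair."""
    n = len(z)
    total = sum(math.log(pi[m]) for m in z)
    for i in range(n):
        for j in range(i + 1, n):
            p = theta_c if z[i] == z[j] else theta_d
            total += math.log(p) if A[i][j] else math.log(1.0 - p)
    return total


def potts_form(A, z, pi, theta_c, theta_d):
    """The same quantity written as -H plus the z-independent constant."""
    n = len(z)
    J_G = math.log((1.0 - theta_d) / (1.0 - theta_c))
    J_L = math.log(theta_c / theta_d) + J_G
    total = sum(math.log(pi[m]) for m in z)  # field term, h_mu = ln pi_mu
    for i in range(n):
        for j in range(i + 1, n):
            if z[i] == z[j]:
                total += J_L * A[i][j] - J_G  # coupling term
            # Constant: every pair scored as if it were between modules.
            total += math.log(theta_d) if A[i][j] else math.log(1.0 - theta_d)
    return total


A = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
z = [0, 0, 1, 1]
gap = abs(log_joint(A, z, [0.6, 0.4], 0.7, 0.1)
          - potts_form(A, z, [0.6, 0.4], 0.7, 0.1))
```

The two functions agree to floating-point precision for any labelling, which is the sense in which maximizing the likelihood over z is equivalent to minimizing H.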
  • 17. max evidence: [figure: increasing complexity]
  • 18. max evidence: http://research.microsoft.com/~minka/statlearn/demo/
  • 20. max evidence: cf. "BIC" (Schwarz, 1978)
  • 25. max likelihood: infer distributions over spin assignments, coupling constants, and chemical potentials, and find the number of occupied spin states
  • 26. max evidence: $p(A \mid K) = \sum_{\vec{z}} \int d\vec{\pi} \int d\vec{\theta}\; p(A, \vec{z}, \vec{\theta}, \vec{\pi}) = \sum_{\vec{z}} \int d\vec{\pi} \int d\vec{\theta}\; e^{-H}\, p(\vec{\pi})\, p(\vec{\theta})$
  • 27. max evidence: can do the integrals, but the sum is intractable, $O(K^N)$; use mean-field. Extends Newman (2004, 2006), Hastings (2006), Bornholdt & Reichardt (2004 & 2006)
  • 28. max evidence: Gibbs'/Jensen's inequality (log of expected value bounds expected value of log) for any distribution q. Variational Bayes (MacKay, Jordan, Ghahramani, Jaakkola, Saul 1999; cf. Feynman 1972)
  • 29. max evidence: why would you do this? (A1): Beal, 2003
  • 30. max evidence: why would you do this? (A2): Beal, 2003
  • 31. max evidence: why would you do this? (A3): Beal, 2003
  • 32. max evidence: F is a functional of q; find an approximation to the posterior by optimizing an approximation to the evidence. Take q(z, π, θ) = q(z)q(π)q(θ); $Q_{i\mu}$ is the probability that node i is in module μ
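The inequality referred to on this slide is the standard Jensen bound. The equation itself did not survive in the transcript, so the following is a reconstruction of the textbook form rather than the slide's exact notation; in particular, the sign convention for F (a free energy that upper-bounds the negative log-evidence) is an assumption.

```latex
\begin{align}
\ln p(A \mid K)
  &= \ln \sum_{\vec z} \int d\vec\pi \int d\vec\theta\;
     q(\vec z, \vec\pi, \vec\theta)\,
     \frac{p(A, \vec z, \vec\pi, \vec\theta \mid K)}{q(\vec z, \vec\pi, \vec\theta)} \\
  &\ge \sum_{\vec z} \int d\vec\pi \int d\vec\theta\;
     q(\vec z, \vec\pi, \vec\theta)\,
     \ln \frac{p(A, \vec z, \vec\pi, \vec\theta \mid K)}{q(\vec z, \vec\pi, \vec\theta)}
  \equiv -F\{q; A\}
\end{align}
```

Under this convention, minimizing F over the factorized family q(z)q(π)q(θ) tightens the bound on the evidence, and equality holds when q is the exact posterior.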
  • 33. algo: where expected counts are: [equations not captured in transcript]
  • 36. algo: suggests hard limit in step 3; sparse in step 1
  • 37. results: run time; consistency; required plot: good vs. easy; real data (karate, biology, american football)
  • 38. run time: main loop runtime for $10^4$ nodes in MATLAB ~30 seconds
  • 40. consistency: [plot: p(θ+) and p(θ−) versus θ; N=8, K=2, distribution after 2 iterations]
  • 41. consistency: K=4? automatic complexity control: probability of occupation for extraneous modules goes to zero
  • 43. consistency: the "resolution limit" problem. [figure: 60-node network, panels "Variational Bayes" and "Girvan-Newman modularity"]
  • 44. consistency: the "resolution limit" problem. Girvan-Newman modularity, or a Potts model w/ fixed parameters, suffers from a resolution limit, where the size of detected modules depends on network size (Fortunato et al. (2007), Kumpula et al. (2007)). [plots: K* versus Ktrue, with the line K* = Ktrue, for Variational Bayes and modularity optimization; GN modularity (Clauset's algorithm) versus Ktrue on a ring of 4-node cliques, for single-clique communities (correct) and double-clique communities (incorrect)]
  • 45. good vs easy:
  • 46. real data: validation: NCAA football schedule. correctly infer K=12 conferences. [figure: 115-node network; nodes: teams; edges: games; shape: conference; color: inferred module]
  • 47. real data: APS march meeting 2008. [figure labels: superconductivity (experimentalists); nanotubes, graphene; superconductivity (theorists)]
  • 48. extensions. model extensions: full SBM (done); hierarchical model $p(A_{ij} = 1 \mid z_i = z_j \pm 1)$; hierarchical modeling (ensemble of graphs), $p(D \mid u, K) = \prod_{i}^{L} \int d\vartheta_i\, p(D \mid \vartheta_i)\, p(\vartheta_i \mid u, K)$; $\mathbb{R}^d$ embedding (latent are real); more 'rigorous' SOM?; 'correct' for degree (allow variable affinity) (cf. Bader, Karrer). algorithm extensions: BP (see earlier talks); map-reduce; cvmod (model selection via cross validation)
  • 49. full SBM: probability of edge depends only on block membership: $p(A_{ij} \mid z_i = \mu, z_j = \nu) = \vartheta_{\mu\nu}$
  • 50. full SBM: nodes belong to "blocks" of varying size; roll die for assignment of nodes to blocks; probability of edge between two nodes depends only on block membership; flip (one of $K^2$) coins for edges; result: mixture of Erdős-Rényi graphs. [plot: adjacency matrix, nz = 2275]
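A minimal sketch of sampling from the full stochastic block model follows, where the edge probability is looked up in a K×K table indexed by the two block labels. The function name `sample_full_sbm` is hypothetical, and the sketch assumes an undirected graph, so `theta` is expected to be symmetric.

```python
import random


def sample_full_sbm(n_nodes, pi, theta, seed=0):
    """Sample block labels z and a symmetric adjacency matrix A.

    theta[mu][nu] is the edge probability between blocks mu and nu.
    """
    rng = random.Random(seed)
    # Roll the die: assign each node to a block with probabilities pi.
    z = rng.choices(range(len(pi)), weights=pi, k=n_nodes)
    A = [[0] * n_nodes for _ in range(n_nodes)]
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            # Flip the coin belonging to this pair of blocks.
            if rng.random() < theta[z[i]][z[j]]:
                A[i][j] = A[j][i] = 1
    return z, A


theta = [[0.9, 0.1], [0.1, 0.4]]
z, A = sample_full_sbm(30, [0.3, 0.7], theta)
```

Setting all diagonal entries of `theta` to one value and all off-diagonal entries to another recovers the constrained two-parameter model from the earlier slides.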
  • 51. full SBM: vs
  • 52. full SBM: [plot: adjacency matrix, nz = 2803]
  • 53. full SBM: vs
  • 54. full SBM: >> vbsbm_vs_vbmod(0) running vbmod ... Elapsed time is 1.136925 seconds. running vbsbm ... Elapsed time is 1.398904 seconds. Fmod=13089.158019 Fsbm=13144.445782 vbmod wins
  • 55. full SBM: >> vbsbm_vs_vbmod(0.25) running vbmod ... Elapsed time is 1.557298 seconds. running vbsbm ... Elapsed time is 1.759527 seconds. Fmod=20457.142416 Fsbm=19457.306022 vbsbm wins
  • 56. full SBM: >> vbsbm_vs_vbmod(0.5) running vbmod ... Elapsed time is 2.624886 seconds. running vbsbm ... Elapsed time is 1.440242 seconds. Fmod=26133.351210 Fsbm=23921.797625 vbsbm wins
  • 57. full SBM: using the same framework we can compare the unconstrained and full stochastic block models via p(D|M,K*). [plots: win percentage for unconstrained model versus perturbation to constrained model; adjacency matrices with nz = 2100, 2048, 2108]
  • 64. for more info. . . code: MATLAB & python (inc. "full" SBM) (vbmod.sf.net); paper: arxiv 08 / prl 08; Hofman; soon to come (not by me): code in C++, inc. full 'vblabel propagation' algo; twitter-scale analysis
  • 65. hierarchical/time series: biological challenges; inference; model selection
  • 66. hfret: Jan-Willem van de Meent, Ruben Gonzalez, Chris Wiggins, Columbia University
  • 67. Ramakrishnan et al – http://www.mrc-lmb.cam.ac.uk/ribo/
  • 71. hfret: FRET = cy5 / (cy3 + cy5). Tinoco and Gonzalez, Genes Dev, 2011
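As a small illustration of the ratio on this slide, the per-frame FRET signal can be computed from the two intensity traces. The function name `fret_ratio` is hypothetical, and the sketch assumes background-corrected intensities with a nonzero sum in every frame.

```python
def fret_ratio(cy3, cy5):
    """Per-frame FRET = cy5 / (cy3 + cy5) from donor (cy3) and acceptor (cy5) traces."""
    return [a / (d + a) for d, a in zip(cy3, cy5)]


signal = fret_ratio([1.0, 3.0], [3.0, 1.0])
```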
  • 72. hfret: Tinoco and Gonzalez, Genes Dev, 2011
  • 73. hfret: [panels: Unbound; EF-G bound] (short-lived GS1 states correspond to an EF-G + GDPNP binding event). Tinoco and Gonzalez, Genes Dev, 2011; Fei et al, PNAS, 2009
  • 75. hfret: 1. identify states 2. estimate kinetic rates 3. average over many time series 4. detect subpopulations
  • 76. hfret: [plot: FRET signal]
  • 77. hfret: [plot: FRET signal and histogram]
  • 80. hfret: [plot: FRET signal and histogram] idea: find probability of belonging to each state
  • 81. hfret: Expectation Maximization: 1. calculate $p(z \mid x, \theta_i)$ 2. calculate $\theta_{i+1}$ from $p(z \mid x, \theta_i)$
  • 82. hfret: Log-Likelihood: $L = \log p(x \mid \theta) = \log \sum_z p(x, z \mid \theta)$
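The two EM steps can be sketched for the simplest case, a one-dimensional Gaussian mixture fitted to a histogram of FRET values. This is an illustrative sketch under stated assumptions, not the vbFRET or hFRET code: the function names, the synthetic two-state data, and the initial guesses are all made up for the example.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def em_gmm(x, mu, sigma, pi, n_iter=50):
    """EM for a 1-D Gaussian mixture; returns parameters and the log-likelihood trace."""
    mu, sigma, pi = list(mu), list(sigma), list(pi)
    n_states = len(mu)
    log_liks = []
    for _ in range(n_iter):
        # Step 1: responsibilities p(z | x, theta_i) for every data point.
        resp, log_lik = [], 0.0
        for xt in x:
            w = [pi[k] * normal_pdf(xt, mu[k], sigma[k]) for k in range(n_states)]
            total = sum(w)
            log_lik += math.log(total)
            resp.append([wk / total for wk in w])
        log_liks.append(log_lik)
        # Step 2: theta_{i+1} from the responsibilities (weighted moments).
        for k in range(n_states):
            n_k = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * xt for r, xt in zip(resp, x)) / n_k
            var = sum(r[k] * (xt - mu[k]) ** 2 for r, xt in zip(resp, x)) / n_k
            sigma[k] = max(math.sqrt(var), 1e-6)  # guard against collapse
            pi[k] = n_k / len(x)
    return mu, sigma, pi, log_liks


rng = random.Random(1)
data = [rng.gauss(0.2, 0.05) for _ in range(200)] + [rng.gauss(0.8, 0.05) for _ in range(200)]
mu, sigma, pi, log_liks = em_gmm(data, [0.3, 0.6], [0.2, 0.2], [0.5, 0.5])
```

The recorded log-likelihood never decreases from one iteration to the next, which is a convenient sanity check on any EM implementation.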
  • 83. hfret: [plot: learned vs. truth] accurate for occupancy of states, not so good for rate estimates
  • 84. hfret: $p(x, z \mid \mu, \sigma, \pi) = p(x \mid z, \mu, \sigma)\, p(z \mid \pi)$
  • 85. hfret: probability of state depends on previous state: $p(z_{t+1} = l \mid z_t = k) = A_{kl}$
  • 89. hfret: $p(z_1 = k) = \pi_k$
  • 90. hfret: $p(z_{t+1} = l \mid z_t = k) = A_{kl}$
  • 91. hfret: $p(x_t \mid z_t = k) = \mathcal{N}(x_t \mid \mu_k, \sigma_k)$
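The three ingredients above (initial-state probabilities π, transition matrix A, Gaussian emissions) define a hidden Markov model, whose likelihood can be evaluated with the standard forward recursion. The sketch below is an illustration with hypothetical function names, and it includes a brute-force sum over all state paths purely as a cross-check for short traces.

```python
import math
from itertools import product


def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def hmm_log_likelihood(x, pi, A, mu, sigma):
    """log p(x | theta) by the scaled forward recursion."""
    n_states = len(pi)
    alpha = [pi[k] * normal_pdf(x[0], mu[k], sigma[k]) for k in range(n_states)]
    scale = sum(alpha)
    log_lik = math.log(scale)
    alpha = [a / scale for a in alpha]
    for xt in x[1:]:
        alpha = [
            sum(alpha[k] * A[k][l] for k in range(n_states)) * normal_pdf(xt, mu[l], sigma[l])
            for l in range(n_states)
        ]
        scale = sum(alpha)  # rescale each step to avoid underflow
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


def brute_force_log_likelihood(x, pi, A, mu, sigma):
    """Same quantity by summing over every state path; only feasible for short x."""
    total = 0.0
    for path in product(range(len(pi)), repeat=len(x)):
        p = pi[path[0]] * normal_pdf(x[0], mu[path[0]], sigma[path[0]])
        for t in range(1, len(x)):
            p *= A[path[t - 1]][path[t]] * normal_pdf(x[t], mu[path[t]], sigma[path[t]])
        total += p
    return math.log(total)


trace = [0.2, 0.25, 0.8, 0.75, 0.3]
params = ([0.5, 0.5], [[0.9, 0.1], [0.2, 0.8]], [0.2, 0.8], [0.1, 0.1])
fast = hmm_log_likelihood(trace, *params)
slow = brute_force_log_likelihood(trace, *params)
```

The forward recursion costs O(T K²) for a trace of length T with K states, whereas the brute-force sum costs O(K^T).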
  • 92. hfret: [plot: learned vs. real] we've learned: parameters: $\theta = \{\mu, \sigma, \pi, A\}$; states: $p(z \mid x, \theta)$
  • 93. hfret: [panels: 2 states; 3 states]
  • 94. hfret: Log-Likelihood: $L = \log p(x \mid \theta) = \log \sum_z p(x, z \mid \theta)$. Log-Evidence: $L = \log p(x \mid u) = \log \sum_z \int d\theta\, p(x, z \mid \theta)\, p(\theta \mid u)$
  • 95. hfret: log-likelihood and log-evidence as above (label: Prior)
  • 96. hfret: log-likelihood and log-evidence as above (label: Ensemble)
  • 97. hfret: Log-Evidence: best model has highest average likelihood
  • 98. hfret: Log-Evidence: $L = \log p(x \mid u) = \log \sum_z \int d\theta\, p(x, z \mid \theta)\, p(\theta \mid u)$. Lower Bound: $L = \sum_z \int d\theta\, q(z)\, q(\theta \mid w) \log \frac{p(x, z, \theta \mid u)}{q(z)\, q(\theta \mid w)} \le \log p(x \mid u)$, with $q(z)\, q(\theta \mid w) \simeq p(z, \theta \mid x)$
  • 99. hfret: lower bound tight for true posterior: $L = \sum_z \int d\theta\, p(z, \theta \mid x) \log \frac{p(x, z, \theta \mid u)}{p(z, \theta \mid x)} = \sum_z \int d\theta\, p(z, \theta \mid x) \log[p(x \mid u)] = \log p(x \mid u)$. In general, $L = \log p(x \mid u) - D_{kl}[q(z)\, q(\theta \mid w) \,\|\, p(z, \theta \mid x)]$
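The identity "lower bound = log-evidence minus KL divergence" can be verified on a toy problem. The sketch below makes a deliberate simplification that should be read as an assumption: the only latent variable is a discrete z with three values (no integral over θ), and the joint probabilities are made-up numbers.

```python
import math


def lower_bound(joint, q):
    """L = sum_z q(z) log[p(x, z) / q(z)] for a fixed observation x."""
    return sum(qz * math.log(pz / qz) for pz, qz in zip(joint, q) if qz > 0)


def kl_divergence(q, p):
    """D_kl[q || p] for two discrete distributions on the same support."""
    return sum(qz * math.log(qz / pz) for qz, pz in zip(q, p) if qz > 0)


joint = [0.10, 0.25, 0.05]            # p(x, z) for z = 0, 1, 2 (illustrative values)
evidence = sum(joint)                 # p(x)
posterior = [pz / evidence for pz in joint]
q = [0.5, 0.3, 0.2]                   # an arbitrary approximating distribution

bound = lower_bound(joint, q)
gap = math.log(evidence) - kl_divergence(q, posterior)
tight = lower_bound(joint, posterior)
```

Here `bound` equals `gap` to floating-point precision, and `tight` equals the log-evidence, matching the statement that the bound is tight when q is the true posterior.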
  • 100. hfret: we've learned: parameters: $q(\theta \mid w)$; states: $p(z \mid x, \theta)$. VBEM updates: $\frac{\delta L_n}{\delta q(z_n)} = 0$, $\frac{\delta L_n}{\delta q(\theta_n)} = 0$
  • 101. hfret: variability: photophysical/experimental
  • 106. Hierarchical Updates: $\frac{\partial}{\partial u} \sum_n L_n = 0$.
  • 107. Hierarchical Updates: $\frac{\partial}{\partial u} \sum_n L_n = 0$. “two-stage PEB model/CIHM” (Kass and Steffey, JASA, 1989)
  • 108. Until $\sum_n L_n$ converges: 1. Run VBEM on each trace (until $L_n$ converges: update $q(z_n)$; update $q(\theta_n \mid w_n)$). 2. Update $p(\theta \mid u)$. VBEM Updates: $\frac{\delta L_n}{\delta q(z_n)} = 0$, $\frac{\delta L_n}{\delta q(\theta_n)} = 0$. Hierarchical Updates: $\frac{\partial}{\partial u} \sum_n L_n = 0$.
  • 109. Until $\sum_n L_n$ converges: 1. Run VBEM on each trace (until $L_n$ converges: update $q(z_n)$; update $q(\theta_n \mid w_n)$). 2. Update $p(\theta \mid u)$. We’ve learned: $p(\theta_n, z_n \mid x_n) \simeq q(\theta_n)\, q(z_n)$ (for each trace); $p(\theta \mid u)$ (for ensemble).
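The two-step loop (per-trace inference, then an update of the shared prior) can be sketched on a deliberately simplified model. This is not hFRET: the sketch assumes each trace has a single Gaussian mean $\theta_n$ with known noise variance and no hidden states, so the per-trace step is an exact conjugate posterior rather than VBEM, and the prior update is the standard EM update for the hyperparameters $u = (m, s^2)$. All names and data are hypothetical.

```python
import math

SIGMA2 = 0.01  # known observation noise variance (assumption of this sketch)

traces = [     # hypothetical data: one list of observations per trace
    [0.10, 0.30, 0.20, 0.40],
    [0.80, 0.90, 0.70, 1.00, 0.85],
    [0.50, 0.45, 0.60],
]

def trace_posterior(x, m, s2):
    """Exact Gaussian posterior over theta_n given the prior N(m, s2)."""
    precision = 1.0 / s2 + len(x) / SIGMA2
    var = 1.0 / precision
    mean = var * (m / s2 + sum(x) / SIGMA2)
    return mean, var

def trace_log_evidence(x, m, s2):
    """log p(x_n | u) with theta_n integrated out analytically."""
    T = len(x)
    d = [xt - m for xt in x]
    quad = (sum(v * v for v in d) - s2 * sum(d) ** 2 / (SIGMA2 + T * s2)) / SIGMA2
    log_det = T * math.log(SIGMA2) + math.log(1.0 + T * s2 / SIGMA2)
    return -0.5 * (T * math.log(2.0 * math.pi) + log_det + quad)

def fit(traces, m=0.0, s2=1.0, tol=1e-10, max_iter=200):
    """Alternate per-trace inference and prior updates until the summed evidence converges."""
    history = [sum(trace_log_evidence(x, m, s2) for x in traces)]
    for _ in range(max_iter):
        # Step 1: inference on each trace under the current shared prior.
        posts = [trace_posterior(x, m, s2) for x in traces]
        # Step 2: update the shared prior from the per-trace posteriors.
        m = sum(mu for mu, _ in posts) / len(posts)
        s2 = sum(var + (mu - m) ** 2 for mu, var in posts) / len(posts)
        history.append(sum(trace_log_evidence(x, m, s2) for x in traces))
        if history[-1] - history[-2] < tol:
            break
    return m, s2, posts, history

m, s2, posts, history = fit(traces)
print("ensemble prior mean and variance:", m, s2)
print("per-trace posterior means:", [round(mu, 3) for mu, _ in posts])
```

In this toy setting the summed log-evidence plays the role of $\sum_n L_n$, and it does not decrease from one outer iteration to the next, which is what makes it usable as a stopping criterion.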
  • 117. Unbound; EF-G bound. Tinoco and Gonzalez, Genes Dev, 2011; Fei et al., PNAS, 2009.
  • 118. $\xi_{ntkl} = p(z_{nt} = k, z_{n,t+1} = l \mid x_n)$. 1. Run mixture model on posterior counts: $p(\xi_n \mid A) = \prod_{t,k,l} A_{kl}^{\xi_{ntkl}}$, $p(\xi_n \mid u_m) = \int dA\, p(A \mid u_m)\, p(\xi_n \mid A)$. 2. Rerun with M x K block-diagonal form: $u^A = \mathrm{diag}(u^A_1, \ldots, u^A_M)$.
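Step 1 can be sketched in closed form if one assumes that $p(A \mid u_m)$ is a product of independent Dirichlet distributions, one per row of the transition matrix. That choice is an assumption of this sketch, not something stated on the slide, and the counts, pseudo-counts, and mixture weights below are hypothetical. With Dirichlet rows, the integral over $A$ reduces to a ratio of gamma functions evaluated at the summed counts $c_{kl} = \sum_t \xi_{ntkl}$.

```python
import math

def log_marginal(counts, u):
    """log p(xi_n | u_m) for row-wise Dirichlet priors.

    counts[k][l] are (possibly fractional) expected transition counts,
    u[k][l] are Dirichlet pseudo-counts for row k.
    """
    total = 0.0
    for c_row, u_row in zip(counts, u):
        total += math.lgamma(sum(u_row)) - math.lgamma(sum(u_row) + sum(c_row))
        for c, a in zip(c_row, u_row):
            total += math.lgamma(a + c) - math.lgamma(a)
    return total

def responsibilities(counts, components, weights):
    """Posterior probability that a trace belongs to each mixture component."""
    logs = [math.log(w) + log_marginal(counts, u)
            for u, w in zip(components, weights)]
    top = max(logs)                          # subtract the max for numerical stability
    unnorm = [math.exp(v - top) for v in logs]
    norm = sum(unnorm)
    return [v / norm for v in unnorm]

# Hypothetical two-state example: one "slow" and one "fast" subpopulation.
slow = [[9.0, 1.0], [1.0, 9.0]]              # pseudo-counts favoring staying put
fast = [[5.0, 5.0], [5.0, 5.0]]              # pseudo-counts favoring frequent switching
trace_counts = [[90.0, 10.0], [12.0, 88.0]]  # expected transition counts for one trace

r = responsibilities(trace_counts, [slow, fast], [0.5, 0.5])
print("responsibility (slow, fast):", r)
```

A trace whose expected counts sit mostly on the diagonal is assigned almost entirely to the slow component, which is the kind of grouping a mixture over transition statistics is meant to recover.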
  • 119. [Figure: $\tau_{\text{fast}}$ and $\tau_{\text{slow}}$; tick labels 2, 4, 8, 16, 32, 64]
  • 120. [Figure: $\tau_{\text{fast}}$ and $\tau_{\text{slow}}$; tick labels 2, 4, 8, 16, 32, 64; panel label: reality]
  • 121. [Figure panels: no EF-G; 50 nM EF-G; 500 nM EF-G] Fei, Bronson, Hofman, Srinivas, Wiggins, Gonzalez, PNAS, 2009
  • 122. [Figure panels: no EF-G; 50 nM EF-G; 500 nM EF-G] $p(z_k) \sim e^{-G_k / k_B T}$; $\log p(z_k) - \log p(z_l) = -(G_k - G_l)/k_B T + \text{cst.}$
  • 123. [Figure panels: no EF-G; 50 nM EF-G; 500 nM EF-G] $p(z_k) \sim e^{-G_k / k_B T}$; $\Delta\Delta G = \mathrm{logit}(p)$
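The Boltzmann relation above turns state occupancies into free-energy differences. The sketch below is a minimal illustration with a hypothetical bound fraction; it works in units of $k_B T$ and assumes both occupancies share the same normalization, so the additive constant cancels. For a two-state system the result is the logit of the occupancy up to a sign that depends on which state is taken as the reference.

```python
import math

def delta_g_in_kT(p_k, p_l):
    """(G_k - G_l) / (k_B T) from Boltzmann-weighted occupancies p_k and p_l."""
    return -(math.log(p_k) - math.log(p_l))

def logit(p):
    """Log-odds of a probability p."""
    return math.log(p / (1.0 - p))

p_bound = 0.8                                  # hypothetical bound-state occupancy
dg = delta_g_in_kT(p_bound, 1.0 - p_bound)     # bound relative to unbound

print("G_bound - G_unbound (in k_B T):", dg)
print("logit(p_bound):                ", logit(p_bound))
```

A more populated state has the lower free energy, so an occupancy above one half gives a negative difference for that state relative to the other.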
  • 124. no EF-G: bound fraction and life-times
  • 125. 5 nM EF-G: bound fraction and life-times
  • 126. 50 nM EF-G: bound fraction and life-times
  • 127. 250 nM EF-G: bound fraction and life-times
  • 128. 500 nM EF-G: bound fraction and life-times
  • 129. model selection:
  • 130. Low Noise, Underfitted: Inf Out - Inf In
  • 131. Low Noise, Correct: Out vs In
  • 132. Low Noise, Overfitted: Inf Out - Inf In
  • 133. High Noise, Underfitted: Inf Out - Inf In
  • 134. High Noise, Correct: Inf Out - Inf In
  • 135. High Noise, Overfitted: Inf Out - Inf In
  • 136. High Noise, Overfitted: Inf Out - Inf In
  • 137. the future, in progress: X
  • 138. thanks. . . - jake hofman (vbmod, vbfret) - jonathan bronson (vbfret) - jan-willem van de meent (hfret) - ruben gonzalez (vbfret, hfret) for more info: - vbfret.sourceforge.net - vbmod.sourceforge.net - hfret.sourceforge.net (soon) BMC bioinformatics, 2010; PNAS 2009; Biophysical Journal 2009;
  • 139. traditional role of statistics in biophysics “if your experiment needs statistics, you ought to have done a better experiment” -lord rutherford