Application of Bayesian and Sparse Network Models for Assessing Linkage Disequilibrium in Animals and Plants
1.
Application of Bayesian and Sparse Network
Models for Assessing Linkage Disequilibrium in
Animals and Plants
C-36-6
Gota Morota
Department of Animal Sciences
University of Wisconsin-Madison
Aug 30, 2012
2.
Systems Genetics
Figure 1: Multi-dimensional gene network
Purpose of this study
• take the view that loci associate and interact together as a
network
• evaluate LD reflecting the biological nature that loci interact as
a complex system
3.
IAMB algorithm
Incremental Association Markov Blanket (Tsamardinos et al. 2003)
1. Compute Markov Blankets (MB)
2. Compute Graph Structure
3. Orient Edges
Figure 2: The Markov Blanket of a node xi
4.
Identifying the MB of a node
• Growing phase
• heuristic function:
$$f(X; T \mid CMB) = MI(X; T \mid CMB) = \sum_{cmb \in CMB} P(CMB) \sum_{x \in X} \sum_{t \in T} P(X, T \mid CMB) \log \frac{P(X, T \mid CMB)}{P(X \mid CMB)\, P(T \mid CMB)}$$
• conditional independence tests (Pearson’s χ2 test):
H0 : P (X , T |CMB ) = P (X |CMB ) · P (T |CMB ) (do not add X )
HA : P(X, T | CMB) ≠ P(X | CMB) · P(T | CMB) (add X to the CMB)
• Shrinking phase
• conditional independence tests (Pearson’s χ2 test):
H0 : P(X, T | CMB − X) ≠ P(X | CMB − X) · P(T | CMB − X) (keep X)
HA : P(X, T | CMB − X) = P(X | CMB − X) · P(T | CMB − X) (remove X)
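The heuristic used in the growing phase can be illustrated with a small empirical estimate of the conditional mutual information. This is a minimal sketch, not the implementation behind the slides: the function name `conditional_mutual_information` and the encoding of the conditioning set as one tuple per sample are assumptions, and the Pearson χ2 test named on the slide is not implemented here.

```python
from collections import Counter
from math import log


def conditional_mutual_information(x, t, cmb):
    """Empirical MI(X; T | CMB) in nats.

    x, t : equal-length sequences of discrete observations.
    cmb  : one hashable value per sample encoding the joint state of the
           conditioning set (assumed encoding; use () for an empty set).
    """
    n = len(x)
    n_c = Counter(cmb)                 # counts of each conditioning state
    n_xc = Counter(zip(x, cmb))        # counts of (x, cmb)
    n_tc = Counter(zip(t, cmb))        # counts of (t, cmb)
    n_xtc = Counter(zip(x, t, cmb))    # counts of (x, t, cmb)

    mi = 0.0
    for (xv, tv, cv), count in n_xtc.items():
        # P(cmb) * P(x, t | cmb) simplifies to count / n.
        # P(x, t | cmb) / (P(x | cmb) P(t | cmb)) simplifies to the ratio below.
        ratio = count * n_c[cv] / (n_xc[(xv, cv)] * n_tc[(tv, cv)])
        mi += (count / n) * log(ratio)
    return mi
```

A value near zero suggests that X carries little information about T once CMB is known, which is the situation in which X would not be added to the candidate blanket.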
5.
Network Structure
Algorithm
Suppose Y ∈ MB(T). Then T and Y are connected if they are
conditionally dependent given all subsets of the smaller of
MB(T) − (Y) and MB(Y) − (T).
Example:
• MB(T) = (A, B, Y), MB(Y) = (C, D, E, F, T)
• since |MB(T)| < |MB(Y)|, independence tests are conditional
on all subsets of MB(T) − (Y) = (A, B).
• if any of
CI(T, Y | {}), CI(T, Y | {A}), CI(T, Y | {B}), and CI(T, Y | {A, B})
implies conditional independence,
↓
• T and Y are considered separate (spouses)
• repeat for T ∈ S and Y ∈ MB(T).
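The subset enumeration in the example above can be sketched as follows. The names `conditioning_sets`, `are_neighbours`, and the user-supplied test `is_independent` are hypothetical, and breaking ties in favour of MB(T) when both sets have the same size is an assumption the slide does not address.

```python
from itertools import combinations


def conditioning_sets(mb_t, mb_y, t, y):
    """All subsets of the smaller of MB(T) - {Y} and MB(Y) - {T}."""
    a = sorted(set(mb_t) - {y})
    b = sorted(set(mb_y) - {t})
    smaller = a if len(a) <= len(b) else b  # tie-break is an assumption
    return [set(s)
            for r in range(len(smaller) + 1)
            for s in combinations(smaller, r)]


def are_neighbours(t, y, mb_t, mb_y, is_independent):
    """T and Y stay connected only if no conditioning set separates them.

    is_independent(t, y, z) is a hypothetical conditional independence test
    returning True when T and Y are judged independent given the set z.
    """
    return not any(is_independent(t, y, z)
                   for z in conditioning_sets(mb_t, mb_y, t, y))
```

For the example on the slide, `conditioning_sets(["A", "B", "Y"], ["C", "D", "E", "F", "T"], "T", "Y")` yields the four sets {}, {A}, {B}, and {A, B}.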
6.
Materials
1. Data
• 4,898 Holstein bulls (USDA-ARS AIPL)
• 37,217 SNP markers (MAF > 0.025)
• milk protein yield
2. Missing genotypes imputation
• fastPHASE (Scheet and Stephens, 2006)
3. Select 15 SNPs
• Bayesian LASSO
4. Uncover associations among a set of marker loci found to
have the strongest effects on milk protein yield
7.
Results – Top 15 SNPs
Figure 3: Pairwise LD among SNPs (r²), with an r² color key ranging from 0 to 1
Figure 4: IAMB algorithm
8.
Conclusion and Possible Improvements
• LD relationships are of a multivariate nature
• r² gives an incomplete description of LD
⇓
• undirected networks
• sparsity
9.
Pairwise Binary Markov Networks
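We estimate the Markov network parameters Θ (a p × p matrix) by maximizing
a log-likelihood.
$$f(x_1, \ldots, x_p) = \frac{1}{\Psi(\Theta)} \exp\left( \sum_{j=1}^{p} \theta_{j,j} x_j + \sum_{1 \le j < k \le p} \theta_{j,k} x_j x_k \right) \qquad (1)$$
where
$$x_j \in \{0, 1\} \qquad (2)$$
$$\Psi(\Theta) = \sum_{x \in \{0, 1\}^p} \exp\left( \sum_{j=1}^{p} \theta_{j,j} x_j + \sum_{1 \le j < k \le p} \theta_{j,k} x_j x_k \right) \qquad (3)$$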
• the first term is a main effect of binary marker xj (node
potential)
• the second term corresponds to an “interaction effect” between
binary markers xj and xk (link potential)
• Ψ(Θ) is the normalization constant (partition function)
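To make the role of the partition function concrete, the density can be evaluated by brute force for a handful of markers. This is only an illustrative sketch: the function names are hypothetical, `theta` is assumed to be a list of lists holding θ, and the enumeration over all 2^p configurations is feasible only for very small p.

```python
from itertools import product
from math import exp


def unnormalized(x, theta):
    """exp(sum_j theta[j][j] x_j + sum_{j<k} theta[j][k] x_j x_k)."""
    p = len(x)
    s = sum(theta[j][j] * x[j] for j in range(p))            # node potentials
    s += sum(theta[j][k] * x[j] * x[k]
             for j in range(p) for k in range(j + 1, p))     # link potentials
    return exp(s)


def partition_function(theta):
    """Psi(Theta): sum of the unnormalized density over all of {0, 1}^p."""
    p = len(theta)
    return sum(unnormalized(x, theta) for x in product((0, 1), repeat=p))


def density(x, theta):
    """f(x_1, ..., x_p) for a binary configuration x."""
    return unnormalized(x, theta) / partition_function(theta)
```

Because the sum in Ψ(Θ) has 2^p terms, exact maximum likelihood quickly becomes impractical as p grows, which helps explain why an approximation to the likelihood is attractive.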
10.
Ravikumar et al. (2010)
The pseudo-likelihood based on the local conditional likelihood
associated with each binary marker can be represented as
$$l(\Theta) = \prod_{i=1}^{n} \prod_{j=1}^{p} \phi_{i,j}^{x_{i,j}} (1 - \phi_{i,j})^{1 - x_{i,j}} \qquad (4)$$
where φ_{i,j} is the conditional probability of x_{i,j} = 1 given all other
variables. Using a logistic link function,
$$\phi_{i,j} = P(x_{i,j} = 1 \mid x_{i,k}, k \ne j; \theta_{j,k}, 1 \le k \le p) \qquad (5)$$
$$= \frac{\exp\left(\theta_{j,j} + \sum_{k \ne j} \theta_{j,k} x_{i,k}\right)}{1 + \exp\left(\theta_{j,j} + \sum_{k \ne j} \theta_{j,k} x_{i,k}\right)} \qquad (6)$$
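The pseudo-likelihood and the logistic conditional probability above can be evaluated directly, as in the sketch below. The names `phi` and `log_pseudo_likelihood` are hypothetical, `X` is assumed to be an n × p list of 0/1 rows, and the logarithm of the pseudo-likelihood is returned for numerical convenience.

```python
from math import exp, log


def phi(i, j, X, theta):
    """P(x_ij = 1 | all other markers of individual i), logistic link."""
    p = len(theta)
    eta = theta[j][j] + sum(theta[j][k] * X[i][k] for k in range(p) if k != j)
    # exp(eta) / (1 + exp(eta)) written in an equivalent form
    return 1.0 / (1.0 + exp(-eta))


def log_pseudo_likelihood(X, theta):
    """Log of the product over individuals i and markers j."""
    total = 0.0
    for i, row in enumerate(X):
        for j, x_ij in enumerate(row):
            p_ij = phi(i, j, X, theta)
            total += x_ij * log(p_ij) + (1 - x_ij) * log(1.0 - p_ij)
    return total
```

Each marker contributes an ordinary logistic regression term, so no partition function has to be computed.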
11.
Ravikumar et al. (2010) (cont.)
• L1-regularized logistic regression problem
• regressing each marker on the rest of the markers
• the network structure is recovered from the sparsity pattern of
the regression coefficients
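$$\hat{\Theta} = \begin{pmatrix}
0 & \hat{\beta}_{2}^{-1} & \cdots & \hat{\beta}_{p-1}^{-1} & \hat{\beta}_{p}^{-1} \\
\hat{\beta}_{1}^{-2} & 0 & \cdots & \hat{\beta}_{p-1}^{-2} & \hat{\beta}_{p}^{-2} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
\hat{\beta}_{1}^{-(p-1)} & \hat{\beta}_{2}^{-(p-1)} & \cdots & 0 & \hat{\beta}_{p}^{-(p-1)} \\
\hat{\beta}_{1}^{-p} & \hat{\beta}_{2}^{-p} & \cdots & \hat{\beta}_{p-1}^{-p} & 0
\end{pmatrix} \qquad (7)$$
$$\tilde{\Theta} = \hat{\Theta} \bullet \hat{\Theta}^{T} \qquad (8)$$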
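The step from the asymmetric coefficient matrix to a symmetric one can be sketched as below, reading the • in equation (8) as an element-wise (Hadamard) product, which is an assumption about the slide's notation. The function names and the tolerance parameter `tol` are hypothetical.

```python
def symmetrize(theta_hat):
    """Element-wise product of theta_hat with its transpose."""
    p = len(theta_hat)
    return [[theta_hat[j][k] * theta_hat[k][j] for k in range(p)]
            for j in range(p)]


def edges(theta_tilde, tol=0.0):
    """Pairs (j, k) with j < k whose combined entry is nonzero."""
    p = len(theta_tilde)
    return [(j, k)
            for j in range(p) for k in range(j + 1, p)
            if abs(theta_tilde[j][k]) > tol]
```

Under this reading, an edge between markers j and k survives only when both regressions (j on the rest and k on the rest) give a nonzero coefficient, so the sparsity pattern of the combined matrix is the intersection of the two patterns.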
14.
Summary
Interactions and associations among the cells and genes form a
complex biological system
⇓
• r² → association(m1, m2) | ∅ (empty set)
• L1-regularized MN → association(m1, m2) | else (all other markers)
A final remark
• selecting tag SNPs unconditionally, as well as conditionally,
on other markers when the dimension of the data is high
• data generated from next-generation sequencing technologies
15.
Acknowledgments
University of Wisconsin-Madison
• Daniel Gianola
• Guilherme Rosa
• Kent Weigel
• Bruno Valente
University College London
• Marco Scutari