SlideShare a Scribd company logo
1 of 44
Download to read offline
IMPS 2008, Durham, NH 
Linear Non-Gaussian Structural 
Equation Models 
Shohei Shimizu, Patrik Hoyer and Aapo Hyvarinen 
Osaka University, Japan 
University of Helsinki, Finland
2 
Abstract 
• Linear Structural Equation Modeling (linear SEM) 
– Analyzes causal relations 
• Covariance-based SEM 
– Uses covariance structure alone for model identification 
– A number of indistinguishable models 
• Linear non-Gaussian SEM 
– Uses non-Gaussian structures for model identification 
– Makes many models distinguishable
3 
SEM and causal analysis 
• SEM is often used for causal analysis 
based on non-experimental data 
• Assumption: the data generating process 
is represented by a SEM model 
• If the assumption is reasonable, SEM 
provides causal information
4 
Limitations of covariance-based SEM 
• Covariance-based SEM cannot distinguish 
between many models 
• Example 
e1 x1 x2 x1 x2 e2
5 
Linear non-Gaussian SEM 
• Many observed data are considerably non- 
Gaussian (Micceri, 1989; Hyvarinen et al. 2001) 
• Non-Gaussian structures of data are useful 
(Bentler 1983; Mooijaart 1985) 
• Non-Gaussianity distinguish between the two 
models (Shimizu et al. 2006) : 
e1 x1 x2 x1 x2 e2
6 
Independent component 
analysis (ICA) 
• Observed random vector x is modeled as 
x = As 
s 
– are independent and non-Gaussian 
i 
• Zero means and unit variances 
– A is a constant matrix 
• Typically square, # variables = # independent components 
• Identifiable up to permutation of the columns 
(Mooijaart 1985; Comon, 1994)
7 
ICA estimation 
• An alternative expression of ICA (x=As): 
s =Wx 
, 
~ 
called a recovering matrix 
~ 
where 1 W = A 
• Find such that maximizes independence of 
sˆ =Wx 
components of 
– Many proposals (Hyvarinen et al. 2001) 
W ~ 
• is estimated up to permutation of the rows: 
~ 
= 
W PW 
W
8 
ICA estimation 
• An alternative expression of ICA (x=As): 
s =Wx 
, 
~ 
called a recovering matrix 
~ 
where 1 W = A 
• Find such that maximizes independence of 
sˆ =Wx 
components of 
– Many proposals (Hyvarinen et al. 2001) 
W ~ 
• is estimated up to permutation of the rows: 
~ 
= 
W PW 
W
9 
ICA estimation 
• An alternative expression of ICA (x=As): 
s =Wx 
, 
~ 
called a recovering matrix 
~ 
where 1 W = A 
• Find such that maximizes independence of 
sˆ =Wx 
components of 
– Many proposals (Hyvarinen et al. 2001) 
W ~ 
• is estimated up to permutation of the rows: 
~ 
= 
W PW 
W
Discovery of linear non-Gaussian 
acyclic models 
Shimizu, Hoyer, Hyvarinen and Kerminen (2006)
11 
Linear non-Gaussian acyclic 
model (LiNGAM) 
• Directed acyclic graphs (DAG) 
x 
– can be arranged in a order k(i) 
• Assumptions: 
– Linearity 
– External influences e 
are independent 
– and are non-Gaussian 
x = Bx + e i 
x = b x + e i ij j 
k ( j )  
k ( i 
) 
or 
i 
i
12 
Goal 
• We know 
– Data X is generated by 
• We do NOT know 
– Path coefficients: bij 
– Orders k(i) 
– External influences: ei 
x = Bx + e 
• What we observe is data X only 
• Goal 
– Estimate B and k(i) using data X only!
13 
Key idea 
• First, relate LiNGAM with ICA as follows: 
= + 
x Bx e 
 - ICA! 
( ) 1 
 =  = 
x I B e Ae 
=  = 
e I B x Wx 
• Due to the permutation indeterminacy, ICA 
gives: 
• Can find a correct P 
~ 
= 
– The correct permutation is the only one that has no 
zeros in the diagonal 
~ 
equivalently ( ) 
W PW
14 
Key idea 
• First, relate LiNGAM with ICA as follows: 
= + 
x Bx e 
 - ICA! 
( ) 1 
 =  = 
x I B e Ae 
=  = 
e I B x Wx 
• Due to the permutation indeterminacy, ICA 
gives: 
• Can find a correct P 
~ 
= 
– The correct permutation is the only one that has no 
zeros in the diagonal 
~ 
equivalently ( ) 
W PW
15 
Key idea 
• First, relate LiNGAM with ICA as follows: 
= + 
 - ICA! 
( ) 1 
 =  = 
x I B e Ae 
=  = 
e I B x Wx 
• Due to the permutation indeterminacy, ICA 
gives: 
W PW 
• Can find a correct P 
– The correct permutation is the only one that has no 
zeros in the diagonal 
~ 
= 
x Bx e 
~ 
equivalently ( )
16 
Key idea 
• First, relate LiNGAM with ICA as follows: 
= + 
 - ICA! 
( ) 1 
 =  = 
x I B e Ae 
=  = 
e I B x Wx 
• Due to the permutation indeterminacy, ICA 
gives: 
W PW 
• Can find the correct P 
– The correct permutation is the only one that has no 
zeros in the diagonal 
~ 
= 
x Bx e 
~ 
equivalently ( )
17 
Illustrative example 
• Consider the model: 
 
=  
 
x 
1 
• Goal 
e1 x1 x2 
 
 
+  
 
 
e 
1 
x 
1 
0 0.6 
– Estimate the path direction between x1 and 
x2 observing only x1 and x2 
0.6 
 
 
 
 
 
 
 
 
2 
2 
2 
0 0 
e 
x 
x 
14243 
B
18 
Perform ICA 
• Relation of the LiNGAM model with ICA: 
 
 
 
 
 
x 
1 
e 
1 
1 0.6 
~ 
= 
• Due to the permutation indeterminacy, ICA might 
give: 
 
 
 
  
=  
 
2 
2 
0 1 
x 
e 
14243 
 
 
1 0 ~W 
W P 
( )  
 
 
= = 
1 0.8 
e Wx 
W ~
19 
Perform ICA 
• Relation of the LiNGAM model with ICA: 
 
 
 
 
 
 
  
 
=  
 
 
x 
1 
2 
e 
1 
e 
2 
1 0.6 
0 1 
x 
14243 
~ 
= 
• Due to the permutation indeterminacy, ICA 
might give: 
 
 
1 0 ~W 
W P 
( )  
 
 
= = 
1 0.6 
e Wx 
W ~
x 
1 
20 
Find the correct P 
• Find a permutation of the rows of W so that it 
has no zeros in the diagonal 
• In the example… 
 
 
 
 
 
  
=  
1 0.8 
 
 
 
 
2 
e 
1 
2 
0 1 
x 
e 
14243 
 
 
 
 
 
 
 
 
0 1 
 
 
=  
 
 
x 
1 
2 
e 
2 
1 
1 0.6 
x 
e 
14243 
Permute the rows 
W W ~ 0
x 
1 
21 
Find the correct P 
• Find a permutation of the rows of W so that it 
has no zeros in the diagonal 
• In the example… 
 
 
 
 
 
  
=  
1 0.8 
 
 
 
 
2 
e 
1 
2 
0 1 
x 
e 
14243 
 
 
 
 
 
 
 
 
0 1 
 
 
=  
 
 
x 
1 
2 
e 
2 
1 
1 0.6 
x 
e 
14243 
Permute the rows 
W W ~ 0
x 
1 
22 
Find the correct P 
• Find a permutation of the rows of W so that it 
has no zeros in the diagonal 
• In the example… 
 
 
 
 
 
  
=  
1 0.6 
 
 
 
 
2 
e 
1 
2 
0 1 
x 
e 
14243 
 
 
 
 
 
 
 
 
0 1 
 
 
=  
 
 
x 
1 
2 
e 
2 
1 
1 0.6 
x 
e 
14243 
Permute the rows 
0 
0 
~ W W
23 
Find the correct P 
• In practice, 
1 
( PTW 
)ii 
ˆ max = 
P 
P 
• Heavily penalizes small absolute values in 
the diagonal
24 
Simulations: Estimation of B 
• Both super- and sub-Gaussian external influences tested 
• 5 datasets created for each scatterplot 
• B randomly generated at each trial 
3 
2 
1 
0 
-1 
-2 
-3 
-3 -2 -1 0 1 2 3 
3 
2 
1 
0 
-1 
-2 
-3 
-3 -2 -1 0 1 2 3 
3 
2 
1 
0 
-1 
-2 
-3 
-3 -2 -1 0 1 2 3 
3 
2 
1 
0 
-1 
-2 
-3 
-3 -2 -1 0 1 2 3 
3 
2 
1 
0 
-1 
-2 
-3 
-3 -2 -1 0 1 2 3 
3 
2 
1 
0 
-1 
-2 
-3 
-3 -2 -1 0 1 2 3 
3 
2 
1 
0 
-1 
-2 
-3 
-3 -2 -1 0 1 2 3 
3 
2 
1 
0 
-1 
-2 
-3 
-3 -2 -1 0 1 2 3 
3 
2 
1 
0 
-1 
-2 
-3 
-3 -2 -1 0 1 2 3 
200 1,000 3,000 
Number of observations 
Number of variables 
10 
50 
100 
Generating bij 
Estimated bij
25 
Prune B (1) 
• In practice, due to estimation errors, we 
would get: 
 
 
 
+  
 
 
 
 
 
 
 
 
 
=  
 
 
e 
1 
e 
2 
x 
1 
2 
x 
1 
2 
0 0.65 
0.05 0 
x 
x 
1442443 
B 
• Need to find which path coefficients are 
actually zeros
26 
Find a permutation that gives a 
lower triangular matrix 
• The LiNGAM model is acyclic 
– The matrix B can be permuted to be lower triangular 
for some permutation of variables (Bollen, 1989) 
• First, find a simultaneous permutation of rows 
and columns of B that gives a lower-triangular B 
• In practice, find a permutation matrix Q that 
minimizes the sum of the elements in its upper 
triangular part: Q ˆ = 
max 
 
( QBQ 
T ) ij 
Q 
i j
27 
Find a permutation that gives a 
lower triangular matrix 
• The LiNGAM model is acyclic 
– The matrix B can be permuted to be lower triangular 
for some permutation of variables (Bollen, 1989) 
• First, find a simultaneous permutation of rows 
and columns of B that gives a lower-triangular B 
• In practice, find a permutation matrix Q that 
minimizes the sum of the elements in its upper 
triangular part: Q ˆ = 
max 
 
( QBQ 
T ) ij 
Q 
i j
28 
Find a permutation that gives a 
lower triangular matrix 
• The LiNGAM model is acyclic 
– The matrix B can be permuted to be lower triangular 
for some permutation of variables (Bollen, 1989) 
• First, find a simultaneous permutation of rows 
and columns of B that gives a lower-triangular B 
• In practice, find a permutation matrix Q that 
minimizes the sum of the elements in its upper 
triangular part: Q ˆ = 
min 
 
( QBQ 
T ) ij 
Q 
i j
29 
Get a lower-triangular B 
• Applying such a simultaneous permutation of the 
• we get a permuted B that is as lower-triangular 
• Set the upper-triangular elements to be zeros 
 
 
rows and columns, 
 
+  
 
 
 
 
 
as possible 
0 0.65 
 
 
 
 
 
=  
 
 
e 
1 
2 
x 
1 
2 
x 
1 
2 
0.05 0 
e 
x 
x 
 
 
+  
 
 
 
 
 
 
  
=  
 
 
 
 
e 
2 
1 
2 
x 
1 
2 
x 
1 
0 0.05 
0.62 0 
e 
x 
x 
B 
T QBQ
30 
Get a lower-triangular B 
• Applying such a simultaneous permutation of the 
• we get a permuted B that is as lower-triangular 
• Set the upper-triangular elements to be zeros 
 
 
rows and columns, 
 
+  
 
 
 
 
 
as possible 
0 0.65 
 
 
 
 
 
=  
 
 
e 
1 
2 
x 
1 
2 
x 
1 
2 
0.05 0 
e 
x 
x 
 
 
+  
 
 
 
 
 
 
  
=  
 
 
 
 
e 
2 
1 
2 
x 
1 
2 
x 
1 
0 0.05 
0.65 0 
e 
x 
x 
B 
T QBQ
31 
Get a lower-triangular B 
• Applying such a simultaneous permutation of the 
• we get a permuted B that is as lower-triangular 
• Set the upper-triangular elements to be zeros 
 
 
rows and columns, 
 
+  
 
 
 
 
 
as possible 
0 0.65 
 
 
 
 
 
=  
 
 
e 
1 
2 
x 
1 
2 
x 
1 
2 
0.05 0 
e 
x 
x 
 
 
+  
 
 
 
 
 
 
-0.05 
  
=  
 
 
 
 
e 
2 
1 
2 
x 
1 
2 
x 
1 
0 0.05 
0.65 0 
e 
x 
x 
B 
T QBQ
32 
Get a lower-triangular B 
• Applying such a simultaneous permutation of the 
• we get a permuted B that is as lower-triangular 
• Set the upper-triangular elements to be zeros 
 
 
rows and columns, 
 
+  
 
 
 
 
 
as possible 
0 0.65 
 
 
 
 
 
=  
 
 
e 
1 
2 
x 
1 
2 
x 
1 
2 
0.05 0 
e 
x 
x 
 
 
+  
 
 
 
 
 
 
  
=  
 
 
 
 
e 
2 
1 
2 
x 
1 
2 
x 
1 
0 
0 0.05 
0.65 0 
e 
x 
x 
B 
T QBQ
33 
Pruning B (2) 
• Once we get a lower-triangular B, the model is 
identifiable using covariance-based SEM 
 
 
 
+  
 
 
 
 
 
 
 
 
=  
 
 
 
e 
2 
e 
1 
2 
x 
x 
1 
2 
x 
x 
1 
0 0 
0.65 0 
• Many existing methods can be used for pruning 
the remaining path coefficients 
– Wald test, Bootstrapping, Model fit 
– Lasso-type estimators (Tibshirani 1996; Zou, 2006) etc.
34 
To summarize the procedure… 
1. Estimate B 
– ICA + finding the correct row permutation 
2. Prune estimated B 
1. Find a row-and-column permutation that makes 
estimated B lower triangular 
2. Prune remaining paths using a covariance-based 
method 
1. Estimate B 2. Prune estimated B 
x4 
x3 
x2 
x1 
x4 
x3 
x2 
x1 
x4 
x3 
x2 
x1
35 
Summary of the regular LiNGAM 
• A linear acyclic model is identifiable based on 
non-Gaussianity 
• ICA-based estimation works well 
– Confidence intervals (Konya et al., in progress) 
• Better pruning methods might be developed 
– Imposing sparseness in the ICA stage (Zhang  Chang, 
2006; Hayashi et al. in progress) like Lasso (Tibshirani 1996)
Some extensions
37 
Latent factors (Shimizu et al., 2007) 
• A non-Gaussian multiple indicator model: 
= + 
f Bf d 
= + 
x Gf e 
• Suppose that G is identified, then B is identified 
– Could identify G in a data driven way using a tetrad-constraint- 
based method (Silva et al., 2006)
38 
Latent classes 
(Shimizu  Hyvarinen, 2008) 
• LiNGAM model for each class q: 
x = B x + (I B )ì + e x = ì + A e - ICA! 
q q q q q q q 
• ICA mixtures (Lee et al., 2000; Mollah et al., 2006) 
Class 1: 
Class 2: 
0 0 
0.9 
x2 x1 
6 5 
0.2 
x2 x1
39 
Unobserved confounders 
(Hoyer et al., in press 
• Can identify and distinguish between more 
models 
1. 2. 3. 
x1 x2 x1 x2 
4. 5. 6. 
u1 
x1 x2 
x1 x2 
u1 
x1 x2 
u1 
x1 x2
40 
Time structures (Hyvarinen et al., 2008) 
• Combining LiNGAM and autoregressive model: 
k 
e x B x +  == 
( ) ( ) ( ) 
t t t 
0 
 
– In econometrics: Structural vector autoregression 
(Swanson  Granger, 1997) 
• Changes ordinary AR coefficients based on 
instantaneous effects: 
  
( ) for 0 
B I B M ( :AR matrix) 
0 =      
M
41 
Some variables are Gaussian 
(Hoyer et al., 2008) 
• Consider the model: 
0.6 
e2 x2 x1 
• Can identify the path direction 
– if either of x1 or e2 is non-Gaussian 
• In general, there exist several equivalent models 
that entail the same distribution if some are 
Gaussian
42 
Some other extensions 
• Cyclic models (Lacerda et al., 2008) 
– Fewer equivalent models than covariance-based 
approach 
• Nonlinearity (Zhang  Chan, 2007; Sun et al., 2007) 
• Model fit statistics are under development 
– Non-Gaussian structures
43 
Conclusion 
• Use of non-Gaussianity in SEM is useful 
for model identification 
• Many observed data are considerably 
non-Gaussian 
• The non-Gaussian approach can be a 
good option
44 
• Most of our papers and Matlab/Octave 
code are available on our webpages 
• Google will find us!

More Related Content

What's hot

Langrange Interpolation Polynomials
Langrange Interpolation PolynomialsLangrange Interpolation Polynomials
Langrange Interpolation PolynomialsSohaib H. Khan
 
Interpolation and Extrapolation
Interpolation and ExtrapolationInterpolation and Extrapolation
Interpolation and ExtrapolationVNRacademy
 
Discrete-Chapter 12 Modeling Computation
Discrete-Chapter 12 Modeling ComputationDiscrete-Chapter 12 Modeling Computation
Discrete-Chapter 12 Modeling ComputationWongyos Keardsri
 
Normal density and discreminant analysis
Normal density and discreminant analysisNormal density and discreminant analysis
Normal density and discreminant analysisVARUN KUMAR
 
Jacobi and gauss-seidel
Jacobi and gauss-seidelJacobi and gauss-seidel
Jacobi and gauss-seidelarunsmm
 
Chapter8
Chapter8Chapter8
Chapter8Vu Vo
 
Lecture 11 systems of nonlinear equations
Lecture 11 systems of nonlinear equationsLecture 11 systems of nonlinear equations
Lecture 11 systems of nonlinear equationsHazel Joy Chong
 
Chapter3 econometrics
Chapter3 econometricsChapter3 econometrics
Chapter3 econometricsVu Vo
 
Applied numerical methods lec9
Applied numerical methods lec9Applied numerical methods lec9
Applied numerical methods lec9Yasser Ahmed
 
Jacobi iterative method
Jacobi iterative methodJacobi iterative method
Jacobi iterative methodLuckshay Batra
 
Interpolation with Finite differences
Interpolation with Finite differencesInterpolation with Finite differences
Interpolation with Finite differencesDr. Nirav Vyas
 
Supporting Vector Machine
Supporting Vector MachineSupporting Vector Machine
Supporting Vector MachineSumit Singh
 
8 - 3 Graphing Rational Functions
8 - 3 Graphing Rational Functions8 - 3 Graphing Rational Functions
8 - 3 Graphing Rational Functionsrfrettig
 

What's hot (20)

Langrange Interpolation Polynomials
Langrange Interpolation PolynomialsLangrange Interpolation Polynomials
Langrange Interpolation Polynomials
 
Exponentials
ExponentialsExponentials
Exponentials
 
Interpolation and Extrapolation
Interpolation and ExtrapolationInterpolation and Extrapolation
Interpolation and Extrapolation
 
Discrete-Chapter 12 Modeling Computation
Discrete-Chapter 12 Modeling ComputationDiscrete-Chapter 12 Modeling Computation
Discrete-Chapter 12 Modeling Computation
 
Normal density and discreminant analysis
Normal density and discreminant analysisNormal density and discreminant analysis
Normal density and discreminant analysis
 
Jacobi and gauss-seidel
Jacobi and gauss-seidelJacobi and gauss-seidel
Jacobi and gauss-seidel
 
Analytic function
Analytic functionAnalytic function
Analytic function
 
10.4
10.410.4
10.4
 
Chapter8
Chapter8Chapter8
Chapter8
 
gls
glsgls
gls
 
Lecture 11 systems of nonlinear equations
Lecture 11 systems of nonlinear equationsLecture 11 systems of nonlinear equations
Lecture 11 systems of nonlinear equations
 
Chapter3 econometrics
Chapter3 econometricsChapter3 econometrics
Chapter3 econometrics
 
exponen dan logaritma
exponen dan logaritmaexponen dan logaritma
exponen dan logaritma
 
Applied numerical methods lec9
Applied numerical methods lec9Applied numerical methods lec9
Applied numerical methods lec9
 
Jacobi iterative method
Jacobi iterative methodJacobi iterative method
Jacobi iterative method
 
Interpolation with Finite differences
Interpolation with Finite differencesInterpolation with Finite differences
Interpolation with Finite differences
 
Supporting Vector Machine
Supporting Vector MachineSupporting Vector Machine
Supporting Vector Machine
 
Numerical method (curve fitting)
Numerical method (curve fitting)Numerical method (curve fitting)
Numerical method (curve fitting)
 
8 - 3 Graphing Rational Functions
8 - 3 Graphing Rational Functions8 - 3 Graphing Rational Functions
8 - 3 Graphing Rational Functions
 
Eigen vector
Eigen vectorEigen vector
Eigen vector
 

Viewers also liked

非ガウス性を利用した 因果構造探索
非ガウス性を利用した因果構造探索非ガウス性を利用した因果構造探索
非ガウス性を利用した 因果構造探索Shiga University, RIKEN
 
構造方程式モデルによる因果推論: 因果構造探索に関する最近の発展
構造方程式モデルによる因果推論: 因果構造探索に関する最近の発展構造方程式モデルによる因果推論: 因果構造探索に関する最近の発展
構造方程式モデルによる因果推論: 因果構造探索に関する最近の発展Shiga University, RIKEN
 
因果探索: 観察データから 因果仮説を探索する
因果探索: 観察データから因果仮説を探索する因果探索: 観察データから因果仮説を探索する
因果探索: 観察データから 因果仮説を探索するShiga University, RIKEN
 
因果探索: 基本から最近の発展までを概説
因果探索: 基本から最近の発展までを概説因果探索: 基本から最近の発展までを概説
因果探索: 基本から最近の発展までを概説Shiga University, RIKEN
 
構造方程式モデルによる因果探索と非ガウス性
構造方程式モデルによる因果探索と非ガウス性構造方程式モデルによる因果探索と非ガウス性
構造方程式モデルによる因果探索と非ガウス性Shiga University, RIKEN
 
統計的因果推論 勉強用 isseing333
統計的因果推論 勉強用 isseing333統計的因果推論 勉強用 isseing333
統計的因果推論 勉強用 isseing333Issei Kurahashi
 
相関と因果について考える:統計的因果推論、その(不)可能性の中心
相関と因果について考える:統計的因果推論、その(不)可能性の中心相関と因果について考える:統計的因果推論、その(不)可能性の中心
相関と因果について考える:統計的因果推論、その(不)可能性の中心takehikoihayashi
 

Viewers also liked (7)

非ガウス性を利用した 因果構造探索
非ガウス性を利用した因果構造探索非ガウス性を利用した因果構造探索
非ガウス性を利用した 因果構造探索
 
構造方程式モデルによる因果推論: 因果構造探索に関する最近の発展
構造方程式モデルによる因果推論: 因果構造探索に関する最近の発展構造方程式モデルによる因果推論: 因果構造探索に関する最近の発展
構造方程式モデルによる因果推論: 因果構造探索に関する最近の発展
 
因果探索: 観察データから 因果仮説を探索する
因果探索: 観察データから因果仮説を探索する因果探索: 観察データから因果仮説を探索する
因果探索: 観察データから 因果仮説を探索する
 
因果探索: 基本から最近の発展までを概説
因果探索: 基本から最近の発展までを概説因果探索: 基本から最近の発展までを概説
因果探索: 基本から最近の発展までを概説
 
構造方程式モデルによる因果探索と非ガウス性
構造方程式モデルによる因果探索と非ガウス性構造方程式モデルによる因果探索と非ガウス性
構造方程式モデルによる因果探索と非ガウス性
 
統計的因果推論 勉強用 isseing333
統計的因果推論 勉強用 isseing333統計的因果推論 勉強用 isseing333
統計的因果推論 勉強用 isseing333
 
相関と因果について考える:統計的因果推論、その(不)可能性の中心
相関と因果について考える:統計的因果推論、その(不)可能性の中心相関と因果について考える:統計的因果推論、その(不)可能性の中心
相関と因果について考える:統計的因果推論、その(不)可能性の中心
 

Similar to Linear Non-Gaussian Structural Equation Models

machine learning.pptx
machine learning.pptxmachine learning.pptx
machine learning.pptxAbdusSadik
 
Lect 2 boolean algebra (4 5-21)
Lect 2 boolean algebra (4 5-21)Lect 2 boolean algebra (4 5-21)
Lect 2 boolean algebra (4 5-21)MeghaSharma513
 
booleanalgebra-140914001141-phpapp01 (1).ppt
booleanalgebra-140914001141-phpapp01 (1).pptbooleanalgebra-140914001141-phpapp01 (1).ppt
booleanalgebra-140914001141-phpapp01 (1).pptmichaelaaron25322
 
Eighan values and diagonalization
Eighan values and diagonalization Eighan values and diagonalization
Eighan values and diagonalization gandhinagar
 
Section 5 Power Flow.pdf
Section 5 Power Flow.pdfSection 5 Power Flow.pdf
Section 5 Power Flow.pdfLucasMogaka
 
Least square method
Least square methodLeast square method
Least square methodSomya Bagai
 
DimensionalityReduction.pptx
DimensionalityReduction.pptxDimensionalityReduction.pptx
DimensionalityReduction.pptx36rajneekant
 
20200830230859_PPT4-Lines, Parabolas and Systems.pptx
20200830230859_PPT4-Lines, Parabolas and Systems.pptx20200830230859_PPT4-Lines, Parabolas and Systems.pptx
20200830230859_PPT4-Lines, Parabolas and Systems.pptxConanEdogawaShinichi
 
Grovers Algorithm
Grovers Algorithm Grovers Algorithm
Grovers Algorithm CaseyHaaland
 
5 DimensionalityReduction.pdf
5 DimensionalityReduction.pdf5 DimensionalityReduction.pdf
5 DimensionalityReduction.pdfRahul926331
 
Matrix-Decomposition-and-Its-application-in-Statistics_NK.ppt
Matrix-Decomposition-and-Its-application-in-Statistics_NK.pptMatrix-Decomposition-and-Its-application-in-Statistics_NK.ppt
Matrix-Decomposition-and-Its-application-in-Statistics_NK.pptpoojaacharya578
 

Similar to Linear Non-Gaussian Structural Equation Models (20)

machine learning.pptx
machine learning.pptxmachine learning.pptx
machine learning.pptx
 
Curve fitting
Curve fittingCurve fitting
Curve fitting
 
Curve fitting
Curve fittingCurve fitting
Curve fitting
 
Boolean Algebra
Boolean AlgebraBoolean Algebra
Boolean Algebra
 
Lect 2 boolean algebra (4 5-21)
Lect 2 boolean algebra (4 5-21)Lect 2 boolean algebra (4 5-21)
Lect 2 boolean algebra (4 5-21)
 
booleanalgebra-140914001141-phpapp01 (1).ppt
booleanalgebra-140914001141-phpapp01 (1).pptbooleanalgebra-140914001141-phpapp01 (1).ppt
booleanalgebra-140914001141-phpapp01 (1).ppt
 
Eighan values and diagonalization
Eighan values and diagonalization Eighan values and diagonalization
Eighan values and diagonalization
 
Section 5 Power Flow.pdf
Section 5 Power Flow.pdfSection 5 Power Flow.pdf
Section 5 Power Flow.pdf
 
Straight Line2.pptx
Straight Line2.pptxStraight Line2.pptx
Straight Line2.pptx
 
Least square method
Least square methodLeast square method
Least square method
 
8.5 GaussianBNs.pdf
8.5 GaussianBNs.pdf8.5 GaussianBNs.pdf
8.5 GaussianBNs.pdf
 
DimensionalityReduction.pptx
DimensionalityReduction.pptxDimensionalityReduction.pptx
DimensionalityReduction.pptx
 
20200830230859_PPT4-Lines, Parabolas and Systems.pptx
20200830230859_PPT4-Lines, Parabolas and Systems.pptx20200830230859_PPT4-Lines, Parabolas and Systems.pptx
20200830230859_PPT4-Lines, Parabolas and Systems.pptx
 
Graphical presentation
Graphical presentationGraphical presentation
Graphical presentation
 
Grovers Algorithm
Grovers Algorithm Grovers Algorithm
Grovers Algorithm
 
5 DimensionalityReduction.pdf
5 DimensionalityReduction.pdf5 DimensionalityReduction.pdf
5 DimensionalityReduction.pdf
 
1560 mathematics for economists
1560 mathematics for economists1560 mathematics for economists
1560 mathematics for economists
 
Matrix-Decomposition-and-Its-application-in-Statistics_NK.ppt
Matrix-Decomposition-and-Its-application-in-Statistics_NK.pptMatrix-Decomposition-and-Its-application-in-Statistics_NK.ppt
Matrix-Decomposition-and-Its-application-in-Statistics_NK.ppt
 
Load_Flow.pptx
Load_Flow.pptxLoad_Flow.pptx
Load_Flow.pptx
 
Curve Fitting
Curve FittingCurve Fitting
Curve Fitting
 

Recently uploaded

Land use land cover change analysis and detection of its drivers using geospa...
Land use land cover change analysis and detection of its drivers using geospa...Land use land cover change analysis and detection of its drivers using geospa...
Land use land cover change analysis and detection of its drivers using geospa...MohammedAhmed246550
 
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...Sérgio Sacani
 
Jet reorientation in central galaxies of clusters and groups: insights from V...
Jet reorientation in central galaxies of clusters and groups: insights from V...Jet reorientation in central galaxies of clusters and groups: insights from V...
Jet reorientation in central galaxies of clusters and groups: insights from V...Sérgio Sacani
 
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...Sérgio Sacani
 
Mining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptxMining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptxKyawThanTint
 
family therapy psychotherapy types .pdf
family therapy psychotherapy types  .pdffamily therapy psychotherapy types  .pdf
family therapy psychotherapy types .pdfhaseebahmeddrama
 
Lec 1.b Totipotency and birth of tissue culture.ppt
Lec 1.b Totipotency and birth of tissue culture.pptLec 1.b Totipotency and birth of tissue culture.ppt
Lec 1.b Totipotency and birth of tissue culture.pptasifali1111
 
Hemoglobin metabolism: C Kalyan & E. Muralinath
Hemoglobin metabolism: C Kalyan & E. MuralinathHemoglobin metabolism: C Kalyan & E. Muralinath
Hemoglobin metabolism: C Kalyan & E. Muralinathmuralinath2
 
Ostiguy & Panizza & Moffitt (eds.) - Populism in Global Perspective. A Perfor...
Ostiguy & Panizza & Moffitt (eds.) - Populism in Global Perspective. A Perfor...Ostiguy & Panizza & Moffitt (eds.) - Populism in Global Perspective. A Perfor...
Ostiguy & Panizza & Moffitt (eds.) - Populism in Global Perspective. A Perfor...frank0071
 
Aerodynamics. flippatterncn5tm5ttnj6nmnynyppt
Aerodynamics. flippatterncn5tm5ttnj6nmnynypptAerodynamics. flippatterncn5tm5ttnj6nmnynyppt
Aerodynamics. flippatterncn5tm5ttnj6nmnynypptsreddyrahul
 
RACEMIzATION AND ISOMERISATION completed.pptx
RACEMIzATION AND ISOMERISATION completed.pptxRACEMIzATION AND ISOMERISATION completed.pptx
RACEMIzATION AND ISOMERISATION completed.pptxArunLakshmiMeenakshi
 
Plasma proteins_ Dr.Muralinath_Dr.c. kalyan
Plasma proteins_ Dr.Muralinath_Dr.c. kalyanPlasma proteins_ Dr.Muralinath_Dr.c. kalyan
Plasma proteins_ Dr.Muralinath_Dr.c. kalyanmuralinath2
 
MODERN PHYSICS_REPORTING_QUANTA_.....pdf
MODERN PHYSICS_REPORTING_QUANTA_.....pdfMODERN PHYSICS_REPORTING_QUANTA_.....pdf
MODERN PHYSICS_REPORTING_QUANTA_.....pdfRevenJadePalma
 
The solar dynamo begins near the surface
The solar dynamo begins near the surfaceThe solar dynamo begins near the surface
The solar dynamo begins near the surfaceSérgio Sacani
 
Errors: types, determination and elimination
Errors: types, determination and eliminationErrors: types, determination and elimination
Errors: types, determination and eliminationArunLakshmiMeenakshi
 
Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243
Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243
Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243Sérgio Sacani
 
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdfPests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdfPirithiRaju
 
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...Sérgio Sacani
 
GBSN - Microbiology Lab (Compound Microscope)
GBSN - Microbiology Lab (Compound Microscope)GBSN - Microbiology Lab (Compound Microscope)
GBSN - Microbiology Lab (Compound Microscope)Areesha Ahmad
 
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdfPests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdfPirithiRaju
 

Recently uploaded (20)

Land use land cover change analysis and detection of its drivers using geospa...
Land use land cover change analysis and detection of its drivers using geospa...Land use land cover change analysis and detection of its drivers using geospa...
Land use land cover change analysis and detection of its drivers using geospa...
 
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...
Gliese 12 b, a temperate Earth-sized planet at 12 parsecs discovered with TES...
 
Jet reorientation in central galaxies of clusters and groups: insights from V...
Jet reorientation in central galaxies of clusters and groups: insights from V...Jet reorientation in central galaxies of clusters and groups: insights from V...
Jet reorientation in central galaxies of clusters and groups: insights from V...
 
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
 
Mining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptxMining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptx
 
family therapy psychotherapy types .pdf
family therapy psychotherapy types  .pdffamily therapy psychotherapy types  .pdf
family therapy psychotherapy types .pdf
 
Lec 1.b Totipotency and birth of tissue culture.ppt
Lec 1.b Totipotency and birth of tissue culture.pptLec 1.b Totipotency and birth of tissue culture.ppt
Lec 1.b Totipotency and birth of tissue culture.ppt
 
Hemoglobin metabolism: C Kalyan & E. Muralinath
Hemoglobin metabolism: C Kalyan & E. MuralinathHemoglobin metabolism: C Kalyan & E. Muralinath
Hemoglobin metabolism: C Kalyan & E. Muralinath
 
Ostiguy & Panizza & Moffitt (eds.) - Populism in Global Perspective. A Perfor...
Ostiguy & Panizza & Moffitt (eds.) - Populism in Global Perspective. A Perfor...Ostiguy & Panizza & Moffitt (eds.) - Populism in Global Perspective. A Perfor...
Ostiguy & Panizza & Moffitt (eds.) - Populism in Global Perspective. A Perfor...
 
Aerodynamics. flippatterncn5tm5ttnj6nmnynyppt
Aerodynamics. flippatterncn5tm5ttnj6nmnynypptAerodynamics. flippatterncn5tm5ttnj6nmnynyppt
Aerodynamics. flippatterncn5tm5ttnj6nmnynyppt
 
RACEMIzATION AND ISOMERISATION completed.pptx
RACEMIzATION AND ISOMERISATION completed.pptxRACEMIzATION AND ISOMERISATION completed.pptx
RACEMIzATION AND ISOMERISATION completed.pptx
 
Plasma proteins_ Dr.Muralinath_Dr.c. kalyan
Plasma proteins_ Dr.Muralinath_Dr.c. kalyanPlasma proteins_ Dr.Muralinath_Dr.c. kalyan
Plasma proteins_ Dr.Muralinath_Dr.c. kalyan
 
MODERN PHYSICS_REPORTING_QUANTA_.....pdf
MODERN PHYSICS_REPORTING_QUANTA_.....pdfMODERN PHYSICS_REPORTING_QUANTA_.....pdf
MODERN PHYSICS_REPORTING_QUANTA_.....pdf
 
The solar dynamo begins near the surface
The solar dynamo begins near the surfaceThe solar dynamo begins near the surface
The solar dynamo begins near the surface
 
Errors: types, determination and elimination
Errors: types, determination and eliminationErrors: types, determination and elimination
Errors: types, determination and elimination
 
Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243
Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243
Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243
 
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdfPests of sugarcane_Binomics_IPM_Dr.UPR.pdf
Pests of sugarcane_Binomics_IPM_Dr.UPR.pdf
 
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
 
GBSN - Microbiology Lab (Compound Microscope)
GBSN - Microbiology Lab (Compound Microscope)GBSN - Microbiology Lab (Compound Microscope)
GBSN - Microbiology Lab (Compound Microscope)
 
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdfPests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
Pests of Green Manures_Bionomics_IPM_Dr.UPR.pdf
 

Linear Non-Gaussian Structural Equation Models

  • 1. IMPS 2008, Durham, NH Linear Non-Gaussian Structural Equation Models Shohei Shimizu, Patrik Hoyer and Aapo Hyvarinen Osaka University, Japan University of Helsinki, Finland
  • 2. 2 Abstract • Linear Structural Equation Modeling (linear SEM) – Analyzes causal relations • Covariance-based SEM – Uses covariance structure alone for model identification – A number of indistinguishable models • Linear non-Gaussian SEM – Uses non-Gaussian structures for model identification – Makes many models distinguishable
  • 3. 3 SEM and causal analysis • SEM is often used for causal analysis based on non-experimental data • Assumption: the data generating process is represented by a SEM model • If the assumption is reasonable, SEM provides causal information
  • 4. 4 Limitations of covariance-based SEM • Covariance-based SEM cannot distinguish between many models • Example e1 x1 x2 x1 x2 e2
  • 5. 5 Linear non-Gaussian SEM • Many observed data are considerably non- Gaussian (Micceri, 1989; Hyvarinen et al. 2001) • Non-Gaussian structures of data are useful (Bentler 1983; Mooijaart 1985) • Non-Gaussianity distinguish between the two models (Shimizu et al. 2006) : e1 x1 x2 x1 x2 e2
  • 6. 6 Independent component analysis (ICA) • Observed random vector x is modeled as x = As s – are independent and non-Gaussian i • Zero means and unit variances – A is a constant matrix • Typically square, # variables = # independent components • Identifiable up to permutation of the columns (Mooijaart 1985; Comon, 1994)
  • 7. 7 ICA estimation • An alternative expression of ICA (x=As): s =Wx , ~ called a recovering matrix ~ where 1 W = A • Find such that maximizes independence of sˆ =Wx components of – Many proposals (Hyvarinen et al. 2001) W ~ • is estimated up to permutation of the rows: ~ = W PW W
  • 8. 8 ICA estimation • An alternative expression of ICA (x=As): s =Wx , ~ called a recovering matrix ~ where 1 W = A • Find such that maximizes independence of sˆ =Wx components of – Many proposals (Hyvarinen et al. 2001) W ~ • is estimated up to permutation of the rows: ~ = W PW W
  • 9. 9 ICA estimation • An alternative expression of ICA (x=As): s =Wx , ~ called a recovering matrix ~ where 1 W = A • Find such that maximizes independence of sˆ =Wx components of – Many proposals (Hyvarinen et al. 2001) W ~ • is estimated up to permutation of the rows: ~ = W PW W
  • 10. Discovery of linear non-Gaussian acyclic models Shimizu, Hoyer, Hyvarinen and Kerminen (2006)
  • 11. 11 Linear non-Gaussian acyclic model (LiNGAM) • Directed acyclic graphs (DAG) x – can be arranged in a order k(i) • Assumptions: – Linearity – External influences e are independent – and are non-Gaussian x = Bx + e i x = b x + e i ij j k ( j ) k ( i ) or i i
  • 12. 12 Goal • We know – Data X is generated by • We do NOT know – Path coefficients: bij – Orders k(i) – External influences: ei x = Bx + e • What we observe is data X only • Goal – Estimate B and k(i) using data X only!
  • 13. 13 Key idea • First, relate LiNGAM with ICA as follows: = + x Bx e - ICA! ( ) 1 = = x I B e Ae = = e I B x Wx • Due to the permutation indeterminacy, ICA gives: • Can find a correct P ~ = – The correct permutation is the only one that has no zeros in the diagonal ~ equivalently ( ) W PW
  • 14. 14 Key idea • First, relate LiNGAM with ICA as follows: = + x Bx e - ICA! ( ) 1 = = x I B e Ae = = e I B x Wx • Due to the permutation indeterminacy, ICA gives: • Can find a correct P ~ = – The correct permutation is the only one that has no zeros in the diagonal ~ equivalently ( ) W PW
  • 15. 15 Key idea • First, relate LiNGAM with ICA as follows: = + - ICA! ( ) 1 = = x I B e Ae = = e I B x Wx • Due to the permutation indeterminacy, ICA gives: W PW • Can find a correct P – The correct permutation is the only one that has no zeros in the diagonal ~ = x Bx e ~ equivalently ( )
  • 16. 16 Key idea • First, relate LiNGAM with ICA as follows: = + - ICA! ( ) 1 = = x I B e Ae = = e I B x Wx • Due to the permutation indeterminacy, ICA gives: W PW • Can find the correct P – The correct permutation is the only one that has no zeros in the diagonal ~ = x Bx e ~ equivalently ( )
  • 17. 17 Illustrative example • Consider the model: = x 1 • Goal e1 x1 x2 + e 1 x 1 0 0.6 – Estimate the path direction between x1 and x2 observing only x1 and x2 0.6 2 2 2 0 0 e x x 14243 B
  • 18. 18 Perform ICA • Relation of the LiNGAM model with ICA: x 1 e 1 1 0.6 ~ = • Due to the permutation indeterminacy, ICA might give: = 2 2 0 1 x e 14243 1 0 ~W W P ( ) = = 1 0.8 e Wx W ~
  • 19. 19 Perform ICA • Relation of the LiNGAM model with ICA: = x 1 2 e 1 e 2 1 0.6 0 1 x 14243 ~ = • Due to the permutation indeterminacy, ICA might give: 1 0 ~W W P ( ) = = 1 0.6 e Wx W ~
  • 20. x 1 20 Find the correct P • Find a permutation of the rows of W so that it has no zeros in the diagonal • In the example… = 1 0.8 2 e 1 2 0 1 x e 14243 0 1 = x 1 2 e 2 1 1 0.6 x e 14243 Permute the rows W W ~ 0
  • 21. x 1 21 Find the correct P • Find a permutation of the rows of W so that it has no zeros in the diagonal • In the example… = 1 0.8 2 e 1 2 0 1 x e 14243 0 1 = x 1 2 e 2 1 1 0.6 x e 14243 Permute the rows W W ~ 0
  • 22. x 1 22 Find the correct P • Find a permutation of the rows of W so that it has no zeros in the diagonal • In the example… = 1 0.6 2 e 1 2 0 1 x e 14243 0 1 = x 1 2 e 2 1 1 0.6 x e 14243 Permute the rows 0 0 ~ W W
  • 23. 23 Find the correct P • In practice, 1 ( PTW )ii ˆ max = P P • Heavily penalizes small absolute values in the diagonal
  • 24. 24 Simulations: Estimation of B • Both super- and sub-Gaussian external influences tested • 5 datasets created for each scatterplot • B randomly generated at each trial 3 2 1 0 -1 -2 -3 -3 -2 -1 0 1 2 3 3 2 1 0 -1 -2 -3 -3 -2 -1 0 1 2 3 3 2 1 0 -1 -2 -3 -3 -2 -1 0 1 2 3 3 2 1 0 -1 -2 -3 -3 -2 -1 0 1 2 3 3 2 1 0 -1 -2 -3 -3 -2 -1 0 1 2 3 3 2 1 0 -1 -2 -3 -3 -2 -1 0 1 2 3 3 2 1 0 -1 -2 -3 -3 -2 -1 0 1 2 3 3 2 1 0 -1 -2 -3 -3 -2 -1 0 1 2 3 3 2 1 0 -1 -2 -3 -3 -2 -1 0 1 2 3 200 1,000 3,000 Number of observations Number of variables 10 50 100 Generating bij Estimated bij
  • 25. 25 Prune B (1) • In practice, due to estimation errors, we would get: + = e 1 e 2 x 1 2 x 1 2 0 0.65 0.05 0 x x 1442443 B • Need to find which path coefficients are actually zeros
  • 26. 26 Find a permutation that gives a lower triangular matrix • The LiNGAM model is acyclic – The matrix B can be permuted to be lower triangular for some permutation of variables (Bollen, 1989) • First, find a simultaneous permutation of rows and columns of B that gives a lower-triangular B • In practice, find a permutation matrix Q that minimizes the sum of the elements in its upper triangular part: Q ˆ = max ( QBQ T ) ij Q i j
  • 27. 27 Find a permutation that gives a lower triangular matrix • The LiNGAM model is acyclic – The matrix B can be permuted to be lower triangular for some permutation of variables (Bollen, 1989) • First, find a simultaneous permutation of rows and columns of B that gives a lower-triangular B • In practice, find a permutation matrix Q that minimizes the sum of the elements in its upper triangular part: Q ˆ = max ( QBQ T ) ij Q i j
  • 28. 28 Find a permutation that gives a lower triangular matrix • The LiNGAM model is acyclic – The matrix B can be permuted to be lower triangular for some permutation of variables (Bollen, 1989) • First, find a simultaneous permutation of rows and columns of B that gives a lower-triangular B • In practice, find a permutation matrix Q that minimizes the sum of the elements in its upper triangular part: Q ˆ = min ( QBQ T ) ij Q i j
  • 29. 29 Get a lower-triangular B • Applying such a simultaneous permutation of the • we get a permuted B that is as lower-triangular • Set the upper-triangular elements to be zeros rows and columns, + as possible 0 0.65 = e 1 2 x 1 2 x 1 2 0.05 0 e x x + = e 2 1 2 x 1 2 x 1 0 0.05 0.62 0 e x x B T QBQ
  • 30. 30 Get a lower-triangular B • Applying such a simultaneous permutation of the • we get a permuted B that is as lower-triangular • Set the upper-triangular elements to be zeros rows and columns, + as possible 0 0.65 = e 1 2 x 1 2 x 1 2 0.05 0 e x x + = e 2 1 2 x 1 2 x 1 0 0.05 0.65 0 e x x B T QBQ
  • 31. 31 Get a lower-triangular B • Applying such a simultaneous permutation of the • we get a permuted B that is as lower-triangular • Set the upper-triangular elements to be zeros rows and columns, + as possible 0 0.65 = e 1 2 x 1 2 x 1 2 0.05 0 e x x + -0.05 = e 2 1 2 x 1 2 x 1 0 0.05 0.65 0 e x x B T QBQ
  • 32. 32 Get a lower-triangular B • Applying such a simultaneous permutation of the • we get a permuted B that is as lower-triangular • Set the upper-triangular elements to be zeros rows and columns, + as possible 0 0.65 = e 1 2 x 1 2 x 1 2 0.05 0 e x x + = e 2 1 2 x 1 2 x 1 0 0 0.05 0.65 0 e x x B T QBQ
  • 33. 33 Pruning B (2) • Once we get a lower-triangular B, the model is identifiable using covariance-based SEM + = e 2 e 1 2 x x 1 2 x x 1 0 0 0.65 0 • Many existing methods can be used for pruning the remaining path coefficients – Wald test, Bootstrapping, Model fit – Lasso-type estimators (Tibshirani 1996; Zou, 2006) etc.
  • 34. 34 To summarize the procedure… 1. Estimate B – ICA + finding the correct row permutation 2. Prune estimated B 1. Find a row-and-column permutation that makes estimated B lower triangular 2. Prune remaining paths using a covariance-based method 1. Estimate B 2. Prune estimated B x4 x3 x2 x1 x4 x3 x2 x1 x4 x3 x2 x1
  • 35. 35 Summary of the regular LiNGAM • A linear acyclic model is identifiable based on non-Gaussianity • ICA-based estimation works well – Confidence intervals (Konya et al., in progress) • Better pruning methods might be developed – Imposing sparseness in the ICA stage (Zhang Chang, 2006; Hayashi et al. in progress) like Lasso (Tibshirani 1996)
  • 37. 37 Latent factors (Shimizu et al., 2007) • A non-Gaussian multiple indicator model: = + f Bf d = + x Gf e • Suppose that G is identified, then B is identified – Could identify G in a data driven way using a tetrad-constraint- based method (Silva et al., 2006)
  • 38. 38 Latent classes (Shimizu Hyvarinen, 2008) • LiNGAM model for each class q: x = B x + (I B )ì + e x = ì + A e - ICA! q q q q q q q • ICA mixtures (Lee et al., 2000; Mollah et al., 2006) Class 1: Class 2: 0 0 0.9 x2 x1 6 5 0.2 x2 x1
  • 39. 39 Unobserved confounders (Hoyer et al., in press • Can identify and distinguish between more models 1. 2. 3. x1 x2 x1 x2 4. 5. 6. u1 x1 x2 x1 x2 u1 x1 x2 u1 x1 x2
  • 40. 40 Time structures (Hyvarinen et al., 2008) • Combining LiNGAM and autoregressive model: k e x B x + == ( ) ( ) ( ) t t t 0 – In econometrics: Structural vector autoregression (Swanson Granger, 1997) • Changes ordinary AR coefficients based on instantaneous effects: ( ) for 0 B I B M ( :AR matrix) 0 = M
  • 41. 41 Some variables are Gaussian (Hoyer et al., 2008) • Consider the model: 0.6 e2 x2 x1 • Can identify the path direction – if either of x1 or e2 is non-Gaussian • In general, there exist several equivalent models that entail the same distribution if some are Gaussian
  • 42. 42 Some other extensions • Cyclic models (Lacerda et al., 2008) – Fewer equivalent models than covariance-based approach • Nonlinearity (Zhang Chan, 2007; Sun et al., 2007) • Model fit statistics are under development – Non-Gaussian structures
  • 43. 43 Conclusion • Use of non-Gaussianity in SEM is useful for model identification • Many observed data are considerably non-Gaussian • The non-Gaussian approach can be a good option
  • 44. 44 • Most of our papers and Matlab/Octave code are available on our webpages • Google will find us!