DirectLiNGAM is a non-Gaussian estimation method for the LiNGAM model that directly estimates the variable ordering without using ICA. It iteratively identifies exogenous variables through independence tests between each variable and the residuals of regressing the other variables on it. This lets it estimate the ordering in a fixed number of steps, with no algorithmic parameters and without the convergence issues or scale dependence of earlier ICA-based methods. On a real-world socioeconomic dataset it estimated a causal ordering that matched domain knowledge better than alternative methods such as ICA-LiNGAM, the PC algorithm, and GES.
A direct method for estimating linear non-Gaussian acyclic models
1. Updated Jan 14, 2011
DirectLiNGAM: A direct estimation method for LiNGAM
Shohei Shimizu, Takanori Inazumi, Yasuhiro Sogawa, Osaka Univ.
Aapo Hyvarinen, Univ. Helsinki
Yoshinobu Kawahara, Takashi Washio, Osaka Univ.
Patrik O. Hoyer, Univ. Helsinki
Kenneth Bollen, Univ. North Carolina
2. Abstract
• Structural equation models (SEMs) are widely used in many empirical sciences (Bollen, 1989)
• A non-Gaussian framework has been shown to be useful for discovering SEMs (Shimizu et al., 2006)
• Propose a new non-Gaussian estimation method
– No algorithmic parameters
– Guaranteed convergence in a fixed number of steps if the data strictly follows the model
4. Linear Non-Gaussian Acyclic Model (LiNGAM model) (Shimizu et al., 2006)
• An SEM, identifiable using non-Gaussianity
• Continuous observed random variables x_i
• Directed acyclic graph (DAG)
• Linearity
• Disturbances e_i are independent and non-Gaussian

x = Bx + e,  or  x_i = Σ_{k(j)<k(i)} b_ij x_j + e_i

– k(i) denotes an order of x_i
– B can be permuted to be lower triangular by simultaneous equal row and column permutations
5. Example
• A three-variable model:

x1 = e1
x2 = 1.5 x1 + e2
x3 = −1.3 x2 + e3

• In matrix form, x = Bx + e:

[x1]   [ 0    0    0] [x1]   [e1]
[x2] = [ 1.5  0    0] [x2] + [e2]
[x3]   [ 0   −1.3  0] [x3]   [e3]

(the 3×3 matrix is B)
• Orders of variables: k(1) = 1, k(2) = 2, k(3) = 3
– x2 can be influenced by x1, but never by x3
• External influences:
– x1 is equal to e1 and is exogenous
– e2 and e3 are errors

[figure: chain x1 →(1.5)→ x2 →(−1.3)→ x3, with disturbances e1, e2, e3]
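The example is easy to simulate; a minimal sketch in NumPy (the uniform disturbances, seed, and sample size are my choices — any non-Gaussian distribution works):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Independent, non-Gaussian (here uniform) disturbances e1, e2, e3
e = rng.uniform(-1.0, 1.0, size=(3, n))

# Structural equations of the three-variable example
x1 = e[0]
x2 = 1.5 * x1 + e[1]
x3 = -1.3 * x2 + e[2]
x = np.vstack([x1, x2, x3])

# The same model written in matrix form x = Bx + e
B = np.array([[0.0,  0.0, 0.0],
              [1.5,  0.0, 0.0],
              [0.0, -1.3, 0.0]])
print(np.allclose(x, B @ x + e))   # True: both forms generate identical data
```

The check at the end confirms that the recursive equations and the matrix form describe the same data-generating process.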
6. Our goal
• We know
– Data X is generated by x = Bx + e
• We do NOT know
– Connection strengths: b_ij
– Orders: k(i)
– Disturbances: e_i
• What we observe is data X only
• Goal
– Estimate B and k using data X only!
8. Independent Component Analysis (Comon, 1994; Hyvarinen et al., 2001)

x = As

• A is an unknown square matrix
• s_i are independent and non-Gaussian
• Identifiable including the rotation (Comon, 1994)
• Many estimation methods
– e.g., FastICA (Hyvarinen, 1999), Amari (1999) and Bach & Jordan (2002)
9. Key idea
• Observed variables x_i are linear combinations of non-Gaussian independent disturbances e_i

x = Bx + e  ⇒  x = (I − B)^{-1} e = Ae  -- ICA!

• ICA gives

W = PDA^{-1} = PD(I − B)

– P: permutation matrix, D: scaling matrix
• Permutation indeterminacy in ICA can be solved
– Can be shown that the correct permutation is the only one which has no zeros in the diagonal (Shimizu et al., UAI2005)
10. ICA-LiNGAM algorithm (Shimizu et al., 2006)
1. Do ICA (here, FastICA) and get W = PD(I − B)
2. Find a permutation that gives no zeros on the diagonal; then we obtain D(I − B):

P_hat = argmin_P Σ_i 1 / |(PW)_ii|

3. Divide each row by its corresponding diagonal element; then we get I − B, i.e., B
4. Find a simultaneous row and column permutation Q so that the permuted B is as close as possible to strictly lower triangular; then we get k(i):

Q_hat = argmin_Q Σ_{i≤j} [(QBQ^T)_ij]^2
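Steps 2 and 3 can be sketched with a brute-force search over row permutations (adequate for small p). The toy W below is my own construction: a known B scrambled by an arbitrary scaling D and permutation, standing in for real ICA output (whose sign indeterminacy is ignored here):

```python
import numpy as np
from itertools import permutations

p = 3
B = np.array([[0.0,  0.0, 0.0],
              [1.5,  0.0, 0.0],
              [0.0, -1.3, 0.0]])
D = np.diag([2.0, 0.5, 1.0])        # unknown scaling introduced by ICA
P_true = np.eye(p)[[2, 0, 1]]       # unknown row permutation introduced by ICA
W = P_true @ D @ (np.eye(p) - B)    # what ICA would hand us (signs taken positive)

# Step 2: row permutation minimizing sum_i 1/|(PW)_ii|
# (a small epsilon penalizes zero diagonal entries instead of dividing by zero)
best = min(permutations(range(p)),
           key=lambda perm: sum(1.0 / (abs(W[perm[i], i]) + 1e-12)
                                for i in range(p)))
PW = W[list(best)]

# Step 3: divide each row by its diagonal element to get I - B, hence B
B_hat = np.eye(p) - PW / np.diag(PW)[:, None]
print(np.allclose(B_hat, B))        # True
```

Step 4 would be a second brute-force search, over simultaneous row and column permutations of B_hat, minimizing the sum of squared upper-triangular entries.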
11. Potential problems of ICA-LiNGAM algorithm
1. ICA is an iterative search method
– May get stuck in a local optimum if the initial guess or step size is badly chosen
2. The permutation algorithms are not scale-invariant
– May provide different variable orderings for different scales of variables
13. DirectLiNGAM algorithm (Shimizu et al., UAI2009; Shimizu et al., 2011)
• Alternative estimation method without ICA
– Estimates an ordering of variables that makes the path-coefficient matrix B lower triangular:

x_perm = B_perm x_perm + e_perm,  with B_perm strictly lower triangular

– This gives a full DAG, possibly with redundant edges [figure: full DAG over x1, x2, x3 with redundant edges]
• Many existing (covariance-based) methods can do further pruning or find significant path coefficients (Zou, 2006; Shimizu et al., 2006; Hyvarinen et al., 2010)
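As a sketch of the pruning step: regress each variable on its predecessors in the estimated ordering and zero out small coefficients. Plain OLS with a hard threshold is my simplification here; the references above use adaptive lasso or significance tests:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
# Simulated chain x3 -> x1 -> x2, columns stacked in the estimated causal order
x3 = rng.uniform(-1.0, 1.0, n)
x1 = 1.5 * x3 + rng.uniform(-1.0, 1.0, n)
x2 = -1.3 * x1 + rng.uniform(-1.0, 1.0, n)
X = np.column_stack([x3, x1, x2])

p = X.shape[1]
B = np.zeros((p, p))
for i in range(1, p):
    # OLS of the i-th variable on all its predecessors in the ordering
    coef, *_ = np.linalg.lstsq(X[:, :i], X[:, i], rcond=None)
    B[i, :i] = coef
B[np.abs(B) < 0.1] = 0.0   # crude threshold standing in for adaptive lasso / tests
print(np.round(B, 1))      # recovers the chain: B[1,0] ~ 1.5, B[2,1] ~ -1.3
```

The redundant edge x3 → x2 gets a near-zero coefficient once x1 is in the regression, so thresholding removes it.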
14. Basic idea (1/2): An exogenous variable can be at the top of a right ordering
• An exogenous variable is a variable with no parents (Bollen, 1989), here x3
– The corresponding row of B has all zeros
• So, an exogenous variable can be at the top of such an ordering that makes B lower triangular with zeros on the diagonal:

[x3]   [ 0    0    0] [x3]   [e3]
[x1] = [ 1.5  0    0] [x1] + [e1]
[x2]   [ 0   −1.3  0] [x2]   [e2]

[figure: chain x3 → x1 → x2]
15. Basic idea (2/2): Regress exogenous x3 out
• Compute the residuals r_i^(3) regressing the other variables x_i (i = 1, 2) on the exogenous x3:

r_i^(3) = x_i − (cov(x_i, x3) / var(x3)) x3   (i = 1, 2)

– The residuals form a LiNGAM model:

[r_1^(3)]   [ 0    0] [r_1^(3)]   [e1]
[r_2^(3)] = [−1.3  0] [r_2^(3)] + [e2]

– The ordering of the residuals is equivalent to that of the corresponding original variables
• Exogenous r_1^(3) implies `x1 can be at the second top'

[figure: chain x3 → x1 → x2, and the residual model r_1^(3) → r_2^(3)]
16. Outline of DirectLiNGAM
• Iteratively find exogenous variables until all the variables are ordered:
1. Find an exogenous variable, here x3
– Put x3 at the top of the ordering
– Regress x3 out
2. Find an exogenous residual, here r_1^(3)
– Put x1 at the second top of the ordering
– Regress r_1^(3) out
3. Put x2 at the third top of the ordering and terminate
• The estimated ordering is x3 < x1 < x2

[figure: Steps 1-3 on the chain x3 → x1 → x2]
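The whole loop fits in a few lines of NumPy, using the tanh-based nonlinear correlation from the later slides as the independence measure. The simulated chain, sample size, and seed are my choices:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000
# Simulated chain x3 -> x1 -> x2 (dict keys 0, 1, 2 stand for x1, x2, x3)
x3 = rng.uniform(-1.0, 1.0, n)
x1 = 1.5 * x3 + rng.uniform(-1.0, 1.0, n)
x2 = -1.3 * x1 + rng.uniform(-1.0, 1.0, n)
data = {0: x1, 1: x2, 2: x3}

def residual(xi, xj):
    """Residual of the least-squares regression of xi on xj."""
    return xi - np.cov(xi, xj, ddof=0)[0, 1] / np.var(xj) * xj

def dependence(xj, res):
    """Sum of nonlinear correlations between xj and its residuals (g = tanh)."""
    return sum(abs(np.corrcoef(xj, np.tanh(r))[0, 1]) +
               abs(np.corrcoef(np.tanh(xj), r)[0, 1]) for r in res)

order, remaining = [], [0, 1, 2]
while len(remaining) > 1:
    # Find the variable most independent of its regression residuals ...
    scores = {j: dependence(data[j], [residual(data[i], data[j])
                                      for i in remaining if i != j])
              for j in remaining}
    j = min(scores, key=scores.get)
    order.append(j)
    remaining.remove(j)
    # ... put it next in the ordering and regress it out of the rest
    for i in remaining:
        data[i] = residual(data[i], data[j])
order.append(remaining[0])
print(order)   # the true causal ordering is x3 < x1 < x2, i.e. [2, 0, 1]
```

With enough data the printed ordering matches the true one; note there are no step sizes, initial guesses, or convergence criteria anywhere in the loop.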
17. Identification of an exogenous variable (two-variable case)
i) x1 is exogenous:

x1 = e1
x2 = b_21 x1 + e2   (b_21 ≠ 0)

Regressing x2 on x1:

r_2^(1) = x2 − (cov(x2, x1) / var(x1)) x1 = e2,

since cov(x2, x1)/var(x1) = b_21. Then x1 and r_2^(1) are independent.

ii) x1 is NOT exogenous:

x1 = b_12 x2 + e1   (b_12 ≠ 0)
x2 = e2

Regressing x2 on x1:

r_2^(1) = x2 − (cov(x2, x1) / var(x1)) x1 = {1 − b_12 cov(x2, x1)/var(x1)} x2 − (cov(x2, x1)/var(x1)) e1

Then x1 and r_2^(1) are NOT independent.
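The asymmetry between cases i) and ii) can be checked numerically; a small sketch (coefficient, noise distribution, and sample size are my choices, and a tanh-based nonlinear correlation stands in for a full independence test):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
e1 = rng.uniform(-1.0, 1.0, n)   # non-Gaussian disturbances
e2 = rng.uniform(-1.0, 1.0, n)
x1 = e1                          # case i): x1 is exogenous
x2 = 1.5 * x1 + e2

def residual(y, x):
    """Residual of the least-squares regression of y on x."""
    return y - np.cov(y, x, ddof=0)[0, 1] / np.var(x) * x

def dependence(x, r):
    """Nonlinear correlation between regressor and residual (g = tanh)."""
    return (abs(np.corrcoef(x, np.tanh(r))[0, 1]) +
            abs(np.corrcoef(np.tanh(x), r)[0, 1]))

d_right = dependence(x1, residual(x2, x1))   # regress x2 on x1: residual ~ e2
d_wrong = dependence(x2, residual(x1, x2))   # regress x1 on x2: stays dependent
print(d_right < d_wrong)                     # True: the exogenous variable wins
```

Regressing the effect on the cause leaves a residual that is independent of the regressor; regressing in the wrong direction does not, which is exactly the asymmetry the slide exploits.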
18. Need to use Darmois-Skitovitch theorem (Darmois, 1953; Skitovitch, 1953)
• In case ii), x1 = b_12 x2 + e1 (b_12 ≠ 0), and the residual

r_2^(1) = x2 − (cov(x2, x1) / var(x1)) x1 = {1 − b_12 cov(x2, x1)/var(x1)} x2 − (cov(x2, x1)/var(x1)) e1

is, like x1, a linear combination of e1 and e2; this is why x1 and r_2^(1) are NOT independent.
• Darmois-Skitovitch theorem: Define two variables x1 and x2 as

x1 = Σ_{j=1}^p a_1j e_j,   x2 = Σ_{j=1}^p a_2j e_j,

where e_j are independent random variables. If there exists a non-Gaussian e_i for which a_1i a_2i ≠ 0, then x1 and x2 are dependent.
19. Identification of an exogenous variable (p-variable case)
• Lemma 1: x_j and its residuals

r_i^(j) = x_i − (cov(x_i, x_j) / var(x_j)) x_j

are independent for all i ≠ j  ⇔  x_j is exogenous
• In practice, we can identify an exogenous variable by finding a variable that is most independent of its residuals
20. Independence measures
• Evaluate independence between a variable and a residual by a nonlinear correlation:

corr{x_j, g(r_i^(j))}   (g = tanh)

• Taking the sum over all the residuals, we get:

T = Σ_{i≠j} [ |corr{x_j, g(r_i^(j))}| + |corr{g(x_j), r_i^(j)}| ]

• Can use more sophisticated measures as well (Bach & Jordan, 2002; Gretton et al., 2005; Kraskov et al., 2004)
– Kernel-based independence measure (Bach & Jordan, 2002) often gives more accurate estimates (Sogawa et al., IJCNN2010)
21. Real-world data example (1/2)
• Status attainment model
– General Social Survey (U.S.A.)
– Sample size = 1380
• Non-farm, ages 35-45, white, male, in the labor force, years 1972-2006

[figure: causal graph from domain knowledge (Duncan et al., 1972) vs. the graph estimated by DirectLiNGAM]
23. Summary
• DirectLiNGAM repeats:
– Least-squares simple linear regression
– Evaluation of pairwise independence between each variable and its residuals
• No algorithmic parameters like step size, initial guesses, or convergence criteria
• Guaranteed convergence to the right solution in a fixed number of steps (the number of variables) if the data strictly follows the model