Much of the data generated today takes the form of time series, from climate records and users' posts on social media to stock prices and neurological recordings. Discovering the temporal dependence between different time series is an important task in time series analysis, with applications in fields ranging from social media advertising, influencer discovery, and marketing to stock markets, psychology, and climate science. This report studies the problem of identifying such networks of dependencies.
We first examine how this problem has been studied in the field of econometrics. We then study three different approaches to building causal networks between time series and see how this knowledge has been applied in three quite different fields. Finally, we present some open issues and directions in which this work can be extended.
Granger Causality Test: A Useful Descriptive Tool for Time Series Data (IJMER)
The interdependency of one or more variables on others has long been recognized, following the discovery that one variable tends to move, or regress, toward another, in work by Galton (1886), Pearson & Lee (1903), Kendall & Stuart (1961), Johnston & DiNardo (1997), and Gujarati (2004), among others. In light of this dependence over time, the researchers use the Granger causality test as a tool for predictive causality in time series, applying it to Nigeria's GDP and money supply to determine the type of causality between the two series and which one statistically predicts the other.
The study tests the nature of causality between GDP and money supply for the Federal Republic of Nigeria over a thirty-year period, using data sourced from the Central Bank of Nigeria Statistical Bulletin. The usual conditions of the Granger causality test are observed: stationarity of the variables under consideration is ensured; enough lags are included in the model before estimation, since the test is sensitive to the number of lags introduced; and the disturbance terms in the various models are assumed to be uncorrelated. The analysis indicates a bilateral relationship between Nigeria's GDP and money supply: GDP Granger-causes money supply and vice versa. Based on this result, both series can be successfully modeled with a vector autoregressive (VAR) model, since changes in one variable have a significant effect on the other.
1. The document discusses Granger causality testing within the context of bivariate analysis of stationary time series.
2. It defines Granger causality as holding when the past of one time series improves the prediction of another beyond what that series' own past provides, and describes three main tests for Granger causality between two stationary time series: the direct Granger test, the Sims test, and the modified Sims test.
3. The direct Granger test involves regressing each variable on lagged values of itself and the other variable, and using an F-test to examine if including lags of the other variable improves predictions compared to only using own lags.
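As a rough illustration of the direct Granger test just described, the sketch below uses statsmodels' grangercausalitytests on simulated data; the series names and lag choice are placeholders, not values from any of these documents.

```python
# A minimal sketch of the direct Granger test using statsmodels.
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(0)
n = 200
money_supply = rng.normal(size=n)          # simulated stationary series
gdp = np.empty(n)
gdp[0] = rng.normal()
for t in range(1, n):
    gdp[t] = 0.5 * money_supply[t - 1] + rng.normal()

# Column order matters: the test asks whether the SECOND column
# Granger-causes the FIRST one.
data = np.column_stack([gdp, money_supply])
results = grangercausalitytests(data, maxlag=4)

# For each lag, 'ssr_ftest' is the F-test comparing the restricted model
# (own lags only) against the unrestricted model (own lags + other's lags).
for lag, res in results.items():
    fstat, pval = res[0]["ssr_ftest"][:2]
    print(f"lag {lag}: F = {fstat:.2f}, p = {pval:.3f}")
```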
Lecture notes on Johansen cointegration (Moses Sichei)
This document discusses the Johansen cointegration procedure and error correction models. It provides an example where there are 3 variables (short-term interest rate, 3-year interest rate, and 10-year interest rate) that are cointegrated with 2 cointegrating relationships. The error correction form of the vector autoregression is shown, with the 2 cointegrating vectors entering each equation. Restrictions can be tested on the coefficients of the cointegrating vectors (beta) using likelihood ratio tests. This allows testing of economic theory restrictions on the long-run relationships between the variables.
The document discusses various econometric modeling techniques including regression equations, cointegration, error correction models, vector autoregressive (VAR) modeling, and vector error correction models (VECM). It explains that regression equations can produce spurious results if the data is non-stationary, and that cointegration exists if the residuals from a regression equation are stationary. Error correction models specify the short-run relationship that maintains the long-run equilibrium between cointegrated variables. VAR models express current values of variables as functions of past values, while VECMs are VARs in first differences that incorporate the long-run cointegrating relationships between variables.
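A minimal sketch of the Johansen procedure described above, using statsmodels' coint_johansen; the three simulated interest-rate series are stand-ins for the example in the notes, and the deterministic-term and lag settings are illustrative.

```python
# Johansen cointegration test sketch with statsmodels.
import numpy as np
from statsmodels.tsa.vector_ar.vecm import coint_johansen

rng = np.random.default_rng(1)
n = 300
common_trend = rng.normal(size=n).cumsum()        # one shared I(1) trend
short_rate = common_trend + rng.normal(size=n)
three_yr = 0.8 * common_trend + rng.normal(size=n)
ten_yr = 0.6 * common_trend + rng.normal(size=n)
data = np.column_stack([short_rate, three_yr, ten_yr])

# det_order=0: constant term; k_ar_diff=1: one lagged difference in the VECM.
result = coint_johansen(data, det_order=0, k_ar_diff=1)

# Compare each trace statistic with its 5% critical value (column index 1)
# to pick the cointegration rank; three series sharing one trend should
# give rank 2, matching the 2 cointegrating relationships in the notes.
for r, (stat, crit) in enumerate(zip(result.lr1, result.cvt[:, 1])):
    print(f"rank <= {r}: trace = {stat:.2f}, 5% crit = {crit:.2f}")
```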
Definition of co-integration.
Different approaches to co-integration.
Johansen and Juselius (J.J.) co-integration.
Error correction model (ECM).
Interpretation of the ECM term.
Long-run co-integration equation.
This document provides an overview of distributed lag models. It defines distributed lag models as models where the current value of a dependent variable is predicted based on current and past values of an explanatory variable. It discusses finite and infinite distributed lag models. Methods for estimating distributed lag models like ad hoc estimation and the Koyck model are described. The Koyck model specifies an exponential decline in lag weights. Problems with estimation like multicollinearity, serial correlation, and heteroscedasticity are also summarized.
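In conventional textbook notation, the models this summary refers to can be written as follows (a sketch using standard symbols, not notation taken from the document itself):

```latex
% Finite distributed lag model of order k:
y_t = \alpha + \beta_0 x_t + \beta_1 x_{t-1} + \dots + \beta_k x_{t-k} + u_t

% Koyck model: an infinite lag with geometrically declining weights,
% so more distant past values of x carry exponentially smaller weight:
\beta_j = \beta_0 \lambda^{j}, \qquad j = 0, 1, 2, \dots, \quad 0 < \lambda < 1
```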
Cointegration analysis: Modelling the complex interdependencies between finan... (Edward Thomas Jones)
1) The document discusses cointegration analysis, which models the complex interdependencies between financial assets. It examines the non-stationary nature of financial time series data and explores vector autoregressive (VAR) models and cointegration techniques to analyze relationships between non-stationary variables.
2) VAR models provide a framework for modeling dynamic relationships between stationary time series variables. The document outlines univariate and multivariate VAR models and discusses estimations and lag order selection for VAR models.
3) Cointegration techniques allow modeling of relationships between non-stationary time series variables. The document reviews tests for identifying stationary and non-stationary time series, including the Augmented Dickey-Fuller and Phillips-Perron tests (the ADF test is sketched below).
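A minimal ADF example with statsmodels' adfuller, on simulated data rather than any series from these documents:

```python
# Augmented Dickey-Fuller unit root test sketch.
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(2)
prices = rng.normal(size=500).cumsum()     # random walk => has a unit root

for name, series in [("levels", prices), ("first differences", np.diff(prices))]:
    stat, pvalue, usedlag, nobs, crit, icbest = adfuller(series, autolag="AIC")
    print(f"{name}: ADF = {stat:.2f}, p = {pvalue:.3f} "
          f"(reject unit root at 5%: {pvalue < 0.05})")
```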
This document discusses the Koyck transformation approach to modeling distributed lag structures. It begins by introducing distributed lag models, which allow the effect of a causal variable to be spread over multiple time periods. It then describes the Koyck transformation technique, which simplifies an infinite distributed lag model into an estimable autoregressive model by assuming the lag coefficients decline geometrically. This involves lagging the model by one period, multiplying by the decay parameter λ, and subtracting to isolate the impact of the causal variable in the current period. The Koyck approach allows estimation of distributed lag models using standard regression methods.
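Writing out the transformation the summary describes, with the usual symbols:

```latex
% Infinite distributed lag with geometrically declining weights:
y_t = \alpha + \beta \sum_{j=0}^{\infty} \lambda^{j} x_{t-j} + u_t, \qquad 0 < \lambda < 1

% Lag by one period and multiply by \lambda:
\lambda y_{t-1} = \lambda \alpha + \beta \sum_{j=0}^{\infty} \lambda^{j+1} x_{t-1-j} + \lambda u_{t-1}

% Subtracting collapses the infinite sum, leaving an estimable
% autoregressive form with moving-average error v_t = u_t - \lambda u_{t-1}:
y_t = \alpha(1-\lambda) + \beta x_t + \lambda y_{t-1} + v_t
```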
This document discusses the key concepts and assumptions of multiple linear regression analysis. It begins by defining the multiple regression model as examining the linear relationship between a dependent variable (Y) and two or more independent variables (X1, X2,...Xk). It then outlines the assumptions of the regression model, including linearity, independence of errors, normality of errors, and equal variance. The document also discusses how to evaluate the significance of the overall model and individual variables using F-tests, t-tests, and confidence intervals. It concludes by discussing how to evaluate the regression assumptions by examining the residuals.
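As a hedged illustration of fitting and checking such a model, the sketch below uses statsmodels OLS on synthetic data; the variable names and coefficients are invented.

```python
# Multiple linear regression with overall F-test, t-tests, and residuals.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 100
df = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
df["y"] = 2.0 + 1.5 * df["x1"] - 0.7 * df["x2"] + rng.normal(size=n)

X = sm.add_constant(df[["x1", "x2"]])
model = sm.OLS(df["y"], X).fit()

print(model.summary())   # overall F-test, per-variable t-tests, confidence intervals
resid = model.resid      # inspect residuals to evaluate the assumptions
```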
This document reviews testing for causality between variables. It begins by defining Granger causality, which tests whether including one time series helps forecast another. For bivariate systems, causality can be tested by examining coefficients in a vector autoregression (VAR) model. For multivariate systems, causality is more complex and graphical models may help. The document outlines procedures for testing causality between stationary and nonstationary time series using impulse responses, vector autoregressive moving average (VARMA) models, and other techniques. It provides examples and discusses challenges like potential omitted common factors.
This document proposes generalized additive models (GAMs) to model conditional dependence structures between random variables. Specifically, it develops a GAM framework where a dependence or concordance measure between two variables is modeled as a parametric, non-parametric, or semi-parametric function of explanatory variables. It derives the root-n consistency and asymptotic normality of the maximum penalized log-likelihood estimator for the proposed GAMs. It also discusses details of the estimation procedure and selection of smoothing parameters.
This document outlines the generalised method of moments (GMM) estimation technique. It begins with the basic principles of GMM, including that it uses theoretical relations that parameters should satisfy to choose parameter estimates. It then discusses estimating GMM, hypothesis testing with GMM, and extensions such as using GMM with dynamic stochastic general equilibrium (DSGE) models. The document provides details on how population moments relate to sample moments, and how method of moments estimation and instrumental variables estimation can both be viewed as special cases of GMM. It concludes by explaining how the generalized method of moments estimator works by minimizing a weighted distance between sample and population moments.
1. The document discusses linear correlation and regression between plasma amphetamine levels and amphetamine-induced psychosis scores using data from 10 patients.
2. A positive correlation was found between the two variables, and a linear regression equation was established to predict psychosis scores from amphetamine levels.
3. However, further statistical tests were needed to determine if the correlation and regression model could be generalized to the overall patient population.
This document provides an overview of simple linear regression and correlation analysis. It defines regression as estimating the relationship between two variables and correlation as measuring the strength and direction of that relationship. The key points covered include:
- Regression finds an estimating equation to relate known and unknown variables. Correlation determines how well that equation fits the data.
- Pearson's correlation coefficient r measures the linear relationship between two variables on a scale from -1 to 1.
- The coefficient of determination r² indicates what percentage of variation in the dependent variable is explained by the independent variable (r and r² are both computed in the sketch after this list).
- Statistical tests can evaluate whether a correlation is statistically significant or could be due to chance.
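A small illustration of r, r², and the associated significance test with scipy; the data are made up.

```python
# Pearson correlation and coefficient of determination.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(4)
x = rng.normal(size=50)
y = 0.8 * x + rng.normal(scale=0.5, size=50)

r, p = pearsonr(x, y)        # correlation coefficient and its p-value
print(f"r = {r:.3f}, r^2 = {r**2:.3f}, p = {p:.4f}")
# r near +1 or -1 => strong linear relation; p tests H0: no correlation.
```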
A Regularization Approach to the Reconciliation of Constrained Data Sets (Alkis Vazacopoulos)
This document proposes a new regularization approach to reconcile constrained data sets. The approach assumes unmeasured variables have a finite but equal uncertainty to derive an iterative solution that does not require explicitly computing a projection matrix at each step. This avoids issues when the projection matrix is non-invertible. The method arrives at a minimized solution by reformulating the problem to include an added regularization term for the unmeasured variables. It also provides an alternative way to classify variables without using the projection matrix.
This document summarizes and analyzes the performance of Newton's method, BFGS method, and SR1 method for minimizing a quadratic and convex function. It finds that:
1) Newton's method performed the best, requiring fewer iterations and achieving greater accuracy than the other methods.
2) For constrained problems, the SR1 method achieved some success due to its flexibility in not always requiring a descent direction.
3) While Newton's method has the best theoretical convergence rate, quasi-Newton methods are more applicable to complex problems as Hessian inversion becomes more computationally expensive.
4) When minimizing quadratic and convex functions, Newton's method generally performs better than the other tested methods. However, the best
This document discusses heteroscedasticity, which occurs when the error variance is not constant. It provides examples of when the variance of errors may change, such as with income level or outliers. Graphical methods are presented for detecting heteroscedasticity by examining patterns in residual plots. Formal tests are also described, including the Park test which regresses the log of the squared residuals on explanatory variables, and the Glejser test which regresses the absolute value of residuals on variables related to the error variance. Detection of heteroscedasticity is important as it violates assumptions of the classical linear regression model.
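A hedged sketch of the Park test described above: regress the log of the squared OLS residuals on an explanatory variable; the names and data are illustrative.

```python
# Park test for heteroscedasticity.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 200
income = rng.uniform(1, 10, size=n)
# Error variance grows with income => heteroscedastic by construction.
y = 1.0 + 0.5 * income + rng.normal(scale=0.3 * income)

X = sm.add_constant(income)
resid = sm.OLS(y, X).fit().resid

# Park test: ln(e^2) = a + b*ln(income); a significant b suggests
# the error variance depends on income (heteroscedasticity).
park = sm.OLS(np.log(resid**2), sm.add_constant(np.log(income))).fit()
print(park.params, park.pvalues)
```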
Introduction to correlation and regression analysis (Farzad Javidanrad)
This document provides an introduction to correlation and regression analysis. It defines key concepts like variables, random variables, and probability distributions. It discusses how correlation measures the strength and direction of a linear relationship between two variables. Correlation coefficients range from -1 to 1, with values closer to these extremes indicating stronger correlation. The document also introduces determination coefficients, which measure the proportion of variance in one variable explained by the other. Regression analysis builds on correlation to study and predict the average value of one variable based on the values of other explanatory variables.
This document summarizes key concepts regarding the chi-square distribution and its applications to statistical tests. It discusses:
1) The mathematical properties of the chi-square distribution and how it can be derived from the normal distribution.
2) Examples of chi-square goodness-of-fit tests to determine if sample data fits an expected distribution like the normal.
3) How chi-square tests of independence can assess if two criteria of classification applied to data are independent.
4) Additional chi-square tests of homogeneity and Fisher's exact test. Formulas and steps for calculating test statistics are provided.
The document discusses simple linear regression and correlation methods. It defines deterministic and probabilistic models for describing the relationship between two variables. A simple linear regression model assumes a population regression line with intercept a and slope b, where observations may deviate from the line by some random error e. Key assumptions of the model are that e has a normal distribution with mean 0 and constant variance across values of x, and errors are independent. The slope b estimates the average change in y per unit change in x.
Finding the relationship between two quantitative variables without being able to infer causal relationships
Correlation is a statistical technique used to determine the degree to which two variables are related
Introduces and explains the use of multiple linear regression, a multivariate correlational statistical technique. For more info, see the lecture page at http://goo.gl/CeBsv. See also the slides for the MLR II lecture http://www.slideshare.net/jtneill/multiple-linear-regression-ii
This document discusses correlation and linear regression. It defines correlation as the association between two variables, which can be positive, negative, or non-existent. Linear correlation exists when plotted points approximate a straight line. The correlation coefficient r measures the strength of a linear relationship between -1 and 1. Linear regression finds the linear relationship that best fits the data using a regression equation to predict y values from x. Multiple linear regression extends this to use multiple explanatory variables.
This dissertation consists of three chapters that study identification and inference in econometric models.
Chapter 1 considers identification robust inference when the moment variance matrix is singular. It develops a novel asymptotic approach based on higher order expansions of the eigensystem to show that the Generalized Anderson-Rubin statistic possesses a chi-squared limit under additional regularity conditions. When these conditions are violated, the statistic is shown to be Op(n) and exhibit "moment-singularity bias".
Chapter 2 provides a method called "Normalized Principal Components" to minimize many weak instrument bias in linear IV settings. It derives an asymptotically valid ranking of instruments in terms of correlation and selects instruments to minimize MSE approximations.
Chapter
The document provides an overview of the chi-squared test and examples of its applications. It introduces the chi-squared test as a method to assess how well observed data fits expected theoretical results. Several examples are given demonstrating chi-squared tests of goodness of fit for binomial, Poisson, normal and contingency table distributions. Practice questions are also provided involving a range of chi-squared test applications.
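A minimal chi-squared test of independence on a contingency table; the counts are invented for illustration.

```python
# Chi-squared test of independence.
from scipy.stats import chi2_contingency

table = [[30, 10],     # e.g. group A: outcome yes / no
         [20, 40]]     # e.g. group B: outcome yes / no

stat, pvalue, dof, expected = chi2_contingency(table)
print(f"chi2 = {stat:.2f}, dof = {dof}, p = {pvalue:.4f}")
# Small p => reject independence of the two classification criteria.
```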
1) The document analyzes correlations between various physical and psychological measures based on statistical data. It reports correlation coefficients (r values) and significance values (p values) between different variable pairs.
2) Several correlations are identified as statistically significant based on p values less than 0.01 or 0.05, with moderate to strong correlation coefficients ranging from around 0.3 to over 0.5.
3) The strongest correlations explain over 30% of the variance between variables, which is considered clinically significant, while weaker correlations explain only around 1% of variance or less.
Explains some advanced uses of multiple linear regression, including partial correlations, analysis of residuals, interactions, and analysis of change. See also previous lecture http://www.slideshare.net/jtneill/multiple-linear-regression
- The document describes a study that uses a modified Kolmogorov-Smirnov (KS) test to test if the innovations of a GARCH model come from a mixture of normal distributions rather than a standard normal distribution.
- It establishes critical values for the KS test and modified KS (MKS) test through simulation under the null hypothesis. It then uses simulation to calculate the size and power of both tests when the innovations come from alternative distributions like the normal, Student's t, and generalized error distributions.
- The results show that the KS and MKS tests maintain the correct size when the innovations are actually from the mixture of normals. The power of both tests is greater than the nominal level when the innovations come
This document provides information about performing a Granger causality test in EViews. It defines Granger causality, lists the assumptions, outlines the steps to run the test in EViews, and provides an example of results and interpretation. The method section explains how to open EViews, create a new work file with dated data from 1979-2010, select the Granger causality test and specify lags, and view the results. The results section shows an example output and its interpretation: the null hypothesis that LCPI does not Granger-cause LPGDP cannot be rejected.
This document discusses Granger causality and how to test for it. It provides the following key points:
1) Granger causality measures whether variable A occurs before variable B and helps predict B, but does not guarantee true causality. If A does not Granger cause B, one can be more confident A does not cause B.
2) To test for Granger causality, autoregressive models are developed with and without the variable being tested, and an F-test or t-test is used to see if adding the variable significantly lowers the residual sum of squares (a restricted-versus-unrestricted sketch follows this list).
3) The document applies this to test if changes in loans Granger cause changes in deposits using quarterly U.S. financial
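A hedged sketch of the restricted-versus-unrestricted comparison described in point 2), forming the F-statistic by hand; the series names, data, and lag count are illustrative placeholders, not the quarterly U.S. data the document uses.

```python
# Manual Granger F-test: restricted vs. unrestricted autoregression.
import numpy as np
import statsmodels.api as sm

def granger_f_test(y, x, p):
    """F-statistic for H0: lags of x add nothing to an AR(p) model of y."""
    n = len(y)
    Y = y[p:]
    own_lags = np.column_stack([y[p - j:n - j] for j in range(1, p + 1)])
    other_lags = np.column_stack([x[p - j:n - j] for j in range(1, p + 1)])

    restricted = sm.OLS(Y, sm.add_constant(own_lags)).fit()
    unrestricted = sm.OLS(Y, sm.add_constant(
        np.column_stack([own_lags, other_lags]))).fit()

    q = p                                  # number of restrictions (lags of x)
    df = unrestricted.df_resid
    return ((restricted.ssr - unrestricted.ssr) / q) / (unrestricted.ssr / df)

rng = np.random.default_rng(10)
loans = rng.normal(size=200)
deposits = 0.6 * np.concatenate([[0], loans[:-1]]) + rng.normal(size=200)
# Do lagged loans help predict deposits?
print("F =", granger_f_test(deposits, loans, p=4))
```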
This document provides an overview of an upcoming presentation on error correction models and their application to agricultural economics research. It outlines the major topics to be covered, including concepts and definitions related to cointegration and error correction models, Johansen's cointegration test, the Engle-Granger two-step error correction model, and a case study on market integration of arecanut in Karnataka state using an error correction model approach. Tables and figures are included to illustrate key concepts like order of integration, cointegration, and the residual-based test for cointegration.
The document summarizes the Toda-Yamamoto augmented Granger causality test.
[1] The test allows checking for causality between integrated variables of different orders without needing to determine cointegration. It involves estimating a VAR model with maximal order of integration lags added.
[2] The test procedure involves determining the order of integration (d), selecting the optimal lag length (k), setting up the null and alternative hypotheses of no causality versus causality, and calculating an F-statistic to test for causality (this procedure is sketched after the list).
[3] If the F-statistic exceeds the critical value, the null of no causality is rejected, indicating causality between the variables.
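A hedged sketch of the Toda-Yamamoto procedure using statsmodels OLS; the series, k, and d_max below are placeholders chosen for illustration, and the helper function is hypothetical, not part of any library.

```python
# Toda-Yamamoto augmented Granger causality sketch.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def toda_yamamoto(y, x, k, d_max):
    """Wald test of 'x does not Granger-cause y' in one VAR(k + d_max) equation."""
    df = pd.DataFrame({"y": y, "x": x})
    for lag in range(1, k + d_max + 1):
        df[f"y_l{lag}"] = df["y"].shift(lag)
        df[f"x_l{lag}"] = df["x"].shift(lag)
    df = df.dropna()
    rhs = " + ".join([f"y_l{j}" for j in range(1, k + d_max + 1)]
                     + [f"x_l{j}" for j in range(1, k + d_max + 1)])
    res = smf.ols(f"y ~ {rhs}", data=df).fit()
    # Restrict only the first k lags of x; the extra d_max lags stay
    # unrestricted, which is what keeps the Wald statistic valid even
    # when the series are integrated.
    hypothesis = ", ".join(f"x_l{j} = 0" for j in range(1, k + 1))
    return res.wald_test(hypothesis, use_f=True)

rng = np.random.default_rng(6)
x = rng.normal(size=300).cumsum()          # I(1) by construction, so d_max = 1
y = 0.4 * pd.Series(x).shift(1).fillna(0).values + rng.normal(size=300).cumsum()
print(toda_yamamoto(y, x, k=2, d_max=1))
```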
Linear regression models a quantity as a linear combination of features to minimize residuals. Issues include unstable results from collinear features and overfitting. Regularization like ridge regression addresses this by minimizing residuals and coefficient size, improving variance. Cross-validation chooses the best model by evaluating prediction error on validation data rather than training error. Lasso induces sparsity by minimizing residuals and the L1 norm of coefficients. Scikit Learn implements these methods with options like normalization, intercept fitting, and solver choices.
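A short scikit-learn sketch of the ideas above: ridge and lasso fit with cross-validated regularization strength. The regression problem is synthetic.

```python
# Ridge and lasso with cross-validation in scikit-learn.
import numpy as np
from sklearn.linear_model import RidgeCV, LassoCV
from sklearn.datasets import make_regression

# Many features, only a few truly informative ones.
X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)

ridge = RidgeCV(alphas=np.logspace(-3, 3, 13)).fit(X, y)   # L2 shrinkage
lasso = LassoCV(cv=5, random_state=0).fit(X, y)            # L1 => sparsity

print("ridge alpha:", ridge.alpha_)
print("lasso alpha:", lasso.alpha_,
      "non-zero coefficients:", int(np.sum(lasso.coef_ != 0)))
```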
The document discusses the decline in law school applicants over the past decade. Between 2010-2011 and 2013, law school applicants progressively declined. The acceptance rate for law schools increased from 55.6% in 2004 to 76.9% in 2013 as schools attempted to match the reduction in demand with a reduction in supply. Prestigious law schools still have high employment rates around 95% concentrated in large law firms, while most other law schools have employment rates of 80% or above.
This document contains analysis of stationarity and unit root tests for the S&P 500 Index (SPIndex) and Atlanta housing price index (AtlantaHPIndex) time series data. Optimal lags were selected using the Bayesian information criterion. Unit root tests using these lags show that the null hypothesis of non-stationarity cannot be rejected for the SPIndex, but can be rejected for the AtlantaHPIndex, indicating it is stationary.
This document presents an introduction to the multiple linear regression model. It explains that this model uses more than one explanatory variable to predict the values of a dependent variable more precisely than simple linear regression. It also describes how the model parameters are estimated by the method of least squares and how the goodness of fit of the model is evaluated. Finally, it introduces an example to illustrate the application of multiple linear regression.
The document discusses unrestricted vector autoregression (VAR) models. It analyzes a VAR model using quarterly data on H6 money aggregate DDA, personal income, and 10-year Treasury rates from the early 1960s to 2015. The model includes endogenous and exogenous variables. The main benefits of VAR discussed are that it allows measuring the impact of shocks to endogenous variables on other variables using impulse response functions and forecast error variance decompositions. However, the document notes some limitations of VAR models and questions whether some results like impulse responses truly represent economic relationships.
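A hedged sketch of an unrestricted VAR with impulse responses and forecast error variance decomposition in statsmodels; the three series are simulated stand-ins for the money, income, and interest-rate data described above.

```python
# Unrestricted VAR with IRFs and FEVD.
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(7)
data = pd.DataFrame(rng.normal(size=(200, 3)),
                    columns=["dda", "income", "treasury10y"])

model = VAR(data)
res = model.fit(maxlags=8, ic="aic")   # lag order chosen by AIC

irf = res.irf(10)                      # impulse responses, 10 periods ahead
fevd = res.fevd(10)                    # forecast error variance decomposition
print(res.summary())
# irf.plot(); fevd.plot()              # uncomment to visualize
```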
Quantitative method intro: variable_levels_measurement (Keiko Ono)
This document discusses variables, levels of measurement, and key terms in quantitative methods. It defines a variable as a property of an observation that can take on two or more values. There are three levels of measurement for variables: nominal, ordinal, and interval. Nominal variables categorize without order, ordinal can be ordered but differences are not exact, and interval variables have exact differences represented by each value. Appropriate summary statistics depend on the level of measurement, with nominal only allowing frequency and mode, ordinal adding median and range, and interval permitting all including mean, variance, minimum, and maximum.
The document discusses deep learning and deep neural networks. Some key points:
1) A deep neural network (DNN) has at least two hidden layers, whereas a regular neural network only has one hidden layer. DNNs can be thought of as a series of logit regressions with intermediate factors representing hidden layers.
2) Important parameters for DNNs include the number of hidden layers, number of nodes per layer, activation functions, number of iterations, and output function. Tuning these parameters is important (a minimal configuration sketch follows this list).
3) The author tested various DNN structures on a dataset to predict stock market returns, comparing performance to a regression model. DNN models with one hidden layer of 5-7 nodes performed better than the regression
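A minimal, hypothetical configuration echoing the structure described above (one hidden layer of 5-7 nodes); the data are random placeholders, not stock returns, and scikit-learn's MLPRegressor stands in for whatever framework the author used.

```python
# One-hidden-layer neural network regression sketch.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(8)
X = rng.normal(size=(500, 10))                      # 10 predictor features
y = X @ rng.normal(size=10) + rng.normal(size=500)  # synthetic target

mlp = MLPRegressor(hidden_layer_sizes=(6,),   # one hidden layer, 6 nodes
                   activation="relu",
                   max_iter=2000,
                   random_state=0).fit(X, y)
print("in-sample R^2:", mlp.score(X, y))
```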
This document provides an outline and overview of tutorials for using the STATA data analysis software. It describes STATA's capabilities for data management tasks like sorting, keeping/dropping variables and observations, merging datasets, and working with dates. It lists example datasets from an econometrics textbook that are used in the tutorials. The website www.STATA.org.uk hosts step-by-step screenshot guides for various STATA functions covering data management, statistical analysis, importing data, and more.
Issues associated with Unit Root, multicollinearity, and autocorrelation. Those issues are not as black-and-white as people think they are. They are rather complex and at times even inconclusive. Read why.
This document is an introduction to statistical machine learning presented by Christfried Webers from NICTA and The Australian National University. It discusses linear basis function models and how to perform maximum likelihood and least squares estimation. Specifically, it shows that maximizing the likelihood is equivalent to minimizing the sum-of-squares error, and that the maximum likelihood solution is given by the pseudo-inverse of the design matrix. It also examines the geometry of least squares and the bias-variance decomposition.
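A compact statement of the result summarized above, in the conventional notation for linear basis function models:

```latex
% Linear basis function model:
y(\mathbf{x}, \mathbf{w}) = \sum_{j=0}^{M-1} w_j \phi_j(\mathbf{x}) = \mathbf{w}^\top \boldsymbol{\phi}(\mathbf{x})

% With Gaussian noise, maximizing the likelihood is equivalent to
% minimizing the sum-of-squares error; the minimizer is given by the
% pseudo-inverse of the design matrix \Phi (rows \boldsymbol{\phi}(\mathbf{x}_n)^\top):
\mathbf{w}_{\mathrm{ML}} = (\Phi^\top \Phi)^{-1} \Phi^\top \mathbf{t} = \Phi^{\dagger} \mathbf{t}
```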
This document discusses multiple linear regression analysis. It begins by defining a multiple regression equation that describes the relationship between a response variable and two or more explanatory variables. It notes that multiple regression allows prediction of a response using more than one predictor variable. The document outlines key elements of multiple regression including visualization of relationships, statistical significance testing, and evaluating model fit. It provides examples of interpreting multiple regression output and using the technique to predict outcomes.
This document provides an outline for tutorials on using STATA software for statistical analysis, focusing on linear regressions. It describes datasets used in examples from an econometrics textbook. The outline lists topics covered in the tutorials, including regressions, F-statistics, fitted values, residuals, and challenges. It directs readers to a website for downloading tutorial examples and guides to using STATA for additional statistical techniques.
Knowledge of cause-effect relationships is central to the field of climate science, supporting mechanistic understanding, observational sampling strategies, experimental design, model development and model prediction. While the major causal connections in our planet's climate system are already known, there is still potential for new discoveries in some areas. The purpose of this talk is to make this community familiar with a variety of available tools to discover potential cause-effect relationships from observed or simulation data. Some of these tools are already in use in climate science, others are just emerging in recent years. None of them are miracle solutions, but many can provide important pieces of information to climate scientists. An important way to use such methods is to generate cause-effect hypotheses that climate experts can then study further. In this talk we will (1) introduce key concepts important for causal analysis; (2) discuss some methods based on the concepts of Granger causality and Pearl causality; (3) point out some strengths and limitations of these approaches; and (4) illustrate such methods using a few real-world examples from climate science.
This document provides an overview of data mining techniques discussed in Chapter 3, including parametric and nonparametric models, statistical perspectives on point estimation and error measurement, Bayes' theorem, decision trees, neural networks, genetic algorithms, and similarity measures. Nonparametric techniques like neural networks, decision trees, and genetic algorithms are particularly suitable for data mining applications involving large, dynamically changing datasets.
The document provides an overview of correlation and regression analysis, time series models, and cost indexes. It defines correlation and regression analysis and discusses their importance and applications. It covers simple linear regression equations, assumptions, and hypothesis testing, as well as multiple linear regression, moving averages, exponential smoothing, and quantitative measures for evaluating time series models. The document serves as the agenda for the Advanced Economics for Engineers course taught by Leemary Berrios, Irving Rivera, and Wilfredo Robles.
Although we often told not to do it, statistical scientists frequently predict the value of outcome measures of physical systems at input points far the observed data. Since predictions are made in new regions of the input space, a statistical theory cannot dictate optimal rules for measures of uncertainty associated with extrapolation. This talk presents several solutions based on simple principles. The solutions are illustrated via the analysis of data generated by dropping spheres of varying radii and masses from different heights. Some of the techniques apply to more complex physical systems. The efficacy of these techniques is demonstrated using data (experimental and simulated) of the level of complexity physical scientist frequently face. Scientists should tailor these techniques to fit the needs of a particular application.
The document discusses developing quantitative structure-activity relationship (QSAR) models to predict the biological responses of nanomaterials. It describes using descriptors of pristine and weathered nanomaterials, as well as experimental parameters, to develop linear regression models between descriptors and responses. Partial least squares regression is used to handle correlations between descriptors. The data is also analyzed using k-means clustering to identify separate descriptor clusters, and QSAR models are developed for each cluster to improve predictions. The resulting models could then be used to predict responses of emerging nanomaterials based on their similarity to existing clusters.
We consider the problem of model estimation in episodic Block MDPs. In these MDPs, the decision maker has access to rich observations or contexts generated from a small number of latent states. We are interested in estimating the latent state decoding function (the mapping from the observations to latent states) based on data generated under a fixed behavior policy. We derive an information-theoretical lower bound on the error rate for estimating this function and present an algorithm approaching this fundamental limit. In turn, our algorithm also provides estimates of all the components of the MDP.
We apply our results to the problem of learning near-optimal policies in the reward-free setting. Based on our efficient model estimation algorithm, we show that we can infer a policy converging (as the number of collected samples grows large) to the optimal policy at the best possible asymptotic rate. Our analysis provides necessary and sufficient conditions under which exploiting the block structure yields improvements in the sample complexity for identifying near-optimal policies. When these conditions are met, the sample complexity in the minimax reward-free setting is improved by a multiplicative factor $n$, where $n$ is the number of contexts.
This document discusses moments, skewness, kurtosis, and several statistical distributions including binomial, Poisson, hypergeometric, and chi-square distributions. It defines key terms such as moment ratios, central moments, theorems, skewness, kurtosis, and correlation. Properties and applications of the binomial, Poisson, and hypergeometric distributions are provided. Finally, the document discusses the chi-square test for goodness of fit and independence.
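A quick illustration of sample skewness and kurtosis with scipy; the data are simulated.

```python
# Sample skewness and (excess) kurtosis.
import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(9)
symmetric = rng.normal(size=10_000)
right_skewed = rng.exponential(size=10_000)

for name, s in [("normal", symmetric), ("exponential", right_skewed)]:
    print(f"{name}: skew = {skew(s):.2f}, "
          f"excess kurtosis = {kurtosis(s):.2f}")  # kurtosis() is excess by default
```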
[ICLR2021 (spotlight)] Benefit of deep learning with non-convex noisy gradien... (Taiji Suzuki)
Presentation slide of our ICLR2021 paper "Benefit of deep learning with non-convex noisy gradient descent: Provable excess risk bound and superiority to kernel methods."
Abstract:
Establishing a theoretical analysis that explains why deep learning can outperform shallow learning such as kernel methods is one of the biggest issues in the deep learning literature. Towards answering this question, we evaluate the excess risk of a deep learning estimator trained by noisy gradient descent with ridge regularization on a mildly overparameterized neural network, and discuss its superiority to a class of linear estimators that includes the neural tangent kernel approach, random feature models, other kernel methods, the k-NN estimator, and so on. We consider a teacher-student regression model, and eventually show that any linear estimator can be outperformed by deep learning in the sense of the minimax optimal rate, especially in a high-dimensional setting. The obtained excess risk bounds are so-called fast learning rates, faster than the O(1/√n) rate obtained by the usual Rademacher complexity analysis. This discrepancy is induced by the non-convex geometry of the model; the noisy gradient descent used for neural network training provably reaches a near-globally-optimal solution even though the loss landscape is highly non-convex. Although noisy gradient descent does not employ any explicit or implicit sparsity-inducing regularization, it shows a preferable generalization performance that dominates linear estimators.
This document describes specification tests that can be used after estimating dynamic panel data models using the generalized method of moments (GMM) estimator. It presents GMM estimators for first-order autoregressive models with individual fixed effects that exploit moment restrictions from assuming serially uncorrelated errors. Monte Carlo simulations are used to evaluate the small-sample performance of tests of serial correlation based on GMM residuals, Sargan tests, and Hausman tests. The tests are also applied to estimated employment equations using an unbalanced panel of UK firms.
Covariance matrices are central to many adaptive filtering and optimisation problems. In practice, they have to be estimated from a finite number of samples; on this, I will review some known results from spectrum estimation and multiple-input multiple-output communications systems, and how properties that are assumed to be inherent in covariance and power spectral densities can easily be lost in the estimation process. I will discuss new results on space-time covariance estimation, and how the estimation from finite sample sets will impact on factorisations such as the eigenvalue decomposition, which is often key to solving the introductory optimisation problems. The purpose of the presentation is to give you some insight into estimating statistics as well as to provide a glimpse on classical signal processing challenges such as the separation of sources from a mixture of signals.
In this study, we propose an EEG analysis model based on a nonlinear oscillator with one degree of freedom. The model contains no random term. Our method identifies the six model parameters experimentally.
Details: https://kenyu-life.com/2018/11/03/modeling_of_eeg/
Created by Kenyu Uehara
This document discusses Bayesian neural networks. It begins with an introduction to Bayesian inference and variational inference. It then explains how variational inference can be used to approximate the posterior distribution in a Bayesian neural network. Several numerical methods for obtaining the posterior distribution are covered, including Metropolis-Hastings, Hamiltonian Monte Carlo, and Stochastic Gradient Langevin Dynamics. Finally, it provides an example of classifying MNIST digits with a Bayesian neural network and analyzing model uncertainties.
The document discusses random phenomena and random processes. Some key points:
- Random phenomena are those whose outcomes cannot be predicted deterministically due to complex factors. They are described statistically rather than deterministically.
- A random process is the collection of all possible time histories that could result from random phenomena. Individual time histories are called sample functions.
- Random processes can be described using averages over ensembles of sample functions or over time from a single sample function. Stationary and ergodic processes allow the use of time averages.
- Random variables, power spectral densities, and probability distributions provide information about random processes and allow their characterization in different domains.
The document discusses multiple linear regression analysis. It defines multiple regression as exploring the relationship between one continuous dependent variable and multiple independent variables. It provides examples of multiple regression models with one and two predictors. It also discusses assumptions of multiple regression like sample size, multicollinearity, outliers, and normality of residuals. Key steps in multiple regression like estimating parameters, assessing model fit and diagnosing assumptions are outlined.
The document presents a study that jointly models the duration and size of forest fires in British Columbia using random effects to link the two outcomes. Specifically, it uses joint models and random effects to investigate the effects of environmental variables on the duration and size of large, long-lasting lightning-caused fires. The study finds that a shared frailty component is significant, suggesting the two outcomes are related and can be modeled together.
This document presents a general framework for enhancing time series prediction performance. It discusses using multiple predictions from a base method like neural networks, ARIMA or Holt-Winters to improve accuracy. Short-term enhancement uses support vector regression on statistic and reliability features of the multiple predictions to enhance 1-step ahead predictions. Long-term enhancement trains additional models on the short-term predictions to enhance longer-horizon predictions. The framework is evaluated on traffic flow data with prediction horizons of 1 week and 13 weeks.
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptxEduSkills OECD
Iván Bornacelly, Policy Analyst at the OECD Centre for Skills, OECD, presents at the webinar 'Tackling job market gaps with a skills-first approach' on 12 June 2024
Main Java[All of the Base Concepts}.docxadhitya5119
This is part 1 of my Java Learning Journey. This Contains Custom methods, classes, constructors, packages, multithreading , try- catch block, finally block and more.
This document provides an overview of wound healing, its functions, stages, mechanisms, factors affecting it, and complications.
A wound is a break in the integrity of the skin or tissues, which may be associated with disruption of the structure and function.
Healing is the body’s response to injury in an attempt to restore normal structure and functions.
Healing can occur in two ways: Regeneration and Repair
There are 4 phases of wound healing: hemostasis, inflammation, proliferation, and remodeling. This document also describes the mechanism of wound healing. Factors that affect healing include infection, uncontrolled diabetes, poor nutrition, age, anemia, the presence of foreign bodies, etc.
Complications of wound healing like infection, hyperpigmentation of scar, contractures, and keloid formation.
Reimagining Your Library Space: How to Increase the Vibes in Your Library No ...Diana Rendina
Librarians are leading the way in creating future-ready citizens – now we need to update our spaces to match. In this session, attendees will get inspiration for transforming their library spaces. You’ll learn how to survey students and patrons, create a focus group, and use design thinking to brainstorm ideas for your space. We’ll discuss budget friendly ways to change your space as well as how to find funding. No matter where you’re at, you’ll find ideas for reimagining your space in this session.
हिंदी वर्णमाला पीपीटी, hindi alphabet PPT presentation, hindi varnamala PPT, Hindi Varnamala pdf, हिंदी स्वर, हिंदी व्यंजन, sikhiye hindi varnmala, dr. mulla adam ali, hindi language and literature, hindi alphabet with drawing, hindi alphabet pdf, hindi varnamala for childrens, hindi language, hindi varnamala practice for kids, https://www.drmullaadamali.com
This presentation includes basic of PCOS their pathology and treatment and also Ayurveda correlation of PCOS and Ayurvedic line of treatment mentioned in classics.
A review of the growth of the Israel Genealogy Research Association Database Collection for the last 12 months. Our collection is now passed the 3 million mark and still growing. See which archives have contributed the most. See the different types of records we have, and which years have had records added. You can also see what we have for the future.
How to Fix the Import Error in the Odoo 17Celine George
An import error occurs when a program fails to import a module or library, disrupting its execution. In languages like Python, this issue arises when the specified module cannot be found or accessed, hindering the program's functionality. Resolving import errors is crucial for maintaining smooth software operation and uninterrupted development processes.
This slide is special for master students (MIBS & MIFB) in UUM. Also useful for readers who are interested in the topic of contemporary Islamic banking.
3. Why?
Why did the apple fall down instead of going up?
Why does the average temperature rise?
Why did the stock market fall?
Why did a post go viral on Facebook?
CAUSE-EFFECT RELATIONSHIPS
4. Cause-Effect Relationships
◦ Causality is the relation between two events, a cause and an effect, where the effect occurs as a consequence of the cause.
◦ The Effect answers "What happened?" and the Cause answers "Why did it happen?"
◦ e.g. In the case of global warming, the increase in greenhouse gases is the cause and the increase in average temperature is the effect.[1]
5. Characteristics of Causal Relationships
◦ Temporal Precedence: the cause occurs prior to the effect. e.g. A person must smoke first, and only then does he get lung cancer.
◦ Co-occurrence: whenever the cause happens, the effect must also happen; the cause cannot be isolated from the effect. e.g. Whenever there is a net force on a body, it will accelerate.
Is causality the same as association, then?
6. Correlation vs. Causation
◦ Correlation does not imply causation.
◦ Correlation only means that two events co-occur more often than ordinary chance would suggest.[2]
7. Fields of Study
◦ Physics
◦ Econometrics: web metrics, stock prices, sales (all time series)
◦ Medicine: experiment results, gene sequences (sequential data), brain signals (time series)
◦ Climate Science: weather conditions (spatio-temporal or temporal data)
HOW TO DETECT CAUSALITY?
9. Control Experimentation
Aim: to find out what happens to a system when you interfere with it.
◦ Divide subjects randomly into two groups: test and control.
◦ Introduce X only in the test group and observe Y in both.
◦ If X causes Y:
P(Y = y | do(X)) > P(Y = y | ¬do(X))
implies causality.
10. Disadvantages of Control Experimentation
◦ It is not always possible to carry out the experiment.
◦ Most time series data cannot be manipulated, e.g. climate or stock data.
◦ We therefore have to resort to statistical methods to determine causality.
HOW TO DO IT IN TIME SERIES?
11. Time Series
◦ A time series is a sequence of data points, typically measured at successive points in time spaced at uniform intervals.
12. Granger Causality
◦ Also known as predictive causality.
◦ Granger proposed that causality could be reflected by measuring the ability to predict the future values of a time series using the past values of another time series.
◦ Two main principles:
The cause must occur before the effect.
The cause can be used to predict the effect, i.e. the cause has some unique information about the future values of the effect.
13. Granger Causality
Suppose X and Y are two time series. For X to cause Y:
P[Y(t+1) | Γ(t)] ≠ P[Y(t+1) | Γ_{−X}(t)]
where Γ(t) denotes the "information in the universe up to time t" and Γ_{−X}(t) the "information in an alternate universe up to time t in which X is excluded".
14. Performing the Granger Causality Test
◦ Model 1 (full), estimating E(Y | Y_{t−k}, X_{t−k}): regress Y on the past values of both X and Y:
Y_t = Σ_{j=1}^{m} α_j Y_{t−j} + Σ_{i=1}^{n} β_i X_{t−i} + D_t + ε_t
◦ Model 2 (restricted), estimating E(Y | Y_{t−k}): regress Y on its own past values only:
Y_t = Σ_{j=1}^{m} α_j Y_{t−j} + D_t + ε_t
◦ Check whether the prediction accuracy has significantly increased by performing an F-test on the β coefficients, as illustrated below.[11]
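As a concrete illustration, here is a minimal numpy/scipy sketch of this two-model F-test. The function name granger_f_test and its layout are our own, not from the slides; the statsmodels library also provides a ready-made grangercausalitytests for the same check.

import numpy as np
from scipy import stats

def granger_f_test(x, y, lags):
    """F-test for H0: the beta coefficients on the lags of x are all zero."""
    T = len(y)
    n = T - lags                       # usable samples after lagging
    Y = y[lags:]
    # restricted design (Model 2): intercept plus own lags of y
    Xr = np.column_stack([np.ones(n)] + [y[lags - j:T - j] for j in range(1, lags + 1)])
    # full design (Model 1): additionally the lags of x
    Xf = np.column_stack([Xr] + [x[lags - j:T - j] for j in range(1, lags + 1)])
    rss = lambda M: np.sum((Y - M @ np.linalg.lstsq(M, Y, rcond=None)[0]) ** 2)
    rss_r, rss_f = rss(Xr), rss(Xf)
    df1, df2 = lags, n - Xf.shape[1]   # restrictions; residual degrees of freedom
    F = ((rss_r - rss_f) / df1) / (rss_f / df2)
    return F, stats.f.sf(F, df1, df2)  # F statistic and p-value

A small p-value means that adding the lags of x significantly improves the prediction of y, i.e. x Granger-causes y at the chosen lag order.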
15. Granger Causality
• CONS
It does not take into account the effect of hidden common causes (confounders).
It assumes that all relationships are linear in nature and does not account for non-linear dependencies.
HOW TO DEAL WITH MULTIPLE TIME SERIES?
16. Relationship Graphs in Time Series
Extending the concept of Granger causality to multiple time series.
17. Relationship Graphs
◦ A relationship graph has all the time series as nodes, and an edge between two nodes denotes the direction of the relationship between them.
◦ Input:
Matrix X of time series.
Xlag, the lagged version of the time series matrix X.
◦ Output:
A relationship graph over the time series with nodes xi, containing an edge from xi to xj if xi causes xj.
[diagram: xi → xj]
18. Exhaustive Graphical Granger Method
◦ Algorithm (a sketch follows below):
For every pair of nodes (xi, xj), perform the following:
Insert an edge xi → xj if Granger(xi, xj, Xlag) = 'yes' and Granger(xj, xi, Xlag) = 'no'.
Insert an edge xi ← xj if Granger(xi, xj, Xlag) = 'no' and Granger(xj, xi, Xlag) = 'yes'.
Insert an edge xi ↔ xj if Granger(xi, xj, Xlag) = 'yes' and Granger(xj, xi, Xlag) = 'yes'.
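A minimal sketch of the exhaustive method, reusing the granger_f_test helper defined above; the 0.05 significance level is an assumed choice, not fixed by the slides.

def exhaustive_graphical_granger(X, lags, sig=0.05):
    """X: (P, N) array of N time series over P time stamps.
    Returns a list of directed edges (i, j) meaning x_i -> x_j."""
    P, N = X.shape
    edges = []
    for i in range(N):
        for j in range(N):
            if i == j:
                continue
            _, p = granger_f_test(X[:, i], X[:, j], lags)  # does x_i help predict x_j?
            if p < sig:
                edges.append((i, j))
    return edges

The three cases on the slide (→, ←, ↔) correspond to which of the two directed tests succeed: a bidirectional edge xi ↔ xj simply shows up as both (i, j) and (j, i) in the returned list.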
19. Exhaustive Graphical Granger Method
◦ Complexity
With N time series, T lags each, and P time stamps (sample size), the complexity is O(N²P²T²).
◦ Shortcomings
It does not consider the effect of the other time series (each test is purely pairwise).
It is computationally expensive.
20. The LASSO-Granger Method
◦ Uses variable selection for causality detection.
◦ The aim is to identify the subset of time series on which xi is conditionally dependent, and at which lags.
◦ This is achieved by applying variable selection over the set of time series and their lags.
◦ Variable selection is done by LASSO (Least Absolute Shrinkage and Selection Operator).
21. LASSO
◦ A variable selection method for linear regression.
◦ Selects a subset of variables by solving:
ŵ = argmin_w (1/n) Σ_i (w · x_i − y_i)² + λ‖w‖₁
Here w is the vector of coefficients and y is the variable to be predicted.
◦ The aim is to minimize the OLS error plus the sum of absolute coefficients, which prevents overfitting and drives many coefficients exactly to zero (see the small demo below).
◦ LARS (Least Angle Regression) is an efficient method for computing the LASSO solution.
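As a quick illustration of this sparsity, the following sketch (hypothetical data; scikit-learn's Lasso implements the objective above up to a constant factor) shows LASSO zeroing out the coefficient of an irrelevant variable:

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)                     # irrelevant variable
y = 2.0 * x1 + 0.1 * rng.normal(size=200)     # y depends only on x1

model = Lasso(alpha=0.1).fit(np.column_stack([x1, x2]), y)
print(model.coef_)                            # coefficient on x2 shrinks to (near) zero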
22. LARS (Least Angle Regression)
Step 1: Start with û₀ = 0.
Step 2: The residual ŷ₂ − û₀ has a greater correlation with x1 than with x2.
24. LARS (Least Angle Regression)
Step 4: The first LARS estimate is û₁ = û₀ + γ̂·x1 (for a suitable step size γ̂), where the residual ŷ₂ − û₁ has equal correlation with both x1 and x2.
26. The Lasso-Granger Method
◦ Algorithm (a sketch follows below):
Obtain Xlag, the lagged version of the time series matrix X.
For each xi in X:
y = xi
Perform LASSO(y, Xlag).
Wi := the set of time series for which the coefficients returned by LASSO are non-zero.
Add an edge (xj, xi) to the graph for every xj in Wi.
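A minimal sketch of this algorithm, assuming the data is arranged as a (time stamps × series) array; the function name, the penalty alpha, and the exact-zero edge criterion are illustrative choices, not fixed by the slides.

import numpy as np
from sklearn.linear_model import Lasso

def lasso_granger(X, max_lag, alpha=0.05):
    """X: (P, N) array of P time stamps for N series.
    Returns boolean adjacency A with A[j, i] = True meaning x_j -> x_i."""
    P, N = X.shape
    # lagged design matrix Xlag: one column per (series j, lag k) pair
    Xlag = np.column_stack([X[max_lag - k:P - k, j]
                            for j in range(N) for k in range(1, max_lag + 1)])
    A = np.zeros((N, N), dtype=bool)
    for i in range(N):
        y = X[max_lag:, i]
        coef = Lasso(alpha=alpha).fit(Xlag, y).coef_.reshape(N, max_lag)
        A[:, i] = np.abs(coef).max(axis=1) > 0  # edge if any lag coefficient survives
    return A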
27. The Lasso-Granger Method
◦ Complexity
Using LARS to solve the LASSO problem: O(PN²T²).
◦ Pros
Computationally less expensive than the exhaustive method.
Can be used when the number of series is large compared to the number of data points.
Consistency: the probability that LASSO falsely includes a non-neighboring feature in a neighborhood is very small, even when the number of features is very large.
28. Forward-Backward Granger Causality
◦ An improvement on the LASSO-Granger algorithm.
◦ Inspired by physics.
◦ Principle: reverse time, and all the relationships must remain the same except for a change of direction, i.e. if xi causes xj with a time lag of k, then in reversed time xj will cause xi with lag k.
◦ Apply LASSO-Granger on both the forward and the time-reversed series and combine the two results (a sketch follows below).
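Under that principle, a minimal sketch on top of the lasso_granger helper above. The combination rule used here (OR of the forward edges with the transposed backward edges) is a simplifying assumption; the FBLG paper [7] combines the two runs more carefully.

def forward_backward_granger(X, max_lag, alpha=0.05):
    A_fwd = lasso_granger(X, max_lag, alpha)         # forward time
    A_bwd = lasso_granger(X[::-1], max_lag, alpha)   # time-reversed series
    # an edge j -> i found in reversed time corresponds to i -> j forward
    return A_fwd | A_bwd.T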
30. Brain Imaging
◦ How do different portions of the brain affect one another?
Identify the direction and order of influence.
◦ Apply Granger causality to obtain the relationships between different components of the brain:
Obtain fMRI data from the brain corresponding to a stimulus and divide it into independent components corresponding to different sections of the brain.
Each independent component corresponds to a time series.
Apply the exhaustive Granger test to obtain the relationships between the time series.
31. Brain Imaging
◦ Advantages:
No prior assumptions about the nodes and their inter-connections.
Measures not only the connections but also the time lags of the interactions.
Can work with a large number of regions.
32. Mining Topics Based on Causality
◦ Iteratively identifies the topics in text that are causally related to non-textual time series data.
◦ InCaToMi (Integrative Causal Topic Miner)
◦ Architecture [diagram]: Text Data feeds the Topic Modelling Module, whose topic time series feed the Causality Module, which sends Feedback back to the Topic Modelling Module.
33. InCaToMi: Integrative Causal Topic Miner
◦ Topic Modelling Module:
Takes the text and the number of topics as input.
Creates topics based on word probabilities, and the likelihood of each topic in the document, using the PLSA algorithm.
The time series of a topic is formed by summing the likelihood of each word in the topic for each day.
34. InCaToMi: Integrative Causal Topic Miner
◦ Causality Module:
Performs the Granger causality test between the non-textual series and the time series of each topic, and of each word in the topic.
Forms a new candidate topic by selecting the words that are most causally related to the non-textual series.
Uses this as the prior for the next round of topic modelling.
35. Anomaly Detection
◦ Types of anomalies:
Univariate anomalies
Dependency anomalies
◦ Given two sets of data sequences, A (training) and B (test), each containing p time series, find the data points in B that significantly deviate from the normal pattern of the data.
◦ An algorithm for finding dependency anomalies follows.
36. Anomaly Detection
The detection pipeline has three steps:
◦ Step 1: Learn temporal causal graphs by regularization.
◦ Step 2: Compute an anomaly score using the Kullback-Leibler (KL) divergence.
◦ Step 3: Determine anomalies by specifying a threshold and finding the underlying causes.
Hypothesis: the causal graphs of both A and B remain the same.
37. Anomaly Detection
Step 1: Learning temporal causal graphs by regularization.
◦ Calculate the causal graph for A by the Lasso-Granger method.
◦ When finding the causal graph for B, apply additional constraints. This can be done using two methods:
a) Neighborhood similarity: impose the additional constraint that the coefficients β(A) and β(B) should share the same sparsity pattern, i.e. be zero or non-zero in the same places. Here β(A) and β(B) are the coefficients obtained by running Lasso-Granger on set A and set B respectively.
b) Coefficient similarity: constrain the coefficients β(A) and β(B) to be similar.
38. Anomaly Detection
Step 2: Computing the anomaly score using KL divergence.
◦ The KL divergence is a measure of how much one probability distribution differs from another.
◦ Obtain the distributions for the two time series; the anomaly score is calculated using the KL formula (a sketch follows below).
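A minimal sketch of such a score: binning the two samples into histograms and applying scipy's entropy (which computes the KL divergence when given two distributions) is our simplification, not the exact formula of [10].

import numpy as np
from scipy.stats import entropy

def kl_anomaly_score(ref, test, bins=30):
    """KL divergence between histogram estimates of two samples."""
    lo = min(ref.min(), test.min())
    hi = max(ref.max(), test.max())
    p, _ = np.histogram(ref, bins=bins, range=(lo, hi), density=True)
    q, _ = np.histogram(test, bins=bins, range=(lo, hi), density=True)
    eps = 1e-12                       # avoid division by zero in empty bins
    return entropy(p + eps, q + eps)  # KL(p || q); entropy normalizes the inputs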
39. Anomaly Detection
Step 3: Determining anomalies by thresholding.
• To set a threshold, we calculate how a normal time series would score on the anomaly measure.
• We slide a window through the reference data and calculate the anomaly score for each window.
• We then use these scores to approximate the distribution of anomaly scores that a normal time series should have.
• Given a significance level α, we set the α-quantile of that distribution as the threshold cutoff (a sketch follows below).
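A minimal sketch of this thresholding step, reusing kl_anomaly_score from above; the window length and the upper-tail quantile convention are assumptions, since the slides do not fix them.

import numpy as np

def anomaly_threshold(reference, window=50, alpha=0.01):
    """Slide a window over the reference series and take a quantile of the scores."""
    scores = [kl_anomaly_score(reference, reference[s:s + window])
              for s in range(len(reference) - window)]
    # with significance level alpha, flag scores beyond the (1 - alpha) quantile
    return float(np.quantile(scores, 1.0 - alpha))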
40. Conclusion
◦ The widespread application of causal relationships motivates this study.
◦ The approach is completely data driven, so it provides a new outlook in every field without making strong prior assumptions.
◦ Further scope:
Applying the models to other domains, e.g. climate and social media.
Predicting anomalous behavior.
41. References
[1] Lashof, Daniel A., and Dilip R. Ahuja. "Relative contributions of greenhouse gas emissions to global warming." (1990): 529-531.
[2] Perry, Ronen. "Correlation versus Causality: Further Thoughts on the Law Review/Law School Liaison." Conn. L. Rev. 39 (2006): 77.
[3] Diks, Cees, and Valentyn Panchenko. Modified Hiemstra-Jones test for Granger non-causality. No. 192. Society for Computational Economics, 2004.
[4] Granger, Clive W. J. "Investigating causal relations by econometric models and cross-spectral methods." Econometrica: Journal of the Econometric Society (1969): 424-438.
[5] Arnold, Andrew, Yan Liu, and Naoki Abe. "Temporal causal modeling with graphical Granger methods." Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2007.
[6] Tibshirani, Robert. "Regression shrinkage and selection via the lasso." Journal of the Royal Statistical Society, Series B (Methodological) (1996): 267-288.
[7] Cheng, Dehua, Mohammad Taha Bahadori, and Yan Liu. "FBLG: a simple and effective approach for temporal dependence discovery from time series data." Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2014.
[8] Smith, Delmas, Iwabuchi, Kirk. "Demonstrating causal links between fMRI time series using time-lagged correlation."
[9] Kim, Hyun Duk, et al. "InCaToMi: Integrative causal topic miner between textual and non-textual time series data." Proceedings of the 21st ACM International Conference on Information and Knowledge Management. ACM, 2012.
[10] Qiu, Liu, Subrahmanya, et al. "Granger Causality for Time-Series Anomaly Detection." Proceedings of the 12th IEEE International Conference on Data Mining, 2012.
[11] Lomax, Richard G. Statistical Concepts: A Second Course (2007), p. 10.