Causal inference is a branch of applied statistics that seeks to identify causal connections between phenomena. It is worth having in a data analyst's toolset alongside hypothesis testing and machine learning methods.
2. Problem with Traditional Regression
Y = k * X + b  ⇨  X = (1/k) * Y - b/k
Which variable is the cause and which is the effect here?
We cannot tell without knowing the context.
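A minimal sketch of this symmetry in Python with NumPy. The data are simulated under an assumed process in which X causes Y; the values k = 2 and b = 1, the noise level, and all variable names are illustrative choices, not taken from the slides. Both regression directions fit the data equally well, so the fit alone does not reveal which variable is the cause.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed data-generating process (illustrative): X causes Y.
k_true, b_true = 2.0, 1.0
x = rng.normal(size=1000)
y = k_true * x + b_true + rng.normal(scale=0.5, size=1000)

# Ordinary least squares in both directions.
# np.polyfit returns coefficients from highest degree down: [slope, intercept].
k_yx, b_yx = np.polyfit(x, y, deg=1)  # Y regressed on X
k_xy, b_xy = np.polyfit(y, x, deg=1)  # X regressed on Y


def r_squared(pred, target):
    """Share of the variance of `target` explained by `pred`."""
    ss_res = np.sum((target - pred) ** 2)
    ss_tot = np.sum((target - target.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


r2_yx = r_squared(k_yx * x + b_yx, y)
r2_xy = r_squared(k_xy * y + b_xy, x)

# For simple linear regression, R^2 equals the squared correlation,
# which is symmetric in X and Y.
r2_corr = np.corrcoef(x, y)[0, 1] ** 2

print(f"R^2 (Y on X): {r2_yx:.4f}")
print(f"R^2 (X on Y): {r2_xy:.4f}")
```

With noisy data the fitted reverse slope is not exactly 1/k, but the goodness of fit is identical in both directions, which is the point of the slide.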
3. Types of Variables
1. Continuous, e.g. blood glucose level, 0-30 mmol/l.
2. Categorical (factor), e.g. patient’s country of origin, with a
finite number of levels.
3. Continuous made categorical artificially, i.e. a numerical
range divided into subgroups, e.g. blood glucose level “low”,
“normal”, “high”.
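The third type can be sketched in Python with NumPy as follows. The sample values and the cut-off points (3.9 and 7.0 mmol/l) are hypothetical, chosen only to illustrate the binning; they are not from the slides and are not clinical thresholds.

```python
import numpy as np

# Hypothetical continuous measurements, in mmol/l.
glucose = np.array([2.8, 4.5, 5.2, 7.9, 12.4])

# Hypothetical cut-offs (illustrative only, not clinical guidance).
LOW_BELOW = 3.9   # values below this are labelled "low"
HIGH_FROM = 7.0   # values at or above this are labelled "high"

labels = np.array(["low", "normal", "high"])

# np.digitize returns, for each value, the index of the bin it falls into:
# 0 for x < 3.9, 1 for 3.9 <= x < 7.0, 2 for x >= 7.0.
bin_index = np.digitize(glucose, bins=[LOW_BELOW, HIGH_FROM])
glucose_cat = labels[bin_index]

print(list(zip(glucose.tolist(), glucose_cat.tolist())))
```

Binning like this discards information: 7.9 and 12.4 both become "high", so the choice of cut-offs should be justified by the subject matter rather than by convenience.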
14. Recommended Books
1. The Book of Why by Judea Pearl and Dana Mackenzie
2. Causal Inference in Statistics: A Primer by Judea Pearl,
Madelyn Glymour and Nicholas P. Jewell + Solution Manual
3. Mostly Harmless Econometrics by Joshua D. Angrist
and Jörn-Steffen Pischke
4. R packages: dagitty (structural causal models)
and lavaan (structural equation modeling).