SlideShare a Scribd company logo
1 of 64
Download to read offline
Causal Networks: Learning and Inference
Fabio Stella and Luca Bernardinello
University of Milan-Bicocca
Department of Informatics, Systems and Communication
2019 September 23rd – 27th
The 100,000
Genomes Project
▪ complete sequencing of
70,000 individuals
▪ 21 PetaBytes of data
▪ 1 PetaByte of music requires
about 2,000 years to be
listened
The complexity increases with
digital data:
▪ Electronic Health Record
▪ Life style
▪ Nutrition
▪ Job type
▪ Geographical
▪ Social networks
Smart Cities
▪ Automate
▪ Control
▪ See real-time data
▪ Home automation
▪ Environment monitoring
▪ Medical and healthcare
▪ Smart transportation
▪ Smart manufacturing
▪ Energy resource
management
Large Hadron Collider
▪ 2017 June 29 at CERN, 200
PB of data permanently
stored.
▪ 1 billion collisions per second
happening at ATLAS, CMS,
ALICE, and LHCb.
According to [1],
• Big Data and Data Science are being used as buzzwords and are composites of
many concepts.
• Big data appears frequently in the press and academic journals.
• In the last five years; birth and growth of many data science programs in academia.
• In 2012, the White House Office of Science and Technology Policy announced the
Big Data Research and Development Initiative that builds upon federal initiatives
ranging from
✓ computer architecture and networking technologies,
✓ algorithms,
✓ data management,
✓ artificial intelligence,
✓ machine learning,
✓ advanced cyber-infrastructure.
[1] Brady H. E., Annual Review of Political Science, 22 (2019) 297.
Big Data 5 V’s
Volume; the amount of
data to be stored and
managed.
▪ It is probably the first
thing people think of
when they hear the
phrase big data.
▪ We typically talk about
big data in the case
where ExaBytes or
PetaBytes have to be
stored and managed.
Big Data 5 V’s
Velocity; defines the
speed of increase in big
data volume and its
relative accessibility.
▪ This dimension helps
organizations to
realize the relative
growth of their big
data and how quickly
that data reaches
sourcing users,
applications and
systems.
Big Data 5 V’s
Variety; we no longer have
control over the data
format, i.e. we must
manage heterogeneous
data, with different formats,
semantics and structures.
▪ In other words we could
be asked to combine:
✓ pure text,
✓ photo,
✓ audio,
✓ video,
✓ web,
✓ GPS data,
✓ sensor data,
✓ relational data bases,
✓ and documents
to answer a given question.
Big Data 5 V’s
Veracity; how accurate or
truthful a data set may be.
▪ In the context of big
data, however, it takes
on a bit more meaning.
▪ When it comes to the
accuracy of big data,
it’s not just the quality
of the data itself but
how trustworthy the
data source, type, and
processing of it is.
▪ Removing things like
bias, abnormalities or
inconsistencies,
duplication, and
volatility are just a few
aspects that factor into
improving the accuracy
of big data.
Big Data 5 V’s
Value; the worth of the data
being extracted.
▪ Having endless amounts
of data is one thing, but
unless it can be turned
into value it is useless.
▪ While there is a clear
link between data and
insights, this does not
always mean there is
value in Big Data.
▪ The most important part
of embarking on a big
data initiative is to
understand the costs
and benefits of collecting
and analyzing the data
to ensure that ultimately
the data that is reaped
can be monetized.
Big Data 5 V’s
Artificial Intelligence;
intelligence demonstrated by
machines, in contrast to the
natural intelligence displayed
by humans.
▪ Colloquially, the term
Artificial Intelligence (AI) is
used to describe
machines/computers that
mimic “cognitive” functions
that humans associate
with other human minds,
such as "learning" and
"problem solving".
▪ Two kinds of AI:
✓ Weak
✓ Strong
Artificial Intelligence
Bayesian Networks
▪ A type of statistical
model that represents a
set of variables and their
conditional
dependencies via a
directed acyclic graph
(DAG).
▪ Bayesian networks are
ideal for taking an event
that occurred and
predicting the likelihood
that any one of several
possible known causes
was the contributing
factor.
▪ Structural Causal
Models.
Artificial Intelligence
Knowledge Graphs
▪ A knowledge base used
by Google and its
services to enhance its
search engine's results
with information gathered
from a variety of sources.
▪ The information is
presented to users in an
infobox.
Artificial Intelligence
Knowledge Graphs
▪ A knowledge base used
by Google and its
services to enhance its
search engine's results
with information gathered
from a variety of sources.
▪ The information is
presented to users in an
infobox.
Machine Learning;
algorithms and statistical
models that computer
systems use in order to
perform a specific task
effectively without using
explicit instructions, relying
on patterns and inference
instead.
Three kinds of ML:
▪ Supervised
▪ Self-Supervised
▪ Reinforcement Learning
Machine Learning
Supervised
▪ Classification
SETOSA
VERSICOLOR
VIRGINICA
Machine Learning
Supervised
▪ Curve fitting
▪ Surface fitting
Machine Learning
Self-Supervised
▪ Recommendation
System
▪ Market Basket Analysis
▪ Social Network Analysis
Machine Learning
Reinforcement
Learning
▪ Learn by interacting
with the environment
▪ The environment reacts
to our decisions/actions
▪ Sequential learning,
only at the end of the
game we know our
performance
(reward/punishment)
Deep Learning;
is part of a broader family
of machine learning
methods based on
Artificial Neural
Networks.
Three kinds of DL:
▪ Supervised
▪ Self-Supervised
▪ Reinforcement Learning
Deep Learning
Feedforward
Neural Networks
▪ The first and simplest
type of artificial neural
network devised.
▪ The information moves
in only one direction,
forward, from the input
nodes, through the
hidden nodes (if any)
and to the output
nodes.
▪ There are no cycles or
loops in the network.
8 pixels
8pixels
X1
X64
Y0
Y9
…
…
…
…
Deep Learning
Convolutional
Networks
▪ Regularized versions of
multilayer perceptrons
which are fully
connected and thus
prone to overfitting the
data.
▪ Regularization by
adding some form of
magnitude
measurement of weights
to the loss function.
▪ Different approach
towards regularization:
take advantage of the
hierarchical pattern in
data and assemble
more complex patterns
using smaller and
simpler patterns.
Deep Learning
LSTM Networks
▪ An artificial Recurrent
Neural Network
architecture.
▪ Unlike standard
feedforward neural
networks, LSTM has
feedback connections
that make it a "general
purpose computer" (it
can compute anything
that a Turing machine
can).
▪ LSTM started to
revolutionize speech
recognition,
outperforming traditional
models in certain
speech applications.
Data Science
▪ Applied Mathematics
▪ Computer Science
▪ Neuroscience
▪ Engineering
▪ Statistics
▪ Biology
▪ Physics
Data science is the application of computational and statistical
techniques to address or gain insight into some problem in the
real world (J. Zico Kolter, Carnegie Mellon University, 2018)
Data science is an extraordinary multidisciplinary and
interdisciplinary challenge to apply the scientific method
empowered by
• data,
• models,
• computational resources,
• open source software, and
… the most sophisticated technology we have today, i.e. the
human being.
Machine
Learning
▪ Many different names
for learning
But most of machine learning
nowadays
is just curve fitting
REINFORCEMENT
LEARNING
Machine
Learning
▪ Curve fitting
(correlations) - linear
Machine
Learning
▪ Curve fitting - nonlinear
Machine
Learning
▪ Curve fitting -
multidimensional
Machine
Learning
▪ Spurious Correlations
Spurious Correlations
Fitting can be highly misleading
Fitting does not
give us any
understanding
“The vision systems of the eagle and the snake outperform
everything that we can make in the laboratory, but snakes
and eagles cannot build an eyeglass or a telescope or a
microscope."
— Judea Pearl
What can truly
be achieved?
▪ No matter how much data
you collect, the question
can not be answered
when using the data
alone
Does Exercise affect
Cholesterol?
Simpson’s Paradox
CHOLESTEROLEXERCISE
AGE
Big Data and the
data-fusion
problem
▪ Piecing together multiple
datasets collected under
heterogeneous
conditions (i.e., different
populations, regimes,
and sampling methods)
to obtain valid answers
to queries of interest.
▪ The availability of
multiple heterogeneous
datasets presents new
opportunities to big data
analysts, because the
knowledge that can be
acquired from combined
data would not be
possible from any
individual source alone.
Causation is the key
▪ What we (computers so far)
miss is causal knowledge
▪ For predicting the
consequences of actions
▪ Causation for us is a
synonym of understanding
▪ Actual intelligence
needs causal
knowledge
▪ But causal knowledge
is not in the data!
Introduction to Causal Inference
Why study
causation?
▪ To make sense of data
• effect of smoking on lung
cancer?
• effect of education on
salaries?
• effect of carbon emissions on
the climate?
▪ To understand how we
have an effect
• malaria caused by mosquitos
or by mal-air?
Why study
causation?
▪ To make sense of data
• effect of smoking on lung
cancer?
• effect of education on
salaries?
• effect of carbon emissions on
the climate?
▪ To understand how we
have an effect
• malaria caused by mosquitos
or by mal-air?
▪ To guide actions and
policies
• pack mosquito nets or use
breathing masks?
• reduce CO2 emissions?
• have a degree?
• stop smoking?
Why study causation
separately from
statistics (machine
learning)?
Why study causation separately from statistics
(machine learning)?
▪ What can causation tell us that ML doesn’t?
• Causation is not an aspect of ML
• It is an addition to ML
▪ Causation uncovers facts that ML cannot
• None of the previous problems can be articulated in the language of ML
▪ Let us introduce these aspects with a famous statistical puzzle …
Simpson’s Paradox
▪ Named after Edward Simpson
(born 1922), statistician
▪ A group of sick patients are
given the option to try a new
drug
▪ Among those who took the
drug, a lower percentage
recovered than among those
who did not
▪ However, when we partition
by gender, we see that:
▪ more men taking the drug
recover than do men are not
taking the drug, and
▪ more women taking the
drug recover than do
women are not taking the
drug!
The drug appears to help
men and women,
but hurt the general
population
Simpson’s Paradox
▪ Named after Edward Simpson
(born 1922), statistician
▪ A group of sick patients are
given the option to try a new
drug
▪ Among those who took the
drug, a lower percentage
recovered than among those
who did not
▪ However, when we partition
by gender, we see that:
▪ more men taking the drug
recover than do men are not
taking the drug, and
▪ more women taking the
drug recover than do
women are not taking the
drug!
patients recovered % recovered patients recovered % recovered
Men 87 81 93% 270 234 87%
Women 263 192 73% 80 55 69%
Combined data 350 273 78% 350 289 83%
Table 1.1 Results of a study into a new drug, with gender being taken into account
Drug No Drug
Example 1.2.1 We record the recovery rates of 700 patients who were given
access to the drug. A total of 350 patients chose to take the drug and 350
patients did not. The results of the study are shown in Table 1.1.
Example 1.2.1 We record the recovery rates of 700 patients who were given
access to the drug. A total of 350 patients chose to take the drug and 350
patients did not. The results of the study are shown in Table 1.1.
Drug vs non-drug takers recovery rates:
▪ 93% vs 87% male
▪ 73% vs 69% female
▪ 78% vs 83% general population!
Should a doctor prescribe the drug; to whom?
Should a policy maker approve the drug for use?
Simpson’s Paradox
▪ Named after Edward Simpson
(born 1922), statistician
▪ A group of sick patients are
given the option to try a new
drug
▪ Among those who took the
drug, a lower percentage
recovered than among those
who did not
▪ However, when we partition
by gender, we see that:
▪ more men taking the drug
recover than do men are not
taking the drug, and
▪ more women taking the
drug recover than do
women are not taking the
drug!
patients recovered % recovered patients recovered % recovered
Men 87 81 93% 270 234 87%
Women 263 192 73% 80 55 69%
Combined data 350 273 78% 350 289 83%
Table 1.1 Results of a study into a new drug, with gender being taken into account
Drug No Drug
Example 1.2.1 We record the recovery rates of 700 patients who were given
access to the drug. A total of 350 patients chose to take the drug and 350
patients did not. The results of the study are shown in Table 1.1.
Drug vs non-drug takers recovery rates:
▪ 93% vs 87% male
▪ 73% vs 69% female
▪ 78% vs 83% general population!
Should a doctor prescribe the drug; to whom?
Should a policy maker approve the drug for use?
Simpson’s Paradox
▪ Named after Edward Simpson
(born 1922), statistician
▪ A group of sick patients are
given the option to try a new
drug
▪ Among those who took the
drug, a lower percentage
recovered than among those
who did not
▪ However, when we partition
by gender, we see that:
▪ more men taking the drug
recover than do men are not
taking the drug, and
▪ more women taking the
drug recover than do
women are not taking the
drug!
patients recovered % recovered patients recovered % recovered
Men 87 81 93% 270 234 87%
Women 263 192 73% 80 55 69%
Combined data 350 273 78% 350 289 83%
Table 1.1 Results of a study into a new drug, with gender being taken into account
Drug No Drug
Example 1.2.1 We record the recovery rates of 700 patients who were given
access to the drug. A total of 350 patients chose to take the drug and 350
patients did not. The results of the study are shown in Table 1.1.
Drug vs non-drug takers recovery rates:
▪ 93% vs 87% male
▪ 73% vs 69% female
▪ 78% vs 83% general population!
Should a doctor prescribe the drug; to whom?
Should a policy maker approve the drug for use?
Simpson’s Paradox
▪ Named after Edward Simpson
(born 1922), statistician
▪ A group of sick patients are
given the option to try a new
drug
▪ Among those who took the
drug, a lower percentage
recovered than among those
who did not
▪ However, when we partition
by gender, we see that:
▪ more men taking the drug
recover than do men are not
taking the drug, and
▪ more women taking the
drug recover than do
women are not taking the
drug!
patients recovered % recovered patients recovered % recovered
Men 87 81 93% 270 234 87%
Women 263 192 73% 80 55 69%
Combined data 350 273 78% 350 289 83%
Table 1.1 Results of a study into a new drug, with gender being taken into account
Drug No Drug
Example 1.2.1 We record the recovery rates of 700 patients who were given
access to the drug. A total of 350 patients chose to take the drug and 350
patients did not. The results of the study are shown in Table 1.1.
Drug vs non-drug takers recovery rates:
▪ 93% vs 87% male
▪ 73% vs 69% female
▪ 78% vs 83% general population!
Should a doctor prescribe the drug; to whom?
Should a policy maker approve the drug for use?
Simpson’s Paradox
▪ Named after Edward Simpson
(born 1922), statistician
▪ A group of sick patients are
given the option to try a new
drug
▪ Among those who took the
drug, a lower percentage
recovered than among those
who did not
▪ However, when we partition
by gender, we see that:
▪ more men taking the drug
recover than do men are not
taking the drug, and
▪ more women taking the
drug recover than do
women are not taking the
drug!
patients recovered % recovered patients recovered % recovered
Men 87 81 93% 270 234 87%
Women 263 192 73% 80 55 69%
Combined data 350 273 78% 350 289 83%
Table 1.1 Results of a study into a new drug, with gender being taken into account
Drug No Drug
Understand the
causal story behind
the data
▪ What mechanism generated
the data?
▪ Suppose: estrogen has a
negative effect on recovery
▪ Women less likely to recover
than men, regardless of the
drug
From the data:
Conclusion: the drug appears to be harmful but it is not
▪ If we select a drug taker at random, that person is more likely to be a woman
▪ Hence less likely to recover than a random person who doesn’t take the drug
Causal Story
▪ Being a woman is a common cause of both drug taking and failure to recover.
▪ To assess the effectiveness we need to compare subjects of the same gender.
(Ensures that any difference in recovery rates is not ascribable to estrogen)
patients recovered % recovered patients recovered % recovered
Men 87 81 93% 270 234 87%
Women 263 192 73% 80 55 69%
Combined data 350 273 78% 350 289 83%
Table 1.1 Results of a study into a new drug, with gender being taken into account
Drug No Drug
Data Segregation
▪ We have solved the problem
using gender-segregated data
▪ Then let’s just segregate the
data whenever possible,
right?
WRONG!!!
▪ Consider a drug affecting
recovery by lowering blood
pressure (BP)
▪ Unfortunately, it has also a
toxic effect
Should a doctor prescribe this drug or not?
patients recovered % recovered patients recovered % recovered
Low BP 87 81 93% 270 234 87%
High BP 263 192 73% 80 55 69%
Combined data 350 273 78% 350 289 83%
Table 1.2 Results of a study into a new drug, with posttreatment blood pressure taken into account
No Drug Drug
Should a doctor prescribe this drug or not?
Once again, the answer follows from the way the data were generated.
Data Segregation
▪ We have solved the problem
using gender-segregated data
▪ Then let’s just segregate the
data whenever possible,
right?
WRONG!!!
▪ Consider a drug affecting
recovery by lowering blood
pressure (BP)
▪ Unfortunately, it has also a
toxic effect
patients recovered % recovered patients recovered % recovered
Low BP 87 81 93% 270 234 87%
High BP 263 192 73% 80 55 69%
Combined data 350 273 78% 350 289 83%
Table 1.2 Results of a study into a new drug, with posttreatment blood pressure taken into account
No Drug Drug
Should a doctor prescribe this drug or not?
Once again, the answer follows from the way the data were generated.
▪ In the general population, the drug might improve recovery rates because of
its effect on blood pressure.
Data Segregation
▪ We have solved the problem
using gender-segregated data
▪ Then let’s just segregate the
data whenever possible,
right?
WRONG!!!
▪ Consider a drug affecting
recovery by lowering blood
pressure (BP)
▪ Unfortunately, it has also a
toxic effect
patients recovered % recovered patients recovered % recovered
Low BP 87 81 93% 270 234 87%
High BP 263 192 73% 80 55 69%
Combined data 350 273 78% 350 289 83%
Table 1.2 Results of a study into a new drug, with posttreatment blood pressure taken into account
No Drug Drug
Should a doctor prescribe this drug or not?
Once again, the answer follows from the way the data were generated.
▪ In the general population, the drug might improve recovery rates because of
its effect on blood pressure.
▪ in subpopulations—the group of people whose posttreatment BP is high and
the group whose posttreatment BP is low—we, of course, would not see that
effect; we would only see the drug’s toxic effect.
Data Segregation
▪ We have solved the problem
using gender-segregated data
▪ Then let’s just segregate the
data whenever possible,
right?
WRONG!!!
▪ Consider a drug affecting
recovery by lowering blood
pressure (BP)
▪ Unfortunately, it has also a
toxic effect
patients recovered % recovered patients recovered % recovered
Low BP 87 81 93% 270 234 87%
High BP 263 192 73% 80 55 69%
Combined data 350 273 78% 350 289 83%
Table 1.2 Results of a study into a new drug, with posttreatment blood pressure taken into account
No Drug Drug
Should a doctor prescribe this drug or not?
Once again, the answer follows from the way the data were generated.
▪ In the general population, the drug might improve recovery rates because of
its effect on blood pressure.
▪ in subpopulations—the group of people whose posttreatment BP is high and
the group whose posttreatment BP is low—we, of course, would not see that
effect; we would only see the drug’s toxic effect.
Data Segregation
▪ We have solved the problem
using gender-segregated data
▪ Then let’s just segregate the
data whenever possible,
right?
WRONG!!!
▪ Consider a drug affecting
recovery by lowering blood
pressure (BP)
▪ Unfortunately, it has also a
toxic effect
patients recovered % recovered patients recovered % recovered
Low BP 87 81 93% 270 234 87%
High BP 263 192 73% 80 55 69%
Combined data 350 273 78% 350 289 83%
Table 1.2 Results of a study into a new drug, with posttreatment blood pressure taken into account
No Drug Drug
Should a doctor prescribe this drug or not?
▪ Only by BP-segregating the data we can see the toxic effect
▪ It makes no sense to segregate the data; we should use the combined data
YES
Data Segregation
▪ We have solved the problem
using gender-segregated data
▪ Then let’s just segregate the
data whenever possible,
right?
WRONG!!!
▪ Consider a drug affecting
recovery by lowering blood
pressure (BP)
▪ Unfortunately, it has also a
toxic effect
patients recovered % recovered patients recovered % recovered
Low BP 87 81 93% 270 234 87%
High BP 263 192 73% 80 55 69%
Combined data 350 273 78% 350 289 83%
Table 1.2 Results of a study into a new drug, with posttreatment blood pressure taken into account
No Drug Drug
Note that the data are the same of Simpson’s paradox.
patients recovered % recovered patients recovered % recovered
Men 87 81 93% 270 234 87%
Women 263 192 73% 80 55 69%
Combined data 350 273 78% 350 289 83%
Table 1.1 Results of a study into a new drug, with gender being taken into account
Drug No Drug
Should a doctor prescribe this drug or not?
▪ Only by BP-segregating the data we can see the toxic effect
▪ It makes no sense to segregate the data; we should use the combined data
YES
Data Segregation
▪ We have solved the problem
using gender-segregated data
▪ Then let’s just segregate the
data whenever possible,
right?
WRONG!!!
▪ Consider a drug affecting
recovery by lowering blood
pressure (BP)
▪ Unfortunately, it has also a
toxic effect
patients recovered % recovered patients recovered % recovered
Low BP 87 81 93% 270 234 87%
High BP 263 192 73% 80 55 69%
Combined data 350 273 78% 350 289 83%
Table 1.2 Results of a study into a new drug, with posttreatment blood pressure taken into account
No Drug Drug
▪ the timing of the measurements
▪ that the treatment affects blood pressure
▪ that blood pressure affects recovery
▪ as Statisticians rightly say, correlation is not causation
▪ hence there is no method that can determine the causal story from data alone
▪ whence no ML method can aid in our decision
▪ The paradox arises out of our conviction that treatment cannot affect sex
▪ If it could, we could explain it as in our blood pressure case
▪ But we cannot test the assumption using the data
▪ Information that
allowed us to make a
correct decision
▪ All this information
was not in the data
▪ The same holds for
Simpson’s paradox
Lessons Learned
The Ladder
of Causation
Seeing; we are looking for
regularities in observations.
“What if I see …?”
Calls for predictions based on passive observations.
It is characterized by the question “What if I see …?”
For instance, imagine a marketing director at a
department store who asks,
“How likely is a customer who bought toothpaste to
also buy dental floss?”
“What if do …?” & “How?”
We step up to the next level of causal queries
when we begin to change the world. A typical
question for this level is
“What will happen to our floss sales if we double
the price of toothpaste?”
This already calls for a new
kind of knowledge, absent
from the data, which we find at
rung two of the Ladder of
Causation, Intervention.
Intervention; ranks
higher than association
because it involves not just
seeing but changing what is.
Many scientists have been quite traumatized to learn
that none of the methods they learned in statistics is
sufficient even to articulate, let alone answer, a simple
question like
“What happens if we double the price?”
“What if I had done …?” & “Why?”
Counterfactuals; ranks
higher than intervention
because it involves
imagining, retrospection
and understanding.
We might wonder, My
headache is gone now, but
• Why?
• Was it the aspirin I took?
• The food I ate?
• The good news I heard?
These queries take us to the top rung of the Ladder of
Causation, the level of Counterfactuals, because to
answer them we must go back in time, change history,
and ask,
“What would have happened if I had not taken the
aspirin?”
No experiment in the world can deny treatment to an
already treated person and compare the two outcomes,
so we must import a whole new kind of knowledge.
Extra-Statistical
Methods
▪ Methods to express and interpret causal assumptions
• To describe problems of any complexity, where intuition no longer helps
• To solve them mechanically as we solve algebraic equations
▪ In particular we need:
1. A working definition of “causation”
2. A method to articulate causal assumptions (i.e., a model)
3. A method to link causal models to data
4. A method to draw conclusions from model and data
▪ We start by 1
• As simple as it can be, it has eluded philosophers for centuries
• We use an operational definition:
▪ Methods to express and interpret causal assumptions
• To describe problems of any complexity, where intuition no longer helps
• To solve them mechanically as we solve algebraic equations
▪ In particular we need:
1. A working definition of “causation”
2. A method to articulate causal assumptions (i.e., a model)
3. A method to link causal models to data
4. A method to draw conclusions from model and data
A variable X is a cause of a variable Y if Y “listens” to X
and decides its value in response to what it hears.
Extra-Statistical
Methods
Before moving on to actual causal methods we need:
▪ Some elementary concepts from probability and statistics
• Most causal statements are uncertain (e.g., “careless driving
causes accidents”)
• Uncertainty is expressed by probability
• Probability is at the heart of statistics (i.e., learning from data)
▪ Some graph theory
• Causal stories will be represented by graphs
• Graphs will be the basis of solutions methods
Next Steps
References

More Related Content

What's hot

Semantic Web Investigation within Big Data Context
Semantic Web Investigation within Big Data ContextSemantic Web Investigation within Big Data Context
Semantic Web Investigation within Big Data ContextMurad Daryousse
 
Data Processing and Semantics for Advanced Internet of Things (IoT) Applicati...
Data Processing and Semantics for Advanced Internet of Things (IoT) Applicati...Data Processing and Semantics for Advanced Internet of Things (IoT) Applicati...
Data Processing and Semantics for Advanced Internet of Things (IoT) Applicati...Artificial Intelligence Institute at UofSC
 
A Semantics-based Approach to Machine Perception
A Semantics-based Approach to Machine PerceptionA Semantics-based Approach to Machine Perception
A Semantics-based Approach to Machine PerceptionCory Andrew Henson
 
Semantics-empowered Approaches to Big Data Processing for Physical-Cyber-Soci...
Semantics-empowered Approaches to Big Data Processing for Physical-Cyber-Soci...Semantics-empowered Approaches to Big Data Processing for Physical-Cyber-Soci...
Semantics-empowered Approaches to Big Data Processing for Physical-Cyber-Soci...Artificial Intelligence Institute at UofSC
 
hariri2019.pdf
hariri2019.pdfhariri2019.pdf
hariri2019.pdfAkuhuruf
 
TRANSFORMING BIG DATA INTO SMART DATA: Deriving Value via Harnessing Volume, ...
TRANSFORMING BIG DATA INTO SMART DATA: Deriving Value via Harnessing Volume, ...TRANSFORMING BIG DATA INTO SMART DATA: Deriving Value via Harnessing Volume, ...
TRANSFORMING BIG DATA INTO SMART DATA: Deriving Value via Harnessing Volume, ...Amit Sheth
 
Smart Data - How you and I will exploit Big Data for personalized digital hea...
Smart Data - How you and I will exploit Big Data for personalized digital hea...Smart Data - How you and I will exploit Big Data for personalized digital hea...
Smart Data - How you and I will exploit Big Data for personalized digital hea...Amit Sheth
 
Transforming Big Data into Smart Data for Smart Energy: Deriving Value via ha...
Transforming Big Data into Smart Data for Smart Energy: Deriving Value via ha...Transforming Big Data into Smart Data for Smart Energy: Deriving Value via ha...
Transforming Big Data into Smart Data for Smart Energy: Deriving Value via ha...Amit Sheth
 
Challenges in Analytics for BIG Data
Challenges in Analytics for BIG DataChallenges in Analytics for BIG Data
Challenges in Analytics for BIG DataPrasant Misra
 
Big Data Analytics : Understanding for Research Activity
Big Data Analytics : Understanding for Research ActivityBig Data Analytics : Understanding for Research Activity
Big Data Analytics : Understanding for Research ActivityAndry Alamsyah
 
The NEEDS vs. the WANTS in IoT
The NEEDS vs. the WANTS in IoTThe NEEDS vs. the WANTS in IoT
The NEEDS vs. the WANTS in IoTPrasant Misra
 
The current challenges and opportunities of big data and analytics in emergen...
The current challenges and opportunities of big data and analytics in emergen...The current challenges and opportunities of big data and analytics in emergen...
The current challenges and opportunities of big data and analytics in emergen...IBM Analytics
 
Mining Big Data to Predicting Future
Mining Big Data to Predicting FutureMining Big Data to Predicting Future
Mining Big Data to Predicting FutureIJERA Editor
 
Overcomming Big Data Mining Challenges for Revolutionary Breakthroughs in Com...
Overcomming Big Data Mining Challenges for Revolutionary Breakthroughs in Com...Overcomming Big Data Mining Challenges for Revolutionary Breakthroughs in Com...
Overcomming Big Data Mining Challenges for Revolutionary Breakthroughs in Com...AnthonyOtuonye
 
Physical Cyber Social Computing
Physical Cyber Social ComputingPhysical Cyber Social Computing
Physical Cyber Social ComputingAmit Sheth
 
Smart IoT for Connected Manufacturing
Smart IoT for Connected ManufacturingSmart IoT for Connected Manufacturing
Smart IoT for Connected ManufacturingAmit Sheth
 

What's hot (20)

Big data survey
Big data surveyBig data survey
Big data survey
 
Semantic Web Investigation within Big Data Context
Semantic Web Investigation within Big Data ContextSemantic Web Investigation within Big Data Context
Semantic Web Investigation within Big Data Context
 
Data Processing and Semantics for Advanced Internet of Things (IoT) Applicati...
Data Processing and Semantics for Advanced Internet of Things (IoT) Applicati...Data Processing and Semantics for Advanced Internet of Things (IoT) Applicati...
Data Processing and Semantics for Advanced Internet of Things (IoT) Applicati...
 
Lecture #01
Lecture #01Lecture #01
Lecture #01
 
A Semantics-based Approach to Machine Perception
A Semantics-based Approach to Machine PerceptionA Semantics-based Approach to Machine Perception
A Semantics-based Approach to Machine Perception
 
Semantics-empowered Approaches to Big Data Processing for Physical-Cyber-Soci...
Semantics-empowered Approaches to Big Data Processing for Physical-Cyber-Soci...Semantics-empowered Approaches to Big Data Processing for Physical-Cyber-Soci...
Semantics-empowered Approaches to Big Data Processing for Physical-Cyber-Soci...
 
hariri2019.pdf
hariri2019.pdfhariri2019.pdf
hariri2019.pdf
 
Intro to Data Science Concepts
Intro to Data Science ConceptsIntro to Data Science Concepts
Intro to Data Science Concepts
 
TRANSFORMING BIG DATA INTO SMART DATA: Deriving Value via Harnessing Volume, ...
TRANSFORMING BIG DATA INTO SMART DATA: Deriving Value via Harnessing Volume, ...TRANSFORMING BIG DATA INTO SMART DATA: Deriving Value via Harnessing Volume, ...
TRANSFORMING BIG DATA INTO SMART DATA: Deriving Value via Harnessing Volume, ...
 
Big data Paper
Big data PaperBig data Paper
Big data Paper
 
Smart Data - How you and I will exploit Big Data for personalized digital hea...
Smart Data - How you and I will exploit Big Data for personalized digital hea...Smart Data - How you and I will exploit Big Data for personalized digital hea...
Smart Data - How you and I will exploit Big Data for personalized digital hea...
 
Transforming Big Data into Smart Data for Smart Energy: Deriving Value via ha...
Transforming Big Data into Smart Data for Smart Energy: Deriving Value via ha...Transforming Big Data into Smart Data for Smart Energy: Deriving Value via ha...
Transforming Big Data into Smart Data for Smart Energy: Deriving Value via ha...
 
Challenges in Analytics for BIG Data
Challenges in Analytics for BIG DataChallenges in Analytics for BIG Data
Challenges in Analytics for BIG Data
 
Big Data Analytics : Understanding for Research Activity
Big Data Analytics : Understanding for Research ActivityBig Data Analytics : Understanding for Research Activity
Big Data Analytics : Understanding for Research Activity
 
The NEEDS vs. the WANTS in IoT
The NEEDS vs. the WANTS in IoTThe NEEDS vs. the WANTS in IoT
The NEEDS vs. the WANTS in IoT
 
The current challenges and opportunities of big data and analytics in emergen...
The current challenges and opportunities of big data and analytics in emergen...The current challenges and opportunities of big data and analytics in emergen...
The current challenges and opportunities of big data and analytics in emergen...
 
Mining Big Data to Predicting Future
Mining Big Data to Predicting FutureMining Big Data to Predicting Future
Mining Big Data to Predicting Future
 
Overcomming Big Data Mining Challenges for Revolutionary Breakthroughs in Com...
Overcomming Big Data Mining Challenges for Revolutionary Breakthroughs in Com...Overcomming Big Data Mining Challenges for Revolutionary Breakthroughs in Com...
Overcomming Big Data Mining Challenges for Revolutionary Breakthroughs in Com...
 
Physical Cyber Social Computing
Physical Cyber Social ComputingPhysical Cyber Social Computing
Physical Cyber Social Computing
 
Smart IoT for Connected Manufacturing
Smart IoT for Connected ManufacturingSmart IoT for Connected Manufacturing
Smart IoT for Connected Manufacturing
 

Similar to Causal networks, learning and inference - Introduction

Fundamentals of Data science Introduction Unit 1
Fundamentals of Data science Introduction Unit 1Fundamentals of Data science Introduction Unit 1
Fundamentals of Data science Introduction Unit 1sasi
 
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactData Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactDr. Sunil Kr. Pandey
 
Understand the Idea of Big Data and in Present Scenario
Understand the Idea of Big Data and in Present ScenarioUnderstand the Idea of Big Data and in Present Scenario
Understand the Idea of Big Data and in Present ScenarioAI Publications
 
Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong...
Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong...Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong...
Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong...IT Network marcus evans
 
Getting started in Data Science (April 2017, Los Angeles)
Getting started in Data Science (April 2017, Los Angeles)Getting started in Data Science (April 2017, Los Angeles)
Getting started in Data Science (April 2017, Los Angeles)Thinkful
 
Unveiling Tomorrow_ The Future of Data Science.pdf
Unveiling Tomorrow_ The Future of Data Science.pdfUnveiling Tomorrow_ The Future of Data Science.pdf
Unveiling Tomorrow_ The Future of Data Science.pdfCIOWomenMagazine
 
Delivering on the Promise of Big Data and the Cloud
Delivering on the Promise of Big Data and the CloudDelivering on the Promise of Big Data and the Cloud
Delivering on the Promise of Big Data and the CloudBooz Allen Hamilton
 
Introduction to Data Science and Analytics
Introduction to Data Science and AnalyticsIntroduction to Data Science and Analytics
Introduction to Data Science and AnalyticsDhruv Saxena
 
Data Science Demystified_ Journeying Through Insights and Innovations
Data Science Demystified_ Journeying Through Insights and InnovationsData Science Demystified_ Journeying Through Insights and Innovations
Data Science Demystified_ Journeying Through Insights and InnovationsVaishali Pal
 
Putting data science into perspective
Putting data science into perspectivePutting data science into perspective
Putting data science into perspectiveSravan Ankaraju
 
Data Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedData Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedPhilip Bourne
 
Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)Thinkful
 
Real-time applications of Data Science.pptx
Real-time applications  of Data Science.pptxReal-time applications  of Data Science.pptx
Real-time applications of Data Science.pptxshalini s
 
Mining Big Data using Genetic Algorithm
Mining Big Data using Genetic AlgorithmMining Big Data using Genetic Algorithm
Mining Big Data using Genetic AlgorithmIRJET Journal
 
Getting Started in Data Science
Getting Started in Data ScienceGetting Started in Data Science
Getting Started in Data ScienceThinkful
 

Similar to Causal networks, learning and inference - Introduction (20)

DATAIA & TransAlgo
DATAIA & TransAlgoDATAIA & TransAlgo
DATAIA & TransAlgo
 
Data science
Data scienceData science
Data science
 
Fundamentals of Data science Introduction Unit 1
Fundamentals of Data science Introduction Unit 1Fundamentals of Data science Introduction Unit 1
Fundamentals of Data science Introduction Unit 1
 
Challenges of Big Data Research
Challenges of Big Data ResearchChallenges of Big Data Research
Challenges of Big Data Research
 
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactData Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
 
Understand the Idea of Big Data and in Present Scenario
Understand the Idea of Big Data and in Present ScenarioUnderstand the Idea of Big Data and in Present Scenario
Understand the Idea of Big Data and in Present Scenario
 
Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong...
Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong...Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong...
Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong...
 
Getting started in Data Science (April 2017, Los Angeles)
Getting started in Data Science (April 2017, Los Angeles)Getting started in Data Science (April 2017, Los Angeles)
Getting started in Data Science (April 2017, Los Angeles)
 
BIG-DATAPPTFINAL.ppt
BIG-DATAPPTFINAL.pptBIG-DATAPPTFINAL.ppt
BIG-DATAPPTFINAL.ppt
 
Unveiling Tomorrow_ The Future of Data Science.pdf
Unveiling Tomorrow_ The Future of Data Science.pdfUnveiling Tomorrow_ The Future of Data Science.pdf
Unveiling Tomorrow_ The Future of Data Science.pdf
 
Delivering on the Promise of Big Data and the Cloud
Delivering on the Promise of Big Data and the CloudDelivering on the Promise of Big Data and the Cloud
Delivering on the Promise of Big Data and the Cloud
 
Introduction to Data Science and Analytics
Introduction to Data Science and AnalyticsIntroduction to Data Science and Analytics
Introduction to Data Science and Analytics
 
Conclusion
ConclusionConclusion
Conclusion
 
Data Science Demystified_ Journeying Through Insights and Innovations
Data Science Demystified_ Journeying Through Insights and InnovationsData Science Demystified_ Journeying Through Insights and Innovations
Data Science Demystified_ Journeying Through Insights and Innovations
 
Putting data science into perspective
Putting data science into perspectivePutting data science into perspective
Putting data science into perspective
 
Data Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has ChangedData Science and AI in Biomedicine: The World has Changed
Data Science and AI in Biomedicine: The World has Changed
 
Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)
 
Real-time applications of Data Science.pptx
Real-time applications  of Data Science.pptxReal-time applications  of Data Science.pptx
Real-time applications of Data Science.pptx
 
Mining Big Data using Genetic Algorithm
Mining Big Data using Genetic AlgorithmMining Big Data using Genetic Algorithm
Mining Big Data using Genetic Algorithm
 
Getting Started in Data Science
Getting Started in Data ScienceGetting Started in Data Science
Getting Started in Data Science
 

Recently uploaded

PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubaihf8803863
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改atducpo
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 

Recently uploaded (20)

PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 

Causal networks, learning and inference - Introduction

  • 1. Causal Networks: Learning and Inference Fabio Stella and Luca Bernardinello University of Milan-Bicocca Department of Informatics, Systems and Communication 2019 September 23rd – 27th
  • 2.
  • 3.
  • 4. The 100,000 Genomes Project ▪ complete sequencing of 70,000 individuals ▪ 21 PetaBytes of data ▪ 1 PetaByte of music requires about 2,000 years to be listened The complexity increases with digital data: ▪ Electronic Health Record ▪ Life style ▪ Nutrition ▪ Job type ▪ Geographical ▪ Social networks
  • 5. Smart Cities ▪ Automate ▪ Control ▪ See real-time data ▪ Home automation ▪ Environment monitoring ▪ Medical and healthcare ▪ Smart transportation ▪ Smart manufacturing ▪ Energy resource management
  • 6. Large Hadron Collider ▪ 2017 June 29 at CERN, 200 PB of data permanently stored. ▪ 1 billion collisions per second happening at ATLAS, CMS, ALICE, and LHCb.
  • 7. According to [1], • Big Data and Data Science are being used as buzzwords and are composites of many concepts. • Big data appears frequently in the press and academic journals. • In the last five years; birth and growth of many data science programs in academia. • In 2012, the White House Office of Science and Technology Policy announced the Big Data Research and Development Initiative that builds upon federal initiatives ranging from ✓ computer architecture and networking technologies, ✓ algorithms, ✓ data management, ✓ artificial intelligence, ✓ machine learning, ✓ advanced cyber-infrastructure. [1] Brady H. E., Annual Review of Political Science, 22 (2019) 297.
  • 8. Big Data 5 V’s
  • 9. Volume; the amount of data to be stored and managed. ▪ It is probably the first thing people think of when they hear the phrase big data. ▪ We typically talk about big data in the case where ExaBytes or PetaBytes have to be stored and managed. Big Data 5 V’s
  • 10. Velocity; defines the speed of increase in big data volume and its relative accessibility. ▪ This dimension helps organizations to realize the relative growth of their big data and how quickly that data reaches sourcing users, applications and systems. Big Data 5 V’s
  • 11. Variety; we no longer have control over the data format, i.e. we must manage heterogeneous data, with different formats, semantics and structures. ▪ In other words we could be asked to combine: ✓ pure text, ✓ photo, ✓ audio, ✓ video, ✓ web, ✓ GPS data, ✓ sensor data, ✓ relational data bases, ✓ and documents to answer a given question. Big Data 5 V’s
  • 12. Veracity; how accurate or truthful a data set may be. ▪ In the context of big data, however, it takes on a bit more meaning. ▪ When it comes to the accuracy of big data, it’s not just the quality of the data itself but how trustworthy the data source, type, and processing of it is. ▪ Removing things like bias, abnormalities or inconsistencies, duplication, and volatility are just a few aspects that factor into improving the accuracy of big data. Big Data 5 V’s
  • 13. Value; the worth of the data being extracted. ▪ Having endless amounts of data is one thing, but unless it can be turned into value it is useless. ▪ While there is a clear link between data and insights, this does not always mean there is value in Big Data. ▪ The most important part of embarking on a big data initiative is to understand the costs and benefits of collecting and analyzing the data to ensure that ultimately the data that is reaped can be monetized. Big Data 5 V’s
  • 14. Artificial Intelligence; intelligence demonstrated by machines, in contrast to the natural intelligence displayed by humans. ▪ Colloquially, the term Artificial Intelligence (AI) is used to describe machines/computers that mimic “cognitive” functions that humans associate with other human minds, such as "learning" and "problem solving". ▪ Two kinds of AI: ✓ Weak ✓ Strong
  • 15. Artificial Intelligence Bayesian Networks ▪ A type of statistical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). ▪ Bayesian networks are ideal for taking an event that occurred and predicting the likelihood that any one of several possible known causes was the contributing factor. ▪ Structural Causal Models.
  • 16. Artificial Intelligence Knowledge Graphs ▪ A knowledge base used by Google and its services to enhance its search engine's results with information gathered from a variety of sources. ▪ The information is presented to users in an infobox.
  • 17. Artificial Intelligence Knowledge Graphs ▪ A knowledge base used by Google and its services to enhance its search engine's results with information gathered from a variety of sources. ▪ The information is presented to users in an infobox.
  • 18. Machine Learning; algorithms and statistical models that computer systems use in order to perform a specific task effectively without using explicit instructions, relying on patterns and inference instead. Three kinds of ML: ▪ Supervised ▪ Self-Supervised ▪ Reinforcement Learning
  • 20. Machine Learning Supervised ▪ Curve fitting ▪ Surface fitting
  • 21. Machine Learning Self-Supervised ▪ Recommendation System ▪ Market Basket Analysis ▪ Social Network Analysis
  • 22. Machine Learning Reinforcement Learning ▪ Learn by interacting with the environment ▪ The environment reacts to our decisions/actions ▪ Sequential learning, only at the end of the game we know our performance (reward/punishment)
  • 23. Deep Learning; is part of a broader family of machine learning methods based on Artificial Neural Networks. Three kinds of DL: ▪ Supervised ▪ Self-Supervised ▪ Reinforcement Learning
  • 24. Deep Learning Feedforward Neural Networks ▪ The first and simplest type of artificial neural network devised. ▪ The information moves in only one direction, forward, from the input nodes, through the hidden nodes (if any) and to the output nodes. ▪ There are no cycles or loops in the network. 8 pixels 8pixels X1 X64 Y0 Y9 … … … …
  • 25. Deep Learning Convolutional Networks ▪ Regularized versions of multilayer perceptrons which are fully connected and thus prone to overfitting the data. ▪ Regularization by adding some form of magnitude measurement of weights to the loss function. ▪ Different approach towards regularization: take advantage of the hierarchical pattern in data and assemble more complex patterns using smaller and simpler patterns.
  • 26. Deep Learning LSTM Networks ▪ An artificial Recurrent Neural Network architecture. ▪ Unlike standard feedforward neural networks, LSTM has feedback connections that make it a "general purpose computer" (it can compute anything that a Turing machine can). ▪ LSTM started to revolutionize speech recognition, outperforming traditional models in certain speech applications.
  • 27. Data Science ▪ Applied Mathematics ▪ Computer Science ▪ Neuroscience ▪ Engineering ▪ Statistics ▪ Biology ▪ Physics Data science is the application of computational and statistical techniques to address or gain insight into some problem in the real world (J. Zico Kolter, Carnegie Mellon University, 2018) Data science is an extraordinary multidisciplinary and interdisciplinary challenge to apply the scientific method empowered by • data, • models, • computational resources, • open source software, and … the most sophisticated technology we have today, i.e. the human being.
  • 28. Machine Learning ▪ Many different names for learning But most of machine learning nowadays is just curve fitting REINFORCEMENT LEARNING
  • 32. Machine Learning ▪ Spurious Correlations Spurious Correlations Fitting can be highly misleading
  • 33. Fitting does not give us any understanding
  • 34. “The vision systems of the eagle and the snake outperform everything that we can make in the laboratory, but snakes and eagles cannot build an eyeglass or a telescope or a microscope." — Judea Pearl
  • 35. What can truly be achieved? ▪ No matter how much data you collect, the question can not be answered when using the data alone Does Exercise affect Cholesterol? Simpson’s Paradox CHOLESTEROLEXERCISE AGE
  • 36. Big Data and the data-fusion problem ▪ Piecing together multiple datasets collected under heterogeneous conditions (i.e., different populations, regimes, and sampling methods) to obtain valid answers to queries of interest. ▪ The availability of multiple heterogeneous datasets presents new opportunities to big data analysts, because the knowledge that can be acquired from combined data would not be possible from any individual source alone.
  • 37. Causation is the key ▪ What we (computers so far) miss is causal knowledge ▪ For predicting the consequences of actions ▪ Causation for us is a synonym of understanding ▪ Actual intelligence needs causal knowledge ▪ But causal knowledge is not in the data!
  • 39. Why study causation? ▪ To make sense of data • effect of smoking on lung cancer? • effect of education on salaries? • effect of carbon emissions on the climate? ▪ To understand how we have an effect • malaria caused by mosquitos or by mal-air?
  • 40. Why study causation? ▪ To make sense of data • effect of smoking on lung cancer? • effect of education on salaries? • effect of carbon emissions on the climate? ▪ To understand how we have an effect • malaria caused by mosquitos or by mal-air? ▪ To guide actions and policies • pack mosquito nets or use breathing masks? • reduce CO2 emissions? • have a degree? • stop smoking?
  • 41. Why study causation separately from statistics (machine learning)? Why study causation separately from statistics (machine learning)? ▪ What can causation tell us that ML doesn’t? • Causation is not an aspect of ML • It is an addition to ML ▪ Causation uncovers facts that ML cannot • None of the previous problems can be articulated in the language of ML ▪ Let us introduce these aspects with a famous statistical puzzle …
  • 42. Simpson’s Paradox ▪ Named after Edward Simpson (born 1922), statistician ▪ A group of sick patients are given the option to try a new drug ▪ Among those who took the drug, a lower percentage recovered than among those who did not ▪ However, when we partition by gender, we see that: ▪ more men taking the drug recover than do men are not taking the drug, and ▪ more women taking the drug recover than do women are not taking the drug! The drug appears to help men and women, but hurt the general population
  • 43. Simpson’s Paradox ▪ Named after Edward Simpson (born 1922), statistician ▪ A group of sick patients are given the option to try a new drug ▪ Among those who took the drug, a lower percentage recovered than among those who did not ▪ However, when we partition by gender, we see that: ▪ more men taking the drug recover than do men are not taking the drug, and ▪ more women taking the drug recover than do women are not taking the drug! patients recovered % recovered patients recovered % recovered Men 87 81 93% 270 234 87% Women 263 192 73% 80 55 69% Combined data 350 273 78% 350 289 83% Table 1.1 Results of a study into a new drug, with gender being taken into account Drug No Drug Example 1.2.1 We record the recovery rates of 700 patients who were given access to the drug. A total of 350 patients chose to take the drug and 350 patients did not. The results of the study are shown in Table 1.1.
  • 44. Example 1.2.1 We record the recovery rates of 700 patients who were given access to the drug. A total of 350 patients chose to take the drug and 350 patients did not. The results of the study are shown in Table 1.1. Drug vs non-drug takers recovery rates: ▪ 93% vs 87% male ▪ 73% vs 69% female ▪ 78% vs 83% general population! Should a doctor prescribe the drug; to whom? Should a policy maker approve the drug for use? Simpson’s Paradox ▪ Named after Edward Simpson (born 1922), statistician ▪ A group of sick patients are given the option to try a new drug ▪ Among those who took the drug, a lower percentage recovered than among those who did not ▪ However, when we partition by gender, we see that: ▪ more men taking the drug recover than do men are not taking the drug, and ▪ more women taking the drug recover than do women are not taking the drug! patients recovered % recovered patients recovered % recovered Men 87 81 93% 270 234 87% Women 263 192 73% 80 55 69% Combined data 350 273 78% 350 289 83% Table 1.1 Results of a study into a new drug, with gender being taken into account Drug No Drug
  • 45. Example 1.2.1 We record the recovery rates of 700 patients who were given access to the drug. A total of 350 patients chose to take the drug and 350 patients did not. The results of the study are shown in Table 1.1. Drug vs non-drug takers recovery rates: ▪ 93% vs 87% male ▪ 73% vs 69% female ▪ 78% vs 83% general population! Should a doctor prescribe the drug; to whom? Should a policy maker approve the drug for use? Simpson’s Paradox ▪ Named after Edward Simpson (born 1922), statistician ▪ A group of sick patients are given the option to try a new drug ▪ Among those who took the drug, a lower percentage recovered than among those who did not ▪ However, when we partition by gender, we see that: ▪ more men taking the drug recover than do men are not taking the drug, and ▪ more women taking the drug recover than do women are not taking the drug! patients recovered % recovered patients recovered % recovered Men 87 81 93% 270 234 87% Women 263 192 73% 80 55 69% Combined data 350 273 78% 350 289 83% Table 1.1 Results of a study into a new drug, with gender being taken into account Drug No Drug
  • 46. Example 1.2.1 We record the recovery rates of 700 patients who were given access to the drug. A total of 350 patients chose to take the drug and 350 patients did not. The results of the study are shown in Table 1.1. Drug vs non-drug takers recovery rates: ▪ 93% vs 87% male ▪ 73% vs 69% female ▪ 78% vs 83% general population! Should a doctor prescribe the drug; to whom? Should a policy maker approve the drug for use? Simpson’s Paradox ▪ Named after Edward Simpson (born 1922), statistician ▪ A group of sick patients are given the option to try a new drug ▪ Among those who took the drug, a lower percentage recovered than among those who did not ▪ However, when we partition by gender, we see that: ▪ more men taking the drug recover than do men are not taking the drug, and ▪ more women taking the drug recover than do women are not taking the drug! patients recovered % recovered patients recovered % recovered Men 87 81 93% 270 234 87% Women 263 192 73% 80 55 69% Combined data 350 273 78% 350 289 83% Table 1.1 Results of a study into a new drug, with gender being taken into account Drug No Drug
  • 47. Example 1.2.1 We record the recovery rates of 700 patients who were given access to the drug. A total of 350 patients chose to take the drug and 350 patients did not. The results of the study are shown in Table 1.1. Drug vs non-drug takers recovery rates: ▪ 93% vs 87% male ▪ 73% vs 69% female ▪ 78% vs 83% general population! Should a doctor prescribe the drug; to whom? Should a policy maker approve the drug for use? Simpson’s Paradox ▪ Named after Edward Simpson (born 1922), statistician ▪ A group of sick patients are given the option to try a new drug ▪ Among those who took the drug, a lower percentage recovered than among those who did not ▪ However, when we partition by gender, we see that: ▪ more men taking the drug recover than do men are not taking the drug, and ▪ more women taking the drug recover than do women are not taking the drug! patients recovered % recovered patients recovered % recovered Men 87 81 93% 270 234 87% Women 263 192 73% 80 55 69% Combined data 350 273 78% 350 289 83% Table 1.1 Results of a study into a new drug, with gender being taken into account Drug No Drug
  • 48. Understand the causal story behind the data ▪ What mechanism generated the data? ▪ Suppose: estrogen has a negative effect on recovery ▪ Women less likely to recover than men, regardless of the drug From the data: Conclusion: the drug appears to be harmful but it is not ▪ If we select a drug taker at random, that person is more likely to be a woman ▪ Hence less likely to recover than a random person who doesn’t take the drug Causal Story ▪ Being a woman is a common cause of both drug taking and failure to recover. ▪ To assess the effectiveness we need to compare subjects of the same gender. (Ensures that any difference in recovery rates is not ascribable to estrogen) patients recovered % recovered patients recovered % recovered Men 87 81 93% 270 234 87% Women 263 192 73% 80 55 69% Combined data 350 273 78% 350 289 83% Table 1.1 Results of a study into a new drug, with gender being taken into account Drug No Drug
  • 49. Data Segregation ▪ We have solved the problem using gender-segregated data ▪ Then let’s just segregate the data whenever possible, right? WRONG!!! ▪ Consider a drug affecting recovery by lowering blood pressure (BP) ▪ Unfortunately, it has also a toxic effect Should a doctor prescribe this drug or not? patients recovered % recovered patients recovered % recovered Low BP 87 81 93% 270 234 87% High BP 263 192 73% 80 55 69% Combined data 350 273 78% 350 289 83% Table 1.2 Results of a study into a new drug, with posttreatment blood pressure taken into account No Drug Drug
  • 50. Should a doctor prescribe this drug or not? Once again, the answer follows from the way the data were generated. Data Segregation ▪ We have solved the problem using gender-segregated data ▪ Then let’s just segregate the data whenever possible, right? WRONG!!! ▪ Consider a drug affecting recovery by lowering blood pressure (BP) ▪ Unfortunately, it has also a toxic effect patients recovered % recovered patients recovered % recovered Low BP 87 81 93% 270 234 87% High BP 263 192 73% 80 55 69% Combined data 350 273 78% 350 289 83% Table 1.2 Results of a study into a new drug, with posttreatment blood pressure taken into account No Drug Drug
  • 51. Should a doctor prescribe this drug or not? Once again, the answer follows from the way the data were generated. ▪ In the general population, the drug might improve recovery rates because of its effect on blood pressure. Data Segregation ▪ We have solved the problem using gender-segregated data ▪ Then let’s just segregate the data whenever possible, right? WRONG!!! ▪ Consider a drug affecting recovery by lowering blood pressure (BP) ▪ Unfortunately, it has also a toxic effect patients recovered % recovered patients recovered % recovered Low BP 87 81 93% 270 234 87% High BP 263 192 73% 80 55 69% Combined data 350 273 78% 350 289 83% Table 1.2 Results of a study into a new drug, with posttreatment blood pressure taken into account No Drug Drug
  • 52. Should a doctor prescribe this drug or not? Once again, the answer follows from the way the data were generated. ▪ In the general population, the drug might improve recovery rates because of its effect on blood pressure. ▪ in subpopulations—the group of people whose posttreatment BP is high and the group whose posttreatment BP is low—we, of course, would not see that effect; we would only see the drug’s toxic effect. Data Segregation ▪ We have solved the problem using gender-segregated data ▪ Then let’s just segregate the data whenever possible, right? WRONG!!! ▪ Consider a drug affecting recovery by lowering blood pressure (BP) ▪ Unfortunately, it has also a toxic effect patients recovered % recovered patients recovered % recovered Low BP 87 81 93% 270 234 87% High BP 263 192 73% 80 55 69% Combined data 350 273 78% 350 289 83% Table 1.2 Results of a study into a new drug, with posttreatment blood pressure taken into account No Drug Drug
  • 53. Should a doctor prescribe this drug or not? Once again, the answer follows from the way the data were generated. ▪ In the general population, the drug might improve recovery rates because of its effect on blood pressure. ▪ in subpopulations—the group of people whose posttreatment BP is high and the group whose posttreatment BP is low—we, of course, would not see that effect; we would only see the drug’s toxic effect. Data Segregation ▪ We have solved the problem using gender-segregated data ▪ Then let’s just segregate the data whenever possible, right? WRONG!!! ▪ Consider a drug affecting recovery by lowering blood pressure (BP) ▪ Unfortunately, it has also a toxic effect patients recovered % recovered patients recovered % recovered Low BP 87 81 93% 270 234 87% High BP 263 192 73% 80 55 69% Combined data 350 273 78% 350 289 83% Table 1.2 Results of a study into a new drug, with posttreatment blood pressure taken into account No Drug Drug
  • 54. Should a doctor prescribe this drug or not? ▪ Only by BP-segregating the data we can see the toxic effect ▪ It makes no sense to segregate the data; we should use the combined data YES Data Segregation ▪ We have solved the problem using gender-segregated data ▪ Then let’s just segregate the data whenever possible, right? WRONG!!! ▪ Consider a drug affecting recovery by lowering blood pressure (BP) ▪ Unfortunately, it has also a toxic effect patients recovered % recovered patients recovered % recovered Low BP 87 81 93% 270 234 87% High BP 263 192 73% 80 55 69% Combined data 350 273 78% 350 289 83% Table 1.2 Results of a study into a new drug, with posttreatment blood pressure taken into account No Drug Drug
  • 55. Note that the data are the same of Simpson’s paradox. patients recovered % recovered patients recovered % recovered Men 87 81 93% 270 234 87% Women 263 192 73% 80 55 69% Combined data 350 273 78% 350 289 83% Table 1.1 Results of a study into a new drug, with gender being taken into account Drug No Drug Should a doctor prescribe this drug or not? ▪ Only by BP-segregating the data we can see the toxic effect ▪ It makes no sense to segregate the data; we should use the combined data YES Data Segregation ▪ We have solved the problem using gender-segregated data ▪ Then let’s just segregate the data whenever possible, right? WRONG!!! ▪ Consider a drug affecting recovery by lowering blood pressure (BP) ▪ Unfortunately, it has also a toxic effect patients recovered % recovered patients recovered % recovered Low BP 87 81 93% 270 234 87% High BP 263 192 73% 80 55 69% Combined data 350 273 78% 350 289 83% Table 1.2 Results of a study into a new drug, with posttreatment blood pressure taken into account No Drug Drug
  • 56. ▪ the timing of the measurements ▪ that the treatment affects blood pressure ▪ that blood pressure affects recovery ▪ as Statisticians rightly say, correlation is not causation ▪ hence there is no method that can determine the causal story from data alone ▪ whence no ML method can aid in our decision ▪ The paradox arises out of our conviction that treatment cannot affect sex ▪ If it could, we could explain it as in our blood pressure case ▪ But we cannot test the assumption using the data ▪ Information that allowed us to make a correct decision ▪ All this information was not in the data ▪ The same holds for Simpson’s paradox Lessons Learned
  • 58. Seeing; we are looking for regularities in observations. “What if I see …?” Calls for predictions based on passive observations. It is characterized by the question “What if I see …?” For instance, imagine a marketing director at a department store who asks, “How likely is a customer who bought toothpaste to also buy dental floss?”
  • 59. “What if do …?” & “How?” We step up to the next level of causal queries when we begin to change the world. A typical question for this level is “What will happen to our floss sales if we double the price of toothpaste?” This already calls for a new kind of knowledge, absent from the data, which we find at rung two of the Ladder of Causation, Intervention. Intervention; ranks higher than association because it involves not just seeing but changing what is. Many scientists have been quite traumatized to learn that none of the methods they learned in statistics is sufficient even to articulate, let alone answer, a simple question like “What happens if we double the price?”
  • 60. “What if I had done …?” & “Why?” Counterfactuals; ranks higher than intervention because it involves imagining, retrospection and understanding. We might wonder, My headache is gone now, but • Why? • Was it the aspirin I took? • The food I ate? • The good news I heard? These queries take us to the top rung of the Ladder of Causation, the level of Counterfactuals, because to answer them we must go back in time, change history, and ask, “What would have happened if I had not taken the aspirin?” No experiment in the world can deny treatment to an already treated person and compare the two outcomes, so we must import a whole new kind of knowledge.
  • 61. Extra-Statistical Methods ▪ Methods to express and interpret causal assumptions • To describe problems of any complexity, where intuition no longer helps • To solve them mechanically as we solve algebraic equations ▪ In particular we need: 1. A working definition of “causation” 2. A method to articulate causal assumptions (i.e., a model) 3. A method to link causal models to data 4. A method to draw conclusions from model and data
  • 62. ▪ We start by 1 • As simple as it can be, it has eluded philosophers for centuries • We use an operational definition: ▪ Methods to express and interpret causal assumptions • To describe problems of any complexity, where intuition no longer helps • To solve them mechanically as we solve algebraic equations ▪ In particular we need: 1. A working definition of “causation” 2. A method to articulate causal assumptions (i.e., a model) 3. A method to link causal models to data 4. A method to draw conclusions from model and data A variable X is a cause of a variable Y if Y “listens” to X and decides its value in response to what it hears. Extra-Statistical Methods
  • 63. Before moving on to actual causal methods we need: ▪ Some elementary concepts from probability and statistics • Most causal statements are uncertain (e.g., “careless driving causes accidents”) • Uncertainty is expressed by probability • Probability is at the heart of statistics (i.e., learning from data) ▪ Some graph theory • Causal stories will be represented by graphs • Graphs will be the basis of solutions methods Next Steps