Greetings and thanks (an opportunity not only to put a “face” to e-colleagues, but also to talk about my PhD project). First, introduce the main features of the philosophy of probability. Have a look at some probability theory, but mostly focus on foundational and interpretational aspects. Second, philosophy of causality. But philosophy of causality is such a vast domain… so, focus on “probability and causality”. Several accounts of causation use probabilities; I’ll present just Suppes’ theory, since it’s probably “the” theory, or at least the basis for any other theory. Last, if the presentation of philosophy of probability and causality has raised interest, then you are probably willing to hear my ideas… In preparing the introductory part, I didn’t assume any familiarity with the topic, and indeed it has been useful work for me too! Feel free to ask any questions; I hope I’ll be able to answer.
In the beginning probability is just mathematics. No interpretation whatsoever. Defining probability over sentences is just one possibility. Other possibilities: over events, variables… To be more precise on the maths of probability, we need 3 primitive notions: 1. a nonempty set X of possible outcomes (= the sample space); 2. a family F of subsets of X representing possible events (each event E is a set of points of the sample space); 3. a real-valued function P on F. P(E) is then interpreted as the probability of E. Kolmogorov’s treatment assumes P to be a real-valued function over sets. Whatever we plug into P( ) or P( | ) are expressions of a Boolean algebra (BA, basically = set theory). If uneasy with events (because of philosophical problems), it is possible to give a set-theoretical version. The interpretation with sentences is legitimate because truth-functional operations on propositions (¬, ∧, ∨) obey the same formal principles as Boolean operators on sets. Defining probability over sentences is convenient for a subjective interpretation and it is also very intuitive: P(A) = 1 if A is true, P(A) = 0 if A is false, and in the middle there is our uncertainty about A. Show probability by Venn diagrams. A, B mutually exclusive. E.g. throwing a die: since the die can show only one side at a time, the statements “the die will come up 1”, “the die will come up 2”, etc. (each with probability 1/6) are mutually exclusive. E.g. 2: “Socrates is both bald and wise” and “Socrates is neither bald nor wise” are mutually exclusive.
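To make the three primitive notions concrete, here is a minimal sketch for the die example. It assumes a fair die with equally weighted outcomes; the names `OUTCOMES` and `prob` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction

# Sample space X for one throw of a die (fairness is an assumption of this sketch).
OUTCOMES = frozenset(range(1, 7))

def prob(event):
    """P(event) for an event given as a subset of OUTCOMES, with equal weights."""
    event = frozenset(event)
    assert event <= OUTCOMES, "an event must be a subset of the sample space"
    return Fraction(len(event), len(OUTCOMES))

one = {1}
two = {2}
even = {2, 4, 6}

# Axiom 1: probabilities are non-negative.
axiom1 = all(prob({x}) >= 0 for x in OUTCOMES)
# Axiom 2: the certain event has probability 1.
axiom2 = prob(OUTCOMES) == 1
# Axiom 3: additivity holds for mutually exclusive events (empty intersection).
axiom3 = prob(one | two) == prob(one) + prob(two)
# Simple additivity fails when events overlap: `two` is contained in `even`.
overlap_additive = prob(two | even) == prob(two) + prob(even)

print(axiom1, axiom2, axiom3, overlap_additive)
```

Exact fractions are used instead of floats so that the equalities in the axioms can be checked without rounding issues.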
a. follows since the overall probability can’t exceed 1. It’s very clear in the Venn diagram. b. A and B make the same factual claim. Conditional probability is a definition. Conditional probability allows for the fact that if a certain statement A is known to be true, this may affect the probability of another statement B. E.g.: Pr(getting an even number) = 1/2; Pr(getting an even number | a 2 or a 4 is thrown) = 1. Bayes’ theorem follows from the axioms and the definition of conditional probability. It’s a law that governs the inversion of a probability: it relates Pr(B | A) to Pr(A | B), provided that Pr(A) and Pr(B) are known or there is an accepted conventional procedure to determine them. E.g.: P(A) = P(A | B) P(B) + P(A | ¬B) P(¬B) (theorem of total probability). Bayes’ theorem gives the posterior probability from the prior probability and from a test or sample. Give an example if needed. The 3 conditions for unconditional independence are equivalent (when P(A) and P(B) are positive). The notion of independence: there are cases where the knowledge that A is true is irrelevant to the probability to be assigned to B. E.g.: Pr(throwing an even number) = 1/2; Pr(throwing an even number | the president of the U.S. sneezes at the same time as the throw) = still ½! [Conditional independence is the famous screening-off relation, central in Bayesian nets and in causality. The notion of independence is central for causality. More on this later.] Notice the difference between independent events and mutually exclusive events. Independence doesn’t imply that events are mutually exclusive. Give an example.
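The definitions above can be checked on the die example. The following is a sketch under the assumption of a fair die; the events `at_most_four` and the helper names `prob` and `cond` are illustrative choices, not part of the talk.

```python
from fractions import Fraction

OUTCOMES = frozenset(range(1, 7))  # fair die, assumed for illustration

def prob(event):
    """P(event) with equal weights on the six outcomes."""
    return Fraction(len(frozenset(event) & OUTCOMES), len(OUTCOMES))

def cond(a, b):
    """P(a | b) = P(a and b) / P(b), defined only when P(b) > 0."""
    a, b = frozenset(a), frozenset(b)
    if prob(b) == 0:
        raise ValueError("conditioning event has probability 0")
    return prob(a & b) / prob(b)

even = frozenset({2, 4, 6})
two_or_four = frozenset({2, 4})
at_most_four = frozenset({1, 2, 3, 4})
above_four = OUTCOMES - at_most_four

# The example from the notes: an even number is certain once a 2 or a 4 is thrown.
p_even_given_2_or_4 = cond(even, two_or_four)

# Theorem of total probability: P(A) = P(A|B) P(B) + P(A|not-B) P(not-B).
total = (cond(even, at_most_four) * prob(at_most_four)
         + cond(even, above_four) * prob(above_four))

# Bayes' theorem: P(B|A) = P(A|B) P(B) / P(A), compared with the direct value.
bayes = cond(even, at_most_four) * prob(at_most_four) / prob(even)
direct = cond(at_most_four, even)

# Independent but NOT mutually exclusive: `even` and `at_most_four` overlap.
independent = prob(even & at_most_four) == prob(even) * prob(at_most_four)
# Mutually exclusive but NOT independent: "comes up 1" and "comes up 2".
exclusive_independent = prob(frozenset({1}) & frozenset({2})) == prob({1}) * prob({2})

print(p_even_given_2_or_4, total, bayes, direct, independent, exclusive_independent)
```

The last two checks illustrate the difference between independence and mutual exclusivity: two mutually exclusive events with positive probability can never be independent, because learning that one occurred rules the other out.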
Classifications of interpretations differ significantly from one author to another, and almost every philosopher of probability proposes his own interpretation. Yet there are 3 main concepts all classifications turn around. Logical interpretation: rests on the Principle of Indifference = without reasons to assign different probability values to different events, they must be considered equiprobable. Problem: circular explication of the concept. Subjective interpretations, mostly of two kinds: (i) fully personalist approach: probabilities can differ from one agent to another, as long as they are coherent = satisfy the Kolmogorov axioms; (ii) personalist approach requiring other constraints on probability values (more on that later). Often probabilities are interpreted as the betting odds a person is willing to accept on a given proposition. More on this in a while. Objective interpretations, mostly of two kinds: (i) frequency theories: P(A | B) is the relative frequency of As among Bs. Problems: determination of the reference class and determination of the limiting value in infinite classes; (ii) propensity theories: probability is the tendency or propensity to display a certain property. It allows assignments to individual outcomes but works badly with sequences.
This is to give a rough idea of what Dutch Book arguments are, what they are based on, and what they are conceived for. Subjectivists (fully personalist approach) assume coherence to be the minimal criterion of rationality. Coherence = do not violate the probability axioms. Dutch Book arguments aim to show that coherence is all you need in order to be rational. Rationality is explained by the minimal criterion of coherence. The criterion is either descriptive or normative. The conditionalization rule, newP(H) = P(H | E) = P(H & E) / P(E), is the sole rule for updating probabilities that preserves coherence. First, probability values are the fair betting quotients you are willing to accept when betting on the truth of a proposition. Then a Dutch Book is defined as a book constructed in such a way that, whatever the truth value of the sentence A (the object of the bet), the bettor loses. The synchronic argument: Dutch Book theorems show that no Dutch Book can be made against the bettor iff the betting quotients are probabilities. The diachronic argument concerns the case of updating probabilities: conditionalization is the sole rule that does not lead to incoherence, i.e. the new probability values still obey the Kolmogorov axioms. Several criticisms of Dutch Book arguments: descriptive vs. normative, the assumption probability = betting quotients… I won’t go into that debate.
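A small numerical sketch may help to see the synchronic argument at work. It assumes the usual betting set-up: a bettor with betting quotient q on a proposition pays q times the stake and receives the stake if the proposition is true. The function names and the quotients 3/5 and 2/5 are illustrative assumptions.

```python
from fractions import Fraction

def net_gain(quotient, stake, wins):
    """Bettor pays quotient * stake up front and receives the stake if the bet wins."""
    return (stake if wins else 0) - quotient * stake

def book_outcomes(q_a, q_not_a, stake=Fraction(1)):
    """Bettor's total gain from one bet on A and one bet on not-A,
    listed for both possible truth values of A."""
    return {
        a_true: net_gain(q_a, stake, a_true) + net_gain(q_not_a, stake, not a_true)
        for a_true in (True, False)
    }

# Incoherent quotients: q(A) + q(not-A) = 6/5, which violates P(not-A) = 1 - P(A).
incoherent = book_outcomes(Fraction(3, 5), Fraction(3, 5))
# Coherent quotients: q(A) + q(not-A) = 1.
coherent = book_outcomes(Fraction(3, 5), Fraction(2, 5))

print(incoherent)  # the bettor loses the same amount whether A is true or false
print(coherent)    # no sure loss from this pair of bets
```

With the incoherent quotients the bettor loses 1/5 of the stake whatever happens. If the quotients summed to less than 1, the bookmaker would simply take the other side of both bets to obtain the same sure loss for the bettor.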
Bayesianism = inductive reasoning from data = examining probabilities of hypotheses given data. Subjective Bayesianism allows that two agents have different probability values for the same propositions and neither of them is wrong, as long as they are coherent (= do not violate the probability axioms), according to Dutch Book theorems. Objective Bayesianism, in requiring further constraints, implies that if two agents disagree, at least one of them must be wrong. Empirical constraints: e.g. observed frequencies. Logical constraints: principle of indifference or principle of maximum entropy. This is more or less all you need to know about the philosophy of probability. Let’s move to philosophy of causality.
First, a few general remarks on philosophy of causality. Russell 1912, the funeral of causation: “The law of causality, I believe, like much that passes muster among philosophers, is a relic of a bygone age, surviving, like the monarchy, only because it is erroneously supposed to do no harm.” (Russell, On the Notion of Cause). Probably Russell was wrong, since the debate on causality is still alive. Philosophy of causality is one of the richest areas of philosophy because it obliges one to take a position on several related topics: time’s arrow, determinism, explanation, induction, interpretation of probability. Motivation for probabilistic causality: it seems that both the intuitions behind common language and scientific practice are not about necessary or sufficient causation, but something “in between”: just probable. Probabilistic reasoning is very powerful in circumstances of partial knowledge (Laplacian idea), i.e. where a complete causal analysis is not feasible. (a) Probabilistic theories of causality: the theory uses standard probability theory, but its use doesn’t entail that the causal relation itself involves an irreducible element of chance. (b) Theories of probabilistic causality: probability is placed directly into the relation between the cause and the effect; the causal relation contains a properly probabilistic element. Relation with determinism and indeterminism: many who adopt (a) would argue that causality by itself is deterministic, but we model causal relations probabilistically because of lack of complete knowledge. For those nostalgic for determinism, (b) would hold as well: the relation between “physical probabilities” is deterministic. The remote origin is the interpretation of probability. Can you see why? Accounts of probabilistic causality are highly technical; I’ll avoid any technicality and try to present the arguments in a rather intuitive way.
Suppes 1970: masterpiece on probability and causality. Essentially 2 parts: analysis of causal relations among events and analysis of causal relations among quantitative properties. Ingredients of causality: temporal priority of causes, statistical relevance relation. Let’s illustrate the main concepts by an example: barometer reading (B), rain (R), and air pressure (A). The barometer is a prima facie cause of rain: P(R | B) > P(R). The barometer is a spurious cause of rain, i.e. has no real effect: P(R | B ∧ A) = P(R | A) and P(R | B ∧ A) ≥ P(R | B).
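The barometer example can be checked on a toy joint distribution. This is only a sketch: the numerical values are invented for illustration, and the model assumes that the drop in air pressure (A) influences both the barometer reading (B) and rain (R), with B and R independent given A. Temporal priority, the other ingredient of Suppes' definitions, is not modelled here.

```python
from fractions import Fraction
from itertools import product

# Invented numbers, for illustration only.
P_A = Fraction(3, 10)                                     # P(pressure drops)
P_B_GIVEN = {True: Fraction(9, 10), False: Fraction(1, 10)}  # P(barometer falls | A)
P_R_GIVEN = {True: Fraction(8, 10), False: Fraction(1, 10)}  # P(rain | A)

def weight(value, p_true):
    """Probability that a binary variable takes `value` when P(True) = p_true."""
    return p_true if value else 1 - p_true

# Joint distribution over (a, b, r), built so that B and R depend only on A.
joint = {
    (a, b, r): weight(a, P_A) * weight(b, P_B_GIVEN[a]) * weight(r, P_R_GIVEN[a])
    for a, b, r in product((True, False), repeat=3)
}

def prob(pred):
    return sum(p for world, p in joint.items() if pred(*world))

def cond(pred, given):
    return prob(lambda a, b, r: pred(a, b, r) and given(a, b, r)) / prob(given)

rain = lambda a, b, r: r
barometer = lambda a, b, r: b
pressure = lambda a, b, r: a
barometer_and_pressure = lambda a, b, r: a and b

# Prima facie cause: P(R | B) > P(R).
prima_facie = cond(rain, barometer) > prob(rain)
# Spuriousness: P(R | B and A) = P(R | A) and P(R | B and A) >= P(R | B).
screened_off = cond(rain, barometer_and_pressure) == cond(rain, pressure)
not_lowered = cond(rain, barometer_and_pressure) >= cond(rain, barometer)

print(prima_facie, screened_off, not_lowered)
```

In this toy model the barometer raises the probability of rain, yet once the air pressure is taken into account the barometer reading makes no further difference, which is the pattern the two conditions for a spurious cause describe.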
The reason why I present this part of Suppes’ theory, which indeed is more technical, is that it better mirrors scientific practice and the formalization of causal relations in statistical modelling. Here are the “tools”. X, Y, Z: random variables [a random variable (r.v.) is a function that assigns a real number to each and every possible outcome of a random experiment (the set of these outcomes = the sample space)]. P(X ≤ x), P(Y ≤ y), P(Z ≤ z): probability distributions [a probability distribution describes the probability that a r.v. X takes on a value less than or equal to a number x]. E.g.: sample space = 3 children in a family; r.v. = number of girls. E.g. of a distribution: the famous normal (Gaussian) distribution. cov(X, Y): measure of association between two r.v. X and Y [positive when large values of X and large values of Y tend to occur together]. ρ(X, Y): standardized measure of association between X and Y [the measure of covariation depends on the scale of measurement, so it is difficult to determine at first glance how strongly X and Y covary → standardize; ρ ranges between -1 and 1]. The problem with correlation is that correlation does not imply causation. Think of the many cases in which 2 r.v. covary but there is no causal relation at all (storks and births, sea level in Venice and bread prices in England…). When Suppes proves the theorem causation ⇒ correlation, he is saying that if there is a causal relation at work, then you should find a positive correlation between the variables you’re studying. The underlying idea of probabilistic causality (oversimplifying…) is that on average a cause increases the chance of the effect. That is, within a given population, if you suspect that smoking causes lung cancer, you expect to find more people affected among smokers than among non-smokers in the same population.
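The “tools” can be illustrated on the three-children example. The sketch below assumes that all eight birth orders are equally probable; the second variable (an indicator that the first child is a girl) and the rescaling by 10 are hypothetical additions, used only to show why covariance is standardized.

```python
from fractions import Fraction
from itertools import product
from math import isclose, sqrt

# Sample space: birth orders of three children; equal weights are an assumption.
OUTCOMES = list(product("GB", repeat=3))
WEIGHT = Fraction(1, len(OUTCOMES))

def girls(outcome):
    """Random variable X: number of girls in the family."""
    return outcome.count("G")

def first_is_girl(outcome):
    """Random variable Y: 1 if the first child is a girl, 0 otherwise."""
    return 1 if outcome[0] == "G" else 0

def expect(f):
    return sum(WEIGHT * f(o) for o in OUTCOMES)

def cdf(rv, x):
    """P(rv <= x)."""
    return sum(WEIGHT for o in OUTCOMES if rv(o) <= x)

def cov(f, g):
    return expect(lambda o: f(o) * g(o)) - expect(f) * expect(g)

def corr(f, g):
    return float(cov(f, g)) / sqrt(float(cov(f, f)) * float(cov(g, g)))

girls_cdf = [cdf(girls, x) for x in range(4)]   # P(X <= 0), ..., P(X <= 3)
cov_xy = cov(girls, first_is_girl)

# Changing the unit of X changes the covariance but not the correlation.
def girls_times_ten(outcome):
    return 10 * girls(outcome)

cov_scaled = cov(girls_times_ten, first_is_girl)
same_corr = isclose(corr(girls, first_is_girl), corr(girls_times_ten, first_is_girl))

print(girls_cdf, cov_xy, cov_scaled, same_corr)
```

The covariance is multiplied by ten when X is rescaled, while the correlation stays the same, which is why the standardized measure is easier to read at first glance.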
With this “probability increase” idea in mind, let’s have a look at some traditional problems. Introduce the golf example (Rosen) for improbable consequences. Introduce the squirrel example for negative causes and levels of causation. Sufficiency causation: if we had enough knowledge about the causal process, we would be able to determine
We may deal with causality from different perspectives. The distinction between metaphysics and epistemology is more clear-cut than the distinction between epistemology and methodology: lots of questions overlap. Metaphysics: what is the causal relation? What are the causal relata? (This presupposes belief in physical causality, out there; otherwise, agnostic positions up to the extreme denial of physical causality. Notice that denying physical causality does not imply denying the possibility of causal inference. Do you see why? Agency theories…) Epistemology: how can we get knowledge about causal relations? What notion of causality? Methodology: how to detect causal relations? Mark method in physical processes? Analysis of correlations? Bayesian networks? By causal modelling I mean the statistical models used in the (social) sciences for causal analysis. Traditional example: smoking and lung cancer. Often causal models make strong metaphysical assumptions (e.g. determinism, physical causality); I don’t think it is strictly necessary to embrace such strong ontological commitments. More on this later. What causal modelling lacks (in my opinion) is a conceptualization of the notion of causality. Methodology is well developed: structural models, graph methods, Granger causality, cross-tabs methods…
In my PhD I analyse methodology to come up with some claims about epistemology. In particular, I study structural models and probabilistic causality and try to understand how causal inferences are drawn and what concept of causality scientists use (probably unconsciously). I analyse several case studies (e.g. …) to see in practice what it means to use a structural model, and both to test and to extrapolate epistemological claims.
The equations involved functionally relate variables. The goal is to estimate the parameters of the equations; intuitively, the parameters represent the characteristics of the population under study (e.g. mean, variance, etc.). The graph here does not represent the data, but how the variables are related. So it is useful for visualizing the links or paths from one variable to another. One big problem in the debate is exactly the causal interpretation of these models: “pictorially”, whether or not the arrows in the graph represent causes. In fact, structural models can be used for descriptive purposes, for predictive purposes, or can be given a causal interpretation. The issue is interesting also from a historical point of view: at the beginning structural models (SM) were causal in character. Some authors defend the causal interpretation by explicating the meaning of “=” in the equations: not “=” as in maths formulas, but an assignment… More on the causal interpretation of SM later (my claim).
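To give a feel for what “estimating the parameters” means, here is a sketch with a linear equation of the form C3 = b1·C2 + b2·C1 + error. Everything in it is an assumption made for illustration: the data are synthetic, the “true” values of b1 and b2 are arbitrary numbers with no empirical meaning, and ordinary least squares is just one common estimation method, not necessarily the one used in the case studies.

```python
import random

random.seed(0)  # fixed seed so that the sketch is reproducible

# Arbitrary "true" parameters of a synthetic population (not real data).
TRUE_B1, TRUE_B2 = 0.8, -0.5

n = 2000
c1 = [random.gauss(0, 1) for _ in range(n)]        # first explanatory variable
c2 = [0.5 * x + random.gauss(0, 1) for x in c1]    # second one, partly driven by c1
c3 = [TRUE_B1 * s + TRUE_B2 * e + random.gauss(0, 1)
      for s, e in zip(c2, c1)]                     # outcome with a random error term

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Least squares for c3 = b1*c2 + b2*c1 + error (no intercept: all variables
# have mean zero by construction). Solve the 2x2 normal equations by Cramer's rule.
s22, s21, s11 = dot(c2, c2), dot(c2, c1), dot(c1, c1)
t2, t1 = dot(c2, c3), dot(c1, c3)
det = s22 * s11 - s21 * s21
b1_hat = (t2 * s11 - t1 * s21) / det
b2_hat = (s22 * t1 - s21 * t2) / det

print(round(b1_hat, 2), round(b2_hat, 2))  # estimates should lie close to 0.8 and -0.5
```

The fit recovers the parameters of the equation, but nothing in the computation itself says whether the equation should be read descriptively, predictively, or causally; that is exactly the interpretive question raised above.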
Oversimplified causal diagram of Caldwell’s thesis (Caldwell 1979) about the impact of mother’s education on child’s mortality. C3 is child mortality represented by age of child at death (Y1); C1 stands for mother’s education represented by years of school of mother (X1); C2 stands for socio-economic status, operationally defined by X1, by income (X2) and by years of school of father (X3).
Explain where this statement comes from: explain in what sense probabilistic causality and structural models look for variation. Give examples from case studies. Say that the idea is consistent with the mechanist approach (physical processes). The idea of variation is very appealing, but the question of the causal interpretation still needs an answer. The idea of variation (change) is akin to the idea of correlation (= joint variation). So the question is to understand what distinguishes spurious correlations from causal ones. Storks and births? Sea level in Venice and bread prices in England? Metaphysical approach (Hausmann): accidental correlations vs. true correlations. You just picked the wrong correlation. (Though I think that the demarcation problem still holds; relying on laws and counterfactuals doesn’t help so much.) Epistemological approach: causation is in the hypotheses, in particular in the causal hypothesis. Models have different hypotheses, e.g. linearity, normality, Markovian, etc.; if all these conditions are satisfied, then we can say something about the hypothesised causal relation. Think of the Duhem-Quine thesis: you don’t reject the whole theory, but you try to modify one hypothesis. In that case, either one of the hypotheses of the model is not satisfied and we have to build another model, or the causal hypothesis probably hasn’t been confirmed. So there is much more confirmatory logic than deductive logic.
The beauty of causality, as I said, is that it forces one to take a position on a million other related topics. I’ll mention just two. Probability. How I approach probability: given the account of causality I propose, what is the best-suited interpretation? The natural choice is within the subjectivist framework, in particular the Bayesian one. This follows from the epistemological perspective. Next step: though causality is a mind-dependent notion, it is not subjective, i.e. a matter of personal taste. Choose the kind of Bayesianism reflecting these characteristics → objective Bayesianism. Determinism. Actually I don’t have so much to say… (I think it’s one of the false problems in philosophy, especially because of worries concerning free will…) Still, I have to take a position. In the literature on causal modelling there is always this tension between determinism and indeterminism: the structural equation seems to describe a deterministic relation, but probabilities seem to reflect a stochastic component in the causal relation. If indeterminism is accepted, then there is the worry about the possibility of predicting and, at bottom, of having knowledge about causal relations… Hint: is it really necessary to embrace the metaphysical thesis of determinism? No. Determinism can be a methodological assumption or a heuristic principle that guides scientific practice. We look for deterministic functions because they allow better predictions and explanations. Indeed this is an agency perspective. (Kantian… if you wish.)
Federica Russo Université Catholique de Louvain Centre for Philosophy of Natural and Social Science - LSE Overview: Philosophy of probability: theory and interpretations Philosophy of causality: probability and causality My research project: some ideas in progress
Philosophy of probability: theory and interpretations <ul><li>Axioms: </li></ul><ul><li>Let S be a collection of sentences and P a probability function satisfying Kolmogorov axioms : </li></ul><ul><li>1. P (A) ≥ 0 </li></ul><ul><li>2. P (A) = 1 if A is true in all models </li></ul><ul><li>3. P (A ∨ B) = P (A) + P (B) if A, B mutually exclusive </li></ul>
<ul><li>Consequences: </li></ul><ul><li>a. P (¬A) = 1 − P (A) </li></ul><ul><li>b. P (A) = P (B) if in all models A ↔ B </li></ul><ul><li>c. P (A ∨ B) = P (A) + P (B) − P (A ∧ B) </li></ul><ul><li>Conditional probability: </li></ul><ul><li>P (A | B) = P (A ∧ B) / P (B) if P (B) ≠ 0 </li></ul><ul><li>Bayes’ Theorem: </li></ul><ul><li>P (B | A) = P (A | B) P (B) / P (A) </li></ul><ul><li>Unconditional independence: </li></ul><ul><li>A and B are unconditionally independent iff </li></ul><ul><li>P (A | B) = P (A) or </li></ul><ul><li>P (B | A) = P (B) or </li></ul><ul><li>P (A ∧ B) = P (A) P (B) </li></ul><ul><li>Conditional independence: </li></ul><ul><li>A is conditionally independent of B given C iff </li></ul><ul><li>P (A | B ∧ C) = P (A | C) </li></ul>
Interpretations of probability <ul><li>Logical interpretation </li></ul><ul><ul><li>Probability is the ratio between favourable cases and equipossible cases </li></ul></ul><ul><li>Subjective interpretation </li></ul><ul><ul><li>Probability is a quantitative expression of degree of knowledge, degree of belief, degree of confirmation </li></ul></ul><ul><li>Objective interpretation </li></ul><ul><ul><li>Probability is a quantitative expression of an objective feature of the world </li></ul></ul>Agent-dependent notion Epistemological interpretations Agent-independent notion Metaphysical interpretation
Dutch Book arguments <ul><li>Within subjective interpretation: probabilities are degrees of belief </li></ul><ul><li>Goal: justify 2 epistemological principles </li></ul><ul><ul><li>Probability laws are coherence conditions on degrees of belief </li></ul></ul><ul><ul><li>Conditionalization is a rule of probabilistic inference </li></ul></ul><ul><li>Assumption: degrees of belief are betting quotients </li></ul><ul><li>A Dutch Book is such that the bettor loses whatever happens </li></ul><ul><li>Synchronic Dutch Book theorem: the bettor is not liable to the Dutch Book iff his betting quotients satisfy probability axioms </li></ul><ul><li>Diachronic Dutch Book theorem: conditionalization is the only coherent dynamic rule for updating probabilities </li></ul>
Varieties of Bayesianism <ul><li>Subjective Bayesianism </li></ul><ul><li>coherence is the only constraint on probability functions </li></ul><ul><li>Objective Bayesianism </li></ul><ul><li>knowledge and lack of knowledge are empirical and logical constraints on probability functions </li></ul>
Philosophy of causality: probability and causality <ul><li>Motivation for the probabilistic approach </li></ul><ul><li>Probabilistic theories of causality </li></ul><ul><li>vs. </li></ul><ul><li>Theories of probabilistic causality </li></ul>
Suppes: causal relations among events <ul><li>Causes precede effects in time by definition </li></ul><ul><li>Causes increase the probability of the effect: P(E | C) > P(E) </li></ul><ul><li>Genuine causes are not spurious </li></ul>
Suppes: causal relations among quantitative properties <ul><li>Restate former definitions in terms of random variables and probability distributions </li></ul><ul><li>Causation implies correlation </li></ul>
Traditional problems <ul><li>Improbable consequences </li></ul><ul><li>Levels of causation: </li></ul><ul><ul><li>type causation vs . token causation </li></ul></ul><ul><li>Negative causes </li></ul><ul><li>Deterministic causality vs. Indeterministic causality </li></ul>
My research project: some ideas in progress <ul><li>Causality: metaphysics, epistemology, or methodology? </li></ul><ul><li>Causal modelling: metaphysics, epistemology or methodology? </li></ul>
<ul><li>My project: </li></ul><ul><li>Is in the epistemology of causality </li></ul><ul><li>Attempts to extrapolate a notion of causality </li></ul><ul><li>My methodology: </li></ul><ul><li>Analysis of modelling </li></ul><ul><li>Analysis of case studies </li></ul>
Intermezzo: what is a causal model? <ul><li>Causal models have two parts: </li></ul><ul><li>A set of equations </li></ul><ul><li>A graph </li></ul><ul><li>Equations functionally relate variables </li></ul><ul><li>Graphs are a device for laying out pictorially what is hypothesized to cause what </li></ul>
Causal models: an example C3 = β1 C2 + β2 C1 + εi Child mortality and mother’s education C1 = mother’s education C2 = socioeconomic status C3 = child mortality
Two claims about causality <ul><li>What is causality? </li></ul><ul><li>causality is a measure of change </li></ul><ul><li>Where does causality come from? </li></ul><ul><li>causality comes from the causal hypotheses </li></ul>
Related problems <ul><li>What about probability? </li></ul><ul><li>objective Bayesian approach </li></ul><ul><li>What about determinism? </li></ul><ul><li>determinism is a heuristic principle </li></ul>