Russo Vub Seminar

• For instance, let A be the proposition that a coin tossed at time t will land heads, X the proposition that the chance of A at time t is x, and E any available evidence that does not contradict X. The Principal Principle then says that an agent’s degree of belief that the coin lands heads, conditional on X and E, should equal x. In other words, credence in A, given knowledge of the chance of A, just is that chance.
• In fully personalistic approaches, coherence is necessary and sufficient for assignments of prior probabilities. Hence the objection: aren’t frequencies just a pedagogical device? Carnap gives an interesting answer (Logical Foundations of Probability, §§50-51, 41C): we can do without frequencies if inductive logic is accepted. Reason 1: probability1 (subjective) can be explicated as an estimate of probability2 (objective); so, if probability2 is known, probability1 just equals this value. Reason 2: even if probability2 is unknown, we can still compute probability1 as an estimate of the unknown probability2 from *frequencies* in the sample. But the subjectivist can still play a last card: the exchangeability argument. Briefly and informally, de Finetti shows that different agents may start with different prior probabilities, but, as evidence accumulates, their posterior probabilities will tend to converge, giving the (on this view, illusory) impression that objective probability exists.
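De Finetti's convergence point can be sketched numerically. A minimal sketch, with invented numbers and assuming a conjugate Beta-Bernoulli setup (not de Finetti's own formalism): two agents hold very different Beta priors over a coin's bias, update on the same shared evidence, and end up with nearly identical posterior means.

```python
# Sketch of de Finetti's convergence point (hypothetical numbers):
# two agents with different Beta priors update on the same evidence.

def posterior_mean(a, b, heads, tails):
    """Posterior mean of the bias under a Beta(a, b) prior,
    after observing the given heads/tails counts (conjugate update)."""
    return (a + heads) / (a + b + heads + tails)

heads, tails = 700, 300  # shared evidence: 1000 tosses

agent1 = posterior_mean(1, 1, heads, tails)   # uniform prior
agent2 = posterior_mean(20, 2, heads, tails)  # prior strongly favouring heads

# Both posterior means land near the observed frequency 0.7,
# despite the very different starting points.
print(agent1, agent2)
```

As the shared sample grows, the influence of either prior shrinks, which is the sense in which the agreement is driven by the evidence rather than by any "objective" probability.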
1. Objective Bayesianism and causal modelling in the social sciences. Federica Russo, Philosophy, Louvain & Kent
2. Overview <ul><li>Probabilistic causal claims in CM </li></ul><ul><ul><li>Generic / Single-case </li></ul></ul><ul><li>Interpreting probability: </li></ul><ul><ul><li>a rush course </li></ul></ul><ul><li>Interpreting probability in CM: </li></ul><ul><ul><li>Frequency-driven epistemic probabilities </li></ul></ul><ul><ul><li>Objective Bayesian probabilities </li></ul></ul>
3. Probabilistic causal claims <ul><li>Causal claims: </li></ul><ul><ul><li>Tendency of an event to cause another </li></ul></ul><ul><ul><li>Causal effectiveness of an event </li></ul></ul><ul><ul><li>Frequency of occurrence of a causal relation </li></ul></ul><ul><ul><li>… </li></ul></ul><ul><li>Causal claims are probabilistically modelled </li></ul>
4. Probabilistic causal claims <ul><li>Generic </li></ul><ul><ul><li>About the population as a whole; </li></ul></ul><ul><ul><li>Describe an average causal relation; </li></ul></ul><ul><ul><li>Concern frequency of occurrence. </li></ul></ul><ul><li>Single-case </li></ul><ul><ul><li>About a particular ‘individual’; </li></ul></ul><ul><ul><li>Occur at a particular time and place; </li></ul></ul><ul><ul><li>Express a rational belief about what will or did happen </li></ul></ul><ul><li>What interpretation fits generic and single-case? </li></ul>
5. Interpreting probability: a rush course <ul><li>Kolmogorov axiomatisation </li></ul><ul><ul><li>Probabilities are non-negative numbers </li></ul></ul><ul><ul><li>Every tautology is assigned value 1 </li></ul></ul><ul><ul><li>The probability of the disjunction of two mutually inconsistent sentences equals the sum of their probabilities </li></ul></ul><ul><ul><li>Conditional probability: P(A|B) = P(A ∧ B) / P(B) </li></ul></ul><ul><ul><li>Bayes’ theorem then follows: P(A|B) = P(B|A) P(A) / P(B) </li></ul></ul>
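The definition of conditional probability and Bayes' theorem can be checked on a toy joint distribution. A minimal sketch (the joint probabilities are invented purely for illustration):

```python
# Toy joint distribution over a hypothesis H and evidence E
# (illustrative numbers only).
p_joint = {
    ('h', 'e'): 0.08, ('h', 'not-e'): 0.02,
    ('not-h', 'e'): 0.18, ('not-h', 'not-e'): 0.72,
}

p_h = p_joint[('h', 'e')] + p_joint[('h', 'not-e')]      # P(H) = 0.10
p_e = p_joint[('h', 'e')] + p_joint[('not-h', 'e')]      # P(E) = 0.26

p_e_given_h = p_joint[('h', 'e')] / p_h   # conditional probability, by definition
p_h_given_e = p_e_given_h * p_h / p_e     # Bayes' theorem

# Bayes' theorem agrees with conditioning applied directly to the joint:
assert abs(p_h_given_e - p_joint[('h', 'e')] / p_e) < 1e-12
```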
6. Interpreting probability: a rush course <ul><li>Classical and logical </li></ul><ul><ul><li>P = ratio # of favourable cases / # of all equipossible cases </li></ul></ul><ul><li>Physical: frequency and propensity </li></ul><ul><ul><li>P = limiting relative frequency of an attribute in a reference class </li></ul></ul><ul><ul><li>P = tendency of a type of physical situation to yield an outcome </li></ul></ul><ul><li>Bayesian interpretation </li></ul><ul><ul><li>Subjective </li></ul></ul><ul><ul><li>Empirically-based </li></ul></ul><ul><ul><li>Objective </li></ul></ul>
7. Interpreting probability: a rush critique <ul><li>Classical and logical </li></ul><ul><ul><li>Well suited to games of chance, </li></ul></ul><ul><ul><li>less so to expressing generic causal knowledge </li></ul></ul><ul><ul><li>or individual hypotheses </li></ul></ul><ul><li>Physical </li></ul><ul><ul><li>Does not make sense in the single case, </li></ul></ul><ul><ul><li>and is of little use for evaluating </li></ul></ul><ul><ul><li>individual hypotheses </li></ul></ul>
8. Bayesian interpretations <ul><li>An epistemological stance </li></ul><ul><li>about scientific reasoning </li></ul><ul><ul><li>We reason according to the formal principles of probability theory </li></ul></ul><ul><ul><li>Bayesianism provides an account of how we can/should learn from experience </li></ul></ul><ul><li>Probability expresses rational degree of belief </li></ul><ul><li>Bayesians disagree as to how </li></ul><ul><li>degrees of belief are shaped </li></ul>
9. Bayesian interpretations <ul><li>Subjective </li></ul><ul><ul><li>Choose any probability you wish, </li></ul></ul><ul><ul><li>just preserve coherence </li></ul></ul><ul><li>Empirically-based </li></ul><ul><ul><li>Choose any probability you wish, preserve coherence, </li></ul></ul><ul><ul><li>and incorporate empirical constraints </li></ul></ul><ul><li>Objective </li></ul><ul><ul><li>Choose any probability you wish, preserve coherence, </li></ul></ul><ul><ul><li>incorporate empirical constraints and logical constraints </li></ul></ul>
10. Bayesianism (empirically-based or objective) is the interpretation that best fits CM
11. Janus-faced probability <ul><li>Historically tenable </li></ul><ul><li>Who’s in the driver’s seat? </li></ul><ul><ul><li>Frequency-driven epistemic probabilities </li></ul></ul><ul><ul><ul><li>Degrees of belief are shaped </li></ul></ul></ul><ul><ul><ul><li>by knowledge of frequencies </li></ul></ul></ul><ul><ul><li>Credence-driven physical probabilities </li></ul></ul><ul><ul><ul><li>Credence in the truth of a proposition </li></ul></ul></ul><ul><ul><ul><li>fixes the chance of the event </li></ul></ul></ul><ul><ul><ul><li>(as long as evidence does not contradict) </li></ul></ul></ul>
12. Frequency-driven epistemic probabilities <ul><li>Account for different types of </li></ul><ul><li>probabilistic causal claims </li></ul><ul><ul><li>because they are Janus-faced </li></ul></ul><ul><li>Make sense of learning from experience </li></ul><ul><ul><li>because they incorporate empirical constraints </li></ul></ul>
13. Credence-driven objective probabilities <ul><li>Lewis’ Principal Principle </li></ul><ul><ul><li>Let C be any reasonable initial credence function. Let t be any time. Let x be any real number in the unit interval. Let X be the proposition that the chance, at time t, of A’s holding equals x. Let E be any proposition compatible with X that is admissible at time t. Then, C(A | XE) = x. </li></ul></ul>
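The principle can be illustrated with a toy credence function. A minimal sketch (a hypothetical three-hypothesis setup, not Lewis's formalism): spread initial credence over chance hypotheses X_x, build the joint credence over (hypothesis, toss outcome), and check that conditioning on X_x yields credence x in heads.

```python
# Toy illustration of the Principal Principle (hypothetical setup).
# Chance hypotheses X_x: "the chance of heads is x".
chances = [0.2, 0.5, 0.8]
prior = {x: 1 / len(chances) for x in chances}  # initial credence over the X_x

# Joint credence over (chance hypothesis, outcome of the toss)
joint = {}
for x in chances:
    joint[(x, 'heads')] = prior[x] * x
    joint[(x, 'tails')] = prior[x] * (1 - x)

def credence_heads_given_chance(x):
    # C(A | X_x) = C(A and X_x) / C(X_x)
    return joint[(x, 'heads')] / (joint[(x, 'heads')] + joint[(x, 'tails')])

# Conditioning on "the chance of heads is x" yields credence x in heads:
for x in chances:
    assert abs(credence_heads_given_chance(x) - x) < 1e-12
```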
14. The Janus you choose makes the whole difference <ul><li>Lewis 1986: </li></ul><ul><ul><li>Carnap did well to distinguish two concepts of probability, insisting that both were legitimate and useful and that neither was at fault because it was not the other. I do not think Carnap chose quite the right two concepts, however. In place of his ‘degree of confirmation’ I would put credence or degree of belief; in place of his ‘relative frequency in the long run’ I would put chance or propensity, understood as making sense in the single case. The division of labor between the two concepts will be little changed by these replacements. Credence is well suited to play the role of Carnap’s probability1, and chance to play the role of probability2. </li></ul></ul>
15. In fact: <ul><li>C-D accounts leave room for arbitrariness </li></ul><ul><ul><li>Different agents with different initial credence functions </li></ul></ul><ul><ul><li>will assign different chances to the same event </li></ul></ul>Additionally: in the single case the goal is not to claim credence about chance but to express a rational degree of belief in an individual hypothesis
16. The case against? <ul><li>Bayesian probabilities are degrees of belief. </li></ul><ul><li>So is causal knowledge given up? </li></ul><ul><ul><li>Not quite: empirical constraints ensure that </li></ul></ul><ul><ul><li>probabilities are not devoid of empirical content </li></ul></ul><ul><li>Rational decision making does not use </li></ul><ul><li>frequencies at all… </li></ul><ul><ul><li>Perhaps, but still, experience informs our </li></ul></ul><ul><ul><li>degrees of belief in many ways </li></ul></ul><ul><li>Exchangeable sequences show that </li></ul><ul><li>‘probability does not exist’ </li></ul><ul><ul><li>Well, there’s nothing ‘metaphysical’ </li></ul></ul><ul><ul><li>in making an epistemic use of frequencies </li></ul></ul>
17. The full-blown advantage of objective Bayesianism <ul><li>In the design and interpretation of tests </li></ul><ul><li>In guiding action </li></ul>
18. Hypothesis testing <ul><li>Basic idea </li></ul><ul><ul><li>To compare hypotheses with data </li></ul></ul><ul><li>Elements </li></ul><ul><ul><li>Null hypothesis: observed variation is chancy </li></ul></ul><ul><ul><li>Alternative hypothesis: observed variation is real </li></ul></ul><ul><ul><li>Test statistic </li></ul></ul><ul><li>The null hypothesis is accepted/rejected </li></ul><ul><li>depending on the chosen p-value </li></ul>
19. Probability in hypothesis testing <ul><li>From the frequentist viewpoint: </li></ul><ul><ul><li>Evaluate the probability of obtaining the sample </li></ul></ul><ul><ul><li>if the hypothesis is true; </li></ul></ul><ul><ul><li>‘The probability of a hypothesis’ </li></ul></ul><ul><ul><li>has no meaning because it is single-case </li></ul></ul><ul><li>But Bayesians can evaluate </li></ul><ul><li>the probability of a hypothesis </li></ul>
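The contrast can be made concrete. A sketch with invented numbers: a frequentist one-sided p-value for the null "the coin is fair", alongside a Bayesian posterior probability of that same null against a simple alternative, which is exactly the quantity the frequentist machinery declines to provide.

```python
import math

n, k = 100, 60  # invented data: 60 heads in 100 tosses

def binom_pmf(n, j, p):
    """Binomial probability of exactly j successes in n trials."""
    return math.comb(n, j) * p**j * (1 - p)**(n - j)

# Frequentist: one-sided p-value, P(K >= 60 | null: p = 0.5)
p_value = sum(binom_pmf(n, j, 0.5) for j in range(k, n + 1))

# Bayesian: probability OF the null, against a simple alternative p = 0.6,
# starting from equal priors (treating the two hypotheses on a par)
like_null = binom_pmf(n, k, 0.5)
like_alt = binom_pmf(n, k, 0.6)
posterior_null = like_null / (like_null + like_alt)
```

Note that the p-value is a probability of data given the null, while posterior_null is a probability of the hypothesis itself; the two numbers answer different questions.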
20. Consider: <ul><li>‘The unknown parameter θ lies in (θ1, θ2) </li></ul><ul><li>with confidence level 95%’ </li></ul><ul><li>This means: </li></ul><ul><ul><li>If we draw many samples of the same size and </li></ul></ul><ul><ul><li>build the same kind of interval around the estimate, </li></ul></ul><ul><ul><li>then we can expect that 95% of the confidence intervals </li></ul></ul><ul><ul><li>will contain θ </li></ul></ul><ul><li>This is not the probability that θ lies in (θ1, θ2) </li></ul><ul><li>Freedman et al.: </li></ul><ul><ul><li>“Chances are in the sampling procedure, </li></ul></ul><ul><ul><li>not in the parameter.” </li></ul></ul>
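Freedman's point can be seen in a coverage simulation. A sketch with invented parameters, assuming a normal-approximation (Wald) interval for a coin bias θ: the "95%" attaches to the procedure of drawing samples and building intervals, not to θ itself.

```python
import random

# Coverage simulation (invented parameters): repeatedly sample n tosses
# of a coin with true bias theta, build a 95% normal-approximation
# interval around each estimate, and count how often theta is covered.
random.seed(0)
theta, n, trials = 0.3, 400, 2000
covered = 0
for _ in range(trials):
    k = sum(random.random() < theta for _ in range(n))
    p_hat = k / n
    half = 1.96 * (p_hat * (1 - p_hat) / n) ** 0.5
    covered += (p_hat - half) <= theta <= (p_hat + half)
coverage = covered / trials  # close to 0.95 across the repetitions
```

The fixed parameter θ is either in a given interval or not; what is close to 0.95 is the long-run proportion of intervals that capture it.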
21. What’s the probability of a hypothesis? <ul><li>Frequentists cannot answer this </li></ul><ul><li>But Bayesians can: </li></ul><ul><ul><li>A long-lasting project to rephrase </li></ul></ul><ul><ul><li>(frequentist) statistical problems in Bayesian terms </li></ul></ul><ul><ul><ul><li>Jaynes, Florens & Mouchart, Drèze & Mouchart, </li></ul></ul></ul><ul><ul><ul><li>Berger, Bernardo, … </li></ul></ul></ul>
22. … the probability of which hypothesis? <ul><li>Hypothesis testing tests </li></ul><ul><li>the Null H against the Alternative H </li></ul><ul><li>Acceptance/rejection </li></ul><ul><li>directly concerns the Null H, not the Alternative H </li></ul><ul><li>Objective Bayesianism treats both Hs on a par, </li></ul><ul><li>unless evidence suggests doing otherwise </li></ul>
23. A problem of error? <ul><li>Type I error </li></ul><ul><ul><li>Reject the Null H when it is in fact true </li></ul></ul><ul><li>Type II error </li></ul><ul><ul><li>Accept the Null H when it is in fact false </li></ul></ul><ul><li>Type I is weightier than Type II: </li></ul><ul><ul><li>Be more cautious in accepting the H that </li></ul></ul><ul><ul><li>the observed variation is real rather than chancy </li></ul></ul><ul><ul><li>Be more cautious in accepting ‘causal’ variations </li></ul></ul><ul><li>Solution: restrict the rejection region </li></ul>
24. But what about the probability of the Alternative hypothesis? <ul><li>Type I error has probability α, aka the p-value </li></ul><ul><li>Type II error has probability β </li></ul><ul><ul><li>β depends on, but is not determined by, α </li></ul></ul><ul><li>The frequentist does not treat them on a par </li></ul><ul><li>The Bayesian </li></ul><ul><ul><li>Assigns different α and β only on the basis of evidence </li></ul></ul><ul><ul><li>Chooses the Null or Alternative H on the basis of the posterior </li></ul></ul><ul><li>Not a problem of error anymore, </li></ul><ul><li>a problem of evidence </li></ul>
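The α/β trade-off behind "restrict the rejection region" can be computed exactly for a concrete test. A sketch with an invented setup: n = 50 tosses, null p = 0.5 against a simple alternative p = 0.7, rejecting the null when the head count K reaches a cutoff c.

```python
import math

def binom_tail(n, c, p):
    """P(K >= c) for K ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j)
               for j in range(c, n + 1))

n, c = 50, 32
alpha = binom_tail(n, c, 0.5)      # Type I: reject although the null is true
beta = 1 - binom_tail(n, c, 0.7)   # Type II: accept although the null is false

# Restricting the rejection region (raising c) lowers alpha but raises beta:
assert binom_tail(n, c + 2, 0.5) < alpha
assert 1 - binom_tail(n, c + 2, 0.7) > beta
```

This makes the dependence concrete: the cutoff pins down α directly, while β also depends on which alternative is true, which is why α does not determine β.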
25. Probability is the very guide of life <ul><li>That is, probability guides decisions </li></ul><ul><li>Decisions </li></ul><ul><ul><li>To accept/reject a hypothesis </li></ul></ul><ul><ul><li>To take action </li></ul></ul><ul><ul><ul><li>Policy-making </li></ul></ul></ul><ul><ul><ul><li>About individuals </li></ul></ul></ul>
26. Individual decisions <ul><li>Concern the single case, </li></ul><ul><li>e.g. a medical patient </li></ul><ul><li>Bayesian probabilities are applicable in the single case </li></ul><ul><li>Frequencies aren’t </li></ul>
27. Decisions in policy making <ul><li>From the Policy Hub of the UK Civil Service </li></ul><ul><ul><li>Policy making is: 'the process by which governments translate their political vision into programmes and actions to deliver 'outcomes' - desired changes in the real world'. </li></ul></ul><ul><ul><li>This concern with achieving real changes in people's lives is reflected in the Government's overall strategy for improving public services published in March 2002 </li></ul></ul><ul><ul><li>Promoting good practice in policy making is fundamental to the delivery of quality outcomes for citizens and to the realisation of public sector reform. Policy makers should have available to them the widest and latest information on research and best practice and all decisions should be demonstrably rooted in this knowledge. </li></ul></ul>
28. To sum up and conclude <ul><li>Causal claims in CM </li></ul><ul><ul><li>Generic / Single-case </li></ul></ul><ul><li>Both are probabilistic </li></ul><ul><li>What interpretation of probability? </li></ul><ul><li>Not just a Bayesian, </li></ul><ul><li>but an objective Bayesian </li></ul>