Probability is a powerful tool for dealing with uncertainty.

Important to Cog Sci because:
1) Living in the world is fraught with uncertainty (e.g., any 2D image is consistent with multiple 3D scenes).
- Probability is a principled way of dealing with the uncertainty in our perception, and in experience in general.
- Understanding the mathematics of probability may help us to understand how the brain deals with uncertainty.
2) Scientific data is noisy.
 A) Behavior and neural representations are noisy.
 B) Our scientific perceptual systems are fraught with uncertainty.
Probability is key to making sense of our data.
Examples:

Frequentist:
- chance of winning gambles, repeatable stuff
  (but a lot of things are not repeatable! wouldn't you get the same results from the same initial conditions?)

Bayesian:
- weather
- existence of aliens on Saturn
- subjective belief must follow the laws of probability in order to be coherent
  (but how do you measure the belief? and the priors?)
Stop, do some examples of sample spaces and events.

To build a probability model, we need at least three ingredients. We need to know:
• What are all the things that could possibly happen?
• What sensible yes-no questions can we ask about these things?
• For any such question, what is the probability that the answer is yes?

The first point on the agenda is formalized by specifying a set Ω. Every element ω ∈ Ω symbolizes one possible fate of the model.
Draw Venn diagram.
Transcript of "Probability Review"
1.
Probability Review
Tomoki Tsuchida
Computational & Cognitive Neuroscience Lab
Department of Computer Science
University of California, San Diego
2.
Lectures based on previous bootcamp lectures/slides, first written by Tim K. Marks (Thanks Tim!!) and then by David Groppe (Thanks David!!) and then by Jake Olson (Thanks Jake!!)
3.
Talk Outline
• Why study probability?
• Probability defined
• Probabilities of Events Formed from other Events
  ‣ P(not E)
  ‣ P(E or F)
  ‣ P(E & F)
    - Independent events
• Conditional Probability
10.
We now know probability is useful... But what does it mean?
Examples:
“There is a 47% chance of red.”
“There is a 47% chance of rain.”
13.
Two of the ways to interpret probability:
Frequentist - The percentage of the time some event will happen.
Bayesian - How strongly you believe that something is true.
(Philosophical debates ensue for either interpretation. Or, one can forget about the interpretation and focus on the properties alone, as mathematicians do :P)
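The frequentist reading can be made concrete with a small simulation: repeat an experiment many times and watch the relative frequency of the event settle near its probability. This is only a sketch. The value 18/38 assumes the "47% chance of red" refers to an American roulette wheel (18 red pockets out of 38), which the slides do not state, and `relative_frequency` is a hypothetical helper name.

```python
import random

def relative_frequency(p_event, n_trials, seed=0):
    """Fraction of n_trials simulated trials on which an event of probability p_event occurs."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    hits = sum(rng.random() < p_event for _ in range(n_trials))
    return hits / n_trials

# Assumption: "red" on an American roulette wheel, 18 red pockets out of 38 (about 47%).
P_RED = 18 / 38

for n in (100, 10_000, 100_000):
    print(n, relative_frequency(P_RED, n))
```

A one-off claim such as "47% chance of rain tomorrow" cannot be repeated this way, which is where the Bayesian reading comes in.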
18.
Probability based on Axiomatic Theory
Random Experiment: An experiment with a stochastic (nondeterministic) result (e.g. “two coin tosses”)
Outcome: Result of the experiment (e.g. HH, TH etc.)
Sample Space (Ω): Set of all possible outcomes of the random experiment (Ω = {HH, TH, HT, TT})
Event: A subset of the sample space that satisfies certain constraints (e.g. E={HH}, F={HH, TH}, G={HH, TT})
19.
An Example
Random Experiment: Rolling a six-sided die once
Outcomes: 1, 2, 3, 4, 5, or 6
Sample Space: Ω={1,2,3,4,5,6}
Events: 2^6 = 64 total (why?)
Note: The sample space itself, Ω (Omega), and the empty set, ϕ (phi) or {}, are also events.
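One way to check the count of events is to enumerate them: an event is any subset of Ω, and each of the six outcomes is either in or out of a given subset. A minimal sketch, with `all_events` as a hypothetical helper name:

```python
from itertools import combinations

def all_events(sample_space):
    """Return every subset of sample_space (its power set) as a list of frozensets."""
    outcomes = sorted(sample_space)
    return [
        frozenset(subset)
        for size in range(len(outcomes) + 1)
        for subset in combinations(outcomes, size)
    ]

omega = {1, 2, 3, 4, 5, 6}
events = all_events(omega)
print(len(events))                    # 64, i.e. 2**6
print(frozenset() in events)          # the empty set is an event
print(frozenset(omega) in events)     # so is the whole sample space
```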
20.
Complement of an Event
Odd numbers on the die: E={1,3,5}
The complement of E, written E^c, is the set of all elements that are not members of E: E^c={2,4,6}
Draw Venn diagrams
(Remember: events are sets! So E^c is also an event itself.)
21.
Union of Two Events
Odd numbers on the die: E={1,3,5}
Numbers less than 4: F={1,2,3}
The union of E and F is the set of all elements that are members of E or members of F (or both):
E or F=E ∪ F={1,2,3,5}
22.
Intersection of Two Events
Odd numbers on the die: E={1,3,5}
Numbers less than 4: F={1,2,3}
The intersection of E and F is the set of all elements that are members of E and members of F:
E & F=E ∩ F={1,3}
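Because events are sets, the three operations on the preceding slides map directly onto Python's built-in set operators. A small sketch using the same die events (the variable names are only illustrative):

```python
omega = {1, 2, 3, 4, 5, 6}   # sample space for one roll of a die
E = {1, 3, 5}                # odd numbers
F = {1, 2, 3}                # numbers less than 4

complement_E = omega - E     # "not E"
union = E | F                # "E or F"
intersection = E & F         # "E & F"

print(complement_E, union, intersection)
```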
29.
An Axiomatic Definition of Probability
Probability is a set function P(E) that assigns to every event E a number called the “probability of E” such that:
1. The probability of an event is greater than or equal to zero: P(E) ≥ 0
2. The probability of the sample space is one: P(Ω) = 1
3. If two events are disjoint, the probability of their union equals the sum of their probabilities: P(E or F) = P(E) + P(F)
30.
An Example
Random Experiment: Rolling a six-sided die once
A legal probability function: P(1)=P(2)=P(3)=P(4)=P(5)=P(6)=1/6
An illegal probability function: P(1)=P(2)=P(3)=P(4)=P(5)=P(6)=1/2, since then P(Ω)=P(1)+P(2)+P(3)+P(4)+P(5)+P(6)=3, which violates axiom 2.
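The legal/illegal distinction can be checked mechanically for a finite sample space. The sketch below assumes probabilities are given per outcome and that an event's probability is the sum over its outcomes, so axiom 3 holds by construction and only axioms 1 and 2 need testing. `is_legal` is a hypothetical name.

```python
from fractions import Fraction

def is_legal(pmf):
    """Check axioms 1 and 2 for a per-outcome probability assignment.

    pmf maps each outcome to its probability. Event probabilities are taken
    to be sums over outcomes, so additivity for disjoint events is automatic.
    """
    non_negative = all(p >= 0 for p in pmf.values())   # axiom 1
    total_is_one = sum(pmf.values()) == 1              # axiom 2
    return non_negative and total_is_one

fair = {face: Fraction(1, 6) for face in range(1, 7)}
bad = {face: Fraction(1, 2) for face in range(1, 7)}

print(is_legal(fair), is_legal(bad), sum(bad.values()))
```

Exact fractions are used rather than floats so that the comparison with 1 is not affected by rounding.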
31.
Talk Outline
• Probability Defined
• Probabilities of Events Formed from other Events
  ‣ P(not E)
  ‣ P(E or F)
  ‣ P(E & F)
    - Independent events
• Conditional Probability
  ‣ Bayes’ Rule
32.
Probability of Events from other Events: Complement of an event
P(E^c)=P(not E)=1-P(E)
E={1,3}, E^c={2,4,5,6}
P(E^c)=1-P(E)=1-1/3=2/3
Note: This is easy to visualize with Venn Diagrams
34.
Probability of Events from other Events: Intersection of two events
P(E ∩ F)=P(E & F)=P(E)P(F) if and only if E and F are independent events
E={1,3,5}, F={1,2,3,4}
E ∩ F={1,3}
P(E)=1/2, P(F)=2/3
P(E ∩ F)=1/3=P(E)P(F)
36.
Probability of Events from other Events: Union of two events
P(E ∪ F)=P(E or F)=P(E)+P(F)-P(E ∩ F)
E={1,3}, F={1,2,3}
E ∪ F={1,2,3}, E ∩ F={1,3}
P(E)=1/3, P(F)=1/2
P(E ∪ F)=P(E)+P(F)-P(E ∩ F)=1/3+1/2-1/3=1/2
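The union rule (and the complement rule from the earlier slide) can be verified by counting outcomes. This sketch assumes a fair die, so that P(event) is the number of outcomes in the event divided by 6. `prob` is a hypothetical helper.

```python
from fractions import Fraction

OMEGA = frozenset({1, 2, 3, 4, 5, 6})

def prob(event):
    """P(event) for a fair die: number of outcomes in the event divided by 6."""
    return Fraction(len(event & OMEGA), len(OMEGA))

E = {1, 3}
F = {1, 2, 3}

direct = prob(E | F)                          # count the union directly
via_rule = prob(E) + prob(F) - prob(E & F)    # P(E) + P(F) - P(E ∩ F)
complement = 1 - prob(E)                      # P(E^c) = 1 - P(E)

print(direct, via_rule, complement)
```

Subtracting P(E ∩ F) corrects for the outcomes 1 and 3 being counted twice, once in E and once in F.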
37.
Probability of Events from other Events: Intersection of two events
P(E ∩ F)=P(E & F)
E={1,3}, F={1,2,3,4}
E ∩ F={1,3}
P(E)=2/6, P(F)=4/6
P(E ∩ F)=2/6
38.
Probability of Events from other Events: Intersection of two events
P(E ∩ F)=P(E)P(F) if and only if E and F are independent events
E={1,3}, F={1,2,3,4}
E ∩ F={1,3}
P(E)=2/6, P(F)=4/6
P(E ∩ F)=2/6≠P(E)P(F)=8/36, so these E and F are not independent
39.
Understanding Independence
P(E ∩ F)=P(E)P(F)
Independent events give no information about each other.
E={1,3,5}, F={1,2,3,4}
E ∩ F={1,3}
P(E)=1/2, P(F)=2/3, P(E & F)=1/3
P(E)P(F)=(1/2)(2/3)=1/3=P(E & F)
Note: this is different from disjoint events, for which P(E ∩ F) = 0!
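The independence test is a single comparison, so the examples from the slides can be checked directly. The sketch assumes a fair die; `prob` and `independent` are hypothetical helper names. The last call illustrates the note about disjoint events: two disjoint events with nonzero probability are not independent.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))

def prob(event):
    """P(event) for a fair die."""
    return Fraction(len(set(event) & OMEGA), len(OMEGA))

def independent(e, f):
    """True when P(E ∩ F) equals P(E)P(F)."""
    return prob(set(e) & set(f)) == prob(e) * prob(f)

print(independent({1, 3, 5}, {1, 2, 3, 4}))  # True: 1/3 == (1/2)(2/3)
print(independent({1, 3}, {1, 2, 3, 4}))     # False: 1/3 != (1/3)(2/3)
print(independent({1, 3, 5}, {2, 4, 6}))     # False: disjoint, 0 != (1/2)(1/2)
```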
42.
If I tell you that I rolled a number less than 4, what is the probability that I rolled an odd number?
Conditional Probability: The probability of event E given that event F has happened
P(E | F) = P(E & F) / P(F)
E={1,3,5}, F={1,2,3}
E ∩ F={1,3}
P(E & F)=1/3, P(F)=1/2
P(E|F)=(1/3)/(1/2)=2/3
44.
Conditional Probability
What if E and F are independent events?
P(E | F) = P(E & F) / P(F) = P(E)P(F) / P(F) = P(E)
If I tell you that I rolled a number less than 5, what is the probability that I rolled an odd number?
(same example of independent events that I used before)
E={1,3,5}, F={1,2,3,4}
E ∩ F={1,3}
P(E)=1/2, P(F)=2/3, P(E & F)=1/3
P(E | F)=(1/3)/(2/3)=1/2=P(E)
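Both conditional-probability examples can be reproduced by restricting attention to the outcomes in F. A minimal sketch for a fair die, where `prob` and `cond_prob` are hypothetical helper names:

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))

def prob(event):
    """P(event) for a fair die."""
    return Fraction(len(set(event) & OMEGA), len(OMEGA))

def cond_prob(e, f):
    """P(E | F) = P(E & F) / P(F); undefined when P(F) = 0."""
    p_f = prob(f)
    if p_f == 0:
        raise ValueError("P(F) is zero, so P(E | F) is undefined")
    return prob(set(e) & set(f)) / p_f

odd = {1, 3, 5}
print(cond_prob(odd, {1, 2, 3}))     # 2/3: knowing "less than 4" changes the answer
print(cond_prob(odd, {1, 2, 3, 4}))  # 1/2: "less than 5" is independent of "odd"
```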
49.
The multiplication rule
Rearranging the definition of conditional probability, we get:
P(E & F) = P(E | F) P(F)
If we have more events, say E, F, G... (and change & to , for simpler notation):
P(E, F, G) = P(E, F | G) P(G) = P(E | F, G) P(F | G) P(G)
If you forget everything else today about conditional probability, just remember (writing EF for E & F):
P(EF) = P(E | F) P(F)
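The three-event form of the multiplication rule can be checked numerically on the die. This is a sketch only: the choice of G ("less than 5") is an illustrative assumption, and `prob` and `cond_prob` are hypothetical helpers.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))

def prob(event):
    """P(event) for a fair die."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def cond_prob(e, given):
    """P(E | given) = P(E & given) / P(given)."""
    return prob(e & given) / prob(given)

E = frozenset({1, 3, 5})      # odd
F = frozenset({1, 2, 3})      # less than 4
G = frozenset({1, 2, 3, 4})   # less than 5 (illustrative choice)

joint = prob(E & F & G)                                    # P(E, F, G)
chain = cond_prob(E, F & G) * cond_prob(F, G) * prob(G)    # P(E|F,G) P(F|G) P(G)

print(joint, chain)
```

Here the chain is (2/3)(3/4)(2/3) = 1/3, matching the direct count of E ∩ F ∩ G = {1,3}.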
50.
The Wrong Conditioning Variable:
“The CASA (Center for Addiction and Substance Abuse at Columbia) study establishes a clear progression that begins with gateway drugs and leads to cocaine use: nearly 90% of people who have ever tried cocaine used all three gateway substances [alcohol, marijuana, & cigarettes] first.”
Source: http://www.columbia.edu/cu/record/archives/vol20/vol20_iss10/record2010.24.html
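The quoted figure is P(gateway | cocaine), whereas a "progression" claim is about P(cocaine | gateway). The two can differ enormously. The sketch below flips the conditional using P(B | A) = P(A | B) P(B) / P(A), which follows from the multiplication rule. The two base rates are made-up numbers chosen purely for illustration, not estimates from the CASA study or any other source, and `flip_conditional` is a hypothetical helper name.

```python
def flip_conditional(p_a_given_b, p_b, p_a):
    """Return P(B | A) from P(A | B), P(B) and P(A)."""
    return p_a_given_b * p_b / p_a

p_gateway_given_cocaine = 0.90   # "nearly 90%", as quoted on the slide
p_cocaine = 0.02                 # HYPOTHETICAL base rate, for illustration only
p_gateway = 0.40                 # HYPOTHETICAL base rate, for illustration only

p_cocaine_given_gateway = flip_conditional(
    p_gateway_given_cocaine, p_cocaine, p_gateway
)
print(round(p_cocaine_given_gateway, 3))
```

With these illustrative inputs, a 90% figure in one direction corresponds to under 5% in the other, so the quoted statistic alone says little about how often gateway use leads to cocaine use.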
51.
Extra: Monty Hall problem
Suppose you're on a game show and you're given the choice of three doors [and will win what is behind the chosen door]. Behind one door is a car; behind the others, goats. The car and the goats were placed randomly behind the doors before the show. The rules of the game show are as follows: After you have chosen a door, the door remains closed for the time being. The game show host, Monty Hall, who knows what is behind the doors, now has to open one of the two remaining doors, and the door he opens must have a goat behind it. If both remaining doors have goats behind them, he chooses one [uniformly] at random. After Monty Hall opens a door with a goat, he will ask you to decide whether you want to stay with your first choice or to switch to the last remaining door. Imagine that you chose Door 1 and the host opens Door 3, which has a goat. He then asks you "Do you want to switch to Door Number 2?" Is it to your advantage to change your choice?
- Krauss and Wang 2003:10
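One way to settle the question without relying on intuition is to enumerate every (car location, door opened) pair under the stated rules and apply the definition of conditional probability. A sketch under exactly those rules; `joint_distribution` and `posterior_car` are hypothetical helper names.

```python
from fractions import Fraction

DOORS = (1, 2, 3)

def joint_distribution(first_choice=1):
    """P(car = c, Monty opens o) for each possible pair, given the first choice."""
    joint = {}
    for car in DOORS:                       # car placed uniformly at random
        # Monty may only open a door that is neither chosen nor hiding the car.
        openable = [d for d in DOORS if d != first_choice and d != car]
        for opened in openable:             # uniform when he has two options
            joint[(car, opened)] = Fraction(1, len(DOORS)) * Fraction(1, len(openable))
    return joint

def posterior_car(opened, first_choice=1):
    """P(car = c | Monty opens `opened`) = P(car = c, opened) / P(opened)."""
    joint = joint_distribution(first_choice)
    p_opened = sum(p for (_, o), p in joint.items() if o == opened)
    return {c: p / p_opened for (c, o), p in joint.items() if o == opened}

print(posterior_car(opened=3))
```

Having chosen Door 1 and seen Door 3 opened, the car is behind Door 1 with probability 1/3 and behind Door 2 with probability 2/3, so switching is advantageous.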
53.
Bayes’ Rule (a.k.a. Bayes’ Theorem)
Note that the multiplication rule is symmetric, so
P(E, F) = P(E | F) P(F) = P(F | E) P(E)
Dividing through by P(F) yields
P(E | F) = P(F | E) P(E) / P(F)
54.
Bayes’ Rule (a.k.a. Bayes’ Theorem)
A way to infer one conditional probability from another (probabilistic inference):
P(H | D) = P(D | H) P(H) / P(D)
Prior: P(H) (“prior” belief about H)
Likelihood: P(D | H) (under H, how likely it is to see D)
Posterior: P(H | D) (“updated” belief about H after seeing D)
Note: Named after Rev. Thomas Bayes (1702-1761)
55.
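Marginalization
How do we calculate the denominator in Bayes’ rule? The probability of any event E can be split according to whether another event F happens. Note that
E = (E ∩ F) ∪ (E ∩ F^c)
and the two pieces are disjoint, so that (writing EF for E ∩ F)
P(E) = P(EF) + P(EF^c)
     = P(E | F) P(F) + P(E | F^c) P(F^c)
     = P(E | F) P(F) + P(E | F^c) (1 - P(F))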
56.
Marginalization
So Bayes’ rule for two hypotheses (H and its complement H^c) is
P(H | D) = P(D | H) P(H) / [P(D | H) P(H) + P(D | H^c) P(H^c)]
(phew!)
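The two-hypothesis form translates into a few lines of code. This is a sketch: `posterior` is a hypothetical helper name, and the inputs in the usage lines are arbitrary numbers chosen only to exercise the formula.

```python
from fractions import Fraction

def posterior(prior, likelihood, likelihood_if_not):
    """P(H | D) from P(H), P(D | H) and P(D | H^c).

    The denominator P(D) is obtained by marginalizing over H and H^c.
    Assumes P(D) > 0.
    """
    evidence = likelihood * prior + likelihood_if_not * (1 - prior)
    return likelihood * prior / evidence

# Arbitrary illustrative inputs: P(H) = 1/2, P(D | H) = 3/4, P(D | H^c) = 1/4.
print(posterior(Fraction(1, 2), Fraction(3, 4), Fraction(1, 4)))  # 3/4

# If D is equally likely under both hypotheses, the belief does not move.
print(posterior(Fraction(1, 5), Fraction(1, 3), Fraction(1, 3)))  # 1/5
```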
57.
Exercise
• A laboratory blood test is 95% effective in detecting a disease when it is, in fact, present. However, the test also yields a “false positive” result for 1 percent of the healthy persons tested. If 0.5 percent of the population actually has the disease, what is the probability a person has the disease given that the test result is positive? (from Ross, Section 3.3 example 3d)
58.
Solution
• D: the event that the tested person has the disease.
• E: the event that the test result is positive.
• We know: P(E | D) = 0.95, P(E | D^c) = 0.01, P(D) = 0.005.
• We want to know P(D | E).
59.
Solution
P(D | E) = P(D, E) / P(E)
         = P(E | D) P(D) / [P(E | D) P(D) + P(E | D^c) P(D^c)]
         = (.95)(.005) / [(.95)(.005) + (.01)(.995)]
         ≈ .323
(Why is it so low?)
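One way to see why the answer is so low is to count people instead of multiplying probabilities. The sketch below applies the exercise's three rates to a hypothetical population of 100,000; the population size is an arbitrary round number, not part of the exercise.

```python
population = 100_000                      # hypothetical round number
diseased = population * 5 // 1000         # 0.5% have the disease -> 500
healthy = population - diseased           # 99,500

true_positives = diseased * 95 // 100     # 95% of the diseased test positive -> 475
false_positives = healthy * 1 // 100      # 1% of the healthy test positive -> 995

p_disease_given_positive = true_positives / (true_positives + false_positives)
print(true_positives, false_positives, round(p_disease_given_positive, 3))
```

Because healthy people vastly outnumber diseased ones, even a 1% false-positive rate produces about twice as many false positives (995) as true positives (475), so only about a third of positive results come from people who have the disease.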
60.
Does the Brain Perform Bayesian Inference?
Opinion, TRENDS in Neurosciences Vol.27 No.12 December 2004
The Bayesian brain: the role of uncertainty in neural coding and computation
David C. Knill and Alexandre Pouget
Center for Visual Science and the Department of Brain and Cognitive Science, University of Rochester, NY 14627, USA

To use sensory information efficiently to make judgments and guide action in the world, the brain must represent and use information about uncertainty in its computations for perception and action. Bayesian methods have proven successful in building computational theories for perception and sensorimotor control, and psychophysics is providing a growing body of evidence that human perceptual computations are ‘Bayes’ optimal’. This leads to the ‘Bayesian coding hypothesis’: that the brain represents sensory information probabilistically, in the form of probability distributions. Several computational schemes have recently been proposed for how this might be achieved in populations of neurons. Neurophysiological data on the hypothesis, however, is almost non-existent. A major challenge for neuroscientists is to test these ideas experimentally, and so determine whether and how neurons code information about sensory uncertainty.

Humans and other animals operate in a world of sensory uncertainty. Although introspection tells us that percep-

Bayesian inference and the Bayesian coding hypothesis
The fundamental concept behind the Bayesian approach to perceptual computations is that the information provided by a set of sensory data about the world is represented by a conditional probability density function over the set of unknown variables – the posterior density function. A Bayesian perceptual system, therefore, would represent the perceived depth of an object, for example, not as a single number Z but as a conditional probability density function p(Z/I), where I is the available image information (e.g. stereo disparities). Loosely speaking, p(Z/I) would specify the relative probability that the object is at different depths Z, given the available sensory information.
More generally, the component computations that underlay Bayesian inferences [that give rise to p(Z/I)] are ideally performed on representations of conditional probability density functions rather than on unitary estimates of parameter values. Loosely speaking, a Bayes’ optimal system maintains, at each stage of local computation, a representation of all possible values of the
61.
Does the Brain Perform Bayesian Inference?
• An appealing idea because:
  ‣ It could explain why the brain works as it does (i.e., it is performing optimal Bayesian inference)
• An unappealing idea because:
  ‣ For even some simple problems it can be difficult to know what the optimal Bayesian inference is
  ‣ Computation of probabilities can be difficult
62.
Talk Outline
‣ Next session: Random Variables etc.