Introduction to ArtificiaI Intelligence in Higher Education
Module #19 – Probability Theory
1. Module #19 – Probability
Probability Theory
Slides for a Course Based on the TextSlides for a Course Based on the Text
Discrete Mathematics & Its ApplicationsDiscrete Mathematics & Its Applications (6(6thth
Edition) Kenneth H. RosenEdition) Kenneth H. Rosen
based on slides bybased on slides by
Michael P. Frank and Andrew W. MooreMichael P. Frank and Andrew W. Moore
2. Module #19 – Probability
Terminology
• A (stochastic)A (stochastic) experimentexperiment is a procedure thatis a procedure that
yields one of a given set of possible outcomesyields one of a given set of possible outcomes
• TheThe sample spacesample space SS of the experiment is theof the experiment is the
set of possible outcomes.set of possible outcomes.
• An event is aAn event is a subsetsubset of sample space.of sample space.
• A random variable is a function that assigns aA random variable is a function that assigns a
real value to each outcome of an experimentreal value to each outcome of an experiment
Normally, a probability is related to an experiment or a trial.
Let’s take flipping a coin for example, what are the possible outcomes?
Heads or tails (front or back side) of the coin will be shown upwards.
After a sufficient number of tossing, we can “statistically” conclude
that the probability of head is 0.5.
In rolling a dice, there are 6 outcomes. Suppose we want to calculate the
prob. of the event of odd numbers of a dice. What is that probability?
3. Module #19 – Probability
Random Variables
• AA “random variable”“random variable” VV is any variable whoseis any variable whose
value is unknown, or whose value depends onvalue is unknown, or whose value depends on
the precise situation.the precise situation.
– E.g.E.g., the number of students in class today, the number of students in class today
– Whether it will rain tonight (Boolean variable)Whether it will rain tonight (Boolean variable)
• The propositionThe proposition VV==vvii may have an uncertainmay have an uncertain
truth value, and may be assigned atruth value, and may be assigned a probability.probability.
4. Module #19 – Probability
Example 10
• A fair coin is flipped 3 times. LetA fair coin is flipped 3 times. Let SS be the sample space of 8be the sample space of 8
possible outcomes, and letpossible outcomes, and let XX be a random variable that assigneesbe a random variable that assignees
to an outcome the number of heads in this outcome.to an outcome the number of heads in this outcome.
• Random variableRandom variable XX is a functionis a function XX::SS →→ XX((SS),),
wherewhere XX((SS)={0, 1, 2, 3} is the range of)={0, 1, 2, 3} is the range of X,X, which is the numberthe number
of heads, andof heads, and
SS={={ (TTT), (TTH), (THH), (HTT), (HHT), (HHH), (THT), (HTH)(TTT), (TTH), (THH), (HTT), (HHT), (HHH), (THT), (HTH) }}
• X(TTT) = 0X(TTT) = 0
X(TTH) = X(HTT) = X(THT) = 1X(TTH) = X(HTT) = X(THT) = 1
X(HHT) = X(THH) = X(HTH) = 2X(HHT) = X(THH) = X(HTH) = 2
X(HHH) = 3X(HHH) = 3
• TheThe probability distribution (pdf) of random variableprobability distribution (pdf) of random variable XX
is given byis given by
P(X=3) = 1/8, P(X=2) = 3/8, P(X=1) = 3/8, P(X=0) = 1/8.P(X=3) = 1/8, P(X=2) = 3/8, P(X=1) = 3/8, P(X=0) = 1/8.
5. Module #19 – Probability
Experiments & Sample Spaces
• A (stochastic)A (stochastic) experimentexperiment is any process by whichis any process by which
a given random variablea given random variable VV gets assigned somegets assigned some
particularparticular value, and where this value is notvalue, and where this value is not
necessarily known in advance.necessarily known in advance.
– We call it the “actual” value of the variable, asWe call it the “actual” value of the variable, as
determined by that particular experiment.determined by that particular experiment.
• TheThe sample spacesample space SS of the experiment is justof the experiment is just
the domain of the random variable,the domain of the random variable, SS = dom[= dom[VV]]..
• TheThe outcomeoutcome of the experiment is the specificof the experiment is the specific
valuevalue vvii of the random variable that is selected.of the random variable that is selected.
6. Module #19 – Probability
Events
• AnAn eventevent EE is any set of possible outcomes inis any set of possible outcomes in SS……
– That is,That is, EE ⊆⊆ SS == domdom[[VV]]..
• E.g., the event that “less than 50 people show up for our nextE.g., the event that “less than 50 people show up for our next
class” is represented as the setclass” is represented as the set {1, 2, …, 49}{1, 2, …, 49} of values of theof values of the
variablevariable VV = (# of people here next class)= (# of people here next class)..
• We say that eventWe say that event EE occursoccurs when the actual valuewhen the actual value
ofof VV is inis in EE, which may be written, which may be written VV∈∈EE..
– Note thatNote that VV∈∈EE denotes the proposition (of uncertaindenotes the proposition (of uncertain
truth) asserting that the actual outcome (value oftruth) asserting that the actual outcome (value of VV) will) will
be one of the outcomes in the setbe one of the outcomes in the set EE..
7. Module #19 – Probability
Probabilities
• We write P(A) as “the fraction of possibleWe write P(A) as “the fraction of possible
worlds in which A is true”worlds in which A is true”
8. Module #19 – Probability
Visualizing A
Event space of
all possible
worlds
Its area is 1
Worlds in which A is False
Worlds in which
A is true
P(A) = Area of
reddish oval
9. Module #19 – Probability
Probability
• TheThe probabilityprobability pp = Pr[= Pr[EE]] ∈∈ [0,1][0,1] of an eventof an event EE isis
a real number representing our degree of certaintya real number representing our degree of certainty
thatthat EE will occur.will occur.
– IfIf Pr[Pr[EE] = 1] = 1, then, then EE is absolutely certain to occur,is absolutely certain to occur,
• thusthus VV∈∈EE has the truth valuehas the truth value TrueTrue..
– IfIf Pr[Pr[EE] = 0] = 0, then, then EE is absolutely certainis absolutely certain notnot to occur,to occur,
• thusthus VV∈∈EE has the truth valuehas the truth value FalseFalse..
– IfIf Pr[Pr[EE] =] = ½½, then we are, then we are maximally uncertainmaximally uncertain aboutabout
whetherwhether EE will occur; that is,will occur; that is,
• VV∈∈EE andand VV∉∉EE are consideredare considered equally likelyequally likely..
– How do we interpret other values ofHow do we interpret other values of pp??
Note: We could also define probabilities for more
general propositions, as well as events.
10. Module #19 – Probability
Four Definitions of Probability
• Several alternative definitions of probabilitySeveral alternative definitions of probability
are commonly encountered:are commonly encountered:
– Frequentist, Bayesian, Laplacian, AxiomaticFrequentist, Bayesian, Laplacian, Axiomatic
• They have different strengths &They have different strengths &
weaknesses, philosophically speaking.weaknesses, philosophically speaking.
– But fortunately, they coincide with each otherBut fortunately, they coincide with each other
and work well together, in the majority of casesand work well together, in the majority of cases
that are typically encountered.that are typically encountered.
11. Module #19 – Probability
Probability: Frequentist Definition
• The probability of an eventThe probability of an event EE is the limit, asis the limit, as nn→∞→∞,, ofof
the fraction of times that we findthe fraction of times that we find VV∈∈EE over the courseover the course
ofof nn independent repetitions of (different instances of)independent repetitions of (different instances of)
the same experiment.the same experiment.
• Some problems with this definition:Some problems with this definition:
– It is only well-defined for experiments that can beIt is only well-defined for experiments that can be
independently repeated, infinitely many times!independently repeated, infinitely many times!
• or at least, if the experiment can be repeated in principle,or at least, if the experiment can be repeated in principle, e.g.e.g., over, over
some hypothetical ensemble of (say) alternate universes.some hypothetical ensemble of (say) alternate universes.
– It can never be measured exactly in finite time!It can never be measured exactly in finite time!
• Advantage:Advantage: It’s an objective, mathematical definition.It’s an objective, mathematical definition.
n
n
E EV
n
∈
∞→
≡ lim:]Pr[
12. Module #19 – Probability
Probability: Bayesian Definition
• Suppose a rational, profit-maximizing entitySuppose a rational, profit-maximizing entity RR isis
offered a choice between two rewards:offered a choice between two rewards:
– WinningWinning $1$1 if and only if the eventif and only if the event EE actually occurs.actually occurs.
– ReceivingReceiving pp dollars (wheredollars (where pp∈∈[0,1][0,1]) unconditionally.) unconditionally.
• IfIf RR can honestly state that he is completelycan honestly state that he is completely
indifferent between these two rewards, then we sayindifferent between these two rewards, then we say
thatthat RR’s probability for’s probability for EE isis pp, that is,, that is, PrPrRR[[EE] :] :≡≡ pp..
• Problem:Problem: It’s a subjective definition; depends on theIt’s a subjective definition; depends on the
reasonerreasoner RR, and his knowledge, beliefs, & rationality., and his knowledge, beliefs, & rationality.
– The version above additionally assumes that the utility ofThe version above additionally assumes that the utility of
money is linear.money is linear.
• This assumption can be avoided by using “utils” (utility units)This assumption can be avoided by using “utils” (utility units)
instead of dollars.instead of dollars.
13. Module #19 – Probability
Probability: Laplacian Definition
• First, assume that all individual outcomes in theFirst, assume that all individual outcomes in the
sample space aresample space are equally likelyequally likely to each otherto each other……
– Note that this term still needs an operational definition!Note that this term still needs an operational definition!
• Then, the probability of any eventThen, the probability of any event EE is given by,is given by,
Pr[Pr[EE] = |] = |EE|/||/|SS||.. Very simple!Very simple!
• Problems:Problems: Still needs a definition forStill needs a definition for equally likelyequally likely,,
and depends on the existence ofand depends on the existence of somesome finite samplefinite sample
spacespace SS in which all outcomes inin which all outcomes in SS are, in fact,are, in fact,
equally likely.equally likely.
14. Module #19 – Probability
Probability: Axiomatic Definition
• LetLet pp be any total functionbe any total function pp::SS→[0,1]→[0,1] such thatsuch that
∑∑ss pp((ss) = 1) = 1..
• Such aSuch a pp is called ais called a probability distributionprobability distribution..
• Then, theThen, the probability under pprobability under p
of any eventof any event EE⊆⊆SS is just:is just:
• Advantage:Advantage: Totally mathematically well-defined!Totally mathematically well-defined!
– This definition can even be extended to apply to infiniteThis definition can even be extended to apply to infinite
sample spaces, by changingsample spaces, by changing ∑∑→→∫∫, and calling, and calling pp aa
probability density functionprobability density function or a probabilityor a probability measuremeasure..
• Problem:Problem: Leaves operational meaning unspecified.Leaves operational meaning unspecified.
∑∈
≡
Es
spEp )(:][
15. Module #19 – Probability
The Axioms of Probability
• 0 <= P(A) <= 10 <= P(A) <= 1
• P(True) = 1P(True) = 1
• P(False) = 0P(False) = 0
• P(A or B) = P(A) + P(B) - P(A and B)P(A or B) = P(A) + P(B) - P(A and B)
16. Module #19 – Probability
Interpreting the axioms
• 0 <= P(A)0 <= P(A) <= 1<= 1
• P(True) = 1P(True) = 1
• P(False) = 0P(False) = 0
• P(A or B) = P(A) + P(B) - P(A and B)P(A or B) = P(A) + P(B) - P(A and B)
The area of A can’t get
any smaller than 0
And a zero area would
mean no world could
ever have A true
17. Module #19 – Probability
Interpreting the axioms
• 0 <=0 <= P(A) <= 1P(A) <= 1
• P(True) = 1P(True) = 1
• P(False) = 0P(False) = 0
• P(A or B) = P(A) + P(B) - P(A and B)P(A or B) = P(A) + P(B) - P(A and B)
The area of A can’t get
any bigger than 1
And an area of 1 would
mean all worlds will have
A true
19. Module #19 – Probability
Theorems from the Axioms
• 0 <= P(A) <= 1, P(True) = 1, P(False) = 00 <= P(A) <= 1, P(True) = 1, P(False) = 0
• P(P(AA oror BB) = P() = P(AA) + P() + P(BB) - P() - P(AA andand BB))
From these we can prove:From these we can prove:
P(not A) = P(~A) = 1-P(A)P(not A) = P(~A) = 1-P(A)
• How?How?
20. Module #19 – Probability
Another important theorem
• 0 <= P(A) <= 1, P(True) = 1, P(False) = 00 <= P(A) <= 1, P(True) = 1, P(False) = 0
• P(P(AA oror BB) = P() = P(AA) + P() + P(BB) - P() - P(AA andand BB))
From these we can prove:From these we can prove:
P(A) = P(A ^ B) + P(A ^ ~B)P(A) = P(A ^ B) + P(A ^ ~B)
• How?How?
21. Module #19 – Probability
Probability of an event E
The probability of an event E is the sum of theThe probability of an event E is the sum of the
probabilities of the outcomes in E. That isprobabilities of the outcomes in E. That is
Note that, if there are n outcomes in the eventNote that, if there are n outcomes in the event
E, that is, if E = {aE, that is, if E = {a11,a,a22,…,a,…,ann} then} then
p(E) = p(s)
s∈E
∑
p(E) = p(
i =1
n
∑ ai )
22. Module #19 – Probability
Example
• What is the probability that, if we flip a coinWhat is the probability that, if we flip a coin
three times, that we get an odd number ofthree times, that we get an odd number of
tails?tails?
((TTTTTT), (TTH), (), (TTH), (TTHH), (HTT), (HHHH), (HTT), (HHTT), (HHH),), (HHH),
(THT), (H(THT), (HTTH)H)
Each outcome has probability 1/8,Each outcome has probability 1/8,
p(odd number of tails) = 1/8+1/8+1/8+1/8 = ½p(odd number of tails) = 1/8+1/8+1/8+1/8 = ½
23. Module #19 – Probability
Visualizing
Sample Space
• 1.1. ListingListing
– S = {Head, Tail}S = {Head, Tail}
• 2.2. Venn DiagramVenn Diagram
• 3.3. Contingency TableContingency Table
• 4.4. Decision Tree DiagramDecision Tree Diagram
27. Module #19 – Probability
Discrete Random Variable
– Possible values (outcomes) are discretePossible values (outcomes) are discrete
• E.g., natural number (0, 1, 2, 3 etc.)E.g., natural number (0, 1, 2, 3 etc.)
– Obtained by CountingObtained by Counting
– Usually Finite Number of ValuesUsually Finite Number of Values
• But could be infinite (must be “countable”)But could be infinite (must be “countable”)
28. Module #19 – Probability
Discrete Probability Distribution
( also called probability mass function (pmf) )
1.1.List of All possible [List of All possible [xx,, pp((xx)] pairs)] pairs
– xx = Value of Random Variable (Outcome)= Value of Random Variable (Outcome)
– pp((xx) = Probability Associated with Value) = Probability Associated with Value
2.2.Mutually Exclusive (No Overlap)Mutually Exclusive (No Overlap)
3.3.Collectively Exhaustive (Nothing Left Out)Collectively Exhaustive (Nothing Left Out)
4. 04. 0 ≤≤ pp((xx)) ≤≤ 11
5.5. ∑∑ pp((xx) = 1) = 1
29. Module #19 – Probability
Visualizing Discrete Probability
Distributions
{ (0, .25), (1, .50), (2, .25) }{ (0, .25), (1, .50), (2, .25) }{ (0, .25), (1, .50), (2, .25) }{ (0, .25), (1, .50), (2, .25) }
ListingListing
TableTable
GraphGraph
EquationEquation
# Tails# Tails f(xf(x))
CountCount
p(xp(x))
00 11 .25.25
11 22 .50.50
22 11 .25.25
pp xx
nn
xx nn xx
pp ppxx nn xx
(( ))
!!
!!(( ))!!
(( ))==
−−
−− −−
11
.00.00
.25.25
.50.50
00 11 22
xx
p(x)p(x)
30. Module #19 – Probability
Mutually Exclusive Events
• Two eventsTwo events EE11,, EE22 are calledare called mutuallymutually
exclusiveexclusive if they are disjoint:if they are disjoint: EE11∩∩EE22 == ∅∅
– Note that two mutually exclusive eventsNote that two mutually exclusive events cannotcannot
both occurboth occur in the same instance of a givenin the same instance of a given
experiment.experiment.
• For mutually exclusive events,For mutually exclusive events,
Pr[Pr[EE11 ∪∪ EE22] = Pr[] = Pr[EE11] + Pr[] + Pr[EE22]]..
31. Module #19 – Probability
Exhaustive Sets of Events
• A setA set EE = {= {EE11,, EE22, …}, …} of events in the sampleof events in the sample
spacespace SS is calledis called exhaustiveexhaustive iff .iff .
• An exhaustive setAn exhaustive set EE of events that are all mutuallyof events that are all mutually
exclusive with each other has the property thatexclusive with each other has the property that
SEi =
.1]Pr[ =∑ iE
32. Module #19 – Probability
Elementary Probability Rules
• P(~A) + P(A) = 1P(~A) + P(A) = 1
• P(B) = P(B ^ A) + P(B ^ ~A)P(B) = P(B ^ A) + P(B ^ ~A)
33. Module #19 – Probability
Bernoulli Trials
• Each performance of an experiment withEach performance of an experiment with
only two possible outcomes is called aonly two possible outcomes is called a
Bernoulli trialBernoulli trial..
• In general, a possible outcome of aIn general, a possible outcome of a
Bernoulli trial is called aBernoulli trial is called a successsuccess or aor a
failurefailure..
• If p is the probability of a success and q isIf p is the probability of a success and q is
the probability of a failure, then p+q=1.the probability of a failure, then p+q=1.
34. Module #19 – Probability
Probability of k successes in n
independent Bernoulli trials.
The probability of k successes in nThe probability of k successes in n
independent Bernoulli trials, withindependent Bernoulli trials, with
probability of success p and probability ofprobability of success p and probability of
failure q = 1-p is C(n,k)pfailure q = 1-p is C(n,k)pkk
qqn-kn-k
35. Module #19 – Probability
Example
A coin is biased so that the probability of heads isA coin is biased so that the probability of heads is
2/3. What is the probability that exactly four2/3. What is the probability that exactly four
heads come up when the coin is flipped sevenheads come up when the coin is flipped seven
times, assuming that the flips are independent?times, assuming that the flips are independent?
The number of ways that we can get four heads is:The number of ways that we can get four heads is:
C(7,4) = 7!/4!3!= 7*5 = 35C(7,4) = 7!/4!3!= 7*5 = 35
The probability of getting four heads and three tailsThe probability of getting four heads and three tails
is (2/3)is (2/3)44
(1/3)(1/3)3=3=
16/316/377
p(4 heads and 3 tails) is C(7,4)p(4 heads and 3 tails) is C(7,4) (2/3)(2/3)44
(1/3)(1/3)33
= 35*16/3= 35*16/377
= 560/2187= 560/2187
36. Module #19 – Probability
Find each of the following probabilities when
n independent Bernoulli trials are carried out
with probability of success, p.
• Probability of no successes.Probability of no successes.
C(n,0)pC(n,0)p00
qqn-kn-k
= 1(p= 1(p00
)(1-p))(1-p)nn
= (1-p)= (1-p)nn
• Probability of at least one success.Probability of at least one success.
1 - (1-p)1 - (1-p)nn
(why?)(why?)
37. Module #19 – Probability
Find each of the following probabilities when
n independent Bernoulli trials are carried out
with probability of success, p.
• Probability of at most one success.Probability of at most one success.
Means there can be no successes or oneMeans there can be no successes or one
successsuccess
C(n,0)pC(n,0)p00
qqn-0n-0
+C(n,1)p+C(n,1)p11
qqn-1n-1
(1-p)(1-p)nn
+ np(1-p)+ np(1-p)n-1n-1
• Probability of at least two successes.Probability of at least two successes.
1 -1 - (1-p)(1-p)nn
- np(1-p)- np(1-p)n-1n-1
38. Module #19 – Probability
A coin is flipped until it comes ups tails. The
probability the coin comes up tails is p.
• What is the probability that the experimentWhat is the probability that the experiment
ends after n flips, that is, the outcomeends after n flips, that is, the outcome
consists of n-1 heads and a tail?consists of n-1 heads and a tail?
(1-p)(1-p)n-1n-1
pp
39. Module #19 – Probability
Probability vs. Odds
• You may have heard the term “odds.”You may have heard the term “odds.”
– It is widely used in the gambling community.It is widely used in the gambling community.
• This is not the same thing as probability!This is not the same thing as probability!
– But, it is very closely related.But, it is very closely related.
• TheThe odds in favorodds in favor of an eventof an event EE means themeans the relativerelative
probability ofprobability of EE compared with its complementcompared with its complement EE..
OO((EE) :) :≡ Pr(≡ Pr(EE)/Pr()/Pr(EE))..
– E.g.E.g., if, if pp((EE) = 0.6) = 0.6 thenthen pp((EE) = 0.4) = 0.4 andand OO((EE) = 0.6/0.4 = 1.5) = 0.6/0.4 = 1.5..
• Odds are conventionally written as a ratio of integers.Odds are conventionally written as a ratio of integers.
– E.g.E.g.,, 3/23/2 oror 3:23:2 in above example. “Three to two in favor.”in above example. “Three to two in favor.”
• TheThe odds againstodds against EE just meansjust means 1/1/OO((EE)).. “2 to 3 against”“2 to 3 against”
Exercise:
Express the
probability
p as a function
of the odds in
favor O.
40. Module #19 – Probability
Example 1: Balls-and-Urn
• Suppose an urn contains 4 blue balls and 5 red balls.Suppose an urn contains 4 blue balls and 5 red balls.
• An exampleAn example experiment:experiment: Shake up the urn, reach inShake up the urn, reach in
(without looking) and pull out a ball.(without looking) and pull out a ball.
• AA random variablerandom variable VV: Identity of the chosen ball.: Identity of the chosen ball.
• TheThe sample spacesample space SS: The set of: The set of
all possible values ofall possible values of VV::
– In this case,In this case, SS = {= {bb11,…,,…,bb99}}
• AnAn eventevent EE: “The ball chosen is: “The ball chosen is
blue”:blue”: EE = { ______________ }= { ______________ }
• What are the odds in favor ofWhat are the odds in favor of EE??
• What is the probability ofWhat is the probability of EE??
b1 b2
b3
b4
b5
b6
b7
b8
b9
41. Module #19 – Probability
Independent Events
• Two events E,F are called independent if
Pr[E∩F] = Pr[E]·Pr[F].
• Relates to the product rule for the number
of ways of doing two independent tasks.
• Example: Flip a coin, and roll a die.
Pr[(coin shows heads) ∩ (die shows 1)] =
Pr[coin is heads] × Pr[die is 1] = ½×1/6 =1/12.
42. Module #19 – Probability
Example
Suppose a red die and
a blue die are rolled.
The sample space:
1 2 3 4 5 6
1 x x x x x x
2 x x x x x x
3 x x x x x x
4 x x x x x x
5 x x x x x x
6 x x x x x x
Are the events
sum is 7 and
the blue die is 3
independent?
43. Module #19 – Probability
|S| = 36 1 2 3 4 5 6
1 x x x x x x
2 x x x x x x
3 x x x x x x
4 x x x x x x
5 x x x x x x
6 x x x x x x
|sum is 7| = 6
|blue die is 3| = 6
| in intersection | = 1
p(sum is 7 and blue die is 3) =1/36
p(sum is 7) p(blue die is 3) =6/36*6/36=1/36
Thus, p((sum is 7) and (blue die is 3)) = p(sum is 7) p(blue die is 3)
The events sum is 7 and
the blue die is 3 are independent:
44. Module #19 – Probability
Conditional Probability
• LetLet EE,,FF be any events such thatbe any events such that Pr[Pr[FF]>0]>0..
• Then, theThen, the conditional probabilityconditional probability ofof EE givengiven FF,,
writtenwritten Pr[Pr[EE||FF]], is defined as, is defined as
Pr[Pr[EE||FF] :] :≡≡ Pr[Pr[EE∩∩FF]/Pr[]/Pr[FF]]..
• This is what our probability thatThis is what our probability that EE would turn outwould turn out
to occur should be, if we are givento occur should be, if we are given onlyonly thethe
information thatinformation that FF occurs.occurs.
• IfIf EE andand FF are independent thenare independent then Pr[Pr[EE||FF] = Pr[] = Pr[EE]]..
Pr[Pr[EE||FF] = Pr[] = Pr[EE∩∩FF]/Pr[]/Pr[FF] = Pr[] = Pr[EE]]×Pr[×Pr[FF]/Pr[]/Pr[FF] = Pr[] = Pr[EE]]
45. Module #19 – Probability
Visualizing Conditional Probability
• If we are given that eventIf we are given that event FF occurs, thenoccurs, then
– Our attention gets restricted to the subspaceOur attention gets restricted to the subspace FF..
• OurOur posteriorposterior probability forprobability for EE (after seeing(after seeing
FF)) correspondscorresponds
to theto the fractionfraction
ofof FF wherewhere EE
occurs also.occurs also.
• Thus,Thus, pp′(′(EE)=)=
pp((EE∩∩FF)/)/pp((FF).).
Entire sample space S
Event FEvent E
Event
E∩F
46. Module #19 – Probability
• Suppose I choose a single letter out of the 26-letter English
alphabet, totally at random.
– Use the Laplacian assumption on the sample space {a,b,..,z}.
– What is the (prior) probability
that the letter is a vowel?
• Pr[Vowel] = __ / __ .
• Now, suppose I tell you that the
letter chosen happened to be in
the first 9 letters of the alphabet.
– Now, what is the conditional (or
posterior) probability that the letter
is a vowel, given this information?
• Pr[Vowel | First9] = ___ / ___ .
Conditional Probability Example
a
b c
d
e f
g
h
i
j
k
l
mn
o
p q
r
s
t
u
v
w
x
y
z
1st
9
lettersvowels
Sample Space S
47. Module #19 – Probability
Example
• What is the probability that, if we flip a coin three times,
that we get an odd number of tails (=event E), if we
know that the event F, the first flip comes up tails
occurs?
(TTT), (TTH), (THH), (HTT),
(HHT), (HHH), (THT), (HTH)
Each outcome has probability 1/4,
p(E |F) = 1/4+1/4 = ½, where E=odd number of tails
or p(E|F) = p(E∩F)/p(F) = 2/4 = ½
For comparison p(E) = 4/8 = ½
E and F are independent, since p(E |F) = Pr(E).
48. Module #19 – Probability
Prior and Posterior Probability
• Suppose that, before you are given any information about the
outcome of an experiment, your personal probability for an
event E to occur is p(E) = Pr[E].
– The probability of E in your original probability distribution p is called
the prior probability of E.
• This is its probability prior to obtaining any information about the outcome.
• Now, suppose someone tells you that some event F (which may
overlap with E) actually occurred in the experiment.
– Then, you should update your personal probability for event E to occur,
to become p′(E) = Pr[E|F] = p(E∩F)/p(F).
• The conditional probability of E, given F.
– The probability of E in your new probability distribution p′ is called the
posterior probability of E.
• This is its probability after learning that event F occurred.
• After seeing F, the posterior distribution p′ is defined by letting
p′(v) = p({v}∩F)/p(F) for each individual outcome v∈S.
Editor's Notes
Other compound events could be formed:
Tail on the second toss {HT, TT}
At least 1 Head {HH, HT, TH}
To be consistent with the Berenson & Levine text, a simple event is shown. Typically, this is not considered an event since it is not an outcome of the experiment.