Dealing With Uncertainty
P(X|E)
Probability theory
The foundation of Statistics
Chapter 13
History
• Games of chance: 300 BC
• 1565: first formalizations
• 1654: Fermat & Pascal, conditional probability
• Reverend Bayes: 1750’s
• 1933: Kolmogorov: axiomatic approach
• Objectivists vs subjectivists
– (frequentists vs Bayesians)
• Frequentists build one model
• Bayesians use all possible models, with priors
Concerns
• Future: what is the likelihood that a student
will get a CS job given his grades?
• Current: what is the likelihood that a person
has cancer given his symptoms?
• Past: what is the likelihood that Marilyn
Monroe committed suicide?
• Combining evidence.
• Always: Representation & Inference
Basic Idea
• Attach degrees of belief to propositions.
• Theorem: Probability theory is the best way
to do this.
– If someone does it differently, you can play a
betting game against him and win his money
(the Dutch book argument).
• Unlike logic, probability theory is non-monotonic.
• Additional evidence can lower or raise
belief in a proposition.
Probability Models:
Basic Questions
• What are they?
– Analogous to constraint models, with probabilities on
each table entry
• How can we use them to make inferences?
– Probability theory
• How does new evidence change inferences?
– Non-monotonic problem solved
• How can we acquire them?
– Experts for model structure, hill-climbing for
parameters
Discrete Probability Model
• Set of random variables (RVs) V1, V2, …, Vn
• Each RV has a discrete set of values
• Joint probability known or computable
• For all vi in domain(Vi),
Prob(V1=v1,V2=v2,..Vn=vn) is known,
non-negative, and sums to 1.
Random Variable
• Intuition: A variable whose value belongs to a
known set of values, the domain.
• Math: its distribution is a non-negative function on
the domain (called the sample space) whose sum is 1.
• Boolean RV: John has a cavity.
– cavity domain ={true,false}
• Discrete RV: Weather Condition
– wc domain= {snowy, rainy, cloudy, sunny}.
• Continuous RV: John’s height
– John’s height domain = {positive real numbers}
Cross-Product RV
• If X is RV with values x1,..xn and
– Y is RV with values y1,..ym, then
– Z = X x Y is a RV with n*m values
<x1,y1>…<xn,ym>
• This will be very useful!
• This does not mean P(X,Y) = P(X)*P(Y).
Discrete Probability Distribution
• If a discrete RV X has values v1,…,vn, then a
prob distribution for X is a non-negative real-valued
function p such that: sum p(vi) = 1.
• This is just a (normalized) histogram.
• Example: a coin is flipped 10 times and heads
occur 6 times.
• What is the best probability model to predict this
result?
• Biased coin model: prob head = .6, trials = 10
From Model to Prediction
Use Math or Simulation
• Math: X = number of heads in 10 flips
• P(X = 0) = .4^10
• P(X = 1) = 10* .6*.4^9
• P(X = 2) = Comb(10,2)*.6^2*.4^8 etc
• Where Comb(n,m) = n!/((n-m)! * m!).
• Simulation: Do many times: flip coin (p = .6) 10
times, record heads.
• Math is exact, but sometimes too hard.
• Computation is inexact and expensive, but doable
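The two routes above (math and simulation) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the fixed seed, and the choice of 1000 trials are not from the slides.

```python
import math
import random

def binomial_pmf(k, n=10, p=0.6):
    """Exact P(X = k): k heads in n flips of a coin with P(heads) = p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def simulate(trials, n=10, p=0.6, seed=0):
    """Estimate the distribution of X by repeating the n-flip experiment."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    counts = [0] * (n + 1)
    for _ in range(trials):
        heads = sum(rng.random() < p for _ in range(n))
        counts[heads] += 1
    return [c / trials for c in counts]

exact = [binomial_pmf(k) for k in range(11)]
approx = simulate(1000)
for k in range(11):
    print(f"{k:2d}  exact={exact[k]:.4f}  simulated={approx[k]:.3f}")
```

With more trials the simulated column should move closer to the exact one, at the cost of more computation.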
p = .6 (columns 10, 100, 1000 = number of simulated trials)
Heads  Exact  10   100  1000
0      .0001  .0   .0   .0
1      .001   .0   .0   .002
2      .010   .0   .01  .011
3      .042   .0   .04  .042
4      .111   .2   .05  .117
5      .200   .1   .24  .200
6      .250   .6   .22  .246
7      .214   .1   .16  .231
8      .120   .0   .18  .108
9      .040   .0   .09  .035
10     .005   .0   .01  .008
p = .5 (columns 10, 100, 1000 = number of simulated trials)
Heads  Exact  10   100  1000
0      .0009  .0   .0   .002
1      .009   .0   .01  .011
2      .043   .0   .07  .044
3      .117   .1   .13  .101
4      .205   .2   .24  .231
5      .246   .0   .28  .218
6      .205   .3   .15  .224
7      .117   .3   .08  .118
8      .043   .1   .04  .046
9      .009   .0   .0   .009
10     .0009  .0   .0   .001
Learning Model: Hill Climbing
• Theoretically it can be shown that p = .6 is
the best model.
• Without theory, pick a random p value and
simulate. Now try a larger and a smaller p
value.
• Maximize P(Data|Model). Get model
which gives highest probability to the data.
• This approach extends to more complicated
models (variables, parameters).
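A minimal sketch of the hill-climbing idea for the coin data (6 heads in 10 flips). Note the swap: it scores each candidate p with the exact binomial likelihood rather than by simulation, and the starting point, step size, and tolerance are arbitrary assumptions.

```python
import math

def likelihood(p, heads=6, flips=10):
    """P(Data | Model): probability of the observed heads count given p."""
    return math.comb(flips, heads) * p**heads * (1 - p)**(flips - heads)

def hill_climb(p=0.2, step=0.1, tol=1e-6):
    """Try a smaller and a larger p; move if the likelihood improves,
    otherwise halve the step. Stops once the step is below tol."""
    while step > tol:
        candidates = [q for q in (p - step, p + step) if 0.0 <= q <= 1.0]
        best = max(candidates, key=likelihood)
        if likelihood(best) > likelihood(p):
            p = best
        else:
            step /= 2
    return p

p_hat = hill_climb()
print(f"estimated p = {p_hat:.4f}")
```

Because this likelihood has a single peak, the search should settle near p = .6; with several parameters or several peaks, hill climbing may stop at a local maximum.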
Another Data Set
What’s going on?
0 .34
1 .38
2 .19
3 .05
4 .01
5 .02
6 .08
7 .20
8 .30
9 .26
10 .1
Mixture Model
• Data generated from two simple models
• coin1 prob = .8 of heads
• coin2 prob = .1 of heads
• With prob .5 pick coin 1 or coin 2 and flip.
• Model has more parameters
• Experts are supposed to supply the model.
• Use data to estimate the parameters.
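The mixture can be written down directly. The sketch below assumes a coin is picked once per 10-flip trial and then flipped 10 times; the slide does not say this explicitly, and the function names are illustrative.

```python
import math

def binomial_pmf(k, n, p):
    """P(k heads in n flips) for a single coin with P(heads) = p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def mixture_pmf(k, n=10, p1=0.8, p2=0.1, w=0.5):
    """Pick coin 1 with probability w (else coin 2), then flip it n times."""
    return w * binomial_pmf(k, n, p1) + (1 - w) * binomial_pmf(k, n, p2)

pmf = [mixture_pmf(k) for k in range(11)]
for k, value in enumerate(pmf):
    print(f"{k:2d}  {value:.3f}")
```

The result has two humps, one near 1 head and one near 8 heads, which a single biased coin cannot produce.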
Continuous Probability
• If RV X has values in R, then a prob
distribution for X is a non-negative real-valued
function p such that the integral of p
over R is 1 (called the prob density function).
• Standard continuous distributions are uniform, normal
(Gaussian), exponential, etc.; Poisson is a standard discrete one.
• May resort to an empirical distribution if it can’t be
computed analytically, i.e., use a histogram.
Joint Probability: full knowledge
• If X and Y are discrete RVs, then the prob
distribution for X x Y is called the joint
prob distribution.
• Let x be in domain of X, y in domain of Y.
• If P(X=x,Y=y) = P(X=x)*P(Y=y) for every
x and y, then X and Y are independent.
• Standard Shorthand: P(X,Y)=P(X)*P(Y),
which means exactly the statement above.
Marginalization
• Given the joint probability for X and Y, you
can compute everything.
• Joint probability to individual probabilities.
• P(X =x) is sum P(X=x and Y=y) over all y
• Conditioning is similar:
– P(X=x) = sum P(X=x|Y=y)*P(Y=y) over all y
Marginalization Example
• Compute Prob(X is healthy) from
• P(X healthy & X tests positive) = .1
• P(X healthy & X tests neg) = .8
• P(X healthy) = .1 + .8 = .9
• P(flush) = P(heart flush)+P(spade flush)+
P(diamond flush)+ P(club flush)
Conditional Probability
• P(X=x | Y=y) = P(X=x, Y=y)/P(Y=y).
• Intuition: use simple examples
• One-card hand: X = value of the card, Y = suit of the card
P( X= ace | Y= heart) = 1/13
also P( X=ace , Y=heart) = 1/52
P(Y=heart) = 1 / 4
P( X=ace, Y= heart)/P(Y =heart) = 1/13.
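The card example can be checked by enumerating a 52-card deck. This is an illustrative sketch; the helper `prob` and the card encoding are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product

values = ["ace"] + [str(n) for n in range(2, 11)] + ["jack", "queen", "king"]
suits = ["heart", "spade", "diamond", "club"]
deck = list(product(values, suits))  # 52 equally likely (value, suit) pairs

def prob(event):
    """Probability of an event = fraction of cards that satisfy it."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

p_joint = prob(lambda card: card == ("ace", "heart"))  # P(X=ace, Y=heart)
p_heart = prob(lambda card: card[1] == "heart")        # P(Y=heart)
p_ace_given_heart = p_joint / p_heart                  # P(X=ace | Y=heart)
print(p_joint, p_heart, p_ace_given_heart)
```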
Formula
• Shorthand: P(X|Y) = P(X,Y)/P(Y).
• Product Rule: P(X,Y) = P(X |Y) * P(Y)
• Bayes Rule:
– P(X|Y) = P(Y|X) *P(X)/P(Y).
• Remember the abbreviations.
Conditional Example
• P(A = 0) = .7
• P(A = 1) = .3
P(A,B) = P(B,A)
P(B,A) = P(B|A)*P(A)
P(A,B) = P(A|B)*P(B)
P(A|B) = P(B|A)*P(A)/P(B)
B  A  P(B|A)
0  0  .2
0  1  .9
1  0  .8
1  1  .1
Exact and simulated
A  B  P(A,B)  10   100  1000
0  0  .14     .1   .18  .14
0  1  .56     .6   .55  .56
1  0  .27     .2   .24  .24
1  1  .03     .1   .03  .06
Note Joint yields everything
• Via marginalization
• P(A = 0) = P(A=0,B=0)+P(A=0,B=1)=
– .14+.56 = .7
• P(B=0) = P(B=0,A=0)+P(B=0,A=1) =
– .14+.27 = .41
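As a sketch, the joint table for A and B can be stored as a dictionary, with marginals and conditionals read off it. The function names are illustrative, not from the slides.

```python
# Joint distribution P(A, B), keyed by (a, b), taken from the exact column.
joint = {(0, 0): 0.14, (0, 1): 0.56, (1, 0): 0.27, (1, 1): 0.03}

def marginal_a(a):
    """P(A = a): sum the joint over all values of B."""
    return sum(p for (ai, _), p in joint.items() if ai == a)

def marginal_b(b):
    """P(B = b): sum the joint over all values of A."""
    return sum(p for (_, bi), p in joint.items() if bi == b)

def a_given_b(a, b):
    """P(A = a | B = b) = P(A = a, B = b) / P(B = b)."""
    return joint[(a, b)] / marginal_b(b)

print(marginal_a(0), marginal_b(0), a_given_b(0, 0))
```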
Simulation
• Given prob for A and prob for B given A
• First, choose value for A, according to prob
• Now use conditional table to choose value
for B with correct probability.
• That constructs one world.
• Repeat many times and count the number of
times A=0 & B=0, A=0 & B=1, etc.
• Turn counts into probabilities.
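The procedure above can be sketched as follows, using P(A) and P(B|A) from the conditional example. The seed, trial count, and names are assumptions made for illustration.

```python
import random
from collections import Counter

P_A = {0: 0.7, 1: 0.3}
# P_B_GIVEN_A[a][b] = P(B = b | A = a)
P_B_GIVEN_A = {0: {0: 0.2, 1: 0.8}, 1: {0: 0.9, 1: 0.1}}

def sample_world(rng):
    """Choose A from its prior, then B from the row of the table for that A."""
    a = 0 if rng.random() < P_A[0] else 1
    b = 0 if rng.random() < P_B_GIVEN_A[a][0] else 1
    return a, b

def estimate_joint(trials, seed=0):
    """Construct many worlds, count each (a, b), turn counts into probabilities."""
    rng = random.Random(seed)
    counts = Counter(sample_world(rng) for _ in range(trials))
    return {(a, b): counts[(a, b)] / trials for a in (0, 1) for b in (0, 1)}

estimate = estimate_joint(100_000)
for world, p in sorted(estimate.items()):
    print(world, round(p, 3))
```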
Consequences of Bayes’ Rule
• P(X|Y,Z) = P(Y,Z |X)*P(X)/P(Y,Z).
Proof: treat Y&Z as a new product RV U; then
P(X|U) = P(U|X)*P(X)/P(U) by Bayes’ rule.
• P(X1,X2,X3) =P(X3|X1,X2)*P(X1,X2)
= P(X3|X1,X2)*P(X2|X1)*P(X1) or
• P(X1,X2,X3) =P(X1)*P(X2|X1)*P(X3|X1,X2).
• Note: These equations make no assumptions!
• Last equation is called the Chain or Product Rule
• Can pick any ordering of variables.
Extensions of P(A) +P(~A) = 1
• P(X|Y) + P(~X|Y) = 1
• Semantic Argument
– conditional just restricts worlds
• Syntactic Argument: lhs equals
– P(X,Y)/P(Y) + P(~X,Y)/P(Y) =
– (P(X,Y) + P(~X,Y))/P(Y) = (marginalization)
– P(Y)/P(Y) = 1.
Bayes Rule Example
• Meningitis causes stiff neck (.5).
– P(s|m) = 0.5
• Prior prob of meningitis = 1/50,000.
– p(m)= 1/50,000 = .00002
• Prior prob of stiff neck (1/20).
– p(s) = 1/20.
• Does patient have meningitis?
– p(m|s) = p(s|m)*p(m)/p(s) = 0.0002.
• Is this reasonable? p(s|m)/p(s) = 0.5/0.05 = 10, so observing a stiff neck raises the probability of meningitis tenfold, from .00002 to .0002.
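The arithmetic of the meningitis example, as a short sketch; the function name `bayes` is illustrative.

```python
def bayes(p_e_given_h, p_h, p_e):
    """Bayes rule: P(H | E) = P(E | H) * P(H) / P(E)."""
    return p_e_given_h * p_h / p_e

p_s_given_m = 0.5      # P(stiff neck | meningitis)
p_m = 1 / 50_000       # prior P(meningitis)
p_s = 1 / 20           # prior P(stiff neck)

p_m_given_s = bayes(p_s_given_m, p_m, p_s)
change = p_s_given_m / p_s  # factor by which the evidence scales the prior
print(p_m_given_s, change)
```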
Bayes Rule: multiple symptoms
• Given symptoms s1, s2, …, sn, estimate the
probability of disease D.
• P(D|s1,s2…sn) = P(D,s1,..sn)/P(s1,s2..sn).
• If each symptom is boolean, need tables of
size 2^n. ex. breast cancer data has 73
features per patient. 2^73 is too big.
• Approximate!
Notation: max arg
• Conceptual definition, not operational
• Max arg f(x) is a value of x that maximizes
f(x).
• MaxArg Prob(X = 6 heads | prob heads)
yields prob(heads) = .6
Idiot or Naïve Bayes:
First learning Algorithm
Goal: max arg P(D| s1..sn) over all Diseases
= max arg P(s1,..sn|D)*P(D)/ P(s1,..sn)
= max arg P(s1,..sn|D)*P(D) (why?)
~ max arg P(s1|D)*P(s2|D)…P(sn|D)*P(D).
• Assumes conditional independence.
• Needs only enough data to estimate each P(si|D) and P(D).
• Not necessary to get prob right: only order.
• Pretty good but Bayes Nets do it better.
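A minimal sketch of the naive Bayes decision rule. The diseases, symptoms, and every probability in the tables below are hypothetical numbers invented for illustration; in practice they would be estimated from data.

```python
import math

# Hypothetical priors P(D) and per-symptom tables P(symptom present | D).
PRIOR = {"flu": 0.3, "cold": 0.7}
P_SYMPTOM = {
    "flu": {"fever": 0.9, "cough": 0.6},
    "cold": {"fever": 0.2, "cough": 0.7},
}

def score(disease, symptoms):
    """log P(D) + sum of log P(si | D); logs avoid underflow for large n."""
    total = math.log(PRIOR[disease])
    for name, present in symptoms.items():
        p = P_SYMPTOM[disease][name]
        total += math.log(p if present else 1 - p)
    return total

def classify(symptoms):
    """Return the disease with the highest score (only the order matters)."""
    return max(PRIOR, key=lambda d: score(d, symptoms))

print(classify({"fever": True, "cough": True}))
print(classify({"fever": False, "cough": True}))
```

The denominator P(s1,..sn) is never computed because it is the same for every disease and does not change which one wins.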
Chain Rule and Markov Models
• Recall P(X1, X2, …Xn) =
P(X1)*P(X2|X1)*…P(Xn| X1,X2,..Xn-1).
• If X1, X2, etc are values at time points 1, 2..
and if Xn only depends on k previous times,
then this is a Markov model of order k.
• MM0: Independent of time
– P(X1,…Xn) = P(X1)*P(X2)..*P(Xn)
Markov Models
• MM1: depends only on previous time
– P(X1,…Xn)= P(X1)*P(X2|X1)*…P(Xn|Xn-1).
• May also be used for approximating
probabilities. Much simpler to estimate.
• MM2: depends on previous 2 times
– P(X1,X2,..Xn)= P(X1,X2)*P(X3|X1,X2)*…*P(Xn|Xn-2,Xn-1).
Common DNA application
• Looking for needles: surprising frequency?
• Goal: Compute P(gataag) given lots of data
• MM0 = P(g)*P(a)*P(t)*P(a)*P(a)*P(g).
• MM1 = P(g)*P(a|g)*P(t|a)*P(a|t)*P(a|a)*P(g|a).
• MM2 = P(ga)*P(t|ga)*P(a|at)*P(a|ta)*P(g|aa).
• Note: each approximation requires less data
and less computation time than the full chain rule;
the lower the order, the less it needs.
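A sketch of estimating the MM0 and MM1 probabilities of a word from counts. The training string `DATA` is a made-up stand-in for "lots of data", and the function names are illustrative.

```python
from collections import Counter

def mm0_prob(word, data):
    """Order-0 model: product of single-letter frequencies in data."""
    counts = Counter(data)
    prob = 1.0
    for ch in word:
        prob *= counts[ch] / len(data)
    return prob

def mm1_prob(word, data):
    """Order-1 model: P(first letter) times P(next | previous) for each step.
    Assumes every letter of word (except possibly the last) occurs in data
    with a successor; otherwise the context count is zero."""
    singles = Counter(data)
    pairs = Counter(zip(data, data[1:]))  # counts of adjacent letter pairs
    context = Counter(data[:-1])          # letters that have a successor
    prob = singles[word[0]] / len(data)
    for prev, cur in zip(word, word[1:]):
        prob *= pairs[(prev, cur)] / context[prev]
    return prob

DATA = "gataagctgataagatcgataag"  # hypothetical training sequence
print(mm0_prob("gataag", DATA), mm1_prob("gataag", DATA))
```

If the order-1 estimate is much larger than the order-0 estimate, the word occurs more often than its letter frequencies alone would suggest.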
