Bayes for Beginners
LUCA CHECH AND JOLANDA MALAMUD
SUPERVISOR: THOMAS PARR
13TH FEBRUARY 2019
Outline
• Probability distributions
• Joint probability
• Marginal probability
• Conditional probability
• Bayes’ theorem
• Bayesian inference
• Coin toss example
“Probability is orderly opinion and
inference from data is nothing other than
the revision of such opinion in the light
of relevant new information.”
Eliezer S. Yudkowsky
Some applications
Probability distribution

Discrete: probability mass function (PMF)
• Example: X uniform on {1, 2, …, 100}, with P(X = x) = 1/100 for each x
• A PMF sums to one: Σ_X PMF(X) = 1

Continuous: probability density function (PDF)
• Example: the height X of a person drawn from the UK population
• The probability of an interval is given by the area under the PDF, e.g. P(1.75 ≤ X ≤ 1.85) for heights around 1.8 m
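As a quick numerical check (not part of the original slides; the Normal height distribution is an illustrative assumption), both cases can be sketched in Python:

```python
import math

# Discrete: a uniform PMF over X in {1, ..., 100}; it must sum to 1
pmf = {x: 1 / 100 for x in range(1, 101)}
total = sum(pmf.values())

# Continuous: P(1.75 <= X <= 1.85) is the area under the PDF.
# Illustrative assumption: heights ~ Normal(mean = 1.8 m, sd = 0.1 m).
def normal_cdf(x, mu, sigma):
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

p = normal_cdf(1.85, 1.8, 0.1) - normal_cdf(1.75, 1.8, 0.1)
print(round(total, 6), round(p, 3))
```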
Probability
• Probability of A occurring: P(A)
• Probability of B occurring: P(B)
• Joint probability (A AND B both occurring): P(A,B)
Marginal probability

Example: X = disease (0/1), Y = symptoms (0/1), with joint distribution P(X = x, Y = y):

          Y = 0   Y = 1
  X = 0    0.5     0.1
  X = 1    0.1     0.3

• The joint probabilities sum to one: Σ_{x,y} P(X = x, Y = y) = 1
• A joint probability: P(X = 0, Y = 1) = 0.1
• Marginalising means summing the joint over the other variable: P(X = x) = Σ_y P(X = x, Y = y)
• P(Y = 1) = 0.1 + 0.3 = 0.4
• P(X = 0) = 0.1 + 0.5 = 0.6
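A minimal Python sketch of marginalisation over this table (illustrative, not from the slides):

```python
# Joint distribution from the table: keys are (x, y) = (disease, symptoms)
joint = {(0, 0): 0.5, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.3}

# Marginalise by summing the joint over the other variable
p_x0 = sum(p for (x, y), p in joint.items() if x == 0)   # P(X = 0)
p_y1 = sum(p for (x, y), p in joint.items() if y == 1)   # P(Y = 1)
print(round(p_x0, 3), round(p_y1, 3))  # 0.6 0.4
```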
Conditional probability
What is the probability of A occurring, given that B has occurred?
Probability of A given B?
Using the same joint distribution (X = disease, Y = symptoms):

          Y = 0   Y = 1
  X = 0    0.5     0.1
  X = 1    0.1     0.3

Conditional probability: P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y)

Given that symptoms are present (Y = 1):
• P(X = 1 | Y = 1) = 0.3 / (0.1 + 0.3) = 3/4
• P(X = 0 | Y = 1) = 0.1 / (0.1 + 0.3) = 1/4
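The conditional probabilities follow directly from the joint table; a small Python sketch (illustrative):

```python
# Joint distribution: keys are (x, y) = (disease, symptoms)
joint = {(0, 0): 0.5, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.3}

def conditional(x, y):
    """P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y)."""
    p_y = sum(p for (xi, yi), p in joint.items() if yi == y)
    return joint[(x, y)] / p_y

print(round(conditional(1, 1), 3), round(conditional(0, 1), 3))  # 0.75 0.25
```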
Conditional probability: Example

Given (where C = has the condition, NC = does not, and + = a positive test result):
P(C) = 1/100, P(NC) = 99/100
P(+|C) = 90/100, P(+|NC) = 8/100
P(C|+) = ?

By the definition of conditional probability:
P(C|+) = P(C, +) / P(+)

The numerator is a joint probability:
P(C, +) = P(+|C) × P(C) = 90/100 × 1/100 = 9/1000

The denominator is a marginal, obtained by summing the joint over X:
P(+) = Σ_x P(X = x, +) = P(C, +) + P(NC, +)
P(NC, +) = P(+|NC) × P(NC) = 8/100 × 99/100 = 792/10000

Putting it together:
P(C|+) = P(C, +) / P(+) = (9/1000) / (9/1000 + 792/10000) ≈ 0.1
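The same calculation as a Python sketch of the worked example above:

```python
p_c, p_nc = 0.01, 0.99                  # P(C), P(NC)
p_pos_c, p_pos_nc = 0.90, 0.08          # P(+|C), P(+|NC)

# Marginal probability of a positive test (law of total probability)
p_pos = p_pos_c * p_c + p_pos_nc * p_nc
# Bayes' theorem
p_c_pos = p_pos_c * p_c / p_pos
print(round(p_c_pos, 3))  # ≈ 0.102
```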
Derivation of Bayes’ theorem

(1) P(B|A) = P(B ∩ A) / P(A) = P(A ∩ B) / P(A), hence P(A ∩ B) = P(B|A) × P(A)

(2) P(A|B) = P(A ∩ B) / P(B)

Substituting (1) into (2):

P(A|B) = P(B|A) × P(A) / P(B)   (Bayes’ theorem)
Bayes’ theorem, alternative form

Expanding P(B) with the law of total probability:

P(A|B) = P(B|A) × P(A) / (P(B|A) × P(A) + P(B|¬A) × P(¬A))
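Bayes’ theorem lends itself to a small reusable helper; a sketch (the function name is ours, not from the slides), with the denominator expanded via the law of total probability:

```python
def bayes(p_b_given_a, p_a, p_b_given_not_a):
    """P(A|B), expanding P(B) with the law of total probability."""
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
    return p_b_given_a * p_a / p_b

# Sanity check with the screening numbers used earlier:
# P(+|C) = 0.90, P(C) = 0.01, P(+|NC) = 0.08
print(round(bayes(0.90, 0.01, 0.08), 3))  # ≈ 0.102
```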
Bayes’ theorem problems

Example 1

10% of patients in a clinic have liver disease. Five percent of the clinic’s patients are alcoholics. Amongst those patients diagnosed with liver disease, 7% are alcoholics. We want the probability of a patient having liver disease, given that they are an alcoholic.

P(A) = probability of liver disease = 0.10
P(B) = probability of alcoholism = 0.05
P(B|A) = 0.07
P(A|B) = ?

P(A|B) = P(B|A) × P(A) / P(B) = (0.07 × 0.10) / 0.05 = 0.14

In other words, if a patient is an alcoholic, their chance of having liver disease is 0.14 (14%).
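Example 1 as a quick check in Python (illustrative):

```python
p_a, p_b = 0.10, 0.05        # P(liver disease), P(alcoholism)
p_b_given_a = 0.07           # P(alcoholic | liver disease)

p_a_given_b = p_b_given_a * p_a / p_b
print(round(p_a_given_b, 2))  # 0.14
```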
Example 2
A disease occurs in 0.5% of the population
A diagnostic test gives a positive result in:
◦ 99% of people with the disease
◦ 5% of people without the disease (false positive)
A person receives a positive result
What is the probability of them having the disease, given a positive result?
P(disease | positive test) = P(positive test | disease) × P(disease) / P(positive test)
We know:
P(positive test | disease) = 0.99
P(disease) = 0.005
P(positive test) = ???
P(positive test) = P(PT|D) × P(D) + P(PT|~D) × P(~D)
= 0.99 × 0.005 + 0.05 × 0.995 ≈ 0.0547

Where:
P(D) = chance of having the disease
P(~D) = chance of not having the disease (remember: P(~D) = 1 − P(D))
P(PT|D) = chance of a positive test given that the disease is present
P(PT|~D) = chance of a positive test given that the disease isn’t present
Therefore:
P(disease | positive test) = (0.99 × 0.005) / 0.0547 ≈ 0.09
i.e. 9%
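Example 2, checked numerically (a sketch):

```python
p_d = 0.005                      # P(disease)
p_pt_d, p_pt_nd = 0.99, 0.05     # P(PT|D), P(PT|~D)

# Law of total probability gives the denominator
p_pt = p_pt_d * p_d + p_pt_nd * (1 - p_d)
posterior = p_pt_d * p_d / p_pt
print(round(p_pt, 4), round(posterior, 3))  # ≈ 0.0547, ≈ 0.09
```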
Frequentist vs. Bayesian statistics
Frequentist models in practice
• Model: 𝑌 = 𝑋𝜃 + 𝜀
• Data X is a random variable, while parameters θ are unknown but fixed
• We assume there is a true set of parameters, or true model of the world, and we
are concerned with getting the best possible estimate
• We are interested in point estimates of parameters given the data
Bayesian models in practice
• Model: 𝑌 = 𝑋𝜃 + 𝜀
• Data X is fixed, while parameters 𝜃 are considered to be random
variables
• There is no single set of parameters that denotes a true model of the
world - we have parameters that are more or less probable
• We are interested in distribution of parameters given the data
Bayesian Inference
• Provides a dynamic model through which our belief is constantly updated as
we add more data
• The ultimate goal is to calculate the posterior probability density, which is
proportional to the product of the likelihood (how probable the observed data
are under the parameters) and our prior knowledge
• Can be used as model for the brain (Bayesian brain), history and human
behaviour
Bayes rule

P(θ|D) = P(D|θ) × P(θ) / P(D) ∝ P(D|θ) × P(θ)

• Likelihood P(D|θ): how well the parameters θ account for the observed data
• Prior P(θ): our prior knowledge, which is incorporated and updated to give our new beliefs about the parameters
• Posterior P(θ|D): the updated belief after seeing the data
• Evidence: P(D) = ∫ P(D|θ) × P(θ) dθ
Generative models

• Specify a joint probability distribution over all variables (observations and
parameters) → requires a likelihood function and a prior:
P(D, θ | m) = P(D | θ, m) × P(θ | m) ∝ P(θ | D, m)
• Model comparison based on the model evidence:
P(D | m) = ∫ P(D | θ, m) × P(θ | m) dθ
Principles of Bayesian Inference

• Formulation of a generative model: a likelihood function P(D|θ) and a prior distribution P(θ)
• Observation of data D (measurement)
• Model inversion – updating one’s belief: P(θ|D) ∝ P(D|θ) × P(θ), yielding the posterior distribution and the model evidence
Priors
Priors can be of different sorts, e.g.
• empirical (previous data)
• uninformed
• principled (e.g. positivity constraints)
• shrinkage
Conjugate priors: the posterior P(θ|D) is in the same family as the prior P(θ)
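For the Bernoulli (coin-flip) likelihood, the Beta distribution is the conjugate prior: the posterior is again a Beta, and the update just adds counts. A sketch with hypothetical numbers (the prior and data below are illustrative):

```python
# Prior Beta(a, b); observe k heads in n flips -> posterior Beta(a + k, b + n - k)
a, b = 2, 2          # hypothetical prior pseudo-counts
k, n = 7, 10         # hypothetical data: 7 heads in 10 flips

a_post, b_post = a + k, b + n - k
post_mean = a_post / (a_post + b_post)   # posterior mean of theta
print(a_post, b_post, round(post_mean, 3))  # 9 5 0.643
```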
P(θ|D) ∝ P(D|θ) × P(θ) ∝ likelihood × prior

• Effect of more informative prior distributions on the posterior distribution: a sharper prior pulls the posterior towards it
• Effect of larger sample sizes on the posterior distribution: with more data, the likelihood dominates and the influence of the prior shrinks
Example: Coin flipping model

• Someone flips a coin
• We don’t know whether the coin is fair
• We are told only the outcome of the coin flipping

• 1st hypothesis: the coin is fair, 50% heads or tails
  P(A = fair coin) = 0.99
• 2nd hypothesis: both sides of the coin are heads, 100% heads
  P(A = unfair coin) = 0.01

The coin is flipped a second time and it is heads again
→ The posterior from the previous time step becomes the new prior!
Hypothesis testing

Classical:
• Define the null hypothesis
• H0: coin is fair, θ = 0.5

Bayesian inference:
• Define a hypothesis
• H: θ > 0.1
Example: Coin flipping model

D = T H T H T T T T T T (2 heads, 8 tails), and we think a priori that the coin is fair:
P(fair) = 0.8, P(bent) = 0.2

Evidence for the fair model:
P(D|fair) = 0.5^10 ≈ 0.001

And for the bent model (with a uniform prior over the bent coin’s head probability θ):
P(D|bent) = ∫ P(D|θ, bent) × P(θ|bent) dθ = ∫ θ^2 × (1 − θ)^8 dθ = B(3, 9) ≈ 0.002

Posterior for the models:
P(fair|D) ∝ 0.001 × 0.8 = 0.0008
P(bent|D) ∝ 0.002 × 0.2 = 0.0004
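The two model evidences and the (unnormalised) model posteriors can be reproduced in Python using the Beta function B(a, b) = Γ(a)Γ(b)/Γ(a+b):

```python
import math

def beta_fn(a, b):
    """Beta function via gamma functions."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

# D has 2 heads and 8 tails
p_d_fair = 0.5 ** 10                 # ≈ 0.001
p_d_bent = beta_fn(3, 9)             # ∫ θ^2 (1-θ)^8 dθ ≈ 0.002

# Unnormalised model posteriors with priors 0.8 / 0.2
post_fair = p_d_fair * 0.8
post_bent = p_d_bent * 0.2
print(post_fair > post_bent)  # True: the fair model remains more probable
```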
"A Bayesian is one who,
vaguely expecting a horse,
and catching a glimpse of a donkey,
strongly believes he has seen a mule."
References
• Previous MfD slides
• Bayesian statistics (a very brief introduction) – Ken Rice
• http://www.statisticshowto.com/bayes-theorem-problems/
• Slides “Bayesian inference and generative models” of K.E. Stephan
• Introslides to probabilistic & unsupervised learning of M. Sahani
• Animations: https://blog.stata.com/2016/11/01/introduction-to-bayesian-statistics-part-1-the-basic-concepts/