A workshop organized by Bangalore Institute of Technology on Sep 7, 2022, and conducted by Dr. Somik Raha. Topics covered include the definition of probability, the basics of conditional probability, application to real-life inference, and continuous probability distributions. The epistemology of probability theory is also traced through the work of Bayes, Laplace, Jaynes, Howard and Keelin.
1. To the extent possible under law, Somik Raha has waived all copyright and related or neighboring rights to the slide deck “The Magic of Probability.” This work is published from: United States and India.
2. The Magic of Probability
Workshop by Dr. Somik Raha
9. Let’s call a coin toss
Prabha, what’s your probability of heads? 50%
Sojan, what’s your probability of heads? 50%
Psst… Sojan, take a look.
Sojan, what’s your probability of heads? 100%
Prabha, what’s your probability of heads? 50%
10. How can two people have different probabilities for the same event?
•Dialog
12. THERE IS NO OBJECTIVE PROBABILITY
If it was objective, it would be data, not probability!
Probability = Your State of Information
13. THERE IS NO OBJECTIVE PROBABILITY
Inappropriate Phrases
The probability
Appropriate Phrases
My probability
Your probability
Prabha’s probability
Sojan’s probability
Our group’s probability
14. You can only place a probability on clear distinctions
The Clarity Test
Can a fact-based clairvoyant tell you how the distinction has resolved?
15. What is your probability of rain in Bengaluru tomorrow?
Let’s first define rain: At least 10 cm of water from the sky
Let’s define tomorrow: From 12:01 AM to 11:59 PM
Quick poll
16. When should you place a 100% or 0% probability?
I did not die yesterday
I will definitely die someday
17. The Epistemology of Probability from a Decision Analysis Perspective
18. Epistemology
The theory of knowledge, especially with regard to its methods, validity, and scope, and the distinction between justified belief and opinion (dictionary.com)
It is the branch of philosophy concerned with knowledge (wikipedia.com)
19. A tale of friendship
Rev Thomas Bayes
Died: 1761
1763: An Essay towards solving a Problem in the Doctrine of Chances
Read out by his friend Richard Price to the Royal Society
Worked out Conditional Probability and the Beta Distribution!
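20. Conditional Probability: Haemophilia
Prior: Male 49.22% (161.6M); Female 50.78% (166.7M)
Likelihood, given Male: Haemophiliac 0.02% (30,000 of 161.6M); Not Haemophiliac 99.98% (161.57M of 161.60M)
Likelihood, given Female: Haemophiliac 0.0016%; Not Haemophiliac 99.9984%
Joint: Male and Haemophiliac 0.01%; Male and Not Haemophiliac 49.21%; Female and Haemophiliac 0.0008%; Female and Not Haemophiliac 50.78%
Pre-Posterior: Haemophiliac 0.01%; Not Haemophiliac 99.99%
Posterior, given Haemophiliac: Male 91.74%; Female 8.26%
Posterior, given Not Haemophiliac: Male 49.22%; Female 50.78%
Mistaking the likelihood (Haemophiliac given Male) for the posterior (Male given Haemophiliac) is called:
“Associative Logic Error”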
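The tree on this slide can be reproduced in a few lines. The sketch below is illustrative only: the function name `infer` and the dictionary layout are assumptions, and the inputs are the rounded percentages read from the slide rather than the underlying population counts.

```python
def infer(prior, likelihood):
    """Bayes' rule for one observation.

    prior:      {hypothesis: P(hypothesis)}
    likelihood: {hypothesis: P(observation | hypothesis)}
    Returns (pre-posterior P(observation),
             posterior {hypothesis: P(hypothesis | observation)}).
    """
    joint = {h: prior[h] * likelihood[h] for h in prior}
    pre_posterior = sum(joint.values())
    posterior = {h: joint[h] / pre_posterior for h in joint}
    return pre_posterior, posterior


# Rounded values read from the slide's tree.
prior = {"Male": 0.4922, "Female": 0.5078}
p_haemophiliac_given = {"Male": 0.0002, "Female": 0.000016}

p_haemophiliac, p_sex_given_haemophiliac = infer(prior, p_haemophiliac_given)

print(f"P(Haemophiliac) = {p_haemophiliac:.4%}")
for sex, p in p_sex_given_haemophiliac.items():
    print(f"P({sex} | Haemophiliac) = {p:.2%}")
```

With these rounded inputs the posterior for Male comes out at roughly 92%, close to the 91.74% on the slide and very far from the 0.02% likelihood, which is the gap the Associative Logic Error hides.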
21. Conditional Probability: Haemophilia
[Figure: the same probability tree as slide 20, with the stages labelled Prior, Likelihood, Pre-Posterior and Posterior. Joint probabilities: 0.01%, 49.21%, 0.0008%, 50.78%.]
Your turn:
https://tinyurl.com/haemophilia-inference
22. Conditional Probability Quiz 2
If someone has lung cancer, what is your probability for this person being a
smoker?
If someone smokes, what is your probability for this person getting lung cancer?
Your turn:
https://tinyurl.com/smoker-lungcancer
23. Conditional Probability: Lung Cancer
[Figure: probability tree relating Smoker / Non-Smoker to Lung Cancer / No Lung Cancer, with the stages Prior, Likelihood, Pre-Posterior and Posterior. Values shown include 6.9 per 100,000 (0.0069%) and the four joint probabilities 0.15%, 13.98%, 0.02% and 85.86%.]
Mistaking these two is called:
“Associative Logic Error”
24. Conditional Probability Quiz 3
If someone has Covid, what is your probability for this person testing positive
for antibodies?
If someone tests positive for Covid antibodies, what is your probability for
this person having Covid?
Your turn:
https://tinyurl.com/covid-inference
25. Conditional Probability: Covid Antibody Test
Prior: Covid +ve 5%; Covid -ve 95%
Likelihood, given Covid +ve: Antibody detected 90%; Antibody not detected 10%
Likelihood, given Covid -ve: Antibody detected 5%; Antibody not detected 95%
Joint: Covid +ve and Antibody detected 4.5%
Pre-Posterior: Antibody detected 9.25%; Antibody not detected 90.75%
Posterior, given Antibody detected: Covid +ve 48.65%; Covid -ve 51.35%
Posterior, given Antibody not detected: Covid +ve 0.55%; Covid -ve 99.45%
Mistaking the likelihood (Antibody detected given Covid +ve) for the posterior (Covid +ve given Antibody detected) is called:
“Associative Logic Error”
26. Let’s look at Shruti
Lived in New York City at the height of the
pandemic. She thought she had Covid and
took the antibody test several months later.
She just got a negative antibody test result
Her doctor says: “There is no chance you got Covid.”
Why is he saying that?
28. Let’s look at Shruti
Let’s add more context
Shruti presents with all the serious symptoms (according to WHO) of COVID:
Difficulty breathing or shortness of breath
Chest pain or pressure
Loss of speech or movement
She just got a negative antibody test result
What prior probability should the doctor have placed?
What is the conditional probability of her having Covid with this prior?
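One way to explore the two questions above is to recompute the posterior for several priors. This is a minimal sketch: it assumes the 90% and 5% antibody detection rates from the Covid antibody test tree on slide 25, and the priors of 50% and 90% are hypothetical stand-ins for what a doctor might place on a patient with serious symptoms.

```python
def p_covid_given_negative(prior, sensitivity=0.90, false_positive_rate=0.05):
    """P(Covid | antibody not detected) for a given prior P(Covid).

    sensitivity:         P(antibody detected | Covid), 90% on slide 25
    false_positive_rate: P(antibody detected | no Covid), 5% on slide 25
    """
    joint_covid = prior * (1 - sensitivity)
    joint_no_covid = (1 - prior) * (1 - false_positive_rate)
    return joint_covid / (joint_covid + joint_no_covid)


# 5% is the slide's prior; 50% and 90% are hypothetical priors for Shruti.
for prior in (0.05, 0.50, 0.90):
    print(f"prior {prior:.0%} -> posterior {p_covid_given_negative(prior):.2%}")
```

With the 5% prior the posterior matches the 0.55% on the tree, which is presumably why the doctor says there is no chance. A prior that reflects her symptoms leaves a substantial probability of Covid even after the negative test.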
33. What is your probability of rain in Bengaluru tomorrow?
34. What is your probability of rain in Bengaluru tomorrow?
A smart engineer has built a rain detector using ML on
weather models
What would you like to know from the evaluation results of
the model in a lab setting?
35. ML Example: Rain Detector
[Figure: probability tree over Rain / No Rain and Detector: “RAIN” / Detector: “No RAIN”, with the stages Prior, Likelihood, Pre-Posterior and Posterior. The Prior is labelled “My probability of rain tomorrow”.]
What metrics do ML classification model designers report?
36. How can we use my P(Rain) to evaluate the model?
[Figure: the rain detector probability tree from slide 35, with RECALL and PRECISION marked on it. The Prior is labelled “My probability of rain tomorrow”.]
Can you easily do inference with these metrics?
If yes, why?
If not, why not?
37. TPR & FPR allow human evaluation of model
[Figure: the rain detector probability tree from slide 35, with TPR (Sensitivity, Recall) and FPR (1-Specificity) marked on it. The Prior is labelled “My probability of rain tomorrow”.]
Instead, ask for True Positive Rate (Recall / Sensitivity) and False Positive Rate (1-Specificity)
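A short sketch of why TPR and FPR are the useful pair: both are likelihoods, so they can be combined with anyone's own prior, whereas precision already has the lab's base rate baked in. The TPR, FPR, lab base rate and personal prior below are all hypothetical numbers chosen for illustration, not values from the slides.

```python
def p_rain_given_alarm(prior_rain, tpr, fpr):
    """P(Rain | Detector says "RAIN") from a prior and the detector's TPR and FPR."""
    p_alarm = prior_rain * tpr + (1 - prior_rain) * fpr  # pre-posterior
    return prior_rain * tpr / p_alarm


TPR, FPR = 0.90, 0.05  # hypothetical lab results

# Precision reported by the lab is just the posterior under the lab's base rate.
lab_precision = p_rain_given_alarm(prior_rain=0.50, tpr=TPR, fpr=FPR)

# My own posterior uses my own probability of rain tomorrow instead.
my_posterior = p_rain_given_alarm(prior_rain=0.10, tpr=TPR, fpr=FPR)

print(f"precision at the lab's 50% base rate: {lab_precision:.1%}")
print(f"my posterior with a 10% prior:        {my_posterior:.1%}")
```

The same detector yields a very different posterior once the prior changes, so a reported precision cannot be reused directly unless the lab's base rate happens to match your prior.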
38. Poll Question:
If you saw seven heads in a row in a coin toss, what
probability would you place on the next coin landing
heads?
41. Roll two dice and bet on the sum of the roll
Let’s play a game
Poll: If you could bet on only one number for the sum of two die rolls (2-12), what
would you bet on?
42. Roll two dice and bet on the sum of the roll
Let’s play a game
Poll: If you could bet on three numbers for the sum of two die rolls (2-12), what
would you bet on?
43. Roll two dice and bet on the sum of the roll
Let’s play a game
Let’s try it: Open Workbook
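For readers without the workbook, the two polls can be checked by enumerating all 36 equally likely outcomes of two fair dice. This is a minimal sketch under the assumption that the dice are fair and six-sided; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 (die1, die2) outcomes produce each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf = {total: Fraction(n, 36) for total, n in sorted(counts.items())}

# Best single bet and best three bets by probability.
best_one = max(pmf, key=pmf.get)
best_three = sorted(pmf, key=pmf.get, reverse=True)[:3]
p_best_three = sum(pmf[t] for t in best_three)

for total, p in pmf.items():
    print(f"{total:2d}: {p} ({float(p):.1%})")
print("best single bet:", best_one)
print("best three bets:", best_three, "with total probability", p_best_three)
```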
44. Discrete vs Continuous Measures
Discrete:
Sum of two die rolls
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
Continuous:
Revenue in Crores
Test Positivity Rate (0%-100%)
[Chart: Probability Mass Function (PMF), probability against the sum of two die rolls, 2 to 12]
[Chart: Cumulative Mass Function, cumulative probability from 0% to 100% against the sum of two die rolls, 2 to 12]
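The cumulative mass function is simply a running total of the PMF. The sketch below assumes two fair six-sided dice and uses illustrative variable names.

```python
from collections import Counter
from itertools import accumulate, product

counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
sums = sorted(counts)                      # 2, 3, ..., 12
pmf = [counts[s] / 36 for s in sums]       # P(sum == s)
cmf = list(accumulate(pmf))                # P(sum <= s), a running total

for s, p, c in zip(sums, pmf, cmf):
    print(f"sum {s:2d}: PMF {p:6.1%}   cumulative {c:6.1%}")
```

Reading a cumulative value at 7, for example, gives the probability that the sum is 7 or less.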
48. Poll Question:
How would you read off the Probability of making revenue
less than X crores from this curve?
49. Cumulative Distribution Function (CDF)
CMF and PMF become CDF and PDF when dealing with “continuous variables”
[Chart: Cumulative Distribution Function, cumulative probability from 0% to 100% against Test Positivity Rate. Marked points: Low (10%) at 25%, Med (50%) at 35%, High (90%) at 60%.]
[Chart: Probability Density Function (PDF) against Test Positivity Rate, with the same Low, Med and High points marked.]
Probability:
Area under curve
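To make "reading off the curve" concrete, here is a rough sketch that rebuilds a CDF from the three assessed points on this slide (10% at 25%, 50% at 35%, 90% at 60%). Joining the points with straight lines and pinning the ends at 0% and 100% are assumptions made for illustration; the slide's curve is smooth.

```python
from bisect import bisect_right

# (test positivity rate, cumulative probability); end points are assumed.
points = [(0.00, 0.0), (0.25, 0.10), (0.35, 0.50), (0.60, 0.90), (1.00, 1.0)]
xs = [x for x, _ in points]


def cdf(x):
    """P(Test Positivity Rate <= x), by linear interpolation between points."""
    if x <= xs[0]:
        return 0.0
    if x >= xs[-1]:
        return 1.0
    i = bisect_right(xs, x)
    (x0, c0), (x1, c1) = points[i - 1], points[i]
    return c0 + (c1 - c0) * (x - x0) / (x1 - x0)


print(f"P(rate <= 30%)       = {cdf(0.30):.0%}")
print(f"P(25% < rate <= 60%) = {cdf(0.60) - cdf(0.25):.0%}")
```

The difference of two CDF values is the area under the PDF between those two points, which is the sense in which probability is the area under the curve.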
51. Its conjugate is also a Beta!
Bayes also worked out the Beta Distribution
Binomial trial
Exactly two (“bi”) outcomes, “success or failure”
Each trial has same chance of success
[Chart: Binomial Distribution, probability against # of Heads, 0 to 20]
[Chart: Beta Distribution. Ref: Wikipedia]
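The binomial chart can be approximated with the standard binomial formula. The sketch assumes 20 tosses of a fair coin, which is an inference from the chart's axis (0 to 20 heads, peak just under 0.18) rather than something the slide states.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, each with chance p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 20, 0.5  # assumed: 20 tosses of a fair coin
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(pmf):
    print(f"{k:2d} heads: {prob:.4f}")
```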
52. Beta(1,1)
The Uniform Prior
[Chart: Beta(1,1) density, flat across 0 to 1]
Bayes: Suggested “with a great deal of doubt” as the prior probability
distribution to express ignorance about the correct prior distribution.
Why is this statement problematic? How would you edit it?
Hint: Remember the clairvoyant?
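As a quick check that Beta(1,1) really is the uniform prior, the Beta density can be evaluated directly from its textbook formula. This is a small sketch using only the standard library; `beta_pdf` is an illustrative name.

```python
from math import gamma


def beta_pdf(p, alpha, beta):
    """Density of the Beta(alpha, beta) distribution at p, for 0 <= p <= 1."""
    norm = gamma(alpha + beta) / (gamma(alpha) * gamma(beta))
    return norm * p ** (alpha - 1) * (1 - p) ** (beta - 1)


grid = [0.0, 0.25, 0.5, 0.75, 1.0]
uniform = [beta_pdf(p, 1, 1) for p in grid]
print(uniform)  # every value of p gets the same density
```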
53. Worked out Probability Theory
“Probability as an instrument for repairing defects in knowledge”
Pierre Simon Laplace
1774: Mémoire sur la probabilité des causes par les événements (Memoir on the Probability of the Causes of Events)
Independently worked out what Bayes had worked out, and then some.
54. What is the probability of the sun rising tomorrow?
[Chart: Uniform prior over Parameter “p”, from 0 to 1]
Laplace had no hesitation in using the Uniform prior to express total ignorance
55. What is the probability of the sun rising tomorrow?
Laplace’s succession rule, illustrated with an unfortunate example
Prior Knowledge: both success and failure are possible
Assumes we observed one success and one failure for sure before we started
Not correct if s=0, or s=n
“But this number [the probability of the sun coming up tomorrow] is far greater for him who, seeing in the totality of phenomena the principle regulating the days and seasons, realizes that nothing at present moment can arrest the course of it.”
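Laplace's rule of succession says that after s successes in n trials, starting from the uniform prior, the probability of success on the next trial is (s + 1) / (n + 2). The sketch below applies it to the earlier poll about seven heads in a row; the function name is illustrative.

```python
from fractions import Fraction


def rule_of_succession(s, n):
    """Probability of success on the next trial after s successes in n trials,
    starting from the uniform prior: (s + 1) / (n + 2)."""
    return Fraction(s + 1, n + 2)


print("no data yet:        ", rule_of_succession(0, 0))
print("seven heads in row: ", rule_of_succession(7, 7))
```

The "+1" and "+2" are the one success and one failure the uniform prior behaves as if it had already seen, which is why the rule never returns exactly 0 or 1.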
56. Probability theory as an extension of logic
Edwin Thompson Jaynes
1950s
Probability Theory: The Logic of Science
The “Robot”, the ancestor of the “Clairvoyant”
“In order to direct attention to constructive things and away from controversial irrelevancies, we shall invent an imaginary being. Its brain is to be designed by us, so that it reasons according to certain definite rules. These rules will be deduced from simple desiderata which, it appears to us, would be desirable in human brains; i.e. we think that a rational person, on discovering that they were violating one of these desiderata, would wish to revise their thinking.”
Excerpt From: E. T. Jaynes & G. Larry Bretthorst. “Probability Theory.” Apple Books. https://books.apple.com/us/book/probability-theory/id811951223
57. Probability theory as an extension of logic
Edwin Thompson Jaynes
1950s
Probability Theory: The Logic of Science
The “Robot”, the ancestor of the “Clairvoyant”
“Our robot is going to reason about propositions. As already indicated above, we shall denote various propositions by italicized capital letters, {A, B, C, etc.}, and for the time being we must require that any proposition used must have, to the robot, an unambiguous meaning and must be of the simple, definite logical type that must be either true or false.”
Excerpt From: E. T. Jaynes & G. Larry Bretthorst. “Probability Theory.” Apple Books. https://books.apple.com/us/book/probability-theory/id811951223
58. Probability theory as an extension of logic
Edwin Thompson Jaynes
1950s
Probability Theory: The Logic of Science
“Let me make what, I fear, will seem to some a radical, shocking suggestion: The merits of any statistical method are not determined by the ideology which led to it. For, many different, violently opposed ideologies may all lead to the same final “working equations” for dealing with real problems. Apparently, this phenomenon is something new in statistics; but it is so commonplace in physics that we have long since learned how to live with it. Today, when a physicist says, “Theory A is better than theory B,” he does not have in mind any ideological considerations; he means simply, “There is at least one specific application where theory A leads to a better result than theory B.” I suggest that we apply the same criterion in statistics: The merits of any statistical method are determined by the results it gives when applied to specific problems. The Court of Last Resort in statistics is simply our commonsense judgment of those results.”
From: Foundations of Probability Theory, Statistical Inference, and Statistical Theories of Science
59. Decision Analytic lens
Ronald A. Howard
1966
Coined the term “Decision Analysis”
Cofounder of the field
Brought the notion of “decisions”
First published version of “Encoding of priors” in
Decision Analysis: Applied Decision Theory
Prior development as “psychoanalytic process”
Watch interview
60. Conditional Probability: Covid Antibody Test
Prior: Covid +ve 5%; Covid -ve 95%
Likelihood, given Covid +ve: Antibody detected 90%; Antibody not detected 10%
Likelihood, given Covid -ve: Antibody detected 5%; Antibody not detected 95%
Pre-Posterior: Antibody detected 9.25%; Antibody not detected 90.75%
Posterior, given Antibody detected: Covid +ve 48.65%; Covid -ve 51.35%
Posterior, given Antibody not detected: Covid +ve 0.55%; Covid -ve 99.45%
What about Shruti?
She presents with all the serious symptoms (according to WHO) of COVID:
Difficulty breathing or shortness of breath
Chest pain or pressure
Loss of speech or movement
62. Distinctions that pass clarity test
[Chart: Uniform distribution over Long-run Fraction of Heads, from 0 to 1]
Meaningless to put probability on unclear distinctions
We know nothing at all about the distinction
-AND-
Believe that every value is equally likely
63. Updating the beta is a simple addition operation
Add the number of tosses to update N and the number of successes to update S
PRIOR: S = 1, N = 2; alpha = 1, beta = 2-1 = 1
0 HEADS in 2 TOSSES: S = 1+0 = 1, N = 2+2 = 4; alpha = 1, beta = 4-1 = 3
1 HEADS in 2 TOSSES: S = 1+1 = 2, N = 2+2 = 4; alpha = 2, beta = 4-2 = 2
2 HEADS in 2 TOSSES: S = 1+2 = 3, N = 2+2 = 4; alpha = 3, beta = 4-3 = 1
[Chart: Beta densities for the prior and the three updated cases, against Long-run fraction of heads from 0 to 1]
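The update rule on this slide is small enough to write out directly. This is a minimal sketch; the function names `update` and `beta_params` are illustrative and not part of the workshop material.

```python
def update(s, n, heads, tosses):
    """Add observed successes to S and observed tosses to N."""
    return s + heads, n + tosses


def beta_params(s, n):
    """Convert (S, N) into the Beta distribution's (alpha, beta)."""
    return s, n - s


prior_s, prior_n = 1, 2  # the uniform prior, Beta(1, 1)

for heads in (0, 1, 2):
    s, n = update(prior_s, prior_n, heads=heads, tosses=2)
    alpha, beta = beta_params(s, n)
    print(f"{heads} heads in 2 tosses: S={s}, N={n}, "
          f"alpha={alpha}, beta={beta}, mean={s / n:.2f}")
```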
64. Decision Analytic AB tests
Before Experiment: PRIOR distributions for Treatment and Control
After Experiment: POSTERIOR distributions for Treatment and Control
Pass through Decision Model: $ Value of Treatment, $ Value of Control
Compare
[Charts: prior and posterior distributions for Treatment and Control, against Long-run fraction of heads]
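The flow on this slide might be sketched as follows. Everything specific here is hypothetical: the experiment counts, the customer and revenue figures, the rollout cost, and the linear form of the decision models are invented for illustration. Only the overall shape (Beta posteriors passed through a decision model, then compared on $ value) follows the slide.

```python
import random


def posterior_params(successes, trials, prior_s=1, prior_n=2):
    """Uniform prior (S=1, N=2) updated by addition; returns (alpha, beta)."""
    s = prior_s + successes
    n = prior_n + trials
    return s, n - s


def expected_value(alpha, beta, decision_model, draws=20_000, seed=0):
    """Average the decision model's $ value over draws from Beta(alpha, beta)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        total += decision_model(rng.betavariate(alpha, beta))
    return total / draws


# Hypothetical decision models: 1,000 customers, $10 per success,
# and a $500 cost to roll out the treatment.
def treatment_model(rate):
    return 1_000 * 10 * rate - 500


def control_model(rate):
    return 1_000 * 10 * rate


# Hypothetical experiment results.
treatment_posterior = posterior_params(successes=30, trials=100)
control_posterior = posterior_params(successes=20, trials=100)

value_treatment = expected_value(*treatment_posterior, treatment_model, seed=0)
value_control = expected_value(*control_posterior, control_model, seed=1)

print(f"$ value of treatment: {value_treatment:,.0f}")
print(f"$ value of control:   {value_control:,.0f}")
print("choose:", "treatment" if value_treatment > value_control else "control")
```

The comparison is made on dollars rather than on the fraction itself, so a treatment with a higher success rate can still lose if its cost outweighs the gain.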
65. The Metalog Distribution
Tom Keelin
2016
Published “The Metalog Distributions”
A significant advance in statistics
A distribution to match data/assessments
Prior development and updating
a fully “psychoanalytic” process
Entirely compatible with data,
huge implications for ML
67. Beta Distribution
Play with the BETA.DIST Excel function to create:
1. Uniform Prior
2. Updating Beta Distributions
Decision Analytic AB Testing
Read Article
Try spreadsheet and Web model
Metalog Distributions
Watch Video 1, Video 2
Download Metalog Workbooks in Excel
Encode your beliefs with Metalog Distributions