SlideShare a Scribd company logo
1 of 64
Download to read offline
Course Overview
Statistics for Data Science
CSE357 - Fall 2021
Statistics for Data Science
Statistics - methods for evaluating hypotheses in the light of empirical facts
(Stanford Encyclopedia of Philosophy, 2014)
2
Statistics for Data Science
Statistics - methods for evaluating hypotheses in the light of empirical facts
(Stanford Encyclopedia of Philosophy, 2014)
Data Science - a field focused on using statistical, scientific, and computational
techniques to gain insights from data.
3
Statistics for Data Science
Statistics - methods for evaluating hypotheses in the light of empirical facts
(Stanford Encyclopedia of Philosophy, 2014)
Data Science - a field focused on using statistical, scientific, and computational
techniques to gain insights from data.
4
Computation Statistics
Science
Statistics for Data Science
Statistics - methods for evaluating hypotheses in the light of empirical facts
(Stanford Encyclopedia of Philosophy, 2014)
Data Science - a field focused on using statistical, scientific, and computational
techniques to gain insights from data.
5
Computation Statistics
Science
Statistics for Data Science
Statistics - methods for evaluating hypotheses in the light of empirical facts
(Stanford Encyclopedia of Philosophy, 2014)
Data Science - a field focused on using statistical, scientific, and computational
techniques to gain insights from data.
Approximately equal:
Data Science ≈ Data Mining ≈ Analytics ≈ Quantitative Science
Highly Related
Data Science , Big Data , Machine Learning , Artificial Intelligence
6
Statistics for Data Science
Statistical methods for gaining knowledge and insights from data.
-- designed for those already proficient in programming (i.e. computing)
7
Statistics for Data Science
Statistical methods for gaining knowledge and insights from data.
-- designed for those already proficient in programming (i.e. computing)
A pathway to knowledge about…
… what was, (past)
… what is, (present)
… what is likely (future)
8
Statistics for Data Science
Statistical methods for gaining knowledge and insights from data.
-- designed for those already proficient in programming (i.e. computing)
A pathway to knowledge about…
… what was, (past)
… what is, (present)
… what is likely (future, the full population)
9
Why?!?
Statistics for Data Science
Statistical methods for gaining knowledge and insights from data.
-- designed for those already proficient in programming (i.e. computing)
A pathway to knowledge about…
… what was, (past)
… what is, (present)
… what is likely (future)
10
Why?!?
Jobs
Statistics for Data Science
Statistical methods for gaining knowledge and insights from data.
-- designed for those already proficient in programming (i.e. computing)
A pathway to knowledge about…
… what was, (past)
… what is, (present)
… what is likely (future)
11
Why?!?
Jobs
Decisions
Statistics for Data Science
Statistical methods for gaining knowledge and insights from data.
-- designed for those already proficient in programming (i.e. computing)
A pathway to knowledge about…
… what was, (past)
… what is, (present)
… what is likely (future)
12
Why?!?
Jobs
Decisions
Truth / Meaning in Life
The answer to the "ultimate question of
life, the universe, and everything" (Adams)
In other words, so you can go on Twitter and say
"The data say …"
"I did my research."
… and change no one's mind but at least understand it better yourself.
13
Course Website
https://www3.cs.stonybrook.edu/~has/CSE357/index.html
14
Probability
Statistics for Data Science
CSE357 - Fall 2021
What is Probability?
16
What is Probability?
Examples
(1) outcome of flipping a coin
(2) amount of snowfall
(3) mentioning "happy"
(4) mentioning "happy" a lot
17
What is Probability?
The chance that something will happen.
Given infinite observations of an event, the proportion of observations where a
given outcome happens.
Strength of belief that something is true.
18
What is Probability?
The chance that something will happen.
Given infinite observations of an event, the proportion of observations where a
given outcome happens.
Strength of belief that something is true.
“Mathematical language for quantifying uncertainty” - Wasserman
19
Probability (review)
Ω : Sample Space, set of all outcomes of a random experiment
A : Event (A ⊆ Ω), collection of possible outcomes of an experiment
P(A): Probability of event A, P is a function: events→ℝ
20
Probability (review)
Ω : Sample Space, set of all outcomes of a random experiment
A : Event (A ⊆ Ω), collection of possible outcomes of an experiment
P(A): Probability of event A, P is a function: events→ℝ
(1) P(Ω) = 1
(2) P(A) ≥ 0 , for all A
(3) If A1
, A2
, … are disjoint events then:
21
Probability (review)
Ω : Sample Space, set of all outcomes of a random experiment
A : Event (A ⊆ Ω), collection of possible outcomes of an experiment
P(A): Probability of event A, P is a function: events→ℝ
P is a probability measure, if and only if
(1) P(Ω) = 1
(2) P(A) ≥ 0 , for all A
(3) If A1
, A2
, … are disjoint events then:
22
Probability (review)
Ω : Sample Space, set of all outcomes of a random experiment
A : Event (A ⊆ Ω), collection of possible outcomes of an experiment
P(A): Probability of event A, P is a function: events→ℝ
P is a probability measure, if and only if
(1) P(Ω) = 1
(2) P(A) ≥ 0 , for all A
(3) If A1
, A2
, … are disjoint events then:
23
Probability (review)
Ω : Sample Space, set of all outcomes of a random experiment
A : Event (A ⊆ Ω), collection of possible outcomes of an experiment
P(A): Probability of event A, P is a function: events→ℝ
P is a probability measure, if and only if
(1) P(Ω) = 1
(2) P(A) ≥ 0 , for all A
(3) If A1
, A2
, … are disjoint events then:
24
Examples
(1) outcome of flipping a coin
(2) amount of snowfall
(3) mentioning "happy"
(4) mentioning "happy" a lot
Probability (review)
Some Properties:
If B ⊆ A then P(A) ≥ P(B)
P(A ⋃ B) ≤ P(A) + P(B)
P(A ⋂ B) ≤ min(P(A), P(B))
P(¬A) = P(Ω / A) = 1 - P(A)
/ is set difference
P(A ⋂ B) will be notated as P(A, B)
25
Examples
(1) outcome of flipping a coin
(2) amount of snowfall
(3) mentioning "happy"
(4) mentioning "happy" a lot
Independence
Independence
Two Events: A and B
Does knowing something about A tell us whether B happens (and vice versa)?
26
Independence
Independence
Two Events: A and B
Does knowing something about A tell us whether B happens (and vice versa)?
(1) A: first flip of a fair coin; B: second flip of the same fair coin
(2) A: mention or not of the first word is “happy”
B: mention or not of the second word is “birthday”
27
Independence
Independence
Two Events: A and B
Does knowing something about A tell us whether B happens (and vice versa)?
(1) A: first flip of a fair coin; B: second flip of the same fair coin
(2) A: mention or not of the first word is “happy”
B: mention or not of the second word is “birthday”
Two events, A and B, are independent iff P(A, B) = P(A)P(B)
28
Independence
Independence
Two Events: A and B
Does knowing something about A tell us whether B happens (and vice versa)?
(1) A: first flip of a fair coin; B: second flip of the same fair coin
(2) A: mention or not of the first word is “happy”
B: mention or not of the second word is “birthday”
Two events, A and B, are independent iff P(A, B) = P(A)P(B)
29
Does dependence
imply causality?
Disjoint Sets vs. Independent Events
Independence: Two events, A and B are independence iff P(A,B) = P(A)P(B)
Disjoint Sets: If two events, A and B, come from disjoint sets, then
P(A,B) = 0
30
Disjoint Sets vs. Independent Events
Independence: … iff P(A,B) = P(A)P(B)
Disjoint Sets: If two events, A and B, come from disjoint sets, then
P(A,B) = 0
Does independence imply disjoint?
31
Disjoint Sets vs. Independent Events
Independence: … iff P(A,B) = P(A)P(B)
Disjoint Sets: If two events, A and B, come from disjoint sets, then
P(A,B) = 0
Does independence imply disjoint? No
Proof: A counterexample: ?
32
Disjoint Sets vs. Independent Events
Independence: … iff P(A,B) = P(A)P(B)
Disjoint Sets: If two events, A and B, come from disjoint sets, then
P(A,B) = 0
Does independence imply disjoint? No
Proof: A counterexample: A: flip of fair coin A is heads,
B: flip of fair boin B is heads;
independence tell us P(A)P(B) = P(A,B) = .25
but disjoint tells us P(A, B) = 0
33
A B
Probability (Review)
Conditional Probability
P(A, B)
P(A|B) = -------------
P(B)
34
Probability (Review)
Conditional Probability
P(A, B)
P(A|B) = -------------
P(B)
35
H: mention “happy” in message, m
B: mention “birthday” in message, m
P(H) = .01 P(B) =.001 P(H, B) = .0005
P(H|B) = ??
Probability (Review)
Conditional Probability
P(A, B)
P(A|B) = -------------
P(B)
36
H: mention “happy” in message, m
B: mention “birthday” in message, m
P(H) = .01 P(B) =.001 P(H, B) = .0005
P(H|B) = .50
H1: first flip of a fair coin is heads
H2: second flip of the same coin is heads
P(H2) = 0.5 P(H1) = 0.5 P(H2, H1) = 0.25
P(H2|H1) = 0.5
Probability (Review)
Conditional Probability
P(A, B)
P(A|B) = -------------
P(B)
Two events, A and B, are independent iff P(A, B) = P(A)P(B)
P(A, B) = P(A)P(B) iff P(A|B) = P(A)
37
H1: first flip of a fair coin is heads
H2: second flip of the same coin is heads
P(H2) = 0.5 P(H1) = 0.5 P(H2, H1) = 0.25
P(H2|H1) = 0.5
Probability (Review)
Conditional Probability
P(A, B)
P(A|B) = -------------
P(B)
Two events, A and B, are independent iff P(A, B) = P(A)P(B)
P(A, B) = P(A)P(B) iff P(A|B) = P(A)
Interpretation of Independence:
Observing B has no effect on probability of A. 38
H1: first flip of a fair coin is heads
H2: second flip of the same coin is heads
P(H2) = 0.5 P(H1) = 0.5 P(H2, H1) = 0.25
P(H2|H1) = 0.5
Why Probability?
39
Why Probability?
A formality to make sense of the world.
(1) To quantify uncertainty
Should we believe something or not? Is it a meaningful difference?
(2) To be able to generalize from one situation or point in time to another.
Can we rely on some information? What is the chance Y happens?
(3) To organize data into meaningful groups or “dimensions”
Where does X belong? What words are similar to X?
40
Probabilities over >2 events...
Independence:
A1
, A2
, …, An
are independent iff
41
Probabilities over >2 events...
Independence:
A1
, A2
, …, An
are independent iff
Conditional Probability:
42
Probabilities over >2 events...
Independence:
A1
, A2
, …, An
are independent iff
Conditional Probability:
just think of multiple events happening as a single event:
Z = A1
,, A2
,… , Am-1
= A1
,⋂ A2
⋂ … ⋂ Am-1
then P(Z|An
) 43
Conditional Probabilities are Fundamental to Data Science
for example
Machine Learning: Most modern deep learning techniques try to estimate
P(outcome | data)
Causal inference: Does treatment cause outcome?
P(outcome | treatment) =/= P(outcome) *
*also requires random sampling of treatment conditions
44
Conditional Independence
A and B are conditionally independent, given C, IFF
P(A, B | C) = P(A|C)P(B|C)
Equivalently, P(A|B,C) = P(A|C)
Interpretation: Once we know C, then B doesn’t tell us anything useful about A.
45
Bayes Theorem - Lite
GOAL: Relate (1) P(A|B) to (2) P(B|A)
46
Bayes Theorem - Lite
GOAL: Relate (1) P(A|B) to (2) P(B|A)
Let’s try:
(3) P(A|B) = P(A,B) / P(B), def. of conditional probability on (1)
47
Bayes Theorem - Lite
GOAL: Relate (1) P(A|B) to (2) P(B|A)
Let’s try:
(3) P(A|B) = P(A,B) / P(B), def. of conditional probability on (1)
(4) P(B|A) = P(B,A) / P(A) = P(A,B) / P(A), def. of cond prob on (2); sym of set intrsct
48
Bayes Theorem - Lite
GOAL: Relate (1) P(A|B) to (2) P(B|A)
Let’s try:
(3) P(A|B) = P(A,B) / P(B), def. of conditional probability on (1)
(4) P(B|A) = P(B,A) / P(A) = P(A,B) / P(A), def. of cond prob on (2); sym of set intrsct
(5) P(B|A)P(A) = P(A,B), algebra on (4) ← known as “Multiplication Rule”
49
Bayes Theorem - Lite
GOAL: Relate (1) P(A|B) to (2) P(B|A)
Let’s try:
(3) P(A|B) = P(A,B) / P(B), def. of conditional probability on (1)
(4) P(B|A) = P(B,A) / P(A) = P(A,B) / P(A), def. of cond prob on (2); sym of set intrsct
(5) P(B|A)P(A) = P(A,B), algebra on (4) ← known as “Multiplication Rule”
(6) P(A|B) = (P(B|A)P(A)) / P(B), Substitute P(A,B) from (5) into (3)
50
Bayes Theorem - Lite
GOAL: Relate (1) P(A|B) to (2) P(B|A)
Let’s try:
(3) P(A|B) = P(A,B) / P(B), def. of conditional probability on (1)
(4) P(B|A) = P(B,A) / P(A) = P(A,B) / P(A), def. of cond prob on (2); sym of set intrsct
(5) P(B|A)P(A) = P(A,B), algebra on (4) ← known as “Multiplication Rule”
(6) P(A|B) = (P(B|A)P(A)) / P(B), Substitute P(A,B) from (5) into (3)
51
Bayes Theorem - Lite
GOAL: Relate (1) P(A|B) to (2) P(B|A)
Let’s try:
(3) P(A|B) = P(A,B) / P(B), def. of conditional probability on (1)
(4) P(B|A) = P(B,A) / P(A) = P(A,B) / P(A), def. of cond prob on (2); sym of set intrsct
(5) P(B|A)P(A) = P(A,B), algebra on (4) ← known as “Multiplication Rule”
(6) P(A|B) = (P(B|A)P(A)) / P(B), Substitute P(A,B) from (5) into (3)
52
Why?
We often want to know P(A|B) but we
are only given P(B|A) and P(A).
Example: You want to know if an email is
likely spam given a word appearing in it:
P(spam | word). However, you only have a
dataset of words and spam: P(word | spam)
and you can look up the frequency of spam
emails in general to get P(spam) as well as the
frequency of "word" in general for P(word).
Bayes Theorem - Heavy (with multiple events partitioning Ω)
GOAL: Relate P(Ai
|B) to P(B|Ai
),
for all i = 1 ... k, where A1
... Ak
partition Ω
53
First: Law of Total Probability
GOAL: Relate P(Ai
|B) to P(B|Ai
),
for all i = 1 ... k, where A1
... Ak
partition Ω
partition: P(A1
U A2
… U Ak
) = Ω
P(Ai
, Aj
) = 0, for all i ≠ j
54
A1
A2
A3
Ak
...
First: Law of Total Probability
GOAL: Relate P(Ai
|B) to P(B|Ai
),
for all i = 1 ... k, where A1
... Ak
partition Ω
partition: P(A1
U A2
… U Ak
) = Ω
P(Ai
, Aj
) = 0, for all i ≠ j
55
A1
A2
A3
Ak
...
When both of these conditions are
true, we say "A1
, …, Ak
partition Ω"
First: Law of Total Probability
GOAL: Relate P(Ai
|B) to P(B|Ai
),
for all i = 1 ... k, where A1
... Ak
partition Ω
partition: P(A1
U A2
… U Ak
) = Ω
P(Ai
, Aj
) = 0, for all i ≠ j
law of total probability: If A1
... Ak
partition Ω,
then for any event, B:
56
A1
A2
A3
Ak
...
Law of Total Probability and Bayes Theorem
GOAL: Relate P(Ai
|B) to P(B|Ai
),
for all i = 1 ... k, where A1
... Ak
partition Ω
Let’s try:
57
Law of Total Probability
Law of Total Probability and Bayes Theorem
GOAL: Relate P(Ai
|B) to P(B|Ai
),
for all i = 1 ... k, where A1
... Ak
partition Ω
Let’s try:
(1) P(Ai
|B) = P(Ai
,B) / P(B)
(2) P(Ai
,B) / P(B) = P(B|Ai
) P(Ai
) / P(B), by multiplication rule
58
Law of Total Probability
P(A,B) = P(B|A)P(A)
Law of Total Probability and Bayes Theorem
GOAL: Relate P(Ai
|B) to P(B|Ai
),
for all i = 1 ... k, where A1
... Ak
partition Ω
Let’s try:
(1) P(Ai
|B) = P(Ai
,B) / P(B)
(2) P(Ai
,B) / P(B) = P(B|Ai
) P(Ai
) / P(B), by multiplication rule
but in practice, we might not know P(B)
59
Law of Total Probability
Law of Total Probability and Bayes Theorem
GOAL: Relate P(Ai
|B) to P(B|Ai
),
for all i = 1 ... k, where A1
... Ak
partition Ω
Let’s try:
(1) P(Ai
|B) = P(Ai
,B) / P(B)
(2) P(Ai
,B) / P(B) = P(B|Ai
) P(Ai
) / P(B), by multiplication rule
but in practice, we might not know P(B)
(3) P(B|Ai
) P(Ai
) / P(B) = P(B|Ai
) P(Ai
) / ( ), by law of total
probability
60
Law of Total Probability
Law of Total Probability and Bayes Theorem
GOAL: Relate P(Ai
|B) to P(B|Ai
),
for all i = 1 ... k, where A1
... Ak
partition Ω
Let’s try:
(1) P(Ai
|B) = P(Ai
,B) / P(B)
(2) P(Ai
,B) / P(B) = P(B|Ai
) P(Ai
) / P(B), by multiplication rule
but in practice, we might not know P(B)
(3) P(B|Ai
) P(Ai
) / P(B) = P(B|Ai
) P(Ai
) / ( ), by law of total
probability
Thus, 61
Law of Total Probability
Law of Total Probability and Bayes Theorem
GOAL: Relate P(Ai
|B) to P(B|Ai
),
for all i = 1 ... k, where A1
... Ak
partition Ω
Let’s try:
(1) P(Ai
|B) = P(Ai
,B) / P(B)
(2) P(Ai
,B) / P(B) = P(B|Ai
) P(Ai
) / P(B), by multiplication rule
but in practice, we might not know P(B)
(3) P(B|Ai
) P(Ai
) / P(B) = P(B|Ai
) P(Ai
) / ( ), by law of total
probability
Thus, 62
Law of Total Probability
Bayes Rule, in practice
Law of Total Probability and Bayes Theorem
GOAL: Relate P(Ai
|B) to P(B|Ai
),
for all i = 1 ... k, where A1
... Ak
partition Ω
Let’s try:
(1) P(Ai
|B) = P(Ai
,B) / P(B)
(2) P(Ai
,B) / P(B) = P(B|Ai
) P(Ai
) / P(B), by multiplication rule
but in practice, we might not know P(B)
(3) P(B|Ai
) P(Ai
) / P(B) = P(B|Ai
) P(Ai
) / ( ), by law of total
probability
Thus, 63
Bayes Rule, in practice
Example:
https://www.youtube.com/watch?v=R13BD8qKeTg
Probability Review:
● What constitutes a probability measure?
● Independence
● Conditional probability
● Conditional independence
● How to derive Bayes Theorem
● Multiplication Rule
● Partition of Sample Space
● Law of Total Probability
● Bayes Theorem in Practice

More Related Content

Similar to CSE357 fa21 (1) Course Intro and Probability 8-26.pdf

probability and statistics
probability and statisticsprobability and statistics
probability and statisticszain393885
 
Basic Concept Of Probability
Basic Concept Of ProbabilityBasic Concept Of Probability
Basic Concept Of Probabilityguest45a926
 
1 Probability Please read sections 3.1 – 3.3 in your .docx
 1 Probability   Please read sections 3.1 – 3.3 in your .docx 1 Probability   Please read sections 3.1 – 3.3 in your .docx
1 Probability Please read sections 3.1 – 3.3 in your .docxaryan532920
 
Probability concepts for Data Analytics
Probability concepts for Data AnalyticsProbability concepts for Data Analytics
Probability concepts for Data AnalyticsSSaudia
 
Basic concept of probability
Basic concept of probabilityBasic concept of probability
Basic concept of probabilityIkhlas Rahman
 
Statistical computing 1
Statistical computing 1Statistical computing 1
Statistical computing 1Padma Metta
 
Probability cheatsheet
Probability cheatsheetProbability cheatsheet
Probability cheatsheetSuvrat Mishra
 
1-Probability-Conditional-Bayes.pdf
1-Probability-Conditional-Bayes.pdf1-Probability-Conditional-Bayes.pdf
1-Probability-Conditional-Bayes.pdfKrushangDilipbhaiPar
 
Making probability easy!!!
Making probability easy!!!Making probability easy!!!
Making probability easy!!!GAURAV SAHA
 
Unit IV UNCERTAINITY AND STATISTICAL REASONING in AI K.Sundar,AP/CSE,VEC
Unit IV UNCERTAINITY AND STATISTICAL REASONING in AI K.Sundar,AP/CSE,VECUnit IV UNCERTAINITY AND STATISTICAL REASONING in AI K.Sundar,AP/CSE,VEC
Unit IV UNCERTAINITY AND STATISTICAL REASONING in AI K.Sundar,AP/CSE,VECsundarKanagaraj1
 
Reliability-Engineering.pdf
Reliability-Engineering.pdfReliability-Engineering.pdf
Reliability-Engineering.pdfBakiyalakshmiR1
 

Similar to CSE357 fa21 (1) Course Intro and Probability 8-26.pdf (20)

PTSP PPT.pdf
PTSP PPT.pdfPTSP PPT.pdf
PTSP PPT.pdf
 
probability and statistics
probability and statisticsprobability and statistics
probability and statistics
 
Basic Concept Of Probability
Basic Concept Of ProbabilityBasic Concept Of Probability
Basic Concept Of Probability
 
lec2_CS540_handouts.pdf
lec2_CS540_handouts.pdflec2_CS540_handouts.pdf
lec2_CS540_handouts.pdf
 
1 Probability Please read sections 3.1 – 3.3 in your .docx
 1 Probability   Please read sections 3.1 – 3.3 in your .docx 1 Probability   Please read sections 3.1 – 3.3 in your .docx
1 Probability Please read sections 3.1 – 3.3 in your .docx
 
Probability concepts for Data Analytics
Probability concepts for Data AnalyticsProbability concepts for Data Analytics
Probability concepts for Data Analytics
 
Basic concept of probability
Basic concept of probabilityBasic concept of probability
Basic concept of probability
 
Statistical computing 1
Statistical computing 1Statistical computing 1
Statistical computing 1
 
Chap03 probability
Chap03 probabilityChap03 probability
Chap03 probability
 
Lecture 01
Lecture 01Lecture 01
Lecture 01
 
Probability cheatsheet
Probability cheatsheetProbability cheatsheet
Probability cheatsheet
 
Probability-06.pdf
Probability-06.pdfProbability-06.pdf
Probability-06.pdf
 
Tp4 probability
Tp4 probabilityTp4 probability
Tp4 probability
 
4 probability
4 probability4 probability
4 probability
 
1-Probability-Conditional-Bayes.pdf
1-Probability-Conditional-Bayes.pdf1-Probability-Conditional-Bayes.pdf
1-Probability-Conditional-Bayes.pdf
 
Making probability easy!!!
Making probability easy!!!Making probability easy!!!
Making probability easy!!!
 
Unit IV UNCERTAINITY AND STATISTICAL REASONING in AI K.Sundar,AP/CSE,VEC
Unit IV UNCERTAINITY AND STATISTICAL REASONING in AI K.Sundar,AP/CSE,VECUnit IV UNCERTAINITY AND STATISTICAL REASONING in AI K.Sundar,AP/CSE,VEC
Unit IV UNCERTAINITY AND STATISTICAL REASONING in AI K.Sundar,AP/CSE,VEC
 
S244 10 Probability.ppt
S244 10 Probability.pptS244 10 Probability.ppt
S244 10 Probability.ppt
 
Probability.pptx
Probability.pptxProbability.pptx
Probability.pptx
 
Reliability-Engineering.pdf
Reliability-Engineering.pdfReliability-Engineering.pdf
Reliability-Engineering.pdf
 

Recently uploaded

Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Velvetech LLC
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfPower Karaoke
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样umasea
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...OnePlan Solutions
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)jennyeacort
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Hr365.us smith
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesPhilip Schwarz
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024StefanoLambiase
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 

Recently uploaded (20)

Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 

CSE357 fa21 (1) Course Intro and Probability 8-26.pdf

  • 1. Course Overview Statistics for Data Science CSE357 - Fall 2021
  • 2. Statistics for Data Science Statistics - methods for evaluating hypotheses in the light of empirical facts (Stanford Encyclopedia of Philosophy, 2014) 2
  • 3. Statistics for Data Science Statistics - methods for evaluating hypotheses in the light of empirical facts (Stanford Encyclopedia of Philosophy, 2014) Data Science - a field focused on using statistical, scientific, and computational techniques to gain insights from data. 3
  • 4. Statistics for Data Science Statistics - methods for evaluating hypotheses in the light of empirical facts (Stanford Encyclopedia of Philosophy, 2014) Data Science - a field focused on using statistical, scientific, and computational techniques to gain insights from data. 4 Computation Statistics Science
  • 5. Statistics for Data Science Statistics - methods for evaluating hypotheses in the light of empirical facts (Stanford Encyclopedia of Philosophy, 2014) Data Science - a field focused on using statistical, scientific, and computational techniques to gain insights from data. 5 Computation Statistics Science
  • 6. Statistics for Data Science Statistics - methods for evaluating hypotheses in the light of empirical facts (Stanford Encyclopedia of Philosophy, 2014) Data Science - a field focused on using statistical, scientific, and computational techniques to gain insights from data. Approximately equal: Data Science ≈ Data Mining ≈ Analytics ≈ Quantitative Science Highly Related Data Science , Big Data , Machine Learning , Artificial Intelligence 6
  • 7. Statistics for Data Science Statistical methods for gaining knowledge and insights from data. -- designed for those already proficient in programming (i.e. computing) 7
  • 8. Statistics for Data Science Statistical methods for gaining knowledge and insights from data. -- designed for those already proficient in programming (i.e. computing) A pathway to knowledge about… … what was, (past) … what is, (present) … what is likely (future) 8
  • 9. Statistics for Data Science Statistical methods for gaining knowledge and insights from data. -- designed for those already proficient in programming (i.e. computing) A pathway to knowledge about… … what was, (past) … what is, (present) … what is likely (future, the full population) 9 Why?!?
  • 10. Statistics for Data Science Statistical methods for gaining knowledge and insights from data. -- designed for those already proficient in programming (i.e. computing) A pathway to knowledge about… … what was, (past) … what is, (present) … what is likely (future) 10 Why?!? Jobs
  • 11. Statistics for Data Science Statistical methods for gaining knowledge and insights from data. -- designed for those already proficient in programming (i.e. computing) A pathway to knowledge about… … what was, (past) … what is, (present) … what is likely (future) 11 Why?!? Jobs Decisions
  • 12. Statistics for Data Science Statistical methods for gaining knowledge and insights from data. -- designed for those already proficient in programming (i.e. computing) A pathway to knowledge about… … what was, (past) … what is, (present) … what is likely (future) 12 Why?!? Jobs Decisions Truth / Meaning in Life The answer to the "ultimate question of life, the universe, and everything" (Adams)
  • 13. In other words, so you can go on Twitter and say "The data say …" "I did my research." … and change no one's mind but at least understand it better yourself. 13
  • 15. Probability Statistics for Data Science CSE357 - Fall 2021
  • 17. What is Probability? Examples (1) outcome of flipping a coin (2) amount of snowfall (3) mentioning "happy" (4) mentioning "happy" a lot 17
  • 18. What is Probability? The chance that something will happen. Given infinite observations of an event, the proportion of observations where a given outcome happens. Strength of belief that something is true. 18
  • 19. What is Probability? The chance that something will happen. Given infinite observations of an event, the proportion of observations where a given outcome happens. Strength of belief that something is true. “Mathematical language for quantifying uncertainty” - Wasserman 19
  • 20. Probability (review) Ω : Sample Space, set of all outcomes of a random experiment A : Event (A ⊆ Ω), collection of possible outcomes of an experiment P(A): Probability of event A, P is a function: events→ℝ 20
  • 21. Probability (review) Ω : Sample Space, set of all outcomes of a random experiment A : Event (A ⊆ Ω), collection of possible outcomes of an experiment P(A): Probability of event A, P is a function: events→ℝ (1) P(Ω) = 1 (2) P(A) ≥ 0 , for all A (3) If A1 , A2 , … are disjoint events then: 21
  • 22. Probability (review) Ω : Sample Space, set of all outcomes of a random experiment A : Event (A ⊆ Ω), collection of possible outcomes of an experiment P(A): Probability of event A, P is a function: events→ℝ P is a probability measure, if and only if (1) P(Ω) = 1 (2) P(A) ≥ 0 , for all A (3) If A1 , A2 , … are disjoint events then: 22
  • 23. Probability (review) Ω : Sample Space, set of all outcomes of a random experiment A : Event (A ⊆ Ω), collection of possible outcomes of an experiment P(A): Probability of event A, P is a function: events→ℝ P is a probability measure, if and only if (1) P(Ω) = 1 (2) P(A) ≥ 0 , for all A (3) If A1 , A2 , … are disjoint events then: 23
  • 24. Probability (review) Ω : Sample Space, set of all outcomes of a random experiment A : Event (A ⊆ Ω), collection of possible outcomes of an experiment P(A): Probability of event A, P is a function: events→ℝ P is a probability measure, if and only if (1) P(Ω) = 1 (2) P(A) ≥ 0 , for all A (3) If A1 , A2 , … are disjoint events then: 24 Examples (1) outcome of flipping a coin (2) amount of snowfall (3) mentioning "happy" (4) mentioning "happy" a lot
  • 25. Probability (review) Some Properties: If B ⊆ A then P(A) ≥ P(B) P(A ⋃ B) ≤ P(A) + P(B) P(A ⋂ B) ≤ min(P(A), P(B)) P(¬A) = P(Ω / A) = 1 - P(A) / is set difference P(A ⋂ B) will be notated as P(A, B) 25 Examples (1) outcome of flipping a coin (2) amount of snowfall (3) mentioning "happy" (4) mentioning "happy" a lot
  • 26. Independence Independence Two Events: A and B Does knowing something about A tell us whether B happens (and vice versa)? 26
  • 27. Independence Independence Two Events: A and B Does knowing something about A tell us whether B happens (and vice versa)? (1) A: first flip of a fair coin; B: second flip of the same fair coin (2) A: mention or not of the first word is “happy” B: mention or not of the second word is “birthday” 27
  • 28. Independence Independence Two Events: A and B Does knowing something about A tell us whether B happens (and vice versa)? (1) A: first flip of a fair coin; B: second flip of the same fair coin (2) A: mention or not of the first word is “happy” B: mention or not of the second word is “birthday” Two events, A and B, are independent iff P(A, B) = P(A)P(B) 28
  • 29. Independence Independence Two Events: A and B Does knowing something about A tell us whether B happens (and vice versa)? (1) A: first flip of a fair coin; B: second flip of the same fair coin (2) A: mention or not of the first word is “happy” B: mention or not of the second word is “birthday” Two events, A and B, are independent iff P(A, B) = P(A)P(B) 29 Does dependence imply causality?
  • 30. Disjoint Sets vs. Independent Events Independence: Two events, A and B are independence iff P(A,B) = P(A)P(B) Disjoint Sets: If two events, A and B, come from disjoint sets, then P(A,B) = 0 30
  • 31. Disjoint Sets vs. Independent Events Independence: … iff P(A,B) = P(A)P(B) Disjoint Sets: If two events, A and B, come from disjoint sets, then P(A,B) = 0 Does independence imply disjoint? 31
  • 32. Disjoint Sets vs. Independent Events Independence: … iff P(A,B) = P(A)P(B) Disjoint Sets: If two events, A and B, come from disjoint sets, then P(A,B) = 0 Does independence imply disjoint? No Proof: A counterexample: ? 32
  • 33. Disjoint Sets vs. Independent Events Independence: … iff P(A,B) = P(A)P(B) Disjoint Sets: If two events, A and B, come from disjoint sets, then P(A,B) = 0 Does independence imply disjoint? No Proof: A counterexample: A: flip of fair coin A is heads, B: flip of fair boin B is heads; independence tell us P(A)P(B) = P(A,B) = .25 but disjoint tells us P(A, B) = 0 33 A B
  • 34. Probability (Review) Conditional Probability P(A, B) P(A|B) = ------------- P(B) 34
  • 35. Probability (Review) Conditional Probability P(A, B) P(A|B) = ------------- P(B) 35 H: mention “happy” in message, m B: mention “birthday” in message, m P(H) = .01 P(B) =.001 P(H, B) = .0005 P(H|B) = ??
  • 36. Probability (Review) Conditional Probability P(A, B) P(A|B) = ------------- P(B) 36 H: mention “happy” in message, m B: mention “birthday” in message, m P(H) = .01 P(B) =.001 P(H, B) = .0005 P(H|B) = .50 H1: first flip of a fair coin is heads H2: second flip of the same coin is heads P(H2) = 0.5 P(H1) = 0.5 P(H2, H1) = 0.25 P(H2|H1) = 0.5
  • 37. Probability (Review) Conditional Probability P(A, B) P(A|B) = ------------- P(B) Two events, A and B, are independent iff P(A, B) = P(A)P(B) P(A, B) = P(A)P(B) iff P(A|B) = P(A) 37 H1: first flip of a fair coin is heads H2: second flip of the same coin is heads P(H2) = 0.5 P(H1) = 0.5 P(H2, H1) = 0.25 P(H2|H1) = 0.5
  • 38. Probability (Review) Conditional Probability P(A, B) P(A|B) = ------------- P(B) Two events, A and B, are independent iff P(A, B) = P(A)P(B) P(A, B) = P(A)P(B) iff P(A|B) = P(A) Interpretation of Independence: Observing B has no effect on probability of A. 38 H1: first flip of a fair coin is heads H2: second flip of the same coin is heads P(H2) = 0.5 P(H1) = 0.5 P(H2, H1) = 0.25 P(H2|H1) = 0.5
  • 40. Why Probability? A formality to make sense of the world. (1) To quantify uncertainty Should we believe something or not? Is it a meaningful difference? (2) To be able to generalize from one situation or point in time to another. Can we rely on some information? What is the chance Y happens? (3) To organize data into meaningful groups or “dimensions” Where does X belong? What words are similar to X? 40
  • 41. Probabilities over >2 events... Independence: A1 , A2 , …, An are independent iff 41
  • 42. Probabilities over >2 events... Independence: A1 , A2 , …, An are independent iff Conditional Probability: 42
  • 43. Probabilities over >2 events... Independence: A1 , A2 , …, An are independent iff Conditional Probability: just think of multiple events happening as a single event: Z = A1 ,, A2 ,… , Am-1 = A1 ,⋂ A2 ⋂ … ⋂ Am-1 then P(Z|An ) 43
  • 44. Conditional Probabilities are Fundamental to Data Science for example Machine Learning: Most modern deep learning techniques try to estimate P(outcome | data) Causal inference: Does treatment cause outcome? P(outcome | treatment) =/= P(outcome) * *also requires random sampling of treatment conditions 44
  • 45. Conditional Independence A and B are conditionally independent, given C, IFF P(A, B | C) = P(A|C)P(B|C) Equivalently, P(A|B,C) = P(A|C) Interpretation: Once we know C, then B doesn’t tell us anything useful about A. 45
  • 46. Bayes Theorem - Lite GOAL: Relate (1) P(A|B) to (2) P(B|A) 46
  • 47. Bayes Theorem - Lite GOAL: Relate (1) P(A|B) to (2) P(B|A) Let’s try: (3) P(A|B) = P(A,B) / P(B), def. of conditional probability on (1) 47
  • 48. Bayes Theorem - Lite GOAL: Relate (1) P(A|B) to (2) P(B|A) Let’s try: (3) P(A|B) = P(A,B) / P(B), def. of conditional probability on (1) (4) P(B|A) = P(B,A) / P(A) = P(A,B) / P(A), def. of cond prob on (2); sym of set intrsct 48
  • 49. Bayes Theorem - Lite GOAL: Relate (1) P(A|B) to (2) P(B|A) Let’s try: (3) P(A|B) = P(A,B) / P(B), def. of conditional probability on (1) (4) P(B|A) = P(B,A) / P(A) = P(A,B) / P(A), def. of cond prob on (2); sym of set intrsct (5) P(B|A)P(A) = P(A,B), algebra on (4) ← known as “Multiplication Rule” 49
  • 50. Bayes Theorem - Lite GOAL: Relate (1) P(A|B) to (2) P(B|A) Let’s try: (3) P(A|B) = P(A,B) / P(B), def. of conditional probability on (1) (4) P(B|A) = P(B,A) / P(A) = P(A,B) / P(A), def. of cond prob on (2); sym of set intrsct (5) P(B|A)P(A) = P(A,B), algebra on (4) ← known as “Multiplication Rule” (6) P(A|B) = (P(B|A)P(A)) / P(B), Substitute P(A,B) from (5) into (3) 50
  • 51. Bayes Theorem - Lite GOAL: Relate (1) P(A|B) to (2) P(B|A) Let’s try: (3) P(A|B) = P(A,B) / P(B), def. of conditional probability on (1) (4) P(B|A) = P(B,A) / P(A) = P(A,B) / P(A), def. of cond prob on (2); sym of set intrsct (5) P(B|A)P(A) = P(A,B), algebra on (4) ← known as “Multiplication Rule” (6) P(A|B) = (P(B|A)P(A)) / P(B), Substitute P(A,B) from (5) into (3) 51
  • 52. Bayes Theorem - Lite GOAL: Relate (1) P(A|B) to (2) P(B|A) Let’s try: (3) P(A|B) = P(A,B) / P(B), def. of conditional probability on (1) (4) P(B|A) = P(B,A) / P(A) = P(A,B) / P(A), def. of cond prob on (2); sym of set intrsct (5) P(B|A)P(A) = P(A,B), algebra on (4) ← known as “Multiplication Rule” (6) P(A|B) = (P(B|A)P(A)) / P(B), Substitute P(A,B) from (5) into (3) 52 Why? We often want to know P(A|B) but we are only given P(B|A) and P(A). Example: You want to know if an email is likely spam given a word appearing in it: P(spam | word). However, you only have a dataset of words and spam: P(word | spam) and you can look up the frequency of spam emails in general to get P(spam) as well as the frequency of "word" in general for P(word).
  • 53. Bayes Theorem - Heavy (with multiple events partitioning Ω) GOAL: Relate P(Ai |B) to P(B|Ai ), for all i = 1 ... k, where A1 ... Ak partition Ω 53
  • 54. First: Law of Total Probability GOAL: Relate P(Ai |B) to P(B|Ai ), for all i = 1 ... k, where A1 ... Ak partition Ω partition: P(A1 U A2 … U Ak ) = Ω P(Ai , Aj ) = 0, for all i ≠ j 54 A1 A2 A3 Ak ...
  • 55. First: Law of Total Probability GOAL: Relate P(Ai |B) to P(B|Ai ), for all i = 1 ... k, where A1 ... Ak partition Ω partition: P(A1 U A2 … U Ak ) = Ω P(Ai , Aj ) = 0, for all i ≠ j 55 A1 A2 A3 Ak ... When both of these conditions are true, we say "A1 , …, Ak partition Ω"
  • 56. First: Law of Total Probability GOAL: Relate P(Ai |B) to P(B|Ai ), for all i = 1 ... k, where A1 ... Ak partition Ω partition: P(A1 U A2 … U Ak ) = Ω P(Ai , Aj ) = 0, for all i ≠ j law of total probability: If A1 ... Ak partition Ω, then for any event, B: 56 A1 A2 A3 Ak ...
  • 57. Law of Total Probability and Bayes Theorem GOAL: Relate P(Ai |B) to P(B|Ai ), for all i = 1 ... k, where A1 ... Ak partition Ω Let’s try: 57 Law of Total Probability
  • 58. Law of Total Probability and Bayes Theorem GOAL: Relate P(Ai |B) to P(B|Ai ), for all i = 1 ... k, where A1 ... Ak partition Ω Let’s try: (1) P(Ai |B) = P(Ai ,B) / P(B) (2) P(Ai ,B) / P(B) = P(B|Ai ) P(Ai ) / P(B), by multiplication rule 58 Law of Total Probability P(A,B) = P(B|A)P(A)
  • 59. Law of Total Probability and Bayes Theorem GOAL: Relate P(Ai |B) to P(B|Ai ), for all i = 1 ... k, where A1 ... Ak partition Ω Let’s try: (1) P(Ai |B) = P(Ai ,B) / P(B) (2) P(Ai ,B) / P(B) = P(B|Ai ) P(Ai ) / P(B), by multiplication rule but in practice, we might not know P(B) 59 Law of Total Probability
  • 60. Law of Total Probability and Bayes Theorem GOAL: Relate P(Ai |B) to P(B|Ai ), for all i = 1 ... k, where A1 ... Ak partition Ω Let’s try: (1) P(Ai |B) = P(Ai ,B) / P(B) (2) P(Ai ,B) / P(B) = P(B|Ai ) P(Ai ) / P(B), by multiplication rule but in practice, we might not know P(B) (3) P(B|Ai ) P(Ai ) / P(B) = P(B|Ai ) P(Ai ) / ( ), by law of total probability 60 Law of Total Probability
  • 61. Law of Total Probability and Bayes Theorem GOAL: Relate P(Ai |B) to P(B|Ai ), for all i = 1 ... k, where A1 ... Ak partition Ω Let’s try: (1) P(Ai |B) = P(Ai ,B) / P(B) (2) P(Ai ,B) / P(B) = P(B|Ai ) P(Ai ) / P(B), by multiplication rule but in practice, we might not know P(B) (3) P(B|Ai ) P(Ai ) / P(B) = P(B|Ai ) P(Ai ) / ( ), by law of total probability Thus, 61 Law of Total Probability
  • 62. Law of Total Probability and Bayes Theorem GOAL: Relate P(Ai |B) to P(B|Ai ), for all i = 1 ... k, where A1 ... Ak partition Ω Let’s try: (1) P(Ai |B) = P(Ai ,B) / P(B) (2) P(Ai ,B) / P(B) = P(B|Ai ) P(Ai ) / P(B), by multiplication rule but in practice, we might not know P(B) (3) P(B|Ai ) P(Ai ) / P(B) = P(B|Ai ) P(Ai ) / ( ), by law of total probability Thus, 62 Law of Total Probability Bayes Rule, in practice
  • 63. Law of Total Probability and Bayes Theorem GOAL: Relate P(Ai |B) to P(B|Ai ), for all i = 1 ... k, where A1 ... Ak partition Ω Let’s try: (1) P(Ai |B) = P(Ai ,B) / P(B) (2) P(Ai ,B) / P(B) = P(B|Ai ) P(Ai ) / P(B), by multiplication rule but in practice, we might not know P(B) (3) P(B|Ai ) P(Ai ) / P(B) = P(B|Ai ) P(Ai ) / ( ), by law of total probability Thus, 63 Bayes Rule, in practice Example: https://www.youtube.com/watch?v=R13BD8qKeTg
  • 64. Probability Review: ● What constitutes a probability measure? ● Independence ● Conditional probability ● Conditional independence ● How to derive Bayes Theorem ● Multiplication Rule ● Partition of Sample Space ● Law of Total Probability ● Bayes Theorem in Practice