FALLSEM2023-24_MSM5001_ETH_VL2023240106334_2023-09-19_Reference-Material-I (3).pptx

 Theory of probability provides the foundation of statistical inference
 Theory of probability: branch of mathematics dealing with analysis of
random phenomena
 Not the subject of this course
 Aim to discuss the application of probability in biostats
 Examples: a patient has a 50% chance of survival; 95% of patients recover upon treatment
 Statisticians express probabilities as fractions (0 to 1) rather than percentages
Probability in statistics

 Event: an outcome of an experiment/trial. E.g., getting a “head” when tossing a coin is an event. If that is the event we are looking for and it happens, it is a ‘successful event (success)’
 Experiment/Trial: an action with an uncertain outcome, used to determine the probability of occurrence of an event. E.g., tossing a coin, rolling a die
 Mutually exclusive events: cannot occur simultaneously
 Equally likely events: the chances are the same for all outcomes
 Independent events: the occurrence of one event has no effect on the probability of the other in subsequent trials
 Dependent events: the probability of one event depends on the outcome of the other in subsequent trials
Basic concepts in probability

 Probability of a successful event (simply event…) ranges from 0 (will not occur) to 1 (will always occur)
 Most biological events have probabilities strictly between 0 and 1 (least likely to most likely)

“Absolute” and “No”
 Absolute (probability = 1): the Sun rises in the East; if I throw a stone up, it will come down
 No probability (probability = 0): the statement that this evening the Sun will set in the East

Types of probabilities
 Theoretical/ Classical/ Mathematical/ A priori probability:
 We assume equal likelihood of all events
 Prior information is not required
 Generally used to make mathematical predictions
 Applies to some established biological phenomena (Mendelian genetics)
 Relative/ Statistical/ A posteriori probability:
 Information is needed to know if there is mutual exclusiveness and equal
likelihood of an event (no assumptions)
 Determined by actual observations and trials (many trials required first)
 Revising the prior probability according to trials
 Generally for statistical analyses
 Most biological phenomena

 Subjective probability:
 Probability derived from an individual's personal judgment or own
experience about whether a specific outcome is likely to occur
 Assign numbers from 0 to 1 (0-100%) only based upon the subjective
view/experience
 It contains no formal calculations and only reflects the subject's opinions
and past experience
 An example of subjective probability is a "gut instinct" when making a
trade

EXAMPLE
 Deck of cards: 52; four suits of 13 cards each
 What is the probability that I would pick the king of clubs with one pick – 1 out of 52
 What is the probability that I would pick a king with one pick – 4/52 = 1/13
 I pick a king from a pack of cards. Without replacing it, I pick one more card. What is the probability that I would get 1 more king? (The two picks are dependent)
 1st pick – 4/52 = 1/13; 2nd pick – 3/51 = 1/17

Elementary properties of probability
 For events that are mutually exclusive and equally likely
 1. Probability of any trial with n mutually exclusive outcomes (E1, E2, ….
En) is a non-negative number
 2. Sum of the probabilities of all mutually exclusive outcomes is 1
 3. Addition rule – For two mutually exclusive events, Ei and Ej, the probability of occurrence of either Ei or Ej is given by the sum of their individual probabilities
 E.g., the probability of getting 1 when rolling a die is 1/6, and the probability of getting 3 is also 1/6. The probability of getting either 1 or 3 on one roll is: 1/6 + 1/6 = 2/6 = 1/3

 4. Multiplication rule – For two independent events, Ei and Ej, the probability of occurrence of both Ei and Ej is given by the product of their individual probabilities
 What is the probability of getting 1 and then 3 while rolling a die on successive turns (independent events)
P = 1/6 X 1/6 = 1/36
 For two dice, what is the probability of getting 6 on one die and 1 on the
other (independent events)
P = 1/6 X 1/6 = 1/36
 A bag contains 6 black marbles and 4 blue marbles; what is the probability
of drawing two blue marbles one by one successively? (dependent events)
P = 4/10 X 3/9 = 2/15

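 Addition rule: for OR (event Ei OR Ej occurs)
 Multiplication rule: for AND (event Ei AND Ej occur)
 BUT THE SIMPLE ADDITION RULE, P(Ei) + P(Ej), HOLDS ONLY FOR MUTUALLY EXCLUSIVE EVENTS, AND THE SIMPLE MULTIPLICATION RULE, P(Ei) X P(Ej), ONLY FOR INDEPENDENT EVENTS. For dependent events, multiply by the conditional probability, as in the marble example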
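The die and marble examples above can be checked with exact fractions. This is a minimal sketch; the variable names are illustrative and not part of the slides.

```python
from fractions import Fraction

# Addition rule (mutually exclusive events): 1 OR 3 on one roll of a fair die
p_one = Fraction(1, 6)
p_three = Fraction(1, 6)
p_one_or_three = p_one + p_three

# Multiplication rule (independent events): 1 on the first roll AND 3 on the second
p_one_then_three = p_one * p_three

# Multiplication rule (dependent events): two blue marbles drawn without
# replacement from a bag of 6 black and 4 blue marbles
p_first_blue = Fraction(4, 10)
p_second_blue_given_first = Fraction(3, 9)  # one blue marble is already gone
p_two_blue = p_first_blue * p_second_blue_given_first

print(p_one_or_three, p_one_then_three, p_two_blue)
```

Using `Fraction` keeps the results in the same form as the slides (1/3, 1/36, 2/15) instead of rounded decimals.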

What is the probability that a random person picked from the dataset will be 18 years old or less? P(≤18 yrs) = 141/318
Suppose we pick a random subject 18 years old or less; what is the probability that he/she will have no family history (NF) of mood disorders? (Conditional probability) P(NF | ≤18 yrs) = 28/141
What is the probability that a random subject will be 18 years old or less and have no family history of mood disorders? (Joint probability) P(≤18 yrs AND NF) = 28/318
Apply multiplication rule: P(≤18 yrs) * P(NF | ≤18 yrs) = 141/318 * 28/141 = 28/318
OR P(NF) * P(≤18 yrs | NF) = 63/318 * 28/63 = 28/318
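As a quick check, the joint probability can be computed both ways from the four counts quoted above (318 subjects, 141 aged 18 or less, 63 with no family history, 28 in both groups). The variable names are illustrative.

```python
from fractions import Fraction

total = 318              # all subjects in the dataset
age_le_18 = 141          # subjects aged 18 years or less
no_family_history = 63   # subjects with no family history (NF)
both = 28                # aged 18 or less AND no family history

# Route 1: P(<=18) * P(NF | <=18)
joint_via_age = Fraction(age_le_18, total) * Fraction(both, age_le_18)

# Route 2: P(NF) * P(<=18 | NF)
joint_via_nf = Fraction(no_family_history, total) * Fraction(both, no_family_history)

print(joint_via_age, joint_via_nf)
```

Both routes reduce to 28/318, which is why the order of conditioning does not matter for the joint probability.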

Bayes’ Theorem
Describes probability of an event, based upon prior knowledge of conditions
that are related to the event
For unknown/unestablished conditions or situations (drugs/diagnostic tests)

 Take the example of evaluation of screening tests and diagnostic criteria
 Bayes’ theorem is used to estimate the probability of the presence or absence of a particular disease from the knowledge of the test results (positive or negative) and the status of a particular symptom (present or absent)
 Screening test results: not infallible (can yield false +ve or false –ve)
 A particular symptom may also be present or absent in a disease

 Following questions must be answered in order to evaluate the usefulness
of test results and status of symptoms in determining whether or not a
person has a disease
1. P(+ve|disease)
2. P(-ve|no disease)
3. P(disease|+ve)
4. P(no disease|-ve)

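 a: no. of subjects who have the disease and show a +ve result in the screening test (true positive)
 b: no. of subjects who do not have the disease but show a +ve test result (false positive)
 c: no. of subjects who have the disease but show a –ve test result (false negative)
 d: no. of subjects who do not have the disease and show a –ve test result (true negative)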

 Probability of diseased subjects showing +ve test = true positive / total
diseased cases = a/(a+c) (Sensitivity of the test)
 P(+ve|disease) i.e. P of getting +ve results, given the person is diseased
 Probability of normal subjects showing –ve test = true negative / total normal
cases = d/(b+d) (Specificity of the test)
 P(-ve|no disease) i.e. P of getting –ve results, given the person is non-
diseased

 Probability of subjects having the disease upon showing a +ve test = true
positive /all tested positive = a/(a+b)
(Predictive value positive of the test)
 P(disease|+ve) i.e. P of having the disease given the result is +ve
 Probability of subjects not having the disease upon showing –ve test =
true negative /all tested negative = d/(c+d)
(Predictive value negative of the test)
 P(no disease|-ve) i.e. P of not having the disease given the result is -ve
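The four measures above can be sketched as one small function of the 2x2 counts, with a = true positives, b = false positives, c = false negatives and d = true negatives. The function name and the counts used below are hypothetical, invented only for illustration.

```python
def screening_measures(a, b, c, d):
    """Return the four screening-test measures from 2x2 counts.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    return {
        "sensitivity": a / (a + c),  # P(+ve | disease)
        "specificity": d / (b + d),  # P(-ve | no disease)
        "ppv": a / (a + b),          # P(disease | +ve)
        "npv": d / (c + d),          # P(no disease | -ve)
    }

# Hypothetical counts, not taken from any real study
measures = screening_measures(a=45, b=30, c=5, d=920)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Note that sensitivity and specificity are computed within the diseased and non-diseased groups respectively, while the two predictive values are computed within the test-positive and test-negative groups.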

Application of Bayes’ Theorem: Example
Given:
P (disease) = 0.05
P (+ve | disease) = 0.9
P (+ve | not diseased) = 0.05
Find:
P (+ve) = ?
P (disease | +ve) = ?

P (disease | +ve) = [P (+ve | disease) * P (disease)] / P (+ve)
First find P (+ve)
All +ve results = +ve results with disease OR +ve results without disease
P (+ve) = P (disease AND +ve) + P (no disease AND +ve)
= P (disease) * P (+ve | disease) + P (no disease) * P (+ve | no disease)
= (0.05 * 0.9) + (0.95 * 0.05), since P (no disease) = 1 - 0.05 = 0.95
= 0.045 + 0.0475 = 0.0925
Then
P (disease | +ve) = (0.9 * 0.05) / 0.0925
= 0.4865
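The same calculation can be written as a short function. This is a sketch under the two-outcome assumption used in the example (diseased or not diseased); the function and argument names are illustrative.

```python
def bayes_posterior(prior, p_pos_given_disease, p_pos_given_no_disease):
    """Return (P(+ve), P(disease | +ve)) for a two-group population."""
    # Total probability of a positive test: diseased OR non-diseased route
    p_pos = prior * p_pos_given_disease + (1 - prior) * p_pos_given_no_disease
    # Bayes' theorem
    posterior = prior * p_pos_given_disease / p_pos
    return p_pos, posterior

p_pos, p_disease_given_pos = bayes_posterior(
    prior=0.05, p_pos_given_disease=0.9, p_pos_given_no_disease=0.05
)
print(round(p_pos, 4), round(p_disease_given_pos, 4))
```

Even with a 90% true-positive rate, fewer than half of the positive results come from diseased subjects, because the disease is rare (5%) and the non-diseased group is large.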

Application of Bayes’ Theorem: Example
 A particular test for a person using cannabis is 90% sensitive. The test is
also 80% specific.
 Assuming prevalence of 5% of cannabis usage, what is the probability that
a random person who tests +ve actually uses cannabis.

 90% sensitive; correctly identifies (+ve test) 90% of users = P(+ve|user)
 Probability that a person is identified +ve given he is a user = 0.9
 Users: either identified +ve or -ve
 P(-ve|user) = 1-0.9
 Probability that a person is identified -ve when (given) he is a user = 0.1
 80% specific; correctly identifies (-ve test) 80% of non-users = P(-ve|nonuser)
 Probability that a person is identified -ve given he is a nonuser = 0.8
 Nonusers: either identified -ve or +ve
 P(+ve|nonuser) = 1-0.8
 Probability that a person is identified +ve when (given) he is a nonuser = 0.2
 The question, restated: given a prevalence of 5% of cannabis usage, i.e. P(user) = 0.05, find P(user|+ve), the probability that a person is a user when (given) he is tested +ve

Where do we get the term P(+ve) from?
All positive cases = (user AND testing positive) OR (nonuser AND testing positive)
The probability that someone tests positive is the probability of being a user multiplied by (AND) the probability that a user tests positive
plus (OR) the probability of being a non-user multiplied by the probability that a non-user tests positive

Multiplication rule: (a) Probability of a user testing positive
(b) Probability of a non-user testing positive
Addition law: Probability of anybody (user or non-user) testing positive
Also think in terms of total probability: only two outcomes for +ve test: either user or non-user
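One way to see the answer to the cannabis example is to count people in a population of convenient size. The sketch below assumes a hypothetical population of 1000 (any size gives the same ratio) and uses the stated 5% prevalence, 90% sensitivity and 80% specificity.

```python
from fractions import Fraction

population = 1000                        # hypothetical population size
users = population * 5 // 100            # 5% prevalence -> 50 users
nonusers = population - users            # 950 non-users

true_positives = users * 90 // 100       # 90% sensitivity -> 45 users test +ve
false_positives = nonusers * 20 // 100   # 1 - 0.8 = 20% of non-users test +ve -> 190

# P(user | +ve) = users who test +ve / everybody who tests +ve
p_user_given_positive = Fraction(true_positives, true_positives + false_positives)
print(p_user_given_positive, round(float(p_user_given_positive), 4))
```

The denominator, 45 + 190 = 235, is the count version of P(+ve): users testing positive plus non-users testing positive.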

Probability distributions
Discrete variables: Binomial and Poisson distributions
Continuous variables: Normal distribution

Probability distributions (discrete variables)
 Probability distribution: relationship between the values of a random
variable and the probabilities of their occurrence in a graphical or tabular
manner
 Similar to frequency distribution: relationship between a random variable
and its frequencies
 Probability distribution is a powerful tool for both describing a dataset and
for making predictions

Relative frequency of occurrence of any variable (X) to assume a
corresponding value (x)

 What is the probability that a randomly selected family used 3 assistance programs?
 What is the probability that a randomly selected family used either one or two assistance programs?

Cumulative Probability distributions (ogive)

 What is the probability that a randomly selected family used 2 or fewer assistance programs?
 What is the probability that a randomly selected family used 6 or fewer assistance programs?

 What is the probability that a randomly selected family used 5 or more assistance programs?
 Use the “more than” cumulative distribution OR subtract from 1
 P (X ≥ 5) + P (X ≤ 4) = 1; P (X ≥ 5) = 1 – P (X ≤ 4) = 1 – 0.6296 = 0.3704
 What is the probability that a randomly selected family used between 3 and 5 (inclusive) assistance programs?
 P (X ≤ 5) is P of a family using 1-5 programs (inclusive); P (X ≤ 2) is P of a family using fewer than 3 programs (1-2 programs)
 P (3 ≤ X ≤ 5) = P (X ≤ 5) – P (X ≤ 2) = 0.8249 – 0.3670 = 0.4579
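The two subtractions can be reproduced from the three cumulative probabilities quoted above. Only those three values are used here; the rest of the cumulative table is not reproduced, and the dictionary name is illustrative.

```python
# Cumulative probabilities P(X <= x) quoted in the slides
cumulative = {2: 0.3670, 4: 0.6296, 5: 0.8249}

# "5 or more" is the complement of "4 or fewer"
p_at_least_5 = 1 - cumulative[4]

# "between 3 and 5 inclusive" = "5 or fewer" minus "2 or fewer"
p_3_to_5 = cumulative[5] - cumulative[2]

print(round(p_at_least_5, 4), round(p_3_to_5, 4))
```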

Binomial distribution (Bernoulli’s distribution)
 For dichotomous variables (two mutually exclusive outcomes only)
 Coin flip
 Red/White flowers
 Dead or alive
 Treated or untreated
 Statisticians call it success or failure
 One outcome is denoted as P (success); its probability is denoted as p and remains the same during all trials
 The other outcome is denoted as Q (failure); its probability is q (= 1 - p)
 Trials should be independent (the outcome of one trial does not influence the outcome of any other)

 Death record for a disease; P (death) and Q (alive)
 Probabilities are p and q, respectively
 For a randomly selected set of five individuals with the disease, what is the probability that we will have the sequence PQPPQ (dead, alive, dead, dead, alive)?
 Apply multiplication rule (AND)
 P = p*q*p*p*q = p^3 q^2

Binomial expansion
 For a dichotomous variable, two trials only
 p^2 + 2pq + q^2 = Ptot = 1
 For three trials
 p^3 + 3p^2q + 3pq^2 + q^3 = Ptot = 1
 For four trials
 p^4 + 4p^3q + 6p^2q^2 + 4pq^3 + q^4 = Ptot = 1
 For event (P) to occur x times in a total no. of trials (n), the number of combinations is
 nCx = n! / [x! (n-x)!]
 Factorial of x (x!): product of all whole numbers from x down to 1; 0! = 1
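The coefficients in the expansions above (1, 2, 1; 1, 3, 3, 1; 1, 4, 6, 4, 1) are the numbers of combinations nCx, and the terms always add up to 1. A minimal sketch with Python's `math.comb`; the value p = 0.3 is an arbitrary choice for illustration.

```python
from math import comb

def binomial_terms(n, p):
    """Terms nCx * p^x * q^(n-x) of the expansion of (p + q)^n, for x = 0..n."""
    q = 1 - p
    return [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

# Coefficients of the expansions for two, three and four trials
coefficients = {n: [comb(n, x) for x in range(n + 1)] for n in (2, 3, 4)}

# The terms sum to 1 for any p between 0 and 1 (p = 0.3 is arbitrary)
totals = {n: sum(binomial_terms(n, 0.3)) for n in (2, 3, 4)}

print(coefficients)
print(totals)
```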

 For the five individuals, what is the probability that we will have three death cases?
 n = 5, x = 3
 C = 5C3 = 5! / [3! (5-3)!] = 120 / (6 * 2) = 10
 No. of combinations is 10
 PPPQQ, PQPQP, so on…

 Finding frequency
 The success is P and failure is Q, for simplicity, we write probability as f(x)
 This expression is called the binomial distribution
 f(x) = nCx * p^x * q^(n-x); for n = 5 and x = 3, f(3) = 10 p^3 q^2

 Toss a coin 10 times
 What is the probability that there will be 4 heads?
 heads: P (successful event); p = 50% = 0.5
 tails: Q (failures); q = 1-0.5 = 0.5
 No. of P events; x = 4
 Total no. of events; n = 10
nCx = 10C4 = 10!/ [4! (10-4)!] = 3628800/(24*720) = 210
f (x) = 10C4 * p^x * q^(n-x)
= 210 * (0.5)^4 * (0.5)^6
= 0.205
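The coin-toss calculation can be sketched as a small function, with n the total number of trials and x the number of successes. The function name is illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# 4 heads (x) in 10 tosses (n) of a fair coin (p = 0.5)
p_four_heads = binomial_pmf(x=4, n=10, p=0.5)
print(round(p_four_heads, 4))  # 0.2051
```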

 14% of pregnant women admitted to a hospital are smokers
 If a random sample of 10 women is selected from this population,
 what is the probability that it will have 4 smokers?
 smoking women: P (successful event); p = 14% = 0.14
 non-smoking women: Q (failures); q = 1-0.14 = 0.86
 No. of P events; x = 4
 Total no. of events; n = 10
nCx = 10C4 = 10!/ [4! (10-4)!] = 3628800/(24*720) = 210
f (x) = 10C4 * p^x * q^(n-x)
= 210 * (0.14)^4 * (0.86)^6
= 210 * 0.00038416 * 0.404567
= 0.0326 or about 3.3%
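The intermediate values of the smokers example can be verified step by step. This sketch simply mirrors the hand calculation; the variable names are illustrative.

```python
from math import comb

n, x, p = 10, 4, 0.14    # sample size, number of smokers, P(smoker)
q = 1 - p                # P(non-smoker) = 0.86

coefficient = comb(n, x)  # number of ways to choose which 4 of the 10 smoke
p_term = p**x             # 0.14 to the power 4
q_term = q**(n - x)       # 0.86 to the power 6
probability = coefficient * p_term * q_term

print(coefficient, round(p_term, 8), round(q_term, 6), round(probability, 4))
```

Keeping four decimal places gives 0.0326, i.e. about 3.3% of such samples are expected to contain exactly 4 smokers.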

Poisson distribution
 Siméon Denis Poisson derived this distribution
 Also for discrete variables
 Difference: distribution of the number of times a rare discrete event occurs in a continuum of space or time
 The sample size N is infinitely large and uncountable; the number of rare events is finite and countable
 Poisson distribution is used to predict very rare events in a given period of time or space
 E.g., no. of radioactive particles emitted per unit time
 No. of earthquakes per year

 Poisson distribution is given by
f(x) = [e^(-λ) * λ^x] / x!
 e = Euler’s number (constant) = 2.7183
 λ = mean no. of events in an interval (in a given time/space)
 x = no. of events
 Assumptions:
 Occurrences of the event are independent (one occurrence does not affect any other one)
 Theoretically, an infinite no. of occurrences of the event is possible in the
interval
 The probability of a single event is proportional to the length of the
interval

A 100 km stretch of road along a forest land was surveyed for assessing the incidence of death of wildlife due to accidents caused by heavy vehicles. The total number of dead animals counted was 75. Calculate the probability of finding: (a) no dead animals, (b) 1 dead animal, (c) 2 dead animals, (d) 3 dead animals in a given km of the road. Assume one accident kills one animal only.
λ = mean no. of events/unit space = 75/100 = 0.75 per km
x = 0 (for a); 1 (for b); 2 (for c); 3 (for d)
(a) f (x) = P (x) = [e^(-0.75) * 0.75^0] / 0! = (0.4724 * 1)/1 = 0.4724
(b) f (x) = [e^(-0.75) * 0.75^1] / 1! = (0.4724 * 0.75)/1 = 0.3543
(c) f (x) = [e^(-0.75) * 0.75^2] / 2! = (0.4724 * 0.5625)/2 = 0.1329
(d) f (x) = [e^(-0.75) * 0.75^3] / 3! = (0.4724 * 0.4219)/6 = 0.0332
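The four answers can be reproduced with a direct implementation of f(x) = e^(-λ) λ^x / x!, using λ = 75/100 = 0.75 dead animals per km. The function name is illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Probability of exactly x events when the mean number per interval is lam."""
    return exp(-lam) * lam**x / factorial(x)

lam = 75 / 100  # mean number of dead animals per km
probabilities = {x: poisson_pmf(x, lam) for x in range(4)}

for x, prob in probabilities.items():
    print(f"P({x} dead animals in a km) = {prob:.4f}")
```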

In a study of drug-induced anaphylaxis among patients, it was found that the occurrence of anaphylaxis is 12 per year. Assuming that the model follows Poisson distribution, find the probability that three subjects will experience anaphylaxis in the next year.
λ = mean no. of events/unit time = 12/1 = 12
x = 3
f (x) = P (x) = [e^(-12) * 12^3] / 3! = (0.000006144 * 1728)/6 = 0.00177

Differentiating Poisson and binomial distribution
 Poisson distribution is used when n is very large and p is very small
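A short numerical comparison illustrates why the Poisson distribution can stand in for the binomial when n is very large and p is very small. The values n = 1000 and p = 0.003 are hypothetical, chosen only so that λ = np = 3.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

n, p = 1000, 0.003  # hypothetical: many trials, rare event
lam = n * p         # Poisson mean matched to the binomial mean

# Largest difference between the two distributions for x = 0..10
max_gap = max(
    abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(11)
)
print(f"largest difference for x = 0..10: {max_gap:.6f}")
```

For these values the two sets of probabilities agree to about three decimal places; with a small n or a large p the agreement would be much poorer.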

Distribution of continuous variables
 For continuous random variables
 Infinite values of the variable are possible

 Imagine that the number of observations is very large and the widths of the classes are very small
 As n increases to infinity and the class widths approach 0, the frequency polygon is converted to a smooth curve
 Such smooth curves represent the distribution of continuous random variables graphically
[Figures: two plots, y-axis: relative frequency]

 The total area under the curve is equal to 1, as is the case with a relative frequency curve
 Relative frequency (probability) of occurrence of values between any two points on the x axis is equal to the area bounded by the curve, the x axis and the two points
 The probability of finding a specific value is 0 (area above a point is 0)
[Figure: y-axis: relative frequency]

Distribution of continuous variables: Normal distribution
 Most important distribution in statistics
 First given by Abraham de Moivre
 Carl Gauss contributed immensely to its understanding
 Hence, also called Gaussian distribution
 The normal density is f(x) = [1 / (σ √(2π))] * e^(-(x - μ)^2 / (2σ^2))
 e and π are constants
 μ is the mean and σ is the standard deviation (use calculus to find these)

Characteristics of Normal distribution
1. It is symmetrical about its mean (mirror image on either side)
2. Mean, mode and median are all equal
3. Total area under the curve is one square unit (probability distribution)
4. The mean divides the area into two halves of 50% each

5. 1 SD distance either side of the mean accommodates ~68% of the values (~34% each side)
6. 2 SD distance either side of the mean accommodates ~95% of the values
7. 3 SD distance either side of the mean accommodates ~99.7% of the values
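The 68%, 95% and 99.7% figures can be checked with the standard library's `statistics.NormalDist` (available from Python 3.8). The sketch uses a normal distribution with mean 0 and SD 1; the same percentages hold for any mean and SD.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Area within k SDs of the mean = cdf(+k) - cdf(-k)
coverage = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)
}

for k, area in coverage.items():
    print(f"within {k} SD of the mean: {area:.4f}")
```

The exact area within 2 SD is slightly above 95%, which is why the figures in the list are approximations.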

8. Normal distribution is hence completely described by the mean (central tendency) and SD (dispersion)
9. Different values of mean (μ) shift the distribution along the X-axis
 μ is the location parameter
10. Different values of SD (σ) change the spread of the distribution (flatness or peakedness) without moving its centre
 σ is the scale parameter

Consider normal distribution as a family, each member being determined by
the different mean and SD values
Standard (or unit) normal distribution is one member/kind of normal
distribution in which mean = 0 and SD = 1 unit
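Any member of the normal family can be mapped onto the standard normal distribution with z = (x - mean)/SD, a standard result not spelled out in the slides. The sketch below uses hypothetical values (mean 120, SD 15, x = 135) purely for illustration.

```python
from statistics import NormalDist

mu, sigma = 120.0, 15.0  # hypothetical mean and SD
x = 135.0                # hypothetical observed value

# Standardise: how many SDs is x away from the mean?
z = (x - mu) / sigma

# P(X <= x) from the original distribution and from the standard normal
p_direct = NormalDist(mu, sigma).cdf(x)
p_standard = NormalDist().cdf(z)  # NormalDist() defaults to mean 0, SD 1

print(z, round(p_direct, 4), round(p_standard, 4))
```

Both routes give the same probability, which is what allows a single table of the standard normal distribution to serve every member of the family.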
