PROBABILITY
What is probability?
 Theory of probability provides the foundation of statistical inference
 Theory of probability: branch of mathematics dealing with analysis of
random phenomena
 Not the subject of this course
 Aim to discuss the application of probability in biostats
 Everyday examples: a patient has a 50% chance of survival; 95% of patients recover upon treatment
 Statisticians usually express probabilities as fractions or decimals rather than percentages
Probability in statistics
 Event: an outcome of an experiment/trial. E.g., getting a “head” when tossing a coin is an event. If that is the event we are looking for and it happens, it is called a ‘successful event (success)’
 Experiment/Trial: an action with an uncertain outcome, for which the probability of occurrence of an event is calculated. E.g., tossing a coin, rolling a die
 Mutually exclusive events: cannot occur simultaneously
 Equally likely events: all outcomes have the same chance of occurring
 Independent events: the occurrence of one event has no effect on the probability of the other in subsequent trials
 Dependent events: the probability of one event depends on whether the other has occurred in earlier trials
Basic concepts in probability
 Probability of a successful event (simply event…) ranges from 0 (will not occur) to 1 (will always occur)
 Most biological events have probabilities strictly between 0 and 1 (least likely to most likely)
“Absolute” and “No”
 Absolute (probability 1): the sun rises in the east; a stone thrown upward will come down
 No probability (probability 0): the statement that the sun will set in the east this evening
Types of probabilities
 Theoretical/ Classical/ Mathematical/ A priori probability:
 We assume equal likelihood of all events
 prior information is not required
 Generally used to make mathematical predictions
 Some established biological phenomena (Mendelian genetics)
 Relative/ Statistical/ A posteriori probability:
 No assumptions are made; information is needed to know whether outcomes are mutually exclusive and equally likely
 Determined by actual observations and trials (many trials required first)
 Revising the prior probability according to trials
 Generally for statistical analyses
 Most biological phenomena
 Subjective probability:
 Probability derived from an individual's personal judgment or own
experience about whether a specific outcome is likely to occur
 Assign numbers from 0 to 1 (0-100%) only based upon the subjective
view/experience
 It contains no formal calculations and only reflects the subject's opinions
and past experience
 An example of subjective probability is a "gut instinct" when making a
trade
EXAMPLE
 Deck of cards: 52 cards; four suits of 13 cards each
 What is the probability of picking the king of clubs in one pick? 1/52
 What is the probability of picking a king in one pick? 4/52 = 1/13
 I pick a king from the pack. Without replacing it, I pick one more card. What is the probability that it is also a king? (dependent events; the picks are no longer independent)
 1st pick: 4/52 = 1/13; 2nd pick: 3/51 = 1/17
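The card calculations above can be checked with exact fractions. This is a minimal sketch using Python's standard `fractions` module; the variable names are illustrative and not part of the original slides, and the last line (both picks being kings) is an extra step obtained by multiplying the two pick probabilities.

```python
from fractions import Fraction

DECK_SIZE = 52
KINGS = 4

# Probability of drawing the king of clubs in one pick
p_king_of_clubs = Fraction(1, DECK_SIZE)

# Probability of drawing any king in one pick
p_first_king = Fraction(KINGS, DECK_SIZE)

# Without replacement: one king and one card are gone before the second pick
p_second_king_given_first = Fraction(KINGS - 1, DECK_SIZE - 1)

# Probability that both picks are kings (dependent events)
p_two_kings = p_first_king * p_second_king_given_first

print(p_king_of_clubs, p_first_king, p_second_king_given_first, p_two_kings)
```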
Elementary properties of probability
 For events that are mutually exclusive and equally likely
 1. For a trial with n mutually exclusive outcomes (E1, E2, …, En), the probability of each outcome is a non-negative number
 2. Sum of the probabilities of all mutually exclusive outcomes is 1
 3. Addition rule – For two mutually exclusive events, Ei and Ej, the probability of occurrence of either Ei or Ej is given by the sum of their individual probabilities
 E.g.: the probability of getting 1 when rolling a die is 1/6. The probability of getting 3 is also 1/6. The probability of getting either 1 or 3 when rolling a die is: 1/6 + 1/6 = 2/6 = 1/3
 4. Multiplication rule – For two independent events, Ei and Ej, the probability of occurrence of both Ei and Ej is given by the product of their individual probabilities (for dependent events, multiply by the conditional probability of the second event given the first)
 What is the probability of getting 1 and then 3 while rolling a dice on
successive turns (independent events)
P = 1/6 X 1/6 = 1/36
 For two dice, what is the probability of getting 6 on one die and 1 on the
other (independent events)
P = 1/6 X 1/6 = 1/36
 A bag contains 6 black marbles and 4 blue marbles; what is the probability
of drawing two blue marbles one by one successively? (dependent events)
P = 4/10 X 3/9 = 2/15
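 Addition rule: for OR (event Ei OR Ej occurs)
 Multiplication rule: for AND (events Ei AND Ej both occur)
 BUT THE SIMPLE ADDITION RULE HOLDS ONLY FOR MUTUALLY EXCLUSIVE EVENTS, AND THE SIMPLE MULTIPLICATION RULE ONLY FOR INDEPENDENT EVENTS; FOR DEPENDENT EVENTS, USE THE CONDITIONAL PROBABILITY (AS IN THE MARBLE EXAMPLE)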
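The die and marble examples above can be reproduced with exact fractions. A minimal sketch, assuming a fair six-sided die and the bag of 6 black and 4 blue marbles from the slide; the variable names are illustrative only.

```python
from fractions import Fraction

# Addition rule (mutually exclusive outcomes of one roll): P(1 or 3)
p_one = Fraction(1, 6)
p_three = Fraction(1, 6)
p_one_or_three = p_one + p_three

# Multiplication rule (independent rolls): P(1 on the first roll and 3 on the second)
p_one_then_three = p_one * p_three

# Multiplication rule with a conditional probability (dependent draws):
# two blue marbles in a row from 6 black + 4 blue, without replacement
p_first_blue = Fraction(4, 10)
p_second_blue_given_first = Fraction(3, 9)
p_two_blue = p_first_blue * p_second_blue_given_first

print(p_one_or_three, p_one_then_three, p_two_blue)
```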
What is the probability that a person picked at random from the dataset will be 18 years old or less? 141/318
Suppose we pick a random subject aged 18 years or less; what is the probability that he/she will have no family history of mood disorders (NF)? (Conditional probability) P(NF | ≤18 yrs) = 28/141
What is the probability that a random subject will be 18 years old or less and have no family history of mood disorders? (Joint probability) 28/318
Apply the multiplication rule: P(≤18 yrs) * P(NF | ≤18 yrs) = 141/318 * 28/141 = 28/318
OR P(NF) * P(≤18 yrs | NF) = 63/318 * 28/63 = 28/318
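The two routes to the joint probability can be compared directly from the counts quoted above. This is a small sketch; only the four counts come from the slides (the underlying table is not reproduced here), and the variable names are assumptions.

```python
from fractions import Fraction

# Counts quoted in the example above
total = 318        # all subjects
age_le_18 = 141    # subjects aged 18 years or less
no_history = 63    # subjects with no family history (NF)
both = 28          # aged 18 or less AND no family history

p_age = Fraction(age_le_18, total)
p_nf = Fraction(no_history, total)
p_nf_given_age = Fraction(both, age_le_18)   # conditional probability
p_age_given_nf = Fraction(both, no_history)  # conditional probability

# Multiplication rule, applied in both directions
joint_via_age = p_age * p_nf_given_age
joint_via_nf = p_nf * p_age_given_nf

print(joint_via_age, joint_via_nf, float(joint_via_age))
```

Both products reduce to the same joint probability, 28/318.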
Bayes’ Theorem
Describes probability of an event, based upon prior knowledge of conditions
that are related to the event
For unknown/unestablished conditions or situations (drugs/diagnostic tests)
 Take the example of evaluation of screening tests and diagnostic criteria
 Bayes’ theorem gives the probability of the presence or absence of a particular disease from knowledge of the test results (positive or negative) and the status of a particular symptom (present or absent)
 Screening testing results: not infallible (can yield false +ve or false –ve)
 A particular symptom may also be present or absent in a disease
 Following questions must be answered in order to evaluate the usefulness
of test results and status of symptoms in determining whether or not a
person has a disease
1. P(+ve|disease)
2. P(-ve|no disease)
3. P(disease|+ve)
4. P(no disease|-ve)
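 a: no. of subjects who have the disease and show a +ve result in the screening test (true positive)
 b: no. of subjects who do not have the disease but show a +ve test result (false positive)
 c: no. of subjects who have the disease but show a –ve test result (false negative)
 d: no. of subjects who do not have the disease and show a –ve test result (true negative)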
 Probability of diseased subjects showing +ve test = true positive / total
diseased cases = a/(a+c) (Sensitivity of the test)
 P(+ve|disease) i.e. P of getting +ve results, given the person is diseased
 Probability of normal subjects showing –ve test = true negative / total normal
cases = d/(b+d) (Specificity of the test)
 P(-ve|no disease) i.e. P of getting –ve results, given the person is non-
diseased
 Probability of subjects having the disease upon showing a +ve test = true
positive /all tested positive = a/(a+b)
(Predictive value positive of the test)
 P(disease|+ve) i.e. P of having the disease given the result is +ve
 Probability of subjects not having the disease upon showing –ve test =
true negative /all tested negative = d/(c+d)
(Predictive value negative of the test)
 P(no disease|-ve) i.e. P of not having the disease given the result is -ve
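The four quantities above can be computed from the cells of the 2 x 2 screening table. A minimal sketch, assuming the convention a = true positive, b = false positive, c = false negative, d = true negative (the one implied by the formulas above); the function name and the counts are hypothetical and for illustration only.

```python
from fractions import Fraction

def screening_metrics(a, b, c, d):
    """a = true +ve, b = false +ve, c = false -ve, d = true -ve."""
    return {
        "sensitivity": Fraction(a, a + c),  # P(+ve | disease)
        "specificity": Fraction(d, b + d),  # P(-ve | no disease)
        "ppv": Fraction(a, a + b),          # P(disease | +ve)
        "npv": Fraction(d, c + d),          # P(no disease | -ve)
    }

# Hypothetical counts, not taken from the slides
metrics = screening_metrics(a=90, b=40, c=10, d=860)
for name, value in metrics.items():
    print(f"{name}: {float(value):.4f}")
```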
Application of Bayes’ Theorem: Example
Given
P (disease) = 0.05; hence P (no disease) = 0.95
P (+ve | disease) = 0.9
P (+ve | no disease) = 0.05
Find
P (+ve) = ?
P (disease | +ve) = ?
P (disease | +ve) = [P (+ve | disease) * P (disease)] / P (+ve)
First find P (+ve)
All +ve results = +ve results with disease OR +ve results without disease
P (+ve) = P (disease AND +ve) + P (no disease AND +ve)
P (+ve) = P (disease) * P (+ve | disease) + P (no disease) * P (+ve | no disease)
= (0.05 * 0.9) + (0.95 * 0.05)
= 0.045 + 0.0475 = 0.0925
P (disease | +ve) = [P (+ve | disease) * P (disease)] / P (+ve)
= (0.9 * 0.05) / 0.0925
= 0.4865 (approximately)
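The calculation above follows a fixed pattern: total probability for P(+ve), then Bayes' theorem. A minimal sketch of that pattern; the function name `posterior_given_positive` is an assumption made for illustration, and the inputs are the three probabilities given in the example.

```python
def posterior_given_positive(prior, p_pos_given_disease, p_pos_given_no_disease):
    """Return (P(+ve), P(disease | +ve)) via total probability and Bayes' theorem."""
    # Total probability: +ve with disease plus +ve without disease
    p_pos = prior * p_pos_given_disease + (1 - prior) * p_pos_given_no_disease
    # Bayes' theorem
    return p_pos, prior * p_pos_given_disease / p_pos

p_pos, p_disease_given_pos = posterior_given_positive(0.05, 0.9, 0.05)
print(round(p_pos, 4), round(p_disease_given_pos, 4))
```

Even with a 90% sensitive test, fewer than half of the positive results belong to diseased subjects here, because the disease is rare (prior of 0.05).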
Application of Bayes’ Theorem: Example
 A particular test for a person using cannabis is 90% sensitive. The test is
also 80% specific.
 Assuming prevalence of 5% of cannabis usage, what is the probability that
a random person who tests +ve actually uses cannabis.
 90% sensitive; correctly identifies (+ve test) 90% of users = P(+ve|user)
 Probability that a person is identified +ve given he is a user = 0.9
 Users: either identified +ve or -ve
 P(-ve|user) = 1-0.9
 Probability that a person is identified -ve when (given) he is a user = 0.1
 80% specific; correctly identifies (-ve test) 80% of non-users = P(-ve|nonuser)
 Probability that a person is identified -ve given he is a nonuser = 0.8
 Nonusers: either identified -ve or +ve
 P(+ve|nonuser) = 1-0.8
 Probability that a person is identified +ve when (given) he is a nonuser = 0.2
 Assuming prevalence of 5% of cannabis usage, i.e. P(user) = 0.05 and P(nonuser) = 0.95, find P(user | +ve): the probability that a person is a user given that he tested +ve
 By Bayes’ theorem: P(user | +ve) = [P(+ve | user) * P(user)] / P(+ve)
Where do we get the term P(+ve) from?
All positive cases = (user AND testing positive) OR (nonuser AND testing positive)
The probability that someone tests positive is the probability that a user tests positive multiplied by the probability of being a user (P of an actual user testing positive), plus the probability that a non-user tests positive multiplied by the probability of being a non-user (P of a non-user testing positive)
Multiplication rule: (a) probability of being a user AND testing positive; (b) probability of being a non-user AND testing positive
Addition rule: probability of anybody (user or non-user) testing positive = (a) + (b)
Also think in terms of total probability: a +ve test can come from only two groups, users or non-users
P(+ve) = (0.9 * 0.05) + (0.2 * 0.95) = 0.045 + 0.19 = 0.235
P(user | +ve) = 0.045 / 0.235 = 0.1915 (approximately)
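The cannabis example can be worked through numerically from the three stated inputs (90% sensitivity, 80% specificity, 5% prevalence). This is a sketch under those stated values only; the variable names are illustrative.

```python
sensitivity = 0.90  # P(+ve | user)
specificity = 0.80  # P(-ve | nonuser)
prevalence = 0.05   # P(user)

# Complement: nonusers are identified either -ve or +ve
p_pos_given_nonuser = 1 - specificity

# Multiplication rule: the two ways of testing positive
p_user_and_pos = prevalence * sensitivity
p_nonuser_and_pos = (1 - prevalence) * p_pos_given_nonuser

# Addition rule (total probability): anybody testing positive
p_pos = p_user_and_pos + p_nonuser_and_pos

# Bayes' theorem
p_user_given_pos = p_user_and_pos / p_pos

print(f"P(+ve) = {p_pos:.3f}, P(user | +ve) = {p_user_given_pos:.3f}")
```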
Probability distributions
Discrete variables: Binomial and Poisson distributions
Continuous variables: Normal distribution
Probability distributions (discrete variables)
 Probability distribution: relationship between the values of a random
variable and the probabilities of their occurrence in a graphical or tabular
manner
 Similar to frequency distribution: relationship between a random variable
and its frequencies
 Probability distribution is a powerful tool for both describing a dataset and
for making predictions
Relative frequency of occurrence of any variable (X) to assume a
corresponding value (x)
 What is the probability that a randomly selected family used 3 assistance programs?
 What is the probability that a randomly selected family used either one or two assistance programs?
Cumulative Probability distributions (ogive)
 What is the probability that a randomly selected family used 2 or less
assistance programs?
 What is the probability that a randomly selected family used 6 or less
assistance programs?
 For “or more” questions, use the “more than” cumulative distribution OR subtract the “or less” cumulative probability from 1
 What is the probability that a randomly selected family used 5 or more assistance programs?
 P (X ≥ 5) + P (X ≤ 4) = 1; P (X ≥ 5) = 1 – 0.6296 = 0.3704
 What is the probability that a randomly selected family used between 3 and 5 (inclusive) assistance programs?
 P (X ≤ 5) is P of a family using 1-5 programs (inclusive); P (X ≤ 2) is P of a family using fewer than 3 programs (1-2 programs)
 P (3 ≤ X ≤ 5) = P (X ≤ 5) – P (X ≤ 2) = 0.8249 – 0.3670 = 0.4579
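The two cumulative-probability tricks above (complement and difference) can be sketched as follows. Only the three cumulative values quoted in the text are used, since the full table is not reproduced here; the dictionary name `cdf` is an assumption.

```python
# Cumulative probabilities P(X <= x) quoted above
cdf = {2: 0.3670, 4: 0.6296, 5: 0.8249}

# Complement: P(X >= 5) = 1 - P(X <= 4)
p_at_least_5 = 1 - cdf[4]

# Difference: P(3 <= X <= 5) = P(X <= 5) - P(X <= 2)
p_3_to_5 = cdf[5] - cdf[2]

print(round(p_at_least_5, 4), round(p_3_to_5, 4))
```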
Binomial distribution (Bernoulli’s distribution)
 For dichotomous variables (two mutually exclusive outcomes only)
 Coin flip
 Red/White flowers
 Dead or alive
 Treated or untreated
 Statisticians call it success or failure
 One outcome is denoted as P; probability for P is denoted as p (remains
same during all trials)
 Other outcome is denoted as Q, probability is q (= 1 - p)
 Trials should be independent (outcome of one trial does not
influence the outcome of any other)
 Death record for a disease; P (death) and Q (alive)
 Probabilities are p and q, respectively
 For a randomly selected set of five individuals with the disease, what is
the probability that we will have the sequence of PQPPQ (dead, alive,
dead, dead, alive)?
 Apply multiplication rule (AND)
 P = pqppq = $p^3 q^2$
 For a dichotomous variable, two trials only
 $p^2 + 2pq + q^2 = P_{tot} = 1$
 For three trials
 $p^3 + 3p^2q + 3pq^2 + q^3 = P_{tot} = 1$
 For four trials
 $p^4 + 4p^3q + 6p^2q^2 + 4pq^3 + q^4 = P_{tot} = 1$
 For event (P) to occur x times in a total no. of trials (n), the number of combinations is
 ${}_nC_x = \frac{n!}{x!\,(n-x)!}$
 Factorial of x: product of all integers from x down to 1; 0! = 1
Binomial expansion
 Death record for a disease; P (death) and Q (alive)
 For a randomly selected set of five individuals with the disease, what is
the probability that we will have three death cases?
 n = 5, x = 3
 ${}_5C_3 = \frac{5!}{3!\,(5-3)!} = \frac{120}{6 \times 2} = 10$
 No. of combinations is 10
 PPPQQ, PQPQP, so on…
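The count of 10 can be confirmed by listing every arrangement of three P's and two Q's. A minimal sketch using the standard library (`math.comb` requires Python 3.8 or later); the list name `sequences` is illustrative.

```python
from itertools import combinations
from math import comb

n, x = 5, 3  # five individuals, three deaths (outcome P)

# Choose which of the n positions hold outcome P; the rest hold Q
sequences = [
    "".join("P" if i in chosen else "Q" for i in range(n))
    for chosen in combinations(range(n), x)
]

print(comb(n, x))
print(sequences)
```

Each listed sequence has the same probability, so the probability of three deaths in any order is the number of sequences multiplied by the probability of one sequence.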
 Finding frequency
 The success is P and failure is Q, for simplicity, we write probability as f(x)
 This expression is called the binomial distribution
 $f(x) = {}_nC_x \, p^x q^{n-x}$; for n = 5 and x = 3, $f(3) = 10\,p^3 q^2$
 Toss a coin 10 times
 What is the probability that there will be 4 heads?
 heads: P (successful event); p = 50% = 0.5
 tails: Q (failures); q = 1 - 0.5 = 0.5
 No. of P events: x = 4
 Total no. of trials: n = 10
${}_nC_x = {}_{10}C_4 = 10! / [4!\,(10-4)!] = 3628800 / (24 \times 720) = 210$
$f(x) = {}_{10}C_4 \, p^x q^{n-x}$
$= 210 \times (0.5)^4 \times (0.5)^6$
$= 0.205$
 14% of pregnant women admitted to a hospital are smokers
 If a random sample of 10 women is selected from this population,
 what is the probability that it will contain 4 smokers?
 smoking women: P (successful event); p = 14% = 0.14
 non-smoking women: Q (failures); q = 1 - 0.14 = 0.86
 No. of P events: x = 4
 Total no. of trials: n = 10
${}_nC_x = {}_{10}C_4 = 10! / [4!\,(10-4)!] = 3628800 / (24 \times 720) = 210$
$f(x) = {}_{10}C_4 \, p^x q^{n-x}$
$= 210 \times (0.14)^4 \times (0.86)^6$
$= 210 \times 0.00038416 \times 0.404567$
$= 0.0326$, or about 3.3%
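Both binomial examples (coin tosses and smokers) use the same formula with x successes in n trials. A minimal sketch; the function name `binomial_pmf` is an assumption, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

p_heads = binomial_pmf(4, 10, 0.5)     # 4 heads in 10 tosses
p_smokers = binomial_pmf(4, 10, 0.14)  # 4 smokers among 10 women

print(round(p_heads, 4), round(p_smokers, 4))
```

Summing `binomial_pmf(x, n, p)` over x = 0 to n gives 1, which matches the binomial expansions shown earlier.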
Poisson distribution
 Derived by Siméon Denis Poisson
 Also for discrete variables
 Difference from the binomial: it describes the number of times a rare discrete event occurs in a continuum of space or time
 The sample size N is infinitely large and uncountable; the number of rare events is finite and countable
 Poisson distribution is used to predict very rare events in a given period
of time or space
 E.g.. no. of radioactive particles emitted per unit time
 No. of earthquakes per year
 Poisson distribution is given by
 $f(x) = P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}$, for x = 0, 1, 2, …
 e = Euler’s number (constant) = 2.7183
 λ = mean no. of events in an interval (in a given time/space)
 x = no. of events
 Assumptions:
 Occurrences of the event are independent (one occurrence does not affect any other)
 Theoretically, an infinite no. of occurrences of the event is possible in the
interval
 The probability of a single event is proportional to the length of the
interval
A 100 km stretch of road along forest land was surveyed to assess the incidence of wildlife deaths due to accidents caused by heavy vehicles. The total number of dead animals counted was 75. Calculate the probability of finding: (a) no dead animals, (b) 1 dead animal, (c) 2 dead animals, (d) 3 dead animals in a given km of the road. Assume one accident kills one animal only.
λ = mean no. of events/unit space = 75/100 = 0.75
x = 0 (for a); 1 (for b); 2 (for c); 3 (for d)
(a) $f(0) = \frac{e^{-0.75} \times 0.75^0}{0!} = \frac{0.4724 \times 1}{1} = 0.4724$
(b) $f(1) = \frac{e^{-0.75} \times 0.75^1}{1!} = \frac{0.4724 \times 0.75}{1} = 0.3543$
(c) $f(2) = \frac{e^{-0.75} \times 0.75^2}{2!} = \frac{0.4724 \times 0.5625}{2} = 0.1329$
(d) $f(3) = \frac{e^{-0.75} \times 0.75^3}{3!} = \frac{0.4724 \times 0.4219}{6} = 0.0332$
In a study of drug-induced anaphylaxis among patients, it was found that anaphylaxis occurs 12 times per year on average. Assuming that the occurrences follow a Poisson distribution, find the probability that three subjects will experience anaphylaxis in the next year.
λ = mean no. of events/unit time = 12/1 = 12
x = 3
$f(3) = \frac{e^{-12} \times 12^3}{3!} = \frac{0.000006144 \times 1728}{6} = 0.00177$
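Both Poisson examples (dead animals per km and anaphylaxis cases per year) can be reproduced with one short function. A minimal sketch using only the standard library; the function name `poisson_pmf` and the parameter name `lam` (the mean number of events per interval) are assumptions.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Probability of exactly x events when the mean per interval is lam."""
    return exp(-lam) * lam**x / factorial(x)

# Road survey: 75 dead animals over 100 km, so lam = 0.75 per km
road = [poisson_pmf(x, 0.75) for x in range(4)]

# Anaphylaxis: mean of 12 cases per year, exactly 3 cases next year
anaphylaxis = poisson_pmf(3, 12)

print([round(p, 4) for p in road], round(anaphylaxis, 5))
```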
Differentiating Poisson and binomial distribution
 Poisson distribution is used when n is very large and p is very small
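One way to see the relationship is to compare the two distributions when n is large and p is small, matching the Poisson mean to the binomial mean (λ = np). The sketch below uses hypothetical values n = 1000 and p = 0.002, which are not from the slides; the function names are illustrative.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

n, p = 1000, 0.002  # hypothetical: many trials, rare event
lam = n * p         # Poisson mean matched to the binomial mean

# Compare the two probabilities for small counts
for x in range(6):
    print(x, round(binomial_pmf(x, n, p), 4), round(poisson_pmf(x, lam), 4))

# Largest pointwise gap over the counts that carry almost all the probability
max_diff = max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(21))
```

Under these assumed values the two sets of probabilities agree closely, which is why the Poisson formula is a convenient stand-in for the binomial in the large-n, small-p setting.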
Distribution of continuous variables
 For continuous random variables
 Infinitely many values of the variable are possible
 Imagine that the number of observations is very large and the widths of the classes are very small
 As n increases toward infinity and the class widths approach 0, the frequency polygon approaches a smooth curve
 Such smooth curves represent the distribution of random continuous variables graphically
[Figures: vertical axis is relative frequency]
 The total area under the curve is equal to 1, as is the case with a relative frequency histogram
 Relative frequency (probability) of occurrence of values between any two points on the x axis is equal to the area bounded by the curve, the x axis, and vertical lines at the two points
 The probability of finding a specific value is 0 (area above a point is 0)
[Figure: vertical axis is relative frequency]
Distribution of continuous variables: Normal distribution
 Most important distribution in statistics
 First given by Abraham De Moivre
 Carl Gauss contributed immensely in its understanding
 Hence, also called Gaussian distribution
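 Density: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(x-\mu)^2 / (2\sigma^2)}$, for $-\infty < x < \infty$
 e and π are constants
 μ is the mean and σ is the standard deviation (use calculus to find these)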
Characteristics of Normal distribution
1. It is symmetrical about its mean (mirror image on either side)
2. Mean, mode and median are all equal
3. Total area under the curve is one square unit (probability distribution)
4. 50% of the area lies on each side of the mean
5. The interval within 1 SD on either side of the mean contains ~68% of the values (34% each side)
6. The interval within 2 SD on either side of the mean contains ~95% of the values
7. The interval within 3 SD on either side of the mean contains ~99.7% of the values
8. Normal distribution is hence completely described by the mean (central tendency) and SD (dispersion)
9. Different values of the mean (μ) shift the distribution along the X-axis
μ is the location parameter
10. Different values of SD (σ) change the flatness or peakedness of the curve: a larger σ gives a flatter, wider curve
σ is the shape parameter (formally, a scale parameter)
Consider normal distribution as a family, each member being determined by
the different mean and SD values
Standard (or unit) normal distribution is one member/kind of normal
distribution in which mean = 0 and SD = 1 unit
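The 68%, 95% and 99.7% figures can be checked from the normal cumulative distribution, which the Python standard library exposes through the error function `math.erf`. A minimal sketch; the function names `normal_pdf`, `normal_cdf` and `prob_within` are assumptions, and the defaults correspond to the standard normal distribution (mean 0, SD 1).

```python
from math import erf, exp, pi, sqrt

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Area under the normal curve to the left of x."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

def prob_within(k, mu=0.0, sigma=1.0):
    """P(mu - k*sigma <= X <= mu + k*sigma)."""
    return normal_cdf(mu + k * sigma, mu, sigma) - normal_cdf(mu - k * sigma, mu, sigma)

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
```

The same areas are obtained for any mean and SD, for example `prob_within(1, mu=100, sigma=15)`, which illustrates that every member of the normal family can be reduced to the standard normal.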
THANK YOU

 
Digital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental DesignDigital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental Design
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
 
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
 
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdfANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
 
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
 
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPLAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
 
The History of Stoke Newington Street Names
The History of Stoke Newington Street NamesThe History of Stoke Newington Street Names
The History of Stoke Newington Street Names
 

FALLSEM2023-24_MSM5001_ETH_VL2023240106334_2023-09-19_Reference-Material-I (3).pptx

  • 3. Probability in statistics
      - Theory of probability provides the foundation of statistical inference
      - Theory of probability: branch of mathematics dealing with the analysis of random phenomena
      - Not the subject of this course
      - Aim: to discuss the application of probability in biostatistics
      - Examples: a patient has a 50-50 chance of survival; 95% of patients recover upon treatment
      - Statisticians usually express probabilities as fractions rather than percentages
  • 4. Basic concepts in probability
      - Event: an outcome of an experiment/trial. E.g.: getting a “head” when tossing a coin is an event. If we are looking for that event and it happens, it is a ‘successful event (success)’
      - Experiment/Trial: an action carried out to determine the probability of occurrence of an event. E.g.: tossing a coin, rolling a die
      - Mutually exclusive events: cannot occur simultaneously
      - Equally likely events: the chances are the same for all outcomes
      - Independent events: the events have no effect on each other in subsequent trials
      - Dependent events: the probability of one event depends on the other in subsequent trials
  • 5.
      - The probability of a successful event (or simply, an event) ranges from 0 (will not occur) to 1 (always occurs)
      - The probabilities of most biological events lie between 0 and 1 (from least likely to most likely)
  • 6. “Absolute” and “No”
      - Absolute (probability 1): the Sun rises in the East; if I throw a stone up, it will come down
      - No probability (probability 0): if I state that this evening the Sun will set in the East, the probability is zero
  • 7.
  • 8. Types of probabilities
      - Theoretical/Classical/Mathematical/A priori probability:
          - We assume equal likelihood of all events
          - Prior information is not required
          - Generally used to make mathematical predictions
          - Applies to some established biological phenomena (Mendelian genetics)
      - Relative/Statistical/A posteriori probability:
          - Information is needed to know whether the events are mutually exclusive and equally likely (no assumptions)
          - Determined by actual observations and trials (many trials required first)
          - The prior probability is revised according to the trials
          - Generally used for statistical analyses
          - Applies to most biological phenomena
  • 9.
      - Subjective probability:
          - Probability derived from an individual's personal judgment or experience about whether a specific outcome is likely to occur
          - Numbers from 0 to 1 (0-100%) are assigned based only on the subjective view/experience
          - It involves no formal calculations and reflects only the subject's opinions and past experience
          - An example of subjective probability is a "gut instinct" when making a trade
  • 10. EXAMPLE
      - Deck of cards: 52 cards; four suits of 13 cards each
      - What is the probability that I pick the king of clubs in one pick? 1/52
      - What is the probability that I pick a king in one pick? 4/52 = 1/13
      - I pick a king from the pack and, without replacing it, pick one more card. What is the probability that the second card is also a king? (Dependent events; the two picks are no longer independent)
      - 1st pick: 4/52 = 1/13; 2nd pick: 3/51 = 1/17
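The card example can be checked with exact fractions. This is a minimal sketch; the variable names are illustrative and not part of the slides, and the last line (both picks being kings) is an extra step obtained by multiplying the two probabilities given above.

```python
from fractions import Fraction

DECK_SIZE = 52
KINGS = 4

# One pick: a specific card (king of clubs) and any king.
p_king_of_clubs = Fraction(1, DECK_SIZE)
p_king = Fraction(KINGS, DECK_SIZE)

# Second pick without replacement, given the first card was a king:
# 3 kings remain among 51 cards.
p_second_king = Fraction(KINGS - 1, DECK_SIZE - 1)

# Both picks are kings (multiplication rule for dependent events).
p_two_kings = p_king * p_second_king

print(p_king_of_clubs, p_king, p_second_king, p_two_kings)
# prints: 1/52 1/13 1/17 1/221
```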
  • 11. Elementary properties of probability
      - For events that are mutually exclusive and equally likely:
      - 1. The probability of each of the n mutually exclusive outcomes (E1, E2, ..., En) of a trial is a non-negative number
      - 2. The sum of the probabilities of all mutually exclusive outcomes is 1
      - 3. Addition rule: for two mutually exclusive events Ei and Ej, the probability that either Ei or Ej occurs is the sum of their individual probabilities
      - E.g.: the probability of getting 1 when rolling a die is 1/6, and the probability of getting 3 is also 1/6; the probability of getting either 1 or 3 is therefore 1/6 + 1/6 = 2/6 = 1/3
  • 12.
      - 4. Multiplication rule: for two independent events Ei and Ej, the probability that both Ei and Ej occur is the product of their individual probabilities; for dependent events, the second factor is the conditional probability given the first event
      - What is the probability of getting 1 and then 3 on successive rolls of a die (independent events)? P = 1/6 × 1/6 = 1/36
      - For two dice, what is the probability of getting 6 on the first die and 1 on the second (independent events)? P = 1/6 × 1/6 = 1/36
      - A bag contains 6 black marbles and 4 blue marbles; what is the probability of drawing two blue marbles one after the other without replacement (dependent events)? P = 4/10 × 3/9 = 2/15
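  • 13.
      - Addition rule: for OR (event Ei OR Ej occurs)
      - Multiplication rule: for AND (event Ei AND Ej occur)
      - In these simple forms, the addition rule holds only for mutually exclusive events and the multiplication rule only for independent events; for dependent events, multiply by the conditional probability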
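The addition and multiplication rules from the die and marble examples can be sketched with exact fractions. The variable names below are illustrative only.

```python
from fractions import Fraction

p_face = Fraction(1, 6)  # probability of any single face of a fair die

# Addition rule (mutually exclusive outcomes): 1 OR 3 on one roll.
p_one_or_three = p_face + p_face

# Multiplication rule (independent rolls): 1 on the first roll AND 3 on the second.
p_one_then_three = p_face * p_face

# Multiplication rule with a conditional probability (dependent draws):
# two blue marbles in a row, without replacement, from 6 black and 4 blue.
p_two_blue = Fraction(4, 10) * Fraction(3, 9)

print(p_one_or_three, p_one_then_three, p_two_blue)
# prints: 1/3 1/36 2/15
```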
  • 14.
      - What is the probability that a random person picked from the dataset is 18 years old or less? 141/318
      - Suppose we pick a random subject aged 18 years or less; what is the probability that he/she has no family history of mood disorders (NF)? (Conditional probability) P(NF | ≤18 yrs) = 28/141
      - What is the probability that a random subject is 18 years old or less and has no family history of mood disorders? (Joint probability) 28/318
      - Apply the multiplication rule: P(≤18 yrs) × P(NF | ≤18 yrs) = 141/318 × 28/141 = 28/318, OR P(NF) × P(≤18 yrs | NF) = 63/318 × 28/63 = 28/318
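A short check of the conditional and joint probabilities, using only the four counts quoted on the slide (318 subjects, 141 aged 18 or less, 63 with no family history, 28 in both groups). The variable names are assumptions made for illustration.

```python
from fractions import Fraction

total = 318               # all subjects in the dataset
n_young = 141             # subjects aged 18 years or less
n_no_history = 63         # subjects with no family history (NF)
n_young_no_history = 28   # subjects in both groups

p_young = Fraction(n_young, total)
p_nf = Fraction(n_no_history, total)
p_nf_given_young = Fraction(n_young_no_history, n_young)
p_young_given_nf = Fraction(n_young_no_history, n_no_history)
p_joint = Fraction(n_young_no_history, total)

# Both factorisations of the multiplication rule give the same joint probability.
print(p_young * p_nf_given_young == p_joint)  # prints: True
print(p_nf * p_young_given_nf == p_joint)     # prints: True
```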
  • 15. Bayes’ Theorem
      - Describes the probability of an event based on prior knowledge of conditions related to the event
      - Used for unknown/unestablished conditions or situations (drugs/diagnostic tests)
  • 16.
      - Take the example of evaluating screening tests and diagnostic criteria
      - Bayes’ theorem gives the probability that a particular disease is present or absent, given the test result (positive or negative) and the status of a particular symptom (present or absent)
      - Screening test results are not infallible (they can yield false +ve or false -ve results)
      - A particular symptom may also be present or absent in a disease
  • 17.
      - The following questions must be answered to evaluate the usefulness of test results and symptom status in determining whether or not a person has a disease:
      - 1. P(+ve | disease)
      - 2. P(-ve | no disease)
      - 3. P(disease | +ve)
      - 4. P(no disease | -ve)
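  • 18.
      - a: no. of subjects who have the disease and show a +ve result in the screening test (true positive)
      - b: no. of subjects who do not have the disease but show a +ve test result (false positive)
      - c: no. of subjects who have the disease but show a -ve test result (false negative)
      - d: no. of subjects who do not have the disease and show a -ve test result (true negative)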
  • 19.
      - Probability of diseased subjects showing a +ve test = true positive / total diseased cases = a/(a+c) (sensitivity of the test), P(+ve | disease)
      - Probability of normal subjects showing a -ve test = true negative / total normal cases = d/(b+d) (specificity of the test), P(-ve | no disease)
  • 20.
      - Probability of diseased subjects showing a +ve test = true positive / total diseased cases = a/(a+c) (sensitivity of the test)
      - P(+ve | disease), i.e. the probability of a +ve result given that the person is diseased
      - Probability of normal subjects showing a -ve test = true negative / total normal cases = d/(b+d) (specificity of the test)
      - P(-ve | no disease), i.e. the probability of a -ve result given that the person is non-diseased
  • 21.
      - Probability of subjects having the disease given a +ve test = true positive / all tested positive = a/(a+b) (predictive value positive of the test)
      - P(disease | +ve), i.e. the probability of having the disease given that the result is +ve
      - Probability of subjects not having the disease given a -ve test = true negative / all tested negative = d/(c+d) (predictive value negative of the test)
      - P(no disease | -ve), i.e. the probability of not having the disease given that the result is -ve
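The four ratios above can be computed from a 2×2 table in a few lines. This is a sketch under stated assumptions: the function name `screening_metrics` and the counts passed to it are hypothetical, and the cell labels follow the formulas a/(a+c), d/(b+d), a/(a+b) and d/(c+d), i.e. a = true positives, b = false positives, c = false negatives, d = true negatives.

```python
def screening_metrics(a, b, c, d):
    """Return the four screening-test ratios from 2x2 table counts.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    return {
        "sensitivity": a / (a + c),  # P(+ve | disease)
        "specificity": d / (b + d),  # P(-ve | no disease)
        "ppv": a / (a + b),          # P(disease | +ve)
        "npv": d / (c + d),          # P(no disease | -ve)
    }


# Hypothetical counts, chosen only to illustrate the calculation.
metrics = screening_metrics(a=90, b=50, c=10, d=850)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these made-up counts, the sensitivity is 90/100 = 0.9 while the predictive value positive is only 90/140, which shows that the two quantities answer different questions.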
  • 22.
  • 23. Application of Bayes’ Theorem: Example
      - Given: P(disease) = 0.05; P(+ve | disease) = 0.9; P(+ve | no disease) = 0.05
      - Find: P(+ve) = ? and P(disease | +ve) = ?
  • 24.
      - P(disease | +ve) = [P(+ve | disease) × P(disease)] / P(+ve)
      - Given: P(disease) = 0.05, hence P(no disease) = 0.95; P(+ve | disease) = 0.9; P(+ve | no disease) = 0.05
      - First find P(+ve): all +ve results = +ve results with disease OR +ve results without disease
      - P(+ve) = P(disease AND +ve) + P(no disease AND +ve)
      - P(+ve) = P(disease) × P(+ve | disease) + P(no disease) × P(+ve | no disease)
      - P(+ve) = (0.05 × 0.9) + (0.95 × 0.05) = 0.045 + 0.0475 = 0.0925
  • 25.
      - P(disease | +ve) = [P(+ve | disease) × P(disease)] / P(+ve)
      - First find P(+ve): all +ve results = +ve results with disease OR +ve results without disease
      - P(+ve) = P(disease) × P(+ve | disease) + P(no disease) × P(+ve | no disease) = (0.05 × 0.9) + (0.95 × 0.05) = 0.045 + 0.0475 = 0.0925
      - P(disease | +ve) = (0.9 × 0.05) / 0.0925 = 0.045 / 0.0925 ≈ 0.4865
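One way to sanity-check this result is to count people instead of multiplying probabilities. The sketch below assumes a hypothetical cohort of 10,000 people (the cohort size is not in the slides) and applies the same three inputs: 5% prevalence, 90% positive rate among the diseased, 5% positive rate among the non-diseased.

```python
population = 10_000                      # hypothetical cohort size

diseased = population * 5 // 100         # 5% prevalence -> 500 people
healthy = population - diseased          # 9,500 people

true_positives = diseased * 90 // 100    # 90% of the diseased test +ve -> 450
false_positives = healthy * 5 // 100     # 5% of the healthy test +ve -> 475

all_positives = true_positives + false_positives   # 925
p_disease_given_positive = true_positives / all_positives

print(true_positives, false_positives, all_positives)
# prints: 450 475 925
print(round(p_disease_given_positive, 4))
# prints: 0.4865
```

Fewer than half of the positive results belong to diseased people, because the non-diseased group is 19 times larger than the diseased group.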
  • 26. Application of Bayes’ Theorem: Example
      - A particular test for cannabis use is 90% sensitive. The test is also 80% specific.
      - Assuming a 5% prevalence of cannabis use, what is the probability that a random person who tests +ve actually uses cannabis?
  • 27.
      - 90% sensitive: the test correctly identifies (+ve test) 90% of users = P(+ve | user)
      - Probability that a person tests +ve given that he is a user = 0.9
      - Users are identified as either +ve or -ve, so P(-ve | user) = 1 - 0.9
      - Probability that a person tests -ve given that he is a user = 0.1
      - 80% specific: the test correctly identifies (-ve test) 80% of non-users = P(-ve | nonuser)
      - Probability that a person tests -ve given that he is a nonuser = 0.8
      - Nonusers are identified as either -ve or +ve, so P(+ve | nonuser) = 1 - 0.8
      - Probability that a person tests +ve given that he is a nonuser = 0.2
      - Assuming a 5% prevalence of cannabis use, what is the probability that a random person who tests +ve actually uses cannabis?
      - I.e. find P(user | +ve), the probability that a person is a user given that he tests +ve
  • 28. Where do we get this term from?
      - All positive cases = (user AND testing positive) OR (nonuser AND testing positive)
      - The probability that someone tests positive is the probability of being a user multiplied by (AND) the probability that a user tests positive, plus (OR) the probability of being a non-user multiplied by the probability that a non-user tests positive
      - P(+ve) = P(user) × P(+ve | user) + P(nonuser) × P(+ve | nonuser)
  • 29.
      - Multiplication rule: (a) probability of being a user and testing positive = P(user) × P(+ve | user); (b) probability of being a non-user and testing positive = P(nonuser) × P(+ve | nonuser)
      - Addition rule: probability of anybody (user or non-user) testing positive = (a) + (b)
      - Also think in terms of total probability: a +ve test can come from only two groups, users or non-users
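The cannabis example can be worked through with a small helper. This is a sketch, not the presenter's own solution: the function name `posterior` and its parameter names are illustrative, and the inputs are the 5% prevalence, 90% sensitivity and 80% specificity stated above.

```python
def posterior(prior, sensitivity, specificity):
    """P(condition | +ve test) by Bayes' theorem."""
    false_positive_rate = 1 - specificity
    # Total probability of a +ve test: true positives plus false positives.
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive


p_user_given_positive = posterior(prior=0.05, sensitivity=0.90, specificity=0.80)
print(round(p_user_given_positive, 4))
# prints: 0.1915
```

Here P(+ve) = 0.05 × 0.9 + 0.95 × 0.2 = 0.235, so P(user | +ve) = 0.045 / 0.235, or roughly 19%: with a low prevalence and 80% specificity, most positive results come from non-users.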
  • 30.
  • 31. Probability distributions
      - Discrete variables: Binomial and Poisson distributions
      - Continuous variables: Normal distribution
  • 32. Probability distributions (discrete variables)
      - Probability distribution: the relationship between the values of a random variable and the probabilities of their occurrence, presented in graphical or tabular form
      - Similar to a frequency distribution: the relationship between a random variable and its frequencies
      - A probability distribution is a powerful tool both for describing a dataset and for making predictions
  • 33. Relative frequency with which a random variable (X) assumes a given value (x)
  • 34.
  • 35.
      - What is the probability that a randomly selected family used 3 assistance programs?
      - What is the probability that a randomly selected family used either one or two assistance programs?
  • 37.
  • 38.
      - What is the probability that a randomly selected family used 2 or fewer assistance programs?
      - What is the probability that a randomly selected family used 6 or fewer assistance programs?
  • 39.
      - What is the probability that a randomly selected family used 5 or more assistance programs?
      - What is the probability that a randomly selected family used between 3 and 5 (inclusive) assistance programs?
      - Use the "more than" cumulative probability OR subtract from 1
  • 40.
      - What is the probability that a randomly selected family used 5 or more assistance programs?
      - P(X ≥ 5) + P(X ≤ 4) = 1; P(X ≥ 5) = 1 - 0.6296 = 0.3704
      - What is the probability that a randomly selected family used between 3 and 5 (inclusive) assistance programs?
      - P(X ≤ 5) is the probability of a family using 1-5 programs (inclusive); P(X ≤ 2) is the probability of a family using fewer than 3 programs (1-2 programs)
      - P(3 ≤ X ≤ 5) = P(X ≤ 5) - P(X ≤ 2) = 0.8249 - 0.3670 = 0.4579
  • 41. Binomial distribution (Bernoulli’s distribution)
      - For dichotomous variables (two mutually exclusive outcomes only)
          - Coin flip
          - Red/white flowers
          - Dead or alive
          - Treated or untreated
      - Statisticians call the two outcomes success and failure
      - One outcome is denoted P; its probability is denoted p (remains the same in all trials)
      - The other outcome is denoted Q; its probability is q (= 1 - p)
      - Trials should be independent (the outcome of one trial does not influence the outcome of any other)
  • 42.
      - Death record for a disease: P (death) and Q (alive)
      - Probabilities are p and q, respectively
      - For a randomly selected set of five individuals with the disease, what is the probability that we will have the sequence PQPPQ (dead, alive, dead, dead, alive)?
  • 43.
      - Death record for a disease: P (death) and Q (alive)
      - Probabilities are p and q, respectively
      - For a randomly selected set of five individuals with the disease, what is the probability that we will have the sequence PQPPQ (dead, alive, dead, dead, alive)?
      - Apply the multiplication rule (AND)
      - P = p × q × p × p × q = p^3 q^2
  • 44. Binomial expansion
      - For a dichotomous variable, two trials only: p^2 + 2pq + q^2 = Ptot = 1
      - For three trials: p^3 + 3p^2q + 3pq^2 + q^3 = Ptot = 1
      - For four trials: p^4 + 4p^3q + 6p^2q^2 + 4pq^3 + q^4 = Ptot = 1
      - For event P to occur x times in a total of n trials, the number of combinations is nCx = n! / [x! (n-x)!]
      - Factorial of x (x!): product of all whole numbers from x down to 1; 0! = 1
  • 45.
      - Death record for a disease: P (death) and Q (alive)
      - For a randomly selected set of five individuals with the disease, what is the probability that we will have three deaths?
      - n = 5, x = 3
      - 5C3 = 5! / [3! (5-3)!] = 120 / (6 × 2) = 10
      - The no. of combinations is 10
      - PPPQQ, PQPQP, and so on
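The coefficients of the binomial expansion and the 10 orderings in the death-record example can be enumerated directly. This sketch uses only the Python standard library; the variable names are illustrative.

```python
from itertools import product
from math import comb

# Coefficients of (p + q)^n for n = 2, 3, 4, from the p^n term down to q^n.
for n in (2, 3, 4):
    print(n, [comb(n, x) for x in range(n, -1, -1)])
# prints:
# 2 [1, 2, 1]
# 3 [1, 3, 3, 1]
# 4 [1, 4, 6, 4, 1]

# Every sequence of five outcomes (P = death, Q = alive) with exactly three P's.
sequences = ["".join(s) for s in product("PQ", repeat=5) if s.count("P") == 3]
print(len(sequences), comb(5, 3))
# prints: 10 10
```

Each of the 10 sequences has probability p^3 q^2, which is why the probability of exactly three deaths is 10 p^3 q^2.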
  • 46.
      - Finding the frequency
      - Success is P and failure is Q; for simplicity, we write the probability as f(x)
      - f(x) = nCx × p^x × q^(n-x); for n = 5 and x = 3, f(3) = 10 p^3 q^2
      - This expression is called the binomial distribution
  • 47.
      - Toss a coin 10 times; what is the probability that there will be 4 heads?
      - Heads: P (successful event); p = 50% = 0.5
      - Tails: Q (failure); q = 1 - 0.5 = 0.5
      - No. of P events: x = 4
      - Total no. of trials: n = 10
      - nCx = 10C4 = 10! / [4! (10-4)!] = 3628800 / (24 × 720) = 210
      - f(x) = 10C4 × p^x × q^(n-x) = 210 × (0.5)^4 × (0.5)^6 = 0.205
  • 48.
      - 14% of pregnant women admitted to a hospital are smokers
      - If a random sample of 10 women is selected from this population, what is the probability that it will contain 4 smokers?
  • 49.
      - 14% of pregnant women admitted to a hospital are smokers
      - If a random sample of 10 women is selected from this population, what is the probability that it will contain 4 smokers?
      - Smoking women: P (successful event); p = 14% = 0.14
      - Non-smoking women: Q (failure); q = 1 - 0.14 = 0.86
      - No. of P events: x = 4
      - Total no. of trials: n = 10
      - nCx = 10C4 = 10! / [4! (10-4)!] = 3628800 / (24 × 720) = 210
      - f(x) = 10C4 × p^x × q^(n-x) = 210 × (0.14)^4 × (0.86)^6 = 210 × 0.00038416 × 0.404567 ≈ 0.0326, or about 3.3%
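Both binomial examples (4 heads in 10 tosses, 4 smokers in a sample of 10) can be reproduced with a short function. This is a sketch; `binomial_pmf` is an illustrative name, with x the number of successes, n the number of trials and p the probability of success.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p
    return comb(n, x) * p**x * q ** (n - x)


p_four_heads = binomial_pmf(x=4, n=10, p=0.5)
p_four_smokers = binomial_pmf(x=4, n=10, p=0.14)

print(round(p_four_heads, 4))    # prints: 0.2051
print(round(p_four_smokers, 4))  # prints: 0.0326
```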
  • 50. Poisson distribution
      - Derived by Siméon Denis Poisson
      - Also for discrete variables
      - Difference from the binomial distribution: it describes the number of times a rare discrete event occurs in a continuum of space or time
      - The sample size N is infinitely large and uncountable; the number of rare events is finite and countable
      - The Poisson distribution is used to predict very rare events in a given period of time or region of space
      - E.g.: no. of radioactive particles emitted per unit time; no. of earthquakes per year
  • 51.
      - The Poisson distribution is given by f(x) = (e^(-λ) × λ^x) / x!
      - e = Euler’s number (constant) = 2.7183
      - λ = mean no. of events in an interval (in a given time/space)
      - x = no. of events
      - Assumptions:
          - The occurrences of events are independent (one occurrence does not affect any other)
          - Theoretically, an infinite no. of occurrences of the event is possible in the interval
          - The probability of a single event is proportional to the length of the interval
  • 52. A 100 km stretch of road along forest land was surveyed to assess the incidence of wildlife deaths due to accidents caused by heavy vehicles. The total number of dead animals counted was 75. Calculate the probability of finding (a) no dead animals, (b) 1 dead animal, (c) 2 dead animals, (d) 3 dead animals in a given km of the road. Assume one accident kills one animal only.
      - λ = mean no. of events/unit space = 75/100 = 0.75
      - x = 0 (for a); 1 (for b); 2 (for c); 3 (for d)
  • 53.
      - λ = mean no. of events/unit space = 75/100 = 0.75
      - x = 0 (for a); 1 (for b); 2 (for c); 3 (for d)
      - (a) f(0) = [e^(-0.75) × 0.75^0] / 0! = (0.4724 × 1) / 1 = 0.4724
      - (b) f(1) = [e^(-0.75) × 0.75^1] / 1! = (0.4724 × 0.75) / 1 = 0.3543
      - (c) f(2) = [e^(-0.75) × 0.75^2] / 2! = (0.4724 × 0.5625) / 2 = 0.133
      - (d) f(3) = [e^(-0.75) × 0.75^3] / 3! = (0.4724 × 0.4219) / 6 = 0.0332
  • 54. In a study of drug-induced anaphylaxis among patients, the occurrence of anaphylaxis was found to be 12 per year. Assuming that the data follow a Poisson distribution, find the probability that three subjects will experience anaphylaxis in the next year.
      - λ = mean no. of events/unit time = 12/1 = 12
      - x = 3
      - f(3) = [e^(-12) × 12^3] / 3! = (0.000006144 × 1728) / 6 = 0.00177
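The road-survey and anaphylaxis examples can be reproduced with a short function. This is a sketch; `poisson_pmf` and the parameter name `lam` (for λ) are illustrative names.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """Probability of exactly x events when the mean number of events is lam."""
    return exp(-lam) * lam**x / factorial(x)


# Road survey: lam = 75 dead animals / 100 km = 0.75 per km.
road = [poisson_pmf(x, 0.75) for x in range(4)]
print([round(p, 4) for p in road])
# prints: [0.4724, 0.3543, 0.1329, 0.0332]

# Anaphylaxis: lam = 12 cases per year, x = 3 cases.
p_three_cases = poisson_pmf(3, 12)
print(round(p_three_cases, 5))
# prints: 0.00177
```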
  • 55. Differentiating Poisson and binomial distribution
      - The Poisson distribution is used when n is very large and p is very small
  • 56.
  • 57. Distribution of continuous variables
      - For continuous random variables
      - An infinite number of values of the variable is possible
  • 58.
  • 59.
      - Imagine that the number of observations is very large and the class widths are very small
      - As n increases to infinity and the class widths approach 0, the relative frequency polygon becomes a smooth curve
      - Such smooth curves represent the distribution of continuous random variables graphically
  • 60.
      - The total area under the curve is equal to 1, as is the case with the relative frequency curve
      - The relative frequency (probability) of occurrence of values between any two points on the x axis is equal to the area under the curve between those points
      - The probability of finding any one specific value is 0 (the area above a single point is 0)
  • 61. Distribution of continuous variables: Normal distribution
      - Most important distribution in statistics
      - First given by Abraham de Moivre
      - Carl Gauss contributed immensely to its understanding; hence, it is also called the Gaussian distribution
      - e and π are constants
      - μ is the mean and σ is the standard deviation (use calculus to find these)
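The density formula itself does not appear in the transcript. For reference, the standard form of the normal density, written with the same symbols (e, π, μ, σ), is:

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}},
\qquad -\infty < x < \infty
```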
  • 62. Characteristics of Normal distribution
      - 1. It is symmetrical about its mean (mirror image on either side)
      - 2. Mean, mode and median are all equal
      - 3. The total area under the curve is one square unit (probability distribution)
      - 4. 50% of the area lies on each side of the mean
  • 63.
      - 5. A distance of 1 SD on either side of the mean covers ~68% of the values (34% on each side)
      - 6. A distance of 2 SD on either side of the mean covers ~95% of the values
      - 7. A distance of 3 SD on either side of the mean covers ~99.7% of the values
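The 68%, 95% and 99.7% figures can be checked numerically. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) on the standard normal distribution; the same areas hold for any mean and SD.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Area under the curve within k standard deviations on either side of the mean.
coverage = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)
}

for k, area in coverage.items():
    print(f"within {k} SD: {area:.4f}")
# prints:
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```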
  • 64.
      - 8. The normal distribution is hence completely described by the mean (central tendency) and SD (dispersion)
      - 9. Different values of the mean (μ) shift the distribution along the X-axis; μ is the location parameter
      - 10. Different values of the SD (σ) change the height of the distribution along the Y-axis (flatness or peakedness); σ is the shape parameter
  • 65.
      - Consider the normal distribution as a family, each member being determined by its mean and SD values
      - The standard (or unit) normal distribution is the member of this family in which mean = 0 and SD = 1 unit

Editor's Notes

  1. examples of dice; two dice
  2. mutually exclusive: cannot occur together; equal likelihood: all outcomes have the same chance of occurring; independence: occurrence of one event does not determine the outcome of another event
  3. 1. 141/318 2. 28/141 3. 28/318
  4. No need for derivations; in exams you will be informed if the data follow the Poisson model. If not, use the binomial distribution