05/04/2025 1
Probability and probability distribution
At the end of this chapter, students are expected to
understand the following points:
• Probability (definition of terms, probability rules)
• The difference between probability and probability distribution
• Conditional probability
• Distribution for categorical variable
• Distribution for continuous variable
• Different distribution tables
Probability definition
Chance of observing a particular outcome, or
likelihood of an event
Assumes a “stochastic” or “random” process,
i.e., the outcome is not predetermined; there
is an element of chance
An outcome is a specific result of a single trial
of a probability experiment
Chance
• When a meteorologist states that the chance of rain is 50%,
the meteorologist is saying that it is equally likely to rain or
not to rain. If the chance of rain rises to 80%, it is more likely
to rain. If the chance drops to 20%, then it may rain, but it
probably will not rain.
• These examples illustrate the chance that
some event of a random process occurs.
Probability and Probability Distributions
Probabilities and probability distributions
are nothing more than extensions of the
ideas of relative frequency and histograms,
respectively.
Why Probability in Medicine?
• Because medicine is an inexact science,
physicians seldom predict an outcome with
absolute certainty.
• E.g., to formulate a diagnosis, a physician must
rely on available diagnostic information about a
patient
– History and physical examination
– Laboratory investigation, X-ray findings, ECG, etc
Cont…
• An understanding of probability is fundamental
for quantifying the uncertainty that is inherent in
the decision-making process.
• Probability theory also allows us to draw
conclusions about a population based on
known information about a sample drawn
from that population.
Conclusions and inferences in science are drawn using
probability
Terminology
Random experiment: an experiment
in which the outcomes occur at random and
cannot be predicted with certainty.
e.g. A single coin toss is a random experiment,
since either Head (H) or Tail (T) may occur
Trial: A physical action, the result of which
cannot be predetermined
Terminology…
Sample Space: The set of all possible outcomes of an
experiment.
In die throwing, S = {1,2,3,4,5,6}
Events: Collections of basic outcomes from the sample space.
We say that an event occurs if any one of the basic outcomes in
the event occurs.
An event is any subset of the sample space.
- Event of getting an even number: A = {2,4,6}
Success/favorable case: An outcome that entails the happening of
a desired event.
Equally likely events:
• Events are equally likely if, in a random experiment, all outcomes have
an equal chance of occurrence.
- In tossing a coin, both H and T have an equal chance of occurring
Mutually Exclusive Events (Disjoint Events)
• Events are mutually exclusive if the occurrence of one event prevents the
occurrence of the other.
- In tossing a coin, the occurrence of Head prevents the
occurrence of Tail.
Cont…
Independent events (mutual independence)
• The occurrence or non-occurrence of one event
doesn’t affect the occurrence or non-occurrence
of the other event in repeated trials
of a random experiment.
- When two coins are tossed simultaneously, the occurrence of
a head on one coin does not affect the occurrence of a tail on the
other.
Two Categories of Probability
• Objective and Subjective Probabilities.
• Objective probability
1) Classical probability and
2) Relative frequency probability.
Types of probability
Classical Method
It is based on gambling ideas.
• If there are n equally likely possibilities, of
which one must occur and m are regarded as
favorable, or as a “success,” then the probability
of a “success” is m/n.
P(A) = m/n
Example: What is the probability of rolling a 6 with a well-balanced
die?
Answer: In this case, m = 1 and n = 6, so the probability is 1/6
≈ 0.167
Relative Frequency Probability
• Based on the long-run behavior of a process
• The proportion of times the event A occurs
in a large number of trials repeated under
essentially identical conditions
• Definition: If a process is repeated a large
number of times (n), and if an event with the
characteristic E occurs m times, the relative
frequency of E is taken as the probability of E:
P(E) = m/n.
Relative freq…
Examples
• If you toss a coin 100 times and head comes up 40
times,
P(H) = 40/100 = 0.4.
• If we toss a coin 10,000 times and the head
comes up 5562 times,
P(H) = 5562/10,000 = 0.5562.
• Therefore, the longer the series and the larger
the sample size, the closer the estimate tends to be to the true
value (0.5)
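The long-run idea can be illustrated with a small simulation. The sketch below is only an illustration, not part of the original slides: it assumes a fair coin (true P(H) = 0.5), and the function name and the fixed random seed are hypothetical choices made so that runs are repeatable.

```python
import random


def relative_frequency_of_heads(n_tosses, seed=0):
    """Estimate P(H) as m/n from n simulated tosses of a fair coin."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))  # m = number of heads
    return heads / n_tosses  # relative frequency m/n


# The estimate tends to settle near 0.5 as the number of tosses grows.
for n in (100, 10_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

Individual short runs can still land far from 0.5, which is why the definition speaks of a large number of trials.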
Subjective Probability
• Personalistic (an opinion or judgment by a decision
maker about the likelihood of an event)
• Personal assessment of which treatment, traditional or modern,
is more effective at providing a cure
• Personal assessment of which sports team will win a
match
May also use classical and relative frequency methods to
assess the likelihood of an event, but does not rely
on the repeatability of any process.
Properties of Probability
1. The numerical value of a probability always
lies between 0 and 1, inclusive.
0 ≤ P(E) ≤ 1
• A value of 0 means the event cannot occur
• A value of 1 means the event definitely will occur
• A value of 0.5 means that the probability that
the event will occur is the same as the
probability that it will not occur.
2. The sum of the probabilities of all mutually
exclusive and exhaustive outcomes is equal to 1.
P(E1) + P(E2) + .... + P(En) = 1.
3. For two mutually exclusive events A and B,
P(A or B) = P(A ∪ B) = P(A) + P(B).
If not mutually exclusive:
P(A or B) = P(A) + P(B) - P(A and B)
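The addition rule can be checked by counting outcomes in a small sample space. This is a minimal sketch assuming one roll of a fair die; event A (an even number) appears earlier in the slides, while event B (a number greater than 3) and the helper `prob` are hypothetical, introduced only for illustration.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # sample space for one roll of a fair die
A = {2, 4, 6}           # event: an even number
B = {4, 5, 6}           # hypothetical event: a number greater than 3


def prob(event):
    """Classical probability m/n for equally likely outcomes in S."""
    return Fraction(len(event & S), len(S))


p_union_direct = prob(A | B)                    # count outcomes in A or B
p_union_rule = prob(A) + prob(B) - prob(A & B)  # P(A) + P(B) - P(A and B)
print(p_union_direct, p_union_rule)
```

Because A and B share the outcomes 4 and 6, simply adding P(A) and P(B) would count them twice; subtracting P(A and B) removes the double count.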
4. The complement of an event A, denoted by
Ā or A^c, is the event that A does not occur
– Consists of all the outcomes in which event A
does NOT occur
P(Ā) = P(not A) = 1 – P(A)
– Ā occurs only when A does not occur.
– These are complementary events.
Unions and Intersections of Two Events
Unions of Two Events
“If A and B are events, then the union of A and B, denoted
by A ∪ B, represents the event composed of all basic
outcomes in A or B.”
Intersections of Two Events
“If A and B are events, then the intersection of A and B, denoted by A ∩
B, represents the event composed of all basic outcomes in A and
B.”
[Venn diagram: A = cigarette smoking, B = with lung cancer, A ∩ B = smokers with lung cancer]
Additive Law of Probability
Let A and B be two events in a sample space S. The
probability of the union of A and B is
P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
[Venn diagram: overlapping events A and B, with the overlap labeled A ∩ B]
Mutually Exclusive Events
Mutually Exclusive Events: Events that have no basic
outcomes in common, or equivalently, whose intersection is
the empty set.
[Venn diagram: non-overlapping events A and B within the sample space S]
Let A and B be two events in a sample space S. The probability of the
union of two mutually exclusive events A and B is:
P(A ∪ B) = P(A) + P(B).
Independent Events
Two events are independent if the occurrence of one of the
events does not affect the probability of the other event.
That is, A and B are independent if:
P(B|A) = P(B) or if P(A|B) = P(A).
Example:
Let event A stand for “the sex of the first child from a mother is female”,
and event B stand for “the sex of the second child from the same
mother is female”.
Are A and B independent?
Solution
P(B|A) = P(B) = 0.5. The occurrence of A does not affect the probability of B,
so the events are independent.
Multiplication rule
– If A and B are independent events, then
P(A ∩ B) = P(A) × P(B)
– More generally,
P(A ∩ B) = P(A) P(B|A) = P(B) P(A|B)
P(A and B) denotes the probability that A and
B both occur at the same time.
Conditional probabilities and the multiplicative law
• Sometimes the chance that a particular event happens depends on
the outcome of some other event. This applies most obviously to
events that are spread out in time.
• Example: The chance that a patient with some disease survives the next
year depends on his having survived to the present time. Such
probabilities are called conditional.
• The notation is Pr(B|A), which is read as “the probability that event B
occurs given that event A has already occurred.”
• Let A and B be two events of a sample space S. The conditional
probability of event A given B is
Pr(A|B) = P(A ∩ B) / P(B), P(B) ≠ 0.
• Similarly, P(B|A) = P(A ∩ B) / P(A), P(A) ≠ 0. This can be taken as an
alternative form of the multiplicative law.
Conditional Probability
The conditional probability of the event A given that
event B has occurred is denoted by P(A|B).
Then, P(A|B) =P(A ∩ B)/P(B) , P(B) > 0.
Similarly,
P(B|A) = P(A ∩ B)/P(A), P(A) > 0
When do you use conditional probability?
For example, in sensitivity and specificity.
Example 1
Calculating the probability of an event
Table 1: Lifetime frequency of cocaine use by sex
among adult cocaine users
------------------------------------------------------------
Lifetime frequency        Male    Female    Total
of cocaine use
------------------------------------------------------------
1-19 times                 32        7        39
20-99 times                18       20        38
more than 100 times        25        9        34
------------------------------------------------------------
Total                      75       36       111
------------------------------------------------------------
Questions…
1. What is the probability that a randomly picked person is
male?
2. What is the probability that a randomly picked person has used
cocaine more than 100 times?
3. Given that the selected person is male, what is the
probability that he has used cocaine
more than 100 times?
4. Given that the person has used cocaine less than 100
times, what is the probability that the person is female?
5. What is the probability that a randomly picked person is
male and has used cocaine more than 100 times?
Solution
1. Pr(m) = Total adult males / Total adult cocaine users
= 75/111 = 0.68.
2. Pr(c>100) = Adult cocaine users with more than 100 uses / Total adult cocaine users
= 34/111 = 0.31.
3. Pr(c>100 | m) = 25/75 = 0.33.
4. Pr(f | c<100) = (7+20)/77 = 0.35.
5. Pr(m ∩ c>100) = Pr(m) × Pr(c>100 | m) = 75/111 × 25/75, or
Pr(c>100) × Pr(m | c>100) = 34/111 × 25/34; either way = 25/111 = 0.23.
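The five answers can be reproduced directly from the counts in Table 1. The sketch below is illustrative only; the dictionary layout and variable names are assumptions, and exact fractions are used so that the conditional and joint probabilities can be compared without rounding.

```python
from fractions import Fraction

# Counts from Table 1 (rows: lifetime frequency of use; columns: sex).
counts = {
    "1-19": {"male": 32, "female": 7},
    "20-99": {"male": 18, "female": 20},
    "100+": {"male": 25, "female": 9},
}

total = sum(sum(row.values()) for row in counts.values())  # 111 users
males = sum(row["male"] for row in counts.values())        # 75 males
heavy = sum(counts["100+"].values())                       # 34 in the top category
under_100 = total - heavy                                  # 77 in the lower two rows

p_male = Fraction(males, total)                             # Question 1
p_heavy = Fraction(heavy, total)                            # Question 2
p_heavy_given_male = Fraction(counts["100+"]["male"], males)  # Question 3
p_female_given_under_100 = Fraction(
    counts["1-19"]["female"] + counts["20-99"]["female"], under_100
)                                                           # Question 4
p_male_and_heavy = p_male * p_heavy_given_male              # Question 5

for name, p in [
    ("P(m)", p_male),
    ("P(c>100)", p_heavy),
    ("P(c>100 | m)", p_heavy_given_male),
    ("P(f | c<100)", p_female_given_under_100),
    ("P(m and c>100)", p_male_and_heavy),
]:
    print(f"{name} = {p} = {float(p):.2f}")
```

Note that the joint probability in Question 5 uses the conditional probability P(c>100 | m), not the marginal P(c>100); the two would coincide only if sex and frequency of use were independent.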
Summary
Application of probability of categorical
variables
• By calculating the probability of an event in epidemiological
studies, we can estimate the prevalence of certain diseases
in a given population.
– Prevalence of a disease (e.g. tuberculosis, diabetes,
heart disease),
– Prevalence of certain characteristics (e.g. high blood
pressure, low birth weight) or
– Prevalence of certain behaviors (e.g. smoking, drug use,
condom use).
Probability Distributions
• A probability distribution is a device used to describe the
behavior that a random variable may have by applying the
theory of probability.
• It is a list of the probabilities associated with the
values of the random variable obtained in an
experiment
• It describes the way data are distributed, which allows us to draw
conclusions about a set of data
• Random Variable = Any quantity or characteristic that is
able to assume a number of different values such that any
particular outcome is determined by chance
Probability distribution…
♣ A probability distribution of a random variable
can be displayed by a table or a graph or a
mathematical formula.
♣ With categorical variables, we obtain the
frequency distribution of each variable.
♣ With numeric variables, the aim is to determine
whether or not normality may be assumed.
Therefore, the probability distribution of a
random variable is a table, graph, or
mathematical formula that gives the
probabilities with which the random variable
takes different values or ranges of values.
A. Discrete Probability Distributions
• For a discrete random variable, the probability
distribution specifies each of the possible
outcomes of the random variable along with the
probability that each will occur
• Examples can be:
– Frequency distribution
– Relative frequency distribution
– Cumulative frequency
The following data show the number of diagnostic
services a patient receives
• What is the probability that a patient receives
exactly 3 diagnostic services?
P(X=3) = 0.031
• What is the probability that a patient receives
at most one diagnostic service?
P (X≤1) = P(X = 0) + P(X = 1)
= 0.671 + 0.229
= 0.900
• What is the probability that a patient
receives at least four diagnostic services?
P (X≥4) = P(X = 4) + P(X = 5)
= 0.010 + 0.006
= 0.016
Probability distributions can also
be displayed using a graph
[Bar chart: probability P(X = x) against the number of diagnostic services, x = 0 to 5]
Binomial Distribution
• It is one of the most widely encountered discrete
probability distributions.
• It considers a dichotomous (binary) random variable
• It is based on the Bernoulli trial
– A single trial of an experiment that can result in only
one of two mutually exclusive outcomes (success or
failure; dead or alive; sick or well; male or female)
A binomial probability distribution occurs when
the following requirements are met.
1. The procedure has a fixed number of trials.
2. The trials must be independent.
3. Each trial must have outcomes that fall into exactly
two categories.
4. The probability of success must remain constant for each
trial [P(success) = p].
Binomial Distribution
A process that has only two possible outcomes is called a
binomial process. In statistics, the two outcomes are
frequently denoted as success and failure. The
probabilities of a success or a failure are denoted by p and
q, respectively. Note that p + q = 1. The binomial
distribution gives the probability of exactly k successes in
n trials
$$P(k) = \binom{n}{k} p^{k} (1-p)^{n-k}$$
Binomial distribution, generally
Note the general pattern emerging: if you have only two possible
outcomes (call them 1/0 or yes/no or success/failure) in n
independent trials, then the probability of exactly X “successes” is
$$P(X) = \binom{n}{X} p^{X} (1-p)^{n-X}$$
where
n = number of trials
X = number of successes out of n trials
p = probability of success
1 − p = probability of failure
• n denotes the fixed number of trials
• x denotes the number of successes in
the n trials
• p denotes the probability of success
• q denotes the probability of failure (1 − p)
$$\binom{n}{x} = {}_{n}C_{x} = \frac{n!}{x!\,(n-x)!}$$
• Represents the number of ways of selecting x objects out of n
where the order of selection does not matter.
• where n! = n(n−1)(n−2)…(1), and 0! = 1
Example 2
• Suppose we know that 40% of a certain population
are cigarette smokers. If we take a random sample
of 10 people from this population, what is the
probability that we will have exactly 4 smokers in
our sample?
• If the probability that any individual in the
population is a smoker is p = 0.40, then the
probability that x = 4 smokers out of n = 10
subjects selected is:
P(X = 4) = 10C4 (0.4)^4 (1 − 0.4)^(10−4)
= 10C4 (0.4)^4 (0.6)^6
= 210 (0.0256)(0.046656)
≈ 0.25
• The probability of obtaining exactly 4 smokers in
the sample is about 0.25.
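The hand calculation can be checked in a few lines. This is a minimal sketch using only the Python standard library; the function name `binomial_pmf` is a hypothetical helper, not something defined in the slides.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Example 2: n = 10 people sampled, p = 0.4 probability of being a smoker.
p_four_smokers = binomial_pmf(4, 10, 0.4)
print(comb(10, 4))               # number of ways to choose 4 of 10
print(round(p_four_smokers, 2))  # probability of exactly 4 smokers
```

The same function can be called with any other x from 0 to 10 to obtain the rest of the distribution.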
• We can compute the probability of observing zero
smokers out of 10 subjects selected at random,
exactly 1 smoker, and so on, and display the
results in a table.
• The third column, P(X ≤ x), gives the cumulative
probability. E.g. the probability of selecting 3 or
fewer smokers into the sample of 10 subjects is
P(X ≤ 3) =.3823, or about 38%.
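A table of this kind can be regenerated as follows. This is a sketch under the assumptions of Example 2 (n = 10, p = 0.4); the three-column layout (x, P(X = x), P(X ≤ x)) follows the description in the text, and the variable names are illustrative.

```python
from math import comb

n, p = 10, 0.4  # sample size and probability of being a smoker (Example 2)

rows = []          # each row: (x, P(X = x), P(X <= x))
cumulative = 0.0
for x in range(n + 1):
    px = comb(n, x) * p**x * (1 - p) ** (n - x)  # binomial probability of exactly x
    cumulative += px                             # running total gives P(X <= x)
    rows.append((x, px, cumulative))

print(" x   P(X = x)   P(X <= x)")
for x, px, cdf in rows:
    print(f"{x:2d}   {px:.4f}     {cdf:.4f}")
```

The cumulative column is just the running sum of the second column, so its last entry should equal 1 up to rounding.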
The probabilities in the above table can
be displayed as the following graph
[Bar chart: probability against the number of smokers, 0 to 10]
B. Probability distribution of continuous
variables
• Under different circumstances, the outcome of a random
variable may not be limited to categories or counts.
– E.g. Suppose X represents the continuous variable
‘Height’; rarely is an individual exactly 170 cm tall.
– X can assume an infinite number of intermediate values
170.1, 170.2, 170.3 etc.
• Because a continuous random variable X can take on an
uncountably infinite number of values, the probability
associated with any one particular value is
zero.
Continuous Probability Distributions
• There is an infinite number of continuous random variables
• We try to pick a model that
– Fits the data well
– Allows us to make the best possible inferences
using the data.
[Figure: density curves f(x) against x for uniform, normal, and skewed distributions]
Properties of Normal Distributions
The most important probability distribution in statistics is the
normal distribution.
A normal distribution is a continuous probability distribution
for a random variable, x.
The graph of a normal distribution is called the normal
curve.
[Figure: normal curve plotted against x]
The Normal Distribution
• The formula that generates the normal probability distribution is:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}, \quad -\infty < x < \infty$$
where σ = population standard deviation,
µ = population mean,
e = 2.718…, π = 3.14…
This is a bell-shaped curve
with different centers and
spreads depending on µ
and σ
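As an illustration, the normal density with mean μ and standard deviation σ can be evaluated numerically. The sketch below is not from the slides: the function name `normal_pdf`, the choice of the standard normal (μ = 0, σ = 1), and the crude Riemann sum over ±6 standard deviations are all assumptions made for the demonstration.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sigma * sqrt(2 * pi))


mu, sigma = 0.0, 1.0  # standard normal, chosen only for illustration
step = 0.001          # width of each rectangle in the Riemann sum
n_steps = 12_000      # covers mu - 6*sigma to mu + 6*sigma

# Approximate the total area under the curve by summing thin rectangles.
area = sum(normal_pdf(mu - 6 * sigma + i * step, mu, sigma) * step
           for i in range(n_steps))

print(round(normal_pdf(mu, mu, sigma), 4))  # peak height, at x = mu
print(round(area, 4))                       # total area, should be close to 1
```

The peak height is a density, not a probability; only areas under the curve correspond to probabilities.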
Normal Curve Characteristics
1. It is a probability distribution of a continuous variable.
It extends from minus infinity to plus infinity.
2. It is unimodal, bell-shaped and symmetric.
3. The mean, the median and mode are all equal
4. The curve approaches, but never meets, the abscissa
at both high and low ends.
5. The total area under the curve is 1. (This is a
requirement of any probability density function.)
6. It is determined by two quantities: its mean and SD .
Changing mean alone shifts the entire normal curve to
the left or right. Changing SD alone changes the
degree to which the distribution is spread out.
7. The height of the frequency curve, which is
called the probability density, cannot be taken as
the probability of a particular value.
• An observation from a normal distribution can
be related to a standard normal distribution
(SND) which has a published table.
The standard normal distribution
Since the mean and SD of a normal distribution can take an infinite number
of possible values, it is impossible
to tabulate the areas for each and every
normal curve.
Instead, only a single curve, for which μ = 0 and σ = 1, is
tabulated.
This curve is called the standard normal distribution
(SND).
The Standard Normal Distribution
• To find P(a < x < b), we need to find the area under the
appropriate normal curve.
• To simplify the tabulation of these areas, we standardize
each value of x by expressing it as a z-score, the number
of standard deviations σ it lies from the mean μ.
$$z = \frac{x - \mu}{\sigma}$$
The Standard Normal (z) Distribution
• Mean = 0; Standard deviation = 1
• When x = μ, z = 0
• Symmetric about z = 0
• Values of z to the left of center are negative
• Values of z to the right of center are positive
• Total area under the curve is 1.
Some Useful Tips
Using normal table
The four-digit probability in a particular row and column of Table
1 gives the area under the z curve to the left of that particular value
of z.
Example 4
Use Table 1 to calculate these probabilities:
Area for z = 1.36
P(z ≤ 1.36) = .9131
P(z > 1.36)
= 1 − .9131 = .0869
P(−1.20 ≤ z ≤ 1.36)
= .9131 − .1151 = .7980
Example 5: Probability that Z is between
−2.59 and 1.31
P(−2.59 ≤ Z ≤ 1.31)
= P(0 < Z ≤ 1.31) + P(−2.59 < Z ≤ 0)
= 0.4049 + 0.4952 = 0.9001
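Table look-ups like those in Examples 4 and 5 can be cross-checked numerically. The sketch below is an illustration only: it assumes the standard identity Φ(z) = ½(1 + erf(z/√2)) for the area to the left of z, and the function name `phi` is a hypothetical helper.

```python
from math import erf, sqrt


def phi(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + erf(z / sqrt(2)))


# Example 4
p_below = phi(1.36)                 # P(z <= 1.36)
p_above = 1 - phi(1.36)             # P(z > 1.36)
p_between = phi(1.36) - phi(-1.20)  # P(-1.20 <= z <= 1.36)

# Example 5
p_example5 = phi(1.31) - phi(-2.59)  # P(-2.59 <= Z <= 1.31)

for label, value in [
    ("P(z <= 1.36)", p_below),
    ("P(z > 1.36)", p_above),
    ("P(-1.20 <= z <= 1.36)", p_between),
    ("P(-2.59 <= Z <= 1.31)", p_example5),
]:
    print(f"{label} = {value:.4f}")
```

Small differences in the fourth decimal place relative to a printed table can arise because table entries are themselves rounded.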
Exercises 2
Find the probability of the following under the SND
– Above 1.96?
– Below –1.96? Below 1.96?
– Between –1.28 and 1.28?
– Between –1.65 and 1.08? (0.8104)
– What level cuts off the upper 25%?
– What levels cut off the middle 99%?
Probability and Normal Distributions
Example: The average weight of pregnant women attending
prenatal care in a clinic was 78 kg with a standard deviation of
8 kg. If the weights are normally distributed:
a) Find the probability that a randomly selected pregnant woman
weighs less than 90 kg.
μ = 78, σ = 8
$$z = \frac{x - \mu}{\sigma} = \frac{90 - 78}{8} = 1.5$$
P(x < 90) = P(z < 1.5) = 0.9332
The probability that a
randomly selected pregnant
woman weighs less than
90 kg is 0.9332.
[Figure: normal curve with μ = 78 and x = 90 marked, and the standard normal curve with μ = 0 and z = 1.5 marked; the shaded area is P(x < 90)]
Probability and Normal Distributions
Example:
b) Based on the above example, find the probability that a
pregnant woman weighs more than 85 kg.
μ = 78, σ = 8
$$z = \frac{x - \mu}{\sigma} = \frac{85 - 78}{8} = 0.875 \approx 0.88$$
P(x > 85) = P(z > 0.88) = 1 − P(z < 0.88) = 1 − 0.8106 = 0.1894
The probability that a
randomly selected pregnant
woman weighs more than
85 kg is 0.1894.
[Figure: normal curve with μ = 78 and x = 85 marked, and the standard normal curve with μ = 0 and z = 0.88 marked; the shaded area is P(x > 85)]
Probability and Normal Distributions
Example:
c) From the above example, find the probability that a randomly
selected pregnant woman weighs between 60 and 80 kg.
μ = 78, σ = 8
$$z_1 = \frac{x - \mu}{\sigma} = \frac{60 - 78}{8} = -2.25$$
$$z_2 = \frac{x - \mu}{\sigma} = \frac{80 - 78}{8} = 0.25$$
P(60 < x < 80) = P(−2.25 < z < 0.25) = P(z < 0.25) − P(z < −2.25)
= 0.5987 − 0.0122 = 0.5865
The probability that a
randomly selected pregnant
woman weighs between 60
and 80 kg is 0.5865.
[Figure: normal curve with μ = 78 and x = 60, 80 marked, and the standard normal curve with μ = 0 and z = −2.25, 0.25 marked; the shaded area is P(60 < x < 80)]
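The three weight examples can be cross-checked without a table. This is a sketch using `statistics.NormalDist` from the Python standard library, with the mean of 78 kg and standard deviation of 8 kg taken from the example; the variable names are illustrative. Because the code uses the exact z-score 0.875 rather than the rounded table value 0.88, the answer to part (b) comes out near 0.191 instead of 0.1894.

```python
from statistics import NormalDist

weights = NormalDist(mu=78, sigma=8)  # weights of pregnant women, in kg

z_90 = (90 - 78) / 8                   # z-score for 90 kg
p_less_90 = weights.cdf(90)            # (a) P(x < 90)
p_more_85 = 1 - weights.cdf(85)        # (b) P(x > 85), exact z = 0.875
p_60_to_80 = weights.cdf(80) - weights.cdf(60)  # (c) P(60 < x < 80)

print(f"z for 90 kg    = {z_90}")
print(f"P(x < 90)      = {p_less_90:.4f}")
print(f"P(x > 85)      = {p_more_85:.4f}")
print(f"P(60 < x < 80) = {p_60_to_80:.4f}")
```

Standardizing by hand and calling `cdf` on the unstandardized distribution are equivalent; the library performs the z-transformation internally.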
Exercise 3
Calculate the following
probabilities when X is
normally distributed with mean 35 and
SD 2.
A. P(x<37),
B. P(x>40),
C. P(38<x<40)
Exercise
A population of newborn infants has a
mean weight of 2800 grams with a
standard deviation of 400 grams.
Based on this information, give a short
answer to the following questions.
1. What proportion of newborns will weigh above 2700 grams?
2. What is the probability that a randomly selected newborn
will weigh between 2700 and 2800 grams?
3. What is the 75th percentile, in grams, of the distribution of
newborn weights?
4. What is the probability that a randomly selected newborn
will weigh exactly 2900 grams?
Area between 0 and z
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
Table: Normal distribution

Probability and probability distribution.pptx

  • 1.
    05/04/2025 1 Probability andprobability distribution At the end of this chapter, students are expected to understand the following points  Probability (definition of terms, probability rules)  The difference between probability and probability distribution  Conditional probability  Distribution for categorical variable  Distribution for continuous variable  Different distribution tables
  • 2.
    05/04/2025 2 Probability definition Chanceof observing a particular outcome or Likelihood of an event Assumes a “stochastic” or “random” process: i.e.. the outcome is not predetermined there ‐ is an element of chance An outcome is a specific result of a single trial of a probability experiment
  • 3.
    05/04/2025 Chance • When ameteorologist states that the chance of rain is 50%, the meteorologist is saying that it is equally likely to rain or not to rain. If the chance of rain rises to 80%, it is more likely to rain. If the chance drops to 20%, then it may rain, but it probably will not rain. • These examples suggest the chance of an occurrence of some event of a random variable. 3
  • 4.
    05/04/2025 Probability and Probability Distributions 4 Probabilitiesand probability distributions are nothing more than extensions of the ideas of relative frequency and histograms, respectively.
  • 5.
    05/04/2025 Why Probability inMedicine? • Because medicine is an inexact science, physicians seldom predict an outcome with absolute certainty. • E.g., to formulate a diagnosis, a physician must rely on available diagnostic information about a patient – History and physical examination – Laboratory investigation, X-ray findings, ECG, etc 5
  • 6.
    05/04/2025 6 Cont… • Anunderstanding of probability is fundamental for quantifying the uncertainty that is inherent in the decision-making process. • Probability theory also allows us to draw conclusions about a population based on known information about a sample which drown from that population.
  • 7.
    05/04/2025 7 Conclusions/Inferences inscience are using probability
  • 8.
    05/04/2025 8 Terminology Random experiment/random variable: is one in which the out comes occur at random or cannot be predicted with certainty. e.g. A single coin tossing experiment is a random as the occurrence of Head(H) and Tail(T) Trial: A physical action , the result of which cannot be predetermined
  • 9.
    05/04/2025 9 Terminology… Sample Space:The set of all possible outcomes of an experiment . In die throwing, S={1,2,3,4,5,6} Events: Collections of basic outcomes from the sample space. We say that an event occurs if any one of the basic outcomes in the event occurs. Any subset of sample space. - Event of getting even number A={2,4,6} Success/ favorable case: Outcome that entail the happening of a desired event.
  • 10.
    05/04/2025 10 Equally likelyevents:  If in a random experiment all out comes have equal chance of occurrence. - In tossing coin both H and T have equal chance to occur Mutually Exclusive Events (Disjoint Events)  If the occurrence of one event prevent the occurrence of the other. - In tossing coin the occurrence of Head prevent the occurrence of Tail.
  • 11.
    05/04/2025 11 Cont… Independent events(mutualindependence)  The occurrence or non-occurrence of one event doesn’t affect the occurrence or non-occurrence of the other event in repeated trials, conduction of a random experiment. While tossing of two coin simultaneously, the occurrence of head in one coin does not affect the occurrence of tail on the other.
  • 12.
    05/04/2025 Two Categories ofProbability • Objective and Subjective Probabilities. • Objective probability 1) Classical probability and 2) Relative frequency probability. 12
  • 13.
    05/04/2025 13 Types ofprobability Classical Method Is based on gambling ideas  If there are n equally likely possibilities, of which one must occur and m are regarded as favorable, or as a “success,” then the probability of a “success” is m/n. P(A) = m/n What is the probability of rolling a 6 with a well-balanced die? Ans. In this case, m=1 and n=6, so that the probability is 1/6 = 0.167
  • 14.
    05/04/2025 Relative Frequency Probability •In the long run process ….. • The proportion of times the event A occurs — in a large number of trials repeated under essentially identical conditions • Definition: If a process is repeated a large number of times (n), and if an event with the characteristic E occurs m times, the relative frequency of E, Probability of E = P(E) = m/n. 14
  • 15.
    05/04/2025 15 Relative freq… Examples •If you toss a coin 100 times and head comes up 40 times, P(H) = 40/100 = 0.4. • If we toss a coin 10,000 times and the head comes up 5562, P(H) = 0.5562. • Therefore, the longer the series and the longer sample size, the closer the estimate to the true value (0.5)
  • 16.
    05/04/2025 16 Subjective Probability Personalistic (An opinion or judgment by a decision maker about the likelihood of an event) Personal assessment of which is more effective to provide cure traditional/modern ‐  Personal assessment of which sports team will win a match Also uses classical and relative frequency methods to assess the likelihood of an event, but does not rely on repeatability of any process.
  • 17.
    05/04/2025 Properties of Probability 1.The numerical value of a probability always lies between 0 and 1, inclusive. 0  P(E)  1  A value 0 means the event can not occur  A value 1 means the event definitely will occur  A value of 0.5 means that the probability that the event will occur is the same as the probability that it will not occur. 17
  • 18.
    05/04/2025 18 2. Thesum of the probabilities of all mutually exclusive outcomes is equal to 1. P(E1 ) + P(E2 ) + .... + P(En ) = 1. 3. For two mutually exclusive events A and B, P(A or B ) = P(AUB)= P(A) + P(B). If not mutually exclusive: P(A or B) = P(A) + P(B) - P(A and B)
  • 19.
    05/04/2025 19 4. Thecomplement of an event A, denoted by Ā or Ac , is the event that A does not occur – Consists of all the outcomes in which event A does NOT occur P(Ā) = P(not A) = 1 – P(A) – Ā occurs only when A does not occur. – These are complementary events.
  • 20.
    05/04/2025 Unions of TwoEvents “If A and B are events, then the union of A and B, denoted by AUB, represents the event composed of all basic outcomes in A or B.” Intersections of Two Events “If A and B are events, then the intersection of A and B, denoted by A n B, represents the event composed of all basic outcomes in A and B.” Unions and Intersections of Two Events 20 B =With lung cancer A=Cigarette smoking A n B=Smokers with lung cancer
  • 21.
    05/04/2025 21 Additive Lawof Probability Let A and B be two events in a sample space S. The probability of the union of A and B is ( ) ( ) ( ) ( ). P A B P A P B P A B      B A A n B
  • 22.
    05/04/2025 22 Mutually Exclusive Events MutuallyExclusive Events: Events that have no basic outcomes in common, or equivalently, their intersection is empty set. S B A Let A and B be two events in a sample space S. The probability of the union of two mutually exclusive events A and B is: ( ) ( ) ( ). P A B P A P B   
  • 23.
    05/04/2025 23 Two eventsare independent if the occurrence of one of the events does not affect the probability of the other event. That is, A and B are independent if : P (B |A) = P (B) or if P (A |B) = P (A). Independent Events Example: Let event A stands for “the sex of the first child from a mother is female”; and event B stands for “the sex of the second child from the same mother is female” Are A and B independent? Solution P(B/A) = P(B) = 0.5 The occurrence of A does not affect the probability of B, so the events are independent.
  • 24.
    05/04/2025 24 Multiplication rule –If A and B are independent events, then P(A ∩ B) = P(A) × P(B) – More generally, P(A ∩ B) = P(A) P(B|A) = P(B) P(A|B) P(A and B) denotes the probability that A and B both occur at the same time.
  • 25.
    05/04/2025 25 Conditional probabilitiesand the multiplicative law  Sometimes the chance a particular event happens depends on the outcome of some other event. This applies obviously with many events that are spread out in time.  Example: The chance a patient with some disease survives the next year depends on his having survived to the present time. Such probabilities are called conditional.  The notation is Pr(B/A), which is read as “the probability event B occurs given that event A has already occurred .”  Let A and B be two events of a sample space S. The conditional probability of an event A, given B, denoted by Pr ( A/B )= P(A n B) / P(B) , P(B) not = 0.  Similarly, P(B/A) = P(A n B) / P(A) , P(A)not =0. This can be taken as an alternative form of the multiplicative law.
  • 26.
    Conditional Probability The conditional probability of the event A given that event B has occurred is denoted by P(A|B). Then, P(A|B) = P(A ∩ B)/P(B), P(B) > 0. Similarly, P(B|A) = P(A ∩ B)/P(A), P(A) > 0. When do you use conditional probability? For example, when calculating sensitivity and specificity.
  • 27.
    Example 1 Calculating probability of an event
    Table 1: Frequency of cocaine use by sex among adult cocaine users
    Lifetime frequency of cocaine use    Male    Female    Total
    1-19 times                             32         7       39
    20-99 times                            18        20       38
    More than 100 times                    25         9       34
    Total                                  75        36      111
  • 28.
    Questions… 1. What is the probability that a randomly picked person is male? 2. What is the probability that a randomly picked person has used cocaine more than 100 times? 3. Given that the selected person is male, what is the probability that he has used cocaine more than 100 times? 4. Given that the person has used cocaine less than 100 times, what is the probability of being female? 5. What is the probability that a randomly picked person is male and has used cocaine more than 100 times?
  • 29.
    Solution 1. Pr(m) = Total adult males / Total adult cocaine users = 75/111 = 0.68. 2. Pr(c>100) = All adult cocaine users more than 100 times / Total adult cocaine users = 34/111 = 0.31. 3. Pr(c>100 | m) = 25/75 = 0.33. 4. Pr(f | c<100) = (7+20)/77 = 27/77 = 0.35. 5. Pr(m ∩ c>100) = Pr(m) × Pr(c>100 | m) = 75/111 × 25/75, or equivalently Pr(c>100) × Pr(m | c>100) = 34/111 × 25/34; either way the result is 25/111 = 0.23.
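The five answers can be checked directly from the counts in Table 1. The following is a minimal sketch, not part of the original slides; the dictionary layout and variable names are illustrative assumptions.

```python
# Counts copied from Table 1 (lifetime frequency of cocaine use by sex).
# The dictionary keys and variable names are illustrative, not from the slides.
counts = {
    "1-19": {"male": 32, "female": 7},
    "20-99": {"male": 18, "female": 20},
    "100+": {"male": 25, "female": 9},
}

total = sum(sum(row.values()) for row in counts.values())   # 111 users
males = sum(row["male"] for row in counts.values())         # 75 males
heavy = sum(counts["100+"].values())                        # 34 users, more than 100 times
under_100 = total - heavy                                   # 77 users, fewer than 100 times
females_under_100 = counts["1-19"]["female"] + counts["20-99"]["female"]  # 27

p_male = males / total                                      # question 1
p_heavy = heavy / total                                     # question 2
p_heavy_given_male = counts["100+"]["male"] / males         # question 3
p_female_given_under_100 = females_under_100 / under_100    # question 4
p_male_and_heavy = counts["100+"]["male"] / total           # question 5

# Multiplication rule: P(m and c>100) = P(m) * P(c>100 | m)
assert abs(p_male_and_heavy - p_male * p_heavy_given_male) < 1e-12

for label, value in [
    ("P(male)", p_male),
    ("P(c>100)", p_heavy),
    ("P(c>100 | male)", p_heavy_given_male),
    ("P(female | c<100)", p_female_given_under_100),
    ("P(male and c>100)", p_male_and_heavy),
]:
    print(f"{label:<20} {value:.2f}")
```

Note that the joint probability in question 5 is read straight from the cell count (25/111); the multiplication rule gives the same value only when the conditional probability, not the marginal one, is used as the second factor.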
  • 30.
  • 31.
    Application of probability of categorical variables • By calculating the probability of an event in epidemiological studies, we can estimate the prevalence of certain diseases in a given population. – Prevalence of a disease (e.g. tuberculosis, diabetes, heart disease), – Prevalence of certain characteristics (e.g. high blood pressure, low birth weight), or – Prevalence of certain behaviors (e.g. smoking, drug use, condom use).
  • 32.
    Probability Distributions • A probability distribution is a device used to describe the behavior that a random variable may have by applying the theory of probability. • It is a list of the probabilities associated with the values of the random variable obtained in an experiment. • It describes the way data are distributed and is used to draw conclusions about a set of data. • Random Variable = Any quantity or characteristic that is able to assume a number of different values such that any particular outcome is determined by chance
  • 33.
    Probability distribution… ♣ A probability distribution of a random variable can be displayed by a table, a graph, or a mathematical formula. ♣ With categorical variables, we obtain the frequency distribution of each variable. ♣ With numeric variables, the aim is to determine whether or not normality may be assumed.
  • 34.
    Therefore, the probability distribution of a random variable is a table, graph, or mathematical formula that gives the probabilities with which the random variable takes different values or ranges of values.
  • 35.
    A. Discrete Probability Distributions • For a discrete random variable, the probability distribution specifies each of the possible outcomes of the random variable along with the probability that each will occur. • Examples can be: – Frequency distribution – Relative frequency distribution – Cumulative frequency
  • 36.
    The following data show the number of diagnostic services a patient receives
  • 37.
    • What is the probability that a patient receives exactly 3 diagnostic services? P(X = 3) = 0.031 • What is the probability that a patient receives at most one diagnostic service? P(X ≤ 1) = P(X = 0) + P(X = 1) = 0.671 + 0.229 = 0.900
  • 38.
    • What is the probability that a patient receives at least four diagnostic services? P(X ≥ 4) = P(X = 4) + P(X = 5) = 0.010 + 0.006 = 0.016
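The "at most" and "at least" calculations above are sums over a probability table, which can be sketched as follows. This is an illustration only: the probabilities for X = 0, 1, 3, 4 and 5 are the ones quoted in the worked answers, while P(X = 2) is not quoted anywhere in this section and is inferred here on the assumption that the six probabilities sum to 1.

```python
# Probabilities quoted in the worked answers for the number of diagnostic services X.
pmf = {0: 0.671, 1: 0.229, 3: 0.031, 4: 0.010, 5: 0.006}

# ASSUMPTION: P(X = 2) is not quoted in the text; it is inferred so that
# the probabilities over X = 0..5 sum to 1.
pmf[2] = round(1 - sum(pmf.values()), 3)


def prob_at_most(k):
    """P(X <= k): add the probabilities of all values no larger than k."""
    return sum(p for x, p in pmf.items() if x <= k)


def prob_at_least(k):
    """P(X >= k): add the probabilities of all values no smaller than k."""
    return sum(p for x, p in pmf.items() if x >= k)


print(f"P(X = 3)  = {pmf[3]:.3f}")
print(f"P(X <= 1) = {prob_at_most(1):.3f}")
print(f"P(X >= 4) = {prob_at_least(4):.3f}")
```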
  • 39.
    Probability distributions can also be displayed using a graph [Figure: bar chart of probability P(X = x) against the number of diagnostic services, x]
  • 40.
    Binomial Distribution • It is one of the most widely encountered discrete probability distributions. • It applies to a dichotomous (binary) random variable. • It is based on the Bernoulli trial – a single trial of an experiment that can result in only one of two mutually exclusive outcomes (success or failure; dead or alive; sick or well; male or female)
  • 41.
    A binomial probability distribution occurs when the following requirements are met. 1. The procedure has a fixed number of trials. 2. The trials must be independent. 3. Each trial must have outcomes that fall into exactly two categories. 4. The probability of success must remain constant for each trial [P(success) = p].
  • 42.
    Binomial Distribution A process that has only two possible outcomes is called a binomial process. In statistics, the two outcomes are frequently denoted as success and failure. The probabilities of a success or a failure are denoted by p and q, respectively. Note that p + q = 1. The binomial distribution gives the probability of exactly k successes in n trials: $P(k) = \binom{n}{k} p^{k} (1-p)^{n-k}$
  • 43.
    Binomial distribution, generally Note the general pattern emerging: if you have only two possible outcomes (call them 1/0 or yes/no or success/failure) in n independent trials, then the probability of exactly X “successes” = $\binom{n}{X} p^{X} (1-p)^{n-X}$, where n = number of trials, X = # successes out of n trials, p = probability of success, and 1-p = probability of failure.
  • 44.
    • n denotes the fixed number of trials • x denotes the number of successes in the n trials • p denotes the probability of success • q denotes the probability of failure (1 - p) • $\binom{n}{x} = \frac{n!}{x!(n-x)!}$ represents the number of ways of selecting x objects out of n where the order of selection does not matter, • where n! = n(n-1)(n-2)…(1), and 0! = 1
  • 45.
    Example 2 • Suppose we know that 40% of a certain population are cigarette smokers. If we take a random sample of 10 people from this population, what is the probability that we will have exactly 4 smokers in our sample?
  • 46.
    • If the probability that any individual in the population is a smoker is p = 0.40, then the probability that x = 4 smokers out of n = 10 subjects selected is: P(X = 4) = 10C4 (0.4)^4 (1 - 0.4)^(10-4) = 10C4 (0.4)^4 (0.6)^6 = 210 (0.0256)(0.04666) = 0.25 • The probability of obtaining exactly 4 smokers in the sample is about 0.25.
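The smoker calculation can be reproduced with a few lines of code. This is a minimal sketch using only the Python standard library; the function name `binomial_pmf` is an illustrative choice, not something defined in the slides.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each with success probability p: C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Example 2: 40% smokers in the population, sample of 10, exactly 4 smokers.
ways = comb(10, 4)                 # number of ways to choose which 4 of the 10 smoke
p_four = binomial_pmf(4, 10, 0.4)

print(f"10C4 = {ways}")
print(f"P(X = 4) = {p_four:.4f}")
```

The unrounded result is slightly above 0.25, which agrees with the "about 0.25" stated on the slide.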
  • 47.
    • We can compute the probability of observing zero smokers out of 10 subjects selected at random, exactly 1 smoker, and so on, and display the results in a table, as given below. • The third column, P(X ≤ x), gives the cumulative probability. E.g. the probability of selecting 3 or fewer smokers into the sample of 10 subjects is P(X ≤ 3) = 0.3823, or about 38%.
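A table of this kind (number of smokers, probability, cumulative probability) can be generated as sketched below. The column layout is an assumption about how such a table is usually arranged; only n = 10, p = 0.4 and the value P(X ≤ 3) = 0.3823 come from the text.

```python
from math import comb

n, p = 10, 0.4  # sample size and smoking prevalence from Example 2

rows = []          # each row: (x, P(X = x), P(X <= x))
cumulative = 0.0
for x in range(n + 1):
    prob = comb(n, x) * p**x * (1 - p) ** (n - x)
    cumulative += prob
    rows.append((x, prob, cumulative))

print(" x   P(X = x)   P(X <= x)")
for x, prob, cum in rows:
    print(f"{x:>2}   {prob:.4f}     {cum:.4f}")
```

The cumulative column is a running sum of the second column, so its final entry equals 1.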
  • 48.
  • 49.
    The probabilities in the above table can be converted into the following graph [Figure: bar chart of probability against number of smokers, 0 to 10]
  • 50.
    II. Probability distribution of continuous variables • Under different circumstances, the outcome of a random variable may not be limited to categories or counts. – E.g. Suppose X represents the continuous variable ‘Height’; rarely is an individual exactly 170 cm tall. – X can assume an infinite number of intermediate values: 170.1, 170.2, 170.3, etc. • Because a continuous random variable X can take on an uncountably infinite number of values, the probability associated with any one particular value is zero.
  • 51.
    Continuous Probability Distributions • There are an infinite number of continuous random variables • We try to pick a model that – Fits the data well – Allows us to make the best possible inferences using the data. [Figure: density curves f(x) against x for uniform, normal, and skewed distributions]
  • 52.
    Properties of Normal Distributions The most important probability distribution in statistics is the normal distribution. A normal distribution is a continuous probability distribution for a random variable, x. The graph of a normal distribution is called the normal curve. [Figure: normal curve plotted against x]
  • 53.
    The Normal Distribution • The formula that generates the normal probability distribution is: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$ where σ = population standard deviation, µ = population mean, e = 2.718…, π = 3.14… This is a bell-shaped curve with different centers and spreads depending on µ and σ.
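To see how the density formula behaves, it can be coded directly and compared against the standard library. This is a sketch only: the function name `normal_pdf` and the values µ = 165, σ = 7 are hypothetical choices for illustration and do not come from the slides.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sigma * sqrt(2 * pi))


# Hypothetical parameters, chosen only to illustrate the formula.
mu, sigma = 165.0, 7.0

# The hand-written formula should agree with the library implementation.
by_hand = normal_pdf(170.0, mu, sigma)
by_library = NormalDist(mu, sigma).pdf(170.0)

# The curve is symmetric about the mean and highest at the mean.
left = normal_pdf(mu - 5, mu, sigma)
right = normal_pdf(mu + 5, mu, sigma)
peak = normal_pdf(mu, mu, sigma)

print(by_hand, by_library, left, right, peak)
```

The value returned is a density (a curve height), not a probability; probabilities come from areas under the curve.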
  • 54.
    Normal Curve Characteristics 1. It is a probability distribution of a continuous variable. It extends from minus infinity to plus infinity. 2. It is unimodal, bell-shaped and symmetric. 3. The mean, the median and the mode are all equal. 4. The curve approaches, but never meets, the abscissa at both high and low ends. 5. The total area under the curve is 1. (This is a requirement of any probability density function.) 6. It is determined by two quantities: its mean and SD. Changing the mean alone shifts the entire normal curve to the left or right. Changing the SD alone changes the degree to which the distribution is spread out.
  • 55.
  • 56.
    7. The height of the frequency curve, which is called the probability density, cannot be taken as the probability of a particular value. • An observation from a normal distribution can be related to a standard normal distribution (SND), which has a published table.
  • 57.
    The standard normal distribution Since a normal distribution can have any of an infinite number of possible values for its mean and SD, it is impossible to tabulate the area associated with each and every normal curve. Instead, only a single curve for which μ = 0 and σ = 1 is tabulated. This curve is called the standard normal distribution (SND).
  • 58.
    The Standard Normal Distribution • To find P(a < x < b), we need to find the area under the appropriate normal curve. • To simplify the tabulation of these areas, we standardize each value of x by expressing it as a z-score, the number of standard deviations σ it lies from the mean μ: $z = \frac{x - \mu}{\sigma}$
  • 59.
    The Standard Normal (z) Distribution • Mean = 0; Standard deviation = 1 • When x = μ, z = 0 • Symmetric about z = 0 • Values of z to the left of center are negative • Values of z to the right of center are positive • Total area under the curve is 1.
  • 60.
  • 61.
    Using normal table The four-digit probability in a particular row and column of Table 1 gives the area under the z curve to the left of that particular value of z. [Figure: area for z = 1.36]
  • 62.
    Example 4 Use Table 1 to calculate these probabilities: P(z ≤ 1.36) = 0.9131; P(z > 1.36) = 1 - 0.9131 = 0.0869; P(-1.20 ≤ z ≤ 1.36) = 0.9131 - 0.1151 = 0.7980
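The table lookups in Example 4 can be cross-checked numerically. The sketch below computes the area to the left of z from the error function in the Python standard library; the helper name `phi` is an illustrative choice and is not taken from the slides.

```python
from math import erf, sqrt


def phi(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


left = phi(1.36)                    # P(z <= 1.36)
right = 1.0 - phi(1.36)             # P(z > 1.36), the complement
between = phi(1.36) - phi(-1.20)    # P(-1.20 <= z <= 1.36)

print(f"P(z <= 1.36)          = {left:.4f}")
print(f"P(z > 1.36)           = {right:.4f}")
print(f"P(-1.20 <= z <= 1.36) = {between:.4f}")
```

Every "between" probability is a difference of two left-tail areas, which is why a single left-tail table is enough for all three questions.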
  • 63.
    Example 5: Probability that Z is between -2.59 and 1.31 P(-2.59 ≤ Z ≤ 1.31) = P(0 < Z ≤ 1.31) + P(-2.59 < Z ≤ 0) = 0.4049 + 0.4952 = 0.9001
  • 64.
    Exercises 2 Find the probability of the following under the SND – Above 1.96? – Below -1.96? Below 1.96? – Between -1.28 and 1.28? – Between -1.65 and 1.08? (0.8104) – What level cuts the upper 25%? – What level cuts the middle 99%?
  • 65.
    Probability and Normal Distributions Example: The average weight of pregnant women attending prenatal care in a clinic was 78 kg with a standard deviation of 8 kg. If the weights are normally distributed: a) Find the probability that a randomly selected pregnant woman weighs less than 90 kg. With μ = 78 and σ = 8, z = (x - μ)/σ = (90 - 78)/8 = 1.5, so P(x < 90) = P(z < 1.5) = 0.9332. The probability that a randomly selected pregnant woman weighs less than 90 kg is 0.9332.
  • 66.
    Probability and Normal Distributions Example: b) Based on the above example, find the probability that a pregnant woman weighs more than 85 kg. With μ = 78 and σ = 8, z = (x - μ)/σ = (85 - 78)/8 = 0.875 ≈ 0.88, so P(x > 85) = P(z > 0.88) = 1 - P(z < 0.88) = 1 - 0.8106 = 0.1894. The probability that a randomly selected pregnant woman weighs more than 85 kg is 0.1894.
  • 67.
    Probability and Normal Distributions Example: From the above example, find the probability that a randomly selected pregnant woman weighs between 60 and 80 kg. With μ = 78 and σ = 8, z1 = (60 - 78)/8 = -2.25 and z2 = (80 - 78)/8 = 0.25, so P(60 < x < 80) = P(-2.25 < z < 0.25) = P(z < 0.25) - P(z < -2.25) = 0.5987 - 0.0122 = 0.5865. The probability that a randomly selected pregnant woman weighs between 60 and 80 kg is 0.5865.
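The three weight questions can be checked without a printed table. This is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

# Weights of pregnant women from the example: mean 78 kg, SD 8 kg.
weights = NormalDist(mu=78, sigma=8)

# a) P(x < 90): area to the left of 90 kg.
p_less_90 = weights.cdf(90)

# b) P(x > 85): complement of the area to the left of 85 kg.
#    No rounding of z is done here, so the result differs slightly from
#    the table-based answer, which rounds z = 0.875 up to 0.88.
p_more_85 = 1 - weights.cdf(85)

# c) P(60 < x < 80): difference of two left-tail areas.
p_60_to_80 = weights.cdf(80) - weights.cdf(60)

print(f"P(x < 90)      = {p_less_90:.4f}")
print(f"P(x > 85)      = {p_more_85:.4f}")
print(f"P(60 < x < 80) = {p_60_to_80:.4f}")
```

Answers a) and c) match the table-based values to four decimals; answer b) differs in the third decimal only because the slide rounds the z-score before the lookup.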
  • 68.
    Exercise 3 Calculate the following probabilities when X is normally distributed with mean 35 and SD 2. A. P(x < 37), B. P(x > 40), C. P(38 < x < 40)
  • 69.
    Exercise A population of newborn infants has a mean weight of 2800 grams with a standard deviation of 400 grams. Based on this information, give a short answer to the following questions. 1. What proportion of newborns will weigh above 2700 grams? 2. What is the probability that a randomly selected newborn will weigh between 2700 and 2800 grams? 3. What is the 75th percentile, in grams, for the distribution of weight of newborns? 4. What is the probability that a randomly selected newborn will weigh exactly 2900 grams?
  • 70.
  • 71.
    Table: Normal distribution. Area between 0 and z
    z     0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
    0.0   0.0000  0.0040  0.0080  0.0120  0.0160  0.0199  0.0239  0.0279  0.0319  0.0359
    0.1   0.0398  0.0438  0.0478  0.0517  0.0557  0.0596  0.0636  0.0675  0.0714  0.0753
    0.2   0.0793  0.0832  0.0871  0.0910  0.0948  0.0987  0.1026  0.1064  0.1103  0.1141
    0.3   0.1179  0.1217  0.1255  0.1293  0.1331  0.1368  0.1406  0.1443  0.1480  0.1517
    0.4   0.1554  0.1591  0.1628  0.1664  0.1700  0.1736  0.1772  0.1808  0.1844  0.1879
    0.5   0.1915  0.1950  0.1985  0.2019  0.2054  0.2088  0.2123  0.2157  0.2190  0.2224
    0.6   0.2257  0.2291  0.2324  0.2357  0.2389  0.2422  0.2454  0.2486  0.2517  0.2549
    0.7   0.2580  0.2611  0.2642  0.2673  0.2704  0.2734  0.2764  0.2794  0.2823  0.2852
    0.8   0.2881  0.2910  0.2939  0.2967  0.2995  0.3023  0.3051  0.3078  0.3106  0.3133
    0.9   0.3159  0.3186  0.3212  0.3238  0.3264  0.3289  0.3315  0.3340  0.3365  0.3389
    1.0   0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577  0.3599  0.3621
    1.1   0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790  0.3810  0.3830
    1.2   0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980  0.3997  0.4015
    1.3   0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147  0.4162  0.4177
    1.4   0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292  0.4306  0.4319
    1.5   0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418  0.4429  0.4441
    1.6   0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525  0.4535  0.4545
    1.7   0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616  0.4625  0.4633
    1.8   0.4641  0.4649  0.4656  0.4664  0.4671  0.4678  0.4686  0.4693  0.4699  0.4706
    1.9   0.4713  0.4719  0.4726  0.4732  0.4738  0.4744  0.4750  0.4756  0.4761  0.4767
  • 72.
  • 73.

Editor's Notes

  • #35 The probability distribution of a discrete random variable is a table, graph, formula, or other device used to specify all possible values of a discrete random variable along with their respective probabilities. The relationship between the values and their associated probabilities is called a probability mass function.
  • #37 Many random variables are displayed in tables or figures in terms of a cumulative distribution function rather than a distribution of probabilities of individual values. The basic concept is to assign to each individual value the sum of probabilities of all values that are no larger than the value being considered. Thus, the cumulative distribution function of a random variable X is denoted by F(X) and, for a specific value x of X, is defined by P(X≤ x) and denoted by F(x).
  • #40 The binomial distribution applies to a sample of n independent trials, each of which can have only two possible outcomes, denoted as “success” and “failure”.
  • #49 Note that, although p is a fraction, its sampling distribution is discrete and not continuous, since it may take only a limited number of values for any given sample size. As the sample size n increases the binomial distribution becomes very close to a normal distribution, and this can be used to calculate confidence intervals and carry out hypothesis tests. In fact the normal distribution can be used as a reasonable approximation to the binomial distribution if both np and n-np are 10 or more. This approximating normal distribution has the same mean and standard error as the binomial distribution.
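The normal approximation described in the last note can be illustrated with a short comparison. This is a sketch under stated assumptions: n = 100 and the cut-off of 45 are hypothetical values chosen so that both np and n - np are at least 10, p = 0.4 is borrowed from the smoker example, and the continuity correction of 0.5 is a common refinement that the note itself does not mention.

```python
from math import comb, sqrt
from statistics import NormalDist

# Hypothetical setting: 100 people sampled, 40% smokers in the population.
n, p = 100, 0.4
k = 45  # hypothetical cut-off: probability of 45 or fewer smokers

# Rule of thumb from the note: np and n - np should both be 10 or more.
rule_ok = n * p >= 10 and n - n * p >= 10

# Exact binomial probability P(X <= k).
exact = sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))

# Approximating normal curve for the count X: mean np, SD sqrt(np(1 - p)).
mean = n * p
sd = sqrt(n * p * (1 - p))

# ASSUMPTION: the 0.5 continuity correction is added here; it is not in the note.
approx = NormalDist(mean, sd).cdf(k + 0.5)

print(f"rule of thumb satisfied: {rule_ok}")
print(f"exact binomial  P(X <= {k}) = {exact:.4f}")
print(f"normal approx.  P(X <= {k}) = {approx:.4f}")
```

With the sample of 10 used in Example 2, np = 4 falls below the threshold, so the exact binomial calculation is the appropriate one there.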