► Estimation
► Hypothesis testing
Statistical inference
7/5/2023
1
By Asaye
Objectives
After completing this session, learners will be able to perform:
 Parameter estimations
Point estimate
Confidence interval
 Hypothesis testing
Z-test
T-test
 Testing associations
Chi-square test
Sampling distribution
• A sampling distribution is the distribution of all possible values of a statistic computed from samples of the same size randomly selected from the same population.
• Equivalently, it is the probability distribution of a sample statistic.
• It is formed when samples of size n are repeatedly taken from the population.
• Some sample values will be higher than the population parameter and some will be lower.
• We treat sample statistics as random variables.
For example:
• The age of an individual is a random variable.
• Similarly, the mean age of a sample is a random variable.
• No conclusion about the value of a population parameter should be based on a single individual value.
• It should be based on a sample statistic computed from a sample of adequate size.
Construction of sampling distributions
1. From a population of size N, randomly draw all possible samples
of size n.
2. Compute the statistic of interest for each sample.
3. Create a frequency distribution of the statistic.
A. Sampling distribution of sample mean
Example: sampling distribution of sample mean
The population values {18, 20, 22, 24} are put in a box. Two observations are randomly selected, with replacement.
Find the mean, variance, and standard deviation of the population.
Solution:
Mean: $\mu = \frac{\sum X_i}{N} = \frac{84}{4} = 21$
Variance: $\sigma^2 = \frac{\sum (X_i - \mu)^2}{N} = \frac{20}{4} = 5$
Standard deviation: $\sigma = \sqrt{5} = 2.236$
Now consider all possible samples of size n = 2. Sampling with replacement gives 16 possible samples and therefore 16 sample means.
List all the possible samples of size n = 2 and calculate the mean of each sample.
Solution:

Samples   Mean     Samples   Mean
18, 18    18       22, 18    20
18, 20    19       22, 20    21
18, 22    20       22, 22    22
18, 24    21       22, 24    23
20, 18    19       24, 18    21
20, 20    20       24, 20    22
20, 22    21       24, 22    23
20, 24    22       24, 24    24

These means form the sampling distribution of sample means.
Construct the frequency distribution of the sample means:

Sample mean   Frequency   Probability
18            1           1/16
19            2           2/16
20            3           3/16
21            4           4/16
22            3           3/16
23            2           2/16
24            1           1/16
Find the mean, variance, and standard deviation of the 16 sample means:
Mean: $\mu_{\bar{x}} = \frac{\sum \bar{x}_i}{16} = \frac{18 + 19 + 20 + \cdots + 24}{16} = 21$
Variance: $\sigma_{\bar{x}}^2 = \frac{\sum (\bar{x}_i - \mu_{\bar{x}})^2}{16} = \frac{40}{16} = 2.5$, so $\sigma_{\bar{x}} = \sqrt{2.5} = 1.581$
These results satisfy the properties of the sampling distribution of sample means:
$\mu_{\bar{x}} = \mu = 21$, $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{\sqrt{5}}{\sqrt{2}} = 1.581$
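The construction steps for this example can be checked by brute force. The following is a minimal sketch, using only the Python standard library, that enumerates all 16 samples of size 2 drawn with replacement from {18, 20, 22, 24}. The variable names are illustrative and not part of the original slides.

```python
from collections import Counter
from itertools import product
from statistics import mean, pvariance
import math

population = [18, 20, 22, 24]
n = 2  # sample size

# Step 1: draw all possible samples of size n (with replacement).
samples = list(product(population, repeat=n))

# Step 2: compute the statistic of interest (the mean) for each sample.
sample_means = [mean(s) for s in samples]

# Step 3: build the frequency distribution of the statistic.
frequency = Counter(sample_means)

mu = mean(population)             # population mean
sigma2 = pvariance(population)    # population variance (divides by N)
mu_xbar = mean(sample_means)      # mean of the sample means
var_xbar = pvariance(sample_means)  # variance of the sample means

for value in sorted(frequency):
    print(value, frequency[value], f"{frequency[value] / len(samples):.4f}")
print("mean of sample means:", mu_xbar)
print("standard error:", round(math.sqrt(var_xbar), 3))
```

The enumeration is only practical for very small populations; it is meant to illustrate the definition rather than serve as a general method.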
Sample means by first and second observation:

1st observation   2nd observation
                  18    20    22    24
18                18    19    20    21
20                19    20    21    22
22                20    21    22    23
24                21    22    23    24
[Figure: distribution of the 16 sample means; horizontal axis: sample mean (18 to 24), vertical axis: probability (0 to 0.3)]
Comparing the population with its sampling distribution
[Figure, two panels: the population distribution (N = 4; x = 18, 20, 22, 24, each with P(x) = 1/4; μ = 21, σ = 2.236) beside the distribution of sample means (n = 2; sample means from 18 to 24; mean of sample means = 21, standard error = 1.58)]
Properties of sampling distribution of mean
A. Sampling from normally distributed populations
a. If a population is normal with mean μ and standard deviation σ, the sampling distribution of the sample mean is also normally distributed, with $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \sigma/\sqrt{n}$. The standard deviation of any sample statistic is called its standard error.
b. The mean $\mu_{\bar{x}}$ of the distribution of sample means is equal to the mean of the population from which the samples were drawn.
c. The variance of the distribution of sample means is equal to the variance of the population divided by the sample size: $\sigma_{\bar{x}}^2 = \sigma^2/n$.
Sampling from non-normally distributed populations
We can apply the Central Limit Theorem:
• Even if the population is not normal, sample means will be approximately normally distributed when samples of size n ≥ 30 are drawn from a population with mean μ and standard deviation σ.
• The sampling distribution of sample means has $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \sigma/\sqrt{n}$.
Sampling distribution of Proportion
o Suppose we choose a random sample of size n. The sampling distribution of the sample proportion p possesses the following properties.
o The sample proportion p is an estimate of the population proportion P.
o The standard deviation of p is equal to $\sqrt{P(1-P)/n}$, called the standard error of the proportion; in practice it is estimated by $\sqrt{p(1-p)/n}$.
o Provided n is large enough, the shape of the sampling distribution of p is approximately normal.
Types of estimation
There are two methods of estimation:
1. Point estimation
2. Interval estimation
 Point estimation involves the calculation of a single value to
estimate the population parameter.
 Interval estimation specifies a range of values assumed to
include population parameter.
1. Point Estimation
 A parameter : is a numerical descriptive measure of a population
(e.g. μ).
 A statistic: is a numerical descriptive measure of a sample (e.g.
𝑋). It estimates the population parameter.
 A point estimate of some population parameter is a single value
of a sample statistic.
 To each sample statistic there corresponds a population parameter.
Sample statistic & corresponding population parameter
Sample statistic
 Sample mean ( 𝑋 )
 Sample variance (S2 )
 Sample Standard deviation (SD)
 Sample proportion (p)
Population parameters
 μ (population mean)
 σ2 (population variance)
 σ(population standard deviation)
 P or π (Population proportion)
Point estimation…
If a random sample of 100 drug-related patients has a mean survival time of 46.9 months, what is the point estimate of the population mean?
Answer: 46.9 months
2. Interval Estimation
• A point estimate does not give any indication of how far away the parameter lies.
• An interval estimate, in contrast, gives a range that has a high probability of containing the parameter value.
• An interval estimate is a statement that a population parameter has a value lying between two specified limits.
• Such interval estimates are called confidence intervals (CI).
Confidence Interval (CI)
• A confidence interval defines an interval within which the true population parameter is likely to fall (interval estimate).
• A confidence interval therefore takes into account the sample-to-sample variation of the statistic and gives a measure of precision.
• Confidence intervals express the inherent uncertainty in any medical study by giving upper and lower bounds for the anticipated true underlying population parameter.
• 95% confidence intervals are most commonly calculated; however, 90% and 99% confidence intervals are sometimes used.
• The probability that the interval-construction procedure captures the true population parameter is (1 − α)100%.
• If we were to select 100 random samples from the population and calculate a 95% confidence interval for each, approximately 95 of them would include the true population mean μ (and 5 would not).
Interval Estimate components
Estimator ± Margin of error
Estimator ± (Reliability coefficient) × (Standard error)
• Precision of the estimate, or margin of error (d) = reliability coefficient × standard error.
Where:
• The reliability coefficient (RC) is the 100(1 − α/2)th percentile of the given probability distribution.
• The standard error (SE) is the standard deviation of the sampling distribution of the sample statistic.
Reliability Coefficient
The reliability coefficient is the standardized "z" value corresponding to the given level of confidence.
Z = 1.645 if your confidence level is 90%
Z = 1.96 if your confidence level is 95%
Z = 2.58 if your confidence level is 99%
A wide interval suggests imprecise estimation.
A narrow CI reflects a large sample size, low variability, or a low confidence level.
E.g. if you had a confidence level of 99%, the confidence coefficient would be 0.99.
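The z values listed for each confidence level can be reproduced from the standard normal distribution. This is a small sketch using Python's standard library; the function name `z_critical` is an illustrative choice, not something defined in the slides.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided reliability coefficient for a given confidence level.

    For confidence = 1 - alpha, this is the 100(1 - alpha/2)th percentile
    of the standard normal distribution.
    """
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

Tables usually round these values, which is why 2.58 is commonly quoted for the 99% level.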
Confidence Level
The confidence level is the probability that the interval estimate will contain the parameter, assuming that a large number of samples are selected and that the estimation process on the same parameter is repeated.
• Denoted by 100(1 − α)%.
• A relative frequency interpretation:
  - In the long run, 100(1 − α)% of all confidence intervals that can be constructed will contain the unknown parameter.
  - A specific interval will either contain or not contain the unknown parameter.
Normal or t-distribution
Is n ≥ 30?
  Yes: Use the normal distribution (Z). If σ is unknown, use s instead.
  No: Is the population normally, or approximately normally, distributed?
    No: Cannot use the normal or t-distribution.
    Yes: Is σ known?
      Yes: Use the normal distribution (Z).
      No: Use the t-distribution with n − 1 degrees of freedom.
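The decision flow above can be written as a small function. This is only a sketch of the rule of thumb as read from the flowchart; the function name, argument names, and return labels are hypothetical.

```python
def choose_distribution(n, sigma_known, population_normal):
    """Return "z", "t", or "neither" following the flowchart's rule of thumb.

    n                 -- sample size
    sigma_known       -- True if the population standard deviation is known
    population_normal -- True if the population is (approximately) normal
    """
    if n >= 30:
        # Large sample: use Z; substitute s for sigma if sigma is unknown.
        return "z"
    if not population_normal:
        # Small sample from a non-normal population: neither applies.
        return "neither"
    # Small sample from a normal population.
    return "z" if sigma_known else "t"


print(choose_distribution(81, True, True))    # large sample
print(choose_distribution(20, False, True))   # small sample, sigma unknown
print(choose_distribution(10, True, False))   # small sample, non-normal
```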
Confidence Interval for single population mean
1. When the variance is known (sample size large or small), the C.I. has the form:
$\bar{X} - Z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}$
or, when n ≥ 30 but σ is unknown, $\bar{X} \pm Z_{\alpha/2}\,\frac{S}{\sqrt{n}}$.
2. When the variance is unknown and the sample size is small, the C.I. has the form:
$\bar{X} - t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$, with d.f. = n − 1
Example
E.g. In a normally distributed population, the mean reading speed of a random sample of 81 adults is 325 words per minute. Find a 90% C.I. for the mean reading speed of all adults (μ) if it is known that the standard deviation for all adults is 45 words per minute.
Given: n = 81, σ = 45, sample mean = 325, so σ/√n = 45/9 = 5
$Z_{\alpha/2} = 1.645$
A 90% C.I. for μ is 325 ± (1.645 × 5) = 325 ± 8.2 = (316.8, 333.2)
Therefore, a 90% CI for μ is 316.8 to 333.2 words per minute.
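The reading-speed interval can be recomputed as follows. This is a minimal sketch for the known-σ case only; the helper name `mean_ci_known_sigma` is hypothetical.

```python
import math
from statistics import NormalDist


def mean_ci_known_sigma(xbar, sigma, n, confidence):
    """Confidence interval for a mean when sigma is known: xbar ± z * sigma/sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin


# Values from the reading-speed example: n = 81, sigma = 45, xbar = 325.
lower, upper = mean_ci_known_sigma(xbar=325, sigma=45, n=81, confidence=0.90)
print(f"90% CI: ({lower:.1f}, {upper:.1f})")
```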
CI for the difference of means & independent samples
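1. When the variances are known:
$CI = (\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
2. When the variances are unknown and the sample size is less than 30, use the t-distribution instead of the z-distribution:
$CI = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,n_1+n_2-2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$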
Example
If a random sample of 50 non-smokers have a mean life of 76 years
with a standard deviation of 8 years, and a random sample of 65
smokers live 68 years with a standard deviation of 9 years,
Find a 95% C.I for the difference of mean lifetime of non-smokers and
smokers?
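One possible way to work this example is sketched below. It assumes that, because both samples are large (50 and 65), the z-based formula can be used with the sample standard deviations standing in for the population values; that choice, and the function name `diff_means_ci`, are assumptions rather than something stated on the slide.

```python
import math
from statistics import NormalDist


def diff_means_ci(x1, s1, n1, x2, s2, n2, confidence):
    """Large-sample CI for mu1 - mu2: (x1 - x2) ± z * sqrt(s1^2/n1 + s2^2/n2)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = x1 - x2
    return diff - z * se, diff + z * se


# Non-smokers: n = 50, mean 76, SD 8. Smokers: n = 65, mean 68, SD 9.
lower, upper = diff_means_ci(76, 8, 50, 68, 9, 65, confidence=0.95)
print(f"95% CI for the difference in mean lifetime: ({lower:.2f}, {upper:.2f}) years")
```

Under these assumptions the interval lies entirely above zero, which would suggest a longer mean lifetime among non-smokers.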
Confidence Interval for a Single Population proportion (P):
• A sample is drawn from the population of interest, and the sample proportion is computed as
$\hat{P} = \frac{\text{no. of elements in the sample with some characteristic}}{\text{total no. of elements in the sample}} = \frac{x}{n}$
• This sample proportion is used as the point estimator of the population proportion.
• A 100(1 − α)% confidence interval for P is
$\hat{P} \pm Z_{1-\alpha/2}\sqrt{\frac{\hat{P}(1-\hat{P})}{n}}$
2. In Addis Ababa, a survey of 350 students showed that 28% carried their lunch to school. Find the 95% CI for the true population proportion of students who carried their lunch to school.
3. Suppose that 22 out of 100 people sampled in Debre Tabor were obese. Find the 95% confidence interval for the true population proportion.
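Both exercises use the same single-proportion formula, so they can be checked with one helper. This is a sketch using the normal approximation; the function name `proportion_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def proportion_ci(p_hat, n, confidence):
    """Normal-approximation CI for a proportion: p_hat ± z * sqrt(p_hat(1 - p_hat)/n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Students who carried lunch: 28% of 350.
lunch_ci = proportion_ci(0.28, 350, confidence=0.95)
# Obesity: 22 out of 100.
obese_ci = proportion_ci(22 / 100, 100, confidence=0.95)

print("lunch:", tuple(round(v, 3) for v in lunch_ci))
print("obesity:", tuple(round(v, 3) for v in obese_ci))
```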
CI for the difference between two Population proportions
• Two samples are drawn from two independent populations of interest, and the sample proportion of the characteristic of interest is computed for each sample.
• An unbiased point estimator of the difference between the two population proportions, P1 − P2, is $\hat{P}_1 - \hat{P}_2$.
• A 100(1 − α)% confidence interval for P1 − P2 is given by
$(\hat{P}_1 - \hat{P}_2) \pm Z_{1-\alpha/2}\sqrt{\frac{\hat{P}_1(1-\hat{P}_1)}{n_1} + \frac{\hat{P}_2(1-\hat{P}_2)}{n_2}}$
Example
A researcher investigated gender differences in sexual abuse in a sample of 323 adults (68 females and 255 males). In the sample, 31 of the females and 53 of the males reported sexual abuse. We wish to construct a 99% C.I. for the difference between the proportions of sexual abuse in the two sampled populations.
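Example cont…
1 − α = 0.99 → α = 0.01 → α/2 = 0.005 → 1 − α/2 = 0.995
$Z_{1-\alpha/2} = Z_{0.995} = 2.58$, $n_F = 68$, $n_M = 255$
$\hat{p}_F = \frac{a_F}{n_F} = \frac{31}{68} = 0.4559$, $\hat{p}_M = \frac{a_M}{n_M} = \frac{53}{255} = 0.2078$
$(\hat{P}_F - \hat{P}_M) \pm Z_{1-\alpha/2}\sqrt{\frac{\hat{P}_F(1-\hat{P}_F)}{n_F} + \frac{\hat{P}_M(1-\hat{P}_M)}{n_M}}$
$= (0.4559 - 0.2078) \pm 2.58\sqrt{\frac{0.4559(1-0.4559)}{68} + \frac{0.2078(1-0.2078)}{255}}$
$= 0.2481 \pm 2.58(0.0655) = (0.0791, 0.4171)$
• Interpretation: we are 99% confident that the difference between the population proportions (female minus male) lies between 0.0791 and 0.4171. Because the interval does not contain zero, the two proportions differ significantly at the 1% level.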
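The interval for the sexual-abuse example can be recomputed as follows. This is a sketch of the two-proportion formula; the function name `diff_proportions_ci` is hypothetical, and the code uses the unrounded z value (about 2.576) rather than the tabulated 2.58, so the limits differ slightly in the fourth decimal place.

```python
import math
from statistics import NormalDist


def diff_proportions_ci(a1, n1, a2, n2, confidence):
    """CI for P1 - P2 from counts a1 of n1 and a2 of n2 (normal approximation)."""
    p1, p2 = a1 / n1, a2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Females: 31 of 68 reported abuse; males: 53 of 255.
lower, upper = diff_proportions_ci(31, 68, 53, 255, confidence=0.99)
print(f"99% CI for P_F - P_M: ({lower:.4f}, {upper:.4f})")
```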
C. Paired Samples
• Tests the means of two related populations
  - Paired or matched samples
  - Repeated measures (before/after)
  - Uses the difference between paired values: d = x1 − x2
• Eliminates variation among subjects
• Assumptions:
  - Both populations are normally distributed,
  - or, if not normal, large samples are used.
Examples
Paired data arise when each individual (more specifically, each unit of measurement) in a sample is measured twice, e.g. blood pressure prior to and following treatment.
• Notice that in such examples the two occasions of measurement are linked by virtue of the two measurements being made on the same individual.
A 100(1 − α)% confidence interval for the mean difference $\mu_d$ is $\bar{d} \pm t_{\alpha/2}\,\frac{s_d}{\sqrt{n}}$, where $t_{\alpha/2}$ has n − 1 df.
Example
Ten hypertensive patients are screened at a neighborhood health clinic
and are given methyl dopa, a strong antihypertensive medication for
their condition. They are asked to come back 1 week later and have
their blood pressures measured again. Suppose the initial and follow-up
SBPs (mm Hg) of the patients are given below.
We have the following data and summary statistics
Summary
 Students sometimes have difficulty deciding whether to use 𝑍𝛼/2 or 𝑡𝛼/2
values when finding confidence intervals.
Hypothesis testing
 A statistical hypothesis is a statement about the population under study
or about the distribution of a quantity under consideration.
 Researchers are interested in answering many types of questions. For example,
A physician might want to know whether a new medication will lower a
person’s blood pressure.
 These types of questions can be addressed through statistical hypothesis testing,
which is a decision-making process for evaluating claims about a population.
• A hypothesis is a testable statement that describes the nature of the proposed relationship between two or more variables of interest.
• In hypothesis testing, the researcher must define the population under study, state the particular hypotheses that will be investigated, give the significance level, select a sample from the population, collect the data, perform the calculations required for the statistical test, and reach a conclusion.
Types of Hypotheses
• The null hypothesis (represented by H0) is the statement about the value of the population parameter (the default, or status quo, statement).
• The null hypothesis postulates that 'there is no difference between factor and outcome' or 'there is no intervention effect.'
• The alternative hypothesis (represented by HA) is the hypothesis that a researcher wants to test or claim; it states the 'opposing' view that 'there is a difference between factor and outcome' or 'there is an intervention effect.'
• Level of significance (α): the probability of rejecting the null hypothesis when it is true; equivalently, the proportion of sample means that fall outside certain prescribed limits when H0 holds.
Methods of hypothesis testing
• Hypotheses are statements about parameters that may or may not be true.
• The three methods used to test hypotheses are:
  - The traditional method
  - The P-value method
  - The confidence interval method
Steps in hypothesis testing
1. Identify the null hypothesis H0 and the alternate hypothesis HA.
2. Choose 𝛼. The value should be small, usually less than 10%. It is
important to consider the consequences of both types of errors.
3. Select the test statistic and determine its value from the sample data.
4. Compare the observed value of the statistic to the critical value obtained
for the chosen 𝛼.
5. Make a decision
6. Conclusion
Test Statistics
• A test statistic is a value that we can compare with the known distribution of what we expect when the null hypothesis is true.
• The general formula of the test statistic is:
$\text{Test statistic} = \frac{\text{observed value} - \text{hypothesized value}}{\text{standard error}}$
Critical value
 The critical value separates the critical region from the non-critical
region for a given level of significance.
Decision making
Reject or fail to reject (often loosely called "accept") the null hypothesis.
There are 2 types of errors.
A Type I error is usually regarded as the more serious error; its probability is the level of significance (α).
Power is the probability of rejecting a false null hypothesis and is given by 1 − β.
Types of errors
Type I error: rejecting the null hypothesis when it is true (H0 is wrongly rejected).
E.g. H0: there is no difference between two drugs on average.
A Type I error occurs if we conclude that the two drugs produce different effects when actually there is no difference. Prob(Type I error) = α
Type II error: failing to reject the null hypothesis when it is false.
E.g. H0: there is no difference between two drugs on average.
A Type II error occurs if we conclude that the two drugs produce the same effect when actually there is a difference. Prob(Type II error) = β
Hypothesis testing about a Population mean (μ)
Two-tailed test (the value of the sample statistic may fall into either tail of the distribution)
The large-sample (n ≥ 30) test of a hypothesis about a population mean μ is as follows:
1. $H_0: \mu = \mu_0$ vs $H_A: \mu \neq \mu_0$
$Z_{cal} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$, $Z_{tab} = Z_{\alpha/2}$
Decision rule:
Reject H0 if the Z value falls in the rejection region.
Do not reject H0 if the Z value falls in the non-rejection region.
If $|Z_{cal}| \geq Z_{tab}$, reject H0.
If $|Z_{cal}| < Z_{tab}$, do not reject H0.
• If n < 30 and the variance is unknown:
$t_{cal} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ with n − 1 d.f.
The decision rule is the same as for the calculated z.
One-tailed tests
2. $H_0: \mu = \mu_0$ vs $H_A: \mu < \mu_0$
$Z_{tab} = Z_{\alpha}$
Decision: if $Z_{cal} \geq -Z_{tab}$, do not reject H0; if $Z_{cal} < -Z_{tab}$, reject H0.
3. $H_0: \mu = \mu_0$ vs $H_A: \mu > \mu_0$
Decision: if $Z_{cal} \geq Z_{tab}$, reject H0; if $Z_{cal} < Z_{tab}$, do not reject H0.
The P- Value
• The p-value measures how likely the observed difference would be if chance alone were operating.
• A large p-value implies that the observed value could easily occur just by chance when the null hypothesis is true.
• A small p-value means chance is an unlikely explanation, and suggests that there may be sufficient evidence for rejecting the null hypothesis.
• The p-value is defined as the probability of observing the computed test value, or a more extreme one, if H0 is true. For example, $P(Z \geq Z_{cal} \mid H_0 \text{ true})$.
 A p-value is the probability of getting the observed difference, or one
more extreme, in the sample purely by chance from a population
where the true difference is zero.
 If the p-value is greater than 0.05 then, by convention, we conclude
that the observed difference could have occurred by chance and there
is no statistically significant evidence (at the 5% level of
significance) for a difference between the groups in the population.
P-value and confidence interval
 Confidence intervals are preferable because they give information about the
size of any difference in the population, and they also indicate the amount
of uncertainty remaining about the size of the difference.
 When the null hypothesis is rejected in a hypothesis-testing situation, the
confidence interval for the mean using the same level of significance will
not contain the hypothesized mean.
• For what p-values should we reject the null hypothesis?
• By convention, a p-value of 0.05 or smaller is considered sufficient evidence for rejecting the null hypothesis.
• By using a cutoff of 0.05, we allow a 5% chance of wrongly rejecting the null hypothesis when it is in fact true.
• When the p-value is less than or equal to 0.05, we often say that the result is statistically significant.
Example1
A simple random sample of 10 people from a certain population
has a mean age of 27. Can we conclude that the mean age of the
population is not 30? The variance is known to be 20. Let 𝛼 =
.05.
Solution
1. State the hypotheses: H0: µ = 30 vs HA: µ ≠ 30
2. Determine the level of significance: α = 0.05
3. Calculate the test statistic: $Z_{cal} = \frac{27 - 30}{\sqrt{20/10}} = -2.12$
4. Determine the critical value: the Z critical value at α/2 = 0.025 is 1.96.
5. Make a decision: we reject the null hypothesis since |Zcal| = 2.12 ≥ Ztab = 1.96; that is, Zcal = −2.12 is in the rejection region.
6. Conclusion: the mean age of the population is different from 30 at the 5% level of significance. The same conclusion follows from the p-value of 0.034, which is below 0.05.
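The test statistic and p-value in this example can be verified with a short script. This is a sketch of a two-tailed z-test for a mean with known variance; the function name `z_test_mean` is hypothetical.

```python
import math
from statistics import NormalDist


def z_test_mean(xbar, mu0, variance, n):
    """Two-tailed z-test for a mean with known population variance.

    Returns the test statistic and the two-sided p-value.
    """
    z = (xbar - mu0) / math.sqrt(variance / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Example 1: n = 10, sample mean 27, hypothesized mean 30, variance 20.
z_cal, p_value = z_test_mean(xbar=27, mu0=30, variance=20, n=10)
z_tab = NormalDist().inv_cdf(1 - 0.05 / 2)

print(f"z = {z_cal:.2f}, critical value = {z_tab:.2f}, p-value = {p_value:.3f}")
print("reject H0" if abs(z_cal) >= z_tab else "do not reject H0")
```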
Example 2
Suppose that the hypothesized population mean is 3.1, and a sample of n = 20 people gives a sample mean of 4.5.
1. H0: μ = 3.1 vs HA: μ ≠ 3.1
2. α = 0.05 (95% confidence level)
3. Our test statistic is: tcal = 1.14
4. The observed value of the test statistic falls within the non-rejection region, i.e. tcal = 1.14 < ttab = 2.09.
5. We do not reject H0 and conclude that there is not enough evidence to reject the null hypothesis.
Hypothesis testing for single proportions
Example: In a study of childhood abuse in psychiatric patients, Brown found that 166 in a sample of 947 patients reported histories of physical or sexual abuse. Test the hypothesis that the true population proportion is 30%.
Solution
• To test the hypothesis, we follow the steps below.
Step 1: State the hypotheses
H0: P = P0 = 0.3 vs HA: P ≠ P0
Step 2: Fix the level of significance (α = 0.05)
Step 3: Determine the critical value: $Z_{tab} = Z_{\alpha/2} = 1.96$
Step 4: Compute the value of the test statistic, using the sample proportion 166/947 = 0.175:
$Z_{cal} = \frac{\hat{P} - P_0}{\sqrt{P_0 q_0 / n}} = \frac{0.175 - 0.3}{\sqrt{0.3(0.7)/947}} = \frac{-0.125}{0.0149} = -8.39$
Step 5: Make a decision: reject H0 since |Zcal| = 8.39 ≥ Ztab = 1.96.
Step 6: Conclusion: there is statistical evidence that the true population proportion is different from 0.3.
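The single-proportion test can be checked numerically. The sketch below uses the rounded sample proportion 0.175 (166/947) and the hypothesized proportion 0.3 in the standard error, as the test statistic does under H0; the function name `z_test_proportion` is hypothetical.

```python
import math
from statistics import NormalDist


def z_test_proportion(p_hat, p0, n):
    """z statistic for H0: P = p0, with the standard error computed under H0."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)


z_cal = z_test_proportion(p_hat=0.175, p0=0.3, n=947)
z_tab = NormalDist().inv_cdf(1 - 0.05 / 2)

print(f"z = {z_cal:.2f}, critical value = {z_tab:.2f}")
print("reject H0" if abs(z_cal) >= z_tab else "do not reject H0")
```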
Hypothesis testing for two sample means
Ho: µ1-µ2 =0
VS
HA: µ1-µ2 ≠0, HA: µ1-µ2 <0, HA: µ1-µ2>0
Example
Researchers wish to know if the data they have collected provide sufficient evidence to indicate a difference in mean serum uric acid levels between normal individuals and individuals with Down's syndrome. The data consist of serum uric acid readings on 12 individuals with Down's syndrome and 15 normal individuals. The means are 4.5 mg/100 ml and 3.4 mg/100 ml, with standard deviations of 2.9 and 3.5 mg/100 ml, respectively, and with variances $\sigma_1^2 = 1$ and $\sigma_2^2 = 1.5$, respectively. Is there a difference between the means of both groups at the 5% level of significance?
Hypotheses: H0: µ1 − µ2 = 0 vs HA: µ1 − µ2 ≠ 0 (or HA: µ1 ≠ µ2)
With α = 0.05, the critical values of Z are −1.96 and +1.96. We reject H0 if Z < −1.96 or Z > +1.96.
Reject H0 because Zcal = 2.57 > 1.96.
• At the 5% level of significance, there is statistically significant evidence that the population means are not equal.
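The value 2.57 can be reproduced with the two-sample z statistic, assuming it was computed from the stated variances (1 and 1.5) rather than from the quoted standard deviations; that reading is an assumption, as is the function name `z_test_two_means`.

```python
import math


def z_test_two_means(x1, var1, n1, x2, var2, n2, d0=0.0):
    """z statistic for H0: mu1 - mu2 = d0 with known variances."""
    se = math.sqrt(var1 / n1 + var2 / n2)
    return ((x1 - x2) - d0) / se


# Down's syndrome group: n = 12, mean 4.5, variance 1.
# Normal group: n = 15, mean 3.4, variance 1.5.
z_cal = z_test_two_means(4.5, 1.0, 12, 3.4, 1.5, 15)
print(f"z = {z_cal:.2f}")
print("reject H0" if abs(z_cal) > 1.96 else "do not reject H0")
```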
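Hypothesis testing for two proportions
• Suppose that n1 and n2 are large enough so that
n1·p1 ≥ 5, n1·(1 − p1) ≥ 5, n2·p2 ≥ 5, and n2·(1 − p2) ≥ 5
• To test the hypothesis
H0: P1 − P2 = 0
vs
HA: P1 − P2 ≠ 0
Test statistic:
$Z_{cal} = \frac{(\hat{P}_1 - \hat{P}_2) - D_0}{\sigma_{\hat{P}_1 - \hat{P}_2}}$
where $D_0$ is the hypothesized value of $P_1 - P_2$ (zero under H0) and $\sigma_{\hat{P}_1 - \hat{P}_2}$ is the standard error of the difference in sample proportions.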
Example
Two hundred patients suffering from a certain disease were randomly divided into two
equal groups. Of the first group, 78 recovered within three days. Out of the other 100,
who were treated by a new method, 90 recovered within three days. The physician
wishes to know whether the data provide sufficient evidence at 90% level of
confidence to indicate that the new treatment is more effective than the standard
treatment.
Solution;
Given: n1= n2= 100;
p1=78/100= 0.78 p2=90/100=0.90
1. State the hypothesis: Ho: P1=P2 vs H1: P1< P2
2. Determine level of significance.
3. Test statistic:
$Z_{cal} = \frac{(0.78 - 0.90) - 0}{\sqrt{\frac{0.78(0.22)}{100} + \frac{0.90(0.10)}{100}}} = \frac{-0.12}{0.0511} = -2.35$
4. Critical value: it is a one-tailed (left-tailed) test, so $-Z_{\alpha} = -Z_{0.05} = -1.645$
5. Decision: since $Z_{cal} < -Z_{\alpha}$, i.e. −2.35 < −1.645, we reject H0.
6. Conclusion: the data suggest that the new treatment is more effective than the standard treatment at the 5% level of significance.
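As a check on the arithmetic, the statistic can be recomputed with 1 − 0.78 = 0.22 in the first variance term. The sketch below follows the unpooled standard error used in this example; the function name `z_test_two_proportions` is hypothetical.

```python
import math
from statistics import NormalDist


def z_test_two_proportions(p1, n1, p2, n2, d0=0.0):
    """z statistic for H0: P1 - P2 = d0 using the unpooled standard error."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return ((p1 - p2) - d0) / se


# Standard treatment: 78 of 100 recovered; new treatment: 90 of 100.
z_cal = z_test_two_proportions(0.78, 100, 0.90, 100)
z_alpha = NormalDist().inv_cdf(1 - 0.05)  # one-tailed critical value

print(f"z = {z_cal:.2f}, left-tail critical value = {-z_alpha:.3f}")
print("reject H0" if z_cal < -z_alpha else "do not reject H0")
```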
Chi-square test
 Chi-square test is used to determine a significant difference between the
observed and expected frequencies in categorical attributes.
 In recent years, the use of specialized statistical methods for categorical
data has increased dramatically, particularly for applications in the
biomedical and social sciences.
 Categorical scales occur frequently in the health sciences, for measuring
responses.
For example:
Patient survives an operation (yes, no),
Severity of an injury (none, mild, moderate, severe), and
Stages of a disease (initial, advanced).
 Studies often collect data on categorical variables that can be summarized
as a series of counts and commonly arranged in a tabular format known as a
contingency table.
• As with the t distribution, there is a different chi-square distribution for each possible value of the degrees of freedom.
• Chi-square distributions with a small number of degrees of freedom are highly skewed; however, this skewness is attenuated as the number of degrees of freedom increases.
The chi-squared distribution is concentrated over non-negative values. It
has mean equal to its degrees of freedom (d.f), and its standard deviation
equals √(2df ). As d.f increases, the distribution concentrates around larger
values and is more spread out.
The distribution is skewed to the right, but it becomes more bell-shaped
(normal) as d.f increases.
 For contingency table, d.f is equal to (r-1)x(c-1)
Test of association
• The chi-squared ($\chi^2$) test statistic is widely used in the analysis of contingency tables.
• It compares the actual observed frequency in each group with the expected frequency.
• The chi-squared test (Pearson's $\chi^2$) allows us to test for association between categorical (nominal) variables.
• The null hypothesis for this test is that there is no association between the variables. Consequently, a significant p-value implies association.
Test statistic: $\chi^2$-test with d.f. = (r − 1) × (c − 1)
$\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$
$O_{ij}$ = observed frequency and $E_{ij}$ = expected frequency of the cell at the juncture of the i-th row and j-th column
$E_{ij} = \frac{i\text{th row total} \times j\text{th column total}}{\text{grand total}} = \frac{R_i \times C_j}{n}$
Procedures of Hypothesis Testing
1. State the hypothesis
2. Fix level of significance
3. Find the critical value (𝜒2 (d.f, α))
4. Compute the test statistics
5. Decision rules; reject null hypothesis if test statistics > table value.
6. Make conclusion
Test of associations for 2x2 tables
If we call the frequencies in the four cells of a 2x2 table a, b, c and d, then the table is given by

Exposure status   Disease status              Row total
                  Diseased    Non-diseased
Exposed           a           b               a+b
Non-exposed       c           d               c+d
Column total      a+c         b+d             a+b+c+d
If the contingency table is 2x2, the d.f. is (r − 1) × (c − 1) = 1, and
$\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = \frac{n(ad - bc)^2}{(a+c)(b+d)(a+b)(c+d)}$
Assumptions of the $\chi^2$-test
The chi-squared test assumes that
• The data are categorical.
• The data are frequency data (counts).
• The numbers in each cell are 'not too small': no expected frequency should be less than 1, and
• no more than 20% of the expected frequencies should be less than 5.
• If this does not hold, row or column categories can sometimes be combined (re-categorized) to make the expected frequencies larger, or an alternative method can be used.
Example:
• Consider a hypothetical example on smoking and symptoms of asthma. The study involved 150 individuals, and the result is given in the following table.
• Is there an association between smoking cigarettes and symptoms of asthma at the 0.05 level of significance?

Symptoms of asthma   Ever smoking      Total
                     Yes      No
Yes                  20       30       50
No                   22       78       100
Total                42       108      150
df \ area 0.995 0.99 0.975 0.95 0.9 0.25 0.1 0.05 0.025 0.01 0.005
1 0.000 0.000 0.001 0.004 0.016 1.323 2.706 3.841 5.024 6.635 7.879
2 0.010 0.020 0.051 0.103 0.211 2.773 4.605 5.991 7.378 9.210 10.597
3 0.072 0.115 0.216 0.352 0.584 4.108 6.251 7.815 9.348 11.345 12.838
4 0.207 0.297 0.484 0.711 1.064 5.385 7.779 9.488 11.143 13.277 14.860
5 0.412 0.554 0.831 1.145 1.610 6.626 9.236 11.071 12.833 15.086 16.750
6 0.676 0.872 1.237 1.635 2.204 7.841 10.645 12.592 14.449 16.812 18.548
7 0.989 1.239 1.690 2.167 2.833 9.037 12.017 14.067 16.013 18.475 20.278
8 1.344 1.647 2.180 2.733 3.490 10.219 13.362 15.507 17.535 20.090 21.955
9 1.735 2.088 2.700 3.325 4.168 11.389 14.684 16.919 19.023 21.666 23.589
10 2.156 2.558 3.247 3.940 4.865 12.549 15.987 18.307 20.483 23.209 25.188
11 2.603 3.053 3.816 4.575 5.578 13.701 17.275 19.675 21.920 24.725 26.757
12 3.074 3.571 4.404 5.226 6.304 14.845 18.549 21.026 23.337 26.217 28.300
13 3.565 4.107 5.009 5.892 7.042 15.984 19.812 22.362 24.736 27.688 29.819
14 4.075 4.660 5.629 6.571 7.790 17.117 21.064 23.685 26.119 29.141 31.319
15 4.601 5.229 6.262 7.261 8.547 18.245 22.307 24.996 27.488 30.578 32.801
16 5.142 5.812 6.908 7.962 9.312 19.369 23.542 26.296 28.845 32.000 34.267
17 5.697 6.408 7.564 8.672 10.085 20.489 24.769 27.587 30.191 33.409 35.718
18 6.265 7.015 8.231 9.390 10.865 21.605 25.989 28.869 31.526 34.805 37.156
19 6.844 7.633 8.907 10.117 11.651 22.718 27.204 30.144 32.852 36.191 38.582
20 7.434 8.260 9.591 10.851 12.443 23.828 28.412 31.410 34.170 37.566 39.997
21 8.034 8.897 10.283 11.591 13.240 24.935 29.615 32.671 35.479 38.932 41.401
22 8.643 9.542 10.982 12.338 14.041 26.039 30.813 33.924 36.781 40.289 42.796
23 9.260 10.196 11.689 13.091 14.848 27.141 32.007 35.172 38.076 41.638 44.181
24 9.886 10.856 12.401 13.848 15.659 28.241 33.196 36.415 39.364 42.980 45.559
25 10.520 11.524 13.120 14.611 16.473 29.339 34.382 37.652 40.646 44.314 46.928
26 11.160 12.198 13.844 15.379 17.292 30.435 35.563 38.885 41.923 45.642 48.290
27 11.808 12.879 14.573 16.151 18.114 31.528 36.741 40.113 43.195 46.963 49.645
28 12.461 13.565 15.308 16.928 18.939 32.620 37.916 41.337 44.461 48.278 50.993
29 13.121 14.256 16.047 17.708 19.768 33.711 39.087 42.557 45.722 49.588 52.336
30 13.787 14.953 16.791 18.493 20.599 34.800 40.256 43.773 46.979 50.892 53.672
Table C. Right tail areas for the Chi-square
Solution
Hypothesis:
• H0: there is no association between smoking and symptoms of asthma.
• HA: there is an association between smoking and symptoms of asthma.
The critical value is given by $\chi^2_{(0.05,\,1)} = 3.841$
Test statistic:
$\chi^2 = \frac{n(ad - bc)^2}{(a+c)(b+d)(a+b)(c+d)} = \frac{150(20 \times 78 - 30 \times 22)^2}{42 \times 108 \times 50 \times 100} = 5.36$
The p-value corresponding to 5.36 at 1 degree of freedom is approximately 0.02.
Decision: since 5.36 > 3.841, we reject the null hypothesis in favour of the alternative hypothesis.
Conclusion: there is an association between smoking and symptoms of asthma.
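The statistic 5.36 and its p-value can be reproduced from the 2x2 table. The sketch below uses the shortcut formula for 2x2 tables and, for the p-value, the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so the right-tail area equals erfc(sqrt(x/2)). The function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Chi-square statistic and p-value (1 d.f.) for a 2x2 table.

    Cells are laid out as:  a  b
                            c  d
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))
    # For 1 d.f., P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Smoking and asthma example: rows = symptoms (yes, no), columns = smoking (yes, no).
chi2, p_value = chi_square_2x2(20, 30, 22, 78)
critical = 3.841  # chi-square critical value at alpha = 0.05, 1 d.f. (from Table C)

print(f"chi-square = {chi2:.2f}, p-value = {p_value:.3f}")
print("reject H0" if chi2 > critical else "do not reject H0")
```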
Exercise
Consider the data on the assessment of the effectiveness of antidepressant. The
data is given below:
Is there association between treatments and depression at 0.01 level of
significance?
Treatment      Depression status       Total
               Yes        No
Desipramine    14 (8)     10 (16)      24
Lithium        6 (8)      18 (16)      24
Placebo        4 (8)      20 (16)      24
Total          24         48           72
(Expected frequencies are shown in parentheses.)
Measure of Association
• The chi-square test only tells us whether there is an association between two categorical variables; it does not tell us about the direction and strength of the association.
• A measure of association quantifies the statistical relationship between exposure and disease.
• An association is said to exist between two variables when a change in one variable parallels or coincides with a change in another variable.
• It requires comparing two groups:
  - Exposed vs unexposed
  - Cases vs non-cases/controls
 Variables can be related or unrelated to one another.
 If they are related, the relationship can be:
 Positive or negative
 Strong or weak (one variable can have a large or small effect on the other)
 Statistically significant or not significant
 A statistically significant association is one that is unlikely to be due to chance.
Commonly, the strength of the association is measured by the
 Relative risk (RR)
 Odds Ratio (OR)
Relative Risk (RR)
 Risk: the probability of an event occurring over a specified period of time.
 Risk ratio (relative risk): the ratio of the risk of disease (incidence) in the exposed group to the risk in the unexposed group.
 Risk measures the probability of disease incidence within a group.
 Relative risk is used to compare the risk in two different groups of people.
$$\text{Risk} = \frac{\text{number of cases of disease}}{\text{number of people at risk}}$$
 RR estimates the magnitude (size) of the association between exposure and outcome.
 It indicates the chance of developing the disease in the group exposed to a risk factor relative to the unexposed group.
 Table 1: A 2 × 2 table of findings from a cohort study

Exposure       Disease: Yes   Disease: No   Total
Exposed        a              b             a+b
Unexposed      c              d             c+d
Total          a+c            b+d           a+b+c+d

 From the above table, the RR is calculated as:
$$RR = \frac{I_e}{I_o} = \frac{a/(a+b)}{c/(c+d)}$$
Example 1
Table 2: Data from a cohort study of oral contraceptive (OC) use and bacteriuria among women aged 15-49 years.

Current OC use   Bacteriuria: Yes   Bacteriuria: No   Total
Yes              27                 455               482
No               77                 1831              1908
Total            104                2286              2390
Calculate the RR.
$$RR = \frac{I_e}{I_o} = \frac{a/(a+b)}{c/(c+d)} = \frac{(27/482) \times 1000}{(77/1908) \times 1000} = \frac{56.0}{40.4} \approx 1.4$$
where I_e is the incidence among the exposed and I_o is the incidence among the non-exposed.
 Interpretation: Women who used oral contraceptives had 1.4 times the risk of developing bacteriuria compared with non-users.
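The RR calculation for Table 2 can be reproduced with a few lines of code. This is a minimal sketch; the function name `relative_risk` and the cell labels a, b, c, d (exposed with and without disease, then unexposed with and without disease) are illustrative choices that follow the a/(a+b) over c/(c+d) form used above.

```python
def relative_risk(a, b, c, d):
    """Relative risk from a 2 x 2 cohort table.

    a: exposed, disease       b: exposed, no disease
    c: unexposed, disease     d: unexposed, no disease
    """
    incidence_exposed = a / (a + b)
    incidence_unexposed = c / (c + d)
    return incidence_exposed / incidence_unexposed

# Table 2: OC use and bacteriuria
rr = relative_risk(27, 455, 77, 1831)
print(f"RR = {rr:.1f}")
```

The factor of 1000 in the hand calculation cancels between numerator and denominator, so it is omitted here.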
Interpretation
 The value of RR ranges from 0 to infinity.
 RR is never negative.
 RR = 1
 Risk in exposed = risk in non-exposed
 No association
 RR > 1
 Risk in exposed > risk in non-exposed
 Exposed individuals are RR times as likely to develop the outcome as non-exposed individuals.
 Positive association: the factor is associated with the disease
 Larger RR → stronger association
 RR < 1
 Risk in exposed < risk in non-exposed
 The risk of acquiring the disease is lower among subjects with the factor than among subjects without it.
 Negative association: the factor is “protective”
Interpretation cont’d…
[Figure: RR scale from 0 to ∞; RR = 1 indicates no association, values below 1 are preventive, values above 1 indicate risk]
Guideline for strength of association
 1.0 = No association
 1.1-1.3 = Weak
 1.4-1.7 = Mild
 1.8-3.0 = Moderate
 3.0-8.0 = Strong
Q. What if RR is less than 1?
 For inverse associations (RR less than 1.0), take the reciprocal and look it up in the table above; e.g., the reciprocal of 0.5 is 2.0, which corresponds to a “moderate” association.
 The further the RR is from 1, the stronger the association between exposure and disease.
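The guideline and the reciprocal rule can be combined into a small lookup. This is only a sketch under stated assumptions: the function name `strength_of_association` is hypothetical, the RR is rounded to one decimal place before the lookup, the overlapping value 3.0 is treated as “moderate”, and values above 8.0 (not covered by the guideline) are also labelled “strong”.

```python
def strength_of_association(rr):
    """Label the strength of association for a relative risk.

    Assumptions (not from the guideline itself): RR is rounded to one
    decimal place, 3.0 counts as "moderate", and values above 8.0 are
    labelled "strong".
    """
    if rr <= 0:
        raise ValueError("RR must be greater than 0 to be classified")
    # For inverse associations (RR < 1), use the reciprocal
    value = rr if rr >= 1 else 1.0 / rr
    value = round(value, 1)
    if value == 1.0:
        return "no association"
    if value <= 1.3:
        return "weak"
    if value <= 1.7:
        return "mild"
    if value <= 3.0:
        return "moderate"
    return "strong"

print(strength_of_association(0.5))  # reciprocal is 2.0
print(strength_of_association(1.4))
```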
Odds Ratio (OR)
The odds of disease express how likely an individual is to experience the disease relative to not experiencing it, and they can be compared across exposure groups.
Odds: the ratio of the probability of an event occurring to the probability of it not occurring.
$$\text{Odds} = \frac{p}{1-p}$$
where p = the probability that the event occurs and 1 − p = the probability that it does not occur.
 In a case-control study, the odds ratio indicates the likelihood of having been exposed among cases relative to controls.
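The relationship between probability and odds can be illustrated with a short sketch. The function name `odds` is an illustrative choice, and the probabilities used are made-up values, not data from the slides.

```python
def odds(p):
    """Convert a probability p (0 <= p < 1) into odds p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)

# Illustrative probabilities only
print(odds(0.5))   # an event as likely to occur as not
print(odds(0.2))   # an event that occurs 1 time for every 4 times it does not
```

A probability of 0.5 corresponds to odds of 1, and odds grow without bound as the probability approaches 1.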
Consider the following 2x2 table:

Treatment   Outcome status X−   Outcome status X+   Total
Y−          a                   b                   a+b
Y+          c                   d                   c+d
Total       a+c                 b+d                 a+b+c+d
Odds ratio: the ratio of two odds, for example the odds of exposure in cases compared with the odds of exposure in controls.
$$OR = \frac{\text{odds of positive outcome among cases}}{\text{odds of positive outcome among controls}} = \frac{a/c}{b/d} = \frac{a \times d}{b \times c}$$
Odds: the ratio of the probability of occurrence of an event to that of non-occurrence.
 We can calculate either the exposure odds ratio or the disease odds ratio; the two are exactly the same.
Example
Table 3: Data from a case-control study of current oral contraceptive (OC) use and myocardial infarction (MI) in pre-menopausal female nurses.

Current OC use   MI: Yes   MI: No   Total
Yes              23        304      327
No               133       2816     2949
Total            156       3120     3276
Calculate the OR.
$$OR = \frac{a/c}{b/d} = \frac{a \times d}{b \times c} = \frac{23 \times 2816}{304 \times 133} = \frac{64768}{40432} \approx 1.6$$
Interpretation: The odds of having MI are 1.6 times as high among OC users as among non-users.
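The cross-product calculation for Table 3 can be sketched in code as follows. The function name `odds_ratio` is illustrative; the cells a, b, c, d are read row by row from the table (OC users with and without MI, then non-users with and without MI).

```python
def odds_ratio(a, b, c, d):
    """Odds ratio (a*d)/(b*c) from a 2 x 2 table read row by row."""
    return (a * d) / (b * c)

# Table 3: current OC use and myocardial infarction
or_hat = odds_ratio(23, 304, 133, 2816)
print(f"OR = {or_hat:.1f}")
```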
Interpretation cont’d…
 OR ranges from 0 to positive infinity.
 OR = 1: exposure is not related to disease.
 OR > 1: exposure is positively related to disease.
 OR < 1: exposure is negatively related to disease.
[Figure: OR scale from 0 to ∞; OR = 1.0 indicates no association, values below 1.0 a negative association, values above 1.0 a positive association]
Interpretation
 The odds of having the disease in question are OR times as great among those exposed to the suspected risk factor as among those with no such exposure.
 The standard error of the log odds ratio is given by
$$SE(\ln OR) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}$$
 The 95% confidence interval for the log odds ratio is given by
$$\left(\ln OR - Z_{\alpha/2} \times SE(\ln OR),\ \ln OR + Z_{\alpha/2} \times SE(\ln OR)\right)$$
 To interpret the 95% confidence interval on the odds ratio scale, we transform the limits back to the original scale by exponentiating them.
 That is, the 95% confidence interval for the odds ratio is given by:
$$\left(e^{\ln OR - Z_{\alpha/2} \times SE(\ln OR)},\ e^{\ln OR + Z_{\alpha/2} \times SE(\ln OR)}\right)$$
 OR is the point estimate computed from the sample.
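The steps for the confidence interval (log odds ratio, its standard error, then back-transformation) can be sketched as below, using the Table 3 counts as input. The function name `odds_ratio_ci` and the default z = 1.96 for a 95% interval are choices made for this illustration.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Point estimate and confidence interval for the odds ratio.

    The interval is built on the log scale and then exponentiated.
    z = 1.96 corresponds to a 95% confidence level.
    """
    or_hat = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(or_hat)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return or_hat, lower, upper

# Table 3: current OC use and myocardial infarction
or_hat, lower, upper = odds_ratio_ci(23, 304, 133, 2816)
print(f"OR = {or_hat:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

If the interval excludes 1, the association is statistically significant at the corresponding level.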
Exercise
 Example: The table below gives data on infant birth weight and mortality among white infants in region X within one year.

Birth weight   Dead   Alive   Total
Low BW         618    4597    5215
High BW        422    67093   67515
Total          1040   71690   72730

 Find the 95% confidence interval for the odds ratio of infant mortality (5% level of significance).

3. Statistical inference_anesthesia.pptx

  • 1. ► Estimation ► Hypothesis testing Statistical inference 7/5/2023 1 By Asaye
  • 2. Objectives 7/5/2023 By Asaye 2 After complete this session, learners will be able to do  Parameter estimations Point estimate Confidence interval  Hypothesis testing Z-test T-test  Testing associations Chi-square test
  • 3. Sampling distribution 3  A sampling distribution is a distribution of all possible values of a statistic computed from samples of the same size randomly selected from the same population.  Sampling distribution is the probability distribution of sample statistic.  It is formed when samples of size n repeatedly taken from population.  Some would be higher than the population parameters and some would be lower. 7/5/2023 Asaye.A
  • 4. Sampling distribution…. 4  We consider sample statistic as random variables. For example:  Age of individuals is a random variable.  Similarly, mean of age is a random variable.  No conclusion about values of population parameters based on one individual value.  It should be based on sample statistic computed from adequate sample size. 7/5/2023 Asaye.A
  • 5. Sampling distribution…. 5 Construction of sampling distributions 1. From a population of size N, randomly draw all possible samples of size n. 2. Compute the statistic of interest for each sample. 3. Create a frequency distribution of the statistic. 7/5/2023 Asaye.A
  • 6. A. Sampling distribution of sample mean 6 7/5/2023 Asaye.A
  • 7. Example: sampling distribution of sample mean 7/5/2023 Asaye.A 7 The population values {18, 20, 22, 24} put in a box. Two observations are randomly selected, with replacement. Find the mean, variance, and standard deviation of the population. Solution: Mean: μ = 𝑋𝑖 𝑁 = 84 4 = 21 Variance: 𝜎2 = 𝑋𝑖 −𝜇 2 𝑁 = 20 4 = 5 Standard deviation: 5 = 2.236
  • 8. Example: sampling distribution of sample mean 7/5/2023 Asaye.A 8 Now consider all possible samples of size “n=2” 16 Sample Means 16 possible samples (with replacement)
  • 9. Example: sampling distribution of sample mean 7/5/2023 Asaye.A 9 List all the possible samples of size n = 2 and calculate the mean of each sample. Solution: Samples 𝑿 Samples 𝑿 18,18 18 22,18 20 18,20 19 22,20 21 18,22 20 22,22 22 18,24 21 22,24 23 20,18 19 24,18 21 20,20 20 24,20 22 20,22 21 24,22 23 20,24 22 24,24 24 These means form the sampling distribution of sample means
  • 10. Example: sampling distribution of sample mean 7/5/2023 Asaye.A 10 Construct the frequency distribution of the sample means;
  • 11. Example: sampling distribution of sample mean 7/5/2023 Asaye.A 11 Find mean, variance and standard deviation of the 16 sample means are; Mean: 𝜇𝑥 = 𝑥𝑖 𝑛 = 18+19+21+⋯+24 16 =21 Variance: 𝜎𝑥 2 = 𝑥𝑖−𝜇𝑥 2 𝑛 = 2.5, 𝜎𝑥 = 2.5 = 1.581 These results satisfy the properties of sampling distributions of sample means. 𝜇𝑥 = 𝜇 = 21, 𝜎𝑥 = 𝜎 𝑛 = 5 2 = 1.581
  • 12. 1st 2nd Observation Obs 18 20 22 24 18 18 19 20 21 20 19 20 21 22 22 20 21 22 23 24 21 22 23 24 Example: sampling distribution of sample mean 12 18 19 20 21 22 23 24 0 .1 .2 .3 Sample Means Distribution 16 Sample Means P(𝑋) 𝑋 7/5/2023 Asaye.A
  • 13. Comparing the population with its sampling distribution 13 18 19 20 21 22 23 24 0 .1 .2 .3 P(x) Mean 18 20 22 24 𝒙 0 .1 .2 .3 P(x)=1/4 Population, N = 4 𝜇 = 21 𝜎 = 2.236 Sample means distribution, n = 2 𝜇𝑥=21 𝜎𝑥= 1.58 𝒙 7/5/2023 Asaye.A
  • 14. Properties of sampling distribution of mean 14 A. Sampling from normally distributed populations a. If a population is normal with mean 𝜇 and standard deviation σ, the sampling distribution of 𝑥 is also normally distributed with 𝜇𝑥 = 𝜇 and 𝜎𝑥 = 𝜎 𝑛 , OR, the standard deviation of any sample statistic is called its standard error. 7/5/2023 Asaye.A
  • 15. Cont… 15 b. The mean 𝜇 of the distribution of sample mean is equal to the mean of the population from which the samples were drawn. c. The variance of the distribution of sample mean is equal to the variance of the population divided by the sample size. 7/5/2023 Asaye.A
  • 16. Sampling from non-normally distributed populations 16 We can apply the Central Limit Theorem:  Even if the population is not normal,  Sample means from the population will be approximately normal if the sample sizes ≥ 30 are drawn from any population with mean 𝜇 and standard deviation 𝜎.  The sampling distribution of sample means has 𝜇𝑥 = 𝜇 and 𝜎𝑥 = 𝜎 𝑛 7/5/2023 Asaye.A
  • 17. Sampling distribution of Proportion 7/5/2023 By Asaye 17 o Suppose we choose a random sample of size n, the sampling distribution of the sample means p posses the following properties. o The sample proportion p will be an estimate of the population mean P. o The standard deviation of p is equal to p(1−p) /n called the standard error of the proportion). o Provided n is large enough the shape of the sampling distribution of p is normal.
  • 18. Types of estimation 7/5/2023 By Asaye 18 There are two methods of estimation: 1. Point estimation 2. Interval estimation
  • 19.  Point estimation involves the calculation of a single value to estimate the population parameter.  Interval estimation specifies a range of values assumed to include population parameter. 19
  • 20. 1. Point Estimation  A parameter : is a numerical descriptive measure of a population (e.g. μ).  A statistic: is a numerical descriptive measure of a sample (e.g. 𝑋). It estimates the population parameter.  A point estimate of some population parameter is a single value of a sample statistic.  To each sample statistic there corresponds a population parameter. 20
  • 21. Sample statistic & corresponding population parameter Sample statistic  Sample mean ( 𝑋 )  Sample variance (S2 )  Sample Standard deviation (SD)  Sample proportion (p) Population parameters  μ (population mean)  σ2 (population variance)  σ(population standard deviation)  P or π (Population proportion) 21
  • 22. Point estimation….. If a random sample of 100 drug related patients has a mean survival time of 46.9 months then ,what is the point estimate of the population mean? Answer = 46.9 22
  • 23. 2. Interval Estimation  A point estimate does not give any indication on how far away the parameter lies.  But an interval which has a high probability of containing the value parameter lies.  An interval estimate is a statement that a population parameter has a value lying between two specified limits.  Such interval estimates are called Confidence Intervals (CI) 23
  • 24. Confidence Interval (CI) 7/5/2023 By Asaye 24  A confidence interval defines an interval within which the true population parameter is like to fall (interval estimate).  Confidence interval therefore takes into account the sample to sample variation of the statistic and gives the measure of precision.  Confidence intervals express the inherent uncertainty in any medical study by expressing upper and lower bounds for anticipated true underlying population parameter.
  • 25. Confidence Interval (CI)… 7/5/2023 By Asaye 25  Most commonly the 95% confidence intervals are calculated, however 90% and 99% confidence intervals are sometimes used.  The probability that the interval contains the true population parameter is (1-α)100%.  If we were to select 100 random samples from the population and calculate confidence intervals for each, approximately 95 of them would include the true population mean B (and 5 would not).
  • 26. Confidence Interval (CI)… Interval Estimate components Estimator ± Margin of error Estimator ± (Reliability coefficient) x (Standard error)  Precision of the estimate or Margin of error (d)= reliability coefficient x standard error. Where;  Reliability Coefficient (RC) is the 1 − α 100% percentile of the given probability distribution.  Standard Error (SE) is the standard deviation of the sampling distribution of the sample statistic. 26
  • 27. Reliability Coefficient
    The standardized “z” value corresponding to the given level of confidence:
    - Z = 1.645 if your confidence level is 90%
    - Z = 1.96 if your confidence level is 95%
    - Z = 2.58 if your confidence level is 99%
    A wide interval suggests imprecise estimation. A narrow CI reflects a large sample size, low variability, or a low confidence level.
    The confidence coefficient is the confidence level written as a proportion, e.g. for a confidence level of 99% the confidence coefficient is 0.99.
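A minimal Python sketch (not part of the original slides) of where the reliability coefficients above come from, assuming a standard normal distribution and a two-sided interval. The function name `reliability_coefficient` is illustrative.

```python
from statistics import NormalDist

def reliability_coefficient(confidence: float) -> float:
    """Two-sided z value for a confidence level given as a proportion (e.g. 0.95)."""
    alpha = 1.0 - confidence
    # The (1 - alpha/2) quantile of the standard normal distribution.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {reliability_coefficient(level):.3f}")
```

The 99% value prints as 2.576, which the slides round to 2.58.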
  • 28. Confidence Level
    Confidence level is the probability that the interval estimate will contain the parameter, assuming that a large number of samples are selected and that the estimation process on the same parameter is repeated.
    - Denoted by 100(1 − α)%.
    - A relative frequency interpretation: in the long run, 100(1 − α)% of all confidence intervals that can be constructed will contain the unknown parameter.
    - A specific interval will either contain or not contain the unknown parameter.
  • 29. Normal or t-distribution? (decision flowchart)
    1. Is n ≥ 30?
       - Yes: use the normal distribution (Z). If σ is unknown, use s instead.
       - No: go to question 2.
    2. Is the population normally, or approximately normally, distributed?
       - No: cannot use the normal or t-distribution.
       - Yes: go to question 3.
    3. Is σ known?
       - Yes: use the normal distribution (Z).
       - No: use the t-distribution with n − 1 degrees of freedom.
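The flowchart on this slide can be sketched as a small Python function. This is an illustrative sketch only; the function name and the return labels "z", "t" and "neither" are assumptions, not part of the slides.

```python
def choose_distribution(n: int, population_normal: bool, sigma_known: bool) -> str:
    """Return "z", "t", or "neither" by asking the flowchart's three questions in order."""
    if n >= 30:
        # Large sample: use Z (substituting s for sigma when sigma is unknown).
        return "z"
    if not population_normal:
        # Small sample from a non-normal population: neither distribution applies.
        return "neither"
    # Small sample from an (approximately) normal population.
    return "z" if sigma_known else "t"

print(choose_distribution(81, True, True))
print(choose_distribution(20, True, False))
print(choose_distribution(12, False, True))
```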
  • 30. Confidence Interval for a single population mean
    1. When the variance is known (sample size large or small), the C.I. has the form:
       $\bar{X} - Z_{1-\alpha/2}\,\sigma/\sqrt{n} < \mu < \bar{X} + Z_{1-\alpha/2}\,\sigma/\sqrt{n}$
       or, for n ≥ 30 when σ is unknown, $\bar{X} \pm Z_{\alpha/2}\, S/\sqrt{n}$.
    2. When the variance is unknown and the sample size is small, the C.I. has the form:
       $\bar{X} - t_{1-\alpha/2,\,n-1}\, s/\sqrt{n} < \mu < \bar{X} + t_{1-\alpha/2,\,n-1}\, s/\sqrt{n}$, d.f. = n − 1
  • 31. Example: In a normally distributed population, the mean reading speed of a random sample of 81 adults is 325 words per minute. Find a 90% C.I. for the mean reading speed of all adults (μ) if it is known that the standard deviation for all adults is 45 words per minute.
    Given: n = 81, σ = 45, x̄ = 325, Z_{α/2} = 1.645, so σ/√n = 45/9 = 5.
    A 90% C.I. for μ is 325 ± (1.645 × 5) = 325 ± 8.2 = (316.8, 333.2).
    Therefore, a 90% CI for μ is 316.8 to 333.2 words per minute.
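As a hedged check of the reading-speed example, the sketch below recomputes the interval with Python's standard library. The function name `mean_ci_known_sigma` is illustrative, and the code assumes σ is known so that a z-based interval applies.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci_known_sigma(xbar, sigma, n, confidence):
    """z-based confidence interval for a mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)          # reliability coefficient x standard error
    return xbar - margin, xbar + margin

lower, upper = mean_ci_known_sigma(xbar=325, sigma=45, n=81, confidence=0.90)
print(f"90% CI: ({lower:.1f}, {upper:.1f}) words per minute")
```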
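  • 33. CI for the difference of means, independent samples
    1. When the variances are known:
       $\text{CI} = (\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
    2. When the variances are unknown and the sample size is less than 30, use the t-distribution instead of the z-distribution:
       $\text{CI} = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\; n_1+n_2-2}\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}$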
  • 34. Example: If a random sample of 50 non-smokers has a mean lifetime of 76 years with a standard deviation of 8 years, and a random sample of 65 smokers has a mean lifetime of 68 years with a standard deviation of 9 years, find a 95% C.I. for the difference in mean lifetime between non-smokers and smokers.
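One way to work this exercise is sketched below. It assumes that, because both samples are large (50 and 65), the z-based formula can be used with the sample standard deviations standing in for σ1 and σ2; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def diff_means_ci(mean1, sd1, n1, mean2, sd2, n2, confidence=0.95):
    """z-based CI for mean1 - mean2 from two independent large samples."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    diff = mean1 - mean2
    return diff - z * se, diff + z * se

# Non-smokers: n = 50, mean 76, SD 8. Smokers: n = 65, mean 68, SD 9.
lower, upper = diff_means_ci(76, 8, 50, 68, 9, 65)
print(f"95% CI for the difference: ({lower:.2f}, {upper:.2f}) years")
```

Under these assumptions the interval is roughly 4.9 to 11.1 years and does not include zero.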
  • 35. Confidence Interval for a Single Population Proportion (P)
    - A sample is drawn from the population of interest, then the sample proportion is computed as
      $\hat{p} = \dfrac{\text{no. of elements in the sample with some characteristic}}{\text{total no. of elements in the sample}} = \dfrac{x}{n}$
    - This sample proportion is used as the point estimator of the population proportion, and the confidence interval is
      $\hat{p} \pm Z_{1-\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
  • 36. Single proportion cont…
    2. In Addis Ababa, a survey of 350 students showed that 28% carried their lunch to school. Find the 95% CI for the true population proportion of students who carried their lunch to school.
    3. Suppose that 22 out of 100 people in Debre Tabor were obese. Find the 95% confidence interval for the true population proportion.
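The two exercises above can be checked with a short sketch of the single-proportion interval, p̂ ± Z·√(p̂(1 − p̂)/n). The function name `proportion_ci` is illustrative, and the large-sample normal approximation is assumed.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence=0.95):
    """Large-sample (normal approximation) CI for a single proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

lunch_ci = proportion_ci(0.28, 350)       # exercise 2
obese_ci = proportion_ci(22 / 100, 100)   # exercise 3
print(f"Lunch: ({lunch_ci[0]:.3f}, {lunch_ci[1]:.3f})")
print(f"Obesity: ({obese_ci[0]:.3f}, {obese_ci[1]:.3f})")
```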
  • 37. CI for the difference between two population proportions
    - Two samples are drawn from two independent populations of interest,
    - then the sample proportion for the characteristic of interest is computed for each sample.
    - An unbiased point estimator of the difference between the two population proportions P1 − P2 is the difference between the sample proportions, p̂1 − p̂2.
  • 38. CI for the difference between two population proportions
    - A 100(1-α)% confidence interval for P1 − P2 is given by
      $(\hat{p}_1 - \hat{p}_2) \pm Z_{1-\alpha/2}\sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
  • 39. Example: A researcher investigated gender differences in sexual abuse in a sample of 323 adults (68 females and 255 males). In the sample, 31 of the females and 53 of the males reported sexual abuse. We wish to construct a 99% C.I. for the difference between the proportions of sexual abuse in the two sampled populations.
  • 40. Example cont…
    1 − α = 0.99 → α = 0.01 → α/2 = 0.005 → 1 − α/2 = 0.995
    Z_{1−α/2} = Z_{0.995} = 2.58, n_F = 68, n_M = 255
    $\hat{p}_F = \dfrac{a_F}{n_F} = \dfrac{31}{68} = 0.4559, \qquad \hat{p}_M = \dfrac{a_M}{n_M} = \dfrac{53}{255} = 0.2078$
    $(\hat{p}_F - \hat{p}_M) \pm Z_{1-\alpha/2}\sqrt{\dfrac{\hat{p}_F(1-\hat{p}_F)}{n_F} + \dfrac{\hat{p}_M(1-\hat{p}_M)}{n_M}}$
    $= (0.4559 - 0.2078) \pm 2.58\sqrt{\dfrac{0.4559(1-0.4559)}{68} + \dfrac{0.2078(1-0.2078)}{255}}$
  • 41. Example cont…
    0.2481 ± 2.58(0.0655) = (0.0791, 0.4171)
    - Interpretation: ?
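A hedged Python sketch of the same calculation follows. It uses the exact 99% quantile (about 2.576) rather than the rounded 2.58, so the bounds can differ from the slide in the third or fourth decimal place. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportions_ci(x1, n1, x2, n2, confidence):
    """Large-sample CI for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Females: 31 of 68 reported abuse. Males: 53 of 255.
lower, upper = diff_proportions_ci(31, 68, 53, 255, confidence=0.99)
print(f"99% CI: ({lower:.3f}, {upper:.3f})")
```

Because the whole interval lies above zero, the data are consistent with a higher reported proportion among females at this confidence level.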
  • 42. C. Paired Samples
    - Tests means of two related populations
      - Paired or matched samples
      - Repeated measures (before/after)
      - Use the difference between paired values: d = x1 − x2
    - Eliminates variation among subjects
    - Assumptions:
      - Both populations are normally distributed,
      - or, if not normal, use large samples.
  • 43. Examples
    Paired data arise when each individual (more specifically, each unit of measurement) in a sample is measured twice, e.g. blood pressure prior to and following treatment.
    - Notice in such examples that the two occasions of measurement are linked by virtue of the two measurements being made on the same individual.
  • 45. Example: Ten hypertensive patients are screened at a neighborhood health clinic and are given methyldopa, a strong antihypertensive medication, for their condition. They are asked to come back 1 week later and have their blood pressures measured again. Suppose the initial and follow-up SBPs (mm Hg) of the patients are given below.
  • 47. We have the following data and summary statistics
  • 48. Summary
    - Students sometimes have difficulty deciding whether to use 𝑍𝛼/2 or 𝑡𝛼/2 values when finding confidence intervals.
  • 49. Hypothesis testing
    - A statistical hypothesis is a statement about the population under study or about the distribution of a quantity under consideration.
    - Researchers are interested in answering many types of questions. For example, a physician might want to know whether a new medication will lower a person’s blood pressure.
    - These types of questions can be addressed through statistical hypothesis testing, which is a decision-making process for evaluating claims about a population.
  • 50. Hypothesis testing
    - A hypothesis is a testable statement that describes the nature of the proposed relationship between two or more variables of interest.
    - In hypothesis testing, the researcher must define the population under study, state the particular hypotheses that will be investigated, give the significance level, select a sample from the population, collect the data, perform the calculations required for the statistical test, and reach a conclusion.
  • 51. Types of Hypotheses
    - Null hypothesis (represented by H0) is the statement about the value of the population parameter (the default statement).
    - The null hypothesis postulates that ‘there is no difference between factor and outcome’ or ‘there is no intervention effect’.
    - Alternative hypothesis (represented by HA) is the hypothesis that the researcher wants to test or claim; it states the ‘opposing’ view that ‘there is a difference between factor and outcome’ or ‘there is an intervention effect’.
    - Level of significance (α): the probability of rejecting the null hypothesis when it is true, i.e. the proportion of sample means that would fall outside the prescribed limits (in the rejection region) if H0 were true.
  • 53. Methods of hypothesis testing
    - Hypotheses are statements about parameters which may or may not be true.
    - The three methods used to test hypotheses are:
      - The traditional method
      - The P-value method
      - The confidence interval method
  • 54. Steps in hypothesis testing
    1. Identify the null hypothesis H0 and the alternative hypothesis HA.
    2. Choose 𝛼. The value should be small, usually less than 10%. It is important to consider the consequences of both types of errors.
    3. Select the test statistic and determine its value from the sample data.
    4. Compare the observed value of the statistic to the critical value obtained for the chosen 𝛼.
    5. Make a decision.
    6. State the conclusion.
  • 55. Test Statistics
    - A test statistic is a value that we can compare with the known distribution of what we expect when the null hypothesis is true.
    - The general formula of the test statistic is:
      Test statistic = (observed value − expected value) / standard error
  • 56. Critical value
    - The critical value separates the critical region from the non-critical region for a given level of significance.
  • 57. Decision making
    - Reject or do not reject ("accept") the null hypothesis.
    - There are 2 types of errors.
    - Type I error is the more serious error, and its probability is the level of significance (α).
    - Power is the probability of rejecting a false null hypothesis, and it is given by 1 − β.
  • 60. Types of errors
    - Type I error: we reject the null hypothesis when it is true (H0 is wrongly rejected).
      E.g. H0: there is no difference between two drugs on average. A Type I error occurs if we conclude that the two drugs produce different effects when actually there isn’t a difference. Prob(Type I error) = α
    - Type II error: we accept (fail to reject) the null hypothesis when it is false.
      E.g. H0: there is no difference between two drugs on average. A Type II error occurs if we conclude that the two drugs produce the same effects when actually there is a difference. Prob(Type II error) = 𝛽
  • 61. Hypothesis testing about a population mean (μ)
    Two-tailed test (the value of the sample statistic may fall into either tail of the distribution)
    The large-sample (n ≥ 30) test of hypothesis about a population mean μ is as follows:
    1. $H_0: \mu = \mu_0$ vs $H_A: \mu \neq \mu_0$
    2. $Z_{cal} = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
  • 62. Hypothesis testing about a population mean (μ)
    $Z_{tab} = Z_{\alpha/2}$
    Decision rule: reject H0 if the Z value falls in the rejection region; don’t reject H0 if the Z value falls in the non-rejection region.
    - If $|z_{cal}| \geq Z_{tab}$, reject H0.
    - If $|z_{cal}| < Z_{tab}$, accept (do not reject) H0.
    - If n < 30 and the variance is unknown, use $t_{cal} = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$ with n − 1 d.f.; the decision rule is the same as for the calculated z.
  • 63. One-tailed tests
    2. $H_0: \mu \geq \mu_0$ vs $H_A: \mu < \mu_0$, with $Z_{tab} = Z_{\alpha}$
       Decision: if $z_{cal} \geq -Z_{tab}$, accept H0; if $z_{cal} < -Z_{tab}$, reject H0.
    3. $H_0: \mu \leq \mu_0$ vs $H_A: \mu > \mu_0$
       Decision: if $z_{cal} \geq Z_{tab}$, reject H0; if $z_{cal} < Z_{tab}$, accept H0.
  • 64. The P-Value
    - The p-value is the probability of obtaining the observed difference, or a more extreme one, by chance alone when the null hypothesis is true.
    - A large p-value implies that a value like the one observed could easily occur just by chance when the null hypothesis is true.
    - With a small p-value, chance alone is an unlikely explanation, which suggests that there might be sufficient evidence for rejecting the null hypothesis.
    - The p-value is defined as the probability of observing the computed test statistic value or a larger one if H0 is true. For example, P[Z ≥ Zcal | H0 true].
  • 65. The P-Value…
    - A p-value is the probability of getting the observed difference, or one more extreme, in the sample purely by chance from a population where the true difference is zero.
    - If the p-value is greater than 0.05 then, by convention, we conclude that the observed difference could have occurred by chance and there is no statistically significant evidence (at the 5% level of significance) for a difference between the groups in the population.
  • 66. P-value and confidence interval
    - Confidence intervals are preferable because they give information about the size of any difference in the population, and they also indicate the amount of uncertainty remaining about the size of the difference.
    - When the null hypothesis is rejected in a two-tailed hypothesis test, the confidence interval for the mean using the same level of significance will not contain the hypothesized mean.
  • 67. P-value and confidence interval…
    - But for what values of the p-value should we reject the null hypothesis?
    - By convention, a p-value of 0.05 or smaller is considered sufficient evidence for rejecting the null hypothesis.
    - By using a p-value cut-off of 0.05, we are allowing a 5% chance of wrongly rejecting the null hypothesis when it is in fact true.
    - When the p-value is less than or equal to 0.05, we often say that the result is statistically significant.
  • 68. Example 1: A simple random sample of 10 people from a certain population has a mean age of 27. Can we conclude that the mean age of the population is not 30? The variance is known to be 20. Let 𝛼 = 0.05.
  • 69. Example… Solution
    1. State the hypotheses: H0: µ = 30 vs HA: µ ≠ 30
    2. Determine the level of significance: α = 0.05
    3. Calculate the test statistic: $Z_{cal} = \dfrac{27 - 30}{\sqrt{20}/\sqrt{10}} = -2.12$
    4. Determine the critical value: the Z critical value for an upper-tail area of 0.025 is 1.96.
    5. Make a decision: we reject the null hypothesis since |Zcal| = 2.12 ≥ Ztab = 1.96. That is, Zcal = −2.12 is in the rejection region.
    6. Conclusion: the mean age of the population is different from 30 at the 5% level of significance. We conclude that µ is not 30, since p-value = 0.034.
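The six steps of Example 1 can be reproduced with a short sketch, assuming a two-sided z test with known variance. The function name `one_sample_z_test` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(xbar, mu0, sigma2, n):
    """Two-sided z test for a mean when the population variance is known."""
    z = (xbar - mu0) / sqrt(sigma2 / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z_cal, p_value = one_sample_z_test(xbar=27, mu0=30, sigma2=20, n=10)
z_tab = NormalDist().inv_cdf(1 - 0.05 / 2)   # two-sided critical value at alpha = 0.05

print(f"z = {z_cal:.2f}, critical value = {z_tab:.2f}, p = {p_value:.3f}")
print("reject H0" if abs(z_cal) >= z_tab else "do not reject H0")
```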
  • 70. Example 2: Suppose that we have a hypothesized population mean of 3.1 and a sample of n = 20 people with 𝑥̄ = 4.5.
    1. H0: μ = 3.1 vs HA: μ ≠ 3.1
    2. α = 0.05 (95% confidence level)
    3. Our test statistic is:
  • 71. Example 2…
    4. The observed value of the test statistic falls within the non-rejection region, i.e. tcal = 1.14 < ttab = 2.09, so we do not reject H0.
    5. We do not reject H0, and we conclude that there is not enough evidence to reject the null hypothesis.
  • 72. Hypothesis testing for single proportions
    Example: In a study of childhood abuse in psychiatric patients, Brown found that 166 in a sample of 947 patients reported histories of physical or sexual abuse. Test the hypothesis that the true population proportion is 30%.
    Solution
    - To test the hypothesis we need to follow the steps.
  • 73. Example…
    Step 1: State the hypotheses: H0: P = P0 = 0.3 vs HA: P ≠ 0.3
    Step 2: Fix the level of significance (α = 0.05)
    Step 3: Determine the critical value: Ztab = Z𝛼/2 = 1.96
    Step 4: Compute the value of the test statistic, with p̂ = 166/947 = 0.175 and q0 = 1 − P0:
      $Z_{cal} = \dfrac{\hat{p} - P_0}{\sqrt{P_0 q_0 / n}} = \dfrac{0.175 - 0.3}{\sqrt{0.3(0.7)/947}} = \dfrac{-0.125}{0.0149} = -8.39$
  • 74. Example…
    Step 5: Make a decision: reject H0 since |Zcal| = 8.39 ≥ Ztab = 1.96.
    Step 6: Conclusion: there is statistical evidence that the true population proportion is different from 0.30.
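A hedged sketch of the single-proportion test follows. It keeps full precision for p̂ = 166/947, so it gives z of about −8.37 rather than the slide's −8.39, which comes from the rounded values 0.175 and 0.0149. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(x, n, p0):
    """z statistic for H0: P = p0, using the standard error under H0."""
    p_hat = x / n
    se0 = sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / se0

z_cal = one_proportion_z_test(x=166, n=947, p0=0.30)
z_tab = NormalDist().inv_cdf(1 - 0.05 / 2)
reject_h0 = abs(z_cal) >= z_tab

print(f"z = {z_cal:.2f}, critical value = {z_tab:.2f}, reject H0: {reject_h0}")
```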
  • 75. Hypothesis testing for two sample means
    H0: µ1 − µ2 = 0 vs HA: µ1 − µ2 ≠ 0 (two-tailed), HA: µ1 − µ2 < 0 or HA: µ1 − µ2 > 0 (one-tailed)
  • 76. Example: Researchers wish to know if the data they have collected provide sufficient evidence to indicate a difference in mean serum uric acid levels between normal individuals and individuals with Down’s syndrome. The data consist of serum uric acid readings on 12 individuals with Down’s syndrome and 15 normal individuals. The means are 4.5 mg/100ml and 3.4 mg/100ml with standard deviations of 2.9 and 3.5 mg/100ml, respectively, and population variances σ1² = 1 and σ2² = 1.5, respectively. Is there a difference between the means of both groups at the 5% level of significance?
    Hypothesis test: H0: µ1 − µ2 = 0 vs HA: µ1 − µ2 ≠ 0 (or HA: µ1 ≠ µ2)
  • 77. Cont…
    With α = 0.05, the critical values of Z are −1.96 and +1.96. We reject H0 if Z < −1.96 or Z > +1.96.
    Using the population variances, Z = (4.5 − 3.4) / √(1/12 + 1.5/15) = 2.57. Reject H0 because 2.57 > 1.96.
    - At the 5% level of significance, there is statistically significant evidence that the population means are not equal.
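The value Z = 2.57 used in the decision above can be recovered with the sketch below. It assumes the population variances 1 and 1.5 from the example are treated as known; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_test(mean1, mean2, var1, var2, n1, n2):
    """Two-sided z test for mu1 - mu2 = 0 with known population variances."""
    se = sqrt(var1 / n1 + var2 / n2)
    z = (mean1 - mean2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Down's syndrome group: n = 12, mean 4.5. Normal group: n = 15, mean 3.4.
z_cal, p_value = two_sample_z_test(4.5, 3.4, 1.0, 1.5, 12, 15)
print(f"z = {z_cal:.2f}, p = {p_value:.3f}")
```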
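  • 78. Hypothesis testing for two proportions
    - Suppose that n1 and n2 are large enough so that n1·p1 ≥ 5, n1·(1 − p1) ≥ 5, n2·p2 ≥ 5, and n2·(1 − p2) ≥ 5.
    - To test the hypothesis H0: P1 − P2 = 0 vs HA: P1 − P2 ≠ 0
    Test statistic:
      $Z_{cal} = \dfrac{(\hat{p}_1 - \hat{p}_2) - D_0}{\sigma_{\hat{p}_1-\hat{p}_2}}$, where $\sigma_{\hat{p}_1-\hat{p}_2} = \sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
    and $D_0 = (P_1 - P_2)$ is the hypothesized difference (0 under H0).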
  • 79. Example: Two hundred patients suffering from a certain disease were randomly divided into two equal groups. Of the first group, 78 recovered within three days. Out of the other 100, who were treated by a new method, 90 recovered within three days. The physician wishes to know whether the data provide sufficient evidence at 90% level of confidence to indicate that the new treatment is more effective than the standard treatment.
    Solution. Given: n1 = n2 = 100; p1 = 78/100 = 0.78, p2 = 90/100 = 0.90
    1. State the hypotheses: H0: P1 = P2 vs H1: P1 < P2
    2. Determine the level of significance.
  • 80. Example…
    3. Test statistic (note that 1 − 0.78 = 0.22):
      $Z_{cal} = \dfrac{(0.78 - 0.90) - 0}{\sqrt{\dfrac{0.78(0.22)}{100} + \dfrac{0.90(0.10)}{100}}} = \dfrac{-0.12}{0.051} = -2.35$
    4. Critical value: it is a one-tailed (left-tailed) test and therefore −Zα = −Z0.05 = −1.645
    5. Decision: since Zcal < −Zα, i.e. −2.35 < −1.645, we reject H0.
    6. Conclusion: the data suggest that the new treatment is more effective than the standard treatment at the 5% level of significance.
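A hedged sketch of this one-tailed test is given below. It uses the unpooled standard error with 1 − 0.78 = 0.22, which gives a standard error of about 0.051 and z of about −2.35, and it takes α = 0.05 for the left-tailed critical value. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(p1, n1, p2, n2):
    """z statistic for H0: P1 - P2 = 0 using the unpooled standard error."""
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se

z_cal = two_proportion_z(78 / 100, 100, 90 / 100, 100)
z_crit = NormalDist().inv_cdf(0.05)   # left-tail critical value, about -1.645
reject_h0 = z_cal < z_crit

print(f"z = {z_cal:.2f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```

The decision (reject H0) is the same whether the standard error is taken as 0.051 or as the slightly larger value that results from using 0.32 in place of 0.22.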
  • 81. Chi-square test
    - The chi-square test is used to determine whether there is a significant difference between the observed and expected frequencies in categorical attributes.
    - In recent years, the use of specialized statistical methods for categorical data has increased dramatically, particularly for applications in the biomedical and social sciences.
    - Categorical scales occur frequently in the health sciences for measuring responses.
  • 82. Cont…
    For example: patient survives an operation (yes, no), severity of an injury (none, mild, moderate, severe), and stage of a disease (initial, advanced).
    - Studies often collect data on categorical variables that can be summarized as a series of counts and commonly arranged in a tabular format known as a contingency table.
  • 83. Cont…
    - As with the t distribution, there is a different chi-square distribution for each possible value of degrees of freedom. Chi-square distributions with a small number of degrees of freedom are highly skewed; however, this skewness is attenuated as the number of degrees of freedom increases.
  • 84. Cont… The chi-squared distribution is concentrated over non-negative values. It has mean equal to its degrees of freedom (d.f), and its standard deviation equals √(2df). As d.f increases, the distribution concentrates around larger values and is more spread out. The distribution is skewed to the right, but it becomes more bell-shaped (normal) as d.f increases.
  • 85. Cont…
    - For a contingency table with r rows and c columns, d.f is equal to (r − 1) × (c − 1).
  • 86. Test of association
    - The chi-squared (χ²) test statistic is widely used in the analysis of contingency tables.
    - It compares the actual observed frequency in each group with the expected frequency.
    - The chi-squared test (Pearson’s χ²) allows us to test for association between categorical (nominal) variables.
    - The null hypothesis for this test is that there is no association between the variables. Consequently, a significant p-value implies association.
  • 87. Cont…
    Test statistic: χ²-test with d.f. = (r − 1) × (c − 1)
      $\chi^2 = \sum_{i,j} \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}$
    where $O_{ij}$ = observed frequency and $E_{ij}$ = expected frequency of the cell at the juncture of the i-th row and j-th column:
      $E_{ij} = \dfrac{i\text{th row total} \times j\text{th column total}}{\text{grand total}} = \dfrac{R_i \times C_j}{n}$
  • 88. Procedures of Hypothesis Testing
    1. State the hypotheses.
    2. Fix the level of significance.
    3. Find the critical value (𝜒²(d.f, α)).
    4. Compute the test statistic.
    5. Decision rule: reject the null hypothesis if the test statistic > table value.
    6. Make a conclusion.
  • 89. Test of associations for 2x2 tables
    If we call the frequencies in the four cells of a 2x2 table a, b, c and d, then the table is given by

    Exposure status   Diseased   Non-diseased   Row total
    Exposed           a          b              a+b
    Non-exposed       c          d              c+d
    Column total      a+c        b+d            a+b+c+d

    (The Diseased and Non-diseased columns give the disease status.)
  • 90. Cont…
    If the contingency table is 2x2, the d.f is (r − 1) × (c − 1) = 1, and
      $\chi^2 = \sum_{i,j} \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}} = \dfrac{n(ad - bc)^2}{(a+c)(b+d)(a+b)(c+d)}$
    where n = a + b + c + d.
  • 91. Assumptions of the χ² test
    The chi-squared test assumes that:
    - The data are categorical.
    - The data are frequency data (counts).
    - The numbers in each cell are ‘not too small’: no expected frequency should be less than 1, and
    - no more than 20% of the expected frequencies should be less than 5.
    - If this does not hold, row or column categories can sometimes be combined (re-categorized) to make the expected frequencies larger, or an alternative test can be used.
  • 92. Example:
    - Consider a hypothetical example on smoking and symptoms of asthma. The study involved 150 individuals and the result is given in the following table.
    - Is there an association between smoking cigarettes and symptoms of asthma at the 0.05 level of significance?

    Symptoms of asthma   Ever smoked: Yes   Ever smoked: No   Total
    Yes                  20                 30                50
    No                   22                 78                100
    Total                42                 108               150
  • 93. Table C. Right tail areas for the Chi-square distribution
    df \ area  0.995  0.99  0.975  0.95  0.9  0.25  0.1  0.05  0.025  0.01  0.005
    1  0.000  0.000  0.001  0.004  0.016  1.323  2.706  3.841  5.024  6.635  7.879
    2  0.010  0.020  0.051  0.103  0.211  2.773  4.605  5.991  7.378  9.210  10.597
    3  0.072  0.115  0.216  0.352  0.584  4.108  6.251  7.815  9.348  11.345  12.838
    4  0.207  0.297  0.484  0.711  1.064  5.385  7.779  9.488  11.143  13.277  14.860
    5  0.412  0.554  0.831  1.145  1.610  6.626  9.236  11.071  12.833  15.086  16.750
    6  0.676  0.872  1.237  1.635  2.204  7.841  10.645  12.592  14.449  16.812  18.548
    7  0.989  1.239  1.690  2.167  2.833  9.037  12.017  14.067  16.013  18.475  20.278
    8  1.344  1.647  2.180  2.733  3.490  10.219  13.362  15.507  17.535  20.090  21.955
    9  1.735  2.088  2.700  3.325  4.168  11.389  14.684  16.919  19.023  21.666  23.589
    10  2.156  2.558  3.247  3.940  4.865  12.549  15.987  18.307  20.483  23.209  25.188
    11  2.603  3.053  3.816  4.575  5.578  13.701  17.275  19.675  21.920  24.725  26.757
    12  3.074  3.571  4.404  5.226  6.304  14.845  18.549  21.026  23.337  26.217  28.300
    13  3.565  4.107  5.009  5.892  7.042  15.984  19.812  22.362  24.736  27.688  29.819
    14  4.075  4.660  5.629  6.571  7.790  17.117  21.064  23.685  26.119  29.141  31.319
    15  4.601  5.229  6.262  7.261  8.547  18.245  22.307  24.996  27.488  30.578  32.801
    16  5.142  5.812  6.908  7.962  9.312  19.369  23.542  26.296  28.845  32.000  34.267
    17  5.697  6.408  7.564  8.672  10.085  20.489  24.769  27.587  30.191  33.409  35.718
    18  6.265  7.015  8.231  9.390  10.865  21.605  25.989  28.869  31.526  34.805  37.156
    19  6.844  7.633  8.907  10.117  11.651  22.718  27.204  30.144  32.852  36.191  38.582
    20  7.434  8.260  9.591  10.851  12.443  23.828  28.412  31.410  34.170  37.566  39.997
    21  8.034  8.897  10.283  11.591  13.240  24.935  29.615  32.671  35.479  38.932  41.401
    22  8.643  9.542  10.982  12.338  14.041  26.039  30.813  33.924  36.781  40.289  42.796
    23  9.260  10.196  11.689  13.091  14.848  27.141  32.007  35.172  38.076  41.638  44.181
    24  9.886  10.856  12.401  13.848  15.659  28.241  33.196  36.415  39.364  42.980  45.559
    25  10.520  11.524  13.120  14.611  16.473  29.339  34.382  37.652  40.646  44.314  46.928
    26  11.160  12.198  13.844  15.379  17.292  30.435  35.563  38.885  41.923  45.642  48.290
    27  11.808  12.879  14.573  16.151  18.114  31.528  36.741  40.113  43.195  46.963  49.645
    28  12.461  13.565  15.308  16.928  18.939  32.620  37.916  41.337  44.461  48.278  50.993
    29  13.121  14.256  16.047  17.708  19.768  33.711  39.087  42.557  45.722  49.588  52.336
    30  13.787  14.953  16.791  18.493  20.599  34.800  40.256  43.773  46.979  50.892  53.672
  • 94. Solution
    Hypotheses:
    - H0: there is no association between smoking and symptoms of asthma.
    - HA: there is an association between smoking and symptoms of asthma.
    The critical value is given by 𝜒²(0.05, 1) = 3.841
    Test statistic: χ² = n(ad − bc)² / [(a+b)(c+d)(a+c)(b+d)] = 150(20×78 − 30×22)² / (50×100×42×108) = 5.36
  • 95. Cont…
    The p-value corresponding to 5.36 at 1 degree of freedom is approximately 0.02.
    Decision: since 5.36 > 3.841 (p < 0.05), we reject the null hypothesis in favour of the alternative hypothesis.
    Conclusion: there is an association between smoking and symptoms of asthma.
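The asthma example can be checked with the 2x2 shortcut formula, as in the sketch below. The p-value line relies on the fact that, for 1 degree of freedom, the chi-square right-tail area equals erfc(√(x/2)); the function name is illustrative.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for a 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Right-tail area of the chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

chi2, p_value = chi_square_2x2(a=20, b=30, c=22, d=78)
critical = 3.841   # table value for d.f. = 1, right-tail area 0.05

print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}, reject H0: {chi2 > critical}")
```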
  • 96. Exercise
    Consider the data on the assessment of the effectiveness of antidepressants, given below. Is there an association between treatment and depression at the 0.01 level of significance?

    Treatment     Depression: Yes   Depression: No   Total
    Desipramine   14 (8)            10 (16)          24
    Lithium       6 (8)             18 (16)          24
    Placebo       4 (8)             20 (16)          24
    Total         24                48               72

    (Values in parentheses are the expected frequencies.)
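For tables larger than 2x2, the general formula with expected frequencies E = (row total × column total) / n applies. The sketch below works the exercise under that formula; the function name is illustrative, and the critical value is read from the chi-square table for d.f. = 2 and right-tail area 0.01.

```python
def chi_square_table(observed):
    """Pearson chi-square statistic, degrees of freedom and expected counts for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected

# Rows: Desipramine, Lithium, Placebo. Columns: depression yes, no.
observed = [[14, 10], [6, 18], [4, 20]]
chi2, df, expected = chi_square_table(observed)
critical = 9.210   # chi-square table value for d.f. = 2, right-tail area 0.01

print(f"chi-square = {chi2:.2f}, d.f. = {df}, reject H0: {chi2 > critical}")
```

The expected counts returned by the function match the values shown in parentheses in the exercise table.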
  • 98. Measure of Association
    - The chi-square test only tells us whether or not there is an association between the two categorical variables; it does not tell us about the direction and strength of the association.
    - A measure of association quantifies the statistical relationship between exposure and disease.
    - An association is said to exist between two variables when a change in one variable parallels or coincides with a change in another variable.
    - It requires comparing two groups:
      - Exposed vs unexposed
      - Cases vs non-cases/controls
  • 99. Cont…
    - Variables can be related or unrelated to one another.
    - If they are related, the relation can be:
      - Positive or negative
      - Strong or weak (one variable can have a large or small effect on the other)
      - Statistically significant or not significant
    - A statistically significant association is one that is unlikely to be due to chance.
  • 100. Cont…
    Commonly, the strength of the association is measured by the
    - Relative risk (RR)
    - Odds ratio (OR)
  • 101. Relative Risk (RR)
    - Risk: the probability of an event occurring over time.
      Risk = (number of cases of disease) / (number of people at risk)
    - Risk ratio: the ratio of the risk of disease incidence in the exposed group compared to the risk in those unexposed.
    - Risk measures the probability of disease incidence among groups.
    - Relative risk is used to compare the risk in two different groups of people.
  • 102. Cont…
    - It estimates the magnitude (size) of an association between exposure and outcome.
    - It indicates the chance of developing the disease in the group exposed to a risk factor relative to the group that is not exposed.
  • 103. Cont…
    - Table 1: a 2 by 2 table indicating findings of a cohort study
  • 104. Cont…
    - From the above table the RR is calculated as: RR = [a/(a+b)] / [c/(c+d)]
  • 105. Example 1
    Table 2: Data from a cohort study of oral contraceptive (OC) use and bacteriuria among women aged 15-49 years.

    Current OC use   Bacteriuria: Yes   Bacteriuria: No   Total
    Yes              27                 455               482
    No               77                 1831              1908
    Total            104                2286              2390
  • 106. Cont… Calculate RR.
      $RR = \dfrac{\text{Incidence among exposed } (I_e)}{\text{Incidence among non-exposed } (I_o)} = \dfrac{a/(a+b)}{c/(c+d)} = \dfrac{(27/482) \times 1000}{(77/1908) \times 1000} = 1.4$
    - Interpretation: women who used oral contraceptives had 1.4 times the risk of developing bacteriuria compared with non-users.
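A short sketch of the relative-risk calculation, using the cell labels a, b, c, d of the cohort table (exposed row first). The function name `relative_risk` is illustrative.

```python
def relative_risk(a, b, c, d):
    """RR = incidence among exposed / incidence among non-exposed for a 2x2 cohort table."""
    incidence_exposed = a / (a + b)
    incidence_unexposed = c / (c + d)
    return incidence_exposed / incidence_unexposed

# OC users: 27 with bacteriuria, 455 without. Non-users: 77 with, 1831 without.
rr = relative_risk(a=27, b=455, c=77, d=1831)
print(f"RR = {rr:.2f}")
```

The unrounded value is about 1.39, which the slide reports as 1.4.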
  • 107. Interpretation
    - The value of RR ranges from 0 to infinity.
    - RR is never negative.
    - RR = 1
      - Risk in exposed = risk in non-exposed
      - No association
    - RR > 1
      - Risk in exposed > risk in non-exposed
      - Implies that exposed individuals are RR times as likely to develop the outcome as non-exposed individuals.
      - Positive association: the factor is associated with disease
      - Larger RR means a stronger association
  • 108. Cont…
    - RR < 1
      - Risk in exposed < risk in non-exposed
      - Indicates that the risk of acquiring the disease is lower among subjects with the risk factor than among subjects without the risk factor.
      - Negative association: the factor is “protective”
  • 110. Guideline for strength of association
    - 1.0 = No association
    - 1.1-1.3 = Weak
    - 1.4-1.7 = Mild
    - 1.8-3.0 = Moderate
    - 3.0-8.0 = Strong
    Q. What if RR is less than 1?
  • 111. Cont…
    - For inverse associations (RR less than 1.0), take the reciprocal and look it up in the guideline above, e.g. the reciprocal of 0.5 is 2.0, which corresponds to a “moderate” association.
    - The further RR is from 1, the stronger the association between exposure and disease.
  • 112. Odds Ratio (OR)
    - The odds of disease compare the probability that an individual experiences the disease with the probability that the individual does not, as a function of exposure.
    - Odds: the ratio of the probability of an event’s occurring to the probability of its not occurring.
      Odds = P / (1 − P)
      where P = the probability of an event and 1 − P = the probability that the event does not occur.
    - The odds ratio indicates the likelihood of having been exposed among cases relative to controls.
  • 113. Cont… Consider the following 2x2 table:

    Treatment   Outcome X−   Outcome X+   Total
    Y−          a            b            a+b
    Y+          c            d            c+d
    Total       a+c          b+d          a+b+c+d
  • 114. Cont…
    Odds ratio: the ratio of two odds, or the ratio of the odds of exposure in cases compared with the odds of exposure in controls.
      $OR = \dfrac{\text{Odds of exposure among cases}}{\text{Odds of exposure among controls}} = \dfrac{a/c}{b/d} = \dfrac{a \times d}{b \times c}$
    Odds: the ratio of the probability of occurrence of an event to that of non-occurrence.
    - We can calculate either the exposure odds ratio or the disease odds ratio; they are exactly the same.
  • 115. Example
    Table 3: Data from a case-control study of current oral contraceptive (OC) use and myocardial infarction (MI) in pre-menopausal female nurses.

    Current OC use   MI: Yes   MI: No   Total
    Yes              23        304      327
    No               133       2816     2949
    Total            156       3120     3276
  • 116. Cont… Calculate OR.
      $OR = \dfrac{a/c}{b/d} = \dfrac{23 \times 2816}{304 \times 133} = 1.6$
    Interpretation: the odds of having MI are 1.6 times higher among OC users than among non-users.
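The cross-product form OR = ad/bc can be sketched in a few lines; the function name `odds_ratio` is illustrative.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio (a*d)/(b*c) for a 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

# OC users: 23 with MI, 304 without. Non-users: 133 with MI, 2816 without.
or_mi = odds_ratio(a=23, b=304, c=133, d=2816)
print(f"OR = {or_mi:.2f}")
```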
  • 117. Interpretation cont’d…
    - OR ranges from 0 to positive infinity.
    - OR = 1: exposure is not related to disease.
    - OR > 1: exposure is positively related to disease.
    - OR < 1: exposure is negatively related to disease.
    [Figure: OR scale running from 0 through 1.0 to ∞, with regions labelled Negative, No, weak and Positive]
  • 118. Interpretation
    - The odds of having the disease in question are OR times greater among those exposed to the suspected risk factor than among those with no such exposure.
    - The formula for the standard error of the log odds ratio is given by
      $SE(\ln OR) = \sqrt{\dfrac{1}{a} + \dfrac{1}{b} + \dfrac{1}{c} + \dfrac{1}{d}}$
    - The 95% confidence interval for the log odds ratio is given by
      $\left(\ln OR - Z_{\alpha/2} \times SE(\ln OR),\;\; \ln OR + Z_{\alpha/2} \times SE(\ln OR)\right)$
  • 119. Cont…
    - To interpret the 95% confidence interval on the odds ratio scale, we need to transform back (exponentiate) to the original scale of the odds ratio.
    - That is, the 95% confidence interval for the odds ratio is given by:
      $\left(e^{\ln OR - 1.96 \times SE(\ln OR)},\;\; e^{\ln OR + 1.96 \times SE(\ln OR)}\right)$
    - OR is the point estimate from the sample.
  • 120. Exercise
    - Example: Let us consider an example in order to make the concept clear. The data in the table below give information about infant birth weights and mortality among white infants in region X within a year.
    - Find the confidence interval for the odds ratio of infant mortality at the 5% level of significance.

    Birth weight   Dead   Alive   Total
    Low BW         618    4597    5215
    High BW        422    67093   67515
    Total          1040   71690   72730
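One way to work this exercise is sketched below: compute the OR, build the interval on the log scale with SE(ln OR) = √(1/a + 1/b + 1/c + 1/d), then exponentiate the bounds. The function name is illustrative, and a 95% interval is assumed to correspond to the 5% level of significance.

```python
from math import exp, log, sqrt
from statistics import NormalDist

def odds_ratio_ci(a, b, c, d, confidence=0.95):
    """Odds ratio with a confidence interval built on the log scale and back-transformed."""
    or_hat = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower = exp(log(or_hat) - z * se_log)
    upper = exp(log(or_hat) + z * se_log)
    return or_hat, lower, upper

# Low birth weight: 618 dead, 4597 alive. High birth weight: 422 dead, 67093 alive.
or_hat, lower, upper = odds_ratio_ci(a=618, b=4597, c=422, d=67093)
print(f"OR = {or_hat:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Under these assumptions the interval lies well above 1, indicating much higher odds of death among low-birth-weight infants.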
  • 121. Sampled references
    - Bluman. Elementary Statistics: A Step by Step Approach, Eighth Edition.
    - An Introduction to Statistical Methods and Data Analysis, Sixth Edition.
    - Larry Winner. Introduction to Biostatistics. Department of Statistics, University of Florida.