Chapter 7: Hypothesis Testing
By: Yibekal M.
(BSc in PH, PHA & MPH in Epidemiology)
Hypothesis testing
• The majority of statistical analyses involve comparison,
most obviously between treatments or procedures or
between groups of subjects.
• The numerical value corresponding to the comparison
of interest is often called the effect.
• The purpose of hypothesis testing is to aid the
researcher in reaching a decision concerning a
population by examining a sample from that population.
• Hypothesis: a statement about one or more populations.
Hypothesis testing…
• The purpose of hypothesis testing (HT) is to aid the clinician,
researcher or administrator in reaching a decision (conclusion)
concerning a population by examining a sample from
that population.
A hypothesis:
• Is a statement about one or more populations
• Is a claim (assumption) about a population parameter
• Is frequently concerned with the parameters of the
population about which the statement is made.
Examples of Research Hypotheses
Population Mean
• The average length of stay of patients admitted to the
hospital is five days.
• The mean birth weight of babies delivered by mothers
with low socioeconomic status (SES) is lower than that of
babies delivered by mothers with higher SES.
Population Proportion
• The proportion of adult smokers in Dire Dawa is p = 0.40.
• The prevalence of HIV among non-married adults is
higher than that among married adults.
Hypothesis testing
Hypothesis testing in statistics involves the
following steps:
1. Choose the hypothesis that is to be questioned (H0) and choose
an alternative hypothesis (HA), which is accepted if the original
hypothesis is rejected.
2. Choose a rule for making a decision about when to reject the
original hypothesis and when to fail to reject it.
3. Choose a random sample from the appropriate population and
compute the appropriate statistics.
4. Make the decision.
5. State the conclusion.
1. State null and alternative hypotheses
A. The Null Hypothesis, H0
• The main hypothesis which we wish to test is called the null
hypothesis. It commonly states “no effect” or “no difference”
and turns the research question into a testable hypothesis.
• H0 is always a statement about a parameter (mean,
proportion, etc.) of a population.
• It is not about a sample, nor are sample statistics used in
formulating the null hypothesis.
• H0 is an equality (e.g. μ = 14) rather than an inequality.
B. The Alternative Hypothesis, HA
• Is a statement of what we will believe is
true if our sample data cause us to reject
H0.
• Is generally the hypothesis that is
believed (or needs to be supported) by the
researcher.
The null and alternative hypotheses…
Choosing the Alternative Hypothesis (HA)
• The notation HA (or H1) is used for the hypothesis that will
be accepted if H0 is rejected.
• HA must also be formulated before the sample is tested.
• If the mean height of the CMHS students (H0: μ = 1.63 m)
is questioned, then the two-sided alternative hypothesis is
HA: μ ≠ 1.63 m.
• Other possible (one-sided) alternatives are:
◦ HA: μ > 1.63 m
◦ HA: μ < 1.63 m
2. Choose a rule for making a decision
• Select a sample and collect data.
• Decide on the appropriate test statistic for the
hypothesis (Z, t, χ² or F). The choice depends on the sample
size, on whether the population variance is known, and on the
type of variable (numeric vs categorical).
• Select the level of significance for the statistical
test (α = 0.05, 0.01, 0.001, ...) and determine the critical
(tabulated) z or t value, e.g. Z_{α/2} = 1.96 for α = 0.05.
Example: Two-sided test at α = 5%
• Rejection region: Z < -1.96 (area α/2 = 0.025)
• Non-rejection region: -1.96 ≤ Z ≤ 1.96 (area 1 - α = 0.95)
• Rejection region: Z > 1.96 (area α/2 = 0.025)
• Critical values of ±1.645 correspond to a two-sided test at
α = 0.10 (equivalently, a one-sided test at α = 0.05).
Table values for Z statistic
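The tabulated critical values can also be computed rather than read from a table. The following is a minimal sketch in Python using only the standard library (`statistics.NormalDist`, Python 3.8 or later). The helper name `z_critical` is made up for illustration and is not part of the original slides.

```python
from statistics import NormalDist


def z_critical(alpha, two_sided=True):
    """Positive critical value of the standard normal for level alpha."""
    # A two-sided test splits alpha equally between the two tails.
    tail_area = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail_area)


# Print a small table for the significance levels mentioned above.
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: "
          f"two-sided {z_critical(alpha):.3f}, "
          f"one-sided {z_critical(alpha, two_sided=False):.3f}")
```

For α = 0.05 this gives approximately 1.96 (two-sided) and 1.645 (one-sided), matching the values quoted above.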
3. Compute the test statistic
• Compute the appropriate statistic (z or t calculated) from the
random sample.
4. Make a decision
• If the numerical value of the test statistic (z or t calculated)
falls in the rejection region, we reject the null hypothesis (H0)
and accept the alternative hypothesis (HA).
• If the test statistic does not fall in the rejection region, we do
not reject H0.
5. State the conclusion
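The decision rule can be sketched as a small function. This is an illustrative sketch only: the function name `decide` and the labels "two-sided", "greater" and "less" for the three forms of HA are assumptions introduced here, and the z values in the usage lines are arbitrary.

```python
from statistics import NormalDist


def decide(z_calc, alpha=0.05, alternative="two-sided"):
    """Apply the rejection-region rule to a calculated z statistic."""
    std_normal = NormalDist()
    if alternative == "two-sided":      # HA: parameter != hypothesised value
        reject = abs(z_calc) > std_normal.inv_cdf(1 - alpha / 2)
    elif alternative == "greater":      # HA: parameter > hypothesised value
        reject = z_calc > std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":         # HA: parameter < hypothesised value
        reject = z_calc < -std_normal.inv_cdf(1 - alpha)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return "reject H0" if reject else "fail to reject H0"


# Arbitrary illustrative values: the same statistic can lead to different
# decisions depending on which alternative was chosen in advance.
print(decide(-1.8))                      # two-sided test
print(decide(-1.8, alternative="less"))  # one-sided test
```

The alternative has to be fixed before the data are examined, which is why it is passed in as an argument rather than chosen from the sign of `z_calc`.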
A. Hypothesis test about a population
mean (normally distributed)
Example
Researchers are interested in the mean level
of some enzyme in a certain population. They
are asking: can we conclude that the mean
enzyme level in this population is different
from 25? They collect a sample of size 10
from a normally distributed population with a
known variance, σ² = 45. The calculated sample
mean is x̄ = 22. Take α = 0.05.
Solution
Step 1: State null and alternative hypotheses
H0: μ = 25
HA: μ ≠ 25
Step 2: Choose a rule for making a decision
The z statistic is appropriate (normal population, known variance).
α = 0.05, so Z_{α/2} = 1.96: reject H0 if Z < -1.96 or Z > 1.96.
Step 3: Test statistic (z calculated)
x̄ = 22, μ0 = 25, σ² = 45 (so σ = √45 ≈ 6.71), n = 10
$$Z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{22 - 25}{\sqrt{45}/\sqrt{10}} \approx -1.41$$
Then compare the calculated value with the tabulated value.
Step 4: Make a decision
Since -1.41 falls in the non-rejection region (-1.96 < -1.41 < 1.96),
we fail to reject the null hypothesis.
Step 5: State the conclusion
We cannot conclude that the mean enzyme level in the
population is different from 25.
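The arithmetic of the enzyme example can be checked with a few lines of Python. This is a sketch using only the standard library; the function name `one_sample_z` is made up for illustration. Note that the standard deviation is the square root of the stated variance of 45.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean, mu0, sigma, n):
    """z statistic for a single mean when the population SD is known."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu0) / standard_error


# Values from the example: mean 22, H0 mean 25, variance 45, n = 10.
z_calc = one_sample_z(sample_mean=22, mu0=25, sigma=sqrt(45), n=10)
z_tab = NormalDist().inv_cdf(1 - 0.05 / 2)  # two-sided critical value

print(f"z calculated = {z_calc:.2f}, z tabulated = ±{z_tab:.2f}")
if abs(z_calc) > z_tab:
    print("Reject H0")
else:
    print("Fail to reject H0")
```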
Example 2: In the study by Klingler et al. (2002), among 157 African American
male patients, the mean systolic blood pressure (SBP) was 146 mm Hg with a
standard deviation of 27. At the 95% level of confidence (α = 0.05), can we
conclude that the mean SBP for the population of African American men is
different from 140?
• Step 1
- H0: μ = 140
- HA: μ ≠ 140
• Step 2
- At the 95% level of confidence the critical value is Z_{α/2} = 1.96.
• Step 3
- Test statistic and calculation:
$$Z = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} = \frac{146 - 140}{27/\sqrt{157}} \approx 2.78$$
• Step 4
- Decision: reject H0, since Z calculated (2.78) > Z tabulated (1.96).
• Step 5
- Hence, at the 95% level of confidence, we conclude that the mean
SBP for the population of African American men is different from 140.
Test for a single mean (normally distributed)
Hypothesis Tests for Proportions
• Involves categorical values
• Two possible outcomes
◦ “Success” (possesses a certain characteristic)
◦ “Failure” (does not possess it)
• The fraction or proportion of the population in the “success”
category is denoted by p (written π in the example that follows).
Hypothesis testing about a single population proportion…
Example: A survey was conducted to study the
dental health practices and attitudes of a certain
urban adult population. Of 300 adults interviewed,
123 said that they regularly had a dental check-up
twice a year. Test whether the population
proportion is π = 0.5. Take α = 0.05.
Solution
Step 1: State null and alternative hypotheses
H0: π = 0.5
HA: π ≠ 0.5
Step 2: Choose a rule for making a decision
The z statistic is appropriate.
α = 0.05, so Z_{α/2} = 1.96: reject H0 if Z < -1.96 or Z > 1.96.
Step 3: Test statistic (z calculated)
p = 123/300 = 0.41
$$Z = \frac{p - \pi_0}{\sqrt{\pi_0(1 - \pi_0)/n}} = \frac{0.41 - 0.5}{\sqrt{(0.5)(0.5)/300}} \approx -3.12$$
Step 4: Make a decision
Since -3.12 < -1.96, we reject H0.
Step 5: State the conclusion
We conclude that the proportion of the population who
regularly have a dental check-up twice a year is not 50%.
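The proportion test above can be reproduced with the sketch below (standard library only; the function name `one_proportion_z` is hypothetical). The standard error uses the hypothesised proportion π0 = 0.5, not the sample proportion, because the statistic is evaluated under H0. The unrounded statistic is about -3.118.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(successes, n, pi0):
    """Sample proportion and z statistic for H0: pi = pi0."""
    p_hat = successes / n
    # Standard error under H0 uses the hypothesised proportion pi0.
    standard_error = sqrt(pi0 * (1 - pi0) / n)
    return p_hat, (p_hat - pi0) / standard_error


# Values from the example: 123 of 300 adults, H0: pi = 0.5.
p_hat, z_calc = one_proportion_z(successes=123, n=300, pi0=0.5)
z_tab = NormalDist().inv_cdf(1 - 0.05 / 2)

print(f"sample proportion = {p_hat:.2f}, z calculated = {z_calc:.3f}")
print("Reject H0" if abs(z_calc) > z_tab else "Fail to reject H0")
```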
Errors of hypothesis testing
The null hypothesis is either true or false.
Correspondingly, H0 is either not rejected or rejected.
Type I error: rejecting the null hypothesis when it is true.
• The probability of making a type I error is denoted by α.
Type II error: not rejecting the null hypothesis when it is
actually false.
• The probability of making a type II error is denoted by β.
Power: the probability of rejecting the null hypothesis
when it is false. Power = 1 - β.
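To make α, β and power concrete, the sketch below computes the power of a two-sided one-sample z test. It reuses the set-up of the enzyme example (H0: μ = 25, σ² = 45, n = 10) and assumes, purely for illustration, that the true population mean is 22; that "true" value and the function name `z_test_power` are assumptions, not part of the original material.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(mu_true, mu0, sigma, n, alpha=0.05):
    """Power of the two-sided one-sample z test when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Under mu_true the z statistic is normal with this mean and unit variance.
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    upper_tail = 1 - std_normal.cdf(z_crit - shift)  # P(Z > z_crit)
    lower_tail = std_normal.cdf(-z_crit - shift)     # P(Z < -z_crit)
    return upper_tail + lower_tail


# Hypothetical scenario: the true mean really is 22 while H0 says 25.
power = z_test_power(mu_true=22, mu0=25, sigma=sqrt(45), n=10)
beta = 1 - power
print(f"power = {power:.3f}, beta (type II error) = {beta:.3f}")

# When H0 is actually true, the probability of rejecting it is alpha.
alpha_check = z_test_power(mu_true=25, mu0=25, sigma=sqrt(45), n=10)
print(f"P(reject H0 | H0 true) = {alpha_check:.3f}")
```

Under these assumptions the power is only about 0.29, which illustrates that failing to reject H0 with a small sample does not show that H0 is true.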
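P-values
• The p-value is the probability, computed assuming the null
hypothesis is true, of obtaining a result at least as extreme
as the one observed.
• It is important to stress that the p-value is not
the probability that the null hypothesis is true
(or not true).
• It is a measure of the strength of the evidence
against the null hypothesis.
• The smaller the p-value, the stronger the
evidence (the less likely it is that an outcome this
extreme would occur by chance alone if H0 were true).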
The p-Value: Rules of Thumb
• When the p-value is smaller than 0.01, the result is
considered to be very significant.
• When the p-value is between 0.01 and 0.05, the result is
considered to be significant.
• When the p-value is between 0.05 and 0.10, the result is
considered by some as marginally significant (and
by others as not significant).
• When the p-value is greater than 0.10, the result is
considered not significant.
It is important to distinguish between the significance
level (α value) and the p-value.
• The significance level α is the probability of
making a type I error. This is set before the test is
carried out.
• The p-value is the result observed after the study
is completed and is based on the observed data.
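The distinction can be seen in code: α is fixed in advance, while the p-value is computed from the observed test statistic. The sketch below (standard library only; the function name `two_sided_p_value` is made up for illustration) uses the calculated z values from the two mean examples, -1.41 and 2.78.

```python
from statistics import NormalDist


def two_sided_p_value(z_calc):
    """P(|Z| >= |z_calc|) for a standard normal Z: the two-sided p-value."""
    return 2 * (1 - NormalDist().cdf(abs(z_calc)))


alpha = 0.05  # significance level, chosen before looking at the data

# Calculated z statistics taken from the two worked examples on means.
for label, z_calc in (("enzyme example", -1.41), ("SBP example", 2.78)):
    p_value = two_sided_p_value(z_calc)
    verdict = "reject H0" if p_value < alpha else "fail to reject H0"
    print(f"{label}: z = {z_calc}, p = {p_value:.4f} -> {verdict}")
```

Comparing the p-value with α gives the same decision as comparing the calculated z with the tabulated critical value.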
Confidence interval or p-value?
Confidence intervals and p-values are based upon the
same theory and mathematics, and will lead to the same
conclusion about whether a population difference
exists.
CI or p-value…
• If the p-value is greater than 0.05 then, by convention,
we conclude that the observed difference could have
occurred by chance, and there is no statistically
significant evidence (at the 5% level of significance)
for a difference between the groups in the population.
• A 95% confidence interval gives a plausible range of values that
should contain the true population difference. If the 95%
confidence interval includes the point of zero difference then, by
convention, the difference observed in the sample cannot be
generalized to the population.
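The agreement between the two approaches can be illustrated with the SBP example. The text above speaks of a difference between groups and the point of zero difference; the one-sample analogue, assumed here for illustration, is to check whether the 95% confidence interval for the mean contains the hypothesised value of 140. The function name `mean_confidence_interval` is hypothetical, and the normal (z) approximation is used because the sample is large.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sd, n, confidence=0.95):
    """Large-sample (z based) confidence interval for a population mean."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# SBP example: mean 146 mm Hg, SD 27, n = 157, hypothesised mean 140.
lower, upper = mean_confidence_interval(sample_mean=146, sd=27, n=157)
null_value = 140

print(f"95% CI for the mean SBP: ({lower:.1f}, {upper:.1f})")
if lower <= null_value <= upper:
    print("CI contains 140: no significant difference at the 5% level")
else:
    print("CI excludes 140: significant difference at the 5% level")
```

The interval excludes 140, which agrees with the decision to reject H0 in the worked example.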
The End!!!
