Hypothesis Testing and Statistical Significance

When looking at 2 or more groups that differ based on a treatment or risk factor, there
are two possibilities:
Null Hypothesis (𝐻0) and Alternative Hypothesis (𝐻1)

Null Hypothesis (𝐻0):
No difference between the groups.
There is no relationship between the risk factor/treatment and occurrence of the health outcome.
By default you assume the null hypothesis is valid until you have enough evidence to support rejecting this hypothesis.

Alternative Hypothesis (𝐻1):
There is a difference between groups.
There is a relationship between the risk factor/treatment and occurrence of the health outcome.
By default you assume the alternative hypothesis is false until you have enough evidence to support this hypothesis.

One-tailed test: (𝐻0) difference ≤ 0, (𝐻1) difference > 0
Two-tailed test: (𝐻0) difference = 0, (𝐻1) difference ≠ 0

A test ends in one of two decisions about the Null Hypothesis (𝐻0):
Reject the null hypothesis
Fail to reject the null hypothesis

Type I and Type II Error
Type I Error = incorrectly rejecting the null hypothesis [false positive study result].
Type II Error = failing to reject the null hypothesis when you should have rejected it [false negative study result].
“Level of Significance” Alpha (α):
• It is the probability of making a Type I Error.
• Like all probabilities, alpha ranges from 0 to 1.
• Commonly we use α = 0.05.

P-Value Definition:
P-value is the probability of obtaining results equal to or more extreme than what was actually observed, assuming that the null hypothesis is true.
Example: checking whether a coin is fair
𝐻0 = the coin is fair
𝐻1 = the coin is tricky (not fair)
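
For the coin example, a minimal sketch of how such a p-value could be computed is shown here. The observed result (16 heads in 20 flips) and the function name coin_p_value are assumptions made for illustration, and "more extreme" is taken to mean at least as far from 10 heads in either direction (a two-tailed test).

```python
from math import comb


def coin_p_value(heads, flips):
    """Two-sided exact p-value under H0: the coin is fair (P(heads) = 0.5)."""
    expected = flips / 2
    observed_distance = abs(heads - expected)
    # Count every outcome at least as far from the expected number of heads
    # as the observed one; under H0 each sequence of flips is equally likely.
    extreme_outcomes = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - expected) >= observed_distance
    )
    return extreme_outcomes / 2 ** flips


# Hypothetical observation: 16 heads in 20 flips.
p_value = coin_p_value(16, 20)
print(f"p-value = {p_value:.4f}")  # about 0.0118
```

With these assumed numbers, a result this lopsided would occur only about 1.2% of the time if the coin were fair.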

When the p-value is very low, our data are hard to reconcile with the null hypothesis, and we will reject the null hypothesis.
When the p-value is high, there is less disagreement between our data and the null hypothesis.
Therefore, what determines whether a p-value is low or high?

Using Alpha (α) to Determine Statistical Significance:
If our p-value is lower than alpha we conclude that there is a statistically significant
difference between groups. When the p-value is higher than our significance level we
conclude that the observed difference between groups is not statistically significant.
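
The decision rule above can be sketched in code. The slides do not name a specific test, so this sketch assumes an exact permutation test for a difference in group means; the measurements, the names group_a and group_b, and the function permutation_p_value are all hypothetical.

```python
from itertools import combinations
from statistics import mean

ALPHA = 0.05  # conventional significance level

# Hypothetical measurements for two groups (illustrative values only).
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [6.9, 7.2, 6.5, 7.0, 7.4]


def permutation_p_value(a, b):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = a + b
    observed = abs(mean(b) - mean(a))
    extreme = 0
    total = 0
    # Under H0 the group labels are interchangeable, so every way of
    # splitting the pooled values into two groups is equally likely.
    for chosen in combinations(range(len(pooled)), len(a)):
        chosen_set = set(chosen)
        first = [pooled[i] for i in chosen]
        second = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(second) - mean(first)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


p_value = permutation_p_value(group_a, group_b)
if p_value < ALPHA:
    decision = "statistically significant"
else:
    decision = "not statistically significant"
print(f"p-value = {p_value:.4f}: {decision}")
```

In this made-up data set every value in the second group exceeds every value in the first, so only 2 of the 252 possible splits are as extreme as the observed one, and the p-value falls below α = 0.05.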

Confidence interval:
It is a "parameter catcher": you hope to catch the true parameter, but you never know if you actually did, because you don't know which sample mean you randomly got.
The same interval can be written in two ways: 95% CI (0.2–0.6) or 95% CI (0.4 ± 0.2).

• The desired level of confidence is set by the researcher (not determined by the data).
• Factors affecting the width of the confidence interval include the size of the sample, the confidence level, and the variability in the sample. A larger sample size normally leads to a narrower interval and a better estimate of the population parameter.
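
As a rough illustration of how these factors change the width, the sketch below computes a confidence interval for a mean. It assumes a normal approximation (a z-interval), which is only one possible method; for small samples a t-interval is usually preferred. The sample values and the function name mean_confidence_interval are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for a mean (normal approximation)."""
    # z is the standard normal quantile that leaves (1 - confidence) / 2
    # in each tail, e.g. about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * stdev(sample) / sqrt(len(sample))
    center = mean(sample)
    return center - margin, center + margin


# Hypothetical sample (illustrative values only).
sample = [0.31, 0.42, 0.38, 0.47, 0.35, 0.44, 0.40, 0.43]

low_95, high_95 = mean_confidence_interval(sample, 0.95)
low_99, high_99 = mean_confidence_interval(sample, 0.99)
# Repeating the values mimics a larger sample with similar variability.
low_big, high_big = mean_confidence_interval(sample * 4, 0.95)

print(f"95% CI, n=8:  ({low_95:.3f}, {high_95:.3f})")
print(f"99% CI, n=8:  ({low_99:.3f}, {high_99:.3f})")
print(f"95% CI, n=32: ({low_big:.3f}, {high_big:.3f})")
```

Raising the confidence level widens the interval, while increasing the sample size narrows it.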

Data visualization:
When studying the relationship between two variables, data can be visualized simply as a scatter plot of the paired observations:
(𝑋1, 𝑦1), (𝑋2, 𝑦2), (𝑋3, 𝑦3), (𝑋4, 𝑦4), (𝑋5, 𝑦5), (𝑋6, 𝑦6), (𝑋7, 𝑦7)
This relationship can be positive or negative, or there may be no correlation.

Pearson correlation coefficient (Pearson’s r):
Pearson’s r measures the strength of the linear relationship between two variables.
Pearson’s r is always between -1 and 1.

How to calculate Pearson’s r?
By using the CORREL function in the Microsoft Excel program:
=CORREL(array1, array2)
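
Outside of Excel, the same quantity can be computed directly from its standard definition. The following is a minimal sketch; the function name pearson_r and the data values are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical paired observations (illustrative values only).
x_values = [1, 2, 3, 4, 5, 6, 7]
y_values = [2.0, 2.9, 4.2, 4.8, 6.1, 6.8, 8.3]

r = pearson_r(x_values, y_values)
print(f"Pearson's r = {r:.3f}")
```

A value near +1 indicates a strong positive linear relationship, a value near -1 a strong negative one, and a value near 0 little or no linear relationship.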
R square, or the coefficient of determination, is defined as:
How much of the variation in the outcome is predictable from the independent variable.

Example:
• No model is perfect.
• So we want to know: how accurate are our model's predictions?
• To measure accuracy we use R square, which is between 0 and 1.
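
One common way to compute R square is as 1 minus the ratio of the unexplained variation to the total variation. The sketch below assumes that definition; the function name r_squared and the numbers are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_observed = sum(observed) / len(observed)
    # Variation the model fails to explain.
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total variation of the outcome around its mean.
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - ss_residual / ss_total


# Hypothetical outcomes and model predictions (illustrative values only).
observed = [3.0, 5.0, 7.0, 9.0]
predicted = [3.2, 4.7, 7.1, 9.0]

print(f"R square = {r_squared(observed, predicted):.3f}")
```

Perfect predictions give an R square of 1, while a model that always predicts the mean of the outcome gives 0.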

Regression analysis:
Does the Pearson correlation coefficient indicate the slope of the line?
No. If you get a Pearson correlation coefficient of +1, this does not mean that for every unit increase in one variable there is a unit increase in the other. It simply means that all the data points lie exactly on an upward-sloping line of best fit, with no variation around it.
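
A quick numerical check of this point: the sketch below builds two perfectly linear data sets with very different slopes and shows that both give r = +1. The helper slope_and_r and the two example lines are assumptions chosen for illustration.

```python
from math import sqrt


def slope_and_r(xs, ys):
    """Return the least-squares slope and Pearson's r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / s_xx, s_xy / sqrt(s_xx * s_yy)


xs = [1, 2, 3, 4, 5]
# Two hypothetical lines with no scatter: y = 3x + 2 and y = 0.5x + 2.
steep_slope, steep_r = slope_and_r(xs, [3 * x + 2 for x in xs])
shallow_slope, shallow_r = slope_and_r(xs, [0.5 * x + 2 for x in xs])

print(f"slope = {steep_slope:.1f}, r = {steep_r:.1f}")
print(f"slope = {shallow_slope:.1f}, r = {shallow_r:.1f}")
```

The slopes differ by a factor of six, yet r is the same, because r describes how tightly the points follow a line, not how steep that line is.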

Regression analysis:
Ŷ = aX + b
a = (Ŷ − b) / X
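
The slide gives the form of the regression line but not how a and b are estimated. The sketch below assumes the ordinary least-squares estimates, which are the usual choice; the function name fit_line and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares estimates of a (slope) and b (intercept) in Y-hat = a*X + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    a = s_xy / s_xx            # regression coefficient (slope)
    b = mean_y - a * mean_x    # intercept
    return a, b


# Hypothetical paired observations (illustrative values only).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)
predicted = a * 6 + b  # predicted outcome for a new value X = 6
print(f"Y-hat = {a:.2f} * X + {b:.2f}")
print(f"prediction at X = 6: {predicted:.2f}")
```

Rearranging the prediction recovers the slope, (predicted − b) / 6 = a, which matches the second formula above.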

Linear Relationship:
• Correlation coefficient
• Determination coefficient
• Regression coefficient
