Statistical Inference /Hypothesis Testing

Dr. Nisha Arora
Statistical Inference: Hypothesis Testing

Statistics for Data Analysis
 What are the four main things we should know before
studying data analysis?
 Descriptive statistics
 Distributions (normal distribution / sampling distribution)
 Inferential statistics/ Hypothesis testing

Outline
 Concept
 Motivation
 Example: Quality Control Problem
 Null & Alternative Hypothesis
 One & Two – tailed tests
 Type I & Type II errors
 Courtroom Analogy
 Fallacies in Statistical Hypothesis Testing
 Example: Testing of Mean (σ known) One Tailed Test
 Example: Testing of Mean (σ known) Two Tailed Test

Concept
Hypothesis testing is the procedure for testing the truth of a statement or
claim about a population parameter on the basis of a sample selected from the
population.
The purpose of hypothesis testing is to determine whether there is
enough statistical evidence against a certain belief, or hypothesis,
about a parameter.
Is the sample statistic a function of chance or luck rather than an
accurate representation of the population parameter?

Motivation
CASE – I
 The purchase manager of a machine tool making company has to decide
whether to buy castings from a new supplier or not.
 The new supplier claims that his castings have higher hardness than those
of the competitors.
 If the claim is true, then it would be in the interest of the company to
switch from the existing suppliers to the new supplier because of the
higher hardness, all other conditions being similar.
 However, if the claim is not true, the purchase manager should continue
to buy from the existing suppliers.
 He needs a tool which allows him to test such a claim. Testing of
hypothesis provides such a tool to the decision maker.

Motivation
 If the purchase manager were to use this tool, he would ask the
new supplier to deliver a small number of castings.
 This sample of castings will be evaluated and based on the
strength of the evidence produced by the sample, the purchase
manager will accept or reject the claim of the new supplier and
accordingly make his decision.
 The claim made by the new supplier is a hypothesis that needs to
be tested and a statistical procedure which allows us to perform
such a test is called testing of hypothesis.

Motivation
CASE - II
 The CEO of a light bulb manufacturer needs to test whether the bulbs last
1000 burning hours on average. The plant manager randomly samples 100 bulbs
and finds that the sample average is 980 hours with a standard deviation of
80 hours. At the 5% level of significance, do the light bulbs last an
average of 1000 hours?
CASE - III
 A bolt manufacturer claims that, on average, 3% of the bolts manufactured by
his factory are defective. A sample of 200 bolts is selected and found to
contain 5% defective bolts.
 The dilemma: should we accept the claim or reject it?
 Suppose another sample of 200 bolts is selected and found to contain 3.2%
defective bolts.
 Should we accept the claim or reject it?
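As a rough sketch of how Cases II and III could be checked numerically, the snippet below computes a z statistic for each case. It assumes a large-sample normal approximation, treats the sample standard deviation in Case II as if it were the population value, and reads the bolt claim as H0: p = 0.03 with a right-tailed alternative at the 5% level. The function names are illustrative and not part of the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_for_mean(sample_mean, claimed_mean, sd, n):
    """z statistic for a sample mean against a claimed population mean."""
    return (sample_mean - claimed_mean) / (sd / sqrt(n))


def z_for_proportion(sample_prop, claimed_prop, n):
    """z statistic for a sample proportion; the standard error uses the claimed proportion."""
    se = sqrt(claimed_prop * (1 - claimed_prop) / n)
    return (sample_prop - claimed_prop) / se


alpha = 0.05
z_two_tailed = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
z_right_tailed = NormalDist().inv_cdf(1 - alpha)    # about 1.645

# Case II: H0: mean = 1000 hours, two-tailed (assumption).
z_bulbs = z_for_mean(980, 1000, 80, 100)
print(f"Case II: z = {z_bulbs:.2f}, reject H0: {abs(z_bulbs) > z_two_tailed}")

# Case III: H0: p = 0.03, right-tailed (assumption), for both samples.
for p_hat in (0.05, 0.032):
    z_bolts = z_for_proportion(p_hat, 0.03, 200)
    print(f"Case III ({p_hat:.1%} defective): z = {z_bolts:.2f}, "
          f"reject H0: {z_bolts > z_right_tailed}")
```

Under these assumptions the bulb sample gives z = -2.5, while the bolt samples give z of about 1.66 (5% defective) and 0.17 (3.2% defective), so only the first bolt sample crosses the 1.645 cut-off, and only narrowly.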

Motivation
More cases
 Is there statistical evidence, from a random sample of potential
customers, to support the hypothesis that more than 10% of the
potential customers will purchase a new product?
 Is a new drug effective in curing a certain disease? A sample of patients is
randomly selected and given the drug. The conditions of the patients are
then measured.
 The CEO of a large electric utility claims that 80 percent of his
1,000,000 customers are very satisfied with the service they receive. To
test this claim, the local newspaper surveyed 100 customers, using
simple random sampling. Among the sampled customers, 73 percent say
they are very satisfied. Based on these findings, can we reject the CEO's
hypothesis that 80% of the customers are very satisfied? Use a 0.05 level
of significance.
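The utility survey above can be worked through with a one-sample z-test for a proportion. This is a minimal sketch that assumes a two-tailed alternative (H0: p = 0.80 against H1: p ≠ 0.80) and a normal approximation; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.80      # proportion claimed by the CEO
p_hat = 0.73   # proportion observed in the survey
n = 100        # customers surveyed
alpha = 0.05

# Standard error of the sample proportion if the claim is true.
se = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se

# Two-tailed p-value: probability of a result at least this far from p0.
p_value = 2 * NormalDist().cdf(-abs(z))
reject = p_value < alpha

print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

With these inputs z is -1.75 and the two-tailed p-value is about 0.08, so under the stated assumptions the survey does not give enough evidence to reject the CEO's claim at the 0.05 level.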

Quality Control Problem: Bolt Manufacturer
Refer case III


 To test the claim, we set up a cut-off value (critical value), say 4%.
 If the proportion of defective bolts in the sample exceeds this
cut-off value, we reject his claim; otherwise we accept it.
 Importantly, we can change this cut-off value according to the
accuracy desired.
 E.g., if we want to be more sure that the claim is false
before we reject it, we can set a higher cut-off value, say 5.5%.
In this case we need more evidence to reject the claim.

 With the higher cut-off, we are less likely to reject the claim when it
is true, because the rejection region is smaller.
 Hence, the size of the rejection region, and with it the cut-off value,
is determined by the researcher.
 The significance level shows how sure we want to be when rejecting
H0.
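To see how the choice of cut-off changes the risk of rejecting a true claim, the sketch below computes the exact binomial probability that a sample of 200 bolts exceeds each cut-off when the defect rate really is 3%. It assumes independent bolts and uses only the numbers from the bolt example; the function name is illustrative.

```python
from math import comb


def prob_reject_true_claim(n, p_claimed, cutoff):
    """P(sample proportion of defectives > cutoff) when the claimed rate is true."""
    return sum(
        comb(n, k) * p_claimed**k * (1 - p_claimed) ** (n - k)
        for k in range(n + 1)
        if k / n > cutoff
    )


n = 200
p_claimed = 0.03

risk_at_4 = prob_reject_true_claim(n, p_claimed, 0.04)     # cut-off 4%
risk_at_55 = prob_reject_true_claim(n, p_claimed, 0.055)   # cut-off 5.5%

print(f"P(reject | claim true), cut-off 4%:   {risk_at_4:.3f}")
print(f"P(reject | claim true), cut-off 5.5%: {risk_at_55:.3f}")
```

The second probability is smaller than the first: raising the cut-off shrinks the rejection region and therefore lowers the chance of a Type I error.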

Hypothesis Testing: Basic Concepts
 Hypothesis
 An assumption made about a population parameter
 E.g., at most 3% of the bolts manufactured by his factory are defective
 Purpose of Hypothesis Testing
 To make a judgment about the difference between the sample statistic and
the population parameter
 The sample contains 5% defectives. Is this an accurate representation of
the population of bolts?
 The mechanism adopted to make this objective judgment is the core
of hypothesis testing

Null & Alternative Hypothesis
Null hypothesis H0
 It is a statement of no difference:
any observed difference is due to
chance.
 The null hypothesis refers to a
specified value of the population
parameter, not a sample statistic.
 Begin with the assumption that the
null hypothesis is TRUE
 Always contains the ‘=’ sign
Alternative hypothesis H1 or HA
 It is one in which some difference is
expected.
 It says that any observed difference in
the data can be generalized to the
population.
 It is the complementary/opposite
statement to Null hypothesis
 Never contains the ‘=’ sign
These two hypotheses are mutually exclusive and exhaustive so
that one is true to the exclusion of the other

If the null hypothesis is true, no action is
required; hence the name "null".

Quiz time
Can any one of the two be called the null hypothesis?
No, because the roles of H0 and H1 are not
symmetrical.

Quiz time
True or False?
The null hypothesis assumes that any observed difference
between the true value and the estimated value is due to
chance.
True

 A consumer analyst reports that the mean life of a certain type of
automobile battery is not 74 months (claim).
H0: Mean = 74
H1: Mean ≠ 74 (claim)
 A radio station publicizes that its proportion of the local listening
audience is greater than or equal to 39% (claim).
H0: P ≥ 39% (claim)
H1: P < 39%

Null & Alternative Hypothesis: More Examples
 Suppose we wanted to test the hypothesis that the mean familiarity
rating exceeds 4.0, the neutral value on a 7-point scale.
H0: Mean ≤ 4.0
H1: Mean > 4.0
 A new Internet Shopping Service will be introduced if more than 40% of
people use it.
H0: P ≤ 0.40
H1: P > 0.40

One Tailed/ Two Tailed Tests
 One-tailed tests
 Determine whether a particular population parameter is larger or
smaller than some predefined value
 If the claim/hypothesis is expressed in one direction (increasing or
decreasing), then we choose a one-tailed test.
 H0: Population mean attitudes are greater than or equal to 3.0
 Ha: Population mean attitudes are less than 3.0
 Two-tailed tests
 Determine whether a population parameter lies within certain
upper and lower bounds
 If the research hypothesis is expressed without direction, then we
choose a two-tailed test.
 H0: Population mean attitudes = 4.5
 Ha: Population mean attitudes are not equal to 4.5

One Tailed Tests
[Figure: two panels, Right-Tailed Test and Left-Tailed Test]

Courtroom Analogy…
 The basic concepts in hypothesis testing are actually quite analogous
to those in a criminal trial.
 A juror must presume that the defendant is innocent unless there is
enough evidence to conclude that he or she is guilty.
 A trial is held because the prosecution believes the status quo of
innocence is incorrect.
 The prosecution collects evidence, as researchers collect data, in the hope
that jurors will be convinced that such evidence is extremely unlikely
if the assumption of innocence were true.
Null Hypothesis: Defendant is innocent.
Alternative Hypothesis: Defendant is guilty.

Courtroom Analogy…
Potential choices and errors
Choice - 1
• We cannot rule out that defendant is
innocent, so he or she is set free without
penalty.
• Potential error: A criminal has been
erroneously freed.
Choice - 2
• We believe there is enough evidence to conclude
that the defendant is guilty.
• Potential error: An innocent person is falsely
convicted.
The error in Choice 2 is usually seen as more serious.

Related Terminology
 Type I error
 The error of rejecting the null hypothesis, when it is true (incorrect
decision)
 Is similar to convicting an innocent person
 Known as False alarm
 Also known as False positive error
 Type II error
 The error of accepting (i.e., failing to reject) the null hypothesis
when it is false (incorrect decision)
 Is similar to letting a guilty person go free
 Known as Failure to sound the alarm
 Also known as False negative error

Related Terminology
 Level of significance (α)
 The probability of a type I error
 The conditional probability of rejecting the null hypothesis when it
is true (incorrect decision)
 It is good to have a low value of α
 Generally taken as 0.01, 0.05 or 0.1
 Also known as Producer’s Risk
 Confidence level (1 − α)
 The probability of accepting the null hypothesis when it is true (i.e.,
correct decision)
 Most common values are 99%, 95% or 90%

Related Terminology
 Weakness of the test (β)
 The probability of a type II error
 The probability of accepting the null hypothesis when it is
false (i.e., incorrect decision)
 Also known as Consumer’s Risk
 It is good to have a low value of β
 Power of the test (1 − β)
 1 minus the probability of a type II error
 The conditional probability of rejecting the null hypothesis when it
is false (i.e., correct decision)

Important Fact
Alpha and beta have an inverse
relationship.
For a fixed sample size, we cannot
reduce both types of error simultaneously.
We fix one of the errors,
generally the producer’s risk (alpha), and
then try to minimize the other.

How can we decrease type I error?
To decrease type I error / to increase the confidence level:
Set a low level of significance / reduce the rejection region
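A small simulation can illustrate the link between the significance level and the Type I error rate. The sketch below is not from the slides: it assumes samples of size 30 drawn from a standard normal population for which H0: mean = 0 is true, applies a two-tailed z-test with known σ = 1, and counts how often H0 is wrongly rejected. The sample size, trial count and seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_one_error_rate(alpha, n=30, trials=5000, seed=0):
    """Fraction of samples drawn under a true H0 that a two-tailed z-test rejects."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = fmean(sample) / (1 / sqrt(n))  # (mean - 0) / (sigma / sqrt(n)), sigma = 1
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


rates = {alpha: type_one_error_rate(alpha) for alpha in (0.10, 0.05, 0.01)}
for alpha, rate in rates.items():
    print(f"alpha = {alpha:.2f}: observed false rejection rate = {rate:.3f}")
```

Because the same seed is reused, every level is tested on the same samples, so the observed rate can only fall as alpha is lowered; each rate should land close to its alpha.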

How can we decrease type II error?
To decrease type II error / to increase the power of the test:
Increase the sample size
A large enough sample makes it possible to detect a practical difference when
one truly exists.
Set a high level of significance / increase the rejection region
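The effect of both levers, sample size and significance level, on the Type II error can be sketched for a right-tailed z-test with known σ. The formula used is power = 1 − Φ(z_α − (μ_true − μ0)·√n/σ); the numeric values (μ0 = 100, true mean 105, σ = 15) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_right_tailed(alpha, n, mu0, mu_true, sigma):
    """Power of a right-tailed z-test of H0: mean <= mu0 when the true mean is mu_true."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    shift = (mu_true - mu0) * sqrt(n) / sigma
    return 1 - NormalDist().cdf(z_crit - shift)


# Hypothetical values, not taken from the slides.
mu0, mu_true, sigma = 100, 105, 15

for n in (10, 40, 160):
    for alpha in (0.01, 0.05, 0.10):
        power = power_right_tailed(alpha, n, mu0, mu_true, sigma)
        print(f"n = {n:3d}, alpha = {alpha:.2f}: "
              f"power = {power:.3f}, beta = {1 - power:.3f}")
```

Reading down the output, power rises (and beta falls) as n grows; reading across a fixed n, power rises as alpha is increased, at the cost of a larger Type I error.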

Type I Vs Type II Error
 Which one is worse?
 Both types of errors are problems for individuals, corporations,
and data analysis.
 A false positive (with null hypothesis of health) in medicine
causes unnecessary worry or treatment, while a false negative
gives the patient the dangerous illusion of good health and the
patient might not get an available treatment.
 A false positive in manufacturing quality control (with a null
hypothesis of a product being well made) discards a product that
is actually well made, while a false negative stamps a broken
product as operational.
 A false positive (with null hypothesis of no effect) in scientific
research suggests an effect that is not actually there, while a false
negative fails to detect an effect that is there.

Type I Vs Type II Error
 Which one is worse?
 Based on the real-life consequences of an error, one type may be more
serious than the other.
 For example, NASA engineers would prefer to throw out an electronic
circuit that is really fine (type I, false positive) than to use one on a
spacecraft that is actually broken (type II, false negative). In that
situation a type I error raises the budget, but a type II error would
risk the entire mission.
 On the other hand, criminal courts set a high bar for proof and
procedure and sometimes acquit someone who is guilty (type II, false
negative) rather than convict someone who is innocent (type I, false
positive).
 For any given sample size the effort to reduce one type of error generally
results in increasing the other type of error.
 The only way to minimize both types of error, without just improving
the test, is to increase the sample size, and this may not be feasible.

Critical Value
 The critical value for any test is the boundary of the acceptance region; in
other words, it is the cut point between the acceptance and rejection regions.
 For a one-tailed test
 It is positive for a right-tailed test and negative for a left-tailed test
 It is calculated using tables of the area under the standard normal
curve
 E.g., at the 5% level of significance, the critical values for right-/left-tailed
tests are 1.645/−1.645
 For a two-tailed test
 There are two critical values, which give the upper and lower bounds for
the test statistic
 One is positive and the other is negative
 E.g., at the 1% level of significance, the critical values for a two-tailed test
are ±2.576

Critical Values for z - Test
Level of significance | Left-tailed test | Right-tailed test | Two-tailed test
1%                    | −2.326           | 2.326             | ±2.576
5%                    | −1.645           | 1.645             | ±1.960
10%                   | −1.282           | 1.282             | ±1.645
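The tabulated critical values can be reproduced from the inverse CDF of the standard normal distribution. This is a small sketch using Python's standard library; the helper name `z_critical` is illustrative.

```python
from statistics import NormalDist


def z_critical(alpha, tail):
    """Critical z value for a left-, right- or two-tailed test at level alpha."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "two":
        # Returned as a positive number; the rejection region is beyond +/- this value.
        return std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'left', 'right' or 'two'")


for alpha in (0.01, 0.05, 0.10):
    print(f"alpha = {alpha:.2f}: "
          f"left {z_critical(alpha, 'left'):.3f}, "
          f"right {z_critical(alpha, 'right'):.3f}, "
          f"two-tailed +/-{z_critical(alpha, 'two'):.3f}")
```

For a two-tailed test, alpha is split equally between the two tails, which is why the 10% two-tailed value equals the 5% one-tailed value.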

P-value
The p-value is computed by assuming that the null hypothesis is true, and
then asking how likely we would be to observe such extreme results (or
even more extreme results) under that assumption.

P Value
 The P value, or calculated probability, is the probability of finding the observed, or
more extreme, results when the null hypothesis (H0) of a study question is true;
the definition of ‘extreme’ depends on how the hypothesis is being tested.
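How "extreme" is defined for each kind of test can be made concrete for a z statistic. The sketch below assumes the test statistic follows a standard normal distribution under H0; the function name `p_value` is illustrative.

```python
from statistics import NormalDist


def p_value(z, tail):
    """P value of an observed z statistic under a standard normal null distribution."""
    cdf = NormalDist().cdf
    if tail == "left":    # extreme means as small as z, or smaller
        return cdf(z)
    if tail == "right":   # extreme means as large as z, or larger
        return 1 - cdf(z)
    if tail == "two":     # extreme means as far from 0 as z, in either direction
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right' or 'two'")


for z in (-1.645, 1.96, 2.576):
    print(f"z = {z:6.3f}: left {p_value(z, 'left'):.4f}, "
          f"right {p_value(z, 'right'):.4f}, two-tailed {p_value(z, 'two'):.4f}")
```

H0 is rejected when the P value is smaller than the chosen level of significance, which is equivalent to the test statistic falling beyond the critical value.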

P Value
[Figure: P value shown relative to the observed value]

Various Tests of Significance
Parametric tests
Non-parametric tests

Selecting the appropriate tool
Parametric & Non-Parametric Tests
Make sure to check all assumptions before applying any statistical
technique.
Parametric tests: t-test, ANOVA
Non-parametric tests: U-test, H-test

Selecting the appropriate tool
Response Variable(s) (DVs) across, Explanatory Variable(s) (IDVs) down

IDVs              | IDV type       | One DV: Metric                    | One DV: Non-metric                   | More than one DV: Metric
One IDV           | Metric         | Simple Regression                 | LDA/Logit Reg                        | Path Analysis
One IDV           | Non-metric     | t test/Anova                      | Chi Square Test                      | Manova
More than one IDV | All Metric     | Multiple Reg                      | MDA/Multiple Logit Reg               | Path Analysis
More than one IDV | All Non-metric | n-way Anova                       | Complex Crosstab/Log-linear analysis | n-way Manova
More than one IDV | Mixed          | n-way Ancova/Dummy var Regression | Multi-Nominal Reg                    | n-way Mancova

'n' is the number of non-metric IDVs

Bivariate Analysis (one DV, one IDV)

IDV type   | DV: Metric        | DV: Non-metric
Metric     | Simple Regression | LDA/Logit Reg
Non-metric | t test/Anova      | Chi Square Test

One Sample z – test/ testing of mean (σ known)
 A sample of 40 sales receipts from a grocery store has a mean of $137,
and the population s.d. is $30.2. Use these values to test whether or not the
mean sales at the grocery store are different from $150.
 Step 1: Set the null and alternative hypotheses
Null Hypothesis: H0: Mean = 150
Alternative Hypothesis: H1: Mean ≠ 150
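The remaining steps of this two-tailed test can be sketched as follows. The example does not state a level of significance, so α = 0.05 is assumed here; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 40               # number of sales receipts
sample_mean = 137.0  # sample mean in dollars
sigma = 30.2         # known population standard deviation
mu0 = 150.0          # hypothesized mean under H0
alpha = 0.05         # assumed; not given in the example

# Step 2: test statistic.
z = (sample_mean - mu0) / (sigma / sqrt(n))

# Step 3: critical value and p-value for a two-tailed test.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
p_value = 2 * NormalDist().cdf(-abs(z))

# Step 4: decision.
reject = abs(z) > z_crit
print(f"z = {z:.3f}, critical value = +/-{z_crit:.3f}, "
      f"p-value = {p_value:.4f}, reject H0: {reject}")
```

Here z is about -2.72, which lies beyond ±1.96, so at the assumed 5% level the data suggest that mean sales differ from $150.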

One Sample z – test/ testing of mean (σ known)
 An insurance company is reviewing its current policy rates. When
originally setting the rates, they believed that the average claim amount
was $1,800. They are concerned that the true mean is actually higher
than this, because they could potentially lose a lot of money. They
randomly select 40 claims and calculate a sample mean of $1,950.
Assuming that the standard deviation of claims is $500, and setting α = 0.05,
test to see if the insurance company should be concerned.
 Step 1: Set the null and alternative hypotheses
Null Hypothesis: H0: Mean ≤ 1800
Alternative Hypothesis: H1: Mean > 1800
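The remaining steps of this right-tailed test can be sketched in the same way, using only the figures given in the example (n = 40, sample mean $1,950, σ = $500, α = 0.05); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 40                # number of claims sampled
sample_mean = 1950.0  # sample mean in dollars
sigma = 500.0         # assumed known standard deviation of claims
mu0 = 1800.0          # mean claim amount under H0
alpha = 0.05

# Step 2: test statistic.
z = (sample_mean - mu0) / (sigma / sqrt(n))

# Step 3: critical value and p-value for a right-tailed test.
z_crit = NormalDist().inv_cdf(1 - alpha)
p_value = 1 - NormalDist().cdf(z)

# Step 4: decision.
reject = z > z_crit
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, "
      f"p-value = {p_value:.4f}, reject H0: {reject}")
```

Here z is about 1.90, which exceeds 1.645, so H0 is rejected at the 5% level and the company has reason to be concerned that the true mean claim is above $1,800.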

Quiz time
True or False?
In Type 1 error we declare an effect which does not exist.
True

Quiz time
True or False?
If we reduce the confidence level from 95% to 90%, the chances of
our declaring that the effect observed in the sample actually prevails
in the population are higher.
True

Quiz time
True or False?
In Type-2 error, we may miss an effect which actually exists.
True

Quiz time
True or False?
If we increase the confidence level from 95% to 99%, the chances of
our missing an effect that actually prevails in the population
are higher.
True

My Interesting answers/posts
Want to excel in Statistics?
https://www.quora.com/How-can-I-get-better-at-statistics-within-a-month/answer/Nisha-Arora-9
Effect of change of origin & scale on variance?
https://www.quora.com/If-I-multiply-the-result-of-my-observations-by-3-how-variance-and-mean-will-vary/answer/Nisha-Arora-9
Do you need to brush up your probability concepts?
https://www.quora.com/How-do-I-know-when-to-add-and-when-to-multiply-in-questions-based-on-probability/answer/Nisha-Arora-9

Role of Null & Alternative hypothesis?
https://www.quora.com/Can-I-switch-around-the-null-and-alternative-hypothesis-in-hypothesis-testing/answer/Nisha-Arora-9
Type I error or Type II error?
https://www.quora.com/Why-dont-people-care-much-about-power-1-Type-II-error-of-a-hypothesis-test/answer/Nisha-Arora-9
Hypothesis testing in layman’s terms?
https://learnerworld.tumblr.com/post/147285942960/how-do-you-explain-hypothesis-testing-to-a-layman

My Expertise
❖Statistics
❖Data Analysis
❖Machine Learning, Data Science, Analytics
❖R Programming, Shiny R
❖Python
❖SPSS, Excel, PowerBI
❖Mathematics, Operation Research
❖Data Visualization & Storytelling
❖Reporting & Dashboarding

Reach Out to Me
http://stats.stackexchange.com/users/79100/learner
https://www.researchgate.net/profile/Nisha_Arora2/contributions
https://www.quora.com/profile/Nisha-Arora-9
http://learnerworld.tumblr.com/
Dr.aroranisha@gmail.com

Thank You
Images: Google Image
