Dorling Kindersley India Pvt. Ltd
Correlation and Simple Linear Regression Analysis
Learning Objectives
Upon completion of this chapter, you will be able to:
Ø Use the simple linear regression equation
Ø Compute the coefficient of correlation and understand its interpretation
Ø Understand the concept of measures of variation, coefficient of determination, and standard error of the estimate
Ø Understand and use residual analysis for testing the assumptions of regression
Ø Measure autocorrelation by using the Durbin–Watson statistic
Ø Understand statistical inference about slope, correlation coefficient of the regression model, and testing the overall
Using MS Excel, Minitab and SPSS for
Computing Correlation Coefficient
Ø Ch 15 Solved ExamplesExcelEx 15.1.xls
Ø Ch 15 Solved ExamplesMinitabEx 15.1.MPJ
Ø Ch 15 Solved ExamplesSPSSEx 15.1.sav
Ø Ch 15 Solved ExamplesSPSSOutput Ex 15.1.spv
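For readers without the packages listed above, the same coefficient can be computed in a few lines of Python. This is a minimal sketch: the data are hypothetical placeholders (not the values in Ex 15.1), and `pearson_r` is an illustrative helper name.

```python
import math

def pearson_r(x, y):
    """Pearson coefficient of correlation between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(f"r = {r:.4f}")
```

A value of r close to +1 or -1 indicates a strong linear relationship, while a value near 0 indicates a weak one.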
A Deterministic and Probabilistic Model
ε is the error of the regression line in fitting the points of the regression equation. If a point is on the regression line, the corresponding value of ε is equal to zero. If the point is not on the regression line, the value of ε measures the error.
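A sketch of the two models in the usual textbook notation, where the intercept β0 and slope β1 are assumed symbols that do not appear in the text above:

```latex
% Deterministic model: every point lies exactly on the line
y = \beta_0 + \beta_1 x

% Probabilistic model: the error term \varepsilon absorbs the deviation
% of each observed point from the line
y = \beta_0 + \beta_1 x + \varepsilon
```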
Example 15.2
A cable wire company has spent heavily on advertisements. The sales and advertisement expenses (in thousand rupees) for 12 randomly selected months are given in Table 14.2. Develop a regression model to predict the impact of advertisement on sales.
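The fitting step can be sketched with the ordinary least-squares formulas. The figures below are hypothetical placeholders, not the values from the example's table, and `fit_simple_linear_regression` is an illustrative name.

```python
def fit_simple_linear_regression(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept
    return b0, b1

# Hypothetical data (in thousand rupees), chosen to lie exactly on y = 1 + 2x
advertisement = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

b0, b1 = fit_simple_linear_regression(advertisement, sales)
print(f"sales = {b0:.2f} + {b1:.2f} * advertisement")
```

The slope b1 is read as the expected change in sales for a one-unit change in advertisement expense.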
Using MS Excel, Minitab, and SPSS for
Simple Linear Regression
Ø Ch 15 Solved ExamplesExcelEx 15.2.xls
Ø Ch 15 Solved ExamplesMinitabEX 15.2.MPJ
Ø Ch 15 Solved ExamplesSPSSEx 15.2.sav
Ø Ch 15 Solved ExamplesSPSSOutput Ex 15.2.spv
Lecture Outline
Ø Understand the concept of ANOVA.
Ø Compute and interpret the result of one-way ANOVA.
Analysis of Variance (ANOVA)
Ø Analysis of variance (ANOVA) is a technique for testing hypotheses about significant differences among several population means.
Ø In analysis of variance, the total variation in the sample data is split into two components: variance between the samples and variance within the samples.
Ø Variance between the samples is attributed to the difference among the sample means.
Ø Variance within the samples is the difference due to chance or experimental errors.
Figure: Partitioning the total sum of squares of the variation for completely randomized design (one-way ANOVA)
SST (total sum of squares) = SSC (sum of squares between columns) + SSE (sum of squares within samples)
Completely Randomized Design (One-way ANOVA)
Completely randomized design contains only one independent
variable, with two or more treatment levels or classifications.
Applying the F-Test Statistic
Ø In ANOVA, the F value is obtained by dividing the treatment variance (MSC) by the error variance (MSE).
Ø F test statistic in one-way ANOVA: F = MSC / MSE
Ø The F test statistic follows the F distribution with k – 1 degrees of freedom corresponding to MSC in the numerator and n – k degrees of freedom corresponding to MSE in the denominator.
The ANOVA Summary Table
Figure: Rejection and non-rejection region (acceptance region) when using ANOVA to test null hypothesis
Example
Vishal Foods Ltd is a leading manufacturer of biscuits. The company has launched a new brand in the four metros: Delhi, Mumbai, Kolkata, and Chennai. After one month, the company realizes that there is a difference in the retail price per pack of biscuits across cities. Before the launch, the company had promised its employees and newly appointed retailers that the biscuits would be sold at a uniform price in the country. The difference in price can tarnish the image of the company. In order to make a quick inference, the company collected data about the price from three randomly selected stores across the four cities. Based on the sample information, the price per pack of the biscuits (in rupees) is given.
Use one-way ANOVA to analyse the significant difference in the prices. Take 95% as the confidence level.
Example: Continued
Table: ANOVA table
Example
Volkswagen wants to examine the safety of compact cars, midsize cars, and full-size cars. It collects a sample of three for each of the treatments (car types). Using the data provided below, test whether the mean pressure applied to the driver’s head during a crash test is equal for each type of car. Use α = 5%.

Compact Cars   Midsize Cars   Full-Size Cars
15             25             10
25             25             5
20             35             15
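The F statistic for the crash-test data above can be worked out as follows. This is a sketch rather than the original solution: `one_way_anova` is an illustrative helper, and the critical value quoted in the comment is the standard F-table entry for (2, 6) degrees of freedom at α = 0.05.

```python
def one_way_anova(groups):
    """Return SSC, SSE, degrees of freedom, MSC, MSE and F for a one-way ANOVA."""
    k = len(groups)                          # number of treatments
    n = sum(len(g) for g in groups)          # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-columns (treatment) sum of squares
    ssc = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-samples (error) sum of squares
    sse = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    msc = ssc / (k - 1)
    mse = sse / (n - k)
    return {"SSC": ssc, "SSE": sse, "df_between": k - 1,
            "df_within": n - k, "MSC": msc, "MSE": mse, "F": msc / mse}

compact = [15, 25, 20]
midsize = [25, 25, 35]
full_size = [10, 5, 15]

result = one_way_anova([compact, midsize, full_size])
F_CRITICAL = 5.14  # F table value for df = (2, 6) at alpha = 0.05
reject_h0 = result["F"] > F_CRITICAL
print(f"F = {result['F']:.2f}, reject H0: {reject_h0}")
```

Because the computed F exceeds the table value, the null hypothesis of equal mean pressure across the three car types is rejected at the 5% level.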
Example
Students were given different drug treatments before revising for their exams. Some were given a memory drug, some a placebo drug, and some no treatment. Test whether performance differs across the three groups. The exam scores (%) are shown below for the three different groups:

             Memory Drug   Placebo   No Treatment
             70            37        3
             77            43        10
             83            50        17
             90            57        23
             97            63        30
Mean         83.40         50        16.60
Grand Mean   50
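For the exam-score data, the partition SST = SSC + SSE can be checked numerically before forming the F ratio. This is a sketch under stated assumptions: the variable names are illustrative, and the example does not state a significance level.

```python
memory_drug = [70, 77, 83, 90, 97]
placebo = [37, 43, 50, 57, 63]
no_treatment = [3, 10, 17, 23, 30]
groups = [memory_drug, placebo, no_treatment]

all_scores = [score for g in groups for score in g]
n, k = len(all_scores), len(groups)
grand_mean = sum(all_scores) / n

# Total, between-columns, and within-samples sums of squares
sst = sum((score - grand_mean) ** 2 for score in all_scores)
ssc = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
sse = sum((score - sum(g) / len(g)) ** 2 for g in groups for score in g)

msc = ssc / (k - 1)   # treatment variance, k - 1 = 2 degrees of freedom
mse = sse / (n - k)   # error variance, n - k = 12 degrees of freedom
f_stat = msc / mse

print(f"SST = {sst:.1f}, SSC = {ssc:.1f}, SSE = {sse:.1f}")
print(f"MSC = {msc:.1f}, MSE = {mse:.1f}, F = {f_stat:.2f}")
```

If α = 0.05 is assumed, the F-table value for (2, 12) degrees of freedom is about 3.89, so an F of this size would lead to rejecting the hypothesis of equal group means.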
Statistical Inference: Hypothesis Testing
Lecture Outline
Ø Understand hypothesis-testing procedure using one-tailed and two-tailed tests
Ø Understand the concepts of Type I and Type II errors in hypothesis testing
Ø Understand the procedure of hypothesis testing
The Concept of Normal Distribution
Introduction to Hypothesis Testing
Ø A statistical hypothesis is an assumption about an unknown
population parameter.
Ø Hypothesis testing is a well-defined procedure which helps us to decide objectively whether to accept or reject the hypothesis based on the information available from the sample.
Ø In statistical analysis, we use the concept of probability to specify a probability level at which a researcher concludes that the observed difference between the sample statistic and the population parameter is not due to chance.
Hypothesis Testing Procedure
Seven steps of hypothesis testing
Step 1: Set Null and Alternative Hypotheses
Ø The null hypothesis, generally denoted by H0 (H sub-zero), is the hypothesis which is tested for possible rejection under the assumption that it is true. Theoretically, a null hypothesis is set as no difference or status quo and is considered true until it is proved wrong by the collected sample data.
Ø Symbolically, a null hypothesis is represented as:
Ø The alternative hypothesis, generally denoted by H1 (H sub-one), is the logical opposite of the null hypothesis.
Ø Symbolically, the alternative hypothesis is represented as:
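As an illustration, assuming the parameter of interest is a population mean μ with hypothesized value μ0 (symbols chosen here, not taken from the text), a two-tailed pair of hypotheses is commonly written as:

```latex
H_0 : \mu = \mu_0
\qquad
H_1 : \mu \neq \mu_0
```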
Step 2: Determine the Appropriate Statistical Test
Ø The type, number, and level of data may provide a basis for deciding the statistical test.
Step 3: Set the Level of Significance
Ø The level of significance, generally denoted by α, is the probability of rejecting a null hypothesis when it is true.
Ø The level of significance is also known as the size of the rejection region or the size of the critical region.
Ø The levels of significance generally applied by researchers are 0.01, 0.05, and 0.10.
Type I and Type II Errors
When a researcher tests statistical hypotheses, there can be four
possible outcomes as follows:
Step 4: Set the Decision Rule
The area under the normal curve is divided into two mutually exclusive regions. These regions are termed the acceptance region (when the null hypothesis is accepted) and the rejection region or critical region (when the null hypothesis is rejected).
Acceptance and rejection regions of null hypothesis (two-tailed test)
Two-Tailed Test of Hypothesis
Ø Let us consider the null and alternative hypotheses as below:
Ø Two-tailed tests contain the rejection region on both tails of the sampling distribution of a test statistic. This means a researcher will reject the null hypothesis if the computed sample statistic is significantly higher than or lower than the hypothesized population parameter (considering both the tails, right as well as left).
Acceptance and rejection regions (alpha = 0.05)
One-Tailed Test of Hypothesis
Let us consider the null and alternative hypotheses as below:
A one-tailed test contains the rejection region on one tail of the sampling distribution of a test statistic. In case of a left-tailed test, a researcher rejects the null hypothesis if the computed sample statistic is significantly lower than the hypothesized population parameter.
In case of a right-tailed test, a researcher rejects the null hypothesis if the computed sample statistic is significantly higher than the hypothesized population parameter.
Acceptance and rejection regions for one-tailed (left) test (alpha = 0.05)
Acceptance and rejection regions for one-tailed (right) test (alpha = 0.05)
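The boundaries of the rejection regions for a z statistic can be obtained from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); `z_critical` is an illustrative helper name.

```python
from statistics import NormalDist

def z_critical(alpha, tail):
    """Critical z value(s) bounding the rejection region at significance level alpha."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        upper = std_normal.inv_cdf(1 - alpha / 2)
        return -upper, upper   # reject H0 if z < -upper or z > upper
    if tail == "left":
        return std_normal.inv_cdf(alpha)       # reject H0 if z is below this value
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)   # reject H0 if z is above this value
    raise ValueError("tail must be 'two', 'left', or 'right'")

two_tailed = z_critical(0.05, "two")
left_tailed = z_critical(0.05, "left")
right_tailed = z_critical(0.05, "right")
print(two_tailed, left_tailed, right_tailed)
```

At α = 0.05 the two-tailed boundaries are about ±1.96, while a one-tailed test places the whole 5% in a single tail, giving about -1.645 (left) or +1.645 (right).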
Step 5: Collect the Sample Data
Ø In this stage of sampling, data are collected and the appropriate sample statistics are computed.
Ø The first four steps should be completed before collecting the data for the study.
Ø It is not advisable to collect the data first and then decide on the stages of hypothesis testing.
Step 6: Analyse the Data
Ø In this step, the researcher has to compute the test statistic. This involves selection of an appropriate probability distribution for a particular test.
Ø Some of the commonly used testing procedures are z, t, F, and χ².
Step 7: Arrive at a Statistical Conclusion and Business Implication
Ø In this step, the researchers draw a statistical conclusion. A statistical conclusion is a decision to accept or reject a null hypothesis.
Ø Statisticians present the information obtained using the hypothesis-testing procedure to the decision makers. Decisions are made on the basis of this information. Ultimately, a decision maker decides that a statistically significant result is a substantive result and needs to be implemented for meeting the organization’s goals.
Hypothesis Testing for a Single Population Mean Using the Z Statistic
Ø When the sample size is greater than or equal to 30.
Ø Population has a normal distribution.
A marketing research firm conducted a survey 10 years ago and found that the average household income of a particular geographic region is Rs 10,000. Mr. Ahmad, who has recently joined the firm as a vice president, has expressed doubts about the accuracy of the data. To verify the data, the firm has decided to take a random sample of 200 households, which yields a sample mean (for household income) of Rs 11,000. Assume that the population standard deviation of the household income is Rs 1,200.
Verify Mr. Ahmad’s doubts using the seven steps of hypothesis testing. Let α = 0.05 (5%).
Example (Solution)
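The computational part of this example can be sketched as below. It assumes a two-tailed test with H0: μ = 10,000 against H1: μ ≠ 10,000 (the hypotheses are an assumption here), and `one_sample_z_test` is an illustrative helper name.

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, hypothesized_mean, population_sd, n, alpha=0.05):
    """Two-tailed z test for a single population mean with known population SD."""
    standard_error = population_sd / math.sqrt(n)
    z = (sample_mean - hypothesized_mean) / standard_error
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # upper critical value
    return z, z_crit, abs(z) > z_crit

# Figures from the household-income example (in rupees)
z, z_crit, reject_h0 = one_sample_z_test(
    sample_mean=11_000, hypothesized_mean=10_000, population_sd=1_200, n=200
)
print(f"z = {z:.2f}, critical value = ±{z_crit:.2f}, reject H0: {reject_h0}")
```

The computed z (about 11.79) lies far beyond the critical value of about 1.96, so under these assumptions the null hypothesis is rejected, which supports Mr. Ahmad's doubts about the old figure.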
Hypothesis Testing for a Single Population Mean Using the T Statistic (Case of a Small Random Sample When n < 30)
When a researcher draws a small random sample (n < 30) to estimate the population mean μ, the population standard deviation is unknown, and the population is normally distributed, the t-test can be applied.
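A minimal sketch of the t statistic, using the summary figures from the Royal Tyres example that follows. It assumes a two-tailed test at α = 0.05, which the text does not state, and takes the critical value from a standard t table. `one_sample_t_statistic` is an illustrative helper name.

```python
import math

def one_sample_t_statistic(sample_mean, hypothesized_mean, sample_sd, n):
    """Return (t, degrees of freedom) for a one-sample t test."""
    standard_error = sample_sd / math.sqrt(n)
    t = (sample_mean - hypothesized_mean) / standard_error
    return t, n - 1

t_stat, df = one_sample_t_statistic(
    sample_mean=39_750, hypothesized_mean=40_000, sample_sd=2618.61, n=8
)

# Assumed: two-tailed test at alpha = 0.05; t table value for 7 degrees of freedom
T_CRITICAL_DF7 = 2.365
reject_h0 = abs(t_stat) > T_CRITICAL_DF7
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

Under these assumptions the t statistic is about -0.27, well inside the acceptance region, so the sample gives no reason to reject the claimed average life of 40,000 km.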
Royal Tyres has launched a new brand of tyres for tractors and claims that under normal circumstances the average life of the tyres is 40,000 km. A retailer wants to test this claim and has taken a random sample of 8 tyres, and the mean was found to be 39,750 (S.D. = 2618.61). He tests the life of the tyres under normal circumstances. The results obtained