Multiple Regression
(1)

Slide 1

Shakeel Nouman
M.Phil Statistics
Multiple Regression (1) By Shakeel Nouman M.Phil Statistics Govt. College University Lahore, Statistical Officer

Slide 2

Multiple Regression (1)

• Using Statistics
• The k-Variable Multiple Regression Model
• The F Test of a Multiple Regression Model
• How Good is the Regression
• Tests of the Significance of Individual Regression Parameters
• Testing the Validity of the Regression Model
• Using the Multiple Regression Model for Prediction


Slide 3

Multiple Regression (2)

• Qualitative Independent Variables
• Polynomial Regression
• Nonlinear Models and Transformations
• Multicollinearity
• Residual Autocorrelation and the Durbin-Watson Test
• Partial F Tests and Variable Selection Methods
• The Matrix Approach to Multiple Regression Analysis
• Summary and Review of Terms

11-1 Using Statistics

Slide 4

[Figure: two panels, "Lines" and "Planes". Left: a line through points A and B in the (x, y) plane, with intercept $\beta_0$ and slope $\beta_1$. Right: a plane through points A, B, and C over the (x1, x2) plane.]

Any two points (A and B), or an intercept and slope ($\beta_0$ and $\beta_1$), define a line on a two-dimensional surface.

Any three points (A, B, and C), or an intercept and coefficients of x1 and x2 ($\beta_0$, $\beta_1$, and $\beta_2$), define a plane in a three-dimensional space.

11-2 The k-Variable Multiple
Regression Model

Slide 5

The population regression model of a dependent variable, Y, on a set of k independent variables, X1, X2, ..., Xk is given by:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + \varepsilon$$

where $\beta_0$ is the Y-intercept of the regression surface and each $\beta_i$, i = 1, 2, ..., k, is the slope of the regression surface (sometimes called the response surface) with respect to Xi.

[Figure: the regression plane $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$ over the (x1, x2) plane, with intercept $\beta_0$ and slopes $\beta_1$ and $\beta_2$.]

Model assumptions:
1. $\varepsilon \sim N(0, \sigma^2)$, independent of other errors.
2. The variables Xi are uncorrelated with the error term.

Simple and Multiple Least-Squares Regression

Slide 6

[Figure: two panels. Left: the estimated regression line $\hat{y} = b_0 + b_1 x$ in the (X, Y) plane. Right: the estimated regression plane $\hat{y} = b_0 + b_1 x_1 + b_2 x_2$ over the (x1, x2) plane.]

In a simple regression model, the least-squares estimators minimize the sum of squared errors from the estimated regression line.

In a multiple regression model, the least-squares estimators minimize the sum of squared errors from the estimated regression plane.

The Estimated Regression
Relationship

Slide 7

The estimated regression relationship:

$$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k$$

where $\hat{Y}$ is the predicted value of Y, the value lying on the estimated regression surface. The terms $b_0, \ldots, b_k$ are the least-squares estimates of the population regression parameters $\beta_i$.

The actual, observed value of Y is the predicted value plus an error:

$$y_j = b_0 + b_1 x_{1j} + b_2 x_{2j} + \cdots + b_k x_{kj} + e_j$$

Least-Squares Estimation:
The 2-Variable Normal Equations

Slide 8

Minimizing the sum of squared errors with respect to the estimated coefficients b0, b1, and b2 yields the following normal equations:

$$\sum y = n b_0 + b_1 \sum x_1 + b_2 \sum x_2$$

$$\sum x_1 y = b_0 \sum x_1 + b_1 \sum x_1^2 + b_2 \sum x_1 x_2$$

$$\sum x_2 y = b_0 \sum x_2 + b_1 \sum x_1 x_2 + b_2 \sum x_2^2$$

Example 11-1

Slide 9

   Y    X1    X2   X1X2    X1²    X2²    X1Y    X2Y
  72    12     5     60    144     25    864    360
  76    11     8     88    121     64    836    608
  78    15     6     90    225     36   1170    468
  70    10     5     50    100     25    700    350
  68    11     3     33    121      9    748    204
  80    16     9    144    256     81   1280    720
  82    14    12    168    196    144   1148    984
  65     8     4     32     64     16    520    260
  62     8     3     24     64      9    496    186
  90    18    10    180    324    100   1620    900
 ---   ---   ---    ---   ----    ---   ----   ----
 743   123    65    869   1615    509   9382   5040

Normal Equations:
743 = 10b0 + 123b1 + 65b2
9382 = 123b0 + 1615b1 + 869b2
5040 = 65b0 + 869b1 + 509b2

b0 = 47.164942
b1 = 1.5990404
b2 = 1.1487479

Estimated regression equation:

$$\hat{Y} = 47.164942 + 1.5990404 X_1 + 1.1487479 X_2$$

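The three normal equations of Example 11-1 form a small linear system, so the estimates can be checked numerically. The sketch below is only an illustration: it uses the column sums from the table, and the helper name `solve_linear_system` is an assumption of this example, not part of the original slides.

```python
# Column sums from the Example 11-1 table (n = 10 observations).
n = 10
sum_y, sum_x1, sum_x2 = 743, 123, 65
sum_x1x2, sum_x1sq, sum_x2sq = 869, 1615, 509
sum_x1y, sum_x2y = 9382, 5040

# Normal equations written as A b = c, with b = (b0, b1, b2).
A = [
    [n,      sum_x1,   sum_x2],
    [sum_x1, sum_x1sq, sum_x1x2],
    [sum_x2, sum_x1x2, sum_x2sq],
]
c = [sum_y, sum_x1y, sum_x2y]


def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gauss-Jordan elimination with partial pivoting."""
    size = len(rhs)
    # Augmented matrix [A | c], copied as floats so the inputs are not modified.
    aug = [[float(v) for v in row] + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(size):
        # Bring the row with the largest pivot candidate into position.
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from every other row.
        for row in range(size):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * p for a, p in zip(aug[row], aug[col])]
    return [aug[i][size] / aug[i][i] for i in range(size)]


b0, b1, b2 = solve_linear_system(A, c)
print(f"b0 = {b0:.6f}, b1 = {b1:.7f}, b2 = {b2:.7f}")
```

The solution should agree with the slide's b0, b1, and b2 up to rounding in the last printed digit.
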
Example 11-1: Using the
Template

Slide 10

Regression results for Alka-Seltzer sales

Decomposition of the Total
Deviation in a Multiple
Regression Model

Slide 11

[Figure: a data point above the estimated regression plane over the (x1, x2) plane, showing how its deviation from the mean of Y splits into two parts.]

Total deviation: $Y - \bar{Y}$
Error deviation: $Y - \hat{Y}$
Regression deviation: $\hat{Y} - \bar{Y}$

Total Deviation = Regression Deviation + Error Deviation

SST = SSR + SSE

11-3 The F Test of a Multiple
Regression Model

Slide 12

A statistical test for the existence of a linear relationship between Y and any or all of the independent variables X1, X2, ..., Xk:

H0: $\beta_1 = \beta_2 = \cdots = \beta_k = 0$
H1: Not all the $\beta_i$ (i = 1, 2, ..., k) are 0

Source of    Sum of    Degrees of
Variation    Squares   Freedom      Mean Square                F Ratio
Regression   SSR       k            MSR = SSR / k              F = MSR / MSE
Error        SSE       n - (k+1)    MSE = SSE / (n - (k+1))
Total        SST       n - 1        MST = SST / (n - 1)

Using the Template: Analysis of
Variance Table (Example 11-1)

Slide 13

[Figure: F distribution with 2 and 7 degrees of freedom, f(F) against F, showing the critical point F0.01 = 9.55 ($\alpha$ = 0.01) and the test statistic F = 86.34 far in the right tail.]

The test statistic, F = 86.34, is greater than the critical point of F(2, 7) for any common level of significance (p-value ≈ 0), so the null hypothesis is rejected, and we might conclude that the dependent variable is related to one or more of the independent variables.

11-4 How Good is the
Regression

Slide 14

The mean square error is an unbiased estimator of the variance of the population errors, $\varepsilon$, denoted by $\sigma^2$:

$$MSE = \frac{SSE}{n - (k+1)} = \frac{\sum (y - \hat{y})^2}{n - (k+1)}$$

Errors: $y - \hat{y}$

Standard error of estimate:

$$s = \sqrt{MSE}$$

The multiple coefficient of determination, $R^2$, measures the proportion of the variation in the dependent variable that is explained by the combination of the independent variables in the multiple regression model:

$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$

Decomposition of the Sum of
Squares and the Adjusted
Coefficient of Determination

Slide 15

SST = SSR + SSE

$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$

The adjusted multiple coefficient of determination, $\bar{R}^2$, is the coefficient of determination with the SSE and SST divided by their respective degrees of freedom:

$$\bar{R}^2 = 1 - \frac{SSE / (n - (k+1))}{SST / (n - 1)}$$

Example 11-1:   s = 1.911   R-sq = 96.1%   R-sq(adj) = 95.0%

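The summary measures quoted for Example 11-1 (F = 86.34, s = 1.911, R-sq = 96.1%, R-sq(adj) = 95.0%) follow directly from the sum-of-squares decomposition. The sketch below recomputes them from the data table, taking the slide's coefficient estimates as given; the variable names are choices of this illustration only.

```python
import math

# Example 11-1 data as listed in the table (Y, X1, X2).
y  = [72, 76, 78, 70, 68, 80, 82, 65, 62, 90]
x1 = [12, 11, 15, 10, 11, 16, 14, 8, 8, 18]
x2 = [5, 8, 6, 5, 3, 9, 12, 4, 3, 10]

# Least-squares estimates reported on the slide for Example 11-1.
b0, b1, b2 = 47.164942, 1.5990404, 1.1487479

n, k = len(y), 2
y_bar = sum(y) / n
y_hat = [b0 + b1 * u + b2 * v for u, v in zip(x1, x2)]

sst = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # error sum of squares
ssr = sst - sse                                        # regression sum of squares

msr = ssr / k
mse = sse / (n - (k + 1))
f_ratio = msr / mse
s = math.sqrt(mse)                     # standard error of estimate
r_sq = 1 - sse / sst
r_sq_adj = 1 - mse / (sst / (n - 1))   # 1 - MSE / MST

print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
print(f"F = {f_ratio:.2f}, s = {s:.3f}")
print(f"R-sq = {r_sq:.1%}, R-sq(adj) = {r_sq_adj:.1%}")
```

Because the coefficients are entered to the slide's printed precision, the results can differ from the template output in the last displayed digit.
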
Measures of Performance in Multiple
Regression and the ANOVA Table

Slide 16

Source of    Sum of    Degrees of
Variation    Squares   Freedom                  Mean Square                F Ratio
Regression   SSR       k                        MSR = SSR / k              F = MSR / MSE
Error        SSE       n - (k+1) = n - k - 1    MSE = SSE / (n - (k+1))
Total        SST       n - 1                    MST = SST / (n - 1)

$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$

$$F = \frac{R^2}{1 - R^2} \cdot \frac{n - (k+1)}{k}$$

$$\bar{R}^2 = 1 - \frac{SSE / (n - (k+1))}{SST / (n - 1)} = 1 - \frac{MSE}{MST}$$

11-5 Tests of the Significance of
Individual Regression Parameters

Slide 17

Hypothesis tests about individual regression slope parameters:

(1)   H0: $\beta_1 = 0$    H1: $\beta_1 \neq 0$
(2)   H0: $\beta_2 = 0$    H1: $\beta_2 \neq 0$
 .
 .
 .
(k)   H0: $\beta_k = 0$    H1: $\beta_k \neq 0$

Test statistic for test i:

$$t_{(n-(k+1))} = \frac{b_i - 0}{s(b_i)}$$

Regression Results for
Individual Parameters

Slide 18

Variable    Coefficient Estimate    Standard Error    t-Statistic
Constant          53.12                  5.43             9.783 *
X1                 2.03                  0.22             9.227 *
X2                 5.60                  1.30             4.308 *
X3                10.35                  6.88             1.504
X4                 3.45                  2.70             1.259
X5                -4.25                  0.38           -11.184 *

n = 150    t0.025 = 1.96    (* marks |t| > 1.96)

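Each t-statistic in the table is the coefficient estimate divided by its standard error, compared in absolute value with the critical point 1.96. A minimal sketch of that comparison is given below; it recomputes the ratios from the rounded table entries, so a recomputed value may differ slightly from the printed one.

```python
# (variable, coefficient estimate, standard error) as printed in the table.
results = [
    ("Constant", 53.12, 5.43),
    ("X1", 2.03, 0.22),
    ("X2", 5.60, 1.30),
    ("X3", 10.35, 6.88),
    ("X4", 3.45, 2.70),
    ("X5", -4.25, 0.38),
]
T_CRITICAL = 1.96  # t(0.025) quoted on the slide for n = 150

significant = []
for name, coef, se in results:
    t_stat = (coef - 0) / se           # test statistic for H0: beta_i = 0
    reject = abs(t_stat) > T_CRITICAL  # two-tailed test at the 5% level
    if reject:
        significant.append(name)
    print(f"{name:<8} t = {t_stat:7.3f}  reject H0: {reject}")
```

Under these inputs the constant, X1, X2, and X5 are flagged, while X3 and X4 are not.
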
Example 11-1: Using the
Template

Slide 19

Regression results for Alka-Seltzer sales

Using the Template: Example
11-2

Slide 20

Regression results for Exports to Singapore

11-6 Testing the Validity of the
Regression Model: Residual Plots

Slide 21

Residuals vs M1

It appears that the residuals are randomly distributed with no pattern and with equal variance as M1 increases.

11-6 Testing the Validity of the
Regression Model: Residual Plots

Slide 22

Residuals vs Price

It appears that the residuals are increasing as the Price increases. The variance of the residuals is not constant.

Normal Probability Plot for the
Residuals: Example 11-2

Slide 23

Linear trend indicates residuals are normally distributed

Investigating the Validity of the
Regression: Outliers and Influential
Observations

Slide 24

[Figure: two scatter plots of y against x. Left panel, "Outliers": the regression line without the outlier compared with the regression line with the outlier (*). Right panel, "Influential Observations": a point (*) with a large value of xi, the regression line when all data are included, and no relationship in the remaining cluster of points.]

Outliers and Influential
Observations: Example 11-2


Slide 25

Unusual Observations
Obs.     M1   EXPORTS      Fit   Stdev.Fit   Residual   St.Resid
   1   5.10    2.6000   2.6420      0.1288    -0.0420    -0.14 X
   2   4.90    2.6000   2.6438      0.1234    -0.0438    -0.14 X
  25   6.20    5.5000   4.5949      0.0676     0.9051     2.80R
  26   6.30    3.7000   4.6311      0.0651    -0.9311    -2.87R
  50   8.30    4.3000   5.1317      0.0648    -0.8317    -2.57R
  67   8.20    5.6000   4.9474      0.0668     0.6526     2.02R

R denotes an obs. with a large st. resid.
X denotes an obs. whose X value gives it large influence.

11-7 Using the Multiple
Regression Model for Prediction

Slide 26

[Figure: Estimated Regression Plane for Example 11-1. Axes: Sales (vertical, 63.42 to 89.76), Advertising (8.00 to 18.00), and Promotions (3 to 12).]

Prediction in Multiple
Regression

Slide 27

A $(1 - \alpha)100\%$ prediction interval for a value of Y given values of Xi:

$$\hat{y} \pm t_{(\alpha/2,\; n-(k+1))} \sqrt{s^2(\hat{y}) + MSE}$$

A $(1 - \alpha)100\%$ prediction interval for the conditional mean of Y given values of Xi:

$$\hat{y} \pm t_{(\alpha/2,\; n-(k+1))}\, s[\hat{E}(Y)]$$

11-8 Qualitative (or Categorical)
Independent Variables (in
Regression)

Slide 28

An indicator (dummy, binary) variable of qualitative level A:

$$X_h = \begin{cases} 1 & \text{if level A is obtained} \\ 0 & \text{if level A is not obtained} \end{cases}$$

Example 11-3:

MOVIE   EARN   COST   PROM   BOOK
    1     28    4.2    1.0      0
    2     35    6.0    3.0      1
    3     50    5.5    6.0      1
    4     20    3.3    1.0      0
    5     75   12.5   11.0      1
    6     60    9.6    8.0      1
    7     15    2.5    0.5      0
    8     45   10.8    5.0      0
    9     50    8.4    3.0      1
   10     34    6.6    2.0      0
   11     48   10.7    1.0      1
   12     82   11.0   15.0      1
   13     24    3.5    4.0      0
   14     50    6.9   10.0      0
   15     58    7.8    9.0      1
   16     63   10.1   10.0      0
   17     30    5.0    1.0      1
   18     37    7.5    5.0      0
   19     45    6.4    8.0      1
   20     72   10.0   12.0      1

Picturing Qualitative Variables in
Regression

Slide 29

[Figure: two panels. Left: two parallel lines in the (X1, Y) plane, the line for X2 = 0 with intercept b0 and the line for X2 = 1 with intercept b0 + b2. Right: two parallel planes over the (x1, x2) plane, separated vertically by b3.]

A regression with one quantitative variable (X1) and one qualitative variable (X2):

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$$

A multiple regression with two quantitative variables (X1 and X2) and one qualitative variable (X3):

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3$$

Picturing Qualitative Variables in
Regression: Three Categories and
Two Dummy Variables

Slide 30

A qualitative variable with r levels or categories is represented with (r-1) 0/1 (dummy) variables.

Category    X2   X3
Adventure    0    0
Drama        0    1
Romance      1    0

[Figure: three parallel lines in the (X1, Y) plane: the line for X2 = 0 and X3 = 0 with intercept b0, the line for X2 = 1 and X3 = 0 with intercept b0 + b2, and the line for X2 = 0 and X3 = 1 with intercept b0 + b3.]

A regression with one quantitative variable (X1) and one three-category qualitative variable coded by two dummy variables (X2 and X3):

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3$$

Using Qualitative Variables in
Regression: Example 11-4

Slide 31

Salary = 8547 + 949 Education + 1258 Experience - 3256 Gender
(SE)    (32.6)   (45.1)          (78.5)            (212.4)
(t)     (262.2)  (21.0)          (16.0)            (-15.3)

$$\text{Gender} = \begin{cases} 1 & \text{if Female} \\ 0 & \text{if Male} \end{cases}$$

On average, female salaries are $3256 below male salaries, holding education and experience constant.

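The effect of the dummy variable can be seen by evaluating the estimated equation twice with the same education and experience and only Gender changed. The sketch below does this; the profile values (16 and 5) and the function name `predicted_salary` are hypothetical and serve only as an illustration.

```python
def predicted_salary(education, experience, gender):
    """Estimated salary from the Example 11-4 equation; gender is 1 for female, 0 for male."""
    return 8547 + 949 * education + 1258 * experience - 3256 * gender


# Hypothetical profile: education = 16, experience = 5 (units as in the original data).
male = predicted_salary(16, 5, gender=0)
female = predicted_salary(16, 5, gender=1)
print(male, female, female - male)
```

Whatever profile is chosen, the difference between the two predictions equals the Gender coefficient, which is why the dummy variable shifts the intercept and leaves the slopes unchanged.
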
Interactions between Quantitative and Qualitative Variables: Shifting Slopes

Slide 32

[Figure: two lines in the (X1, Y) plane. Line for X2 = 0: intercept b0, slope b1. Line for X2 = 1: intercept b0 + b2, slope b1 + b3.]

A regression with interaction between a quantitative variable (X1) and a qualitative variable (X2):

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_1 x_2$$

11-9 Polynomial Regression

Slide 33

One-variable polynomial regression model:

$$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3 + \cdots + \beta_m X^m + \varepsilon$$

where m is the degree of the polynomial, the highest power of X appearing in the equation. The degree of the polynomial is the order of the model.

[Figure: two panels plotting Y against X1. Left: the line $\hat{y} = b_0 + b_1 X$ and the quadratic $\hat{y} = b_0 + b_1 X + b_2 X^2$. Right: the line $\hat{y} = b_0 + b_1 X$ and the cubic $\hat{y} = b_0 + b_1 X + b_2 X^2 + b_3 X^3$.]

Polynomial Regression:
Example 11-5

Slide 34

Polynomial Regression: Other
Variables and Cross-Product Terms

Slide 35

Variable   Estimate   Standard Error   T-statistic
X1             2.34             0.92          2.54
X2             3.11             1.05          2.96
X1²            4.22             1.00          4.22
X2²            3.57             2.12          1.68
X1X2           2.77             2.30          1.20

11-10 Nonlinear Models and
Transformations: Multiplicative
Model

Slide 36

The multiplicative model:

$$Y = \beta_0 X_1^{\beta_1} X_2^{\beta_2} X_3^{\beta_3} \varepsilon$$

The logarithmic transformation:

$$\log Y = \log \beta_0 + \beta_1 \log X_1 + \beta_2 \log X_2 + \beta_3 \log X_3 + \log \varepsilon$$

Transformations: Exponential
Model

Slide 37

The exponential model:

$$Y = \beta_0 e^{\beta_1 X_1} \varepsilon$$

The logarithmic transformation:

$$\log Y = \log \beta_0 + \beta_1 X_1 + \log \varepsilon$$

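To see why the logarithmic transformation helps, one can generate data from an exponential model and fit a straight line to log Y. The sketch below uses hypothetical, noise-free data (the values 2.0 and 0.3 are assumptions made for this illustration) and the simple-regression least-squares formulas.

```python
import math

# Hypothetical noise-free data generated from Y = beta0 * exp(beta1 * X).
beta0, beta1 = 2.0, 0.3
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [beta0 * math.exp(beta1 * xi) for xi in x]

# Transform the response and fit log(Y) = a + b * X by simple least squares.
log_y = [math.log(yi) for yi in y]
n = len(x)
x_bar = sum(x) / n
ly_bar = sum(log_y) / n
s_xy = sum((xi - x_bar) * (li - ly_bar) for xi, li in zip(x, log_y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)

slope = s_xy / s_xx                 # estimates beta1
intercept = ly_bar - slope * x_bar  # estimates log(beta0)
print(f"beta1 estimate = {slope:.4f}, beta0 estimate = {math.exp(intercept):.4f}")
```

With no error term the linear fit on the log scale recovers both parameters; with real data the multiplicative error becomes the additive term log ε.
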
Plots of Transformed Variables

Slide 38

[Figure: four panels.
- Simple Regression of Sales on Advertising (SALES against ADVERT): Y = 6.59271 + 1.19176 X, R-Squared = 0.895
- Regression of Sales on Log(Advertising) (SALES against LOGADV): Y = 3.66825 + 6.784 X, R-Squared = 0.978
- Regression of Log(Sales) on Log(Advertising) (LOGSALE against LOGADV): Y = 1.70082 + 0.553136 X, R-Squared = 0.947
- Residual Plots: Sales vs Log(Advertising) (RESIDS against Y-HAT)]

Variance Stabilizing
Transformations

Slide 39

• Square root transformation: $Y' = \sqrt{Y}$
  Useful when the variance of the regression errors is approximately proportional to the conditional mean of Y

• Logarithmic transformation: $Y' = \log(Y)$
  Useful when the variance of the regression errors is approximately proportional to the square of the conditional mean of Y

• Reciprocal transformation: $Y' = 1/Y$
  Useful when the variance of the regression errors is approximately proportional to the fourth power of the conditional mean of Y

Regression with Dependent
Indicator Variables

Slide 40

The logistic function:

$$E(Y \mid X) = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}}$$

Transformation to linearize the logistic function:

$$p' = \log\left(\frac{p}{1 - p}\right)$$

[Figure: Logistic Function, an S-shaped curve of y against x rising from 0 to 1.]

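The two formulas above are inverse to each other on the linear predictor: applying the log-odds transformation to the logistic function returns β0 + β1X. A small sketch of this follows; the coefficient values are hypothetical and chosen only for illustration.

```python
import math


def logistic(x, beta0, beta1):
    """E(Y | X) under the logistic model: a value strictly between 0 and 1."""
    z = beta0 + beta1 * x
    return math.exp(z) / (1 + math.exp(z))


def logit(p):
    """Log-odds transformation p' = log(p / (1 - p))."""
    return math.log(p / (1 - p))


# Hypothetical coefficients, chosen only for illustration.
beta0, beta1 = -3.0, 0.5
for x in (0, 4, 6, 8, 12):
    p = logistic(x, beta0, beta1)
    print(f"x = {x:2d}  p = {p:.4f}  logit(p) = {logit(p):.4f}")
```

The printed logit values increase linearly in x, which is the sense in which the transformation linearizes the model.
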
11-11: Multicollinearity

Slide 41

[Figure: four panels, each showing the X variables x1 and x2 as a pair of directions.]

• Orthogonal X variables provide information from independent sources. No multicollinearity.
• Perfectly collinear X variables provide identical information content. No regression.
• Some degree of collinearity. Problems with regression depend on the degree of collinearity.
• A high degree of negative collinearity also causes problems with regression.

Effects of Multicollinearity

Slide 42

• Variances of regression coefficients are inflated.
• Magnitudes of regression coefficients may be different from what is expected.
• Signs of regression coefficients may not be as expected.
• Adding or removing variables produces large changes in coefficients.
• Removing a data point may cause large changes in coefficient estimates or signs.
• In some cases, the F ratio may be significant while the t ratios are not.

Detecting the Existence of Multicollinearity: Correlation
Matrix of Independent Variables and Variance Inflation
Factors

Slide 43

Variance Inflation Factor

Slide 44

The variance inflation factor associated with $X_h$:

$$VIF(X_h) = \frac{1}{1 - R_h^2}$$

where $R_h^2$ is the $R^2$ value obtained for the regression of $X_h$ on the other independent variables.

[Figure: Relationship between VIF and $R_h^2$. VIF (vertical axis, 0 to 100) rises steeply as $R_h^2$ (horizontal axis, 0.0 to 1.0) approaches 1.]

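As an illustration of the formula, the sketch below computes the VIF for the two predictors of Example 11-1. With only two independent variables, the regression of one on the other has $R_h^2$ equal to their squared correlation, so no further fitting is needed; the function name `vif` is an assumption of this example.

```python
def vif(r_squared_h):
    """Variance inflation factor for a predictor whose regression on the others has R^2 = r_squared_h."""
    return 1.0 / (1.0 - r_squared_h)


# Example 11-1 predictors. With only two predictors, R_h^2 is the squared
# correlation between X1 and X2.
x1 = [12, 11, 15, 10, 11, 16, 14, 8, 8, 18]
x2 = [5, 8, 6, 5, 3, 9, 12, 4, 3, 10]
n = len(x1)
m1, m2 = sum(x1) / n, sum(x2) / n
s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
s11 = sum((a - m1) ** 2 for a in x1)
s22 = sum((b - m2) ** 2 for b in x2)
r_sq_h = s12 ** 2 / (s11 * s22)

print(f"R_h^2 = {r_sq_h:.4f}, VIF = {vif(r_sq_h):.3f}")

# How quickly the VIF grows as R_h^2 approaches 1.
for r2 in (0.0, 0.5, 0.8, 0.9, 0.99):
    print(f"R_h^2 = {r2:<4}  VIF = {vif(r2):.1f}")
```

The second loop reproduces the shape of the plotted relationship: the VIF stays small for moderate $R_h^2$ and grows without bound near 1.
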
Variance Inflation Factor (VIF)

Slide 45

Observation: The VIF (Variance Inflation Factor) values for the variables Lend and Price are both greater than 5. This indicates that some degree of multicollinearity exists with respect to these two variables.

Solutions to the
Multicollinearity Problem

Slide 46

• Drop a collinear variable from the regression
• Change the sampling plan to include elements outside the multicollinearity range
• Transformations of variables
• Ridge regression

11-12 Residual Autocorrelation
and the Durbin-Watson Test

Slide 47

An autocorrelation is a correlation of the values of a variable
with values of the same variable lagged one or more periods
back. Consequences of autocorrelation include inaccurate
estimates of variances and inaccurate predictions.

Lagged Residuals

  i    e(i)   e(i-1)   e(i-2)   e(i-3)   e(i-4)
  1     1.0      *        *        *        *
  2     0.0     1.0       *        *        *
  3    -1.0     0.0      1.0       *        *
  4     2.0    -1.0      0.0      1.0       *
  5     3.0     2.0     -1.0      0.0      1.0
  6    -2.0     3.0      2.0     -1.0      0.0
  7     1.0    -2.0      3.0      2.0     -1.0
  8     1.5     1.0     -2.0      3.0      2.0
  9     1.0     1.5      1.0     -2.0      3.0
 10    -2.5     1.0      1.5      1.0     -2.0

The Durbin-Watson test (first-order autocorrelation):

H0: $\rho_1 = 0$
H1: $\rho_1 \neq 0$

The Durbin-Watson test statistic:

$$d = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}$$

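The statistic is straightforward to compute from a residual series. As a sketch, the function below applies the formula to the ten residuals listed in the lagged-residuals table; those values only illustrate lagging, so the result is a computational example and not a test from the original slides.

```python
def durbin_watson(residuals):
    """d = sum_{i=2..n} (e_i - e_{i-1})^2 / sum_{i=1..n} e_i^2."""
    numerator = sum((cur - prev) ** 2 for prev, cur in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Residuals e_i from the lagged-residuals table (i = 1, ..., 10).
e = [1.0, 0.0, -1.0, 2.0, 3.0, -2.0, 1.0, 1.5, 1.0, -2.5]
d = durbin_watson(e)
print(f"d = {d:.3f}")
```

Values of d near 2 are consistent with no first-order autocorrelation, values near 0 with positive autocorrelation, and values near 4 with negative autocorrelation.
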
Critical Points of the Durbin-Watson Statistic: α = 0.05, n = Sample Size, k = Number of Independent Variables

Slide 48

          k=1           k=2           k=3           k=4           k=5
  n     dL    dU      dL    dU      dL    dU      dL    dU      dL    dU
 15    1.08  1.36    0.95  1.54    0.82  1.75    0.69  1.97    0.56  2.21
 16    1.10  1.37    0.98  1.54    0.86  1.73    0.74  1.93    0.62  2.15
 17    1.13  1.38    1.02  1.54    0.90  1.71    0.78  1.90    0.67  2.10
 18    1.16  1.39    1.05  1.53    0.93  1.69    0.82  1.87    0.71  2.06
  .
  .
  .
 65    1.57  1.63    1.54  1.66    1.50  1.70    1.47  1.73    1.44  1.77
 70    1.58  1.64    1.55  1.67    1.52  1.70    1.49  1.74    1.46  1.77
 75    1.60  1.65    1.57  1.68    1.54  1.71    1.51  1.74    1.49  1.77
 80    1.61  1.66    1.59  1.69    1.56  1.72    1.53  1.74    1.51  1.77
 85    1.62  1.67    1.60  1.70    1.57  1.72    1.55  1.75    1.52  1.77
 90    1.63  1.68    1.61  1.70    1.59  1.73    1.57  1.75    1.54  1.78
 95    1.64  1.69    1.62  1.71    1.60  1.73    1.58  1.75    1.56  1.78
100    1.65  1.69    1.63  1.72    1.61  1.74    1.59  1.76    1.57  1.78

Using the Durbin-Watson
Statistic

Slide 49

[Figure: the scale of the Durbin-Watson statistic from 0 to 4, divided into five regions.]

0 to dL:          Positive Autocorrelation
dL to dU:         Test is Inconclusive
dU to 4-dU:       No Autocorrelation
4-dU to 4-dL:     Test is Inconclusive
4-dL to 4:        Negative Autocorrelation

For n = 67, k = 4: dU ≈ 1.73, so 4 - dU ≈ 2.27; dL ≈ 1.47, so 4 - dL ≈ 2.53. The Durbin-Watson statistic, d = 2.58, exceeds 4 - dL ≈ 2.53, so H0 is rejected, and we conclude there is negative first-order autocorrelation.

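The five regions translate into a short decision rule. The sketch below is one way to write it; the function name and the treatment of values falling exactly on a boundary are choices made for this illustration.

```python
def durbin_watson_decision(d, d_lower, d_upper):
    """Classify d using the five regions on the 0-to-4 scale."""
    if d < d_lower:
        return "positive autocorrelation"
    if d < d_upper:
        return "inconclusive"
    if d <= 4 - d_upper:
        return "no autocorrelation"
    if d <= 4 - d_lower:
        return "inconclusive"
    return "negative autocorrelation"


# Values quoted on the slide for n = 67, k = 4.
print(durbin_watson_decision(2.58, d_lower=1.47, d_upper=1.73))
```

For d = 2.58 the rule lands in the last region, in line with the slide's conclusion of negative first-order autocorrelation.
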
11-13 Partial F Tests and Variable Selection Methods

Slide 50

Full model:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \varepsilon$$

Reduced model:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$$

Partial F test:
H0: $\beta_3 = \beta_4 = 0$
H1: $\beta_3$ and $\beta_4$ not both 0

Partial F statistic:

$$F_{(r,\; n-(k+1))} = \frac{(SSE_R - SSE_F) / r}{MSE_F}$$

where $SSE_R$ is the sum of squared errors of the reduced model, $SSE_F$ is the sum of squared errors of the full model; $MSE_F$ is the mean square error of the full model [$MSE_F = SSE_F / (n-(k+1))$]; r is the number of variables dropped from the full model.

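A direct transcription of the partial F statistic is sketched below. The sums of squared errors in the usage line are hypothetical numbers chosen to make the arithmetic easy to follow; they are not taken from the examples in these slides.

```python
def partial_f(sse_reduced, sse_full, r, n, k):
    """Partial F statistic with r and n - (k + 1) degrees of freedom.

    r is the number of variables dropped from the full model and
    k is the number of independent variables in the full model.
    """
    mse_full = sse_full / (n - (k + 1))
    return ((sse_reduced - sse_full) / r) / mse_full


# Hypothetical sums of squared errors, not taken from the slides.
f_stat = partial_f(sse_reduced=120.0, sse_full=100.0, r=2, n=25, k=4)
print(f"F(2, 20) = {f_stat:.2f}")
```

Here the full-model MSE is 100/20 = 5 and the numerator is (120 - 100)/2 = 10, so the statistic is 2, to be compared with the critical point of F(2, 20).
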
Variable Selection Methods

Slide 51

• All possible regressions
  Run regressions with all possible combinations of independent variables and select the best model.

A p-value of 0.001 indicates that we should reject the null hypothesis H0: the slopes for Lend and Exch. are zero.

Variable Selection Methods

Slide 52

• Stepwise procedures
  Forward selection
  » Add one variable at a time to the model, on the basis of its F statistic
  Backward elimination
  » Remove one variable at a time, on the basis of its F statistic
  Stepwise regression
  » Adds variables to the model and subtracts variables from the model, on the basis of the F statistic

Stepwise Regression

Slide 53

ComputF tttorvrlnotntmol

Itrtltonvrlwt pvlu Pn

No

Stop

Y
Entrmotnnt(mlltpvlu)vrlntomol

ClultprtlForllvrlntmol

ItrvrlwtpvluPout

Rmov
vrl

No

Multiple Regression (1) By Shakeel Nouman M.Phil Statistics Govt. College University Lahore, Statistical Officer
Stepwise Regression: Using the
Computer (MINITAB)

Slide 54

MTB > STEPWISE 'EXPORTS' PREDICTORS 'M1' 'LEND' 'PRICE' 'EXCHANGE'

Stepwise Regression
F-to-Enter:    4.00    F-to-Remove:    4.00
Response is EXPORTS on 4 predictors, with N = 67

Step             1          2
Constant    0.9348    -3.4230

M1           0.520      0.361
T-Ratio       9.89       9.21

PRICE                  0.0370
T-Ratio                  9.05

S            0.495      0.331
R-Sq         60.08      82.48

Using the Computer: MINITAB

Slide 55

MTB > REGRESS 'EXPORTS' 4 'M1' 'LEND' 'PRICE' 'EXCHANGE';
SUBC> vif;
SUBC> dw.

Regression Analysis
The regression equation is
EXPORTS = - 4.02 + 0.368 M1 + 0.0047 LEND + 0.0365 PRICE + 0.27 EXCHANGE

Predictor       Coef      Stdev    t-ratio       p    VIF
Constant      -4.015      2.766      -1.45   0.152
M1           0.36846    0.06385       5.77   0.000    3.2
LEND         0.00470    0.04922       0.10   0.924    5.4
PRICE       0.036511   0.009326       3.91   0.000    6.3
EXCHANGE       0.268      1.175       0.23   0.820    1.4

s = 0.3358    R-sq = 82.5%    R-sq(adj) = 81.4%

Analysis of Variance
SOURCE        DF        SS       MS       F       p
Regression     4   32.9463   8.2366   73.06   0.000
Error         62    6.9898   0.1127
Total         66   39.9361

Durbin-Watson statistic = 2.58

Using the Computer: SAS
(continued)

Slide 56

Parameter Estimates

                 Parameter      Standard     T for H0:
Variable   DF     Estimate         Error   Parameter=0   Prob > |T|
INTERCEP    1    -4.015461    2.76640057        -1.452       0.1517
M1          1     0.368456    0.06384841         5.771       0.0001
LEND        1     0.004702    0.04922186         0.096       0.9242
PRICE       1     0.036511    0.00932601         3.915       0.0002
EXCHANGE    1     0.267896    1.17544016         0.228       0.8205

                  Variance
Variable   DF    Inflation
INTERCEP    1   0.00000000
M1          1   3.20719533
LEND        1   5.35391367
PRICE       1   6.28873181
EXCHANGE    1   1.38570639

Durbin-Watson D               2.583
(For Number of Obs.)             67
1st Order Autocorrelation    -0.321

11-15: The Matrix Approach to
Regression Analysis (1)

Slide 57

The population regression model:

$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 1 & x_{11} & x_{12} & x_{13} & \cdots & x_{1k} \\ 1 & x_{21} & x_{22} & x_{23} & \cdots & x_{2k} \\ 1 & x_{31} & x_{32} & x_{33} & \cdots & x_{3k} \\ \vdots & \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{n1} & x_{n2} & x_{n3} & \cdots & x_{nk} \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \vdots \\ \varepsilon_n \end{bmatrix}$$

$$Y = X\beta + \varepsilon$$

The estimated regression model:

$$Y = Xb + e$$

The Matrix Approach to
Regression Analysis (2)

Slide 58

The normal equations:

$$X'Xb = X'Y$$

Estimators:

$$b = (X'X)^{-1} X'Y$$

Predicted values:

$$\hat{Y} = Xb = X(X'X)^{-1} X'Y = HY$$

$$V(b) = \sigma^2 (X'X)^{-1}$$

$$s^2(b) = MSE\,(X'X)^{-1}$$

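The matrix form of the normal equations can be connected back to Example 11-1: forming X'X and X'Y from the raw data reproduces the column sums used in the normal equations of that example. The sketch below does this with plain Python lists; the helper functions `transpose` and `matmul` are assumptions of this illustration, written out to keep the example free of external libraries.

```python
# Example 11-1 data; each row of X is (1, x1, x2).
y  = [72, 76, 78, 70, 68, 80, 82, 65, 62, 90]
x1 = [12, 11, 15, 10, 11, 16, 14, 8, 8, 18]
x2 = [5, 8, 6, 5, 3, 9, 12, 4, 3, 10]
X = [[1, a, b] for a, b in zip(x1, x2)]


def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    b_cols = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in b_cols] for row in a]


Xt = transpose(X)
XtX = matmul(Xt, X)                                     # X'X, a 3 x 3 matrix
XtY = [row[0] for row in matmul(Xt, [[v] for v in y])]  # X'Y, a vector of length 3

print(XtX)
print(XtY)

# Check X'X b = X'Y using the estimates reported for Example 11-1.
b = [47.164942, 1.5990404, 1.1487479]
lhs = [sum(c * bj for c, bj in zip(row, b)) for row in XtX]
print([round(v, 2) for v in lhs])
```

The entries of X'X are n, the sums of x1 and x2, and the sums of squares and cross-products, which is exactly the coefficient array of the normal equations in Example 11-1.
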
Slide 59

Name: Shakeel Nouman
Religion: Christian
Domicile: Punjab (Lahore)
Contact #: 0332-4462527, 0321-9898767
E.Mail: sn_gcu@yahoo.com, sn_gcu@hotmail.com
M.Phil (Statistics): GC University (Degree awarded by GC University)
M.Sc (Statistics): GC University (Degree awarded by GC University)
Statistical Officer (BS-17), Economics & Marketing Division: Livestock Production Research Institute, Bahadurnagar (Okara), Livestock & Dairy Development Department, Govt. of Punjab

More Related Content

What's hot

Basics of Regression analysis
 Basics of Regression analysis Basics of Regression analysis
Basics of Regression analysisMahak Vijayvargiya
 
Simple Linier Regression
Simple Linier RegressionSimple Linier Regression
Simple Linier Regressiondessybudiyanti
 
Introduction to Regression Analysis
Introduction to Regression AnalysisIntroduction to Regression Analysis
Introduction to Regression AnalysisMinha Hwang
 
Simple lin regress_inference
Simple lin regress_inferenceSimple lin regress_inference
Simple lin regress_inferenceKemal İnciroğlu
 
Regression analysis
Regression analysisRegression analysis
Regression analysisbijuhari
 
Regression analysis
Regression analysisRegression analysis
Regression analysisRavi shankar
 
Simple regression and correlation
Simple regression and correlationSimple regression and correlation
Simple regression and correlationMary Grace
 
Correlation and Regression
Correlation and RegressionCorrelation and Regression
Correlation and RegressionNeha Dokania
 
Regression Analysis presentation by Al Arizmendez and Cathryn Lottier
Regression Analysis presentation by Al Arizmendez and Cathryn LottierRegression Analysis presentation by Al Arizmendez and Cathryn Lottier
Regression Analysis presentation by Al Arizmendez and Cathryn LottierAl Arizmendez
 
Regression Analysis
Regression AnalysisRegression Analysis
Regression Analysisnadiazaheer
 
ML - Simple Linear Regression
ML - Simple Linear RegressionML - Simple Linear Regression
ML - Simple Linear RegressionAndrew Ferlitsch
 

What's hot (20)

Basics of Regression analysis
 Basics of Regression analysis Basics of Regression analysis
Basics of Regression analysis
 
Simple Linier Regression
Simple Linier RegressionSimple Linier Regression
Simple Linier Regression
 
Introduction to Regression Analysis
Introduction to Regression AnalysisIntroduction to Regression Analysis
Introduction to Regression Analysis
 
Regression
RegressionRegression
Regression
 
Simple lin regress_inference
Simple lin regress_inferenceSimple lin regress_inference
Simple lin regress_inference
 
Regression Analysis
Regression AnalysisRegression Analysis
Regression Analysis
 
Linear regression
Linear regressionLinear regression
Linear regression
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Regression Analysis
Regression AnalysisRegression Analysis
Regression Analysis
 
Malhotra17
Malhotra17Malhotra17
Malhotra17
 
Chi square using excel
Chi square using excelChi square using excel
Chi square using excel
 
Regression analysis in excel
Regression analysis in excelRegression analysis in excel
Regression analysis in excel
 
Simple regression and correlation
Simple regression and correlationSimple regression and correlation
Simple regression and correlation
 
Correlation and Regression
Correlation and RegressionCorrelation and Regression
Correlation and Regression
 
Linear regression
Linear regressionLinear regression
Linear regression
 
Regression Analysis presentation by Al Arizmendez and Cathryn Lottier
Regression Analysis presentation by Al Arizmendez and Cathryn LottierRegression Analysis presentation by Al Arizmendez and Cathryn Lottier
Regression Analysis presentation by Al Arizmendez and Cathryn Lottier
 
Chap12 simple regression
Chap12 simple regressionChap12 simple regression
Chap12 simple regression
 
Regression Analysis
Regression AnalysisRegression Analysis
Regression Analysis
 
ML - Simple Linear Regression
ML - Simple Linear RegressionML - Simple Linear Regression
ML - Simple Linear Regression
 

Multiple regression (1)

  • 1. Multiple Regression (1). By Shakeel Nouman, M.Phil Statistics, Govt. College University Lahore, Statistical Officer
  • 2. Multiple Regression (1) • Using Statistics • The k-Variable Multiple Regression Model • The F Test of a Multiple Regression Model • How Good is the Regression • Tests of the Significance of Individual Regression Parameters • Testing the Validity of the Regression Model • Using the Multiple Regression Model for Prediction
  • 3. Multiple Regression (2) • Qualitative Independent Variables • Polynomial Regression • Nonlinear Models and Transformations • Multicollinearity • Residual Autocorrelation and the Durbin-Watson Test • Partial F Tests and Variable Selection Methods • The Matrix Approach to Multiple Regression Analysis • Summary and Review of Terms
  • 4. 11-1 Using Statistics. Lines: any two points (A and B), or an intercept and slope ($\beta_0$ and $\beta_1$), define a line on a two-dimensional surface. Planes: any three points (A, B, and C), or an intercept and coefficients of $x_1$ and $x_2$ ($\beta_0$, $\beta_1$, and $\beta_2$), define a plane in a three-dimensional space. [Figure: a line in the (x, y) plane with intercept $\beta_0$ and slope $\beta_1$; a plane over the $x_1$ and $x_2$ axes.]
  • 5. 11-2 The k-Variable Multiple Regression Model. The population regression model of a dependent variable, Y, on a set of k independent variables, $X_1, X_2, \ldots, X_k$, is given by: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + \varepsilon$, where $\beta_0$ is the Y-intercept of the regression surface and each $\beta_i$, $i = 1, 2, \ldots, k$, is the slope of the regression surface (sometimes called the response surface) with respect to $X_i$. [Figure: the plane $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$ over the $x_1$ and $x_2$ axes.] Model assumptions: 1. $\varepsilon \sim N(0, \sigma^2)$, independent of other errors. 2. The variables $X_i$ are uncorrelated with the error term.
  • 6. Simple and Multiple Least-Squares Regression. In a simple regression model, $\hat{y} = b_0 + b_1 x$, the least-squares estimators minimize the sum of squared errors from the estimated regression line. In a multiple regression model, $\hat{y} = b_0 + b_1 x_1 + b_2 x_2$, the least-squares estimators minimize the sum of squared errors from the estimated regression plane.
  • 7. The Estimated Regression Relationship. The estimated regression relationship: $\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k$, where $\hat{Y}$ is the predicted value of Y, the value lying on the estimated regression surface. The terms $b_i$, $i = 0, 1, \ldots, k$, are the least-squares estimates of the population regression parameters $\beta_i$. The actual, observed value of Y is the predicted value plus an error: $y_j = b_0 + b_1 x_{1j} + b_2 x_{2j} + \cdots + b_k x_{kj} + e_j$.
  • 8. Least-Squares Estimation: The 2-Variable Normal Equations. Minimizing the sum of squared errors with respect to the estimated coefficients $b_0$, $b_1$, and $b_2$ yields the following normal equations:
      $\sum y = n b_0 + b_1 \sum x_1 + b_2 \sum x_2$
      $\sum x_1 y = b_0 \sum x_1 + b_1 \sum x_1^2 + b_2 \sum x_1 x_2$
      $\sum x_2 y = b_0 \sum x_2 + b_1 \sum x_1 x_2 + b_2 \sum x_2^2$
  • 9. Example 11-1
        Y    X1   X2   X1X2   X1^2   X2^2    X1Y    X2Y
       72    12    5     60    144     25    864    360
       76    11    8     88    121     64    836    608
       78    15    6     90    225     36   1170    468
       70    10    5     50    100     25    700    350
       68    11    3     33    121      9    748    204
       80    16    9    144    256     81   1280    720
       82    14   12    168    196    144   1148    984
       65     8    4     32     64     16    520    260
       62     8    3     24     64      9    496    186
       90    18   10    180    324    100   1620    900
      ---   ---  ---    ---   ----    ---   ----   ----
      743   123   65    869   1615    509   9382   5040
    Normal equations:
      743 = 10 b0 + 123 b1 + 65 b2
      9382 = 123 b0 + 1615 b1 + 869 b2
      5040 = 65 b0 + 869 b1 + 509 b2
    Solution: b0 = 47.164942, b1 = 1.5990404, b2 = 1.1487479.
    Estimated regression equation: $\hat{Y} = 47.164942 + 1.5990404 X_1 + 1.1487479 X_2$
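The normal equations of Example 11-1 can be checked numerically. The following is a minimal sketch, assuming NumPy is available; it rebuilds the sums from the ten observations in the table and solves the 3x3 linear system. The variable names are illustrative and do not come from the slides.

```python
import numpy as np

# Observations from the Example 11-1 table.
y = np.array([72, 76, 78, 70, 68, 80, 82, 65, 62, 90], dtype=float)
x1 = np.array([12, 11, 15, 10, 11, 16, 14, 8, 8, 18], dtype=float)
x2 = np.array([5, 8, 6, 5, 3, 9, 12, 4, 3, 10], dtype=float)

# Design matrix with a leading column of ones for the intercept b0.
X = np.column_stack([np.ones_like(y), x1, x2])

# X'X holds n and the sums of x1, x2, x1^2, x1*x2, x2^2;
# X'y holds the sums of y, x1*y, x2*y. Together they are the normal equations.
A = X.T @ X
rhs = X.T @ y

b0, b1, b2 = np.linalg.solve(A, rhs)
print(A)
print(rhs)
print(b0, b1, b2)
```

Solving the system directly is fine for a small, well-behaved example like this one; for larger or nearly collinear data a least-squares routine is usually preferred over forming X'X explicitly.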
  • 10. Example 11-1: Using the Template. Regression results for Alka-Seltzer sales.
  • 11. Decomposition of the Total Deviation in a Multiple Regression Model. Total deviation: $Y - \bar{Y}$. Error deviation: $Y - \hat{Y}$. Regression deviation: $\hat{Y} - \bar{Y}$. Total Deviation = Regression Deviation + Error Deviation. SST = SSR + SSE.
  • 12. 11-3 The F Test of a Multiple Regression Model. A statistical test for the existence of a linear relationship between Y and any or all of the independent variables $X_1, X_2, \ldots, X_k$:
      $H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$
      $H_1:$ Not all the $\beta_i$ ($i = 1, 2, \ldots, k$) are 0
      Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square             F Ratio
      Regression            SSR              k                    MSR = SSR/k             F = MSR/MSE
      Error                 SSE              n - (k+1)            MSE = SSE/(n - (k+1))
      Total                 SST              n - 1                MST = SST/(n - 1)
  • 13. Using the Template: Analysis of Variance Table (Example 11-1). [Figure: F distribution with 2 and 7 degrees of freedom, showing the critical point $F_{0.01} = 9.55$ for $\alpha = 0.01$ and the test statistic 86.34.] The test statistic, F = 86.34, is greater than the critical point of F(2, 7) for any common level of significance (p-value ≈ 0), so the null hypothesis is rejected, and we might conclude that the dependent variable is related to one or more of the independent variables.
  • 14. 11-4 How Good is the Regression. The mean square error is an unbiased estimator of the variance of the population errors, $\varepsilon$, denoted by $\sigma^2$: $MSE = \frac{SSE}{n - (k+1)} = \frac{\sum (y - \hat{y})^2}{n - (k+1)}$. Errors: $y - \hat{y}$. Standard error of estimate: $s = \sqrt{MSE}$. The multiple coefficient of determination, $R^2$, measures the proportion of the variation in the dependent variable that is explained by the combination of the independent variables in the multiple regression model: $R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$.
  • 15. Decomposition of the Sum of Squares and the Adjusted Coefficient of Determination. SST = SSR + SSE, and $R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$. The adjusted multiple coefficient of determination, $\bar{R}^2$, is the coefficient of determination with the SSE and SST divided by their respective degrees of freedom: $\bar{R}^2 = 1 - \frac{SSE/(n - (k+1))}{SST/(n - 1)}$. Example 11-1: s = 1.911, R-sq = 96.1%, R-sq(adj) = 95.0%.
  • 16. Measures of Performance in Multiple Regression and the ANOVA Table
      Source of Variation   Sum of Squares   Degrees of Freedom        Mean Square             F Ratio
      Regression            SSR              k                         MSR = SSR/k             F = MSR/MSE
      Error                 SSE              n - (k+1) = n - k - 1     MSE = SSE/(n - (k+1))
      Total                 SST              n - 1                     MST = SST/(n - 1)
    $R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$
    $F = \frac{R^2}{1 - R^2} \cdot \frac{n - (k+1)}{k}$
    $\bar{R}^2 = 1 - \frac{SSE/(n - (k+1))}{SST/(n - 1)} = 1 - \frac{MSE}{MST}$
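The ANOVA quantities above can be reproduced for Example 11-1. This is a minimal sketch, assuming NumPy and the ten observations from the Example 11-1 table; the variable names are illustrative. It computes the sums of squares, the F ratio both from the mean squares and from $R^2$, the standard error of estimate, and the two coefficients of determination.

```python
import numpy as np

# Observations from Example 11-1.
y = np.array([72, 76, 78, 70, 68, 80, 82, 65, 62, 90], dtype=float)
x1 = np.array([12, 11, 15, 10, 11, 16, 14, 8, 8, 18], dtype=float)
x2 = np.array([5, 8, 6, 5, 3, 9, 12, 4, 3, 10], dtype=float)

X = np.column_stack([np.ones_like(y), x1, x2])
n, k = len(y), 2  # n observations, k independent variables

# Least-squares coefficients and fitted values.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ b

# Decomposition SST = SSR + SSE.
sst = np.sum((y - y.mean()) ** 2)
sse = np.sum((y - y_hat) ** 2)
ssr = sst - sse

# Mean squares, using the degrees of freedom from the ANOVA table.
msr = ssr / k
mse = sse / (n - (k + 1))
mst = sst / (n - 1)

f_ratio = msr / mse
s = np.sqrt(mse)                      # standard error of estimate
r2 = ssr / sst                        # multiple coefficient of determination
adj_r2 = 1 - mse / mst                # adjusted coefficient of determination
f_from_r2 = (r2 / (1 - r2)) * ((n - (k + 1)) / k)

print(f_ratio, s, r2, adj_r2, f_from_r2)
```

The two F computations agree because dividing SSR and SSE by SST does not change their ratio.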
  • 17. 11-5 Tests of the Significance of Individual Regression Parameters. Hypothesis tests about individual regression slope parameters:
      (1) $H_0: \beta_1 = 0$, $H_1: \beta_1 \neq 0$
      (2) $H_0: \beta_2 = 0$, $H_1: \beta_2 \neq 0$
      ...
      (k) $H_0: \beta_k = 0$, $H_1: \beta_k \neq 0$
    Test statistic for test i: $t_{(n - (k+1))} = \frac{b_i - 0}{s(b_i)}$
  • 18. Regression Results for Individual Parameters
      Variable   Coefficient Estimate   Standard Error   t-Statistic
      Constant          53.12                5.43           9.783 *
      X1                 2.03                0.22           9.227 *
      X2                 5.60                1.30           4.308 *
      X3                10.35                6.88           1.504
      X4                 3.45                2.70           1.259
      X5                -4.25                0.38         -11.184 *
    n = 150, $t_{0.025} = 1.96$. An asterisk marks a t-statistic whose absolute value exceeds 1.96.
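Each t-statistic in the table is the coefficient estimate divided by its standard error, compared in absolute value with the critical point 1.96. A minimal sketch of that check follows, using only the estimates and standard errors from the slide; the dictionary layout and names are illustrative.

```python
# (estimate, standard error) pairs copied from the slide's table.
rows = {
    "Constant": (53.12, 5.43),
    "X1": (2.03, 0.22),
    "X2": (5.60, 1.30),
    "X3": (10.35, 6.88),
    "X4": (3.45, 2.70),
    "X5": (-4.25, 0.38),
}
critical = 1.96  # t_0.025 quoted on the slide for n = 150

t_stats = {}
significant = set()
for name, (estimate, std_error) in rows.items():
    t = (estimate - 0.0) / std_error  # test statistic for H0: beta_i = 0
    t_stats[name] = t
    if abs(t) > critical:
        significant.add(name)

for name, t in t_stats.items():
    flag = "*" if name in significant else ""
    print(f"{name:8s} {t:8.3f} {flag}")
```

Recomputing the ratio for X4 gives roughly 1.278 rather than the tabulated 1.259; either value is below 1.96, so the conclusion for X4 is the same.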
  • 19. Example 11-1: Using the Template. Regression results for Alka-Seltzer sales.
  • 20. Using the Template: Example 11-2. Regression results for Exports to Singapore.
  • 21. 11-6 Testing the Validity of the Regression Model: Residual Plots. Residuals vs M1: it appears that the residuals are randomly distributed with no pattern and with equal variance as M1 increases.
  • 22. 11-6 Testing the Validity of the Regression Model: Residual Plots. Residuals vs Price: the spread of the residuals appears to increase as Price increases, so the variance of the residuals is not constant.
  • 23. Normal Probability Plot for the Residuals: Example 11-2. A linear trend indicates that the residuals are normally distributed.
  • 24. Investigating the Validity of the Regression: Outliers and Influential Observations. [Figure, panel "Outliers": a scatter of points with one outlier, showing the regression line with the outlier and the regression line without it. Panel "Influential Observations": a cluster with no relationship plus one point with a large value of $x_i$, showing the regression line when all data are included.]
  • 25. Outliers and Influential Observations: Example 11-2. Unusual Observations:
      Obs.    M1    EXPORTS      Fit    Stdev.Fit   Residual   St.Resid
        1    5.10    2.6000    2.6420     0.1288     -0.0420     -0.14 X
        2    4.90    2.6000    2.6438     0.1234     -0.0438     -0.14 X
       25    6.20    5.5000    4.5949     0.0676      0.9051      2.80 R
       26    6.30    3.7000    4.6311     0.0651     -0.9311     -2.87 R
       50    8.30    4.3000    5.1317     0.0648     -0.8317     -2.57 R
       67    8.20    5.6000    4.9474     0.0668      0.6526      2.02 R
    R denotes an obs. with a large st. resid. X denotes an obs. whose X value gives it large influence.
  • 26. 11-7 Using the Multiple Regression Model for Prediction. [Figure: estimated regression plane for Example 11-1, Sales against Advertising and Promotions. Axis values shown: Sales 63.42 and 89.76, Advertising 8.00 and 18.00, Promotions 3 and 12.]
  • 27. Prediction in Multiple Regression. A $(1 - \alpha)100\%$ prediction interval for a value of Y given values of $X_i$: $\hat{y} \pm t_{(\alpha/2,\, n - (k+1))} \sqrt{s^2(\hat{y}) + MSE}$. A $(1 - \alpha)100\%$ prediction interval for the conditional mean of Y given values of $X_i$: $\hat{y} \pm t_{(\alpha/2,\, n - (k+1))}\, s[\hat{E}(Y)]$.
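The two intervals can be sketched for Example 11-1. The slide does not say how $s^2(\hat{y})$ is obtained, so this sketch assumes the standard matrix form $s^2(\hat{y}) = MSE \cdot x_0'(X'X)^{-1}x_0$, which equals the squared standard error of the estimated conditional mean. The prediction point (X1 = 10, X2 = 5) is a hypothetical choice, and the critical value 2.365 is the usual table value of $t_{0.025}$ with 7 degrees of freedom.

```python
import numpy as np

# Observations from Example 11-1.
y = np.array([72, 76, 78, 70, 68, 80, 82, 65, 62, 90], dtype=float)
x1 = np.array([12, 11, 15, 10, 11, 16, 14, 8, 8, 18], dtype=float)
x2 = np.array([5, 8, 6, 5, 3, 9, 12, 4, 3, 10], dtype=float)

X = np.column_stack([np.ones_like(y), x1, x2])
n, k = len(y), 2

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
mse = np.sum((y - X @ b) ** 2) / (n - (k + 1))

# Hypothetical point at which to predict: X1 = 10, X2 = 5 (leading 1 for b0).
x0 = np.array([1.0, 10.0, 5.0])
y_hat0 = x0 @ b

# Assumed matrix form of the squared standard error of the fitted value.
s2_fit = mse * (x0 @ XtX_inv @ x0)

t_crit = 2.365  # table value of t(0.025, 7), entered by hand

half_pred = t_crit * np.sqrt(s2_fit + mse)  # interval for an individual Y
half_mean = t_crit * np.sqrt(s2_fit)        # interval for the conditional mean

print("prediction:", y_hat0)
print("interval for Y:", y_hat0 - half_pred, y_hat0 + half_pred)
print("interval for E(Y):", y_hat0 - half_mean, y_hat0 + half_mean)
```

The interval for an individual value is always wider than the one for the conditional mean, because it adds the error variance MSE under the square root.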
  • 28. 11-8 Qualitative (or Categorical) Independent Variables (in Regression). An indicator (dummy, binary) variable of qualitative level A: $X_h = 1$ if level A is obtained, $X_h = 0$ if level A is not obtained. Example 11-3:
      MOVIE   EARN   COST   PROM   BOOK
        1      28     4.2    1.0     0
        2      35     6.0    3.0     1
        3      50     5.5    6.0     1
        4      20     3.3    1.0     0
        5      75    12.5   11.0     1
        6      60     9.6    8.0     1
        7      15     2.5    0.5     0
        8      45    10.8    5.0     0
        9      50     8.4    3.0     1
       10      34     6.6    2.0     0
       11      48    10.7    1.0     1
       12      82    11.0   15.0     1
       13      24     3.5    4.0     0
       14      50     6.9   10.0     0
       15      58     7.8    9.0     1
       16      63    10.1   10.0     0
       17      30     5.0    1.0     1
       18      37     7.5    5.0     0
       19      45     6.4    8.0     1
       20      72    10.0   12.0     1
  • 29. Picturing Qualitative Variables in Regression. A regression with one quantitative variable ($X_1$) and one qualitative variable ($X_2$): $\hat{y} = b_0 + b_1 x_1 + b_2 x_2$. [Figure: two parallel lines, the line for $X_2 = 0$ with intercept $b_0$ and the line for $X_2 = 1$ with intercept $b_0 + b_2$.] A multiple regression with two quantitative variables ($X_1$ and $X_2$) and one qualitative variable ($X_3$): $\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3$. [Figure: two parallel planes separated vertically by $b_3$.]
  • 30. Picturing Qualitative Variables in Regression: Three Categories and Two Dummy Variables. A qualitative variable with r levels or categories is represented with (r - 1) 0/1 (dummy) variables. A regression with one quantitative variable ($X_1$) and two qualitative (dummy) variables ($X_2$ and $X_3$): $\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3$. [Figure: three parallel lines, for $X_2 = 0$ and $X_3 = 0$ with intercept $b_0$, for $X_2 = 1$ and $X_3 = 0$ with intercept $b_0 + b_2$, and for $X_2 = 0$ and $X_3 = 1$ with intercept $b_0 + b_3$.]
      Category    X2   X3
      Adventure    0    0
      Drama        0    1
      Romance      1    0
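The coding table above maps three categories onto two 0/1 variables, with Adventure as the baseline. A minimal sketch of that mapping follows; the function names and the coefficient values in the usage lines are hypothetical and are not taken from the slides.

```python
# Dummy coding from the slide: (X2, X3) for each category.
CODING = {
    "Adventure": (0, 0),  # baseline category, both dummies are 0
    "Drama": (0, 1),
    "Romance": (1, 0),
}

def encode_category(category):
    """Return the (X2, X3) dummy values for a category."""
    return CODING[category]

def intercept_for(category, b0, b2, b3):
    """Intercept of the fitted line for a category: b0 + b2*X2 + b3*X3."""
    x2, x3 = encode_category(category)
    return b0 + b2 * x2 + b3 * x3

# Hypothetical coefficients, chosen only to show the three intercepts.
b0, b2, b3 = 10.0, 4.0, -3.0
for name in CODING:
    print(name, encode_category(name), intercept_for(name, b0, b2, b3))
```

With r = 3 categories only r - 1 = 2 dummies are needed, because the baseline category is identified by both dummies being 0.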
  • 31. Using Qualitative Variables in Regression: Example 11-4
      Salary = 8547 + 949 Education + 1258 Experience - 3256 Gender
      (SE)    (32.6)   (45.1)          (78.5)            (212.4)
      (t)     (262.2)  (21.0)          (16.0)            (-15.3)
    Gender = 1 if female, 0 if male. On average, holding education and experience fixed, female salaries are $3256 below male salaries.
  • 32. Interactions between Quantitative and Qualitative Variables: Shifting Slopes. [Figure: the line for $X_2 = 0$ has intercept $b_0$ and slope $b_1$; the line for $X_2 = 1$ has intercept $b_0 + b_2$ and slope $b_1 + b_3$.] A regression with interaction between a quantitative variable ($X_1$) and a qualitative variable ($X_2$): $\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_1 x_2$.
  • 33. 11-9 Polynomial Regression. One-variable polynomial regression model: $Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3 + \cdots + \beta_m X^m + \varepsilon$, where m is the degree of the polynomial, the highest power of X appearing in the equation. The degree of the polynomial is the order of the model. [Figure: fitted curves of increasing order, $\hat{y} = b_0 + b_1 X$, $\hat{y} = b_0 + b_1 X + b_2 X^2$, and $\hat{y} = b_0 + b_1 X + b_2 X^2 + b_3 X^3$.]
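A polynomial regression is an ordinary multiple regression whose independent variables are the powers of X. The following is a minimal sketch, assuming NumPy; the data are synthetic and noise-free, generated from hypothetical cubic coefficients, so the fit should recover them.

```python
import numpy as np

# Hypothetical cubic coefficients b0..b3, used only to generate data.
true_b = np.array([1.0, -2.0, 0.5, 3.0])

x = np.linspace(-2.0, 2.0, 9)
y = true_b[0] + true_b[1] * x + true_b[2] * x**2 + true_b[3] * x**3

m = 3  # degree of the polynomial, i.e. the order of the model

# Design matrix with columns 1, x, x^2, ..., x^m.
X = np.vander(x, m + 1, increasing=True)

# Least-squares fit treats each power of x as a separate regressor.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)
```

With real, noisy data the estimates would differ from the generating coefficients, and the usual t and F tests apply to the individual powers.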
  • 34. Polynomial Regression: Example 11-5
  • 35. Polynomial Regression: Other Variables and Cross-Product Terms
      Variable   Estimate   Standard Error   T-statistic
      X1           2.34         0.92            2.54
      X2           3.11         1.05            2.96
      X1^2         4.22         1.00            4.22
      X2^2         3.57         2.12            1.68
      X1X2         2.77         2.30            1.20
  • 36. 11-10 Nonlinear Models and Transformations: Multiplicative Model. The multiplicative model: $Y = \beta_0 X_1^{\beta_1} X_2^{\beta_2} X_3^{\beta_3} \varepsilon$. The logarithmic transformation: $\log Y = \log \beta_0 + \beta_1 \log X_1 + \beta_2 \log X_2 + \beta_3 \log X_3 + \log \varepsilon$.
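Taking logarithms turns the multiplicative model into a linear one, so ordinary least squares on the logged variables estimates the exponents. Here is a minimal sketch, assuming NumPy; the data are synthetic with hypothetical parameter values and with the error factor set to 1, so the fit should recover the parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

# Hypothetical parameters of the multiplicative model (not from the slides).
beta0, beta1, beta2, beta3 = 2.0, 0.5, 1.5, -0.7

# Positive regressors, so that the logarithms are defined.
x1 = rng.uniform(1.0, 10.0, n)
x2 = rng.uniform(1.0, 10.0, n)
x3 = rng.uniform(1.0, 10.0, n)

# Y = beta0 * X1^beta1 * X2^beta2 * X3^beta3, with error factor 1.
y = beta0 * x1**beta1 * x2**beta2 * x3**beta3

# Regress log Y on log X1, log X2, log X3.
Z = np.column_stack([np.ones(n), np.log(x1), np.log(x2), np.log(x3)])
coef, *_ = np.linalg.lstsq(Z, np.log(y), rcond=None)

beta0_hat = np.exp(coef[0])  # intercept of the log model is log(beta0)
exponents_hat = coef[1:]
print(beta0_hat, exponents_hat)
```

The fitted intercept estimates $\log \beta_0$, so it has to be exponentiated to get back to the original scale.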
  • 37. Transformations: Exponential Model. The exponential model: $Y = \beta_0 e^{\beta_1 X_1} \varepsilon$. The logarithmic transformation: $\log Y = \log \beta_0 + \beta_1 X_1 + \log \varepsilon$.
  • 38. Plots of Transformed Variables. [Figure, four panels. Simple Regression of Sales on Advertising (SALES vs ADVERT): Y = 6.59271 + 1.19176X, R-Squared = 0.895. Regression of Sales on Log(Advertising) (SALES vs LOGADV): Y = 3.66825 + 6.784X, R-Squared = 0.978. Regression of Log(Sales) on Log(Advertising) (LOGSALE vs LOGADV): Y = 1.70082 + 0.553136X, R-Squared = 0.947. Residual Plots: Sales vs Log(Advertising) (RESIDS vs Y-HAT).]
  • 39. Variance Stabilizing Transformations
    • Square root transformation: $Y' = \sqrt{Y}$. Useful when the variance of the regression errors is approximately proportional to the conditional mean of Y.
    • Logarithmic transformation: $Y' = \log(Y)$. Useful when the variance of the regression errors is approximately proportional to the square of the conditional mean of Y.
    • Reciprocal transformation: $Y' = 1/Y$. Useful when the variance of the regression errors is approximately proportional to the fourth power of the conditional mean of Y.
  • 40. Regression with Dependent Indicator Variables. The logistic function: $E(Y \mid X) = \dfrac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}}$. Transformation to linearize the logistic function: $p' = \log\left(\dfrac{p}{1 - p}\right)$. [Figure: the logistic function, an S-shaped curve of y rising from 0 to 1 as x increases.]
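    To see why the log-odds transformation linearizes the logistic function, it can help to check numerically that applying it to E(Y | X) returns β0 + β1 X. The sketch below does this with NumPy; the function names and the parameter values 0.5 and 2.0 are arbitrary assumptions for illustration.

```python
import numpy as np

def logistic(x, beta0, beta1):
    """Logistic function E(Y | X) = e^(b0 + b1 x) / (1 + e^(b0 + b1 x))."""
    z = beta0 + beta1 * x
    return np.exp(z) / (1.0 + np.exp(z))

def logit(p):
    """Log-odds transformation p' = log(p / (1 - p))."""
    return np.log(p / (1.0 - p))

x = np.linspace(-3.0, 3.0, 7)
p = logistic(x, beta0=0.5, beta1=2.0)   # probabilities strictly between 0 and 1
linear = logit(p)                        # should equal 0.5 + 2.0 * x

print(np.allclose(linear, 0.5 + 2.0 * x))
```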
  • 41. 11-11 Multicollinearity. [Figure: four panels drawn in the (x1, x2) plane.]
    • Orthogonal X variables provide information from independent sources. No multicollinearity.
    • Perfectly collinear X variables provide identical information content. No regression is possible.
    • Some degree of collinearity. Problems with regression depend on the degree of collinearity.
    • A high degree of negative collinearity also causes problems with regression.
  • 42. Effects of Multicollinearity
    • Variances of regression coefficients are inflated.
    • Magnitudes of regression coefficients may differ from what is expected.
    • Signs of regression coefficients may not be as expected.
    • Adding or removing variables produces large changes in coefficients.
    • Removing a data point may cause large changes in coefficient estimates or signs.
    • In some cases, the F ratio may be significant while the t ratios are not.
  • 43. Detecting the Existence of Multicollinearity: Correlation Matrix of Independent Variables and Variance Inflation Factors
  • 44. Variance Inflation Factor. The variance inflation factor associated with $X_h$: $VIF(X_h) = \dfrac{1}{1 - R_h^2}$, where $R_h^2$ is the $R^2$ value obtained for the regression of $X_h$ on the other independent variables. [Figure: relationship between VIF and $R_h^2$, with VIF (0 to 100) plotted against $R_h^2$ (0.0 to 1.0).]
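    The VIF definition translates directly into a loop: regress each X_h on the remaining independent variables, take the resulting R², and compute 1 / (1 − R²). The sketch below is one way to do this with plain NumPy; the helper name `vif` and the synthetic data are assumptions for illustration, not part of the slides.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (no intercept column in X)."""
    n, k = X.shape
    out = np.empty(k)
    for h in range(k):
        target = X[:, h]
        # Regress X_h on an intercept plus all the other independent variables.
        others = np.column_stack([np.ones(n), np.delete(X, h, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        sse = resid @ resid
        sst = np.sum((target - target.mean()) ** 2)
        r2_h = 1.0 - sse / sst
        out[h] = 1.0 / (1.0 - r2_h)
    return out

# Synthetic example: x2 is nearly a copy of x1, x3 is unrelated.
rng = np.random.default_rng(2)
x1 = rng.normal(size=200)
x2 = x1 + 0.3 * rng.normal(size=200)
x3 = rng.normal(size=200)
vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)
```

    In this made-up data set the first two variables should show clearly inflated values while the third stays close to 1, which mirrors the rule of thumb used on the next slide.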
  • 45. Variance Inflation Factor (VIF). Observation: the VIF values for the variables Lend and Price are both greater than 5, which indicates that some degree of multicollinearity exists with respect to these two variables.
  • 46. Solutions to the Multicollinearity Problem
    • Drop a collinear variable from the regression.
    • Change the sampling plan to include elements outside the multicollinearity range.
    • Transform the variables.
    • Use ridge regression.
  • 47. 11-12 Residual Autocorrelation and the Durbin-Watson Test. An autocorrelation is a correlation of the values of a variable with values of the same variable lagged one or more periods back. Consequences of autocorrelation include inaccurate estimates of variances and inaccurate predictions.
    Lagged residuals (* = not available):
    i     e_i     e_{i-1}   e_{i-2}   e_{i-3}   e_{i-4}
    1     1.0     *         *         *         *
    2     0.0     1.0       *         *         *
    3    -1.0     0.0       1.0       *         *
    4     2.0    -1.0       0.0       1.0       *
    5     3.0     2.0      -1.0       0.0       1.0
    6    -2.0     3.0       2.0      -1.0       0.0
    7     1.0    -2.0       3.0       2.0      -1.0
    8     1.5     1.0      -2.0       3.0       2.0
    9     1.0     1.5       1.0      -2.0       3.0
    10   -2.5     1.0       1.5       1.0      -2.0
    The Durbin-Watson test (first-order autocorrelation): $H_0: \rho_1 = 0$, $H_1: \rho_1 \neq 0$.
    The Durbin-Watson test statistic: $d = \dfrac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}$
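    As a small worked example, the Durbin-Watson statistic can be computed for the ten residuals listed on slide 47. This is only an illustrative sketch with NumPy (the function name is an assumption), and ten observations are far too few for the tabulated critical values to apply.

```python
import numpy as np

def durbin_watson(e):
    """d = sum_{i=2..n} (e_i - e_{i-1})^2 / sum_{i=1..n} e_i^2."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Residuals e_1, ..., e_10 from the lagged-residuals table on slide 47.
residuals = [1.0, 0.0, -1.0, 2.0, 3.0, -2.0, 1.0, 1.5, 1.0, -2.5]

d = durbin_watson(residuals)
print(round(d, 4))  # 58.75 / 29.5, about 1.9915
```

    A value near 2 is what one would expect when there is little first-order autocorrelation; values near 0 point to positive and values near 4 to negative autocorrelation.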
  • 48. Critical Points of the Durbin-Watson Statistic: α = 0.05, n = Sample Size, k = Number of Independent Variables
          k=1          k=2          k=3          k=4          k=5
    n     dL    dU     dL    dU     dL    dU     dL    dU     dL    dU
    15    1.08  1.36   0.95  1.54   0.82  1.75   0.69  1.97   0.56  2.21
    16    1.10  1.37   0.98  1.54   0.86  1.73   0.74  1.93   0.62  2.15
    17    1.13  1.38   1.02  1.54   0.90  1.71   0.78  1.90   0.67  2.10
    18    1.16  1.39   1.05  1.53   0.93  1.69   0.82  1.87   0.71  2.06
    ...
    65    1.57  1.63   1.54  1.66   1.50  1.70   1.47  1.73   1.44  1.77
    70    1.58  1.64   1.55  1.67   1.52  1.70   1.49  1.74   1.46  1.77
    75    1.60  1.65   1.57  1.68   1.54  1.71   1.51  1.74   1.49  1.77
    80    1.61  1.66   1.59  1.69   1.56  1.72   1.53  1.74   1.51  1.77
    85    1.62  1.67   1.60  1.70   1.57  1.72   1.55  1.75   1.52  1.77
    90    1.63  1.68   1.61  1.70   1.59  1.73   1.57  1.75   1.54  1.78
    95    1.64  1.69   1.62  1.71   1.60  1.73   1.58  1.75   1.56  1.78
    100   1.65  1.69   1.63  1.72   1.61  1.74   1.59  1.76   1.57  1.78
  • 49. Using the Durbin-Watson Statistic
    Decision regions on the scale from 0 to 4:
    • 0 to dL: positive autocorrelation
    • dL to dU: test is inconclusive
    • dU to 4-dU: no autocorrelation
    • 4-dU to 4-dL: test is inconclusive
    • 4-dL to 4: negative autocorrelation
    For n = 67, k = 4: dU ≈ 1.73, 4-dU ≈ 2.27, dL ≈ 1.47, 4-dL ≈ 2.53 < 2.58. H0 is rejected, and we conclude there is negative first-order autocorrelation.
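    The decision regions can be written as a short function. The sketch below is a hedged illustration: the function name and the handling of values that fall exactly on a boundary are assumptions, since the slide does not say how ties are treated.

```python
def durbin_watson_decision(d, d_lower, d_upper):
    """Classify a Durbin-Watson statistic d using critical points dL and dU.

    Boundary values are assigned to the inconclusive regions (an assumption).
    """
    if d < d_lower:
        return "positive autocorrelation"
    if d <= d_upper:
        return "inconclusive"
    if d < 4 - d_upper:
        return "no autocorrelation"
    if d <= 4 - d_lower:
        return "inconclusive"
    return "negative autocorrelation"

# Values from the slide: n = 67, k = 4, so dL is about 1.47 and dU about 1.73;
# the regression output reports d = 2.58.
decision = durbin_watson_decision(2.58, d_lower=1.47, d_upper=1.73)
print(decision)  # negative autocorrelation, since 2.58 > 4 - 1.47 = 2.53
```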
  • 50. 11-13 Partial F Tests and Variable Selection Methods
    Full model: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \varepsilon$
    Reduced model: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$
    Partial F test: $H_0: \beta_3 = \beta_4 = 0$; $H_1$: $\beta_3$ and $\beta_4$ not both 0.
    Partial F statistic: $F_{(r,\, n-(k+1))} = \dfrac{(SSE_R - SSE_F)/r}{MSE_F}$
    where $SSE_R$ is the sum of squared errors of the reduced model, $SSE_F$ is the sum of squared errors of the full model, $MSE_F$ is the mean square error of the full model [$MSE_F = SSE_F/(n-(k+1))$], and r is the number of variables dropped from the full model.
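    The partial F statistic is simple arithmetic once the two error sums of squares are known. The sketch below uses hypothetical numbers chosen only to make the calculation easy to follow; they are not taken from the slides, and the function name is an assumption.

```python
def partial_f(sse_reduced, sse_full, n, k, r):
    """Partial F statistic with (r, n - (k + 1)) degrees of freedom.

    n: sample size, k: number of independent variables in the full model,
    r: number of variables dropped to obtain the reduced model.
    """
    mse_full = sse_full / (n - (k + 1))
    return ((sse_reduced - sse_full) / r) / mse_full

# Hypothetical values: full model with k = 4 variables, two of them dropped.
F = partial_f(sse_reduced=120.0, sse_full=100.0, n=55, k=4, r=2)
print(F)  # MSE_F = 100 / 50 = 2, so F = (20 / 2) / 2 = 5.0
```

    The resulting value would be compared with the critical point of the F distribution with r and n − (k + 1) degrees of freedom.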
  • 51. Variable Selection Methods
    • All possible regressions: run regressions with all possible combinations of independent variables and select the best model.
    A p-value of 0.001 indicates that we should reject the null hypothesis H0: the slopes for Lend and Exch. are zero.
  • 52. Variable Selection Methods
    • Stepwise procedures
      » Forward selection: add one variable at a time to the model, on the basis of its F statistic.
      » Backward elimination: remove one variable at a time, on the basis of its F statistic.
      » Stepwise regression: adds variables to the model and removes variables from the model, on the basis of the F statistic.
  • 53. Stepwise Regression. [Flowchart]
    1. Compute the F statistic for each variable not in the model.
    2. Is there at least one variable with p-value < Pin? If no, stop. If yes, enter the most significant variable (smallest p-value) into the model.
    3. Calculate the partial F for all variables in the model.
    4. Is there a variable with p-value > Pout? If yes, remove the variable. If no, return to step 1.
  • 54. Stepwise Regression: Using the Computer (MINITAB)
    MTB > STEPWISE 'EXPORTS' PREDICTORS 'M1' 'LEND' 'PRICE' 'EXCHANGE'
    Stepwise Regression
    F-to-Enter: 4.00   F-to-Remove: 4.00
    Response is EXPORTS on 4 predictors, with N = 67
    Step            1         2
    Constant   0.9348   -3.4230
    M1          0.520     0.361
    T-Ratio      9.89      9.21
    PRICE                0.0370
    T-Ratio                9.05
    S           0.495     0.331
    R-Sq        60.08     82.48
  • 55. Using the Computer: MINITAB
    MTB > REGRESS 'EXPORTS' 4 'M1' 'LEND' 'PRICE' 'EXCHANGE';
    SUBC> vif;
    SUBC> dw.
    Regression Analysis
    The regression equation is
    EXPORTS = - 4.02 + 0.368 M1 + 0.0047 LEND + 0.0365 PRICE + 0.27 EXCHANGE
    Predictor   Coef       Stdev      t-ratio   p       VIF
    Constant    -4.015     2.766      -1.45     0.152
    M1          0.36846    0.06385    5.77      0.000   3.2
    LEND        0.00470    0.04922    0.10      0.924   5.4
    PRICE       0.036511   0.009326   3.91      0.000   6.3
    EXCHANGE    0.268      1.175      0.23      0.820   1.4
    s = 0.3358   R-sq = 82.5%   R-sq(adj) = 81.4%
    Analysis of Variance
    SOURCE       DF   SS        MS       F       p
    Regression   4    32.9463   8.2366   73.06   0.000
    Error        62   6.9898    0.1127
    Total        66   39.9361
    Durbin-Watson statistic = 2.58
  • 56. Using the Computer: SAS (continued)
    Parameter Estimates
    Variable   DF   Parameter Estimate   Standard Error   T for H0: Parameter=0   Prob > |T|   Variance Inflation
    INTERCEP   1    -4.015461            2.76640057       -1.452                  0.1517       0.00000000
    M1         1    0.368456             0.06384841       5.771                   0.0001       3.20719533
    LEND       1    0.004702             0.04922186       0.096                   0.9242       5.35391367
    PRICE      1    0.036511             0.00932601       3.915                   0.0002       6.28873181
    EXCHANGE   1    0.267896             1.17544016       0.228                   0.8205       1.38570639
    Durbin-Watson D             2.583
    (For Number of Obs.)        67
    1st Order Autocorrelation   -0.321
  • 57. 11-15 The Matrix Approach to Regression Analysis (1). The population regression model:
    $$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 1 & x_{11} & x_{12} & x_{13} & \cdots & x_{1k} \\ 1 & x_{21} & x_{22} & x_{23} & \cdots & x_{2k} \\ 1 & x_{31} & x_{32} & x_{33} & \cdots & x_{3k} \\ \vdots & \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{n1} & x_{n2} & x_{n3} & \cdots & x_{nk} \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \vdots \\ \varepsilon_n \end{bmatrix}$$
    In matrix notation: $Y = X\beta + \varepsilon$. The estimated regression model: $Y = Xb + e$.
  • 58. The Matrix Approach to Regression Analysis (2). The normal equations: $X'Xb = X'Y$. Estimators: $b = (X'X)^{-1}X'Y$. Predicted values: $\hat{Y} = Xb = X(X'X)^{-1}X'Y = HY$. Variance of the estimators: $V(b) = \sigma^2 (X'X)^{-1}$, estimated by $s^2(b) = MSE\,(X'X)^{-1}$.
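    The matrix formulas can be checked numerically. The sketch below is an illustration with NumPy on synthetic data (the sample size, coefficients and variable names are all assumptions): it solves the normal equations, forms the hat matrix H, and estimates the covariance matrix of b.

```python
import numpy as np

# Synthetic data (assumed values for illustration only).
rng = np.random.default_rng(3)
n, k = 30, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])  # leading column of 1s
beta = np.array([1.0, 2.0, -0.5])
Y = X @ beta + rng.normal(scale=0.2, size=n)

XtX = X.T @ X
b = np.linalg.solve(XtX, X.T @ Y)        # solves the normal equations X'X b = X'Y
H = X @ np.linalg.solve(XtX, X.T)        # hat matrix H = X (X'X)^{-1} X'
Y_hat = H @ Y                            # predicted values, equal to X b

e = Y - Y_hat                            # residuals
mse = (e @ e) / (n - (k + 1))            # mean square error
cov_b = mse * np.linalg.inv(XtX)         # s^2(b) = MSE (X'X)^{-1}
se_b = np.sqrt(np.diag(cov_b))           # standard errors of b0, b1, ..., bk

print(b, se_b)
```

    Solving the normal equations with `np.linalg.solve` rather than explicitly inverting X'X is a common numerical preference; the explicit inverse is used here only where the formula for s²(b) calls for it.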
  • 59. About the author
    Name: Shakeel Nouman
    Religion: Christian
    Domicile: Punjab (Lahore)
    Contact #: 0332-4462527, 0321-9898767
    E-mail: sn_gcu@yahoo.com, sn_gcu@hotmail.com
    M.Phil (Statistics): GC University (degree awarded by GC University)
    M.Sc (Statistics): GC University (degree awarded by GC University)
    Statistical Officer (BS-17), Economics & Marketing Division: Livestock Production Research Institute Bahadurnagar (Okara), Livestock & Dairy Development Department, Govt. of Punjab