SlideShare a Scribd company logo
6-1
Nonlinear Regression Functions
(SW Ch. 6)
 Everything so far has been linear in the X’s
 The approximation that the regression function is
linear might be good for some variables, but not for
others.
 The multiple regression framework can be extended to
handle regression functions that are nonlinear in one
or more X.
6-2
The TestScore – STR relation looks approximately
linear…
6-3
But the TestScore – average district income relation
looks like it is nonlinear.
6-4
If a relation between Y and X is nonlinear:
 The effect on Y of a change in X depends on the value
of X – that is, the marginal effect of X is not constant
 A linear regression is mis-specified – the functional
form is wrong
 The estimator of the effect on Y of X is biased – it
needn’t even be right on average.
 The solution to this is to estimate a regression
function that is nonlinear in X
6-5
The General Nonlinear Population Regression Function
Yi = f(X1i,X2i,…,Xki) + ui, i = 1,…, n
Assumptions
1. E(ui| X1i,X2i,…,Xki) = 0 (same); implies that f is the
conditional expectation of Y given the X’s.
2. (X1i,…,Xki,Yi) are i.i.d. (same).
3. “enough” moments exist (same idea; the precise
statement depends on specific f).
4. No perfect multicollinearity (same idea; the precise
statement depends on the specific f).
6-6
6-7
Nonlinear Functions of a Single Independent Variable
(SW Section 6.2)
We’ll look at two complementary approaches:
1. Polynomials in X
The population regression function is approximated
by a quadratic, cubic, or higher-degree polynomial
2. Logarithmic transformations
 Y and/or X is transformed by taking its logarithm
 this gives a “percentages” interpretation that makes
sense in many applications
6-8
1. Polynomials in X
Approximate the population regression function by a
polynomial:
Yi = 0 + 1Xi + 2
2
i
X +…+ r
r
i
X + ui
 This is just the linear multiple regression model –
except that the regressors are powers of X!
 Estimation, hypothesis testing, etc. proceeds as in the
multiple regression model using OLS
 The coefficients are difficult to interpret, but the
regression function itself is interpretable
6-9
Example: the TestScore – Income relation
Incomei = average district income in the ith
district
(thousdand dollars per capita)
Quadratic specification:
TestScorei = 0 + 1Incomei + 2(Incomei)2
+ ui
Cubic specification:
TestScorei = 0 + 1Incomei + 2(Incomei)2
+ 3(Incomei)3
+ ui
6-10
Estimation of the quadratic specification in STATA
generate avginc2 = avginc*avginc; Create a new regressor
reg testscr avginc avginc2, r;
Regression with robust standard errors Number of obs = 420
F( 2, 417) = 428.52
Prob > F = 0.0000
R-squared = 0.5562
Root MSE = 12.724
------------------------------------------------------------------------------
| Robust
testscr | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
avginc | 3.850995 .2680941 14.36 0.000 3.32401 4.377979
avginc2 | -.0423085 .0047803 -8.85 0.000 -.051705 -.0329119
_cons | 607.3017 2.901754 209.29 0.000 601.5978 613.0056
------------------------------------------------------------------------------
The t-statistic on Income2
is -8.85, so the hypothesis of
linearity is rejected against the quadratic alternative at the
1% significance level.
6-11
Interpreting the estimated regression function:
(a) Plot the predicted values
TestScore = 607.3 + 3.85Incomei – 0.0423(Incomei)2
(2.9) (0.27) (0.0048)
6-12
Interpreting the estimated regression function:
(a) Compute “effects” for different values of X
TestScore = 607.3 + 3.85Incomei – 0.0423(Incomei)2
(2.9) (0.27) (0.0048)
Predicted change in TestScore for a change in income to
$6,000 from $5,000 per capita:
TestScore = 607.3 + 3.85 6 – 0.0423 62
– (607.3 + 3.85 5 – 0.0423 52
)
= 3.4
6-13
TestScore = 607.3 + 3.85Incomei – 0.0423(Incomei)2
Predicted “effects” for different values of X
Change in Income (th$ per capita) TestScore
from 5 to 6 3.4
from 25 to 26 1.7
from 45 to 46 0.0
The “effect” of a change in income is greater at low than
high income levels (perhaps, a declining marginal benefit
of an increase in school budgets?)
Caution! What about a change from 65 to 66?
Don’t extrapolate outside the range of the data.
6-14
Estimation of the cubic specification in STATA
gen avginc3 = avginc*avginc2; Create the cubic regressor
reg testscr avginc avginc2 avginc3, r;
Regression with robust standard errors Number of obs = 420
F( 3, 416) = 270.18
Prob > F = 0.0000
R-squared = 0.5584
Root MSE = 12.707
------------------------------------------------------------------------------
| Robust
testscr | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
avginc | 5.018677 .7073505 7.10 0.000 3.628251 6.409104
avginc2 | -.0958052 .0289537 -3.31 0.001 -.1527191 -.0388913
avginc3 | .0006855 .0003471 1.98 0.049 3.27e-06 .0013677
_cons | 600.079 5.102062 117.61 0.000 590.0499 610.108
------------------------------------------------------------------------------
The cubic term is statistically significant at the 5%, but
not 1%, level
6-15
Testing the null hypothesis of linearity, against the
alternative that the population regression is quadratic
and/or cubic, that is, it is a polynomial of degree up to 3:
H0: pop’n coefficients on Income2
and Income3
= 0
H1: at least one of these coefficients is nonzero.
test avginc2 avginc3; Execute the test command after running the regression
( 1) avginc2 = 0.0
( 2) avginc3 = 0.0
F( 2, 416) = 37.69
Prob > F = 0.0000
The hypothesis that the population regression is linear is
rejected at the 1% significance level against the
alternative that it is a polynomial of degree up to 3.
6-16
Summary: polynomial regression functions
Yi = 0 + 1Xi + 2
2
i
X +…+ r
r
i
X + ui
 Estimation: by OLS after defining new regressors
 Coefficients have complicated interpretations
 To interpret the estimated regression function:
oplot predicted values as a function of x
ocompute predicted Y/X at different values of x
 Hypotheses concerning degree r can be tested by t-
and F-tests on the appropriate (blocks of) variable(s).
 Choice of degree r
oplot the data; t- and F-tests, check sensitivity of
estimated effects; judgment.
oOr use model selection criteria (maybe later)
6-17
2. Logarithmic functions of Y and/or X
 ln(X) = the natural logarithm of X
 Logarithmic transforms permit modeling relations
in “percentage” terms (like elasticities), rather than
linearly.
Here’s why: ln(x+x) – ln(x) = ln 1
x
x

 

 
 
x
x

(calculus:
ln( ) 1
d x
dx x
 )
Numerically:
ln(1.01) = .00995 .01; ln(1.10) = .0953 .10 (sort of)
6-18
Three cases:
Case Population regression function
I. linear-log Yi = 0 + 1ln(Xi) + ui
II. log-linear ln(Yi) = 0 + 1Xi + ui
III. log-log ln(Yi) = 0 + 1ln(Xi) + ui
 The interpretation of the slope coefficient differs in
each case.
 The interpretation is found by applying the general
“before and after” rule: “figure out the change in Y for
a given change in X.”
6-19
I. Linear-log population regression function
Yi = 0 + 1ln(Xi) + ui (b)
Now change X: Y + Y = 0 + 1ln(X + X) (a)
Subtract (a) – (b): Y = 1[ln(X + X) – ln(X)]
now ln(X + X) – ln(X)
X
X

,
so Y 1
X
X

or 1
/
Y
X X


(small X)
6-20
Linear-log case, continued
Yi = 0 + 1ln(Xi) + ui
for small X,
1
/
Y
X X


Now 100
X
X

= percentage change in X, so a 1%
increase in X (multiplying X by 1.01) is associated with
a .011 change in Y.
6-21
Example: TestScore vs. ln(Income)
 First defining the new regressor, ln(Income)
 The model is now linear in ln(Income), so the linear-log
model can be estimated by OLS:
TestScore = 557.8 + 36.42 ln(Incomei)
(3.8) (1.40)
so a 1% increase in Income is associated with an
increase in TestScore of 0.36 points on the test.
 Standard errors, confidence intervals, R2
– all the
usual tools of regression apply here.
 How does this compare to the cubic model?
6-22
TestScore = 557.8 + 36.42 ln(Incomei)
6-23
II. Log-linear population regression function
ln(Yi) = 0 + 1Xi + ui (b)
Now change X: ln(Y + Y) = 0 + 1(X + X) (a)
Subtract (a) – (b): ln(Y + Y) – ln(Y) = 1X
so
Y
Y

1X
or 1
/
Y Y
X


(small X)
6-24
Log-linear case, continued
ln(Yi) = 0 + 1Xi + ui
for small X, 1
/
Y Y
X


 Now 100
Y
Y

= percentage change in Y, so a change
in X by one unit (X = 1) is associated with a 1001%
change in Y (Y increases by a factor of 1+1).
 Note: What are the units of ui and the SER?
ofractional (proportional) deviations
ofor example, SER = .2 means…
6-25
III. Log-log population regression function
ln(Yi) = 0 + 1ln(Xi) + ui (b)
Now change X: ln(Y + Y) = 0 + 1ln(X + X) (a)
Subtract: ln(Y + Y) – ln(Y) = 1[ln(X + X) – ln(X)]
so
Y
Y

1
X
X

or 1
/
/
Y Y
X X


(small X)
6-26
Log-log case, continued
ln(Yi) = 0 + 1ln(Xi) + ui
for small X,
1
/
/
Y Y
X X


Now 100
Y
Y

= percentage change in Y, and 100
X
X

=
percentage change in X, so a 1% change in X is
associated with a 1% change in Y.
 In the log-log specification, 1 has the interpretation
of an elasticity.
6-27
Example: ln( TestScore) vs. ln( Income)
 First defining a new dependent variable, ln(TestScore),
and the new regressor, ln(Income)
 The model is now a linear regression of ln(TestScore)
against ln(Income), which can be estimated by OLS:
ln( )
TestScore = 6.336 + 0.0554 ln(Incomei)
(0.006) (0.0021)
An 1% increase in Income is associated with an
increase of .0554% in TestScore (factor of 1.0554)
 How does this compare to the log-linear model?
6-28
Neither specification seems to fit as well as the cubic or linear-log
6-29
Summary: Logarithmic transformations
 Three cases, differing in whether Y and/or X is
transformed by taking logarithms.
 After creating the new variable(s) ln(Y) and/or ln(X),
the regression is linear in the new variables and the
coefficients can be estimated by OLS.
 Hypothesis tests and confidence intervals are now
standard.
 The interpretation of 1 differs from case to case.
 Choice of specification should be guided by judgment
(which interpretation makes the most sense in your
application?), tests, and plotting predicted values
6-30
Interactions Between Independent Variables
(SW Section 6.3)
 Perhaps a class size reduction is more effective in some
circumstances than in others…
 Perhaps smaller classes help more if there are many
English learners, who need individual attention
 That is,
TestScore
STR


might depend on PctEL
 More generally,
1
Y
X


might depend on X2
 How to model such “interactions” between X1 and X2?
 We first consider binary X’s, then continuous X’s
6-31
(a) Interactions between two binary variables
Yi = 0 + 1D1i + 2D2i + ui
 D1i, D2i are binary
 1 is the effect of changing D1=0 to D1=1. In this
specification, this effect doesn’t depend on the value of
D2.
 To allow the effect of changing D1 to depend on D2,
include the “interaction term” D1i D2i as a regressor:
Yi = 0 + 1D1i + 2D2i + 3(D1i D2i) + ui
6-32
Interpreting the coefficients
Yi = 0 + 1D1i + 2D2i + 3(D1i D2i) + ui
General rule: compare the various cases
E(Yi|D1i=0, D2i=d2) = 0 + 2d2 (b)
E(Yi|D1i=1, D2i=d2) = 0 + 1 + 2d2 + 3d2 (a)
subtract (a) – (b):
E(Yi|D1i=1, D2i=d2) – E(Yi|D1i=0, D2i=d2) = 1 + 3d2
 The effect of D1 depends on d2 (what we wanted)
 3 = increment to the effect of D1, when D2 = 1
6-33
Example: TestScore, STR, English learners
Let
HiSTR =
1 if 20
0 if 20
STR
STR





and HiEL =
1 if l0
0 if 10
PctEL
PctEL





TestScore = 664.1 – 18.2HiEL – 1.9HiSTR – 3.5(HiSTR HiEL)
(1.4) (2.3) (1.9) (3.1)
 “Effect” of HiSTR when HiEL = 0 is –1.9
 “Effect” of HiSTR when HiEL = 1 is –1.9 – 3.5 = –5.4
 Class size reduction is estimated to have a bigger effect
when the percent of English learners is large
 This interaction isn’t statistically significant: t = 3.5/3.1
6-34
(b) Interactions between continuous and binary
variables
Yi = 0 + 1Di + 2Xi + ui
 Di is binary, X is continuous
 As specified above, the effect on Y of X (holding
constant D) = 2, which does not depend on D
 To allow the effect of X to depend on D, include the
“interaction term” Di Xi as a regressor:
Yi = 0 + 1Di + 2Xi + 3(Di Xi) + ui
6-35
Interpreting the coefficients
Yi = 0 + 1Di + 2Xi + 3(Di Xi) + ui
General rule: compare the various cases
Y = 0 + 1D + 2X + 3(D X) (b)
Now change X:
Y + Y = 0 + 1D + 2(X+X) + 3[D (X+X)] (a)
subtract (a) – (b):
Y = 2X + 3DX or
Y
X


= 2 + 3D
 The effect of X depends on D (what we wanted)
 3 = increment to the effect of X, when D = 1
Example: TestScore, STR, HiEL (=1 if PctEL 20)
6-36
TestScore = 682.2 – 0.97STR + 5.6HiEL – 1.28(STR HiEL)
(11.9) (0.59) (19.5) (0.97)
 When HiEL = 0:
TestScore = 682.2 – 0.97STR
 When HiEL = 1,
TestScore = 682.2 – 0.97STR + 5.6 – 1.28STR
= 687.8 – 2.25STR
 Two regression lines: one for each HiSTR group.
 Class size reduction is estimated to have a larger effect
when the percent of English learners is large.
Example, ctd.
6-37
TestScore = 682.2 – 0.97STR + 5.6HiEL – 1.28(STR HiEL)
(11.9) (0.59) (19.5) (0.97)
Testing various hypotheses:
 The two regression lines have the same slope the
coefficient on STR HiEL is zero:
t = –1.28/0.97 = –1.32 can’t reject
 The two regression lines have the same intercept
the coefficient on HiEL is zero:
t = –5.6/19.5 = 0.29 can’t reject
Example, ctd.
TestScore = 682.2 – 0.97STR + 5.6HiEL – 1.28(STR HiEL),
(11.9) (0.59) (19.5) (0.97)
6-38
 Joint hypothesis that the two regression lines are the
same population coefficient on HiEL = 0 and
population coefficient on STR HiEL = 0:
F = 89.94 (p-value < .001) !!
 Why do we reject the joint hypothesis but neither
individual hypothesis?
 Consequence of high but imperfect multicollinearity:
high correlation between HiEL and STR HiEL
Binary-continuous interactions: the two regression lines
Yi = 0 + 1Di + 2Xi + 3(Di Xi) + ui
Observations with Di= 0 (the “D = 0” group):
6-39
Yi = 0 + 2Xi + ui
Observations with Di= 1 (the “D = 1” group):
Yi = 0 + 1 + 2Xi + 3Xi + ui
= (0+1) + (2+3)Xi + ui
6-40
6-41
(c) Interactions between two continuous variables
Yi = 0 + 1X1i + 2X2i + ui
 X1, X2 are continuous
 As specified, the effect of X1 doesn’t depend on X2
 As specified, the effect of X2 doesn’t depend on X1
 To allow the effect of X1 to depend on X2, include the
“interaction term” X1i X2i as a regressor:
Yi = 0 + 1X1i + 2X2i + 3(X1i X2i) + ui
Coefficients in continuous-continuous interactions
6-42
Yi = 0 + 1X1i + 2X2i + 3(X1i X2i) + ui
General rule: compare the various cases
Y = 0 + 1X1 + 2X2 + 3(X1 X2) (b)
Now change X1:
Y+ Y = 0 + 1(X1+X1) + 2X2 + 3[(X1+X1) X2] (a)
subtract (a) – (b):
Y = 1X1 + 3X2X1 or
1
Y
X


= 2 + 3X2
 The effect of X1 depends on X2 (what we wanted)
 3 = increment to the effect of X1 from a unit change
in X2
Example: TestScore, STR, PctEL
6-43
TestScore = 686.3 – 1.12STR – 0.67PctEL + .0012(STR PctEL),
(11.8) (0.59) (0.37) (0.019)
The estimated effect of class size reduction is nonlinear
because the size of the effect itself depends on PctEL:
TestScore
STR


= –1.12 + .0012PctEL
PctEL TestScore
STR


0 –1.12
20% –1.12+.0012 20 = –1.10
Example, ctd: hypothesis tests
TestScore = 686.3 – 1.12STR – 0.67PctEL + .0012(STR PctEL),
(11.8) (0.59) (0.37) (0.019)
6-44
 Does population coefficient on STR PctEL = 0?
t = .0012/.019 = .06 can’t reject null at 5% level
 Does population coefficient on STR = 0?
t = –1.12/0.59 = –1.90 can’t reject null at 5% level
 Do the coefficients on both STR and STR PctEL = 0?
F = 3.89 (p-value = .021) reject null at 5% level(!!)
(Why? high but imperfect multicollinearity)
6-45
Application: Nonlinear Effects on Test Scores
of the Student-Teacher Ratio
(SW Section 6.4)
Focus on two questions:
1. Are there nonlinear effects of class size reduction on
test scores? (Does a reduction from 35 to 30 have
same effect as a reduction from 20 to 15?)
2. Are there nonlinear interactions between PctEL and
STR? (Are small classes more effective when there are
many English learners?)
6-46
Strategy for Question #1 (different effects for different STR?)
 Estimate linear and nonlinear functions of STR, holding
constant relevant demographic variables
oPctEL
oIncome (remember the nonlinear TestScore-Income
relation!)
oLunchPCT (fraction on free/subsidized lunch)
 See whether adding the nonlinear terms makes an
“economically important” quantitative difference
(“economic” or “real-world” importance is different than
statistically significant)
 Test for whether the nonlinear terms are significant
6-47
What is a good “base” specification?
6-48
The TestScore – Income relation
An advantage of the logarithmic specification is that it is
better behaved near the ends of the sample, especially large
values of income.
6-49
Base specification
From the scatterplots and preceding analysis, here are
plausible starting points for the demographic control
variables:
Dependent variable: TestScore
Independent variable Functional form
PctEL linear
LunchPCT linear
Income ln(Income)
(or could use cubic)
6-50
Question #1:
Investigate by considering a polynomial in STR
TestScore = 252.0 + 64.33STR – 3.42STR2
+ .059STR3
(163.6) (24.86) (1.25) (.021)
– 5.47HiEL – .420LunchPCT + 11.75ln(Income)
(1.03) (.029) (1.78)
Interpretation of coefficients on:
 HiEL?
 LunchPCT?
 ln(Income)?
 STR, STR2
, STR3
?
6-51
Interpreting the regression function via plots
(preceding regression is labeled (5) in this figure)
6-52
Are the higher order terms in STR statistically
significant?
TestScore = 252.0 + 64.33STR – 3.42STR2
+ .059STR3
(163.6) (24.86) (1.25) (.021)
– 5.47HiEL – .420LunchPCT + 11.75ln(Income)
(1.03) (.029) (1.78)
(a) H0: quadratic in STR v. H1: cubic in STR?
t = .059/.021 = 2.86 (p = .005)
(b) H0: linear in STR v. H1: nonlinear/up to cubic in STR?
F = 6.17 (p = .002)
6-53
Question #2: STR-PctEL interactions
(to simplify things, ignore STR2
, STR3
terms for now)
TestScore = 653.6 – .53STR + 5.50HiEL – .58HiEL STR
(9.9) (.34) (9.80) (.50)
– .411LunchPCT + 12.12ln(Income)
(.029) (1.80)
Interpretation of coefficients on:
 STR?
 HiEL? (wrong sign?)
 HiEL STR?
 LunchPCT?
 ln(Income)?
6-54
Interpreting the regression functions via plots:
TestScore = 653.6 – .53STR + 5.50HiEL – .58HiEL STR
(9.9) (.34) (9.80) (.50)
– .411LunchPCT + 12.12ln(Income)
(.029) (1.80)
“Real-world” (“policy” or “economic”) importance of
the interaction term:
TestScore
STR


= –.53 – .58HiEL =
1.12 if 1
.53 if 0
HiEL
HiEL
 


 

 The difference in the estimated effect of reducing the
STR is substantial; class size reduction is more
effective in districts with more English learners
6-55
Is the interaction effect statistically significant?
TestScore = 653.6 – .53STR + 5.50HiEL – .58HiEL STR
(9.9) (.34) (9.80) (.50)
– .411LunchPCT + 12.12ln(Income)
(.029) (1.80)
(a) H0: coeff. on interaction=0 v. H1: nonzero interaction
t = –1.17 not significant at the 10% level
(b) H0: both coeffs involving STR = 0 vs.
H1: at least one coefficient is nonzero (STR enters)
F = 5.92 (p = .003)
Next: specifications with polynomials + interactions!
6-56
6-57
Interpreting the regression functions via plots:
6-58
Tests of joint hypotheses:
6-59
Summary: Nonlinear Regression Functions
 Using functions of the independent variables such as
ln(X) or X1 X2, allows recasting a large family of
nonlinear regression functions as multiple regression.
 Estimation and inference proceeds in the same way as
in the linear multiple regression model.
 Interpretation of the coefficients is model-specific, but
the general rule is to compute effects by comparing
different cases (different value of the original X’s)
 Many nonlinear specifications are possible, so you must
use judgment: What nonlinear effect you want to
analyze? What makes sense in your application?

More Related Content

Similar to Ch 6 Slides.doc/9929292929292919299292@:&:&:&9/92

Chapter5.pdf.pdf
Chapter5.pdf.pdfChapter5.pdf.pdf
Chapter5.pdf.pdf
ROBERTOENRIQUEGARCAA1
 
Machine Learning - Regression model
Machine Learning - Regression modelMachine Learning - Regression model
Machine Learning - Regression model
RADO7900
 
ML Module 3.pdf
ML Module 3.pdfML Module 3.pdf
ML Module 3.pdf
Shiwani Gupta
 
Nonparametric approach to multiple regression
Nonparametric approach to multiple regressionNonparametric approach to multiple regression
Nonparametric approach to multiple regression
Alexander Decker
 
Lecture - 8 MLR.pptx
Lecture - 8 MLR.pptxLecture - 8 MLR.pptx
Lecture - 8 MLR.pptx
iris765749
 
Linear regression without tears
Linear regression without tearsLinear regression without tears
Linear regression without tears
Ankit Sharma
 
Reg
RegReg
Regression and corelation (Biostatistics)
Regression and corelation (Biostatistics)Regression and corelation (Biostatistics)
Regression and corelation (Biostatistics)
Muhammadasif909
 
Linear regression
Linear regressionLinear regression
Linear regression
vermaumeshverma
 
Regression
RegressionRegression
Corr-and-Regress (1).ppt
Corr-and-Regress (1).pptCorr-and-Regress (1).ppt
Corr-and-Regress (1).ppt
MuhammadAftab89
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
BAGARAGAZAROMUALD2
 
Cr-and-Regress.ppt
Cr-and-Regress.pptCr-and-Regress.ppt
Cr-and-Regress.ppt
RidaIrfan10
 
Correlation & Regression for Statistics Social Science
Correlation & Regression for Statistics Social ScienceCorrelation & Regression for Statistics Social Science
Correlation & Regression for Statistics Social Science
ssuser71ac73
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
HarunorRashid74
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
krunal soni
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
MoinPasha12
 
Statistics-Regression analysis
Statistics-Regression analysisStatistics-Regression analysis
Statistics-Regression analysis
Rabin BK
 
Regression
RegressionRegression
Regression
LavanyaK75
 

Similar to Ch 6 Slides.doc/9929292929292919299292@:&:&:&9/92 (20)

Chapter5.pdf.pdf
Chapter5.pdf.pdfChapter5.pdf.pdf
Chapter5.pdf.pdf
 
Machine Learning - Regression model
Machine Learning - Regression modelMachine Learning - Regression model
Machine Learning - Regression model
 
ML Module 3.pdf
ML Module 3.pdfML Module 3.pdf
ML Module 3.pdf
 
Powerpoint2.reg
Powerpoint2.regPowerpoint2.reg
Powerpoint2.reg
 
Nonparametric approach to multiple regression
Nonparametric approach to multiple regressionNonparametric approach to multiple regression
Nonparametric approach to multiple regression
 
Lecture - 8 MLR.pptx
Lecture - 8 MLR.pptxLecture - 8 MLR.pptx
Lecture - 8 MLR.pptx
 
Linear regression without tears
Linear regression without tearsLinear regression without tears
Linear regression without tears
 
Reg
RegReg
Reg
 
Regression and corelation (Biostatistics)
Regression and corelation (Biostatistics)Regression and corelation (Biostatistics)
Regression and corelation (Biostatistics)
 
Linear regression
Linear regressionLinear regression
Linear regression
 
Regression
RegressionRegression
Regression
 
Corr-and-Regress (1).ppt
Corr-and-Regress (1).pptCorr-and-Regress (1).ppt
Corr-and-Regress (1).ppt
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
 
Cr-and-Regress.ppt
Cr-and-Regress.pptCr-and-Regress.ppt
Cr-and-Regress.ppt
 
Correlation & Regression for Statistics Social Science
Correlation & Regression for Statistics Social ScienceCorrelation & Regression for Statistics Social Science
Correlation & Regression for Statistics Social Science
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
 
Statistics-Regression analysis
Statistics-Regression analysisStatistics-Regression analysis
Statistics-Regression analysis
 
Regression
RegressionRegression
Regression
 

More from ohenebabismark508

Ch 8 Slides.doc28383&3&3&39388383288283838
Ch 8 Slides.doc28383&3&3&39388383288283838Ch 8 Slides.doc28383&3&3&39388383288283838
Ch 8 Slides.doc28383&3&3&39388383288283838
ohenebabismark508
 
Ch 4 Slides.doc655444444444444445678888776
Ch 4 Slides.doc655444444444444445678888776Ch 4 Slides.doc655444444444444445678888776
Ch 4 Slides.doc655444444444444445678888776
ohenebabismark508
 
Ch 10 Slides.doc546555544554555455555777777
Ch 10 Slides.doc546555544554555455555777777Ch 10 Slides.doc546555544554555455555777777
Ch 10 Slides.doc546555544554555455555777777
ohenebabismark508
 
Ch 12 Slides.doc. Introduction of science of business
Ch 12 Slides.doc. Introduction of science of businessCh 12 Slides.doc. Introduction of science of business
Ch 12 Slides.doc. Introduction of science of business
ohenebabismark508
 
Ch 11 Slides.doc. Introduction to econometric modeling
Ch 11 Slides.doc. Introduction to econometric modelingCh 11 Slides.doc. Introduction to econometric modeling
Ch 11 Slides.doc. Introduction to econometric modeling
ohenebabismark508
 
Chapter_1_Intro.pptx. Introductory econometric book
Chapter_1_Intro.pptx. Introductory econometric bookChapter_1_Intro.pptx. Introductory econometric book
Chapter_1_Intro.pptx. Introductory econometric book
ohenebabismark508
 

More from ohenebabismark508 (6)

Ch 8 Slides.doc28383&3&3&39388383288283838
Ch 8 Slides.doc28383&3&3&39388383288283838Ch 8 Slides.doc28383&3&3&39388383288283838
Ch 8 Slides.doc28383&3&3&39388383288283838
 
Ch 4 Slides.doc655444444444444445678888776
Ch 4 Slides.doc655444444444444445678888776Ch 4 Slides.doc655444444444444445678888776
Ch 4 Slides.doc655444444444444445678888776
 
Ch 10 Slides.doc546555544554555455555777777
Ch 10 Slides.doc546555544554555455555777777Ch 10 Slides.doc546555544554555455555777777
Ch 10 Slides.doc546555544554555455555777777
 
Ch 12 Slides.doc. Introduction of science of business
Ch 12 Slides.doc. Introduction of science of businessCh 12 Slides.doc. Introduction of science of business
Ch 12 Slides.doc. Introduction of science of business
 
Ch 11 Slides.doc. Introduction to econometric modeling
Ch 11 Slides.doc. Introduction to econometric modelingCh 11 Slides.doc. Introduction to econometric modeling
Ch 11 Slides.doc. Introduction to econometric modeling
 
Chapter_1_Intro.pptx. Introductory econometric book
Chapter_1_Intro.pptx. Introductory econometric bookChapter_1_Intro.pptx. Introductory econometric book
Chapter_1_Intro.pptx. Introductory econometric book
 

Recently uploaded

一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
enxupq
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Boston Institute of Analytics
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
pchutichetpong
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
slg6lamcq
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
ewymefz
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
rwarrenll
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
v3tuleee
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
g4dpvqap0
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
u86oixdj
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
TravisMalana
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
jerlynmaetalle
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
MaleehaSheikh2
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
mbawufebxi
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
benishzehra469
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
Oppotus
 
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
2023240532
 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Linda486226
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
vcaxypu
 

Recently uploaded (20)

一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 

Ch 6 Slides.doc/9929292929292919299292@:&:&:&9/92

  • 1. 6-1 Nonlinear Regression Functions (SW Ch. 6)  Everything so far has been linear in the X’s  The approximation that the regression function is linear might be good for some variables, but not for others.  The multiple regression framework can be extended to handle regression functions that are nonlinear in one or more X.
  • 2. 6-2 The TestScore – STR relation looks approximately linear…
  • 3. 6-3 But the TestScore – average district income relation looks like it is nonlinear.
  • 4. 6-4 If a relation between Y and X is nonlinear:  The effect on Y of a change in X depends on the value of X – that is, the marginal effect of X is not constant  A linear regression is mis-specified – the functional form is wrong  The estimator of the effect on Y of X is biased – it needn’t even be right on average.  The solution to this is to estimate a regression function that is nonlinear in X
  • 5. 6-5 The General Nonlinear Population Regression Function Yi = f(X1i,X2i,…,Xki) + ui, i = 1,…, n Assumptions 1. E(ui| X1i,X2i,…,Xki) = 0 (same); implies that f is the conditional expectation of Y given the X’s. 2. (X1i,…,Xki,Yi) are i.i.d. (same). 3. “enough” moments exist (same idea; the precise statement depends on specific f). 4. No perfect multicollinearity (same idea; the precise statement depends on the specific f).
  • 6. 6-6
  • 7. 6-7 Nonlinear Functions of a Single Independent Variable (SW Section 6.2) We’ll look at two complementary approaches: 1. Polynomials in X The population regression function is approximated by a quadratic, cubic, or higher-degree polynomial 2. Logarithmic transformations  Y and/or X is transformed by taking its logarithm  this gives a “percentages” interpretation that makes sense in many applications
  • 8. 6-8 1. Polynomials in X Approximate the population regression function by a polynomial: Yi = 0 + 1Xi + 2 2 i X +…+ r r i X + ui  This is just the linear multiple regression model – except that the regressors are powers of X!  Estimation, hypothesis testing, etc. proceeds as in the multiple regression model using OLS  The coefficients are difficult to interpret, but the regression function itself is interpretable
  • 9. 6-9 Example: the TestScore – Income relation Incomei = average district income in the ith district (thousdand dollars per capita) Quadratic specification: TestScorei = 0 + 1Incomei + 2(Incomei)2 + ui Cubic specification: TestScorei = 0 + 1Incomei + 2(Incomei)2 + 3(Incomei)3 + ui
  • 10. 6-10 Estimation of the quadratic specification in STATA generate avginc2 = avginc*avginc; Create a new regressor reg testscr avginc avginc2, r; Regression with robust standard errors Number of obs = 420 F( 2, 417) = 428.52 Prob > F = 0.0000 R-squared = 0.5562 Root MSE = 12.724 ------------------------------------------------------------------------------ | Robust testscr | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- avginc | 3.850995 .2680941 14.36 0.000 3.32401 4.377979 avginc2 | -.0423085 .0047803 -8.85 0.000 -.051705 -.0329119 _cons | 607.3017 2.901754 209.29 0.000 601.5978 613.0056 ------------------------------------------------------------------------------ The t-statistic on Income2 is -8.85, so the hypothesis of linearity is rejected against the quadratic alternative at the 1% significance level.
  • 11. 6-11 Interpreting the estimated regression function: (a) Plot the predicted values TestScore = 607.3 + 3.85Incomei – 0.0423(Incomei)2 (2.9) (0.27) (0.0048)
  • 12. 6-12 Interpreting the estimated regression function: (a) Compute “effects” for different values of X TestScore = 607.3 + 3.85Incomei – 0.0423(Incomei)2 (2.9) (0.27) (0.0048) Predicted change in TestScore for a change in income to $6,000 from $5,000 per capita: TestScore = 607.3 + 3.85 6 – 0.0423 62 – (607.3 + 3.85 5 – 0.0423 52 ) = 3.4
  • 13. 6-13 TestScore = 607.3 + 3.85Incomei – 0.0423(Incomei)2 Predicted “effects” for different values of X Change in Income (th$ per capita) TestScore from 5 to 6 3.4 from 25 to 26 1.7 from 45 to 46 0.0 The “effect” of a change in income is greater at low than high income levels (perhaps, a declining marginal benefit of an increase in school budgets?) Caution! What about a change from 65 to 66? Don’t extrapolate outside the range of the data.
  • 14. 6-14 Estimation of the cubic specification in STATA gen avginc3 = avginc*avginc2; Create the cubic regressor reg testscr avginc avginc2 avginc3, r; Regression with robust standard errors Number of obs = 420 F( 3, 416) = 270.18 Prob > F = 0.0000 R-squared = 0.5584 Root MSE = 12.707 ------------------------------------------------------------------------------ | Robust testscr | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- avginc | 5.018677 .7073505 7.10 0.000 3.628251 6.409104 avginc2 | -.0958052 .0289537 -3.31 0.001 -.1527191 -.0388913 avginc3 | .0006855 .0003471 1.98 0.049 3.27e-06 .0013677 _cons | 600.079 5.102062 117.61 0.000 590.0499 610.108 ------------------------------------------------------------------------------ The cubic term is statistically significant at the 5%, but not 1%, level
  • 15. 6-15 Testing the null hypothesis of linearity, against the alternative that the population regression is quadratic and/or cubic, that is, it is a polynomial of degree up to 3: H0: pop’n coefficients on Income2 and Income3 = 0 H1: at least one of these coefficients is nonzero. test avginc2 avginc3; Execute the test command after running the regression ( 1) avginc2 = 0.0 ( 2) avginc3 = 0.0 F( 2, 416) = 37.69 Prob > F = 0.0000 The hypothesis that the population regression is linear is rejected at the 1% significance level against the alternative that it is a polynomial of degree up to 3.
  • 16. 6-16 Summary: polynomial regression functions Yi = 0 + 1Xi + 2 2 i X +…+ r r i X + ui  Estimation: by OLS after defining new regressors  Coefficients have complicated interpretations  To interpret the estimated regression function: oplot predicted values as a function of x ocompute predicted Y/X at different values of x  Hypotheses concerning degree r can be tested by t- and F-tests on the appropriate (blocks of) variable(s).  Choice of degree r oplot the data; t- and F-tests, check sensitivity of estimated effects; judgment. oOr use model selection criteria (maybe later)
  • 17. 6-17 2. Logarithmic functions of Y and/or X  ln(X) = the natural logarithm of X  Logarithmic transforms permit modeling relations in “percentage” terms (like elasticities), rather than linearly. Here’s why: ln(x+x) – ln(x) = ln 1 x x         x x  (calculus: ln( ) 1 d x dx x  ) Numerically: ln(1.01) = .00995 .01; ln(1.10) = .0953 .10 (sort of)
  • 18. 6-18 Three cases: Case Population regression function I. linear-log Yi = 0 + 1ln(Xi) + ui II. log-linear ln(Yi) = 0 + 1Xi + ui III. log-log ln(Yi) = 0 + 1ln(Xi) + ui  The interpretation of the slope coefficient differs in each case.  The interpretation is found by applying the general “before and after” rule: “figure out the change in Y for a given change in X.”
  • 19. 6-19 I. Linear-log population regression function Yi = 0 + 1ln(Xi) + ui (b) Now change X: Y + Y = 0 + 1ln(X + X) (a) Subtract (a) – (b): Y = 1[ln(X + X) – ln(X)] now ln(X + X) – ln(X) X X  , so Y 1 X X  or 1 / Y X X   (small X)
  • 20. 6-20 Linear-log case, continued Yi = 0 + 1ln(Xi) + ui for small X, 1 / Y X X   Now 100 X X  = percentage change in X, so a 1% increase in X (multiplying X by 1.01) is associated with a .011 change in Y.
  • 21. 6-21 Example: TestScore vs. ln(Income)  First defining the new regressor, ln(Income)  The model is now linear in ln(Income), so the linear-log model can be estimated by OLS: TestScore = 557.8 + 36.42 ln(Incomei) (3.8) (1.40) so a 1% increase in Income is associated with an increase in TestScore of 0.36 points on the test.  Standard errors, confidence intervals, R2 – all the usual tools of regression apply here.  How does this compare to the cubic model?
  • 22. 6-22 TestScore = 557.8 + 36.42 ln(Incomei)
  • 23. 6-23 II. Log-linear population regression function ln(Yi) = 0 + 1Xi + ui (b) Now change X: ln(Y + Y) = 0 + 1(X + X) (a) Subtract (a) – (b): ln(Y + Y) – ln(Y) = 1X so Y Y  1X or 1 / Y Y X   (small X)
  • 24. 6-24 Log-linear case, continued ln(Yi) = 0 + 1Xi + ui for small X, 1 / Y Y X    Now 100 Y Y  = percentage change in Y, so a change in X by one unit (X = 1) is associated with a 1001% change in Y (Y increases by a factor of 1+1).  Note: What are the units of ui and the SER? ofractional (proportional) deviations ofor example, SER = .2 means…
  • 25. 6-25 III. Log-log population regression function ln(Yi) = 0 + 1ln(Xi) + ui (b) Now change X: ln(Y + Y) = 0 + 1ln(X + X) (a) Subtract: ln(Y + Y) – ln(Y) = 1[ln(X + X) – ln(X)] so Y Y  1 X X  or 1 / / Y Y X X   (small X)
  • 26. 6-26 Log-log case, continued ln(Yi) = 0 + 1ln(Xi) + ui for small X, 1 / / Y Y X X   Now 100 Y Y  = percentage change in Y, and 100 X X  = percentage change in X, so a 1% change in X is associated with a 1% change in Y.  In the log-log specification, 1 has the interpretation of an elasticity.
  • 27. 6-27 Example: ln( TestScore) vs. ln( Income)  First defining a new dependent variable, ln(TestScore), and the new regressor, ln(Income)  The model is now a linear regression of ln(TestScore) against ln(Income), which can be estimated by OLS: ln( ) TestScore = 6.336 + 0.0554 ln(Incomei) (0.006) (0.0021) An 1% increase in Income is associated with an increase of .0554% in TestScore (factor of 1.0554)  How does this compare to the log-linear model?
  • 28. 6-28 Neither specification seems to fit as well as the cubic or linear-log
  • 29. 6-29 Summary: Logarithmic transformations  Three cases, differing in whether Y and/or X is transformed by taking logarithms.  After creating the new variable(s) ln(Y) and/or ln(X), the regression is linear in the new variables and the coefficients can be estimated by OLS.  Hypothesis tests and confidence intervals are now standard.  The interpretation of 1 differs from case to case.  Choice of specification should be guided by judgment (which interpretation makes the most sense in your application?), tests, and plotting predicted values
  • 30. 6-30 Interactions Between Independent Variables (SW Section 6.3)  Perhaps a class size reduction is more effective in some circumstances than in others…  Perhaps smaller classes help more if there are many English learners, who need individual attention  That is, TestScore STR   might depend on PctEL  More generally, 1 Y X   might depend on X2  How to model such “interactions” between X1 and X2?  We first consider binary X’s, then continuous X’s
  • 31. 6-31 (a) Interactions between two binary variables Yi = 0 + 1D1i + 2D2i + ui  D1i, D2i are binary  1 is the effect of changing D1=0 to D1=1. In this specification, this effect doesn’t depend on the value of D2.  To allow the effect of changing D1 to depend on D2, include the “interaction term” D1i D2i as a regressor: Yi = 0 + 1D1i + 2D2i + 3(D1i D2i) + ui
  • 32. 6-32 Interpreting the coefficients Yi = 0 + 1D1i + 2D2i + 3(D1i D2i) + ui General rule: compare the various cases E(Yi|D1i=0, D2i=d2) = 0 + 2d2 (b) E(Yi|D1i=1, D2i=d2) = 0 + 1 + 2d2 + 3d2 (a) subtract (a) – (b): E(Yi|D1i=1, D2i=d2) – E(Yi|D1i=0, D2i=d2) = 1 + 3d2  The effect of D1 depends on d2 (what we wanted)  3 = increment to the effect of D1, when D2 = 1
  • 33. 6-33 Example: TestScore, STR, English learners Let HiSTR = 1 if 20 0 if 20 STR STR      and HiEL = 1 if l0 0 if 10 PctEL PctEL      TestScore = 664.1 – 18.2HiEL – 1.9HiSTR – 3.5(HiSTR HiEL) (1.4) (2.3) (1.9) (3.1)  “Effect” of HiSTR when HiEL = 0 is –1.9  “Effect” of HiSTR when HiEL = 1 is –1.9 – 3.5 = –5.4  Class size reduction is estimated to have a bigger effect when the percent of English learners is large  This interaction isn’t statistically significant: t = 3.5/3.1
  • 34. 6-34 (b) Interactions between continuous and binary variables Yi = 0 + 1Di + 2Xi + ui  Di is binary, X is continuous  As specified above, the effect on Y of X (holding constant D) = 2, which does not depend on D  To allow the effect of X to depend on D, include the “interaction term” Di Xi as a regressor: Yi = 0 + 1Di + 2Xi + 3(Di Xi) + ui
  • 35. 6-35 Interpreting the coefficients Yi = 0 + 1Di + 2Xi + 3(Di Xi) + ui General rule: compare the various cases Y = 0 + 1D + 2X + 3(D X) (b) Now change X: Y + Y = 0 + 1D + 2(X+X) + 3[D (X+X)] (a) subtract (a) – (b): Y = 2X + 3DX or Y X   = 2 + 3D  The effect of X depends on D (what we wanted)  3 = increment to the effect of X, when D = 1 Example: TestScore, STR, HiEL (=1 if PctEL 20)
  • 36. 6-36 TestScore = 682.2 – 0.97STR + 5.6HiEL – 1.28(STR HiEL) (11.9) (0.59) (19.5) (0.97)  When HiEL = 0: TestScore = 682.2 – 0.97STR  When HiEL = 1, TestScore = 682.2 – 0.97STR + 5.6 – 1.28STR = 687.8 – 2.25STR  Two regression lines: one for each HiSTR group.  Class size reduction is estimated to have a larger effect when the percent of English learners is large. Example, ctd.
  • 37. 6-37 TestScore = 682.2 – 0.97STR + 5.6HiEL – 1.28(STR HiEL) (11.9) (0.59) (19.5) (0.97) Testing various hypotheses:  The two regression lines have the same slope the coefficient on STR HiEL is zero: t = –1.28/0.97 = –1.32 can’t reject  The two regression lines have the same intercept the coefficient on HiEL is zero: t = –5.6/19.5 = 0.29 can’t reject Example, ctd. TestScore = 682.2 – 0.97STR + 5.6HiEL – 1.28(STR HiEL), (11.9) (0.59) (19.5) (0.97)
  • 38. 6-38  Joint hypothesis that the two regression lines are the same population coefficient on HiEL = 0 and population coefficient on STR HiEL = 0: F = 89.94 (p-value < .001) !!  Why do we reject the joint hypothesis but neither individual hypothesis?  Consequence of high but imperfect multicollinearity: high correlation between HiEL and STR HiEL Binary-continuous interactions: the two regression lines Yi = 0 + 1Di + 2Xi + 3(Di Xi) + ui Observations with Di= 0 (the “D = 0” group):
  • 39. 6-39 Yi = 0 + 2Xi + ui Observations with Di= 1 (the “D = 1” group): Yi = 0 + 1 + 2Xi + 3Xi + ui = (0+1) + (2+3)Xi + ui
  • 40. 6-40
  • 41. 6-41 (c) Interactions between two continuous variables Yi = 0 + 1X1i + 2X2i + ui  X1, X2 are continuous  As specified, the effect of X1 doesn’t depend on X2  As specified, the effect of X2 doesn’t depend on X1  To allow the effect of X1 to depend on X2, include the “interaction term” X1i X2i as a regressor: Yi = 0 + 1X1i + 2X2i + 3(X1i X2i) + ui Coefficients in continuous-continuous interactions
  • 42. 6-42 Yi = 0 + 1X1i + 2X2i + 3(X1i X2i) + ui General rule: compare the various cases Y = 0 + 1X1 + 2X2 + 3(X1 X2) (b) Now change X1: Y+ Y = 0 + 1(X1+X1) + 2X2 + 3[(X1+X1) X2] (a) subtract (a) – (b): Y = 1X1 + 3X2X1 or 1 Y X   = 2 + 3X2  The effect of X1 depends on X2 (what we wanted)  3 = increment to the effect of X1 from a unit change in X2 Example: TestScore, STR, PctEL
  • 43. 6-43 TestScore = 686.3 – 1.12STR – 0.67PctEL + .0012(STR PctEL), (11.8) (0.59) (0.37) (0.019) The estimated effect of class size reduction is nonlinear because the size of the effect itself depends on PctEL: TestScore STR   = –1.12 + .0012PctEL PctEL TestScore STR   0 –1.12 20% –1.12+.0012 20 = –1.10 Example, ctd: hypothesis tests TestScore = 686.3 – 1.12STR – 0.67PctEL + .0012(STR PctEL), (11.8) (0.59) (0.37) (0.019)
  • 44. 6-44  Does population coefficient on STR PctEL = 0? t = .0012/.019 = .06 can’t reject null at 5% level  Does population coefficient on STR = 0? t = –1.12/0.59 = –1.90 can’t reject null at 5% level  Do the coefficients on both STR and STR PctEL = 0? F = 3.89 (p-value = .021) reject null at 5% level(!!) (Why? high but imperfect multicollinearity)
  • 45. 6-45 Application: Nonlinear Effects on Test Scores of the Student-Teacher Ratio (SW Section 6.4) Focus on two questions: 1. Are there nonlinear effects of class size reduction on test scores? (Does a reduction from 35 to 30 have same effect as a reduction from 20 to 15?) 2. Are there nonlinear interactions between PctEL and STR? (Are small classes more effective when there are many English learners?)
  • 46. 6-46 Strategy for Question #1 (different effects for different STR?)  Estimate linear and nonlinear functions of STR, holding constant relevant demographic variables oPctEL oIncome (remember the nonlinear TestScore-Income relation!) oLunchPCT (fraction on free/subsidized lunch)  See whether adding the nonlinear terms makes an “economically important” quantitative difference (“economic” or “real-world” importance is different than statistically significant)  Test for whether the nonlinear terms are significant
  • 47. 6-47 What is a good “base” specification?
  • 48. 6-48 The TestScore – Income relation An advantage of the logarithmic specification is that it is better behaved near the ends of the sample, especially large values of income.
  • 49. 6-49 Base specification From the scatterplots and preceding analysis, here are plausible starting points for the demographic control variables: Dependent variable: TestScore Independent variable Functional form PctEL linear LunchPCT linear Income ln(Income) (or could use cubic)
  • 50. 6-50 Question #1: Investigate by considering a polynomial in STR TestScore = 252.0 + 64.33STR – 3.42STR2 + .059STR3 (163.6) (24.86) (1.25) (.021) – 5.47HiEL – .420LunchPCT + 11.75ln(Income) (1.03) (.029) (1.78) Interpretation of coefficients on:  HiEL?  LunchPCT?  ln(Income)?  STR, STR2 , STR3 ?
  • 51. 6-51 Interpreting the regression function via plots (preceding regression is labeled (5) in this figure)
  • 52. 6-52 Are the higher order terms in STR statistically significant? TestScore = 252.0 + 64.33STR – 3.42STR2 + .059STR3 (163.6) (24.86) (1.25) (.021) – 5.47HiEL – .420LunchPCT + 11.75ln(Income) (1.03) (.029) (1.78) (a) H0: quadratic in STR v. H1: cubic in STR? t = .059/.021 = 2.86 (p = .005) (b) H0: linear in STR v. H1: nonlinear/up to cubic in STR? F = 6.17 (p = .002)
  • 53. 6-53 Question #2: STR-PctEL interactions (to simplify things, ignore STR2 , STR3 terms for now) TestScore = 653.6 – .53STR + 5.50HiEL – .58HiEL STR (9.9) (.34) (9.80) (.50) – .411LunchPCT + 12.12ln(Income) (.029) (1.80) Interpretation of coefficients on:  STR?  HiEL? (wrong sign?)  HiEL STR?  LunchPCT?  ln(Income)?
  • 54. 6-54 Interpreting the regression functions via plots: TestScore = 653.6 – .53STR + 5.50HiEL – .58HiEL STR (9.9) (.34) (9.80) (.50) – .411LunchPCT + 12.12ln(Income) (.029) (1.80) “Real-world” (“policy” or “economic”) importance of the interaction term: TestScore STR   = –.53 – .58HiEL = 1.12 if 1 .53 if 0 HiEL HiEL         The difference in the estimated effect of reducing the STR is substantial; class size reduction is more effective in districts with more English learners
  • 55. 6-55 Is the interaction effect statistically significant? TestScore = 653.6 – .53STR + 5.50HiEL – .58HiEL STR (9.9) (.34) (9.80) (.50) – .411LunchPCT + 12.12ln(Income) (.029) (1.80) (a) H0: coeff. on interaction=0 v. H1: nonzero interaction t = –1.17 not significant at the 10% level (b) H0: both coeffs involving STR = 0 vs. H1: at least one coefficient is nonzero (STR enters) F = 5.92 (p = .003) Next: specifications with polynomials + interactions!
  • 56. 6-56
  • 57. 6-57 Interpreting the regression functions via plots:
  • 58. 6-58 Tests of joint hypotheses:
  • 59. 6-59 Summary: Nonlinear Regression Functions  Using functions of the independent variables such as ln(X) or X1 X2, allows recasting a large family of nonlinear regression functions as multiple regression.  Estimation and inference proceeds in the same way as in the linear multiple regression model.  Interpretation of the coefficients is model-specific, but the general rule is to compute effects by comparing different cases (different value of the original X’s)  Many nonlinear specifications are possible, so you must use judgment: What nonlinear effect you want to analyze? What makes sense in your application?