Linear Regression

LINEAR REGRESSION WITH ONE
REGRESSOR
Ryan Herzog, Ph.D.

HEDONIC PRICING IN REAL ESTATE
• What is the relationship between square footage of a house and
its price?
• If one square foot of living space is added to a house, by how much will its
price increase?
• $50?
• $100?
• All of these questions can be answered with the help of a linear
regression. We will be using the "real estate" dataset to help us
answer these questions.

THE RELATIONSHIP BETWEEN SQUARE FOOTAGE OF A
HOUSE AND ITS PRICE
• Stata: twoway scatter price sqft

SIDENOTE – LABEL VAR
• To clean up graphs (and other output), use the command "label variable"
to attach readable labels to variables.
• label variable sqft "Square Feet"
• label variable price "Price (thousands)"

THE RELATIONSHIP BETWEEN SQUARE FOOTAGE OF A
HOUSE AND ITS PRICE
• Add a line of best fit
• twoway (scatter price sqft) (lfit price sqft)

THE EQUATION OF THE LINE
Price = 153.18 + 0.195 * sq ft
For a 2,000 sq ft house, the expected price (in thousands)
would be
= 153.18 + 0.19537 * 2,000 (using the unrounded slope)
= 543.92
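The intercept and slope above come from a regression of price on square footage. A minimal sketch of how they could be obtained and reused in Stata, assuming the real estate dataset is already loaded and the variables are named price and sqft as in the graphing commands:

```stata
* Fit the line of best fit (price in thousands, sqft in square feet)
regress price sqft

* Predicted price for a 2,000 sq ft house using the stored estimates
display _b[_cons] + _b[sqft]*2000

* The same calculation typed by hand with the coefficients shown on the slide
display 153.18 + 0.19537*2000
```

Here _b[_cons] and _b[sqft] are the intercept and slope stored by the most recent regress command, so the first display avoids rounding error from retyping the coefficients.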

ANOTHER GRAPHING COMMAND (AAPLOT)
• ssc install aaplot
• aaplot price sqft

LINEAR REGRESSION MODEL
Yi = β0 + β1Xi + ui, i = 1,…, n
• We have n observations, (Xi, Yi), i = 1,…, n.
• X is the independent variable or regressor or explanatory variable
• Y is the dependent variable
• β0 = intercept
• β1 = slope, coefficient
• ui = the regression error or disturbance term
• not a deterministic equation

EXAMPLE
• Name the components of the following linear
regression model
𝑝𝑟𝑖𝑐𝑒𝑖 = 𝛽0 + 𝛽1 𝑠𝑞𝑢𝑎𝑟𝑒 𝑓𝑒𝑒𝑡𝑖 + 𝑢𝑖

FITTED EQUATION AND SLOPE INTERPRETATION
• 𝑝𝑟𝑖𝑐𝑒 = 153.18 + 0.195 ∗ 𝑠𝑞𝑢𝑎𝑟𝑒 𝑓𝑒𝑒𝑡
• On average, a one unit (what is it?) increase in house size is associated
with about a 0.195 unit (what is it?) increase in its price
• If 50 square feet of living space are added to a house, on average its price
will increase by about _______
• If 100 square feet of living space are added to a house, on average its price
will increase by about ______

INTERPRETING BETA FOR THE CHANGE IN X AND AT A
POINT
• 𝑝𝑟𝑖𝑐𝑒 = 153.18 + 0.195 ∗ 𝑠𝑞𝑢𝑎𝑟𝑒 𝑓𝑒𝑒𝑡
• Change in X:
• If square feet (X) changes by 10, by how much will the price change on average?
• At a point:
• If the size of a house (X) is 1500 square feet, what is its predicted price?

MORE EXAMPLES
• 𝑇𝑒𝑠𝑡𝑠𝑐𝑜𝑟𝑒 = 520.4 − 5.82 ∗ 𝑐𝑙𝑎𝑠𝑠 𝑠𝑖𝑧𝑒
• What is the regression’s prediction for a classroom of 20 students?
• What is the regression’s prediction for the change in the classroom average
test score if the classroom size increases by 4?
• 𝑊𝑒𝑖𝑔ℎ𝑡 = −99.41 + 3.94 ∗ 𝐻𝑒𝑖𝑔ℎ𝑡
• What is the regression’s weight prediction for someone who is 70 inches tall?
(weight is measured in pounds, height is measured in inches)
• If someone has a growth spurt of 1.5 inches over the course of a year what is
the regression’s prediction for the increase in this person’s weight?
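One way to check answers to the four questions above is again Stata's display command. The sketch below uses only the coefficients printed on the slide; no dataset is required.

```stata
* Test score regression: prediction at class size = 20
display 520.4 - 5.82*20

* Test score regression: predicted change when class size rises by 4
display -5.82*4

* Weight regression: prediction at height = 70 inches (result in pounds)
display -99.41 + 3.94*70

* Weight regression: predicted change for a 1.5 inch increase in height
display 3.94*1.5
```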

CORRELATION DOES NOT IMPLY CAUSATION
• Proper language: is associated/correlated with, suggests
* Check out the book "Spurious Correlations" by Tyler Vigen
https://www.tylervigen.com/spurious-correlations

THE RELATIONSHIP BETWEEN PRICE AND SQUARE FEET OF
A HOUSE

MORE EXAMPLES
• Regress the price on the following variables, write out
the population regression equation, fitted equation, and
interpret the results:
• Number of bedrooms
• Number of bathrooms
• Age
• Year built

EVEN MORE EXAMPLES. DIY TIME.
• Use California test score dataset
• Regress test score on the following variables, write out
the population regression equation, the fitted equation,
and interpret the results.
• Class size
• Percent of ESL students
• Computers per student
• Percent of students qualified for free lunches
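A possible way to run the four regressions in one pass is sketched below. The variable names testscr, str, el_pct, comp_stu, and meal_pct are assumptions (the slides do not name them); replace them with the names in your copy of the California test score dataset.

```stata
* Assumed names: testscr = test score, str = class size,
* el_pct = percent ESL, comp_stu = computers per student,
* meal_pct = percent qualified for free lunches
foreach x of varlist str el_pct comp_stu meal_pct {
    regress testscr `x'
}
```

Each pass through the loop prints one regression table, from which the fitted equation can be written out and interpreted.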

MEANINGFUL INTERCEPT
• Pay attention to the following:
• Intuitively, can the X variable take the value zero?
• Is the zero value of the X variable in the sample?
• Example: In the following two regressions, is the intercept
meaningful?
𝑡𝑒𝑠𝑡 𝑠𝑐𝑜𝑟𝑒𝑖 = 𝛽0 + 𝛽1 𝑝𝑒𝑟𝑐𝑒𝑛𝑡 𝐸𝑆𝐿 𝑠𝑡𝑢𝑑𝑒𝑛𝑡𝑠𝑖 + 𝑢𝑖
𝑡𝑒𝑠𝑡 𝑠𝑐𝑜𝑟𝑒𝑖 = 𝛽0 + 𝛽1 𝑎𝑣𝑒𝑟𝑎𝑔𝑒 𝑖𝑛𝑐𝑜𝑚𝑒𝑖 + 𝑢𝑖
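Whether zero is in the sample can be checked directly in the data. The sketch below assumes hypothetical variable names el_pct (percent ESL students) and avginc (average income); the slides do not give the actual names.

```stata
* The Min column shows whether X reaches zero in the sample
summarize el_pct avginc

* Count the observations where X is exactly zero
count if el_pct == 0
count if avginc == 0
```

If no observation has X at or near zero, the intercept is an extrapolation outside the data and should not be given a literal interpretation.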

MEANINGFUL INTERCEPT
• Is the intercept meaningful in the following two
regressions?
𝑡𝑒𝑠𝑡 𝑠𝑐𝑜𝑟𝑒𝑖 = 𝛽0 + 𝛽1 𝑐𝑙𝑎𝑠𝑠 𝑠𝑖𝑧𝑒𝑖 + 𝑢𝑖
𝑡𝑒𝑠𝑡 𝑠𝑐𝑜𝑟𝑒𝑖 = 𝛽0 + 𝛽1 𝑐𝑜𝑚𝑝𝑢𝑡𝑒𝑟𝑠 𝑝𝑒𝑟 𝑠𝑡𝑢𝑑𝑒𝑛𝑡𝑖 + 𝑢𝑖

MEANINGFUL SLOPE COEFFICIENT. INTERPRETING THE
MAGNITUDE
• Is an increase in X associated with a large change in Y? If so, the
coefficient is economically meaningful (large)
• We can’t simply compare coefficients from two different regressions, since
each variable has a different standard deviation. We want to find out by how
much Y changes when X changes by its standard deviation
• Steps to find out the magnitude:
1. Find the standard deviation of X
2. Multiply the standard deviation of X by the coefficient
3. Compare the product from (2) to the standard deviation of Y (often by
dividing it by the standard deviation of Y)
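The three steps can be sketched in Stata as follows. The variable names testscr (Y) and str (X) are assumptions used for illustration; substitute the variables from your own regression.

```stata
* Step 1: standard deviation of X (summarize stores it in r(sd))
quietly summarize str
scalar sd_x = r(sd)

* Standard deviation of Y, needed for the comparison in step 3
quietly summarize testscr
scalar sd_y = r(sd)

* Step 2: coefficient times the standard deviation of X
quietly regress testscr str
display _b[str]*sd_x

* Step 3: the same change expressed in standard deviations of Y
display _b[str]*sd_x/sd_y
```

The last number reads as "a one standard deviation change in X is associated with a change of this many standard deviations in Y", which makes magnitudes comparable across regressions.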

INTERPRETING THE MAGNITUDE. EXAMPLES
• Is the effect of class size on test scores large?
• Is the effect of the percent of ESL students on test scores large?
• If we compare the coefficients from the two regressions, what
conclusion might we arrive at?
• If class size increases by 1 standard deviation, by how much will the test
scores change?
• If the percent of ESL students increases by 1 standard deviation, by how
much will the test scores change?
• How do these changes compare to the standard deviation of test
scores?

INTERPRETING THE MAGNITUDE.
• Is the effect of the independent variable in the following two
regressions on the dependent variable economically meaningful?
• With one standard deviation change in each of the independent
variables by how much will the dependent variable change in
terms of its standard deviation?
𝑡𝑒𝑠𝑡 𝑠𝑐𝑜𝑟𝑒𝑖 = 𝛽0 + 𝛽1 𝑝𝑒𝑟𝑐𝑒𝑛𝑡 𝑜𝑓 𝑠𝑡𝑢𝑑𝑒𝑛𝑡𝑠 𝑞𝑢𝑎𝑙𝑖𝑓𝑖𝑒𝑑 𝑓𝑜𝑟 𝑓𝑟𝑒𝑒 𝑙𝑢𝑛𝑐ℎ𝑒𝑠𝑖 + 𝑢𝑖
𝑡𝑒𝑠𝑡 𝑠𝑐𝑜𝑟𝑒𝑖 = 𝛽0 + 𝛽1 𝑐𝑜𝑚𝑝𝑢𝑡𝑒𝑟𝑠 𝑝𝑒𝑟 𝑠𝑡𝑢𝑑𝑒𝑛𝑡𝑖 + 𝑢𝑖

REVIEW
• Please name the components of a linear regression (based on an example of
your own choosing)
• Why do we need to have an error term in the regression equation?
• What is a fitted equation? (What is the difference between the fitted equation
and the population equation)? Please give examples.
• How do you interpret the results of a regression at a point? Based on a
change in X?
• What is the command in Stata to run a regression?
• What does a meaningful intercept mean? Please give an example.
• How do we interpret the magnitude of the coefficient and decide if it is
economically meaningful?
