ECON 3406 Multiple Regression Instructions
LO2.2
In this assignment, you are provided with data from a research paper by Ella Hartenian and Nicholas Horton, who examined whether or not there was a relationship between rail trails and property values in Northampton, MA. The data are located in the file ‘house prices.xlsx’ which is in your e-mail on D2L. Please see the accompanying word document, ‘documentation.docx,’ which describes what each of the columns in the data set mean and how they are scaled. For this assignment you will learn how to estimate a common regression model used in economics, finance, and real estate called a “hedonic price model.”
1. Create a copy of the data set by right-clicking the tab at the bottom and clicking ‘Move or Copy...’, then create another copy of the data set in the same workbook. Label this copy “Regression 1” and the original tab “Original Data.” Now click on the tab for “Regression 1.” Delete all variables except ‘price2014,’ ‘bedrooms,’ ‘garage space’, ‘nofullbath’, ‘norooms’, ‘squarefeet,’ and ‘walkscore.’ You should have seven total variables. Make sure ‘price 2014’ is the far left variable, then estimate the model where ‘price2014’ is a function of the other six variables. Then answer the questions:
(a) For each variable, explain what the sign of the regression coefficient is, what the magnitude is (i.e., if x goes up by one unit how much does y increase by), and whether or not the results of your regression are consistent with what you believed the sign and magnitude to be. Then explain if any variables are insignificant.
• Use p < 0.1 as the p-value cutoff for whether results are significant.
(b) Write down the equation of your regression line, including all variables. Next, calculate the average value for each of your six explanatory variables using the AVERAGE() function in Excel. Finally, make a prediction for the price if each of the variables take their average value.
(c) Explain briefly if there are any variables in the data set you think should have been included and, if so, why. You must refer to at least one other variable.
IMPORTANT: TYPE YOUR ANSWERS IN A TEXT BOX (“INSERT” A TEXT BOX). MAKE SURE TO ADDRESS EACH QUESTION (a, b, and c)
LO2.3
In this assignment you are provided with data from a research paper by Ella Hartenian and Nicholas Horton, who examined whether or not there was a relationship between rail trails and property values in Northamp- ton, MA. The data are located in the file ‘house prices.xlsx’ (same data used above) which is in your e-mail on D2L. Please see the accompanying word document, ‘documentation.docx,’ which describes what each of the columns in the data set mean and how they are scaled. For this assignment you will assess whether rail trails influence property values.
1. Create a copy of the same data set used for “LO2.2” by right-clicking the tab at the bottom and clicking ‘Move or Copy...’, then create another copy of the data set in ...
ECON 3406 Multiple Regression InstructionsLO2.2In this ass.docx
1. ECON 3406 Multiple Regression Instructions
LO2.2
In this assignment, you are provided with data from a research
paper by Ella Hartenian and Nicholas Horton, who examined
whether or not there was a relationship between rail trails and
property values in Northampton, MA. The data are located in
the file ‘house prices.xlsx’ which is in your e-mail on D2L.
Please see the accompanying word document,
‘documentation.docx,’ which describes what each of the
columns in the data set mean and how they are scaled. For this
assignment you will learn how to estimate a common regression
model used in economics, finance, and real estate called a
“hedonic price model.”
1. Create a copy of the data set by right-clicking the tab at the
bottom and clicking ‘Move or Copy...’, then create another copy
of the data set in the same workbook. Label this copy
“Regression 1” and the original tab “Original Data.” Now click
on the tab for “Regression 1.” Delete all variables except
‘price2014,’ ‘bedrooms,’ ‘garage space’, ‘nofullbath’,
‘norooms’, ‘squarefeet,’ and ‘walkscore.’ You should have
seven total variables. Make sure ‘price 2014’ is the far left
variable, then estimate the model where ‘price2014’ is a
function of the other six variables. Then answer the questions:
(a) For each variable, explain what the sign of the regression
coefficient is, what the magnitude is (i.e., if x goes up by one
unit how much does y increase by), and whether or not the
results of your regression are consistent with what you believed
the sign and magnitude to be. Then explain if any variables are
insignificant.
2. • Use p < 0.1 as the p-value cutoff for whether results are
significant.
(b) Write down the equation of your regression line, including
all variables. Next, calculate the average value for each of your
six explanatory variables using the AVERAGE() function in
Excel. Finally, make a prediction for the price if each of the
variables take their average value.
(c) Explain briefly if there are any variables in the data set you
think should have been included and, if so, why. You must
refer to at least one other variable.
IMPORTANT: TYPE YOUR ANSWERS IN A TEXT BOX
(“INSERT” A TEXT BOX). MAKE SURE TO ADDRESS
EACH QUESTION (a, b, and c)
LO2.3
In this assignment you are provided with data from a research
paper by Ella Hartenian and Nicholas Horton, who examined
whether or not there was a relationship between rail trails and
property values in Northamp- ton, MA. The data are located in
the file ‘house prices.xlsx’ (same data used above) which is in
your e-mail on D2L. Please see the accompanying word
document, ‘documentation.docx,’ which describes what each of
the columns in the data set mean and how they are scaled. For
this assignment you will assess whether rail trails influence
property values.
1. Create a copy of the same data set used for “LO2.2” by right-
3. clicking the tab at the bottom and clicking ‘Move or Copy...’,
then create another copy of the data set in the same workbook.
Label this copy “Regression 2” and the original tab “Original
Data.” Now click on the tab for “Regression 2.” Then answer
the question:
(a) Estimate the model where house prices (“price2014”
variable) are affected by rail distance only. What is the sign and
magnitude of the coefficient on rail distance, and what is the fit
of the regression?
(b) Create another copy of the “original data” following the
same steps used before and label this new tab “Regression 3.”
Estimate a new model where house prices are affected by both
rail distance and previous prices (“price1998”) only. What is
the sign and new magnitude of the coefficient on rail distance,
and what is the fit of the regression?
(c) Create another copy of the “original data” following the
same steps used before and label this new tab “Regression 4.”
Estimate a new model where house prices are affected by both
rail distance and previous prices, as well as any other variable
you consider relevant (select at least one more variable and
explain briefly why). What is the sign and magnitude of the
coefficient on rail distance, and what is the fit of the
regression?
(d) Write a paragraph that answers whether rail trails affect
housing prices based on your three models.
IMPORTANT: SHOW ALL YOUR NUMERIAL ANSWERS IN
THE EXCEL SHEETS AND TYPE YOUR ANSWERS IN A
TEXT BOX (“INSERT” A TEXT BOX). MAKE SURE TO
ADDRESS EACH QUESTION (a, b, c, and d).
2
27. 72.6230710846.8052845149.5287.5249.9320.805> 1500
sf1.941Williams Street388010601030.31> 1/4
acre175.404272.91125237.048961-2
beds2471.0980.723295455Farther Away0no10442.326989-
72.6589351050.6259834120237.5223.8176.502<= 1500
sf1.197Winslow Ave1381062
Name: Northampton Housing Price, Neighborhood
Characteristics, and Railtrail Proximity Dataset
Type: Natural experiment
Size: 30 variables / 104 observations
Article Title: Rail-Trails and Property Values: Is there an
Association?
Descriptive Abstract:
This dataset comprises 104 homes in Northampton: MA that
were sold in 2007. We measured the shortest distance from each
home to a railtrail on streets and pathways with Google maps
and recorded the Zillow.com estimate of each home’s price in
1998 and 2011. Additional attributes such as square footage:
number of bedrooms and bathrooms are available from a realty
database from 2007. We divide the houses into two groups
based on distance to the trail (distgroup).
Sources:
Housing attribute data from the Multiple Listing Service.
Housing prices in 1998 are from Zillow.com’s 10-year price
estimate while values from 2007: 2011 and 2014 are from
Zillow.com's current price estimator. Distance from homes to
the Rail-trail were mapped in Google maps "my places" feature
by the first author and can be retraced by drawing a line from
the home to the nearest railtrail access point. Bikescores and
Walkscores from walkscore.com.
Variable Descriptions:
housenum: unique house number
28. acre: number of acres that the property encompasses
acregroup: less than or equal to 1/4 acre or greater than 1/4
acre
adj1998: estimated 1998 price in thousands of 2014 dollars
adj2007: estimated 2007 price in thousands of 2014 dollars
adj2011: estimated 2011 price in thousands of 2014 dollars
bedgroup: 1-2, 3, 4+ bedrooms
bedrooms: number of bedrooms
bikescore: bike friendliness (0-100 score, higher scores are
better)
diff2014: difference in price between 2014 estimate and
adjusted 1998 estimate (in thousands of dollars).
distance: distance (in feet) to the nearest entry point to the rail
trail network
distgroup: distance is "Closer" (less than or equal to half mile)
or "Farther Away" (greater than half a mile).
garage_spaces: number of garage spaces (0-4)
garagegroup: any garage spaces (yes/no)
latitude:
longitude:
no_full_baths: number of full baths (includes shower or
bathtub)
no_half_baths: number of half paths (no shower or bathtub)
no_rooms: number of rooms
pctchange: percentage change from adjusted 1998 price to 2014
(value of zero means no change)
price1998: Zillow 10 year estimate from 2008 (in thousands of
dollars)
price2007: Zillow price estimate from 2007 (in thousands of
dollars)
price2011: Zillow price estimate from 2011 (in thousands of
dollars)
price2014: Zillow price estimate from 2014 (in thousands of
dollars)
sfgroup: squarefeet less than or equal to 1500 or greater than
1500 sf
29. squarefeet: square footage of interior finished space (in
thousands of sf)
streetname: name of street
streetno: house number on street
walkscore: walk friendliness (0-100 score, higher scores are
better)
zip: location (1060 = Northampton: 1062 = Florence)
Submitted by:
Nicholas Horton: Professor of Mathematics and Statistics:
Amherst College
Ella Hartenian: Department of Biological Sciences: Smith
College
Email: [email protected]