LINEAR REGRESSION WITH ONE
REGRESSOR
Ryan Herzog, Ph.D.
HEDONIC PRICING IN REAL ESTATE
• What is the relationship between square footage of a house and
its price?
• If one square foot of living space is added to a house, by how much will its
price increase?
• $50?
• $100?
• All of these questions can be answered with the help of a linear
regression. We will use the “real estate” dataset to help us
answer these questions.
THE RELATIONSHIP BETWEEN SQUARE FOOTAGE OF A
HOUSE AND ITS PRICE
• Stata: twoway scatter price sqft
SIDENOTE – LABEL VAR
• To clean up graphs (and other output), use the “label” command
to give variables readable names.
• label variable sqft "Square Feet"
• label variable price "Price (thousands)"
THE RELATIONSHIP BETWEEN SQUARE FOOTAGE OF A
HOUSE AND ITS PRICE
• Add a line of best fit
• twoway (scatter price sqft) (lfit price sqft)
THE EQUATION OF THE LINE
Price = 153.18 + 0.195 * sq ft (slope rounded from 0.19537)
For a 2,000 sq ft house, the expected price (in thousands)
would be
= 153.18 + 0.19537 * 2,000
= 543.92
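The arithmetic on this slide can be checked with a short sketch. Python is used here purely for illustration (the slides themselves use Stata), and the name `predict_price` is hypothetical. The coefficients are the ones shown on the slide, with price measured in thousands of dollars.

```python
# Coefficients of the fitted line from the slide (price in thousands of dollars).
INTERCEPT = 153.18
SLOPE = 0.19537  # unrounded slope used in the slide's calculation


def predict_price(sqft):
    """Return the fitted price (in thousands) for a house of `sqft` square feet."""
    return INTERCEPT + SLOPE * sqft


price_2000 = predict_price(2000)
print(round(price_2000, 2))  # 543.92
```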
ANOTHER GRAPHING COMMAND (AAPLOT)
• ssc install aaplot
• aaplot price sqft
LINEAR REGRESSION MODEL
Y_i = β_0 + β_1 X_i + u_i, i = 1, …, n
• We have n observations, (X_i, Y_i), i = 1, …, n.
• X is the independent variable or regressor or explanatory variable
• Y is the dependent variable
• β_0 = intercept
• β_1 = slope, coefficient
• u_i = the regression error or disturbance term
• The model is not a deterministic equation: the error term leaves room for
observations to deviate from the line
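To make the roles of the intercept, the slope, and the error term concrete, here is a minimal sketch that fits a line by ordinary least squares using the standard closed-form formulas (slope = sum of cross-deviations divided by sum of squared X deviations; intercept = mean of Y minus slope times mean of X). The data below are made up for illustration and are not the course's real estate dataset; the function name `ols_fit` is hypothetical.

```python
# Made-up illustrative data, NOT the course's real estate dataset.
sqft = [1000, 1500, 2000, 2500, 3000]   # X: square feet
price = [350, 450, 540, 650, 730]       # Y: price in thousands


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of X.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


b0, b1 = ols_fit(sqft, price)

# Residuals: the part of each observed price the fitted line does not explain.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(sqft, price)]
print(b0, b1)
print(residuals)
```

In this toy example no observation lies exactly on the line, yet the residuals sum to zero, which is a property of a least-squares fit that includes an intercept.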
EXAMPLE
• Name the components of the following linear
regression model
price_i = β_0 + β_1 (square feet)_i + e_i
FITTED EQUATION AND SLOPE INTERPRETATION
• price = 153.18 + 0.195 * square feet
• On average, a one unit (what is it?) increase in house size is associated
with about a 0.195 unit (what is it?) increase in its price
• If 50 square feet of living space are added to a house, on average its price
will increase by about _______
• If 100 square feet of living space are added to a house, on average its price
will increase by about ______
INTERPRETING BETA FOR THE CHANGE IN X AND AT A
POINT
• price = 153.18 + 0.195 * square feet
• Change in X:
• If square feet (X) changes by 10, by how much will the price change on average?
• At a point:
• If the size of a house (X) is 1,500 square feet, what is its predicted price?
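The two kinds of question on this slide use the fitted equation differently: a change in X involves only the slope, while a prediction at a point uses both the intercept and the slope. A minimal Python sketch follows (the function names are hypothetical; the coefficients are the rounded ones from the slide, with price in thousands of dollars).

```python
# Rounded coefficients from the slide (price in thousands of dollars).
INTERCEPT = 153.18
SLOPE = 0.195


def price_change(delta_sqft):
    """Average change in price for a change of `delta_sqft` square feet."""
    return SLOPE * delta_sqft  # the intercept cancels out of a difference


def price_at(sqft):
    """Predicted price at a given house size."""
    return INTERCEPT + SLOPE * sqft


change_10 = price_change(10)   # about 1.95, i.e. roughly $1,950
at_1500 = price_at(1500)       # about 445.68, i.e. roughly $445,680
print(round(change_10, 2), round(at_1500, 2))
```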
MORE EXAMPLES
• Test score = 520.4 − 5.82 * class size
• What is the regression’s prediction for a classroom of 20 students?
• What is the regression’s prediction for the change in the classroom average
test score if the classroom size increases by 4?
• Weight = −99.41 + 3.94 * Height
• What is the regression’s weight prediction for someone who is 70 inches tall?
(Weight is measured in pounds; height is measured in inches.)
• If someone has a growth spurt of 1.5 inches over the course of a year, what is
the regression’s prediction for the increase in this person’s weight?
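The same two calculations apply to any fitted line. As a sketch, the helper functions below (hypothetical names, written in Python for illustration) take the intercept and slope as arguments and are applied to the two equations on this slide.

```python
def predict(intercept, slope, x):
    """Prediction at a point: plug x into the fitted equation."""
    return intercept + slope * x


def predicted_change(slope, delta_x):
    """Predicted change in Y when X changes by delta_x."""
    return slope * delta_x


# Test score = 520.4 - 5.82 * class size
score_20 = predict(520.4, -5.82, 20)            # class of 20 students
score_change_4 = predicted_change(-5.82, 4)     # class size up by 4

# Weight = -99.41 + 3.94 * Height (pounds, inches)
weight_70 = predict(-99.41, 3.94, 70)           # someone 70 inches tall
weight_change = predicted_change(3.94, 1.5)     # growth spurt of 1.5 inches

print(round(score_20, 2), round(score_change_4, 2))
print(round(weight_70, 2), round(weight_change, 2))
```

Note the sign of the class-size result: because the slope is negative, a larger class is associated with a lower predicted score.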
CORRELATION DOES NOT IMPLY CAUSATION
• Proper language: is associated/correlated with, suggests
*Check out the book “Spurious Correlations” by Tyler Vigen
https://www.tylervigen.com/spurious-correlations
THE RELATIONSHIP BETWEEN PRICE AND SQUARE FEET OF
A HOUSE
MORE EXAMPLES
• Regress the price on the following variables, write out
the population regression equation, fitted equation, and
interpret the results:
• Number of bedrooms
• Number of bathrooms
• Age
• Year built
EVEN MORE EXAMPLES. DIY TIME.
• Use California test score dataset
• Regress test score on the following variables, write out
the population regression equation, the fitted equation,
and interpret the results.
• Class size
• Percent of ESL students
• Computers per student
• Percent of students qualified for free lunches
MEANINGFUL INTERCEPT
• Pay attention to the following:
- Intuitively, can the X variable take the value zero?
- Is the zero value of the X variable in the sample?
Example: In the following two regressions, is the intercept
meaningful?
test score_i = β_0 + β_1 (percent ESL students)_i + u_i
test score_i = β_0 + β_1 (average income)_i + u_i
MEANINGFUL INTERCEPT
• Is the intercept meaningful in the following two
regressions?
test score_i = β_0 + β_1 (class size)_i + u_i
test score_i = β_0 + β_1 (computers per student)_i + u_i
MEANINGFUL SLOPE COEFFICIENT. INTERPRETING THE
MAGNITUDE
Is an increase in X associated with a large change in Y? If so, the
coefficient is economically meaningful (large).
• We can’t simply compare coefficients from two different regressions, since
each variable has a different standard deviation. We want to find out by how
much Y changes when X changes by one standard deviation.
• Steps to find out the magnitude:
1. Find the standard deviation of X
2. Multiply the standard deviation of X by the coefficient
3. Compare the product from (2) to the standard deviation of Y (often by
dividing it by the standard deviation of Y)
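The three steps can be sketched as a single calculation. The Python function below is a hypothetical helper; the numbers plugged in (coefficients and standard deviations for class size, percent of ESL students, and test scores) are the ones quoted in the editor's notes for this deck.

```python
def effect_in_sd_units(coef, sd_x, sd_y):
    """Change in Y, measured in standard deviations of Y,
    associated with a one standard deviation increase in X."""
    change_in_y = coef * sd_x   # step 2: coefficient times sd of X
    return change_in_y / sd_y   # step 3: compare to sd of Y


SD_TEST_SCORE = 19.05  # standard deviation of test scores (editor's notes)

# Class size: coefficient -2.28, standard deviation 1.89
class_size_effect = effect_in_sd_units(-2.28, 1.89, SD_TEST_SCORE)

# Percent of ESL students: coefficient -0.67, standard deviation 18.28
esl_effect = effect_in_sd_units(-0.67, 18.28, SD_TEST_SCORE)

print(round(class_size_effect, 2), round(esl_effect, 2))
```

Although the class-size coefficient is larger in absolute value, the ESL effect is the larger one once both are expressed in standard deviations of the test score, which is why raw coefficients from different regressions should not be compared directly.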
INTERPRETING THE MAGNITUDE. EXAMPLES
• Is the effect of class size on test scores large?
• Is the effect of the percent of ESL students on test scores large?
• If we compare the coefficients from the two regressions, what
conclusion might we arrive at?
• If class size increases by 1 standard deviation, by how much will the test
scores change?
• If the percent of ESL students increases by 1 standard deviation, by how
much will the test scores change?
• How do these changes compare to the standard deviation of test
scores?
INTERPRETING THE MAGNITUDE.
• Is the effect of the independent variable on the dependent variable
economically meaningful in the following two regressions?
• With a one standard deviation change in each of the independent
variables, by how much will the dependent variable change in
terms of its standard deviation?
test score_i = β_0 + β_1 (percent of students qualified for free lunches)_i + u_i
test score_i = β_0 + β_1 (computers per student)_i + u_i
REVIEW
• Please name the components of a linear regression (based on an example of
your own choosing)
• Why do we need to have an error term in the regression equation?
• What is a fitted equation? (What is the difference between the fitted equation
and the population equation)? Please give examples.
• How do you interpret the results of a regression at a point and based on a
change in X?
• What is the Stata command to run a regression?
• What does a meaningful intercept mean? Please give an example
• How do we interpret the magnitude of the coefficient and decide if it is
economically meaningful?


Editor's Notes

  1. In general terms, before we know the equation of the line, we write: price = b0 + b1*sqft. However, very few of our observations will follow this formula exactly. For example (open Excel), the first house has 2,400 square feet and its price is 300. If I calculate what its price should be using the equation of the line, I get 617.748, which differs from its actual price of 300. This means that the equation of the line is not perfect in describing the relationship between price and sqft. There is an error most of the time. In this first case the error is equal to 317.748, so we write our equation as price = b0 + b1*sqft + error. On average the error term will be 0. Note that the equation of the line was added to the Stata graph with the help of the “aaplot” command, which has to be installed separately.
  2. We add a disturbance term to our model and write down its general form. What is n in our sample on Seattle real estate? Stata: count (N = 420). Deterministic equation: a model that defines an exact relationship between variables, with no room for error. Before, when we did not add any of the other factors that might affect test scores, we pretty much lumped them all together in the error term.
  3. The price will change by $1,950. The predicted price of a house this size is $445,680.
  4. Are the questions asking to interpret a change or “at a point”? Answers: 404 points, -23.28 points; 176.39 pounds, 5.91 pounds.
  5. OLS regression does not show which way causation goes. We were the ones to decide that test scores go on the left side of the equation and the student-teacher ratio goes on the right side. Could it be that districts with higher scores also have lower student-teacher ratios? Can you think of a reason why? How do we show causation? Through economic theory, econometric techniques (including experiments), common sense, and intuition.
  6. Price = 153.18+0.195*square feet
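  7. The coefficients are: -2.28 (class size); -0.671 (percent of ESL students); 79.4 (computers per student); -0.61 (percent of students qualified for free lunches)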
  8. No in the first one, yes in the second one
  9. The coefficient on class size (STR) is -2.28, while the coefficient on the percent of ESL students is -0.67. At this point it seems that the effect of class size is larger. The standard deviation of class size is 1.89; the standard deviation of the percent of ESL students is 18.28. -2.28*1.89 = -4.3092 (class size); -0.67*18.28 = -12.25 (percent of ESL students). The standard deviation of test scores is 19.05. With a one standard deviation increase in class size, the average decrease in test scores is about 23% of its standard deviation. With a one standard deviation increase in el_pct, the average decrease in test scores is about 64% of its standard deviation.
  10. They probably both have a meaningful effect. Meal_pct: -0.61*27 = -16.47; 16.47/19.05 = 86%. Comp_stu: 0.065*79.4 = 5.161; 5.161/19.05 = 27%.