MULTIPLE REGRESSION
ECON 355 – Regression Analysis
SOME LOGISTICS
• Verify current directory path
• In Stata, type pwd to show the current directory.
• Use cd "path" to change the directory
NEW STATA FUNCTIONS AND OPTIONS
• Preserve/restore – lets you save the sample you are working with before you make any
changes to the data, and go back to it afterwards
• Drop/keep – lets you drop or keep certain observations/variables
• Example:
• Work with realestate dataset
• preserve
• drop age
• restore
NEW STATA FUNCTIONS AND OPTIONS CONT’D
• Another example:
• preserve
• hist price
• keep if age > 100
• hist price
• restore
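A minimal sketch of the preserve/restore pattern from the two examples above. So that it runs without the course files, it assumes Stata's built-in auto dataset as a stand-in for realestate, with mpg playing the role of age; the cutoff of 25 is arbitrary. Stata commands are case-sensitive, so preserve and restore must be typed in lowercase.

```stata
* Stand-in data: the built-in auto dataset replaces realestate (an assumption)
sysuse auto, clear

* Take a snapshot of the data currently in memory
preserve

* Distribution of price in the full sample
histogram price

* Keep only a subsample, then look at price again
keep if mpg > 25
histogram price

* Bring back the full sample
restore

* The observation count should match the original dataset again
count
```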
HEDONIC PRICING
• We are going to discuss how the size of the house affects the relationship between its
price and its age
• What is the relationship between the price of the house and its age in general?
• Are all the houses in our sample the same size? Let’s look at its descriptive statistics and
histogram.
• We are going to divide our data sample into 5 groups depending on the size of the
house (under 1000 sqft, 1000-2000 sqft, 2000-3000 sqft, 3000-4000 sqft, 4000-5000
sqft) and see if the relationship between price and age changes for any of these groups.
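The group-by-group comparison described above could be sketched as follows. This is an illustration under assumptions: the built-in auto dataset stands in for realestate, weight plays the role of house size, mpg plays the role of age, and the three cut points are invented for the stand-in data rather than taken from the slides.

```stata
* Stand-in data: built-in auto dataset (an assumption, not the course data)
sysuse auto, clear

* Descriptive statistics and histogram of the "size" variable
summarize weight, detail
histogram weight

* Assign each observation to a size group (cut points are illustrative)
generate sizegroup = 1 if weight < 2500
replace  sizegroup = 2 if weight >= 2500 & weight < 3500
replace  sizegroup = 3 if weight >= 3500 & !missing(weight)

* Run the same simple regression separately within each group
bysort sizegroup: regress price mpg
```

With the course data, the same pattern would use price, age, and sqft with the five size ranges listed on the slide.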
THE SIZE OF THE HOUSE MATTERS!
• Not only is the size of the house related to its price, but it is also most likely related to its
age. Intuitively, why do you think the size and the age of a house might be related?
• In general, we will want to include in the regression everything that possibly affects Y
and is correlated with X
• Do you think the number of bedrooms and bathrooms can also be related to the age of
the house and potentially affect its price?
• If so, we should probably include them in the regression too. What is the relationship
between the price of the house and its age now?
IN GENERAL
• Population regression will now look like this
• $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \dots + \beta_{k-1} X_{k-1,i} + \beta_k X_{ki} + u_i$
• The interpretation of the betas changes slightly. Since more than one independent
variable is included, the beta on one of them is interpreted with the others held
constant.
• i.e. $\beta_1 = \Delta Y / \Delta X_1$, holding everything else constant (ceteris paribus).
• With a one-unit change in $X_1$, $Y$ changes by $\beta_1$, holding everything else constant
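To see the "holding everything else constant" interpretation in practice, the sketch below compares the beta on one regressor with and without a control. It assumes the built-in auto dataset as a stand-in (mpg as the variable of interest, weight as the control); these names are not from the course data.

```stata
* Stand-in data: built-in auto dataset (an assumption, not the course data)
sysuse auto, clear

* Simple regression: the beta on mpg also picks up anything correlated with mpg
regress price mpg
display "beta on mpg, no controls: " _b[mpg]

* Multiple regression: the beta on mpg is now the change in price for a
* one-unit change in mpg, holding weight constant
regress price mpg weight
display "beta on mpg, holding weight constant: " _b[mpg]
```

After regress, _b[varname] holds the estimated coefficient, so the two display lines make the change in the beta easy to compare.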
BACK TO THE HEDONIC PRICING EXAMPLE
• Before we interpret the betas in our multiple regression, let’s figure out the measurement
units for each variable
• Price, beds, baths, age, sqft
• What does the population regression look like when we regress the price of a house on its
age, square feet, number of beds, and number of baths?
• What does the fitted regression look like?
• Please interpret each of the betas except the constant.
STATA – CREATING TABLES
• To be able to compare the results of different regressions with ease we usually create tables.
• You can see an example of a table on Blackboard. We are going to try to replicate the table
in Stata
• ssc install outreg2
• Each regression has to be added to the table separately
• Stata command: outreg2 using tablename.doc
• Every new column has to be added to the already existing table
• Stata: outreg2 using tablename.doc
• To start the document with the same name over:
• Stata: outreg2 using tablename.doc, replace
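The table-building steps above can be sketched end to end. This assumes outreg2 is already installed (ssc install outreg2) and uses the built-in auto dataset as a stand-in for the course data; the replace and append options are spelled out explicitly so each step's intent is visible, and the column titles are illustrative.

```stata
* Stand-in data: built-in auto dataset (an assumption, not the course data)
sysuse auto, clear

* Column (1): start a fresh table, overwriting any existing file of that name
regress price mpg
outreg2 using tablename.doc, replace ctitle(Model 1)

* Column (2): add a new column to the table that already exists
regress price mpg weight
outreg2 using tablename.doc, append ctitle(Model 2)

* Column (3): add one more column
regress price mpg weight length
outreg2 using tablename.doc, append ctitle(Model 3)
```

The file tablename.doc is written to the current working directory, which is why checking pwd first matters.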
DIY TIME
• Please run the following regressions and create a table with the results of the regression
• Regress price of a house on its age
• Regress price of a house on its age and size
• Regress price of a house on its age, size, number of bedrooms and number of
bathrooms
• Please make sure the table looks clean and professional.
T-TEST IN A MULTIPLE REGRESSION
• The significance tests do not change between single and multiple regressions
• Coefficients are still significant at
• 1% if |t-stat| > 2.58 (p-value < 0.01)
• 5% if |t-stat| > 1.96 (p-value < 0.05)
• 10% if |t-stat| > 1.65 (p-value < 0.10)
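A short sketch of where these numbers come from, assuming the built-in auto dataset as a stand-in for the course data. It rebuilds the t-statistic and two-sided p-value for one coefficient from Stata's stored results, then prints the large-sample critical values behind the 1%, 5%, and 10% thresholds (about 2.58, 1.96, and 1.65).

```stata
* Stand-in data: built-in auto dataset (an assumption, not the course data)
sysuse auto, clear
regress price mpg weight

* t-statistic for mpg: coefficient divided by its standard error
scalar tstat = _b[mpg] / _se[mpg]
display "t-stat  = " tstat

* Two-sided p-value, using the residual degrees of freedom of the regression
display "p-value = " 2 * ttail(e(df_r), abs(tstat))

* Two-sided critical values from the standard normal distribution
* 1% level
display invnormal(0.995)
* 5% level
display invnormal(0.975)
* 10% level
display invnormal(0.950)
```

The t-statistic and p-value computed this way should match the t and P>|t| columns in the regress output.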
MORE DIY TIME
• Please use the caschool dataset
• Please run four regressions of test score on class size: (1) with no controls; (2) controlling
for total enrollment; (3) controlling for expenditure per student and average income;
(4) controlling for average income and computers per student
• We will not edit the table in the Word file; instead, we will look at the regression results in
Stata
• Please interpret one of the betas in your regressions
IMPERFECT MULTICOLLINEARITY
• If we include variables in a regression that are closely related to one another, the betas
on them may become statistically insignificant (because their standard errors increase)
• Example:
• regress test scores on calworks percentage
• then regress test scores on percent qualifying for reduced-price lunch,
• then regress test scores on percent qualifying for reduced-price lunch and percent qualifying for
calworks.
• What happens to the significance of the betas?
• Sometimes a combination of several variables may be correlated with a variable already
included, and we may never know.
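The three-regression comparison above can be sketched with any pair of closely related regressors. The example below assumes the built-in auto dataset as a stand-in, with weight and length playing the roles of the two lunch/CalWORKs percentages; estat vif is added as one common diagnostic and is not part of the original slides.

```stata
* Stand-in data: built-in auto dataset (an assumption, not the course data)
sysuse auto, clear

* How closely related are the two regressors?
correlate weight length

* Each regressor on its own
regress price weight
regress price length

* Both together: compare the standard errors and t-stats with the runs above
regress price weight length

* Variance inflation factors for the last regression
estat vif
```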
IMPERFECT MULTICOLLINEARITY: WHAT TO DO AND WHAT NOT TO DO
• Do not run kitchen-sink regressions
• Concentrate on a variable of interest, the rest should be “controlled for”. Be deliberate
about the variables you add to the regression. Start with a baseline regression and then
add more one by one or by group.
• If multicollinearity exists in your results (and it most likely does), you are erring on the
conservative side: inflated standard errors make you more likely to miss a relationship
that exists than to claim one that does not.
• Example: if we want to test the relationship between test scores and average income
what other variables should we control for?
PERFECT MULTICOLLINEARITY
• Happens when one regressor is an exact linear function of the other regressors (for
example, two regressors that are perfectly correlated)
• Use teaching ratings data set
• Create a variable equal to 1 if the professor is male
• Stata:
• generate male=0
• replace male=1 if female==0
• Regress course evaluations on the male and female dummy variables in the same
regression
• What happens? Why do you think it happens?
• This is called the dummy variable trap: we have included a dummy for every category.
Stata corrects for it by dropping one of the collinear variables; other software may not.
Remember to always omit one category and compare the betas on the included categories
to the omitted category
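A minimal sketch of the dummy variable trap, assuming the built-in auto dataset as a stand-in for the teaching ratings data. Its existing dummy foreign plays the role of female, and the generated variable domestic (a name chosen here for illustration) plays the role of male.

```stata
* Stand-in data: built-in auto dataset (an assumption, not the course data)
sysuse auto, clear

* Build the complementary dummy, mirroring the generate/replace steps above
generate domestic = 0
replace  domestic = 1 if foreign == 0

* Both dummies plus the constant: domestic + foreign = 1 for every observation,
* so Stata omits one of them and notes the collinearity in the output
regress price domestic foreign

* The usual fix: include only one dummy; its beta is the difference in the
* mean of price relative to the omitted category
regress price foreign
```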
PERFECT MULTICOLLINEARITY EXAMPLE
• Use binarydata dataset
• There are a couple of ways to create dummy variables in Stata and include them in a
regression
• The variable “ethnicity” takes three possible values in this dataset: “black”, “Hispanic”,
“white”. We can create a dummy variable for each; for example, the Hispanic dummy is
equal to 1 if a person is Hispanic, and 0 otherwise.
• Stata:
• tabulate ethnicity, generate(e)
• Let’s look at the three variables Stata created
• Now regress earnings on the three variables that control for ethnicity. What happens?
Why?
• Please interpret the coefficients in the above regression.
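The two ways of creating and including dummy variables might look like the sketch below. It assumes the built-in auto dataset as a stand-in for binarydata, with the five-category variable rep78 playing the role of ethnicity and the stub r chosen for illustration; the factor-variable form i.rep78 is a standard Stata alternative that is not shown on the slides.

```stata
* Stand-in data: built-in auto dataset (an assumption, not the course data)
sysuse auto, clear

* Way 1: one dummy per category, named r1, r2, ... after the stub r
tabulate rep78, generate(r)
summarize r1-r5

* Including every dummy plus the constant: Stata omits one for collinearity
regress price r1 r2 r3 r4 r5

* Omitting one category by hand: each beta is relative to the r1 group
regress price r2 r3 r4 r5

* Way 2: factor-variable notation builds the dummies and omits a base category
regress price i.rep78
```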
PERFECT MULTICOLLINEARITY EXAMPLE, DIY.
1. Please create a set of dummy variables for the variable hsdropout, i.e. a
variable equal to 1 for those who dropped out of high school and 0 otherwise, and a
variable equal to 1 for those who did not drop out of high school and 0 otherwise
• Now regress EARNINGS on one of the dummy variables. Please interpret the results of
the regression.
2. Please create a set of dummy variables for the variable relationship status
• Now regress EARNINGS on the group of dummy variables, omitting one of them.
• Please interpret the results of the regression (use Slido to pick the correct answer)
MULTIPLE REGRESSION, DIY TIME.
• Please use the EAEF22 dataset to show the relationship between one’s earnings and amount
of schooling, while controlling for other variables.
• Please run a few regressions to determine which empirical model explains the
relationship between earnings and schooling best.
• Use the knowledge you have gained in this topic to decide which variables to include
in your regressions.
• Please interpret the relationship that you found.
REVIEW
• Why do we need to include more than one regressor in a regression?
• How is a t-test conducted in a multiple regression?
• How do you create a table with the results of a regression in Stata?
• What is imperfect multicollinearity? Is it a problem? Should we try to avoid it?
• What is perfect multicollinearity? Is it a problem? Should we try to avoid it?
• How do you interpret results of a regression with a set of dummy variables?

Topic 5 (multiple regression)


Editor's Notes

  1. In general, if we regress price on age, we will find that there is no relationship. The relationship between price and age of the house by size group is the following:
     Under 1000: 2.06
     1000-2000: 2.3***
     2000-3000: 3.6***
     3000-4000: 2.18
     4000-5000: 1.52
     Show how to use “reg y x” for the first two groups, then divide the class into three groups and ask them to do it for the last three groups
  2. Why do you think there are no subscripts “i” on betas?
  3. Price – thousands of dollars, beds – number, baths – number, age – years, sqft – square feet