MULTIPLE REGRESSION
ECON 355 – Regression Analysis
SOME LOGISTICS
• Verify current directory path
• In Stata, type pwd to show the current directory.
• Use cd "path" to change the directory
NEW STATA FUNCTIONS AND OPTIONS
• preserve/restore – preserve saves a snapshot of the data you are working with before you
make any changes; restore brings that snapshot back
• drop/keep – lets you drop or keep certain observations/variables
• Example:
• Work with realestate dataset
• preserve
• drop age
• restore
NEW STATA FUNCTIONS AND OPTIONS CONT’D
• Another example:
• preserve
• hist price
• keep if age > 100
• hist price
• restore
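The two examples above can be tried without the course files. This is a minimal sketch that assumes Stata's built-in auto dataset as a stand-in for the realestate data, with mpg playing the role of age; the cutoff of 25 is arbitrary and chosen only for illustration.

```stata
* Stand-in data (assumption): the built-in auto dataset, not realestate
sysuse auto, clear

* Example 1: drop a variable temporarily
* preserve takes a snapshot of the data in memory
preserve
drop mpg
* mpg is gone for now
describe, short
* restore brings the snapshot back, mpg included
restore

* Example 2: restrict the sample temporarily
preserve
* histogram of price for the full sample
histogram price
* keep only a subsample (arbitrary cutoff)
keep if mpg > 25
* same histogram, restricted sample
histogram price
restore
* the observation count is back to the full sample
count
```

Stata command names are case-sensitive, so preserve and restore must be typed in lowercase.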
HEDONIC PRICING
• We are going to discuss how the size of the house affects the relationship between its
price and its age
• What is the relationship between the price of the house and its age in general?
• Are all the houses in our sample the same size? Let’s look at the descriptive statistics and
histogram of house size.
• We are going to divide our data sample into 5 groups depending on the size of the
house (under 1000 sqft, 1000-2000 sqft, 2000-3000 sqft, 3000-4000 sqft, 4000-5000
sqft) and see if the relationship between price and age changes for any of these groups.
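One possible way to do the grouping is sketched below. It assumes the built-in auto dataset as a stand-in: weight plays the role of sqft and mpg the role of age, and the variable name sizegroup is made up for this illustration. With the realestate data, bins matching the slide would presumably use at(0(1000)5000) on the square-footage variable.

```stata
* Stand-in data (assumption): auto instead of realestate
sysuse auto, clear

* Descriptive statistics and histogram of the "size" variable
summarize weight, detail
histogram weight

* Bins of width 1000: [1000,2000), [2000,3000), [3000,4000), [4000,5000)
* egen's cut() stores the left end of each bin; values outside the range become missing
egen sizegroup = cut(weight), at(1000(1000)5000)
tabulate sizegroup

* The relationship in the whole sample, then separately within each size group
regress price mpg
bysort sizegroup: regress price mpg
```

Comparing the slope on mpg across groups with the slope in the pooled regression shows whether the relationship changes once size is held roughly fixed.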
THE SIZE OF THE HOUSE MATTERS!
• The size of the house is not only related to its price but most likely also related to its
age. Intuitively, why do you think the size and the age of a house might be related?
• In general, we will want to include in the regression everything that possibly affects Y
and is correlated with X
• Do you think the number of bedrooms and bathrooms can also be related to the age of
the house and potentially affect its price?
• If so, we should probably include them in the regression too. What is the relationship
between the price of the house and its age now?
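A short sketch of the same idea, assuming the built-in auto dataset as a stand-in for the realestate data (mpg in place of age, weight in place of size; length and headroom are arbitrary extra controls):

```stata
* Stand-in data (assumption): auto instead of realestate
sysuse auto, clear

* Baseline: one regressor only
regress price mpg

* Is the candidate control correlated with the regressor of interest?
correlate mpg weight

* Add the control and see how the coefficient on mpg changes
regress price mpg weight

* Add further controls
regress price mpg weight length headroom
```

If the coefficient on mpg moves noticeably between the first and second regression, the baseline estimate was picking up part of the effect of the omitted variable.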
IN GENERAL
• The population regression will now look like this:
• $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i$
• The interpretation of the betas changes slightly. Since more than one independent
variable is included, when interpreting the beta on one of them, the others are held
constant.
• i.e. $\beta_1 = \frac{\Delta Y}{\Delta X_1}$, holding everything else constant (ceteris paribus).
• With a one-unit change in $X_1$, $Y$ will change by $\beta_1$, holding everything else constant
BACK TO THE HEDONIC PRICING EXAMPLE
• Before we interpret the betas in our multiple regression, let’s figure out the measurement
units for each variable
• Price, beds, baths, age, sqft
• What does the population regression look like when we regress the price of a house on its
age, square feet, number of bedrooms, and number of bathrooms?
• What does the fitted regression look like?
• Please interpret each of the betas except the constant.
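One way to write the two regressions, using the variable names listed on this slide (the exact names and units in the dataset are an assumption):

```latex
\begin{align*}
\text{Population:}\quad & \mathit{price}_i = \beta_0 + \beta_1\,\mathit{age}_i + \beta_2\,\mathit{sqft}_i
    + \beta_3\,\mathit{beds}_i + \beta_4\,\mathit{baths}_i + u_i \\
\text{Fitted:}\quad & \widehat{\mathit{price}}_i = \hat{\beta}_0 + \hat{\beta}_1\,\mathit{age}_i + \hat{\beta}_2\,\mathit{sqft}_i
    + \hat{\beta}_3\,\mathit{beds}_i + \hat{\beta}_4\,\mathit{baths}_i
\end{align*}
```

Here $\hat{\beta}_1$ would be read as the estimated change in price associated with one more unit of age, holding square feet, bedrooms, and bathrooms constant; the other betas are read the same way, each in the units of its own variable.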
STATA – CREATING TABLES
• To be able to compare the results of different regressions with ease, we usually create tables.
• You can see an example of a table on Blackboard. We are going to try to replicate the table in
Stata
• ssc install outreg2
• Each regression has to be added to the table separately
• Stata command: outreg2 using tablename.doc
• Every new column has to be added to the already existing table
• Stata: outreg2 using tablename.doc
• To start the document with the same name over:
• Stata: outreg2 using tablename.doc, replace
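The whole workflow can be sketched as follows, assuming the built-in auto dataset as a stand-in for the course data. The append option is written out explicitly here to make clear which calls add a column to the existing file.

```stata
* outreg2 is a user-written command; installing it needs internet access
ssc install outreg2

* Stand-in data (assumption): auto instead of the course dataset
sysuse auto, clear

* Column 1: replace starts the file over
regress price mpg
outreg2 using tablename.doc, replace

* Column 2: append adds a column to the existing file
regress price mpg weight
outreg2 using tablename.doc, append

* Column 3
regress price mpg weight length
outreg2 using tablename.doc, append
```

The file tablename.doc is written to the current working directory, which is why checking pwd first is useful.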
DIY TIME
• Please run the following regressions and create a table with the results of the regression
• Regress price of a house on its age
• Regress price of a house on its age and size
• Regress price of a house on its age, size, number of bedrooms and number of
bathrooms
• Please make sure the table looks clean and professional.
T-TEST IN A MULTIPLE REGRESSION
• The significance tests do not change between single and multiple regressions
• Coefficients are still significant at (using large-sample critical values)
• 1% if |t-stat| > 2.58, i.e. p-value < 0.01
• 5% if |t-stat| > 1.96, i.e. p-value < 0.05
• 10% if |t-stat| > 1.645, i.e. p-value < 0.1
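The t-statistic and p-value that Stata reports can be reproduced by hand after any regression. This is a minimal sketch, again assuming the built-in auto dataset as a stand-in; the scalar names t_mpg and p_mpg are made up for the illustration.

```stata
* Stand-in data (assumption): auto instead of the course dataset
sysuse auto, clear
regress price mpg weight

* t-statistic for mpg: coefficient divided by its standard error
scalar t_mpg = _b[mpg] / _se[mpg]

* Two-sided p-value from the t distribution with e(df_r) residual degrees of freedom
scalar p_mpg = 2 * ttail(e(df_r), abs(t_mpg))
display "t = " t_mpg "   p = " p_mpg

* Large-sample (normal) critical values for the 10%, 5% and 1% levels
display invnormal(0.95) "  " invnormal(0.975) "  " invnormal(0.995)
```

The last line displays approximately 1.645, 1.960 and 2.576, the cutoffs listed above. In small samples the exact t critical values are somewhat larger.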
MORE DIY TIME
• Please use the caschool dataset
• Please run four regressions of test score on class size: (1) with no controls; (2) controlling
for total enrollment; (3) controlling for expenditure per student and average income;
(4) controlling for average income and computers per student
• We will not edit the table in the Word file; instead, we will look at the regression results in
Stata
• Please interpret one of the betas in your regressions
IMPERFECT MULTICOLLINEARITY
• If we include variables in a regression that are closely related to one another, the standard
errors on their betas will increase, so the betas can become statistically insignificant
• Example:
• regress test scores on calworks percentage
• then regress test scores on percent qualifying for reduced-price lunch,
• then regress test scores on percent qualifying for reduced-price lunch and percent qualifying for
calworks.
• What happens to the significance of the betas?
• Sometimes a combination of several variables is correlated with a variable already
included, and we may never notice.
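The same pattern can be explored with any pair of closely related regressors. The sketch below assumes the built-in auto dataset as a stand-in for caschool, with weight and length as the two related variables.

```stata
* Stand-in data (assumption): auto instead of caschool
sysuse auto, clear

* How closely related are the two regressors?
correlate weight length

* Each regressor on its own
regress price weight
regress price length

* Both together: compare the standard errors and t-statistics with the runs above
regress price weight length

* Variance inflation factors for the last regression
estat vif
```

A large variance inflation factor signals that a regressor is well explained by the other regressors, which is what inflates its standard error.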
IMPERFECT MULTICOLLINEARITY: WHAT TO DO AND WHAT
NOT TO DO
• Do not run kitchen-sink regressions
• Concentrate on a variable of interest, the rest should be “controlled for”. Be deliberate
about the variables you add to the regression. Start with a baseline regression and then
add more one by one or by group.
• If multicollinearity exists in your results (and it most likely does), you are erring on the
conservative side: the inflated standard errors make it harder to find significance. This means
you are unlikely to claim that a relationship exists when it does not; rather, you may fail to
detect one that does exist.
• Example: if we want to test the relationship between test scores and average income
what other variables should we control for?
PERFECT MULTICOLLINEARITY
• Happens when one regressor is an exact linear function of the other regressors
• Use the teaching ratings data set
• Create a variable equal to 1 if the professor is male
• Stata:
• generate male=0
• replace male=1 if female==0
• Regress course evaluations on the male and female dummy variables in the same
regression
• What happens? Why do you think it happens?
• This is called the dummy variable trap – we have included a dummy for each category, and
together they add up to the constant. Stata will correct for it by dropping one of them; other
software may not. Remember to always omit one category and compare the betas on the included
categories to the omitted category
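The trap can be reproduced with any binary variable. This sketch assumes the built-in auto dataset as a stand-in for the teaching ratings data: foreign plays the role of female, and the new variable domestic plays the role of male.

```stata
* Stand-in data (assumption): auto instead of the teaching ratings data
sysuse auto, clear

* foreign is 1 for foreign cars and 0 for domestic cars
generate domestic = 0
replace domestic = 1 if foreign == 0

* domestic + foreign equals 1 for every observation, the same as the constant,
* so Stata omits one of the two dummies and prints a collinearity note
regress price domestic foreign

* Keeping one dummy only: its beta is the difference relative to the omitted group
regress price foreign
```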
PERFECT MULTICOLLINEARITY
EXAMPLE
• Use binarydata dataset
• There are a couple of ways to create dummy variables in Stata and include them into a
regression
• The variable “ethnicity” takes three possible values in this dataset: “Black”, “Hispanic”, and
“white”. We can create a dummy variable for each category; for example, the Hispanic dummy
is equal to 1 if a person is Hispanic, and 0 otherwise.
• Stata:
• tabulate ethnicity, generate(e)
• Let’s look at the three variables Stata created
• Now regress earnings on the three variables that control for ethnicity. What happens?
Why?
• Please interpret the coefficients in the above regression.
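A minimal sketch of both ways of creating the dummies, assuming the built-in auto dataset as a stand-in for binarydata: the categorical variable rep78 (repair record, values 1 to 5) plays the role of ethnicity, and the stub r is arbitrary.

```stata
* Stand-in data (assumption): auto instead of binarydata
sysuse auto, clear

* Way 1: tabulate with generate() creates one dummy per category (r1 ... r5)
tabulate rep78, generate(r)
list rep78 r1-r5 in 1/5

* All dummies at once: they sum to 1, so Stata omits one of them
regress price r1 r2 r3 r4 r5

* Omitting one category by hand: r1 becomes the comparison group
regress price r2 r3 r4 r5

* Way 2: factor-variable notation builds the dummies on the fly
regress price i.rep78
```

In the last two regressions each beta is the difference in average price between that category and the omitted one.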
PERFECT MULTICOLLINEARITY
EXAMPLE, DIY.
1. Please create a set of dummy variables for the variable hsdropout, i.e. a
variable equal to 1 for those who dropped out of high school and 0 otherwise, and a
variable equal to 1 for those who did not drop out of high school and 0 otherwise
• Now regress EARNINGS on one of the dummy variables. Please interpret the results of
the regression.
2. Please create a set of dummy variables for the variable relationship
status
• Now regress EARNINGS on the group of dummy variables, omitting one of them.
• Please interpret the results of the regression
MULTIPLE REGRESSION, DIY TIME.
• Please use EAEF22 dataset to show the relationship between one’s earnings and amount
of schooling, while controlling for other variables.
• Please run a few regressions to determine which empirical model explains the
relationship between earnings and schooling best.
• Use the knowledge you have gained in this topic to decide which variables to include
in your regressions.
• Please interpret the relationship that you found.
REVIEW
• Why do we need to include more than one regressor in a regression?
• How is a t-test conducted in a multiple regression?
• How do you create a table with the results of a regression in Stata?
• What is imperfect multicollinearity? Is it a problem? Should we try to avoid it?
• What is perfect multicollinearity? Is it a problem? Should we try to avoid it?
• How do you interpret results of a regression with a set of dummy variables?

Topic 5 (multiple regression)

  • 1. MULTIPLE REGRESSION ECON 355 – Regression Analysis
  • 2. SOME LOGISTICS • Verify current directory path • In Stata type pwd to show current directory. • Use cd “path” to change directory
  • 3. NEW STATA FUNCTIONS AND OPTIONS • Preserve/restore – lets you preserve and go back to the sample you are working with before you make any changes with the data • Drop/keep – lets you keep drop/keep certain observations/variables • Example: • Work with realestate dataset • preserve • drop age • restore
  • 4. NEW STATA FUNCTIONS AND OPTIONS CONT’D • Another example: • Preserve • hist price • keep if age >100 • hist price • Restore
  • 5. HEDONIC PRICING • We are going to discuss how the size of the house affects the relationship between its price and its age • What is the relationship between the price of the house and its age in general? • Are all the houses in our sample the same size? Let’s look at its descriptive statistics and histogram. • We are going to divide our data sample into 5 groups depending on the size of the house (under 1000 sqft, 1000-2000 sqft, 2000-3000 sqft, 3000-4000 sqft, 4000-5000 sqft) and see if the relationship between price and age changes for any of these groups.
  • 6. THE SIZE OF THE HOUSE MATTERS! • Not only is the size of the house related to its price but also most likely related to its age. Intuitively, why do you think the size and the age of a house might be related? • In general, we will want to include in the regression everything that possibly affects Y and is correlated to X • Do you think the number of bedrooms and bathrooms can also be related to the age of the house and potentially affect its price? • If so, we should probably include them in the regression too. What is the relationship between the price of the house and its age now?
  • 7. IN GENERAL • Population regression will now look like this • 𝑌𝑖 = 𝛽0 + 𝛽1 𝑋1𝑖 + 𝛽2 𝑋2𝑖 + 𝛽3…𝑘−1 𝑋3 … 𝑘−1 𝑖 + 𝛽 𝑘 𝑋𝐾𝑖 + 𝑢𝑖 • The interpretation of betas slightly changes. Since there are more than one independent variable included, when interpreting the beta on one of them, the others are held constant. • i.e. 𝛽1 = ∆𝑌 ∆𝑋1 holding everything else constant (or ceteris paribus). • With one unit change in 𝑋1, 𝑌 will change by 𝛽1 holding everything else constant
  • 8. BACK TO THE HEDONIC PRICING EXAMPLE • Before we interpret the betas in our multiple regression lets figure out the measurement units for each variable • Price, beds, baths, age, sqft • What does the population regression look like when we regress price of a house on its age, square feet, number beds, and number baths? • What does the fitted regression look like? • Please interpret each of the betas except the constant.
  • 9. STATA – CREATING TABLES • To be able to compare the results of different regressions with ease we usually create tables. • You can see an example of a table on blackboard. We are going to try to replicate the table Stata • ssc install outreg2 • Each regression has to be added to the table separately • Stata command: outreg2 using tablename.doc • Every new column has to be added to the already existing table • Stata: outreg2 using tablename.doc • To start the document with the same name over: • Stata: outreg2 using tablename.doc, replace
  • 10. DIY TIME • Please run the following regressions and create a table with the results of the regression • Regress price of a house on its age • Regress price of a house on its age and size • Regress price of a house on its age, size, number of bedrooms and number of bathrooms • Please make sure the table looks clean and professional.
  • 11. T-TEST IN A MULTIPLE REGRESSION • The significance tests do not change between single and multiple regressions • Coefficients are still significant at • 1% if |t-stat| > 2.58 (p-value < 0.01) • 5% if |t-stat| > 1.96 (p-value < 0.05) • 10% if |t-stat| > 1.645 (p-value < 0.1)
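A minimal sketch of where the t-statistic and the critical values come from, again using Stata's shipped auto dataset as a stand-in for the course data (the regressors are chosen only for illustration):

```stata
* Stand-in data: the auto dataset shipped with Stata
sysuse auto, clear
regress price mpg weight

* t-statistic by hand: estimated coefficient divided by its standard error
display _b[weight] / _se[weight]

* Two-sided p-value from the t distribution with the residual degrees of freedom
display 2 * ttail(e(df_r), abs(_b[weight] / _se[weight]))

* The same hypothesis (coefficient on weight equals zero) via the test command;
* with a single restriction, the reported F statistic is the square of the t-statistic
test weight

* Large-sample two-sided critical values for the 1%, 5%, and 10% levels
display invnormal(0.995)
display invnormal(0.975)
display invnormal(0.95)
```

The last three lines return roughly 2.58, 1.96, and 1.645, the cutoffs used for the 1%, 5%, and 10% levels in large samples.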
  • 12. MORE DIY TIME • Please use the caschool dataset • Please run four regressions of test score on: (1) class size only; (2) class size, controlling for total enrollment; (3) class size, controlling for expenditure per student and average income; (4) class size, controlling for average income and computers per student • We will not edit the table in the Word file; instead, we will look at the regression results in Stata • Please interpret one of the betas in your regressions
  • 13. IMPERFECT MULTICOLLINEARITY • If we include variables in a regression that are closely related to one another, the betas on them may become statistically insignificant (because their standard errors will increase) • Example: • regress test scores on CalWORKs percentage, • then regress test scores on percent qualifying for reduced-price lunch, • then regress test scores on percent qualifying for reduced-price lunch and percent qualifying for CalWORKs. • What happens to the significance of the betas? • Sometimes a combination of several variables is correlated with a variable already included, and we may never know.
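The pattern in the example above (each regressor alone, then both together) can be sketched with Stata's shipped auto dataset as a stand-in for caschool. Here weight and length play the role of two closely related regressors; that pairing is an assumption chosen for illustration, not part of the course material.

```stata
* Stand-in data: the auto dataset shipped with Stata
sysuse auto, clear

* How closely related are the two regressors?
correlate weight length

* Each regressor on its own
regress price weight
regress price length

* Both together: compare the standard errors with the regressions above
regress price weight length

* Variance inflation factors for the last regression
estat vif
```

The thing to watch is the standard error on each coefficient, which one would expect to be larger in the joint regression than in the single-regressor ones.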
  • 14. IMPERFECT MULTICOLLINEARITY: WHAT TO DO AND WHAT NOT TO DO • Do not run kitchen-sink regressions • Concentrate on a variable of interest; the rest should be “controlled for”. Be deliberate about the variables you add to the regression. Start with a baseline regression and then add more variables one by one or by group. • If multicollinearity exists in your results (and it most likely does), you are erring on the conservative side. This means you are less likely to claim that a relationship exists when it does not, and more likely to miss one that does exist. • Example: if we want to test the relationship between test scores and average income, what other variables should we control for?
  • 15. PERFECT MULTICOLLINEARITY • Happens when your regressors are perfectly correlated • Use the teaching ratings dataset • Create a variable equal to 1 if the professor is male • Stata: • generate male=0 • replace male=1 if female==0 • Regress course evaluations on the male and female dummy variables in the same regression • What happens? Why do you think it happens? • This is called the dummy variable trap: we have included a dummy for each category. Stata will correct for it by dropping one of them; other software may not. Remember to always omit one category and compare the betas on the included categories to the omitted category
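The dummy variable trap can be reproduced without the teaching ratings data. The sketch below uses Stata's shipped auto dataset as a stand-in, with its foreign dummy playing the role of female and a newly created domestic dummy (a hypothetical name) playing the role of male.

```stata
* Stand-in data: the auto dataset shipped with Stata
sysuse auto, clear

* Build the complementary dummy, mirroring the male/female step on the slide
generate domestic = 0
replace domestic = 1 if foreign == 0

* Both dummies plus the constant: foreign + domestic equals 1 for every car,
* so the regressors are perfectly collinear and Stata omits one of them
regress price foreign domestic

* The usual fix: leave one category out and read the beta relative to it
regress price foreign
```

In the second regression, the coefficient on foreign is the difference in average price between foreign and domestic cars, with domestic as the omitted category.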
  • 16. PERFECT MULTICOLLINEARITY EXAMPLE • Use the binarydata dataset • There are a couple of ways to create dummy variables in Stata and include them in a regression • The variable “ethnicity” takes three possible values in this dataset: “Black”, “Hispanic”, “white”. We can create a dummy variable for each; for example, the Hispanic dummy will be equal to 1 if a person is Hispanic, and 0 otherwise. • Stata: • tabulate ethnicity, generate(e) • Let’s look at the three variables Stata created • Now regress earnings on the three variables that control for ethnicity. What happens? Why? • Please interpret the coefficients in the above regression.
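A sketch of the two ways to build and use a set of dummies, using Stata's shipped auto dataset as a stand-in for binarydata. The categorical variable rep78 (five repair-record categories) stands in for ethnicity, and the stub r is a hypothetical name.

```stata
* Stand-in data: the auto dataset shipped with Stata
sysuse auto, clear

* Way 1: tabulate with generate() creates one dummy per category (r1 ... r5)
tabulate rep78, generate(r)

* Including every dummy triggers the trap: Stata omits one because of collinearity
regress price r1 r2 r3 r4 r5

* Omitting r1 by hand makes category 1 the comparison group
regress price r2 r3 r4 r5

* Way 2: factor-variable notation builds the dummies on the fly and
* omits the lowest category as the base by default
regress price i.rep78
```

The last two regressions should report the same coefficients, each read as the difference in average price relative to the omitted category.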
  • 17. PERFECT MULTICOLLINEARITY EXAMPLE, DIY. 1. Please create a set of dummy variables for the following variable: hsdropout, i.e. a variable equal to 1 for those who dropped out of high school and 0 otherwise, and a variable equal to 1 for those who did not drop out of high school and 0 otherwise • Now regress EARNINGS on one of the dummy variables. Please interpret the results of the regression. 2. Please create a set of dummy variables for the variable relationship status • Now regress EARNINGS on the group of dummy variables, omitting one of them. • Please interpret the results of the regression
  • 18. MULTIPLE REGRESSION, DIY TIME. • Please use the EAEF22 dataset to show the relationship between one’s earnings and amount of schooling, while controlling for other variables. • Please run a few regressions to determine which empirical model best explains the relationship between earnings and schooling. • Use the knowledge you have gained in this topic to decide which variables to include in your regressions. • Please interpret the relationship that you found.
  • 19. REVIEW • Why do we need to include more than one regressor in a regression? • How is a t-test conducted in a multiple regression? • How do you create a table with the results of a regression in Stata? • What is imperfect multicollinearity? Is it a problem? Should we try to avoid it? • What is perfect multicollinearity? Is it a problem? Should we try to avoid it? • How do you interpret results of a regression with a set of dummy variables?

Editor's Notes

  1. In general, if we regress the price on age, we will find that there is no relationship. The relationship between price and age of the house by size group is the following: under 1000: 2.06; 1000-2000: 2.3***; 2000-3000: 3.6***; 3000-4000: 2.18; 4000-5000: 1.52. Show how to use “reg y x” for the first two groups, then divide the class into three groups and ask them to do it for the last three groups
  2. Why do you think there are no subscripts “i” on betas?
  3. Price – thousands of dollars, beds – number, baths – number, age – years, sqft – square feet