Essentials of Econometrics
Fifth Edition
I dedicate this book to Joan Gujarati, Diane Gujarati-Chesnut, Charles Chesnut, and
my grandchildren, "Tommy" and Laura Chesnut, and to my dear friend Karen Low.
Sara Miller McCune founded SAGE Publishing in 1965 to support
the dissemination of usable knowledge and educate a global
community. SAGE publishes more than 1000 journals and over
600 new books each year, spanning a wide range of subject areas.
Our growing selection of library products includes archives, data,
case studies and video. SAGE remains majority owned by our
founder and after her lifetime will become owned by a charitable
trust that secures the company’s continued independence.
Los Angeles | London | New Delhi | Singapore | Washington DC | Melbourne
Essentials of Econometrics
Fifth Edition
Damodar N. Gujarati
Professor Emeritus of Economics
The United States Military Academy at West Point
FOR INFORMATION:
SAGE Publications, Inc.
2455 Teller Road
Thousand Oaks, California 91320
E-mail: order@sagepub.com
SAGE Publications Ltd.
1 Oliver’s Yard
55 City Road
London EC1Y 1SP
United Kingdom
SAGE Publications India Pvt. Ltd.
B 1/I 1 Mohan Cooperative Industrial Area
Mathura Road, New Delhi 110 044
India
SAGE Publications Asia-Pacific Pte. Ltd.
18 Cross Street #10-10/11/12
China Square Central
Singapore 048423
Acquisitions Editor: Helen Salmon
Product Associate: Kenzie Offley
Production Editor: Rebecca Lee
Copy Editor: Gillian Dickens
Typesetter: C&M Digitals (P) Ltd.
Indexer: Integra
Cover Designer: Scott Van Atta
Marketing Manager: Victoria Velasquez
Copyright © 2022 by Damodar N. Gujarati
All rights reserved. Except as permitted by U.S. copyright law, no part
of this work may be reproduced or distributed in any form or by any
means, or stored in a database or retrieval system, without permission
in writing from the publisher.
All third-party trademarks referenced or depicted herein are included
solely for the purpose of illustration and are the property of their
respective owners. Reference to these trademarks in no way indicates
any relationship with, or endorsement by, the trademark owner.
Printed in the United States of America
Library of Congress Cataloging-in-Publication Data
Names: Gujarati, Damodar N., author.
Title: Essentials of econometrics / Damodar N. Gujarati,
The United States Military Academy at West Point.
Description: Fifth edition. | Thousand Oaks, California : SAGE, [2022] |
Includes bibliographical references and index.
Identifiers: LCCN 2021012627 | ISBN 978-1-0718-5039-8 (paperback ;
alk. paper) | ISBN 978-1-0718-5040-4 (epub) | ISBN 978-1-0718-5041-1
(epub) | ISBN 978-1-0718-5042-8 (pdf)
Subjects: LCSH: Econometrics. | Economics—Statistical methods.
Classification: LCC HB139 .G85 2022 | DDC 330.01/5195—dc23
LC record available at https://lccn.loc.gov/2021012627
This book is printed on acid-free paper.
21 22 23 24 25 10 9 8 7 6 5 4 3 2 1
BRIEF CONTENTS
Acknowledgments xvii
Preface xix
About the Author xxiii
Chapter 1 • The Nature and Scope of Econometrics 1
PART I • THE LINEAR REGRESSION MODEL 23
Chapter 2 • Basic Ideas of Linear Regression: The Two-Variable Model 25
Chapter 3 • The Two-Variable Model: Hypothesis Testing 61
Chapter 4 • Multiple Regression: Estimation and Hypothesis Testing 105
Chapter 5 • Functional Forms of Regression Models 147
Chapter 6 • Qualitative or Dummy Variable Regression Models 193
PART II • REGRESSION ANALYSIS IN PRACTICE 239
Chapter 7 • Model Selection: Criteria and Tests 241
Chapter 8 • Multicollinearity: What Happens If Explanatory
Variables Are Correlated? 279
Chapter 9 • Heteroscedasticity: What Happens If the Error Variance Is
Nonconstant? 313
Chapter 10 • Autocorrelation: What Happens If Error Terms
Are Correlated? 359
PART III • ADVANCED TOPICS IN ECONOMETRICS 393
Chapter 11 • Elements of Time-Series Econometrics 395
Chapter 12 • Panel Data Regression Models 419
Appendix A: Review of Statistics: Probability and Probability Distributions 441
Appendix B: Characteristics of Probability Distributions 475
Appendix C: Some Important Probability Distributions 505
Appendix D: Statistical Inference: Estimation and Hypothesis Testing 533
Appendix E: Statistical Tables 565
Index 593
DETAILED CONTENTS
Acknowledgments xvii
Preface xix
About the Author xxiii
Chapter 1 • The Nature and Scope of Econometrics 1
1.1 What Is Econometrics? 1
1.2 Why Study Econometrics? 2
1.3 The Methodology Of Econometrics 4
1. The Object of Research 4
2. Collecting Data 5
3. Specifying the Mathematical Model of Labor Force Participation 8
4. Specifying the Statistical, or Econometric, Model of Labor
Force Participation 9
5. Estimating the Parameters of the Chosen Econometric Model 11
6. Checking for Model Adequacy: Model Specification Testing 11
7. Testing Hypothesis Derived From the Model 13
8. Using the Model for Prediction or Forecasting 14
1.4 The Road Ahead 15
Key Terms and Concepts 16
Questions 16
Problems 17
Appendix 1A: Economic Data on the World Wide Web 19
PART I • THE LINEAR REGRESSION MODEL 23
Chapter 2 • Basic Ideas of Linear Regression: The Two-Variable Model 25
2.1 The Meaning of Regression 25
2.2 The Population Regression Function (PRF): A Hypothetical Example 26
2.3 Statistical or Stochastic Specification of The Population Regression
Function 30
2.4 The Nature of the Stochastic Error Term 31
2.5 The Sample Regression Function (SRF) 32
2.6 The Special Meaning of the Term Linear Regression 37
Linearity in the Variables 37
Linearity in the Parameters 38
2.7 Two-Variable Versus Multiple Linear Regression 39
2.8 Estimation of Parameters: The Method of Ordinary Least Squares 39
The Method of Ordinary Least Squares 40
2.9 Putting It All Together 44
Interpretation of the Estimated Math SAT Score Function 44
2.10 Some Illustrative Examples 45
2.11 Summary 52
Key Terms and Concepts 53
Questions 53
Problems 55
Optional Questions 60
Appendix 2A: Derivation of Least Squares Estimators 60
Chapter 3 • The Two-Variable Model: Hypothesis Testing 61
3.1 The Classical Linear Regression Model 62
3.2 Variances and Standard Errors of Ordinary Least Squares Estimators 66
Variances and Standard Errors of the Math SAT Example 67
Summary of the Math SAT Score Function 68
3.3 Why OLS? Properties of OLS Estimators 69
Gauss–Markov Theorem 70
3.4 The Sampling, or Probability, Distributions of OLS Estimators 70
Central Limit Theorem 71
3.5 Hypothesis Testing 72
Testing H0:B2 = 0 Versus H1:B2 ≠ 0: The Confidence Interval Approach 74
The Test of Significance Approach to Hypothesis Testing 76
Math SAT Example Continued 77
3.6 Hypothesis Testing: Some Practical Aspects 80
3.7 How Good Is The Fitted Regression Line: The Coefficient of
Determination, r² 80
Formulas to Compute r² 83
r² for the Math SAT Example 84
The Coefficient of Correlation, r 84
3.8 Reporting the Results of Regression Analysis 85
3.9 Illustrative Examples 87
1. Relationship Between Wages and Productivity in the
Business Sector, USA, 1959–2006 87
2. Expenditure on Education and Income in the 50 U.S. States for 2000 88
3. CEO Salaries of 447 Fortune 500 Companies for 1999 90
4. Impact of Advertising Expenditure on Viewers 91
3.10 Comments on the Illustrative Examples 92
3.11 Forecasting 92
3.12 Normality Tests 96
Histograms of Residuals 96
Jarque–Bera Test 96
3.13 Summary 98
Key Terms and Concepts 98
Questions 98
Problems 100
Chapter 4 • Multiple Regression: Estimation and Hypothesis Testing 105
4.1 The Three-Variable Linear Regression Model 106
The Meaning of Partial Regression Coefficient 107
4.2 Assumptions of the Multiple Linear Regression Model 108
4.3 Estimation of the Parameters of Multiple Regression 111
Ordinary Least Squares Estimators 111
Variance and Standard Errors of OLS Estimators 113
Properties of OLS Estimators of Multiple Regression 114
4.4 Goodness of Fit of Estimated Multiple Regression:
Multiple Coefficient of Determination, R² 114
4.5 Antique Clock Auction Prices Revisited 116
Interpretation of the Regression Results 116
4.6 Hypothesis Testing In A Multiple Regression: General Comments 117
4.7 Testing Hypotheses About Individual Partial Regression Coefficients 118
The Test of Significance Approach 118
The Confidence Interval Approach to Hypothesis Testing 120
4.8 Testing the Joint Hypothesis That B2 = B3 = 0 Or R² = 0 121
An Important Relationship Between F and R² 124
4.9 Two-Variable Regression In the Context of Multiple Regression:
Introduction to Specification Bias 125
4.10 Comparing Two R² Values: The Adjusted R² 127
4.11 When to Add an Additional Explanatory Variable to a Model 128
4.12 Restricted Least Squares 130
4.13 Illustrative Examples 131
Discussion of Regression Results 132
4.14 Summary 138
Key Terms and Concepts 139
Questions 140
Problems000
Appendix 4A.1: Derivations of OLS Estimators 144
Appendix 4A.2: Derivation of Equation (4.31) 145
Appendix 4A.3: Derivation of Equation (4.49) 145
Chapter 5 • Functional Forms of Regression Models 147
5.1 How to Measure Elasticity: The Log-Linear Model 148
Hypothesis Testing in Log-Linear Models 153
5.2 Multiple Log-Linear Regression Models 154
5.3 How to Measure the Growth Rate: The Semilog Model 157
Instantaneous Versus Compound Rate of Growth 160
The Linear Trend Model 161
5.4 The Lin-Log Model: When the Explanatory Variable Is Logarithmic 162
5.5 Reciprocal Models 164
5.6 Polynomial Regression Models 169
5.7 Regression Through the Origin: The Zero Intercept Model 173
5.8 A Note on Scaling and Units of Measurement 175
5.9 Regression on Standardized Variables 177
5.10 Summary of Functional Forms 180
5.11 Summary 180
Key Terms and Concepts 181
Questions 182
Problems 183
Appendix 5A: Logarithms 190
Chapter 6 • Qualitative or Dummy Variable Regression Models 193
6.1 The Nature of Dummy Variables 193
6.2 ANCOVA Models: Regression on One Quantitative Variable and
One Qualitative Variable With Two Categories 197
6.3 Regression on One Quantitative Variable and One Qualitative
Variable With More Than Two Classes or Categories 200
6.4 Regression on One Quantitative Explanatory Variable and
More Than One Qualitative Variable 203
Interaction Effects 204
A Generalization 205
6.5 Comparing Two Regressions 207
6.6 The Use of Dummy Variables In Seasonal Analysis 212
6.7 What Happens if the Dependent Variable Is Also a Dummy Variable?
The Linear Probability Model (LPM) 214
6.8 The Logit Model 219
Estimation of the Logit Model 221
6.9 Summary 228
Key Terms and Concepts 229
Questions 229
Problems 230
PART II • REGRESSION ANALYSIS IN PRACTICE 239
Chapter 7 • Model Selection: Criteria and Tests 241
7.1 The Attributes of a Good Model 242
7.2 Types of Specification Errors 243
7.3 Omission of Relevant Variable Bias: “Underfitting” a Model 243
7.4 Inclusion of Irrelevant Variables: “Overfitting” a Model 248
7.5 Incorrect Functional Form 251
7.6 Errors of Measurement 253
Errors of Measurement in the Dependent Variable 254
Errors of Measurement in the Explanatory Variable(s) 254
7.7 Detecting Specification Errors: Tests of Specification Errors 255
Detecting the Presence of Unnecessary Variables 255
Tests for Omitted Variables and Incorrect Functional Forms 258
Choosing Between Linear and Log-Linear Regression Models:
The MWD Test 260
Regression Error Specification Test: RESET 262
7.8 Outliers, Leverage, and Influence Data 265
7.9 Probability Distribution of the Error Term 268
7.10 Model Evaluation Criteria 270
7.11 Nonnormal Distribution of the Error Term 272
7.12 Fixed Versus Random (or Stochastic) Explanatory Variables 272
7.13 Summary 273
Key Terms and Concepts 274
Questions 275
Problems 275
Chapter 8 • Multicollinearity: What Happens if Explanatory
Variables Are Correlated? 279
8.1 The Nature of Multicollinearity: The Case of Perfect Multicollinearity 280
8.2 The Case of Near, or Imperfect, Multicollinearity 283
8.3 Theoretical Consequences of Multicollinearity 285
8.4 Practical Consequences of Multicollinearity 287
8.5 Detection of Multicollinearity 289
8.6 Is Multicollinearity Necessarily Bad? 294
8.7 An Extended Example: The Demand for Chickens In The United States,
1960 To 1982 295
Collinearity Diagnostics for the Demand Function for Chickens 297
8.8 What to Do With Multicollinearity: Remedial Measures 299
Dropping a Variable(s) From the Model 300
Acquiring Additional Data or a New Sample 301
Rethinking the Model 302
Prior Information About Some Parameters 303
Transformation of Variables 304
Other Remedies 305
8.9 Summary 305
Key Terms and Concepts 306
Questions 306
Problems 307
Chapter 9 • Heteroscedasticity: What Happens if the Error
Variance Is Nonconstant? 313
9.1 The Nature of Heteroscedasticity 313
Reasons for Heteroscedasticity 315
9.2 Consequences of Heteroscedasticity 316
9.3 Detection of Heteroscedasticity: How Do We Know When There Is a
Heteroscedasticity Problem? 319
1. Nature of the Problem 320
2. Graphical Examination of Residuals 320
3. Park Test 323
4. Glejser Test 327
5. White’s General Heteroscedasticity Test 328
6. Breusch-Pagan (BP) Test of Heteroscedasticity 330
Other Tests of Heteroscedasticity 332
9.4 What to Do if Heteroscedasticity Is Observed: Remedial Measures 332
When σᵢ² Is Known: The Method of Weighted Least Squares (WLS) 333
When True σᵢ² Is Unknown 334
Respecification of the Model 339
9.5 White’s Heteroscedasticity-Corrected Standard Errors and t Statistics 340
9.6 Some Concrete Examples of Heteroscedasticity 342
9.7 Summary 349
Key Terms and Concepts 350
Questions 350
Problems 351
Chapter 10 • Autocorrelation: What Happens If Error
Terms Are Correlated? 359
10.1 The Nature of Autocorrelation 360
Inertia 361
Model Specification Error(s) 362
The Cobweb Phenomenon 362
Data Manipulation 362
10.2 Consequences of Autocorrelation 364
10.3 Detecting Autocorrelation 364
The Graphical Method 365
The Durbin–Watson d Test 367
10.4 Remedial Measures 372
10.5 How to Estimate ρ 374
ρ = 1: The First Difference Method 375
ρ Estimated From Durbin–Watson d Statistic 375
ρ Estimated From OLS Residuals, eₜ 376
Other Methods of Estimating ρ 377
10.6 A Large Sample Method of Correcting OLS Standard Errors:
The Newey–West (NW) Method 378
10.7 A General Test of Autocorrelation: The Breusch–Godfrey (BG) Test 383
10.8 Summary 386
Key Terms and Concepts 386
Questions 387
Problems 388
PART III • ADVANCED TOPICS IN ECONOMETRICS 393
Chapter 11 • Elements of Time-Series Econometrics 395
11.1 The Phenomenon of Spurious Regression: Nonstationary Time Series 395
11.2 Tests of Stationarity 398
1. Graphical Analysis 398
2. Autocorrelation Function (ACF) and Correlogram 399
3. The Unit Root Test of Stationarity 402
11.3 Cointegrated Time Series 406
11.4 The Random Walk Model 408
11.5 Causality In Economics: The Granger Causality Test 411
The Granger Causality Test 411
11.6 Summary 415
Key Terms and Concepts 416
Problems 416
Chapter 12 • Panel Data Regression Models 419
12.1 The Importance of Panel Data 420
12.2 An Illustrative Example: Charitable Giving 421
12.3 Pooled OLS Regression of the Charity Function 423
12.4 The Fixed-Effects Least Squares Dummy Variable (LSDV) Model 424
12.5 Limitations of the Fixed-Effects LSDV Model 427
12.6 The Fixed-Effects Within-Group (WG) Estimator 428
12.7 The Random-Effects Model (REM) or Error Components Model (ECM) 430
Some Guidelines About REM and FEM 434
12.8 Properties of Various Estimators 435
12.9 Panel Data Regressions: Some Concluding Comments 436
12.10 Summary and Conclusions 436
Key Terms and Concepts 437
Problems 438
INTRODUCTION TO APPENDIXES A, B, C, AND D:
BASICS OF PROBABILITY AND STATISTICS 441
Appendix A: Review of Statistics: Probability and Probability Distributions 442
A.1 Some Notation 442
The Summation Notation 442
Properties of the Summation Operator 443
A.2 Experiment, Sample Space, Sample Point, and Events 444
Experiment 444
Sample Space or Population 444
Sample Point 445
Events 445
Venn Diagrams 445
A.3 Random Variables 446
A.4 Probability 447
Probability of an Event: The Classical or A Priori Definition 448
Relative Frequency or Empirical Definition of Probability 448
Probability of Random Variables 455
A.5 Random Variables and Their Probability Distributions 455
Probability Distribution of a Discrete Random Variable 455
Probability Distribution of a Continuous Random Variable 457
Cumulative Distribution Function (CDF) 459
A.6 Multivariate Probability Density Functions 462
Marginal Probability Functions 464
Conditional Probability Functions 465
Statistical Independence 467
A.7 Summary and Conclusions 468
Key Terms and Concepts 469
References 469
Questions 470
Problems 470
Appendix B: Characteristics of Probability Distributions 475
B.1 Expected Value: A Measure of Central Tendency 475
Properties of Expected Value 477
Expected Value of Multivariate Probability Distributions 479
B.2 Variance: A Measure of Dispersion 479
Properties of Variance 481
Chebyshev’s Inequality 483
Coefficient of Variation 484
B.3 Covariance 484
Properties of Covariance 486
B.4 Correlation Coefficient 486
Properties of the Correlation Coefficient 487
Variances of Correlated Variables 489
B.5 Conditional Expectation 489
Conditional Variance 491
B.6 Skewness and Kurtosis 491
B.7 From Population to the Sample 494
Sample Mean 495
Sample Variance 496
Sample Covariance 496
Sample Correlation Coefficient 498
Sample Skewness and Kurtosis 498
B.8 Summary 499
Key Terms and Concepts 499
Questions 500
Problems 501
Optional Exercises 503
Appendix C: Some Important Probability Distributions 505
C.1 The Normal Distribution 506
Properties of the Normal Distribution 506
The Standard Normal Distribution 508
Random Sampling From a Normal Population 512
The Sampling or Probability Distribution of the Sample Mean X̄ 512
The Central Limit Theorem (CLT) 518
C.2 The t Distribution 519
Properties of the t Distribution 519
C.3 The Chi-Square (χ²) Probability Distribution 523
Properties of the Chi-Square Distribution 524
C.4 The F Distribution 526
Properties of the F Distribution 527
C.5 Summary 529
Key Terms and Concepts 530
Questions 530
Problems 531
Appendix D: Statistical Inference: Estimation and Hypothesis Testing 533
D.1 The Meaning of Statistical Inference 533
D.2 Estimation and Hypothesis Testing: Twin Branches of Statistical Inference 535
D.3 Estimation of Parameters 536
D.4 Properties of Point Estimators 541
Linearity 541
Unbiasedness 542
Minimum Variance 543
Efficiency 543
Best Linear Unbiased Estimator (BLUE) 544
Consistency 545
D.5 Statistical Inference: Hypothesis Testing 546
The Confidence Interval Approach to Hypothesis Testing 547
Type I and Type II Errors: A Digression 548
The Test of Significance Approach to Hypothesis Testing 551
A Word on Choosing the Level of Significance, α, and the p Value 555
The χ² and F Tests of Significance 556
D.6 Summary 559
Key Terms and Concepts 560
Questions 560
Problems 561
Appendix E: Statistical Tables 565
Index 593
ACKNOWLEDGMENTS
I would like to thank Inas R. Kelly, Associate Professor of Economics at Loyola
Marymount University, and Michael Grossman, Distinguished Professor of
Economics at the City University of New York, for reading and providing feedback
on this fifth edition, and also Helen Salmon at SAGE for her behind-the-scenes help
and encouragement.
SAGE and the author are grateful for feedback from the following reviewers in the
development of this fifth edition:
Prasad V. Bidarkota, Florida International University
Chinyere Emmanuel Egbe, Medgar Evers College, City University of New York
Kyungkook Kang, University of Central Florida
C. Burc Kayahan, Acadia University
Tom Means, San Jose State University
Elias Shukralla, Siena College
Robert Sonora, University of Montana
Patricia Kay Smith, University of Michigan–Dearborn
Della Lee Sue, Marist College
W. Scott Trees, Siena College
PREFACE
PURPOSE OF THE FIFTH EDITION OF
ESSENTIALS OF ECONOMETRICS
As in the first four editions of this book, my main purpose in the fifth edition is to
provide a user-friendly introduction to econometric theory and practice to a wide variety
of students. The intended audience is undergraduate economics majors, undergraduate
business administration majors, MBA students, and others in the social and behavioral
sciences where econometric techniques, especially the techniques of linear regression
analysis, are used.
It is no exaggeration to say that regression analysis has become an integral part of study
in any discipline where one is interested in studying the relationship between a variable
of interest, called the dependent or response variable, and a set of explanatory or predic-
tor variables. Sir Francis Galton (1822–1911) used it in the study of heredity, particularly
the height of grownup children in relationship to the height of their parents. He used
the method of least squares, the workhorse of linear regression analysis, for this purpose.
Since then, the methodology of regression analysis has been improved and developed in
many ways. Regression analysis is the most widely used social science research tool.
The book is designed to help beginning students understand econometric techniques
through extensive examples, careful explanations, and a wide variety of problem
material. In each of the editions of Essentials of Econometrics (EE), I have tried to incor-
porate major developments in the field in an intuitive and informative way without
resorting to matrix algebra, calculus, or statistics beyond the introductory level. The
fifth edition of EE continues this tradition. Students wishing to pursue this subject at
a higher mathematical level can consult my book, Linear Regression: A Mathematical
Introduction (Sage, 2018).
DESCRIPTION OF THE SPECIFIC
MARKET (COURSES) FOR THIS BOOK
As noted, the bread-and-butter tool of econometrics is linear regression analysis. A
perusal of the books published in this field shows a variety of titles: Introduction to
Econometrics, Introductory Econometrics, A First Course in Linear Regression, Regression
Analysis for the Social Sciences, Statistical Methods in the Social Sciences, Understanding
Econometrics, Econometrics Models and Economic Forecasting, Applied Regression Models,
Running Regressions, Economic Analysis of Financial Data, Linear Models in Statistics,
and Principles of Econometrics. Despite the variety of names,
they all deal with linear regression analysis at various levels of mathematical
sophistication. The fifth edition of EE provides the foundation of linear regression
analysis for the beginning student. A search on the Internet will reveal that various
editions of my book have been used or are still being used in universities and colleges
all over the world. A growing trend is that the subject is offered in online courses by
various colleges and universities. EE has also been used as a reference book in private
business, government and nongovernment entities, and research organizations.
THE MAJOR FEATURES OF THE BOOK AND
THE BENEFITS OF THESE FEATURES
The foundation of linear regression is the classical linear regression model (CLRM). The
CLRM is based on several simplifying assumptions, as is true of many other disciplines.
I discuss these assumptions one by one, pointing out the reasons for each assumption.
After discussing the CLRM carefully, I look at each assumption critically: How realistic
is the assumption? What are the consequences if the assumption is not met in a
concrete situation, and what remedies are available? I provide several real data-based
examples to shed light on the robustness of the CLRM. Regression outputs of several
examples, using statistical packages such as EViews and Stata, are included in each
chapter. The regression outputs tell the reader at a glance the main features of the data
underlying the various examples. The longevity of the book probably reflects how
appealing students and teachers find it.
With the knowledge of linear regression, the reader will be able to undertake projects
involving regression modeling. The reader will also be able to follow empirical research
in their field of interest as well as read professional journals in their field.
The appendix to Chapter 1 lists various sources of governmental and nongovernmen-
tal data that can be easily accessed. The Federal Reserve Bank of St. Louis has a very
extensive set of data on a variety of subjects that can be easily downloaded in Excel or
other formats and used with several statistical packages. And this is all free!
A DISCUSSION OF SPECIAL PEDAGOGICAL
AIDS AND HIGH-INTEREST FEATURES
Each chapter has a summary of the main points discussed in the chapter as well as a
glossary of the key terms. The problem set in each chapter includes some analytical
questions as well as real-world data sets that will let the reader solve problems using
a variety of techniques discussed in the text. The variety of examples discussed in
the text as well as those included in the exercises will show the reader how regression
analysis has been used in a variety of disciplines. There are approximately 54 fully
worked illustrative examples, about 82 analytical questions, and about 56 data sets in
the exercises in the book. The teacher can assign one or more of the data sets as a class
project. I firmly believe in learning by doing!
A salient feature of EE is that four statistical appendices provide the basics of
probability and statistics for the benefit of students whose knowledge of these subjects
has become a little rusty or who studied these subjects a long time ago. Some
teachers might want to cover the material in these appendices before covering the rest
of the book.
CHANGES IN THE FIFTH EDITION
New to this edition, and in response to useful feedback, are Chapter 11 on time-series
econometrics and Chapter 12 on panel data econometrics. Some of the material in
these chapters is rather advanced, but I have tried to explain it in a way that is under-
standable to beginning students.
Since time-series data are generally collected sequentially, there is every likelihood that
adjacent observations are correlated. This leads to the problem of autocorrelation, which
is discussed at length in Chapter 10. However, a more fundamental problem
with time series is that the series may not be stationary. The concept of a stationary time
series is discussed intuitively in Chapter 11, with the warning that regressing one
nonstationary time series on another may lead to so-called spurious or nonsense
regressions. In that chapter, I discuss methods to find out whether a time series is
stationary. Another topic discussed there is causality, a kind of chicken-and-egg
problem: Which comes first, the chicken or the egg? Thus, does money supply cause
gross domestic product, or is it the other way round? The so-called Granger causality
test is often used to answer this question, although there is controversy about this test.
In panel data modeling, we have data with both time and cross-sectional dimensions.
Thus, we may have data on profits and sales for 50 firms over, say, 10 years, for a total
of 500 observations. In finding out the relationship between sales and profits, how do
we handle such data? We can regress profits on sales for each of the 50 firms using 10
years of data for each firm, obtaining 50 time-series regressions. We can also regress
profits on sales for each year for the 50 firms, obtaining 10 cross-sectional regressions.
How do we reconcile these regressions? The topics discussed in Chapter 12 show the
various alternatives.
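The three ways of slicing a 50-firm, 10-year panel described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are simulated, and the names `ols_slope`, `sales`, `profits`, and `firm_effect`, as well as the assumed true slope of 0.2, are hypothetical and do not come from the book.

```python
import numpy as np

def ols_slope(x, y):
    """Least-squares slope from regressing y on x (with an intercept)."""
    x_dev = x - x.mean()
    return float((x_dev * (y - y.mean())).sum() / (x_dev ** 2).sum())

# Simulated panel (an assumption, not the book's data): 50 firms, 10 years.
n_firms, n_years = 50, 10
rng = np.random.default_rng(1)
firm_effect = rng.normal(0.0, 5.0, size=(n_firms, 1))   # firm-specific level
sales = rng.uniform(50.0, 150.0, size=(n_firms, n_years))
noise = rng.normal(0.0, 2.0, size=(n_firms, n_years))
profits = 10.0 + firm_effect + 0.2 * sales + noise

# 50 time-series regressions: one per firm, each using that firm's 10 years.
time_series_slopes = [ols_slope(sales[i], profits[i]) for i in range(n_firms)]

# 10 cross-sectional regressions: one per year, each using all 50 firms.
cross_section_slopes = [ols_slope(sales[:, t], profits[:, t]) for t in range(n_years)]

# One pooled regression that stacks all 500 firm-year observations.
pooled_slope = ols_slope(sales.ravel(), profits.ravel())

print(len(time_series_slopes), len(cross_section_slopes), sales.size)
print(f"pooled slope estimate: {pooled_slope:.3f}")
```

The pooled regression here ignores the firm-specific component entirely; how to account for it is the kind of question the panel data methods of Chapter 12 address.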
The chapter on dummy, or qualitative explanatory, variables now also includes a
discussion of dummy dependent variables. For example, how do we model the decision
to smoke or not? Smoking is a binary variable: you either smoke or you do not
smoke. In the traditional regression model, by contrast, the dependent variable is
generally quantitative. In this chapter, I discuss how dummy explanatory variables enhance
the linear regression model. I also discuss the problems entailed in estimating a
regression model involving dummy dependent variables by the method of least squares and
suggest alternatives, such as the logit model.
DIGITAL RESOURCES TO
ACCOMPANY THE TEXT
An instructor website at edge.sagepub.com/gujarati5e contains the figures from the
book and answers to all the problems in the text, together with PowerPoint slides.
Students can access the data sets for many of the larger tables for the book on the
student section of this website.
ABOUT THE AUTHOR
Damodar N. Gujarati, MCom, University of Bombay (Mumbai), MBA and PhD,
both from the University of Chicago, is Professor Emeritus of Economics at the
United States Military Academy at West Point, New York. Prior to that, he taught
for 25 years at the Baruch College of the City University of New York (CUNY)
and at the Graduate Center of CUNY. He is the author of Government and Business
(McGraw-Hill, 1984), the bestselling textbook Basic Econometrics (fifth edition, 2009,
with coauthor Dawn Porter), Econometrics by Example (second edition, 2014, Palgrave-
Macmillan), and Essentials of Econometrics (fifth edition, 2021) and Linear Regression:
A Mathematical Introduction (2018), both with SAGE. He is also the author of Pensions
and New York City Fiscal Crisis (American Enterprise Institute, 1978). His books on
econometrics have been translated into several languages. He has published extensively
in recognized national and international journals, such as the Review of Economics and
Statistics, Economic Journal, Journal of Financial and Quantitative Analysis, and Journal
of Business, published by the University of Chicago.
He has held visiting professorships at the University of Sheffield, United Kingdom;
National University of Singapore; and the University of New South Wales, Australia.
He was a Visiting Fulbright Scholar to India. He has lectured extensively on micro-
and macroeconomic topics in Australia, China, Bangladesh, Germany, India, Israel,
Mauritius, and the Republic of South Korea.
1
THE NATURE AND SCOPE
OF ECONOMETRICS
1 Arthur S. Goldberger, Econometric Theory, Wiley, New York, 1964, p. 1.
Econometrics may be defined as the quantitative analysis of actual
economic phenomena based on the concurrent development of theory and
observations, related by appropriate methods of inference.
Paul Samuelson
Econometrics may be defined as the social science in which tools of economic
theory, mathematics, and statistical inference are applied to the analysis of
economic phenomena.
Arthur S. Goldberger
Research in economics, finance, management, marketing, and related disciplines
is becoming increasingly quantitative. Beginning students in these fields are
encouraged, if not required, to take a course or two in econometrics—a field of study
that has become quite popular. This chapter gives the beginner an overview of what
econometrics is all about.
1.1 WHAT IS ECONOMETRICS?
Simply stated, econometrics means economic measurement. Although quantitative
measurement of economic concepts such as the gross domestic product (GDP), unem-
ployment, inflation, imports, and exports is very important, the scope of econometrics
is much broader, as can be seen from the following definitions:
Econometrics may be defined as the social science in which the tools of economic
theory, mathematics, and statistical inference are applied to the analysis of
economic phenomena.1
Econometrics, the result of a certain outlook on the role of economics, consists
of the application of mathematical statistics to economic data to lend empirical
support to the models constructed by mathematical economics and to obtain
numerical results.2
1.2 WHY STUDY ECONOMETRICS?
As the preceding definitions suggest, econometrics makes use of economic theory,
mathematical economics, economic statistics (i.e., economic data), and mathematical
statistics. Yet, it is a subject that deserves to be studied in its own right for the follow-
ing reasons.
Economic theory makes statements or hypotheses that are mostly qualitative in
nature. For example, microeconomic theory states that, other things remaining the
same (the famous ceteris paribus clause of economics), an increase in the price of a
commodity is expected to decrease the quantity demanded of that commodity. Thus,
economic theory postulates a negative or inverse relationship between the price and
quantity demanded of a commodity—this is the widely known law of downward-
sloping demand or simply the law of demand. But the theory itself does not provide
any numerical measure of the strength of the relationship between the two; that is, it
does not tell by how much the quantity demanded will go up or down as a result of a
certain change in the price of the commodity. It is the econometrician’s job to provide
such numerical estimates. Econometrics gives empirical (i.e., based on observation
or experiment) content to most economic theory. If we find in a study or experiment
that when the price of a unit increases by a dollar the quantity demanded goes down
by, say, 100 units, we have not only confirmed the law of demand, but in the pro-
cess, we have also provided a numerical estimate of the relationship between the two
variables—price and quantity.
The main concern of mathematical economics is to express economic theory
in mathematical form or equations (or models) without regard to measurability
or empirical verification of the theory. Econometrics, as noted earlier, is primar-
ily interested in the empirical verification of economic theory. As we will show
shortly, the econometrician often uses mathematical models proposed by the
mathematical economist but puts these models in forms that lend themselves to
empirical testing.
2P. A. Samuelson, T. C. Koopmans, and J. R. N. Stone, “Report of the Evaluative Committee for Economet-
rica,” Econometrica, vol. 22, no. 2, April 1954, pp. 141–146.
Economic statistics is mainly concerned with collecting, processing, and presenting
economic data in the form of charts, diagrams, and tables. This is the economic statis-
tician’s job. He or she collects data on the GDP, employment, unemployment, prices,
and so on. These data constitute the raw data for econometric work. But the economic
statistician does not go any further because he or she is not primarily concerned with
using the collected data to test economic theories.
Although mathematical statistics provides many of the tools employed in the trade,
the econometrician often needs special methods because of the unique nature of most
economic data, namely, that the data are not usually generated as the result of a con-
trolled experiment. The econometrician, like the meteorologist, generally depends on
data that cannot be controlled directly. Thus, data on consumption, income, invest-
ments, savings, prices, and so on, which are collected by public and private agencies,
are nonexperimental in nature. The econometrician takes these data as given. This
creates special problems not normally dealt with in mathematical statistics. More-
over, such data are likely to contain errors of measurement, of either omission or
commission, and the econometrician may be called upon to develop special methods of
analysis to deal with such errors of measurement.
For students majoring in economics and business, there is a pragmatic reason for
studying econometrics. After graduation, in their employment, they may be called
upon to forecast sales, interest rates, and money supply or to estimate demand and
supply functions or price elasticities for products. Quite often, economists appear
as expert witnesses before federal and state regulatory agencies on behalf of their
clients or the public at large. Thus, an economist appearing before a state regula-
tory commission that controls prices of gas and electricity may be required to assess
the impact of a proposed price increase on the quantity demanded of electricity
before the commission will approve the price increase. In situations like this, the
economist may need to develop a demand function for electricity for this purpose.
Such a demand function may enable the economist to estimate the price elasticity of
demand, that is, the percentage change in the quantity demanded for a percentage
change in the price. Knowledge of econometrics is very helpful in estimating such
demand functions.
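The elasticity definition above can be sketched in a few lines of Python. This is only an illustration: the function name and the electricity figures are hypothetical, and the calculation uses the simple percentage-change definition given in the text rather than an estimated demand function.

```python
def price_elasticity(q_old, q_new, p_old, p_new):
    """Percentage change in quantity divided by percentage change in price."""
    pct_change_q = (q_new - q_old) / q_old * 100.0
    pct_change_p = (p_new - p_old) / p_old * 100.0
    return pct_change_q / pct_change_p


# Hypothetical figures: the price of electricity rises from $0.10 to $0.11
# per kWh (a 10% increase) and quantity demanded falls from 1,000 to 950 kWh
# (a 5% decrease).
elasticity = price_elasticity(1000.0, 950.0, 0.10, 0.11)
print(round(elasticity, 2))
```

With these made-up numbers the elasticity is about -0.5: a 1% price increase goes with roughly a 0.5% fall in quantity demanded.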
It is fair to say that econometrics has become an integral part of training in economics
and business.
It may be added that the techniques and methods developed in econometrics have found
uses in several other areas of the social sciences, in politics and international relations,
and in the agricultural and medical sciences, as some of the examples discussed in this
book will reveal.
1.3 THE METHODOLOGY OF ECONOMETRICS
How does one actually do an econometric study? Broadly speaking, econometric anal-
ysis proceeds along the following lines.
1. The object of research
2. Collecting data
3. Specifying the mathematical model of theory
4. Specifying the statistical, or econometric, model of theory
5. Estimating the parameters of the chosen econometric model
6. Checking for model adequacy: model specification testing
7. Testing hypotheses derived from the model
8. Using the model for prediction or forecasting
To illustrate the methodology, consider this question: Do economic conditions affect
people’s decisions to enter the labor force, that is, their willingness to work? As a mea-
sure of economic conditions, suppose we use the unemployment rate (UNR), and as a
measure of labor force participation, we use the labor force participation rate (LFPR).
Data on UNR and LFPR are regularly published by the government. So to answer the
question, we proceed as follows.
1. The Object of Research
The starting point is to find out what economic theory has to say on the subject you
want to study. In labor economics, there are two rival hypotheses about the effect
of economic conditions on people’s willingness to work. The discouraged-worker
hypothesis (effect) states that when economic conditions worsen, as reflected in a
higher unemployment rate, many unemployed workers give up hope of finding a job
and drop out of the labor force. On the other hand, the added-worker hypothesis
(effect) maintains that when economic conditions worsen, many secondary workers
who are not currently in the labor market (e.g., mothers with children) may decide to
join the labor force if the main breadwinner in the family loses his or her job. Even if
the jobs these secondary workers get are low paying, the earnings will make up some
of the loss in income suffered by the primary breadwinner.
Whether, on balance, the labor force participation rate will increase or decrease will
depend on the relative strengths of the added-worker and discouraged-worker effects. If
the added-worker effect dominates, LFPR will increase even when the unemployment
rate is high. Contrarily, if the discouraged-worker effect dominates, LFPR will decrease.
How do we find this out? This now becomes our empirical question.
2. Collecting Data
For empirical purposes, therefore, we need quantitative information on the two vari-
ables. There are three types of data that are generally available for empirical analysis.
1. Time series
2. Cross-sectional
3. Pooled (a combination of time series and cross-sectional)
Time-series data are collected over a period of time, such as the data on GDP,
employment, unemployment, money supply, or government deficits. Such data
may be collected at regular intervals—daily (e.g., stock prices), weekly (e.g., money
supply), monthly (e.g., the unemployment rate), quarterly (e.g., GDP), or annually
(e.g., government budget). So-called high-frequency data are collected over
extremely short periods of time, such as seconds and minutes. In flash trading in stock
and foreign exchange markets, such high-frequency data have now become common.
These data may be quantitative in nature (e.g., prices, income, money supply) or
qualitative (e.g., male or female, employed or unemployed, married or unmarried,
White or Black). As we will show, qualitative variables, also called dummy or categorical
variables, can be every bit as important as quantitative variables.
Since successive observations in time-series data may be correlated, they pose special
problems for regressions involving time-series data, particularly the problem of auto-
correlation, a topic we discuss at length in Chapter 10 with appropriate examples.
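To make the idea of correlated successive observations concrete, here is a minimal sketch that computes the sample correlation between a series and its own one-period lag. The function name and both series are hypothetical, and this descriptive measure is not a substitute for the formal treatment of autocorrelation in the later chapter.

```python
def lag1_autocorrelation(series):
    """Sample correlation between Y_t and Y_(t-1)."""
    n = len(series)
    mean = sum(series) / n
    # Cross-products of deviations for adjacent observations.
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    # Total sum of squared deviations.
    den = sum((y - mean) ** 2 for y in series)
    return num / den


# Hypothetical series: one that drifts steadily upward, one that flips sign
# every period.
trending = [float(t) for t in range(1, 21)]
alternating = [1.0 if t % 2 == 0 else -1.0 for t in range(20)]

r_trend = lag1_autocorrelation(trending)
r_alt = lag1_autocorrelation(alternating)
print(round(r_trend, 2), round(r_alt, 2))
```

A value near +1 means that a high observation tends to be followed by another high one; a value near -1 means that successive observations tend to move in opposite directions.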
Time-series data pose another problem, namely, that they may not be stationary.
Loosely speaking, a time series is stationary if its mean and variance do not vary system-
atically over time. In Chapter 11 on time-series econometrics, we examine the nature
of stationary and nonstationary time series and show the special statistical problems
created by the latter. If we are dealing with time-series data, we will denote the obser-
vations subscript by t (e.g., Yt, Xt).
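The loose definition of stationarity can be illustrated informally by comparing the mean and variance of the first and second halves of a series. This is only a rough eyeball check under assumed, made-up data, not a formal test of stationarity; the function name is hypothetical.

```python
from statistics import mean, pvariance


def half_sample_summary(series):
    """Mean and variance of the first and second halves of a series."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return {
        "first_mean": mean(first),
        "second_mean": mean(second),
        "first_var": pvariance(first),
        "second_var": pvariance(second),
    }


# Hypothetical series: one fluctuating around a constant level, one trending.
stable = [5.0, 6.0, 5.0, 4.0, 5.0, 6.0, 5.0, 4.0]
trending = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

stable_summary = half_sample_summary(stable)
trending_summary = half_sample_summary(trending)
print(stable_summary)
print(trending_summary)
```

For the stable series the two halves have the same mean and variance; for the trending series the mean shifts upward between halves, which is the kind of systematic change over time that the definition rules out.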
Cross-sectional data are data on one or more variables collected at one point in time,
such as the census of population conducted by the U.S. Census Bureau every 10 years
(the most recent was on April 1, 2010; the results of the 2020 census are not yet
available at the time of writing); the surveys of consumer expenditures conducted
by the University of Michigan; and the opinion polls such as those conducted by
Gallup, Harris, and other polling organizations. Like time-series data, cross-sectional
data have their particular problems, particularly the problem of heterogeneity. For
example, if you collect data on executive salaries in a given industry at the same point
in time, heterogeneity arises because the data may contain small-, medium-, and large-
size companies with their own management style and policies. In Chapter 5, we show
how the size or scale effect of heterogeneous companies can be taken into account.
In pooled data, we have elements of both time-series and cross-sectional data. For
example, if we collect data on the unemployment rate for 10 countries for a period of
20 years, the data will constitute an example of pooled data—data on the unemploy-
ment rate for each country for the 20-year period will form time-series data, whereas
data on the unemployment rate for the 10 countries for any single year will be cross-
sectional data. In pooled data, we will have 200 observations—20 annual observations
for each of the 10 countries.
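The structure of this pooled data set can be sketched as follows. The country labels and the year range are hypothetical placeholders, and the unemployment rate is left empty because no actual figures are implied.

```python
from itertools import product

countries = [f"Country{i}" for i in range(1, 11)]  # 10 hypothetical countries
years = list(range(2001, 2021))                     # 20 hypothetical years

# One record per (country, year) pair; "unr" is a placeholder for the rate.
pooled = [
    {"country": c, "year": y, "unr": None}
    for c, y in product(countries, years)
]

# Time series: one country followed over all years.
time_series = [row for row in pooled if row["country"] == "Country1"]
# Cross-section: all countries observed in a single year.
cross_section = [row for row in pooled if row["year"] == 2001]

print(len(pooled), len(time_series), len(cross_section))
```

The three counts are 200, 20, and 10, matching the 200 pooled observations, the 20-year time series for one country, and the 10-country cross-section for one year.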
There is a special type of pooled data called panel data, also called longitudinal
or micropanel data, in which the same cross-sectional unit, say, a family or firm, is
surveyed over time. For example, the U.S. Department of Commerce conducts a cen-
sus of housing at periodic intervals. At each periodic survey, the same household (or
the people living at the same address) is interviewed to find out if there has been any
change in the housing and financial conditions of that household since the last survey.
The panel data that result from repeatedly interviewing the same household at periodic
intervals provide very useful information on the dynamics of household behavior.
We denote panel data by the double subscript it. Thus, Yit will denote the (cross-
sectional) observation for the ith unit at time t.
Quality of the Data. The researcher must check carefully the reputation of the agency
that collects the data, for very often the data contain errors of measurement, errors of
omission of some observations, or errors of systematic rounding and the like. Data col-
lected in public polls or in marketing surveys may be biased because of nonresponse
or incomplete response from the participants. Sometimes the data are available only at
a highly aggregated, or macro, level, which may not tell us much about the individual
entities included in the aggregate. It should always be kept in mind that the results of
research are only as good as the quality of the data.
Since an individual researcher does not have the luxury of collecting data on their
own, very often they have to depend on secondary sources. But every effort must be
made to check the quality of the data used in empirical analysis.
Data Revisions. Macro data on variables such as GDP, consumer price index (CPI),
and other economic variables are often revised upward or downward as initially pub-
lished data may be tentative. It behooves the researcher to keep track of the revised
data.
Moreover, macroeconomic and microeconomic data are often “jolted” by unusual events,
such as the Great Recession of 2008 and the several years that followed, which was
triggered by the collapse of the housing market boom that had been set in motion by the
subprime loans given by real estate brokers and banks. This collapse spilled over into
the stock market. The severe recession that started in the United States very quickly
spread across the globe, so such unusual events should be taken into account in
analyzing economic data.
A startling example is the coronavirus disease 2019 (COVID-19) pandemic, which
started in one country in December 2019 and quickly spread to other countries, with
devastating effects on their economies. In the United States, according to the U.S.
Centers for Disease Control and Prevention, as of March 29, 2021, the total number
of COVID-19 cases was 30,085,827 and the total number of deaths was 546,704. The
long-term consequences of COVID-19 have yet to be assessed. Doing econometric
analysis in situations such as this is very challenging, to say the least.
Sources of the Data. A word is in order regarding data sources. The success
of any econometric study hinges on the quality as well as the quantity of data.
Fortunately, the Internet has opened up a veritable wealth of data. In Appendix
1A, we give addresses of several websites that have all kinds of microeconomic
and macroeconomic data. Students should be familiar with such sources of data,
as well as how to access or download them. Of course, these data are continually
updated so the reader should find the latest available data.
Data From Statistical Packages. Statistical packages, such as EViews, Stata,
Minitab, and SAS, have data sets for expository purposes. The Federal Reserve
Bank of St. Louis has extensive data on several macroeconomic variables in Excel
format that can be directly imported into EViews
(http://research.stlouisfed.org/fred-addin), and FRED economic data are extremely
useful for empirical research. Stata can also import FRED data in Stata format; to
locate the freduse command, issue findit freduse from within Stata.
For our analysis, we obtained the time-series data shown in Table 1-1 of the book's
website. This table gives data on the civilian labor force participation rate (CLFPR)
and the civilian unemployment rate (CUNR), defined as the number of civilians
unemployed as a percentage of the civilian labor force, for the United States for
the period 1980–2007.3 The data beyond this period are given in Problem 1.10
(see Table 1-2 found on the book’s website).
3We consider here only the aggregate CLFPR and CUNR, but data are available by age, sex, and ethnic
composition.
Unlike physical sciences, most data collected in economics (e.g., GDP, money supply,
Dow Jones index, car sales) are nonexperimental in that the data-collecting agency
(e.g., government) may not have any direct control over the data. Thus, the data on
labor force participation and unemployment are based on the information provided
to the government by participants in the labor market. In a sense, the government is
a passive collector of these data and may not be aware of the added- or discouraged-
worker hypotheses, or any other hypothesis, for that matter. Therefore, the collected
data may be the result of several factors affecting the labor force participation decision
made by the individual person. That is, the same data may be compatible with more
than one theory.
3. Specifying the Mathematical
Model of Labor Force Participation
To see how CLFPR behaves in relation to CUNR, the first thing we should do is plot the
data for these variables in a scatter diagram, or scattergram, as shown in Figure 1-1.
The scattergram shows that CLFPR and CUNR are inversely related, perhaps sug-
gesting that, on balance, the discouraged-worker effect is stronger than the added-
worker effect.4 As a first approximation, we can draw a straight line through the scatter
FIGURE 1-1  Regression plot for civilian labor force participation rate (%) and
civilian unemployment rate (%)
[Fitted line plot: CLFPR (%) on the vertical axis against CUNR (%) on the horizontal axis.]
4On this, see Shelly Lundberg, “The Added Worker Effect,” Journal of Labor Economics, vol. 3, January 1985,
pp. 11–37.
points and write the relationship between CLFPR and CUNR by the following simple
mathematical model:
CLFPR = B1 + B2 CUNR (1.1)
Equation (1.1) states that CLFPR is linearly related to CUNR. B1 and B2 are known
as the parameters of the linear function.5 B1 is also known as the intercept; it gives
the value of CLFPR when CUNR is zero.6 B2 is known as the slope. The slope measures
the rate of change in CLFPR for a unit change in CUNR or, more generally, the rate
of change in the value of the variable on the left-hand side of the equation for a unit
change in the value of the variable on the right-hand side. The slope coefficient B2
can be positive (if the added-worker effect dominates the discouraged-worker effect)
or negative (if the discouraged-worker effect dominates the added-worker effect).
Figure 1-1 suggests that in the present case, it is negative.
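The roles of the intercept and the slope in Equation (1.1) can be checked with a short sketch. The parameter values below are hypothetical round numbers chosen for illustration; they are not estimates from Table 1-1.

```python
def clfpr_linear(cunr, b1, b2):
    """Deterministic linear model: CLFPR = B1 + B2 * CUNR."""
    return b1 + b2 * cunr


b1, b2 = 70.0, -0.5  # hypothetical parameter values, for illustration only

# Intercept: the value of CLFPR when CUNR is zero.
intercept = clfpr_linear(0.0, b1, b2)
# Slope: the change in CLFPR when CUNR rises by one unit (here from 5 to 6).
slope = clfpr_linear(6.0, b1, b2) - clfpr_linear(5.0, b1, b2)

print(intercept, slope)
```

Because the function is linear, the one-unit change in CLFPR is the same whichever starting value of CUNR is used.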
4. Specifying the Statistical, or
Econometric, Model of Labor Force Participation
The purely mathematical model of the relationship between CLFPR and CUNR given
in Equation (1.1), although of prime interest to the mathematical economist, is of lim-
ited appeal to the econometrician, for such a model assumes an exact, or deterministic,
relationship between the two variables; that is, for a given CUNR, there is a unique
value of CLFPR. In reality, one rarely finds such neat relationships between economic
variables. Most often, the relationships are inexact, or statistical, in nature.
This is seen clearly from the scattergram given in Figure 1-1. Although the two vari-
ables are inversely related, the relationship between them is not perfectly or exactly
linear, for if we draw a straight line through the 28 data points, not all the data points
will lie exactly on that straight line. Recall that to draw a straight line, we need only
two points.7 Why don’t the 28 data points lie exactly on the straight line specified by
the mathematical model, Equation (1.1)? Remember that our data on labor force and
unemployment are nonexperimentally collected. Therefore, as noted earlier, besides
the added- and discouraged-worker hypotheses, there may be other forces affecting
labor force participation decisions. As a result, the observed relationship between
CLFPR and CUNR is likely to be imprecise.
5Broadly speaking, a parameter is an unknown quantity that may vary over a certain set of values. In statis-
tics, a probability distribution function (PDF) of a random variable is often characterized by its parameters,
such as its mean and variance. This topic is discussed in greater detail in Appendixes A and B.
6In Chapter 2, we give a more precise interpretation of the intercept in the context of regression analysis.
7We even tried to fit a parabola to the scatter points given in Figure 1-1, but the results were not materially
different from the linear specification.
Let us allow for the influence of all other variables affecting CLFPR in a catchall vari-
able u and write Equation (1.2) as follows:
CLFPR = B1 + B2CUNR + u (1.2)
where u represents the random error term, or simply the error term.8 We let u rep-
resent all those forces (besides CUNR) that affect CLFPR but are not explicitly intro-
duced in the model, as well as purely random forces. As we will see in Part II, the error
term distinguishes econometrics from purely mathematical economics.
Equation (1.2) is an example of a statistical, or empirical or econometric, model.
More precisely, it is an example of what is known as a linear regression model,
which is a prime subject of this book. In such a model, the variable appearing on
the left-hand side of the equation is called the dependent variable, and the vari-
able on the right-hand side is called the independent, or explanatory, variable.
In linear regression analysis, our primary objective is to explain the behavior of
one variable (the dependent variable) in relation to the behavior of one or more
other variables (the explanatory variables), allowing for the fact that the relation-
ship between them is inexact.
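The difference between the deterministic model and the econometric model can be seen in a small simulation. This is a sketch under stated assumptions: the parameter values are hypothetical, and the error term u is assumed, purely for illustration, to be drawn from a normal distribution with mean zero.

```python
import random


def simulate_clfpr(cunr_values, b1, b2, sigma, seed=0):
    """Return (CUNR, systematic part, observed CLFPR) for each CUNR value.

    observed = B1 + B2 * CUNR + u, with u drawn from N(0, sigma^2).
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    rows = []
    for cunr in cunr_values:
        systematic = b1 + b2 * cunr
        u = rng.gauss(0.0, sigma)
        rows.append((cunr, systematic, systematic + u))
    return rows


# Hypothetical parameter values and error spread.
rows = simulate_clfpr([4.0, 5.0, 6.0, 7.0], b1=70.0, b2=-0.5, sigma=0.3)
for cunr, systematic, observed in rows:
    print(cunr, systematic, round(observed, 2))
```

Setting sigma to zero removes the error term and reproduces the exact relationship of Equation (1.1); any positive sigma scatters the observed values around the line.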
Notice that the econometric model, Equation (1.2), is derived from the mathematical
model, Equation (1.1), which shows that mathematical economics and econometrics
are mutually complementary disciplines. This is clearly reflected in the definition of
econometrics given at the outset.
Before proceeding further, a warning regarding causation is in order. In the regres-
sion model, Equation (1.2), we have stated that CLFPR is the dependent variable and
CUNR is the independent, or explanatory, variable. Does that mean that the two vari-
ables are causally related; that is, is CUNR the cause and CLFPR the effect? In other
words, does regression imply causation? Not necessarily. As Kendall and Stuart note,
“A statistical relationship, however strong and however suggestive, can never establish
causal connection: our ideas of causation must come from outside statistics, ultimately
from some theory or other.”9 In our example, it is up to economic theory (e.g., the
discouraged-worker hypothesis) to establish the cause-and-effect relationship, if any,
between the dependent and explanatory variables. If causality cannot be established, it
is better to call the relationship, Equation (1.2), a predictive relationship: Given CUNR,
can we predict CLFPR?
8In statistical lingo, the random error term is known as the stochastic error term.
9M. G. Kendall and A. Stuart, The Advanced Theory of Statistics, Charles Griffin, New York, 1961, vol. 2,
chap. 26, p. 279.
5. Estimating the Parameters of the Chosen Econometric Model
Given the data on CLFPR and CUNR, such as that in Table 1-1, how do we estimate
the parameters of the model, Equation (1.2), namely, B1 and B2? That is, how do we
find the numerical values (i.e., estimates) of these parameters? This will be the focus
of our attention in Part II, where we develop the appropriate methods of computation,
especially the method of ordinary least squares (OLS). Using OLS and the data given in
Table 1-1, we obtained the following results:
$\widehat{\text{CLFPR}} = 69.4620 - 0.5814\,\text{CUNR}$ (1.3)
Note that we have put the symbol ^ on CLFPR (read as “CLFPR hat”) to remind us
that Equation (1.3) is an estimate of Equation (1.2). The estimated regression line is
shown in Figure 1-1, along with the actual data points.
As Equation (1.3) shows, the estimated value of B1 is ≈ 69.5 and that of B2 is ≈ –0.58,
where the symbol ≈ means approximately. Thus, if the unemployment rate goes up by
one unit (i.e., one percentage point), ceteris paribus, CLFPR is expected to decrease on
the average by about 0.58 percentage points; that is, as economic conditions worsen,
on average, there is a net decrease in the labor force participation rate of about 0.58
percentage points, perhaps suggesting that the discouraged-worker effect dominates.
We say “on the average” because the presence of the error term u, as noted earlier, is
likely to make the relationship somewhat imprecise. This is vividly seen in Figure 1-1,
where the points not on the estimated regression line are the actual participation rates
and the (vertical) distances between them and the points on the regression line are
the estimated values of u. As we will see in Chapter 2, these estimated values are called residuals.
In short, the estimated regression line, Equation (1.3), gives the relationship between
average CLFPR and CUNR, that is, on average, how CLFPR responds to a unit change
in CUNR. The value of about 69.5 suggests that the average value of CLFPR will be
about 69.5% if the CUNR is zero; that is, about 69.5% of the civilian working-age
population will participate in the labor force if there is full employment (i.e., zero
unemployment).10
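As a preview of how such estimates are computed, the following sketch applies the standard two-variable least-squares formulas to a small data set. The five observations are hypothetical and are not the Table 1-1 data, so the output will not reproduce Equation (1.3); the function name is likewise an assumption.

```python
def ols_simple(x, y):
    """Least-squares intercept, slope, and residuals for y = b1 + b2*x + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sum of cross-deviations divided by sum of squared x deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sxy / sxx
    # Intercept: the fitted line passes through the point of sample means.
    b1 = y_bar - b2 * x_bar
    # Residuals: actual values minus fitted values.
    residuals = [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]
    return b1, b2, residuals


# Hypothetical observations (not the Table 1-1 data).
cunr = [4.0, 5.0, 6.0, 7.0, 8.0]
clfpr = [67.2, 66.4, 66.1, 65.3, 65.0]

b1_hat, b2_hat, resid = ols_simple(cunr, clfpr)
print(round(b1_hat, 2), round(b2_hat, 2))
print([round(e, 2) for e in resid])
```

For these made-up numbers the fitted line has an intercept of about 69.3 and a slope of about -0.55, and the residuals sum to zero apart from rounding error.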
6. Checking for Model Adequacy: Model Specification Testing
How adequate is our model, Equation (1.3)? It is true that a person will take into
account labor market conditions as measured by, say, the unemployment rate before
entering the labor market. For example, in 1982 (a recession year), the civilian unem-
ployment rate was about 9.7%. Compared to that, in 2001, it was only 4.7%. A person
10This is, however, a mechanical interpretation of the intercept. We will see in Chapter 2 how to interpret the
intercept term meaningfully in a given context.
is more likely to be discouraged from entering the labor market when the unem-
ployment rate is more than 9% than when it is 5%. But other factors also enter into
labor force participation decisions. For example, hourly wages, or earnings, prevailing
in the labor market also will be an important decision variable. In the short run at
least, a higher wage may attract more workers to the labor market, other things
remaining the same (ceteris paribus). To see its importance, in Table 1-1, we have also
given data on real average hourly earnings (AHE82), where real earnings are measured
in 1982 dollars. To take into account the influence of AHE82, we now consider the
following model:
CLFPR = B1 + B2CUNR + B3AHE82 + u (1.4)
Equation (1.4) is an example of a multiple linear regression model, in contrast to
Equation (1.2), which is an example of a simple (two-variable or bivariate) linear regres-
sion model. In the two-variable model, there is a single explanatory variable, whereas in
a multiple regression, there are several, or multiple, explanatory variables. Notice that
in the multiple regression, Equation (1.4), we also have included the error term, u, for
no matter how many explanatory variables one introduces in the model, one cannot
fully explain the behavior of the dependent variable. How many variables one intro-
duces in the multiple regression is a decision that the researcher will have to make in a
given situation. Of course, the underlying economic theory will often tell what these
variables might be. However, keep in mind the warning given earlier that regression
does not mean causation; the relevant theory must determine whether one or more
explanatory variables are related to the dependent variable.
How do we estimate the parameters of the multiple regression, Equation (1.4)? We
cover this topic in Chapter 4, after we discuss the two-variable model in Chapters 2
and 3. We consider the two-variable case first because it is the building block of the
multiple regression model. As we shall see in Chapter 4, the multiple regression model
is in many ways a straightforward extension of the two-variable model.
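To suggest how the two-variable calculation extends to several explanatory variables, here is a minimal sketch that solves the least-squares normal equations for a model with an intercept and two regressors. Everything in it is an assumption made for illustration: the function names, the data, and the "true" coefficients (80, -0.6, -1.5) used to generate an error-free dependent variable so that the recovered values can be checked.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def ols_multiple(columns, y):
    """Least-squares coefficients [b1, b2, ...]; an intercept is added."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    # Normal equations: (X'X) b = X'y.
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical regressors; the dependent variable is generated without error
# from assumed coefficients so the estimates can be verified.
cunr = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
ahe82 = [8.0, 7.5, 8.2, 7.8, 8.1, 7.6]
clfpr = [80.0 - 0.6 * u_rate - 1.5 * wage for u_rate, wage in zip(cunr, ahe82)]

b1_hat, b2_hat, b3_hat = ols_multiple([cunr, ahe82], clfpr)
print(round(b1_hat, 4), round(b2_hat, 4), round(b3_hat, 4))
```

Because the generated data contain no error term, the procedure recovers the assumed coefficients up to rounding; with real data the estimates would differ from any underlying parameters.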
For our illustrative example, the empirical counterpart of Equation (1.4) is as follows
(these results are based on OLS):
$\widehat{\text{CLFPR}} = 81.2267 - 0.6384\,\text{CUNR} - 1.4449\,\text{AHE82}$ (1.5)
These results are interesting because both the slope coefficients are negative. The nega-
tive coefficient of CUNR suggests that, ceteris paribus (i.e., holding the influence of
AHE82 constant), a one-percentage-point increase in the unemployment rate leads,
on average, to about a 0.64-percentage-point decrease in CLFPR, perhaps once
again supporting the discouraged-worker hypothesis. On the other hand, holding the
influence of CUNR constant, an increase in real average hourly earnings of one dol-
lar, on average, leads to about a 1.44-percentage-point decline in CLFPR.11 Does the
negative coefficient for AHE82 make economic sense? Would one not expect a positive
coefficient—the higher the hourly earnings, the higher the attraction of the labor mar-
ket? However, one could justify the negative coefficient by recalling the twin concepts
of microeconomics, namely, the income effect and the substitution effect.12
Which model do we choose, Equation (1.3) or Equation (1.5)? Since Equation (1.5)
encompasses Equation (1.3) and adds an additional dimension (earnings) to the analy-
sis, we may choose Equation (1.5). After all, Equation (1.2) was based implicitly on the
assumption that variables other than the unemployment rate were held constant. But
where do we stop? For example, labor force participation may also depend on family
wealth, number of children under age 6 (this is especially critical for married women
thinking of joining the labor market), availability of daycare centers for young chil-
dren, religious beliefs, availability of welfare benefits, unemployment insurance, and
so on. Even if data on these variables are available, we may not want to introduce them
all in the model because the purpose of developing an econometric model is not to
capture total reality but just its salient features. If we decide to include every conceiv-
able variable in the regression model, the model will be so unwieldy that it will be of
little practical use. The model ultimately chosen should be a reasonably good replica of
the underlying reality, but keeping in mind the principle of parsimony or Ockham’s
razor. William Ockham (1285–1349), an English philosopher, held that a complicated
explanation should not be accepted without good reason, or as he put it, “It is vain to
do with more what can be done with less.” In Chapter 7, we will discuss this question
further and find out how one can go about developing a model.
7. Testing Hypotheses Derived From the Model
Having finally settled on a model, we may want to perform hypothesis testing. That
is, we may want to find out whether the estimated model makes economic sense and
whether the results obtained conform with the underlying economic theory. For
example, the discouraged-worker hypothesis postulates a negative relationship between
labor force participation and the unemployment rate. Is this hypothesis borne out
by our results? Our statistical results seem to be in conformity with this hypothesis
because the estimated coefficient of CUNR is negative.
11As we will discuss in Chapter 4, the coefficients of CUNR and AHE82 given in Equation (1.5) are
known as partial regression coefficients. In that chapter, we will discuss the precise meaning of partial regres-
sion coefficients.
12Consult any standard textbook on microeconomics. One intuitive justification of this result is as follows.
Suppose both spouses are in the labor force and the earnings of one spouse rise substantially. This may
prompt the other spouse to withdraw from the labor force without substantially affecting the family income.
However, hypothesis testing can be complicated. In our illustrative example, suppose
someone told us that in a prior study, the coefficient of CUNR was found to be about –1.
Are our results in agreement? If we rely on the model, Equation (1.3), we might get
one answer, but if we rely on Equation (1.5), we might get another answer. How do
we resolve this question? Although we will develop the necessary tools to answer such
questions, we should keep in mind that the answer to a particular hypothesis may
depend on the model we finally choose.
The point worth remembering is that in regression analysis, we may be interested not
only in estimating the parameters of the regression model but also in testing certain
hypotheses suggested by economic theory and/or prior empirical experience.
Although the basic principles of hypothesis testing are covered in a basic course in
statistics, Appendix D discusses this topic at some length as a refresher for the
reader.
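To see why the answer can depend on the model, the comparison with a prior value of –1 can be sketched as a simple t ratio, (estimate – hypothesized value) / standard error. The two slope estimates below are the CUNR coefficients as rounded in Table 1-3. The standard error is NOT reported in this chapter, so the value used here is a purely hypothetical placeholder, and the function name is likewise made up for illustration. This is a minimal sketch of the idea, not the procedure developed later in the book.

```python
def t_ratio(estimate, hypothesized, std_error):
    """Distance between an estimate and a hypothesized value, in standard-error units."""
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    return (estimate - hypothesized) / std_error


HYPOTHESIZED_B2 = -1.0   # coefficient of CUNR reported by the "prior study" in the text
ASSUMED_STD_ERROR = 0.1  # hypothetical placeholder; not a value from the book

# CUNR coefficients as rounded in Table 1-3
estimates = {
    "Equation (1.3)": -0.5814,
    "Equation (1.5)": -0.638,
}

for label, b2 in estimates.items():
    ratio = t_ratio(b2, HYPOTHESIZED_B2, ASSUMED_STD_ERROR)
    print(f"{label}: estimate = {b2}, t ratio against -1 = {ratio:.2f}")
```

With a real standard error from each fitted model, the two ratios would generally differ, which is the sense in which the conclusion depends on the model finally chosen.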
8. Using the Model for Prediction or Forecasting
Having gone through this multistage procedure, you can legitimately ask the follow-
ing question: What do we do with the estimated model, such as Equation (1.5)? Quite
naturally, we would like to use it for prediction, or forecasting. For instance, suppose
we have 2008 data on the CUNR and AHE82. Assume these values are 6.0 and 10,
respectively. If we put these values in Equation (1.5), we obtain 62.9473% as the pre-
dicted value of CLFPR for 2008. That is, if the unemployment rate in 2008 were 6.0%
and the real hourly earnings were $10, the civilian labor force participation rate for
2008 would be about 63%. Of course, when data on CLFPR for 2008 actually become
available, we can compare the predicted value with the actual value (see Problem 1.10).
The discrepancy between the two will represent the prediction error. Naturally, we
would like to keep the prediction error as small as possible.
TABLE 1-3  Summary of the Steps Involved in Econometric Analysis
Step                               Example
1. Statement of theory             The added/discouraged-worker hypothesis
2. Collection of data              Table 1-1
3. Mathematical model of theory    CLFPR = B1 + B2CUNR
4. Econometric model of theory     CLFPR = B1 + B2CUNR + u
5. Parameter estimation            CLFPR = 69.462 – 0.5814CUNR
6. Model adequacy check            CLFPR = 81.3 – 0.638CUNR – 1.445AHE82
7. Hypothesis test                 B2 < 0 or B2 > 0
8. Prediction                      What is CLFPR, given values of CUNR and AHE82?
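The prediction arithmetic can be sketched in a few lines. The coefficients below are the rounded values shown in Table 1-3 for the three-variable model; with them the prediction comes out at about 63.02%, slightly different from the 62.9473% quoted in the text, presumably because Equation (1.5) carries more decimal places. The function names are illustrative, not from the book.

```python
# Coefficients as rounded in Table 1-3 (step 6); Equation (1.5) may carry more decimals.
B1 = 81.3     # intercept
B2 = -0.638   # coefficient of CUNR (civilian unemployment rate, %)
B3 = -1.445   # coefficient of AHE82 (real average hourly earnings, 1982 dollars)


def predict_clfpr(cunr, ahe82):
    """Predicted civilian labor force participation rate (%) from the fitted model."""
    return B1 + B2 * cunr + B3 * ahe82


def prediction_error(actual, predicted):
    """Prediction error: actual value minus predicted value."""
    return actual - predicted


# Values assumed in the text for 2008: CUNR = 6.0, AHE82 = 10
predicted_2008 = predict_clfpr(cunr=6.0, ahe82=10.0)
print(f"Predicted CLFPR for 2008: {predicted_2008:.2f}%")
```

Once the actual 2008 figure is available, passing it to `prediction_error` together with `predicted_2008` gives the discrepancy discussed above.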
Although we examined econometric methodology using an example from labor eco-
nomics, we should point out that a similar procedure can be employed to analyze
quantitative relationships between variables in any field of knowledge. As a matter of
fact, regression analysis has been used in politics, international relations, psychology,
sociology, meteorology, and many other disciplines. As an example, see Problem 1.9.
1.4 THE ROAD AHEAD
Now that we have provided a glimpse of the nature and scope of econometrics, let us
see what lies ahead. The book is divided into four parts.
Part I introduces the reader to the bread-and-butter tool of econometrics, namely,
the classical linear regression model (CLRM). A thorough understanding of CLRM is a
must in order to follow research in the general areas of economics and business.
Part II considers the practical aspects of regression analysis and discusses a variety of
problems that the practitioner will have to tackle when one or more assumptions of
the CLRM do not hold.
Part III discusses two comparatively advanced topics, time-series econometrics and
panel data regression models.
Part IV, consisting of Appendixes A, B, C, and D, reviews the basics of probability
and statistics for the benefit of those readers whose knowledge of statistics has become
rusty. The reader should have some previous background in introductory statistics.
This book keeps the needs of the beginner in mind. The discussion of most topics
is straightforward and unencumbered with mathematical proofs, derivations, and so
on.13 I firmly believe that the apparently forbidding subject of econometrics can be
taught to beginners in such a way that they can see the value of the subject without
getting bogged down in mathematical and statistical minutiae. The student should
keep in mind that an introductory econometrics course is just like the introductory
statistics course he or she has already taken. As in statistics, econometrics is primar-
ily about estimation and hypothesis testing. What is different, and generally much
more interesting and useful, is that the parameters being estimated or tested are not
just means and variances but relationships between variables, which is what much of
economics and other social sciences is all about.
13 Some of the proofs and derivations are presented in our Basic Econometrics, 5th ed., McGraw-Hill, New
York, 2009. A more mathematical treatment is given in Damodar N. Gujarati, Linear Regression: A
Mathematical Introduction, SAGE, Los Angeles, 2018.
A final word: The availability of comparatively inexpensive computer software pack-
ages has now made econometrics readily accessible to beginners. In this book, we will
largely use four software packages: EViews, Excel, STATA, and MINITAB. These
packages are readily available and widely used. Once students get used to such pack-
ages, they will soon realize that learning econometrics is really great fun, and they will
have a better appreciation of the much maligned “dismal” science of economics.
KEY TERMS AND CONCEPTS
The key terms and concepts introduced in this chapter, and page numbers where they are referenced, are as follows:
Econometrics 1
Mathematical economics 2
Discouraged-worker hypothesis (effect) 4
Added-worker hypothesis (effect) 4
Time-series data: Quantitative and qualitative 5
High-frequency data 5
Flash trading 5
Autocorrelation 5
Stationary 5
Cross-sectional data 5
Heterogeneity 6
Size or scale effect 6
Pooled data 6
Panel (or longitudinal or micropanel) data 6
Scatter diagram (scattergram) 8
Parameters: Intercept and slopes 9
Random error term (error term) 10
Linear regression model: Dependent variable, independent (or explanatory) variable 10
Causation 10
Parameter estimates 11
Principle of parsimony or Ockham’s razor 13
Hypothesis testing 16
Prediction (forecasting) 16
QUESTIONS
1.1. Suppose a local government decides to increase
the tax rate on residential properties under its
jurisdiction. What will be the effect of this on the
prices of residential houses? Follow the eight-
step procedure discussed in the text to answer
this question.
1.2. How do you perceive the role of econometrics
in decision making in business and
government?
1.3. Suppose you are an economic adviser to the
chairman of the Federal Reserve Board (the
Fed), and he asks you whether it is advisable
to increase the money supply to bolster the
economy. What factors would you take into
account in your advice? How would you use
econometrics in your advice?
1.4. To reduce the dependence on foreign oil
supplies, the government is thinking of
increasing the federal taxes on gasoline.
Suppose the Ford Motor Company has hired
you to assess the impact of the tax increase
on the demand for its cars. How would you go
about advising the company?
1.5. President Joe Biden plans to propose to the
U.S. Congress an infrastructure investment
plan (highways, bridges, tunnels, etc.) at
a cost of about $2 trillion. To pay for this,
he also plans to increase the tax rate on
high-income earners as well as private
corporations, although the details are yet
to be worked out. How would you design an
econometric study to assess the economic
consequences, both short term and long
term, of his proposal?
PROBLEMS
1.6. Table 1-4 on the book's website gives monthly
data on the closing prices of the Dow Jones
Industrial Average and the Standard & Poor's
500 stock market indexes. The data are from
Yahoo Finance's historical stock quotations
page.
a. Plot these data with time on the horizontal
axis and the two variables on the vertical
axis. If you prefer, you may use a separate
figure for each variable.
b. What relationships do you expect to find
between the two indexes? Why?
c. For each variable, “eyeball” a regression
line from the scattergram.
d. Obtain monthly data for the two variables
for the period from January 2012 to
December 2020 and repeat questions a, b,
and c and find out if there are any changes
in the results. If so, what might account
for the change?
1.7. Table 1-5 on the book's website gives data on
the exchange rate between the U.K. pound
and the U.S. dollar (number of U.K. pounds
per U.S. dollar), as well as the consumer
price indexes in the two countries for the
period 1985–2007.
a. Plot the exchange rate (ER) and the two
consumer price indexes against time,
measured in years.
b. Divide the U.S. CPI by the U.K. CPI and call
it the relative price ratio (RPR).
c. Plot ER against RPR.
d. Visually sketch a regression line through
the scatter points.
e. Update the data in Table 1-5 to year 2020.
Repeat questions a, b, c, and d and find
out if there are any changes in the results.
What accounts for the change, if any, in
the results?
1.8. Table 1-6 on the textbook website contains
data on 1,247 cars for 2008.14 To find out whether
there is a relationship between a car’s
MPG (miles per gallon) and the number of
cylinders it has:
a. Create a scatterplot of the combined MPG
for the vehicles based on the number of
cylinders.
b. Sketch a line that seems to fit the
data.
c. What type of relationship is indicated by
the plot?
14Data were collected from the U.S. Department of Energy website at http://www.fueleconomy.gov/.
1.9. Table 1-7 on the book’s website gives data on
Corruption Perception Index and GDP per worker.
a. Plot Corruption Perception Index against
GDP per worker.
b. A priori, what kind of relationship do you
expect between the two variables?
c. Does the scattergram suggest that the
relationship between the two variables is
linear (i.e., a straight line)? If so, sketch
the regression line.
1.10. Table 1-2 on the website updates the data
given in Table 1-1 for the years 2001–2016.
For the years 2001–2007, the CLFPR and CUNR
figures are the same as those shown in Table
1-1. However, the AHE82 figures differ in the
two periods. As pointed out in the text, the
differences are usually due to data revisions.
a. Plot CLFPR and CUNR as in Figure 1-1.
What difference do you see in the two
scattergrams?
b. Is the relationship between the variables
linear as in Figure 1-1? If so, visually
sketch a regression line through the
scatterplot.
c. Is there a “break” in the data in the sense
that after a certain date, the relationship
between the two variables has changed?
Can you spot that break point?
d. Based on the data in Table 1-2, the
regression results corresponding to
Equation (1.3) are as follows:
$$\widehat{\mathrm{CLFPR}} = 66.5245 - 0.2465\,\mathrm{CUNR}$$
How does this regression differ from the one
shown in Equation (1.3)? What may be the
reason for the difference?
e. The regression results corresponding to
Equation (1.5) using the data in Table 1-2
are as follows:
$$\widehat{\mathrm{CLFPR}} = 112.1761 + 0.0150\,\mathrm{CUNR} - 5.4385\,\mathrm{AHE82}$$
How does this regression differ from the one
shown in Equation (1.5)? What might explain
the difference between the two regression
results?
Note: The full results of the preceding two
regressions will be discussed in Chapter 3
after we discuss the theory behind regression
analysis.
1.11. Table 1-8 on the book's website gives
quarterly data on real personal consumption
expenditure (RPCE) and real personal
disposable (after-tax) income (RPDI) for the
years 2014–2019.
a. Plot RPCE and RPDI on the same graph.
What is your impression about the two
time series?
b. Graph RPCE against RPDI. What does the
scattergram show?
c. Visually sketch a regression line through
the scatter points. What does it show?
d. Save the data for further analysis in
subsequent chapters.
1.12. Based on the data for 1947–2002, Kellstedt and
Whitten obtained the following regression:15
Mt = 74.00 – 2.71GDPt
where M = percentage of households in which
a married couple is present and GDP = gross
domestic product.
a. Does this result make sense?
b. How would you interpret the regression?
c. Is there a cause-and-effect relationship
between the two variables?
d. The regression results given above may be
an example of what is called spurious or
nonsense regression. We may have more
to say about it in a later chapter.
15 Paul M. Kellstedt and Guy D. Whitten, The Fundamentals of Political Science Research, Cambridge University Press, 2nd ed.,
New York, 2013, p. 262.
1.13. Table 1-9 on the book's website gives data
on the following variables for 99 countries
obtained from the Human Development Report
for 1994.
LifeExp = 1992 life expectancy at birth
TV = Televisions per 100 people
PopDoc = Population per doctor (1990)
GDP = real GDP per person adjusted for PPP
(purchasing power parity)
a. Plot life expectancy against each of the
other variables in separate graphs.
b. A priori, what do you expect the
relationship is between LifeExp and each
of the other variables: positive, negative,
or no relationship?
SUGGESTIONS FOR FURTHER READING
“The Usefulness of Applied Econometrics to the
Policy Maker,” Address by R. Frances, President,
Federal Reserve Bank of St. Louis, at the National
Association of Business Economists Seminar,
Chicago, Illinois, April 4, 1973, Federal Reserve Bank of
St. Louis, May 1973.
“What Is Econometrics?” International Monetary
Fund, Finance and Development, December 2011,
vol. 48, no. 4 (https://www.imf.org/external/pubs/ft/fandd/2011/12/basics.htm).
On corruption, read https://ourworldindata.org/
corruption.
APPENDIX 1A: Economic Data on the World Wide Web16
Economic Statistics Briefing Room: An excellent
source of data on output, income, employment,
unemployment, earnings, production and business
activity, prices and money, credits and security
markets, and international statistics.
http://www.whitehouse.gov/fsbr/esbr.htm
Federal Reserve System Beige Book: Gives a summary
of current economic conditions by the Federal Reserve
District. There are 12 Federal Reserve Districts.
www.federalreserve.gov/FOMC/BeigeBook/2008
National Bureau of Economic Research (NBER)
Home Page: This highly regarded private economic
research institute has extensive data on asset prices,
labor, productivity, money supply, business cycle
indicators, and so on. NBER has many links to other
websites.
http://www.nber.org
Panel Study: Provides data on longitudinal survey
of representative sample of U.S. individuals and
families. These data have been collected annually
since 1968.
http://www.umich.edu/~psid
The Federal Web Locator: Provides information on
almost every sector of the federal government; has
international links.
www.lib.auburn.edu/madd/docs/fedloc.html
WebEC: Resources in Economics: A most
comprehensive library of economic facts and figures.
www.helsinki.fi/WebEc
American Stock Exchange: Information on some 700
companies listed on the second largest stock market.
http://www.amex.com/
16 It should be noted that this list is by no means exhaustive. The sources listed here are updated continually.
Bureau of Economic Analysis (BEA) Home Page: This
agency of the U.S. Department of Commerce, which
publishes the Survey of Current Business, is an excellent
source of data on all kinds of economic activities.
www.bea.gov
Business Cycle Indicators: You will find data on about
256 economic time series.
http://www.globalexposure.com/bci.html
CIA Publication: You will find the World Fact Book
(annual).
www.cia.gov/library/publications
Energy Information Administration (Department of
Energy [DOE]): Economic information and data on each
fuel category.
http://www.eia.doe.gov/
FRED Database: Federal Reserve Bank of St. Louis
publishes historical economic and social data,
which include interest rates, monetary and business
indicators, exchange rates, and so on.
http://www.stls.frb.org/fred/
International Trade Administration: Offers many web
links to trade statistics, cross-country programs,
and so on.
http://www.ita.doc.gov/
STAT-USA Databases: The National Trade Data
Bank provides the most comprehensive source
of international trade data and export promotion
information. It also contains extensive data on
demographic, political, and socioeconomic conditions
for several countries.
http://www.stat-usa.gov/
Bureau of Labor Statistics: The home page contains
data related to various aspects of employment,
unemployment, and earnings and provides links to
other statistical websites.
http://stats.bls.gov
U.S. Census Bureau Home Page: Prime source of
social, demographic, and economic data on income,
employment, income distribution, and poverty.
http://www.census.gov/
General Social Survey: Annual personal interview
survey data on U.S. households that began in 1972.
More than 35,000 have responded to some 2,500
different questions covering a variety of data.
www.norc.org/GCS+Website
Institute for Research on Poverty: Data collected by
nonpartisan and nonprofit university-based research
center on a variety of questions relating to poverty
and social inequality.
http://www.ssc.wisc.edu/irp/
Social Security Administration: The official website of
the Social Security Administration with a variety of
data.
http://www.ssa.gov
Federal Deposit Insurance Corporation, Bank Data and
Statistics
http://www.fdic.gov/bank/statistical/
Federal Reserve Board, Economic Research and Data
http://www.federalreserve.gov/econresdata
U.S. Census Bureau, Home Page
http://www.census.gov
U.S. Department of Energy, Energy Information
Administration
www.eia.doe.gov/overview_hd.html
U.S. Department of Health and Human Services,
National Center for Health Statistics
http://www.cdc.gov/nchs
U.S. Department of Housing and Urban Development,
Data Sets
http://www.huduser.org/datasets/pdrdatas.html
U.S. Department of Labor, Bureau of Labor Statistics
http://www.bls.gov
U.S. Department of Transportation, TranStats
http://www.transtats.bts.gov
U.S. Department of the Treasury, Internal Revenue
Service, Tax Statistics
http://www.irs.gov/taxstats
Rockefeller Institute of Government, State and Local
Fiscal Data
www.rockinst.org/research/sl_finance
American Economic Association, Resources for
Economists
http://www.rfe.org
American Statistical Association, Business and
Economic Statistics
www.amstat.org/publications/jbes
American Statistical Association, Statistics in Sports
http://www.amstat.org/sections/sis/
European Central Bank, Statistics
http://www.ecb.int/stats
World Bank, Data and Statistics
http://www.worldbank.org/data
International Monetary Fund, Statistical Topics
http://www.imf.org/external/np/sta/
Penn World Tables
http://pwt.econ.upenn.edu
Current Population Survey
http://www.bls.census.gov/cps/
Consumer Expenditure Survey
http://www.bls.gov/cex/
Survey of Consumer Finances
http://www.federalreserve.gov/pubs/oss/
City and County Data Book
http://www.census.gov/statab/www/ccdb.html
Panel Study of Income Dynamics
http://psidonline.isr.umich.edu
National Longitudinal Surveys
http://www.bls.gov/nls/
National Association of Home Builders, Economic and
Housing Data
http://www.nahb.org/page.aspx/category/
sectionID=113
National Science Foundation, Division of Science
Resources Statistics
http://www.nsf.gov/sbe/srs/
Economic Report of the President
http://www.gpoaccess.gov/eop/
Various Economic Data Sets
http://www.economy.com/freelunch/
The Economist Market Indicators
http://www.economist.com/markets/indicators
Statistical Resources on the Military
http://www.lib.umich.edu/govdocs/stmil.html
World Economic Indicators
http://devdata.worldbank.org/
Economic Time Series Data
http://www.economagic.com/
United Nations Population Division's Annual Estimates
and Projections
http://unstats.un.org/unsd/default.htm
United Nations Statistics Division-UNdata
http://data.un.org/Default.aspx
World Bank Data
http://databank.worldbank.org/
PART I
THE LINEAR REGRESSION MODEL
The objective of Part I, which consists of five chapters, is to introduce the reader
to the “bread-and-butter” tool of econometrics, namely, the linear regression
model.
Chapter 2 discusses the basic ideas of linear regression in terms of the simplest
possible linear regression model, in particular, the two-variable model. We make
an important distinction between the population regression model and the sample
regression model and estimate the former from the latter. This estimation is done
using the method of least squares, one of the popular methods of estimation.1
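As a preview of what least-squares estimation involves, here is a minimal sketch of the standard closed-form formulas for the two-variable model: the slope is the sum of cross-products of deviations from the means divided by the sum of squared deviations of X, and the intercept follows from the sample means. The function name and the tiny data set are made up for illustration and do not come from the book; the data lie exactly on the line Y = 2 + 3X so the result is easy to check by hand.

```python
def ols_two_variable(x, y):
    """Least-squares intercept (b1) and slope (b2) for the model Y = B1 + B2*X + u."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, with at least two points")
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares, both in deviations from the means
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x must vary for the slope to be defined")
    b2 = s_xy / s_xx
    b1 = y_bar - b2 * x_bar
    return b1, b2


# Illustrative data (hypothetical): points lying exactly on Y = 2 + 3X
x = [1.0, 2.0, 3.0, 4.0]
y = [5.0, 8.0, 11.0, 14.0]

b1, b2 = ols_two_variable(x, y)
print(f"intercept b1 = {b1}, slope b2 = {b2}")
```

With real data the points scatter around the line, and the same formulas pick the line that makes the sum of squared residuals as small as possible.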
Chapter 3 considers hypothesis testing. As in any hypothesis testing in statistics,
we try to find out whether the estimated values of the parameters of the regression
model are compatible with the hypothesized values of the parameters. We do this
hypothesis testing in the context of the classical linear regression model (CLRM).
We discuss why the CLRM is used and point out that the CLRM is a useful start-
ing point. In Part II, we will reexamine the assumptions of the CLRM to see what
happens to the CLRM if one or more of its assumptions are not fulfilled.
Chapter 4 extends the idea of the two-variable linear regression model developed
in the previous two chapters to multiple regression models, that is, models having
more than one explanatory variable. Although in many ways the multiple regres-
sion model is an extension of the two-variable model, there are differences when it
comes to interpreting the coefficients of the model and in the hypothesis-testing
procedure.
The linear regression model, whether two-variable or multivariable, only requires
that the parameters of the model be linear; the variables entering the model need
not themselves be linear.
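To make the distinction concrete, here is an illustrative set of specifications (not taken from the text), written with the B1, B2, and u notation used in Chapter 1. The first two are linear in the parameters even though the variables enter nonlinearly; the third is not, because B2 enters as a square.

```latex
\begin{align*}
Y &= B_1 + B_2 X^2 + u && \text{(linear in the parameters, nonlinear in } X\text{)} \\
\ln Y &= B_1 + B_2 \ln X + u && \text{(linear in the parameters after a log transformation)} \\
Y &= B_1 + B_2^{2} X + u && \text{(not linear in the parameter } B_2\text{)}
\end{align*}
```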
1An alternative is the method of maximum likelihood (ML), which we do not discuss in this text because
it is mathematically a bit complex. For an introduction to ML, see Damodar Gujarati, Econometrics by
Example, 2nd ed., Palgrave-Macmillan, London, 2015, pp. 25−26.
Chapter 5 considers a variety of models that are linear in the parameters (or can
be made so) but are not necessarily linear in the variables. With several illustrative
examples, we point out how and where such models can be used.
Often the explanatory variables entering into a regression model are qualitative in
nature, such as sex, race, and religion. Chapter 6 shows how such variables can be
measured and how they enrich the linear regression model by taking into account the
influence of variables that otherwise cannot be quantified. This chapter also briefly
considers models in which the dependent variable is itself a dummy, or qualitative, variable.
Part I makes an effort to “wed” practice to theory. The availability of user-friendly
regression packages allows you to estimate a regression model without knowing much
theory, but remember the adage that “a little knowledge is a dangerous thing.” So even
though theory may be boring, it is absolutely essential in understanding and interpret-
ing regression results. Besides, by omitting all mathematical derivations, we have made
the theory “less boring.”

More Related Content

Similar to eBook PDF textbook - Essentials of Econometrics, 5e Damodar Gujarati.pdf

Business Statistics_ Problems and Solutions.pdf
Business Statistics_ Problems and Solutions.pdfBusiness Statistics_ Problems and Solutions.pdf
Business Statistics_ Problems and Solutions.pdfJapneetDhillon1
 
Mathematical Econometrics
Mathematical EconometricsMathematical Econometrics
Mathematical Econometricsjonren
 
Data analysis on Weekly Sales Transaction of products over 52 weeks
Data analysis on Weekly Sales Transaction of products over 52 weeksData analysis on Weekly Sales Transaction of products over 52 weeks
Data analysis on Weekly Sales Transaction of products over 52 weeksSadegh Bamohabbat
 
Morris H. DeGroot, Mark J. Schervish - Probability and Statistics (4th Editio...
Morris H. DeGroot, Mark J. Schervish - Probability and Statistics (4th Editio...Morris H. DeGroot, Mark J. Schervish - Probability and Statistics (4th Editio...
Morris H. DeGroot, Mark J. Schervish - Probability and Statistics (4th Editio...AkbarHidayatullah11
 
Evaluation of the reliability for L2 speech rating in discourse completion te...
Evaluation of the reliability for L2 speech rating in discourse completion te...Evaluation of the reliability for L2 speech rating in discourse completion te...
Evaluation of the reliability for L2 speech rating in discourse completion te...早稲田大学
 
Return on investment
Return on investmentReturn on investment
Return on investmenthprabowo
 
15583198 a-course-in-mathematical-statistics
15583198 a-course-in-mathematical-statistics15583198 a-course-in-mathematical-statistics
15583198 a-course-in-mathematical-statisticsruthtulay
 
James_F_Epperson_An_Introduction_to_Numerical_Methods_and_Analysis.pdf
James_F_Epperson_An_Introduction_to_Numerical_Methods_and_Analysis.pdfJames_F_Epperson_An_Introduction_to_Numerical_Methods_and_Analysis.pdf
James_F_Epperson_An_Introduction_to_Numerical_Methods_and_Analysis.pdfFahimSiddiquee2
 
Factor Analysis for Exploratory Studies
Factor Analysis for Exploratory StudiesFactor Analysis for Exploratory Studies
Factor Analysis for Exploratory StudiesManohar Pahan
 
Econometric Methods for Labour Economics by Stephen Bazen
 Econometric Methods for Labour Economics by Stephen Bazen Econometric Methods for Labour Economics by Stephen Bazen
Econometric Methods for Labour Economics by Stephen BazenAnissa ATMANI
 
Uop qnt-561-week-1-assignment-statistics-concepts-and-descriptive-measures-in...
Uop qnt-561-week-1-assignment-statistics-concepts-and-descriptive-measures-in...Uop qnt-561-week-1-assignment-statistics-concepts-and-descriptive-measures-in...
Uop qnt-561-week-1-assignment-statistics-concepts-and-descriptive-measures-in...assignmentindi
 
Propensity score analysis__statistical_methods_and_applications_advanced_quan...
Propensity score analysis__statistical_methods_and_applications_advanced_quan...Propensity score analysis__statistical_methods_and_applications_advanced_quan...
Propensity score analysis__statistical_methods_and_applications_advanced_quan...Israel Vargas
 
Common Core Warm-Ups.pdf
Common Core Warm-Ups.pdfCommon Core Warm-Ups.pdf
Common Core Warm-Ups.pdfMarjoCeloso1
 
Descriptive Statistics, Numerical Description
Descriptive Statistics, Numerical DescriptionDescriptive Statistics, Numerical Description
Descriptive Statistics, Numerical Descriptiongetyourcheaton
 
SIT095_Lecture_9_Logistic_Regression_Part_3.pptx
SIT095_Lecture_9_Logistic_Regression_Part_3.pptxSIT095_Lecture_9_Logistic_Regression_Part_3.pptx
SIT095_Lecture_9_Logistic_Regression_Part_3.pptxdawitg2
 
Multiplicative number theory i.classical theory cambridge
Multiplicative number theory i.classical theory cambridgeMultiplicative number theory i.classical theory cambridge
Multiplicative number theory i.classical theory cambridgeManuel Jesùs Saavedra Jimènez
 
Economic Dynamics-Phase Diagrams and their Application
Economic Dynamics-Phase Diagrams and their ApplicationEconomic Dynamics-Phase Diagrams and their Application
Economic Dynamics-Phase Diagrams and their ApplicationEce Acardemirci
 
Multiple Regression.ppt
Multiple Regression.pptMultiple Regression.ppt
Multiple Regression.pptTanyaWadhwani4
 

Similar to eBook PDF textbook - Essentials of Econometrics, 5e Damodar Gujarati.pdf (20)

Modern analytical chemistry
Modern analytical chemistryModern analytical chemistry
Modern analytical chemistry
 
Business Statistics_ Problems and Solutions.pdf
Business Statistics_ Problems and Solutions.pdfBusiness Statistics_ Problems and Solutions.pdf
Business Statistics_ Problems and Solutions.pdf
 
Mathematical Econometrics
Mathematical EconometricsMathematical Econometrics
Mathematical Econometrics
 
Data analysis on Weekly Sales Transaction of products over 52 weeks
Data analysis on Weekly Sales Transaction of products over 52 weeksData analysis on Weekly Sales Transaction of products over 52 weeks
Data analysis on Weekly Sales Transaction of products over 52 weeks
 
Morris H. DeGroot, Mark J. Schervish - Probability and Statistics (4th Editio...
Morris H. DeGroot, Mark J. Schervish - Probability and Statistics (4th Editio...Morris H. DeGroot, Mark J. Schervish - Probability and Statistics (4th Editio...
Morris H. DeGroot, Mark J. Schervish - Probability and Statistics (4th Editio...
 
Evaluation of the reliability for L2 speech rating in discourse completion te...
Evaluation of the reliability for L2 speech rating in discourse completion te...Evaluation of the reliability for L2 speech rating in discourse completion te...
Evaluation of the reliability for L2 speech rating in discourse completion te...
 
Return on investment
Return on investmentReturn on investment
Return on investment
 
15583198 a-course-in-mathematical-statistics
15583198 a-course-in-mathematical-statistics15583198 a-course-in-mathematical-statistics
15583198 a-course-in-mathematical-statistics
 
James_F_Epperson_An_Introduction_to_Numerical_Methods_and_Analysis.pdf
James_F_Epperson_An_Introduction_to_Numerical_Methods_and_Analysis.pdfJames_F_Epperson_An_Introduction_to_Numerical_Methods_and_Analysis.pdf
James_F_Epperson_An_Introduction_to_Numerical_Methods_and_Analysis.pdf
 
Factor Analysis for Exploratory Studies
Factor Analysis for Exploratory StudiesFactor Analysis for Exploratory Studies
Factor Analysis for Exploratory Studies
 
Econometric Methods for Labour Economics by Stephen Bazen
 Econometric Methods for Labour Economics by Stephen Bazen Econometric Methods for Labour Economics by Stephen Bazen
Econometric Methods for Labour Economics by Stephen Bazen
 
Uop qnt-561-week-1-assignment-statistics-concepts-and-descriptive-measures-in...
Uop qnt-561-week-1-assignment-statistics-concepts-and-descriptive-measures-in...Uop qnt-561-week-1-assignment-statistics-concepts-and-descriptive-measures-in...
Uop qnt-561-week-1-assignment-statistics-concepts-and-descriptive-measures-in...
 
Propensity score analysis__statistical_methods_and_applications_advanced_quan...
Propensity score analysis__statistical_methods_and_applications_advanced_quan...Propensity score analysis__statistical_methods_and_applications_advanced_quan...
Propensity score analysis__statistical_methods_and_applications_advanced_quan...
 
Applied statistics
Applied statisticsApplied statistics
Applied statistics
 
Common Core Warm-Ups.pdf
Common Core Warm-Ups.pdfCommon Core Warm-Ups.pdf
Common Core Warm-Ups.pdf
 
Descriptive Statistics, Numerical Description
Descriptive Statistics, Numerical DescriptionDescriptive Statistics, Numerical Description
Descriptive Statistics, Numerical Description
 
SIT095_Lecture_9_Logistic_Regression_Part_3.pptx
SIT095_Lecture_9_Logistic_Regression_Part_3.pptxSIT095_Lecture_9_Logistic_Regression_Part_3.pptx
SIT095_Lecture_9_Logistic_Regression_Part_3.pptx
 
Multiplicative number theory i.classical theory cambridge
Multiplicative number theory i.classical theory cambridgeMultiplicative number theory i.classical theory cambridge
Multiplicative number theory i.classical theory cambridge
 
Economic Dynamics-Phase Diagrams and their Application
Economic Dynamics-Phase Diagrams and their ApplicationEconomic Dynamics-Phase Diagrams and their Application
Economic Dynamics-Phase Diagrams and their Application
 
Multiple Regression.ppt
Multiple Regression.pptMultiple Regression.ppt
Multiple Regression.ppt
 

eBook PDF textbook - Essentials of Econometrics, 5e Damodar Gujarati.pdf

  • 3. I dedicate this book to Joan Gujarati, Diane Gujarati-Chesnut, Charles Chesnut, and my grandchildren, "Tommy" and Laura Chesnut, and to my dear friend Karen Low. Sara Miller McCune founded SAGE Publishing in 1965 to support the dissemination of usable knowledge and educate a global community. SAGE publishes more than 1000 journals and over 600 new books each year, spanning a wide range of subject areas. Our growing selection of library products includes archives, data, case studies and video. SAGE remains majority owned by our founder and after her lifetime will become owned by a charitable trust that secures the company’s continued independence. Los Angeles | London | New Delhi | Singapore | Washington DC | Melbourne
  • 4. Essentials of Econometrics Fifth Edition Damodar N. Gujarati Professor Emeritus of Economics The United States Military Academy at West Point
  • 5. FOR INFORMATION: SAGE Publications, Inc. 2455 Teller Road Thousand Oaks, California 91320 E-mail: order@sagepub.com SAGE Publications Ltd. 1 Oliver’s Yard 55 City Road London EC1Y 1SP United Kingdom SAGE Publications India Pvt. Ltd. B 1/I 1 Mohan Cooperative Industrial Area Mathura Road, New Delhi 110 044 India SAGE Publications Asia-Pacific Pte. Ltd. 18 Cross Street #10-10/11/12 China Square Central Singapore 048423 Acquisitions Editor: Helen Salmon Product Associate: Kenzie Offley Production Editor: Rebecca Lee Copy Editor: Gillian Dickens Typesetter: C&M Digitals (P) Ltd. Indexer: Integra Cover Designer: Scott Van Atta Marketing Manager: Victoria Velasquez Copyright © 2022 by Damodar N. Gujarati All rights reserved. Except as permitted by U.S. copyright law, no part of this work may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without permission in writing from the publisher. All third-party trademarks referenced or depicted herein are included solely for the purpose of illustration and are the property of their respective owners. Reference to these trademarks in no way indicates any relationship with, or endorsement by, the trademark owner. Printed in the United States of America Library of Congress Cataloging-in-Publication Data Names: Gujarati, Damodar N., author. Title: Essentials of econometrics / Damodar N. Gujarati, The United States Military Academy at West Point. Description: Fifth edition. | Thousand Oaks, California : SAGE, [2022] | Includes bibliographical references and index. Identifiers: LCCN 2021012627 | ISBN 978-1-0718-5039-8 (paperback ; alk. paper) | ISBN 978-1-0718-5040-4 (epub) | ISBN 978-1-0718-5041-1 (epub) | ISBN 978-1-0718-5042-8 (pdf) Subjects: LCSH: Econometrics. | Economics—Statistical methods. Classification: LCC HB139 .G85 2022 | DDC 330.01/5195—dc23 LC record available at https://lccn.loc.gov/2021012627 This book is printed on acid-free paper. 
21 22 23 24 25 10 9 8 7 6 5 4 3 2 1
  • 6. BRIEF CONTENTS
      Acknowledgments
      Preface
      About the Author
      Chapter 1 • The Nature and Scope of Econometrics
      PART I • THE LINEAR REGRESSION MODEL
      Chapter 2 • Basic Ideas of Linear Regression: The Two-Variable Model
      Chapter 3 • The Two-Variable Model: Hypothesis Testing
      Chapter 4 • Multiple Regression: Estimation and Hypothesis Testing
      Chapter 5 • Functional Forms of Regression Models
      Chapter 6 • Qualitative or Dummy Variable Regression Models
      PART II • REGRESSION ANALYSIS IN PRACTICE
      Chapter 7 • Model Selection: Criteria and Tests
      Chapter 8 • Multicollinearity: What Happens If Explanatory Variables Are Correlated?
      Chapter 9 • Heteroscedasticity: What Happens If the Error Variance Is Nonconstant?
      Chapter 10 • Autocorrelation: What Happens If Error Terms Are Correlated?
      PART III • ADVANCED TOPICS IN ECONOMETRICS
      Chapter 11 • Elements of Time-Series Econometrics
      Chapter 12 • Panel Data Regression Models
      Appendix A: Review of Statistics: Probability and Probability Distributions
      Appendix B: Characteristics of Probability Distributions
      Appendix C: Some Important Probability Distributions
      Appendix D: Statistical Inference: Estimation and Hypothesis Testing
      Appendix E: Statistical Tables
      Index
  • 8. DETAILED CONTENTS
      Acknowledgments
      Preface
      About the Author
      Chapter 1 • The Nature and Scope of Econometrics
        1.1 What Is Econometrics?
        1.2 Why Study Econometrics?
        1.3 The Methodology of Econometrics
          1. The Object of Research
          2. Collecting Data
          3. Specifying the Mathematical Model of Labor Force Participation
          4. Specifying the Statistical, or Econometric, Model of Labor Force Participation
          5. Estimating the Parameters of the Chosen Econometric Model
          6. Checking for Model Adequacy: Model Specification Testing
          7. Testing Hypothesis Derived From the Model
          8. Using the Model for Prediction or Forecasting
        1.4 The Road Ahead
        Key Terms and Concepts
        Questions
        Problems
        Appendix 1A: Economic Data on the World Wide Web
      PART I • THE LINEAR REGRESSION MODEL
      Chapter 2 • Basic Ideas of Linear Regression: The Two-Variable Model
        2.1 The Meaning of Regression
        2.2 The Population Regression Function (PRF): A Hypothetical Example
        2.3 Statistical or Stochastic Specification of the Population Regression Function
        2.4 The Nature of the Stochastic Error Term
        2.5 The Sample Regression Function (SRF)
        2.6 The Special Meaning of the Term Linear Regression
          Linearity in the Variables
          Linearity in the Parameters
        2.7 Two-Variable Versus Multiple Linear Regression
        2.8 Estimation of Parameters: The Method of Ordinary Least Squares
          The Method of Ordinary Least Squares
  • 9.
        2.9 Putting It All Together
          Interpretation of the Estimated Math SAT Score Function
        2.10 Some Illustrative Examples
        2.11 Summary
        Key Terms and Concepts
        Questions
        Problems
        Optional Questions
        Appendix 2A: Derivation of Least Squares Estimators
      Chapter 3 • The Two-Variable Model: Hypothesis Testing
        3.1 The Classical Linear Regression Model
        3.2 Variances and Standard Errors of Ordinary Least Squares Estimators
          Variances and Standard Errors of the Math SAT Example
          Summary of the Math SAT Score Function
        3.3 Why OLS? Properties of OLS Estimators
          Gauss–Markov Theorem
        3.4 The Sampling, or Probability, Distributions of OLS Estimators
          Central Limit Theorem
        3.5 Hypothesis Testing
          Testing H₀: B₂ = 0 Versus H₁: B₂ ≠ 0: The Confidence Interval Approach
          The Test of Significance Approach to Hypothesis Testing
          Math SAT Example Continued
        3.6 Hypothesis Testing: Some Practical Aspects
        3.7 How Good Is the Fitted Regression Line: The Coefficient of Determination, r²
          Formulas to Compute r²
          r² for the Math SAT Example
          The Coefficient of Correlation, r
        3.8 Reporting the Results of Regression Analysis
        3.9 Illustrative Examples
          1. Relationship Between Wages and Productivity in the Business Sector, USA, 1959–2006
          2. Expenditure on Education and Income in the 50 U.S. States for 2000
          3. CEO Salaries of 447 Fortune 500 Companies for 1999
          4. Impact of Advertising Expenditure on Viewers
        3.10 Comments on the Illustrative Examples
        3.11 Forecasting
        3.12 Normality Tests
          Histograms of Residuals
          Jarque–Bera Test
        3.13 Summary
        Key Terms and Concepts
        Questions
        Problems
  • 10.
      Chapter 4 • Multiple Regression: Estimation and Hypothesis Testing
        4.1 The Three-Variable Linear Regression Model
          The Meaning of Partial Regression Coefficient
        4.2 Assumptions of the Multiple Linear Regression Model
        4.3 Estimation of the Parameters of Multiple Regression
          Ordinary Least Squares Estimators
          Variance and Standard Errors of OLS Estimators
          Properties of OLS Estimators of Multiple Regression
        4.4 Goodness of Fit of Estimated Multiple Regression: Multiple Coefficient of Determination, R²
        4.5 Antique Clock Auction Prices Revisited
          Interpretation of the Regression Results
        4.6 Hypothesis Testing in a Multiple Regression: General Comments
        4.7 Testing Hypotheses About Individual Partial Regression Coefficients
          The Test of Significance Approach
          The Confidence Interval Approach to Hypothesis Testing
        4.8 Testing the Joint Hypothesis That B₂ = B₃ = 0 or R² = 0
          An Important Relationship Between F and R²
        4.9 Two-Variable Regression in the Context of Multiple Regression: Introduction to Specification Bias
        4.10 Comparing Two R² Values: The Adjusted R²
        4.11 When to Add an Additional Explanatory Variable to a Model
        4.12 Restricted Least Squares
        4.13 Illustrative Examples
          Discussion of Regression Results
        4.14 Summary
        Key Terms and Concepts
        Questions
        Problems
        Appendix 4A.1: Derivations of OLS Estimators
        Appendix 4A.2: Derivation of Equation (4.31)
        Appendix 4A.3: Derivation of Equation (4.49)
      Chapter 5 • Functional Forms of Regression Models
        5.1 How to Measure Elasticity: The Log-Linear Model
          Hypothesis Testing in Log-Linear Models
        5.2 Multiple Log-Linear Regression Models
        5.3 How to Measure the Growth Rate: The Semilog Model
          Instantaneous Versus Compound Rate of Growth
          The Linear Trend Model
        5.4 The Lin-Log Model: When the Explanatory Variable Is Logarithmic
        5.5 Reciprocal Models
        5.6 Polynomial Regression Models
        5.7 Regression Through the Origin: The Zero Intercept Model
  • 11.
        5.8 A Note on Scaling and Units of Measurement
        5.9 Regression on Standardized Variables
        5.10 Summary of Functional Forms
        5.11 Summary
        Key Terms and Concepts
        Questions
        Problems
        Appendix 5A: Logarithms
      Chapter 6 • Qualitative or Dummy Variable Regression Models
        6.1 The Nature of Dummy Variables
        6.2 ANCOVA Models: Regression on One Quantitative Variable and One Qualitative Variable With Two Categories
        6.3 Regression on One Quantitative Variable and One Qualitative Variable With More Than Two Classes or Categories
        6.4 Regression on One Quantitative Explanatory Variable and More Than One Qualitative Variable
          Interaction Effects
          A Generalization
        6.5 Comparing Two Regressions
        6.6 The Use of Dummy Variables in Seasonal Analysis
        6.7 What Happens If the Dependent Variable Is Also a Dummy Variable? The Linear Probability Model (LPM)
        6.8 The Logit Model
          Estimation of the Logit Model
        6.9 Summary
        Key Terms and Concepts
        Questions
        Problems
      PART II • REGRESSION ANALYSIS IN PRACTICE
      Chapter 7 • Model Selection: Criteria and Tests
        7.1 The Attributes of a Good Model
        7.2 Types of Specification Errors
        7.3 Omission of Relevant Variable Bias: “Underfitting” a Model
        7.4 Inclusion of Irrelevant Variables: “Overfitting” a Model
        7.5 Incorrect Functional Form
        7.6 Errors of Measurement
          Errors of Measurement in the Dependent Variable
          Errors of Measurement in the Explanatory Variable(s)
        7.7 Detecting Specification Errors: Tests of Specification Errors
          Detecting the Presence of Unnecessary Variables
          Tests for Omitted Variables and Incorrect Functional Forms
  • 12.
          Choosing Between Linear and Log-Linear Regression Models: The MWD Test
          Regression Error Specification Test: RESET
        7.8 Outliers, Leverage, and Influence Data
        7.9 Probability Distribution of the Error Term
        7.10 Model Evaluation Criteria
        7.11 Nonnormal Distribution of the Error Term
        7.12 Fixed Versus Random (or Stochastic) Explanatory Variables
        7.13 Summary
        Key Terms and Concepts
        Questions
        Problems
      Chapter 8 • Multicollinearity: What Happens If Explanatory Variables Are Correlated?
        8.1 The Nature of Multicollinearity: The Case of Perfect Multicollinearity
        8.2 The Case of Near, or Imperfect, Multicollinearity
        8.3 Theoretical Consequences of Multicollinearity
        8.4 Practical Consequences of Multicollinearity
        8.5 Detection of Multicollinearity
        8.6 Is Multicollinearity Necessarily Bad?
        8.7 An Extended Example: The Demand for Chickens in the United States, 1960 to 1982
          Collinearity Diagnostics for the Demand Function for Chickens
        8.8 What to Do With Multicollinearity: Remedial Measures
          Dropping a Variable(s) From the Model
          Acquiring Additional Data or a New Sample
          Rethinking the Model
          Prior Information About Some Parameters
          Transformation of Variables
          Other Remedies
        8.9 Summary
        Key Terms and Concepts
        Questions
        Problems
      Chapter 9 • Heteroscedasticity: What Happens If the Error Variance Is Nonconstant?
        9.1 The Nature of Heteroscedasticity
          Reasons for Heteroscedasticity
        9.2 Consequences of Heteroscedasticity
        9.3 Detection of Heteroscedasticity: How Do We Know When There Is a Heteroscedasticity Problem?
          1. Nature of the Problem
          2. Graphical Examination of Residuals
  • 13.
          3. Park Test
          4. Glejser Test
          5. White’s General Heteroscedasticity Test
          6. Breusch-Pagan (BP) Test of Heteroscedasticity
          Other Tests of Heteroscedasticity
        9.4 What to Do If Heteroscedasticity Is Observed: Remedial Measures
          When σᵢ² Is Known: The Method of Weighted Least Squares (WLS)
          When True σᵢ² Is Unknown
          Respecification of the Model
        9.5 White’s Heteroscedasticity-Corrected Standard Errors and t Statistics
        9.6 Some Concrete Examples of Heteroscedasticity
        9.7 Summary
        Key Terms and Concepts
        Questions
        Problems
      Chapter 10 • Autocorrelation: What Happens If Error Terms Are Correlated?
        10.1 The Nature of Autocorrelation
          Inertia
          Model Specification Error(s)
          The Cobweb Phenomenon
          Data Manipulation
        10.2 Consequences of Autocorrelation
        10.3 Detecting Autocorrelation
          The Graphical Method
          The Durbin–Watson d Test
        10.4 Remedial Measures
        10.5 How to Estimate ρ
          ρ = 1: The First Difference Method
          ρ Estimated From Durbin–Watson d Statistic
          ρ Estimated From OLS Residuals, eₜ
          Other Methods of Estimating ρ
        10.6 A Large Sample Method of Correcting OLS Standard Errors: The Newey–West (NW) Method
        10.7 A General Test of Autocorrelation: The Breusch–Godfrey (BG) Test
        10.8 Summary
        Key Terms and Concepts
        Questions
        Problems
      PART III • ADVANCED TOPICS IN ECONOMETRICS
      Chapter 11 • Elements of Time-Series Econometrics
        11.1 The Phenomenon of Spurious Regression: Nonstationary Time Series
        11.2 Tests of Stationarity
          1. Graphical Analysis
  • 14. 2. Autocorrelation Function (ACF) and Correlogram
        3. The Unit Root Test of Stationarity
    11.3 Cointegrated Time Series
    11.4 The Random Walk Model
    11.5 Causality in Economics: The Granger Causality Test
        The Granger Causality Test
    11.6 Summary
    Key Terms and Concepts
    Problems
    Chapter 12 • Panel Data Regression Models
    12.1 The Importance of Panel Data
    12.2 An Illustrative Example: Charitable Giving
    12.3 Pooled OLS Regression of the Charity Function
    12.4 The Fixed-Effects Least Squares Dummy Variable (LSDV) Model
    12.5 Limitations of the Fixed-Effects LSDV Model
    12.6 The Fixed-Effects Within-Group (WG) Estimator
    12.7 The Random-Effects Model (REM) or Error Components Model (ECM)
        Some Guidelines About REM and FEM
    12.8 Properties of Various Estimators
    12.9 Panel Data Regressions: Some Concluding Comments
    12.10 Summary and Conclusions
    Key Terms and Concepts
    Problems
    INTRODUCTION TO APPENDIXES A, B, C, AND D: BASICS OF PROBABILITY AND STATISTICS
    Appendix A: Review of Statistics: Probability and Probability Distributions
    A.1 Some Notation
        The Summation Notation
        Properties of the Summation Operator
    A.2 Experiment, Sample Space, Sample Point, and Events
        Experiment
        Sample Space or Population
        Sample Point
        Events
        Venn Diagrams
    A.3 Random Variables
    A.4 Probability
        Probability of an Event: The Classical or A Priori Definition
        Relative Frequency or Empirical Definition of Probability
        Probability of Random Variables
    A.5 Random Variables and Their Probability Distributions
        Probability Distribution of a Discrete Random Variable
        Probability Distribution of a Continuous Random Variable
        Cumulative Distribution Function (CDF)
  • 15. A.6 Multivariate Probability Density Functions
        Marginal Probability Functions
        Conditional Probability Functions
        Statistical Independence
    A.7 Summary and Conclusions
    Key Terms and Concepts
    References
    Questions
    Problems
    Appendix B: Characteristics of Probability Distributions
    B.1 Expected Value: A Measure of Central Tendency
        Properties of Expected Value
        Expected Value of Multivariate Probability Distributions
    B.2 Variance: A Measure of Dispersion
        Properties of Variance
        Chebyshev’s Inequality
        Coefficient of Variation
    B.3 Covariance
        Properties of Covariance
    B.4 Correlation Coefficient
        Properties of the Correlation Coefficient
        Variances of Correlated Variables
    B.5 Conditional Expectation
        Conditional Variance
    B.6 Skewness and Kurtosis
    B.7 From Population to the Sample
        Sample Mean
        Sample Variance
        Sample Covariance
        Sample Correlation Coefficient
        Sample Skewness and Kurtosis
    B.8 Summary
    Key Terms and Concepts
    Questions
    Problems
    Optional Exercises
    Appendix C: Some Important Probability Distributions
    C.1 The Normal Distribution
        Properties of the Normal Distribution
        The Standard Normal Distribution
        Random Sampling From a Normal Population
        The Sampling or Probability Distribution of the Sample Mean X̄
        The Central Limit Theorem (CLT)
    C.2 The t Distribution
        Properties of the t Distribution
  • 16. C.3 The Chi-Square (χ²) Probability Distribution
        Properties of the Chi-Square Distribution
    C.4 The F Distribution
        Properties of the F Distribution
    C.5 Summary
    Key Terms and Concepts
    Questions
    Problems
    Appendix D: Statistical Inference: Estimation and Hypothesis Testing
    D.1 The Meaning of Statistical Inference
    D.2 Estimation and Hypothesis Testing: Twin Branches of Statistical Inference
    D.3 Estimation of Parameters
    D.4 Properties of Point Estimators
        Linearity
        Unbiasedness
        Minimum Variance
        Efficiency
        Best Linear Unbiased Estimator (BLUE)
        Consistency
    D.5 Statistical Inference: Hypothesis Testing
        The Confidence Interval Approach to Hypothesis Testing
        Type I and Type II Errors: A Digression
        The Test of Significance Approach to Hypothesis Testing
        A Word on Choosing the Level of Significance, α, and the p Value
        The χ² and F Tests of Significance
    D.6 Summary
    Key Terms and Concepts
    Questions
    Problems
    Appendix E: Statistical Tables
    Index
  • 18. ACKNOWLEDGMENTS
I would like to thank Inas R. Kelly, Associate Professor of Economics at Loyola Marymount University, and Michael Grossman, Distinguished Professor of Economics at the City University of New York, for reading and providing feedback on this fifth edition, and also Helen Salmon at SAGE for her behind-the-scenes help and encouragement.
SAGE and the author are grateful for feedback from the following reviewers in the development of this fifth edition:
Prasad V. Bidarkota, Florida International University
Chinyere Emmanuel Egbe, Medgar Evers College, City University of New York
Kyungkook Kang, University of Central Florida
C. Burc Kayahan, Acadia University
Tom Means, San Jose State University
Elias Shukralla, Siena College
Robert Sonora, University of Montana
Patricia Kay Smith, University of Michigan–Dearborn
Della Lee Sue, Marist College
W. Scott Trees, Siena College
  • 20. PREFACE
PURPOSE OF THE FIFTH EDITION OF ESSENTIALS OF ECONOMETRICS
As in the first four editions of this book, my main purpose in the fifth edition is to provide a user-friendly introduction to econometric theory and practice to a wide variety of students. The intended audience is undergraduate economics majors, undergraduate business administration majors, MBA students, and others in the social and behavioral sciences where econometric techniques, especially the techniques of linear regression analysis, are used.
It is no exaggeration to say that regression analysis has become an integral part of study in any discipline where one is interested in studying the relationship between a variable of interest, called the dependent or response variable, and a set of explanatory or predictor variables. Sir Francis Galton (1822–1911) used it in the study of heredity, particularly the height of grown-up children in relation to the height of their parents. He used the method of least squares, the workhorse of linear regression analysis, for this purpose. Since then, the methodology of regression analysis has been improved and developed in many ways. Regression analysis is one of the most widely used research tools in the social sciences.
The book is designed to help beginning students understand econometric techniques through extensive examples, careful explanations, and a wide variety of problem material. In each of the editions of Essentials of Econometrics (EE), I have tried to incorporate major developments in the field in an intuitive and informative way without resorting to matrix algebra, calculus, or statistics beyond the introductory level. The fifth edition of EE continues this tradition. Students wishing to pursue this subject at a higher mathematical level can consult my book, Linear Regression: A Mathematical Introduction (Sage, 2018).
DESCRIPTION OF THE SPECIFIC MARKET (COURSES) FOR THIS BOOK
As noted, the bread-and-butter tool of econometrics is linear regression analysis. A perusal of the books published in this field shows a variety of titles: Introduction to Econometrics, Introductory Econometrics, A First Course in Linear Regression, Regression
  • 21. Analysis for the Social Sciences, Statistical Methods in the Social Sciences, Understanding Econometrics, Econometrics Models and Economic Forecasting, Applied Regression Models, Running Regressions, Economic Analysis of Financial Data, Linear Models in Statistics, and Principles of Econometrics. Despite the variety of names, they all deal with linear regression analysis at various levels of mathematical sophistication. The fifth edition of EE provides the foundation of linear analysis for the beginning student.
A search on the Internet will reveal that various editions of my book have been used or are still being used in universities and colleges all over the world. A growing trend is that the subject is now offered in online courses by various colleges and universities. EE has also been used as a reference book in private business, government and nongovernment entities, and research organizations.
THE MAJOR FEATURES OF THE BOOK AND THE BENEFITS OF THESE FEATURES
The foundation of linear regression is the classical linear regression model (CLRM). The CLRM is based on several simplifying assumptions, as is true of models in many other disciplines. I discuss these assumptions one by one, pointing out the reasons for each. After discussing the CLRM carefully, I look at each assumption critically: How realistic is the assumption? What are the consequences if the assumption is not met in a concrete situation, and what remedies are available? I provide several real data-based examples to shed light on the robustness of the CLRM.
Regression outputs of several examples, using statistical packages such as EViews and Stata, are included in each chapter. The regression outputs tell the reader at a glance the main features of the data underlying the various examples. The longevity of the book suggests that students and teachers find it appealing.
With the knowledge of linear regression, the reader will be able to undertake projects involving regression modeling. The reader will also be able to follow empirical research in their field of interest as well as read professional journals in their field.
The appendix to Chapter 1 lists various sources of governmental and nongovernmental data that can be easily accessed. The Federal Reserve Bank of St. Louis has a very extensive set of data on a variety of subjects that can be easily downloaded in Excel or other formats and used with several statistical packages. And this is all free!
A DISCUSSION OF SPECIAL PEDAGOGICAL AIDS AND HIGH-INTEREST FEATURES
Each chapter has a summary of the main points discussed in the chapter as well as a glossary of the key terms. The problem set in each chapter includes some analytical
  • 22. questions as well as real-world data sets that will let the reader solve problems using a variety of techniques discussed in the text. The variety of examples discussed in the text as well as those included in the exercises will show the reader how regression analysis has been used in a variety of disciplines. There are approximately 54 fully worked illustrative examples, about 82 analytical questions, and about 56 data sets in the exercises in the book. The teacher can assign one or more of the data sets as a class project. I firmly believe in learning by doing!
A salient feature of EE is that, in four statistical appendices, it provides the basics of probability and statistics for the benefit of students whose knowledge of these subjects has become a little rusty or who studied these subjects a long time ago. Some teachers might want to cover the material in these appendices before covering the rest of the book.
CHANGES IN THE FIFTH EDITION
New to this edition, and in response to useful feedback, are Chapter 11 on time-series econometrics and Chapter 12 on panel data econometrics. Some of the material in these chapters is rather advanced, but I have tried to explain it in a way that is understandable to beginning students.
Since time-series data are generally collected sequentially, there is every likelihood that adjacent observations are correlated. This leads to the problem of autocorrelation, a topic discussed at length in Chapter 10. However, a more fundamental problem with time series is that the series may not be stationary. Chapter 11 discusses the concept of a stationary time series intuitively, with the warning that regressing one nonstationary time series on another may lead to so-called spurious or nonsense regressions. In that chapter, I discuss methods to find out whether a time series is stationary.
Another topic discussed in Chapter 11 is causality, a kind of chicken-and-egg problem: Which comes first, the chicken or the egg? Thus, does money supply cause gross domestic product, or is it the other way round? The so-called Granger causality test is often used to answer this question, although there is controversy about this test.
In panel data modeling, we have data with both time and cross-sectional dimensions. Thus, we may have data on profits and sales for 50 firms over, say, 10 years, for a total of 500 observations. In finding out the relationship between sales and profits, how do we handle such data? We can regress profits on sales for each of the 50 firms using 10 years of data for each firm, obtaining 50 time-series regressions. We can also regress profits on sales for each year for the 50 firms, obtaining 10 cross-sectional regressions. How do we reconcile these regressions? The topics discussed in Chapter 12 show the various alternatives.
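The two ways of slicing such a panel can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the firm labels, the years 2011 to 2020, and the simulated sales and profit figures are all hypothetical and are not taken from the book.

```python
import random
from collections import defaultdict

random.seed(0)  # make the simulated (hypothetical) numbers reproducible

firms = [f"firm_{i:02d}" for i in range(1, 51)]   # 50 hypothetical firms
years = list(range(2011, 2021))                   # 10 hypothetical years

# One record per (firm, year) pair: 50 x 10 = 500 observations.
panel = []
for firm in firms:
    for year in years:
        sales = random.uniform(100.0, 1000.0)
        profits = 0.1 * sales + random.gauss(0.0, 5.0)
        panel.append({"firm": firm, "year": year,
                      "sales": sales, "profits": profits})

# Slice 1: one time series per firm (50 groups of 10 observations each).
by_firm = defaultdict(list)
# Slice 2: one cross section per year (10 groups of 50 observations each).
by_year = defaultdict(list)
for row in panel:
    by_firm[row["firm"]].append(row)
    by_year[row["year"]].append(row)

print(len(panel), len(by_firm), len(by_year))
```

Each group in `by_firm` could feed one time-series regression and each group in `by_year` one cross-sectional regression; the panel data models of Chapter 12 use all 500 observations together.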
  • 23. The chapter on dummy, or qualitative explanatory, variables now also includes a discussion of dummy dependent variables. For example, how do we model the decision to smoke or not? Smoking is a binary variable: you either smoke or you do not smoke. In the traditional regression model, by contrast, the dependent variable is generally quantitative. In this chapter, I discuss how dummy explanatory variables enhance the linear regression model. I also discuss the problems entailed in estimating a regression model involving dummy dependent variables by the method of least squares and suggest alternatives, such as the logit model.
DIGITAL RESOURCES TO ACCOMPANY THE TEXT
An instructor website at edge.sagepub.com/gujarati5e contains the figures from the book and answers to all the problems in the text, together with PowerPoint slides. Students can access the data sets for many of the larger tables in the book on the student section of this website.
  • 24. ABOUT THE AUTHOR
Damodar N. Gujarati, MCom, University of Bombay (Mumbai), MBA and PhD, both from the University of Chicago, is Professor Emeritus of Economics at the United States Military Academy at West Point, New York. Prior to that, he taught for 25 years at the Baruch College of the City University of New York (CUNY) and at the Graduate Center of CUNY. He is the author of Government and Business (McGraw-Hill, 1984), the bestselling textbook Basic Econometrics (fifth edition, 2009, with coauthor Dawn Porter), Econometrics by Example (second edition, 2014, Palgrave Macmillan), and Essentials of Econometrics (fifth edition, 2021) and Linear Regression: A Mathematical Introduction (2018), both with SAGE. He is also the author of Pensions and New York City Fiscal Crisis (American Enterprise Institute, 1978). His books on econometrics have been translated into several languages. He has published extensively in recognized national and international journals, such as the Review of Economics and Statistics, Economic Journal, Journal of Financial and Quantitative Analysis, and Journal of Business, published by the University of Chicago. He has held visiting professorships at the University of Sheffield, United Kingdom; National University of Singapore; and the University of New South Wales, Australia. He was a Visiting Fulbright Scholar to India. He has lectured extensively on micro- and macroeconomic topics in Australia, China, Bangladesh, Germany, India, Israel, Mauritius, and the Republic of South Korea.
  • 26. 1 THE NATURE AND SCOPE OF ECONOMETRICS
Econometrics may be defined as the quantitative analysis of actual economic phenomena based on the concurrent development of theory and observations, related by appropriate methods of inference.
Paul Samuelson
Econometrics may be defined as the social science in which tools of economic theory, mathematics, and statistical inference are applied to the analysis of economic phenomena.
Arthur S. Goldberger
Research in economics, finance, management, marketing, and related disciplines is becoming increasingly quantitative. Beginning students in these fields are encouraged, if not required, to take a course or two in econometrics, a field of study that has become quite popular. This chapter gives the beginner an overview of what econometrics is all about.
1.1 WHAT IS ECONOMETRICS?
Simply stated, econometrics means economic measurement. Although quantitative measurement of economic concepts such as the gross domestic product (GDP), unemployment, inflation, imports, and exports is very important, the scope of econometrics is much broader, as can be seen from the following definitions:
Econometrics may be defined as the social science in which the tools of economic theory, mathematics, and statistical inference are applied to the analysis of economic phenomena.1
1 Arthur S. Goldberger, Econometric Theory, Wiley, New York, 1964, p. 1.
  • 27. Econometrics, the result of a certain outlook on the role of economics, consists of the application of mathematical statistics to economic data to lend empirical support to the models constructed by mathematical economics and to obtain numerical results.2
1.2 WHY STUDY ECONOMETRICS?
As the preceding definitions suggest, econometrics makes use of economic theory, mathematical economics, economic statistics (i.e., economic data), and mathematical statistics. Yet, it is a subject that deserves to be studied in its own right for the following reasons.
Economic theory makes statements or hypotheses that are mostly qualitative in nature. For example, microeconomic theory states that, other things remaining the same (the famous ceteris paribus clause of economics), an increase in the price of a commodity is expected to decrease the quantity demanded of that commodity. Thus, economic theory postulates a negative or inverse relationship between the price and quantity demanded of a commodity. This is the widely known law of downward-sloping demand or simply the law of demand. But the theory itself does not provide any numerical measure of the strength of the relationship between the two; that is, it does not tell us by how much the quantity demanded will go up or down as a result of a certain change in the price of the commodity. It is the econometrician’s job to provide such numerical estimates. Econometrics gives empirical (i.e., based on observation or experiment) content to most economic theory. If we find in a study or experiment that when the price of a unit increases by a dollar the quantity demanded goes down by, say, 100 units, we have not only confirmed the law of demand, but in the process, we have also provided a numerical estimate of the relationship between the two variables, price and quantity.
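As a rough illustration of what such a numerical estimate looks like, the sketch below fits a straight line to five made-up price and quantity pairs. It assumes the method of least squares, which the book develops in later chapters; the data and the helper name `ols_line` are hypothetical and chosen so that the fitted slope comes out to a fall of 100 units per one-dollar price increase.

```python
# Hypothetical observations: price in dollars, quantity demanded in units.
prices = [1.0, 2.0, 3.0, 4.0, 5.0]
quantities = [900.0, 800.0, 700.0, 600.0, 500.0]


def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares, both in deviations from means.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = ols_line(prices, quantities)
print(f"Estimated demand line: Q = {intercept:.1f} + ({slope:.1f}) * P")
```

The negative sign of the slope agrees with the law of demand, while its size is the numerical estimate that theory alone does not supply. Real data would not fall exactly on a line, so an actual estimate would come with sampling error.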
The main concern of mathematical economics is to express economic theory in mathematical form or equations (or models) without regard to measurability or empirical verification of the theory. Econometrics, as noted earlier, is primarily interested in the empirical verification of economic theory. As we will show shortly, the econometrician often uses mathematical models proposed by the mathematical economist but puts these models in forms that lend themselves to empirical testing.
2 P. A. Samuelson, T. C. Koopmans, and J. R. N. Stone, “Report of the Evaluative Committee for Econometrica,” Econometrica, vol. 22, no. 2, April 1954, pp. 141–146.
  • 28. Economic statistics is mainly concerned with collecting, processing, and presenting economic data in the form of charts, diagrams, and tables. This is the economic statistician’s job. He or she collects data on the GDP, employment, unemployment, prices, and so on. These data constitute the raw data for econometric work. But the economic statistician does not go any further because he or she is not primarily concerned with using the collected data to test economic theories.
Although mathematical statistics provides many of the tools employed in the trade, the econometrician often needs special methods because of the unique nature of most economic data, namely, that the data are not usually generated as the result of a controlled experiment. The econometrician, like the meteorologist, generally depends on data that cannot be controlled directly. Thus, data on consumption, income, investments, savings, prices, and so on, which are collected by public and private agencies, are nonexperimental in nature. The econometrician takes these data as given. This creates special problems not normally dealt with in mathematical statistics. Moreover, such data are likely to contain errors of measurement, of either omission or commission, and the econometrician may be called upon to develop special methods of analysis to deal with such errors of measurement.
For students majoring in economics and business, there is a pragmatic reason for studying econometrics. After graduation, in their employment, they may be called upon to forecast sales, interest rates, and money supply or to estimate demand and supply functions or price elasticities for products. Quite often, economists appear as expert witnesses before federal and state regulatory agencies on behalf of their clients or the public at large.
Thus, an economist appearing before a state regulatory commission that controls prices of gas and electricity may be required to assess the impact of a proposed price increase on the quantity demanded of electricity before the commission will approve the price increase. In situations like this, the economist may need to develop a demand function for electricity. Such a demand function may enable the economist to estimate the price elasticity of demand, that is, the percentage change in the quantity demanded for a given percentage change in the price. Knowledge of econometrics is very helpful in estimating such demand functions.
It is fair to say that econometrics has become an integral part of training in economics and business. It may be added that the techniques and methods developed in econometrics have found uses in several other areas of the social sciences, in politics and international relations, and in the agricultural and medical sciences, as some of the examples discussed in this book will reveal.
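The elasticity definition above can be turned into a small calculation. The following is a minimal sketch with hypothetical numbers (a price rise from 10 to 11 cents per kilowatt-hour and a fall in quantity demanded from 1,000 to 950 units); the function name `price_elasticity` is an assumption made for illustration. In practice the elasticity would be estimated from a fitted demand function rather than from two observations.

```python
def price_elasticity(q_old, q_new, p_old, p_new):
    """Percentage change in quantity divided by percentage change in price."""
    pct_change_q = (q_new - q_old) / q_old * 100.0
    pct_change_p = (p_new - p_old) / p_old * 100.0
    return pct_change_q / pct_change_p


# Hypothetical electricity example: price rises 10%, quantity falls 5%.
elasticity = price_elasticity(q_old=1000.0, q_new=950.0,
                              p_old=10.0, p_new=11.0)
print(f"Price elasticity of demand: {elasticity:.2f}")
```

An elasticity between 0 and -1, as in this made-up case, means that quantity demanded responds less than proportionately to the price change, which is the kind of evidence a regulatory commission would weigh.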
  • 29. 1.3 THE METHODOLOGY OF ECONOMETRICS
How does one actually do an econometric study? Broadly speaking, econometric analysis proceeds along the following lines.
1. The object of research
2. Collecting data
3. Specifying the mathematical model of theory
4. Specifying the statistical, or econometric, model of theory
5. Estimating the parameters of the chosen econometric model
6. Checking for model adequacy: model specification testing
7. Testing hypotheses derived from the model
8. Using the model for prediction or forecasting
To illustrate the methodology, consider this question: Do economic conditions affect people’s decisions to enter the labor force, that is, their willingness to work? As a measure of economic conditions, suppose we use the unemployment rate (UNR), and as a measure of labor force participation, we use the labor force participation rate (LFPR). Data on UNR and LFPR are regularly published by the government. So to answer the question, we proceed as follows.
1. The Object of Research
The starting point is to find out what economic theory has to say on the subject you want to study. In labor economics, there are two rival hypotheses about the effect of economic conditions on people’s willingness to work. The discouraged-worker hypothesis (effect) states that when economic conditions worsen, as reflected in a higher unemployment rate, many unemployed workers give up hope of finding a job and drop out of the labor force. On the other hand, the added-worker hypothesis (effect) maintains that when economic conditions worsen, many secondary workers who are not currently in the labor market (e.g., mothers with children) may decide to join the labor force if the main breadwinner in the family loses his or her job. Even if the jobs these secondary workers get are low paying, the earnings will make up some of the loss in income suffered by the primary breadwinner.
Whether, on balance, the labor force participation rate will increase or decrease will depend on the relative strengths of the added-worker and discouraged-worker effects. If the added-worker effect dominates, LFPR will increase even when the unemployment
  • 30. rate is high. Conversely, if the discouraged-worker effect dominates, LFPR will decrease. How do we find this out? This now becomes our empirical question.
2. Collecting Data
For empirical purposes, therefore, we need quantitative information on the two variables. There are three types of data that are generally available for empirical analysis.
1. Time series
2. Cross-sectional
3. Pooled (a combination of time series and cross-sectional)
Time-series data are collected over a period of time, such as the data on GDP, employment, unemployment, money supply, or government deficits. Such data may be collected at regular intervals: daily (e.g., stock prices), weekly (e.g., money supply), monthly (e.g., the unemployment rate), quarterly (e.g., GDP), or annually (e.g., government budget). So-called high-frequency data are collected over extremely short periods of time, such as seconds and minutes. In flash trading in stock and foreign exchange markets, such high-frequency data have now become common. These data may be quantitative in nature (e.g., prices, income, money supply) or qualitative (e.g., male or female, employed or unemployed, married or unmarried, White or Black). As we will show, qualitative variables, also called dummy or categorical variables, can be every bit as important as quantitative variables.
Since successive observations in time-series data may be correlated, they pose special problems for regressions involving time-series data, particularly the problem of autocorrelation, a topic we discuss at length in Chapter 10 with appropriate examples.
Time-series data pose another problem, namely, that they may not be stationary. Loosely speaking, a time series is stationary if its mean and variance do not vary systematically over time. In Chapter 11 on time-series econometrics, we examine the nature of stationary and nonstationary time series and show the special statistical problems created by the latter. If we are dealing with time-series data, we will denote the observation subscript by t (e.g., Yt, Xt).
Cross-sectional data are data on one or more variables collected at one point in time, such as the census of population conducted by the U.S. Census Bureau every 10 years (the most recent was on April 1, 2010; the results of the 2020 census are not yet available at the time of writing); the surveys of consumer expenditures conducted by the University of Michigan; and the opinion polls such as those conducted by Gallup, Harris, and other polling organizations. Like time-series data, cross-sectional
  • 31. data have their particular problems, particularly the problem of heterogeneity. For example, if you collect data on executive salaries in a given industry at the same point in time, heterogeneity arises because the data may contain small-, medium-, and large-size companies with their own management style and policies. In Chapter 5, we show how the size or scale effect of heterogeneous companies can be taken into account.
In pooled data, we have elements of both time-series and cross-sectional data. For example, if we collect data on the unemployment rate for 10 countries for a period of 20 years, the data will constitute an example of pooled data: data on the unemployment rate for each country for the 20-year period will form time-series data, whereas data on the unemployment rate for the 10 countries for any single year will be cross-sectional data. In pooled data, we will have 200 observations, that is, 20 annual observations for each of the 10 countries.
There is a special type of pooled data called panel data, also called longitudinal or micropanel data, in which the same cross-sectional unit, say, a family or firm, is surveyed over time. For example, the U.S. Department of Commerce conducts a census of housing at periodic intervals. At each periodic survey, the same household (or the people living at the same address) is interviewed to find out if there has been any change in the housing and financial conditions of that household since the last survey. The panel data that result from repeatedly interviewing the same household at periodic intervals provide very useful information on the dynamics of household behavior. We denote panel data by the double subscript it. Thus, Yit will denote the (cross-sectional) observation for the ith unit at time t.
Quality of the Data.
The researcher must check carefully the reputation of the agency that collects the data, for very often the data contain errors of measurement, errors of omission of some observations, or errors of systematic rounding and the like. Data collected in public polls or in marketing surveys may be biased because of nonresponse or incomplete response from the participants. Sometimes the data are available only at a highly aggregated, or macro, level, which may not tell us much about the individual entities included in the aggregate. It should always be kept in mind that the results of research are only as good as the quality of the data. Since an individual researcher usually does not have the luxury of collecting data on their own, very often they have to depend on secondary sources. But every effort must be made to check the quality of the data used in empirical analysis.
Data Revisions. Macro data on variables such as GDP, consumer price index (CPI), and other economic variables are often revised upward or downward as initially published data may be tentative. It behooves the researcher to keep track of the revised data.
  • 32. Not only that, macroeconomic and microeconomic data are often “jolted” by unusual events, such as the Great Recession of 2008 and the following several years, which was triggered by the collapse of the housing market boom that was set in motion by the subpar loans that were given by real estate brokers and banks. This collapse spilled over into the stock market. The severe recession that started in the United States very quickly spread across the globe. Such unusual events should be taken into account in analyzing economic data.
A startling example is the coronavirus disease 2019 (COVID-19) pandemic, which started in one country in late 2019 and quickly spread to other countries, with devastating effects on their economies. In the United States, according to the U.S. Centers for Disease Control and Prevention, as of March 29, 2021, the total number of COVID-19 cases was 30,085,827 and the total number of deaths was 546,704. The long-term consequences of COVID-19 have yet to be assessed. Doing econometric analysis in situations such as this is very challenging, to say the least.
Sources of the Data. A word is in order regarding data sources. The success of any econometric study hinges on the quality as well as the quantity of data. Fortunately, the Internet has opened up a veritable wealth of data. In Appendix 1A, we give addresses of several websites that have all kinds of microeconomic and macroeconomic data. Students should be familiar with such sources of data, as well as how to access or download them. Of course, these data are continually updated, so the reader should find the latest available data.
Data From Statistical Packages. Statistical packages, such as EViews, Stata, Minitab, and SAS, have data sets for expository purposes. The Federal Reserve Bank of St. 
Louis has extensive data on several macroeconomic variables in Excel format that can be directly imported into Eviews (http://research.stlouisfed.org/ fred-addin), and FRED economic data are extremely useful for empirical research. Stata can also import FRED data in Stata format by issuing the command findit Freduse while you use Stata. For our analysis, we obtained the time-series data shown in Table 1-1 of the book's website. This table gives data on the civilian labor force participation rate (CLFPR) and the civilian unemployment rate (CUNR), defined as the number of civilians unemployed as a percentage of the civilian labor force, for the United States for the period 1980–2007.3 The data beyond this period are given in Problem 1.10 (see Table 1-2 found on the book’s website). 3We consider here only the aggregate CLFPR and CUNR, but data are available by age, sex, and ethnic composition.
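The definition of CUNR given above is simple arithmetic. As a minimal sketch in Python, with a hypothetical function name and made-up round numbers (these are not values from Table 1-1), the rate could be computed as follows:

```python
def unemployment_rate(unemployed: float, labor_force: float) -> float:
    """Unemployed civilians as a percentage of the civilian labor force."""
    if labor_force <= 0:
        raise ValueError("labor_force must be positive")
    return 100.0 * unemployed / labor_force


# Hypothetical figures in millions of persons, chosen only for illustration.
rate = unemployment_rate(unemployed=8.0, labor_force=160.0)
print(f"CUNR = {rate:.1f}%")
```

Both arguments must be in the same units (persons, thousands, or millions); the result is a percentage.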
Unlike physical sciences, most data collected in economics (e.g., GDP, money supply, Dow Jones index, car sales) are nonexperimental in that the data-collecting agency (e.g., government) may not have any direct control over the data. Thus, the data on labor force participation and unemployment are based on the information provided to the government by participants in the labor market. In a sense, the government is a passive collector of these data and may not be aware of the added- or discouraged-worker hypotheses, or any other hypothesis, for that matter. Therefore, the collected data may be the result of several factors affecting the labor force participation decision made by the individual person. That is, the same data may be compatible with more than one theory.
3. Specifying the Mathematical Model of Labor Force Participation
To see how CLFPR behaves in relation to CUNR, the first thing we should do is plot the data for these variables in a scatter diagram, or scattergram, as shown in Figure 1-1.
FIGURE 1-1 Regression plot for civilian labor force participation rate (%) and civilian unemployment rate (%) [fitted line plot; horizontal axis: CUNR (%), vertical axis: CLFPR (%)]
The scattergram shows that CLFPR and CUNR are inversely related, perhaps suggesting that, on balance, the discouraged-worker effect is stronger than the added-worker effect.⁴ As a first approximation, we can draw a straight line through the scatter points and write the relationship between CLFPR and CUNR by the following simple mathematical model:
CLFPR = B1 + B2 CUNR (1.1)
Equation (1.1) states that CLFPR is linearly related to CUNR. B1 and B2 are known as the parameters of the linear function.⁵ B1 is also known as the intercept; it gives the value of CLFPR when CUNR is zero.⁶ B2 is known as the slope. The slope measures the rate of change in CLFPR for a unit change in CUNR or, more generally, the rate of change in the value of the variable on the left-hand side of the equation for a unit change in the value of the variable on the right-hand side. The slope coefficient B2 can be positive (if the added-worker effect dominates the discouraged-worker effect) or negative (if the discouraged-worker effect dominates the added-worker effect). Figure 1-1 suggests that in the present case, it is negative.
4. Specifying the Statistical, or Econometric, Model of Labor Force Participation
The purely mathematical model of the relationship between CLFPR and CUNR given in Equation (1.1), although of prime interest to the mathematical economist, is of limited appeal to the econometrician, for such a model assumes an exact, or deterministic, relationship between the two variables; that is, for a given CUNR, there is a unique value of CLFPR. In reality, one rarely finds such neat relationships between economic variables. Most often, the relationships are inexact, or statistical, in nature.
This is seen clearly from the scattergram given in Figure 1-1. Although the two variables are inversely related, the relationship between them is not perfectly or exactly linear, for if we draw a straight line through the 28 data points, not all the data points will lie exactly on that straight line. Recall that to draw a straight line, we need only two points.⁷ Why don’t the 28 data points lie exactly on the straight line specified by the mathematical model, Equation (1.1)?
Remember that our data on labor force and unemployment are nonexperimentally collected. Therefore, as noted earlier, besides the added- and discouraged-worker hypotheses, there may be other forces affecting labor force participation decisions. As a result, the observed relationship between CLFPR and CUNR is likely to be imprecise.
⁴ On this, see Shelly Lundberg, “The Added Worker Effect,” Journal of Labor Economics, vol. 3, January 1985, pp. 11–37.
⁵ Broadly speaking, a parameter is an unknown quantity that may vary over a certain set of values. In statistics, a probability distribution function (PDF) of a random variable is often characterized by its parameters, such as its mean and variance. This topic is discussed in greater detail in Appendixes A and B.
⁶ In Chapter 2, we give a more precise interpretation of the intercept in the context of regression analysis.
⁷ We even tried to fit a parabola to the scatter points given in Figure 1-1, but the results were not materially different from the linear specification.
Let us allow for the influence of all other variables affecting CLFPR in a catchall variable u and write Equation (1.2) as follows:
CLFPR = B1 + B2 CUNR + u (1.2)
where u represents the random error term, or simply the error term.⁸ We let u represent all those forces (besides CUNR) that affect CLFPR but are not explicitly introduced in the model, as well as purely random forces. As we will see in Part II, the error term distinguishes econometrics from purely mathematical economics.
Equation (1.2) is an example of a statistical, or empirical or econometric, model. More precisely, it is an example of what is known as a linear regression model, which is a prime subject of this book. In such a model, the variable appearing on the left-hand side of the equation is called the dependent variable, and the variable on the right-hand side is called the independent, or explanatory, variable. In linear regression analysis, our primary objective is to explain the behavior of one variable (the dependent variable) in relation to the behavior of one or more other variables (the explanatory variables), allowing for the fact that the relationship between them is inexact.
Notice that the econometric model, Equation (1.2), is derived from the mathematical model, Equation (1.1), which shows that mathematical economics and econometrics are mutually complementary disciplines. This is clearly reflected in the definition of econometrics given at the outset.
Before proceeding further, a warning regarding causation is in order. In the regression model, Equation (1.2), we have stated that CLFPR is the dependent variable and CUNR is the independent, or explanatory, variable. Does that mean that the two variables are causally related; that is, is CUNR the cause and CLFPR the effect? In other words, does regression imply causation? Not necessarily.
As Kendall and Stuart note, “A statistical relationship, however strong and however suggestive, can never establish causal connection: our ideas of causation must come from outside statistics, ultimately from some theory or other.”⁹ In our example, it is up to economic theory (e.g., the discouraged-worker hypothesis) to establish the cause-and-effect relationship, if any, between the dependent and explanatory variables. If causality cannot be established, it is better to call the relationship, Equation (1.2), a predictive relationship: Given CUNR, can we predict CLFPR?
⁸ In statistical lingo, the random error term is known as the stochastic error term.
⁹ M. G. Kendall and A. Stuart, The Advanced Theory of Statistics, Charles Griffin, New York, 1961, vol. 2, chap. 26, p. 279.
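To make the role of the error term u in Equation (1.2) concrete, the following Python sketch simulates observations as a systematic part B1 + B2·CUNR plus a random draw u. The parameter values, the normal distribution for u, and the function name are all illustrative assumptions, not quantities taken from the text or from Table 1-1.

```python
import random


def simulate_clfpr(cunr_values, b1, b2, sigma, seed=0):
    """Return (CUNR, systematic part, observed CLFPR) triples for the
    model CLFPR = b1 + b2*CUNR + u, with u drawn from N(0, sigma^2)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    rows = []
    for x in cunr_values:
        systematic = b1 + b2 * x      # the deterministic line, as in Equation (1.1)
        u = rng.gauss(0.0, sigma)     # stand-in for all omitted and random forces
        rows.append((x, systematic, systematic + u))
    return rows


# Hypothetical parameter values chosen only for illustration.
for x, line_value, observed in simulate_clfpr([4.0, 5.5, 7.0, 9.5], 69.5, -0.58, sigma=0.5):
    print(f"CUNR={x:4.1f}  on the line={line_value:6.2f}  observed={observed:6.2f}")
```

With sigma set to zero every observation falls exactly on the line, which is the deterministic case of Equation (1.1); any positive sigma scatters the points around it.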
5. Estimating the Parameters of the Chosen Econometric Model
Given the data on CLFPR and CUNR, such as those in Table 1-1, how do we estimate the parameters of the model, Equation (1.2), namely, B1 and B2? That is, how do we find the numerical values (i.e., estimates) of these parameters? This will be the focus of our attention in Part II, where we develop the appropriate methods of computation, especially the method of ordinary least squares (OLS). Using OLS and the data given in Table 1-1, we obtained the following results:
$\widehat{\text{CLFPR}} = 69.4620 - 0.5814\,\text{CUNR}$ (1.3)
Note that we have put the symbol ^ on CLFPR (read as “CLFPR hat”) to remind us that Equation (1.3) is an estimate of Equation (1.2). The estimated regression line is shown in Figure 1-1, along with the actual data points.
As Equation (1.3) shows, the estimated value of B1 is ≈ 69.5 and that of B2 is ≈ -0.58, where the symbol ≈ means approximately. Thus, if the unemployment rate goes up by one unit (i.e., one percentage point), ceteris paribus, CLFPR is expected to decrease on the average by about 0.58 percentage points; that is, as economic conditions worsen, on average, there is a net decrease in the labor force participation rate of about 0.58 percentage points, perhaps suggesting that the discouraged-worker effect dominates. We say “on the average” because the presence of the error term u, as noted earlier, is likely to make the relationship somewhat imprecise. This is vividly seen in Figure 1-1, where the points not on the estimated regression line are the actual participation rates and the (vertical) distances between them and the points on the regression line are the estimated u’s. As we will see in Chapter 2, the estimated u’s are called residuals. In short, the estimated regression line, Equation (1.3), gives the relationship between average CLFPR and CUNR, that is, on average, how CLFPR responds to a unit change in CUNR.
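The book develops OLS in later chapters, but the arithmetic for the two-variable case is short enough to sketch here. The Python code below uses the standard closed-form OLS formulas (slope = sum of cross-deviations divided by sum of squared deviations of x; intercept = mean of y minus slope times mean of x). The five data points are invented for illustration and are not the Table 1-1 data, so the estimates will not match Equation (1.3); the function names are likewise hypothetical.

```python
def ols_two_variable(x, y):
    """Closed-form OLS estimates (b1, b2) for y = b1 + b2*x + u."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary for the slope to be defined")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b2 = sxy / sxx             # slope estimate
    b1 = y_bar - b2 * x_bar    # intercept estimate
    return b1, b2


def residuals(x, y, b1, b2):
    """Vertical distances between actual y and the fitted line (estimated u's)."""
    return [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]


# Made-up observations: unemployment rate (%) and participation rate (%).
cunr = [4.0, 5.0, 6.0, 7.0, 8.0]
clfpr = [67.0, 66.6, 66.1, 65.3, 65.0]

b1_hat, b2_hat = ols_two_variable(cunr, clfpr)
print(f"intercept = {b1_hat:.4f}, slope = {b2_hat:.4f}")
print("residuals:", [round(e, 4) for e in residuals(cunr, clfpr, b1_hat, b2_hat)])
```

A negative slope estimate here would be read the same way as in the text: a one-percentage-point rise in CUNR is associated, on average, with a fall in CLFPR of that many percentage points. The residuals of an OLS fit with an intercept sum to zero, up to rounding.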
The value of about 69.5 suggests that the average value of CLFPR will be about 69.5% if the CUNR is zero; that is, about 69.5% of the civilian working-age population will participate in the labor force if there is full employment (i.e., zero unemployment).¹⁰
6. Checking for Model Adequacy: Model Specification Testing
How adequate is our model, Equation (1.3)? It is true that a person will take into account labor market conditions as measured by, say, the unemployment rate before entering the labor market. For example, in 1982 (a recession year), the civilian unemployment rate was about 9.7%. Compared to that, in 2001, it was only 4.7%. A person is more likely to be discouraged from entering the labor market when the unemployment rate is more than 9% than when it is 5%. But other factors also enter into labor force participation decisions. For example, hourly wages, or earnings, prevailing in the labor market also will be an important decision variable. In the short run at least, a higher wage may attract more workers to the labor market, other things remaining the same (ceteris paribus). To see its importance, in Table 1-1, we have also given data on real average hourly earnings (AHE82), where real earnings are measured in 1982 dollars. To take into account the influence of AHE82, we now consider the following model:
CLFPR = B1 + B2 CUNR + B3 AHE82 + u (1.4)
Equation (1.4) is an example of a multiple linear regression model, in contrast to Equation (1.2), which is an example of a simple (two-variable or bivariate) linear regression model. In the two-variable model, there is a single explanatory variable, whereas in a multiple regression, there are several, or multiple, explanatory variables. Notice that in the multiple regression, Equation (1.4), we also have included the error term, u, for no matter how many explanatory variables one introduces in the model, one cannot fully explain the behavior of the dependent variable. How many variables one introduces in the multiple regression is a decision that the researcher will have to make in a given situation. Of course, the underlying economic theory will often tell what these variables might be. However, keep in mind the warning given earlier that regression does not mean causation; the relevant theory must determine whether one or more explanatory variables are related to the dependent variable.
¹⁰ This is, however, a mechanical interpretation of the intercept. We will see in Chapter 2 how to interpret the intercept term meaningfully in a given context.
How do we estimate the parameters of the multiple regression, Equation (1.4)? We cover this topic in Chapter 4, after we discuss the two-variable model in Chapters 2 and 3.
We consider the two-variable case first because it is the building block of the multiple regression model. As we shall see in Chapter 4, the multiple regression model is in many ways a straightforward extension of the two-variable model.
For our illustrative example, the empirical counterpart of Equation (1.4) is as follows (these results are based on OLS):
$\widehat{\text{CLFPR}} = 81.2267 - 0.6384\,\text{CUNR} - 1.4449\,\text{AHE82}$ (1.5)
These results are interesting because both the slope coefficients are negative. The negative coefficient of CUNR suggests that, ceteris paribus (i.e., holding the influence of AHE82 constant), a one-percentage-point increase in the unemployment rate leads, on average, to about a 0.64-percentage-point decrease in CLFPR, perhaps once again supporting the discouraged-worker hypothesis. On the other hand, holding the influence of CUNR constant, an increase in real average hourly earnings of one dollar, on average, leads to about a 1.44-percentage-point decline in CLFPR.¹¹ Does the negative coefficient for AHE82 make economic sense? Would one not expect a positive coefficient, since the higher the hourly earnings, the higher the attraction of the labor market? However, one could justify the negative coefficient by recalling the twin concepts of microeconomics, namely, the income effect and the substitution effect.¹²
Which model do we choose, Equation (1.3) or Equation (1.5)? Since Equation (1.5) encompasses Equation (1.3) and adds a further dimension (earnings) to the analysis, we may choose Equation (1.5). After all, Equation (1.2) was based implicitly on the assumption that variables other than the unemployment rate were held constant. But where do we stop? For example, labor force participation may also depend on family wealth, number of children under age 6 (this is especially critical for married women thinking of joining the labor market), availability of daycare centers for young children, religious beliefs, availability of welfare benefits, unemployment insurance, and so on. Even if data on these variables are available, we may not want to introduce them all in the model because the purpose of developing an econometric model is not to capture total reality but just its salient features. If we decide to include every conceivable variable in the regression model, the model will be so unwieldy that it will be of little practical use. The model ultimately chosen should be a reasonably good replica of the underlying reality, while keeping in mind the principle of parsimony, or Ockham’s razor.
William Ockham (1285–1349), an English philosopher, held that a complicated explanation should not be accepted without good reason, or as he put it, “It is vain to do with more what can be done with less.” In Chapter 7, we will discuss this question further and find out how one can go about developing a model.
7. Testing Hypotheses Derived From the Model
Having finally settled on a model, we may want to perform hypothesis testing. That is, we may want to find out whether the estimated model makes economic sense and whether the results obtained conform with the underlying economic theory. For example, the discouraged-worker hypothesis postulates a negative relationship between labor force participation and the unemployment rate. Is this hypothesis borne out by our results? Our statistical results seem to be in conformity with this hypothesis because the estimated coefficient of CUNR is negative.
¹¹ As we will discuss in Chapter 4, the coefficients of CUNR and AHE82 given in Equation (1.5) are known as partial regression coefficients. In that chapter, we will discuss the precise meaning of partial regression coefficients.
¹² Consult any standard textbook on microeconomics. One intuitive justification of this result is as follows. Suppose both spouses are in the labor force and the earnings of one spouse rise substantially. This may prompt the other spouse to withdraw from the labor force without substantially affecting the family income.
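The ceteris paribus reading of the coefficients in Equation (1.5) can be checked by direct arithmetic. The Python sketch below hard-codes the estimates reported in Equation (1.5) and evaluates the fitted equation at pairs of points that differ in only one explanatory variable. The function name and the input values are illustrative assumptions.

```python
def fitted_clfpr(cunr: float, ahe82: float) -> float:
    """Fitted CLFPR (%) from the estimates reported in Equation (1.5)."""
    return 81.2267 - 0.6384 * cunr - 1.4449 * ahe82


# Raise CUNR by one percentage point while holding AHE82 fixed at an arbitrary level.
effect_cunr = fitted_clfpr(7.0, 8.0) - fitted_clfpr(6.0, 8.0)

# Raise AHE82 by one dollar while holding CUNR fixed.
effect_ahe82 = fitted_clfpr(6.0, 9.0) - fitted_clfpr(6.0, 8.0)

print(f"change in fitted CLFPR per point of CUNR:  {effect_cunr:.4f}")
print(f"change in fitted CLFPR per dollar of AHE82: {effect_ahe82:.4f}")
print(f"fitted CLFPR at CUNR=6.0, AHE82=10: {fitted_clfpr(6.0, 10.0):.4f}")
```

Because the fitted equation is linear, each difference equals the corresponding coefficient regardless of the level at which the other variable is held. The last line uses the same inputs as the text's prediction example in step 8.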
However, hypothesis testing can be complicated. In our illustrative example, suppose someone told us that in a prior study, the coefficient of CUNR was found to be about -1. Are our results in agreement? If we rely on the model, Equation (1.3), we might get one answer, but if we rely on Equation (1.5), we might get another answer. How do we resolve this question? Although we will develop the necessary tools to answer such questions, we should keep in mind that the answer to a particular hypothesis may depend on the model we finally choose.
The point worth remembering is that in regression analysis, we may be interested not only in estimating the parameters of the regression model but also in testing certain hypotheses suggested by economic theory and/or prior empirical experience. Although the basic principles of hypothesis testing are covered in a basic course in statistics, Appendix D discusses this topic at some length as a refresher for the reader.
8. Using the Model for Prediction or Forecasting
Having gone through this multistage procedure, you can legitimately ask the following question: What do we do with the estimated model, such as Equation (1.5)? Quite naturally, we would like to use it for prediction, or forecasting. For instance, suppose we have 2008 data on the CUNR and AHE82. Assume these values are 6.0 and 10, respectively. If we put these values in Equation (1.5), we obtain 62.9473% as the predicted value of CLFPR for 2008. That is, if the unemployment rate in 2008 were 6.0% and the real hourly earnings were $10, the civilian labor force participation rate for 2008 would be about 63%. Of course, when data on CLFPR for 2008 actually become available, we can compare the predicted value with the actual value (see Problem 1.10). The discrepancy between the two will represent the prediction error. Naturally, we would like to keep the prediction error as small as possible.
TABLE 1-3 Summary of the Steps Involved in Econometric Analysis
Step | Example
1. Statement of theory | The added/discouraged-worker hypothesis
2. Collection of data | Table 1-1
3. Mathematical model of theory | CLFPR = B1 + B2 CUNR
4. Econometric model of theory | CLFPR = B1 + B2 CUNR + u
5. Parameter estimation | $\widehat{\text{CLFPR}} = 69.462 - 0.5814\,\text{CUNR}$
6. Model adequacy check | $\widehat{\text{CLFPR}} = 81.3 - 0.638\,\text{CUNR} - 1.445\,\text{AHE82}$
7. Hypothesis test | B2 < 0 or B2 > 0
8. Prediction | What is CLFPR, given values of CUNR and AHE82?
Although we examined econometric methodology using an example from labor economics, we should point out that a similar procedure can be employed to analyze quantitative relationships between variables in any field of knowledge. As a matter of fact, regression analysis has been used in politics, international relations, psychology, sociology, meteorology, and many other disciplines. As an example, see Problem 1.9.
1.4 THE ROAD AHEAD
Now that we have provided a glimpse of the nature and scope of econometrics, let us see what lies ahead. The book is divided into four parts.
Part I introduces the reader to the bread-and-butter tool of econometrics, namely, the classical linear regression model (CLRM). A thorough understanding of CLRM is a must in order to follow research in the general areas of economics and business.
Part II considers the practical aspects of regression analysis and discusses a variety of problems that the practitioner will have to tackle when one or more assumptions of the CLRM do not hold.
Part III discusses two comparatively advanced topics, time-series econometrics and panel data regression models.
Part IV, consisting of Appendixes A, B, C, and D, reviews the basics of probability and statistics for the benefit of those readers whose knowledge of statistics has become rusty. The reader should have some previous background in introductory statistics.
This book keeps the needs of the beginner in mind.
The discussion of most topics is straightforward and unencumbered with mathematical proofs, derivations, and so on.¹³ I firmly believe that the apparently forbidding subject of econometrics can be taught to beginners in such a way that they can see the value of the subject without getting bogged down in mathematical and statistical minutiae. The student should keep in mind that an introductory econometrics course is just like the introductory statistics course he or she has already taken. As in statistics, econometrics is primarily about estimation and hypothesis testing. What is different, and generally much more interesting and useful, is that the parameters being estimated or tested are not just means and variances but relationships between variables, which is what much of economics and other social sciences is all about.
A final word: The availability of comparatively inexpensive computer software packages has now made econometrics readily accessible to beginners. In this book, we will largely use four software packages: EViews, Excel, Stata, and Minitab. These packages are readily available and widely used. Once students get used to such packages, they will soon realize that learning econometrics is really great fun, and they will have a better appreciation of the much maligned “dismal” science of economics.
¹³ Some of the proofs and derivations are presented in our Basic Econometrics, 5th ed., McGraw-Hill, New York, 2009. A more mathematical treatment is given in Damodar N. Gujarati, Linear Regression: A Mathematical Introduction, SAGE, Los Angeles, 2018.
KEY TERMS AND CONCEPTS
The key terms and concepts introduced in this chapter are as follows:
Econometrics
Mathematical economics
Discouraged-worker hypothesis (effect)
Added-worker hypothesis (effect)
Time-series data: Quantitative and qualitative
High-frequency data
Flash trading
Autocorrelation
Stationary
Cross-sectional data
Heterogeneity
Size or scale effect
Pooled data
Panel (or longitudinal or micropanel data)
Scatter diagram (scattergram)
Parameters: Intercept and slopes
Random error term (error term)
Linear regression model: Dependent variable, independent (or explanatory) variable
Causation
Parameter estimates
Principle of parsimony or Ockham’s razor
Hypothesis testing
Prediction (forecasting)
QUESTIONS
1.1. Suppose a local government decides to increase the tax rate on residential properties under its jurisdiction. What will be the effect of this on the prices of residential houses? Follow the eight-step procedure discussed in the text to answer this question.
1.2. How do you perceive the role of econometrics in decision making in business and government?
1.3.
Suppose you are an economic adviser to the chairman of the Federal Reserve Board (the Fed), and he asks you whether it is advisable to increase the money supply to bolster the economy. What factors would you take into account in your advice? How would you use econometrics in your advice?
1.4. To reduce the dependence on foreign oil supplies, the government is thinking of
increasing the federal taxes on gasoline. Suppose the Ford Motor Company has hired you to assess the impact of the tax increase on the demand for its cars. How would you go about advising the company?
1.5. President Joe Biden plans to propose to the U.S. Congress an infrastructure investment plan (highways, bridges, tunnels, etc.) at a cost of about $2 trillion. To pay for this, he also plans to increase the tax rate on high-income earners as well as private corporations, although the details are yet to be worked out. How would you design an econometric study to assess the economic consequences, both short term and long term, of his proposal?
PROBLEMS
1.6. Table 1-4 on the book’s website gives monthly data on the closing prices of the Dow Jones Industrial Average and the Standard & Poor’s 500 stock market indexes. The data are from Yahoo Finance’s historical stock quotations page.
a. Plot these data with time on the horizontal axis and the two variables on the vertical axis. If you prefer, you may use a separate figure for each variable.
b. What relationships do you expect to find between the two indexes? Why?
c. For each variable, “eyeball” a regression line from the scattergram.
d. Obtain monthly data for the two variables for the period from January 2012 to December 2020 and repeat questions a, b, and c and find out if there are any changes in the results. If so, what might account for the change?
1.7. Table 1-5 on the book’s website gives data on the exchange rate between the U.K. pound and the U.S. dollar (number of U.K. pounds per U.S. dollar), as well as the consumer price indexes in the two countries for the period 1985–2007.
a. Plot the exchange rate (ER) and the two consumer price indexes against time, measured in years.
b. Divide the U.S. CPI by the U.K. CPI and call it the relative price ratio (RPR).
c. Plot ER against RPR.
d. Visually sketch a regression line through the scatter points.
e.
Update the data in Table 1-5 to year 2020. Repeat questions a, b, c, and d and find out if there are any changes in the results. What accounts for the change, if any, in the results?
1.8. Table 1-6 on the textbook website contains data on 1,247 cars for 2008.¹⁴ To find out if there is a relationship between a car’s MPG (miles per gallon) and the number of cylinders it has:
a. Create a scatterplot of the combined MPG for the vehicles based on the number of cylinders.
b. Sketch a line that seems to fit the data.
c. What type of relationship is indicated by the plot?
¹⁴ Data were collected from the U.S. Department of Energy website at http://www.fueleconomy.gov/.
1.9. Table 1-7 on the book’s website gives data on the Corruption Perception Index and GDP per worker.
a. Plot Corruption Perception Index against GDP per worker.
b. A priori, what kind of relationship do you expect between the two variables?
c. Does the scattergram suggest that the relationship between the two variables is linear (i.e., a straight line)? If so, sketch the regression line.
1.10. Table 1-2 on the website updates the data given in Table 1-1 for the years 2001–2016. For the years 2001–2007, the CLFPR and CUNR figures are the same as those shown in Table 1-1. However, the AHE82 figures differ in the two periods. As pointed out in the text, the differences are usually due to data revisions.
a. Plot CLFPR and CUNR as in Figure 1-1. What difference do you see in the two scattergrams?
b. Is the relationship between the variables linear as in Figure 1-1? If so, visually sketch a regression line through the scatterplot.
c. Is there a “break” in the data in the sense that after a certain date, the relationship between the two variables has changed? Can you spot that break point?
d. Based on the data in Table 1-2, the regression results corresponding to Equation (1.3) are as follows:
$\widehat{\text{CLFPR}} = 66.5245 - 0.2465\,\text{CUNR}$
How does this regression differ from the one shown in Equation (1.3)? What may be the reason for the difference?
e. The regression results corresponding to Equation (1.5) using the data in Table 1-2 are as follows:
$\widehat{\text{CLFPR}} = 112.1761 + 0.0150\,\text{CUNR} - 5.4385\,\text{AHE82}$
How does this regression differ from the one shown in Equation (1.5)? What might explain the difference between the two regression results?
Note: The full results of the preceding two regressions will be discussed in Chapter 3 after we discuss the theory behind regression analysis.
1.11.
Table 1-8 on the book’s website gives quarterly data on real personal consumption expenditure (RPCE) and real personal disposable (after-tax) income (RPDI) for the years 2014–2019.
a. Plot RPCE and RPDI on the same graph. What is your impression about the two time series?
b. Graph RPCE against RPDI. What does the scattergram show?
c. Visually sketch a regression line through the scatter points. What does it show?
d. Save the data for further analysis in subsequent chapters.
1.12. Based on the data for 1947–2002, Kellstedt and Whitten obtained the following regression:¹⁵
Mt = 74.00 - 2.71 GDPt
where M = percentage of households in which a married couple is present and GDP = gross domestic product.
a. Does this result make sense?
b. How would you interpret the regression?
c. Is there a cause-and-effect relationship between the two variables?
d. The regression results given above may be an example of what is called spurious or nonsense regression. We may have more to say about it in a later chapter.
1.13. Table 1-9 on the book’s website gives data on the following variables for 99 countries obtained from the Human Development Report for 1994.
LifeExp = 1992 life expectancy at birth
TV = Televisions per 100 people
PopDoc = Population per doctor (1990)
GDP = real GDP per person adjusted for PPP (purchasing power parity)
a. Plot life expectancy against each of the other variables in separate graphs.
b. A priori, what do you expect the relationship is between LifeExp and each of the other variables: positive, negative, or no relationship?
¹⁵ Paul M. Kellstedt and Guy D. Whitten, The Fundamentals of Political Science Research, Cambridge University Press, 2nd ed., New York, 2013, p. 262.
SUGGESTIONS FOR FURTHER READING
“The Usefulness of Applied Econometrics to the Policy Maker,” Address by R. Frances, President, Federal Bank of St. Louis, at the National Association of Business Economist Seminar, Chicago, Illinois, April 4, 1973, Federal Bank of St. Louis, May 1973.
“What Is Econometrics?” International Monetary Fund, Finance and Development, December 2011, vol. 48, No. 4 (https://www.imf.org/extenal/pubs/ft/famd/2011/12/basics.htm).
On corruption, read https://ourworldindata.org/corruption.
APPENDIX 1A: Economic Data on the World Wide Web¹⁶
Economic Statistics Briefing Room: An excellent source of data on output, income, employment, unemployment, earnings, production and business activity, prices and money, credits and security markets, and international statistics.
http://www.whitehouse.gov/fsbr/esbr.htm
Federal Reserve System Beige Book: Gives a summary of current economic conditions by the Federal Reserve District. There are 12 Federal Reserve Districts.
www.federalreserve.gov/FOMC/BeigeBook/2008
National Bureau of Economic Research (NBER) Home Page: This highly regarded private economic research institute has extensive data on asset prices, labor, productivity, money supply, business cycle indicators, and so on. NBER has many links to other websites.
http://www.nber.org
Panel Study: Provides data on longitudinal survey of representative sample of U.S. individuals and families. These data have been collected annually since 1968.
http://www.umich.edu/-psid The Federal Web Locator: Provides information on almost every sector of the federal government; has international links. www.lib.auburn.edu/madd/docs/fedloc.html WebEC: Resources in Economics: A most comprehensive library of economic facts and figures. www.helsinki.fi/WebEc American Stock Exchange: Information on some 700 companies listed on the second largest stock market. 16It should be noted that this list is by no means exhaustive. The sources listed here are updated continually.
http://www.amex.com/

Bureau of Economic Analysis (BEA) Home Page: This agency of the U.S. Department of Commerce, which publishes the Survey of Current Business, is an excellent source of data on all kinds of economic activities.
www.bea.gov

Business Cycle Indicators: You will find data on about 256 economic time series.
http://www.globalexposure.com/bci.html

CIA Publication: You will find the World Fact Book (annual).
www.cia.gov/library/publications

Energy Information Administration (Department of Energy [DOE]): Economic information and data on each fuel category.
http://www.eia.doe.gov/

FRED Database: Federal Reserve Bank of St. Louis publishes historical economic and social data, which include interest rates, monetary and business indicators, exchange rates, and so on.
http://www.stls.frb.org/fred/

International Trade Administration: Offers many web links to trade statistics, cross-country programs, and so on.
http://www.ita.doc.gov/

STAT-USA Databases: The National Trade Data Bank provides the most comprehensive source of international trade data and export promotion information. It also contains extensive data on demographic, political, and socioeconomic conditions for several countries.
http://www.stat-usa.gov/

Bureau of Labor Statistics: The home page contains data related to various aspects of employment, unemployment, and earnings and provides links to other statistical websites.
http://stats.bls.gov

U.S. Census Bureau Home Page: Prime source of social, demographic, and economic data on income, employment, income distribution, and poverty.
http://www.census.gov/

General Social Survey: Annual personal interview survey data on U.S. households that began in 1972. More than 35,000 have responded to some 2,500 different questions covering a variety of data.
www.norc.org/GCS+Website

Institute for Research on Poverty: Data collected by nonpartisan and nonprofit university-based research center on a variety of questions relating to poverty and social inequality.
http://www.ssc.wisc.edu/irp/

Social Security Administration: The official website of the Social Security Administration with a variety of data.
http://www.ssa.gov

Federal Deposit Insurance Corporation, Bank Data and Statistics
http://www.fdic.gov/bank/statistical/

Federal Reserve Board, Economic Research and Data
http://www.federalreserve.gov/econresdata

U.S. Census Bureau, Home Page
http://www.census.gov

U.S. Department of Energy, Energy Information Administration
www.eia.doe.gov/overview_hd.html

U.S. Department of Health and Human Services, National Center for Health Statistics
http://www.cdc.gov/nchs

U.S. Department of Housing and Urban Development, Data Sets
http://www.huduser.org/datasets/pdrdatas.html

U.S. Department of Labor, Bureau of Labor Statistics
http://www.bls.gov

U.S. Department of Transportation, TranStats
http://www.transtats.bts.gov

U.S. Department of the Treasury, Internal Revenue Service, Tax Statistics
http://www.irs.gov/taxstats

Rockefeller Institute of Government, State and Local Fiscal Data
www.rockinst.org/research/sl_finance

American Economic Association, Resources for Economists
http://www.rfe.org

American Statistical Association, Business and Economic Statistics
www.amstat.org/publications/jbes

American Statistical Association, Statistics in Sports
http://www.amstat.org/sections/sis/

European Central Bank, Statistics
http://www.ecb.int/stats

World Bank, Data and Statistics
http://www.worldbank.org/data

International Monetary Fund, Statistical Topics
http://www.imf.org/external/np/sta/

Penn World Tables
http://pwt.econ.upenn.edu

Current Population Survey
http://www.bls.census.gov/cps/

Consumer Expenditure Survey
http://www.bls.gov/cex/

Survey of Consumer Finances
http://www.federalreserve.gov/pubs/oss/

City and County Data Book
http://www.census.gov/statab/www/ccdb.html

Panel Study of Income Dynamics
http://psidonline.isr.umich.edu

National Longitudinal Surveys
http://www.bls.gov/nls/

National Association of Home Builders, Economic and Housing Data
http://www.nahb.org/page.aspx/category/sectionID=113

National Science Foundation, Division of Science Resources Statistics
http://www.nsf.gov/sbe/srs/

Economic Report of the President
http://www.gpoaccess.gov/eop/

Various Economic Data Sets
http://www.economy.com/freelunch/

The Economist Market Indicators
http://www.economist.com/markets/indicators

Statistical Resources on the Military
http://www.lib.umich.edu/govdocs/stmil.html

World Economic Indicators
http://devdata.worldbank.org/

Economic Time Series Data
http://www.economagic.com/

United Nations Population Division's Annual Estimates and Projections
http://unstats.un.org/unsd/default.htm

United Nations Statistics Division-UNdata
http://data.un.org/Default.aspx

World Bank Data
http://databank.worldbank.org/
PART I
THE LINEAR REGRESSION MODEL

The objective of Part I, which consists of five chapters, is to introduce the reader to the “bread-and-butter” tool of econometrics, namely, the linear regression model.

Chapter 2 discusses the basic ideas of linear regression in terms of the simplest possible linear regression model, in particular, the two-variable model. We make an important distinction between the population regression model and the sample regression model and estimate the former from the latter. This estimation is done using the method of least squares, one of the popular methods of estimation.¹

Chapter 3 considers hypothesis testing. As in any hypothesis testing in statistics, we try to find out whether the estimated values of the parameters of the regression model are compatible with the hypothesized values of the parameters. We do this hypothesis testing in the context of the classical linear regression model (CLRM). We discuss why the CLRM is used and point out that the CLRM is a useful starting point. In Part II, we will reexamine the assumptions of the CLRM to see what happens to the CLRM if one or more of its assumptions are not fulfilled.

Chapter 4 extends the idea of the two-variable linear regression model developed in the previous two chapters to multiple regression models, that is, models having more than one explanatory variable. Although in many ways the multiple regression model is an extension of the two-variable model, there are differences when it comes to interpreting the coefficients of the model and in the hypothesis-testing procedure.

The linear regression model, whether two-variable or multivariable, only requires that the parameters of the model be linear; the variables entering the model need not themselves be linear.

¹ An alternative is the method of maximum likelihood (ML), which we do not discuss in this text because it is mathematically a bit complex. For an introduction to ML, see Damodar Gujarati, Econometrics by Example, 2nd ed., Palgrave-Macmillan, London, 2015, pp. 25–26.
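To make the idea of least squares and of "linear in the parameters" concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function name ols_two_variable, the symbols b1 (intercept) and b2 (slope), and both small data sets are made up for this example and do not come from the text. The data contain no error term, so the estimates reproduce the coefficients used to generate them.

```python
import math


def ols_two_variable(x, y):
    """Least-squares intercept b1 and slope b2 for Y = b1 + b2*X.

    Hypothetical helper written for this illustration.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sum of cross-deviations divided by sum of squared X deviations.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = s_xy / s_xx
    # Intercept: the fitted line passes through the sample means.
    b1 = y_bar - b2 * x_bar
    return b1, b2


# Made-up sample that is linear in X: Y = 2 + 3X (no disturbance term).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 + 3.0 * xi for xi in x]
b1, b2 = ols_two_variable(x, y)

# Made-up sample that is nonlinear in X but linear in the parameters:
# Y = 1 + 0.5*ln(X). Regressing Y on ln(X) is still a linear regression.
y_log = [1.0 + 0.5 * math.log(xi) for xi in x]
ln_x = [math.log(xi) for xi in x]
c1, c2 = ols_two_variable(ln_x, y_log)

print(round(b1, 6), round(b2, 6))
print(round(c1, 6), round(c2, 6))
```

The second regression uses the same estimator as the first; only the regressor has been transformed. That is the sense in which a model can be nonlinear in the variables and still be a linear regression model.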
Chapter 5 considers a variety of models that are linear in the parameters (or can be made so) but are not necessarily linear in the variables. With several illustrative examples, we point out how and where such models can be used.

Often the explanatory variables entering into a regression model are qualitative in nature, such as sex, race, and religion. Chapter 6 shows how such variables can be measured and how they enrich the linear regression model by taking into account the influence of variables that otherwise cannot be quantified. This chapter also considers briefly models in which the dependent variable is also dummy or qualitative.

Part I makes an effort to “wed” practice to theory. The availability of user-friendly regression packages allows you to estimate a regression model without knowing much theory, but remember the adage that “a little knowledge is a dangerous thing.” So even though theory may be boring, it is absolutely essential in understanding and interpreting regression results. Besides, by omitting all mathematical derivations, we have made the theory “less boring.”
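As a small preview of how a qualitative explanatory variable can be "measured," the following Python sketch codes a two-category variable as a 0/1 dummy and runs a least-squares regression on it. Everything here is an assumption made for illustration: the helper name ols_two_variable, the variable names d and y, and the six data points are invented and are not taken from the text.

```python
def ols_two_variable(x, y):
    """Least-squares intercept b1 and slope b2 for Y = b1 + b2*X.

    Hypothetical helper written for this illustration.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = s_xy / s_xx
    b1 = y_bar - b2 * x_bar
    return b1, b2


# Made-up data: d = 1 if the observation belongs to a category, 0 otherwise.
d = [0, 0, 0, 1, 1, 1]
y = [10.0, 12.0, 14.0, 15.0, 17.0, 19.0]

b1, b2 = ols_two_variable(d, y)

# Group means, computed directly for comparison.
mean_d0 = sum(yi for di, yi in zip(d, y) if di == 0) / d.count(0)
mean_d1 = sum(yi for di, yi in zip(d, y) if di == 1) / d.count(1)

print(b1, b2)            # intercept and dummy coefficient
print(mean_d0, mean_d1)  # mean of Y in each category
```

With a single 0/1 regressor, the intercept equals the mean of Y in the category coded 0, and the dummy coefficient equals the difference between the two category means. This is why a dummy coefficient is read as a shift relative to the base category.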