# Applied Statistics - Introduction

Slides for the introductory section of my introductory course in applied statistics.

### Applied Statistics - Introduction

1. Applied Statistics for Economics: 1. Introduction. SFC - juliohuato@gmail.com, Spring 2012.
2. Outline: Econometrics; Illustrations; On method.
3. Definitions: Statistics, or statistical inference, is the set of methods used in science, technology, and industry to extract information from data. Data is a set of records drawn from observations of the world. When used in economics (and also business management, finance, and a number of social sciences) and in policymaking,[1] statistical methods are often called econometrics. We will see that there is a good reason for the terminological distinction. We will follow this convention and refer to our course as introductory econometrics. [1] Policymaking means choosing rules of behavior (‘policies’). We usually think of governments making (and implementing) policies, but this also applies to any other organization (business, household, nonprofit, club) or individual. In this sense, business managers or heads of household are “policymakers.”
4. Econometric applications: In practice, econometrics tests empirically whether theories[2] about social or economic behavior match observed facts, forecasts future values of economic variables of interest, fits economic models to real-world data, and uses historical data to make quantitative policy recommendations to policymakers. [2] By theory (or model), I mean a clear statement about the relationship between at least two variables of interest. In very general terms, a theory is a statement of the following type: “If x, then y.” Often, x is called the ‘premises’ and y the ‘conclusions.’ More specifically, a simple theory about cigarette consumption would be a statement like this: “Other things equal, if cigarette prices increase, the consumption of cigarettes will decline.”
5. The econometrics approach: Ideally, as a scientific discipline, econometrics uses (1) statistics (a branch of deductive mathematics), (2) probability theory (a theory of uncertainty in the world), and (3) economics (a theory about how economic variables are related) in response to the practical concerns of policymakers. Ultimately, it is the practical needs of policymakers that dictate which theories to test empirically, which relationships to estimate, and which variables to forecast.
6. Illustrations: To illustrate the use of econometrics (and the reason why we call it ‘econometrics’ rather than just ‘statistics’), consider the following examples.
7. Class size and grades: Does reducing class size improve elementary school education? The question cannot be answered well by looking at the data casually. Suppose we do and note that smaller classes and higher grades go together. This may be due to other advantages that students in small classes have over students in bigger classes. E.g., students in smaller classes may have richer parents, greater access to libraries, etc. The available data don’t come from an experiment in which otherwise identical students are placed in classes of different sizes and then tested on their academic performance.[3] Hence, we need special tricks to examine this kind of data and try to answer the question. [3] In Latin, the word “data” is the plural of the singular “datum.” However, we may subsequently say “data is ...” rather than, awkwardly, “data are ....”
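The class-size problem can be sketched with a small simulation. This is only an illustration under invented assumptions: the share of rich families, the 2-point effect of a small class, and the 10-point effect of family wealth are made-up numbers, not estimates from the slides or from real data.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

def simulate_student():
    """One hypothetical student; every number here is an invented assumption."""
    rich = random.random() < 0.5
    # Assumption: students with richer parents are more likely to be in small classes.
    small_class = random.random() < (0.8 if rich else 0.2)
    # Assumption: a small class adds 2 points, rich parents add 10 points.
    score = 60 + 2 * small_class + 10 * rich + random.gauss(0, 5)
    return rich, small_class, score

students = [simulate_student() for _ in range(20000)]

def mean_gap(rows):
    """Average score in small classes minus average score in big classes."""
    small = [score for _, in_small, score in rows if in_small]
    big = [score for _, in_small, score in rows if not in_small]
    return statistics.mean(small) - statistics.mean(big)

# Casual look at the data: compare all small classes with all big classes.
naive_gap = mean_gap(students)
# Comparing otherwise similar students: same question within each income group.
gap_rich = mean_gap([row for row in students if row[0]])
gap_poor = mean_gap([row for row in students if not row[0]])

print(f"casual comparison: {naive_gap:.1f} points")
print(f"within rich: {gap_rich:.1f}, within poor: {gap_poor:.1f}")
```

Under these assumptions the casual comparison shows a gap of roughly 8 points, while the gap among otherwise similar students stays close to the assumed 2 points. The difference is the advantage of richer parents being mistaken for an effect of class size.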
8. Racial discrimination in mortgage lending: Is there racial discrimination in the market for home loans? Again, a casual look at the data won’t do. If, after looking at the data, we say that black applicants are denied loans more often than white applicants and the issue is race, a critic may object that the correlation between race and mortgage approvals may be due to other reasons. For instance, black people may be poorer and have less property to use as collateral. Then the issue is not race, but income or wealth.
9. Racial discrimination in mortgage lending: Again, the data don’t come from black and white people who are otherwise similar. We need econometrics (not just statistics) to get around the deficiency of the data. We need to isolate the race effect from other effects. One cause doesn’t exclude the other. Moreover, the causes may interact. Discrimination may result not only from being black or only from being poor, but from being both black and poor!
10. Racial discrimination in mortgage lending: Notice how important a test like this can be for policy recommendations. If the main reason why black people are denied loans more often than whites is that they are black, then we mainly need the enforcement of civil rights laws. But if the main reason is that they are poor, then we mainly need actions and resources to fight poverty, joblessness, etc. If the reason is the interaction between race and economic condition, then the combination of policies required to address the problem will also be different. The recommended courses of action depend on the diagnosis. And since the resources of a community to deal with its problems are finite, you want to put those limited resources to their most effective uses.
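To make the idea of separate and interacting causes concrete, here is a minimal arithmetic sketch. The four denial rates are hypothetical numbers chosen for illustration, and the names `denial_rate`, `race_effect`, `poverty_effect`, and `interaction` are not from the slides.

```python
# Hypothetical loan denial rates in percent, keyed by (is_black, is_poor).
# These four numbers are invented for illustration; they are not estimates.
denial_rate = {
    (False, False): 10,
    (True, False): 15,
    (False, True): 20,
    (True, True): 40,
}

baseline = denial_rate[(False, False)]
# Race alone: compare black and white applicants who are both non-poor.
race_effect = denial_rate[(True, False)] - baseline
# Poverty alone: compare poor and non-poor applicants who are both white.
poverty_effect = denial_rate[(False, True)] - baseline
# Interaction: what is left for black and poor applicants after adding up
# the two separate effects.
interaction = denial_rate[(True, True)] - baseline - race_effect - poverty_effect

print(race_effect, poverty_effect, interaction)  # 5 10 15
```

In this made-up table the interaction term is the largest of the three, which is the case where neither civil rights enforcement alone nor anti-poverty policy alone would address most of the gap.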
11. Taxes and cigarette smoking: How much do cigarette taxes reduce smoking? Suppose you look at data on cigarette sales, prices, taxes, and personal income for U.S. states in the 1980s and 1990s, and note that states with low taxes and low prices have higher smoking rates, and vice versa. A problem here is double causality. Presumably, low taxes lead to high demand. But also, because of high demand, there will be many voters who smoke, and politicians may try to keep cigarette taxes low to get reelected. Econometric methods, as opposed to regular statistical inference that relies on experimental data, have ways to get around this double causality problem.
12. Forecasting future inflation: What will the inflation rate be next year? Nowadays, most central banks think of their mission as controlling inflation (they used to think their mission was to help the economy reach full employment). They set interest rates based on their outlook for future inflation. If they think inflation will increase, they may want to slow down the economy by raising rates, and vice versa. If they guess wrong, they can cause an unnecessary recession or allow inflation to spin out of control.
13. Required answers: To give quantitative answers to these questions, we use data. If we use different data sets, we may get different answers. The answer depends on the data we use, so it is uncertain. What kind of quantitative answers do we need? Does reducing class size improve elementary school education? If class size is reduced by 10%, holding constant other student characteristics, students’ test scores increase by x%.
14. Required answers: Is there racial discrimination in the market for home loans? Holding constant all other characteristics of loan applicants and potential applicants,[4] being black reduces your chances of getting a loan by x%. How much do cigarette taxes reduce smoking? If the price of cigarettes increases by 1%, holding constant the income of smokers and potential smokers[5] and all other variables, the smoking rate declines by x%. [4] Potential applicants must be included in the data sample because it may well be that some black people don’t apply for loans because they believe they’ll be denied. And loan discrimination is what we’re trying to measure. [5] Again, we include potential smokers who don’t currently smoke because a hefty tax may discourage them from joining the smoking club, and vice versa.
15. Required answers: To answer these questions, we need the multiple regression model that we’ll introduce by the end of the course. However, because this is an introductory course, we may not be able to get to the topics where we can actually learn the tricks to get around all the data deficiencies indicated above. Some, perhaps, but not all. But at least we will know that these issues exist.
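As a rough preview, and only as a sketch, a multiple regression for the class-size question could be written as follows. The variable names and the choice of parents’ income as the single control are assumptions made here for illustration; the slides do not specify a model.

$$
\text{TestScore}_i = \beta_0 + \beta_1\,\text{ClassSize}_i + \beta_2\,\text{ParentIncome}_i + u_i
$$

Here $i$ indexes students, $u_i$ collects everything else that affects test scores, and $\beta_1$ is the change in the test score associated with one more student in the class, holding parents’ income constant. That “holding constant” clause is what the quantitative answers on the previous slides ask for.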
16. Required answers: What will the inflation rate be next year? Here the type of answer is obvious: The inflation rate next year will be x%. In this course, we will not be able to study the econometric methods required to answer this type of question. These methods are called time-series econometrics, and they are heavily used in macroeconomics and finance.
17. Causality: An action causes an outcome if the outcome is the immediate result or consequence of that action. Causality means that a specific action (fertilizing tomatoes) leads to a specific measurable consequence (more tomatoes). How do we measure whether a specific action is the cause of certain effects? We can run an experiment. For that we need many plots with tomato plants. They must be, as far as possible, identical except in the amount of fertilizer applied. Moreover, the decision whether a plot should be fertilized or not must be random to make sure that the only systematic difference between the plots is whether they are fertilized or not. We record the amount of fertilizer and count the tomatoes at the end of the cycle.
18. Causality: That’s a randomized controlled experiment. The non-fertilized plots are called the control group. The fertilized plots are the treatment group. It is randomized because the treatment is assigned randomly to eliminate the possibility of other systematic differences between the control and treatment groups. If the experiment is conducted on a sufficiently large scale, then we may be able to estimate the causal effect of fertilizing on tomato production.
19. Causality: Our definition of causal effect: the effect on an outcome of a given action or treatment as measured in a randomized controlled experiment. In such an experiment, the only systematic reason for differences in outcomes between the control and treatment groups is the treatment itself. We cannot always conduct experiments in economic life. They’d be too costly, unethical, or practically impossible. So a randomized controlled experiment will be only a theoretical benchmark for us.
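The tomato experiment can be sketched as a simulation. Everything numeric here is an assumption for illustration: the number of plots, the baseline yield, the spread of soil quality, and the “true” effect of 5 extra tomatoes per fertilized plot.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

N_PLOTS = 2000
TRUE_EFFECT = 5.0  # assumed extra tomatoes per fertilized plot (invented number)

plots = []
for _ in range(N_PLOTS):
    soil_quality = random.gauss(0, 3)    # plots differ in ways we do not observe
    fertilized = random.random() < 0.5   # a coin flip assigns the treatment
    tomatoes = 40 + soil_quality + TRUE_EFFECT * fertilized + random.gauss(0, 2)
    plots.append((fertilized, tomatoes))

treatment_group = [t for fertilized, t in plots if fertilized]
control_group = [t for fertilized, t in plots if not fertilized]

# Because assignment is random, soil quality is on average the same in both
# groups, so the difference in mean yields estimates the causal effect.
estimated_effect = statistics.mean(treatment_group) - statistics.mean(control_group)
print(f"estimated effect: {estimated_effect:.2f} tomatoes per plot")
```

With a large number of plots, the estimate should land near the assumed effect of 5. With observational data the coin flip is missing, and that is the gap econometric methods try to fill.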
20. Causality: Note that, to answer the fourth question (what will the inflation rate be next year?), we do not need to know the causes of inflation. All we need to know is how to make a reliable forecast. We can forecast rain if we look through a window and see people carrying their umbrellas, relying on the fact that people tend to carry umbrellas when they expect rain. But the use of umbrellas is not the cause of rain.[6] [6] Advanced time-series econometrics also has methods to estimate causes: these methods fall under the rubric of ‘structural models.’
21. Data sources: By origin or source, there are two basic types of data: (1) experimental data and (2) observational data. In economics (and to a large extent in business) we use observational data. We need to use econometric tricks to estimate causal effects from observational data. In the real world, the levels of “treatment” are not assigned at random, and it is therefore hard to disentangle the effect of the “treatment” from the effects of other causes. That’s what econometrics is for. That’s why econometrics exists, as opposed to mere statistical inference of the type used in the physical and natural sciences.
22. Data types: (1) Cross-sectional data: data on different entities (individuals, firms, states, countries, etc.) for a single period of time. (2) Time-series data: data for a single entity (individual, firm, state, country, etc.) from different periods of time or at different points in time.[7] (3) Longitudinal or panel data: data for more than one entity in which each entity is observed at two or more periods of time. [7] For more on this difference, see my review slides on flows, stocks, and accounting.
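The three data types can be illustrated with a tiny made-up data set. The entities “A”, “B”, “C”, the years, and the values are all hypothetical; the point is only the shape of each kind of data.

```python
# Hypothetical panel: (entity, year, inflation rate in %). All values are made up.
panel = [
    ("A", 2010, 1.5), ("A", 2011, 2.0), ("A", 2012, 2.5),
    ("B", 2010, 3.0), ("B", 2011, 3.5), ("B", 2012, 4.0),
    ("C", 2010, 0.5), ("C", 2011, 1.0), ("C", 2012, 1.5),
]

# Cross-sectional data: many entities, one period.
cross_section = [(entity, value) for entity, year, value in panel if year == 2011]

# Time-series data: one entity, many periods.
time_series = [(year, value) for entity, year, value in panel if entity == "A"]

print(cross_section)  # [('A', 2.0), ('B', 3.5), ('C', 1.0)]
print(time_series)    # [(2010, 1.5), (2011, 2.0), (2012, 2.5)]
```

The full `panel` list is itself the panel (longitudinal) data: each entity is observed in more than one period, and fixing the year or the entity recovers the other two types.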
23. Wrap-up: (1) Why do we need to give quantitative answers to some questions? (2) What’s a causal effect? (3) What is a randomized controlled experiment? (4) What’s econometrics for? (5) Why do we need techniques different from those used in the physical and natural sciences? (6) What is the difference between cross-sectional, time-series, and panel data?