Eidgenössische Technische Hochschule Zürich
Swiss Federal Institute of Technology Zurich
Overview of the Possibilities of Quantitative Methods in Political Science
Tobias Böhmelt
ETH Zurich
firstname.lastname@example.org
International Relations
Overview
• Introduction
• EITM - The Importance of Methods
• Choice of Methods
• What is Quantitative Methodology?
• The Approach of Quantitative Methods in Political Science
• Short Overview of Possibilities
• Some Problems and Caveats
• Conclusion
Introduction
• What do I hope to accomplish?
  – Teaching you in-depth knowledge of some quantitative approaches?
  – Teaching you how to employ quantitative methods?
  – Making you familiar with statistical software packages?
• The answer is simple – no.
• Instead:
  – Clarify the value and challenges of quantitative research.
  – Help you to get interested in these methods for conducting more effective research.
EITM – The Importance of Methods: Why Do We Need Methods to Answer Questions in Political Science?
EITM – Empirical Implications of Theoretical Models
• Prerogative of theory.
• Characteristics of theory determine the testing method: scope and generality, parsimony and complexity, prediction and explanation.
• Estimating average causal effects or explaining the complexity of a single event?
• The "degrees of freedom problem": most theories argue ceteris paribus, so other effects have to be controlled for. This is often not possible with one or two cases.
• Is it important how much a variable matters or just that it matters?
• Case selection: selection bias, self-selection, selection on the dependent variable, and lack of independence of cases lead to false conclusions.
EITM – The Importance of Methods
The Basic Research Design Problem
• N problems = .
• For any problem, N theories = .
• For any theory, N models = .
• For any problem, the number of empirical specifications = .
This has implications for the use of methods!
EITM – The Importance of Methods
• Science contributes to society by simplifying complex phenomena.
  – Its value increases with the value of the simplification.
• Interesting topics per se are insufficient.
  – You must be able to lead people from where they are to a better conclusion.
1. The goal is inference.
2. The procedures are public.
3. The conclusions are uncertain.
4. The content is the method.
Choice of Methods
Factors Influencing the Research Outcome – A Methods Perspective
• The chosen theoretical approach (paradigm) affects the results – approaches often predefine the method to be applied for testing hypotheses.
• The method you choose to test propositions affects the results you get. Quantitative vs. qualitative approaches: scope and generalizability are crucial!
• Case selection: selecting cases on the basis of the dependent variable leads to selection bias and thereby impedes the accumulation of knowledge.
• Careful case selection on explanatory variables is crucial in order to obtain reliable and valid results.
• Selection criteria should be stated explicitly to ensure replicability and to show how selection possibly drives the results.
Choice of Methods
Different Methods Have Different Comparative Advantages
• Deduction: method follows theory:
  – Test implications of theories against empirical observations.
  – Hypothesis testing: logic of confirmation.
• Induction: method used to create or amend theories:
  – Develop theories: induction, hypothesis formation by studying deviant and outlier cases, historical explanation of individual cases.
  – Modify theories: adapt theories to outliers.
Choice of Methods
• Trade-off between explanation and prediction.
• In general: quantitative methods have high predictive power and qualitative methods have high explanatory power.
• Theory testing often requires the combination of qualitative and quantitative methods:
  – qualitative research looks at outliers of a quantitative analysis.
  – case studies identify and conceptualize important variables.
  – study the crucial case to test the underlying causal mechanism.
  – study deviant or outlier cases to analyze why these cases do not fit the theory.
  – study important historical cases.
What is Quantitative Methodology?
Has to do with "numbers"…
Simple example demonstrating the "usefulness" of statistics:
Homer is questioned about his newly formed vigilante group.
Newscaster: "Since your group started up, petty crime is down 20%, but other crimes are up. Such as heavy sack beating, which is up 800%. So you're actually increasing crime."
Homer: "You can make up statistics to prove anything."
What is Quantitative Methodology?
Curtis Signorino (1999), "How to Translate a Theory into a Statistical Model":
1. Specify the theoretical choice model.
2. Add a random component (the source of uncertainty).
3. Derive the probability model associated with one's dependent variable.
4. Construct a likelihood equation based on the probability model.
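The four steps can be sketched with a much simpler model than the strategic choice models Signorino discusses. The example below assumes a plain binary logit: the latent utility is b0 + b1*x (step 1), a logistic error is added (step 2), which yields the choice probability (step 3) and the log-likelihood (step 4). The data, the parameter values, and the function names are hypothetical and only serve as an illustration.

```python
import math

def choice_probability(b0, b1, x):
    """Steps 1-3: P(y = 1 | x) if the latent utility b0 + b1*x
    is disturbed by a logistic error term (an assumption of this sketch)."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def log_likelihood(b0, b1, xs, ys):
    """Step 4: log-likelihood of the observed binary outcomes."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = choice_probability(b0, b1, x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

# Hypothetical data: one explanatory variable, one binary outcome.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [0, 0, 1, 0, 1, 1]

ll_null = log_likelihood(0.0, 0.0, xs, ys)  # x has no effect
ll_alt = log_likelihood(0.0, 1.0, xs, ys)   # x raises the probability of y = 1
print(f"log-likelihood without effect: {ll_null:.3f}")
print(f"log-likelihood with b1 = 1:    {ll_alt:.3f}")
```

Maximum likelihood estimation would search for the values of b0 and b1 that make this log-likelihood as large as possible; the sketch only compares two candidate parameter values.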
What is Quantitative Methodology?
• Research techniques that are used to gather and analyze quantitative data, i.e., information dealing with anything that is measurable.
• Descriptive statistics: description of central variables by statistical measures such as median, mean, standard deviation, and variance.
• Inferential statistics: test for a relationship between variables – at least one explanatory factor and one dependent variable.
• Inference is the goal:
  – is it possible to generalize the regression results for the sample under observation to the universe of cases (the population)?
  – can you draw conclusions for individuals, countries, and time points beyond the observations in your dataset?
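As a small illustration of the descriptive measures named above, the following sketch uses Python's standard `statistics` module. The eight values are made up for this example and do not come from any dataset mentioned in these slides.

```python
import statistics

# Hypothetical values of one variable, e.g. a score for eight countries.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(values)      # sample standard deviation

print(f"mean = {mean}, median = {median}")
print(f"variance = {variance:.3f}, standard deviation = {std_dev:.3f}")
```

These numbers only describe the sample at hand; whether they say anything about the population is the question of inferential statistics.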
What is Quantitative Methodology?
• For the application of quantitative data analysis it is crucial that the selected method is appropriate for the data structure.
• Dependent variable:
  – Dimensionality: spatial and dynamic.
  – Continuous or discrete.
  – Binary, ordinal categories, count.
  – Distribution: normal, logistic, Poisson, negative binomial.
• Critical points:
  – Measurement level of the DV and IV.
  – Expected and actual distribution of the variables.
  – Number of observations and variance.
What is Quantitative Methodology?
Definition of Key Concepts:
• Variable: any measured characteristic or attribute that has the potential to differ across subjects.
• Independent variables – explanatory variables – exogenous variables – explanans: variables that are causal for a specific outcome (necessary conditions).
• Intervening variables: factors that affect the influence of independent variables; variables that interact with explanatory variables and alter the outcome (sufficient conditions).
• Dependent variables – endogenous variables – explanandum: outcome variables that we want to explain.
What is Quantitative Methodology?
Definition of Key Concepts:
• Sample: a specific subset of a population (the universe of cases).
  – Samples can be random or non-random (selected).
  – For most simple statistical models, random samples are a crucial prerequisite.
• Random sample: drawn from the population in a way that every item in the population has the same chance of being drawn – the observations of the random sample are thus independent of each other.
• Sampling error: one sample will usually not be completely representative of the population from which it was drawn – this random variation in the results is known as sampling error.
• For random samples, mathematical theory is available to assess the sampling error. Estimates obtained from random samples can be combined with measures of the uncertainty associated with the estimate, e.g., standard errors and confidence intervals.
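A minimal simulation can make sampling error and its measures concrete. The sketch below assumes a synthetic, normally distributed population (mean 50, standard deviation 10), repeatedly draws random samples of 100 units, and checks how often the 95% confidence interval around the sample mean contains the true population mean. All numbers and names are assumptions of this illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is reproducible

# Synthetic population (an assumption of this sketch).
population = [rng.gauss(50.0, 10.0) for _ in range(10_000)]
true_mean = statistics.mean(population)

def mean_with_ci(sample, z=1.96):
    """Sample mean, its standard error, and an approximate 95% interval."""
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / len(sample) ** 0.5
    return m, se, (m - z * se, m + z * se)

draws = 1000
covered = 0
for _ in range(draws):
    sample = rng.sample(population, 100)  # simple random sample
    _, _, (low, high) = mean_with_ci(sample)
    covered += low <= true_mean <= high

coverage = covered / draws
print(f"true mean: {true_mean:.2f}")
print(f"share of intervals containing the true mean: {coverage:.3f}")
```

Each individual sample mean misses the true mean by some amount, but because the samples are random, roughly 95% of the intervals should contain it.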
What is Quantitative Methodology?
Random Samples
• Observations are independent of each other.
• The random sample mimics the distribution and all characteristics of the underlying population.
• Sampling error is white noise, a random component with no structure, and can therefore be assessed by mathematical and statistical tools.
• Often: not observing a random sample renders statistical results biased and unreliable.
Selected Samples
• Sample selected on the basis of a specific criterion connected with the dependent variable.
• Sample selection often precludes inference beyond the sample and renders estimation results biased.
• One has to be aware of possible sample selection and account for the possible bias, especially of test statistics.
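The bias from selecting on the dependent variable can be illustrated with simulated data. The sketch below assumes a true relationship y = 1 + 2x + noise, estimates the slope once on the full random sample and once on a sample that only keeps cases with y > 1. The data-generating process, the cut-off, and the helper `ols_slope` are assumptions made for this example.

```python
import random

def ols_slope(xs, ys):
    """Slope of a bivariate least-squares regression of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(1)  # fixed seed so the run is reproducible

# Simulated data with a known slope of 2 (assumption of this sketch).
xs = [rng.gauss(0.0, 1.0) for _ in range(5000)]
ys = [1.0 + 2.0 * x + rng.gauss(0.0, 2.0) for x in xs]

full_slope = ols_slope(xs, ys)

# Selected sample: keep only cases with a high value of the dependent variable.
selected = [(x, y) for x, y in zip(xs, ys) if y > 1.0]
selected_slope = ols_slope([x for x, _ in selected], [y for _, y in selected])

print(f"slope in the random sample:   {full_slope:.2f}")
print(f"slope in the selected sample: {selected_slope:.2f}")
```

In the random sample the estimate should be close to the true slope of 2, while the selected sample should understate it considerably.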
The Approach of Quantitative Political Science
Datasets
• Datasets contain dependent, independent, and intervening variables for a specific sample in order to answer a research question or test specific theoretical propositions.
• All variables in the data have the same dimensionality (observations for the same cases, units, and time points).
• Variables in a dataset can have different measurement levels, types, and distributions.
The Approach of Quantitative Political Science – Types of Data
Micro Data: Individual Data
• Survey data: Eurobarometer, National Election Study (US), British Election Study, socio-economic panel (Germany and other countries).
Macro Data: Aggregated Data at Different Levels
• Economic indicators: inflation, unemployment, GDP, growth, population (density) and demographic data, government spending, public debt, tax rates, government revenue, interest rates, exchange rates, income distribution, FDI, foreign aid, trade (exports/imports), number of employees in different sectors, etc.
• Political indicators: electoral system (majority, proportional), political system (parliamentary, presidential, federal), political institutions, number of veto players, regime type (democracy, autocracy), union density, labor market regulations, wage negotiation system (corporatism), human and civil rights, economic and financial openness, political particularism, etc.
The Approach of Quantitative Political Science – Types of Data
Dimensionality of the Data
• Cross-sectional data: observations for N units at one point in time.
• Time series data: observations for one unit at different points in time.
• Panel data: observations for N units at T points in time, where N is significantly larger than T – mostly used for micro data – units are individuals.
• Time series cross section (TSCS) data: panel data, but mostly used for macro data – aggregated (country) data.
• Cross section time series (CSTS) data: observations for N units at T points in time, where T > N.
The Approach of Quantitative Political Science – Data Sources
Economic Data
• OECD: national accounts, government revenue, taxation, main economic indicators (unemployment, inflation, GDP), earnings, labour market, FDI, social expenditure, debt, employment, etc.
• IMF: economic indicators, direction of trade statistics, international financial statistics (interest rates, exchange rates, capital flows).
• World Bank: economic indicators.
• Penn World Table: macro-economic data.
• ILO: labour market statistics.
• WTO: data on preferential trade agreements, etc.
Political Data
• Eurobarometer: regular surveys, micro data for European countries.
• Polity: degree of democracy.
• Freedom House: human and civil rights.
• Correlates of War: MID, alliances, membership in IGOs.
• Event databases: WEIS (World Event Interaction Survey), IDEA.
• Cingranelli-Richards (CIRI) Human Rights Database: political freedom, political rights, civil and human rights.
Short Overview of Possibilities: OLS Regression
• A metric variable Y can be determined by a function of X.
• The specific values of Y therefore depend on the specific values of X:
  Y = f(X)
• The most straightforward form of such a relationship is linear:
  Y = f(X) = a + bX
• The "line" is hence uniquely determined by two factors:
  – the constant (a), i.e., the point where the "line" crosses the y-axis,
  – and the slope (b), i.e., how much Y changes if X is increased by one unit.
Short Overview of Possibilities: OLS Regression
We do not have "deterministic" relationships, however! They are hard – if not impossible – to find in political science!
Short Overview of Possibilities: OLS Regression
• It is impossible to find a straight line on which all points lie jointly.
• Nonetheless, you can try to fit a straight line through these points that describes the underlying relationship in the best way.
• And THIS is exactly what regression analysis tries to do.
• Which straight line is the best, though?
Short Overview of Possibilities: OLS Regression
• The method for doing this is called OLS – ordinary least squares.
• The function fits a straight line through the points so that the sum of the squared distances between the actually observed values (yi) and the values predicted by the function (ŷi) is minimized.
• The straight line – that is, the parameters a and b – is chosen so that it minimizes the sum of the squared residuals ei:
  min Σ ei² = Σ (yi - ŷi)² = Σ (yi - a - bxi)²
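For a single explanatory variable, the least-squares line can be computed directly: b is the covariance of x and y divided by the variance of x, and a follows from the means. The sketch below uses five made-up data points and compares the sum of squared residuals of the fitted line with that of a slightly different line. The data and the function names are hypothetical.

```python
def ols_fit(xs, ys):
    """Return (a, b) of the line y = a + b*x that minimizes the
    sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

def sum_squared_residuals(a, b, xs, ys):
    """Sum of squared distances between observed and predicted values."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data that lie close to, but not exactly on, a line.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = ols_fit(xs, ys)
best = sum_squared_residuals(a, b, xs, ys)
other = sum_squared_residuals(a, b + 0.1, xs, ys)  # a slightly steeper line

print(f"a = {a:.2f}, b = {b:.2f}")
print(f"sum of squared residuals: fitted line {best:.3f}, steeper line {other:.3f}")
```

Any other choice of a and b, such as the steeper line in the last step, gives a larger sum of squared residuals than the fitted line.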
Short Overview of Possibilities: OLS Regression
• The equation for the OLS function is written like this:
  ŷi = a + bxi
  yi = a + bxi + ei
• The "hat" in the first equation indicates that we are just dealing with estimates ŷi that may differ from the actual values of Y.
• In the second equation, the error term ei indicates that not all of our observations lie exactly on the straight line.
• It is an approach to capture the underlying relationship as closely as possible!
• It is an estimation!
Short Overview of Possibilities: OLS Regression
• How to determine the "quality" of a regression line?
• Follow the principle of ANOVA: Analysis of Variance.
Short Overview of Possibilities: OLS Regression
yi = a + bxi + ei
conflict = 34.94 + 1.46*water + ei

regression conflict water

      Source |       SS       df       MS              Number of obs =     557
-------------+------------------------------           F(  1,   555) =  195.62
       Model |   16311.805     1   16311.805           Prob > F      =  0.0000
    Residual |  46278.3932   555  83.3844922           R-squared     =  0.2606
-------------+------------------------------           Adj R-squared =  0.2593
       Total |  62590.1981   556  112.572299           Root MSE      =  9.1315

------------------------------------------------------------------------------
    conflict |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       water |   1.462844   .1045899    13.99   0.000     1.257404    1.668285
       _cons |   34.93685   .6476726    53.94   0.000     33.66466    36.20904
------------------------------------------------------------------------------
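The summary statistics in the regression output follow from the analysis-of-variance decomposition into model and residual sums of squares. The sketch below recomputes them from the sums of squares, the number of observations, and the coefficient and standard error of water reported above; only the variable names are introduced here for illustration.

```python
# Values copied from the regression output above.
ss_model = 16311.805
ss_residual = 46278.3932
n_obs = 557
k = 1                      # number of explanatory variables (water)
coef_water = 1.462844
se_water = 0.1045899

ss_total = ss_model + ss_residual
df_residual = n_obs - k - 1

r_squared = ss_model / ss_total                 # share of variance explained
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_residual
ms_residual = ss_residual / df_residual         # residual mean square
f_statistic = (ss_model / k) / ms_residual
root_mse = ms_residual ** 0.5
t_water = coef_water / se_water

print(f"R-squared     = {r_squared:.4f}")
print(f"Adj R-squared = {adj_r_squared:.4f}")
print(f"F(1, {df_residual})    = {f_statistic:.2f}")
print(f"Root MSE      = {root_mse:.4f}")
print(f"t (water)     = {t_water:.2f}")
```

With a single explanatory variable, the squared t-statistic of the slope equals the F-statistic of the model, which the recomputed values confirm up to rounding.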
Problems with Quantitative Research – Stargazing
• Begin with a hunch that a particular variable has an unappreciated association with [environmental conflict].
• A standard regression is run. The analyst looks for "stars."
• If the stars support the hunch, then the examination stops.
• Otherwise, additional regressions are run. No easily stated theory guides such decisions.
• The process stops when the stars align.
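Why this procedure is a problem can be shown with pure noise. The sketch below assumes 300 hypothetical "studies" in which neither the dependent variable nor any of 20 candidate explanatory variables contains a real effect; the analyst tries one candidate after another and stops at the first with |t| > 1.96. The numbers of studies, candidates, and observations are arbitrary assumptions of this illustration.

```python
import random

def t_statistic(xs, ys):
    """t-statistic of the slope in a bivariate OLS regression of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    r = sxy / (sxx * syy) ** 0.5
    return r * ((n - 2) / (1.0 - r * r)) ** 0.5

rng = random.Random(7)  # fixed seed so the run is reproducible
n_obs, n_candidates, n_studies = 100, 20, 300

studies_with_star = 0
for _ in range(n_studies):
    y = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]      # pure noise
    found = False
    for _ in range(n_candidates):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]  # unrelated to y
        if abs(t_statistic(x, y)) > 1.96:
            found = True
            break  # the analyst stops as soon as the stars align
    studies_with_star += found

share_with_star = studies_with_star / n_studies
print(f"share of studies that find a 'significant' variable: {share_with_star:.2f}")
```

Although nothing is related to anything in these data, a clear majority of the simulated studies should end with a "significant" result, roughly 1 - 0.95^20, or about 64%.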
Problems with Quantitative Research – Misspecification
• Claim: "X1 has no effect on Y."
• Evidence: the coefficient of X1 does not achieve a particular level of statistical significance.
  – So, X1 does not have a statistically significant effect within the stated model.
• What if the true underlying data-generating mechanism is not identical to the structure of the stated model?
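A deliberately extreme example shows how a misspecified model can hide a real effect. The sketch below assumes that Y is completely determined by X1 through Y = X1², with X1 values placed symmetrically around zero. A linear model then estimates a slope of zero, while a model with the squared term recovers the relationship. The data and the helper `ols_slope` are constructed for this illustration.

```python
def ols_slope(xs, ys):
    """Slope of a bivariate least-squares regression of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# X1 runs from -5 to 5 in steps of 0.5; Y depends on X1 only, but not linearly.
x1 = [i / 2 for i in range(-10, 11)]
y = [x * x for x in x1]

# Stated (misspecified) model: Y = a + b * X1
slope_linear = ols_slope(x1, y)

# Model that matches the data-generating mechanism: Y = a + b * X1^2
slope_squared = ols_slope([x * x for x in x1], y)

print(f"slope in the linear model:  {slope_linear:.3f}")
print(f"slope on the squared term:  {slope_squared:.3f}")
```

The zero slope in the linear model says nothing about whether X1 matters; it only says that the stated model cannot capture how X1 matters.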
Problems with Quantitative Research – Remedies
• New estimators.
• Replication data.
• Greater rigor in relations between theoretical models and the empirical models used to evaluate them.
• Increase transparency and build credibility through theoretical development and evaluation.
• The importance of transparency and rigor does not stop when you have developed an empirical model.
Problems with Quantitative Research – Remedies
Santiago Ramon y Cajal (1916)
"What a wonderful stimulant it would be for the beginner if his instructor, instead of amazing and dismaying him with the sublimity of great past achievements, would reveal instead the origin of each scientific discovery … – information that, from a human perspective, is essential to an accurate explanation of the discovery."
Conclusion
• EITM - The Importance of Methods
• Choice of Methods
• What is Quantitative Methodology?
• The Approach of Quantitative Political Science
• Short Overview of Possibilities
• Some Problems and Caveats
• Any questions?