Linzer slides-barug


  1. 1. Forecasting the 2012 Presidential Election from History and the Polls
        Drew Linzer
        Assistant Professor, Emory University, Department of Political Science
        Visiting Assistant Professor, 2012-13, Stanford University, Center on Democracy, Development, and the Rule of Law
        votamatic.org
        Bay Area R Users Group, February 12, 2013
  2. 2. The 2012 Presidential Election: Obama 332–Romney 206
  3. 3. The 2012 Presidential Election: Obama 332–Romney 206 But also: Nerds 1–Pundits 0
  4. 4. The 2012 Presidential Election: Obama 332–Romney 206
        But also: Nerds 1–Pundits 0
        Analyst forecasts based on history and the polls:
          Drew Linzer, Emory University            332-206
          Simon Jackman, Stanford University       332-206
          Josh Putnam, Davidson College            332-206
          Nate Silver, New York Times              332-206
          Sam Wang, Princeton University           303-235
        Pundit forecasts based on intuition and gut instinct:
          Karl Rove, Fox News                      259-279
          Newt Gingrich, Republican politician     223-315
          Michael Barone, Washington Examiner      223-315
          George Will, Washington Post             217-321
          Steve Forbes, Forbes Magazine            217-321
  7. 7. What we want: Accurate forecasts as early as possible
        The problem:
          • The data that are available early aren’t accurate: Fundamental variables (economy, approval, incumbency)
          • The data that are accurate aren’t available early: Late-campaign state-level public opinion polls
          • The polls contain sampling error, house effects, and most states aren’t even polled on most days
        The solution:
          • A statistical model that uses what we know about presidential campaigns to update forecasts from the polls in real time
        What do we know?
  8. 8. 1. The fundamentals predict national outcomes, noisily
        [Figure: Election year economic growth. Source: U.S. Bureau of Economic Analysis]
  9. 9. 1. The fundamentals predict national outcomes, noisily
        [Figure: Presidential approval, June. Source: Gallup]
  10. 10. 2. State vote outcomes swing (mostly) in tandem
          Source: New York Times
  11. 11. 3. Polls are accurate on Election Day; maybe not before
          [Figure: Florida: Obama, 2008. Obama vote share (40 to 60) by month, May to Nov, with the actual outcome marked. Source: HuffPost-Pollster]
  12. 12. 4. Voter preferences evolve in similar ways across states
          [Figure: four panels (Florida, Virginia, Ohio, Colorado): Obama, 2008. Obama vote share (40 to 60) by month, May to Nov. Source: HuffPost-Pollster]
  13. 13. 5. Voters have short-term reactions to big campaign events
          Source: Tom Holbrook, UW-Milwaukee
  14. 14. All together: A forecasting model that learns from the polls
          [Figure: Publicly available state polls during the campaign. Cumulative number of polls fielded (0 to 2000) by months prior to Election Day (12 to 0), for 2008 and 2012. Source: HuffPost-Pollster]
          Forecasts weight fundamentals ←→ Forecasts weight polls
  18. 18. First, create a baseline forecast of each state outcome
          Abramowitz Time-for-Change regression makes a national forecast:
            Incumbent vote share = 51.5 + 0.6 (Q2 GDP growth) + 0.1 (June net approval) − 4.3 (In office two+ terms)
            Predicted Obama 2012 vote = 51.5 + 0.6 (1.3) + 0.1 (−0.8) − 4.3 (0)
            Predicted Obama 2012 vote = 52.2%
          Use uniform swing assumption to translate to the state level:
            • Subtract 1.5% for Obama from his 2008 state vote shares
            • Make this a Bayesian prior over the final state outcomes
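
The baseline arithmetic on this slide can be reproduced in a few lines. This is a minimal sketch, not code from the talk: the coefficients and 2012 inputs are the ones printed on the slide, while the function names and the three state shares are hypothetical placeholders used only to show the uniform swing step.

```python
# Coefficients as printed on the slide (Abramowitz Time-for-Change regression).
INTERCEPT = 51.5
COEF_GDP = 0.6          # per point of Q2 GDP growth
COEF_APPROVAL = 0.1     # per point of June net approval
COEF_TWO_TERMS = -4.3   # applied when the party has been in office two+ terms


def time_for_change(q2_gdp_growth, june_net_approval, two_plus_terms):
    """National incumbent vote share (percent) from the slide's formula."""
    return (INTERCEPT
            + COEF_GDP * q2_gdp_growth
            + COEF_APPROVAL * june_net_approval
            + COEF_TWO_TERMS * (1 if two_plus_terms else 0))


def uniform_swing(previous_state_shares, swing):
    """Shift every state's previous vote share by the same amount."""
    return {state: share + swing
            for state, share in previous_state_shares.items()}


# 2012 inputs from the slide: 1.3 GDP growth, -0.8 net approval, first term.
national_2012 = time_for_change(1.3, -0.8, two_plus_terms=False)

# HYPOTHETICAL 2008 state shares, for illustration only (not real results).
example_2008 = {"State A": 51.0, "State B": 57.5, "State C": 45.0}

# The slide subtracts 1.5 points from each 2008 state share.
baseline_2012 = uniform_swing(example_2008, swing=-1.5)

print(round(national_2012, 1))
print(baseline_2012)
```

In the talk, these shifted state shares become the centers of the Election Day priors rather than final forecasts.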
  19. 19. Combine polls across days and states to estimate trends
          [Figure: two panels, "States with many polls" (Florida: Obama, 2012) and "States with fewer polls" (Oregon: Obama, 2012). Obama vote share (44 to 56) by month, May to Nov.]
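
One reason to combine polls is that sampling error shrinks as respondents accumulate. The sketch below illustrates only that point with naive pooling; it is not the model from the talk, which also accounts for time trends and state differences. The poll counts and function names are hypothetical, and simple random sampling is assumed.

```python
import math


def poll_standard_error(share, n):
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(share * (1.0 - share) / n)


def pool_polls(polls):
    """Pool (respondents_for_candidate, sample_size) pairs into one estimate."""
    total_yes = sum(yes for yes, _ in polls)
    total_n = sum(n for _, n in polls)
    share = total_yes / total_n
    return share, poll_standard_error(share, total_n)


# HYPOTHETICAL polls for one state: (respondents for the candidate, sample size).
polls = [(260, 500), (306, 600), (441, 900)]

pooled_share, pooled_se = pool_polls(polls)
single_se = poll_standard_error(260 / 500, 500)

print("pooled share:", round(pooled_share, 4))
print("pooled SE:", round(pooled_se, 4), "vs single-poll SE:", round(single_se, 4))
```

States with few polls gain little from pooling within the state alone, which is part of why the talk borrows information across states as well as across days.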
  20. 20. Combine with baseline forecasts to guide future projections
          [Figure: "Random walk (no)". Florida: Obama, 2012. Obama vote share (44 to 56) by month, May to Nov.]
  21. 21. Combine with baseline forecasts to guide future projections
          [Figure: two panels, "Random walk (no)" and "Mean reversion". Florida: Obama, 2012. Obama vote share (44 to 56) by month, May to Nov.]
          Forecasts compromise between history and the polls
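
The idea of a compromise between history and the polls can be illustrated with a precision-weighted average of two normal estimates. This is a simplified stand-in, not the talk's model: the function name and all numbers are hypothetical, and both the historical baseline and the poll average are assumed to be normally distributed.

```python
def combine(prior_mean, prior_sd, poll_mean, poll_sd):
    """Normal-normal update: weight each source by its precision (1 / variance)."""
    prior_precision = 1.0 / prior_sd ** 2
    poll_precision = 1.0 / poll_sd ** 2
    posterior_precision = prior_precision + poll_precision
    posterior_mean = (prior_precision * prior_mean
                      + poll_precision * poll_mean) / posterior_precision
    return posterior_mean, posterior_precision ** -0.5


# HYPOTHETICAL inputs: a baseline of 50 (sd 2) and a poll average of 52.
# Early in the campaign the polls are sparse and noisy (sd 4) ...
early_mean, early_sd = combine(50.0, 2.0, 52.0, 4.0)
# ... later they are plentiful and precise (sd 1).
late_mean, late_sd = combine(50.0, 2.0, 52.0, 1.0)

print(round(early_mean, 2), round(late_mean, 2))
```

With noisy polls the combined estimate stays near the baseline; as the polls sharpen it moves toward them, matching the shift from fundamentals to polls described earlier in the deck.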
  23. 23. A dynamic Bayesian forecasting model
          Model specification:
            $y_k \sim \text{Binomial}(\pi_{i[k]j[k]}, n_k)$: number of people preferring the Democrat in survey $k$, in state $i$, on day $j$
            $\pi_{ij} = \text{logit}^{-1}(\beta_{ij} + \delta_j)$: proportion reporting support for the Democrat in state $i$ on day $j$
            • National effects: $\delta_j$
            • State components: $\beta_{ij}$
            • Estimated for all states simultaneously
            • Election forecasts: $\hat{\pi}_{iJ}$
          Priors:
            $\beta_{iJ} \sim N(\text{logit}(h_i), \tau_i)$: informative prior on Election Day, using historical predictions $h_i$, precisions $\tau_i$
            $\delta_J \equiv 0$: polls assumed accurate, on average
            $\beta_{ij} \sim N(\beta_{i(j+1)}, \sigma_\beta^2)$: reverse random walk, states
            $\delta_j \sim N(\delta_{j+1}, \sigma_\delta^2)$: reverse random walk, national
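
To make the specification concrete, the sketch below draws one simulated campaign from the model as written: an Election Day prior, reverse random walks backward in time, an inverse-logit link, and a binomial poll. It only simulates from the model and does not fit it to data; the estimation procedure used in the talk is not shown here. All parameter values, state names, and function names are hypothetical.

```python
import math
import random


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def reverse_random_walk(end_value, n_days, sd, rng):
    """Fix the last day's value, then draw each earlier day around the next."""
    path = [0.0] * n_days
    path[-1] = end_value
    for j in range(n_days - 2, -1, -1):
        path[j] = rng.gauss(path[j + 1], sd)
    return path


rng = random.Random(2012)
n_days = 180                            # day index n_days - 1 is Election Day (J)
sigma_beta, sigma_delta = 0.02, 0.03    # HYPOTHETICAL random-walk sds

# HYPOTHETICAL historical predictions h_i and prior precisions tau_i.
h = {"State A": 0.52, "State B": 0.47}
tau = {"State A": 25.0, "State B": 25.0}

# National effect shared by every state, pinned to 0 on Election Day.
delta = reverse_random_walk(0.0, n_days, sigma_delta, rng)

pi = {}
for state in h:
    # Election Day prior: mean logit(h_i), sd = 1 / sqrt(precision).
    beta_J = rng.gauss(logit(h[state]), 1.0 / math.sqrt(tau[state]))
    beta = reverse_random_walk(beta_J, n_days, sigma_beta, rng)
    pi[state] = [inv_logit(b + d) for b, d in zip(beta, delta)]

# One simulated poll of n_k respondents in State A on day 90.
n_k, day = 600, 90
y_k = sum(1 for _ in range(n_k) if rng.random() < pi["State A"][day])
print(y_k, "of", n_k, "respondents prefer the Democrat")
```

Because the national effect is fixed at zero on the last day, the final value of each state's path is exactly the inverse logit of its Election Day state component, which is the forecast quantity in the specification.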
  24. 24. Results: Anchoring to the fundamentals stabilizes forecasts
          [Figure: Florida: Obama forecasts, 2012. Obama vote share (40 to 60) by month, Jul to Nov. Shaded area indicates 95% uncertainty.]
  25. 25. Results: Anchoring to the fundamentals stabilizes forecasts
          [Figure: Electoral Votes (150 to 400) by month, Jul to Nov. OBAMA 332, ROMNEY 206.]
  26. 26. There were almost no surprises in 2012
          On Election Day, average error = 1.7%
          Why didn’t the model do more?
  27. 27. There were almost no surprises in 2012
          On Election Day, average error = 1.7%
  28. 28. There were almost no surprises in 2012
          On Election Day, average error = 1.7%
          Why didn’t the model improve forecasts by more?
  29. 29. The fundamentals and uniform swing were right on target
          [Figure: State election outcomes. 2012 Obama vote against 2008 Obama vote (both 30 to 70), states labeled by abbreviation, with a 2012=2008 line.]
  30. 30. Aggregate preferences were very stable
          [Figure: Percent supporting Obama and Romney.]
  31. 31. Could the model have done better? Yes
          [Figure: Difference between actual and predicted vote outcomes. Election Day forecast error (−6 to 6) against number of polls after May 1, 2012 (0 to 100), states labeled by abbreviation. ↑ Obama performed better than expected; ↓ Romney performed better than expected.]
  32. 32. Forecasting is only one of many applications for the model
          1 Who’s going to win?
          2 Which states are going to be competitive?
          3 What are current voter preferences in each state?
          4 How much does opinion fluctuate during a campaign?
          5 What effect does campaign news/activity have on opinion?
          6 Are changes in preferences primarily national or local?
          7 How useful are historical factors vs. polls for forecasting?
          8 How early can accurate forecasts be made?
          9 Were some survey firms biased in one direction or the other?
  33. 33. House effects (biases) were evident during the campaign
  34. 34. Much more at votamatic.org drew@votamatic.org @DrewLinzer
