Linzer slides-barug

Forecasting the 2012 Presidential Election
from History and the Polls

Drew Linzer

Assistant Professor
Emory University
Department of Political Science

Visiting Assistant Professor, 2012-13
Stanford University
Center on Democracy, Development,
and the Rule of Law

votamatic.org

Bay Area R Users Group
February 12, 2013

The 2012 Presidential Election: Obama 332–Romney 206


But also: Nerds 1–Pundits 0



Analyst forecasts based on history and the polls (Obama-Romney electoral votes)
Drew Linzer, Emory University               332-206
Simon Jackman, Stanford University          332-206
Josh Putnam, Davidson College               332-206
Nate Silver, New York Times                 332-206
Sam Wang, Princeton University              303-235

Pundit forecasts based on intuition and gut instinct (Obama-Romney electoral votes)
Karl Rove, Fox News                         259-279
Newt Gingrich, Republican politician        223-315
Michael Barone, Washington Examiner         223-315
George Will, Washington Post                217-321
Steve Forbes, Forbes Magazine               217-321

What we want: Accurate forecasts as early as possible

The problem:
• The data that are available early aren’t accurate:
Fundamental variables (economy, approval, incumbency)

• The data that are accurate aren’t available early:
Late-campaign state-level public opinion polls

• The polls contain sampling error and house effects, and most
states aren’t even polled on most days
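To give a rough sense of the sampling error mentioned above, here is a minimal sketch of the textbook 95% margin of error for a poll proportion. It assumes a simple random sample, which real polls only approximate; the function name and the example poll (50% support, 600 respondents) are hypothetical.

```python
import math

def poll_margin_of_error(share, n, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(share * (1.0 - share) / n)

# Hypothetical poll: 50% support among 600 respondents
moe = poll_margin_of_error(0.50, 600)
print(f"+/- {100 * moe:.1f} points")  # prints "+/- 4.0 points"
```

A single poll of this size cannot distinguish a 48% candidate from a 52% one, which is one reason to pool polls across days and states.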





The solution:
• A statistical model that uses what we know about presidential
campaigns to update forecasts from the polls in real time



What do we know?

1. The fundamentals predict national outcomes, noisily

Election year economic growth

Source: U.S. Bureau of Economic Analysis

Presidential approval, June

Source: Gallup

2. State vote outcomes swing (mostly) in tandem

Source: New York Times

3. Polls are accurate on Election Day; maybe not before

[Figure: Florida: Obama, 2008. Obama vote share in the polls (40–60%), May to November, with the actual outcome marked. Source: HuffPost-Pollster]

4. Voter preferences evolve in similar ways across states
[Figure: four panels showing Obama vote share (40–60%), May to November 2008, in Florida, Virginia, Ohio, and Colorado]


5. Voters have short-term reactions to big campaign events

Source: Tom Holbrook, UW-Milwaukee

All together: A forecasting model that learns from the polls

[Figure: Publicly available state polls during the campaign. Cumulative number of polls fielded (0–2000) by months prior to Election Day (12 to 0), for 2008 and 2012]

Forecasts weight fundamentals ←→ Forecasts weight polls


First, create a baseline forecast of each state outcome

Abramowitz Time-for-Change regression makes a national forecast:
Incumbent vote share = 51.5 + 0.6 (Q2 GDP growth)
                            + 0.1 (June net approval)
                            − 4.3 (In office two+ terms)

Predicted Obama 2012 vote = 51.5 + 0.6 (1.3)
                                 + 0.1 (−0.8)
                                 − 4.3 (0)
                          = 52.2%

Use the uniform swing assumption to translate to the state level:
Subtract 1.5 percentage points from Obama's 2008 vote share in every state
Make this a Bayesian prior over the final state outcomes
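The baseline step can be sketched in a few lines of Python. The coefficients, the 2012 inputs, and the 1.5-point swing come from the slide; the function name and the three 2008 state vote shares are hypothetical placeholders, not real data.

```python
def time_for_change(q2_gdp_growth, june_net_approval, two_plus_terms):
    """National incumbent-party vote share (%) using the coefficients on the slide."""
    return (51.5
            + 0.6 * q2_gdp_growth
            + 0.1 * june_net_approval
            - 4.3 * two_plus_terms)

# 2012 inputs as given on the slide
national_2012 = time_for_change(q2_gdp_growth=1.3,
                                june_net_approval=-0.8,
                                two_plus_terms=0)
print(round(national_2012, 1))  # prints 52.2

# Uniform swing: shift every state's 2008 share by the same amount.
swing = 1.5  # points subtracted, as stated on the slide

# Hypothetical 2008 state vote shares (%), for illustration only
obama_2008 = {"State A": 51.0, "State B": 57.0, "State C": 45.0}
baseline_2012 = {state: share - swing for state, share in obama_2008.items()}
print(baseline_2012)
```

In the model these baseline values serve as the centers of the Election Day priors rather than as final forecasts.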

Combine polls across days and states to estimate trends

[Figure: two panels showing polls of Obama vote share (44–56%), May to November 2012. States with many polls: Florida. States with fewer polls: Oregon]

Combine with baseline forecasts to guide future projections

[Figure: two panels, both Florida: Obama, 2012, showing polls of Obama vote share (44–56%), May to November, under two projection assumptions: Random walk (no) and Mean reversion]

Forecasts compromise between history and the polls
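One way to see how a forecast can compromise between history and the polls is a precision-weighted average of two normal estimates. This is a deliberately simplified stand-in for the full model, not the author's method; the function name and all numbers are hypothetical.

```python
import math

def combine(prior_mean, prior_sd, poll_mean, poll_sd):
    """Normal-normal update: weight each source by its precision (1 / variance)."""
    prior_prec = 1.0 / prior_sd ** 2
    poll_prec = 1.0 / poll_sd ** 2
    post_mean = (prior_prec * prior_mean + poll_prec * poll_mean) / (prior_prec + poll_prec)
    post_sd = 1.0 / math.sqrt(prior_prec + poll_prec)
    return post_mean, post_sd

# Hypothetical state: the fundamentals say 50.5%, the polls average 49.0%.
prior_mean, prior_sd = 50.5, 2.0
single_poll_sd = 2.0  # assumed uncertainty of one poll, in points

# Averaging n independent polls shrinks the poll uncertainty by sqrt(n).
few = combine(prior_mean, prior_sd, 49.0, single_poll_sd / math.sqrt(1))
many = combine(prior_mean, prior_sd, 49.0, single_poll_sd / math.sqrt(16))
print(few)   # sits halfway between the fundamentals and the polls
print(many)  # sits close to the polls
```

With one poll the estimate stays near the historical baseline; with sixteen it is dominated by the polls, mirroring the shift from fundamentals to polls as Election Day approaches.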

A dynamic Bayesian forecasting model

Model specification
$y_k \sim \mathrm{Binomial}(\pi_{i[k]j[k]}, n_k)$    Number of people preferring the Democrat in survey k, in state i, on day j
$\pi_{ij} = \mathrm{logit}^{-1}(\beta_{ij} + \delta_j)$    Proportion reporting support for the Democrat in state i on day j
National effects: $\delta_j$
State components: $\beta_{ij}$
Estimated for all states simultaneously
Election forecasts: $\hat{\pi}_{iJ}$

Priors
$\beta_{iJ} \sim N(\mathrm{logit}(h_i), \tau_i)$    Informative prior on Election Day, using historical predictions $h_i$, precisions $\tau_i$
$\delta_J \equiv 0$    Polls assumed accurate, on average
$\beta_{ij} \sim N(\beta_{i(j+1)}, \sigma_\beta^2)$    Reverse random walk, states
$\delta_j \sim N(\delta_{j+1}, \sigma_\delta^2)$    Reverse random walk, national
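To make the specification concrete, the sketch below simulates data from the generative structure on the slide: Election Day values drawn from the prior, reverse random walks back through the campaign, and a binomial poll. It only simulates forward from the priors and does not fit the model to polls; all function names and parameter values are hypothetical, and tau is treated as a precision, as the slide states.

```python
import math
import random

def logit(p):
    return math.log(p / (1.0 - p))

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def simulate_campaign(h, tau, n_days, sigma_beta, sigma_delta, rng):
    """Simulate pi[i][j] for each state i and day j; the last day is Election Day (J)."""
    last = n_days - 1
    # National effect: delta_J = 0, then walk backwards in time.
    delta = [0.0] * n_days
    for j in range(last - 1, -1, -1):
        delta[j] = rng.gauss(delta[j + 1], sigma_delta)
    pi = []
    for h_i, tau_i in zip(h, tau):
        beta = [0.0] * n_days
        # Election Day prior centered on the historical prediction h_i (sd = 1/sqrt(precision)).
        beta[last] = rng.gauss(logit(h_i), 1.0 / math.sqrt(tau_i))
        for j in range(last - 1, -1, -1):
            beta[j] = rng.gauss(beta[j + 1], sigma_beta)
        pi.append([inv_logit(b + d) for b, d in zip(beta, delta)])
    return delta, pi

def simulate_poll(p, n, rng):
    """Binomial draw: number of n respondents preferring the Democrat."""
    return sum(rng.random() < p for _ in range(n))

rng = random.Random(0)
h = [0.50, 0.55]        # hypothetical baseline forecasts for two states
tau = [400.0, 400.0]    # hypothetical prior precisions on the logit scale
delta, pi = simulate_campaign(h, tau, n_days=150,
                              sigma_beta=0.02, sigma_delta=0.02, rng=rng)
y = simulate_poll(pi[0][100], 600, rng)  # one poll in state 0 on day 100
print(y, "of 600 respondents prefer the Democrat")
```

Because the walks run backwards from Election Day, the prior anchors the end of the campaign while leaving earlier days free to drift.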

Results: Anchoring to the fundamentals stabilizes forecasts

[Figure: Florida: Obama forecasts, 2012. Forecast Obama vote share (40–60%), July to November. Shaded area indicates 95% uncertainty]

[Figure: Electoral Votes forecasts (150–400), July to November, with lines labeled OBAMA 332 and ROMNEY 206]

There were almost no surprises in 2012

On Election Day, average error = 1.7%

Why didn’t the model do more?



Why didn’t the model improve forecasts by more?

The fundamentals and uniform swing were right on target

[Figure: State election outcomes. 2012 Obama vote against 2008 Obama vote (30–70%) for each state, with the 2012=2008 line]

Aggregate preferences were very stable

[Figure: Percent supporting Obama and Romney]

Could the model have done better? Yes

[Figure: Difference between actual and predicted vote outcomes. Election Day forecast error (−6 to 6) for each state against number of polls after May 1, 2012 (0–100). Positive values: Obama performed better than expected. Negative values: Romney performed better than expected]

Forecasting is only one of many applications for the model

1. Who’s going to win?
2. Which states are going to be competitive?
3. What are current voter preferences in each state?
4. How much does opinion fluctuate during a campaign?
5. What effect does campaign news/activity have on opinion?
6. Are changes in preferences primarily national or local?
7. How useful are historical factors vs. polls for forecasting?
8. How early can accurate forecasts be made?
9. Were some survey firms biased in one direction or the other?

House effects (biases) were evident during the campaign
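As a rough illustration of what a house effect is, the sketch below measures each firm's average deviation from the overall poll average. This is a crude check, not the estimation approach used in the talk, and it ignores trends over time; the firms and poll numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical polls from one state over a short window: (firm, Obama share in %)
polls = [
    ("Firm A", 51.0), ("Firm A", 52.0),
    ("Firm B", 48.0), ("Firm B", 49.0),
    ("Firm C", 50.0),
]

def house_effects(polls):
    """Each firm's mean poll result minus the mean of all polls."""
    overall = mean(share for _, share in polls)
    by_firm = defaultdict(list)
    for firm, share in polls:
        by_firm[firm].append(share)
    return {firm: mean(shares) - overall for firm, shares in by_firm.items()}

effects = house_effects(polls)
for firm, effect in effects.items():
    print(f"{firm}: {effect:+.1f} points")
```

A firm with a consistently positive value leans toward Obama relative to other pollsters; a consistently negative value leans toward Romney.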

Much more at votamatic.org

drew@votamatic.org
@DrewLinzer
