SurveyMonkey 2012 Presidential Election Poll: Final Update



SurveyMonkey has surveyed roughly 1.2 million people from August 17th to November 2nd. Still, skeptics will ask, “Can an internet poll really be successful at approximating voter turnout?”

Published in: News & Politics


  1. 2012 Presidential Election Poll
     Philip Garland, Ph.D., VP Methodology
     Dave Goldberg, CEO
     Liana Epstein, Ph.D., Senior Methodologist
     Annabell Suh
  2. SURVEYMONKEY 2012 PRESIDENTIAL ELECTION POLL
     SurveyMonkey has surveyed roughly 1.2 million people from August 17th to November 2nd. Still, skeptics will ask, “Can an internet poll really be successful at approximating voter turnout?”
     Here is the map of actual voter turnout in 2008 by county. This is the map of respondents to SurveyMonkey’s 2012 presidential election poll.
     This report contains our newest wave of data from the 600,000 people who responded to our presidential election poll from October 3rd through November 2nd. Results will be displayed in two different ways: first as popular vote percentages, and second as Electoral College distributions. With this data, we seek to show that internet data is as good as phone data (if not better) at assessing public opinion.
  3. A few notes about our data
     Understand the data that we’re reporting.
     Why does all the data begin on 10/10 rather than 10/3?
     The data reported below begins at 10/10 because we chose to use a seven-day trailing sum. This was done for three main reasons. First, all publicly available polls report data using trailing sums as well. Matching their methodology in this way facilitates comparisons between SurveyMonkey and other polling firms, which provides a reality check on how well SurveyMonkey is measuring public opinion. Second, using a trailing sum, rather than a daily measure, provides a statistic that is less swayed by any single day’s events. Essentially, averaging over a week’s worth of data smooths out an otherwise jagged curve. Lastly, for analyses at the state level, using more than one day of data gives us a larger sample, which increases the power and accuracy of our analyses.
     Why is only weekday data reported?
     It is also important to note that all results reported below exclude weekend data. We observed that the graphs of our raw, daily data showed spikes every weekend that were aberrant from the trend line and from publicly available polling data. We speculate that this is due to two main problems. First, our traffic volume is much lower on weekends, with traffic sinking as low as 15% of typical weekday traffic. This lower volume makes our results more susceptible to outliers. Second, we have found in prior studies of our SurveyMonkey traffic that the people who take surveys on weekends are often not representative of the general U.S. population and, consequently, are qualitatively different from those who take surveys on weekdays.
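The trailing-sum and weekend-exclusion rules described above can be sketched in a few lines of Python. This is a minimal illustration, not SurveyMonkey's actual pipeline: the function name `trailing_weekday_shares`, the input layout (a dict of daily response counts per candidate), and the choice to treat the window as seven calendar days with weekend days dropped are all assumptions.

```python
from datetime import date, timedelta

def trailing_weekday_shares(daily_counts, window_days=7):
    """Return {day: {candidate: share}} using a trailing sum of weekday data.

    daily_counts maps a date to {candidate: number_of_responses}.
    Assumption: the window covers the last `window_days` calendar days,
    and any Saturday or Sunday inside it is ignored.
    """
    shares = {}
    for day in sorted(daily_counts):
        if day.weekday() >= 5:  # Saturday=5, Sunday=6: not reported
            continue
        totals = {}
        for offset in range(window_days):
            d = day - timedelta(days=offset)
            if d.weekday() >= 5 or d not in daily_counts:
                continue  # skip weekend days and days with no data
            for candidate, n in daily_counts[d].items():
                totals[candidate] = totals.get(candidate, 0) + n
        n_all = sum(totals.values())
        if n_all > 0:
            shares[day] = {c: n / n_all for c, n in totals.items()}
    return shares

# Illustrative (made-up) counts; 2012-10-06 is a Saturday.
counts = {
    date(2012, 10, 1): {"Obama": 10, "Romney": 10},
    date(2012, 10, 5): {"Obama": 10, "Romney": 30},
    date(2012, 10, 6): {"Obama": 100, "Romney": 0},
    date(2012, 10, 8): {"Obama": 30, "Romney": 10},
}
result = trailing_weekday_shares(counts)
```

In this toy input the large Saturday spike never enters any window, which is the effect the weekend exclusion is meant to have.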
  4. Model 1: a RAW look
     This model provides a transparent view of the data.
     RATIONALE: We have included this model not because we think it will accurately predict what happens on Election Day, but because we want to be as transparent as possible about our methodology.
     WEIGHTS: None. Other than excluding weekends and using a 7-day trailing sum, this is purely raw data. No corrections. No weighting.
     RESULTS
     As can be seen in the graph below, the raw results from our survey suggest that the two candidates’ standing in the Electoral College has flipped back and forth almost daily. This is strikingly different from all other polls, which have had Obama consistently ahead in the Electoral College for October. This inconsistency of Electoral College projections was the main reason that we pursued weighted models rather than merely reporting our raw data. As of Friday (11/2), Model #1 predicts: Obama, 266; Romney, 272.
     The above graph was created through a forced choice for each state between the candidates. Separating toss up states provides a glimpse into why SurveyMonkey’s numbers show a tighter race. RCP (RealClearPolitics) uses a 5% margin of error to determine if a state is a clear win for either candidate. SurveyMonkey, on the other hand, uses a slimmer 3% margin of error. Overall, the graphs below show that SurveyMonkey has roughly half the number of toss up states that RCP does, with more of these going to Romney than Obama. This accounts for why this model estimates a much higher number of electoral votes for Romney than other polls.
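The difference between a forced-choice call and a toss-up call can be illustrated with a short Python sketch. The function names and the example percentages are hypothetical; only the 3% and 5% thresholds come from the text above, and how ties are broken in a forced choice is an assumption.

```python
def forced_choice(obama_pct, romney_pct):
    """Assign the state to whoever leads, however narrowly.

    Assumption: an exact tie goes to Romney here purely so the function
    always returns a candidate; the report does not say how ties were handled.
    """
    return "Obama" if obama_pct > romney_pct else "Romney"

def classify_state(obama_pct, romney_pct, tossup_margin=3.0):
    """Label a state a toss-up when the lead is inside the margin of error."""
    if abs(obama_pct - romney_pct) < tossup_margin:
        return "tossup"
    return forced_choice(obama_pct, romney_pct)

# A hypothetical state where Obama leads 49 to 45 (a 4-point margin).
call_at_3 = classify_state(49.0, 45.0, tossup_margin=3.0)  # 3% rule
call_at_5 = classify_state(49.0, 45.0, tossup_margin=5.0)  # 5% rule
```

With a 4-point lead, the 3% rule calls the state while the 5% rule leaves it a toss-up, which is why a slimmer margin produces fewer toss-up states.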
  5. Although the Electoral College decides the election, the popular vote is also of interest. Because we oversampled swing states to be able to conduct analyses at the state level, the proportions of states in our sample relative to their representation in the population of American voters varied wildly. Additionally, due to low traffic, some states were under-represented in our sample. For example, the percentage of voters from Ohio was inflated because we directed more respondents to our survey there, and the percentage of voters from North Dakota was lower because we directed less traffic there. Thus, publicly available statistics were used to adjust the weights of the state popular vote totals so that they accurately reflected the proportions of U.S. voter turnout by state in 2008.
     Unsurprisingly, given that SurveyMonkey’s Electoral College projection shows a less consistent margin of victory for Obama than other polls do, the SurveyMonkey popular vote total shows a smaller margin for Obama than other polls have.
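The state reweighting described above amounts to a weighted average of state-level vote shares, with each state's weight set by its share of 2008 turnout. The sketch below shows that arithmetic under stated assumptions: the function name, the input layout, and the two-state numbers are invented for illustration and are not the report's data.

```python
def weighted_popular_vote(state_shares, turnout_2008):
    """Combine state vote shares into a national share.

    state_shares: {state: {candidate: share_of_state_sample}}
    turnout_2008: {state: number_of_2008_voters} (hypothetical input)
    Each state counts in proportion to its 2008 turnout, not its sample size.
    """
    total_turnout = sum(turnout_2008[s] for s in state_shares)
    national = {}
    for state, shares in state_shares.items():
        weight = turnout_2008[state] / total_turnout
        for candidate, share in shares.items():
            national[candidate] = national.get(candidate, 0.0) + weight * share
    return national

# Made-up example: state "A" cast three times as many 2008 votes as state "B".
example = weighted_popular_vote(
    {"A": {"Obama": 0.50, "Romney": 0.50},
     "B": {"Obama": 0.60, "Romney": 0.40}},
    {"A": 300, "B": 100},
)
```

Even if state "B" supplied most of the survey responses, it contributes only a quarter of the national estimate here, which is how oversampled swing states are kept from dominating the popular vote total.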
  6. Model 2: the “HOW” correction
     This model corrects for sampling method.
     RATIONALE: The anonymity of internet polling is a blessing and a curse. Because the person being polled has anonymity, he or she is free to respond without feeling self-conscious. This minimizes the demand characteristics of phone polls, where respondents may change their answer in response to what they think the phone pollster wants to hear. When people answer surveys online, as opposed to on the phone, they are “talking” to a computer instead of a real, live person. This matters because research has shown that when speaking with a real, live person, respondents are more concerned about what that person thinks of them. This makes phone respondents less willing to say “I don’t know” when asked who they would vote for, because it would suggest that they haven’t thought about the election much. Unfortunately, this anonymity can also artificially inflate “don’t know” responses, making accurate predictions tougher to make. Moreover, anonymity can lead to people not taking the survey seriously enough, randomly clicking responses or not thinking through the questions sufficiently.
     WEIGHTS:
     • Leaning voters: The “don’t know” response percentage in the SurveyMonkey dataset was much higher than that of the average phone poll (9% versus 5%). Consequently, we used a question that asked which candidate voters were “leaning towards” to add a small subset of otherwise undecided voters to the results.
     • Volatility: Each day was compared to the previous day to compute a “volatility” index. This weight was applied to the day’s average so that more consistent days were weighted more heavily. This makes our averages less susceptible to random error and “satisficers” (people who don’t take online surveys seriously).
     RESULTS
     Although RCP and Nate Silver’s “fivethirtyeight” blog have consistently predicted an Obama victory in the Electoral College by a fairly wide margin, Model #2 shows a much tighter race. As can be seen in the graph below, SurveyMonkey results suggest that if the election had been held anytime between 10/10 and 10/18, Mitt Romney would have won. From 10/18 through Friday, however, Barack Obama has regained the edge in the Electoral College. As of Friday (11/2), Model #2 predicts: Obama, 272; Romney, 266.
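The report does not give the formula behind the “volatility” index, so the following Python sketch is only one plausible reading: each day's weight shrinks as its value departs from the previous day's. The inverse-distance weight `1 / (1 + scale * |change|)`, the `scale` parameter, and the function name are all assumptions made for illustration.

```python
def volatility_weighted_mean(daily_values, scale=1.0):
    """Average daily values, down-weighting days that jump from the day before.

    Assumed weighting (not from the report): weight = 1 / (1 + scale * |change|),
    with the first day given full weight because it has no predecessor.
    """
    if not daily_values:
        raise ValueError("daily_values must not be empty")
    weights = [1.0]
    for previous, current in zip(daily_values, daily_values[1:]):
        weights.append(1.0 / (1.0 + scale * abs(current - previous)))
    weighted_sum = sum(w * v for w, v in zip(weights, daily_values))
    return weighted_sum / sum(weights)

# Two steady days followed by a 10-point jump (made-up percentages).
steady_then_spike = volatility_weighted_mean([50.0, 50.0, 60.0])
plain_mean = sum([50.0, 50.0, 60.0]) / 3
```

Under this assumed weighting the spike day pulls the average far less than it would in a plain mean, which matches the stated goal of weighting consistent days more heavily.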
  7. Again, the above graph was created through a forced choice for each state between the candidates. Separating toss up states provides a glimpse into why SurveyMonkey’s numbers show a tighter race. Overall, the graphs below show that Model #2 has roughly half the number of toss up states that RCP does, with 50% of these going to Obama and 50% to Romney.
     Despite the fact that SurveyMonkey’s Electoral College projection shows a thinner margin of victory for Obama than other polls do, the SurveyMonkey popular vote total shows a greater margin of Obama supporters than other polls have. Thus, while other polls indicate that Romney is ahead in the popular vote, SurveyMonkey data indicates that Obama is actually in the lead. Model #2’s estimation of the popular vote mirrors Nate Silver’s popular vote estimation more closely than RCP’s estimation.
  8. Model 3: the “WHO” correction
     This model corrects for sampling frame.
     RATIONALE: Whether you’re reaching people through their computer or their phone, having them answer your survey does not guarantee that they are going to show up at the polls on Election Day. The people who respond to surveys (whether on the internet or on the phone) and the people who show up to vote are not exactly the same set of people.
     WEIGHTS:
     • Party ID: Using voter turnout statistics from 2008, we adjusted the proportions of Democrats, Republicans, and Independents in our sample. A state was coded as too “blue” or too “red,” and the vote of Republicans or Democrats, respectively, was weighted more heavily to even out the percentage. This correction was applied within a 5% margin of error, as this is the typical polling error.
     • Education: Having adjusted on party ideology, we then performed a mathematical correction for the representation of educational level (see Appendix for the question options) in the population of U.S. voters.
     • Undecideds: Finally, we eliminated any voters who responded “don’t know” twice when asked whom they would vote for. If a voter is not leaning towards any political candidate only a few days before the election, chances are low that they will vote at all, and if they do, they should be equally split between the two candidates. Eliminating these truly undecided voters from our sample allowed for a more realistic estimate of the popular vote.
     RESULTS
     Model #3 predicts a consistent victory for Obama over the past month, even when he was trailing in the popular vote. Unlike Model #2, which is more conservative in its Electoral College estimations than both RCP and Nate Silver, Model #3 predicts a wider margin of victory than either. The electoral vote estimations of Model #3 mirror Nate Silver’s estimations more closely than RCP’s. Nevertheless, there is a striking difference in our graph for 10/22-10/25, which shows Romney briefly ahead in the Electoral College. As of Friday (11/2), Model #3 predicts: Obama, 305; Romney, 233.
     Again, the above graph was created through a forced choice for each state between the candidates. Separating toss up states provides a glimpse into why SurveyMonkey’s numbers
  9. show a bigger lead for Obama. Overall, the graphs below show that SurveyMonkey has roughly half the number of toss up states that RCP does, but the majority of these tossup states tend to be attributed to Obama in a forced-choice scenario, creating a wide lead for Obama.
     Despite the fact that SurveyMonkey’s Electoral College projection shows a thinner margin of victory for Obama than RCP polls do, the SurveyMonkey popular vote total shows a greater margin of Obama supporters than RCP polls have. Thus, while RCP polls indicate that Romney is ahead in the popular vote, SurveyMonkey data indicates that Obama is actually in the lead.
  10. Calling the race
     Ultimately, each model is only as good as the calls it makes on the Electoral College and the overall popular vote percentages. Below are the electoral map predictions for each model and the estimations of the popular vote for each. Key differences in swing states are highlighted.
     MODEL #1: RAW
       TALLY: OBAMA 266, ROMNEY 272
       KEY TOSSUPS: CO, IA, NH / FL, NC, NV, OH, VA
       POPULAR VOTE: OBAMA 47.38%, ROMNEY 46.24%
     MODEL #2: HOW
       TALLY: OBAMA 272, ROMNEY 266
       KEY TOSSUPS: CO, IA, NH, NV / FL, NC, OH, VA
       POPULAR VOTE: OBAMA 48.33%, ROMNEY 47.11%
     MODEL #3: WHO
       TALLY: OBAMA 305, ROMNEY 233
       KEY TOSSUPS: CO, IA, NC, NH, NV, OH / FL, VA
       POPULAR VOTE: OBAMA 49.46%, ROMNEY 47.51%
  11. MODEL SUMMARY: TOSSUPS
     To provide the best possible prediction, we looked at our three models to determine which states should be labeled definitively as “tossups”. If a state was predicted differently in different models, or if the difference in Obama and Romney votes was less than 2% in any given state, we determined that it was too close to call. This led to the following overall prediction:
       ELECTORATE: OBAMA 250, ROMNEY 220, TOSSUPS 68
       KEY TOSSUPS: IA, NC, NV, OH, VA, WI
       POPULAR VOTE: OBAMA 48.90%, ROMNEY 47.31% (Average of Model #2 & Model #3)**
     **A final note on swing states: It is important to note that we do not consider Colorado, Florida, New Hampshire, and Pennsylvania swing states. Our data has shown consistent advantages for Obama in Colorado, New Hampshire, and Pennsylvania and a consistent advantage for Romney in Florida. We have only six toss up states, roughly half as many as RCP. Among our three previous models, only three states vary, and these account for the electoral differentials. Thus, regardless of which model is used, 48 out of 51 electorates stay consistent.
     OUR PICK?
     MODEL #3
     RATIONALE: Model #3 accounts for the differential between polled and actual voters without getting caught up in the pros and cons of an internet sample in particular. It is similar, but not identical, to what other pollsters are saying and has shown itself to be consistently ahead of the curve of other polls for the past month.
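The toss-up rule stated on this slide (models disagree, or any model shows a margin under 2%) can be written as a small Python function. This is a sketch: the function name, the argument layout, and the example margins are hypothetical, and only the 2% threshold and the two conditions come from the text.

```python
def final_call(model_calls, model_margins, threshold=2.0):
    """Combine per-model calls for one state into a final label.

    model_calls: winner predicted by each model, e.g. ["Obama", "Obama", "Romney"]
    model_margins: Obama minus Romney, in percentage points, for each model
    The state is "tossup" if the models disagree or if any model's margin
    is smaller than `threshold` points; otherwise the shared winner is returned.
    """
    models_disagree = len(set(model_calls)) > 1
    too_close = any(abs(m) < threshold for m in model_margins)
    if models_disagree or too_close:
        return "tossup"
    return model_calls[0]

# Hypothetical states with made-up margins.
safe_state = final_call(["Obama", "Obama", "Obama"], [5.0, 4.0, 6.0])
split_state = final_call(["Romney", "Romney", "Obama"], [-3.0, -2.5, 2.2])
close_state = final_call(["Obama", "Obama", "Obama"], [5.0, 1.5, 6.0])
```

A state needs both agreement across all three models and a margin of at least 2 points in every model to be called, so a single close model is enough to move it into the toss-up column.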
  12. Appendix - Questionnaire
     Voting Registration.
     • Are you currently a registered and eligible voter, or not?
       Yes; No
     Zip Code.
     • What is the five-digit zip code for the address you registered to vote from, or if you’re not registered to vote, what is the zip code you would use?
       (open-ended)
     Voting Likelihood.
     1. How much thought have you given to the upcoming election for president?
        Quite a lot; Some; Only a little; None; Don’t know
     2. Do you happen to know where people who live in your neighborhood go to vote?
        Yes; No
     3. Have you ever voted in your precinct or election district?
        Yes; No; Don’t know
     4. Do you, yourself, plan to vote in the election this November, or not?
        Yes; No; Don’t know
     5. How certain are you that you will vote?
        Absolutely certain; Fairly certain; Not certain; Don’t know
     6. How likely are you to vote in November’s presidential election?
        Extremely likely; Very likely; Somewhat likely; Slightly likely; Not at all likely
     7. Thinking back to the elections held for Congress in November 2010, did things come up that kept you from voting, or did you happen to vote?
        Yes, voted; No, did not vote; Don’t know
     8. How important is the presidential election to you?
        Extremely important; Very important; Somewhat important; Slightly important; Not at all important
     9. If the election were held tomorrow, would you know where to go vote?
        Yes; No
     10. How often would you say you vote - always, nearly always, part of the time, or seldom?
        Always; Nearly always; Part of the time; Seldom; Never; Don’t know
     11. Thinking back to the elections held for Congress in November 2010, did you vote?
        Yes, voted; No, did not vote; Don’t know
     Voting Preference.
     • Suppose the presidential election were held today. Who would you be likely to vote for?
       Barack Obama; Mitt Romney; Don’t know / Other
     • Which candidate are you leaning towards?
       Barack Obama; Mitt Romney; Other; Don’t know
     Demographics.
     • Generally speaking do you usually think of yourself as a Republican, a Democrat, an Independent or something else?
       Democrat; Republican; Independent; Something else
     • What is the highest level of school you have completed or the highest degree you have received?
       Less than high school degree; High school degree or equivalent; Some college but no degree; Associate degree; Bachelor degree; Graduate degree