#MITXData "The Impending Transformation of Market Research" presented by Microsoft Research

-David Rothschild, Economist, Microsoft Research

For over 75 years, survey research has been relatively static: ask a random sample from a representative group of users, or a focus group, what they would do and report the result. In this session, David Rothschild (Economist, MSR - NYC) will demonstrate how survey research can be more efficient for creating both a snapshot of the present and forecasts of the future, with new questions and accompanying methodology that can use more cost-effective non-representative samples. The resulting snapshots and forecasts are not only more accurate than standard methods, but also more timely and granular, more relevant for the stakeholders, and more cost-effective. He will also explain how these methods, combined with a nascent growth in our ability to harness social media and other new data sources, will transform market research over the next few years.

Published in: News & Politics, Business

  1. Indicators and Forecasts. David Rothschild, PhD. August 1, 2013
  2. Mean Absolute Error: 2.78. Median Absolute Error: 2.14. Feb 16, 2012
  3. Data
     • Fundamental (politics): past election results, incumbency, presidential approval ratings, economic indicators, ideological indicators, biographical information
     • Social media: Twitter, Facebook
     • Other online: search, page-views, comments
     • Polls
     • Prediction Markets
     • Experts
     Passive Data / Active Data
  4. Why do we create Indicators & Forecasts?
  5. Why Forecasting: Efficiency
     • Business Efficiency:
       • Election Spending: $6 billion in 2012
     • Similar Methods and Uses: political economy, marketing, economic indicators, finance, public policy, business outcomes, etc.
  6. Why Forecasting: Research
     • How/Why: Not just the outcome, but how/why the outcome ultimately occurs.
  7. Why Forecasting: Necessary
     • Technology:
       • Methods almost unchanged for 75+ years, but will be totally different in 5-10 years
       • Old technology is getting more expensive
       • New technology is getting more efficient
  8. What is the Goal?
  9. Raw Data -> Indicators
     • Gather information, analyze it, and aggregate that information into indicators of upcoming events.
     • Relevant
     • Timely
     • Accurate
     • Economically Efficient
  10. Relevant?
  11. Relevant? (Oct 28)
  12. Relevant? (Oct 28)
  13. Relevant? (Oct 28)
      Obama expected to get 51% of vote.
  14. Relevant? (Oct 28)
      Obama 80% likely to win Electoral College.
  15. Relevant? (Oct 28)
      Romney up by 4 in latest Gallup poll of likely voters
      Obama 80% likely to win Electoral College
  16. Timely?
      Why I do not care about economic indicator forecasts released the night before.
  17. Timely?
      • Efficiency
        • Early: more resources left to allocate
        • Often: always updated
      • Research
        • Early: capture more of campaign
        • Often: granular
  18. Accurate?
      Supporting Actress        Nate Silver   David Rothschild
      Anne Hathaway             67.1%         99.5%
      Sally Field               13.4%         0.4%
      Helen Hunt                11.1%         0.1%
      Amy Adams                 8.4%          0.0%
      Jacki Weaver              0.0%          0.0%

      Supporting Actor          Nate Silver   David Rothschild
      Tommy Lee Jones           35.4%         44.1%
      Christoph Waltz           23.8%         40.4%
      Robert De Niro            6.4%          13.6%
      Philip Seymour Hoffman    24.1%         1.5%
      Alan Arkin                10.3%         0.4%
  19. Accurate?
      • Error
      • Calibration
      • Out-of-sample
  20. Cost Effective?
      Original Screenplay (Nate Silver, David Rothschild)
      Django Unchained: 52.0%
      Zero Dark Thirty: 27.4%
      Amour: 20.2%
      Moonrise Kingdom: 0.4%
      Flight: 0.0%

      Sound Mixing (Nate Silver, David Rothschild)
      Les Miserables: 97.4%
      Skyfall: 1.5%
      Life of Pi: 0.6%
      Argo: 0.3%
      Lincoln: 0.2%
  21. Cost Effective?
      • New Questions
      • New Answers
  22. Data
  23. Data
      • Fundamental (politics): past election results, incumbency, presidential approval ratings, economic indicators, ideological indicators, biographical information
      • Social media: Twitter, Facebook
      • Other online: search, page-views, comments
      • Polls
      • Prediction Markets
      • Experts
      Passive Data / Active Data
  24. Fundamental Data
  25. Polling & Prediction Markets
  26. GOP Primary
  27. Three 2012 Debates
  28. Social Media Data
  29. Social Media Data
  30. Social Media Data
  31. Next Generation Polling and Prediction Games
  32. Next Generation
      • Non-Random / Non-Representative Users
      • Incentivize self-selected users with high info
      • New questions (graphical interfaces)
      • New aggregation methods/market makers
      • Incentive structures for truthful participation
      • Accurate for new answers and domains
      • New types of questions: relevant & timely
      • New domains: cost effective
  33. Xbox Daily Poll
      • Between 3 and 5 questions rotated on a daily basis.
      • Over 350k answered at least once, providing demos.
      • Over 750k polls taken in total.
      • 30k+ completed 5 or more polls.
      • 10k+ completed 10 or more polls.
      • 5k+ completed 15 or more polls.
  34. Predicting the winner of a state’s Electoral College
      All races:
        Both correct: 217 races (63%)
        Both wrong: 45 races (13%)
        Disagree: 83 races (24%)
      Where the methods disagree:
        Intent correct: 20 races (24%)
        Expectations correct: 63 races (76%)
      • Voter Intentions: correct in 239 / 345 races = 69%
      • Voter Expectation: correct in 279 / 345 races = 81%
      • Difference in proportions: z = 3.52***
  35. Full Distributions
  36. Switches by Prior Support
  37. Overall Shift
      Total Shift: Shift in Likelihood of Taking Poll/Vote (65%), Shift in Support (35%)
      Shift in Support: Other to Romney (75%), Obama to Romney (25%)
  38. Real-Time Polling
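
The z statistic on slide 34 can be checked with a short calculation. The sketch below assumes the figure comes from a pooled two-proportion z-test on the counts given there (239 and 279 correct out of 345 races each); the slides do not state which test was used, and the function name `two_proportion_z` is hypothetical. Because both methods are scored on the same races, a paired test such as McNemar's may be the more natural choice, so this only shows that the reported number is consistent with the pooled calculation.

```python
import math


def two_proportion_z(successes_a: int, successes_b: int, n: int) -> float:
    """Pooled two-proportion z statistic for two methods scored on n races each.

    Hypothetical helper: the slides do not say how z was computed.
    """
    p_a = successes_a / n
    p_b = successes_b / n
    # Pooled proportion under the null hypothesis of equal accuracy.
    pooled = (successes_a + successes_b) / (2 * n)
    # Standard error of the difference between the two proportions.
    se = math.sqrt(pooled * (1 - pooled) * (2 / n))
    return (p_b - p_a) / se


# Counts as given on slide 34: voter intentions vs. voter expectations.
z = two_proportion_z(successes_a=239, successes_b=279, n=345)
print(round(z, 2))
```

Under these assumptions the result rounds to the 3.52 shown on the slide.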