Big data panel slides

  • Talk about how, historically, data has been tightly guarded by political parties and big organizations. It's also been limited by available tech: technology to handle terabytes of data was expensive, and out of reach of many organizations.
  • But we're seeing change. One great example of this is NationBuilder. Anyone who signs up can get access to the voter file. This is a slow but steady movement toward a greater number of people having access to the data. Some examples: (1) Democratic Data Coop, (2) Catalist, (3) the new Koch brothers data organization on the right
  • But we're also seeing a democratization of the technology. It used to be that if you wanted to handle terabytes of data, you had to go to a Teradata, a Netezza, or an Oracle. More and more, organizations are using Hadoop + Hive and other technologies to process and query large sets of data. This also corresponds to the commoditization of the hardware required for these big clusters (and the cloud is becoming a better and better option).
  • Talk about starting to put more real-time measures into models -- VF data updates once a month, maybe, but online data sources are dynamic and change on a day-to-day basis. Your voter contact system is producing new data thousands of times throughout the day. This shouldn't be a process that moves once a month.
  • Shift of what we are modeling FOR
  • Calling more cell phones isn't the answer because of the TCPA, which regulates when/how pollsters can call and makes cell interviews expensive and a major burden.
  • Some online panels are great for understanding specific audiences but there’s still debate about their use and ability to be representative of large populations for something like a ballot test.
  • This is creating a massive new industry at the intersection of tech and opinion research. I’ve even had friends and colleagues tell me they’ve built models that can predict ballot share by looking just at online conversation and activity data.
  • Here’s one race that threw a wrench into the idea of monitoring online indicators as a way of predicting ballot share.
  • You can see in that 2010 election, sure Christine O’Donnell had a LOT of search traffic, and up until election day always had a higher search volume than her opponent. But did it matter? Nope.
  • It's not hard to generate a sentiment analysis. Take Twitter: here, I can plug in and get an analysis fast (a minimal sketch of this kind of quick scoring follows these notes). However, note in my example the first "positive" tweet...
  • Really? And look - nobody gets it perfect. That’s the point. It is very, very, very hard to get this right.
  • Sentiment analysis is working its way into the political conversation as a way, in addition to polls, to understand public opinion - but does it matter?
  • If Gingrich’s online buzz was so great ahead of Super Tuesday, why did he only win one of these ten states that day? (TELL ANECDOTE ABOUT NG CAMPAIGN)
  • The point isn’t to rain on the parade of sentiment analysis. I think traditional research’s days are numbered if it doesn’t evolve. But I think sentiment analysis has a lot to learn from traditional survey research for campaigns. (Explain pros and cons)
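For reference, here is a minimal sketch of how little code this kind of one-click sentiment scoring takes, and why it can trip over tweets like the "Newt or Buzz Lightyear?" example. It uses NLTK's off-the-shelf VADER analyzer as a stand-in for twittersentiment.appspot.com, not the tool shown in the talk, and the sample tweets are invented for illustration.

```python
# A minimal stand-in for one-click tweet sentiment scoring.
# Uses NLTK's VADER analyzer (an assumption; not the hosted tool from the talk).
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)  # one-time lexicon download
analyzer = SentimentIntensityAnalyzer()

# Made-up example tweets, including the one the talk flags as a dubious "positive".
tweets = [
    "Who said it - Newt or Buzz Lightyear?",
    "Gingrich's Super Tuesday buzz is huge!",
    "Another robocall during dinner. Thanks, pollsters.",
]

for tweet in tweets:
    scores = analyzer.polarity_scores(tweet)   # keys: 'neg', 'neu', 'pos', 'compound'
    label = "positive" if scores["compound"] > 0 else "negative/neutral"
    print(f"{label:>17}  {scores['compound']:+.2f}  {tweet}")
```

The point of the sketch is the talk's point: the scoring itself is trivial to run, but a naive lexicon score has no sense of sarcasm, context, or who is talking.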

    1. 2
    2. Media / Button
    3. Splash page experiment variations - Buttons: Sign Up / Learn More / Join Us Now / Sign Up Now; Media: Get Involved Image / Family Image / Change Image / Barack's Video / Springfield Video / Sam's Video
    4. Button: "Sign Up"
    5. Button: "Learn More"
    6. Button: "Join Us Now"
    7. Button: "Sign Up Now"
    8. Splash page experiment variations - Buttons: Sign Up / Learn More / Join Us Now / Sign Up Now; Media: Get Involved Image / Family Image / Change Image / Barack's Video / Springfield Video / Sam's Video
    9. Media: "Get Involved"
    10. Media: "Family"
    11. Media: "Change"
    12. Media: "Barack's Video"
    13. Media: "Springfield Video"
    14. Media: "Sam's Video"
    15. Splash page experiment variations - Buttons: 1. Sign Up, 2. Learn More, 3. Join Us Now, 4. Sign Up Now; Media: 1. Get Involved Image, 2. Family Image, 3. Change Image, 4. Barack's Video, 5. Springfield Video, 6. Sam's Video
    16. Splash page experiment results
    17. Splash page experiment results
    18. Splash page experiment results
    19. Splash page experiment results - Email Subscriptions: 7,120,000 original vs. 10,000,000 new (+2,880,000); Volunteers: 712,000 vs. 1,000,000 (+288,000); Amount Raised: $143,000,000 vs. $200,000,000 (+$57,000,000); improvement: +40.6%
    20. Splash page experiment results - Email Subscriptions: 7,120,000 original vs. 10,000,000 new (+2,880,000); Volunteers: 712,000 vs. 1,000,000 (+288,000); Amount Raised: $143,000,000 vs. $200,000,000 (+$57,000,000); improvement: +40.6%
    21. Splash page experiment results - Email Subscriptions: 7,120,000 original vs. 10,000,000 new (+2,880,000); Volunteers: 712,000 vs. 1,000,000 (+288,000); Amount Raised: $143,000,000 vs. $200,000,000 (+$57,000,000); improvement: +40.6%
    22. Email me: dan@optimizely.com / Follow me: @dsiroker
    23. 26
    24. Modeling is using the data you have about a voter to make an informed judgment about: whether they will vote; who they will vote for; what issues affect their vote; any other question you think has predictable (reproducible) behaviors. (A minimal illustrative sketch follows this transcript.)
    25. Humans make models all the time, as we collect data & make informed judgments: 80%? likelihood of voting for Obama
    26. Survey: GOP = 94% (94/100); Statewide: GOP ~ 94% (94,000/100,000); example profile: 50 years old, white; % GOP across segments: 56%, 59%, 63%, 68%, 85%, 89%, 94%
    27. Obama campaign's Project Narwhal: knowledge management, master data management, data harmonization, voter relationship management
    28. Slowly lowering the wall between online & offline data: 1. Data providers, 2. Targeted Display Ads, 3. Facebook Apps, 4. Volunteered Association (example record: William Alexander Lundry, Registered Republican)
    29. The Conservative Data Ecosystem: RNC Data Trust, Themis, United In Purpose
    30. 33
    31. Big Political Data for the masses
    32. Periodic to real time
    33. Responds to DM, Likelihood to vote, Partisan Affiliation, Top Issue, Likelihood to unsubscribe, Likelihood to volunteer, Best Channel for Giving, Receptiveness to treatment
    34. 39
    35. The world of survey research is changing rapidly.
    36. The average phone survey response rate is around 20% and declining.
    37. 27% of US households are cell only. (CDC)
    38. 46% of American adults own a smartphone.
    39. Telephone Consumer Protection Act of 1991
    40. Online research tools are emerging.
    41. Whether online works as a tool depends on what we want to measure.
    42. As it gets harder to ask... What if we get better about listening?
    43. Sentiment? I can go to twittersentiment.appspot.com and get an analysis.
    44. Does "Who said it - Newt or Buzz Lightyear?" really count as a positive tweet?
    45. "Gingrich had 6 percent more activity than the other candidates and the positive sentiment on him related to Super Tuesday is at 84 percent. Sentiment in general online conversation about him is only at 45 percent. So it seems his folks are working the online world hard." - From POLITICO "Playbook," Super Tuesday (March 6, 2012), quoting an email from a Washington-based public affairs consultant. (Photo: Marc Grob for Time)
    46. Survey vs. Sentiment Analysis: landline bias vs. online/activist bias; contained universe vs. variable universe; concrete results vs. subject to interpretation; "snapshot" in time vs. real-time, evolving; message testing vs. identifying new trends
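As referenced on slide 24, here is a minimal, hypothetical sketch of the kind of voter model the panel describes: take the data you already have about voters and turn it into a score, for example a turnout probability. The file name, column names, and features below are invented for illustration; this is not the panelists' actual data or pipeline.

```python
# A hypothetical sketch of a simple turnout model over a voter-file extract.
# File name, columns, and features are illustrative assumptions.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Hypothetical extract: demographics plus past vote history, with a 0/1 label
# for whether the person voted in the last election.
voters = pd.read_csv("voter_file_sample.csv")
features = ["age", "is_registered_gop", "voted_2008", "voted_2010", "donated_before"]
X = voters[features]
y = voters["voted_last_election"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = LogisticRegression()
model.fit(X_train, y_train)

# Score every voter with a turnout probability -- the kind of score campaigns
# append back onto the voter file -- and check the fit on held-out voters.
voters["turnout_score"] = model.predict_proba(X)[:, 1]
print("holdout AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```

The same pattern extends to the other targets on slide 33 (likelihood to volunteer, to unsubscribe, best channel for giving, and so on) by swapping the label column.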
