
Powers and Problems of Integrating Social Media Data with Public Health and Safety

Social media sites like Twitter provide readily accessible sources of large-volume, high-velocity data streams, now referred to as "Big Data."
While private companies have already made great strides in leveraging these social media sources, many public organizations and government agencies could reap significant benefits from these resources.
Care must be exercised in this integration, however, as huge data sets come with their own intrinsic issues.
This presentation explores these advantages and hazards with several experiments that demonstrate social media data's ability to support government organizations and supplement existing programs.

Published in: Social Media

  1. Powers and Problems of Integrating Social Media Data with Public Health and Safety. Bloomberg D4GX, 28 September 2015, Bloomberg HQ. Cody Buntain and Jennifer Golbeck ({cbuntain,golbeck}@cs.umd.edu), Human-Computer Interaction Lab, University of Maryland. Gary LaFree (garylafree@gmail.com), START Center, University of Maryland.
  2. What is the point of this talk?
  3. The Point: Social media data can augment public survey-based research efforts.
  8. The Problem • Surveys are pervasive and powerful
  9. The Problem • Surveys are slow
  10. The Problem • Surveys often don't give progressive results
  11. The Problem • Surveys are expensive (Figure: census cost and total population by census year; cost axis from $0 to $14,000,000,000.)
  12. A Solution • Augment surveys with accessible and cheap* social media data • Social media data can supplement when surveys are too costly or slow (Figure: tweet count by day of collection; count axis from 3,000,000 to 6,000,000.)
  13. A Solution • Augment surveys with easily accessible social media data • Social media data could substitute when surveys are too costly or slow (Figure: tweet count by day of collection; count axis from 3,000,000 to 6,000,000.)
  14. Demonstrations • Geocoded tweets and the US Census • Drug references on social media • Sentiment and law enforcement
  15. Demonstrations • Geocoded tweets and the US Census • Drug references on social media • Sentiment and law enforcement • Focus on Twitter because tweets are easy to acquire via the public sample stream
  16. Twitter and the US Census • 1-3% of tweets have geolocation information (GPS coordinates, bounding boxes, places) • Compare Twitter distributions per state with 2013 US Census data
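The per-state comparison on this slide could be sketched roughly as follows. This is a minimal illustration, not the authors' method: the slide does not say which statistic was used, so Pearson correlation between each state's share of geotagged tweets and its share of population is an assumption here, and the state labels and counts are hypothetical placeholders rather than real measurements.

```python
from math import sqrt


def shares(counts):
    """Convert raw counts per state into fractions of the total."""
    total = sum(counts.values())
    return {state: n / total for state, n in counts.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical placeholder values for four made-up states, NOT real data.
geotagged_tweets = {"A": 1200, "B": 800, "C": 650, "D": 15}
census_population = {"A": 40, "B": 25, "C": 20, "D": 1}  # arbitrary units

tweet_share = shares(geotagged_tweets)
pop_share = shares(census_population)

# Align both distributions on the same state order before correlating.
states = sorted(geotagged_tweets)
r = pearson([tweet_share[s] for s in states], [pop_share[s] for s in states])
print(f"Pearson r between tweet share and population share: {r:.3f}")
```

Working with shares rather than raw counts keeps the two distributions on the same scale, so a state that tweets more or less than its population would suggest shows up as a gap between the two fractions.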
  18. Twitter and the NSDUH • Can Twitter track geographic trends in drug use? • The National Survey on Drug Use and Health (NSDUH) has ground truth for marijuana and cocaine • Compare state rankings for each drug
  19. Twitter and the NSDUH • Can Twitter track geographic trends in drug use? • The National Survey on Drug Use and Health (NSDUH) has ground truth for marijuana and cocaine • Compare state rankings for each drug • High rank correlation for marijuana
  20. Twitter and the NSDUH • Can Twitter track geographic trends in drug use? • The National Survey on Drug Use and Health (NSDUH) has ground truth for marijuana and cocaine • Compare state rankings for each drug • High rank correlation for marijuana • Low correlation for cocaine
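Comparing the order of states from two sources is typically done with a rank correlation. The sketch below uses Spearman's coefficient, which is an assumption: the slides only say "ranked correlation" and do not name the statistic. The per-state rates are hypothetical placeholders, not NSDUH or Twitter figures.

```python
from math import sqrt


def ranks(values):
    """Rank values from 1 (smallest) upward, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical rates for five made-up states, in the same state order.
tweet_mention_rate = [5.1, 3.2, 4.0, 1.1, 2.5]
survey_use_rate = [12.0, 9.5, 8.0, 6.0, 7.0]

rho = spearman(tweet_mention_rate, survey_use_rate)
print(f"Spearman rho: {rho:.2f}")
```

Because only the ordering matters, the two sources can use different units (mentions per tweet versus survey prevalence) without any rescaling.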
  21. Twitter Sentiment and Police • Can we measure public response to law enforcement? • Outreach from law enforcement to community is important
  22. Twitter Sentiment and Police • Compare sentiment towards police around two events: the 2013 Boston Marathon bombing and the 2014 Ferguson, MO protests (Figure: sentiment around the 2013 Boston Marathon bombings.)
  23. Twitter Sentiment and Police • Compare sentiment towards police around two events: the 2013 Boston Marathon bombing and the 2014 Ferguson, MO protests (Figure: sentiment around the 2013 Boston Marathon bombings, annotated "Significant drop".)
  24. Twitter Sentiment and Police • Compare sentiment towards police around two events: the 2013 Boston Marathon bombing and the 2014 Ferguson, MO protests (Figure: sentiment around the 2014 Ferguson, MO protests.)
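One simple way to track sentiment over time is to score each tweet against a word list and average the scores per day. The slides do not describe how sentiment was actually measured, so everything below is an assumption for illustration: the tiny lexicon, the scoring rule, and the sample tweets (with arbitrary dates) are all hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical toy lexicon; a real study would use a validated resource.
POSITIVE = {"thank", "thanks", "brave", "heroes", "safe"}
NEGATIVE = {"angry", "unfair", "afraid", "wrong"}


def score(text):
    """Count positive words minus negative words in one tweet."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def daily_mean_sentiment(tweets):
    """Map each day to the mean score of that day's tweets."""
    by_day = defaultdict(list)
    for day, text in tweets:
        by_day[day].append(score(text))
    return {day: mean(scores) for day, scores in sorted(by_day.items())}


# Hypothetical tweets mentioning police, NOT collected data.
tweets = [
    (date(2000, 1, 1), "Thanks to the police for keeping us safe"),
    (date(2000, 1, 1), "Police closed the road again"),
    (date(2000, 1, 2), "This is wrong and unfair, police"),
    (date(2000, 1, 2), "Feeling afraid of the police today"),
]

daily = daily_mean_sentiment(tweets)
for day, value in daily.items():
    print(day.isoformat(), value)
```

Plotting the daily means around an event date would give a curve of the kind shown in the figures, and a drop could then be checked against day-to-day variation before calling it significant.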
  25. The Limitations • Location data in the public stream • 1-3% of the 1% public sample might work for coarse insights • Difficult to get community-level granularity • Can sacrifice volume for specificity with GPS bounding boxes (Figure: the 1% sample stream.)
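The trade-off on this slide can be made concrete with a little arithmetic and a bounding-box check. This is a rough sketch under stated assumptions: the daily sample volume is a hypothetical figure, the box coordinates are an arbitrary made-up rectangle, and the (west, south, east, north) ordering is simply the convention chosen for this example.

```python
def expected_geotagged(sample_tweets_per_day, geo_fraction):
    """Estimate how many tweets in the sample carry location data."""
    return sample_tweets_per_day * geo_fraction


def in_bounding_box(lon, lat, box):
    """Return True if a point lies inside a (west, south, east, north) box."""
    west, south, east, north = box
    return west <= lon <= east and south <= lat <= north


# Hypothetical volume of the 1% sample stream, NOT a measured value.
SAMPLE_TWEETS_PER_DAY = 4_000_000

low = expected_geotagged(SAMPLE_TWEETS_PER_DAY, 0.01)   # 1% geotagged
high = expected_geotagged(SAMPLE_TWEETS_PER_DAY, 0.03)  # 3% geotagged
print(f"Geotagged tweets per day: roughly {low:,.0f} to {high:,.0f}")

# Hypothetical rectangle in degrees; any community-sized area works the same.
box = (-77.2, 38.8, -76.9, 39.0)
points = [(-77.0, 38.9), (-71.06, 42.36)]  # (longitude, latitude) pairs
inside = [p for p in points if in_bounding_box(p[0], p[1], box)]
print(f"{len(inside)} of {len(points)} points fall inside the box")
```

Spread across every state, a few tens of thousands of geotagged tweets per day supports only coarse comparisons, while restricting collection to one box yields fewer tweets overall but all of them from the area of interest.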
  26. The Limitations • Limited availability of data from other sources (e.g., Facebook) • Full access to data sources can be expensive • Addressable by companies?
  27. The Limitations • Twitter has known biases [1] • Mostly male (though decreasing) • Oversamples Caucasians (Figure: cropped excerpt from [1], which compares the first word of each user's self-reported name against a list of names by gender, finds a match for 64.2% of users, and reports that 71.8% of matched users have a male name; its Figure 3 plots the fraction of joining users who are male over time, 2007-01 to 2009-07, in groups of 10,000 joining users.)
  28. The Limitations • Relevancy and credibility • Huge amounts of data are nice, but we want the right data • No guarantee that things on social media are true
  29. The Conclusion • The Point: Social media data can augment public survey-based research efforts
  31. The Conclusion • The Point: Social media data can augment public survey-based research efforts • The Limits: user location information, limited data from other sources, sampling bias
  32. Questions?
