2013 05 20 field_directors

Transcript

  • 1. Computational Social Science: The Pros and Cons of ‘Big Data’. Cliff Lampe, School of Information, University of Michigan. May 20, 2013. [Slide footer: Monday, May 20, 13]
  • 2. Cliff Lampe: School of Information, associate professor; social media; “socio-technical systems”; primarily a social scientist.
  • 3. Samples of my research in this area: effects of participation on Facebook; information cascades on Twitter; user collaboration on Wikipedia; discussion patterns in large-scale news sites; coordination in massive online games; information seeking on search engines vs. social media.
  • 4. Interactions in social media leave communication traces we can mine to understand social processes. These compete with insights from surveys.
  • 5. Defining “Big Data”: “Big Data” is a rough categorization, a marketing term, and a paradigm shift.
  • 6. Why “big data” has become a big deal: more devices collecting data; more data born digital; easier/cheaper to store; better processors; new skills/techniques; insights have proven effective.
  • 7. [No text on slide]
  • 8. “Big Data” started in the physical sciences.
  • 9. Big Data is increasingly being applied to social science questions.
  • 10. What counts as “big”? LHC: 0.001% of sensors lead to 25 petabytes annually. Wikipedia: 17 terabytes. Twitter: ~10 GB/day. How many observations needed to count as “big”?
  • 11. ‘Big Data’ require multiple, interlinked skills and tools.
  • 12. [No text on slide]
  • 13. Challenges in “Big Data”: capture, curation, storage, search, sharing, transfer, analysis, visualization.
  • 14. Social Media is often linked with Big Data because it is the likeliest source for human trace data.
  • 15. What is social media?
  • 16. Common characteristics: user-generated content; direct user-to-user interaction; bundles of applications; more than Facebook and Twitter.
  • 17. [No text on slide]
  • 18. [No text on slide]
  • 19. [No text on slide]
  • 20. [No text on slide]
  • 21. [No text on slide]
  • 22. [No text on slide]
  • 23. [No text on slide]
  • 24. [No text on slide]
  • 25. Social media cover a wide variety of sites.
  • 26. Trends in social media use
  • 27. [No text on slide]
  • 28. [No text on slide]
  • 29. [No text on slide]
  • 30. Social media skill: nearly 1 million people join Facebook every week; people spend on average 16 hours a month on Facebook; there are about 250 million Tweets per day; people upload 3000 pictures to Flickr every minute; Wikipedia has 17 million articles by 91,000 editors; YouTube has 490 million unique visitors per month; Google+ reached 10 million users in 16 days.
  • 31. The social media landscape is constantly changing.
  • 32. People are increasingly enacting their social lives in computer-mediated channels.
  • 33. Examples of social media / big data social insights
  • 34. [No text on slide]
  • 35. [No text on slide]
  • 36. [No text on slide]
  • 37. [No text on slide]
  • 38. [No text on slide]
  • 39. [No text on slide]
  • 40. Social media are being used to capture big social data.
  • 41. Obtaining social media trace data: “scraping”; APIs; (rare) public datasets; partnerships.
  • 42. Scraping: using software to download and store data from publicly available sites.
  • 43. Application Programming Interfaces
  • 44. [No text on slide]
  • 45. Available datasets
  • 46. Partnerships
  • 47. [No text on slide]
  • 48. Issues with social media trace data: access; representing results; representativeness; validity; cross-channel difficulty; appropriate skill sets; ethics.
  • 49. Access: Data often owned by private corporations. Need special skills to access.
  • 50. Representing results: Probability testing breaks down. Visualization is common, but limited. Training audiences for new data analysis.
  • 51. Representativeness*: How accurately do social media users represent the larger population? How do you rigorously sample from social media?
  • 52. Validity: Social media users are performing (though they don’t know scientists are observing them). Different sites have different purposes.
  • 53. Cross-channel use: How do you track one user over multiple social media sites?
  • 54. Appropriate researcher skillsets: A combination of technical and research skills is required. A new generation of “data scientists” is coming now.
  • 55. Ethics: How can a user opt out? More on a panel next session...
  • 56. Challenges in “Big Data”: capture, curation, storage, search, sharing, transfer, analysis, visualization.
  • 57. “Pros” of big data: It’s relatively cheap. Massive scale covers some sins. It’s inevitable (?). In some cases, it works.
  • 58. Insensitive Borg says... Big Data will make surveys obsolete.
  • 59. Humble Suggestions: More interdisciplinary work. Propose and fund work to test these issues. Don’t pretend it isn’t coming OR is a panacea. We’re just at the beginning of the journey.
  • 60. Social media and surveys project: Can social media data ever replace and/or supplement social measurement, especially for official statistics, based on self-reported answers to questions asked of a representative sample? Fred Conrad, Michael Schober, Josh Pasek
  • 61. Thanks! Cliff Lampe, cacl@umich.edu. Twitter: @clifflampe. Slideshare: clifflampe
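
Illustration for slide 42 (scraping). The slide describes scraping as using software to download and store data from a public site. A minimal sketch of the extraction step might look like the following, using only Python's standard library. The page markup, the `post` class name, and the `extract_posts` helper are all hypothetical; the talk does not name any tool or site, and a real scraper would also need to fetch the page and respect the site's terms of use.

```python
from html.parser import HTMLParser


class PostTextParser(HTMLParser):
    """Collect the text inside <p class="post"> elements (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self._in_post = False
        self.posts = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        if tag == "p" and ("class", "post") in attrs:
            self._in_post = True
            self.posts.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_post = False

    def handle_data(self, data):
        # Text inside nested tags such as <b> is still part of the post.
        if self._in_post:
            self.posts[-1] += data


def extract_posts(html):
    """Return the stripped text of every post paragraph in an HTML string."""
    parser = PostTextParser()
    parser.feed(html)
    parser.close()
    return [p.strip() for p in parser.posts]


# Hypothetical page content. A real scraper would download this from a public
# site (for example with urllib.request.urlopen) before parsing it.
SAMPLE_PAGE = """
<html><body>
  <p class="post">First public post</p>
  <p class="nav">Next page</p>
  <p class="post">Second <b>public</b> post</p>
</body></html>
"""

posts = extract_posts(SAMPLE_PAGE)
print(posts)
```

The parser keeps only paragraphs marked as posts and ignores navigation text, which is the kind of filtering a scraper typically needs before storing anything.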
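
Illustration for slide 50 ("Probability testing breaks down"). The slide does not say why, but one common reading is that with millions of observations even a negligible difference becomes "statistically significant". The sketch below shows this with a standard two-proportion z-test; the group sizes and the 50.0% vs. 50.5% rates are invented for illustration and are not data from the talk.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical rates: 50.5% vs. 50.0%, the same tiny gap at two sample sizes.
z_small, p_small = two_proportion_z_test(505, 1_000, 500, 1_000)
z_large, p_large = two_proportion_z_test(5_050_000, 10_000_000,
                                         5_000_000, 10_000_000)

print(f"n = 1,000 per group:      z = {z_small:.2f}, p = {p_small:.3g}")
print(f"n = 10,000,000 per group: z = {z_large:.2f}, p = {p_large:.3g}")
```

The effect size is identical in both calls; only the sample size changes. At trace-data scale the p-value stops being informative, so effect sizes and their practical importance need to be reported instead.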
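
Illustration for slide 51 (representativeness). The slide asks how well social media users represent the larger population. One standard survey technique that is sometimes applied here is post-stratification: reweight each group's mean by its known share of the target population. This is a sketch of that technique, not a method proposed in the talk. The group labels, values, and population shares below are invented, and the approach corrects only for the characteristics used as strata.

```python
def poststratified_mean(sample, population_shares):
    """Weight per-group sample means by known population shares.

    sample: list of (group, value) pairs observed on the platform.
    population_shares: dict mapping group -> share of the target population.
    """
    totals = {}
    counts = {}
    for group, value in sample:
        totals[group] = totals.get(group, 0.0) + value
        counts[group] = counts.get(group, 0) + 1

    # A group with no observations cannot be reweighted.
    missing = set(population_shares) - set(counts)
    if missing:
        raise ValueError(f"no observations for groups: {sorted(missing)}")

    return sum(
        share * totals[group] / counts[group]
        for group, share in population_shares.items()
    )


# Hypothetical sample: younger users are over-represented on the platform.
sample = (
    [("younger", 1)] * 6 + [("younger", 0)] * 2   # group mean 0.75
    + [("older", 1)] * 1 + [("older", 0)] * 1     # group mean 0.50
)
# Hypothetical shares of each group in the target population.
population_shares = {"younger": 0.3, "older": 0.7}

raw_mean = sum(value for _, value in sample) / len(sample)
weighted_mean = poststratified_mean(sample, population_shares)

print(f"raw mean:      {raw_mean:.3f}")
print(f"weighted mean: {weighted_mean:.3f}")
```

The raw mean is pulled toward the over-represented group, while the weighted mean reflects the assumed population mix. It cannot fix differences between users and non-users within the same group, which is part of the concern the slide raises.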