Analyzing social conversation: a guide to data mining and data visualization

These slides were presented by Mick Conroy of Tempero and Jonathan Stray of Associated Press/Overview Project as part of Social Media Week New York #smwnyc

  1. 1. Are you an NYC influencer? Find out @TemperoUK
  2. 2. Analyzing social conversation: a guide to data mining and data visualization. #SMWDataGeeks. Michael Conroy, Senior Analyst, Tempero (@mickyconroy, @TemperoUK); Jonathan Stray, Project Lead, Overview (@jonathanstray, @overviewproject)
  3. 3. WHO ARE WE? Mick Conroy, Senior Social Media Analyst, Tempero; Jonathan Stray, Project Lead, Overview, Associated Press
  4. 4. WE’RE DROWNING IN TEXT!
  5. 5. THESE ONLY GO SO FAR…
  6. 6. Word clouds? "Every time I see a word cloud presented as insight, I die a little inside." - Jacob Harris, New York Times Data Journalism Team
  7. 7. IT’S NOT JUST ‘US’ Journalists! How do you deal with WikiLeaks? Freedom of Information Requests? The problems are very similar
  8. 8. An overview of Overview. Jonathan Stray, Overview Project Lead
  9. 9. structured data
  10. 10. unstructured data
  11. 11. more video on YouTube than produced by TV networks during entire 20th century
  12. 12. 10,000 legally-required reports filed by U.S. public companies every day
  13. 13. All New York Times articles ever = 0.06 terabytes (13 million stories, 5k per story)
  14. 14. Computer-assisted unstructured text analysis buzzwords! text analysis semantic visualization automatic classification
  15. 15. Document Clustering = about cars = not about cars
  16. 16. We look at every word in the documents and find the words that make some documents different from all the others
  17. 17. Unstructured text. Not tables of numbers, or stacks of fill-in-the-blank forms, but: emails, meeting minutes, social media posts, event reports, news articles, blog posts, tweets, contracts, archives, press releases, instruction manuals, research articles, transcripts…
  18. 18. Open source project of the Associated Press overviewproject.org
  19. 19. Overview and Social Media. Mick Conroy, Senior Analyst, Tempero
  20. 20. OVERVIEW AND SOCIAL? Understanding ‘how many’ is easy Understanding what is being said - conversation analysis - is hard
  21. 21. HUMANS AND ALGORITHMS
  22. 22. Exploring social data with Overview
  23. 23. AN INTRODUCTION TO DRONES
  24. 24. AN INTRODUCTION TO DRONES
  25. 25. 7 STAGES OF DATA ANALYSIS 1. Acquire 2. Parse/Filter 3. Mine 4. Represent 5. Refine 6. Interact
  26. 26. 1. ACQUIRE
  27. 27. 1. ACQUIRE
  28. 28. 2. PARSE/FILTER (sample rows from the tweet data; every value is truncated on the slide)
      text / interaction.content | interaction.link | url
      RT @Mikoshoes17: majority of the deaths ... | http://twitter.com/Proud_Libtard/statuse ... | https://twitter.com/Proud_Libtard/statuse ...
      nicki minaj on american idol, even more ... | http://twitter.com/FOXXXI/statuses/24734 ... | https://twitter.com/FOXXXI/statuses/24734 ...
      @BarackObama U say ur foreign policy is ... | http://twitter.com/TFitz62/statuses/2473 ... | https://twitter.com/TFitz62/statuses/2473 ...
      Sept 28 we find out which arm company lo ... | http://twitter.com/NoDronesCanada/status ... | https://twitter.com/NoDronesCanada/status ...
      So GOP solutions to all foreign policy p ... | http://twitter.com/ownersmag/statuses/24 ... | https://twitter.com/ownersmag/statuses/24 ...
      @KanwalZaidi @SenRehmanMalik The stubbor ... | http://twitter.com/AymenAhmad1/statuses/ ... | https://twitter.com/AymenAhmad1/statuses/ ...
      RT @OwenJones84: Other than propping up ... | http://twitter.com/KCruickshank1/statuse ... | https://twitter.com/KCruickshank1/statuse ...
      RT @BuzzFeedAndrew: RT @BuzzFeedPol: Wha ... | http://twitter.com/svdaveo/statuses/2473 ... | https://twitter.com/svdaveo/statuses/2473 ...
      Marcus Aurelius: CIA Scrambles to Rush S ... | http://twitter.com/PacketknifeToo/status ... | https://twitter.com/PacketknifeToo/status ...
      Ellip6: from military drones to the Worl ... | http://twitter.com/RSIClub/statuses/2473 ... | https://twitter.com/RSIClub/statuses/2473 ...
      Parrot Ar.Drone 2.0 - полетееел ... | http://twitter.com/Gertsevoa/statuses/24 ... | https://twitter.com/Gertsevoa/statuses/24 ...
      RT @OwenJones84: Other than propping up ... | http://twitter.com/NooseCorpSE/statuses/ ... | https://twitter.com/NooseCorpSE/statuses/ ...
  29. 29. 3. MINE
  30. 30. The end result
  31. 31. 4. REPRESENT
  32. 32. 5. REFINE
  33. 33. 5. REFINE
  34. 34. 5. REFINE
  35. 35. 6. MAKE IT INTERACTIVE. Add methods for people to manipulate the data, or control what features are visible. More importantly, it empowers people to explore your data! Also good for showing off.
  36. 36. MAKE DECISIONS. Do something with the data. If you don’t, you’ve just made expensive graphs. What do you know now that you didn’t know before?
  37. 37. WHAT ELSE COULD YOU DO? Look through the rest of DataSift’s data. Make ever more beautiful (and useful!) visualizations.
  38. 38. OR GET IN TOUCH WITH US!
  39. 39. WE WORK WITH EVERYTHING
  40. 40. Are you an NYC influencer? Find out @TemperoUK
  41. 41. Thanks for not falling asleep. Any questions? #SMWDataGeeks overviewproject.org Michael Conroy, Senior Analyst, Tempero (@mickyconroy, @TemperoUK); Jonathan Stray, Project Lead, Overview (@jonathanstray, @overviewproject)
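
Slide 16 describes looking at every word in the documents to find the words that make some documents different from all the others. One common way to do this is TF-IDF weighting. The sketch below illustrates that general idea only; the slides do not say which method Overview uses, and the function names (`tokenize`, `distinctive_words`) and sample sentences are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def distinctive_words(documents, top_n=3):
    """Return, for each document, its top_n words ranked by TF-IDF score.

    Assumption: plain TF-IDF (term frequency times log inverse document
    frequency). This is an illustrative stand-in, not Overview's algorithm.
    """
    token_lists = [tokenize(doc) for doc in documents]
    n_docs = len(token_lists)

    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))

    results = []
    for tokens in token_lists:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero on empty documents
        scores = {
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
        # Highest score first; ties are broken alphabetically for stable output.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        results.append([word for word, _ in ranked[:top_n]])
    return results


# Hypothetical sample documents, in the spirit of the "about cars" slide.
sample_docs = [
    "the car has a fast engine",
    "the car needs new tires",
    "the senate passed the budget",
]
top_words = distinctive_words(sample_docs, top_n=2)
```

A word that appears in every document (here, "the") scores zero and never ranks as distinctive, while words confined to one document rise to the top. Documents that share high-scoring words can then be grouped, which is the intuition behind the document clustering on slide 15.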