Data Visualization at Twitter

My talk at the Hacks & Hackers Meetup SF at Twitter HQ on Oct 8, 2014

Published in: Data & Analytics

  1. Data visualization at Twitter Krist Wongsuphasawat / @kristw
  2. Krist Wongsuphasawat / @kristw
  3. Bangkok, Thailand Krist Wongsuphasawat / @kristw
  4. Computer Engineer Bangkok, Thailand Chulalongkorn University Krist Wongsuphasawat / @kristw
  5. Computer Engineer Bangkok, Thailand Programming + Soccer Krist Wongsuphasawat / @kristw
  6. Computer Engineer Bangkok, Thailand Programming + Soccer Krist Wongsuphasawat / @kristw
  7. Computer Engineer Bangkok, Thailand Programming + Soccer Krist Wongsuphasawat / @kristw
  8. Computer Engineer Bangkok, Thailand M.S. in Computer Science Univ. of Maryland Krist Wongsuphasawat / @kristw
  9. Computer Engineer Bangkok, Thailand PhD in Computer Science Univ. of Maryland Information Visualization Krist Wongsuphasawat / @kristw
  10. Computer Engineer Bangkok, Thailand PhD in Computer Science Univ. of Maryland Information Visualization Krist Wongsuphasawat / @kristw IBM Microsoft
  11. Computer Engineer Bangkok, Thailand PhD in Computer Science Univ. of Maryland Information Visualization Krist Wongsuphasawat / @kristw IBM Microsoft Sr. Data Visualization Scientist Twitter
  12. data visualization at Twitter
  13. data visualization at Twitter
  14. visualization data at Twitter
  15. vis data at Twitter
  16. data at Twitter “Tweets”
  17. data at Twitter “Tweets” #events TV Shows New Year Earthquake Oscars Protest Super Bowl World Cup Election Breaking news …
  18. data at Twitter “Tweets” #events TV Shows New Year Earthquake Oscars Protest Super Bowl World Cup Election Breaking news … #curiosity Sleep pattern Human behavior Language …
  19. data at Twitter “Tweets” #events TV Shows New Year Earthquake Oscars Protest Super Bowl World Cup Election Breaking news … #curiosity Sleep pattern Human behavior Language … What could we learn from the Tweets?
  20. vis data at Twitter “Tweets” Goal: Tell stories about an event, pursue curiosity or inspiration
  21. vis data at Twitter “Tweets” Goal: Tell stories about an event, pursue curiosity or inspiration (with deadline)
  22. Challenge accepted
  23. vis data at Twitter “Tweets” Get data 1
  24. easy?
  25. Having all Tweets How people think I feel.
  26. Having all Tweets How people think I feel. How I really feel.
  27. Challenges • Too much data • Want only relevant Tweets • hashtag: #BRA • keywords: “goal” • Need to aggregate & reduce size • Long processing time (hours)
  28. Hadoop Cluster Vertica Pig / Scalding (slow) SQL Data Storage Tool Workflow
  29. Hadoop Cluster Vertica Pig / Scalding (slow) SQL Data Storage Tool Workflow
  30. Workflow Hadoop Cluster Vertica Pig / Scalding (slow) SQL Data Storage Tool Your laptop Smaller dataset
  31. Workflow: Data Storage (Hadoop Cluster, Vertica) → Tool: Pig / Scalding (slow), SQL → Smaller dataset → Your laptop → Tool: node.js / python / excel (fast) → Final dataset
  32. vis data at Twitter “Tweets” Get data 1 2 Visualize
  33. Visualize • Peek into data • Check data & test ideas • Decide how to visualize • Guided by data type • Choose tools • Start building
  34. Visualize • Peek into data • Check data & test ideas • Decide how to visualize • Guided by data type • Choose tools • Start building R d3 Tableau Yeoman
  35. Data: What? TEXT (+ media: photos, videos), Where? GEO, When? TIME
  36. Visualize Data What? TEXT Where? When? GEO TIME
  37. Visualize Data What? TEXT Where? When? GEO TIME
  38. Time Tweets/second
  39. Time Tweets/second
  40. Time Tweets/second + Annotation http://www.flickr.com/photos/twitteroffice/5681263084/
  41. Visualize Data What? TEXT Where? When? GEO TIME
  42. Geo Heatmap Low density High density
  43. Geo San Francisco flickr.com/photos/twitteroffice/8798020541 Low density High density
  44. Geo San Francisco Rebuild the world based on tweet volumes twitter.github.io/interactive/andes/
  45. Visualize Data What? TEXT Where? When? GEO TIME
  46. Text www.wordle.net Some experiments during World Cup
  47. Text www.wordle.net Word cloud of Tweets right after the 1st goal
  48. Text Word cloud of Tweets right after the 1st goal It was an “own” goal. www.wordle.net
  49. Text WordTree [Wattenberg & Viégas 2008] www.jasondavies.com/wordtree
  50. Visualize Data What? TEXT Where? When? GEO TIME
  51. Time + Geo Japan Earthquake 2011 blog.twitter.com/2011/global-pulse youtu.be/SybWjN9pKQk
  52. Time + Geo Tweet pattern [Rios & Lin 2012] Night Late night Daytime Night Late night Daytime
  53. Time + Geo Tweet pattern [Rios & Lin 2012] Night Late night Daytime Night Late night Daytime
  54. Time + Geo Tweet pattern [Rios & Lin 2012] Night Late night Daytime Night Late night Daytime
  55. Time + Geo Tweet pattern [Rios & Lin 2012] Night Late night Daytime Night Late night Daytime
  56. Visualize Data What? TEXT Where? When? GEO TIME
  57. Geo + Text Real-time Tweet map
  58. Geo + Text Real-time Tweet map
  59. Geo + Text Real-time Tweet map most frequent term
  60. Geo + Text Real-time Tweet map Gmail was down Jan 24, 2014
  61. Geo + Text Real-time Tweet map Nelson Mandela passed away Dec 5, 2013
  62. Visualize Data What? TEXT Where? When? GEO TIME
  63. Time + Text UEFA Champions League Biggest tournament for European soccer clubs Many Tweets during the matches
  64. UEFA Champions League Team 1 Team 2 Time + Text Dortmund Bayern Munich
  65. UEFA Champions League Team 1 Team 2 Time + Text Dortmund Bayern Munich
  66. UEFA Champions League Team 1 Team 2 Time + Text Dortmund Bayern Munich
  67. UEFA Champions League Team 1 Team 2 Dortmund Bayern Munich Count Tweets mentioning the teams every minute Time + Text
  68. Time + Text UEFA Champions League
  69. Time + Text UEFA Champions League + “goal” count + context
  70. + “offside” Time + Text UEFA Champions League
  71. + players Time + Text UEFA Champions League
  72. Competition Tree vs vs A B C D vs A C C
  73. Competition Tree vs vs A B C D vs + A C C
  74. Competition Tree vs vs A B C D vs + = A C C
  75. Visualize Data What? TEXT Where? When? GEO TIME
  76. Time + Text + Geo State of the Union twitter.github.io/interactive/sotu2014
  77. Time + Text + Geo State of the Union 1) timeline + topic from Tweets 2) context (speech) 3) Volume of Tweets by topics during selected part of the SOTU 4) Density map of Tweets about selected topic twitter.github.io/interactive/sotu2014
  78. Time + Text World Cup 2014
  79. Time + Text + Geo World Cup 2014
  80. Visualize Data What? TEXT Where? When? GEO TIME
  81. Visualize Data What? TEXT Where? GEO When? TIME + Non-Twitter data: CONTEXT
  82. Time + Text New Year 2014
  83. Time + Text New Year 2014
  84. Time + Text + Geo (c) New Year 2014 twitter.github.io/interactive/newyear2014/
  85. vis data at Twitter “Tweets” Get data 1 2 Visualize
  86. vis data at Twitter “Tweets” Get data 1 2 Visualize Evaluate 3
  87. vis data at Twitter “Tweets” Get data 1 2 Visualize Evaluate 3 Iterate!
  88. Evaluation • Self • Peer feedback • Non team members / Potential audience
  89. vis data at Twitter Get data 1 2 Visualize Evaluate 3
  90. vis data at Twitter Get data 1 2 Visualize Evaluate 3 big data => small data
  91. vis data at Twitter Get data 1 2 Visualize Evaluate 3 big data => small data What? Where? When?
  92. big data => small data self, peer, external vis data at Twitter Get data 1 2 Visualize Evaluate 3 What? Where? When?
  93. big data => small data self, peer, external vis data at Twitter “Tweets” Get data 1 2 Visualize Evaluate 3 What? Where? When?
  94. big data => small data self, peer, external vis data at Twitter “Tweets” Get data 1 2 Visualize Evaluate 3 What? Where? When? • users • followers graph • logs • etc. • derived data: language, sentiment
  95. big data => small data self, peer, external vis data at Twitter “Tweets” Get data 1 2 Visualize Evaluate 3 What? Where? When? • users Who? … • followers graph • logs • etc. • derived data: language, sentiment
  96. big data => small data self, peer, external vis data at Twitter “Tweets” Get data 1 2 Visualize Evaluate 3 What? Where? When? • users Who? … • followers graph • logs • etc. • derived data: language, sentiment (with deadline)
  97. big data => small data self, peer, external vis data at Twitter “Tweets” Get data 1 2 Visualize Evaluate 3 What? Where? When? • users Who? … • followers graph • logs • etc. (with deadline) • derived data: language, sentiment @kristw / https://interactive.twitter.com
  98. big data => small data self, peer, external vis data at Twitter “Tweets” Get data 1 2 Visualize Evaluate 3 What? Where? When? • users Who? … • followers graph • logs • etc. (with deadline) • derived data: language, sentiment @kristw / https://interactive.twitter.com + visualizations by @philogb, @miguelrios & @trebor
  99. Questions?
  100. big data => small data self, peer, external vis data at Twitter “Tweets” Get data 1 2 Visualize Evaluate 3 What? Where? When? • users Who? … • followers graph • logs • etc. (with deadline) @kristw / https://interactive.twitter.com + visualizations by @philogb, @miguelrios & @trebor
  101. Thank you
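
Slides 27 and 67 describe the "big data => small data" step: keep only the relevant Tweets (by hashtag or keyword), then count them per minute. The sketch below illustrates that idea on a laptop-sized sample. It is not the pipeline used at Twitter; the record fields (`created_at`, `text`), the sample Tweets, and the function names `is_relevant` and `tweets_per_minute` are assumptions made up for this example.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical sample records; the slides do not show the real Tweet schema.
SAMPLE_TWEETS = [
    {"created_at": "2014-06-12T20:11:05", "text": "GOAL! What a start #BRA"},
    {"created_at": "2014-06-12T20:11:40", "text": "own goal, unlucky"},
    {"created_at": "2014-06-12T20:12:10", "text": "Lunch was great today"},
    {"created_at": "2014-06-12T20:12:30", "text": "Come on #bra"},
]


def is_relevant(text, terms):
    """Return True if any token of the text matches a term.

    `terms` must be lowercase, e.g. {"#bra", "goal"}. Tokens are words with
    an optional leading '#', compared case-insensitively.
    """
    tokens = set(re.findall(r"#?\w+", text.lower()))
    return not tokens.isdisjoint(terms)


def tweets_per_minute(tweets, terms):
    """Count relevant Tweets in each one-minute bucket, sorted by time."""
    counts = Counter()
    for tweet in tweets:
        if is_relevant(tweet["text"], terms):
            created = datetime.fromisoformat(tweet["created_at"])
            counts[created.strftime("%Y-%m-%d %H:%M")] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    per_minute = tweets_per_minute(SAMPLE_TWEETS, {"#bra", "goal"})
    for minute, count in per_minute.items():
        print(minute, count)
```

The resulting per-minute counts are small enough to feed directly into a Tweets/second style timeline like the one on slides 38 to 40. Matching whole tokens rather than substrings is a design choice made here so that a keyword such as "goal" does not also match longer words.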
