Social Networking: Visualizing Twitter
TEAM BIRCH: Chris, Ruth, Nut, Aminu and Anil

Slides by TEAM BIRCH from the SICSA Big Data InfoVis Summer School 2013 -
Members:
Ruth Agbakoba
Anil Bandhakavi
Aminu Muhammad
Chris Hillman
Nut Limsopathan

Published in: Technology, Business

  1. Social Networking: Visualizing Twitter
     TEAM BIRCH: Chris, Ruth, Nut, Aminu and Anil
  2. Overview
     1. Introduction
     2. Background to Twitter and Boston Bombings
     3. Big and Dirty Data Issues
     4. Process: Capturing the integrated learning process
     5. 5 W’s of Twitter Analytics
     6. DEMO ‘Visualisation’
     7. Further Work
     8. Learning Outcomes
  3. Who We Are
     • Aminu
     • Anil
     • Chris
     • Nut
     • Ruth
  4. Our Data
     • Twitter data from 16:00 to 19:00 RE: Boston Marathon (Bombing)
     • Approx 550,000 tweets covering the 3 hour period
     • Challenges
       – Data format
       – Lack of information
       – UserIDs vs. UserNames
  5. Big and Dirty Data Issues
     1. Each tweet should have a record of its own! (Lines)
     2. Formatting issues
     3. No standardisation (only ~10% of tweets had geo-location)
     4. Only 5 fields, so we had to create three more
     5. Different languages
     6. Information overload: many different patterns were identified, making it difficult to focus on a particular visualisation.
  6. Overview of Process
     Stages: Acquire, Parse/Filter/Mine, Represent, Interact
     • Python script harvests tweets using the Twitter API
     • MapReduce code processes tweets
     • Write out text files relevant to the analytics
     • Create visualisation in Tableau Public and Google Fusion
     • Display in web portal on user's screen
  7. Map Reduce
     MapReduce code processes tweets
     • Parse
       – Added information where possible: retweet/hashtag/touser
     • Filter
       – Remove records with invalid fields
       – Split into geocoded and non-geocoded
     • Mine
       – Word counts
       – Hashtag counts: all, and split by location / original vs. retweet
       – Sentiment extraction
     Stages: Acquire, Parse/Filter/Mine, Represent, Interact
  8. Visualisation Tools Used
     Created a Real-time Twitter Analytics Portal with
     • Tableau Public
     • Google Fusion
     • Wix Web Portal
     • Purpose:
       – Insight
       – Exploratory
       – Confirmation
  9. Twitter Analytics
     • 5 W’s of Social Media!
       – Who
       – What
       – Where
       – When
       – Why
  10. DEMO
  11. Future Work
      • Gain a holistic view of the story over time
        – Bombing – 15th April
        – Shooting – 18th April
        – Fire fight & Manhunt – 19th April
      • Reflect the story as it evolved
        – Clustering
        – NLP (to move from basic to advanced analytics)
        – Explore more visualisation types
  12. Thank you for Listening!
      TEAM BIRCH: Chris, Ruth, Nut, Aminu and Anil
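
The Parse/Filter/Mine steps listed on slide 7 can be illustrated with a small in-process sketch. The slides do not include the team's actual MapReduce code, so everything below is an assumption made for illustration: the record layout (a dict with "user", "text" and optional "lat"/"lon"), the regular expressions, and all function names are hypothetical, and the map and reduce phases run in plain Python rather than on a cluster.

```python
import re
from itertools import groupby

# Hypothetical record layout: the slides do not list the five original fields,
# so this sketch assumes a dict with "user", "text" and optional "lat"/"lon".
HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"^@(\w+)")
RETWEET_RE = re.compile(r"^RT\s+@\w+", re.IGNORECASE)


def is_valid(record):
    """Filter: reject records whose user or text field is missing or empty."""
    return bool(record.get("user")) and bool((record.get("text") or "").strip())


def parse(record):
    """Parse: derive the retweet / hashtag / touser fields from the tweet text."""
    text = record["text"].strip()
    to_user = MENTION_RE.match(text)
    return {
        **record,
        "is_retweet": bool(RETWEET_RE.match(text)),
        "hashtags": [tag.lower() for tag in HASHTAG_RE.findall(text)],
        "to_user": to_user.group(1) if to_user else None,
    }


def is_geocoded(record):
    """Filter: a record counts as geocoded when both coordinates are present."""
    return record.get("lat") is not None and record.get("lon") is not None


def map_hashtags(record):
    """Map: emit ((hashtag, kind), 1), where kind is 'retweet' or 'original'."""
    kind = "retweet" if record["is_retweet"] else "original"
    for tag in record["hashtags"]:
        yield (tag, kind), 1


def reduce_counts(pairs):
    """Reduce: sort by key (a stand-in for the shuffle), then sum per key."""
    ordered = sorted(pairs, key=lambda kv: kv[0])
    return {
        key: sum(value for _, value in group)
        for key, group in groupby(ordered, key=lambda kv: kv[0])
    }


def run(records):
    """Run filter, parse, geo split and hashtag counting over raw records."""
    valid = [parse(r) for r in records if is_valid(r)]
    geocoded = [r for r in valid if is_geocoded(r)]
    non_geocoded = [r for r in valid if not is_geocoded(r)]
    pairs = [kv for r in valid for kv in map_hashtags(r)]
    return geocoded, non_geocoded, reduce_counts(pairs)
```

Calling `run` on a list of such dicts returns the geocoded records, the non-geocoded records, and a dictionary of hashtag counts keyed by `(hashtag, "original" | "retweet")`. Word counts would follow the same map and reduce pattern with a different mapper.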
