ร้อยเรื่องราวจากข้อมูล / Storytelling with Data

Krist Wongsuphasawat
Krist Wongsuphasawat / @kristw
About me
Computer Engineer
Chulalongkorn University
PhD in Computer Science
Information Visualization
Univ. of Maryland
IBM
Microsoft
Data Scientist
Twitter
Fishery data
400
Collecting data
Time   Location  Type
12:00  Paragon   Magikarp
12:05  Siam Dis  Magikarp
12:40  CTW       Magikarp
…      …         …
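A minimal Python sketch of how a hand-collected table like the one above could be turned into counts per hour, ready for a time-of-day chart. The rows are the three shown on the slide; the function name `count_by_hour` is made up for illustration.

```python
from collections import Counter

# Rows copied from the slide's table: (time, location, type).
sightings = [
    ("12:00", "Paragon", "Magikarp"),
    ("12:05", "Siam Dis", "Magikarp"),
    ("12:40", "CTW", "Magikarp"),
]


def count_by_hour(rows):
    """Count sightings per hour of day (0-23)."""
    counts = Counter()
    for time_text, _location, _kind in rows:
        hour = int(time_text.split(":")[0])
        counts[hour] += 1
    return counts


hourly = count_by_hour(sightings)
for hour in sorted(hourly):
    print(f"{hour:02d}:00  {hourly[hour]}")
```

Bucketing by hour is only one choice; a finer bucket (for example 15 minutes) may suit a denser log better.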
[Chart: number of fish (y-axis) by time of day (x-axis: 00:00, 6:00, 12:00, 18:00, 00:00)]
DATA VISUALIZATION
Turning data into pictures
History
data
Number of Napoleon's troops,
Distance, Temperature,
Latitude and Longitude,
Direction of travel,
Location (relative to specific dates)
2 dimensions
6 types of data
DATA VISUALIZATION
Explanatory
Communicate known information
Exploratory
Explore data to reveal insights
Where do data come from?
DATA SOURCES
Open data
Publicly available
Private data
Owned by an organization, not available to the public
Self-collected data
Manual, site scraping, etc.
Combination of the above
OPEN DATA
You can also collect it yourself
Data at Twitter
Tweets
Text, Time, Location, Media
User information
Age, Country, etc.
Follows
User interactions
Navigation, Views
MANY FORMS OF DATA
Standalone files
txt, csv, tsv, json, excel, Google Docs, …, pdf*
APIs
better quality with more overhead
Databases
doesn’t necessarily mean they are organized
Big data
bigger pain
HAVING ALL TWEETS
How people think I feel. How I really feel.
CHALLENGES
Get relevant Tweets
hashtag: #oscars
keywords: “goal” (football)
Too big
Need to aggregate & reduce size
Slow
Long processing time (hours)
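A small Python sketch of the first two challenges: keep only Tweets that match a hashtag or keyword, then aggregate them into per-minute counts so the result is much smaller than the raw data. The sample records, field names (`time`, `text`) and function names are assumptions for illustration, not Twitter's actual schema or tooling.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical sample records; real Tweets carry many more fields.
tweets = [
    {"time": "2016-02-28T20:00:05", "text": "Red carpet time! #Oscars"},
    {"time": "2016-02-28T20:00:40", "text": "What a GOAL by the striker"},
    {"time": "2016-02-28T20:01:10", "text": "Dinner was great"},
    {"time": "2016-02-28T20:01:30", "text": "Best picture predictions #oscars"},
]


def tokens(text):
    """Lowercase words, keeping a leading # or @ attached to the word."""
    return re.findall(r"[#@]?\w+", text.lower())


def tweets_per_minute(records, terms):
    """Count records per minute that contain any of the given terms."""
    wanted = {term.lower() for term in terms}
    counts = Counter()
    for record in records:
        if wanted & set(tokens(record["text"])):
            stamp = datetime.fromisoformat(record["time"])
            counts[stamp.replace(second=0, microsecond=0)] += 1
    return counts


oscars = tweets_per_minute(tweets, ["#oscars"])
for minute, count in sorted(oscars.items()):
    print(minute.strftime("%H:%M"), count)
```

A plain keyword such as "goal" is ambiguous on its own, so in practice the filter would likely need extra context (language, time window, related terms) to stay relevant.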
GETTING BIG DATA
Hadoop Cluster: Data Storage -> Tool: Pig / Scalding (slow) -> Smaller dataset
Your laptop: Smaller dataset -> Tool: node.js / python / excel (fast) -> Final dataset
What do we do with the data?
APPLICATIONS OF DATA
Personal analytics
Anyone
Product analytics
Product Manager, Engineer
Data Journalism
News, Magazine, Company’s Public Relations
…
NEW YORK TIMES GRAPHICS
http://www.nytimes.com/interactive/2014/08/13/upshot/where-people-in-each-state-were-born.html?abt=0002&abg=0#New_York
THE GUARDIAN
NEWS
New York Times
The Guardian
Washington Post
Wall Street Journal
FiveThirtyEight
etc.
GOOGLE TRENDS
https://www.google.com/trends/story/US_cu_XRyhKlcBAACrtM_en
UBER
https://newsroom.uber.com/a-day-in-the-life-of-uber/
Example work
What do people tweet?
Most talked-about Pokémon
When do people tweet?
Tweets per minute
interactive.twitter.com/euro2016
Where do people tweet?
LOCATION
Low density
High density
by Miguel Rios
flickr.com/photos/twitteroffice/8798020541
San Francisco
Rebuild the world based on tweet density
twitter.github.io/interactive/andes/
by Nicolas Garcia Belmonte
What, where, and when do people tweet?
HAPPY NEW YEAR
สวัสดีปีใหม่ ("Happy New Year" in Thai)
New Year 2013
twitter.github.io/interactive/newyear2014/
Where are the users?
USER + LOCATION : FAN MAP
interactive.twitter.com/nfl_followers2014
USER + LOCATION : FAN MAP
interactive.twitter.com/nba_followers
USER + LOCATION : FAN MAP
interactive.twitter.com/premierleague
interactive.twitter.com
What are the steps?
Data analysis steps
Collect
Clean
Explore*
Analyze
Present*
CASE STUDY:
GAME OF THRONES
Problem is coming.
CHAPTER I
“Problem first, not solution backward”
— Brian Caffo (via Ron Brookmeyer)
“If all you have is a hammer,
everything looks like a nail.”
— Abraham Maslow
Problem
Want to know what the audience
says about a TV show
from Tweets
HBO’s Game of Thrones
Based on a book series “A Song of Ice and Fire”
Medieval Fantasy. Knights, magic and dragons.
Brief Story
A King dies. 
A lot of contenders wage a war
to reclaim the throne.
Minor characters with no claim to the throne
set their own plans in action to gain power
when all the major characters end up killing each other.
Brave/Honest/Honorable characters die.
Intelligent but shady characters
and characters who know nothing
continue to live.
While humans are busy killing each other,
ice zombies, the “White Walkers”, are invading from the North.
The only group that seems to care about this
is a neutral group called the Night’s Watch.
HBO’s Game of Thrones
Based on a book series “A Song of Ice and Fire”
Medieval Fantasy. Knights, magic and dragons.
Many characters.
Anybody can die.
6 seasons (57 episodes) so far
Multiple storylines in each episode
Problem
Want to know what the audience
says about a TV show
from Tweets
Ideas
Common words
Too much noise
Characters
How often was each character mentioned?
I demand a trial by prototyping.
CHAPTER II
Prototyping
Pull sample data
from Twitter API
Character recognition and counting
naive approach
Sample Tweet
List of names
Daenerys Targaryen,Khaleesi
Jon Snow
Sansa Stark
Tyrion Lannister
Arya Stark
Cersei Lannister
Khal Drogo
Gregor Clegane,Mountain
Margaery Tyrell
Joffrey Baratheon
Bran Stark
Theon Greyjoy
Jaime Lannister
Brienne
Eddard Stark,Ned Stark
Ramsay Bolton
Sandor Clegane,Hound
Ygritte
Stannis Baratheon
Petyr Baelish,Little Finger
Robb Stark
Bronn
Varys
Catelyn Stark
Oberyn Martell
Daario Naharis
Davos Seaworth
Jorah Mormont
Melisandre
Myrcella Baratheon
Tywin Lannister
Tommen Baratheon
Grey Worm
Tyene Sand
Rickon Stark
Missandei
Roose Bolton
Robert Baratheon
Jojen Reed
Jeor Mormont
Tormund Giantsbane
Lysa Arryn
Yara Greyjoy,Asha Greyjoy
Samwell Tarly,Sam
Hodor
Victarion Greyjoy
High Sparrow
Dragon
Winter
Dothraki
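One way the naive "character recognition and counting" step could look in Python, assuming each line of the list is a canonical name followed by comma-separated aliases. This is a sketch, not the talk's actual script: the sample tweets are invented, the function names are hypothetical, and only full names and listed aliases are matched (so "Jon" alone is not counted).

```python
import re
from collections import Counter

# A few lines from the list above; comma-separated entries are aliases.
NAME_LINES = [
    "Daenerys Targaryen,Khaleesi",
    "Jon Snow",
    "Eddard Stark,Ned Stark",
    "Hodor",
]


def build_patterns(lines):
    """Map each canonical name (first entry) to a regex over its aliases."""
    patterns = {}
    for line in lines:
        aliases = [alias.strip() for alias in line.split(",")]
        alternation = "|".join(re.escape(alias) for alias in aliases)
        patterns[aliases[0]] = re.compile(
            rf"\b(?:{alternation})\b", re.IGNORECASE
        )
    return patterns


def count_mentions(texts, patterns):
    """Count how many texts mention each character at least once."""
    counts = Counter()
    for text in texts:
        for name, pattern in patterns.items():
            if pattern.search(text):
                counts[name] += 1
    return counts


# Invented tweets, for illustration only.
sample_tweets = [
    "Khaleesi and her dragons!",
    "Jon Snow knows nothing",
    "HODOR hodor Hodor",
    "Poor Ned Stark. Jon Snow deserves better.",
]

counts = count_mentions(sample_tweets, build_patterns(NAME_LINES))
for name, count in counts.most_common():
    print(name, count)
```

Counting each tweet once per character keeps a tweet that repeats a name from inflating the total; counting every occurrence instead would be an equally simple variation.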
Sample data
Character Count
Hodor 10000
Jon Snow 5000
Daenerys 4000
Bran Stark 3000
… …
*These numbers are made up for presentation, not real data.
When you play the game of vis,
you iterate or you die.
CHAPTER III
Where to go from here?
+ emotion
+ connections
Gain insights from a single episode
emotion & connections
Sample data

INDIVIDUALS (+ top emojis)
Character  Count
Hodor      10000
Jon Snow   5000
Daenerys   4000
…          …

CONNECTIONS (+ top emojis)
Character        Count
Jon Snow+Sansa   1000
Tormund+Brienne  500
Bran Stark+Hodor 300
…                …

*These numbers are made up for presentation, not real data.
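A sketch of how connection counts could be derived, assuming a "connection" means two characters mentioned in the same tweet. That definition, the sample input, and the function name `count_pairs` are assumptions for illustration; the input is the list of characters already recognized in each tweet.

```python
from collections import Counter
from itertools import combinations


def count_pairs(mentions_per_tweet):
    """Count tweets in which each unordered pair of characters co-occurs."""
    pairs = Counter()
    for names in mentions_per_tweet:
        # Sort so that (A, B) and (B, A) are counted as the same pair.
        for first, second in combinations(sorted(set(names)), 2):
            pairs[(first, second)] += 1
    return pairs


# Invented input: characters recognized in five hypothetical tweets.
mentions_per_tweet = [
    ["Jon Snow", "Sansa Stark"],
    ["Sansa Stark", "Jon Snow"],
    ["Tormund Giantsbane", "Brienne"],
    ["Hodor"],
    ["Bran Stark", "Hodor", "Jon Snow"],
]

pairs = count_pairs(mentions_per_tweet)
for (first, second), count in pairs.most_common():
    print(f"{first}+{second}", count)
```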
Graph

NODES (+ top emojis)
Character  Count
Hodor      10000
Jon Snow   5000
Daenerys   4000
…          …

EDGES (+ top emojis)
Character        Count
Jon Snow+Sansa   1000
Tormund+Brienne  500
Bran Stark+Hodor 300
…                …

*These numbers are made up for presentation, not real data.
Network Visualization
Node-link diagram
Force-directed layout
http://blockbuilder.org/kristw/762b680690e4b2b2666dfec15838a384
+ Collision Detection
http://blockbuilder.org/kristw/2850f65d6329c5fef6d5c9118f1de6e6
+ Community Detection
https://github.com/upphiminn/jLouvain
+ Collision Detection (with clusters)
https://bl.ocks.org/mbostock/7881887
Let’s get other episodes.
(More) data are coming.
CHAPTER IV
More data
1 episode (1 day) => all episodes (6 years)
Rewrite the scripts
to get archived data
How much data do we need?
Whole week?
5 days?
2 days?
A day?
etc.
Hold the vis.
CHAPTER V
The vis is not enough.
Legend
Navigation
Top 3
Adjust threshold
Recap
Filtered Recap
Tooltip
Demo
https://interactive.twitter.com/game-of-thrones
Mobile Support
A visualizer always evaluates his work.
CHAPTER VI
“Feedback is the breakfast of champions.”
— Ken Blanchard
Self & Peer
Does it solve the problem?
Tormund + Brienne
Google Analytics
Pageviews
Visitors
Actions
Referrals
Sites/Social
Feedback
Summary
Data are around us and come from many sources.
Open data are valuable.
Telling stories with data is one possible application.
News, Magazine, Company PR.
It takes time and iteration,
with much trial and error.
Start with a problem, collect the data, explore, find a story and present it.
Krist Wongsuphasawat / @kristw
kristw.yellowpigz.com
The Reading Room
2 Silom Soi 19,
Bangkok, Thailand 10500
Thank you