Data Visualization
• http://www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen.html
Data Ambiguity
Failure to precisely define just what the data represent
[Figure: example chart with a single series labeled "Y-Value 1"]
Data Distortion
Exaggerating or understating the values of some of the data points
Data Distraction
Extraneous lines, graphics, etc.
[Figure: pie chart "Sales": 1st Qtr 58%, 2nd Qtr 10%, 3rd Qtr 23%, 4th Qtr 9%]
How to make graphs that work
(advice from Seth Godin)
1. Don't let popular spreadsheets be in charge
of the way you look.
2. Tell a story.
3. Follow some simple rules.*
4. Break some other rules.
Classics – The Table
• While it might be possible to display data
better graphically, a table often does the job
quite nicely.
*Godin’s Rules
• Time goes from left to right.
Sales data in units
1st Quarter   2nd Quarter   3rd Quarter   4th Quarter
8.2           1.4           3.2           1.2
Classics – Pie Charts
• Pie charts have a mixed reputation.
• They are popular in business and the media but
many information designers have criticized the
technique.
• Some claim that the pie slice shape
communicates numbers less exactly than other
possibilities such as line length.
• At least one study indicates that use of a pie chart
for analyzing a problem as opposed to a bar chart
changes the way people think about the problem.
*Godin’s Rules
• Pie charts are spectacularly overrated. If you
want to show me that four out of five
dentists prefer Trident and that we need to
target the fifth one, show me a picture of 5
dentists, but make one of them stand out. I'll
remember that.
[Figures: two pie charts titled "Sales", with slices for 1st Qtr, 2nd Qtr, 3rd Qtr, and 4th Qtr]
[Figures: two pie charts titled "Sales (% of total units)": 1st Qtr 58%, 2nd Qtr 10%, 3rd Qtr 23%, 4th Qtr 9%]
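The percentage labels on a pie chart are each value's share of the total. A minimal sketch of that arithmetic, using the quarterly unit figures from the deck's table (the function name `percent_shares` is just an illustrative choice):

```python
# Quarterly unit sales taken from the deck's "Sales data in units" table.
sales_units = {"1st Qtr": 8.2, "2nd Qtr": 1.4, "3rd Qtr": 3.2, "4th Qtr": 1.2}


def percent_shares(values):
    """Return each value as a percentage of the total of all values."""
    total = sum(values.values())
    return {label: 100 * value / total for label, value in values.items()}


shares = percent_shares(sales_units)
for label, share in shares.items():
    print(f"{label}: {share:.1f}%")
```

Rounded independently to whole numbers, these shares come to 59, 10, 23, and 9, which sum to 101. The slide's chart labels the first quarter 58%, possibly an adjustment by the charting tool so that the labels total 100, though the deck does not say.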
Your Options
(according to Yoda)
Do.
Do not.
Try.
Classics – Line Graphs
• Line graphs are classic diagrams that usually
give a good picture of the data.
• Line graphs should only be used when the
positions on the x-axis have a natural
ordering. If your labels are "2000, 2001,
2002" that's fine. If your labels are "US,
England, Germany" you should consider a bar
graph instead.
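The ordering rule above could be turned into a rough check before charting. This is only a sketch: treating "every label parses as a number" as a stand-in for "has a natural ordering" is an assumption made here, not something the deck specifies, and the function name `suggest_chart` is hypothetical.

```python
def suggest_chart(x_labels):
    """Suggest 'line' when every x label parses as a number, else 'bar'.

    Numeric labels are only a proxy for a natural ordering; this
    heuristic is an assumption for illustration.
    """
    try:
        for label in x_labels:
            float(label)
    except (TypeError, ValueError):
        return "bar"
    return "line"


print(suggest_chart(["2000", "2001", "2002"]))
print(suggest_chart(["US", "England", "Germany"]))
```

The heuristic is deliberately crude. Ordered but non-numeric labels such as "1st Qtr, 2nd Qtr" would be classed as "bar", so a real check would need an explicit list of ordered label types.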
*Godin’s Rules
• Good results should go up on the Y axis. This
means that if you're charting weight loss,
don't chart "how much I weigh" because
good results would go down. Instead, chart
"percentage of goal" or "how much I lost."
[Figure: line graph "Sales (total units)": 1st Qtr 8.2, 2nd Qtr 1.4, 3rd Qtr 3.2, 4th Qtr 1.2; y-axis from 0 to 9]
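Godin's rule that good results should go up on the Y axis usually means transforming the data before charting it. A small sketch of the weight-loss case, with made-up weigh-ins and a made-up goal weight (both hypothetical, as are the function names):

```python
def amount_lost(weights):
    """Convert a series of weigh-ins into cumulative loss since the first."""
    start = weights[0]
    return [start - w for w in weights]


def percent_of_goal(weights, goal_weight):
    """Express each weigh-in as a percentage of the way to the goal weight."""
    start = weights[0]
    return [100 * (start - w) / (start - goal_weight) for w in weights]


weigh_ins = [200, 198, 195, 193]  # hypothetical data, not from the deck
print(amount_lost(weigh_ins))
print(percent_of_goal(weigh_ins, goal_weight=190))
```

Either derived series rises as progress is made, so charting it puts good results at the top of the graph, whereas the raw weigh-ins would slope downward.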
*Godin’s Rules
• Don't connect unrelated events. For
example, a graph of IQs of everyone in your
kindergarten class should be a series of
unrelated points, not a line graph. On the
other hand, your weight loss is in fact a
continuous function, so each piece of data
should be attached.
Classics – Bar Charts
• Bar charts are classic diagrams that usually
give a good picture of the data.
• Their main problem is that when there are
many bars, labeling becomes problematic.
• They also imply that the data is discrete; if
your data is something that is plausibly
continuously changing over time, for
instance, you might consider a line graph
instead.
[Figure: bar chart "Sales (total units)": 1st Qtr 8.2, 2nd Qtr 1.4, 3rd Qtr 3.2, 4th Qtr 1.2; y-axis from 0 to 9]
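A bar chart of the same sales figures can be sketched without a spreadsheet. The example below prints a text-only chart using the standard library; the function name `text_bar_chart` and the 40-character width are arbitrary choices for illustration.

```python
# Quarterly unit sales taken from the deck's "Sales data in units" table.
sales_units = [("1st Qtr", 8.2), ("2nd Qtr", 1.4), ("3rd Qtr", 3.2), ("4th Qtr", 1.2)]


def text_bar_chart(rows, width=40):
    """Render (label, value) pairs as horizontal bars scaled to `width` characters."""
    largest = max(value for _, value in rows)
    lines = []
    for label, value in rows:
        # Scale each bar relative to the largest value so bars share one axis.
        bar = "#" * round(width * value / largest)
        lines.append(f"{label:<8}|{bar} {value}")
    return "\n".join(lines)


chart = text_bar_chart(sales_units)
print(chart)
```

Horizontal bars are used here only because they are easy to print as text. The slide's chart uses vertical bars, which keeps time running from left to right as Godin's rule suggests.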
New Classics – Network Diagram
• Real-world information often comes in the
form of relationships between entities or
items, such as people who know each other
(social networks), or Web pages that are
connected to each other.
• In a network diagram, entities are connected
to each other in the form of a node and link
diagram.
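The data behind a node-and-link diagram is just a list of pairs. A minimal sketch of turning such pairs into a structure a drawing tool could use; the names and links are hypothetical, and links are assumed to be undirected (a "knows" relationship that goes both ways):

```python
from collections import defaultdict


def build_network(links):
    """Build an undirected node-and-link structure as a node -> neighbors mapping."""
    neighbors = defaultdict(set)
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return dict(neighbors)


# Hypothetical "who knows whom" links, not from the deck.
links = [("Ann", "Bob"), ("Bob", "Cara"), ("Ann", "Cara"), ("Cara", "Dev")]
network = build_network(links)
for person in sorted(network):
    print(person, "->", sorted(network[person]))
```

For links that have a direction, such as one Web page pointing to another, only the `neighbors[a].add(b)` line would apply.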
New Classics – Word Cloud
• A "Word Cloud" enables you to see how
frequently words appear in a given text, or
see the relationship between a column of
words and a column of numbers.
• You can tweak your word "clouds" with
different fonts, layouts, and color schemes.
• Wordle.net
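Underneath a word cloud is a simple frequency count. The sketch below shows that counting step only, not the layout; the sample sentence is made up, and the tokenizing rule (lowercase letters and apostrophes) is a simplifying assumption rather than how any particular tool works.

```python
import re
from collections import Counter


def word_frequencies(text, top=5):
    """Count how often each word appears and return the `top` most common."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top)


# Hypothetical sample text, not from the deck.
sample = "data tells a story; good data tells a clear story about data"
for word, count in word_frequencies(sample):
    print(word, count)
```

A cloud generator would then draw each word at a size proportional to its count, usually after dropping very common words such as "a".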
New Classics – Infographics
• Information graphics or infographics are
graphic visual representations of
information, data or knowledge.
The future of visualization
• One word: DATA
Example: NYT Cascade
• Cascade allows for precise analysis of the
structures that underlie sharing activity on the
web.
• It links browsing behavior on a site to sharing
activity to construct a detailed picture of how
information propagates through the social media
space.
• The tool and its underlying logic may be applied
to any publisher or brand interested in
understanding how its messages are shared.
• http://nytlabs.com/projects/cascade.html
