This document discusses the theory of data visualization. Following Edward Tufte, it emphasizes showing data with clarity, precision, and efficiency. Anscombe's Quartet shows that summary statistics can mislead unless the underlying data are graphed. Effective graphs have a high data-ink ratio, are free of chartjunk such as unnecessary grids or patterns, and represent relationships accurately. The data should not be distorted to make an effect look stronger. The document also discusses redesigning graphs to reduce clutter and to integrate data and text more closely.
2. What We Will Discuss
Visualisation Excellence/Integrity
Theory of Data Visualisation
What We Will Not Discuss
Perception and Visualisation (Eye/Brain System)
Specific programs to create graphics
4. Graphical Excellence
“Excellence in statistical graphics consists of
complex ideas communicated with clarity,
precision, and efficiency”
- Edward Tufte
The Visual Display of Quantitative Information
6. The 1854 London Cholera Epidemic
Source: http://www.datavis.ca/gallery/historical.php
7. A silly theory means a silly graphic
Source: Tufte (www.wearethepractitioners.com/library/the-practitioner/2014/07/10/big-data-and-predictive-analytics)
8. Why Learn to Effectively Graph?
Visual representations not only make the patterns, trends, and
exceptions in numbers visible and understandable, they also
extend the capacity of memory, making available in front of
our eyes what we couldn't otherwise hold all at once in our
minds. In simple terms information visualization helps us think.
– Stephen Few
9. Graphical Integrity
A graphic does not distort if the visual representation of the data
is consistent with the numerical representation
10. Lie Factor
Lie factor = size of effect shown in graphic / size of effect in data
Lie factor = 2.8
“The shrinking family doctor in California”, Los Angeles Times, p. 3, August 5, 1979
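As a rough illustration of the lie factor formula, the sketch below computes it in Python. It assumes, following Tufte's definition, that the size of an effect is the relative change between two values. The function names and the input numbers are hypothetical: they were picked only so that the result matches the 2.8 quoted on this slide, and they are not the measurements behind the Los Angeles Times graphic.

```python
def effect_size(first, second):
    """Relative change from first to second (assumed meaning of 'size of effect')."""
    if first == 0:
        raise ValueError("first value must be non-zero")
    return abs(second - first) / abs(first)


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Size of effect shown in the graphic divided by size of effect in the data."""
    data_effect = effect_size(data_first, data_second)
    if data_effect == 0:
        raise ValueError("the data show no change, so the ratio is undefined")
    return effect_size(graphic_first, graphic_second) / data_effect


# Hypothetical measurements: a drawn element grows from 1.0 to 2.4 units
# while the underlying quantity grows from 100 to 150.
example = lie_factor(1.0, 2.4, 100.0, 150.0)
print(round(example, 2))
```

A value of 1 means the graphic shows the effect at its true size; values above 1 overstate it and values below 1 understate it.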
11. Theory of Data Visualisation
Data-Ink and Graphical Redesign
Chartjunk
12. Data Ink
“Above all else show the data” - Tufte
“A large share of ink on a graphic should present data-
information, the ink changing as the data change. Data-ink
is the non-erasable core of a graphic, the non-redundant
ink arranged in response to variation in the numbers
represented.”
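Tufte condenses this idea into the data-ink ratio: data-ink divided by the total ink used to print the graphic. The sketch below is one hedged way to make that concrete in Python. The element names, the ink amounts, and the choice of which elements count as data-ink are all hypothetical; in practice the classification is a judgment call.

```python
def data_ink_ratio(elements):
    """elements maps a chart element name to (ink_amount, is_data_ink)."""
    total_ink = sum(ink for ink, _ in elements.values())
    if total_ink == 0:
        raise ValueError("the graphic uses no ink")
    data_ink = sum(ink for ink, is_data in elements.values() if is_data)
    return data_ink / total_ink


# Hypothetical ink budget for a cluttered chart (arbitrary units).
cluttered = {
    "data points": (30.0, True),
    "heavy grid": (40.0, False),
    "background shading": (20.0, False),
    "frame": (10.0, False),
}

# The same chart after erasing the grid and the shading.
cleaned = {
    name: value
    for name, value in cluttered.items()
    if name not in ("heavy grid", "background shading")
}

print(data_ink_ratio(cluttered), data_ink_ratio(cleaned))
```

Erasing non-data-ink leaves the data untouched but raises the ratio, which is the direction a redesign aims for.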
13. High Data-Ink Ratio
Low Data-Ink Ratio
Erase redundant data-ink, within reason
Source: https://viscomvibz.wordpress.com/2012/02/26/the-visual-display-of-quantitative-information/
Data-Ink and Graphical Redesign
15. Chartjunk
Three widespread types of chartjunk:
Unintentional optical art
The dreaded grid
The self-promoting graphical duck
“The interior decoration of graphics generates a lot of ink that does
not tell the viewer anything new. The purpose of decoration
varies — to make the graphic appear more scientific and precise,
to enliven the display, to give the designer an opportunity to
exercise artistic skills. Regardless of its cause, it is all non-data-
ink or redundant data-ink, and it is often chartjunk.”
23. What is the problem with this graph?
Source: https://trinkerrstuff.wordpress.com/
24. An Economist’s Guide to Visualizing Data
Journal of Economic Perspectives—Volume 28, Number 1—Winter 2014—Pages 209–234
Show the data
Reduce the clutter
Integrate the text and the graph