2. Scientific data visualisation is a
collection of statistical methods, both
quantitative and qualitative, for
identifying relationships in data and
consolidating them into an illustrative,
informative summary graphic.
3. As Edward Tufte, one of the leading
authorities on data visualisation, says:
5. To illustrate the raw power of
data visualisation, let us walk
through a classic example:
Analyse the relationships in the following
four datasets to identify significant
differences between them.
6. Anscombe's Quartet
Graphs in Statistical Analysis
F. J. Anscombe
The American Statistician, Vol. 27, No. 1 (Feb. 1973), pp. 17-21.
7. Anscombe's Quartet
The summary statistics are shown in this table.
There appears to be little difference between the
four datasets.
Even when the correlation coefficient between
each x and y variable is calculated, no
differences between the datasets are visible.
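The near-identical summary statistics can be checked directly. The following is a minimal sketch in Python, using the data values as published in Anscombe (1973); the names QUARTET, pearson, summarise and SUMMARY are introduced here for illustration only and do not come from the slides.

```python
from math import sqrt
from statistics import mean, variance

# Data values from Anscombe (1973); datasets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient, computed from centred sums."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def summarise(xs, ys):
    """Mean and sample variance of x and y, plus their correlation."""
    return {
        "mean_x": mean(xs),
        "var_x": variance(xs),
        "mean_y": mean(ys),
        "var_y": variance(ys),
        "r": pearson(xs, ys),
    }


SUMMARY = {name: summarise(xs, ys) for name, (xs, ys) in QUARTET.items()}

if __name__ == "__main__":
    for name, s in SUMMARY.items():
        print(f"{name:>3}: mean_x={s['mean_x']:.2f} var_x={s['var_x']:.2f} "
              f"mean_y={s['mean_y']:.2f} var_y={s['var_y']:.2f} r={s['r']:.3f}")
```

The four rows printed by this sketch agree to about two decimal places, which is why the numbers alone give no hint that the datasets differ; only plotting them reveals the differences.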
8. Scatter Plots
Volume per day    Cost per day
      23              125
      26              140
      29              146
      33              160
      38              167
      42              170
      50              188
      55              195
      60              200
[Figure: scatter plot "Production Volume vs. Cost per Day"; x-axis: Volume per Day (0 to 70); y-axis: Cost per Day (0 to 250)]
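The scatter plot on this slide can be reproduced from the nine data pairs listed on it. The sketch below is in Python and rests on a few assumptions: the names VOLUME, COST, linear_fit and plot_scatter are hypothetical, the least-squares line and correlation coefficient are additions not shown on the slide, and plotting requires the third-party matplotlib package, which is imported only when plot_scatter is called.

```python
from math import sqrt

# The nine (volume per day, cost per day) pairs from the slide.
VOLUME = [23, 26, 29, 33, 38, 42, 50, 55, 60]
COST = [125, 140, 146, 160, 167, 170, 188, 195, 200]


def linear_fit(xs, ys):
    """Ordinary least-squares line y = intercept + slope * x, plus Pearson r."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


SLOPE, INTERCEPT, R = linear_fit(VOLUME, COST)


def plot_scatter():
    """Draw the scatter plot with the axis ranges used on the slide."""
    import matplotlib.pyplot as plt  # optional dependency, loaded lazily

    fig, ax = plt.subplots()
    ax.scatter(VOLUME, COST)
    # Fitted line (an addition for illustration, not part of the slide).
    ax.plot([0, 70], [INTERCEPT, INTERCEPT + SLOPE * 70], linestyle="--")
    ax.set_xlim(0, 70)
    ax.set_ylim(0, 250)
    ax.set_xlabel("Volume per Day")
    ax.set_ylabel("Cost per Day")
    ax.set_title("Production Volume vs. Cost per Day")
    plt.show()


if __name__ == "__main__":
    print(f"slope={SLOPE:.2f} intercept={INTERCEPT:.2f} r={R:.3f}")
    plot_scatter()
```

The fitted slope is positive and the correlation is close to 1, which matches the steadily rising pattern of points visible in the plot.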
16. Let us see another example to
illustrate the raw power of
data visualisation:
Consider the famous investigation of
marriage selection with respect to stature
conducted by Sir Francis Galton.
23. Lifeboats R dataset
A data frame with 18 observations and 8 variables:
launch   launch time in "POSIXt" format.
side     factor. Side of the boat.
boat     factor indicating the boat.
crew     number of male crew members on board.
men      number of men on board.
women    number of women (including female crew) on board.
total    total number of passengers.
cap      capacity of the boat.