Upcoming SlideShare
×

# Visualiation of quantitative information

7,764 views
8,554 views

Published on

Overviews levels of measurement and graphical approaches to analysis of univariate data.

Published in: Education
1 Comment
7 Likes
Statistics
Notes
• Full Name
Comment goes here.

Are you sure you want to Yes No

Are you sure you want to  Yes  No
Views
Total views
7,764
On SlideShare
0
From Embeds
0
Number of Embeds
1,917
Actions
Shares
0
190
1
Likes
7
Embeds 0
No embeds

No notes for slide

### Visualiation of quantitative information

1. 1. James Neill, 2011 Visualisation of quantitative information
2. 2. Overview <ul><li>Visualisation
3. 3. Approaching data
4. 4. Levels of measurement
5. 5. Principals of graphing
6. 6. Univariate graphs
7. 7. Graphical integrity </li></ul>
8. 8. Visualisation “ Visualization is any technique for creating images, diagrams, or animations to communicate a message.” - Wikipedia
9. 9. Is Pivot a turning point for web exploration? (Gary Flake) ( TED talk - 6 min. )
10. 10. Approaching data
11. 11. Approaching data Entering & screening Exploring, describing, & graphing Hypothesis testing
12. 12. Describing & graphing data THE CHALLENGE: to find a meaningful, accurate way to depict the ‘ true story’ of the data
13. 13. Get your fingers dirty with data
14. 14. Get intimate with your data
15. 15. Clearly report the data's main features
16. 16. Levels of Measurement = Type of Data Stevens (1946)
17. 17. Levels of measurement <ul><li>N ominal / Categorical
18. 18. O rdinal
19. 19. I nterval
20. 20. R atio </li></ul>
21. 21. Discrete vs. continuous Discrete - - - - - - - - - - Continuous ___________
22. 22. Each level has the properties of the preceding levels, plus something more!
23. 23. Categorical / nominal <ul><li>Conveys a category label
24. 24. (Arbitrary) assignment of #s to categories </li></ul>e.g. Gender <ul><li>No useful information, except as labels </li></ul>
25. 25. Ordinal / ranked scale <ul><li>Conveys order , but not distance </li></ul>e.g. in a race, 1st, 2nd, 3rd, etc. or ranking of favourites or preferences
26. 26. Ordinal / ranked example: Ranked importance Rank the following aspects of the university according to what is most important to you (1 = most important through to 5 = least important) __ Quality of the teaching and education __ Quality of the social life __ Quality of the campus __ Quality of the administration __ Quality of the university's reputation
27. 27. Interval scale <ul><li>Conveys order & distance
28. 28. 0 is arbitrary </li></ul>e.g., temperature (degrees C) <ul><li>Usually treat as continuous for > 5 intervals </li></ul>
29. 29. Interval example: 8 point Likert scale
30. 30. Ratio scale <ul><li>Conveys order & distance
31. 31. Continuous, with a meaningful 0 point </li></ul>e.g. height, age, weight, time, number of times an event has occurred <ul><li>Ratio statements can be made </li></ul>e.g. X is twice as old (or high or heavy) as Y
32. 32. Ratio scale: Time
33. 33. Why do levels of measurement matter? Different analytical procedures are used for different levels of data. More powerful statistics can be applied to higher levels
34. 34. Principles of graphing
35. 35. Graphs (Edward Tufte) <ul><li>Visualise data
36. 36. Reveal data </li><ul><li>Describe
37. 37. Explore
38. 38. Tabulate
39. 39. Decorate </li></ul><li>Communicate complex ideas with clarity, precision, and efficiency </li></ul>
40. 40. Tufte's graphing guidelines <ul><li>Show the data
41. 41. Avoid distortion
42. 42. Focus on substance rather than method
43. 43. Present many numbers in a small space
44. 44. Make large data sets coherent </li></ul>
45. 45. Tufte's graphing guidelines <ul><li>Maximise the information-to-ink ratio
46. 46. Encourage the eye to make comparisons
47. 47. Reveal data at several levels/layers
48. 48. Closely integrate with statistical and verbal descriptions </li></ul>
49. 49. Graphing steps <ul><li>Identify the purpose of the graph
50. 50. Select which type of graph to use
51. 51. Draw a graph
52. 52. Modify the graph to be clear, non-distorting, and well-labelled.
53. 53. Disseminate the graph (e.g., include it in a report) </li></ul>
54. 54. Software for data visualisation (graphing) <ul><li>Statistical packages </li></ul><ul><ul><li>e.g., SPSS </li></ul></ul><ul><li>Spreadsheet packages </li></ul><ul><ul><li>e.g., MS Excel </li></ul></ul><ul><li>Word-processors </li></ul><ul><ul><li>e.g., MS Word – Insert – Object – Micrograph Graph Chart </li></ul></ul>
55. 55. Univariate graphs
56. 56. Univariate graphs <ul><li>Bar graph
57. 57. Pie chart
58. 58. Data plot
59. 59. Error bar
60. 60. Stem & leaf plot
61. 61. Box plot (Box & whisker)
62. 62. Histogram </li></ul>
63. 63. Bar chart (Bar graph) <ul><li>Examine comparative heights of bars
64. 64. X-axis: Collapse if too many categories
65. 65. Y-axis: Count or % or mean?
66. 66. Consider whether to use data labels </li></ul>
67. 67. <ul><li>Use a bar chart instead
68. 68. Hard to read </li><ul><li>Does not show small differences
69. 69. Rotation / position influences perception </li></ul></ul>Pie chart
70. 70. Data plot & error bar Data plot Error bar
71. 71. Stem & leaf plot <ul><li>Alternative to histogram
72. 72. Use for ordinal, interval and ratio data
73. 73. May look confusing to unfamiliar reader </li></ul>
74. 74. <ul><li>Contains actual data
75. 75. Collapses tails </li></ul>Stem & leaf plot Frequency Stem & Leaf 7.00 1 . & 192.00 1 . 22223333333 541.00 1 . 444444444444444455555555555555 610.00 1 . 6666666666666677777777777777777777 849.00 1 . 88888888888888888888888888899999999999999999999 614.00 2 . 0000000000000000111111111111111111 602.00 2 . 222222222222222233333333333333333 447.00 2 . 4444444444444455555555555 291.00 2 . 66666666677777777 240.00 2 . 88888889999999 167.00 3 . 000001111 146.00 3 . 22223333 153.00 3 . 44445555 118.00 3 . 666777 99.00 3 . 888999 106.00 4 . 000111 54.00 4 . 222 339.00 Extremes (>=43)
76. 76. Box plot (Box & whisker) <ul><li>Useful for interval and ratio data
77. 77. Represents min., max, median, quartiles, & outliers </li></ul>
78. 78. <ul><li>Alternative to histogram
79. 79. Useful for screening
80. 80. Useful for comparing variables
81. 81. Can get messy - too much info
82. 82. Confusing to unfamiliar reader </li></ul>Box plot
83. 83. Histogram <ul><li>For continuous data
84. 84. X-axis needs a happy medium for # of categories
85. 85. Y-axis matters (can exaggerate) </li></ul>
86. 86. Histogram of male & female heights
87. 87. Non-normal distributions
88. 88. Non-normal distributions
89. 89. Histogram of weight
90. 90. Histogram of daily calorie intake
91. 91. Histogram of fertility
92. 92. Example ‘normal’ distribution 1
93. 93. Example ‘normal’ distribution 2
94. 94. Example ‘normal’ distribution 2
95. 95. Effects of skew on measures of central tendency
96. 96. <ul><li>Alternative to histogram
97. 97. Implies continuity e.g., time
98. 98. Can show multiple lines </li></ul>Line graph
99. 99. NOIR Bar chart & pie chart NOI Histogram IR Stem & leaf IR Data plot & box plot IR Error-bar IR Line graph IR Summary: Graphs & levels of measurement
100. 100. Graphical integrity (part of academic integrity)
101. 101. Graphing can be like a bikini. What they reveal is suggestive, but what they conceal is vital. (aka Aaron Levenstein)
102. 102. Graphical integrity &quot;Like good writing, good graphical displays of data communicate ideas with clarity, precision, and efficiency. Like poor writing, bad graphical displays distort or obscure the data, make it harder to understand or compare, or otherwise thwart the communicative effect which the graph should convey.&quot; Michael Friendly – Gallery of Data Visualisation
103. 103. Cleveland’s hierarchy
104. 104. Cleveland’s hierarchy: Best to worst <ul><li>Position along a common scale
105. 105. Position along identical, non aligned scales
106. 106. Length
107. 107. Angle-slope
108. 108. Area
109. 109. Volume
110. 110. Color hue - color saturation - density </li></ul>
111. 111. Tufte’s graphical integrity <ul><li>Some lapses intentional, some not
112. 112. Lie Factor = size of effect in graph size of effect in data
113. 113. Misleading uses of area
114. 114. Misleading uses of perspective
115. 115. Leaving out important context
116. 116. Lack of taste and aesthetics </li></ul>
117. 117. <ul><li>If a survey question produces a ‘floor effect’, where will the mean, median and mode lie in relation to one another?
118. 118. Over the last century, the performance of the best baseball hitters has declined. Does this imply that the overall performance of baseball batters has decreased? </li></ul>Review questions
119. 119. Can you complete this table? Level Properties Examples Descriptive Statistics Graphs Nominal /Categorical Ordinal / Rank Interval Ratio Answers: http://wilderdom.com/research/Summary_Levels_Measurement.html
120. 120. Links <ul><li>Presenting Data – Statistics Glossary v1.1 - http://www.cas.lancs.ac.uk/glossary_v1.1/presdata.html
121. 121. A Periodic Table of Visualisation Methods - http://www.visual-literacy.org/periodic_table/periodic_table.html
122. 122. Gallery of Data Visualization - http://www.math.yorku.ca/SCS/Gallery/
123. 123. Univariate Data Analysis – The Best & Worst of Statistical Graphs - http://www.csulb.edu/~msaintg/ppa696/696uni.htm
124. 124. Pitfalls of Data Analysis – http://www.vims.edu/~david/pitfalls/pitfalls.htm
125. 125. Statistics for the Life Sciences – http://www.math.sfu.ca/~cschwarz/Stat-301/Handouts/Handouts.html </li></ul>
126. 126. References <ul><li>Cleveland, W. S. (1985). The elements of graphing data . Monterey, CA: Wadsworth.
127. 127. Jones, G. E. (2006). How to lie with charts . Santa Monica, CA: LaPuerta.
128. 128. Tufte, E. (1983). The visual display of quantitative information . Cheshire, CT: Graphics Press. </li></ul>
129. 129. Open Office Impress <ul><li>This presentation was made using Open Office Impress.
130. 130. Free and open source software.
131. 131. http://www.openoffice.org/product/impress.html </li></ul>