Visualiation of quantitative information

  • 4,405 views
Uploaded on

Overviews levels of measurement and graphical approaches to analysis of univariate data.

Overviews levels of measurement and graphical approaches to analysis of univariate data.

More in: Education
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
No Downloads

Views

Total Views
4,405
On Slideshare
0
From Embeds
0
Number of Embeds
6

Actions

Shares
Downloads
148
Comments
1
Likes
2

Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide

Transcript

  • 1. James Neill, 2011 Visualisation of quantitative information
  • 2. Overview
    • Visualisation
    • 3. Approaching data
    • 4. Levels of measurement
    • 5. Principals of graphing
    • 6. Univariate graphs
    • 7. Graphical integrity
  • 8. Visualisation “ Visualization is any technique for creating images, diagrams, or animations to communicate a message.” - Wikipedia
  • 9. Is Pivot a turning point for web exploration? (Gary Flake) ( TED talk - 6 min. )
  • 10. Approaching data
  • 11. Approaching data Entering & screening Exploring, describing, & graphing Hypothesis testing
  • 12. Describing & graphing data THE CHALLENGE: to find a meaningful, accurate way to depict the ‘ true story’ of the data
  • 13. Get your fingers dirty with data
  • 14. Get intimate with your data
  • 15. Clearly report the data's main features
  • 16. Levels of Measurement = Type of Data Stevens (1946)
  • 17. Levels of measurement
  • 21. Discrete vs. continuous Discrete - - - - - - - - - - Continuous ___________
  • 22. Each level has the properties of the preceding levels, plus something more!
  • 23. Categorical / nominal
    • Conveys a category label
    • 24. (Arbitrary) assignment of #s to categories
    e.g. Gender
    • No useful information, except as labels
  • 25. Ordinal / ranked scale
    • Conveys order , but not distance
    e.g. in a race, 1st, 2nd, 3rd, etc. or ranking of favourites or preferences
  • 26. Ordinal / ranked example: Ranked importance Rank the following aspects of the university according to what is most important to you (1 = most important through to 5 = least important) __ Quality of the teaching and education __ Quality of the social life __ Quality of the campus __ Quality of the administration __ Quality of the university's reputation
  • 27. Interval scale
    • Conveys order & distance
    • 28. 0 is arbitrary
    e.g., temperature (degrees C)
    • Usually treat as continuous for > 5 intervals
  • 29. Interval example: 8 point Likert scale
  • 30. Ratio scale
    • Conveys order & distance
    • 31. Continuous, with a meaningful 0 point
    e.g. height, age, weight, time, number of times an event has occurred
    • Ratio statements can be made
    e.g. X is twice as old (or high or heavy) as Y
  • 32. Ratio scale: Time
  • 33. Why do levels of measurement matter? Different analytical procedures are used for different levels of data. More powerful statistics can be applied to higher levels
  • 34. Principles of graphing
  • 35. Graphs (Edward Tufte)
    • Visualise data
    • 36. Reveal data
    • Communicate complex ideas with clarity, precision, and efficiency
  • 40. Tufte's graphing guidelines
    • Show the data
    • 41. Avoid distortion
    • 42. Focus on substance rather than method
    • 43. Present many numbers in a small space
    • 44. Make large data sets coherent
  • 45. Tufte's graphing guidelines
    • Maximise the information-to-ink ratio
    • 46. Encourage the eye to make comparisons
    • 47. Reveal data at several levels/layers
    • 48. Closely integrate with statistical and verbal descriptions
  • 49. Graphing steps
    • Identify the purpose of the graph
    • 50. Select which type of graph to use
    • 51. Draw a graph
    • 52. Modify the graph to be clear, non-distorting, and well-labelled.
    • 53. Disseminate the graph (e.g., include it in a report)
  • 54. Software for data visualisation (graphing)
    • Statistical packages
      • e.g., SPSS
    • Spreadsheet packages
      • e.g., MS Excel
    • Word-processors
      • e.g., MS Word – Insert – Object – Micrograph Graph Chart
  • 55. Univariate graphs
  • 56. Univariate graphs
  • 63. Bar chart (Bar graph)
    • Examine comparative heights of bars
    • 64. X-axis: Collapse if too many categories
    • 65. Y-axis: Count or % or mean?
    • 66. Consider whether to use data labels
  • 67.
    • Use a bar chart instead
    • 68. Hard to read
      • Does not show small differences
      • 69. Rotation / position influences perception
    Pie chart
  • 70. Data plot & error bar Data plot Error bar
  • 71. Stem & leaf plot
    • Alternative to histogram
    • 72. Use for ordinal, interval and ratio data
    • 73. May look confusing to unfamiliar reader
  • 74.
    • Contains actual data
    • 75. Collapses tails
    Stem & leaf plot Frequency Stem & Leaf 7.00 1 . & 192.00 1 . 22223333333 541.00 1 . 444444444444444455555555555555 610.00 1 . 6666666666666677777777777777777777 849.00 1 . 88888888888888888888888888899999999999999999999 614.00 2 . 0000000000000000111111111111111111 602.00 2 . 222222222222222233333333333333333 447.00 2 . 4444444444444455555555555 291.00 2 . 66666666677777777 240.00 2 . 88888889999999 167.00 3 . 000001111 146.00 3 . 22223333 153.00 3 . 44445555 118.00 3 . 666777 99.00 3 . 888999 106.00 4 . 000111 54.00 4 . 222 339.00 Extremes (>=43)
  • 76. Box plot (Box & whisker)
    • Useful for interval and ratio data
    • 77. Represents min., max, median, quartiles, & outliers
  • 78.
    • Alternative to histogram
    • 79. Useful for screening
    • 80. Useful for comparing variables
    • 81. Can get messy - too much info
    • 82. Confusing to unfamiliar reader
    Box plot
  • 83. Histogram
    • For continuous data
    • 84. X-axis needs a happy medium for # of categories
    • 85. Y-axis matters (can exaggerate)
  • 86. Histogram of male & female heights
  • 87. Non-normal distributions
  • 88. Non-normal distributions
  • 89. Histogram of weight
  • 90. Histogram of daily calorie intake
  • 91. Histogram of fertility
  • 92. Example ‘normal’ distribution 1
  • 93. Example ‘normal’ distribution 2
  • 94. Example ‘normal’ distribution 2
  • 95. Effects of skew on measures of central tendency
  • 96.
    • Alternative to histogram
    • 97. Implies continuity e.g., time
    • 98. Can show multiple lines
    Line graph
  • 99. NOIR Bar chart & pie chart NOI Histogram IR Stem & leaf IR Data plot & box plot IR Error-bar IR Line graph IR Summary: Graphs & levels of measurement
  • 100. Graphical integrity (part of academic integrity)
  • 101. Graphing can be like a bikini. What they reveal is suggestive, but what they conceal is vital. (aka Aaron Levenstein)
  • 102. Graphical integrity "Like good writing, good graphical displays of data communicate ideas with clarity, precision, and efficiency. Like poor writing, bad graphical displays distort or obscure the data, make it harder to understand or compare, or otherwise thwart the communicative effect which the graph should convey." Michael Friendly – Gallery of Data Visualisation
  • 103. Cleveland’s hierarchy
  • 104. Cleveland’s hierarchy: Best to worst
  • 111. Tufte’s graphical integrity
    • Some lapses intentional, some not
    • 112. Lie Factor = size of effect in graph size of effect in data
    • 113. Misleading uses of area
    • 114. Misleading uses of perspective
    • 115. Leaving out important context
    • 116. Lack of taste and aesthetics
  • 117.
    • If a survey question produces a ‘floor effect’, where will the mean, median and mode lie in relation to one another?
    • 118. Over the last century, the performance of the best baseball hitters has declined. Does this imply that the overall performance of baseball batters has decreased?
    Review questions
  • 119. Can you complete this table? Level Properties Examples Descriptive Statistics Graphs Nominal /Categorical Ordinal / Rank Interval Ratio Answers: http://wilderdom.com/research/Summary_Levels_Measurement.html
  • 120. Links
    • Presenting Data – Statistics Glossary v1.1 - http://www.cas.lancs.ac.uk/glossary_v1.1/presdata.html
    • 121. A Periodic Table of Visualisation Methods - http://www.visual-literacy.org/periodic_table/periodic_table.html
    • 122. Gallery of Data Visualization - http://www.math.yorku.ca/SCS/Gallery/
    • 123. Univariate Data Analysis – The Best & Worst of Statistical Graphs - http://www.csulb.edu/~msaintg/ppa696/696uni.htm
    • 124. Pitfalls of Data Analysis – http://www.vims.edu/~david/pitfalls/pitfalls.htm
    • 125. Statistics for the Life Sciences – http://www.math.sfu.ca/~cschwarz/Stat-301/Handouts/Handouts.html
  • 126. References
    • Cleveland, W. S. (1985). The elements of graphing data . Monterey, CA: Wadsworth.
    • 127. Jones, G. E. (2006). How to lie with charts . Santa Monica, CA: LaPuerta.
    • 128. Tufte, E. (1983). The visual display of quantitative information . Cheshire, CT: Graphics Press.
  • 129. Open Office Impress
    • This presentation was made using Open Office Impress.
    • 130. Free and open source software.
    • 131. http://www.openoffice.org/product/impress.html