How Humans See Data - Amazon Cut

1. How Humans See Data John Rauser @jrauser January 2017

3. visualization

4. visualization is communication

5. how to make better visualizations

6. help humans solve analytical problems quickly and accurately with visualization

7. Part I: Why visualize data at all?

9. Table (x, y):
x      y        x      y
1.972  1.236    0.111  0.542
1.112  1.994    0.902  0.005
0.000  1.009    0.598  0.085
0.665  1.942    1.613  1.790
0.235  0.356    1.298  1.955
0.247  1.658    0.651  1.937
1.275  1.961    1.949  1.316
0.702  0.045    0.099  0.567
1.760  0.350    0.862  0.010
1.691  0.277    0.027  0.768
1.628  1.778    0.706  1.956
1.957  1.290    1.042  1.999
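A table like the one on slide 9 is hard to read as a shape. The sketch below re-encodes the same 24 pairs as positions on a crude text grid and also measures each point's distance from (1, 1). The centre (1, 1) is an assumption made here after looking at the numbers; the slides do not state it. The helper name `ascii_scatter` is made up for this illustration.

```python
import math

# The 24 (x, y) pairs from slide 9, left column block first, then the right.
points = [
    (1.972, 1.236), (1.112, 1.994), (0.000, 1.009), (0.665, 1.942),
    (0.235, 0.356), (0.247, 1.658), (1.275, 1.961), (0.702, 0.045),
    (1.760, 0.350), (1.691, 0.277), (1.628, 1.778), (1.957, 1.290),
    (0.111, 0.542), (0.902, 0.005), (0.598, 0.085), (1.613, 1.790),
    (1.298, 1.955), (0.651, 1.937), (1.949, 1.316), (0.099, 0.567),
    (0.862, 0.010), (0.027, 0.768), (0.706, 1.956), (1.042, 1.999),
]

# Assumption: the centre is a guess for this sketch, not given in the talk.
CENTRE = (1.0, 1.0)

# Distance of every point from the assumed centre.
radii = [math.hypot(x - CENTRE[0], y - CENTRE[1]) for x, y in points]


def ascii_scatter(pts, size=21, lo=0.0, hi=2.0):
    """Draw points on a size x size character grid covering [lo, hi] on both axes."""
    grid = [[" "] * size for _ in range(size)]
    for x, y in pts:
        col = round((x - lo) / (hi - lo) * (size - 1))
        row = round((hi - y) / (hi - lo) * (size - 1))  # row 0 is the top
        grid[row][col] = "o"
    return "\n".join("".join(r) for r in grid)


if __name__ == "__main__":
    print(ascii_scatter(points))
    print("radius range:", min(radii), max(radii))
```

Every radius comes out very close to 1, which is tedious to discover from the digits and obvious once the points are drawn.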

11. pre-attentive processing

12. A graph is an encoding of the data.

15. Table (n, x, y):
n   x      y        n   x      y
1   1.972  1.236    13  0.111  0.542
2   1.112  1.994    14  0.902  0.005
3   0.000  1.009    15  0.598  0.085
4   0.665  1.942    16  1.613  1.790
5   0.235  0.356    17  1.298  1.955
6   0.247  1.658    18  0.651  1.937
7   1.275  1.961    19  1.949  1.316
8   0.702  0.045    20  0.099  0.567
9   1.760  0.350    21  0.862  0.010
10  1.691  0.277    22  0.027  0.768
11  1.628  1.778    23  0.706  1.956
12  1.957  1.290    24  1.042  1.999

18. Good visualizations optimize for the human visual system.

19. How does the human visual system work?

20. How does the human visual system decode a graph?

22. Cleveland’s three visual operations of pattern perception:
    1. Detection
    2. Assembly
    3. Estimation

23. Part II: estimation

24. Three levels of estimation
    a. discrimination: X = Y, X != Y
    b. ranking: X > Y, X < Y
    c. ratioing: X / Y = ?

25. At the heart of quantitative reasoning is a single question: Compared to what? - Tufte, Envisioning Information

29. the most important thing

31. The most important measurement should exploit the highest ranked encoding possible.
    • Position along a common scale
    • Position on identical but nonaligned scales
    • Length
    • Angle or Slope
    • Area
    • Volume or Density or Color saturation
    • Color hue
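One way to read slide 31 is as a lookup rule: list the measurements by importance, list the encodings by rank, and pair them off in order. A minimal sketch under that reading follows. The short key names and both function names are invented for this illustration, and the example measurement names are hypothetical.

```python
# Encodings from slide 31, best first. The keys are labels made up for this sketch.
ENCODING_RANK = [
    "position_common_scale",
    "position_nonaligned_scales",
    "length",
    "angle_or_slope",
    "area",
    "volume_density_saturation",
    "color_hue",
]


def best_encoding(available):
    """Return the highest-ranked encoding among those still available."""
    for enc in ENCODING_RANK:
        if enc in available:
            return enc
    raise ValueError("no known encoding available")


def assign_encodings(measures_by_importance, available):
    """Pair measures (most important first) with available encodings (best first)."""
    ranked = [e for e in ENCODING_RANK if e in available]
    return dict(zip(measures_by_importance, ranked))


if __name__ == "__main__":
    # Hypothetical measures: the most important one gets position, the other gets hue.
    print(assign_encodings(["mpg", "cylinders"],
                           {"color_hue", "position_common_scale"}))
```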

34. “The first rule of color: do not talk about color!” - Tamara Munzner

35. luminance saturation hue

42. Observation: Alphabetical is almost never the correct ordering of a categorical variable.
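A small sketch of the alternative to alphabetical order: sort the categories by the value being plotted, so that rank can be read straight off position. The model names and numbers are hypothetical, chosen only for illustration, and `dot_plot` is a made-up text renderer rather than any library call.

```python
# Hypothetical data for illustration only; not taken from the talk.
mpg_by_model = {"Alpha": 21.0, "Bravo": 33.9, "Charlie": 10.4, "Delta": 15.2}

# Alphabetical order says nothing about the data.
alphabetical = sorted(mpg_by_model)

# Ordering by value turns row position into rank.
by_value = sorted(mpg_by_model, key=mpg_by_model.get, reverse=True)


def dot_plot(data, order, width=40):
    """Render one row per category with a marker placed along a common scale."""
    top = max(data.values())
    lines = []
    for name in order:
        pos = round(data[name] / top * width)
        lines.append(f"{name:<8}|" + "." * pos + "o " + f"{data[name]:g}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(dot_plot(mpg_by_model, alphabetical))
    print()
    print(dot_plot(mpg_by_model, by_value))
```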

61. 11 mpg

73. Observation: Stacked anything is nearly always a mistake.

79. Stacking makes the reader decode lengths, not position on a common scale.
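The point of slide 79 can be checked with a little arithmetic: in a stacked bar, each segment starts where the previous one ended, so only the bottom series begins at the same place in every bar. The counts below are hypothetical and the function names are made up for this sketch.

```python
from itertools import accumulate

# Hypothetical counts: each key is a bar, each list holds its stacked segments.
bars = {"Q1": [30, 20, 10], "Q2": [25, 30, 15], "Q3": [40, 10, 20]}


def segment_baselines(segments):
    """Where each segment starts once the segments are stacked on one another."""
    return [0] + list(accumulate(segments))[:-1]


stacked = {name: segment_baselines(seg) for name, seg in bars.items()}


def shares_common_baseline(series_index):
    """True if this series starts at the same height in every bar."""
    starts = {baselines[series_index] for baselines in stacked.values()}
    return len(starts) == 1


if __name__ == "__main__":
    for i in range(3):
        print("series", i, "common baseline:", shares_common_baseline(i))
```

Only the first series can be compared by position; the others have floating baselines, so the reader falls back on comparing lengths.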

84. Observation: Pie charts are ALWAYS a mistake.

85. Piecharts are the information visualization equivalent of a roofing hammer to the frontal lobe. They have no place in the world of grownups, and occupy the same semiotic space as short pants, a runny nose, and chocolate smeared on one’s face. They are as professional as a pair of assless chaps. http://blog.codahale.com/2006/04/29/google-analytics-the-goggles-they-do-nothing/

90. Tables are preferable to graphics for many small data sets. A table is nearly always better than a dumb pie chart; the only thing worse than a pie chart is several of them, for then the viewer is asked to compare quantities located in spatial disarray both within and between pies… Given their low data-density and failure to order numbers along a visual dimension, pie charts should never be used. -Edward Tufte, The Visual Display of Quantitative Information

92. Who do you think did a better job in tonight’s debate?
                     Clinton   Trump
Among Democrats      99%       1%
Among Republicans    53%       47%

94. Afghanistan Albania Algeria Angola Argentina Australia Austria Bahrain Bangladesh Belgium Benin Bolivia Bosnia and Herzegovina Botswana Brazil Bulgaria Burkina Faso Burundi Cambodia Cameroon

95. All good pie charts are jokes.

98. Observation: Comparison is trivial on a common scale.

104. the dashboard metaphor is fundamentally flawed

108. Observation: Scatterplots show relationships directly.

111. Observation: Growth charts usually aren’t.

113. If growth (slope) is important, plot it directly.

115. Observation: Growth charts usually aren’t. If growth (slope) is important, plot it directly.
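A sketch of what "plot it directly" means in practice: derive the growth series from the totals and chart that series, instead of asking the reader to judge slopes on a chart of totals. The totals are hypothetical and the function names are made up for this illustration.

```python
# Hypothetical cumulative totals per period; not data from the talk.
totals = [100, 150, 210, 260, 300, 330]


def absolute_change(values):
    """Period-over-period change in the same units as the input."""
    return [b - a for a, b in zip(values, values[1:])]


def period_growth(values):
    """Period-over-period growth as a fraction of the previous value."""
    return [(b - a) / a for a, b in zip(values, values[1:])]


if __name__ == "__main__":
    print("totals:", totals)
    print("change:", absolute_change(totals))
    print("growth:", [round(g, 3) for g in period_growth(totals)])
```

In this made-up series the totals rise every period while the growth rate falls every period, which is easy to miss on a chart of totals.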

118. Part III: assembly

119. Gestalt Psychology

121. reification

122. emergence

126. Prägnanz

127. Law of Closure

129. Law of Continuity

132. Observation: Good plots leverage the law of continuity to assist with assembly.

135. Law of Similarity

142. Law of Proximity

145. Observation: Dodged bar charts are a bad idea.

147. Part IV: detection

155. Excel’s defaults are pretty bad

156. [Figure: chart with y-axis ticks up to 200,000 in steps of 20,000 and x-axis categories 1 through 6]

157. Observation: Detection isn’t as trivial as it seems.

159. “Above all else, show the data.” -Tufte

160. Part V: other useful results

161. Weber’s law: The “Just Noticeable Difference” between two stimuli is proportional to the magnitude of the initial stimulus.

162. 10 20

163. 10 20 100 110
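The two pairs on slide 163 differ by the same absolute amount, 10, but not by the same relative amount, and Weber's law says the relative amount is what perception tracks. A minimal sketch of that arithmetic follows; the function name is made up for this illustration.

```python
def weber_fraction(reference, comparison):
    """Difference between two stimuli relative to the reference stimulus."""
    if reference == 0:
        raise ValueError("reference stimulus must be non-zero")
    return abs(comparison - reference) / reference


# The two pairs shown on the slide: same absolute gap, different relative gap.
pairs = [(10, 20), (100, 110)]

if __name__ == "__main__":
    for ref, cmp_ in pairs:
        print(ref, cmp_, "absolute:", cmp_ - ref,
              "relative:", weber_fraction(ref, cmp_))
```

The first pair differs by 100% of the reference and the second by 10%, so the second difference is much harder to see.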

166. 12 units 12 units

167. Observation: Weber’s law is why gridlines are useful.

171. “Erase non-data ink.” -Tufte

172. “Erase non-data ink, within reason.” -Tufte

173. “Erase non-data ink that interferes with detection or doesn’t assist assembly and estimation.” -Rauser

175. You are bad at estimating the difference between lines.

180. Observation: If a difference is important, plot it directly.
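A sketch of the same advice in code: compute the gap between the two series and chart the gap as its own series, so it is read as a position instead of being estimated between two curves. The numbers are hypothetical and `text_bars` is a made-up helper.

```python
# Hypothetical series measured at the same five time points.
series_a = [10, 14, 19, 25, 32]
series_b = [8, 11, 15, 22, 31]

# The quantity of interest, computed once instead of estimated by eye.
difference = [a - b for a, b in zip(series_a, series_b)]


def text_bars(values, scale=4):
    """Render each value as a row of '#' characters on a shared zero baseline."""
    return "\n".join("#" * (v * scale) + f" {v}" for v in values)


if __name__ == "__main__":
    print(text_bars(difference))
```

Here both series rise throughout, yet the gap between them peaks in the middle and shrinks at the end.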

181. You are best at detecting variation in slope near 45 degrees.

183. banking to 45

186. Observation: Banking to 45 degrees best shows variation in slope.
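Banking can be turned into a number: choose the panel's height-to-width ratio so that the typical line segment is drawn at 45 degrees. The sketch below uses the median absolute slope as "typical", which is one common heuristic and not necessarily the method used for the plots in this talk. The function name is made up, and skipping flat segments is a choice made for this sketch.

```python
from statistics import median


def bank_to_45(xs, ys):
    """Height/width ratio that draws the median absolute segment slope at 45 degrees."""
    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)
    # Slope of each consecutive segment in data units.
    slopes = [abs((y1 - y0) / (x1 - x0))
              for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
              if x1 != x0]
    # Assumption of this sketch: flat segments carry no slope information.
    nonzero = [s for s in slopes if s > 0]
    if not nonzero or x_range == 0:
        raise ValueError("need at least one sloped segment")
    # On screen, slope = data slope * (height / y_range) / (width / x_range).
    # Setting the median on-screen slope to 1 and solving for height / width:
    return y_range / (median(nonzero) * x_range)


if __name__ == "__main__":
    print(bank_to_45([0, 1, 2, 3, 4], [0, 10, 20, 30, 40]))  # straight line
    print(bank_to_45([0, 1, 2, 3, 4], [0, 10, 0, 10, 0]))    # zigzag
```

A straight line across the whole panel wants a square panel, while a zigzag with the same step size wants a panel four times wider than tall.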

188. Q: Should I include 0 on my scale?

191. Q: Should I include 0 on my scale? A: It depends.

192. Q: Should I include 0 on my scale? A: Relying on the pre-attentive perception of size or intensity? Yes, otherwise you will mislead. Using position? It’s up to you.
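The "otherwise you will mislead" case can be made concrete. When a reader compares bar lengths, what they see is the ratio of the drawn lengths, and that ratio changes with the baseline. A minimal sketch with hypothetical values follows; the function name is made up.

```python
def drawn_ratio(a, b, baseline=0.0):
    """Ratio of bar lengths a reader sees when both bars start at `baseline`."""
    if a == baseline:
        raise ValueError("first bar would have zero length")
    return (b - baseline) / (a - baseline)


if __name__ == "__main__":
    # Hypothetical values 100 and 110.
    print("baseline 0: ", drawn_ratio(100, 110))        # matches the true ratio
    print("baseline 90:", drawn_ratio(100, 110, 90.0))  # second bar looks twice as long
```

With position encodings such as dots or lines the reader judges location against the axis, not length from a baseline, which is why the slide leaves that case up to you.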

199. “Above all else, show the data.” -Tufte

200. “Above all else, show the variation in the data.” -Rauser (via Tufte)

201. R/ggplot2 code for every plot in this presentation: http://goo.gl/xH5PLV
Rendered document: http://rpubs.com/jrauser/hhsd_notes
This presentation: https://goo.gl/LuDNje
I will tweet these links as @jrauser.

202. coda

203. visualization is communication

204. art is communication

205. visualization is art

211. why does it make you feel that way?

212. visualization has as much to learn from art as from science

214. end
