Intro to data visualization


Slides used in capita selecta HCI course H05N2A



  1. 1. Data Visualization - An introduction. Prof Jan Aerts, Biodata Visualization and Analysis, ESAT/SCD, University of Leuven, Belgium. twitter: @jandot; Google+: +Jan Aerts; jan.aerts@esat.kuleuven.be; http://biovizanlab.wordpress.com; http://saaientist.blogspot.com
  2. 2. 1. What is data visualization?
  3. 3. “A good sketch is better than a long speech” (Napoleon)
  4. 4. “A good sketch is better than a long speech” (Napoleon). Shows: size of the army, geographical coordinates, direction that the army was traveling, location of the army with respect to certain dates, temperature along the path of the retreat
  5. 5. John Snow - cholera map
  6. 6. Shape of Song: “Like a Prayer” (Madonna), Martin Wattenberg
  7. 7. http://multimedia.mcb.harvard.edu/anim_innerlife.html
  8. 8. What I use as a definition: “computer-based visualization systems providing visual representations of datasets intended to help people carry out some task more effectively.” (T Munzner)
  9. 9. cognition <=> perception; cognitive task => perceptual task; “eyes beat memory”
  10. 10. Why do we visualize data?
      • record information
          • blueprints, photographs, seismographs, ...
      • analyze data to support reasoning
          • develop & assess hypotheses
          • discover errors in data
          • expand memory
          • find patterns (see Snow’s cholera map)
      • communicate information
          • share & persuade
          • collaborate & revise
  11. 11. exploration explanation; pictorial superiority effect [figure labels: “information”, 72hr, “informa”, “i”, 65%, 1%]
  12. 12. 2. Exploration <-> explanation
  13. 13. exploration explanation
  14. 14. exploration: visual analytics; explanation: infographics
  15. 15. exploration: visual analytics; explanation: infographics
  16. 16. exploration: visual analytics; explanation: infographics; hypothesis generation
  17. 17. exploration explanation; “visual analytics” => identify unexpected patterns
  18. 18. exploration explanation J van Wijk
  19. 19. Anscombe’s quartet
      • μX = 9.0 (mean of X)
      • μY = 7.5 (mean of Y)
      • sigma X = 3.317
      • sigma Y = 2.03
      • Y = 3 + 0.5X
      • R² = 0.67
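
The summary statistics on this slide can be checked with a minimal sketch like the one below. It assumes the standard published values of Anscombe's four datasets; the names `QUARTET`, `summarize` and `SUMMARIES` are made up for illustration and are not from the slides.

```python
from math import sqrt

# Anscombe's quartet: datasets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Means, sample standard deviations, least-squares line and R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "sd_x": sqrt(sxx / (n - 1)),
        "sd_y": sqrt(syy / (n - 1)),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
        "r2": sxy ** 2 / (sxx * syy),
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

if __name__ == "__main__":
    for name, stats in SUMMARIES.items():
        print(name, {key: round(value, 2) for key, value in stats.items()})
```

All four datasets give nearly the same numbers, which is why only plotting them reveals how different they are.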
  20. 20. A concrete example: hive plots
  21. 21. same network (Martin Krzywinski)
  22. 22. different networks! (Martin Krzywinski)
  23. 23. 3D, anyone?
  24. 24. 3D, anyone? occlusion; interaction complexity; perspective distortion; text legibility
  25. 25. Functions in the Linux operating system: “function A calls function B”. Gene interaction data: “gene A regulates gene B”
  26. 26. regulator, workhorse, manager
  27. 27. 3. Why specifically learn about dataviz?
  28. 28. Isn’t it all just about using common sense?
  29. 29.
      • huge space of design alternatives => many tradeoffs
      • many possibilities known to be ineffective
          • avoid random walk through parameter space
          • avoid some of our past mistakes
          • extensive experimentation has already been done
      • guidelines continue to evolve
          • we reflect on lessons learned in design studies
          • iterative refinement usually wise
  30. 30. 4. Stages of data visualization
  31. 31. How do we get from data to visualization? We need to understand:
      • properties of the data
      • properties of the image
      • the rules mapping data to image
  32. 32. 4.1. Properties of the data
  33. 33. S. S. Stevens, “On the Theory of Scales of Measurement” (1946)
  34. 34. 4.2. Properties of the image - perception
  35. 35. Semiology of graphics
      • Jacques Bertin, Gauthier-Villars 1967, EHESS 1998
      • semiology = study of signs and sign processes, likeness, analogy, metaphor, symbolism, signification, and communication (Wikipedia)
      • visual encoding:
          • what - points, lines, areas (patterns, trees/networks, grids)
          • where - positional: XY (1D, 2D, 3D)
          • how - retinal: Z (size, lightness, texture, colour, orientation, shape)
          • when - temporal: animation
  36. 36. “marks” - geometric primitives; “channels” - control appearance of marks [figure labels: H, V, S]
  37. 37. Gestalt laws - interplay between parts and the whole (Kurt Koffka); series of principles
      Election results Florida:
      • black = Bush
      • white = Gore
  38. 38. Gestalt - Principle of Simplicity: Every pattern is perceived as the simplest possible structure.
  39. 39. Gestalt - Principle of Proximity: Things that are close to each other are seen as belonging together (=> clusters).
  40. 40. Gestalt - Principle of Similarity: Things that are similar in some way are perceived as belonging together.
  41. 41. Gestalt - Principle of Closure: You will try to complete a pattern.
  42. 42. Gestalt - Principle of Connectedness: Things that are connected are perceived as belonging together. This encoding is stronger than similarity, shape, colour, and size.
  43. 43. Gestalt - Principle of Good Continuation: Objects that are arranged in a straight or smooth line tend to be seen as a unit.
  44. 44. Gestalt - Principle of Common Fate: Objects that move in the same direction tend to be seen as a unit.
  45. 45. Gestalt - Principle of Familiarity
  46. 46. Gestalt - Principle of Symmetry: Symmetrical areas tend to be seen as figures against asymmetrical backgrounds.
  47. 47. Context affects perceptual tasks
  48. 48. Pre-attentive vision = ability of the low-level human visual system to rapidly identify certain basic visual properties
      • some features “pop out”
      • used for:
          • target detection
          • boundary detection
          • counting/estimation
          • ...
      • visual system takes over => all cognitive power available for interpreting the figure, rather than needing part of it for processing the figure
  49. 49. Really fast; see http://www.csc.ncsu.edu/faculty/healey/PP/
  50. 50. Limitations of pre-attentive vision
      1. Combining pre-attentive features does not always work => would need to resort to “serial search” (most channel pairs; all channel triplets), e.g. is there a red square in this picture?
      2. Speed depends on which channel (use one that is good for categorical data; see further (“accuracy”))
  51. 51. 4.3. Mapping data to image: visual encoding
  52. 52. Language of graphics
      • graphics = sign system:
          • each mark (point, line, area) represents a data element
          • choose visual variables to encode relationships between data elements
          • difference, similarity, order, proportion
          • only position supports all relationships (see later)
          • huge range of alternatives for data with many attributes
          • find images that express & effectively convey the information
  53. 53. Which encoding should I use?
      • From a huge list of possibilities, you have to choose the best one.
      • Principle of Consistency
          • properties of the representation should match properties of the data (e.g. pie chart: area vs radius)
      • Principle of Importance Ordering
          • encode the most important piece of information in the most “effective” way (i.e. spatial position)
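
One reading of the “area vs radius” remark is that when a value is drawn as a circle, the area should track the value, so the radius has to scale with the square root. The sketch below illustrates that reading only; the function names and the numbers are hypothetical and not from the slides.

```python
from math import pi, sqrt


def radius_for_value(value, max_value, max_radius):
    """Radius that makes the circle's AREA proportional to value."""
    return max_radius * sqrt(value / max_value)


def area(radius):
    """Area of a circle with the given radius."""
    return pi * radius ** 2


# Hypothetical data: B (100) is four times A (25).
r_a = radius_for_value(25, 100, max_radius=10.0)   # 5.0
r_b = radius_for_value(100, 100, max_radius=10.0)  # 10.0

# Square-root scaling: the drawn areas differ by the same factor as the data.
AREA_RATIO = area(r_b) / area(r_a)

# Scaling the radius linearly (2.5 vs 10.0) exaggerates the difference in area.
NAIVE_AREA_RATIO = area(10.0) / area(2.5)
```

With square-root scaling the area ratio is 4, matching the data; with linear radius scaling the area ratio becomes 16.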
  54. 54. Stevens’ psychophysical law = proposed relationship between the magnitude of a physical stimulus and its perceived intensity or strength
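
The law is usually written as a power function, perceived magnitude = k × intensity^a. The sketch below shows that form only. The exponent is a parameter, and the 0.7 in the example is an illustrative assumption rather than a measured value from the slides.

```python
def perceived_magnitude(intensity, exponent, k=1.0):
    """Power-law form: perceived = k * intensity ** exponent."""
    if intensity < 0:
        raise ValueError("intensity must be non-negative")
    return k * intensity ** exponent


def perceived_ratio(stimulus_ratio, exponent):
    """How much larger a stimulus looks when it is stimulus_ratio times larger."""
    return stimulus_ratio ** exponent


# Exponent 1.0: doubling the stimulus doubles the perceived magnitude.
LINEAR = perceived_ratio(2.0, 1.0)

# Assumed exponent below 1 (illustrative): doubling is perceived as less than double.
COMPRESSED = perceived_ratio(2.0, 0.7)
```

An exponent below 1 means differences are underestimated, and an exponent above 1 means they are overestimated.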
  55. 55. Accuracy of quantitative perceptual tasks: how much (quantitative), what/where (qualitative). Mackinlay
  56. 56. Accuracy of quantitative perceptual tasks: how much (quantitative), what/where (qualitative). Mackinlay
  57. 57. Accuracy of quantitative perceptual tasks: how much (quantitative), what/where (qualitative). Mackinlay. “power of the plane”
  58. 58. Accuracy of quantitative perceptual tasks: how much (quantitative), what/where (qualitative). Grouping: see Gestalt laws. Mackinlay
  59. 59. COLOUR
  60. 60. COLOUR ... is tricky, and often used incorrectly
  61. 61. Colour space
      • = mathematical model to talk about colour
      • RGB (red-green-blue)
          • most common, but less useful
      • HSV (hue-saturation-value)
          • more useful
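
As a small illustration of why HSV can be easier to work with than RGB, the sketch below uses Python's standard `colorsys` module to build a ramp that changes only the value channel while hue and saturation stay fixed. The helper name `value_ramp` and the sample colour are assumptions made for the example.

```python
import colorsys


def value_ramp(rgb, steps):
    """Colours sharing the hue and saturation of rgb, with value rising to 1.0."""
    hue, saturation, _ = colorsys.rgb_to_hsv(*rgb)
    return [
        colorsys.hsv_to_rgb(hue, saturation, (i + 1) / steps)
        for i in range(steps)
    ]


# Pure red in RGB is hue 0, full saturation, full value in HSV.
RED_HSV = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)

# Four shades of a hypothetical blue, from dark to bright.
BASE = (0.2, 0.4, 0.8)
RAMP = value_ramp(BASE, steps=4)
```

Doing the same directly in RGB would mean changing all three components in step, which is the kind of bookkeeping HSV removes.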
  62. 62. colorbrewer2.org; in R: please use RColorBrewer!
  63. 63. Context affects colour perception
  64. 64. Context affects colour perception
  65. 65. Dangers of Depth (3D)
      • We do NOT see in 3D; we see in 2.05D.
      • occlusion
      • interaction complexity
      • perspective distortion
  66. 66. 3D example
  67. 67. Lie factor: “lie factor” = (size of effect shown in graphic) / (size of effect in data)
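
A minimal sketch of the ratio, assuming “size of effect” means the relative change between two values (a common reading, not stated on the slide). The function names and the numbers are hypothetical.

```python
def effect_size(first, second):
    """Relative change from first to second (assumed meaning of 'size of effect')."""
    if first == 0:
        raise ValueError("first value must be non-zero")
    return abs(second - first) / abs(first)


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Effect shown in the graphic divided by the effect present in the data."""
    return effect_size(graphic_first, graphic_second) / effect_size(data_first, data_second)


# Hypothetical chart: the data go from 10 to 15 (+50%),
# but the bars are drawn 1.0 cm and 4.0 cm tall (+300%).
EXAGGERATED = lie_factor(1.0, 4.0, 10.0, 15.0)

# A faithful chart: bars of 2.0 cm and 3.0 cm show the same +50%.
FAITHFUL = lie_factor(2.0, 3.0, 10.0, 15.0)
```

A lie factor of 1 means the graphic shows the effect at its true size; here the exaggerated chart scores 6.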
  68. 68. 3D scatter plots are better as series of 2D projections
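
One way to turn a 3D scatter plot into a series of 2D projections is to take every pair of axes in turn. The sketch below only prepares the point lists for each pair and leaves the plotting to whatever library is used. `pairwise_projections` and the sample points are made up for illustration.

```python
from itertools import combinations


def pairwise_projections(points, axis_names):
    """Map each pair of axis names to the 2D points for that pair of axes."""
    projections = {}
    for (i, name_i), (j, name_j) in combinations(enumerate(axis_names), 2):
        projections[(name_i, name_j)] = [(p[i], p[j]) for p in points]
    return projections


# Hypothetical 3D points.
POINTS = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]
PROJECTIONS = pairwise_projections(POINTS, ("x", "y", "z"))
```

Three axes give three panels (x-y, x-z, y-z), each of which can be read without occlusion or perspective distortion.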
  69. 69. Dynamic data
      • animation is good sometimes, but often not:
          • we can only follow 3-4 visual cues simultaneously
          • change in “mental map”
      • change blindness (e.g. http://nivea.psycho.univ-paris5.fr/CBMovies/BarnTrackFlickerMovie.gif)
  70. 70. http://vimeo.com/2035117
  71. 71. 5. Interaction
  72. 72. Overview, zoom and filter, details on demand (Shneiderman’s Information Seeking Mantra)
  73. 73. Operations on the data
      • sorting
      • filtering
      • browsing/exploring
      • comparison
      • characterizing trends & distributions
      • finding anomalies & outliers
      • ...
  74. 74. Techniques to support these operations
      • re-orderable matrices
      • brushing
      • linked views
      • overview & detail
      • focus & context
      • ...
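
Brushing and linked views can be sketched without any graphics: a rectangular brush in one view selects record ids, and a shared selection object tells every other view which ids to highlight. This is a minimal sketch of that idea only. The class, function and field names, as well as the records, are hypothetical.

```python
class LinkedSelection:
    """Selection shared by several views; each view subscribes a callback."""

    def __init__(self):
        self.selected = set()
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)

    def set(self, ids):
        """Replace the selection and notify every subscribed view."""
        self.selected = set(ids)
        for callback in self._listeners:
            callback(self.selected)


def brush(records, x_key, x_range, y_key, y_range):
    """Ids of the records that fall inside a rectangular brush."""
    (x0, x1), (y0, y1) = x_range, y_range
    return {
        r["id"]
        for r in records
        if x0 <= r[x_key] <= x1 and y0 <= r[y_key] <= y1
    }


# Hypothetical records shown in two views: height/weight and age.
RECORDS = [
    {"id": 1, "height": 150, "weight": 50, "age": 20},
    {"id": 2, "height": 180, "weight": 80, "age": 35},
    {"id": 3, "height": 175, "weight": 75, "age": 50},
]

selection = LinkedSelection()
HIGHLIGHTED_IN_AGE_VIEW = []  # stands in for the second view's redraw
selection.subscribe(lambda ids: HIGHLIGHTED_IN_AGE_VIEW.append(sorted(ids)))

# Brushing the tall/heavy corner of the height/weight view.
selection.set(brush(RECORDS, "height", (170, 185), "weight", (70, 85)))
```

Keeping the selection as a set of ids, rather than as screen coordinates, is what lets views with different axes stay in sync.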
  75. 75. 6. Validation
  76. 76. Evaluate the right thing (Munzner, 2009)
  77. 77. Slide/picture acknowledgments
      • Jeffrey Heer
      • Tamara Munzner
      • Jessie Kennedy
      • Nils Gehlenborg
      • Miriah Meyer
  78. 78. “I think this presentation went quite well...”
