Intro to data visualization

Slides used in capita selecta HCI course H05N2A

  • 1. Data Visualization - An introduction. Prof Jan Aerts, Biodata Visualization and Analysis, ESAT/SCD, University of Leuven, Belgium. twitter: @jandot; Google+: +Jan Aerts; jan.aerts@esat.kuleuven.be; http://biovizanlab.wordpress.com; http://saaientist.blogspot.com
  • 2. 1. What is data visualization?
  • 3. “A good sketch is better than a long speech” (Napoleon)
  • 4. “A good sketch is better than a long speech” (Napoleon) shows: size of the army, geographical coordinates, direction that the army was traveling, location of the army with respect to certain dates, temperature along the path of the retreat
  • 5. John Snow - cholera map
  • 6. Shape of Songs: “Like a Prayer” (Madonna) Martin Wattenberg
  • 7. http://multimedia.mcb.harvard.edu/anim_innerlife.html
  • 8. What I use as a definition: “computer-based visualization systems providing visual representations of datasets intended to help people carry out some task more effectively.” (T Munzner)
  • 9. cognition <=> perception; cognitive task => perceptive task; “eyes beat memory”
  • 10. Why do we visualize data? • record information • blueprints, photographs, seismographs, ... • analyze data to support reasoning • develop & assess hypotheses • discover errors in data • expand memory • find patterns (see Snow’s cholera map) • communicate information • share & persuade • collaborate & revise
  • 11. exploration explanation; pictorial superiority effect: “information” 72hr “informa” “i” 65% 1%
  • 12. 2. Exploration <-> explanation
  • 13. exploration explanation
  • 14. exploration explanation; visual analytics, infographics
  • 15. exploration explanation; visual analytics, infographics
  • 16. exploration explanation; visual analytics, infographics; hypothesis generation
  • 17. exploration explanation; “visual analytics” => identify unexpected patterns
  • 18. exploration explanation J van Wijk
  • 19. Anscombe’s quartet • μX = 9.0 • μY = 7.5 • σX = 3.317 • σY = 2.03 • Y = 3 + 0.5X • R² = 0.67
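
The summary statistics on this slide can be checked with a short sketch. The data values below are the published Anscombe (1973) quartet; the function and variable names (`summarize`, `ANSCOMBE`, `SUMMARIES`) are illustrative choices and are not taken from the slides. It uses only the Python standard library and fits the line by ordinary least squares.

```python
from statistics import mean, stdev

# Anscombe's quartet: four (x, y) datasets with near-identical summary statistics.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ANSCOMBE = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return means, sample standard deviations, least-squares line and R^2."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return {
        "mean_x": mx,
        "mean_y": my,
        "sd_x": stdev(x),
        "sd_y": stdev(y),
        "slope": slope,
        "intercept": my - slope * mx,
        "r2": sxy ** 2 / (sxx * syy),
    }


SUMMARIES = {name: summarize(x, y) for name, (x, y) in ANSCOMBE.items()}
for name, stats in SUMMARIES.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

All four datasets print almost the same numbers, yet they look completely different when plotted, which is the point of the slide: the statistics alone do not reveal the structure.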
  • 20. A concrete example: hive plots
  • 21. same network Martin Krzywinski
  • 22. different networks! Martin Krzywinski
  • 23. 3D, anyone?
  • 24. 3D, anyone? • occlusion • interaction complexity • perspective distortion • text legibility
  • 25. Functions in the Linux operating system: “function A calls function B”. Gene interaction data: “gene A regulates gene B”
  • 26. regulator, workhorse, manager
  • 27. 3. Why specifically learn about dataviz?
  • 28. Isn’t it all just about using common sense?
  • 29. • huge space of design alternatives => many tradeoffs • many possibilities known to be ineffective • avoid random walk through parameter space • avoid some of our past mistakes • extensive experimentation has already been done • guidelines continue to evolve • we reflect on lessons learned in design studies • iterative refinement usually wise
  • 30. 4. Stages of data visualization
  • 31. How do we get from data to visualization? We need to understand: • properties of the data • properties of the image • the rules mapping data to image
  • 32. 4.1. Properties of the data
  • 33. S Stevens, “On the Theory of Scales of Measurement” (1946)
  • 34. 4.2. Properties of the image - perception
  • 35. Semiology of graphics • Jacques Bertin, Gauthier-Villars 1967, EHESS 1998 • semiology = study of signs and sign processes, likeness, analogy, metaphor, symbolism, signification, and communication (Wikipedia) • visual encoding: • what - points, lines, areas (, patterns, trees/networks, grids) • where - positional: XY (1D, 2D, 3D) • how - retinal: Z (size, lightness, texture, colour, orientation, shape) • when - temporal: animation
  • 36. “marks” - geometric primitives H V S “channels” - control appearance of marks
  • 37. Gestalt laws - interplay between parts and the whole (Kurt Koffka); series of principles. Election results Florida: • black = Bush • white = Gore
  • 38. Gestalt - Principle of Simplicity Every pattern we see is seen such that we see a structure that is as simple as possible.
  • 39. Gestalt - Principle of Proximity Things that are close to each other are seen as belonging together (=> clusters)
  • 40. Gestalt - Principle of Similarity Things that are similar in some way are perceived as belonging together.
  • 41. Gestalt - Principle of Closure You will try to complete a pattern.
  • 42. Gestalt - Principle of Connectedness Things that are connected are perceived as belonging together. This encoding is stronger than similarity, shape, colour, and size.
  • 43. Gestalt - Principle of Good Continuation Objects that are arranged in a straight or smooth line tend to be seen as a unit.
  • 44. Gestalt - Principle of Common Fate Objects that move in the same direction tend to be seen as a unit.
  • 45. Gestalt - Principle of Familiarity
  • 46. Gestalt - Principle of Symmetry Symmetrical areas tend to be seen as figures against asymmetrical backgrounds.
  • 47. Context affects perceptual tasks
  • 48. Pre-attentive vision = ability of the low-level human visual system to rapidly identify certain basic visual properties • some features “pop out” • used for: • target detection • boundary detection • counting/estimation • ... • visual system takes over => all cognitive power available for interpreting the figure, rather than needing part of it for processing the figure
  • 49. Really fast; see http://www.csc.ncsu.edu/faculty/healey/PP/
  • 50. Limitations of preattentive vision 1. Combining pre-attentive features does not always work => would need to resort to “serial search” (most channel pairs; all channel triplets), e.g. is there a red square in this picture? 2. Speed depends on which channel (use one that is good for categorical; see further (“accuracy”))
  • 51. 4.3. Mapping data to image: visual encoding
  • 52. Language of graphics • graphics = sign system: • each mark (point, line, area) represents a data element • choose visual variables to encode relationships between data elements • difference, similarity, order, proportion • only position supports all relationships (see later) • huge range of alternatives for data with many attributes • find images that express & effectively convey the information
  • 53. Which encoding should I use? • From huge list of possibilities, you have to choose the best one. • Principle of Consistency • properties of the representation should match properties of the data (e.g. pie chart: area vs radius) • Principle of Importance Ordering • encode the most important piece of information in the most “effective” way (i.e. spatial position)
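
The “area vs radius” remark under the Principle of Consistency can be illustrated with a small sketch. It assumes the mark is a circle whose size encodes a value; the function names and the example numbers are hypothetical and not from the slides. If the radius is scaled linearly with the value, the drawn area grows with the square of the value and small values are visually understated; scaling the radius with the square root keeps the area proportional to the data.

```python
import math


def radius_linear(value, max_value, max_radius):
    """Naive mapping: radius proportional to value (area then scales with value**2)."""
    return max_radius * (value / max_value)


def radius_for_area(value, max_value, max_radius):
    """Consistent mapping: radius chosen so that circle AREA is proportional to value."""
    return max_radius * math.sqrt(value / max_value)


def area(radius):
    """Area of a circle with the given radius."""
    return math.pi * radius ** 2


# Hypothetical example: a value that is one quarter of the maximum.
value, max_value, max_radius = 25, 100, 10.0
for mapping in (radius_linear, radius_for_area):
    r = mapping(value, max_value, max_radius)
    share = area(r) / area(max_radius)
    print(f"{mapping.__name__}: radius={r:.2f}, area share={share:.4f}")
```

With the square-root mapping the quarter-sized value occupies a quarter of the largest circle's area; with the linear mapping it occupies only one sixteenth.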
  • 54. Stevens’ psychophysical law = proposed relationship between the magnitude of a physical stimulus and its perceived intensity or strength
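
Stevens' law is usually written as a power function, perceived = k × intensity^a. The sketch below shows what that implies for visual encoding. The exponents used (1.0 for length, 0.7 for area) are illustrative approximations of commonly quoted values and are an assumption here, not figures from the slides; the function names are hypothetical.

```python
def perceived_magnitude(intensity, exponent, k=1.0):
    """Stevens' power law: perceived magnitude = k * intensity ** exponent."""
    return k * intensity ** exponent


def perceived_ratio(physical_ratio, exponent):
    """How much larger a stimulus appears when it is physical_ratio times larger."""
    return physical_ratio ** exponent


# Illustrative exponents (assumed): length is judged roughly linearly,
# area is compressed, so differences in area are underestimated.
EXPONENTS = {"length": 1.0, "area": 0.7}

for channel, exponent in EXPONENTS.items():
    ratio = perceived_ratio(2.0, exponent)
    print(f"doubling {channel} is perceived as about {ratio:.2f}x")
```

Under these assumed exponents, a doubled length reads as twice as large, while a doubled area reads as clearly less than twice as large, which is one reason position and length are preferred for quantitative data.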
  • 55. Accuracy of quantitative perceptual tasks: how much (quantitative), what/where (qualitative). Mackinlay
  • 56. Accuracy of quantitative perceptual tasks: how much (quantitative), what/where (qualitative). Mackinlay
  • 57. Accuracy of quantitative perceptual tasks: how much (quantitative), what/where (qualitative). Mackinlay. “power of the plane”
  • 58. Accuracy of quantitative perceptual tasks: how much (quantitative), what/where (qualitative). grouping: see Gestalt laws. Mackinlay
  • 59. COLOUR
  • 60. COLOUR ... is tricky, and often used incorrectly
  • 61. Colour space • = mathematical model to talk about colour • RGB (red-green-blue) • most common, but less useful • HSV (hue-saturation-value) • more useful
  • 62. colorbrewer2.org • in R: please use RColorBrewer!
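
One way to see why HSV is often more convenient than RGB is to build a ramp that varies a single perceptual property. The sketch below uses Python's standard `colorsys` module, which works with floats in the range 0 to 1; the helper name `value_ramp` and the choice of pure red are illustrative assumptions.

```python
import colorsys


def value_ramp(hue, steps=5, saturation=1.0):
    """Return RGB triples that share hue and saturation and differ only in value."""
    return [
        colorsys.hsv_to_rgb(hue, saturation, (i + 1) / steps)
        for i in range(steps)
    ]


# Convert pure red from RGB to HSV, then vary only the "value" component.
hue, saturation, value = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)
RAMP = value_ramp(hue)

for r, g, b in RAMP:
    print(f"rgb=({r:.2f}, {g:.2f}, {b:.2f})")
```

In HSV the ramp is a change to one number; expressing “same colour, but darker” directly in RGB means adjusting all three channels together for most hues.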
  • 63. Context affects colour perception
  • 64. Context affects colour perception
  • 65. Dangers of Depth (3D) • We do NOT see in 3D; we see in 2.05D. • occlusion • interaction complexity • perspective distortion
  • 66. 3D example
  • 67. Lie factor: “lie factor” = (size of effect shown in graphic) / (size of effect in data)
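
The lie factor is the ratio of the effect shown in the graphic to the effect in the data. A minimal sketch follows, assuming “size of effect” is measured as relative change between two values; that measure, the function names, and the example numbers are assumptions for illustration, not taken from the slides.

```python
def effect_size(first, second):
    """Relative change from first to second (assumed measure of 'size of effect')."""
    return abs(second - first) / abs(first)


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Ratio of the effect shown in the graphic to the effect present in the data."""
    return effect_size(graphic_first, graphic_second) / effect_size(data_first, data_second)


# Hypothetical example: the data rise from 10 to 15 (a 50% increase),
# but the bars are drawn 2 cm and 8 cm tall (a 300% increase).
factor = lie_factor(2.0, 8.0, 10.0, 15.0)
print(f"lie factor = {factor:.1f}")
```

A lie factor of 1 means the graphic represents the change faithfully; values well above 1 exaggerate the effect and values well below 1 understate it.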
  • 68. 3D scatter plots are better as series of 2D projections
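
Replacing one 3D scatter plot by a series of 2D projections amounts to plotting every pair of axes separately. The sketch below only prepares the projected point lists and leaves the drawing to whatever plotting tool is used; the function name, axis names, and sample points are hypothetical.

```python
from itertools import combinations


def axis_pair_projections(points, axis_names=("x", "y", "z")):
    """Project n-dimensional points onto every pair of axes.

    Returns a dict mapping (axis_a, axis_b) to the list of 2D points
    that would be drawn in that panel of a scatter-plot series.
    """
    projections = {}
    for i, j in combinations(range(len(axis_names)), 2):
        projections[(axis_names[i], axis_names[j])] = [(p[i], p[j]) for p in points]
    return projections


# Hypothetical sample data: two points in 3D.
points = [(1, 2, 3), (4, 5, 6)]
for axes, projected in axis_pair_projections(points).items():
    print(axes, projected)
```

Three axes give three panels (x-y, x-z, y-z), none of which suffers from occlusion or perspective distortion.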
  • 69. Dynamic data • animation is good sometimes, but often not: • we can only follow 3-4 visual cues simultaneously • change in “mental map” • change blindness (e.g. http://nivea.psycho.univ-paris5.fr/CBMovies/BarnTrackFlickerMovie.gif)
  • 70. http://vimeo.com/2035117
  • 71. 5. Interaction
  • 72. Overview, zoom and filter, details on demand (Shneiderman’s Information Seeking Mantra)
  • 73. Operations on the data • sorting • filtering • browsing/exploring • comparison • characterizing trends & distributions • finding anomalies & outliers • ...
  • 74. Techniques to support these operations • re-orderable matrices • brushing • linked views • overview & detail • focus & context • ...
  • 75. 6. Validation
  • 76. Evaluate the right thing Munzner, 2009
  • 77. Slide/picture acknowledgments • Jeffrey Heer • Tamara Munzner • Jessie Kennedy • Nils Gehlenborg • Miriah Meyer
  • 78. “I think this presentation went quite well...”