Stat405
    Polishing your plots for presentation



                             Hadley Wickham
Wednesday, 4 November 2009

20 Polishing

  1. Stat405: Polishing your plots for presentation. Hadley Wickham. Wednesday, 4 November 2009
  2. Outline:
     1. Thanksgiving
     2. Finish off exploratory graphics
     3. Communication graphics
     4. Scales
     5. Themes
  3. Thanksgiving. No class Wed 25 November. No assignment due over Thanksgiving.
  4. [Figure: scatterplot with an x axis running from −120 to −70, a y axis running from 25 to 50, and a point-size legend titled "% cancelled" running from 0.0 to 1.0.]
  5. Your turn. Identify the data and layers in the flight delays plot, then write the ggplot2 code to create it.
     library(ggplot2)
     library(maps)
     usa <- map_data("state")
     feb13 <- read.csv("delays-feb-13-2007.csv")
  6. usa <- map_data("state")

     ggplot(feb13, aes(long, lat)) +
       geom_point(aes(size = 1), colour = "white") +
       geom_polygon(aes(group = group), data = usa,
         colour = "grey70", fill = NA) +
       geom_point(aes(size = ncancelw / ntot),
         colour = alpha("black", 1/2))

     # Polishing: up next
     last_plot() +
       scale_area("% cancelled", to = c(1, 8),
         breaks = seq(0, 1, by = 0.2), limits = c(0, 1)) +
       scale_x_continuous("", limits = c(-125, -67)) +
       scale_y_continuous("", limits = c(24, 50))
  7. troops <- read.csv("minard-troops.csv")
     cities <- read.csv("minard-cities.csv")

     ggplot(cities, aes(long, lat)) +
       geom_path(aes(size = survivors, colour = direction,
         group = interaction(group, direction)), data = troops) +
       geom_text(aes(label = city), hjust = 0, vjust = 1, size = 4)

     # Polish appearance
     last_plot() +
       scale_x_continuous("", limits = c(24, 39)) +
       scale_y_continuous("") +
       scale_colour_manual(values = c("grey50","red")) +
       scale_size(to = c(1, 10))
  8. Communication graphics. When you need to communicate your findings, you need to spend a lot of time polishing your graphics to eliminate distractions and focus on the story. Now it’s time to pay attention to the small stuff: labels, colour choices, tick marks...
  9. Context
  10. Consumption
  11. Tools. Scales: used to override default perceptual mappings, and tune parameters of axes and legends. Themes: control presentation of non-data elements.
  12. Scales. Control how data is mapped to perceptual properties, and produce guides (axes and legends) which allow us to read the plot. Important parameters: name, breaks & labels, limits. Naming scheme: scale_aesthetic_name. All default scales have name continuous or discrete.
  13. # Default scales
      scale_x_continuous()
      scale_y_discrete()
      scale_colour_discrete()

      # Custom scales
      scale_colour_hue()
      scale_x_log10()
      scale_fill_brewer()

      # Scales with parameters
      scale_x_continuous("X Label", limits = c(1, 10))
      scale_colour_gradient(low = "blue", high = "red")
  14. p <- qplot(cyl, displ, data = mpg)

      # First argument (name) controls axis label
      p + scale_y_continuous("Displacement (l)")
      p + scale_x_continuous("Cylinders")

      # Breaks and labels control tick marks
      p + scale_x_continuous(breaks = c(4, 6, 8))
      p + scale_x_continuous(breaks = c(4, 6, 8),
        labels = c("small", "medium", "big"))

      # Limits control range of data
      p + scale_y_continuous(limits = c(1, 8))
      # same as:
      p + ylim(1, 8)
  15. Your turn.
      qplot(carat, price, data = diamonds, geom = "hex")
      Manipulate the fill colour legend to:
      • Change the title to “Count”
      • Display breaks at 1000, 3500 & 7000
      • Add commas to the keys (e.g. 1,000)
      • Set the limits for the scale from 0 to 8000.
  16. p <- qplot(carat, price, data = diamonds, geom = "hex")

      # First argument (name) controls legend title
      p + scale_fill_continuous("Count")

      # Breaks and labels control legend keys
      p + scale_fill_continuous(breaks = c(1000, 3500, 7000))
      p + scale_fill_continuous(breaks = c(0, 4000, 8000))

      # Why don't 0 and 8000 have colours?
      p + scale_fill_continuous(breaks = c(0, 4000, 8000),
        limits = c(0, 8000))

      # Can use labels to make more human readable
      breaks <- c(0, 2000, 4000, 6000, 8000)
      labels <- format(breaks, big.mark = ",")
      p + scale_fill_continuous(breaks = breaks, labels = labels,
        limits = c(0, 8000))
  17. p <- qplot(color, carat, data = diamonds)

      # Basically the same for discrete variables
      p + scale_x_discrete("Color")

      # Except limits is now a character vector
      p + scale_x_discrete(limits = c("D", "E", "F"))

      # Should work for boxplots too
      qplot(color, carat, data = diamonds, geom = "boxplot") +
        scale_x_discrete(limits = c("D", "E", "F"))
      # But currently a bug :(
  18. Alternate scales. You can also override the default choice of scales. You are most likely to want to do this with colour, as it is the most important aesthetic after position. You need to know a little theory to use it effectively: colour spaces & colour blindness.
  19. Colour spaces. Most familiar is rgb: defines colour as a mixture of red, green and blue. Matches the physics of the eye, but the brain does a lot of post-processing, so it’s hard to directly perceive these components. A more useful colour space is hcl: hue, chroma and luminance.
  20. [Figure: the hcl colour space, with axes labelled hue, luminance and chroma.]
  21. [Figure: demo of the real shape in 3d, with axes labelled hue, luminance and chroma.]
  22. Default colour scales. Discrete: evenly spaced hues of equal chroma and luminance. No colour appears more important than any other. Does not imply order. Continuous: evenly spaced hues between two colours.
  23. Alternatives. Discrete: brewer. Continuous: gradient2, gradientn.
  24. Color brewer. Cynthia Brewer applied basic principles and then rigorously tested to produce a selection of good palettes, particularly tailored for maps: http://colorbrewer2.org/ Can use cut_interval() or cut_number() to convert continuous to discrete.
  25. # Fancy looking trigonometric function
      vals <- seq(-4 * pi, 4 * pi, len = 50)
      df <- expand.grid(x = vals, y = vals)
      df$r <- with(df, sqrt(x ^ 2 + y ^ 2))
      df$z <- with(df, cos(r ^ 2) * exp(- r / 6))
      df$z_cut <- cut_interval(df$z, 9)

      (p1 <- qplot(x, y, data = df, fill = z, geom = "tile"))
      (p2 <- qplot(x, y, data = df, fill = z_cut, geom = "tile"))
  26. p1 + scale_fill_gradient(low = "white", high = "black")

      # Highlight deviations
      p1 + scale_fill_gradient2()
      p1 + scale_fill_gradient2(breaks = seq(-1, 1, by = 0.25),
        limits = c(-1, 1))
      p1 + scale_fill_gradient2(mid = "white",
        low = "black", high = "black")

      p2 + scale_fill_brewer(pal = "Blues")
  27. Colour blindness. 7-10% of men are red-green colour “blind”. (Many other rarer types of colour blindness exist.) Solutions: avoid red-green contrasts; use redundant mappings; test. I like Color Oracle: http://colororacle.cartography.ch
  28. Your turn. Read through the examples for scale_colour_brewer, scale_colour_gradient2 and scale_colour_gradientn. Experiment!
  29. Other resources. A. Zeileis, K. Hornik, and P. Murrell. Escaping RGBland: Selecting colors for statistical graphics. Computational Statistics & Data Analysis, 2008. http://statmath.wu-wien.ac.at/~zeileis/papers/Zeileis+Hornik+Murrell-2008.pdf
  30. Themes
  31. Visual appearance. So far we have only discussed how to get the data displayed the way you want, focussing on the essence of the plot. Themes give you a huge amount of control over the appearance of the plot: the choice of background colours, fonts and so on.
  32. # Two built in themes. The default:
      qplot(carat, price, data = diamonds)

      # And a theme with a white background:
      qplot(carat, price, data = diamonds) + theme_bw()

      # Use theme_set if you want it to apply to every
      # future plot.
      theme_set(theme_bw())

      # This is the best way of seeing all the default
      # options
      theme_bw()
      theme_grey()
  33. Elements. You can also make your own theme, or modify an existing one. Themes are made up of elements, each of which can be one of: theme_line, theme_segment, theme_text, theme_rect, theme_blank. This gives you a lot of control over plot appearance.
  34. Elements.
      Axis: axis.line, axis.text.x, axis.text.y, axis.ticks, axis.title.x, axis.title.y
      Legend: legend.background, legend.key, legend.text, legend.title
      Panel: panel.background, panel.border, panel.grid.major, panel.grid.minor
      Strip: strip.background, strip.text.x, strip.text.y
  35. p <- qplot(displ, hwy, data = mpg) +
        opts(title = "Bigger engines are less efficient")

      # To modify a plot
      p
      p + opts(plot.title = theme_text(size = 12, face = "bold"))
      p + opts(plot.title = theme_text(colour = "red"))
      p + opts(plot.title = theme_text(angle = 45))
      p + opts(plot.title = theme_text(hjust = 1))
  36. Your turn. Fix the overlapping x-axis labels on this plot:
      qplot(reorder(model, hwy), hwy, data = mpg)
      Rotate the labels on these strips so they are easier to read.
      qplot(hwy, reorder(model, hwy), data = mpg) +
        facet_grid(manufacturer ~ ., scales = "free", space = "free")
  37. Feedback: http://hadley.wufoo.com/forms/stat405-feedback/
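The naming scheme and the three key parameters from the Scales slide (slide 12) can be sketched in one place. This is a minimal sketch that assumes a recent ggplot2 release, so it builds the plot with ggplot() and geom_point() rather than the qplot() call used on the slides. The limits of 3 to 9 are an illustrative choice, not taken from the slides.

```r
library(ggplot2)

# mpg ships with ggplot2; ggplot() + geom_point() stands in for qplot()
p <- ggplot(mpg, aes(cyl, displ)) + geom_point()

# Naming scheme: scale_<aesthetic>_<name>, here aesthetic = x / y,
# name = continuous
p_polished <- p +
  scale_x_continuous(
    name   = "Cylinders",                   # axis label
    breaks = c(4, 6, 8),                    # tick positions
    labels = c("small", "medium", "big"),   # text shown at each tick
    limits = c(3, 9)                        # illustrative range
  ) +
  scale_y_continuous(name = "Displacement (l)", limits = c(1, 8))

p_polished
```

Setting name, breaks, labels and limits in a single scale call keeps every decision about one axis in one place.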
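Slide 16 builds comma-separated legend labels with format(). One detail worth seeing in isolation is that format() pads all values to a common width by default. The sketch below uses base R only; the variable names padded and trimmed are made up for illustration.

```r
breaks <- c(0, 2000, 4000, 6000, 8000)

# Default: every label is right-justified to a common width
padded <- format(breaks, big.mark = ",")

# trim = TRUE suppresses the leading blanks
trimmed <- format(breaks, big.mark = ",", trim = TRUE)

print(padded)
print(trimmed)
```

Either vector can be passed as the labels argument of a scale. The trimmed version avoids stray leading spaces in the legend keys.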
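The description of the default discrete colour scale on slide 22 (evenly spaced hues of equal chroma and luminance) can be reproduced with the hcl() function from base R. This is a rough sketch: the helper name even_hues is hypothetical, and the hue offset, chroma and luminance values are assumptions chosen to resemble ggplot2's defaults, not values stated on the slides.

```r
# Evenly spaced hues at a fixed chroma and luminance in hcl space.
# even_hues() is a hypothetical helper; 15, 100 and 65 are assumed values.
even_hues <- function(n, chroma = 100, luminance = 65) {
  # n + 1 points around the hue circle, dropping the last one because
  # 375 degrees wraps around to the same hue as 15 degrees
  hues <- seq(15, 375, length.out = n + 1)[seq_len(n)]
  hcl(h = hues, c = chroma, l = luminance)
}

cols <- even_hues(5)
print(cols)
```

Because only the hue varies, no colour in the palette stands out as brighter or more saturated than the others, which is why such a palette does not imply an order.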
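Slide 24 mentions cut_interval() and cut_number() for converting a continuous variable to a discrete one. The difference between the two ideas (bins of equal width versus bins with equal numbers of observations) can be illustrated with base R's cut() and quantile(). This sketch approximates the two approaches and is not the ggplot2 implementation; the toy data are invented.

```r
x <- (1:100)^2   # skewed toy data

# Equal-width bins: the idea behind cut_interval()
by_width <- cut(x, breaks = 4)

# Equal-count bins: the idea behind cut_number(), breaking at quantiles
q <- quantile(x, probs = seq(0, 1, length.out = 5))
by_count <- cut(x, breaks = q, include.lowest = TRUE)

print(table(by_width))
print(table(by_count))
```

On skewed data the equal-width bins hold very different numbers of points, while the quantile-based bins each hold the same number. Which one suits a brewer palette depends on whether the bin boundaries or the bin sizes matter more for the story.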
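The opts() and theme_text() calls on slide 35 reflect the ggplot2 API at the time of the lecture. In later ggplot2 releases the same adjustments are written with theme() and element_text(). The sketch below assumes such a release; ggplot() with geom_point() and labs() stand in for the slide's qplot() and opts(title = ...).

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  labs(title = "Bigger engines are less efficient")

# Later-release equivalents of opts(plot.title = theme_text(...))
p_bold  <- p + theme(plot.title = element_text(size = 12, face = "bold"))
p_red   <- p + theme(plot.title = element_text(colour = "red"))
p_right <- p + theme(plot.title = element_text(hjust = 1))

p_bold
```

The element names listed on slide 34 (axis.text.x, strip.text.y and so on) are passed to theme() in the same way.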