Stat405
       Graphical theory & critique


                          Hadley Wickham
Monday, 2 November 2009
Exploratory graphics
                  Are for you (not others). Need to be able
                  to create rapidly because your first
                  attempt will never be the most revealing.
                  Iteration is crucial for developing the best
                  display of your data.
                  Gives rise to two key questions:



What should I plot?
                    How can I plot it?


Two general tools
                                Plot critique toolkit:
                           “graphics are like pumpkin pie”
                               Theory behind ggplot2:
                          “A layered grammar of graphics”



                           plus lots of practice...

Graphics are like
                     pumpkin pie

                          The four C’s of critiquing a graphic
                          Content, Construction, Context, Consumption

Content
                    What data (variables) does the
                    graph display?
                    What non-data is present?
                    What is pumpkin (essence of the
                    graphic) vs what is spice (useful
                    additional info)?

Your turn

                    Identify the data and non-data
                    on “Napoleon's march” and
                    “Building an electoral victory”.
                    Which features are the most
                    important? Which are just
                    useful background information?

Results
                    Napoleon’s march: (top) latitude,
                    longitude, number of troops,
                    direction, branch, city name
                    (bottom) longitude, temperature, date
                    Building an electoral victory: state,
                    number of electoral college votes,
                    winner, margin of victory

Construction

                    How many layers are on the plot?
                    What data does each layer
                    display? What sort of geometric
                    object does it use? Is it a summary
                    of the raw data? How are
                    variables mapped to aesthetics?

Perceptual mapping
                          (For continuous variables only!)

                          Best   1. Position along a common scale
                                 2. Position along nonaligned scale
                                 3. Length
                                 4. Angle/slope
                                 5. Area
                                 6. Volume
                          Worst 7. Colour
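
The ranking above can be tried out directly in ggplot2. The sketch below is illustrative only: it uses the `mpg` data that ships with ggplot2, and picking `hwy` as the variable to encode is an assumption made for this example. It draws the same continuous variable twice, once with the best-ranked encoding (position along a common scale) and once with the worst-ranked (colour), so the two can be compared by eye.

```r
library(ggplot2)

# Best: hwy is read off a common y-axis, so small differences are easy to judge.
p_position <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# Worst: hwy is encoded as colour, so the reader has to compare shades
# against a legend instead of positions against an axis.
p_colour <- ggplot(mpg, aes(x = displ, y = class, colour = hwy)) +
  geom_point()

print(p_position)
print(p_colour)
```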


Your turn
                    Answer the following questions
                    for “Napoleon's march” and
                    “Flight delays”:
                    How many layers are on the plot?
                    What data does each layer
                    display? How does it display it?

Results
                    Napoleon’s march: (top) (1) path plot with width
                    mapped to number of troops, colour to direction,
                    separate group for each branch (2) labels giving
                    city names (bottom) (1) line plot with longitude on
                    x-axis and temperature on y-axis (2) text labels
                    giving dates
                    Flight delays: (1) white circles showing 100%
                    cancellation, (2) outline of states, (3) points with
                    size proportional to percent cancellations at each
                    airport.


We can explain the
                          composition of a graphic
                          in words, but how do we
                                 create it?



“If any number of
                          magnitudes are each
                          the same multiple of
                          the same number of
                          other magnitudes,
                          then the sum is that
                          multiple of the sum.”
                          Euclid, ~300 BC




                          m(Σx) = Σ(mx)
The grammar of graphics
                  An abstraction which makes thinking about,
                  reasoning about and communicating
                  graphics easier.
                  Developed by Leland Wilkinson, particularly
                  in “The Grammar of Graphics” 1999/2005
                  You’ve been using it in ggplot2 without
                  knowing it! But to do more, you need to
                  learn more about the theory.


What is a layer?
                  • Data
                  • Mappings from variables to aesthetics
                    (aes)
                  • A geometric object (geom)
                  • A statistical transformation (stat)
                  • A position adjustment (position)


layer(geom, stat, position, data, mapping, ...)

     layer(
       data = mpg,
       mapping = aes(x = displ, y = hwy),
       geom = "point",
       stat = "identity",
       position = "identity"
     )

     layer(
       data = diamonds,
       mapping = aes(x = carat),
       geom = "bar",
       stat = "bin",
       position = "stack"
     )

# A lot of typing!

     layer(
       data = mpg,
       mapping = aes(x = displ, y = hwy),
       geom = "point",
       stat = "identity",
       position = "identity"
     )

     # Every geom has an associated default statistic
     # (and vice versa), and position adjustment.

     geom_point(aes(displ, hwy), data = mpg)
     geom_histogram(aes(displ), data = mpg)
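
One way to see those defaults is to inspect a layer object. This is only a sketch, assuming a recent ggplot2 release in which the object returned by a `geom_*()` call exposes `stat` and `position` fields. That is an implementation detail rather than a documented interface, so it may differ between versions. The names `point_layer` and `hist_layer` are made up for the example.

```r
library(ggplot2)

# Build two layers without data; only their defaults are of interest here.
point_layer <- geom_point()
hist_layer  <- geom_histogram()

# Each layer carries the stat and position adjustment its geom defaults to.
class(point_layer$stat)[1]      # default stat for geom_point
class(point_layer$position)[1]  # default position for geom_point
class(hist_layer$stat)[1]       # default stat for geom_histogram
class(hist_layer$position)[1]   # default position for geom_histogram
```

Under that assumption, the point layer should report an identity stat and position, and the histogram layer a binning stat with stacking, which are the same choices spelled out in the explicit `layer()` calls.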
# To actually create the plot
     ggplot() +
       geom_point(aes(displ, hwy), data = mpg)

     ggplot() +
       geom_histogram(aes(displ), data = mpg)




# Multiple layers
     ggplot() +
       geom_point(aes(displ, hwy), data = mpg) +
       geom_smooth(aes(displ, hwy), data = mpg)

     # Avoid redundancy:
     ggplot(mpg, aes(displ, hwy)) +
       geom_point() +
       geom_smooth()




# Different layers can have different aesthetics
     ggplot(mpg, aes(displ, hwy)) +
       geom_point(aes(colour = class)) +
       geom_smooth()

     ggplot(mpg, aes(displ, hwy)) +
       geom_point(aes(colour = class)) +
       geom_smooth(aes(group = class), method = "lm",
         se = FALSE)




Your turn
                  For each of the following plots created with
                  qplot, recreate the equivalent ggplot code.
                  qplot(price, carat, data = diamonds)
                  qplot(hwy, cty, data = mpg, geom = "jitter")
                  qplot(reorder(class, hwy), hwy, data = mpg,
                    geom = c("jitter", "boxplot"))
                  qplot(log10(price), log10(carat),
                    data = diamonds, colour = color) +
                    geom_smooth(method = "lm")


ggplot(diamonds, aes(price, carat)) +
       geom_point()

     ggplot(mpg, aes(hwy, cty)) +
       geom_jitter()

     ggplot(mpg, aes(reorder(class, hwy), hwy)) +
       geom_jitter() +
       geom_boxplot()

     ggplot(diamonds, aes(log10(price), log10(carat),
       colour = color)) +
       geom_point() +
       geom_smooth(method = "lm")

More geoms & stats

                  See http://had.co.nz/ggplot2 for complete
                  list with helpful icons:
                  Geoms: (0d) point, (1d) line, path, (2d)
                  boxplot, bar, tile, text, polygon
                  Stats: bin, summary, sum



Your turn

                  Go back to the descriptions of “Napoleon’s
                  march” and “Flight delays” that you
                  created before. Start converting your
                  textual descriptions into working ggplot2
                  code. Use the data sets available online
                  to produce the graphics.
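
As a starting point, the top panel of “Napoleon's march” might be sketched as below. This is a rough sketch, not a reference solution. The data frames `troops` and `cities`, their column names, and every value in them are placeholders invented for illustration; they are not Minard's figures and should be swapped for the course data sets. It also assumes ggplot2 3.4 or later, where line width is set with the `linewidth` aesthetic (older versions use `size`).

```r
library(ggplot2)

# Placeholder values for illustration only; these are NOT Minard's figures.
troops <- data.frame(
  long      = c(24, 28, 33, 37, 37, 33, 28, 24),
  lat       = c(55.0, 54.8, 55.2, 55.8, 55.7, 54.5, 54.3, 54.4),
  survivors = c(400, 300, 200, 100, 100, 60, 30, 10),
  direction = rep(c("advance", "retreat"), each = 4),
  group     = rep(c(1, 2), each = 4)
)
cities <- data.frame(
  long = c(24, 37),
  lat  = c(55.0, 55.8),
  city = c("Start", "End")
)

p <- ggplot() +
  # Layer 1: path with width mapped to troop numbers, colour to direction,
  # and a separate group for each branch of the army.
  geom_path(
    aes(long, lat, linewidth = survivors, colour = direction, group = group),
    data = troops
  ) +
  # Layer 2: text labels giving city names.
  geom_text(aes(long, lat, label = city), data = cities)

print(p)
```

Each layer gets its own `data` argument because the path and the labels come from different data frames, which is why nothing is passed to `ggplot()` itself.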



Other features
                  Scales. Used to override the default perceptual
                  mappings. Mainly useful for polishing a plot for
                  communication.
                  Coordinate systems. Rarely needed, but critical
                  when they are.
                  Faceting. You have already seen it in use. The only
                  other important feature is its interaction with scales.
                  Themes. Control the presentation of non-data
                  elements.
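
The four features listed above can each be added to a plot with `+`, just like a layer. The sketch below shows one example of each on the `mpg` data. The particular choices (the `"Set1"` palette, the x-axis limits, faceting by `drv`, and `theme_bw()`) are arbitrary picks for illustration, not recommendations from the slides.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(colour = class)) +
  # Scale: override the default colour mapping.
  scale_colour_brewer(palette = "Set1") +
  # Coordinate system: zoom in on part of the x-axis without dropping data.
  coord_cartesian(xlim = c(2, 6)) +
  # Faceting: one panel per drive type; "free_y" lets each panel pick its
  # own y-axis range, which is where faceting interacts with scales.
  facet_wrap(~ drv, scales = "free_y") +
  # Theme: change the appearance of non-data elements.
  theme_bw()

print(p)
```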


Project 3 preview
                  Choose your own dataset.
                  Results will be presented as a poster during
                  a formal poster session in the last week of
                  class.
                  On November 16, Tracey Volz (a
                  communications expert) will come and
                  talk about how to create a good poster.


Data suggestions

                  http://delicious.com/hadley/data
                  http://github.com/hadley/data-housing-crisis




