14 Ddply

Published in: Technology, Education

Transcript

  • 1. Stat405 Still more ddply Hadley Wickham Wednesday, 14 October 2009
  • 2. 1. Homework 2. Projects 3. Continuing ddply 4. Next week
  • 3. Homework. Homework 4: out of 20. Code quality not graded (sorry, ran out of time). Homework 5: out of 5 (but equal weight with others). Make sure to check some great examples online. Please check grades online.
  • 4. Projects. Generally great. Make sure to read through the two I’ve posted online. Lots of interesting findings! One common mistake: don’t forget about the denominator!
  • 5.
        b <- read.csv("batting.csv")
        # Denominator: appearances at the plate
        b$onplate <- with(b, ab + bb + ibb + hbp + sh + sf)
        # Numerator: times on base
        b$onbase <- with(b, h + bb + ibb + hbp)
        b$obp <- with(b, onbase / onplate)

        library(ggplot2)
        # What is this going to look like?
        qplot(obp, data = b, binwidth = 0.01)
        qplot(onplate, obp, data = b)
        qplot(onplate, obp, data = b, alpha = I(1 / 100))
  • 6. How would you remove these outliers? [Histogram: count (0 to 4000) against obp (0.0 to 1.0)]
  • 7. How would you remove these outliers? [Same histogram, produced by qplot(obp, data = b, binwidth = 0.01)]
  • 8. Which player would you most like to have on your team?
  • 11. [Histogram: count (0 to 15000) against onplate (0 to 600)]
  • 12. [Same histogram, produced by qplot(onplate, data = b, binwidth = 5)]
  • 13. What cutoff should we choose? [Histogram: count (0 to 2000) against onplate (50 to 200)]
  • 14. What cutoff should we choose? [Same histogram, produced by last_plot() + xlim(10, 200)]
  • 15.
        library(plyr)

        # How many players make that many appearances
        # for each team in a given year?
        b2000 <- subset(b, year == 2000)
        ddply(b2000, "team", summarise,
          n100 = sum(onplate > 100),
          n200 = sum(onplate > 200),
          n = length(onplate)
        )
        # Problems?
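
To see what the ddply() call on this slide does, here is a rough base-R sketch of the same split, apply, combine steps on a made-up data frame. The team names and onplate values are invented for illustration, and the split()/lapply()/rbind() route is a stand-in for ddply(), not how plyr is implemented.

```r
# Made-up stand-in for b2000 (team names and values are invented)
toy <- data.frame(
  team    = c("A", "A", "A", "B", "B"),
  onplate = c(50, 150, 250, 120, 90)
)

# Split: one data frame per team
pieces <- split(toy, toy$team)

# Apply: compute the same three summaries for each piece
summaries <- lapply(pieces, function(df) {
  data.frame(
    team = as.character(df$team[1]),
    n100 = sum(df$onplate > 100),
    n200 = sum(df$onplate > 200),
    n    = length(df$onplate)
  )
})

# Combine: stack the per-team rows back into one data frame
counts <- do.call(rbind, summaries)
rownames(counts) <- NULL
counts
```

Each team ends up as one row, which is the shape ddply() returns when used with summarise.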
  • 16.
        qplot(onplate, reorder(team, onplate), data = b2000)
        qplot(year, onplate, data = subset(b, year > 1960),
          geom = "boxplot", group = year)
  • 17. [Histogram: count (0 to 4000) against obp (0.0 to 1.0)]
  • 18. [Same histogram, produced by qplot(obp, data = b, binwidth = 0.01)]
  • 19. [Histogram: count (0 to 1500) against obp (0.0 to 1.0)]
  • 20. [Same histogram, produced by qplot(obp, data = subset(b, onplate > 100), binwidth = 0.01) + xlim(0, 1)]
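
The effect of the onplate > 100 cutoff can be checked on a few made-up rows. This is a minimal sketch: the numbers are invented, not taken from batting.csv, and only the column names follow the slides.

```r
# Made-up rows for illustration (not from batting.csv)
toy <- data.frame(
  player  = c("a", "b", "c", "d"),
  onbase  = c(1, 0, 150, 210),
  onplate = c(1, 2, 500, 600)
)
toy$obp <- toy$onbase / toy$onplate

# Rows with a tiny denominator give the extreme values at 0 and 1
toy[order(toy$onplate), ]

# Dropping rows with a small denominator removes those extremes
regulars <- subset(toy, onplate > 100)
range(regulars$obp)
```

The outliers are not unusual players; they are ratios computed from almost no data, which is why the filter is on the denominator rather than on obp itself.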
  • 21. Project tips. Proofread: far too many projects with obvious mistakes. Include a section on the data, giving a quick English run-down of what you did to the data. Appendix should only contain technical details. Presentation matters: you should be proud of your work, so take a little time to put it in a nice wrapper.
  • 22. Baby name sex exploration
  • 23.
        library(plyr)
        library(ggplot2)

        bnames <- read.csv("baby-names.csv", stringsAsFactors = FALSE)

        # For each name: summed proportions and row counts for each sex
        times <- ddply(bnames, c("name"), summarise,
          boys = sum(prop[sex == "boy"]),
          boys_n = sum(sex == "boy"),
          girls = sum(prop[sex == "girl"]),
          girls_n = sum(sex == "girl"),
          .progress = "text"
        )

        # Keep names with more than 10 rows for each sex
        both_sexes <- subset(times, boys_n > 10 & girls_n > 10 & boys + girls > 0.4)
        selected_names <- both_sexes$name
        selected <- subset(bnames, name %in% selected_names)
  • 24. Yearly summaries. Next problem is to classify which names are dual-sex, and which are errors. To do that, we’ll need to calculate yearly summaries for each of those names, and use our knowledge of names to come up with a good classification criterion.
  • 25. Your turn. For each name, in each year, figure out the total number of boys and girls. Think of ways to summarise the difference between the number of boys and girls, and start visualising the data.
  • 26.
        bysex <- ddply(selected, c("name", "year"), summarise,
          boys = sum(prop[sex == "boy"]),
          girls = sum(prop[sex == "girl"]),
          .progress = "text"
        )

        # It's useful to have a symmetric means of comparing
        # the relative abundance of boys and girls - the log
        # ratio is good for this.
        bysex$lratio <- log10(bysex$boys / bysex$girls)
        bysex$lratio[!is.finite(bysex$lratio)] <- NA
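
Why the log ratio counts as symmetric can be seen on a few made-up proportions. This is an illustrative sketch with invented values, using the same log10() and is.finite() steps as the slide.

```r
# Made-up proportions for illustration
boys  <- c(0.010, 0.001, 0.005, 0.000)
girls <- c(0.001, 0.010, 0.005, 0.002)

# The raw ratio is lopsided: ten times more boys gives 10,
# ten times more girls gives 0.1
raw <- boys / girls

# The log ratio treats both directions alike: about 1 and -1,
# with 0 meaning equal proportions
lratio <- log10(boys / girls)

# A zero proportion gives an infinite log ratio; set those to NA
lratio[!is.finite(lratio)] <- NA
lratio
```

On the raw scale, a name that is mostly female is squeezed between 0 and 1 while a mostly male name can run off to large values; on the log scale the two cases mirror each other around 0.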
  • 27. [Plot: lratio (−2 to 2) against year (1880 to 2000)]
  • 28. [Dot plot: reorder(name, lratio, na.rm = T) against lratio (−2 to 2), one row per name, running from Ronald, Mark and Larry at the top, through Willie and Shirley in the middle, to Karen, Linda and Susan at the bottom]
  • 29. [Dot plot: the same names in the same order against abs(lratio) (0.5 to 2.5)]
  • 30.
        theme_set(theme_grey(10))
        qplot(year, lratio, data = bysex, group = name, geom = "line")
        qplot(lratio, reorder(name, lratio, na.rm = T), data = bysex)
        qplot(abs(lratio), reorder(name, lratio, na.rm = T), data = bysex)
        qplot(abs(lratio), reorder(name, lratio, na.rm = T), data = bysex) +
          geom_point(data = both_sexes, colour = "red")
  • 31. What characteristics of each name might we want to use to classify them as dual-sex or as sex-errors? [Plot: lratio (−2 to 2) against year (1880 to 2000)]
  • 32. Your turn. Compute the mean and range of lratio for each name. Plot and come up with cutoffs that you think separate the two groups.
  • 33.
        rng <- ddply(bysex, "name", summarise,
          diff = diff(range(lratio, na.rm = T)),
          mean = mean(lratio, na.rm = T)
        )

        qplot(diff, abs(mean), data = rng)
        qplot(diff, abs(mean), data = rng,
          colour = abs(mean) < 1.75 | diff > 0.9)

        shared_names <- subset(rng, abs(mean) < 1.75 | diff > 0.9)$name

        qplot(abs(lratio), reorder(name, lratio, na.rm=T),
          data = subset(bysex, name %in% shared_names))
        qplot(year, lratio, geom = "line", group = name,
          data = subset(bysex, name %in% shared_names))
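
How the two cutoffs behave can be tried on three hypothetical names. In this sketch the names and yearly log ratios are invented, and tapply() stands in for the ddply() call so that it runs in base R; the cutoffs 1.75 and 0.9 are the ones from the slide.

```r
# Made-up yearly log ratios for three hypothetical names
toy <- data.frame(
  name   = rep(c("steady", "switcher", "balanced"), each = 3),
  lratio = c(2.0, 2.1, 2.0,  1.9, 1.0, -0.5,  0.1, -0.1, 0.0)
)

# Per-name range width and mean of lratio
diffs <- tapply(toy$lratio, toy$name, function(x) diff(range(x, na.rm = TRUE)))
means <- tapply(toy$lratio, toy$name, mean, na.rm = TRUE)
rng <- data.frame(
  name = names(diffs),
  diff = as.vector(diffs),
  mean = as.vector(means)
)

# Same rule as the slide: a mean near zero OR a wide range counts as shared
shared <- as.character(subset(rng, abs(mean) < 1.75 | diff > 0.9)$name)
shared
```

Here "steady" is dropped because its log ratio sits near 2 in every year, while "switcher" is kept for its wide range and "balanced" for its mean near 0.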
  • 34. Up next. More ggplot2. Graphics for time and space. A little more theory. Monday: no class. Friday: introduction to Git (with Garrett).