Upcoming SlideShare
×

08 Functions

514 views

Published on

0 Likes
Statistics
Notes
• Full Name
Comment goes here.

Are you sure you want to Yes No
• Be the first to comment

• Be the first to like this

Views
Total views
514
On SlideShare
0
From Embeds
0
Number of Embeds
10
Actions
Shares
0
18
0
Likes
0
Embeds 0
No embeds

No notes for slide

08 Functions

1. 1. Stat405 Functions & for loops Hadley Wickham Monday, 21 September 2009
2. 2. 1. Changes to homework 2. Function structure and calling conventions 3. Practice writing functions 4. For loops Monday, 21 September 2009
3. 3. Homework changes Figuring out the problem isn’t as easy as I’d thought, so we’ll cover it in class today. Due next Wednesday, with some TBA function drills. Pay attention to the style guide! Monday, 21 September 2009
4. 4. Basic structure • Name • Input arguments • Names/positions • Defaults • Body • Output (ﬁnal result) Monday, 21 September 2009
5. 5. calculate_prize <- function(windows) { payoffs <- c("DD" = 800, "7" = 80, "BBB" = 40, "BB" = 25, "B" = 10, "C" = 10, "0" = 0) same <- length(unique(windows)) == 1 allbars <- all(windows %in% c("B", "BB", "BBB")) if (same) { prize <- payoffs[windows[1]] } else if (allbars) { prize <- 5 } else { cherries <- sum(windows == "C") diamonds <- sum(windows == "DD") prize <- c(0, 2, 5)[cherries + 1] * c(1, 2, 4)[diamonds + 1] } prize } Monday, 21 September 2009
6. 6. calculate_prize <- function(windows) { payoffs <- c("DD" = 800, "7" = 80, "BBB" = 40, arguments name = 25, "B" = 10, "C" = 10, "0" = 0) "BB" same <- length(unique(windows)) == 1 allbars <- all(windows %in% c("B", "BB", "BBB")) if (same) { prize <- payoffs[windows[1]] } else if (allbars) { prize <- 5 } else { always indent <- sum(windows == "C") cherries inside {}! <- sum(windows == "DD") diamonds prize <- c(0, 2, 5)[cherries + 1] * c(1, 2, 4)[diamonds + 1] } prize } last value in function is result Monday, 21 September 2009
7. 7. mean <- function(x) { sum(x) / length(x) } mean(1:10) mean <- function(x, na.rm = TRUE) { if (na.rm) x <- x[!is.na(x)] sum(x) / length(x) } mean(c(NA, 1:9)) Monday, 21 September 2009
8. 8. mean <- function(x) { sum(x) / length(x) } mean(1:10) default value mean <- function(x, na.rm = TRUE) { if (na.rm) x <- x[!is.na(x)] sum(x) / length(x) } mean(c(NA, 1:9)) Monday, 21 September 2009
9. 9. # Function: mean str(mean) mean args(mean) formals(mean) body(mean) Monday, 21 September 2009
10. 10. Function arguments Every function argument has a name and a position. When calling, matches exact name, then partial name, then position. Rule of thumb: for common functions, position based matching is ok for required arguments. Otherwise use names (abbreviated as long as clear). Monday, 21 September 2009
11. 11. mean(c(NA, 1:9), na.rm = T) # Confusing in this case, but often saves typing mean(c(NA, 1:9), na = T) # Generally don't need to name all arguments mean(x = c(NA, 1:9), na.rm = T) # Unusual orders best avoided, but # illustrate the principle mean(na.rm = T, c(NA, 1:9)) mean(na = T, c(NA, 1:9)) Monday, 21 September 2009
12. 12. # Overkill qplot(x = price, y = carat, data = diamonds) # Don't need to supply defaults mean(c(NA, 1:9), na.rm = F) # Need to remember too much about mean mean(c(NA, 1:9), , T) # Don't abbreviate too much! mean(c(NA, 1:9), n = T) Monday, 21 September 2009
13. 13. f <- function(a = 1, abcd = 1, abdd = 1) { print(a) print(abcd) print(abdd) } f(a = 5) f(ab = 5) f(abc = 5) Monday, 21 September 2009
14. 14. Your turn Write a function to calculate the variance of a vector. Write a function to calculate the covariance of two vectors. Make sure each has a na.rm argument. Write functions in your text editor, then copy into R. Monday, 21 September 2009
15. 15. Strategy Always want to start simple: start with test values and get the body of the function working ﬁrst. Check each step as you go. Don’t try and do too much at once! Create the function once everything works. Monday, 21 September 2009
16. 16. x <- 1:10 sum((x - mean(x)) ^ 2) var <- function(x) sum((x - mean(x)) ^ 2) x <- c(1:10, NA) var(x) na.rm <- TRUE if (na.rm) { x <- x[!is.na(x)] } var <- function(x, na.rm = F) { if (na.rm) { x <- x[!is.na(x)] } sum((x - mean(x)) ^ 2) } Monday, 21 September 2009
17. 17. Testing Always a good idea to test your code. We have a prebuilt set of test cases: the prize column in slots.csv So for each row in slots.csv, we need to calculate the prize and compare it to the actual. (Hopefully they will be same!) Monday, 21 September 2009
18. 18. For loops for(value in 1:10) { print(value) } cuts <- levels(diamonds\$cut) for(cut in cuts) { selected <- diamonds\$price[diamonds\$cut == cut] print(mean(selected)) } # Have to do something with output! Monday, 21 September 2009
19. 19. # Common pattern: create object for output, # then fill with results cuts <- levels(diamonds\$cut) means <- vector("numeric", length(cuts)) for(i in seq_along(cuts)) { sub <- diamonds[diamonds\$cut == cuts[i], ] means[i] <- mean(sub\$price) } # We will learn more sophisticated ways to do this # later on, but this is the most explicit Monday, 21 September 2009
20. 20. # Common pattern: create object for output, # then fill with results cuts <- levels(diamonds\$cut) means <- vector("numeric", length(cuts)) for(i in seq_along(cuts)) { Why use i and not cut? sub <- diamonds[diamonds\$cut == cuts[i], ] means[i] <- mean(sub\$price) } # We will learn more sophisticated ways to do this # later on, but this is the most explicit Monday, 21 September 2009
21. 21. seq_len(5) seq_len(10) n <- 10 seq_len(10) seq_along(1:10) seq_along(1:10 * 2) Monday, 21 September 2009
22. 22. Your turn For each diamond colour, calculate the median price and carat size Monday, 21 September 2009
23. 23. colours <- levels(diamonds\$color) n <- length(colours) mprice <- rep(NA, n) mcarat <- rep(NA, n) for(i in seq_len(n)) { set <- diamonds[diamonds\$color == colours[i], ] mprice[i] <- median(set\$price) mcarat[i] <- median(set\$carat) } results <- data.frame(colours, mprice, mcarat) Monday, 21 September 2009
24. 24. Back to slots... For each row, calculate the prize and save it, then compare calculated prize to actual prize Question: given a row, how can we extract the slots in the right form for the function? Monday, 21 September 2009
25. 25. slots <- read.csv("slots.csv") i <- 334 slots[i, ] slots[i, 1:3] str(slots[i, 1:3]) as.character(slots[i, 1:3]) c(as.character(slots[i, 1]), as.character(slots[i, 2]), as.character(slots[i, 3])) slots <- read.csv("slots.csv", stringsAsFactors = F) str(slots[i, 1:3]) as.character(slots[i, 1:3]) calculate_prize(as.character(slots[i, 1:3])) Monday, 21 September 2009
26. 26. # Create space to put the results slots\$check <- NA # For each row, calculate the prize for(i in seq_len(nrow(slots))) { w <- as.character(slots[i, 1:3]) slots\$check[i] <- calculate_prize(w) } # Check with known answers subset(slots, prize != prize_c) # Uh oh! Monday, 21 September 2009
27. 27. # Create space to put the results slots\$check <- NA # For each row, calculate the prize for(i in seq_len(nrow(slots))) { w <- as.character(slots[i, 1:3]) slots\$check[i] <- calculate_prize(w) } # Check with known answers subset(slots, prize != prize_c) # Uh oh! What is the problem? Think about the most generalise case Monday, 21 September 2009
28. 28. DD DD DD 800 windows <- c("DD", "C", "C") 7 7 7 80 # How can we calculate the BBB BBB BBB 40 # payoff? B B B 10 C C C 10 Any bar Any bar Any bar 5 C C * 5 C * C 5 C * * 2 * C * 2 DD doubles any winning combination. Two DD * * C 2 quadruples. DD is wild Monday, 21 September 2009