23 data-structures
Published in: Technology, Business

Transcript

  • 1. Hadley Wickham. Stat405: Data structures. Thursday, 11 November 2010
  • 2. Assessment
      • Final: there is no final
      • Two groups still haven’t sent me electronic versions of their project
      • Many of you STILL HAVEN’T returned your team evaluations
  • 3. 1. Basic data types
      2. Vectors, matrices & arrays
      3. Lists & data.frames
  • 4.       Same types   Different types
      1d     Vector       List
      2d     Matrix       Data frame
      nd     Array
  • 5. character, numeric, logical
      mode(), length()
      names(): optional, but useful
      A scalar is a vector of length 1.
      Technically, these are all atomic vectors.

      as.character(c(T, F))
      as.character(seq_len(5))
      as.logical(c(0, 1, 100))
      as.logical(c("T", "F", "a"))
      as.numeric(c("A", "100"))
      as.numeric(c(T, F))

      When vectors of different types occur in an expression, they will be automatically coerced to the same type: character > numeric > logical
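The coercion calls on this slide can be run directly. The sketch below stores each result under a hypothetical variable name (not from the slides) and spells out `TRUE`/`FALSE` instead of `T`/`F`; the comments show what base R returns.

```r
# Explicit coercion with the as.*() functions listed on the slide.
chr_from_lgl <- as.character(c(TRUE, FALSE))   # "TRUE" "FALSE"
chr_from_int <- as.character(seq_len(5))       # "1" "2" "3" "4" "5"
lgl_from_num <- as.logical(c(0, 1, 100))       # FALSE TRUE TRUE (non-zero is TRUE)
lgl_from_chr <- as.logical(c("T", "F", "a"))   # TRUE FALSE NA
# "A" cannot be parsed as a number, so it becomes NA (R also warns).
num_from_chr <- suppressWarnings(as.numeric(c("A", "100")))  # NA 100
num_from_lgl <- as.numeric(c(TRUE, FALSE))     # 1 0

# Automatic coercion follows character > numeric > logical.
mode_num <- mode(c(1, TRUE))        # "numeric"
mode_chr <- mode(c("a", 1, TRUE))   # "character"
```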
  • 6. Your turn
      Experiment with automatic coercion. What is happening in the following cases?
      104 & 2 < 4
      mean(diamonds$cut == "Good")
      c(T, F, T, T, "F")
      c(1, 2, 3, 4, F)
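One possible walk-through of these cases, as a sketch rather than the official answer. `cut_values` is a hypothetical stand-in for `diamonds$cut`, so the example does not depend on the diamonds data.

```r
# `<` binds more tightly than `&`, so this reads as 104 & (2 < 4).
# 104 is coerced to logical (non-zero means TRUE).
r1 <- 104 & 2 < 4                      # TRUE

# Hypothetical stand-in for diamonds$cut.
cut_values <- c("Good", "Ideal", "Good", "Premium")
# The comparison gives a logical vector; mean() coerces TRUE/FALSE to 1/0,
# so the result is the proportion of "Good" values.
r2 <- mean(cut_values == "Good")       # 0.5

# One character element forces everything to character.
r3 <- c(TRUE, FALSE, TRUE, TRUE, "F")  # "TRUE" "FALSE" "TRUE" "TRUE" "F"

# Logical FALSE is coerced to numeric 0.
r4 <- c(1, 2, 3, 4, FALSE)             # 1 2 3 4 0
```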
  • 7. Matrix (2d), Array (>2d)
      a <- seq_len(12)
      dim(a) <- c(1, 12)
      dim(a) <- c(4, 3)
      dim(a) <- c(2, 6)
      dim(a) <- c(3, 2, 2)

      (1, 12):
        1 2 3 4 5 6 7 8 9 10 11 12
      (4, 3):
        1 5  9
        2 6 10
        3 7 11
        4 8 12
      (2, 6):
        1 3 5 7  9 11
        2 4 6 8 10 12

      Just like a vector. Has mode() and length().
      Create with matrix() or array(), or from a vector by setting dim().
      as.vector() converts back to a vector.
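A short sketch of what setting `dim()` does: the same twelve values are reinterpreted column by column, and nothing is copied or reordered. The names `at_4x3`, `at_2x6`, `n_values` and `v` are hypothetical.

```r
a <- seq_len(12)

dim(a) <- c(4, 3)      # 4 rows, 3 columns, filled column by column
at_4x3 <- a[2, 3]      # 10

dim(a) <- c(2, 6)      # same data, now 2 rows and 6 columns
at_2x6 <- a[2, 3]      # 6

dim(a) <- c(3, 2, 2)   # a 3d array still holds the same 12 values
n_values <- length(a)  # 12

v <- as.vector(a)      # drops the dim attribute: back to 1:12
```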
  • 8. List
      Is also a vector (so has mode, length and names), but is different in that it can store any other vector inside it (including lists).
      Technically a recursive vector.
      Use unlist() to convert to a vector. Use as.list() to convert a vector to a list.

      c(1, 2, c(3, 4))
      list(1, 2, list(3, 4))

      c("a", T, 1:3)
      list("a", T, 1:3)

      a <- list(1:3, 1:5)
      unlist(a)
      as.list(a)

      b <- list(1:3, "a", "b")
      unlist(b)
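The contrast between `c()` and `list()` on this slide can be checked with a few calls. This is a sketch; the result names (`flat`, `nested`, and so on) are hypothetical.

```r
flat   <- c(1, 2, c(3, 4))          # c() flattens: 1 2 3 4
nested <- list(1, 2, list(3, 4))    # list() keeps nesting: 3 elements

coerced <- c("a", TRUE, 1:3)        # "a" "TRUE" "1" "2" "3"
kept    <- list("a", TRUE, 1:3)     # each element keeps its own type

a <- list(1:3, 1:5)
a_flat <- unlist(a)                 # 1 2 3 1 2 3 4 5
a_same <- identical(as.list(a), a)  # TRUE: a is already a list

b <- list(1:3, "a", "b")
b_flat <- unlist(b)                 # "1" "2" "3" "a" "b" (coerced to character)
```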
  • 9. Data frame
      List of vectors, each of the same length. (Cross between list and matrix.)
      Unlike a matrix, each column can have a different type.
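A minimal sketch of the "list of equal-length vectors" idea. The data frame `df` and its columns are hypothetical, and `stringsAsFactors = FALSE` is set so the character column stays character on older versions of R.

```r
df <- data.frame(
  id   = 1:3,
  name = c("a", "b", "c"),
  flag = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

is_a_list <- is.list(df)         # TRUE: a data frame is a list underneath
n_cols    <- length(df)          # 3: list length is the number of columns
shape     <- dim(df)             # 3 3: it also has matrix-like dimensions
col_types <- sapply(df, class)   # "integer" "character" "logical"
```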
  • 10. # How do you convert a matrix to a data frame?
      # How do you convert a data frame to a matrix?
      # What is different?

      # What do these subsetting operations do?
      # Why do they work? (Remember to use str)
      diamonds[1]
      diamonds[[1]]
      diamonds[["cut"]]
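One way to approach the conversion questions, sketched with hypothetical objects (`m`, `df`, `mixed`) rather than the course data: `as.data.frame()` and `as.matrix()` convert in each direction, and the difference shows up once columns have mixed types.

```r
m  <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("x", "y")))
df <- as.data.frame(m)   # matrix -> data frame, one column per matrix column
m2 <- as.matrix(df)      # data frame -> matrix

# With mixed column types a matrix must pick a single type.
mixed <- data.frame(n = 1:2, s = c("a", "b"), stringsAsFactors = FALSE)
mode_as_matrix   <- mode(as.matrix(mixed))    # "character"
mode_data_matrix <- mode(data.matrix(mixed))  # "numeric"
```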
  • 11. x <- sample(12)

      # What's the difference between a & b?
      a <- matrix(x, 4, 3)
      b <- array(x, c(4, 3))

      # What's the difference between x & y?
      y <- matrix(x, 12)

      # How are these subsetting operations different?
      a[, 1]
      a[, 1, drop = FALSE]
      a[1, ]
      a[1, , drop = FALSE]
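A sketch of how these questions can be explored by inspecting `dim()`. The `set.seed()` call and the `d_*` names are additions for reproducibility and are not part of the slide.

```r
set.seed(1)   # assumption: any seed works, this only makes the run repeatable
x <- sample(12)

a <- matrix(x, 4, 3)
b <- array(x, c(4, 3))
same_ab <- identical(a, b)        # TRUE: a 2d array is a matrix

y <- matrix(x, 12)
d_x <- dim(x)                     # NULL: a plain vector has no dimensions
d_y <- dim(y)                     # 12 1: a one-column matrix

d_col      <- dim(a[, 1])                 # NULL: dropped to a vector
d_col_keep <- dim(a[, 1, drop = FALSE])   # 4 1: still a matrix
d_row      <- dim(a[1, ])                 # NULL
d_row_keep <- dim(a[1, , drop = FALSE])   # 1 3
```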
  • 12. 1d: names(), length(), c()
      2d: colnames(), rownames(), ncol(), nrow(), cbind(), rbind()
      nd: dimnames(), dim(), abind() (special package)
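The base R functions in this summary can be tried on small objects. The sketch below uses hypothetical objects `v`, `m` and `arr`, and skips `abind()` because it lives in a separate package.

```r
# 1d: names(), length(), c()
v <- c(a = 1, b = 2, c = 3)
v_names <- names(v)      # "a" "b" "c"
v_len   <- length(v)     # 3

# 2d: cbind()/rbind() build matrices; colnames(), nrow(), ncol() describe them
m <- cbind(x = 1:2, y = 3:4)   # 2 x 2
m <- rbind(m, c(5, 6))         # 3 x 2
m_cols <- colnames(m)          # "x" "y"

# nd: dim() and dimnames() generalise the 2d helpers
arr <- array(1:8, c(2, 2, 2),
             dimnames = list(c("r1", "r2"), c("c1", "c2"), c("k1", "k2")))
arr_dim   <- dim(arr)               # 2 2 2
arr_third <- dimnames(arr)[[3]]     # "k1" "k2"
```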
  • 13. b <- seq_len(10)
      a <- letters[b]

      # What sort of matrix does this create?
      rbind(a, b)
      cbind(a, b)

      # Why would you want to use a data frame here?
      # How would you create it?
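One possible answer, as a sketch: binding a character vector to an integer vector coerces everything to character, while a data frame keeps each column's type. The names `r`, `k` and `df` are hypothetical, as is the choice of `stringsAsFactors = FALSE`.

```r
b <- seq_len(10)
a <- letters[b]

r <- rbind(a, b)   # 2 x 10 matrix, all character
k <- cbind(a, b)   # 10 x 2 matrix, all character

# A data frame keeps the letters as character and the numbers as integer.
df <- data.frame(a, b, stringsAsFactors = FALSE)
col_types <- sapply(df, class)   # a: "character", b: "integer"
```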
  • 14. load(url("http://had.co.nz/stat405/data/quiz.rdata"))

      # What is a? What is b?
      # How are they different? How are they similar?
      # How can you turn a into b?
      # How can you turn b into a?

      # What are c, d, and e?
      # How are they different? How are they similar?
      # How can you turn one into another?

      # What is f?
      # How can you extract the first element?
      # How can you extract the first value in the first element?
  • 15. # a is a numeric vector, containing the numbers 1 to 10
      # b is a list of numeric scalars
      # They contain the same values, but in a different format
      identical(a[1], b[[1]])
      identical(a, unlist(b))
      identical(b, as.list(a))

      # c is a named list
      # d is a data.frame
      # e is a numeric matrix
      # From most to least general: c, d, e
      identical(c, as.list(d))
      identical(d, as.data.frame(c))
      identical(e, data.matrix(d))
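The quiz file itself is not reproduced here, so the sketch below builds hypothetical stand-ins with the shapes the slide describes (`lst`, `df` and `mat` play the roles of `c`, `d` and `e`) and checks the same identities on them.

```r
# Stand-ins for a and b: a numeric vector and a list of numeric scalars.
a <- as.numeric(1:10)
b <- as.list(a)
ab_first <- identical(a[1], b[[1]])    # TRUE
ab_flat  <- identical(a, unlist(b))    # TRUE
ab_list  <- identical(b, as.list(a))   # TRUE

# Stand-ins for c, d and e: named list, data frame, numeric matrix.
lst <- list(x = c(1, 2, 3), y = c(4, 5, 6))
df  <- as.data.frame(lst)
mat <- data.matrix(df)
lst_round_trip <- identical(lst, as.list(df))   # TRUE
mat_is_numeric <- is.matrix(mat) && is.numeric(mat)
```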
  • 16. # f is a list of matrices of different dimensions
      f[[1]]
      f[[1]][1, 2]
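The same two-step extraction can be sketched on a hypothetical `f` (the real one comes from the quiz file): `[[` pulls out one matrix, and ordinary matrix indexing then applies to it.

```r
# Hypothetical list of matrices with different dimensions.
f <- list(matrix(1:6, nrow = 2), matrix(1:12, nrow = 3))

first_matrix <- f[[1]]         # the 2 x 3 matrix itself, not a list
first_value  <- f[[1]][1, 2]   # row 1, column 2 of that matrix: 3
```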
  • 17. # What do these subsetting operations do?
      # Why do they work? (Remember to use str)
      diamonds[1]
      diamonds[[1]]
      diamonds[["cut"]]
      diamonds[["cut"]][1:10]
      diamonds$cut[1:10]
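These operations work because a data frame is a list of columns. A sketch on a small hypothetical data frame `df` (standing in for `diamonds`, with made-up values):

```r
df <- data.frame(
  carat = c(0.23, 0.21, 0.29),
  cut   = c("Ideal", "Premium", "Good"),
  stringsAsFactors = FALSE
)

one_col_df <- df[1]           # single bracket: still a data frame (one column)
col_vector <- df[[1]]         # double bracket: the column vector itself
by_name    <- df[["cut"]]     # same idea, selecting by name
same_as_dollar <- identical(df[["cut"]], df$cut)   # TRUE
first_two  <- df$cut[1:2]     # then subset the vector as usual
```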
  • 18. Vectors: x[1:4]
      Matrices, arrays: x[1:4, ]   x[, 2:3, ]   x[1:4, , drop = F]
      Lists: x[[1]]   x$name   x[1]
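A compact sketch of each subsetting form in this summary. The objects `v`, `arr` and `lst` are hypothetical and chosen so every expression is valid.

```r
v <- 11:20
v_part <- v[1:4]                  # 11 12 13 14

arr <- array(1:24, c(4, 3, 2))
arr_part <- dim(arr[, 2:3, ])     # 4 2 2: one index slot per dimension

lst <- list(name = "a", values = 1:3)
el_by_pos  <- lst[[1]]            # "a": the element itself
el_by_name <- lst$name            # "a": same element, by name
sub_list   <- lst[1]              # a list of length 1 containing "a"
```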
  • 19. # What's the difference between a & b?
      a <- matrix(x, 4, 3)
      b <- array(x, c(4, 3))

      # What's the difference between x & y?
      y <- matrix(x, 12)

      # How are these subsetting operations different?
      a[, 1]
      a[, 1, drop = FALSE]
      a[1, ]
      a[1, , drop = FALSE]
  • 20. 1d: c()
      2d: matrix(), data.frame(), t()
      nd: array(), aperm()
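`t()` and `aperm()` are the least self-explanatory entries here, so a brief sketch may help: `t()` swaps the two dimensions of a matrix, and `aperm()` generalises that to any permutation of an array's dimensions. The objects `m` and `arr` are hypothetical.

```r
m <- matrix(1:6, nrow = 2)             # 2 x 3
t_dim <- dim(t(m))                     # 3 2

arr <- array(1:24, c(2, 3, 4))
rev_dim  <- dim(aperm(arr))            # 4 3 2: default reverses the dimensions
perm_dim <- dim(aperm(arr, c(2, 1, 3)))  # 3 2 4: swap only the first two

# An element keeps its value; only its index position changes.
same_value <- arr[1, 3, 2] == aperm(arr, c(2, 1, 3))[3, 1, 2]   # TRUE
```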
  • 21. b <- seq_len(10)
      a <- letters[b]

      # What sort of matrix does this create?
      rbind(a, b)
      cbind(a, b)

      # Why would you want to use a data frame here?
      # How would you create it?