Hadley Wickham
Stat405Data structures
Thursday, 11 November 2010
Assessment
• Final: there is no final
• Two groups still haven’t sent me
electronic versions of their project
• Many of you...
1. Basic data types
2. Vectors, matrices & arrays
3. Lists & data.frames
Thursday, 11 November 2010
Vector
Matrix
Array
List
Data frame
1d
2d
nd
Same types Different types
Thursday, 11 November 2010
character
numeric
logical
mode()
length() A scalar is a vector of length 1
as.character(c(T, F))
as.character(seq_len(5))
...
Your turn
Experiment with automatic coercion.
What is happening in the following cases?
104 & 2 < 4
mean(diamonds$cut == "...
Matrix (2d)
Array (>2d)
a <- seq_len(12)
dim(a) <- c(1, 12)
dim(a) <- c(4, 3)
dim(a) <- c(2, 6)
dim(a) <- c(3, 2, 2)
1 5 9...
List
Is also a vector (so has
mode, length and names),
but is different in that it can
store any other vector inside
it (i...
Data frame
List of vectors, each of the
same length. (Cross
between list and matrix)
Different to matrix in that
each colu...
# How do you convert a matrix to a data frame?
# How do you convert a data frame to a matrix?
# What is different?
# What ...
x <- sample(12)
# What's the difference between a & b?
a <- matrix(x, 4, 3)
b <- array(x, c(4, 3))
# What's the difference...
1d names() length() c()
2d
colnames()
rownames()
ncol()
nrow()
cbind()
rbind()
nd dimnames() dim() abind()
(special packag...
b <- seq_len(10)
a <- letters[b]
# What sort of matrix does this create?
rbind(a, b)
cbind(a, b)
# Why would you want to u...
load(url("http://had.co.nz/stat405/data/quiz.rdata"))
# What is a? What is b?
# How are they different? How are they simil...
# a is numeric vector, containing the numbers 1 to 10
# b is a list of numeric scalars
# they contain the same values, but...
# f is a list of matrices of different dimensions
f[[1]]
f[[1]][1, 2]
Thursday, 11 November 2010
# What does these subsetting operations do?
# Why do they work? (Remember to use str)
diamonds[1]
diamonds[[1]]
diamonds[[...
Vectors x[1:4] —
Matrices
Arrays
x[1:4, ]
x[, 2:3, ]
x[1:4, ,
drop = F]
Lists
x[[1]]
x$name
x[1]
Thursday, 11 November 2010
# What's the difference between a & b?
a <- matrix(x, 4, 3)
b <- array(x, c(4, 3))
# What's the difference between x & y
y...
1d c()
2d
matrix()
data.frame()
t()
nd array() aperm()
Thursday, 11 November 2010
b <- seq_len(10)
a <- letters[b]
# What sort of matrix does this create?
rbind(a, b)
cbind(a, b)
# Why would you want to u...
Upcoming SlideShare
Loading in...5
×

23 data-structures

1,243

Published on

Published in: Technology, Business
0 Comments
4 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
1,243
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
17
Comments
0
Likes
4
Embeds 0
No embeds

No notes for slide

Transcript of "23 data-structures"

  1. 1. Hadley Wickham Stat405Data structures Thursday, 11 November 2010
  2. 2. Assessment • Final: there is no final • Two groups still haven’t sent me electronic versions of their project • Many of you STILL HAVEN’T returned your team evaluations Thursday, 11 November 2010
  3. 3. 1. Basic data types 2. Vectors, matrices & arrays 3. Lists & data.frames Thursday, 11 November 2010
  4. 4. Vector Matrix Array List Data frame 1d 2d nd Same types Different types Thursday, 11 November 2010
  5. 5. character numeric logical mode() length() A scalar is a vector of length 1 as.character(c(T, F)) as.character(seq_len(5)) as.logical(c(0, 1, 100)) as.logical(c("T", "F", "a")) as.numeric(c("A", "100")) as.numeric(c(T, F)) When vectors of different types occur in an expression, they will be automatically coerced to the same type: character > numeric > logical names() Optional, but useful Technically, these are all atomic vectors Thursday, 11 November 2010
  6. 6. Your turn Experiment with automatic coercion. What is happening in the following cases? 104 & 2 < 4 mean(diamonds$cut == "Good") c(T, F, T, T, "F") c(1, 2, 3, 4, F) Thursday, 11 November 2010
  7. 7. Matrix (2d) Array (>2d) a <- seq_len(12) dim(a) <- c(1, 12) dim(a) <- c(4, 3) dim(a) <- c(2, 6) dim(a) <- c(3, 2, 2) 1 5 9 2 6 10 3 7 11 4 8 12 1 2 3 4 5 6 7 8 9 10 11 12 1 3 5 7 9 11 2 4 6 8 10 12 (1, 12) (4, 3) (2, 6)Just like a vector. Has mode() and length(). Create with matrix () or array(), or from a vector by setting dim() as.vector() converts back to a vector Thursday, 11 November 2010
  8. 8. List Is also a vector (so has mode, length and names), but is different in that it can store any other vector inside it (including lists). Use unlist() to convert to a vector. Use as.list() to convert a vector to a list. c(1, 2, c(3, 4)) list(1, 2, list(3, 4)) c("a", T, 1:3) list("a", T, 1:3) a <- list(1:3, 1:5) unlist(a) as.list(a) b <- list(1:3, "a", "b") unlist(b) Technically a recursive vector Thursday, 11 November 2010
  9. 9. Data frame List of vectors, each of the same length. (Cross between list and matrix) Different to matrix in that each column can have a different type Thursday, 11 November 2010
  10. 10. # How do you convert a matrix to a data frame? # How do you convert a data frame to a matrix? # What is different? # What does these subsetting operations do? # Why do they work? (Remember to use str) diamonds[1] diamonds[[1]] diamonds[["cut"]] Thursday, 11 November 2010
  11. 11. x <- sample(12) # What's the difference between a & b? a <- matrix(x, 4, 3) b <- array(x, c(4, 3)) # What's the difference between x & y y <- matrix(x, 12) # How are these subsetting operations different? a[, 1] a[, 1, drop = FALSE] a[1, ] a[1, , drop = FALSE] Thursday, 11 November 2010
  12. 12. 1d names() length() c() 2d colnames() rownames() ncol() nrow() cbind() rbind() nd dimnames() dim() abind() (special package) Thursday, 11 November 2010
  13. 13. b <- seq_len(10) a <- letters[b] # What sort of matrix does this create? rbind(a, b) cbind(a, b) # Why would you want to use a data frame here? # How would you create it? Thursday, 11 November 2010
  14. 14. load(url("http://had.co.nz/stat405/data/quiz.rdata")) # What is a? What is b? # How are they different? How are they similar? # How can you turn a in to b? # How can you turn b in to a? # What are c, d, and e? # How are they different? How are they similar? # How can you turn one into another? # What is f? # How can you extract the first element? # How can you extract the first value in the first # element? Thursday, 11 November 2010
  15. 15. # a is numeric vector, containing the numbers 1 to 10 # b is a list of numeric scalars # they contain the same values, but in a different format identical(a[1], b[[1]]) identical(a, unlist(b)) identical(b, as.list(a)) # c is a named list # d is a data.frame # e is a numeric matrix # From most to least general: c, d, e identical(c, as.list(d)) identical(d, as.data.frame(c)) identical(e, data.matrix(d)) Thursday, 11 November 2010
  16. 16. # f is a list of matrices of different dimensions f[[1]] f[[1]][1, 2] Thursday, 11 November 2010
  17. 17. # What does these subsetting operations do? # Why do they work? (Remember to use str) diamonds[1] diamonds[[1]] diamonds[["cut"]] diamonds[["cut"]][1:10] diamonds$cut[1:10] Thursday, 11 November 2010
  18. 18. Vectors x[1:4] — Matrices Arrays x[1:4, ] x[, 2:3, ] x[1:4, , drop = F] Lists x[[1]] x$name x[1] Thursday, 11 November 2010
  19. 19. # What's the difference between a & b? a <- matrix(x, 4, 3) b <- array(x, c(4, 3)) # What's the difference between x & y y <- matrix(x, 12) # How are these subsetting operations different? a[, 1] a[, 1, drop = FALSE] a[1, ] a[1, , drop = FALSE] Thursday, 11 November 2010
  20. 20. 1d c() 2d matrix() data.frame() t() nd array() aperm() Thursday, 11 November 2010
  21. 21. b <- seq_len(10) a <- letters[b] # What sort of matrix does this create? rbind(a, b) cbind(a, b) # Why would you want to use a data frame here? # How would you create it? Thursday, 11 November 2010
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×