Day 1c access, select ordering copy.pptx

Access, select & ordering
Day 1 - Introduction to R for Life Sciences

Accessing vectors, matrices, data.frames
Positions within vectors, matrices and data.frames are accessed
using [ ]:
> v <- c(10, 3, 5, 10)
> v[2]
[1] 3
[ ] can also be used to assign (write) new values, e.g.: v[2] <- 10
( ) are used for function calls (or for grouping operators; more on that later)!
for instance: myvector <- c( ), mymatrix <- matrix( ), mydata <- data.frame( )
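
As a minimal sketch of the difference between [ ] and ( ), reusing the vector v from this slide (the name second is only introduced here for illustration):

```r
# c() is a function call, so it uses ( )
v <- c(10, 3, 5, 10)

# [ ] reads the element at position 2
second <- v[2]   # 3

# [ ] on the left-hand side of <- overwrites that position
v[2] <- 10
v                # 10 10 5 10
```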

Three ways to access values from vectors, matrices
and data.frames
Integers: specify the positions of the elements you mean
Logical: specify (using TRUE/FALSE) which elements you want
Character: specify their names
only if your vector/matrix/data.frame has (unique) names!
All these selections are made with vectors.
They are sometimes called indexes.

Examples:
chromlength <- c(230218, 813184, 316620, 1531933)
Integer:
chromlength[ c(4, 2) ] => 1531933, 813184
Logical:
chromlength[ c(FALSE, FALSE, TRUE, FALSE) ] => 316620
Character:
names(chromlength) <- c("chrI", "chrII", "chrIII", "chrIV")
chromlength[ c("chrIII", "chrI") ] => 316620, 230218

Specifics for lists & data.frames
lists
> mylist <- list(analysis="GSEA", genes=c("Foxo3a", "TP53"), cutoff=0.05)
> mylist$analysis
> mylist$genes[2]
data.frames
> mydata[ , "id"]
> mydata$id # does the same thing
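
The slide does not define mydata at this point, so the sketch below assumes a small hypothetical data.frame. It also shows [[ ]], which is not covered on the slide but returns the same list element as $:

```r
mylist <- list(analysis = "GSEA", genes = c("Foxo3a", "TP53"), cutoff = 0.05)

# Hypothetical example data, only for illustration
mydata <- data.frame(id = c(1, 3, 4, 2), value = c(-0.2, 1.5, -3, 3))

mylist$genes[2]       # "TP53"
mylist[["cutoff"]]    # 0.05, the same element as mylist$cutoff

# Both forms return the id column as a plain vector
same <- identical(mydata[, "id"], mydata$id)   # TRUE
```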

Dimensions of data.frames and matrices
> x <- matrix(1:6, nrow=2, byrow=TRUE)
> x
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
x[ i, j ]
index before the comma: indicates the row(s). If missing: all rows
index after the comma: indicates column(s). If missing: all columns

Example
> x
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
Using integers:
> x[2, 3] # the value on the second row, third column
> x[ , 2] # all rows, second column. So: the whole 2nd column
> x[ , c(1,3)] # the first and third column (new data.frame or matrix!)
> x[ , -2] # everything but the second column
> x[ , 1:3] # first up to and including third column
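
A runnable version of the integer examples, with the results as comments. The variable names value, col2 and rest are only introduced here for illustration:

```r
x <- matrix(1:6, nrow = 2, byrow = TRUE)

value <- x[2, 3]    # 6
col2  <- x[, 2]     # 2 5 (a single column comes back as a plain vector)
rest  <- x[, -2]    # columns 1 and 3, still a matrix with 2 rows and 2 columns

dim(rest)           # 2 2
```

Dropping column 2 and selecting columns 1 and 3 give the same result for this matrix.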

Using logicals:
      delA delB delC
geneA    1    2    3
geneB    4    5    6
> ind <- c(FALSE, TRUE, TRUE)
> x[ 1 , ind] # first row; 1st column: no, 2nd and 3rd column: yes
[1] 2 3

Using characters:
> x <- matrix(1:6, nrow=2, byrow=TRUE,
dimnames=list( c("geneA", "geneB"), c("delA", "delB", "delC")))
> x
      delA delB delC
geneA    1    2    3
geneB    4    5    6
> x["geneB", "delA"] # selects the value of geneB in delA
> x[, c("delA", "delC")] # selects columns delA and delC

Logical vector and selection
Often (implicitly) used in combination with select statements
      delA delB delC
geneA    1    2    3
geneB    4    5    6
> ind <- x["geneA", ] > 1
> ind
 delA  delB  delC
FALSE  TRUE  TRUE
> x["geneA", ind]
delB delC
   2    3
> x["geneA", x["geneA", ] > 1] # same as above, but implicit

Operators
< # Less than
> # Greater than
== # Equal to. Note: don’t confuse with = (assignment)
>= # Greater than or equal to
<= # Less than or equal to
& # AND
| # OR
Note: x <- 2 is an assignment
x < -2 is a comparison! Use extra spaces or parentheses
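
A short sketch of this pitfall. The names after_assign, is_smaller and is_smaller2 are only used here to keep the intermediate results:

```r
x <- 5
x<-2                     # without spaces this is parsed as an assignment
after_assign <- x        # 2

is_smaller  <- x < -2    # with a space it is a comparison: FALSE
is_smaller2 <- x < (-2)  # parentheses make the intent explicit: FALSE
```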

AND (&), OR (|), NOT (!)
a      b      a & b
FALSE  FALSE  FALSE
FALSE  TRUE   FALSE
TRUE   FALSE  FALSE
TRUE   TRUE   TRUE
FALSE  NA     FALSE
TRUE   NA     NA

a      b      a | b
FALSE  FALSE  FALSE
FALSE  TRUE   TRUE
TRUE   FALSE  TRUE
TRUE   TRUE   TRUE
FALSE  NA     NA
TRUE   NA     TRUE

a      ! a
FALSE  TRUE
TRUE   FALSE
NA     NA
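
The NA rows of these tables can be checked directly. A small sketch, where the vector a is an illustrative example:

```r
a <- c(FALSE, TRUE, NA)

a & FALSE   # FALSE FALSE FALSE (FALSE wins regardless of the NA)
a & TRUE    # FALSE  TRUE    NA
a | TRUE    #  TRUE  TRUE  TRUE (TRUE wins regardless of the NA)
a | FALSE   # FALSE  TRUE    NA
!a          #  TRUE FALSE    NA
```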

Auto-recycling of vector content
If you combine vectors of different lengths, R will automatically
‘recycle’ the content of the shorter vector until it matches the length of
the longer one:
> mynumbers <- c(10.4, 5, 8.4, 3)
> mynumbers2 <- mynumbers + 1
> mynumbers2
11.4, 6, 9.4, 4 # In fact, mynumbers + c(1, 1, 1, 1) is done
But also:
> mynumbers2 + c(2, 30)
13.4, 36, 11.4, 34 # Here, mynumbers2 + c(2, 30, 2, 30) is done.
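
One point worth knowing in addition: if the longer length is not a multiple of the shorter one, R still recycles but issues a warning. A minimal sketch, where c(1, 2, 3) is an arbitrary example and tryCatch() is only used to capture the warning text:

```r
mynumbers <- c(10.4, 5, 8.4, 3)

# Length 4 is a multiple of 2: recycled silently as c(2, 30, 2, 30)
ok <- mynumbers + c(2, 30)          # 12.4 35 10.4 33

# Length 4 is not a multiple of 3: R recycles c(1, 2, 3, 1) and warns.
# Here the warning is caught and its message is stored instead of the result.
msg <- tryCatch(mynumbers + c(1, 2, 3),
                warning = function(w) conditionMessage(w))
```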

Recycling also works with logical operators
Comparison of equal length vectors (no recycling needed) :
> v1 <- c(10, 5, 5, 1)
> v2 <- c(10, 3, 5, 2)
> v1 == v2
TRUE, FALSE, TRUE, FALSE
Comparison of unequal length vectors:
> v1 == 5 # The value 5 is recycled to get an equal length vector.
# So in fact, v1 == c(5,5,5,5) is done
FALSE, TRUE, TRUE, FALSE
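
The logical vector produced by such a comparison can be used directly as an index. A short sketch; which() is not covered in these slides and is shown only as a convenient way to get the matching positions:

```r
v1 <- c(10, 5, 5, 1)

is_five <- v1 == 5     # FALSE TRUE TRUE FALSE
v1[is_five]            # 5 5
which(is_five)         # 2 3 (positions of the TRUE values)
```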

Operators
      delA delB delC
geneA    1    2    3
geneB    4    5    6
> ind <- x["geneA", ] > 1 & x["geneA", ] < 3
> x["geneA", ind]
delB
   2

Combining logical operators
AND-operator has precedence over OR-operator
(like in mathematics: *, / have precedence over -, +)
Group them with parentheses if needed, or for clarity
> ind <- ( x < -1.7 | x > 2 ) & !is.na(x)
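
To illustrate why the parentheses matter, here is a sketch with a hypothetical numeric vector containing an NA. Without parentheses, & is evaluated before |, so the NA check only applies to the x > 2 part:

```r
x <- c(-3, -1, 0, 2.5, NA)   # hypothetical example values

no_parens   <- x < -1.7 | x > 2 & !is.na(x)     # TRUE FALSE FALSE TRUE NA
with_parens <- (x < -1.7 | x > 2) & !is.na(x)   # TRUE FALSE FALSE TRUE FALSE

x[no_parens]     # -3.0  2.5   NA
x[with_parens]   # -3.0  2.5
```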

Select statements
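Common helper functions that return logical values (TRUE/FALSE):
is.na() # one value per element: TRUE where the element is missing
is.numeric() (and also is.character(), is.factor(), is.matrix(), is.data.frame() ) # a single TRUE/FALSE for the whole object
duplicated() # one value per element: TRUE where the element repeats an earlier one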
! # (exclamation mark): logical NOT, i.e. negation
Used a lot in checking the consistency of your data or arguments
for a function
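
A small sketch of such a consistency check, using a hypothetical vector of gene identifiers:

```r
ids <- c("geneA", "geneB", NA, "geneA")   # hypothetical example data

is.na(ids)          # FALSE FALSE  TRUE FALSE
duplicated(ids)     # FALSE FALSE FALSE  TRUE
is.character(ids)   # TRUE

# Keep only identifiers that are present and not seen before
clean <- ids[!is.na(ids) & !duplicated(ids)]   # "geneA" "geneB"
```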

Ordering
(Re)order a data.frame or matrix using the values from a single
column using order()
> mydata <- data.frame( id=c(1,3,4,2), name=c("geneB", "geneA", "geneD",
"geneC"), value=c(-0.2, 1.5, -3, 3))
> mydata[order(mydata[, "id"]), ] # sort on id
> mydata[order(mydata[, "name"]), ] # sort on name
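
It may help to see what order() itself returns: the row positions in sorted order, which are then used as a row index. A sketch on the same data; sorting on value and the decreasing = TRUE argument are additions not shown on the slide:

```r
mydata <- data.frame(id = c(1, 3, 4, 2),
                     name = c("geneB", "geneA", "geneD", "geneC"),
                     value = c(-0.2, 1.5, -3, 3))

pos <- order(mydata$value)    # 3 1 2 4: row positions, smallest value first
sorted <- mydata[pos, ]       # rows in the order geneD, geneB, geneA, geneC

# Largest value first
desc <- mydata[order(mydata$value, decreasing = TRUE), ]
desc$id                       # 2 3 1 4
```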
