3. Data structure
Vectors are one-dimensional arrays that hold values of a single mode (numeric, character, or logical).
a <- c(1, 2, 5, 3, 6, -2, 4)
b <- c("one", "two", "three")
c <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE)
a is a numeric vector,
b is a character vector, and
c is a logical vector.
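As a quick check of the three vectors above, the sketch below asks R for the class and length of each one. It reuses the slide's own variable names; nothing else is assumed.

```r
# Recreate the three vectors from the slide.
a <- c(1, 2, 5, 3, 6, -2, 4)
b <- c("one", "two", "three")
c <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE)

class(a)   # "numeric"
class(b)   # "character"
class(c)   # "logical"
length(a)  # 7
```

Naming a variable c works because R still finds the c() function when it is called, but a different name is usually easier to read.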
http://shakthydoss.com 3
4. Data structure
Scalars are one-element vectors.
f <- 3
g <- "US"
h <- TRUE
They’re used to hold constants.
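A minimal sketch showing that a scalar really is a vector of length one, using the variable f from the slide:

```r
f <- 3

is.vector(f)  # TRUE
length(f)     # 1
f[1]          # 3, the only element
```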
5. Data structure
The colon operator : generates a sequence of consecutive integers.
a <- c(1:5)
holds the same values as
a <- c(1, 2, 3, 4, 5)
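The two forms hold the same values, but the storage type differs. The sketch below compares them; the names x and y are chosen here only for illustration.

```r
x <- 1:5                 # built with the colon operator
y <- c(1, 2, 3, 4, 5)    # built element by element

all(x == y)      # TRUE: the values match
typeof(x)        # "integer"
typeof(y)        # "double"
identical(x, y)  # FALSE, because the storage types differ
```

For most everyday work the difference does not matter, since R converts between integer and double as needed.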
6. Data structure
Vector
You can refer to elements of a vector using a numeric vector of positions within
brackets.
Example
vec <- c("a", "b", "c", "d", "e", "f")
vec[1] # returns the first element of the vector
vec[c(2,4)] # returns the 2nd and 4th elements of the vector
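A few more indexing forms, sketched on the same vector. The range and negative-index lines are additions for illustration and are not part of the original slide.

```r
vec <- c("a", "b", "c", "d", "e", "f")

vec[1]        # "a"
vec[c(2, 4)]  # "b" "d"
vec[2:4]      # "b" "c" "d": a range of positions
vec[-1]       # every element except the first
```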
7. Data structure
Matrices
A matrix is a two-dimensional data structure in R.
All elements of a matrix must have the same mode (numeric, character, or logical).
Matrices are created with the matrix() function.
vector <- c(1,2,3,4)
foo <- matrix(vector, nrow=2, ncol=2)
8. Data structure
Matrices: byrow (optional parameter)
byrow=TRUE fills the matrix row by row.
byrow=FALSE (the default) fills the matrix column by column.
foo <- matrix(vector, nrow=2, ncol=2, byrow = TRUE)
foo <- matrix(vector, nrow=2, ncol=2, byrow = FALSE)
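To see the effect of byrow, the sketch below builds both matrices from the same four values and compares their first rows. The names values, by_row and by_col are illustrative.

```r
values <- c(1, 2, 3, 4)

by_row <- matrix(values, nrow = 2, ncol = 2, byrow = TRUE)
by_col <- matrix(values, nrow = 2, ncol = 2, byrow = FALSE)

by_row[1, ]  # 1 2: the first row is filled first
by_col[1, ]  # 1 3: the first column is filled first

all(t(by_row) == by_col)  # TRUE: one is the transpose of the other
```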
9. Data structure
Matrix elements can be accessed with subscripts inside brackets.
Example
mat <- matrix(c(1:4), nrow=2,ncol = 2)
mat[1,] # returns first row in the matrix.
mat[2,] # returns second row in the matrix.
mat[,1] # returns first column in the matrix.
mat[,2] # returns second column in the matrix.
mat[1,2] # returns the element in the first row, second column.
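The sketch below runs the same kind of subscripts and notes the values they return. The drop = FALSE line is an addition for illustration: it keeps the result as a matrix instead of simplifying it to a vector.

```r
mat <- matrix(1:4, nrow = 2, ncol = 2)  # filled column by column

mat[1, ]   # 1 3: first row
mat[, 2]   # 3 4: second column
mat[1, 2]  # 3: first row, second column

# Keep the matrix shape when selecting a single row.
dim(mat[1, , drop = FALSE])  # 1 2
```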
10. Data structure
Array
Arrays are similar to matrices but can have more than two dimensions.
Arrays are created with the array() function.
array(vector, dimensions, dimnames)
a <- matrix(c(1,1,1,1) , 2, 2)
b <- matrix(c(2,2,2,2) , 2, 2)
foo <- array(c(a,b), c(2,2,2))
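The dimnames argument can label each dimension. The sketch below is one way to use it; the labels r1, c1, layer1 and so on are hypothetical names chosen for illustration.

```r
a <- matrix(c(1, 1, 1, 1), 2, 2)
b <- matrix(c(2, 2, 2, 2), 2, 2)

# Stack the two matrices into a 2 x 2 x 2 array with labelled dimensions.
foo <- array(c(a, b), dim = c(2, 2, 2),
             dimnames = list(c("r1", "r2"),
                             c("c1", "c2"),
                             c("layer1", "layer2")))

dim(foo)             # 2 2 2
foo[, , "layer2"]    # the values of matrix b
foo["r1", "c1", ]    # one value from each layer: 1 and 2
```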
11. Data structure
Array
Array elements can be accessed in the same way as matrix elements, with one subscript per dimension.
foo[1,,] # returns all elements whose first subscript (row) is 1
foo[2,,] # returns all elements whose first subscript (row) is 2
foo[2,1,] # returns the elements at row 2, column 1 of every layer
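A short sketch of what each subscript position selects, using an array whose first layer is all 1s and whose second layer is all 2s (the same values as in the earlier array example):

```r
foo <- array(c(rep(1, 4), rep(2, 4)), dim = c(2, 2, 2))

foo[, , 1]   # the whole first layer (a 2 x 2 matrix of 1s)
foo[1, , ]   # row 1 of every layer, returned as a 2 x 2 matrix
foo[2, 1, ]  # 1 2: row 2, column 1 of each layer
```

The third subscript picks the layer, so foo[, , 1] is the form to use when a whole 2 x 2 matrix is wanted back.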
12. Data structure
Data frame
Data frames are the most commonly used data structure in R.
A data frame is like a matrix, but its columns can contain different
modes of data (numeric, character, etc.).
A data frame is created with the data.frame() function.
data.frame(col1, col2, col3, ...)
name <- c("joe", "jhon", "Nancy")
sex <- c("M", "M", "F")
age <- c(27,26,26)
foo <- data.frame(name,sex,age)
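The sketch below builds the same data frame and inspects its shape and column types. Setting stringsAsFactors = FALSE is an explicit choice made here so that the character columns stay character on every R version; the slide itself does not set it.

```r
name <- c("joe", "jhon", "Nancy")
sex  <- c("M", "M", "F")
age  <- c(27, 26, 26)

# Keep text columns as character rather than converting them to factors.
foo <- data.frame(name, sex, age, stringsAsFactors = FALSE)

dim(foo)         # 3 3: three rows, three columns
names(foo)       # "name" "sex" "age"
class(foo$name)  # "character"
class(foo$age)   # "numeric"
```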
13. Data structure
Data frame
Accessing data frame elements is straightforward. Columns can be
accessed by name with the $ operator.
Example
foo$name # returns the name vector in the data frame
foo$age # returns the age vector in the data frame
foo$age[2] # returns the second element of the age vector in the data frame
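Besides $, a data frame also accepts matrix-style subscripts. A minimal sketch, assuming a small data frame with the same name and age columns as the example; the row-filter line is an addition for illustration.

```r
foo <- data.frame(name = c("joe", "jhon", "Nancy"),
                  age = c(27, 26, 26),
                  stringsAsFactors = FALSE)

foo$age[2]           # 26
foo[["age"]]         # same vector as foo$age
foo[2, "name"]       # "jhon": row 2 of the name column
foo[foo$age > 26, ]  # only the rows where age is above 26
```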
14. Data structure
Factors
Categorical variables in R are called factors.
Status (Poor, Improved, Excellent) and Gender (Male, Female) are good
examples of categorical variables.
Factors are created using the factor() function.
gender <- c("Male", "Female", "Female", "Male")
status <- c("Poor", "Improved", "Excellent", "Poor", "Excellent")
factor_gender <- factor(gender) # factor_gender has two levels: Female and Male
factor_status <- factor(status) # factor_status has three levels: Excellent, Improved and Poor
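By default factor() sorts its levels alphabetically. For a variable such as status, where the categories have a natural order, the levels and ordered arguments can state that order explicitly. The sketch below shows both cases; the ordered version is an addition for illustration.

```r
gender <- c("Male", "Female", "Female", "Male")
status <- c("Poor", "Improved", "Excellent", "Poor", "Excellent")

factor_gender <- factor(gender)
levels(factor_gender)  # "Female" "Male": alphabetical by default

# State the order of the categories explicitly.
factor_status <- factor(status,
                        levels = c("Poor", "Improved", "Excellent"),
                        ordered = TRUE)
levels(factor_status)                # "Poor" "Improved" "Excellent"
factor_status[1] < factor_status[3]  # TRUE: Poor ranks below Excellent
table(factor_status)                 # counts per level
```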
15. Data structure
List
Lists are the most complex data structure in R.
A list may contain a combination of vectors, matrices, data frames, and even
other lists.
You create a list using the list() function.
vec <- c(1,2,3,4)
mat <- matrix(vec,2,2)
foo <- list(vec, mat)
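List elements are retrieved with double brackets or, when they are named, with $. The sketch below names the two elements numbers and grid; those names are hypothetical and not part of the original example.

```r
vec <- c(1, 2, 3, 4)
mat <- matrix(vec, 2, 2)

# Naming the elements is optional but makes access easier.
foo <- list(numbers = vec, grid = mat)

length(foo)    # 2
foo[[1]]       # the vector, by position
foo$grid       # the matrix, by name
class(foo[1])  # "list": single brackets return a sub-list
```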
16. Data Import/Export
Import Excel File
Quite frequently, the sample data is in Excel format, and needs to be
imported into R prior to use.
library(gdata) # load gdata package
help(read.xls) # documentation
mydata = read.xls("mydata.xls") # read from first sheet
18. Data Import/Export
Import Minitab File
If the data file is in Minitab Portable Worksheet format, it can be
opened with the function read.mtp from the foreign package. It returns
a list of components in the Minitab worksheet.
library(foreign) # load the foreign package
help(read.mtp) # documentation
mydata = read.mtp("mydata.mtp") # read from .mtp file
19. Data Import/Export
Import Table File
A data table can reside in a text file, with the cells of each row separated
by whitespace. Here is an example of a table with 4 rows and 3
columns.
100 a1 b1
200 a2 b2
300 a3 b3
400 a4 b4
help(read.table) #documentation
mydata = read.table("mydata.txt")
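To try this without an existing file, the sketch below writes the four example rows to a temporary file and reads them back. The temporary path stands in for mydata.txt and is an assumption of this example.

```r
# Write the sample table to a temporary file.
path <- tempfile(fileext = ".txt")
writeLines(c("100 a1 b1", "200 a2 b2", "300 a3 b3", "400 a4 b4"), path)

mydata <- read.table(path)

dim(mydata)    # 4 3
names(mydata)  # "V1" "V2" "V3": default names when there is no header
mydata$V1      # 100 200 300 400

unlink(path)   # remove the temporary file
```

If the file has a header line, pass header = TRUE so the first row becomes the column names.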
20. Data Import/Export
Import CSV File
The sample data can also be in comma separated values (CSV) format.
Each cell in such a data file is separated by a special character, which
is usually a comma.
help(read.csv) #documentation
mydata = read.csv("mydata.csv", sep=",")
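A self-contained sketch of the same call, using a temporary file in place of mydata.csv. The column names and values are made up for illustration.

```r
# Write a small CSV file with a header row.
path <- tempfile(fileext = ".csv")
writeLines(c("name,age", "joe,27", "nancy,26"), path)

mydata <- read.csv(path)  # header = TRUE and sep = "," are the defaults

names(mydata)  # "name" "age"
nrow(mydata)   # 2
mydata$age     # 27 26

unlink(path)   # remove the temporary file
```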
24. Data Import/Export
Every individual data value has a data type that tells us what sort of
value it is.
A. TRUE
B. FALSE
Answer A
25. Data Import/Export
What happens when you execute the code?
vec <- c(1,"hello",TRUE)
A. vec is assigned with multiple values.
B. Nothing happens.
C. ERROR
D. vec has only one value and that is TRUE.
Answer A (no error is raised; the values are coerced to the character vector "1", "hello", "TRUE")
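Running the statement shows what c() does with mixed types: it raises no error and converts every value to the most general mode present, which here is character.

```r
vec <- c(1, "hello", TRUE)

length(vec)  # 3
class(vec)   # "character"
vec          # "1" "hello" "TRUE"
```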
26. Data Import/Export
Which statement is TRUE?
A. A matrix is a three-dimensional collection of values that all have the same
type.
B. A factor can be used to represent a categorical variable.
C. A vector is a two-dimensional collection of values that can have multiple
modes (numeric, character, boolean).
D. A single data frame can hold at most 20GB of data.
Answer B
27. Data Import/Export
What is the most appropriate data structure for the dataset below?
Name Age Gender
Jhon 24 M
Joe 24 M
Nancy 25 F
A. Matrix
B. Data frame
C. Array
D. List
Answer B
28. Data Import/Export
Which function is used to create an array?
A. a(vector, dimensions, dimnames)
B. create(vector, dimensions, dimnames)
C. array(vector, dimensions, dimnames)
D. a(vector,dimensions)
Answer C