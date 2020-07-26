
Data manipulation

This presentation helps students understand how to subset variables, create new variables, fetch data, and work with the different parts of a dataset.

Published in: Education

  1. Data Manipulation. Mr. Rahul Tiwari, Assistant Professor, Department of Statistics, Ramniranjan Jhunjhunwala College, Ghatkopar (W), Mumbai-400086
  2. Today's Agenda
     ➢ Creating a new variable
     ➢ Merging/Appending new variables
     ➢ Subsetting a dataset
     ➢ Dropping variable(s)
     ➢ Removing rows/columns
     ➢ Renaming variable(s)
  3. Creating a new variable
     ➢ Suppose we want to create a new variable that is a sum, product, quotient, or other combination of two existing variables, e.g. 2*x + 5*y or x + y. Given a data frame a with numeric columns x and y, the syntax is as follows:
     ➢ a$add = a$x + a$y # adds a new column holding the sum of the two variables
     ➢ a # prints the data frame, including the new column
     ➢ a$multi = a$x * a$y # adds a new column holding the product of the two variables
     ➢ a
     ➢ Or View(a) # opens the data frame in a separate viewer tab
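The slide mentions the weighted combination 2*x + 5*y but does not show it in code. A minimal sketch, assuming a small illustrative data frame `a` with numeric columns `x` and `y` (the values are made up for this example):

```r
# Illustrative data frame; the values are assumptions for this sketch
a <- data.frame(x = 1:5, y = 11:15)

# Each assignment adds a new column computed row by row
a$add    <- a$x + a$y          # sum of x and y
a$multi  <- a$x * a$y          # product of x and y
a$linear <- 2 * a$x + 5 * a$y  # the weighted combination 2*x + 5*y

print(a)
```

The column name `linear` is only a placeholder; any valid name works on the left-hand side of the assignment.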
  4. Merging/Appending columns
     ➢ To merge two data frames (datasets) horizontally, use the merge function. In most cases, you join two data frames by one or more common key variables; by default, merge performs an inner join.
     ➢ a <- data.frame(id = letters[1:10], z = seq(10, 19)) # contains the variables id and z
     ➢ b <- data.frame(id = letters[1:10], z1 = seq(30, 39)) # contains the variables id and z1
     ➢ c <- merge(a, b, by = "id") # merges the two data frames on the common key variable id
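In the slide's example every `id` appears in both data frames, so the effect of an inner join is not visible. A small sketch, assuming two hypothetical data frames (`left` and `right`, not from the slides) whose `id` values only partly overlap:

```r
# Hypothetical data frames with partly overlapping keys
left  <- data.frame(id = c("a", "b", "c"), z  = c(10, 11, 12))
right <- data.frame(id = c("b", "c", "d"), z1 = c(31, 32, 33))

inner <- merge(left, right, by = "id")              # only ids present in both
outer <- merge(left, right, by = "id", all = TRUE)  # every id; gaps filled with NA

print(inner)
print(outer)
```

The `all = TRUE` argument switches from the default inner join to a full outer join; `all.x` and `all.y` give left and right joins respectively.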
  5. Merging/Appending rows
     ➢ To join two data frames (datasets) vertically, use the rbind function. The two data frames must have the same variables, but they do not have to be in the same order.
     ➢ a <- data.frame(id = letters[1:10], x = 1:10, y = 11:20) # contains the variables id, x and y
     ➢ b <- data.frame(id = letters[11:20], x = 11:20, y = 21:30) # same variables, different data
     ➢ c <- rbind(a, b) # stacks the rows of b below those of a
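The claim that the variables need not be in the same order can be checked directly, because rbind on data frames matches columns by name. A minimal sketch, assuming shortened versions of `a` and `b` with the columns of `b` deliberately reordered:

```r
a <- data.frame(id = letters[1:3], x = 1:3, y = 11:13)
b <- data.frame(y = 14:15, id = letters[4:5], x = 4:5)  # same columns, different order

stacked <- rbind(a, b)  # columns of b are matched to those of a by name
print(stacked)
```

The result keeps the column order of the first data frame.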
  6. MCQ: What will be the output of the following R code?
     > x <- 1:3
     > y <- 10:12
     > rbind(x, y)
     a)
       [,1] [,2] [,3]
     x    1    2    3
     y   10   11   12
     b)
       [,1] [,2] [,3]
     x    1    2    3
     y   10   11
     Answer: a
     Explanation: The rbind() function combines vectors, matrices, or data frames by rows.
  7. Subsetting a dataset
     ➢ Using the data frame a with columns x, y and add:
     ➢ mydat <- subset(a, add > 20, select = c("x", "y", "add")) # keeps rows with add > 20 and only the listed columns
     ➢ mydat
     ➢ write.csv(mydat, "new1.csv") # creates a new file, with row numbers as the first column
     ➢ write.csv(mydat, "new1.csv", row.names = FALSE) # use this form to leave out the row numbers
     ➢ Note: Open the CSV file, change whatever is needed, save with Ctrl+S, and read it back into R:
     ➢ read.csv("new1.csv")
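The subset, write, and read steps can be run end to end as a single script. A minimal sketch, assuming the data frame `a` from the earlier slides with its `add` column rebuilt here, and using `tempfile()` (an assumption of this example, not part of the slides) so that nothing is written into the working directory:

```r
a <- data.frame(id = letters[1:10], x = 1:10, y = 11:20)
a$add <- a$x + a$y

# Keep rows where add exceeds 20, and only the three listed columns
mydat <- subset(a, add > 20, select = c("x", "y", "add"))

# Write without row numbers, then read the file back
path <- tempfile(fileext = ".csv")
write.csv(mydat, path, row.names = FALSE)
reloaded <- read.csv(path)

print(reloaded)
```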
  8. Dropping variable(s)
     ➢ d <- data.frame(id = letters[1:10], x = 1:10, y = 11:20)
     ➢ d
     ➢ Suppose we want to keep only the first two variables. The syntax is:
     ➢ d[1:2] # drops the 3rd variable
     ➢ Now suppose we want to drop the 2nd column. The syntax is as follows:
     ➢ d[c(1, 3)] # displays the 1st and 3rd columns
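Positional indexes break silently if the column order changes, so dropping by name is often safer. A minimal sketch of two base R ways to do this, using the same data frame `d` (the result names `drop_y` and `drop_x` are placeholders for this example):

```r
d <- data.frame(id = letters[1:10], x = 1:10, y = 11:20)

keep_first_two <- d[1:2]                     # by position, as on the slide
drop_y         <- d[setdiff(names(d), "y")]  # by name: every column except y
drop_x         <- d[!(names(d) %in% "x")]    # logical index: every column except x

print(names(keep_first_two))
print(names(drop_y))
print(names(drop_x))
```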
  9. Removing rows/columns
     ➢ d <- data.frame(id = letters[1:10], x = 1:10, y = 11:20)
     ➢ d
     ➢ d[-1] # removes the 1st column
     ➢ d[-1, ] # removes the 1st row
     ➢ Now suppose we want to remove the 2nd column. The syntax is as follows:
     ➢ d[-2] # displays the 1st and 3rd columns
     ➢ To remove several columns at once, pass a vector of negative indexes:
     ➢ d[-c(1, 3)] # removes the 1st and 3rd columns, leaving only x
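The difference between `d[-1]` and `d[-1, ]` is whether the index addresses columns or rows. A short sketch that checks the resulting shapes, and also shows the `drop = FALSE` argument, which matters when matrix-style indexing leaves a single column (the variable names are placeholders for this example):

```r
d <- data.frame(id = letters[1:10], x = 1:10, y = 11:20)

no_first_col <- d[-1]                        # drops column 1, result is a data frame
no_first_row <- d[-1, ]                      # drops row 1, keeps all columns
only_x_vec   <- d[, -c(1, 3)]                # one column left: simplified to a vector
only_x_df    <- d[, -c(1, 3), drop = FALSE]  # drop = FALSE keeps it a data frame

print(dim(no_first_col))
print(dim(no_first_row))
print(is.data.frame(only_x_vec))
print(is.data.frame(only_x_df))
```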
  10. MCQ: Which of the following extracts the first element from the following R vector?
      > x <- c("a", "b", "c", "c", "d", "a")
      a) x[10]  b) x[1]  c) x[0]  d) x[2]
      Answer: b
      Explanation: An element is extracted with the form variable[index], and R indexes vectors starting from 1.
  11. Renaming variable(s)
      ➢ One way to rename variables is the rename function from the dplyr package, which must be installed first.
      ➢ install.packages("dplyr") # install the dplyr package
      ➢ library(dplyr) # load the dplyr package
      ➢ d <- data.frame(id = letters[1:10], x = 1:10, y = 11:20)
      ➢ d
      ➢ We want to change variable x to x1 and y to x2, so the syntax is as follows:
      ➢ rename(d, x1 = x, x2 = y) # new name = old name; assign the result (d <- rename(...)) to keep it
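If installing a package is not an option, the same renaming can be done in base R by assigning to `names()`. This is an alternative technique, not the dplyr approach from the slide; a minimal sketch on the same data frame `d`:

```r
d <- data.frame(id = letters[1:10], x = 1:10, y = 11:20)

# Select each column name by its old value and overwrite it in place
names(d)[names(d) == "x"] <- "x1"
names(d)[names(d) == "y"] <- "x2"

print(names(d))
```

Unlike `rename()`, this modifies `d` directly, so no reassignment is needed.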

