Moving Data to and From R


Part of an advanced analytics course.

Published in: Education, Technology

  1. Advanced Data Analytics: Moving Data Around
     Jeffrey Stanton, School of Information Studies, Syracuse University
  2. R and the File System
     • R maintains a current working directory to simplify the process of reading and saving files
         getwd()            # Shows the pathname of the current folder
         setwd("pathname")  # Sets a new path
         history()          # Shows the most recent commands
         # Creates a CSV file using data from a data frame
         write.table(dataFr, sep=",", file="filename.csv")
         # Reads a CSV file into a data frame
         targetFrame = read.table("filename.csv", sep=",")
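The write-then-read round trip on this slide can be tried end to end with a small sketch like the one below. The sample values and the `csvPath` variable are illustrative assumptions, not part of the original slides, and a temporary file is used so the working directory is left untouched. The sketch also passes `row.names = FALSE` and `header = TRUE`, which the slide does not use, so that the file holds only a header line and the data columns.

```r
# Hypothetical sample data; the values are made up for illustration.
dataFr <- data.frame(Subject = 1:6, Code = c(1, 0, 1, 0, 1, 0))

# Write to a temporary file rather than the current working directory.
csvPath <- tempfile(fileext = ".csv")

# row.names = FALSE keeps R's row labels out of the file, so the header
# line has the same number of fields as each data line.
write.table(dataFr, sep = ",", file = csvPath, row.names = FALSE)

# header = TRUE tells read.table that the first line holds column names.
targetFrame <- read.table(csvPath, sep = ",", header = TRUE)

print(targetFrame)
```

Leaving out `row.names = FALSE` also works for an R-to-R round trip, but the resulting file has one fewer field in its header line than in its data lines, which other programs may not interpret as intended.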
  3. R and the Windows Clipboard
     • For small chunks of data, it may be convenient to “cut and paste”
     • Create a small rectangle of data in Excel and copy it to the clipboard
     • Then, in R:
         > read.DIF("clipboard",transpose=TRUE)
           V1 V2
         1  1  1
         2  2  0
         3  3  1
         4  4  0
         5  5  1
         6  6  0
  4. Include Variable Names
     • You can pull in the variable names (the column headings) as well
     • Then, in R:
         > read.DIF("clipboard",transpose=TRUE,header=TRUE)
           Subject Code
         1       1    1
         2       2    0
         3       3    1
         4       4    0
         5       5    1
         6       6    0
  5. Best Option: Put Clipboard into Dataframe
         > newDF = read.DIF("clipboard",transpose=TRUE,header=TRUE)
         > newDF
           Subject Code
         1       1    1
         2       2    0
         3       3    1
         4       4    0
         5       5    1
         6       6    0
         > class(newDF)
         [1] "data.frame"
  6. An Explanation of Data Frames
     • The basic building block of data in R is a “vector”: an ordered sequence of “scalar” values, all of the same mode
       – Scalar just means a single element or value, like the number 5
       – R vectors can hold any number of elements, including just one; R has no separate scalar type, so a scalar is stored as a vector of length one
       – The mode of a vector is most commonly numeric, character, or logical
     • As in Excel spreadsheets and other data programs such as SPSS, data in R can be two-dimensional, with a certain number of columns and a certain number of rows; a two-dimensional vector is called a matrix
     • But, being a vector, a matrix has to contain elements all of the same mode, so a matrix cannot always hold a typical spreadsheet or data set, because these often have a different type in each column
     • This is where the data frame comes in: a data frame is a list of vectors, all of the same length, each of which can be a different type
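The distinctions on this slide can be checked at the R console. The following is a minimal sketch; all variable names and values are illustrative assumptions rather than anything from the original slides.

```r
# A "scalar" is simply a vector of length one.
scalar <- 5
length(scalar)                       # 1

# A matrix is a two-dimensional vector, so every element shares one mode.
m <- matrix(1:6, nrow = 3, ncol = 2)

# Mixing numbers and text in a matrix coerces everything to character.
mixed <- cbind(1:3, c("a", "b", "c"))
mode(mixed)                          # "character"

# A data frame is a list of equal-length vectors, one per column,
# and each column keeps its own type.
df <- data.frame(
  Subject = 1:3,
  Name    = c("a", "b", "c"),
  Passed  = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE           # keep Name as character in any R version
)
colTypes <- sapply(df, class)
print(colTypes)
```

The `stringsAsFactors = FALSE` argument is there because older versions of R converted character columns to factors by default.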
  7. read.DIF also works with files
         > setwd("C:/DataMining/DataFiles")
         > newDF = read.DIF("excelExport.dif", transpose=TRUE, header=TRUE)
         > class(newDF)
         [1] "data.frame"
         > attach(newDF)
         # Note that Excel, DIF, and R don’t always agree on data formats.
         # For example, currency in Excel will not export to integer values
         # in R, so remove as much formatting as possible.
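The slide recommends removing formatting in Excel before exporting. If currency-formatted values have already arrived in R as text, one possible cleanup is sketched below. The `price` values are made up for illustration, and this assumes the only formatting present is a dollar sign and comma thousands separators.

```r
# Hypothetical currency strings, as they might look after an import
# that preserved Excel's display formatting.
price <- c("$1,200.50", "$35.00", "$7.25")

# Strip dollar signs and thousands separators, then convert to numeric.
cleaned <- as.numeric(gsub("[$,]", "", price))

print(cleaned)
```

Other formats, such as parentheses for negative amounts or different currency symbols, would need their own handling.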
  8. Demonstrating Mastery
     • Create or find data in an Excel spreadsheet and export as a CSV file
     • Import data into R from a CSV or TXT file
     • Export a data frame into a CSV file
     • Read the CSV file into Excel
     • Advanced: Use data interchange format (“DIF”) to exchange files between R and Excel
     • Advanced: Use a data frame in R to store data obtained from a spreadsheet