2. Working with Excel files
R offers several packages that read, write and manipulate Excel files
directly, without first converting them to another format.
Some of the commonly used packages are:
- XLConnect - uses rJava, a low-level R to Java interface
- openxlsx - uses C++ dependencies instead of rJava (Java)
- gdata - with Perl dependencies
- readxl and xlsx packages
Let’s learn each of them in detail.
Rupak Roy
3. XLCONNECT
- XLConnect is a connector for R that provides comprehensive
functionality to read, write and format Excel data.
- Import functions include:
loadWorkbook()
readWorksheet()
readWorksheetFromFile()
- Export functions include:
createSheet()
writeWorksheet()
saveWorkbook()
4. XLCONNECT:loadWorkbook()
loadWorkbook(): Loads or creates a Microsoft Excel workbook in R
for further manipulation.
>loadWorkbook(filename, create = FALSE, password = NULL)
Where
filename = Excel workbook to be loaded
create = Specifies whether the file should be created if it does not already
exist (default is FALSE)
password = Password to use when opening password-protected files.
The default NULL means no password is used. This
argument is ignored when creating new files using create = TRUE.
5. XLCONNECT:loadWorkbook()
#install the XLConnect package
>install.packages("XLConnect", dependencies = TRUE)
#load the functions from the XLConnect package
>library(XLConnect)
#load the Excel file into a workbook object
>xlsx_data <- loadWorkbook("sample.xlsx")
#check the class of the loaded object
>class(xlsx_data)
To learn more about the features of loadWorkbook() use
>?XLConnect::loadWorkbook
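The create argument described earlier is not exercised in the example above. The sketch below shows one way it might be used. The file name new_workbook.xlsx and the temporary directory are assumptions made for illustration, not part of the original material.

```r
library(XLConnect)

# Hypothetical path in a temporary directory (illustrative only)
new_file <- file.path(tempdir(), "new_workbook.xlsx")

# With create = TRUE, loadWorkbook() returns a workbook object even when
# the file does not exist yet; with the default create = FALSE a missing
# file would raise an error instead.
wb <- loadWorkbook(new_file, create = TRUE)

# Inspect the class of the returned object
class(wb)
```

The resulting object can then be passed to createSheet(), writeWorksheet() and saveWorkbook() like any workbook loaded from an existing file.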
6. XLCONNECT:readWorksheet ()
readWorksheet(): Reads data from a worksheet of a workbook loaded with loadWorkbook().
>worksheet1 <- readWorksheet(object, sheet, startRow, startCol, endRow, endCol, header = TRUE, ...)
Where
object = the workbook object returned by loadWorkbook()
sheet = name or index of the sheet to read
startRow = The index of the first row to read from. Defaults to 0, meaning that
the start row is determined automatically.
startCol = The index of the first column to read from. Defaults to 0, meaning that
the start column is determined automatically.
endRow = The index of the last row to read from. Defaults to 0, meaning that the
end row is determined automatically.
endCol = The index of the last column to read from. Defaults to 0, meaning that
the end column is determined automatically.
header = Specifies whether the first row read is treated as column names (default is TRUE).
7. XLCONNECT:readWorksheet()
#install the XLConnect package
>install.packages("XLConnect", dependencies = TRUE)
#load the functions from the XLConnect package
>library(XLConnect)
#Read the 1st Excel sheet from the xlsx_data R object, i.e. the sample.xlsx file
>excel_data <- readWorksheet(xlsx_data, "store", header = TRUE)
>View(excel_data)
#Read the 2nd Excel sheet from the xlsx_data R object, i.e. the sample.xlsx file
>excel_data2 <- readWorksheet(xlsx_data, "bike_sharing_program", endRow = 10,
startCol = 3, header = TRUE)
>View(excel_data2)
To learn more about the features of readWorksheet() use
>?XLConnect::readWorksheet
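To make the effect of startRow, endRow, startCol and endCol concrete, here is a minimal sketch that builds a small in-memory workbook and reads only part of it. The sheet name demo, the data frame demo_df and the file name are hypothetical and exist only so the example does not depend on sample.xlsx.

```r
library(XLConnect)

# Throwaway workbook in a temporary directory (illustrative only)
wb <- loadWorkbook(file.path(tempdir(), "region_demo.xlsx"), create = TRUE)
createSheet(wb, "demo")

# Hypothetical data: the header lands in row 1, the data in rows 2 to 6
demo_df <- data.frame(
  id    = 1:5,
  city  = c("A", "B", "C", "D", "E"),
  sales = c(10, 20, 30, 40, 50)
)
writeWorksheet(wb, demo_df, sheet = "demo")

# Read rows 1 to 3 (header plus two data rows) and columns 2 to 3 only
part <- readWorksheet(wb, sheet = "demo",
                      startRow = 1, endRow = 3,
                      startCol = 2, endCol = 3,
                      header = TRUE)
print(part)
```

Leaving any of the four bounds at 0 lets XLConnect determine that edge of the region automatically, as in the bike_sharing_program example above.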
8. XLCONNECT:readWorksheetFromFile()
readWorksheetFromFile(): Reads data from a worksheet directly from an
Excel file on disk.
>worksheet3 <- readWorksheetFromFile(file, sheet, startRow, startCol, endRow, endCol,
header = TRUE, ...)   #same arguments as readWorksheet()
Where
file = name of the Excel file to be read
sheet = name or index of the sheet to read
startRow = The index of the first row to read from. Defaults to 0, meaning that
the start row is determined automatically.
startCol = The index of the first column to read from. Defaults to 0, meaning that
the start column is determined automatically.
endRow = The index of the last row to read from. Defaults to 0, meaning that the
end row is determined automatically.
endCol = The index of the last column to read from. Defaults to 0, meaning that
the end column is determined automatically.
9. XLCONNECT:readWorksheetFromFile()
#install the XLConnect package
>install.packages("XLConnect", dependencies = TRUE)
#load the functions from the XLConnect package
>library(XLConnect)
#Read the Excel sheet directly from an Excel file
>excel_data3 <- readWorksheetFromFile("sample.xlsx", "store", header = TRUE)
>View(excel_data3)
XLConnect::readWorksheetFromFile - the only difference between
readWorksheet() and readWorksheetFromFile() is that readWorksheet() needs the
Excel file to be loaded into R first with loadWorkbook(), whereas
readWorksheetFromFile() reads the sheet directly from the file on disk.
To learn more about the features of readWorksheetFromFile() use
>?XLConnect::readWorksheetFromFile
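The difference between the two approaches can be checked side by side. The following is a sketch under assumptions: the file compare_demo.xlsx and its contents are made up for the example, and the file is written to a temporary directory first so that both read paths have something to open.

```r
library(XLConnect)

# Create a small Excel file on disk (hypothetical name and data)
path <- file.path(tempdir(), "compare_demo.xlsx")
wb <- loadWorkbook(path, create = TRUE)
createSheet(wb, "store")
writeWorksheet(wb,
               data.frame(item = c("pen", "book"), qty = c(3, 7)),
               sheet = "store")
saveWorkbook(wb)

# Two-step approach: load the workbook, then read a sheet from it
wb2 <- loadWorkbook(path)
two_step <- readWorksheet(wb2, sheet = "store")

# One-step approach: read the sheet straight from the file
one_step <- readWorksheetFromFile(path, sheet = "store")

# Both approaches should return the same data frame
all.equal(two_step, one_step)
```

The two-step form is usually the better fit when several sheets of the same workbook are read or modified, since the file is opened only once.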
10. XLCONNECT:createSheet()
createSheet(): Creates a new worksheet in a workbook loaded via
loadWorkbook().
>createSheet(object, name)
Where
object = the workbook to use
name = name of the sheet to create
11. XLCONNECT:createSheet()
#install the XLConnect package
>install.packages("XLConnect", dependencies = TRUE)
#load the functions from the XLConnect package
>library(XLConnect)
#Create a new empty Excel sheet in the workbook
>createSheet(xlsx_data, "new_sheet")
XLConnect::createSheet() - Creates a worksheet with the specified name if it
does not already exist. The worksheet name must follow Excel's naming
conventions, otherwise an exception is thrown. For example,
worksheet names cannot be longer than 31 characters.
To learn more about the features of createSheet() use
>?XLConnect::createSheet
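The naming rule and the "if it does not already exist" behaviour can be made explicit before calling createSheet(). This is a minimal sketch: the workbook file name is hypothetical, and the length check only covers the 31-character limit mentioned above, not every Excel naming rule.

```r
library(XLConnect)

# Throwaway workbook (illustrative file name)
wb <- loadWorkbook(file.path(tempdir(), "sheets_demo.xlsx"), create = TRUE)

sheet_name <- "new_sheet"

# Guard against the 31-character limit before asking Excel to accept the name
stopifnot(nchar(sheet_name) <= 31)

# existsSheet() reports whether the sheet is already present
if (!existsSheet(wb, sheet_name)) {
  createSheet(wb, sheet_name)
}

# List all sheet names currently in the workbook
getSheets(wb)
```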
12. XLCONNECT:writeWorksheet()
writeWorksheet(): Writes data to a worksheet of a workbook loaded via
loadWorkbook().
>writeWorksheet(object, data, sheet = "sheet_name")
Where
object = the workbook to write to
data = data to be written
sheet = The name or index of the sheet to write to
startRow = Index of the first row to write to. The default is startRow = 1
startCol = Index of the first column to write to. The default is startCol = 1
header = Specifies whether the column names should be written. The default is TRUE.
13. XLCONNECT:writeWorksheet()
#install the XLConnect package
>install.packages("XLConnect", dependencies = TRUE)
#load the functions from the XLConnect package
>library(XLConnect)
#Write the data read earlier from the bike_sharing_program sheet (excel_data2)
#to the new sheet of the workbook
>writeWorksheet(xlsx_data, excel_data2, sheet = "new_sheet")
XLConnect::writeWorksheet() - Writes data to the worksheet specified
by sheet. The data is assumed to be a data.frame and is coerced to one if this
is not already the case. startRow and startCol define the top left corner of the
data region to be written.
To learn more about the features of writeWorksheet() use
>?XLConnect::writeWorksheet
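As an illustration of how startRow and startCol position the data, the sketch below writes a small data frame with its top left corner at row 3, column 2 and reads it back. The sheet name report, the data frame sales_df and the file name are assumptions for the example only.

```r
library(XLConnect)

# Throwaway workbook (illustrative file name)
wb <- loadWorkbook(file.path(tempdir(), "write_demo.xlsx"), create = TRUE)
createSheet(wb, "report")

# Hypothetical data to write
sales_df <- data.frame(month = c("Jan", "Feb"), units = c(12, 15))

# Place the header at row 3, column 2 (cell B3); data rows follow below it
writeWorksheet(wb, sales_df, sheet = "report",
               startRow = 3, startCol = 2, header = TRUE)

# Reading back with the default bounds (0) lets XLConnect locate the region
back <- readWorksheet(wb, sheet = "report")
print(back)
```

Note that the changes live only in the workbook object at this point; they reach the Excel file once saveWorkbook() is called.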
14. XLCONNECT:saveWorkbook()
saveWorkbook(): Saves a workbook to the corresponding Excel file. This
method actually writes the workbook object to disk.
>saveWorkbook(object, file)
Where
object = the workbook to save
file = The file to which the workbook will be saved ("save as")
#Save the xlsx_data workbook under a new file name
>saveWorkbook(xlsx_data, "document1.xlsx")
To learn more about the saveWorkbook() function use
>?XLConnect::saveWorkbook
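The "save as" use of the file argument can be sketched as follows. The file names original.xlsx and document1_copy.xlsx, the sheet name data and the data frame are hypothetical, and everything is written to a temporary directory.

```r
library(XLConnect)

# Hypothetical source and target paths
src  <- file.path(tempdir(), "original.xlsx")
copy <- file.path(tempdir(), "document1_copy.xlsx")

# Build a workbook associated with src, but never save it there
wb <- loadWorkbook(src, create = TRUE)
createSheet(wb, "data")
writeWorksheet(wb, data.frame(x = 1:3, y = c("a", "b", "c")), sheet = "data")

# Passing file writes the workbook to a different location ("save as")
saveWorkbook(wb, copy)

# Confirm the file is on disk and can be read back
file.exists(copy)
saved <- readWorksheetFromFile(copy, sheet = "data")
print(saved)
```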