I/O
Day 2 - Introduction to R for Life Sciences
Before input and output: folders
Find out where you are: getwd()
Go elsewhere: setwd("S://SeqData/Illumina/14apr2014")
Convenience: file.choose() (all platforms) and choose.dir() (Windows only)
Make sure your scripts and ‘source data’ are backed up
Derived data should not be backed up
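A minimal sketch of moving between folders, assuming a throw-away folder under tempdir() (the name "io_demo" is made up for illustration). It stores the starting folder first so it can be restored afterwards:

```r
# Remember the current working directory so it can be restored later
old.dir <- getwd()

# Hypothetical project folder, created inside the session's temporary directory
project.dir <- file.path(tempdir(), "io_demo")
dir.create(project.dir, showWarnings = FALSE)

setwd(project.dir)   # go elsewhere
here <- getwd()      # find out where you are now
setwd(old.dir)       # go back to where you started
```

Building paths with file.path() avoids typing separators by hand.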
Input - formats
.RData - data in binary form, as produced by
save.image(file='Foxo.rda') # 'workspace'
Similar:
save(table1, table2, pvalues, file="mytables.rda") ↔ load("mytables.rda")
application-specific data: special libraries
(e.g.: XML, JSON, .bam, .bed, .gff, .bw. Also Excel)
tab-delimited data
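A small round trip with save() and load() might look as follows. The objects and their values are made up, and a temporary file stands in for "mytables.rda":

```r
# Two small example objects (made-up values)
table1  <- data.frame(gene = c("geneA", "geneB"), length = c(1200, 850))
pvalues <- c(0.01, 0.20)

# Write both objects to a temporary .rda file
rda.file <- tempfile(fileext = ".rda")
save(table1, pvalues, file = rda.file)

# Remove them from the workspace, then restore them from the file
rm(table1, pvalues)
restored <- load(rda.file)   # load() returns the names of the restored objects
```

Note that load() recreates the objects under their original names and silently overwrites existing objects with the same names.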
Tab-delimited input
function read.table()
→ read.delim(), read.delim2(), read.csv(), read.csv2()
Have different defaults but all return a data.frame
common arguments: file, header, sep, quote, row/col.names,
stringsAsFactors
file can be a URL!
for tab-delimited data, sep is "\t"
> SGD <- read.table("SGD.txt", sep="\t", header=TRUE, row.names=1)
> SGD <- read.delim("SGD.txt", row.names=1) ## does the same thing!
Put the following in "C:/Users/YourName/Documents/.Rprofile":
options(stringsAsFactors = FALSE) # already the default since R 4.0.0
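To try both calls without the real SGD.txt, here is a sketch that first writes a tiny tab-delimited file. The file contents and identifiers are invented for illustration:

```r
# Write a tiny tab-delimited file (made-up contents) to a temporary location
sgd.file <- tempfile(fileext = ".txt")
writeLines(c("id\tgene\tlength",
             "g1\tgeneA\t3573",
             "g2\tgeneB\t3825"),
           sgd.file)

# Explicit version: tab separator, header line, first column used as row names
a <- read.table(sgd.file, sep = "\t", header = TRUE, row.names = 1,
                stringsAsFactors = FALSE)

# Shortcut: read.delim() already assumes sep = "\t" and header = TRUE
b <- read.delim(sgd.file, row.names = 1, stringsAsFactors = FALSE)

str(a)   # check the column types straight away
```

For this simple file both calls give the same data.frame. On messier files they can differ, because the quote, fill and comment.char defaults are not the same.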
Data types
Sometimes the data type is wrong:
> mean( c("-0.82", "1.12", "-0.39") ) # note the quotes
[1] NA
Warning message:
In mean.default(c("-0.82", "1.12", "-0.39")) :
argument is not numeric or logical: returning NA
Sometimes this doesn’t matter:
> paste(1,2,3, sep=",")
[1] "1,2,3"
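When numbers have ended up as text, converting them explicitly before the calculation is one way out. A short sketch using the same values as in the mean() example:

```r
x <- c("-0.82", "1.12", "-0.39")   # numbers stored as text
class(x)                            # shows that x is a character vector

m <- mean(as.numeric(x))            # convert first, then average
```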
Type conversion
Automatic conversion ('coercion'):
sum( c(TRUE, FALSE, TRUE) ) => 2
Explicit conversion:
as.numeric(); as.logical(); as.character(); as.matrix(), as.factor(), …
Checking the type:
is.numeric(); is.logical(); is.character(); is.matrix(), is.factor(), …
Special cases:
is.null()
is.na() # Example: x[ is.na(x) ] <- 0 # or, to drop the NAs: x <- x[ ! is.na(x) ]
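The conversion and NA handling can be sketched together as follows. The input values are made up, and "n/a" stands for any entry that is not a number:

```r
x <- c("1.5", "n/a", "3")                # made-up values; "n/a" is not a number
num <- suppressWarnings(as.numeric(x))   # "n/a" becomes NA (normally with a warning)
is.missing <- is.na(num)                 # TRUE where the conversion failed

zeroed <- num
zeroed[is.missing] <- 0                  # option 1: replace missing values by 0
kept <- num[!is.missing]                 # option 2: drop the missing values

total <- sum(c(TRUE, FALSE, TRUE))       # coercion: TRUE counts as 1, FALSE as 0
```

Which option is appropriate depends on the analysis: replacing by 0 changes means and sums, dropping changes the length.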
Selecting data from data.frame
Index can be vector of numbers, logicals, names
Notation: some.frame[myrows, mycolumns] # as for matrix
But also: some.frame$geneName # for a particular column
some.frame[ , my.col ] # if the column name(s) are stored in a variable
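The three kinds of index can be tried on a small frame. This is only a sketch, and the frame contents are invented:

```r
some.frame <- data.frame(geneName = c("geneA", "geneB", "geneC"),
                         length   = c(1200, 850, 2300),
                         pvalue   = c(0.01, 0.20, 0.03),
                         stringsAsFactors = FALSE)
rownames(some.frame) <- some.frame$geneName

by.number  <- some.frame[c(1, 3), ]                       # rows 1 and 3, all columns
by.logical <- some.frame[some.frame$pvalue < 0.05, ]      # rows where the test is TRUE
by.name    <- some.frame["geneB", c("length", "pvalue")]  # row and columns by name

my.col  <- "length"                # column name held in a variable
one.col <- some.frame[, my.col]    # a single column comes back as a plain vector
```

Leaving the row or column position empty, as in some.frame[, my.col], selects everything along that dimension.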
Checking data.frames
Overview:
str(fr) # pay attention to the types!
Size:
dim(fr) # rows, then columns (as for matrices)
Distinct values:
unique(fr$type) # also consider length(unique(fr$type))
Arithmetic:
max(fr$length) # also: min, mean, sd, var, median, sum
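Applied to a small made-up frame called fr, these checks might look like this:

```r
fr <- data.frame(type   = c("ORF", "tRNA", "ORF", "ORF"),
                 length = c(1200, 72, 850, 2300),
                 stringsAsFactors = FALSE)

str(fr)                    # overview: one line per column, with its type

size    <- dim(fr)         # number of rows, then number of columns
kinds   <- unique(fr$type) # each distinct value once, in order of appearance
n.kinds <- length(kinds)   # how many distinct values there are
longest <- max(fr$length)  # largest value in the column
```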
Creating and extending data.frames
New frame:
f <- data.frame(gene.names, p.values)
Adding columns to frame:
f$status <- new.status
Adding rows to frame:
f <- rbind(f, list(genes2, pval2))
f <- rbind(f, another.data.frame)
There is no separate "delete" command: select the rows or columns to keep (negative indices drop items), or assign NULL to a column.
Names and types must match!
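A sketch of building a frame and extending it, with invented gene names and values. Rows and columns are dropped here by selecting what to keep:

```r
gene.names <- c("geneA", "geneB")
p.values   <- c(0.01, 0.20)

# New frame: columns are named after the vectors
f <- data.frame(gene.names, p.values, stringsAsFactors = FALSE)

# Add a column (must have one value per row)
f$status <- c("up", "unchanged")

# Add rows from another frame with the same column names and types
extra <- data.frame(gene.names = "geneC", p.values = 0.03, status = "down",
                    stringsAsFactors = FALSE)
f <- rbind(f, extra)

# Keep everything except the first row and the status column
f.small <- f[-1, c("gene.names", "p.values")]
```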
I/O Caveats
Single or double quotes as part of strings
Comment-characters as part of strings
Spaces instead of tabs
Carriage-returns (Mac/Windows/Linux)
Duplicates in row or column names
Always check the number and names of rows and columns and their types!
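One way to guard against the quote and comment-character problems is to switch both features off while reading. A sketch under the assumption that the file really is tab-delimited; the file contents are made up:

```r
# Made-up file with apostrophes and a '#' inside the fields
raw.lines <- c("id\tname\tnote",
               "1\tO'Brien lab\tsample #3",
               "2\t5' UTR\tok")
txt.file <- tempfile(fileext = ".txt")
writeLines(raw.lines, txt.file)

# quote = "" and comment.char = "" make R treat ' " and # as ordinary characters
safe <- read.table(txt.file, sep = "\t", header = TRUE,
                   quote = "", comment.char = "",
                   stringsAsFactors = FALSE)

# Always check: size, names, duplicates and types
dim(safe)
n.dup.names <- anyDuplicated(names(safe))   # 0 means no duplicated column names
str(safe)
```

With the read.table() defaults, the apostrophes and the '#' in this file can cause fields or whole lines to be misread, which is why the checks afterwards matter.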
Duplicate values
> v <- c("a", "b", "c", "d", "d", "e", "f", "a", "g", "a")
> duplicated(v)
[1] FALSE FALSE FALSE FALSE TRUE FALSE FALSE TRUE FALSE TRUE
> v[ duplicated(v) ]
[1] "d" "a" "a"
# sum(duplicated(v)) → 3
> v[ ! duplicated(v) ]
[1] "a" "b" "c" "d" "e" "f" "g" # same as unique(v)
Tab-delimited output
write.table() with arguments similar to read.table().
To get an empty top-left cell, use col.names=NA
Again, check the results.
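A sketch of writing a frame with row names and reading it back to check the result. The frame contents are invented, and a temporary file is used as the target:

```r
fr <- data.frame(length = c(1200, 850),
                 pvalue = c(0.01, 0.20),
                 row.names = c("geneA", "geneB"))

out.file <- tempfile(fileext = ".txt")

# col.names = NA adds an empty header cell above the row names
write.table(fr, out.file, sep = "\t", quote = FALSE, col.names = NA)

header <- readLines(out.file)[1]              # first line of the file
back   <- read.delim(out.file, row.names = 1) # read it back to check
str(back)
```

The empty top-left cell keeps the header aligned with the data columns, so spreadsheet programs and read.delim(..., row.names = 1) both interpret the file as intended.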
