Introduction to the R language
http://www.allsoftsolutions.in
Acknowledgments
Workshop materials developed with
• Robert Gentleman, Harvard
• Wolfgang Huber, German Cancer Research Center
• Sandrine Dudoit, UC Berkeley
Bioconductor core developers include
• Vince Carey, Harvard
• Yongchao Ge, Mount Sinai School of Medicine
• Robert Gentleman, Harvard
• Jeff Gentry, Dana-Farber Cancer Institute
• Rafael Irizarry, Johns Hopkins
• Yee Hwa (Jean) Yang, UCSF
• Jianhua (John) Zhang, Dana-Farber Cancer Institute
• Sandrine Dudoit, UC Berkeley
Websites
• Bioconductor www.bioconductor.org
– software, data, and documentation;
– training materials from short courses;
www.bioconductor.org/workshops/UCSC03/ucsc03.html
– mailing list.
• R www.r-project.org
– software;
– documentation;
– R News.
R language: Overview
• Open source and open development.
• Design and deployment of portable, extensible,
and scalable software.
• Interoperability with other languages: C, XML.
• Variety of statistical and numerical methods.
• High quality visualization and graphics tools.
• Effective, extensible user interface.
• Innovative tools for producing documentation
and training materials: vignettes.
• Supports the creation, testing, and distribution of
software and data modules: packages.
Object Oriented Programming (OOP)
Class
• software abstraction of a real-world object.
• reflects how we think of objects and what information they contain.
• defined in terms of slots.
• an object is an instance of a class.
• defines the structure, inheritance, and initialization of objects.
Method
• function that performs an action on data (objects).
• defines how a particular function should behave depending on the class of its arguments.
• allows computations to be adapted to particular classes.
• A generic function is a dispatcher.
J. M. Chambers (1998). Programming with Data.
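The class, slot, generic and method ideas above can be sketched with the S4 tools in the methods package. The class name Spot, its slots, and the generic describe are hypothetical names chosen for illustration; they are not part of Bioconductor.

```r
library(methods)

# Hypothetical class: a microarray spot with two slots
setClass("Spot", representation(gene = "character", intensity = "numeric"))

# Generic function: the dispatcher that selects a method by argument class
setGeneric("describe", function(object) standardGeneric("describe"))

# Method: what describe() does when given a Spot
setMethod("describe", "Spot", function(object) {
  paste(object@gene, "has intensity", object@intensity)
})

# An object is an instance of a class; slots are read with @
s <- new("Spot", gene = "CENP-F", intensity = 812.5)
msg <- describe(s)
print(msg)
```

Calling describe() on an object of another class would fail until a method for that class is defined, which is how computations are adapted to particular classes.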
R user interface
• Batch or command line processing
bash$ R to start
R> q() to quit
• Graphics windows
> X11()
> postscript()
> dev.off()
• File path is relative to working directory
> getwd()
> setwd()
• Load a package with library()
• GUIs, tcltk
Getting Help
o Details about a specific command whose name
you know (input arguments, options, algorithm):
> ? t.test
> help(t.test)
o See an example of usage:
> demo(graphics)
> example(mean)
mean> x <- c(0:10, 50)
mean> xm <- mean(x)
mean> c(xm, mean(x, trim = 0.1))
[1] 8.75 5.50
Getting Help
o HTML search engine lets you search for topics
with regular expressions:
> help.search
o Find commands containing a regular expression
or object name:
> apropos("var")
[1] "var.na" ".__M__varLabels:Biobase"
[3] "varLabels" "var.test"
[5] "varimax" "all.vars"
[7] "var" "variable.names"
[9] "variable.names.default" "variable.names.lm"
Getting Help
o Vignettes contain text and executable code:
> library(tkWidgets)
> vExplorer()
> openVignette()
Created using the Sweave() function.
.Rnw files produce a PDF file and a vignette.
o To see code for a function, type the name
with no parentheses or arguments:
> plot
R as a Calculator
> log2(32)
[1] 5
> print(sqrt(2))
[1] 1.414214
> pi
[1] 3.141593
> seq(0, 5, length=6)
[1] 0 1 2 3 4 5
> 1+1:10
[1] 2 3 4 5 6 7 8 9 10 11
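The last result depends on two rules: the sequence operator : binds more tightly than +, and arithmetic is applied element by element. A small sketch (the variable names are illustrative only):

```r
# ':' has higher precedence than '+', so 1+1:10 is read as 1 + (1:10)
shifted <- 1 + 1:10      # adds 1 to each of 1, 2, ..., 10
ranged  <- (1 + 1):10    # the sequence from 2 to 10

print(shifted)
print(ranged)
print(length(shifted))   # 10 elements
print(length(ranged))    # 9 elements
```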
R as a Graphics Tool
> plot(sin(seq(0, 2*pi, length=100)))
[Figure: plot of sin(seq(0, 2*pi, length=100)) against Index (0 to 100); y-axis from -1.0 to 1.0]
Variables
> a <- 49                           # numeric
> sqrt(a)
[1] 7
> b <- "The dog ate my homework"    # character string
> sub("dog","cat",b)
[1] "The cat ate my homework"
> c <- (1+1==3)                     # logical
> c
[1] FALSE
> as.character(c)
[1] "FALSE"
Missing Values
Variables of each data type (numeric, character, logical)
can also take the value NA: not available.
o NA is not the same as 0
o NA is not the same as ""
o NA is not the same as FALSE
o NA is not the same as NULL
Operations that involve NA may or may not produce NA:
> NA==1
[1] NA
> 1+NA
[1] NA
> max(c(NA, 4, 7))
[1] NA
> max(c(NA, 4, 7), na.rm=T)
[1] 7
> NA | TRUE
[1] TRUE
> NA & TRUE
[1] NA
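Because comparing with NA yields NA, missing values are normally detected with is.na() rather than ==. A minimal sketch, reusing the vector from the max() example (the variable names are illustrative):

```r
x <- c(NA, 4, 7)

missing <- is.na(x)            # TRUE FALSE FALSE
n_missing <- sum(missing)      # number of missing values
avg <- mean(x, na.rm = TRUE)   # mean of the non-missing values
cmp <- x == NA                 # every element is NA, so this cannot find NAs

print(missing)
print(n_missing)
print(avg)
print(cmp)

# Logical operators return a definite answer when NA cannot change the result
print(NA & FALSE)   # FALSE
print(NA | TRUE)    # TRUE
```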
Vectors
vector: an ordered collection of data of the
same type
> a <- c(1,2,3)
> a*2
[1] 2 4 6
Example: the mean spot intensities of all 15488
spots on a microarray form a numeric vector.
In R, a single number is the special case of a
vector with 1 element.
Other vector types: character strings, logical
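A short sketch of the three vector types and of the single-number case. The intensity values and gene names are made up for illustration:

```r
intensities <- c(120.5, 98.2, 301.7)        # numeric vector (hypothetical values)
genes <- c("CENP-F", "Ly-9", "MLN50")       # character vector
bright <- intensities > 100                 # logical vector from a comparison

print(bright)
print(genes[bright])    # logical vectors can select elements

# A single number is a vector of length 1
print(length(5))
print(is.vector(5))
```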
Matrices and Arrays
matrix: rectangular table of data of the same
type
Example: the expression values for 10000
genes in 30 tissue biopsies form a numeric
matrix with 10000 rows and 30 columns.
array: generalization of a matrix to 3, 4, ... dimensions
Example: the red and green foreground and
background values for 20000 spots on 120
arrays form a 4 x 20000 x 120 (3D) array.
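A scaled-down sketch of the two examples. The dimensions are deliberately small and the values are random or zero placeholders, not real data:

```r
# Matrix: 10 "genes" (rows) by 3 "biopsies" (columns), filled with random values
expr <- matrix(rnorm(10 * 3), nrow = 10, ncol = 3)
print(dim(expr))

# 3D array: 4 channels x 20 spots x 12 arrays, initialized to zero
spots <- array(0, dim = c(4, 20, 12))
print(dim(spots))
print(length(dim(spots)))   # number of dimensions

# Elements are addressed with one index per dimension
spots[1, 5, 2] <- 250
print(spots[1, 5, 2])
```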
Lists
list: ordered collection of data of arbitrary types.
Example:
> doe <- list(name="john",age=28,married=F)
> doe$name
[1] "john"
> doe$age
[1] 28
> doe[[3]]
[1] FALSE
Typically, vector elements are accessed by their
index (an integer) and list elements by $name (a
character string). Both types can also be indexed by
position or by name with [ ] and [[ ]], but $ works only
on lists, not on atomic vectors. Slots are accessed by @name.
Data Frames
data frame: rectangular table with rows and columns; data
within each column has the same type (e.g. number, text,
logical), but different columns may have different types.
Represents the typical data table that researchers come up
with – like a spreadsheet.
Example:
> a <- data.frame(localization, tumorsize, progress, row.names=patients)
> a
      localization tumorsize progress
XX348     proximal       6.3    FALSE
XX234       distal       8.0     TRUE
XX987     proximal      10.0    FALSE
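The data.frame() call in this example assumes that the vectors localization, tumorsize, progress and patients already exist. A minimal sketch that defines them with the values shown in the table:

```r
patients     <- c("XX348", "XX234", "XX987")
localization <- c("proximal", "distal", "proximal")
tumorsize    <- c(6.3, 8.0, 10.0)
progress     <- c(FALSE, TRUE, FALSE)

a <- data.frame(localization, tumorsize, progress, row.names = patients)

print(a)
print(dim(a))          # rows, columns
print(sapply(a, class))  # each column has its own type
```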
What type is my data?
class          Class from which object inherits
               (vector, matrix, function, logical, list, … )
mode           Numeric, character, logical, …
storage.mode,  Mode used by R to store object
typeof         (double, integer, character, logical, …)
is.function    Logical (TRUE if function)
is.na          Logical (TRUE if missing)
names          Names associated with object
dimnames       Names for each dim of array
slotNames      Names of slots of BioC objects
attributes     Names, class, etc.
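A short sketch applying several of these functions to small example objects (the objects x, y and m are made up for illustration):

```r
x <- 1:3            # integer vector
y <- c(1.5, 2)      # double vector

print(class(x))         # "integer"
print(mode(x))          # "numeric"
print(typeof(x))        # "integer"
print(typeof(y))        # "double"
print(storage.mode(y))  # "double"

# A small matrix with names for each dimension
m <- matrix(0, nrow = 2, ncol = 2,
            dimnames = list(c("r1", "r2"), c("c1", "c2")))
print(class(m))
print(dimnames(m))
print(attributes(m))

print(is.function(mean))
print(is.na(c(1, NA)))
```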
Subsetting
Individual elements of a vector, matrix, array or data frame are
accessed with "[ ]" by specifying their index or their name.
> a
      localization tumorsize progress
XX348     proximal       6.3        0
XX234       distal       8.0        1
XX987     proximal      10.0        0
> a[3, 2]
[1] 10
> a["XX987", "tumorsize"]
[1] 10
> a["XX987",]
      localization tumorsize progress
XX987     proximal        10        0
Example:
> a
      localization tumorsize progress
XX348     proximal       6.3        0
XX234       distal       8.0        1
XX987     proximal      10.0        0
> a[c(1,3),]                         # subset rows by a vector of indices
      localization tumorsize progress
XX348     proximal       6.3        0
XX987     proximal      10.0        0
> a[-c(1,2),]
      localization tumorsize progress
XX987     proximal      10.0        0
> a[c(T,F,T),]                       # subset rows by a logical vector
      localization tumorsize progress
XX348     proximal       6.3        0
XX987     proximal      10.0        0
> a$localization                     # subset columns
[1] "proximal" "distal" "proximal"
> a$localization=="proximal"         # comparison resulting in logical vector
[1] TRUE FALSE TRUE
> a[ a$localization=="proximal", ]   # subset the selected rows
      localization tumorsize progress
XX348     proximal       6.3        0
XX987     proximal      10.0        0
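Row conditions can be combined with & and |, and columns can be selected by name in the same call. A minimal sketch that rebuilds the table a with the values shown on the slide (the result names big, both and cols are illustrative):

```r
a <- data.frame(localization = c("proximal", "distal", "proximal"),
                tumorsize = c(6.3, 8.0, 10.0),
                progress = c(0, 1, 0),
                row.names = c("XX348", "XX234", "XX987"),
                stringsAsFactors = FALSE)

# Rows where tumorsize exceeds 7
big <- a[a$tumorsize > 7, ]
print(rownames(big))

# Two conditions combined, one column selected by name
both <- a[a$localization == "proximal" & a$tumorsize > 7, "tumorsize"]
print(both)

# Several columns selected by name
cols <- a[, c("localization", "progress")]
print(cols)
```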
Functions and Operators
Functions do things with data
“Input”: function arguments (0,1,2,…)
“Output”: function result (exactly one)
Example:
add <- function(a,b) {
result <- a+b
return(result)
}
Operators: shorthand notation for frequently
used functions of one or two arguments.
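A brief sketch of calling the add function and of the link between operators and functions. The operator %+% is a hypothetical user-defined operator, shown only to illustrate the idea:

```r
add <- function(a, b) {
  result <- a + b
  return(result)
}

print(add(2, 3))       # scalar arguments
print(add(1:3, 10))    # works element by element on vectors

# An operator is a function with special syntax: these two calls are equivalent
print(2 + 3)
print(`+`(2, 3))

# Hypothetical user-defined operator: names of the form %...% are allowed
"%+%" <- function(a, b) paste(a, b)
print("gene" %+% "list")
```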
Frequently used operators
<-     Assign
+      Addition
-      Subtraction
*      Multiplication
/      Division
^      Exponentiation
%%     Modulo (remainder)
%*%    Matrix product (inner product for two vectors)
%/%    Integer division
%in%   Membership (is element of)
|      Or
&      And
<      Less than
>      Greater than
<=     Less than or equal
>=     Greater than or equal
!      Not
!=     Not equal
==     Is equal
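A short sketch of the less familiar %...% operators, using small made-up inputs:

```r
print(7 %% 3)     # remainder: 1
print(7 %/% 3)    # integer division: 2
print(2^10)       # 1024

# %*% is the matrix product; matrix() fills column by column
m <- matrix(1:4, nrow = 2, ncol = 2)
prod_mm <- m %*% m
print(prod_mm)

# For two plain vectors, %*% gives the inner product as a 1 x 1 matrix
inner <- c(1, 2, 3) %*% c(4, 5, 6)
print(inner)

# %in% tests membership element by element
member <- c("a", "z") %in% c("a", "b")
print(member)
```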
Frequently used functions
c              Concatenate
cbind, rbind   Bind vectors (or matrices) by column, by row
min            Minimum
max            Maximum
length         Number of values
dim            Number of rows, columns
floor          Largest integer not greater than x
which          Indices of TRUE values
table          Counts
summary        Generic stats
sort, order,   Sort, order, rank a vector
rank
print          Show value
cat            Print as character
paste          Concatenate as character strings
round          Round
apply          Repeat over rows, columns
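A compact sketch exercising several of these functions on a small made-up vector and matrix:

```r
x <- c(5, 2, 9, 2)

print(which(x == 2))   # positions of the TRUE values: 2 4
print(sort(x))         # 2 2 5 9
print(order(x))        # indices that would sort x: 2 4 1 3
print(rank(x))         # ties share the average rank: 3.0 1.5 4.0 1.5
print(table(x))        # counts of each distinct value

m <- rbind(c(1, 2, 3), c(4, 5, 6))   # 2 x 3 matrix built row by row
row_sums <- apply(m, 1, sum)         # 1 = over rows
col_max  <- apply(m, 2, max)         # 2 = over columns
print(row_sums)
print(col_max)

labels <- paste("gene", 1:2)
print(labels)
print(floor(2.7))
```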
Statistical functions
rnorm, dnorm,    Normal distribution random sample,
pnorm, qnorm     density, cdf and quantiles
lm, glm, anova   Model fitting
loess, lowess    Smooth curve fitting
sample           Resampling (bootstrap, permutation)
.Random.seed     Random number generation
mean, median     Location statistics
var, cor, cov,   Scale statistics
mad, range
svd, qr, chol,   Linear algebra
eigen
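A minimal sketch of the normal-distribution functions, resampling, and a model fit. The data are simulated; the seed and the exact linear relation y = 2x + 1 are assumptions chosen so the result is easy to check:

```r
set.seed(1)                 # make the random sample reproducible
x <- rnorm(100)             # 100 draws from the standard normal

print(dnorm(0))             # density at 0
print(pnorm(0))             # cdf at 0
print(qnorm(0.975))         # 97.5% quantile

perm <- sample(1:10)        # random permutation of 1..10
print(perm)

y <- 2 * x + 1              # exact linear relation (no noise)
fit <- lm(y ~ x)
print(coef(fit))            # intercept and slope

print(c(mean(x), median(x), var(x)))
```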
Graphical functions
plot             Generic plot, e.g. scatter plot
points           Add points
lines, abline    Add lines
text, mtext      Add text
legend           Add a legend
axis             Add axes
box              Add box around all axes
par              Plotting parameters (lots!)
colors, palette  Use colors
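A sketch that combines several of these functions in one figure. It writes to a temporary PDF file so that it also runs without a display; the file name and the plotted curves are illustrative choices:

```r
out <- tempfile(fileext = ".pdf")   # temporary output file
pdf(out)                            # open a PDF graphics device

x <- seq(0, 2 * pi, length.out = 100)
plot(x, sin(x), type = "l", xlab = "x", ylab = "value",
     main = "Sine and cosine")          # new plot
lines(x, cos(x), lty = 2)               # add a second curve
points(pi, 0, pch = 19)                 # add a single point
abline(h = 0, col = "gray")             # add a horizontal reference line
text(pi, 0.15, "sin(pi) = 0")           # add a label
legend("topright", legend = c("sin", "cos"), lty = c(1, 2))

dev.off()                           # close the device and write the file
print(file.exists(out))
```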
Branching
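if (logical expression) {
  statements
} else {
  alternative statements
}
The else branch is optional; at top level, else must follow
the closing } on the same line.
{ } are optional when a branch holds a single statement.
ifelse(logical vector, yes values, no values)
is the vectorized form: it chooses a value for each element.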
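A minimal sketch contrasting if/else, which tests a single condition, with ifelse, which works element by element. The variables are illustrative:

```r
x <- 7

# if/else chooses between two blocks based on one logical value
if (x %% 2 == 0) {
  parity <- "even"
} else {
  parity <- "odd"
}
print(parity)

# ifelse() returns one value per element of the condition
v <- c(-2, 0, 3)
signs <- ifelse(v < 0, "negative", "non-negative")
print(signs)
```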
Loops
Loops are used when the same or similar tasks need to be
performed multiple times: for all elements of a
list, for all columns of an array, etc.
for(i in 1:10) {
print(i*i)
}
i<-1
while(i<=10) {
print(i*i)
i<-i+sqrt(i)
}
Also: repeat, break, next
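A small sketch of repeat with next and break, which the slide only names. It collects the squares of the odd numbers up to 9; the variable names are illustrative:

```r
squares <- numeric(0)
i <- 0
repeat {
  i <- i + 1
  if (i %% 2 == 0) next   # skip even numbers, continue with the next pass
  if (i > 9) break        # leave the loop
  squares <- c(squares, i * i)
}
print(squares)

# Many loops can be replaced by vectorized arithmetic
vectorized <- (1:10)^2
looped <- sapply(1:10, function(k) k * k)
print(vectorized)
```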
Regular Expressions
Tools for text matching and replacement which are available in similar
forms in many programming languages (Perl, Unix shells, Java)
> a <- c("CENP-F","Ly-9", "MLN50", "ZNF191", "CLH-17")
> grep("L", a)
[1] 2 3 5
> grep("L", a, value=T)
[1] "Ly-9" "MLN50" "CLH-17"
> grep("^L", a, value=T)
[1] "Ly-9"
> grep("[0-9]", a, value=T)
[1] "Ly-9" "MLN50" "ZNF191" "CLH-17"
> gsub("[0-9]", "X", a)
[1] "CENP-F" "Ly-X" "MLNXX" "ZNFXXX" "CLH-XX"
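A few further calls on the same vector, sketched here as an extension of the slide: grepl returns a logical vector, and sub can keep part of a match through a parenthesized group referenced as \\1.

```r
a <- c("CENP-F", "Ly-9", "MLN50", "ZNF191", "CLH-17")

has_dash <- grepl("-", a)                    # TRUE where the name contains "-"
print(has_dash)

before_dash <- sub("-.*$", "", a)            # drop everything from the first "-"
print(before_dash)

# Keep only the leading run of letters: \\1 refers to the group in ( )
prefix <- sub("^([A-Za-z]+).*$", "\\1", a)
print(prefix)
```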
Storing Data
Every R object can be stored into and restored
from a file with the commands
“save” and “load”.
This uses the XDR (external data
representation) standard of Sun Microsystems
and others, and is portable between MS-Windows,
Unix, Mac.
> save(x, file="x.Rdata")
> load("x.Rdata")
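A round-trip sketch of save and load. It uses a temporary file instead of x.Rdata so that nothing is written to the working directory; the object x is a made-up example:

```r
x <- c(a = 1, b = 2)
f <- tempfile(fileext = ".RData")

save(x, file = f)     # write the object, under its name, to the file
rm(x)                 # remove it from the workspace

loaded <- load(f)     # restores the object; returns the restored names
print(loaded)
print(x)
```

Because load() restores objects under their original names, it silently overwrites any existing object called x.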
Importing and Exporting Data
There are many ways to get data in and out.
Most programs (e.g. Excel), as well as humans,
know how to deal with rectangular tables in the
form of tab-delimited text files.
> x <- read.delim("filename.txt")
Also: read.table, read.csv, scan
> write.table(x, file="x.txt", sep="\t")
Also: write.matrix, write
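A round-trip sketch: write a small data frame as tab-delimited text and read it back. The data frame, the temporary file, and the extra arguments row.names=FALSE and quote=FALSE are illustrative choices, not taken from the slide:

```r
df <- data.frame(gene = c("CENP-F", "Ly-9"),
                 value = c(1.5, 2.25),
                 stringsAsFactors = FALSE)
f <- tempfile(fileext = ".txt")

# Tab-delimited output without row names or quotes
write.table(df, file = f, sep = "\t", row.names = FALSE, quote = FALSE)

# read.delim expects a header line and tab separators by default
back <- read.delim(f, stringsAsFactors = FALSE)
print(back)
```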
Importing Data: caveats
Type conversions: by default, the read functions try to
guess and automatically convert the data types of the different
columns (e.g. number, factor, character). The options
as.is and colClasses control this.
Special characters: the delimiter character (space,
comma, tab) and the end-of-line character cannot
be part of a data field. To circumvent this, text may be
"quoted". However, if quoting is enabled (the default),
then the quote characters themselves cannot be part of
a data field, unless they are themselves within
quotes…
Understand the conventions your input files use and set
the quote options accordingly.
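A sketch of both caveats on a tiny made-up file: an identifier column with leading zeros that type guessing turns into numbers, and a name containing an apostrophe. The file contents and column names are hypothetical:

```r
f <- tempfile(fileext = ".txt")
writeLines(c("id\tname\tscore",
             "007\tO'Brien\t1.5",
             "042\tSmith\t2"), f)

# Default behaviour: the id column is guessed to be integer, zeros are lost
guessed <- read.delim(f)
print(guessed$id)

# colClasses fixes the column types; quote = "" disables quote handling
typed <- read.delim(f,
                    colClasses = c("character", "character", "numeric"),
                    quote = "")
print(typed$id)
print(typed$name)
```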
Bioconductor
• R software project for the analysis and
comprehension of biomedical and genomic data.
– Gene expression arrays (cDNA, Affymetrix)
– Pathway graphs
– Genome sequence data
• Started in 2001 by Robert Gentleman, Dana-Farber
Cancer Institute.
• About 25 core developers, at various institutions
in the US and Europe.
• Tools for integrating biological metadata from
the web (annotation, literature) in the analysis of
experimental metadata.
• End-user and developer packages.
Example R/BioC Packages
methods                          Class/method tools
tools, tkWidgets                 Sweave(), GUI tools
marrayTools, marrayPlots         Spotted cDNA array analysis
affy                             Affymetrix array analysis
annotate                         Link microarray data to metadata on the web
mva, cluster, clust, class       Clustering and classification
t.test, prop.test, wilcox.test   Statistical tests

R Basics

  • 1. Introduction to the R language http://www.allsoftsolutions.in
  • 2. Acknowledgments Workshop materials developed with • Robert Gentleman, Harvard • Wolfgang Huber, German Cancer Research Center • Sandrine Dudoit, UC Berkeley Bioconductor core developers include • Vince Carey, Harvard • Yongchao Ge, Mount Sinai School of Medicine • Robert Gentleman, Harvard • Jeff Gentry, Dana-Farber Cancer Institute • Rafael Irizarry, Johns Hopkins • Yee Hwa (Jean) Yang, UCSF • Jianhua (John) Zhang, Dana- Farber Cancer Institute • Sandrine Dudoit, UC Berkeley http://www.allsoftsolutions.in
  • 3. Websites • Bioconductor www.bioconductor.org – software, data, and documentation; – training materials from short courses; www.bioconductor.org/workshops/UCSC03/ucsc03.html – mailing list. • R www.r-project.org – software; – documentation; – RNews. http://www.allsoftsolutions.in
  • 4. R language: Overview • Open source and open development. • Design and deployment of portable, extensible, and scalable software. • Interoperability with other languages: C, XML. • Variety of statistical and numerical methods. • High quality visualization and graphics tools. • Effective, extensible user interface. • Innovative tools for producing documentation and training materials: vignettes. • Supports the creation, testing, and distribution of software and data modules: packages.http://www.allsoftsolutions.in
  • 5. Object Oriented Programming (OOP) Class • software abstraction of a real world object. • reflects how we think of objects and what information they contain. • defined in terms of slots. • an object is an instance of a class. • defines the structure, inheritance, and initialization of objects. Method • function that performs an action on data (objects). • defines how a particular function should behave depending on the class of its arguments. • allows computations to be adapted to particular classes. • A generic function is a dispatcher. J. M. Chambers (1998). Programming with Data. http://www.allsoftsolutions.in
  • 6. R user interface • Batch or command line processing: bash$ R to start, > q() to quit • Graphics windows > X11() > postscript() > dev.off() • File path is relative to working directory > getwd() > setwd() • Load a package with library() • GUIs, tcltk
  • 7. Getting Help o Details about a specific command whose name you know (input arguments, options, algorithm): > ? t.test > help(t.test) o See an example of usage: > demo(graphics) > example(mean) mean> x <- c(0:10, 50) mean> xm <- mean(x) mean> c(xm, mean(x, trim = 0.1)) [1] 8.75 5.50
  • 8. Getting Help o HTML search engine lets you search for topics with regular expressions: > help.search("topic") o Find commands containing a regular expression or object name: > apropos("var") [1] "var.na" ".__M__varLabels:Biobase" [3] "varLabels" "var.test" [5] "varimax" "all.vars" [7] "var" "variable.names" [9] "variable.names.default" "variable.names.lm"
  • 9. Getting Help o Vignettes contain text and executable code: > library(tkWidgets) > vExplorer() > openVignette() Created using the Sweave() function. .Rnw files produce a PDF file and a vignette. o To see code for a function, type the name with no parentheses or arguments: > plot
  • 10. R as a Calculator > log2(32) [1] 5 > print(sqrt(2)) [1] 1.414214 > pi [1] 3.141593 > seq(0, 5, length=6) [1] 0 1 2 3 4 5 > 1+1:10 [1] 2 3 4 5 6 7 8 9 10 11
  • 11. R as a Graphics Tool > plot(sin(seq(0, 2*pi, length=100))) [Figure: one period of a sine curve; x-axis: Index (0 to 100); y-axis: sin(seq(0, 2*pi, length=100)) (-1.0 to 1.0)]
  • 12. Variables numeric: > a <- 49 > sqrt(a) [1] 7 character string: > b <- "The dog ate my homework" > sub("dog","cat",b) [1] "The cat ate my homework" logical: > c <- (1+1==3) > c [1] FALSE > as.character(c) [1] "FALSE"
  • 13. Missing Values Variables of each data type (numeric, character, logical) can also take the value NA: not available. o NA is not the same as 0 o NA is not the same as “” o NA is not the same as FALSE o NA is not the same as NULL Operations that involve NA may or may not produce NA: > NA==1 [1] NA > 1+NA [1] NA > max(c(NA, 4, 7)) [1] NA > max(c(NA, 4, 7), na.rm=T) [1] 7 > NA | TRUE [1] TRUE > NA & TRUE [1] NA
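Because comparisons with NA themselves return NA, missing values are normally detected with is.na() rather than ==. A minimal sketch using the same vector as the slide; the variable names are illustrative only.

```r
x <- c(NA, 4, 7)

is_missing <- is.na(x)       # TRUE FALSE FALSE
n_missing  <- sum(is_missing)  # number of missing values
m <- mean(x, na.rm = TRUE)   # mean of the observed values only

# Comparing with == does not detect NA: every result is itself NA.
cmp <- x == NA

print(is_missing)
print(m)
```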
  • 14. Vectors vector: an ordered collection of data of the same type > a <- c(1,2,3) > a*2 [1] 2 4 6 Example: the mean spot intensities of all 15488 spots on a microarray form a numeric vector. In R, a single number is the special case of a vector with 1 element. Other vector types: character strings, logical
  • 15. Matrices and Arrays matrix: rectangular table of data of the same type Example: the expression values for 10000 genes for 30 tissue biopsies is a numeric matrix with 10000 rows and 30 columns. array: 3-,4-,..dimensional matrix Example: the red and green foreground and background values for 20000 spots on 120 arrays is a 4 x 20000 x 120 (3D) array.
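A scaled-down sketch of the two examples above: a small expression matrix and a 3D array. The dimensions, values, and names (expr, arr, g1, s1, and so on) are made up for illustration.

```r
# 3 genes x 2 samples; matrix() fills column by column by default.
expr <- matrix(c(1.2, 3.4, 5.6, 0.8, 2.9, 6.1), nrow = 3, ncol = 2,
               dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))
print(dim(expr))

# One element, addressed by row and column name.
val <- expr["g2", "s2"]

# 4 channels x 5 spots x 2 arrays, initialised to zero.
arr <- array(0, dim = c(4, 5, 2))
n_dims <- length(dim(arr))
```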
  • 16. Lists list: ordered collection of data of arbitrary types. Example: > doe <- list(name="john",age=28,married=F) > doe$name [1] "john" > doe$age [1] 28 > doe[[3]] [1] FALSE Typically, vector elements are accessed by their index (an integer) and list elements by $name (a character string). But both types support both access methods. Slots are accessed by @name.
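The two access styles can be tried side by side. Note that the $ operator itself works on lists but not on atomic vectors; a named vector element is reached with ["name"] or [["name"]] instead. The vector v below is a made-up example.

```r
doe <- list(name = "john", age = 28, married = FALSE)

age_by_name  <- doe[["age"]]  # same element as doe$age
name_by_pos  <- doe[[1]]      # list element by position
sub_list     <- doe["name"]   # single bracket keeps the list wrapper

v <- c(first = 10, second = 20)  # named atomic vector
second_by_name <- v[["second"]]  # by name
second_by_pos  <- v[2]           # by index (keeps the name attribute)
```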
  • 17. Data Frames data frame: rectangular table with rows and columns; data within each column has the same type (e.g. number, text, logical), but different columns may have different types. Represents the typical data table that researchers come up with, like a spreadsheet. Example: > a <- data.frame(localization,tumorsize,progress,row.names=patients) > a localization tumorsize progress XX348 proximal 6.3 FALSE XX234 distal 8.0 TRUE XX987 proximal 10.0 FALSE
  • 18. What type is my data? class: class from which the object inherits (vector, matrix, function, logical, list, …); mode: numeric, character, logical, …; storage.mode, typeof: mode used by R to store the object (double, integer, character, logical, …); is.function: logical (TRUE if function); is.na: logical (TRUE if missing); names: names associated with the object; dimnames: names for each dim of an array; slotNames: names of slots of BioC objects; attributes: names, class, etc.
  • 19. Subsetting Individual elements of a vector, matrix, array or data frame are accessed with “[ ]” by specifying their index, or their name > a localization tumorsize progress XX348 proximal 6.3 0 XX234 distal 8.0 1 XX987 proximal 10.0 0 > a[3, 2] [1] 10 > a["XX987", "tumorsize"] [1] 10 > a["XX987",] localization tumorsize progress XX987 proximal 10 0
  • 20. Example: > a localization tumorsize progress XX348 proximal 6.3 0 XX234 distal 8.0 1 XX987 proximal 10.0 0 > a[c(1,3),] # subset rows by a vector of indices localization tumorsize progress XX348 proximal 6.3 0 XX987 proximal 10.0 0 > a[-c(1,2),] localization tumorsize progress XX987 proximal 10.0 0 > a[c(T,F,T),] # subset rows by a logical vector localization tumorsize progress XX348 proximal 6.3 0 XX987 proximal 10.0 0 > a$localization # subset columns [1] "proximal" "distal" "proximal" > a$localization=="proximal" # comparison resulting in logical vector [1] TRUE FALSE TRUE > a[ a$localization=="proximal", ] # subset the selected rows localization tumorsize progress XX348 proximal 6.3 0 XX987 proximal 10.0 0
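The data frame from the slides can be rebuilt from scratch to try the subsetting forms. The slides do not show how localization, tumorsize, progress, and patients were defined, so the vectors below are reconstructed from the printed table and should be read as an assumption.

```r
patients <- c("XX348", "XX234", "XX987")
a <- data.frame(
  localization = c("proximal", "distal", "proximal"),
  tumorsize    = c(6.3, 8.0, 10.0),
  progress     = c(FALSE, TRUE, FALSE),
  row.names    = patients,
  stringsAsFactors = FALSE  # keep text columns as character
)

one_value <- a["XX987", "tumorsize"]  # by row and column name
big  <- a[a$tumorsize > 7, ]          # rows selected by a logical vector
prox <- subset(a, localization == "proximal",
               select = c(tumorsize, progress))  # rows and columns at once

print(big)
print(prox)
```

subset() is a convenience wrapper for interactive use; the bracket form is the more general of the two.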
  • 21. Functions and Operators Functions do things with data. “Input”: function arguments (0,1,2,…). “Output”: function result (exactly one). Example: add <- function(a,b) { result <- a+b; return(result) } Operators: shorthand for frequently used functions of one or two arguments.
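That operators are ordinary functions can be checked directly: an operator can be called by its backquoted name, and new binary operators can be defined with the %name% form. The operator %+% below is a hypothetical example, not a built-in.

```r
# The function from the slide; the value of the last expression is returned.
add <- function(a, b) {
  a + b
}
r1 <- add(2, 3)

# "+" is itself a function of two arguments.
r2 <- `+`(2, 3)

# Hypothetical user-defined operator that joins two strings with a space.
`%+%` <- function(x, y) paste(x, y)
r3 <- "dog" %+% "cat"

print(c(r1, r2))
print(r3)
```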
  • 22. Frequently used operators <- assign; + sum; - difference; * multiplication; / division; ^ exponent; %% mod; %*% matrix product; %/% integer division; %in% set membership; | or; & and; < less than; > greater than; <= less than or equal; >= greater than or equal; ! not; != not equal; == is equal
  • 23. Frequently used functions c: concatenate; cbind, rbind: concatenate vectors as columns or as rows; min: minimum; max: maximum; length: number of values; dim: number of rows, cols; floor: largest integer not greater than the argument; which: indices that are TRUE; table: counts; summary: generic stats; sort, order, rank: sort, order, rank a vector; print: show value; cat: print as char; paste: c() as char; round: round; apply: repeat over rows, cols
  • 24. Statistical functions rnorm, dnorm, pnorm, qnorm: normal distribution random sample, density, cdf and quantiles; lm, glm, anova: model fitting; loess, lowess: smooth curve fitting; sample: resampling (bootstrap, permutation); .Random.seed: random number generation; mean, median: location statistics; var, cor, cov, mad, range: scale statistics; svd, qr, chol, eigen: linear algebra
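A few of the functions above in use. set.seed() is the usual way to make random draws reproducible (rather than editing .Random.seed by hand); the seed value and the toy data are arbitrary choices for this sketch.

```r
set.seed(1)                          # arbitrary seed for reproducibility
x <- rnorm(100, mean = 10, sd = 2)   # random sample from N(10, 2^2)

p <- pnorm(0)      # cdf of the standard normal at 0
q <- qnorm(0.5)    # median of the standard normal

# Model fitting on exactly linear toy data: y = 3 + 2 * x
xs <- 1:10
ys <- 3 + 2 * xs
fit <- lm(ys ~ xs)
est <- unname(coef(fit))  # intercept and slope

boot <- sample(xs, replace = TRUE)   # one bootstrap resample of xs
print(est)
```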
  • 25. Graphical functions plot: generic plot, e.g. scatter; points: add points; lines, abline: add lines; text, mtext: add text; legend: add a legend; axis: add axes; box: add box around all axes; par: plotting parameters (lots!); colors, palette: use colors
  • 26. Branching if (logical expression) { statements } else { alternative statements } The else branch is optional; { } are optional with one statement. ifelse(logical expression, yes value, no value) works elementwise on vectors.
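The difference between if/else (one condition, chooses which statements run) and ifelse() (a vector of conditions, chooses a value per element) in a minimal sketch; the threshold and labels are made up.

```r
x <- 5
if (x > 3) {
  size <- "large"
} else {
  size <- "small"
}

# ifelse() evaluates the test for every element and returns a vector.
tumorsize <- c(6.3, 8.0, 10.0)
sizes <- ifelse(tumorsize > 7, "large", "small")

print(size)
print(sizes)
```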
  • 27. Loops When the same or similar tasks need to be performed multiple times; for all elements of a list; for all columns of an array; etc. for(i in 1:10) { print(i*i) } i <- 1; while(i<=10) { print(i*i); i <- i+sqrt(i) } Also: repeat, break, next
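Many explicit loops in R can also be written as vectorized expressions or with apply(). A short sketch comparing the forms; the small matrix m is an invented example.

```r
# Explicit loop: collect the squares instead of printing them.
squares <- numeric(10)
for (i in 1:10) {
  squares[i] <- i * i
}

# Vectorized equivalent: arithmetic works on whole vectors.
squares_v <- (1:10)^2

# "For all columns of an array": apply a function over dimension 2.
m <- matrix(1:6, nrow = 2)     # columns are (1,2), (3,4), (5,6)
col_sums <- apply(m, 2, sum)

print(squares)
print(col_sums)
```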
  • 28. Regular Expressions Tools for text matching and replacement which are available in similar forms in many programming languages (Perl, Unix shells, Java) > a <- c("CENP-F","Ly-9", "MLN50", "ZNF191", "CLH-17") > grep("L", a) [1] 2 3 5 > grep("L", a, value=T) [1] "Ly-9" "MLN50" "CLH-17" > grep("^L", a, value=T) [1] "Ly-9" > grep("[0-9]", a, value=T) [1] "Ly-9" "MLN50" "ZNF191" "CLH-17" > gsub("[0-9]", "X", a) [1] "CENP-F" "Ly-X" "MLNXX" "ZNFXXX" "CLH-XX"
  • 29. Storing Data Every R object can be stored into and restored from a file with the commands “save” and “load”. This uses the XDR (external data representation) standard of Sun Microsystems and others, and is portable between MS-Windows, Unix, Mac. > save(x, file="x.Rdata") > load("x.Rdata")
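A round trip through save() and load() can be checked as follows. The sketch writes to a temporary file rather than "x.Rdata" so that it leaves nothing behind; saveRDS() and readRDS() are shown as an alternative that stores a single object without its name.

```r
x <- c(1.5, 2.5, 3.5)

path <- tempfile(fileext = ".RData")
save(x, file = path)   # stores the object together with its name
rm(x)                  # remove it from the workspace
load(path)             # restores an object called x

# Alternative: store one object and choose the name when reading it back.
rds <- tempfile(fileext = ".rds")
saveRDS(x, rds)
y <- readRDS(rds)

print(x)
```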
  • 30. Importing and Exporting Data There are many ways to get data in and out. Most programs (e.g. Excel), as well as humans, know how to deal with rectangular tables in the form of tab-delimited text files. > x <- read.delim("filename.txt") Also: read.table, read.csv, scan > write.table(x, file="x.txt", sep="\t") Also: write.matrix, write
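Writing a tab-delimited file and reading it back, as a minimal sketch. The data frame contents are invented, and a temporary file stands in for "x.txt"; quote = FALSE and row.names = FALSE are choices made here to produce a plain table that spreadsheet programs read cleanly.

```r
x <- data.frame(gene = c("CENP-F", "Ly-9"),
                intensity = c(1520.5, 310.25),
                stringsAsFactors = FALSE)

path <- tempfile(fileext = ".txt")
write.table(x, file = path, sep = "\t", quote = FALSE, row.names = FALSE)

# read.delim() assumes a header line and tab separators.
y <- read.delim(path)
print(y)
```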
  • 31. Importing Data: caveats Type conversions: by default, the read functions try to guess and auto convert the data types of the different columns (e.g. number, factor, character). There are options as.is and colClasses to control this. Special characters: the delimiter character (space, comma, tab) and the end-of-line character cannot be part of a data field. To circumvent this, text may be “quoted”. However, if this option is used (the default), then the quote characters themselves cannot be part of a data field. Except if they themselves are within quotes… Understand the conventions your input files use and set the quote options accordingly.
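Both caveats can be seen on a tiny inline table. In this sketch the id column has leading zeros that are lost when the type is guessed, and the name column contains the delimiter inside quotes. The data are invented; the text argument of read.csv() is used so that no file is needed.

```r
txt <- 'id,name\n001,"Smith, J"\n002,"Doe, A"\n'

# Default: column types are guessed, so "001" is read as the number 1.
guessed <- read.csv(text = txt)

# colClasses forces every column to be read as character.
typed <- read.csv(text = txt, colClasses = "character")

print(guessed$id)
print(typed$id)
print(typed$name)   # the quoted commas did not split the field
```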
  • 32. Bioconductor • R software project for the analysis and comprehension of biomedical and genomic data. – Gene expression arrays (cDNA, Affymetrix) – Pathway graphs – Genome sequence data • Started in 2001 by Robert Gentleman, Dana-Farber Cancer Institute. • About 25 core developers, at various institutions in the US and Europe. • Tools for integrating biological metadata from the web (annotation, literature) in the analysis of experimental data. • End-user and developer packages.
  • 33. Example R/BioC Packages methods: class/method tools; tools, tkWidgets: Sweave(), GUI tools; marrayTools, marrayPlots: spotted cDNA array analysis; affy: Affymetrix array analysis; annotate: link microarray data to metadata on the web; mva, cluster, clust, class: clustering and classification; t.test, prop.test, wilcox.test: statistical tests