R Language
ian
Why ?
Background of R
What is R?
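GNU project created by Ross Ihaka and Robert Gentleman; it implements the S language, which John Chambers and colleagues developed at Bell Labs
Free software environment for statistical computing and graphics
Functional programming language written primarily in C, Fortran, and R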
• R is a functional programming language
• R is an interpreted language
• R is an object-oriented language
R Language
• Statistical analysis on the fly
• Built-in mathematical functions and graphics modules
• Free and open source
• http://cran.r-project.org/src/base/
Why Use R
What is your programming language of choice, R, Python or
something else?
“I use R, and occasionally matlab, for data analysis. There is a
large, active and extremely knowledgeable R community at
Google.”
http://simplystatistics.org/2013/02/15/interview-with-nick-chamandy-statistician-at-google/
Data Scientists at These Companies Use R
“Expert knowledge of SAS (With Enterprise
Guide/Miner) required and candidates with strong
knowledge of R will be preferred”
http://www.kdnuggets.com/jobs/13/03-29-apple-sr-data-scientist.html?utm_source=twitterfeed&utm_medium=facebook&utm_campaign=tfb&utm_content=FaceBook&utm_term=analytics#.UVXibgXOpfc.facebook
• Since 2007, Revolution Analytics has provided commercial support for Revolution R
• http://www.revolutionanalytics.com/products/revolution-r.php
• http://www.revolutionanalytics.com/why-revolution-r/which-r-is-right-for-me.php
• Oracle Big Data Appliance integrates R, Apache Hadoop, Oracle Enterprise Linux, and a NoSQL database with Exadata hardware
• http://www.oracle.com/us/products/database/big-data-appliance/overview/index.html
Commercial support for R
• The Community version is free
• http://www.revolutionanalytics.com/downloads/
• http://www.revolutionanalytics.com/why-revolution-r/benchmarks.php
Revolution R
Benchmark            Base R 2.14.2 64   Revolution R (1-core)   Revolution R (4-core)   Speedup (4 core)
Matrix Calculation   17.4 sec           2.9 sec                 2.0 sec                 7.9x
Matrix Functions     10.3 sec           2.0 sec                 1.2 sec                 7.8x
Program Control      2.7 sec            2.7 sec                 2.7 sec                 Not Appreciable
• RStudio: http://www.rstudio.com/
• RGUI: http://www.r-project.org/
IDE
Shiny makes it super simple for R users like you to turn
analyses into interactive web applications that anyone can
use
http://www.rstudio.com/shiny/
Web App Development
• CRAN (Comprehensive R Archive Network)
Package Management
Repository     URL
CRAN           http://cran.r-project.org/web/packages/
Bioconductor   http://www.bioconductor.org/packages/release/Software.html
R-Forge        http://r-forge.r-project.org/
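R Basics
• help()            # show help on the help system
• help(demo)        # show the help page for demo()
• demo()            # list the available demos
• demo(is.things)   # run the is.things demo
• q()               # quit R
• ls()              # list the objects in the workspace
• rm(x)             # remove the object x
Basic Commands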
• Vector
• List
• Factor
• Array
• Matrix
• Data Frame
Basic Objects
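• The main object types are vector, matrix, array, factor, list, data frame, and function.
• The basic "mode" of an object's elements is one of:
• 1. "numeric": real numbers, including "integer" (which sometimes must be requested explicitly) and "double" (double precision).
• 2. "logical": true or false, written TRUE (T) or FALSE (F); the numbers 1 and 0 are also treated as TRUE and FALSE.
• 3. "complex": complex numbers.
• 4. "character": text (strings), usually entered with double quotes (") on both sides.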
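The modes listed above can be checked directly in the console. The following is a minimal sketch; the variable names are made up for illustration, and `typeof()` is used alongside `mode()` to show the integer/double distinction.

```r
# Inspect the mode and storage type of a few values.
n <- 3        # numeric literal, stored as double
i <- 3L       # the L suffix requests an integer
b <- TRUE     # logical
z <- 1 + 2i   # complex
s <- "hat"    # character

modes <- c(mode(n), mode(i), mode(b), mode(z), mode(s))
types <- c(typeof(n), typeof(i), typeof(b), typeof(z), typeof(s))
print(modes)  # "numeric" "numeric" "logical" "complex" "character"
print(types)  # "double" "integer" "logical" "complex" "character"

# 1 and 0 are coerced to TRUE and FALSE where a logical is expected.
coerced <- as.logical(c(1, 0))
print(coerced)  # TRUE FALSE
```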
• Scalar
• x = 3; y <- 5; x + y
• Vectors
• x = c(1,2,3,7); y = c(2,3,5,1); x + y; x * y; x - y; x / y
• x = seq(1,10); y = 2:11; x + y
• x = seq(1,10,by=2); y = seq(1,10,length.out=2)
• rep(c(5,8), 3)
• x = c(1,2,3); length(x)
Objects & Arithmetic
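As a rough illustration of the commands on this slide, the sketch below stores each result in a variable so the values can be inspected. The recycling example at the end is an addition that is not on the slide.

```r
x <- c(1, 2, 3, 7)
y <- c(2, 3, 5, 1)

# Arithmetic on vectors works element by element
total   <- x + y   # 3 5 8 8
product <- x * y   # 2 6 15 7

# seq() with a step and with a target length
odds <- seq(1, 10, by = 2)          # 1 3 5 7 9
ends <- seq(1, 10, length.out = 2)  # 1 10

# rep() repeats the whole vector
repeated <- rep(c(5, 8), 3)         # 5 8 5 8 5 8

# A shorter vector is recycled to match the longer one
recycled <- c(1, 2, 3, 4) + c(10, 20)  # 11 22 13 24
```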
• Summary
• x = c(1,2,3,4,5,6,7,8,9,10)
• mean(x); min(x); median(x); max(x); var(x)
• summary(x)
• Subscripting
• x = c(1,2,3,4,5,6,7,8,9,10)
• x[1:3]; x[c(1,3,5)]
• x[c(1,3,5)] * 2 + x[c(2,2,2)]
• x[-(1:6)]
Summaries and Subscripting
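A short sketch of what the summary and subscripting expressions above evaluate to. The logical subscript on the last line is an extra example that is not on the slide.

```r
x <- 1:10

# Summary statistics
avg <- mean(x)      # 5.5
med <- median(x)    # 5.5
spread <- var(x)    # sample variance

# Subscripting by position
first.three <- x[1:3]                         # 1 2 3
picked      <- x[c(1, 3, 5)]                  # 1 3 5
combined    <- x[c(1, 3, 5)] * 2 + x[c(2, 2, 2)]   # 4 8 12

# A negative index drops elements instead of selecting them
tail.part <- x[-(1:6)]                        # 7 8 9 10

# A logical condition can also be used as a subscript
large <- x[x > 7]                             # 8 9 10
```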
• A list contains a heterogeneous collection of objects
• e <- list(thing="hat", size="8.25"); e
• l <- list(a=1,b=2,c=3,d=4,e=5,f=6,g=7,h=8,i=9,j=10)
• l$j
• man = list(name="Qoo", height=183); man$name
Lists
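The sketch below shows a few common ways to read and extend a list, reusing the `man` example from the slide. The `age` field is a hypothetical addition for illustration.

```r
man <- list(name = "Qoo", height = 183)

by.dollar <- man$name          # "Qoo"
by.double <- man[["height"]]   # 183
sub.list  <- man["name"]       # single brackets return a list of length 1

# Elements can be added by assignment; "age" is a made-up field
man$age <- 30
field.names <- names(man)      # "name" "height" "age"
```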
• An ordered collection of items used to represent categorical values
• The distinct values that a factor can take are called levels
• Factors
• phone = factor(c('iphone', 'htc', 'iphone', 'samsung', 'iphone', 'samsung'))
• levels(phone)
Factor
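To make the idea of levels concrete, here is a small sketch built on the `phone` example. It assumes the default behaviour of `factor()`, which sorts the levels.

```r
phone <- factor(c("iphone", "htc", "iphone", "samsung", "iphone", "samsung"))

lv     <- levels(phone)       # "htc" "iphone" "samsung"
codes  <- as.integer(phone)   # position of each value in the levels: 2 1 2 3 2 3
counts <- table(phone)        # htc 1, iphone 3, samsung 2
```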
• Array
• An extension of a vector to multiple dimensions
• a <- array(c(1,2,3,4,5,6,7,8,9,10,11,12),dim=c(3,4))
• Matrices
• A vector extended to two dimensions (a 2-D array)
• x = c(1,2,3); y = c(4,5,6); rbind(x,y); cbind(x,y)
• x = rbind(c(1,2,3),c(4,5,6)); dim(x)
• x <- matrix(c(1,2,3,4,5,6),nrow=3)
• x <- matrix(c(1,2,3,4,5,6),nrow=3,byrow=TRUE)
• x <- matrix(c(1,2,3,4),nrow=2); y <- matrix(c(5,6),nrow=2); x %*% y
• t(matrix(c(1,2,3,4),nrow=2))
• solve(matrix(c(1,2,3,4),nrow=2))
Matrices & Array
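The matrix operations above can be sketched as follows. The three-dimensional array at the end is an extra example that is not on the slide.

```r
x <- matrix(c(1, 2, 3, 4), nrow = 2)   # filled column by column: rows are (1 3) and (2 4)
y <- matrix(c(5, 6), nrow = 2)

prod.xy <- x %*% y    # matrix product, a 2x1 matrix holding 23 and 34
x.t     <- t(x)       # transpose: rows are (1 2) and (3 4)
x.inv   <- solve(x)   # inverse of x

# Multiplying a matrix by its inverse gives the identity (up to rounding)
identity.check <- x %*% x.inv

# A three-dimensional array: 2 rows, 3 columns, 2 layers
a3 <- array(1:12, dim = c(2, 3, 2))
corner <- a3[2, 3, 1]   # 6
```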
• A useful way to represent tabular data
• Essentially a matrix with named columns, which may also include non-numerical variables
• Example
• df = data.frame(a=c(1,2,3,4,5),b=c(2,3,4,5,6)); df
Data Frame
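A minimal sketch of a data frame that mixes a non-numerical column with a numerical one. The column names and values are hypothetical and chosen only for illustration.

```r
# Hypothetical data: names and scores are made up for this sketch
df <- data.frame(
  name  = c("david", "andy", "ian"),
  score = c(80, 90, 85),
  stringsAsFactors = FALSE
)

dims       <- dim(df)               # 3 rows, 2 columns
scores     <- df$score              # one column as a vector
high       <- df[df$score > 82, ]   # rows filtered by a condition
mean.score <- mean(df$score)        # 85
```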
• Function
• `%myop%` <- function(a, b) {2*a + 2*b}; 1 %myop% 1
• f <- function(x) {return(x^2 + 3)}
• create.vector.of.ones <- function(n) {
return.vector <- numeric(0)   # start with an empty vector
for (i in seq_len(n)) {       # seq_len(n) is empty when n is 0
return.vector[i] <- 1
}
return.vector                 # the last value is returned
}
• create.vector.of.ones(3)
• Control Structures
• if ... else ...
• repeat, for, while
• Error handling: tryCatch
Function
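The control structures named above are only listed on the slide, so the following is a small sketch of each. The function names `sign.label` and `safe.log` are hypothetical helpers invented for this example.

```r
# if ... else
sign.label <- function(x) {
  if (x > 0) {
    "positive"
  } else if (x < 0) {
    "negative"
  } else {
    "zero"
  }
}

# for loop
total <- 0
for (i in 1:5) {
  total <- total + i   # 1 + 2 + 3 + 4 + 5
}

# while loop
n <- 0
while (n < 3) {
  n <- n + 1
}

# repeat runs until break is called
count <- 0
repeat {
  count <- count + 1
  if (count >= 4) break
}

# tryCatch: return a fallback value when an error is raised
safe.log <- function(x) {
  tryCatch(
    log(x),
    error = function(e) NA
  )
}
safe.log(10)      # a number
safe.log("ten")   # NA, because log() of a string is an error
```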
• A characteristic of functional languages: functions can be passed as arguments
• apply.to.three <- function(f) {f(3)}
• apply.to.three(function(x) {x * 7})
Anonymous Function
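The slide's example can be extended with two other common uses of anonymous functions. This is an illustrative sketch; `make.multiplier` and `times7` are hypothetical names.

```r
apply.to.three <- function(f) { f(3) }
result <- apply.to.three(function(x) { x * 7 })   # 21

# Anonymous functions are often passed to sapply()
squares <- sapply(1:4, function(x) x^2)           # 1 4 9 16

# A function can also return another function
make.multiplier <- function(k) {
  function(x) x * k
}
times7 <- make.multiplier(7)
times7(3)   # 21
```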
• All R code manipulates objects.
• Every object in R has a type.
• In assignment statements, R copies the object, not just the reference to the object.
• Attributes
Objects and Classes
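The copy behaviour and the role of attributes can be seen in a short sketch like the one below. The variable names are arbitrary.

```r
a <- c(1, 2, 3)
b <- a        # b behaves as an independent copy
b[1] <- 100   # changing b leaves a untouched

a   # 1 2 3
b   # 100 2 3

# Every object has a type
typeof(a)   # "double"

# Attributes are metadata attached to an object
m <- 1:6
attr(m, "dim") <- c(2, 3)   # the dim attribute turns the vector into a matrix
is.matrix(m)                # TRUE
```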
• Many R functions are implemented using S3 methods
• S version 4 (hence S4) introduced formal classes and methods that allow
• Method dispatch on multiple arguments
• Abstract types
• Inheritance
S3 & S4 Object
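For comparison with the S4 example on the next slide, here is a minimal sketch of the S3 style. The class name `student` and the helper `new.student` are hypothetical.

```r
# Constructor for a hypothetical S3 class named "student"
new.student <- function(name, score) {
  structure(list(name = name, score = score), class = "student")
}

# print() is an S3 generic; this method is chosen by the class attribute
print.student <- function(x, ...) {
  cat("Student", x$name, "scored", x$score, "\n")
  invisible(x)
}

# A new S3 generic with a method for "student"
getscore <- function(object, ...) UseMethod("getscore")
getscore.student <- function(object, ...) object$score

s <- new.student("david", 80)
print(s)      # dispatches to print.student
getscore(s)   # 80
```

S3 classes are just a class attribute on an ordinary object, which is why no formal class definition appears here.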
• S4 OOP Example
• setClass("Student", representation(name = "character", score = "numeric"))
• studenta = new("Student", name="david", score=80)
• studentb = new("Student", name="andy", score=90)
• setMethod("show", signature("Student"),
function(object) {
cat(object@score + 100)   # show() prints the score plus 100
})
• setGeneric("getscore", function(object) standardGeneric("getscore"))
• studenta   # prints 180 through the show method
OOP of S4
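The slide declares the `getscore` generic but never attaches a method to it. One way to complete the example is sketched below; the `GradStudent` subclass and its `thesis` slot are hypothetical additions used to show inheritance.

```r
library(methods)   # usually attached already; loaded explicitly for scripts

setClass("Student", representation(name = "character", score = "numeric"))

# Declare the generic, then attach a method for the Student class
setGeneric("getscore", function(object) standardGeneric("getscore"))
setMethod("getscore", signature("Student"), function(object) object@score)

# A subclass inherits the slots and methods of Student;
# "GradStudent" and "thesis" are made-up names for this sketch
setClass("GradStudent", contains = "Student",
         representation(thesis = "character"))

studenta <- new("Student", name = "david", score = 80)
grad     <- new("GradStudent", name = "andy", score = 90, thesis = "R internals")

getscore(studenta)    # 80
getscore(grad)        # 90, through the inherited method
is(grad, "Student")   # TRUE
```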
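• A package is a related set of functions, help files, and data files that have been bundled together.
• Basic Commands
• library(rpart)             # load an installed package
• install.packages("rpart")  # install a package from CRAN
• (.packages())              # list the packages currently attached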
Packages
Packages used in Machine Learning for Hackers
• Apply
• Returns a vector, array, or list of values obtained by applying a function to the margins of an array or matrix.
• data <- cbind(c(1,2),c(3,4))
• data.rowsum <- apply(data,1,sum)   # margin 1: rows
• data.colsum <- apply(data,2,sum)   # margin 2: columns
• data
Apply
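Worked through with the same `data` matrix, the calls above give the values shown in the comments. The anonymous-function and `rowSums()` lines are extra illustrations that are not on the slide.

```r
data <- cbind(c(1, 2), c(3, 4))   # rows are (1 3) and (2 4)

data.rowsum <- apply(data, 1, sum)   # 4 6
data.colsum <- apply(data, 2, sum)   # 3 7

# Any function can be applied, including an anonymous one
data.rowrange <- apply(data, 1, function(r) max(r) - min(r))   # 2 2

# rowSums() gives the same row totals for this case
same.rows <- all(rowSums(data) == data.rowsum)
```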
• Save and Load
• x = USPersonalExpenditure     # built-in dataset
• save(x, file="~/test.RData")  # write x to a file
• rm(x)                         # remove x from the workspace
• load("~/test.RData")          # restore x from the file
• x
File IO
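The same round trip can be tried without touching the home directory by writing to a temporary file. The sketch below also shows a plain-text alternative with `write.csv()` and `read.csv()`, which is an addition to what the slide covers.

```r
x <- USPersonalExpenditure

# save()/load() round trip through a temporary file
rdata.path <- tempfile(fileext = ".RData")
save(x, file = rdata.path)
rm(x)
load(rdata.path)   # restores the object under its original name, x

# Plain-text alternative: write a CSV file and read it back
csv.path <- tempfile(fileext = ".csv")
write.csv(x, csv.path)
y <- read.csv(csv.path, row.names = 1, check.names = FALSE)
```

Note that `read.csv()` returns a data frame, whereas the original `x` is a matrix.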
Charts and Graphics
• xrange = range(as.numeric(colnames(USPersonalExpenditure)))
• yrange = range(USPersonalExpenditure)
• plot(xrange, yrange, type="n", xlab="Year", ylab="Expenditure")   # empty frame
• for (i in 1:5) {   # one line per expenditure category
lines(as.numeric(colnames(USPersonalExpenditure)), USPersonalExpenditure[i,], type="b", lwd=1.5)
}
Plotting Example
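A possible variation on the plotting example adds a colour and symbol per category plus a legend, and writes the chart to a file. This is a sketch only; the temporary PDF path and the variable names `years` and `groups` are assumptions made for illustration.

```r
years  <- as.numeric(colnames(USPersonalExpenditure))
groups <- rownames(USPersonalExpenditure)

plot.path <- tempfile(fileext = ".pdf")   # hypothetical output location
pdf(plot.path)

# Empty frame first, then one line per expenditure category
plot(range(years), range(USPersonalExpenditure), type = "n",
     xlab = "Year", ylab = "Expenditure (billions of dollars)")
for (i in seq_along(groups)) {
  lines(years, USPersonalExpenditure[i, ], type = "b", lwd = 1.5, col = i, pch = i)
}
legend("topleft", legend = groups, col = seq_along(groups),
       pch = seq_along(groups), lwd = 1.5)

dev.off()   # close the device so the file is written
```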
Reference & Resource
• R in a Nutshell
Study Material
Online Reference
Community Resources for R help
• Websites
• Stack Overflow
• Cross Validated
• R-help
• R-devel
• R-sig-*
• Package-specific mailing lists
• Blog
• R-bloggers
• Twitter
• https://twitter.com/#rstats
• Quora
• http://www.quora.com/R-software
Resources
• Conferences
• useR!
• R in Finance
• R in Insurance
• Others
• Joint Statistical Meetings
• Royal Statistical Society Conference
• Local User Groups
• http://blog.revolutionanalytics.com/local-r-groups.html
• Taiwan R User Group
• http://www.facebook.com/Tw.R.User
• http://www.meetup.com/Taiwan-R/
Resources (Cont'd)
Thank You!
11/20/2015 Confidential | Copyright 2012 Trend Micro Inc.
