Learning notes of r for python programmer (Temp1)
Presentation Transcript

  • Learning Notes of R for Python Programmers
  • R Basic Scalar Types
      • R basic scalar data types
          – integer (1L, 2L, 3L, …)
          – numeric (1, 2, 3, …)
          – character
          – complex
          – logical (TRUE, FALSE)
              • and (&), or (|), not (!)
  • R Basic Scalar Type Constructors
      • RScalarType(0) returns an empty (zero-length) vector, not NULL
          – length(xxx(0)) == 0
      • RScalarType(1) returns one default value
          – integer: 0L (printed as 0)
          – numeric: 0
          – character: ""
          – complex: 0+0i
          – logical: FALSE
  • R Basic Object Types
      • R basic data structure types
          – (row) vector (in R, everything is a vector)
          – matrix
          – list
          – data.frame
          – factor
          – environment
      • In R the "base" type is a vector, not a scalar.
  • R Object
  • Find R Object's Properties
      • length(object)
      • mode(object) / class(object) / typeof(object)
      • attributes(object)
      • attr(object, name)
      • str(object)
  • Python type(obj)
      • R> class(obj)
      • R> mode(obj)
      • R> typeof(obj)

        obj      class         mode          typeof
        1        "numeric"     "numeric"     "double"
        1:10     "integer"     "numeric"     "integer"
        "1"      "character"   "character"   "character"
        class    "function"    "function"    "builtin"
  • Python dir(obj)
      • attributes(obj)
      • str(object)
      • ls() (Python> dir())
      • The function attributes(object) returns a list of all the non-intrinsic attributes currently defined for that object.
  • R attr(object, name)
      • The function attr(object, name) selects a specific attribute.
      • On the left-hand side of an assignment, it either associates a new attribute with object or changes an existing one.
      • For example
          – > attr(z, "dim") <- c(10,10)
          – allows R to treat z as if it were a 10-by-10 matrix.
  • R character
  • Python "a,b,c,d,e".split(",") (R strsplit)
      • strsplit("a,b,c,d,e", ",")
          – (the output is an R list)
      • unlist(strsplit("a,b,c,d,e", ","))[vector_index]
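For comparison, here is a minimal Python sketch of the same split-and-index step. The variable names are illustrative only. The point to notice is the index base: vector_index in R starts at 1, while Python lists start at 0.

```python
# Python side of strsplit("a,b,c,d,e", ","): str.split already returns a
# flat list, so there is no counterpart to the unlist() step.
parts = "a,b,c,d,e".split(",")

# R vectors are 1-based, Python lists are 0-based, so
# unlist(strsplit("a,b,c,d,e", ","))[2] in R corresponds to parts[1] here.
second = parts[1]

print(parts)
print(second)
```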
  • R paste
      • paste("a", "b", sep="")
          – Python> "a" + "b" → "ab"
  • R List / Python Dictionary
  • Python Dictionary (R List)
      • Constructor
          – Rlist <- list(key1=value1, … , key_n=value_n)
      • Evaluate
          – Rlist$key1 (Python> D[key1])
          – Rlist[[1]]
      • Sublist
          – Rlist[key_i] (the output is list(key_i=value_i))
  • Python D["new_key"] = new_value
      • Rlist$new_key = new_value or
      • Rlist$new_key <- new_value
  • Python> del D[key]
      • New_Rlist <- Rlist[-key_index] or
      • New_Rlist <- Rlist[-vector_of_key_index]
  • Python Dict.keys()
      • vector_of_Rlist_keys <- names(Rlist)
          – (the output vector_of_Rlist_keys is an R vector)
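The list operations on the last few slides can be lined up against a Python dictionary as in the sketch below. The keys and values are made up for illustration, and each comment names the R call the line is meant to mirror.

```python
# Python counterparts of the R list operations (keys and values are made up).
d = {"key1": 1, "key2": "two"}       # Rlist <- list(key1 = 1, key2 = "two")

value = d["key1"]                     # Rlist$key1 or Rlist[["key1"]]
sub = {k: d[k] for k in ["key1"]}     # Rlist["key1"] keeps the list wrapper

d["new_key"] = 3.0                    # Rlist$new_key <- 3.0
del d["key2"]                         # Rlist <- Rlist[-2]; R builds a new list
keys = list(d.keys())                 # names(Rlist)

print(value, sub, keys)
```

One difference worth keeping in mind: `del` changes the dictionary in place, whereas negative indexing in R returns a new list that has to be assigned.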
  • R Vector / Python List
  • Python List (R vector)
      • [Constructor] vector(mode, length)
          – vector(mode = "character", length = 10)
      • 0:10
          – 0:10 == c(0,1,2,3,4,5,6,7,8,9,10)
          – Python> range(0,11)
      • seq(0,1,0.1)
          – seq(0,1,0.1) == 0:10*0.1
          – Matlab> linspace(0,1,11) or 0:0.1:1
      • rep(0:10, times = 2)
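A rough Python rendering of the four constructors above, as a sketch rather than an exact equivalence (R vectors are typed, Python lists are not). The variable names are illustrative.

```python
# vector(mode = "character", length = 10): ten empty strings
empty_strings = [""] * 10

# 0:10 includes both endpoints, so the Python stop value is 11
zero_to_ten = list(range(0, 11))

# seq(0, 1, 0.1): built from integers to limit floating-point drift
tenths = [i * 0.1 for i in range(11)]

# rep(0:10, times = 2): the whole sequence repeated twice
repeated = zero_to_ten * 2

print(len(empty_strings), zero_to_ten[-1], len(tenths), len(repeated))
```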
  • Python List.methods
      • vector <- c(vector, other_vector)
          – Python> List.extend (List.append for a single value)
      • vector[-J] or vector[-(I:J)]
          – Python> List.pop (R returns a new vector; the original is unchanged)
      • subvector <- vector[vector_of_index]
      • which( vector == value )
          – Python> List.index(value) (which returns every matching position, 1-based)
  • R which
      • which( vector == value )
          – Python> List.index(value)
      • which( vector < v ) or which( vector > v )
      • which(arg, arr.ind=TRUE)
      • http://fortheloveof.co.uk/2010/04/11/r-select-specific-elements-or-find-their-index-in-a-vector-or-a-matrix/
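The List.index comparison is only approximate: which() reports every matching position, counted from 1, while List.index stops at the first match and counts from 0. The sketch below uses a hypothetical helper named `which` (not a Python built-in) to make the difference concrete.

```python
def which(values, predicate):
    """Return the 1-based positions where predicate holds, in the spirit of R's which().

    Hypothetical helper written for this comparison; not part of Python.
    """
    return [i + 1 for i, v in enumerate(values) if predicate(v)]


xs = [3, 7, 3, 9]

all_threes = which(xs, lambda v: v == 3)    # which(xs == 3) in R
below_eight = which(xs, lambda v: v < 8)    # which(xs < 8) in R
first_three = xs.index(3)                   # Python: first match only, 0-based

print(all_threes, below_eight, first_three)
```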
  • R vector
      • length(vector)
          – Python> len(List)
      • names(vector)
      • rownames(vector)
  • Python> element in List
      • R> element %in% R-Vector
      • R> !(element %in% R-Vector) (not in)
  • R Matrix: R Vector with Dimension
  • R-Matrix
      • Constructor:
          – matrix( ?? , nrow = ?? , ncol = ?? )
          – as.matrix( ?? )
  • R-Matrix = R-Vector with Dimension
      > x <- 1:15
      > class(x)
      [1] "integer"
      > dim(x) <- c(3, 5)
      > class(x)
      [1] "matrix"
      (R 4.0.0 and later print "matrix" "array" for the last call.)
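Setting dim on a vector fills the matrix column by column, which is easy to miss coming from Python's row-oriented nested lists. The sketch below reproduces that layout with a hypothetical helper, `as_matrix`, written only for this illustration.

```python
def as_matrix(values, nrow, ncol):
    """Arrange a flat list into nrow rows, filling column by column as R does.

    Hypothetical helper for illustration; not a library function.
    """
    if len(values) != nrow * ncol:
        raise ValueError("length of values must equal nrow * ncol")
    return [[values[r + c * nrow] for c in range(ncol)] for r in range(nrow)]


x = list(range(1, 16))     # x <- 1:15
m = as_matrix(x, 3, 5)     # dim(x) <- c(3, 5)

for row in m:
    print(row)
```

The first row comes out as 1, 4, 7, 10, 13 rather than 1 to 5, matching what R prints for the 3-by-5 matrix.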
  • Names on Matrix
      • Just as you can name indices in a vector, you can (and should!) name columns and rows in a matrix with colnames(X) and rownames(X).
      • E.g.
          – colnames(R-matrix) <- c(name_1, name_2, …)
          – colnames(R-matrix)[i] <- name_i
  • Functions on Matrix
      • If X is a matrix, apply(X, 1, f) is the result of applying f to each row of X; apply(X, 2, f) applies f to each column.
          – Python> map(func, py-List)
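A small sketch of the row and column cases in Python, assuming the matrix is stored as a list of rows and using sum as the example function f. The transposition with zip(*X) plays the role of the margin argument 2.

```python
# A 2-by-3 matrix stored as a list of rows (values are made up).
X = [[1, 2, 3],
     [4, 5, 6]]

row_sums = [sum(row) for row in X]          # apply(X, 1, sum)
col_sums = [sum(col) for col in zip(*X)]    # apply(X, 2, sum)

print(row_sums)
print(col_sums)
```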
  • Add Columns and Rows
      • cbind
          – E.g. > cbind(c(1,2,3), c(4,5,6))
      • rbind
          – E.g. > rbind(c(1,2,3), c(4,5,6))
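With a matrix held as a list of rows, the two calls above might be sketched in Python as follows. The names `row_bound` and `col_bound` are illustrative.

```python
a = [1, 2, 3]
b = [4, 5, 6]

# rbind(a, b): each vector becomes a row, giving 2 rows and 3 columns
row_bound = [a, b]

# cbind(a, b): each vector becomes a column, giving 3 rows and 2 columns
col_bound = [list(pair) for pair in zip(a, b)]

print(row_bound)
print(col_bound)
```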
  • Data Frame in R: Explicitly Like a List
  • Explicitly like a list
      • When can a list be made into a data.frame?
          – Components must be vectors (numeric, character, logical) or factors.
          – All vectors and factors must have the same length.
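As a loose analogy only, the two conditions can be expressed as a check on a Python dictionary of columns. The function name `can_be_data_frame` is hypothetical, and Python lists stand in for R vectors and factors here.

```python
def can_be_data_frame(columns):
    """Check the two slide conditions on a dict mapping column name to values.

    Hypothetical helper: lists and tuples stand in for R vectors and factors.
    """
    values = list(columns.values())
    # Condition 1: every component is vector-like.
    if not all(isinstance(v, (list, tuple)) for v in values):
        return False
    # Condition 2: all components share one length.
    return len({len(v) for v in values}) <= 1


ok = can_be_data_frame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
ragged = can_be_data_frame({"id": [1, 2, 3], "name": ["a", "b"]})
print(ok, ragged)
```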
  • Python os and R
  • Python os.method
      • getwd() (Python> os.getcwd())
      • setwd(Path) (Python> os.chdir(Path))
  • Control Structures and Looping
  • if
      if ( statement1 )
          statement2
      else if ( statement3 )
          statement4
      else if ( statement5 )
          statement6
      else
          statement7
  • switch
      • switch(statement, list)
      • Example:
          > y <- "fruit"
          > switch(y, fruit = "banana", vegetable = "broccoli", meat = "beef")
          [1] "banana"
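Python has no switch statement with this shape. A common stand-in is a dictionary lookup, sketched below with the same three alternatives. The names `choices` and `picked` are illustrative.

```python
# Dictionary lookup standing in for
# switch(y, fruit = "banana", vegetable = "broccoli", meat = "beef")
choices = {"fruit": "banana", "vegetable": "broccoli", "meat": "beef"}

y = "fruit"
picked = choices.get(y)          # None when nothing matches

missing = choices.get("dairy")   # R's switch returns NULL invisibly here
print(picked, missing)
```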
  • for
      • for ( name in vector ) statement1
      • E.g.
          > for ( ind in 1:10 ) { print(ind) }
  • while
      • while ( statement1 ) statement2
  • repeat
      • repeat statement
      • The repeat statement evaluates its body repeatedly until a break is explicitly requested.
      • When using repeat, statement must be a block statement: you need to both perform some computation and test whether to break from the loop, and this usually requires two statements.
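The closest Python idiom to repeat is `while True` with an explicit break. The sketch below shows the two-statement body the slide describes: one statement does the work, the other decides whether to stop. The counter and the limit of 3 are arbitrary.

```python
count = 0
while True:            # repeat {
    count += 1         #     the computation
    if count >= 3:     #     the test: if (count >= 3) break
        break          # }

print(count)
```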
  • Functions in R
  • Create Function in R
      • name <- function(arg_1, arg_2, ...) expression
      • E.g.
          – ADD <- function(a,b) a+b
          – ADD <- function(a,b) {c<-a+b}
          – ADD <- function(a,b) {c<-a+b;c}
          – ADD <- function(a,b) {c<-a+b; return(c)}
          – (All of these return a+b; the second returns it invisibly, so nothing is printed when it is called at the console.)
  • Function Return R-List
      • To return more than one item, create a list using list()
      • E.g.
          – MyFnTest1 <- function(a,b) {c<-a+b; d<-a-b; list(r1=c, r2=d)}
          – MyFnTest1 <- function(a,b) {c<-a+b; d<-a-b; return(list(r1=c, r2=d))}
          – (These two functions are the same, too)
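In Python the same pattern is usually written by returning a dictionary (or a tuple). The sketch below mirrors MyFnTest1 under the illustrative name `my_fn_test`.

```python
def my_fn_test(a, b):
    """Return the sum and difference under the names r1 and r2, like MyFnTest1."""
    return {"r1": a + b, "r2": a - b}


result = my_fn_test(5, 2)
print(result["r1"], result["r2"])   # result$r1 and result$r2 in R
```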
  • Python map(func, Py-List)
      • the apply family of functions (to be continued)
  • R Time Objects
  • R Basic Time Objects
      • Basic types
          – Date
          – POSIXct
          – POSIXlt
      • Constructors:
          – as.Date
          – as.POSIXct
          – as.POSIXlt
  • as.POSIXct / as.POSIXlt
      • as.POSIXct( timestamp , origin , tz , …)
      • E.g.
          – as.POSIXct( timestamp , origin="1970-01-01", tz="CST", …)
  • strftime / strptime
      • "POSIXlt" / "POSIXct" to character
          – strftime(x, format="", tz = "", usetz = FALSE, ...)
      • Character to "POSIXlt"
          – strptime(x, format, tz = "")
      • E.g.
          – strptime(… , "%Y-%m-%d %H:%M:%S", tz="CST")
  • Time to Timestamp [Python> time.mktime(…)]
      • as.numeric(POSIXct or POSIXlt object)
      • E.g.
          – as.numeric(Sys.time()) (Sys.time() returns a POSIXct object)
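The time conversions on the last few slides have direct counterparts in Python's standard library. The sketch below pins everything to UTC so the numbers are reproducible. That choice is an assumption of this example, not something the slides specify (they use tz="CST"), and time.mktime would use the local time zone instead.

```python
import time
from datetime import datetime, timezone

# Character to time object, like strptime(x, "%Y-%m-%d %H:%M:%S", tz = "UTC")
parsed = datetime.strptime("1970-01-02 00:00:00", "%Y-%m-%d %H:%M:%S")
parsed = parsed.replace(tzinfo=timezone.utc)

# Time object to seconds since the epoch, like as.numeric(<POSIXct>)
stamp = parsed.timestamp()

# Seconds back to a time object, like as.POSIXct(stamp, origin = "1970-01-01", tz = "UTC")
back = datetime.fromtimestamp(stamp, tz=timezone.utc)

# Time object to character, like strftime(x, format = "%Y-%m-%d %H:%M:%S")
text = back.strftime("%Y-%m-%d %H:%M:%S")

# Current time as a number, like as.numeric(Sys.time())
now = time.time()

print(stamp, text)
```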
  • R Graph
  • Types of Graphics
      • Base
      • Lattice
  • Base Graphics
      • Use functions such as
          – plot
          – barplot
          – contour
          – boxplot
          – pie
          – pairs
          – persp
          – image
  • Plot Arguments
      • type = ???
      • axes = FALSE : suppresses axes
      • xlab = "str" : label of the x-axis
      • ylab = "str" : label of the y-axis
      • sub = "str" : subtitle, shown under the x-axis
      • main = "str" : title, shown at the top of the plot
      • xlim = c(lo,hi)
      • ylim = c(lo,hi)
  • Plot's type arg
      • type =
          – "p" : plots points
          – "l" : plots a line
          – "n" : plots nothing, just creates the axes for later use
          – "b" : plots both lines and points
          – "o" : plots overlaid lines and points
          – "h" : plots histogram-like vertical lines
          – "s" : plots step-like lines
  • Plot Example
      • R> plot(x=(1:20), y=(11:30), pch=1:20, col=1:20, main="plot", xlab="x-axis", ylab="y-axis", ylim=c(0,30))
      • R> example(points)
  • pch
      • 0:18: S-compatible vector symbols.
      • 19:25: further R vector symbols.
      • 26:31: unused (and ignored).
      • 32:127: ASCII characters.
      • 128:255: native characters, only in a single-byte locale and for the symbol font. (128:159 are only used on Windows.)
      • Ref:
          – http://stat.ethz.ch/R-manual/R-devel/library/graphics/html/points.html
          – http://rgraphics.limnology.wisc.edu/
  • cex
      • A numerical vector giving the amount by which plotting characters and symbols should be scaled relative to the default. This works as a multiple of par("cex"). NULL and NA are equivalent to 1.0. Note that this does not affect annotation: see below.
      • E.g.
          – points(c(6,2), c(2,1), pch = 5, cex = 3, col = "red")
          – points(c(6,2), c(2,1), pch = 5, cex = 10, col = "red")
  • points, lines, text, abline
  • arrows
  • par/layout (Matlab> subplot)
      • par(mfrow=c(m,n))
          – Matlab> subplot(m,n,?)
  • pairs
      • E.g.
          – R> pairs(iris[,c(1,3,5)])
          – R> example(pairs)
  • MISC. Code1 (Saving Graph)
      • postscript("myfile.ps")
      • plot(1:10)
      • dev.off()
  • MISC. Code2 (Saving Graph)
      • windows(record=TRUE, width=7, height=7)
      • Last_30_TXF <- last(TXF, 30)
      • chartSeries(Last_30_TXF)
      • savePlot(paste("Last30_", unlist(strsplit(filename, ".", fixed=TRUE))[1], sep=""), type = "jpeg", device = dev.cur(), restoreConsole = TRUE)
  • Available Colors
      • R> colors() lists all available color names
      • Combine it with grep to find a color family, e.g.
      • R> grep("red", colors())
      • Reference:
      • http://research.stowers-institute.org/efg/R/Color/Chart/
  • R xts
  • Tools for xts
      • diff
      • lag
  • My XTS Tools
      • Integration_of_XTS
      • Indexing_of_XTS
      • XTS_Push_Events_Back
      • Get_XTS_Local_Max
      • Get_XTS_Local_Min
  • Basic Statistics Tools
  • R Statistical Models
  • Model Formulae
      • formula(x, ...)
      • as.formula(object, env = parent.frame())
      • E.g.
          – R> example(formula)
  • MISC. 1 Updating fitted models
      • http://cran.r-project.org/doc/manuals/R-intro.html#Updating-fitted-models
  • R Packages
      • library()
      • search()
      • loadedNamespaces()
      • getAnywhere(Package_Name)
      • http://cran.r-project.org/doc/manuals/R-intro.html#Namespaces
  • Random Number Generators
      • rnorm
      • runif
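The slide lists only the function names. A Python sketch of roughly equivalent draws, using the standard random module, might look like this. The sample size of 5, the parameters and the seed are arbitrary choices for illustration, and the two languages' generators do not produce the same numbers for the same seed.

```python
import random

random.seed(42)   # plays the role of set.seed(42); the streams differ from R's

# rnorm(5, mean = 0, sd = 1)
normals = [random.gauss(0.0, 1.0) for _ in range(5)]

# runif(5, min = 0, max = 1)
uniforms = [random.uniform(0.0, 1.0) for _ in range(5)]

print(normals)
print(uniforms)
```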
  • Regular Expressions (Python re Module)
  • grep
      • Pattern_Index <- grep(Pattern, Search_Vector)
      • E.g. (the Cl function in quantmod)
          return(x[, grep("Close", colnames(x))])
      • hits <- grep( pattern, x )
      • Ref: Lecture5v1
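With Python's re module, the index-returning behaviour of grep can be approximated as below. The helper name `grep` and the column names are hypothetical, chosen to echo the quantmod example. Positions are reported 1-based to match R.

```python
import re


def grep(pattern, strings):
    """Return 1-based positions of the elements that match pattern, like R's grep().

    Hypothetical helper for this comparison; not part of the re module.
    """
    return [i + 1 for i, s in enumerate(strings) if re.search(pattern, s)]


# Made-up column names in the style of an OHLC price series.
columns = ["TXF.Open", "TXF.High", "TXF.Low", "TXF.Close"]

hits = grep("Close", columns)     # grep("Close", colnames(x))
print(hits)
```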
  • R LibSVM (e1071)
      • http://www.csie.ntu.edu.tw/~cjlin/libsvm/R_example
  • R CR Tree Method (rpart): Classification and Regression Tree
      • http://www.statsoft.com/textbook/classification-and-regression-trees/
      • http://www.stat.cmu.edu/~cshalizi/350/lectures/22/lecture-22.pdf
      • http://www.stat.wisc.edu/~loh/treeprogs/guide/eqr.pdf
  • R Adaboost Package (adabag)
  • adaboost.M1
      • This function implements Freund and Schapire's AdaBoost.M1 algorithm.
      • The weak learner is a classification and regression tree, i.e. the rpart package in R.
  • adaboost.M1's Training Data Form
      • Label column must be a factor object
      • (in source code)
          fit <- rpart(formula = formula, weights = data$pesos, data = data[, -1], maxdepth = maxdepth)
          flearn <- predict(fit, data = data[, -1], type = "class")
  • R IDE Tools
  • Reference
      • http://en.wikipedia.org/wiki/R_(programming_language)
      • http://jekyll.math.byuh.edu/other/howto/R/RE.shtml (Emacs)
      • http://stat.ethz.ch/ESS/
  • Reference
  • Graph
      • http://addictedtor.free.fr/graphiques/
      • http://www.nd.edu/~steve/Rcourse/Lecture2v1.pdf
      • http://addictedtor.free.fr/graphiques/
      • http://www.evc-cit.info/psych018/r_intro/r_intro4.html
      • http://www.r-tutor.com/r-introduction/data-frame
      • http://msenux.redwoods.edu/math/R/dataframe.php