# Datamining R 1st


1. R (sesejun@is.ocha.ac.jp, 2009/10/1)
2. R
   • http://r-project.org/ (DL)
   • Mac, Win, Linux
   • S-Plus
   • Interactive shell
3. Starting R and basic arithmetic
   • Applications: R
   • Version 2.6
   • R project: DL
   • Type 1+1[RET]

       > 1+1
       [1] 2
       > 3*6
       [1] 18
       > 3^3
       [1] 27
       > 8/3
       [1] 2.666667
       > as.integer(8/3)
       [1] 2
       > 8%%3
       [1] 2
4. Variables & vectors

       > x <- 2
       > y <- 3
       > x*y
       [1] 6
       > x^y
       [1] 8

       > c(1,2,3)
       [1] 1 2 3
       > c(1,2,3) + c(4,5,6)
       [1] 5 7 9
       > c(1,2,3) * c(4,5,6)
       [1]  4 10 18
       > c(1,2,3) * 2
       [1] 2 4 6
       > c(1,2,3) / 2
       [1] 0.5 1.0 1.5
       > v <- c(1,2,3)
       > w <- v + 3
       > w
       [1] 4 5 6
       > v*w
       [1]  4 10 18
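The vector arithmetic on this slide works element by element, and a length-one operand such as `3` is reused ("recycled") for every element. The following is a minimal sketch of that behaviour; the names `prod_vw` and `dot_vw` are illustrative and do not appear in the slides.

```r
# Element-wise arithmetic and recycling on numeric vectors.
v <- c(1, 2, 3)
w <- v + 3              # the scalar 3 is recycled to c(3, 3, 3)
prod_vw <- v * w        # element-wise product: 1*4, 2*5, 3*6
dot_vw <- sum(v * w)    # inner product built from the element-wise product

print(w)
print(prod_vw)
print(dot_vw)
```

Note that `*` on two vectors is not a matrix product; the inner product has to be requested explicitly, here via `sum()`.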
5. Basic functions on a vector

       > v <- c(3,2,5,7,2,4,3,1,4)
       > length(v)
       [1] 9
       > max(v)
       [1] 7
       > min(v)
       [1] 1
       > mean(v)
       [1] 3.444444
       > median(v)
       [1] 3
       > unique(v)
       [1] 3 2 5 7 4 1
       > sort(v)
       [1] 1 2 2 3 3 4 4 5 7
       > order(v)
       [1] 8 2 5 1 7 6 9 3 4
       > hist(v)
       > help(max)
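The difference between `sort(v)` and `order(v)` is easy to miss: `sort` returns the sorted values, while `order` returns the positions that would sort the vector. A small sketch of how the two relate, using the same `v` as the slide (the variable names `idx`, `same`, and `top3` are illustrative):

```r
v <- c(3, 2, 5, 7, 2, 4, 3, 1, 4)

idx <- order(v)                 # positions of the elements in ascending order
same <- all(v[idx] == sort(v))  # indexing by order() reproduces sort()

# The three largest values, via a descending order.
top3 <- v[order(v, decreasing = TRUE)][1:3]

print(idx)
print(same)
print(top3)
```

`order` is the more general tool of the two, since the same index vector can be used to rearrange other vectors of the same length.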
6. Histograms and plots

       > v <- c(3,2,5,7,2,4,3,1,4)
       > hist(v, main="My First Histogram", col="gray")
       > hist(v, col="gray", main="My First Histogram")
       > w <- sort(v)
       > plot(v,w)
       > plot(w,v)
7. Sequences and indexing

       > seq(1,4)
       [1] 1 2 3 4
       > 1:4
       [1] 1 2 3 4
       > seq(1,5,by=2)
       [1] 1 3 5
       > rep(1,4)
       [1] 1 1 1 1
       > rep(1:3,2)
       [1] 1 2 3 1 2 3

       > v <- c(3,2,5,7,2,4,3,1,4)
       > v[1]
       [1] 3
       > v[c(1,3,5)]
       [1] 3 5 2
       > v[c(5,3,1)]
       [1] 2 5 3
       > v[c(F,F,T,T,F,F,T,T,F)]
       [1] 5 7 3 1
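Sequences and index vectors combine naturally. The sketch below extends the slide's examples with a few indexing forms the slide does not show (negative indices and `rep(..., each = )`), so treat them as supplementary illustrations rather than part of the original material.

```r
v <- c(3, 2, 5, 7, 2, 4, 3, 1, 4)

odd_pos <- v[seq(1, length(v), by = 2)]    # elements at positions 1, 3, 5, 7, 9
without_first <- v[-1]                     # a negative index drops that position
last_two <- v[(length(v) - 1):length(v)]   # parentheses needed: ':' binds tighter than '-'
each_twice <- rep(1:3, each = 2)           # 1 1 2 2 3 3

print(odd_pos)
print(without_first)
print(last_two)
print(each_twice)
```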
8. Comparisons

       > x <- 3
       > x
       [1] 3
       > x == 3
       [1] TRUE
       > x == 5
       [1] FALSE
       > x < 5
       [1] TRUE

       > v <- c(3,2,5,7,2,4,3,1,4)
       > v == c(3,3,3,3,3,3,3,3,3)
       [1]  TRUE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
       > v == 3
       [1]  TRUE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
       > v < 3
       [1] FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE
9. Selecting elements by condition

       > v <- c(3,2,5,7,2,4,3,1,4)
       > v < 3
       [1] FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE
       > v[v<3]
       [1] 2 2 1
       > v[v>3]
       [1] 5 7 4 4
       > v[v>3 & v<7]
       [1] 5 4 4
       > (1:length(v))[v<3]
       [1] 2 5 8
       > sum(v>3)
       [1] 4
       > v %in% c(2,3,4)
       [1]  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE FALSE  TRUE
       > v[v %in% c(2,3,4)]
       [1] 3 2 2 4 3 4
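Two idioms from this slide have common shorthand forms in base R: `(1:length(v))[v<3]` can be written with `which()`, and `sum(cond)/length(v)` is the same as `mean(cond)` because `TRUE` counts as 1 and `FALSE` as 0. A minimal sketch (variable names are illustrative):

```r
v <- c(3, 2, 5, 7, 2, 4, 3, 1, 4)

small_idx <- which(v < 3)       # positions where the condition holds
frac_large <- mean(v > 3)       # proportion of elements greater than 3
in_range <- v[v > 3 & v < 7]    # '&' combines conditions element by element

print(small_idx)
print(frac_large)
print(in_range)
```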
10. Uniform random numbers

        > runif(10,min=0,max=1)
         [1] 0.45189074 0.15543373 0.04654874 0.56946222 0.06086409
         [6] 0.64340708 0.91820279 0.28365751 0.91056890 0.61600679
        > n <- 10
        > hist(runif(n,min=0,max=1), main=paste("n=",n,sep=""))
        > n <- 10000
        > hist(runif(n,min=0,max=1), main=paste("n=",n,sep=""))
11. Proportions in random samples

        > n <- 10
        > x <- runif(n,min=0,max=1)
        > x
         [1] 0.9308879 0.6457174 0.7480667 0.9277555 0.2432229 0.7852049
         [7] 0.9005295 0.3948717 0.3442392 0.7808671
        > x < 0.3
         [1] FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
        > sum(x < 0.3)
        [1] 1
        > sum(x < 0.3)/n
        [1] 0.1
        > n <- 10000
        > x <- runif(n,min=0,max=1)
        > sum(x < 0.3)/n
        [1] 0.3013

        > n <- 10000
        > x <- rnorm(n,mean=0,sd=1)
        > sum(x < 0.3)/n
        [1] 0.6125
        > sum(x > 1.0)/n
        [1] 0.1591
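The proportions on this slide are estimates of probabilities, and base R can compute the exact values with the distribution functions `punif` and `pnorm`. The sketch below compares the two; the seed value and the variable names are assumptions added for reproducibility and are not part of the slides.

```r
set.seed(1)   # assumed seed, only so that repeated runs give the same sample
n <- 10000

# Uniform(0, 1): P(X < 0.3) is exactly 0.3.
x <- runif(n, min = 0, max = 1)
emp_unif <- mean(x < 0.3)
exact_unif <- punif(0.3, min = 0, max = 1)

# Standard normal: P(Z < 0.3) and P(Z > 1.0) from the cumulative distribution.
z <- rnorm(n, mean = 0, sd = 1)
emp_norm_lower <- mean(z < 0.3)
exact_norm_lower <- pnorm(0.3)
emp_norm_upper <- mean(z > 1.0)
exact_norm_upper <- 1 - pnorm(1.0)

cat("uniform  P(X < 0.3):", emp_unif, "vs", exact_unif, "\n")
cat("normal   P(Z < 0.3):", emp_norm_lower, "vs", exact_norm_lower, "\n")
cat("normal   P(Z > 1.0):", emp_norm_upper, "vs", exact_norm_upper, "\n")
```

With `n = 10000` the empirical and exact values should typically agree to about two decimal places, which matches the slide's 0.3013, 0.6125, and 0.1591.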
12. Matrices

        > m <- matrix((1:9)**2,nrow=3)
        > m
             [,1] [,2] [,3]
        [1,]    1   16   49
        [2,]    4   25   64
        [3,]    9   36   81
        > m[c(2,3),c(2,3)]
             [,1] [,2]
        [1,]   25   64
        [2,]   36   81
        > m[2,]
        [1]  4 25 64
        > m[c(1,2),]
             [,1] [,2] [,3]
        [1,]    1   16   49
        [2,]    4   25   64
        > m[,2]
        [1] 16 25 36
        > m<50
             [,1] [,2]  [,3]
        [1,] TRUE TRUE  TRUE
        [2,] TRUE TRUE FALSE
        [3,] TRUE TRUE FALSE
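As the output of `m` shows, `matrix()` fills column by column by default. The sketch below illustrates that fill order, the `byrow = TRUE` alternative, and what happens when a logical matrix such as `m < 50` is used as an index; the variable names are illustrative.

```r
m <- matrix((1:9)^2, nrow = 3)                    # filled column by column
by_row <- matrix((1:9)^2, nrow = 3, byrow = TRUE) # filled row by row

# Filling by row gives the transpose of filling by column (for a square matrix).
is_transpose <- all(by_row == t(m))

# A logical matrix index returns a plain vector, in column-major order.
small <- m[m < 50]

print(by_row)
print(is_transpose)
print(small)
```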
13. Matrix computations

        > m <- matrix((1:9)**2,nrow=3)
        > solve(m)
                  [,1]      [,2]       [,3]
        [1,]  1.291667 -2.166667  0.9305556
        [2,] -1.166667  1.666667 -0.6111111
        [3,]  0.375000 -0.500000  0.1805556
        > eigen(m)
        $values
        [1] 112.9839325  -6.2879696   0.3040371

        $vectors
                   [,1]       [,2]       [,3]
        [1,] -0.3993327 -0.8494260  0.7612507
        [2,] -0.5511074 -0.4511993 -0.6195403
        [3,] -0.7326760  0.2736690  0.1914866

        > v <- c(3,2,5,7,2,4,3,1,4)
        > t(v) %*% v
             [,1]
        [1,]  133
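The results of `solve` and `eigen` can be checked numerically: a matrix times its inverse should give the identity, and each eigenpair should satisfy m v = λ v. The following sketch performs those checks with `all.equal`, which compares up to a small numerical tolerance; the variable names are illustrative.

```r
m <- matrix((1:9)^2, nrow = 3)

# Inverse: m %*% solve(m) should be the 3x3 identity up to rounding error.
inv <- solve(m)
ok_inv <- isTRUE(all.equal(m %*% inv, diag(3)))

# Eigen decomposition: check m v = lambda v for the first eigenpair.
e <- eigen(m)
lhs <- as.vector(m %*% e$vectors[, 1])
rhs <- e$values[1] * e$vectors[, 1]
ok_eig <- isTRUE(all.equal(lhs, rhs))

# t(v) %*% v is the sum of squares; crossprod() computes the same product.
v <- c(3, 2, 5, 7, 2, 4, 3, 1, 4)
ss <- drop(crossprod(v))

print(ok_inv)
print(ok_eig)
print(ss)
```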
14. R
    • if, for
    • apply family (apply, sapply, lapply)
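This slide names `if`, `for`, and the apply family, but most of its text did not survive extraction. As an assumption about what was intended, the sketch below computes the same row sums once with an explicit `for` loop and once with `apply`, then shows how `sapply` and `lapply` differ in their return type. The example matrix and variable names are chosen here for illustration.

```r
m <- matrix((1:9)^2, nrow = 3)

# Explicit loop: preallocate the result, then fill it row by row.
row_sums_loop <- numeric(nrow(m))
for (i in seq_len(nrow(m))) {
  row_sums_loop[i] <- sum(m[i, ])
}

# apply(): call sum() on each row (MARGIN = 1) without writing the loop.
row_sums_apply <- apply(m, 1, sum)

# sapply() simplifies the results to a vector; lapply() always returns a list.
squares_vec <- sapply(1:3, function(k) k^2)
squares_list <- lapply(1:3, function(k) k^2)

print(row_sums_loop)
print(row_sums_apply)
print(squares_vec)
print(is.list(squares_list))
```

For row and column sums specifically, base R also provides `rowSums()` and `colSums()`.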
15. R on the web
    • R-Tips: http://cse.naro.affrc.go.jp/takezawa/r-tips/r.html
    • RjpWiki: http://www.okada.jp.org/RWiki/