Environment for statistical computing and graphics
Praveen Nair
ninethsense.com
Data Science
Statistics
Collection
Organization
Analysis
Interpretation
Presentation
S
• 1976 - 1998
• Created by John Chambers
• Developed by statisticians
• At Bell Laboratories
R
• Industry: Finance, Bioscience, Supply Chain, Sports, Marketing, Manufacturing, Health Care…
• Used by: Orion, Bing, Facebook, Ford, Google, Twitter, Firefox
• 1993
Developers
• Robert Gentleman
• Ross Ihaka
R project
• http://r-project.org, CRAN
• Collaboration of researchers in statistical computing
• Open Source, Free Software
• Interpreter
• Over 3500 extension packages
• Supported by large user network
R console – RGui & Rterm
Calculator
> 1+2
[1] 3
> 1+2*3
[1] 7
> (1+2)*3
[1] 9
> pi
[1] 3.141593
> sin(0)
[1] 0
> sin(1)
[1] 0.841471
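The console also handles a few operators the slide does not show. A minimal sketch using only base R; the particular numbers are arbitrary examples:

```r
# Arithmetic operators beyond + and *
2^10        # exponentiation: 1024
7 %% 3      # remainder (modulo): 1
7 %/% 3     # integer division: 2
sqrt(16)    # square root: 4
exp(1)      # Euler's number, about 2.718282
```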
Variables
> a = 100
> a + 5
[1] 105
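Many R users write assignment with `<-` instead of `=`; both work at the prompt. A small sketch, where the variable name `amount` is only an illustration:

```r
amount <- 100          # `<-` is the conventional assignment operator
amount + 5             # 105; `amount` itself is unchanged
amount <- amount * 2   # reassign using the old value
amount                 # 200
class(amount)          # "numeric"
```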
Vector
> a = c(1,2,3,4)
> a
[1] 1 2 3 4
> b = c(2,3,4,5)
> a+b
[1] 3 5 7 9
>
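Vector arithmetic works element by element, and a single number is recycled across the whole vector. A short sketch reusing the `a` and `b` values from the slide:

```r
a <- c(1, 2, 3, 4)
b <- c(2, 3, 4, 5)
a * b        # element-wise product: 2 6 12 20
a + 10       # the scalar 10 is recycled: 11 12 13 14
a[2]         # indexing starts at 1: 2
a[a > 2]     # logical indexing keeps matching elements: 3 4
length(a)    # 4
```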
Sequences
> 1:10
[1] 1 2 3 4 5 6 7 8 9 10
> a = 1:10
> a
[1] 1 2 3 4 5 6 7 8 9 10
> a*2
[1] 2 4 6 8 10 12 14 16 18 20
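The colon operator always steps by 1. For other step sizes or repeated values, base R offers `seq()`, `rev()` and `rep()`; a brief sketch with arbitrary example values:

```r
seq(1, 10, by = 2)          # 1 3 5 7 9
seq(0, 1, length.out = 5)   # 0.00 0.25 0.50 0.75 1.00
rev(1:5)                    # 5 4 3 2 1
rep(c(1, 2), times = 3)     # 1 2 1 2 1 2
```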
min, max, range
> a = c(1,9,3,4)
> a
[1] 1 9 3 4
> min(a)
[1] 1
> max(a)
[1] 9
> range(a)
[1] 1 9
sum, prod
> a = c(1,2,3,4)
> sum(a)
[1] 10
> prod(a)
[1] 24
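The same vector can be summarised in other ways with base R functions. A sketch reusing `a` from the slide:

```r
a <- c(1, 2, 3, 4)
mean(a)      # 2.5
median(a)    # 2.5
cumsum(a)    # running total: 1 3 6 10
cumprod(a)   # running product: 1 2 6 24
sd(a)        # sample standard deviation, about 1.290994
```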
Strings
> a = "ORION"
> a
[1] "ORION"
> substring(a,2,4)
[1] "RIO"
> nchar(a)
[1] 5
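A few more base R string helpers, sketched with the same `"ORION"` value; the extra strings passed to `paste()` are made up for illustration:

```r
a <- "ORION"
tolower(a)             # "orion"
paste("Hello", a)      # "Hello ORION" (joined with a space by default)
paste0(a, "-", 2014)   # "ORION-2014" (no separator; the number is converted)
substr(a, 1, 1)        # first character: "O"
substring(a, 4)        # from position 4 to the end: "ON"
```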
if-then/else
> a = 50
> if (a == 50) date()
[1] "Wed Jun 11 01:03:14 2014"
> a = 110
> if (a > 100) "Greater than 100" else "Not greater than 100"
[1] "Greater than 100"
> a = 50
> if (a > 100) "Greater than 100" else "Not greater than 100"
[1] "Not greater than 100"
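`if` tests a single value. For longer branches, braces and `else if` help; to test every element of a vector, base R has the vectorised `ifelse()`. A sketch, where `label` and `scores` are hypothetical names and values:

```r
a <- 100
if (a > 100) {
  label <- "Greater than 100"
} else if (a == 100) {
  label <- "Exactly 100"
} else {
  label <- "Less than 100"
}
label   # "Exactly 100"

# ifelse() applies the test to each element in turn
scores <- c(50, 110, 100)
labels <- ifelse(scores > 100, "Greater than 100", "Not greater than 100")
labels  # "Not greater than 100" "Greater than 100" "Not greater than 100"
```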
for loop
> a = c(1,2,3,4)
> a
[1] 1 2 3 4
> s = 0
> for(i in a)
+ s = s + i
> s
[1] 10
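The same loop can be written with braces, which is safer once the body has more than one statement. The sketch below also loops over positions with `seq_along()` and compares the result with the vectorised form; `squares` and `k` are illustrative names:

```r
a <- c(1, 2, 3, 4)

# Loop over the values, as on the slide
s <- 0
for (i in a) {
  s <- s + i
}
s          # 10, the same as sum(a)

# Loop over the positions to fill a result vector
squares <- numeric(length(a))
for (k in seq_along(a)) {
  squares[k] <- a[k]^2
}
squares    # 1 4 9 16, the same as a^2
```

In everyday R code the vectorised forms `sum(a)` and `a^2` are usually preferred over explicit loops.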
switch
> i = 2
> switch(i,"I","II","III")
[1] "II"
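`switch` can also select by name, with a final unnamed argument acting as the default. A sketch, where the function `roman` and its labels are hypothetical:

```r
roman <- function(name) {
  switch(name,
         one   = "I",
         two   = "II",
         three = "III",
         "unknown")   # unnamed last argument is the default
}
roman("two")   # "II"
roman("ten")   # "unknown"

# With a numeric index that is out of range, switch returns NULL
result <- switch(5, "I", "II", "III")
is.null(result)   # TRUE
```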
User functions
> f1 = function(r) 2 * pi * r
> f1(10)
[1] 62.83185
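Functions can take default arguments and span several statements inside braces; the value of the last expression is returned. A sketch, where `circle_area` is a hypothetical companion to `f1`:

```r
circle_area <- function(r = 1) {
  if (r < 0) stop("radius must be non-negative")
  pi * r^2            # last expression is the return value
}
circle_area()         # default r = 1: about 3.141593
circle_area(10)       # about 314.1593

# Apply the function to several radii at once
areas <- sapply(1:3, circle_area)
areas                 # pi * c(1, 4, 9)
```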
Graphics
• Pie, Bar & Histograms
• Box-and-Whisker
• Scatter plot
• Time Series plots
• Surface plots
plot
> x = 1:5
> y = c(1.5, 2, 1, 3.2, 4.5)
> plot(x,y)
plot, lines, polygon, hist,…
> lines(x,y)
> polygon(x,y)
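The plotting functions accept optional arguments for titles, axis labels and colours. A sketch with the same `x` and `y`; the titles and colours are arbitrary choices:

```r
x <- 1:5
y <- c(1.5, 2, 1, 3.2, 4.5)

# Points joined by lines, with a title and axis labels
plot(x, y, type = "b", main = "y against x", xlab = "x", ylab = "y", col = "blue")

# Add a dashed red line on top of the existing plot
lines(x, y, lty = 2, col = "red")

# Histogram of the y values; hist() also returns the bin counts
h <- hist(y, main = "Histogram of y")
h$counts
```

When run from a script rather than the console, R typically writes these plots to a PDF file instead of opening a window.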
Demo time

R programming basics