These slides are designed for people who already have some background in coding, but are new to the R language and environment.
This presentation takes between 1.5 and 2 hours.
The slides were created with R Markdown, so all the code shown here should run as written in R.
Introduction to R
1. 6/26/15, 12:48 AMIntroduction to R
Page 1 of 80file:///Users/bencharoenwong/Dropbox/Citadel/R_overview/introToR-bencharoenwong.html#1
Introduction to R
Ben Charoenwong
June 25, 2015
R Markdown
This is an R Markdown presentation. Markdown is a simple
formatting syntax for authoring HTML, PDF, and MS Word
documents. For more details on using R Markdown see
http://rmarkdown.rstudio.com.
I wrote this document in RStudio, which is an editing
environment for R. It can also compile PDFs and HTML documents.
This document contains live R code that is evaluated when this
document is "compiled" (or rather, "knitted").
Why R?
R is a freely available language and environment for statistical
computing and graphics. It provides a wide variety of statistical
and graphical techniques.
R is widely used by statisticians (though not all finance firms use it). A
large number of add-on packages can be installed and run on R.
You rarely need to code anything from scratch yourself.
R's syntax is very simple (so simple it's a bit like sloppy
programming). Since R is an interpreted language, not a compiled
one, you can run code line by line without building a complete
program first, unlike in many other languages (C, Fortran, etc.).
How to install R
You can download R for Windows, Linux, or Mac at:
http://cran.r-project.org/
The most recent R version is always available there.
To install packages, use install.packages(); to update them,
use update.packages(ask = FALSE).
Coding Environment
Once you start R, the console will appear.
The interface looks a little different in R Studio. Use whichever
you feel most comfortable with!
Code is written in scripts.
To create a new script, select "File" -> "New Script". The editor
looks standard.
You can set up other text editors to be able to interpret and run R
code too (won't get into this).
Code is saved in files ending with ".R", and code can be run using
the command source().
Don't forget to save!
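A minimal sketch of the save-then-source() workflow. The script contents and the variable names (a, b, total) are made up for illustration, and a temporary file stands in for a real ".R" file you would save yourself.

```r
# Write a tiny script to a temporary file (hypothetical contents).
script_path <- tempfile(fileext = ".R")
writeLines(c("a <- 5", "b <- 4", "total <- a + b"), script_path)

# source() runs every line of the file in the current session.
source(script_path)
total
```

After source() finishes, anything the script assigned is available in your session, just as if you had typed it at the console.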
Let's Start Coding
If/when you get stuck
Try the built-in help menu. For example, suppose I want to know
how the command "lm" works. The commands below all have the same
effect:
?lm
help(lm)
help("lm")
Search online for help. There is a large R user community on
Stack Overflow and R-bloggers. Being able to find the most
efficient and readable code from these online resources is an
extremely valuable skill.
Email the TA (me) at: bcharoen@chicagobooth.edu, but only
AFTER you have tried the two steps above.
Basic Commands
Let's jump into some basic commands.
Assign objects by using either "<-" or "=" (the former is more
robust)
a = 5
a
## [1] 5
-a
## [1] -5
Basic Commands
b = 4
b
## [1] 4
a + b
## [1] 9
Basic Commands
Example where "<-" is different from "=":
system.time(a <- 5) works, but system.time(a = 5) returns an
error (you can try it).
system.time(a <- 5)
## user system elapsed
## 0 0 0
The "<-" always means assign, but "=" is also used for passing
arguments to a function. More on this later.
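Here is a small sketch of that difference using mean(), whose first argument is named x. The variable names v, m1 and m2 are just for illustration.

```r
# "<-" inside a call assigns the variable and then passes its value on.
m1 <- mean(v <- c(1, 2, 3))
exists("v")    # v was created as a side effect of the call

# "=" inside a call only matches the function argument named x.
# This line does not create or change a variable in your workspace.
m2 <- mean(x = c(4, 5, 6))
```

This is also why system.time(a = 5) fails: system.time() has no argument called a, so R complains about an unused argument instead of assigning anything.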
Basic Commands
R keeps everything that you assign in memory (until you overwrite
or remove it). You can check what's in memory with ls():
ls()
## [1] "a" "b"
Basic Commands
Just a tiny bit on style:
I use ";" to end a line in R. This is not necessary. Using ";" allows
me to sometimes compress multiple lines of code into one (we
can debate whether that is good or bad).
This is different from MATLAB! In MATLAB the ";" suppresses the
output.
I also usually use "=" to assign values, rather than "<-".
Mainly because I started programming in C++ and MATLAB.
Basic Commands
Just a tiny bit on style:
a = 5; a;
## [1] 5
b = 4; b + a;
## [1] 9
Basic Commands
Recall a = 5; b = 4; :
a - b;
## [1] 1
a * b;
## [1] 20
a / b;
## [1] 1.25
Basic Commands
Recall: a = 5; b = 4;:
a ^ b;
## [1] 625
exp(b);
## [1] 54.59815
log(b); # Default base "e". You can specify "base = 2", etc.
## [1] 1.386294
Basic Commands
Recall: a = 5; b = 4;:
min(a, b);
## [1] 4
max(a, b);
## [1] 5
abs(-a);
## [1] 5
Basic Commands
Recall: a = 5; b = 4;:
sqrt() of a negative real number returns NaN with a warning. Complex
numbers need to be created slightly differently.
sqrt(a);
## [1] 2.236068
sqrt(-a);
## Warning in sqrt(-a): NaNs produced
## [1] NaN
Basic Commands
R also handles complex numbers, with real and imaginary parts. (You may never need to
use these though.)
x = 0 + 5i; temp = complex(imaginary = sqrt(5));
Re(x); Re(temp);
## [1] 0
## [1] 0
Im(x); Im(temp);
## [1] 5
## [1] 2.236068
Vectors & Matrices
Basic Commands: Vectors
Let's look at some vectors, and a case where R may work when
we don't want it to!
By default, operations in R are element-wise
We will look at matrix operations in a couple of slides
x = c(1:4); x; # a plain vector: no row/column orientation
## [1] 1 2 3 4
y = c(1:3);
x * y
## Warning in x * y: longer object length is not a multiple of shorter object
## length
## [1] 1 4 9 4
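The warning above only appears when the longer length is not a multiple of the shorter one. When it is a multiple, R recycles silently, which is the case to watch out for. A sketch follows; strict_multiply is a hypothetical helper, not a base R function.

```r
x <- 1:4
short <- 1:2          # length 2 divides length 4: recycled with no warning
x * short             # same as c(1, 2, 3, 4) * c(1, 2, 1, 2)

# Hypothetical guard that refuses to recycle anything but scalars.
strict_multiply <- function(u, v) {
  if (length(u) != length(v) && length(u) != 1 && length(v) != 1) {
    stop("lengths differ: ", length(u), " vs ", length(v))
  }
  u * v
}
strict_multiply(x, 2)
```

Calling strict_multiply(1:4, 1:3) stops with an error instead of quietly returning a recycled result.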
Basic Commands: Vectors
The rep command repeats x the number of times given by times.
z1 = rep(5, 2); z1;
## [1] 5 5
z2 = rep(x = 5, times = 2); z2;
## [1] 5 5
z3 = rep(times = 2, x = 5); z3;
## [1] 5 5
Basic Commands: Vectors
The rep command repeats x the number of times given by times.
z1 = rep(x = c(1,2,3), times = 2); z1;
## [1] 1 2 3 1 2 3
Basic Commands: Matrices
To create a matrix (note: Matrix != Array/List != Data Frame)
M = matrix(c(1:4),nrow = 2, ncol = 2);
M;
## [,1] [,2]
## [1,] 1 3
## [2,] 2 4
N = matrix(c(1:4),nrow = 2, ncol = 2, byrow = T);
N;
## [,1] [,2]
## [1,] 1 2
## [2,] 3 4
Basic Commands: Matrices
Transposing a matrix
M;
## [,1] [,2]
## [1,] 1 3
## [2,] 2 4
t(M);
## [,1] [,2]
## [1,] 1 2
## [2,] 3 4
any(t(M) != N); # element-wise check: is any t(M)[i,j] != N[i,j]?
## [1] FALSE
Basic Commands: Matrices
Let's look at some matrix operations
M * N;
## [,1] [,2]
## [1,] 1 6
## [2,] 6 16
M %*% N;
## [,1] [,2]
## [1,] 10 14
## [2,] 14 20
Basic Commands: Matrices
What about subscripting matrices?
The command M[1,5] gives an error
Error in M[1, 5] : subscript out of bounds
M;
## [,1] [,2]
## [1,] 1 3
## [2,] 2 4
M[1,2];
## [1] 3
Basic Commands: Matrices
What about subscripting matrices?
M[1];
## [1] 1
M[4]; # bad style!
## [1] 4
M[6]; # note: no error! Unlike the subscript out of bounds
## [1] NA
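Why does M[4] give 4? A single index counts down the columns first (column-major order). The sketch below spells out the mapping between M[i, j] and the single index; the names i, j and k are only for illustration.

```r
M <- matrix(1:4, nrow = 2, ncol = 2)

# M[k] with k = i + (j - 1) * nrow(M) is the same element as M[i, j].
i <- 1; j <- 2
k <- i + (j - 1) * nrow(M)
M[k] == M[i, j]

# Past the last element, single-index access returns NA instead of failing.
is.na(M[6])
```

Using M[i, j] is usually the safer style, since an out-of-range row or column raises an error rather than quietly returning NA.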
Basic Commands: Matrices
Picking out rows and columns (used frequently)
M[1,];
## [1] 1 3
M[,1];
## [1] 1 2
Basic Commands: Matrices
We can also specify which rows or columns we DON'T want
M;
## [,1] [,2]
## [1,] 1 3
## [2,] 2 4
M[-1,];
## [1] 2 4
M[,-1];
## [1] 3 4
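Notice that picking out a single row or column returns a plain vector, not a matrix. If later code expects a matrix, the drop = FALSE argument keeps the dimensions. A short sketch (v and K are illustrative names):

```r
M <- matrix(1:4, nrow = 2, ncol = 2)

v <- M[-1, ]                 # simplified to a plain vector
is.null(dim(v))

K <- M[-1, , drop = FALSE]   # stays a 1 x 2 matrix
dim(K)
```

This matters mostly inside loops and functions, where a subset can unexpectedly shrink to one row and break a later matrix operation.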
Basic Commands: Matrices
You can name rows and columns in matrices (same with
data.frame), so you can refer to names instead of indices
row.names(M); colnames(M);
## NULL
## NULL
row.names(M) = c("dave","pete");
colnames(M) = c("weight","height"); # note: rownames() also exists, matching colnames()
M
## weight height
## dave 1 3
## pete 2 4
Basic Commands: Matrices
You can name rows and columns in matrices (same with
data.frame), so you can refer to names instead of indices
M;
## weight height
## dave 1 3
## pete 2 4
M["dave","height"]; M[c("pete","dave"), "height"];
## [1] 3
## pete dave
## 4 3
Basic Commands: Matrices
Inverting some matrices may cause problems (numerically singular,
etc.)
M;
## weight height
## dave 1 3
## pete 2 4
solve(M);
## dave pete
## weight -2 1.5
## height 1 -0.5
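To see what "may cause problems" looks like, here is a sketch with an exactly singular matrix. S and inv are illustrative names; tryCatch() is one way, among others, to keep a script running when solve() fails.

```r
S <- matrix(c(1, 2, 2, 4), nrow = 2)   # second column = 2 * first column

det(S)                                  # zero determinant: no inverse exists

# tryCatch() lets us handle the failure instead of stopping the script.
inv <- tryCatch(solve(S), error = function(e) NULL)
is.null(inv)
```

Nearly singular matrices are trickier, because solve() may succeed but return numerically unreliable values, so checking the result is a good habit.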
Basic Commands: Matrices
To solve a system of equations of the form Mx = b
det(M);
## [1] -2
b = matrix(c(2,5) , nrow = nrow(M), ncol = 1); b;
## [,1]
## [1,] 2
## [2,] 5
solve(a = M, b = b); #solves for ax = b, a is square, b is column
## [,1]
## weight 3.5
## height -0.5
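A quick way to convince yourself the solution is right is to plug it back in. This sketch rebuilds M and b without the row and column names to stay self-contained.

```r
M <- matrix(1:4, nrow = 2, ncol = 2)
b <- matrix(c(2, 5), nrow = 2, ncol = 1)

x <- solve(M, b)

# all.equal() compares with a numerical tolerance, which is safer than "=="
# for floating-point results.
isTRUE(all.equal(M %*% x, b))
```

Passing b directly to solve() is generally preferable to computing solve(M) %*% b, since it avoids forming the inverse explicitly.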
Basic Commands: Matrices
Diagonal matrices
D = diag(c(1,2,3), nrow = 3) # much better than "eye" in MATLAB!!!
Basic Commands: Matrices
Diagonal matrices. Let's look at some potential pathological cases.
DANGER: THEY WORK!
diag(c(1,2,3), nrow = 3, ncol = 1)
## [,1]
## [1,] 1
## [2,] 0
## [3,] 0
diag(c(1:50), nrow = 2, ncol = 2);
## [,1] [,2]
## [1,] 1 0
## [2,] 0 2
Data Types
Basic Commands: Data Types
We use the command class() to verify the variable type.
class(M) # in R >= 4.0.0 this prints "matrix" "array"
## [1] "matrix"
class(4)
## [1] "numeric"
class(c(1,2,3))
## [1] "numeric"
Basic Commands: Data Types
vec = c(1:5);
vec > 2;
## [1] FALSE FALSE TRUE TRUE TRUE
as.numeric(vec > 2);
## [1] 0 0 1 1 1
Basic Commands: Data Types
Like in many other languages, logicals and numerics are brethren.
as.logical(c(0,2,1,0))
## [1] FALSE TRUE TRUE FALSE
Basic Commands: Data Types
Let's look at some potentially pathological cases.
is.na(NA); is.nan(NaN);
## [1] TRUE
## [1] TRUE
class(NA); class(NaN);
## [1] "logical"
## [1] "numeric"
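The two checks are not symmetric, which is easy to trip over. A short sketch of how NA and NaN interact, and how missing values propagate through a summary (v is an illustrative vector):

```r
is.na(NaN)              # TRUE: NaN counts as missing
is.nan(NA)              # FALSE: NA is not "not a number"

v <- c(1, NA, 3)
mean(v)                 # NA, because one value is missing
mean(v, na.rm = TRUE)   # 2, after dropping the missing value
```

Most summary functions (mean, sum, sd, and so on) take an na.rm argument for this purpose.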
Basic Commands: Data Types
Some functions/operations only work on specific variable types
(cue: overloading concepts).
x = complex(real = 4);
class(x);
## [1] "complex"
as.numeric(x);
## [1] 4
class(as.numeric(x));
## [1] "numeric"
Basic Commands: Data Types
Some functions/operations only work on specific variable types
(cue: overloading concepts).
y = complex(imaginary = 5);
as.numeric(y);
## Warning: imaginary parts discarded in coercion
## [1] 0
Basic Commands: Data Types
Again, let's consider some potentially pathological cases.
class(as.numeric(NA));
## [1] "numeric"
as.logical(NaN);
## [1] NA
class(as.logical(NaN));
## [1] "logical"
"Combinding"
Basic Commands: "Combinding" Items
You can combine things of the same data type.
The command c() works for vectors.
x = c(1,2,4);
c(x, 5);
## [1] 1 2 4 5
y = c(T, TRUE, F, FALSE);
c(y, NA, FALSE);
## [1] TRUE TRUE FALSE FALSE NA FALSE
Basic Commands: "Combinding" Items
Recall that logical and numeric are brethren, and that numeric is
more general. Logicals are automatically converted to numeric.
The command c() works for vectors.
c(F,10,2,T,F);
## [1] 0 10 2 1 0
c("hi","world","!");
## [1] "hi" "world" "!"
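The same "more general type wins" rule continues up to character. A sketch of the chain, using made-up values:

```r
class(c(TRUE, 2L))             # "integer"
class(c(TRUE, 2L, 3.5))        # "numeric"
class(c(TRUE, 2L, 3.5, "a"))   # "character"

# One stray string turns the whole vector into text.
mixed <- c(1, "a", TRUE)
mixed
```

This is a common source of surprises when reading data: a single non-numeric entry in a column makes the whole column character.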
Basic Commands: "Combinding" Items
We can use cbind() and rbind() to combine
matrices/data.frames
M;
## weight height
## dave 1 3
## pete 2 4
rbind(M, c(1,2)); # a bit sloppy
## weight height
## dave 1 3
## pete 2 4
## 1 2
Basic Commands: "Combinding" Items
We can use cbind() and rbind() to combine
matrices/data.frames
Pathological cases
rbind(M, 5);
## weight height
## dave 1 3
## pete 2 4
## 5 5
Basic Commands: "Combinding" Items
R will NOT NECESSARILY throw an error when using rbind() if the
number of columns doesn't match.
Pathological cases
rbind(M, c(1,2,3));
## Warning in rbind(M, c(1, 2, 3)): number of columns of result is not a
## multiple of vector length (arg 2)
## weight height
## dave 1 3
## pete 2 4
## 1 2
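If you would rather get an error than a truncated row, you can wrap rbind() in a small check. rbind_checked is a hypothetical helper written for this example, not part of base R.

```r
# Hypothetical helper: append a row only if its length matches ncol(m).
rbind_checked <- function(m, row) {
  if (length(row) != ncol(m)) {
    stop("row has ", length(row), " values but matrix has ", ncol(m), " columns")
  }
  rbind(m, row, deparse.level = 0)
}

M <- matrix(1:4, nrow = 2,
            dimnames = list(c("dave", "pete"), c("weight", "height")))

ok  <- rbind_checked(M, c(1, 2))      # fine: 2 values, 2 columns
bad <- tryCatch(rbind_checked(M, c(1, 2, 3)),
                error = function(e) conditionMessage(e))
bad
```

Here bad holds the error message rather than a silently shortened matrix.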
Basic Commands: "Combinding" Items
We can use cbind() and rbind() to combine
matrices/data.frames
newRow = c(1,2);
rbind(M, newRow); # still a bit sloppy. also notice row.name
## weight height
## dave 1 3
## pete 2 4
## newRow 1 2
Basic Commands: "Combinding" Items
We can use cbind() and rbind() to combine
matrices/data.frames
newRow = matrix(c(1,2), ncol = 2, nrow = 1);
rbind(M, newRow) # slightly better style
## weight height
## dave 1 3
## pete 2 4
## 1 2
Basic Commands: "Combinding" Items
We can use cbind() and rbind() to combine
matrices/data.frames
# another difference between assign vs "="
rbind(M, newRow <- matrix(c(1,2), ncol = 2, nrow = 1));
## weight height
## dave 1 3
## pete 2 4
## 1 2
Other Useful Commands
I include the ones I have used a lot in financial applications.
Basic Commands: Additional
x;
## [1] 2 3 5 1 8
rank(x);
## [1] 2 3 4 1 5
cut(x, breaks = 2);
## [1] (0.993,4.5] (0.993,4.5] (4.5,8.01] (0.993,4.5] (4.5,8.01]
## Levels: (0.993,4.5] (4.5,8.01]
Basic Application: Ranking Analysts
Here is an example:
data = data.frame(analystID = c(10001,10002,10003),
ret = c(0.05, -0.21, 0.62));
data;
## analystID ret
## 1 10001 0.05
## 2 10002 -0.21
## 3 10003 0.62
Basic Application: Ranking Analysts
Here is an example:
data$rank = rank(-data$ret);
data;
## analystID ret rank
## 1 10001 0.05 2
## 2 10002 -0.21 3
## 3 10003 0.62 1
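Two natural follow-ups are handling ties and sorting the table by rank. The sketch below adds a fourth, made-up analyst (10004) with a tied return to show both; ties.method = "min" is one of several options rank() offers.

```r
data <- data.frame(analystID = c(10001, 10002, 10003, 10004),
                   ret = c(0.05, -0.21, 0.62, 0.05))

# ties.method = "min" gives tied analysts the same (best) rank.
data$rank <- rank(-data$ret, ties.method = "min")

# order() returns the row permutation that sorts by rank.
ranked <- data[order(data$rank), ]
ranked
```

The default ties.method is "average", which would give the two tied analysts a rank of 2.5 each.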
Data Analysis with R
This is probably what you will be doing most.
Set Working Directory
Should be pretty self-explanatory. It sets the default location for
file input/output in R.
On Windows, the file path is written with double backslashes :(
setwd("C:\\Users\\bencharoenwong")
getwd(); #this gets the current working directory
## [1] "/Users/bencharoenwong/Dropbox/Citadel/R_overview"
setwd("/Users/bencharoenwong/"); #for Linux/Mac
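One way to sidestep the backslash issue is to build paths with file.path(), which joins the pieces with forward slashes (these also work on Windows). The folder and file names below are hypothetical.

```r
# file.path() joins path components with "/".
data_file <- file.path("data", "returns.csv")   # hypothetical folder and file
data_file

# Paths that do not start at the root are resolved relative to getwd().
basename(data_file)
```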
Reading in Data
The most commonly used commands will be:
read.csv()
read.table()
R can read many other formats:
Excel using read.xls in gdata or xlsx
SAS using read.sas7bdat in sas7bdat
SPSS using spss.get in Hmisc
Stata using read.dta in foreign
SQL using RMySQL, RPostgreSQL
etc.
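A minimal round trip with the most common reader, read.csv(). The data frame contents are made up, and a temporary file stands in for a real data file.

```r
df <- data.frame(ticker = c("GS", "MS"), ret = c(0.2, 0.1))

# Write the data out, then read it back in.
path <- tempfile(fileext = ".csv")
write.csv(df, path, row.names = FALSE)   # row.names = FALSE avoids an extra index column

back <- read.csv(path, stringsAsFactors = FALSE)  # keep text columns as character
str(back)
```

read.table() works the same way but makes you spell out options such as header = TRUE and sep = "," yourself.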
Generating Data
data = data.frame(GS = rnorm(100, mean = 0.2, sd = 0.4),
MS = rnorm(100, mean = 0.1, sd = 0.8));
head(data, n = 4); #default is n = 6
## GS MS
## 1 0.22541579 0.7166003
## 2 0.14879169 -0.1149048
## 3 0.07723751 0.5002469
## 4 -0.54295420 0.5789232
tail(data, n = 4);
## GS MS
## 97 -0.02045411 0.4689193
## 98 0.91976861 -0.9991719
Generating Data
You can draw random numbers from many distributions. The functions
usually take the form r + <distribution>.
You can include more distributions from other packages too.
runif (default U[0,1])
rnorm (default N(0,1))
rt (must set degrees of freedom df)
rchisq (must set degrees of freedom df)
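Random draws change on every run unless you fix the seed. The sketch below shows set.seed() (the value 42 is arbitrary) and the sibling prefixes d, p and q, which follow the same naming pattern as r.

```r
set.seed(42)                 # any fixed seed makes the draws repeatable
draw1 <- rnorm(3)
set.seed(42)
draw2 <- rnorm(3)
identical(draw1, draw2)

# Same naming pattern for density (d), CDF (p) and quantile (q).
pnorm(0)       # P(Z <= 0) for a standard normal
qnorm(0.5)     # median of a standard normal
dunif(0.3)     # density of U[0,1] at 0.3
```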
Plotting
Let's look at plotting in the base package. (I don't really use this
anymore. I use the package ggplot2).
plot(x = cars$speed, y = cars$dist); # default scatter, ugly hollow point
Plotting
plot(x = cars$speed, y = cars$dist, main = "Dist vs. Speed",
type = "p", pch = 19); # look up "pch" online...it's a pain!
Plotting
plot(x = cars$speed, y = cars$dist, main = "Dist vs. Speed",
type = "l", col = "red", xlab = "Speed", ylab= "Distance (ft)");
abline(a = 0.5, b = 10, col = 5); # line y = a + b*x (a = intercept, b = slope)
Plotting
plot(x = cars$speed, y = cars$dist, main = "Dist vs. Speed",
type = "l", col = "red", xlab = "Speed", ylab= "Distance (ft)");
abline(h = 20, col = "blue");
abline(v = 17, col = "green");
Statistics
The standard statistical functions work:
mean() computes sample mean
var(), or sd() which is just the sqrt(var())
median() (the result might not be a value in the data: with an even number
of observations, it averages the two middle values.)
cov() computes sample covariance
cor() computes sample correlation, can specify pair-wise and
correlation type.
hist() makes histograms
summary()
No built-in function for the statistical mode!
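Writing your own is short, though. Below is one possible sketch; stat_mode is a hypothetical name, and on ties it returns the value that appears first. (Base R does have a function called mode(), but it reports how an object is stored, not the most frequent value.)

```r
# Hypothetical helper: most frequent value, first one wins on ties.
stat_mode <- function(v) {
  uniq <- unique(v)
  counts <- tabulate(match(v, uniq))   # how often each unique value occurs
  uniq[which.max(counts)]
}

stat_mode(c(2, 3, 3, 5, 3, 2))
mode(c(2, 3, 3))   # "numeric": the storage mode, not the statistical mode
```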
Statistics
summary(cars);
## speed dist
## Min. : 4.0 Min. : 2.00
## 1st Qu.:12.0 1st Qu.: 26.00
## Median :15.0 Median : 36.00
## Mean :15.4 Mean : 42.98
## 3rd Qu.:19.0 3rd Qu.: 56.00
## Max. :25.0 Max. :120.00
mean(cars$speed); var(cars$speed); sqrt(var(cars$speed)) == sd(cars$speed);
## [1] 15.4
## [1] 27.95918
## [1] TRUE
Statistics
hist(cars$speed, breaks = 10);
Linear Models
Basic regression models can be estimated with the command lm.
There are more general and special versions of this.
summary(lm(speed ~ dist, data = cars));
##
## Call:
## lm(formula = speed ~ dist, data = cars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.5293 -2.1550 0.3615 2.4377 6.4179
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.28391 0.87438 9.474 1.44e-12 ***
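Printing the summary is fine for reading, but in practice you usually want the numbers themselves. A sketch of pulling pieces out of the fitted model, using the same regression as above (fit is an illustrative name):

```r
fit <- lm(speed ~ dist, data = cars)

coef(fit)                     # named vector: "(Intercept)" and "dist"
summary(fit)$r.squared        # R-squared as a plain number
head(residuals(fit), n = 3)   # first few residuals
```

Saving the model in a variable also lets you reuse it later, for example with predict().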
Programming with R
This will go through the basic loops. This will NOT cover
the apply family of functions, nor advanced function
application tools like dplyr or data.table.
For Loops
Pretty self explanatory.
for (i in c(1:5)) {
print('hi');
};
## [1] "hi"
## [1] "hi"
## [1] "hi"
## [1] "hi"
## [1] "hi"
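A slightly more realistic loop fills in a result vector. This sketch uses seq_along(), which behaves sensibly even when the input is empty, and compares the loop with the vectorized one-liner. The names x and squares are illustrative.

```r
x <- c(2, 3, 5, 1, 8)

squares <- numeric(length(x))   # preallocate the result
for (i in seq_along(x)) {
  squares[i] <- x[i]^2
}

# Element-wise operations usually make the loop unnecessary.
identical(squares, x^2)
```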
While Loops
Pretty self explanatory.
i = 0;
while (i < 5) {
print('hi');
i = i + 1;
};
## [1] "hi"
## [1] "hi"
## [1] "hi"
## [1] "hi"
## [1] "hi"
Conditionals
If, and if-else commands.
x = 5; y = 3;
if (x > y) {print("x more than y");}
## [1] "x more than y"
if (x <= y) {print ("x less than or equal to y");}
Conditionals
If, and if-else commands.
x = 5; y = 3;
if (x > y) {
print("x more than y");
} else {
print ("x less than or equal to y");
}
## [1] "x more than y"
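Note that if() expects a single TRUE or FALSE. To apply a condition to every element of a vector, ifelse() is the usual tool. A sketch with made-up values (labels is an illustrative name):

```r
x <- c(5, 1, 3); y <- 3

# ifelse() evaluates the condition element by element.
labels <- ifelse(x > y, "more", "less or equal")
labels
```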
Conclusion
R is a powerful tool!
Use R!
Depending on how much R you anticipate using in the future,
decide whether to pay the fixed cost of learning new
packages/commands.
There are also a lot of shady packages out there…
Time to quit with q().