RStudio is an integrated development environment for R that allows users to interact more easily with R by integrating different aspects of scripting, from code completion to debugging.
2. R is
A (not ideal) programming language
A collection of 6,700 packages (as of June 2015, so more now)
A software package for statistical computing and graphics
A work environment
Widely used
Powerful
Free
4. Some history
R was based on S, with code written in C
R was created in the 1990s by Ross Ihaka and Robert Gentleman
S was developed at Bell Labs, starting in the 1970s
S largely was used to make good graphs – not an easy thing
in 1975. R, like S, is quite good for graphing.
For lots of examples, see http://rgraphgallery.blogspot.com/
or http://www.r-graph-gallery.com/
(Or, for more detail, see http://docs.ggplot2.org/current/)
See ggplot2-cheatsheet-2.0.pdf
9. RStudio is
A gift, from J.J. Allaire (Macalester College, ‘91) to the world
An Integrated Development Environment (IDE) for R
Free – unless you want the newest version, with more
bells and whistles, and you are not eligible for the
educational discount (= free)
An easy (easier) way to use R
Available as a desktop product or, as used at OC, run
off of a file server.
RStudio supports publishing to RPubs – see http://rpubs.com/jawitmer
11. R is object-oriented
e.g., MyModel <- lm(wt ~ ht, data = mydata)
then hist(MyModel$residuals)
Note: lm(wt ~ ht*age + log(bp), data = mydata) regresses
wt on ht, age, the ht-by-age interaction, and log(bp).
There is no need to create the interaction or the log(bp)
variable outside of the lm() command.
Comparing nested models:
mod1 <- lm(wt ~ ht*age + log(bp), data = mydata)
mod2 <- lm(wt ~ ht + log(bp), data = mydata)
anova(mod2, mod1) gives a nested F-test
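The slide's mydata is not available here, so the following is only a sketch of the same steps using mtcars, a data frame that ships with R. The variable choices (mpg, wt, hp, disp) are stand-ins picked for illustration and are not from the original slides.

```r
# Illustration only: mtcars stands in for the slide's mydata.
my_model <- lm(mpg ~ wt, data = mtcars)  # the fitted model is stored as an object
class(my_model)                          # "lm"
res <- my_model$residuals                # components are extracted with $
hist(res)                                # histogram of the residuals

# The interaction and the log are built inside the formula,
# so no new columns need to be added to the data frame.
mod1 <- lm(mpg ~ wt * hp + log(disp), data = mtcars)  # wt, hp, wt:hp, log(disp)
mod2 <- lm(mpg ~ wt + log(disp), data = mtcars)       # nested inside mod1

# Smaller model first, larger model second: nested F-test on the dropped terms.
nested_test <- anova(mod2, mod1)
print(nested_test)
```

Here mod2 drops both hp and the wt-by-hp interaction, so the F-test has 2 numerator degrees of freedom.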
12. R as a programming language
If you want R to be (relatively) fast, take advantage of
vector operations; e.g., use the replicate command
(rather than a loop) or the tapply function.
E.g., replicate(25, addingLines(n=10)) calls the
addingLines function (something I wrote) 25 times.
(The first argument of replicate is named n, not k.)
> with(Dabbs, tapply(testosterone, occupation, mean))
Actor MD Minister Prof
12.7 11.6 8.4 10.6
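Neither addingLines nor the Dabbs data frame is available outside the talk, so this is a minimal sketch of the same two idioms with stand-ins: one_sample_mean is a hypothetical function invented for the example, and mtcars replaces Dabbs.

```r
set.seed(1)  # make the random draws reproducible

# Hypothetical stand-in for the author's addingLines():
# draws n standard normal values and returns their mean.
one_sample_mean <- function(n) mean(rnorm(n))

# replicate() evaluates the expression 25 times; no explicit loop is needed.
sims <- replicate(25, one_sample_mean(n = 10))
length(sims)

# tapply() applies a function within each group:
# here, the mean of mpg for each value of cyl.
group_means <- with(mtcars, tapply(mpg, cyl, mean))
group_means
```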
13. If you want to know how to do something in R
See the “Minimal R.pdf” handout
Go to the Quick-R.com page (http://www.statmethods.net/)
Google “How do I do xxx in R?”
A standing joke among R users is that the answer
is always “There are many ways to do that in R.”
See http://swirlstats.com/
See https://www.datacamp.com/home
14. Speaking of many ways to do something in R…
(1) mean(mydata$ht)
(2) with(mydata, mean(ht))
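(3) mean(ht, data=mydata) (only with the mosaic package loaded; base R's mean has no data argument, and the documented mosaic form is mean(~ht, data=mydata))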
However
(1) plot(mydata$ht,mydata$wt) works
(2) with(mydata, plot(ht,wt)) works
(3) plot(ht, wt, data=mydata) does not work
(3a) plot(wt~ht, data=mydata) works
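The contrast above can be checked in base R alone. This sketch assumes mtcars in place of mydata, with wt and mpg standing in for ht and wt; try() is used only so that the failing call does not stop the script.

```r
# Two base R routes to the same mean.
m1 <- mean(mtcars$mpg)
m2 <- with(mtcars, mean(mpg))
identical(m1, m2)

# Three plotting calls that work.
plot(mtcars$wt, mtcars$mpg)      # columns passed explicitly
with(mtcars, plot(wt, mpg))      # columns looked up inside mtcars
plot(mpg ~ wt, data = mtcars)    # formula interface, which accepts data =

# Without a formula, R looks for a free-standing object called wt
# before it ever consults data =, so this call fails.
bad <- try(plot(wt, mpg, data = mtcars), silent = TRUE)
inherits(bad, "try-error")
```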
15. The mosaic package (Kaplan, Pruim, Horton) was created
to make R easy to use for intro stats.
mosaic package syntax:
goal(y ~ x|z, data=mydata)
E.g.: tally(~sex, data=HELPrct)
E.g.: t.test(age ~ sex, data=HELPrct)
E.g.: favstats(age ~ substance|sex, data=HELPrct)
E.g.: t.test(age ~ sex, data=HELPrct)$p.value
See MinimalR-2pages.pdf
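The formula-plus-data pattern is not unique to mosaic; several base R functions accept it too. The sketch below is a base R analogue, not mosaic code, and it assumes the built-in ToothGrowth data frame in place of HELPrct. Note that base R's aggregate writes the grouping variables with + rather than mosaic's |.

```r
# Base R analogue of the goal(y ~ x, data = ...) pattern (not mosaic itself).

# Counts by group, roughly what tally(~sex, ...) does in mosaic.
counts <- xtabs(~ supp, data = ToothGrowth)
counts

# Two-sample t-test via a formula, then pull out just the p-value.
tt <- t.test(len ~ supp, data = ToothGrowth)
tt$p.value

# Group means of len for each combination of supp and dose.
by_group <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
by_group
```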
16. The mosaic package mPlot() command makes graphing easy.
mPlot(SaratogaHouses)
17. The openintro package edaPlot() command makes exploring
data graphically easy to do. edaPlot(SaratogaHouses)
18. The mosaic, tidyr, and dplyr packages handle SQL-type
work: merging files, extracting subsets, etc.
data(NCHS) #loads in the NCHS data frame
newNCHS <- NCHS %>% sample_n(size=5000) %>%
filter(age > 18)
#takes a sample of size 5000, extracts only the rows for
#which age > 18, and saves the result in newNCHS
#(each %>% must end a line, not start one, or R stops reading early)
See data-wrangling-cheatsheet.pdf
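For readers without dplyr installed, the same kinds of steps (sampling rows, extracting a subset, merging two tables) can be sketched in base R. The two small data frames below, people and visits, are made up for the example and have nothing to do with the NCHS data.

```r
set.seed(42)  # reproducible sample

# Made-up illustration data (not NCHS).
people <- data.frame(id = 1:8,
                     age = c(12, 25, 40, 17, 63, 30, 19, 8))
visits <- data.frame(id = c(2L, 3L, 3L, 5L, 7L),
                     bp = c(118, 130, 127, 141, 110))

# Sample 5 rows, then keep only the rows with age > 18
# (the base R counterpart of sample_n() followed by filter()).
sampled <- people[sample(nrow(people), size = 5), ]
adults <- subset(sampled, age > 18)

# Merge the two tables on their shared key, like an SQL inner join.
merged <- merge(people, visits, by = "id")
merged
```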
19. I use R, and the do() command in the mosaic package, for
simulations.
data(FirstYearGPA) #loads in the data frame
FY <- FirstYearGPA #rename the data frame
lm(GPA ~ SATM, data=FY) #gives 0.0012 as slope
lm(GPA ~ SATM, data=FY)$coeff[2] #just look at the slope
do(3)*lm(GPA ~ shuffle(SATM), data=FY)$coeff[2] #break link b/w GPA and SATM
null.dist <- do(1000)*lm(GPA ~ shuffle(SATM), data=FY)$coeff[2] #1000 random slopes
histogram(null.dist$SATM, v=0.0012) #look at the 1000 slopes
with(null.dist, tally(abs(SATM.)>=0.0012)) #How many are far from zero?
with(null.dist, tally(abs(SATM.)>=0.0012, format='prop')) #What proportion are far from zero?
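The same shuffle-and-refit idea can be sketched in base R without mosaic. This is an analogue under stated assumptions, not the code from the talk: replicate() and sample() take the place of do() and shuffle(), and mtcars (mpg regressed on wt) stands in for FirstYearGPA.

```r
set.seed(2015)  # reproducible shuffles

# Observed slope of mpg on wt.
obs_slope <- coef(lm(mpg ~ wt, data = mtcars))[2]

# sample(wt) permutes the predictor, breaking any link with mpg,
# so each refit gives one slope from the null distribution.
one_shuffled_slope <- function() {
  coef(lm(mpg ~ sample(wt), data = mtcars))[2]
}
null_slopes <- replicate(1000, one_shuffled_slope())

# Look at the 1000 null slopes, with the observed slope marked.
hist(null_slopes)
abline(v = obs_slope)

# How many, and what proportion, are at least as far from zero as the observed slope?
n_extreme <- sum(abs(null_slopes) >= abs(obs_slope))
p_value <- mean(abs(null_slopes) >= abs(obs_slope))
p_value
```

Comparing absolute values makes this a two-sided test, matching the abs() in the slide's tally calls.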
Editor's Notes
R is an interpreted language, but much of its core is written in C and compiled.
plot(wt~ht, data=mydata) feeds the plot command a formula, whereas plot(ht, wt, data=mydata) doesn’t