Introduction to R and RStudio
R is
A (not ideal) programming language
A collection of 6,700 packages (as of June 2015, so more now)
A software package for statistical computing and graphics
A work environment
Widely used
Powerful
Free
Some history
R was based on S, with code written in C
R was created in the 1990s by Ross Ihaka and Robert Gentleman
S was developed at Bell Labs, starting in the 1970s
S largely was used to make good graphs – not an easy thing
in 1975. R, like S, is quite good for graphing.
For lots of examples, see http://rgraphgallery.blogspot.com/
or http://www.r-graph-gallery.com/
(Or for more detail, see http://docs.ggplot2.org/current/)
See ggplot2-cheatsheet-2.0.pdf
A few simple graphs using the ggplot2 package
An example of graphing using the GGally package in R
Who uses R?
RStudio is
A gift, from J.J. Allaire (Macalester College, ‘91) to the world
An Integrated Development Environment (IDE) for R
Free – unless you want the newest version, with more
bells and whistles, and you are not eligible for the
educational discount (= free)
An easy (easier) way to use R
Available as a desktop product or, as used at OC, run
off of a file server.
RStudio supports RPubs – see http://rpubs.com/jawitmer
RStudio screen shot
R is object-oriented
e.g., MyModel <- lm(wt ~ ht, data = mydata)
then hist(MyModel$residuals)
Note: lm(wt ~ ht*age + log(bp), data = mydata) regresses
wt on ht, age, the ht-by-age interaction, and log(bp).
There is no need to create the interaction or the log(bp)
variable outside of the lm() command.
Comparing nested models:
mod1 <- lm(wt ~ ht*age + log(bp), data = mydata)
mod2 <- lm(wt ~ ht + log(bp), data = mydata)
anova(mod2, mod1) gives a nested F-test
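
A minimal sketch of the nested-model comparison above, run on R's built-in mtcars data because the slide's mydata is not supplied. The variable choices (mpg, wt, hp, disp) are stand-ins picked only for illustration, and the names full and reduced are assumptions of this sketch.

```r
# Stand-in for the slide's example, using the built-in mtcars data frame.
# mpg plays the role of wt; wt, hp and disp play the roles of ht, age and bp.
full    <- lm(mpg ~ wt * hp + log(disp), data = mtcars)  # wt, hp, wt:hp, log(disp)
reduced <- lm(mpg ~ wt + log(disp), data = mtcars)       # drops hp and wt:hp

# The formula builds the interaction and the log term itself
print(names(coef(full)))

# Nested F-test: do hp and the wt:hp interaction improve the fit?
comparison <- anova(reduced, full)
print(comparison)

# A fitted model is an object; its pieces can be pulled out with $
print(summary(full$residuals))
```

The smaller model goes first in anova(), matching the slide's anova(mod2, mod1).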
R as a programming language
If you want R to be (relatively) fast, take advantage of
vector operations; e.g., use the replicate command
(rather than a loop) or the tapply function.
E.g., replicate(25, addingLines(n=10)) calls the
addingLines function (something I wrote) 25 times.
> with(Dabbs, tapply(testosterone, occupation, mean))
   Actor       MD Minister     Prof
    12.7     11.6      8.4     10.6
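
A short sketch of both commands in base R. The author's addingLines function and the Dabbs data are not available here, so one_sample_mean is a hypothetical stand-in and the built-in mtcars data replaces Dabbs.

```r
set.seed(1)  # make the random draws repeatable

# Hypothetical stand-in for the author's addingLines(): any function
# whose result varies from call to call will do
one_sample_mean <- function(n) mean(rnorm(n))

# replicate(n, expr) re-evaluates expr n times and collects the results
means <- replicate(25, one_sample_mean(n = 10))
print(length(means))   # 25

# The equivalent explicit loop, for comparison
means_loop <- numeric(25)
for (i in seq_len(25)) means_loop[i] <- one_sample_mean(n = 10)

# tapply(values, groups, FUN) applies FUN within each group;
# here: mean mpg for each cylinder count in mtcars
by_cyl <- with(mtcars, tapply(mpg, cyl, mean))
print(round(by_cyl, 1))
```

The first argument of replicate() is named n, so the repeat count is passed by position here to avoid clashing with the n argument of the function being called.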
If you want to know how to do something in R
See the “Minimal R.pdf” handout
Go to the Quick-R.com page (http://www.statmethods.net/)
Google “How do I do xxx in R?”
A standing joke among R users is that the answer
is always “There are many ways to do that in R.”
See http://swirlstats.com/
See https://www.datacamp.com/home
Speaking of many ways to do something in R…
(1) mean(mydata$ht)
(2) with(mydata, mean(ht))
(3) mean(ht, data=mydata) (relies on the mosaic package; base R's mean() has no data argument)
However
(1) plot(mydata$ht,mydata$wt) works
(2) with(mydata, plot(ht,wt)) works
(3) plot(ht, wt, data=mydata) does not work
(3a) plot(wt~ht, data=mydata) works
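
The contrast above can be tried in base R with a small made-up data frame. The values in mydata below are invented for illustration; only the column names ht and wt come from the slide.

```r
# Hypothetical stand-in for the slide's mydata (values are made up)
mydata <- data.frame(ht = c(60, 64, 68, 72, 75),
                     wt = c(110, 130, 155, 180, 200))

# Two base-R ways to reach a column inside a data frame
m1 <- mean(mydata$ht)
m2 <- with(mydata, mean(ht))
print(c(m1, m2))

# A formula is an unevaluated description of a relationship, not a set of values
f <- wt ~ ht
print(class(f))      # "formula"
print(all.vars(f))   # "wt" "ht"

# The formula method of plot() looks ht and wt up in `data`
plot(wt ~ ht, data = mydata)

# Without a formula, ht is evaluated before plot() can consult `data`,
# so the call fails when no object named ht exists in the workspace
result <- tryCatch(plot(ht, wt, data = mydata),
                   error = function(e) conditionMessage(e))
print(result)
```

Wrapping the failing call in tryCatch() lets the script continue and prints the error message instead of stopping.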
The mosaic package (Kaplan, Pruim, Horton) was created
to make R easy to use for intro stats.
mosaic package syntax:
goal(y ~ x|z, data=mydata)
E.g.: tally(~sex, data=HELPrct)
E.g.: t.test(age ~ sex, data=HELPrct)
E.g.: favstats(age ~ substance|sex, data=HELPrct)
E.g.: t.test(age ~ sex, data=HELPrct)$p.value
See MinimalR-2pages.pdf
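
For readers without mosaic installed, several base R functions follow the same goal(formula, data) pattern. The sketch below uses the built-in mtcars data rather than HELPrct, and the base functions shown are rough analogues, not the mosaic commands themselves.

```r
# Counts of one categorical variable, like tally(~sex, data=HELPrct)
counts <- xtabs(~ cyl, data = mtcars)
print(counts)

# Group means of y by two grouping variables, similar in spirit to
# favstats(y ~ x | z); base R formulas combine groups with + rather than |
group_means <- aggregate(mpg ~ am + cyl, data = mtcars, FUN = mean)
print(group_means)

# Two-sample t-test from a formula; the result is a list, so $p.value works
p <- t.test(mpg ~ am, data = mtcars)$p.value
print(p)
```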
The mosaic package mPlot() command makes graphing easy.
mPlot(SaratogaHouses)
The openintro package edaPlot() command makes exploring
data graphically easy to do. edaPlot(SaratogaHouses)
The mosaic, tidyr, and dplyr packages handle SQL-type
work: merging files, extracting subsets, etc.
data(NCHS) #loads in the NCHS data frame
newNCHS <- NCHS %>% sample_n(size=5000) %>%
  filter(age > 18)
#takes a sample of size 5000, extracts only the rows for
#which age > 18, and saves the result in newNCHS
See data-wrangling-cheatsheet.pdf
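
A sketch of the same kind of pipeline on the built-in mtcars data, assuming the dplyr package is installed. The lookup table cyl_labels and its size column are made up to illustrate a merge.

```r
library(dplyr)  # assumed installed; provides %>%, sample_n, filter, left_join

set.seed(42)  # repeatable sample

# Same shape as the slide's pipeline: draw 20 rows at random,
# then keep only those with mpg above 18
small <- mtcars %>%
  sample_n(size = 20) %>%
  filter(mpg > 18)
print(nrow(small))   # at most 20, because filter() runs after sampling

# A SQL-style join: attach a label to each row by matching on cyl
# (cyl_labels is a made-up lookup table)
cyl_labels <- data.frame(cyl = c(4, 6, 8),
                         size = c("small", "medium", "large"))
labelled <- mtcars %>% left_join(cyl_labels, by = "cyl")
print(table(labelled$size))
```

Note that %>% sits at the end of each line. A line that starts with %>% is read by R as a new, incomplete statement.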
I use R, and the do() command in the mosaic package, for
simulations.
data(FirstYearGPA) #loads in the data frame
FY <- FirstYearGPA #rename the data frame
lm(GPA ~ SATM, data=FY) #gives 0.0012 as slope
lm(GPA ~ SATM, data=FY)$coeff[2] #just look at the slope
do(3)*lm(GPA ~ shuffle(SATM), data=FY)$coeff[2] #break link b/w GPA and SATM
null.dist <- do(1000)*lm(GPA ~ shuffle(SATM), data=FY)$coeff[2] #1000 random slopes
histogram(null.dist$SATM., v=0.0012) #look at the 1000 slopes
with(null.dist, tally(abs(SATM.)>=0.0012)) #How many are far from zero?
with(null.dist, tally(abs(SATM.)>=0.0012, format='prop')) #What proportion are far from zero?
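
The same shuffle-and-refit idea can be sketched in base R without mosaic. This version swaps in replicate() for do() and sample() for shuffle(), and uses the built-in mtcars data because FirstYearGPA is not part of base R; the helper name one_null_slope is an assumption of this sketch.

```r
set.seed(2015)  # repeatable shuffles

# Observed slope of mpg on wt in mtcars (a stand-in for GPA on SATM)
obs_slope <- coef(lm(mpg ~ wt, data = mtcars))[2]

# sample() with no size argument returns a random permutation,
# which plays the role of mosaic's shuffle()
one_null_slope <- function() {
  coef(lm(mpg ~ sample(wt), data = mtcars))[2]
}

# 1000 slopes computed with the link between mpg and wt broken
null_slopes <- replicate(1000, one_null_slope())

# Two-sided permutation p-value: share of shuffled slopes at least
# as far from zero as the observed one
p_value <- mean(abs(null_slopes) >= abs(obs_slope))
print(obs_slope)
print(p_value)
```

Taking the mean of a logical vector gives the proportion of TRUE values, which is what tally(..., format='prop') reports in the mosaic version.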

RStudio is an integrated development environment for R that allows users to interact more easily with R by integrating different aspects of scripting, from code completion to debugging.


Editor's Notes

  1. R is an interpreted language, but with much of it compiled in C.
  2. plot(wt~ht, data=mydata) feeds the plot command a formula, which plot can evaluate inside mydata, whereas plot(ht, wt, data=mydata) doesn’t