R Intro, Week 1
Scott Chamberlain [modified from Haldre Rogers]
September 9, 2011
Don't just listen to me! Other intros to R:
  http://www.stat.duke.edu/programs/gcc/ResourcesDocuments/RTutorial.pdf
  http://www.cyclismo.org/tutorial/R/
  http://www.r-tutor.com/r-introduction
  Quick R: http://www.statmethods.net/
  http://www.bioconductor.org/help/course-materials/2011/CSAMA/Monday/Morning%20Talks/R_intro.pdf
R user frameworks
  R from the command line (OS X and PC): just type "R" into the command line and have fun!
  R itself: http://www.r-project.org/
  RStudio (a good choice): http://www.rstudio.org/
  RevolutionR [free academic version], sort of the SAS-ised version of R: http://www.revolutionanalytics.com/downloads/free-academic.php
    Uses the proprietary .xdf file format, which speeds up computation
  Many other ways to use R, including GUIs, other IDEs, and a huge variety of text editors: https://github.com/RatRiceEEB/RIntroCode/wiki/R-Resources
  If you are afraid of the code interface, use Rattle, R Commander, Deducer, or Red R
    With these interfaces you can press buttons and then see which code does what
R user frameworks, cont.
  R from Python
    RPy: http://rpy.sourceforge.net/
  C++ from R
    Rcpp package: http://cran.r-project.org/web/packages/Rcpp/index.html and http://dirk.eddelbuettel.com/code/rcpp.html
    Can hugely speed up computation by writing functions in C++. The R function then calls compiled code instead of running R code.
    E.g., http://helmingstay.blogspot.com/2011/06/efficient-loops-in-r-complexity-versus.html and http://dirk.eddelbuettel.com/code/rcpp.examples.html
  Excel from R
    XLConnect package: http://cran.r-project.org/web/packages/XLConnect/index.html
  And more... see for yourself
R Tips
  R can crash. Do not rely on R's built-in text editor, and do not write code only in the R console. Instead, use a text editor that integrates with R. See here for links: https://github.com/RatRiceEEB/RIntroCode/wiki/R-Resources
  When asking for help on listservs or help websites, use BRIEF and REPRODUCIBLE examples. Not doing this makes people not want to help you!
  R overwrites files with the same file name without asking! Make sure you want to overwrite a file before doing so.
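To make the overwriting tip concrete, here is a minimal sketch of a guard around write.csv. The helper name safe_write_csv and the temporary file name are made up for illustration; they are not part of the original slides or of base R.

```r
# Hypothetical helper (not from the slides): refuse to overwrite an
# existing file unless the caller explicitly asks for it.
safe_write_csv <- function(x, path, overwrite = FALSE) {
  if (file.exists(path) && !overwrite) {
    stop("File already exists: ", path)
  }
  write.csv(x, path, row.names = FALSE)
  invisible(path)
}

# Demo in a temporary directory so nothing of yours gets touched.
out_file <- file.path(tempdir(), "safe_write_demo.csv")
unlink(out_file)                       # start from a clean slate
safe_write_csv(head(iris), out_file)   # first write succeeds

# A second write to the same path is refused instead of silently overwriting.
second_try <- tryCatch(
  safe_write_csv(head(iris), out_file),
  error = function(e) "refused"
)
```

Passing overwrite = TRUE gives the default behaviour back when you really do want to replace the file.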
Style
Not this kind of style…
This kind of style!!!
Style
  Style is important so YOU and OTHERS can read your code and actually use it
  Google style guide: http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html#generallayout
  Henrik Bengtsson style guide: http://www1.maths.lth.se/help/R/RCC/
  Hadley Wickham's style guide: https://github.com/hadley/devtools/wiki/Style
Preparing your data for R
  What makes clean data?
    Correct spelling
    Identical capitalization (e.g., Premna vs. premna)
      R is case sensitive: if myvector <- c(3, 4, 5), calling Myvector does not work!
    No spaces between words in column names (R turns spaces into ".")
      Generally try to avoid spaces; use underscores instead
    NA or blank (if using csv) for missing values
    Use find and replace to get rid of trailing spaces after words
  I generally keep both an .xls and a .csv file, so you can always recreate work in R from the .csv file and still modify the .xls file
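A small sketch of the points above, using only base R. The example column names and species values are invented for illustration.

```r
# R is case sensitive: myvector exists, Myvector does not.
myvector <- c(3, 4, 5)
has_lower <- exists("myvector")   # TRUE
has_upper <- exists("Myvector")   # FALSE

# Made-up messy column names: spaces, trailing space, mixed case.
raw_names <- c("Plant Height", "Species ", "leaf_width")
# Trim outer whitespace, turn inner spaces into underscores, lowercase.
clean_names <- tolower(gsub(" +", "_", trimws(raw_names)))

# Made-up messy values: capitalization and trailing-space variants
# of the same species collapse to one value after cleaning.
raw_species <- c("Premna", "premna ", "Premna")
clean_species <- unique(tolower(trimws(raw_species)))
```

Doing this kind of cleanup in the spreadsheet before export works too; the point is that the names and values R sees should be consistent.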
Bringing data into R
  Create a csv file
    One worksheet only
    No special formatting, filters, comments, etc.
    Copy only the columns and rows with your data to the CSV, as R will sometimes read in columns without data
    Name your variables well: self-explanatory, unique, lowercase, short-ish, one-word names
  In R, set the working directory
    setwd("/Users/ScottMac/Dropbox/R Group/Week1_R-Intro")
    What is the working directory? getwd()
    What is in the working directory? dir()
  Read in data
    CSV files: iris.df <- read.csv("iris_df.csv", header = TRUE)
    Clipboard: read.csv("clipboard") reads whatever you copied, like copying and pasting it (on macOS, read.csv(pipe("pbpaste")) is the usual equivalent)
    From the web: read.csv("http://explore.data.gov/download/pwaj-zn2n/CSV")
    From Excel files (using the XLConnect package): iris.df <- readWorksheetFromFile("/Users/ScottMac/Dropbox/R Group/Week1_R-Intro/iris_df.xlsx", sheet = "Sheet1")
  Write data
    write.csv(dataframe, "dataframename.csv"), OR
    save(iris, file = "iris.RData") [and load("iris.RData") to open in R]
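Here is a self-contained round trip you can run without any files of your own. It is only a sketch: it uses R's built-in iris data and a temporary directory instead of the working directory and file paths from the slide.

```r
# Write the built-in iris data to a CSV in a temporary directory.
csv_path <- file.path(tempdir(), "iris_df.csv")
write.csv(iris, csv_path, row.names = FALSE)

# Read it back in; header = TRUE treats the first row as column names.
iris.df <- read.csv(csv_path, header = TRUE)

# Save the object in R's own format; note the file = argument.
rdata_path <- file.path(tempdir(), "iris.RData")
save(iris.df, file = rdata_path)

# Remove the object, then restore it under the same name with load().
rm(iris.df)
load(rdata_path)
dim(iris.df)
```

Note that load() restores objects under the names they were saved with, so there is no assignment on that line.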
R data structures
  Scalar: object with a single value, either numeric or character (in R, a vector of length one)
  Vector: sequence of values of a single type, e.g., numeric or character; may include NA
  List: arbitrary collection of objects; a very useful R object
  Character: text, e.g., "this is some text"
  Factor: like a character vector, but only with values in predefined "levels"
  Matrix: two-dimensional, with all values of a single type (most often numeric)
  Dataframe: each column can be of a different class
  Immutable dataframe: special dataframe used in the plyr package for faster dataframe manipulation; it references the original dataframe instead of copying it
  Function
  Environment
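A quick sketch that builds one of each of the common structures above in base R. All object names are made up for illustration.

```r
x_scalar <- 3.14                     # a "scalar" is just a length-1 vector
x_vector <- c(3, 4, NA)              # numeric vector with a missing value
x_mixed  <- c(1, "a")                # mixing types coerces to one type
x_char   <- "this is some text"
x_factor <- factor(c("low", "high", "low"), levels = c("low", "high"))
x_matrix <- matrix(1:6, nrow = 2)    # 2 rows, 3 columns, one type
x_list   <- list(number = 1, text = "a", vec = 1:3)   # anything goes
x_df     <- data.frame(id = 1:3, name = c("a", "b", "c"))

class(x_mixed)      # the number 1 was turned into the string "1"
levels(x_factor)
dim(x_matrix)
str(x_list)
str(x_df)
```

The x_mixed line is the one to remember: a vector holds a single type, so use a list or dataframe when you need to mix types.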
Exploring dataframes
  str(dataframe) gives column formats and dimensions
  head(dataframe) and tail(dataframe) give the first and last 6 rows
  names(dataframe) gives column names
  row.names(dataframe) gives row names
  attributes(dataframe) gives column and row names and object class
  summary(dataframe) gives a lot of good information
    Make sure variables have the appropriate type: character/string, numeric, factor, integer, logical
    Make sure mins, maxes, means, etc. seem right
  Make sure you don't have typing errors that turn Premna and premna into two separate factor levels
    Use unique(iris$species) to see all unique values of a column
    Or use levels(spider$species) to see the different levels
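The same checks, run on R's built-in iris dataframe so you can try them right away. One assumption to flag: the built-in data names the column Species with a capital S, unlike the lowercase species in the slide.

```r
str(iris)                        # column types and dimensions
first_rows <- head(iris)         # first 6 rows by default
last_rows  <- tail(iris, 3)      # ask for a different number of rows
names(iris)                      # column names
summary(iris)                    # per-column min, max, mean, counts

# Check a categorical column for stray spellings or capitalization.
species_values <- unique(iris$Species)
species_levels <- levels(iris$Species)
species_levels
```

If a typo had crept in, it would show up here as an extra level.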
To attach or not to attach... that is the question
  Some like to use 'attach' to make dataframe variables accessible by name within the R session
  Generally, 'attach' is frowned upon by R junkies
  Instead, use dataframe$y, or data = dataframe, or dataframe[, "y"], or dataframe[, 2]
  To detach the object, use: detach()
  I recommend: do not use attach, but do what you want
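The alternatives to attach, sketched on the built-in iris data. The with() call at the end is an extra base R option that is not on the slide.

```r
# Three ways to get at one column without attach().
sepal_a <- iris$Sepal.Length        # by name with $
sepal_b <- iris[, "Sepal.Length"]   # by name with brackets
sepal_c <- iris[, 1]                # by position (fragile if columns move)

# Many modelling functions take a data = argument instead.
fit <- lm(Sepal.Length ~ Petal.Length, data = iris)

# with() evaluates an expression inside the dataframe, for this call only.
mean_with <- with(iris, mean(Sepal.Length))
```

All three extraction styles return the same vector; picking by name is usually safer than picking by position.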
R Packages
  3,262 packages!!!!
  Packages are extensions written by anyone for any purpose, usually installed with install.packages("packagename"), then loaded with require(packagename) or library(packagename)
  Use ?functionname for help on any function in base R or in loaded R packages
  In RStudio, just press tab when in parentheses after the function name to see function options!!!
  Explore packages at the CRAN site: http://cran.r-project.org/web/packages/
  Inside-R package reference: http://www.inside-r.org/packages
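A minimal sketch of the install-then-load pattern. To keep it runnable without a network connection, it uses the stats package, which ships with R; swap in any CRAN package name. The variable name pkg is just for illustration.

```r
pkg <- "stats"   # placeholder; replace with the CRAN package you want

# Install only if the package is not already available.
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# Load it; character.only = TRUE lets library() take the name from a variable.
library(pkg, character.only = TRUE)

# Confirm it is attached to the search path.
is_attached <- paste0("package:", pkg) %in% search()
is_attached
```

Interactively you would normally just type library(packagename) with the name unquoted.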
Data manipulation
  Packages: plyr, data.table, doBy, sqldf, reshape2, and more
  Comparison of packages, modified from code on the Recipes, scripts and Genomics blog: https://gist.github.com/878919
  data.table is by far the fastest in that comparison!!! BUT plyr may win on ease of use and flexibility. See for yourself...
  Also, see the examples in the tutorial code for the reshape2 package for neat data manipulation tricks
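For a feel of what these packages do, here is the kind of split-and-summarise task they are built for, written with base R's aggregate() so it runs without installing anything. This is a baseline sketch, not the comparison code from the gist; plyr and data.table each have their own syntax for the same operation.

```r
# Mean sepal length for each species in the built-in iris data.
means_by_species <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
means_by_species

# The same numbers via tapply(), as a cross-check.
means_check <- tapply(iris$Sepal.Length, iris$Species, mean)
```

The result is a small dataframe with one row per group, which is the shape most of the packages above return as well.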
Visualizations
  A few different approaches:
    Base graphics
    Lattice graphics
    Grid graphics
    ggplot2 graphics
  Further reading: http://www.slideshare.net/dataspora/a-survey-of-r-graphics
  An example:
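The figure from the original slide is not reproduced here, so as a stand-in, here is a minimal base graphics sketch using the built-in iris data. It is not the author's example, and the output file name is made up. It writes to a PDF in a temporary directory so it also runs outside an interactive session.

```r
plot_file <- file.path(tempdir(), "iris_scatter.pdf")
pdf(plot_file)   # open a PDF graphics device

# Scatterplot of petal size, one colour per species.
species_codes <- as.integer(iris$Species)
plot(iris$Petal.Length, iris$Petal.Width,
     col = species_codes,
     xlab = "Petal length (cm)", ylab = "Petal width (cm)",
     main = "Iris petals by species")
legend("topleft", legend = levels(iris$Species),
       col = seq_along(levels(iris$Species)), pch = 1)

dev.off()        # close the device so the file is written
```

In an interactive session, drop the pdf() and dev.off() lines and the plot appears in the graphics window instead.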
More on ggplot2 graphics
  There are classes taught by Hadley Wickham here at Rice if you want to learn more!
    Data visualization (Stat645): http://had.co.nz/stat645/
    Statistical computing (Stat405): http://had.co.nz/stat405/
  Hadley's website is really helpful: http://had.co.nz/ggplot2/
  The ggplot2 Google Groups site: https://groups.google.com/forum/#!forum/ggplot2
QUICK RSTUDIO RUN THROUGH
  Keyboard shortcuts!! http://www.rstudio.org/docs/using/keyboard_shortcuts
USE CASE HERE [see intro_usecase.R file]

Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 

R Introduction

  • 1. R Intro, Week 1 Scott Chamberlain [modified from Haldre Rogers] September 9, 2011
  • 2. Don’t just listen to me! Other Intros to R: http://www.stat.duke.edu/programs/gcc/ResourcesDocuments/RTutorial.pdf http://www.cyclismo.org/tutorial/R/ http://www.r-tutor.com/r-introduction Quick R: http://www.statmethods.net/ http://www.bioconductor.org/help/course-materials/2011/CSAMA/Monday/Morning%20Talks/R_intro.pdf
  • 3. R user frameworks R from command line: OSX and PC Just type “R” into the command line – and have fun! R itself http://www.r-project.org/ RStudio – good choice http://www.rstudio.org/ RevolutionR [free academic version] – this is sort of the SAS-ised version of R http://www.revolutionanalytics.com/downloads/free-academic.php Uses proprietary .xdf file format that speeds up computation times Many other ways to use R, including GUIs, other IDEs, and huge variety of text editors https://github.com/RatRiceEEB/RIntroCode/wiki/R-Resources If you are afraid of the code interface, use Rattle, or R Commander, or Deducer, or Red R You can learn using these interfaces what code does what after pressing buttons
  • 4. R user frameworks, cont. R from Python RPy: http://rpy.sourceforge.net/ C++ from R: Rcpp package: http://cran.r-project.org/web/packages/Rcpp/index.html http://dirk.eddelbuettel.com/code/rcpp.html Can hugely speed up computation times by writing the body of an R function in C++. The function then calls compiled code instead of running interpreted R. E.g., http://helmingstay.blogspot.com/2011/06/efficient-loops-in-r-complexity-versus.html & http://dirk.eddelbuettel.com/code/rcpp.examples.html Excel from R XLConnect package: http://cran.r-project.org/web/packages/XLConnect/index.html And more... see for yourself
  • 5. R Tips R can crash: do not use R’s built-in text editor or write code solely in the R console. Instead use any text editor that integrates with R. See here for links: https://github.com/RatRiceEEB/RIntroCode/wiki/R-Resources When asking for help on listservs/help websites, use BRIEF and REPRODUCIBLE examples. Not doing this makes people not want to help you! R automatically overwrites files with the same file name!!!! Make sure you want to overwrite a file before doing so
  • 7. Not this kind of style…
  • 8. This kind of style!!!
  • 9. Style Style is important so YOU and OTHERS can read your code and actually use it Google style guide: http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html#generallayout Henrik Bengtsson style guide: http://www1.maths.lth.se/help/R/RCC/ Hadley Wickham's style guide: https://github.com/hadley/devtools/wiki/Style
  • 10. Preparing your data for R What makes clean data? Correct spelling Identical capitalization (e.g. Premna vs. premna); R is case sensitive: if myvector <- c(3, 4, 5), calling Myvector does not work! No spaces in column names (R turns spaces into “.”); generally avoid them and use underscores instead NA or blank (if using csv) for missing values Find and replace to get rid of trailing spaces after words I generally keep an .xls and a .csv file so you can always recreate work in R with the .csv file and still modify the .xls file
  • 11. Bringing data into R Create csv file One worksheet only No special formatting, filters, comments etc. Copy only columns and rows with your data to the CSV, as R will sometimes read in columns without data Name your variables well: self-explanatory, unique, lowercase, short-ish, one-word names In R, set the working directory: setwd("/Users/ScottMac/Dropbox/R Group/Week1_R-Intro") What is the working directory? getwd() What is in the working directory? dir() Read in data CSV files: iris.df <- read.csv("iris_df.csv", header=T) Clipboard (on Windows): read.csv("clipboard") reads in data as if cutting and pasting it From web: read.csv("http://explore.data.gov/download/pwaj-zn2n/CSV") From Excel files (using the XLConnect package): iris.df <- readWorksheetFromFile("/Users/ScottMac/Dropbox/R Group/Week1_R-Intro/iris_df.xlsx", sheet="Sheet1") Write data: write.csv(dataframe, "dataframename.csv"), OR save(iris, file="iris.RData") [and load("iris.RData") to open in R]
  • 12. R data structures Scalar: Object with a single value, either numeric or character (in R, a vector of length one) Vector: Sequence of values of a single type (e.g., numeric, character, or logical), which may include NA List: Arbitrary collections of variables – very useful R object Character: Text, e.g., “this is some text” Factor: Like character vectors, but only w/ values in predefined “levels” Matrix: Two-dimensional; all values must be of the same type (usually numeric) Dataframe: Each column can be of a different class Immutable dataframe: special dataframe used in plyr package for faster dataframe manipulation, it references the original dataframe for faster calculations Function Environment
  • 13. Exploring dataframes str(dataframe) gives column formats and dimensions head(dataframe) and tail(dataframe) give first and last 6 rows names(dataframe) gives column names row.names(dataframe) gives row names attributes(dataframe) gives column and row names and object class summary(dataframe) gives a lot of good information Make sure variables are in the appropriate form: Character/string, Numeric, Factor, Integer, Logical Make sure mins, maxs, means, etc. seem right Make sure you don’t have typing errors, or Premna and premna will be treated as two separate factor levels Use: unique(iris$species) to see what all unique values of a column are Or use: levels(spider$species) to see different levels
  • 14. To attach or not to attach…that is the question Some like to use ‘attach’ to make dataframe variables accessible by name within the R session. Generally, ‘attach’ is frowned upon by R junkies. Instead use dataframe$y, or data=dataframe, or dataframe[, "y"], or dataframe[, 2] To detach the object, use: detach(dataframe). I recommend: do not use attach, but do what you want
  • 15. R Packages 3,262 packages!!!! Packages are extensions written by anyone for any purpose, usually installed with install.packages("packagename"), then loaded with require(packagename) or library(packagename) Use ?functionname for help on any function in base R or in loaded R packages In RStudio, just press tab when in parentheses after the function name to see function options!!! Explore packages at the CRAN site: http://cran.r-project.org/web/packages/ Inside-R package reference: http://www.inside-r.org/packages
  • 16. Data manipulation Packages: plyr, data.table, doBy, sqldf, reshape2, and more Comparison of packages, modified from code from the Recipes, scripts and Genomics blog: https://gist.github.com/878919 data.table is by far the fastest!!! BUT, ease of use and flexibility may favor plyr? See for yourself… Also, see examples in the tutorial code for reshape2 package for neat data manipulation tricks
  • 17. Visualizations A few different approaches: Base graphics Lattice graphics Grid graphics ggplot2 graphics Further reading: http://www.slideshare.net/dataspora/a-survey-of-r-graphics An example:
  • 18. more on ggplot2 graphics There are classes taught by Hadley Wickham here at Rice if you want to learn more! Data visualization (Stat645): http://had.co.nz/stat645/ Statistical computing (Stat405): http://had.co.nz/stat405/ Hadley’s website is really helpful: http://had.co.nz/ggplot2/ The ggplot2 google groups site: https://groups.google.com/forum/#!forum/ggplot2
  • 19. QUICK RSTUDIO RUN THROUGH Keyboard shortcuts!! http://www.rstudio.org/docs/using/keyboard_shortcuts
  • 20. USE CASE HERE [see intro_usecase.R file]
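
The import, exploration, and no-attach advice from slides 11, 13, and 14 can be sketched in a few lines. This is only a minimal illustration, not the contents of intro_usecase.R: it assumes R's built-in iris data and temporary files (csv_path and rdata_path are made-up names) so that it runs without the course files. stringsAsFactors is set explicitly because its default differs between R versions.

```r
# Hypothetical round trip with R's built-in iris data and temporary files,
# so nothing in the working directory gets overwritten.
csv_path <- tempfile(fileext = ".csv")
write.csv(iris, csv_path, row.names = FALSE)

# header = TRUE: the first row holds the variable names.
iris.df <- read.csv(csv_path, header = TRUE, stringsAsFactors = TRUE)

# Explore the dataframe before doing anything else.
str(iris.df)             # column formats and dimensions
head(iris.df)            # first 6 rows
summary(iris.df)         # mins, maxs, means, factor counts
unique(iris.df$Species)  # all unique values of a column
levels(iris.df$Species)  # factor levels

# Access columns without attach().
mean_sepal <- mean(iris.df$Sepal.Length)
first_col  <- iris.df[, "Sepal.Length"]
second_col <- iris.df[, 2]

# save() needs the file name passed as file =, and load() restores the object.
rdata_path <- tempfile(fileext = ".RData")
save(iris.df, file = rdata_path)
rm(iris.df)
load(rdata_path)
```

Writing to tempfile() paths is a deliberate choice here, since R overwrites an existing file of the same name without asking.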

Editor's Notes

  1. header=T (short for header=TRUE) means the first row contains variable names
  2. Some numbers are actually factors: think of 0/1 for dead/alive, or zip codes (what would an average zip code mean?)
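
A small sketch of the second note, using made-up values (the survival and zipcode vectors below are hypothetical, not course data): R will happily average numeric codes, so converting them with factor() makes the intent explicit.

```r
# Hypothetical data: survival coded 0/1 and a few made-up zip codes.
survival <- c(0, 1, 1, 0, 1)
zipcode  <- c(77005, 77005, 77030, 77098, 77030)

# R computes this without complaint, but an average zip code is meaningless.
mean(zipcode)

# Convert the codes to factors so R treats them as categories.
survival.f <- factor(survival, levels = c(0, 1), labels = c("dead", "alive"))
zipcode.f  <- factor(zipcode)

table(survival.f)   # counts per category
levels(zipcode.f)   # the distinct zip codes, as labels
summary(zipcode.f)  # counts per level instead of min/mean/max
```

Calling mean() on zipcode.f now returns NA with a warning rather than a misleading number.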