- 1. R Intro, Week 1
  - Scott Chamberlain [modified from Haldre Rogers]
  - September 9, 2011
- 2. Don’t just listen to me! Other intros to R:
  - http://www.stat.duke.edu/programs/gcc/ResourcesDocuments/RTutorial.pdf
  - http://www.cyclismo.org/tutorial/R/
  - http://www.r-tutor.com/r-introduction
  - Quick R: http://www.statmethods.net/
  - http://www.bioconductor.org/help/course-materials/2011/CSAMA/Monday/Morning%20Talks/R_intro.pdf
- 3. R user frameworks
  - R from the command line (OS X and PC)
    - Just type “R” at the command line and have fun!
  - R itself
    - http://www.r-project.org/
  - RStudio, a good choice
    - http://www.rstudio.org/
  - RevolutionR [free academic version], sort of the SAS-ised version of R
    - http://www.revolutionanalytics.com/downloads/free-academic.php
    - Uses the proprietary .xdf file format, which speeds up computation
  - Many other ways to use R, including GUIs, other IDEs, and a huge variety of text editors
    - https://github.com/RatRiceEEB/RIntroCode/wiki/R-Resources
  - If you are afraid of the code interface, use Rattle, R Commander, Deducer, or Red R
    - With these interfaces you can press buttons and then see which code does what
- 4. R user frameworks, cont.
  - R from Python
    - RPy: http://rpy.sourceforge.net/
  - C++ from R
    - Rcpp package:
      - http://cran.r-project.org/web/packages/Rcpp/index.html
      - http://dirk.eddelbuettel.com/code/rcpp.html
    - Can hugely speed up computation by writing the slow parts of an R function in C++. The R function then calls the compiled code instead of running interpreted R.
    - E.g., http://helmingstay.blogspot.com/2011/06/efficient-loops-in-r-complexity-versus.html and http://dirk.eddelbuettel.com/code/rcpp.examples.html
  - Excel from R
    - XLConnect package: http://cran.r-project.org/web/packages/XLConnect/index.html
  - And more... see for yourself
- 5. R Tips
  - R can crash, so do not write code solely in R’s built-in text editor or in the R console. Instead, use any text editor that integrates with R. See here for links:
    - https://github.com/RatRiceEEB/RIntroCode/wiki/R-Resources
  - When asking for help on mailing lists or help websites, use BRIEF and REPRODUCIBLE examples
    - Not doing this makes people not want to help you!
  - R overwrites files with the same file name without warning!
    - Make sure you want to overwrite a file before doing so
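
The overwrite tip can be guarded against with a small check before writing. This is a minimal sketch in base R; `write_csv_safely` is a hypothetical helper name used for illustration, not a function from any package.

```r
# Hypothetical helper: refuse to overwrite an existing file unless asked to.
write_csv_safely <- function(df, path, overwrite = FALSE) {
  if (file.exists(path) && !overwrite) {
    stop("File already exists: ", path, " (set overwrite = TRUE to replace it)")
  }
  write.csv(df, path, row.names = FALSE)
  invisible(path)
}

# Usage: the second call fails instead of silently replacing the file.
out_file <- file.path(tempdir(), "safe_write_demo.csv")
unlink(out_file)                          # make sure the demo starts clean
write_csv_safely(head(iris), out_file)
second_try <- try(write_csv_safely(head(iris), out_file), silent = TRUE)
inherits(second_try, "try-error")         # TRUE: the existing file was left alone
```

Passing `overwrite = TRUE` makes replacing a file a deliberate choice rather than an accident.
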
- 6. Style
- 7. Not this kind of style…
- 8. This kind of style!!!
- 9. Style
  - Style is important so YOU and OTHERS can read your code and actually use it
  - Google style guide: http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html#generallayout
  - Henrik Bengtsson style guide: http://www1.maths.lth.se/help/R/RCC/
  - Hadley Wickham's style guide: https://github.com/hadley/devtools/wiki/Style
- 10. Preparing your data for R
  - What makes clean data?
    - Correct spelling
    - Identical capitalization (e.g., Premna vs. premna)
      - R is case-sensitive: if myvector <- c(3, 4, 5), calling Myvector does not work!
    - No spaces in names (R turns spaces in column names into “.”)
      - Generally avoid spaces; use underscores instead
    - NA or blank (if using CSV) for missing values
    - Use find and replace to get rid of trailing spaces after words
  - I generally keep both an .xls and a .csv file, so I can always recreate the work in R from the .csv file and still modify the .xls file
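
The cleaning points above can be checked from within R as well as in a spreadsheet. The following is a small sketch with made-up species labels (not data from the presentation); `trimws()` assumes R 3.2.0 or later.

```r
# Names in R are case-sensitive: myvector and Myvector are different objects.
myvector <- c(3, 4, 5)
exists("myvector")   # TRUE

# Made-up labels with inconsistent capitalization and stray spaces.
species_raw <- c("Premna", "premna ", " Premna", "Ficus")
n_raw <- length(unique(species_raw))            # 4 distinct strings

# Trim surrounding whitespace and force one capitalization.
species_clean <- tolower(trimws(species_raw))
unique(species_clean)                           # "premna" "ficus"

# read.csv() passes column names through make.names() by default,
# which is where the "." in place of a space comes from.
fixed_names <- make.names(c("leaf length", "leaf_width"))
fixed_names                                     # "leaf.length" "leaf_width"
```
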
- 11. Bringing data into R
  - Create a CSV file
    - One worksheet only
    - No special formatting, filters, comments, etc.
    - Copy only the columns and rows that contain your data to the CSV, as R will sometimes read in empty columns
  - Name your variables well
    - Self-explanatory, unique, lowercase, short-ish, one-word names
  - In R, set the working directory
    - setwd("/Users/ScottMac/Dropbox/R Group/Week1_R-Intro")
    - What is the working directory? getwd()
    - What is in the working directory? dir()
  - Read in data
    - CSV files: iris.df <- read.csv("iris_df.csv", header = TRUE)
    - Clipboard: read.csv("clipboard") reads in copied data, like cutting and pasting it (this works on Windows; on OS X, use read.csv(pipe("pbpaste")))
    - From the web: read.csv("http://explore.data.gov/download/pwaj-zn2n/CSV")
    - From Excel files (using the XLConnect package):
      - iris.df <- readWorksheetFromFile("/Users/ScottMac/Dropbox/R Group/Week1_R-Intro/iris_df.xlsx", sheet = "Sheet1")
  - Write data
    - write.csv(dataframe, "dataframename.csv"), OR
    - save(iris, file = "iris.RData") [and load("iris.RData") to open in R]
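
The read and write steps can be tried end to end without any external files. This sketch uses R's built-in `iris` data set and a temporary directory, both of which are assumptions made here for the sake of a self-contained example; the file names simply mirror the ones on the slide.

```r
# Work in a temporary directory so nothing in your own folders is overwritten.
old_wd <- setwd(tempdir())
getwd()

# Write a built-in data set to CSV, then read it back in.
write.csv(iris, "iris_df.csv", row.names = FALSE)
iris.df <- read.csv("iris_df.csv", header = TRUE)
dir(pattern = "^iris")

# save() needs the file argument named; load() restores the object
# under its original name (iris.df).
save(iris.df, file = "iris_df.RData")
rm(iris.df)
load("iris_df.RData")
dim(iris.df)   # 150 rows, 5 columns

setwd(old_wd)  # restore the previous working directory
```

Note `row.names = FALSE` in `write.csv()`: without it, the row names are written as an extra first column that comes back as a sixth column when the file is read.
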
- 12. R data structures
  - Scalar: an object with a single value, either numeric or character (in R, simply a vector of length 1)
  - Vector: a sequence of values that all share one type (numeric, character, logical, ...); may include NA
  - List: an arbitrary collection of objects of any type; a very useful R object
  - Character: text, e.g., “this is some text”
  - Factor: like a character vector, but only with values from predefined “levels”
  - Matrix: a two-dimensional object in which all values share one type (most often numeric)
  - Dataframe: each column can be of a different class
  - Immutable dataframe: a special dataframe used in the plyr package for faster dataframe manipulation; it references the original dataframe instead of copying it
  - Function
  - Environment
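
The main structures listed above can be built in a few lines of base R. The object names and values below are made up for illustration.

```r
x <- 5                                  # a "scalar" is a vector of length 1
v <- c(3, 4, NA)                        # atomic vector: one type, NA allowed
mixed <- c(1, "a")                      # mixing types coerces all to character
l <- list(n = 1, s = "text", v = v)     # a list can hold objects of any type
f <- factor(c("low", "high", "low"), levels = c("low", "high"))
m <- matrix(letters[1:6], nrow = 2)     # a matrix need not be numeric
df <- data.frame(id = 1:3,
                 name = c("a", "b", "c"),
                 ok = c(TRUE, FALSE, TRUE))

length(x)          # 1
class(mixed)       # "character"
levels(f)          # "low" "high"
dim(m)             # 2 3
sapply(df, class)  # one class per column
```
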
- 13. Exploring dataframes
  - str(dataframe) gives column formats and dimensions
  - head(dataframe) and tail(dataframe) give the first and last 6 rows
  - names(dataframe) gives column names
  - row.names(dataframe) gives row names
  - attributes(dataframe) gives column and row names and object class
  - summary(dataframe) gives a lot of good information
    - Make sure variables are in the appropriate form
      - Character/string, numeric, factor, integer, logical
    - Make sure mins, maxs, means, etc. seem right
  - Make sure you don’t have typing errors, so that Premna and premna do not end up as two separate factor levels
    - Use unique(iris$species) to see all unique values of a column
    - Or use levels(spider$species) to see the different levels
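
The exploration functions above can be tried on R's built-in `iris` data set. Note that in the built-in copy the column is named `Species` with a capital S, unlike the lowercase `species` used on the slide, which presumably refers to the course's own CSV file.

```r
str(iris)                        # column types and dimensions
first_rows <- head(iris)         # first 6 rows
last_rows <- tail(iris)          # last 6 rows
names(iris)                      # column names
summary(iris)                    # per-column summaries; check mins, maxs, means

unique(iris$Species)             # all distinct values in the column
species_levels <- levels(iris$Species)
species_levels                   # "setosa" "versicolor" "virginica"
```
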
- 14. To attach or not to attach… that is the question
  - Some like to use ‘attach’ to make dataframe variables accessible by name within the R session
  - Generally, ‘attach’ is frowned upon by R junkies
    - Instead, use dataframe$y, or the data = dataframe argument, or dataframe[, "y"], or dataframe[, 2]
  - To detach the object, use: detach()
  - I recommend: do not use attach, but do what you want
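
The alternatives to `attach` listed above all reach the same column. A short sketch with a made-up data frame; `with()` is a further base R option that the slide does not mention.

```r
df <- data.frame(x = 1:5, y = c(2.1, 3.9, 6.2, 7.8, 10.1))

a <- df$y                      # by name with $
b <- df[, "y"]                 # by column name
c2 <- df[, 2]                  # by column position
d <- with(df, y)               # evaluate an expression inside the data frame
fit <- lm(y ~ x, data = df)    # modelling functions accept a data argument

identical(a, b) && identical(b, c2) && identical(c2, d)   # TRUE
```

None of these leave extra names on the search path, which is the usual objection to `attach`.
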
- 15. R Packages
  - 3,262 packages!!!!
  - Packages are extensions written by anyone for any purpose, usually installed and loaded by:
    - install.packages("packagename"), then
    - require(packagename) or library(packagename)
  - Use ?functionname for help on any function in base R or in a loaded R package
  - In RStudio, just press tab when inside the parentheses after the function name to see the function’s arguments!!!
  - Explore packages at the CRAN site:
    - http://cran.r-project.org/web/packages/
  - Inside-R package reference:
    - http://www.inside-r.org/packages
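
A common pattern is to install a package only when it is missing and then load it. The sketch below assumes nothing beyond base R; `load_or_install` is a hypothetical helper name, and the demo uses the `stats` package because it ships with R, so nothing is downloaded.

```r
# Hypothetical helper: install a package only if it is missing, then load it.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)      # needs an internet connection and a CRAN mirror
  }
  library(pkg, character.only = TRUE)
}

load_or_install("stats")       # already installed, so only the loading step runs

# In an interactive session, ?median or help("median") opens the help page.
```
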
- 16. Data manipulation
  - Packages: plyr, data.table, doBy, sqldf, reshape2, and more
  - Comparison of packages
    - Modified from code from the Recipes, scripts and Genomics blog: https://gist.github.com/878919
    - data.table is by far the fastest!!!
    - BUT plyr may win on ease of use and flexibility. See for yourself…
  - Also, see the examples in the tutorial code for the reshape2 package for neat data manipulation tricks
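
All of the packages above address the same split-apply-combine task: split a data frame by group, compute something per group, and combine the results. As a point of reference, here is that task in base R only, so it runs without installing anything. It is not the benchmark code from the linked gist, and it uses the built-in `iris` data set as a stand-in.

```r
# Mean sepal length per species, three ways in base R.
by_agg <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
by_tap <- tapply(iris$Sepal.Length, iris$Species, mean)
by_split <- sapply(split(iris$Sepal.Length, iris$Species), mean)

by_agg   # a data frame with one row per species

# All three approaches agree once names and dimensions are dropped.
all.equal(by_agg$Sepal.Length, as.vector(by_tap))    # TRUE
all.equal(as.vector(by_tap), unname(by_split))       # TRUE
```
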
- 17. Visualizations
  - A few different approaches:
    - Base graphics
    - Lattice graphics
    - Grid graphics
    - ggplot2 graphics
  - Further reading: http://www.slideshare.net/dataspora/a-survey-of-r-graphics
  - An example:
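
As a sketch of the first approach in that list, base graphics, the following draws a scatter plot of the built-in `iris` data and writes it to a PDF in a temporary directory. The data set, file name, and plot choices are assumptions made here for illustration.

```r
plot_file <- file.path(tempdir(), "iris_scatter.pdf")

pdf(plot_file, width = 6, height = 5)      # open a PDF graphics device
plot(iris$Sepal.Length, iris$Petal.Length,
     col = as.integer(iris$Species),       # one colour per species
     pch = 19,
     xlab = "Sepal length (cm)",
     ylab = "Petal length (cm)",
     main = "Iris: sepal vs. petal length")
legend("topleft", legend = levels(iris$Species), col = 1:3, pch = 19)
invisible(dev.off())                       # close the device to finish the file
```

In an interactive session, dropping the `pdf()` and `dev.off()` lines draws the plot on screen instead.
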
- 18. More on ggplot2 graphics
  - There are classes taught by Hadley Wickham here at Rice if you want to learn more!
    - Data visualization (Stat645): http://had.co.nz/stat645/
    - Statistical computing (Stat405): http://had.co.nz/stat405/
  - Hadley’s website is really helpful: http://had.co.nz/ggplot2/
  - The ggplot2 Google Groups site: https://groups.google.com/forum/#!forum/ggplot2
- 19. QUICK RSTUDIO RUN THROUGH
  - Keyboard shortcuts!!
    - http://www.rstudio.org/docs/using/keyboard_shortcuts
- 20. USE CASE HERE
  - [see intro_usecase.R file]
