Intro to R for SAS and SPSS User Webinar

R is free software for data analysis and graphics, similar to SAS and SPSS. Two million people are part of the R open source community, and its use is growing rapidly. Revolution Analytics distributes a commercial version of R that adds capabilities not available in the open source version. This 60-minute webinar is for people familiar with SAS or SPSS who want to know how R can strengthen their analytics strategy.

Published in: Education, Technology

  1. Bob Muenchen, author of R for SAS and SPSS Users, co-author of R for Stata Users
     muenchen.bob@gmail.com, http://r4stats.com
     Copyright © 2010, 2011, Robert A Muenchen. All rights reserved.

       • What is R?
       • R's Advantages
       • R's Disadvantages
       • Installing and Maintaining R
       • Ways of Running R
       • An Example Program
       • Where to Learn More

     What is R?
       • "The most powerful statistical computing language on the planet." (Norman Nie, developer of SPSS)
       • Language + package + environment for graphics and data analysis
       • Free and open source
       • Created by Ross Ihaka & Robert Gentleman in 1996 & extended by many more
       • An implementation of the S language by John Chambers and others
       • R has 4,950 add-ons, or nearly 100,000 procs
  2. [Popularity charts] Source: http://r4stats.com/popularity

       1. Data input & management (data step)
       2. Analytics & graphics procedures (proc step)
       3. Macro language
       4. Matrix language
       5. Output management systems (ODS/OMS)
     R integrates these all seamlessly.

       * SAS Approach;
       DATA A; SET A;
         logX = log(X);
       PROC REG;
         MODEL Y = logX;

       # R Approach
       lm( Y ~ log(X) )
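The R approach above can be tried on simulated data. This is only a minimal sketch: the variables X and Y, the seed, and the true coefficients are made up for illustration, and only the call lm( Y ~ log(X) ) comes from the slide.

```r
# Simulated data: X and Y are illustrative, not from the webinar.
set.seed(42)
X <- runif(100, min = 1, max = 50)
Y <- 2 + 3 * log(X) + rnorm(100, sd = 0.5)

# The transformation happens inside the formula, so there is no
# separate step that creates logX before the model is fit.
fit <- lm(Y ~ log(X))
print(coef(fit))
```

The transformed predictor appears in the coefficient table under the name log(X), with no intermediate variable left behind.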
  3. R's Advantages
       • Vast selection of analytics & graphics
       • New methods are available sooner
       • Many packages can run R (SAS, SPSS, Excel…)
       • Its object orientation "does the right thing"
       • Its language is powerful & fully integrated
       • Procedures you write are on an equal footing
       • It is the universal language of data analysis
       • It runs on any computer
       • Being open source, you can study and modify it
       • It is free

       * Using SAS;
       PROC TTEST DATA=classroom;
         CLASS gender;
         VAR score;

       # In R
       t.test(score ~ gender, data=classroom)
       with(classroom, t.test(posttest, pretest, paired=TRUE))

     R's Disadvantages
       • Language is somewhat harder to learn
       • Help files are sparse & complex
       • Must find R and its add-ons yourself
       • Graphical user interfaces not as polished

       • Most R functions hold data in main memory
       • Rule-of-thumb: 10 million values per gigabyte
       • SAS/SPSS: billions of records
       • Several efforts underway to break R's memory limit, including Revolution Analytics' distribution
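The two t-test calls can be run on a small data frame. The sketch below assumes a hypothetical classroom data set; every value in it is invented, and score is simply set equal to posttest so that both calls have something to work on.

```r
# Hypothetical classroom data; all values are made up for illustration.
classroom <- data.frame(
  gender   = factor(rep(c("f", "m"), each = 5)),
  pretest  = c(70, 65, 80, 75, 60, 68, 72, 66, 74, 62),
  posttest = c(78, 70, 85, 80, 66, 70, 75, 71, 77, 69)
)
classroom$score <- classroom$posttest

# Independent-samples test: the formula interface accepts data=.
indep <- t.test(score ~ gender, data = classroom)

# Paired test: the two-vector interface has no data= argument,
# so with() supplies the columns from the data frame.
paired <- with(classroom, t.test(posttest, pretest, paired = TRUE))

print(indep$p.value)
print(paired$p.value)
```

Both calls return an object of class "htest", so the same function covers the independent and the paired case, and pieces such as the p-value can be pulled out by name.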
  4.   • Base R plus Recommended Packages like:
           • Base SAS, SAS/STAT, SAS/GRAPH, SAS/IML Studio
           • SPSS Stat. Base, SPSS Stat. Advanced, Regression
       • Tested via extensive validation programs
       • But add-on packages written by…
           • Professor who invented the method?
           • A student interpreting the method?

       • Email support is free, quick, 24-hours:
           • www.r-project.org/mail.html
           • Stackoverflow.com
           • Quora.com
           • Crossvalidated: stats.stackexchange.com/questions/tagged/r
       • Phone support available commercially

       1. Go to cran.r-project.org, the Comprehensive R Archive Network
       2. Download binaries for Base & run
       3. Add-ons: install.packages("myPackage")
       4. To update: update.packages()

       • Comprehensive R Archive Network
       • Crantastic.com
       • Inside-R.org
       • R4Stats.com
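To see which packages count as Base and Recommended on a given installation, the Priority field reported by installed.packages() can be inspected. This is a sketch rather than a maintenance script; "myPackage" is the slide's placeholder name and is assumed not to exist, and the network-dependent calls are left commented out.

```r
# List what is already installed, grouped by the Priority field
# that R records for base and recommended packages.
ip <- installed.packages()
base_pkgs        <- rownames(ip)[ip[, "Priority"] %in% "base"]
recommended_pkgs <- rownames(ip)[ip[, "Priority"] %in% "recommended"]
print(base_pkgs)
print(recommended_pkgs)

# Check whether an add-on is present before trying to load it.
# "myPackage" is a placeholder name, not a real package.
have_it <- requireNamespace("myPackage", quietly = TRUE)
print(have_it)

# Installing and updating need a network connection and a CRAN
# mirror, so they are left commented out here:
# install.packages("myPackage")
# update.packages()
```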
  5. [Image-only slides; no text recovered]
  6. Ways of Running R
       • Run code interactively
       • Submit code from Excel, SAS, SPSS,…
       • Point-n-click using Graphical User Interfaces (GUIs)
       • Batch mode
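Batch mode can be sketched with a small script that takes its input file from the command line. The script name summarize.R is hypothetical and the default file name mydata.csv is only an assumption; the script reports a missing file instead of failing.

```r
# Batch-mode sketch. Save as summarize.R (hypothetical name) and run:
#   Rscript summarize.R mydata.csv
# Arguments after the script name arrive via commandArgs().
args   <- commandArgs(trailingOnly = TRUE)
infile <- if (length(args) >= 1) args[1] else "mydata.csv"

if (file.exists(infile)) {
  mydata <- read.csv(infile)
  print(summary(mydata))
} else {
  message("File not found: ", infile)
}
```

In batch mode nothing is typed at a prompt, so results meant for the log are printed explicitly.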
  7. From SAS:

       run ExportDataSetToR("mydata");
       submit/r;
         mydata$workshop <- factor(mydata$workshop)
         summary(mydata)
       endsubmit;

     From SPSS:

       GET FILE='mydata.sav'.
       BEGIN PROGRAM R.
       mydata <- spssdata.GetDataFromSPSS(
         variables = c("workshop gender q1 to q4"),
         missingValueToNA = TRUE,
         row.label = "id" )
       summary(mydata)
       END PROGRAM.
  8. [Image-only slides; no text recovered]
  9. Revolution Analytics
       • A company focused on R development & support
       • Run by SPSS founder Norman Nie
       • Their enhanced distribution of R: Revolution R Enterprise
       • Free for colleges and universities, including for outside consulting
  10. [Image-only slides; no text recovered]
  11. An Example Program

        mydata <- read.csv("mydata.csv")
        print(mydata)
        mydata$workshop <- factor(mydata$workshop)
        summary(mydata)
        plot( mydata$q1, mydata$q4 )
        myModel <- lm( q4~q1+q2+q3, data=mydata )
        summary( myModel )
        anova( myModel )
        plot( myModel )

      Output:

        > mydata <- read.csv("mydata.csv")
        > print(mydata)
          workshop gender q1 q2 q3 q4
        1        1      f  1  1  5  1
        2        2      f  2  1  4  1
        3        1      f  2  2  4  3
        4        2   <NA>  3  1 NA  3
        5        1      m  4  5  2  4
        6        2      m  5  4  5  5
        7        1      m  5  3  4  4
        8        2      m  4  5  5  5

        > mydata$workshop <- factor(mydata$workshop)
        > summary(mydata)
         workshop  gender
         1:4       f   :3
         2:4       m   :4
                   NAs :1
               q1             q2             q3              q4
         Min.   :1.00   Min.   :1.00   Min.   :2.000   Min.   :1.00
         1st Qu.:2.00   1st Qu.:1.00   1st Qu.:4.000   1st Qu.:2.50
         Median :3.50   Median :2.50   Median :4.000   Median :3.50
         Mean   :3.25   Mean   :2.75   Mean   :4.143   Mean   :3.25
         3rd Qu.:4.25   3rd Qu.:4.25   3rd Qu.:5.000   3rd Qu.:4.25
         Max.   :5.00   Max.   :5.00   Max.   :5.000   Max.   :5.00
                                        NAs    :1.000
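The first steps of the example program can be run without a mydata.csv file by embedding the data as text. The eight rows below are copied from the print(mydata) listing; the csv_text variable and the stringsAsFactors = TRUE setting are choices made for this sketch, not part of the original program.

```r
# The eight rows below are copied from the print(mydata) listing,
# embedded as text so that no mydata.csv file is needed.
csv_text <- "workshop,gender,q1,q2,q3,q4
1,f,1,1,5,1
2,f,2,1,4,1
1,f,2,2,4,3
2,NA,3,1,NA,3
1,m,4,5,2,4
2,m,5,4,5,5
1,m,5,3,4,4
2,m,4,5,5,5"
mydata <- read.csv(text = csv_text, stringsAsFactors = TRUE)

# workshop is read as numbers; factor() marks it as categorical,
# so summary() counts its levels instead of averaging them.
mydata$workshop <- factor(mydata$workshop)
print(summary(mydata))

# Missing values propagate unless they are explicitly dropped.
print(mean(mydata$q3))                 # NA
print(mean(mydata$q3, na.rm = TRUE))   # 29/7, about 4.143
```

The last two lines show the same behavior that produces the NA count under q3 in the summary: one of the eight q3 values is missing, so its mean is computed over seven values.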
  12.   > myModel <- lm(q4 ~ q1+q2+q3, data=mydata)
        > summary(myModel)

        Call:
        lm(formula = q4 ~ q1 + q2 + q3, data = mydata)

        Residuals:
              1       2       3       5       6       7       8
        -0.3113 -0.4261  0.9428 -0.1797  0.0765  0.0225 -0.1246

        Coefficients:
                    Estimate Std. Error t value Pr(>|t|)
        (Intercept)  -1.3243     1.2877  -1.028    0.379
        q1            0.4297     0.2623   1.638    0.200
        q2            0.6310     0.2503   2.521    0.086
        q3            0.3150     0.2557   1.232    0.306

        Multiple R-squared: 0.9299, Adjusted R-squared: 0.8598
        F-statistic: 13.27 on 3 and 3 DF, p-value: 0.03084
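The fitted model is an ordinary R object, so individual pieces of the output above can be extracted rather than read off the printout. The sketch below rebuilds q1 to q4 from the data listing in the example program; assuming that listing is the complete data set, the values should agree with the printed summary.

```r
# q1 to q4 as shown in the example program's data listing.
mydata <- data.frame(
  q1 = c(1, 2, 2, 3, 4, 5, 5, 4),
  q2 = c(1, 1, 2, 1, 5, 4, 3, 5),
  q3 = c(5, 4, 4, NA, 2, 5, 4, 5),
  q4 = c(1, 1, 3, 3, 4, 5, 4, 5)
)
myModel <- lm(q4 ~ q1 + q2 + q3, data = mydata)

# lm() drops row 4 (missing q3), which is why the residuals
# are labeled 1, 2, 3, 5, 6, 7, 8.
print(nobs(myModel))

# Pieces of the result can be pulled out by name.
print(coef(myModel))
print(summary(myModel)$r.squared)
print(summary(myModel)$coefficients[, "Pr(>|t|)"])
```

This plays the role that ODS or OMS plays in SAS and SPSS: results are available to later code without a separate output management system.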
  13. Where to Learn More
        • R for SAS and SPSS Users, Muenchen
        • R for Stata Users, Muenchen & Hilbe
        • R Through Excel: A Spreadsheet Interface for Statistics, Data Analysis, and Graphics, Heiberger & Neuwirth
        • Data Mining with Rattle and R: The Art of Excavating Data for Knowledge Discovery, Williams

        • R is powerful, extensible, free
        • Download it from CRAN
        • Academics download Revolution R Enterprise for free at www.revolutionanalytics.com
        • You run it many ways & from many packages
        • Several graphical user interfaces are available
        • R's programming language is the way to access its full power

      muenchen@utk.edu
      Slides: r4stats.com/misc/webinar
      Presentation: bit.ly/R-sas-spss
