Revolution R - 100% R and More
Published in Technology
  • 1. Revolution Confidential. Revolution R: 100% R and More. Presented by: David Smith, VP Marketing and Community, Revolution Analytics
  • 2. Poll Question: Which stats package do you use most?
  • 3. February 22, 2011: Welcome!
      • Thanks for coming. Slides and replay available (soon) at:
      • David Smith, VP Marketing & Community, Revolution Analytics; Editor, Revolutions blog
      • Twitter: @revodavid
  • 4. In today’s webcast: Introducing Revolution R
      • About Revolution Analytics and R
      • What Revolution R adds to R
      • Resources for getting more from R
      • Q&A
  • 5. What is R? (Download the white paper: R is Hot)
      • Data analysis software
      • A programming language: development platform designed by and for statisticians
      • An environment: huge library of algorithms for data access, data manipulation, analysis and graphics
      • An open-source software project: free, open, and active
      • A community: thousands of contributors, 2 million users; resources and help in every domain
  • 6. R is exploding in popularity and functionality
      • Scholarly activity, Google Scholar hits (’05-’09 CAGR): R 46%, SAS -11%, SPSS -27%, S-Plus 0%, Stata 10%
      • Package growth: number of R packages listed on CRAN (chart, 2002 to 2010). Source:
      • “I’ve been astonished by the rate at which R has been adopted. Four years ago, everyone in my economics department [at the University of Chicago] was using Stata; now, as far as I can tell, R is the standard tool, and students learn it first.” Deputy Editor for New Products at Forbes
      • “A key benefit of R is that it provides near-instant availability of new and experimental methods created by its user base, without waiting for the development/release cycle of commercial software. SAS recognizes the value of R to our customer base…” Product Marketing Manager, SAS Institute, Inc.
  • 7. “R is the most powerful & flexible statistical programming language in the world” [1]
      • Capabilities: sophisticated statistical analyses, predictive analytics, data visualization
      • Applications: real-time trading, finance, risk assessment, forecasting, bio-technology, drug development, social networks, and more
      • (Chart: MSFT price)
      • [1] Norman Nie, multiple interviews
  • 8. R User Community (from: The R Ecosystem)
  • 9. Poll Question: If you’re not using R today, what would you most like to use R for?
  • 10. Revolution R Enterprise is
  • 11. R Productivity Environment (Windows)
      • Script with type ahead and code snippets
      • Solutions window for organizing code and data
      • Sophisticated debugging with breakpoints, variable values etc.
      • Objects loaded in the R Environment
      • Packages installed and loaded
      • Object details
  • 12. Interactive Debugging
      • One-click to set a breakpoint in an R script
      • Step in/out/over, inspect variables
      • Eliminate the edit -> browser -> repair cycle
  • 13. Performance: Multi-threaded Math (Open Source R vs. Revolution R Enterprise)
      Computation (4-core laptop)          Open Source R   Revolution R   Speedup
      Linear Algebra [1]
        Matrix Multiply                    327 sec         13.4 sec       23x
        Cholesky Factorization             31.3 sec        1.8 sec        17x
        Linear Discriminant Analysis       216 sec         74.6 sec       2x
      General R Benchmarks [2]
        R Benchmarks (Matrix Functions)    22 sec          3.5 sec        5x
        R Benchmarks (Program Control)     5.6 sec         5.4 sec        Not appreciable
      1. 2.
  • 14. Three Paradigms for Big Data
      • Standard R engine is constrained by capacity and performance
      • Revolution R Enterprise offers three methods for big data with R:
          • Off-line: high-performance file-based analytics
          • Off-line, parallel & distributed analytics
          • On-line, in-database analytics: Hadoop, Netezza
  • 15. Revolution R Enterprise with RevoScaleR: Big Data Statistics in R
      • Data: US airline departure and arrival, 1987-2008
      • File: AirlineData87to08.xdf
      • Rows: 123.5 million
      • Variables: 29
      • Size on disk: 13.2 GB
      • arrDelayLm2 <- rxLinMod(ArrDelay ~ DayOfWeek:F(CRSDepTime),cube=TRUE)
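The call on slide 15 models arrival delay on a day-of-week by departure-hour interaction, over a file too large to hold in memory. As a rough illustration only, written in plain Python rather than RevoScaleR, and making no claim about how rxLinMod works internally: when a model's only term is a categorical cell indicator with no separate intercept, the least-squares coefficients are the per-cell means, and those can be accumulated one chunk at a time. The function names and the sample rows below are hypothetical.

```python
from collections import defaultdict

def update_cell_sums(sums, counts, chunk):
    """Fold one chunk of (day_of_week, dep_hour, arr_delay) rows into running totals."""
    for day, hour, delay in chunk:
        sums[(day, hour)] += delay
        counts[(day, hour)] += 1

def cell_means(chunks):
    """Stream over chunks; only the per-cell totals are kept in memory."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for chunk in chunks:
        update_cell_sums(sums, counts, chunk)
    return {cell: sums[cell] / counts[cell] for cell in sums}

# Hypothetical rows split into two chunks, standing in for blocks read from disk.
chunks = [
    [("Mon", 8, 10.0), ("Mon", 8, 20.0), ("Tue", 17, 5.0)],
    [("Mon", 8, 30.0), ("Tue", 17, 15.0)],
]
means = cell_means(chunks)
print(means)
```

Memory use here depends on the number of cells rather than the number of rows, which is the property that makes file-based analysis of very long tables feasible.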
  • 16. RevoScaleR: Big Data algorithms (Revolution R Enterprise)
      • Data processing (rxDataStep)
      • Descriptive statistics (rxSummary)
      • Tables and cubes (rxCube, rxCrossTabs)
      • Correlations/covariances (rxCovCor, rxCor, rxCov, rxSSCP)
      • Linear regressions (rxLinMod)
      • Logistic regressions (rxLogit)
      • K means clustering (rxKmeans)
      • Predictions (scoring) (rxPredict)
      • Custom distributed computing (rxExec)
  • 17. RevoScaleR – Distributed Computing
      • Portions of the data source are made available to each compute node
      • RevoScaleR on the master node assigns a task to each compute node
      • Each compute node independently processes its data, and returns its intermediate results back to the master node
      • Master node aggregates all of the intermediate results from each compute node and produces the final result
      • (Diagram: a master node running RevoScaleR, connected to four compute nodes running RevoScaleR, each with its own data partition)
      • *Available now for Microsoft HPC Server
      • Video demo:
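The four steps on slide 17 (partition the data, assign tasks, compute intermediate results, aggregate on the master) can be sketched in a few lines. This is a minimal plain-Python illustration, with a thread pool standing in for compute nodes. It is not RevoScaleR code, and `process_partition` and `aggregate` are hypothetical names chosen for this example.

```python
from concurrent.futures import ThreadPoolExecutor

def process_partition(partition):
    """Compute-node step: reduce one data partition to (count, sum, sum of squares)."""
    n = len(partition)
    s = sum(partition)
    ss = sum(x * x for x in partition)
    return n, s, ss

def aggregate(partials):
    """Master-node step: combine the intermediate results into a mean and sample variance."""
    n = sum(p[0] for p in partials)
    s = sum(p[1] for p in partials)
    ss = sum(p[2] for p in partials)
    mean = s / n
    variance = (ss - n * mean * mean) / (n - 1)
    return mean, variance

# Four hypothetical data partitions, one per simulated compute node.
partitions = [[1.0, 2.0], [3.0, 4.0], [5.0], [6.0, 7.0, 8.0]]
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(process_partition, partitions))

mean, variance = aggregate(partials)
print(mean, variance)
```

Each worker returns only three numbers regardless of partition size, so the master's work stays small. The sum-of-squares formula is used here for brevity; it can lose precision on large or poorly scaled data, where a numerically stable merge of per-partition means and variances would be the safer choice.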
  • 18. Platform-agnostic Big Data Analytics
      • Set “compute context” to define hardware (one line of code)
      • Native job-scheduler handles distribution, monitoring, failover etc.
      • Same code runs on other supported architectures: just change compute context
      • Supported architectures:
          • Windows: Microsoft HPC Server
          • Linux: Platform Computing LSF (coming 2012)
      • 42 seconds instead of 6 minutes
  • 19. A common analytic platform across big data architectures: Hadoop, File Based, In-database
  • 20. In-Database Execution with IBM Netezza. More info:
  • 21. R and Hadoop
      • Hadoop offers a scalable infrastructure for processing massive amounts of data
          • Storage: HDFS, HBASE
          • Distributed computing: MapReduce
      • R is a statistical programming language for developing advanced analytic applications
      • Currently, writing analytics for Hadoop requires a combination of Java, Pig, Python, …
      • The RHadoop project makes it possible to write Big Data algorithms for Hadoop using the R language alone.
  • 22. RevoConnectR for Hadoop
      • Write Map-Reduce analytics using only R code with these R packages:
          • rhdfs - R and HDFS
          • rhbase - R and HBASE
          • rmr - R and MapReduce
      • (Diagram labels: HBASE, HDFS, Thrift, R, Map or Reduce Task Node, Job Tracker, Revolution R Client, rhbase, rhdfs, rmr)
      • More information at:
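For readers new to the MapReduce model mentioned on slides 21 and 22, the sketch below walks through the three phases (map, shuffle, reduce) on a tiny in-memory data set. It is plain Python written for illustration and does not use or imitate the rmr, rhdfs, or rhbase APIs. All function names and the sample records are hypothetical.

```python
from collections import defaultdict

def map_phase(records, mapper):
    """Apply the mapper to every record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)

def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups, reducer):
    """Apply the reducer to each key and its list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}

def carrier_mapper(record):
    """Emit (carrier, delay) for one hypothetical flight record."""
    carrier, delay = record
    yield carrier, delay

def max_reducer(key, values):
    """Keep the largest delay seen for a carrier."""
    return max(values)

records = [("AA", 10.0), ("UA", 4.0), ("AA", 20.0), ("UA", 6.0), ("DL", 0.0)]
result = reduce_phase(shuffle(map_phase(records, carrier_mapper)), max_reducer)
print(result)
```

On a real cluster the map and reduce functions run on separate task nodes and the shuffle moves data across the network; only the mapper and reducer would be written by the analyst.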
  • 23. Enterprise Readiness: Revolution R Enterprise Server
      • Multi-User Support
      • Production Applications
      • Integrate R analytics into Web based applications:
          • Data Analysis and Visualization
          • Reporting
          • Dashboards
          • Interactive applications
      • Revolution R Enterprise Server with RevoDeployR
  • 24. Enterprise-Wide Deployment
      • (Diagram labels) Research and Development: Data Scientists / Modelers
      • Production: Revolution R Enterprise Server + Hadoop + IBM Netezza + Windows HPC Server cluster
      • Management Console
      • RevoDeployR Server, Web Services API
      • End-User Deployment: Excel, Web App, BI
      • Analysts / Corporate Users
  • 25. On-Demand Analytics with RevoDeployR
  • 26. The Advanced Analytics Stack
      • Deployment / Consumption
      • Advanced Analytics
      • ETL
      • Data / Infrastructure
      • “Open Analytics Stack” White Paper:
  • 27. Revolution Confidential On-Call Technical Support Consulting  Migration | Analytics | Applications | Validation Training  R | Revolution R | Statistical Topics Systems Integration  BI | ERP | Databases | Cloud 27
  • 28. Wrapping Up
  • 29. Why R?
      • Every data analysis technique at your fingertips
      • Create beautiful and unique data visualizations
      • Get better results faster
      • Draw on the talents of data scientists worldwide
      • R is hot, and growing fast
  • 30. Revolution R Enterprise: Production-Grade Statistical Analysis for the Workplace
      • High-performance R for multiprocessor systems
      • Modern Integrated Development Environment
      • Statistical Analysis of Terabyte-Class Data Sets
      • In-database R analytics with Hadoop and Netezza
      • Deploy R Applications via Web Services
      • Telephone and email technical support
      • Training and consulting services
      • 100% compatible with R packages
  • 31. Revolution R Enterprise: Free to Academia
      • Personal use
      • Research
      • Teaching
      • Package development
      • Free Academic Download
      • Discounted Technical Support Subscriptions Available
  • 32. Thank You!
      • Download slides, replay
      • Learn more about Revolution R
      • Contact Revolution Analytics
      • Feb 29: Turbo-Charge Your Analytics with IBM Netezza and Revolution R Enterprise: A Step-by-Step Approach for Acceleration and Innovation, presented by William Zanine (IBM Analytics Solutions).
  • 33. Poll Question: What interests you most about Revolution R Enterprise?
  • 34. The leading commercial provider of software and support for the popular open source R statistics language. +1 (650) 646 9545. Twitter: @RevolutionR