Revolution R - 100% R and More

  1. Revolution R: 100% R and More. Presented by David Smith, VP Marketing and Community, Revolution Analytics. (Revolution Confidential)
  2. Poll Question: Which stats package do you use most?
  3. February 22, 2011: Welcome! Thanks for coming. Slides and replay available (soon) at: http://bit.ly/z9xUG9. David Smith, VP Marketing & Community, Revolution Analytics; Editor, Revolutions blog (http://blog.revolutionanalytics.com); Twitter: @revodavid
  4. In today's webcast: Introducing Revolution R
     • About Revolution Analytics and R
     • What Revolution R adds to R
     • Resources for getting more from R
     • Q&A
  5. What is R?
     Download the white paper "R is Hot": bit.ly/r-is-hot
     • Data analysis software
     • A programming language: development platform designed by and for statisticians
     • An environment: huge library of algorithms for data access, data manipulation, analysis and graphics
     • An open-source software project: free, open, and active
     • A community: thousands of contributors, 2 million users; resources and help in every domain
  6. R is exploding in popularity and functionality
     • Scholarly activity, Google Scholar hits ('05-'09 CAGR): R 46%, SAS -11%, SPSS -27%, S-Plus 0%, Stata 10%
     • "I've been astonished by the rate at which R has been adopted. Four years ago, everyone in my economics department [at the University of Chicago] was using Stata; now, as far as I can tell, R is the standard tool, and students learn it first." (Deputy Editor for New Products at Forbes)
     • Package growth: number of R packages listed on CRAN [chart, 2002 to 2010]
     • "A key benefit of R is that it provides near-instant availability of new and experimental methods created by its user base, without waiting for the development/release cycle of commercial software. SAS recognizes the value of R to our customer base…" (Product Marketing Manager, SAS Institute, Inc.)
     Source: http://r4stats.com/popularity
  7. "R is the most powerful & flexible statistical programming language in the world" [1]
     • Capabilities: sophisticated statistical analyses; predictive analytics; data visualization
     • Applications: real-time trading; finance; risk assessment; forecasting; bio-technology; drug development; social networks; ... and more
     [Chart: MSFT stock price]
     [1] Norman Nie, multiple interviews
  8. R User Community. From: The R Ecosystem (bit.ly/R-ecosystem)
  9. Poll Question: If you're not using R today, what would you most like to use R for?
  10. Revolution R Enterprise is
  11. R Productivity Environment (Windows)
      • Script with type ahead and code snippets
      • Solutions window for organizing code and data
      • Sophisticated debugging with breakpoints, variable values etc.
      • Objects loaded in the R Environment
      • Packages installed and loaded
      • Object details
      Demo: http://www.revolutionanalytics.com/demos/revolution-productivity-environment/demo.htm
  12. Interactive Debugging
      • One-click to set a breakpoint in an R script
      • Step in/out/over, inspect variables
      • Eliminate the edit -> browser -> repair cycle
  13. Performance: Multi-threaded Math
      Computation (4-core laptop)          Open Source R   Revolution R   Speedup
      Linear Algebra [1]
        Matrix Multiply                    327 sec         13.4 sec       23x
        Cholesky Factorization             31.3 sec        1.8 sec        17x
        Linear Discriminant Analysis       216 sec         74.6 sec       2x
      General R Benchmarks [2]
        R Benchmarks (Matrix Functions)    22 sec          3.5 sec        5x
        R Benchmarks (Program Control)     5.6 sec         5.4 sec        Not appreciable
      [1] http://www.revolutionanalytics.com/why-revolution-r/benchmarks.php
      [2] http://r.research.att.com/benchmarks/
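The kind of measurement summarized on slide 13 can be tried on any R installation. The following is a minimal sketch in base R, not the benchmark scripts behind the table: the matrix sizes are illustrative assumptions chosen so that it runs in well under a second, and the timings will depend on the machine and on the math library R is linked against.

```r
# Sketch only: time a matrix multiply and a Cholesky factorization in base R.
# The dimensions below are illustrative assumptions, far smaller than a real benchmark.
set.seed(42)
m <- 400
n <- 200
A <- matrix(rnorm(m * n), nrow = m, ncol = n)

# Matrix multiply: S = t(A) %*% A, computed with crossprod()
t_mult <- system.time(S <- crossprod(A))

# Cholesky factorization of the symmetric positive-definite matrix S
# (chol() returns the upper-triangular factor U with t(U) %*% U equal to S)
t_chol <- system.time(U <- chol(S))

# Elapsed wall-clock seconds for each step
timings <- c(multiply = t_mult[["elapsed"]], cholesky = t_chol[["elapsed"]])
print(timings)
```

Running the same script under two R builds and dividing the elapsed times gives a speedup figure comparable in spirit to the last column of the table.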
  14. Three Paradigms for Big Data
      Standard R engine is constrained by capacity and performance. Revolution R Enterprise offers three methods for big data with R:
      • Off-line: high-performance file-based analytics
      • Off-line, parallel & distributed analytics
      • On-line, in-database analytics: Hadoop, Netezza
  15. Revolution R Enterprise with RevoScaleR: Big Data Statistics in R
      www.revolutionanalytics.com/bigdata
      Every US airline departure and arrival, 1987-2008
      • File: AirlineData87to08.xdf
      • Rows: 123.5 million
      • Variables: 29
      • Size on disk: 13.2 GB
      arrDelayLm2 <- rxLinMod(ArrDelay ~ DayOfWeek:F(CRSDepTime), cube=TRUE)
  16. RevoScaleR: Big Data algorithms
      • Data processing (rxDataStep)
      • Descriptive statistics (rxSummary)
      • Tables and cubes (rxCube, rxCrossTabs)
      • Correlations/covariances (rxCovCor, rxCor, rxCov, rxSSCP)
      • Linear regressions (rxLinMod)
      • Logistic regressions (rxLogit)
      • K means clustering (rxKmeans)
      • Predictions (scoring) (rxPredict)
      • Custom distributed computing (rxExec)
  17. RevoScaleR: Distributed Computing
      [Diagram: a Master Node (RevoScaleR) connected to four Compute Nodes (RevoScaleR), each with its own Data Partition]
      • Portions of the data source are made available to each compute node
      • RevoScaleR on the master node assigns a task to each compute node
      • Each compute node independently processes its data, and returns its intermediate results back to the master node
      • The master node aggregates all of the intermediate results from each compute node and produces the final result
      *Available now for Microsoft HPC Server. Video demo: http://bit.ly/ugQ9KR
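The partition, process, and aggregate steps listed on slide 17 can be illustrated with a small base R sketch. This is not RevoScaleR code and makes no claim about how RevoScaleR is implemented: it assumes a linear regression as the task, simulated data, four in-memory chunks standing in for data partitions, and a hypothetical helper named `node_task`. Each "node" returns the cross-products X'X and X'y for its chunk, and the "master" sums them and solves the normal equations.

```r
# Sketch only: a single-machine stand-in for the master/compute-node pattern.
# Data, chunk count, and function names are illustrative assumptions.
set.seed(1)
n <- 1000
x <- rnorm(n)
y <- 2 + 3 * x + rnorm(n)
dat <- data.frame(x = x, y = y)

# Partition the data into 4 chunks, one per hypothetical compute node
chunks <- split(dat, rep(1:4, length.out = n))

# Task run independently on each chunk: return intermediate results X'X and X'y
node_task <- function(chunk) {
  X <- cbind(1, chunk$x)  # design matrix with an intercept column
  list(xtx = crossprod(X), xty = crossprod(X, chunk$y))
}
partials <- lapply(chunks, node_task)

# Master step: sum the intermediate results and solve the normal equations
xtx <- Reduce(`+`, lapply(partials, `[[`, "xtx"))
xty <- Reduce(`+`, lapply(partials, `[[`, "xty"))
beta <- drop(solve(xtx, xty))
print(beta)
```

Because the cross-products are additive across chunks, the aggregated coefficients match what `lm(y ~ x, data = dat)` computes on the full data set, even though no step ever needs all rows at once.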
  18. Platform-agnostic Big Data Analytics
      • Set "compute context" to define hardware (one line of code)
      • Native job-scheduler handles distribution, monitoring, failover etc.
      • Same code runs on other supported architectures: just change compute context
      • Supported architectures: Windows: Microsoft HPC Server; Linux: Platform Computing LSF (coming 2012)
      42 seconds instead of 6 minutes
  19. A common analytic platform across big data architectures: Hadoop, File Based, In-database
  20. In-Database Execution with IBM Netezza. More info: http://bit.ly/R-Netezza
  21. R and Hadoop
      • Hadoop offers a scalable infrastructure for processing massive amounts of data: Storage (HDFS, HBASE); Distributed Computing (MapReduce)
      • R is a statistical programming language for developing advanced analytic applications
      • Currently, writing analytics for Hadoop requires a combination of Java, Pig, Python, …
      • The RHadoop project makes it possible to write Big Data algorithms for Hadoop using the R language alone.
  22. RevoConnectR for Hadoop
      Write Map-Reduce analytics using only R code with these R packages:
      • rhdfs: R and HDFS
      • rhbase: R and HBASE
      • rmr: R and MapReduce
      [Diagram labels: HBASE, HDFS, Thrift, Map or Reduce Task Node, Job Tracker, Revolution R Client, rhbase, rhdfs, rmr]
      More information at: bit.ly/r-hadoop
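For readers new to the map-reduce style mentioned on slide 22, the pattern itself can be sketched in plain R. The sketch below does not use rmr, rhdfs, or rhbase and does not touch Hadoop: the records are made-up values, and `map_fn` and `reduce_fn` are hypothetical names chosen for illustration. It computes the mean arrival delay per day of the week in three stages: map, group by key, reduce.

```r
# Sketch only: the map / group-by-key / reduce pattern in base R, on made-up data.
records <- data.frame(
  DayOfWeek = c("Mon", "Tue", "Mon", "Wed", "Tue", "Mon"),
  ArrDelay  = c(10, -5, 20, 0, 15, 30),
  stringsAsFactors = FALSE
)

# Map: emit one key/value pair per input record
map_fn <- function(i) {
  list(key = records$DayOfWeek[i], value = records$ArrDelay[i])
}
pairs <- lapply(seq_len(nrow(records)), map_fn)

# Shuffle: group the emitted values by key
keys    <- vapply(pairs, function(p) p$key, character(1))
values  <- vapply(pairs, function(p) p$value, numeric(1))
grouped <- split(values, keys)

# Reduce: collapse each key's values to a single result (here, the mean)
reduce_fn  <- function(v) mean(v)
mean_delay <- vapply(grouped, reduce_fn, numeric(1))
print(mean_delay)
```

In a real Hadoop job the map and reduce functions would run on different task nodes and the grouping would be handled by the framework; here all three stages run in one R session so the data flow is easy to follow.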
  23. Enterprise Readiness: Revolution R Enterprise Server
      • Multi-User Support
      • Production Applications
      • Integrate R analytics into Web based applications: Data Analysis and Visualization; Reporting; Dashboards; Interactive applications
      Revolution R Enterprise Server with RevoDeployR
  24. Enterprise-Wide Deployment
      [Diagram labels:]
      • Production / Research and Development
      • Revolution R Enterprise Server + Hadoop + IBM Netezza + Windows HPC Server cluster
      • Data Scientists / Modelers
      • Management Console
      • RevoDeployR Server, Web Services API
      • End-User Deployment: Excel, Web App, BI
      • Analysts / Corporate Users
  25. On-Demand Analytics with RevoDeployR
  26. The Advanced Analytics Stack: Deployment / Consumption; Advanced Analytics; ETL; Data / Infrastructure. "Open Analytics Stack" White Paper: bit.ly/lC43Kw
  27. Services
      • On-Call Technical Support
      • Consulting: Migration | Analytics | Applications | Validation
      • Training: R | Revolution R | Statistical Topics
      • Systems Integration: BI | ERP | Databases | Cloud
  28. Wrapping Up
  29. Why R?
      • Every data analysis technique at your fingertips
      • Create beautiful and unique data visualizations
      • Get better results faster
      • Draw on the talents of data scientists worldwide
      • R is hot, and growing fast
  30. Revolution R Enterprise: Production-Grade Statistical Analysis for the Workplace
      • High-performance R for multiprocessor systems
      • Modern Integrated Development Environment
      • Statistical Analysis of Terabyte-Class Data Sets
      • In-database R analytics with Hadoop and Netezza
      • Deploy R Applications via Web Services
      • Telephone and email technical support
      • Training and consulting services
      • 100% compatible with R packages
  31. Revolution R Enterprise: Free to Academia
      • Personal use
      • Research
      • Teaching
      • Package development
      Free Academic Download: www.revolutionanalytics.com/downloads/free-academic.php
      Discounted Technical Support Subscriptions Available
  32. Thank You!
      • Download slides, replay: http://bit.ly/z9xUG9
      • Learn more about Revolution R: revolutionanalytics.com/products
      • Contact Revolution Analytics: http://bit.ly/hey-revo
      • Feb 29: Turbo-Charge Your Analytics with IBM Netezza and Revolution R Enterprise: A Step-by-Step Approach for Acceleration and Innovation, presented by William Zanine (IBM Analytics Solutions). www.revolutionanalytics.com/news-events/free-webinars
  33. Poll Question: What interests you most about Revolution R Enterprise?
  34. Revolution Analytics: The leading commercial provider of software and support for the popular open source R statistics language. www.revolutionanalytics.com, +1 (650) 646 9545, Twitter: @RevolutionR