Revolution R - 100% R and More

Transcript

  • 1. Revolution Confidential. Revolution R: 100% R and More. Presented by: David Smith, VP Marketing and Community, Revolution Analytics
  • 2. Poll Question: Which stats package do you use most?
  • 3. February 22, 2011: Welcome!
    Thanks for coming. Slides and replay available (soon) at: http://bit.ly/z9xUG9
    David Smith, VP Marketing & Community, Revolution Analytics
    Editor, Revolutions blog: http://blog.revolutionanalytics.com
    Twitter: @revodavid
  • 4. In today’s webcast: Introducing Revolution R
    • About Revolution Analytics and R
    • What Revolution R adds to R
    • Resources for getting more from R
    • Q&A
  • 5. What is R?
    Download the White Paper "R is Hot": bit.ly/r-is-hot
    • Data analysis software
    • A programming language
      • Development platform designed by and for statisticians
    • An environment
      • Huge library of algorithms for data access, data manipulation, analysis and graphics
    • An open-source software project
      • Free, open, and active
    • A community
      • Thousands of contributors, 2 million users
      • Resources and help in every domain
  • 6. R is exploding in popularity and functionality
    Scholarly Activity: Google Scholar hits (’05-’09 CAGR)
      • R: 46%
      • SAS: -11%
      • SPSS: -27%
      • S-Plus: 0%
      • Stata: 10%
    Package Growth: Number of R packages listed on CRAN (chart, 2002 to 2010)
    “I’ve been astonished by the rate at which R has been adopted. Four years ago, everyone in my economics department [at the University of Chicago] was using Stata; now, as far as I can tell, R is the standard tool, and students learn it first.” Deputy Editor for New Products at Forbes
    “A key benefit of R is that it provides near-instant availability of new and experimental methods created by its user base – without waiting for the development/release cycle of commercial software. SAS recognizes the value of R to our customer base…” Product Marketing Manager, SAS Institute, Inc.
    Source: http://r4stats.com/popularity
  • 7. “R is the most powerful & flexible statistical programming language in the world” [1]
    Capabilities
      • Sophisticated statistical analyses
      • Predictive analytics
      • Data visualization
    Applications
      • Real-time trading
      • Finance
      • Risk assessment
      • Forecasting
      • Bio-technology
      • Drug development
      • Social networks
      • .. and more
    (Chart: MSFT price series, last value 29.29)
    [1] Norman Nie, multiple interviews
  • 8. R User Community. From: The R Ecosystem, bit.ly/R-ecosystem
  • 9. Poll Question: If you’re not using R today, what would you most like to use R for?
  • 10. Revolution R Enterprise is
  • 11. R Productivity Environment (Windows)
    • Script with type ahead and code snippets
    • Solutions window for organizing code and data
    • Sophisticated debugging with breakpoints, variable values etc.
    • Objects loaded in the R Environment
    • Packages installed and loaded
    • Object details
    http://www.revolutionanalytics.com/demos/revolution-productivity-environment/demo.htm
  • 12. Interactive Debugging
    • One-click to set a breakpoint in an R script
    • Step in/out/over, inspect variables
    • Eliminate the edit -> browser -> repair cycle
  • 13. Performance: Multi-threaded Math (Open Source R vs. Revolution R Enterprise)

    Computation (4-core laptop)          Open Source R   Revolution R   Speedup
    Linear Algebra [1]
      Matrix Multiply                    327 sec         13.4 sec       23x
      Cholesky Factorization             31.3 sec        1.8 sec        17x
      Linear Discriminant Analysis       216 sec         74.6 sec       2x
    General R Benchmarks [2]
      R Benchmarks (Matrix Functions)    22 sec          3.5 sec        5x
      R Benchmarks (Program Control)     5.6 sec         5.4 sec        Not appreciable

    [1] http://www.revolutionanalytics.com/why-revolution-r/benchmarks.php
    [2] http://r.research.att.com/benchmarks/
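The linear-algebra rows in slide 13's table can be approximated on any R installation with base functions. What follows is only a rough sketch: the matrix size `n` is an assumption (the slide does not state the sizes used), and the timings you see will depend on the BLAS library your copy of R is linked against.

```r
# Minimal timing sketch; the matrix size is an assumption, not the size
# used in the benchmarks quoted on the slide.
set.seed(1)
n <- 500
a <- matrix(rnorm(n * n), nrow = n)

# Matrix multiply: crossprod(a) computes t(a) %*% a via the linked BLAS.
t_mult <- system.time(b <- crossprod(a))["elapsed"]

# Cholesky factorization of the symmetric positive-definite matrix b.
t_chol <- system.time(r <- chol(b))["elapsed"]

print(c(multiply = unname(t_mult), cholesky = unname(t_chol)))
```

Running the same script under two R builds gives a like-for-like comparison of these two operations, though not a reproduction of the published numbers.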
  • 14. Three Paradigms for Big Data
    Standard R engine is constrained by capacity and performance
    Revolution R Enterprise offers three methods for big data with R:
    • Off-line: high-performance file-based analytics
    • Off-line, parallel & distributed analytics
    • On-line, in-database analytics
      • Hadoop
      • Netezza
  • 15. Revolution R Enterprise with RevoScaleR: Big Data Statistics in R
    www.revolutionanalytics.com/bigdata
    Every US airline departure and arrival, 1987-2008
    File: AirlineData87to08.xdf
    Rows: 123.5 million
    Variables: 29
    Size on disk: 13.2Gb
    arrDelayLm2 <- rxLinMod(ArrDelay ~ DayOfWeek:F(CRSDepTime), cube=TRUE)
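To get a feel for the formula on slide 15 without RevoScaleR, here is a base R sketch on synthetic data. It assumes that `F()` treats the numeric departure time as a categorical variable, and it uses `factor()` as a stand-in; the data frame `flights` and its values are made up. Under that assumption, an interaction-only model estimates one average delay per (day, hour) cell, which is the same thing as a table of per-cell means.

```r
# Base R sketch with synthetic data; column names mirror the slide's formula,
# but the values are made up and factor() stands in for F() (an assumption).
set.seed(7)
n <- 2000
flights <- data.frame(
  DayOfWeek  = factor(sample(c("Mon", "Tue", "Wed"), n, replace = TRUE)),
  CRSDepTime = sample(6:9, n, replace = TRUE)   # scheduled departure hour
)
flights$ArrDelay <- rnorm(n, mean = 5 + flights$CRSDepTime)

# Interaction-only model without an intercept: one coefficient per
# (day, hour) cell.
fit <- lm(ArrDelay ~ 0 + DayOfWeek:factor(CRSDepTime), data = flights)

# The same numbers computed directly as per-cell averages.
cell_means <- tapply(flights$ArrDelay,
                     list(flights$DayOfWeek, flights$CRSDepTime), mean)

print(round(cell_means, 2))
```

This only illustrates what the model estimates. It says nothing about how `rxLinMod` computes it on a 123.5-million-row file.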
  • 16. RevoScaleR: Big Data algorithms
    • Data processing (rxDataStep)
    • Descriptive statistics (rxSummary)
    • Tables and cubes (rxCube, rxCrossTabs)
    • Correlations/covariances (rxCovCor, rxCor, rxCov, rxSSCP)
    • Linear regressions (rxLinMod)
    • Logistic regressions (rxLogit)
    • K means clustering (rxKmeans)
    • Predictions (scoring) (rxPredict)
    • Custom distributed computing (rxExec)
  • 17. RevoScaleR – Distributed Computing
    • Portions of the data source are made available to each compute node
    • RevoScaleR on the master node assigns a task to each compute node
    • Each compute node independently processes its data, and returns its intermediate results back to the master node
    • Master node aggregates all of the intermediate results from each compute node and produces the final result
    (Diagram: one Master Node running RevoScaleR, connected to four Compute Nodes running RevoScaleR, each with its own Data Partition)
    *Available now for Microsoft HPC Server
    Video demo: http://bit.ly/ugQ9KR
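The four steps on slide 17 (partition, assign, process independently, aggregate) can be sketched in base R for a simple linear regression. This is not RevoScaleR's implementation: the synthetic data, the four-way split, and the helper `process_chunk` are all assumptions made for illustration, and the "nodes" here are just iterations of `lapply` on one machine.

```r
# Sketch of the split / process / aggregate pattern in base R.
# Synthetic data stands in for a large data source (an assumption).
set.seed(42)
n <- 10000
dat <- data.frame(x = rnorm(n))
dat$y <- 2 + 3 * dat$x + rnorm(n)

# 1. Partition the rows into four chunks, one per hypothetical compute node.
chunks <- split(dat, rep(1:4, length.out = n))

# 2. Each "node" computes intermediate results from its own chunk only:
#    the cross-products X'X and X'y.
process_chunk <- function(d) {
  X <- cbind(intercept = 1, x = d$x)
  list(xtx = crossprod(X), xty = crossprod(X, d$y))
}
partials <- lapply(chunks, process_chunk)

# 3. The "master" sums the intermediate results and solves for coefficients.
xtx <- Reduce(`+`, lapply(partials, `[[`, "xtx"))
xty <- Reduce(`+`, lapply(partials, `[[`, "xty"))
beta <- drop(solve(xtx, xty))

print(beta)
```

The pattern works because the cross-products are additive across chunks, so no node ever needs to see another node's rows. The result matches what `lm(y ~ x, data = dat)` returns on the full data.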
  • 18. Platform-agnostic Big Data Analytics
    • Set “compute context” to define hardware (one line of code)
    • Native job-scheduler handles distribution, monitoring, failover etc.
    • Same code runs on other supported architectures
      • Just change compute context
    • Supported architectures:
      • Windows: Microsoft HPC Server
      • Linux: Platform Computing LSF (coming 2012)
    42 seconds instead of 6 minutes
  • 19. A common analytic platform across big data architectures: Hadoop, File Based, In-database
  • 20. In-Database Execution with IBM Netezza. More info: http://bit.ly/R-Netezza
  • 21. R and Hadoop
    • Hadoop offers a scalable infrastructure for processing massive amounts of data
      • Storage – HDFS, HBASE
      • Distributed Computing - MapReduce
    • R is a statistical programming language for developing advanced analytic applications
    • Currently, writing analytics for Hadoop requires a combination of Java, Pig, Python, …
    • The RHadoop project makes it possible to write Big Data algorithms for Hadoop using the R language alone.
  • 22. RevoConnectR for Hadoop
    Write Map-Reduce analytics using only R code with these R packages:
    • rhdfs - R and HDFS
    • rhbase - R and HBASE
    • rmr - R and MapReduce
    (Diagram: Revolution R Client using rhbase, rhdfs and rmr; HBASE via Thrift; HDFS; Job Tracker; R on each Map or Reduce Task Node)
    More information at: bit.ly/r-hadoop
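For readers new to the map-reduce model that slides 21 and 22 refer to, the pattern itself can be sketched in base R. This sketch does not call rmr, rhdfs, rhbase or Hadoop; the input records and the helper `map_fn` are made up for illustration, and everything runs in a single R session.

```r
# Map-reduce pattern in base R only; this does not call rmr or Hadoop.
# The input records (carrier code, delay in minutes) are made-up examples.
records <- c("AA,12", "UA,-3", "AA,30", "DL,5", "UA,9")

# Map: turn each record into a (key, value) pair.
map_fn <- function(rec) {
  parts <- strsplit(rec, ",", fixed = TRUE)[[1]]
  list(key = parts[1], value = as.numeric(parts[2]))
}
pairs <- lapply(records, map_fn)

# Shuffle: group the values by key.
keys <- vapply(pairs, function(p) p$key, character(1))
values <- vapply(pairs, function(p) p$value, numeric(1))
grouped <- split(values, keys)

# Reduce: collapse each key's values to a single result (here, the mean).
result <- vapply(grouped, mean, numeric(1))
print(result)
```

On a cluster, the map and reduce steps would run on separate task nodes and the shuffle would be handled by the framework; the logic per key stays the same.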
  • 23. Enterprise Readiness: Revolution R Enterprise Server
    • Multi-User Support
    • Production Applications
    • Integrate R analytics into Web based applications
      • Data Analysis and Visualization
      • Reporting
      • Dashboards
      • Interactive applications
    Revolution R Enterprise Server with RevoDeployR
  • 24. Enterprise-Wide Deployment
    (Diagram labels)
    • Research and Development: Data Scientists / Modelers
    • Production: Revolution R Enterprise Server + Hadoop + IBM Netezza + Windows HPC Server cluster
    • Management Console
    • End-User Deployment: Excel, Web App, BI Server
    • RevoDeployR Web Services API
    • Analysts / Corporate Users
  • 25. On-Demand Analytics with RevoDeployR
  • 26. The Advanced Analytics Stack
    • Deployment / Consumption
    • Advanced Analytics
    • ETL
    • Data / Infrastructure
    “Open Analytics Stack” White Paper: bit.ly/lC43Kw
  • 27.
    • On-Call Technical Support
    • Consulting: Migration | Analytics | Applications | Validation
    • Training: R | Revolution R | Statistical Topics
    • Systems Integration: BI | ERP | Databases | Cloud
  • 28. Wrapping Up
  • 29. Why R?
    • Every data analysis technique at your fingertips
    • Create beautiful and unique data visualizations
    • Get better results faster
    • Draw on the talents of data scientists worldwide
    • R is hot, and growing fast
  • 30. Revolution R Enterprise: Production-Grade Statistical Analysis for the Workplace
    • High-performance R for multiprocessor systems
    • Modern Integrated Development Environment
    • Statistical Analysis of Terabyte-Class Data Sets
    • In-database R analytics with Hadoop and Netezza
    • Deploy R Applications via Web Services
    • Telephone and email technical support
    • Training and consulting services
    • 100% compatible with R packages
  • 31. Revolution R Enterprise: Free to Academia
    • Personal use
    • Research
    • Teaching
    • Package development
    Free Academic Download: www.revolutionanalytics.com/downloads/free-academic.php
    Discounted Technical Support Subscriptions Available
  • 32. Thank You!
    • Download slides, replay: http://bit.ly/z9xUG9
    • Learn more about Revolution R: revolutionanalytics.com/products
    • Contact Revolution Analytics: http://bit.ly/hey-revo
    • Feb 29: Turbo-Charge Your Analytics with IBM Netezza and Revolution R Enterprise: A Step-by-Step Approach for Acceleration and Innovation, presented by William Zanine (IBM Analytics Solutions). www.revolutionanalytics.com/news-events/free-webinars
  • 33. Poll Question: What interests you most about Revolution R Enterprise?
  • 34. Revolution Analytics: The leading commercial provider of software and support for the popular open source R statistics language. www.revolutionanalytics.com, +1 (650) 646 9545, Twitter: @RevolutionR