- 1. Revolution Confidential. Revolution R: 100% R and More. Presented by: David Smith, VP Marketing and Community, Revolution Analytics
- 2. Poll Question: Which stats package do you use most?
- 3. February 22, 2011: Welcome! Thanks for coming. Slides and replay available (soon) at: http://bit.ly/z9xUG9 David Smith, VP Marketing & Community, Revolution Analytics; Editor, Revolutions blog, http://blog.revolutionanalytics.com ; Twitter: @revodavid
- 4. In today’s webcast: About Revolution Analytics and R; What Revolution R adds to R; Resources for getting more from R; Q&A. Introducing Revolution R
- 5. What is R? (Download the White Paper "R is Hot": bit.ly/r-is-hot)
  - Data analysis software
  - A programming language: development platform designed by and for statisticians
  - An environment: huge library of algorithms for data access, data manipulation, analysis and graphics
  - An open-source software project: free, open, and active
  - A community: thousands of contributors, 2 million users; resources and help in every domain
- 6. R is exploding in popularity and functionality
  - Scholarly Activity, Google Scholar hits (’05-’09 CAGR): R 46%, SAS -11%, SPSS -27%, S-Plus 0%, Stata 10%
  - “I’ve been astonished by the rate at which R has been adopted. Four years ago, everyone in my economics department [at the University of Chicago] was using Stata; now, as far as I can tell, R is the standard tool, and students learn it first.” Deputy Editor for New Products at Forbes
  - Package Growth: number of R packages listed on CRAN (chart, 2002 to 2010)
  - “A key benefit of R is that it provides near-instant availability of new and experimental methods created by its user base, without waiting for the development/release cycle of commercial software. SAS recognizes the value of R to our customer base…” Product Marketing Manager, SAS Institute, Inc.
  - Source: http://r4stats.com/popularity
- 7. “R is the most powerful & flexible statistical programming language in the world” [1]
  - Capabilities: sophisticated statistical analyses; predictive analytics; data visualization
  - Applications: real-time trading; finance; risk assessment; forecasting; bio-technology; drug development; social networks; ... and more
  - [1] Norman Nie, multiple interviews
- 8. R User Community. From: The R Ecosystem, bit.ly/R-ecosystem
- 9. Poll Question: If you're not using R today, what would you most like to use R for?
- 10. Revolution R Enterprise is
- 11. R Productivity Environment (Windows)
  - Script with type ahead and code snippets
  - Solutions window for organizing code and data
  - Sophisticated debugging with breakpoints, variable values etc.
  - Objects loaded in the R Environment
  - Packages installed and loaded
  - Object details
  - Demo: http://www.revolutionanalytics.com/demos/revolution-productivity-environment/demo.htm
- 12. Interactive Debugging: One-click to set a breakpoint in an R script. Step in/out/over, inspect variables. Eliminate the edit -> browser -> repair cycle
- 13. Performance: Multi-threaded Math

  | Computation (4-core laptop) | Open Source R | Revolution R Enterprise | Speedup |
  |---|---|---|---|
  | **Linear Algebra** [1] | | | |
  | Matrix Multiply | 327 sec | 13.4 sec | 23x |
  | Cholesky Factorization | 31.3 sec | 1.8 sec | 17x |
  | Linear Discriminant Analysis | 216 sec | 74.6 sec | 2x |
  | **General R Benchmarks** [2] | | | |
  | R Benchmarks (Matrix Functions) | 22 sec | 3.5 sec | 5x |
  | R Benchmarks (Program Control) | 5.6 sec | 5.4 sec | Not appreciable |

  [1] http://www.revolutionanalytics.com/why-revolution-r/benchmarks.php
  [2] http://r.research.att.com/benchmarks/
- 14. Three Paradigms for Big Data. Standard R engine is constrained by capacity and performance. Revolution R Enterprise offers three methods for big data with R:
  - Off-line: high-performance file-based analytics
  - Off-line, parallel & distributed analytics
  - On-line, in-database analytics
  - Platforms shown: Hadoop, Netezza
- 15. Revolution R Enterprise with RevoScaleR: Big Data Statistics in R (www.revolutionanalytics.com/bigdata). Every US airline departure and arrival, 1987-2008. File: AirlineData87to08.xdf. Rows: 123.5 million. Variables: 29. Size on disk: 13.2Gb. `arrDelayLm2 <- rxLinMod(ArrDelay ~ DayOfWeek:F(CRSDepTime), cube=TRUE)`
- 16. RevoScaleR: Big Data algorithms
  - Data processing (rxDataStep)
  - Descriptive statistics (rxSummary)
  - Tables and cubes (rxCube, rxCrossTabs)
  - Correlations/covariances (rxCovCor, rxCor, rxCov, rxSSCP)
  - Linear regressions (rxLinMod)
  - Logistic regressions (rxLogit)
  - K means clustering (rxKmeans)
  - Predictions (scoring) (rxPredict)
  - Custom distributed computing (rxExec)
- 17. RevoScaleR – Distributed Computing
  - Portions of the data source are made available to each compute node
  - RevoScaleR on the master node assigns a task to each compute node
  - Each compute node independently processes its data, and returns its intermediate results back to the master node
  - The master node aggregates all of the intermediate results from each compute node and produces the final result
  - Diagram: a Master Node (RevoScaleR) connected to four Compute Nodes (RevoScaleR), each with its own Data Partition
  - *Available now for Microsoft HPC Server. Video demo: http://bit.ly/ugQ9KR
- 18. Platform-agnostic Big Data Analytics
  - Set “compute context” to define hardware (one line of code)
  - Native job-scheduler handles distribution, monitoring, failover etc.
  - Same code runs on other supported architectures: just change compute context
  - Supported architectures: Windows: Microsoft HPC Server; Linux: Platform Computing LSF (coming 2012)
  - 42 seconds instead of 6 minutes
- 19. A common analytic platform across big data architectures: Hadoop, File Based, In-database
- 20. In-Database Execution with IBM Netezza. More info: http://bit.ly/R-Netezza
- 21. R and Hadoop
  - Hadoop offers a scalable infrastructure for processing massive amounts of data. Storage: HDFS, HBase. Distributed computing: MapReduce
  - R is a statistical programming language for developing advanced analytic applications
  - Currently, writing analytics for Hadoop requires a combination of Java, Pig, Python, …
  - The RHadoop project makes it possible to write Big Data algorithms for Hadoop using the R language alone
- 22. RevoConnectR for Hadoop. Write Map-Reduce analytics using only R code with these R packages:
  - rhdfs: R and HDFS
  - rhbase: R and HBASE
  - rmr: R and MapReduce
  - Diagram labels: Revolution R, Client, Job Tracker, Map or Reduce Task Node, HDFS, HBASE, Thrift, rmr, rhdfs, rhbase
  - More information at: bit.ly/r-hadoop
- 23. Enterprise Readiness: Revolution R Enterprise Server
  - Multi-User Support
  - Production Applications
  - Integrate R analytics into Web based applications: Data Analysis and Visualization; Reporting; Dashboards; Interactive applications
  - Revolution R Enterprise Server with RevoDeployR
- 24. Enterprise-Wide Deployment (diagram). Labels: Research and Development (Data Scientists / Modelers); Production (Revolution R Enterprise Server + Hadoop + IBM Netezza + Windows HPC Server cluster); Management Console; Deployment Server; RevoDeployR Web Services API; End-User (Excel, Web App, BI); Analysts / Corporate Users
- 25. On-Demand Analytics with RevoDeployR
- 26. The Advanced Analytics Stack: Deployment / Consumption; Advanced Analytics; ETL; Data / Infrastructure. “Open Analytics Stack” White Paper: bit.ly/lC43Kw
- 27. Services
  - On-Call Technical Support
  - Consulting: Migration | Analytics | Applications | Validation
  - Training: R | Revolution R | Statistical Topics
  - Systems Integration: BI | ERP | Databases | Cloud
- 28. Wrapping Up
- 29. Why R?
  - Every data analysis technique at your fingertips
  - Create beautiful and unique data visualizations
  - Get better results faster
  - Draw on the talents of data scientists worldwide
  - R is hot, and growing fast
- 30. Revolution R Enterprise: Production-Grade Statistical Analysis for the Workplace
  - High-performance R for multiprocessor systems
  - Modern Integrated Development Environment
  - Statistical Analysis of Terabyte-Class Data Sets
  - In-database R analytics with Hadoop and Netezza
  - Deploy R Applications via Web Services
  - Telephone and email technical support
  - Training and consulting services
  - 100% compatible with R packages
- 31. Revolution R Enterprise: Free to Academia. Personal use; Research; Teaching; Package development. Free Academic Download: www.revolutionanalytics.com/downloads/free-academic.php Discounted Technical Support Subscriptions Available
- 32. Thank You!
  - Download slides, replay: http://bit.ly/z9xUG9
  - Learn more about Revolution R: revolutionanalytics.com/products
  - Contact Revolution Analytics: http://bit.ly/hey-revo
  - Feb 29: Turbo-Charge Your Analytics with IBM Netezza and Revolution R Enterprise: A Step-by-Step Approach for Acceleration and Innovation, presented by William Zanine (IBM Analytics Solutions). www.revolutionanalytics.com/news-events/free-webinars
- 33. Poll Question: What interests you most about Revolution R Enterprise?
- 34. Revolution Analytics: The leading commercial provider of software and support for the popular open source R statistics language. www.revolutionanalytics.com +1 (650) 646 9545 Twitter: @RevolutionR
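
Slide 13 compares timings for matrix multiplication and Cholesky factorization. As a rough way to take the same kind of measurement on your own machine, the base-R sketch below times both operations. The matrix dimensions are arbitrary choices for illustration and are not the sizes behind the slide's figures. The results depend on the linear-algebra library your R build links against and on the number of cores available.

```r
# Time two of the linear-algebra operations listed on slide 13 in whatever
# R build runs this script. The dimensions below are illustrative only.
set.seed(1)
m <- 800
n <- 400
A <- matrix(rnorm(m * n), nrow = m, ncol = n)

# Matrix multiply: t(A) %*% A gives an n x n symmetric matrix.
t_mult <- system.time(B <- crossprod(A))

# Cholesky factorization: upper-triangular U with t(U) %*% U equal to B.
t_chol <- system.time(U <- chol(B))

cat("matrix multiply:", t_mult[["elapsed"]], "sec\n")
cat("cholesky:       ", t_chol[["elapsed"]], "sec\n")
```

Running the same script under two different R builds gives a like-for-like comparison in the spirit of the table.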
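
The `rxLinMod` call on slide 15 needs RevoScaleR and the airline data file. To show the shape of that model on a small scale, here is a base-R analogue using `lm` on synthetic data. The data frame `flights` and its values are made up for illustration, and `factor()` stands in for `F()` on the assumption that `F()` treats `CRSDepTime` as categorical. With an interaction-only formula and no intercept, each coefficient is simply the mean delay for one day/departure-time cell.

```r
# Synthetic stand-in for the airline data (values are invented).
set.seed(42)
n <- 2000
flights <- data.frame(
  DayOfWeek  = factor(sample(c("Mon", "Tue", "Wed"), n, replace = TRUE)),
  CRSDepTime = sample(c(6, 12, 18), n, replace = TRUE),  # scheduled hour
  ArrDelay   = rnorm(n, mean = 5, sd = 20)               # minutes
)

# Interaction-only model without an intercept: one coefficient per cell.
fit <- lm(ArrDelay ~ 0 + DayOfWeek:factor(CRSDepTime), data = flights)

# The same numbers computed directly as per-cell averages.
cell_means <- with(flights, tapply(ArrDelay, list(DayOfWeek, CRSDepTime), mean))

print(round(coef(fit), 2))
print(round(cell_means, 2))
```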
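
The four steps on slide 17 (partition the data, assign a task per node, return intermediate results, aggregate on the master) can be sketched in base R for a least-squares regression. This is only an illustration of the pattern: the slides do not say which intermediate results RevoScaleR exchanges, and the function `node_task`, the simulated data, and the choice of four partitions are assumptions. The "nodes" here are list elements processed by `lapply`, not real machines.

```r
# Simulated data: y depends linearly on x (coefficients chosen arbitrarily).
set.seed(7)
n <- 1000
dat <- data.frame(x = rnorm(n))
dat$y <- 2 + 3 * dat$x + rnorm(n)

# Step 1: make a portion of the data available to each "compute node".
partitions <- split(dat, rep(1:4, length.out = n))

# Steps 2-3: each node computes small intermediate results from its own rows.
node_task <- function(chunk) {
  X <- cbind(1, chunk$x)                  # design matrix with intercept
  list(xtx = crossprod(X),                # t(X) %*% X, a 2 x 2 matrix
       xty = crossprod(X, chunk$y))       # t(X) %*% y, a 2 x 1 matrix
}
partials <- lapply(partitions, node_task)

# Step 4: the "master" sums the pieces and solves the normal equations.
xtx <- Reduce(`+`, lapply(partials, `[[`, "xtx"))
xty <- Reduce(`+`, lapply(partials, `[[`, "xty"))
beta <- solve(xtx, xty)

print(as.vector(beta))
```

Only the small cross-product matrices travel back to the master, so the amount of data exchanged does not grow with the number of rows. The result matches fitting `lm(y ~ x)` on the full data.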
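
Slides 21 and 22 describe writing MapReduce analytics in R alone. The sketch below walks through the map, shuffle and reduce phases using only base R on a tiny in-memory vector. It does not use the rmr package or Hadoop, and the carrier codes and delays are invented for illustration.

```r
# Toy input: one record per flight (values are invented).
carrier <- c("AA", "UA", "AA", "DL", "UA", "AA")
delay   <- c(10, 5, 20, 0, 15, 30)

# Map: emit one key-value pair per input record.
mapped <- Map(function(k, v) list(key = k, value = v), carrier, delay)

# Shuffle: bring together all values that share a key.
keys    <- vapply(mapped, function(kv) kv$key, character(1))
values  <- vapply(mapped, function(kv) kv$value, numeric(1))
grouped <- split(values, keys)

# Reduce: collapse each key's values to a single result.
mean_delay <- vapply(grouped, mean, numeric(1))

print(mean_delay)
```

On a cluster the map and reduce functions would run on different nodes and the framework would handle the shuffle. Here all three phases run in one R session to make the data flow visible.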
