05Nov13 Webinar: Introducing Revolution R Enterprise 7 - The Big Data Big Analytics Platform
 

  • Enterprise readiness; Performance architecture; Big Data analytics; Data source integration; Development tools; Deployment tools
  • The RRE license combines the GPL v2 license (which guarantees commercial usage of R) with a proprietary license covering our proprietary components.
  • Enterprise readiness: Build assurance (continuous testing, custom validation); Implementation tools (validation utility); Technical support, documentation, training
    Performance architecture: Fast math libraries; Better memory management; Multi-core processing; Distributed computing architecture
    Big Data analytics: Descriptive Statistics; Cross Tabulation; Statistical Tests; Correlation, Covariance and SSCP Matrices; Linear Regression; Logistic Regression; Generalized Linear Models; Decision Trees; K-Means Clustering
    Data source integration: ODBC; Teradata (high speed); Text Files (Delimited & Fixed format); SAS; SPSS; Hadoop (HDFS & HBase)
    Development tools: Visual Debugger; Script Editor; R Snippets; Object Browser; Solution Explorer; Customizable Workspace; Version Control Plug-In
    Deployment tools: R objects as JSON, XML; Supports Java, JavaScript, .NET; RESTful web services API; Security (LDAP, SSO); Built-In load balancing; Asynchronous scheduling; Management console; Accelerators (Jaspersoft, QlikView)
  • A Revolution R Enterprise ScaleR analytic is given a data source as input.
    The analytic loops over the data, reading one block at a time. Blocks of data are read by a separate worker thread (Thread 0).
    Worker threads (Threads 1..n) process the data block from the previous iteration of the data loop and update intermediate results objects in memory.
    When all of the data has been processed, a master results object is created from the intermediate results objects.
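The read, process, and combine loop described in this note can be sketched in a few lines of Python. This is only an illustration of the pattern, not ScaleR's implementation: the function names, the in-memory list standing in for a data source, and the choice of mean and variance as the statistic are all assumptions made for the example. It also simplifies one detail: each worker takes whole blocks from a queue rather than sharing a single block.

```python
import queue
import threading

def read_blocks(data, block_size, out_q, n_workers):
    """Thread 0: read the data source one block at a time."""
    for start in range(0, len(data), block_size):
        out_q.put(data[start:start + block_size])
    for _ in range(n_workers):
        out_q.put(None)  # one stop signal per worker thread

def process_blocks(in_q, partial):
    """Threads 1..n: fold each block into this worker's intermediate result."""
    while True:
        block = in_q.get()
        if block is None:
            break
        partial["n"] += len(block)
        partial["sum"] += sum(block)
        partial["sumsq"] += sum(x * x for x in block)

def chunked_mean_var(data, block_size=4, n_workers=3):
    """Hypothetical driver: returns count, mean and sample variance."""
    # Bounded queue: the reader stays only a few blocks ahead of the workers.
    q = queue.Queue(maxsize=n_workers)
    # One intermediate results object per worker, so no locking is needed.
    partials = [{"n": 0, "sum": 0.0, "sumsq": 0.0} for _ in range(n_workers)]
    threads = [threading.Thread(target=read_blocks,
                                args=(data, block_size, q, n_workers))]
    threads += [threading.Thread(target=process_blocks, args=(q, p))
                for p in partials]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Master results object: combine the intermediate results.
    n = sum(p["n"] for p in partials)
    s = sum(p["sum"] for p in partials)
    ss = sum(p["sumsq"] for p in partials)
    mean = s / n
    var = (ss - n * mean * mean) / (n - 1)
    return {"n": n, "mean": mean, "var": var}
```

The point of the design is that only one block per thread is held in memory at a time, and the final answer depends only on small summary objects, which is what lets the same analytic run on data larger than RAM.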
  • Most current stable release; ~150 new features; Support for long vectors; ~100 bug fixes and performance improvements; 83 miscellaneous enhancements (installation, utilities, internationalization, etc.)
  • Semi-automatic modeling; Ideal for variable selection
    Methods: Forward; Backward; Bidirectional
    Selection criteria: AIC; BIC; Mallows’ Cp
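As a rough illustration of the forward method with an information criterion, the sketch below adds one variable at a time and stops when the criterion no longer improves. It is not the RRE implementation. The helper names are hypothetical, the AIC and BIC formulas are the usual Gaussian linear-model forms with additive constants dropped, and the residual sums of squares in the toy table are made up so the example runs without fitting any model.

```python
import math

def aic(rss, n, k):
    """AIC for a Gaussian linear model (constant terms dropped)."""
    return n * math.log(rss / n) + 2 * k

def bic(rss, n, k):
    """BIC for a Gaussian linear model (constant terms dropped)."""
    return n * math.log(rss / n) + k * math.log(n)

def forward_select(candidates, criterion):
    """Greedy forward selection: add the variable that lowers the
    criterion most; stop when no addition improves it."""
    selected = []
    best = criterion(selected)
    remaining = list(candidates)
    while remaining:
        score, var = min((criterion(selected + [v]), v) for v in remaining)
        if score >= best:
            break
        selected.append(var)
        remaining.remove(var)
        best = score
    return selected, best

# Toy data (assumed, not from the slides): residual sum of squares
# for each subset of three candidate variables, with n = 50 rows.
N_ROWS = 50
TOY_RSS = {
    frozenset(): 100.0,
    frozenset({"x1"}): 40.0,
    frozenset({"x2"}): 90.0,
    frozenset({"x3"}): 98.0,
    frozenset({"x1", "x2"}): 30.0,
    frozenset({"x1", "x3"}): 39.5,
    frozenset({"x1", "x2", "x3"}): 29.9,
}

def toy_aic(subset):
    # k counts the selected variables plus the intercept.
    return aic(TOY_RSS[frozenset(subset)], N_ROWS, len(subset) + 1)

chosen, score = forward_select(["x1", "x2", "x3"], toy_aic)
# chosen is ["x1", "x2"]: x3 lowers the RSS slightly but not enough
# to pay for the extra parameter.
```

Swapping `aic` for `bic` in `toy_aic` applies a heavier per-parameter penalty; backward and bidirectional search change only which moves are tried at each step.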
  • “Random Forests™”; Ensemble learning method (Classification, Regression); Trains many trees; Output is mode of classes; Variety of use cases
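"Output is mode of classes" means that each tree votes and the forest returns the most common vote. A minimal sketch of that voting step follows; the three hand-written "trees" are stand-ins invented for the example (real trees would be trained on bootstrap samples), and `forest_predict` is a hypothetical name, not an RRE function.

```python
from collections import Counter

def forest_predict(trees, x):
    """Classify x by majority vote (the mode) across all trees."""
    votes = [tree(x) for tree in trees]
    return Counter(votes).most_common(1)[0][0]

# Stand-in "trees": simple rules on a dict of features (assumed for illustration).
toy_trees = [
    lambda x: "spam" if x["links"] > 3 else "ham",
    lambda x: "spam" if x["caps_ratio"] > 0.5 else "ham",
    lambda x: "spam" if x["links"] > 1 and x["caps_ratio"] > 0.2 else "ham",
]

message = {"links": 2, "caps_ratio": 0.6}
label = forest_predict(toy_trees, message)  # two of three trees vote "spam"
```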

Presentation Transcript

  • Announcing: Release 7, Revolution R Enterprise
    Tuesday, November 5
    Michele Chambers, Chief Strategy Officer and VP Product Management
    Thomas W. Dinsmore, Director of Product Management
  • Agenda
    Introduction
      – Demystifying R
      – Revolution Analytics at a Glance
      – Revolution R Enterprise
      – Revolution Analytics Partner Ecosystem
      – Customer Testimonials
    What’s New in RRE 7?
    More Information
    Questions
  • Demystifying R: What is R & why is it so darn popular?
  • R is exploding in popularity & function
    [Four charts comparing R, SAS, SPSS, S-Plus and Stata; plotted values not recoverable except where noted]
      – Internet Discussion: mean monthly traffic on email discussion list
      – Package Growth: number of R packages listed on CRAN (4,332 as of Feb 2013)
      – Web Site Popularity: number of links to main web site
      – Scholarly Activity: Google Scholar hits, ’05-’09 CAGR (values shown: -27%, -11%, 0%, 10%, 46%)
  • Latest survey shows significant growth in R adoption
    R Usage Growth, Rexer Data Miner Survey, 2007-2013 (Source: www.rexeranalytics.com)
      – 70% of data miners report using R
      – 24% use R as primary tool
    “I’ve been astonished by the rate at which R has been adopted. Four years ago, everyone in my economics department [at the University of Chicago] was using Stata; now, as far as I can tell, R is the standard tool, and students learn it first.” Deputy Editor for New Products at Forbes
    “A key benefit of R is that it provides near-instant availability of new and experimental methods created by its user base — without waiting for the development/release cycle of commercial software. SAS recognizes the value of R to our customer base…” Product Marketing Manager, SAS Institute, Inc
  • Revolution Analytics at a Glance
    Who We Are: Only provider of commercial big data big analytics platform based on open source R statistical computing language
    Customers: 200+ Global 2000
    Global Presence: North America / EMEA / APAC
    Our Software Delivers
      – Scalable Performance: Distributed & parallelized analytics
      – Cross Platform: Write once, deploy anywhere
      – Productivity: Easily build & deploy with latest modern analytics
    Our Services Deliver
      – Knowledge: Our experts enable you to be experts
      – Time-to-Value: Our Quickstart program gives you a jumpstart
      – Guidance: Our customer support team is here to help you
    Global Industries Served: Financial Services; Digital Media; Government; Health & Life Sciences; High Tech; Manufacturing; Retail; Telco
  • Revolution R Enterprise is… the only big data big analytics platform based on open source R, the de facto statistical computing language for modern analytics
      – High Performance, Scalable Analytics
      – Portable Across Enterprise Platforms
      – Easier to Build & Deploy Analytics
  • R is open source and drives analytic innovation, but… has some limitations for Enterprises
      – Big Data: In-memory bound | Hybrid memory & disk scalability | Operates on bigger volumes & factors
      – Speed of Analysis: Single threaded | Parallel threading | Shrinks analysis time
      – Enterprise Readiness: Community support | Commercial support | Delivers full service production support
      – Analytic Breadth & Depth: 5000+ innovative analytic packages | Leverage open source packages plus Big Data ready packages | Supercharges R
      – Commercial Viability: Risk of deployment of open source | Commercial license | Eliminate risk with open source
  • Introducing Revolution R Enterprise (RRE): The Big Data Big Analytics Platform
    Components: DevelopR; ConnectR; ScaleR; DistributedR; DeployR
    Big Data Big Analytics Ready
      – Enterprise readiness
      – High performance analytics
      – Multi-platform architecture
      – Data source integration
      – Development tools
      – Deployment tools
  • The Platform Step by Step: R Capabilities
    R+CRAN
      – Open source R interpreter
      – Freely-available R algorithms
      – Algorithms callable by RevoR
      – Embeddable in R scripts
      – 100% compatible with existing R scripts, functions and packages
    RevoR
      – Performance enhanced R interpreter
      – Based on open source R
      – Adds high-performance math
  • The Platform Step by Step: Parallelization & Data Sourcing
    ConnectR
      – High-speed & direct connectors
    ScaleR
      – Ready-to-use high-performance big data big analytics
      – Fully-parallelized analytics
      – Data prep & data distillation
      – Descriptive statistics & statistical tests
      – Correlation & covariance matrices
      – Predictive models: linear, logistic, GLM
      – Machine learning
      – Monte Carlo simulation
      – Tools for distributing customized algorithms across nodes
    DistributedR
      – Distributed computing framework
      – Delivers portability across platforms
  • The Platform Step by Step: Tools & Deployment
    DevelopR
      – Integrated development environment for R
      – Visual “step-into” debugger
    DeployR
      – Web services software development kit for integrating analytics via Java, JavaScript or .NET APIs
      – Integrates R into application infrastructures
    DeployR Capabilities
      – Invokes R scripts from web services calls
      – RESTful interface for easy integration
      – Works with web & mobile apps, leading BI & visualization tools and business rules engines
  • Write Once. Deploy Anywhere.
      – Hadoop: Hortonworks; Cloudera
      – EDW: IBM Netezza; Teradata
      – Clustered Systems: IBM Platform LSF; Microsoft HPC
      – Workstations & Servers: Desktop; Server
      – In the Cloud: Microsoft Azure Burst; Amazon AWS
    Components: DeployR; ConnectR; ScaleR; DistributedR
    DESIGNED FOR SCALE, PORTABILITY & PERFORMANCE
  • The Power of Revolution R Enterprise
    [Diagram: Value plotted against Performance & Scalability]
      – ScaleR: Moves computation to data; Labor saving power
      – DistributedR: Maximizes computation; Powerful divide & conquer; Effective memory utilization
      – RevoR: 3-50X faster
      – Open Source: Leverage latest innovation; Leverage CRAN
  • Revolution R Enterprise: Powering Next Generation Analytics
    [Diagram: COMBINE INTERMEDIATE RESULTS]
  • Revolution R Enterprise: RevoR, Performance Enhanced R
    Customers report 3-50x performance improvements compared to Open Source R, without changing any code.
    Computation (4-core laptop) | Open Source R | Revolution R | Speedup
    Linear Algebra [1]
      Matrix Multiply | 176 sec | 9.3 sec | 18x
      Cholesky Factorization | 25.5 sec | 1.3 sec | 19x
      Linear Discriminant Analysis | 189 sec | 74 sec | 3x
    General R Benchmarks [2]
      R Benchmarks (Matrix Functions) | 22 sec | 3.5 sec | 5x
      R Benchmarks (Program Control) | 5.6 sec | 5.4 sec | Not appreciable
    1. http://www.revolutionanalytics.com/why-revolution-r/benchmarks.php
    2. http://r.research.att.com/benchmarks/
  • RRE ScaleR outperforms SAS HPA – at a fraction of the cost
    Logistic Regression | SAS HPA* | RRE vs. SAS | RRE ScaleR
      Rows of data | 1 billion | | 1 billion
      Parameters | “just a few” | Double | 7
      Time | 80 seconds | 45% | 44 seconds
      Data location | In memory | | On disk
      Nodes | 32 | 1/6th | 5
      Cores | 384 | 5% | 20
      RAM | 1,536 GB | 5% | 80 GB
    Revolution R is faster on the same amount of data, despite using approximately a 20th as many cores, a 20th as much RAM, a 6th as many nodes, and not pre-loading data into RAM.
    Bottom Line: Revolution R Enterprise Performance = Greatly Reduced TCO
    *As published by SAS in HPC Wire, April 21, 2011
  • R + Revolution R Enterprise: Unequaled Big Data Big Analytics
      – Deploy Analytics: Web, Mobile, Data Visualization, BI
      – Big Data Distributed Analytics
      – Performance Enhanced R
  • Revolution R Enterprise Ecosystem: Power of Integration
    [Partner logo grid; only "Corios" survives as text]
    Categories: SI / Service; Deployment / Consumption; MSP / DSP; Advanced Analytics; ETL; Data / Infrastructure
  • Customers Revolutionize their Business
      – Power: 4X performance; 50M records scored daily
        “We’ve combined Revolution R Enterprise and Hadoop to build and deploy customized exploratory data analysis and GAM survival models for our marketing performance management and attribution platform. Given that our data sets are already in the terabytes and are growing rapidly, we depend on Revolution R Enterprise’s scalability and power – we saw about a 4x performance improvement on 50 million records. It works brilliantly.” - CEO, John Wallace, DataSong
      – Scalability: TBs of data from 200+ data sources; tens of thousands of attributes; hundreds of millions of scores daily
        “We’ve been able to scale our solution to a problem that’s so big that most companies could not address it. If we had to go with a different solution we wouldn’t be as efficient as we are now.” - SVP Analytics, Kevin Lyons, eXelate
      – Performance: 2X data, 2X attributes, no impact on performance
        “We need a high-performance analytics infrastructure because marketing optimization is a lot like a financial trading. By watching the market constantly for data or market condition updates, we can now identify opportunities for our clients that would otherwise be lost.” - Chief Analytics Officer, Leon Zemel, [x+1]
  • What’s New in RRE 7
  • The Power of R: Most widely used analytics tool; Preferred by working analysts; More than 6,000 packages; Global footprint
    New: R 3.0.2
  • Scalable Statistical Modeling: Linear Regression; Stepwise Linear; Logistic Regression; Generalized Linear Models
    New: Stepwise Logistic; Stepwise GLM
  • Scalable Machine Learning: Decision Trees
    New: Decision Forests; Tree Visualization
  • Data Source Integration: Fixed/delimited text; SAS, SPSS; ODBC; HDFS and HBase; Teradata
    Tested: HP Vertica; Teradata Aster
  • New: Model Integration
  • BI Integration: Custom web reports; QlikView accelerator
    New: Excel Accelerator; Tableau Integration
  • New: Business User Interface
  • Choice of Operating Systems
  • New: Inside-Hadoop Deployment
  • Multi-Node Package Manager
    [Diagram: HDFS Name Node; MapReduce Job Tracker; Data Node ×5; Task Tracker ×5]
  • ScaleR in Hadoop
    [Diagram: HDFS Name Node; MapReduce Job Tracker; Data Node ×5; Task Tracker ×5]
  • In-Database Deployment
  • Summary: What’s New in RRE 7.0: R 3.0.2
  • Summary: What’s New in RRE 7.0: Stepwise Logistic; Stepwise GLM; Decision Forests; Tree Visualizer; PMML Export
  • Summary: What’s New in RRE 7.0: DevelopR; DeployR
  • www.revolutionanalytics.com
  • www.revolutionanalytics.com/contact-us