Revolution Confidential
Revolution Analytics
Bringing the Analytical Power of
R to the Hadoop Platform
Simon Field
Technical Director,
Revolution Analytics
June 14, 2013
Vigorous Growth of Big Data…
The global Big Data Market revenue is expected to grow from $1.56
billion in 2012 to $13.95 billion in 2017, at an estimated CAGR of
54.9% from 2012 to 2017.
- Marketsandmarkets.com study, 14 April 2013
“…the market for Big Data technology will reach $16.9 billion by
2015, up from $3.2 billion in 2010. That is a 40 percent-a-year
growth rate – about seven times the estimated growth rate for the
overall information technology and communications business.”
– IDC study, March 2012
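As a quick arithmetic check on the quoted growth rates, the sketch below computes a compound annual growth rate from the start and end figures above. The function name `cagr` is introduced here for illustration and does not come from either study; the results land close to the quoted 54.9% and 40% rates.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over `years` periods."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures quoted on the slide, in USD billions.
marketsandmarkets_rate = cagr(1.56, 13.95, 5)  # 2012 -> 2017
idc_rate = cagr(3.2, 16.9, 5)                  # 2010 -> 2015

print(f"Marketsandmarkets: {marketsandmarkets_rate:.1%}")
print(f"IDC: {idc_rate:.1%}")
```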
Big Data = Opportunity + Disruption
Huge New Data Assets
• Internet – Commerce, Communications, Collaboration
• Social Media – Personal, Presence, New Social Networks
• Ubiquitous Telemetry – Machines Everywhere
Rapidly-Evolving Platforms
• “Data Lake” vs. “Warehouse” vs. “Big Data App. Platforms”
• Vast Choices Among Open Source Platforms
• Eliminate Time-Consuming Data Movements
Emerging Business Opportunities
• Data Science Unlocks New Insight
• Big Data Drives Better Decision-Making
• Platforms Evolve Rationally Toward Big Data Vision
Hadoop Analytics Platforms: Disruption,
Challenge, Growth & Opportunity At Once
• Java Skill Requirements
• Hadoop’s Innovation Pace
• Analytical
• Write Once, Deploy Anywhere
Growth: Skill Development
• EDW Saturation
• Limited Analytical Capabilities
• Data Science Skill Shortage
• MapReduce Paradigm
Disruption: Evolving Ecosystems
• Designed for Massive Scale
• Commodity Foundations
• Built for Data Variety
• Open Source Innovation Pace
Challenge: Big Data Readiness
• Descriptive -> Predictive
• Short Analytical Cycle Time
• Ubiquitous Analytical Decisions
• Low-Latency Analytics
Opportunity: New, More Capable Analytic Foundation
What We Need: Convergence
 Data Science
 With business solutions that fuse statistics, mathematics
and software into meaningful applications.
 Software Engineering
 With tools and frameworks to create agile, scalable
analytics-based applications
 IT Operations Management
 Deployment platforms that are integrated, cost-effective,
secure and ubiquitous.
What is the R Statistics Language?
 The R Language:
 Straightforward Procedural Language for Stats, Math
and Data Science
 Open Source
 The R Community:
 2M Users with the skill to tackle big data mathematical /
statistical and ML needs.
 Began on workstation / modest SMP servers
 The R Ecosystem:
 4500+ Freely Available Algorithms in CRAN
 Applicable to Big Data if scaled
Why R and Hadoop?
 Hadoop Dominates Big Data Storage and
Computational Platforms.
 R Dominates Data Science, Providing a
Language, Users and Thousands of Pre-Built
Algorithms.
 Bringing Them Together is Our Goal Today.
Mission
Revolution R Enterprise is the only commercial big data analytics platform
based on the open source R statistical computing language
Enterprise-ready
Multi-platform
Scalable from desktop to big data
Delivers high performance analytics
Easier to build and deploy analytic applications
Company Confidential – Do not distribute
Global Industries
Served
Financial Services
Digital Media
Government
Health & Life Sciences
High Tech
Manufacturing
Retail
Telco
Our Software Delivers
Power: Distributed, scalable high performance advanced analytics
Productivity: Easier to build and deploy analytic applications
Enterprise Readiness: Multi-platform
Our Philosophy
Customer-centric innovation
Easy to do business with
Our Investors
Intel Capital
North Bridge
Presidio Ventures
Who we are
Leading provider of commercial analytics platform based
on open source R statistical computing language
Customers
200+ Global 2000
Global Presence
North America / EMEA / APAC
Our Services Deliver
Knowledge: Our experts enable you to be experts
Time-to-Value: Our Quickstart projects give you a jumpstart
Guidance: Our customer support team is here to help you
Big Data Speed and Scale with
Revolution R Enterprise
Fast Math Libraries
Parallelized Algorithms
In-Database Execution
Multi-Threaded Execution
Multi-Core Execution
In-Hadoop Execution
Memory Management
Parallelized User Code
Revolution R Enterprise Propels
Enterprises into the Future
[Diagram: platform layers]
Decision: Analytic Applications
Integration: Middleware
Analytics: Revolution R Enterprise High Performance Analytics Platform
Data: Hadoop, Data Warehouse, Other Data Sources
200+ Corporate Customers and Growing
Digital Media & Retail
Finance & Insurance
Healthcare & Life Sciences
Manufacturing & High Tech
Academic & Gov’t
Revolution R Enterprise and
R MapReduce
Bringing The R Language to the
Hadoop Environment.
R MapReduce:
Fast, Agile Analytics for Hadoop Today
 R MapReduce Enables R-Based Analytics In Hadoop:
 Use R to Explore and Visualize Data to Develop Insights
 Build Models Using Widely-Available Techniques
 Score Data Directly in Hadoop Using R Models
 Run R as Mappers and Reducers in Hadoop
 Advantages:
 No data movement
 Connects R to HDFS, HBase and Hive
 Run standard MapReduce jobs
 R Programmers need not learn Java
 Need Not Rewrite R into Java, Pig or SQL to Score Data
 No Data Movement Needed
 Accelerates Projects Leveraging Libraries By Bringing
4500+ Open Source R Algorithms in CRAN¹ to Hadoop
[Diagram: R MapReduce (RMR) runs inside Hadoop alongside other MapReduce jobs, over HDFS, HBase and Hive, serving MapReduce analytics applications; data warehouses and other data sources sit in the data layer.]
¹ CRAN: Comprehensive R Archive Network – an open source collection of 4500+ R-based statistics, analytics, graphics and data manipulation algorithms for R users.
R MapReduce (RMR):
Build MapReduce Jobs Entirely In R
Your Creativity.
+
Your Code.
+
4500+ R Packages in CRAN
=
Rich, Powerful Data Analytics That Runs in MapReduce.
[Diagram: Revolution R Enterprise and CRAN packages supply R map and reduce tasks to Hadoop, over HBase, Hive and HDFS.]
Why Build MapReduce Jobs using R?
 What can you do with it?
 Transform, Aggregate, Regress, Cluster, Filter, Simulate, Model,
Score …
 Run R Programs While Leveraging Hadoop’s Scalability
 Big I/O: Score data files containing billions of rows
 Big Math: Run compute-intensive algorithms in parallel – Monte Carlo,
Random Trees, etc.
 Deliver results to BI or Visualization Tools and Production
Applications
 When to choose RMR:
 Need to Develop Analytics in R, on Big Data in Hadoop
 Stringent Latency Requirements
 Scarce R and Java Developers Need to Collaborate, Not Duplicate
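The "Big Math" case above (Monte Carlo run in parallel) can be sketched as a map step that runs independent simulations and a reduce step that combines their counts. This is a language-neutral illustration written in Python rather than R, and it runs in a single process; the function names (`map_task`, `reduce_counts`, `estimate_pi`) and the choice of estimating pi are assumptions made for the example, not part of RMR.

```python
import random
from functools import reduce
from typing import Tuple

Counts = Tuple[int, int]  # (hits inside the quarter circle, total draws)


def map_task(seed: int, n_draws: int) -> Counts:
    """One independent simulation task; each task gets its own seed."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits, n_draws


def reduce_counts(a: Counts, b: Counts) -> Counts:
    """Combine two partial results; associative, so order does not matter."""
    return a[0] + b[0], a[1] + b[1]


def estimate_pi(n_tasks: int, draws_per_task: int) -> float:
    """Run the map tasks (sequentially here) and reduce their counts."""
    partials = [map_task(seed, draws_per_task) for seed in range(n_tasks)]
    hits, total = reduce(reduce_counts, partials)
    return 4.0 * hits / total


print(estimate_pi(n_tasks=8, draws_per_task=20000))
```

Because each map task depends only on its seed, the tasks could be handed to separate cluster nodes without coordination; only the small count pairs need to travel to the reducer.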
R MapReduce:
Create Mappers and Reducers Using R
 How:
 Build R Code Using
Revolution R Enterprise
 Use Open Source Algorithms
From CRAN project.
 Leverage HDFS and
MapReduce Directly
 Deploy R Mappers &
Reducers in Hadoop
[Diagram: R code and R packages (CRAN) built in Revolution R Enterprise are deployed as R MapReduce (RMR) jobs inside Hadoop, alongside other MapReduce jobs, over HDFS, HBase and Hive; data warehouses and other data sources sit in the data layer.]
Mappers & Reducers:
100% R. 100% Hadoop.
 For Hadoop Users:
 Integrates R with Hadoop via
Hadoop Streaming
 Creates MapReduce Jobs
Compatible with JobTracker
 No Need to Recode Models
 No Latency to Move Data
 For R Programmers
 No need for Java Programming
 Serializes & Deserializes Data
Between HDFS and R
 Handles Standard HDFS Read &
Write Transparently
 Provides Explicit Access to
HDFS, HBase and Hive via
Packages
 Access to CRAN Algorithm
Library
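The Hadoop Streaming contract mentioned above can be sketched as follows: a mapper reads text lines and emits tab-separated key/value lines, the framework sorts them by key, and a reducer aggregates each run of identical keys. The sketch is in Python rather than R and simulates the shuffle with `sorted`; the input layout (`key,amount` per line) and the function names are assumptions made for illustration.

```python
from itertools import groupby
from typing import Iterable, Iterator, List


def mapper(lines: Iterable[str]) -> Iterator[str]:
    """Emit 'key<TAB>value' for each well-formed 'key,amount' input line."""
    for line in lines:
        fields = line.rstrip("\n").split(",")
        if len(fields) < 2:
            continue  # skip malformed records instead of failing the task
        yield f"{fields[0]}\t{fields[1]}"


def reducer(sorted_lines: Iterable[str]) -> Iterator[str]:
    """Sum values per key; input must already be sorted by key."""
    parsed = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for key, group in groupby(parsed, key=lambda kv: kv[0]):
        total = sum(float(value) for _, value in group)
        yield f"{key}\t{total}"


def run_locally(lines: Iterable[str]) -> List[str]:
    """Stand-in for the cluster: map, sort by key (the shuffle), then reduce."""
    return list(reducer(sorted(mapper(lines))))


print(run_locally(["a,1", "b,2", "a,3"]))
```

In a real streaming job the mapper and reducer would be separate scripts reading standard input and writing standard output; keeping them as plain functions over line iterables makes the same logic easy to test off-cluster.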
[Diagram: a mapper or reducer runs R code in Revolution R Enterprise under Hadoop Streaming; data is deserialized from and serialized to HDFS, with high-speed connectors to HBase and Hive and access to CRAN.]
Leveraging R with Hadoop
With R “Inside” Hadoop…
 In-Place ETL
 Data Transformation in R
 Enrichment and Correlation Using
Other Data In Hadoop
 Simulation/Experimentation
 Execute Complex Simulations on
Massively-Parallel Hadoop Clusters
 Scoring
 Run Scoring Models Directly in
Hadoop.
 No Movement Penalty
 How?
 Write Mappers & Reducers in R and
Deploy Using R MapReduce
 Augment Hadoop with CRAN¹
Packages
¹ Use of CRAN algorithms is limited to non-graphical, parallelizable algorithms.
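Scoring in place, as described above, amounts to a map-only job: a small fitted model travels to the data, and each record is scored independently. The sketch below is in Python rather than R; the logistic model, its coefficients, the field names and the `id,amount,foreign` record layout are all hypothetical values chosen for illustration.

```python
import math
from typing import Dict, Iterable, Iterator

# Hypothetical coefficients; in practice these would be exported from a model
# fitted elsewhere and shipped to the cluster alongside the job.
MODEL: Dict[str, float] = {"intercept": -1.0, "amount": 0.002, "foreign": 1.5}


def score(features: Dict[str, float], model: Dict[str, float] = MODEL) -> float:
    """Logistic score for one record: 1 / (1 + exp(-z))."""
    z = model["intercept"] + sum(
        model[name] * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


def scoring_mapper(lines: Iterable[str]) -> Iterator[str]:
    """Score each 'id,amount,foreign' record; no reducer is required."""
    for line in lines:
        record_id, amount, foreign = line.strip().split(",")
        probability = score({"amount": float(amount), "foreign": float(foreign)})
        yield f"{record_id}\t{probability:.4f}"


for scored in scoring_mapper(["t1,500,0", "t2,500,1"]):
    print(scored)
```

Only the model coefficients move across the network; the records stay where they are stored, which is the "no movement penalty" point made on the slide.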
Limitations of R MapReduce
 R Programmer Must “Think MapReduce” –
Dividing Work into Cascades of Map, Reduce,
Repeat.
 Algorithms Must be Designed for Parallelism
Including External Packages Used.
 Fits:
 Hadoop Literate Teams or Those With Good Support
 Non-Fits:
 Analytics Teams Tinkering with Hadoop on Short
Timeframes.
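To illustrate what "designed for parallelism" means in practice, the sketch below computes a mean and sample variance from per-chunk partial results that can be merged in any order. It is a generic pattern written in Python, not code from RMR; the names `map_chunk`, `merge` and `mean_and_variance` are assumptions for the example.

```python
from functools import reduce
from typing import Iterable, Tuple

Partial = Tuple[int, float, float]  # (count, mean, sum of squared deviations)


def map_chunk(values: Iterable[float]) -> Partial:
    """Summarize one chunk in a single pass (Welford's update)."""
    count, mean, m2 = 0, 0.0, 0.0
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    return count, mean, m2


def merge(a: Partial, b: Partial) -> Partial:
    """Combine two partial summaries without revisiting the raw data."""
    if a[0] == 0:
        return b
    if b[0] == 0:
        return a
    count = a[0] + b[0]
    delta = b[1] - a[1]
    mean = a[1] + delta * b[0] / count
    m2 = a[2] + b[2] + delta * delta * a[0] * b[0] / count
    return count, mean, m2


def mean_and_variance(chunks: Iterable[Iterable[float]]) -> Tuple[float, float]:
    """Map every chunk, reduce the partials, and return (mean, sample variance)."""
    count, mean, m2 = reduce(merge, (map_chunk(c) for c in chunks), (0, 0.0, 0.0))
    if count < 2:
        raise ValueError("need at least two values for a sample variance")
    return mean, m2 / (count - 1)


print(mean_and_variance([[1, 2, 3], [4, 5, 6, 7, 8]]))
```

A statistic fits MapReduce when it can be expressed this way: a small summary per chunk plus a merge rule. Algorithms that need the whole data set in memory at once must be reformulated before they can be spread across mappers.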
[Diagram: R MapReduce (RMR) inside Hadoop alongside other MapReduce jobs, over HDFS, HBase and Hive, serving MapReduce analytics applications; data warehouses and other data sources sit in the data layer.]
More Ways to Leverage R with Hadoop:
“Beside” Architectures
Inside Hadoop
 In-Place ETL
 Data Transformation in R
 Enrichment and Correlation Using
Other Data In Hadoop
 Simulation/Experimentation
 Execute Complex Simulations on
Massively-Parallel Hadoop Clusters
 Scoring
 Run Scoring Models Directly in
Hadoop.
 No Movement Penalty
 How?
 Write Mappers & Reducers in R and
Deploy Using R MapReduce
 Augment Hadoop with CRAN¹
Packages
“Beside” Architectures:
 Drivers:
 Large or Unpredictable R Workloads
 Modest Hadoop Cluster
 Shared Production Hadoop Cluster
 Hadoop Novice
 Large Numbers of R Users.
 Modest Data Sets To Be Scored
 Movement Penalty Isn’t Prohibitive
 Maximized Computational Scale
 Access to ScaleR Parallel External
Memory Algorithms (PEMAs)
 Advantages:
 Makes Hadoop Easier to Administer
 Stabilizes Hadoop Resource Availability
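The idea behind a parallel external memory algorithm can be sketched with a one-predictor least-squares fit: each chunk of data is reduced to a handful of running sums, the sums are added together, and the model is solved from the totals, so the full data set never has to sit in memory. This is a generic illustration in Python under that assumption; it does not reproduce ScaleR's implementation, and the function names are made up for the example.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]                      # (x, y)
Sums = Tuple[int, float, float, float, float]    # (n, Σx, Σy, Σx², Σxy)


def chunk_stats(chunk: Iterable[Point]) -> Sums:
    """Reduce one chunk of (x, y) pairs to its sufficient statistics."""
    n, sx, sy, sxx, sxy = 0, 0.0, 0.0, 0.0, 0.0
    for x, y in chunk:
        n += 1
        sx += x
        sy += y
        sxx += x * x
        sxy += x * y
    return n, sx, sy, sxx, sxy


def combine(a: Sums, b: Sums) -> Sums:
    """Add two sets of sums; chunks may be processed on different workers."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3], a[4] + b[4])


def fit_line(chunks: Iterable[Iterable[Point]]) -> Tuple[float, float]:
    """Fit y = slope * x + intercept from chunked data, one chunk at a time."""
    total: Sums = (0, 0.0, 0.0, 0.0, 0.0)
    for chunk in chunks:
        total = combine(total, chunk_stats(chunk))
    n, sx, sy, sxx, sxy = total
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("x values are constant or no data was supplied")
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return slope, intercept


print(fit_line([[(0, 1), (1, 3)], [(2, 5), (3, 7)]]))
```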
Two Additional “Beside” Architectures
 Alternatives:
 RRE “Beside” Hadoop
 RRE Both “Beside” and “Inside” Hadoop with RMR
 “Beside” Usage:
 Sample into “Beside” Server or Cluster
 Analyze and Model on R Server or Cluster
 Score Data on R Server or Cluster
 Results to Hadoop for Use.
 “Both” Usage - Same As Above Except:
 Move Model to Data on Hadoop
 Score Data In-Place on Hadoop
 Why multiple options?
 Greatest Flexibility
 Optimize Skill Sets
 Scale Clusters Independently
 Control Concurrency and Security
 Optimize Utilization
 Same R Code Can Run in Both
 Balance Ease of Use/Development and Resulting Performance & Scale
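The first "Beside" step, sampling data out of Hadoop onto a separate server, can be sketched with reservoir sampling, which draws a fixed-size uniform sample from a stream in one pass without knowing its length in advance. The deck does not say which sampling method is used, so this is only one plausible approach; it is written in Python, and `reservoir_sample` and its parameters are names chosen for the example.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, seed: int = 0) -> List[T]:
    """Return up to k items drawn uniformly from the stream (Algorithm R)."""
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)      # fill the reservoir first
        else:
            j = rng.randint(0, i)       # inclusive on both ends
            if j < k:
                reservoir[j] = item     # replace with probability k / (i + 1)
    return reservoir


sample = reservoir_sample(range(1_000_000), k=5)
print(sample)
```

The memory footprint depends only on `k`, so the same routine works whether the source holds thousands of rows or billions.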
RRE “Beside” Hadoop
 Separate Hadoop & R
Clusters
 Connectors to HDFS,
HBase & Hive
 Explore & Model Data
on R server(s)
 Return Scored Data to
HDFS/HBase/Hive
 When To Use:
 Small, Shared or
Production Hadoop
Cluster
 Need Parallelized
Algorithms
 Heavy Random
Workloads
 Extensive
“Sandboxing”
 Modest Data Scoring
 Data Security
Constraints.
 … while awaiting
YARN…
 Advantages:
 Concurrency By
Separation
 Security By Separation
 Independent
Scalability
 ScaleR Parallel
Algorithms
[Diagram: a Hadoop cluster (HDFS, HBase, Hive; MapReduce applications and other MapReduce jobs) is linked through ConnectR (HBase, HDFS, ODBC & high-speed connectors) to a separate analytics server or cluster (Linux, Windows, LSF or Azure) running Revolution R Enterprise with CRAN packages for data manipulation and analysis, serving analytics apps and BI & visualization.]
RRE “Beside” and “Inside”
 Both “Inside” and
“Beside” Platforms
 Connect a Compute
Cluster to Hadoop
to Run R
 Move Models to
Score Big Data on
Hadoop
 When To Use:
 Production Hadoop
Cluster
 Need Parallelized
Algorithms
 Heavy Random
Workloads
 Extensive
“Sandboxing”
 Large Data Scoring
 Data Security
Constraints.
 … while awaiting
YARN…
 Advantages:
 Concurrency &
Security
 Independent
Scalability
 Big Data Scoring
 Flexibility
 Low Latency
[Diagram: a Hadoop cluster (HDFS, HBase, Hive) runs R MapReduce (RMR) alongside MapReduce applications and other MapReduce jobs, and is linked through ConnectR (HBase, HDFS, ODBC & high-speed connectors) to an analytics server or cluster (Linux, Windows, LSF or Azure) running Revolution R Enterprise with CRAN packages, serving analytics apps and BI & visualization.]
Typical Predictive Analytics Workflow
Data Prep
• Ingest
• Format
• Enrich
• Filter
• Aggregate
• Profile
Explore
• Sample
• Cluster
• Visualize
• Correlate
• Sandboxing
Model
• Segment
• Categorize
• Select Features
• Simulate
• Predict
• Validate
Deploy
• Deploy
• Score
• Integrate
Improve
• Measure Accuracy
• Iterate
‘Beside’ and/or ‘Inside’:
Dominant Usage Patterns Observed
 Use Case 1: Real-Time Scoring
 Example – Fraud Prevention
 Use Case 2: Modeling and Scoring
 Example – Attribution Analysis
 Use Case 3: Production Analytics
 Example – Telematics-Assisted Underwriting
Example 1:
Card Fraud Detection
Data and systems shown:
• In-House Systems: Transaction History
• Weblog Data
• Personal Data: Creditworthiness, Banking
• Transaction Data
• Mortgage Data
• Demographic Data
• Authorization Systems
Steps shown:
• Ingest
• Filter & Transform
• Correlate & Rate
• Develop Risk Models
• Execute Models
• Filter & Score Transactions
• Deliver & Integrate
[Diagram: Hadoop (HDFS, HBase, MapReduce) runs R MapReduce (RMR) alongside other MapReduce jobs; ConnectR (HBase, HDFS, ODBC & high-speed connectors) links it to Revolution R Enterprise on an R workstation and to BI & visualization.]
Example 2:
Attribution Analysis “Beside” Hadoop
Data and systems shown:
• In-House Systems: EDW, CRM, Datamarts
• Weblog Data
• Marketing Service Provider Feeds: Acxiom, Experian, ExactTarget
• Monitored Responses: CoreMetrics, Dotomi, DoubleClick
• Call Center Data
Steps shown:
• Ingest
• Filter & Transform
• Sessionize
• Aggregate, Profile, & Enrich
• Load Analysis Environment
• Develop Attribution Models
• Score
• Deliver to Users
[Diagram: Hadoop (HDFS, HBase, MapReduce) runs Java MapReduce jobs; ConnectR (HBase, HDFS, ODBC & high-speed connectors) links it to Revolution R Enterprise on a Linux server or cluster, serving analytics apps and BI & visualization.]
Example 3:
Telematics-Enhanced Underwriting
Data and systems shown:
• Policy Origination Data
• Vehicle Sensor Data: Speed, Time, Acceleration, Location
• Creditworthiness Data
• Insured Data: Loss History, Payment History, Credit File, Demographics
Steps shown:
• Ingest
• Correlate Sources
• Filter, Aggregate & Profile
• Load Model Environment
• Develop Risk Models
• Export Models
• Score Large Datasets
• Deliver to Underwriting & Call Response Systems
[Diagram: Hadoop (HDFS, HBase, MapReduce) runs R MapReduce (RMR) alongside other MapReduce jobs; ConnectR (HBase, HDFS, ODBC & high-speed connectors) links it to Revolution R Enterprise on a Linux server or cluster, serving underwriting applications.]
Conclusion
 Big Data Is Hard.
 Hadoop is Key to Managing It.
 R is Key to Applying It.
 Revolution R on Hadoop Brings Data Science to
Big Data
 Hadoop Brings Parallel Performance to R
 R Brings a Community with Know-How to Hadoop
 Revolution Analytics Can Deliver Convergence
Today.
 … and the Future of R on Hadoop is Even Brighter…
Thank you.
www.revolutionanalytics.com | 650.646.9545 | Twitter: @RevolutionR
The leading commercial provider of software and support for the popular
open source R statistics language.

More Related Content

What's hot

Predictive Analytics with Hadoop
Predictive Analytics with HadoopPredictive Analytics with Hadoop
Predictive Analytics with HadoopDataWorks Summit
 
Model Building with RevoScaleR: Using R and Hadoop for Statistical Computation
Model Building with RevoScaleR: Using R and Hadoop for Statistical ComputationModel Building with RevoScaleR: Using R and Hadoop for Statistical Computation
Model Building with RevoScaleR: Using R and Hadoop for Statistical Computation
Revolution Analytics
 
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution AnalyticsRevolution Analytics
 
Big Data Predictive Analytics with Revolution R Enterprise (Gartner BI Summit...
Big Data Predictive Analytics with Revolution R Enterprise (Gartner BI Summit...Big Data Predictive Analytics with Revolution R Enterprise (Gartner BI Summit...
Big Data Predictive Analytics with Revolution R Enterprise (Gartner BI Summit...
Revolution Analytics
 
Accelerating R analytics with Spark and Microsoft R Server for Hadoop
Accelerating R analytics with Spark and  Microsoft R Server  for HadoopAccelerating R analytics with Spark and  Microsoft R Server  for Hadoop
Accelerating R analytics with Spark and Microsoft R Server for Hadoop
Willy Marroquin (WillyDevNET)
 
In-Database Analytics Deep Dive with Teradata and Revolution
In-Database Analytics Deep Dive with Teradata and RevolutionIn-Database Analytics Deep Dive with Teradata and Revolution
In-Database Analytics Deep Dive with Teradata and Revolution
Revolution Analytics
 
Basics of Digital Design and Verilog
Basics of Digital Design and VerilogBasics of Digital Design and Verilog
Basics of Digital Design and Verilog
Ganesan Narayanasamy
 
Introducing Revolution R Open: Enhanced, Open Source R distribution from Revo...
Introducing Revolution R Open: Enhanced, Open Source R distribution from Revo...Introducing Revolution R Open: Enhanced, Open Source R distribution from Revo...
Introducing Revolution R Open: Enhanced, Open Source R distribution from Revo...Revolution Analytics
 
Moving From SAS to R Webinar Presentation - 07Aug14
Moving From SAS to R Webinar Presentation - 07Aug14Moving From SAS to R Webinar Presentation - 07Aug14
Moving From SAS to R Webinar Presentation - 07Aug14
Revolution Analytics
 
Big data analytics on teradata with revolution r enterprise bill jacobs
Big data analytics on teradata with revolution r enterprise   bill jacobsBig data analytics on teradata with revolution r enterprise   bill jacobs
Big data analytics on teradata with revolution r enterprise bill jacobs
Bill Jacobs
 
Big Data - Analytics with R
Big Data - Analytics with RBig Data - Analytics with R
Big Data - Analytics with R
Techsparks
 
Batter Up! Advanced Sports Analytics with R and Storm
Batter Up! Advanced Sports Analytics with R and StormBatter Up! Advanced Sports Analytics with R and Storm
Batter Up! Advanced Sports Analytics with R and Storm
Revolution Analytics
 
Taking R Analytics to SQL and the Cloud
Taking R Analytics to SQL and the CloudTaking R Analytics to SQL and the Cloud
Taking R Analytics to SQL and the Cloud
Revolution Analytics
 
Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
Is Revolution R Enterprise Faster than SAS? Benchmarking Results RevealedIs Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
Is Revolution R Enterprise Faster than SAS? Benchmarking Results RevealedRevolution Analytics
 
Intro to R for SAS and SPSS User Webinar
Intro to R for SAS and SPSS User WebinarIntro to R for SAS and SPSS User Webinar
Intro to R for SAS and SPSS User Webinar
Revolution Analytics
 
R and Data Science
R and Data ScienceR and Data Science
R and Data Science
Revolution Analytics
 
Quick and Dirty: Scaling Out Predictive Models Using Revolution Analytics on ...
Quick and Dirty: Scaling Out Predictive Models Using Revolution Analytics on ...Quick and Dirty: Scaling Out Predictive Models Using Revolution Analytics on ...
Quick and Dirty: Scaling Out Predictive Models Using Revolution Analytics on ...
Revolution Analytics
 
Managing a Multi-Tenant Data Lake
Managing a Multi-Tenant Data LakeManaging a Multi-Tenant Data Lake
Managing a Multi-Tenant Data Lake
DataWorks Summit/Hadoop Summit
 

What's hot (20)

Revolution R - 100% R and More
Revolution R - 100% R and MoreRevolution R - 100% R and More
Revolution R - 100% R and More
 
Predictive Analytics with Hadoop
Predictive Analytics with HadoopPredictive Analytics with Hadoop
Predictive Analytics with Hadoop
 
Model Building with RevoScaleR: Using R and Hadoop for Statistical Computation
Model Building with RevoScaleR: Using R and Hadoop for Statistical ComputationModel Building with RevoScaleR: Using R and Hadoop for Statistical Computation
Model Building with RevoScaleR: Using R and Hadoop for Statistical Computation
 
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
 
Big Data Predictive Analytics with Revolution R Enterprise (Gartner BI Summit...
Big Data Predictive Analytics with Revolution R Enterprise (Gartner BI Summit...Big Data Predictive Analytics with Revolution R Enterprise (Gartner BI Summit...
Big Data Predictive Analytics with Revolution R Enterprise (Gartner BI Summit...
 
Accelerating R analytics with Spark and Microsoft R Server for Hadoop
Accelerating R analytics with Spark and  Microsoft R Server  for HadoopAccelerating R analytics with Spark and  Microsoft R Server  for Hadoop
Accelerating R analytics with Spark and Microsoft R Server for Hadoop
 
In-Database Analytics Deep Dive with Teradata and Revolution
In-Database Analytics Deep Dive with Teradata and RevolutionIn-Database Analytics Deep Dive with Teradata and Revolution
In-Database Analytics Deep Dive with Teradata and Revolution
 
Basics of Digital Design and Verilog
Basics of Digital Design and VerilogBasics of Digital Design and Verilog
Basics of Digital Design and Verilog
 
Introducing Revolution R Open: Enhanced, Open Source R distribution from Revo...
Introducing Revolution R Open: Enhanced, Open Source R distribution from Revo...Introducing Revolution R Open: Enhanced, Open Source R distribution from Revo...
Introducing Revolution R Open: Enhanced, Open Source R distribution from Revo...
 
Moving From SAS to R Webinar Presentation - 07Aug14
Moving From SAS to R Webinar Presentation - 07Aug14Moving From SAS to R Webinar Presentation - 07Aug14
Moving From SAS to R Webinar Presentation - 07Aug14
 
Big data analytics on teradata with revolution r enterprise bill jacobs
Big data analytics on teradata with revolution r enterprise   bill jacobsBig data analytics on teradata with revolution r enterprise   bill jacobs
Big data analytics on teradata with revolution r enterprise bill jacobs
 
Big Data - Analytics with R
Big Data - Analytics with RBig Data - Analytics with R
Big Data - Analytics with R
 
Batter Up! Advanced Sports Analytics with R and Storm
Batter Up! Advanced Sports Analytics with R and StormBatter Up! Advanced Sports Analytics with R and Storm
Batter Up! Advanced Sports Analytics with R and Storm
 
Taking R Analytics to SQL and the Cloud
Taking R Analytics to SQL and the CloudTaking R Analytics to SQL and the Cloud
Taking R Analytics to SQL and the Cloud
 
Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
Is Revolution R Enterprise Faster than SAS? Benchmarking Results RevealedIs Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
Is Revolution R Enterprise Faster than SAS? Benchmarking Results Revealed
 
Intro to R for SAS and SPSS User Webinar
Intro to R for SAS and SPSS User WebinarIntro to R for SAS and SPSS User Webinar
Intro to R for SAS and SPSS User Webinar
 
R and Data Science
R and Data ScienceR and Data Science
R and Data Science
 
Quick and Dirty: Scaling Out Predictive Models Using Revolution Analytics on ...
Quick and Dirty: Scaling Out Predictive Models Using Revolution Analytics on ...Quick and Dirty: Scaling Out Predictive Models Using Revolution Analytics on ...
Quick and Dirty: Scaling Out Predictive Models Using Revolution Analytics on ...
 
Managing a Multi-Tenant Data Lake
Managing a Multi-Tenant Data LakeManaging a Multi-Tenant Data Lake
Managing a Multi-Tenant Data Lake
 
Big data business case
Big data   business caseBig data   business case
Big data business case
 

Similar to R and Big Data using Revolution R Enterprise with Hadoop

Revolution Analytics Podcast
Revolution Analytics PodcastRevolution Analytics Podcast
Revolution Analytics Podcast
inside-BigData.com
 
Big Data Analytics with R
Big Data Analytics with RBig Data Analytics with R
Big Data Analytics with R
Great Wide Open
 
18Mar14 Find the Hidden Signal in Market Data Noise Webinar
18Mar14 Find the Hidden Signal in Market Data Noise Webinar 18Mar14 Find the Hidden Signal in Market Data Noise Webinar
18Mar14 Find the Hidden Signal in Market Data Noise Webinar Revolution Analytics
 
Applications in R - Success and Lessons Learned from the Marketplace
Applications in R - Success and Lessons Learned from the MarketplaceApplications in R - Success and Lessons Learned from the Marketplace
Applications in R - Success and Lessons Learned from the Marketplace
Revolution Analytics
 
05Nov13 Webinar: Introducing Revolution R Enterprise 7 - The Big Data Big Ana...
05Nov13 Webinar: Introducing Revolution R Enterprise 7 - The Big Data Big Ana...05Nov13 Webinar: Introducing Revolution R Enterprise 7 - The Big Data Big Ana...
05Nov13 Webinar: Introducing Revolution R Enterprise 7 - The Big Data Big Ana...Revolution Analytics
 
Big Data in Action – Real-World Solution Showcase
 Big Data in Action – Real-World Solution Showcase Big Data in Action – Real-World Solution Showcase
Big Data in Action – Real-World Solution Showcase
Inside Analysis
 
Future of Enterprise PaaS (Cloud Foundry Summit 2014)
 Future of Enterprise PaaS (Cloud Foundry Summit 2014) Future of Enterprise PaaS (Cloud Foundry Summit 2014)
Future of Enterprise PaaS (Cloud Foundry Summit 2014)
VMware Tanzu
 
BIG Data & Hadoop Applications in Social Media
BIG Data & Hadoop Applications in Social MediaBIG Data & Hadoop Applications in Social Media
BIG Data & Hadoop Applications in Social Media
Skillspeed
 
Game Changed – How Hadoop is Reinventing Enterprise Thinking
Game Changed – How Hadoop is Reinventing Enterprise ThinkingGame Changed – How Hadoop is Reinventing Enterprise Thinking
Game Changed – How Hadoop is Reinventing Enterprise Thinking
Inside Analysis
 
Risk Analysis in the Financial Services Industry
Risk Analysis in the Financial Services IndustryRisk Analysis in the Financial Services Industry
Risk Analysis in the Financial Services Industry
Revolution Analytics
 
3 Benefits of Multi-Temperature Data Management for Data Analytics
3 Benefits of Multi-Temperature Data Management for Data Analytics3 Benefits of Multi-Temperature Data Management for Data Analytics
3 Benefits of Multi-Temperature Data Management for Data Analytics
MapR Technologies
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
Hortonworks
 
Future of Enterprise PaaS
Future of Enterprise PaaSFuture of Enterprise PaaS
Future of Enterprise PaaS
SAP Technology
 
Getting started with Hadoop on the Cloud with Bluemix
Getting started with Hadoop on the Cloud with BluemixGetting started with Hadoop on the Cloud with Bluemix
Getting started with Hadoop on the Cloud with Bluemix
Nicolas Morales
 
R for SAS Users Complement or Replace Two Strategies
R for SAS Users Complement or Replace Two StrategiesR for SAS Users Complement or Replace Two Strategies
R for SAS Users Complement or Replace Two Strategies
Revolution Analytics
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
Sreedhar Chowdam
 
Revolution Analytics - Presentation at Hortonworks Booth - Strata 2014
Revolution Analytics - Presentation at Hortonworks Booth - Strata 2014Revolution Analytics - Presentation at Hortonworks Booth - Strata 2014
Revolution Analytics - Presentation at Hortonworks Booth - Strata 2014
Hortonworks
 
Hadoop as an Analytic Platform: Why Not?
Hadoop as an Analytic Platform: Why Not?Hadoop as an Analytic Platform: Why Not?
Hadoop as an Analytic Platform: Why Not?
Inside Analysis
 
WCIT 2014 Rohit Tandon - Big Data to Drive Business Results: HP HAVEn
WCIT 2014 Rohit Tandon - Big Data to Drive Business Results: HP HAVEnWCIT 2014 Rohit Tandon - Big Data to Drive Business Results: HP HAVEn
WCIT 2014 Rohit Tandon - Big Data to Drive Business Results: HP HAVEn
WCIT 2014
 
Kristof Coussement - The Debate: the Future of (Big) Data Analytics Software
Kristof Coussement - The Debate: the Future of (Big) Data Analytics SoftwareKristof Coussement - The Debate: the Future of (Big) Data Analytics Software
Kristof Coussement - The Debate: the Future of (Big) Data Analytics Software
BAQMaR
 

R and Big Data using Revolution R Enterprise with Hadoop

  • 1. Revolution Confidential Revolution Analytics Bringing the Analytical Power of R to the Hadoop Platform Simon Field Technical Director, Revolution Analytics June 14, 2013
  • 2. Vigorous Growth of Big Data… The global Big Data Market revenue is expected to grow from $1.56 billion in 2012 to $13.95 billion in 2017, at an estimated CAGR of 54.9% from 2012 to 2017. – Marketsandmarkets.com study, 14 April 2013. “…the market for Big Data technology will reach $16.9 billion by 2015, up from $3.2 billion in 2010. That is a 40 percent-a-year growth rate – about seven times the estimated growth rate for the overall information technology and communications business.” – IDC study, March 2012
  • 3. Big Data = Opportunity + Disruption. Huge New Data Assets • Internet – Commerce, Communications, Collaboration • Social Media – Personal, Presence, New Social Networks • Ubiquitous Telemetry – Machines Everywhere. Rapidly-Evolving Platforms • “Data Lake” vs. “Warehouse” vs. “Big Data App. Platforms” • Vast Choices Among Open Source Platforms • Eliminate Time-Consuming Data Movements. Emerging Business Opportunities • Data Science Unlocks New Insight • Big Data Drives Better Decision-Making • Platforms Evolve Rationally Toward Big Data Vision
  • 4. Hadoop Analytics Platforms: Disruption, Challenge, Growth & Opportunity At Once • Java Skill Requirements • Hadoop’s Innovation Pace • Analytical • Write Once, Deploy Anywhere Growth: Skill Development • EDW Saturation • Limited Analytical Capabilities • Data Science Skill Shortage • MapReduce Paradigm Disruption: Evolving Ecosystems • Designed for Massive Scale • Commodity Foundations • Built for Data Variety • Open Source Innovation Pace Challenge: Big Data Readiness • Descriptive -> Predictive • Short Analytical Cycle Time • Ubiquitous Analytical Decisions • Low-Latency Analytics Opportunity: New, More Capable Analytic Foundation
  • 5. What We Need: Convergence • Data Science: With business solutions that fuse statistics, mathematics and software into meaningful applications. • Software Engineering: With tools and frameworks to create agile, scalable analytics-based applications. • IT Operations Management: Deployment platforms that are integrated, cost-effective, secure and ubiquitous.
  • 6. What is the R Statistics Language? • The R Language: Straightforward Procedural Language for Stats, Math and Data Science; Open Source • The R Community: 2M Users with the skill to tackle big data mathematical / statistical and ML needs; Began on workstation / modest SMP servers • The R Ecosystem: 4500+ Freely Available Algorithms in CRAN; Applicable to Big Data if scaled
  • 7. Why R and Hadoop? • Hadoop Dominates Big Data Storage and Computational Platforms. • R Dominates Data Science, Providing a Language, Users and Thousands of Pre-Built Algorithms. • Bringing Them Together is Our Goal Today.
  • 8. Mission: Revolution R Enterprise is the only commercial big data analytics platform based on the open source R statistical computing language • Enterprise-ready • Multi-platform • Scalable from desktop to big data • Delivers high performance analytics • Easier to build and deploy analytic applications. Company Confidential – Do not distribute
  • 9. Who we are: Leading provider of commercial analytics platform based on open source R statistical computing language • Customers: 200+ Global 2000 • Global Presence: North America / EMEA / APAC • Global Industries Served: Financial Services, Digital Media, Government, Health & Life Sciences, High Tech, Manufacturing, Retail, Telco • Our Software Delivers: Power (distributed, scalable high performance advanced analytics); Productivity (easier to build and deploy analytic applications); Enterprise Readiness (multi-platform) • Our Services Deliver: Knowledge (our experts enable you to be experts); Time-to-Value (our Quickstart projects give you a jumpstart); Guidance (our customer support team is here to help you) • Our Philosophy: Customer-centric innovation; Easy to do business with • Our Investors: Intel Capital, North Bridge, Presidio Ventures
  • 10. Big Data Speed and Scale with Revolution R Enterprise • Fast Math Libraries • Parallelized Algorithms • In-Database Execution • Multi-Threaded Execution • Multi-Core Execution • In-Hadoop Execution • Memory Management • Parallelized User Code
  • 11. Revolution R Enterprise Propels Enterprises into the Future. [Diagram: layered stack. Decision: Analytic Applications. Integration: Middleware. Analytics: Revolution R Enterprise High Performance Analytics Platform. Data: Hadoop, Data Warehouse, Other Data Sources.]
  • 12. 200+ Corporate Customers and Growing: Digital Media & Retail • Finance & Insurance • Healthcare & Life Sciences • Manufacturing & High Tech • Academic & Gov’t
  • 13. Revolution R Enterprise and R MapReduce: Bringing the R Language to the Hadoop Environment.
  • 14. R MapReduce: Fast, Agile Analytics for Hadoop Today • R MapReduce Enables R-Based Analytics In Hadoop: Use R to Explore and Visualize Data to Develop Insights; Build Models Using Widely-Available Techniques; Score Data Directly in Hadoop Using R Models; Run R as Mappers and Reducers in Hadoop • Advantages: No Data Movement Needed; Connects R to HDFS, HBase and Hive; Run Standard MapReduce Jobs; R Programmers Need Not Learn Java; Need Not Rewrite R into Java, Pig or SQL to Score Data; Accelerates Projects Leveraging Libraries By Bringing 4500+ Open Source R Algorithms in CRAN¹ to Hadoop. [Diagram: Hadoop with HDFS, HBase and Hive; R MapReduce (RMR) and Other MapReduce Jobs in the MapReduce layer; Data Warehouse and Other Data Sources; Applications.] ¹ CRAN: Comprehensive R Archive Network – an open source collection of 4500+ R-based statistics, analytics, graphics and data manipulation algorithms for R users.
  • 15. R MapReduce: Build MapReduce Jobs Entirely In R. Your Creativity + Your Code + 4500+ R Packages in CRAN = Rich, Powerful Data Analytics That Runs in MapReduce. [Diagram: Revolution R Enterprise, CRAN Packages and R MapReduce (RMR); MAP and REDUCE tasks in Hadoop; HBase, Hive, HDFS.]
  • 16. Why Build MapReduce Jobs using R? • What can you do with it? Transform, Aggregate, Regress, Cluster, Filter, Simulate, Model, Score …; Run R Programs While Leveraging Hadoop’s Scalability; Big I/O: Score data files containing billions of rows; Big Math: Run compute-intensive algorithms in parallel – Monte Carlo, Random Trees, etc.; Deliver results to BI or Visualization Tools and Production Applications • When to choose RMR: Need to Develop Analytics in R, on Big Data in Hadoop; Stringent Latency Requirements; Scarce R and Java Developers Need to Collaborate, Not Duplicate
  • 17. R MapReduce: Create Mappers and Reducers Using R • How: Build R Code Using Revolution R Enterprise; Use Open Source Algorithms From the CRAN Project; Leverage HDFS and MapReduce Directly; Deploy R Mappers & Reducers in Hadoop. [Diagram: Revolution R Enterprise (R Code, R Packages, CRAN Packages); R MapReduce (RMR) and Other MapReduce Jobs in Hadoop’s MapReduce layer; HDFS, HBase, Hive; Data Warehouse and Other Data Sources; Applications.]
  • 18. Mappers & Reducers: 100% R. 100% Hadoop. • For Hadoop Users: Integrates R with Hadoop via Hadoop Streaming; Creates MapReduce Jobs Compatible with JobTracker; No Need to Recode Models; No Latency to Move Data • For R Programmers: No Need for Java Programming; Serializes & Deserializes Data Between HDFS and R; Handles Standard HDFS Read & Write Transparently; Provides Explicit Access to HDFS, HBase and Hive via Packages; Access to CRAN Algorithm Library. [Diagram: Mapper or Reducer running R Code in Revolution R Enterprise via Hadoop Streaming; Data Deserialization and Data Serialization to and from HDFS; High-Speed Connectors to HBase and Hive; CRAN.]
  • 19. Leveraging R with Hadoop: With R “Inside” Hadoop… • In-Place ETL: Data Transformation in R; Enrichment and Correlation Using Other Data In Hadoop • Simulation/Experimentation: Execute Complex Simulations on Massively-Parallel Hadoop Clusters • Scoring: Run Scoring Models Directly in Hadoop; No Movement Penalty • How? Write Mappers & Reducers in R and Deploy Using RMapReduce; Augment Hadoop with CRAN¹ Packages. ¹ Use of CRAN algorithms limited to non-graphical, parallelizable algorithms.
  • 20. Limitations of R MapReduce • R Programmer Must “Think MapReduce” – Dividing Work into Cascades of Map, Reduce, Repeat. • Algorithms Must be Designed for Parallelism, Including External Packages Used. • Fits: Hadoop-Literate Teams or Those With Good Support • Non-Fits: Analytics Teams Tinkering with Hadoop on Short Timeframes. [Diagram: R MapReduce (RMR) and Other MapReduce Jobs in Hadoop’s MapReduce layer; HDFS, HBase, Hive; Data Warehouse and Other Data Sources; Applications.]
  • 21. More Ways to Leverage R with Hadoop: “Beside” Architectures. Inside Hadoop • In-Place ETL: Data Transformation in R; Enrichment and Correlation Using Other Data In Hadoop • Simulation/Experimentation: Execute Complex Simulations on Massively-Parallel Hadoop Clusters • Scoring: Run Scoring Models Directly in Hadoop; No Movement Penalty • How? Write Mappers & Reducers in R and Deploy Using RMapReduce; Augment Hadoop with CRAN Packages. “Beside” Architectures • Drivers: Large or Unpredictable R Workloads; Modest Hadoop Cluster; Shared Production Hadoop Cluster; Hadoop Novice; Large Numbers of R Users; Modest Data Sets To Be Scored; Movement Penalty Isn’t Prohibitive; Maximized Computational Scale; Access to ScaleR Parallel External Memory Algorithms (PEMAs) • Advantages: Makes Hadoop Easier to Administer; Stabilizes Hadoop Resource Availability
  • 22. Two Additional “Beside” Architectures
      • Alternatives: RRE “Beside” Hadoop; RRE Both “Beside” and “Inside” Hadoop with RMR
      • “Beside” Usage: Sample into “Beside” Server or Cluster; Analyze and Model on R Server or Cluster; Score Data on R Server or Cluster; Return Results to Hadoop for Use
      • “Both” Usage, Same as Above Except: Move Model to Data on Hadoop; Score Data In-Place on Hadoop
      • Why Multiple Options? Greatest Flexibility; Optimize Skill Sets; Scale Clusters Independently; Control Concurrency and Security; Optimize Utilization; Same R Code Can Run in Both; Balance Ease of Use/Development and Resulting Performance & Scale
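The difference between the “Beside” and “Both” usage patterns can be illustrated with a toy example: a model is fitted on a small sample pulled to a separate server, then the small fitted model (not the data) is shipped to each data partition for scoring. This is only a sketch under stated assumptions, written in plain Python rather than R; the least-squares helper, the partition layout, and the synthetic data are all hypothetical and do not represent RRE, ScaleR, or RMR functionality.

```python
import random

def fit_line(sample):
    """Ordinary least squares for y = a + b*x on a small in-memory sample."""
    n = len(sample)
    mean_x = sum(x for x, _ in sample) / n
    mean_y = sum(y for _, y in sample) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in sample)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in sample)
    slope = sxy / sxx
    return {"intercept": mean_y - slope * mean_x, "slope": slope}

def score_partition(model, partition):
    """Map-style scoring: needs only the small model, not the rest of the data."""
    return [model["intercept"] + model["slope"] * x for x in partition]

# Hypothetical "big" data, already split into partitions that stay on the cluster.
partitions = [[0.0, 1.0, 2.0], [3.0, 4.0], [5.0, 6.0, 7.0]]

# Hypothetical labeled history that follows y = 2 + 3x exactly, for clarity.
history = [(float(x), 2.0 + 3.0 * x) for x in range(100)]

# "Beside": pull a sample to the analytics server and fit the model there.
random.seed(0)
model = fit_line(random.sample(history, 20))

# "Both": ship the fitted model to each partition and score in place.
scores = [score_partition(model, p) for p in partitions]
print(scores)
```

Because `score_partition` depends only on the model and its own partition, each partition can be scored independently, which is what makes moving the model to the data cheaper than moving large data sets to the model.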
  • 23. RRE “Beside” Hadoop
      • Separate Hadoop & R Clusters
      • Connectors to HDFS, HBase & Hive
      • Explore & Model Data on R Server(s)
      • Return Scored Data to HDFS/HBase/Hive
      • When to Use: Small, Shared or Production Hadoop Cluster; Need Parallelized Algorithms; Heavy Random Workloads; Extensive “Sandboxing”; Modest Data Scoring; Data Security Constraints; … while awaiting YARN…
      • Advantages: Concurrency by Separation; Security by Separation; Independent Scalability; ScaleR Parallel Algorithms
      • [Diagram labels: Data Warehouse; Other Data Sources; Hadoop Cluster: MapReduce, Other MapReduce Jobs, HDFS, HBase, Hive; ConnectR: HBase, HDFS, ODBC & High-Speed Connectors; Analytics Server or Cluster (Linux, Windows, LSF or Azure): Revolution R Enterprise (RRE), CRAN Packages; Data Manipulation and Analysis; Analytics Apps.; BI & Visualization]
  • 24. RRE “Beside” and “Inside”
      • Both “Inside” and “Beside” Platforms
      • Connect a Compute Cluster to Hadoop to Run R
      • Move Models to Score Big Data on Hadoop
      • When to Use: Production Hadoop Cluster; Need Parallelized Algorithms; Heavy Random Workloads; Extensive “Sandboxing”; Large Data Scoring; Data Security Constraints; … while awaiting YARN…
      • Advantages: Concurrency & Security; Independent Scalability; Big Data Scoring; Flexibility; Low Latency
      • [Diagram labels: Data Warehouse; Other Data Sources; Hadoop Cluster: MapReduce, R MapReduce (RMR), Other MapReduce Jobs, HDFS, HBase, Hive; ConnectR: HBase, HDFS, ODBC & High-Speed Connectors; Analytics Server or Cluster (Linux, Windows, LSF or Azure): Revolution R Enterprise (RRE), CRAN Packages; Analytics Apps.; BI & Visualization]
  • 25. Typical Predictive Analytics Workflow
      • Data Prep: Ingest; Format; Enrich; Filter; Aggregate; Profile
      • Explore: Sample; Cluster; Visualize; Correlate; Sandboxing
      • Model: Segment; Categorize; Select Features; Simulate; Predict; Validate
      • Deploy: Deploy; Score; Integrate
      • Improve: Measure Accuracy; Iterate
  • 26. “Beside” and/or “Inside”: Dominant Usage Patterns Observed
      • Use Case 1: Real-Time Scoring. Example: Fraud Prevention
      • Use Case 2: Modeling and Scoring. Example: Attribution Analysis
      • Use Case 3: Production Analytics. Example: Telematics-Assisted Underwriting
  • 27. Example 1: Card Fraud Detection
      • [Diagram, data sources: In-House Systems: Transaction History; Weblog Data; Personal Data: Creditworthiness, Banking; Mortgage Data; Demographic Data; Authorization Systems]
      • [Diagram, steps shown: Ingest; Filter & Transform; Correlate & Rate Transaction Data; Develop Risk Models; Execute Models; Filter & Score Transactions; Deliver & Integrate]
      • [Diagram, components: Hadoop: MapReduce, R MapReduce (RMR), Other MapReduce Jobs, HDFS, HBase; ConnectR: HBase, HDFS, ODBC & High-Speed Connectors; Revolution R Enterprise on R Workstation; BI & Visualization]
  • 28. Example 2: Attribution Analysis “Beside” Hadoop
      • [Diagram, data sources: In-House Systems: EDW, CRM, Datamarts; Weblog Data; Marketing Service Provider Feeds: Acxiom, Experian, ExactTarget; Monitored Responses: CoreMetrics, Dotomi, DoubleClick; Call Center Data]
      • [Diagram, steps shown: Ingest; Sessionize; Filter & Transform; Aggregate, Profile, & Enrich; Load Analysis Environment; Develop Attribution Models; Score; Deliver to Users]
      • [Diagram, components: Hadoop: MapReduce, Java MapReduce Jobs, HDFS, HBase; ConnectR: HBase, HDFS, ODBC & High-Speed Connectors; Revolution R Enterprise on Linux Server or Cluster Server; Analytics Apps.; BI & Visualization]
  • 29. Example 3: Telematics-Enhanced Underwriting
      • [Diagram, data sources: Policy Origination Data; Vehicle Sensor Data: Speed, Time, Acceleration, Location; Creditworthiness Data; Insured Data: Loss History, Payment History, Credit File, Demographics]
      • [Diagram, steps shown: Ingest; Correlate Sources; Filter, Aggregate & Profile; Load Model Environment; Develop Risk Models; Export Models; Score Large Datasets; Deliver to Underwriting & Call Response Systems]
      • [Diagram, components: Hadoop: MapReduce, R MapReduce (RMR), Other MapReduce Jobs, HDFS, HBase; ConnectR: HBase, HDFS, ODBC & High-Speed Connectors; Revolution R Enterprise on Linux Server or Cluster Server; Underwriting Applications]
  • 30. Conclusion
      • Big Data Is Hard.
      • Hadoop Is Key to Managing It.
      • R Is Key to Applying It.
      • Revolution R on Hadoop Brings Data Science to Big Data
      • Hadoop Brings Parallel Performance to R
      • R Brings a Community with Know-How to Hadoop
      • Revolution Analytics Can Deliver Convergence Today.
      • … and the Future of R on Hadoop Is Even Brighter…
  • 32. Thank you.
      • www.revolutionanalytics.com
      • 650.646.9545
      • Twitter: @RevolutionR
      • The leading commercial provider of software and support for the popular open source R statistics language.