Microsoft has been a leader in the enterprise analytics space for years. By 2014, Microsoft had already added R language functionality to Azure Machine Learning. On April 6, 2015, Microsoft closed on a deal to acquire Revolution Analytics, a company focused on scalable processing solutions built on the well-known R language. Many data science projects and initial demos do not need high-volume solutions; however, having a high-volume answer for the R language allows for planning or working toward the largest data science solutions.
This presentation describes the value added by the Revolution Analytics acquisition. The talk covers 1) an overview of current data science technologies from Microsoft; 2) a description of the R language; 3) a brief review of the added value of R with Azure Machine Learning; and 4) a description of the performance architecture and a demo of the language constructs developed by Revolution Analytics. Most of the presentation focuses on sections 2 and 4. It is anticipated that these technologies will be partially, if not fully, integrated into SQL Server 2016.
2. Mark Tab
Consulting
Training
Teaching
Presenting
SQL Server MVP
LinkedIn
@MarkTabNet
3. Outline
1) An overview of current data science technologies from Microsoft
2) A description of the R language
3) A brief review of the added value of R with Azure Machine Learning
4) A description of the performance architecture and a demo of the language constructs developed by Revolution Analytics
4. Current Data Science Technologies
SQL Server Analysis Services Data Mining
• SQL Server License (Win OS)
• Business Intelligence or Enterprise
Excel Data Mining Add-In
• Excel 2007 or Higher
• X64 better
Microsoft Azure Machine Learning
• Free or Paid Tiers
• Any OS
F#
• Open Source
• Mono-Project, Visual Studio
Revolution Analytics
• SQL Server 2016
5. Data Scientist
Interact directly with data
Built-in to SQL Server
Data Developer/DBA
Manage data and analytics together
Built-in advanced analytics
In-database analytics
Example Solutions
• Fraud detection
• Sales forecasting
• Warehouse efficiency
• Predictive maintenance
Relational Data
Analytic Library
T-SQL Interface
Extensibility
R Integration
Microsoft Azure Machine Learning Marketplace
New R scripts
9. Growth and Demand for R
R is the highest paid IT skill (Dice.com, Jan 2014)
R is the most-used data science language after SQL (O’Reilly, Jan 2014)
R is used by 70% of data miners (Rexer, Sep 2013)
R is #15 of all programming languages (RedMonk, Jan 2014)
R is growing faster than any other data science language (KDnuggets, Aug 2013)
More than 2 million users worldwide
R Usage Growth (Rexer Data Miner Survey, 2007-2013)
70% of data miners report using R
R is the first choice of more data miners than any other software
Source: www.rexeranalytics.com
18. 2010: Head to Head with SAS
Logistic Regression:
                 SAS*            Revolution R Enterprise
Rows of data     1 billion       1 billion
Parameters       “just a few”    7
Time             80 seconds      44 seconds
Data location    In memory       On disk
Nodes            32              5
Cores            384             20
RAM              1,536 GB        80 GB
Callouts on the slide: Double, 45%, 1/6th, 5%, 5%
Revolution R is faster on the same amount of data, despite using approximately a 20th as many cores, a 20th as much RAM, a 6th as many nodes, and not pre-loading data into RAM.
Bottom Line: Revolution R Enterprise Performance = Greatly Reduced TCO
*As published by SAS in HPC Wire, April 21, 2011
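The "a 20th" and "a 6th" claims on this slide can be checked with a little arithmetic. The sketch below is illustrative only: it assumes the first column of figures is the SAS run and the second is Revolution R Enterprise (the reading implied by the slide's summary sentence), and the dictionary keys and the `ratios` helper are hypothetical names, not anything from the deck.

```python
# Figures copied from the comparison slide.
# Assumption: first value on each row is SAS, second is Revolution R Enterprise.
sas = {"time_s": 80, "nodes": 32, "cores": 384, "ram_gb": 1536}
rre = {"time_s": 44, "nodes": 5, "cores": 20, "ram_gb": 80}


def ratios(baseline, other):
    """Return other/baseline for every metric in the baseline."""
    return {key: other[key] / baseline[key] for key in baseline}


r = ratios(sas, rre)
for name, value in r.items():
    # Show each ratio as a fraction and as "1/n" for easy comparison with the slide.
    print(f"{name}: {value:.3f} (about 1/{1 / value:.1f})")
```

Under that assumption, the ratios come out to 0.55 for time, about 0.16 (roughly 1/6) for nodes, and about 0.05 (roughly 1/19) for both cores and RAM, which is in line with the rounded fractions quoted on the slide.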
21. 2014: Technical Support for Open Source R
AdviseR™ from Revolution Analytics
Technical support for open source R, from the R experts.
10x5 email and phone support
Support for R, validated packages, and third-party software connections
On-line case management and knowledgebase
Access to technical resources, documentation and user forums
Exclusive on-line webinars from community experts
Guaranteed response times
Also available: expert hands-on and on-line training for R, from Revolution Analytics AcademyR.
http://www.revolutionanalytics.com/adviser
http://revolutionanalytics.com/academyr-training-education