Microsoft Azure ♥ R
Data Science with Microsoft Azure and R
Dmitry Petukhov,
Microsoft Data Platform MVP, C# MCP,
Big Data Enthusiast && Coffee Addicted
Microsoft Azure + R. Prototype to Product Challenge
Prototyping
Flexibility
Distributed
Scalable
Fault-tolerance
Reliable
Production
Flexibility
Distributed
Scalable
Fault-tolerance
Reliable
+ Big Data Ready
+ LSML
Black Magic!
Migration
Microsoft Azure + R. Hello R!
Python is a COOL language!
But R…
Specialized in statistical analyze
Time-effective => ideal for…
…prototyping, competition, researching, and for fun!
Standalone computing => not bad scalable 
Open source
Big bearded community
Storage
Resource
Management
ML Framework
Execution
Engine
Local OS
Local Disc
PythonRuntime
YetAnother
Runtime
scikit
learn
HDFS
YARN
MapReduce
Mahout
HDFS / S3
YARN /
Apache Mesos
Spark
MLlib
HDFS / S3
YARN /
Apache Mesos
Python / R
on Spark
Python/R
tools
Spark
Local PC Hybrid Model Cluster (on-premises/on-demand)
some
library
Machine Learning in Finance. Infrastructure for Data Scientist
Low HighCost of deployment/ownership
Distributed
FS
Dark
Magic…
ML as a Service
Python/R
tools
Microsoft Azure + R. Infrastructures for Data Scientists
Microsoft Azure + R. Microsoft ♥ R
R Server for Azure HDInsight
Data Science VM
Azure Machine Learning
Support R-scripts execution
Allow authoring custom R modules
Jupyter Notebooks with R kernel support
Azure HDInsight
Hadoop/Spark-cluster as a Service
SQL Server R Services
Power BI
Running R Scripts & excellent visualization
R Tools for Visual Studio
Microsoft
Azure
References
Reference: http://www.r-bloggers.com/using-microsoft-r-server-to-address-scalability-issues-in-r/
Microsoft Azure + R. Microsoft R Server Platform
References
Reference: http://www.r-bloggers.com/using-microsoft-r-server-to-address-scalability-issues-in-r/
Microsoft Azure + R. DistributedR: write once, deploy anywhere
R Server for Azure HDInsight
Killer features list:
100% open source R implementation;
workload running inside HDInsight (Hadoop/Spark).
Microsoft Azure + R. R Server for Azure HDInsight
R, Python, SQL, C#
Microsoft Azure + R. Data Science VM
Microsoft R Server Developer Edition,
Anaconda Python distribution,
Jupyter notebooks for Python and R,
Visual Studio Community Edition with Python and R Tools,
Power BI desktop,
SQL Server Express edition
ML libs: CNTK, xgboost and Vowpal Wabbit
Azure SDK
Data Science VM inside:
R Tools in Azure Machine Learning:
Support R-scripts execution;
Allow authoring custom R modules;
Jupyter Notebooks with R kernel support.
Microsoft Azure + R. Azure Machine Learning
Microsoft Azure + R. Azure Machine Learning
Jupyter
Notebook
Azure ML
Studio
GitHub/
TFS in Azure
h(θ0, θn)
Commands flow
Data flow
Request/response flow
References
Cortana Intelligence and Machine Learning Blog
R for Azure Machine Learning. Quickstart
Machine Learning Algorithm Cheat Sheet
Machine Learning Hackathon. How to win?
Azure ML Repositories on GitHub
Microsoft Azure for all group on Facebook
Soon in Slack (invite form)
Microsoft Azure + R. References
© 2016 Dmitry Petukhov All rights reserved. Microsoft and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
Data Science must win!
Q&A
Now or later (send on d.petukhov@outlook.com)
Ping me
Habr: @codezombie
LinkedIn: @dpetukhov
Facebook: @code.zombi
Read my tech code instinct blog ( http://0xCode.in/ )
Microsoft Azure + R. Stay in Touch!

Microsoft Azure + R

  • 1.
    Microsoft Azure ♥R Data Science with Microsoft Azure and R Dmitry Petukhov, Microsoft Data Platform MVP, C# MCP, Big Data Enthusiast && Coffee Addicted
  • 2.
    Microsoft Azure +R. Prototype to Product Challenge Prototyping Flexibility Distributed Scalable Fault-tolerance Reliable Production Flexibility Distributed Scalable Fault-tolerance Reliable + Big Data Ready + LSML Black Magic! Migration
  • 3.
    Microsoft Azure +R. Hello R! Python is a COOL language! But R… Specialized in statistical analyze Time-effective => ideal for… …prototyping, competition, researching, and for fun! Standalone computing => not bad scalable  Open source Big bearded community
  • 4.
    Storage Resource Management ML Framework Execution Engine Local OS LocalDisc PythonRuntime YetAnother Runtime scikit learn HDFS YARN MapReduce Mahout HDFS / S3 YARN / Apache Mesos Spark MLlib HDFS / S3 YARN / Apache Mesos Python / R on Spark Python/R tools Spark Local PC Hybrid Model Cluster (on-premises/on-demand) some library Machine Learning in Finance. Infrastructure for Data Scientist Low HighCost of deployment/ownership Distributed FS Dark Magic… ML as a Service Python/R tools Microsoft Azure + R. Infrastructures for Data Scientists
  • 5.
    Microsoft Azure +R. Microsoft ♥ R R Server for Azure HDInsight Data Science VM Azure Machine Learning Support R-scripts execution Allow authoring custom R modules Jupyter Notebooks with R kernel support Azure HDInsight Hadoop/Spark-cluster as a Service SQL Server R Services Power BI Running R Scripts & excellent visualization R Tools for Visual Studio Microsoft Azure
  • 6.
  • 7.
  • 8.
    R Server forAzure HDInsight Killer features list: 100% open source R implementation; workload running inside HDInsight (Hadoop/Spark). Microsoft Azure + R. R Server for Azure HDInsight
  • 9.
    R, Python, SQL,C# Microsoft Azure + R. Data Science VM Microsoft R Server Developer Edition, Anaconda Python distribution, Jupyter notebooks for Python and R, Visual Studio Community Edition with Python and R Tools, Power BI desktop, SQL Server Express edition ML libs: CNTK, xgboost and Vowpal Wabbit Azure SDK Data Science VM inside:
  • 10.
    R Tools inAzure Machine Learning: Support R-scripts execution; Allow authoring custom R modules; Jupyter Notebooks with R kernel support. Microsoft Azure + R. Azure Machine Learning
  • 11.
    Microsoft Azure +R. Azure Machine Learning Jupyter Notebook Azure ML Studio GitHub/ TFS in Azure h(θ0, θn) Commands flow Data flow Request/response flow
  • 12.
    References Cortana Intelligence andMachine Learning Blog R for Azure Machine Learning. Quickstart Machine Learning Algorithm Cheat Sheet Machine Learning Hackathon. How to win? Azure ML Repositories on GitHub Microsoft Azure for all group on Facebook Soon in Slack (invite form) Microsoft Azure + R. References
  • 13.
    © 2016 DmitryPetukhov All rights reserved. Microsoft and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. Data Science must win!
  • 14.
    Q&A Now or later(send on d.petukhov@outlook.com) Ping me Habr: @codezombie LinkedIn: @dpetukhov Facebook: @code.zombi Read my tech code instinct blog ( http://0xCode.in/ ) Microsoft Azure + R. Stay in Touch!

Editor's Notes

  • #7 Revolution Analytics Revolution R Open и Revolution R Enterprise Revolution R — это среда выполнения языка R (язык программирования для статистической обработки данных и работы с графикой), оптимизированная для многопоточных вычислений, а также, набор библиотек, для параллельной обработки в рамках концепции «больших данных».
  • #9 R Server for Azure HDInsight is a 100% open source R implementation running the most comprehensive set of ML algorithms and statistical functions in the cloud that leverages Hadoop and Spark. By making R Server available as a workload running inside HDInsight, we remove obstacles for users to unlock the power of R by eliminating memory and processing constraints and extending analytics from the laptop to large multi-node Hadoop and Spark clusters. This enables the ability to train and run ML models on larger datasets than previously possible to make more accurate predictions that affect the business. It also reduces the time to move ideas into production by eliminating the time-consuming installation or set up and procurement cycles for new hardware.