Streamlining AI Prototyping and Deployment with R and MLflow

Kevin Kuo (@kevinykuo)
Software Engineer at RStudio
#UnifiedAnalytics #SparkAISummit
Daily specials
- Quick update on the R ecosystem for AI stuff
- Recap of MLflow
- Demo
- Discussion + Q&A
Sparklyr Update
- Arrow integration to massively speed up UDFs
- XGBoost
- TFRecord read/write
- SparkNLP on the way
https://spark.rstudio.com/
TensorFlow Update
He looks skeptical, as if you were nothing get it right. It drives me crazy, I can do this reproachful look no longer endure. His name is Olaf.
- library(keras) defaults to tf.keras
- TensorFlow Probability for probabilistic modeling
- Eager execution
- Preparing for the TF 2.0 release
https://tensorflow.rstudio.com/
https://blogs.rstudio.com/tensorflow/
Quick recap of MLflow
Open source platform for
- Experiment instrumentation (Tracking)
- Reproducible runs (Projects)
- Model deployment (Models)
Tracking
Keeping track of stuff
mlflow_log_param("num_hidden_units", 64)
mlflow_log_artifact("training_history.png")
mlflow_log_metric("accuracy", metrics$acc)
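The three calls above are normally made inside an active run. A minimal sketch of how they might be grouped, assuming the mlflow R package and a working MLflow installation; `log_training_run` and its argument names are hypothetical and not part of the talk:

```r
# Sketch only: assumes the mlflow R package and a reachable MLflow tracking backend.
# `log_training_run` and its arguments are hypothetical names used for illustration.
log_training_run <- function(num_hidden_units, accuracy,
                             plot_path = "training_history.png") {
  # Open a run and make sure it is closed when the function exits,
  # whether or not an error occurred along the way.
  mlflow::mlflow_start_run()
  on.exit(mlflow::mlflow_end_run(), add = TRUE)

  # Hyperparameter and final metric, as on the slide
  mlflow::mlflow_log_param("num_hidden_units", num_hidden_units)
  mlflow::mlflow_log_metric("accuracy", accuracy)

  # Attach the training plot only if the training step actually wrote it
  if (file.exists(plot_path)) {
    mlflow::mlflow_log_artifact(plot_path)
  }

  invisible(NULL)
}
```

A call such as `log_training_run(64, metrics$acc)` would then record one run, where `metrics` is whatever your evaluation step returns.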
Projects
Packaging up (reproducible) building blocks
mlflow_run("data-prep.R")
Models
Deployment
flavors:
  keras:
    version: 2.2.2
    data: model.h5
  python_function:
    loader_module: mlflow.keras
    data: model.h5
    env: conda_env.yaml
utc_time_created: 19-04-25T01:00:21.21.72
mlflow_rfunc_serve(
  "keras_model",
  run_uuid = training_run_id
)
Demo!
Roadmap
How are package dependencies handled for R projects?
Conda? Packrat?
What if your packages depend on Java/Python libraries?
Quick excursion on dependency management
Renv is in
renv::init()
renv::restore()
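The two calls above are the ends of a simple workflow: set up a project-local library, record what is installed, and rebuild it elsewhere. A rough sketch, assuming the renv package is installed; the three wrapper function names are hypothetical and only label the steps:

```r
# Sketch only: assumes the renv package is installed.
# The wrapper names below are hypothetical; they just label each step.

# One-time setup: create a project-local library and an renv.lock lockfile
start_tracking_dependencies <- function(project = ".") {
  renv::init(project = project)
}

# After installing or upgrading packages: update the lockfile
record_dependencies <- function(project = ".") {
  renv::snapshot(project = project)
}

# On another machine, after copying the project: reinstall the
# package versions recorded in renv.lock
rebuild_dependencies <- function(project = ".") {
  renv::restore(project = project)
}
```

Committing `renv.lock` alongside the project code is what lets the restore step reproduce the same package versions.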
Conda support in progress for reticulated packages
What about...
What about stuff with Java/rJava dependencies?!?!
Betting on new tech
Why MLflow and not something else?
Roadmap
Better integration with deployment tech
- MLeap (https://github.com/rstudio/mleap)
- H2O
- TensorFlow Serving
- Arbitrary R models (plumber + docker)
Resources
- https://mlflow.org/
- https://github.com/mlflow/mlflow
- https://tensorflow.rstudio.com/
- https://spark.rstudio.com/
- https://community.rstudio.com/
- Demo Repo: https://github.com/kevinykuo/sais2019-mlflow