
Migrating Existing Open Source Machine Learning to Azure


By David Smith. Presented at Microsoft Build (Seattle), May 7 2018.

Your data scientists have created predictive models using open-source tools, proprietary software, or some combination of both, and now you are interested in lifting and shifting those models to the cloud. In this talk, I'll describe how data scientists can transition their existing workflows — while using mostly the same tools and processes — to train and deploy machine learning models based on open source frameworks to Azure. I'll provide guidance on keeping connections to data sources up-to-date, evaluating and monitoring models, and deploying applications that make use of those models.


  1. (image-only slide)
  2. Replicable and scriptable: consistent syntax on Windows (cmd / PowerShell), Mac, Linux, and WSL.
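For example, the Azure CLI runs the same commands unchanged on all of those platforms. A minimal sketch of provisioning a training VM (resource names, region, and size below are placeholders, and you need an Azure subscription):

```shell
# All names and the region are illustrative.
az login

# Create a resource group to hold the ML resources
az group create --name ml-demo-rg --location eastus

# Create a Linux VM for training. For a Data Science Virtual Machine,
# look up the current image URN first with:
#   az vm image list --publisher microsoft-dsvm --all
az vm create \
  --resource-group ml-demo-rg \
  --name ml-demo-vm \
  --image UbuntuLTS \
  --size Standard_DS3_v2 \
  --admin-username azureuser \
  --generate-ssh-keys
```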
  3. Visual Studio [Code] Tools for AI: VS and VS Code extensions to streamline computations on servers, Azure ML, Batch AI, and more; an end-to-end development environment, from new project through training; support for remote training and job management; built on top of all the goodness of VS (Python, Jupyter, Git, etc.). See also THR3129, Getting Started with Visual Studio Tools for AI, with Chris Lauren.
  4. (image-only slide)
  5. (image-only slide)
  6. Single-VM development: local tools, local debugging, faster experimentation. Scale up: larger VMs, GPUs. Scale out: multi-node, remote Spark, batch nodes, VM scale sets.
  7. VM sizes and approximate costs:

     Series          RAM      vCPU   GPU                     Approx. Cost
     Standard_B1s    1 GB     1      None                    Free [*]
     DS3_v2          14 GB    4      None                    $0.23/hr
     DS4_v2          28 GB    8      None                    $0.46/hr
     A8v2            16 GB    8      None                    $0.82/hr
     Standard_NC6    56 GB    6      0.5x NVIDIA Tesla K80   $0.93/hr
     Standard_ND6s   112 GB   6      1x Tesla P40            $2.14/hr

     [*] Not recommended: free, but too small to be useful.
  8. (image-only slide)
  9. Not Hotdog (image-only slide)
  10. Azure Batch.
     Batch pools: configure and create VMs to cater for any scale, from tens to thousands; automatically scale the number of VMs to maximize utilization; choose the VM size best suited to your application.
     Batch jobs and tasks: a task is the unit of execution (task = command-line application); jobs are created and tasks are submitted to a pool, where tasks are queued and then assigned to VMs; any application, any execution time; run applications unchanged; frozen or failing tasks are automatically detected and retried.
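The execution model on that slide can be seen without the Azure SDK at all. A minimal pure-Python sketch (all names are hypothetical, not the azure-batch API) of queued tasks assigned to pool VMs with automatic retry:

```python
from collections import deque

MAX_RETRIES = 3  # Batch retries frozen/failing tasks automatically


def run_pool(tasks, num_vms, run_task):
    """Assign queued tasks to a pool of VMs, retrying failures.

    tasks: list of task identifiers (a task = one command-line app).
    run_task: callable returning True on success, False on failure.
    Returns a dict mapping task -> 'succeeded' or 'failed'.
    """
    queue = deque((t, 0) for t in tasks)  # (task, attempt count)
    status = {}
    while queue:
        # Each "round", up to num_vms tasks run in parallel on the pool.
        running = [queue.popleft() for _ in range(min(num_vms, len(queue)))]
        for task, attempts in running:
            if run_task(task):
                status[task] = "succeeded"
            elif attempts + 1 < MAX_RETRIES:
                queue.append((task, attempts + 1))  # requeue for retry
            else:
                status[task] = "failed"
    return status


# Example: task "b" fails on its first attempt, then succeeds on retry.
calls = {}
def flaky(task):
    calls[task] = calls.get(task, 0) + 1
    return not (task == "b" and calls[task] == 1)

print(run_pool(["a", "b", "c"], num_vms=2, run_task=flaky))
```

In the real service, the retry limit corresponds to a per-task setting rather than a global constant, and assignment is continuous rather than round-based; the sketch only illustrates the queue-then-assign-with-retry model.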
  11. Cost savings: scale cluster size up and down as needed; Reserved Instances for persistent infrastructure; per-second billing for VMs; flexible consumption and savings with low-priority VMs.
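A quick back-of-the-envelope comparison of per-second billing with low-priority discounts (the rates below are illustrative assumptions, not price quotes):

```python
# Hypothetical hourly rates for a DS4_v2-class VM (illustrative only)
dedicated_rate = 0.46      # $/hour, regular VM
low_priority_rate = 0.092  # $/hour, low-priority (often much cheaper)


def job_cost(rate_per_hour, seconds):
    """Per-second billing: pay only for the seconds the job actually runs."""
    return rate_per_hour / 3600 * seconds


seconds = 17 * 60  # a 17-minute training job
print(round(job_cost(dedicated_rate, seconds), 4))     # dedicated VM
print(round(job_cost(low_priority_rate, seconds), 4))  # low-priority VM
# With hourly billing, the same job would be charged for a full hour.
```

The trade-off with low-priority VMs is that they can be preempted, so they suit retry-tolerant batch workloads (like the Azure Batch tasks above) rather than interactive work.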
  12. Scaling AI with the DSVM and Batch AI: create Python scripts on the DSVM (dev/test workstation), store the scripts in the Azure File Store, run them with Batch AI on an Azure Batch AI cluster, and retrieve the trained AI model.
  13. (image-only slide)
  14. BRK3320: The Developer Data Scientist – Creating New Analytics-Driven Applications Using Apache Spark with Azure Databricks. May 8, 10:30–11:45 AM, Sheraton Grand Ballroom A.
  15. Traditional / on-premise paradigm:
     • Static-sized clusters were the standard, so compute and storage had to be collocated.
     • A single cluster hosted all necessary applications, typically managed by YARN or something similar.
     • The cluster was either over-utilized (jobs queued for lack of capacity) or under-utilized (idle cores burned money).
     • Teams of data scientists had to submit jobs against a single shared cluster, so the cluster had to stay generic, preventing users from truly customizing it for their jobs.
  16. Modern / cloud paradigm:
     • With cloud computing, customers are no longer limited to static-sized clusters.
     • Each job, or set of jobs, can have its own cluster, so a customer is charged only for the minutes the job runs.
     • Each user can have their own cluster, so they don't have to compete for resources.
     • Each user's cluster can be created specifically for their workload; users install exactly the software they need without polluting other users' environments.
     • IT admins don't need to worry about running out of capacity or burning dollars on idle cores.
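The utilization argument is easy to quantify. A sketch with assumed numbers (cluster size, core rate, and run hours are all hypothetical):

```python
def cluster_cost(cores, rate_per_core_hour, hours):
    """Total compute cost for a cluster billed per core-hour."""
    return cores * rate_per_core_hour * hours


# Static cluster: sized for peak load, billed around the clock.
static_cost = cluster_cost(cores=64, rate_per_core_hour=0.05, hours=24 * 30)

# Per-job clusters: same 64 cores, but billed only while jobs run
# (say 6 hours per day over the same month).
per_job_cost = cluster_cost(cores=64, rate_per_core_hour=0.05, hours=6 * 30)

print(round(static_cost, 2), round(per_job_cost, 2))  # monthly cost, always-on vs per-job
```

Under these assumptions, the per-job approach costs a quarter as much for the same completed work, which is the "idle cores burn dollars" point in numerical form.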
  17. (image-only slide)
  18. (image-only slide)
  19. Connect to the Spark cluster:

```r
library(sparklyr)
cluster_url <- paste0("spark://", system("hostname -i", intern = TRUE), ":7077")
sc <- spark_connect(master = cluster_url)
```

     Load in some data:

```r
library(dplyr)
flights_tbl <- copy_to(sc, nycflights13::flights, "flights")
```

     Munge with dplyr:

```r
delay <- flights_tbl %>%
  group_by(tailnum) %>%
  summarise(count = n(), dist = mean(distance), delay = mean(arr_delay)) %>%
  filter(count > 20, dist < 2000, ! %>%
  collect()
```
  20. Fit a linear model with Spark MLlib:

```r
> m <- ml_linear_regression(delay ~ dist, data = delay_near)
* No rows dropped by 'na.omit' call
> summary(m)
Call: ml_linear_regression(delay ~ dist, data = delay_near)

Deviance Residuals:
     Min       1Q   Median       3Q      Max
-19.9499  -5.8752  -0.7035   5.1867  40.8973

Coefficients:
              Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)  0.6904319   1.0199146    0.677    0.4986
dist         0.0195910   0.0019252   10.176    <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-Squared: 0.09619
Root Mean Squared Error: 8.075
```
  21. (image-only slide)
  22. (image-only slide)