Intuit Proprietary
Mar 25, 2019
Practical Lessons on Model Deployments with AWS Managed Services
Ian Sebanja
Tobias Wenzel
“We spend more time bringing the model to production than developing and training it”
Data Science Operations is not easy
Model Development Lifecycle
Stages: Explore, Feature Pipelines, Model Development / Training, Training, Perf Test, Deploy / Host
Infrastructure: Intuit ML Platform, Cloud, Swarm, SageMaker
Intuit’s ML Platform
Services Layer: Prediction Logging, Data Aggregation, Model Deployment, Data Prefetching, Logging & Monitoring, Model Hosting
ML Platform Self-Service
Self-Service Interfaces
Ease of Use
Powering Data Velocity
From 6 MONTHS to < 1 WEEK
Deployment time down by 90%
Model deployment + Logging
We need to know what happens in the models
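One way to see what happens in a model is to emit one structured record per prediction. The sketch below is illustrative only: the slides do not describe how Intuit's prediction logging works, and the logger name, field names, and the helpers `build_prediction_record` and `with_prediction_logging` are all assumptions.

```python
import json
import logging
import time
import uuid
from typing import Any, Callable, Dict, Optional

# Hypothetical logger name; route it to whatever log sink the platform uses.
logger = logging.getLogger("prediction_log")


def build_prediction_record(
    model_name: str,
    model_version: str,
    features: Dict[str, Any],
    prediction: Any,
    request_id: Optional[str] = None,
    timestamp: Optional[float] = None,
) -> Dict[str, Any]:
    """Assemble one log record describing a single prediction."""
    return {
        "request_id": request_id or str(uuid.uuid4()),
        "timestamp": time.time() if timestamp is None else timestamp,
        "model_name": model_name,
        "model_version": model_version,
        "features": features,
        "prediction": prediction,
    }


def with_prediction_logging(
    model_name: str,
    model_version: str,
    predict: Callable[[Dict[str, Any]], Any],
) -> Callable[[Dict[str, Any]], Any]:
    """Wrap a predict function so every call is logged as one JSON line."""

    def wrapped(features: Dict[str, Any]) -> Any:
        prediction = predict(features)
        record = build_prediction_record(
            model_name, model_version, features, prediction
        )
        logger.info(json.dumps(record, sort_keys=True))
        return prediction

    return wrapped
```

A caller would wrap the model once, for example `scored = with_prediction_logging("churn", "v1", model_predict)`, and then call `scored(features)` as usual. The shared `request_id` is what allows a request to be traced across layers.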
Model deployment + Logging, Metrics
We need to know how the model runs
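Knowing how the model runs usually starts with latency. The following is a minimal sketch of a latency tracker with nearest-rank percentiles; the class `LatencyTracker` and its methods are hypothetical and do not represent the platform's actual metrics tooling.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class LatencyTracker:
    """Collects call latencies in milliseconds and reports percentiles."""

    def __init__(self) -> None:
        self._samples_ms: List[float] = []

    def record(self, latency_ms: float) -> None:
        self._samples_ms.append(latency_ms)

    def timed(self, fn: Callable[[], T]) -> T:
        """Run fn, record how long it took, and return its result."""
        start = time.perf_counter()
        try:
            return fn()
        finally:
            self.record((time.perf_counter() - start) * 1000.0)

    def percentile(self, p: int) -> float:
        """Nearest-rank percentile for an integer p between 1 and 100."""
        if not self._samples_ms:
            raise ValueError("no samples recorded")
        if not 1 <= p <= 100:
            raise ValueError("p must be between 1 and 100")
        ordered = sorted(self._samples_ms)
        # Integer ceiling of p * n / 100 avoids floating-point rounding.
        rank = (p * len(ordered) + 99) // 100
        return ordered[rank - 1]
```

Typical usage would be `tracker.timed(lambda: model.predict(x))` around each call, with `tracker.percentile(50)` and `tracker.percentile(99)` exported periodically.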
Model deployment + Logging, Data Aggregation, Metrics
Data is available at the fingertips of the Data Scientist
Model deployment + Logging, Data Aggregation, Metrics, Tooling
Provide tooling to increase speed
Model deployment + Logging, Data Aggregation, Metrics, Tooling, A/B Test
Iterate over model versions quickly
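Iterating over model versions with an A/B test needs a stable way to split traffic. The sketch below shows one common approach, deterministic hash-based assignment by weight. It is an assumption for illustration, not the mechanism described in the talk; the function name `assign_variant` and the variant names are hypothetical. SageMaker endpoints can also split traffic between production variants by weight.

```python
import hashlib
from typing import Dict


def assign_variant(user_id: str, weights: Dict[str, float]) -> str:
    """Map a user to a model variant, stable across calls, by weight."""
    if not weights:
        raise ValueError("at least one variant is required")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")

    # Hash the user id to a number in [0, 1) so the same user
    # always lands in the same bucket.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64

    cumulative = 0.0
    chosen = ""
    for name, weight in weights.items():
        chosen = name
        cumulative += weight / total
        if bucket < cumulative:
            break
    # If rounding left bucket above the final cumulative value,
    # the last variant is returned.
    return chosen
```

For example, `assign_variant("user-123", {"v1": 0.9, "v2": 0.1})` sends roughly one user in ten to `v2` while keeping each user on the same version between requests.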
One more thing … Security
https://news.ycombinator.com/item?id=15256121
https://medium.com/@bertusk/detecting-cyber-attacks-in-the-python-package-index-pypi-61ab2b585c67
Secure your hosted models
SageMaker with KMS, VPC, IAM
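The slide names KMS, VPC, and IAM but not how they were configured. As a hedged sketch, the helper below builds request parameters that could be passed to the boto3 SageMaker client calls `create_model` and `create_endpoint_config`. The helper name `build_secure_hosting_requests`, the variant name, the config naming scheme, and the default instance type are assumptions; the code only builds dictionaries and does not call AWS.

```python
from typing import Any, Dict, List, Tuple


def build_secure_hosting_requests(
    model_name: str,
    image_uri: str,
    model_data_url: str,
    role_arn: str,
    subnet_ids: List[str],
    security_group_ids: List[str],
    kms_key_id: str,
    instance_type: str = "ml.m5.large",  # assumed default, adjust per model
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (create_model kwargs, create_endpoint_config kwargs)."""
    create_model = {
        "ModelName": model_name,
        "PrimaryContainer": {
            "Image": image_uri,
            "ModelDataUrl": model_data_url,
        },
        # IAM: the role the hosting container assumes.
        "ExecutionRoleArn": role_arn,
        # VPC: keep the container inside private subnets, with
        # security groups controlling its traffic.
        "VpcConfig": {
            "Subnets": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
        # Block outbound network calls from the model container.
        "EnableNetworkIsolation": True,
    }
    create_endpoint_config = {
        "EndpointConfigName": model_name + "-config",
        "ProductionVariants": [
            {
                "VariantName": "primary",
                "ModelName": model_name,
                "InitialInstanceCount": 1,
                "InstanceType": instance_type,
            }
        ],
        # KMS: encrypt the storage volume attached to hosting instances.
        "KmsKeyId": kms_key_id,
    }
    return create_model, create_endpoint_config
```

With a client created via `boto3.client("sagemaker")`, the two dictionaries would be passed as `client.create_model(**create_model)` and `client.create_endpoint_config(**create_endpoint_config)`.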
Lessons learned
1. Work with the vendor to get the security you need
- Monitor traffic from endpoints hosting a model
- Set security groups to block traffic where necessary
- Trace requests through all layers of the platform
2. Add your own monitoring and tooling
- Helped greatly during performance tests of the service
- Find breaking points early - collect your own data
- Test out uncommon usages - 1 request per minute?
- Create automation around right-sizing infrastructure for models
3. Make the vendor aware of everything
- Monitoring and alerting can reveal unintended behaviour
4. Be very transparent about cost of a model
- Intuit implemented a “billback” mechanism
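The slides do not say how the “billback” mechanism works. As an illustration of making model cost transparent, the sketch below attributes hosting cost to teams from instance-hour usage. The record fields, the function name `billback`, and the hourly rates are all hypothetical; real rates would come from the provider's price list.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def billback(
    usage_records: Iterable[Mapping[str, object]],
    hourly_rates: Mapping[str, float],
) -> Dict[str, float]:
    """Sum hosting cost per team from instance-hour usage records.

    Each record is assumed to carry: team, instance_type,
    instance_count, and hours.
    """
    totals: Dict[str, float] = defaultdict(float)
    for record in usage_records:
        rate = hourly_rates[str(record["instance_type"])]
        cost = rate * float(record["instance_count"]) * float(record["hours"])
        totals[str(record["team"])] += cost
    return dict(totals)


# Hypothetical example input, not real prices or teams.
example_rates = {"ml.m5.large": 0.5, "ml.c5.xlarge": 0.25}
example_usage = [
    {"team": "fraud", "instance_type": "ml.m5.large", "instance_count": 2, "hours": 10},
    {"team": "fraud", "instance_type": "ml.c5.xlarge", "instance_count": 1, "hours": 4},
    {"team": "search", "instance_type": "ml.m5.large", "instance_count": 1, "hours": 8},
]
example_totals = billback(example_usage, example_rates)
```

Reporting such totals per team or per model on a regular basis gives each model owner a direct view of what their endpoints cost.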
