MLOps and Data Quality:
Deploying Reliable ML Models in
Production
Presented by:
Stepan Pushkarev, CTO @ Provectus
Rinat Gareev, ML Solutions Architect @ Provectus
Webinar Objectives
1. Explore best practices of building and deploying reliable Machine Learning
models
2. Review existing open source tools and reference architectures for
implementation of Data Quality components as part of your MLOps
pipelines
3. Get qualified for the Provectus ML Infrastructure Acceleration Program, a
fully funded discovery workshop
Agenda
● Introduction and Why
● How: Common Practical Challenges and Solutions
○ Data Testing
○ Model Testing
● MLOps: Wiring Things Together
● Provectus ML Infrastructure Acceleration Program
Introductions
Stepan Pushkarev
Chief Technology
Officer, Provectus
Rinat Gareev
ML Solutions Architect,
Provectus
AI-First Consultancy & Solutions Provider
Clients ranging from
fast-growing startups to
large enterprises
450 employees and
growing
Established in 2010
HQ in Palo Alto
Offices across the US,
Canada, and Europe
We are obsessed with leveraging cloud, data, and AI to reimagine the way
businesses operate, compete, and deliver customer value
Our Clients
Innovative Tech Vendors
Seeking niche expertise to differentiate and win the market
Midsize to Large Enterprises
Seeking to accelerate innovation and achieve operational excellence
Why Quality Data Matters
Accuracy under different pipeline improvements (SIGMOD 2016 tutorial by Sanjay Krishnan, UC Berkeley, and Jiannan Wang, Simon Fraser U.):
● Scikit-learn defaults: 0.69
● TF-IDF, PoS, stop words: 0.695
● Python Hyperopt: 0.73
● After data cleaning: 0.91
https://sigmod2016.org/sigmod_tutorial1.shtml
GoCheck Kids Case Study
End-to-end deep learning image classification models to detect child gaze, strabismus, crescent, and dark iris/pupil population.
Metrics before → after Data QA:
● Precision: 32% → 40%
● Recall: 89% → 91%
● FPR: 19% → 17%
● PR AUC: 57% → 76%
Machine Learning Lifecycle
● Data Preparation: Data Ingestion → Data Cleaning → Data Merging → Data Labeling → Feature Engineering → Versioned Dataset
● ML Engineering: Model Training → Experimentation → Model Packaging → Model Candidate
● Delivery & Operations: Regression Testing → Model Selection → Production Deployment → Monitoring
All Stages of ML Lifecycle Require QA
The same lifecycle, with data tests, code tests, and model tests applied throughout Data Preparation, ML Engineering, and Delivery & Operations.
Error Cascades
* From "'Everyone wants to do the model work, not the data work': Data Cascades in High-Stakes AI", N. Sambasivan et al., SIGCHI, ACM (2021)
How: Practical Challenges and
Solutions
Common Challenge #1:
How to find & access the data I trust?
1. Data is scattered across multiple data sources and
technologies: RDBMS, DWH, Data Lakes, Blobs
2. Data ownership is not clear
3. Data requirements and SLAs are not clear
4. Metadata is not discoverable
5. As a result, all investments in Data and ML are killed by
data access and discoverability issues
Solution: Migrate to Data Mesh
Data Mesh sits at the convergence of
Distributed Domain-Driven Architecture,
Self-Serve Platform Design, and Product
Thinking with Data
● Brings data closer to Domain Context
● Introduces the concept of Data as a
Product and all appropriate data
contracts
● Sorts out data ownership issues
https://martinfowler.com/articles/data-monolith-to-mesh.html
Invest in a Global Data Catalog
A catalog answers questions like:
● Does this data exist? Where is it?
● What is the source of truth of the data?
● Who and/or which team is the owner?
● Who are the users of the data?
● Are there existing assets I can reuse?
● Can I trust this data?
* There are no established leaders
* Commercial vendors are not listed
Common Challenge #2:
How to get started with QA for Data and ML?
1. What exactly to test?
2. Who should test (Traditional QA, Data Engs, ML Engs,
Analysts)?
3. What tools to use?
4. As a result, ML engineers lose productivity dealing with
data quality issues.
Data: What to Test
Default data quality checks:
● Duplicates
● Missing values
● Syntax errors
● Format errors
● Semantic errors
● Integrity
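As a rough illustration, several of these default checks can be expressed with Great Expectations' pandas API (the file name, column names, and bounds below are hypothetical, and the exact API varies between versions):

import great_expectations as ge
import pandas as pd

# Wrap a hypothetical raw extract so expectations can be attached to it
df = ge.from_pandas(pd.read_csv("orders.csv"))

# Duplicates and missing values
df.expect_column_values_to_be_unique("order_id")
df.expect_column_values_to_not_be_null("amount")

# Format checks (e.g., two-letter country codes) and basic integrity (value ranges)
df.expect_column_values_to_match_regex("country", r"^[A-Z]{2}$")
df.expect_column_values_to_be_between("amount", min_value=0, max_value=1_000_000)

# Run the whole suite and gate the pipeline step on the overall result
results = df.validate()
assert results["success"], "data quality checks failed"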
Advanced unsupervised methods:
● Distribution tests
● KS, Chi-squared tests
● Outlier detection with AutoML
● Auto Constraints suggestion
● Data Profiling for Complex
Dependencies
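For the distribution tests, a minimal SciPy sketch: compare the latest batch of a feature against a reference sample with a two-sample Kolmogorov–Smirnov test (the synthetic data and the 0.01 significance level are illustrative):

import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
reference = rng.normal(loc=0.0, scale=1.0, size=5_000)  # e.g., training-time feature values
current = rng.normal(loc=0.3, scale=1.0, size=5_000)    # e.g., latest production batch

statistic, p_value = ks_2samp(reference, current)
if p_value < 0.01:
    print(f"distribution shift detected: KS={statistic:.3f}, p={p_value:.4f}")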
Unsupervised Constraints Generation
Use cases:
● existing data with poor documentation or schema
● rapidly evolving data
● rich structure
● starting from scratch
Workflow:
1. Compute data profiles/summaries
2. Generate checks on types, completeness, ranges, uniqueness, and distributions (extensible, e.g., with conventions on column naming)
3. Evaluate on a holdout subset
4. Review and add to test suites (see the sketch below)
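One concrete way to run this loop is TensorFlow Data Validation's profile-and-infer workflow; a minimal sketch (file paths are placeholders):

import tensorflow_data_validation as tfdv

# 1. Compute data profiles/summaries over a reference dataset
train_stats = tfdv.generate_statistics_from_csv("train.csv")

# 2. Auto-generate constraints: types, completeness, value domains/ranges
schema = tfdv.infer_schema(train_stats)

# 3. Evaluate the generated constraints against a holdout subset
holdout_stats = tfdv.generate_statistics_from_csv("holdout.csv")
anomalies = tfdv.validate_statistics(holdout_stats, schema)
tfdv.display_anomalies(anomalies)

# 4. After review, persist the schema so it becomes part of the test suite
tfdv.write_schema_text(schema, "schema.pbtxt")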
Data Testing: Available Tools
● Deequ
● Great Expectations
● TensorFlow Data Validation
● dbt
* Commercial vendors are not listed
Model Testing
Model Testing: Analyzing Input and
Output Datasets
Model Testing: Datasets Are Test
Suites with Test Cases
● Golden UAT datasets
● Security datasets
● Production traffic replay
● Regression datasets
● Datasets for bias
● Datasets for edge cases
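Treating such datasets as test suites means every model candidate is evaluated against each of them with explicit pass/fail thresholds. A pytest-style sketch for a golden regression dataset (dataset path, model interface, and thresholds are hypothetical):

import pandas as pd
from sklearn.metrics import precision_score, recall_score

def test_model_on_golden_dataset(model_candidate):
    # Golden UAT dataset: curated examples with verified labels
    golden = pd.read_parquet("datasets/golden_uat.parquet")
    predictions = model_candidate.predict(golden.drop(columns=["label"]))

    precision = precision_score(golden["label"], predictions)
    recall = recall_score(golden["label"], predictions)

    # Block promotion of the candidate if it regresses below agreed thresholds
    assert precision >= 0.80, f"precision regressed to {precision:.3f}"
    assert recall >= 0.85, f"recall regressed to {recall:.3f}"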
Model Testing: Bias
Bias is considered to be a disproportionate inclination or prejudice for or against an idea or thing.
10+ Bias Types
● Selection Bias — The selection of data in such
a way that the sample is not representative of
the population
● The Framing Effect — Annotation questions
that are constructed with a particular slant
● Systematic Bias — Consistent and repeatable
error.
● Outlier Data, Missing Values, Filtering Data
● Bias/Variance Trade-off
● Personal Perception Bias
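A small sketch of one such check: comparing positive-outcome rates across groups (the disparate impact ratio). The column names, sample data, and the 0.8 "four-fifths" threshold are illustrative:

import pandas as pd

def disparate_impact_ratio(df: pd.DataFrame, group_col: str, outcome_col: str) -> float:
    # Ratio of the lowest to the highest positive-outcome rate across groups
    rates = df.groupby(group_col)[outcome_col].mean()
    return rates.min() / rates.max()

# Hypothetical scored dataset: a protected attribute and a binary model decision
scored = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b"],
    "approved": [1, 1, 0, 1, 1, 1],
})

ratio = disparate_impact_ratio(scored, "group", "approved")
print(f"disparate impact ratio: {ratio:.2f}")  # values well below ~0.8 warrant review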
Model Testing: Available Tools
Adversarial Testing & Model Robustness:
1. CleverHans by Ian Goodfellow & Nicolas Papernot
2. Adversarial Robustness Toolbox (ART) by IBM
Bias and Fairness:
1. AWS SageMaker Clarify
2. AIF360 by IBM
3. Aequitas by University of Chicago
MLOps: Wiring Things
Together
The Core of MLOps Pipelines
● Model Code, ML Pipeline Code, Infrastructure as Code
● Versioned Dataset, Feature Store
● Automated Pipeline Execution with idempotent orchestration and pipeline metadata
● Model Artifacts, Prediction Service, ML Metrics
● Production Metrics & Alerts, Reports
● Feedback Loop for Production Data
The Core of MLOps Pipelines (continued)
The same pipeline, with Data Quality Checks added as a first-class component.
Expanding Validation Pipelines
Batch quality checkpoints on the versioned dataset and feature store:
● Dataset rules validation
● Statistical assertions
● Dataset bias checker
● Outlier detector
Model validation for the ML model and the deployed model:
● Model test for bias
● Model security test
● Regression test
● Business acceptance
● Traffic replay
(A rough orchestration sketch follows.)
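A rough, orchestrator-agnostic sketch of how these checkpoints gate a pipeline; the check functions are stubs standing in for the tools discussed earlier:

from dataclasses import dataclass

@dataclass
class CheckResult:
    name: str
    passed: bool

# Stub checks; in a real pipeline each would call the corresponding tool
# (dataset rules, statistical assertions, outlier detector, bias/security/regression suites)
def batch_quality_checkpoint(dataset) -> list[CheckResult]:
    return [CheckResult("dataset rules validation", True),
            CheckResult("statistical assertions", True),
            CheckResult("outlier detection", True)]

def model_validation_suite(model) -> list[CheckResult]:
    return [CheckResult("test for bias", True),
            CheckResult("security test", True),
            CheckResult("regression test on golden dataset", True)]

def run_pipeline(dataset, train_fn, deploy_fn):
    # Gate 1: stop before training if the versioned dataset fails its checks
    failed = [c.name for c in batch_quality_checkpoint(dataset) if not c.passed]
    if failed:
        raise RuntimeError(f"data quality gate failed: {failed}")

    model = train_fn(dataset)

    # Gate 2: only promote the model candidate if every validation passes
    failed = [c.name for c in model_validation_suite(model) if not c.passed]
    if failed:
        raise RuntimeError(f"model validation gate failed: {failed}")

    deploy_fn(model)  # followed by traffic replay / business acceptance in production

# Example wiring with trivial placeholders
run_pipeline(dataset=[1, 2, 3], train_fn=lambda d: "model-v2", deploy_fn=print)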
Final Recommendations
1. You cannot deploy ML models to production without a clear Data QA strategy in place.
2. As a leader, focus on organizing data teams around product features to make them fully responsible for Data as a Product.
3. Design Data QA components as an essential part of your MLOps foundation.
125 University Avenue
Suite 295, Palo Alto
California, 94301
provectus.com
Questions, details?
We would be happy to answer!
