MLFlow
as a core driver of our ML CI/CD workflow
@Avalara
About me
Manoj Mahalingam
Principal Engineer @Avalara
Scale of Data @ Avalara (Indix)
● 2.1 billion product URLs
● 8 TB of HTML data crawled daily
● 1B unique products
● 7,000 categories
● 120B price points
● 3,000 sites
Data Pipeline @ Avalara (Indix)
● Crawling Pipeline: Seed, Crawl, Parse
○ Input: Brand & Retailer Websites
○ Output: Crawl Data
● Feeds Pipeline: Transform, Clean, Connect
○ Input: Brand & Retailer Feeds
○ Output: Feed Data
● Data Pipeline (ML): Dedupe, Classify, Extract Attributes, Standardize, Match, Aggregate
● Indexing Pipeline (Real Time): Index, Analyze, Derive, Join
● Outputs: Product Catalog, Customizable Feeds, Search & Analytics Index, API (Bulk & Synchronous), Product Data Transformation Service
Product Classification Example
Model Hierarchy
● BMV Ensemble: predict if the product belongs to the Media categories (Books, Music, Videos)
● Toplevel Ensemble: assign one of the other applicable top levels
● Category models: Shoes Models, Clothing Models, Electronics Models, ...
Clothing Models
● Intermediate: jackets & coats
● Subcategory: plus-size, active, maternity
● Leaf: women’s plus-size fleece jackets
● Gender: men, women, boy, girl, baby, unisex
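The routing through the model hierarchy above can be sketched roughly as follows. This is only an illustration: the three "models" are hypothetical keyword stubs, and the function names and level keys are assumptions made for the example, not the production classifiers.

```python
from typing import Callable, Dict

# All "models" below are hypothetical keyword stubs standing in for trained
# ensembles; only the routing between them is the point of this sketch.


def is_media(title: str) -> bool:
    """Stand-in for the BMV ensemble (Books, Music, Videos)."""
    return any(word in title.lower() for word in ("paperback", "dvd", "vinyl"))


def top_level(title: str) -> str:
    """Stand-in for the top-level ensemble over the non-media categories."""
    text = title.lower()
    if "jacket" in text or "coat" in text:
        return "Clothing"
    if "sneaker" in text or "boot" in text:
        return "Shoes"
    return "Electronics"


def clothing_levels(title: str) -> Dict[str, str]:
    """Stand-in for the Clothing models: one prediction per level."""
    text = title.lower()
    return {
        "gender": "women" if "women" in text else "unisex",
        "intermediate": "jackets & coats",
        "subcategory": "plus-size" if "plus-size" in text else "active",
    }


# Category-specific models, keyed by the top-level label they refine.
CATEGORY_MODELS: Dict[str, Callable[[str], Dict[str, str]]] = {
    "Clothing": clothing_levels,
}


def classify(title: str) -> Dict[str, str]:
    """Route a product title through the hierarchy, stopping where no model applies."""
    if is_media(title):
        return {"top_level": "Media"}
    result = {"top_level": top_level(title)}
    refine = CATEGORY_MODELS.get(result["top_level"])
    if refine is not None:
        result.update(refine(title))
    return result
```

Calling classify("Women's plus-size fleece jacket") goes past the media check, lands in Clothing, and is then refined by the Clothing stub. Each box in the hierarchy is a separately trained and versioned model, which is why a model repository matters.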
Principles from CD
● Continuous Delivery for Software: Source Code and Artifact Repository for Reproducibility
● Continuous Delivery for Machine Learning: Source Code, Data and Model Repository for Reproducibility
Requirements from a model repo
● Similar to an artifact repository like Maven, Ivy
○ Directory Structure, Versioning, Publishing of models
● Has clients to publish models from the most commonly used frameworks
○ scikit-learn, Spark MLlib, Keras
● For each model, stores
○ Data
■ Stored in S3
■ In different formats
● Spark MLlib: Parquet; scikit-learn: Pickle; Keras: HDF5
○ Metadata
■ Training/Validation/Test Datasets
■ Hyper-parameters used
■ Evaluation Metrics
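As a rough illustration of the requirements above, a model repository entry could be represented as below. The class name, key layout, bucket path, and sample values are all hypothetical, chosen only to show versioned data plus metadata side by side.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict

# Serialization format per framework, as listed in the requirements.
MODEL_FORMATS = {"spark-mllib": "parquet", "scikit-learn": "pickle", "keras": "hdf5"}


@dataclass
class ModelVersion:
    """One published version of a model: where its data lives plus its metadata."""

    name: str
    version: str
    framework: str
    datasets: Dict[str, str] = field(default_factory=dict)
    hyper_params: Dict[str, float] = field(default_factory=dict)
    metrics: Dict[str, float] = field(default_factory=dict)

    def artifact_key(self) -> str:
        """Versioned storage key, artifact-repository style: name/version/file."""
        extension = MODEL_FORMATS[self.framework]
        return f"models/{self.name}/{self.version}/model.{extension}"

    def metadata_json(self) -> str:
        """Metadata document to store next to the model data."""
        return json.dumps(asdict(self), sort_keys=True)


# Hypothetical example values, for illustration only.
example = ModelVersion(
    name="clothing-leaf",
    version="1.2.0",
    framework="keras",
    datasets={"train": "s3://example-bucket/clothing/train"},
    hyper_params={"learning_rate": 0.01},
    metrics={"precision": 0.93},
)
```

Here example.artifact_key() yields a key under which the model file would be stored in S3, and example.metadata_json() yields the record of datasets, hyper-parameters, and evaluation metrics for that version.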
Model Promotion
● Tagging the “latest good” version that needs to be deployed
● Not all models need/can be promoted
○ Experimental models
○ Models that fail the test set or performance/latency metrics
● Easy rollback - tag the “last good” version as the latest
Training Pipeline
● Training Data
● Pre-process Data (Spark Job)
● Build Model (Python)
● Evaluate Model (Python)
● Publish Model
● Promote Model (Manual Step)
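The promotion and rollback rules can be sketched with a small in-memory log. This is a minimal illustration under assumed names (PromotionLog and its methods are hypothetical); a real setup would keep this state as tags in the model repository.

```python
from typing import List, Optional


class PromotionLog:
    """Tracks published versions of one model and which of them were promoted."""

    def __init__(self) -> None:
        self.published: List[str] = []
        self.promoted: List[str] = []  # promotion history, newest last

    def publish(self, version: str) -> None:
        """Every trained model is published, including experimental ones."""
        self.published.append(version)

    def promote(self, version: str) -> None:
        """Tag a published version as the latest good one."""
        if version not in self.published:
            raise ValueError(f"{version} was never published")
        self.promoted.append(version)

    def latest_good(self) -> Optional[str]:
        """Version that deployments should pick up, if any."""
        return self.promoted[-1] if self.promoted else None

    def rollback(self) -> str:
        """Drop the current promotion so the last good version is latest again."""
        if len(self.promoted) < 2:
            raise ValueError("no earlier promoted version to roll back to")
        self.promoted.pop()
        return self.promoted[-1]
```

Publishing and promoting are deliberately separate calls: a version that is published but never promoted stays in the repository without ever being deployed.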
MLFlow
● Open source platform for managing the end-to-end machine learning lifecycle
● By Databricks (founded by the creators of Spark)
● Three components: Tracking, Projects, and Models
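Training module code snippets
import mlflow

# Point the client at the tracking server (host comes from the job's config).
mlflow.set_tracking_uri(self.sagemaker_configs["mlflow_host"])
# create_experiment returns the ID that start_run expects.
experiment_id = mlflow.create_experiment(experiment_name)
with mlflow.start_run(experiment_id=experiment_id):
    for k in quant_hyper_params:
        mlflow.log_param(key=k, value=quant_hyper_params[k])
    mlflow.log_artifact(output + ".bin")  # upload the trained model file
    mlflow.log_metric("precision{}".format(metric_type), prec_score)
    mlflow.set_tag("promote", "true")  # tag the run for promotion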
MLFlow integration with GoCD
● Open source plugin for GoCD that enables MLFlow to be a material (source) for CI/CD pipelines on GoCD: https://github.com/indix/mlflow-gocd
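Conceptually, a material is something a pipeline polls for new revisions. The sketch below shows one way such a check could work, assuming runs are tagged promote=true as in the training snippets; the function name and the dictionary shape of a run are hypothetical, and this is not the plugin's actual code.

```python
from typing import Dict, List, Optional


def latest_promoted_run(runs: List[Dict], last_seen_end_time: int = 0) -> Optional[Dict]:
    """Return the newest promoted run that finished after the last one seen."""
    candidates = [
        run
        for run in runs
        if run.get("tags", {}).get("promote") == "true"
        and run["end_time"] > last_seen_end_time
    ]
    # default=None covers the "nothing new to deploy" case.
    return max(candidates, key=lambda run: run["end_time"], default=None)


# Hypothetical run records, for illustration only.
SAMPLE_RUNS = [
    {"run_id": "a1", "end_time": 100, "tags": {"promote": "true"}},
    {"run_id": "b2", "end_time": 200, "tags": {}},
    {"run_id": "c3", "end_time": 300, "tags": {"promote": "true"}},
]
```

With these sample records, a first poll returns run c3, and a later poll that passes 300 as last_seen_end_time returns None, so the deployment pipeline is not triggered again until another run is promoted.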
Questions?
I blog at https://stacktoheap.com
Twitter and most other platforms @manojlds
