Emad Elwany - CTO, Lexion
Evolution of ML Infrastructure at an AI-First Startup
Rsqrd AI Meetup - May 2020
Agenda
● Lexion Overview
● Document Understanding Pipeline
● Evolution of ML Infrastructure at Lexion
● Deep Dive - Model Versioning
Lexion: Applying NLP to legal agreements
Creating this simple report could take weeks without automation.
It’s a complex NLP problem
● Messy PDFs make OCR non-trivial
● Long, multi-agreement documents
● Domain specific language
● Complex schemas/ontologies
● Mix of non/semi/fully structured data
Sample: Identify Contract Term
Contract term is AUTO RENEW if, e.g.:
“will automatically renew for three year terms”
“shall continue on a month to month basis until terminated”
Contract term is FIXED if, e.g.:
“terminate effective April 1, 2007.”
“will continue until the 1 year anniversary”
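The example phrases above can be turned into a toy keyword baseline. This is only an illustrative sketch: the patterns, the `classify_term` name, and the `UNKNOWN` fallback are assumptions made for this example, not Lexion's models, and real agreements need far more than regular expressions.

```python
import re

# Hypothetical patterns derived only from the example phrases on this slide.
AUTO_RENEW_PATTERNS = [
    re.compile(r"\bautomatically\s+renew", re.IGNORECASE),
    re.compile(r"\bmonth[\s-]to[\s-]month\b", re.IGNORECASE),
]
FIXED_PATTERNS = [
    re.compile(r"\bterminate\s+effective\b", re.IGNORECASE),
    re.compile(r"\bcontinue\s+until\s+the\b.*\banniversary\b", re.IGNORECASE),
]


def classify_term(clause: str) -> str:
    """Return "AUTO RENEW", "FIXED", or "UNKNOWN" for a single clause."""
    # Auto-renew wording is checked first because a clause can mention both a
    # renewal and an end point; that ordering is an assumption of this sketch.
    if any(p.search(clause) for p in AUTO_RENEW_PATTERNS):
        return "AUTO RENEW"
    if any(p.search(clause) for p in FIXED_PATTERNS):
        return "FIXED"
    return "UNKNOWN"
```

A baseline like this breaks as soon as the wording changes, which is one reason the problem calls for learned models.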
Document Understanding Pipeline
[Pipeline diagram. Node labels: Input, OCR, Text, Layout, Entities, Classes, Relations, BL, Structured Data, Output.]
Many many models!
Key Takeaway: Every node in this graph is a “model” (one of hundreds), and the remainder of this talk applies to
each and every one of them.
Initial Goals (Pre-MVP)
● Evaluate technical feasibility: Can we build it?
● Evaluate business viability: Will they find it useful?
● Move very quickly: Can we ship it before we run out of money?
Use tools that are easy to
● Understand
● Setup
● Deploy
Steady state Goals (Post-MVP)
● Scale model development
● Scale model deployment
● Keep users happy at all times
Use tools that are easy to
● Integrate
● Configure
● Scale
Typical model lifecycle
Experience with ML in
research, applications,
and platforms:
Data
EARLY
● Finding the data
Scrapers/FOIA
● Cleaning the data
Scripting + Rules
● Annotating the data
Simple annotation tools
LATER
● Managing the data
Data Stores and Caches
● Protecting the data
Encryption and Access control
● Scaling annotation
Weakly/Unsupervised
Training
EARLY
Optimize for Speed of Results
Jupyter, Scripts
Goal: does it work?
LATER
Optimize for speed of Experimentation
Frameworks and metrics
Goal: make it the best!
Packaging
EARLY
Optimize for shipping the models
REST endpoint (online)
Batch script (offline)
LATER
Optimize for operationalizing the
model
Versioning of artefacts
Dependency management
Cost management
More on this a bit later...
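One way to read the early packaging step is a single prediction function exposed through two thin wrappers, one for online requests and one for offline batches. The sketch below assumes a stand-in `predict` function, a made-up `MODEL_VERSION` string, and JSON payloads with a `text` field; none of these come from the talk.

```python
import json
from typing import Iterable, Iterator

MODEL_VERSION = "1.0.0"  # hypothetical version string attached to every result


def predict(text: str) -> dict:
    """Stand-in model: flags text that mentions renewal."""
    label = "AUTO RENEW" if "renew" in text.lower() else "OTHER"
    return {"label": label, "model_version": MODEL_VERSION}


def handle_request(body: str) -> str:
    """Online path: JSON request body in, JSON response body out."""
    payload = json.loads(body)
    return json.dumps(predict(payload["text"]))


def run_batch(lines: Iterable[str]) -> Iterator[str]:
    """Offline path: one JSON result per non-empty input line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.dumps(predict(line))
```

Returning the model version with every prediction is a small step towards the artefact versioning listed under LATER.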
Validate Model
EARLY
● Does it work well enough?
Simple high-level metrics (F1, precision, recall, etc.)
LATER
● Is it better?
● Why is it better?
● How is it better?
Much more rigor:
● Validation sets
● E2E tests
● More detailed metrics
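As a rough illustration of the step from "does it work well enough?" to "is it better, and how?", the sketch below computes precision, recall, and F1 for one class, then lists the validation examples a new model gets wrong that the old model got right. The function names and the single-positive-class framing are assumptions of this example.

```python
from typing import List, Sequence, Tuple


def precision_recall_f1(gold: Sequence[str], pred: Sequence[str],
                        positive: str) -> Tuple[float, float, float]:
    """Precision, recall, and F1 with one class treated as the positive label."""
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def regressions(gold: Sequence[str], old_pred: Sequence[str],
                new_pred: Sequence[str]) -> List[int]:
    """Indices the old model got right and the new model gets wrong."""
    return [i for i, (g, o, n) in enumerate(zip(gold, old_pred, new_pred))
            if o == g and n != g]
```

Two models with the same F1 can still differ on individual documents, so the regression list answers a different question than the aggregate metric.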
Deployment
EARLY
Optimize for Speed of deployment
LATER
Optimize for Scale of deployment
● Inference time
● Priority vs. starvation
● Rapid update deployment
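The "priority vs. starvation" tension can be sketched with a queue that ages waiting jobs: urgent jobs go first, but every job gains credit the longer it waits. The `Job` and `AgingQueue` names, the logical clock, and the linear aging rule are all assumptions of this example, not a description of Lexion's scheduler.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Job:
    name: str
    priority: int       # lower number = more urgent
    enqueued_at: int    # logical clock tick when the job arrived


@dataclass
class AgingQueue:
    aging_rate: float = 1.0   # priority credit earned per tick of waiting
    jobs: List[Job] = field(default_factory=list)

    def push(self, job: Job) -> None:
        self.jobs.append(job)

    def pop(self, now: int) -> Job:
        """Remove and return the job with the lowest effective priority."""
        def effective(job: Job) -> float:
            return job.priority - self.aging_rate * (now - job.enqueued_at)

        best = min(self.jobs, key=effective)  # linear scan, fine for a sketch
        self.jobs.remove(best)
        return best
```

With `aging_rate=0` this degenerates into a strict priority queue, where a steady stream of urgent requests can starve bulk work indefinitely.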
Monitor
EARLY
Bare minimum to ensure things are
working:
● High level E2E alert
LATER
Invest in monitoring all aspects of the
models:
● Detailed KPIs
● Model Drift
● User DSAT
Logging, Dashboards, Alerts
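One simple signal for model drift is a shift in the distribution of predicted labels between a baseline window and the current window. The sketch below uses total variation distance; the choice of metric, the function names, and the `DRIFT_THRESHOLD` value are assumptions of this example rather than anything stated in the talk.

```python
from collections import Counter
from typing import Dict, Iterable

DRIFT_THRESHOLD = 0.2  # hypothetical alert threshold


def label_distribution(labels: Iterable[str]) -> Dict[str, float]:
    """Relative frequency of each label; empty input gives an empty dict."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


def total_variation(baseline: Iterable[str], current: Iterable[str]) -> float:
    """Half the L1 distance between two label distributions (0 = identical)."""
    p = label_distribution(baseline)
    q = label_distribution(current)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


def drift_alert(baseline: Iterable[str], current: Iterable[str],
                threshold: float = DRIFT_THRESHOLD) -> bool:
    """True when the label mix has moved further than the threshold."""
    return total_variation(baseline, current) > threshold
```

A shift in predicted labels can also reflect a genuine change in the incoming documents, so an alert like this is a prompt to investigate rather than proof that the model degraded.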
Deep Dive: Model Versioning
Real life problems
● “We used to predict the right X on this document - when/why did it break?”
○ Usually accompanied by an alert or even worse: a user complaint.
● “The model we trained 2 months ago was so much better at Y - we can’t seem
to get the same performance. How do we roll back?”
○ Usually accompanied by a frustrated product manager / quality engineering.
● “I swear I got better results over the weekend for the same experiment, I don’t
know what changed!”
○ Usually accompanied by a confused data scientist.
But first: can you reproduce your model results to the 10th decimal place? If not, STOP!
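A minimal way to check this kind of reproducibility is to train twice with the same seed and compare a hash of the exact weight bytes. The toy SGD model, the `train` and `fingerprint` names, and the seed values below are assumptions of this sketch; with real frameworks, bit-exact results also depend on library versions, hardware, and nondeterministic kernels.

```python
import hashlib
import random
import struct
from typing import List


def train(seed: int, steps: int = 200) -> List[float]:
    """Toy SGD fit of y = 2x + 1 using a private, seeded random generator."""
    rng = random.Random(seed)  # no hidden global random state
    w, b, lr = rng.uniform(-1, 1), rng.uniform(-1, 1), 0.05
    for _ in range(steps):
        x = rng.uniform(-1, 1)
        error = (w * x + b) - (2 * x + 1)
        w -= lr * error * x
        b -= lr * error
    return [w, b]


def fingerprint(weights: List[float]) -> str:
    """Hash the exact IEEE-754 bytes, so any bit-level difference shows up."""
    raw = b"".join(struct.pack("<d", value) for value in weights)
    return hashlib.sha256(raw).hexdigest()
```

Comparing `fingerprint(train(7))` across two runs is stricter than comparing rounded metrics: a difference in any bit of any weight changes the hash.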
Wait… didn’t we solve this problem a long time ago?
Source control has been used for decades. How is this different?
Versioning ML models shares a lot with code versioning, e.g.:
But it also includes a lot more:
Code (*) Config
Library dependencies Topology
Training Data Training Parameters
Model State (weights, hyperparameters) Hardware
(*) Code is a lot of things in the context of ML models: data prep, libraries, models, featurizers, etc.
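One way to treat all of these components as a single version is to record them in a manifest and derive an identifier from its content. The sketch below is an assumption-laden illustration: the manifest keys mirror the list above, but every value, the `version_id` name, and the 12-character truncation are made up for the example.

```python
import hashlib
import json

# Hypothetical manifest; every value here is invented for illustration.
manifest = {
    "code": "git:3f2a9c1",
    "config": {"remove_stop_words": True},
    "library_dependencies": {"torch": "1.5.0"},
    "topology": "placeholder-topology-description",
    "training_data": "sha256:placeholder-data-digest",
    "training_parameters": {"early_stopping_patience": 3},
    "model_state": "sha256:placeholder-weights-digest",
    "hardware": "V100",
}


def version_id(manifest: dict) -> str:
    """Stable short ID: the same manifest content always gives the same ID."""
    # Sorting keys and fixing separators makes the serialization canonical,
    # so key order in the dict does not affect the hash.
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]
```

Changing any single component, even just the hardware entry, produces a different identifier, which is the property plain source control does not give you on its own.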
What exactly is Versioning for ML models?
L1: Production/Staging slots.
Allows very short-term rollback/rollforward.
L2: Reproducing Inference.
Once you have a trained model, this kind of versioning allows you to deterministically
reconstruct a model for inference. Allows pinning models for a long time as well as long-term
rollback/rollforward.
L3: Reproducing Training.
At any point in time, you can retrain and obtain exactly the same model you had previously
trained. This is a much stronger kind of versioning: it enables reproducibility as well as
dealing with issues such as training data corruption.
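The L1 level can be sketched as two named slots and a swap operation. The `Slots` class and its method names are assumptions of this example; it covers only short-term rollback/rollforward and does nothing for L2 or L3, which need the artefacts themselves to be versioned.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slots:
    production: str
    staging: Optional[str] = None

    def deploy_to_staging(self, version: str) -> None:
        self.staging = version

    def swap(self) -> None:
        """Promote staging to production; the old production becomes staging."""
        if self.staging is None:
            raise RuntimeError("nothing in staging to promote")
        self.production, self.staging = self.staging, self.production
```

Calling `swap()` a second time rolls back, because the previous production version is still sitting in the staging slot. That only holds until the next staging deployment overwrites it.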
Artefacts that need to be versioned
                          Simple examples             Inference   Training
Model Hyper Parameters    Size of Layer N
Featurizer Code           Input feature vector size
Featurizer Data           Vocab
Model Code                NN Architecture
Model Config              Remove Stop Words?
Model State               Model Weights
Library Dependencies      PyTorch Version
Hardware                  V100
Training Config           Early Stopping Criteria
Training Data             Data + Labels
Remember this pipeline?
[Pipeline diagram repeated from the Document Understanding Pipeline slide.]
Many many models!
You need to version the aforementioned artefacts for every single node in this graph. That’s a lot of things to
version!
Some solutions (that don’t work)
● Let’s snapshot everything in a Docker image and store it forever
> How do you hotfix the model?
● Let’s mark a “stable” production model and not deploy any future “staging”
versions till they have been tested enough.
> How do you make “breaking” changes to the code?
● Let’s always support only “latest” version and never commit a new version
until we’re sure it’s good.
> How do you iterate quickly?
We evaluated some existing solutions
It’s always better to not reinvent the wheel
It’s a lot of work to move infrastructure
The question is when, not if. Early-stage startups need to ship and sell their
product; it is hard to justify infrastructure plumbing until the flywheel turns.
Instead of a full solution, these investments have paid off:
1. Versioning all model state during packaging
2. Versioning all data artefacts in our data store and making them immutable
3. Versioning all code explicitly by keeping stable interfaces and supporting
minor/major version upgrades to model/featurizer code.
4. Pinning major versions of stable dependencies
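Two of these investments lend themselves to a short sketch: immutable data artefacts (item 2) and major/minor version compatibility (items 3 and 4). The in-memory `ImmutableStore`, the content-hash keys, and the `compatible` rule below are assumptions of this example, not a description of Lexion's data store or versioning scheme.

```python
import hashlib
from typing import Dict


class ImmutableStore:
    """In-memory stand-in for a data store whose entries never change."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        # The key is derived from the content, so an existing key can never
        # be made to point at different bytes.
        key = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def compatible(required: str, available: str) -> bool:
    """Same major version, and available minor >= required minor."""
    req_major, req_minor = (int(part) for part in required.split(".")[:2])
    have_major, have_minor = (int(part) for part in available.split(".")[:2])
    return have_major == req_major and have_minor >= req_minor
```

Under this rule a packaged model that requires featurizer version "2.1" can run against "2.3" but not against "3.0", which is what keeping stable interfaces within a major version is meant to allow.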
Remember: we are building a whole user facing application on top of this,
prioritizing when to invest here is critical.
BTW, all this ML is in addition to…
● Permissions
● Email alerts
● SSO
● End-user annotations
● Custom reporting
● Full text search
● Task management
● Custom fields
● Doc schemas
● APIs
● Integrations
● Bulk export
● Dashboards
● Pretty charts
● Bulk ingestion
● Security
● Audit trail
… building a complete user facing application!
A note on ML technical debt
● Identify when the cost of carrying the debt exceeds the cost of addressing it
● Incorporate cost of ML infrastructure in your business model
● Pick the right kind of technical debt, with a plan to get out
● Model versioning is one of the areas you might want to invest in early
● Getting a great model is just the first step of a long journey. You have to build
a product customers love!
Questions?
Learn more at https://lexion.ai (we’re hiring!)

Breaking the Kubernetes Kill Chain: Host Path Mount
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 

Evolution of ML Infrastructure at an AI-First Startup

  • 1. Emad Elwany - CTO, Lexion Evolution of ML Infrastructure at an AI-First Startup Rsqrd AI Meetup - May 2020
  • 2. Agenda ● Lexion Overview ● Document Understanding Pipeline ● Evolution of ML Infrastructure at Lexion ● Deep Dive - Model Versioning
  • 3. Lexion: Applying NLP to legal agreements Creating this simple report could take weeks without automation.
  • 4. It’s a complex NLP problem ● Messy PDFs make OCR non-trivial ● Long, multi-agreement documents ● Domain specific language ● Complex schemas/ontologies ● Mix of non/semi/fully structured data
  • 5. Sample: Identify Contract Term Contract term is AUTO RENEW if, e.g.: “will automatically renew for three year terms” “shall continue on a month to month basis until terminated” Contract term is FIXED if, e.g.: “terminate effective April 1, 2007.” “will continue until the 1 year anniversary”
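To make the task on this slide concrete, here is a toy rule-based baseline in Python. It is only an illustrative sketch: the cue patterns are hypothetical, derived from the four example phrases above, and nothing in the talk suggests Lexion's models work this way.

```python
import re

# Hypothetical cue patterns, loosely based on the slide's example phrases.
AUTO_RENEW_PATTERNS = [
    r"\bautomatically renew",
    r"\bmonth[- ]to[- ]month\b",
]
FIXED_PATTERNS = [
    r"\bterminate effective\b",
    r"\bcontinue until the\b",
]


def classify_term(clause: str) -> str:
    """Return AUTO_RENEW, FIXED, or UNKNOWN for a single clause."""
    text = clause.lower()
    # Auto-renew cues are checked first; this ordering is an arbitrary choice.
    if any(re.search(p, text) for p in AUTO_RENEW_PATTERNS):
        return "AUTO_RENEW"
    if any(re.search(p, text) for p in FIXED_PATTERNS):
        return "FIXED"
    return "UNKNOWN"
```

A handful of patterns like these breaks quickly on real contracts, which is part of why the previous slide calls this a complex NLP problem.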
  • 6. Document Understanding Pipeline [Figure: pipeline graph; labels: Input, OCR, Text, Layout, BL, Entities, Classes, Relations, Structured Data, Output] Many many models! Key Takeaway: Every node in this graph is a “model” (of hundreds), and the remainder of this talk applies to each and every one of them.
  • 7. Initial Goals (Pre-MVP) ● Evaluate technical feasibility: Can we build it? ● Evaluate business viability: Will they find it useful? ● Move very quickly: Can we ship it before we run out of money? Use tools that are easy to ● Understand ● Setup ● Deploy
  • 8. Steady state Goals (Post-MVP) ● Scale model development ● Scale model deployment ● Keep users happy at all times Use tools that are easy to ● Integrate ● Configure ● Scale
  • 9. Typical model lifecycle Experience with ML in research, applications, and platforms:
  • 10. Data EARLY ● Finding the data Scrapers/FOIA ● Cleaning the data Scripting + Rules ● Annotating the data Simple annotation tools LATER ● Managing the data Data Stores and Caches ● Protecting the data Encryption and Access control ● Scaling annotation Weakly/Unsupervised
  • 11. Training EARLY Optimize for Speed of Results Jupyter, Scripts Goal: does it work? LATER Optimize for speed of Experimentation Frameworks and metrics Goal: make it the best!
  • 12. Packaging EARLY Optimize for shipping the models REST endpoint (online) Batch script (offline) LATER Optimize for operationalizing the model Versioning of artefacts Dependency management Cost management More on this a bit later...
  • 13. Validate Model EARLY ● Does it work well enough? Simple high level metrics (F1, P, R etc.) LATER ● Is it better? ● Why is it better? ● How is it better? Much more rigor: ● Validation sets ● E2E tests ● More detailed metrics
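The "simple high level metrics" named on this slide (P, R, F1) can be computed for one label as in the sketch below. The function name and argument layout are assumptions made for illustration; the talk does not show the evaluation code.

```python
def precision_recall_f1(gold, predicted, positive):
    """Precision, recall and F1 for one label, treated as the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if p == positive and g == positive)
    fp = sum(1 for g, p in pairs if p == positive and g != positive)
    fn = sum(1 for g, p in pairs if p != positive and g == positive)

    # Guard against empty denominators when a label is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1
```

Answering the later-stage questions ("Is it better? Why? How?") would mean computing these per label on a pinned validation set and comparing them between model versions.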
  • 14. Deployment EARLY Optimize for Speed of deployment LATER Optimize for Scale of deployment ● Inference time ● Priority vs. starvation ● Rapid update deployment
  • 15. Monitor EARLY Bare minimum to ensure things are working: ● High level E2E alert LATER Invest in monitoring all aspects of the models: ● Detailed KPIs ● Model Drift ● User DSAT Logging, Dashboards, Alerts
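One simple way to put a number on "Model Drift" is to compare the distribution of predicted labels in a recent window against a reference window. The sketch below uses total variation distance; the metric and the 0.2 threshold are assumptions chosen for illustration, not something the talk specifies.

```python
from collections import Counter


def label_distribution(labels):
    """Relative frequency of each predicted label in a non-empty window."""
    if not labels:
        raise ValueError("window must contain at least one prediction")
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def total_variation_distance(reference, live):
    """0.0 means identical label distributions, 1.0 means disjoint ones."""
    p = label_distribution(reference)
    q = label_distribution(live)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


def drift_alert(reference, live, threshold=0.2):
    """True when the live window has moved further than the (hypothetical) threshold."""
    return total_variation_distance(reference, live) > threshold
```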
  • 16. Deep Dive: Model Versioning
  • 17. Real life problems ● “We used to predict the right X on this document - when/why did it break?” ○ Usually accompanied by an alert or even worse: a user complaint. ● “The model we trained 2 months ago was so much better at Y - we can’t seem to get the same performance. How do we roll back?” ○ Usually accompanied by a frustrated product manager / quality engineering. ● “I swear I got better results over the weekend for the same experiment, I don’t know what changed!” ○ Usually accompanied by a confused data scientist. But first: can you reproduce your model results to the 10th decimal place? If not, STOP!
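The closing question on this slide can be turned into an automated check. The sketch below trains a toy one-weight model twice with the same seed and compares the results to 10 decimal places. Everything here is hypothetical and standard-library only; a real pipeline would also need deterministic framework, data-loading and hardware settings, which this sketch does not cover.

```python
import random


def train_toy_model(seed: int, steps: int = 1000) -> float:
    """Fit y = 3x with SGD, using a local seeded RNG so no global state leaks in."""
    rng = random.Random(seed)
    weight = 0.0
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)
        target = 3.0 * x
        grad = 2.0 * (weight * x - target) * x  # d/dw of the squared error
        weight -= 0.1 * grad
    return weight


def is_reproducible(seed: int, places: int = 10) -> bool:
    """Train twice with the same seed and compare to the given number of decimals."""
    return round(train_toy_model(seed), places) == round(train_toy_model(seed), places)
```

A check like `is_reproducible(42)` could run in CI so that a newly introduced source of nondeterminism is caught before it reaches an experiment.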
  • 18. Wait… didn’t we solve this problem a long time ago? Source control has been used for decades. How is this different? Versioning ML models shares a lot with code versioning, e.g.: Code (*), Config, Library dependencies. But it also includes a lot more: Topology, Training Data, Training Parameters, Model State (weights, hyperparameters), Hardware. (*) Code is a lot of things in the context of ML models: it’s data prep, libraries, models, featurizers etc.
  • 19. What exactly is Versioning for ML models? L1: Production/Staging slots. Allows very short-term rollback/rollforward. L2: Reproducing Inference. Once you have a trained model, this kind of versioning allows you to deterministically reconstruct a model for inference. Allows pinning models for a long time as well as long-term rollback/rollforward. L3: Reproducing Training. At any point in time, you can re-train and get the exact same model you had previously trained. This is a much stronger kind of versioning: it enables reproducibility as well as dealing with issues such as training data corruption.
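The first level on this slide (production/staging slots) can be pictured as two named pointers that get swapped. This is a minimal sketch; the class and the version strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ModelSlots:
    """Two deployment slots, each holding a model version identifier."""
    production: str
    staging: str

    def swap(self) -> None:
        # Promote staging to production; the old production model stays
        # available in the staging slot, so a second swap is the rollback.
        self.production, self.staging = self.staging, self.production
```

Because only one previous version is retained, this gives the very short-term rollback/rollforward the slide describes; pinning a model for months needs the second or third level.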
  • 20. Artefacts that need to be versioned (artefact: simple example; the slide also marks each one for Inference and/or Training) ● Model Hyper Parameters: Size of Layer N ● Featurizer Code: Input feature vector size ● Featurizer Data: Vocab ● Model Code: NN Architecture ● Model Config: Remove Stop Words? ● Model State: Model Weights ● Library Dependencies: PyTorch Version ● Hardware: V100 ● Training Config: Early Stopping Criteria ● Training Data: Data + Labels
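One way to track the artefacts listed on this slide together is a single manifest per model with a content-derived identifier. The sketch below is an assumption about how that might look: the field names mirror the slide, while every value (versions, hashes, paths) is a made-up placeholder.

```python
import hashlib
import json

# Hypothetical manifest; keys follow the slide, values are placeholders.
EXAMPLE_MANIFEST = {
    "model_hyper_parameters": {"layer_n_size": 256},
    "featurizer_code": "featurizer 2.1",
    "featurizer_data": {"vocab_sha256": "placeholder-vocab-hash"},
    "model_code": "term_classifier 4.0",
    "model_config": {"remove_stop_words": True},
    "model_state": {"weights_sha256": "placeholder-weights-hash"},
    "library_dependencies": {"torch": "1.5.0"},
    "hardware": "V100",
    "training_config": {"early_stopping_patience": 3},
    "training_data": {"snapshot": "placeholder-snapshot-id"},
}


def manifest_id(manifest: dict) -> str:
    """Stable identifier: SHA-256 of the manifest's canonical JSON form."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Sorting keys before hashing means two manifests with the same content always get the same identifier, and changing any single artefact (say, the PyTorch version) produces a new one.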
  • 21. Remember this pipeline? [Figure: pipeline graph; labels: Input, OCR, Text, Layout, BL, Entities, Classes, Relations, Structured Data, Output] Many many models! You need to version the aforementioned artefacts for every single node in this graph. That’s a lot of things to version!
  • 22. Some solutions (that don’t work) ● Let’s snapshot everything in a Docker image and store it forever > How do you hotfix the model? ● Let’s mark a “stable” production model and not deploy any future “staging” versions till they have been tested enough. > How do you make “breaking” changes to the code? ● Let’s always support only “latest” version and never commit a new version until we’re sure it’s good. > How do you iterate quickly?
  • 23. We evaluated some existing solutions It’s always better to not reinvent the wheel
  • 24. It’s a lot of work to move infrastructure The question is when, not if. Early stage startups need to ship and sell their product; it is hard to justify infrastructure plumbing till the flywheel turns. Instead of a full solution, these investments have paid off: 1. Versioning all model state during packaging 2. Versioning all data artefacts in our data store and making them immutable 3. Versioning all code explicitly by keeping stable interfaces and supporting minor/major version upgrades to model/featurizer code. 4. Pinning major versions of stable dependencies Remember: we are building a whole user facing application on top of this, prioritizing when to invest here is critical.
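Investments 2 and 3 on this slide can be sketched in a few lines: a store that refuses to overwrite a versioned artefact, and a major/minor compatibility check. The class, the function and the compatibility rule (same major, equal or newer minor) are assumptions for illustration, not Lexion's actual implementation.

```python
class ImmutableArtefactStore:
    """In-memory stand-in for a data store where a (name, version) is written once."""

    def __init__(self):
        self._items = {}

    def put(self, name: str, version: str, payload: bytes) -> None:
        key = (name, version)
        if key in self._items:
            # Immutability: changing content requires publishing a new version.
            raise ValueError(f"{name} {version} already exists")
        self._items[key] = payload

    def get(self, name: str, version: str) -> bytes:
        return self._items[(name, version)]


def is_compatible(required: str, available: str) -> bool:
    """Assumed rule: same major version, and available minor >= required minor."""
    req_major, req_minor = (int(part) for part in required.split("."))
    av_major, av_minor = (int(part) for part in available.split("."))
    return av_major == req_major and av_minor >= req_minor
```

Under this rule a model packaged against featurizer "2.1" would still load with featurizer "2.3", while "3.0" signals a breaking change that needs a re-packaged model.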
  • 25. BTW, all this ML is in addition to… ● Permissions ● Email alerts ● SSO ● End-user annotations ● Custom reporting ● Full text search ● Task management ● Custom fields ● Doc schemas ● APIs ● Integrations ● Bulk export ● Integrations ● Dashboards ● Pretty charts ● Bulk ingestion ● Security ● Audit trail … building a complete user facing application!
  • 26. A note on ML technical debt ● Identify when the cost of the debt > the cost of addressing it ● Incorporate cost of ML infrastructure in your business model ● Pick the right kind of technical debt, with a plan to get out ● Model versioning is one of the areas you might want to invest in early ● Getting a great model is just the first step of a long journey. You have to build a product customers love!
  • 27. Questions? Learn more at https://lexion.ai (we’re hiring!)