Emad Elwany - CTO, Lexion
Evolution of ML Infrastructure at an AI-First Startup
Rsqrd AI Meetup - May 2020
Agenda
● Lexion Overview
● Document Understanding Pipeline
● Evolution of ML Infrastructure at Lexion
● Deep Dive - Model Versioning
Lexion: Applying NLP to legal agreements
Creating this simple report could take weeks without automation.
It’s a complex NLP problem
● Messy PDFs make OCR non-trivial
● Long, multi-agreement documents
● Domain-specific language
● Complex schemas/ontologies
● Mix of non/semi/fully structured data
Sample: Identify Contract Term
Contract term is AUTO RENEW if, e.g.:
“will automatically renew for three year terms”
“shall continue on a month to month basis until terminated”
Contract term is FIXED if, e.g.:
“terminate effective April 1, 2007.”
“will continue until the 1 year anniversary”
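To make the two labels concrete, here is a toy keyword matcher in Python. It only illustrates the label scheme: the patterns and the `classify_term` name are assumptions made for this sketch, not Lexion's method, which the slides describe as a complex NLP problem handled by many models.

```python
import re

# Hypothetical patterns, chosen only to cover the four example phrases above.
AUTO_RENEW_PATTERNS = [
    r"automatically renew",
    r"month to month basis until terminated",
]
FIXED_PATTERNS = [
    r"terminate effective \w+ \d{1,2}, \d{4}",
    r"continue until the \d+ year anniversary",
]


def classify_term(clause: str) -> str:
    """Return AUTO RENEW, FIXED, or UNKNOWN for a single clause."""
    text = clause.lower()
    if any(re.search(p, text) for p in AUTO_RENEW_PATTERNS):
        return "AUTO RENEW"
    if any(re.search(p, text) for p in FIXED_PATTERNS):
        return "FIXED"
    return "UNKNOWN"
```

Rules like these break quickly on real agreements, which is one way to read the "complex NLP problem" slide.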
Document Understanding Pipeline
[Pipeline diagram: a graph of nodes running from Input to Output. Surviving labels: OCR, Text, Layout, Entities, Classes, Relations, BL, Structured Data.]
Many, many models!
Key Takeaway: Every node in this graph is a “model” (of hundreds), and the remainder of this talk applies to
each and every one of them.
Initial Goals (Pre-MVP)
● Evaluate technical feasibility: Can we build it?
● Evaluate business viability: Will they find it useful?
● Move very quickly: Can we ship it before we run out of money?
Use tools that are easy to
● Understand
● Setup
● Deploy
Steady state Goals (Post-MVP)
● Scale model development
● Scale model deployment
● Keep users happy at all times
Use tools that are easy to
● Integrate
● Configure
● Scale
Typical model lifecycle
Experience with ML in
research, applications,
and platforms:
Data
EARLY
● Finding the data
Scrapers/FOIA
● Cleaning the data
Scripting + Rules
● Annotating the data
Simple annotation tools
LATER
● Managing the data
Data Stores and Caches
● Protecting the data
Encryption and Access control
● Scaling annotation
Weakly/Unsupervised
Training
EARLY
Optimize for Speed of Results
Jupyter, Scripts
Goal: does it work?
LATER
Optimize for speed of Experimentation
Frameworks and metrics
Goal: make it the best!
Packaging
EARLY
Optimize for shipping the models
REST endpoint (online)
Batch script (offline)
LATER
Optimize for operationalizing the
model
Versioning of artefacts
Dependency management
Cost management
More on this a bit later...
Validate Model
EARLY
● Does it work well enough?
Simple high level metrics (F1, P, R etc.)
LATER
● Is it better?
● Why is it better?
● How is it better?
Much more rigor:
● Validation sets
● E2E tests
● More detailed metrics
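A minimal sketch of the step from "does it work well enough?" to "is it better, and how?". It computes P, R, and F1 for one label and lists the individual examples a new model broke or fixed. The function names and the per-example comparison are assumptions for illustration, not the validation tooling used in the talk.

```python
def precision_recall_f1(gold, predicted, positive):
    """Precision, recall, and F1 for one label treated as the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def regressions(gold, old_pred, new_pred):
    """Indices the old model got right and the new model gets wrong."""
    return [i for i, (g, o, n) in enumerate(zip(gold, old_pred, new_pred))
            if o == g and n != g]


def improvements(gold, old_pred, new_pred):
    """Indices the old model got wrong and the new model gets right."""
    return [i for i, (g, o, n) in enumerate(zip(gold, old_pred, new_pred))
            if o != g and n == g]
```

Aggregate metrics answer "is it better?"; the two index lists are a starting point for "why" and "how", since they point at concrete documents to inspect.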
Deployment
EARLY
Optimize for Speed of deployment
LATER
Optimize for Scale of deployment
● Inference time
● Priority vs. starvation
● Rapid update deployment
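The slide names "priority vs. starvation" without saying how it was handled. One common technique is aging, where a job's effective priority improves the longer it waits. The sketch below assumes that technique; `Job`, `next_job`, and the aging rate are hypothetical names and values.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int       # lower number = more urgent
    enqueued_at: float  # seconds, on any monotonic clock


def next_job(queue, now, aging_per_second=0.1):
    """Remove and return the job with the best effective priority.

    Waiting time is subtracted from the priority, so a low-priority job
    that has waited long enough eventually wins and cannot starve.
    """
    def effective(job):
        return job.priority - aging_per_second * (now - job.enqueued_at)

    job = min(queue, key=effective)
    queue.remove(job)
    return job
```

A linear scan keeps the sketch short; a production scheduler would likely use a different data structure.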
Monitor
EARLY
Bare minimum to ensure things are
working:
● High level E2E alert
LATER
Invest in monitoring all aspects of the
models:
● Detailed KPIs
● Model Drift
● User DSAT
Logging, Dashboards, Alerts
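As an illustration of a model drift signal, the sketch below compares the distribution of predicted labels in a recent window against a baseline window using total variation distance. The choice of metric, the 0.2 threshold, and the function names are assumptions for this example; the talk does not say how drift was measured.

```python
from collections import Counter


def label_distribution(labels):
    """Map each label to its relative frequency."""
    if not labels:
        raise ValueError("need at least one label")
    counts = Counter(labels)
    return {label: n / len(labels) for label, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions (0 to 1)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def drifted(baseline_labels, recent_labels, threshold=0.2):
    """True when recent predictions differ from the baseline by more than threshold."""
    distance = total_variation(label_distribution(baseline_labels),
                               label_distribution(recent_labels))
    return distance > threshold
```

A check like this can feed the dashboards and alerts listed above.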
Deep Dive: Model Versioning
Real life problems
● “We used to predict the right X on this document - when/why did it break?”
○ Usually accompanied by an alert or even worse: a user complaint.
● “The model we trained 2 months ago was so much better at Y - we can’t seem
to get the same performance. How do we roll back?”
○ Usually accompanied by a frustrated product manager / quality engineering.
● “I swear I got better results over the weekend for the same experiment, I don’t
know what changed!”
○ Usually accompanied by a confused data scientist.
But first: can you reproduce your model results to the 10th decimal place? If not, STOP!
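A minimal sketch of that check, using only the Python standard library: train the same toy model twice from the same seed and compare the parameters to ten decimal places. The toy model and function names are assumptions for illustration. Real training stacks have more sources of nondeterminism (the hardware and library dependencies discussed later), which a fixed seed alone does not remove.

```python
import random


def train_toy_model(seed, steps=200, learning_rate=0.1):
    """Fit y = w*x + b to noisy samples of y = 2x + 1 with plain SGD."""
    rng = random.Random(seed)  # all randomness flows from this one seed
    w, b = 0.0, 0.0
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)
        y = 2.0 * x + 1.0 + rng.gauss(0.0, 0.1)
        error = (w * x + b) - y
        w -= learning_rate * error * x
        b -= learning_rate * error
    return w, b


def same_to_10_decimals(params_a, params_b):
    """True when every parameter matches after rounding to 10 decimal places."""
    return all(round(a, 10) == round(b, 10) for a, b in zip(params_a, params_b))
```

If two runs with identical inputs fail this comparison, some input to training is not yet under version control.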
Wait… didn’t we solve this problem a long time ago?
Source control has been used for decades. How is this different?
Versioning ML models shares a lot with code versioning, e.g.:
But it also includes a lot more:
Code (*) Config
Library dependencies Topology
Training Data Training Parameters
Model State (weights, hyperparameters) Hardware
(*) Code is a lot of things in the context of ML models: data prep, libraries, models, featurizers, etc.
What exactly is Versioning for ML models?
L1: Production/Staging slots.
Allows very short-term rollback/rollforward.
L2: Reproducing Inference.
Once you have a trained model, this kind of versioning allows you to deterministically
reconstruct a model for inference. Allows pinning models for a long time as well as long-term
rollback/rollforward.
L3: Reproducing Training.
You can, at any point in time, retrain a model and get exactly the model you had
previously trained. This is a much stronger kind of versioning: it enables reproducibility as
well as dealing with issues such as training data corruption.
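The simplest level, L1, can be sketched as a pair of slots plus a record of the previous production version. `ModelSlots` and its methods are hypothetical names for this illustration, not an API from the talk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelSlots:
    production: str
    staging: Optional[str] = None
    previous: Optional[str] = None

    def promote(self) -> None:
        """Move the staging version into production (roll forward)."""
        if self.staging is None:
            raise ValueError("nothing in staging to promote")
        self.previous, self.production, self.staging = (
            self.production, self.staging, None)

    def rollback(self) -> None:
        """Restore the previous production version.

        The version being rolled back returns to staging, so it can be
        promoted again once it is fixed.
        """
        if self.previous is None:
            raise ValueError("no previous version to roll back to")
        self.staging, self.production, self.previous = (
            self.production, self.previous, None)
```

This only remembers one step of history, which matches the "very short-term" scope of L1; L2 and L3 need the full set of artefacts to be versioned.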
Artefacts that need to be versioned
Artefact: simple example (the slide also marks each artefact under Inference and/or Training)
● Model Hyper Parameters: Size of Layer N
● Featurizer Code: Input feature vector size
● Featurizer Data: Vocab
● Model Code: NN Architecture
● Model Config: Remove Stop Words?
● Model State: Model Weights
● Library Dependencies: PyTorch Version
● Hardware: V100
● Training Config: Early Stopping Criteria
● Training Data: Data + Labels
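One way to turn this list into a version identifier is to record every artefact in a manifest and hash it. The sketch below is an assumption-heavy illustration: every manifest value is a placeholder, the split between inference and training artefacts is a guess, and `fingerprint` is a hypothetical helper.

```python
import hashlib
import json

# Hypothetical manifest: every value below is a placeholder for illustration.
MANIFEST = {
    "inference": {
        "model_hyper_parameters": {"layer_n_size": 256},
        "featurizer_code": "featurizer==1.4",
        "featurizer_data": "vocab-sha256:placeholder",
        "model_code": "model==2.1",
        "model_config": {"remove_stop_words": True},
        "model_state": "weights-sha256:placeholder",
        "library_dependencies": {"torch": "1.5.0"},
        "hardware": "V100",
    },
    "training": {
        "training_config": {"early_stopping_patience": 3},
        "training_data": "dataset-sha256:placeholder",
    },
}


def fingerprint(obj) -> str:
    """Stable short hash of a JSON-serializable object (key order ignored)."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def inference_version(manifest) -> str:
    """Identifies everything needed to reproduce inference (L2)."""
    return fingerprint(manifest["inference"])


def training_version(manifest) -> str:
    """Identifies everything needed to reproduce training (L3)."""
    return fingerprint(manifest)
```

With this layout, changing only the early stopping criteria changes the training version but leaves the inference version untouched.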
Remember this pipeline?
[Pipeline diagram repeated: Input, OCR, Text, Layout, Entities, Classes, Relations, BL, Structured Data, Output.]
Many, many models!
You need to version the aforementioned artefacts for every single node in this graph. That’s a lot of things to
version!
Some solutions (that don’t work)
● Let’s snapshot everything in a Docker image and store it forever
> How do you hotfix the model?
● Let’s mark a “stable” production model and not deploy any future “staging”
versions till they have been tested enough.
> How do you make “breaking” changes to the code?
● Let’s always support only “latest” version and never commit a new version
until we’re sure it’s good.
> How do you iterate quickly?
We evaluated some existing solutions
It’s always better to not reinvent the wheel
It’s a lot of work to move infrastructure
The question is when, not if. Early-stage startups need to ship and sell their
product; it is hard to justify infrastructure plumbing until the flywheel turns.
Instead of a full solution, these investments have paid off:
1. Versioning all model state during packaging
2. Versioning all data artefacts in our data store and making them immutable
3. Versioning all code explicitly by keeping stable interfaces and supporting
minor/major version upgrades to model/featurizer code.
4. Pinning major versions of stable dependencies
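Investments 2 and 3 can be sketched in a few lines. The store below makes artefacts immutable by addressing them by content hash, and the compatibility check accepts minor upgrades within one major version. Both are assumptions about how such a scheme could look, with hypothetical names; the slides do not describe Lexion's implementation.

```python
import hashlib


class ImmutableArtefactStore:
    """In-memory store where the key is the SHA-256 of the content."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        # Same content always maps to the same key, so nothing is overwritten.
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def is_compatible(packaged_with: str, running: str) -> bool:
    """True when `running` code can load a model packaged with `packaged_with`.

    Assumed rule: same major version, and the running minor version is at
    least the packaged one. Versions are "major.minor" strings.
    """
    p_major, p_minor = (int(part) for part in packaged_with.split(".")[:2])
    r_major, r_minor = (int(part) for part in running.split(".")[:2])
    return p_major == r_major and r_minor >= p_minor
```

Changing an artefact means storing new bytes under a new key, so any model that pinned the old key keeps resolving to the old content.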
Remember: we are building a whole user facing application on top of this,
prioritizing when to invest here is critical.
BTW, all this ML is in addition to…
● Permissions
● Email alerts
● SSO
● End-user annotations
● Custom reporting
● Full text search
● Task management
● Custom fields
● Doc schemas
● APIs
● Integrations
● Bulk export
● Dashboards
● Pretty charts
● Bulk ingestion
● Security
● Audit trail
… building a complete user facing application!
A note on ML technical debt
● Identify when the cost of carrying the debt > the cost of addressing it
● Incorporate cost of ML infrastructure in your business model
● Pick the right kind of technical debt, with a plan to get out
● Model versioning is one of the areas you might want to invest in early
● Getting a great model is just the first step of a long journey. You have to build
a product customers love!
Questions?
Learn more at https://lexion.ai (we’re hiring!)

More Related Content

What's hot

Best Practices in Software Development
Best Practices in Software DevelopmentBest Practices in Software Development
Best Practices in Software Development
André Pitombeira
 
Test-drive development and Umple
Test-drive development and UmpleTest-drive development and Umple
Test-drive development and Umple
tylerjdmcconnell
 
C programming
C programmingC programming
C programming
Anurag Byala
 
03 the c language
03 the c language03 the c language
03 the c language
arafatmirza
 
Uses for scripting languages,web scripting in perl
Uses for scripting languages,web scripting in perlUses for scripting languages,web scripting in perl
Uses for scripting languages,web scripting in perl
sana mateen
 
SDL Trados Studio 2014 Masterclass
SDL Trados Studio 2014 MasterclassSDL Trados Studio 2014 Masterclass
SDL Trados Studio 2014 Masterclass
SDL Trados
 
mbeddr meets IncQuer - Combining the Best Features of Two Modeling Worlds
mbeddr meets IncQuer - Combining the Best Features of Two Modeling Worldsmbeddr meets IncQuer - Combining the Best Features of Two Modeling Worlds
mbeddr meets IncQuer - Combining the Best Features of Two Modeling Worlds
Istvan Rath
 
Ncrafts.io - Refactor your software architecture
Ncrafts.io - Refactor your software architectureNcrafts.io - Refactor your software architecture
Ncrafts.io - Refactor your software architecture
Julien Lavigne du Cadet
 
Documenting Code - Patterns and Anti-patterns - NLPW 2016
Documenting Code - Patterns and Anti-patterns - NLPW 2016Documenting Code - Patterns and Anti-patterns - NLPW 2016
Documenting Code - Patterns and Anti-patterns - NLPW 2016
Søren Lund
 
Bootstrapping in Compiler
Bootstrapping in CompilerBootstrapping in Compiler
Bootstrapping in Compiler
Akhil Kaushik
 
Beyond Unit Testing
Beyond Unit TestingBeyond Unit Testing
Beyond Unit Testing
Søren Lund
 
Chapter 10
Chapter 10 Chapter 10
LabVIEW: This Or That?
LabVIEW: This Or That?LabVIEW: This Or That?
LabVIEW: This Or That?
Chrisa T.S (Tzortzaki- Stratoudakis)
 
Lecture 29
Lecture 29Lecture 29
Lecture 29
Skillspire LLC
 
Ambiguous Requirements – Translating the message from C-level to implementation
Ambiguous Requirements – Translating the message from C-level to implementationAmbiguous Requirements – Translating the message from C-level to implementation
Ambiguous Requirements – Translating the message from C-level to implementation
Georgina Tilby
 
Introduction to Machine translation - AEM
Introduction to Machine translation - AEMIntroduction to Machine translation - AEM
Introduction to Machine translation - AEM
Vivek Sachdeva
 
Documenting code yapceu2016
Documenting code yapceu2016Documenting code yapceu2016
Documenting code yapceu2016
Søren Lund
 
Solid principles
Solid principlesSolid principles
Solid principles
Kumaresh Chandra Baruri
 
The Psychology of C# Analysis
The Psychology of C# AnalysisThe Psychology of C# Analysis
The Psychology of C# Analysis
Coverity
 
How to estimate the cost of a Maximo migration project with a high level of c...
How to estimate the cost of a Maximo migration project with a high level of c...How to estimate the cost of a Maximo migration project with a high level of c...
How to estimate the cost of a Maximo migration project with a high level of c...
Mariano Zelaya Feijoo
 

What's hot (20)

Best Practices in Software Development
Best Practices in Software DevelopmentBest Practices in Software Development
Best Practices in Software Development
 
Test-drive development and Umple
Test-drive development and UmpleTest-drive development and Umple
Test-drive development and Umple
 
C programming
C programmingC programming
C programming
 
03 the c language
03 the c language03 the c language
03 the c language
 
Uses for scripting languages,web scripting in perl
Uses for scripting languages,web scripting in perlUses for scripting languages,web scripting in perl
Uses for scripting languages,web scripting in perl
 
SDL Trados Studio 2014 Masterclass
SDL Trados Studio 2014 MasterclassSDL Trados Studio 2014 Masterclass
SDL Trados Studio 2014 Masterclass
 
mbeddr meets IncQuer - Combining the Best Features of Two Modeling Worlds
mbeddr meets IncQuer - Combining the Best Features of Two Modeling Worldsmbeddr meets IncQuer - Combining the Best Features of Two Modeling Worlds
mbeddr meets IncQuer - Combining the Best Features of Two Modeling Worlds
 
Ncrafts.io - Refactor your software architecture
Ncrafts.io - Refactor your software architectureNcrafts.io - Refactor your software architecture
Ncrafts.io - Refactor your software architecture
 
Documenting Code - Patterns and Anti-patterns - NLPW 2016
Documenting Code - Patterns and Anti-patterns - NLPW 2016Documenting Code - Patterns and Anti-patterns - NLPW 2016
Documenting Code - Patterns and Anti-patterns - NLPW 2016
 
Bootstrapping in Compiler
Bootstrapping in CompilerBootstrapping in Compiler
Bootstrapping in Compiler
 
Beyond Unit Testing
Beyond Unit TestingBeyond Unit Testing
Beyond Unit Testing
 
Chapter 10
Chapter 10 Chapter 10
Chapter 10
 
LabVIEW: This Or That?
LabVIEW: This Or That?LabVIEW: This Or That?
LabVIEW: This Or That?
 
Lecture 29
Lecture 29Lecture 29
Lecture 29
 
Ambiguous Requirements – Translating the message from C-level to implementation
Ambiguous Requirements – Translating the message from C-level to implementationAmbiguous Requirements – Translating the message from C-level to implementation
Ambiguous Requirements – Translating the message from C-level to implementation
 
Introduction to Machine translation - AEM
Introduction to Machine translation - AEMIntroduction to Machine translation - AEM
Introduction to Machine translation - AEM
 
Documenting code yapceu2016
Documenting code yapceu2016Documenting code yapceu2016
Documenting code yapceu2016
 
Solid principles
Solid principlesSolid principles
Solid principles
 
The Psychology of C# Analysis
The Psychology of C# AnalysisThe Psychology of C# Analysis
The Psychology of C# Analysis
 
How to estimate the cost of a Maximo migration project with a high level of c...
How to estimate the cost of a Maximo migration project with a high level of c...How to estimate the cost of a Maximo migration project with a high level of c...
How to estimate the cost of a Maximo migration project with a high level of c...
 

Similar to Rsqrd AI: ML Tooling at an AI-first Startup

Python for Data Logistics
Python for Data LogisticsPython for Data Logistics
Python for Data Logistics
Ken Farmer
 
Sanjaykumar Kakaso Mane_MAY2016
Sanjaykumar Kakaso Mane_MAY2016Sanjaykumar Kakaso Mane_MAY2016
Sanjaykumar Kakaso Mane_MAY2016
Sanjay Mane
 
Software development life cycle
Software development life cycleSoftware development life cycle
Software development life cycle
Nishant Srivastava
 
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning Infrastructure
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning InfrastructureML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning Infrastructure
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning Infrastructure
Fei Chen
 
VidyaBhooshanMishra_CV
VidyaBhooshanMishra_CVVidyaBhooshanMishra_CV
VidyaBhooshanMishra_CV
Landis+Gyr
 
[DSC Europe 22] Engineers guide for shepherding models in to production - Mar...
[DSC Europe 22] Engineers guide for shepherding models in to production - Mar...[DSC Europe 22] Engineers guide for shepherding models in to production - Mar...
[DSC Europe 22] Engineers guide for shepherding models in to production - Mar...
DataScienceConferenc1
 
The Design, Evolution and Use of KernelF
The Design, Evolution and Use of KernelFThe Design, Evolution and Use of KernelF
The Design, Evolution and Use of KernelF
Markus Voelter
 
SudhanshuKumar
SudhanshuKumarSudhanshuKumar
SudhanshuKumar
Sudhanshu Kumar
 
Prasad Rompalli latest Resume
Prasad Rompalli latest ResumePrasad Rompalli latest Resume
Prasad Rompalli latest Resume
Rsv Prasad
 
Mannu_Kumar_CV
Mannu_Kumar_CVMannu_Kumar_CV
Mannu_Kumar_CV
Mannu Kumar
 
Software Development Standard Operating Procedure
Software Development Standard Operating Procedure Software Development Standard Operating Procedure
Software Development Standard Operating Procedure
rupeshchanchal
 
Slides-Артем Коваль-Cloud-Native MLOps Framework - DataFest 2021.pdf
Slides-Артем Коваль-Cloud-Native MLOps Framework - DataFest 2021.pdfSlides-Артем Коваль-Cloud-Native MLOps Framework - DataFest 2021.pdf
Slides-Артем Коваль-Cloud-Native MLOps Framework - DataFest 2021.pdf
vitm11
 
OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)
OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)
OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)
Brian Brazil
 
The working architecture of NodeJS applications, Виктор Турский
The working architecture of NodeJS applications, Виктор ТурскийThe working architecture of NodeJS applications, Виктор Турский
The working architecture of NodeJS applications, Виктор Турский
Sigma Software
 
The working architecture of node js applications open tech week javascript ...
The working architecture of node js applications   open tech week javascript ...The working architecture of node js applications   open tech week javascript ...
The working architecture of node js applications open tech week javascript ...
Viktor Turskyi
 
Prasad Rompalli latest Resume
Prasad Rompalli latest ResumePrasad Rompalli latest Resume
Prasad Rompalli latest Resume
Rsv Prasad
 
ChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps ProductivityChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps Productivity
VictorSzoltysek
 
Best Practices with OLAP Modeling with Cognos Transformer (Cognos 8)
Best Practices with OLAP Modeling with Cognos Transformer (Cognos 8)Best Practices with OLAP Modeling with Cognos Transformer (Cognos 8)
Best Practices with OLAP Modeling with Cognos Transformer (Cognos 8)
Senturus
 
Shivaprasada_Kodoth
Shivaprasada_KodothShivaprasada_Kodoth
Shivaprasada_Kodoth
Shivaprasada Kodoth
 
Advanced web application architecture - Talk
Advanced web application architecture - TalkAdvanced web application architecture - Talk
Advanced web application architecture - Talk
Matthias Noback
 

Similar to Rsqrd AI: ML Tooling at an AI-first Startup (20)

Python for Data Logistics
Python for Data LogisticsPython for Data Logistics
Python for Data Logistics
 
Sanjaykumar Kakaso Mane_MAY2016
Sanjaykumar Kakaso Mane_MAY2016Sanjaykumar Kakaso Mane_MAY2016
Sanjaykumar Kakaso Mane_MAY2016
 
Software development life cycle
Software development life cycleSoftware development life cycle
Software development life cycle
 
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning Infrastructure
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning InfrastructureML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning Infrastructure
ML Platform Q1 Meetup: Airbnb's End-to-End Machine Learning Infrastructure
 
VidyaBhooshanMishra_CV
VidyaBhooshanMishra_CVVidyaBhooshanMishra_CV
VidyaBhooshanMishra_CV
 
[DSC Europe 22] Engineers guide for shepherding models in to production - Mar...
[DSC Europe 22] Engineers guide for shepherding models in to production - Mar...[DSC Europe 22] Engineers guide for shepherding models in to production - Mar...
[DSC Europe 22] Engineers guide for shepherding models in to production - Mar...
 
The Design, Evolution and Use of KernelF
The Design, Evolution and Use of KernelFThe Design, Evolution and Use of KernelF
The Design, Evolution and Use of KernelF
 
SudhanshuKumar
SudhanshuKumarSudhanshuKumar
SudhanshuKumar
 
Prasad Rompalli latest Resume
Prasad Rompalli latest ResumePrasad Rompalli latest Resume
Prasad Rompalli latest Resume
 
Mannu_Kumar_CV
Mannu_Kumar_CVMannu_Kumar_CV
Mannu_Kumar_CV
 
Software Development Standard Operating Procedure
Software Development Standard Operating Procedure Software Development Standard Operating Procedure
Software Development Standard Operating Procedure
 
Slides-Артем Коваль-Cloud-Native MLOps Framework - DataFest 2021.pdf
Slides-Артем Коваль-Cloud-Native MLOps Framework - DataFest 2021.pdfSlides-Артем Коваль-Cloud-Native MLOps Framework - DataFest 2021.pdf
Slides-Артем Коваль-Cloud-Native MLOps Framework - DataFest 2021.pdf
 
OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)
OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)
OpenMetrics: What Does It Mean for You (PromCon 2019, Munich)
 
The working architecture of NodeJS applications, Виктор Турский
The working architecture of NodeJS applications, Виктор ТурскийThe working architecture of NodeJS applications, Виктор Турский
The working architecture of NodeJS applications, Виктор Турский
 
The working architecture of node js applications open tech week javascript ...
The working architecture of node js applications   open tech week javascript ...The working architecture of node js applications   open tech week javascript ...
The working architecture of node js applications open tech week javascript ...
 
Prasad Rompalli latest Resume
Prasad Rompalli latest ResumePrasad Rompalli latest Resume
Prasad Rompalli latest Resume
 
ChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps ProductivityChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps Productivity
 
Best Practices with OLAP Modeling with Cognos Transformer (Cognos 8)
Best Practices with OLAP Modeling with Cognos Transformer (Cognos 8)Best Practices with OLAP Modeling with Cognos Transformer (Cognos 8)
Best Practices with OLAP Modeling with Cognos Transformer (Cognos 8)
 
Shivaprasada_Kodoth
Shivaprasada_KodothShivaprasada_Kodoth
Shivaprasada_Kodoth
 
Advanced web application architecture - Talk
Advanced web application architecture - TalkAdvanced web application architecture - Talk
Advanced web application architecture - Talk
 

More from Sanjana Chowdhury

Rsqrd AI: Making Conversational AI Work for Everybody
Rsqrd AI: Making Conversational AI Work for EverybodyRsqrd AI: Making Conversational AI Work for Everybody
Rsqrd AI: Making Conversational AI Work for Everybody
Sanjana Chowdhury
 
Rsqrd AI: Application of Explanation Model in Healthcare
Rsqrd AI: Application of Explanation Model in HealthcareRsqrd AI: Application of Explanation Model in Healthcare
Rsqrd AI: Application of Explanation Model in Healthcare
Sanjana Chowdhury
 
Rsqrd AI: Recent Advances in Explainable Machine Learning Research
Rsqrd AI: Recent Advances in Explainable Machine Learning ResearchRsqrd AI: Recent Advances in Explainable Machine Learning Research
Rsqrd AI: Recent Advances in Explainable Machine Learning Research
Sanjana Chowdhury
 
Rsqrd AI: Incorporating Priors with Feature Attribution on Text Classification
Rsqrd AI: Incorporating Priors with Feature Attribution on Text ClassificationRsqrd AI: Incorporating Priors with Feature Attribution on Text Classification
Rsqrd AI: Incorporating Priors with Feature Attribution on Text Classification
Sanjana Chowdhury
 
Rsqrd AI: Discovering Natural Bugs Using Adversarial Perturbations
Rsqrd AI: Discovering Natural Bugs Using Adversarial PerturbationsRsqrd AI: Discovering Natural Bugs Using Adversarial Perturbations
Rsqrd AI: Discovering Natural Bugs Using Adversarial Perturbations
Sanjana Chowdhury
 
Rsqrd AI: A Survey of The Current Ecosystem of Explainability Techniques
Rsqrd AI: A Survey of The Current Ecosystem of Explainability TechniquesRsqrd AI: A Survey of The Current Ecosystem of Explainability Techniques
Rsqrd AI: A Survey of The Current Ecosystem of Explainability Techniques
Sanjana Chowdhury
 
Rsqrd AI: Explaining ML Models w/ Geometric Intuition
Rsqrd AI: Explaining ML Models w/ Geometric IntuitionRsqrd AI: Explaining ML Models w/ Geometric Intuition
Rsqrd AI: Explaining ML Models w/ Geometric Intuition
Sanjana Chowdhury
 
Rsqrd AI: Errudite- Scalable, Reproducible, and Testable Error Analysis
Rsqrd AI: Errudite- Scalable, Reproducible, and Testable Error AnalysisRsqrd AI: Errudite- Scalable, Reproducible, and Testable Error Analysis
Rsqrd AI: Errudite- Scalable, Reproducible, and Testable Error Analysis
Sanjana Chowdhury
 
Rsqrd AI: Exploring Machine Learning Model Predictions
Rsqrd AI: Exploring Machine Learning Model PredictionsRsqrd AI: Exploring Machine Learning Model Predictions
Rsqrd AI: Exploring Machine Learning Model Predictions
Sanjana Chowdhury
 
Rsqrd AI: Zestimates and Zillow AI Platform
Rsqrd AI: Zestimates and Zillow AI PlatformRsqrd AI: Zestimates and Zillow AI Platform
Rsqrd AI: Zestimates and Zillow AI Platform
Sanjana Chowdhury
 
Rsqrd AI: From R&D to ROI of AI
Rsqrd AI: From R&D to ROI of AIRsqrd AI: From R&D to ROI of AI
Rsqrd AI: From R&D to ROI of AI
Sanjana Chowdhury
 
Rsqrd AI: How to Design a Reliable and Reproducible Pipeline
Rsqrd AI: How to Design a Reliable and Reproducible PipelineRsqrd AI: How to Design a Reliable and Reproducible Pipeline
Rsqrd AI: How to Design a Reliable and Reproducible Pipeline
Sanjana Chowdhury
 

More from Sanjana Chowdhury (12)

Rsqrd AI: Making Conversational AI Work for Everybody
Rsqrd AI: Making Conversational AI Work for EverybodyRsqrd AI: Making Conversational AI Work for Everybody
Rsqrd AI: Making Conversational AI Work for Everybody
 
Rsqrd AI: Application of Explanation Model in Healthcare
Rsqrd AI: Application of Explanation Model in HealthcareRsqrd AI: Application of Explanation Model in Healthcare
Rsqrd AI: Application of Explanation Model in Healthcare
 
Rsqrd AI: Recent Advances in Explainable Machine Learning Research
Rsqrd AI: Recent Advances in Explainable Machine Learning ResearchRsqrd AI: Recent Advances in Explainable Machine Learning Research
Rsqrd AI: Recent Advances in Explainable Machine Learning Research
 
Rsqrd AI: Incorporating Priors with Feature Attribution on Text Classification
Rsqrd AI: Incorporating Priors with Feature Attribution on Text ClassificationRsqrd AI: Incorporating Priors with Feature Attribution on Text Classification
Rsqrd AI: Incorporating Priors with Feature Attribution on Text Classification
 
Rsqrd AI: Discovering Natural Bugs Using Adversarial Perturbations
Rsqrd AI: Discovering Natural Bugs Using Adversarial PerturbationsRsqrd AI: Discovering Natural Bugs Using Adversarial Perturbations
Rsqrd AI: Discovering Natural Bugs Using Adversarial Perturbations
 
Rsqrd AI: A Survey of The Current Ecosystem of Explainability Techniques
Rsqrd AI: A Survey of The Current Ecosystem of Explainability TechniquesRsqrd AI: A Survey of The Current Ecosystem of Explainability Techniques
Rsqrd AI: A Survey of The Current Ecosystem of Explainability Techniques
 
Rsqrd AI: Explaining ML Models w/ Geometric Intuition
Rsqrd AI: Explaining ML Models w/ Geometric IntuitionRsqrd AI: Explaining ML Models w/ Geometric Intuition
Rsqrd AI: Explaining ML Models w/ Geometric Intuition
 
Rsqrd AI: Errudite- Scalable, Reproducible, and Testable Error Analysis
Rsqrd AI: Errudite- Scalable, Reproducible, and Testable Error AnalysisRsqrd AI: Errudite- Scalable, Reproducible, and Testable Error Analysis
Rsqrd AI: Errudite- Scalable, Reproducible, and Testable Error Analysis
 
Rsqrd AI: Exploring Machine Learning Model Predictions
Rsqrd AI: Exploring Machine Learning Model PredictionsRsqrd AI: Exploring Machine Learning Model Predictions
Rsqrd AI: Exploring Machine Learning Model Predictions
 
Rsqrd AI: Zestimates and Zillow AI Platform
Rsqrd AI: Zestimates and Zillow AI PlatformRsqrd AI: Zestimates and Zillow AI Platform
Rsqrd AI: Zestimates and Zillow AI Platform
 
Rsqrd AI: From R&D to ROI of AI
Rsqrd AI: From R&D to ROI of AIRsqrd AI: From R&D to ROI of AI
Rsqrd AI: From R&D to ROI of AI
 
Rsqrd AI: How to Design a Reliable and Reproducible Pipeline
Rsqrd AI: How to Design a Reliable and Reproducible PipelineRsqrd AI: How to Design a Reliable and Reproducible Pipeline
Rsqrd AI: How to Design a Reliable and Reproducible Pipeline
 

Recently uploaded

Integrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecaseIntegrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecase
shyamraj55
 
Vertex AI Agent Builder - GDG Alicante - Julio 2024
Vertex AI Agent Builder - GDG Alicante - Julio 2024Vertex AI Agent Builder - GDG Alicante - Julio 2024
Vertex AI Agent Builder - GDG Alicante - Julio 2024
Nicolás Lopéz
 
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
sunilverma7884
 
It's your unstructured data: How to get your GenAI app to production (and spe...
It's your unstructured data: How to get your GenAI app to production (and spe...It's your unstructured data: How to get your GenAI app to production (and spe...
It's your unstructured data: How to get your GenAI app to production (and spe...
Zilliz
 
MAKE MONEY ONLINE Unlock Your Income Potential Today.pptx
MAKE MONEY ONLINE Unlock Your Income Potential Today.pptxMAKE MONEY ONLINE Unlock Your Income Potential Today.pptx
MAKE MONEY ONLINE Unlock Your Income Potential Today.pptx
janagijoythi
 
Retrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with RagasRetrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with Ragas
Zilliz
 
UX Webinar Series: Drive Revenue and Decrease Costs with Passkeys for Consume...
UX Webinar Series: Drive Revenue and Decrease Costs with Passkeys for Consume...UX Webinar Series: Drive Revenue and Decrease Costs with Passkeys for Consume...
UX Webinar Series: Drive Revenue and Decrease Costs with Passkeys for Consume...
FIDO Alliance
 
Acumatica vs. Sage Intacct _Construction_July (1).pptx
Acumatica vs. Sage Intacct _Construction_July (1).pptxAcumatica vs. Sage Intacct _Construction_July (1).pptx
Acumatica vs. Sage Intacct _Construction_July (1).pptx
BrainSell Technologies
 
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
shanihomely
 
LeadMagnet IQ Review: Unlock the Secret to Effortless Traffic and Leads.pdf
LeadMagnet IQ Review:  Unlock the Secret to Effortless Traffic and Leads.pdfLeadMagnet IQ Review:  Unlock the Secret to Effortless Traffic and Leads.pdf
Rsqrd AI: ML Tooling at an AI-first Startup

  • 1. Emad Elwany - CTO, Lexion Evolution of ML Infrastructure at an AI-First Startup Rsqrd AI Meetup - May 2020
  • 2. Agenda ● Lexion Overview ● Document Understanding Pipeline ● Evolution of ML Infrastructure at Lexion ● Deep Dive - Model Versioning
  • 3. Lexion: Applying NLP to legal agreements Creating this simple report could take weeks without automation.
  • 4. It’s a complex NLP problem ● Messy PDFs make OCR non-trivial ● Long, multi-agreement documents ● Domain specific language ● Complex schemas/ontologies ● Mix of non/semi/fully structured data
  • 5. Sample: Identify Contract Term Contract term is AUTO RENEW if, e.g.: “will automatically renew for three year terms” “shall continue on a month to month basis until terminated” Contract term is FIXED if, e.g.: “terminate effective April 1, 2007.” “will continue until the 1 year anniversary”
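The AUTO RENEW / FIXED distinction above can be illustrated with a toy keyword rule. This is only a sketch: the talk describes learned models, and the patterns and the `classify_term` function below are hypothetical, written to cover just the four quoted phrases. Rules like these break on paraphrases, which is part of why the problem is hard.

```python
import re

# Hypothetical patterns, written only to cover the four phrases quoted on the slide.
AUTO_RENEW_PATTERNS = [
    r"automatically renew",
    r"continue on a .* basis until terminated",
]
FIXED_PATTERNS = [
    r"terminate effective",
    r"continue until the .* anniversary",
]


def classify_term(clause: str) -> str:
    """Return AUTO_RENEW, FIXED, or UNKNOWN for a single clause."""
    text = clause.lower()
    if any(re.search(p, text) for p in AUTO_RENEW_PATTERNS):
        return "AUTO_RENEW"
    if any(re.search(p, text) for p in FIXED_PATTERNS):
        return "FIXED"
    return "UNKNOWN"


print(classify_term("will automatically renew for three year terms"))
print(classify_term("terminate effective April 1, 2007."))
```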
  • 6. Document Understanding Pipeline [Pipeline diagram; node labels: Input, OCR, Output, BL, Entities, Classes, Relations, Text, Layout, Structured Data] Many many models! Key Takeaway: Every node in this graph is a “model” (of hundreds), and the remainder of this talk applies to each and every one of them.
  • 7. Initial Goals (Pre-MVP) ● Evaluate technical feasibility: Can we build it? ● Evaluate business viability: Will they find it useful? ● Move very quickly: Can we ship it before we run out of money? Use tools that are easy to ● Understand ● Setup ● Deploy
  • 8. Steady state Goals (Post-MVP) ● Scale model development ● Scale model deployment ● Keep users happy at all times Use tools that are easy to ● Integrate ● Configure ● Scale
  • 9. Typical model lifecycle Experience with ML in research, applications, and platforms:
  • 10. Data EARLY ● Finding the data Scrapers/FOIA ● Cleaning the data Scripting + Rules ● Annotating the data Simple annotation tools LATER ● Managing the data Data Stores and Caches ● Protecting the data Encryption and Access control ● Scaling annotation Weakly/Unsupervised
  • 11. Training EARLY Optimize for Speed of Results Jupyter, Scripts Goal: does it work? LATER Optimize for speed of Experimentation Frameworks and metrics Goal: make it the best!
  • 12. Packaging EARLY Optimize for shipping the models REST endpoint (online) Batch script (offline) LATER Optimize for operationalizing the model Versioning of artefacts Dependency management Cost management More on this a bit later...
  • 13. Validate Model EARLY ● Does it work well enough? Simple high level metrics (F1, P, R etc.) LATER ● Is it better? ● Why is it better? ● How is it better? Much more rigor: ● Validation sets ● E2E tests ● More detailed metrics
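For reference, the high-level metrics named above (P, R, F1) can be computed for one class as follows. This is a minimal sketch with a hypothetical function name and toy labels, not the evaluation code used in the talk.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall, and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels (made up for illustration).
gold = ["AUTO_RENEW", "AUTO_RENEW", "FIXED", "FIXED"]
pred = ["AUTO_RENEW", "FIXED", "AUTO_RENEW", "FIXED"]
print(precision_recall_f1(gold, pred, positive="AUTO_RENEW"))
```

A single aggregate score answers "does it work well enough?"; the later-stage questions on the slide ("why" and "how" is it better) need per-class and per-slice breakdowns of the same counts.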
  • 14. Deployment EARLY Optimize for Speed of deployment LATER Optimize for Scale of deployment ● Inference time ● Priority vs. starvation ● Rapid update deployment
  • 15. Monitor EARLY Bare minimum to ensure things are working: ● High level E2E alert LATER Invest in monitoring all aspects of the models: ● Detailed KPIs ● Model Drift ● User DSAT Logging, Dashboards, Alerts
  • 16. Deep Dive: Model Versioning
  • 17. Real life problems ● “We used to predict the right X on this document - when/why did it break?” ○ Usually accompanied by an alert or even worse: a user complaint. ● “The model we trained 2 months ago was so much better at Y - we can’t seem to get the same performance. How do we roll back?” ○ Usually accompanied by a frustrated product manager / quality engineering. ● “I swear I got better results over the weekend for the same experiment, I don’t know what changed!” ○ Usually accompanied by a confused data scientist. But first: can you reproduce your model results to the 10th decimal place? If not, STOP!
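The "10th decimal place" check can be sketched as follows, assuming the only source of randomness is a seeded generator. The toy model and the names `train_toy_model` and `reproduces` are hypothetical. A real setup would also need to seed every library it uses (for example with `torch.manual_seed` in PyTorch) and control non-deterministic hardware operations.

```python
import random


def train_toy_model(seed: int, steps: int = 200) -> float:
    """Fit y = 2x + 1 with SGD and return the final mean squared error."""
    rng = random.Random(seed)  # local RNG, so no hidden global state
    data = [(x, 2.0 * x + 1.0) for x in range(10)]
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)  # seeded initialization
    for _ in range(steps):
        x, y = rng.choice(data)  # seeded sampling order
        err = (w * x + b) - y
        w -= 0.005 * err * x
        b -= 0.005 * err
    return sum(((w * x + b) - y) ** 2 for x, y in data) / len(data)


def reproduces(a: float, b: float, places: int = 10) -> bool:
    """True if two metric values agree to the given number of decimal places."""
    return round(a, places) == round(b, places)


# Two runs with the same seed should agree to the 10th decimal place.
print(reproduces(train_toy_model(seed=7), train_toy_model(seed=7)))
```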
  • 18. Wait… didn’t we solve this problem a long time ago? Source control has been used for decades. How is this different? Versioning ML models shares a lot with code versioning, e.g.: But it also includes a lot more: Code (*) Config Library dependencies Topology Training Data Training Parameters Model State (weights, hyperparameters) Hardware (*) Code is a lot of things in the context of ML models: it’s data prep, libraries, models, featurizers etc.
  • 19. What exactly is Versioning for ML models? L1: Production/Staging slots. Allows very short-term rollback/rollforward. L2: Reproducing Inference. Once you have a trained model, this kind of versioning allows you to deterministically reconstruct a model for inference. Allows pinning models for a long time as well as long-term rollback/rollforward. L3: Reproducing Training. At any point in time, you can re-train and obtain the exact same model you had previously trained. This is a much stronger kind of versioning: it enables reproducibility as well as dealing with issues such as training data corruption.
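The simplest level, L1, can be sketched as two named slots that are swapped. The `ModelSlots` class and the version strings are hypothetical; this illustrates the idea of short-term rollback/rollforward, not Lexion's deployment code.

```python
class ModelSlots:
    """Two deployment slots: swapping rolls forward, swapping again rolls back."""

    def __init__(self, production: str):
        self.production = production
        self.staging = None

    def stage(self, version: str) -> None:
        """Place a candidate version in the staging slot."""
        self.staging = version

    def swap(self) -> None:
        """Exchange the staging and production versions."""
        if self.staging is None:
            raise RuntimeError("nothing staged to swap in")
        self.production, self.staging = self.staging, self.production


slots = ModelSlots(production="v1")
slots.stage("v2")
slots.swap()  # roll forward: v2 serves traffic, v1 is kept in staging
slots.swap()  # roll back: v1 serves traffic again
print(slots.production)
```

Only the most recent pair of versions is recoverable this way, which is why L2 and L3 need the artefacts themselves to be versioned.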
  • 20. Artefacts that need to be versioned (columns: artefact, simple example, Inference, Training) ● Model Hyper Parameters: Size of Layer N ● Featurizer Code: Input feature vector size ● Featurizer Data: Vocab ● Model Code: NN Architecture ● Model Config: Remove Stop Words? ● Model State: Model Weights ● Library Dependencies: PyTorch Version ● Hardware: V100 ● Training Config: Early Stopping Criteria ● Training Data: Data + Labels
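One way to pin a set of artefacts like these is a manifest of content hashes, sketched below. The function names, artefact keys, and byte payloads are hypothetical stand-ins; the talk does not say this is how Lexion stores versions.

```python
import hashlib
import json


def fingerprint(payload: bytes) -> str:
    """Content hash of one artefact."""
    return hashlib.sha256(payload).hexdigest()


def build_manifest(artefacts: dict) -> dict:
    """Hash each artefact, then hash the sorted hashes into one version id."""
    entries = {name: fingerprint(blob) for name, blob in sorted(artefacts.items())}
    canonical = json.dumps(entries, sort_keys=True).encode("utf-8")
    return {"artefacts": entries, "version_id": fingerprint(canonical)[:12]}


# Stand-in payloads; real artefacts would be files read as bytes.
manifest = build_manifest({
    "model_state": b"weights-bytes",
    "featurizer_data": b"vocab-bytes",
    "model_config": b'{"remove_stop_words": true}',
    "library_dependencies": b"pinned-requirements-bytes",
})
print(manifest["version_id"])
```

Changing any single artefact changes the version id, so a pinned id identifies the exact combination that produced a prediction.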
  • 21. Remember this pipeline? [Pipeline diagram; node labels: Input, OCR, Output, BL, Entities, Classes, Relations, Text, Layout, Structured Data] Many many models! You need to version the aforementioned artefacts for every single node in this graph. That’s a lot of things to version!
  • 22. Some solutions (that don’t work) ● Let’s snapshot everything in a Docker image and store it forever > How do you hotfix the model? ● Let’s mark a “stable” production model and not deploy any future “staging” versions till they have been tested enough. > How do you make “breaking” changes to the code? ● Let’s always support only “latest” version and never commit a new version until we’re sure it’s good. > How do you iterate quickly?
  • 23. We evaluated some existing solutions It’s always better to not reinvent the wheel
  • 24. It’s a lot of work to move infrastructure The question is when, not if. Early-stage startups need to ship and sell their product; it is hard to justify infrastructure plumbing till the flywheel turns. Instead of a full solution, these investments have paid off: 1. Versioning all model state during packaging 2. Versioning all data artefacts in our data store and making them immutable 3. Versioning all code explicitly by keeping stable interfaces and supporting minor/major version upgrades to model/featurizer code. 4. Pinning major versions of stable dependencies Remember: we are building a whole user facing application on top of this, prioritizing when to invest here is critical.
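Investments 2 and 3 can be sketched as a write-once artefact store plus a major/minor compatibility check. The `ImmutableStore` class, the `is_compatible` function, and the "major.minor" version format are assumptions made for illustration; the talk does not describe the actual implementation.

```python
class ImmutableStore:
    """Write-once store: a (name, version) pair can never be overwritten."""

    def __init__(self):
        self._blobs = {}

    def put(self, name: str, version: str, blob: bytes) -> None:
        key = (name, version)
        if key in self._blobs:
            raise ValueError(f"{name}@{version} already exists and is immutable")
        self._blobs[key] = bytes(blob)

    def get(self, name: str, version: str) -> bytes:
        return self._blobs[(name, version)]


def is_compatible(required: str, available: str) -> bool:
    """Same major version, and at least the required minor version."""
    req_major, req_minor = (int(part) for part in required.split("."))
    has_major, has_minor = (int(part) for part in available.split("."))
    return has_major == req_major and has_minor >= req_minor


store = ImmutableStore()
store.put("vocab", "1.0", b"a b c")
print(is_compatible(required="2.1", available="2.3"))  # minor upgrade
print(is_compatible(required="2.1", available="3.0"))  # major (breaking) upgrade
```

Under this convention a model pinned to featurizer 2.1 keeps working after a minor upgrade, while a breaking change has to ship as a new major version alongside the old one.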
  • 25. BTW, all this ML is in addition to… ● Permissions ● Email alerts ● SSO ● End-user annotations ● Custom reporting ● Full text search ● Task management ● Custom fields ● Doc schemas ● APIs ● Integrations ● Bulk export ● Dashboards ● Pretty charts ● Bulk ingestion ● Security ● Audit trail … building a complete user facing application!
  • 26. A note on ML technical debt ● Identify when the cost of the debt > the cost of addressing the debt ● Incorporate cost of ML infrastructure in your business model ● Pick the right kind of technical debt, with a plan to get out ● Model versioning is one of the areas you might want to invest in early ● Getting a great model is just the first step of a long journey. You have to build a product customers love!
  • 27. Questions? Learn more at https://lexion.ai (we’re hiring!)