The past decade has seen tremendous growth in production deployments of machine learning algorithms across applications such as targeted advertising, self-driving cars, speech translation, and medical diagnosis [1]. In these contexts, models make key decisions: predicting the likelihood that a person will commit a future crime, assessing trustworthiness for loan approval, producing medical diagnoses, etc. [2]. Bias based on gender, geographical location, race, and other attributes, along with its negative consequences, has been uncovered in several of these deployments [3], [4]. Industries and governments are reacting, enacting regulations requiring that decisions made by machine learning models be interpretable/explainable [5].
Explainability across the full range of ML and DL algorithms is an unsolved research problem, with many innovations over the last several years and entire conferences devoted to the topic. However, even simple explainability solutions that are considered established in development (training) environments run into additional difficulties when put into live production.
Our design pattern uses a well-known technique for explainability: the Canary model (sometimes called a Surrogate model) [6], [7]. In this approach, a classically non-explainable technique, such as a neural network, is paired with an explainable model that approximates its predictions, such as a decision tree. As long as the predictions match, the Canary model's behavior can be used to provide human-understandable reasoning for the prediction.
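A minimal sketch of the Canary pairing, assuming scikit-learn and synthetic data (the model choices and sizes here are illustrative, not the authors' exact setup): the explainable canary is fitted to the primary model's predictions rather than the raw labels, so it approximates the primary's decision behavior, and agreement between the two is then measured.

```python
# Sketch: pair a non-explainable primary model (MLP) with an
# explainable canary (decision tree) and measure how often they agree.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for production training data.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

primary = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                        random_state=0).fit(X, y)

# Fit the canary to the PRIMARY's predictions, not the raw labels,
# so it mimics the primary's decision surface.
canary = DecisionTreeClassifier(max_depth=4, random_state=0)
canary.fit(X, primary.predict(X))

# Fraction of inputs where canary and primary agree; while this stays
# high, the tree's rules serve as a human-readable explanation.
agreement = (canary.predict(X) == primary.predict(X)).mean()
print(f"prediction agreement: {agreement:.2f}")
```

When agreement drops, the canary's explanations can no longer be trusted as a proxy for the primary, which is exactly the production monitoring problem the rest of this deck addresses.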
1. Interpretability and Reproducibility in Production Machine Learning Applications
Sindhu Ghanta, Sriram Subramanian, Swaminathan Sundararaman, Lior Khermosh, Vinay Sridhar, Dulcardo Arteaga, Qianmei Luo, Dhananjoy Das, Nisha Talagala
swami@parallelm.com
3. ML in production
[Diagram: the ML lifecycle from the research sandbox into production, spanning deployment and operations; the data scientist receives predictions, errors, alerts, and warnings back from the production system.]
Check out https://mlops.org
4. Challenges
1) Interpretability of models is a requirement (i.e., model explainability)
• Growing regulatory requirements (SR11-7, OSFI-E23, etc.)
• Complex Data → Complex Models (e.g., Deep Learning models)
• Correlation ≠ Causality
2) Complex models require a “Canary” model for explainability
• Production data does not have labels and can change over time
• Models represent the learnings from the data that they were trained on
3) How to diagnose or recreate production issues?
• Complex dependencies, distribution and heterogeneity, changing state
• Not always possible to recreate the production state
5. Explainability in Production (Canary)
[Diagram: the primary model (complex) and the Canary model (simple) are each trained on the same features and labels.]
6. Explainability in Production (pred. comparison)
Train set Train
RMSE
Inference set
Periodic Flash Linear Constant Poisson
Periodic 0.029 0.029 0.43 0.4 0 0.25
Flash 0.01 0.3 0.01 0.62 0.11 0.77
Linear 0.08 0.5 0.19 0.08 0.62 0.04
Constant 0 0 0 0 0 0
Poisson 0036 0.03 0.01 0.23 1 0.037
Same load Different load
How do you know
that Canary is able to
explain primary
predictions in
production?
Primary: MLP Canary: Decision Tree
Compare predictions
TELCO Dataset
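The comparison in the table above can be sketched as a runtime check: compute the RMSE between primary and canary predictions over a production batch and alert when it exceeds a threshold. This is an illustrative sketch; the threshold value and batch handling are assumptions, not the authors' implementation.

```python
# Sketch: flag canary/primary divergence in production via RMSE
# over a batch of predictions.
import numpy as np

def prediction_rmse(primary_preds, canary_preds):
    """Root-mean-square error between two prediction vectors."""
    p = np.asarray(primary_preds, dtype=float)
    c = np.asarray(canary_preds, dtype=float)
    return float(np.sqrt(np.mean((p - c) ** 2)))

# Assumed threshold; in practice it would be calibrated from the
# train-time RMSE (the "Train RMSE" column in the table above).
THRESHOLD = 0.1

primary_batch = [0.9, 0.1, 0.8, 0.4]
canary_batch = [0.8, 0.2, 0.7, 0.4]
rmse = prediction_rmse(primary_batch, canary_batch)
if rmse > THRESHOLD:
    print(f"ALERT: canary diverged from primary (RMSE={rmse:.3f})")
```

In the table, divergence shows up exactly when the inference-time load differs from the training load (e.g., a Flash-trained pair scored on Poisson traffic), which is why the check must run continuously in production rather than once at training time.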
8. Reproducibility (Challenges & Requirements)
• Complex Dependencies
• Datasets, pipelines, schedule, user actions
• Distribution & Heterogeneity
• Running at different locations, libraries, languages, environments
• Changing temporal State
• Interdependent pipelines running on different schedules
• Newer models can impact the prediction results
9. Reproducibility (Timeline Capture)
• Built a system to capture the entire state of the application
• Datasets, pipelines, models, user actions, logs, events, environment, etc.
• Supports “Auto” & “After the fact” captures and “live” browsing of timelines
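To make the timeline-capture idea concrete, here is a hypothetical illustration (not the authors' actual system): application events such as dataset versions, model versions, and user actions are recorded with timestamps, then filtered ("browsed") or serialized for after-the-fact diagnosis.

```python
# Hypothetical sketch of timeline capture: record timestamped events
# describing application state so a production issue can be browsed
# and diagnosed later.
import json
import time

class Timeline:
    def __init__(self):
        self.events = []

    def capture(self, kind, **details):
        """Record one timestamped event (dataset, model, user action, ...)."""
        self.events.append({"ts": time.time(), "kind": kind,
                            "details": details})

    def browse(self, kind=None):
        """'Live' browsing: return captured events, optionally by kind."""
        return [e for e in self.events if kind is None or e["kind"] == kind]

    def dump(self):
        """Serialize the whole timeline for after-the-fact analysis."""
        return json.dumps(self.events)

tl = Timeline()
tl.capture("dataset", name="telco_v3")
tl.capture("model", name="mlp_primary", version=7)
tl.capture("user_action", user="alice", action="approve_model")
print(len(tl.browse("model")))  # → 1
```

A real system would also have to capture pipeline definitions, library versions, and environment state, since (as the previous slide notes) interdependent pipelines on different schedules make the production state hard to recreate otherwise.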
11. Conclusions
• Production ≠ Sandbox
• Maintaining interpretability in production deployments is challenging!
• Reproducibility is just the first step in this journey
• We built a system that enables reproducibility & explainability for complex ML environments
• Demonstrated using the Canary deployment use case