Maintainability Challenges
in ML: A SLR
KARTHIK SHIVASHANKAR, ANTONIO MARTINI
UNIVERSITY OF OSLO
DEPARTMENT OF INFORMATICS
Study Objective
Our study aims to identify and synthesise the maintainability
challenges in different stages of the ML workflow and understand
how these stages are interdependent and impact each other’s
maintainability.
Maintainability
Software maintainability means “the ease with which a software
system or component can be modified to correct faults, improve
performance or other attributes and adapt to a changing
environment”.
Method
A replication package with all the details and
metadata related to this SLR study is
available at
https://doi.org/10.5281/zenodo.6400559
Research Questions
(RQ1) What are the Data Engineering
Maintainability challenges?
(RQ2) What are the Model Engineering
Maintainability challenges?
(RQ3) What are the current maintainability
challenges when Building an ML system?
RQ1 Key takeaways
•Data is messy, error-prone, and lacks transparency
and ownership.
•There is no guarantee that pre-processing can handle
all types of quality errors, bias and adversarial data.
•Most data pipelines are tested by trial and error.
They also change and evolve, which makes them
difficult to validate and maintain on an ongoing
basis.
Image courtesy of Randall Munroe, XKCD
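
One way to move pipeline checks away from trial and error is to state the expectations on the data explicitly and test every batch against them. The sketch below is only an illustration: the `ColumnRule` class, the `validate_rows` helper, and the `age`/`country` columns are hypothetical and do not come from the study.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ColumnRule:
    """Hypothetical rule: an expected type plus an optional value check."""
    dtype: type
    check: Callable[[Any], bool] = lambda value: True


def validate_rows(rows, schema):
    """Return (row_index, column, problem) for every violation found."""
    problems = []
    for i, row in enumerate(rows):
        for column, rule in schema.items():
            if column not in row or row[column] is None:
                problems.append((i, column, "missing"))
            elif not isinstance(row[column], rule.dtype):
                problems.append((i, column, "wrong type"))
            elif not rule.check(row[column]):
                problems.append((i, column, "out of range"))
    return problems


# Assumed example schema and data, for illustration only.
schema = {
    "age": ColumnRule(int, lambda v: 0 <= v <= 120),
    "country": ColumnRule(str, lambda v: len(v) == 2),
}
rows = [
    {"age": 34, "country": "NO"},
    {"age": -5, "country": "NO"},    # value outside the allowed range
    {"age": 41, "country": None},    # missing value
    {"age": "41", "country": "SE"},  # number stored as text
]
issues = validate_rows(rows, schema)
```

A check like this catches only the error types someone thought to encode, which is the limitation the second takeaway points at.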
RQ2 Key takeaways
•Entanglement among hyperparameters directly affects
model performance and the training pipeline.
•The stochastic nature of ML and rapidly changing inputs and
expected outputs create a moving target and make ML
testing an open challenge.
•Data seasonality and fluctuation in data collection may
lead to model staleness and degraded performance.
Image credits:
https://matthewmcateer.me/blog/machine-learning-technical-debt/
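
Because training is stochastic, exact-output assertions tend to be brittle. A common workaround, sketched here under assumptions, is to pin the random seed for reproducibility checks and to assert on a tolerance band across several seeds. `train_and_score` is a hypothetical stand-in for a real training run, and the 0.85 floor and 0.05 spread are illustrative values.

```python
import random
import statistics


def train_and_score(seed: int) -> float:
    """Hypothetical stand-in for a stochastic training run.

    Returns a validation score: a nominal 0.90 plus run-to-run noise.
    """
    rng = random.Random(seed)
    return 0.90 + rng.gauss(0.0, 0.01)


def test_reproducible_with_fixed_seed():
    # The same seed must give the same result, otherwise the run
    # cannot be debugged or compared across code changes.
    assert train_and_score(seed=7) == train_and_score(seed=7)


def test_score_within_tolerance(n_runs: int = 20, floor: float = 0.85):
    # Across seeds we assert on a band, not on an exact value.
    scores = [train_and_score(seed) for seed in range(n_runs)]
    assert statistics.mean(scores) >= floor
    assert statistics.stdev(scores) < 0.05
```

This addresses run-to-run noise only. It does not solve the moving-target problem caused by changing inputs and expected outputs.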
RQ3 Key takeaways
• In general, most cloud providers do not provide a common programming
model. They typically expose ML through either a black box or a complex runtime
environment, leading to a tight coupling between the modelling and
infrastructure layers.
• Although AutoML alleviates some challenges by automating model
selection and hyperparameter tuning, it is still hard to minimise expert
intervention with the current state of tooling.
• Engineers spend significant effort developing ad hoc programs to connect
components from different software libraries, process various forms of raw
input, and interface with external systems, leading to pipeline jungles and
glue code in an MLOps-like set-up.
Credits: https://towardsdatascience.com/seven-signs-you-might-be-creating-ml-technical-debt-1a96a840fd80
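
One commonly suggested way to keep glue code from growing into a pipeline jungle is to give every step the same narrow interface, so steps can be composed and tested in isolation. The sketch below assumes steps that map a list of records to a list of records. The step names and the `text` field are hypothetical.

```python
from functools import reduce
from typing import Callable, List

Record = dict
Step = Callable[[List[Record]], List[Record]]


def drop_incomplete(records: List[Record]) -> List[Record]:
    """Remove records whose 'text' field is missing or empty."""
    return [r for r in records if r.get("text")]


def normalise_text(records: List[Record]) -> List[Record]:
    """Strip and lower-case the text without mutating the input."""
    return [{**r, "text": r["text"].strip().lower()} for r in records]


def add_length(records: List[Record]) -> List[Record]:
    """Derive a simple feature from the cleaned text."""
    return [{**r, "length": len(r["text"])} for r in records]


def run_pipeline(records: List[Record], steps: List[Step]) -> List[Record]:
    """Apply each step in order; every step sees the previous output."""
    return reduce(lambda data, step: step(data), steps, records)


raw = [{"text": "  Hello ML "}, {"text": None}, {"text": "Drift"}]
features = run_pipeline(raw, [drop_incomplete, normalise_text, add_length])
```

The uniform signature is the design choice that matters here: replacing a library behind one step does not ripple into the others.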
Interdependence of ML challenges
ML raises unique quality-attribute concerns during
development, such as
•data-dependent behaviour,
•detecting and responding to drift over time,
•handling bias and quality issues,
•timely capture of ground truth for retraining of a model
to deliver a quality ML system,
•and many more.
Image credits:
https://matthewmcateer.me/blog/machine-learning-technical-debt/
Interdependence of Maintainability challenges in different stages
If you use ML to give fashion advice, remember that fashion changes over
time.
CREDITS:
https://towardsdatascience.com/how-to-attack-machine-learning-evasion-poisoning-inference-trojans-backdoors-a7cb5832595c
https://medium.com/thelaunchpad/how-to-protect-your-machine-learning-product-from-time-adversaries-and-itself-ff07727d6712
ML systems are data-dependent and complex, making them susceptible to data
and concept drift, which leads to rapid obsolescence of their inputs and
expected outputs.
Credits: https://towardsdatascience.com/machine-learning-in-production-why-is-it-so-difficult-28ce74bfc732
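
Data drift on a single numeric feature can be quantified by comparing its distribution at training time with its distribution in production. The sketch below uses the Population Stability Index (PSI), a technique chosen here for illustration and not one the slides prescribe. The bin count, the epsilon floor, and the synthetic Gaussian data are all assumptions.

```python
import math
import random


def psi(reference, current, n_bins=10, eps=1e-6):
    """Population Stability Index between two samples of one feature.

    Bins are equal-width over the range of the reference sample;
    values outside that range are clamped into the edge bins.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant feature

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # Floor at eps so empty bins do not cause log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


# Synthetic data, for illustration only.
rng = random.Random(0)
training = [rng.gauss(0.0, 1.0) for _ in range(5000)]
same_dist = [rng.gauss(0.0, 1.0) for _ in range(5000)]
shifted = [rng.gauss(1.5, 1.0) for _ in range(5000)]

psi_same = psi(training, same_dist)
psi_shifted = psi(training, shifted)
```

A widely quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift. Those cut-offs are conventions and should be tuned per feature. Concept drift, where the relation between input and expected output changes, needs ground-truth labels and is not detected by this check.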
Implications for developers
▪There is a lack of standard tools and methods for provenance tracking, publishing of ML models
and their artefacts, tracking data transformations, and querying and storing intermediate steps.
▪Many ML projects fail at the prototyping stage because setting up infrastructure for deployment
and maintenance requires integration and management of glue code, ad-hoc pipelines, and data
monitoring.
▪In collaborative or multi-organisational projects, monitoring processes are complex because
different teams have different metrics and requirements, especially in terms of governance and
regulations, and because there is a lack of standards for communicating about ML issues and their quality.
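
In the absence of a standard tool, a minimal form of provenance tracking can be approximated by hashing each intermediate dataset together with the transformation and parameters that produced it, and linking each record to its parent. The record layout, the function names, and the example data below are hypothetical and not an established format.

```python
import hashlib
import json


def fingerprint(payload: bytes) -> str:
    """Content hash used as a stable identifier."""
    return hashlib.sha256(payload).hexdigest()


def provenance_record(data: bytes, transform: str, params: dict, parent=None) -> dict:
    """Describe one pipeline step and link it to the step before it."""
    record = {
        "transform": transform,
        "params": params,
        "data_sha256": fingerprint(data),
        "parent": parent["record_id"] if parent else None,
    }
    # The record id covers the data hash, parameters and parent link,
    # so a change anywhere upstream changes every downstream id.
    canonical = json.dumps(record, sort_keys=True).encode("utf-8")
    record["record_id"] = fingerprint(canonical)
    return record


# Assumed example data; the "fix_age" step is a stand-in transformation.
raw = b"age,country\n34,NO\n41,SE\n"
cleaned = raw.replace(b"41", b"42")

step0 = provenance_record(raw, "ingest", {"source": "example.csv"})
step1 = provenance_record(cleaned, "fix_age", {"rule": "hypothetical"}, parent=step0)
```

Storing such records next to the model artefact makes it possible to query which data and transformations a given model depends on.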
Implications for researchers
•It is unclear, even to experienced developers, how to select between several data processing
steps and how they will affect the model’s performance.
•ML systems constantly adapt to new data, creating a moving target and posing different
challenges for maintaining unit and regression tests than traditional software does.
•Better validation algorithms and monitoring techniques are needed to identify key data and model
metrics over time.
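
As a simple illustration of tracking a model metric over time, the sketch below keeps a rolling window of prediction outcomes and flags the model once windowed accuracy falls below a threshold. The class name, window size, and threshold are assumptions, and the approach presumes ground truth arrives soon enough to be compared with predictions.

```python
from collections import deque


class RollingAccuracyMonitor:
    """Hypothetical monitor: accuracy over the last `window` predictions."""

    def __init__(self, window: int, threshold: float):
        self.outcomes = deque(maxlen=window)  # old outcomes fall off the left
        self.threshold = threshold

    def record(self, prediction, ground_truth) -> None:
        self.outcomes.append(prediction == ground_truth)

    @property
    def accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_attention(self) -> bool:
        # Only raise a flag once the window is full, to avoid noisy alerts.
        full = len(self.outcomes) == self.outcomes.maxlen
        return full and self.accuracy < self.threshold


monitor = RollingAccuracyMonitor(window=4, threshold=0.75)
for predicted, actual in [("a", "a"), ("b", "b"), ("a", "a"), ("b", "b")]:
    monitor.record(predicted, actual)
healthy = monitor.needs_attention()

for predicted, actual in [("a", "b"), ("b", "a")]:
    monitor.record(predicted, actual)
degraded = monitor.needs_attention()
```

Which metrics deserve this kind of tracking, and how to set the thresholds, is the open question the last bullet raises.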
Thank you
Questions

More Related Content

Similar to Maintainability Challenges inML:ASLR

Accelerating Machine Learning as a Service with Automated Feature Engineering
Accelerating Machine Learning as a Service with Automated Feature EngineeringAccelerating Machine Learning as a Service with Automated Feature Engineering
Accelerating Machine Learning as a Service with Automated Feature Engineering
Cognizant
 
Bridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to ProductionBridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to Production
Florian Wilhelm
 
Testing and Deployment - Full Stack Deep Learning
Testing and Deployment - Full Stack Deep LearningTesting and Deployment - Full Stack Deep Learning
Testing and Deployment - Full Stack Deep Learning
Sergey Karayev
 
Unlocking MLOps Potential: Streamlining Machine Learning Lifecycle with Datab...
Unlocking MLOps Potential: Streamlining Machine Learning Lifecycle with Datab...Unlocking MLOps Potential: Streamlining Machine Learning Lifecycle with Datab...
Unlocking MLOps Potential: Streamlining Machine Learning Lifecycle with Datab...
AbishekSubramanian2
 
Flexi dc: A flexible platform for database conversion by wael yahfooz and Sk ...
Flexi dc: A flexible platform for database conversion by wael yahfooz and Sk ...Flexi dc: A flexible platform for database conversion by wael yahfooz and Sk ...
Flexi dc: A flexible platform for database conversion by wael yahfooz and Sk ...
SK Ahammad Fahad
 
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
Aditya Bhattacharya
 
Week 3 data journey and data storage
Week 3   data journey and data storageWeek 3   data journey and data storage
Week 3 data journey and data storage
Ajay Taneja
 
MLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at ScaleMLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at Scale
Databricks
 
A SURVEY ON ACCURACY OF REQUIREMENT TRACEABILITY LINKS DURING SOFTWARE DEVELO...
A SURVEY ON ACCURACY OF REQUIREMENT TRACEABILITY LINKS DURING SOFTWARE DEVELO...A SURVEY ON ACCURACY OF REQUIREMENT TRACEABILITY LINKS DURING SOFTWARE DEVELO...
A SURVEY ON ACCURACY OF REQUIREMENT TRACEABILITY LINKS DURING SOFTWARE DEVELO...
ijiert bestjournal
 
An Integrated Simulation Tool Framework for Process Data Management
An Integrated Simulation Tool Framework for Process Data ManagementAn Integrated Simulation Tool Framework for Process Data Management
An Integrated Simulation Tool Framework for Process Data Management
Cognizant
 
Unlock the power of MLOps.pdf
Unlock the power of MLOps.pdfUnlock the power of MLOps.pdf
Unlock the power of MLOps.pdf
StephenAmell4
 
SE_Unit 2.pdf it is a process model of it student
SE_Unit 2.pdf it is a process model of it studentSE_Unit 2.pdf it is a process model of it student
SE_Unit 2.pdf it is a process model of it student
RAVALCHIRAG1
 
System Development Life Cycle Overview.ppt
System Development Life Cycle Overview.pptSystem Development Life Cycle Overview.ppt
System Development Life Cycle Overview.ppt
KENNEDYDONATO1
 
SDLC and Software Process Models Introduction ppt
SDLC and Software Process Models Introduction pptSDLC and Software Process Models Introduction ppt
SDLC and Software Process Models Introduction ppt
SushDeshmukh
 
Marlabs Capabilities Overview: Application Maintenance Support Services
Marlabs Capabilities Overview: Application Maintenance Support Services Marlabs Capabilities Overview: Application Maintenance Support Services
Marlabs Capabilities Overview: Application Maintenance Support Services
Marlabs
 
1-SDLC - Development Models – Waterfall, Rapid Application Development, Agile...
1-SDLC - Development Models – Waterfall, Rapid Application Development, Agile...1-SDLC - Development Models – Waterfall, Rapid Application Development, Agile...
1-SDLC - Development Models – Waterfall, Rapid Application Development, Agile...
JOHNLEAK1
 
A 5-step methodology for complex E&P data management
A 5-step methodology for complex E&P data managementA 5-step methodology for complex E&P data management
A 5-step methodology for complex E&P data management
ETLSolutions
 

Similar to Maintainability Challenges inML:ASLR (20)

Accelerating Machine Learning as a Service with Automated Feature Engineering
Accelerating Machine Learning as a Service with Automated Feature EngineeringAccelerating Machine Learning as a Service with Automated Feature Engineering
Accelerating Machine Learning as a Service with Automated Feature Engineering
 
Adm Workshop Program
Adm Workshop ProgramAdm Workshop Program
Adm Workshop Program
 
Bridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to ProductionBridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to Production
 
Testing and Deployment - Full Stack Deep Learning
Testing and Deployment - Full Stack Deep LearningTesting and Deployment - Full Stack Deep Learning
Testing and Deployment - Full Stack Deep Learning
 
Unlocking MLOps Potential: Streamlining Machine Learning Lifecycle with Datab...
Unlocking MLOps Potential: Streamlining Machine Learning Lifecycle with Datab...Unlocking MLOps Potential: Streamlining Machine Learning Lifecycle with Datab...
Unlocking MLOps Potential: Streamlining Machine Learning Lifecycle with Datab...
 
Flexi dc: A flexible platform for database conversion by wael yahfooz and Sk ...
Flexi dc: A flexible platform for database conversion by wael yahfooz and Sk ...Flexi dc: A flexible platform for database conversion by wael yahfooz and Sk ...
Flexi dc: A flexible platform for database conversion by wael yahfooz and Sk ...
 
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
Aditya Bhattacharya - Enterprise DL - Accelerating Deep Learning Solutions to...
 
Week 3 data journey and data storage
Week 3   data journey and data storageWeek 3   data journey and data storage
Week 3 data journey and data storage
 
MLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at ScaleMLOps Virtual Event: Automating ML at Scale
MLOps Virtual Event: Automating ML at Scale
 
manikandan_16_05_2015
manikandan_16_05_2015manikandan_16_05_2015
manikandan_16_05_2015
 
A SURVEY ON ACCURACY OF REQUIREMENT TRACEABILITY LINKS DURING SOFTWARE DEVELO...
A SURVEY ON ACCURACY OF REQUIREMENT TRACEABILITY LINKS DURING SOFTWARE DEVELO...A SURVEY ON ACCURACY OF REQUIREMENT TRACEABILITY LINKS DURING SOFTWARE DEVELO...
A SURVEY ON ACCURACY OF REQUIREMENT TRACEABILITY LINKS DURING SOFTWARE DEVELO...
 
An Integrated Simulation Tool Framework for Process Data Management
An Integrated Simulation Tool Framework for Process Data ManagementAn Integrated Simulation Tool Framework for Process Data Management
An Integrated Simulation Tool Framework for Process Data Management
 
Unlock the power of MLOps.pdf
Unlock the power of MLOps.pdfUnlock the power of MLOps.pdf
Unlock the power of MLOps.pdf
 
I
II
I
 
SE_Unit 2.pdf it is a process model of it student
SE_Unit 2.pdf it is a process model of it studentSE_Unit 2.pdf it is a process model of it student
SE_Unit 2.pdf it is a process model of it student
 
System Development Life Cycle Overview.ppt
System Development Life Cycle Overview.pptSystem Development Life Cycle Overview.ppt
System Development Life Cycle Overview.ppt
 
SDLC and Software Process Models Introduction ppt
SDLC and Software Process Models Introduction pptSDLC and Software Process Models Introduction ppt
SDLC and Software Process Models Introduction ppt
 
Marlabs Capabilities Overview: Application Maintenance Support Services
Marlabs Capabilities Overview: Application Maintenance Support Services Marlabs Capabilities Overview: Application Maintenance Support Services
Marlabs Capabilities Overview: Application Maintenance Support Services
 
1-SDLC - Development Models – Waterfall, Rapid Application Development, Agile...
1-SDLC - Development Models – Waterfall, Rapid Application Development, Agile...1-SDLC - Development Models – Waterfall, Rapid Application Development, Agile...
1-SDLC - Development Models – Waterfall, Rapid Application Development, Agile...
 
A 5-step methodology for complex E&P data management
A 5-step methodology for complex E&P data managementA 5-step methodology for complex E&P data management
A 5-step methodology for complex E&P data management
 

More from SEAA 2022

Risk and Engineering Knowledge Integration in Cyber-physical Production Syste...
Risk and Engineering Knowledge Integration in Cyber-physical Production Syste...Risk and Engineering Knowledge Integration in Cyber-physical Production Syste...
Risk and Engineering Knowledge Integration in Cyber-physical Production Syste...
SEAA 2022
 
Bad Smells in Industrial Automation: Sniffing out Feature Envy
Bad Smells in Industrial Automation: Sniffing out Feature EnvyBad Smells in Industrial Automation: Sniffing out Feature Envy
Bad Smells in Industrial Automation: Sniffing out Feature Envy
SEAA 2022
 
Software Architecture Challenges in Process Automation - From Code Generation...
Software Architecture Challenges in Process Automation - From Code Generation...Software Architecture Challenges in Process Automation - From Code Generation...
Software Architecture Challenges in Process Automation - From Code Generation...
SEAA 2022
 
From Traditional to Digital: How software, data and AI are transforming the e...
From Traditional to Digital: How software, data and AI are transforming the e...From Traditional to Digital: How software, data and AI are transforming the e...
From Traditional to Digital: How software, data and AI are transforming the e...
SEAA 2022
 
Exploiting dynamic analysis for architectural smell detection: a preliminary ...
Exploiting dynamic analysis for architectural smell detection: a preliminary ...Exploiting dynamic analysis for architectural smell detection: a preliminary ...
Exploiting dynamic analysis for architectural smell detection: a preliminary ...
SEAA 2022
 
On the Role of Personality Traits in Implementation Tasks: A Preliminary Inve...
On the Role of Personality Traits in Implementation Tasks: A Preliminary Inve...On the Role of Personality Traits in Implementation Tasks: A Preliminary Inve...
On the Role of Personality Traits in Implementation Tasks: A Preliminary Inve...
SEAA 2022
 
An Empirical Analysis of Microservices Systems Using Consumer-Driven Contract...
An Empirical Analysis of Microservices Systems Using Consumer-Driven Contract...An Empirical Analysis of Microservices Systems Using Consumer-Driven Contract...
An Empirical Analysis of Microservices Systems Using Consumer-Driven Contract...
SEAA 2022
 
Have Java Production Methods Co-Evolved With Test Methods Properly?: A Fine-G...
Have Java Production Methods Co-Evolved With Test Methods Properly?: A Fine-G...Have Java Production Methods Co-Evolved With Test Methods Properly?: A Fine-G...
Have Java Production Methods Co-Evolved With Test Methods Properly?: A Fine-G...
SEAA 2022
 
A Preliminary Conceptualization and Analysis on Automated Static Analysis Too...
A Preliminary Conceptualization and Analysis on Automated Static Analysis Too...A Preliminary Conceptualization and Analysis on Automated Static Analysis Too...
A Preliminary Conceptualization and Analysis on Automated Static Analysis Too...
SEAA 2022
 
An Evaluation of Effort-Aware Fine-Grained Just-in-Time Defect Prediction Met...
An Evaluation of Effort-Aware Fine-Grained Just-in-Time Defect Prediction Met...An Evaluation of Effort-Aware Fine-Grained Just-in-Time Defect Prediction Met...
An Evaluation of Effort-Aware Fine-Grained Just-in-Time Defect Prediction Met...
SEAA 2022
 
The Impact of Forced Working-From-Home on Code Technical Debt: An Industrial ...
The Impact of Forced Working-From-Home on Code Technical Debt: An Industrial ...The Impact of Forced Working-From-Home on Code Technical Debt: An Industrial ...
The Impact of Forced Working-From-Home on Code Technical Debt: An Industrial ...
SEAA 2022
 
Service Classification through Machine Learning: Aiding in the Efficient Ide...
 Service Classification through Machine Learning: Aiding in the Efficient Ide... Service Classification through Machine Learning: Aiding in the Efficient Ide...
Service Classification through Machine Learning: Aiding in the Efficient Ide...
SEAA 2022
 
Model-Driven Optimization: Generating Smart Mutation Operators for Multi-Obj...
 Model-Driven Optimization: Generating Smart Mutation Operators for Multi-Obj... Model-Driven Optimization: Generating Smart Mutation Operators for Multi-Obj...
Model-Driven Optimization: Generating Smart Mutation Operators for Multi-Obj...
SEAA 2022
 
An Industrial Experience Report about Challenges from Continuous Monitoring, ...
An Industrial Experience Report about Challenges from Continuous Monitoring, ...An Industrial Experience Report about Challenges from Continuous Monitoring, ...
An Industrial Experience Report about Challenges from Continuous Monitoring, ...
SEAA 2022
 
API Deprecation: A Systematic Mapping Study
API Deprecation: A Systematic Mapping StudyAPI Deprecation: A Systematic Mapping Study
API Deprecation: A Systematic Mapping Study
SEAA 2022
 
MDEML_UMLsec4Edge Extending UMLsec to model data-protection-compliant edge co...
MDEML_UMLsec4Edge Extending UMLsec to model data-protection-compliant edge co...MDEML_UMLsec4Edge Extending UMLsec to model data-protection-compliant edge co...
MDEML_UMLsec4Edge Extending UMLsec to model data-protection-compliant edge co...
SEAA 2022
 
EMMM: A Unified Meta-Model for Tracking Machine Learning Experiments
 EMMM: A Unified Meta-Model for Tracking Machine Learning Experiments EMMM: A Unified Meta-Model for Tracking Machine Learning Experiments
EMMM: A Unified Meta-Model for Tracking Machine Learning Experiments
SEAA 2022
 
Easing the Reuse of ML Solutions by Interactive Clustering-based Autotuning i...
Easing the Reuse of ML Solutions by Interactive Clustering-based Autotuning i...Easing the Reuse of ML Solutions by Interactive Clustering-based Autotuning i...
Easing the Reuse of ML Solutions by Interactive Clustering-based Autotuning i...
SEAA 2022
 

More from SEAA 2022 (18)

Risk and Engineering Knowledge Integration in Cyber-physical Production Syste...
Risk and Engineering Knowledge Integration in Cyber-physical Production Syste...Risk and Engineering Knowledge Integration in Cyber-physical Production Syste...
Risk and Engineering Knowledge Integration in Cyber-physical Production Syste...
 
Bad Smells in Industrial Automation: Sniffing out Feature Envy
Bad Smells in Industrial Automation: Sniffing out Feature EnvyBad Smells in Industrial Automation: Sniffing out Feature Envy
Bad Smells in Industrial Automation: Sniffing out Feature Envy
 
Software Architecture Challenges in Process Automation - From Code Generation...
Software Architecture Challenges in Process Automation - From Code Generation...Software Architecture Challenges in Process Automation - From Code Generation...
Software Architecture Challenges in Process Automation - From Code Generation...
 
From Traditional to Digital: How software, data and AI are transforming the e...
From Traditional to Digital: How software, data and AI are transforming the e...From Traditional to Digital: How software, data and AI are transforming the e...
From Traditional to Digital: How software, data and AI are transforming the e...
 
Exploiting dynamic analysis for architectural smell detection: a preliminary ...
Exploiting dynamic analysis for architectural smell detection: a preliminary ...Exploiting dynamic analysis for architectural smell detection: a preliminary ...
Exploiting dynamic analysis for architectural smell detection: a preliminary ...
 
On the Role of Personality Traits in Implementation Tasks: A Preliminary Inve...
On the Role of Personality Traits in Implementation Tasks: A Preliminary Inve...On the Role of Personality Traits in Implementation Tasks: A Preliminary Inve...
On the Role of Personality Traits in Implementation Tasks: A Preliminary Inve...
 
An Empirical Analysis of Microservices Systems Using Consumer-Driven Contract...
An Empirical Analysis of Microservices Systems Using Consumer-Driven Contract...An Empirical Analysis of Microservices Systems Using Consumer-Driven Contract...
An Empirical Analysis of Microservices Systems Using Consumer-Driven Contract...
 
Have Java Production Methods Co-Evolved With Test Methods Properly?: A Fine-G...
Have Java Production Methods Co-Evolved With Test Methods Properly?: A Fine-G...Have Java Production Methods Co-Evolved With Test Methods Properly?: A Fine-G...
Have Java Production Methods Co-Evolved With Test Methods Properly?: A Fine-G...
 
A Preliminary Conceptualization and Analysis on Automated Static Analysis Too...
A Preliminary Conceptualization and Analysis on Automated Static Analysis Too...A Preliminary Conceptualization and Analysis on Automated Static Analysis Too...
A Preliminary Conceptualization and Analysis on Automated Static Analysis Too...
 
An Evaluation of Effort-Aware Fine-Grained Just-in-Time Defect Prediction Met...
An Evaluation of Effort-Aware Fine-Grained Just-in-Time Defect Prediction Met...An Evaluation of Effort-Aware Fine-Grained Just-in-Time Defect Prediction Met...
An Evaluation of Effort-Aware Fine-Grained Just-in-Time Defect Prediction Met...
 
The Impact of Forced Working-From-Home on Code Technical Debt: An Industrial ...
The Impact of Forced Working-From-Home on Code Technical Debt: An Industrial ...The Impact of Forced Working-From-Home on Code Technical Debt: An Industrial ...
The Impact of Forced Working-From-Home on Code Technical Debt: An Industrial ...
 
Service Classification through Machine Learning: Aiding in the Efficient Ide...
 Service Classification through Machine Learning: Aiding in the Efficient Ide... Service Classification through Machine Learning: Aiding in the Efficient Ide...
Service Classification through Machine Learning: Aiding in the Efficient Ide...
 
Model-Driven Optimization: Generating Smart Mutation Operators for Multi-Obj...
 Model-Driven Optimization: Generating Smart Mutation Operators for Multi-Obj... Model-Driven Optimization: Generating Smart Mutation Operators for Multi-Obj...
Model-Driven Optimization: Generating Smart Mutation Operators for Multi-Obj...
 
An Industrial Experience Report about Challenges from Continuous Monitoring, ...
An Industrial Experience Report about Challenges from Continuous Monitoring, ...An Industrial Experience Report about Challenges from Continuous Monitoring, ...
An Industrial Experience Report about Challenges from Continuous Monitoring, ...
 
API Deprecation: A Systematic Mapping Study
API Deprecation: A Systematic Mapping StudyAPI Deprecation: A Systematic Mapping Study
API Deprecation: A Systematic Mapping Study
 
MDEML_UMLsec4Edge Extending UMLsec to model data-protection-compliant edge co...
MDEML_UMLsec4Edge Extending UMLsec to model data-protection-compliant edge co...MDEML_UMLsec4Edge Extending UMLsec to model data-protection-compliant edge co...
MDEML_UMLsec4Edge Extending UMLsec to model data-protection-compliant edge co...
 
EMMM: A Unified Meta-Model for Tracking Machine Learning Experiments
 EMMM: A Unified Meta-Model for Tracking Machine Learning Experiments EMMM: A Unified Meta-Model for Tracking Machine Learning Experiments
EMMM: A Unified Meta-Model for Tracking Machine Learning Experiments
 
Easing the Reuse of ML Solutions by Interactive Clustering-based Autotuning i...
Easing the Reuse of ML Solutions by Interactive Clustering-based Autotuning i...Easing the Reuse of ML Solutions by Interactive Clustering-based Autotuning i...
Easing the Reuse of ML Solutions by Interactive Clustering-based Autotuning i...
 

Recently uploaded

role of pramana in research.pptx in science
role of pramana in research.pptx in sciencerole of pramana in research.pptx in science
role of pramana in research.pptx in science
sonaliswain16
 
In silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptxIn silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptx
AlaminAfendy1
 
Leaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdfLeaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdf
RenuJangid3
 
in vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptxin vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptx
yusufzako14
 
platelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptxplatelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptx
muralinath2
 
4. An Overview of Sugarcane White Leaf Disease in Vietnam.pdf
4. An Overview of Sugarcane White Leaf Disease in Vietnam.pdf4. An Overview of Sugarcane White Leaf Disease in Vietnam.pdf
4. An Overview of Sugarcane White Leaf Disease in Vietnam.pdf
ssuserbfdca9
 
Mammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also FunctionsMammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also Functions
YOGESH DOGRA
 
Hemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptxHemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptx
muralinath2
 
Richard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlandsRichard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlands
Richard Gill
 
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
Health Advances
 
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
NathanBaughman3
 
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
University of Maribor
 
GBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram StainingGBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram Staining
Areesha Ahmad
 
platelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptxplatelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptx
muralinath2
 
Structures and textures of metamorphic rocks
Structures and textures of metamorphic rocksStructures and textures of metamorphic rocks
Structures and textures of metamorphic rocks
kumarmathi863
 
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdfSCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SELF-EXPLANATORY
 
EY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptxEY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptx
AlguinaldoKong
 
Structural Classification Of Protein (SCOP)
Structural Classification Of Protein  (SCOP)Structural Classification Of Protein  (SCOP)
Structural Classification Of Protein (SCOP)
aishnasrivastava
 
filosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptxfilosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptx
IvanMallco1
 
general properties of oerganologametal.ppt
general properties of oerganologametal.pptgeneral properties of oerganologametal.ppt
general properties of oerganologametal.ppt
IqrimaNabilatulhusni
 

Recently uploaded (20)

role of pramana in research.pptx in science
role of pramana in research.pptx in sciencerole of pramana in research.pptx in science
role of pramana in research.pptx in science
 
In silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptxIn silico drugs analogue design: novobiocin analogues.pptx

Maintainability Challenges in ML: A SLR

  • 1. Maintainability Challenges in ML: A SLR. Karthik Shivashankar, Antonio Martini. University of Oslo, Department of Informatics
  • 2. Study Objective Our study aims to identify and synthesise the maintainability challenges in different stages of the ML workflow and understand how these stages are interdependent and impact each other’s maintainability.
  • 3. Maintainability Software maintainability means "the ease with which a software system or component can be modified to correct faults, improve performance or other attributes and adapt to a changing environment".
  • 4. Method We have a replication package with all the details and metadata related to this SLR study @ https://doi.org/10.5281/zenodo.6400559
  • 5. Research Questions (RQ1) What are the Data Engineering Maintainability challenges? (RQ2) What are the Model Engineering Maintainability challenges? (RQ3) What are the current maintainability challenges when Building an ML system?
  • 6. RQ1 Key takeaways • Data is messy, error-prone, and lacks transparency and ownership. • There is no guarantee that pre-processing can handle all types of quality errors, bias and adversarial data. • Most data pipelines are tested in a trial-and-error manner. Data also changes and evolves, making pipelines difficult to validate and maintain on an ongoing basis. Courtesy Randall Munroe of XKCD
  • 7. RQ2 Key takeaways • Entanglement among hyperparameters directly affects model performance and the training pipeline. • The stochastic nature of ML, together with rapidly changing inputs and expected outputs, creates a moving target and makes ML testing an open challenge. • Data seasonality and fluctuations in data collection may lead to model staleness and degraded performance. Image credits: https://matthewmcateer.me/blog/machine-learning-technical-debt/
  • 8. RQ3 Key takeaways • In general, most cloud providers do not provide a common programming model. They typically use either a black box or a complex runtime environment to approach ML, leading to tight coupling between the modelling and infrastructure layers. • Although AutoML alleviates some challenges by automating model selection and hyperparameter tuning, it is still hard to minimise expert intervention in the current landscape. • Engineers spend significant effort developing ad hoc programs to connect components from different software libraries, process various forms of raw input, and interface with external systems, leading to pipeline jungles and glue code in an MLOps-like set-up. Credits: https://towardsdatascience.com/seven-signs-you-might-be-creating-ml-technical-debt-1a96a840fd80
  • 9. Interdependence of ML challenges ML has unique quality-attribute concerns during development, such as • data-dependent behaviour, • detecting and responding to drift over time, • handling bias and quality issues, • timely capture of ground truth for retraining a model to deliver a quality ML system, • and many more. Image credits: https://matthewmcateer.me/blog/machine-learning-technical-debt/
  • 11. If you try to use ML to give fashion advice, know that fashion changes over time
  • 14. Implication for developers ▪ There is a lack of standard tools and methods for provenance tracking, publishing ML models and their artefacts, tracking data transformations, and querying and storing intermediate steps. ▪ Many ML projects fail at the prototyping stage because setting up infrastructure for deployment and maintenance requires integration and management of glue code, ad hoc pipelines, and data monitoring. ▪ In collaborative or multi-organisational projects, monitoring processes are complex because different teams have different metrics and requirements, especially regarding governance and regulation, and because there is a lack of standards for communicating about ML issues and their quality.
  • 15. Implication for Researcher • It is unclear, even for experienced developers, how to select between several data processing steps and how they will affect the model's performance. • ML systems constantly adapt to new data, creating a moving target and posing a different set of challenges for maintaining unit and regression tests than traditional software. • Better validation algorithms and monitoring techniques are needed to identify key data and model metrics over time.
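
Slides 9, 11 and 15 point to drift and to the need for monitoring data metrics over time. As one possible illustration (not a method taken from the study), the sketch below computes the Population Stability Index (PSI) between a reference sample and a newer sample of a single numeric feature. The function name, the bin count and the 0.25 alert threshold are assumptions chosen for the example; the threshold is only a commonly quoted rule of thumb.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float], current: Sequence[float], bins: int = 10
) -> float:
    """PSI between two samples of one numeric feature (hypothetical helper)."""
    lo, hi = min(reference), max(reference)
    # Equal-width bins over the reference range; fall back to 1.0 if it is constant.
    width = (hi - lo) / bins or 1.0

    def proportions(sample: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for x in sample:
            idx = int((x - lo) / width)
            idx = max(0, min(bins - 1, idx))  # clamp out-of-range values to edge bins
            counts[idx] += 1
        eps = 1e-6  # smoothing so that empty bins do not lead to log(0)
        return [(c + eps) / (len(sample) + eps * bins) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    # Each term is non-negative, so the index is zero only for identical histograms.
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


if __name__ == "__main__":
    training_sample = [i / 100 for i in range(100)]         # values seen at training time
    production_sample = [0.5 + i / 200 for i in range(100)]  # later values, shifted upwards
    score = population_stability_index(training_sample, production_sample)
    ALERT_THRESHOLD = 0.25  # assumed rule-of-thumb cut-off, not from the slides
    print(f"PSI = {score:.3f}, drift suspected: {score > ALERT_THRESHOLD}")
```

A check like this only flags a change in one feature's distribution; it does not say whether model quality has degraded, which still requires the timely ground truth mentioned on slide 9.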
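
Slide 14 notes the lack of standard tooling for provenance tracking and for recording data transformations. The following is a minimal sketch of what such a record could look like, using only the Python standard library. The names `fingerprint` and `provenance_record`, and the record fields, are hypothetical and are not drawn from the study or from any existing tool.

```python
import hashlib
import json
from typing import Any, Dict, Optional


def fingerprint(obj: Any) -> str:
    """Stable SHA-256 hash of any JSON-serialisable object."""
    payload = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def provenance_record(
    rows: Any,
    transform_name: str,
    params: Dict[str, Any],
    parent: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Describe one pipeline step: what ran, with which parameters, on which data."""
    record: Dict[str, Any] = {
        "transform": transform_name,
        "params": params,
        "data_hash": fingerprint(rows),  # hash of the step's output data
        "parent": parent["record_id"] if parent else None,  # link to the previous step
    }
    # The id is derived from the content, so identical steps get identical ids.
    record["record_id"] = fingerprint(record)
    return record


if __name__ == "__main__":
    raw = [{"age": 31, "income": 52000}, {"age": None, "income": 48000}]
    loaded = provenance_record(raw, "load", {})

    cleaned = [row for row in raw if row["age"] is not None]
    dropped = provenance_record(
        cleaned, "drop_missing_age", {"column": "age"}, parent=loaded
    )
    print(json.dumps([loaded, dropped], indent=2))
```

Because each record stores its parent's id, the chain of records can be walked backwards from a trained model to the raw data, which is the kind of query over intermediate steps that the slide describes as missing.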