This document discusses software engineering challenges in building AI-based complex systems. It notes that while AI is providing promising results on a functional level, building full systems using AI introduces new technical and non-technical questions. These include how to interpret data, ensure systems are dependable, and address ethical concerns. From a technical perspective, challenges involve efficiently collecting, storing, processing, and analyzing data, as well as building and testing AI-based systems. The document argues that addressing these challenges will require interdisciplinary work between software engineering, data science, and domain knowledge.
3. 4/25/2018 Chalmers 3
• Trend – Overall use of AI in software applications and Software-intensive systems
• Many promising results on the functional/feature level
• Enormous expectations to build systems that use AI to deliver features
• New questions of a non-technical character appear
• How to interpret data?
• Which real problems can be solved using data-driven and AI-based approaches?
• Who is the owner of the data?
• What are the ethical aspects of using data, and of allowing machines to decide?
• New questions of technical nature
• How to efficiently collect, store, process, analyse, and present data?
• How to efficiently build the AI-based systems?
• How to ensure the dependability/trustworthiness of such systems?
• WHAT KIND OF SOFTWARE ENGINEERING SUPPORT IS NEEDED?
Expectations – overall use of AI
4. 4/25/2018 Chalmers 4
Presentation based on
• Machine Learning: The High-Interest Credit Card of Technical Debt, D.
Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar
Ebner, Vinay Chaudhary, Michael Young
• Software Engineering Challenges of Deep Learning, Anders Arpteg, Björn
Brinne, Luka Crnkovic-Friis, Jan Bosch
Software Engineering Challenges in developing
AI-based systems
6. www.icse2018.org Gothenburg May 27 – June 2, 2018
Sara Mazur
Margaret Hamilton
Fred Brooks
Keynotes
Industry Forum Invited Speakers
Danica Kragic
Jan Bosch
• 50 years of Software engineering
• Tutorials (AI & SE,…)
• Different tracks
• 300+ presentations & 200+ posters
• Doctoral symposium
• 29 dedicated workshops
• 7 co-located conferences
• 1500+ participants
7. 4/25/2018 Chalmers 7
[Figure: AI and Machine Learning in the context of the disciplines and concerns listed below]
• Data Science: statistics, algorithms
• Computational Science: simulation, new algorithms
• Data Engineering, Software Engineering, Visualisation
• System requirements: performance, reliability, security, safety
• System constraints: storage, memory, computation
• Data: accuracy, availability
• Value: trustworthiness, ethical values
• Domain knowledge
AI with impact is more than just AI
8. 4/25/2018 Chalmers 8
System architecture (example)
[Figure: layers HW, Platform, Middleware; Subsystems A, B, C; components C1–C4. Caption: Subsystem and components]
Component-based and service-based approach
- Components
- Encapsulation of data
- Encapsulation of functionality
- Dependencies between components defined and controlled
9. 4/25/2018 Chalmers 9
AI systems – system architecture (example)
[Figure: layers HW, Platform, Middleware; Subsystems A, B, C; components C1–C4. Caption: Subsystem and components]
Components – black boxes
- Components
- Encapsulation of AI-based functionality
- Dependencies between components defined and controlled
- What about data?
10. 4/25/2018 10
AI-based components
[Figure: component C1 and the data used for ML, labelled "controlled" and "Controlled?"]
• The results depend not only on the algorithms and controlled data but also on uncontrolled/unknown data
• The AI-based functions are not continuous: a small change of data can cause big changes
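The discontinuity mentioned above can be illustrated with a minimal sketch. It assumes a toy 1-nearest-neighbour classifier; the function name, the labels, and the data values are hypothetical and are not taken from the presentation.

```python
def nearest_label(train, x):
    """Return the label of the training point closest to x (1-nearest neighbour)."""
    return min(train, key=lambda point: abs(point[0] - x))[1]


# Two versions of the training data that differ in a single value by 0.02.
train_v1 = [(0.0, "reject"), (1.01, "accept")]
train_v2 = [(0.0, "reject"), (0.99, "accept")]

query = 0.5
print(nearest_label(train_v1, query))  # reject
print(nearest_label(train_v2, query))  # accept
```

The algorithm and the query are identical in both calls; only one training value moved slightly, yet the decision flips.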
11. The first version of an AI system is easy to obtain, but making subsequent
improvements is unexpectedly difficult.
4/25/2018 Chalmers 11
• M = {f1, f2, …, fn}: the system, a set of features f1…fn in an AI model
• Changing the input data for a feature fx requires changing the values (weights, importance) of the other features fi to get an optimal result
• Adding a new feature may require the same changes
• Either can cause unpredictable changes in the model
Data-related challenges I - Entanglement (Data fusion)
CACE principle: Changing Anything Changes Everything
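A small sketch of entanglement, assuming a linear least-squares model without intercept as a stand-in for the AI model M. The helper names and the data are illustrative assumptions. It shows that adding a second, correlated feature changes the weight already learned for the first one.

```python
def dot(a, b):
    """Inner product of two equally long sequences."""
    return sum(x * y for x, y in zip(a, b))


def fit_one(x1, y):
    """Least-squares weight for y ~ w1*x1 (no intercept)."""
    return dot(x1, y) / dot(x1, x1)


def fit_two(x1, x2, y):
    """Least-squares weights for y ~ w1*x1 + w2*x2 via the 2x2 normal equations."""
    a, b, c = dot(x1, x1), dot(x1, x2), dot(x2, x2)
    p, q = dot(x1, y), dot(x2, y)
    det = a * c - b * b
    return (c * p - b * q) / det, (a * q - b * p) / det


# Hypothetical data: f2 is strongly correlated with f1, and the target depends on both.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.1, 1.9, 3.2, 3.8]
y = [u + v for u, v in zip(x1, x2)]

w1_alone = fit_one(x1, y)               # f1 absorbs the effect of the missing f2
w1_joint, w2_joint = fit_two(x1, x2, y)  # adding f2 shifts the weight of f1
print(round(w1_alone, 2), round(w1_joint, 2), round(w2_joint, 2))
```

With only f1 the learned weight is close to 2; once f2 is added it drops to about 1, although nothing about f1 itself changed.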
12. 4/25/2018 Chalmers 12
• Code dependencies – a known source of technical debt (built-in problems)
• Data dependencies – more complex
Data-related challenges II – Data dependencies
[Figure: DATA sources and AI-components]
13. 4/25/2018 Chalmers 13
• Unstable data dependencies
• Some data change over time (value, accuracy, precision)
• A common mitigation strategy
• Introduce versions of data sets
• Implication
• version and configuration management challenges
• Not only version and configuration management of functions (code) but also of data
• How to manage data dependencies? No mature tools
• Static analysis of systems with data dependencies – no tools
Data-related challenges III – Data dependencies
[Figure: DATA sources and AI-components]
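One way to make data versions explicit is to derive the version from the data content and record it with the trained model. The following is a minimal sketch under that assumption; `dataset_version`, `train`, `is_stale`, and the row layout are hypothetical names chosen for illustration, and the "training" step is only a stand-in.

```python
import hashlib
import json


def dataset_version(rows):
    """Derive a version id from the data content (first 12 hex chars of SHA-256)."""
    canonical = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()[:12]


def train(rows):
    """Stand-in for training: the 'model' records which data version built it."""
    mean = sum(r["value"] for r in rows) / len(rows)
    return {"mean": mean, "data_version": dataset_version(rows)}


def is_stale(model, current_rows):
    """True when the data has changed since the model was trained."""
    return model["data_version"] != dataset_version(current_rows)


rows_v1 = [{"sensor": "a", "value": 1.0}, {"sensor": "b", "value": 3.0}]
rows_v2 = [{"sensor": "a", "value": 1.0}, {"sensor": "b", "value": 3.5}]

model = train(rows_v1)
print(is_stale(model, rows_v1))  # False
print(is_stale(model, rows_v2))  # True
```

Pinning a model to a data version does not remove the dependency, but it turns a silent data change into something a build or deployment check can detect.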
15. 4/25/2018 Chalmers 15
• The system can optimise for the feedback
Data-related challenges II – Hidden Feedback Loops
[Figure: DATA sources and an AI-component]
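A hidden feedback loop can be sketched with a deliberately simplified simulation. The assumptions are loud: two items with equal true appeal, a "model" that always shows the most-clicked item, and users who click whatever is shown. None of these details come from the presentation.

```python
def simulate(rounds):
    """Run a toy recommender whose own output becomes its next training data."""
    clicks = {"A": 1, "B": 1}  # initial training data: a tie
    for _ in range(rounds):
        # Model: show the most-clicked item (ties resolve to the first key, "A").
        shown = max(clicks, key=clicks.get)
        # Assumed user behaviour: the shown item gets the click.
        clicks[shown] += 1
    return clicks


print(simulate(10))  # {'A': 11, 'B': 1}
```

Item B never gets a chance to collect evidence, so the system ends up optimising for the feedback it generated itself rather than for the users' real preferences.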
16. 4/25/2018 Chalmers 16
• Changes of data can have unexpected consequences
Data-related challenges IV – Undeclared Consumers
[Figure: DATA sources and AI-components]
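One possible mitigation for undeclared consumers is to make every consumer register before it can read a component's output. This is a sketch of that idea only; the class `PredictionFeed` and the consumer names are hypothetical.

```python
class PredictionFeed:
    """Output channel of an AI component that only serves declared consumers."""

    def __init__(self, declared_consumers):
        self._declared = set(declared_consumers)
        self._value = None

    def publish(self, value):
        """Store the latest output of the AI component."""
        self._value = value

    def read(self, consumer):
        """Return the latest output, refusing consumers that were never declared."""
        if consumer not in self._declared:
            raise PermissionError(f"{consumer!r} is not a declared consumer")
        return self._value


feed = PredictionFeed({"billing"})
feed.publish(0.8)
print(feed.read("billing"))  # 0.8
```

With the list of consumers known, a change to the published data can be checked against everyone who depends on it.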
17. 4/25/2018 Chalmers 17
• Model m for problem P
• Model m’ for problem P’ (P’ − P = ΔP)
• Often used solution
• m’(P’) = m(P) + Δm; changes to the data used for P then break the relation
between m’ and m
• Dependency of data should be controlled
Data-related challenges III – Correction Cascades
[Figure: DATA sources and AI-components]
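The relation m’(P’) = m(P) + Δm can be made concrete with a toy example. The sketch assumes m is a one-parameter linear model and Δm a constant offset; the fitting helpers and the data are invented for illustration.

```python
def fit_slope(xs, ys):
    """Least-squares slope for y ~ slope*x (this plays the role of model m)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def fit_offset(slope, xs, ys):
    """Mean residual of m on the data of P' (this plays the role of the correction)."""
    return sum(y - slope * x for x, y in zip(xs, ys)) / len(xs)


def mean_abs_error(slope, offset, xs, ys):
    """Error of the corrected model m'(x) = slope*x + offset."""
    return sum(abs(slope * x + offset - y) for x, y in zip(xs, ys)) / len(xs)


xs = [1.0, 2.0, 3.0]
p_ys = [2.0, 4.0, 6.0]         # data for problem P
p_prime_ys = [7.0, 9.0, 11.0]  # data for problem P'

slope = fit_slope(xs, p_ys)                  # m
offset = fit_offset(slope, xs, p_prime_ys)   # correction learned on top of m
err_before = mean_abs_error(slope, offset, xs, p_prime_ys)

# The data used for P changes and m is retrained; the old correction is kept.
new_slope = fit_slope(xs, [3.0, 6.0, 9.0])
err_after = mean_abs_error(new_slope, offset, xs, p_prime_ys)
print(err_before, err_after)  # 0.0 2.0
```

The correction was fitted against one specific m, so retraining m silently invalidates m’ even though nothing in P’ changed.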
18. 4/25/2018 Chalmers 18
• Heterogeneity of data (different formats, accuracy, semantics) and use of
standard ML functions require a lot of glue code
• 95% of the code in AI-based systems is glue code (empirical data)
• Requires
• Frequent refactoring of code
• Re-implementing AI models
• Pipeline Jungles
• Preparing data in an ML-friendly format becomes a jungle of scrapes, joins, sampling steps,
and intermediate files
• Requires close teamwork between data engineers and domain engineers
System-design anti-patterns
19. 4/25/2018 Chalmers 19
• Dead Experimental Codepaths
• An AI solution requires a lot of experimentation
• A lot of code that will not be used later
• Problems
• Dead code
• Version management – how to preserve useful configuration branches and remove
unnecessary ones
System-design anti-patterns (II)
20. 4/25/2018 Chalmers 20
Managing Changes in the External World
[Figure: System with Cloud Data and Local Data; historical data used in ML; continuous change of data]
Challenge:
models and system behaviour depend on data
Examples:
- Threshold changes
- Correlation between data
Requirements: continuous monitoring of data and system; continuous testing.
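Continuous monitoring of data can start with something as simple as comparing live inputs against the historical data used in ML. The sketch below assumes a mean-shift check measured in standard errors; the function name `drift_alert`, the threshold `max_shift`, and the sample values are hypothetical.

```python
import math
import statistics


def drift_alert(baseline, live, max_shift=3.0):
    """Flag drift when the live mean lies more than max_shift standard errors
    away from the mean of the historical (training) data."""
    base_mean = statistics.mean(baseline)
    std_error = statistics.stdev(baseline) / math.sqrt(len(live))
    shift = abs(statistics.mean(live) - base_mean) / std_error
    return shift > max_shift


baseline = [10.0, 10.2, 9.8, 10.1, 9.9]  # historical data used in ML
live_ok = [10.1, 9.9, 10.0, 10.05]       # recent data, same regime
live_shifted = [11.0, 11.2, 10.9, 11.1]  # recent data after a change in the external world

print(drift_alert(baseline, live_ok))       # False
print(drift_alert(baseline, live_shifted))  # True
```

A check like this covers only one kind of change (a shifted mean); changed thresholds or correlations between data would need their own monitors.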
21. 4/25/2018 Chalmers 21
• Development challenges
• Production challenges
• Project management and organisational challenges
Software engineering Challenges
22. 4/25/2018 Chalmers 22
• Experiment Management
• Hardware, Platform, Source code, Configuration, Data sources, Training state
• Difficult to predict behaviour of ML
• Limited Transparency, Troubleshooting and Testing
• Glue Code and Supporting Systems
• Resource Limitations
• Memory, CPU power, Storage
Development challenges
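The six items listed under Experiment Management can be captured in one record per experiment. This is a minimal sketch assuming a frozen dataclass with a content fingerprint; the class name, field values, and fingerprint length are illustrative assumptions.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ExperimentRecord:
    hardware: str          # e.g. GPU model
    platform: str          # e.g. operating system and installed packages
    source_code: str       # e.g. commit id of training and pre-processing code
    configuration: dict    # e.g. model and pre-processing settings
    data_sources: tuple    # e.g. identifiers of input signals and target values
    training_state: str    # e.g. version of the trained model

    def fingerprint(self):
        """Short hash over all six items; changes when any of them changes."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:12]


# Hypothetical values for illustration only.
run_a = ExperimentRecord("gpu-x", "linux", "abc123", {"lr": 0.01}, ("signals-v1",), "model-1")
run_b = ExperimentRecord("gpu-x", "linux", "abc123", {"lr": 0.01}, ("signals-v2",), "model-1")
print(run_a.fingerprint() != run_b.fingerprint())  # True
```

Two runs that differ only in their data source get different fingerprints, which is the case that code-only version control would miss.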
23. 4/25/2018 Chalmers 23
• Effort Estimation
• Difficult to know when the models will be sufficiently good
• Cultural Differences
• Software developers, data scientists
• Development process
• Continuous changes
• Compatibility issues between changes
Organizational & project challenges
24. 4/25/2018 Chalmers 24
• Dependency Management.
• Requirements on powerful computational resources that are
continuously changing
• Monitoring and Logging.
• Unintended Feedback Loops
• Safety, security, and privacy
Production challenges
25. 4/25/2018 Chalmers 25
• How easily can an entirely new algorithmic approach be tested at full scale?
• How precisely can the impact of a new change to the system be measured?
• Does improving one model or signal degrade others?
• How quickly can new members of the team be brought up to speed?
A lot of new challenges - Important to manage them
Useful questions*
Conclusion
* Hidden Technical Debt in Machine Learning Systems, D. Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips,
Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-François Crespo, Dan Dennison
Editor's Notes
(1) Hardware (e.g. GPU models primarily)
(2) Platform (e.g. operating system and installed packages)
(3) Source code (e.g. model training and pre-processing)
(4) Configuration (e.g. model configuration and pre-processing settings)
(5) Data sources (e.g. input signals and target values)
(6) Training state (e.g. versions of trained model).
Traditional management of software is usually light