QoA4ML – A Framework for Supporting Contracts in Machine Learning Services
1. QoA4ML – A Framework for Supporting Contracts in Machine Learning Services
Hong-Linh Truong, Minh-Tri Nguyen
Department of Computer Science
https://rdsea.github.io
2. Outline
▪ Context, scenario and research questions
▪ Key components of the QoA4ML framework
▪ Prototype and experiments
▪ Conclusions and future work
September 9, 2021
IEEE International Conference on Web Services (ICWS) 2021
3. Context
▪ Machine learning as a service is becoming popular
▪ ML service providers offer ML services for different consumers
▪ Different stakeholders and interaction models
▪ Two-stakeholder engagement: consumer and ML service provider
▪ Three-stakeholder engagement: consumer, ML service provider, and ML infrastructure/platform provider
▪ Key issue
▪ How do we support contracts between the ML service provider and
other stakeholders? It is not just about performance!
▪ ML has several distinguishing attributes
4. Scenario: predictive maintenance in Base Transceiver Stations (BTS)
▪ Dynamic inference from IoT data about equipment and infrastructure
components in a BTS
5. Key research questions & our approach
▪ What are the key attributes for ML contracts?
▪ How would ML attributes and constraints be specified?
▪ How would ML-specific attributes/constraints be monitored and
evaluated?
▪ Approach
▪ Focus on ML-specific attributes
▪ Researchers have identified many attributes for ML models and systems
▪ Design ML contract specs suitable for cloud-native services
▪ Constraints, policies and monitoring reports
▪ Monitor ML attributes for contract monitoring
6. QoA4ML framework – important attributes for ML-specific contracts
▪ Focus on important categories
▪ Inference Accuracy; Reliability and Elasticity; Quality of Data; Security and Privacy; Fairness and Interpretability; Cost
7. QoA4ML specifications
▪ Decoupling attributes/constraints from policies
▪ Required attributes and their constraints can be changed and
updated at runtime
▪ Policies to check attributes and constraints can be implemented in
different ways
▪ Monitoring probes and other utilities supporting observability
▪ New probes for quality of data and ML models
▪ Need to be instrumented and deployed to capture runtime attributes
▪ Must be well integrated with common monitoring features
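A probe for quality of data could, for instance, report how complete each batch of input records is. The following is a minimal sketch under stated assumptions, not the QoA4ML probe API: the function names, the report fields, and the choice of completeness as the metric are all hypothetical.

```python
from typing import Iterable, Mapping, Optional, Sequence


def completeness(records: Sequence[Mapping[str, Optional[float]]],
                 fields: Iterable[str]) -> float:
    """Fraction of expected field values that are present (not None)."""
    fields = list(fields)
    total = len(records) * len(fields)
    if total == 0:
        return 0.0
    present = sum(1 for r in records for f in fields if r.get(f) is not None)
    return present / total


def data_quality_report(records, fields):
    """Wrap the metric in a small report (hypothetical field names)."""
    return {
        "attribute": "data_quality",
        "metric": "completeness",
        "value": completeness(records, fields),
        "sample_size": len(records),
    }


# Illustrative batch (made-up values): two of eight expected values are missing.
batch = [
    {"temperature": 31.5, "load": 0.72},
    {"temperature": None, "load": 0.80},
    {"temperature": 30.9, "load": 0.75},
    {"load": 0.69},
]
report = data_quality_report(batch, ["temperature", "load"])
print(report)
```

A report of this shape could then be exported through whatever monitoring system is already in place.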
9. Constraints for the BTS ML service
▪ Use terms in the QoA4ML specs
▪ Attributes and constraints can be changed
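To make the idea of changeable, JSON-based constraints concrete, here is a minimal sketch. It assumes an invented layout: the service name, the keys (`attribute`, `operator`, `value`, `unit`), and the thresholds are hypothetical and do not reproduce the actual QoA4ML specification terms.

```python
import json

# Hypothetical contract for the BTS ML service; keys and thresholds are
# illustrative only, not the real QoA4ML terms.
contract = {
    "service": "bts-predictive-maintenance",
    "constraints": [
        {"attribute": "response_time", "operator": "<=", "value": 0.5, "unit": "s"},
        {"attribute": "inference_accuracy", "operator": ">=", "value": 0.9, "unit": "ratio"},
        {"attribute": "data_quality", "operator": ">=", "value": 0.8, "unit": "ratio"},
    ],
}


def update_constraint(contract, attribute, value):
    """Return a copy of the contract with one constraint's threshold changed."""
    updated = json.loads(json.dumps(contract))  # deep copy via JSON round-trip
    for constraint in updated["constraints"]:
        if constraint["attribute"] == attribute:
            constraint["value"] = value
            return updated
    raise KeyError(attribute)


serialized = json.dumps(contract, indent=2)  # what would be stored or exchanged
relaxed = update_constraint(contract, "response_time", 1.0)
print(serialized)
```

Keeping the constraints as plain data is what allows them to be updated without touching the code that evaluates them.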
10. Example of policies for validating contract constraints
▪ Based on Rego
▪ Can load contract terms from JSON and compare them with runtime monitoring data
▪ Can be changed at runtime
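The policies on this slide are written in Rego. As a language-neutral illustration of the same logic (load contract terms from JSON, compare them with a runtime monitoring report, list the violations), here is a Python sketch. The contract layout, attribute names, and thresholds are assumptions made for this example.

```python
import json
import operator

# Comparison operators a constraint may use (hypothetical convention).
OPERATORS = {"<=": operator.le, ">=": operator.ge, "<": operator.lt, ">": operator.gt}

# Hypothetical contract terms; not the actual QoA4ML schema.
CONTRACT_JSON = """
{
  "constraints": [
    {"attribute": "response_time", "operator": "<=", "value": 0.5},
    {"attribute": "inference_accuracy", "operator": ">=", "value": 0.9}
  ]
}
"""


def find_violations(contract, report):
    """Return one entry per constraint that the runtime report breaks."""
    violations = []
    for c in contract["constraints"]:
        name = c["attribute"]
        if name not in report:
            continue  # attribute not covered by this monitoring report
        if not OPERATORS[c["operator"]](report[name], c["value"]):
            violations.append({
                "attribute": name,
                "observed": report[name],
                "required": f'{c["operator"]} {c["value"]}',
            })
    return violations


contract = json.loads(CONTRACT_JSON)
runtime_report = {"response_time": 0.8, "inference_accuracy": 0.93}  # made-up values
violations = find_violations(contract, runtime_report)
print(violations)
```

Because the thresholds live in the JSON document and not in the checking function, either side can be swapped at runtime independently of the other.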
11. Monitoring utilities and Observability Service
▪ Design for different engines to be used
▪ Reuse well-known monitoring systems
▪ Monitor ML-specific attributes
12. Current prototype
▪ QoA4ML Specs: initial version based on JSON
▪ Use OPA (https://www.openpolicyagent.org/) as engine
▪ Rego and JSON are used for policies, attributes and constraints
▪ QoA4ML Observability as microservices
▪ Using state-of-the-art monitoring tools like
Prometheus/Grafana
▪ Testing environments
▪ Edge and cloud infrastructures
▪ Source code is currently being pushed into:
▪ https://github.com/rdsea/QoA4ML
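With OPA as the engine, an observability service would typically submit a monitoring report as the `input` document to OPA's Data API (`POST /v1/data/<path>`) and read the decision from the `result` field of the response. The sketch below only builds the request and parses a response body, so it runs without a server. The policy path `qoa4ml/bts/violation` and the report fields are hypothetical; `localhost:8181` is OPA's default listening address.

```python
import json
import urllib.request


def build_opa_query(base_url, policy_path, report):
    """Build (but do not send) a POST request to OPA's Data API."""
    url = f"{base_url.rstrip('/')}/v1/data/{policy_path.strip('/')}"
    body = json.dumps({"input": report}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_opa_result(response_body):
    """Extract the decision; OPA omits "result" when the rule is undefined."""
    return json.loads(response_body).get("result")


# Hypothetical policy path and report values.
request = build_opa_query(
    "http://localhost:8181",
    "qoa4ml/bts/violation",
    {"response_time": 0.8},
)
print(request.full_url)
# With a running OPA server, the request could be sent with:
#   body = urllib.request.urlopen(request).read()
#   decision = parse_opa_result(body)
```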
13. Experiments
▪ Dynamic inference of the BTS power grid load
▪ LSTM, TensorFlow
▪ IoT data from BTS (several months)
▪ Trained in the cloud and exported to the edge (BTS-model-edge), and retrained several times in the cloud (BTS-model-cloud)
▪ Deployment
▪ Contracts:
▪ ResponseTime
▪ Inference Accuracy
▪ Data Quality
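Of the three contract attributes, ResponseTime is the simplest to capture at the point of inference. A minimal sketch of such a measurement follows, assuming a hypothetical `timed_inference` wrapper and a stand-in predictor in place of the LSTM model; the metric field names are illustrative only.

```python
import time


def timed_inference(predict, features):
    """Call a predictor and return its output with a response-time metric."""
    start = time.perf_counter()
    prediction = predict(features)
    elapsed = time.perf_counter() - start
    metric = {"attribute": "response_time", "value": elapsed, "unit": "s"}
    return prediction, metric


def fake_predict(features):
    # Stand-in for the real model: returns the mean of the inputs.
    return sum(features) / len(features)


prediction, metric = timed_inference(fake_predict, [0.7, 0.8, 0.9])
print(prediction, metric)
```

Measured this way, the value covers only the model call; network and broker latency seen by the consumer would have to be measured on the consumer side.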
14. Effect of edge and cloud serving platform deployment on ML contracts
▪ Both consumer and service are in the same edge; 3000 records per 15 minutes
▪ Both consumer and broker are in the same edge
▪ Broker is in the cloud
15. Impact of violation monitoring
All services are in the edge (except the observability service)
▪ Helps to detect outdated models in ML services: violations change when models are retrained
▪ Helps to see correlations among attributes: data quality and inference accuracy
16. Conclusions and future work
▪ QoA4ML is a framework to support ML service contracts
▪ Contract specifications (constraints and policies), tools and services
▪ QoA4ML benefits
▪ Establish contracts, moving toward continuous testing and observability of ML in production
▪ Support flexible contracts and policies, enabling reuse and integration with real-world ML services
▪ Future work
▪ Extending ML attributes and specifications; integration with cloud
service contracts; new probes and observability capabilities