1. Robust Testing Strategies for Machine Learning Models
Agile Testing Alliance Hyderabad Meet
TMI Networks, Hyderabad, 22 July 2023
Tilottama Goswami, Ph.D. (University of Hyderabad)
Professor, Department of Information Technology
Vasavi College of Engineering
Hyderabad, INDIA
3. Motivation
Fourth Industrial Revolution
o Digital Transformation
o AI & Automation – Intelligent Systems
o Data Usage – Privacy, Security, Ethics
o Social Transformation – Quality of Life
These trends drive the demand for Robust Testing Strategies:
o Performance
o Security
o Reliability
o Seamless Integration & Deployment
o Building Trust
4. Impact of Industrial Revolution 4.0 in Real-World Scenarios: Robotic Process Automation and Machine Learning
5. Robotic Process Automation vs. Machine Learning

Robotic Process Automation:
o Repetitive, rule-based tasks on structured data
o Pre-programmed rules; not adaptable to handle variations
o No cognitive capabilities
o Struggles with unstructured data (audio/image/text)
o Typical use: data entry & transactional tasks

Machine Learning:
o Learns from data to make predictions
o Handles unstructured data; makes predictions on new, unseen data
o Adaptable and evolves with changes – flexible
o Complex cognitive tasks such as reasoning
o Unstructured-data applications: image recognition, language translation
o Typical use: sentiment analysis & pattern recognition
9. Real-World Examples of ML Model Failures
o IBM Watson's Cancer Treatment Recommendations
o Amazon's AI Recruitment Tool
o Google Photos' Racist Labelling
o Tesla's Autopilot Accidents
o Microsoft's Tay Chatbot
10. Real-World Examples of ML Model Failures – Root Causes
o IBM Watson's Cancer Treatment Recommendations – erroneous recommendations
o Amazon's AI Recruitment Tool – bias against female candidates
o Google Photos' Racist Labelling – limitation of training data and biased training; lacked proper testing and validation
o Tesla's Autopilot Accidents – real-time decision making in complex environments
o Microsoft's Tay Chatbot – learnt offensive and inappropriate conversations from tweets
11. Challenges
A. IBM Watson's Cancer Treatment Recommendations
1. Challenges with training data and the complexity of cancer treatment
2. Interpretation of unstructured data and limited contextual understanding
3. Lessons learned and improvements made
B. Microsoft's Tay Chatbot
1. Vulnerability to manipulation and lack of contextual understanding
2. Rapid learning and amplification of bias
3. Importance of human oversight and responsibility
12. Challenges
C. Google Photos' Racist Labeling / Amazon's AI Recruitment Tool
1. Biased training data and insufficient testing
2. Limited diversity in development teams
3. Ethical considerations and response to the incident
D. Tesla's Autopilot Accidents
1. Overreliance on the Autopilot system and inattentive driving
2. System limitations and edge cases
3. Regulatory and legal challenges
13. Key Factors for Robust ML Model Testing
1. Bias-Variance Trade-off (Overfitting/Underfitting)
2. Comprehensive Training Data
3. Hyperparameter Tuning
4. Validation and Evaluation Techniques
5. Adversarial Testing
6. Continuous Monitoring and Maintenance
15. Bias-Variance
The goal of any predictive modelling machine learning algorithm is to achieve low bias and low variance.
Bias refers to the simplifying assumptions made by a model to make the target function easier to learn.
Variance is the amount that the estimate of the target function will change if different training data were used.
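The trade-off can be seen numerically by fitting polynomials of different flexibility to the same noisy data. This is an illustrative sketch, not from the slides: the sine target, noise level, and polynomial degrees are all my choices, and the fitting is plain NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.1, x.size)  # noisy samples of sin
x_test = np.linspace(0.02, 0.98, 30)                      # held-out points
y_test = np.sin(2 * np.pi * x_test)                       # noise-free target

def train_test_mse(degree):
    """Fit a polynomial of the given degree; return (train MSE, test MSE)."""
    coeffs = np.polyfit(x, y, degree)
    err_train = np.mean((np.polyval(coeffs, x) - y) ** 2)
    err_test = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return err_train, err_test

lo_train, lo_test = train_test_mse(1)  # high bias: a straight line underfits the sine
hi_train, hi_test = train_test_mse(9)  # flexible model: tracks the curve (and some noise)
```

The degree-1 model keeps a large error on both sets (bias), while the degree-9 model drives training error down by also fitting noise; its risk is that the fitted curve changes a lot if the training sample changes (variance).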
16. Address BIAS in ML (High Bias = Underfitting)
Goal: LOW BIAS – Building Ethical and Trustworthy AI Systems; Promote Fairness and Inclusivity
o Diverse and Representative Data
o Feature Engineering
o Regularization & Post Processing
o Fairness-aware Algorithms
o Bias-Aware Evaluation

Address VARIANCE in ML (High Variance = Overfitting)
Goal: LOW VARIANCE – Stable and Robust AI Systems; Generalization (while retaining adequate variance to avoid underfitting)
o Feature Engineering
o Regularization & Post Processing
o Cross Validation
o Ensemble Methods
o Early Stopping of Training
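One of the variance-reduction levers listed above, regularization, can be sketched with closed-form ridge regression. This is a minimal NumPy illustration under assumed toy data; the lambda values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 10))              # 40 samples, 10 features
w_true = np.zeros(10)
w_true[:3] = [2.0, -1.0, 0.5]              # only 3 features actually matter
y = X @ w_true + rng.normal(0.0, 0.3, 40)  # noisy targets

def ridge(lam):
    """Closed-form ridge solution: (X^T X + lam*I)^-1 X^T y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# increasing the penalty shrinks the weight vector toward zero,
# trading a little bias for lower variance
norms = [float(np.linalg.norm(ridge(lam))) for lam in (0.0, 1.0, 10.0, 100.0)]
```

The shrinkage is monotone: each larger penalty produces a smaller coefficient norm, which is exactly the stabilizing effect that reduces sensitivity to the particular training sample.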
17. Comprehensive Training Data
1. Importance of diverse, representative, and unbiased training data
2. Data quality, data augmentation, and addressing class imbalance
3. Rigorous Hyperparameter Tuning
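Addressing class imbalance, mentioned above, is often done by resampling. A minimal sketch of random oversampling of the minority class, with a made-up imbalanced dataset (the 95/5 split and feature matrix are assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 4))      # toy feature matrix
y = np.array([0] * 95 + [1] * 5)   # severe class imbalance: 95 vs 5

minority = np.flatnonzero(y == 1)
majority = np.flatnonzero(y == 0)
# random oversampling: resample minority indices with replacement
# until both classes are the same size
extra = rng.choice(minority, size=majority.size - minority.size, replace=True)
idx = np.concatenate([majority, minority, extra])
rng.shuffle(idx)
X_bal, y_bal = X[idx], y[idx]      # balanced training set: 95 of each class
```

In practice, oversampling is applied only to the training split (never the evaluation split) so metrics still reflect the real class distribution.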
19. Validation & Evaluation Techniques
1. Cross-validation and holdout validation for assessing model performance
2. Metrics selection, including accuracy, precision, recall, F1-score, and AUC-ROC
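The metrics named above follow directly from confusion-matrix counts. A minimal sketch with hypothetical labels and predictions (the two arrays are made up for illustration):

```python
import numpy as np

y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])  # hypothetical ground truth
y_pred = np.array([1, 1, 0, 0, 1, 0, 0, 0, 0, 0])  # hypothetical predictions

tp = int(np.sum((y_pred == 1) & (y_true == 1)))  # true positives
fp = int(np.sum((y_pred == 1) & (y_true == 0)))  # false positives
fn = int(np.sum((y_pred == 0) & (y_true == 1)))  # false negatives
tn = int(np.sum((y_pred == 0) & (y_true == 0)))  # true negatives

accuracy = (tp + tn) / y_true.size
precision = tp / (tp + fp)  # of predicted positives, how many are correct
recall = tp / (tp + fn)     # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
```

On imbalanced data, accuracy alone is misleading (a model predicting all zeros here would score 60%), which is why precision, recall, and F1 are evaluated together.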
20. Adversarial Testing
1. Uncovering vulnerabilities and weaknesses in ML models
2. Crafting deceptive inputs and evaluating model robustness
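For a linear scorer, "crafting deceptive inputs" reduces to a one-line FGSM-style step. This toy sketch is my own illustration (the weights, input, and epsilon are all assumptions, and the linear model stands in for a real classifier): a small, bounded perturbation flips the prediction.

```python
import numpy as np

# a fixed linear classifier: predict 1 when w.x + b > 0 (toy stand-in for a model)
w = np.array([1.0, -2.0, 0.5])
b = 0.1

def predict(x):
    return int(w @ x + b > 0)

x = np.array([0.5, 0.0, 0.2])  # clean input, classified as 1
eps = 0.3                      # perturbation budget (L-infinity norm)

# FGSM-style attack: the gradient of the score w.r.t. x is just w,
# so move each coordinate by eps against the sign of the gradient
x_adv = x - eps * np.sign(w)
# no coordinate moved by more than eps, yet the prediction flips
```

Robustness testing runs such perturbed inputs through the model and measures how often predictions flip within a given budget; a robust model should hold its prediction for small eps.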
21. Continuous Monitoring & Maintenance
1. Importance of ongoing model performance monitoring
2. Regular updates, retraining, and version control
3. Human-in-the-loop feedback
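Ongoing monitoring often starts with a cheap data-drift signal on each input feature. A minimal sketch using a standardized mean difference between training-time and production data; the distributions, the statistic, and the alert threshold are all illustrative assumptions, not a prescribed method.

```python
import numpy as np

rng = np.random.default_rng(3)
ref = rng.normal(0.0, 1.0, 1000)   # feature distribution seen at training time
live = rng.normal(0.8, 1.0, 1000)  # production data whose mean has shifted

def drift_score(ref, live):
    """Standardized mean difference: a cheap one-number drift signal."""
    pooled = np.sqrt((ref.var() + live.var()) / 2.0)
    return abs(ref.mean() - live.mean()) / pooled

THRESHOLD = 0.5                               # hypothetical alerting threshold
alert = drift_score(ref, live) > THRESHOLD    # would trigger a retraining review
```

In a real pipeline this check runs on a schedule per feature, and an alert feeds the retraining / version-control loop described above rather than silently retraining the model.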