Learning for Testing Prioritization- an industrial case study

Salesforce Engineering
Salesforce EngineeringSalesforce Engineering
Learning for Test Prioritization
An Industrial Case Study
Benjamin Busjaeger
Salesforce
Tao Xie
University of Illinois,
Urbana-Champaign
ACM SIGSOFT FSE 2016
Large Scale Continuous Integration
2K+
Modules
600K+
Tests
2K+
Engineers
500+
Changes/Day
Main repository ...
350K+
Source files
Motivation
Before Commit After Commit
Objective Reject faulty changes Detect test failures as quickly as possible
Current Run hand-picked tests Parallelize (8000 concurrent VMs) &
batch (1-1000 changes per run)
Problem Too complex for human Feedback time constrained by capacity &
batching complicates fault assignment
Desired Select top-k tests likely to fail Prioritize all tests by likelihood of failure
Insight: Need Multiple Existing Techniques
● Heterogeneous languages: Java, PL/SQL, JavaScript, Apex, etc.
●Non-code artifacts: metadata and configuration
●New/recently-failed tests more likely to fail
Test code coverage of change Textual similarity between test and change Test age and recent failures
Insight: Need Scalable Techniques
●Complex integration tests: concurrent execution
●Large data volume: 500+ changes, 50M+ test runs, 20TB+ results per day
→ Our approach: Periodically collect coarse-grained measurements
Test code coverage of change Textual similarity between test and change Test age and recent failures
Formal model for
learning from
past test results
New Approach: Learning for Test Prioritization
Test code coverage of change Textual similarity between test and change Test age and recent failures
Change
Test
Ranking
→ Implementation currently in pilot use at Salesforce
Empirical Study: Setup
●Test results of 45K tests for ~3 month period
●In this period, 711 changes with ≥1 failure
• 440 for training
• 271 for testing
●Collected once for each test:
• Test code coverage
• Test text content
●Collected continuously:
• Test age and recent failures
● New approach achieves highest
average recall at all cutoff points
• 50% failures detected with top 0.2%
• 75% failures detected with top 3%
Results: Test selection (before commit)
New Approach
● New approach achieves highest
APFD with least variance
• Median: 85%
• Average: 99%
Results: Test prioritization (after commit)
New Approach
Results: Side Benefits
Invalid Assignments
Flaky Tests
Summary
● Main insights gained in conducting test prioritization in industry
● Novel learning-based approach to test prioritization
● Implementation currently in pilot use at Salesforce
● Empirical evaluation using a large Salesforce dataset
● Ongoing work: add features, word2vec, deep learning, cost-aware
Thank you
Q&A
Summary
● Main insights gained in conducting test prioritization in industry
● Novel learning-based approach to test prioritization
● Implementation currently in pilot use at Salesforce
● Empirical evaluation using a large Salesforce dataset
● Ongoing work: add features, word2vec/deep learning, cost-aware
1 of 13

More Related Content

Viewers also liked(17)

1   gerir o teu dinheiro1   gerir o teu dinheiro
1 gerir o teu dinheiro
maladigitalmourao309 views
Summit2013   choi - wise kb-introdSummit2013   choi - wise kb-introd
Summit2013 choi - wise kb-introd
Semantic Technology Institute International1.6K views
nature of learningnature of learning
nature of learning
vigneswari116199663K views
Best Philosophy of EducationBest Philosophy of Education
Best Philosophy of Education
Kaplan University12.9K views
The learning process- Fundamentals of InstructionThe learning process- Fundamentals of Instruction
The learning process- Fundamentals of Instruction
Holmes Aviation Training12.9K views
The learning processThe learning process
The learning process
Kerry Harrison13.5K views
Drive test learningDrive test learning
Drive test learning
Kshitij Verma45.2K views
 Learning Process Theories  Learning Process Theories
Learning Process Theories
Malyn Singson14.2K views
Learning presentationLearning presentation
Learning presentation
Lara Melissa Cariño13.3K views
Teaching and Learning ProcessTeaching and Learning Process
Teaching and Learning Process
Jaser Daher354.2K views
Philosophy of educationPhilosophy of education
Philosophy of education
ajtame48.6K views
Philosophies of educationPhilosophies of education
Philosophies of education
Sherwin Balbuena335.1K views
Major philosophies in educationMajor philosophies in education
Major philosophies in education
Bogs De Castro198.5K views
Philosophy of educationPhilosophy of education
Philosophy of education
David Deubelbeiss110K views

More from Salesforce Engineering(20)

Scaling HBase for Big DataScaling HBase for Big Data
Scaling HBase for Big Data
Salesforce Engineering2.1K views
Predictive System Performance Data AnalysisPredictive System Performance Data Analysis
Predictive System Performance Data Analysis
Salesforce Engineering639 views
Apache HBase State of the ProjectApache HBase State of the Project
Apache HBase State of the Project
Salesforce Engineering326 views
Hit the Trail with TrailheadHit the Trail with Trailhead
Hit the Trail with Trailhead
Salesforce Engineering4.1K views
HBase/PHOENIX @ ScaleHBase/PHOENIX @ Scale
HBase/PHOENIX @ Scale
Salesforce Engineering221 views
Scaling up data science applicationsScaling up data science applications
Scaling up data science applications
Salesforce Engineering342 views
Containers and Security for DevOpsContainers and Security for DevOps
Containers and Security for DevOps
Salesforce Engineering337 views
Monitoring @ Scale in SalesforceMonitoring @ Scale in Salesforce
Monitoring @ Scale in Salesforce
Salesforce Engineering455 views
Performance Tuning with XHProfPerformance Tuning with XHProf
Performance Tuning with XHProf
Salesforce Engineering850 views
Koober Preduction IO PresentationKoober Preduction IO Presentation
Koober Preduction IO Presentation
Salesforce Engineering1.7K views
Finding Security Issues Fast!Finding Security Issues Fast!
Finding Security Issues Fast!
Salesforce Engineering964 views
MicroservicesMicroservices
Microservices
Salesforce Engineering790 views
Global State Management of Micro ServicesGlobal State Management of Micro Services
Global State Management of Micro Services
Salesforce Engineering582 views
The Future of HbaseThe Future of Hbase
The Future of Hbase
Salesforce Engineering537 views

Recently uploaded(20)

Codes and Conventions.pptxCodes and Conventions.pptx
Codes and Conventions.pptx
IsabellaGraceAnkers6 views
SWM L15-L28_drhasan (Part 2).pdfSWM L15-L28_drhasan (Part 2).pdf
SWM L15-L28_drhasan (Part 2).pdf
MahmudHasan74787028 views
IWISS Catalog 2022IWISS Catalog 2022
IWISS Catalog 2022
Iwiss Tools Co.,Ltd23 views
CHI-SQUARE ( χ2) TESTS.pptxCHI-SQUARE ( χ2) TESTS.pptx
CHI-SQUARE ( χ2) TESTS.pptx
ssusera597c514 views
SWM L1-L14_drhasan (Part 1).pdfSWM L1-L14_drhasan (Part 1).pdf
SWM L1-L14_drhasan (Part 1).pdf
MahmudHasan74787044 views
Deutsch CrimpingDeutsch Crimping
Deutsch Crimping
Iwiss Tools Co.,Ltd15 views
Solar PVSolar PV
Solar PV
Iwiss Tools Co.,Ltd11 views
PlumbingPlumbing
Plumbing
Iwiss Tools Co.,Ltd11 views
SEMI CONDUCTORSSEMI CONDUCTORS
SEMI CONDUCTORS
pavaniaalla200516 views
String.pptxString.pptx
String.pptx
Ananthi Palanisamy47 views
Sanitary Landfill- SWM.pptxSanitary Landfill- SWM.pptx
Sanitary Landfill- SWM.pptx
Vinod Nejkar5 views
Wire RopeWire Rope
Wire Rope
Iwiss Tools Co.,Ltd8 views
Object Oriented Programming with JAVAObject Oriented Programming with JAVA
Object Oriented Programming with JAVA
Demian Antony D'Mello58 views
IEC 61850 Technical Overview.pdfIEC 61850 Technical Overview.pdf
IEC 61850 Technical Overview.pdf
ssusereeea975 views
What is Whirling Hygrometer.pdfWhat is Whirling Hygrometer.pdf
What is Whirling Hygrometer.pdf
IIT KHARAGPUR 10 views
Wire Cutting & StrippingWire Cutting & Stripping
Wire Cutting & Stripping
Iwiss Tools Co.,Ltd6 views

Learning for Testing Prioritization- an industrial case study

  • 1. Learning for Test Prioritization An Industrial Case Study Benjamin Busjaeger Salesforce Tao Xie University of Illinois, Urbana-Champaign ACM SIGSOFT FSE 2016
  • 2. Large Scale Continuous Integration 2K+ Modules 600K+ Tests 2K+ Engineers 500+ Changes/Day Main repository ... 350K+ Source files
  • 3. Motivation Before Commit After Commit Objective Reject faulty changes Detect test failures as quickly as possible Current Run hand-picked tests Parallelize (8000 concurrent VMs) & batch (1-1000 changes per run) Problem Too complex for human Feedback time constrained by capacity & batching complicates fault assignment Desired Select top-k tests likely to fail Prioritize all tests by likelihood of failure
  • 4. Insight: Need Multiple Existing Techniques ● Heterogeneous languages: Java, PL/SQL, JavaScript, Apex, etc. ●Non-code artifacts: metadata and configuration ●New/recently-failed tests more likely to fail Test code coverage of change Textual similarity between test and change Test age and recent failures
  • 5. Insight: Need Scalable Techniques ●Complex integration tests: concurrent execution ●Large data volume: 500+ changes, 50M+ test runs, 20TB+ results per day → Our approach: Periodically collect coarse-grained measurements Test code coverage of change Textual similarity between test and change Test age and recent failures
  • 6. Formal model for learning from past test results New Approach: Learning for Test Prioritization Test code coverage of change Textual similarity between test and change Test age and recent failures Change Test Ranking → Implementation currently in pilot use at Salesforce
  • 7. Empirical Study: Setup ●Test results of 45K tests for ~3 month period ●In this period, 711 changes with ≥1 failure • 440 for training • 271 for testing ●Collected once for each test: • Test code coverage • Test text content ●Collected continuously: • Test age and recent failures
  • 8. ● New approach achieves highest average recall at all cutoff points • 50% failures detected with top 0.2% • 75% failures detected with top 3% Results: Test selection (before commit) New Approach
  • 9. ● New approach achieves highest APFD with least variance • Median: 85% • Average: 99% Results: Test prioritization (after commit) New Approach
  • 10. Results: Side Benefits Invalid Assignments Flaky Tests
  • 11. Summary ● Main insights gained in conducting test prioritization in industry ● Novel learning-based approach to test prioritization ● Implementation currently in pilot use at Salesforce ● Empirical evaluation using a large Salesforce dataset ● Ongoing work: add features, word2vec, deep learning, cost-aware
  • 13. Summary ● Main insights gained in conducting test prioritization in industry ● Novel learning-based approach to test prioritization ● Implementation currently in pilot use at Salesforce ● Empirical evaluation using a large Salesforce dataset ● Ongoing work: add features, word2vec/deep learning, cost-aware

Editor's Notes

  1. coverage: periodically collected (slightly out-of-date), dynamic instrumentation (JaCoCo, no rebuild), approximate proximity