Learning for Test Prioritization
An Industrial Case Study
Benjamin Busjaeger
Salesforce
Tao Xie
University of Illinois, Urbana-Champaign
ACM SIGSOFT FSE 2016
Large Scale Continuous Integration
● 2K+ modules
● 600K+ tests
● 2K+ engineers
● 500+ changes/day
● Main repository ... 350K+ source files
Motivation
Before commit
● Objective: Reject faulty changes
● Current: Run hand-picked tests
● Problem: Too complex for humans
● Desired: Select top-k tests likely to fail
After commit
● Objective: Detect test failures as quickly as possible
● Current: Parallelize (8000 concurrent VMs) and batch (1-1000 changes per run)
● Problem: Feedback time is constrained by capacity, and batching complicates fault assignment
● Desired: Prioritize all tests by likelihood of failure
Insight: Need Multiple Existing Techniques
● Heterogeneous languages: Java, PL/SQL, JavaScript, Apex, etc.
● Non-code artifacts: metadata and configuration
● New/recently-failed tests more likely to fail
Techniques:
• Test code coverage of change
• Textual similarity between test and change
• Test age and recent failures
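One way to picture the textual-similarity signal is a bag-of-words cosine similarity between the text of a change and the text of a test. The slides do not say which similarity measure or tokenizer was used, so the `tokenize` and `cosine_similarity` helpers below are hypothetical and only sketch the idea with the Python standard library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens, breaking camelCase identifiers.

    Hypothetical tokenizer; the slides do not describe the real one.
    """
    # Insert a space at lower-to-upper transitions: "parseOrder" -> "parse Order".
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", text)
    return [token.lower() for token in re.findall(r"[A-Za-z]+", spaced)]


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the term-frequency vectors of two texts."""
    freq_a = Counter(tokenize(text_a))
    freq_b = Counter(tokenize(text_b))
    # Dot product over the terms the two texts share.
    dot = sum(freq_a[t] * freq_b[t] for t in freq_a.keys() & freq_b.keys())
    norm_a = math.sqrt(sum(v * v for v in freq_a.values()))
    norm_b = math.sqrt(sum(v * v for v in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty text is similar to nothing.
    return dot / (norm_a * norm_b)


# Illustrative inputs: a test name and an identifier touched by a change.
similarity = cosine_similarity("OrderServiceTest", "orderService")
```

Because it works on raw text, a measure like this applies equally to Java, PL/SQL, JavaScript, Apex, and non-code artifacts such as metadata and configuration.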
Insight: Need Scalable Techniques
● Complex integration tests: concurrent execution
● Large data volume: 500+ changes, 50M+ test runs, 20TB+ results per day
→ Our approach: Periodically collect coarse-grained measurements
• Test code coverage of change
• Textual similarity between test and change
• Test age and recent failures
New Approach: Learning for Test Prioritization
● Formal model for learning from past test results
● Input: Change and Test
● Features:
• Test code coverage of change
• Textual similarity between test and change
• Test age and recent failures
● Output: Ranking
→ Implementation currently in pilot use at Salesforce
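The idea of learning a ranking from past test results can be sketched as follows. The slides do not name the learning algorithm, so this example swaps in a simple pairwise perceptron-style ranker as a stand-in: whenever a failed test scores no higher than a passed test for the same change, the weights move toward the failed test's features. The function names, the feature order (coverage, similarity, recent failure), and all data values are made up for illustration.

```python
def score(weights, features):
    """Linear score: higher means more likely to fail."""
    return sum(w * x for w, x in zip(weights, features))


def train_pairwise(history, epochs=20, learning_rate=0.1):
    """Learn weights from past changes.

    history: list of changes; each change is a list of (features, failed) pairs.
    This is an assumed stand-in learner, not the one used in the study.
    """
    num_features = len(history[0][0][0])
    weights = [0.0] * num_features
    for _ in range(epochs):
        for change in history:
            failed = [f for f, did_fail in change if did_fail]
            passed = [f for f, did_fail in change if not did_fail]
            for xf in failed:
                for xp in passed:
                    # Misordered pair: nudge weights toward the failed test.
                    if score(weights, xf) <= score(weights, xp):
                        weights = [
                            w + learning_rate * (a - b)
                            for w, a, b in zip(weights, xf, xp)
                        ]
    return weights


def rank_tests(weights, tests):
    """tests: dict of test name -> features. Most likely to fail comes first."""
    return sorted(tests, key=lambda name: score(weights, tests[name]), reverse=True)


# Made-up history: features are (coverage, similarity, recent_failure).
history = [
    [((1.0, 0.8, 1.0), True), ((0.0, 0.1, 0.0), False), ((0.2, 0.3, 0.0), False)],
    [((0.9, 0.7, 0.0), True), ((0.1, 0.2, 0.0), False)],
]
weights = train_pairwise(history)
ranking = rank_tests(
    weights,
    {
        "t_covering": (0.9, 0.6, 0.0),
        "t_unrelated": (0.0, 0.1, 0.0),
        "t_recently_failed": (0.1, 0.1, 1.0),
    },
)
```

Taking the first k entries of `ranking` corresponds to test selection before commit, while running the whole list in order corresponds to prioritization after commit.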
Empirical Study: Setup
● Test results of 45K tests over a ~3-month period
● In this period, 711 changes with ≥1 failure
• 440 for training
• 271 for testing
● Collected once for each test:
• Test code coverage
• Test text content
● Collected continuously:
• Test age and recent failures
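The continuously collected signals can be derived from a test's run history. The sketch below is a minimal illustration: the `history_features` helper, the 7-day window, and the sample dates are all assumptions, since the slides do not define how "age" or "recent" is measured.

```python
from datetime import date


def history_features(created, runs, today, window_days=7):
    """Return (age_in_days, recent_failure_count) for one test.

    created: date the test was added.
    runs: list of (run_date, passed) pairs.
    window_days: assumed size of the "recent" window.
    """
    age_days = (today - created).days
    recent_failures = sum(
        1
        for run_date, passed in runs
        if not passed and 0 <= (today - run_date).days < window_days
    )
    return age_days, recent_failures


# Made-up history for a single test.
features = history_features(
    created=date(2016, 1, 1),
    runs=[
        (date(2016, 1, 10), False),  # failure outside the window
        (date(2016, 1, 29), True),   # recent pass
        (date(2016, 1, 30), False),  # recent failure
    ],
    today=date(2016, 1, 31),
)
```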
Results: Test selection (before commit)
● New approach achieves highest average recall at all cutoff points
• 50% of failures detected with top 0.2% of tests
• 75% of failures detected with top 3% of tests
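Recall at a cutoff can be computed as the share of a change's failing tests that fall within the top fraction of the ranking. The sketch below assumes that definition; the `recall_at_cutoff` helper and the sample data are illustrative and not taken from the study.

```python
import math


def recall_at_cutoff(ranking, failed, fraction):
    """Fraction of failing tests found in the top `fraction` of the ranking.

    ranking: test names, most likely to fail first.
    failed: set of test names that actually failed for this change.
    """
    if not failed:
        raise ValueError("recall is undefined for a change with no failures")
    # Small epsilon guards against floating-point error before rounding up.
    cutoff = max(1, math.ceil(fraction * len(ranking) - 1e-9))
    selected = set(ranking[:cutoff])
    return len(selected & failed) / len(failed)


# Made-up example: 8 tests, 2 of which failed.
ranking = ["t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7"]
failed = {"t1", "t6"}
recall_top_quarter = recall_at_cutoff(ranking, failed, 0.25)
recall_all = recall_at_cutoff(ranking, failed, 1.0)
```

Averaging this value over all evaluated changes at each cutoff gives an average-recall curve of the kind the slide reports.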
Results: Test prioritization (after commit)
● New approach achieves highest APFD with least variance
• Median: 99%
• Average: 85%
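APFD (Average Percentage of Faults Detected) is commonly defined as 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where n is the number of tests, m the number of faults, and TF_i the 1-based position of the first test that reveals fault i. The sketch below uses that standard formula and assumes each failing test counts as one fault; the slides do not spell out the exact variant used.

```python
def apfd(ranking, failed):
    """Average Percentage of Faults Detected for one prioritized test run.

    ranking: test names in execution order.
    failed: set of failing test names; each is treated as one fault (assumption).
    """
    n = len(ranking)
    m = len(failed)
    # 1-based positions of the failing tests in the ranking.
    positions = [i for i, name in enumerate(ranking, start=1) if name in failed]
    if m == 0 or len(positions) != m:
        raise ValueError("every failing test must appear once in the ranking")
    return 1 - sum(positions) / (n * m) + 1 / (2 * n)


# Made-up example with four tests and one failure.
best_case = apfd(["a", "b", "c", "d"], {"a"})   # failure runs first
worst_case = apfd(["a", "b", "c", "d"], {"d"})  # failure runs last
```

A higher value means failures surface earlier in the run, which is the goal of prioritization after commit.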
Results: Side Benefits
Invalid Assignments
Flaky Tests
Summary
● Main insights gained in conducting test prioritization in industry
● Novel learning-based approach to test prioritization
● Implementation currently in pilot use at Salesforce
● Empirical evaluation using a large Salesforce dataset
● Ongoing work: add features, word2vec, deep learning, cost-aware
Thank you
Q&A


Editor's Notes

  1. coverage: periodically collected (slightly out-of-date), dynamic instrumentation (JaCoCo, no rebuild), approximate proximity