CARFAST:
ACHIEVING HIGHER STATEMENT
     COVERAGE FASTER

              Sangmin Park,
              Ishtiaque Hussain,
              Christoph Csallner,
              Kunal Taneja,
              B. M. Mainul Hossain,
              Mark Grechanik,
              Chen Fu, Qing Xie
CarFast Implementation Evaluation Conclusion

Motivation - Achieving High Coverage

   Coverage
     Degree to which a program has been tested
     Measure of confidence



   Widely used in industry
     Avionics industry standard, DO-254 and DO-178B
     Automotive industry standard, IEC 61508

     Other organizations




 Motivation - Achieving Coverage Fast

Current approaches
   [Figure: current approaches, with a "Timeout" marker]

Goal: Achieve high coverage faster
  Achieving high coverage fast is difficult
     Complex programs
     Too many test inputs
      (e.g., Renters Insurance Program with 78M customer profiles)


High level approach

   Observation (study we performed)
       80% of statements are covered by 20% of branches
        (we call those branches "profitable")


   Intuition
       Covering profitable branches fast leads to
        high statement coverage quickly

   High level approach
     Use static analysis to find profitable branches
     Select inputs that direct program execution towards
      profitable branches
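
The observation above can be sketched as a small selection routine. This is only an illustration, assuming per-branch statement counts are already available from static analysis; the function name `profitable_branches`, the `target_percent` parameter, and the sample counts are hypothetical and do not come from CarFast.

```python
def profitable_branches(stmt_counts, target_percent=80):
    """Pick the largest branches until they hold at least
    `target_percent` percent of all statements."""
    total = sum(stmt_counts.values())
    # Largest branches first.
    ranked = sorted(stmt_counts.items(), key=lambda kv: kv[1], reverse=True)
    chosen, covered = [], 0
    for branch, count in ranked:
        # Integer comparison avoids floating-point rounding at the threshold.
        if covered * 100 >= target_percent * total:
            break
        chosen.append(branch)
        covered += count
    return chosen, covered / total


# Hypothetical counts: two large branches and eight small ones (1000 statements).
counts = {"b1": 500, "b2": 300, **{f"s{i}": 25 for i in range(8)}}
chosen, fraction = profitable_branches(counts)
print(chosen, fraction)
```

With these made-up counts, 2 of the 10 branches hold 80% of the statements, which is the shape of distribution the observation describes.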

    CarFast – Illustrative Example

   Input: i1 = 20 and i2 = 20

   void foo (int i1, int i2) {
     if (i1 == 10) {
       … // branch 1: 300 statements
     } else if (i2 == 50) {
       … // branch 2: 600 statements
     } else {
       … // branch 3: 100 statements
       if (i1 == 20) {
         if (i2 == 30) { … }
       }
     }
   }

   [Figure: decision tree. i1==10? T: 300 stmts, F: i2==50? T: 600 stmts, F: 100 stmts]

    CarFast – Illustrative Example

   Same program and input as above (i1 = 20 and i2 = 20):

     DFS search: up to 10% statement coverage
     Branch 2: up to 70% statement coverage

    CarFast – Algorithm

   Running example: foo(i1, i2) above, with input i1 = 20 and i2 = 20

   Step 1: Rank branches
   Step 2: Select initial input
   Step 3: Select next input

    CarFast – Algorithm Step 1: Rank branches

   • Counts (transitively) the statements each branch contains
   • Resolves method calls
   • Ranks branches by statement count

   Ranking for foo(i1, i2):

     Rank  Branch  # Stmt
        1       2     600
        2       1     300
        3       3     100
        4       …       …
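
The ranking step can be sketched as follows. This is a minimal illustration, not CarFast's static analysis: the `Branch` class, the `method_sizes` table, and the way the 600 and 100 statements are split between a branch body, a called method, and a nested branch are all hypothetical. Method sizes are assumed to be precomputed, so recursion between methods is not handled.

```python
from dataclasses import dataclass, field


@dataclass
class Branch:
    name: str
    own_stmts: int                               # statements directly in the branch body
    nested: list = field(default_factory=list)   # branches nested inside this one
    calls: list = field(default_factory=list)    # names of methods called in the body


def transitive_stmts(branch, method_sizes):
    """Statements in the branch, the methods it calls, and its nested branches."""
    total = branch.own_stmts
    total += sum(method_sizes.get(m, 0) for m in branch.calls)
    total += sum(transitive_stmts(b, method_sizes) for b in branch.nested)
    return total


def rank_branches(branches, method_sizes):
    """Order branches from most to fewest transitively contained statements."""
    sized = [(transitive_stmts(b, method_sizes), b.name) for b in branches]
    return [(name, size) for size, name in sorted(sized, reverse=True)]


# Hypothetical breakdown that reproduces the totals in the table above.
method_sizes = {"helper": 500}
branches = [
    Branch("branch 1", 300),
    Branch("branch 2", 100, calls=["helper"]),
    Branch("branch 3", 90, nested=[Branch("inner", 10)]),
]
ranking = rank_branches(branches, method_sizes)
print(ranking)
```

The point of counting transitively is that a branch with a short body can still rank first if it calls into large methods, as `branch 2` does in this made-up breakdown.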

    CarFast – Algorithm Step 2: Select a random input

   • Selects a random input from the input database

   Input 1: i1 = 20 and i2 = 20

     Rank  Branch  # Stmt
        1       2     600
        2       1     300
        3       3     100
        4       …       …

     Input database
       i1   i2
        5   50
       20   20
       30   30
       40   40

    CarFast – Algorithm Step 3: Select next input from trace

   • Executes the program with the input to collect the path condition
   • Modifies the path condition to cover higher-ranked branches
   • Queries the input database with the modified condition
   • Selects a random input if no input satisfies the condition

   Input 1: i1 = 20 and i2 = 20
   C:  (i1!=10)&&(i2!=50)&&(i1==20)&&(i2!=30)
   C': (i1!=10)&&(i2==50)
   Input 2: i1 = 5 and i2 = 50
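
A rough sketch of this step on the foo example follows. It is a simplification under stated assumptions: `trace` hand-codes the branch decisions of foo instead of using a concolic engine, `GAIN` is a hand-written stand-in for the branch ranking, the database is a plain list rather than an indexed store, and already covered branches are not tracked. All names are hypothetical.

```python
import operator

OPS = {"==": operator.eq, "!=": operator.ne}
NEGATE = {"==": "!=", "!=": "=="}


def trace(i1, i2):
    """Path condition of foo(i1, i2) as a list of (variable, op, constant)."""
    path = [("i1", "==" if i1 == 10 else "!=", 10)]
    if i1 == 10:
        return path                                      # branch 1
    path.append(("i2", "==" if i2 == 50 else "!=", 50))
    if i2 == 50:
        return path                                      # branch 2
    path.append(("i1", "==" if i1 == 20 else "!=", 20))  # inside branch 3
    if i1 == 20:
        path.append(("i2", "==" if i2 == 30 else "!=", 30))
    return path


# Statements reached when a decision takes the given outcome (from the ranking).
GAIN = {("i1", "==", 10): 300, ("i2", "==", 50): 600}


def satisfies(row, condition):
    return all(OPS[op](row[var], const) for var, op, const in condition)


def next_input(path, database):
    """Flip one constraint of the path condition, preferring the flip that
    leads to the highest-ranked branch, and query the database with it."""
    candidates = []
    for k, (var, op, const) in enumerate(path):
        flipped = (var, NEGATE[op], const)
        candidates.append((GAIN.get(flipped, 0), path[:k] + [flipped]))
    candidates.sort(key=lambda c: c[0], reverse=True)
    for _, condition in candidates:
        matches = [row for row in database if satisfies(row, condition)]
        if matches:
            return condition, matches[0]
    return None, None  # caller would fall back to a random input


database = [
    {"i1": 5, "i2": 50},
    {"i1": 20, "i2": 20},
    {"i1": 30, "i2": 30},
    {"i1": 40, "i2": 40},
]
c = trace(20, 20)
c_prime, input2 = next_input(c, database)
print(c_prime, input2)
```

For input (20, 20) the sketch collects the same C as on the slide, flips the `i2 != 50` constraint because it leads to the 600-statement branch, and the database query returns i1 = 5, i2 = 50.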

Implementation

   Scalability challenges in large applications: up to 1MLOC
       Large constraints of size up to 5MB
       Existing tools run out of memory

   Execution Engine
       Initial tool: Concolic execution engine (Dsc)
       Solution: DSC-Dumper mode
           Uses disk instead of memory
           Removes memory overhead

   Test Input Database
       Initial tool: MSSQL server 2008
       Solution: Constraint-based selector
           Uses B+ tree based index
           Provides API to process queries

Experiment – Approaches

Random Testing
• Random selection of inputs
• Black-box approach

Adaptive Random Testing (ART)
• Random selection of evenly distributed inputs
• Black-box approach

DART
• Concolic execution approach
• Depth-first path exploration
• White-box approach

CarFast
• Our approach
• Static-ranking-based path exploration
• White-box approach
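
To make the difference between the two black-box baselines concrete, here is a small sketch. It assumes the fixed-size-candidate-set form of ART with squared Euclidean distance; the slides do not say which ART variant or distance the study used, and the function names are hypothetical.

```python
import random


def random_select(pool, rng):
    """Random testing: any input from the pool, uniformly."""
    return rng.choice(pool)


def art_select(pool, executed, k, rng):
    """ART: draw k random candidates and keep the one whose nearest
    already-executed input is farthest away."""
    sample = rng.sample(pool, k)
    if not executed:
        return sample[0]

    def min_distance(candidate):
        return min(
            sum((a - b) ** 2 for a, b in zip(candidate, e)) for e in executed
        )

    return max(sample, key=min_distance)


pool = [(0, 0), (1, 1), (9, 9)]
rng = random.Random(0)
picked = art_select(pool, executed=[(0, 0)], k=3, rng=rng)
print(picked)  # (9, 9): the candidate farthest from the executed input (0, 0)
```

Neither selector looks at the program under test, which is what makes both approaches black-box.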


Experiment – Subject Programs

   Challenges in selecting programs
       Programs with various sizes
       Programs with complex properties
       Programs without external dependencies

   RugRat program generator [WODA 2012]
       Stochastic-parse-tree based program generation approach
       Highly configurable option parameters
       Used in generating 12 programs from 1KLOC to 1MLOC

   Test inputs
       Each program has up to 20 integer inputs
       Complete combination of inputs for 20 integers = 100^20
       Pairwise combination of inputs for 20 integers = 1M

Experiment – Setup

   Study Protocol
     For statistical significance, each experiment was run 30 times
     Total time = 4 approaches * 12 programs * 30 runs * 24 hours
                = 34,560 hours

   Baseline coverage = min(cov_i)
                   where i ∈ {Random, ART, DART, CarFast}


   Measurement (to achieve baseline coverage)
     Number of iterations (1 iteration = 1 selection)
     Elapsed time
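
The protocol above can be illustrated with a few lines of arithmetic. The coverage curves below are invented purely for illustration and are not results from the study; `baseline_coverage` and `iterations_to_reach` are hypothetical helper names.

```python
def baseline_coverage(final_coverage):
    """Baseline = the lowest final coverage reached by any approach."""
    return min(final_coverage.values())


def iterations_to_reach(coverage_per_iteration, baseline):
    """Number of iterations (input selections) until coverage first reaches
    the baseline, or None if it never does."""
    for iteration, cov in enumerate(coverage_per_iteration, start=1):
        if cov >= baseline:
            return iteration
    return None


# Total machine time implied by the study protocol.
total_hours = 4 * 12 * 30 * 24
print(total_hours)  # 34560

# Hypothetical cumulative coverage after each iteration, per approach.
curves = {
    "Random": [0.10, 0.30, 0.45],
    "ART": [0.20, 0.40, 0.50],
    "DART": [0.05, 0.20, 0.45, 0.60],
    "CarFast": [0.45, 0.70],
}
baseline = baseline_coverage({name: c[-1] for name, c in curves.items()})
iterations = {name: iterations_to_reach(c, baseline) for name, c in curves.items()}
print(baseline, iterations)
```

Using the minimum as the baseline means every approach reaches it, so the iteration counts and elapsed times are comparable across approaches.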


Experiment – Results
  Programs    Baseline    Approaches    Iterations    Elapsed Time
              Coverage                      (mean)          (mean)
  3 (1.2K)    45%         Random              17.1           522.2
                          ART                 17.8            59.8
                          DART               693.5          1447.0
                          CarFast              5.9           571.0
  5 (2.1K)    78%         Random            1023.2          3162.5
                          ART               1615.6          5157.7
                          CarFast            463.9         20040.9
  7 (7.8K)    79%         Random             543.1          1736.8
                          ART                684.1          2217.6
                          CarFast            380.0           18829

  • DART doesn't scale (no DART rows for programs 5 and 7)

* Complete results are in the paper.

Future Work

   Bottleneck
     Current: Identified modules causing bottlenecks
     Future: Improve the runtime of CarFast



   Fault-detection ability
     Current: Does not measure fault-detection ability
     Future: Investigate fault-detection ability



   Other test coverage metrics
     Current: Used static measure on statements
     Future: Use static measure on branches

Contributions

   CarFast
    The first approach to select inputs for achieving
    statement coverage fast

   Implementation
    The tool scales up to 1MLOC

   Experiment
    The study shows limitations in popular testing
    techniques with statistical significance

   Tool, subjects, experimental data are available
    www.carfast.org
BACKUP SLIDES

Related Work

   Test-case prioritization
     Test case prioritization: empirical studies [Elbaum, 2002]

   Dynamic symbolic execution
     DART [Godefroid, 2005]
     Hybrid concolic testing [Majumdar, 2007]
     Heuristics for dynamic test generation [Burnim, 2008]

   Search-based testing
     Fitness-guided path exploration [Xie, 2009]


CarFast – Preliminary Study

   Study
     Performed on Apache programs
     Investigated branches and statements
     Observed power law in results –
      20% of branches contain
      80% of statements

   Hypothesis
       Assuming the observation holds,
        we can steer execution to cover
        those 20% of branches

                                                                 23

More Related Content

Viewers also liked

The power of datomic
The power of datomicThe power of datomic
The power of datomicKonrad Szydlo
 
Effective Fault-Localization Techniques for Concurrent Software
Effective Fault-Localization Techniques for Concurrent SoftwareEffective Fault-Localization Techniques for Concurrent Software
Effective Fault-Localization Techniques for Concurrent SoftwareSangmin Park
 
Testing Concurrent Programs to Achieve High Synchronization Coverage
Testing Concurrent Programs to Achieve High Synchronization CoverageTesting Concurrent Programs to Achieve High Synchronization Coverage
Testing Concurrent Programs to Achieve High Synchronization CoverageSangmin Park
 
Hitchhiker Trees - Strangeloop 2016
Hitchhiker Trees - Strangeloop 2016Hitchhiker Trees - Strangeloop 2016
Hitchhiker Trees - Strangeloop 2016David Greenberg
 
PyCon APAC 2016 Keynote
PyCon APAC 2016 KeynotePyCon APAC 2016 Keynote
PyCon APAC 2016 KeynoteWes McKinney
 
Apache Arrow and Python: The latest
Apache Arrow and Python: The latestApache Arrow and Python: The latest
Apache Arrow and Python: The latestWes McKinney
 
Huohua: A Distributed Time Series Analysis Framework For Spark
Huohua: A Distributed Time Series Analysis Framework For SparkHuohua: A Distributed Time Series Analysis Framework For Spark
Huohua: A Distributed Time Series Analysis Framework For SparkJen Aman
 
Python Data Ecosystem: Thoughts on Building for the Future
Python Data Ecosystem: Thoughts on Building for the FuturePython Data Ecosystem: Thoughts on Building for the Future
Python Data Ecosystem: Thoughts on Building for the FutureWes McKinney
 
Python Data Wrangling: Preparing for the Future
Python Data Wrangling: Preparing for the FuturePython Data Wrangling: Preparing for the Future
Python Data Wrangling: Preparing for the FutureWes McKinney
 
Raising the Tides: Open Source Analytics for Data Science
Raising the Tides: Open Source Analytics for Data ScienceRaising the Tides: Open Source Analytics for Data Science
Raising the Tides: Open Source Analytics for Data ScienceWes McKinney
 
Improving Python and Spark (PySpark) Performance and Interoperability
Improving Python and Spark (PySpark) Performance and InteroperabilityImproving Python and Spark (PySpark) Performance and Interoperability
Improving Python and Spark (PySpark) Performance and InteroperabilityWes McKinney
 
Mesos: The Operating System for your Datacenter
Mesos: The Operating System for your DatacenterMesos: The Operating System for your Datacenter
Mesos: The Operating System for your DatacenterDavid Greenberg
 

Viewers also liked (13)

The power of datomic
The power of datomicThe power of datomic
The power of datomic
 
Effective Fault-Localization Techniques for Concurrent Software
Effective Fault-Localization Techniques for Concurrent SoftwareEffective Fault-Localization Techniques for Concurrent Software
Effective Fault-Localization Techniques for Concurrent Software
 
Testing Concurrent Programs to Achieve High Synchronization Coverage
Testing Concurrent Programs to Achieve High Synchronization CoverageTesting Concurrent Programs to Achieve High Synchronization Coverage
Testing Concurrent Programs to Achieve High Synchronization Coverage
 
Hitchhiker Trees - Strangeloop 2016
Hitchhiker Trees - Strangeloop 2016Hitchhiker Trees - Strangeloop 2016
Hitchhiker Trees - Strangeloop 2016
 
PyCon APAC 2016 Keynote
PyCon APAC 2016 KeynotePyCon APAC 2016 Keynote
PyCon APAC 2016 Keynote
 
Apache Arrow and Python: The latest
Apache Arrow and Python: The latestApache Arrow and Python: The latest
Apache Arrow and Python: The latest
 
Huohua: A Distributed Time Series Analysis Framework For Spark
Huohua: A Distributed Time Series Analysis Framework For SparkHuohua: A Distributed Time Series Analysis Framework For Spark
Huohua: A Distributed Time Series Analysis Framework For Spark
 
Python Data Ecosystem: Thoughts on Building for the Future
Python Data Ecosystem: Thoughts on Building for the FuturePython Data Ecosystem: Thoughts on Building for the Future
Python Data Ecosystem: Thoughts on Building for the Future
 
Python Data Wrangling: Preparing for the Future
Python Data Wrangling: Preparing for the FuturePython Data Wrangling: Preparing for the Future
Python Data Wrangling: Preparing for the Future
 
Raising the Tides: Open Source Analytics for Data Science
Raising the Tides: Open Source Analytics for Data ScienceRaising the Tides: Open Source Analytics for Data Science
Raising the Tides: Open Source Analytics for Data Science
 
Improving Python and Spark (PySpark) Performance and Interoperability
Improving Python and Spark (PySpark) Performance and InteroperabilityImproving Python and Spark (PySpark) Performance and Interoperability
Improving Python and Spark (PySpark) Performance and Interoperability
 
Datomic
DatomicDatomic
Datomic
 
Mesos: The Operating System for your Datacenter
Mesos: The Operating System for your DatacenterMesos: The Operating System for your Datacenter
Mesos: The Operating System for your Datacenter
 

Similar to CarFast: Achieving Higher Statement Coverage Faster

Very interesting C programming Technical Questions
Very interesting C programming Technical Questions Very interesting C programming Technical Questions
Very interesting C programming Technical Questions Vanathi24
 
Excelマクロはじめの一歩
Excelマクロはじめの一歩Excelマクロはじめの一歩
Excelマクロはじめの一歩Ayumu Hanba
 
Lecture#6 functions in c++
Lecture#6 functions in c++Lecture#6 functions in c++
Lecture#6 functions in c++NUST Stuff
 
Technical aptitude Test 1 CSE
Technical aptitude Test 1 CSETechnical aptitude Test 1 CSE
Technical aptitude Test 1 CSESujata Regoti
 

Similar to CarFast: Achieving Higher Statement Coverage Faster (6)

White box-sol
White box-solWhite box-sol
White box-sol
 
Qno 1 (c)
Qno 1 (c)Qno 1 (c)
Qno 1 (c)
 
Very interesting C programming Technical Questions
Very interesting C programming Technical Questions Very interesting C programming Technical Questions
Very interesting C programming Technical Questions
 
Excelマクロはじめの一歩
Excelマクロはじめの一歩Excelマクロはじめの一歩
Excelマクロはじめの一歩
 
Lecture#6 functions in c++
Lecture#6 functions in c++Lecture#6 functions in c++
Lecture#6 functions in c++
 
Technical aptitude Test 1 CSE
Technical aptitude Test 1 CSETechnical aptitude Test 1 CSE
Technical aptitude Test 1 CSE
 

Recently uploaded

My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Hyundai Motor Group
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 

Recently uploaded (20)

My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners

CarFast: Achieving Higher Statement Coverage Faster

  • 1. CARFAST: ACHIEVING HIGHER STATEMENT COVERAGE FASTER Sangmin Park, Ishtiaque Hussain, Christoph Csallner, Kunal Taneja, B. M. Mainul Hossain, Mark Grechanik, Chen Fu, Qing Xie
  • 2. CarFast Implementation Evaluation Conclusion
      Motivation - Achieving High Coverage
      - Coverage
          - Degree to which a program has been tested
          - Measure of confidence
      - Widely used in industry
          - Avionics industry standards: DO-254 and DO-178B
          - Automotive industry standard: IEC 61508
          - Other organizations
  • 3. Motivation - Achieving Coverage Fast
      - Current approaches: timeout
      - Goal: Achieve high coverage faster
      - Achieving high coverage fast is difficult
          - Complex programs
          - Too many test inputs (e.g., Renters Insurance Program with 78M customer profiles)
  • 4. High level approach
      - Observation (study we performed)
          - 80% of statements are covered by 20% of branches (we call those branches "profitable")
      - Intuition
          - Covering profitable branches fast leads to achieving high statement coverage quickly
      - High level approach
          - Use static analysis to find profitable branches
          - Select inputs that direct program execution towards profitable branches
  • 5. CarFast – Illustrative Example
      void foo (int i1, int i2) {
        if (i1 == 10) {
          … // branch 1: 300 statements
        } else if (i2 == 50) {
          … // branch 2: 600 statements
        } else {
          … // branch 3: 100 statements
          if (i1==20) {
            if (i2==30) { … }
          }
        }
      }
      Input: i1 = 20 and i2 = 20
      Branch diagram: i1==10 (T: 300 stmts; F: i2==50 (T: 600 stmts; F: 100 stmts))
  • 6. CarFast – Illustrative Example
      Example program foo and branch diagram as on slide 5.
      Input: i1 = 20 and i2 = 20
      - DFS search: up to 10%
      - Branch 2: up to 70%
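The two percentages on slide 6 can be reproduced with a little arithmetic. The sketch below is a hypothetical model, not the authors' code: it tracks only the three outer branches of foo, assumes the program has 300 + 600 + 100 = 1000 statements in total, and reads "Branch 2: up to 70%" as the cumulative coverage after branch 3 and then branch 2 have both been reached. The class and method names are invented for illustration.

```java
import java.util.TreeSet;

// Hypothetical model of foo from the slide: only the three outer branches are tracked.
final class FooCoverageSketch {
    // Statement counts of branches 1, 2, 3 as given on the slide.
    static final int[] BRANCH_STATEMENTS = {300, 600, 100};

    // Which outer branch (1, 2, or 3) an input reaches, mirroring foo's conditions.
    static int branchTaken(int i1, int i2) {
        if (i1 == 10) return 1;
        if (i2 == 50) return 2;
        return 3;
    }

    // Upper bound on cumulative statement coverage, in percent, after running all inputs.
    static int coveragePercent(int[][] inputs) {
        TreeSet<Integer> reached = new TreeSet<>();
        for (int[] in : inputs) reached.add(branchTaken(in[0], in[1]));
        int covered = 0;
        int total = 0;
        for (int n : BRANCH_STATEMENTS) total += n;
        for (int b : reached) covered += BRANCH_STATEMENTS[b - 1];
        return 100 * covered / total;
    }

    public static void main(String[] args) {
        // Input 1 alone stays in branch 3.
        System.out.println(coveragePercent(new int[][] {{20, 20}}));
        // A second input that also lands in branch 3 adds nothing at this granularity.
        System.out.println(coveragePercent(new int[][] {{20, 20}, {20, 30}}));
        // Steering the second input to branch 2 instead.
        System.out.println(coveragePercent(new int[][] {{20, 20}, {5, 50}}));
    }
}
```

Under these assumptions, any second input that stays inside branch 3 leaves the bound at 10%, while an input that reaches branch 2 raises it to 70%.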
  • 7. CarFast – Algorithm
      Example program foo and branch diagram as on slide 5.
      Input: i1 = 20 and i2 = 20
      - Step 1: Rank Branches
      - Step 2: Select Initial Input
      - Step 3: Select Next Input
  • 8. CarFast – Algorithm, Step 1: Rank branches
      - Counts (transitively) the number of statements each branch contains
      - Resolves method calls
      - Ranks branches by statements
      Example program foo and branch diagram as on slide 5.
      Rank   Branch   # Stmt
      1      2        600
      2      1        300
      3      3        100
      4      …        …
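Step 1 can be sketched as a recursive count followed by a sort. This is a minimal illustration under stated assumptions, not the tool's static analysis: the `Branch` type is hypothetical, resolved method calls are modelled simply as nested entries, and the 90 + 10 split of branch 3 is invented so that the transitive total matches the slide's 100 statements.

```java
import java.util.ArrayList;
import java.util.List;

final class BranchRankingSketch {
    static final class Branch {
        final String name;
        final int ownStatements;                        // statements directly inside the branch
        final List<Branch> nested = new ArrayList<>();  // nested branches or resolved callees

        Branch(String name, int ownStatements) {
            this.name = name;
            this.ownStatements = ownStatements;
        }

        // Transitive count: own statements plus everything nested inside the branch.
        int transitiveStatements() {
            int total = ownStatements;
            for (Branch b : nested) total += b.transitiveStatements();
            return total;
        }
    }

    // Returns a copy of the branches sorted by decreasing transitive statement count.
    static List<Branch> rank(List<Branch> branches) {
        List<Branch> ranked = new ArrayList<>(branches);
        ranked.sort((a, b) -> Integer.compare(b.transitiveStatements(), a.transitiveStatements()));
        return ranked;
    }

    // The slide's three outer branches; the 90 + 10 split of branch 3 is an assumption.
    static List<Branch> slideExample() {
        Branch b1 = new Branch("branch 1", 300);
        Branch b2 = new Branch("branch 2", 600);
        Branch b3 = new Branch("branch 3", 90);
        b3.nested.add(new Branch("inner branches of branch 3", 10));
        List<Branch> all = new ArrayList<>();
        all.add(b1);
        all.add(b2);
        all.add(b3);
        return all;
    }

    public static void main(String[] args) {
        for (Branch b : rank(slideExample())) {
            System.out.println(b.name + ": " + b.transitiveStatements());
        }
    }
}
```

The resulting order (branch 2, branch 1, branch 3) matches the ranking table on the slide.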
  • 9. CarFast – Algorithm, Step 2: Select a random input
      - Selects a random input from input database
      Example program foo and branch diagram as on slide 5.
      Input 1: i1 = 20 and i2 = 20
      Rank   Branch   # Stmt
      1      2        600
      2      1        300
      3      3        100
      4      …        …
      Input database:
      i1     i2
      5      50
      20     20
      30     30
      40     40
  • 10. CarFast – Algorithm, Step 3: Select next input from trace
      - Executes the program with the input to collect path condition
      - Modifies path condition to cover higher ranked branches
      - Queries the condition to database
      - Selects random input if there is no satisfying input
      Example program foo and branch diagram as on slide 5; ranking table and input database as on slide 9.
      Input 1: i1 = 20 and i2 = 20
  • 11. CarFast – Algorithm, Step 3: Select next input from trace
      - Executes the program with the input to collect path condition
      - Modifies path condition to cover higher ranked branches
      - Queries the condition to database
      - Selects random input if there is no satisfying input
      Example program foo and branch diagram as on slide 5; ranking table and input database as on slide 9.
      Input 1: i1 = 20 and i2 = 20
      C: (i1!=10)&&(i2!=50)&&(i1==20)&&(i2!=30)
  • 12. CarFast – Algorithm, Step 3: Select next input from trace
      - Executes the program with the input to collect path condition
      - Modifies path condition to cover higher ranked branches
      - Queries the condition to database
      - Selects random input if there is no satisfying input
      Example program foo and branch diagram as on slide 5; ranking table and input database as on slide 9.
      Input 1: i1 = 20 and i2 = 20
      C: (i1!=10)&&(i2!=50)&&(i1==20)&&(i2!=30)
      C’: (i1!=10)&&(i2==50)
      Input 2: i1 = 5 and i2 = 50
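The step from C to C’ and the database lookup on slide 12 can be sketched as follows. This is an illustrative simplification, not the CarFast implementation: constraints are restricted to equality and inequality against a constant, the database is an in-memory list, and all names (`Constraint`, `flipAt`, `selectNext`, `rankedIndices`) are hypothetical. `rankedIndices` is assumed to list positions in the path condition ordered by the static rank of the branch each constraint guards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

final class NextInputSketch {
    // Atomic constraint over an input row: row[column] == value, or row[column] != value.
    static final class Constraint {
        final int column;
        final int value;
        final boolean equal;

        Constraint(int column, int value, boolean equal) {
            this.column = column;
            this.value = value;
            this.equal = equal;
        }

        boolean holds(int[] row) { return (row[column] == value) == equal; }

        Constraint negate() { return new Constraint(column, value, !equal); }
    }

    // Keeps the prefix before index, negates the constraint at index, drops the rest.
    static List<Constraint> flipAt(List<Constraint> pc, int index) {
        List<Constraint> modified = new ArrayList<>(pc.subList(0, index));
        modified.add(pc.get(index).negate());
        return modified;
    }

    static boolean satisfies(int[] row, List<Constraint> pc) {
        for (Constraint c : pc) {
            if (!c.holds(row)) return false;
        }
        return true;
    }

    // Tries the highest ranked branch first; falls back to a random row if nothing matches.
    static int[] selectNext(List<Constraint> pc, int[] rankedIndices,
                            List<int[]> database, Random rng) {
        for (int index : rankedIndices) {
            List<Constraint> modified = flipAt(pc, index);
            for (int[] row : database) {
                if (satisfies(row, modified)) return row;
            }
        }
        return database.get(rng.nextInt(database.size()));
    }

    // Path condition C from the slide for Input 1; column 0 is i1, column 1 is i2.
    static List<Constraint> slidePathCondition() {
        return Arrays.asList(
            new Constraint(0, 10, false),   // i1 != 10
            new Constraint(1, 50, false),   // i2 != 50
            new Constraint(0, 20, true),    // i1 == 20
            new Constraint(1, 30, false));  // i2 != 30
    }

    // The four rows of the slide's input database.
    static List<int[]> slideDatabase() {
        return Arrays.asList(new int[] {5, 50}, new int[] {20, 20},
                             new int[] {30, 30}, new int[] {40, 40});
    }

    public static void main(String[] args) {
        // Branch 2 is guarded by the constraint at index 1, branch 1 by the one at index 0.
        int[] next = selectNext(slidePathCondition(), new int[] {1, 0},
                                slideDatabase(), new Random());
        System.out.println("i1 = " + next[0] + ", i2 = " + next[1]);
    }
}
```

With the slide's data, flipping the second constraint yields C’ and the query returns the row (5, 50), which is the slide's Input 2.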
  • 13. Implementation
      - Scalability challenges in large applications: up to 1MLOC
          - Large constraints of size up to 5MB
          - Existing tools run out of memory
      - Execution Engine
          - Initial tool: Concolic execution engine (Dsc)
          - Solution: DSC-Dumper mode
              - Uses disk instead of memory
              - Removes memory overhead
      - Test Input Database
          - Initial tool: MSSQL server 2008
          - Solution: Constraint-based selector
              - Uses B+ tree based index
              - Provides API to process queries
  • 14. Experiment – Approaches
      - Random Testing
          - Random selection of inputs
          - Black-box approach
      - Adaptive Random Testing (ART)
          - Random selection of evenly distributed inputs
          - Black-box approach
      - DART
          - Concolic execution approach
          - Depth-first path exploration
          - White-box approach
      - CarFast
          - Our approach
          - Static ranking based path exploration
          - White-box approach
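To make the difference between random testing and ART concrete, here is a minimal sketch of one common ART variant, the fixed-size candidate set: draw a few random candidates and run the one farthest from all inputs executed so far. The slides do not say which ART variant was used in the experiment, so this choice, the Euclidean distance, and all names are assumptions. The value range of -50 to 50 is taken from the speaker notes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class AdaptiveRandomSketch {
    static long squaredDistance(int[] a, int[] b) {
        long sum = 0;
        for (int i = 0; i < a.length; i++) {
            long d = (long) a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Draws k random candidates with every value in [-50, 50].
    static List<int[]> randomCandidates(Random rng, int k, int dims) {
        List<int[]> candidates = new ArrayList<>();
        for (int c = 0; c < k; c++) {
            int[] input = new int[dims];
            for (int i = 0; i < dims; i++) input[i] = rng.nextInt(101) - 50;
            candidates.add(input);
        }
        return candidates;
    }

    // Picks the candidate whose nearest already executed input is farthest away.
    static int[] pickFarthest(List<int[]> executed, List<int[]> candidates) {
        int[] best = candidates.get(0);
        long bestScore = -1;
        for (int[] candidate : candidates) {
            long nearest = Long.MAX_VALUE;
            for (int[] e : executed) {
                nearest = Math.min(nearest, squaredDistance(candidate, e));
            }
            if (nearest > bestScore) {
                bestScore = nearest;
                best = candidate;
            }
        }
        return best;
    }

    // Small fixed example: (40, 40) is farthest from the single executed input (0, 0).
    static int[] demo() {
        List<int[]> executed = Collections.singletonList(new int[] {0, 0});
        List<int[]> candidates = Arrays.asList(
            new int[] {1, 1}, new int[] {40, 40}, new int[] {-5, 3});
        return pickFarthest(executed, candidates);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```

Pure random testing would skip `pickFarthest` and run any candidate. Both remain black-box, since neither looks at the program's branches.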
  • 15. Experiment – Subject Programs
      - Challenges in selecting programs
          - Programs with various sizes
          - Programs with complex properties
          - Programs without external dependencies
      - RugRat program generator [WODA 2012]
          - Stochastic-parse-tree based program generation approach
          - Highly configurable option parameters
          - Used in generating 12 programs from 1KLOC to 1MLOC
      - Test inputs
          - Each program has up to 20 integer inputs
          - Complete combination of inputs for 20 integers = 100^20
          - Pairwise combination of inputs for 20 integers = 1M
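The size of the complete input space explains why a reduced input set was needed. The short sketch below, with hypothetical names, evaluates the slide's figure of 100 values for each of 20 inputs. It does not attempt to reproduce the pairwise figure of 1M, which depends on the combinatorial tool used.

```java
import java.math.BigInteger;

final class InputSpaceSketch {
    // Number of exhaustive combinations: valuesPerInput raised to the number of inputs.
    static BigInteger exhaustiveCombinations(int valuesPerInput, int inputs) {
        return BigInteger.valueOf(valuesPerInput).pow(inputs);
    }

    public static void main(String[] args) {
        BigInteger all = exhaustiveCombinations(100, 20);
        System.out.println(all);                                     // the value of 100^20
        System.out.println(all.toString().length() + " digits");
        // Far beyond what a 64-bit counter can hold.
        System.out.println(all.compareTo(BigInteger.valueOf(Long.MAX_VALUE)) > 0);
    }
}
```

100^20 equals 10^40, so enumerating every combination is out of reach, whereas a pairwise set of about one million rows fits comfortably in a database.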
  • 16. Experiment – Setup
      - Study Protocol
          - For statistical significance, ran 30 times
          - Total time = 4 approaches * 12 programs * 30 times * 24 hours = 34,560 hours
          - Baseline coverage = min(cov_i) where i ∈ {Random, ART, DART, CarFast}
      - Measurement (to achieve baseline coverage)
          - Number of iterations (1 iteration = 1 selection)
          - Elapsed time
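Both formulas on slide 16 are simple enough to check directly. In the sketch below, the method names are hypothetical and the four coverage values in `main` are made up purely to show how the minimum is taken. They are not results from the study.

```java
final class ExperimentSetupSketch {
    // Total machine time in hours: approaches x programs x repetitions x time limit per run.
    static int totalHours(int approaches, int programs, int runs, int hoursPerRun) {
        return approaches * programs * runs * hoursPerRun;
    }

    // Baseline coverage: the lowest final coverage among the compared approaches,
    // so that every approach is able to reach it.
    static double baselineCoverage(double... finalCoverages) {
        double min = Double.POSITIVE_INFINITY;
        for (double c : finalCoverages) min = Math.min(min, c);
        return min;
    }

    public static void main(String[] args) {
        System.out.println(totalHours(4, 12, 30, 24));  // 34560
        // Hypothetical final coverages for Random, ART, DART, CarFast on one program.
        System.out.println(baselineCoverage(0.52, 0.48, 0.45, 0.61));
    }
}
```

Using the minimum as the target means the iteration and time measurements compare how quickly each approach reaches a coverage level that all of them can attain within the 24-hour limit.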
  • 17. Experiment – Results
      Program    Baseline coverage   Approach   Iterations (mean)   Elapsed time (mean)
      3 (1.2K)   45%                 Random     17.1                522.2
                                     ART        17.8                59.8
                                     DART       693.5               1447.0
                                     CarFast    5.9                 571.0
      5 (2.1K)   78%                 Random     1023.2              3162.5
                                     ART        1615.6              5157.7
                                     CarFast    463.9               20040.9
      7 (7.8K)   79%                 Random     543.1               1736.8
                                     ART        684.1               2217.6
                                     CarFast    380.0               18829
      - DART doesn't scale
      * Complete results are in the paper.
  • 18. Future Work
      - Bottleneck
          - Current: Identified modules causing bottlenecks
          - Future: Improve the runtime of CarFast
      - Fault-detection ability
          - Current: Does not measure fault-detection ability
          - Future: Investigate fault-detection ability
      - Other test coverage metrics
          - Current: Used static measure on statements
          - Future: Use static measure on branches
  • 19. Contributions
      - CarFast: The first approach to select inputs for achieving statement coverage fast
      - Implementation: The tool scales up to 1MLOC
      - Experiment: The study shows limitations in popular testing techniques with statistical significance
      - Tool, subjects, and experimental data are available at www.carfast.org
  • 21. Related Work
      - Test-case prioritization
          - Test case prioritization: empirical studies [Elbaum, 2002]
      - Dynamic symbolic execution
          - DART [Godefroid, 2005]
          - Hybrid concolic testing [Majumdar, 2007]
          - Heuristics for dynamic test generation [Burnim, 2008]
      - Search-based testing
          - Fitness-guided path exploration [Xie, 2009]
  • 22. CarFast – Preliminary Study
      - Study
          - Performed on Apache programs
          - Investigated branches and statements
          - Observed power law in results: 20% of branches contain 80% of statements
      - Hypothesis
          - Assuming the observation holds, we can steer execution to cover those 20% of branches
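The observation on slide 22 amounts to a simple measurement: sort branches by the number of statements they contain and ask what share of all statements falls in the top 20%. The sketch below shows that computation with hypothetical names. The ten statement counts in `main` are invented to illustrate an 80/20 split and are not data from the Apache study.

```java
import java.util.Arrays;

final class PowerLawSketch {
    // Share of all statements contained in the largest topPercent of branches.
    static double topShare(int[] statementsPerBranch, int topPercent) {
        int[] sorted = statementsPerBranch.clone();
        Arrays.sort(sorted);  // ascending, so the largest branches are at the end
        int k = (sorted.length * topPercent + 99) / 100;  // branches in the top group, rounded up
        long top = 0;
        long total = 0;
        for (int i = 0; i < sorted.length; i++) {
            total += sorted[i];
            if (i >= sorted.length - k) top += sorted[i];
        }
        return total == 0 ? 0.0 : (double) top / total;
    }

    public static void main(String[] args) {
        // Hypothetical per-branch statement counts for a program with ten branches.
        int[] counts = {400, 400, 50, 50, 25, 25, 20, 10, 10, 10};
        System.out.println(topShare(counts, 20));
    }
}
```

For the invented counts, the two largest of ten branches hold 800 of 1000 statements, which is the kind of skew the hypothesis relies on.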

Editor's Notes

  1. Testing is an important part of the software-engineering process. Test coverage is a measure used in software testing.
  2. Coverage: the degree to which the program has been tested. Coverage is important because it provides a measure of how well the program is tested. Hence, achieving high coverage is essential. Widely….
  3. Current approaches focused on achieving high coverage are slow, and often run out of available resources. <animation> Achieving high coverage fast is difficult because …. <bullets> Hence, the goal of our technique is to achieve high coverage faster. <animation>
  4. To achieve the goal, we designed a high-level approach. The approach starts with an observation. From the observation, we got an intuition. Based on the intuition, we designed the high-level approach. First, the approach uses static analysis to find profitable branches. Then, the approach selects inputs using the analysis.
  5. Let me explain the approach with an illustrative example. Here is a Java-like example program. The function foo takes two integers, i1 and i2. <click> It has three outer branches, branches 1 to 3, <click> and two inner branches under branch 3. <click> If the input takes branch 1, it covers 300 statements. If the input takes branch 2 or 3, <click> it covers 600 <click> or 100 statements. Our goal is to select an input that takes branch 2 fast. <click> However, before developing a technique, we had a question about program characteristics. Do real programs have branches like branch 2? That is, do real programs have branches containing many more statements than other branches? To answer the question, we performed a preliminary study.
  6. The input takes branch 3. Clearly the input is not good w.r.t. increasing statement coverage. However, none of the existing approaches systematically steers execution to branch 2. One existing approach is DART. As the next input, it selects one inside branch 3, so the total coverage will remain at up to 10%. An ideal approach will try to get an input that covers branch 2 and
  7. Finally, if the input takes branch 3, it can cover up to 100 statements. Assume that we have an input, i1 = 3 and i2 = 7. Then, this input will lead to a path into branch 3. This is clearly a bad input because it covers at most 10% of the statements.
  8. The algorithm consists of three steps. Step 1 statically ranks branches before program execution. To do so, it counts the number of statements per branch. It works transitively: if there are branches or method calls inside a branch, it includes their statements. Then, Step 1 ranks branches in decreasing order of statements. <click> For our example code, the algorithm generates the ranking table. <click> For example, branch 2 is ranked first because it contains 600 statements, the largest number of statements in the program.
  9. Step 2 of the algorithm selects an input randomly from the test input database. Then, it executes the program with the input to collect the path condition. Here, a path condition is a conjunction of constraints. Let’s go back to the example. <click> We assume that we have a test input database, just as the Renters Insurance Program has user profiles. The input database has two columns, i1 and i2. <click> Step 2 selects a random input, i1 = 20 and i2 = 20.
  10. Step 3 is the main part of the algorithm. Step 3 actually executes the program with inputs and selects the next input. <click> First, it executes the program with the current input to collect the path condition. Here, a path condition is a conjunction of constraints. <click> Then, it modifies the path condition to cover high-ranked branches. To do so, it uses the static ranking table. <click> If Step 3 creates a new path condition, it queries the input database with the condition to find an input. This modification of path conditions continues in a loop. <click> If no input satisfies the new path conditions, Step 3 randomly selects a new input from the database. <click> Let’s go back to the example.
  11. Step 3 executes the program with Input 1. Then, it covers four if-statements and collects a path condition, C. <click> The first two constraints are from the two outer if-statements, corresponding to branches 1 and 2. <click> The last two constraints are from the two inner if-statements, corresponding to the two branches inside branch 3.
  12. Then, Step 3 modifies the path condition, C. <click> It investigates the ranking table from the top rank. <click> Branch 2 corresponds to the if-statement checking i2 == 50, and the corresponding constraint is in C. <click> Thus, Step 3 modifies C to create a new path condition C’ to cover Branch 2. <click> Then, it queries the input database to find an input satisfying C’. <click> It finds a new input, i1=5 and i2=50, <click> and uses it as Input 2. In our example, Step 3 found an input on the first try. However, if there is no corresponding input in the database, <click><click> Step 3 can search for a new input by investigating other branches and constraints. <click> Then, if there is no such test in the database, Step 3 can finally select a random input from the database. Step 3 works in this way until a target coverage limit is reached. In summary, CarFast guides the next input based on current input information. To evaluate CarFast, we implemented it for Java programs.
  13. The goal of the implementation is to apply CarFast to large applications, <click> up to 1MLOC programs. <click> To do so, we implemented our technique in three different modules: the CarFast main module that executes the main algorithm, the execution engine that executes the program with a test input and generates an execution trace, and the input database. We found and addressed several challenges in the modules. <click> The first challenge was in the execution engine. Initially, we used a Java concolic execution engine for our purpose. We used Dsc, which was developed by our co-authors. <click> However, we observed several exceptions with memory-related problems. The problems occurred because the concolic execution engine could not hold the data for large programs. <click> Thus, we developed a new mode, Dsc-dumper mode. Instead of saving the execution engine states in memory, Dsc dumps its constraints to disk and passes them to CarFast. With the modification, we didn’t observe memory exceptions.
  14. In our experiment, we compared four approaches with each other. <click> The first two approaches are random testing techniques. The first one is a pure random testing approach: it selects inputs randomly. The second one is an adaptive random testing approach: it selects inputs randomly, but it computes the distance among inputs and selects evenly distributed inputs. <click> The last two approaches are white-box approaches. They collect path constraints and explore program paths using the constraints. The third one is DART. It explores program paths in a depth-first-search manner. The final approach is CarFast. In contrast to DART, it uses a static-ranking-based approach to select inputs.
  15. Finally, let me explain the subject programs. Selecting subject programs was challenging for several reasons. The program sizes should vary, and the programs should have complex logic. However, to run execution engines, we needed programs without any external dependencies. Finding such programs was challenging. <click> Thus, we created and used a program generator, RugRat, which was published in this year’s WODA. It uses a stochastic-parse-tree-based program generation approach to create random programs. It provides highly configurable options to express different program properties. We used RugRat to generate 12 programs from 1KLOC to 1MLOC. <click> The programs take up to 20 integers as inputs. For input data, we used integer inputs in the range -50 to 50. The complete combination of 20 integers of that range is 100 to the power of 20. Instead of using the complete input set, we used a pairwise combinatorial testing technique to obtain a reduced set of combinations.
  16. Then, let me explain the experimental setting. <click> Because all four approaches have a random nature, for statistical significance, we ran each program 30 times for each approach. <click> Then, we performed a large-scale experiment to get statistical significance. The total time is the product of 4 approaches, 12 programs, 30 runs, and 24 hours, the time limit. So, it is about 34 thousand hours. We ran the experiments on the Amazon EC2 cloud. <click> We used two measurement criteria to compare approaches: number of iterations and elapsed time to reach the target coverage. To determine the target coverage, we used the lowest coverage achieved after the 24-hour time limit.
  17. Random had less runtime than CarFast. However, CarFast performs a more sophisticated analysis to select fewer test cases that achieve the same amount of test coverage.
  19. There are several classes of related work. The first is test case prioritization. That work is in the context of regression testing and requires prior knowledge. The second is dynamic symbolic execution (DSE). Systematic. The final one is search-based testing. Not scalable.
  20. In conclusion, there are several important contributions of our work. <click> First, we presented a new technique, called CarFast, that has high potential to achieve high statement coverage faster. <click> Second, we implemented the technique in a tool that scales to 1MLOC programs. Moreover, we made all the data used in the project publicly available at www.carfast.org. <click> Finally, we performed the first large-scale experiment that shows limitations and advantages of popular testing techniques with statistical significance.
  21. There are several classes of related work. The first is test case prioritization. That work is in the context of regression testing and requires prior knowledge. The second is dynamic symbolic execution (DSE). Systematic. The final one is search-based testing. Not scalable.
  22. The study was performed on three popular Apache programs: log4j, ant, and jMeter. <click> The study examines the relationship between branches and the statements they contain. Specifically, we counted the number of statements for each control-dependent branch. <click> As a result, we observed a power law. That is, 20% of the branches contain 80% of the statements. <click> Assuming this observation holds for other programs, we hypothesized that if we can select inputs that cover those 20% of branches first, we can get higher coverage faster. We developed such an algorithm, called CarFast.