Puzzle-Based Automatic Testing: Bringing Humans Into the Loop by Solving Puzzles (ASE 2012)

Sung Kim, Associate Prof.
1




Puzzle-Based Automatic Testing:
Bringing Humans Into the Loop by
Solving Puzzles
Ning Chen and Sunghun Kim
The Hong Kong University of Science and Technology

                  ASE 2012, Essen
2




Motivation
• Many automatic test generation techniques have been introduced:
  • Randoop (C. Pacheco, ICSE 2007)
  • Pex (N. Tillmann, TAP 2008)
  • OCAT (H. Jaygarl, ISSTA 2010)




• However, their coverage results are still not satisfactory
 when applied to complex object-oriented programs.
3




Coverage by Randoop

 Subject               Branches   Coverage
 Commons Math            7707      61.6%
 Commons Collections     5242      53.0%
4




Coverage by Pex

 Subject      LOC             Coverage
 SvnBridge    17.1K            56.2%
 xUnit        11.4K            15.5%
 Math.Net     3.5K             62.8%
 QuickGraph   8.3K             53.2%
 Total        40.3K            49.8%


              - by Xiao et al. (ICSE 2011)
5




Major Challenges
• The Constraint Solving challenge:
      Test generators fail to solve path conditions to cover
      certain branches.

• The Object Creation/Mutation challenge:
      Test generators cannot create and/or mutate test
      inputs into the desired object states.
6




Challenge 1: The Constraint Solving Challenge

void foo(int x, int y, int n) {
   int value = x << n;
   if (value < y && n > 2) {
     // not covered branch
     …
   }
 }
What’s a model (solution) for the path condition:
      (x << n) < y && n > 2
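The shift makes this non-linear, so a solver may give up, but a human can read a model off directly. A minimal check of one such model (the concrete values are illustrative, not from the paper):

```java
public class ShiftModel {
    // The path condition from the slide: (x << n) < y && n > 2.
    static boolean satisfies(int x, int y, int n) {
        return (x << n) < y && n > 2;
    }

    public static void main(String[] args) {
        // x = 1, n = 3 gives x << n = 8, so any y > 8 works.
        System.out.println(satisfies(1, 9, 3));  // true: a valid model
        System.out.println(satisfies(1, 9, 2));  // false: violates n > 2
    }
}
```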
7




Challenge 2: The Object Creation/Mutation Challenge

void foo(Container container) {
  if (container.size() >= 10) {
    // not covered branch
    …
  }
}

Given Model: container.size() == 10

How do we create and mutate a Container object into
size() == 10?
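When no setter exists, the answer is a sequence of public method calls that drives the object to the goal state. A hedged sketch of such a sequence, using `java.util.List` as a stand-in for the slide's abstract `Container`:

```java
import java.util.ArrayList;
import java.util.List;

public class MutationSketch {
    // One call per step, as in the puzzle UI: grow the container until the
    // goal state size() == 10 from the model is reached.
    static List<Integer> mutateToGoal(List<Integer> container) {
        while (container.size() < 10) {
            container.add(0);
        }
        return container;
    }

    public static void main(String[] args) {
        System.out.println(mutateToGoal(new ArrayList<>()).size());  // 10
    }
}
```

In realistic classes the needed sequence is rarely this direct, which is exactly why the paper hands the search to human players.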
8




Can we leverage humans to help solve
           these challenges?
9




PAT Puzzle
Overview
10




Object Mutation Puzzle
11




Object Mutation Puzzle
12




Object Mutation Puzzle
13




Object Mutation Puzzle
14




Constraint Solving Puzzle
15




Constraint Solving Puzzle
16




Constraint Solving Puzzle
17




Approach
18




Architectural Design of PAT
19




Up-front Testing Runs
20




Up-front Testing Runs
• Coverage report from existing automatic techniques
  • Avoid wasting human effort
  • PAT targets only branches not yet covered


• Retrieve dynamic information
  • Runtime object instances
  • Method call sequences
21




Path Computation
22




Path Computation
• The final results of this phase:
  • The models that will be used to generate the object mutation
    puzzles.

  • The constraints not solvable by the SMT solver, which will be used
    to generate the constraint solving puzzles.
23




Object Mutation Puzzle
24




Extracting Sub-models
• A complete model can contain object states for different
  inputs, so we divide it into sub-models.

                         Complete Model:
                     in == not null
                     in.readInt() == 1
                     this.currentState == null


          Sub-model 1:                  Sub-model 2:
          in == not null                this.currentState == null
          in.readInt() == 1
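One simple way to perform this split (an illustrative sketch, not PAT's actual implementation) is to group each condition by the input object it constrains:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SubModelSplit {
    // Group conditions by their root variable (the token before the first
    // '.' or space), e.g. "in" or "this"; each group becomes one sub-model.
    static Map<String, List<String>> split(List<String> model) {
        Map<String, List<String>> subs = new LinkedHashMap<>();
        for (String cond : model) {
            String root = cond.split("[. ]")[0];
            subs.computeIfAbsent(root, k -> new ArrayList<>()).add(cond);
        }
        return subs;
    }

    public static void main(String[] args) {
        Map<String, List<String>> subs = split(List.of(
            "in == not null", "in.readInt() == 1", "this.currentState == null"));
        System.out.println(subs.keySet());  // [in, this]
    }
}
```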
25




Prioritizing Sub-models
• One sub-model may appear in many models.


    Model 1:                    Model 2:
    in == not null              in == not null
    in.readInt() == 1           in.readInt() == 0
    this.currentState == null   this.currentState == null



• Sub-models are prioritized so that more frequent
  sub-models are ranked higher.
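This prioritization can be sketched as a frequency count over all complete models (illustrative only; PAT's actual ranking heuristics may differ):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SubModelRank {
    // Count how many complete models each sub-model appears in and rank
    // the most frequent first, mirroring the prioritization step.
    static List<String> rank(List<List<String>> models) {
        Map<String, Integer> freq = new HashMap<>();
        for (List<String> model : models)
            for (String sub : model)
                freq.merge(sub, 1, Integer::sum);
        List<String> order = new ArrayList<>(freq.keySet());
        order.sort((a, b) -> freq.get(b) - freq.get(a));
        return order;
    }

    public static void main(String[] args) {
        List<String> ranked = rank(List.of(
            List.of("this.currentState == null", "in.readInt() == 1"),
            List.of("this.currentState == null", "in.readInt() == 0")));
        System.out.println(ranked.get(0));  // the shared sub-model comes first
    }
}
```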
26




Generating Puzzles from Sub-models
27




Constraint Solving Puzzle
28




Extracting Error Related Constraints
• From up-front testing runs, we can obtain many branches
 whose path conditions are not solvable by the SMT solver.
                          Path Conditions:
        1. this == not null
        2. this.sums == not null
        3. this.sums.length * this.sums.length <= 4096
        4. this.sums.length > 0
        5. this.n > 1
           Error: feature not supported: non-linear problem.
29




Extracting Error Related Constraints
• From up-front testing runs, we can obtain many branches
 whose path conditions are not solvable by the SMT solver.
                          Path Conditions:
        1. this == not null
        2. this.sums == not null
        3. this.sums.length * this.sums.length <= 4096
        4. this.sums.length > 0
        5. this.n > 1
           Error: feature not supported: non-linear problem.




                     Error related constraints:
        1. this.sums.length * this.sums.length <= 4096
        2. this.sums.length > 0
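Here the quadratic term trips the solver, yet the extracted set is trivial for a person: `len * len <= 4096` forces `len <= 64`, so any length in 1..64 works. A quick check of that reasoning:

```java
public class NonLinearByHand {
    // The two error related constraints from the slide, over the array
    // length (len stands in for this.sums.length).
    static boolean satisfies(int len) {
        return len * len <= 4096 && len > 0;
    }

    public static void main(String[] args) {
        System.out.println(satisfies(64));  // true: 64 * 64 == 4096
        System.out.println(satisfies(65));  // false: 4225 > 4096
    }
}
```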
30


Grouping Constraint Sets
• Error related constraint sets can be identical except for
  variable names:

  [Figure: error related constraint sets partitioned into Group 1, Group 2,
   and Group 3; each group holds the sets that are identical up to renaming]
31


Grouping Constraint Sets
• Error related constraint sets can be identical except for
  variable names:

  [Figure: the same three groups as before; one representative puzzle
   (Puzzle 1, 2, 3) is generated for each group]
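The grouping can be approximated by canonicalizing variable names, so that sets differing only in names collapse to the same key. A sketch under that assumption (PAT's real grouping may be more involved):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ConstraintGrouping {
    // Rewrite every identifier (including dotted paths like this.sums.length)
    // to v0, v1, ... in order of first appearance; numbers and operators pass
    // through unchanged, so name-isomorphic constraint sets get equal keys.
    static String canonical(List<String> constraints) {
        Map<String, String> rename = new LinkedHashMap<>();
        StringBuilder out = new StringBuilder();
        Matcher m = Pattern.compile("[A-Za-z_][A-Za-z0-9_.]*|.")
                           .matcher(String.join(";", constraints));
        while (m.find()) {
            String tok = m.group();
            char c = tok.charAt(0);
            if (Character.isLetter(c) || c == '_')
                tok = rename.computeIfAbsent(tok, k -> "v" + rename.size());
            out.append(tok);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String a = canonical(List.of("x * x <= 4096", "x > 0"));
        String b = canonical(List.of("this.sums.length * this.sums.length <= 4096",
                                     "this.sums.length > 0"));
        System.out.println(a.equals(b));  // true: same group, solved once
    }
}
```

A solution found for one representative then transfers to every set in its group by reversing the renaming.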
32




Generating and Presenting Puzzles
33




Approach Overview
34




Test Case Generation from Solutions
• Solutions from Constraint Solving Puzzles:
  • More models for not covered branches


• Solutions from Object Mutation Puzzles:
  • Method call sequences to achieve the goal state
  • They can be directly translated to source code for generating one
    test input.
  • A test case can be generated when all test inputs are generated.
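A generated test might look like the following sketch (purely illustrative: the target branch and the replayed call sequence reuse the `List` stand-in from the earlier `Container` example; PAT's emitted code is not shown in the slides):

```java
import java.util.ArrayList;
import java.util.List;

public class GeneratedTestSketch {
    // Stand-in for the earlier foo(Container): the branch the puzzle targeted.
    static boolean targetBranch(List<Integer> container) {
        return container.size() >= 10;  // once-uncovered branch condition
    }

    public static void main(String[] args) {
        // The player's recorded solution, replayed as straight-line code.
        List<Integer> container = new ArrayList<>();
        for (int i = 0; i < 10; i++) container.add(i);

        System.out.println(targetBranch(container));  // true: branch covered
    }
}
```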
35




Evaluation
36




Evaluation Setup
• Subjects:
           Subject           Version          Branches
     Commons Math              2.1              7707
     Commons Collections      3.2.1             5242



• Up-front testing runs:
  • Baseline techniques: Randoop + Symbolic execution module


  • Coverage:
           Subject           Randoop           Symbolic
     Commons Math              61.6%             64.4%
     Commons Collections       53.0%             56.7%
37




Research Question 1
• How many PAT puzzles are solvable by humans?


• Participants:
  • Eight computer science major graduate students.


• Subject:

         Subject              Version            Branches
      Commons Math              2.1                   7707
38




Research Question 1
• Presenting Puzzles:
  • The same top 100 object mutation puzzles.
  • The same top 100 constraint solving puzzles.
  • Repeated solutions are counted only once.


• Result:

    Puzzle      Total Presented     Solved         Avg. Time
   Mutation          100              51           1 minute
   Constraint        100              72           1 minute
39




Research Question 2
• How many people would play PAT voluntarily?


• Participants:
  • We posted links to the PAT puzzles on Twitter and encouraged
    people to participate.


• Subject:

         Subject             Version            Branches
   Commons Collections         3.2.1              5242
40




Research Question 2
• Puzzles:
  • The same top 100 object mutation puzzles.
  • The same top 100 constraint solving puzzles.
  • Repeated solutions are counted only once.



• Result
  • In total, 120 people volunteered to play the puzzles


       Puzzle     Total Presented       Solved         Avg. Time
    Mutation            100               24               1 minute
    Constraint          100               84               1 minute
41




Research Question 3
• How much is the test coverage improved by the puzzle
 solutions of PAT?

• We executed test cases generated from the puzzle solutions
  of RQ1 and RQ2.

• We measured the # of additional branches covered by
  PAT over the baseline techniques (Randoop + Symbolic).
42




Research Question 3

 Subject                      Randoop          +Symbolic       +PAT
 ACM (Commons Math)           4750 (61.6%)     +217 (+2.8%)    +534 (+7.0%)
 ACC (Commons Collections)    2780 (53.0%)     +190 (+3.7%)    +308 (+5.8%)

 [Figure: stacked branch coverage (%) per subject, showing the Randoop
  baseline plus branches added by the Symbolic module and by PAT]
43




Research Question 4
• How much manual test case writing effort can be saved
 with the help of PAT?
44




Research Question 4
• How much manual test case writing effort can be saved
 with the help of PAT?
45




Conclusion
• A novel framework to support software testing through
 puzzle solving by Internet users.

• Two prototype puzzles have been introduced:
   • Constraint solving puzzles
   • Object mutation puzzles
   • More kinds of puzzles could be developed in the future


• Evaluations show that PAT puzzles are solvable and can
  help achieve non-trivial branch coverage improvements on
  complex OO programs.


Editor's Notes

  1. Hello everyone, I am Ning Chen from HKUST. Today I will present the paper: Puzzle-Based Automatic Testing: Bringing Humans into the Loop by Solving Puzzles.
  2. In recent years, many automatic test generation approaches have been introduced, such as Randoop, Pex and OCAT. However, their test coverage is still not satisfactory when applied to complex object-oriented programs.
  3. For example, we applied an automatic test generation approach, Randoop, on two complex object-oriented programs, Apache Commons Math and Apache Commons Collections. The branch coverage achieved by Randoop is only 61% and 53% respectively. From the bottom figure, we can see that the coverage by Randoop was basically saturated after 1000 seconds.
  4. Also, in another recent research, the researchers found that the branch coverage achieved by Pex is around 15 – 60% when applied to complex object oriented programs.
  5. Two of the major challenges identified as blocking higher coverage in these automatic test generation approaches are: the constraint solving challenge, where test generators fail to solve path conditions to cover the branches due to SMT solver limitations; and the object creation and mutation challenge, where test generators cannot create and mutate test inputs into the desired object states that can cover the target branches.
  6. I will now present a simple motivating example to show these challenges in automatic test generation approaches. Given a foo method which takes three parameters, our goal is to cover the not covered branch in green. The first challenge we face is: what is a model (a solution) for the path condition in bold? SMT solvers leveraged by the test generation approaches may return an error if they do not support non-linear bitwise operations.
  7. In the second example, assume we are able to retrieve a model for the path condition of the not covered branch, namely that the container object's size is equal to 10. The second challenge we face is: how can we create and mutate the Container object into a size of 10? Sometimes there are setter methods available for modifying the necessary object state, but in many cases there are no direct setter methods available. In such cases, specific method call sequences must be constructed to create and mutate the object into the desired object state. However, creating these method sequences can be a non-trivial task.
  8. After struggling to handle these challenges manually, we raised a question: can we leverage humans to help solve these challenges in the form of puzzles? In our previous motivating examples, especially the one on constraint solving, even though SMT solvers may fail to provide the result, it is in fact not difficult for a human to solve them.
  9. That’s why we propose PAT, a puzzle-based testing environment which incorporates the help from humans to handle challenges like the ones we just presented.Before going into more technical details of PAT, I will first present a brief overview of PAT puzzles.
  10. The first kind of PAT puzzle we designed is the object mutation puzzle. Its purpose is to enlist humans to help find the method call sequences that can create and mutate an object into a target object state.
  11. Initially, PAT presents a goal object state to the human players. The goal state shows the object state which generated objects should satisfy. Human players are asked to help mutate a randomly selected object to satisfy this goal state.
  12. Under the goal state, the current state is also displayed, which represents the state of the object currently being mutated. An object mutation puzzle is considered solved when each condition in the current object matches exactly the corresponding condition in the goal state. Of course, players can always load another object to play with when they think it is unlikely that the current object can be mutated to the goal state.
  13. In the panel at the bottom half of the puzzle, a set of available methods is listed. These are the available public methods in the object's class. Human players can select methods from the list to execute; the execution results are immediately shown on the screen. Furthermore, PAT can heuristically identify and recommend the methods in the list most likely to satisfy the goal state.
  14. The second kind of PAT puzzle we designed is the constraint solving puzzle. Its purpose is to enlist humans to help find models for constraints that were not solved by the SMT solver.
  15. The left panel of the puzzle shows the set of constraints currently being solved. Each line on the panel represents one constraint to be satisfied. A constraint is highlighted in green if it is satisfied under the current set of concrete values; otherwise the constraint is shown in red.
  16. The panel on the right shows the concrete values currently assigned to the variables in the constraints. Human players are expected to provide concrete values for the variables in this panel. The final goal of the constraint solving puzzle is to provide a set of concrete values for the variables that satisfies all of the constraints presented in the left panel.
  17. So, I have just presented an overview of the PAT puzzles. But how does PAT construct these puzzles from the program code? Next, I will present the detailed approach of PAT for creating the puzzles from a program.
  18. The architectural design of the PAT framework is as follows. PAT consists of five modules:
  19. Initially, in the up-front testing runs, we run existing automatic test generation techniques for the target code. We obtain two categories of information in this phase. The first one is…
  20. A complete coverage report from these automatic techniques. Since state-of-the-art test generation approaches such as Randoop can already achieve 50-60% coverage, PAT focuses only on the remaining difficult-to-cover areas, so that human effort is not wasted. The second piece of artifacts we collect is dynamic runtime information such as: a) object instances; b) method call sequences. We will need this information in the later phases to help generate PAT puzzles.
  21. After performing the up-front testing runs, PAT next conducts a path computation on branches not covered in the up-front testing runs. It is a backward symbolic execution process, during which PAT propagates path conditions from the target branch to public method entries. The purpose of this phase is to find the models which can satisfy these path conditions and thereby cover the not covered branches.
  22. Two outputs are generated in this phase. First, PAT retrieves the models to satisfy in order to cover the not covered branches; these models are used to construct the object mutation puzzles. Second, PAT saves the complex path conditions which the SMT solver could not solve during the computation process; PAT uses these complex path conditions to construct the constraint solving puzzles.
  23. After obtaining the model for covering branches, PAT uses these models to construct the object mutation puzzles.
  24. Complete models extracted from the path computation process are not used directly to form the puzzle. The reason is that a complete model can contain several object states related to different inputs. For example, the complete model in the upper figure consists of two object states related to different test inputs. Therefore, PAT first divides models into sub-models instead of using complete models to construct puzzles.
  25. After extracting sub-models, PAT next prioritizes them. A sub-model can be shared by many models; for example, the two models on this page share a common sub-model, and solving this common sub-model may benefit both models. Therefore, PAT prioritizes sub-models so that more frequent sub-models are always ranked higher.
  26. Finally, each sub-model is presented as an object mutation puzzle, where human players can interact with the interface to satisfy the presented sub-model.
  27. The second kind of puzzle PAT creates is the constraint solving puzzle.
  28. The first step in creating the constraint solving puzzle is to extract the error related constraints. Given a path condition that caused the SMT solver to fail in the up-front testing runs, PAT identifies and extracts the constraints that cause the SMT error.
  29. For instance, in this example, constraints 3 and 4 are the error-causing constraints and were extracted from the path conditions.
  30. After extracting the error related constraint sets, they are grouped into different categories. The underlying reason is that many constraint sets are identical except for variable names. Therefore, to avoid solving the same constraint sets again and again, PAT puts semantically identical constraint sets into the same group.
  31. After constraint sets are grouped, each group can then be prioritized according to the number of constraint sets it contains. Groups with more constraint sets are ranked higher and are more likely to be chosen to be presented as a constraint solving puzzle.
  32. Finally, one representative constraint set is selected from each group and presented as a constraint solving puzzle to players. The solution of this puzzle can be applied to all constraint sets of the same group by simply changing the variable names.
  33. Finally, in the last phase, test cases are generated by analyzing the puzzle solutions from players.
  34. For solutions from constraint solving puzzles, PAT uses them as additional models for the not covered branches. For solutions from object mutation puzzles, PAT translates human actions into method call sequences to generate test inputs. If all test inputs of a complete model are generated, a test case can be constructed by combining the corresponding method call sequences together.
  35. Next, I will present the evaluation results of PAT.
  36. For the evaluation, we use two complex object-oriented subject programs, Apache Commons Math and Apache Commons Collections, with more than 7000 and 5000 branches respectively. For the up-front testing runs, we use a state-of-the-art random test generation framework, Randoop, plus a symbolic execution module. The bottom table shows that about 55 to 65% of the total branches are coverable by these baseline approaches, so PAT focuses only on the remaining roughly 40% of branches not coverable in the up-front testing runs.
  37. In research question 1, we want to study how many of the PAT puzzles are solvable by humans. We generated both kinds of puzzles for the subject program Apache Commons Math and invited eight computer science graduate students to play the puzzles.
  38. Each participant was presented with the top 100 object mutation and 100 constraint solving puzzles. We measured the total number of puzzles solved by the participants; repeated solutions were counted only once. In total, 51 of the 100 object mutation puzzles and 72 of the 100 constraint solving puzzles were solved. The average time spent by each participant on a puzzle is around 1 minute, before they either solve it or move on to the next puzzle.
  39. The next research question we investigate is: how many people would play PAT voluntarily? We generated both kinds of puzzles for the subject program Apache Commons Collections, posted the links to the puzzles on Twitter, and encouraged people to participate.
  40. In total, 120 people volunteered to play the PAT puzzles.Similar to research question one, We also measured the total number of puzzles solved by the participants. In total, 24 out of the 100 object mutation puzzles and 84 out of the 100 constraint solving puzzles were solved by these 120 participants. The average time spent by each participant on a puzzle is also around 1 minute.
  41. After retrieving puzzle solutions from the first two research questions, we went on to study how much test coverage can be improved by leveraging the puzzle solutions. To do so, we generated test cases from these PAT puzzle solutions and measured the number of additional branches coverable by PAT over our baseline techniques.
  42. In this figure, the highlighted areas are the additional branches covered by test cases generated from PAT puzzle solutions. In total, 534 and 308 additional branches were covered by these test cases. Considering the saturated coverage already achieved by the baseline test generation techniques, and the relatively small scale of the study, the additional improvements by PAT are non-trivial.
  43. From the last research question, we saw that with only 200 PAT puzzles played by humans, more than 500 additional branches could be covered. In research question 4, we study how much manual test case writing effort can potentially be saved by PAT. We randomly selected 20 branches from the two subjects and manually constructed test cases to cover them.
  44. On average, it took us about 8 to 9 minutes of manual effort to construct one test case to cover a branch. So, PAT has the potential of reducing hours or even days of manual work by developers on test case writing.
  45. In conclusion, in this paper we proposed a novel framework to support software testing through puzzle solving by Internet users. Two types of puzzles have been introduced in the PAT framework, and more kinds of puzzles may be introduced in the future. Evaluations show that PAT puzzles are solvable and can help achieve non-trivial branch coverage improvements on complex OO programs.