Nice to meet you. I’m Hojun Jaygarl from Iowa State University. I’m going to present our paper, OCAT: Object Capture-based Automated Testing. Professor Sunghun Kim from HKUST and Tao Xie from NCSU helped me publish this paper, and Professor Carl Chang is my advisor.
As everyone in this room knows, generating objects is hard. Object-oriented programs are common these days, even at the firmware level. However, many automated test input generation algorithms achieve low structural coverage (e.g., less than 50% branch coverage). Why? Because the search space for object instances is huge. For example, assume a target branch needs an ArrayList holding 500 BankAccount elements and checks the 499th element. Such an input is very hard to generate.
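To make the ArrayList example concrete, here is a minimal sketch of such a hard-to-reach branch. Only the names ArrayList and BankAccount come from the talk; the fields, the method hasOverdrawnAt499, and the choice of index 499 are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class HardBranchDemo {
    // Hypothetical class; only the name BankAccount comes from the talk.
    static final class BankAccount {
        final String owner;
        final long balance;

        BankAccount(String owner, long balance) {
            this.owner = owner;
            this.balance = balance;
        }
    }

    // The target branch is taken only for a list with at least 500
    // elements whose element at index 499 is overdrawn.
    static boolean hasOverdrawnAt499(List<BankAccount> accounts) {
        if (accounts.size() >= 500 && accounts.get(499).balance < 0) {
            return true; // target branch
        }
        return false;
    }

    public static void main(String[] args) {
        // A hand-built input that reaches the target branch.
        List<BankAccount> accounts = new ArrayList<>();
        for (int i = 0; i < 500; i++) {
            accounts.add(new BankAccount("owner" + i, i == 499 ? -1 : 100));
        }
        System.out.println(hasOverdrawnAt499(accounts));
        // A small or empty list, typical of random generation, misses it.
        System.out.println(hasOverdrawnAt499(new ArrayList<>()));
    }
}
```

A random generator would have to build a list of the right size and also put an element with the right state at one specific position, which is why such branches tend to stay uncovered.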
As a preliminary experiment, we investigated the main causes of the branches that Randoop does not cover. We selected 30 source files from 3 open-source projects and categorized the causes.
This is one example of a not-covered branch caused by insufficient object inputs. As you can see in this source, this is a constructor of the Algorithm class, and it needs an object input of the Document class. The constructor of the superclass checks the validity of the Document instance. Thus, without a valid Document instance we cannot generate an Algorithm instance, and therefore we cannot test any method in the Algorithm class.
Current approaches to generating object inputs have some limitations. Korat needs manual effort for writing class invariants and value domains. Pex uses .. Randoop cannot generate..
This slide shows an overview of the OCAT idea. First, we capture objects from program executions such as system tests, regression tests, and normal or planned program runs. After capturing objects, we seed them into a method sequence generation technique. Generating method sequences with captured objects produces more objects. Since these inputs reflect real usage, capturing them and exploiting them in automated testing has great potential for covering new branches. After method sequence generation, we check the not-covered branches. We do this because method sequence generation may quickly cover most branches but cannot reach some complex ones. So we statically analyze the not-covered branches. If we can obtain a value that a not-covered branch needs, we directly mutate captured objects based on the SMT solver’s answer. All of these captured, generated, and mutated objects can be used as test inputs to achieve high code coverage.
Let’s talk about the capturing process. First, we need to instrument the target program. Then, by running the target program, we can capture objects.
This is an example of an instrumented program. The instrumentation is actually done at the Java bytecode level. We insert code that calls our capturing library at the entry points of method calls. So now, by executing a program that uses this Algorithm class, we can capture the Algorithm instance as a receiver, along with the Document instance and the string algorithmURI.
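The effect of the instrumentation can be sketched at source level, even though the talk says the real instrumentation happens in bytecode. Everything below is an illustrative assumption: the Capturer class, its capture method, and the simplified Document and Algorithm classes are stand-ins, not OCAT's actual library or the Apache XML Security classes.

```java
import java.util.ArrayList;
import java.util.List;

class CaptureEntryDemo {
    // Hypothetical capture library: remembers receivers and arguments.
    static final class Capturer {
        static final List<Object> CAPTURED = new ArrayList<>();

        static void capture(Object receiver, Object... args) {
            if (receiver != null) {
                CAPTURED.add(receiver);
            }
            for (Object arg : args) {
                if (arg != null) {
                    CAPTURED.add(arg);
                }
            }
        }
    }

    // Simplified stand-ins for the classes shown on the slide.
    static final class Document {
        final String rootName;

        Document(String rootName) {
            this.rootName = rootName;
        }
    }

    static final class Algorithm {
        private final Document doc;
        private final String algorithmURI;

        Algorithm(Document doc, String algorithmURI) {
            // Inserted call: at constructor entry only the arguments are captured.
            Capturer.capture(null, doc, algorithmURI);
            this.doc = doc;
            this.algorithmURI = algorithmURI;
        }

        String getAlgorithmURI() {
            // Inserted call: capture the receiver at method entry.
            Capturer.capture(this);
            return algorithmURI;
        }
    }

    public static void main(String[] args) {
        Algorithm alg = new Algorithm(new Document("Signature"), "http://example.org/alg");
        alg.getAlgorithmURI();
        System.out.println(Capturer.CAPTURED.size() + " objects captured");
    }
}
```

Running any existing test that touches Algorithm would then fill the capture list as a side effect, without changing the behavior of the program under test.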
When the capture method is called, it gets the type and the concrete state of the object to be stored. We keep the types and states of all stored objects to maintain an object repository with redundant objects eliminated. We use a concrete state representation for state-equality checking.
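One way to picture the repository is a map keyed by a string that encodes the type and the field values. This is a sketch under assumptions, not OCAT's implementation: the names ObjectRepositoryDemo, stateOf, and store are hypothetical, and the state string only covers the fields declared directly on the class (no superclass fields and no recursion into referenced objects).

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

class ObjectRepositoryDemo {
    // Hypothetical captured type.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Concrete state: class name plus every instance field and its value.
    static String stateOf(Object o) {
        StringBuilder sb = new StringBuilder(o.getClass().getName()).append('{');
        try {
            for (Field f : o.getClass().getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) {
                    continue;
                }
                f.setAccessible(true);
                sb.append(f.getName()).append('=').append(f.get(o)).append(';');
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return sb.append('}').toString();
    }

    private final Map<String, Object> byState = new LinkedHashMap<>();

    // Returns true if the object was new, false if a state-equal one exists.
    boolean store(Object o) {
        return byState.putIfAbsent(stateOf(o), o) == null;
    }

    int size() {
        return byState.size();
    }

    public static void main(String[] args) {
        ObjectRepositoryDemo repo = new ObjectRepositoryDemo();
        repo.store(new Point(1, 2));
        repo.store(new Point(1, 2)); // same type and state, not stored again
        repo.store(new Point(2, 1));
        System.out.println(repo.size() + " distinct objects");
    }
}
```

Keying on state rather than identity means that a method called a thousand times with equal arguments contributes one stored object instead of a thousand.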
This example shows how object instances are actually saved. Assume we captured an ArrayList of bank accounts; it will be saved as an XML file. BankAccount has three fields, and we can see the values of those fields in the XML.
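The following sketch shows roughly what such an XML snapshot could look like. It is a hand-written stand-in and not the serializer used by OCAT: the three BankAccount fields (id, owner, balance), the element names, and the helpers toXml and listToXml are all assumptions, and only flat objects are handled.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class XmlSnapshotDemo {
    // Hypothetical fields; the talk only says BankAccount has three.
    static final class BankAccount {
        final int id;
        final String owner;
        final long balance;

        BankAccount(int id, String owner, long balance) {
            this.id = id;
            this.owner = owner;
            this.balance = balance;
        }
    }

    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Writes one child element per instance field (flat objects only).
    static String toXml(Object o) {
        String tag = o.getClass().getSimpleName();
        StringBuilder sb = new StringBuilder("  <" + tag + ">\n");
        try {
            for (Field f : o.getClass().getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) {
                    continue;
                }
                f.setAccessible(true);
                sb.append("    <").append(f.getName()).append('>')
                  .append(escape(String.valueOf(f.get(o))))
                  .append("</").append(f.getName()).append(">\n");
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return sb.append("  </").append(tag).append(">\n").toString();
    }

    static String listToXml(List<?> items) {
        StringBuilder sb = new StringBuilder("<list>\n");
        for (Object item : items) {
            sb.append(toXml(item));
        }
        return sb.append("</list>\n").toString();
    }

    public static void main(String[] args) {
        List<BankAccount> accounts = new ArrayList<>();
        accounts.add(new BankAccount(1, "alice", 120));
        accounts.add(new BankAccount(2, "bob", -5));
        System.out.print(listToXml(accounts));
    }
}
```

A text format like this keeps captured objects inspectable and editable, which matters later when captured objects are mutated.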
OCAT generates more objects by feeding captured objects to an existing automated method sequence generation technique. The captured objects are evolved to achieve high coverage by constructing and executing method sequences.
When captured objects are used for method sequence generation, they are used in two ways. They can be used directly, as you see in the first diagram, or indirectly, as you see in the second diagram. If a directly used object generates more objects, those can be used to test other methods.
This is an example of a method sequence generated with captured objects. Assume there is a target method that needs a Map container as input. The method sequence generator produces method invocations that load captured objects: TreeMap, String, BeanMap, and an integer. By generating method sequences, we produce more object instances by evolving captured objects.
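The direct and indirect usage of captured objects can be sketched with a small object pool. This is a simplified illustration and not Randoop or OCAT itself: the classes A and B, the methods foo and bar, and the evolve helper are hypothetical, and the random choice stands in for a real feedback-directed generator.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.function.Function;

class SequenceSeedingDemo {
    static final class A {
        final int n;

        A(int n) {
            this.n = n;
        }
    }

    static final class B {
        final String s;

        B(String s) {
            this.s = s;
        }
    }

    // Hypothetical methods under test.
    static A foo(A a) {
        return new A(a.n + 1); // direct usage: takes A, returns a new A
    }

    static B bar(A a) {
        return new B("b" + a.n); // indirect usage: takes A, returns a new B
    }

    // Seeds the pool with one captured A, then grows it by calling methods.
    static Map<Class<?>, List<Object>> evolve(A captured, int rounds, long seed) {
        Map<Class<?>, List<Object>> pool = new HashMap<>();
        pool.computeIfAbsent(A.class, k -> new ArrayList<>()).add(captured);

        List<Function<A, Object>> methods = new ArrayList<>();
        methods.add(SequenceSeedingDemo::foo);
        methods.add(SequenceSeedingDemo::bar);

        Random rnd = new Random(seed);
        for (int i = 0; i < rounds; i++) {
            List<Object> inputs = pool.get(A.class);
            A input = (A) inputs.get(rnd.nextInt(inputs.size()));
            Object result = methods.get(rnd.nextInt(methods.size())).apply(input);
            // Every returned object becomes a candidate input for later calls.
            pool.computeIfAbsent(result.getClass(), k -> new ArrayList<>()).add(result);
        }
        return pool;
    }

    public static void main(String[] args) {
        Map<Class<?>, List<Object>> pool = evolve(new A(0), 20, 42L);
        pool.forEach((type, objects) ->
                System.out.println(type.getSimpleName() + ": " + objects.size()));
    }
}
```

Without the captured seed the pool would start empty and neither foo nor bar could be called at all, which mirrors the Algorithm and Document situation described earlier.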
Instead of toy projects, we use open-source projects for evaluation. Apache Commons Collections (ACC) is an extended collection library that provides new utilities, such as buffer, queue, and map. Apache XML Security (AXS) is a library implementing the XML Digital Signature Specification and the XML Encryption Specification. JSAP is a command-line argument parser.
Now I’m going to present the first result of our approach. So, how much can OCAT improve code coverage through captured object instances with method sequence generation? The first line is our approach, the blue line is Randoop, and the dotted line shows the coverage of the executions we used for capturing objects. We measured coverage against the number of tests generated by both our approach and Randoop. As a result, coverage improved by 19% over the 45.2% that Randoop achieved.
You can see that the result for XML Security is significant, because this open-source project needs many legitimate XML inputs. Even with a strong string constraint solver, it’s difficult to generate inputs like XML data. These results suggest that captured object instances play an important role in improving code coverage by assisting an existing testing technique. We observe that the captured object instances are influenced by the original set of tests provided with the subject system. However, the coverage OCAT achieves is much higher than the coverage achieved by simply executing these original tests, because captured object instances can lead a random generation technique to cover more branches.
Captured and generated objects may not be able to cover all branches. CAPTIG mutates captured objects to try to satisfy constraints on conditionals related to not-yet-covered branches. SMT: Satisfiability Modulo Theories.
This is an example of mutation. In the AbstractReferenceMap class, one branch has not been covered, since ATG cannot generate an instance of AbstractReferenceMap. We already know that the true path of this branch has not been covered, so we extract its precondition by static analysis. We put the extracted precondition into an SMT solver and get a value. Because we have captured objects, we simply modify the object instance with the value we got and check whether the mutated object instance satisfies the branch condition.
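The mutation step can be sketched as a reflective field update followed by a re-check of the branch. This is only an illustration under assumptions: RefMap is a hypothetical stand-in for a captured object, the precondition keyType == 2 is invented, and the value 2 is hard-coded where a real SMT solver's answer would be used.

```java
import java.lang.reflect.Field;

class MutationDemo {
    // Hypothetical stand-in for a captured object with an uncovered branch.
    static final class RefMap {
        private int keyType = 0;

        boolean usesWeakKeys() {
            if (keyType == 2) {
                return true; // not-yet-covered branch
            }
            return false;
        }
    }

    // Overwrites one field of a captured object with a solver-provided value.
    static void setField(Object target, String name, Object value) {
        try {
            Field f = target.getClass().getDeclaredField(name);
            f.setAccessible(true);
            f.set(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        RefMap captured = new RefMap();
        System.out.println(captured.usesWeakKeys()); // branch not taken

        // Assumed solver answer for the precondition keyType == 2.
        int solverValue = 2;
        setField(captured, "keyType", solverValue);
        System.out.println(captured.usesWeakKeys()); // branch now taken
    }
}
```

Writing fields directly can produce states that no method sequence would reach, so checking the branch condition after the mutation, as the talk describes, is the natural sanity step.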
With mutated objects, OCAT can still improve code coverage by approximately 4%. This might not look significant, since we use a very simple static analysis technique, but the improvement is not trivial.
Overall, branch coverage improved by 22.9% for ACC, 32.7% for AXS, and 20.9% for JSAP. On average, coverage increased by 25.5%.
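As a quick arithmetic check, and assuming the average is the unweighted mean of the three per-subject improvements (22.9% for ACC, 32.7% for AXS, 20.9% for JSAP), the reported figure is reproduced exactly:

```latex
\[
\frac{22.9 + 32.7 + 20.9}{3} = \frac{76.5}{3} = 25.5
\]
```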
The captured object instances reflect real usage of object instances. Capturing these object instances and exploiting them has great potential for covering new branches. Capturing objects is easier than writing a specification or an initial test case.
Testing object-oriented (OO) software is critical because OO languages are commonly used. Many automated test input generation algorithms achieve low structural coverage (e.g., less than 50% branch coverage). The challenges: the search space for object instances is huge, and nearly half of the difficult-to-cover branches require desirable objects. The main difference between unit tests for procedural languages (such as C) and object-oriented languages (such as C++ and Java) is that in procedural languages a tester only needs to prepare input arguments for the tested functions plus global variables, whereas in object-oriented languages a tester needs to prepare both receiver and argument object instances.
OCAT: Object Capture based Automated Testing (ISSTA 2010)
OCAT: Object Capture based Automated Testing. Hojun Jaygarl and Carl K. Chang (Iowa State University), Sunghun Kim (The Hong Kong University of Science and Technology), Tao Xie (North Carolina State University). ISSTA 2010.
Problem: Generating object inputs is hard in object-oriented (OO) unit testing.
Automated Test Generation (ATG): Automatically generate test inputs for a unit. Reduce manual effort in OO unit testing.
Two Main Types of ATG Techniques. Direct object construction: directly assign values to the fields of the object instance, e.g., Korat [ISSTA02]. Method sequence generation: generate method sequences that can produce an object instance under construction, e.g., Randoop [ICSE’07], Pex [TAP08].
Results of the State of the Art in ATG. Pex: automatically generates test inputs using dynamic symbolic execution; 21% branch coverage [Thummalapenta et al. FSE 09] for QuickGraph, a C# graph library. Randoop: automatically generates method sequences, random but feedback-directed; 58% branch coverage [Thummalapenta et al. FSE 09]; 45% according to our evaluation for Apache Commons Collections 3.2.
Causes of Not-Covered Branches

| Cause of not-covered branches | # of branches | Explanation |
|---|---|---|
| Insufficient object | 135 (46.3%) | Unable to generate desirable object instances required to cover certain branches. |
| String comparison | 61 (20.9%) | Difficult to randomly find a desirable string to satisfy such constraints, since the input space of string type values is huge. |
| Container object access | 39 (13.4%) | Not easy to create a certain size of a container with necessary elements. |
| Array comparison | 25 (8.6%) | Not easy to create a certain size of an array with necessary elements. |
| Exception branches | 18 (6.1%) | These branches have a particularity of exception handling code that handles run-time errors. |
| Environmental setting | 9 (3.1%) | Hard to get environment variables and file-system structure. |
| Non-deterministic branch | 4 (1.3%) | Hard to handle multi-threading and user interactions. |

Slide callout: almost 90% (the first four causes sum to 89.2%).

We ran Randoop on three projects: Apache Commons, XML Security, and JSAP. We randomly selected 10 source code files from each subject and investigated the causes of uncovered branches.
Not-Covered Branch Example: Checks the validity of “doc”.
Limitations of Current Approaches. Korat: requires manual effort for writing class invariants and value domains. Pex: uses built-in simple heuristics for generating fixed sequences, which are often ineffective. Randoop: the random approach cannot generate relevant sequences that produce desirable object instances.
Idea: Practical approach (high coverage). Reflect real usage. Easier process. Object capture based Automated Testing.
Usage of Captured Objects: Captured objects are de-serialized and used as test inputs. Evolved objects (direct usage): a captured instance of A is passed to foo(A), which returns a new instance of A. Indirect usage: a captured instance of A is passed to bar(A), which returns a new instance of B.
Evaluation - Mutated Objects. Q2: How much can mutated object instances further improve code coverage?
Evaluation - Total: 25.5% improved on average, with a maximum of 32.7%. Chart labels: 32.7%, 22.9%, 20.9%.
Why Is It a Feasible Idea? Reflects real usage. Potential for being desirable inputs in achieving new branch coverage. Capturing objects is easy.
Discussion. Object capturing process. Problem: OCAT’s coverage depends on captured objects. Capturing objects is an easy process (but we still need to capture “good-enough” objects). Captured objects and software evolution. Problem: software evolves and objects change. Captured object instances may become obsolete and no longer be valid.
Discussion. Branches to cover. Problem: more than 20% of branches are still not covered. Cross-system object capturing: objects can be captured from system A and used for system B. Static analysis: currently we use a simple static analysis. Iterative process: the two phases, object generation and object mutation, can be applied iteratively.
Threats to Validity. The software under test might not be representative: our three subjects may yield better or worse OCAT coverage than other software projects. Our object capturing relies on the existing tests: the OCAT test coverage reported in this paper depends on the quality and quantity of existing tests.
Conclusions. Problem: it is hard to generate desirable object instances in OO unit testing. OCAT approach: capture objects from program execution; generate more objects with method sequence generation; directly mutate the captured objects to try to cover the not-yet-covered branches.
Conclusions. Results: OCAT helps Randoop achieve high branch coverage, 68.5% on average: 25.5% improved (with a maximum of 32.7%) from only 43.0% achieved by Randoop alone. Future work: enhance the static analysis part; apply captured objects to other state-of-the-art techniques (e.g., parameterized unit testing, concolic testing, and model checking approaches); release the tool.