This document discusses using the Korat tool for automated testing of Java programs. It provides an overview of Korat and how it works. Korat takes a formal specification of a method including a precondition and postcondition. It then automatically generates test cases within given size bounds and executes the method on each test case to check correctness. The document explains the key components of Korat including imperative predicates to specify structural constraints, and finitizations to bound the input space. It provides examples using binary trees and heap arrays to illustrate how to write the predicates and finitizations needed for Korat to generate test cases.
CS5393: Project Report
Project Report: Bounded Exhaustive Testing on Java using Korat Tool
Instructor: Dr. Guowei Yang
Akshay Mittal 12/1/16 Software Quality
Project Report: Bounded Exhaustive Testing on Java
using Korat Tool
Akshay Mittal
Texas State University
San Marcos, Texas
240-751-5112
akki@txstate.edu
ABSTRACT
This report presents Korat, a novel framework for automated
testing of Java programs. Given a formal specification for a
method, Korat uses the method precondition to automatically
generate all (non-isomorphic) test cases up to a given small size.
Korat then executes the method on each test case, and uses the
method postcondition as a test oracle to check the correctness of
each output.
The inputs that Korat generates enable bounded-exhaustive testing
that checks the code under test exhaustively for all inputs within
the given bounds. To generate test cases for a method, Korat
constructs a Java predicate (i.e., a method that returns a Boolean)
from the method’s precondition. The heart of Korat is a technique
for automatic test case generation: given a predicate and a bound
on the size of its inputs, Korat generates all (non-isomorphic) inputs
for which the predicate returns true. Korat exhaustively explores
the bounded input space of the predicate but does so efficiently by
monitoring the predicate’s executions and pruning large portions of
the search space.
Categories and Subject Descriptors: D.2.5
[Software Engineering]: Testing and Debugging. Testing tools;
D.1.3 [Programming Techniques]: Concurrent Programming.
Parallel programming
General Terms: Algorithms, Performance, Reliability
Keywords: Parallel testing, Korat, bounded-exhaustive
testing, test, Predicate
1. INTRODUCTION
Software testing is an important part of software development
and can account for more than 50% of the development cost.
Two main activities in testing are test generation, which creates
tests to be executed, and test execution, which executes the tests
to check the code under test. While test execution is often
automated and can easily handle many tests, test generation is
typically manual and thus tedious and error-prone when
generating a large number of tests.
Korat is a tool for automated testing of Java programs. Korat focuses
on programs that have structurally complex inputs: the inputs are
structural (e.g., represented with linked data structures) and must
satisfy complex properties that relate parts of the structure (e.g.,
invariants for linked data structures). Almost all modern software
systems manipulate structurally complex data.
For example, Java programs operate on a heap that consists
of linked objects; each heap configuration must satisfy the
consistency properties of the data structures in the heap.
As another example, Web services manipulate XML documents;
each service operates only on documents that satisfy certain
syntactic and semantic properties.
The Korat tool implements a solver for imperative predicates
that express structural invariants in Java code. The solver takes an
imperative predicate and additionally a finitization that bounds the
size of the structures that are inputs to the predicate.
1.1 Motivation
Modern software pervasively uses structurally complex data, for
example web-traversal code operates on graphs that encode web
pages, and IDEs manipulate program representations such as
abstract syntax trees. The standard approach to generating test
suites for such software, manual generation of the inputs in the
suite, is tedious and error-prone. This project presents a new
approach that automates the generation of suites with structurally
complex test inputs. The approach is based on test abstractions, which
provide a high-level description of desired test suites. Developers
do not need to manually write large suites of individual tests but
instead write test abstractions from which tools automatically
generate individual tests.
As a computer science student who is passionate about software
testing, I decided to study the Korat tool, which enables automated
testing of Java programs through bounded-exhaustive testing. This
project gave me the opportunity to explore that style of testing
in practice.
1.2 Overview of the tool and the analyzed
problems
Korat can graphically show the structures it generates. The
visualization in Korat was inspired by Alloy, and the current Korat
implementation uses the Alloy Analyzer's visualization facility,
which provides a fully customizable display that allows users to
specify desired views on the underlying structures. Korat
automatically translates object graphs into the Alloy representation.
Korat requires (1) an imperative predicate that specifies the desired
structural constraints and (2) a finitization that bounds the desired
test input size. Korat generates all predicate inputs (within the
bounds) for which the predicate returns true. To do so, Korat
performs a systematic search of the predicate's input space. The
inputs that Korat generates enable bounded-exhaustive testing for
programs ranging from library classes to stand-alone applications.
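The idea of solving a predicate over a bounded input space can be illustrated without the Korat library. The following is only a naive sketch: the class NaiveTreeSearch and its methods are hypothetical names, and the search enumerates every candidate and filters it with the predicate, with none of the pruning or isomorphism elimination that Korat performs.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-alone sketch; it does not use the Korat library.
class NaiveTreeSearch {
    static class Node { Node left, right; }

    Node root;
    int size;

    // Predicate: the nodes reachable from root form a tree and size matches.
    boolean repOK() {
        if (root == null) return size == 0;
        Set<Node> visited = new HashSet<>();
        Deque<Node> work = new ArrayDeque<>();
        visited.add(root);
        work.add(root);
        while (!work.isEmpty()) {
            Node cur = work.removeFirst();
            for (Node child : new Node[] {cur.left, cur.right}) {
                if (child == null) continue;
                if (!visited.add(child)) return false; // cycle or shared node
                work.add(child);
            }
        }
        return visited.size() == size;
    }

    static Node pick(Node[] pool, int c) { return c == 0 ? null : pool[c - 1]; }

    // Tries every assignment of {null, N0..N(n-1)} to root and to each
    // left/right field, with size fixed, and counts the ones repOK accepts.
    static int countValid(int n, int size) {
        Node[] pool = new Node[n];
        for (int i = 0; i < n; i++) pool[i] = new Node();
        int[] choice = new int[1 + 2 * n]; // 0 means null, k means pool[k-1]
        int valid = 0;
        while (true) {
            NaiveTreeSearch t = new NaiveTreeSearch();
            t.size = size;
            t.root = pick(pool, choice[0]);
            for (int i = 0; i < n; i++) {
                pool[i].left = pick(pool, choice[1 + 2 * i]);
                pool[i].right = pick(pool, choice[2 + 2 * i]);
            }
            if (t.repOK()) valid++;
            // Odometer-style step to the next candidate.
            int pos = choice.length - 1;
            while (pos >= 0 && choice[pos] == n) choice[pos--] = 0;
            if (pos < 0) return valid;
            choice[pos]++;
        }
    }

    public static void main(String[] args) {
        System.out.println(countValid(3, 3));
    }
}
```

With three nodes and size fixed at three, this naive search accepts 30 candidates: 5 tree shapes, each counted once per permutation of the three node objects. Dropping the isomorphic copies would leave the 5 shapes, which is the kind of reduction the non-isomorphic generation described in this report is meant to achieve.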
1.2.1 Imperative predicate
An imperative predicate (also called repOK) is a Java method that
checks class invariants. Predicates do not take any parameters and
should return true for all valid structures and false for all others.
Korat uses this method to check whether a structure generated during
the search is valid, i.e., satisfies all class invariants and
hence can be considered a valid test case. This means that every
class you want to test using Korat must have its own predicate method.
Important: it is good practice to return false as soon as possible,
before accessing all fields and checking all conditions. If a certain
condition is not met, return false immediately. Because Korat
monitors the predicate's executions to prune the search space, an
early return can speed up the Korat search process significantly.
Note: when writing predicate methods you should always name
them repOK because that is the default predicate name that Korat
searches for. If you really want to use a different name, you can do
that, but don't forget to inform Korat about it through the
--predicate command line switch.
Finitization
The finitization method tells Korat how to bound the input space. The
statements in the finitization method specify bounds on the number
of objects used to construct instances of the data structure, as
well as the possible values stored in the fields of those objects. The
basic idea is to create field domains for all fields that you want
Korat to vary during the search. Each field domain
contains the set of values that its corresponding field can take during
the search. For reference-type fields the domain is an ObjSet, and for
primitive types there are IntSet, FloatSet and so on. In general, it is
perfectly fine to have a single field domain associated with many
fields.
Note: when writing finitization methods you should always name
them fin<className> because that is the default finitization name
that Korat searches for. If you really want to use a different name,
you can do that, but don't forget to inform Korat about it through
the --finitization command line switch.
These two concepts are easier to understand by going
through the BinaryTree and HeapArray examples
in more detail.
BinaryTree Example
Introduction
This example illustrates the generation and checking of linked data
structures using simple binary trees. Each tree node has pointers to
its left and right children, and the binary tree itself has a pointer to the root
node and an int value that stores the number of nodes in the tree.
A skeleton for this structure could look like:
public class BinaryTree {
    public static class Node {
        Node left;
        Node right;
    }
    private Node root;
    private int size;
}
Writing predicate method (repOK)
Now we want to write a predicate method. In the case of binary
trees, the predicate method checks that the tree doesn't have any cycles
and that the number of nodes reachable from the root matches the value
of the field size. A simple method like the following should do the
job. Note that it returns false as soon as it can.
// needs java.util.HashSet, java.util.LinkedList and java.util.Set
public boolean repOK() {
    if (root == null)
        return size == 0;
    // checks that tree has no cycle
    Set visited = new HashSet();
    visited.add(root);
    LinkedList workList = new LinkedList();
    workList.add(root);
    while (!workList.isEmpty()) {
        Node current = (Node) workList.removeFirst();
        if (current.left != null) {
            if (!visited.add(current.left))
                return false;
            workList.add(current.left);
        }
        if (current.right != null) {
            if (!visited.add(current.right))
                return false;
            workList.add(current.right);
        }
    }
    // checks that size is consistent
    return (visited.size() == size);
}
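To see how such a predicate separates valid from invalid structures, it can be called on a few hand-built object graphs. The sketch below is a self-contained copy of the binary-tree predicate with generics added; the class name BinaryTreeDemo and the package-private fields are assumptions made only so the example can be run outside Korat.

```java
import java.util.HashSet;
import java.util.LinkedList;
import java.util.Set;

// Hypothetical demo class; fields are package-private so a caller can build trees by hand.
class BinaryTreeDemo {
    static class Node { Node left; Node right; }

    Node root;
    int size;

    boolean repOK() {
        if (root == null)
            return size == 0;
        Set<Node> visited = new HashSet<>();
        visited.add(root);
        LinkedList<Node> workList = new LinkedList<>();
        workList.add(root);
        while (!workList.isEmpty()) {
            Node current = workList.removeFirst();
            if (current.left != null) {
                if (!visited.add(current.left))
                    return false; // node reached twice: cycle or sharing
                workList.add(current.left);
            }
            if (current.right != null) {
                if (!visited.add(current.right))
                    return false;
                workList.add(current.right);
            }
        }
        return visited.size() == size;
    }

    public static void main(String[] args) {
        BinaryTreeDemo t = new BinaryTreeDemo();
        Node a = new Node(), b = new Node(), c = new Node();
        a.left = b;
        a.right = c;
        t.root = a;
        t.size = 3;
        System.out.println("tree with 3 nodes, size 3: " + t.repOK());
        b.left = a; // introduce a cycle back to the root
        System.out.println("after adding a cycle: " + t.repOK());
    }
}
```

The second call returns false as soon as the root is reached a second time, before the size check runs, which is the early-exit behaviour recommended for predicates.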
Writing finitization method (finBinaryTree)
The only thing that's left to be done is to tell Korat how to bound
the input space.
public static IFinitization finBinaryTree(int nodesNum, int minSize,
        int maxSize) {
    IFinitization f = FinitizationFactory.create(BinaryTree.class);
    IObjSet nodes = f.createObjSet(Node.class, nodesNum, true);
    f.set("root", nodes);
    f.set("Node.left", nodes);
    f.set("Node.right", nodes);
    IIntSet sizes = f.createIntSet(minSize, maxSize);
    f.set("size", sizes);
    return f;
}
Here is the explanation line by line. The first line creates an
"empty" finitization using the FinitizationFactory.create factory
method, passing it the class under test as an argument.
Then, a set of nodes is created by calling the createObjSet method.
This method takes several parameters:
• class of objects to be created
• number of objects of the given class to be created
• whether to include null or not
This means that the second line creates a set of Node objects
that contains null plus nodesNum instances of class Node.
The next thing to do is to associate certain fields with the newly created
object set. The fields BinaryTree.root, Node.left and Node.right are all
of type Node, and it is fine to have them all associated with this object
set. That is what the next three lines do.
The only field left to be bounded is BinaryTree.size, so we simply
create an IntSet and assign it to the field size.
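A finitization of this kind fixes how many raw candidates exist before any predicate is evaluated. As a rough sketch, assuming the candidate space is simply the product of the field-domain sizes (one domain entry per field of every object in the pool), the count for the binary-tree bounds can be computed as follows; the class CandidateSpace is a hypothetical helper and not part of Korat.

```java
import java.math.BigInteger;

// Hypothetical helper: size of the unpruned candidate space for the
// binary-tree bounds, assuming it is the product of all field-domain sizes.
class CandidateSpace {
    static BigInteger binaryTreeCandidates(int nodesNum, int minSize, int maxSize) {
        // root, and left/right of each node, range over nodesNum objects plus null
        BigInteger perReference = BigInteger.valueOf(nodesNum + 1L);
        int referenceFields = 1 + 2 * nodesNum;
        // size ranges over the integers minSize..maxSize
        BigInteger sizeValues = BigInteger.valueOf(maxSize - minSize + 1L);
        return perReference.pow(referenceFields).multiply(sizeValues);
    }

    public static void main(String[] args) {
        System.out.println(binaryTreeCandidates(3, 3, 3));
    }
}
```

For three nodes and size fixed at three this gives 4^7 = 16384 candidates, which grows quickly with nodesNum and suggests why pruning the search matters.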
HeapArray Example
Introduction
This example illustrates the generation and checking of array-based
data structures, using the heap data structure. The (binary) heap
data structure can be viewed as a complete binary tree — the tree
is filled on all levels except possibly the lowest, which is filled from
the left up to some point. Heaps also satisfy the heap property —
for every node n other than the root, the value of n’s parent is
greater than or equal to the value of n. Heap structure can be
efficiently implemented using a single array to represent all heap
nodes. Skeleton for this implementation could look like:
public class HeapArray {
    private int size;
    private int array[];
}
Writing predicate method (repOK)
The predicate method should check that all heap properties are satisfied.
It returns false as soon as possible, immediately after it discovers that
some condition is not met. In this encoding, the value -1 marks an
unused array slot.
public boolean repOK() {
    if (array == null)
        return false;
    if (size < 0 || size > array.length)
        return false;
    // the first size slots hold heap elements
    for (int i = 0; i < size; i++) {
        int elem_i = array[i];
        if (elem_i == -1)
            return false;
        // every node other than the root must not exceed its parent
        if (i > 0) {
            int elem_parent = array[(i - 1) / 2];
            if (elem_i > elem_parent)
                return false;
        }
    }
    // the remaining slots must be unused
    for (int i = size; i < array.length; i++)
        if (array[i] != -1)
            return false;
    return true;
}
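The heap predicate can be exercised on a few concrete arrays to check that it enforces the heap property stated in the introduction (a parent is greater than or equal to its children) and the convention that unused slots hold -1. This is a minimal sketch; the class HeapArrayDemo and its varargs constructor are hypothetical conveniences for building inputs by hand.

```java
// Hypothetical demo class mirroring the HeapArray skeleton.
class HeapArrayDemo {
    int size;
    int[] array;

    HeapArrayDemo(int size, int... array) {
        this.size = size;
        this.array = array;
    }

    boolean repOK() {
        if (array == null)
            return false;
        if (size < 0 || size > array.length)
            return false;
        for (int i = 0; i < size; i++) {
            int elem = array[i];
            if (elem == -1)
                return false;                 // -1 is reserved for unused slots
            if (i > 0 && elem > array[(i - 1) / 2])
                return false;                 // child larger than its parent
        }
        for (int i = size; i < array.length; i++)
            if (array[i] != -1)
                return false;                 // slots past size must be unused
        return true;
    }

    public static void main(String[] args) {
        System.out.println(new HeapArrayDemo(3, 5, 3, 4, -1).repOK());
        System.out.println(new HeapArrayDemo(3, 3, 5, 4, -1).repOK());
    }
}
```

The first heap is accepted, while the second is rejected because the element 5 at index 1 is larger than its parent 3 at index 0.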
Writing finitization method (finHeapArray)
The finitization method looks basically the same as the finitization for
the BinaryTree example, so you should first take a look at the
previous example if you haven't already done so. The only new thing
here is the finitization of arrays.
public static IFinitization finHeapArray(int maxSize, int maxArrayLength,
        int maxArrayValue) {
    IFinitization f = FinitizationFactory.create(HeapArray.class);
    IIntSet sizes = f.createIntSet(0, 1, maxSize);
    IIntSet arrayLength = f.createIntSet(0, 1, maxArrayLength);
    IIntSet arrayValues = f.createIntSet(-1, 1, maxArrayValue);
    IArraySet arrays = f.createArraySet(int[].class, arrayLength,
            arrayValues, 1);
    f.set("size", sizes);
    f.set("array", arrays);
    return f;
}
To finitize array fields, you must provide two field domains. The first
one (arrayLength in the code above) is an IIntSet that lists all
possible values that the array length can take. The second one
(arrayValues in the code above) lists all possible values that array
elements can take. This field domain has to be compatible with
the array type, meaning that in the case of int arrays (as in this
example) it has to be an IIntSet and in the case of reference-type
arrays it has to be an IObjSet with the appropriate class of objects.
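What these three bounds describe can be made concrete by brute force: try every size from 0 to maxSize, every array length from 0 to maxArrayLength, every element value from -1 to maxArrayValue, and count the combinations that the heap predicate accepts. The sketch below does this without the Korat library; HeapBoundsSketch and countValid are hypothetical names, and the enumeration is deliberately naive.

```java
import java.util.Arrays;

// Hypothetical brute-force counterpart of the heap bounds; not Korat's search.
class HeapBoundsSketch {
    static boolean repOK(int size, int[] array) {
        if (array == null || size < 0 || size > array.length) return false;
        for (int i = 0; i < size; i++) {
            if (array[i] == -1) return false;
            if (i > 0 && array[i] > array[(i - 1) / 2]) return false;
        }
        for (int i = size; i < array.length; i++)
            if (array[i] != -1) return false;
        return true;
    }

    static int countValid(int maxSize, int maxArrayLength, int maxArrayValue) {
        int count = 0;
        for (int len = 0; len <= maxArrayLength; len++) {
            int[] a = new int[len];
            Arrays.fill(a, -1);                   // smallest element value
            while (true) {
                for (int size = 0; size <= maxSize; size++)
                    if (repOK(size, a)) count++;
                // Step to the next array contents, odometer style.
                int pos = len - 1;
                while (pos >= 0 && a[pos] == maxArrayValue) a[pos--] = -1;
                if (pos < 0) break;
                a[pos]++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countValid(2, 2, 1));
    }
}
```

For maxSize 2, maxArrayLength 2 and maxArrayValue 1, ten combinations pass the predicate: one with an array of length 0, three with length 1, and six with length 2.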
2. Study
2.1 Installing the tool
1. Prerequisite: Java version 1.5 or later
2. Download the binary
distribution, korat_binaries_bundle.zip
(requires Java 1.5)
3. Unzip korat_binaries_bundle.zip
4. Download graphviz, and start graphviz.exe to install it
(this is needed only for the Korat visualization feature)
To run the tool:
a. Open CMD in Admin mode.
b. Change directory to the Korat resource files (I saved all files in
C:\korat)
c. To verify that the tool is working, try running any of the following
commands in the command prompt.
1. java -cp korat.jar korat.Korat --visualize --class korat.examples.binarytree.BinaryTree --args 3,3,3
(runs the binary tree example)
2. java -cp korat.jar korat.Korat --visualize --class korat.examples.searchtree.SearchTree --args 3,3,3,0,2
(runs the binary search tree example)
3. java -cp korat.jar korat.Korat --visualize --class korat.examples.singlylinkedlist.SinglyLinkedList --args 2,3,3,3
(runs the singly linked list example)
4. java -cp korat.jar korat.Korat --visualize --class korat.examples.doublylinkedlist.DoublyLinkedList --args 2,3,3,3
(runs the doubly linked list example)
Online Documentation API:
http://korat.sourceforge.net/docs/korat_api/index.html
Offline Documentation API:
http://korat.sourceforge.net/docs/korat_api.zip
Korat uses the following additional libraries that must be
present in the CLASSPATH environment variable:
Javassist, Alloy4Viz, GraphViz (required by Alloy),
Jakarta Commons CLI Library.
Source distribution can be found at:
https://sourceforge.net/svn/?group_id=186053
2.2 Analyzing the problems using tool
Problem 1:
This section illustrates Korat's generation. Here we use binary
search trees as a running example. Figure 1 shows Java
code that defines a binary search tree. The method repOk is
a Java predicate that checks the representation invariant
of SearchTree. First, repOk checks if the tree is empty. If
not, repOk checks that there are no undirected cycles along
the left and right fields, that the number of nodes reachable
from root is the same as the value of size, and that
all elements in the left (right) subtree of a node are smaller
(larger) than the element in that node.
Korat can generate valid binary search trees. To limit
the number of generated structures, Korat uses a finitization
that bounds the number of objects in the data
structures and the field values of these objects. For
trees, finitization gives the maximum number of nodes and the
possible values in nodes. Following Alloy's terminology
for bounds, we say that a tree is in scope s if it has at
most s nodes and s values. Two trees are isomorphic if they
have the same shape (branching structure) and (primitive)
elements, regardless of the identity of the actual nodes in
the trees.
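The checks described for the search tree predicate (empty tree, no undirected cycles, consistent size, ordered elements) could be written along the following lines. This is a sketch under assumptions rather than the code of Figure 1: the class name SearchTreeSketch, the field name info, and the use of strict ordering (no duplicate elements) are choices made here for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of a binary search tree with a repOK-style predicate.
class SearchTreeSketch {
    static class Node {
        int info;
        Node left, right;
        Node(int info) { this.info = info; }
    }

    Node root;
    int size;

    boolean repOK() {
        if (root == null) return size == 0;          // empty tree
        Set<Node> visited = new HashSet<>();
        if (!isTree(root, visited)) return false;    // no node reached twice
        if (visited.size() != size) return false;    // size is consistent
        return isOrdered(root, null, null);          // search-tree ordering
    }

    private static boolean isTree(Node n, Set<Node> visited) {
        if (n == null) return true;
        if (!visited.add(n)) return false;
        return isTree(n.left, visited) && isTree(n.right, visited);
    }

    // low and high are exclusive bounds inherited from ancestors; null means unbounded.
    private static boolean isOrdered(Node n, Integer low, Integer high) {
        if (n == null) return true;
        if (low != null && n.info <= low) return false;
        if (high != null && n.info >= high) return false;
        return isOrdered(n.left, low, n.info) && isOrdered(n.right, n.info, high);
    }

    public static void main(String[] args) {
        SearchTreeSketch t = new SearchTreeSketch();
        t.root = new Node(2);
        t.root.left = new Node(1);
        t.root.right = new Node(3);
        t.size = 3;
        System.out.println(t.repOK());
    }
}
```

Passing the bounds down the recursion matters: comparing each node only with its direct children would accept a tree whose left subtree contains an element larger than the root.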
Problem 2:
Code that represents DAGs and checks their structural constraints.
To generate DAGs with Korat, the user first needs to write a representation
for DAGs and code that checks structural constraints on
this representation. Figure 1 shows one possibility in Java. Each
object of the class DAG represents a DAG, and each object of the
class DAGNode represents a node. The field nodes stores all nodes
of a DAG, and each node has the field children that stores the
destination nodes of outgoing edges. These fields effectively
represent sets, and Section 4 of [2] discusses the issues that arise in
representing a set with an array.
The two repOK methods check the structural constraints of DAGs.
Following Liskov, we use the name repOK for the methods that
check the representation invariant of data structures used to
implement abstract data types. We refer to these methods as
imperative predicates: they are written in an imperative language and return a
boolean value. In our example, the methods take as input an object
graph consisting of DAGNode objects and check the absence of
(directed) cycles. These repOK methods use Tarjan's algorithm
for strongly connected components to traverse the graph and return
true if a given object graph indeed represents a DAG or return false
otherwise. Writing repOK methods is usually easy. Two
undergraduate students (the first two authors of [2]) wrote in
a matter of hours repOK methods for several data structures,
including various versions of DAGs: connected or unconnected,
labeled or unlabeled, with one root or multiple roots, etc. The
specific repOK methods in Figure 1 allow unconnected DAGs with
multiple roots. The user also provides a finitization that bounds the
size of the test inputs that Korat generates. Figure 2 shows sample
code that specifies bounds for the DAG class. Each finitization
bounds the number of objects of a given class (for example,
numNodes objects of the class DAGNode) and the values of fields
for the objects (for example, size is set to numNodes, and the
children array has length between 0 and numNodes - 1 and has
elements that are from the set of nodes). To specify these bounds,
the code uses the classes from the Korat library.
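A predicate of this kind might look like the sketch below. It is not the code from the figure: it swaps in a plain depth-first search with an "on stack / finished" marker in place of Tarjan's algorithm, and the extra checks that the arrays contain no null or duplicate entries are assumptions about what treating the arrays as sets would require. The class name DagSketch is hypothetical; the field names nodes, size and children follow the description above.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a DAG representation with a repOK-style predicate.
class DagSketch {
    static class DAGNode { DAGNode[] children = new DAGNode[0]; }

    DAGNode[] nodes = new DAGNode[0];
    int size;

    boolean repOK() {
        if (nodes == null || size != nodes.length) return false;
        Set<DAGNode> members = new HashSet<>();
        for (DAGNode n : nodes)
            if (n == null || !members.add(n)) return false; // nodes acts as a set
        // absent: unvisited, false: on the current DFS path, true: finished
        Map<DAGNode, Boolean> state = new HashMap<>();
        for (DAGNode n : nodes)
            if (!acyclicFrom(n, members, state)) return false;
        return true;
    }

    private static boolean acyclicFrom(DAGNode n, Set<DAGNode> members,
                                       Map<DAGNode, Boolean> state) {
        Boolean s = state.get(n);
        if (s != null) return s;            // finished is fine, on-path is a cycle
        state.put(n, false);
        if (n.children == null) return false;
        Set<DAGNode> seen = new HashSet<>();
        for (DAGNode c : n.children) {
            if (c == null || !members.contains(c) || !seen.add(c)) return false;
            if (!acyclicFrom(c, members, state)) return false;
        }
        state.put(n, true);
        return true;
    }

    public static void main(String[] args) {
        DAGNode a = new DAGNode(), b = new DAGNode(), c = new DAGNode();
        a.children = new DAGNode[] {b, c};
        b.children = new DAGNode[] {c};
        DagSketch d = new DagSketch();
        d.nodes = new DAGNode[] {a, b, c};
        d.size = 3;
        System.out.println(d.repOK());
    }
}
```

Like the predicates for trees and heaps, it returns false at the first violated condition instead of collecting all problems first.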
2.3 Results and Analysis
Result and analysis 1: For the binary search tree example
described under Problem 1, given a finitization and a value for
scope, Korat generates all non-isomorphic structures that satisfy
the class invariant. For example, in scope three, Korat generates the
15 trees shown in Figure 2 in less than one second. It is
practical to use Korat to generate inputs that give high code
and mutation coverage. To illustrate, consider the method
remove that removes a given element from a given tree.
Figure 3 shows how statement coverage and the rate of
mutant killing vary with the scope for this method. Scope
five is sufficient to achieve complete coverage, and scope six
is sufficient to kill all non-equivalent mutants. Generating
inputs and checking correctness for these scopes using Korat
takes just a few seconds.
To illustrate, the command:
java korat.Korat -class SearchTree -visualize -params 3 0 3 1 3
executes Korat (1) to generate all binary search trees with up to 3
nodes with info values ranging from 1 to 3 and (2) to display
graphically the generated structures. Figure 5 shows an example
visualization window. The First, Previous, Next, and Last buttons
allow scrolling through the list of generated structures.
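The figure of 15 trees in scope three can be cross-checked with a small counting argument, assuming that a tree in scope s has at most s nodes, that its elements are distinct values drawn from a set of s values, and that the empty tree is counted. Under those assumptions a tree with k nodes is determined by its shape (a Catalan number of choices) and by which k values it stores (a binomial number of choices, since the ordering fixes where each value goes). The helper below is a hypothetical sketch of that calculation.

```java
// Hypothetical cross-check: number of search trees with at most s nodes
// holding distinct values from an s-element set.
class SearchTreeCount {
    // Catalan numbers via C(0) = 1, C(i+1) = C(i) * 2 * (2i + 1) / (i + 2).
    static long catalan(int k) {
        long c = 1;
        for (int i = 0; i < k; i++)
            c = c * 2 * (2 * i + 1) / (i + 2);
        return c;
    }

    static long choose(int n, int k) {
        long r = 1;
        for (int i = 0; i < k; i++)
            r = r * (n - i) / (i + 1);
        return r;
    }

    static long treesInScope(int s) {
        long total = 0;
        for (int k = 0; k <= s; k++)
            total += choose(s, k) * catalan(k);
        return total;
    }

    public static void main(String[] args) {
        System.out.println(treesInScope(3));
    }
}
```

For s = 3 the sum is 1 + 3 + 6 + 5 = 15, which agrees with the number of trees reported above.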
Result and Analysis 2:
Korat is effectively a constraint solver for imperative predicates:
given an imperative predicate (repOK) and a finitization (finDAG),
Korat generates all inputs (within the bounds given in the
finitization) for which the predicate returns true. We refer to such
inputs as valid inputs (even though the code might be expected to
generate an error message for those inputs). Executing the code
under test on all valid inputs is called bounded-exhaustive testing
and provides a strong guarantee that there is no fault within the
given bounds. When time limits prevent bounded-exhaustive
testing, one can consider taking a subset of all test inputs that Korat
generates. Korat can save the generated objects on disk or display
them graphically. For example, Korat generates four DAGs with
exactly three nodes, and Figure 3 shows the visualization for two
of these four DAGs. The current Korat implementation uses the
Alloy Analyzer for visualization; Korat automatically translates
object graphs into the Alloy representation.
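The overall loop of bounded-exhaustive testing (generate every valid input within the bounds, run the code under test, check an oracle) can be sketched on the array-based heap. Everything in this example is hypothetical: the class HeapExhaustiveTest, the method under test removeMax, and the naive enumeration that stands in for Korat's generator. The oracle is the predicate itself plus two simple postconditions.

```java
import java.util.Arrays;

// Hypothetical sketch of a bounded-exhaustive test loop; not Korat's API.
class HeapExhaustiveTest {
    int size;
    int[] array;

    boolean repOK() {
        if (array == null || size < 0 || size > array.length) return false;
        for (int i = 0; i < size; i++) {
            if (array[i] == -1) return false;
            if (i > 0 && array[i] > array[(i - 1) / 2]) return false;
        }
        for (int i = size; i < array.length; i++)
            if (array[i] != -1) return false;
        return true;
    }

    // Code under test. Precondition: repOK() && size > 0.
    int removeMax() {
        int max = array[0];
        size--;
        array[0] = array[size];
        array[size] = -1;
        int i = 0;
        while (true) {                       // sift the moved element down
            int l = 2 * i + 1, r = l + 1, big = i;
            if (l < size && array[l] > array[big]) big = l;
            if (r < size && array[r] > array[big]) big = r;
            if (big == i) return max;
            int tmp = array[i]; array[i] = array[big]; array[big] = tmp;
            i = big;
        }
    }

    // Returns {number of valid inputs tested, number of oracle failures}.
    static int[] run(int maxLen, int maxValue) {
        int tested = 0, failures = 0;
        for (int len = 0; len <= maxLen; len++) {
            int[] a = new int[len];
            Arrays.fill(a, -1);
            while (true) {
                for (int size = 1; size <= len; size++) {
                    HeapExhaustiveTest h = new HeapExhaustiveTest();
                    h.size = size;
                    h.array = a.clone();
                    if (!h.repOK()) continue;            // precondition filter
                    tested++;
                    int max = h.removeMax();
                    boolean ok = h.repOK() && h.size == size - 1 && max == a[0];
                    if (!ok) failures++;
                }
                int pos = len - 1;
                while (pos >= 0 && a[pos] == maxValue) a[pos--] = -1;
                if (pos < 0) break;
                a[pos]++;
            }
        }
        return new int[] {tested, failures};
    }

    public static void main(String[] args) {
        int[] r = run(3, 2);
        System.out.println("tested=" + r[0] + " failures=" + r[1]);
    }
}
```

A run that reports zero failures says only that the method behaves correctly for every valid heap within the chosen bounds, which is the guarantee described above.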
Figure 3. Two of the four DAGs that Korat generates for three
nodes; the left DAG shows an entire visualization window
with several customization options.
Korat does not actually generate all valid object graphs but only
non-isomorphic object graphs. Two object graphs are isomorphic if
they differ only in the identity of the objects in the graphs [4,14,29]:
isomorphic object graphs have the same branching structure (same
shape) and the same values for primitive fields. Arrays are viewed
as objects with one field labeled length and the other fields, for
array elements, labeled with array indexes. For example, we obtain
isomorphic graphs if we swap the identity of some nodes in the left
DAG in Figure 3, say put DAGNode2 in children[0] and
DAGNode1 in children[1]. However, the four graphs (two of which
are shown) are themselves all non-isomorphic as they have different
shapes. Isomorphic object graphs form equivalent test inputs, i.e.,
two isomorphic object graphs either both reveal a fault or neither
reveals a fault [45]. Testing code with more than one representative
from an equivalence class only increases the testing time but cannot
reveal more faults. Hence, we want to avoid isomorphic object
graphs among the generated graphs. Korat can efficiently generate
all non-isomorphic object graphs (i.e., exactly one representative
from each isomorphism class) at the concrete level [4, 29].
For DAGs, however, the arrays used at the concrete level represent
sets at the abstract level, and several object graphs that Korat
generates can be non-isomorphic at the concrete level but
correspond to the same DAG at the abstract level. Section 4 of [2]
presents a methodology for reducing the number of generated
equivalent structures in such cases.
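The definition of isomorphism can be illustrated on the simplest case, binary trees without primitive fields, where two structures are isomorphic exactly when they have the same shape. The sketch below is not how Korat avoids isomorphic candidates; it is only a hypothetical check (class ShapeKey) that serialises the branching structure while ignoring which node objects are used, and it assumes the inputs are acyclic trees.

```java
// Hypothetical isomorphism check for acyclic binary trees without data fields.
class ShapeKey {
    static class Node { Node left, right; }

    // The key records only the branching structure, never object identity.
    static String key(Node n) {
        return n == null ? "." : "(" + key(n.left) + key(n.right) + ")";
    }

    static boolean isomorphic(Node a, Node b) {
        return key(a).equals(key(b));
    }

    public static void main(String[] args) {
        Node x = new Node();
        x.left = new Node();
        Node y = new Node();
        y.left = new Node();   // same shape as x, built from different objects
        Node z = new Node();
        z.right = new Node();  // mirror image of x
        System.out.println(isomorphic(x, y) + " " + isomorphic(x, z));
    }
}
```

The trees x and y are isomorphic because they differ only in node identity, whereas z is not isomorphic to x because left and right are different fields and the shape therefore differs.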
Figures 8 and 9 show the experimental results for DAGs of size
7 and 8 nodes, respectively. For a range of the number of worker
machines, we tabulate the total running time, the actual test
generation and execution time for workers, and the speedup
obtained by using a different number of workers.
Next, Table 3 shows performance of different data structures on
different benchmarks.
Strengths and Weaknesses:
The main strength is automated, bounded-exhaustive testing for
structurally complex inputs. Korat is also effective for data structures,
has been adopted in industry, and is able to find errors in real-world
programs. In addition, experiments have shown that exhaustive testing in
a small scope can achieve near-complete statement and branch
coverage.
The main weakness is that the predicates depend on imperative
code, such as Java.
2.4 Suggest Improvements
Two important suggestions would be:
Korat is inherently declarative:
- The user specifies “what” inputs to generate; if Korat could also accept
“how” to generate them, it could be more efficient and effective.
- Predicates are declarative specifications written only in Java; if this
worked for other languages such as C++ or C#, it could be more productive.
Another suggestion would be to improve computational performance not
only for parallel execution but also when testing sequentially.
3. Conclusion
Korat uses the method precondition to automatically generate all
non-isomorphic test cases up to a given small size. Korat then
executes the method on each test case, and uses the method
postcondition as a test oracle to check the correctness of each
output. Korat exhaustively explores the input space of the
predicate, but does so efficiently by:
1) monitoring the predicate’s executions
to prune large portions of the search space and
2) generating only non-isomorphic inputs.
The Korat prototype uses the Java Modeling Language (JML) for
specifications, i.e., class invariants and method preconditions and
postconditions. Good programming practice suggests that
implementations of abstract data types should already provide
methods for checking class invariants—Korat then generates test
cases almost for free.
All in all, Korat offers:
- Automated testing for structurally complex inputs
- Efficient bounded-exhaustive generation
- Effectiveness for data structures, with adoption in industry
- Errors found in real-world programs.
4. REFERENCES
[1] http://korat.sourceforge.net/index.html
[2] S. Misailovic, A. Milicevic, N. Petrovic, S. Khurshid, and D.
Marinov. Parallel test generation and execution with Korat.
6th joint meeting of the European Software Engineering
Conference and the ACM SIGSOFT Symposium on the
Foundations of Software Engineering
(ESEC/FSE 2007), Dubrovnik, Croatia, Sept. 2007.
[Acceptance rate 17% (43/251 papers)]
[3] D. Marinov. Automatic Testing of Software with Structurally
Complex Inputs. PhD thesis, Massachusetts Institute of
Technology, Cambridge, MA, Dec. 2004.
[4] D. Marinov, A. Andoni, D. Daniliuc, S. Khurshid, and M.
Rinard. An evaluation of exhaustive testing for data structures.
Technical Report MIT-LCS-TR-921, MIT CSAIL,
Cambridge, MA, September 2003.
[5] C. Boyapati, S. Khurshid, and D. Marinov.
Korat: Automated testing based on Java predicates.
International Symposium on Software Testing and Analysis
(ISSTA 2002), pages 123-133, Rome, Italy, July 2002.
(This paper won an ACM SIGSOFT Distinguished Paper
Award.)
[6] http://www.cs.umd.edu/~atif/Teaching/Spring2008/StudentSlides/Korat.pdf
[7] https://www.youtube.com/watch?v=wHJKiGURgKc
Appendix-A:1
5. How to run the tool:
1. Open cmd
2. CMD executing binary tree example
Binary Tree size 3, Graph Viz structure.
Binary tree visualization example.
Command Line Options:
--args <arg-list>                 (required) comma-separated list of finitization parameters, ordered as in the corresponding finitization method
--class <fullClassName>           (required) name of the class that contains the finitization
--config <fileName>               name of the config file to be used
--cvDelta                         use delta file format when storing candidate vectors to disk
--cvEnd <num>                     set the end candidate vector to the <num>-th vector from cvFile
--cvExpected <num>                expected number of total explored vectors
--cvFile <filename>               name of the candidate-vectors file
--cvFullFormatRatio <num>         the ratio of full format vectors (if delta file format is used)
--cvStart <num>                   set the start candidate vector to the <num>-th vector from cvFile
--cvWrite                         write all explored candidate vectors to file
--cvWriteNum <num>                write only <num> equidistant vectors to disk
--excludePackages <packages>      comma-separated list of packages to be excluded from instrumentation
--finitization <finMethodName>    set the name of the finitization method. If omitted, the default name fin<ClassName> is used.
--help                            print help message.
--listeners <listenerClasses>     comma-separated list of full class names that implement the ITestCaseListener interface.
--maxStructs <num>                stop execution after finding <num> invariant-passing structures
--predicate <predMethodName>      set the name of the predicate method. If omitted, the default name "repOK" will be used
--print                           print the generated structure to the console
--printCandVects                  print candidate vector and accessed field list during the search.
--progress <threshold>            print status of the search after exploration of <threshold> candidates
--serialize <filename>            serialize the invariant-passing test cases to the specified file. If filename contains an absolute path, use quotes.
--visualize                       visualize the generated data structures