Using Production Grammars in Software Testing
How do we test large, complex, safety-critical software systems?
Extensible type-safe systems, such as Java, rely critically on a large and complex software base for their overall protection and integrity.
Traditional testing techniques are time-consuming, expensive, and imprecise.
Commercial virtual machines deployed so far have exhibited numerous bugs and security holes.
Using production grammars to test large, complex, safety-critical software systems.
We describe “Lava”, a domain-specific language for specifying production grammars.
We use “Lava” to generate an effective test suite for the Java virtual machine.
Production grammars are effective at generating complex test cases; combined with comparative and variant testing techniques, they achieve high code and value coverage.
What is Lava?
A special-purpose language for specifying production grammars.
It summarizes how production grammars can be used as part of a concerted engineering effort to test large systems.
Applying production grammars written in Lava to the testing of Java virtual machines.
The safety of modern virtual machines such as the Java virtual machine depends critically on three large and complex software components:
A verifier for static inspection of untrusted code against a set of safety axioms.
An interpreter or compiler that respects instruction semantics during execution.
A runtime system that correctly provides services such as threading.
Why not write test-generation scripts in a general-purpose language?
Writing such scripts is time-consuming and difficult.
Managing many such scripts is operationally difficult, especially as they evolve during the life cycle of the project.
The unstructured nature of general-purpose languages poses a steep learning curve for those who need to understand and modify the test scripts.
Fundamentally, a general-purpose language is too general, and therefore too unstructured, for test generation.
A production grammar is “a collection of non-terminal to terminal mappings that resembles a regular parsing grammar, but is used ‘in reverse’.”
“In reverse” means that instead of parsing a sequence of tokens into higher-level constructs, a production grammar generates a stream of tokens from a set of non-terminals that specify the overall structure of the stream.
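A minimal Python sketch of a grammar used “in reverse”. The toy arithmetic grammar, the `generate` function, and the depth cap are illustrative assumptions; they are not part of Lava.

```python
import random

# Hypothetical toy grammar: each non-terminal maps to a list of alternatives,
# and each alternative is a sequence of terminals and non-terminals.
GRAMMAR = {
    "<expr>": [["<term>", "+", "<expr>"], ["<term>"]],
    "<term>": [["<digit>"], ["(", "<expr>", ")"]],
    "<digit>": [["0"], ["1"], ["2"]],
}

def generate(symbol, rng, depth=0, max_depth=8):
    """Expand `symbol` into a list of terminal tokens."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminals are emitted unchanged
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth cap, take the shortest alternative so expansion ends.
        choice = min(alternatives, key=len)
    else:
        choice = rng.choice(alternatives)
    tokens = []
    for part in choice:
        tokens.extend(generate(part, rng, depth + 1, max_depth))
    return tokens

rng = random.Random(0)
sentence = " ".join(generate("<expr>", rng))
print(sentence)
```

A parser would take such a sentence and recover the `<expr>` structure; here the same rules run in the other direction and emit the tokens.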
Production grammars are well suited for test generation because:
They can effectively create diverse test cases.
They can provide guidance on how the test cases they generate ought to behave.
How do production grammars attack the oracle problem?
What is the oracle problem?
It is hard to determine the correct system behavior for automatically generated test cases.
In the worst case, automated test generation may require reverse engineering and manual examination to determine the expected behavior of the system on the given input.
Addressing the Problem
Test cases generated with production grammars can be used in conjunction with comparative testing to create an effective test suite without human involvement.
An extended production grammar language can concurrently generate certificates for test cases.
Automated Testing Requirements: Goals
Automatic: Testing should proceed without human involvement and therefore should be cheap.
Complete: Testing should generate numerous test cases that cover as much of the functionality of a virtual machine as possible.
Conservative: Bad Java bytecode should not be allowed to pass undetected through the bytecode verifier.
Well structured: Examining, directing, checkpointing, and resuming verification efforts should be simple.
Efficient: Testing should result in a high-confidence Java virtual machine within a reasonable amount of time.
Test Generation Process
A generic code-generator-generator parses a Java bytecode grammar written in Lava and emits a specialized code-generator.
The code-generator is a state machine that in turn takes a seed as input and applies the grammar to it.
The seed consists of a high-level description that guides the production process.
Running the code-generator on a seed produces test cases in Java bytecode that can be used for testing.
High-Level Structure of the Test Generation Process
The input to the code-generator-generator consists of a conventional grammar description.
The context-free grammar consists of productions with a left-hand side (LHS) containing a single non-terminal, which is matched against the input.
If a match is found, the LHS is replaced with the right-hand side (RHS).
A two-phase approach is used for increased efficiency.
The Grammar of the Lava input language
The code-generator-generator converts the grammar specification into a set of action tables.
It generates a code-generator that performs the actual code production based on a given seed.
Within the code-generator, the main data structure is a newline-separated stream, initially set to correspond to the seed input.
Each line in the stream is scanned.
Occurrences of an LHS are replaced by corresponding RHS.
What if there is more than one action for a given match?
The code-generator picks an outcome probabilistically, based on weights that can be associated with each production.
When all possible non-terminals are exhausted, the code-generator outputs the resulting stream.
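The scan-and-replace loop over the stream can be sketched in Python as follows. The rule table `RULES`, the non-terminal names, and the function `expand_stream` are hypothetical; Lava's actual action tables are not described here.

```python
import random

# Hypothetical rules: non-terminal -> list of (weight, replacement) pairs.
# A replacement may contain "\n", which adds new lines to the stream.
RULES = {
    "STMTS": [(3, "STMT\nSTMTS"), (1, "STMT")],
    "STMT": [(1, "iconst_0\npop"), (1, "nop")],
}

def expand_stream(seed, rules, rng):
    """Rewrite the newline-separated stream until no non-terminal remains."""
    stream = seed.split("\n")
    changed = True
    while changed:
        changed = False
        next_stream = []
        for line in stream:  # scan each line in the stream
            tokens = line.split()
            hit = next((i for i, t in enumerate(tokens) if t in rules), None)
            if hit is None:
                next_stream.append(line)  # only terminals on this line
                continue
            # Several outcomes may match: pick one by weight.
            weights, outcomes = zip(*rules[tokens[hit]])
            rhs = rng.choices(outcomes, weights=weights, k=1)[0]
            tokens[hit:hit + 1] = [rhs]  # replace the LHS with the chosen RHS
            next_stream.extend(" ".join(tokens).split("\n"))
            changed = True
        stream = next_stream
    return "\n".join(stream)

seed = ".method main\nSTMTS\nreturn"
result = expand_stream(seed, RULES, random.Random(7))
print(result)
```

In this sketch the recursive `STMTS` rule ends only when the weight-1 alternative is drawn, so termination is probabilistic; a per-rule limit is one way to bound it.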
Lava grammars can be annotated with three properties:
Each production rule can have an associated name. Along with each test case, the code-generator creates a summary file listing the names of the grammar rules.
Each grammar rule in Lava may have an associated limit on how many times it can be exercised.
In order to enable the production of context-sensitive outputs, Lava allows an optional code fragment, called an action, to be associated with each production.
A simplified Lava grammar and corresponding seed
In the sample grammar written in Lava:
The insts production has a specified limit of 5000 invocations, which restricts the size of the generated test case.
The two main productions, jsrstmt and ifeqstmt, generate instruction sequences that perform subroutine calls and integer equality tests.
These concise descriptions, when exercised on the seed shown, generate a valid class file with complicated branching behavior.
Equal weighting of the ifeq and jsr statements ensures that they are represented equally in the test cases.
A weight of 0 for the emptyinst production effectively disables the production until the limit on the insts production is reached, to ensure that test generation does not terminate early.
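The interplay of names, limits, weights, and actions can be sketched in Python. This is not Lava syntax and not the grammar from the slide: the `Rule` class, the right-hand sides, the `fresh_label` action, and the rule that a zero-weight production fires only when nothing else is eligible are all assumptions. The limit is 20 rather than 5000 to keep the output short, and the emitted text is illustrative, not a valid class file.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Rule:
    name: str                      # rule name, recorded in the summary
    lhs: str
    rhs: List[str]
    weight: float = 1.0
    limit: Optional[int] = None    # maximum number of invocations
    action: Optional[Callable[[dict], dict]] = None
    uses: int = 0

def fresh_label(ctx):
    """Hypothetical action: hand out a new label number for each use."""
    ctx["labels"] = ctx.get("labels", 0) + 1
    return {"n": ctx["labels"]}

def make_rules(insts_limit):
    return [
        Rule("insts", "INSTS", ["INST", "INSTS"], weight=1, limit=insts_limit),
        Rule("emptyinst", "INSTS", [], weight=0),
        Rule("ifeqstmt", "INST",
             ["iconst_0", "ifeq L{n}", "INSTS", "L{n}:"], action=fresh_label),
        Rule("jsrstmt", "INST", ["jsr S{n}"], action=fresh_label),
    ]

def generate(seed, rules, rng):
    by_lhs = {}
    for rule in rules:
        by_lhs.setdefault(rule.lhs, []).append(rule)
    ctx, output, fired = {}, [], []
    stack = list(reversed(seed))
    while stack:
        symbol = stack.pop()
        if symbol not in by_lhs:
            output.append(symbol)  # terminal
            continue
        eligible = [r for r in by_lhs[symbol]
                    if r.limit is None or r.uses < r.limit]
        weights = [r.weight for r in eligible]
        if sum(weights) > 0:
            rule = rng.choices(eligible, weights=weights, k=1)[0]
        else:
            # Only zero-weight rules remain, e.g. emptyinst after the limit.
            rule = rng.choice(eligible)
        rule.uses += 1
        fired.append(rule.name)
        subs = rule.action(ctx) if rule.action else {}
        stack.extend(reversed([part.format(**subs) for part in rule.rhs]))
    return output, fired

output, fired = generate(["INSTS", "return"], make_rules(20), random.Random(1))
print("\n".join(output))
print("insts fired:", fired.count("insts"))
```

The `fired` list plays the role of the per-test summary of rule names, and the label counter shows how an action can make the output context-sensitive.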
A sample method body produced by the grammar
What can be done?
Test cases can be used to test for easy-to-detect errors like system crashes.
Type-safe systems like virtual machines are never supposed to crash on any input.
We used stylized test cases generated by Lava to characterize the time complexity of our verifier as a function of basic block size and total code length.
The parameterized nature of the grammar facilitates test case construction: code-generators with different weights and seeds can produce different test cases.
Generated tests can verify the correctness of Java components that perform transformations.
Comparative testing: “To direct the same test cases to two or more versions of a virtual machine and to compare their outputs.”
A discrepancy indicates that at least one of the virtual machines differs from the others.
Determining the cause and severity of a discrepancy typically requires human involvement.
We expand comparative testing by introducing variations into test cases generated by production grammars.
A variation is simply a random modification of a test case to generate a new test.
A variation engine injects errors into a set of test bases, which are fed to two different bytecode verifiers.
A discrepancy indicates an error, a deviation from the specification, or an ambiguity in the specification.
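Comparative testing with variations can be sketched in Python. Everything here is a stand-in: the two "verifiers" are toy stack-depth checkers rather than real bytecode verifiers, `verifier_b` carries a deliberately seeded bug, and `vary` and `find_discrepancies` are hypothetical names.

```python
import random

# Toy instruction set: net stack effect and operands required per instruction.
EFFECT = {"push": 1, "pop": -1, "add": -1, "dup": 1}
NEEDS = {"push": 0, "pop": 1, "add": 2, "dup": 1}

def verifier_a(program):
    """Accept a program only if no instruction underflows the stack."""
    depth = 0
    for op in program:
        if depth < NEEDS[op]:
            return False
        depth += EFFECT[op]
    return True

def verifier_b(program):
    """Second implementation with a seeded bug: 'add' is checked as if it
    needed only one operand."""
    depth = 0
    for op in program:
        needed = 1 if op == "add" else NEEDS[op]
        if depth < needed:
            return False
        depth += EFFECT[op]
    return True

def vary(program, rng):
    """Variation: replace one randomly chosen instruction."""
    variant = list(program)
    variant[rng.randrange(len(variant))] = rng.choice(sorted(EFFECT))
    return variant

def find_discrepancies(test_bases, variants_per_base, rng):
    """Feed each variant to both verifiers and keep those they disagree on."""
    found = []
    for base in test_bases:
        for _ in range(variants_per_base):
            variant = vary(base, rng)
            if verifier_a(variant) != verifier_b(variant):
                found.append(variant)
    return found

bases = [["push", "push", "add", "pop"]]
discrepancies = find_discrepancies(bases, 200, random.Random(3))
print(len(discrepancies), "discrepancies found")
```

No oracle is consulted: a disagreement alone flags the variant for inspection, and a person then decides which verifier is wrong.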
Self-Describing Test Cases
Extending grammar-based testing to generate certificates concurrently with test cases.
What is a Certificate?
“A certificate is a behavioral description that specifies the intended outcome of the generated test case.”
It acts as an oracle by which the correctness of the tested system can be evaluated in isolation.
Certificates allow us to capture both static and dynamic properties of test programs, such as their safety, side effects, or computed values.
The behavior of a virtual machine can then be compared against the certificate to check that the virtual machine is implemented correctly.
Two types of useful certificates may accompany synthetically generated code.
The first form of certificate is a proof over the grammar, which can accompany all test programs generated by that specification as a guarantee that they possess certain properties.
The second form of certificate describes the run-time behavior of a specific test.
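The second form of certificate can be sketched in Python: each production returns the code it generates together with the value that code must compute. The toy stack machine, `gen_expr`, and `run_vm` are assumptions for illustration and do not reflect how Lava represents certificates.

```python
import random

def gen_expr(rng, depth=0):
    """Return (program, certificate). The certificate is the value the
    program must leave on top of the stack."""
    if depth >= 3 or rng.random() < 0.3:
        n = rng.randint(0, 9)
        return [("push", n)], n
    left_code, left_val = gen_expr(rng, depth + 1)
    right_code, right_val = gen_expr(rng, depth + 1)
    if rng.random() < 0.5:
        return left_code + right_code + [("add",)], left_val + right_val
    return left_code + right_code + [("mul",)], left_val * right_val

def run_vm(program):
    """Toy stack machine standing in for the system under test."""
    stack = []
    for instr in program:
        if instr[0] == "push":
            stack.append(instr[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if instr[0] == "add" else a * b)
    return stack[-1]

program, certificate = gen_expr(random.Random(5))
# The certificate acts as the oracle: no second implementation is needed.
print("pass" if run_vm(program) == certificate else "FAIL")
```

Because the expected value is built alongside the program, the system under test can be checked in isolation.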
Complex test cases generated by production grammars achieved as good as or better code coverage than the best hand-generated tests.
They are much easier to construct.
Production grammars can be used in conjunction with comparative evaluation to check compiler implementations for compatibility.