Using Production Grammars in Software Testing



  1. 1. Using Production Grammars in Software Testing
  2. 2. <ul><li>How can large, complex, safety-critical software systems be tested? </li></ul><ul><li>Extensible type-safe systems, such as Java, rely critically on a large and complex software base for their overall protection and integrity. </li></ul><ul><li>Traditional testing techniques are time-consuming, expensive, and imprecise. </li></ul><ul><li>Commercial virtual machines deployed so far have exhibited numerous bugs and security holes. </li></ul>
  3. 3. <ul><li>Strategy Outline </li></ul><ul><li>We use production grammars to test large, complex, safety-critical software systems. </li></ul><ul><li>We describe “Lava”, a domain-specific language for specifying production grammars. </li></ul><ul><li>We use Lava to generate an effective test suite for the Java virtual machine. </li></ul><ul><li>The effectiveness of production grammars in generating complex test cases, combined with comparative and variant testing techniques, achieves code and value coverage. </li></ul>
  4. 4. What is Lava? <ul><li>A special-purpose language for specifying production grammars. </li></ul><ul><li>Production grammars can be used as part of a concerted engineering effort to test large systems. </li></ul><ul><li>Production grammars written in Lava are applied to the testing of Java virtual machines. </li></ul><ul><li>The safety of modern virtual machines like Java depends critically on three large and complex software components: </li></ul><ul><li>A verifier that statically inspects untrusted code against a set of safety axioms. </li></ul><ul><li>An interpreter or a compiler that respects instruction semantics during execution. </li></ul><ul><li>A runtime system that correctly provides services such as threading. </li></ul>
  5. 5. Why not write test-generation scripts in a general-purpose language? <ul><li>Writing such scripts is time-consuming and difficult. </li></ul><ul><li>Managing many such scripts is operationally difficult, especially as they evolve during the life cycle of the project. </li></ul><ul><li>The unstructured nature of a general-purpose language poses a steep learning curve for those who need to understand and modify the test scripts. </li></ul><ul><li>Fundamentally, a general-purpose language is too general, and therefore too unstructured, for test generation. </li></ul>
  6. 6. Production Grammars <ul><li>A production grammar is “a collection of non-terminal-to-terminal mappings that resembles a regular parsing grammar, but is used in reverse”. </li></ul><ul><li>“In reverse” means that instead of parsing a sequence of tokens into higher-level constructs, a production grammar generates a stream of tokens from a set of non-terminals that specify the overall structure of the stream. </li></ul><ul><li>Production grammars are well suited for test generation because: </li></ul><ul><li>They can effectively create diverse test cases. </li></ul><ul><li>They can provide guidance on how the test cases they generate ought to behave. </li></ul>
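The idea of running a grammar "in reverse" can be illustrated with a minimal Python sketch. The toy expression grammar, the `generate` helper, and the depth cap are all assumptions made for illustration; none of them come from Lava.

```python
import random

# Hypothetical toy grammar (not Lava syntax): each non-terminal maps to a
# list of alternatives; each alternative is a sequence of symbols.
GRAMMAR = {
    "expr": [["term"], ["term", "+", "expr"]],
    "term": [["0"], ["1"], ["(", "expr", ")"]],
}

def generate(symbol, rng, depth=0, max_depth=6):
    """Expand `symbol` into a flat list of terminal tokens."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: emit as-is
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth cap, take the first (non-recursive) alternative
        # so that expansion is guaranteed to terminate.
        choice = alternatives[0]
    else:
        choice = rng.choice(alternatives)
    tokens = []
    for part in choice:
        tokens.extend(generate(part, rng, depth + 1, max_depth))
    return tokens

# Usage: a parser would consume such a token stream; here it is produced.
print(" ".join(generate("expr", random.Random(0))))
```

Changing the random seed yields a different but still well-formed token stream, which is the property that makes such grammars useful for producing diverse test inputs.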
  7. 7. How Do Production Grammars Attack the Oracle Problem? <ul><li>What is the oracle problem? </li></ul><ul><li>It is hard to determine the correct system behavior for automatically generated test cases. </li></ul><ul><li>In the worst case, automated test generation may require reverse engineering and manual examination to determine the expected behavior of the system on a given input. </li></ul><ul><li>Addressing the problem </li></ul><ul><li>Test cases generated with production grammars can be used in conjunction with comparative testing to create an effective test suite without human involvement. </li></ul><ul><li>An extended production grammar language can concurrently generate certificates for test cases. </li></ul>
  8. 8. Automated Testing Requirements: Goals <ul><li>Automatic: Testing should proceed without human involvement and therefore should be cheap. </li></ul><ul><li>Complete: Testing should generate numerous test cases that cover as much of the functionality of a virtual machine as possible. </li></ul><ul><li>Conservative: Bad Java bytecode should not be allowed to pass undetected through the bytecode verifier. </li></ul><ul><li>Well structured: Examining, directing, checkpointing, and resuming verification efforts should be simple. </li></ul><ul><li>Efficient: Testing should result in a high-confidence Java virtual machine within a reasonable amount of time. </li></ul>
  9. 9. Test Generation Process <ul><li>A generic code-generator-generator parses a Java bytecode grammar written in Lava and emits a specialized code-generator. </li></ul><ul><li>The code-generator is a state machine that in turn takes a seed as input and applies the grammar to it. </li></ul><ul><li>The seed consists of a high-level description that guides the production process. </li></ul><ul><li>Running the code-generator on a seed produces test cases in Java bytecode that can be used for testing. </li></ul>
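The two-stage process described on this slide can be sketched in Python as a function that parses a grammar description and returns a specialized generator. The `lhs -> alt | alt` text format, the `make_generator` name, and the rule names are hypothetical; the real tool reads Lava grammars and emits Java bytecode rather than mnemonic strings.

```python
import random

# Hypothetical grammar description; this is not Lava's input syntax.
GRAMMAR_TEXT = """
method -> load load op
load -> iconst_0 | iconst_1
op -> iadd | isub
"""

def make_generator(grammar_text):
    """Stage 1: parse the grammar and return a specialized generator."""
    table = {}
    for line in grammar_text.strip().splitlines():
        lhs, rhs = line.split("->")
        table[lhs.strip()] = [alt.split() for alt in rhs.split("|")]

    def generate(seed, rng):
        """Stage 2: expand the seed until only terminals remain."""
        tokens = seed.split()
        while any(t in table for t in tokens):
            # Rewrite the leftmost non-terminal with a random alternative.
            i = next(k for k, t in enumerate(tokens) if t in table)
            tokens[i:i + 1] = rng.choice(table[tokens[i]])
        return tokens

    return generate

# Usage: the seed is a high-level skeleton mixing non-terminals and terminals.
code_generator = make_generator(GRAMMAR_TEXT)
print(code_generator("method ireturn", random.Random(0)))
```

Separating grammar parsing from generation means the grammar is processed once, after which the resulting generator can be run on many seeds.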
  10. 10. High-Level Structure of the Test Generation Process
  11. 11. Lava Grammars <ul><li>The input to the code-generator-generator consists of a conventional grammar description. </li></ul><ul><li>The context-free grammar consists of productions whose left-hand side (LHS) contains a single non-terminal. </li></ul><ul><li>Each LHS is matched against the input. </li></ul><ul><li>If a match is found, the LHS is replaced with the right-hand side (RHS). </li></ul><ul><li>A two-phase approach is used for increased efficiency. </li></ul><ul><li>How? </li></ul>
  12. 12. The Grammar of the Lava input language
  13. 13. <ul><li>The code-generator-generator converts the grammar specification into a set of action tables. </li></ul><ul><li>It generates a code-generator that performs the actual code production based on a given seed. </li></ul><ul><li>Within the code-generator, the main data structure is a newline-separated stream, initially set to the seed input. </li></ul><ul><li>Each line in the stream is scanned. </li></ul><ul><li>Occurrences of an LHS are replaced by the corresponding RHS. </li></ul><ul><li>What if there is more than one action for a given match? </li></ul><ul><li>The code-generator picks an outcome probabilistically, based on weights that can be associated with each production. </li></ul><ul><li>When all possible non-terminals are exhausted, the code-generator outputs the resulting stream. </li></ul>
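The rewriting loop on this slide can be sketched in Python as follows. The `RULES` table, its weights, and the `expand` function are illustrative assumptions; the sketch does not reproduce Lava's action tables or its two-phase matching.

```python
import random

# Hypothetical rule table: LHS non-terminal -> list of (weight, RHS) pairs.
# An RHS may contain newlines, so one line can expand into several lines.
RULES = {
    "STMTS": [(3, "STMT\nSTMTS"), (1, "STMT")],
    "STMT": [(1, "iconst_0"), (1, "nop")],
}

def expand(seed, rules, rng, max_steps=10_000):
    """Rewrite non-terminals in a newline-separated stream until none remain."""
    stream = seed.split("\n")
    for _ in range(max_steps):
        # Scan each line for the first token that is a known LHS.
        match = None
        for i, line in enumerate(stream):
            for j, token in enumerate(line.split()):
                if token in rules:
                    match = (i, j, token)
                    break
            if match:
                break
        if match is None:
            return "\n".join(stream)  # all non-terminals are exhausted
        i, j, lhs = match
        # More than one production may match: pick one by weight.
        weights = [w for w, _ in rules[lhs]]
        bodies = [rhs for _, rhs in rules[lhs]]
        rhs = rng.choices(bodies, weights=weights, k=1)[0]
        tokens = stream[i].split()
        tokens[j:j + 1] = [rhs]
        stream[i:i + 1] = " ".join(tokens).split("\n")
    raise RuntimeError("expansion did not finish within max_steps")

print(expand("STMTS", RULES, random.Random(1)))
```

Raising the weight on the recursive `STMTS` alternative makes longer outputs more likely, which is how weights steer the shape of the generated tests in this sketch.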
  14. 14. <ul><li>Lava grammars can be annotated with three properties. </li></ul><ul><li>Each production rule can have an associated name. Along with each test case, the code-generator creates a summary file listing the names of the grammar rules that produced it. </li></ul><ul><li>Each grammar rule in Lava may have an associated limit on how many times it can be exercised. </li></ul><ul><li>To enable the production of context-sensitive outputs, Lava allows an optional code fragment, called an action, to be associated with each production. </li></ul>
  15. 15. A simplified Lava grammar and corresponding seed
  16. 16. <ul><li>In the sample grammar written in Lava: </li></ul><ul><li>The insts production has a specified limit of 5000 invocations, which restricts the size of the generated test case. </li></ul><ul><li>The two main productions, jsrstmt and ifeqstmt, generate instruction sequences that perform subroutine calls and integer equality tests. </li></ul><ul><li>These concise descriptions, when exercised on the seed shown, generate a valid class file with complicated branching behavior. </li></ul><ul><li>Equal weighting between the ifeq and jsr statements ensures that they are represented equally in the test cases. </li></ul><ul><li>A weight of 0 for the emptyinst production effectively disables the production until the limit on the insts production is reached, to ensure that test generation does not terminate early. </li></ul>
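The interaction between the invocation limit and the zero-weight emptyinst production can be sketched in Python. The production names follow the slide, but `generate_body` and the exact limit semantics are assumptions about how the mechanism might be modeled, not Lava's implementation.

```python
import random

def generate_body(limit, rng, weights=None):
    """Pick productions until the `insts` rule reaches its invocation limit."""
    if weights is None:
        # Equal weights for the two main productions; emptyinst is disabled.
        weights = {"jsrstmt": 1, "ifeqstmt": 1, "emptyinst": 0}
    names = list(weights)
    body = []
    for _ in range(limit):  # each iteration models one invocation of `insts`
        pick = rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
        if pick == "emptyinst":
            break  # early termination, possible only with a non-zero weight
        body.append(pick)
    # Once the limit is reached, the empty production is all that remains.
    return body

body = generate_body(50, random.Random(0))
print(len(body), body.count("jsrstmt"), body.count("ifeqstmt"))
```

With a weight of 0, `random.choices` never selects emptyinst, so the body always reaches the limit; giving emptyinst a positive weight would let generation stop early at a random point.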
  17. 17. A sample method body produced by the grammar
  18. 18. What can be done? <ul><li>Test cases can be used to test for easy-to-detect errors like system crashes. </li></ul><ul><li>Type-safe systems like virtual machines are never supposed to crash on any input. </li></ul><ul><li>Stylized test cases generated by Lava can characterize the time complexity of our verifier as a function of basic block size and total code length. </li></ul><ul><li>The parameterized nature of the grammar facilitates test case construction: running the code-generator with different weights and seeds produces different test cases. </li></ul><ul><li>Generated tests can verify the correctness of Java components that perform transformations. </li></ul>
  19. 19. Comparative Testing <ul><li>“To direct the same test cases to two or more versions of a virtual machine and to compare their outputs” </li></ul><ul><li>A discrepancy indicates that at least one of the virtual machines differs from the others. </li></ul><ul><li>It typically requires human involvement to determine the cause and severity of a discrepancy. </li></ul><ul><li>We expand comparative testing by introducing variations into test cases generated by production grammars. </li></ul><ul><li>A variation is simply a random modification of the test case to generate a new test. </li></ul>
  20. 20. Comparative Evaluation <ul><li>A variation engine injects errors into a set of test bases, which are fed to two different bytecode verifiers. </li></ul><ul><li>A discrepancy indicates an error, a deviation from the specification, or an ambiguity in the specification. </li></ul>
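Comparative evaluation with variations can be sketched in Python as below. The two toy verifiers, the `KNOWN` opcode set, and the single-byte mutation are stand-ins chosen for illustration; real bytecode verifiers and the actual variation engine are far more involved.

```python
import random

# Hypothetical set of byte values that the toy verifiers accept.
KNOWN = {0x00, 0x03, 0x57}

def verifier_a(code):
    """Toy reference check: accept only if every byte is a known value."""
    return all(b in KNOWN for b in code)

def verifier_b(code):
    """Toy check with a deliberate bug: the last byte is never inspected."""
    return all(b in KNOWN for b in code[:-1])

def vary(code, rng):
    """Variation: overwrite one randomly chosen byte with a random value."""
    mutated = bytearray(code)
    mutated[rng.randrange(len(mutated))] = rng.randrange(256)
    return bytes(mutated)

def find_discrepancies(base, trials, rng):
    """Feed mutated copies of `base` to both verifiers; keep disagreements."""
    found = []
    for _ in range(trials):
        case = vary(base, rng)
        if verifier_a(case) != verifier_b(case):
            found.append(case)
    return found

base = bytes([0x03, 0x57, 0x00, 0x00])
for case in find_discrepancies(base, 200, random.Random(0))[:3]:
    print(case.hex())
```

Each reported case is an input on which the two implementations disagree. As the slides note, deciding which one is wrong, or whether the specification is ambiguous, still takes a human.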
  21. 21. Self-Describing Test Cases <ul><li>Grammar-based testing can be extended to generate certificates concurrently with test cases. </li></ul><ul><li>What is a certificate? </li></ul><ul><li>“A certificate is a behavioral description that specifies the intended outcome of the generated test case.” </li></ul><ul><li>It acts as an oracle by which the correctness of the tested system can be evaluated in isolation. </li></ul><ul><li>Certificates allow us to capture both static and dynamic properties of test programs, such as their safety, side effects, or computed values. </li></ul>
  22. 22. <ul><li>The behavior of a virtual machine can then be compared against the certificate to check that the virtual machine is implemented correctly. </li></ul><ul><li>Two types of useful certificates may accompany synthetically generated code. </li></ul><ul><li>The first form of certificate is a proof over the grammar, which can accompany all test programs generated by that specification as a guarantee that they possess certain properties. </li></ul><ul><li>The second form of certificate describes the run-time behavior of a specific test. </li></ul>
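The second form of certificate, a description of one test's run-time behavior, can be sketched in Python by tracking the expected result while the test program is being generated. The tiny stack language, `generate_with_certificate`, and `run` are hypothetical; they only illustrate emitting an oracle alongside the test.

```python
import random

def generate_with_certificate(rng, length=5):
    """Return (program, certificate): a stack program and its expected result."""
    first = rng.randint(0, 9)
    program = [("push", first)]
    expected = first
    for _ in range(length):
        operand = rng.randint(0, 9)
        op = rng.choice(["add", "mul"])
        program += [("push", operand), (op, None)]
        # The certificate is updated as each production is applied.
        expected = expected + operand if op == "add" else expected * operand
    return program, expected

def run(program):
    """Toy stack machine standing in for the system under test."""
    stack = []
    for op, arg in program:
        if op == "push":
            stack.append(arg)
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if op == "add" else a * b)
    return stack.pop()

program, certificate = generate_with_certificate(random.Random(0))
print("ok" if run(program) == certificate else "mismatch")
```

Because the expected value is computed by the generator rather than by a second implementation, a single system can be checked in isolation.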
  23. 23. Summary <ul><li>Complex test cases generated by production grammars achieved code coverage as good as or better than the best hand-generated tests. </li></ul><ul><li>They are much easier to construct. </li></ul><ul><li>Production grammars can be used in conjunction with comparative evaluation to check compiler implementations for compatibility. </li></ul><ul><li>Comparative testing with variations is fast. </li></ul>