Software Testing


  • Testing a "big-bang" integration is about as hopeless as doing one. ADTs or object classes are ideal subsystems to accumulate. Input sequences throw the software into various "modes" in which its behavior may differ radically. Interface checking is difficult and requires detailed knowledge of the system; it isn't usually attempted.
  • It might as well be done by third parties -- understanding has been left with developers (who may be gone). Fault-based tests are like looking for a needle in a haystack. Coverage measurement may be impossible; only the simplest (statement coverage?) is feasible. The problem is only partly tool limitations: people don't have the analysis time.
  • The UNIX `profile' command does module coverage. The current buzzword for "mode-covering sequence" is "use case." Using an environment simulation coupled to the system under test is dangerous, because they may "accommodate" each other in a way that the real world will not do for the system. Test scripts ("automated testing") provide the bookkeeping.
  • Special values testing is perhaps the best at failure finding, but of little use in gaining confidence. The user weighting may be different for each user!
  • A practical histogram may have about 100 input classes.
  • In practice, nothing like a real profile is ever available. But the concept of a "usage spike" explains why failures go unobserved during test, then appear magically after release.
  • Users can supply rough histogram information, which may be inaccurate and may differ between users. The "novice" vs. "expert" user distinction is particularly important; their different profiles (with the latter used for test) explain why novices so easily `break' software.
  • "Systematic" in measurement means errors that do not cancel on average; random errors do cancel. Pseudorandom number generators are often really lousy, particularly when a few bits are extracted (e.g., to model a coin toss). Don't just use the one in your programming language library if you care. What, for example, would constitute a "random C program"?
  • The example is unrealistic in many ways: there is a user profile, input is numeric, and there is an effective oracle. With this test, the confidence in an MTTF of 100000 runs is 1%; the confidence in an MTTF of 1000 runs is 63%.
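The two confidence figures above follow from a standard reliability argument: after N failure-free tests drawn from the operational profile, the confidence that the true per-run failure probability is at least 1/M (i.e., that the MTTF is really no better than M runs) is 1 - (1 - 1/M)^N. A minimal sketch (notation is mine, not the lecture's):

```python
def confidence(mttf_runs: float, n_tests: int) -> float:
    """Confidence that a program whose true MTTF is mttf_runs (or worse)
    would have shown at least one failure in n_tests failure-free runs."""
    theta = 1.0 / mttf_runs              # per-run failure probability
    return 1.0 - (1.0 - theta) ** n_tests

# Reproduces the slide-note numbers for the 1000-point cube-root test:
print(round(100 * confidence(1000, 1000)))    # ~63%
print(round(100 * confidence(100000, 1000)))  # ~1%
```

The asymmetry is the point: 1000 failure-free tests say a lot about an MTTF of 1000 runs, and almost nothing about an MTTF of 100000 runs.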
  • The work on comparison of systematic and random testing is almost all theoretical. How could an experiment be done? The stopping rule is simply that the reliability goal has been reached with high enough confidence.
  • The regression testing problem is much easier in principle than the general testing problem, if it is assumed that the existing testset is adequate. It usually isn't. A regression testset (selected from the existing one) is called safe if it includes all the tests whose outcome might differ. Finding new tests could use any of the usual methods.
  • Dependency analysis has other uses that have provided some of the technology: optimization, parallelization, etc. Dynamic methods are more precise than static, but also more expensive.
  • Example: Buying a coverage analyzer. The closest one gets to a specification may be a user's manual. Examples of standards with acceptance tests: protocols; the POSIX standard.
  • Object libraries are one of the main features of O-O languages, and have given hope for actual reuse. COTS is still little more than a buzzword. ("Reuse" is another.) Specification seems to require some kind of formal language. Component quality is mostly process-defined. More about quality (in terms of reliability) later.

    1. Software Testing (OMSE 535, Lecture 8, Manny Gatlin)
    2. Overview
       • Integration testing
       • System testing
       • Operational testing
       • Regression testing
       • Component/package testing
    3. Integration Testing
       • A definition
         • Testing of functional interactions at interfaces
         • Occurs between unit and system testing
       • Purpose
         • Ensure components work together in a subsystem/system
    4. When to Integration Test
       • Unit testing is at 0% integration
       • System testing is at 100% integration
       • General approach
         • Start integration testing after unit testing
         • Integration testing should be a continuous activity
           • "Big-bang" integration and testing is a futile exercise
    5. Integration Testing Approaches
       • Subsystems
         • Accumulate slowly into natural units
         • Tests consisting of input sequences are needed
       • Check interfaces
         • Local instrumentation is needed
         • Assertions
           • Modules check inputs
           • Callers check returned values
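The assertion discipline above can be sketched in a few lines; the module and field names here are invented for illustration, not taken from the lecture:

```python
def lookup_account(account_id: int) -> dict:
    # The called module checks its own inputs at the interface...
    assert isinstance(account_id, int) and account_id > 0, \
        f"bad account_id at interface: {account_id!r}"
    return {"id": account_id, "balance": 0}

def caller() -> dict:
    record = lookup_account(42)
    # ...and the caller checks what comes back across the interface.
    assert "id" in record and "balance" in record, "malformed record"
    assert record["balance"] >= 0, "negative balance returned"
    return record

print(caller())
```

Both sides checking makes an interface mismatch fail loudly at the boundary where it occurs, which is exactly the "local instrumentation" integration testing needs.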
    6. System Testing
       • Driven by functional requirements
         • Use the requirements document
         • User manual
       • Similar to acceptance testing, except
         • Acceptance testing is performed by the customer/users/proxy
         • Acceptance testing demonstrates fitness rather than detecting defects
    7. Nature of System Testing
       • Similar to the unit testing approach, but:
         • Performed on the integrated system
         • Black-box only
         • Intellectual control has been lost
       • Test planning is critical
         • The plan should be developed in conjunction with the functional specification
    8. System Testing Mechanisms
       • Mechanisms
         • Module coverage
         • Sequences (use cases)
         • Environment simulation
       • Automatic bookkeeping required
         • Test scripts help
    9. Operational Testing
       • Operational testing
         • Execute test cases with the same statistical properties as those found in real operational use
       • Finding failures → confidence in none
         • Since operations used frequently by users are the testing focus, failures are not uniformly distributed
       • Structural coverage?
         • No
       • Functional coverage?
         • Almost; needs weighting by usage
    10. Operational Profile
       • Operational profile
         • A set of operations that a program may perform and their probabilities of occurrence in actual operation
         • There may be many operational profiles
       • (In theory) Probability density of input
       • (In practice) Histogram of function usage
    11. A Theoretical Operational Profile
    12. Histogram Profile by System Function
    13. Random Testing (Inputs)
       • Not a "random" methodology
         • Opposite of systematic
         • Pseudorandom number generator
         • Non-numerical inputs?
       • Effective oracle essential
         • Too many (not nice) inputs
    14. Cube-root Subroutine
       • Range [-10000, 10000]
       • Accuracy 1 in 100000
       • Profile (1000 test points):
         • [-1.1, 1.1]: 80% (800)
         • (1.1, 100]: 10% (100)
         • (100, 10000]: 5% (50)
         • [-10000, -1.1): 5% (50)
         • (Uniform pseudorandom within subdomains)
       • Oracle: cube the result, plus/minus .00001
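Slide 14's test is small enough to sketch in full: uniform pseudorandom points in each profile subdomain, checked by cubing the result. The `cube_root` below is my stand-in for the subroutine under test, not the lecture's:

```python
import math
import random

rng = random.Random(0)  # seeded so the run is reproducible

subdomains = [              # (low, high, number of points), per the profile
    (-1.1, 1.1, 800),       # 80%
    (1.1, 100.0, 100),      # 10%
    (100.0, 10000.0, 50),   # 5%
    (-10000.0, -1.1, 50),   # 5%
]

def cube_root(x: float) -> float:
    # Stand-in implementation under test; handles negative inputs,
    # which x ** (1/3) alone would not.
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

failures = 0
for lo, hi, n in subdomains:
    for _ in range(n):
        x = rng.uniform(lo, hi)
        r = cube_root(x)
        if abs(r ** 3 - x) > 1e-5:   # oracle: cube the result, +/- .00001
            failures += 1
print("failures:", failures)
```

Note how cheap the oracle is here: cubing is trivial and exact enough in floating point, which is exactly why this example is "unrealistic" compared with programs that have no effective oracle.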
    15. Random Testing in Practice
       • Surprisingly good at failure finding
         • Measures: probability of finding at least one failure
         • Expected value of delivered reliability after test
       • Failures are found in the order users will see them
       • Good stopping rule
         • Reliability goals have been reached with acceptable confidence
    16. Regression Testing
       • Pick from the existing testset
         • Cases needed because of changes
         • Cases to alter because of changes
         • Relies on the existing testset being good
       • New tests for changes
       • Dependency analysis is the answer
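One common way dependency information drives test selection is coverage-based: keep every existing test that executes something the change touched. A minimal sketch with invented test and function names (the lecture names no particular algorithm):

```python
# test -> set of functions that test executes (recorded on a prior run)
coverage = {
    "t1": {"parse", "eval"},
    "t2": {"parse", "print"},
    "t3": {"eval", "optimize"},
}
changed = {"optimize"}  # functions touched by the change

# A test can only behave differently if it reaches changed code,
# so selecting all such tests is "safe" under this coverage model.
selected = sorted(t for t, funcs in coverage.items() if funcs & changed)
print(selected)
```

This is where the "relies on the existing testset being good" caveat bites: selection can only pick from tests that exist, so new behavior still needs new tests.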
    17. Dependency Analysis
       • Dependence
         • Exists between two statements when the order of their execution affects the results of the program
       • Static methods
         • Worst-case flowgraph traversal
         • Very similar to dataflow coverage
       • Dynamic methods
         • Rely on given test data
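The core def-use idea behind static dependence can be shown on a toy statement list (my own illustration, not the lecture's): a later statement depends on an earlier one if it uses a variable the earlier one defines, so swapping them could change the program's results.

```python
# (statement name, variables defined, variables used)
statements = [
    ("S1", {"x"}, set()),       # x = 1
    ("S2", {"y"}, {"x"}),       # y = x + 1
    ("S3", {"z"}, {"y", "x"}),  # z = y * x
]

deps = []
for i, (si, defs_i, _) in enumerate(statements):
    for sj, _, uses_j in statements[i + 1:]:
        if defs_i & uses_j:     # a definition reaches a later use
            deps.append((si, sj))
print(deps)  # [('S1', 'S2'), ('S1', 'S3'), ('S2', 'S3')]
```

A real static analysis does this over all paths of a flowgraph (hence "worst-case"), while a dynamic analysis records which definitions actually reached which uses on the given test data.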
    18. Package Testing (Systems)
       • Requirements? (Yes, but neglected.)
       • Specification, design, code? (No.)
       • Black-box system tests are dictated
       • There may be an acceptance test based on a standard
    19. Component Testing (Modules)
       • Libraries
         • Language (Java standard class library, GNU C lib, etc.)
         • User groups (search the net)
       • Commercial off-the-shelf components (COTS)
       • Problems
         • What is the specification of the component?
         • Is the quality good enough?
    20. Conclusions
       • Integration testing is crucial to finding failures at interfaces
       • System test planning is necessary for credible results
       • Operational testing is important for finding failures likely to occur in use