Software Testing
Submitted to Alfred Hussein as part of the course requirements for SENG 623 – Software Quality Management, University of Calgary, Winter 2003.

Submitted by Yuhang (Henry) Wang and Scott Thornton, April 10, 2003.
ABSTRACT

Software testing begins immediately after the source code of the software has been generated. Within the field of Software Quality Management, software testing is an important approach to Software Quality Assurance: compared with techniques such as inspections, walkthroughs and other reviews, it represents the last line of defense for correcting deviations from specification and errors in design or implementation. This paper examines software testing from a quality management context. A brief history of the theory of testing is provided in order to frame current approaches and techniques. Testing fundamentals and the four testing levels or stages of the software development cycle (unit, integration, system, and acceptance testing) are described. Organizational issues associated with software testing are discussed, and several keys to and pitfalls of testing are provided in the concluding section.
Table of Contents

1 Introduction
  1.1 History of Software Testing
2 Testing Fundamentals
  2.1 Testing Overview
  2.2 Testing Stages
  2.3 Testing Techniques
    2.3.1 Static and Dynamic Testing
    2.3.2 Black Box and White (Glass) Box Testing
    2.3.3 Equivalence Partitioning
    2.3.4 Boundary Value Analysis
    2.3.5 Path Testing
  2.4 Testing Versus Inspections
3 Unit Testing
  3.1 Procedures
  3.2 Metrics
4 Integration Testing
  4.1 Procedures
  4.2 Metrics
5 System Testing
  5.1 Procedures
  5.2 Metrics
6 Validation Testing
  6.1 Procedures
  6.2 Metrics
7 Testing Organization Issues
  7.1 Who Tests?
  7.2 When Should Testing Stop?
  7.3 Organizational Issues
8 Discussion and Conclusions
9 References
List of Figures

Figure 1  Verification and Validation Software Lifecycle Model
Figure 2  Unit Testing Infrastructure

List of Tables

Table 1  Testing Levels versus Quality Views
Table 2  Typical Tester versus Testing Level
1 Introduction

Software testing happens immediately after the source code of the software has been generated. It is performed to uncover and correct as many of the potential errors as possible before delivery to the customer. Within the field of Software Quality Management, software testing is an important approach to Software Quality Assurance: compared with techniques such as inspections, walkthroughs and other reviews, it represents the last line of defense for correcting deviations from specification and errors in design or implementation.

This paper examines software testing from a quality management context. It begins with a brief history of the theory of testing in order to frame current approaches and techniques. Section Two provides some testing fundamentals, describing what constitutes a valid test, various testing techniques, and the four testing levels or stages of the software development cycle: unit, integration, system, and acceptance testing. Sections Three through Six delve into the details of each of these stages. Section Seven addresses organizational issues associated with software testing, and a discussion and conclusions wrap up the paper in Section Eight.

1.1 History of Software Testing

Throughout the history of software development, there have been many divergent definitions of software testing. In the 1950s, testing was defined as "what programmers did to find bugs in their programs" [Hetzel 1988]. Today, this definition is much too restrictive: software testing has been extended to include not only determining that a program functions correctly, but also that the functions themselves are correct.

As the science of software engineering matured through the 1960s and 1970s, the definition of testing underwent a revision. Consideration was given to exhaustive testing of the software in terms of the possible paths through the code, or by enumerating the possible input datasets. Even with the complexity of the software systems being developed at that time, this was impractical, if not theoretically impossible. The 1950s concepts were instead extended to include "what is done to demonstrate correctness of a program" [Goodenough 1975], or to define testing as "the process of establishing confidence that a program or system does what it is supposed to do" [Hetzel 1973]. Although this concept is valid in theory, in practice it is insufficient. If only simple, straightforward tests are performed, it is easy to show that the software "works". Since such tests may not exercise a significant portion of the software, large numbers of defects may remain to be discovered during actual operational use. It was therefore concluded that correctness demonstrations are an ineffective method of testing during software development. There is still a need for correctness demonstrations (acceptance testing, for example), as will be seen later in this paper.

The 1980s saw the definition of testing extended to include defect prevention. According to Boris Beizer [Beizer 1983], the act of designing tests is one of the most effective bug preventers known. As well, with the ballooning costs and effort dedicated to testing, it was recognized that a testing methodology was required; specifically, that testing must include reviews and that it should be a managed process.
The power of early test design was recognized at the beginning of the 1990s. Testing was redefined to be "the planning, designing, building, maintaining and executing of tests and test environments" [Hetzel 1991]. This incorporated all of the ideas to date: good testing is a managed process and a total life cycle concern, with testability designed in [Beizer 1990].
2 Testing Fundamentals

This section provides the basis for the remainder of the paper. The first subsection gives some basic definitions and identifies the purposes and objectives of testing. Subsection Two shows how testing is applied to software in stages, starting at the smallest testable unit and ending with a complete demonstration of a system's functionality and capabilities; the various levels are introduced, with further specifics provided in later sections. The last subsection looks at common testing techniques and the types of errors they are likely to find.

2.1 Testing Overview

As the name suggests, software testing involves running a series of dynamic executions of the software product. For a test to be a test (sometimes called a test case), it must have the following properties:

1. A controlled/observed environment that makes the test reproducible, so that it can be verified that a defect was corrected;
2. A set of sample inputs that is (generally) a small subset of the possible values;
3. Predicted results that are expected from the set of sample inputs, so that correct or incorrect operation can be verified. The predicted result must be available before the execution of the test case;
4. An analyzed outcome that compares the predicted and actual results for each execution of the test.

Tests are run to assess the level of quality in a software product. They determine what a system actually does, and to what level of performance. As well, testing helps to achieve software quality by finding defects, along with other activities in the software development process. Finally, testing helps to preserve quality by enabling a modified system to be retested, ensuring that what worked before still works. Glen Myers [Myers 1979] states that:

1. Testing is a process of executing a program with the intent of finding an error.
2. A good test case is one that has a high probability of finding an as-yet-undiscovered error.
3. A successful test is one that uncovers an as-yet-undiscovered error.

By uncovering errors in the software, their causes can be determined and fixes implemented. This results in a software product that contains fewer defects, achieving a greater level of quality. Additional tests may or may not uncover errors, which leads to the fundamental limitation of testing: testing can only show the presence of defects; it can never prove the absence of all errors. As is well known, producing error-free software is extremely difficult. The developer requires an assessment of the quality of the software before releasing it to customers, and this assessment is derived from testing. If no or few errors are found, then the quality of the software is assumed good, providing some confidence that the product is working well.

Therefore, software testing has two main objectives:

1. To uncover errors in the software product before its delivery to the customer (or to the next stage of development), and
2. To give confidence that the product is working well.

It is interesting to note that the two testing objectives result in what is known as the "Testing Paradox": if the first objective is to uncover errors in the software product, how can there be confidence that the product is working well, when it was just proved that it is, in fact, not working? It must also be noted that test cases that pass (i.e. produce the expected results) do not, in and of themselves, improve the quality of the software product. Nothing has changed in the software under test, so given that there is no change there can be no improvement or degradation. Rather, these tests only improve our confidence in the software product.
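As a concrete illustration of the four properties of a test case listed above, consider the following minimal sketch, which uses Python's standard unittest framework. The compute_discount function and its 10% discount rule are hypothetical, invented purely for this example.

```python
import unittest


# Hypothetical unit under test, invented for this example: orders of $100 or
# more receive a 10% discount.
def compute_discount(order_total):
    if order_total >= 100:
        return order_total * 0.90
    return order_total


class DiscountTest(unittest.TestCase):
    """Each method is a test case exhibiting the four properties above:
    a reproducible environment (no external state), a small sample of the
    possible inputs, a result predicted before execution, and an analyzed
    outcome (the assertion comparing predicted and actual results)."""

    def test_discount_applied_at_threshold(self):
        self.assertEqual(compute_discount(100), 90.0)  # predicted: 10% off

    def test_no_discount_below_threshold(self):
        self.assertEqual(compute_discount(99.0), 99.0)  # predicted: unchanged


if __name__ == "__main__":
    unittest.main()
```

In Myers' terms, each of these is a good test case only insofar as it has a high probability of exposing an as-yet-undiscovered error, for example an "off by one" mistake at the $100 threshold.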
2.2 Testing Stages

Overall testing of a software system can be divided into essentially four levels or stages. Each stage parallels the level of complexity found within a software product during development.

At the lowest and simplest level is Unit Testing. Here, the basic units of the software are tested in isolation. (A unit is defined to be the smallest testable piece of software [Beizer 1990].) The objective is to find errors in these units, in either the logic or the data.

Units are assembled into larger aggregates called components. When two or more tested components or units are combined, the testing done on the aggregate is called Component Integration Testing, or just Integration Testing. The tests done at this level look for errors in the interfaces between the components. As well, the functions which can now be performed by the aggregate, but which were not testable individually, must also be examined.

After all the components have been assembled, the entire system can be tested as a whole; this is called System Testing. It might be argued that System Testing is just the final stage of component integration testing. However, at this stage the functional and/or requirements specification is used to generate the test cases. System Testing looks for errors in the end-to-end functionality of the system, as well as errors in non-functional requirements such as performance, reliability, and security.

The last stage of testing is Validation or Acceptance Testing. Here the system is handed over to the end-users or customers. The purpose of this testing is to give confidence that the system is ready for operational use, rather than to find errors. Thus, it is more correctly called a demonstration than a test.
Each level of testing addresses a different view of the quality of the software product, as shown in Table 1.

Software Testing Level    Quality View
Unit Testing              Manufacturing View of Quality
Integration Testing       Manufacturing View of Quality
System Testing            Product View of Quality
Validation Testing        User View of Quality

Table 1  Testing Levels versus Quality Views

A "Verification and Validation" software life cycle model is usually used to demonstrate the goals of the different testing stages. In this model, Verification and Validation refer to the testing activities. Verification comprises the testing actions that ensure a specification or function is correctly implemented. The activities that ensure the software that has been built is traceable to the original requirements specification are known as Validation. Boehm [Boehm 1981] states this in two simple sentences:

Verification: "Are we building the product right?"
Validation: "Are we building the right product?"

Figure 1  Verification and Validation Software Lifecycle Model
2.3 Testing Techniques

Software testing techniques provide systematic approaches for designing tests that exercise the internal logic of software components and the input and output domains of the program, in order to uncover errors in program functions, behaviour and performance. Testing techniques therefore address not only the functional areas of the software but also the non-functional areas. The next few sections examine several types of testing techniques.

2.3.1 Static and Dynamic Testing

The primary difference between static and dynamic testing is that static techniques do not exercise the software in its execution environment, whereas dynamic testing involves operating the software with a set of test inputs and evaluating the resulting outputs. Static analysis testing developed from compiler technology and has resulted in a significant number of errors being uncovered. Such analysis can examine the control and data flow, check for dead or unreachable code, and identify infinite loops, uninitialized or unused variables, and standards violations. Certain measures, such as McCabe's Cyclomatic Complexity, are calculated statically and provide an assessment of the testability of the software entity. Dynamic test techniques can be classified into Functional and Structural techniques. These are often referred to as Black Box and White (Glass) Box testing, respectively, and are described in the next section.

2.3.2 Black Box and White (Glass) Box Testing

Black-box tests are executed to validate the functional requirements without analysis of the internal logic of the program. In other words, the tester does not consider how a software component performs its function, but rather just that it does. The ideal black-box tests are based on a set of test cases that satisfy the following criteria [Myers 1979]:

1. test cases that reduce, by a count that is greater than one, the number of additional test cases that must be designed to achieve reasonable testing, and
2. test cases that tell us something about the presence or absence of classes of errors, rather than an error associated only with the specific test at hand.

The goal of black box testing is to find errors such as:

1. incorrect or missing functions,
2. interface errors,
3. behaviour or performance errors, and
4. initialization and termination errors.

White box tests are executed to verify that the program runs correctly by considering its internal logical structure. Test items are chosen to exercise the required parts of the structure. The test cases designed for white box testing should satisfy the following criteria:

1. guarantee that all independent paths within a module are covered at least once,
2. execute all the logical decisions for both true and false,
3. exercise all loops at their boundaries and within their operational bounds, and
4. ensure internal data structures are validated.

One concrete white box testing technique is "Basis Path Testing", as proposed by [McCabe 1976]. There are also other techniques, such as Condition Testing [Tai 1989], Data Flow Testing, and Loop Testing.

White box testing and black box testing are complementary techniques. Using only one of them is impractical in real-world software testing. They are typically combined to provide an approach that validates both the interface and the internal workings of the software.

2.3.3 Equivalence Partitioning

Testing is more effective if the various tests are distributed over the complete range of possibilities rather than being drawn from just a few. Input values that are processed in an equivalent fashion can be regarded as being equivalent. If the input domain is partitioned into equivalent subsets, then a good set of test cases draws an input value from each subset rather than concentrating on only a few of them.

2.3.4 Boundary Value Analysis

Boundary Value Analysis is related to equivalence partitioning: boundary values are the values that lie on the boundaries of an equivalence partition. As suggested above, test cases should be drawn from each equivalence partition. Values on each side of a partition boundary should also be tested, addressing one of the most common coding errors, that of being "off by one". It might be argued that since any value within a particular equivalence set is as good as another, boundary testing is superfluous; note, however, that the equivalence partitions are defined by these boundary values. Boundaries can exist on output data ranges as well: tests should be designed to produce valid outputs and to (attempt to) produce invalid ones. Hidden boundaries, such as maximum string lengths or data set sizes, also need to be determined, and it must be verified that appropriate responses are generated by the software under test. The sketch below combines both techniques.
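The following minimal sketch applies equivalence partitioning and boundary value analysis together. The is_valid_age function and its 0 to 120 range are hypothetical, invented for this illustration.

```python
import unittest


# Hypothetical unit under test: a validator that accepts ages from 0 to 120
# inclusive. Both the function and the range are invented for this example.
def is_valid_age(age):
    return 0 <= age <= 120


class AgeValidationTest(unittest.TestCase):
    def test_equivalence_partitions(self):
        # One representative per partition: below range, in range, above range.
        self.assertFalse(is_valid_age(-40))
        self.assertTrue(is_valid_age(35))
        self.assertFalse(is_valid_age(200))

    def test_boundary_values(self):
        # Values on each side of each boundary catch "off by one" errors.
        self.assertFalse(is_valid_age(-1))
        self.assertTrue(is_valid_age(0))
        self.assertTrue(is_valid_age(120))
        self.assertFalse(is_valid_age(121))


if __name__ == "__main__":
    unittest.main()
```

Note that only seven inputs are needed: one representative from each of the three partitions, plus the four values that straddle the two boundaries.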
2.3.5 Path Testing

A path is defined to be a sequence of program statements that are executed by the software under test in response to a specific input. In most software units there is a potentially (near) infinite number of different paths through the code, so complete path coverage is impractical. Notwithstanding that, a number of structural techniques that involve the paths through the code can lead to a reasonable test result.

2.3.5.1 Branch Testing

The structure of the software under test can be shown using a control flow diagram, illustrating statements and decision points. The number of unique paths from the start to the end of the code is equal to the number of decision points plus one; this number is just the Cyclomatic Complexity of the program [McCabe 1976]. Using the control flow diagram, test cases can be designed such that each exercises at least one new segment of the control flow graph. In theory, the number of test cases required should be equal to the cyclomatic complexity; in practice, this is extremely difficult to achieve operationally.

2.3.5.2 Condition Testing

In branch testing, only the value of the complete boolean expression at each decision point is taken into consideration, regardless of whether that expression is simple or compound. In Condition Testing, a test case is designed for each component of the boolean expression involved in a decision point. The difference is illustrated in the sketch below.
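The following sketch contrasts the two techniques on a single decision point. The can_withdraw function and its overdraft rule are hypothetical, invented for this illustration.

```python
# Hypothetical decision containing a compound boolean expression, invented
# for this example.
def can_withdraw(balance, amount, overdraft_allowed):
    if amount <= balance or overdraft_allowed:  # one decision, two conditions
        return True
    return False


# Branch testing needs only two cases, one per outcome of the whole decision:
assert can_withdraw(100, 50, False) is True    # decision evaluates True
assert can_withdraw(100, 200, False) is False  # decision evaluates False

# Condition testing adds cases so that each component condition determines
# the outcome on its own:
assert can_withdraw(100, 200, True) is True    # second condition alone is True
assert can_withdraw(100, 50, True) is True     # both conditions True
```

The two branch-testing cases never let overdraft_allowed decide the outcome, so a defect in that condition (say, the flag being ignored) would escape branch testing but be caught by condition testing.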
2.4 Testing Versus Inspections

There is much debate around the idea that software inspections at the code level can replace testing. Experience has shown inspections to be a powerful defect prevention methodology. It is often said that the best way to understand a topic is to have to teach or explain it to another; inspections encourage just that philosophy. However, inspections are a static analysis of the software being examined. Given the complexity of software systems being developed today, it is unlikely that such static analysis will be capable of detecting defects that involve significant interactions between multiple components, or be able to assess performance characteristics. Such assessments can only be made through a dynamic form of testing. This leads to the conclusion that inspections may effectively address software quality issues at the unit test level and some portion of the integration test level. However, as the complexity of the system under test increases, there is a threshold where static analysis must give way to the dynamic nature of testing. In other words, inspections and testing are complementary techniques for assessing or improving the quality of the software product. Each should be employed where its return on investment is greatest, or where the alternate technique is incapable of being completely successful.

3 Unit Testing

Unit testing is performed on the smallest unit of software, the software module. The module-level design descriptions are normally used as the guideline for unit testing. The internal paths are tested to uncover the errors within the boundary of the module. The interface, local data structures, boundary conditions, independent paths and error handling paths are examined at this testing level.

3.1 Procedures

For unit testing, driver and/or stub software is developed to execute the tests. A driver is a "main program" that accepts test case data as input and reports the corresponding results. Stubs are constructed to provide the module interfaces that are called by the component under test. The whole infrastructure for unit testing is shown in Figure 2: a test driver feeds test cases to the module under test and collects the results, while stubs stand in for the modules it calls.

Figure 2  Unit Testing Infrastructure

Unit testing has a dynamic, white-box orientation.
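The following minimal sketch shows the driver/stub arrangement of Figure 2 in code. The price_report function, the PriceServiceStub, and the report format are all hypothetical, invented for this example; the driver role is played here by Python's unittest framework.

```python
import unittest


# Hypothetical module under test, invented for this example: formats a report
# line using a lower-level price-lookup module that it calls.
def price_report(item_id, price_service):
    price = price_service.get_price(item_id)
    return f"Item {item_id}: ${price:.2f}"


# Stub: stands in for the (perhaps unfinished) price-lookup module, offering
# the same interface but returning canned data.
class PriceServiceStub:
    def get_price(self, item_id):
        return 19.99


# Driver: the "main program" that feeds test case data to the module under
# test and compares the actual output with the predicted result.
class PriceReportTest(unittest.TestCase):
    def test_report_format(self):
        self.assertEqual(price_report(42, PriceServiceStub()),
                         "Item 42: $19.99")


if __name__ == "__main__":
    unittest.main()
```

Because the stub isolates the module from its real collaborators, any failure observed here lies within the boundary of the module itself.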
3.2 Metrics

Metrics typically applied at the unit test level focus on Product Quality Assessment or Test Quality Assessment. Code complexity metrics such as McCabe's [McCabe 1976] are used to help design test cases. Test Effectiveness Ratios such as:

• Statement Coverage (TER 1),
• Branch and Decision Coverage (TER 2),
• Decision Condition Coverage, and
• Linear Code Sequence and Jump (LCSAJ) Coverage (TER 3)

are commonly used to assess the effectiveness of a given test set. It is believed that a thorough test of a module should exercise every statement at least once, so 100% statement coverage (TER 1) should be achieved.

Complete statement coverage does not, however, guarantee that all branches in a unit have been exercised. For example, if every decision is an "if" with no "else" part and the test cases always take the "true" branch, 100% statement coverage can be achieved while the implicit "false" branches are never exercised. A sketch at the end of this section illustrates this.

Defect tracking metrics such as:

• Defect Arrival Rate,
• Defect Densities,
• Cumulative Defects by Severity, and
• Defect Closure Rates

are an important assessment of the quality of the product and of the rework/repair process.

Test Completeness metrics determine the progress of the testing effort. This is required by both the developer and the project manager in assessing the level of resources needed to achieve the desired level of quality. Time/cost metrics for test design, test debugging, test execution and analysis, and defect resolution are important for determining the true cost of testing and hence the true cost of quality.
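The following sketch makes the statement-versus-branch coverage distinction concrete. The apply_cap function is hypothetical, invented for this illustration.

```python
# Hypothetical unit, invented for this illustration: a decision with no
# "else" part.
def apply_cap(value, cap):
    if value > cap:   # the only decision point
        value = cap   # the only statement inside the "true" branch
    return value


# One test case taking the "true" branch executes every statement in the
# function (100% statement coverage, TER 1) yet exercises only one of the
# two branches (50% branch coverage, TER 2).
assert apply_cap(150, 100) == 100

# A second case is needed to take the implicit "false" branch and reach
# complete branch coverage.
assert apply_cap(50, 100) == 50
```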
4 Integration Testing

The testing done on the combination of two or more tested components or units is called Integration Testing. It is a systematic technique for conducting tests to uncover the errors associated with the interfaces between software units. As well, it exposes errors related to the larger functionality of the combined components under test. Integration testing follows unit testing, in that the components being integrated need to be unit tested first.

4.1 Procedures

There are two common ways to conduct integration testing. One is non-incremental integration, which uses a "big bang" approach: all the software units are assembled into the entire program, and the assembly is then tested as a whole from the beginning. This usually results in a chaotic situation, as the causes of defects are not easily isolated and corrected.

A much superior way to conduct integration testing is to follow an incremental approach. The program is constructed and tested in small increments by adding a minimum number of components at each step. Errors are therefore easier to isolate and correct, and the interfaces are more likely to be tested completely. Two approaches have been identified for performing incremental integration testing: Top-Down integration and Bottom-Up integration. In Top-Down integration, modules are integrated from the main module (main program) down to the subordinate modules, in either a depth-first or breadth-first manner. Bottom-Up integration, as the name suggests, has the lowest-level sub-modules integrated and tested first; successively higher-level components are then added and tested, traversing the hierarchy from the bottom upwards.

When conducting integration testing, the new functionality of the integrated set must be confirmed. In addition, previously confirmed functionality must be tested again, because the newly integrated components may have broken the previously tested set. Such testing is known as Regression Testing. (Some authors suggest that it should really be known as "Anti-Regression Testing", i.e. confirming that the integrated components have not regressed in their level of quality.) A sketch of an incremental, top-down step, with regression re-testing, appears at the end of this section.

Integration Testing is typically dynamic, and is usually black box.

4.2 Metrics

Metrics that can be collected during Integration Testing include:

• Error rates in design and implementation.

As well, several of the metrics described in Section 3.2 (Unit Test Metrics) may also be collected during integration testing, specifically the Test Execution Progress, Time/Cost, and Defect Tracking metrics.
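The following sketch shows one top-down increment. The order_total function, the tax module interface, and the TaxStub are all hypothetical, invented for this example.

```python
import unittest


# Hypothetical top-down increment, invented for this example: the top-level
# module is real, while a subordinate tax module that has not yet been
# integrated is replaced by a stub with the same interface.
def order_total(prices, tax_module):
    subtotal = sum(prices)
    return subtotal + tax_module.tax_for(subtotal)


class TaxStub:
    def tax_for(self, subtotal):
        return 0.0  # canned result until the real tax module is integrated


class OrderIntegrationTest(unittest.TestCase):
    def test_total_with_stubbed_tax(self):
        self.assertEqual(order_total([10.0, 20.0], TaxStub()), 30.0)


if __name__ == "__main__":
    unittest.main()
```

When the real tax module replaces the stub at the next increment, the same suite is re-run as a regression test to confirm that previously working behaviour has not been broken.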
5 System Testing

After all the components have been assembled, the entire system can be tested as a whole; this is called System Testing. The software must be tested in a context (for example, with particular hardware, users, or environment) that is as similar to the end-use context as possible. System Testing looks for errors in the end-to-end functionality of the system, as well as errors in the non-functional requirements. At this stage, the functional and/or requirements specifications are used to generate the test cases. These should form a series of different tests whose primary purpose is to fully exercise the computer-based system.

5.1 Procedures

As indicated above, the test cases developed for system testing are derived from the functional and/or requirements specifications. The procedures used for functional system testing are essentially equivalent to those used in Integration Testing (Section 4.1). Depending on the type of software system being tested, many non-functional tests may be required. These include:

1. Recovery Testing
2. Security Testing
3. Stress Testing
4. Performance Testing
5. Maintainability Testing
6. Usability Testing

A sketch of a simple performance test follows at the end of this section.

5.2 Metrics

Process metrics at this level of testing are those described in Section 4.2, Integration Testing Metrics.
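As a minimal sketch of how a non-functional requirement becomes a test case, consider the following. The run_query function and the 0.5-second response-time requirement are hypothetical, invented for this illustration.

```python
import time
import unittest


# Hypothetical function and requirement, invented for this example: assume a
# non-functional requirement that a query must complete within 0.5 seconds.
def run_query(n):
    return sorted(range(n, 0, -1))


class QueryPerformanceTest(unittest.TestCase):
    def test_query_meets_response_time_requirement(self):
        start = time.perf_counter()
        run_query(100_000)
        elapsed = time.perf_counter() - start
        # The non-functional requirement becomes a pass/fail criterion.
        self.assertLess(elapsed, 0.5)


if __name__ == "__main__":
    unittest.main()
```

The key point is that performance, stress and similar requirements need quantified thresholds before they can be tested at all; the threshold is what turns a quality attribute into a predicted result.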
6 Validation Testing

Once System Testing has been completed, the software product is handed over to the customers or users. Validation Testing marks the transition of ownership of the software from the developers to these users. As such, this level of testing differs from the previous three in the following ways:

1. Validation Testing is typically the responsibility of the users or customer rather than the developers. (That said, it is often the case that the development organization writes the Validation Test procedures for approval, and then execution, by the customer.)
2. The intent of Validation Testing is to give the customer confidence that the delivered system is working as intended, rather than to find errors in the implementation. As such, it is more correctly called a demonstration than a test.
3. Validation Testing can often include testing of the customer's work practices, to ensure that the software is compatible with the organization's internal procedures and processes.

A simple definition for Validation Testing is that validation succeeds when the software functions in a manner that can reasonably be expected by the customer.

6.1 Procedures

There are three common forms of Validation Testing. The first is "Acceptance Testing", where an extensive series of formal tests, with specific procedures and well-documented pass/fail criteria, is planned and then executed.

The second form of Validation Testing is known as an "Alpha Test". Here the customer works in a controlled environment, usually at the developers' site. In the Alpha Test, any errors found are recorded and corrected promptly, since the developers and the customer work together. Typically there are no formal test procedures; rather, the customers use the software in a manner that represents their intended use in their own environment. As such, if errors are found, the exact procedure which elicits an error may be difficult to repeat.

"Beta Testing", the third form of Validation Testing, is conducted at one or more customer sites by the end-users. Since the developers are not on site when the software is used, errors are reported back and then fixed. As with Alpha Tests, Beta Tests involve the execution of the software in the customer's environment, with typical use and without formal test procedures. Beta Testing is most often used with mass-market software. It is usually the final test performed before the software is released.

6.2 Metrics

During validation testing, the Defect Tracking metrics can be critical. Coupled with the time/cost metrics, they provide an ongoing evaluation of the Cost of Quality due to software testing.
7 Testing Organization Issues

7.1 Who Tests?

There are several possibilities as to which part of the software development organization actually performs the testing. It is important to take an independent view of the software under test, in order to ensure that no personal stake exists in not finding defects. For developers, this is quite difficult, for two reasons. First, software engineers create the programs, documentation and related artifacts. They are proud of what they have built and look askance at anyone who attempts to find fault with it, including themselves! If they have produced what they consider their very best work, they will not want to immediately tear it apart. Secondly, it is human nature to see what was intended rather than what is actually there. Studies have shown that individuals can find only 25 to 30% of their own errors [Beizer 1988].

Notwithstanding the above, the developer of a software unit is likely the one with the greatest understanding of its internal structure. As such, the developer is best suited to design the test cases, based on that structure, to ensure complete path coverage. Thus it is often the case that the developers are responsible for testing the individual units of the program. In many cases, the developers may also conduct integration testing up to a certain level.

An alternative to having developers perform all the testing activities is to create an Independent Test Group (ITG). This has several advantages: a separate team has true independence, and a testing expertise is developed in the group that can be applied across multiple products. An ITG can also be a resource to the developers, ensuring testability and performing inspections during design and coding. The software developers do not simply turn the software units over to the test team and walk away; the two must work closely together throughout development to ensure that thorough tests will be conducted. Further, while testing is being conducted, the developers must be available to correct defects as they are uncovered.

Issues can arise with an ITG, however. The test group is responsible for "breaking" the thing that the developer has built, since they are paid to find the errors. This can lead to animosity between developers and the test team. Social qualities that the individuals on the test team must have to help minimize this animosity are being in control, competent, critical, comprehensive and considerate [Hetzel 1988].

Finally, customers make valuable testers. It is well known that as soon as customers start to use a new system, they have the "skill" to find large numbers of errors. This derives from the fact that they often exercise the system in ways that were never conceived of by the developers. The customers do not have the psychological bias present in the developers, because they have no preconceived notion of the architecture of the software. The customer is always involved in the validation testing.
Table 2 shows, for each of the testing levels, who conducts the major portion of the testing effort.

                             Unit     Integration  System    Acceptance  Alpha    Beta
                             Testing  Testing      Testing   Testing     Testing  Testing
Software Developers          X        X            X                     X
Independent Testing Group    X        X            X         X           X
Customers                                                    X           X        X

Table 2  Typical Tester versus Testing Level
(Acceptance, Alpha and Beta Testing together make up Validation Testing.)

7.2 When Should Testing Stop?

As stated in Section 2.1, testing can only establish the presence of errors; it can never establish that the software is error-free. The question therefore arises as to when sufficient testing has been completed. Beizer notes that "There is no single, valid rational criterion for stopping [testing]. Furthermore, given any set of applicable criteria, how each is weighted depends very much on the product, the environment, the culture and the attitude to risk." [Beizer 1990]

Metrics that might be employed in determining when to stop testing include:

1. Defect Discovery Rate. If the discovery rate is decreasing and falls below a particular threshold, it may mean that the product is ready to ship. However, it must be understood why the rate is decreasing: if it is due to less testing effort, or because no new test cases remain, the decreased rate is not representative of the software quality. (A sketch of this criterion follows at the end of this section.)
2. Trends in Severity Levels of Defects Found. If the trend in severity level is not downwards, then the product is not ready to ship.
3. Remaining Defects Estimation criteria (based on past, historical data).
4. Percentage Coverage Measurements.
5. Reliability Growth Models (Mean Time Between Failures (MTBF) and Mean Time To Repair (MTTR)).
6. Running out of testing resources (budget or schedule).

Unfortunately, the last item, running out of resources, is one of the most common reasons for the termination of testing. If testing is halted too soon, many defects may still be left in the software, some at the critical level. This results in a reactive environment in the development organization, with programmers "fire-fighting" the defects in the product rather than working on new features or new products. Turnover rates for key employees will increase as a result. In addition, the customers will be frustrated in their attempts to use the software and/or have it fixed.

If testing is continued beyond the ideal point, both the team and the end-users will be confident of the quality of the product, but there are negative ramifications. There will be a significant increase in the product's costs, as testing is an expensive activity: either margins will be smaller or the price of the product will be increased. Either way, the return on the software investment will be reduced, possibly to the point where the product is no longer economically viable. Late delivery may cost the organization revenue, market share, project cancellations, and frustrated customers.

Recall that software testing strives not for perfection but for an acceptable level of risk. Sometimes the risk of shipping a flawed product may be outweighed by the business risks resulting from competition or time-to-market.
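As a minimal sketch of the first stopping criterion above, consider the following. The weekly defect counts and the threshold are invented, project-specific values, not universal figures.

```python
# Hypothetical weekly defect counts from system testing; both the data and
# the threshold are invented, project-specific values.
weekly_defect_counts = [42, 35, 28, 19, 11, 6, 4]
SHIP_THRESHOLD = 5  # defects discovered per week


def rate_is_decreasing(counts):
    # True if each week's count is no higher than the previous week's.
    return all(later <= earlier
               for earlier, later in zip(counts, counts[1:]))


# The criterion from Section 7.2: a decreasing discovery rate that has fallen
# below the threshold may indicate readiness to ship, provided the drop is
# not merely due to reduced testing effort or exhausted test cases.
ready = (rate_is_decreasing(weekly_defect_counts)
         and weekly_defect_counts[-1] < SHIP_THRESHOLD)
print("Candidate for release" if ready else "Continue testing")
```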
7.3 Organizational Issues

As indicated previously, software testing needs to be carefully managed. To accomplish this, there are responsibilities at levels higher than the individual software developer that have an impact on the effectiveness of the testing.

At the project level, the project manager should make decisions jointly with the independent testing group on:

• Assignment of resources for testing
• The testing plan (before development)
• Test case design (before or in parallel with development)
• Test execution (after development)
• Problem resolution (whenever needed)

At the organization level, executive senior management should make the following decisions about organizational policy:

• Set the testing policy, strategy and objectives for the company
• Ensure that metrics for test effort and results are collected and used
• Decide how much effort and resource will be invested in tool support
• Commit to improving the test process

Metrics that should be monitored across the organization include:

• Error rates in design, inspection, test design, test execution and operation
• Error severity and cost to correct
• Time and costs in test design, test execution, defect densities, defect repair, inspections and reviews

The testing policy should apply to the whole development organization to ensure consistent quality across multiple products. This policy should include:

• The objectives of testing
• Economic constraints on testing
• The quality and acceptance criteria for test documentation
• The tools that will be provided to support the testing activities
• The evaluation of test effectiveness ratios
8 Discussion and Conclusions

Software testing is part of a managed software development process. Testing itself needs to be managed carefully, as a significant portion of development dollars is spent during this phase.

There are several keys to successful testing:

• To be cost effective, testing must be concentrated on the areas where it will be most effective. In other words, the testing effort does not have to be uniformly distributed over the system being tested. It is most important to test the parts of the system that are most critical to the users.
• Testing should be concentrated on the components that are the most difficult, the most complex, the hardest to write, or the least liked. It is in these components that the greatest number of defects is historically found. Defects (bugs) are social creatures; they tend to congregate in the same "locations" regardless of the software product they are found in.
• Testing should be planned such that, when testing is stopped for whatever reason, the most effective testing possible in the time allotted has already been done.

Just as there are keys to success in testing, there are also many pitfalls:

• The absence of an organizational testing policy may result in too much effort and money being spent on testing, attempting to achieve a level of quality that is impossible or unnecessary. Alternatively, without a policy, insufficient time may be allocated to the testing phase. Either way, the testing is neither effective nor efficient.
• The actual testing must be planned. Planning early in the development will identify what is needed in terms of test data, test drivers or harnesses; the need for test tools can also be determined in a timely fashion. Uneven testing, where functions important to the customer are missed entirely or unimportant functions are tested to an unreasonable level, is also a result of unplanned testing. Priorities for all tests should be decided in advance.
• If the software is very difficult to test, then the job of the tester will also be difficult. Early test design helps to identify these issues and ensures testability. If the software is of sufficiently poor quality, a vast number of errors will be generated and testing will essentially stop, because many tests cannot be completed. This should not be considered an issue of testing, but rather one of software quality.

Regardless of how good our testing and test management become, we will always be caught by the "Pesticide Paradox": every method used to prevent or find bugs leaves a residue of subtler bugs against which those methods are ineffectual [Beizer 1983].
9 References

Beizer, B. Software System Testing and Quality Assurance. Van Nostrand Reinhold, New York, 1984.

Beizer, B. Software Testing Techniques. Van Nostrand Reinhold, New York, 1983; 2nd edition, 1990.

Boehm, B. Software Engineering Economics. Prentice-Hall, 1981.

Craig, R. & Jaskiel, S. Systematic Software Testing. Artech House Publishers, Boston, 2002.

Goodenough, J. & Gerhart, S. "Toward a Theory of Test Data Selection". IEEE Transactions on Software Engineering, SE-1(2), June 1975.

Hetzel, B. (ed.) Program Test Methods. Prentice-Hall, Englewood Cliffs, NJ, 1973.

Hetzel, B. The Complete Guide to Software Testing. QED Information Sciences, Wellesley, Mass., 1988.

Hetzel, B. "Software Testing: Some Troubling Issues and Opportunities". BCS Special Interest Group in Software Testing, December 6, 1991.

Hetzel, B. Making Software Measurement Work: Building an Effective Measurement Program. John Wiley & Sons, New York, 1993.

Malevris, N. & Petrova, E. "On the Determination of an Appropriate Time for Ending the Software Testing Process". Proceedings of the First Asia-Pacific Conference on Quality Software, Hong Kong, October 2000.

Marciniak, J. (ed.) Encyclopedia of Software Engineering. John Wiley & Sons, 1994.

McCabe, T. "A Complexity Measure". IEEE Transactions on Software Engineering, SE-2(4), 1976.

Myers, G. The Art of Software Testing. John Wiley & Sons, New York, 1979.

Tai, K.C. "What to Do Beyond Branch Testing". ACM Software Engineering Notes, 14(2), April 1989, pp. 58-61.

Tennant, R. "Creating Five-Star Test Metrics on a One-Star Budget". International Conference on Software Testing, Analysis and Review, Anaheim, CA, 2002.

Vaughan, D. & Elledge, J. "Test Metrics Without Tears". International Conference on Software Testing, Analysis and Review, Orlando, FL, 2000.