Common Testing Problems – Pitfalls to Prevent and Mitigate
Note: This is an obsolete review draft of "Common Testing Problems – Pitfalls to Prevent and Mitigate: Descriptions, Symptoms, Consequences, Causes, and Recommendations." The book version, Common System and Software Testing Pitfalls, was published by Addison-Wesley in December 2013.

Common Testing Problems – Pitfalls to Prevent and Mitigate:
Descriptions, Symptoms, Consequences, Causes, and Recommendations

Donald G. Firesmith

© 2013 by Carnegie Mellon University
Table of Contents

1 Introduction
  1.1 Usage
  1.2 Problem Specifications
  1.3 Problem Interpretation
2 Testing Problems
  2.1 General Testing Problems
    2.1.1 Test Planning and Scheduling Problems
    2.1.2 Stakeholder Involvement and Commitment Problems
    2.1.3 Management-related Testing Problems
    2.1.4 Test Organization and Professionalism Problems
    2.1.5 Test Process Problems
    2.1.6 Test Tools and Environments Problems
    2.1.7 Test Communication Problems
    2.1.8 Requirements-related Testing Problems
  2.2 Test Type Specific Problems
    2.2.1 Unit Testing Problems
    2.2.2 Integration Testing Problems
    2.2.3 Specialty Engineering Testing Problems
    2.2.4 System Testing Problems
    2.2.5 System of Systems (SoS) Testing Problems
    2.2.6 Regression Testing Problems
3 Conclusion
  3.1 Testing Problems
  3.2 Common Consequences
  3.3 Common Solutions
4 Potential Future Work
5 Acknowledgements
Abstract

This special report documents the different types of problems that commonly occur when testing software-reliant systems. These 77 problems are organized into 14 categories. Each problem is given a title, a description, a set of potential symptoms by which it can be recognized, a set of potential negative consequences that can result if the problem occurs, a set of potential causes of the problem, and recommendations for avoiding the problem or solving it should it occur.
1 Introduction

Many testing problems can occur during the development or maintenance of software-reliant systems and software applications. While no project is likely to be so poorly managed and executed as to experience the majority of these problems, most projects will suffer several of them. Similarly, while these testing problems do not guarantee failure, they definitely pose serious risks that need to be managed.

Based on over 30 years of experience developing systems and software as well as performing numerous independent technical assessments, this technical report documents 77 problems that have been observed to commonly occur during testing. These problems have been categorized as follows:

• General Testing Problems
  - Test Planning and Scheduling Problems
  - Stakeholder Involvement and Commitment Problems
  - Management-related Testing Problems
  - Test Organization and Professionalism Problems
  - Test Process Problems
  - Test Tools and Environments Problems
  - Test Communication Problems
  - Requirements-related Testing Problems
• Testing Type Specific Problems
  - Unit Testing Problems
  - Integration Testing Problems
  - Specialty Engineering Testing Problems
  - System Testing Problems
  - System of Systems (SoS) Testing Problems
  - Regression Testing Problems

1.1 Usage

The information describing each of the commonly occurring testing problems can be used:

• To improve communication regarding commonly occurring testing problems
• As training materials for testers and the stakeholders of testing
• As checklists when:
  - Developing and reviewing an organizational or project testing process or strategy
  - Developing and reviewing test plans, the testing sections of system engineering management plans (SEMPs), and software development plans (SDPs)
  - Evaluating the testing-related parts of contractor proposals
  - Evaluating test plans and related documentation (quality control)
  - Evaluating the actual as-performed testing process during oversight (quality assurance) [1]
  - Identifying testing risks and appropriate risk mitigation approaches
• To categorize testing problems for metrics collection, analysis, and reporting
• As an aid to identify testing areas potentially needing improvement during project post mortems (post implementation reviews)

Although each of these testing problems has been observed on multiple projects, it is entirely possible that you may have testing problems not addressed by this document.

[1] Not all testing problems have the same probability or harm severity. These problem specifications are not intended to be used as part of a quantitative scoring scheme based on the number of problems found. Instead, they are offered to support qualitative review and planning.

1.2 Problem Specifications

The following tables document each testing problem with the following information (a minimal sketch of how such a specification might be represented as a data structure appears at the end of this introduction):

• Title – a short descriptive name for the problem
• Description – a brief definition of the problem
• Potential Symptoms (how you will know) – potential symptoms that indicate the possible existence of the problem
• Potential Consequences (why you should care) – potential negative consequences to expect if the problem is not avoided or solved [2]
• Potential Causes – potential root and proximate causes of the problem [3]
• Recommendations (what you should do) – recommended (prepare, enable, perform, and verify) actions to take to avoid or solve the problem [4]
• Related Problems – a list of links to other related testing problems

[2] Note that the occurrence of a potential consequence may be a symptom by which the problem is recognized.
[3] Causes are important because recommendations should be based on them. Also, recommendations addressing root causes may be more important than those addressing proximate causes, because recommendations addressing proximate causes may not combat the root cause and therefore may not prevent the problem under all circumstances.
[4] Some of the recommendations may no longer be practical after the problem rears its ugly head. It is usually much easier to avoid a problem or nip it in the bud than to fix it when the project is well along or near completion. For example, several possible ways exist to deal with inadequate time to complete testing, including (1) delaying the test completion date and rescheduling testing, or (2) keeping the test completion date and (a) reducing the scope of delivered capabilities, (b) reducing the amount of testing, (c) adding testers, or (d) performing more testing in parallel (e.g., different types of testing simultaneously). Selection of the appropriate recommendations to follow therefore depends on the actual state of the project.

1.3 Problem Interpretation

The goal of testing is not to prove that something works, but rather to demonstrate that it does not. [5] A good tester assumes that there are always defects (an extremely safe assumption) and seeks to uncover them.

[5] Although tests that pass are often used as evidence that the system (or subsystem) under test meets its (derived and allocated) requirements, testing can never be exhaustive for even a simple system and therefore cannot "prove" that all requirements are met. However, system and operational testing can provide evidence that the system under test is "fit for purpose" and ready to be placed into operation. For example, certain types of testing may provide evidence required for safety and security accreditation and certification. Nevertheless, a tester must take a "show it fails" rather than a "show it works" mindset to be effective.
Thus, a good test is one that causes the thing being tested to fail so that the underlying defect(s) can be found and fixed. [6]

Defects are not restricted to violations of specified (or unspecified) requirements. Some of the other important types of defects are:

• inconsistencies between the architecture, design, and implementation
• violations of coding standards
• lack of input checking (i.e., unexpected data)
• the inclusion of safety or security vulnerabilities (e.g., the use of inherently unsafe language features or the lack of verification of input data)

[6] Note that testing cannot identify all defects because some defects (e.g., the failure to implement missing requirements) do not cause the system to fail in a manner detectable by testing.
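As a concrete illustration of the problem specification structure described in Section 1.2, the sketch below shows one way such a specification could be captured as a simple data structure, for example to support the checklist and metrics-categorization uses listed in Section 1.1. This is an illustrative assumption, not part of the report itself; the field names simply mirror the specification headings above.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class ProblemSpecification:
        """One entry in the catalog of common testing problems (Section 1.2)."""
        identifier: str  # e.g., "GEN-TPS-1"
        title: str       # short descriptive name for the problem
        description: str # brief definition of the problem
        potential_symptoms: List[str] = field(default_factory=list)
        potential_consequences: List[str] = field(default_factory=list)
        potential_causes: List[str] = field(default_factory=list)
        recommendations: List[str] = field(default_factory=list)   # prepare/enable/perform/verify actions
        related_problems: List[str] = field(default_factory=list)  # identifiers of related problems

    # Example: categorizing an observed project problem for metrics reporting.
    no_separate_test_plan = ProblemSpecification(
        identifier="GEN-TPS-1",
        title="No Separate Test Plan",
        description="There are no separate testing-specific planning documents.",
        potential_symptoms=["There is no separate TEMP or STP."],
    )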
2 Testing Problems

The commonly occurring testing problems documented in this section are categorized as either general testing problems or testing type specific problems.

2.1 General Testing Problems

The following testing problems can occur regardless of the type of testing being performed:

• Test Planning and Scheduling Problems
• Stakeholder Involvement and Commitment Problems
• Management-related Testing Problems
• Test Organization and Professionalism Problems
• Test Process Problems
• Test Tools and Environments Problems
• Test Communication Problems
• Requirements-related Testing Problems

2.1.1 Test Planning and Scheduling Problems

The following testing problems are related to test planning and scheduling:

• GEN-TPS-1 No Separate Test Plan
• GEN-TPS-2 Incomplete Test Planning
• GEN-TPS-3 Test Plans Ignored
• GEN-TPS-4 Test Case Documents rather than Test Plans
• GEN-TPS-5 Inadequate Test Schedule
• GEN-TPS-6 Testing is Postponed

2.1.1.1 GEN-TPS-1 No Separate Test Plan

Description: There are no separate testing-specific planning documents.

Potential Symptoms:
• There is no separate Test and Evaluation Master Plan (TEMP) or System/Software Test Plan (STP).
• There are only incomplete high-level overviews of testing in the System Engineering Master Plan (SEMP) and System/Software Development Plan (SDP).

Potential Consequences:
• The test planning parts of these other documents are not written by testers.
• Testing is not adequately planned.
• The test plans are not adequately documented.
• It is difficult or impossible to evaluate the planned testing process.
• Testing is inefficiently and ineffectively performed.
Potential Causes:
• The customer has not specified the development and delivery of a separate test plan.
• The system engineering, software engineering, or testing process has not included the development of a separate test plan.
• There was no template for the content and format of a separate test plan.
• Management, the customer representative, or the testers did not understand the:
  - scope, complexity, and importance of testing
  - value of a separate test plan

Recommendations:
• Prepare:
  - Reuse or create a standard template and content/format standard for test plans.
  - Include one or more separate TEMPs and/or STPs as deliverable work products in the contract.
  - Include the development and delivery of test planning documents in the project master schedule (e.g., as part of major milestones).
• Enable:
  - Provide sufficient resources (staffing and schedule) for the development of one or more separate test plans.
• Perform:
  - Develop and deliver one or more separate TEMPs and/or STPs.
• Verify:
  - Verify the existence and delivery of one or more separate test planning documents.
  - Do not accept incomplete high-level overviews of testing in the SEMP and/or SDP as the only test planning documentation.

2.1.1.2 GEN-TPS-2 Incomplete Test Planning

Description: The test planning documents are incomplete.

Potential Symptoms:
• The test planning documents are incomplete, missing some or all [7] of the:
  - references – listing of all relevant documents influencing testing
  - test goals and objectives – listing the high-level goals and subordinate objectives of the testing program
  - scope of testing – listing the component(s), functionality, and/or capabilities to be tested (and any that are not to be tested)
  - test levels – listing and describing the relevant levels of testing (e.g., unit, subsystem integration, system integration, system, and system of systems testing)
  - test types – listing and describing the types of testing, such as:
    - blackbox, graybox, and whitebox testing
    - developmental vs. acceptance testing
    - initial vs. regression testing
    - manual vs. automated testing
    - mode-based testing (system start-up [8], operational mode, degraded mode, training mode, and system shutdown)
    - normal vs. abnormal behavior (i.e., nominal vs. off-nominal, sunny day vs. rainy day use case paths)
    - quality-criteria-based testing, such as availability, capacity (e.g., load and stress testing), interoperability, performance, reliability, robustness [9], safety, security (e.g., penetration testing), and usability testing
    - static vs. dynamic testing
    - time- or date-based testing
  - testing methods and techniques – listing and describing the planned testing methods and techniques (e.g., boundary value testing, penetration testing, fuzz testing, alpha and beta testing) to be used, including the associated:
    - test case selection criteria – listing and describing the criteria to be used to select test cases (e.g., interface-based, use-case path, boundary value testing, and error guessing)
    - test entrance criteria – listing the criteria that must hold before testing should begin
    - test exit/completion criteria – listing the test completion criteria (e.g., based on different levels of code coverage such as statement, branch, and condition coverage); a minimal sketch of a coverage-based exit criterion appears after this problem's specification
    - test suspension and resumption criteria
  - test completeness and rigor – describing how the rigor and completeness of the testing varies as a function of mission-, safety-, and security-criticality
  - resources:
    - staffing – listing the different testing roles and teams, their responsibilities, their associated qualifications (e.g., expertise, training, and experience), and their numbers
    - environments – listing and describing the required computers (e.g., laptops and servers), test tools (e.g., debuggers and test management tools), test environments (software and hardware test beds), and test facilities
  - testing work products – listing and describing the testing work products to be produced or obtained, such as test documents (e.g., plans and reports), test software (e.g., test drivers and stubs), test data (e.g., inputs and expected outputs), test hardware, and test environments
  - testing tasks – listing and describing the major testing tasks (e.g., name, objective, preconditions, inputs, steps, postconditions, and outputs)
  - testing schedule – listing and describing the major testing milestones and activities in the context of the project development cycle, schedule, and major project milestones
  - reviews, metrics, and status reporting – listing and describing the test-related reviews (e.g., Test Readiness Review), test metrics (e.g., number of tests developed and run), and status reports (e.g., content, frequency, and distribution)
  - dependencies of testing on other project activities – such as the need to incorporate certain hardware and software components into test beds before testing using those environments can begin
  - acronym list and glossary

Potential Consequences:
• Testers and stakeholders in testing may not understand the primary objective of testing (i.e., to find defects so that they can be fixed).
• Some levels and types of tests may not be performed, allowing certain types of residual defects to remain in the system.
• Some testing may be ad hoc and therefore inefficient and ineffectual.
• Mission-, safety-, and security-critical software may not be sufficiently tested to the appropriate level of rigor.
• Certain types of test cases may be ignored, resulting in related residual defects in the tested system.
• Test completion criteria may be based more on schedule deadlines than on the required degree of freedom from defects.
• Adequate amounts of test resources (e.g., testers, test tools, environments, and test facilities) may not be made available because they are not in the budget.
• Some testers may not have adequate expertise, experience, and skills to perform all of the types of testing that need to be performed.

Potential Causes:
• There were no templates or content and format standards for separate test plans.
• The associated templates or content and format standards were incomplete.
• The test planning documents were written by people (e.g., managers or developers) who did not understand the scope, complexity, and importance of testing.

Recommendations:
• Prepare:
  - Reuse or create a standard template and/or content/format standard for test plans.
• Enable:
  - Provide sufficient resources (staffing and schedule) to develop complete test plan(s).
• Perform:
  - Use a proper template and/or content/format standard to develop the test plans (i.e., ones that are derived from test plan standards and tailored if necessary for the specific project).
• Verify:
  - Verify during inspections/reviews that all test plans are sufficiently complete.
  - Do not accept incomplete test plans.

Related Problems: GEN-TOP-2 Unclear Testing Responsibilities, GEN-PRO-8 Inadequate Test Evaluations, GEN-TTE-7 Tests Not Delivered, TTS-SPC-1 Inadequate Capacity Requirements, TTS-SPC-2 Inadequate Concurrency Requirements, TTS-SPC-3 Inadequate Performance Requirements, TTS-SPC-4 Inadequate Reliability Requirements, TTS-SPC-5 Inadequate Robustness Requirements, TTS-SPC-6 Inadequate Safety Requirements, TTS-SPC-7 Inadequate Security Requirements, TTS-SPC-8 Inadequate Usability Requirements, TTS-SoS-1 Inadequate SoS Planning, TTS-REG-5 Disagreement over Maintenance Test Resources

[7] This does not mean that every test plan must include all of this information; test plans should include only the information that is relevant to the current project. It is quite reasonable to reuse much or most of this information in multiple test plans; just because it is highly reusable does not mean that it is meaningless boilerplate that can be ignored. Test plans can be used to estimate the amount of test resources (e.g., time and tools) needed as well as the skills and expertise that the testers need.
[8] This includes combinations such as the testing of system start-up when hardware/software components fail.
[9] This includes the testing of error, fault, and failure tolerance.
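As an illustration of the coverage-based exit criteria listed above, the sketch below uses the coverage.py API to run a unit test suite under branch coverage and enforce a minimum-coverage completion criterion. The 80% threshold and the tests/ directory are assumptions invented for the example, not values this report recommends.

    import sys
    import unittest
    import coverage  # third-party package: coverage.py

    # Measure statement and branch coverage while the test suite runs.
    cov = coverage.Coverage(branch=True)
    cov.start()
    suite = unittest.defaultTestLoader.discover("tests")  # assumed test directory
    result = unittest.TextTestRunner().run(suite)
    cov.stop()
    cov.save()

    # Exit criterion: the tests must pass AND coverage must meet the planned threshold.
    COVERAGE_EXIT_CRITERION = 80.0  # assumed threshold; the real value belongs in the test plan
    total_percent = cov.report()    # prints a report and returns the total coverage percentage
    if not result.wasSuccessful() or total_percent < COVERAGE_EXIT_CRITERION:
        sys.exit("Test exit criterion not met: fix failures or add tests.")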
2.1.1.3 GEN-TPS-3 Test Plans Ignored

Description: The test plans are ignored once developed and delivered.

Potential Symptoms:
• The way the testers perform testing is not consistent with the relevant test plan(s).
• The test plan(s) are never updated after their initial delivery shortly after the start of the project.

Potential Consequences:
• Management may not have budgeted sufficient funds to pay for the necessary test resources (e.g., testers, test tools, environments, and test facilities).
• Management may not have made adequate amounts of test resources available because they are not in the budget.
• Testers will not have an approved document that justifies:
  - their request for additional needed resources when they need them
  - their insistence that certain types of testing are necessary and must not be dropped when the schedule becomes tight
• Some testers may not have adequate expertise, experience, and skills to perform all of the types of testing that need to be performed.
• The test plan may not be maintained.
• Some levels and types of tests may not be performed, so that certain types of residual defects remain in the system.
• Some important test cases may not be developed and executed.
• Mission-, safety-, and security-critical software may not be sufficiently tested to the appropriate level of rigor.
• Test completion criteria may be based more on schedule deadlines than on the required degree of freedom from defects.
Potential Causes:
• The testers may have forgotten some of the test plan contents.
• The testers may have thought that the only reason a test plan was developed was because it was a deliverable in the contract that needed to be checked off.
• The test plan(s) may be so incomplete and at such a generic, high level of abstraction as to be relatively useless.

Recommendations:
• Prepare:
  - Have project management (both administrative and technical), testers, and quality assurance personnel read and review the test plan.
  - Have management (acquisition and project) sign off on the completed test plan document.
  - Use the test plan as input to the project master schedule and work breakdown structure (WBS).
• Enable:
  - Develop a short checklist from the test plan(s) for use when assessing the performance of testing.
• Perform:
  - Have the test manager periodically review the test work products and the as-performed test process against the test plan(s).
  - Have the test team update the test plan(s) as needed.
• Verify:
  - Have the testers present their work and status at project and test-team status meetings.
  - Have quality engineering periodically review the test work products (quality control) and the as-performed test process (quality assurance).
  - Have progress, productivity, and quality test metrics collected, analyzed, and reported to project and customer management.

Related Problems: GEN-TPS-2 Incomplete Test Planning

2.1.1.4 GEN-TPS-4 Test Case Documents rather than Test Plans

Description: Documents containing specific test cases are labeled as test plans.

Potential Symptoms:
• The "test plan(s)" contain specific test cases, including inputs, test steps, expected outputs, and sources such as specific requirements (blackbox testing) or design decisions (whitebox testing).
• The test plans do not contain the type of general planning information listed in GEN-TPS-2 Incomplete Test Planning.

Potential Consequences:
• All of the negative consequences of GEN-TPS-2 Incomplete Test Planning may occur.
• The test case documents may not be maintained.
Potential Causes:
• There may have been no template or content/format standard for the test case documents.
• The test plan authors may not have had adequate expertise, experience, and skills to develop test plans or to know their proper content.

Recommendations:
• Prepare:
  - Provide the test manager and testers with at least minimal training in test planning.
• Enable:
  - Provide a proper test plan template.
  - Provide a proper content and format standard for test plans.
  - Add the terms test plan and test case document to the project technical glossary.
• Perform:
  - Develop the test plan in accordance with the test plan template or content and format standard.
  - Develop the test case documents in accordance with the test case document template and/or content and format standard.
  - Where practical, automate the test cases so that the resulting tests (extended with comments) replace the test case documents, making the distinction clear (i.e., the test plan is a document meant to be read, whereas the test case is meant to be executable); a minimal sketch of such an automated, self-documenting test case follows this problem's specification.
• Verify:
  - Have the test plan(s) reviewed against the associated template or content and format standard prior to acceptance.

Related Problems: GEN-TPS-2 Incomplete Test Planning
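The sketch below illustrates the recommendation above: an executable test case whose comments carry the information a test case document would otherwise hold (test case identifier, source requirement, inputs, and expected outputs). The function under test (compute_discount) and the identifiers are hypothetical, invented only for this example.

    import unittest

    def compute_discount(order_total: float) -> float:
        """Hypothetical function under test: 10% discount on orders of 100.00 or more."""
        return order_total * 0.9 if order_total >= 100.00 else order_total

    class TestComputeDiscount(unittest.TestCase):
        # Test case: TC-017 (hypothetical identifier)
        # Source requirement: REQ-PRICING-042 (blackbox; hypothetical identifier)
        # Input: order total exactly at the discount boundary (100.00)
        # Expected output: discounted total of 90.00
        def test_discount_applied_at_boundary(self):
            self.assertAlmostEqual(compute_discount(100.00), 90.00)

        # Input: order total just below the boundary (99.99)
        # Expected output: unchanged total (no discount)
        def test_no_discount_just_below_boundary(self):
            self.assertAlmostEqual(compute_discount(99.99), 99.99)

    if __name__ == "__main__":
        unittest.main()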
2.1.1.5 GEN-TPS-5 Inadequate Test Schedule

Description: The testing schedule is inadequate to permit proper testing.

Potential Symptoms:
• Testing is significantly incomplete and behind schedule.
• Insufficient time is allocated in the project master schedule to perform all:
  - test activities (e.g., automating testing, configuring test environments, and developing test data, test scripts/drivers, and test stubs)
  - appropriate tests (e.g., abnormal behavior, quality requirements, and regression testing) [10]
• Testers are working excessively and unsustainably long hours and days per week in an attempt to meet schedule deadlines.

Potential Consequences:
• Testers are exhausted and therefore making an unacceptably large number of mistakes.
• Tester productivity (e.g., importance of defects found and number of defects found per unit time) is decreasing.
• Customer representatives, managers, and developers have a false sense of security that the system functions properly.
• There is a significant probability that the system or software will be delivered late with an unacceptably large number of residual defects.

Potential Causes:
• The overall project schedule was insufficient.
• The size and complexity of the system were underestimated.
• The project master plan was written by people (e.g., managers, chief engineers, or technical leads) who do not understand the scope, complexity, and importance of testing.
• The project master plan was developed without input from the test team(s).

Recommendations:
• Prepare:
  - Provide evidence-based estimates of the amount of testing and associated test effort that will be needed.
  - Ensure that adequate time for testing is included in the program master schedule and test team schedules, including the testing of abnormal behavior and the specialty engineering testing of quality requirements (e.g., load testing for capacity requirements and penetration testing for security requirements). [11]
  - Provide adequate time for testing in change request estimates.
• Enable:
  - Deliver inputs to the testing process (e.g., requirements, architecture, design, and implementation) earlier and more often (e.g., as part of an incremental, iterative, parallel – agile – development cycle).
  - Provide sufficient test resources (e.g., number of testers, test environments, and test tools).
  - If at all possible, do not reduce the testing effort in order to meet a delivery deadline.
• Perform:
  - Automate as much of the regression testing as is practical, and allocate sufficient resources to maintain the automated tests. [12]
• Verify:
  - Verify that the amount of time scheduled for testing is consistent with the evidence-based estimates of needed time.

Related Problems: TTS-SoS-5 SoS Testing Not Properly Scheduled

[10] Note that an agile (i.e., iterative, incremental, and concurrent) development/life cycle greatly increases the amount of regression testing needed (although this increase in testing can be largely offset by highly automating regression tests). Although testing can never be exhaustive, more time is typically needed for adequate testing unless testing can be made more efficient. For example, fewer defects could be produced, and these defects could be found and fixed earlier, thereby preventing them from reaching the current iteration.
[11] Also integrate the testing process into the software development process.
[12] When there is insufficient time to perform manual testing, it may be difficult to justify the automation of these tests. However, automating regression testing is not just a maintenance issue. Even during initial development, there should typically be a large amount of regression testing, especially if an iterative and incremental development cycle is used. Thus, ignoring the automation of regression testing is often a case of being penny wise and pound foolish.
2.1.1.6 GEN-TPS-6 Testing is Postponed

Description: Testing is postponed until late in the development schedule.

Potential Symptoms:
• Testing is scheduled to be performed late in the development cycle on the project master schedule.
• Little or no unit or integration testing:
  - is planned
  - is being performed during the early and middle stages of the development cycle

Potential Consequences:
• There is insufficient time left in the schedule to correct any major defects found. [13]
• It is difficult to show the required degree of test coverage.
• Because so much of the system has been integrated before the beginning of testing, it is very difficult to find and localize defects that remain hidden within the internals of the system.

Potential Causes:
• The project is using a strictly interpreted, traditional sequential Waterfall development cycle.
• Management was not able to staff the testing team early in the development cycle.
• Management was primarily interested in system testing and did not recognize the need for lower-level (e.g., unit and integration) testing.

Recommendations:
• Prepare:
  - Plan and schedule testing to be performed iteratively, incrementally, and in a parallel manner (i.e., agile) starting early during development.
  - Provide training in iterative and incremental testing.
  - Incorporate iterative and incremental testing into the project's system/software engineering process.
• Enable:
  - Provide adequate testing resources (staffing, tools, budget, and schedule) early during development.
• Perform:
  - Perform testing in an iterative, incremental, and parallel manner starting early in the development cycle.

[13] An interesting example of this is the Hubble telescope. Testing of the mirror's focusing was postponed until after launch, resulting in an incredibly expensive repair mission.
• Verify:
  - Verify in an ongoing manner (or at the very least during major project milestones) that testing is being performed iteratively, incrementally, and in parallel with design, implementation, and integration.
  - Use testing metrics to verify status and ongoing progress.

Related Problems: GEN-PRO-1 Testing and Engineering Process not Integrated

2.1.2 Stakeholder Involvement and Commitment Problems

The following testing problems are related to stakeholder involvement in and commitment to the testing effort:

• GEN-SIC-1 Wrong Testing Mindset
• GEN-SIC-2 Unrealistic Testing Expectations / False Sense of Security
• GEN-SIC-3 Lack of Stakeholder Commitment

2.1.2.1 GEN-SIC-1 Wrong Testing Mindset

Description: Some of the testers and other testing stakeholders have the wrong testing mindset.

Potential Symptoms:
• Some testers and other testing stakeholders begin testing assuming that the system/software works.
• Testers believe that their job is to verify or "prove" that the system/software works. [14]
• Testing is used to demonstrate that the system/software works properly rather than to determine where and how it fails.
• Only normal ("sunny day", "happy path", or "golden path") behavior is being tested.
• There is little or no testing of:
  - exceptional or fault/failure-tolerant ("rainy day") behavior
  - input data (e.g., range testing to identify incorrect handling of invalid input values)
• Test inputs only include middle-of-the-road values rather than boundary values and corner cases.

Potential Consequences:
• There is a high probability that:
  - the delivered system or software will contain a significant number of residual defects, especially related to abnormal behavior (e.g., exceptional use case paths)
  - these defects will unacceptably reduce its reliability and robustness (e.g., error, fault, and failure tolerance)
• Customer representatives, managers, and developers have a false sense of security that the system functions properly.

[14] Using testing to "prove" that their software works is most likely to become a problem when developers test their own software (e.g., with unit testing and with small cross-functional or agile teams).
Potential Causes:
• Testers were taught or explicitly told that their job is to verify or "prove" that the system/software works.
• Developers are testing their own software [15], so that there is a "conflict of interest" (i.e., build software that works versus show that their software does not work). This is especially a problem with small, cross-functional development organizations/teams that "cannot afford" to have separate testers (i.e., professional testers who specialize in testing).
• There was insufficient schedule allocated for testing, so that there is only enough time to test the normal behavior (e.g., use case paths).
• The organizational culture is very success oriented, so that looking "too hard" for problems is (implicitly) discouraged.
• Management gave the testers the strong impression that they do not want to hear any "bad" news (i.e., that there are any significant defects being found in the system).

Recommendations:
• Prepare:
  - Explicitly state in the project test plan that the primary goal of testing is to:
    - find defects by causing system faults and failures, rather than to demonstrate that there are no defects
    - break the system, rather than to prove that it works
• Enable:
  - Provide test training that emphasizes uncovering defects by causing faults or failures.
  - Provide sufficient time in the schedule for testing beyond the basic success paths.
  - Hire new testers who exhibit a strong "destructive" mindset to testing.
• Perform:
  - In addition to test cases that verify all normal behavior, emphasize looking for defects where they are most likely to hide (e.g., boundary values, corner cases, and input type/range verification); a minimal sketch of such boundary-value tests follows this problem's specification. [16]
  - Incentivize testers based more on the number of significant defects they uncover than merely on the number of requirements "verified" or test cases run. [17]
  - Foster a healthy competition between developers (who seek to avoid inserting defects) and testers (who seek to find those defects).
• Verify:
  - Verify that the testers exhibit a testing mindset.

Related Problems: GEN-MGMT-2 Inappropriate External Pressures, GEN-COM-4 Inadequate Communication Concerning Testing, TTS-UNT-3 Unit Testing Considered Unimportant

[15] Developers typically do their own unit-level (i.e., lowest-level) testing. With small, cross-functional (e.g., agile) teams, it is becoming more common for developers to also do integration and subsystem testing.
[16] Whereas tests that verify nominal behavior are essential, testers must keep in mind that there are typically many more ways for the system/software under test to fail than to work properly. Also, nominal tests must remain part of the regression test suite even after all known defects are fixed, because changes could introduce new defects that cause nominal behavior to fail.
[17] Take care to avoid incentivizing developers to insert defects into their own software so that they can then find them during testing.
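A minimal sketch of the boundary-value and input-range testing recommended above. The function under test (parse_percentage, accepting integers 0 through 100) is hypothetical; the point is that the test inputs probe the boundaries, the values just outside them, and invalid types, rather than only middle-of-the-road values.

    import unittest

    def parse_percentage(value) -> int:
        """Hypothetical input-validation function: accepts integers 0-100 only."""
        if not isinstance(value, int) or isinstance(value, bool):
            raise TypeError("percentage must be an integer")
        if not 0 <= value <= 100:
            raise ValueError("percentage must be between 0 and 100")
        return value

    class TestParsePercentageBoundaries(unittest.TestCase):
        def test_boundary_values_accepted(self):
            # Exact boundaries (0 and 100) and the values just inside them.
            for value in (0, 1, 99, 100):
                self.assertEqual(parse_percentage(value), value)

        def test_values_just_outside_boundaries_rejected(self):
            # Values one step outside each boundary must fail, not be silently clamped.
            for value in (-1, 101):
                with self.assertRaises(ValueError):
                    parse_percentage(value)

        def test_invalid_input_types_rejected(self):
            # Range testing alone is not enough; unexpected data types must also fail cleanly.
            for value in (None, "50", 49.5, True):
                with self.assertRaises(TypeError):
                    parse_percentage(value)

    if __name__ == "__main__":
        unittest.main()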
2.1.2.2 GEN-SIC-2 Unrealistic Testing Expectations / False Sense of Security

Description: Testers and other testing stakeholders have unrealistic testing expectations that generate a false sense of security.

Potential Symptoms:
• Testing stakeholders (e.g., managers and customer representatives) and some testers falsely believe that:
  - Testing detects all (or even the majority of) defects. [18]
  - Testing proves that there are no remaining defects and that the system therefore works as intended.
  - Testing can be, for all practical purposes, exhaustive.
  - Testing can be relied on for all verification. (Note that some requirements are better verified via analysis, demonstration, certification, and inspection.)
  - Test automation will guarantee the quality of the tests and reduce the testing effort. [19]
• Managers and other testing stakeholders may not understand that:
  - Test automation requires specialized expertise and needs to be budgeted for the effort required to develop, verify, and maintain the automated tests.
  - A passed test could result from a weak or incorrect test rather than from a lack of defects.
  - A truly successful/useful test is one that finds one or more defects, whereas a passed test only shows that the system worked in that single specific instance.

Potential Consequences:
• Testers and other testing stakeholders have a false sense of security that the system or software will work properly on delivery and deployment.
• Non-testing forms of verification (e.g., analysis, demonstration, inspection, and simulation) are not given adequate emphasis.

Potential Causes:
• Testing stakeholders and testers were not exposed to research results that document the relatively large percentage of residual defects that typically remain after testing.
• Testers and testing stakeholders have not been trained in verification approaches other than testing (e.g., analysis, demonstration, and inspection) and their relative pros and cons.
• Project testing metrics do not include estimates of residual defects.

[18] Testing typically finds less than half of all latent defects and is not the most efficient way of detecting many defects.
[19] This depends on the development cycle and the volatility of the system's requirements, architecture, design, and implementation.
Recommendations:
• Prepare:
  - Collect information on the limitations of testing.
  - Collect information on when and how to augment testing with other types of verification.
• Enable:
  - Provide basic training in verification methods, including their associated strengths and limitations.
• Perform:
  - Train and mentor managers, customer representatives, testers, and other test stakeholders concerning the limits of testing:
    - Testing will not detect all (or even a majority of) defects.
    - No testing is truly exhaustive.
    - Testing cannot prove (or demonstrate) that the system works under all combinations of preconditions and trigger events.
    - A passed test could result from a weak test rather than from a lack of defects (see the sketch after this problem's specification).
    - A truly successful test is one that finds one or more defects.
  - Do not rely on testing for the verification of all requirements, especially architecturally significant quality requirements.
  - Collect, analyze, and report testing metrics that estimate the number of defects remaining after testing.
• Verify:
  - Verify that testing stakeholders understand the limitations of testing.
  - Verify that testing is not the only type of verification being used.
  - Verify that the number of defects remaining is estimated and reported.

Related Problems: GEN-MGMT-2 Inappropriate External Pressures, GEN-COM-4 Inadequate Communication Concerning Testing, TTS-REG-2 Regression Testing not Performed
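A minimal illustration of the "weak test" point above, using a hypothetical rounding function invented for the example: the weak test passes even though the implementation is defective, because its assertions are too loose; the strengthened test checks the exact expected value, fails, and exposes the defect.

    import unittest

    def round_to_cents(amount: float) -> float:
        """Hypothetical function under test. Defect: truncates instead of rounding."""
        return int(amount * 100) / 100

    class TestRoundToCents(unittest.TestCase):
        def test_weak(self):
            # Weak test: passes despite the defect, because the assertions only
            # check the result's type and rough magnitude.
            result = round_to_cents(1.004)
            self.assertIsInstance(result, float)
            self.assertLess(abs(result - 1.0), 0.01)

        def test_strong(self):
            # Stronger test: checks the exact expected value and FAILS,
            # revealing that 1.019 is truncated to 1.01 instead of rounded to 1.02.
            self.assertEqual(round_to_cents(1.019), 1.02)

    if __name__ == "__main__":
        unittest.main()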
2.1.2.3 GEN-SIC-3 Lack of Stakeholder Commitment

Description: There is a lack of adequate stakeholder commitment to the testing effort.

Potential Symptoms:
• Stakeholders (especially customers and management) are not providing sufficient resources (e.g., people, schedule, tools, and funding) for the testing effort.
• Stakeholders are unavailable for the review of test assets such as test plans and important test cases.
• Stakeholders (e.g., customer representatives) point out defects in test assets only after those assets have been reviewed.
• Stakeholders do not support testing when resources must be cut (e.g., due to schedule slippages and budget overruns).

Potential Consequences:
• Testing is less effective due to inadequate resources.
• Stakeholders (e.g., customer representatives) reject reviewed test assets.
• The testing effort loses needed resources when the schedule slips or the budget overruns.

Potential Causes:
• Stakeholders did not understand the scope, complexity, and importance of testing.
• Stakeholders were not provided adequate estimates of the resources needed to properly perform testing.
• Stakeholders were extremely busy with other duties.
• The overall project schedule and budget estimates were inadequate, thereby forcing cuts in testing.

Recommendations:
• Prepare:
  - Convey the scope, complexity, and importance of testing to the testing stakeholders.
• Enable:
  - Provide stakeholders with adequate estimates of the resources needed to properly perform testing.
• Perform:
  - Officially request sufficient testing resources from the testing stakeholders.
  - Obtain commitments of support from authoritative stakeholders at the beginning of the project.
• Verify:
  - Verify that the testing stakeholders are providing sufficient resources (e.g., people, schedule, tools, and funding) for the testing effort.

Related Problems: GEN-MGMT-1 Inadequate Test Resources, GEN-MGMT-5 Test Lessons Learned Ignored, GEN-MGMT-2 Inappropriate External Pressures, GEN-COM-4 Inadequate Communication Concerning Testing, TTS-SoS-4 Inadequate Funding for SoS Testing, TTS-SoS-6 Inadequate Test Support from Individual Systems

2.1.3 Management-related Testing Problems

The following testing problems are related to the management of the testing effort:

• GEN-MGMT-1 Inadequate Test Resources
• GEN-MGMT-2 Inappropriate External Pressures
• GEN-MGMT-3 Inadequate Test-related Risk Management
• GEN-MGMT-4 Inadequate Test Metrics
• GEN-MGMT-5 Test Lessons Learned Ignored
2.1.3.1 GEN-MGMT-1 Inadequate Test Resources

Description: Management allocates an inadequate amount of resources to testing.

Potential Symptoms:
• The test planning documents and schedules fail to provide for adequate test resources, such as:
  - test time in the schedule, with inadequate schedule reserves
  - trained and experienced testers and reviewers
  - funding
  - test tools and environments (e.g., integration test beds and repositories of test data)

Potential Consequences:
• Adequate test resources will likely not be provided to perform sufficient testing within schedule and budget limitations.
• An unnecessary number of defects may make it through testing and into the deployed system.

Potential Causes:
• Testing stakeholders may not understand the scope, complexity, and importance of testing, and therefore its impact on the resources needed to properly perform testing.
• Estimates of needed testing resources may not be based on any evidence-based cost/effort models.
• Resource estimates may be informally made by management without input from the testing organization, especially from those testers who will actually be performing the testing tasks.
• Resource estimates may be based on available resources rather than on resource needs.
• Management may believe that the testers have padded their estimates and may therefore cut the testers' estimates.
• Testers and testing stakeholders may be overly optimistic, so that their informal estimates of needed resources are based on best-case scenarios rather than on most likely or worst-case scenarios.

Recommendations:
• Prepare:
  - Ensure that testing stakeholders understand the scope, complexity, and importance of testing, and therefore its impact on the resources needed to properly perform testing.
• Enable:
  - Begin test planning at project inception (e.g., at contract award or during proposal development).
  - Train testers in the use of evidence-based cost/effort models to estimate the amount of testing resources needed.
• Perform:
  - Use evidence-based cost/effort models to estimate the needed testing resources.
  - Officially request sufficient testing resources from the testing stakeholders.
  - Ensure that the test planning documents, schedules, and project work breakdown structure (WBS) provide for adequate levels of these test resources.
  - Obtain commitments of support from authoritative stakeholders at the beginning of the project.
• Verify:
  - Verify that the testing stakeholders are providing sufficient resources (e.g., people, schedule, tools, and funding) for the testing effort.

Related Problems: GEN-SIC-3 Lack of Stakeholder Commitment, GEN-TOP-3 Inadequate Testing Expertise

2.1.3.2 GEN-MGMT-2 Inappropriate External Pressures

Description: Testers are subject to inappropriate external pressures, primarily from managers.

Potential Symptoms:
• Managers (or possibly customers or developers) are dictating to the testers what constitutes a bug or a defect worth reporting.
• Managerial pressure exists to:
  - inappropriately cut corners (e.g., only perform "sunny day" testing) in order to meet schedule deadlines
  - inappropriately lower the severity and priority of reported defects
  - not find defects (e.g., until after delivery, because the project is so far behind schedule that there is no time to fix any defects found)

Potential Consequences:
• If the testers yield to this pressure, then the test metrics do not accurately reflect either the true state of the system/software or the status of the testing process.
• The delivered system or software contains an unacceptably large number of residual defects.

Potential Causes:
• The project is significantly behind schedule and/or over budget.
• There is insufficient time until the delivery/release date to fix a significant number of the defects found via testing.
• The project is in danger of being cancelled due to lack of performance.
• Management is highly risk averse and therefore did not want to officially label any testing risk as a risk.

Recommendations:
• Prepare:
  - Establish criteria for determining the priority and severity of reported defects; a minimal sketch of such criteria expressed as code follows this problem's specification.
• Enable:
  - Ensure that trained testers determine what constitutes a bug or a defect worth reporting.
  - Place the manager of the testing organization at the same or a higher level than the project manager in the organizational hierarchy (i.e., have the test manager report independently of the project manager). [20]
• Perform:
  - Support testers when they oppose any inappropriate managerial pressure that would have them violate their professional ethics.
  - Customer representatives must insist on proper testing.
• Verify:
  - Verify that the testers are the ones who decide what constitutes a reportable defect.
  - Verify that the testing manager reports independently of the project manager.

Related Problems: GEN-SIC-1 Wrong Testing Mindset, GEN-TOP-1 Lack of Independence

[20] Note that this will only help if the test manager is not below the manager applying improper pressure.
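One way to make the defect severity and priority criteria recommended above explicit and auditable is to encode them, so that a priority cannot be quietly lowered without a visible change to the rule. The severity scale and the mapping below are illustrative assumptions invented for the example, not criteria prescribed by this report.

    from enum import IntEnum

    class Severity(IntEnum):
        """Assumed severity scale; a real project would define its own criteria."""
        COSMETIC = 1
        MINOR = 2
        MAJOR = 3     # important function degraded, workaround exists
        CRITICAL = 4  # mission-, safety-, or security-critical impact

    def defect_priority(severity: Severity, likelihood: float) -> int:
        """Map severity and estimated likelihood (0.0-1.0) to a fix priority, 1 (highest) to 4.

        Encoding the rule makes it harder to informally lower the priority
        of reported defects under schedule pressure (GEN-MGMT-2).
        """
        if not 0.0 <= likelihood <= 1.0:
            raise ValueError("likelihood must be between 0.0 and 1.0")
        if severity is Severity.CRITICAL:
            return 1  # critical defects are always top priority, regardless of likelihood
        if severity is Severity.MAJOR and likelihood >= 0.1:
            return 2
        return 3 if severity is Severity.MAJOR else 4

    # Example: a major defect expected to affect many users gets priority 2.
    print(defect_priority(Severity.MAJOR, 0.5))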
2.1.3.3 GEN-MGMT-3 Inadequate Test-related Risk Management

Description: There are too few test-related risks identified in the project's official risk repository. [21]

Potential Symptoms:
• Managers are highly risk averse, treating "risk" as if it were a four-letter word.
• Because adding risks to the risk repository is looked on as a symptom of management failure, risks (including testing risks) are mislabeled as issues or concerns so that they need not be reported as official risks.
• There are few if any test-related risks identified in the project's official risk repository.
• The number of test-related risks is unrealistically low.
• The identified test-related risks have inappropriately low probabilities, low harm severities, and low priorities.
• The identified test risks have:
  - no associated risk mitigation approaches
  - no one assigned as being responsible for the risk
• The test risks are never updated (e.g., additions or modifications) over the course of the project.
• Testing risks are not addressed in either the test plan(s) or the risk management plan.

Potential Consequences:
• Testing risks are not reported.
• Management and acquirer representatives are unaware of their existence.
• Testing risks are not being managed.
• The management of testing risks is not given sufficiently high priority.

Potential Causes:
• Management is highly risk averse.

[21] These potential testing problems can be viewed as generic testing risks.
• Managers strongly communicate their preference that only a small number of the most critical risks be entered into the project risk repository.
• The people responsible for risk management and for managing the risk repository have never been trained in or exposed to the many potential test-related risks (e.g., those associated with the commonly occurring testing problems addressed in this document).
• The risk management process strongly emphasizes system-specific or system-level (as opposed to software-level) risks and tends not to address any development activity risks (such as those associated with testing).
• It is early in the development cycle, before sufficient testing has begun.
• There have been few if any evaluations of the testing process.
• There has been little if any oversight of the testing process.

Recommendations:
• Prepare:
  - Determine management's degree of risk aversion and attitude regarding the inclusion of risks in the project risk repository.
• Enable:
  - Ensure that the people responsible for risk management and for managing the risk repository are aware of the many potential test-related risks.
• Perform:
  - Identify test-related risks and incorporate them into the official project risk repository (a minimal sketch of a risk-repository entry follows this problem's specification).
  - Provide test-related risks with realistic probabilities, harm severities, and priorities.
• Verify:
  - Verify that the risk repository contains an appropriate number of testing risks.
  - Verify that there is sufficient management and quality assurance oversight and evaluation of the testing process.

Related Problems: GEN-SIC-2 Unrealistic Testing Expectations / False Sense of Security
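A minimal sketch of what a test-related risk entry might look like, reflecting the symptoms above: every risk carries a probability, a harm severity, a priority, a mitigation approach, and a named owner. The field names and example values are assumptions for illustration only.

    from dataclasses import dataclass

    @dataclass
    class TestRiskEntry:
        """One test-related risk in the project risk repository (illustrative)."""
        identifier: str
        statement: str
        probability: float        # estimated likelihood of occurrence, 0.0-1.0
        harm_severity: int        # e.g., 1 (negligible) to 4 (catastrophic)
        priority: int             # e.g., 1 (highest) to 4 (lowest)
        mitigation_approach: str  # symptom above: risks often lack this...
        owner: str                # ...and lack anyone responsible for the risk

        def __post_init__(self) -> None:
            # Reject entries that exhibit the symptoms described in GEN-MGMT-3.
            if not self.mitigation_approach.strip():
                raise ValueError("risk has no associated mitigation approach")
            if not self.owner.strip():
                raise ValueError("no one is assigned as responsible for the risk")

    risk = TestRiskEntry(
        identifier="RISK-TEST-007",
        statement="Regression tests are not automated; schedule slip likely.",
        probability=0.4,
        harm_severity=3,
        priority=2,
        mitigation_approach="Automate the highest-value regression tests first.",
        owner="Test manager",
    )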
2.1.3.4 GEN-MGMT-4 Inadequate Test Metrics

Description: Insufficient test metrics are being produced, analyzed, and reported.

Potential Symptoms:
• Insufficient or no test metrics are being produced, analyzed, and reported.
• The primary test metrics (e.g., number of tests [22], number of tests needed to meet adequate or required test coverage levels, number of tests passed/failed, and number of defects found) show neither the productivity of the testers nor their effectiveness at finding defects (e.g., defects found per test or per day).
• The number of latent undiscovered defects remaining is not being estimated (e.g., using COQUALMO [23]).
• Management measures tester productivity strictly in terms of defects found per unit time, ignoring the importance or severity of the defects found.

Potential Consequences:
• Managers, testers, and other stakeholders in testing do not accurately know the quality of testing, the importance of the defects being found, or the number of residual defects in the delivered system or software.
• Managers do not know the productivity of the testers or their effectiveness at finding important defects, thereby making it difficult to improve the testing process.
• Testers concentrate on finding lots of (unimportant) defects rather than finding critical defects (e.g., those with mission-critical, safety-critical, or security-critical ramifications).
• Customer representatives, managers, and developers have a false sense of security that the system functions properly.

Potential Causes:
• Project management (including the managers/leaders of test organizations/teams) is not familiar with the different types of testing metrics (e.g., quality, status, and productivity) that could be useful.
• Metrics collection, analysis, and reporting is at such a high level that individual disciplines (such as testing) are rarely assigned more than one or two highly generic metrics (e.g., "Inadequate testing is a risk").
• Project management (and testers) are only aware of backward-looking metrics (e.g., defects found and fixed) as opposed to forward-looking metrics (e.g., residual defects remaining to be found).

Recommendations:
• Prepare:
  - Provide testers and testing stakeholders with basic training in metrics, with an emphasis on test metrics.
• Enable:
  - Incorporate a robust metrics program, covering leading indicators, into the test plan.
  - Emphasize the finding of important defects.
• Perform:
  - Consider using some of the following representative examples of useful testing metrics (a minimal sketch of their computation follows this problem's specification):
    - number of defects found per test (test effectiveness metric)
    - number of defects found per tester day (tester productivity metric)
    - number of defects that slip through each verification milestone / inch pebble (e.g., reviews, inspections, and tests) [24]
    - estimated number of latent undiscovered defects remaining in the delivered system (e.g., estimated using COQUALMO)
  - Regularly collect, analyze, and report an appropriate set of testing metrics.
• Verify:
  - Important: Evaluate and maintain visibility into the as-performed testing process to ensure that it does not become metrics-driven. Watch out for signs that testers worry more about looking good (e.g., by concentrating on only the defects that are easy to find) than about finding the most important defects.
  - Verify that sufficient testing metrics are collected, analyzed, and reported.

Related Problems: None

[22] Note that the number-of-tests metric does not indicate the effort or complexity of identifying, analyzing, and fixing defects.
[23] COQUALMO (COnstructive QUALity MOdel) is an estimation model that can be used for predicting the number of residual defects per KSLOC (thousand source lines of code) or defects per FP (function point) in a software product.
[24] For example, what are the percentages of defects that manage to slip by architecture reviews, design reviews, implementation inspections, unit testing, integration testing, and system testing without being detected?
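A minimal sketch of how the first two metrics above might be computed from raw counts. All figures are invented for the example. For the residual-defect estimate, the sketch uses a simple two-reviewer capture-recapture (Lincoln-Petersen) formula as a stand-in, named plainly here because COQUALMO itself is a calibrated model that cannot be reproduced in a few lines.

    # Invented example data: raw counts from one test increment.
    tests_executed = 400
    defects_found = 36
    tester_days = 60

    test_effectiveness = defects_found / tests_executed  # defects found per test
    tester_productivity = defects_found / tester_days    # defects found per tester day

    # Stand-in residual-defect estimate via capture-recapture (NOT COQUALMO):
    # two independent reviews found 20 and 25 defects, 15 of them in common.
    found_by_a, found_by_b, found_by_both = 20, 25, 15
    estimated_total = (found_by_a * found_by_b) / found_by_both  # Lincoln-Petersen
    estimated_residual = estimated_total - (found_by_a + found_by_b - found_by_both)

    print(f"defects per test: {test_effectiveness:.3f}")
    print(f"defects per tester day: {tester_productivity:.2f}")
    print(f"estimated residual defects: {estimated_residual:.1f}")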
• Verify:
 - Important: Evaluate and maintain visibility into the as-performed testing process to ensure that it does not become metrics-driven. Watch for signs that testers worry more about looking good (e.g., by concentrating only on the defects that are easy to find) than about finding the most important defects.
 - Verify that sufficient testing metrics are collected, analyzed, and reported.
Related Problems: None
2.1.3.5 GEN-MGMT-5 Test Lessons Learned Ignored
Description: Lessons that are learned regarding testing are not put into practice.
Potential Symptoms:
• Management, the test teams, or customer representatives ignore lessons learned during previous projects or during the testing of previous increments of the system under test.
Potential Consequences:
• The test process is not being continually improved.
• The same problems continue to occur.
• Customer representatives, managers, and developers have a false sense of security that the system functions properly.
Potential Causes:
• Lessons learned were not documented.
• The capturing of lessons learned was postponed until after the project was over, when the people who had learned the lessons were no longer available, having scattered to new projects.
• The only use of lessons learned is informal and based solely on the experience that individual developers and testers bring to new projects.
• Lessons learned from previous projects are not reviewed before starting new projects.
Recommendations:
• Prepare:
 - Make the documentation of lessons learned an explicit part of the testing process.
 - Review previous lessons learned as an initial step in determining the testing process.
• Enable:
 - Capture (and implement) lessons learned as they are learned.
 - Do not wait until a project postmortem, when project staff members' memories are fading and they are moving (or have moved) on to their next projects.
• Perform: Incorporate previously learned testing lessons into the current testing process and test plans.
• Verify:
 - Verify that previously learned testing lessons have been incorporated into the current testing process and test plans.
 - Verify that testing lessons learned are captured (and implemented) as they are learned.
Related Problems: GEN-SIC-3 Lack of Stakeholder Commitment
2.1.4 Test Organization and Professionalism Problems
The following testing problems are related to the test organization and the professionalism of the testers:
• GEN-TOP-1 Lack of Independence
• GEN-TOP-2 Unclear Testing Responsibilities
• GEN-TOP-3 Inadequate Testing Expertise
2.1.4.1 GEN-TOP-1 Lack of Independence
Description: The test organization or team lacks the independence needed to properly perform its testing tasks.
Potential Symptoms:
• The manager of the test organization reports to the development manager.
• The lead of the project test team reports to the project manager.
• The test organization manager or test team leader does not have sufficient authority to raise and manage testing-related risks.
Potential Consequences:
• A lack of sufficient independence forces the test organization or team to select an inappropriate test process or tool.
• Members of the test organization or team are intimidated into withholding objective and timely information from the testing stakeholders.
• The test organization or team has insufficient budget and schedule to be effective.
• The project manager inappropriately overrules the testers or pressures them to violate their principles.
Potential Causes:
• Management does not see the value of or need for independent reporting.
• Management does not see the similarity between quality assurance and testing with regard to independence.
Recommendations:
• Prepare:
 - Determine the reporting structures.
 - Identify potential independence problems.
• Enable: Clarify to testing stakeholders (especially project management) the value of independent reporting for the test organization manager and the project test team leader.
• Perform:
 - Ensure that the test organization or team has:
   - technical independence, so that it can select the most appropriate test process and tools for the job
   - managerial independence, so that it can provide objective and timely information about the test program and results without fear of intimidation due to business considerations or project-internal politics
   - financial independence, so that its budget (and schedule) is sufficient to enable it to be effective and efficient
 - Have the test organization manager report at the same or a higher level than the development organization manager.
 - Have the project test team leader report independently of the project manager to the test organization manager or equivalent (e.g., the quality assurance manager).
• Verify:
 - Verify that the test organization manager reports at the same or a higher level than the development organization manager.
 - Verify that the project test team leader reports independently of the project manager to the test organization manager or equivalent (e.g., the quality assurance manager).
Related Problems: GEN-MGMT-2 Inappropriate External Pressures
2.1.4.2 GEN-TOP-2 Unclear Testing Responsibilities
Description: The testing responsibilities are unclear.
Potential Symptoms:
• The test planning documents do not adequately address testing responsibilities in terms of which organizations, teams, and people:
 - will perform which types of testing on what [types of] components
 - are responsible for procuring, building, configuring, and maintaining the test environments
 - are the ultimate decision makers regarding testing risks, test completion criteria, test completion, and the status/priority of defects
Potential Consequences:
• Certain tests are not performed, while other tests are performed redundantly by multiple organizations or people.
• Incomplete testing enables some defects to make it through testing and into the deployed system.
• Redundant testing wastes test resources and causes testing deadlines to slip.
Potential Causes:
• The test plan template did not clearly address responsibilities.
• The project team is very small, with everyone wearing multiple hats and therefore performing testing on an as-available / as-needed basis.
Recommendations:
• Prepare:
 - Obtain documents describing the current testing responsibilities.
 - Identify potential testing-responsibility problems (e.g., missing or vague responsibilities).
• Enable: Obtain organizational agreement as to the testing responsibilities.
• Perform:
 - Clearly and completely document the responsibilities for testing in the test plans as well as in the charters of the teams who will perform the tests.
 - Managers should clearly communicate these responsibilities to the relevant organizations and people.
• Verify: Verify that testing responsibilities are clearly and completely documented in the test plans as well as in the charters of the teams who will perform the tests.
Related Problems: GEN-TPS-2 Incomplete Test Planning, GEN-PRO-7 Too Immature for Testing, GEN-COM-2 Inadequate Test Documentation, TTS-SoS-3 Unclear SoS Testing Responsibilities
2.1.4.3 GEN-TOP-3 Inadequate Testing Expertise
Description: Too many people have inadequate testing expertise, experience, and training.
Potential Symptoms:
• Testers and/or those who oversee them (e.g., managers and customer representatives) have inadequate testing expertise, experience, or training.
• Developers who are not professional testers have been tasked to perform testing.
• Little or no classroom or on-the-job training in testing has taken place.
• Testing is ad hoc, without any proper process.
• Industry best practices are not followed.
Potential Consequences:
• Testing is not effective at detecting defects, especially the less obvious ones.
• There are unusually large numbers of false-positive and false-negative test results.
• The productivity of the testers is needlessly low.
• There is a high probability that the system or software will be delivered late and with an unacceptably large number of residual defects.
• During development, managers, developers, and customer representatives have a false sense of security that the system functions properly. (This false sense of security is likely to be replaced by a sense of panic when the system begins to frequently fail operational testing or real-world usage after deployment.)
Potential Causes:
• Management did not understand the scope and complexity of testing.
• Management did not understand the required qualifications of a professional tester.
• There was insufficient funding to hire fully qualified professional testers.
• The project team is very small, with everyone wearing multiple hats and therefore performing testing on an as-available / as-needed basis.
• An agile development method is being followed that emphasizes cross-functional development teams.
Recommendations (these apply regardless of whether the project uses separate testing teams or cross-functional teams that include testers):
• Prepare:
 - Provide proper test processes, including procedures, standards, guidelines, and templates, for on-the-job training.
 - Ensure that the required qualifications of a professional tester are documented in the tester job description.
• Enable:
 - Convey the required qualifications of the different types of testers to those technically evaluating prospective testers.
 - Provide appropriate amounts of test training (both classroom and on-the-job) for both testers and those overseeing testing.
 - Ensure that the testers who will be automating testing have the necessary specialized expertise and training.
 - Obtain independent support for those overseeing testing.
• Perform:
 - Hire full-time (i.e., professional) testers who have sufficient expertise and experience in testing.
 - Use an independent test organization staffed with experienced, trained testers for system/acceptance testing, whereby the head of this organization is at the same (or a higher) level as the project manager.
• Verify:
 - Verify that those technically evaluating prospective testers understand the required qualifications of the different types of testers.
 - Verify that the testers have adequate testing expertise, experience, and training.
Related Problems: GEN-MGMT-1 Inadequate Test Resources
2.1.5 Test Process Problems
The following testing problems are related to the processes and techniques being used to perform testing:
• GEN-PRO-1 Testing and Engineering Process not Integrated
• GEN-PRO-2 One-Size-Fits-All Testing
• GEN-PRO-3 Inadequate Test Prioritization
• GEN-PRO-4 Functionality Testing Overemphasized
• GEN-PRO-5 Black-box System Testing Overemphasized
• GEN-PRO-6 White-box Unit and Integration Testing Overemphasized
• GEN-PRO-7 Too Immature for Testing
• GEN-PRO-8 Inadequate Test Evaluations
• GEN-PRO-9 Inadequate Test Maintenance
2.1.5.1 GEN-PRO-1 Testing and Engineering Process Not Integrated
Description: The testing process is not adequately integrated into the overall system/software engineering process.
Potential Symptoms:
• There is little or no discussion of testing in the system/software engineering documentation: the System Engineering Master Plan (SEMP), Software Development Plan (SDP), Work Breakdown Structure (WBS), Project Master Schedule (PMS), and system/software development cycle (SDC).
• All or most of the testing is being done as a completely independent activity performed by staff members who are not part of the project engineering team.
• Testing is treated as a separate specialty-engineering activity with only limited interfaces to the primary engineering activities.
• Testers are not included in the requirements teams, architecture teams, or any cross-functional engineering teams.
Potential Consequences:
• There is inadequate communication between testers and other system/software engineers (e.g., requirements engineers, architects, designers, and implementers).
• Few testing outsiders understand the scope, complexity, and importance of testing.
• Testers do not understand the work being performed by other engineers.
• There are incompatibilities between outputs and associated inputs at the interfaces between testers and other engineers.
• Testing is less effective and takes longer than necessary.
Potential Causes:
• Testers are not involved in the determination and documentation of the overall engineering process.
• The people determining and documenting the overall engineering process do not have significant testing expertise, training, or experience.
Recommendations:
• Prepare: Obtain the SEMP, SDP, WBS, and project master schedule.
• Enable: Provide a top-level briefing/training in testing to the chief system engineer, the system architect, and the system/software process engineer.
• Perform:
 - Have test subject matter experts and project testers collaborate closely with the project chief engineer / technical lead and the process engineer when they develop the engineering process descriptions and associated process documents.
 - In addition to placing them in test plans such as the Test and Evaluation Master Plan (TEMP) or Software Test Plan (STP) as well as in other process documents, provide high-level overviews of testing in the SEMP(s) and SDP(s).
 - Document how testing is integrated into the system/software development/life cycle, regardless of whether it is traditional waterfall, agile (iterative, incremental, and parallel), or anything in between. For example, document handover points in the development cycle at which testing input and output work products are delivered from one project organization or group to another.
 - Incorporate testing into the Project Master Schedule.
 - Incorporate testing into the project's work breakdown structure (WBS).
• Verify: Verify that testing is incorporated into the project's:
 - system/software engineering process
 - SEMP and SDP
 - WBS
 - PMS
 - SDC
Related Problems: GEN-COM-4 Inadequate Communication Concerning Testing
2.1.5.2 GEN-PRO-2 One-Size-Fits-All Testing
Description: All testing is to be performed to the same level of rigor, regardless of its criticality.
Potential Symptoms:
• The test planning documents may contain only generic boilerplate rather than appropriate system-specific information.
• Mission-, safety-, and security-critical software may not be required to be tested more completely and rigorously than other, less-critical software.
• Only general techniques suitable for testing functional requirements/behavior may be documented; for example, there is no description of the special types of testing needed for quality requirements (e.g., availability, capacity, performance, reliability, robustness, safety, security, and usability requirements).
Potential Consequences:
• Mission-, safety-, and security-critical software may not be adequately tested.
• When there are insufficient resources to adequately test all of the software, some of these limited resources may be misapplied to lower-priority software instead of being concentrated on the testing of the more critical capabilities.
• Some defects may not be found, and an unnecessary number of these defects may make it through testing and into the deployed system.
• The system may not be sufficiently safe or secure.
Potential Causes:
• Test plan templates and content/format standards may be incomplete and may not address the impact of mission/safety/security criticality on testing.
• Test engineers may not be familiar with the impact of safety and security on testing (e.g., the higher level of testing rigor required to achieve accreditation and certification).
• Safety and security engineers may not have input into the test planning process.
Recommendations:
• Prepare:
 - Provide training to those writing system/software development plans and system/software test plans concerning the need to include project-specific testing information, including potential content.
 - Tailor the templates for test plans and development methods to address the need for project/system-specific information.
• Enable: Update (if needed) the templates for test plans and development methods to address the type, completeness, and rigor of testing.
• Perform:
 - Address in the system/software test plans and system/software development plans:
   - differences in testing types, degrees of completeness, rigor, etc. as a function of mission/safety/security criticality (see the sketch following this section)
   - specialty engineering testing methods and techniques for testing the quality requirements (e.g., penetration testing for security requirements)
 - Test mission-, safety-, and security-critical software more completely and rigorously than other, less-critical software.
• Verify: Verify that the completeness, type, and rigor of testing:
 - are addressed in the system/software development plans and system/software test plans
 - are a function of the criticality of the system/subsystem/software being tested
 - are sufficient based on the degree of criticality of the system/subsystem/software being tested
Related Problems: GEN-PRO-3 Inadequate Test Prioritization
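One way to make criticality-driven rigor concrete is to record it as project data rather than prose. The following is a minimal sketch; the criticality levels, coverage targets, and required techniques are illustrative assumptions to be replaced by the project's own standards.

    # Illustrative sketch: test rigor tailored by criticality instead of
    # one-size-fits-all. All levels, targets, and techniques are assumptions.
    RIGOR_BY_CRITICALITY = {
        "safety-critical":   {"statement_coverage": 1.00,
                              "techniques": ["boundary-value", "fault injection", "independent review"]},
        "security-critical": {"statement_coverage": 1.00,
                              "techniques": ["penetration testing", "fuzzing"]},
        "mission-critical":  {"statement_coverage": 0.95,
                              "techniques": ["boundary-value", "stress testing"]},
        "routine":           {"statement_coverage": 0.80,
                              "techniques": ["nominal-case testing"]},
    }

    def required_rigor(criticality: str) -> dict:
        # Default to the most rigorous profile if a component is unclassified,
        # so that a missing classification never silently lowers the bar.
        return RIGOR_BY_CRITICALITY.get(criticality, RIGOR_BY_CRITICALITY["safety-critical"])

    print(required_rigor("mission-critical"))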
2.1.5.3 GEN-PRO-3 Inadequate Test Prioritization
Description: Testing is not being adequately prioritized.
Potential Symptoms:
• All types of testing may have the same priority.
• All test cases for the system or one of its subsystems may have the same priority.
• The most important tests of a given type may not be performed first.
• Testing may begin with the easy testing of "low-hanging fruit".
• Difficult testing, or the testing of high-risk functionality/components, may be postponed until late in the schedule.
• Testing ignores the order of integration and delivery; for example, unit testing before integration testing before system testing, and the testing of the functionality of the current increment before the testing of future increments. (While the actual testing of future capabilities must wait until those capabilities are delivered to the testers, one can begin to develop black-box test cases based on requirements allocated to future builds, i.e., tests that are not currently needed and may never be needed if the associated requirements change or are deleted.)
Potential Consequences:
• Limited testing resources may be wasted or used ineffectively.
• Some of the most critical defects (in terms of failure consequences) may not be discovered until after the system/software is delivered and placed into operation.
• Specifically, defects with mission, safety, and security ramifications may not be found.
Potential Causes:
• The system/software test plans and the testing parts of the system/software development plans do not address the prioritization of the testing.
• Any prioritization of testing is not used to schedule testing.
• Evaluations of the individual testers and test teams:
 - are based [totally] on the number of tests performed per unit time
 - ignore the importance of capabilities, subsystems, or defects found
Recommendations:
• Prepare:
 - Update the following documents to address the prioritization of testing:
   - system/software test plans
   - testing parts of the system/software development plans
 - Define the different types and levels/categories of criticality.
• Enable:
 - Perform a mission analysis to determine the mission-criticality of the different capabilities and subsystems.
 - Perform a safety (hazard) analysis to determine the safety-criticality of the different capabilities and subsystems.
 - Perform a security (threat) analysis to determine the security-criticality of the different capabilities and subsystems.
• Perform:
 - Work with the developers, management, and stakeholders to prioritize testing according to (see the sketch following this section):
   - the criticality (e.g., mission, safety, and security) of the system/subsystem/software being tested
   - the potential importance of the defects identified via test failure
   - the probability that the test is likely to elicit important failures
   - the potential level of risk incurred if the defects are not identified via test failure
   - delivery schedules
   - integration/dependency order
 - Use the prioritization of testing to schedule testing so that the highest-priority tests are performed first.
 - Collect test metrics based on the number and importance of the defects found.
 - Base the performance evaluations of the individual testers and test teams on test effectiveness (e.g., the number and importance of defects found) rather than merely on the number of tests written and performed.
• Verify:
 - Evaluate the system/software test plans and the testing parts of the system/software development plans to verify that they properly address test prioritization.
 - Verify that mission, safety, and security analyses have been performed and that the results are used to prioritize testing.
 - Verify that testing is properly prioritized.
 - Verify that testing is in fact being performed in accordance with the prioritization.
 - Verify that testing metrics address test prioritization.
 - Verify that performance evaluations are based on test effectiveness rather than merely on the number of tests performed.
Related Problems: GEN-PRO-2 One-Size-Fits-All Testing
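The prioritization factors above can be combined into a simple ordering function. The following is a minimal sketch; the weights, scores, and test names are invented for illustration, and a real project would calibrate them against its own criticality and risk analyses.

    # Illustrative sketch: ordering tests by criticality, residual risk, and
    # the likelihood that the test will actually elicit a failure.
    tests = [
        {"name": "login-lockout", "criticality": 5, "failure_likelihood": 0.3, "risk_if_missed": 5},
        {"name": "report-layout", "criticality": 1, "failure_likelihood": 0.6, "risk_if_missed": 1},
        {"name": "engine-cutoff", "criticality": 5, "failure_likelihood": 0.2, "risk_if_missed": 5},
    ]

    def priority(test: dict) -> float:
        # Higher is more urgent: weight criticality and residual risk most
        # heavily, scaled by the probability of eliciting a failure.
        return (3 * test["criticality"] + 2 * test["risk_if_missed"]) * test["failure_likelihood"]

    # Schedule the highest-priority tests first.
    for t in sorted(tests, key=priority, reverse=True):
        print(f"{priority(t):5.1f}  {t['name']}")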
2.1.5.4 GEN-PRO-4 Functionality Testing Overemphasized
Description: There is an overemphasis on testing functionality as opposed to quality characteristics, data, and interfaces.
Potential Symptoms:
• The vast majority of testing may be concerned with verifying functional behavior.
• Little or no testing may be performed to verify adequate levels of the quality characteristics (e.g., availability, reliability, robustness, safety, security, and usability).
• Inadequate levels of various quality characteristics and their attributes are only recognized after the system has been delivered and placed into operation.
Potential Consequences:
• The system may not have adequate levels of important quality characteristics and may thereby fail to meet all of its quality requirements.
• Failures to meet data and interface requirements (e.g., due to a lack of verification of input data and message contents) may not be recognized until late during integration or after delivery.
• Testers and developers may have a harder time localizing the defects that the system tests reveal.
• The system or software may be delivered late and fail to meet an unacceptably large number of non-functional requirements.
Potential Causes:
• The test plans and process documents do not adequately address the testing of non-functional requirements.
• There are no process requirements (e.g., in the development contract) mandating the specialized testing of non-functional requirements.
• Managers, developers, and/or testers believe:
 - Testing other types of requirements (i.e., data, interface, quality, and architecture/design/implementation/configuration constraints) is too hard.
 - Testing the non-functional requirements will take too long. (Adequately testing quality requirements does require significantly more time to prepare for and perform than testing typical functional requirements.)
 - The non-functional requirements are not as important as the functional requirements.
 - The non-functional testing will naturally occur as a byproduct of the testing of the functional requirements. (This can be largely true for some of the non-functional requirements, e.g., interface requirements and performance requirements.)
• The other types of requirements (especially quality requirements) are:
 - poorly specified (e.g., "The system shall be secure." or "The system shall be easy to use.")
 - not specified
 - therefore not testable
• Functional testing may be the only testing mandated by the development contract, making the testing of the non-functional requirements out of scope or unimportant to the acquisition organization.
Recommendations:
• Prepare:
 - Adequately address the testing of non-functional requirements in the test plans and process documents.
 - Include process requirements mandating the specialized testing of non-functional requirements in the contract.
• Enable: Ensure that managers, developers, and/or testers understand the importance of testing non-functional requirements as well as conformance to the architecture and design (e.g., via white-box testing).
• Perform: Adequately perform the other types of testing (for one example, see the sketch following this section).
• Verify:
 - Verify that the managers, developers, and/or testers understand the importance of testing non-functional requirements and conformance to the architecture and design.
 - Have quality engineers verify that the testers are testing the quality, data, and interface requirements as well as the architecture/design/implementation/configuration constraints.
 - Review the test plans and process documents to ensure that they adequately address the testing of non-functional behavior.
 - Measure, analyze, and report the types of non-functional defects and when they are being detected.
Related Problems: None
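As a minimal sketch of testing a quality requirement rather than functional behavior alone: the example below checks an assumed performance requirement ("95% of requests complete within 200 ms"); the requirement, threshold, and the stand-in operation are all invented for illustration.

    # Illustrative sketch: verifying a non-functional (performance)
    # requirement by measuring latency percentiles, not just correctness.
    import time
    import statistics

    def process_request(payload: str) -> str:
        return payload.upper()   # stand-in for the operation under test

    samples = []
    for i in range(1000):
        start = time.perf_counter()
        process_request(f"request-{i}")
        samples.append((time.perf_counter() - start) * 1000.0)  # milliseconds

    # statistics.quantiles with n=20 yields 19 cut points; index 18 is the
    # 95th percentile of the observed latencies.
    p95 = statistics.quantiles(samples, n=20)[18]
    assert p95 <= 200.0, f"95th percentile latency {p95:.1f} ms exceeds the 200 ms budget"
    print(f"p95 latency: {p95:.3f} ms - assumed requirement met")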
2.1.5.5 GEN-PRO-5 Black-box System Testing Overemphasized
Description: There is an overemphasis on black-box system testing for requirements conformance.
Potential Symptoms:
• The vast majority of testing is occurring at the system level for purposes of verifying conformance to requirements.
• There is very little white-box unit and integration testing.
• System testing is detecting many defects that could have been more easily identified during unit or integration testing.
• Similar residual defects may also be causing faults and failures after the system has been delivered and placed into operation.
Potential Consequences:
• Defects that could have been found during unit or integration testing are harder to detect, localize, analyze, and fix.
• System testing is unlikely to be completed on schedule.
• It is harder to develop sufficient system-level tests to meet code coverage criteria.
• The system or software may be delivered late with an unacceptably large number of residual defects that will only rarely be executed and thereby cause faults or failures.
Potential Causes:
• The test plans and process documents do not adequately address unit and integration testing.
• There are no process requirements (e.g., in the development contract) mandating unit and integration testing.
• The developers believe that black-box system testing is all that is necessary to detect the defects.
• Developers believe that testing is totally the responsibility of the independent test team, which is only planning on performing system-level testing.
• The schedule does not contain adequate time for unit and integration testing. Note that this may really be an underemphasis of unit and integration testing rather than an overemphasis on system testing.
• Independent testers rather than developers are performing the testing.
Recommendations:
• Prepare: Adequately address in the test plans, test process documents, and contract:
 - white-box and gray-box testing
 - unit and integration testing
• Enable:
 - Ensure that managers, developers, and/or testers understand the importance of these lower-level types of testing (see the sketch following this section).
 - Use a test plan template or content and format standard that addresses these lower-level types of testing.
• Perform: Increase the amount and effectiveness of these lower-level types of testing.
• Verify:
 - Review the test plans and process documents to ensure that they adequately address these lower-level types of tests.
 - Verify that the managers, developers, and/or testers understand the importance of these lower-level types of testing.
 - Have quality engineers verify that the testers are actually performing these lower-level types of testing and at an appropriate percentage of the total tests.
 - Measure the number of defects slipping past unit and integration testing.
Related Problems: GEN-PRO-6 White-box Unit and Integration Testing Overemphasized
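The following is a minimal software-level sketch of how white-box and black-box tests complement each other for a single function, assuming pytest as the test runner (an illustrative choice, not one this document prescribes); the function and the requirement it verifies are invented.

    # Illustrative sketch: a white-box unit test that targets a specific
    # branch in the code, and a black-box test written only from the stated
    # requirement ("reject withdrawals that exceed the balance").
    import pytest

    def withdraw(balance: float, amount: float) -> float:
        if amount <= 0:
            raise ValueError("amount must be positive")
        if amount > balance:          # branch the white-box test targets
            raise ValueError("insufficient funds")
        return balance - amount

    def test_unit_insufficient_funds_branch():
        # White-box: chosen because we know this branch exists in the code.
        with pytest.raises(ValueError):
            withdraw(balance=10.0, amount=10.01)

    def test_requirement_conformance():
        # Black-box: derived from the requirement alone, with no knowledge
        # of the implementation's branches.
        assert withdraw(balance=100.0, amount=40.0) == 60.0

Neither kind of test substitutes for the other: the white-box test would survive a requirements change that invalidates the black-box test, and vice versa.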
2.1.5.6 GEN-PRO-6 White-box Unit and Integration Testing Overemphasized
Description: There is an overemphasis on white-box unit and integration testing.
Potential Symptoms:
• The vast majority of testing is occurring at the unit and integration level.
• Very little time is being spent on black-box system testing to verify conformance to the requirements.
• People are stating that significant system testing is not necessary because "lower-level tests have already verified the system requirements" or "there is insufficient time left to perform significant system testing".
• There is little or no testing of quality requirements (because the associated quality attributes are system-level characteristics).
Potential Consequences:
• The delivered system may fail to meet some of its system requirements, especially quality requirements and those functional requirements that require the collaboration of the integrated subsystems.
Potential Causes:
• The test plans and process documents do not adequately address black-box system testing.
• There are no process requirements (e.g., in the development contract) mandating black-box system/software testing.
• The developers believe that if the components work properly, then the system will work properly when they are integrated.
• No black-box testing metrics are being collected, analyzed, and reported.
• The schedule does not contain adequate time for black-box system testing (e.g., due to schedule slippages and a firm release date). Note that this may really be an underemphasis of black-box testing rather than an overemphasis on white-box unit and integration testing.
• Developers rather than independent testers are performing much or most of the testing.
Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform:
 - Increase the amount and effectiveness of system testing.
 - When appropriate, improve the test plans and process documents with regard to system testing.
• Verify:
 - Review the test plans and process documents to ensure that they adequately address black-box system testing.
 - Measure, analyze, and report the number of requirements that have been verified by system testing.
Related Problems: GEN-PRO-5 Black-box System Testing Overemphasized
2.1.5.7 GEN-PRO-7 Too Immature for Testing
Description: Some of the products being tested are immature, containing too many defects.
Potential Symptoms:
• Large numbers of requirements, architecture, and design defects are being found that should have been discovered (during reviews) and fixed prior to the current testing.
• The product may be delivered for testing when it is not ready for testing because:
 - Schedule pressures cause corners to be cut during earlier testing.
 - Test readiness criteria do not exist or are not enforced.
 - Management, customer/user representatives, and developers do not understand the impact of immature products on testing.
Potential Consequences:
• Testing may find many defects that should have been detected during previous levels of testing.
• Encapsulation due to integration may make it unnecessarily difficult to localize the defect that caused a test failure.
• Testing may not be completed on schedule.
Potential Causes:
• There is insufficient time for proper design and implementation prior to testing.
• There is insufficient staff for proper design and implementation.
• There are no completion criteria for design, implementation, and lower-level testing.
• Lower-level tests have not been properly performed.
Recommendations:
• Prepare: Set reasonable criteria for test readiness (see the sketch following this section).
• Enable: Enforce the following of reasonable criteria for test readiness. TBD
• Perform:
 - Increase the amount of earlier verification of the requirements, architecture, and design (e.g., with peer-level reviews and inspections).
 - Improve the effectiveness of earlier disciplines and types of testing (e.g., by improving methods and providing training).
• Verify:
 - Measure the number of defects slipping through multiple disciplines and types of testing (e.g., where the defect was introduced and where it was found).
Related Problems: GEN-TOP-2 Unclear Testing Responsibilities
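One way to make test readiness criteria enforceable is an automated gate evaluated before a build enters system testing. The following is a minimal sketch; the criteria, field names, and thresholds are assumptions that would come from the project's own readiness definition, and it presumes the build pipeline can report these facts.

    # Illustrative sketch: a test-readiness gate with example criteria.
    readiness = {
        "unit_tests_passed": True,
        "code_inspected": True,
        "open_blocker_defects": 2,
        "build_installs_cleanly": True,
    }

    def ready_for_system_test(status: dict) -> list:
        # Collect every unmet criterion so the report is complete, not
        # merely a pass/fail flag.
        failures = []
        if not status["unit_tests_passed"]:
            failures.append("unit tests have not passed")
        if not status["code_inspected"]:
            failures.append("code inspection not completed")
        if status["open_blocker_defects"] > 0:
            failures.append(f"{status['open_blocker_defects']} blocker defects still open")
        if not status["build_installs_cleanly"]:
            failures.append("build does not install cleanly")
        return failures

    problems = ready_for_system_test(readiness)
    print("READY" if not problems else "NOT READY: " + "; ".join(problems))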
2.1.5.8 GEN-PRO-8 Inadequate Test Evaluations
Description: The quality of the test assets is not being adequately evaluated prior to their use.
Potential Symptoms:
• Few or no [peer-level] inspections, walk-throughs, or reviews of the test assets (e.g., test inputs, preconditions, trigger events, and expected test outputs and postconditions) are being performed prior to actual testing.
Potential Consequences:
• Test plans, test procedures, test cases, and other testing work products contain defects that could have been found during these evaluations.
• There will be an increase in false-positive and false-negative test results.
• Unnecessary effort will be wasted identifying and fixing problems.
• Some defects may not be found, and an unnecessary number of these defects may make it through testing and into the deployed system.
Potential Causes:
• Evaluating the test assets is not addressed in the:
 - Test and Evaluation Master Plan (TEMP) or System/Software Test Plan (STP)
 - high-level overviews of testing in the System Engineering Master Plan (SEMP) and System/Software Development Plan (SDP)
 - quality engineering/assurance plans
 - master project schedule
 - Work Breakdown Structure (WBS)
• There is insufficient time and staff to evaluate both the deliverable system/software and the test assets.
• The test assets are not deemed sufficiently important to evaluate.
• The developers believe that the test assets will automatically be verified during the actual testing.
Recommendations:
• Prepare:
 - Incorporate test evaluations into the:
   - system/software development plans
   - system/software test plans
   - project schedules (master and team)
   - project work breakdown structure (WBS)
 - Ensure that the following test assets are reviewed prior to actual testing: test inputs, preconditions (pre-test state), and the test oracle, including expected test outputs and postconditions (see the sketch following this section).
• Enable: To the extent practical, ensure that the test evaluation team includes other testers, requirements engineers, user representatives, subject matter experts, architects, and implementers.
• Perform:
 - Have the testers perform peer-level reviews of the testing work products.
 - Have quality engineering perform evaluations of the testing work products (quality control) and the test process (quality assurance).
 - Have stakeholders in testing perform technical evaluations of the major testing work products.
 - Have the results of these technical evaluations presented at major project status meetings and major formal reviews.
• Verify:
 - Verify that these evaluations do in fact occur.
 - Verify that the results of these evaluations are reported to the proper stakeholders.
 - Verify that problems discovered are assigned for fixing and are in fact fixed.
Related Problems: GEN-TPS-2 Incomplete Test Planning, GEN-TOP-2 Unclear Testing Responsibilities
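A small static completeness check can supplement (not replace) the human reviews recommended above by flagging test cases whose required elements are missing before any review time is spent on them. The field names below are assumptions about the test repository's format.

    # Illustrative sketch: flag test cases missing the elements named above
    # (inputs, preconditions, expected outputs/postconditions) before review.
    test_cases = [
        {"id": "TC-001", "inputs": {"speed": 42}, "preconditions": ["engine on"],
         "expected_outputs": {"alarm": False}, "postconditions": ["state unchanged"]},
        {"id": "TC-002", "inputs": {"speed": -1}, "preconditions": [],
         "expected_outputs": None, "postconditions": []},   # incomplete entry
    ]

    REQUIRED = ("inputs", "preconditions", "expected_outputs", "postconditions")

    for tc in test_cases:
        # Empty or absent fields are both treated as missing.
        missing = [f for f in REQUIRED if not tc.get(f)]
        if missing:
            print(f"{tc['id']}: review needed, missing {', '.join(missing)}")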
2.1.5.9 GEN-PRO-9 Inadequate Test Maintenance
Description: Testing assets are not being properly maintained.
Potential Symptoms:
• Testing assets (e.g., test software and documents such as test cases, test procedures, test drivers, test stubs, and tracings between requirements and tests) may not be adequately updated and iterated as defects are found and the system software is changed (e.g., due to refactoring, change requests, or the use of an agile, i.e., incremental and iterative, development cycle). (Although requirements traceability matrices can be used when the numbers of requirements and test cases are quite small, a requirements management, test management, or configuration management tool is usually needed to document and maintain the tracings between requirements and tests.)
Potential Consequences:
• Testing assets (e.g., automated regression tests) may no longer be consistent with the current requirements, architecture, design, and implementation.
• Test productivity may decrease as the number of false-negative test results increases (i.e., as tests fail due to test defects).
• The amount of productive regression testing may decrease as effort is redirected to identifying and fixing test defects.
Potential Causes:
• Maintenance of the testing assets may not be an explicit part of the testing process.
• Maintenance of the testing assets may not be explicitly documented in the Test and Evaluation Master Plan (TEMP), the System/Software Test Plan (STP), or the testing sections of the System Engineering Master Plan (SEMP) and Software Development Plan (SDP).
• The test resources (e.g., schedule and staffing) provided by management may be insufficient to properly maintain the testing assets.
• The project master schedule may not have included (sufficient) time for test asset maintenance.
• Testing stakeholders may not understand the importance of maintaining the testing assets.
• There may be no (requirements management, test management, or configuration management) tool support for maintaining the tracing between requirements and tests.
Recommendations:
• Prepare:
 - Explicitly address the maintenance of the testing assets in the:
   - Test and Evaluation Master Plan (TEMP), the System/Software Test Plan (STP), or the testing sections of the System Engineering Master Plan (SEMP) and Software Development Plan (SDP)
   - testing process documents (e.g., procedures and guidelines)
   - project work breakdown structure (WBS)
 - Include adequate time for test asset maintenance in the project master schedule.
 - Clearly communicate the importance of maintaining the testing assets to the testing stakeholders.
 - Ensure that the maintenance testers are adequately trained and experienced. (This helps combat the loss of project expertise that occurs because many or most of the testers who are members of the development staff tend to move on after delivery.)
• Enable:
 - Provide sufficient test resources (e.g., schedule and staffing) to properly maintain the testing assets.
 - Provide tool support (e.g., via a requirements management, test management, or configuration management tool) for maintaining the tracing between requirements and tests (see the sketch following this section).
• Perform:
 - Keep the testing assets consistent with the current requirements, architecture, design, and implementation. (While this is useful for any product that undergoes multiple internal or external releases, it is an especially good idea when an agile, iterative and incremental, development cycle produces numerous short-duration increments.)
 - Properly maintain the testing assets as defects are found and system changes are introduced.
• Verify:
 - Verify that the test plans address maintaining the testing assets.
 - Verify that the project master schedule includes time for maintaining the testing assets.
 - Verify that the testing assets are in fact being maintained (e.g., via quality assurance and quality control).
Related Problems: GEN-TPS-1 No Separate Test Plan, GEN-TPS-2 Incomplete Test Planning, GEN-TOP-2 Unclear Testing Responsibilities
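As a minimal sketch of the requirements-to-tests tracing discussed above: the check below flags requirements with no test coverage and tests that no longer trace to any current requirement (prime candidates for maintenance after a requirements change). All identifiers are invented, and a real project would draw this data from its requirements or test management tool.

    # Illustrative sketch: checking a requirements-to-tests tracing for
    # untested requirements and orphaned tests.
    trace = {
        "REQ-101": ["TC-001", "TC-007"],
        "REQ-102": [],                      # requirement with no test coverage
    }
    all_tests = {"TC-001", "TC-007", "TC-999"}

    untested = [req for req, tests in trace.items() if not tests]
    traced_tests = {t for tests in trace.values() for t in tests}
    orphans = all_tests - traced_tests      # tests tracing to no current requirement

    print("Untested requirements:", untested)
    print("Orphaned tests (candidates for maintenance):", sorted(orphans))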
2.1.6 Test Tools and Environments Problems
The following testing problems are related to the test tools and environments:
• GEN-TTE-1 Over-reliance on Manual Testing
• GEN-TTE-2 Over-reliance on Testing Tools
• GEN-TTE-3 Insufficient Test Environments
• GEN-TTE-4 Poor Fidelity of Test Environments
• GEN-TTE-5 Inadequate Test Environment Quality
• GEN-TTE-6 System/Software Under Test Behaves Differently
• GEN-TTE-7 Tests not Delivered
• GEN-TTE-8 Inadequate Test Configuration Management (CM)
2.1.6.1 GEN-TTE-1 Over-reliance on Manual Testing
Description: Testers are placing too much reliance on manual testing.
Potential Symptoms:
• All, or the majority of, testing is being performed manually without adequate support from test tools or test scripts. (This may not be a problem if test automation is not practical for some reason, e.g., the quick-and-dirty testing of a UI-heavy rapid prototype that will not be maintained.)
Potential Consequences:
• Testing will be very labor intensive.
• Any non-trivial amount of regression testing will likely be impractical.
• Testing will likely be subject to significant human error, especially with regard to test inputs and the interpretation and recording of test outputs.
Potential Causes:
• Test automation is not addressed in the Test and Evaluation Master Plan (TEMP) or System/Software Test Plan (STP).
• Test automation is not addressed in the testing parts of the System Engineering Master Plan (SEMP) and Software Development Plans (SDP).
• Test automation is not included in the project Work Breakdown Structure (WBS).
• Time for test automation is not included in the project master schedule and test team schedules.
• Testers do not have adequate training and experience in test automation.
Recommendations:
• Prepare:
 - Address test automation in the TEMP and/or STP.
 - Address test automation in the testing parts of the SEMP and/or SDP.
 - Address test automation in the project WBS.
 - Include time for test automation in the project master schedule and test team schedules.
 - Provide sufficient funding for the evaluation, selection, purchase, and maintenance of test tools.
 - Provide sufficient staff and funding for the automation of testing.
• Enable:
 - Evaluate, select, purchase, and maintain test tools.
 - Where needed, provide training in automated testing.
• Perform:
 - Limit manual testing to only that testing for which it is most appropriate.
 - Automate regression testing (see the sketch following this section).
 - Maintain the regression tests (e.g., scripts, inputs, and expected outputs).
 - Use test tools and scripts to automate appropriate parts of the testing process (e.g., to ensure that testing provides adequate code coverage).
• Verify:
 - Evaluate the test planning documentation for the inclusion of automated testing.
 - Evaluate the schedules for the inclusion of test automation.
 - Verify that sufficient tests are being automated and maintained.
Related Problems: GEN-TTE-2 Over-reliance on Testing Tools, TTS-REG-1 Insufficient Regression Test Automation
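The following is a minimal sketch of replacing manual result-checking with a table-driven regression harness: inputs and expected outputs are recorded once and replayed automatically, removing the human transcription errors noted above. The unit under test and its input/output table are invented for illustration.

    # Illustrative sketch: a table-driven automated regression harness.
    def parse_speed(text: str) -> int:   # stand-in for the unit under test
        value = int(text.strip())
        if value < 0:
            raise ValueError("negative speed")
        return value

    # Recorded (input, expected output) pairs replayed on every regression run.
    REGRESSION_TABLE = [
        ("42", 42),
        (" 7 ", 7),
        ("0", 0),
    ]

    failures = 0
    for raw, expected in REGRESSION_TABLE:
        actual = parse_speed(raw)
        if actual != expected:
            failures += 1
            print(f"FAIL: parse_speed({raw!r}) -> {actual}, expected {expected}")
    print(f"{len(REGRESSION_TABLE) - failures}/{len(REGRESSION_TABLE)} regression checks passed")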
2.1.6.2 GEN-TTE-2 Over-reliance on Testing Tools
Description: Testers and other testing stakeholders are placing too much reliance on (COTS and home-grown) testing tools. (This problem does not imply that regression tests need not be automated; they should be, to the extent practical. Rather, the tester needs to use the requirements, architecture, and design as the test oracle rather than the tool and the code.)
Potential Symptoms:
• Testers and other testing stakeholders are relying on testing tools to do far more than merely generate sufficient white-box test cases to ensure code coverage.
• Testers are relying on the tools to automate test case creation, including test case selection and completion ("coverage") criteria.
• Testers are relying on the test tools as their test oracle (to determine the expected, i.e., correct, test result).
• Testers let the tool drive the test methodology rather than the other way around.
Potential Consequences:
• Testing may emphasize white-box (design-driven) testing and may include inadequate black-box (requirements-driven) testing.
• Many design defects may not be found during testing and thus remain in the delivered system.
Potential Causes:
• The tool vendor's marketing information may:
 - be overly optimistic (e.g., promise that the tool covers everything)
 - equate the tool with the method (so that no additional methodology needs to be addressed)
• Management may equate the test tools with the testing method.
• The testers may be sufficiently inexperienced in testing that they do not recognize what the tool does not cover.
Recommendations:
• Prepare: Ensure that manual testing, including its scope (when and for what), is documented in the test plans and test process documents.
• Enable:
 - Provide sufficient resources to perform the tests that should or must be performed manually.
 - Ensure that testers (e.g., via training and test planning) understand the limits of testing tools and of the automation of test case creation.
• Perform:
 - Let the test methodology drive tool selection.
 - Ensure that testers (when appropriate) use the requirements, architecture, and design as the test oracle (to determine the correct test result).
• Verify: Verify that the testers are not relying 100% on test tools to automate test case selection and to set the test completion ("coverage") criteria.
Related Problems: GEN-TTE-1 Over-reliance on Manual Testing
2.1.6.3 GEN-TTE-3 Insufficient Test Environments
Description: There are too few test environments.
Potential Symptoms:
• The types of test environments needed are not [adequately or completely] addressed in the Test and Evaluation Master Plan (TEMP) or System/Software Test Plan (STP).
• The types of test environments needed are not [adequately or completely] addressed in the testing parts of the System Engineering Master Plan (SEMP) and Software Development Plans (SDP).
• There are not sufficient test environments of one or more of the following types (listed in order of increasing fidelity):
 - software-only test environments hosted on a basic general-purpose platform such as desktop and laptop computers
 - software-only test environments on the appropriate computational environment (e.g., correct processors, buses, operating system, middleware, and databases)
 - software with prototype hardware (e.g., sensors and actuators)
 - software with an early/previous version of the actual hardware
 - an initial integrated test system with partial functionality (e.g., an initial test aircraft for ground testing or for flight testing of existing functionality)
 - an integrated test system with full functionality (e.g., an operational aircraft for flight testing or operational evaluation testing)
• There is an excessive amount of competition among the integration testers and other testers for time on the test environments.
Potential Consequences:
• It may be difficult to optimally schedule the allocation of test teams to test environments, resulting in scheduling conflicts.
• Too much time may be wasted reconfiguring the test environments for the next team's use. Testing may not be completed on schedule.
• Certain types of testing cannot be performed.
• Defects that should be found during testing on earlier test environments are not found until later test environments, when it becomes harder to cause and reproduce test failures and to localize defects.
Potential Causes:
• Lack of adequate planning (e.g., test environments missing from test plans)
• Lack of experience with the different types of test environments
• Lack of funding for the creation, testing, and maintenance of test environments
• Underestimation of the amount of testing to be performed
• Lack of adequate hardware for the test environments, for example because hardware is needed for (1) initial prototype systems or (2) initial systems during Low Rate of Initial Production (LRIP)
Recommendations:
• Prepare:
 - Ensure that the test team, and especially the test managers, understand the different types of test environments, their uses, their costs, and their benefits.
 - Determine/estimate the amount and types of tests needed.
 - Determine the number of testers needed.
 - Determine the test environment requirements in terms of the types of test environments and the numbers of each type.
• Enable:
 - Address all of the types of test environments needed in the Test and Evaluation Master Plan (TEMP) or System/Software Test Plan (STP).
 - Address (e.g., list) the types of test environments needed in the testing parts of the System Engineering Master Plan (SEMP) and Software Development Plans (SDP).
 - Include the development of the test environments in the project master schedule (and the schedules of the individual test teams) and ensure that it is consistent with the test schedule.
 - Include the development and maintenance of the test environments in the project Work Breakdown Structure (WBS).
 - Ensure that the necessary software (e.g., test tools) is available when needed.
 - Ensure that sufficient hardware/systems to build and support the test environments are available when needed. Do not transfer the hardware/systems needed for the test environments to other uses, leaving insufficient hardware for the test environments.
 - Create and use a process for scheduling the test teams' use of the test environments.
• Perform:
 - Develop all of the needed test environments.
 - Maintain all of the needed test environments.
• Verify:
 - Verify that the development of the test environments is properly addressed in the TEMPs, STPs, schedules, and WBS.
 - Verify that sufficient test environments are available and reliable (e.g., via testing metrics).
Related Problems: GEN-TTE-5 Inadequate Test Environment Quality, GEN-TTE-8 Inadequate Test Configuration Management (CM)
2.1.6.4 GEN-TTE-4 Poor Fidelity of Test Environments
Description: Testing is problematic because the test environment has poor fidelity to the operational system/software and its environment.
Potential Symptoms:
• Testing is being performed using test environments incorporating:
 - a different computing platform (or a different version of the platform) than that used by the delivered software:
   - compiler or programming language
   - class library
   - operating system, middleware, or database(s)
   - network software
 - different computer or system hardware (or a different version of it):
   - processor(s), memory, motherboard, or graphics card
   - network devices (e.g., routers and firewalls)
   - sensors, actuators, etc.
 - software that poorly simulates hardware (e.g., stimulators/drivers and stubs)
• A significant number of tests fail due to the low fidelity of the testing environment.
Potential Consequences:
• Testing will experience many false negatives. It will be more difficult to localize and fix defects.
• Test cases will need to be repeated when the fidelity problems are solved by:
 - fixing defects in the test environment
 - using a different test environment that better conforms to the operational system and its environment (e.g., by replacing software simulation with hardware or by replacing prototype hardware with actual operational hardware)
Potential Causes:
• Poor configuration management of the hardware or software
• Lack of availability of the correct version of the hardware or software
Recommendations:
• Prepare: Include in the test plans how the testers are going to address test environment fidelity.
• Enable:
 - Provide good configuration management of the components under test and the test environments.
 - Provide tools to evaluate the fidelity of the test environment's behavior (see the sketch following this section).
 - Provide the test labs with sufficient numbers of prototype and Low Rate of Initial Production (LRIP) system components (subsystems, software, and hardware).
• Perform: To the extent practical, use the same versions of development tools and system components during testing as during operation.
• Verify: Verify, to the extent practical, that test stimulators and stubs, when used, have the same characteristics as the eventual components they replace during testing.
Related Problems: GEN-TTE-5 Inadequate Test Environment Quality, GEN-TTE-6 System/Software Under Test Behaves Differently, TTS-INT-2 Unavailable Components
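A simple fidelity check is to compare the test environment's configuration manifest against the operational baseline and flag every mismatch before testing begins. The following is a minimal sketch; the component names and versions are invented, and a real project would generate the manifests from its configuration management records.

    # Illustrative sketch: flagging fidelity gaps between a test environment
    # manifest and the operational baseline.
    operational = {"os": "RHEL 8.6", "middleware": "DDS 6.2", "compiler": "gcc 11.3"}
    test_env    = {"os": "RHEL 8.6", "middleware": "DDS 6.0", "compiler": "gcc 11.3"}

    # Check components present in either manifest, so missing entries are
    # reported as gaps too.
    for component in sorted(set(operational) | set(test_env)):
        op, te = operational.get(component), test_env.get(component)
        if op != te:
            print(f"Fidelity gap in {component}: test env has {te!r}, operations has {op!r}")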
2.1.6.5 GEN-TTE-5 Inadequate Test Environment Quality
Description: The quality of the test environments is inadequate.
Potential Symptoms:
• One or more test environments contain an excessive number of defects. (This is primarily a problem with test environments, and components of test environments, that are developed in-house.)
Potential Consequences:
• There may be numerous false-negative test results. (A false-negative test result is a test indicating that the system/software under test fails when it is actually the test environment that failed to perform properly.)
• It will be more difficult to determine whether test failures are due to the system/software under test or to the test environments.
• Testing will take a needlessly long time to perform.
• The system may be delivered late and with an unacceptably large number of residual defects.
Potential Causes:
• TBD
Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform:
 - Ensure that the quality of the test environment is as good as that of the system/software under test, especially when testing mission-, safety-, or security-critical software.
 - Ensure that the test environments are of sufficient quality (e.g., via good development practices, adequate testing, and careful tool selection).
• Verify: Verify that TBD
Related Problems: GEN-TTE-4 Poor Fidelity of Test Environments, GEN-TTE-6 System/Software Under Test Behaves Differently
2.1.6.6 GEN-TTE-6 System/Software Under Test Behaves Differently
Description: The system or software under test (SUT) and the operational system or software behave differently.
Potential Symptoms:
• A fault or failure that occurs during testing is not repeatable during normal operation.
• The SUT behaved correctly during testing but causes a fault or failure during operation.
• The SUT contains test software that is removed (e.g., physically or via compiler switch) before the system is placed in operation.
Potential Consequences:
• Extra time is spent localizing the defect.
• Correct behavior that depends on the presence of the integrated test software leads to a false sense of security.
Potential Causes:
• TBD
Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Perform black-box regression testing after removing the test software.
• Consider incorporating the test software as deliverable built-in-test (BIT) software; a minimal pattern is sketched below. [Note that it may not be practical (e.g., for performance reasons or code size) or permitted (e.g., for safety or security reasons) to deliver the system with embedded test software. For example, embedded test software could provide an attacker with a back-door capability.]
Related Problems: GEN-TTE-4 Poor Fidelity of Test Environments, GEN-TTE-5 Inadequate Test Environment Quality
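If test instrumentation is kept in the delivered system as BIT, one common pattern is to isolate it behind a single switch so that enabling or disabling it cannot change the operational logic. The following is a minimal sketch of that pattern; the flag name and logging behavior are assumptions for illustration, not a prescribed design.

```python
# Minimal sketch: built-in-test instrumentation isolated behind one flag so the
# tested configuration and the delivered configuration share the same code path.
import logging
import os

# Hypothetical switch; in an embedded system this might be a build-time option.
BIT_ENABLED = os.environ.get("BIT_ENABLED", "0") == "1"

log = logging.getLogger("bit")

def bit_probe(name: str, value: float) -> None:
    """Record an internal observation without altering operational behavior."""
    if BIT_ENABLED:
        log.info("BIT probe %s = %r", name, value)

def compute_setpoint(sensor_reading: float) -> float:
    # Operational logic is identical whether or not BIT is enabled; the probe
    # only observes, so removing BIT cannot change the delivered behavior.
    setpoint = max(0.0, min(100.0, sensor_reading * 0.8))
    bit_probe("setpoint", setpoint)
    return setpoint
```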
2.1.6.7 GEN-TTE-7 Tests Not Delivered
Description: Test assets are not delivered along with the system/software.
Potential Symptoms:
• The delivery of test assets (e.g., test cases, test oracles, test drivers/scripts, test stubs, and test environments) is neither required nor planned.
• Test assets are not delivered along with the system/software.
Potential Consequences:
• It may be unnecessarily difficult to perform testing during maintenance.
• There may be inadequate regression testing as the delivered system/software is updated.
• Some post-delivery testing may not be performed, so some post-delivery defects may not be found and fixed.
Potential Causes:
• The delivery of test assets is not an explicit part of the testing process.
• The delivery of test assets is not mentioned in the System Engineering Management Plan (SEMP), System/Software Development Plan (SDP), or Test and Evaluation Master Plan (TEMP).
Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Ensure that the migration-to-maintenance section of the system development contract, or the associated list of deliverables, includes the delivery of all test assets needed to perform testing after delivery.
Related Problems: GEN-TPS-2 Incomplete Test Planning, GEN-TOP-2 Unclear Testing Responsibilities

2.1.6.8 GEN-TTE-8 Inadequate Test Configuration Management (CM)
Description: Testing work products are not under configuration control.
Potential Symptoms:
• Test environments, test cases, and other testing work products are not under configuration control.
• Inconsistencies are found between the current versions of the system/software under test and the test cases and test environments.
Potential Consequences:
• Test environments, test cases, and other testing work products may cease to be consistent with the system/software being tested and with each other. [Note that a closely related problem is that the system/software under test (SUT) itself is not under configuration control. Incompatibilities will also occur if the SUT is informally updated with undocumented and uncontrolled “fixes” without the test team being aware.]
• It may be impossible to reproduce tests (i.e., get the same test results given the same preconditions and test stimuli).
• It may be much more difficult to know whether the correct versions of the system, test environment, and tests are being used during testing.
• There may be an increase in false positive and false negative test results.
• False positive test results due to incorrect version control may lead to incorrect fixes and the resulting insertion of defects into the system/software.
• Unnecessary effort may be wasted identifying and fixing CM problems.
• Some defects may not be found, and an unnecessary number of these defects may make it through testing and into the deployed system.
Potential Causes:
• Placing the test assets under configuration management is not an explicit part of either the configuration management or the testing process.
• The CM of test assets is not mentioned in the System Engineering Management Plan (SEMP), System/Software Development Plan (SDP), Test and Evaluation Master Plan (TEMP), or System/Software Test Plan (STP).
Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Ensure that all test plans, procedures, test cases, test environments, and other testing work products are placed under configuration control before they are used.
• Ensure that the right versions of the test environment components are used so that the test environments are restored to the correct known state prior to each new testing cycle (one supporting practice is sketched below).
Related Problems: None
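One simple supporting practice is to stamp every test run with the configuration identifiers of the SUT, the test suite, and the test environment, so that any result can later be traced to the exact versions used. The sketch below assumes these artifacts live in git repositories; the repository paths are placeholders.

```python
# Illustrative sketch: stamp each test run with the exact configurations used,
# so results are reproducible and version mismatches are detectable later.
import json
import subprocess
import time

def git_commit(repo_path: str) -> str:
    """Return the current commit hash of a repository (assumes git is installed)."""
    return subprocess.check_output(
        ["git", "-C", repo_path, "rev-parse", "HEAD"], text=True
    ).strip()

def write_run_manifest(path: str = "test_run_manifest.json") -> None:
    manifest = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        # Paths are placeholders for wherever CM keeps these baselines.
        "sut_version": git_commit("./sut"),
        "test_suite_version": git_commit("./tests"),
        "environment_version": git_commit("./test-environment"),
    }
    with open(path, "w") as f:
        json.dump(manifest, f, indent=2)

if __name__ == "__main__":
    write_run_manifest()
```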
2.1.7 Test Communication Problems
The following testing problems are related to communication, including documentation and reporting:
• GEN-COM-1 Inadequate Defect Reports
• GEN-COM-2 Inadequate Test Documentation
• GEN-COM-3 Source Documents Not Maintained
• GEN-COM-4 Inadequate Communication Concerning Testing

2.1.7.1 GEN-COM-1 Inadequate Defect Reports
Description: Some defect (a.k.a. bug and trouble) reports are incomplete or contain incorrect information. [This is especially a problem when the fault/failure is intermittent and inherently difficult to reproduce.]
Potential Symptoms:
• Some defect reports are incomplete (e.g., they lack some of the following information):
  – Summary – a one-sentence summary of the fault/failure
  – Detailed Description – a relatively comprehensive description of the failure
  – Author – the name and contact information of the person reporting the defect
  – System – the version (build) or variant of the system
  – Environment – the software infrastructure, such as OS, middleware, and database types and versions
  – Location – the author’s assessment of the subsystem or module that contains the defect that caused the fault/failure
  – Priority and Severity – the author’s assessment of the priority and severity of the defect
  – Steps – the steps to be followed to replicate the fault/failure (if reproducible by the report’s author), including:
    · Preconditions (e.g., system mode or state and stored data values)
    · Trigger events, including input data
    · Actual behavior, including fault/failure warnings, cautions, or advisories
    · Expected behavior
  – Comments
  – Attachments (e.g., screen shots or logs)
• Some defect reports contain incorrect information.
• Defect reports are returned with comments such as “not clear” and “need more information”.
• Developers/testers contact the defect report’s author for more information.
• Different individuals or teams use different defect report templates, content/format standards, or test management tools. [This is especially likely when a prime contractor/system integrator and subcontractors are involved in development.]
Potential Consequences:
• Testers will be unable to reproduce the faults/failures and thereby identify the underlying defects.
• It will take longer for developers/testers to identify and diagnose the underlying defects.
Potential Causes:
• TBD
Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Use templates and standards to specify the content and format of defect reports (a hypothetical machine-checkable template is sketched below). Use a test management tool to enter and manage the defect reports.
• To the extent practical, ensure that all defect reports are reviewed for completeness, duplication, and scope by the test manager or the change control board (CCB) before being assigned to: [It is critical to ensure that this review does not become a bottleneck. The review should not happen more than once, and the correct time to perform it may well depend on who authored the defect report and on the defect resolution process.]
  – individual testers for testing and analysis (for defect reports not authored by testers)
  – developers for analysis and fixing (for defect/test reports authored by testers)
Related Problems: None
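To make the completeness checks above mechanical rather than manual, the required fields can be encoded directly in the defect-tracking tooling. The sketch below is one hypothetical way to do that; the field set mirrors the symptom list above, and the validation rule is an assumption, not a prescribed standard.

```python
# Hypothetical defect-report record whose required fields mirror the checklist
# above; a report should not be filed until the required fields are present.
from dataclasses import dataclass, field, fields

@dataclass
class DefectReport:
    summary: str                  # one-sentence summary of the fault/failure
    description: str              # detailed description of the failure
    author: str                   # name and contact info of the reporter
    system_version: str           # version (build) or variant of the system
    environment: str              # OS, middleware, database types and versions
    steps_to_reproduce: list[str]
    actual_behavior: str
    expected_behavior: str
    location: str = ""            # suspected subsystem/module (may be unknown)
    priority: str = ""            # reporter's assessment; triage may revise it
    severity: str = ""
    attachments: list[str] = field(default_factory=list)

    def missing_fields(self) -> list[str]:
        """Return the names of required fields left empty."""
        required = ("summary", "description", "author", "system_version",
                    "environment", "steps_to_reproduce",
                    "actual_behavior", "expected_behavior")
        return [f.name for f in fields(self)
                if f.name in required and not getattr(self, f.name)]
```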
2.1.7.2 GEN-COM-2 Inadequate Test Documentation
Description: Some test documents are incomplete or contain incorrect information.
Potential Symptoms:
• Some test documentation is inadequate for defect identification and analysis, regression testing, test automation, reuse, and quality assurance of the testing process. [This is often caused by managers attempting to decrease the testing effort and thereby meet schedule deadlines, or by processes developed by people who do not have adequate testing training and experience.]
• Some test documentation templates or format/content standards are either missing or incomplete.
• Test scripts/cases do not completely describe test preconditions, test trigger events, test input data, expected/mandatory test outputs (data and commands), and expected/mandatory test postconditions.
• An agile approach is being used by developers with little testing expertise and experience.
• Testing documents are not maintained or placed under configuration management.
Potential Consequences:
• Testing assets (e.g., test documents, environments, and test cases) may not be sufficiently documented to be used by:
  – testers to drive test automation
  – testers to perform regression testing, either during initial development or during maintenance
  – quality assurance personnel and customer representatives during evaluation and oversight of the testing process
  – testers other than the original test developer (e.g., those performing integration, system, system of systems, and maintenance testing)
  – test teams from other projects developing/maintaining related systems within a product family or product line
• Tests may not be reproducible. It may take longer to identify and fix some of the underlying defects, thereby causing some test deadlines to be missed.
• Maintenance costs may be needlessly high. Insufficient regression testing may be performed.
• The reuse of testing assets may be needlessly low, thereby unacceptably increasing the costs, schedule, and effort spent recreating testing assets.
Potential Causes:
• The content (and format) of test documents may not be an explicit part of the testing process and thus not addressed in document templates or content/format standards.
• Testers may not appreciate the need for good test documentation, which tends to occur when an agile development method is used (e.g., because of the emphasis on minimizing documentation and on having each development team determine its own documentation needs on a case-by-case basis).
Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Use the contract, test plans, test training, test process documents, and test standards to specify the required test documents and ensure that test work products are adequately documented.
• Ensure that test cases completely describe test preconditions, test trigger events, test input data, mandatory/expected test outputs (data and commands), and mandatory/expected system postconditions.
• When using an iterative, incremental, and parallel (agile) development cycle in which the components under test change frequently, concentrate on making the associated executable testing work products self-documenting (rather than using separate testing documentation) so that the components and their testing work products are more likely to be changed together and thereby remain consistent (an example appears after this list).
• Use common standard templates for test documents (e.g., test plans, test cases, test procedures, and test reports).
• Use a test documentation tool or database to record test reports.
• When using a database to store test results, make sure that its schema supports easy searches.
• Clearly identify the versions of the software, test environment, test cases, etc., to ensure consistency.
Related Problems: GEN-TOP-2 Unclear Testing Responsibilities, GEN-TTE-8 Inadequate Test Configuration Management, GEN-PRO-9 Inadequate Test Maintenance
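The self-documenting approach recommended above can be as simple as encoding preconditions, trigger, inputs, and expected results in the executable test itself. Below is a minimal, hypothetical pytest example; the unit under test (a withdraw function) and its behavior are invented purely for illustration.

```python
# Minimal sketch of a self-documenting test case: the docstring and the
# structure of the test record preconditions, trigger, inputs, and expected
# outcome, so no separate test-design document is needed to rerun it.
import pytest

def withdraw(balance: float, amount: float) -> float:
    """Hypothetical unit under test: returns the new balance."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    if amount > balance:
        raise ValueError("insufficient funds")
    return balance - amount

def test_withdraw_rejects_overdraft():
    """Precondition: balance of 50.0.
    Trigger: withdrawal request of 80.0.
    Expected: the request is rejected (ValueError) and no balance change occurs."""
    with pytest.raises(ValueError):
        withdraw(balance=50.0, amount=80.0)
```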
  • Common Testing Problems: Pitfalls to Prevent and Mitigate 25 January 2013Descriptions, Symptoms, Consequences, Causes, and Recommendations TBD • Verify: Verify that TBD • Ensure that the requirements specifications, architecture documents, design documents, and other developmental documents that are needed as inputs to the development of tests are properly maintained. 4536 F • Ensure that testers are notified when these changes occur. • Testers should report the occurrence of this problem to project management including the test manager, the project manager, and the technical leader. Related Problems:GEN-TTE-8 Inadequate Test Configuration Management (CM), TTS-UNT- 1 Unstable Design2.1.7.4 GEN-COM-4 Inadequate Communication Concerning Testing Description: There is inadequate communication concerning testing among testers and other testing stakeholders. Potential Symptoms: • There is inadequate testing-related communication between: Teams within large or geographically-distributed programs Contractually separated teams (prime vs. subcontractor, system of systems) Between testers and: Other developers (requirements engineers, architects, designers, and implementers) Other testers Customer representatives, user representatives, and subject matter experts (SMEs) • For example, the developers fail to notify the testers of bug fixes and their consequences. • Testers fail to notify other testers of test environment changes (e.g., configurations and uses of different versions of hardware and software). Potential Consequences: • Some of the requirements may not be testable. • Some architectural decisions may make certain types of testing more difficult or impossible. • Safety and security concerns may not influence the level of testing of safety- and security- critical functionality. • Different test teams may have difficulty coordinating their testing and scheduling their use of common test environments. Potential Causes: • TBD Recommendations:45 As stated in GEN-TTE-5 Inadequate Test Configuration Management (CM), these test documents also need to be placed under configuration control.© 2012-2013 by Carnegie Mellon University Page 59 of 111
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Ensure that there is sufficient testing-related communication between and among the testers and the stakeholders in testing.
Related Problems: GEN-PRO-1 Testing Process Not Integrated Into Engineering Process

2.1.8 Requirements-related Testing Problems
Good requirements are complete, consistent, correct, feasible, mandatory (i.e., truly needed and not unnecessary architectural or design constraints), testable, and unambiguous. [There are actually quite a few characteristics that good requirements should exhibit; these are merely some of the more important ones.] Requirements that are deficient in any of these criteria decrease the testability of systems and software. Given poor requirements, black-box testing is relatively inefficient and ineffective, so testers may rely on higher-risk strategies, including white-box testing (e.g., structural testing such as path testing for code coverage). [At least this will help to get the system to where it runs without crashing, thereby providing a stable system that can be modified when the customer finally determines what the true requirements are.]
The following testing problems are directly related to requirements: [While the first five problem types below are violations of the characteristics of good requirements, there are many other such characteristics that are not listed. This inconsistency is because the first five tend to cause the most frequent and severe testing problems.]
• GEN-REQ-1 Ambiguous Requirements
• GEN-REQ-2 Missing Requirements
• GEN-REQ-3 Incomplete Requirements
• GEN-REQ-4 Incorrect Requirements
• GEN-REQ-5 Unstable Requirements
• GEN-REQ-6 Poor Derived Requirements
• GEN-REQ-7 Verification Methods Not Specified
• GEN-REQ-8 Lack of Requirements Tracing

2.1.8.1 GEN-REQ-1 Ambiguous Requirements
Description: Testing fails to expose certain defects because some of the requirements are ambiguous.
Potential Symptoms:
• Some of the requirements are ambiguous due to the use of:
  – inherently ambiguous words
  – undefined technical jargon (e.g., application-domain-specific terminology) and acronyms
  – misused contractual words such as “shall”, “should”, “may”, “recommended”, and “optional”
  – required quantities without associated units of measure
  – unclear synonyms, near synonyms, and false synonyms
• Inconsistencies result when requirements engineers and testers interpret the same requirement differently.
Potential Consequences:
• Testers may misinterpret the requirements, leading to incorrect black-box testing.
• Numerous false positive and false negative test results are observed because the tests were developed in accordance with the testers’, rather than the requirements engineers’, interpretation of the associated requirements.
• Specifically, ambiguous requirements will often give rise to incorrect test inputs and incorrect expected outputs (i.e., the test oracle is incorrect).
• Testers may have to spend significant time meeting with requirements engineers, customer/user representatives, and subject matter experts to clarify ambiguities so that testing can proceed.
Potential Causes:
• The people (e.g., requirements engineers and business analysts) who are engineering the requirements have not been adequately trained in how to recognize and avoid ambiguous requirements.
• The requirements team does not include anyone with testing expertise.
• The requirements have not been reviewed for ambiguity. The testers have not reviewed the requirements for ambiguity.
• The requirements reviewers are not using a requirements review checklist, or the checklist does not address ambiguous requirements.
• The textual requirements have not been analyzed by a tool that detects inherently ambiguous words.
• The requirements engineers are rushed because insufficient resources (e.g., time and staffing) have been allocated to properly engineer the requirements.
Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Improve the requirements engineering process and the associated training.
• Consider adding a senior test engineer to the requirements engineering team to help ensure that the requirements are unambiguous. [Because testers often begin outlining black-box test cases during requirements engineering based on the initial requirements, they are in a good position to identify requirements ambiguity.]
• Promote testability by ensuring that requirements are clear and unambiguous.
• Require that one or more testers review the requirements documents and each requirement for verifiability (especially testability) before it is approved for use.
• Encourage testers to request clarification of all ambiguous requirements, and encourage that the requirements be updated based on the clarification given.
• Verify that the requirements do not include inherently ambiguous words, undefined technical terms and acronyms, quantities without associated units of measure, or unclear synonyms (a crude automated scan is sketched below).
• Ensure that (1) the project has both a glossary and an acronym list and (2) the requirements include technical jargon and acronyms only if they are defined therein.
Related Problems: GEN-REQ-3 Incomplete Requirements, GEN-COM-4 Inadequate Communication Concerning Testing, TTS-SoS-2 Poor or Missing SoS Requirements
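A crude but useful automation of the "analyze the requirements text for inherently ambiguous words" recommendation above is a keyword scan. The sketch below is illustrative only; the word list is a small assumed sample, not a vetted ambiguity lexicon, and the unit check is intentionally naive.

```python
# Illustrative ambiguity scan: flag requirements statements containing words
# that commonly make requirements untestable. The word list is a small sample.
import re

AMBIGUOUS_WORDS = {
    "appropriate", "adequate", "user-friendly", "fast", "flexible",
    "as required", "if necessary", "etc", "and/or",
}

UNIT_PATTERN = r"\b\d+\b(?!\s*(ms|s|kg|m|%|users|mb|gb))"  # naive unit check

def scan_requirement(req_id: str, text: str) -> list[str]:
    findings = []
    lowered = text.lower()
    for word in AMBIGUOUS_WORDS:
        if re.search(r"\b" + re.escape(word) + r"\b", lowered):
            findings.append(f"{req_id}: ambiguous term '{word}'")
    if re.search(UNIT_PATTERN, lowered):
        findings.append(f"{req_id}: quantity without a recognized unit of measure?")
    return findings

if __name__ == "__main__":
    print(scan_requirement("REQ-42", "The system shall respond fast to 100 requests."))
```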
2.1.8.2 GEN-REQ-2 Missing Requirements
Description: Testing fails to expose certain defects because some of the requirements are missing.
Potential Symptoms:
• Some requirements are missing, such as:
  – requirements specifying mandatory responses to abnormal conditions (e.g., error, fault, and failure detection and reaction) [These requirements are often critical for achieving adequate reliability, robustness, and safety. While requirements often specify how the system should behave, they rarely specify how the system should behave in those cases where it does not or cannot behave as it should. It is equally critical that testing verify that the system does not do what it should not do (i.e., that the system meets its negative as well as its positive requirements).]
  – quality requirements (e.g., availability, interoperability, maintainability, performance, portability, reliability, robustness, safety, security, and usability)
  – data requirements
  – requirements specifying system behavior during non-operational modes (e.g., start-up, degraded mode, training, and shut-down)
Potential Consequences:
• Tests cannot be developed for missing requirements.
• Requirements-based testing will not reveal the missing behavior and characteristics.
• Customer representatives and developers will have a false sense of security that the system will function properly on delivery and deployment.
• Testers may have to spend a sizable amount of time meeting with requirements engineers and customer/user representatives in order to identify missing requirements whose existence was implied by failed tests.
• Defects associated with missing requirements may not be found and may therefore make it through testing and into the deployed system.
Potential Causes:
• Use cases only define normal paths (a.k.a. sunny day, happy path, and golden path) and not fault-tolerant and failure paths (a.k.a. rainy day paths or alternative flows). [This is not to imply that the testing of normal paths is unnecessary. However, it is often incorrectly assumed to be sufficient. Software can misbehave in many more ways than it can work properly, and defects are more likely to be triggered by boundary conditions or rainy day paths than by sunny day paths. Thus, there should typically be more boundary and invalid-condition test cases than normal-behavior test cases.]
• The stakeholders have not reviewed the set of requirements for missing requirements.
• The requirements have not been reviewed to ensure that they contain robustness requirements that mandate the detection of and proper reaction to input errors, system faults (e.g., incorrect system-internal modes, states, or data), and system failures.
• The requirements engineers are rushed because insufficient resources (e.g., time and staffing) have been allocated to properly engineer the requirements.
Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Improve the requirements engineering process and the associated training.
• Consider adding a tester to the requirements engineering team to help ensure that the requirements specify the rainy day situations that must be addressed to achieve error, fault, and failure tolerance.
• Promote testability by ensuring that use case analysis adequately addresses error-, fault-, and failure-tolerant (i.e., rainy day) paths as well as normal (sunny day or golden) paths.
• Ensure that the requirements repository includes an appropriate number of quality and data requirements.
• Ensure that one or more requirements stakeholders (e.g., customer representatives, user representatives, subject matter experts) review the requirements documents and requirements repository contents for missing requirements before they are accepted and approved for use.
• Ensure that higher-level requirements are traced to lower-level (derived) requirements so that it is possible to verify that the lower-level requirements, if met, are sufficient to meet the higher-level requirements.
Related Problems: GEN-COM-4 Inadequate Communication Concerning Testing, TTS-SoS-2 Poor or Missing SoS Requirements

2.1.8.3 GEN-REQ-3 Incomplete Requirements
Description: Testing fails to expose certain defects because some of the individual requirements are incomplete.
Potential Symptoms:
• Individual requirements are incomplete and lack (where appropriate) some of the following components: [Not all of these components are needed for every requirement. However, stakeholders and requirements engineers often assume them to be implicitly part of the requirements and thus unnecessary to state explicitly, and tests that ignore these missing parts of incomplete requirements can easily yield incorrect results.]
  – trigger events
  – preconditions
  – mandatory quantitative thresholds
  – mandatory postconditions
  – mandatory outputs
Potential Consequences:
• Testing will be incomplete or may return incorrect (i.e., false negative and false positive) results.
• Some defects associated with incomplete requirements may not be found and may therefore make it through testing and into the deployed system.
Potential Causes:
• The people (e.g., requirements engineers and business analysts) who are engineering the requirements have not been adequately trained in how to recognize and avoid incomplete requirements.
• The requirements team does not include anyone with testing expertise.
• The individual requirements have not been reviewed for completeness. The testers have not reviewed each requirement for completeness.
• The requirements reviewers are not using a requirements review checklist, or the checklist does not address incomplete requirements.
• The requirements engineers are rushed because insufficient resources (e.g., time and staffing) have been allocated to properly engineer the requirements.
Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Improve the requirements engineering process and the associated training.
• Consider adding a tester to the requirements engineering team to help ensure that the requirements are sufficiently complete to enable testers to develop test inputs and determine the correct associated outputs.
• Ensure that the individual requirements are complete (e.g., via templates, guidelines, and inspections).
• Ensure that the testers and one or more requirements stakeholders review the requirements documents and requirements repository contents for incomplete requirements before they are accepted and approved for use.
Related Problems: GEN-REQ-1 Ambiguous Requirements, GEN-COM-4 Inadequate Communication Concerning Testing

2.1.8.4 GEN-REQ-4 Incorrect Requirements
Description: Testing fails to expose certain defects because some of the requirements are incorrect.
Potential Symptoms:
• Requirements are determined to be incorrect after the associated black-box tests have been developed and run.
Potential Consequences:
• Testing results include many false positive and false negative results.
• The tests associated with the incorrect requirements must be modified or replaced and then rerun, potentially from scratch.
• Some defects caused by incorrect requirements may not be found and may therefore make it through testing and into the deployed system.
Potential Causes:
• The stakeholders have not reviewed the requirements for correctness.
• Stakeholders are not available to validate the requirements. Insufficient resources (e.g., time and staffing) are allocated to properly engineer the requirements.
Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Improve the requirements engineering process.
• Ensure that the requirements are sufficiently validated by the requirements stakeholders (e.g., customer representatives, user representatives, subject matter experts) before they are accepted by the testers and large numbers of associated test cases are developed based on them.
Related Problems: GEN-SIC-3 Lack of Stakeholder Commitment, TTS-SoS-2 Poor or Missing SoS Requirements

2.1.8.5 GEN-REQ-5 Unstable Requirements
Description: Testing is problematic because of the volatility of many of the requirements. [This testing problem is similar to, but more general than, the preceding problem (Incorrect Requirements) because fixing incorrect requirements is one potential reason that the requirements may be volatile. Other reasons include engineering missing requirements and changing stakeholder needs.]
Potential Symptoms:
• The requirements are continually changing: new requirements are being added, and existing requirements are being modified and deleted.
• The requirements selected for implementation are not frozen, especially during a short-duration increment (e.g., a Scrum sprint) when using an incremental, iterative, and parallel (a.k.a. agile) development cycle.
Potential Consequences:
• Test cases (test inputs, preconditions, and expected test outputs) and automated regression tests are made obsolete by requirements changes.
• Significant time originally scheduled for developing and running new tests is spent in testing churn (fixing and rerunning broken tests).
• As testing schedules fall further behind, regression tests are not maintained and rerun.
• Broken tests may be abandoned.
Potential Causes:
• The requirements are not well understood by the requirements stakeholders.
• Many of the requirements are being rapidly iterated because they do not exhibit the characteristics of good requirements.
• The actual requirements are rapidly changing due to changes in the system’s environment (e.g., new competing systems, rapidly changing threats, and changing markets).
• These potential causes can be exacerbated when using a development cycle with many short-duration iterative increments.
• The requirements engineers are rushed because insufficient resources (e.g., time and staffing) have been allocated to properly engineer the requirements.
Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Where practical, ensure that the requirements are reasonably stable before developing test cases (test scripts, test inputs, preconditions, and expected test outputs) and automating regression tests. [Note that this may be impossible or impractical due to delivery schedules and the amount of testing required.]
• Consider adding a tester to the requirements engineering team as a liaison to the testers so that the testers know which requirements are most likely to be sufficiently stable for them to begin developing the associated tests.
Related Problems: GEN-COM-4 Inadequate Communication Concerning Testing

2.1.8.6 GEN-REQ-6 Poor Derived Requirements
Description: Testing is problematic due to problems with derived requirements.
Potential Symptoms:
• Derived requirements merely restate their associated parent requirements.
• Newly derived requirements are not at the proper level of abstraction (e.g., subsystem requirements at the same level of abstraction as the system requirements from which they were derived). Note that the first symptom is often an example of this second symptom.
• The set of lower-level requirements derived from a higher-level requirement is necessary but not sufficient (i.e., meeting the lower-level requirements does not imply meeting the higher-level requirement).
• A derived requirement is not actually implied by its “source” requirement.
• Restrictions implied by architecture and design decisions are not being used to derive requirements in the form of derived architecture or design constraints.
Potential Consequences:
• It will be difficult to produce tests at the correct level of abstraction.
• Testing at the unit and subsystem level for these derived requirements may be incomplete.
• Associated lower-level defects may not be detected during testing.
Potential Causes:
• The people (e.g., requirements engineers and business analysts) who are engineering the requirements have not been adequately trained in how to derive new requirements at the appropriate level of abstraction.
• The requirements engineers are rushed because insufficient resources (e.g., time and staffing) have been allocated to properly engineer the requirements.
Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Review the derived and allocated requirements to ensure that they are at the proper level of abstraction and exhibit all of the standard characteristics of good requirements (e.g., complete, consistent, correct, feasible, mandatory, testable, and unambiguous).
Related Problems: GEN-REQ-8 Lack of Requirements Tracing

2.1.8.7 GEN-REQ-7 Verification Methods Not Specified
Description: Each requirement has not been allocated one or more verification methods (e.g., analysis, demonstration, inspection, simulation, testing). [Note that it may be adequate for a group of requirements to be allocated one or more verification methods; the individual requirements in the group are then indirectly assigned verification methods via the group. This approach can potentially save time and effort and is therefore not an example of this problem.]
Potential Symptoms:
• The requirements specifications do not specify the verification method(s) for their associated requirements.
• The requirements repository does not include verification method(s) as requirements metadata. [There are multiple ways to specify verification methods, and the appropriate one(s) to use will depend on the requirements engineering process.]
Potential Consequences:
• Testers and testing stakeholders may incorrectly assume that all requirements must be verified via testing, even though other verification methods may be adequate, more appropriate, or require less time or effort.
• Time may be spent testing requirements that should have been verified using another, more appropriate verification method.
• Requirements stakeholders may incorrectly assume that if a requirement is not testable, then it is also not verifiable.
Potential Causes:
• The requirements repository (or requirements management tool) schema does not include metadata for specifying verification methods.
• Specifying verification methods for each requirement is not an explicit part of the requirements engineering process.
• The requirements engineers are rushed because insufficient resources (e.g., time and staffing) have been allocated to properly engineer the requirements.
Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Ensure that each requirement (or set of similar requirements) has one or more appropriate verification methods assigned to it.
• Check the appropriateness of these verification methods during requirements inspections, walk-throughs, and reviews.
• Ensure that the verification methods actually used are consistent with the specified requirements verification methods, updating the requirements specifications and repositories when necessary.
• Consider adding a tester to the requirements engineering team to help ensure that the requirements verification methods are properly specified.
Related Problems: GEN-COM-4 Inadequate Communication Concerning Testing

2.1.8.8 GEN-REQ-8 Lack of Requirements Tracing
Description: The requirements are not traced to the individual test cases.
Potential Symptoms:
• There is no documented tracing from individual requirements to their associated test cases.
• The mapping from the requirements to the test cases is not stored in any project repository (e.g., a requirements management, test management, or configuration management tool).
• There may only be a backwards trace from the individual test cases to the requirement(s) they test.
• Any tracing that was originally created is not maintained as the requirements change.
Potential Consequences:
• There will be no easy way to plan testing tasks, determine whether all requirements have been tested, or determine what needs to be regression tested after changes occur.
• If requirements change, there will be no way of knowing which test cases need to be created, modified, or deleted.
Potential Causes:
• The requirements repository (or requirements management tool) schema does not include metadata for tracing requirements to test cases.
• Tracing requirements to tests is not an explicit part of the requirements engineering process.
• Insufficient staffing and time are allocated to tracing requirements.
• The tool support for tracing requirements is inadequate or nonexistent.
Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Create a tracing between the requirements and the test cases (one lightweight approach is sketched below).
• Include the tracing from requirements to tests as a test asset in the appropriate repository.
• Include generating and maintaining the tracing from requirements to test cases in the test plan(s).
• Evaluate the testing process and work products to ensure that this tracing is being properly performed.
• Allocate time in the project master schedule to perform this tracing.
Related Problems: None
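One lightweight way to create and maintain such a trace is to tag each automated test with the requirement(s) it verifies and generate the trace matrix from the test suite itself. The sketch below uses pytest markers; the requirement IDs and the unit under test are hypothetical, and the custom marker would normally be registered in pytest.ini to suppress warnings.

```python
# Illustrative requirements-to-test tracing via pytest markers: each test is
# tagged with the requirement IDs it verifies, and a trace matrix can then be
# generated by collecting the markers (e.g., in a conftest.py hook).
import pytest

def authenticate(user: str, password: str) -> bool:
    """Stand-in unit under test, invented for this example."""
    return (user, password) == ("alice", "correct-horse")

@pytest.mark.requirement(ids=["SRS-101"])
def test_login_rejects_bad_password():
    assert authenticate("alice", "wrong") is False

@pytest.mark.requirement(ids=["SRS-101", "SRS-204"])
def test_login_accepts_valid_credentials():
    assert authenticate("alice", "correct-horse") is True
```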
2.2 Test Type Specific Problems
The following types of testing problems are related to the type of testing being performed:
• Unit Testing Problems
• Integration Testing Problems
• Specialty Engineering Testing Problems
• System Testing Problems
• System of Systems (SoS) Testing Problems
• Regression Testing Problems

2.2.1 Unit Testing Problems
The following testing problems are related to unit testing: [Note that because unit testing is typically the responsibility of the developers rather than of professional testers, the general problem of inadequate testing expertise, experience, and training often applies.]
• TTS-UNT-1 Unstable Design
• TTS-UNT-2 Inadequate Design Detail
• TTS-UNT-3 Unit Testing Considered Unimportant

2.2.1.1 TTS-UNT-1 Unstable Design
Description: Unit testing is problematic due to design volatility.
Potential Symptoms:
• Design changes (e.g., refactoring and new capabilities) cause the test cases to be constantly updated and test hooks to be lost. [This is especially true with agile development cycles with many short-duration increments and with projects where the handling of abnormal behavior is postponed until late increments.]
Potential Consequences:
• Unit tests will be unstable, requiring numerous changes and unit-level regression testing.
• Unit testing will take an unacceptably long time to perform.
Potential Causes:
• TBD
Recommendations:
• Promote testability by ensuring that the design is reasonably stable, so that test cases do not need to be constantly updated and test hooks are not lost due to refactoring and new capabilities. [This is especially important with agile development cycles with many short-duration increments and with projects where the handling of abnormal behavior is postponed until late increments.]
Related Problems: GEN-COM-3 Source Documents Not Maintained

2.2.1.2 TTS-UNT-2 Inadequate Design Detail
Description: Unit testing is problematic due to an inadequate level of design detail.
Potential Symptoms:
• There is insufficient design detail to drive the testing.
• Specifically, there is insufficient detail to support black-box (interface) and white-box (implementation) unit and integration testing.
Potential Consequences:
• Unit testing (especially regression testing during maintenance by someone other than the original developer) will be difficult to perform and repeat.
• Unit testing will take an unacceptably long time to perform. Unit-level defects may not be found.
Potential Causes:
• TBD
Recommendations:
• Ensure that the designers/programmers provide sufficient, well-documented design details to drive the unit testing.
Related Problems: None

2.2.1.3 TTS-UNT-3 Unit Testing Considered Unimportant
Description: Unit testing is done poorly and incompletely because the developers consider it to be unimportant.
Potential Symptoms:
• Developers consider unit testing to be unimportant, especially in relation to the actual development of the software.
• Developers feel that the testers will catch any defects they miss. [This problem is exacerbated by schedule pressures on the developers and by their tendency to try to show that their software works rather than to find the defects they have incorporated into it (see GEN-SIC-1 Wrong Testing Mindset).]
Potential Consequences:
• Unit testing is done poorly or incompletely.
• An unacceptably large number of defects that should have been found during unit testing pass through to integration and system testing, which are thereby slowed down.
Potential Causes:
• TBD
Recommendations:
• Ensure that the developers are clear about their testing responsibilities (see GEN-TOP-2 Unclear Testing Responsibilities).
• Ensure that they understand the importance of finding highly localized defects during unit testing, when they are much easier to localize, analyze, and fix.
• Establish clear unit testing success criteria that must be met before a unit can be delivered for integration and integration testing (an example of such criteria expressed as an executable quality gate appears below).
Related Problems: GEN-SIC-1 Wrong Testing Mindset
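As one hypothetical way to make unit-testing success criteria concrete, a project can gate delivery of a unit on all of its unit tests passing with a minimum coverage threshold. The sketch below assumes pytest with the pytest-cov plugin is installed; the 80% figure and the directory name are project-specific assumptions, not universal rules.

```python
# Hypothetical unit-test quality gate: a unit may be promoted to integration
# only if its unit tests all pass and statement coverage meets a threshold.
import subprocess
import sys

MIN_COVERAGE = 80  # percent; set by project policy, not a universal rule

def gate(unit_dir: str) -> bool:
    """Run the unit's tests; pytest-cov fails the run if coverage is too low."""
    result = subprocess.run(
        [sys.executable, "-m", "pytest", unit_dir,
         f"--cov={unit_dir}", f"--cov-fail-under={MIN_COVERAGE}"],
    )
    return result.returncode == 0

if __name__ == "__main__":
    sys.exit(0 if gate("src/flight_control") else 1)  # placeholder path
```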
2.2.2 Integration Testing Problems
The following testing problems are related to integration testing:
• TTS-INT-1 Defect Localization
• TTS-INT-2 Unavailable Components
• TTS-INT-3 Inadequate Self-Test

2.2.2.1 TTS-INT-1 Defect Localization
Description: Localizing defects is problematic due to the encapsulation caused by integration.
Potential Symptoms:
• It is difficult to determine the location of a defect:
  – in the new or updated operational software under test,
  – in the operational hardware under test,
  – in the COTS OS and middleware,
  – in the software test bed (e.g., in software simulations of hardware),
  – in the hardware test beds (e.g., in pre-production hardware),
  – in the tests themselves (e.g., in the test inputs, preconditions, expected outputs, and expected postconditions), or
  – in a configuration/version mismatch among them.
Potential Consequences:
• Defect localization will take an unacceptably large amount of time and effort.
• Errors in defect localization may cause the wrong fix to be made (e.g., the wrong changes or changes to the wrong software).
Potential Causes:
• TBD
Recommendations:
• Ensure that the architecture and design adequately support testability (i.e., provide the testers with sufficient visibility and control to develop and execute adequate tests).
• Ensure that the design and implementation (with exception handling, BIT, and test hooks), the tests, and the test tools make it relatively easy to determine the location of defects.
• Where appropriate, incorporate a test mode that logs information about errors, faults, and failures to support defect identification and localization (see the sketch below).
• Because a single type of defect often occurs in multiple locations, check similar locations for the same type of defect once a defect has been localized.
Related Problems: GEN-SIC-1 Wrong Testing Mindset
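Below is a minimal sketch of such a test mode, assuming a Python system in which a TEST_MODE switch turns on detailed logging at component boundaries; the decorator, the switch name, and the example component are invented for illustration.

```python
# Illustrative test-mode logging: when TEST_MODE is on, every call across a
# component boundary is logged with its arguments and result (or exception),
# which narrows down where in the integrated chain a failure arose.
import functools
import logging
import math
import os

TEST_MODE = os.environ.get("TEST_MODE") == "1"  # hypothetical switch
log = logging.getLogger("testmode")

def boundary(component: str):
    """Decorator marking a cross-component call for test-mode tracing."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if not TEST_MODE:
                return fn(*args, **kwargs)
            try:
                result = fn(*args, **kwargs)
                log.info("%s.%s(%r, %r) -> %r",
                         component, fn.__name__, args, kwargs, result)
                return result
            except Exception:
                log.exception("%s.%s(%r, %r) raised",
                              component, fn.__name__, args, kwargs)
                raise
        return wrapper
    return decorate

@boundary("navigation")
def compute_heading(x: float, y: float) -> float:
    return math.degrees(math.atan2(y, x))
```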
2.2.2.2 TTS-INT-2 Unavailable Components
Description: Integration testing is problematic due to the unavailability of needed system, software, or test environment components.
Potential Symptoms:
• The operational software, simulation software, test hardware, and actual hardware components (e.g., sensors, actuators, and network devices) are not available for integration into the test environments prior to scheduled integration testing.
Potential Consequences:
• Testing will not be able to begin until the missing components are available and have been integrated into the test environments. Testing may not be completed on schedule.
Potential Causes:
• TBD
Recommendations:
• Ensure that the operational software, simulation software, test hardware, and actual hardware components are available for integration into the test environments prior to scheduled integration testing.
• Ensure that the project budget and schedule include the effort and time required to develop and install the simulation software and test hardware.
• If necessary:
  – obtain components with lower fidelity for initial testing
  – develop simulators for the missing components
Related Problems: GEN-TTE-4 Poor Fidelity of Test Environments

2.2.2.3 TTS-INT-3 Inadequate Self-Test
Description: Testing is problematic due to a lack of system- or software-internal self-tests.
Potential Symptoms:
• The operational subsystem or software does not contain sufficient test hooks, built-in-test (BIT), or prognostics and health management (PHM) software.
Potential Consequences:
• Failures will be difficult to cause, reproduce, and localize.
• Testing will take an unacceptably long time to perform, potentially exceeding the test schedule.
Potential Causes:
• TBD
Recommendations:
• Ensure that the operational software or subsystem contains sufficient test hooks, built-in-test (BIT), or prognostics and health management (PHM) software so that failures are reasonably easy to cause, reproduce, and localize (a minimal registry-style sketch appears below).
Related Problems: None
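A minimal sketch of built-in self-test, assuming a registry of per-component health checks that can be run on demand; the component names and checks are stand-ins for real sensor, actuator, and resource checks.

```python
# Minimal sketch of built-in self-test (BIT): components register health checks
# that can be run on demand, so failures can be provoked, reproduced, and
# localized without external instrumentation. Names are illustrative.
from typing import Callable

_checks: dict[str, Callable[[], bool]] = {}

def register_check(name: str, check: Callable[[], bool]) -> None:
    _checks[name] = check

def run_self_test() -> dict[str, bool]:
    """Run every registered check; a False result identifies the failing component."""
    results = {}
    for name, check in _checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False  # a crashing check counts as a failure
    return results

# Example registrations (stand-ins for real sensor/actuator checks):
register_check("memory", lambda: True)
register_check("pressure_sensor", lambda: 0.9 <= 1.0 <= 1.1)

if __name__ == "__main__":
    print(run_self_test())
```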
2.2.3 Specialty Engineering Testing Problems
The following testing problems are related to the specialty engineering testing of quality characteristics and attributes: [Note that analogous testing problems could also exist for other quality characteristics.]
• TTS-SPC-1 Inadequate Capacity Testing
• TTS-SPC-2 Inadequate Concurrency Testing
• TTS-SPC-3 Inadequate Performance Testing
• TTS-SPC-4 Inadequate Reliability Testing
• TTS-SPC-5 Inadequate Robustness Testing
• TTS-SPC-6 Inadequate Safety Testing
• TTS-SPC-7 Inadequate Security Testing
• TTS-SPC-8 Inadequate Usability Testing
Note that specialty engineering tests tend to find the kinds of defects that are both difficult and costly to fix (e.g., because they often involve making architectural changes). Even though these are system-level quality characteristics, waiting until system testing is generally a bad idea. These types of testing (or other verification approaches) should begin relatively early during development.

2.2.3.1 TTS-SPC-1 Inadequate Capacity Testing
Description: An inadequate level of capacity testing is being performed.
Potential Symptoms:
• Not all capacity requirements are identified and specified.
• There is little or no testing to determine whether performance degrades gracefully as capacity limits are approached, reached, and exceeded.
• There is little or no verification of adequate capacity-related computational resources (e.g., memory utilization or processor utilization).
Potential Consequences:
• Testing is less likely to detect some defects causing violations of capacity requirements.
• The system may not meet its capacity requirements.
Potential Causes:
• TBD
Recommendations:
• Ensure that all capacity requirements are properly specified.
• Specify how capacity requirements will be verified (and tested) in a project test planning document.
• Ensure that all capacity requirements are adequately tested to determine performance as capacity limits are approached, reached, and exceeded.
• Use tools that simulate large numbers of simultaneous users (see the sketch below).
Related Problems: GEN-TPS-2 Incomplete Test Planning
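As a toy illustration of driving a system toward its capacity limit with simulated concurrent users: real capacity testing would use a dedicated load-generation tool, and the request function and user counts below are invented for the example.

```python
# Toy capacity probe: ramp up the number of simulated concurrent users and
# watch how response time degrades as the limit is approached and exceeded.
import concurrent.futures
import time

def one_request() -> float:
    """Stand-in for a real call to the system under test; returns latency (s)."""
    start = time.perf_counter()
    time.sleep(0.01)  # placeholder work
    return time.perf_counter() - start

def probe(users: int) -> float:
    with concurrent.futures.ThreadPoolExecutor(max_workers=users) as pool:
        latencies = list(pool.map(lambda _: one_request(), range(users)))
    return sum(latencies) / len(latencies)

if __name__ == "__main__":
    for users in (10, 100, 500, 1000):  # ramp toward the specified capacity limit
        print(f"{users:5d} simultaneous users: mean latency {probe(users):.4f}s")
```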
2.2.3.2 TTS-SPC-2 Inadequate Concurrency Testing
Description: An inadequate level of concurrency testing is being performed.
Potential Symptoms:
• The testing of concurrent behavior is not addressed in any test planning or process description documents.
• There is little or no testing performed explicitly to identify the defects that cause the common types of concurrency faults and failures: deadlock, livelock, starvation, priority inversion, race conditions, inconsistent views of shared memory, and unintentional infinite loops.
• Any concurrency testing that is performed is based on a random rather than a systematic approach to test case identification (e.g., based on the interleaving of threads).
• Any concurrency testing is performed manually.
• Concurrency faults and failures are only identified when they happen to occur while unrelated testing is being performed.
• Concurrency faults and failures occur infrequently and intermittently and are difficult to reproduce.
• Concurrency testing is performed using an environment with low fidelity with regard to concurrency:
  – threads rather than processes
  – a single processor rather than multiple processors
  – deterministic rather than probabilistic drivers and stubs
  – hardware simulation rather than actual hardware
Potential Consequences:
• Any concurrency testing is both ineffectual and labor intensive.
• Many defects that can cause concurrency faults and failures are not found and fixed until final system testing, operational testing, or system operation, when they are much more difficult to reproduce, localize, and understand.
Potential Causes:
• TBD
Recommendations:
• Provide testers with training in concurrency defects, faults, and failures.
• Use concurrency testing techniques that enable the systematic selection of a reasonable number of test cases (e.g., ways of interleaving the threads) from the impractically large number of potential test cases.
• For testing threads that share a single processor, use a concurrency testing tool that provides control over thread creation and scheduling.
• When such tools are unavailable or inadequate, develop scripts that:
  – automate the testing of deadlock and race conditions (see the sketch below)
  – enable the reproducibility of test inputs
  – record test results for analysis
• To the extent possible, do not rely on:
  – merely throwing large numbers of simultaneous inputs/requests at the system [Such tests may redundantly test the same interleaving of threads while leaving many interleavings untested. Unexpected determinism may even result in the exact same interleaving being performed over and over again.]
  – performing manual testing
Related Problems: GEN-TPS-2 Incomplete Test Planning, TTS-SPC-3 Inadequate Performance Testing
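As a small illustration of scripted race-condition testing, the following deliberately exercises a classic lost-update race on a shared counter; the shared resource and iteration counts are invented for the example, and (as the symptoms above warn) a single passing run proves very little.

```python
# Illustrative scripted race-condition test: two threads perform unsynchronized
# read-modify-write updates on a shared counter; if any update is lost, the
# final total is below the expected value and the race is reported.
import threading

class Counter:
    def __init__(self):
        self.value = 0

    def unsafe_increment(self):
        v = self.value      # read
        v += 1              # modify
        self.value = v      # write (another thread may have written in between)

def test_for_lost_updates(iterations: int = 100_000) -> bool:
    counter = Counter()

    def work():
        for _ in range(iterations):
            counter.unsafe_increment()

    threads = [threading.Thread(target=work) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    expected = 2 * iterations
    print(f"expected {expected}, got {counter.value}")
    return counter.value == expected

if __name__ == "__main__":
    # Run repeatedly: interleavings differ between runs, so one pass proves little.
    results = [test_for_lost_updates() for _ in range(5)]
    print("race detected" if not all(results) else "no race observed (not proof!)")
```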
2.2.3.3 TTS-SPC-3 Inadequate Performance Testing
Description: An inadequate level of performance testing is being performed.
Potential Symptoms:
• Performance requirements are not specified for all of the performance quality attributes: event schedulability, jitter, latency, response time, and throughput.
• There is little or no performance testing, or testing to determine whether performance degrades gracefully.
• There is little or no verification of adequate performance-related computational resources. [Examples include network and disk I/O, bus/network bandwidth, processor utilization, memory (RAM and disk) utilization, and database performance.]
• Performance testing is performed using a low-fidelity environment.
Potential Consequences:
• Testing is less likely to detect some performance defects.
• The system may not meet its performance requirements.
• Developers may have a false sense of security based on adequate performance under normal testing involving nominal loads and a subset of operational profiles.
Potential Causes:
• TBD
Recommendations:
• Specify how performance requirements will be verified (and tested) in a project test planning document.
• Create realistic workload models under all relevant operational profiles. Ensure that all performance requirements are properly identified and specified.
• Create or use existing (COTS) performance tools, such as a System Level Exerciser (SLE), to manage, schedule, perform, monitor, and report the results of performance tests.
• Measure performance under nominal conditions and exceptional (i.e., fault and failure tolerance) conditions, as well as under conditions of peak loading and graceful degradation (a small latency-measurement sketch appears below).
• As appropriate, run single-thread, multi-thread, and multi-processor/core tests.
• Ensure that all performance requirements are adequately tested, including all relevant performance attributes, operational profiles, and credible workloads.
Related Problems: GEN-TPS-2 Incomplete Test Planning, TTS-SPC-2 Inadequate Concurrency Testing
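To make a latency requirement such as "the 95th-percentile response time shall not exceed 200 ms" directly checkable, a test can collect a latency sample and compare percentiles against the threshold. The sketch below is illustrative; the operation being timed and the threshold are stand-ins for the project's actual requirement.

```python
# Illustrative latency check against a quantified performance requirement:
# collect N response times and compare the 95th percentile to the threshold.
import statistics
import time

THRESHOLD_MS = 200.0  # stand-in for the specified requirement
SAMPLES = 1000

def timed_operation() -> float:
    """Stand-in for one request to the system under test; returns latency in ms."""
    start = time.perf_counter()
    time.sleep(0.005)  # placeholder work
    return (time.perf_counter() - start) * 1000.0

def test_p95_latency():
    latencies = sorted(timed_operation() for _ in range(SAMPLES))
    p95 = latencies[int(0.95 * SAMPLES) - 1]
    print(f"p95 = {p95:.1f} ms, mean = {statistics.mean(latencies):.1f} ms")
    assert p95 <= THRESHOLD_MS, f"p95 latency {p95:.1f} ms exceeds {THRESHOLD_MS} ms"

if __name__ == "__main__":
    test_p95_latency()
```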
2.2.3.4 TTS-SPC-4 Inadequate Reliability Testing

Description: An inadequate level of reliability testing is being performed.
  Note: Reliability (load and stability) tests are nominal tests in the sense that they are executed within the performance envelope of the System Under Test (SUT). Capacity (stress) testing, in which you test for graceful degradation, is outside the scope of performance testing.

Potential Symptoms:
• There is little or no long-duration reliability testing (a.k.a., stability testing) under operational profiles.

Potential Consequences:
• Testing is less likely to detect some defects causing violations of reliability requirements (and data to enable the estimation of system reliability will not be collected).
• The system may not meet its reliability requirements.

Potential Causes:
• TBD

Recommendations:
• Ensure that all reliability requirements are properly identified and specified.
• Specify how reliability requirements will be verified (or tested) in a project test planning document.
• To the degree that testing (as opposed to analysis) is practical as a verification method, ensure that all reliability requirements undergo sufficient long-duration reliability testing (a.k.a., soak tests) under operational profiles to estimate the system's reliability (a minimal sketch follows this section).

Related Problems: GEN-TPS-2 Incomplete Test Planning, TTS-SPC-5 Inadequate Robustness Testing
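The following is a minimal sketch of such a soak-test loop in Python. The process_transaction() function is a hypothetical stub for one unit of operational-profile work; real soak tests run for hours or days and feed the attempt and failure counts into a reliability estimate.

```python
# Minimal sketch of a long-duration reliability ("soak") test loop (Python).
import time

def process_transaction(i):
    return True                       # stand-in: the real system call goes here

def soak(duration_s, report_every=10_000):
    start = time.monotonic()
    attempts = failures = 0
    while time.monotonic() - start < duration_s:
        attempts += 1
        try:
            if not process_transaction(attempts):
                failures += 1
        except Exception:
            failures += 1             # a real harness would log full failure context
        if attempts % report_every == 0:
            print(f"{attempts} attempts, failure rate {failures / attempts:.6f}")
    return attempts, failures

attempts, failures = soak(duration_s=10)   # hours or days in a real soak test
```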
2.2.3.5 TTS-SPC-5 Inadequate Robustness Testing

Description: An inadequate level of robustness testing is being performed.

Potential Symptoms:
• Robustness testing is not based on robustness analysis such as abnormal (i.e., fault, degraded mode, and failure) use case paths, Event Tree Analysis (ETA), Fault Tree Analysis (FTA), or Failure Modes Effects Criticality Analysis (FMECA).
• There is little or no robustness testing:
  – Error Tolerance Testing, the goal of which is to show that the system does not detect or react properly to input errors (a subtype of which is Fuzz Testing)
  – Fault Tolerance Testing, the goal of which is to show that the system does not detect or react properly to system faults (bad internal states)
  – Failure Tolerance Testing, the goal of which is to show that the system does not detect or react properly to system failures (failures to meet requirements)
  – Environmental Tolerance Testing, the goal of which is to show that the system does not detect or react properly to dangerous environmental conditions

Potential Consequences:
• Testing is less likely to detect some defects causing violations of robustness requirements. Some error, fault, failure, and environmental tolerance defects will not be found.
• The system may exhibit inadequate robustness.

Potential Causes:
• TBD

Recommendations:
• Ensure that all robustness requirements are properly identified and specified.
• Specify how robustness requirements will be verified (and tested) in a project test planning document.
• Ensure that there is sufficient testing of all robustness requirements to verify adequate error, fault, failure, and environmental tolerance (a minimal fuzz-testing sketch follows this section).
• Ensure that this testing is based on proper robustness analysis such as abnormal (i.e., fault, degraded mode, and failure) use case paths, Event Tree Analysis (ETA), Fault Tree Analysis (FTA), or Failure Modes Effects Criticality Analysis (FMECA).

Related Problems: GEN-TPS-2 Incomplete Test Planning, TTS-SPC-4 Inadequate Reliability Testing
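For the error tolerance case, the following is a minimal seeded fuzz-testing sketch in Python. The parse_record() function is a hypothetical input handler under test; the test oracle is deliberately weak (explicit rejection is acceptable, any other exception is a robustness defect), which is typical of fuzzing.

```python
# Minimal sketch of seeded fuzz testing for error tolerance (Python).
import random

def parse_record(data: bytes):
    if not data or b"," not in data:
        raise ValueError("malformed record")    # explicit, controlled rejection
    return data.split(b",")

def fuzz(trials=10_000, seed=0, max_len=64):
    rng = random.Random(seed)                   # reproducible fuzzing campaign
    for i in range(trials):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(max_len)))
        try:
            parse_record(data)
        except ValueError:
            pass                                # controlled rejection: acceptable
        except Exception as exc:                # anything else is a robustness defect
            print(f"defect: input {data!r} raised {exc!r} (seed={seed}, trial={i})")

fuzz()
```

Seeding the random generator is the design choice that makes an otherwise random technique reproducible, so any exposed defect can be replayed and localized.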
2.2.3.6 TTS-SPC-6 Inadequate Safety Testing

Description: An inadequate level of safety testing is being performed.

Potential Symptoms:
• There is little or no:
  – Testing based on safety analysis (e.g., abuse/mishap cases, ETA, or FTA)
  – Testing of safeguards (e.g., interlocks)
  – Testing of fail-safe behavior
  – Safety-specific testing:
    – Vulnerability Testing, the goal of which is to expose a system vulnerability (i.e., defect or weakness)
      Note: The term vulnerability (meaning a weakness in the system/software) applies to both safety and security. Vulnerabilities can be exploited by an abuser [either unintentional (safety) or intentional (security)] and contribute to the occurrence of an abuse [either a mishap (safety) or a misuse (security)].
    – Hazard Testing, the goal of which is to make the system cause a hazard to come into existence
    – Mishap Testing, the goal of which is to make the system cause an accident or near miss

Potential Consequences:
• Testing is less likely to detect some defects causing violations of safety requirements. Some defects with safety ramifications will not be found.
• The system may exhibit inadequate safety.

Potential Causes:
• TBD

Recommendations:
• Ensure that all safety-related requirements are properly identified and specified.
• Specify how safety requirements will be verified (and tested) in a project test planning document.
• Ensure that there is sufficient black-box testing of all safety requirements and sufficient white-box testing of safeguards (e.g., interlocks) and fail-safe behavior.
• Ensure that this testing is based on adequate safety analysis (e.g., abuse/mishap cases) as well as the safety architecture and design.

Related Problems: GEN-TPS-2 Incomplete Test Planning, TTS-SPC-7 Inadequate Security Testing

2.2.3.7 TTS-SPC-7 Inadequate Security Testing

Description: An inadequate level of security testing is being performed.

Potential Symptoms:
• There is little or no:
  – Testing based on security analysis (e.g., attack trees or abuse/misuse cases)
  – Testing of security controls (e.g., access control, encryption/decryption, or intrusion detection)
  – Testing of fail-secure behavior
  – Security-specific testing:
    – Penetration Testing, the goal of which is to penetrate the system's defenses
    – Fuzz Testing, the goal of which is to cause the system to fail due to random input
    – Vulnerability Testing, the goal of which is to expose a system vulnerability (i.e., defect or weakness)

Potential Consequences:
• Testing is less likely to detect some defects causing violations of security requirements.
  Note: Warning: although doing so is a bad idea, security requirements are sometimes specified in a security document rather than in the requirements specification/repository. Similarly, security testing is sometimes documented in security rather than testing documents.
• Some vulnerabilities and other defects having security ramifications will not be found.
• The system may exhibit inadequate security.

Potential Causes:
• TBD

Recommendations:
• Ensure that all security-related requirements are properly identified and specified.
• Specify how security requirements will be verified (and tested) in a project test planning document.
• Ensure that all system actors are documented (e.g., profiled).
• Ensure that there is sufficient security testing (e.g., penetration testing) of all security requirements, security features, security controls, and fail-secure behavior (see the sketch below).
• Ensure that this testing is based on adequate security analysis (e.g., attack trees and abuse/misuse cases).
• Use static vulnerability analysis tools to identify commonly occurring security vulnerabilities.

Related Problems: GEN-TPS-2 Incomplete Test Planning, TTS-SPC-6 Inadequate Safety Testing
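As one small illustration of testing a security control and fail-secure behavior, the following Python unittest sketch checks that an illustrative access-control function denies unauthorized roles and unknown actions. It is a negative test of a single control, not a substitute for penetration testing; the role/permission table and function names are hypothetical.

```python
# Minimal sketch of a negative security-control test (Python, unittest).
import unittest

ROLE_PERMISSIONS = {"admin": {"read", "write"}, "guest": {"read"}}

def check_access(role: str, action: str) -> bool:
    # fail-secure design: unknown roles and unknown actions are denied
    return action in ROLE_PERMISSIONS.get(role, set())

class SecurityControlTest(unittest.TestCase):
    def test_guest_cannot_write(self):
        self.assertFalse(check_access("guest", "write"))

    def test_unknown_role_denied(self):
        self.assertFalse(check_access("intruder", "read"))

    def test_unknown_action_denied(self):
        self.assertFalse(check_access("admin", "delete"))

if __name__ == "__main__":
    unittest.main()
```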
2.2.3.8 TTS-SPC-8 Inadequate Usability Testing

Description: An inadequate level of usability testing is being performed.

Potential Symptoms:
• There is little or no explicit usability testing of the system's or software's human interfaces.
• Specifically, there is no testing of usability quality attributes such as: accessibility, attractiveness (a.k.a., engagability, preference, and stickiness), credibility (a.k.a., trustworthiness), differentiation, ease of entry, ease of location, ease of remembering, effectiveness, effort minimization, error minimization, predictability, learnability, navigability, retrievability, suitability (also known as appropriateness), understandability, and user satisfaction.
  Note: The term "stickiness" is typically used with reference to web pages and refers to how long users remain at (that is, remain "stuck" to) given web pages.

Potential Consequences:
• Testing is less likely to detect some defects causing violations of usability requirements.
• Some defects with usability ramifications will not be found.
• The system may exhibit inadequate usability.

Potential Causes:
• TBD

Recommendations:
• Ensure that all usability requirements are properly identified and specified.
• Specify how usability requirements will be verified (and tested) in a project test planning document.
• Ensure that there is sufficient usability testing of the human interfaces. Include usability testing for all relevant usability attributes such as accessibility, attractiveness (also known as engagability, preference, and stickiness), credibility (also known as trustworthiness), differentiation, ease of entry, ease of location, ease of remembering, effectiveness, effort minimization, error minimization, learnability, navigability, retrievability, suitability (also known as appropriateness), understandability, and user satisfaction.

Related Problems: GEN-TPS-2 Incomplete Test Planning

2.2.4 System Testing Problems

The very nature of system testing often ensures that these problems cannot be eliminated. At best, the recommended solutions can only mitigate them.

The following testing problems are related to system testing:
• TTS-SYS-1 Testing Robustness Requirements is Difficult
• TTS-SYS-2 Lack of Test Hooks
• TTS-SYS-3 Testing Code Coverage is Difficult

2.2.4.1 TTS-SYS-1 Testing Robustness Requirements is Difficult

Description: The testing of robustness requirements (specifying error, fault, and failure tolerance) is difficult.
  Note: An error is bad input (from a human, another system, or hardware). A fault is an encapsulated (information hiding) incorrect state or incorrect stored data. A failure is an externally visible incorrect response (e.g., output data or control) that typically is a violation of some requirement. An error may or may not result in a fault depending on whether it is stored and whether there is error tolerance. A fault may or may not cause a failure depending on whether it is executed and whether there is fault tolerance.

Potential Symptoms:
• It is difficult for tests of the integrated system to cause local faults (i.e., faults internal to a subsystem) in order to test for fault tolerance.

Potential Consequences:
• The system or software is less testable because it is less controllable (e.g., with respect to causing local faults).
• Less robustness testing will be done, and the delivered system will contain an unacceptably large number of defects that lessen error, fault, and failure tolerance.

Potential Causes:
• TBD

Recommendations:
• Ensure that robustness requirements are specified and associated architecture/design decisions are documented.
• Ensure adequate test tool support, or ensure that sufficient robustness support (including error, fault, and failure logging) is incorporated into the system to enable adequate testing for tolerance (e.g., by causing encapsulated errors and faults and observing the resulting robustness).
• Where appropriate, incorporate test hooks, built-in test (BIT), fault logging (possibly triggered by exception handling), a prognostics and health management (PHM) function or subsystem, or some other way to overcome information hiding in order to verify test case preconditions and postconditions.

Related Problems: TTS-SPC-5 Inadequate Robustness Testing

2.2.4.2 TTS-SYS-2 Lack of Test Hooks

Description: System testing is difficult because temporary test hooks have been removed.

Potential Symptoms:
• Internal test hooks and testing software have been removed prior to system testing (e.g., for security or performance reasons).

Potential Consequences:
• It will be difficult to test locally implemented requirements.
• Such requirements will not be verified at the system level because of decreased testability due to low controllability and observability.

Potential Causes:
• TBD

Recommendations:
• Ensure that unit and integration testing have adequately tested locally implemented and encapsulated requirements that are difficult to verify during system testing. Use a test/logging system mode (if one exists), as sketched below.

Related Problems: None
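One way to keep such controllability and observability without shipping permanently enabled hooks is to gate them behind an explicit test mode rather than deleting them before system testing. The following Python sketch is illustrative only; the valve class, environment variable, and method names are hypothetical.

```python
# Minimal sketch of a retained-but-gated test hook (Python).
import os

TEST_MODE = os.environ.get("SYSTEM_TEST_MODE") == "1"

class FuelValve:
    def __init__(self):
        self._state = "closed"            # encapsulated internal state

    def command(self, state: str):
        self._state = state
        if TEST_MODE:
            print(f"[BIT] valve state -> {self._state}")   # observability hook

    def inject_fault(self, state: str):
        if not TEST_MODE:                 # fail-secure: disabled in operation
            raise PermissionError("fault injection disabled outside test mode")
        self._state = state               # controllability hook for fault-tolerance tests
```

The design trade-off is that the hook remains in the delivered system, so it must itself be reviewed and tested for the security and performance concerns that motivated removal in the first place.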
2.2.4.3 TTS-SYS-3 Testing Code Coverage is Difficult

Description: Ensuring that tests provide adequate code coverage is difficult.

Potential Symptoms:
• It is difficult for tests of the integrated system to demonstrate code coverage.
  Note: Code coverage is typically very important for software with safety or security ramifications. When software is categorized by safety or security significance, the mandatory rigor of testing (including the completeness of coverage) increases as the safety and security risk increases (e.g., from function coverage through statement coverage, decision or branch coverage, and condition coverage to path coverage).

Potential Consequences:
• Adequate code coverage, as mandated for mission-, safety-, and security-critical software, will not be verified.
• The system will not receive its safety and security accreditation and certification until code coverage is verified.

Potential Causes:
• TBD

Recommendations:
• Ensure that unit and integration testing (including regression testing) have demonstrated sufficient code coverage so that code coverage need not be demonstrated at the system level.
• Use software test tools or probes to measure and report code coverage (see the sketch below).

Related Problems: None
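For example, with the Python coverage.py package (pip install coverage), branch coverage can be measured while running existing unit tests; many projects drive this from the command line or CI instead of the API. The module and test-directory names below are hypothetical.

```python
# Minimal sketch of measuring branch coverage during unit tests with coverage.py.
import coverage
import unittest

cov = coverage.Coverage(branch=True)    # measure decision/branch coverage
cov.start()

import my_module                        # hypothetical module under test
suite = unittest.defaultTestLoader.discover("tests")   # hypothetical tests/ directory
unittest.TextTestRunner().run(suite)

cov.stop()
cov.save()
cov.report(show_missing=True)           # lists uncovered lines and partial branches
```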
2.2.5 System of Systems (SoS) Testing Problems

Note that system of systems here means the integration of separately developed, funded, and scheduled systems having independent governance. It does not refer to a system developed by a prime contractor, or integrated by a system integrator, consisting of subsystems developed by subcontractors or vendors.

The following testing problems are related to system of systems testing:
• TTS-SoS-1 Inadequate SoS Test Planning
• TTS-SoS-2 Unclear SoS Testing Responsibilities
• TTS-SoS-3 Inadequate Funding for SoS Testing
• TTS-SoS-4 SoS Testing not Properly Scheduled
• TTS-SoS-5 Poor or Missing SoS Requirements
• TTS-SoS-6 Inadequate Test Support from Individual Systems
• TTS-SoS-7 Inadequate Defect Tracking Across Projects
• TTS-SoS-8 Finger-Pointing

2.2.5.1 TTS-SoS-1 Inadequate SoS Test Planning

Description: An inadequate amount of SoS test planning is being performed.

Potential Symptoms:
• There may be no SoS Test and Evaluation Master Plan (TEMP) or SoS Test Plan.
• There may be no SoS System Engineering Master Plan (SEMP) or SoS Development Plan (SoSDP).
• There may only be incomplete high-level overviews of testing in the SoS SEMP or Test Plan.

Potential Consequences:
• There may be no clear test responsibilities, objectives, methods and techniques, and completion/acceptance criteria at the SoS level.
• It may be unclear which project, organization, team, or individual is responsible for performing the different SoS testing tasks.
• Adequate resources (funding, staffing, and schedule) may not be made available for SoS testing.
• SoS testing may be inadequate.
• There may be numerous system-to-system interface defects causing the failure of end-to-end mission threads.

Potential Causes:
• There may be (and probably is) little if any governance at the SoS level.
• Little or no planning may have occurred for testing above the individual system level.
• The SoS testing tasks may not have been determined, planned for, or documented.

Recommendations:
• Prepare:
  – Determine the level of testing that is taking place at the system level.
  – Reuse or create a standard template and content/format standard for the SoS TEMP or Test Plan.
  – Include the SoS TEMP or Test Plan as a deliverable work product in the SoS integration project's contract.
  – Include the delivery of the SoS TEMP or Test Plan in the SoS project's master schedule (e.g., as part of major milestones).
• Enable:
  – To the extent practical, ensure close and regular communication (e.g., via status/working meetings and participation in major reviews) between the various system-level test organizations/teams.
• Perform:
  – Perform sufficient test planning at the SoS level, and create a SoS TEMP or Test Plan to document that planning.
• Verify:
  – Verify the existence and completeness of the SoS TEMP or Test Plan.

Related Problems: GEN-TPS-1 No Separate Test Plan, GEN-TPS-2 Incomplete Test Planning

2.2.5.2 TTS-SoS-2 Unclear SoS Testing Responsibilities

Description: The responsibilities for performing end-to-end system of systems testing are unclear.

Potential Symptoms:
• No project is explicitly tasked with testing end-to-end SoS behavior.

Potential Consequences:
• No project will have planned to provide the resources (e.g., staffing, budget, and schedule) needed to perform SoS testing.
• Adequate SoS testing is unlikely to be performed, and the SoS will be unlikely to meet its schedule for deployment of new/updated capabilities.
Potential Causes:
• TBD

Recommendations:
• Ensure that responsibilities for testing the end-to-end SoS behavior are clearly assigned to some organization and project.
• To the extent practical, ensure close and regular communication (e.g., via status/working meetings and participation in major reviews) between the various system-level test organizations/teams.

Related Problems: GEN-TOP-2 Unclear Testing Responsibilities

2.2.5.3 TTS-SoS-3 Inadequate Funding for SoS Testing

Description: The funding for system of systems (SoS) testing is not adequate for the performance of sufficient testing.

Potential Symptoms:
• Little or no funding has been provided to perform end-to-end SoS testing.
• None of the system-level projects has been funded to perform end-to-end SoS testing.

Potential Consequences:
• Little or no end-to-end SoS testing will be performed.
• It is likely that residual system-to-system interface defects will cause the failure of end-to-end mission threads.

Potential Causes:
• TBD

Recommendations:
• Ensure that adequate funding for testing the end-to-end SoS behavior is clearly supplied to the responsible organization and project.

Related Problems: GEN-SIC-3 Lack of Stakeholder Commitment

2.2.5.4 TTS-SoS-4 SoS Testing not Properly Scheduled

Description: System of systems testing is not properly scheduled.

Potential Symptoms:
• SoS testing is not in the individual systems' integrated master schedules, and there is no SoS-level master schedule.
• SoS testing must be fit into the uncoordinated schedules of the individual systems comprising the SoS.

Potential Consequences:
• SoS testing that is not scheduled is unlikely to be performed.
• If performed, it is likely that the testing will be rushed, incomplete, and inadequate, with more mistakes than typical.
• The operational SoS is likely to contain more SoS integration defects and end-to-end mission thread defects than is appropriate.

Potential Causes:
• TBD

Recommendations:
• Ensure that SoS testing is on the SoS master schedule.
• Ensure that SoS testing is also on the individual systems' integrated master schedules so that support for SoS testing can be planned.
• Ensure that SoS testing is coordinated with the schedules of the individual systems.
• To the extent practical, ensure close and regular communication (e.g., via status/working meetings and participation in major reviews) between the various system-level test organizations/teams.

Related Problems: GEN-TPS-3 Inadequate Test Schedule

2.2.5.5 TTS-SoS-5 Poor or Missing SoS Requirements

Description: Many system of systems requirements are either missing or of poor quality.

Potential Symptoms:
• Few or no requirements exist above the system level.
• Those SoS requirements that do exist do not exhibit all of the characteristics of good requirements.

Potential Consequences:
• Requirements-based SoS testing will be difficult to perform because there are no officially approved SoS requirements to verify.
• It will be hard to develop test cases and to determine the corresponding expected test outputs.
• It is likely that system-to-system interface defects will cause the failure of end-to-end mission threads.

Potential Causes:
• TBD

Recommendations:
• Ensure that there are sufficient officially approved SoS requirements to drive requirements-based SoS testing.

Related Problems: GEN-REQ-1 Ambiguous Requirements, GEN-REQ-2 Missing Requirements, GEN-REQ-4 Incorrect Requirements

2.2.5.6 TTS-SoS-6 Inadequate Test Support from Individual Systems

Description: Test support from individual system development/maintenance projects is inadequate to perform system of systems testing.
Potential Symptoms:
• All available system-level test resources (e.g., staffing, funding, and test environments) are already committed to system testing.

Potential Consequences:
• It will be difficult or impossible to obtain the necessary test resources from individual projects to support SoS testing.

Potential Causes:
• TBD

Recommendations:
• Ensure that the individual projects provide adequate test resources (e.g., people and test beds) to support SoS testing.
• Ensure that these resources are not already committed elsewhere.

Related Problems: GEN-SIC-3 Lack of Stakeholder Commitment

2.2.5.7 TTS-SoS-7 Inadequate Defect Tracking Across Projects

Description: Defect tracking across individual system development or maintenance projects is inadequate to support system of systems testing.

Potential Symptoms:
• There is little or no coordination of defect tracking and associated regression testing across multiple projects.
• Different projects collect different types and amounts of information concerning defects identified during testing.

Potential Consequences:
• It will be unnecessarily difficult to synchronize system- and SoS-level activities.
• Defect localization and the allocation of defects to individual systems or sets of systems will be difficult to perform.

Potential Causes:
• TBD

Recommendations:
• Develop a consensus concerning how to address defect reporting and tracking across the systems making up the SoS.
• Document this consensus in all relevant test plans (SoS and individual systems).
• Verify that defect tracking and associated regression testing across the individual projects of the systems making up the SoS are adequately coordinated and reported.

Related Problems: None
2.2.5.8 TTS-SoS-8 Finger-Pointing

Description: Different system development/maintenance projects assign the responsibility for defects, and for fixing them, to other projects.

Potential Symptoms:
• There is a significant amount of finger-pointing across project boundaries regarding whether something is a defect (or a feature) or where defects lie (i.e., in which systems and in which project's testing).

Potential Consequences:
• Time and effort will be wasted in the allocation of defects to individual systems or sets of systems.
• Defects will take longer to be fixed, and these fixes will take longer to be verified.

Potential Causes:
• TBD

Recommendations:
• Ensure that representatives of the individual systems are on the SoS change control board (CCB) and are involved in SoS defect triage.
• Work to develop a SoS mindset among the members of the SoS CCB.

Related Problems: None

2.2.6 Regression Testing Problems

The following problems are specific to the performance of regression testing, including testing during maintenance:
• TTS-REG-1 Insufficient Regression Test Automation
• TTS-REG-2 Regression Testing not Performed
• TTS-REG-3 Inadequate Scope of Regression Testing
• TTS-REG-4 Only Low-Level Regression Tests
• TTS-REG-5 Disagreement over Maintenance Test Resources

2.2.6.1 TTS-REG-1 Insufficient Regression Test Automation

Description: Too few of the regression tests are automated.
  Note: The automation of regression testing is especially important when an agile (iterative, incremental, and parallel) development cycle is used. The resulting numerous, short-duration increments of the system must be retested because of changes due to iteration (e.g., refactoring and defect correction) and the integration of additional components with existing components.

Potential Symptoms:
• Many or even most of the tests may be performed manually.

Potential Consequences:
• Manual regression testing may take so much time and effort that it is not done.
• If performed, regression testing may be rushed, incomplete, and inadequate to uncover a sufficient number of defects.
• Testers may make an excessive number of mistakes while manually performing the tests.
• Defects introduced into previously tested subsystems/software while making changes may remain in the operational system.

Potential Causes:
• Testing stakeholders (e.g., managers and the developers of unit tests) may:
  – mistakenly believe that performing regression testing is neither necessary nor cost effective because:
    – of the minor scope of most changes
    – system testing will catch any inadvertently introduced integration defects
    – they are overconfident that changes have not introduced any new defects
  – not be aware of the:
    – importance of regression testing
    – value of automating regression testing
• Automated regression testing may not be an explicit part of the testing process.
• Automated regression testing may not be incorporated into the Test and Evaluation Master Plan (TEMP) or System/Software Test Plan (STP).
• The schedule may contain little or no time for the development and maintenance of automated tests.
• Tool support for automated regression testing may be lacking (e.g., due to an insufficient test budget) or impractical to use.
• The initially developed automated tests may not be maintained.
• The initially developed automated tests may not be delivered with the system/software.

Recommendations:
• Prepare:
  – Explicitly address automated regression testing in the project's:
    – test process documentation (e.g., procedures and guidelines)
    – TEMP or STP
    – master schedule
    – work breakdown structure (WBS)
• Enable:
  – Provide training/mentoring to the testing stakeholders in the importance and value of automated regression testing.
  – Provide sufficient time in the schedule for automating and maintaining the tests.
  – Provide sufficient funding to pay for automated test tools.
  – Ensure that adequate resources (staffing, budget, and schedule) are planned and available for automating and maintaining the tests.
• Perform:
  – Automate as many of the regression tests as is practical (see the sketch below).
  – Where appropriate, use commercially available test tools to automate testing.
  – Ensure that both automated and manual test results are integrated into the same overall test results database so that test reporting and monitoring are seamless.
  – Maintain the automated tests as the system/software changes.
  – Deliver the automated tests with the system/software.
• Verify:
  – Verify that the test process documentation addresses automated regression testing.
  – Verify that the TEMP/STP and WBS address automated regression testing.
  – Verify that the schedule provides sufficient time to automate and maintain the tests.
  – Verify that a sufficient number of the tests have been automated.
  – Verify that the automated tests function properly.
  – Verify that the automated tests are properly maintained.
  – Verify that the automated tests are delivered with the system/software.

Related Problems: GEN-TPS-1 No Separate Test Plan, GEN-TPS-2 Incomplete Test Planning, GEN-TPS-3 Inadequate Test Schedule, GEN-SIC-2 Unrealistic Testing Expectations / False Sense of Security, GEN-MGMT-1 Inadequate Test Resources, GEN-PRO-9 Inadequate Test Maintenance, GEN-TTE-1 Over-reliance on Manual Testing, GEN-TTE-7 Tests not Delivered, GEN-TTE-8 Inadequate Test Configuration Management (CM)
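As a minimal illustration of such automation, the following pytest sketch pins a previously fixed defect in place as a permanent, automated regression test. The parse_quantity() function and the ticket number are hypothetical; the point is that every fixed defect gains a test that reruns automatically on every build.

```python
# Minimal sketch of an automated regression test (Python, pytest).
import pytest

def parse_quantity(text: str) -> int:
    value = int(text.strip())
    if value < 0:
        raise ValueError("quantity must be non-negative")
    return value

def test_parse_quantity_nominal():
    assert parse_quantity(" 42 ") == 42

def test_regression_ticket_123_rejects_negative():
    # Regression guard for a (hypothetical) defect: negative quantities
    # were once silently accepted and corrupted downstream totals.
    with pytest.raises(ValueError):
        parse_quantity("-1")
```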
2.2.6.2 TTS-REG-2 Regression Testing not Performed

Description: Insufficient regression testing is being performed after changes are made to the system/software.
  Note: The proper amount of regression testing depends on many factors, including the criticality of the system/software, the potential risks associated with introducing new defects, the potential costs of fixing these defects, the potential costs of performing regression testing, and the resources available to perform regression testing. There is a natural tension between managers, who want to minimize regression testing, and testers, who want to perform as much testing as practical.

Potential Symptoms:
• There may be no regression testing being performed.
• Parts of the system/software may not be retested after they are changed (e.g., refactorings and fixes).
• Appropriate parts of the system/software may not be retested after interfacing parts are changed (e.g., additions, modifications, and deletions).
• Defects may be traced to previously tested components.

Potential Consequences:
• Defects introduced into previously tested subsystems/software while making changes may:
  – not be found during regression testing
  – remain in the operational system

Potential Causes:
• Testing stakeholders (e.g., managers and the developers of unit tests) may:
  – mistakenly believe that performing regression testing is neither necessary nor cost effective because:
    – of the minor scope of most changes
    – the change will only have local effects and thus can't affect the rest of the system
    – system testing will catch any inadvertently introduced integration defects
    – they are overconfident that changes have not introduced any new defects
  – not be aware of the:
    – importance of regression testing
    – value of automating regression testing
• Regression testing may not be an explicit part of the testing process.
• Regression testing may not be incorporated into the Test and Evaluation Master Plan (TEMP) or System/Software Test Plan (STP).
• The schedule may contain little or no time for the performance and maintenance of automated tests.
• Regression tests may not be automated.
• The initially developed automated tests may not be maintained.
• The initially developed automated tests may not be delivered with the system/software.
• There may be insufficient time and staffing to perform regression testing, especially if it must be performed manually.
• Change impact analysis may not:
  – be performed (e.g., because of inadequate configuration management)
  – address the impact on regression testing
• The architecture and design of the system/software may be overly complex, with excessive coupling and insufficient encapsulation between components, thereby hiding interactions that may be broken by the changes.

Recommendations:
• Prepare:
  – Explicitly address regression testing in the project's:
    – test process documentation (e.g., procedures and guidelines)
    – TEMP or STP
    – master schedule
    – work breakdown structure (WBS)
  – Provide sufficient time in the schedule for performing and maintaining the regression tests.
• Enable:
  – Provide training/mentoring to the testing stakeholders in the importance and value of automated regression testing.
  – Automate as many of the regression tests as is practical.
  – Maintain the regression tests.
  – Deliver the regression tests with the system/software.
  – Provide sufficient time in the schedule to perform the regression testing.
  – Collect, analyze, and distribute the results of metrics concerning the performance of regression testing.
• Perform:
  – Perform change impact analysis to determine what parts of the system/software need to be regression tested (see the sketch below).
  – Perform regression testing on the potentially impacted parts of the system/software.
  – Resist efforts to skip regression testing unless a change impact analysis has determined that retesting is not necessary.
• Verify:
  – Verify that the test process documentation addresses automated regression testing.
  – Verify that the TEMP/STP and WBS address automated regression testing.
  – Verify that the schedule provides sufficient time to automate and maintain the tests.
  – Verify that a sufficient number of the tests have been automated.
  – Verify that the automated tests function properly.
  – Verify that the automated tests are properly maintained.
  – Verify that the automated tests are delivered with the system/software.
  – Verify that change impact analysis is being performed and addresses the impact of the change on regression testing.
  – Verify that sufficient regression testing is being performed.

Related Problems: GEN-TPS-1 No Separate Test Plan, GEN-TPS-2 Incomplete Test Planning, GEN-TPS-3 Inadequate Test Schedule, GEN-SIC-2 Unrealistic Testing Expectations / False Sense of Security, GEN-MGMT-1 Inadequate Test Resources, GEN-PRO-9 Inadequate Test Maintenance, GEN-TTE-1 Over-reliance on Manual Testing, GEN-TTE-7 Tests not Delivered, GEN-TTE-8 Inadequate Test Configuration Management (CM)
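The following Python sketch illustrates change-impact-driven selection of regression tests from an illustrative dependency map. In practice, the map would be derived from build metadata, imports, or traceability links rather than hand-maintained, and all names here are hypothetical.

```python
# Minimal sketch of change-impact-driven regression test selection (Python).
# Maps each test to the modules it depends on (illustrative data).
DEPENDS_ON = {
    "test_billing": {"billing", "db"},
    "test_reports": {"reports", "billing"},
    "test_login":   {"auth"},
}

def tests_impacted_by(changed_modules):
    changed = set(changed_modules)
    return sorted(
        test for test, deps in DEPENDS_ON.items()
        if deps & changed                 # any shared dependency triggers a rerun
    )

# A change to 'billing' selects both the billing and reports tests:
print(tests_impacted_by({"billing"}))     # ['test_billing', 'test_reports']
```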
2.2.6.3 TTS-REG-3 Inadequate Scope of Regression Testing

Description: The scope of regression testing is not sufficiently broad.

Potential Symptoms:
• Regression testing may be restricted to only the subsystem/software that changed.
  Note: Unfortunately, changes in one part of the system/software can sometimes impact apparently unrelated parts of the system/software. Defects also often unexpectedly propagate faults and failures beyond their local scope.
• Appropriate parts of the system/software may not be retested after interfacing parts are changed (e.g., additions, modifications, and deletions).
• Defects may be found to trace to previously tested components.

Potential Consequences:
• Defects introduced into previously tested subsystems/software while making changes may:
  – not be found during regression testing
  – remain in the operational system

Potential Causes:
• Testing stakeholders (e.g., managers and the developers of unit tests) may:
  – mistakenly believe that performing regression testing is neither necessary nor cost effective because:
    – of the minor scope of most changes
    – the change will only have local effects and thus can't affect the rest of the system
    – system testing will catch any inadvertently introduced integration defects
    – they are overconfident that changes have not introduced any new defects
  – be under significant cost and schedule pressure to minimize regression testing
• Determining the proper scope of regression testing may not be an explicit part of the testing process.
• The schedule may contain little or no time for the performance and maintenance of regression tests.
• Regression tests may not be automated.
• The initially developed automated tests may not be maintained.
• The initially developed automated tests may not be delivered with the system/software.
• There may be insufficient time and staffing to perform regression testing, especially if it must be performed manually.
• Change impact analysis may not:
  – be performed (e.g., because of inadequate configuration management)
  – address the impact on regression testing
• The architecture and design of the system/software may be overly complex, with excessive coupling and insufficient encapsulation between components, thereby hiding interactions that may be broken by the changes.

Recommendations:
• Prepare:
  – Explicitly address the proper scope of regression testing in the project's test process documentation (e.g., procedures and guidelines).
  – Provide sufficient time in the schedule for performing and maintaining the regression tests.
• Enable:
  – Provide training/mentoring to the testers in the proper scope of regression testing.
  – Automate as many of the regression tests as is practical.
  – Maintain the regression tests.
  – Deliver the regression tests with the system/software.
  – Provide sufficient time in the schedule to perform the regression testing.
  – Collect, analyze, and distribute the results of metrics concerning the performance of regression testing.
• Perform:
  – Perform change impact analysis to determine what parts of the system/software need to be regression tested.
  – Perform regression testing on the potentially impacted parts of the system/software.
  – Resist efforts to skip regression testing unless a change impact analysis has determined that retesting is not necessary.
• Verify:
  – Verify that the test process documentation addresses the proper scope of regression testing.
  – Verify that the schedule provides sufficient time to automate and maintain the tests.
  – Verify that a sufficient number of the tests have been automated.
  – Verify that the automated tests function properly.
  – Verify that the automated tests are properly maintained.
  – Verify that the automated tests are delivered with the system/software.
  – Verify that change impact analysis is being performed and addresses the impact of the change on regression testing.
  – Verify that sufficient regression testing is being performed.

Related Problems: GEN-TPS-1 No Separate Test Plan, GEN-TPS-2 Incomplete Test Planning, GEN-TPS-3 Inadequate Test Schedule, GEN-SIC-2 Unrealistic Testing Expectations / False Sense of Security, GEN-MGMT-1 Inadequate Test Resources, GEN-PRO-9 Inadequate Test Maintenance, GEN-TTE-1 Over-reliance on Manual Testing, GEN-TTE-7 Tests not Delivered, GEN-TTE-8 Inadequate Test Configuration Management (CM), TTS-REG-2 Regression Testing not Performed, TTS-REG-4 Only Low-Level Regression Tests

2.2.6.4 TTS-REG-4 Only Low-Level Regression Tests

Description: Only low-level (e.g., unit-level) regression tests are rerun.

Potential Symptoms:
• Regression testing may be restricted to unit testing (and possibly some integration testing).
• Regression testing may not include system and/or SoS testing.

Potential Consequences:
• Integration defects introduced while changing existing, previously tested subsystems/software will remain in the operational system because they will not be found during regression testing.

Potential Causes:
• TBD

Recommendations:
• Ensure that all relevant levels of regression testing (e.g., unit, integration, system, specialty, and SoS) are rerun when changes are made.
• Automate as many of these regression tests as possible so that it will be practical to rerun them.

Related Problems: TTS-REG-3 Inadequate Scope of Regression Testing

2.2.6.5 TTS-REG-5 Disagreement over Maintenance Test Resources

Description: The development and maintenance projects disagree over who is responsible for providing the test resources (e.g., staffing, budget, and test work products) during maintenance.

Potential Symptoms:
• There is disagreement as to whether the resources for maintenance testing should be provided by the development project or the maintenance project.

Potential Consequences:
• Insufficient resources will be made available to adequately support maintenance testing.
• Testing will be delayed while the source of these resources is negotiated.

Potential Causes:
• TBD

Recommendations:
• Ensure that the funding for maintenance testing is clearly assigned to either the development project or the sustainment project.
• Include funding responsibilities in the transition plan (if there is one).

Related Problems: GEN-TPS-2 Incomplete Test Planning
3 Conclusion

3.1 Testing Problems

There are many testing problems that can occur during the development or maintenance of software-reliant systems and software applications. While no project is likely to be so poorly managed and executed as to experience the majority of these problems, most projects will suffer several of them. Similarly, while exhibiting these testing problems does not guarantee failure, these problems are definitely risks that need to be managed.

The 77 common problems involving how testing is performed have been grouped into the following 14 categories:
• General Testing Problems
  – Requirements Testing Problems
  – Test Planning and Scheduling Problems
  – Stakeholder Involvement and Commitment Problems
  – Management-related Testing Problems
  – Test Organization and Professionalism Problems
  – Test Process Problems
  – Test Tools and Environments Problems
  – Test Communication Problems
• Testing Type Specific Problems
  – Unit Testing Problems
  – Integration Testing Problems
  – Specialty Engineering Testing Problems
  – System Testing Problems
  – System of Systems (SoS) Testing Problems
  – Regression Testing Problems

3.2 Common Consequences

While different testing problems have different proximate negative consequences, they all tend to contribute to the following overall ultimate results:
• The testing effort is less effective and efficient.
• Some defects are discovered later than they should be, when they are more difficult to localize and fix.
• The testers must work unsustainably long hours, causing them to become exhausted and therefore make excessive numbers of mistakes.
• The software-reliant system or software application is delivered late and over budget because of the extra unplanned time and effort spent finding and fixing defects late during development.
• In spite of this extra budget and schedule, the software-reliant system or software application is still delivered and placed into operation with more residual defects than either expected or necessary.

3.3 Common Solutions

In addition to the individual problem-specific recommendations provided in the preceding problem specifications, the following general solutions are applicable to most of the common testing problems:
• Prevention Solutions – The following solutions can prevent the problems from occurring in the first place:
  – Formally require the solutions – Customer representatives formally require the solutions to the testing problems in the appropriate documentation, such as the Request for Proposals and the Contract.
  – Mandate the solutions – Managers, chief engineers (development team leaders), or chief testers (test team leaders) explicitly mandate the solutions to the testing problems in the appropriate documentation, such as the System Engineering Management Plan (SEMP), System/Software Development Plan (SDP), Test Plan(s), and/or Test Strategy.
  – Provide training – Chief testers or trainers provide appropriate amounts and levels of test training to relevant personnel (such as acquisition staff, management, testers, and quality assurance) that covers the potential testing problems and how to prevent, detect, and react to them.
  – Management support – Managers explicitly state (and provide) their support for testing and the need to avoid the commonly occurring test problems.
• Detection Solutions – The following solutions enable existing problems to be identified and diagnosed:
  – Evaluate documentation – Review, inspect, or walk through the test-related documentation (e.g., the Test Plan and the test sections of development plans).
  – Oversight – Provide acquirer, management, quality assurance, and peer oversight of the testing process as it is performed.
  – Metrics – Collect, analyze, and report relevant test metrics to stakeholders (e.g., acquirers, managers, technical leads or chief engineers, and chief testers).
• Reaction Solutions – The following solutions help to solve existing problems once they are detected:
  – Reject test documentation – Customer representatives, managers, and chief engineers refuse to accept test-related documentation until identified problems are solved.
  – Fail the test – Customer representatives, managers, and chief engineers refuse to accept the system/subsystem/software under test until identified problems (e.g., in test environments, test procedures, or test cases) are solved. Rerun the tests after prioritizing and fixing the associated defects.
  – Provide training – Chief testers or trainers provide appropriate amounts and levels of remedial test training to relevant personnel (such as acquisition staff, management, testers, and quality assurance) that covers the observed testing problems and how to prevent, detect, and react to them.
  – Update process – Chief engineers, chief testers, and/or process engineers update the test process documentation to minimize the likelihood of recurrence of the observed testing problems.
  – Formally raise risk – Raise existing test problems as formal risks and inform both project management and the customer representative.
4 Potential Future Work

The contents of this document are not the results of a formal academic study. Rather, they were derived largely from the author's 30+ years of experience assessing and taking part in numerous projects, as well as from numerous discussions with testing subject matter experts.

As such, the current qualitative document leaves several important quantitative questions unanswered:
• Frequency. What is the probability distribution of these problems? Which problems occur most often? Which problems tend to cluster together?
• Impact. Which problems have the largest negative consequences? What are the probability distributions of the harm caused by each problem?
• Risk. Based on the above frequencies and impacts, which of these problems cause the greatest risks? Given these risks, how should one prioritize the identification and resolution of these problems?
• Distribution. Do different problems tend to occur with different probabilities in different application domains (such as commercial vs. governmental vs. military, or web vs. IT vs. embedded systems)?

Provided sufficient funding, it is the author's intent to turn this document into an industry survey and to perform a formal study to answer these questions.
5 Acknowledgements

This paper has been provided for review to over 191 professionals and academics from 33 countries and incorporates comments and recommendations received from the following individuals, whom I would like to acknowledge:
1. Vince Alcalde, Independent Consultant, Australia
2. Laxmi Bhat, Minerva Networks, USA
3. Robert V. Binder, System Verification Associates, USA
4. Peter Bolin, Revolution IT Pty Ltd, Australia
5. Alexandru Cosma, ISDC, Romania
6. Jorge Alberto De Flon, Servicio de Administración Tributaria (SAT), Mexico
7. Lee Eldridge, Independent Consultant, Australia
8. Eliazar Elisha, University of Liverpool, UK
9. Sam Harbaugh, Integrated Software Inc., USA
10. M. E. Hom, Compass360 Consulting, USA
11. Thanh Cong Huynh, LogiGear, Vietnam
12. Ronald Kohl, Independent Consultant, USA
13. Wido Kunde, Baker Hughes, Germany
14. Philippe Lebacq, Toyota Europe, Belgium
15. Stephen Masters, Software Engineering Institute, USA
16. Ken Niddefer, Software Engineering Institute, USA
17. Anne Nieberding, Independent Consultant, USA
18. William Novak, Software Engineering Institute, USA
19. Mahesh Palan, Calypso Technology, USA
20. Dan Pautler, Elekta, USA
21. Mark Powel, Attwater Consulting, USA
22. James Redpath, Sogeti, USA
23. Sudip Saha, Navigators Software, India
24. Alejandro Salado, Kayser–Threde GmbH, Germany
25. Matt Sheranko, Knowledge Code, USA
26. Oleg Spozito, Independent Consultant, Canada
27. Barry Stanly, Independent Consultant, USA
28. Lou Wheatcraft, Requirements Experts, USA
29. Thomas Zalewski, Texas State Government, USA
Appendix A: Glossary

Analysis (verification) – the verification method in which established technical or mathematical models or simulations, algorithms, or other scientific principles and procedures are used to provide evidence that a work product (e.g., document, software application, or system) meets its specified requirements

Black-box testing (a.k.a., interface testing) – any method of testing the externally visible behavior and characteristics of software without regard to its internal structures or workings

Boundary value testing – the testing technique in which test cases are selected just inside, on, and just outside of each boundary of an equivalence class of potential test cases
  Note: All test cases within the equivalence class are considered equivalent because they all follow the same path through the code with regard to branching.

Branch coverage – the type of code coverage in which test cases have executed each branch of each control structure (e.g., If-Then-Else and Case statements) in the software under test

Code coverage – a measure of the degree to which testing executes the source code within a program

Condition coverage (a.k.a., predicate coverage) – the type of code coverage in which test cases have caused each Boolean sub-expression to be evaluated both to true and to false in the software under test
  Note: Condition coverage does not necessarily imply decision coverage.

Decision coverage – the type of code coverage in which test cases have met as well as not met each condition of each branch of each control structure (e.g., If-Then-Else and Case statements) in the software under test

Defect – any flaw resulting from an error made during development that will cause the system to perform in an unintended or unanticipated manner if executed (possibly only under certain circumstances)
  Note: The defect could be in software (e.g., incorrect statements or declarations), in hardware (e.g., a flaw in material or workmanship, or a manufacturing defect), or in data (e.g., incorrect hardcoded values in configuration files). A software defect (a.k.a., bug) is the concrete manifestation within the software of one or more errors. One error may cause several defects, and multiple errors may cause the same defect.

Demonstration – the verification method in which a system or subsystem is observed during operation under specific scenarios to provide visual evidence of whether it behaves properly

Derived requirement – any requirement that is implied or inferred from other requirements or from applicable standards, laws, policies, common practices, management and business decisions, or constraints

Entry/exit coverage – the type of code coverage in which test cases have executed every possible call and return of the functions in the software under test

Error – any human mistake (e.g., an incorrect, missing, extra, or improperly timed action) that can cause erroneous input or a defect
  Note: If an error occurs during development, it can create a defect. If the error occurs during operation, it can produce erroneous input that can cause a fault.
Erroneous input – any incorrect input value (i.e., one that does not match the actual or required values)

Error tolerance – the degree to which the system detects erroneous input (e.g., from a human or a failed sensor) and responds properly to avoid faults and failures

Failure – the system ceases to meet one or more of its requirements (i.e., fails to exhibit a mandatory behavior or characteristic)
  Note: Failure often refers both to the condition of not meeting requirements and to the event that causes this condition to occur.

Failure tolerance – the degree to which the system detects the existence of failures and reacts appropriately to avoid harm (e.g., by going into a degraded mode or failing into a safe and secure state)

False negative test result – a test result that implies that no underlying defect exists although a defect actually exists (i.e., the test fails to expose the defect)
  Note: There are many reasons for false negative test results. They are most often caused by selecting test inputs and preconditions that do not exercise the underlying defect.

False positive test result – a test result that implies the existence of an underlying defect although no such defect actually exists
  Note: A false positive test result could be caused by bad test input data, incorrect test preconditions, incorrect test oracles (outputs and postconditions), defects in a test driver or test stub, an improperly configured test environment, etc.

Fault – any abnormal system-internal condition (e.g., an incorrect stored data value, an incorrect subsystem state, or execution of the wrong block of code) that may cause the system to fail
  Note: A fault can be caused by erroneous input or execution of a defect. Unless properly handled, a fault can cause a failure.

Fault tolerance – the degree to which the system detects the existence of faults and reacts appropriately to avoid failures

Function coverage – the type of code coverage in which test cases have called each function (a.k.a., procedure, method, or subroutine) in the software under test

Functional requirement – any requirement that specifies a mandatory behavior of a system or subsystem

Functional testing – any testing intended to cause the implementation of a system function to fail in order to identify associated defects, as well as to provide some information that the function is correctly implemented

Fuzz testing – the testing technique in which random inputs are used to cause the system to fail

Incremental development cycle – any development cycle in which the development process (including testing) is repeated to add additional capabilities
Inspection – the verification method in which a static work product is observed using one or more of the five senses, simple physical manipulation, and mechanical and electrical gauging and measurement to determine if it contains defects

Integration testing – the incremental testing of larger and larger subsystems as they are integrated to form the overall system

Iterative development cycle – any development cycle in which all or part of the development process (including testing) is repeated to modify an existing subsystem or software component, typically to correct defects or make improvements (e.g., refactoring the architecture/design or replacing existing components)

Load testing – TBD

Loop coverage – the type of code coverage in which test cases have executed every loop zero times, once, and more than once in the software under test

Metadata – TBD

Path coverage – the type of code coverage in which test cases have executed every possible route through the software under test
  Note: This level of code coverage is usually impractical or impossible.

Parallel development cycle – TBD

Penetration testing – the testing technique in which a tester plays the role of an attacker and tries to penetrate the system's defenses

Post-condition – any assertion that must hold following the successful execution of the associated function (e.g., use case path)

Precondition – any assertion that must hold prior to the successful execution of the associated function (e.g., use case path)

Quality requirement – any requirement that specifies a mandatory quality characteristic in terms of a minimum acceptable level of some associated quality attribute

Regression testing – the repetition of testing after a change has been made to ensure that the change did not inadvertently introduce any defects

Requirement – any specification of a mandatory capability of a specific product or type of product (e.g., system or subsystem)

Requirements management tool – TBD

Requirements metadata – TBD

Requirements trace – TBD

Statement coverage – the type of code coverage in which test cases have executed each statement in the software under test

Structural testing – synonym for white-box testing

System testing – TBD

Test – TBD
Testability – TBD
Test asset – TBD
Test case – TBD
Test case selection criteria – the criteria used to determine the actual test cases to create and run
Test completion criteria – TBD
Test driver – TBD
Test engineer – TBD
Test environment – TBD
Tester – TBD
Testing – the verification method in which a system/subsystem is executed under controlled preconditions (e.g., inputs and pretest modes and states) and the actual postconditions (e.g., outputs and post-test modes and states) are compared with the expected/required postconditions
Testing method – TBD
Test input – TBD
Test oracle – any source of information defining the correct and expected system behavior and test postconditions
Test output – TBD
Test plan – TBD
Test script – TBD
Test stakeholder – TBD
Test stub – TBD
Test tool – TBD
Trigger event – TBD
Unit testing – TBD
Use case – TBD
Use case path – TBD
Validation – TBD
Verification – TBD
Vulnerability – any system-internal weakness that can increase the likelihood or harm severity of one or more abuses (i.e., mishaps or misuses)
Vulnerability testing – the testing technique whose goal is to expose a system vulnerability (i.e., a defect or weakness) that can be exploited to cause a mishap or misuse
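The definitions of testing and test oracle above fit together as follows: the tester establishes controlled preconditions, executes the item under test, and compares the actual postconditions against expected values obtained from an oracle. A minimal Python sketch, with a hypothetical apply_discount function and a hand-computed oracle value standing in for whatever oracle a real project would use:

```python
def apply_discount(price: float, rate: float) -> float:
    """Hypothetical function under test: apply a fractional discount."""
    return round(price * (1.0 - rate), 2)

def test_apply_discount_against_oracle():
    # Controlled preconditions: fixed inputs chosen by the tester.
    price, rate = 100.0, 0.15
    # Test oracle: an independent source of the expected postcondition
    # (here a hand-computed value; a table or reference model also works).
    expected = 85.0
    actual = apply_discount(price, rate)  # execute the item under test
    assert actual == expected             # actual vs. expected postcondition
```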
White-box testing (a.k.a. structural or implementation testing) – any method of testing the internal, typically encapsulated, structures or workings of software, as opposed to its externally visible behavior, often performed to meet some kind of code coverage criteria [83]

[83] Typical code coverage criteria include branch, decision, path, and statement coverage.
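As a brief illustration of white-box testing against the branch coverage criterion mentioned in the footnote, the tests below are derived from the code's internal decision points rather than from its specification, so that every branch is taken. The classify function is a hypothetical example.

```python
def classify(n: int) -> str:
    """Hypothetical function under test with two decision points."""
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    return "positive"

# White-box tests chosen so each decision evaluates both true and false,
# achieving branch coverage.
def test_first_decision_true():
    assert classify(-1) == "negative"

def test_second_decision_true():
    assert classify(0) == "zero"

def test_both_decisions_false():
    assert classify(7) == "positive"
```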
Appendix B: Checklist

For each testing problem below, record whether its potential symptom(s) were observed, its potential consequence(s) were observed, its potential cause(s) were identified, and whether the associated recommendations were implemented.

Test Planning and Scheduling Problems
GEN-TPS-1 No Separate Test Plan
GEN-TPS-2 Incomplete Test Planning
GEN-TPS-3 Test Plans Ignored
GEN-TPS-4 Test Case Documents rather than Test Plans
GEN-TPS-5 Inadequate Test Schedule
GEN-TPS-6 Testing is Postponed

Stakeholder Involvement and Commitment Problems
GEN-SIC-1 Wrong Testing Mindset
GEN-SIC-2 Unrealistic Testing Expectations / False Sense of Security
GEN-SIC-3 Lack of Stakeholder Commitment

Management-related Testing Problems
GEN-MGMT-1 Inadequate Test Resources
GEN-MGMT-2 Inappropriate External Pressures
GEN-MGMT-3 Inadequate Test-related Risk Management
GEN-MGMT-4 Inadequate Test Metrics
GEN-MGMT-5 Test Lessons Learned Ignored

Test Organization and Professionalism Problems
GEN-TOP-1 Lack of Independence
GEN-TOP-2 Unclear Testing Responsibilities
GEN-TOP-3 Inadequate Testing Expertise

Test Process Problems
GEN-PRO-1 Testing Process Not Integrated Into Engineering Process
GEN-PRO-2 One-Size-Fits-All Testing
GEN-PRO-3 Inadequate Test Prioritization
GEN-PRO-4 Functionality Testing Overemphasized
GEN-PRO-5 Black-box System Testing Overemphasized
GEN-PRO-6 White-box Unit and Integration Testing Overemphasized
GEN-PRO-7 Too Immature for Testing
GEN-PRO-8 Inadequate Test Evaluations
GEN-PRO-9 Inadequate Test Maintenance

Test Tools and Environments Problems
GEN-TTE-1 Over-reliance on Manual Testing
GEN-TTE-2 Over-reliance on Testing Tools
GEN-TTE-3 Insufficient Test Environments
GEN-TTE-4 Poor Fidelity of Test Environments
GEN-TTE-5 Inadequate Test Environment Quality
GEN-TTE-6 System/Software Under Test Behaves Differently
GEN-TTE-7 Tests not Delivered
GEN-TTE-8 Inadequate Test Configuration Management (CM)

Test Communication Problems
GEN-COM-1 Inadequate Defect Reports
GEN-COM-2 Inadequate Test Documentation
GEN-COM-3 Source Documents Not Maintained
GEN-COM-4 Inadequate Communication Concerning Testing

Requirements-related Testing Problems
GEN-REQ-1 Ambiguous Requirements
GEN-REQ-2 Missing Requirements
GEN-REQ-3 Incomplete Requirements
GEN-REQ-4 Incorrect Requirements
GEN-REQ-5 Unstable Requirements
GEN-REQ-6 Poor Derived Requirements
GEN-REQ-7 Verification Methods Not Specified
GEN-REQ-8 Lack of Requirements Tracing

Unit Testing Problems
TTS-UNT-1 Unstable Design
TTS-UNT-2 Inadequate Design Detail
TTS-UNT-3 Unit Testing Considered Unimportant

Integration Testing Problems
TTS-INT-1 Defect Localization
TTS-INT-2 Unavailable Components
TTS-INT-3 Inadequate Self-Test

Specialty Engineering Testing Problems
TTS-SPC-1 Inadequate Capacity Testing
TTS-SPC-2 Inadequate Concurrency Testing
TTS-SPC-3 Inadequate Performance Testing
TTS-SPC-4 Inadequate Reliability Testing
TTS-SPC-5 Inadequate Robustness Testing
TTS-SPC-6 Inadequate Safety Testing
TTS-SPC-7 Inadequate Security Testing
TTS-SPC-8 Inadequate Usability Testing

System Testing Problems
TTS-SYS-1 Testing Robustness Requirements is Difficult
TTS-SYS-2 Lack of Test Hooks
TTS-SYS-3 Testing Code Coverage is Difficult

System of Systems (SoS) Testing Problems
TTS-SoS-1 Inadequate SoS Planning
TTS-SoS-2 Unclear SoS Testing Responsibilities
TTS-SoS-3 Inadequate Funding for SoS Testing
TTS-SoS-4 SoS Testing not Properly Scheduled
TTS-SoS-5 Poor or Missing SoS Requirements
TTS-SoS-6 Inadequate Test Support from Individual Systems
TTS-SoS-7 Inadequate Defect Tracking Across Projects
TTS-SoS-8 Finger-Pointing

Regression Testing Problems
TTS-REG-1 Insufficient Regression Test Automation
TTS-REG-2 Regression Testing not Performed
TTS-REG-3 Inadequate Scope of Regression Testing
TTS-REG-4 Only Low-Level Regression Tests
TTS-REG-5 Disagreement over Maintenance Test Resources