
Free ebook-rex-black advanced-software-testing


Published in: Technology, Business
EuroSTAR Software Testing Conference
EuroSTAR Software Testing Community

Advanced Software Testing - Vol. 2: Guide to the ISTQB Advanced Certification as an Advanced Test Manager

Rex Black
President of RBCS

The following is an excerpt from Rex Black’s book, Advanced Software Testing: Volume 2. It consists of the section concerning risk-based testing.

Risk-Based Testing and Failure Mode and Effects Analysis

Learning objectives

  • (K2) Explain the different ways that risk-based testing responds to risks.
  • (K4) Identify risks within a project and product, and determine an adequate test strategy and test plan based on these risks.
  • (K3) Execute a risk analysis for a product from a tester’s perspective, following the failure mode and effects analysis approach.
  • (K4) Summarize the results from the various perspectives on risk typically held by key project stakeholders, and use their collective judgment in order to outline test activities to mitigate risks.
  • (K2) Describe characteristics of risk management that require it to be an iterative process.
  • (K3) Translate a given risk-based test strategy to test activities and monitor its effects during the testing.
  • (K4) Analyze and report test results, including determining and reporting residual risks to enable project managers to make intelligent release decisions.
  • (K2) Describe the concept of FMEA, and explain its application in projects and benefits to projects by example.

Risk is the possibility of a negative or undesirable outcome or event. A specific risk is any problem that might occur that would decrease customer, user, participant, or stakeholder perceptions of product quality or project success.

In testing, we’re concerned with two main types of risks. The first type is product or quality risks. When the primary effect of a potential problem is on the quality of the product itself, the potential problem is called a product risk. A synonym for product risk, which I use most frequently myself, is quality risk.
An example of a quality risk is a possible reliability defect that could cause a system to crash during normal operation.

ISTQB Glossary

product risk: A risk directly related to the test object.
project risk: A risk related to management and control of the (test) project, e.g., lack of staffing, strict deadlines, changing requirements, etc.
risk: A factor that could result in future negative consequences; usually expressed as impact and likelihood.

The second type of risk is project or planning risks. When the primary effect of a potential problem is on the overall success of a project, those potential problems are called project risks. Some people also refer to project risks as planning risks. An example of a project risk is a possible staffing shortage that could delay completion of a project.

Not all risks are equal in importance. There are a number of ways to classify the level of risk. The simplest is to look at two factors:

  • The likelihood of the problem occurring
  • The impact of the problem should it occur

Likelihood of a problem arises primarily from technical considerations, such as the programming languages used, the bandwidth of connections, and so forth. The impact of a problem arises from business considerations, such as the financial loss the business will suffer, the number of users or customers affected, and so forth.

In risk-based testing, we use the risk items identified during risk analysis, together with the level of risk associated with each risk item, to guide our testing. In fact, under a true analytical risk-based testing strategy, risk is the primary basis of testing.

Risk can guide testing in various ways, but there are three very common ones:

  • First, during all test activities, test analysts and test managers allocate effort for each quality risk item proportional to the level of risk. Test analysts select test techniques in a way that matches the rigor and extensiveness of the technique with the level of risk. Test managers and test analysts carry out test activities in reverse risk order, addressing the most important quality risks first and only at the very end spending any time at all on less important ones. Finally, test managers and test analysts work with the project team to ensure that the prioritization and resolution of defects is appropriate to the level of risk.
  • Second, during test planning and test control, test managers carry out risk control for all significant, identified project risks. The higher the level of risk, the more thoroughly that project risk is controlled. We’ll cover risk control options in a moment.
  • Third, test managers and test analysts report test results and project status in terms of residual risks. For example, which tests have we not yet run or have we skipped? Which tests have we run? Which have passed? Which have failed? Which defects have we not yet fixed or retested? How do the tests and defects relate back to the risks?

When following a true analytical risk-based testing strategy, it’s important that risk management not be something that happens only at the start of a project. The three responses to risk I just covered, along with any others that might be needed, should occur throughout the lifecycle. Specifically, we should try to reduce quality risk by running tests and finding defects and reduce project risks through mitigation and, if necessary, contingency actions. Periodically in the project, we should reevaluate risk and risk levels based on new information. This might result in our reprioritizing tests and defects, reallocating test effort, and other test control actions. This will be discussed further later in this section.

ISTQB Glossary

risk level: The importance of a risk as defined by its characteristics impact and likelihood. The level of risk can be used to determine the intensity of testing to be performed. A risk level can be expressed either qualitatively (e.g., high, medium, low) or quantitatively.
risk management: Systematic application of procedures and practices to the tasks of identifying, analyzing, prioritizing, and controlling risk.

One metaphor sometimes used to help people understand risk-based testing is that testing is a form of insurance. In your daily life, you buy insurance when you are worried about some potential risk. You don’t buy insurance for risks that you are not worried about. So, we should test the areas and test for bugs that are worrisome and ignore the ones that aren’t.

One potentially misleading aspect of this metaphor is that insurance professionals and actuaries can use statistically valid data for quantitative risk analysis. Typically, risk-based testing relies on qualitative analyses because we don’t have the same kind of data that insurance companies have.

During risk-based testing, you have to remain aware of many possible sources of risks. There are safety risks for some systems. There are business and economic risks for most systems. There are privacy and data security risks for many systems. There are technical, organizational, and political risks too.

Characteristics and Benefits of Risk-Based Testing

What does an analytical risk-based testing strategy involve? What characteristics and benefits does it have?

For one thing, an analytical risk-based testing strategy matches the level of testing effort to the level of risk. The higher the risk, the more test effort we expend. This means not only the effort expended in test execution, but also the effort expended in designing and implementing the tests. We’ll look at the ways to accomplish this later in this section.

For another thing, an analytical risk-based testing strategy matches the order of testing to the level of risk. Higher-risk tests tend to find more bugs, or tend to test more important areas of the system, or both. So, the higher the risk, the earlier the test coverage. This is consistent with a rule of thumb for testing that I often tell testers, which is to try to find the scary stuff first. Again, we’ll see how we can accomplish this later in this section.

Because of this effort allocation and ordering of testing, the total remaining level of quality risk is systematically and predictably reduced as testing continues.
By maintaining traceability from the tests to the risks and from the located defects to the risks, we can report test results in terms of residual risk. This allows project stakeholders to decide to declare testing complete whenever the risk of continuing testing exceeds the risk of declaring the testing complete.

Since the remaining risk is going down in a predictable way, this means that we can triage tests in risk order. Should schedule compression require that we reduce test coverage, we can do this in risk order, providing a way that is both acceptable and explainable to project stakeholders.

For all of these reasons, an analytical risk-based testing strategy is more robust than an analytical requirements-based test strategy. Pure analytical requirements-based test strategies require at least one test per requirement, but they don’t tell us how many tests we need in a way that responds intelligently to project constraints. They don’t tell us the order in which to run tests. In pure analytical requirements-based test strategies, the risk reduction throughout test execution is neither predictable nor measurable. Therefore, with analytical requirements-based test strategies, we cannot easily express the remaining level of risk if project stakeholders ask us whether we can safely curtail or compress testing.
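
The risk-ordered triage and residual-risk reporting described above can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration rather than anything taken from the book: the `RiskTest` record, the five-point risk scale (5 = highest risk), the effort figures, and the rule that the cut falls at the first test that no longer fits the time box.

```python
from dataclasses import dataclass


@dataclass
class RiskTest:
    """One test traced back to the quality risk item it covers (hypothetical record)."""
    name: str
    risk_item: str
    risk_level: int          # assumed scale: 1 = very low risk, 5 = very high risk
    effort_hours: float
    status: str = "not run"  # "not run", "passed", or "failed"


def triage(tests, budget_hours):
    """Keep tests in descending risk order until the time box is used up.

    Once one test no longer fits, it and every lower-risk test are dropped,
    so coverage is always reduced strictly in risk order.
    """
    selected, dropped, used = [], [], 0.0
    for test in sorted(tests, key=lambda t: t.risk_level, reverse=True):
        if not dropped and used + test.effort_hours <= budget_hours:
            selected.append(test)
            used += test.effort_hours
        else:
            dropped.append(test)
    return selected, dropped


def residual_risk(tests):
    """Map each risk item to the tests that have not (yet) passed for it."""
    open_items = {}
    for test in tests:
        if test.status != "passed":
            open_items.setdefault(test.risk_item, []).append(
                f"{test.name} ({test.status})"
            )
    return open_items


# Illustrative data only.
tests = [
    RiskTest("T1", "reliability: crash during normal operation", 5, 8),
    RiskTest("T2", "usability: report formatting", 2, 4),
    RiskTest("T3", "security: account lockout", 4, 6),
    RiskTest("T4", "usability: help wording", 1, 3),
]
selected, dropped = triage(tests, budget_hours=15)
print([t.name for t in selected])  # ['T1', 'T3']
print([t.name for t in dropped])   # ['T2', 'T4']
```

Calling `residual_risk(tests)` at any point during execution lists the risk items that still have unrun or failed tests, which is one simple way to phrase a status report in terms of residual risk rather than raw test counts. A real project would likely also trace open defects back to risk items, which this sketch leaves out.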

That is not to say that we ignore requirements specifications when we use an analytical risk-based testing strategy. On the contrary, we use requirements specifications, design specifications, marketing claims, technical support or help desk data, and myriad other inputs to inform our risk identification and analysis process if they are available. However, if we don’t have this information available or we find such information of limited usefulness, we can still plan, design, implement, and execute our tests by using stakeholder input to the risk identification and assessment process. This also makes an analytical risk-based testing strategy more robust than an analytical requirements-based strategy, because we reduce our dependency on upstream processes (which we may not control) like requirements gathering and design.

ISTQB Glossary

risk identification: The process of identifying risks using techniques such as brainstorming, checklists, and failure history.

All that said, an analytical risk-based testing strategy is not perfect. As with any analytical testing strategy, we will not have all of the information we need for a perfect risk assessment at the beginning of the project. Even with periodic reassessment of risk (which I will also discuss later in this section), we will miss some important risks. Therefore, an analytical risk-based testing strategy, like any analytical testing strategy, should blend in reactive strategies during test implementation and execution so that we can detect risks that we missed during our risk assessment.

Let me be more specific and concise about the testing problems we often face and how analytical risk-based testing can help solve them.

First, as testers, we often face significant time pressures. There is seldom sufficient time to run the tests we’d want to run, particularly when doing requirements-based testing. Ultimately, all testing is time-boxed.
Risk-based testing provides a way to prioritize and triage tests at any point in the lifecycle.

When I say that all testing is time-boxed, I mean that we face a challenge in determining appropriate test coverage. If we measure test coverage as a percentage of what could be tested, any amount of testing yields a coverage metric of 0 percent because the set of tests that could be run is infinite for any real-sized system. So, risk-based testing provides a means to choose a smart subset from the infinite number of comparatively small subsets of tests we could run.

Further, we often have to deal with poor or missing specifications. By involving stakeholders in the decision about what not to test, what to test, and how much to test it, risk-based testing allows us to identify and fill gaps in documents like requirements specifications that might result in big holes in our testing. It also helps to sensitize the other stakeholders to the difficult problem of determining what to test (and how much) and what not to test.

To return to the issue of time pressures, not only are they significant, they tend to escalate during the test execution period. We are often asked to compress the test schedule at the start of or even midway through test execution. Risk-based testing provides a means to drop tests intelligently while also providing a way to discuss with project stakeholders the risks inherent in doing so.

Finally, as we reach the end of our test execution period, we need to be able to help project stakeholders make smart release decisions. Risk-based testing allows us to work with stakeholders to determine an acceptable level of residual risk rather than forcing them (and us) to rely on inadequate, tactical metrics like bug and test counts.

The History of Risk-Based Testing

How did analytical risk-based testing strategies come to be? Understanding this history can help you understand where we are and where these strategies might evolve.

In the early 1980s, Barry Boehm and Boris Beizer each separately examined the idea of risk as it relates to software development. Boehm advanced the idea of a risk-driven spiral development lifecycle, which we covered in the Foundation syllabus. The idea of this approach is to develop the architecture and design in risk order to reduce the risk of development catastrophes and blind alleys later in the project.

Beizer advanced the idea of risk-driven integration and integration testing. In other words, it’s not enough to develop in risk order; we need to assemble and test in risk order, too. [1]

If you reflect on the implications of Boehm and Beizer’s ideas, you can see that these are the precursors of iterative and agile lifecycles.

Now, in the mid 1980s, Beizer and Bill Hetzel each separately declared that risk should be a primary driver of testing. By this, they meant both in terms of effort and in terms of order. However, while giving some general ideas on this, they did not elaborate any specific mechanisms or methodologies for making this happen. I don’t say this to criticize them. At that point, it perhaps seemed that just ensuring awareness of risk among the testers was enough to ensure risk-based testing. [2]

However, it was not. Some testers have followed this concept of using the tester’s idea of risk to determine test coverage and priority. For reasons we’ll cover later, this results in testing devolving into an ill-informed, reactive bug hunt.
There’s nothing wrong with finding many bugs, but finding as many bugs as possible is not a well-balanced test objective.

So, more structure was needed to ensure a systematic exploration of the risks. This brings us to the 1990s. Separately, Rick Craig, Paul Gerrard, Felix Redmill, and I were all looking for ways to systematize this concept of risk-based testing. I can’t speak for Craig, Gerrard, and Redmill, but I know that I had become frustrated with requirements-based strategies for the reasons mentioned earlier. So in parallel and with very little apparent cross-pollination, the four of us (and perhaps others) developed similar approaches for quality risk analysis and risk-based testing. In this section, you’ll learn these approaches. [3]

So, where are we now? In the mid- to late 2000s, test practitioners widely use analytical risk-based testing strategies in various forms. Some still practice misguided, reactive, tester-focused bug hunts. However, many practitioners are trying to use analytical approaches to prevent bugs from entering later phases of testing, to focus testing on what is likely to fail and what is important, to report test status in terms of residual risk, and to respond better as their understanding of risk changes. By putting the ideas in this section into practice, you can join us in this endeavor. As you learn more about how analytical risk-based testing strategies work, and where they need improvements, I encourage you to share what you’ve learned with others by writing articles, books, and presentations on the topic.

However, while we still have much to learn, that does not mean that analytical risk-based testing strategies are at all experimental. They are well-proven practice. I am unaware of any other test strategies that adapt as well to the myriad realities and constraints of software projects. They are the best thing going, especially when blended with reactive strategies.

Another form of blending that requires attention and work is blending of analytical risk-based testing strategies with all the existing lifecycle models. My associates have used analytical risk-based testing strategies with sequential lifecycles, iterative lifecycles, and spiral lifecycles. These strategies work regardless of lifecycle. However, the strategies must be adapted to the lifecycle.

Beyond learning more through practice, another important next step is for test management tools to catch up and start to advance the use of analytical risk-based testing strategies. Some test management tools now incorporate the state of the practice in risk-based testing. Some still do not support risk-based testing directly at all. I encourage those of you who are working on test management tools to build support for this strategy into your tools and look for ways to improve it.

1: See Beizer’s book Software System Testing and Quality Assurance.
2: See Beizer’s book Software Testing Techniques and Hetzel’s book The Complete Guide to Software Testing.
3: For more details on my approach, see my discussion of formal techniques in Critical Testing Processes and my discussion of informal techniques in Pragmatic Software Testing. For Paul Gerrard’s approach, see Risk-based e-Business Testing. Van Veenendaal discusses informal techniques in The Testing Practitioner.

How to Do Risk-Based Testing

Let’s move on to the tactical questions about how we can perform risk-based testing.
Let’s start with a general discussion about risk management, and then we’ll focus on specific elements of risk-based testing for the rest of this section.

Risk management includes three primary activities:

  • Risk identification, figuring out what the different project and quality risks are for the project
  • Risk analysis, assessing the level of risk (typically based on likelihood and impact) for each identified risk item
  • Risk mitigation (which is really more properly called “risk control” because it consists of mitigation, contingency, transference, and acceptance actions for various risks)

In some sense, these activities are sequential, at least in when they start. They are staged such that risk identification starts first. Risk analysis comes next. Risk control starts once we have determined the level of risk through risk analysis. However, since we should continuously manage risk in a project, risk identification, risk analysis, and risk control are all recurring activities.

ISTQB Glossary

risk control: The process through which decisions are reached and protective measures are implemented for reducing risks to, or maintaining risks within, specified levels.
risk mitigation: See risk control.

Everyone has their own perspective on how to manage risks on a project, including what the risks are, the level of risk, and the appropriate controls to put in place for risks. So risk management should include all project stakeholders.

Test analysts bring particular expertise to risk management due to their defect-focused outlook. They should participate whenever possible. In fact, in many cases, the test manager will lead the quality risk analysis effort with test analysts providing key support in the process.

Let’s look at these activities more closely. For proper risk-based testing, we need to identify both product and project risks. We can identify both kinds of risks using techniques like these:

  • Expert interviews
  • Independent assessments
  • Use of risk templates
  • Project retrospectives
  • Risk workshops and brainstorming
  • Checklists
  • Calling on past experience

Conceivably, you can use a single integrated process to identify both project and product risks. I usually separate them into two separate processes since they have two separate deliverables and often separate stakeholders. I include the project risk identification process in the test planning process. In parallel, the quality risk identification process occurs early in the project.

That said, project risks (and not just for testing but also for the project as a whole) are often identified as by-products of quality risk analysis. In addition, if you use a requirements specification, design specification, use cases, and the like as inputs into your quality risk analysis process, you should expect to find defects in those documents as another set of by-products. These are valuable by-products, which you should plan to capture and escalate to the proper person.

Previously, I encouraged you to include representatives of all possible stakeholder groups in the risk management process.
For the risk identification activities, the broadest range of stakeholders will yield the most complete, accurate, and precise risk identification. The more stakeholder group representatives you omit from the process, the more risk items and even whole risk categories will be missing.

How far should you take this process? Well, it depends on the technique. In informal techniques, which I frequently use, risk identification stops at the risk items. The risk items must be specific enough to allow for analysis and assessment of each one to yield an unambiguous likelihood rating and an unambiguous impact rating.

Techniques that are more formal often look “downstream” to identify potential effects of the risk item if it becomes an actual negative outcome. These effects include effects on the system (or the system of systems, if applicable) as well as on potential users, customers, stakeholders, and even society in general. Failure Mode and Effect Analysis is an example of such a formal risk management technique, and it is commonly used on safety-critical and embedded systems. [4]

Other formal techniques look “upstream” to identify the source of the risk. Hazard Analysis is an example of such a formal risk management technique. I’ve never used it myself, but I have talked to clients who have used it for safety-critical medical systems.

4: For a discussion of Failure Mode and Effect Analysis, see Stamatis’s book Failure Mode and Effect Analysis.

We’ll look at some examples of various levels of formality in risk analysis a little later in this section.

The Advanced syllabus refers to the next step in the risk management process as risk analysis. I prefer to call it risk assessment, just because analysis would seem to include both identification and assessment of risk to me. Regardless of what we call it, risk analysis or risk assessment involves the study of the identified risks. We typically want to categorize each risk item appropriately and assign each risk item an appropriate level of risk.

We can use ISO 9126 or other quality categories to organize the risk items. In my opinion, it usually doesn’t matter so much what category a risk item goes into, so long as we don’t forget it. However, in complex projects and for large organizations, the category of risk can determine who has to deal with the risk. A practical implication of categorization like this will make the categorization important.

The Level of Risk

The other part of risk assessment or risk analysis is determining the level of risk. This often involves likelihood and impact as the two key factors. Likelihood arises from technical considerations, typically, while impact arises from business considerations. However, in some formalized approaches you use three factors, such as severity, priority, and likelihood of detection, or even subfactors underlying likelihood and impact.
Again, we’ll discuss this further later in the book.

So, what technical factors should we consider? Here’s a list to get you started:

  • Complexity of technology and teams
  • Personnel and training issues
  • Intrateam and interteam conflict/communication
  • Supplier and vendor contractual problems
  • Geographical distribution of the development organization, as with outsourcing
  • Legacy or established designs and technologies versus new technologies and designs
  • The quality, or lack of quality, in the tools and technology used
  • Bad managerial or technical leadership
  • Time, resource, and management pressure, especially when financial penalties apply
  • Lack of earlier testing and quality assurance tasks in the lifecycle
  • High rates of requirements, design, and code changes in the project
  • High defect rates
  • Complex interfacing and integration issues
  • Lack of sufficiently documented requirements

And what business factors should we consider? Here’s a list to get you started:

  • The frequency of use and importance of the affected feature
  • Potential damage to image
  • Loss of customers and business
  • Potential financial, ecological, or social losses or liability
  • Civil or criminal legal sanctions
  • Loss of licenses, permits, and the like
  • The lack of reasonable workarounds
  • The visibility of failure and the associated negative publicity

Both of these lists are just starting points.

When determining the level of risk, we can try to work quantitatively or qualitatively. In quantitative risk analysis, we have numerical ratings for both likelihood and impact. Likelihood is a percentage, and impact is often a monetary quantity. If we multiply the two values together, we can calculate the cost of exposure, which is called, in the insurance business, the expected payout or expected loss.

While it will be nice some day in the future of software engineering to be able to do this routinely, typically the level of risk is determined qualitatively. Why? Because we don’t have statistically valid data on which to perform quantitative quality risk analysis. So we can speak of likelihood being very high, high, medium, low, or very low, but we can’t say (at least, not in any meaningful way) whether the likelihood is 90 percent, 75 percent, 50 percent, 25 percent, or 10 percent.

This is not to say, by any means, that a qualitative approach should be seen as inferior or useless. In fact, given the data most of us have to work with, use of a quantitative approach is almost certainly inappropriate on most projects. The illusory precision thus produced misleads the stakeholders about the extent to which you actually understand and can manage risk. What I’ve found is that if I accept the limits of my data and apply appropriate informal quality risk management approaches, the results are not only perfectly useful, but also indeed essential to a well-managed test process.

Unless your risk analysis is based on extensive and statistically valid risk data, your risk analysis will reflect perceived likelihood and impact. In other words, personal perceptions and opinions held by the stakeholders will determine the level of risk.
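
To make the quantitative calculation mentioned above concrete, here is a minimal Python sketch of the cost-of-exposure arithmetic. The function name and the figures are hypothetical, chosen only for illustration, and as the text stresses, the result is only as trustworthy as the likelihood and impact data that go into it.

```python
def cost_of_exposure(likelihood, impact_cost):
    """Expected loss: probability of the problem times its monetary impact.

    `likelihood` is a probability between 0 and 1; `impact_cost` is the loss
    (in any currency) if the problem occurs. Both inputs are assumed to come
    from statistically valid data, which most projects do not have.
    """
    if not 0.0 <= likelihood <= 1.0:
        raise ValueError("likelihood must be a probability between 0 and 1")
    return likelihood * impact_cost


# Hypothetical figures: a 25 percent chance of a failure costing 200,000.
print(cost_of_exposure(0.25, 200_000))  # 50000.0
```

Without valid data behind the two inputs, a number like this carries exactly the illusory precision warned about above, which is why a qualitative scale such as very high through very low is usually the more honest choice.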
Again, there’s absolutely nothing wrong with this, and I don’t bring this up to condemn the technique at all. The key point is that project managers, programmers, users, business analysts, architects, and testers typically have different perceptions and thus possibly different opinions on the level of risk for each risk item. By including all these perceptions, we distill the collective wisdom of the team.

However, we do have a strong possibility of disagreements between stakeholders. So the risk analysis process should include some way of reaching consensus. In the worst case, if we cannot obtain consensus, we should be able to escalate the disagreement to some level of management to resolve. Otherwise, risk levels will be ambiguous and conflicted and thus not useful as a guide for risk mitigation activities, including testing.

Controlling the Risks

Part of any management role, including test management, is controlling risks that affect your area of interest. How can we control risks? We have four main options for risk control:

- Mitigation, where we take preventive measures to reduce the likelihood and/or the impact of a risk.
- Contingency, where we have a plan or perhaps multiple plans to reduce the impact if the risk becomes an actuality.
- Transference, where we get another party to accept the consequences of a risk.
- Finally, we can ignore or accept the risk and its consequences.

For any given risk item, selecting one or more of these options creates its own set of benefits and opportunities as well as costs and, potentially, additional risks associated with each option.
Done wrong, risk control can make things worse, not better.

Analytical risk-based testing is focused on creating risk mitigation opportunities for the test team, especially for quality risks. Risk-based testing mitigates quality risks through testing throughout the entire lifecycle.

In some cases, there are standards that can apply. We’ll look at a couple of risk-related standards shortly in this section.

Project Risks

While much of this section deals with product risks, test managers often identify project risks, and sometimes they have to manage them. Let’s discuss this topic now so we can subsequently focus on product risks. A specific list of all possible test-related project risks would be huge, but includes issues like these:

- Test environment and tool readiness
- Test staff availability and qualification
- Low quality of test deliverables
- Too much change in scope or product definition
- Sloppy, ad-hoc testing effort

Test-related project risks can often be mitigated, or at least one or more contingency plans can be put in place to respond to the unhappy event if it occurs. A test manager can manage risk to the test effort in a number of ways.

We can accelerate the moment of test involvement and ensure early preparation of testware. By doing this, we can make sure we are ready to start testing when the product is ready. In addition, as mentioned in the Foundation syllabus and elsewhere in this course, early involvement of the test team allows our test analysis, design, and implementation activities to serve as a form of static testing for the project, which can serve to prevent bugs from showing up later during dynamic testing, such as during system test.
Detecting an unexpectedly large number of bugs during high-level testing like system test, system integration test, and acceptance test creates a significant risk of project delay, so this bug-preventing activity is a key project risk-reducing benefit of testing.

We can make sure that we check out the test environment before test execution starts. This can be paired with another risk-mitigation activity, that of testing early versions of the product before formal test execution begins. If we do this in the test environment, we can test the testware, the test environment, the test release and test object installation process, and many other test execution processes in advance before the first day of testing.

We can also define tougher entry criteria to testing. That can be an effective approach if the project manager will slip the end date of testing if the start date slips. Often, project managers won’t do that, so making it harder to start testing while not changing the end date of testing simply creates more stress and puts pressure on the test team.

We can try to institute requirements for testability. For example, getting the user interface design team to change editable fields into non-editable pull-down fields wherever possible (such as on date and time fields) can reduce the size of the potential user input validation test set dramatically and help automation efforts.

To reduce the likelihood of being caught unaware by really bad test objects, and to help reduce bugs in those test objects, test team members can participate in reviews of earlier project work
products, such as requirements specifications. We can also have the test team participate in problem and change management.

Finally, during the test execution effort (hopefully starting with unit testing and perhaps even before, but if not, at least from day one of formal testing) we can monitor the project progress and quality. If we see alarming trends developing, we can try to manage them before they turn into end-game disasters.

In Figure 1, you see the test-related project risks for an Internet appliance project that serves as a recurring case study in this book. These risks were identified in the test plan and steps were taken throughout the project to manage them through mitigation or respond to them through contingency.

Figure 1: Test-related project risks example

Let’s review the main project risks identified for testing on this project and the mitigation and contingency plans put in place for them.

We were worried, given the initial aggressive schedules, that we might not be able to staff the test team on time. Our contingency plan was to reduce scope of test effort in reverse-priority order.

On some projects, test release management is not well defined, which can result in a test cycle’s results being invalidated. Our mitigation plan was to ensure a well-defined, crisp release management process.

We have sometimes had to deal with test environment system administration support that was either unavailable at key times or simply unable to carry out the tasks required. Our mitigation plan was to identify system administration resources with pager and cell phone availability and appropriate Unix, QNX, and network skills.

As consultants, my associates and I often encounter situations in which test environments are shared with development, which can introduce tremendous delays and unpredictable interruptions into the test execution schedule.
In this case, we had not yet determined the best mitigation or contingency plan for this, so it was marked “[TBD].”

Of course, buggy deliverables can impede testing progress. In fact, more often than not, the determining factor in test cycle duration for new applications (as opposed to maintenance releases) is the number of bugs in the product and how long it takes to grind them out. We asked for complete unit testing and adherence to test entry and exit criteria as mitigation plans for the software. For the hardware component, we wanted to mitigate this risk through early auditing of vendor test and reliability plans and results.

It’s also the case that frequent or sizeable test and product scope and definition changes can impede testing progress. As a contingency plan to manage this should it occur, we wanted a change management or change control board to be established.

Two Industry Standards and Their Relation to Risk

You can find an interesting example of how risk management, including quality risk management, plays into the engineering of complex and/or safety-critical systems in the IEC standard 61508, which is mentioned in the Advanced syllabus. It is designed especially for embedded software that controls systems with safety-related implications, as you can tell from its title: “Functional safety of electrical/electronic/programmable electronic safety-related systems.”

The standard focuses on risks. It requires risk analysis. It considers two primary factors to determine the level of risk: likelihood and impact. During a project, the standard directs us to reduce the residual level of risk to a tolerable level, specifically through the application of electrical, electronic, or software improvements to the system.

The standard has an inherent philosophy about risk.
It acknowledges that we can’t attain a level of zero risk, whether for an entire system or even for a single risk item. It says that we have to build quality, especially safety, in from the beginning, not try to add it at the end, and thus must take defect-preventing actions like requirements, design, and code reviews.

The standard also insists that we know what constitutes tolerable and intolerable risks and that we take steps to reduce intolerable risks. When those steps are testing steps, we must document them, including a software safety validation plan, software test specification, software test results, software safety validation, verification report, and software functional safety report. The standard is concerned with the author-bias problem, which, as you should recall from the Foundation syllabus, is the problem with self-testing, so it calls for tester independence, indeed insisting on it for those performing any safety-related tests. And, since testing is most effective when the system is written to be testable, that’s also a requirement.

The standard has a concept of a safety integrity level (SIL), which is based on the likelihood of failure for a particular component or subsystem. The safety integrity level influences a number of risk-related decisions, including the choice of testing and QA techniques.

Some of the techniques are ones I discuss in the companion volume on Advanced Test Analyst, such as the various functional and
black-box testing design techniques. Many of the techniques are ones I discuss in the companion volume on Advanced Technical Test Analyst, including probabilistic testing, dynamic analysis, data recording and analysis, performance testing, interface testing, static analysis, and complexity metrics. Additionally, since thorough coverage, including during regression testing, is important to reduce the likelihood of missed bugs, the standard mandates the use of applicable automated test tools.

Again, depending on the safety integrity level, the standard might require various levels of testing. These levels include module testing, integration testing, hardware-software integration testing, safety requirements testing, and system testing. If a level is required, the standard states that it should be documented and independently verified. In other words, the standard can require auditing or outside reviews of testing activities. Continuing in that vein of “guarding the guards,” the standard also requires reviews for test cases, test procedures, and test results, along with verification of data integrity under test conditions.

The 61508 standard requires structural testing as a test design technique. So structural coverage is implied, again based on the safety integrity level. Because the desire is to have high confidence in the safety-critical aspects of the system, the standard requires complete requirements coverage not once but multiple times, at multiple levels of testing. Again, the level of test coverage required depends on the safety integrity level.

Now, this might seem a bit excessive, especially if you come from a very informal world. However, the next time you step between two pieces of metal that can move (e.g., elevator doors), ask yourself how much risk you want to remain in the software that controls that movement.

Let’s look at another risk-related testing standard.
The United States Federal Aviation Administration provides a standard called DO-178B for avionics systems. In Europe, it’s called ED-12B.

The standard assigns a criticality level based on the potential impact of a failure, as shown in Table 1. Based on the criticality level, the DO-178B standard requires a certain level of white-box test coverage.

Table 1: FAA-DO 178B mandated coverage

Criticality level A, or Catastrophic, applies when a software failure can result in a catastrophic failure of the system. For software with such criticality, the standard requires Modified Condition/Decision, Decision, and Statement coverage.

Criticality level B, or Hazardous and Severe, applies when a software failure can result in a hazardous, severe, or major failure of the system. For software with such criticality, the standard requires Decision and Statement coverage.

Criticality level C, or Major, applies when a software failure can result in a major failure of the system. For software with such criticality, the standard requires only Statement coverage.

Criticality level D, or Minor, applies when a software failure can result in only a minor failure of the system. For software with such criticality, the standard does not require any level of coverage.

Finally, criticality level E, or No effect, applies when a software failure cannot have an effect on the system. For software with such criticality, the standard does not require any level of coverage.

This makes a certain amount of sense. You should be more concerned about software that affects flight safety, such as rudder and aileron control modules, than you are about software that doesn’t, such as video entertainment systems. Of course, lately there has been a trend toward putting all of the software, both critical and noncritical, on a common network in the plane, which introduces enormous potential risks for inadvertent interference and malicious hacking.

However, I consider it dangerous to use a one-dimensional white-box measuring stick to determine how much confidence we should have in a system.
Coverage metrics are a measure of confidence, it’s true, but we should use multiple coverage metrics, both white-box and black-box.5

5: You might be tempted to say, “Well, why worry about this? It seems to work for aviation software?” Spend a few moments on the Risks Digest and peruse some of the software-related aviation near misses. You might feel less sanguine. There is also a discussion of the Boeing 787 design issue that relates to the use of a single network for all onboard systems, both safety critical and non-safety critical.

By the way, if you found this material a bit confusing, note that the white-box coverage metrics used in this standard were discussed in the Foundation syllabus in Chapter 4. If you don’t remember these coverage metrics, you should go back and review that material in that chapter of the Foundation syllabus.

Risk Identification and Assessment Techniques

Various techniques exist for performing quality risk identification and assessment. These range from informal to semiformal to formal.

You can think of risk identification and assessment as a structured form of project and product review. In a requirements review, we focus on what the system should do. In quality risk identification and assessment sessions, we focus on what the system might do that it should not. Thus, we can see quality risk identification and assessment as the mirror image of the requirements, the design, and the implementation.

As with any review, as the level of formality increases, so does the cost, the defect removal
effectiveness, and the extent of documentation associated with it. You’ll want to choose the technique you use based on constraints and needs for your project. For example, if you are working on a short project with a very tight budget, adopting a formal technique with extensive documentation doesn’t make much sense.

Let’s review the various techniques for quality risk identification and assessment, from informal to formal, and then some ways in which you can organize the sessions themselves.

In many successful implementations of projects, we use informal methods for risk-based testing. These can work just fine. In particular, they are a good way to start learning about and practicing risk-based testing because excessive formality and paperwork can create barriers to successful adoption of risk-based testing.

In informal techniques, we rely primarily on history, stakeholder domain and technical experience, and checklists of risk categories to guide us through the process. These informal approaches are easy to put in place and to carry out. They are lightweight in terms of both documentation and time commitment. They are flexible from one project to the next since the amount of documented process is minimal.

However, since we rely so much on stakeholder experience, these techniques are participant dependent. The wrong set of participants means a relatively poor set of risk items and assessed risk levels. Because we follow a checklist, if the checklist has gaps, so does our risk analysis. Because of the relatively high level at which risk items are specified, they can be imprecise both in terms of the items and the level of risk associated with them.

That said, these informal techniques are a great way to get started doing risk-based testing. If it turns out that a more precise or formal technique is needed, the informal quality risk analysis can be expanded and formalized for subsequent projects.
Even experienced users of risk-based testing should consider informal techniques for low-risk or agile projects. You should avoid using informal techniques on safety-critical or regulated projects due to the lack of precision and tendency toward gaps.

Categories of Quality Risks

I mentioned that informal risk-based testing tends to rely on a checklist to identify risk items. What are the categories of risks that we would look for? In part, that depends on the level of testing we are considering. Let’s start with the early levels of testing, unit or component testing. In the following lists, I’m going to pose these checklist risk categories in the form of questions, to help stimulate your thinking about what might go wrong.

- Does the unit handle state-related behaviors properly? Do transitions from one state to another occur when the appropriate events occur? Are the correct actions triggered? Are the correct events associated with each input?
- Can the unit handle the transactions it should handle, correctly, without any undesirable side effects?
- What statements, branches, conditions, complex conditions, loops, and other paths through the code might result in failures?
- What flows of data into or out of the unit, whether through
parameters, objects, global variables, or persistent data structures like files or database tables, might result in immediate or delayed failures, including corrupted persistent data (the worst kind of corrupted data)?
- Is the functionality provided to the rest of the system by this component incorrect, or might it have invalid side effects?
- If this component interacts with the user, might users have problems understanding prompts and messages, deciding what to do next, or feeling comfortable with color schemes and graphics?
- For hardware components, might this component wear out or fail after repeated motion or use?
- For hardware components, are the signals correct and in the correct form?

As we move into integration testing, additional risks arise, many in the following areas:

- Are the interfaces between components well defined? What problems might arise in direct and indirect interaction between components?
- Again, what problems might exist in terms of actions and side effects, particularly as a result of component interaction?
- Are the static data spaces such as memory and disk space sufficient to hold the information needed? Are the dynamic volume conduits such as networks going to provide sufficient bandwidth?
- Will the integrated components respond correctly under typical and extreme adverse conditions?
Can they recover to normal functionality after such a condition?
- Can the system store, load, modify, archive, and manipulate data reliably, without corruption or loss of data?
- What problems might exist in terms of response time, efficient resource utilization, and the like?
- Again, for this integrated collection of components, if a user interface is involved, might users have problems understanding prompts and messages, deciding what to do next, or feeling comfortable with color schemes and graphics?

Similar issues apply for system integration testing, but we would be concerned with integration of systems, not components.

Finally, what kinds of risk might we consider for system and user acceptance testing?

- Again, we need to consider functionality problems. At these levels, the issues we are concerned with are systemic. Do end-to-end functions work properly? Are deep levels of functionality and combinations of related functions working?
- In terms of the whole system interface to the user, are we consistent? Can the user understand the interface? Do we mislead or distract the user at any point? Trap the user in dead-end interfaces?
- Overall, does
the system handle various states correctly? Considering states the user or objects acted on by the system might be in, are there potential problems here?
- Considering the entire set of data that the system uses, including data it might share with other systems, can the system store, load, modify, archive, and manipulate that data reliably, without corrupting or losing it?
- Complex systems often require administration. Databases, networks, and servers are examples. Operations these administrators perform can include essential maintenance tasks. For example, might there be problems with backing up and restoring files or tables? Can you migrate the system from one version or type of database server or middleware to another? Can storage, memory, or processor capacity be added?
- Are there potential issues with response time? With behavior under combined conditions of heavy load and low resources? Insufficient static space? Insufficient dynamic capacity and bandwidth?
- Will the system fail under normal, exceptional, or heavy load conditions? Might the system be unavailable when needed? Might it prove unstable with certain functions?
- Configuration: What installation, data migration, application migration, configuration, or initial conditions might cause problems?
- Will the system respond correctly under typical and extreme adverse conditions? Can it recover to normal functionality after such a condition? Might its response to such conditions create consequent conditions that negatively affect interoperating or cohabiting applications?
- Might certain date- or time-triggered events fail? Do related functions that use dates or times work properly together? Could situations like leap years or daylight saving time transitions cause problems? What about time zones?
- In terms of the various languages we need to support, will some of those character sets or translated messages cause problems?
Might currency differences cause problems?
- Do latency, bandwidth, or other factors related to the networking or distribution of processing and storage cause potential problems?
- Might the system be incompatible with various environments it has to work in? Might the system be incompatible with interoperating or cohabiting applications in some of the supported environments?
- What standards apply to our system, and might it violate some of those standards?
- Is it possible for users without proper permission to access functions or data they should not? Are users with proper permission potentially denied access? Is data encrypted when it should be? Can security attacks bypass various access controls?
- For hardware systems,
might normal or exceptional operating environments cause failures? Will humidity, dust, or heat cause failures, either permanent or intermittent?
- Are there problems with power consumption for hardware systems? Do normal variations in the quality of the power supplied cause problems? Is battery life sufficient?
- For hardware systems, might foreseeable physical shocks, background vibrations, or routine bumps and drops cause failure?
- Is the documentation incorrect, insufficient, or unhelpful? Is the packaging sufficient?
- Can we upgrade the system? Apply patches? Remove or add features from the installation media?

There are certainly other potential risk categories, but this list forms a good starting point. You’ll want to customize this list to your particular systems if you use it.

Documenting Quality Risks

Figure 2: A template for capturing quality risk information

In Figure 2, you see a template that can be used to capture the information you identify in quality risk analysis. In this template, you start by identifying the risk items, using the categories just discussed as a framework. Next, for each risk item, you assess its level of risk in terms of the factors of likelihood and impact. You then use these two ratings to determine the overall priority of testing and the extent of testing. Finally, if the risks arise from specific
requirements or design specification elements, you establish traceability back to these items. Let’s look at these activities and how they generate information to populate this template.

First, remember that quality risks are potential system problems that could reduce user satisfaction. We can use the risk categories to organize the list and to jog people’s memory about risk items to include. Working with the stakeholders, we identify one or more quality risk items for each category and populate the template.

Having identified the risks, we can now go through the list of risk items and assess the level of risk because we can see the risk items in relation to each other. An informal technique typically uses two main factors for assessing risk. The first is the likelihood of the problem, which is determined mostly by technical considerations. I sometimes call this “technical risk” to remind me of that fact. The second is the impact of the problem, which is determined mostly by business or operational considerations. I sometimes call this “business risk” to remind me of that fact.

Both likelihood and impact can be rated on an ordinal scale. A three-point ordinal scale is high, medium, and low. I prefer to use a five-point scale, from very high to very low.

Given the likelihood and impact, we can calculate a single, aggregate measure of risk for the quality risk item. A generic term for this measure of risk is risk priority number. One way to do this is to use a formula to calculate the risk priority number from the likelihood and impact. First, translate the ordinal scale into a numerical scale, as in this example:

1 = Very high
2 = High
3 = Medium
4 = Low
5 = Very low

You can then calculate the risk priority number as the product of the two numbers.
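As a minimal sketch of the product rule just described, the following Python maps the five-point ordinal ratings to the numbers 1 through 5 and multiplies them. The names `RATING` and `risk_priority_number` and the sample risk items are illustrative assumptions, not part of the template.

```python
# Five-point ordinal scale from the text: 1 = very high ... 5 = very low.
RATING = {"very high": 1, "high": 2, "medium": 3, "low": 4, "very low": 5}


def risk_priority_number(likelihood: str, impact: str) -> int:
    """Return the product of the numeric likelihood and impact ratings.

    Lower numbers mean higher risk, because 1 stands for "very high".
    """
    return RATING[likelihood.lower()] * RATING[impact.lower()]


# Hypothetical risk items (name, likelihood, impact), for illustration only.
risk_items = [
    ("Login rejects valid users", "low", "very high"),
    ("Report footer misaligned", "medium", "very low"),
    ("Order total miscalculated", "high", "high"),
]

# Sort ascending so the riskiest items (smallest product) come first.
ranked = sorted(risk_items, key=lambda item: risk_priority_number(item[1], item[2]))
for name, likelihood, impact in ranked:
    print(risk_priority_number(likelihood, impact), name)
```

Because 1 stands for very high on this scale, a smaller product indicates a higher risk, so the sort order is ascending rather than descending.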
We’ll revisit this issue in a later section of this chapter because this is just one of many ways to determine the risk priority.

The risk priority number can be used to sequence the tests. To allocate test effort, I determine the extent of testing. Figure 2 shows one way to do this, by dividing the risk priority number into five groups and using those to determine test effort:

1–5 = Extensive
6–10 = Broad
11–15 = Cursory
16–20 = Opportunity
21–25 = Report bugs only

We’ll return to the matter of variations in the way to accomplish this later in this section.

As noted before, while you go through the quality risk analysis process, you are likely to generate various useful by-products. These include implementation assumptions that you and the stakeholders made about the system in assessing likelihood. You’ll want to validate these, and they might prove to be useful suggestions. The by-products also include project risks that you discovered, which the project manager can address. Perhaps most importantly, the by-products include problems with the requirements, design, or other input documents. We can now avoid having these problems turn into actual system defects. Notice that all three enable the bug-preventive role of testing discussed earlier in this book.

In Figure 3, you see an example of an informal quality risk analysis. We have used six quality categories for our framework:
These are the standard quality categories used by some groups at Hewlett Packard.

I’ve provided one or two example quality risk items for each category. Of course, for a typical product there would be more like 100 to 500 total quality risks, perhaps even more for particularly complex products.

Figure 3: Informal quality risk analysis example

Quality Risk Analysis Using ISO 9126

We can increase the structure of an informal quality risk analysis (formalize it slightly, if you will) by using the ISO 9126 standard as the quality characteristic framework instead of the rather lengthy and unstructured list of quality risk categories given on the previous pages.

This has some strengths. The ISO 9126 standard provides a predefined and thorough framework. The standard itself (that is, the entire set of documents that the standard comprises) provides a predefined way to tailor it to your organization. If you use this across all projects, you will have a common basis for your quality risk analyses and thus your test coverage. Consistency in testing across projects provides comparability of results.
The use of ISO 9126 in risk analysis has its weaknesses too. For one thing, if you are not careful tailoring the quality characteristics, you could find that you are potentially over-broad in your analysis. That makes you less efficient. For another thing, applying the standard to all projects, big and small, complex and simple, could prove over-regimented and heavyweight from a process point of view.

I would suggest that you consider the use of ISO 9126 structure for risk analysis whenever a bit more formality and structure is needed, or if you are working on a project where standards compliance matters. I would avoid its use on atypical projects or projects where too much structure, process overhead, or paperwork is likely to cause a problem, relying instead on the lightest-weight informal process possible in such cases.

To refresh your memory on the ISO 9126 standard, here are the six quality characteristics:

- Functionality, which has the subcharacteristics of suitability, accuracy, interoperability, security, and compliance
- Reliability, which has the subcharacteristics of maturity (robustness), fault tolerance, recoverability, and compliance
- Usability, which has the subcharacteristics of understandability, learnability, operability, attractiveness, and compliance
- Efficiency, which has the subcharacteristics of time behavior, resource utilization, and compliance
- Maintainability, which has the subcharacteristics of analyzability, changeability, stability, testability, and compliance
- Portability, which has the subcharacteristics of adaptability, installability, coexistence, replaceability, and compliance

You should remember, too, that in the ISTQB taxonomy of black-box or behavioral tests, those related to functionality and its subcharacteristics are functional tests, while those related to reliability, usability, efficiency, maintainability, and portability and their subcharacteristics are non-functional
tests.

Quality Risk Analysis Using Cost of Exposure

Another form of quality risk analysis is referred to as cost of exposure, a name derived from the financial and insurance world. The cost of exposure (or the expected payout, in insurance parlance) is the likelihood of a loss times the average cost of such a loss. Across a large enough sample of risks for a long enough period, we would expect the total amount lost to tend toward the total of the costs of exposure for all the risks.

So, for each risk, we should estimate, evaluate, and balance the costs of testing versus not testing. If the cost of testing were below the cost of exposure for a risk, we would expect testing to save us money on that particular risk. If the cost of testing were above the cost of exposure for a risk, we would expect testing not to be a smart way to reduce costs of that risk.

This is obviously a very judicious and balanced approach to testing. Where there’s a business case, we test; where there’s not, we don’t. What could be more practical? Further, in the insurance and financial worlds, you’re likely to
find that stakeholders relate easily and well to this approach.

That said, it has some problems. In order to do this with any degree of confidence, we need enough data to make reasonable estimates of likelihood and cost. Furthermore, this approach uses monetary considerations exclusively to decide on the extent and sequence of testing. For some risks, the primary downsides are nonmonetary, or at least difficult to quantify, such as lost business and damage to company image.

If I were working on a project in a financial or actuarial world, and had access to data, I’d probably lean toward this approach. The accessibility of the technique to the other participants in the risk analysis process is quite valuable. However, I’d avoid this technique on safety- or mission-critical projects. There’s no way to account properly for the risk of injuring people or the risk of catastrophic impact to the business.

Quality Risk Analysis Using Hazard Analysis

Another risk analysis technique that you can use is called hazard analysis. Like cost of exposure, it fits with certain fields quite well and doesn’t fit many others.

A hazard is the thing that creates a risk. For example, a wet bathroom floor creates the risk of a broken limb due to a slip and fall. In hazard analysis, we try to understand the hazards that create risks for our systems. This has implications not only for testing but also for upstream activities that can reduce the hazards and thus reduce the likelihood of the risks.

As you might imagine, this is a very exact, cautious, and systematic technique. Having identified a risk, we then must ask ourselves how that risk comes to be and what we might do about the hazards that create the risk. In situations in which we can’t afford to miss anything, this makes sense.

However, in complex systems there could be dozens or hundreds or thousands of hazards that interact to create risks.
Many of the hazards might be beyond our ability to predict. So, hazard analysis is overwhelmed by excessive complexity and in fact might lead us to think the risks are fewer than they really are. That’s bad.

I would consider using this technique on medical or embedded systems projects. However, on unpredictable, rapidly evolving, or highly complex projects, I’d avoid it.

Determining the Aggregate Risk Priority

We are going to cover one more approach for risk analysis in a moment, but I want to return to this issue of using risk factors to derive an aggregate risk priority using a formula. You’ll recall this was the technique shown earlier when we multiplied the likelihood and impact to determine the risk priority number. It is also implicit in the cost of exposure technique, where the cost of exposure for any given risk is the product of the likelihood and the average cost of a loss associated with that risk.
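As a rough illustration of the cost of exposure calculation, the following Python sketch multiplies likelihood by average loss for each risk and compares the result with an estimated cost of testing. The `Risk` class, the risk names, and every figure are hypothetical, invented only for this example; in practice the estimates would have to come from historical data.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    """A hypothetical risk record; all fields are estimates."""
    name: str
    likelihood: float       # estimated probability of the loss, 0.0 to 1.0
    average_loss: float     # estimated average cost of such a loss
    cost_of_testing: float  # estimated cost of testing for this risk

    @property
    def cost_of_exposure(self) -> float:
        # Cost of exposure = likelihood of a loss times its average cost.
        return self.likelihood * self.average_loss

    @property
    def worth_testing(self) -> bool:
        # Testing pays off when it costs less than the exposure it addresses.
        return self.cost_of_testing < self.cost_of_exposure


# Illustrative numbers only.
risks = [
    Risk("Duplicate payment posted", 0.25, 20_000.0, 2_000.0),
    Risk("Report footer misaligned", 0.5, 500.0, 1_500.0),
]

for risk in risks:
    decision = "test" if risk.worth_testing else "do not test"
    print(f"{risk.name}: exposure {risk.cost_of_exposure:.2f}, "
          f"testing {risk.cost_of_testing:.2f} -> {decision}")
```

With these made-up figures, the first risk has an exposure well above its testing cost and the second does not. The sketch considers money only, so it inherits the limitation noted for this technique: nonmonetary downsides do not appear in the comparison.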
Some people prefer to use addition rather than multiplication. For example, Rick Craig uses addition of the likelihood and impact.[6] This results in a more compressed and less sparse scale of risks. To see that, take a moment to construct two tables. Use likelihood and impact ranging from 1–5 for each, and then populate the tables showing all possible risk priority number calculations for all combinations of likelihood and impact. The tables should each have 25 cells. In the case of addition, the risk priority numbers range from 2–10, while in the case of multiplication, the risk priority numbers range from 1–25.

It’s also possible to construct sophisticated formulas for the risk priority number, some of which might use subfactors for each major factor. For example, certain test management tools such as the newer versions of Quality Center support this. In these formulas, we can weight some of the factors so that they account for more points in the total risk priority score than others.

In addition to calculating a risk priority number for sequencing of tests, we also need to use risk factors to allocate test effort. We can derive the extent of testing using these factors in a couple of ways. We could try to use another formula. For example, we could take the risk priority number and multiply it by some given number of hours for design and implementation and some other number of hours for test execution. Alternatively, we could use a qualitative method where we try to match the extent of testing with the risk priority number, allowing some variation according to tester judgment. If you do choose to use formulas, make sure you tune them based on historical data.
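The two tables described above can also be generated mechanically. The short Python sketch below builds both 5 × 5 grids and counts the distinct risk priority numbers in each; the function names are illustrative choices, not part of any tool mentioned in the text.

```python
def rpn_table(combine, scale=range(1, 6)):
    """Return a grid of risk priority numbers, one row per likelihood value."""
    return [[combine(likelihood, impact) for impact in scale]
            for likelihood in scale]


def distinct_values(table):
    """Return the sorted set of risk priority numbers that occur in the grid."""
    return sorted({value for row in table for value in row})


additive = rpn_table(lambda likelihood, impact: likelihood + impact)
multiplicative = rpn_table(lambda likelihood, impact: likelihood * impact)

for name, table in (("Addition", additive), ("Multiplication", multiplicative)):
    print(name)
    for row in table:
        print(" ".join(f"{value:2d}" for value in row))
    values = distinct_values(table)
    print(f"range {values[0]}-{values[-1]}, {len(values)} distinct values\n")
```

Addition produces every value from 2 through 10, nine distinct values with no gaps. Multiplication produces only 14 distinct values scattered across the range 1 through 25, which is the sense in which the additive scale is more compressed and less sparse.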
If you are instead time-boxed in your testing, you can use formulas based on risk priority numbers to distribute the test effort proportionally based on risks.

Some people prefer to use a table rather than a formula to derive the aggregate risk priority from the factors. Table 2 shows an example of such a table.

Table 2: Using a table for risk priority

6: See his book, Systematic Software Testing.

First you assess the likelihood and impact as before. You then use the table to select the aggregate risk priority for the risk item based on likelihood and impact scores. Notice that the table looks quite different from the two you constructed earlier. Now, experiment with different mappings of risk priority numbers to risk priority ratings, ranging from very high to very low, to see whether the addition or
multiplication method more closely corresponds to this table.

As with the formulas discussed a moment ago, you should tune the table based on historical data. Also, you should incorporate flexibility into this approach by allowing deviation from the aggregate risk priority value in the table based on stakeholder judgment for each individual risk.

In Table 3, you see that not only can we derive the aggregate risk rating from a table, we can do something similar for the extent of testing. Based on the risk priority rating, we can now use a table like Table 3 to allocate testing effort. You might want to take a moment to study this table.

Table 3: Using a table for extent of testing

Stakeholder Involvement

On a few occasions in this section so far, I’ve mentioned the importance of stakeholder involvement. In the last sections, we’ve looked at various techniques for risk identification and analysis. However, the involvement of the right participants is just as important as, and probably more important than, the choice of technique. The ideal technique without adequate stakeholder involvement will usually provide little or no valuable input, while a less-than-ideal technique, actively supported and participated in by all stakeholder groups, will almost always produce useful information and guidance for testing.

What is most critical is that we have a cross-functional team representing all of the stakeholders who have an interest in testing and quality. This means that we involve at least two general stakeholder groups. One is made up of those who understand the needs and
interests of the customers and/or users. The other includes those who have insight into the technical details of the system. We can involve business stakeholders, project funders, and others as well. Through the proper participant mix, a good risk-based testing process gathers information and builds consensus around what not to test, what to test, the order in which to test, and the way to allocate test effort.

I cannot overstate the value of this stakeholder involvement. Lack of stakeholder involvement leads to at least two major dysfunctions in the risk identification and analysis. First, there is no consensus on priority or effort allocation. This means that people will second-guess your testing after the fact. Second, you will find, either during test execution or, worse yet, after delivery, that there are many gaps in the identified risks, or errors in the assessment of the level of risk, due to the limited perspectives involved in the process.

While we should always try to include a complete set of stakeholders, often not all stakeholders can participate or would be willing to do so. In such cases, some stakeholders may act as surrogates for other stakeholders. For example, in mass-market software development, the marketing team might ask a small sample of potential customers to help identify potential defects that would affect their use of the software most heavily. In this case, the sample of potential customers serves as a surrogate for the entire eventual customer base. As another example, business analysts on IT projects can sometimes represent the users rather than involving users in potentially distressing risk analysis sessions where we have conversations about what could go wrong and how bad it would be.

Failure Mode and Effect Analysis

The last, and most formal, technique we’ll consider for risk-based testing is Failure Mode and Effect Analysis.
This technique was developed originally as a design-for-quality technique. However, you can extend it for risk-based software and systems testing. As with an informal technique, we identify quality risk items, in this case called failure modes. We tend to be more fine grained about this than we would in an informal approach. This is in part because, after identifying the failure modes, we then identify the effects those failure modes would have on users, customers, society, the business, and other project stakeholders.

This technique has as its strength the properties of precision and meticulousness. When it’s used properly, you’re less likely to miss an important quality risk with this technique than with the other techniques. Hazard analysis is similarly precise, but it tends to be overwhelmed by complexity due to the need to analyze the upstream hazards that cause risks. For Failure Mode and Effect Analysis (often called FMEA, or “fuh-me-uh”), the downstream analysis of effects is easier, making the technique more general.

ISTQB Glossary

Failure Mode and Effect Analysis (FMEA): A systematic approach to risk identification and analysis in which you identify possible modes of failure and attempt to prevent their occurrence.

Failure Mode, Effect and Criticality Analysis (FMECA): An extension of FMEA, as in addition to the basic FMEA, it includes a criticality
analysis, which is used to chart the probability of failure modes against the severity of their consequences. The result highlights failure modes with relatively high probability and severity of consequences, allowing remedial effort to be directed where it will produce the greatest value.

However, this precision and meticulousness has its weaknesses. It tends to produce lengthy outputs. It is document heavy. The large volume of documentation produced requires a lot of work not only during the initial analysis, but also during maintenance of the analysis during the project and on subsequent projects. It is also hard to learn, requiring much practice to master. If you must learn to use FMEA, it’s best to start with an informal technique for quality risk analysis on another project first or to first do an informal quality risk analysis and then upgrade that to FMEA after it is complete.

I have used FMEA on a number of projects, and would definitely consider it for high-risk or conservative projects. However, for chaotic, fast-changing, or prototyping projects, I would avoid it.

Failure mode and effect analysis was originally developed to help prevent defects during design and implementation work. I came across the idea initially in D.H. Stamatis’s book Failure Mode and Effect Analysis and decided to apply it to software and hardware/software systems based on some work I was doing with clients in the mid-1990s. I later included a discussion of it in my first book, Managing the Testing Process, published in 1999, which as far as I know makes it the first software-testing-focused discussion of the technique. I discussed it further in Critical Testing Processes as well. So, I can’t claim to have invented the technique by any means, but I can claim to have been a leading popularizer of the technique amongst software testers.

Failure Mode and Effect Analysis exists in several variants.
One is Failure Mode, Effects and Criticality Analysis (FMECA, or “fuh-me-kuh”), where the criticality of each effect is assessed along with other factors affecting the level of risk for the effect in question.

Two other variants (at least in naming) exist when the technique is applied to software. These are software failure mode and effect analysis and software failure mode, effects and criticality analysis. In practice, I usually hear people use the terms FMEA and FMECA in the context of both software and hardware/software systems. In this book, we’ll focus on FMEA. The changes involved in the criticality analysis are minor and we can ignore them here.

Quality Risk Analysis Using Failure Mode and Effect Analysis

The FMEA approach is iterative. In other words, reevaluation of residual risk, on an effect-by-effect basis, is repeated throughout the process. Since this technique began as a design and implementation technique, ideally the technique is used early in the project.

As with other forms of risk analysis, we would expect test analysts and test managers to contribute to the process and the creation of the FMEA document. Because the documents can be intricate, it’s important that testers who want to contribute understand their purpose and application. As with any other risk analysis, test analysts and test managers, like all participants, should be able to apply their knowledge, skills, experience, and unique outlook to help perform the risk analysis itself, following a FMEA approach.
As I mentioned before, FMEA and its variants are not ideal for all projects. However, it should be applied when appropriate, as it is precise and thorough. Specifically, FMEA makes sense under the following circumstances:

- The software, system, or system of systems is potentially critical and the risk of failure must be brought to a minimum. For example, avionics software, industrial control software, and nuclear control software would deserve this type of scrutiny.
- The system is subject to mandatory risk-reduction or regulatory requirements, for example, medical systems or those subject to IEC 61508.
- The risk of project delay is unacceptable, so management has decided to invest extra effort to remove defects during early stages of the project. This involves using the design and implementation aspects of FMEA more so than the testing aspects.
- The system is both complex and safety critical, so close analysis is needed to define special test considerations, operational constraints, and design decisions. For example, a battlefield command, communication, and control system that tied together disparate systems participating in the ever-changing scenario of a modern battle would benefit from the technique.

As I mentioned earlier, if necessary, you can use an informal quality risk analysis technique first, then augment that to include the additional precision and factors considered with FMEA.

Since FMEA arose from the world of design and implementation (not testing) and since it is inherently iterative, you should plan to schedule FMEA activities very early in the process, even if only preliminary, high-level information is available. For example, a marketing requirements document or even a project charter can suffice to start.
As more information becomes available, and as decisions firm up, you can refine the FMEA based on the additional details.

Additionally, you can perform a FMEA at any level of system or software decomposition. In other words, you can (and I have) perform a FMEA on a system, but you can (and I have) also perform it on a subset of system modules during integration testing or even on a single module or component.

Whether you start at the system level, the integration level, or the component level, the process is the same. First, working function by function, quality characteristic by quality characteristic, or quality risk category by quality risk category, identify the failure modes. A failure mode is exactly what it sounds like: a way in which something can fail. For example, if we are considering an e-commerce system’s security, a failure mode could be “Allows inappropriate access to customer credit card data.” So far, this probably sounds much like informal quality risk analysis to you, but the next step is the point at which it gets different.

In the next step, we try to identify the possible causes for each failure mode. This is not something included in the informal techniques we discussed before. Why do we do this? Well, remember that FMEA is originally a design and implementation tool. We try to identify causes for failures so we can remove those causes from the design and avoid introducing them into the implementation. To continue with our e-commerce example, one cause of the inappropriate access failure mode could be “Credit card data not encrypted.”

The next step, also unique to FMEA, is that, for each failure mode, we identify the possible effects. Those effects can be on the system
itself, on users, on customers, on other project and product stakeholders, even on society as a whole. (Remember, this technique is often used for safety-critical systems like nuclear control where society is indeed affected by failures.) Again, using our e-commerce example, one effect of the access failure mode could be “Fraudulent charges to customer credit cards.”

Based on these three elements (the failure mode, the cause, and the effect) we can then assess the level of risk. We’ll look at how this works in just a moment. We can also assess criticality. In our e-commerce example, we’d say that leakage of credit card data is critical.

Now, we can decide what types of mitigation or risk reduction steps we can take for each failure mode. In our informal approaches to quality risk analysis, we limited ourselves to defining an extent of testing to be performed here. However, in FMEA, assuming we involved the right people, we can specify other design and implementation steps too. For the e-commerce example, a mitigation step might be “Encrypt all credit card data.” A testing step might be “Penetration-test the encryption.”

Notice that this example highlights the iterative elements of this technique. The mitigation step of encryption reduces the likelihood of the failure mode, but it introduces new causes for the failure mode, such as “Weak keys used for encryption.”

We not only iterate during the process, we iterate at regular intervals in the lifecycle, as we gain new information and carry out risk mitigation steps, to refine the failure modes, causes, effects, and mitigation actions.

Determining the Risk Priority Number

Let’s return to the topic of risk factors and the overall level of risk. In FMEA, people commonly refer to the overall level of risk as the risk priority number, or RPN.

When doing FMEA, there are typically three risk factors used to determine the risk priority number:

- Severity.
This is an assessment of the impact of the failure mode on the system, based on the failure mode itself and the effects.
- Priority. This is an assessment of the impact of the failure mode on users, customers, the business, stakeholders, the project, the product, and society, based on the effects.
- Detection. This is an assessment of the likelihood of the problem existing in the system and escaping detection without any additional mitigation. This takes into consideration the causes of the failure mode and the failure mode itself.

People performing a FMEA often rate these risk factors on a numerical scale. You can use a 1 to 10 scale, though a 1 to 5 scale is also common. You can use either a descending or an ascending scale, so long as each of the factors uses the same type of scale, either all descending or all ascending. In other words, 1 can be the most risky assessment or the least risky, respectively. If you use a 1 to 10 scale, then a descending scale means 10 is the least risky. If you use a 1 to 5 scale, then a descending scale means 5 is the least risky. For ascending scales, the most risky would be 10 or 5, depending on the scale.

Personally, I always worry about using anything finer grained than a five-point scale. Unless I
can actually tell the difference between a 9 and a 10 or a 2 and a 3, for example, it seems like I’m just lying to others and myself about the level of detail at which I understand the risks. Trying to achieve this degree of precision can also lengthen debates between stakeholders in the risk analysis process, often to little if any benefit.

As I mentioned before, you determine the overall or aggregate measure of risk, the risk priority number (or RPN), using the three factors. The simplest way to do this, and one in common use, is to multiply the three factors. However, you can also add the factors. You can also use more complex calculations, including the use of weighting to emphasize one or two factors.

As with risk priority numbers for the informal techniques discussed earlier, the FMEA RPN will help determine the level of effort we invest in risk mitigation. However, note that FMEA risk mitigation isn’t always just through testing. In fact, multiple levels of risk mitigation could occur, particularly if the RPN is serious enough.

Where failure modes are addressed through testing, we can use the FMEA RPN to sequence the test cases. Each test case inherits the RPN for the highest-priority risk related to it. We can then sequence the test cases in risk priority order wherever possible.

Benefits, Costs, and Challenges of FMEA

So, what are the benefits of FMEA? In addition to being precise and thorough, and thus less likely to misassess or omit risks, FMEA provides other advantages. It requires detailed analysis of expected system failures that could be caused by software failures or usage errors, resulting in a complete view (if perhaps an overwhelming view) of the potential problems.

If FMEA is used at the system level, rather than only at a component level, we can have a detailed view of potential problems across the system.
In other words, if we consider systemic risks, including emergent reliability, security, and performance risks, we have a deeply informed understanding of system risks. Again, those performing and especially managing the analysis can find this overwhelming, and it certainly requires a significant time investment to understand the entire view and its import.

As I’ve mentioned, another advantage of FMEA, as opposed to other quality risk analysis techniques discussed, is that we can use our analysis to help guide design and implementation decisions. The analysis can also provide justification for not doing certain things, for avoiding certain design decisions, for not implementing in a particular way or with a particular technology.

As with any quality risk analysis technique, our FMEA analysis can focus our testing on specific, critical areas of the system. However, because it’s more precise than other techniques, the focusing effect is correspondingly more precise. This can have test design implications, too, since you might choose to implement more fine-grained tests to take the finer-grained understanding of risk into account.

There are costs and challenges associated with FMEA, of course. For one thing, you have to force yourself to think about use cases, scenarios, and other realities that can lead to sequences of failures. Because of the fine-grained nature of the analysis, it’s easy to focus on each failure mode in isolation, without considering everything else that’s going on. You can, and should, overcome this challenge, of course.
As mentioned a few times, the FMEA tables and other documentation can be huge. This means that participants and those managing the analysis can find the development and maintenance of these documents a large, time-consuming, expensive investment.

As originally conceived, FMEA works function by function. When looking at a component or a complex system, it might be difficult to define independent functions. I’ve managed to get around this myself by doing the analysis not just by function, but by quality characteristic or by quality risk category.

Finally, when trying to anticipate causes, it might be challenging to distinguish true causes from intermediate effects. For example, suppose we are considering a failure mode for an e-commerce system such as “Foreign currency transactions rejected.” We could list a cause as “Credit card validation cannot handle foreign currency.” However, the true cause might be that we simply haven’t enabled foreign currency processing with our credit card processing vendor, which is a simple implementation detail, provided someone remembers to do it. These challenges are in addition to those discussed earlier for quality risk analysis in general.

Case Study of FMEA

In Figure 4, you see an example of a quality risk analysis document. It is a case study of an actual project. This document, and the approach we used, followed the Failure Mode and Effect Analysis approach.

Figure 4: Case study of Failure Mode and Effect Analysis

As you can see, we started, at the left side of the figure, with a specific function and then identified failure modes and their possible effects. We determined criticality based on the
effects, along with the severity and priority. We listed possible causes to enable bug prevention work during requirements, design, and implementation.

Next, we looked at detection methods, that is, those methods we expected to apply anyway for this project. The more likely the failure mode was to escape detection, the worse the detection number. We calculated a risk priority number based on the severity, priority, and detection numbers. Smaller numbers were worse. Severity, priority, and detection each ranged from 1 to 5. So the risk priority number ranged from 1 to 125.

This particular figure shows the highest-level risk items only because it was sorted by risk priority number. For these risk items, we’d expect a lot of additional detection and other recommended risk control actions. You can see that we have assigned some additional actions at this point but have not yet assigned the owners.

During testing actions associated with a risk item, we’d expect that the number of test cases, the amount of test data, and the degree of test coverage would all increase as the risk increased. Notice that we can allow any test procedures that cover a risk item to inherit the level of risk from the risk item. That documents the priority of the test procedure, based on the level of risk.

Risk-Based Testing and the Testing Process

We’ve talked so far about quality risk analysis techniques. As with any technique, we have to align and integrate the selected quality risk analysis technique with the larger testing process and indeed the larger software or system development process. Table 8 shows a general process that you can use to organize the quality risk identification, assessment, and management process for quality risk-based testing.[7]

Let’s go through the process step-by-step and in detail. Identify the stakeholders who will participate. This is essential to obtain the most benefit from quality risk-based testing.
You want a cross-functional team that represents the interests of all stakeholders in testing and quality. The better the representation of all interests, the less likely it is that you will miss key risk items or improperly estimate the levels of risk associated with each risk item. These stakeholders are typically in two groups. The first group consists of those who understand the needs and interests of the customers and users (or are the customers and users). They see potential business-related problems and can assess the impact. The second group consists of those who understand the technical details. They see what is likely to go wrong and how likely it is.

Select a technique. The previous sections should have given you some ideas on how to do that.

7: This process was first published in my book Critical Testing Processes.

Identify the quality risk items using the technique chosen. Assess the level of risk associated with each item. The identification and assessment can occur as a single meeting, using brainstorming or similar techniques, or as a series of interviews, either with small groups or one-on-one. Try to achieve consensus on the rating for each risk item. If you can’t, escalate to the appropriate level of management. Now select appropriate mitigation techniques. Remember that this doesn’t just have to be testing at one or more levels. It can also include
reviews of requirements, design, and code; defensive programming techniques; static analysis to ensure secure and high-quality code; and so forth.

Deliver the by-products. Risk identification and analysis often locates problems in requirements, design, code, or other project documents, models, and deliverables. These can be actual defects in these documents, project risks, and implementation assumptions and suggestions. You should send these by-products to the right person for handling.

Review, revise, and finalize the quality risk document that was produced.

This document is now a valuable project work product. You should save it to the project repository, placing it under some form of change control. The document should change only with the knowledge of (ideally, the consent of) the other stakeholders who participated.

Table 8: Quality risk analysis process

That said, it will change. You should plan to revise the risk assessment at regular intervals. For example, review and update the document at major project milestones such as the completion of the requirements, design, and implementation phases and at test level entry and exit reviews. Also, review and update when significant chunks of new information become available, such as at the completion of the first
test cycle in a test level. You should plan to add new risk items and reassess the level of risk for the existing items.

Throughout this process, be careful to preserve the collaborative nature of the endeavor. In addition to the information-gathering nature of the process, the consensus-building aspects are critical. Both business-focused and technically focused participants can and should help prioritize the risks and select mitigation strategies. This way, everyone has some responsibility for and ownership of the testing effort that will be undertaken.

Risk-Based Testing throughout the Lifecycle

A basic principle of testing discussed in the Foundation syllabus is the principle of early testing and quality assurance. This principle stresses the preventive potential of testing. Preventive testing is part of analytical risk-based testing. It’s implicit in the informal quality risk analysis techniques and explicit in FMEA.

Preventive testing means that we mitigate risk before test execution starts. This can entail early preparation of testware, pretesting test environments, pretesting early versions of the product well before a test level starts, insisting on tougher entry criteria to testing, ensuring requirements for and designing for testability, participating in reviews including retrospectives for earlier project activities, participating in problem and change management, and monitoring the project progress and quality.

In preventive testing, we integrate quality risk control actions into the entire lifecycle.
Test managers should look for opportunities to control risk using various techniques, such as those listed here:

- An appropriate test design technique
- Reviews and inspection
- Reviews of test design
- An appropriate level of independence for the various levels of testing
- The use of the most experienced person on test tasks
- The strategies chosen for confirmation testing (retesting) and regression testing

Preventive test strategies acknowledge that we can and should mitigate quality risks using a broad range of activities, many of them not what we traditionally think of as “testing.” For example, if the requirements are not well written, perhaps we should institute reviews to improve their quality rather than relying on tests that will be run once the badly written requirements become a bad design and ultimately bad, buggy code?

Dynamic testing is not effective against all kinds of quality risks. For example, while we can easily find maintainability issues related to poor coding practices in a code review, which is a static test, dynamic testing will only reveal the consequences of unmaintainable code over time, as excessive regression starts to occur.

In some cases, it’s possible to estimate the risk reduction effectiveness of testing in general and of specific test techniques for given risk items. For example, use-case-based functional tests are unlikely to do much to reduce performance or reliability risks.

So, there’s not much point in using dynamic testing to reduce risk where there is a low level of test effectiveness. Quality risk analysis, done earlier in the project, makes project stakeholders aware of quality risk mitigation opportunities