RCA Presentation V0 1



RCA - Root Cause Analysis


  1. Root Cause Analysis - Presentation. Ian McDonald. © 2010 Ian McDonald
  2. Capability Maturity Model (CMM). A good CMM profile is important for achieving: better customer satisfaction; increased quality; more accurate schedules; lower development costs; substantial return on investment; improved employee morale and reduced turnover.
  3. CMM Levels. The levels within CMM are: (1) Initial (chaotic, ad hoc, individual heroics) - the starting point for use of a new process. (2) Managed - the process is managed according to the metrics described in the Defined stage. (3) Defined - the process is defined/confirmed as a standard business process, and decomposed to levels 0, 1 and 2 (the latter being Work Instructions). (4) Quantitatively managed. (5) Optimizing - process management includes deliberate process optimization/improvement.
  4. Optimisation: CMM Level 5. Level 5 is about not only collecting metrics, but also feeding them back into the company processes to improve those processes further. This presentation addresses the opportunities for improvement that come from collecting Defect Record Metrics, in particular through two processes within the life cycle: Root Cause Analysis (RCA), covered within this presentation, and Test Escape Analysis (TEA), covered in a further presentation but mentioned here for clarification.
  5. Cost of Defect Fixing. It can take one person 10 minutes to find a defect in requirements and 10 minutes to fix the document: cost £10 per defect (2010). Finding the same defect in Functional Test and then fixing it: cost £211 per defect. Finding the same single defect in System Integration Testing: cost £587 per defect. Cost following delivery: £? + £Reputation.
  6. RCA vs TEA. Root Cause Analysis (RCA) allows us to learn, rectify and avoid future defect injection, provided we use the data. Test Escape Analysis (TEA), in contrast, allows us to become more efficient at testing and to stop future defects from escaping detection, again provided we use the data.
  7. Cost of Defect Fixing. It can take one person 10 minutes to find a defect in requirements and 10 minutes to fix the document: cost £10 per defect. Finding the same defect in Functional Test and then fixing it: cost £211 per defect. Finding the same single defect in System Integration Testing: cost £587 per defect. Cost following delivery: £? + £Reputation. Ideally we want to prevent the defect injection in the first place and to find those defects that do occur a lot sooner. RCA and TEA ultimately lead to budget savings and faster delivery.
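The arithmetic behind these figures can be sketched in a few lines of Python. The per-defect costs are the ones quoted on the slide; the function names and the idea of pricing a batch of defects are illustrative assumptions, not part of the original presentation.

```python
# Per-defect fix costs quoted on the slide (GBP, 2010 figures).
COST_PER_DEFECT = {
    "requirements": 10,
    "functional_test": 211,
    "system_integration_test": 587,
}


def cost_multiplier(phase: str, baseline: str = "requirements") -> float:
    """How many times more a fix costs in `phase` than in `baseline`."""
    return COST_PER_DEFECT[phase] / COST_PER_DEFECT[baseline]


def saving_from_earlier_detection(defects: int, found_in: str, earlier: str) -> int:
    """Money saved if `defects` defects had been caught in `earlier` instead."""
    return defects * (COST_PER_DEFECT[found_in] - COST_PER_DEFECT[earlier])
```

On these figures a defect that reaches System Integration Testing costs roughly 59 times as much to fix as one caught in the requirements document, which is the saving that RCA and TEA are meant to unlock.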
  8. Purpose. Root Cause Analysis is aimed at the coding: why was the defect introduced? Analogy: why did the tightrope walker fall? TEA, in contrast, is about why a defect was not detected. Analogy: why did we not catch the falling tightrope walker?
  9. Purpose. We do RCA to improve our development techniques. This might result, for example, in changes to design methods, updated review check lists, training, new tools being deployed, etc. RCA is targeted at developers. We do TEA to learn how to detect more defects and how to detect them sooner and smarter. The ultimate goal is to drive down the number of defects leaking to the customer by finding those defects ourselves first.
  10. TEA and RCA Handling. Both RCA and TEA are handled as part of the Defect Management process, collecting data within the Defect Management Database using the Defect Management Tooling deployed. TEA is handled on a statistical basis by the development team (a tester and a developer, the latter to assist in setting the context of the defect fix). The process helps the test team to prioritise specific test design approaches (e.g. the methods described within BS7925-2 and other test design methods such as Classification Tree). The outcome is to ensure that we learn from the experience and apply the right test design techniques or the right test tooling far sooner in the development and test lifecycle. That means we spend less time and money testing and fixing defects. TEA is targeted at a particular phase of a project, to avoid future defect leaks beyond that phase. It uses only a sample of defects (e.g. Customer Reported, Critical and High Severity), typically aims to address 80% of the defects within the targeted group, and is about spotting trends. TEA accepts that not every defect can be identified as one that could have been prevented earlier.
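One way to read the "80% of the defects" aim is as a Pareto cut over the sampled defects. The sketch below is an assumption about how such a cut might be computed; the function name and the category labels in the usage note are hypothetical and do not come from the presentation.

```python
from collections import Counter


def pareto_categories(categories, target=0.80):
    """Return the most frequent categories that together cover `target` of the sample.

    `categories` holds one label per sampled defect, for example the test
    design technique judged most likely to have caught it.
    """
    counts = Counter(categories)
    total = sum(counts.values())
    selected, covered = [], 0
    for category, count in counts.most_common():
        if covered / total >= target:
            break  # the categories chosen so far already reach the target
        selected.append(category)
        covered += count
    return selected
```

With a sample of ten defects labelled five "boundary", three "state", one "timing" and one "config", the function returns `["boundary", "state"]`: those two trends account for 80% of the sample, so they are where test improvement would be targeted first.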
  11. TEA and RCA Handling. RCA, in contrast, is carried out for ALL defects, at all levels of the product life cycle. It can be carried out by the development team as defects are reported and fixed; however, it must be completed by the time the defect is declared fixed and ready for test. The test team needs to confirm, when the defect is received and before closing it, that the RCA process has been completed. This is important since the RCA report may prompt additional targeted testing. Consequently RCA also has a side benefit in that it can prompt improved defect fix testing.
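The rule that RCA must be complete before a defect moves on could be enforced by the defect tooling as a simple gate. This is a minimal sketch under assumed names: the `Defect` class, its fields and the status strings are hypothetical and are not taken from any particular defect management tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Defect:
    defect_id: str
    status: str = "open"
    rca_reason: Optional[str] = None  # filled in once RCA has been completed


def _require_rca(defect: Defect, action: str) -> None:
    """Refuse the transition if no RCA has been recorded."""
    if not defect.rca_reason:
        raise ValueError(f"{defect.defect_id}: RCA must be completed before {action}")


def mark_fixed_ready_for_test(defect: Defect) -> None:
    """Developer-side gate: RCA is mandatory by this point."""
    _require_rca(defect, "declaring the defect fixed")
    defect.status = "fixed_ready_for_test"


def close_defect(defect: Defect) -> None:
    """Tester-side gate: check RCA again before closing."""
    _require_rca(defect, "closing the defect")
    defect.status = "closed"
```

Checking in both places mirrors the slide: developers complete the RCA, and the test team confirms it before closure.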
  12. Benefits of RCA. This is about collecting data that influences change for the better. The information may lead, for example, to the following. An individual, ABC, has always used a particular method to set up a call to a procedure, and this quick method has been passed on to other team members. On the new project, however, it often causes problems, which are fixed later by the defect fixing team. Spotting this recurring problem means that we can simply talk to the team, point it out, and prevent this type of defect being re-injected. We might also want to update the project review check list to spot these problems sooner. We may have specific problems with a COTS package, which may lead to a commercial decision to create a specific support process or to use a different package. We may be interfacing with customer legacy code that is maintained by another team or company; metrics showing that we are having problems with changes that break links can help in changing operating procedures. We might have issues with operating system changes. We may need to adapt to a different compiler (e.g. we are using techniques that only worked on the previous compiler, in a different project). We may need to update our company training, e.g. we may have issues with the way requirements are defined and reviewed, or we may find that we are letting too many defects through at review and want to look at how the review process is running. We may have race conditions in code that require us to do specific timing checks in future, or to use specific methods that prevent such conditions arising.
  13. When Triggered in Defect Life Cycle. RCA data can be collected at any point within a defect lifecycle, because information about the cause of the defect starts to become available the moment the defect is reported. This information can be updated later; however, by the time the defect is declared fixed and ready for test, RCA is mandatory. TEA can be triggered in a number of ways: by a defect meeting the trigger criteria (e.g. Customer Reported, Critical or High Severity, Defect Fixed and an RCA report raised); or by manually choosing to include a TEA report, either as an early report for the targeted classification of defects or for any other type of defect.
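The TEA trigger can be sketched as a single predicate. The slide's example criteria are read here as a conjunction (customer reported, and Critical or High severity, and fixed, and with an RCA report raised); that reading, the function name and the parameter names are all assumptions.

```python
def tea_triggered(customer_reported: bool, severity: str, fixed: bool,
                  rca_raised: bool, manual_request: bool = False) -> bool:
    """Decide whether a TEA report should be raised for a defect."""
    if manual_request:
        # A TEA report can always be requested by hand, for any type of defect.
        return True
    return (customer_reported
            and severity.lower() in {"critical", "high"}
            and fixed
            and rca_raised)
```

A project would normally hold the criteria in configuration rather than in code, since the targeted classification is expected to vary by phase.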
  14. Commonality between TEA and RCA. Within the defect data fields there will be some commonality between TEA and RCA, as both processes will want to use SOME common data, e.g.: Reason fault introduced; Development Phase fault introduced (we will want this to fit the delivery life cycle management process); Code designed from (e.g. Customer Requirements, Engineering Requirements, API Specification, etc.).
  15. Reason Introduced. A field shared between the main Defect record, RCA and TEA; the record can be changed from any of the reports. Stage Defect Introduced: “Earliest stage in the processes that the defect could have been prevented”. Reason Introduced: “Provide finer detail as to the initial root cause of the defect”. Fields include: Code Missing; Code not fully built or tested; Coding Error; Design Incorrect; Design Missing or Incomplete; Design Unclear; Environment Unavailable; Initial Fix Incomplete; Initial Fix Incorrect; Requirement Incorrect; Requirement Missing or Incomplete; Requirement Unclear; Standards Not Followed; Typing Mistake; Other (free text).
  16. Development Phase. Phase Defect Introduced: “Earliest phase in the processes that the defect was introduced”. This should ideally fit within the Delivery Life Cycle process. Options: Requirements Review, Design Review, Code Review, Unit Test, Component Test, Component Integration Test, System Test.
  17. Code Designed From. What was the source from which the code with the defect was created? Options: API Spec, Detail Design, Requirement, Standards, Functional Specification, Developer Led Requirements, Other (selecting Other allows a free text field to be used).
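The three shared fields from slides 15 to 17 map naturally onto enumerations. The sketch below uses the option names from the slides; the class names, the `other_text` field and the validation rule are illustrative assumptions about how a defect database might model them.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ReasonIntroduced(Enum):
    CODE_MISSING = "Code Missing"
    CODE_NOT_FULLY_BUILT_OR_TESTED = "Code not fully built or tested"
    CODING_ERROR = "Coding Error"
    DESIGN_INCORRECT = "Design Incorrect"
    DESIGN_MISSING_OR_INCOMPLETE = "Design Missing or Incomplete"
    DESIGN_UNCLEAR = "Design Unclear"
    ENVIRONMENT_UNAVAILABLE = "Environment Unavailable"
    INITIAL_FIX_INCOMPLETE = "Initial Fix Incomplete"
    INITIAL_FIX_INCORRECT = "Initial Fix Incorrect"
    REQUIREMENT_INCORRECT = "Requirement Incorrect"
    REQUIREMENT_MISSING_OR_INCOMPLETE = "Requirement Missing or Incomplete"
    REQUIREMENT_UNCLEAR = "Requirement Unclear"
    STANDARDS_NOT_FOLLOWED = "Standards Not Followed"
    TYPING_MISTAKE = "Typing Mistake"
    OTHER = "Other"


class PhaseIntroduced(Enum):
    REQUIREMENTS_REVIEW = "Requirements Review"
    DESIGN_REVIEW = "Design Review"
    CODE_REVIEW = "Code Review"
    UNIT_TEST = "Unit Test"
    COMPONENT_TEST = "Component Test"
    COMPONENT_INTEGRATION_TEST = "Component Integration Test"
    SYSTEM_TEST = "System Test"


class CodeDesignedFrom(Enum):
    API_SPEC = "API Spec"
    DETAIL_DESIGN = "Detail Design"
    REQUIREMENT = "Requirement"
    STANDARDS = "Standards"
    FUNCTIONAL_SPECIFICATION = "Functional Specification"
    DEVELOPER_LED_REQUIREMENTS = "Developer Led Requirements"
    OTHER = "Other"


@dataclass
class SharedDefectFields:
    """Fields common to the RCA and TEA reports (hypothetical record layout)."""
    reason: ReasonIntroduced
    phase: PhaseIntroduced
    designed_from: CodeDesignedFrom
    other_text: Optional[str] = None  # free text, needed when "Other" is selected

    def __post_init__(self) -> None:
        uses_other = (self.reason is ReasonIntroduced.OTHER
                      or self.designed_from is CodeDesignedFrom.OTHER)
        if uses_other and not self.other_text:
            raise ValueError("Selecting Other requires a free text explanation")
```

Keeping the options as closed lists is what makes later counting and trend spotting possible; the free text field is the escape hatch for causes the lists do not yet cover.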
  18. Typical data to collect within RCA. In addition to the data that is common with TEA, one may want to consider the following. However, one can update and change this list. It may be, for example, that we have a lot of defects under System Architecture and decide to provide sub-clauses to help us manage them. On the other hand, we may find that Coding contains too many options to be useful and decide to reduce or group the options. The point is that we collect data to bring about improvement; we do not collect data for the sake of it. This list considers a Software project; Hardware projects will need different data to be collected.
  19. RCA Data Capture. Coding Error: “Identify list of common coding problems.” Recursion; Database Error; Resource (Null Pointer, Un-initialised Pointer, Data Type, Access Violations, Memory Handling, File Handling, Buffer Overflow, Storage Violation, Stack Overflow); Arithmetic (Division by zero, Arithmetic Overflow, Arithmetic Underflow, Precision, Rounding, Arithmetic Stability); Boundary Condition or Off By One Error (OBOE); Security - Coding Issue (Buffer Overflows, Format Strings, Canonicalization, Privilege Checking, Script Injection, Information Leakage, Error Handling); Compatibility; Browser; Operating System; Interface; GUI Related; Logic; Infinite Loops; Busy Wait Loops; Syntax; Deadlock - two or more actions are each waiting for the other to finish; Team working - clashes due to conflicting changes; Memory Access Violation within code; Race Condition; Hardware Compatibility; System Architecture Error; Other (plus free text).
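Once causes are recorded against a list like this, a simple roll-up shows which categories dominate and which flat categories have grown large enough to deserve sub-clauses, as slide 18 suggests. The `Category/Sub-cause` string convention, the function names and the threshold are assumptions made for this sketch.

```python
from collections import Counter


def rollup(causes):
    """Count defects per top-level category.

    Each cause is either "Category" or "Category/Sub-cause".
    """
    return Counter(cause.split("/", 1)[0] for cause in causes)


def needs_sub_clauses(causes, threshold):
    """Flat categories (no sub-cause recorded) holding at least `threshold` defects."""
    flat = Counter(cause for cause in causes if "/" not in cause)
    return sorted(name for name, count in flat.items() if count >= threshold)
```

For example, three defects logged simply as "System Architecture Error" alongside two under "Arithmetic/..." would flag System Architecture Error (at a threshold of three) as a candidate for being split into finer options.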
  20. Summary. For projects where products are continually revised and updated, TEA can help to reduce defect leakage. It can also be useful for large programmes, targeting reviews at past phases. The defect record fields need to be set up to accommodate TEA. TEA reviews are targeted at sample batches; the reviewers are seeking patterns that show where to target test improvement.