Software reliability & quality


Published in: Education, Technology, Business

  1. By Nur ISLAM
  2.
     • Categorising and specifying the reliability of software systems
     • Discussing various issues associated with Software Quality Assurance (SQA)
  3.
     • Reliability: the probability of failure-free operation for a specified time in a specified environment for a given purpose
     • This means quite different things depending on the system and the users of that system
     • Informally, reliability is a measure of how well system users think it provides the services they require
  4.
     • Reliability cannot be defined objectively
        ▪ It requires an operational profile for its definition
        ▪ Reliability measurements quoted out of context are not meaningful
     • The operational profile defines the expected pattern of software usage
     • Fault consequences must be considered
        ▪ Not all faults are equally serious; the system is perceived as more unreliable when its faults have serious consequences
  5.
     • Which matters more: assessing the core of the program (running 90% of the time) or the non-core sections (running 10% of the time)?
     • Program profilers (prof on Unix) are tools that reveal where the program spends its time
     • Removing defects/bugs does not by itself indicate effectiveness in increasing the reliability of the product
     • Removing defects from a non-core section does not have the same effect as removing them from the core section
  6.
     • Perceived software reliability is observer-dependent
     • If you do not encounter a problem, you do not report it
     • Different users have different views of the system, and thus different quality and reliability assessments
     • Software reliability keeps changing as defects are detected and fixed
  7.
     • An important characteristic that sets hardware and software reliability apart is the difference between their failure patterns
     • Hardware components fail for very different reasons than software components
     • Hardware components fail mostly due to wear and tear, whereas software components fail due to bugs
  8.
     • Hardware metrics are not really suitable for software: they are based on component failures and the need to repair or replace a component once it has failed, with the design assumed to be correct
     • Software failures are always design failures
     • Often the system remains available in spite of the fact that a failure has occurred
  9.
     • Probability of failure on demand (POFOD)
        ▪ A measure of the likelihood that the system will fail when a service request is made
        ▪ POFOD = 0.001 means 1 out of 1000 service requests results in failure
        ▪ Relevant for safety-critical or non-stop systems
     • Rate of occurrence of failures (ROCOF)
        ▪ The frequency of occurrence of unexpected behaviour
        ▪ ROCOF = 0.02 means 2 failures are likely in each 100 operational time units
        ▪ Relevant for operating systems and transaction processing systems
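Both metrics reduce to simple ratios over observed operation. A minimal sketch (the counts are invented for illustration):

```python
def pofod(failed_requests, total_requests):
    """Probability of failure on demand: fraction of service requests that fail."""
    return failed_requests / total_requests

def rocof(failures, operational_time_units):
    """Rate of occurrence of failures per operational time unit."""
    return failures / operational_time_units

print(pofod(1, 1000))  # 0.001 -> 1 out of 1000 requests fails
print(rocof(2, 100))   # 0.02  -> 2 failures per 100 time units
```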
  10.
     • Mean time to failure (MTTF)
        ▪ A measure of the time between observed failures
        ▪ MTTF = 500 means that the average time between failures is 500 time units
        ▪ Relevant for systems with long transactions, e.g. CAD systems
     • Availability
        ▪ A measure of how likely the system is to be available for use; takes repair/restart time into account
        ▪ Availability of 0.998 means the software is available for 998 out of every 1000 time units
        ▪ Relevant for continuously running systems, e.g. telephone switching systems
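MTTF and availability can likewise be computed directly from observed times; a sketch with illustrative numbers:

```python
def mttf(inter_failure_times):
    """Mean time to failure: average of observed times between failures."""
    return sum(inter_failure_times) / len(inter_failure_times)

def availability(uptime, downtime):
    """Fraction of total time the system is usable (repair/restart time included)."""
    return uptime / (uptime + downtime)

print(mttf([450, 500, 550]))  # 500.0 time units between failures
print(availability(998, 2))   # 0.998 -> available 998 out of 1000 time units
```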
  11.
     • Reliability does not take consequences into account
     • Transient faults may have no real consequences, but other faults might cause data loss or corruption
     • It may be worthwhile to identify different classes of failure and use different metrics for each
  12.
     • When specifying reliability, both the number of failures and the consequences of each matter
     • Failures with serious consequences are more damaging than those where repair and recovery is straightforward
     • In some cases, different reliability specifications may be defined for different failure types
  13. Failure classes:
     • Transient: occurs only with certain inputs
     • Permanent: occurs on all inputs
     • Recoverable: the system can recover without operator help
     • Unrecoverable: the operator has to help
     • Non-corrupting: the failure does not corrupt system state or data
     • Corrupting: system state or data are altered
  14.
     • A reliability growth model is a mathematical model of how system reliability changes as the system is tested and faults are removed
     • It is used as a means of reliability prediction by extrapolating from current data
     • It depends on the use of statistical testing to measure the reliability of a system version
  15. Step function model
     • The simplest reliability growth model
     • Basic assumption: reliability increases by a constant amount each time an error is detected and repaired
     • Assumes all errors contribute equally to reliability growth
     • Highly unrealistic: we already know that different errors contribute differently to reliability growth
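The step-function assumption is easy to state in code, which also shows why it is simplistic: every fix is worth exactly the same. The starting reliability and increment below are assumed values, not from the slides:

```python
def step_function_reliability(initial, increment, errors_fixed):
    """Step-function growth: reliability rises by a fixed increment per repaired error."""
    return initial + increment * errors_fixed

# Assumed values: start at 0.90, each repair adds 0.01.
for fixed in range(4):
    print(step_function_reliability(0.90, 0.01, fixed))
```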
  16. Jelinski and Moranda model
     • Recognizes that reliability does not increase by a constant amount each time an error is repaired
     • The reliability improvement due to fixing an error is assumed to be proportional to the number of errors present in the system at that time
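Under this assumption the failure (hazard) rate is proportional to the number of faults still present, so each fix lowers it by one fault's worth. A sketch of the Jelinski-Moranda hazard rate, where the initial fault count N and per-fault hazard phi are assumed values:

```python
def jm_hazard(N, phi, i):
    """Jelinski-Moranda hazard rate before the i-th failure (i = 1..N):
    proportional to the number of faults still in the system."""
    return phi * (N - i + 1)

# Assumed: N = 100 initial faults, phi = 0.001 per fault.
print(jm_hazard(100, 0.001, 1))   # rate before the first failure
print(jm_hazard(100, 0.001, 51))  # halved after 50 faults are fixed
```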
  17. Littlewood and Verrall's model
     • Assumes different faults have different sizes, thereby contributing unequally to failures
     • Allows for negative reliability growth
     • Large faults tend to be detected and fixed earlier
     • As the number of errors is driven down during testing, so is the average error size, causing a law of diminishing returns in debugging
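One way to picture this model is to treat each failure rate as a random draw whose expected size shrinks as testing progresses; because the rate is random, a later draw can exceed an earlier one, which is the negative-growth case. All parameter values below are illustrative assumptions, not part of the original slides:

```python
import random

random.seed(7)

def lv_rate(i, alpha=2.0, beta0=10.0, beta1=5.0):
    """Draw the failure rate before the i-th failure from a gamma
    distribution whose mean, alpha / (beta0 + beta1 * i), shrinks as i grows."""
    return random.gammavariate(alpha, 1.0 / (beta0 + beta1 * i))

rates = [lv_rate(i) for i in range(1, 6)]
print([round(r, 4) for r in rates])  # tends downward, but not monotonically
```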
  18. Applicability of models
     • There is no universally applicable reliability growth model
     • Reliability growth is not independent of the application
     • Fit the observed data to several growth models and take the one that best fits the data
  19. Statistical testing
     • The objective is to determine reliability rather than to discover errors
     • Uses data different from that used in defect testing
  20.
     • Different users have different operational profiles, i.e. they use the system in different ways
     • Formally, an operational profile is a probability distribution over the inputs
     • Divide the input data into a number of input classes, e.g. create, edit, print, file operations
     • Assign a probability value to each input class: the probability that an input value from that class is selected
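The steps above amount to weighted sampling over input classes. A sketch, where the class names and probabilities are invented for illustration:

```python
import random

# Hypothetical operational profile: probability of each input class.
profile = {"create": 0.10, "edit": 0.60, "print": 0.05, "file_ops": 0.25}

def sample_input_class(profile):
    """Pick an input class with probability equal to its profile weight."""
    classes = list(profile)
    weights = [profile[c] for c in classes]
    return random.choices(classes, weights=weights, k=1)[0]

random.seed(1)
picks = [sample_input_class(profile) for _ in range(1000)]
print(picks.count("edit") / 1000)  # close to the 0.60 profile weight
```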
  21.
     • Determine the operational profile of the software by analyzing the usage pattern
     • Manually select or automatically generate a set of test data corresponding to the operational profile
     • Apply the test cases to the program, recording the execution time between each failure (it may not be appropriate to use raw execution time)
     • After a statistically significant number of failures has been observed, the reliability can be computed
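Putting the measurement steps together: record inter-failure times, estimate MTTF, and convert it into a reliability figure, here assuming an exponential failure model (an assumption of this sketch, not stated in the slides). The recorded times are invented:

```python
import math

inter_failure_times = [120, 310, 545, 790, 1230]  # observed execution time units

mttf = sum(inter_failure_times) / len(inter_failure_times)  # 599.0

def reliability(t):
    """Probability of failure-free operation for time t, assuming
    exponentially distributed inter-failure times."""
    return math.exp(-t / mttf)

print(mttf)
print(reliability(100))  # chance of surviving 100 time units failure-free
```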
  22.
     • Relies on a large test data set and assumes that only a small percentage of test inputs is likely to cause system failure
     • It is straightforward to generate tests corresponding to the most common inputs, but a statistically significant percentage of unlikely inputs should also be included
     • Creating these may be difficult, especially if test generators are used
  23. Pros and cons: consider these for yourself
  24. Software quality is:
     • The degree to which a system, component, or process meets specified requirements
     • The degree to which a system, component, or process meets customer or user needs or expectations
  25. Quality attributes: correctness, efficiency, flexibility, robustness, interoperability, maintainability, performance, portability, reliability, reusability, testability, usability, availability, understandability
  26.
     • A quality management system is the principal methodology used by organizations to ensure that the products they develop have the desired quality
     • A quality system is the responsibility of the organization as a whole, and the full support of top management is a must
     • A good quality system must be well documented
  27. Quality system activities encompass the following:
     • Auditing of projects
     • Review of the quality system itself
     • Development of standards, procedures, guidelines, etc.
     • Production of reports for top management summarizing the effectiveness of the quality system in the organization
  28.
     • Before World War II, the usual way to produce quality products was to inspect the final product and remove defective units
     • The Quality Control (QC) principle then emerged: it focuses not only on detecting defective products and eliminating them, but also on determining the causes behind the defects (and fixing them), so that the product rejection rate can be reduced
  29.
     • Quality Assurance (QA): the basic premise of modern quality assurance is that if an organization's processes are good and are followed rigorously, then its products are bound to be of good quality
     • Total Quality Management (TQM) advocates that the process followed by an organization must be continuously improved through process management
  30.
     • TQM requires continuous process improvement, going beyond merely documenting processes and optimizing them through redesign
     • Over the last six decades, quality management has shifted from inspection to Total Quality Management, and quality assurance has shifted from product assurance to process assurance
  31. Quality assurance methods and the corresponding quality paradigm:
     • Inspection: product assurance
     • Quality Control (QC): product assurance
     • Quality Assurance (QA): process assurance
     • Total Quality Management (TQM): process assurance
  32.
     • Product metrics measure the characteristics of the product being developed, whereas process metrics measure how a process is performing
     • Product metrics: LOC, FP, PM, time to develop the product, the complexity of the system, etc.
     • Process metrics: review effectiveness, inspection efficiency, etc.