Software reliability

A brief description of software reliability.

  • When we fix a fault, we cannot be sure that the correction is complete and successful and that no other faults have been introduced. Even if the fault is fixed properly, we do not know how much the inter-failure time will improve.

    1. SOFTWARE RELIABILITY
       LT CDR PABITRA KUMAR PANDA, M TECH (RE), IIT KGP, 11 AUG 2010
    2. SCOPE OF PRESENTATION
       • INTRODUCTION
       • RELIABILITY
       • HARDWARE VS SOFTWARE RELIABILITY
       • SOFTWARE RELIABILITY GROWTH MODELS
       • STATISTICAL TESTING
       • CONCLUSION
    3. SOFTWARE RELIABILITY
       • Reliability is usually defined in terms of a statistical measure for the operation of a software system without a failure occurring.
       • Software reliability is a measure of the probability of a software failure occurring.
       • Two terms related to software reliability:
         - Fault: a defect in the software, e.g. a bug in the code, which may cause a failure.
         - Failure: a deviation of the program's observed behavior from the required behavior.
    4. SOFTWARE RELIABILITY CONTD.
       • Software reliability is an important attribute of software quality, together with:
         - functionality,
         - usability,
         - performance,
         - serviceability,
         - capability,
         - installability,
         - maintainability,
         - documentation.
    5. What is Software Reliability
       • Software reliability is hard to achieve because the complexity of software tends to be high.
       • While the complexity of software is inversely related to software reliability, it is directly related to other important factors in software quality, especially functionality and capability.
    6. SOFTWARE RELIABILITY
       • Cannot be defined objectively:
         - reliability measurements quoted out of context are not meaningful.
       • Requires an operational profile for its definition:
         - the operational profile defines the expected pattern of software usage.
       • Must consider fault consequences:
         - not all faults are equally serious; a system is perceived as more unreliable if it contains more serious faults.
    7. HARDWARE VS SOFTWARE HAZARD RATE (figure)
    8. SOFTWARE FAILURE MECHANISM
       • Failure cause: design defects.
       • Repairable system concept: periodic restarts may fix software problems.
       • Time dependency and life cycle: failures are not a function of operational time.
       • Environmental factors: do not affect software reliability.
       • Reliability prediction: software reliability cannot be predicted from any physical basis, since it depends completely on human factors in design.
    9. FAILURES AND FAULTS
       • A failure corresponds to unexpected run-time behaviour observed by a user of the software.
       • A fault is a static software characteristic which causes a failure to occur.
       • Faults need not necessarily cause failures; they do so only if the faulty part of the software is used.
    10. INPUT/OUTPUT MAPPING (figure)
    11. RELIABILITY PERCEPTION (figure)
    12. FAILURE CLASSIFICATION
       • Transient: failures occur only for certain inputs.
       • Permanent: failures occur for all input values.
       • Recoverable: when failures occur, the system recovers with or without operator intervention.
       • Unrecoverable: the system may have to be restarted.
       • Cosmetic: may cause minor irritations; do not lead to incorrect results.
    13. Measuring Software Reliability
       • Errors do not cause failures at the same frequency and severity:
         - measuring latent errors alone is not enough.
       • The failure rate is observer-dependent.
       • No simple relationship is observed between system reliability and the number of latent software defects.
       • Removing errors from parts of the software which are rarely used makes little difference to the perceived reliability:
         - removing 60% of defects from the least used parts would lead to only about a 3% improvement in product reliability.
       • The reliability improvement from correcting a single error depends on whether the error belongs to the core or the non-core part of the program.
       • The perceived reliability depends to a large extent upon how the product is used, i.e., in technical terms, on its operational profile.
    14. SOFTWARE FAILURE MECHANISM (figure)
    15. AVERAGE FAILURE RATE OF A MS PRODUCT (figure)
    16. REASONS FOR THIS PHENOMENON
       • Users learn with time and avoid failure-causing situations.
       • Users start by exploring more, then limit themselves to some part of the product:
         - most users use only a few product features.
       • Configuration-related failures are much more frequent at the start.
       • These failures reduce with time.
    17. Measuring Software Reliability
       • Don't define what you won't collect.
       • Don't collect what you won't analyse.
       • Don't analyse what you won't use.
    18. MEASURING SOFTWARE RELIABILITY
       • Measuring software reliability remains a difficult problem because we do not have a good understanding of the nature of software.
       • Even the most obvious product metrics, such as software size, do not have a uniform definition.
       • The level of reliability required for a software product should be specified in the SRS document.
    19. SOFTWARE RELIABILITY MODELING
    20. SOFTWARE RELIABILITY MODELS
       • Models have emerged as people try to understand the characteristics of how and why software fails, and try to quantify software reliability.
       • Over 200 models have been developed since the early 1970s, but how to quantify software reliability still remains largely unsolved.
       • No single model completely represents software reliability.
       • Assumption: reliability is a function of the defect level; as defects are removed, reliability improves.
    21. SOFTWARE RELIABILITY MODELS
       • Software modeling techniques can be divided into two subcategories:
         - prediction modeling,
         - estimation modeling.
       • Both kinds of modeling techniques are based on observing and accumulating failure data and analyzing it with statistical inference.
    22. SOFTWARE RELIABILITY MODELS: PREDICTION VS ESTIMATION
       • Data reference:
         - Prediction models: use historical data.
         - Estimation models: use data from the current software development effort.
       • When used in the development cycle:
         - Prediction models: usually made prior to the development or test phases; can be used as early as the concept phase.
         - Estimation models: usually made later in the life cycle (after some data have been collected); not typically used in the concept or development phases.
       • Time frame:
         - Prediction models: predict reliability at some future time.
         - Estimation models: estimate reliability at either the present or some future time.
    23. SOFTWARE RELIABILITY MODELS
       • Two main types of uncertainty render any reliability measurement inaccurate:
         - Type 1 uncertainty: our lack of knowledge about how the system will be used.
         - Type 2 uncertainty: our lack of knowledge about the effect of fault removal.
    24. SOFTWARE RELIABILITY MODELS
       • Most software reliability models contain the following parts:
         - assumptions,
         - factors,
         - a mathematical function that relates reliability to the factors; it is usually a higher-order exponential or logarithmic function.
    25. SOFTWARE RELIABILITY MODELS
       • Jelinski and Moranda model:
         - Recognizes that each time an error is repaired, reliability does not increase by a constant amount.
         - The reliability improvement due to fixing an error is assumed to be proportional to the number of errors present in the system at that time.
    26. SOFTWARE RELIABILITY MODELS
       • Littlewood and Verrall's model:
         - Assumes different faults have different sizes, thereby contributing unequally to failures.
         - Large-sized faults tend to be detected and fixed earlier.
         - As the number of errors is driven down with the progress of testing, so is the average error size, causing a law of diminishing returns in debugging.
    27. EQUAL-STEP RELIABILITY GROWTH (figure)
    28. RANDOM-STEP RELIABILITY GROWTH (figure)
    29. MUSA'S MODEL
       • Assumptions:
         - Faults are independent and distributed with a constant rate of encounter.
         - Instruction types are well mixed, and the execution time between failures is large compared to the instruction execution time.
         - The set of inputs for each run is selected randomly.
         - All failures are observed (implied by the definition).
         - The fault causing a failure is corrected immediately; otherwise, reoccurrence of that failure is not counted.
    30. MUSA'S BASIC MODEL
       • Assumption: the decrement in the failure intensity function is constant.
       • Result: failure intensity is a function of the average number of failures experienced at any given point in time (the failure probability):

           λ(μ) = λ₀ (1 - μ/ν₀)

         - λ(μ): failure intensity.
         - λ₀: initial failure intensity at the start of execution.
         - μ: average total number of failures at a given point in time.
         - ν₀: total number of failures over infinite time.
    31. EXAMPLE 1
       • Assume that we are at some point in time, t time units into the life cycle of a software system after it has been deployed.
       • Assume the program will experience 100 failures over infinite execution time. During the last interval of t time units, 50 failures have been observed (and counted). The initial failure intensity was 10 failures per CPU hour.
       • Compute the current (at t) failure intensity:

           λ = λ₀ (1 - μ/ν₀) = 10 × (1 - 50/100) = 5 failures per CPU hour
    32. MUSA/OKUMOTO LOGARITHMIC MODEL
       • The decrement per encountered failure decreases:

           λ(μ) = λ₀ e^(-θμ)

         - θ: failure intensity decay parameter.
       • Example 2:
         - λ₀ = 10 failures per CPU hour.
         - θ = 0.02 per failure.
         - 50 failures have been experienced (μ = 50).
         - Current failure intensity:

             λ = 10 × e^(-0.02 × 50) = 10 e^(-1) ≈ 3.68 failures per CPU hour
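    A minimal Python sketch (not part of the original slides) of the two failure-intensity functions above; running it reproduces the numbers from Examples 1 and 2. The function names are illustrative.

        import math

        def basic_intensity(lam0, mu, nu0):
            """Musa basic model: intensity after mu failures have been observed."""
            return lam0 * (1 - mu / nu0)

        def log_intensity(lam0, mu, theta):
            """Musa/Okumoto logarithmic model: intensity after mu failures."""
            return lam0 * math.exp(-theta * mu)

        print(basic_intensity(10, 50, 100))   # Example 1: 5.0 failures/CPU hour
        print(log_intensity(10, 50, 0.02))    # Example 2: ~3.68 failures/CPU hour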
    33. Model Extension (1)
       • Average total number of counted (experienced) failures μ as a function of the elapsed execution time τ.
       • For the basic model:

           μ(τ) = ν₀ (1 - e^(-λ₀τ/ν₀))

       • For the logarithmic model:

           μ(τ) = (1/θ) ln(λ₀θτ + 1)
    34. Example 3 (Basic Model)
       • λ₀ = 10 failures/CPU hour.
       • ν₀ = 100 (number of failures over infinite execution time).
       • τ = 10 CPU hours:  μ = 100 (1 - e^(-1)) ≈ 63 failures.
       • τ = 100 CPU hours: μ = 100 (1 - e^(-10)) ≈ 100 failures.
    35. Example 4 (Logarithmic Model)
       • λ₀ = 10 failures/CPU hour.
       • θ = 0.02 per failure.
       • τ = 10 CPU hours:  μ = (1/0.02) ln(3) ≈ 55 failures (63 in the basic model).
       • τ = 100 CPU hours: μ = (1/0.02) ln(21) ≈ 152 failures (100 in the basic model).
    36. Model Extension (2)
       • Failure intensity as a function of execution time.
       • For the basic model:

           λ(τ) = λ₀ e^(-λ₀τ/ν₀)

       • For the logarithmic model:

           λ(τ) = λ₀ / (λ₀θτ + 1)
    37. Example 5 (Basic Model)
       • λ₀ = 10 failures/CPU hour.
       • ν₀ = 100 (number of failures over infinite execution time).
       • τ = 10 CPU hours:  λ = 10 e^(-1) ≈ 3.68 failures/CPU hour.
       • τ = 100 CPU hours: λ = 10 e^(-10) ≈ 0.000454 failures/CPU hour.
    38. Example 6 (Logarithmic Model)
       • λ₀ = 10 failures/CPU hour, θ = 0.02 per failure.
       • τ = 10 CPU hours:  λ = 10 / (10 × 0.02 × 10 + 1) = 10/3 ≈ 3.33 failures/CPU hour (3.68 in the basic model).
       • τ = 100 CPU hours: λ = 10/21 ≈ 0.476 failures/CPU hour (0.000454 in the basic model).
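    The four time-domain functions from Model Extensions (1) and (2) in the same sketch style (again, not from the slides); running it reproduces Examples 3 to 6.

        import math

        def basic_mu(lam0, nu0, tau):
            """Basic model: expected failures observed after tau CPU hours."""
            return nu0 * (1 - math.exp(-lam0 * tau / nu0))

        def log_mu(lam0, theta, tau):
            """Logarithmic model: expected failures observed after tau CPU hours."""
            return math.log(lam0 * theta * tau + 1) / theta

        def basic_lambda(lam0, nu0, tau):
            """Basic model: failure intensity after tau CPU hours."""
            return lam0 * math.exp(-lam0 * tau / nu0)

        def log_lambda(lam0, theta, tau):
            """Logarithmic model: failure intensity after tau CPU hours."""
            return lam0 / (lam0 * theta * tau + 1)

        for tau in (10, 100):
            print(tau,
                  round(basic_mu(10, 100, tau)),         # 63, 100 (Example 3)
                  round(log_mu(10, 0.02, tau)),          # 55, 152 (Example 4)
                  round(basic_lambda(10, 100, tau), 6),  # 3.678794, 0.000454 (Example 5)
                  round(log_lambda(10, 0.02, tau), 3))   # 3.333, 0.476 (Example 6)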
    39. MODEL DISCUSSION
       • Comparison of the basic and logarithmic models:
         - The basic model assumes the failure intensity reaches 0; the logarithmic model assumes convergence towards 0 failure intensity.
         - The basic model assumes a finite number of failures in the system; the logarithmic model assumes an infinite number.
       • Parameter estimation is the major problem: λ₀, θ, and ν₀ are usually obtained from:
         - system test,
         - observation of the operational system,
         - comparison with values from similar projects.
    40. APPLICABILITY OF SOFTWARE RELIABILITY MODELS
       • There is no universally applicable reliability growth model.
       • Reliability growth is not independent of the application.
       • Fit observed data to several growth models and take the one that best fits the data (a fitting sketch follows below).
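    One way to fit and pick (a sketch under assumptions, not from the slides): fit the basic and logarithmic μ(τ) curves to observed cumulative failure data with scipy and keep whichever leaves the smaller residual error. The failure data below is invented for illustration.

        import numpy as np
        from scipy.optimize import curve_fit

        def basic_mu(tau, lam0, nu0):
            # Basic model: mu(tau) = nu0 * (1 - exp(-lam0 * tau / nu0))
            return nu0 * (1 - np.exp(-lam0 * tau / nu0))

        def log_mu(tau, lam0, theta):
            # Logarithmic model: mu(tau) = ln(lam0 * theta * tau + 1) / theta
            return np.log(lam0 * theta * tau + 1) / theta

        tau = np.array([5.0, 10, 20, 40, 80, 160])    # CPU hours of test
        mu = np.array([35.0, 55, 75, 95, 115, 135])   # cumulative failures seen

        best = None
        for name, model, p0 in [("basic", basic_mu, (10, 150)),
                                ("logarithmic", log_mu, (10, 0.02))]:
            popt, _ = curve_fit(model, tau, mu, p0=p0)
            sse = float(np.sum((model(tau, *popt) - mu) ** 2))  # residual error
            if best is None or sse < best[1]:
                best = (name, sse, popt)

        print("best-fitting model:", best[0], "parameters:", best[2])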
    41. STATISTICAL TESTING
    42. STATISTICAL TESTING
       • Testing for reliability rather than fault detection.
       • Test data selection should follow the predicted usage profile for the software.
       • Measuring the number of errors allows the reliability of the software to be predicted.
       • An acceptable level of reliability should be specified, and the software tested and amended until that level of reliability is reached.
    43. STATISTICAL TESTING
       • Different users have different operational profiles:
         - i.e. they use the system in different ways;
         - formally, an operational profile is a probability distribution over the inputs.
       • Divide the input data into a number of input classes:
         - e.g. create, edit, print, file operations, etc.
       • Assign a probability value to each input class:
         - the probability that an input value from that class is selected.
    44. STATISTICAL TESTING PROCEDURE
       • Determine the operational profile of the software.
       • Generate a set of test data corresponding to this profile.
       • Apply the tests, measuring the amount of execution time between each failure.
       • After a statistically valid number of tests have been executed, reliability can be measured (see the sketch below).
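    A minimal sketch of this procedure (assumed, not from the slides): inputs are drawn according to a hypothetical operational profile, failures are counted, and the failure intensity and mean time to failure are estimated. The profile, the run_test stub, and its failure probabilities are all invented for illustration.

        import random

        operational_profile = {        # input class -> selection probability
            "create": 0.15, "edit": 0.50, "print": 0.05, "file_ops": 0.30,
        }

        def run_test(input_class):
            """Stand-in for one system run on an input from the given class.
            Returns (CPU seconds used, whether the run failed)."""
            fail_prob = {"create": 0.002, "edit": 0.001,
                         "print": 0.01, "file_ops": 0.003}
            return 1.0, random.random() < fail_prob[input_class]

        classes = list(operational_profile)
        weights = [operational_profile[c] for c in classes]

        cpu_time, failures, since_last, gaps = 0.0, 0, 0.0, []
        for _ in range(100_000):                       # statistically valid sample
            cls = random.choices(classes, weights)[0]  # sample per the profile
            t, failed = run_test(cls)
            cpu_time += t
            since_last += t
            if failed:
                failures += 1
                gaps.append(since_last)                # execution time between failures
                since_last = 0.0

        print("failure intensity:", failures / cpu_time, "failures/CPU second")
        print("mean time to failure:", sum(gaps) / len(gaps), "CPU seconds")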
    45. STATISTICAL TESTING DIFFICULTIES
       • Uncertainty in the operational profile:
         - a particular problem for new systems with no operational history.
       • High cost of generating the operational profile:
         - costs depend on what usage information is collected by the organisation which requires the profile.
       • Statistical uncertainty when high reliability is specified:
         - the usage pattern of the software may change with time.
    46. AN OPERATIONAL PROFILE (figure)
    47. RELIABILITY PREDICTION (figure)
    48. CASE STUDY
    49. BANK AUTO-TELLER SYSTEM
       • Each machine in a network is used 300 times a day.
       • The bank has 1000 machines.
       • The lifetime of a software release is 2 years.
       • Each machine handles about 200,000 transactions over that lifetime.
       • About 300,000 database transactions in total per day.
    50. EXAMPLES OF A RELIABILITY SPEC. (figure)
    51. SPECIFICATION VALIDATION
       • It is impossible to empirically validate very high reliability specifications.
       • "No database corruptions" means a POFOD of less than 1 in 200 million.
       • If a transaction takes 1 second, then simulating one day's transactions takes 3.5 days.
       • It would take longer than the system's lifetime to test the system for reliability.
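    The arithmetic behind these figures, reconstructed from the case-study numbers (not stated on the slide):

        300,000 transactions/day × 730 days ≈ 2.2 × 10⁸ transactions per release lifetime
            ⇒ "no database corruptions" requires POFOD < 1 in roughly 200 million
        simulating one day's traffic: 300,000 transactions × 1 s = 300,000 s ≈ 83 h ≈ 3.5 days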
    52. COSTS OF INCREASING RELIABILITY (figure)
    53. CONCLUSIONS
       • Software reliability is a key part of software quality.
       • Software reliability improvement is hard.
       • There are no generic models.
       • Measurement is very important for finding the correct model.
       • Statistical testing should be used, but again, it is not easy.
       • Software reliability modelling is not as simple as it looks.
    54. REFERENCES
       • Musa, J. D., Iannino, A., and Okumoto, K., "Software Reliability: Measurement, Prediction, Application", McGraw-Hill Book Company, NY, 1987.
       • Aho, A. V., Sethi, R., and Ullman, J., "Compilers: Principles, Techniques, and Tools", Addison-Wesley, Reading, MA, 1986.
       • Dodson, B. and Nolan, D., "Reliability Engineering Handbook".
       • Thangarajan, M., "Software Reliability Prediction Model", White Paper.
       • Lyu, M. R., "Software Reliability Engineering: A Roadmap".
    55. THANK YOU!
