3. Reliability
• Reliability is a broad concept.
• Reliability is one of the metrics that are used to measure
quality.
• It is a user-oriented quality factor relating to system operation.
• Intuitively, if the users of a system rarely experience failure, the system
is considered to be more reliable than one that fails more often.
• A system without faults is considered to be highly reliable.
5. Key Concepts
• Failure
A failure occurs when the observable outcome of a program execution differs
from the expected outcome.
• Fault
A fault is the cause of a failure.
e.g.
Failure: You failed an exam.
Fault: You didn't study, no one helped you, or the cause is undefined.
6. Key Concepts (continued)
• Time: If the time gap between two successive failures is short, we say
that the system is less reliable.
• Two time models are used:
• Execution time (𝜏):
The time the processor actually spends executing the software.
• Calendar time (t):
The overall elapsed time during operation, as experienced by the user.
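The difference between the two time models can be sketched in Python. This is a minimal illustration, not part of the original slides: the helper names `measure` and `sample_workload` are hypothetical, `time.process_time()` stands in for execution time and `time.perf_counter()` for calendar time.

```python
import time


def measure(workload):
    """Return (execution_time, calendar_time) in seconds for one call of workload."""
    cpu_start = time.process_time()    # CPU time used by this process so far
    wall_start = time.perf_counter()   # elapsed (wall-clock) time reference
    workload()
    calendar = time.perf_counter() - wall_start
    execution = time.process_time() - cpu_start
    return execution, calendar


def sample_workload():
    """Hypothetical workload: a little CPU work followed by an idle wait."""
    sum(i * i for i in range(100_000))  # consumes CPU, so it adds execution time
    time.sleep(0.1)                     # idle wait: adds calendar time only


execution, calendar = measure(sample_workload)
print(f"execution time: {execution:.4f} s, calendar time: {calendar:.4f} s")
```

Because the sleep uses almost no CPU, the calendar time comes out larger than the execution time.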
11. Reliability metrics
IV] POFOD : Probability of Failure on Demand
• POFOD is the likelihood that the system will fail when a
service request is made. A POFOD of 0.001 means that
one out of a thousand service requests may result in
failure.
• POFOD is an important measure for safety-critical systems
and should be kept as low as possible. It is relevant for
many safety-critical systems with exception-management
components, such as an emergency shutdown system in a
chemical plant.
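The POFOD arithmetic can be sketched as follows. This is a minimal illustration that assumes POFOD is estimated as observed failed requests divided by total requests; the function name `pofod` is hypothetical.

```python
def pofod(failed_requests, total_requests):
    """Estimate the probability of failure on demand from observed counts."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= failed_requests <= total_requests:
        raise ValueError("failed_requests must be between 0 and total_requests")
    return failed_requests / total_requests


# One failed request out of a thousand, as in the slide example.
print(pofod(1, 1000))  # 0.001
```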
12. Reliability metrics
V] ROCOF : Rate Of Occurrence Of Failures
• ROCOF is the frequency with which unexpected behavior is likely
to occur.
• A ROCOF of 2 / 100 means that two failures are likely to occur in each
100 operational time units. This metric is sometimes called the failure
intensity.
• It is relevant for operating systems and transaction-processing
systems where the system has to process a large number of similar
requests that are relatively frequent; for example, credit-card
processing systems, airline booking systems, etc.
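The ROCOF example above can be worked through in code. This is a minimal sketch with hypothetical function names; `expected_failures` additionally assumes the failure rate stays constant over the interval, which the slides do not state.

```python
def rocof(failure_count, operational_time_units):
    """Rate of occurrence of failures: failures per operational time unit."""
    if operational_time_units <= 0:
        raise ValueError("operational_time_units must be positive")
    return failure_count / operational_time_units


def expected_failures(rate, duration):
    """Expected number of failures over a duration, assuming a constant rate."""
    return rate * duration


rate = rocof(2, 100)  # two failures per 100 operational time units
print(rate)           # 0.02
print(expected_failures(rate, 500))
```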
13. Reliability metrics
VI] AVAIL : Availability
• Availability is the probability that the system is
available for use at a given time.
• An availability of 0.998 means that in every 1000 time units, the
system is likely to be available for 998 of these.
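The availability figure can be computed in two ways. This is a minimal sketch with hypothetical function names. The first function follows the slide example directly. The second uses the commonly cited steady-state relation MTTF / (MTTF + MTTR), which is an assumption added here and not stated on the slide.

```python
def availability(uptime, downtime):
    """Fraction of time the system is available for use."""
    total = uptime + downtime
    if total <= 0:
        raise ValueError("total observed time must be positive")
    return uptime / total


def availability_from_mttf_mttr(mttf, mttr):
    """Steady-state availability from mean time to failure and mean time to repair."""
    return mttf / (mttf + mttr)


# Available for 998 out of every 1000 time units, as in the slide example.
print(availability(998, 2))  # 0.998
```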
15. Software Reliability
• First definition:
Software reliability is defined as the probability of failure-free operation
of a software system for a specified time in a specified environment.
Key elements of the above definition:
• Probability of failure-free operation
• Length of time of failure-free operation
• A given execution environment
Example:
The probability that a PC in a store is up and running for
eight hours without a crash is 0.99.
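The PC example can be linked to a failure rate under one extra assumption. The sketch below assumes a constant failure rate, so that reliability over time t is exp(-λt). This exponential model is an assumption added here for illustration; the slides do not name a model, and the function names are hypothetical.

```python
import math


def failure_rate_from_reliability(reliability, duration):
    """Solve exp(-rate * duration) = reliability for rate (constant-rate assumption)."""
    if not 0 < reliability <= 1 or duration <= 0:
        raise ValueError("need 0 < reliability <= 1 and duration > 0")
    return -math.log(reliability) / duration


def reliability_over(failure_rate, duration):
    """Probability of failure-free operation for the given duration."""
    return math.exp(-failure_rate * duration)


rate = failure_rate_from_reliability(0.99, 8)   # failures per hour
print(f"failure rate: {rate:.6f} per hour")     # roughly 0.00126 per hour
print(f"reliability over 24 h: {reliability_over(rate, 24):.4f}")
```

Under the same assumption, a longer period gives a lower probability of failure-free operation, which is why the definition always specifies a time.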
16. SR-Defn
• Second definition:
Failure intensity is a measure of the reliability of a software system
operating in a given environment.
Example: An air traffic control system fails once in two years.
17. Comparing Definitions
• First definition:
- Based on MTTF.
- Measures the time from failure-free software to the first failure.
- Describes how long the software can remain free of any failure.
• Second definition:
- Based on the frequency of failures (ROCOF).
- Counts the failures in a time interval 't'.
- Describes how vulnerable or stable the software is in a time interval.
19. Factors
• The reliability of software depends on two categories of
information:
1) The number of faults present in the software
2) The way users operate the system (the operational profile)
20. SR-Influencing Factors
Fault count is influenced by the following:
• Size and complexity of code
• Characteristics of the development process used
• Education, experience and training of development personnel
• Operational environment
21. SR-Influencing Factors
The software operational environment is influenced by the following:
• Change in environment
• Change in infrastructure or technology
• Major changes in requirements
• Lack of maintenance, or software that is difficult to maintain
23. Methodologies
• Critical systems (spacecraft, aircraft, nuclear power plants, etc.)
require a high level of dependability in their operation.
• Dependability methodologies:
1) Fault avoidance
2) Fault tolerance
3) Fault removal
4) Fault forecasting
24. SR-Methodologies
1] Fault Avoidance:
• Prevent the introduction of faults during the development of the
software.
How?
• Use standards and guidelines
- How to implement the code
- When and where to use functions, pointers, etc.
• Use formal methods
- State management to verify that the system works as intended
• Use methods against software aging
- To prevent memory leaks that can lead to system crashes
25. SR-Methodologies
2] Fault Tolerance:
• Fault tolerance refers to the ability of a system (computer, network, cloud cluster, etc.)
to continue operating without interruption when one or more of its components fail.
• The objective of creating a fault-tolerant system is to prevent disruptions arising from
a single point of failure, ensuring high reliability.
• One approach is to back up software systems with other software instances. For example, a
database with customer information can be continuously replicated to another
machine. If the primary database goes down, operations can be automatically
redirected to the second database.
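The replicated-database example can be sketched in code. This is a toy illustration only: `Replica` and `FailoverStore` are hypothetical in-memory classes, not a real database API, and replication is simplified to writing every record to each replica that is up.

```python
class Replica:
    """Hypothetical in-memory stand-in for one database instance."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.up = True

    def write(self, key, value):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        self.data[key] = value

    def read(self, key):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return self.data[key]


class FailoverStore:
    """Replicates writes and redirects reads to the secondary if the primary fails."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def write(self, key, value):
        live = [r for r in (self.primary, self.secondary) if r.up]
        if not live:
            raise ConnectionError("no replica available")
        for replica in live:  # simplified continuous replication
            replica.write(key, value)

    def read(self, key):
        try:
            return self.primary.read(key)
        except ConnectionError:
            return self.secondary.read(key)  # automatic redirection


store = FailoverStore(Replica("primary"), Replica("secondary"))
store.write("customer-1", "Alice")
store.primary.up = False         # simulate the primary going down
print(store.read("customer-1"))  # Alice
```

A real system would also need to resynchronise the primary after it recovers, which this sketch leaves out.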
26. SR-Methodologies
3] Fault Removal:
• Aims at detecting and fixing faults once the code has been developed.
How?
• Testing techniques
• Verification methods
• Analysis (dynamic, semantic, etc.)
27. SR-Methodologies
4] Fault Forecasting:
• Estimates the presence of faults and the occurrence and consequences
of failures.
• The main aim of fault forecasting is to predict the reliability of the software.
• It is mainly concerned with reliability models.
28. SUMMARY
• Definition: Software reliability is defined as the probability of failure-free operation
of a software system for a specified time in a specified environment.
• Factors: The factors influencing SR are fault count and operational profile.
• Dependability: fault avoidance, fault tolerance, fault removal and fault forecasting.
• Performance Metrics:
- MTTF: Mean Time To Failure
- MTTR: Mean Time To Repair
- MTBF: Mean Time Between Failures
- POFOD: Probability Of Failure On Demand
- ROCOF: Rate Of Occurrence Of Failures
- AVAIL: Availability Of Service