Functional requirements to define error
checking and recovery facilities and
protection against system failures.
Non-functional requirements defining the
required reliability and availability of the
system.
Excluding requirements that define states
and conditions that must not arise.
Examples of ‘shall not’ requirements
System shall not allow users to modify
access permissions on any files that they
have not created. (security)
System shall not allow reverse thrust
mode to be selected when the aircraft is
in flight. (safety)
System shall not allow the simultaneous
activation of more than three alarm
signals. (usability)
I. Risk-driven Specification
Critical systems specification should be
risk-driven.
Approach has been widely used in safety
and security-critical systems.
Aim of the specification process should be
to understand the risks (safety, security,
etc.) faced by the system & to define
requirements that reduce these risks.
Stages of Risk-based Analysis
• Risk identification
Identify potential risks that may arise.
• Risk analysis and classification
Assess the seriousness of each risk.
• Risk decomposition
Decompose risks to discover their potential
root causes.
• Risk reduction assessment
Define how each risk must be eliminated or
reduced when the system is designed.
1. Risk Identification
Identify the risks faced by the critical
system.
In safety-critical systems, the risks are
the hazards that can lead to accidents.
In security-critical systems, the risks are
the potential attacks on the system.
Identify and classify risks
• Service failure
• Electrical risks
Insulin Pump Risks
Insulin overdose (service failure).
Insulin underdose (service failure).
Power failure due to exhausted battery
(electrical).
Electrical interference with other medical
equipment such as a pacemaker (electrical).
Poor sensor and actuator contact because
of incorrect fitting (physical).
Parts of machine break off in body
Infection caused by introduction of
Allergic reaction to materials or insulin
2. Risk Analysis and Classification
The process is concerned with understanding
the likelihood that a risk will arise and the
consequences if an accident or incident
occurs.
Need to make this analysis to understand
whether a risk is a serious threat to the
system or environment and to provide a
basis for deciding the resources that
should be used to manage the risk.
For each risk, the outcome of the risk
analysis and classification process is a
statement of acceptability.
Risk Analysis and Classification
Risks may be categorised as:
• Intolerable
• As low as reasonably practical (ALARP)
• Acceptable
Intolerable Risks
The system must be designed in such a
way that either the risk cannot arise
or, if it does arise, it will not result in an
accident.
Intolerable risks threaten human life or the
financial stability of a business and have a
high probability of occurrence.
An example of an intolerable risk for an
e-commerce system in an Internet
bookstore would be the risk of the system
going down for more than a day.
As Low as Reasonably Practical
ALARP risks are those which have less
serious consequences or which have a
low probability of occurrence.
An ALARP risk for an e-commerce system
might be corruption of the web page
images that present the brand of the
company.
This is commercially undesirable but is unlikely
to have serious short-term consequences.
Acceptable Risks
While the system designers should take
all possible steps to reduce the
probability of an ‘acceptable’ hazard
arising, these should not increase costs,
delivery time or other non-functional
system attributes.
An example of an acceptable risk for an
e-commerce system is the risk that people
using beta-release web browsers could
not successfully complete orders.
Levels of Risk
Social Acceptability of Risk
Acceptability of a risk is determined by
human, social and political
considerations.
In most societies, the boundaries
between the regions are pushed upwards
with time, i.e. society is less willing to
accept risk.
• For example, the costs of cleaning up
pollution may be less than the costs
of preventing it but this may not be
socially acceptable.
Risk assessment is subjective
• Risks are identified as probable,
unlikely, etc.
• This depends on who is making the
assessment.
Estimate the risk probability and the risk
severity.
Not normally possible to do this precisely
so relative values are used such as
‘unlikely’, ‘rare’, ‘very high’, etc.
Aim must be to exclude risks that are
likely to arise or that have high severity.
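The sorting of risks into intolerable, ALARP and acceptable regions can be sketched as a lookup on relative probability and severity. This is only an illustration: the three-point scale, the product score and both thresholds are assumptions made for the example, not values given in the slides.

```python
from enum import IntEnum


class Level(IntEnum):
    """Relative (not precise) estimate, as the slides recommend."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def classify_risk(probability: Level, severity: Level) -> str:
    """Return the acceptability region for one risk.

    The score and the cut-off values are hypothetical; a real project
    would agree them with domain experts.
    """
    score = probability * severity
    if score >= 6:
        return "intolerable"
    if score >= 3:
        return "ALARP"
    return "acceptable"


if __name__ == "__main__":
    print(classify_risk(Level.MEDIUM, Level.HIGH))
    print(classify_risk(Level.LOW, Level.LOW))
```

Because the assessment is subjective, the useful part of such a table is that it makes the chosen boundaries explicit and reviewable.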
Risk Assessment – Insulin Pump
3. Risk Decomposition
Concerned with discovering the root
causes of risks in a particular system.
Techniques have been mostly derived
from safety-critical systems and can be
inductive or deductive:
• Inductive, bottom-up techniques.
• Start with a proposed system failure
and assess the hazards that could
arise from that failure;
• Deductive, top-down techniques.
• Start with a hazard and deduce what
the causes of this could be.
Deductive top-down technique.
Put the risk or hazard at the root of the
tree & identify system states that could
lead to that hazard.
Where appropriate, link these with ‘and’
or ‘or’ conditions.
A goal should be to minimise the number
of single causes of system failure.
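A fault tree with ‘and’ and ‘or’ conditions can be represented as a small recursive structure, and the single causes of the root hazard can then be listed automatically. The sketch below is an assumption-laden illustration: the `Node` class, the helper names and the example events are hypothetical and are not the insulin pump fault tree from the slides.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Node:
    name: str
    gate: str = "basic"  # "and", "or", or "basic" (a leaf cause)
    children: list = field(default_factory=list)


def cut_sets(node):
    """Return sets of basic events that together lead to `node`."""
    if node.gate == "basic":
        return [frozenset([node.name])]
    child_sets = [cut_sets(child) for child in node.children]
    if node.gate == "or":
        # Any one child is enough, so pool the children's cut sets.
        return [s for sets in child_sets for s in sets]
    # "and": one cut set from every child must hold at the same time.
    return [frozenset().union(*combo) for combo in product(*child_sets)]


def single_causes(root):
    """Basic events that can cause the root hazard on their own."""
    return sorted({next(iter(s)) for s in cut_sets(root) if len(s) == 1})


# Hypothetical example tree, for illustration only.
tree = Node("incorrect dose delivered", "or", [
    Node("sensor failure"),
    Node("bad computation goes unchecked", "and", [
        Node("arithmetic error"),
        Node("dose check disabled"),
    ]),
])

if __name__ == "__main__":
    print(single_causes(tree))
```

In this example only the sensor failure is a single cause; the arithmetic error needs a second condition before it leads to the hazard, which is the kind of structure the goal above favours.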
Insulin Pump Fault Tree
4. Risk Reduction Assessment
Identify dependability requirements that
specify how the risks should be managed
and ensure that accidents/incidents do
not arise.
Risk reduction strategies
1. Risk avoidance
• Risk or hazard cannot arise
2. Risk detection and removal
• Risks are detected & neutralised before they
result in an accident.
3. Damage limitation
• Consequences of an accident are minimised.
Normally, in critical systems, a mix of risk
reduction strategies is used.
In a chemical plant control system, the
system will include sensors to detect and
correct excess pressure in the reactor.
It will also include an independent
protection system that opens a relief
valve if dangerously high pressure is
detected.
Insulin Pump – Software Risks
• Computation causes the value of a
variable to overflow or underflow
• May include an exception handler for
each type of arithmetic error
• Compare dose to be delivered with
previous dose or safe maximum
• Reduce dose if too high
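The two software measures listed above (an exception handler for arithmetic errors, and a check of the computed dose against the previous dose and a safe maximum) might look like the following sketch. The limits, the function names and the choice to fall back to a zero dose are all assumptions for illustration, not the pump's actual design.

```python
SAFE_MAX_DOSE = 5   # hypothetical single-dose ceiling, in insulin units
MAX_INCREASE = 2    # hypothetical allowed rise over the previous dose


def safe_compute(compute, fallback=0):
    """Run a dose computation, trapping arithmetic errors.

    ArithmeticError covers ZeroDivisionError, OverflowError and
    FloatingPointError in Python. The fallback value is an assumption.
    """
    try:
        return compute()
    except ArithmeticError:
        return fallback


def checked_dose(computed, previous):
    """Reduce the dose if it exceeds either safety limit."""
    limit = min(SAFE_MAX_DOSE, previous + MAX_INCREASE)
    if computed < 0:
        return 0
    return min(computed, limit)


if __name__ == "__main__":
    raw = safe_compute(lambda: 18 // 2)
    print(checked_dose(raw, previous=2))
```

Keeping the check separate from the computation means the dose limit still applies when the computation itself is wrong.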
Safety Requirements – Insulin Pump
II. Safety Specification
Safety requirements of a system should
be separately specified.
Requirements should be based on an
analysis of the possible hazards and risks.
Safety requirements usually apply to the
system as a whole rather than to
individual sub-systems.
International standard for safety
management that was specifically
designed for protection systems - it is not
applicable to all safety-critical systems.
Incorporates a model of the safety life
cycle and covers all aspects of safety
management from scope definition to
Control System Safety Requirements
The Safety Life-Cycle
Functional safety requirements
• Define the safety functions of the protection
system, i.e. they define how the system should
provide protection.
Safety integrity requirements
• Define the reliability and availability of the
protection system. They are based on
expected usage and are classified using a
safety integrity level from 1 to 4.
III. Security Specification
Has some similarities to safety
specification:
• Not possible to specify security
requirements quantitatively.
Requirements are often ‘shall not’ rather
than ‘shall’ requirements.
• No well-defined notion of a security life
cycle for security management; No
• Generic threats rather than system-specific
hazards.
• Mature security technology (encryption,
etc.).
The conventional (non-computerised)
approach to security analysis is based
around the assets to be protected and
their value to an organisation.
A bank will provide high security in an
area where large amounts of money are
stored compared to other public areas
where the potential losses are limited.
The same approach can be used for
specifying security for computer-based
systems.
A possible security specification process
is shown on the next slide.
The Security Specification Process
Stages in Security Specification
Asset identification and evaluation
• Assets (data and programs) & their required
degree of protection are identified.
Password file (say) is more valuable than a
set of public web pages because of its asset
Threat analysis and risk assessment
• Possible security threats are identified and
the risks associated with each of these
threats are estimated.
• Identified threats are related to the assets so
that, for each identified asset, there is a list
of associated threats.
Technology analysis
• Available security technologies and their
applicability against the identified threats
are assessed.
Security requirements specification
• The security requirements are specified.
Where appropriate, these will explicitly
identify the security technologies that may
be used to protect against different threats
to the system.
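The threat analysis stage asks that, for each identified asset, there is a list of associated threats with an estimated risk. One simple way to hold that relation is sketched below. The `ThreatRegister` class, the risk labels and the example threats are hypothetical illustrations, not part of the process described in the slides.

```python
from collections import defaultdict

# Relative risk estimates, highest first (assumed scale).
RISK_RANK = {"high": 0, "medium": 1, "low": 2}


class ThreatRegister:
    """Keeps, for each asset, the threats identified against it."""

    def __init__(self):
        self._threats = defaultdict(list)

    def relate(self, asset, threat, risk):
        if risk not in RISK_RANK:
            raise ValueError(f"unknown risk level: {risk!r}")
        self._threats[asset].append((threat, risk))

    def threats_for(self, asset):
        """Threats for one asset, most serious first."""
        return sorted(self._threats.get(asset, []),
                      key=lambda item: RISK_RANK[item[1]])


if __name__ == "__main__":
    register = ThreatRegister()
    register.relate("password file", "disclosure via backup copy", "medium")
    register.relate("password file", "offline password guessing", "high")
    for threat, risk in register.threats_for("password file"):
        print(risk, threat)
```

The sorted list per asset is then a natural input to the technology analysis and requirements stages.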
Security specification and security
management are essential for all critical
systems.
If a system is insecure, it is subject to
infection with viruses and worms,
corruption and unauthorised modification
of data, and denial of service attacks.
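Types of Security Requirement
• Identification requirements.
• Authentication requirements.
• Authorisation requirements.
• Immunity requirements.
• Integrity requirements.
• Intrusion detection requirements.
• Non-repudiation requirements.
• Privacy requirements.
• Security auditing requirements.
• System maintenance security
requirements.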
1. Identification requirements specify
whether a system should identify its
users before interacting with them.
2. Authentication requirements specify how
users are identified.
3. Authorisation requirements specify the
privileges and access permissions of
identified users.
4. Immunity requirements specify how a
system should protect itself against
viruses, worms, and similar threats.
5. Integrity requirements specify how data
corruption can be avoided.
6. Intrusion detection requirements specify
what mechanisms should be used to
detect attacks on the system.
7. Non-repudiation requirements specify that
a party in a transaction cannot deny its
involvement in that transaction.
8. Privacy requirements specify how data
privacy is to be maintained.
9. Security auditing requirements specify
how system use can be audited and
checked.
10. System maintenance security
requirements specify how an application
can prevent authorised changes from
accidentally defeating its security
mechanisms.
Not every system needs all of these
security requirements. Requirements
depend on the type of system, the
situation of use and the expected users.
The next slide shows security requirements
that might be included in the LIBSYS
system.
LIBSYS Security Requirements
System Reliability Specification
• What is the probability of a hardware
component failing & how long does it take to
repair that component?
• How likely is it that a software component
will produce an incorrect output? Software
failures are different from hardware failures
in that software does not wear out. It can
continue in operation even after an incorrect
result has been produced.
• How likely is it that the operator of a system
will make an error?
A predefined range for all values that are
input by the operator shall be defined &
the system shall check that all operator
inputs fall within this predefined range.
The system shall check all disks for bad
blocks when it is initialised.
The system must use N-version
programming to implement the braking
control system.
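The first functional reliability requirement above (a predefined range for every operator input, checked by the system) can be sketched as follows. The input names and ranges are hypothetical placeholders chosen for the example.

```python
# Hypothetical predefined ranges for operator inputs: name -> (low, high).
INPUT_RANGES = {
    "reactor_temperature_c": (0.0, 350.0),
    "valve_opening_percent": (0.0, 100.0),
}


def check_operator_input(name, value):
    """Return the value if it lies in its predefined range, else raise."""
    if name not in INPUT_RANGES:
        raise KeyError(f"no range defined for input {name!r}")
    low, high = INPUT_RANGES[name]
    if not (low <= value <= high):
        raise ValueError(
            f"{name}={value} is outside the range {low}..{high}")
    return value


if __name__ == "__main__":
    print(check_operator_input("valve_opening_percent", 42.0))
```

Rejecting an input that has no defined range, rather than accepting it, keeps the check in line with the requirement that a range be defined for all operator inputs.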
The required level of system reliability
should be expressed quantitatively.
Reliability is a dynamic system attribute;
reliability specifications related to the
source code are meaningless.
• No more than N faults/1000 lines;
• This is only useful for a post-delivery
process analysis where you are trying to
assess how good your development
techniques are.
An appropriate reliability metric should
be chosen to specify the overall system
reliability.
Reliability metrics are units of measurement
of system reliability.
System reliability is measured by
counting the number of operational
failures and, where appropriate, relating
these to the demands made on the system
and the time that the system has been
operational.
A long-term measurement programme is
required to assess the reliability of
critical systems.
Probability of Failure on Demand (POFOD)
Probability that the system will fail when
a service request is made.
The metric is most appropriate for systems
where services are demanded at
unpredictable or relatively long time
intervals and where there are serious
consequences if the service is not
delivered.
Relevant for many safety-critical systems.
Reliability of a pressure relief system in a
chemical plant or an emergency shutdown
system in a power plant.
Rate of Fault Occurrence (ROCOF)
The metric should be used where regular
demands are made on system services and
it is important that these services are
correctly delivered.
Might be used in the specification of a
bank teller system that processes
customer transactions or in an airline
reservation system.
Reflects the rate of occurrence of failure
in the system.
ROCOF of 0.002 means 2 failures are
likely in each 1000 operational time units,
e.g. 2 failures per 1000 hours of
operation.
Mean Time To Failure (MTTF)
Measure of the time between observed
failures of the system.
MTTF of 500 means that the mean time
between failures is 500 time units.
Relevant for systems with long
transactions, i.e. where system processing
takes a long time.
MTTF should be longer than the transaction
length.
• Computer-aided design systems where a
designer will work on a design for several
hours, word processor systems, etc.
Availability (AVAIL)
Measure of the fraction of the time that
the system is available for use.
Takes repair and restart time into
account.
Availability of 0.998 means software is
available for 998 out of 1000 time units.
Relevant for non-stop, continuously
running systems.
• Telephone switching systems, railway
signalling systems.
Three kinds of measurements that can be
made when assessing the reliability of a
system:
1. The number of system failures given a
number of requests for system services.
Used to measure POFOD.
2. The time (or number of transactions) between
system failures. Used to measure ROCOF
and MTTF.
3. The elapsed repair or restart time when a
system failure occurs, given that the
system must be continuously available.
Used to measure AVAIL.
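The four metrics can be computed directly from the three kinds of measurement. The sketch below uses the figures quoted in the slides (0.002, 500 and 0.998) as sample inputs; the function names and the individual sample observations are assumptions made for the example.

```python
def pofod(failures, demands):
    """Probability of failure on demand: failures per service request."""
    return failures / demands


def rocof(failures, operational_time):
    """Rate of occurrence of failures per operational time unit."""
    return failures / operational_time


def mttf(times_between_failures):
    """Mean of the observed times between failures."""
    return sum(times_between_failures) / len(times_between_failures)


def avail(uptime, downtime):
    """Fraction of time the system is available for use.

    Downtime is the elapsed repair and restart time.
    """
    return uptime / (uptime + downtime)


if __name__ == "__main__":
    print(pofod(2, 1000))          # 2 failures in 1000 demands
    print(rocof(2, 1000))          # 2 failures in 1000 hours
    print(mttf([400, 500, 600]))   # sample inter-failure times
    print(avail(998, 2))           # 998 units up, 2 units under repair
```

None of these functions looks at what a failure cost, which is why the slides go on to separate failures into classes.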
Reliability measurements do NOT take
the consequences of failure into account.
Transient faults may have no real
consequences but other faults may cause
data loss or corruption and loss of system
service.
May be necessary to identify different
failure classes and use different metrics
for each of these. The reliability
specification must be structured.
Statements such as ‘The software shall be
reliable under normal conditions of use’
are meaningless. Quasi-quantitative
statements such as ‘The software shall
exhibit no more than N faults/1000 lines’
are equally useless.
It is impossible to measure the number of
faults/1000 lines of code as you can’t tell
when all faults have been discovered.
When specifying reliability, it is not just
the number of system failures that matters
but the consequences of these failures.
Failures that have serious consequences
are clearly more damaging than those
where repair and recovery is
straightforward.
In some cases, therefore, different
reliability specifications for different
types of failure may be defined.
Steps to a Reliability Specification
For each sub-system, analyse the
consequences of possible system failures.
From the system failure analysis,
partition failures into appropriate classes.
For each failure class identified, set out
the reliability using an appropriate
metric.
Different metrics may be used for
different reliability requirements.
Identify functional reliability
requirements to reduce the chances of
critical failures.
Reliability Requirements for Bank AutoTeller System
Each machine in the network is used
about 300 times per day.
Lifetime of the system hardware is 5
years.
Software is normally upgraded every
year.
During the lifetime of a software release,
each machine will handle about 100,000
Bank has 1,000 machines in its network.
This means that there are 300,000
transactions on the central database per
day (say 100 million per year).
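The transaction volumes above follow from simple arithmetic, reproduced here as a check. The 365-day year is an assumption. Only the per-day usage and the number of machines come from the slides.

```python
USES_PER_MACHINE_PER_DAY = 300   # from the slides
MACHINES = 1_000                 # from the slides
DAYS_PER_YEAR = 365              # assumption: machines run every day

daily_total = USES_PER_MACHINE_PER_DAY * MACHINES
yearly_total = daily_total * DAYS_PER_YEAR
per_machine_per_year = USES_PER_MACHINE_PER_DAY * DAYS_PER_YEAR

if __name__ == "__main__":
    print(daily_total)            # transactions on the database per day
    print(yearly_total)           # about 100 million per year
    print(per_machine_per_year)   # close to the 100,000 per release
```

The per-machine yearly figure is close to the 100,000 transactions quoted for one software release, which suggests (as an inference, not a stated fact) a release lifetime of about a year.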
Reliability Specification for an ATM
Two Types of Failure
1. Transient failures that can be repaired by
user actions such as resetting or
recalibrating the machine.
• For these types of failures, a relatively low
value of POFOD (say 0.002) may be
acceptable.
This means that one failure may occur in every
500 demands made on the machine.
At about 300 uses per day, this is
approximately once every 1.7 days.
2. Permanent failures that require the
machine to be repaired by the
manufacturer.
• Probability of this type of failure should be
much lower- say once a year is the minimum
figure, so POFOD should be no more than
It is impossible to empirically validate
very high reliability specifications.
No database corruptions means POFOD of
less than 1 in 200 million.
If a transaction takes 1 second, then
simulating one day’s transactions takes
about 3.5 days.
It would take longer than the system’s
lifetime to test it for reliability.
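The validation argument can be checked with a short calculation. It assumes sequential simulation at one transaction per second, and that at least 200 million failure-free transactions would have to be run to support a POFOD of less than 1 in 200 million. A statistically sound demonstration would need more.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAILY_TRANSACTIONS = 300_000        # from the slides
SECONDS_PER_TRANSACTION = 1         # from the slides
TARGET_TRANSACTIONS = 200_000_000   # from the POFOD bound in the slides

# Time to replay one day's load on the central database.
days_per_simulated_day = (
    DAILY_TRANSACTIONS * SECONDS_PER_TRANSACTION / SECONDS_PER_DAY)

# Time to run enough transactions to approach the POFOD bound.
days_for_target = (
    TARGET_TRANSACTIONS * SECONDS_PER_TRANSACTION / SECONDS_PER_DAY)
years_for_target = days_for_target / 365

if __name__ == "__main__":
    print(round(days_per_simulated_day, 2))
    print(round(years_for_target, 1))
```

If the hardware lifetime figure of 5 is read as years (an assumption), the test run of more than six years would indeed outlast the system.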