This document discusses reliability specification and metrics. It describes how to identify types of system failure, estimate their costs and consequences, and identify root causes in order to generate reliability specifications. Types of failure include loss of service, incorrect service, and system/data corruption. Reliability metrics such as probability of failure on demand, rate of occurrence of failures, mean time to failure, and availability are discussed; these metrics provide measurable definitions of system reliability.
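The metrics named above can be illustrated with a small sketch. The functions and figures below are hypothetical, assumed only for illustration; they compute each metric from observed failure data in the way the summary describes.

```python
# Sketch of the reliability metrics mentioned above, computed from
# hypothetical observation data. All names and figures are illustrative.

def pofod(failed_demands, total_demands):
    """Probability of failure on demand: fraction of demands that fail."""
    return failed_demands / total_demands

def rocof(failures, observation_hours):
    """Rate of occurrence of failures per unit of observation time."""
    return failures / observation_hours

def mttf(inter_failure_times):
    """Mean time to failure: average time between observed failures."""
    return sum(inter_failure_times) / len(inter_failure_times)

def availability(uptime, downtime):
    """Fraction of time the system is able to deliver service."""
    return uptime / (uptime + downtime)

print(pofod(2, 1000))          # 0.002 -> 2 failures per 1000 demands
print(rocof(4, 1000))          # 0.004 failures per hour
print(mttf([200, 300, 250]))   # 250.0 hours between failures
print(availability(995, 5))    # 0.995 -> available 99.5% of the time
```

Which metric is appropriate depends on the system: POFOD suits demand-driven protection systems, ROCOF/MTTF suit systems with regular service requests, and availability suits systems where continuity of service matters most.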
The document discusses specifications for dependability and security. It covers topics like risk-driven specification, safety specification, and security specification. It emphasizes that critical systems specification should be risk-driven as risks pose a threat to the system. The risk-driven approach aims to understand risks faced by the system and define requirements to reduce these risks through phased risk analysis including preliminary, life cycle, and operational risk analysis. Safety specification identifies protection requirements to ensure system failures do not cause harm, with risk identification, analysis, and reduction mirroring hazard identification, assessment, and analysis. An example of a safety-critical insulin pump system is provided to illustrate dependability requirements and risk analysis.
CS 5032 L6 reliability and security specification (2013), Ian Sommerville
This document discusses reliability and security specification. It defines reliability metrics like probability of failure on demand, rate of occurrence of failures, mean time to failure, and availability. It describes the reliability specification process of risk identification, analysis, and decomposition to generate quantitative requirements. The document also discusses security specification, threat assessment, and defining security requirements to protect system assets. Formal methods for specification are introduced.
This document discusses dependability and security in computer systems. It defines dependability as the extent to which a system operates as expected without failure. Dependability is determined by attributes like availability, reliability, safety, and security. A system is considered dependable if it does not fail and continues delivering its expected services. The document outlines the importance of dependability and explains how attributes like availability, reliability, safety, and security are related and impact one another. It provides terminology and concepts regarding faults, failures, hazards, and risks as they relate to system dependability and security.
This document discusses dependability and security specification. It covers topics like risk-driven specification, safety specification, and security specification. For risk-driven specification, it emphasizes identifying risks through preliminary, life cycle, and operational risk analysis to define requirements that reduce risks. For safety specification, it describes identifying hazards, assessing hazards, and defining safety requirements to ensure system failures do not cause harm. Examples of applying these techniques to an insulin pump are provided.
This document discusses dependable systems architectures, including protection systems, self-monitoring architectures, and N-version programming. It notes that dependable architectures use redundancy and diversity to ensure fault tolerance. Key challenges include achieving true software and design diversity, as teams may interpret specifications similarly and diverse versions could still contain common errors.
CS 5032 L11 validation and reliability testing (2013), Ian Sommerville
Critical systems require additional validation processes beyond non-critical systems due to the high costs of failure. Validation costs for critical systems are significantly higher, usually taking over 50% of development costs. Various static analysis techniques can be used for validation, including formal verification, model checking, and automated program analysis. Statistical testing with an accurate operational profile is also used to measure a critical system's reliability.
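The statistical-testing idea mentioned above can be sketched briefly: test inputs are drawn according to an assumed operational profile (the relative frequencies of input classes in real use), and the observed failure fraction estimates the probability of failure on demand. The profile, input classes, and stand-in system below are all hypothetical.

```python
# Minimal sketch of statistical reliability testing against an assumed
# operational profile. Input classes, frequencies, and the stand-in
# system_under_test are invented for illustration.
import random

operational_profile = {
    "normal_update": 0.90,   # assumed 90% of demands in real use
    "bulk_import":   0.08,
    "malformed":     0.02,
}

def system_under_test(input_class):
    # Stand-in for the real system: fails on some malformed inputs.
    return not (input_class == "malformed" and random.random() < 0.5)

def estimate_pofod(n_demands, seed=42):
    """Run n_demands profile-weighted demands; return observed failure rate."""
    random.seed(seed)
    classes = list(operational_profile)
    weights = list(operational_profile.values())
    failures = 0
    for _ in range(n_demands):
        input_class = random.choices(classes, weights)[0]
        if not system_under_test(input_class):
            failures += 1
    return failures / n_demands

print(estimate_pofod(10_000))  # roughly 0.01 with this profile
```

The accuracy of the reliability estimate depends entirely on how well the operational profile matches real usage, which is why the summary stresses that the profile must be accurate.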
This document discusses key aspects of dependability engineering including: achieving dependability through fault avoidance, detection, and tolerance; using redundancy and diversity; the importance of well-defined, repeatable processes; and guidelines for dependable programming such as checking inputs, handling exceptions, avoiding error-prone constructs, and including timeouts. Critical systems often have high dependability requirements and their development must convince regulators that the system is dependable, safe, and secure.
Requirements engineering involves discovering, documenting, and maintaining requirements for computer systems. It is important because getting requirements wrong can lead to projects being late, over budget, or delivering systems users do not like. Requirements engineering is difficult due to changing needs, differing stakeholder views, and politics influencing priorities.
CS 5032 L12 security testing and dependability cases (2013), Ian Sommerville
The document discusses security validation techniques like experience-based validation using known attacks, tiger teams that simulate attacks, and tool-based validation. It also discusses the importance of having a well-defined development process for safety-critical systems that includes identifying and tracking hazards. Safety and dependability cases collect evidence like hazard analyses, test results, and review reports to argue that a system meets its safety requirements. Structured safety arguments demonstrate that hazardous conditions cannot occur by considering all program paths and showing unsafe conditions cannot be true.
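The structured-argument idea above can be sketched with a toy example in the spirit of the course's insulin pump case study. The hazard is delivering a dose above a safe maximum; the argument is that every path through the dose computation is clamped, so the unsafe condition is unreachable. The function, thresholds, and limits are all invented for illustration.

```python
# Sketch of a structured safety argument: show that an unsafe condition
# (dose > MAX_DOSE) cannot hold on any program path. All names, thresholds,
# and limits below are hypothetical.

MAX_DOSE = 5  # assumed maximum safe single dose (arbitrary units)

def compute_dose(blood_sugar, previous_dose):
    if blood_sugar < 4:
        dose = 0                  # path 1: no dose when sugar is low
    elif blood_sugar < 7:
        dose = previous_dose      # path 2: repeat the previous dose
    else:
        dose = previous_dose + 1  # path 3: increase the dose
    return min(dose, MAX_DOSE)    # clamp applies on every path

# Informal path check: for a range of inputs, the returned dose
# never exceeds MAX_DOSE, so the hazardous condition cannot occur.
assert all(compute_dose(bs, pd) <= MAX_DOSE
           for bs in range(0, 20) for pd in range(0, 10))
```

A real safety case would argue this for all paths by static reasoning rather than sampled inputs, and would bundle the argument with hazard analyses, reviews, and test evidence as the summary describes.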
This document summarizes key topics from a lecture on security engineering, including design guidelines for security, design for deployment, and system survivability. The design guidelines encourage basing decisions on an explicit security policy, avoiding single points of failure, and failing securely. Deployment issues like vulnerable defaults and access permissions are addressed. Finally, resilience strategies like resistance, recognition and recovery are discussed to help systems continue operating during attacks.
This document summarizes Chapter 12 of a textbook on dependability and security specification. It discusses risk-driven specification, including identifying risks, analyzing risks, and defining requirements to reduce risks. It also covers specifying safety requirements by identifying hazards, assessing hazards, and analyzing hazards to discover root causes. The goal is to specify requirements that ensure systems function dependably and securely without failures causing harm.
The document discusses how to specify requirements for critical systems based on risk analysis. It explains how to identify risks, analyze and classify them, then derive safety, security, and reliability requirements to reduce risks. For reliability, it describes metrics like probability of failure on demand and mean time to failure that can be used to specify quantitative reliability levels. The goal is to develop requirements that eliminate intolerable risks and minimize other risks given cost and schedule constraints.
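A quantitative reliability requirement of the kind described above can be checked against measured values. The requirement figures, function names, and measurements below are invented; the availability formula (MTTF over MTTF plus MTTR) is the standard steady-state form.

```python
# Sketch: checking measured reliability against hypothetical quantitative
# requirements of the kind described above. All figures are invented.

requirements = {
    "pofod_max": 0.0002,        # at most 2 failures per 10,000 demands
    "availability_min": 0.999,  # "three nines" availability
}

def availability(mttf_hours, mttr_hours):
    """Steady-state availability from mean time to failure and repair."""
    return mttf_hours / (mttf_hours + mttr_hours)

def meets_requirements(measured_pofod, mttf_hours, mttr_hours):
    return (measured_pofod <= requirements["pofod_max"]
            and availability(mttf_hours, mttr_hours)
                >= requirements["availability_min"])

# Example: 1 failure in 10,000 demands, MTTF 2000 h, MTTR 1 h
print(meets_requirements(0.0001, 2000, 1))  # True
```

Setting such numeric targets forces the trade-off the summary mentions: intolerable risks must be eliminated outright, while the remaining targets are set as tightly as cost and schedule allow.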
This document summarizes the topics covered in the first lecture of a security engineering course. It discusses security engineering and management, security risk assessment, and designing systems for security. The lecture covers tools and techniques for developing secure systems, assessing security risks, and designing system architectures to protect assets and distribute them for redundancy.
CS 5032 L1 critical socio-technical systems (2013), Ian Sommerville
This document outlines the aims and topics of a course on critical systems engineering. The course aims to help students understand critical systems, which are technical systems that are profoundly affected by organizational and human factors. Key topics covered include system dependability, security engineering, and human/organizational factors. The course will examine critical infrastructure systems through topics like resilience engineering and cybersecurity. Assessment includes an exam and a coursework assignment involving requirements specification.
The document discusses security engineering design guidelines and system survivability. It covers:
1) Design guidelines that help make secure design decisions and raise security awareness.
2) Guidelines for avoiding single points of failure, failing securely, balancing security and usability, and more.
3) Designing for deployment to minimize vulnerabilities introduced during configuration and installation.
4) Ensuring systems can continue essential services when under attack through resilience and recoverability.
This document discusses safety engineering for systems that contain software. It covers topics like safety-critical systems, safety requirements, and safety engineering processes. Safety is defined as a system's ability to operate, both normally and abnormally, without causing harm. For safety-critical systems like aircraft or medical devices, software is often used for control and monitoring, so software safety is important. Hazard identification, risk assessment, and specifying safety requirements to mitigate risks are key parts of the safety engineering process. The goal is to design systems where failures cannot cause injury, death, or environmental damage.
This document discusses software security engineering. It covers security concepts like assets, vulnerabilities and threats. It discusses why security engineering is important to protect systems from malicious attackers. The document outlines security risk management processes like preliminary risk assessment. It also discusses designing systems for security through architectural choices that provide protection and distributing assets. The document concludes by covering system survivability through building resistance, recognition and recovery capabilities into systems.
This document discusses principles of software safety for clinical information systems and electronic medical records (EMRs). It provides background on software safety incidents in other industries. Key concepts discussed include adjusting the software development methodology based on risk level, and that no software is completely safe. The document advocates analyzing EMR software to understand how defects could contribute to patient safety risk scenarios from minor to catastrophic. It suggests increased rigor for software that controls computerized protocols, clinical data posting and updating, and overall EMR performance and availability.
The document discusses critical systems and their dependability requirements. It defines critical systems as those where failure could result in loss of life, environmental damage, or large economic losses. Dependability encompasses availability, reliability, safety, and security. The document uses the example of an insulin pump, a safety-critical system, to illustrate dependability dimensions and how failures could threaten human life. Formal development methods may be required for critical systems due to the high costs of failure.
The document discusses safety-critical systems and their dependability. It defines safety-critical systems as those whose failure could result in catastrophic consequences such as loss of life. Examples include failures that caused loss of spacecraft, power outages, and airplane crashes. Dependability is the ability of a system to deliver services and avoid failures. It consists of threats like faults, errors, and failures; attributes like availability and reliability; and means to achieve attributes like fault prevention and fault tolerance. The document outlines techniques used to develop dependable safety-critical systems, including verification, validation, and engineering practices applied at early stages.
This document discusses requirements specification for critical systems. It covers dependability requirements, risk-driven specification, safety specification, security specification, system reliability specification, and non-functional reliability requirements. For risk-driven specification, it describes the stages of risk identification, analysis and classification, decomposition, and risk reduction assessment. It provides examples of applying this process to an insulin pump. For safety specification, it discusses safety requirements, the safety life cycle, and the IEC 61508 standard. For security specification, it outlines a similar process to safety with stages of asset identification, threat analysis, and security requirements specification. It also discusses different types of security requirements.
This document provides an overview of topics in chapter 13 on security engineering. It discusses security and dependability, security dimensions of confidentiality, integrity and availability. It also outlines different security levels including infrastructure, application and operational security. Key aspects of security engineering are discussed such as secure system design, security testing and assurance. Security terminology and examples are provided. The relationship between security and dependability factors like reliability, availability, safety and resilience is examined. The document also covers security in organizations and the role of security policies.
Resilience And Failure Obviation Software Engineering, tyramisu
The document discusses applying concepts from resilience engineering, failure obviation engineering, reliability engineering, and safety engineering to software engineering. It notes software is a critical component of many systems and failures can lead to accidents. The document proposes investigating past software-related accidents to identify failure patterns and develop a resilience model to prevent failures and adapt to changing conditions. Case studies of accidents are presented where software issues were overlooked and contributed to loss of life. Potential research topics applying the engineering concepts to software safety are listed.
This document discusses security cases, which provide a structured argument and evidence to support the claim that a system is acceptably secure. It focuses on addressing the potential for buffer overflows in code. The argument is that coding practices, code reviews, static analysis, and system testing with invalid inputs provide evidence there are no buffer overflow possibilities in the code. Tool support is needed to manage the large amount of documentation required to build the security case.
The document provides an overview of key security engineering activities that should be integrated into the software development lifecycle (SDLC). It discusses securing each phase of development through threat modeling, secure coding practices like code reviews, and security testing. The goal is to build security into applications from the start to help prevent vulnerabilities and deliver more robust products.
This document provides an overview of key topics from Chapter 11 on security and dependability, including:
- The principal dependability properties of availability, reliability, safety, and security.
- Dependability covers attributes like maintainability, repairability, survivability, and error tolerance.
- Dependability is important because system failures can have widespread effects and undependable systems may be rejected.
- Dependability is achieved through techniques like fault avoidance, detection and removal, and building in fault tolerance.
The document discusses cybersecurity and why a technological approach alone is not sufficient. It argues that cybersecurity is a socio-technical problem, as technology cannot guarantee reliability and human and organizational factors like insider threats, procedures, carelessness, and social engineering present vulnerabilities. A holistic approach is needed across personal, organizational, national, and international levels that includes deterrence, awareness, realistic procedures, monitoring, and cooperation.
This is the DDS Security adopted specification.
It was adopted as an OMG standard in June 2014.
The official URL is http://www.omg.org/spec/DDS-SECURITY/
The document discusses how success and failure in sociotechnical systems are subjective and depend on the perspective of the observer. It notes that sociotechnical systems are non-deterministic due to human factors and system changes over time. Different stakeholders may have conflicting views of what constitutes success or failure based on their goals. The document concludes that failures are normal and inevitable in complex systems due to technical limitations and differing stakeholder perspectives.
The document discusses emergent properties in socio-technical systems. It defines emergent properties as properties of a system as a whole that arise from the relationships and interactions between components, rather than properties of the individual components. Emergent properties only emerge once a system is integrated and can include functional properties like a bicycle's ability to transport, as well as non-functional properties like reliability, security, and volume. Reliability in particular is an emergent property, as failures can occur due to unforeseen interactions between components. The document uses examples to illustrate different types of emergent properties.
This document discusses sociotechnical systems and introduces their key concepts. It defines a system as a purposeful collection of interrelated components working towards a common goal. Sociotechnical systems specifically include both technical systems (e.g. software, hardware) as well as the operational processes and people interacting with the technical systems. These systems have a layered "stack" structure with different levels including equipment, operating systems, applications, business processes, organizations, and broader society. Changes at one level can ripple through other levels due to interdependencies between layers. Achieving dependability requires containing failures within layers and understanding how adjacent layers may be affected.
Quality attributes in software architecture – Himanshu
The document discusses software quality attributes and how they relate to software architecture. It defines quality attributes as factors that affect runtime behavior, system design, and user experience. It outlines common quality attributes including design qualities, runtime qualities, system qualities, and more. For each category, it provides examples of specific attributes like reliability, performance, usability, and maintainability. It includes diagrams to illustrate how quality attributes are defined in scenarios and how they can be measured. The document aims to explain how architecture should support and enable achieving various quality goals.
This document discusses the concept of dependability in computer systems. It defines dependability as the extent to which a system operates as expected without failure. Dependability is determined by attributes like availability, reliability, safety, and security. The document outlines these principal properties and how they are related. It also discusses how dependability is perceived subjectively and how availability and reliability can be quantified.
This document discusses dependability engineering and techniques for achieving dependable software systems. It covers fault avoidance, fault detection, and fault tolerance. Critical systems often use redundancy, diversity, and regulated development processes to meet high dependability requirements. Dependable architectures and protection systems can provide fault tolerance to prevent failures from causing outages or emergencies.
Reliability is the ability of a system or component to function under stated conditions for a specified period of time. Failures occur for several reasons, including design flaws, overstressing, wear and tear, vibration, incorrect specifications, misuse, and operation outside the intended environment. The objectives of reliability engineering are to prevent or reduce failures, identify and correct their causes, determine ways to cope with failures that do occur, and estimate the reliability of new designs. Defined as the probability of success, reliability helps avoid downtime, repair costs, and warranty claims. Modes of failure over time include early infant-mortality failures, random stable failures, and wear-out failures, as depicted by the bathtub curve.
Unit 2 – Software development process notes – arvind pandey
Critical systems must be dependable to avoid catastrophic failures. Dependability encompasses availability, reliability, safety, and security. Availability refers to a system's ability to deliver services when requested, while reliability means delivering those services correctly. Safety ensures that failures do not cause harm, since even a single failure could endanger life. Development methods for critical systems aim to formally prove correctness because the cost of failure is so high. An insulin pump example demonstrates how software controls a medical device, requiring stringent dependability to regulate insulin doses safely.
Static analysis and reliability testing (CS 5032 2012) – Ian Sommerville
The document discusses various topics related to dependability and security assurance for critical systems, including static analysis techniques, reliability testing, and validation processes. It notes that validation costs for critical systems are significantly higher than for non-critical systems, often over 50% of total development costs, due to additional validation activities required. Specific static analysis techniques covered include formal verification, model checking, and automated program analysis.
Software reliability is influenced by fault count and operational profile. Key factors include fault avoidance, fault tolerance, fault removal and fault forecasting. Dependability is measured by metrics such as MTTF, MTTR, MTBF, POFOD, ROCOF and availability. Software reliability is defined as the probability of failure-free operation of a software system for a specified time period in a given environment.
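The metrics listed above are simple ratios; a minimal sketch of how they relate, using illustrative timing numbers that are not taken from the original slides:

```python
# Sketch of the dependability metrics named above.
# All timing values below are illustrative assumptions.

def availability(mttf, mttr):
    """Steady-state availability: MTTF / (MTTF + MTTR)."""
    return mttf / (mttf + mttr)

def pofod(failed_demands, total_demands):
    """Probability of failure on demand (POFOD)."""
    return failed_demands / total_demands

def rocof(failures, observed_hours):
    """Rate of occurrence of failures (ROCOF), failures per hour."""
    return failures / observed_hours

# A system that runs 500 h between failures and takes 2 h to repair:
print(round(availability(500, 2), 4))  # 0.996
print(pofod(2, 1000))                  # 0.002
print(rocof(4, 2000))                  # 0.002
```

Note that availability accounts for repair time (MTTR) while POFOD and ROCOF describe failure behaviour only, which is why the metrics suit different kinds of systems.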
McCall proposed a model in 1977 to measure software quality based on quality factors related to software requirements. The model breaks quality factors into three main categories: product operation factors, product revision factors, and product transition factors. Alternative models by Evans/Marciniak and Deutsch/Willis also use these factors to evaluate software quality.
This document provides an overview of reliability engineering topics including software reliability, fault tolerance, and reliability requirements. It discusses key concepts such as availability, reliability, faults, errors and failures. It also describes different fault-tolerant system architectures and reliability metrics including probability of failure on demand, rate of occurrence of failures, and availability. Functional reliability requirements and examples are also presented relating to checking requirements, recovery requirements, redundancy requirements and development process requirements.
This document provides an overview of dependability and dependable systems. It defines dependability as an umbrella term that includes reliability, availability, maintainability, and other attributes that allow systems to be trusted. Dependability addresses how systems can continue operating correctly even when faults occur. Key topics covered include fault tolerance techniques, error processing, failure modes, and modeling approaches for analyzing dependability. The goal of the course is to understand how to design systems that can be relied upon to deliver their services as specified, even in the presence of faults or unexpected events.
The document discusses state-based modeling approaches for dependability analysis, including Markov chains and Petri nets. It begins by defining dependability and its attributes like availability, reliability, safety, and maintainability. It then discusses state-based models and how they can explicitly model complex system relationships. Markov chains and continuous-time Markov chains are described as examples of state-based models. The document provides an example of using a continuous-time Markov chain to model a 2-out-of-3 system and calculate its steady-state availability. It concludes by noting that Markov chains can grow exponentially with system size and discusses decomposition approaches to address this issue.
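For independent, identical components, the 2-out-of-3 availability in the example above also has a simple combinational closed form. This sketch is not the full continuous-time Markov model from the document, and the component availability of 0.99 is an assumption for illustration:

```python
from math import comb

def k_out_of_n_availability(k, n, a):
    """Availability of a k-out-of-n system of independent, identical
    components with availability a: sum_{j=k..n} C(n,j) * a^j * (1-a)^(n-j)."""
    return sum(comb(n, j) * a**j * (1 - a)**(n - j) for j in range(k, n + 1))

# 2-out-of-3 system with component availability 0.99:
# A = 3 * 0.99^2 * 0.01 + 0.99^3 = 0.999702
print(k_out_of_n_availability(2, 3, 0.99))
```

The Markov-chain treatment becomes necessary when repairs couple the components' states (shared repair crews, failure dependencies), which is exactly where the closed form above stops being valid.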
This document discusses the topics of security and dependability in computer systems. It defines dependability as comprising reliability, availability, safety, and security. These properties are interdependent and important for systems where failures could significantly impact users. The document outlines various dependability properties and how they are measured. It discusses how dependability is achieved through techniques like fault avoidance and tolerance. It also distinguishes between safety and reliability, defining safety as preventing harm even if a system fails. Key aspects of safety-critical systems and achieving safety are also covered.
This document discusses the key aspects of system dependability, including availability, reliability, safety, and security. It notes that dependability reflects the degree to which users trust a system and defines it as covering attributes like availability, reliability, and security. It also discusses factors that influence perceptions of reliability and availability, such as usage patterns, outage length and number of users affected.
The document discusses reliability engineering and fault tolerance. It covers topics like availability, reliability requirements, fault-tolerant architectures, and reliability measurement. It defines key terms like faults, errors, and failures. It also describes techniques for achieving reliability like fault avoidance, fault detection, and fault tolerance. Specific architectures discussed include redundant systems and protection systems that can take emergency action if failures occur.
An introduction to requirements engineering for students with no previous background in this area. Part of critical systems engineering course, CS 5032.
The document discusses Lloyd's Register's asset maintenance and integrity solutions for the energy industry. It outlines that effective maintenance management is important for equipment integrity but can be expensive, while poor maintenance opens risks. Lloyd's Register offers services to review maintenance systems, provide guidance for improvement, and assure compliance. The goal is to achieve sustainable asset performance, improved safety, cost efficiency and higher returns through optimized maintenance management.
This document discusses availability and reliability in systems. Availability is defined as the probability that a system will be operational to deliver requested services, while reliability is the probability of failure-free operation over time. Both can be expressed as percentages. Availability takes into account repair times, whereas reliability does not. Faults can lead to errors and failures if not addressed through techniques like fault avoidance, detection, and tolerance.
This document discusses reliability modeling and evaluation. It defines systems and describes two types: non-maintained and maintained. For non-maintained systems, reliability and mean time to failure are key metrics, while for maintained systems availability and mean time to repair are more important. The document also discusses different modeling approaches like black box, white box, and graphical representations. It covers reliability modeling concepts like independent failures, various system models including series, parallel and k-out-of-n configurations.
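Assuming independent failures, the series and parallel configurations mentioned above reduce to simple products; a minimal sketch with illustrative reliability values:

```python
from functools import reduce

def series_reliability(rs):
    """A series system works only if every component works."""
    return reduce(lambda x, y: x * y, rs, 1.0)

def parallel_reliability(rs):
    """A parallel system works if at least one component works."""
    return 1.0 - reduce(lambda x, y: x * y, (1.0 - r for r in rs), 1.0)

# Three components, each with reliability 0.9 over the mission time:
print(round(series_reliability([0.9, 0.9, 0.9]), 3))    # 0.729
print(round(parallel_reliability([0.9, 0.9, 0.9]), 3))  # 0.999
```

The contrast is the point: chaining components in series degrades reliability below that of the weakest component, while replicating them in parallel improves it, which is why redundancy is the basic tool of fault-tolerant design.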
A distributed database management system aims to provide reliability even when the underlying system experiences failures. Reliability in a distributed system is achieved through data replication and easy scaling. However, several protocols must be implemented to take advantage of distribution and replication. Reliability is closely related to maintaining atomicity and durability of transactions. Key concepts related to reliability include faults, failures, errors, mean time between failures, mean time to repair, reliability, and availability. Reliability refers to the probability of no failures within a time period while availability is the fraction of time the system is operational.
The document discusses techniques for achieving dependable software systems through fault tolerance. It describes hardware-based triple modular redundancy and software-based N-version programming. It also discusses challenges with achieving true design diversity and problems that can arise from specification errors. The document concludes with guidelines for dependable programming practices such as limiting information visibility, checking inputs, using exception handling, avoiding error-prone constructs, including restart capabilities and timeouts.
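The triple modular redundancy idea can be sketched as a majority voter over three channel outputs. This is a toy illustration of the voting principle, not a design from the document:

```python
from collections import Counter

def tmr_vote(a, b, c):
    """Majority vote over three redundant channel outputs.
    A single faulty channel is masked; if no two channels agree,
    the voter signals an unmaskable failure."""
    value, count = Counter([a, b, c]).most_common(1)[0]
    if count < 2:
        raise RuntimeError("no majority: more than one channel disagrees")
    return value

print(tmr_vote(42, 42, 41))  # 42 -- the single faulty channel is outvoted
```

The voter masks one arbitrary channel failure, but note the caveat raised in the summary: if all channels share a specification error, they fail identically and the vote passes the wrong value unanimously, which is why true design diversity is hard.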
Similar to Reliability and security specification (CS 5032 2012)
The document discusses ultra large scale systems (ULSS) and introduces key points from an SEI report on ULSS. It defines ULSS as interconnected webs of software, people, policies and economics at an internet scale. The scale of ULSS undermines traditional software engineering approaches. Some challenges in developing ULSS include design/evolution, orchestration, monitoring, organizational integration, and regulation/control. New interdisciplinary research is needed to address issues arising from increased system scale.
The document discusses responsibility modeling for socio-technical systems. It describes how responsibility models can be used to identify vulnerabilities in systems by clarifying agent responsibilities and information needs. The document presents examples of responsibility models for emergency response coordination and uses HAZOP analysis to identify potential failures based on deviations to information resources. The goal of responsibility modeling is to improve system dependability by facilitating analysis of social and organizational factors.
The document discusses approaches to system failure and recovery in large, complex information systems. It argues that a conventional view of failure as deviation from specifications is not adequate, as failures are subjective judgments that depend on context. Failures are also inevitable due to system complexity and stakeholder conflicts. Instead of seeking to avoid all failures, systems should be designed to support recognition of problems when they occur and enable recovery actions. Guidelines for designing systems to enhance recovery include supporting local knowledge, flexible reconfiguration of processes when needed, and redundancy through data replication and alternative records. The goal is to design systems that make recovery from inevitable failures easier rather than focusing solely on preventing failures.
This document discusses the challenges of engineering large, socio-technical systems where there are significant unknown factors (LSCITS). It argues that the traditional reductionist approach to engineering breaks down for these systems. LSCITS engineering must account for scale, uncertainty, and the entanglement of social and technical aspects. Some key challenges identified include managing large scale systems, dealing with uncertainty, reasoning about complex systems, integrating different systems, and developing standards for LSCITS. The document advocates an approach that tempers reductionism with pragmatism and recognition of real-world constraints like imperfect people and organizations.
This document discusses the realities of requirements engineering processes compared to formal textbook models. It notes that real RE processes are often ad hoc and vary significantly based on factors like the type of system, customer, developer culture, and deployment environment. Formal processes do not account for human, social, and political factors that influence requirements. The document emphasizes that stakeholders, customers, and environments are diverse and changing, so requirements engineering must adapt to these realities rather than following rigid predefined processes.
This document discusses dependability requirements engineering for safety-critical information systems. It introduces the concepts of system dependability, dependability requirement types, and the need for integrated requirements engineering. Dependability requirements should be considered alongside other business requirements. The document uses an exemplar healthcare records management system to illustrate how concerns can help identify safety requirements and requirements conflicts. Concerns reflect organizational goals and help bridge the gap between goals and system requirements.
The document discusses the conceptual design process for a large-scale complex IT system (LSCITS) for road pricing. It covers the key activities in conceptual design including concept formulation, problem understanding, requirements engineering, feasibility studies, and architectural design. Specifically, it provides examples for a proposed road pricing system, describing the problem it aims to solve, potential high-level requirements and constraints, different design options considered in the feasibility study, and examples of technologies that could enable each option.
This document provides an introduction to large-scale complex IT systems (LSCITS). It defines LSCITS and distinguishes them from other types of systems. Key points made include:
- An LSCITS is a large-scale IT system where there are significant unknown factors in its development, use, and operating environments that introduce complexity.
- LSCITS are often part of larger socio-technical systems and there are complex relationships between the technical and social aspects.
- Examples given of LSCITS include digital music systems and national identity management systems.
In November 1988, a student at Cornell University deliberately released a program called the Internet Worm that exploited security vulnerabilities in Unix systems and spread across computers connected to the Internet. The worm did not cause direct damage but its replication overloaded systems and severely disrupted network services. It took system administrators several days to devise and implement modifications to vulnerable programs to stop the spread of the worm and clear infected machines. This was the first widely distributed Internet security threat and highlighted the importance of securing systems and patching vulnerabilities.
Discusses sociotechnical issues that arose in the design of a national digital learning system intended for use by more than a million students and their teachers
The Ariane 5 rocket failed on its maiden flight, 37 seconds after launch, due to a software failure. An attempt to convert a 64-bit floating-point number to a 16-bit signed integer caused an overflow, crashing the inertial reference system and its backup. The failed software had been reused from Ariane 4 without being properly tested or reviewed against the new rocket's flight profile, making the launch failure avoidable.
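The narrowing error described above can be emulated with a checked 64-to-16-bit conversion. The values below are illustrative, not actual flight data; the point is that a checked conversion makes the out-of-range case explicit, whereas the reused Ariane code raised an unhandled exception:

```python
def to_int16_checked(value):
    """Convert an integer to a signed 16-bit value, raising on overflow
    rather than silently wrapping into the two's-complement range."""
    wrapped = ((int(value) + 0x8000) & 0xFFFF) - 0x8000
    if wrapped != int(value):
        raise OverflowError(f"{value} does not fit in 16 bits "
                            f"(would wrap to {wrapped})")
    return wrapped

print(to_int16_checked(30000))  # 30000 -- within the 16-bit range
# to_int16_checked(70000) raises OverflowError: on Ariane 5, the higher
# horizontal velocity produced values outside the range assumed for Ariane 4.
```

Handling (or statically excluding) the overflow case for the new flight profile is exactly the review step the reused code never received.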
British Midland Flight 92 from Heathrow to Belfast crashed near Kegworth, England in 1989, killing 47 people. A fan blade broke off on the left engine, causing vibration, but the pilot mistakenly shut down the right engine instead. The left engine then failed 20 minutes later, as the pilot had not detected the initial error. The crash resulted from a combination of factors including pilot error, inadequate training, aircraft design issues, and lack of communication between the pilot and cabin crew.
Cybersecurity involves protecting individuals, businesses, and governments from cyber threats on computers and the internet. It is a broad field that includes threat analysis, security technologies, policies and laws. Cybersecurity problems stem from technical issues as well as human and organizational factors. It aims to prevent malicious cyber attacks and accidental damage. Attacks can come from inside or outside an organization and include fraud, spying, stalking, assault, and warfare between nations. The scale of the problem is large but difficult to measure fully. Cybersecurity issues have arisen because the internet was not designed with security in mind and prioritizes convenience, while widespread connectivity has increased risks.
Critical infrastructure refers to the essential systems and services in modern societies that support public services, the economy, and national security. These include systems for power, water, transportation, communications, health care, finance, and emergency services. Modern critical infrastructure is controlled and managed by interconnected computer systems, so these digital systems are also considered critical infrastructure. Critical infrastructure is characterized as being large in scale, complex with many interdependencies, reliant on standards, and long-lasting. The availability of critical infrastructure is essential for society, and the unavailability of critical systems could have significant human, economic, and social consequences.
The Maroochy SCADA attack in 2000 involved an insider, Vitek Boden, who had worked for the company that installed the sewage pumping control system for Maroochy, Australia. After leaving the company and being denied a job with the local council, Boden stole hardware and software from his previous employer to launch a revenge attack. Over several months, he remotely hacked into the SCADA system controlling the sewage pumps and caused pumps to fail, releasing over 1 million liters of untreated sewage. The attack highlighted security issues with the system's insecure radio communications and lack of monitoring. Boden was later convicted and jailed for his role in the incident.
1) SCADA systems are used to monitor and control critical infrastructure through networks of sensors and programmable logic controllers.
2) These systems were traditionally isolated but are now increasingly connected to external networks, making them vulnerable to attacks.
3) Common vulnerabilities of SCADA systems include weak passwords, unencrypted network traffic, and lack of input validation. Improving SCADA security is challenging due to the operational needs of control systems and lack of security experience among operators.
Socio-technical systems include both technical and human elements. They are made up of interconnected layers from equipment and software to business processes and societal rules. Properties emerge from the interactions between these layers, including reliability, security, and usability. Whether a socio-technical system is considered a success or failure depends on perspective, as stakeholders have differing views and system behavior is non-deterministic due to human factors. Failures are also inevitable given the complexity of relationships in socio-technical systems.
In this session, we'll share insights on how we used PostgreSQL to facilitate precise searches across multiple fields in our mobile application. The techniques include using LIKE and ILIKE operators and integrating a trigram-based search to handle potential misspellings, thereby increasing the search accuracy.
We'll also discuss how the azure_ai extension on PostgreSQL databases in Azure and Azure AI Services were utilized to create vectors from user input, a feature beneficial when users wish to find specific items based on text prompts. While our application's case study involves a drug search, the techniques and principles shared in this session can be adapted to improve search functionality in a wide range of applications. Join us to learn how PostgreSQL and Azure AI can be harnessed to enhance your application's search capability.
Northern Engraving | Modern Metal Trim, Nameplates and Appliance PanelsNorthern Engraving
What began over 115 years ago as a supplier of precision gauges to the automotive industry has evolved into being an industry leader in the manufacture of product branding, automotive cockpit trim and decorative appliance trim. Value-added services include in-house Design, Engineering, Program Management, Test Lab and Tool Shops.
Reliability and security specification (CS 5032 2012)
1. Dependability and Security Specification
Lecture 2
Dependability and Security Specification, CSE course, 2011 Slide 1
2. Reliability specification
• Reliability can be measured, so non-functional reliability requirements may be specified quantitatively.
• Non-functional reliability requirements define the number of failures that are acceptable during normal use of the system, or the time for which the system must be available.
• Functional reliability requirements define system and software functions that avoid, detect or tolerate faults in the software, and so ensure that these faults do not lead to system failure.
• Software reliability requirements may also be included to cope with hardware failure or operator error.
3. The reliability specification process
• Risk identification: identify the types of system failure that may lead to economic losses.
• Risk analysis: estimate the costs and consequences of the different types of software failure.
• Risk decomposition: identify the root causes of system failure.
• Risk reduction: generate reliability specifications, including quantitative requirements defining the acceptable levels of failure.
4. Types of system failure
Failure type: Loss of service
Description: The system is unavailable and cannot deliver its services to users. You may separate this into loss of critical services and loss of non-critical services, where the consequences of a failure in non-critical services are less than the consequences of critical service failure.

Failure type: Incorrect service delivery
Description: The system does not deliver a service correctly to users. Again, this may be specified in terms of minor and major errors, or errors in the delivery of critical and non-critical services.

Failure type: System/data corruption
Description: The failure of the system causes damage to the system itself or its data. This will usually, but not necessarily, occur in conjunction with other types of failure.
5. Reliability metrics
• Reliability metrics are units of measurement of system reliability.
• System reliability is measured by counting the number of operational failures and, where appropriate, relating these to the demands made on the system and the time that the system has been operational.
• Metrics:
– Probability of failure on demand
– Rate of occurrence of failures / mean time to failure
– Availability
• A long-term measurement programme is required to assess the reliability of critical systems.
6. Probability of failure on demand (POFOD)
• The probability that the system will fail when a request for service is made.
• Used when demands for service are intermittent and relatively infrequent.
• Appropriate for protection systems where services are demanded occasionally and where there are serious consequences if the service is not delivered.
– Example: the protection system at the Sizewell B power station, which shuts down the reactor if problems are detected.
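POFOD can be estimated directly from operational data as the number of observed failures divided by the number of service demands. A minimal sketch in Python (the figures are illustrative, not from the lecture):

```python
def pofod(failures: int, demands: int) -> float:
    """Probability of failure on demand: observed failures per service demand."""
    if demands <= 0:
        raise ValueError("demands must be positive")
    return failures / demands

# Illustrative: 2 failures observed over 10,000 demands on a protection system.
print(pofod(2, 10_000))  # 0.0002
```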
7. Rate of fault occurrence (ROCOF)
• Reflects the rate of occurrence of failure in the system.
– A ROCOF of 0.002 means that 2 failures are likely in each 1000 operational time units, e.g. 2 failures per 1000 hours of operation.
• Relevant for systems that have to process a large number of similar requests in a defined time period.
– Examples: a credit card processing system or a supermarket checkout system.
8. Mean time to failure
• The reciprocal of ROCOF is the mean time to failure (MTTF).
– Relevant for systems with long transactions, i.e. where system processing takes a long time (e.g. CAD systems).
• MTTF should be longer than the expected transaction length, so that the system does not normally fail during a session or transaction.
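The reciprocal relationship between the two metrics can be checked with the slide's own figures (2 failures per 1,000 hours):

```python
def rocof(failures: int, hours: float) -> float:
    """Rate of occurrence of failures per unit of operational time."""
    return failures / hours

def mttf(rate: float) -> float:
    """Mean time to failure: the reciprocal of ROCOF."""
    return 1.0 / rate

rate = rocof(2, 1000)    # 0.002 failures per hour
print(rate, mttf(rate))  # 0.002 500.0, i.e. a failure every 500 hours on average
```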
9. Availability
• A measure of the fraction of the time that the system is available for use.
• Takes repair and restart time into account.
• An availability of 0.998 means the software is available for 998 out of 1000 time units.
• Relevant for non-stop, continuously running systems.
– Examples: telephone switching systems, railway signalling systems, e-commerce systems.
10. Availability specification
Availability: 0.9
Explanation: The system is available for 90% of the time. In a 24-hour period (1,440 minutes), the system will be unavailable for 144 minutes.

Availability: 0.99
Explanation: In a 24-hour period, the system is unavailable for 14.4 minutes.

Availability: 0.999
Explanation: The system is unavailable for about 86 seconds (1.44 minutes) in a 24-hour period.

Availability: 0.9999
Explanation: The system is unavailable for about 8.6 seconds in a 24-hour period; roughly one minute per week.
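These figures follow from one line of arithmetic: unavailable time = (1 − availability) × period length. A quick check in Python:

```python
def daily_downtime_seconds(availability: float) -> float:
    """Seconds of downtime expected in a 24-hour (86,400-second) period."""
    return (1.0 - availability) * 86_400

for a in (0.9, 0.99, 0.999, 0.9999):
    print(f"{a}: {daily_downtime_seconds(a):.1f} seconds/day")
```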
11. Failure consequences
• When specifying reliability, it is not just the number of system failures that matters but also the consequences of these failures.
• Failures that have serious consequences are clearly more damaging than those where repair and recovery are straightforward.
• In some cases, therefore, different reliability specifications may be defined for different types of failure.
12. Over-specification of reliability
• Over-specification of reliability means that a high level of reliability is specified but it is not cost-effective to achieve it.
• In many cases, it is cheaper to accept and deal with failures than to prevent them from occurring.
• To avoid over-specification:
– Specify reliability requirements for different types of failure. Minor failures may be acceptable.
– Specify requirements for different services separately. Critical services should have the highest reliability requirements.
– Decide whether high reliability is really required, or whether the dependability goals can be achieved in some other way.
13. Steps to a reliability specification
• For each sub-system, analyse the consequences of possible system failures.
• From the system failure analysis, partition failures into appropriate classes.
• For each failure class identified, set out the reliability using an appropriate metric. Different metrics may be used for different reliability requirements.
• Identify functional reliability requirements to reduce the chances of critical failures.
14. Insulin pump specification
• Probability of failure on demand (POFOD) is the most appropriate metric.
– Demands are relatively infrequent (tens per day).
• Transient failures can be repaired by user actions such as recalibration of the machine. A relatively low value of POFOD is acceptable (say 0.002): one failure may occur in every 500 demands.
• Permanent failures require the software to be re-installed by the manufacturer. This should occur no more than once per year, so the POFOD for this situation should be less than 0.00002.
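These targets can be sanity-checked with simple arithmetic; the demand rate below is an assumed figure (the slide says only tens of demands per day):

```python
DEMANDS_PER_DAY = 20       # assumed for illustration; the slide says "tens per day"
TRANSIENT_POFOD = 0.002    # acceptable transient-failure probability per demand
PERMANENT_POFOD = 0.00002  # target for failures needing software re-installation

# 0.002 per demand means one transient failure in every 500 demands.
print(1 / TRANSIENT_POFOD)  # 500.0

# Expected permanent failures per year at the assumed demand rate.
demands_per_year = DEMANDS_PER_DAY * 365   # 7,300 demands
print(demands_per_year * PERMANENT_POFOD)  # about 0.15, i.e. well under one per year
```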
15. Functional reliability requirements
• Checking requirements identify checks to ensure that incorrect data is detected before it leads to a failure.
• Recovery requirements are geared to helping the system recover after a failure has occurred.
• Redundancy requirements specify redundant features of the system to be included.
• Process requirements for reliability, which specify the development process to be used, may also be included.
16. Examples of functional reliability
requirements for MHC-PMS
RR1: A pre-defined range for all operator inputs shall be defined, and the system shall check that all operator inputs fall within this pre-defined range. (Checking)
RR2: Copies of the patient database shall be maintained on two separate servers that are not housed in the same building. (Recovery, redundancy)
RR3: N-version programming shall be used to implement the braking control system. (Redundancy)
RR4: The system must be implemented in a safe subset of Ada and checked using static analysis. (Process)
17. Security specification
• Security specification has something in common with safety requirements specification: in both cases, your concern is to avoid something bad happening.
• There are four major differences:
– Safety problems are accidental; the software is not operating in a hostile environment. In security, you must assume that attackers have knowledge of system weaknesses.
– When safety failures occur, you can look for the root cause or weakness that led to the failure. When a failure results from a deliberate attack, the attacker may conceal the cause of the failure.
– Shutting down a system can avoid a safety-related failure, whereas causing a shutdown may be the aim of an attack.
– Safety-related events are not generated by an intelligent adversary, whereas an attacker can probe defenses over time to discover weaknesses.
18. Security policy
• An organizational security policy applies to all systems and sets out what should and should not be allowed.
• For example, a military policy might be:
– Readers may only examine documents whose classification is the same as, or below, the reader's vetting level.
• A security policy sets out the conditions that must be maintained by a security system, and so helps identify system security requirements.
19. The preliminary risk assessment process for security requirements
20. Security risk assessment
• Asset identification: identify the key system assets (or services) that have to be protected.
• Asset value assessment: estimate the value of the identified assets.
• Exposure assessment: assess the potential losses associated with each asset.
• Threat identification: identify the most probable threats to the system assets.
21. Security risk assessment
• Attack assessment: decompose threats into possible attacks on the system and the ways that these may occur.
• Control identification: propose the controls that may be put in place to protect an asset.
• Feasibility assessment: assess the technical feasibility and cost of the controls.
• Security requirements definition: define the security requirements, which can be infrastructure or application system requirements.
22. Asset analysis in a preliminary risk assessment report for the MHC-PMS
Asset: The information system
Value: High. Required to support all clinical consultations. Potentially safety-critical.
Exposure: High. Financial loss, as clinics may have to be canceled. Costs of restoring the system. Possible patient harm if treatment cannot be prescribed.

Asset: The patient database
Value: High. Required to support all clinical consultations. Potentially safety-critical.
Exposure: High. Financial loss, as clinics may have to be canceled. Costs of restoring the system. Possible patient harm if treatment cannot be prescribed.

Asset: An individual patient record
Value: Normally low, although it may be high for specific high-profile patients.
Exposure: Low direct losses, but possible loss of reputation.
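An assessment like the one above is easy to record in a small data model; the class and field names here are illustrative, not part of the lecture material:

```python
from dataclasses import dataclass

@dataclass
class Asset:
    name: str
    value: str     # e.g. "high", "normally low"
    exposure: str  # potential losses if the asset is compromised

# Two rows transcribed from the MHC-PMS preliminary assessment:
assets = [
    Asset("patient database", "high",
          "financial loss, restoration costs, possible patient harm"),
    Asset("individual patient record", "normally low",
          "low direct losses, possible loss of reputation"),
]
high_value = [a.name for a in assets if a.value == "high"]
print(high_value)  # ['patient database']
```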
23. Threat and control analysis in a preliminary risk assessment report
Threat: Unauthorized user gains access as system manager and makes the system unavailable.
Probability: Low.
Control: Only allow system management from specific locations that are physically secure.
Feasibility: Low cost of implementation, but care must be taken with key distribution and to ensure that keys are available in the event of an emergency.

Threat: Unauthorized user gains access as system user and accesses confidential information.
Probability: High.
Control: Require all users to authenticate themselves using a biometric mechanism.
Feasibility: Technically feasible, but a high-cost solution. Possible user resistance.
Control: Log all changes to patient information to track system usage.
Feasibility: Simple and transparent to implement, and also supports recovery.
24. Security requirements for the MHC-PMS
• Patient information shall be downloaded at the start of a clinic session to a secure area on the system client that is used by clinical staff.
• All patient information on the system client shall be encrypted.
• Patient information shall be uploaded to the database after a clinic session has finished and deleted from the client computer.
• A log of all changes made to the system database must be maintained on a computer separate from the database server.
25. Formal methods and specification
26. Formal methods and critical systems
• Formal specification is part of a more general collection of techniques known as ‘formal methods’.
• These are all based on the mathematical representation and analysis of software.
• Formal methods include:
– Formal specification;
– Specification analysis and proof;
– Transformational development;
– Program verification.
27. Use of formal methods
• The principal benefits of formal methods are in reducing the number of faults in systems.
• Consequently, their main area of applicability is in critical systems engineering. There have been several successful projects where formal methods have been used in this area.
• In this area, the use of formal methods is most likely to be cost-effective, because high system failure costs must be avoided.
28. Specification in the software process
• Specification and design are inextricably intermingled.
• Architectural design is essential to structure a specification and the specification process.
• Formal specifications are expressed in a mathematical notation with precisely defined vocabulary, syntax and semantics.
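As a flavour of such notation, a pre/post-condition specification for a hypothetical insulin-dose computation might read as follows (an illustrative sketch; the operation and symbol names are not taken from the lecture):

```latex
% Illustrative pre/post-conditions for a hypothetical computeDose operation:
% the dose is bounded by a safe maximum and is zero at low blood-sugar readings.
\begin{align*}
\textbf{pre:}\;  & 0 \le r_1 \;\wedge\; 0 \le r_2
                   && \text{(two successive blood-sugar readings)}\\
\textbf{post:}\; & 0 \le \mathit{dose} \le \mathit{maxDose}
                   \;\wedge\; (r_2 \le \mathit{safeMin} \Rightarrow \mathit{dose} = 0)
\end{align*}
```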
29. Formal specification in a plan-based software process
30. Benefits of formal specification
• Developing a formal specification requires the system requirements to be analyzed in detail. This helps to detect problems, inconsistencies and incompleteness in the requirements.
• As the specification is expressed in a formal language, it can be automatically analyzed to discover inconsistencies and incompleteness.
• If you use a formal method such as the B method, you can transform the formal specification into a ‘correct’ program.
• Program testing costs may be reduced if the program is formally verified against its specification.
31. Acceptance of formal methods
• Formal methods have had a limited impact on practical software development:
– Problem owners cannot understand a formal specification, and so cannot assess whether it is an accurate representation of their requirements.
– It is easy to assess the costs of developing a formal specification but harder to assess the benefits. Managers may therefore be unwilling to invest in formal methods.
– Software engineers are unfamiliar with this approach and are therefore reluctant to propose the use of formal methods.
– Formal methods are still hard to scale up to large systems.
– Formal specification is not really compatible with agile development methods.
32. Key points
• Reliability requirements can be defined quantitatively. They include probability of failure on demand (POFOD), rate of occurrence of failure (ROCOF) and availability (AVAIL).
• Security requirements are more difficult to identify than safety requirements, because a system attacker can use knowledge of system vulnerabilities to plan an attack, and can learn about vulnerabilities from unsuccessful attacks.
• To specify security requirements, you should identify the assets that are to be protected and define how security techniques and technology should be used to protect these assets.
• Formal methods of software development rely on a system specification that is expressed as a mathematical model. The use of formal methods avoids ambiguity in a critical systems specification.