This document discusses key aspects of dependability engineering including: achieving dependability through fault avoidance, detection, and tolerance; using redundancy and diversity; the importance of well-defined, repeatable processes; and guidelines for dependable programming such as checking inputs, handling exceptions, avoiding error-prone constructs, and including timeouts. Critical systems often have high dependability requirements and their development must convince regulators that the system is dependable, safe, and secure.
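The programming guidelines mentioned above (checking inputs, handling exceptions, including timeouts) can be illustrated with a short sketch. All names and the sensor protocol here are hypothetical, not from the original lectures:

```python
import socket

def read_sensor_value(host: str, port: int, timeout_s: float = 2.0) -> int:
    """Read one integer reading from a (hypothetical) TCP sensor.

    Illustrates three dependable-programming guidelines:
    validate inputs, handle exceptions explicitly, and never
    wait indefinitely on an external resource (timeouts).
    """
    if not (0 < port < 65536):
        raise ValueError(f"port out of range: {port}")   # check inputs
    try:
        with socket.create_connection((host, port), timeout=timeout_s) as conn:
            conn.settimeout(timeout_s)                   # include a timeout
            raw = conn.recv(64)
    except OSError as exc:                               # handle exceptions
        raise RuntimeError(f"sensor read failed: {exc}") from exc
    value = int(raw.strip())
    if not (0 <= value <= 4095):                         # range-check the result
        raise ValueError(f"implausible reading: {value}")
    return value
```

Each failure path raises a specific exception rather than returning a silent default, so callers must decide how to degrade.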
This document discusses dependability and security in computer systems. It defines dependability as the extent to which a system operates as expected without failure. Dependability is determined by attributes like availability, reliability, safety, and security. A system is considered dependable if it does not fail and continues delivering its expected services. The document outlines the importance of dependability and explains how attributes like availability, reliability, safety, and security are related and impact one another. It provides terminology and concepts regarding faults, failures, hazards, and risks as they relate to system dependability and security.
This document discusses dependable systems architectures, including protection systems, self-monitoring architectures, and N-version programming. It notes that dependable architectures use redundancy and diversity to ensure fault tolerance. Key challenges include achieving true software and design diversity, as teams may interpret specifications similarly and diverse versions could still contain common errors.
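The core mechanism of N-version programming, running independently developed versions and voting on their outputs, can be sketched as follows. The three "versions" here are trivial stand-ins (in a real system they would be built by separate teams), and all names are illustrative:

```python
from collections import Counter

def majority_vote(results):
    """Return the value agreed by a strict majority of versions."""
    value, n = Counter(results).most_common(1)[0]
    if n > len(results) // 2:
        return value
    raise RuntimeError("no majority among versions")

# Stand-ins for independently developed implementations of sqrt.
def sqrt_v1(x):
    return x ** 0.5

def sqrt_v2(x):
    g = x or 1.0
    for _ in range(50):          # Newton's method
        g = 0.5 * (g + x / g)
    return g

def sqrt_v3(x):
    return x ** 0.5              # imagine a third, separate implementation

def n_version_sqrt(x, tol=1e-9):
    outs = [sqrt_v1(x), sqrt_v2(x), sqrt_v3(x)]
    rounded = [round(o / tol) * tol for o in outs]   # tolerate float noise
    return majority_vote(rounded)
```

Note the challenge the summary raises: if all versions share a misreading of the specification, the vote happily agrees on the same wrong answer, so voting masks independent faults only.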
CS 5032 L6 reliability and security specification (2013), Ian Sommerville

This document discusses reliability and security specification. It defines reliability metrics like probability of failure on demand, rate of occurrence of failures, mean time to failure, and availability. It describes the reliability specification process of risk identification, analysis, and decomposition to generate quantitative requirements. The document also discusses security specification, threat assessment, and defining security requirements to protect system assets. Formal methods for specification are introduced.
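The four metrics named in the summary can be computed from observed failure data. The function and the sample figures below are hypothetical, intended only to make each definition concrete:

```python
def reliability_metrics(demands, demand_failures, uptime_s, downtime_s,
                        failure_times_s):
    """Compute standard reliability metrics from (hypothetical) observations.

    demands / demand_failures : counts for demand-based operation
    uptime_s, downtime_s      : seconds operational vs. unavailable
    failure_times_s           : timestamps (s) of observed failures
    """
    pofod = demand_failures / demands            # probability of failure on demand
    total_s = uptime_s + downtime_s
    rocof = len(failure_times_s) / total_s       # rate of occurrence of failures
    gaps = [b - a for a, b in zip(failure_times_s, failure_times_s[1:])]
    mttf = sum(gaps) / len(gaps) if gaps else float("inf")  # mean time to failure
    availability = uptime_s / total_s            # fraction of time operational
    return {"POFOD": pofod, "ROCOF": rocof,
            "MTTF": mttf, "AVAIL": availability}
```

POFOD suits systems with discrete service demands (e.g. a protection system), while ROCOF and MTTF suit continuously operating systems; availability matters most when repair time dominates user impact.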
This document discusses dependability and security specification. It covers topics like risk-driven specification, safety specification, and security specification. For risk-driven specification, it emphasizes identifying risks through preliminary, life cycle, and operational risk analysis to define requirements that reduce risks. For safety specification, it describes identifying hazards, assessing hazards, and defining safety requirements to ensure system failures do not cause harm. Examples of applying these techniques to an insulin pump are provided.
CS 5032 L11 validation and reliability testing (2013), Ian Sommerville
Critical systems require additional validation processes beyond non-critical systems due to the high costs of failure. Validation costs for critical systems are significantly higher, usually taking over 50% of development costs. Various static analysis techniques can be used for validation, including formal verification, model checking, and automated program analysis. Statistical testing with an accurate operational profile is also used to measure a critical system's reliability.
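Statistical testing with an operational profile can be sketched as below: test inputs are drawn in proportion to how often each class of input occurs in real operation, and the observed failure fraction estimates POFOD. The interface is an assumption for illustration:

```python
import random

def estimate_pofod(system, operational_profile, n_tests=10_000, seed=0):
    """Estimate probability of failure on demand by statistical testing.

    operational_profile: list of (input_generator, probability) pairs whose
    probabilities sum to 1 -- the relative frequencies with which each
    class of input occurs in real operation.
    system: callable returning True on success, False on failure.
    """
    rng = random.Random(seed)
    gens, weights = zip(*operational_profile)
    failures = 0
    for _ in range(n_tests):
        gen = rng.choices(gens, weights=weights)[0]  # pick an input class
        if not system(gen(rng)):                     # run one test demand
            failures += 1
    return failures / n_tests
```

The estimate is only as good as the profile: if the assumed frequencies differ from real usage, the measured reliability will not match what users experience.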
Requirements engineering involves discovering, documenting, and maintaining requirements for computer systems. It is important because getting requirements wrong can lead to projects being late, over budget, or delivering systems users do not like. Requirements engineering is difficult due to changing needs, differing stakeholder views, and politics influencing priorities.
CS 5032 L1 critical socio-technical systems (2013), Ian Sommerville
This document outlines the aims and topics of a course on critical systems engineering. The course aims to help students understand critical systems, which are technical systems that are profoundly affected by organizational and human factors. Key topics covered include system dependability, security engineering, and human/organizational factors. The course will examine critical infrastructure systems through topics like resilience engineering and cybersecurity. Assessment includes an exam and a coursework assignment involving requirements specification.
This document summarizes key topics from a lecture on security engineering, including design guidelines for security, design for deployment, and system survivability. The design guidelines encourage basing decisions on an explicit security policy, avoiding single points of failure, and failing securely. Deployment issues like vulnerable defaults and access permissions are addressed. Finally, resilience strategies like resistance, recognition and recovery are discussed to help systems continue operating during attacks.
CS 5032 L12 security testing and dependability cases (2013), Ian Sommerville
The document discusses security validation techniques like experience-based validation using known attacks, tiger teams that simulate attacks, and tool-based validation. It also discusses the importance of having a well-defined development process for safety-critical systems that includes identifying and tracking hazards. Safety and dependability cases collect evidence like hazard analyses, test results, and review reports to argue that a system meets its safety requirements. Structured safety arguments demonstrate that hazardous conditions cannot occur by considering all program paths and showing unsafe conditions cannot be true.
This document summarizes the topics covered in the first lecture of a security engineering course. It discusses security engineering and management, security risk assessment, and designing systems for security. The lecture covers tools and techniques for developing secure systems, assessing security risks, and designing system architectures to protect assets and distribute them for redundancy.
This document discusses security cases, which provide a structured argument and evidence to support the claim that a system is acceptably secure. It focuses on addressing the potential for buffer overflows in code. The argument is that coding practices, code reviews, static analysis, and system testing with invalid inputs provide evidence there are no buffer overflow possibilities in the code. Tool support is needed to manage the large amount of documentation required to build the security case.
The document discusses security engineering design guidelines and system survivability. It covers:
1) Design guidelines that help make secure design decisions and raise security awareness.
2) Guidelines for avoiding single points of failure, failing securely, balancing security and usability, and more.
3) Designing for deployment to minimize vulnerabilities introduced during configuration and installation.
4) Ensuring systems can continue essential services when under attack through resilience and recoverability.
Reliability and security specification (CS 5032, 2012), Ian Sommerville
This document discusses reliability specification and metrics. It describes how to identify types of system failure, estimate costs and consequences, and identify root causes to generate reliability specifications. Types of failures include loss of service, incorrect service, and system/data corruption. Reliability metrics are discussed such as probability of failure on demand, rate of occurrence of failures/mean time to failure, and availability. These metrics provide measurements of system reliability.
The document discusses specifications for dependability and security. It covers topics like risk-driven specification, safety specification, and security specification. It emphasizes that critical systems specification should be risk-driven as risks pose a threat to the system. The risk-driven approach aims to understand risks faced by the system and define requirements to reduce these risks through phased risk analysis including preliminary, life cycle, and operational risk analysis. Safety specification identifies protection requirements to ensure system failures do not cause harm, with risk identification, analysis, and reduction mirroring hazard identification, assessment, and analysis. An example of a safety-critical insulin pump system is provided to illustrate dependability requirements and risk analysis.
This document provides an overview of key topics from Chapter 11 on security and dependability, including:
- The principal dependability properties of availability, reliability, safety, and security.
- Dependability covers attributes like maintainability, repairability, survivability, and error tolerance.
- Dependability is important because system failures can have widespread effects and undependable systems may be rejected.
- Dependability is achieved through techniques like fault avoidance, detection and removal, and building in fault tolerance.
The document discusses critical systems and their dependability requirements. It defines critical systems as those where failure could result in loss of life, environmental damage, or large economic losses. Dependability encompasses availability, reliability, safety, and security. The document uses the example of an insulin pump, a safety-critical system, to illustrate dependability dimensions and how failures could threaten human life. Formal development methods may be required for critical systems due to the high costs of failure.
This document provides an overview of topics covered in a lecture on security engineering. It discusses security engineering and management, security risk assessment, and designing systems for security. It also covers security concerns like confidentiality, integrity and availability. Application security is a software problem while infrastructure security is a systems management problem. Security risk management involves preliminary risk assessment, life cycle risk assessment and operational risk assessment using techniques like misuse cases and asset analysis. The document provides examples of threats, controls, and security requirements.
This document provides an overview of topics in chapter 13 on security engineering. It discusses security and dependability, security dimensions of confidentiality, integrity and availability. It also outlines different security levels including infrastructure, application and operational security. Key aspects of security engineering are discussed such as secure system design, security testing and assurance. Security terminology and examples are provided. The relationship between security and dependability factors like reliability, availability, safety and resilience is examined. The document also covers security in organizations and the role of security policies.
This document summarizes key topics from a lecture on security engineering:
1. It discusses security engineering and management, risk assessment, and designing systems for security. Application security focuses on design while infrastructure security is a management problem.
2. It outlines guidelines for secure system design including basing decisions on security policies, avoiding single points of failure, balancing security and usability, validating all inputs, and designing for deployment and recoverability.
3. It also covers risk management, assessing threats, and designing architectures with layered protection and distributed assets to minimize the effects of attacks.
This document discusses software security engineering. It covers security concepts like assets, vulnerabilities and threats. It discusses why security engineering is important to protect systems from malicious attackers. The document outlines security risk management processes like preliminary risk assessment. It also discusses designing systems for security through architectural choices that provide protection and distributing assets. The document concludes by covering system survivability through building resistance, recognition and recovery capabilities into systems.
The document discusses techniques for achieving dependable software systems. It covers redundancy and diversity approaches including N-version programming where multiple versions of software are developed independently. Dependable system architectures like protection systems and self-monitoring architectures that use redundancy are described. The document emphasizes that a well-defined development process is important for minimizing faults and notes validation activities should include requirements reviews, testing, and change management.
This document summarizes Chapter 12 of a textbook on dependability and security specification. It discusses risk-driven specification, including identifying risks, analyzing risks, and defining requirements to reduce risks. It also covers specifying safety requirements by identifying hazards, assessing hazards, and analyzing hazards to discover root causes. The goal is to specify requirements that ensure systems function dependably and securely without failures causing harm.
This document discusses requirements specification for critical systems. It covers dependability requirements, risk-driven specification, safety specification, security specification, system reliability specification, and non-functional reliability requirements. For risk-driven specification, it describes the stages of risk identification, analysis and classification, decomposition, and risk reduction assessment. It provides examples of applying this process to an insulin pump. For safety specification, it discusses safety requirements, the safety life cycle, and the IEC 61508 standard. For security specification, it outlines a similar process to safety with stages of asset identification, threat analysis, and security requirements specification. It also discusses different types of security requirements.
This document provides a comparative survey of fault handling in field programmable gate arrays (FPGAs) and microcontrollers (MCUs) for safety-critical embedded applications. It discusses how the different hardware platforms lead to fundamental differences in design that influence aspects of safety, reliability, and fault tolerance. Specifically, it examines how the hardware architecture, software design process, verification capabilities, and device technologies of FPGAs and MCUs each impact the ability to avoid or tolerate faults in hardware and software.
The document provides an overview of software engineering, discussing what it is, why it is important, and key concepts like the software development lifecycle, processes, and models. It introduces software engineering as a way to build software in a controlled, predictable manner by giving control over functionality, quality, and resources. It also summarizes several software development process models like waterfall, evolutionary development, and spiral.
This document discusses safety engineering for systems that contain software. It covers topics like safety-critical systems, safety requirements, and safety engineering processes. Safety is defined as a system's ability to operate normally and abnormally without harm. For safety-critical systems like aircraft or medical devices, software is often used for control and monitoring, so software safety is important. Hazard identification, risk assessment, and specifying safety requirements to mitigate risks are key parts of the safety engineering process. The goal is to design systems where failures cannot cause injury, death or environmental damage.
British Midland Flight 92 from Heathrow to Belfast crashed near Kegworth, England in 1989, killing 47 people. A fan blade broke off in the left engine, causing vibration, but the pilots mistakenly shut down the healthy right engine. Because the crew never detected their error, the damaged left engine's complete failure about 20 minutes later left the aircraft without power. The crash resulted from a combination of factors including pilot error, inadequate training, aircraft design issues, and lack of communication between the flight crew and cabin crew.
The document discusses cybersecurity and why a technological approach alone is not sufficient. It argues that cybersecurity is a socio-technical problem, as technology cannot guarantee reliability and human and organizational factors like insider threats, procedures, carelessness, and social engineering present vulnerabilities. A holistic approach is needed across personal, organizational, national, and international levels that includes deterrence, awareness, realistic procedures, monitoring, and cooperation.
CS 5032 L12 security testing and dependability cases 2013Ian Sommerville
The document discusses security validation techniques like experience-based validation using known attacks, tiger teams that simulate attacks, and tool-based validation. It also discusses the importance of having a well-defined development process for safety-critical systems that includes identifying and tracking hazards. Safety and dependability cases collect evidence like hazard analyses, test results, and review reports to argue that a system meets its safety requirements. Structured safety arguments demonstrate that hazardous conditions cannot occur by considering all program paths and showing unsafe conditions cannot be true.
This document summarizes the topics covered in the first lecture of a security engineering course. It discusses security engineering and management, security risk assessment, and designing systems for security. The lecture covers tools and techniques for developing secure systems, assessing security risks, and designing system architectures to protect assets and distribute them for redundancy.
This document discusses security cases, which provide a structured argument and evidence to support the claim that a system is acceptably secure. It focuses on addressing the potential for buffer overflows in code. The argument is that coding practices, code reviews, static analysis, and system testing with invalid inputs provide evidence there are no buffer overflow possibilities in the code. Tool support is needed to manage the large amount of documentation required to build the security case.
The document discusses security engineering design guidelines and system survivability. It covers:
1) Design guidelines that help make secure design decisions and raise security awareness.
2) Guidelines for avoiding single points of failure, failing securely, balancing security and usability, and more.
3) Designing for deployment to minimize vulnerabilities introduced during configuration and installation.
4) Ensuring systems can continue essential services when under attack through resilience and recoverability.
Reliability and security specification (CS 5032 2012)Ian Sommerville
This document discusses reliability specification and metrics. It describes how to identify types of system failure, estimate costs and consequences, and identify root causes to generate reliability specifications. Types of failures include loss of service, incorrect service, and system/data corruption. Reliability metrics are discussed such as probability of failure on demand, rate of occurrence of failures/mean time to failure, and availability. These metrics provide measurements of system reliability.
The document discusses specifications for dependability and security. It covers topics like risk-driven specification, safety specification, and security specification. It emphasizes that critical systems specification should be risk-driven as risks pose a threat to the system. The risk-driven approach aims to understand risks faced by the system and define requirements to reduce these risks through phased risk analysis including preliminary, life cycle, and operational risk analysis. Safety specification identifies protection requirements to ensure system failures do not cause harm, with risk identification, analysis, and reduction mirroring hazard identification, assessment, and analysis. An example of a safety-critical insulin pump system is provided to illustrate dependability requirements and risk analysis.
This document provides an overview of key topics from Chapter 11 on security and dependability, including:
- The principal dependability properties of availability, reliability, safety, and security.
- Dependability covers attributes like maintainability, repairability, survivability, and error tolerance.
- Dependability is important because system failures can have widespread effects and undependable systems may be rejected.
- Dependability is achieved through techniques like fault avoidance, detection and removal, and building in fault tolerance.
Norman Patch and
Remediation Advanced
provides:
• Rapid, accurate and secure
patch management
• Automated collection, analysis
and delivery of patches
• Security for your organization
from worms, trojans, viruses and other malicious threats
• Single consolidated solution
for heterogeneous environments
provides effective management
at a significantly reduced TCO
The document discusses critical systems and their dependability requirements. It defines critical systems as those where failure could result in loss of life, environmental damage, or large economic losses. Dependability encompasses availability, reliability, safety, and security. The document uses the example of an insulin pump, a safety-critical system, to illustrate dependability dimensions and how failures could threaten human life. Formal development methods may be required for critical systems due to the high costs of failure.
This document provides an overview of topics covered in a lecture on security engineering. It discusses security engineering and management, security risk assessment, and designing systems for security. It also covers security concerns like confidentiality, integrity and availability. Application security is a software problem while infrastructure security is a systems management problem. Security risk management involves preliminary risk assessment, life cycle risk assessment and operational risk assessment using techniques like misuse cases and asset analysis. The document provides examples of threats, controls, and security requirements.
This document provides an overview of topics in chapter 13 on security engineering. It discusses security and dependability, security dimensions of confidentiality, integrity and availability. It also outlines different security levels including infrastructure, application and operational security. Key aspects of security engineering are discussed such as secure system design, security testing and assurance. Security terminology and examples are provided. The relationship between security and dependability factors like reliability, availability, safety and resilience is examined. The document also covers security in organizations and the role of security policies.
This document summarizes key topics from a lecture on security engineering:
1. It discusses security engineering and management, risk assessment, and designing systems for security. Application security focuses on design while infrastructure security is a management problem.
2. It outlines guidelines for secure system design including basing decisions on security policies, avoiding single points of failure, balancing security and usability, validating all inputs, and designing for deployment and recoverability.
3. It also covers risk management, assessing threats, and designing architectures with layered protection and distributed assets to minimize the effects of attacks.
This document discusses software security engineering. It covers security concepts like assets, vulnerabilities and threats. It discusses why security engineering is important to protect systems from malicious attackers. The document outlines security risk management processes like preliminary risk assessment. It also discusses designing systems for security through architectural choices that provide protection and distributing assets. The document concludes by covering system survivability through building resistance, recognition and recovery capabilities into systems.
The document discusses techniques for achieving dependable software systems. It covers redundancy and diversity approaches including N-version programming where multiple versions of software are developed independently. Dependable system architectures like protection systems and self-monitoring architectures that use redundancy are described. The document emphasizes that a well-defined development process is important for minimizing faults and notes validation activities should include requirements reviews, testing, and change management.
This document summarizes Chapter 12 of a textbook on dependability and security specification. It discusses risk-driven specification, including identifying risks, analyzing risks, and defining requirements to reduce risks. It also covers specifying safety requirements by identifying hazards, assessing hazards, and analyzing hazards to discover root causes. The goal is to specify requirements that ensure systems function dependably and securely without failures causing harm.
This document discusses requirements specification for critical systems. It covers dependability requirements, risk-driven specification, safety specification, security specification, system reliability specification, and non-functional reliability requirements. For risk-driven specification, it describes the stages of risk identification, analysis and classification, decomposition, and risk reduction assessment. It provides examples of applying this process to an insulin pump. For safety specification, it discusses safety requirements, the safety life cycle, and the IEC 61508 standard. For security specification, it outlines a similar process to safety with stages of asset identification, threat analysis, and security requirements specification. It also discusses different types of security requirements.
This document provides a comparative survey of fault handling in field programmable gate arrays (FPGAs) and microcontrollers (MCUs) for safety-critical embedded applications. It discusses how the different hardware platforms lead to fundamental differences in design that influence aspects of safety, reliability, and fault tolerance. Specifically, it examines how the hardware architecture, software design process, verification capabilities, and device technologies of FPGAs and MCUs each impact the ability to avoid or tolerate faults in hardware and software.
The document provides an overview of software engineering, discussing what it is, why it is important, and key concepts like the software development lifecycle, processes, and models. It introduces software engineering as a way to build software in a controlled, predictable manner by giving control over functionality, quality, and resources. It also summarizes several software development process models like waterfall, evolutionary development, and spiral.
This document discusses safety engineering for systems that contain software. It covers topics like safety-critical systems, safety requirements, and safety engineering processes. Safety is defined as a system's ability to operate normally and abnormally without harm. For safety-critical systems like aircraft or medical devices, software is often used for control and monitoring, so software safety is important. Hazard identification, risk assessment, and specifying safety requirements to mitigate risks are key parts of the safety engineering process. The goal is to design systems where failures cannot cause injury, death or environmental damage.
British Midland Flight 92 from Heathrow to Belfast crashed near Kegworth, England in 1989, killing 47 people. A fan blade broke off in the left engine, causing vibration, but the pilots mistakenly shut down the healthy right engine. The damaged left engine failed completely about 20 minutes later, by which time the initial error had still not been detected. The crash resulted from a combination of factors, including pilot error, inadequate training, aircraft design issues, and lack of communication between the flight crew and cabin crew.
The document discusses cybersecurity and why a technological approach alone is not sufficient. It argues that cybersecurity is a socio-technical problem, as technology cannot guarantee reliability and human and organizational factors like insider threats, procedures, carelessness, and social engineering present vulnerabilities. A holistic approach is needed across personal, organizational, national, and international levels that includes deterrence, awareness, realistic procedures, monitoring, and cooperation.
1) SCADA systems are used to monitor and control critical infrastructure through networks of sensors and programmable logic controllers.
2) These systems were traditionally isolated but are now increasingly connected to external networks, making them vulnerable to attacks.
3) Common vulnerabilities of SCADA systems include weak passwords, unencrypted network traffic, and lack of input validation. Improving SCADA security is challenging due to the operational needs of control systems and lack of security experience among operators.
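The lack of input validation mentioned above can be made concrete with a small sketch. This is a hypothetical example, not code from any real SCADA product: the command names and parameter ranges are illustrative only, showing a whitelist check applied before an operator command is forwarded to a controller.

```python
# Hypothetical whitelist validation of operator commands before they reach
# a PLC. Command names and parameter ranges are illustrative, not real.
ALLOWED_COMMANDS = {
    "start_pump": {"pump_id": range(1, 9)},
    "stop_pump": {"pump_id": range(1, 9)},
    "set_level": {"pump_id": range(1, 9), "level": range(0, 101)},
}

def validate_command(name, **params):
    """Reject any command, parameter name, or value not on the whitelist."""
    spec = ALLOWED_COMMANDS.get(name)
    if spec is None:
        raise ValueError(f"unknown command: {name!r}")
    if set(params) != set(spec):
        raise ValueError(f"wrong parameters for {name!r}: {sorted(params)}")
    for key, allowed in spec.items():
        if params[key] not in allowed:
            raise ValueError(f"{key}={params[key]} out of range for {name!r}")
    return True
```

Rejecting everything not explicitly allowed (rather than blocking known-bad inputs) is the usual recommendation for control-system front ends, since the set of legitimate commands is small and fixed.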
The Maroochy SCADA attack in 2000 involved an insider, Vitek Boden, who had worked for the company that installed the sewage pumping control system for Maroochy, Australia. After leaving the company and being denied a job with the local council, Boden stole hardware and software from his previous employer to launch a revenge attack. Over several months, he remotely hacked into the SCADA system controlling the sewage pumps and caused pumps to fail, releasing over 1 million liters of untreated sewage. The attack highlighted security issues with the system's insecure radio communications and lack of monitoring. Boden was later convicted and jailed for his role in the incident.
Cybersecurity involves protecting individuals, businesses, and governments from cyber threats on computers and the internet. It is a broad field that includes threat analysis, security technologies, policies and laws. Cybersecurity problems stem from technical issues as well as human and organizational factors. It aims to prevent malicious cyber attacks and accidental damage. Attacks can come from inside or outside an organization and include fraud, spying, stalking, assault, and warfare between nations. The scale of the problem is large but difficult to measure fully. Cybersecurity issues have arisen because the internet was not designed with security in mind and prioritizes convenience, while widespread connectivity has increased risks.
The document discusses how success and failure in sociotechnical systems are subjective and depend on the perspective of the observer. It notes that sociotechnical systems are non-deterministic due to human factors and system changes over time. Different stakeholders may have conflicting views of what constitutes success or failure based on their goals. The document concludes that failures are normal and inevitable in complex systems due to technical limitations and differing stakeholder perspectives.
The document discusses the characteristics of critical systems. Critical systems must have high dependability, meaning they must be reliable, available, safe, and secure. They are often systems of systems that rely on external infrastructure and other systems. Critical systems are important to society and can include safety-critical systems, mission-critical systems, business-critical systems, and infrastructure systems. Dependability is the most important attribute of critical systems.
System dependability is a composite system property that reflects the degree of trust users have in a system. It is determined by availability, reliability, safety, and security. Dependability is subjective as it depends on user expectations - a system deemed dependable by one user may be seen as unreliable by another if it does not meet their expectations. Formal specifications of dependability do not always capture real user experiences.
Critical systems engineering focuses on developing dependable and secure systems for applications where failure could result in loss of life, injury, or significant economic damage. These systems often require certification by regulators to demonstrate compliance with safety and dependability standards before use. The development process for critical systems uses rigorous, plan-driven methodologies and techniques like formal methods, fault detection tools, and redundancy to reduce risk of failure, which may involve higher costs than for other systems due to the potentially severe consequences of errors.
The document discusses emergent properties in socio-technical systems. It defines emergent properties as properties of a system as a whole that arise from the relationships and interactions between components, rather than properties of the individual components. Emergent properties only emerge once a system is integrated and can include functional properties like a bicycle's ability to transport, as well as non-functional properties like reliability, security, and volume. Reliability in particular is an emergent property, as failures can occur due to unforeseen interactions between components. The document uses examples to illustrate different types of emergent properties.
Socio-technical systems include both technical and human elements. They are made up of interconnected layers from equipment and software to business processes and societal rules. Properties emerge from the interactions between these layers, including reliability, security, and usability. Whether a socio-technical system is considered a success or failure depends on perspective, as stakeholders have differing views and system behavior is non-deterministic due to human factors. Failures are also inevitable given the complexity of relationships in socio-technical systems.
This document discusses availability and reliability in systems. Availability is defined as the probability that a system will be operational to deliver requested services, while reliability is the probability of failure-free operation over time. Both can be expressed as percentages. Availability takes into account repair times, whereas reliability does not. Faults can lead to errors and failures if not addressed through techniques like fault avoidance, detection, and tolerance.
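The distinction between availability (which accounts for repair time) and reliability (which does not) can be sketched with the standard MTBF/MTTR formulation. This is a minimal illustration assuming a constant failure rate, which is a modelling simplification rather than anything stated in the slides.

```python
import math

def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: the fraction of time the system is
    operational, which includes the effect of repair time (MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def reliability(t_hours, mtbf_hours):
    """Probability of failure-free operation for t hours, assuming a
    constant failure rate (exponential model); repair time plays no part."""
    return math.exp(-t_hours / mtbf_hours)
```

For example, a system with an MTBF of 1000 hours and an MTTR of 1 hour is about 99.9% available, yet its probability of running 100 hours without any failure is only about 90%: the two attributes measure different things.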
This document discusses sociotechnical systems and introduces their key concepts. It defines a system as a purposeful collection of interrelated components working towards a common goal. Sociotechnical systems specifically include both technical systems (e.g. software, hardware) as well as the operational processes and people interacting with the technical systems. These systems have a layered "stack" structure with different levels including equipment, operating systems, applications, business processes, organizations, and broader society. Changes at one level can ripple through other levels due to interdependencies between layers. Achieving dependability requires containing failures within layers and understanding how adjacent layers may be affected.
The document discusses system security and defines key related terms. System security is the ability of a system to protect itself from accidental or deliberate attacks. It is essential for availability, reliability, and safety as most systems are networked. Without proper security, systems are vulnerable to damage like denial of service, data corruption, and disclosure of confidential information. Security can be achieved through strategies such as avoiding vulnerabilities, detecting and eliminating attacks, and limiting exposure and enabling recovery from successful attacks.
This document discusses dependability engineering and techniques for achieving dependable software systems. It covers fault avoidance, fault detection, and fault tolerance. Critical systems often use redundancy, diversity, and regulated development processes to meet high dependability requirements. Dependable architectures and protection systems can provide fault tolerance to prevent failures from causing outages or emergencies.
The document discusses techniques for achieving dependable software systems through fault tolerance. It describes hardware-based triple modular redundancy and software-based N-version programming. It also discusses challenges with achieving true design diversity and problems that can arise from specification errors. The document concludes with guidelines for dependable programming practices such as limiting information visibility, checking inputs, using exception handling, avoiding error-prone constructs, including restart capabilities and timeouts.
Quality attributes in software architecture, by Gang Tao
This document discusses various quality attributes in software architecture including responsiveness, scalability, usability, security, accessibility, serviceability, extensibility, distributability, maintainability, portability, reliability, testability, and compatibility. For each attribute, it provides definitions and considerations for how to achieve that attribute in architecture and design. It also discusses relationships between attributes and references quality models for evaluating software.
McCall proposed a model in 1977 to measure software quality based on quality factors related to software requirements. The model breaks quality factors into three main categories: product operation factors, product revision factors, and product transition factors. Alternative models by Evans/Marciniak and Deutsch/Willis also use these factors to evaluate software quality.
This document outlines evaluation criteria for assessing the quality of software. It discusses competencies such as suitability, maintainability, usability, and dependability. Suitability refers to the main features and core functions of software for a specific application or industry. Maintainability involves the modularity, portability, interoperability, and testability of software code. Usability pertains to documentation of software operation, installation procedures, testing methods, and the intuitiveness of learning how to use the software. Dependability relates to the reliability, scale, and security of software operations. Testing is important for both dependability and suitability in verifying functionality and integration.
The document outlines software testing best practices organized into groups:
- Basic practices include writing functional specifications, code reviews, test criteria, and automated test execution.
- Foundational Practices involve user scenarios, usability testing, and feedback loops.
- Incremental Practices focus on close collaboration between testers and developers, code coverage, test automation, and testing for quick releases.
This document discusses techniques for engineering dependable software systems. It covers topics like redundancy and diversity, dependable processes, and dependable system architectures. Redundancy and diversity are fundamental approaches to achieving fault tolerance. Dependable processes use well-defined and repeatable development practices to minimize faults. Dependable system architectures, like protection systems and self-monitoring architectures, are designed to tolerate faults through techniques such as voting across redundant or diverse versions of the system.
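The voting scheme described above can be sketched as follows. The "versions" here are hypothetical stand-ins for independently developed implementations; in a real N-version system each would be built by a separate team from the same specification.

```python
from collections import Counter

def majority_vote(outputs):
    """Return the value produced by a strict majority of the redundant
    versions, or raise if the versions disagree with no majority."""
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise RuntimeError(f"no majority among outputs: {outputs}")
    return value

def run_n_versions(versions, x):
    """Run each independently developed version on the same input and
    vote on the results (N-version programming)."""
    return majority_vote([version(x) for version in versions])
```

With three versions, one faulty implementation is outvoted by the two correct ones; this is the software analogue of hardware triple modular redundancy, and it is why common-mode errors across versions (discussed above) are so damaging to the approach.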
This document outlines the challenges in adopting mobility and how to address them. It identifies key stakeholders impacted - CXOs, IT/Admin, and End Users. For each stakeholder, it lists challenges they may face such as data security, ROI justification, device fragmentation, and ease of use. It then provides recommendations on how to approach each challenge, such as ensuring security through encryption, taking a middleware approach to integration, and focusing on usability to improve the user experience.
Cloud-native testing is a quality assurance practice designed specifically for cloud-native applications, which are built on distributed-system patterns such as microservices that change how capacity and behavior must be tested.
The document discusses leveraging DevOps practices to improve mainframe application delivery. It describes how traditional mainframe development and testing causes delays due to shared, restricted resources and inefficient processes. The solution presented uses DevOps tools and practices like continuous integration/delivery, dependency virtualization, and automated quality testing to enable more efficient mainframe application development and testing. This allows development and operations teams to work in parallel, validate code quality earlier, and deploy applications more frequently.
DigiCert supports the concept of certificate transparency which provides insight into issued SSL certificates and better remediation. Certificate transparency logs all certificates issued and enables fast detection of mis-issued certificates. While DigiCert has already implemented certificate transparency, questions remain regarding log security, privacy, and exemptions that require discussion before industry-wide adoption. Overall, certificate transparency enhances security without introducing unintended consequences.
1) Legacy systems are older software systems that have been in use for a long time and are often critical to business operations.
2) They were often developed years ago using outdated technologies and have evolved over many years of customizations and changes.
3) Replacing legacy systems carries significant business risks due to a lack of complete documentation and embedded business rules not formally documented elsewhere. Changing legacy systems can also be very expensive.
This document discusses how to bake quality into an agile scrum model. It covers quality driven by scrum practices like short iterations and frequent course corrections. It also discusses quality of requirements, architecture/design, code, verification/testing and maturing the definition of done. Automated testing, code reviews, continuous integration and refactoring are recommended to ensure code quality. Quality is baked in through quality user stories, engineering standards/best practices, exploratory testing and peer reviews.
The document discusses various software development life cycle (SDLC) models, including:
- The waterfall model, which uses sequential phases of requirements, design, coding, testing, and deployment. It is structured but rigid.
- Iterative development models, which allow for feedback loops and releasing partial software in iterations to get faster feedback.
- Agile methodologies like Scrum, which embrace changing requirements, focus on working software over documentation, and value customer collaboration over contracts. Key aspects are iterative development, regular refactoring, and communicating for learning.
- Pitfalls of agile include skill gaps, lack of traceability, poor communication, and not staying close enough to customers. Overall, agile aims to deliver working software early and adapt quickly to change.
This document discusses techniques for engineering dependable software systems. It covers redundancy and diversity approaches to achieve fault tolerance. Dependable systems are achieved through fault avoidance, detection, and tolerance. Critical systems often use regulated processes and dependable architectures like protection systems, self-monitoring architectures, and N-version programming which involve redundant and diverse components to continue operating despite failures. The document gives examples of how these techniques are applied in systems like aircraft flight control to maximize availability.
The document discusses various aspects of software processes and life cycles. It describes three types of reusable software components: web services, object collections, and stand-alone systems. It also outlines common phases in a software life cycle like requirements analysis, design, implementation, testing, deployment, and maintenance. Incremental delivery approaches are discussed where early increments are delivered to customers.
Automated acceptance testing is an important part of the deployment pipeline. It tests that the application meets business requirements and provides value to users. Creating maintainable acceptance test suites involves deriving tests from acceptance criteria, layering the tests, and avoiding direct coupling to the GUI. Non-functional requirements like performance and capacity also need to be tested. The deployment process should be automated and standardized across environments using techniques like blue-green deployment and canary releases to allow rolling back changes if needed.
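The advice about layering acceptance tests and avoiding direct coupling to the GUI can be sketched with a driver-object pattern. Everything here is hypothetical, the application and its driver are illustrative: the test expresses a business behavior and talks only to the driver, so GUI changes cannot break it.

```python
# Hypothetical sketch of layered acceptance testing: the test talks to an
# application driver over the service layer, never to GUI widgets.
class OrderAppDriver:
    """Thin driver over an illustrative ordering application's service layer."""

    def __init__(self):
        self._orders = []

    def place_order(self, item, qty):
        if qty <= 0:
            raise ValueError("quantity must be positive")
        self._orders.append((item, qty))
        return len(self._orders)  # order id

    def order_count(self):
        return len(self._orders)

def test_placing_an_order_increases_the_order_count():
    app = OrderAppDriver()
    order_id = app.place_order("widget", 3)
    assert order_id == 1
    assert app.order_count() == 1
```

Because the acceptance criterion ("placing an order increases the order count") is phrased against the driver rather than against buttons and fields, the same test survives a GUI redesign; only the driver implementation needs updating.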
Similar to CS 5032 L7 dependability engineering 2013
The document discusses ultra large scale systems (ULSS) and introduces key points from an SEI report on ULSS. It defines ULSS as interconnected webs of software, people, policies and economics at an internet scale. The scale of ULSS undermines traditional software engineering approaches. Some challenges in developing ULSS include design/evolution, orchestration, monitoring, organizational integration, and regulation/control. New interdisciplinary research is needed to address issues arising from increased system scale.
The document discusses responsibility modeling for socio-technical systems. It describes how responsibility models can be used to identify vulnerabilities in systems by clarifying agent responsibilities and information needs. The document presents examples of responsibility models for emergency response coordination and uses HAZOP analysis to identify potential failures based on deviations to information resources. The goal of responsibility modeling is to improve system dependability by facilitating analysis of social and organizational factors.
The document discusses approaches to system failure and recovery in large, complex information systems. It argues that a conventional view of failure as deviation from specifications is not adequate, as failures are subjective judgments that depend on context. Failures are also inevitable due to system complexity and stakeholder conflicts. Instead of seeking to avoid all failures, systems should be designed to support recognition of problems when they occur and enable recovery actions. Guidelines for designing systems to enhance recovery include supporting local knowledge, flexible reconfiguration of processes when needed, and redundancy through data replication and alternative records. The goal is to design systems that make recovery from inevitable failures easier rather than focusing solely on preventing failures.
This document discusses the challenges of engineering large, socio-technical systems where there are significant unknown factors (LSCITS). It argues that the traditional reductionist approach to engineering breaks down for these systems. LSCITS engineering must account for scale, uncertainty, and the entanglement of social and technical aspects. Some key challenges identified include managing large scale systems, dealing with uncertainty, reasoning about complex systems, integrating different systems, and developing standards for LSCITS. The document advocates an approach that tempers reductionism with pragmatism and recognition of real-world constraints like imperfect people and organizations.
This document discusses the realities of requirements engineering processes compared to formal textbook models. It notes that real RE processes are often ad hoc and vary significantly based on factors like the type of system, customer, developer culture, and deployment environment. Formal processes do not account for human, social, and political factors that influence requirements. The document emphasizes that stakeholders, customers, and environments are diverse and changing, so requirements engineering must adapt to these realities rather than following rigid predefined processes.
This document discusses dependability requirements engineering for safety-critical information systems. It introduces the concepts of system dependability, dependability requirement types, and the need for integrated requirements engineering. Dependability requirements should be considered alongside other business requirements. The document uses an exemplar healthcare records management system to illustrate how concerns can help identify safety requirements and requirements conflicts. Concerns reflect organizational goals and help bridge the gap between goals and system requirements.
The document discusses the conceptual design process for a large-scale complex IT system (LSCITS) for road pricing. It covers the key activities in conceptual design including concept formulation, problem understanding, requirements engineering, feasibility studies, and architectural design. Specifically, it provides examples for a proposed road pricing system, describing the problem it aims to solve, potential high-level requirements and constraints, different design options considered in the feasibility study, and examples of technologies that could enable each option.
This document provides an introduction to large-scale complex IT systems (LSCITS). It defines LSCITS and distinguishes them from other types of systems. Key points made include:
- An LSCITS is a large-scale IT system where there are significant unknown factors in its development, use, and operating environments that introduce complexity.
- LSCITS are often part of larger socio-technical systems and there are complex relationships between the technical and social aspects.
- Examples given of LSCITS include digital music systems and national identity management systems.
In November 1988, a student at Cornell University deliberately released a program called the Internet Worm that exploited security vulnerabilities in Unix systems and spread across computers connected to the Internet. The worm did not cause direct damage but its replication overloaded systems and severely disrupted network services. It took system administrators several days to devise and implement modifications to vulnerable programs to stop the spread of the worm and clear infected machines. This was the first widely distributed Internet security threat and highlighted the importance of securing systems and patching vulnerabilities.
This document discusses sociotechnical issues that arose in the design of a national digital learning system intended for use by more than a million students and their teachers.
The Ariane 5 rocket failed on its maiden flight in 1996, 37 seconds after launch, due to a software failure. An attempt to convert a 64-bit floating-point value to a 16-bit integer caused an overflow, crashing the inertial reference system and its identical backup. The failed software had been reused from the Ariane 4 without being retested or reviewed for the new rocket's flight profile, making the loss avoidable.
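The failure mode, a narrowing conversion that silently wraps, can be demonstrated in a few lines. This is an illustrative sketch, not the original Ada code: `ctypes` is used here only to reproduce 16-bit wraparound, and the checked variant shows the guard that was omitted.

```python
import ctypes

def to_int16_unchecked(value):
    """Reinterpret as a 16-bit signed integer -- silently wraps, analogous
    to the unprotected conversion in the Ariane 5 alignment software."""
    return ctypes.c_int16(value).value

def to_int16_checked(value):
    """Guarded conversion: fail loudly instead of producing garbage."""
    if not -32768 <= value <= 32767:
        raise OverflowError(f"{value} does not fit in 16 bits")
    return value
```

A value like 70000 wraps to 4464 in the unchecked version, a plausible-looking but meaningless number, which is exactly the kind of silent corruption that an explicit range check (or retesting reused code against the new flight profile) would have caught.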
Critical infrastructure refers to the essential systems and services in modern societies that support public services, the economy, and national security. These include systems for power, water, transportation, communications, health care, finance, and emergency services. Modern critical infrastructure is controlled and managed by interconnected computer systems, so these digital systems are also considered critical infrastructure. Critical infrastructure is characterized as being large in scale, complex with many interdependencies, reliant on standards, and long-lasting. The availability of critical infrastructure is essential for society, and the unavailability of critical systems could have significant human, economic, and social consequences.
2. Software dependability
• Software customers expect all software to be dependable.
  However, for non-critical applications, they may be willing
  to accept some system failures.
• Some critical systems have very high dependability
  requirements and special software engineering techniques
  may be used to achieve this.
  – Medical systems
  – Telecommunications and power systems
  – Aerospace systems
Dependability engineering 1, 2013 Slide 2
3. Dependability achievement
• Fault avoidance
– The system is developed in such a way that human error is
avoided and thus system faults are minimised.
– The development process is organised so that faults in the
system are detected and repaired before delivery to the
customer.
• Fault detection
– Verification and validation techniques are used to discover
and remove faults in a system before it is deployed.
• Fault tolerance
– The system is designed so that faults in the delivered
software do not result in system failure.
4. Regulated systems
• Many critical systems are regulated systems, which
means that their use must be approved by an
external regulator before the systems go into service.
– Nuclear systems
– Air traffic control systems
– Medical devices
• A safety and dependability case has to be approved
by the regulator. Therefore, critical systems
development has to create the evidence to convince
a regulator that the system is dependable, safe and
secure.
5. The increasing costs of residual
fault removal
6. Diversity and redundancy
• Redundancy
– Keep more than 1 version of a critical
component available so that if one
fails then a backup is available.
• Diversity
– Provide the same functionality in
different ways so that they will not fail
in the same way.
• However, diversity adds
complexity – more chance of
errors.
• Some engineers advocate
simplicity and extensive V & V
rather than redundancy.
7. Diversity and redundancy
examples
• Redundancy. Where availability
is critical (e.g. in e-commerce
systems), companies normally
keep backup servers and switch
to these automatically if failure
occurs.
• Diversity. To provide resilience
against external attacks, different
servers may be implemented
using different operating systems
(e.g. Windows and Linux)
8. Dependable processes
• To ensure a minimal number of software faults, it is
important to have a well-defined, repeatable
software process.
• A well-defined repeatable process is one that does
not depend entirely on individual skills; rather can be
enacted by different people.
• Regulators use information about the process to
check if good software engineering practice has
been used.
• For fault detection, it is clear that the process
activities should include significant effort devoted to
verification and validation.
9. Attributes of dependable
processes
Process characteristic – Description
Documentable – The process should have a defined process
model that sets out the activities in the process and the
documentation that is to be produced during these activities.
Standardized – A comprehensive set of software development
standards covering software production and documentation
should be available.
Auditable – The process should be understandable by people
apart from process participants, who can check that process
standards are being followed and make suggestions for
process improvement.
Diverse – The process should include redundant and diverse
verification and validation activities.
Robust – The process should be able to recover from failures
of individual process activities.
10. Process diversity and
redundancy
• Process activities, such as validation, should not
depend on a single approach, such as testing, to
validate the system
• Rather, multiple different process activities that
complement each other and allow for cross-checking
help to avoid process errors, which may lead to errors
in the software
(Complementary V & V activities: reviews, automated
analysis, testing)
12. Dependable programming
• Good programming practices can be adopted that
help reduce the incidence of program faults.
• These programming practices support
– Fault avoidance
– Fault detection
– Fault tolerance
13. Good practice guidelines for
dependable programming
Dependable programming guidelines
1. Limit the visibility of information in a program
2. Check all inputs for validity
3. Provide a handler for all exceptions
4. Minimize the use of error-prone constructs
5. Provide restart capabilities
6. Check array bounds
7. Include timeouts when calling external components
8. Name all constants that represent real-world values
14. Control the visibility of
information in a program
• Program components should only be allowed access to data that
they need for their implementation. This means that accidental
corruption of parts of the program state by these components is
impossible.
[Diagram: specification (visible) and data structure (private)]
• You can control visibility by using abstract data types where the
data representation is private and you only allow access to the
data through predefined operations such as get () and put ().
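A minimal Python sketch of this idea, using the `get`/`put` operations named on the slide (the `Store` class itself is hypothetical): the dictionary holding the state is private, so other components can only reach it through the predefined operations.

```python
class Store:
    """Abstract data type: the data representation is private and
    access is only possible through predefined operations."""

    def __init__(self):
        self.__data = {}          # name-mangled: not reachable as store.__data

    def put(self, key, value):
        self.__data[key] = value  # the only way to modify the state

    def get(self, key):
        return self.__data[key]   # the only way to read the state

store = Store()
store.put("speed", 30)
print(store.get("speed"))         # → 30
```

Because the representation is hidden, it can later be changed (say, to a database-backed store) without touching any component that uses `get` and `put`.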
15. Check all inputs for validity
[Diagram: Input → Processing, versus Input → Check → Processing]
Many programs behave unpredictably when presented
with unusual inputs and, sometimes, these are threats
to the security of the system.
16. Validity checks
• Range checks
– Check that the input falls within a known range.
• Size checks
– Check that the input does not exceed some maximum size
e.g. 40 characters for a name.
• Representation checks
– Check that the input does not include characters that should
not be part of its representation e.g. names do not include
numerals.
• Reasonableness checks
– Use information about the input to check if it is reasonable
rather than an extreme value.
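The four checks above can be sketched in Python as follows. The 40-character size limit comes from the slide; the function names and the age range used for the reasonableness check are illustrative assumptions.

```python
def valid_name(name):
    """Combine presence, size and representation checks (illustrative)."""
    if not isinstance(name, str) or not name:   # presence/type check
        return False
    if len(name) > 40:                          # size check (limit from slide)
        return False
    if any(ch.isdigit() for ch in name):        # representation: no numerals
        return False
    return True

def reasonable_age(age):
    """Reasonableness check: reject extreme values using domain knowledge."""
    return 0 <= age <= 120                      # assumed plausible range

print(valid_name("Ada Lovelace"))   # → True
print(valid_name("Agent 007"))      # → False (contains numerals)
print(reasonable_age(200))          # → False (extreme value)
```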
17. Provide a handler for all
exceptions
• A program exception is an error or some unexpected
event such as a power failure.
• Exception handling constructs allow for such events
to be handled without the need for continual status
checking to detect exceptions.
• Using normal control constructs to detect exceptions
needs many additional statements to be added to
the program. This adds a significant overhead and is
potentially error-prone.
19. Exception handling
• Three possible exception handling strategies
– Signal to a calling component that an exception has occurred
and provide information about the type of exception.
– Carry out some alternative processing to the processing
where the exception occurred. This is only possible where
the exception handler has enough information to recover
from the problem that has arisen.
– Pass control to a run-time support system to handle the
exception.
• Exception handling is a mechanism to provide some
fault tolerance
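The three strategies can be sketched in Python (the function names and the `timeout=30` default are hypothetical examples, not from the slides):

```python
def parse_port(text):
    """Strategy 1: signal the exception to the caller, adding
    information about the type of problem."""
    try:
        return int(text)
    except ValueError as exc:
        raise ValueError(f"invalid port number: {text!r}") from exc

def read_config(path):
    """Strategy 2: alternative processing - this handler knows enough
    to recover, so it falls back to a built-in default."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return "timeout=30"        # safe default configuration

# Strategy 3 is simply not catching the exception at all, so control
# passes to the run-time system (which, in Python, terminates the
# program with a traceback).
```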
20. Minimize the use of error-prone
constructs
• Program faults are usually a consequence of human
error because programmers lose track of the
relationships between the different parts of the
system
• This is exacerbated by error-prone constructs in
programming languages that are inherently complex
or that don’t check for mistakes when they could do
so.
• Therefore, when programming, you should try to
avoid or at least minimize the use of these error-
prone constructs.
21. Error-prone constructs
• Unconditional branch (goto) statements
• Floating-point numbers
– Inherently imprecise. The imprecision may lead to invalid
comparisons.
• Pointers
– Pointers referring to the wrong memory areas can corrupt
data. Aliasing can make programs difficult to understand
and change.
• Dynamic memory allocation
– Run-time allocation can cause memory overflow.
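The floating-point point is easy to demonstrate: direct equality comparison of floats is invalid, so a tolerance-based comparison should be used instead.

```python
import math

# Floating-point numbers are inherently imprecise, so direct
# equality comparisons can give invalid results:
total = 0.1 + 0.2
print(total == 0.3)                # → False (total is 0.30000000000000004)
print(math.isclose(total, 0.3))    # → True: compare within a tolerance
```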
22. Error-prone constructs
• Parallelism
– Can result in subtle timing errors because of unforeseen
interaction between parallel processes.
• Recursion
– Errors in recursion can cause memory overflow as the
program stack fills up.
• Interrupts
– Interrupts can cause a critical operation to be terminated
and make a program difficult to understand.
• Inheritance
– Code is not localised. This can result in unexpected
behaviour when changes are made and problems of
understanding the code.
23. Error-prone constructs
• Aliasing
– Using more than 1 name to refer to the same state variable.
• Unbounded arrays
– Buffer overflow failures can occur if no bound checking on
arrays.
• Default input processing
– An input action that occurs irrespective of the input.
– This can cause problems if the default action is to transfer
control elsewhere in the program. An incorrect or deliberately
malicious input can then trigger a program failure.
24. Provide restart capabilities
• For systems that involve long transactions or user
interactions, you should always provide a restart
capability that allows the system to restart after failure
without users having to redo everything that they
have done.
• Restart depends on the type of system
– Keep copies of forms so that users don’t have to fill them in
again if there is a problem
– Save state periodically and restart from the saved state
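The "save state periodically" approach can be sketched with a simple checkpoint file (the file name, function names, and state shape are illustrative assumptions):

```python
import json
import os

STATE_FILE = "checkpoint.json"     # hypothetical checkpoint location

def save_state(state):
    """Save state periodically so a restart need not redo everything."""
    with open(STATE_FILE, "w") as f:
        json.dump(state, f)

def restore_state():
    """Restart from the saved state if one exists, else start fresh."""
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f)
    return {"records_processed": 0}   # initial state

save_state({"records_processed": 42})
print(restore_state())             # → {'records_processed': 42}
```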
25. Check array bounds
• In some programming languages, such as C, it is possible to
address a memory location outside of the range allowed for in
an array declaration.
• This leads to the well-known buffer overflow vulnerability,
where attackers write executable code into memory by
deliberately writing beyond the top element in an array.
• If your language does not include bounds checking, you should
therefore always check that an array access is within the
bounds of the array.
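Even in a language that does check bounds, an explicit check can still be useful. A small sketch (the `safe_get` helper is hypothetical): Python raises `IndexError` for overruns, but silently accepts negative indices (`items[-1]` is the last element), which can mask off-by-one errors, so both ends are checked here.

```python
def safe_get(items, index):
    """Bounds-checked access that rejects out-of-range AND negative
    indices, rather than relying on Python's negative-index wrap-around."""
    if not 0 <= index < len(items):
        raise IndexError(
            f"index {index} out of bounds for length {len(items)}")
    return items[index]

print(safe_get([10, 20, 30], 2))   # → 30
```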
26. Include timeouts when calling
external components
• In a distributed system, failure of a remote
computer can be ‘silent’, so that programs
expecting a service from that computer
may never receive that service or any
indication that there has been a failure.
• To avoid this, you should always include
timeouts on all calls to external
components.
• After a defined time period has elapsed
without a response, your system should
then assume failure and take whatever
actions are required to recover from this.
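A sketch of this pattern in Python, using a thread pool to impose a timeout on a call. The "remote service" here is just a slow local function standing in for a silently failed remote host; the function names and fallback value are illustrative assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def call_with_timeout(timeout_s):
    """Wrap a (simulated) external call with a timeout; after the
    period elapses with no response, assume failure and recover."""
    def remote_service():
        time.sleep(0.5)            # simulates a silent remote failure
        return "reply"

    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(remote_service)
        try:
            return future.result(timeout=timeout_s)
        except TimeoutError:
            return "fallback"      # recovery action after assumed failure

print(call_with_timeout(0.1))      # → fallback
```

With a generous timeout the reply arrives normally; with a short one, the caller recovers instead of blocking forever.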
27. Name all constants that
represent real-world values
• Always give constants that reflect real-world values (such
  as tax rates) names rather than using their numeric values,
  and always refer to them by name.
• You are less likely to make mistakes and type the wrong
  value when you are using a name rather than a value.
• This means that when these ‘constants’ change (for sure,
  they are not really constant), then you only have to make
  the change in one place in your program.
  Examples: Minimum_speed, Error_rate, Max_daily_dose,
  Tolerance, VAT_rate
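A brief Python sketch using two of the example names from the slide (the numeric values are purely illustrative assumptions):

```python
# Named constants in one place: when a real-world value changes,
# only one line of the program changes.
MINIMUM_SPEED = 30       # km/h (illustrative value)
MAX_DAILY_DOSE = 4       # tablets per day (illustrative value)

def speed_acceptable(speed):
    return speed >= MINIMUM_SPEED

def dose_acceptable(tablets):
    return tablets <= MAX_DAILY_DOSE

print(speed_acceptable(25))   # → False
print(dose_acceptable(3))     # → True
```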
28. Key points
• Dependability in a program can be achieved by avoiding the
introduction of faults, by detecting and removing faults before
system deployment, and by including fault tolerance facilities.
• The use of redundancy and diversity in hardware, software
processes and software systems is essential for the
development of dependable systems.
• The use of a well-defined, repeatable process is essential if
faults in a system are to be minimized.
• Dependable programming relies on the inclusion of redundancy
in a program to check the validity of inputs and the values of
program variables.