Ten Tips to Succeed in Global Software Engineering Education
Resilient systems - predicatbility ane evolution
1. Resilient Systems - Predictability
and Evolution
Ivica Crnkovic
Mälardalen University, Sweden
www.idt.mdh.se/~icc
ivica.crnkovic@mdh.se
2. What is resilience?
• In ecology1
– the capacity of an ecosystem to respond to a perturbation or disturbance
by resisting damage and recovering quickly
• In System Engineering
– Resilience is the ability of organizational, hardware and software systems
to mitigate the severity and likelihood of failures or losses, to adapt to
changing conditions, and to respond appropriately after the fact.
1Resilience (ecology - Wikipedia)
1/25/2013 Serene 2011 2
3. Characteristics of resilient systems
Critical aspects of resilience: System Resistance
1. Latitude: the maximum amount a state
system can be changed before
losing its ability to recover. Robustness
2. Resistance: the ease or difficulty of
changing the system; how
“resistant” it is to being changed.
3. Precariousness: how close the
current state of the system is to a
limit or “threshold.” Environment
Latitude
4. Panarchy: the degree to which a
certain hierarchical level of an
ecosystem is influenced by other Precariousness
levels.
1/25/2013 Serene 2011 4
4. Resilience vs. dependability vs.
robustness?
• Dependability Ability of a system to Availability
deliver service that can justifiably be Reliability
trusted Safety
Confidentiality
Attributes Integrity
(Ability of a system to avoid failures Maintainability
that are more frequent or more severe
than is acceptable to user(s)
Faults
Dependability Threats Errors
• Survivability is the ability to remain
Failures
alive or continue to exist.
• Robustness is the persistence of a
system’s characteristic behavior under Fault Prevention
perturbations or conditions of Fault Tolerance
uncertainty. Means
Fault Removal
Fault Forecasting
• Resilience is the persistence of the
avoidance of failures that are
unexpectedly frequent or severe, when
facing change [Laprie]
1/25/2013 Serene 2011 5
6. Resilience vs. dependability vs.
robustness?
dependable (robust&resistent) systems” “Resilient systems”
states
• Highly controlled • A broad spectrum of possible equilibrium state
• Operates in a narrow band • Not necessary all states are predicted
• Predefined states (“modes”) • Adaptive and evolving systems
• Top-down design • impact of the system on the environment
• Challenge: predict all states • Challenge:
caused by the environment • Adaptation
• Optimal performance in different states
• Minimize unwanted impact on the
environment
1/25/2013 Serene 2011 7
7. Do we care about impact on the environment?
ABB vision: “Power and productivity for a better world”
“….., we help our customers to use electrical
power efficiently, to increase industrial
productivity and to lower environmental
impact in a sustainable way.”
1/25/2013 Serene 2011 8
8. Resilient system development
challenges
• Resilience is an extra-functional (non-functional)
property
• A part of the design:
– System adaptability
– System evolvability
– Impact on external environment
• (direct & indirect)
• (short & long term)
1/25/2013 Serene 2011 10
9. Development process – top-down vs.
Bottom-up
• Modern, sustainable and resilient systems:
– Complex, distributed, highly dynamic, highly adaptive
– Adjustable to the changing environment and to the
users
• Top-down principle is not enough!
– It would be a problem with evolution and flexibility
• Late compositions, reconfigurations, on-line
verifications, on-line validations
1/25/2013 Serene 2011 11
10. Extended development process
1JOSEPH FIKSEL Designing Resilient, Sustainable Systems , Environ. Sci. Technol. 2003, 37, 5330-5339
1/25/2013 Serene 2011 12
12. Resilience property model
is refined to
Resilience subcharacteristics
1 1..*
1
is refined to
1..*
measured by
measuring attributes 1..* metrics
1
1
reason about 1..*
QoS
Questions:
which are subcharacteristics?
which measuring attributes?
How to design for resilience?
How to analyze resilience?
1/25/2013 22
13. Analysis Process
Elicit Architectural Concerns
Stakeholders’ concerns
(subcharacteristics)
Potential architectural
Change stimuli requirements
Analyze Implication of Propose architectural
change stimuli solutions
Analyze Implication of
change stimuli
14. Example: Software Evolvability Model
• Literature survey and analysis of the literature
• Industrial case studies
-is refined to
Evolvability Subcharacteristics
1 1..*
1 -is refined to
• Evolvability subcharacteristics 1..*
-is measured by
• Analyzability Measuring Attributes Metrics
• Architectural Integrity 1 1..*
• Changeability 1 -reason about
QoS
• Portability 1..*
• Extensibility
• Testability
• Domain-specific attributes
Hongyu Pei Breivold, Ivica Crnkovic, Peter Eriksson, Analyzing Software Evolvability, 32nd IEEE International computer Software and Applications
Conference (COMPSAC), 2008
1/25/2013 Serene 2011 24
15. Case Study: ABB Robotic Control System
PC Applications Man Machine Interactions
Controller Control Interfaces and
Language Resources
I/O File System Safety Base system Support services
OS & HW abstraction Device Drivers
A conceptual view of the software architecture
1/25/2013 Serene 2011 27
17. Phase 1: Change Stimuli and Architectural Requirements
Change Stimuli
• Time-to-market requirements
• Increased ease and flexibility of
distributed development of diverse
application variants.
Requirements Subcharacteristics
R1. Transform the monolithic architecture to Analyzability
modular architecture. Changeability
R2. Reduced architecture complexity.
R3. Enable distributed development of Extensibility
extensions with minimum dependency.
R4. Portability. Portability
R5. Impact on product development process. Testability
R6. Minimized software code size and runtime Domain-specific
footprint. Attribute
R1, R2, R3 prioritized
1/25/2013 Serene 2011 29
18. Phase 2: Architectural analysis
• Extract architectural constructs related to the respective identified issues
• Identify refactoring components for each identified issue
Refactoring components – dynamic component biding
1/25/2013 Serene 2011 30
19. Phase 3: finalize the analysis
Example: Implications of the IPC component refactoring on
evolvability subcharacteristics (+ positive impact, - negative impact)
Subcharacteristics IPC Component Refactoring
Analyzability – due to less possibility of static analysis since definitions
are defined dynamically
Architectural Integrity + due to documentation of specific requirements,
architectural solutions and consequences
Changeability + due to the dynamism which makes it easier to introduce
and deploy new slots
Portability + due to improved abstraction of Application Programming
Interfaces (APIs) for IPC
Extensibility + due to encapsulation of IPC facilities and dynamic
deployment
Testability No impact
Domain-specific + resource limitation issue is handled through dynamic IPC
attribute connection
– due to introduced dynamism, the system performance
could be slightly reduced
1/25/2013 Serene 2011 31
20. Quantitative Architecture Evolvability Analysis Method
Using Analytic Hierarchy Process (AHP) for comparison of choices
Phase 1: Analyze the implications of change stimuli on Phase 3: Finalize the evaluation
software architecture
Step 3.1: Analyze and
Step 1.1: Elicit stakeholders’ Step 1.2: Extract stakeholders’ present evaluation results
views on evolvability prioritization and preferences of
subcharacteristics evolvability subcharacteristics
Phase 2: Analyze and prepare the software architecture to
accommodate change stimuli and potential future changes
Step 2.2: Assess candidate
Step 2.1: Identify candidate
architectural solutions’ impacts on
architectural solutions
evolvability subcharacteristics
1/25/2013 Serene 2011 32
21. Analytic Hierarchy Process (AHP)
[mij] nxn matrics – comparison elements (subcharacteristics)
mij – comparison values between two elements (1 … 9/1)
(results – binary relations between the elements)
Calculation and value normalization -> normalized significance of the values Wi
(W1+W2+… Wn = 1)
22. Case study: Ericsson
• mobile network architecture with respect to the
evolvability of a logical node at Ericsson.
• Logical Node:
– handle control signaling for and keep track of user equipment
such as mobiles using a certain type of radio access.
Application level
In-Service Software Upgrade (ISSU)
Control System Transmission System
Platform level
Middleware OS
1/25/2013 Serene 2011 34
23. Phase 1: requirements identification
Requirements Subcharacteristics
R1. The ISSU solution should be easy to understand for the Analyzability
development organization.
R2. Many kinds of application changes shall be possible Changeability
without special upgrade code, e.g., backward compatible
interfaces.
R3. Enable introduction of ISSU solution without limitations on Extensibility
extending existing features or adding new ones.
R4. Enable change of operating system or hardware. Portability
R5. Enable the ease to test and debug parts of the system Testability
individually, and extract test data from the system.
R6. The ISSU total time, capacity impact during ISSU and Capacity
normal execution should be within specified values.
R7. Critical components need to have redundancy during Availability
ISSU.
R8. The impact of software or hardware failure during upgrade
should be limited.
1/25/2013 Serene 2011 35
25. Preferences of evolvability subcharacteristics (2)
CR < 0.1 – the consistent selections between stakeholders
1/25/2013 Serene 2011 37
26. Phase 2: provide alternative architectural solutions
Select alternative:
1/25/2013 Serene 2011 38
27. Summary - Questions
• What is the importance of resilience?
• How to include resilience aspects in the development process?
– In certification?
– In real-time systems?
– Component-based approach?
– Model-driven approach?
• How to verify resilient systems?
– Can we test all possible scenarios?
• Adaptability
– Where to implement the adaptability?
• Source code level?
• Architectural level?
1/25/2013 Serene 2011 39
4. For example, organisms living in communities that are in isolation from one another may be organized differently than the same type of organism living in a large continuous population, thus the community-level structure is influenced by population-level interactions
robustness is the ability of a computer system to cope with errors during execution or the ability of an algorithm to continue to operate despite abnormalities in input, calculations, etc.
robustness is the ability of a computer system to cope with errors during execution or the ability of an algorithm to continue to operate despite abnormalities in input, calculations, etc.
Every open system interacts with its environment and, hence, is part of a larger system.Carried to the extreme, this type of reasoning leads to the “Gaia hypothesis”, which claims that the world is a single giant organism.
The main problem with this software architecture was the existence of tight coupling among some components that reside in different layers.
Analyze impacts of the solutions to the subcharacteristics (using AHP