SlideShare a Scribd company logo
1 of 7
Download to read offline
Using Dynamic Adaptive Systems
in Safety-Critical Domains
Ethan T. McGee
Clemson University
McAdams Hall
Clemson, SC 29632
etmcgee@clemson.edu
John D. McGregor
Clemson University
McAdams Hall
Clemson, SC 29632
johnmc@clemson.edu
ABSTRACT
The development of safety-critical Cyber-Physical Systems
(CPS) is expanding due to the Internet of Thingsā€™ promise
to make high-integrity applications and services part of ev-
eryday life. This expansion is seen in the dependencies some
connected vehicles have on cloud services that provide guid-
ance and accident avoidance / detection features. Such sys-
tems are safety-critical since failure could result in serious
injury or death. Due to the severe consequences of fail-
ure, fault-tolerance, reliability and dependability should be
primary driving qualities in the design and development of
these systems. However, the cost of the analysis, evaluation
and certiļ¬cation activities needed to ensure that the possi-
bility of failure has been suļ¬ƒciently mitigated is signiļ¬cantly
higher than the cost of developing traditional software.
Our group is exploring the addition of dynamic adap-
tive capabilities to safety-critical systems. We postulate
that dynamic adaptivity could provide several enhancements
to safety-critical systems. It would allow systems to rea-
son about the environment within which they are sited and
about their internal operation enabling decision making that
is context-speciļ¬c and appropriately prioritized. However,
the addition of adaptivity with the associated overhead of
reasoning is not without drawbacks particularly when hard
real-time safety-critical systems are involved. In this brief
position paper, we explore some of the questions and con-
cerns that are raised when dynamic adaptive behavior is
introduced into safety-critical systems as well as ways that
the Architecture Analysis & Design Language (AADL) can
be used to model / analyze such systems.
CCS Concepts
ā€¢Software and its engineering ā†’ Software organiza-
tion and properties; Designing software;
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for proļ¬t or commercial advantage and that copies bear this notice and the full cita-
tion on the ļ¬rst page. Copyrights for components of this work owned by others than
ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or re-
publish, to post on servers or to redistribute to lists, requires prior speciļ¬c permission
and/or a fee. Request permissions from permissions@acm.org.
SEAMSā€™16, May 16-17, 2016, Austin, TX, USA
Ā© 2016 ACM. ISBN 978-1-4503-4187-5/16/05...$15.00
DOI: http://dx.doi.org/10.1145/2897053.2897062
Keywords
Safety Critical Systems; Dynamic Adaptive Systems; Soft-
ware Product Lines; Dynamic Software Product Lines
1. INTRODUCTION
Cyber-Physical Systems (CPS) and the Intenet of Things
(IoT) are seeing expanded use in a wide variety of domains.
They are enabling the automation of common household de-
vices, and they are also playing a major role in the imple-
mentation of connected / autonomous vehicles [23]. Pro-
totype autonomous vehicles are relying on infrastructures
powered by the IoT [14] for features including Global Po-
sitioning System (GPS) navigation, adaptive cruise control
and automated accident response / detection among others.
Especially for autonomous vehicles, these systems would be
considered safety-critical, where system failure could result
in serious injury or death. We consider adding dynamic
adaptive capabilities to safety-critical systems as a way to
extend the life, deploy-ability and functionality of the sys-
tems. This poses signiļ¬cant challenges to the analysis and
certiļ¬cation that systems are safe.
Dynamic Adaptive Systems (DAS) are systems that mon-
itor select properties of their environment and reason about
the measured values in order to determine what behaviors
the system should adopt in order to best operate in the de-
tected environment [24]. Introducing dynamic adaptive ca-
pabilities into safety-critical systems would address several
issues.
First, it could increase the fault-tolerance of safety-critical
systems. In many safety-critical systems, if the system fails
in an unexpected or unforeseen way, there is generally no
prescribed method of handling this failure. Since DAS in-
herently include a degree of self-monitoring, provided the
correct system properties are monitored, it would be possi-
ble to determine that the system has failed in the current
conļ¬guration. Systems could also provide a fail-safe con-
ļ¬guration that could be transitioned to in the event of a
failure.
Second, it could increase the deploy-ability of safety-critical
systems. Since DAS monitor and adapt to their environ-
ment, it would be possible to build a safety-critical system
that could support multiple environments rather than build-
ing multiple systems, one per environment. Doing so would
greatly reduce development and certiļ¬cation costs as only
one, not multiple, system would be considered.
Finally, the inclusion of dynamic adaptivity in safety-
critical software could allow it to respond to unexpected
situations in a manner that prevents loss of life. Alaska Air-
2016 IEEE/ACM 11th International Symposium on Software Engineering for Adaptive and Self-Managing Systems
115
lines Flight 261 was a ļ¬‚ight that crashed killing 88. The
crash was caused by a screw sheering in such a way that the
ļ¬‚ight control module which controls pitch was not oriented
correctly [27]. If this system had been able to correctly de-
tect its current orientation, the software of the ļ¬‚ight control
module could have reoriented itself, allowing the plane to
land safely.
Despite the beneļ¬ts possible, adaptive behavior signiļ¬-
cantly hinders safety analysis and certiļ¬cation. At the cur-
rent time, the safety analysis methods used for adaptive sys-
tems are formal methods [28, 15], which are generally advo-
cated to be used in addition to reasoning-based safety anal-
ysis not standalone [16, 3]. Formal methods are generally
more diļ¬ƒcult to understand, modify and maintain [16, 3],
and mathematically-based reasoning techniques, like Soft-
ware Fault Tree Analysis (SFTA) [19], are easier to con-
struct while still maintaining the mathematical rigor of for-
mal methods. In our work, we use reasoning-based tech-
niques over formal methods due to the diļ¬ƒculty involved in
building a correct mathematical speciļ¬cation of the system
needed for evaluation by formal methods. The speciļ¬cation
requires specialized training in mathematical proofs / spec-
iļ¬cation to build and, like any large development artifact,
can be prone to errors. Reasoning based techniques have no
such constraint.
In this position paper, we explore adding dynamic adap-
tive capabilities to safety-critical systems. The primary con-
tributions of this paper are as follows:
ā€¢ We consider the diļ¬ƒculties of testing dynamic adaptive
software products due to the large number of possible
run-time conļ¬gurations.
ā€¢ We consider the use of a partially-tested DAS in a
safety-critical environment and how the use of moni-
toring and dynamic reconļ¬guration could be used as
assurance for the partially tested system.
ā€¢ We consider the use of the Architecture Analysis & De-
sign Language (AADL) [12] as a method of designing,
analyzing and modeling DAS.
For each item, we will discuss our in-progress research that
provides potential solutions, and we also introduce questions
and concerns arising from each topic that are open work.
The remainder of this paper is structured as follows. Sec-
tion 2 contains the background information for this work,
and section 3 contains issues that we have found preventing
adaptive designs from being used in safety-critical systems
as well as work we are conducting towards providing a so-
lution for the issue. Section 4 contains related work, and
ļ¬nally, section 5 concludes this position paper.
2. BACKGROUND
We now provide the information necessary for understand-
ing the work that we present. We provide overviews of
safety-critical systems and AADL. We also give an overview
of variation management.
2.1 Safety-Critical Systems
Safety-critical systems are systems where failure can result
in serious injury, loss of life or property / environmental
damage. These devices have been employed in the domains
of aviation and transportation as well as numerous other
domains including medical and nuclear energy [17]. Such
systems are becoming more commonplace, particularly as
humans are increasingly relying on machines for situations,
such as autonomous vehicles, where humans cannot always
react quickly enough.
Safety-critical systems are designed and developed diļ¬€er-
ently than traditional software systems. They require exten-
sive analysis, design and testing which all contribute assets
to the certiļ¬cation process, and an assurance case to show
that potential failures have been suļ¬ƒciently mitigated. The
necessity of extensive work both initially in design and later
in testing is owed to the fact that the certiļ¬cation process is,
in general, expensive and long. Any errors found during the
certiļ¬cation process require that the software be corrected
and resubmitted starting the certiļ¬cation process over from
the beginning.
Safety-critical systems have diļ¬€erent development con-
cerns than traditional software, especially since human lives
are at risk. Unintended behavior must be prevented from
emerging when separately developed components are brought
together. An example of such emergent behavior can be seen
in the failure of the London ambulance dispatch system in
1992. The deployed system worked initially, but a combi-
nation of interactions caused the system to fail increasingly
frequently throughout the day which resulted in service de-
lays and up to 46 deaths [20]. In addition to extra work
during development, additional work is needed during re-
quirements analysis to prevent incorrect assumptions about
the software. An example of this type of failure is the Mars
Climate Orbiter which crashed because diļ¬€erent units of
measurement were employed by the ground and on-board
control systems [17]. These are only a few examples of the
additional concerns while developing safety-critical software.
2.2 Architecture Analysis & Design Language
AADL is an architecture representation language and anal-
ysis tool designed for embedded system technologies. It
is used to represent both hardware and software architec-
tures [12] and is an approved standard of the SAE [10].
AADL is a highly extensible language structured as a set
of annexes around a core language. As an example, the
behavior annex provides the ability to specify the runtime
behavior of a component. Other annexes provide support for
veriļ¬cation, error speciļ¬cation and requirements among oth-
ers. AADLā€™s Integrated Development Environment (IDE),
OSATE, also provides tools for analyzing and verifying the
architecture speciļ¬cation. Some tools evaluate only the ar-
chitecture speciļ¬cation while others utilize information from
the annexes to aid analysis.
AADL has been used as a method of representing SPL, a
methodology used to build DAS [2], as its rich typing and
high extensibility support the intricacies involved in model-
ing SPL. One such feature is that of AADLā€™s extends key-
word which can be used to represent the variation / variation
point structure used in product lines. Extends, an example
of which is shown in ļ¬gure 1, is used to denote a system that
inherits properties from a parent system. The child system
will receive the features (inputs and outputs), subcompo-
nents, connections and annexes of the parent. Children are
allowed to add to or to reļ¬ne the parentā€™s structure, but they
are not allowed override the structure of the parent. This
provides an ideal representation for SPL variation points /
variations as variation points are representations of ā€œholesā€
116
to be ļ¬lled. The ā€œholeā€ is normally represented as an object
that contains input / output types or other structural infor-
mation. Variants extend the variation pointā€™s ā€œcontractā€ to
ensure that they can be properly instantiated in the varia-
tion point.
system my_variation_point
end my_variation_point;
system my_variation extends my_variation_point
end my_variation;
Figure 1: AADL Extends Example
2.3 Variation Points & Variations
SPL is a development methodology that decreases the cost
associated with development by allowing developers to sep-
arate out a set of common assets that are shared across a
family of products. It allows developers to more intensely fo-
cus on product-speciļ¬c issues rather than focusing on shared
assets [26]. The shared assets are morphed into what is
known as a product family architecture. This architecture
represents all common functionality to the entire suite of
products, but the family architecture containsā€œholesā€known
as variation points. In order for a new product to be created
within the context of the product line, each variation point
must be substituted with an implemented variant [26]. This
instantiation of a variation point can happen at any time
during the product life-cycle, including runtime. Variations
and variation points are also used by DPSL, where instanti-
ation of a variation point does not necessarily lead to a new
product but could lead to new run-time conļ¬guration.
Variation points possess two primary properties: state and
variant addition state [26]. We add to this list the variation
pointā€™s type [5]. The state of a variant can be one of three
mutually exclusive values:
ā€¢ Implicit - These are variation points that have not
yet been speciļ¬cally created in the architecture. They
represent uncertainty or variability that has not yet
been decided and has not yet been explicitly left as a
decision to be bound by the product. These types of
variation points tend to be created early in the project
life-cycle and are migrated to designed variation points
as the project progresses.
ā€¢ Designed - These are variation points where the de-
sign decision has been deliberately left unanswered.
ā€¢ Bound - A variation point that has been replaced by
a variant.
[26] The second property, variant addition state, deals with
whether it is possible to add new variants to the system.
This property can assume one of two values:
ā€¢ Open - An open variation point is one that can still
have additional variants added to its list of potential
variants.
ā€¢ Closed - A closed variation point cannot have any
further variants added to its list of potential variants.
[26] The ļ¬nal property is the variant type. This property
can also assume one of two values:
ā€¢ Static - A static variation point is a variation that,
once bound, cannot be rebound. This binding occurs
as the result of a decision outside the system.
ā€¢ Dynamic - A dynamic variation point is a variation
point that can be rebound, even after binding. This
binding usually, but not always, occurs as a result of a
decision made autonomously by the system itself.
[5] Each of these properties can be represented in the fea-
ture tree of the product line. With AADL, it is possible to
create a property set that will also allow these properties
to be expressed as part of the AADL model which in turn
allows variation point properties to be evaluated during ar-
chitecture analysis.
3. ISSUES
Our motivation for undertaking this work is to explore
how safety-critical systems can beneļ¬t from the work of the
dynamic adaptive community. The intention is to build
safety-critical systems that are capable of operating opti-
mally in a wide range of situations, especially environments
that threaten successful execution. To provide a motivat-
ing example, consider that some autonomous vehicles rely
on external services provided by the IoT. These external
services could provide non-safety-critical features like enter-
tainment, but they could also provide safety-critical features
such as accident alerts, road condition warnings and navi-
gation. Each year as new vehicles are released, the new
vehicles will likely rely on new, upgraded versions of each
service that may not yet be available in all areas. Much as
cell phones are capable of adapting to weak cellular signals,
it is desirable for these vehicles to be both capable of de-
tecting and adapting to the absence of the services on which
they rely.
Despite the beneļ¬ts that DAS oļ¬€er to safety-critical sys-
tems, there are concerns that hinder the composition of the
disciplines. In particular, we present three concerns:
ā€¢ What constitutesā€œsuļ¬ƒcient pre-deployment testingā€of
a safety-critical DAS? (discussed in section 3.1)
ā€¢ What is the nature of the failures that can arise from
executing untested conļ¬gurations containing faults and
how can they be mitigated? (discussed in section 3.2)
ā€¢ How can safety-criticality be incorporated into the de-
sign of DAS? (discussed in section 3.3)
For each of these concerns, we will discuss the concern then
provide work our group is undertaking that we postulate will
provide a resolution to the stated concern.
Throughout our discussion, we will use standard terminol-
ogy from SPL. We use variation point to designate a decision
that has been deliberately left undecided. We use the term
variant to represent a candidate decision that can be bound
to, or swapped into, a variation point. For the purposes of
this discussion, we assume that the variation point is dy-
namic and can be bound and rebound at run-time, unless
explicitly stated otherwise. We will use the term context to
mean the current operating environment, or the current set
of values measured by the system. Finally, we use conļ¬gu-
ration to deļ¬ne the current bindings for the set of variation
points. Since we are dealing with dynamic variation points,
we also assume, unless explicitly stated otherwise, that the
117
conļ¬guration can be changed autonomously by the system
at run-time.
3.1 Pre-deployment Testing of Safety-Critical
DAS
A single dynamically adaptive system deļ¬nes multiple prod-
uct conļ¬gurations and therefore can be represented and man-
aged using SPL methods [2]. As such they share many of
the same concerns with regards to testability. The shared
concerns include:
ā€¢ Large Number of Test Cases - As additional po-
tential variants for the systemā€™s variation points are
identiļ¬ed, the number of possible run-time conļ¬gura-
tions increases combinatorially.
ā€¢ Reusable Components - Variants could be reused
between adaptive systems, but which artifacts of the
variants (testing results, test cases, certiļ¬cation assets)
are to be shared and how much they will reduce the
eļ¬€ort in testing the new system is not well understood.
ā€¢ Variability - SPL, and subsequently DAS, have two
types of variation points, static and dynamic. How
testing the various types of variation points should be
handled or if they should be handled diļ¬€erently is not
known.
[9] Our work focuses primarily on the ļ¬rst concern.
For a given safety-critical SPL, one approach to ensuring
suļ¬ƒcient test coverage would be to automate the testing
of all possible combinations. While this method may be
acceptable in some cases, there also exist cases where this
is not an option. One such example would be a situation
where there is not enough time for the test suite to run
before the system is to be deployed. In such situations, it is
necessary to determine which tests will capture the largest
set of faults in the system. One method of reducing the
number of potential test cases would involve ranking the
conļ¬gurations of the system by the likelihood that they will
be used. After ranking, the top k conļ¬gurations could be
selected for testing based on how much time is available for
testing and how long it takes to test a single conļ¬guration.
This method was employed by [1].
Our group is currently using an additional complementary
technique based on the error ontology given in the Error An-
nex to AADL [6]. We produce models of each component of
the system, and we augment each model with the ontology
of errors that can propagate from that component. When
a conļ¬guration of the system is composed from the models
of each component, the diļ¬€erent error propagations of each
conļ¬guration can be examined. We postulate that this in-
formation can be leveraged in order to determine which tests
should be run against a conļ¬guration.
Consider that we have a variation point, A, with two vari-
ants, Aā€™ and Aā€. A can propagate errors BadValue and Mu-
texError. This means that the variant could produce an in-
correct value or that it could produce an error due to threads
improperly using a mutex. Aā€ can propagate errors Bad-
Value. It does not propagate MutexError as it, unlike Aā€™,
does not use multiple threads. Based on this information,
we determine that it is not necessary to run multi-threading
or mutex tests against conļ¬gurations of the system in which
Aā€ has been swapped in for variation point A while it is
necessary to run tests for multi-threading / mutexes against
conļ¬gurations that use Aā€™. By examining the error propaga-
tions, we can not only limit our testing to the conļ¬gurations
that are most likely to be used, but we can also limit our
tests based on the types of errors that each conļ¬guration
can produce. Our hypothesis is this will allow us to test a
greater number of conļ¬gurations than if we ran every test
against a conļ¬guration.
3.2 Mitigation of Failures at Run-time
Assuming that we have a limited testing budget and a suf-
ļ¬ciently large system, it is probable that an untested conļ¬g-
uration containing latent faults will be entered into at run-
time. Due to the conļ¬gurationā€™s not being tested, safety-
critical DAS should have some method of both detecting
and handling execution-time failures that could occur as a
result of the faults.
A primary strategy is that of performing a series of execution-
time tests just before a conļ¬guration is deployed, as is done
by [21]. Components provide a series of dependencies and
provisions, and the tester ensures that all dependencies are
satisļ¬ed by some component. Another possibility is that
each component contains test cases, and these cases are used
to perform the pre-deployment testing. In general, a conļ¬g-
uration that fails to pass the run-time tests will not be de-
ployed. Pre-deployment testing can help mitigate the prob-
ability of failure, but testing alone cannot guarantee that
failure will not occur.
Since tests alone cannot guarantee that failure will not oc-
cur, it is necessary to have some method of handling failures
if and when they occur. [22] proposed a method of main-
taining a list of potential alternatives for a given variant in
a conļ¬guration. If the variant fails, each alternative can be
attempted until all possibilities have been exhausted or an
alternative resolves the failure. A natural concern is what
should happen if all possibilities have been attempted but
the failure persists. In some domains, it would be possible to
halt the system, but in safety-critical systems, this is rarely
possible. Another concern is how the occurrence of a failure
aļ¬€ects the hard real-time requirements of a system.
Our proposed method is to provide a failure conļ¬guration
that is entered into immediately upon failure. This conļ¬g-
uration will spawn two co-operative tasks that execute side
by side. The ļ¬rst task provides basic functionality so that
the system can continue to receive and transmit informa-
tion meeting its hard real-time requirements. The second
task attempts to resolve the failure by changing the conļ¬gu-
ration. In our current work, the change is to simply rollback
to the previous conļ¬guration. We would like to incorpo-
rate the alternative list used by [22] rolling back only once
all alternatives have been tried. Our approach currently
raises several unresolved concerns. First, detecting failure
occurrence, loading the failure conļ¬guration and switching
into the failure conļ¬guration is not an instantaneous process.
For systems with real-time requirements in the millisecond
range, it is possible that our process could still cause the
system to fail to satisfy its requirements. Second, our rea-
son for wanting to incorporate the list of alternatives stems
from the fact that rolling back to the previous conļ¬guration
can cause the system to re-enter the failing conļ¬guration.
This happens when the rollback occurs but the context has
not changed. The previous conļ¬guration detects that an-
other conļ¬guration is better suited to handle the context,
the failing conļ¬guration, and switches. An idea currently
118
being explored is the concept of banning conļ¬gurations that
have failed as is done by [13], discussed later.
Even though run-time testing cannot prevent failure, it
still oļ¬€ers a potential mitigation that makes it less likely
that we will need to rely on run-time failure resolution. The
manner in which the execution-time tests are performed,
however, do raise concerns for safety-critical systems. Un-
like [21], it is rarely possible for a safety-critical system to
halt execution in order to perform run-time tests. Once
again, considering hard real-time requirements, we currently
have the reconļ¬guration process run concurrently alongside
the current conļ¬guration. This has several beneļ¬ts. First,
the system can continue normal operation while the new to-
be-deployed conļ¬guration is tested and validated. Second,
if the to-be-deployed conļ¬guration fails validation, the con-
ļ¬guration process can simply terminate with no additional
actions. The current conļ¬guration that has been running
alongside the validation process may be able to continue
uninterrupted. If the to-be-deployed conļ¬guration succeeds,
then and only then is the current conļ¬guration interrupted
and the new conļ¬guration swapped in. There are also sev-
eral drawbacks to this approach. First, as previously stated,
switching contexts is not instantaneous. For systems with
real-time requirements in the millisecond range, uninter-
rupted execution may not be possible. Second, additional
hardware is needed to allow concurrent execution of conļ¬g-
uration testing on smaller embedded devices. Finally, what
is necessary to conduct run-time testing varies depending
on the test to be executed. Our group has also considered
the possibility of representing the entire architecture as this
would allow complex analysis tasks like ļ¬‚ow, dependency
and latency analysis to be performed on the system at run-
time. This would be particularly beneļ¬cial since the system
may have to consider never-before-seen variants. However,
safety-critical systems can be small embedded devices with
limited memory constraints. It may not be possible to load
a representation of the entire architecture or the entire fea-
ture model into memory all at once for analysis. This would
require multiple read operations from storage increasing the
time required for the run-time testing of the reconļ¬guration
process. Even if size were not a concern, which types of tests
to perform is. It is likely that the reconļ¬guration process
will not have enough budget or resources to execute all of
the analysis tests.
Our group has also considered the possibility of the sys-
tem context as a source of failure. Once again consider an
autonomous vehicle. The vehicle is traveling on a sunny,
cloud-free day and, using sensors on the windshield / under-
body, determines that the traction control system should be
conļ¬gured to support mostly dry road conditions. As the
vehicle is traveling along it encounters wet roadways due
to a ļ¬re hydrant that is in the process of being depressur-
ized. The car under-body detects wet road conditions, but
the windshield sensor determines that conditions are still
sunny so the current conļ¬guration is not changed. As a
result the car begins to slide on the wet roadway possibly
ending in a collision. In the example, an untested conļ¬gura-
tion was not the source of the fault; the combination of the
context and conļ¬guration resulted in a fault. Our work on
providing a minimal conļ¬guration for fault situations pro-
vides one potential solution to this concern, but our group
is also considering other avenues. We are currently inves-
tigating an extension of [13] so that a systemā€™s failure or
success in a particular context is recorded and, at a later
point in time, can be used to determine if a conļ¬guration
should be deployed in that context again. Continuing with
the example, we encounter wet roads but sun again. The
vehicle can leverage the collected history to determine that
dry road traction control should not be used as last time it
resulted in a failure. This history of conļ¬gurations and the
contexts they have failed in could also be shared with other
vehicles allowing them to ā€œlearnā€ and become more reliable
with time.
3.3 Safety-Critical DAS Design Techniques
The extensive analysis required by safety-critical systems
needs special tooling for modeling and design. Current ar-
chitecture tools, such as AADL and UML, do not explic-
itly support dynamic structural adaptation, but they pos-
sess several features that could be useful to the analysis and
design of DAS.
AADL and its IDE, OSATE, have many features that
would be beneļ¬cial to the modeling, analysis and design
of DAS. Their support of annexes would allow for tooling to
be built for the modeling of DAS that use structural adapta-
tion, and our group is currently considering the development
of an annex speciļ¬c to that goal. The current features we
feel would be of use to the DAS community, and that have
been particularly useful in the work we have undertaken,
would be the features used to model the unique features of
SPL [11], modes, behavior and error ļ¬‚ows.
system receiver
features
inevent: in event port;
modes
nominal: initial mode;
recovery: mode;
end receiver;
system implementation receiver.i
subcomponents
sub1: system sub_receiver.i in modes (nominal,recovery);
sub2: system sub_receiver.i in modes (recovery);
internal features
triggerrecovery: event;
triggernormal: event;
modes
nominal-[triggerrecovery]->recovery;
recovery-[triggernormal]->nominal;
end receiver.id;
Figure 2: AADL Modes Example
AADL allows developers to specify state transitions using
a Finite State Machine (FSM) notation, and the states spec-
iļ¬ed can be used to enable or disable components as well as
to drive analysis. As an example, consider ļ¬gure 2. The sys-
tem receiver has two modes nominal and recovery. It also
has two sub-components sub1 and sub2. The system starts
in its initial mode, nominal, and continues execution until
a triggerrecovery event occurs. Following the rules speci-
ļ¬ed under modes in receiver.i, the system transitions to the
recovery mode. Under normal operation only the sub1 sub-
component is available to the system, but in recovery mode
sub2 can be used as well. Both sub1 and sub2 are usable
until the system returns to nominal mode via the trigger-
119
nominal event.
The mode can also be used to drive decision modeling
as both the error and behavior annexes can reference the
current mode. The behavior annex provided by AADL al-
lows the behavior of the system or a sub-component to be
speciļ¬ed and modeled [7], and there are some commercial
tools available that can utilize this annex when modeling or
simulating the system represented by the architecture [8].
An error model, deļ¬ned using the AADL error annex, al-
lows developers to specify how errors propagate throughout
the system. The errors and their types can be speciļ¬ed as
well as which errors the component can handle. Errors that
cannot be handled by a component are propagated further
up the composition hierarchy until they reach a component
that handles them [18].
Since the behavior annex is used to model dynamic behav-
ior in a system, the modes could be leveraged as a method of
allowing certain types of adaptation only when the system
is in a speciļ¬c mode. AADL could also be used to explic-
itly model and trace the context of the system through its
extensible type interfaces. The context could then also be
referenced by the error and behavior annexes. Commercial
tools and analysis interfaces in OSATE would, as a result, be
able to directly predict the adaptive systemā€™s behavior. This
can be used as an additional method of testing the adaptive
system, and it oļ¬€ers high-level insight into the operation of
the dynamic software.
4. RELATED WORK
We now cover work with similar focus to ours identifying
areas of overlap. We also provide how our work diļ¬€ers and if
our work could beneļ¬t from incorporation of ideas expressed
by these similar ideas.
[1] provides a method of automatically calculating the dif-
ferent conļ¬gurations that a system can assume. The relia-
bility of the system is then evaluated using a Markov chain.
Our work diļ¬€ers in that we explicitly assume that even if
the number of possible conļ¬gurations is determinable, the
number of conļ¬gurations will be too large to test in a rea-
sonable amount of time. It is also the case that a reliable
system does not necessarily mean that the system is safe,
however a safe system cannot be unreliable. Our work uses
analysis of the systemā€™s architecture to limit the number of
tests that need to be executed against the system.
[15] provides a method, using DCCA, of proving that an
adaptive system is more dependable than a conventional sys-
tem. The check is based on temporal logic; however, the
method is not a direct proof of safety, but it could be used
to contribute to the case for safety. The method takes into
account that a failure could be resolved by the reconļ¬gura-
tion process, but it doesnā€™t take into account that the failure
could also be re-introduced by the reconļ¬guration process.
This method is complimentary to our own. It could be used
as an additional safety case or as the basis for developing an
analysis method tailored to dynamic adaptive systems.
[4] presents an overview that describes the need for run-
time metric measurement in adaptive systems, and [25] pro-
vides a method of measuring speciļ¬c quality attributes. This
method could be used as an assurance that adaptation cor-
rectly places safety as a high-priority. Its monitoring capa-
bilities could be extended to support error monitoring during
run-time. The process, as deļ¬ned, does not consider safety
or run-time monitoring, but the goals of seeking to maximize
speciļ¬c properties of software are parallel to our own. We
would like to, in future work, adapt this method to execute
at run-time on an adaptive system so that attributes of the
system can be measured as they change.
Our work diļ¬€ers from the above in that we explicitly con-
sider safety-critical systems unlike [4] and [25]. We diļ¬€er
from [15] and [1] in that we believe the reconļ¬guration pro-
cess needs additional safeguards that will prevent faulty con-
ļ¬gurations that have been resolved from being redeployed
into the same context in which they failed. [15]ā€™s goal is
very similar to ours but it, as mentioned, lacks the focus
on additional reconļ¬guration guards. Each of the ideas ex-
pressed however could eventually be incorporated into our
work to strengthen the assurance case for system safety.
5. CONCLUSION
In this position paper, we explored several concerns that
have been raised by our research considering incorporating
dynamic adaptivity into safety-critical systems. Our interest
in applying dynamic adaptivity stems from the unique prop-
erties and behaviors that DAS oļ¬€er in regards to designing
and deploying safety-critical systems. One such possibility
explored in this paper is the utilization of dynamic adaptiv-
ity to increase a safety-critical systemā€™s reliability.
Even though DAS oļ¬€er many exciting possibilities for
safety-critical software, there are many unanswered ques-
tions. Testing of DAS, DAS representation and how DAS
handle faults that occur during run-time all raise concerns
that must be either mitigated or resolved. In future work,
we plan to further explore how DAS can be certiļ¬ed for
safety-critical environments. We are currently conducting
research to determine which, if any, current safety analysis
techniques could be adapted to use dynamic adaptive soft-
ware. There are several possibilities which we are still in the
process of exploring. We also plan to further explore what
role context plays when deciding if a conļ¬guration that has
previously failed should be deployed again, and we are inves-
tigating run-time testing of safety-critical DAS to mitigate
failures.
6. ACKNOWLEDGEMENTS
The work of the authors was funded by the National Sci-
ence Foundation (NSF) grant # 2008912. We also thank
those that have reviewed our work oļ¬€ering recommendations
for improvement.
7. REFERENCES
[1] R. Adler, M. FĀØorster, and M. Trapp. Determining
conļ¬guration probabilities of safety-critical adaptive
systems. In Advanced Information Networking and
Applications Workshops, 2007, AINAWā€™07. 21st
International Conference on, volume 2, pages 548ā€“555.
IEEE, 2007.
[2] N. Bencomo, P. Sawyer, G. S. Blair, and P. Grace.
Dynamically adaptive systems are product lines too:
Using model-driven techniques to capture dynamic
variability of adaptive systems. In SPLC (2), pages
23ā€“32, 2008.
[3] J. Bowen and V. Stavridou. Safety-critical systems,
formal methods and standards. Software Engineering
Journal, 8(4):189ā€“209, 1993.
120
[4] R. Calinescu, C. Ghezzi, M. Kwiatkowska, and
R. Mirandola. Self-adaptive software needs
quantitative veriļ¬cation at runtime. Communications
of the ACM, 55(9):69ā€“77, 2012.
[5] R. De Lemos, H. Giese, H. A. MĀØuller, M. Shaw,
J. Andersson, M. Litoiu, B. Schmerl, G. Tamura,
N. M. Villegas, T. Vogel, et al. Software engineering
for self-adaptive systems: A second research roadmap.
In Software Engineering for Self-Adaptive Systems II,
pages 1ā€“32. Springer, 2013.
[6] J. Delange and P. Feiler. Architecture fault modeling
with the aadl error-model annex. In Software
Engineering and Advanced Applications (SEAA), 2014
40th EUROMICRO Conference on, pages 361ā€“368.
IEEE, 2014.
[7] P. Dissaux, J. Bodeveix, M. Filali, P. Gauļ¬llet, and
F. Vernadat. Aadl behavioral annex. In Proceedings of
DASIA conference, Berlin, 2006.
[8] Ellidiss.com. Aadl inspector, 2016.
[9] E. EngstrĀØom and P. Runeson. Software product line
testingā€“a systematic mapping study. Information and
Software Technology, 53(1):2ā€“13, 2011.
[10] H. Feiler, B. Lewis, and S. Vestal. The SAE
architecture analysis and design language (AADL)
standard. In IEEE RTAS Workshop, 2003.
[11] P. Feiler. Product line modeling with aadl, 2005.
[12] P. H. Feiler, D. P. Gluch, and J. J. Hudak. The
architecture analysis & design language (AADL): An
introduction. Technical report, DTIC Document, 2006.
[13] B. J. Garvin, M. B. Cohen, and M. B. Dwyer. Using
feature locality: can we leverage history to avoid
failures during reconļ¬guration? In Proceedings of the
8th workshop on Assurances for self-adaptive systems,
pages 24ā€“33. ACM, 2011.
[14] M. Gerla, E.-K. Lee, G. Pau, and U. Lee. Internet of
vehicles: From intelligent grid to autonomous cars and
vehicular clouds. In Internet of Things (WF-IoT),
2014 IEEE World Forum on, pages 241ā€“246. IEEE,
2014.
[15] M. Gudemann, F. Ortmeier, and W. Reif. Safety and
dependability analysis of self-adaptive systems. In
Leveraging Applications of Formal Methods,
Veriļ¬cation and Validation, 2006. ISoLA 2006. Second
International Symposium on, pages 177ā€“184. IEEE,
2006.
[16] A. Hall. Seven myths of formal methods. Software,
IEEE, 7(5):11ā€“19, Sept 1990.
[17] J. C. Knight. Safety critical systems: challenges and
directions. In Software Engineering, 2002. ICSE 2002.
Proceedings of the 24rd International Conference on,
pages 547ā€“550. IEEE, 2002.
[18] B. Larson, J. Hatcliļ¬€, K. Fowler, and J. Delange.
Illustrating the aadl error modeling annex (v.2) using
a simple safety-critical medical device. In Proceedings
of the 2013 ACM SIGAda Annual Conference on High
Integrity Language Technology, HILT ā€™13, pages 65ā€“84,
New York, NY, USA, 2013. ACM.
[19] N. G. Leveson and P. R. Harvey. Software fault tree
analysis. Journal of Systems and Software,
3(2):173ā€“181, 1983.
[20] E. Musick. The 1992 london ambulance service
computer aided dispatch system failure.
[21] D. Niebuhr, A. Rausch, C. Klein, J. Reichmann, and
R. Schmid. Achieving dependable component bindings
in dynamic adaptive systems-a runtime testing
approach. In Self-Adaptive and Self-Organizing
Systems, 2009. SASOā€™09. Third IEEE International
Conference on, pages 186ā€“197. IEEE, 2009.
[22] K. Paulsson, M. HĀØubner, and J. Becker. Strategies to
on-line failure recovery in self-adaptive systems based
on dynamic and partial reconļ¬guration. In Adaptive
Hardware and Systems, 2006. AHS 2006. First
NASA/ESA Conference on, pages 288ā€“291. IEEE,
2006.
[23] R. R. Rajkumar, I. Lee, L. Sha, and J. Stankovic.
Cyber-physical systems: The next computing
revolution. In Proceedings of the 47th Design
Automation Conference, DAC ā€™10, pages 731ā€“736,
New York, NY, USA, 2010. ACM.
[24] A. J. Ramirez, A. C. Jensen, and B. H. C. Cheng. A
taxonomy of uncertainty for dynamically adaptive
systems. In Proceedings of the 7th International
Symposium on Software Engineering for Adaptive and
Self-Managing Systems, SEAMS ā€™12, pages 99ā€“108,
Piscataway, NJ, USA, 2012. IEEE Press.
[25] G. Tamura, N. M. Villegas, H. A. MĀØuller, J. P. Sousa,
B. Becker, G. Karsai, S. Mankovskii, M. Pezz`e,
W. SchĀØafer, L. Tahvildari, et al. Towards practical
runtime veriļ¬cation and validation of self-adaptive
software systems. In Software Engineering for
Self-Adaptive Systems II, pages 108ā€“132. Springer,
2013.
[26] J. Van Gurp, J. Bosch, and M. Svahnberg. On the
notion of variability in software product lines. In
Software Architecture, 2001. Proceedings. Working
IEEE/IFIP Conference on, pages 45ā€“54. IEEE, 2001.
[27] Wikipedia. Alaska airlines ļ¬‚ight 261, 2016.
[28] J. Zhang, H. J. Goldsby, and B. H. Cheng. Modular
veriļ¬cation of dynamically adaptive systems. In
Proceedings of the 8th ACM international conference
on Aspect-oriented software development, pages
161ā€“172. ACM, 2009.
121

More Related Content

What's hot

Ch11-Software Engineering 9
Ch11-Software Engineering 9Ch11-Software Engineering 9
Ch11-Software Engineering 9Ian Sommerville
Ā 
Ch10-Software Engineering 9
Ch10-Software Engineering 9Ch10-Software Engineering 9
Ch10-Software Engineering 9Ian Sommerville
Ā 
Critical Systems
Critical SystemsCritical Systems
Critical SystemsUsman Bin Saad
Ā 
A BRIEF PROGRAM ROBUSTNESS SURVEY
A BRIEF PROGRAM ROBUSTNESS SURVEYA BRIEF PROGRAM ROBUSTNESS SURVEY
A BRIEF PROGRAM ROBUSTNESS SURVEYijseajournal
Ā 
L7 Design For Recovery
L7 Design For RecoveryL7 Design For Recovery
L7 Design For RecoveryIan Sommerville
Ā 
An Investigation of Fault Tolerance Techniques in Cloud Computing
An Investigation of Fault Tolerance Techniques in Cloud ComputingAn Investigation of Fault Tolerance Techniques in Cloud Computing
An Investigation of Fault Tolerance Techniques in Cloud Computingijtsrd
Ā 
Ch13-Software Engineering 9
Ch13-Software Engineering 9Ch13-Software Engineering 9
Ch13-Software Engineering 9Ian Sommerville
Ā 
Safety specification (CS 5032 2012)
Safety specification (CS 5032 2012)Safety specification (CS 5032 2012)
Safety specification (CS 5032 2012)Ian Sommerville
Ā 
Dr Dev Kambhampati | Security Tenets for Life Critical Embedded Systems
Dr Dev Kambhampati | Security Tenets for Life Critical Embedded SystemsDr Dev Kambhampati | Security Tenets for Life Critical Embedded Systems
Dr Dev Kambhampati | Security Tenets for Life Critical Embedded SystemsDr Dev Kambhampati
Ā 
Control systems
Control systemsControl systems
Control systemsVo Quoc Hieu
Ā 
An analysis of software aging in cloud environment
An analysis of software aging in cloud environment  An analysis of software aging in cloud environment
An analysis of software aging in cloud environment IJECEIAES
Ā 
SRIS Classification_Copyright_ERL_2014
SRIS Classification_Copyright_ERL_2014SRIS Classification_Copyright_ERL_2014
SRIS Classification_Copyright_ERL_2014Jim Mateer MSc MIET MRAeS
Ā 
Introduction to Critical Systems Engineering (CS 5032 2012)
Introduction to Critical Systems Engineering (CS 5032 2012)Introduction to Critical Systems Engineering (CS 5032 2012)
Introduction to Critical Systems Engineering (CS 5032 2012)Ian Sommerville
Ā 
Failure analysis buisness impact-backup-archive
Failure analysis buisness impact-backup-archiveFailure analysis buisness impact-backup-archive
Failure analysis buisness impact-backup-archiveDavin Abraham
Ā 
Proactive cloud service assurance framework for fault remediation in cloud en...
Proactive cloud service assurance framework for fault remediation in cloud en...Proactive cloud service assurance framework for fault remediation in cloud en...
Proactive cloud service assurance framework for fault remediation in cloud en...IJECEIAES
Ā 
Mc calley pserc_final_report_s35_special_protection_schemes_dec_2010_nm_nsrc
Mc calley pserc_final_report_s35_special_protection_schemes_dec_2010_nm_nsrcMc calley pserc_final_report_s35_special_protection_schemes_dec_2010_nm_nsrc
Mc calley pserc_final_report_s35_special_protection_schemes_dec_2010_nm_nsrcNeil McNeill
Ā 

What's hot (18)

Ch11-Software Engineering 9
Ch11-Software Engineering 9Ch11-Software Engineering 9
Ch11-Software Engineering 9
Ā 
Ch14 resilience engineering
Ch14 resilience engineeringCh14 resilience engineering
Ch14 resilience engineering
Ā 
Ch10-Software Engineering 9
Ch10-Software Engineering 9Ch10-Software Engineering 9
Ch10-Software Engineering 9
Ā 
Critical Systems
Critical SystemsCritical Systems
Critical Systems
Ā 
50320130403010
5032013040301050320130403010
50320130403010
Ā 
A BRIEF PROGRAM ROBUSTNESS SURVEY
A BRIEF PROGRAM ROBUSTNESS SURVEYA BRIEF PROGRAM ROBUSTNESS SURVEY
A BRIEF PROGRAM ROBUSTNESS SURVEY
Ā 
L7 Design For Recovery
L7 Design For RecoveryL7 Design For Recovery
L7 Design For Recovery
Ā 
An Investigation of Fault Tolerance Techniques in Cloud Computing
An Investigation of Fault Tolerance Techniques in Cloud ComputingAn Investigation of Fault Tolerance Techniques in Cloud Computing
An Investigation of Fault Tolerance Techniques in Cloud Computing
Ā 
Ch13-Software Engineering 9
Ch13-Software Engineering 9Ch13-Software Engineering 9
Ch13-Software Engineering 9
Ā 
Safety specification (CS 5032 2012)
Safety specification (CS 5032 2012)Safety specification (CS 5032 2012)
Safety specification (CS 5032 2012)
Ā 
Dr Dev Kambhampati | Security Tenets for Life Critical Embedded Systems
Dr Dev Kambhampati | Security Tenets for Life Critical Embedded SystemsDr Dev Kambhampati | Security Tenets for Life Critical Embedded Systems
Dr Dev Kambhampati | Security Tenets for Life Critical Embedded Systems
Ā 
Control systems
Control systemsControl systems
Control systems
Ā 
An analysis of software aging in cloud environment
An analysis of software aging in cloud environment  An analysis of software aging in cloud environment
An analysis of software aging in cloud environment
Ā 
SRIS Classification_Copyright_ERL_2014
SRIS Classification_Copyright_ERL_2014SRIS Classification_Copyright_ERL_2014
SRIS Classification_Copyright_ERL_2014
Ā 
Introduction to Critical Systems Engineering (CS 5032 2012)
Introduction to Critical Systems Engineering (CS 5032 2012)Introduction to Critical Systems Engineering (CS 5032 2012)
Introduction to Critical Systems Engineering (CS 5032 2012)
Ā 
Failure analysis buisness impact-backup-archive
Failure analysis buisness impact-backup-archiveFailure analysis buisness impact-backup-archive
Failure analysis buisness impact-backup-archive
Ā 
Proactive cloud service assurance framework for fault remediation in cloud en...
Proactive cloud service assurance framework for fault remediation in cloud en...Proactive cloud service assurance framework for fault remediation in cloud en...
Proactive cloud service assurance framework for fault remediation in cloud en...
Ā 
Mc calley pserc_final_report_s35_special_protection_schemes_dec_2010_nm_nsrc
Mc calley pserc_final_report_s35_special_protection_schemes_dec_2010_nm_nsrcMc calley pserc_final_report_s35_special_protection_schemes_dec_2010_nm_nsrc
Mc calley pserc_final_report_s35_special_protection_schemes_dec_2010_nm_nsrc
Ā 

Similar to Fault tolerance on cloud computing

Hp2513711375
Hp2513711375Hp2513711375
Hp2513711375IJERA Editor
Ā 
In what ways do you think the Elaboration Likelihood Model applies.docx
In what ways do you think the Elaboration Likelihood Model applies.docxIn what ways do you think the Elaboration Likelihood Model applies.docx
In what ways do you think the Elaboration Likelihood Model applies.docxjaggernaoma
Ā 
ANALYSIS OF SOFTWARE SECURITY TESTING TECHNIQUES IN CLOUD COMPUTING
ANALYSIS OF SOFTWARE SECURITY TESTING TECHNIQUES IN CLOUD COMPUTINGANALYSIS OF SOFTWARE SECURITY TESTING TECHNIQUES IN CLOUD COMPUTING
ANALYSIS OF SOFTWARE SECURITY TESTING TECHNIQUES IN CLOUD COMPUTINGEditor IJMTER
Ā 
A survey of online failure prediction methods
A survey of online failure prediction methodsA survey of online failure prediction methods
A survey of online failure prediction methodsunyil96
Ā 
Zue2015Uncertainties
Zue2015UncertaintiesZue2015Uncertainties
Zue2015UncertaintiesWilliam Chipman
Ā 
High dependability of the automated systems
High dependability of the automated systemsHigh dependability of the automated systems
High dependability of the automated systemsAlan Tatourian
Ā 
IN-DEPTH ANALYSIS AND SYSTEMATIC LITERATURE REVIEW ON RISK BASED ACCESS CONTR...
IN-DEPTH ANALYSIS AND SYSTEMATIC LITERATURE REVIEW ON RISK BASED ACCESS CONTR...IN-DEPTH ANALYSIS AND SYSTEMATIC LITERATURE REVIEW ON RISK BASED ACCESS CONTR...
IN-DEPTH ANALYSIS AND SYSTEMATIC LITERATURE REVIEW ON RISK BASED ACCESS CONTR...ijcseit
Ā 
PLANT LEAF DISEASES IDENTIFICATION IN DEEP LEARNING
PLANT LEAF DISEASES IDENTIFICATION IN DEEP LEARNINGPLANT LEAF DISEASES IDENTIFICATION IN DEEP LEARNING
PLANT LEAF DISEASES IDENTIFICATION IN DEEP LEARNINGCSEIJJournal
Ā 
Privacy Protection in Distributed Industrial System
Privacy Protection in Distributed Industrial SystemPrivacy Protection in Distributed Industrial System
Privacy Protection in Distributed Industrial Systemiosrjce
Ā 
9fcfd50a69d9647585
9fcfd50a69d96475859fcfd50a69d9647585
9fcfd50a69d9647585Mowaten Masry
Ā 
Dynamic RWX ACM Model Optimizing the Risk on Real Time Unix File System
Dynamic RWX ACM Model Optimizing the Risk on Real Time Unix File SystemDynamic RWX ACM Model Optimizing the Risk on Real Time Unix File System
Dynamic RWX ACM Model Optimizing the Risk on Real Time Unix File SystemRadita Apriana
Ā 
Best Practices for Microsoft-Based Plant Software Address Reliability, Cost, ...
Best Practices for Microsoft-Based Plant Software Address Reliability, Cost, ...Best Practices for Microsoft-Based Plant Software Address Reliability, Cost, ...
Best Practices for Microsoft-Based Plant Software Address Reliability, Cost, ...ARC Advisory Group
Ā 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...ijceronline
Ā 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...ijceronline
Ā 
Cloud Storage and Security
Cloud Storage and SecurityCloud Storage and Security
Cloud Storage and SecurityShashank Srivastava
Ā 
the_role_of_resilience_data_in_ensuring_cloud_security.pptx
the_role_of_resilience_data_in_ensuring_cloud_security.pptxthe_role_of_resilience_data_in_ensuring_cloud_security.pptx
the_role_of_resilience_data_in_ensuring_cloud_security.pptxsarah david
Ā 
How Enterprise Architects Can Build Resilient, Reliable Software-Based Health...
How Enterprise Architects Can Build Resilient, Reliable Software-Based Health...How Enterprise Architects Can Build Resilient, Reliable Software-Based Health...
How Enterprise Architects Can Build Resilient, Reliable Software-Based Health...Cognizant
Ā 
SMRP 24th Conf Paper - Vextec -J Carter
SMRP 24th Conf Paper - Vextec -J CarterSMRP 24th Conf Paper - Vextec -J Carter
SMRP 24th Conf Paper - Vextec -J Carterjcarter1972
Ā 

Similar to Fault tolerance on cloud computing (20)

Hp2513711375
Hp2513711375Hp2513711375
Hp2513711375
Ā 
In what ways do you think the Elaboration Likelihood Model applies.docx
In what ways do you think the Elaboration Likelihood Model applies.docxIn what ways do you think the Elaboration Likelihood Model applies.docx
In what ways do you think the Elaboration Likelihood Model applies.docx
Ā 
ANALYSIS OF SOFTWARE SECURITY TESTING TECHNIQUES IN CLOUD COMPUTING
ANALYSIS OF SOFTWARE SECURITY TESTING TECHNIQUES IN CLOUD COMPUTINGANALYSIS OF SOFTWARE SECURITY TESTING TECHNIQUES IN CLOUD COMPUTING
ANALYSIS OF SOFTWARE SECURITY TESTING TECHNIQUES IN CLOUD COMPUTING
Ā 
A survey of online failure prediction methods
A survey of online failure prediction methodsA survey of online failure prediction methods
A survey of online failure prediction methods
Ā 
Zue2015Uncertainties
Zue2015UncertaintiesZue2015Uncertainties
Zue2015Uncertainties
Ā 
High dependability of the automated systems
High dependability of the automated systemsHigh dependability of the automated systems
High dependability of the automated systems
Ā 
IN-DEPTH ANALYSIS AND SYSTEMATIC LITERATURE REVIEW ON RISK BASED ACCESS CONTR...
IN-DEPTH ANALYSIS AND SYSTEMATIC LITERATURE REVIEW ON RISK BASED ACCESS CONTR...IN-DEPTH ANALYSIS AND SYSTEMATIC LITERATURE REVIEW ON RISK BASED ACCESS CONTR...
IN-DEPTH ANALYSIS AND SYSTEMATIC LITERATURE REVIEW ON RISK BASED ACCESS CONTR...
Ā 
PLANT LEAF DISEASES IDENTIFICATION IN DEEP LEARNING
PLANT LEAF DISEASES IDENTIFICATION IN DEEP LEARNINGPLANT LEAF DISEASES IDENTIFICATION IN DEEP LEARNING
PLANT LEAF DISEASES IDENTIFICATION IN DEEP LEARNING
Ā 
Society of Petroleum Engineers : Model Based Engineering
Society of Petroleum Engineers : Model Based EngineeringSociety of Petroleum Engineers : Model Based Engineering
Society of Petroleum Engineers : Model Based Engineering
Ā 
Privacy Protection in Distributed Industrial System
Privacy Protection in Distributed Industrial SystemPrivacy Protection in Distributed Industrial System
Privacy Protection in Distributed Industrial System
Ā 
F017223742
F017223742F017223742
F017223742
Ā 
9fcfd50a69d9647585
9fcfd50a69d96475859fcfd50a69d9647585
9fcfd50a69d9647585
Ā 
Dynamic RWX ACM Model Optimizing the Risk on Real Time Unix File System
Dynamic RWX ACM Model Optimizing the Risk on Real Time Unix File SystemDynamic RWX ACM Model Optimizing the Risk on Real Time Unix File System
Dynamic RWX ACM Model Optimizing the Risk on Real Time Unix File System
Ā 
Best Practices for Microsoft-Based Plant Software Address Reliability, Cost, ...
Best Practices for Microsoft-Based Plant Software Address Reliability, Cost, ...Best Practices for Microsoft-Based Plant Software Address Reliability, Cost, ...
Best Practices for Microsoft-Based Plant Software Address Reliability, Cost, ...
Ā 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
Ā 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
Ā 
Cloud Storage and Security
Cloud Storage and SecurityCloud Storage and Security
Cloud Storage and Security
Ā 
the_role_of_resilience_data_in_ensuring_cloud_security.pptx
the_role_of_resilience_data_in_ensuring_cloud_security.pptxthe_role_of_resilience_data_in_ensuring_cloud_security.pptx
the_role_of_resilience_data_in_ensuring_cloud_security.pptx
Ā 
How Enterprise Architects Can Build Resilient, Reliable Software-Based Health...
How Enterprise Architects Can Build Resilient, Reliable Software-Based Health...How Enterprise Architects Can Build Resilient, Reliable Software-Based Health...
How Enterprise Architects Can Build Resilient, Reliable Software-Based Health...
Ā 
SMRP 24th Conf Paper - Vextec -J Carter
SMRP 24th Conf Paper - Vextec -J CarterSMRP 24th Conf Paper - Vextec -J Carter
SMRP 24th Conf Paper - Vextec -J Carter
Ā 

More from www.pixelsolutionbd.com

Adaptive fault tolerance_in_real_time_cloud_computing
Adaptive fault tolerance_in_real_time_cloud_computingAdaptive fault tolerance_in_real_time_cloud_computing
Adaptive fault tolerance_in_real_time_cloud_computingwww.pixelsolutionbd.com
Ā 
Software rejuvenation based fault tolerance
Software rejuvenation based fault toleranceSoftware rejuvenation based fault tolerance
Software rejuvenation based fault tolerancewww.pixelsolutionbd.com
Ā 
Privacy preserving secure data exchange in mobile p2 p
Privacy preserving secure data exchange in mobile p2 pPrivacy preserving secure data exchange in mobile p2 p
Privacy preserving secure data exchange in mobile p2 pwww.pixelsolutionbd.com
Ā 
Adaptive fault tolerance in cloud survey
Adaptive fault tolerance in cloud surveyAdaptive fault tolerance in cloud survey
Adaptive fault tolerance in cloud surveywww.pixelsolutionbd.com
Ā 
Adaptive fault tolerance in real time cloud_computing
Adaptive fault tolerance in real time cloud_computingAdaptive fault tolerance in real time cloud_computing
Adaptive fault tolerance in real time cloud_computingwww.pixelsolutionbd.com
Ā 
Protecting from transient failures in cloud deployments
Protecting from transient failures in cloud deploymentsProtecting from transient failures in cloud deployments
Protecting from transient failures in cloud deploymentswww.pixelsolutionbd.com
Ā 
Protecting from transient failures in cloud microsoft azure deployments
Protecting from transient failures in cloud microsoft azure deploymentsProtecting from transient failures in cloud microsoft azure deployments
Protecting from transient failures in cloud microsoft azure deploymentswww.pixelsolutionbd.com
Ā 
Fault tolerance on cloud computing
Fault tolerance on cloud computingFault tolerance on cloud computing
Fault tolerance on cloud computingwww.pixelsolutionbd.com
Ā 
Fault tolerance on cloud computing
Fault tolerance on cloud computingFault tolerance on cloud computing
Fault tolerance on cloud computingwww.pixelsolutionbd.com
Ā 
Fault tolerance on cloud computing
Fault tolerance on cloud computingFault tolerance on cloud computing
Fault tolerance on cloud computingwww.pixelsolutionbd.com
Ā 
Fault tolerance on cloud computing
Fault tolerance on cloud computingFault tolerance on cloud computing
Fault tolerance on cloud computingwww.pixelsolutionbd.com
Ā 
Fault tolerance on cloud computing
Fault tolerance on cloud computingFault tolerance on cloud computing
Fault tolerance on cloud computingwww.pixelsolutionbd.com
Ā 
Fault tolerance on cloud computing
Fault tolerance on cloud computingFault tolerance on cloud computing
Fault tolerance on cloud computingwww.pixelsolutionbd.com
Ā 
Real time service oriented cloud computing
Real time service oriented cloud computingReal time service oriented cloud computing
Real time service oriented cloud computingwww.pixelsolutionbd.com
Ā 
Performance, fault tolerance and scalability analysis of virtual infrastructu...
Performance, fault tolerance and scalability analysis of virtual infrastructu...Performance, fault tolerance and scalability analysis of virtual infrastructu...
Performance, fault tolerance and scalability analysis of virtual infrastructu...www.pixelsolutionbd.com
Ā 
Comprehensive analysis of performance, fault tolerance and scalability in gri...
Comprehensive analysis of performance, fault tolerance and scalability in gri...Comprehensive analysis of performance, fault tolerance and scalability in gri...
Comprehensive analysis of performance, fault tolerance and scalability in gri...www.pixelsolutionbd.com
Ā 
A task based fault-tolerance mechanism to hierarchical master worker with div...
A task based fault-tolerance mechanism to hierarchical master worker with div...A task based fault-tolerance mechanism to hierarchical master worker with div...
A task based fault-tolerance mechanism to hierarchical master worker with div...www.pixelsolutionbd.com
Ā 
Privacy preserving secure data exchange in mobile P2P
Privacy preserving secure data exchange in mobile P2PPrivacy preserving secure data exchange in mobile P2P
Privacy preserving secure data exchange in mobile P2Pwww.pixelsolutionbd.com
Ā 

More from www.pixelsolutionbd.com (20)

Adaptive fault tolerance_in_real_time_cloud_computing
Adaptive fault tolerance_in_real_time_cloud_computingAdaptive fault tolerance_in_real_time_cloud_computing
Adaptive fault tolerance_in_real_time_cloud_computing
Ā 
Software rejuvenation based fault tolerance
Software rejuvenation based fault toleranceSoftware rejuvenation based fault tolerance
Software rejuvenation based fault tolerance
Ā 
Privacy preserving secure data exchange in mobile p2 p
Privacy preserving secure data exchange in mobile p2 pPrivacy preserving secure data exchange in mobile p2 p
Privacy preserving secure data exchange in mobile p2 p
Ā 
Adaptive fault tolerance in cloud survey
Adaptive fault tolerance in cloud surveyAdaptive fault tolerance in cloud survey
Adaptive fault tolerance in cloud survey
Ā 
Adaptive fault tolerance in real time cloud_computing
Adaptive fault tolerance in real time cloud_computingAdaptive fault tolerance in real time cloud_computing
Adaptive fault tolerance in real time cloud_computing
Ā 
Protecting from transient failures in cloud deployments
Protecting from transient failures in cloud deploymentsProtecting from transient failures in cloud deployments
Protecting from transient failures in cloud deployments
Ā 
Protecting from transient failures in cloud microsoft azure deployments
Protecting from transient failures in cloud microsoft azure deploymentsProtecting from transient failures in cloud microsoft azure deployments
Protecting from transient failures in cloud microsoft azure deployments
Ā 
Speaking
SpeakingSpeaking
Speaking
Ā 
Fault tolerance on cloud computing
Fault tolerance on cloud computingFault tolerance on cloud computing
Fault tolerance on cloud computing
Ā 
Fault tolerance on cloud computing
Fault tolerance on cloud computingFault tolerance on cloud computing
Fault tolerance on cloud computing
Ā 
Fault tolerance on cloud computing
Fault tolerance on cloud computingFault tolerance on cloud computing
Fault tolerance on cloud computing
Ā 
Fault tolerance on cloud computing
Fault tolerance on cloud computingFault tolerance on cloud computing
Fault tolerance on cloud computing
Ā 
Fault tolerance on cloud computing
Fault tolerance on cloud computingFault tolerance on cloud computing
Fault tolerance on cloud computing
Ā 
Fault tolerance on cloud computing
Fault tolerance on cloud computingFault tolerance on cloud computing
Fault tolerance on cloud computing
Ā 
Cyber Physical System
Cyber Physical SystemCyber Physical System
Cyber Physical System
Ā 
Real time service oriented cloud computing
Real time service oriented cloud computingReal time service oriented cloud computing
Real time service oriented cloud computing
Ā 
Performance, fault tolerance and scalability analysis of virtual infrastructu...
Performance, fault tolerance and scalability analysis of virtual infrastructu...Performance, fault tolerance and scalability analysis of virtual infrastructu...
Performance, fault tolerance and scalability analysis of virtual infrastructu...
Ā 
Comprehensive analysis of performance, fault tolerance and scalability in gri...
Comprehensive analysis of performance, fault tolerance and scalability in gri...Comprehensive analysis of performance, fault tolerance and scalability in gri...
Comprehensive analysis of performance, fault tolerance and scalability in gri...
Ā 
A task based fault-tolerance mechanism to hierarchical master worker with div...
A task based fault-tolerance mechanism to hierarchical master worker with div...A task based fault-tolerance mechanism to hierarchical master worker with div...
A task based fault-tolerance mechanism to hierarchical master worker with div...
Ā 
Privacy preserving secure data exchange in mobile P2P
Privacy preserving secure data exchange in mobile P2PPrivacy preserving secure data exchange in mobile P2P
Privacy preserving secure data exchange in mobile P2P
Ā 

Recently uploaded

HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
Ā 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx959SahilShah
Ā 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
Ā 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
Ā 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learningmisbanausheenparvam
Ā 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidNikhilNagaraju
Ā 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
Ā 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxPoojaBan
Ā 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
Ā 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
Ā 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
Ā 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ
Ā 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfAsst.prof M.Gokilavani
Ā 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxbritheesh05
Ā 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
Ā 
šŸ”9953056974šŸ”!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
šŸ”9953056974šŸ”!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...šŸ”9953056974šŸ”!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
šŸ”9953056974šŸ”!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...9953056974 Low Rate Call Girls In Saket, Delhi NCR
Ā 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxvipinkmenon1
Ā 

Recently uploaded (20)

HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
Ā 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
Ā 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
Ā 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
Ā 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learning
Ā 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
Ā 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Ā 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
Ā 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
Ā 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptx
Ā 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Ā 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
Ā 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
Ā 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
Ā 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
Ā 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptx
Ā 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
Ā 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Ā 
šŸ”9953056974šŸ”!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
šŸ”9953056974šŸ”!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...šŸ”9953056974šŸ”!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
šŸ”9953056974šŸ”!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
Ā 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptx
Ā 

Fault tolerance on cloud computing

  • 1. Using Dynamic Adaptive Systems in Safety-Critical Domains Ethan T. McGee Clemson University McAdams Hall Clemson, SC 29632 etmcgee@clemson.edu John D. McGregor Clemson University McAdams Hall Clemson, SC 29632 johnmc@clemson.edu ABSTRACT The development of safety-critical Cyber-Physical Systems (CPS) is expanding due to the Internet of Thingsā€™ promise to make high-integrity applications and services part of ev- eryday life. This expansion is seen in the dependencies some connected vehicles have on cloud services that provide guid- ance and accident avoidance / detection features. Such sys- tems are safety-critical since failure could result in serious injury or death. Due to the severe consequences of fail- ure, fault-tolerance, reliability and dependability should be primary driving qualities in the design and development of these systems. However, the cost of the analysis, evaluation and certiļ¬cation activities needed to ensure that the possi- bility of failure has been suļ¬ƒciently mitigated is signiļ¬cantly higher than the cost of developing traditional software. Our group is exploring the addition of dynamic adap- tive capabilities to safety-critical systems. We postulate that dynamic adaptivity could provide several enhancements to safety-critical systems. It would allow systems to rea- son about the environment within which they are sited and about their internal operation enabling decision making that is context-speciļ¬c and appropriately prioritized. However, the addition of adaptivity with the associated overhead of reasoning is not without drawbacks particularly when hard real-time safety-critical systems are involved. In this brief position paper, we explore some of the questions and con- cerns that are raised when dynamic adaptive behavior is introduced into safety-critical systems as well as ways that the Architecture Analysis & Design Language (AADL) can be used to model / analyze such systems. CCS Concepts ā€¢Software and its engineering ā†’ Software organiza- tion and properties; Designing software; Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for proļ¬t or commercial advantage and that copies bear this notice and the full cita- tion on the ļ¬rst page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or re- publish, to post on servers or to redistribute to lists, requires prior speciļ¬c permission and/or a fee. Request permissions from permissions@acm.org. SEAMSā€™16, May 16-17, 2016, Austin, TX, USA Ā© 2016 ACM. ISBN 978-1-4503-4187-5/16/05...$15.00 DOI: http://dx.doi.org/10.1145/2897053.2897062 Keywords Safety Critical Systems; Dynamic Adaptive Systems; Soft- ware Product Lines; Dynamic Software Product Lines 1. INTRODUCTION Cyber-Physical Systems (CPS) and the Intenet of Things (IoT) are seeing expanded use in a wide variety of domains. They are enabling the automation of common household de- vices, and they are also playing a major role in the imple- mentation of connected / autonomous vehicles [23]. Pro- totype autonomous vehicles are relying on infrastructures powered by the IoT [14] for features including Global Po- sitioning System (GPS) navigation, adaptive cruise control and automated accident response / detection among others. Especially for autonomous vehicles, these systems would be considered safety-critical, where system failure could result in serious injury or death. We consider adding dynamic adaptive capabilities to safety-critical systems as a way to extend the life, deploy-ability and functionality of the sys- tems. This poses signiļ¬cant challenges to the analysis and certiļ¬cation that systems are safe. Dynamic Adaptive Systems (DAS) are systems that mon- itor select properties of their environment and reason about the measured values in order to determine what behaviors the system should adopt in order to best operate in the de- tected environment [24]. Introducing dynamic adaptive ca- pabilities into safety-critical systems would address several issues. First, it could increase the fault-tolerance of safety-critical systems. In many safety-critical systems, if the system fails in an unexpected or unforeseen way, there is generally no prescribed method of handling this failure. Since DAS in- herently include a degree of self-monitoring, provided the correct system properties are monitored, it would be possi- ble to determine that the system has failed in the current conļ¬guration. Systems could also provide a fail-safe con- ļ¬guration that could be transitioned to in the event of a failure. Second, it could increase the deploy-ability of safety-critical systems. Since DAS monitor and adapt to their environ- ment, it would be possible to build a safety-critical system that could support multiple environments rather than build- ing multiple systems, one per environment. Doing so would greatly reduce development and certiļ¬cation costs as only one, not multiple, system would be considered. Finally, the inclusion of dynamic adaptivity in safety- critical software could allow it to respond to unexpected situations in a manner that prevents loss of life. Alaska Air- 2016 IEEE/ACM 11th International Symposium on Software Engineering for Adaptive and Self-Managing Systems 115
  • 2. lines Flight 261 was a ļ¬‚ight that crashed killing 88. The crash was caused by a screw sheering in such a way that the ļ¬‚ight control module which controls pitch was not oriented correctly [27]. If this system had been able to correctly de- tect its current orientation, the software of the ļ¬‚ight control module could have reoriented itself, allowing the plane to land safely. Despite the beneļ¬ts possible, adaptive behavior signiļ¬- cantly hinders safety analysis and certiļ¬cation. At the cur- rent time, the safety analysis methods used for adaptive sys- tems are formal methods [28, 15], which are generally advo- cated to be used in addition to reasoning-based safety anal- ysis not standalone [16, 3]. Formal methods are generally more diļ¬ƒcult to understand, modify and maintain [16, 3], and mathematically-based reasoning techniques, like Soft- ware Fault Tree Analysis (SFTA) [19], are easier to con- struct while still maintaining the mathematical rigor of for- mal methods. In our work, we use reasoning-based tech- niques over formal methods due to the diļ¬ƒculty involved in building a correct mathematical speciļ¬cation of the system needed for evaluation by formal methods. The speciļ¬cation requires specialized training in mathematical proofs / spec- iļ¬cation to build and, like any large development artifact, can be prone to errors. Reasoning based techniques have no such constraint. In this position paper, we explore adding dynamic adap- tive capabilities to safety-critical systems. The primary con- tributions of this paper are as follows: ā€¢ We consider the diļ¬ƒculties of testing dynamic adaptive software products due to the large number of possible run-time conļ¬gurations. ā€¢ We consider the use of a partially-tested DAS in a safety-critical environment and how the use of moni- toring and dynamic reconļ¬guration could be used as assurance for the partially tested system. ā€¢ We consider the use of the Architecture Analysis & De- sign Language (AADL) [12] as a method of designing, analyzing and modeling DAS. For each item, we will discuss our in-progress research that provides potential solutions, and we also introduce questions and concerns arising from each topic that are open work. The remainder of this paper is structured as follows. Sec- tion 2 contains the background information for this work, and section 3 contains issues that we have found preventing adaptive designs from being used in safety-critical systems as well as work we are conducting towards providing a so- lution for the issue. Section 4 contains related work, and ļ¬nally, section 5 concludes this position paper. 2. BACKGROUND We now provide the information necessary for understand- ing the work that we present. We provide overviews of safety-critical systems and AADL. We also give an overview of variation management. 2.1 Safety-Critical Systems Safety-critical systems are systems where failure can result in serious injury, loss of life or property / environmental damage. These devices have been employed in the domains of aviation and transportation as well as numerous other domains including medical and nuclear energy [17]. Such systems are becoming more commonplace, particularly as humans are increasingly relying on machines for situations, such as autonomous vehicles, where humans cannot always react quickly enough. Safety-critical systems are designed and developed diļ¬€er- ently than traditional software systems. They require exten- sive analysis, design and testing which all contribute assets to the certiļ¬cation process, and an assurance case to show that potential failures have been suļ¬ƒciently mitigated. The necessity of extensive work both initially in design and later in testing is owed to the fact that the certiļ¬cation process is, in general, expensive and long. Any errors found during the certiļ¬cation process require that the software be corrected and resubmitted starting the certiļ¬cation process over from the beginning. Safety-critical systems have diļ¬€erent development con- cerns than traditional software, especially since human lives are at risk. Unintended behavior must be prevented from emerging when separately developed components are brought together. An example of such emergent behavior can be seen in the failure of the London ambulance dispatch system in 1992. The deployed system worked initially, but a combi- nation of interactions caused the system to fail increasingly frequently throughout the day which resulted in service de- lays and up to 46 deaths [20]. In addition to extra work during development, additional work is needed during re- quirements analysis to prevent incorrect assumptions about the software. An example of this type of failure is the Mars Climate Orbiter which crashed because diļ¬€erent units of measurement were employed by the ground and on-board control systems [17]. These are only a few examples of the additional concerns while developing safety-critical software. 2.2 Architecture Analysis & Design Language AADL is an architecture representation language and anal- ysis tool designed for embedded system technologies. It is used to represent both hardware and software architec- tures [12] and is an approved standard of the SAE [10]. AADL is a highly extensible language structured as a set of annexes around a core language. As an example, the behavior annex provides the ability to specify the runtime behavior of a component. Other annexes provide support for veriļ¬cation, error speciļ¬cation and requirements among oth- ers. AADLā€™s Integrated Development Environment (IDE), OSATE, also provides tools for analyzing and verifying the architecture speciļ¬cation. Some tools evaluate only the ar- chitecture speciļ¬cation while others utilize information from the annexes to aid analysis. AADL has been used as a method of representing SPL, a methodology used to build DAS [2], as its rich typing and high extensibility support the intricacies involved in model- ing SPL. One such feature is that of AADLā€™s extends key- word which can be used to represent the variation / variation point structure used in product lines. Extends, an example of which is shown in ļ¬gure 1, is used to denote a system that inherits properties from a parent system. The child system will receive the features (inputs and outputs), subcompo- nents, connections and annexes of the parent. Children are allowed to add to or to reļ¬ne the parentā€™s structure, but they are not allowed override the structure of the parent. This provides an ideal representation for SPL variation points / variations as variation points are representations of ā€œholesā€ 116
  • 3. to be ļ¬lled. The ā€œholeā€ is normally represented as an object that contains input / output types or other structural infor- mation. Variants extend the variation pointā€™s ā€œcontractā€ to ensure that they can be properly instantiated in the varia- tion point. system my_variation_point end my_variation_point; system my_variation extends my_variation_point end my_variation; Figure 1: AADL Extends Example 2.3 Variation Points & Variations SPL is a development methodology that decreases the cost associated with development by allowing developers to sep- arate out a set of common assets that are shared across a family of products. It allows developers to more intensely fo- cus on product-speciļ¬c issues rather than focusing on shared assets [26]. The shared assets are morphed into what is known as a product family architecture. This architecture represents all common functionality to the entire suite of products, but the family architecture containsā€œholesā€known as variation points. In order for a new product to be created within the context of the product line, each variation point must be substituted with an implemented variant [26]. This instantiation of a variation point can happen at any time during the product life-cycle, including runtime. Variations and variation points are also used by DPSL, where instanti- ation of a variation point does not necessarily lead to a new product but could lead to new run-time conļ¬guration. Variation points possess two primary properties: state and variant addition state [26]. We add to this list the variation pointā€™s type [5]. The state of a variant can be one of three mutually exclusive values: ā€¢ Implicit - These are variation points that have not yet been speciļ¬cally created in the architecture. They represent uncertainty or variability that has not yet been decided and has not yet been explicitly left as a decision to be bound by the product. These types of variation points tend to be created early in the project life-cycle and are migrated to designed variation points as the project progresses. ā€¢ Designed - These are variation points where the de- sign decision has been deliberately left unanswered. ā€¢ Bound - A variation point that has been replaced by a variant. [26] The second property, variant addition state, deals with whether it is possible to add new variants to the system. This property can assume one of two values: ā€¢ Open - An open variation point is one that can still have additional variants added to its list of potential variants. ā€¢ Closed - A closed variation point cannot have any further variants added to its list of potential variants. [26] The ļ¬nal property is the variant type. This property can also assume one of two values: ā€¢ Static - A static variation point is a variation that, once bound, cannot be rebound. This binding occurs as the result of a decision outside the system. ā€¢ Dynamic - A dynamic variation point is a variation point that can be rebound, even after binding. This binding usually, but not always, occurs as a result of a decision made autonomously by the system itself. [5] Each of these properties can be represented in the fea- ture tree of the product line. With AADL, it is possible to create a property set that will also allow these properties to be expressed as part of the AADL model which in turn allows variation point properties to be evaluated during ar- chitecture analysis. 3. ISSUES Our motivation for undertaking this work is to explore how safety-critical systems can beneļ¬t from the work of the dynamic adaptive community. The intention is to build safety-critical systems that are capable of operating opti- mally in a wide range of situations, especially environments that threaten successful execution. To provide a motivat- ing example, consider that some autonomous vehicles rely on external services provided by the IoT. These external services could provide non-safety-critical features like enter- tainment, but they could also provide safety-critical features such as accident alerts, road condition warnings and navi- gation. Each year as new vehicles are released, the new vehicles will likely rely on new, upgraded versions of each service that may not yet be available in all areas. Much as cell phones are capable of adapting to weak cellular signals, it is desirable for these vehicles to be both capable of de- tecting and adapting to the absence of the services on which they rely. Despite the beneļ¬ts that DAS oļ¬€er to safety-critical sys- tems, there are concerns that hinder the composition of the disciplines. In particular, we present three concerns: ā€¢ What constitutesā€œsuļ¬ƒcient pre-deployment testingā€of a safety-critical DAS? (discussed in section 3.1) ā€¢ What is the nature of the failures that can arise from executing untested conļ¬gurations containing faults and how can they be mitigated? (discussed in section 3.2) ā€¢ How can safety-criticality be incorporated into the de- sign of DAS? (discussed in section 3.3) For each of these concerns, we will discuss the concern then provide work our group is undertaking that we postulate will provide a resolution to the stated concern. Throughout our discussion, we will use standard terminol- ogy from SPL. We use variation point to designate a decision that has been deliberately left undecided. We use the term variant to represent a candidate decision that can be bound to, or swapped into, a variation point. For the purposes of this discussion, we assume that the variation point is dy- namic and can be bound and rebound at run-time, unless explicitly stated otherwise. We will use the term context to mean the current operating environment, or the current set of values measured by the system. Finally, we use conļ¬gu- ration to deļ¬ne the current bindings for the set of variation points. Since we are dealing with dynamic variation points, we also assume, unless explicitly stated otherwise, that the 117
  • 4. conļ¬guration can be changed autonomously by the system at run-time. 3.1 Pre-deployment Testing of Safety-Critical DAS A single dynamically adaptive system deļ¬nes multiple prod- uct conļ¬gurations and therefore can be represented and man- aged using SPL methods [2]. As such they share many of the same concerns with regards to testability. The shared concerns include: ā€¢ Large Number of Test Cases - As additional po- tential variants for the systemā€™s variation points are identiļ¬ed, the number of possible run-time conļ¬gura- tions increases combinatorially. ā€¢ Reusable Components - Variants could be reused between adaptive systems, but which artifacts of the variants (testing results, test cases, certiļ¬cation assets) are to be shared and how much they will reduce the eļ¬€ort in testing the new system is not well understood. ā€¢ Variability - SPL, and subsequently DAS, have two types of variation points, static and dynamic. How testing the various types of variation points should be handled or if they should be handled diļ¬€erently is not known. [9] Our work focuses primarily on the ļ¬rst concern. For a given safety-critical SPL, one approach to ensuring suļ¬ƒcient test coverage would be to automate the testing of all possible combinations. While this method may be acceptable in some cases, there also exist cases where this is not an option. One such example would be a situation where there is not enough time for the test suite to run before the system is to be deployed. In such situations, it is necessary to determine which tests will capture the largest set of faults in the system. One method of reducing the number of potential test cases would involve ranking the conļ¬gurations of the system by the likelihood that they will be used. After ranking, the top k conļ¬gurations could be selected for testing based on how much time is available for testing and how long it takes to test a single conļ¬guration. This method was employed by [1]. Our group is currently using an additional complementary technique based on the error ontology given in the Error An- nex to AADL [6]. We produce models of each component of the system, and we augment each model with the ontology of errors that can propagate from that component. When a conļ¬guration of the system is composed from the models of each component, the diļ¬€erent error propagations of each conļ¬guration can be examined. We postulate that this in- formation can be leveraged in order to determine which tests should be run against a conļ¬guration. Consider that we have a variation point, A, with two vari- ants, Aā€™ and Aā€. A can propagate errors BadValue and Mu- texError. This means that the variant could produce an in- correct value or that it could produce an error due to threads improperly using a mutex. Aā€ can propagate errors Bad- Value. It does not propagate MutexError as it, unlike Aā€™, does not use multiple threads. Based on this information, we determine that it is not necessary to run multi-threading or mutex tests against conļ¬gurations of the system in which Aā€ has been swapped in for variation point A while it is necessary to run tests for multi-threading / mutexes against conļ¬gurations that use Aā€™. By examining the error propaga- tions, we can not only limit our testing to the conļ¬gurations that are most likely to be used, but we can also limit our tests based on the types of errors that each conļ¬guration can produce. Our hypothesis is this will allow us to test a greater number of conļ¬gurations than if we ran every test against a conļ¬guration. 3.2 Mitigation of Failures at Run-time Assuming that we have a limited testing budget and a suf- ļ¬ciently large system, it is probable that an untested conļ¬g- uration containing latent faults will be entered into at run- time. Due to the conļ¬gurationā€™s not being tested, safety- critical DAS should have some method of both detecting and handling execution-time failures that could occur as a result of the faults. A primary strategy is that of performing a series of execution- time tests just before a conļ¬guration is deployed, as is done by [21]. Components provide a series of dependencies and provisions, and the tester ensures that all dependencies are satisļ¬ed by some component. Another possibility is that each component contains test cases, and these cases are used to perform the pre-deployment testing. In general, a conļ¬g- uration that fails to pass the run-time tests will not be de- ployed. Pre-deployment testing can help mitigate the prob- ability of failure, but testing alone cannot guarantee that failure will not occur. Since tests alone cannot guarantee that failure will not oc- cur, it is necessary to have some method of handling failures if and when they occur. [22] proposed a method of main- taining a list of potential alternatives for a given variant in a conļ¬guration. If the variant fails, each alternative can be attempted until all possibilities have been exhausted or an alternative resolves the failure. A natural concern is what should happen if all possibilities have been attempted but the failure persists. In some domains, it would be possible to halt the system, but in safety-critical systems, this is rarely possible. Another concern is how the occurrence of a failure aļ¬€ects the hard real-time requirements of a system. Our proposed method is to provide a failure conļ¬guration that is entered into immediately upon failure. This conļ¬g- uration will spawn two co-operative tasks that execute side by side. The ļ¬rst task provides basic functionality so that the system can continue to receive and transmit informa- tion meeting its hard real-time requirements. The second task attempts to resolve the failure by changing the conļ¬gu- ration. In our current work, the change is to simply rollback to the previous conļ¬guration. We would like to incorpo- rate the alternative list used by [22] rolling back only once all alternatives have been tried. Our approach currently raises several unresolved concerns. First, detecting failure occurrence, loading the failure conļ¬guration and switching into the failure conļ¬guration is not an instantaneous process. For systems with real-time requirements in the millisecond range, it is possible that our process could still cause the system to fail to satisfy its requirements. Second, our rea- son for wanting to incorporate the list of alternatives stems from the fact that rolling back to the previous conļ¬guration can cause the system to re-enter the failing conļ¬guration. This happens when the rollback occurs but the context has not changed. The previous conļ¬guration detects that an- other conļ¬guration is better suited to handle the context, the failing conļ¬guration, and switches. An idea currently 118
  • 5. being explored is the concept of banning conļ¬gurations that have failed as is done by [13], discussed later. Even though run-time testing cannot prevent failure, it still oļ¬€ers a potential mitigation that makes it less likely that we will need to rely on run-time failure resolution. The manner in which the execution-time tests are performed, however, do raise concerns for safety-critical systems. Un- like [21], it is rarely possible for a safety-critical system to halt execution in order to perform run-time tests. Once again, considering hard real-time requirements, we currently have the reconļ¬guration process run concurrently alongside the current conļ¬guration. This has several beneļ¬ts. First, the system can continue normal operation while the new to- be-deployed conļ¬guration is tested and validated. Second, if the to-be-deployed conļ¬guration fails validation, the con- ļ¬guration process can simply terminate with no additional actions. The current conļ¬guration that has been running alongside the validation process may be able to continue uninterrupted. If the to-be-deployed conļ¬guration succeeds, then and only then is the current conļ¬guration interrupted and the new conļ¬guration swapped in. There are also sev- eral drawbacks to this approach. First, as previously stated, switching contexts is not instantaneous. For systems with real-time requirements in the millisecond range, uninter- rupted execution may not be possible. Second, additional hardware is needed to allow concurrent execution of conļ¬g- uration testing on smaller embedded devices. Finally, what is necessary to conduct run-time testing varies depending on the test to be executed. Our group has also considered the possibility of representing the entire architecture as this would allow complex analysis tasks like ļ¬‚ow, dependency and latency analysis to be performed on the system at run- time. This would be particularly beneļ¬cial since the system may have to consider never-before-seen variants. However, safety-critical systems can be small embedded devices with limited memory constraints. It may not be possible to load a representation of the entire architecture or the entire fea- ture model into memory all at once for analysis. This would require multiple read operations from storage increasing the time required for the run-time testing of the reconļ¬guration process. Even if size were not a concern, which types of tests to perform is. It is likely that the reconļ¬guration process will not have enough budget or resources to execute all of the analysis tests. Our group has also considered the possibility of the sys- tem context as a source of failure. Once again consider an autonomous vehicle. The vehicle is traveling on a sunny, cloud-free day and, using sensors on the windshield / under- body, determines that the traction control system should be conļ¬gured to support mostly dry road conditions. As the vehicle is traveling along it encounters wet roadways due to a ļ¬re hydrant that is in the process of being depressur- ized. The car under-body detects wet road conditions, but the windshield sensor determines that conditions are still sunny so the current conļ¬guration is not changed. As a result the car begins to slide on the wet roadway possibly ending in a collision. In the example, an untested conļ¬gura- tion was not the source of the fault; the combination of the context and conļ¬guration resulted in a fault. Our work on providing a minimal conļ¬guration for fault situations pro- vides one potential solution to this concern, but our group is also considering other avenues. We are currently inves- tigating an extension of [13] so that a systemā€™s failure or success in a particular context is recorded and, at a later point in time, can be used to determine if a conļ¬guration should be deployed in that context again. Continuing with the example, we encounter wet roads but sun again. The vehicle can leverage the collected history to determine that dry road traction control should not be used as last time it resulted in a failure. This history of conļ¬gurations and the contexts they have failed in could also be shared with other vehicles allowing them to ā€œlearnā€ and become more reliable with time. 3.3 Safety-Critical DAS Design Techniques The extensive analysis required by safety-critical systems needs special tooling for modeling and design. Current ar- chitecture tools, such as AADL and UML, do not explic- itly support dynamic structural adaptation, but they pos- sess several features that could be useful to the analysis and design of DAS. AADL and its IDE, OSATE, have many features that would be beneļ¬cial to the modeling, analysis and design of DAS. Their support of annexes would allow for tooling to be built for the modeling of DAS that use structural adapta- tion, and our group is currently considering the development of an annex speciļ¬c to that goal. The current features we feel would be of use to the DAS community, and that have been particularly useful in the work we have undertaken, would be the features used to model the unique features of SPL [11], modes, behavior and error ļ¬‚ows. system receiver features inevent: in event port; modes nominal: initial mode; recovery: mode; end receiver; system implementation receiver.i subcomponents sub1: system sub_receiver.i in modes (nominal,recovery); sub2: system sub_receiver.i in modes (recovery); internal features triggerrecovery: event; triggernormal: event; modes nominal-[triggerrecovery]->recovery; recovery-[triggernormal]->nominal; end receiver.id; Figure 2: AADL Modes Example AADL allows developers to specify state transitions using a Finite State Machine (FSM) notation, and the states spec- iļ¬ed can be used to enable or disable components as well as to drive analysis. As an example, consider ļ¬gure 2. The sys- tem receiver has two modes nominal and recovery. It also has two sub-components sub1 and sub2. The system starts in its initial mode, nominal, and continues execution until a triggerrecovery event occurs. Following the rules speci- ļ¬ed under modes in receiver.i, the system transitions to the recovery mode. Under normal operation only the sub1 sub- component is available to the system, but in recovery mode sub2 can be used as well. Both sub1 and sub2 are usable until the system returns to nominal mode via the trigger- 119
  • 6. nominal event. The mode can also be used to drive decision modeling as both the error and behavior annexes can reference the current mode. The behavior annex provided by AADL al- lows the behavior of the system or a sub-component to be speciļ¬ed and modeled [7], and there are some commercial tools available that can utilize this annex when modeling or simulating the system represented by the architecture [8]. An error model, deļ¬ned using the AADL error annex, al- lows developers to specify how errors propagate throughout the system. The errors and their types can be speciļ¬ed as well as which errors the component can handle. Errors that cannot be handled by a component are propagated further up the composition hierarchy until they reach a component that handles them [18]. Since the behavior annex is used to model dynamic behav- ior in a system, the modes could be leveraged as a method of allowing certain types of adaptation only when the system is in a speciļ¬c mode. AADL could also be used to explic- itly model and trace the context of the system through its extensible type interfaces. The context could then also be referenced by the error and behavior annexes. Commercial tools and analysis interfaces in OSATE would, as a result, be able to directly predict the adaptive systemā€™s behavior. This can be used as an additional method of testing the adaptive system, and it oļ¬€ers high-level insight into the operation of the dynamic software. 4. RELATED WORK We now cover work with similar focus to ours identifying areas of overlap. We also provide how our work diļ¬€ers and if our work could beneļ¬t from incorporation of ideas expressed by these similar ideas. [1] provides a method of automatically calculating the dif- ferent conļ¬gurations that a system can assume. The relia- bility of the system is then evaluated using a Markov chain. Our work diļ¬€ers in that we explicitly assume that even if the number of possible conļ¬gurations is determinable, the number of conļ¬gurations will be too large to test in a rea- sonable amount of time. It is also the case that a reliable system does not necessarily mean that the system is safe, however a safe system cannot be unreliable. Our work uses analysis of the systemā€™s architecture to limit the number of tests that need to be executed against the system. [15] provides a method, using DCCA, of proving that an adaptive system is more dependable than a conventional sys- tem. The check is based on temporal logic; however, the method is not a direct proof of safety, but it could be used to contribute to the case for safety. The method takes into account that a failure could be resolved by the reconļ¬gura- tion process, but it doesnā€™t take into account that the failure could also be re-introduced by the reconļ¬guration process. This method is complimentary to our own. It could be used as an additional safety case or as the basis for developing an analysis method tailored to dynamic adaptive systems. [4] presents an overview that describes the need for run- time metric measurement in adaptive systems, and [25] pro- vides a method of measuring speciļ¬c quality attributes. This method could be used as an assurance that adaptation cor- rectly places safety as a high-priority. Its monitoring capa- bilities could be extended to support error monitoring during run-time. The process, as deļ¬ned, does not consider safety or run-time monitoring, but the goals of seeking to maximize speciļ¬c properties of software are parallel to our own. We would like to, in future work, adapt this method to execute at run-time on an adaptive system so that attributes of the system can be measured as they change. Our work diļ¬€ers from the above in that we explicitly con- sider safety-critical systems unlike [4] and [25]. We diļ¬€er from [15] and [1] in that we believe the reconļ¬guration pro- cess needs additional safeguards that will prevent faulty con- ļ¬gurations that have been resolved from being redeployed into the same context in which they failed. [15]ā€™s goal is very similar to ours but it, as mentioned, lacks the focus on additional reconļ¬guration guards. Each of the ideas ex- pressed however could eventually be incorporated into our work to strengthen the assurance case for system safety. 5. CONCLUSION In this position paper, we explored several concerns that have been raised by our research considering incorporating dynamic adaptivity into safety-critical systems. Our interest in applying dynamic adaptivity stems from the unique prop- erties and behaviors that DAS oļ¬€er in regards to designing and deploying safety-critical systems. One such possibility explored in this paper is the utilization of dynamic adaptiv- ity to increase a safety-critical systemā€™s reliability. Even though DAS oļ¬€er many exciting possibilities for safety-critical software, there are many unanswered ques- tions. Testing of DAS, DAS representation and how DAS handle faults that occur during run-time all raise concerns that must be either mitigated or resolved. In future work, we plan to further explore how DAS can be certiļ¬ed for safety-critical environments. We are currently conducting research to determine which, if any, current safety analysis techniques could be adapted to use dynamic adaptive soft- ware. There are several possibilities which we are still in the process of exploring. We also plan to further explore what role context plays when deciding if a conļ¬guration that has previously failed should be deployed again, and we are inves- tigating run-time testing of safety-critical DAS to mitigate failures. 6. ACKNOWLEDGEMENTS The work of the authors was funded by the National Sci- ence Foundation (NSF) grant # 2008912. We also thank those that have reviewed our work oļ¬€ering recommendations for improvement. 7. REFERENCES [1] R. Adler, M. FĀØorster, and M. Trapp. Determining conļ¬guration probabilities of safety-critical adaptive systems. In Advanced Information Networking and Applications Workshops, 2007, AINAWā€™07. 21st International Conference on, volume 2, pages 548ā€“555. IEEE, 2007. [2] N. Bencomo, P. Sawyer, G. S. Blair, and P. Grace. Dynamically adaptive systems are product lines too: Using model-driven techniques to capture dynamic variability of adaptive systems. In SPLC (2), pages 23ā€“32, 2008. [3] J. Bowen and V. Stavridou. Safety-critical systems, formal methods and standards. Software Engineering Journal, 8(4):189ā€“209, 1993. 120
  • 7. [4] R. Calinescu, C. Ghezzi, M. Kwiatkowska, and R. Mirandola. Self-adaptive software needs quantitative veriļ¬cation at runtime. Communications of the ACM, 55(9):69ā€“77, 2012. [5] R. De Lemos, H. Giese, H. A. MĀØuller, M. Shaw, J. Andersson, M. Litoiu, B. Schmerl, G. Tamura, N. M. Villegas, T. Vogel, et al. Software engineering for self-adaptive systems: A second research roadmap. In Software Engineering for Self-Adaptive Systems II, pages 1ā€“32. Springer, 2013. [6] J. Delange and P. Feiler. Architecture fault modeling with the aadl error-model annex. In Software Engineering and Advanced Applications (SEAA), 2014 40th EUROMICRO Conference on, pages 361ā€“368. IEEE, 2014. [7] P. Dissaux, J. Bodeveix, M. Filali, P. Gauļ¬llet, and F. Vernadat. Aadl behavioral annex. In Proceedings of DASIA conference, Berlin, 2006. [8] Ellidiss.com. Aadl inspector, 2016. [9] E. EngstrĀØom and P. Runeson. Software product line testingā€“a systematic mapping study. Information and Software Technology, 53(1):2ā€“13, 2011. [10] H. Feiler, B. Lewis, and S. Vestal. The SAE architecture analysis and design language (AADL) standard. In IEEE RTAS Workshop, 2003. [11] P. Feiler. Product line modeling with aadl, 2005. [12] P. H. Feiler, D. P. Gluch, and J. J. Hudak. The architecture analysis & design language (AADL): An introduction. Technical report, DTIC Document, 2006. [13] B. J. Garvin, M. B. Cohen, and M. B. Dwyer. Using feature locality: can we leverage history to avoid failures during reconļ¬guration? In Proceedings of the 8th workshop on Assurances for self-adaptive systems, pages 24ā€“33. ACM, 2011. [14] M. Gerla, E.-K. Lee, G. Pau, and U. Lee. Internet of vehicles: From intelligent grid to autonomous cars and vehicular clouds. In Internet of Things (WF-IoT), 2014 IEEE World Forum on, pages 241ā€“246. IEEE, 2014. [15] M. Gudemann, F. Ortmeier, and W. Reif. Safety and dependability analysis of self-adaptive systems. In Leveraging Applications of Formal Methods, Veriļ¬cation and Validation, 2006. ISoLA 2006. Second International Symposium on, pages 177ā€“184. IEEE, 2006. [16] A. Hall. Seven myths of formal methods. Software, IEEE, 7(5):11ā€“19, Sept 1990. [17] J. C. Knight. Safety critical systems: challenges and directions. In Software Engineering, 2002. ICSE 2002. Proceedings of the 24rd International Conference on, pages 547ā€“550. IEEE, 2002. [18] B. Larson, J. Hatcliļ¬€, K. Fowler, and J. Delange. Illustrating the aadl error modeling annex (v.2) using a simple safety-critical medical device. In Proceedings of the 2013 ACM SIGAda Annual Conference on High Integrity Language Technology, HILT ā€™13, pages 65ā€“84, New York, NY, USA, 2013. ACM. [19] N. G. Leveson and P. R. Harvey. Software fault tree analysis. Journal of Systems and Software, 3(2):173ā€“181, 1983. [20] E. Musick. The 1992 london ambulance service computer aided dispatch system failure. [21] D. Niebuhr, A. Rausch, C. Klein, J. Reichmann, and R. Schmid. Achieving dependable component bindings in dynamic adaptive systems-a runtime testing approach. In Self-Adaptive and Self-Organizing Systems, 2009. SASOā€™09. Third IEEE International Conference on, pages 186ā€“197. IEEE, 2009. [22] K. Paulsson, M. HĀØubner, and J. Becker. Strategies to on-line failure recovery in self-adaptive systems based on dynamic and partial reconļ¬guration. In Adaptive Hardware and Systems, 2006. AHS 2006. First NASA/ESA Conference on, pages 288ā€“291. IEEE, 2006. [23] R. R. Rajkumar, I. Lee, L. Sha, and J. Stankovic. Cyber-physical systems: The next computing revolution. In Proceedings of the 47th Design Automation Conference, DAC ā€™10, pages 731ā€“736, New York, NY, USA, 2010. ACM. [24] A. J. Ramirez, A. C. Jensen, and B. H. C. Cheng. A taxonomy of uncertainty for dynamically adaptive systems. In Proceedings of the 7th International Symposium on Software Engineering for Adaptive and Self-Managing Systems, SEAMS ā€™12, pages 99ā€“108, Piscataway, NJ, USA, 2012. IEEE Press. [25] G. Tamura, N. M. Villegas, H. A. MĀØuller, J. P. Sousa, B. Becker, G. Karsai, S. Mankovskii, M. Pezz`e, W. SchĀØafer, L. Tahvildari, et al. Towards practical runtime veriļ¬cation and validation of self-adaptive software systems. In Software Engineering for Self-Adaptive Systems II, pages 108ā€“132. Springer, 2013. [26] J. Van Gurp, J. Bosch, and M. Svahnberg. On the notion of variability in software product lines. In Software Architecture, 2001. Proceedings. Working IEEE/IFIP Conference on, pages 45ā€“54. IEEE, 2001. [27] Wikipedia. Alaska airlines ļ¬‚ight 261, 2016. [28] J. Zhang, H. J. Goldsby, and B. H. Cheng. Modular veriļ¬cation of dynamically adaptive systems. In Proceedings of the 8th ACM international conference on Aspect-oriented software development, pages 161ā€“172. ACM, 2009. 121