The document discusses state-based modeling approaches for dependability analysis, including Markov chains and Petri nets. It begins by defining dependability and its attributes like availability, reliability, safety, and maintainability. It then discusses state-based models and how they can explicitly model complex system relationships. Markov chains and continuous-time Markov chains are described as examples of state-based models. The document provides an example of using a continuous-time Markov chain to model a 2-out-of-3 system and calculate its steady-state availability. It concludes by noting that Markov chains can grow exponentially with system size and discusses decomposition approaches to address this issue.
State model based
1. State Model Based
Course: Advanced Topics in Systems Performance Evaluation
Student: Rafael Roque de Souza
Professors: Eduardo Tavares, Ricardo Massa
4. — Dependability:
— The concept of dependable computing first appears in the 1830’s
in the context of Babbage’s Calculating Engine [1,2].
— The ability to deliver service that can justifiably be trusted [3].
— The ability to avoid service failures that are more frequent and
more severe than acceptable [3].
What is Dependability?
5. More…
Default approach: Utilize a formalism to model system dependability
• Quantify the availability of components, calculate system availability
based on this data and a set of assumptions
- the availability model
• Most models offer the same expressiveness
• Each formalism lets the modeler focus on certain aspects
• Component-based models: reliability block diagrams, fault trees
• State-based models: Markov chains, Petri nets
• System understanding evolved from hardware to software to IT
infrastructures
What is Dependability?
6. • Some assumptions
– All failure and repair events are exponentially distributed
– Components are either fully working or completely failed
– All failure and repair events are pairwise stochastically independent
– Correct functioning at time t can be treated as an event, with the event
probability derived from the availability value computed from the failure
rate and the repair rate
Example
Dependable Systems Course PT 2011: Dependability Modeling
[Figure: two-state model with states Up and Down; failure rate λ = 1/MTTF (Up to Down), repair rate µ = 1/MTTR (Down to Up)]
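A minimal sketch of the two-state Up/Down model, assuming illustrative MTTF and MTTR values that are not taken from the slides. With λ = 1/MTTF and µ = 1/MTTR, the steady-state availability is µ/(λ + µ), which is the same as MTTF/(MTTF + MTTR). The function name is hypothetical.

```python
def steady_state_availability(mttf: float, mttr: float) -> float:
    """Steady-state availability of a two-state (Up/Down) model."""
    failure_rate = 1.0 / mttf   # lambda = 1 / MTTF
    repair_rate = 1.0 / mttr    # mu = 1 / MTTR
    return repair_rate / (failure_rate + repair_rate)


# Hypothetical component: fails every 1000 h on average, repaired in 2 h.
availability = steady_state_availability(mttf=1000.0, mttr=2.0)
print(f"A = {availability:.6f}")
```

The result only holds under the assumptions listed on this slide (exponential failure and repair times, components either fully working or completely failed).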
8. Dependability can be viewed according to different properties, which
may be emphasized more or less depending on the application intended
for the computer system under consideration:
— Availability is always required, although to a varying degree
depending on the application;
— reliability, safety, confidentiality may or may not be required
according to the application [3].
Dependability and its attributes
9. Dependability and its attributes
— Trustworthiness of a computer system such that
reliance can justifiably be placed on the service it
delivers. It encompasses the following attributes:
— Availability: readiness for correct service
— Reliability: continuity of correct service
— Safety: absence of catastrophic consequences
— Integrity: absence of improper system alterations
— Maintainability: ability to undergo modifications and repairs
10. What is Availability?
Availability is the probability that the system will still
be operating to requirements at a given time.
— Availability is a function not only of how rarely a
system fails (reliability) but also of how quickly it can
be repaired (time to repair)
— Availability of 0.998 means software is available for 998
out of 1000 time units
11. What is Reliability?
— Reliability is the probability that the system will deliver a set of
services for a given period of time, whereas a system is fault
tolerant when it does not fail even when there are faulty
components.
12. Example
• Reliability vs. availability?
• Look at the example:
– Consider an online trading site that goes down for 1
minute every 4 hours, i.e. every 240 (4 x 60) minutes. The
availability is 239/240 = 99.583% (relatively high
availability)
– Reliability: it can still be low if the down periods occur at
critical times, when the market is fluctuating
and clients want to trade their shares!
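The gap between the two measures can be made concrete with a small sketch. The availability figure comes straight from the example; the reliability part additionally assumes exponentially distributed times to failure with MTTF = 239 minutes, which is an assumption added here for illustration only.

```python
import math

UPTIME_MIN = 239.0    # minutes up per 4-hour cycle (from the example)
DOWNTIME_MIN = 1.0    # minutes down per 4-hour cycle (from the example)

availability = UPTIME_MIN / (UPTIME_MIN + DOWNTIME_MIN)


def reliability(t_min: float, mttf_min: float = UPTIME_MIN) -> float:
    """P(no failure during [0, t]), assuming exponential times to failure."""
    return math.exp(-t_min / mttf_min)


print(f"availability          = {availability:.5%}")
print(f"R(1 hour)             = {reliability(60):.3f}")
print(f"R(8-hour trading day) = {reliability(480):.3f}")
```

Under that assumption the probability of getting through a whole trading day without any outage is much lower than the availability figure suggests.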
13. Safety is an extension of reliability. When the state of
correct service and the states of incorrect service due to
non-catastrophic failure are grouped into a safe state (in
the sense of being free from catastrophic damage, not from
danger), safety is a measure of continuous safeness, or
equivalently, of the time to catastrophic failure. Safety is
thus reliability with respect to catastrophic failures.
What is Safety
14. • A measure of the time to service restoration since the last failure
occurrence, or equivalently, a measure of the continuous delivery of
incorrect service;
• Maintainability models situations where the system fails and
returning to a properly functioning state requires maintenance.
Maintainability is defined as the probability that a repair of the system
is successfully completed within a given time.
What is Maintainability
16. State-space methods are much more comprehensive. They
allow explicit modeling of complex relationships (e.g., [5]),
and their transition structure encodes important sequencing
information. Historically, state-space methods have been
explored in the context of mathematical models that specify
probabilistic assumptions about time durations and
transition behavior. We now review those models and
comment on how they are being applied in the security
context.
What is a State Model?
17. • In contrast with state-space models, combinatorial
models do not enumerate all possible system states to
obtain a solution. Instead, simpler approaches are used
to compute system dependability measures. Despite
several extensions that have been made to
combinatorial models, they do not easily capture certain
features, such as stochastic dependence and imperfect
fault coverage. We present a brief overview of
combinatorial models.
What is a State Model?
21. • States and labeled state transitions
• State can keep track of:
– Number of functioning resources of each type
– States of recovery for each failed resource
– Number of tasks of each type waiting at each resource
– Allocation of resources to tasks
• A transition:
– Can occur from any state to any other state
– Can represent a simple or a compound event
State-Space Models
22. • State space explosion problem (also called the largeness problem)
• Use stochastic Petri nets and related formalisms for easy specification
and automated generation/solution of the underlying Markov model
• Or use hierarchical (multilevel) model composition.
– e.g. Upper level : FT or RBD, lower level: Markov chains
– Many practical examples of the use of hierarchical models exist
Problem with State Space Models
25. • Discrete random process, usually drawn as a state transition diagram
• Markov property - the next state depends only on the current state
• Future states cannot be predicted with certainty, but the statistical
properties of the chain can be analyzed
• Finite state space (chain), transitions with probabilities, initial
state probabilities
• Transient state - nonzero probability of never returning to this state
(finite number of visits)
• Recurrent state - probability of 1 of returning to this state after
an unspecified time t
• Mean recurrence time can be used as MTTF metric
• Time-homogeneous Markov chains - transition probabilities do not
change over time.
What is a Markov Chain
26. • Discrete-time Markov chain (DTMC)
– System state only changes after a fixed time interval; the system is in
exactly one state at a time
– Transition to the next state depends on a (non-negative) transition
probability at t
– Each row of the transition probability matrix represents the flow out
of that state, each column the flow into that state; each row sums to one
• Continuous-time Markov chain (CTMC)
– Allows state changes at any instant of time - continuous parameter
space, still discrete state space
– Transition to the next state after spending some time in a state - the
holding time
– Generator matrix Q therefore expresses transition rates instead of
probabilities
– By definition, each diagonal entry equals minus the total rate out of
that state (the diagonal corresponds to no state change taking place)
What is a Markov Chain
27. • Initial distribution vector can be combined with transition matrix to find
probabilities for being in one of the states after one step
• Each row sum of the transition matrix is 1
Markov Chains - DTMC Example
[Figure: transition matrix and probability matrix after 2 steps. (C) Tamara Lynn Anthony]
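The two statements on this slide can be sketched in a few lines. The 2-state transition matrix below is hypothetical (the matrix from the original figure is not preserved in this transcript), and the helper names are illustrative.

```python
from typing import List

Matrix = List[List[float]]


def step(dist: List[float], P: Matrix) -> List[float]:
    """One DTMC step: new_dist[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def matmul(A: Matrix, B: Matrix) -> Matrix:
    """Plain matrix product for square matrices of equal size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical chain: state 0 = up, state 1 = down.
P = [[0.9, 0.1],
     [0.6, 0.4]]
assert all(abs(sum(row) - 1.0) < 1e-12 for row in P)  # each row sums to 1

initial = [1.0, 0.0]             # start in "up" with certainty
after_one = step(initial, P)     # distribution after one step
P2 = matmul(P, P)                # probability matrix after 2 steps
after_two = step(initial, P2)    # same as applying step() twice
print(after_one, after_two)
```

Multiplying the initial distribution by P once gives the one-step probabilities; multiplying by P squared gives the two-step probabilities.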
28. • Each state represents a particular error state; transitions are labeled
with component failure rates
• Each state expresses the number of failed components at any given time
• Time-homogeneous process - failure / repair rates do not change
over time
• Components have identical failure rates and identical repair rates
– Failure and repair events are independent; the process is memoryless
• Row sum is zero: probability mass flowing out of state i goes to some other
state
[Figure: example CTMC]
Markov Chains - CTMC Example
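A minimal sketch of such a generator matrix, assuming n identical components, a per-component failure rate lam, and independent repair of each failed component at rate mu. The rates and the function name are illustrative, not taken from the slides.

```python
from typing import List


def build_generator(n: int, lam: float, mu: float) -> List[List[float]]:
    """Generator matrix Q; state k = number of failed components (0..n).

    Assumes each working component fails at rate lam and each failed
    component is repaired independently at rate mu.
    """
    Q = [[0.0] * (n + 1) for _ in range(n + 1)]
    for k in range(n + 1):
        if k < n:
            Q[k][k + 1] = (n - k) * lam   # one more component fails
        if k > 0:
            Q[k][k - 1] = k * mu          # one failed component is repaired
        Q[k][k] = -sum(Q[k])              # diagonal = -(total rate out of k)
    return Q


Q = build_generator(n=3, lam=0.001, mu=0.1)
for row in Q:
    print(row)
```

Setting the diagonal last guarantees that every row sums to zero, as stated on the slide.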
30. • Interested in steady-state availability of the system
• Interpreted as the steady-state probability that the system is
operational
– Derived from the probability vector, which contains the steady-state
probabilities of the system being in each of the failure states after
a number of steps
– 'Static' steady-state availability is computable if the probabilities are in
equilibrium
• In equilibrium, the probability flow out of a state equals the probability
flow into that state - probability mass is balanced
– Typically reached after a high number of steps
Example: 2-of-3 System
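A minimal sketch of the steady-state computation for the 2-of-3 system, assuming identical components with failure rate lam, independent repair at rate mu, and state k = number of failed components. Because this chain is a birth-death process, the balance equations can be solved in closed form instead of with a general linear solver. All names and rate values are illustrative.

```python
from typing import List


def steady_state(n: int, lam: float, mu: float) -> List[float]:
    """Steady-state probabilities pi[k] for k failed components.

    Uses the balance equation pi[k] * birth(k) = pi[k+1] * death(k+1).
    """
    weights = [1.0]
    for k in range(n):
        birth = (n - k) * lam      # rate k -> k+1 (another failure)
        death = (k + 1) * mu       # rate k+1 -> k (one repair)
        weights.append(weights[-1] * birth / death)
    total = sum(weights)
    return [w / total for w in weights]


def k_of_n_availability(k_required: int, n: int, lam: float, mu: float) -> float:
    """Probability that at least k_required of n components are working."""
    pi = steady_state(n, lam, mu)
    max_failed = n - k_required
    return sum(pi[:max_failed + 1])


lam, mu = 0.001, 0.1
a_markov = k_of_n_availability(2, 3, lam, mu)

# Boolean (combinatorial) result for independent components.
p = mu / (lam + mu)                         # availability of one component
a_boolean = p**3 + 3 * p**2 * (1 - p)
print(a_markov, a_boolean)
```

With independent failures and repairs, both numbers agree up to floating-point rounding; the Markov model becomes the more useful one once dependencies such as shared repair capacity are introduced.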
31. • The resulting formula equals the result of the Boolean analysis, but Markov
chains also support non-independent events, e.g. common cause failures
• Markov chains grow exponentially with the number of components - which is
bad
– Divide-and-conquer - decompose and aggregate chain parts
– Structural decomposition - consider the system as a set of independent subsystems
– Behavioral decomposition - assume time constants for some fault occurrences and handling
processes based on criticality, e.g. a fault in a parked airplane
Markov Chains
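To see why the growth is exponential and what aggregation buys: assuming each component is either up or down (as in the assumptions earlier), a chain that tracks every component individually needs 2^n states, while n identical components can be aggregated into n + 1 states that only count failures. A minimal sketch with illustrative function names:

```python
def full_state_count(n: int) -> int:
    """States when every component's up/down status is tracked separately."""
    return 2 ** n


def lumped_state_count(n: int) -> int:
    """States when identical components are aggregated by number failed."""
    return n + 1


for n in (3, 10, 20):
    print(n, full_state_count(n), lumped_state_count(n))
```

The aggregation is only valid when the components really are interchangeable (identical failure and repair rates).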
32. • Mathematical model for concurrent systems with many components (Carl
Adam Petri)
• Bipartite directed graph (places vs. transitions)
• Each place has a capacity for tokens, default is unlimited or one
• Each arc has a weight expressing a cost factor, default is one
• Places are pre- / postconditions for transitions
• Distribution of tokens is called a marking
• Every net has an initial marking
What is a Petri Net
[Figure legend: place / state, transition, token, input place of the transition, output place of the transition]
34. • A transition is activated (may fire) when
– All input places contain enough tokens for the transition costs
– All output places have enough capacity to take the new tokens
• Tokens are consumed from input places and placed in output places,
considering the arc weights
• Atomic nondeterministic operation - any activated transition may fire
• Firing happens with a given delay
• More complex Petri net versions can distinguish different token types
– Colored tokens (data values)
– Activation times for tokens
• Petri nets allow both formal analysis (for exponential distributions) and
simulation
Stochastic Petri Nets
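The activation and firing rule can be sketched as follows. This is a minimal untimed sketch: the class, function, and place names are hypothetical, firing delays are ignored, and the capacity check uses the simple rule "current tokens plus produced tokens must fit".

```python
from dataclasses import dataclass
from typing import Dict, Optional

Marking = Dict[str, int]   # place name -> number of tokens


@dataclass
class Transition:
    name: str
    inputs: Dict[str, int]    # input place -> arc weight
    outputs: Dict[str, int]   # output place -> arc weight


def is_enabled(t: Transition, marking: Marking,
               capacity: Optional[Dict[str, int]] = None) -> bool:
    """True if all input places hold enough tokens and outputs have room."""
    capacity = capacity or {}   # places not listed have unlimited capacity
    enough_tokens = all(marking.get(p, 0) >= w for p, w in t.inputs.items())
    enough_room = all(marking.get(p, 0) + w <= capacity[p]
                      for p, w in t.outputs.items() if p in capacity)
    return enough_tokens and enough_room


def fire(t: Transition, marking: Marking,
         capacity: Optional[Dict[str, int]] = None) -> Marking:
    """Return the new marking after firing t atomically."""
    if not is_enabled(t, marking, capacity):
        raise ValueError(f"transition {t.name} is not enabled")
    new = dict(marking)
    for p, w in t.inputs.items():
        new[p] = new.get(p, 0) - w    # consume tokens per arc weight
    for p, w in t.outputs.items():
        new[p] = new.get(p, 0) + w    # produce tokens per arc weight
    return new


fail = Transition("fail", inputs={"up": 1}, outputs={"down": 1})
repair = Transition("repair", inputs={"down": 1}, outputs={"up": 1})

m0 = {"up": 3, "down": 0}   # initial marking: three working components
m1 = fire(fail, m0)
print(m1)  # {'up': 2, 'down': 1}
```

In the initial marking only `fail` is activated; `repair` becomes activated once a token reaches the `down` place.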
35. • A stochastic process is a sequence of random variables indexed
by time, with a well-defined correlation structure
• The random variables have probability distributions associated with them
– Arrival of customers in a bank queue
– Number of requests in a Web server
• Why stochastic modeling?
– In many systems, time must be associated with events
• How to model stochastic processes?
– Analytical queueing theory models
– Petri nets
Stochastic Petri Nets
36. • Probabilistic behavior model
• Distributions:
– Exponential - SPN (stochastic Petri net)
– Exponential or immediate - GSPN (generalized stochastic Petri net)
– Other models allow arbitrary distribution functions
Stochastic Petri Nets
37. • Reachability set
– All possible markings reachable from an initial marking
– Possible analysis questions
• Can some system state (e.g. an error state) be reached at
all ?
• Does a firing sequence exist that transforms M0 into M ?
• Boundedness
– Marking is bounded if there is a k so that for every reachable
marking the number of tokens in each place is bounded by k
– Useful for modeling limited (bounded) resources
Typical Petri Net Properties
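Both properties can be checked by exploring the markings reachable from the initial marking. The sketch below is a breadth-first search over a small hypothetical net (places `up` and `down`, transitions `fail` and `repair`, all arc weights one, unlimited capacities); the state limit is an assumed safeguard so that an unbounded net does not loop forever.

```python
from collections import deque
from typing import List, Set, Tuple

Marking = Tuple[int, ...]                 # tokens per place, fixed place order
Transition = Tuple[Marking, Marking]      # (tokens consumed, tokens produced)


def reachability_set(m0: Marking, transitions: List[Transition],
                     limit: int = 10_000) -> Set[Marking]:
    """All markings reachable from m0 (breadth-first exploration)."""
    seen = {m0}
    queue = deque([m0])
    while queue:
        m = queue.popleft()
        for consume, produce in transitions:
            if all(have >= need for have, need in zip(m, consume)):
                nxt = tuple(h - c + p for h, c, p in zip(m, consume, produce))
                if nxt not in seen:
                    if len(seen) >= limit:
                        raise RuntimeError("state limit reached; "
                                           "the net may be unbounded")
                    seen.add(nxt)
                    queue.append(nxt)
    return seen


# Place order: (up, down). Hypothetical 3-component net.
fail = ((1, 0), (0, 1))      # consumes one 'up' token, produces one 'down'
repair = ((0, 1), (1, 0))    # consumes one 'down' token, produces one 'up'

reachable = reachability_set((3, 0), [fail, repair])
bound = max(max(m) for m in reachable)               # smallest valid k
error_reachable = any(m[0] < 2 for m in reachable)   # fewer than 2 of 3 up
print(sorted(reachable), bound, error_reachable)
```

Here the reachability set is finite, so the net is bounded, and the search also answers the question of whether an error state (fewer than two working components) can be reached at all.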
38. • Complexity of the Petri net does not depend on the number of
components!
Example: 2-of-3 System
39. • Modeling of cold standby components (inhibitor arc)
• Limited repair capacities - at most R repairmen available at a time
• Dependability analysis - prove that there is no state where some
property is violated
Example: K-of-N With Standby and Repairmen
40. • buffer size = #token_capacity(p1 + p2)
• unit count = #token_capacity(p3 + p4)
• Firing rate of t1 is the arrival rate
• t2 is an immediate transition
• Firing rate of t3 is the service rate, depends on token count in p4
Example: Parallel System with Input Buffer
[Figure: Petri net of an input buffer with positions feeding identical units; places for free buffer positions, filled buffer positions, free units, and active units. (C) Andrea Bobbio]
41. • Light lines - fault-free operation
• Heavy lines - failures
• Dotted lines - repairs
• Rate computation demands exponential distribution
(C) Andrea Bobbio
Example: Parallel System with Input Buffer
42. • In many cases, simulation is the only way to solve the net
– More than one outgoing non-exponential distribution
– Special guard functions
– Complexity issues
– ...
• Typical simulation problems
– Modeled failure rates might be small, so many runs are needed
for a valid result
– Random number generation
– Confidence intervals
Petri Net Simulation
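The three simulation problems can be illustrated on a deliberately simple case: estimating the probability that a single component with exponential lifetime fails before a mission time. This is a sketch under assumed, illustrative parameters; it uses a seeded generator for reproducibility and a normal-approximation 95% confidence interval, and the analytic value 1 - exp(-rate * t) is available for comparison.

```python
import math
import random
from typing import Tuple


def simulate_failure_probability(rate: float, mission_time: float,
                                 runs: int, seed: int = 42) -> Tuple[float, float]:
    """Monte Carlo estimate of P(failure before mission_time).

    Returns (estimate, half-width of the 95% confidence interval).
    """
    rng = random.Random(seed)   # seeded for reproducible runs
    failures = sum(rng.expovariate(rate) < mission_time for _ in range(runs))
    p_hat = failures / runs
    half_width = 1.96 * math.sqrt(p_hat * (1.0 - p_hat) / runs)
    return p_hat, half_width


rate, mission_time = 0.01, 10.0           # hypothetical failure rate and time
p_hat, half_width = simulate_failure_probability(rate, mission_time, runs=20_000)
exact = 1.0 - math.exp(-rate * mission_time)
print(f"estimate = {p_hat:.4f} +/- {half_width:.4f}, exact = {exact:.4f}")
```

The half-width shrinks only with the square root of the number of runs, which is why very small failure probabilities need many runs for a valid result.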
43. • A Petri net has a corresponding reachability graph
• The reachability graph becomes a Markov chain when transition probabilities are given
Petri Net -> Markov Chain
46. • [1] D. Lardner, Babbage's calculating engine. Edinburgh Review, July 1834. Reprinted in P.
Morrison and E. Morrison, editors, Charles Babbage and His Calculating Engines. Dover, 1961.
• [2] C. Babbage. On the mathematical powers of the calculating engine (December 1837).
Unpublished Manuscript. Buxton MS7, Museum of the History of Science. In B. Randell, editor,
The Origins of Digital Computers: Selected papers, pages 17-52. Springer, 1974.
• [3] A. Avizienis, J.-C. Laprie, B. Randell, Fundamental Concepts of Dependability
• [4] K. Goseva-Popstojanova, K. S. Trivedi, Stochastic Modeling Formalisms for Dependability,
Performance and Performability, LNCS 1769, 2000
• [5] D. M. Nicol, W. H. Sanders, K. S. Trivedi, Model-Based Evaluation: From Dependability
to Security
• [6] J. K. Muppala, M. Malhotra, and K. S. Trivedi, "Markov dependability models of complex
systems: Analysis techniques," in Reliability and Maintenance of Complex Systems, S. Ozekici,
Ed. Berlin, Germany: Springer, 1996, pp. 442–486.
• [7] V. Kordic, Petri Net Theory and Applications
• [8] P. J. Haas, Stochastic Petri Nets: Modelling, Stability, Simulation, Springer.
• [9] C. E. Ebeling, An Introduction to Reliability and Maintainability Engineering. Illinois, Waveland
Press, 1997
References