SlideShare a Scribd company logo
1 of 95
Download to read offline
Model Checking of Fault-Tolerant Distributed 
Algorithms 
Part II: Modeling Fault-tolerant Distributed Algorithms 
Annu Gmeiner Igor Konnov Ulrich Schmid 
Helmut Veith Josef Widder 
TMPA 2014, Kostroma, Russia 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 1 / 64
Why Modeling and Verification? 
Let’s have a look at the recent technical report by amazon.com 1: 
We have found that the standard verification techniques in 
industry are necessary but not sufficient. We use 
deep design reviews, 
code reviews, 
static code analysis, 
stress testing, 
fault-injection testing [. . . ], 
but we still find that subtle bugs can hide in complex 
concurrent fault-tolerant systems. . . 
. . .We have found that testing the code is inadequate as a 
method to find subtle errors in design, as the number of 
reachable states of the code is astronomical. 
1C. Newcombe et al. Use of Formal Methods at Amazon Web Services, 2014 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 2 / 64
Why Modeling and Verification? (cont.) 
. . . the final executable code is unambiguous, but contains 
an overwhelming amount of detail. 
We needed to be able to capture the essence of a design in 
a few hundred lines of precise description . 
Engineers naturally focus on designing the “happy case” for 
a system, i.e. the processing path in which no errors occur. 
. . . the shortest error trace exhibiting the bug contained 35 
high level steps. The improbability of such compound events 
is not a defense against such bugs; historically, AWS has 
observed many combinations of events at least as 
complicated as those that could trigger this bug. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 3 / 64
What are we doing? 
The approach we have: 
A high-level description of a design, 
which is precise. 
Sound verification method, 
as complete as possible. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 4 / 64
What are we doing? 
The approach we have: 
A high-level description of a design, 
which is precise. 
Sound verification method, 
as complete as possible. 
More specifically: 
modeling approach suitable for model checking 
automatic parameterized verification method 
targeting at fault-tolerant distributed algorithms 
(We are not interested in verifying mathematical toy examples) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 4 / 64
Why Model Checking? 
an alternative proof approach 
useful counter-examples 
ability to define and vary assumptions about the system 
and see why it breaks 
closer to code level 
good degree of automation 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 5 / 64
Distributed Algorithms: Model Checking Challenges 
unbounded data types 
unbounded number of rounds (round numbers part of messages) 
parameterization in multiple parameters 
among n processes f  t are faulty with n  3t 
contrast to concurrent programs 
diverse fault models (adverse environments) 
continuous time 
fault-tolerant clock synchronization 
degrees of concurrency: synchronous, asynchronous partially 
synchronous 
a process makes at most 5 steps between 2 steps of any other 
process 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 6 / 64
Fault-tolerant distributed algorithms 
n 
n processes communicate by messages 
all processes know that at most t of them might be faulty 
f are actually faulty 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 7 / 64
Fault-tolerant distributed algorithms 
n 
? ? ? 
t 
n processes communicate by messages 
all processes know that at most t of them might be faulty 
f are actually faulty 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 7 / 64
Fault-tolerant distributed algorithms 
n 
? ? ? 
t f 
n processes communicate by messages 
all processes know that at most t of them might be faulty 
f are actually faulty 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 7 / 64
Challenge #1: fault models 
clean crashes: least severe 
faulty processes prematurely halt after/before “send to all” 
crash faults: 
faulty processes prematurely halt (also) in the middle of “send to all” 
omission faults: 
faulty processes follow the algorithm, but some messages sent by them 
might be lost 
symmetric faults: 
faulty processes send arbitrarily to all or nobody 
Byzantine faults: most severe 
faulty processes can do anything 
encompass all behaviors of above models 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 8 / 64
Challenges #2  #3: Pseudo-code and 
Communication 
Translate pseudo-code to a formal description 
that allows us to verify the algorithm 
and does not oversimplify the original algorithm. 
Assumptions about the communication medium 
are usually written in plain English, 
spread across research papers, 
constitute folklore knowledge. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 9 / 64
Asynchronous Reliable Broadcast (Srikanth  Toueg, 
87) 
The core of the classic broadcast algorithm from the DA literature. 
It solves an agreement problem depending on the inputs vi . 
Variables of process i 
vi : f0 , 1g i n i t i a l l y 0 or 1 
accepti : f0 , 1g i n i t i a l l y 0 
An atomic step: 
i f vi = 1 
then send ( echo ) to all ; 
i f received (echo) from 
at l e a s t t + 1 distinct proces ses 
and not sent ( echo ) before 
then send ( echo ) to all ; 
i f received ( echo ) from at l e a s t 
n - t distinct proces ses 
then accepti := 1 ; 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 10 / 64
Asynchronous Reliable Broadcast (Srikanth  Toueg, 
87) 
The core of the classic broadcast algorithm from the DA literature. 
It solves an agreement problem depending on the inputs vi . 
Variables of process i 
vi : f0 , 1g i n i t i a l l y 0 or 1 
accepti : f0 , 1g i n i t i a l l y 0 
An atomic step: 
i f vi = 1 
then send ( echo ) to all ; 
i f received (echo) from 
at l e a s t t + 1 distinct proces ses 
and not sent ( echo ) before 
then send ( echo ) to all ; 
i f received ( echo ) from at l e a s t 
n - t distinct proces ses 
then accepti := 1 ; 
asynchronous 
t Byzantine faults 
correct if n  3t 
the code is 
parameterized in n 
and t 
) process template 
P(n; t; f ) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 10 / 64
Typical Structure of a Computation Step 
receive messages 
compute using 
messages and local variables 
(description in English 
with basic control flow 
if-then-else) 
send messages 
atomic 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 11 / 64
Typical Structure of a Computation Step 
receive messages 
compute using 
messages and local variables 
(description in English 
with basic control flow 
if-then-else) 
send messages 
atomic 
implicit 
pseudo-code 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 11 / 64
Challenge #4: Parameterized Model Checking 
Parameterized model checking problem: 
given a process template P(n; t; f ), 
resilience condition RC : n  3t ^ t  f  0, 
fairness constraints , e.g., “all messages will be delivered” 
and an LTL-X formula ' 
show for all n, t, and f satisfying RC 
(P(n; t; f ))nf + f faults j= ( ! ') 
n 
? ? ? 
t 
n 
? ? ? 
t f 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 12 / 64
Challenge #5: Liveness in Distributed Algorithms 
Interplay of safety and liveness is a central challenge in DAs 
achieving safety and liveness is non-trivial 
asynchrony and faults lead to impossibility results 
(recall first part of lecture (Fischer et al., 1985)) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 13 / 64
Challenge #5: Liveness in Distributed Algorithms 
Interplay of safety and liveness is a central challenge in DAs 
achieving safety and liveness is non-trivial 
asynchrony and faults lead to impossibility results 
(recall first part of lecture (Fischer et al., 1985)) 
Rich literature to verify safety (e.g. in concurrent systems) 
Distributed algorithms perspective: 
“doing nothing is always safe” 
“tools verify algorithms that actually might do nothing” 
Verification efforts often have to simplify assumptions 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 13 / 64
Summary 
We have to model: 
faults, 
communication medium captured in English, 
algorithms written in pseudo-code. 
and check: 
safety and liveness 
of parameterized systems 
with unbounded integers, 
non-standard fairness constraints, 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 14 / 64
Existing formalization frameworks 
Design  Specification 
TLA+/PlusCal 
Concurrent Alg. 
Proving/ 
TLC 
(Timed) IOA 
Asynchronous DA 
Proving/ 
UPPAAL 
PVS 
Theorem Proving 
? (Parameterized) 
Simulation 
Model Checking 
of FTDAs 
DISTAL 
PBFT 
Implementation 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 15 / 64
Alternative frameworks 
TLA (temporal logic of actions): 
used to design (distributed) algorithms by refinement of the spec 
verification with proof assistants (low degree of automation) 
Encodings of DA in proof assistant PVS (e.g., by Rushby): 
ad-hoc encoding 
found a bug in a published synchronous Byzantine Agreement 
algorithm (Lincoln  Rushby, 1993) 
I/O-Automata: 
originally designed to write clearer hand-written proofs 
limited tool support, e.g., Veromodo toolset is still in beta 
suitable only for asynchronous distributed algorithms 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 16 / 64
Alternative frameworks 
TLA (temporal logic of actions): 
used to design (distributed) algorithms by refinement of the spec 
verification with proof assistants (low degree of automation) 
Encodings of DA in proof assistant PVS (e.g., by Rushby): 
ad-hoc encoding 
found a bug in a published synchronous Byzantine Agreement 
algorithm (Lincoln  Rushby, 1993) 
I/O-Automata: 
originally designed to write clearer hand-written proofs 
limited tool support, e.g., Veromodo toolset is still in beta 
suitable only for asynchronous distributed algorithms 
proof assistants are very general, but with low automation degree 
“everything is possible, but nothing is easy” 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 16 / 64
Simulation and Implementation 
Distal: 
Domain-specific language (Biely et al., 2013) 
Simulation and evaluate performance of fault-tolerant algorithms 
Practical Byzantine Fault-Tolerance (Castro et al., 1999) 
and other practical algorithms: 
Implementation with optimizations 
Precise semantics is unclear 
The system is partially synchronous: 
non-divergent message delays are assumed 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 17 / 64
In this part 
We introduce efficient encoding in PROMELA. 
Verify safety and liveness of fault-tolerant algorithms (fixed parameters). 
Find counterexamples for parameters known from the literature. 
This proves adequacy of our modeling. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 18 / 64
Preliminaries: 
Promela 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 19 / 64
Promela 
PROMELA  PROcess MEta LAnguage 
SPIN  Simple Promela INterpreter 
(not that simple any more) 
Here we give a short introduction and cover only 
the features important to our work. 
Detailed documentation, tutorials, and books on: 
http://spinroot.com Gerard Holzmann 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 20 / 64
Top-level: global variables and processes 
/  g l o b a l d e c l a r a t i o n s v i s i b l e t o a l l p r o c e s s e s  / 
int x; /  a g l o b a l i n t e g e r ( a s in C)  / 
mtype = { X, Y }; /  c o n s t a n t me s s age t y p e s  / 
/  a FIFO c h anne l wi th a t most 2 me s s a g e s o f t y p e mtype  / 
chan c = [2] of { mtype }; 
active[2] proctype ProcA() { Two processes are created 
... at the initial state 
} 
proctype ProcB() { Processes can be created 
... later using: run ProcB() 
} 
init { A special process, use to 
run ProcB(); run ProcB(); create other processes 
} 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 21 / 64
One process: Basics 
int x, y; 
active proctype ProcA() { 
int z; Declare a local variable 
z = x; Assignment 
x  y; Block until the expression is evaluated to true 
true; one step to execute, no effect 
z++; 
skip; same as true 
} 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 22 / 64
One process: Control flow 
int x, y; 
active proctype P() { 
main: 
if A guarded command 
:: x == 0 - x = 1; 
:: y == 0 - y = 1; 
non-deterministically selects an option 
:: x == 1  y == 1 whose first expression is not blocked. 
- x = 0; y = 0; 
fi; continues executing the rest of the option 
goto main; step-by-step. 
} 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 23 / 64
One process: Control flow (cont.) 
int x = 0, y = 0; 
active 
proctype P() { 
main: 
if 
:: x == 0 - x = 1; 
:: y == 0 - y = 1; 
:: x == 1  y == 1 
- x = 0; y = 0; 
fi; 
goto main; 
} 
Run 1 Run 2 Run 3 
x=0,y=0 x=0,y=0 x=0,y=0 
x=1,y=0 x=0,y=1 x=1,y=0 
x=1,y=1 x=1,y=1 x=1,y=1 
x=0,y=0 x=0,y=0 x=0,y=0 
x=0,y=1 x=1,y=0 x=0,y=1 
x=1,y=1 x=1,y=1 x=1,y=1 
x=0,y=0 x=0,y=0 x=0,y=0 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 24 / 64
One process: Loops 
int x; 
active proctype P() { 
do a do..od loop 
:: x == 10 - x = 0; 
:: x == 10 - break; 
:: x  10 - x++; 
od; 
A: 
if 
basically the same. goto A 
introduces one more step 
:: x == 10 - x = 0; 
:: x == 10 - goto B; 
:: x  10 - x++; 
fi; 
goto A; 
B: 
} 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 25 / 64
Many Processes: Interleavings 
Pure interleaving semantics 
Every statement is executed 
atomically 
int x = 0, y = 1; 
active[2] proctype A() { 
x = 1 - x; 
y = 1 - y; 
} 
A[0] A[1] 
The red path is an example execution where the steps of processes 0 
and 1 alternate. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 26 / 64
Many Processes: Atomics 
use atomic f ... g to make 
execution of a sequence indivisible. 
non-deterministic choice with 
if..fi is still allowed! 
int x = 0, y = 1; 
active[2] proctype A() { 
atomic { 
x = 1 - x; 
y = 1 - y; 
} 
} 
A[0] A[1] 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 27 / 64
Many Processes: Atomics 
use atomic f ... g to make 
execution of a sequence indivisible. 
non-deterministic choice with 
if..fi is still allowed! 
int x = 0, y = 1; 
active[2] proctype A() { 
atomic { 
x = 1 - x; 
y = 1 - y; 
} 
} 
A[0] A[1] 
Larger atomic steps lead to less possible paths and states. 
Note: different atomicity degrees may lead to different verification 
results 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 27 / 64
(Asynchronous) message passing 
mtype = { A, B }; 
chan chan1 = [1] of { mtype }; queue of size 1 
chan chan2 = [1] of { mtype }; 
active proctype Ping() { 
chan1!A; insert A to “chan1” 
do 
:: chan2?B - chan1!A; 
od; 
when B is on the top of “chan2”, 
remove it and insert A to “chan1” 
} 
active proctype Pong() { 
do 
:: chan1?A - chan2!B; 
od; 
} 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 29 / 64
Blocking receive 
mtype = { A, B }; 
chan chan1 = [1] of { mtype }; 
chan chan2 = [1] of { mtype }; 
active proctype Ping() { 
chan1!A; 
do :: chan2?B -   deadlock! 
chan1!A; Ping sends A, Pong receives A, 
od; chan1?A is blocked 
} 
active proctype Pong() { 
do :: chan1?A - 
chan1?A;   deadlock! 
chan2!B; 
od; 
} 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 30 / 64
Blocking send 
mtype = { A, B }; 
chan chan1 = [1] of { mtype }; 
chan chan2 = [1] of { mtype }; 
active proctype Ping() { 
chan1!A; 
do :: chan2?B - When chan1=[A] and chan2=[B], the system deadlocks 
chan1!A; 
chan1!A; 
chan1!A;   deadlock! 
chan1!A; The shortest counter-example has 10 steps 
od; 
} Use Spin to find it 
active proctype Pong() { 
do :: chan1?A - 
chan2!B;   deadlock! 
od; 
} 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 31 / 64
Promela vs. C 
PROMELA looks like C 
But it is not! 
Non-determinism in the if statements (internal non-determinism) 
Non-determinstic scheduler (external non-determinism) 
Atomic statements 
Message passing 
PROMELA is a modeling language 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 32 / 64
Preliminaries: 
Kripke Structures 
Linear Temporal Logic (LTL) 
Control Flow Automata (CFA) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 33 / 64
Kripke structures 
A Kripke structure is a 
M = (S;S0;R;AP; L), where: 
s0 : frg 
S is a set of states, 
S0  S is the set of initial 
states, 
R  S  S is a transition 
relation, 
AP is a set of atomic 
propositions, 
L : S ! 2AP is a state-labeling 
function. s4 : fgg 
s3 : fr ; y; gg 
s1 : fyg s2 : fyg 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 34 / 64
Linear Temporal Logic 
An LTL formula is defined inductively w.r.t. 
atomic propositions AP: 
(base) p 2 AP is an LTL formula, 
if ' and   are LTL formulas, then the 
following expressions are LTL 
formulas: 
Nexttime: X', 
Eventually: F ', 
Globally: G', 
Until:   U'. 
Boolean combinations: 
' ^  , ' _  , and :'. 
s0 s1 s2 s3 s4 
s00 
s01 
s02 
s04 
s03 
s00 
0 s00 
1 s00 
2 s00 
3 s00 
4 
s000 
0 
  
s000 
1 
  
s000 
2 
' 
s000 
3 s000 
4 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 35 / 64
Recall: Typical Structure of a Computation Step 
receive messages 
compute using 
messages and local variables 
(description in English 
with basic control flow 
if-then-else) 
send messages 
atomic 
implicit 
pseudo-code 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 36 / 64
CFA: Intermediate representation 
Intermediate representation of a 
loop body: a path from qI to qF 
encodes one iteration. 
Every variable is assigned at most 
once (SSA). 
active proctype P() { 
int x, y; 
do 
:: x == 0 - x = 1; 
:: x == 1 - x = 2; 
:: x == 2 - x = 0; 
:: x == 1 
- x = 0; y = 1 - y; 
od; 
} 
qI 
q0 q1 q2 q3 
q4 
qF 
x = 0 
x = 1 
x = 1 
x = 2 
x0 = 1 
x0 = 2 
x0 = 0 
x0 = 0 
y0 = 1  y 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 37 / 64
Example: from a CFA to a Kripke structure 
Kripke structure M(n; t; f ) = (S;S0;R;AP; L) of N(n; t; f ) processes 
For a path  from qI to qF construct a formula (x; y; x0; y0) 
A state is a pair of x = (x1; : : : ; xN) 2 NN and y = (y1; : : : ; yN) 2 NN 
and the initial states are S0 = f(0; : : : ; 0)g 
((x; y); (x0; y0)) 2 R iff there are 
process index k: 1  k  N and path : 
[k moves]: (xk ; yk ; x0k 
; y0k 
) holds 
[others do not]: 8i 2 f1; : : : ;Ng n fkg: 
x0 
i = xi , y0 
i = yi . 
Propositions 
AP = f[9i: yi6= 0]; [8i: yi6= 0]g 
and a state (x; y) is labeled as: 
p 2 L((x; y)) iff (x; y) j= p. 
qI 
q0 q1 q2 q3 
q4 
qF 
x = 0 
x = 1 
x = 1 
x = 2 
x0 = 1 
x0 = 2 
x0 = 0 
x0 = 0 
y0 = 1  y 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 38 / 64
Example: Properties of ST87 in LTL 
Unforgeability. If vi = 0 for all correct processes i, then for all correct 
processes j, acceptj remains 0 forever. 
 nVf 
G 
i=1 
vi = 0 
 
! G 
 nVf 
j=1 
 
acceptj = 0 
Safety 
Completeness. If vi = 1 for all correct processes i, then there is a correct 
process j that eventually sets acceptj to 1. 
 nVf 
G 
i=1 
vi = 1 
 
! F 
 nWf 
j=1 
 
acceptj = 1 
Liveness 
Relay. If a correct process i sets accepti to 1, then eventually all 
correct processes j set acceptj to 1. 
 nWf 
G 
i=1 
 
! F 
accepti = 1 
 nVf 
j=1 
 
acceptj = 1 
Liveness 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 39 / 64
Model Checking Problems 
Finite-state MC 
Input: 
a process template P, 
an LTL formula ' (including fairness), 
values of parameters n, t, and f . 
Problem: check, whether M(n; t; f ) j= '. 
Parameterized MC 
Input: 
a process template P, 
an LTL formula  (including fairness) 
with atomic propositions of the form [9i:xi  y] and [8i:xi  y] 
Problem: check, whether 8n; t; f : n  3t ^ t  f ^ f  0: M(n; t; f ) j= . 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 40 / 64
Parameterized modeling 
 
Non-parameterized model 
checking 
as in SPIN’13: (John et al., 2013) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 41 / 64
Modeling of threshold-based algorithms in Promela. . . 
We introduce efficient encoding of threshold-based 
fault-tolerant algorithms in PROMELA 
(with parametrization!) 
Verify safety and liveness of fault-tolerant algorithms (fixed parameters). 
Find counterexamples for parameters known from the literature. 
This proves adequacy of our modeling. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 42 / 64
Modeling of threshold-based algorithms in Promela. . . 
We introduce efficient encoding of threshold-based 
fault-tolerant algorithms in PROMELA 
(with parametrization!) 
Verify safety and liveness of fault-tolerant algorithms (fixed parameters). 
Find counterexamples for parameters known from the literature. 
This proves adequacy of our modeling. 
For our method, we exploit specifics of FTDAs: 
1 central feature of the algorithms 
(message counting); 
2 specific message passing 
(we do not need to know who sent but how many of them sent 
messages); 
3 the way faults affect messages 
(again, counting messages). 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 42 / 64
Case Studies 
We consider a number of threshold-based algorithms. 
Our running example ST87 for 
1 Byzantine faults (BYZ) 
2 omission faults (OMIT) 
3 symmetric faults (SYMM) 
4 clean crashes (CLEAN). 
5 Forklore reliable broadcast for clean crashes 
[Chandra  Toueg 96, CT96] 
(to be continued) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 43 / 64
Characteristics of the FTDA by Srikanth  Toueg, 87 
Variables of process i 
vi : f0 , 1g i n i t i a l l y 0 or 1 
accepti : f0 , 1g i n i t i a l l y 0 
An atomic step: 
i f vi = 1 
then send ( echo ) to all ; 
i f received (echo) from 
at l e a s t t + 1 distinct proces ses 
and not sent ( echo ) before 
then send ( echo ) to all ; 
i f received ( echo ) from at l e a s t 
n - t distinct proces ses 
then accepti := 1 ; 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 44 / 64
Characteristics of the FTDA by Srikanth  Toueg, 87 
Variables of process i 
vi : f0 , 1g i n i t i a l l y 0 or 1 
accepti : f0 , 1g i n i t i a l l y 0 
An atomic step: 
i f vi = 1 
then send ( echo ) to all ; 
i f received (echo) from 
at l e a s t t + 1 distinct proces ses 
and not sent ( echo ) before 
then send ( echo ) to all ; 
i f received ( echo ) from at l e a s t 
n - t distinct proces ses 
then accepti := 1 ; 
the algorithm consists 
of threshold-guarded 
commands, only 
thresholds t + 1 and 
n  t 
communication is by 
“send to all” 
how processes 
distinguish distinct 
senders is not part of 
the algorithm (i.e., 
algorithm description is 
high level) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 44 / 64
Case Studies (cont.): Larger Algorithms 
more involved algorithms in the purely asynchronous setting: 
6 Asynchronous Byzantine Agreement (Bracha  Toueg 85, BT85) 
Byzantine faults 
two phases and two message types 
five status values 
properties: unforgeability, correctness (liveness), agreement 
(liveness) 
7 Condition-based Consensus (Most´ efaoui et al. 01, MRRR01) 
crash faults 
two phases and four message types 
nine status variables 
properties: validity, agreement, termination (liveness) 
8 Fast Byzantine Consensus: common case (Martin, Alvisi 06, 
MA06) 
Byzantine faults 
the core part of the algorithm 
no cryptography 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 45 / 64
Experimental Results at Glance 
Algorithm Fault Parameters Resilience Properties Time 
1. ST87 BYZ n = 7; t = 2; f = 2 n  3t U, C, R 6 sec. 
1. ST87 BYZ n = 7; t = 3; f = 2 n  3t U, C, R 5 sec. 
1. ST87 BYZ n = 7; t = 1; f = 2 n  3t U, C, R 1 sec. 
2. ST87 OMIT n = 5; t = 2; f = 2 n  2t U, C, R 4 sec. 
2. ST87 OMIT n = 5; t = 2; f = 3 n  2t U, C, R 5 sec. 
3. ST87 SYMM n = 5; t = 1; fp = 1; fs = 0 n  2t U, C, R 1 sec. 
3. ST87 SYMM n = 5; t = 2; fp = 3; fs = 1 n  2t U, C, R 1 sec. 
4. ST87 CLEAN n = 3; t = 2; fc = 2; fnc = 0 n  t U, C, R 1 sec. 
5. CT96 CRASH n = 2 — U, C, R 1 sec. 
6. BT85 BYZ n = 5; t = 1; f = 1 n  3t R 131 sec. 
6. BT85 BYZ n = 5; t = 1; f = 2 n  3t R 1 sec. 
6. BT85 BYZ n = 5; t = 2; f = 2 n  3t R 1 sec. 
7. MRRR01 CRASH n = 3; t = 1; f = 1 n  2t V0, V1, A, T 1 sec. 
7. MRRR01 CRASH n = 3; t = 1; f = 2 n  2t V0, V1, A, T 1 sec. 
8. MA06 BYZ 
p = 4,a = 6,l = 4, 
t = 1,f = 1 
p  3t, a  5t, l  3t CS1, CS3, CL1, CL2 3 hrs. 
8. MA06 BYZ 
p = 4,a = 5,l = 4, 
t = 1, f = 1 
p  3t, a  5t, l  3t CS1, CS3, CL1, CL2 14 min. 
8. MA06 BYZ 
p = 4,a = 6,l = 4, 
t = 1, f = 2 
p  3t, a  5t, l  3t CS1, CS3, CL1, CL2 2 sec. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 46 / 64
Experimental Results at Glance 
Algorithm Fault Parameters Resilience Properties Time 
1. ST87 BYZ n = 7; t = 2; f = 2 n  3t U, C, R 6 sec. 
1. ST87 BYZ n = 7; t = 3; f = 2 n  3t U, C, R 5 sec. 
1. ST87 BYZ n = 7; t = 1; f = 2 n  3t U, C, R 1 sec. 
2. ST87 OMIT n = 5; t = 2; f = 2 n  2t U, C, R 4 sec. 
2. ST87 OMIT n = 5; t = 2; f = 3 n  2t U, C, R 5 sec. 
3. ST87 SYMM n = 5; t = 1; fp = 1; fs = 0 n  2t U, C, R 1 sec. 
3. ST87 SYMM n = 5; t = 2; fp = 3; fs = 1 n  2t U, C, R 1 sec. 
4. ST87 CLEAN n = 3; t = 2; fc = 2; fnc = 0 n  t U, C, R 1 sec. 
5. CT96 CRASH n = 2 — U, C, R 1 sec. 
6. BT85 BYZ n = 5; t = 1; f = 1 n  3t R 131 sec. 
6. BT85 BYZ n = 5; t = 1; f = 2 n  3t R 1 sec. 
6. BT85 BYZ n = 5; t = 2; f = 2 n  3t R 1 sec. 
7. MRRR01 CRASH n = 3; t = 1; f = 1 n  2t V0, V1, A, T 1 sec. 
7. MRRR01 CRASH n = 3; t = 1; f = 2 n  2t V0, V1, A, T 1 sec. 
8. MA06 BYZ 
p = 4,a = 6,l = 4, 
t = 1,f = 1 
p  3t, a  5t, l  3t CS1, CS3, CL1, CL2 3 hrs. 
8. MA06 BYZ 
p = 4,a = 5,l = 4, 
t = 1, f = 1 
p  3t, a  5t, l  3t CS1, CS3, CL1, CL2 14 min. 
8. MA06 BYZ 
p = 4,a = 6,l = 4, 
t = 1, f = 2 
p  3t, a  5t, l  3t CS1, CS3, CL1, CL2 2 sec. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 46 / 64
Experimental Results at Glance 
Algorithm Fault Parameters Resilience Properties 1. ST87 BYZ n = 7; t = 2; f = 2 n  3t U, C, R 1. ST87 BYZ n = 7; t = 3; f = 2 n  3t U, C, R 1. ST87 BYZ n = 7; t = 1; f = 2 n  3t U, C, R 2. ST87 OMIT n = 5; t = 2; f = 2 n  2t U, C, R 2. ST87 OMIT n = 5; t = 2; f = 3 n  2t U, C, R 3. ST87 SYMM n = 5; t = 1; fp = 1; fs = 0 n  2t U, C, R 3. ST87 SYMM n = 5; t = 2; fp = 3; fs = 1 n  2t U, C, R 4. ST87 CLEAN n = 3; t = 2; fc = 2; fnc = 0 n  t U, C, R 5. CT96 CRASH n = 2 — U, C, R Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 46 / 64
Experimental Results at Glance 
Algorithm Fault Parameters Resilience Properties Time 
1. ST87 BYZ n = 7; t = 2; f = 2 n  3t U, C, R 6 sec. 
1. ST87 BYZ n = 7; t = 3; f = 2 n  3t U, C, R 5 sec. 
1. ST87 BYZ n = 7; t = 1; f = 2 n  3t U, C, R 1 sec. 
2. ST87 OMIT n = 5; t = 2; f = 2 n  2t U, C, R 4 sec. 
2. ST87 OMIT n = 5; t = 2; f = 3 n  2t U, C, R 5 sec. 
3. ST87 SYMM n = 5; t = 1; fp = 1; fs = 0 n  2t U, C, R 1 sec. 
3. ST87 SYMM n = 5; t = 2; fp = 3; fs = 1 n  2t U, C, R 1 sec. 
4. ST87 CLEAN n = 3; t = 2; fc = 2; fnc = 0 n  t U, C, R 1 sec. 
5. CT96 CRASH n = 2 — U, C, R 1 sec. 
6. BT85 BYZ n = 5; t = 1; f = 1 n  3t R 131 sec. 
6. BT85 BYZ n = 5; t = 1; f = 2 n  3t R 1 sec. 
6. BT85 BYZ n = 5; t = 2; f = 2 n  3t R 1 sec. 
7. MRRR01 CRASH n = 3; t = 1; f = 1 n  2t V0, V1, A, T 1 sec. 
7. MRRR01 CRASH n = 3; t = 1; f = 2 n  2t V0, V1, A, T 1 sec. 
8. MA06 BYZ 
p = 4,a = 6,l = 4, 
t = 1,f = 1 
p  3t, a  5t, l  3t CS1, CS3, CL1, CL2 3 hrs. 
8. MA06 BYZ 
p = 4,a = 5,l = 4, 
t = 1, f = 1 
p  3t, a  5t, l  3t CS1, CS3, CL1, CL2 14 min. 
8. MA06 BYZ 
p = 4,a = 6,l = 4, 
t = 1, f = 2 
p  3t, a  5t, l  3t CS1, CS3, CL1, CL2 2 sec. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 46 / 64
Experimental Results at Glance 
6. BT85 BYZ n = 5; t = 1; f = 1 n  3t R 6. BT85 BYZ n = 5; t = 1; f = 2 n  3t R 6. BT85 BYZ n = 5; t = 2; f = 2 n  3t R 7. MRRR01 CRASH n = 3; t = 1; f = 1 n  2t V0, V1, A, T 7. MRRR01 CRASH n = 3; t = 1; f = 2 n  2t V0, V1, A, T 8. MA06 BYZ 
p = 4,a = 6,l = 4, 
t = 1,f = 1 
p  3t, a  5t, l  3t CS1, CS3, CL1, CL2 8. MA06 BYZ 
p = 4,a = 5,l = 4, 
t = 1, f = 1 
p  3t, a  5t, l  3t CS1, CS3, CL1, CL2 8. MA06 BYZ 
p = 4,a = 6,l = 4, 
t = 1, f = 2 
p  3t, a  5t, l  3t CS1, CS3, CL1, CL2 Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 46 / 64
Experimental Results at Glance 
Algorithm Fault Parameters Resilience Properties Time 
1. ST87 BYZ n = 7; t = 2; f = 2 n  3t U, C, R 6 sec. 
1. ST87 BYZ n = 7; t = 3; f = 2 n  3t U, C, R 5 sec. 
1. ST87 BYZ n = 7; t = 1; f = 2 n  3t U, C, R 1 sec. 
2. ST87 OMIT n = 5; t = 2; f = 2 n  2t U, C, R 4 sec. 
2. ST87 OMIT n = 5; t = 2; f = 3 n  2t U, C, R 5 sec. 
3. ST87 SYMM n = 5; t = 1; fp = 1; fs = 0 n  2t U, C, R 1 sec. 
3. ST87 SYMM n = 5; t = 2; fp = 3; fs = 1 n  2t U, C, R 1 sec. 
4. ST87 CLEAN n = 3; t = 2; fc = 2; fnc = 0 n  t U, C, R 1 sec. 
5. CT96 CRASH n = 2 — U, C, R 1 sec. 
6. BT85 BYZ n = 5; t = 1; f = 1 n  3t R 131 sec. 
6. BT85 BYZ n = 5; t = 1; f = 2 n  3t R 1 sec. 
6. BT85 BYZ n = 5; t = 2; f = 2 n  3t R 1 sec. 
7. MRRR01 CRASH n = 3; t = 1; f = 1 n  2t V0, V1, A, T 1 sec. 
7. MRRR01 CRASH n = 3; t = 1; f = 2 n  2t V0, V1, A, T 1 sec. 
8. MA06 BYZ 
p = 4,a = 6,l = 4, 
t = 1,f = 1 
p  3t, a  5t, l  3t CS1, CS3, CL1, CL2 3 hrs. 
8. MA06 BYZ 
p = 4,a = 5,l = 4, 
t = 1, f = 1 
p  3t, a  5t, l  3t CS1, CS3, CL1, CL2 14 min. 
8. MA06 BYZ 
p = 4,a = 6,l = 4, 
t = 1, f = 2 
p  3t, a  5t, l  3t CS1, CS3, CL1, CL2 2 sec. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 46 / 64
Experimental Results: on ST87, the Byzantine Case 
Time (sec, logscale) 
two faults: f = 2 
no faults: f = 0 
Memory (MB, logscale) 
no faults: f = 2 
The more faults we have, the easier the problem is: 
Two faults: we can check the systems of up to nine processes 
No faults: we can check the systems of up to seven processes 
Precision of modeling: we found counter-examples for the corner 
cases n = 3t and f  t, where the resilience condition is violated. 
(June 2013: somebody wrote on Wikipedia that n = 3t should work :-) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 47 / 64
Discussion of the specifications 
Unforgeability. If vi = 0 for all correct processes i, then for all correct 
processes j, acceptj remains 0 forever. 
 nVf 
G 
i=1 
vi = 0 
 
! G 
 nVf 
j=1 
 
acceptj = 0 
The specification of Byzantine FTDAs have the following features: 
Only the states of correct processes are evaluated. 
Faulty processes may be Byzantine. (no assumption on 
behavior) 
Specifications do not talk about individual processes. 
Only global safety and progress are important. 
Indexed temporal logic is not required! 
Quantification over processes is on the level of atomic 
propositions. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 48 / 64
Threshold-Guarded Distributed Algorithms 
Standard construct: quantified guards (t=f=0) 
Existential Guard 
if received m from some process then ... 
Universal Guard 
if received m from all processes then ... 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 49 / 64
Threshold-Guarded Distributed Algorithms 
Standard construct: quantified guards (t=f=0) 
Existential Guard 
if received m from some process then ... 
Universal Guard 
if received m from all processes then ... 
what if faults might occur? 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 49 / 64
Threshold-Guarded Distributed Algorithms 
Standard construct: quantified guards (t=f=0) 
Existential Guard 
if received m from some process then ... 
Universal Guard 
if received m from all processes then ... 
what if faults might occur? 
Fault-Tolerant Algorithms: n processes, at most t are Byzantine 
Threshold Guard 
if received m from n  t processes then ... 
(the processes cannot refer to f!) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 49 / 64
Counting Argument in Threshold-Guarded Algorithms 
n 
t f 
t + 1 
Correct processes count incoming messages from distinct processes 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 50 / 64
Counting Argument in Threshold-Guarded Algorithms 
n 
t f 
t + 1 
Correct processes count incoming messages from distinct processes 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 50 / 64
Counting Argument in Threshold-Guarded Algorithms 
n 
t f 
t + 1 
at least one non-faulty sent the message 
Correct processes count incoming messages from distinct processes 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 50 / 64
Modeling threshold-based algorithms in Promela 
As the distributed algorithms are given in pseudo-code, 
we have to decide on how to encode in PROMELA: 
send to all and receive 
counting expressions “received m from n  t distinct processes” 
faults 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 51 / 64
Modeling threshold-based algorithms in Promela 
As the distributed algorithms are given in pseudo-code, 
we have to decide on how to encode in PROMELA: 
send to all and receive 
counting expressions “received m from n  t distinct processes” 
faults 
In what follows, we compare side-by-side two solutions: 
A straightforward encoding with PROMELA channels and 
explicit representation of faulty processes. [Solution 1] 
An advanced encoding with shared variables and fault injection. 
[Solution 2] 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 51 / 64
Modeling threshold-based algorithms in Promela 
As the distributed algorithms are given in pseudo-code, 
we have to decide on how to encode in PROMELA: 
send to all and receive 
counting expressions “received m from n  t distinct processes” 
faults 
In what follows, we compare side-by-side two solutions: 
A straightforward encoding with PROMELA channels and 
explicit representation of faulty processes. [Solution 1] 
An advanced encoding with shared variables and fault injection. 
[Solution 2] 
To decouple encoding of reliable message passing and of faults, 
we first consider message passing without faults, 
and then show how to encode faults. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 51 / 64
Template in Promela 
We implement the following loop 
on the right. 
receive messages 
compute using 
messages and local variables 
(description in English 
with basic control flow 
if-then-else) 
send messages 
atomic 
/  s h a r e d s t a t e : 
a v a r i a b l e o r a c h anne l  / 
active proctype[N(n,t,f)] P(){ 
/  l o c a l v a r i a b l e t o c ount 
me s s a g e s from d i s t i n c t 
p r o c e s s e s  / 
int nrcvd; 
/  i n i t i a l i z a t i o n  / 
loop: atomic { 
/  
1 . r e c e i v e and c ount me s s a ge s 
2 . compute us ing nr cvd 
3 . s end me s s a g e s  / 
} 
goto loop; 
} 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 52 / 64
Modeling Message Passing 
All our case studies are designed with the assumption of 
classic reliable asynchronous message passing as in (Fischer et al., 
1985): 
non-blocking communication, 
operations “receive” and “send” are executed immediately. 
if a message can be received now, it may be also received later, 
a process does not have to receive a message as soon as it is 
able to. 
every sent message is eventually received, 
but there are no bounds on the delays. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 53 / 64
Solution 1: Message Passing using Promela channels 
A straightforward encoding using message channels: 
/  me s s age t y p e  / 
mtype = { ECHO }; 
/  p o inttop o i n t c h a n n e l s  / 
chan p2p[N][N] = [1] of { mtype }; 
/  t a g r e c e i v e d me s s a g e s  / 
bit rx[N][N]; 
Sending a message to all processes: 
for (i : 1 .. N) { p2p[_pid][i]!ECHO; } 
Note: pid denotes the process identifier in PROMELA 
(we use it solely to encode message passing). 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 54 / 64
Solution 1: Message Passing (cont.) 
Receiving and counting messages from distinct processes 
(no faults yet): 
/  l o c a l  / int nrcvd = 0; /  i n i t i a l l y , no me s s a ge s  / 
... 
i = 0; 
do /  i s t h e r e a me s s age from p r o c e s s i ?  / 
:: (i  N)  nempty(p2p[i][_pid]) - 
p2p[i][_pid]?ECHO; /  remove i t  / 
if 
:: !rx[i][_pid] - /  1 . t h e f i r s t t ime :  / 
rx[i][_pid] = 1; /  a . mark a s r e c e i v e d  / 
nrcvd++; break; /  b . i n c r e a s e l o c a l c o u n t e r  / 
:: rx[i][_pid]; /  2 . i g n o r e a d u p l i c a t e  / 
fi; i++; /  ne xt p r o c e s s  / 
:: (i  N) - i++; /  r e c e i v e no t h ing from i  / 
:: i == N - break; 
od 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 55 / 64
Solution 2: Simulating message passing with variables 
Keeping the number of send-to-all’s by (correct) processes: 
int nsnt; /  s h a r e d v a r i a b l e  / 
/  number o f sendtoa l l ’ s s e n t by c o r r e c t p r o c e s s e s  / 
Sending a message to all: 
nsnt++; 
Receiving and counting messages from distinct processes (no faults): 
if /  p i c k a l a r g e r v a l ue  nsnt  / 
:: ((nrcvd + 1)  nsnt) - 
nrcvd++; /  one more me s s age  / 
:: skip; /  o r no t h ing  / 
fi; 
Reliable communication as a fairness property: 
FG[8i:nrcvdi  nsnt] 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 56 / 64
Solution 2: Some questions you might ask 
Q1: instead of 
if 
:: ((nrcvd + 1)  nsnt) - 
nrcvd++; /  one more me s s age  / 
:: skip; /  o r no t h ing  / 
fi; 
why cannot we just write: 
nrcvd = nsnt; 
A1: You can, but that will be another model, not [FLP85]! 
[FLP85] only guarantees that every message is eventually received. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 57 / 64
Solution 2: Some questions you might ask (cont.) 
Reliable communication: 
every sent message is eventually received. 
Q2: Why do we write 
FG[8i:nrcvdi  nsnt] (1) 
instead of: 
8i: GF [nrcvdi  nsnt] (2) 
A2: We like to write (2), but it will require us to use another logic called 
indexed LTL, which will cause problems in the parameterized case. 
For threshold-based algorithms, the value of nsnt is changes 
at most n times. 
Under this assumption, (2) is equivalent (1). 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 58 / 64
Solution 1 (cont.): Explicit Modeling of Faults 
(Lamport et al., 1982) introduce Byzantine processes that can virtually 
do anything. 
In our case, Byzantine behavior boils down to sending ECHO to some 
of the correct processes and not sending ECHO to the others: 
active[F] proctype Byz() { 
step: 
atomic { 
i = 0; 
do /  s end ECHO t o p r o c e s s i  / 
:: i  N - p2p[_pid][i]!ECHO; i++; 
/  o r no t  / 
:: i  N - i++; 
:: i == N - break; 
od 
}; 
goto step; 
} 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 59 / 64
Solution 2 (cont.): 
Injecting Faults into Message Counters 
We instantiate n  f correct processes and no faulty processes. 
Instead, we say that the correct processes may receive up to f 
additional messages due to faults: 
if :: ((nrcvd + 1)  nsnt + f) - 
nrcvd++; /  r e c e i v e one more me s s age  / 
:: skip; /  o r no t h ing  / 
fi; 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 60 / 64
Solution 2 (cont.): 
Injecting Faults into Message Counters 
We instantiate n  f correct processes and no faulty processes. 
Instead, we say that the correct processes may receive up to f 
additional messages due to faults: 
if :: ((nrcvd + 1)  nsnt + f) - 
nrcvd++; /  r e c e i v e one more me s s age  / 
:: skip; /  o r no t h ing  / 
fi; 
The fairness still forces the processes to receive all the messages sent 
by the correct processes: 
FG[8i:nrcvdi  nsnt] 
Note: each correct process sends at most one ECHO message. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 60 / 64
Solution 2 (cont.): Modeling different kinds of faults 
Byzantine faults (previous slide): 
create only correct processes, i.e., n  f processes 
only they have to satisfy spec 
extra messages from Byzantine: ((nrcvd + 1)  nsnt + f) 
fairness (reliable communication): FG[8i:nrcvdi  nsnt] 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 61 / 64
Solution 2 (cont.): Modeling different kinds of faults 
Byzantine faults (previous slide): 
create only correct processes, i.e., n  f processes 
only they have to satisfy spec 
extra messages from Byzantine: ((nrcvd + 1)  nsnt + f) 
fairness (reliable communication): FG[8i:nrcvdi  nsnt] 
Omission faults (processes fail to send messages): 
create all processes, i.e., n processes 
all of them are mentioned in the specification 
no additional messages: ((nrcvd + 1)  nsnt) 
fairness (with possible message loss due to faults) 
FG[8i:nrcvdi  nsnt  f ] 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 61 / 64
Solution 2 (cont.): Modeling different kinds of faults 
Byzantine faults (previous slide): 
create only correct processes, i.e., n  f processes 
only they have to satisfy spec 
extra messages from Byzantine: ((nrcvd + 1)  nsnt + f) 
fairness (reliable communication): FG[8i:nrcvdi  nsnt] 
Omission faults (processes fail to send messages): 
create all processes, i.e., n processes 
all of them are mentioned in the specification 
no additional messages: ((nrcvd + 1)  nsnt) 
fairness (with possible message loss due to faults) 
FG[8i:nrcvdi  nsnt  f ] 
Crash faults: similar to omissions with crash control state added 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 61 / 64
Experiments: Solution 1 vs. Solution 2 
States (logscale) 
1e+08 
1e+07 
1e+06 
100000 
10000 
1000 
100 
10 
states (logscale) 
3 4 5 6 7 8 
number of processes, N 
Memory (MB, logscale,  12 GB) 
10000 
1000 
100 
memory, MB (logscale) 
3 4 5 6 7 8 
number of processes, N 
Solution 1: Channels + explicit Byzantine processes (blue) 
Solution 2: shared variables + fault injection (red) 
in the presence of one Byzantine faulty process (f = 1) 
(case f = 2 runs out of memory too fast) 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 62 / 64
Summary 
We show how to model threshold-based fault-tolerant algorithms 
starting with an imprecise description 
We create PROMELA models using expert advice. 
The tool demonstrates that the model behaves as predicted by theory 
(for concrete values of parameters) 
This reference implementation allows us to optimize the encoding 
... and to make the model amenable to parameterized verification 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 63 / 64
Tomorrow: 
parameterized model checking 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 64 / 64
References I 
Biely, M., Delgado, P., Milosevic, Z.,  Schiper, A. 2013 (June). 
Distal: A framework for implementing fault-tolerant distributed algorithms. 
Pages 1–8 of: Dependable Systems and Networks (DSN), 2013 43rd Annual IEEE/IFIP 
International Conference on. 
Castro, Miguel, Liskov, Barbara, et al. 1999. 
Practical Byzantine fault tolerance. 
Pages 173–186 of: OSDI, vol. 99. 
Fischer, Michael J., Lynch, Nancy A.,  Paterson, M. S. 1985. 
Impossibility of Distributed Consensus with one Faulty Process. 
J. ACM, 32(2), 374–382. 
http://doi.acm.org/10.1145/3149.214121. 
John, Annu, Konnov, Igor, Schmid, Ulrich, Veith, Helmut,  Widder, Josef. 2013. 
Towards Modeling and Model Checking Fault-Tolerant Distributed Algorithms. 
Pages 209–226 of: SPIN. 
LNCS, vol. 7976. 
Lamport, Leslie, Shostak, Robert E.,  Pease, Marshall C. 1982. 
The Byzantine Generals Problem. 
ACM Trans. Program. Lang. Syst., 4(3), 382–401. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 65 / 64
References II 
Lincoln, P.,  Rushby, J. 1993. 
A formally verified algorithm for interactive consistency under a hybrid fault model. 
Pages 402–411 of: FTCS-23. 
http://dx.doi.org/10.1109/FTCS.1993.627343. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 66 / 64
Folklore Reliable Broadcast (e.g., Chandra  Toueg, 
96) 
Correct processes agree on value vi in the presence of crash faults. 
Variables of process i 
vi : f0 , 1g i n i t i a l l y 0 or 1 
accepti : f0 , 1g i n i t i a l l y 0 
An atomic step: 
i f ( vi = 1 or received echo from some proces s ) 
and accepti = 0 
then begin 
send echo to all ; 
accepti := 1 ; 
end 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 67 / 64
Folklore Reliable Broadcast (e.g., Chandra  Toueg, 
96) 
Correct processes agree on value vi in the presence of crash faults. 
Variables of process i 
vi : f0 , 1g i n i t i a l l y 0 or 1 
accepti : f0 , 1g i n i t i a l l y 0 
An atomic step: 
i f ( vi = 1 or received echo from some proces s ) 
and accepti = 0 
then begin 
send echo to all ; 
/* when crashing it sends to a subset of processes */ 
accepti := 1 ; /* it can also crash here */ 
end 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 67 / 64
Verification Problem as in Distributed Computing 
Given a distributed algorithm A and specifications 'U, 'C, 'R, 
Fix n and t with n  3t, 
show that every execution of A(n; t) satisfies 'U, 'C, 'R. 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 68 / 64
Verification Problem as in Distributed Computing 
Given a distributed algorithm A and specifications 'U, 'C, 'R, 
Fix n and t with n  3t, 
show that every execution of A(n; t) satisfies 'U, 'C, 'R. 
In every execution: 
the number of faulty processes is restricted, i.e., f  t; 
processes can use n and t in the code, but not f ; 
f is constant 
(if a process fails late, its “correct” behavior was a Byzantine trick). 
A distributed 
system A(n; t) 
f = 0 
. . . 
f = t 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 68 / 64
Verification Problem as in Distributed Computing 
Given a distributed algorithm A and specifications 'U, 'C, 'R, 
Fix n and t with n  3t, 
show that every execution of A(n; t) satisfies 'U, 'C, 'R. 
In every execution: 
the number of faulty processes is restricted, i.e., f  t; 
processes can use n and t in the code, but not f ; 
f is constant 
(if a process fails late, its “correct” behavior was a Byzantine trick). 
A distributed 
system A(n; t) 
f = 0 
. . . 
f = t 
Counterexamples when f  t? 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 68 / 64
Experiments: Channels vs. Shared Variables 
enumerating reachable states in SPIN with POR and state 
compression 
States (logscale) 
1e+09 
1e+08 
1e+07 
1e+06 
100000 
10000 
1000 
100 
states (logscale) 
3 4 5 6 7 8 
number of processes, N 
Memory (MB, logscale, limit of 12 
GB) 
10000 
1000 
100 
memory, MB (logscale) 
3 4 5 6 7 8 
number of processes, N 
Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 69 / 64

More Related Content

What's hot

How Triton can help to reverse virtual machine based software protections
How Triton can help to reverse virtual machine based software protectionsHow Triton can help to reverse virtual machine based software protections
How Triton can help to reverse virtual machine based software protectionsJonathan Salwan
 
Dynamic Binary Analysis and Obfuscated Codes
Dynamic Binary Analysis and Obfuscated Codes Dynamic Binary Analysis and Obfuscated Codes
Dynamic Binary Analysis and Obfuscated Codes Jonathan Salwan
 
Process synchronization(deepa)
Process synchronization(deepa)Process synchronization(deepa)
Process synchronization(deepa)Nagarajan
 
Agilent flash programming agilent utility card versus deep serial memory-ca...
Agilent flash programming   agilent utility card versus deep serial memory-ca...Agilent flash programming   agilent utility card versus deep serial memory-ca...
Agilent flash programming agilent utility card versus deep serial memory-ca...AgilentT&M EMEA
 
20080501 software verification_sharygina_lecture01
20080501 software verification_sharygina_lecture0120080501 software verification_sharygina_lecture01
20080501 software verification_sharygina_lecture01Computer Science Club
 
الإعتيان والتبديل التمثيلي الرقمي
الإعتيان والتبديل التمثيلي الرقميالإعتيان والتبديل التمثيلي الرقمي
الإعتيان والتبديل التمثيلي الرقميDr. Munthear Alqaderi
 
Pragmatic Optimization in Modern Programming - Ordering Optimization Approaches
Pragmatic Optimization in Modern Programming - Ordering Optimization ApproachesPragmatic Optimization in Modern Programming - Ordering Optimization Approaches
Pragmatic Optimization in Modern Programming - Ordering Optimization ApproachesMarina Kolpakova
 
Operating System - Monitors (Presentation)
Operating System - Monitors (Presentation)Operating System - Monitors (Presentation)
Operating System - Monitors (Presentation)Experts Desk
 
Sensors and Journalism Seminar, University of British Columbia - Day 1
Sensors and Journalism Seminar, University of British Columbia - Day 1Sensors and Journalism Seminar, University of British Columbia - Day 1
Sensors and Journalism Seminar, University of British Columbia - Day 1ferguspitt
 
Mca ii os u-2 process management & communication
Mca  ii  os u-2 process management & communicationMca  ii  os u-2 process management & communication
Mca ii os u-2 process management & communicationRai University
 
Pragmatic optimization in modern programming - modern computer architecture c...
Pragmatic optimization in modern programming - modern computer architecture c...Pragmatic optimization in modern programming - modern computer architecture c...
Pragmatic optimization in modern programming - modern computer architecture c...Marina Kolpakova
 
20051019 automating regression testing for evolving gui software
20051019 automating regression testing for evolving gui software20051019 automating regression testing for evolving gui software
20051019 automating regression testing for evolving gui softwareWill Shen
 

What's hot (19)

Monitors
MonitorsMonitors
Monitors
 
How Triton can help to reverse virtual machine based software protections
How Triton can help to reverse virtual machine based software protectionsHow Triton can help to reverse virtual machine based software protections
How Triton can help to reverse virtual machine based software protections
 
Semaphores
SemaphoresSemaphores
Semaphores
 
Dynamic Binary Analysis and Obfuscated Codes
Dynamic Binary Analysis and Obfuscated Codes Dynamic Binary Analysis and Obfuscated Codes
Dynamic Binary Analysis and Obfuscated Codes
 
Process synchronization(deepa)
Process synchronization(deepa)Process synchronization(deepa)
Process synchronization(deepa)
 
Os3
Os3Os3
Os3
 
SYNCHRONIZATION
SYNCHRONIZATIONSYNCHRONIZATION
SYNCHRONIZATION
 
Agilent flash programming agilent utility card versus deep serial memory-ca...
Agilent flash programming   agilent utility card versus deep serial memory-ca...Agilent flash programming   agilent utility card versus deep serial memory-ca...
Agilent flash programming agilent utility card versus deep serial memory-ca...
 
Real time SHVC decoder
Real time SHVC decoderReal time SHVC decoder
Real time SHVC decoder
 
20080501 software verification_sharygina_lecture01
20080501 software verification_sharygina_lecture0120080501 software verification_sharygina_lecture01
20080501 software verification_sharygina_lecture01
 
الإعتيان والتبديل التمثيلي الرقمي
الإعتيان والتبديل التمثيلي الرقميالإعتيان والتبديل التمثيلي الرقمي
الإعتيان والتبديل التمثيلي الرقمي
 
Pragmatic Optimization in Modern Programming - Ordering Optimization Approaches
Pragmatic Optimization in Modern Programming - Ordering Optimization ApproachesPragmatic Optimization in Modern Programming - Ordering Optimization Approaches
Pragmatic Optimization in Modern Programming - Ordering Optimization Approaches
 
SCP
SCPSCP
SCP
 
Operating System - Monitors (Presentation)
Operating System - Monitors (Presentation)Operating System - Monitors (Presentation)
Operating System - Monitors (Presentation)
 
Sensors and Journalism Seminar, University of British Columbia - Day 1
Sensors and Journalism Seminar, University of British Columbia - Day 1Sensors and Journalism Seminar, University of British Columbia - Day 1
Sensors and Journalism Seminar, University of British Columbia - Day 1
 
Mca ii os u-2 process management & communication
Mca  ii  os u-2 process management & communicationMca  ii  os u-2 process management & communication
Mca ii os u-2 process management & communication
 
Pragmatic optimization in modern programming - modern computer architecture c...
Pragmatic optimization in modern programming - modern computer architecture c...Pragmatic optimization in modern programming - modern computer architecture c...
Pragmatic optimization in modern programming - modern computer architecture c...
 
Code GPU with CUDA - SIMT
Code GPU with CUDA - SIMTCode GPU with CUDA - SIMT
Code GPU with CUDA - SIMT
 
20051019 automating regression testing for evolving gui software
20051019 automating regression testing for evolving gui software20051019 automating regression testing for evolving gui software
20051019 automating regression testing for evolving gui software
 

Viewers also liked

Introduction to Parallel Distributed Computer Systems
Introduction to Parallel Distributed Computer SystemsIntroduction to Parallel Distributed Computer Systems
Introduction to Parallel Distributed Computer SystemsMrMaKKaWi
 
Fault Tolerance System
Fault Tolerance SystemFault Tolerance System
Fault Tolerance Systemprakashjjaya
 
Distributed & parallel system
Distributed & parallel systemDistributed & parallel system
Distributed & parallel systemManish Singh
 
Introduction to Embedded Systems
Introduction to Embedded SystemsIntroduction to Embedded Systems
Introduction to Embedded SystemsAnil Kumar Pugalia
 
Computer science seminar topics
Computer science seminar topicsComputer science seminar topics
Computer science seminar topics123seminarsonly
 
Distributed computing
Distributed computingDistributed computing
Distributed computingshivli0769
 

Viewers also liked (9)

Workshop on Digital Datcom and Simulink
Workshop on Digital Datcom and SimulinkWorkshop on Digital Datcom and Simulink
Workshop on Digital Datcom and Simulink
 
Introduction to Parallel Distributed Computer Systems
Introduction to Parallel Distributed Computer SystemsIntroduction to Parallel Distributed Computer Systems
Introduction to Parallel Distributed Computer Systems
 
Fault Tolerance System
Fault Tolerance SystemFault Tolerance System
Fault Tolerance System
 
Distributed & parallel system
Distributed & parallel systemDistributed & parallel system
Distributed & parallel system
 
Embedded Software Design
Embedded Software DesignEmbedded Software Design
Embedded Software Design
 
Introduction to Embedded Systems
Introduction to Embedded SystemsIntroduction to Embedded Systems
Introduction to Embedded Systems
 
Toolchain
ToolchainToolchain
Toolchain
 
Computer science seminar topics
Computer science seminar topicsComputer science seminar topics
Computer science seminar topics
 
Distributed computing
Distributed computingDistributed computing
Distributed computing
 

Similar to Introduction into Fault-tolerant Distributed Algorithms and their Modeling (Part 2)

Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...
Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...
Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...Iosif Itkin
 
Crash Analysis with Reverse Taint
Crash Analysis with Reverse TaintCrash Analysis with Reverse Taint
Crash Analysis with Reverse Taintmarekzmyslowski
 
DEFCON 21: EDS: Exploitation Detection System WP
DEFCON 21: EDS: Exploitation Detection System WPDEFCON 21: EDS: Exploitation Detection System WP
DEFCON 21: EDS: Exploitation Detection System WPAmr Thabet
 
Joe armstrong erlanga_languageforprogrammingreliablesystems
Joe armstrong erlanga_languageforprogrammingreliablesystemsJoe armstrong erlanga_languageforprogrammingreliablesystems
Joe armstrong erlanga_languageforprogrammingreliablesystemsSentifi
 
Fuzzing101 uvm-reporting-and-mitigation-2011-02-10
Fuzzing101 uvm-reporting-and-mitigation-2011-02-10Fuzzing101 uvm-reporting-and-mitigation-2011-02-10
Fuzzing101 uvm-reporting-and-mitigation-2011-02-10Codenomicon
 
SIP Flooding Attack Detection Using Hybrid Detection Algorithm
SIP Flooding Attack Detection Using Hybrid Detection AlgorithmSIP Flooding Attack Detection Using Hybrid Detection Algorithm
SIP Flooding Attack Detection Using Hybrid Detection AlgorithmEditor IJMTER
 
Keynote joearmstrong
Keynote joearmstrongKeynote joearmstrong
Keynote joearmstrongSentifi
 
An Architectural Concept for Intrusion Tolerance in Air Traffic Networks
An Architectural Concept for Intrusion Tolerance in Air Traffic NetworksAn Architectural Concept for Intrusion Tolerance in Air Traffic Networks
An Architectural Concept for Intrusion Tolerance in Air Traffic NetworksÜlger Ahmet
 
A Smart Fuzzing Approach for Integer Overflow Detection
A Smart Fuzzing Approach for Integer Overflow DetectionA Smart Fuzzing Approach for Integer Overflow Detection
A Smart Fuzzing Approach for Integer Overflow DetectionITIIIndustries
 
VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...
VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...
VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...Stefano Dalla Palma
 
A formal definition of computer worms and some related results
A formal definition of computer worms and some related resultsA formal definition of computer worms and some related results
A formal definition of computer worms and some related resultsUltraUploader
 
Sioux Hot-or-Not: Functional programming: unlocking the real power of multi-c...
Sioux Hot-or-Not: Functional programming: unlocking the real power of multi-c...Sioux Hot-or-Not: Functional programming: unlocking the real power of multi-c...
Sioux Hot-or-Not: Functional programming: unlocking the real power of multi-c...siouxhotornot
 
SOPRANO Ambient Middleware (AAL)
SOPRANO Ambient Middleware (AAL)SOPRANO Ambient Middleware (AAL)
SOPRANO Ambient Middleware (AAL)Andreas Schmidt
 
Us 13-opi-evading-deep-inspection-for-fun-and-shell-wp
Us 13-opi-evading-deep-inspection-for-fun-and-shell-wpUs 13-opi-evading-deep-inspection-for-fun-and-shell-wp
Us 13-opi-evading-deep-inspection-for-fun-and-shell-wpOlli-Pekka Niemi
 
[Usenix's WOOT'14] Attacking the Linux PRNG and Android - Weaknesses in Seedi...
[Usenix's WOOT'14] Attacking the Linux PRNG and Android - Weaknesses in Seedi...[Usenix's WOOT'14] Attacking the Linux PRNG and Android - Weaknesses in Seedi...
[Usenix's WOOT'14] Attacking the Linux PRNG and Android - Weaknesses in Seedi...srkedmi
 
L018118083.new ramya publication (1)
L018118083.new ramya publication (1)L018118083.new ramya publication (1)
L018118083.new ramya publication (1)IOSR Journals
 
Dismantling intrusion prevention_systems
Dismantling intrusion prevention_systemsDismantling intrusion prevention_systems
Dismantling intrusion prevention_systemsOlli-Pekka Niemi
 

Similar to Introduction into Fault-tolerant Distributed Algorithms and their Modeling (Part 2) (20)

Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...
Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...
Introduction into Fault-tolerant Distributed Algorithms and their Modeling (P...
 
Crash Analysis with Reverse Taint
Crash Analysis with Reverse TaintCrash Analysis with Reverse Taint
Crash Analysis with Reverse Taint
 
DEFCON 21: EDS: Exploitation Detection System WP
DEFCON 21: EDS: Exploitation Detection System WPDEFCON 21: EDS: Exploitation Detection System WP
DEFCON 21: EDS: Exploitation Detection System WP
 
Joe armstrong erlanga_languageforprogrammingreliablesystems
Joe armstrong erlanga_languageforprogrammingreliablesystemsJoe armstrong erlanga_languageforprogrammingreliablesystems
Joe armstrong erlanga_languageforprogrammingreliablesystems
 
Fuzzing101 uvm-reporting-and-mitigation-2011-02-10
Fuzzing101 uvm-reporting-and-mitigation-2011-02-10Fuzzing101 uvm-reporting-and-mitigation-2011-02-10
Fuzzing101 uvm-reporting-and-mitigation-2011-02-10
 
SIP Flooding Attack Detection Using Hybrid Detection Algorithm
SIP Flooding Attack Detection Using Hybrid Detection AlgorithmSIP Flooding Attack Detection Using Hybrid Detection Algorithm
SIP Flooding Attack Detection Using Hybrid Detection Algorithm
 
Fuzz nt
Fuzz ntFuzz nt
Fuzz nt
 
Fuzz
FuzzFuzz
Fuzz
 
Keynote joearmstrong
Keynote joearmstrongKeynote joearmstrong
Keynote joearmstrong
 
An Architectural Concept for Intrusion Tolerance in Air Traffic Networks
An Architectural Concept for Intrusion Tolerance in Air Traffic NetworksAn Architectural Concept for Intrusion Tolerance in Air Traffic Networks
An Architectural Concept for Intrusion Tolerance in Air Traffic Networks
 
A Smart Fuzzing Approach for Integer Overflow Detection
A Smart Fuzzing Approach for Integer Overflow DetectionA Smart Fuzzing Approach for Integer Overflow Detection
A Smart Fuzzing Approach for Integer Overflow Detection
 
VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...
VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...
VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...
 
A formal definition of computer worms and some related results
A formal definition of computer worms and some related resultsA formal definition of computer worms and some related results
A formal definition of computer worms and some related results
 
Sioux Hot-or-Not: Functional programming: unlocking the real power of multi-c...
Sioux Hot-or-Not: Functional programming: unlocking the real power of multi-c...Sioux Hot-or-Not: Functional programming: unlocking the real power of multi-c...
Sioux Hot-or-Not: Functional programming: unlocking the real power of multi-c...
 
SOPRANO Ambient Middleware (AAL)
SOPRANO Ambient Middleware (AAL)SOPRANO Ambient Middleware (AAL)
SOPRANO Ambient Middleware (AAL)
 
Software testing
Software testingSoftware testing
Software testing
 
Us 13-opi-evading-deep-inspection-for-fun-and-shell-wp
Us 13-opi-evading-deep-inspection-for-fun-and-shell-wpUs 13-opi-evading-deep-inspection-for-fun-and-shell-wp
Us 13-opi-evading-deep-inspection-for-fun-and-shell-wp
 
[Usenix's WOOT'14] Attacking the Linux PRNG and Android - Weaknesses in Seedi...
[Usenix's WOOT'14] Attacking the Linux PRNG and Android - Weaknesses in Seedi...[Usenix's WOOT'14] Attacking the Linux PRNG and Android - Weaknesses in Seedi...
[Usenix's WOOT'14] Attacking the Linux PRNG and Android - Weaknesses in Seedi...
 
L018118083.new ramya publication (1)
L018118083.new ramya publication (1)L018118083.new ramya publication (1)
L018118083.new ramya publication (1)
 
Dismantling intrusion prevention_systems
Dismantling intrusion prevention_systemsDismantling intrusion prevention_systems
Dismantling intrusion prevention_systems
 

More from Iosif Itkin

Foundations of Software Testing Lecture 4
Foundations of Software Testing Lecture 4Foundations of Software Testing Lecture 4
Foundations of Software Testing Lecture 4Iosif Itkin
 
QA Financial Forum London 2021 - Automation in Software Testing. Humans and C...
QA Financial Forum London 2021 - Automation in Software Testing. Humans and C...QA Financial Forum London 2021 - Automation in Software Testing. Humans and C...
QA Financial Forum London 2021 - Automation in Software Testing. Humans and C...Iosif Itkin
 
Exactpro FinTech Webinar - Global Exchanges Test Oracles
Exactpro FinTech Webinar - Global Exchanges Test OraclesExactpro FinTech Webinar - Global Exchanges Test Oracles
Exactpro FinTech Webinar - Global Exchanges Test OraclesIosif Itkin
 
Exactpro FinTech Webinar - Global Exchanges FIX Protocol
Exactpro FinTech Webinar - Global Exchanges FIX ProtocolExactpro FinTech Webinar - Global Exchanges FIX Protocol
Exactpro FinTech Webinar - Global Exchanges FIX ProtocolIosif Itkin
 
Operational Resilience in Financial Market Infrastructures
Operational Resilience in Financial Market InfrastructuresOperational Resilience in Financial Market Infrastructures
Operational Resilience in Financial Market InfrastructuresIosif Itkin
 
20 Simple Questions from Exactpro for Your Enjoyment This Holiday Season
20 Simple Questions from Exactpro for Your Enjoyment This Holiday Season20 Simple Questions from Exactpro for Your Enjoyment This Holiday Season
20 Simple Questions from Exactpro for Your Enjoyment This Holiday SeasonIosif Itkin
 
Testing the Intelligence of your AI
Testing the Intelligence of your AITesting the Intelligence of your AI
Testing the Intelligence of your AIIosif Itkin
 
EXTENT 2019: Exactpro Quality Assurance for Financial Market Infrastructures
EXTENT 2019: Exactpro Quality Assurance for Financial Market InfrastructuresEXTENT 2019: Exactpro Quality Assurance for Financial Market Infrastructures
EXTENT 2019: Exactpro Quality Assurance for Financial Market InfrastructuresIosif Itkin
 
ClearTH Test Automation Framework: Case Study in IRS & CDS Swaps Lifecycle Mo...
ClearTH Test Automation Framework: Case Study in IRS & CDS Swaps Lifecycle Mo...ClearTH Test Automation Framework: Case Study in IRS & CDS Swaps Lifecycle Mo...
ClearTH Test Automation Framework: Case Study in IRS & CDS Swaps Lifecycle Mo...Iosif Itkin
 
EXTENT Talks 2019 Tbilisi: Failover and Recovery Test Automation - Ivan Shamrai
EXTENT Talks 2019 Tbilisi: Failover and Recovery Test Automation - Ivan ShamraiEXTENT Talks 2019 Tbilisi: Failover and Recovery Test Automation - Ivan Shamrai
EXTENT Talks 2019 Tbilisi: Failover and Recovery Test Automation - Ivan ShamraiIosif Itkin
 
EXTENT Talks QA Community Tbilisi 20 April 2019 - Conference Open
EXTENT Talks QA Community Tbilisi 20 April 2019 - Conference OpenEXTENT Talks QA Community Tbilisi 20 April 2019 - Conference Open
EXTENT Talks QA Community Tbilisi 20 April 2019 - Conference OpenIosif Itkin
 
User-Assisted Log Analysis for Quality Control of Distributed Fintech Applica...
User-Assisted Log Analysis for Quality Control of Distributed Fintech Applica...User-Assisted Log Analysis for Quality Control of Distributed Fintech Applica...
User-Assisted Log Analysis for Quality Control of Distributed Fintech Applica...Iosif Itkin
 
QAFF Chicago 2019 - Complex Post-Trade Systems, Requirements Traceability and...
QAFF Chicago 2019 - Complex Post-Trade Systems, Requirements Traceability and...QAFF Chicago 2019 - Complex Post-Trade Systems, Requirements Traceability and...
QAFF Chicago 2019 - Complex Post-Trade Systems, Requirements Traceability and...Iosif Itkin
 
QA Community Saratov: Past, Present, Future (2019-02-08)
QA Community Saratov: Past, Present, Future (2019-02-08)QA Community Saratov: Past, Present, Future (2019-02-08)
QA Community Saratov: Past, Present, Future (2019-02-08)Iosif Itkin
 
Machine Learning and RoboCop Testing
Machine Learning and RoboCop TestingMachine Learning and RoboCop Testing
Machine Learning and RoboCop TestingIosif Itkin
 
Behaviour Driven Development: Oltre i limiti del possibile
Behaviour Driven Development: Oltre i limiti del possibileBehaviour Driven Development: Oltre i limiti del possibile
Behaviour Driven Development: Oltre i limiti del possibileIosif Itkin
 
2018 - Exactpro Year in Review
2018 - Exactpro Year in Review2018 - Exactpro Year in Review
2018 - Exactpro Year in ReviewIosif Itkin
 
Exactpro Discussion about Joy and Strategy
Exactpro Discussion about Joy and StrategyExactpro Discussion about Joy and Strategy
Exactpro Discussion about Joy and StrategyIosif Itkin
 
FIX EMEA Conference 2018 - Post Trade Software Testing Challenges
FIX EMEA Conference 2018 - Post Trade Software Testing ChallengesFIX EMEA Conference 2018 - Post Trade Software Testing Challenges
FIX EMEA Conference 2018 - Post Trade Software Testing ChallengesIosif Itkin
 
BDD. The Outer Limits. Iosif Itkin at Youcon (in Russian)
BDD. The Outer Limits. Iosif Itkin at Youcon (in Russian)BDD. The Outer Limits. Iosif Itkin at Youcon (in Russian)
BDD. The Outer Limits. Iosif Itkin at Youcon (in Russian)Iosif Itkin
 

More from Iosif Itkin (20)

Foundations of Software Testing Lecture 4
Foundations of Software Testing Lecture 4Foundations of Software Testing Lecture 4
Foundations of Software Testing Lecture 4
 
QA Financial Forum London 2021 - Automation in Software Testing. Humans and C...
QA Financial Forum London 2021 - Automation in Software Testing. Humans and C...QA Financial Forum London 2021 - Automation in Software Testing. Humans and C...
QA Financial Forum London 2021 - Automation in Software Testing. Humans and C...
 
Exactpro FinTech Webinar - Global Exchanges Test Oracles
Exactpro FinTech Webinar - Global Exchanges Test OraclesExactpro FinTech Webinar - Global Exchanges Test Oracles
Exactpro FinTech Webinar - Global Exchanges Test Oracles
 
Exactpro FinTech Webinar - Global Exchanges FIX Protocol
Exactpro FinTech Webinar - Global Exchanges FIX ProtocolExactpro FinTech Webinar - Global Exchanges FIX Protocol
Exactpro FinTech Webinar - Global Exchanges FIX Protocol
 
Operational Resilience in Financial Market Infrastructures
Operational Resilience in Financial Market InfrastructuresOperational Resilience in Financial Market Infrastructures
Operational Resilience in Financial Market Infrastructures
 
20 Simple Questions from Exactpro for Your Enjoyment This Holiday Season
20 Simple Questions from Exactpro for Your Enjoyment This Holiday Season20 Simple Questions from Exactpro for Your Enjoyment This Holiday Season
20 Simple Questions from Exactpro for Your Enjoyment This Holiday Season
 
Testing the Intelligence of your AI
Testing the Intelligence of your AITesting the Intelligence of your AI
Testing the Intelligence of your AI
 
EXTENT 2019: Exactpro Quality Assurance for Financial Market Infrastructures
EXTENT 2019: Exactpro Quality Assurance for Financial Market InfrastructuresEXTENT 2019: Exactpro Quality Assurance for Financial Market Infrastructures
EXTENT 2019: Exactpro Quality Assurance for Financial Market Infrastructures
 
ClearTH Test Automation Framework: Case Study in IRS & CDS Swaps Lifecycle Mo...
ClearTH Test Automation Framework: Case Study in IRS & CDS Swaps Lifecycle Mo...ClearTH Test Automation Framework: Case Study in IRS & CDS Swaps Lifecycle Mo...
ClearTH Test Automation Framework: Case Study in IRS & CDS Swaps Lifecycle Mo...
 
EXTENT Talks 2019 Tbilisi: Failover and Recovery Test Automation - Ivan Shamrai
EXTENT Talks 2019 Tbilisi: Failover and Recovery Test Automation - Ivan ShamraiEXTENT Talks 2019 Tbilisi: Failover and Recovery Test Automation - Ivan Shamrai
EXTENT Talks 2019 Tbilisi: Failover and Recovery Test Automation - Ivan Shamrai
 
EXTENT Talks QA Community Tbilisi 20 April 2019 - Conference Open
EXTENT Talks QA Community Tbilisi 20 April 2019 - Conference OpenEXTENT Talks QA Community Tbilisi 20 April 2019 - Conference Open
EXTENT Talks QA Community Tbilisi 20 April 2019 - Conference Open
 
User-Assisted Log Analysis for Quality Control of Distributed Fintech Applica...
User-Assisted Log Analysis for Quality Control of Distributed Fintech Applica...User-Assisted Log Analysis for Quality Control of Distributed Fintech Applica...
User-Assisted Log Analysis for Quality Control of Distributed Fintech Applica...
 
QAFF Chicago 2019 - Complex Post-Trade Systems, Requirements Traceability and...
QAFF Chicago 2019 - Complex Post-Trade Systems, Requirements Traceability and...QAFF Chicago 2019 - Complex Post-Trade Systems, Requirements Traceability and...
QAFF Chicago 2019 - Complex Post-Trade Systems, Requirements Traceability and...
 
QA Community Saratov: Past, Present, Future (2019-02-08)
QA Community Saratov: Past, Present, Future (2019-02-08)QA Community Saratov: Past, Present, Future (2019-02-08)
QA Community Saratov: Past, Present, Future (2019-02-08)
 
Machine Learning and RoboCop Testing
Machine Learning and RoboCop TestingMachine Learning and RoboCop Testing
Machine Learning and RoboCop Testing
 
Behaviour Driven Development: Oltre i limiti del possibile
Behaviour Driven Development: Oltre i limiti del possibileBehaviour Driven Development: Oltre i limiti del possibile
Behaviour Driven Development: Oltre i limiti del possibile
 
2018 - Exactpro Year in Review
2018 - Exactpro Year in Review2018 - Exactpro Year in Review
2018 - Exactpro Year in Review
 
Exactpro Discussion about Joy and Strategy
Exactpro Discussion about Joy and StrategyExactpro Discussion about Joy and Strategy
Exactpro Discussion about Joy and Strategy
 
FIX EMEA Conference 2018 - Post Trade Software Testing Challenges
FIX EMEA Conference 2018 - Post Trade Software Testing ChallengesFIX EMEA Conference 2018 - Post Trade Software Testing Challenges
FIX EMEA Conference 2018 - Post Trade Software Testing Challenges
 
BDD. The Outer Limits. Iosif Itkin at Youcon (in Russian)
BDD. The Outer Limits. Iosif Itkin at Youcon (in Russian)BDD. The Outer Limits. Iosif Itkin at Youcon (in Russian)
BDD. The Outer Limits. Iosif Itkin at Youcon (in Russian)
 

Recently uploaded

《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》rnrncn29
 
User Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationUser Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationColumbia Weather Systems
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentationtahreemzahra82
 
Citronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyayCitronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyayupadhyaymani499
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naJASISJULIANOELYNV
 
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdfPests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdfPirithiRaju
 
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 GenuineCall Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuinethapagita
 
Microteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringMicroteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringPrajakta Shinde
 
Good agricultural practices 3rd year bpharm. herbal drug technology .pptx
Good agricultural practices 3rd year bpharm. herbal drug technology .pptxGood agricultural practices 3rd year bpharm. herbal drug technology .pptx
Good agricultural practices 3rd year bpharm. herbal drug technology .pptxSimeonChristian
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxpriyankatabhane
 
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRCall Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRlizamodels9
 
Transposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptTransposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptArshadWarsi13
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...lizamodels9
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxmalonesandreagweneth
 
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)riyaescorts54
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024AyushiRastogi48
 
Functional group interconversions(oxidation reduction)
Functional group interconversions(oxidation reduction)Functional group interconversions(oxidation reduction)
Functional group interconversions(oxidation reduction)itwameryclare
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxFarihaAbdulRasheed
 

Recently uploaded (20)

《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
 
User Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather StationUser Guide: Capricorn FLX™ Weather Station
User Guide: Capricorn FLX™ Weather Station
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentation
 
Citronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyayCitronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyay
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by na
 
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdfPests of Blackgram, greengram, cowpea_Dr.UPR.pdf
Pests of Blackgram, greengram, cowpea_Dr.UPR.pdf
 
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 GenuineCall Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
 
Microteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringMicroteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical Engineering
 
Good agricultural practices 3rd year bpharm. herbal drug technology .pptx
Good agricultural practices 3rd year bpharm. herbal drug technology .pptxGood agricultural practices 3rd year bpharm. herbal drug technology .pptx
Good agricultural practices 3rd year bpharm. herbal drug technology .pptx
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptx
 
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRCall Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
 
Transposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptTransposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.ppt
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
 
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024
 
Functional group interconversions(oxidation reduction)
Functional group interconversions(oxidation reduction)Functional group interconversions(oxidation reduction)
Functional group interconversions(oxidation reduction)
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
 
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort ServiceHot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
 

Introduction into Fault-tolerant Distributed Algorithms and their Modeling (Part 2)

  • 1. Model Checking of Fault-Tolerant Distributed Algorithms Part II: Modeling Fault-tolerant Distributed Algorithms Annu Gmeiner Igor Konnov Ulrich Schmid Helmut Veith Josef Widder TMPA 2014, Kostroma, Russia Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 1 / 64
  • 2. Why Modeling and Verification? Let’s have a look at the recent technical report by amazon.com 1: We have found that the standard verification techniques in industry are necessary but not sufficient. We use deep design reviews, code reviews, static code analysis, stress testing, fault-injection testing [. . . ], but we still find that subtle bugs can hide in complex concurrent fault-tolerant systems. . . . . .We have found that testing the code is inadequate as a method to find subtle errors in design, as the number of reachable states of the code is astronomical. 1C. Newcombe et al. Use of Formal Methods at Amazon Web Services, 2014 Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 2 / 64
  • 3. Why Modeling and Verification? (cont.) . . . the final executable code is unambiguous, but contains an overwhelming amount of detail. We needed to be able to capture the essence of a design in a few hundred lines of precise description . Engineers naturally focus on designing the “happy case” for a system, i.e. the processing path in which no errors occur. . . . the shortest error trace exhibiting the bug contained 35 high level steps. The improbability of such compound events is not a defense against such bugs; historically, AWS has observed many combinations of events at least as complicated as those that could trigger this bug. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 3 / 64
  • 4. What are we doing? The approach we have: A high-level description of a design, which is precise. Sound verification method, as complete as possible. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 4 / 64
  • 5. What are we doing? The approach we have: A high-level description of a design, which is precise. Sound verification method, as complete as possible. More specifically: modeling approach suitable for model checking automatic parameterized verification method targeting at fault-tolerant distributed algorithms (We are not interested in verifying mathematical toy examples) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 4 / 64
  • 6. Why Model Checking? an alternative proof approach useful counter-examples ability to define and vary assumptions about the system and see why it breaks closer to code level good degree of automation Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 5 / 64
  • 7. Distributed Algorithms: Model Checking Challenges unbounded data types unbounded number of rounds (round numbers part of messages) parameterization in multiple parameters among n processes f t are faulty with n 3t contrast to concurrent programs diverse fault models (adverse environments) continuous time fault-tolerant clock synchronization degrees of concurrency: synchronous, asynchronous partially synchronous a process makes at most 5 steps between 2 steps of any other process Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 6 / 64
  • 8. Fault-tolerant distributed algorithms n n processes communicate by messages all processes know that at most t of them might be faulty f are actually faulty Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 7 / 64
  • 9. Fault-tolerant distributed algorithms n ? ? ? t n processes communicate by messages all processes know that at most t of them might be faulty f are actually faulty Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 7 / 64
  • 10. Fault-tolerant distributed algorithms n ? ? ? t f n processes communicate by messages all processes know that at most t of them might be faulty f are actually faulty Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 7 / 64
  • 11. Challenge #1: fault models clean crashes: least severe faulty processes prematurely halt after/before “send to all” crash faults: faulty processes prematurely halt (also) in the middle of “send to all” omission faults: faulty processes follow the algorithm, but some messages sent by them might be lost symmetric faults: faulty processes send arbitrarily to all or nobody Byzantine faults: most severe faulty processes can do anything encompass all behaviors of above models Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 8 / 64
  • 12. Challenges #2 #3: Pseudo-code and Communication Translate pseudo-code to a formal description that allows us to verify the algorithm and does not oversimplify the original algorithm. Assumptions about the communication medium are usually written in plain English, spread across research papers, constitute folklore knowledge. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 9 / 64
  • 13. Asynchronous Reliable Broadcast (Srikanth Toueg, 87) The core of the classic broadcast algorithm from the DA literature. It solves an agreement problem depending on the inputs vi . Variables of process i vi : f0 , 1g i n i t i a l l y 0 or 1 accepti : f0 , 1g i n i t i a l l y 0 An atomic step: i f vi = 1 then send ( echo ) to all ; i f received (echo) from at l e a s t t + 1 distinct proces ses and not sent ( echo ) before then send ( echo ) to all ; i f received ( echo ) from at l e a s t n - t distinct proces ses then accepti := 1 ; Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 10 / 64
  • 14. Asynchronous Reliable Broadcast (Srikanth Toueg, 87) The core of the classic broadcast algorithm from the DA literature. It solves an agreement problem depending on the inputs vi . Variables of process i vi : f0 , 1g i n i t i a l l y 0 or 1 accepti : f0 , 1g i n i t i a l l y 0 An atomic step: i f vi = 1 then send ( echo ) to all ; i f received (echo) from at l e a s t t + 1 distinct proces ses and not sent ( echo ) before then send ( echo ) to all ; i f received ( echo ) from at l e a s t n - t distinct proces ses then accepti := 1 ; asynchronous t Byzantine faults correct if n 3t the code is parameterized in n and t ) process template P(n; t; f ) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 10 / 64
  • 15. Typical Structure of a Computation Step receive messages compute using messages and local variables (description in English with basic control flow if-then-else) send messages atomic Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 11 / 64
  • 16. Typical Structure of a Computation Step receive messages compute using messages and local variables (description in English with basic control flow if-then-else) send messages atomic implicit pseudo-code Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 11 / 64
  • 17. Challenge #4: Parameterized Model Checking Parameterized model checking problem: given a process template P(n; t; f ), resilience condition RC : n 3t ^ t f 0, fairness constraints , e.g., “all messages will be delivered” and an LTL-X formula ' show for all n, t, and f satisfying RC (P(n; t; f ))nf + f faults j= ( ! ') n ? ? ? t n ? ? ? t f Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 12 / 64
  • 18. Challenge #5: Liveness in Distributed Algorithms Interplay of safety and liveness is a central challenge in DAs achieving safety and liveness is non-trivial asynchrony and faults lead to impossibility results (recall first part of lecture (Fischer et al., 1985)) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 13 / 64
  • 19. Challenge #5: Liveness in Distributed Algorithms Interplay of safety and liveness is a central challenge in DAs achieving safety and liveness is non-trivial asynchrony and faults lead to impossibility results (recall first part of lecture (Fischer et al., 1985)) Rich literature to verify safety (e.g. in concurrent systems) Distributed algorithms perspective: “doing nothing is always safe” “tools verify algorithms that actually might do nothing” Verification efforts often have to simplify assumptions Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 13 / 64
  • 20. Summary We have to model: faults, communication medium captured in English, algorithms written in pseudo-code. and check: safety and liveness of parameterized systems with unbounded integers, non-standard fairness constraints, Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 14 / 64
  • 21. Existing formalization frameworks Design Specification TLA+/PlusCal Concurrent Alg. Proving/ TLC (Timed) IOA Asynchronous DA Proving/ UPPAAL PVS Theorem Proving ? (Parameterized) Simulation Model Checking of FTDAs DISTAL PBFT Implementation Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 15 / 64
  • 22. Alternative frameworks TLA (temporal logic of actions): used to design (distributed) algorithms by refinement of the spec verification with proof assistants (low degree of automation) Encodings of DA in proof assistant PVS (e.g., by Rushby): ad-hoc encoding found a bug in a published synchronous Byzantine Agreement algorithm (Lincoln Rushby, 1993) I/O-Automata: originally designed to write clearer hand-written proofs limited tool support, e.g., Veromodo toolset is still in beta suitable only for asynchronous distributed algorithms Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 16 / 64
  • 23. Alternative frameworks TLA (temporal logic of actions): used to design (distributed) algorithms by refinement of the spec verification with proof assistants (low degree of automation) Encodings of DA in proof assistant PVS (e.g., by Rushby): ad-hoc encoding found a bug in a published synchronous Byzantine Agreement algorithm (Lincoln Rushby, 1993) I/O-Automata: originally designed to write clearer hand-written proofs limited tool support, e.g., Veromodo toolset is still in beta suitable only for asynchronous distributed algorithms proof assistants are very general, but with low automation degree “everything is possible, but nothing is easy” Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 16 / 64
  • 24. Simulation and Implementation Distal: Domain-specific language (Biely et al., 2013) Simulation and evaluate performance of fault-tolerant algorithms Practical Byzantine Fault-Tolerance (Castro et al., 1999) and other practical algorithms: Implementation with optimizations Precise semantics is unclear The system is partially synchronous: non-divergent message delays are assumed Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 17 / 64
  • 25. In this part We introduce efficient encoding in PROMELA. Verify safety and liveness of fault-tolerant algorithms (fixed parameters). Find counterexamples for parameters known from the literature. This proves adequacy of our modeling. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 18 / 64
  • 26. Preliminaries: Promela Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 19 / 64
  • 27. Promela PROMELA PROcess MEta LAnguage SPIN Simple Promela INterpreter (not that simple any more) Here we give a short introduction and cover only the features important to our work. Detailed documentation, tutorials, and books on: http://spinroot.com Gerard Holzmann Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 20 / 64
  • 28. Top-level: global variables and processes / g l o b a l d e c l a r a t i o n s v i s i b l e t o a l l p r o c e s s e s / int x; / a g l o b a l i n t e g e r ( a s in C) / mtype = { X, Y }; / c o n s t a n t me s s age t y p e s / / a FIFO c h anne l wi th a t most 2 me s s a g e s o f t y p e mtype / chan c = [2] of { mtype }; active[2] proctype ProcA() { Two processes are created ... at the initial state } proctype ProcB() { Processes can be created ... later using: run ProcB() } init { A special process, use to run ProcB(); run ProcB(); create other processes } Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 21 / 64
  • 29. One process: Basics int x, y; active proctype ProcA() { int z; Declare a local variable z = x; Assignment x y; Block until the expression is evaluated to true true; one step to execute, no effect z++; skip; same as true } Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 22 / 64
  • 30. One process: Control flow int x, y; active proctype P() { main: if A guarded command :: x == 0 - x = 1; :: y == 0 - y = 1; non-deterministically selects an option :: x == 1 y == 1 whose first expression is not blocked. - x = 0; y = 0; fi; continues executing the rest of the option goto main; step-by-step. } Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 23 / 64
  • 31. One process: Control flow (cont.) int x = 0, y = 0; active proctype P() { main: if :: x == 0 - x = 1; :: y == 0 - y = 1; :: x == 1 y == 1 - x = 0; y = 0; fi; goto main; } Run 1 Run 2 Run 3 x=0,y=0 x=0,y=0 x=0,y=0 x=1,y=0 x=0,y=1 x=1,y=0 x=1,y=1 x=1,y=1 x=1,y=1 x=0,y=0 x=0,y=0 x=0,y=0 x=0,y=1 x=1,y=0 x=0,y=1 x=1,y=1 x=1,y=1 x=1,y=1 x=0,y=0 x=0,y=0 x=0,y=0 Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 24 / 64
  • 32. One process: Loops int x; active proctype P() { do a do..od loop :: x == 10 - x = 0; :: x == 10 - break; :: x 10 - x++; od; A: if basically the same. goto A introduces one more step :: x == 10 - x = 0; :: x == 10 - goto B; :: x 10 - x++; fi; goto A; B: } Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 25 / 64
  • 33. Many Processes: Interleavings Pure interleaving semantics Every statement is executed atomically int x = 0, y = 1; active[2] proctype A() { x = 1 - x; y = 1 - y; } A[0] A[1] The red path is an example execution where the steps of processes 0 and 1 alternate. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 26 / 64
  • 34. Many Processes: Atomics use atomic f ... g to make execution of a sequence indivisible. non-deterministic choice with if..fi is still allowed! int x = 0, y = 1; active[2] proctype A() { atomic { x = 1 - x; y = 1 - y; } } A[0] A[1] Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 27 / 64
  • 35. Many Processes: Atomics use atomic f ... g to make execution of a sequence indivisible. non-deterministic choice with if..fi is still allowed! int x = 0, y = 1; active[2] proctype A() { atomic { x = 1 - x; y = 1 - y; } } A[0] A[1] Larger atomic steps lead to less possible paths and states. Note: different atomicity degrees may lead to different verification results Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 27 / 64
  • 36. (Asynchronous) message passing mtype = { A, B }; chan chan1 = [1] of { mtype }; queue of size 1 chan chan2 = [1] of { mtype }; active proctype Ping() { chan1!A; insert A to “chan1” do :: chan2?B - chan1!A; od; when B is on the top of “chan2”, remove it and insert A to “chan1” } active proctype Pong() { do :: chan1?A - chan2!B; od; } Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 29 / 64
  • 37. Blocking receive mtype = { A, B }; chan chan1 = [1] of { mtype }; chan chan2 = [1] of { mtype }; active proctype Ping() { chan1!A; do :: chan2?B - deadlock! chan1!A; Ping sends A, Pong receives A, od; chan1?A is blocked } active proctype Pong() { do :: chan1?A - chan1?A; deadlock! chan2!B; od; } Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 30 / 64
  • 38. Blocking send mtype = { A, B }; chan chan1 = [1] of { mtype }; chan chan2 = [1] of { mtype }; active proctype Ping() { chan1!A; do :: chan2?B - When chan1=[A] and chan2=[B], the system deadlocks chan1!A; chan1!A; chan1!A; deadlock! chan1!A; The shortest counter-example has 10 steps od; } Use Spin to find it active proctype Pong() { do :: chan1?A - chan2!B; deadlock! od; } Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 31 / 64
  • 39. Promela vs. C PROMELA looks like C But it is not! Non-determinism in the if statements (internal non-determinism) Non-determinstic scheduler (external non-determinism) Atomic statements Message passing PROMELA is a modeling language Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 32 / 64
  • 40. Preliminaries: Kripke Structures Linear Temporal Logic (LTL) Control Flow Automata (CFA) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 33 / 64
  • 41. Kripke structures A Kripke structure is a M = (S;S0;R;AP; L), where: s0 : frg S is a set of states, S0 S is the set of initial states, R S S is a transition relation, AP is a set of atomic propositions, L : S ! 2AP is a state-labeling function. s4 : fgg s3 : fr ; y; gg s1 : fyg s2 : fyg Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 34 / 64
  • 42. Linear Temporal Logic An LTL formula is defined inductively w.r.t. atomic propositions AP: (base) p 2 AP is an LTL formula, if ' and are LTL formulas, then the following expressions are LTL formulas: Nexttime: X', Eventually: F ', Globally: G', Until: U'. Boolean combinations: ' ^ , ' _ , and :'. s0 s1 s2 s3 s4 s00 s01 s02 s04 s03 s00 0 s00 1 s00 2 s00 3 s00 4 s000 0 s000 1 s000 2 ' s000 3 s000 4 Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 35 / 64
  • 43. Recall: Typical Structure of a Computation Step receive messages compute using messages and local variables (description in English with basic control flow if-then-else) send messages atomic implicit pseudo-code Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 36 / 64
  • 44. CFA: Intermediate representation Intermediate representation of a loop body: a path from qI to qF encodes one iteration. Every variable is assigned at most once (SSA). active proctype P() { int x, y; do :: x == 0 - x = 1; :: x == 1 - x = 2; :: x == 2 - x = 0; :: x == 1 - x = 0; y = 1 - y; od; } qI q0 q1 q2 q3 q4 qF x = 0 x = 1 x = 1 x = 2 x0 = 1 x0 = 2 x0 = 0 x0 = 0 y0 = 1 y Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 37 / 64
  • 45. Example: from a CFA to a Kripke structure Kripke structure M(n; t; f ) = (S;S0;R;AP; L) of N(n; t; f ) processes For a path from qI to qF construct a formula (x; y; x0; y0) A state is a pair of x = (x1; : : : ; xN) 2 NN and y = (y1; : : : ; yN) 2 NN and the initial states are S0 = f(0; : : : ; 0)g ((x; y); (x0; y0)) 2 R iff there are process index k: 1 k N and path : [k moves]: (xk ; yk ; x0k ; y0k ) holds [others do not]: 8i 2 f1; : : : ;Ng n fkg: x0 i = xi , y0 i = yi . Propositions AP = f[9i: yi6= 0]; [8i: yi6= 0]g and a state (x; y) is labeled as: p 2 L((x; y)) iff (x; y) j= p. qI q0 q1 q2 q3 q4 qF x = 0 x = 1 x = 1 x = 2 x0 = 1 x0 = 2 x0 = 0 x0 = 0 y0 = 1 y Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 38 / 64
  • 46. Example: Properties of ST87 in LTL Unforgeability. If vi = 0 for all correct processes i, then for all correct processes j, acceptj remains 0 forever. nVf G i=1 vi = 0 ! G nVf j=1 acceptj = 0 Safety Completeness. If vi = 1 for all correct processes i, then there is a correct process j that eventually sets acceptj to 1. nVf G i=1 vi = 1 ! F nWf j=1 acceptj = 1 Liveness Relay. If a correct process i sets accepti to 1, then eventually all correct processes j set acceptj to 1. nWf G i=1 ! F accepti = 1 nVf j=1 acceptj = 1 Liveness Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 39 / 64
  • 47. Model Checking Problems Finite-state MC Input: a process template P, an LTL formula ' (including fairness), values of parameters n, t, and f . Problem: check, whether M(n; t; f ) j= '. Parameterized MC Input: a process template P, an LTL formula (including fairness) with atomic propositions of the form [9i:xi y] and [8i:xi y] Problem: check, whether 8n; t; f : n 3t ^ t f ^ f 0: M(n; t; f ) j= . Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 40 / 64
  • 48. Parameterized modeling Non-parameterized model checking as in SPIN’13: (John et al., 2013) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 41 / 64
  • 49. Modeling of threshold-based algorithms in Promela. . . We introduce efficient encoding of threshold-based fault-tolerant algorithms in PROMELA (with parametrization!) Verify safety and liveness of fault-tolerant algorithms (fixed parameters). Find counterexamples for parameters known from the literature. This proves adequacy of our modeling. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 42 / 64
  • 50. Modeling of threshold-based algorithms in Promela. . . We introduce efficient encoding of threshold-based fault-tolerant algorithms in PROMELA (with parametrization!) Verify safety and liveness of fault-tolerant algorithms (fixed parameters). Find counterexamples for parameters known from the literature. This proves adequacy of our modeling. For our method, we exploit specifics of FTDAs: 1 central feature of the algorithms (message counting); 2 specific message passing (we do not need to know who sent but how many of them sent messages); 3 the way faults affect messages (again, counting messages). Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 42 / 64
  • 51. Case Studies We consider a number of threshold-based algorithms. Our running example ST87 for 1 Byzantine faults (BYZ) 2 omission faults (OMIT) 3 symmetric faults (SYMM) 4 clean crashes (CLEAN). 5 Forklore reliable broadcast for clean crashes [Chandra Toueg 96, CT96] (to be continued) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 43 / 64
  • 52. Characteristics of the FTDA by Srikanth Toueg, 87 Variables of process i vi : f0 , 1g i n i t i a l l y 0 or 1 accepti : f0 , 1g i n i t i a l l y 0 An atomic step: i f vi = 1 then send ( echo ) to all ; i f received (echo) from at l e a s t t + 1 distinct proces ses and not sent ( echo ) before then send ( echo ) to all ; i f received ( echo ) from at l e a s t n - t distinct proces ses then accepti := 1 ; Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 44 / 64
  • 53. Characteristics of the FTDA by Srikanth Toueg, 87 Variables of process i vi : f0 , 1g i n i t i a l l y 0 or 1 accepti : f0 , 1g i n i t i a l l y 0 An atomic step: i f vi = 1 then send ( echo ) to all ; i f received (echo) from at l e a s t t + 1 distinct proces ses and not sent ( echo ) before then send ( echo ) to all ; i f received ( echo ) from at l e a s t n - t distinct proces ses then accepti := 1 ; the algorithm consists of threshold-guarded commands, only thresholds t + 1 and n t communication is by “send to all” how processes distinguish distinct senders is not part of the algorithm (i.e., algorithm description is high level) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 44 / 64
  • 54. Case Studies (cont.): Larger Algorithms more involved algorithms in the purely asynchronous setting: 6 Asynchronous Byzantine Agreement (Bracha Toueg 85, BT85) Byzantine faults two phases and two message types five status values properties: unforgeability, correctness (liveness), agreement (liveness) 7 Condition-based Consensus (Most´ efaoui et al. 01, MRRR01) crash faults two phases and four message types nine status variables properties: validity, agreement, termination (liveness) 8 Fast Byzantine Consensus: common case (Martin, Alvisi 06, MA06) Byzantine faults the core part of the algorithm no cryptography Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 45 / 64
  • 55. Experimental Results at Glance Algorithm Fault Parameters Resilience Properties Time 1. ST87 BYZ n = 7; t = 2; f = 2 n 3t U, C, R 6 sec. 1. ST87 BYZ n = 7; t = 3; f = 2 n 3t U, C, R 5 sec. 1. ST87 BYZ n = 7; t = 1; f = 2 n 3t U, C, R 1 sec. 2. ST87 OMIT n = 5; t = 2; f = 2 n 2t U, C, R 4 sec. 2. ST87 OMIT n = 5; t = 2; f = 3 n 2t U, C, R 5 sec. 3. ST87 SYMM n = 5; t = 1; fp = 1; fs = 0 n 2t U, C, R 1 sec. 3. ST87 SYMM n = 5; t = 2; fp = 3; fs = 1 n 2t U, C, R 1 sec. 4. ST87 CLEAN n = 3; t = 2; fc = 2; fnc = 0 n t U, C, R 1 sec. 5. CT96 CRASH n = 2 — U, C, R 1 sec. 6. BT85 BYZ n = 5; t = 1; f = 1 n 3t R 131 sec. 6. BT85 BYZ n = 5; t = 1; f = 2 n 3t R 1 sec. 6. BT85 BYZ n = 5; t = 2; f = 2 n 3t R 1 sec. 7. MRRR01 CRASH n = 3; t = 1; f = 1 n 2t V0, V1, A, T 1 sec. 7. MRRR01 CRASH n = 3; t = 1; f = 2 n 2t V0, V1, A, T 1 sec. 8. MA06 BYZ p = 4,a = 6,l = 4, t = 1,f = 1 p 3t, a 5t, l 3t CS1, CS3, CL1, CL2 3 hrs. 8. MA06 BYZ p = 4,a = 5,l = 4, t = 1, f = 1 p 3t, a 5t, l 3t CS1, CS3, CL1, CL2 14 min. 8. MA06 BYZ p = 4,a = 6,l = 4, t = 1, f = 2 p 3t, a 5t, l 3t CS1, CS3, CL1, CL2 2 sec. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 46 / 64
  • 56. Experimental Results at Glance Algorithm Fault Parameters Resilience Properties Time 1. ST87 BYZ n = 7; t = 2; f = 2 n 3t U, C, R 6 sec. 1. ST87 BYZ n = 7; t = 3; f = 2 n 3t U, C, R 5 sec. 1. ST87 BYZ n = 7; t = 1; f = 2 n 3t U, C, R 1 sec. 2. ST87 OMIT n = 5; t = 2; f = 2 n 2t U, C, R 4 sec. 2. ST87 OMIT n = 5; t = 2; f = 3 n 2t U, C, R 5 sec. 3. ST87 SYMM n = 5; t = 1; fp = 1; fs = 0 n 2t U, C, R 1 sec. 3. ST87 SYMM n = 5; t = 2; fp = 3; fs = 1 n 2t U, C, R 1 sec. 4. ST87 CLEAN n = 3; t = 2; fc = 2; fnc = 0 n t U, C, R 1 sec. 5. CT96 CRASH n = 2 — U, C, R 1 sec. 6. BT85 BYZ n = 5; t = 1; f = 1 n 3t R 131 sec. 6. BT85 BYZ n = 5; t = 1; f = 2 n 3t R 1 sec. 6. BT85 BYZ n = 5; t = 2; f = 2 n 3t R 1 sec. 7. MRRR01 CRASH n = 3; t = 1; f = 1 n 2t V0, V1, A, T 1 sec. 7. MRRR01 CRASH n = 3; t = 1; f = 2 n 2t V0, V1, A, T 1 sec. 8. MA06 BYZ p = 4,a = 6,l = 4, t = 1,f = 1 p 3t, a 5t, l 3t CS1, CS3, CL1, CL2 3 hrs. 8. MA06 BYZ p = 4,a = 5,l = 4, t = 1, f = 1 p 3t, a 5t, l 3t CS1, CS3, CL1, CL2 14 min. 8. MA06 BYZ p = 4,a = 6,l = 4, t = 1, f = 2 p 3t, a 5t, l 3t CS1, CS3, CL1, CL2 2 sec. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 46 / 64
  • 57. Experimental Results at Glance Algorithm Fault Parameters Resilience Properties 1. ST87 BYZ n = 7; t = 2; f = 2 n 3t U, C, R 1. ST87 BYZ n = 7; t = 3; f = 2 n 3t U, C, R 1. ST87 BYZ n = 7; t = 1; f = 2 n 3t U, C, R 2. ST87 OMIT n = 5; t = 2; f = 2 n 2t U, C, R 2. ST87 OMIT n = 5; t = 2; f = 3 n 2t U, C, R 3. ST87 SYMM n = 5; t = 1; fp = 1; fs = 0 n 2t U, C, R 3. ST87 SYMM n = 5; t = 2; fp = 3; fs = 1 n 2t U, C, R 4. ST87 CLEAN n = 3; t = 2; fc = 2; fnc = 0 n t U, C, R 5. CT96 CRASH n = 2 — U, C, R Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 46 / 64
  • 58. Experimental Results at Glance Algorithm Fault Parameters Resilience Properties Time 1. ST87 BYZ n = 7; t = 2; f = 2 n 3t U, C, R 6 sec. 1. ST87 BYZ n = 7; t = 3; f = 2 n 3t U, C, R 5 sec. 1. ST87 BYZ n = 7; t = 1; f = 2 n 3t U, C, R 1 sec. 2. ST87 OMIT n = 5; t = 2; f = 2 n 2t U, C, R 4 sec. 2. ST87 OMIT n = 5; t = 2; f = 3 n 2t U, C, R 5 sec. 3. ST87 SYMM n = 5; t = 1; fp = 1; fs = 0 n 2t U, C, R 1 sec. 3. ST87 SYMM n = 5; t = 2; fp = 3; fs = 1 n 2t U, C, R 1 sec. 4. ST87 CLEAN n = 3; t = 2; fc = 2; fnc = 0 n t U, C, R 1 sec. 5. CT96 CRASH n = 2 — U, C, R 1 sec. 6. BT85 BYZ n = 5; t = 1; f = 1 n 3t R 131 sec. 6. BT85 BYZ n = 5; t = 1; f = 2 n 3t R 1 sec. 6. BT85 BYZ n = 5; t = 2; f = 2 n 3t R 1 sec. 7. MRRR01 CRASH n = 3; t = 1; f = 1 n 2t V0, V1, A, T 1 sec. 7. MRRR01 CRASH n = 3; t = 1; f = 2 n 2t V0, V1, A, T 1 sec. 8. MA06 BYZ p = 4,a = 6,l = 4, t = 1,f = 1 p 3t, a 5t, l 3t CS1, CS3, CL1, CL2 3 hrs. 8. MA06 BYZ p = 4,a = 5,l = 4, t = 1, f = 1 p 3t, a 5t, l 3t CS1, CS3, CL1, CL2 14 min. 8. MA06 BYZ p = 4,a = 6,l = 4, t = 1, f = 2 p 3t, a 5t, l 3t CS1, CS3, CL1, CL2 2 sec. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 46 / 64
  • 59. Experimental Results at Glance 6. BT85 BYZ n = 5; t = 1; f = 1 n 3t R 6. BT85 BYZ n = 5; t = 1; f = 2 n 3t R 6. BT85 BYZ n = 5; t = 2; f = 2 n 3t R 7. MRRR01 CRASH n = 3; t = 1; f = 1 n 2t V0, V1, A, T 7. MRRR01 CRASH n = 3; t = 1; f = 2 n 2t V0, V1, A, T 8. MA06 BYZ p = 4,a = 6,l = 4, t = 1,f = 1 p 3t, a 5t, l 3t CS1, CS3, CL1, CL2 8. MA06 BYZ p = 4,a = 5,l = 4, t = 1, f = 1 p 3t, a 5t, l 3t CS1, CS3, CL1, CL2 8. MA06 BYZ p = 4,a = 6,l = 4, t = 1, f = 2 p 3t, a 5t, l 3t CS1, CS3, CL1, CL2 Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 46 / 64
  • 60. Experimental Results at Glance Algorithm Fault Parameters Resilience Properties Time 1. ST87 BYZ n = 7; t = 2; f = 2 n 3t U, C, R 6 sec. 1. ST87 BYZ n = 7; t = 3; f = 2 n 3t U, C, R 5 sec. 1. ST87 BYZ n = 7; t = 1; f = 2 n 3t U, C, R 1 sec. 2. ST87 OMIT n = 5; t = 2; f = 2 n 2t U, C, R 4 sec. 2. ST87 OMIT n = 5; t = 2; f = 3 n 2t U, C, R 5 sec. 3. ST87 SYMM n = 5; t = 1; fp = 1; fs = 0 n 2t U, C, R 1 sec. 3. ST87 SYMM n = 5; t = 2; fp = 3; fs = 1 n 2t U, C, R 1 sec. 4. ST87 CLEAN n = 3; t = 2; fc = 2; fnc = 0 n t U, C, R 1 sec. 5. CT96 CRASH n = 2 — U, C, R 1 sec. 6. BT85 BYZ n = 5; t = 1; f = 1 n 3t R 131 sec. 6. BT85 BYZ n = 5; t = 1; f = 2 n 3t R 1 sec. 6. BT85 BYZ n = 5; t = 2; f = 2 n 3t R 1 sec. 7. MRRR01 CRASH n = 3; t = 1; f = 1 n 2t V0, V1, A, T 1 sec. 7. MRRR01 CRASH n = 3; t = 1; f = 2 n 2t V0, V1, A, T 1 sec. 8. MA06 BYZ p = 4,a = 6,l = 4, t = 1,f = 1 p 3t, a 5t, l 3t CS1, CS3, CL1, CL2 3 hrs. 8. MA06 BYZ p = 4,a = 5,l = 4, t = 1, f = 1 p 3t, a 5t, l 3t CS1, CS3, CL1, CL2 14 min. 8. MA06 BYZ p = 4,a = 6,l = 4, t = 1, f = 2 p 3t, a 5t, l 3t CS1, CS3, CL1, CL2 2 sec. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 46 / 64
  • 61. Experimental Results: on ST87, the Byzantine Case Time (sec, logscale) two faults: f = 2 no faults: f = 0 Memory (MB, logscale) no faults: f = 2 The more faults we have, the easier the problem is: Two faults: we can check the systems of up to nine processes No faults: we can check the systems of up to seven processes Precision of modeling: we found counter-examples for the corner cases n = 3t and f t, where the resilience condition is violated. (June 2013: somebody wrote on Wikipedia that n = 3t should work :-) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 47 / 64
  • 62. Discussion of the specifications Unforgeability. If vi = 0 for all correct processes i, then for all correct processes j, acceptj remains 0 forever. nVf G i=1 vi = 0 ! G nVf j=1 acceptj = 0 The specification of Byzantine FTDAs have the following features: Only the states of correct processes are evaluated. Faulty processes may be Byzantine. (no assumption on behavior) Specifications do not talk about individual processes. Only global safety and progress are important. Indexed temporal logic is not required! Quantification over processes is on the level of atomic propositions. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 48 / 64
  • 63. Threshold-Guarded Distributed Algorithms Standard construct: quantified guards (t=f=0) Existential Guard if received m from some process then ... Universal Guard if received m from all processes then ... Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 49 / 64
  • 64. Threshold-Guarded Distributed Algorithms Standard construct: quantified guards (t=f=0) Existential Guard if received m from some process then ... Universal Guard if received m from all processes then ... what if faults might occur? Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 49 / 64
  • 65. Threshold-Guarded Distributed Algorithms Standard construct: quantified guards (t=f=0) Existential Guard if received m from some process then ... Universal Guard if received m from all processes then ... what if faults might occur? Fault-Tolerant Algorithms: n processes, at most t are Byzantine Threshold Guard if received m from n t processes then ... (the processes cannot refer to f!) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 49 / 64
  • 66. Counting Argument in Threshold-Guarded Algorithms n t f t + 1 Correct processes count incoming messages from distinct processes Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 50 / 64
  • 67. Counting Argument in Threshold-Guarded Algorithms n t f t + 1 Correct processes count incoming messages from distinct processes Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 50 / 64
  • 68. Counting Argument in Threshold-Guarded Algorithms n t f t + 1 at least one non-faulty sent the message Correct processes count incoming messages from distinct processes Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 50 / 64
  • 69. Modeling threshold-based algorithms in Promela As the distributed algorithms are given in pseudo-code, we have to decide on how to encode in PROMELA: send to all and receive counting expressions “received m from n t distinct processes” faults Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 51 / 64
  • 70. Modeling threshold-based algorithms in Promela As the distributed algorithms are given in pseudo-code, we have to decide on how to encode in PROMELA: send to all and receive counting expressions “received m from n t distinct processes” faults In what follows, we compare side-by-side two solutions: A straightforward encoding with PROMELA channels and explicit representation of faulty processes. [Solution 1] An advanced encoding with shared variables and fault injection. [Solution 2] Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 51 / 64
  • 71. Modeling threshold-based algorithms in Promela As the distributed algorithms are given in pseudo-code, we have to decide on how to encode in PROMELA: send to all and receive counting expressions “received m from n t distinct processes” faults In what follows, we compare side-by-side two solutions: A straightforward encoding with PROMELA channels and explicit representation of faulty processes. [Solution 1] An advanced encoding with shared variables and fault injection. [Solution 2] To decouple encoding of reliable message passing and of faults, we first consider message passing without faults, and then show how to encode faults. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 51 / 64
  • 72. Template in Promela We implement the following loop on the right. receive messages compute using messages and local variables (description in English with basic control flow if-then-else) send messages atomic / s h a r e d s t a t e : a v a r i a b l e o r a c h anne l / active proctype[N(n,t,f)] P(){ / l o c a l v a r i a b l e t o c ount me s s a g e s from d i s t i n c t p r o c e s s e s / int nrcvd; / i n i t i a l i z a t i o n / loop: atomic { / 1 . r e c e i v e and c ount me s s a ge s 2 . compute us ing nr cvd 3 . s end me s s a g e s / } goto loop; } Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 52 / 64
  • 73. Modeling Message Passing All our case studies are designed with the assumption of classic reliable asynchronous message passing as in (Fischer et al., 1985): non-blocking communication, operations “receive” and “send” are executed immediately. if a message can be received now, it may be also received later, a process does not have to receive a message as soon as it is able to. every sent message is eventually received, but there are no bounds on the delays. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 53 / 64
  • 74. Solution 1: Message Passing using Promela channels A straightforward encoding using message channels: / me s s age t y p e / mtype = { ECHO }; / p o inttop o i n t c h a n n e l s / chan p2p[N][N] = [1] of { mtype }; / t a g r e c e i v e d me s s a g e s / bit rx[N][N]; Sending a message to all processes: for (i : 1 .. N) { p2p[_pid][i]!ECHO; } Note: pid denotes the process identifier in PROMELA (we use it solely to encode message passing). Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 54 / 64
  • 75. Solution 1: Message Passing (cont.) Receiving and counting messages from distinct processes (no faults yet): / l o c a l / int nrcvd = 0; / i n i t i a l l y , no me s s a ge s / ... i = 0; do / i s t h e r e a me s s age from p r o c e s s i ? / :: (i N) nempty(p2p[i][_pid]) - p2p[i][_pid]?ECHO; / remove i t / if :: !rx[i][_pid] - / 1 . t h e f i r s t t ime : / rx[i][_pid] = 1; / a . mark a s r e c e i v e d / nrcvd++; break; / b . i n c r e a s e l o c a l c o u n t e r / :: rx[i][_pid]; / 2 . i g n o r e a d u p l i c a t e / fi; i++; / ne xt p r o c e s s / :: (i N) - i++; / r e c e i v e no t h ing from i / :: i == N - break; od Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 55 / 64
  • 76. Solution 2: Simulating message passing with variables Keeping the number of send-to-all’s by (correct) processes: int nsnt; / s h a r e d v a r i a b l e / / number o f sendtoa l l ’ s s e n t by c o r r e c t p r o c e s s e s / Sending a message to all: nsnt++; Receiving and counting messages from distinct processes (no faults): if / p i c k a l a r g e r v a l ue nsnt / :: ((nrcvd + 1) nsnt) - nrcvd++; / one more me s s age / :: skip; / o r no t h ing / fi; Reliable communication as a fairness property: FG[8i:nrcvdi nsnt] Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 56 / 64
  • 77. Solution 2: Some questions you might ask Q1: instead of if :: ((nrcvd + 1) nsnt) - nrcvd++; / one more me s s age / :: skip; / o r no t h ing / fi; why cannot we just write: nrcvd = nsnt; A1: You can, but that will be another model, not [FLP85]! [FLP85] only guarantees that every message is eventually received. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 57 / 64
  • 78. Solution 2: Some questions you might ask (cont.) Reliable communication: every sent message is eventually received. Q2: Why do we write FG[8i:nrcvdi nsnt] (1) instead of: 8i: GF [nrcvdi nsnt] (2) A2: We like to write (2), but it will require us to use another logic called indexed LTL, which will cause problems in the parameterized case. For threshold-based algorithms, the value of nsnt is changes at most n times. Under this assumption, (2) is equivalent (1). Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 58 / 64
  • 79. Solution 1 (cont.): Explicit Modeling of Faults (Lamport et al., 1982) introduce Byzantine processes that can virtually do anything. In our case, Byzantine behavior boils down to sending ECHO to some of the correct processes and not sending ECHO to the others: active[F] proctype Byz() { step: atomic { i = 0; do / s end ECHO t o p r o c e s s i / :: i N - p2p[_pid][i]!ECHO; i++; / o r no t / :: i N - i++; :: i == N - break; od }; goto step; } Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 59 / 64
  • 80. Solution 2 (cont.): Injecting Faults into Message Counters We instantiate n f correct processes and no faulty processes. Instead, we say that the correct processes may receive up to f additional messages due to faults: if :: ((nrcvd + 1) nsnt + f) - nrcvd++; / r e c e i v e one more me s s age / :: skip; / o r no t h ing / fi; Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 60 / 64
  • 81. Solution 2 (cont.): Injecting Faults into Message Counters We instantiate n f correct processes and no faulty processes. Instead, we say that the correct processes may receive up to f additional messages due to faults: if :: ((nrcvd + 1) nsnt + f) - nrcvd++; / r e c e i v e one more me s s age / :: skip; / o r no t h ing / fi; The fairness still forces the processes to receive all the messages sent by the correct processes: FG[8i:nrcvdi nsnt] Note: each correct process sends at most one ECHO message. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 60 / 64
  • 82. Solution 2 (cont.): Modeling different kinds of faults Byzantine faults (previous slide): create only correct processes, i.e., n f processes only they have to satisfy spec extra messages from Byzantine: ((nrcvd + 1) nsnt + f) fairness (reliable communication): FG[8i:nrcvdi nsnt] Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 61 / 64
  • 83. Solution 2 (cont.): Modeling different kinds of faults Byzantine faults (previous slide): create only correct processes, i.e., n f processes only they have to satisfy spec extra messages from Byzantine: ((nrcvd + 1) nsnt + f) fairness (reliable communication): FG[8i:nrcvdi nsnt] Omission faults (processes fail to send messages): create all processes, i.e., n processes all of them are mentioned in the specification no additional messages: ((nrcvd + 1) nsnt) fairness (with possible message loss due to faults) FG[8i:nrcvdi nsnt f ] Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 61 / 64
  • 84. Solution 2 (cont.): Modeling different kinds of faults Byzantine faults (previous slide): create only correct processes, i.e., n f processes only they have to satisfy spec extra messages from Byzantine: ((nrcvd + 1) nsnt + f) fairness (reliable communication): FG[8i:nrcvdi nsnt] Omission faults (processes fail to send messages): create all processes, i.e., n processes all of them are mentioned in the specification no additional messages: ((nrcvd + 1) nsnt) fairness (with possible message loss due to faults) FG[8i:nrcvdi nsnt f ] Crash faults: similar to omissions with crash control state added Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 61 / 64
  • 85. Experiments: Solution 1 vs. Solution 2 States (logscale) 1e+08 1e+07 1e+06 100000 10000 1000 100 10 states (logscale) 3 4 5 6 7 8 number of processes, N Memory (MB, logscale, 12 GB) 10000 1000 100 memory, MB (logscale) 3 4 5 6 7 8 number of processes, N Solution 1: Channels + explicit Byzantine processes (blue) Solution 2: shared variables + fault injection (red) in the presence of one Byzantine faulty process (f = 1) (case f = 2 runs out of memory too fast) Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 62 / 64
  • 86. Summary We show how to model threshold-based fault-tolerant algorithms starting with an imprecise description We create PROMELA models using expert advice. The tool demonstrates that the model behaves as predicted by theory (for concrete values of parameters) This reference implementation allows us to optimize the encoding ... and to make the model amenable to parameterized verification Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 63 / 64
  • 87. Tomorrow: parameterized model checking Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 64 / 64
  • 88. References I Biely, M., Delgado, P., Milosevic, Z., Schiper, A. 2013 (June). Distal: A framework for implementing fault-tolerant distributed algorithms. Pages 1–8 of: Dependable Systems and Networks (DSN), 2013 43rd Annual IEEE/IFIP International Conference on. Castro, Miguel, Liskov, Barbara, et al. 1999. Practical Byzantine fault tolerance. Pages 173–186 of: OSDI, vol. 99. Fischer, Michael J., Lynch, Nancy A., Paterson, M. S. 1985. Impossibility of Distributed Consensus with one Faulty Process. J. ACM, 32(2), 374–382. http://doi.acm.org/10.1145/3149.214121. John, Annu, Konnov, Igor, Schmid, Ulrich, Veith, Helmut, Widder, Josef. 2013. Towards Modeling and Model Checking Fault-Tolerant Distributed Algorithms. Pages 209–226 of: SPIN. LNCS, vol. 7976. Lamport, Leslie, Shostak, Robert E., Pease, Marshall C. 1982. The Byzantine Generals Problem. ACM Trans. Program. Lang. Syst., 4(3), 382–401. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 65 / 64
  • 89. References II Lincoln, P., Rushby, J. 1993. A formally verified algorithm for interactive consistency under a hybrid fault model. Pages 402–411 of: FTCS-23. http://dx.doi.org/10.1109/FTCS.1993.627343. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 66 / 64
  • 90. Folklore Reliable Broadcast (e.g., Chandra Toueg, 96) Correct processes agree on value vi in the presence of crash faults. Variables of process i vi : f0 , 1g i n i t i a l l y 0 or 1 accepti : f0 , 1g i n i t i a l l y 0 An atomic step: i f ( vi = 1 or received echo from some proces s ) and accepti = 0 then begin send echo to all ; accepti := 1 ; end Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 67 / 64
  • 91. Folklore Reliable Broadcast (e.g., Chandra Toueg, 96) Correct processes agree on value vi in the presence of crash faults. Variables of process i vi : f0 , 1g i n i t i a l l y 0 or 1 accepti : f0 , 1g i n i t i a l l y 0 An atomic step: i f ( vi = 1 or received echo from some proces s ) and accepti = 0 then begin send echo to all ; /* when crashing it sends to a subset of processes */ accepti := 1 ; /* it can also crash here */ end Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 67 / 64
  • 92. Verification Problem as in Distributed Computing Given a distributed algorithm A and specifications 'U, 'C, 'R, Fix n and t with n 3t, show that every execution of A(n; t) satisfies 'U, 'C, 'R. Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 68 / 64
  • 93. Verification Problem as in Distributed Computing Given a distributed algorithm A and specifications 'U, 'C, 'R, Fix n and t with n 3t, show that every execution of A(n; t) satisfies 'U, 'C, 'R. In every execution: the number of faulty processes is restricted, i.e., f t; processes can use n and t in the code, but not f ; f is constant (if a process fails late, its “correct” behavior was a Byzantine trick). A distributed system A(n; t) f = 0 . . . f = t Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 68 / 64
  • 94. Verification Problem as in Distributed Computing Given a distributed algorithm A and specifications 'U, 'C, 'R, Fix n and t with n 3t, show that every execution of A(n; t) satisfies 'U, 'C, 'R. In every execution: the number of faulty processes is restricted, i.e., f t; processes can use n and t in the code, but not f ; f is constant (if a process fails late, its “correct” behavior was a Byzantine trick). A distributed system A(n; t) f = 0 . . . f = t Counterexamples when f t? Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 68 / 64
  • 95. Experiments: Channels vs. Shared Variables enumerating reachable states in SPIN with POR and state compression States (logscale) 1e+09 1e+08 1e+07 1e+06 100000 10000 1000 100 states (logscale) 3 4 5 6 7 8 number of processes, N Memory (MB, logscale, limit of 12 GB) 10000 1000 100 memory, MB (logscale) 3 4 5 6 7 8 number of processes, N Josef Widder (www.forsyte.at) Checking Fault-Tolerant Distributed Algos TMPA’14, Nov. 2014 69 / 64