Interactive and Incremental
Process Model Repair
Abel Armas Cervantes, Nick R.T.P. van Beest, Marcello La Rosa,
Marlon Dumas and Luciano García-Bañuelos
1
Process mining
Process mining is a family of methods for analyzing business processes based
on event logs.
• Some of the most important process mining operations:
• Discovery
• Conformance checking
• Enhancement
• Repair
Model v1
Log
Model v2
5
Process model repair at a glance
Model v1 + Log → Conformance Checking → Diagnosis → Repair → Model v2
• Conformance checking: behavioral-based (true concurrency semantics)
• Diagnosis: textual descriptions and graphical representations
• Repair: interactive and incremental
6
Process quality dimensions
• Fitness: Is the model able to replay the event log?
• Simplicity: Does the model follow Occam’s razor?
• Precision: Does the model contain additional undesired behavior?
• Generalization: Does the model contain additional welcomed behavior?
Example logs
• Log 1: ⟨A,B⟩, ⟨A,B⟩, ⟨A,B⟩
• Log 2: ⟨A⟩, ⟨A,A⟩, ⟨A,A,A⟩
7
Process quality dimensions
• Fitness, Simplicity, Precision, Generalization
• Highlighted dimension: Fitness
8
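A coarse, trace-level reading of the fitness question ("is the model able to replay the event log?") can be sketched as follows. The sketch assumes the model is given as a small deterministic automaton; the `Transitions` encoding and the function names are hypothetical and do not come from the slides. It reports the fraction of traces the model can replay, which is much cruder than alignment-based fitness measures.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical model encoding: state -> {task label -> next state}.
Transitions = Dict[str, Dict[str, str]]
Trace = Tuple[str, ...]


def replays(trace: Trace, transitions: Transitions,
            start: str, finals: Set[str]) -> bool:
    """Return True if the automaton accepts the trace."""
    state = start
    for label in trace:
        nxt = transitions.get(state, {}).get(label)
        if nxt is None:
            return False  # the model cannot execute this task here
        state = nxt
    return state in finals


def trace_fitness(log: List[Trace], transitions: Transitions,
                  start: str, finals: Set[str]) -> float:
    """Fraction of traces in the log that the model can replay."""
    if not log:
        return 1.0
    replayable = sum(replays(t, transitions, start, finals) for t in log)
    return replayable / len(log)


# Illustrative model that allows exactly the trace <A,B>.
AB_MODEL: Transitions = {"s0": {"A": "s1"}, "s1": {"B": "s2"}}

LOG_1 = [("A", "B"), ("A", "B"), ("A", "B")]
LOG_2 = [("A",), ("A", "A"), ("A", "A", "A")]

print(trace_fitness(LOG_1, AB_MODEL, "s0", {"s2"}))  # 1.0
print(trace_fitness(LOG_2, AB_MODEL, "s0", {"s2"}))  # 0.0
```

The two logs are the example logs from the slide; pairing them with this particular model is an assumption made only for illustration.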
Approach 1: Model Repair - Aligning Process Models to Reality
• Automatic
• Trace-alignment based (interleaving semantics)
• Sub-logs of non-fitting traces are extracted from the log and repaired by inserting cycles or sub-processes
• Fitness based
• Extend the model such that it explains the observed behavior
D. Fahland and W. M. P. van der Aalst: Model Repair - Aligning Process Models to Reality. IS 2015
9
Approach 2: Impact-Driven Process Model Repair
• Automatic
• Trace-alignment based (interleaving semantics)
• Fitness based, subject to a budget
• Maximize fitness
• Assign a cost to each repair
• Apply the most impactful model repairs
• Insert self-loops for moves on log
• Skip labels for moves on model
A. Polyvyanyy, W. M. P. van der Aalst, A. H. M. ter Hofstede, and M. T. Wynn: Impact-Driven Process Model Repair. ACM TOSEM 2016
10
Automatic process model repair in action
11
[Figure: the original model (modelFigure1, tasks A to H) and the repaired models produced by Approach 1 and Approach 2]
Log
⟨A,B,C,D,E,F,G,H⟩
⟨A,B,C,D,F,E,G,H⟩
Automatic process model repair in action
[Figure: the original model (modelFigure2, tasks I, A, B, X, C, D, O) and the repaired models produced by Approach 1 and Approach 2]
Log
⟨I, A, B, X, C, O⟩
⟨I, A, B, X, D, O⟩
⟨I, B, A, X, C, O⟩
⟨I, B, A, X, D, O⟩
Log or model as absolute truth
• The log can contain noise or negative deviances from expected behavior
• Don’t you fully trust the model? Don’t you fully trust the log?
• Select what to repair in the model
• Repair the model
[Figure: two repair flows, each leading from Model v1 to Model v2]
13
Interactive and incremental repair
• Conformance checker based on behavioral-alignment
• Support users during the reconciliation of differences
• Let the users decide what, when and how to repair
14
[Figure: the original model (modelFigure1) and candidate repaired models over tasks A to H]
Log
⟨A,B,C,D,E,F,G,H⟩
⟨A,B,C,D,F,E,G,H⟩
How do we do that?
• Adopt true concurrency semantics:
• Event structures (ES) as a unified representation for logs and models
• Identify common and deviant behavior
• Describe differences via natural language statements
L. García-Bañuelos, N. R. T. P. van Beest, M. Dumas, M. La Rosa, and W. Mertens: Complete and interpretable conformance checking of business processes. TSE 2017
15
Pipeline
• Event log → merge → PESL (log PES)
• Input model → unfold → PESM (model PES)
• PESL and PESM → compare → Partially Synchronized Product (PSP)
• PSP → extract differences → Difference statements
From an event log to a PES
Log
Trace      Ref  N
A B C E    t1   3
A C B E    t2   2
A B E      t3   2
A D E      t4   3
Runs
• p1 (t1, t2): e0:A, then e1:B and e2:C, then e3:E
• p2 (t3): f0:A, f1:B, f2:E
• p3 (t4): g0:A, g1:D, g2:E
PES
• {e0,f0,g0}:A
• {e1,f1}:B, {e2}:C, {g1}:D
• {e3}:E, {f2}:E, {g2}:E
16
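The grouping of traces into runs (t1, t2 → p1) can be sketched by treating two traces as the same run whenever one is obtained from the other by swapping adjacent concurrent tasks. The sketch assumes a concurrency oracle that reports B and C as concurrent; the function names are hypothetical, and the brute-force search below only illustrates the idea on short traces, it is not the authors' construction.

```python
from collections import deque
from typing import Dict, FrozenSet, List, Set, Tuple

Trace = Tuple[str, ...]


def canonical(trace: Trace, concurrent: Set[FrozenSet[str]]) -> Trace:
    """Smallest trace reachable by swapping adjacent concurrent tasks."""
    seen = {trace}
    queue = deque([trace])
    while queue:
        cur = queue.popleft()
        for i in range(len(cur) - 1):
            if frozenset((cur[i], cur[i + 1])) in concurrent:
                swapped = cur[:i] + (cur[i + 1], cur[i]) + cur[i + 2:]
                if swapped not in seen:
                    seen.add(swapped)
                    queue.append(swapped)
    return min(seen)


def group_into_runs(log: Dict[str, Trace],
                    concurrent: Set[FrozenSet[str]]) -> List[List[str]]:
    """Group trace references that denote the same partially ordered run."""
    runs: Dict[Trace, List[str]] = {}
    for ref, trace in log.items():
        runs.setdefault(canonical(trace, concurrent), []).append(ref)
    return list(runs.values())


LOG = {
    "t1": ("A", "B", "C", "E"),
    "t2": ("A", "C", "B", "E"),
    "t3": ("A", "B", "E"),
    "t4": ("A", "D", "E"),
}
# Assumed output of the concurrency oracle: B and C are concurrent.
CONCURRENT = {frozenset(("B", "C"))}

print(group_into_runs(LOG, CONCURRENT))  # [['t1', 't2'], ['t3'], ['t4']]
```

Merging the resulting runs on common prefixes then gives the PES shown on the slide.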
From model to PES
BPMN model → Petri net → Complete prefix unfolding
17
From model to PES
Complete prefix unfolding → PES
18
Behavioral alignment overview
Log
Trace      Ref  N
A B C E    t1   3
A C B E    t2   2
A B E      t3   2
A D E      t4   3
Log PES
• e0:A
• e1:B, e2:C, e3:D
• e4:E, e5:E, e6:E
Model (tasks A, B, C, D, E)
Model PES
• f0:A
• f1:B, f2:C, f3:D
• f4:E, f5:E
19
Behavioral alignment overview
PSP fragment (m: matched event pairs, lh: hidden log events, rh: hidden model events)
• start: lh = {}, rh = {}, m = {}
• match A: lh = {}, rh = {}, m = {(e0,f0)A}
• match B: lh = {}, rh = {}, m = {(e0,f0)A,(e1,f1)B}
• match C (after match B): lh = {}, rh = {}, m = {(e0,f0)A,(e1,f1)B,(e2,f2)C}
• rhide C (after match B): lh = {}, rh = {f2:C}, m = {(e0,f0)A,(e1,f1)B}
• match E (after match C): lh = {}, rh = {}, m = {(e0,f0)A,(e1,f1)B,(e2,f2)C,(e5,f4)E}
Log PES: e0:A; e1:B, e2:C, e3:D; e4:E, e5:E, e6:E
Model PES: f0:A; f1:B, f2:C, f3:D; f4:E, f5:E
20
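The three PSP operations (match, lhide, rhide) can be illustrated on a deliberately simplified case: aligning one log trace with one model trace so that the number of hides is minimal. This is only a sequential sketch under that assumption. The approach in the slides works on event structures and searches the product with A*, whereas the hypothetical `align` function below uses dynamic programming over two sequences.

```python
from functools import lru_cache
from typing import List, Sequence


def align(log_trace: Sequence[str], model_trace: Sequence[str]) -> List[str]:
    """Return match/lhide/rhide steps with a minimal number of hides."""
    n, m = len(log_trace), len(model_trace)

    @lru_cache(maxsize=None)
    def cost(i: int, j: int) -> int:
        # Minimal number of hides needed to align the two suffixes.
        if i == n:
            return m - j  # hide the remaining model tasks
        if j == m:
            return n - i  # hide the remaining log tasks
        best = 1 + min(cost(i + 1, j), cost(i, j + 1))
        if log_trace[i] == model_trace[j]:
            best = min(best, cost(i + 1, j + 1))
        return best

    steps: List[str] = []
    i = j = 0
    while i < n or j < m:
        can_match = i < n and j < m and log_trace[i] == model_trace[j]
        if can_match and cost(i, j) == cost(i + 1, j + 1):
            steps.append(f"match {log_trace[i]}")
            i, j = i + 1, j + 1
        elif j < m and (i == n or cost(i, j) == 1 + cost(i, j + 1)):
            steps.append(f"rhide {model_trace[j]}")  # hide a model task
            j += 1
        else:
            steps.append(f"lhide {log_trace[i]}")  # hide a log task
            i += 1
    return steps


# Log trace t3 against a model trace in which C is mandatory.
print(align(("A", "B", "E"), ("A", "B", "C", "E")))
# ['match A', 'match B', 'rhide C', 'match E']
```

The `rhide C` step corresponds to the PSP branch in which the model event f2:C has no counterpart in the log.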
Behavioral alignment overview
• Difference statement: In the log, C is optional after {A,B}, whereas in the model it is not
27
Patterns for behavioral alignment diagnosis
Unfitting behavior:
• Relation mismatch:
1. Causality-Concurrency
2. Conflict
• Event mismatch:
3. Task skipping
4. Task substitution
5. Unmatched repetition
6. Task relocation
7. Task insertion / absence
Additional model behavior:
8. Unobserved acyclic interval
9. Unobserved cyclic interval
28
Repair assistance through patterns
• Difference statement: In the log, C is optional after {A,B}, whereas in the model it is not
[Figure: the PSP fragment from the behavioral alignment overview and two versions of the model (ExampleCoopis) over tasks A, B, C, D, E]
29
Repair patterns extension
• Distinct sets of patterns can be used to explain the encountered difference
[Figure: the PSP fragment from the behavioral alignment overview, extended with a further step, rhide E, that leads to lh = {}, rh = {f2:C, fx:E}, m = {(e0,f0)A,(e1,f1)B}]
Candidate difference statements:
• In the log, C is optional after {A,B}, whereas in the model it is not
• In the log, task E does not occur after {A,B}
• In the log, task C does not occur after {A,B}
• In the log, task E does not occur after {A,B}
• In the log, the interval [C, E] is optional after {A,B}, whereas in the model it is not
30
Order of pattern detection
1. Patterns based on intervals
2. Patterns based on binary relations
3. Patterns based on a single task
[Figure: example models and log runs over tasks i, a, b, c, d, e, o, one example per kind of pattern]
• TaskReloc: In the log, the interval [b,c] occurs after [i,a,o] instead of [i,a]
• ConcConf: In the log, after i, Task a and Task b are concurrent, while in the model they are mutually exclusive
• TaskIns: In the log, Task b occurs after [i, a] and before o
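The detection order above can be sketched as a simple sort over detected differences. The `Difference` record and the class assigned to each pattern are assumptions based on one reading of the slide (TaskReloc talks about an interval, ConcConf about a pair of tasks, TaskIns about a single task); the slides do not give this mapping explicitly.

```python
from dataclasses import dataclass
from typing import List

# Detection order from the slide: intervals, binary relations, single tasks.
PATTERN_CLASS_ORDER = {"interval": 0, "binary relation": 1, "single task": 2}


@dataclass(frozen=True)
class Difference:
    pattern: str        # short pattern name, e.g. "TaskReloc"
    pattern_class: str  # key of PATTERN_CLASS_ORDER (assumed mapping)
    statement: str      # natural language difference statement


def detection_order(diffs: List[Difference]) -> List[Difference]:
    """Sort differences by pattern class; ties keep their input order."""
    return sorted(diffs, key=lambda d: PATTERN_CLASS_ORDER[d.pattern_class])


DIFFS = [
    Difference("TaskIns", "single task",
               "In the log, Task b occurs after [i, a] and before o"),
    Difference("ConcConf", "binary relation",
               "In the log, after i, Task a and Task b are concurrent, "
               "while in the model they are mutually exclusive"),
    Difference("TaskReloc", "interval",
               "In the log, the interval [b,c] occurs after [i,a,o] "
               "instead of [i,a]"),
]

print([d.pattern for d in detection_order(DIFFS)])
# ['TaskReloc', 'ConcConf', 'TaskIns']
```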
Repair patterns based on intervals
[Figure: interval-based repair patterns illustrated on models and log runs over tasks i, a, b, c, d, e, o]
32
Repair patterns based on binary relations
[Figure: repair patterns based on binary relations, illustrated on models and log runs over tasks i, a, b, o]
33
Repair patterns based on single tasks
[Figure: repair patterns based on single tasks, illustrated on models and log runs (NewProcess2) over tasks i, a, b, c, d, e, o]
34
Impact of repairs
• Repairs are presented in decreasing order of impact (the proportion of traces affected by the change)
• Difference statement: In the log, F occurs after {A,B}
[Figure: the PSP fragment from the behavioral alignment overview, extended with an lhide F step and annotated with trace counts (100), (100), (90), (90), (10)]
• Impact: 10/100 = 0.1
35
Impact of repairs
• Repairs are presented in decreasing order of impact (the proportion of traces affected by the change)
• Difference statement: In the log, F occurs after {A,B}
[Figure: the PSP fragment from the behavioral alignment overview, extended with an lhide F step and annotated with trace counts (100), (100), (10), (10), (90)]
• Impact: 90/100 = 0.9
36
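The impact figures on the two slides follow directly from the definition (traces affected by the change divided by all traces). A minimal sketch, with hypothetical function names and repair labels, that computes the impact and orders repairs so that the most impactful one comes first:

```python
from typing import List, Tuple


def impact(affected_traces: int, total_traces: int) -> float:
    """Proportion of traces affected by a repair."""
    if total_traces <= 0:
        raise ValueError("total_traces must be positive")
    return affected_traces / total_traces


def rank_by_impact(repairs: List[Tuple[str, int]],
                   total_traces: int) -> List[Tuple[str, float]]:
    """Return (repair, impact) pairs, most impactful first."""
    scored = [(name, impact(n, total_traces)) for name, n in repairs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


print(impact(10, 100))  # 0.1
print(impact(90, 100))  # 0.9

# Hypothetical repair labels with the trace counts used on the slides.
print(rank_by_impact([("repair_a", 10), ("repair_b", 90)], 100))
# [('repair_b', 0.9), ('repair_a', 0.1)]
```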
Interactive and incremental repair
37
Powered by
Future work
• Perform a more extensive evaluation on real-life datasets
• Consider additional information in the log to define a more refined notion of
impact
• Identify sets of discrepancies that can be applied together


Editor's Notes

  • #3 Event logs capture the behavior of a system by means of events
  • #6 Why? It is a reference model. It is descriptive or normative. It captures knowledge that can help to guide discovery.
  • #14 The log can be noisy (containing uncommon behavior), incomplete (events that occurred in the system might not be recorded), or inaccurate (containing events that did not occur in reality).
  • #17 To simplify, in an event structure we don’t show: transitive causality relations (e.g. A is causal to E); hereditary conflict (e.g. B and C are in conflict with {g2}:E); and concurrency (every pair of events that is neither directly nor transitively causally related, nor in conflict, is concurrent; e.g. B and C are concurrent). Every state of a PES (called a “configuration”) is represented by a set of events. A configuration C is causally closed, meaning that for each event e in C, C includes all causal predecessors of e, and conflict-free, meaning that no pair of events in C is in conflict. An event e is an extension of C if {e} ∪ C is also a configuration. A maximal configuration is a configuration that is maximal w.r.t. set inclusion. Lossless representation: the set of maximal configurations of the PES is equivalent to the set of runs inferred from the log, modulo the inaccuracy of the concurrency oracle. The complexity of building such an event structure from a log is cubic in the length of the longest trace. We merge events that have the same label (e.g. e0 and f0) and the same history (same prefix).
  • #18 The complete prefix unfolding is essentially the branching process with a trick: we unfold loops only once (otherwise the branching process would be infinitely large) and use a mechanism called cut-off and corresponding events to “jump” from one event to another one that is future-equivalent. The model is a variation of what we showed before, with C being optional and without the loopback after F. Requirements: the Petri net must be safe, and the silent transitions will be eliminated when constructing the PES. First, we construct the branching process by prefix-merging all the partially ordered runs induced by the Petri net. We do so because branching processes explicitly represent the same set of behavioral relations as event structures. Transitions in the branching process represent events in the event structure, and the behavioral relations in the branching process can thus be used to generate an event structure. However, the event structure that we create by using a set of inductive rules includes silent transitions. It has been shown (Abel’s IS paper) that silent transitions can be abstracted away in a behavior-preserving manner, under the well-known notion of visible-pomset equivalence (which requires that sink events in the event structure are not silent, and we can always add fake ones). In the example, the futures of t2 and t4 (C) are isomorphic, yet they are unfolded separately in the branching process. Furthermore, the futures of E and F are isomorphic. Therefore, we can safely stop unfolding the branching process once we reach t4, provided that we continue unfolding from t2 onwards. Complete prefix unfolding: McMillan showed that, for a safe net, a prefix of a branching process that unfolds each loop once fully encodes the behavior of the original net. Such a prefix is referred to as the complete prefix unfolding of the net. We use Esparza’s optimization, which has been shown to produce compact unfoldings. The trick is to find transitions that have isomorphic futures. When we stop unfolding after t4, we call t4 the cutoff event. The event with the isomorphic future, t2, is called the corresponding event. The resulting cc-pair (cutoff-corresponding pair) (t4,t2) is graphically depicted here with the red line. The same holds for E and F.
  • #19 We call the PES derived from a complete prefix unfolding a PES prefix unfolding, or simply a prefix PES. The complete prefix unfolding translates to a PES prefix unfolding as shown in this slide. Reasoning about possible executions of a PES prefix unfolding is not convenient because some configurations are not explicitly represented. To make it more convenient to explore the configurations of a PES prefix unfolding, we use the “shift” operation on net unfoldings. Given a cc-pair (t4, t2), since the futures of [t4] and [t2] are isomorphic, we can “shift” from one configuration to the other. In other words, the shift operation is a “step” function that allows us to move from one configuration to another. Note that in a PES we retain those tau transitions that are involved in a cc-pair and safely get rid of the rest. In the worst case, extracting a complete prefix unfolding from a Petri net is exponential in the size of the net.
  • #20 Let’s take the example of a model and a log
  • #26 E is not in conflict with C in the model. In the example, note that C is optional in the log and mandatory in the model ONLY after state {A,B}, and not always: e.g. in the model, after {A}, D can be executed and C is thus skipped. At each synchronized state, the sets of enabled events in the two PESs are checked to identify those that are label-preserving and order-preserving. Label preservation is a simple check, but order preservation requires a backward traversal of the event structure to check that all causality relations are maintained between each of the enabled events and the configuration path in the event structure. If this is not the case, the algorithm returns the set of events that have a causality discrepancy, the cut-off events being traversed, and the set of all causality relations that are violated. This is later used to characterize behavioral mismatches. The PSP construction aims at finding the optimal matchings (i.e. the maximum number of matches, meaning the minimum number of hides) for every maximal configuration of the log PES. Hence, priority is given to the log. A PSP with no hide operations identifies a situation where the log fully fits the model. Complexity of PSP construction: we use an A* heuristic to find the optimal number of matches, so the worst case is O(3^(nPES1 x nPES2)), where nPESx is the number of configurations of PESx and 3 is the branching factor, i.e. the average number of successors per state (match, lhide and rhide). Indeed, each configuration of PES1 is associated with a configuration of PES2 via 3 possible operations.
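The definition of a configuration given in note #17 (causally closed and conflict-free) can be checked mechanically. The sketch below assumes a PES encoded as direct causal predecessors plus direct conflicts; the encoding, the function names and the small example structure (loosely modelled on the log PES, with lowercase event names) are hypothetical. In a causally closed set it is enough to test direct conflicts, because an inherited conflict always stems from a direct conflict between causal predecessors that the set must contain.

```python
from typing import Dict, FrozenSet, Set


def is_configuration(events: Set[str],
                     preds: Dict[str, Set[str]],
                     conflicts: Set[FrozenSet[str]]) -> bool:
    """Check that a set of events is causally closed and conflict-free."""
    # Causally closed: every direct predecessor is present, which by
    # induction implies that all transitive predecessors are present.
    for event in events:
        if not preds.get(event, set()) <= events:
            return False
    # Conflict-free: no directly conflicting pair lies inside the set.
    return not any(pair <= events for pair in conflicts)


def extensions(config: Set[str], all_events: Set[str],
               preds: Dict[str, Set[str]],
               conflicts: Set[FrozenSet[str]]) -> Set[str]:
    """Events e such that config ∪ {e} is again a configuration."""
    return {e for e in all_events - config
            if is_configuration(config | {e}, preds, conflicts)}


# Hypothetical PES: a, then b and c (concurrent) or d; e1 follows b and c,
# e2 follows b alone, e3 follows d.
PREDS = {"b": {"a"}, "c": {"a"}, "d": {"a"},
         "e1": {"b", "c"}, "e2": {"b"}, "e3": {"d"}}
CONFLICTS = {frozenset(("b", "d")), frozenset(("c", "d")),
             frozenset(("c", "e2"))}
EVENTS = {"a", "b", "c", "d", "e1", "e2", "e3"}

print(is_configuration({"a", "b", "c", "e1"}, PREDS, CONFLICTS))  # True
print(is_configuration({"a", "e1"}, PREDS, CONFLICTS))            # False
print(is_configuration({"a", "b", "d"}, PREDS, CONFLICTS))        # False
print(sorted(extensions({"a", "b"}, EVENTS, PREDS, CONFLICTS)))   # ['c', 'e2']
```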