The paper was presented at the 1st International Conference on Process Mining (icpmconference.org) and won the best paper award. The paper can be retrieved from: https://www.researchgate.net/publication/332396663_Monotone_Conformance_Checking_for_Partially_Matching_Designed_and_Observed_Processes
ABSTRACT
Conformance checking is a subarea of process mining that studies relations between designed processes, also called process models, and records of observed processes, also called event logs. In the last decade, research in conformance checking has proposed a plethora of techniques for characterizing the discrepancies between process models and event logs. Often, these techniques are also applied to measure the quality of process models automatically discovered from event logs. Recently, the process mining community has initiated a discussion on the desired properties of such measures. This discussion witnesses the lack of measures with the desired properties and the lack of properties intended for measures that support partially matching processes, i.e., processes that are not identical but differ in some steps. The paper at hand addresses these limitations. Firstly, it extends the recently introduced precision and recall conformance measures between process models and event logs that possess the desired property of monotonicity with the support of partially matching processes. Secondly, it introduces new intuitively desired properties of conformance measures that support partially matching processes and shows that our measures indeed possess them. The new measures have been implemented in a publicly available tool. The reported qualitative and quantitative evaluations based on our implementation demonstrate the feasibility of using the proposed measures in industrial settings.
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Monotone Conformance Checking for Partially Matching Designed and Observed Processes
1. Monotone
Conformance Checking
for Partially Matching
Designed and Observed
Processes
Artem Polyvyanyy and Anna Kalenkova
1st International Conference on Process Mining, June 24-26, 2019, Aachen, Germany
1
2. Conformance Checking
Artem Polyvyanyy and Anna Kalenkova | Monotone Conformance Checking for Partially Matching Designed and Observed Processes | June 24-26, 2019, Aachen, Germany 2
Trace Frequency
ABDEI 1,207
ACDGHFI 145
ACGDHFI 56
ACDHFI 28
ACHDFI 23
A
B
C
D
E
F
I
G H
Ʈ
Event log 1 Process model 1
Conformance checking relates events in the event log to activities in the process model and
compares both. The goal is to find commonalities and discrepancies between the modeled behavior
and the observed behavior [1].
[1] Wil M. P. van der Aalst: Process Mining - Data Science in Action, Second Edition. Springer 2016, ISBN 978-3-662-49850-7
Precision: Process model should not allow for behavior unrelated to what was seen in the event log.
Fitness: Process model should allow for the behavior seen in the event log.
+ fitness
+ precision
3. A
B
C
E
F
I
G H
Ʈ
D
Conformance Checking (II)
Artem Polyvyanyy and Anna Kalenkova | Monotone Conformance Checking for Partially Matching Designed and Observed Processes | June 24-26, 2019, Aachen, Germany 3
Single trace model
Flower model
Trace Frequency
ABDEI 1,207
ACDGHFI 145
ACGDHFI 56
ACDHFI 28
ACHDFI 23
Event log 1
Ʈ
A
B C
E
D
H
I
F G
Ʈ
A B D E I
+ fitness
+/- precision
- fitness
+ precision
+ fitness
- precision
A
B
C
D
E
F
I
G H
Process model 2
Process model 3
+ fitness
+/- precision
4. Precision and Recall in Information Retrieval
Artem Polyvyanyy and Anna Kalenkova | Monotone Conformance Checking for Partially Matching Designed and Observed Processes | June 24-26, 2019, Aachen, Germany 4
Information Retrieval
engine
Document
collection
Query
Precision:
Recall:
Relevant
documents
Retrieved
documents
=
3
4
=
3
5
5. (Relevant)
log traces
(Retrieved)
model
traces
Precision and Recall in Conformance Checking
Artem Polyvyanyy and Anna Kalenkova | Monotone Conformance Checking for Partially Matching Designed and Observed Processes | June 24-26, 2019, Aachen, Germany 5
Process Discovery
technique
System’s
traces
Event log
Precision:
Recall (Fitness):
=
5
6
=
5
5
A
B
C
D
E
F
I
G H
Ʈ
ABDEI
ACDGHFI
ACGDHFI
ACDHFI
ACHDFI
L
M
|𝑳 ∩ 𝑴|
|𝑳|
|𝑳 ∩ 𝑴|
|𝑴|
𝑴= 𝑳 ∪ ACGHDFI
6. Two Problems of Precision and Recall
Artem Polyvyanyy and Anna Kalenkova | Monotone Conformance Checking for Partially Matching Designed and Observed Processes | June 24-26, 2019, Aachen, Germany 6
Problem 1: Infinite number of traces Problem 2: Partially matching traces
A
B
C
D
E
F
I
G H
Trace
ABDEI
ACDGHFI
ACGDHFI
ACDHFI
ACHDFI
Process model 2
Precision:
Recall:
|𝑳 𝟏 ∩ 𝑴 𝟐|
|𝑴 𝟐|
=
𝟓
∞
Event log 1
|𝑳 𝟏 ∩ 𝑴 𝟐|
|𝑳 𝟏|
=
𝟓
𝟓
= 𝟏
Precision: Recall:
|𝑳 𝟐 ∩ 𝑴 𝟒|
|𝑴 𝟒|
= 𝟎
|𝑳 𝟐 ∩ 𝑴 𝟒|
|𝑳 𝟐|
= 𝟎
Process model 4
A
B C
D E
C B
Event log 2
ABBCE
7. Problem 1: Infinite Number of Traces
Artem Polyvyanyy and Anna Kalenkova | Monotone Conformance Checking for Partially Matching Designed and Observed Processes | June 24-26, 2019, Aachen, Germany 7
Solution: Estimate the number of traces using the notion of topological entropy [12]:
[12] Tullio Ceccherini-Silberstein et al.: On the entropy of regular languages. Theor. Comput. Sci. 307(1): 93-102 (2003)
𝑒𝑛𝑡 𝐵 ≔ lim
𝑛→∞
sup
𝑙𝑜𝑔 𝐶 𝑛(𝐵)
𝑛
𝐵 is an ergodic deterministic finite automaton;
𝐶 𝑛(𝐵) is the set of all traces of 𝐵 of length 𝑛;
𝑒𝑛𝑡 𝐵 is equal to a Perron-Frobenius eigenvalue
of the adjacency matrix of 𝐵;
𝑒𝑛𝑡• 𝐵 ≔ 𝑒𝑛𝑡( 𝐵), where 𝐵 is a short-circuit
version of 𝐵.
𝑒𝑛𝑡 𝐵 ≔ lim
𝑛→∞
sup
𝑙𝑜𝑔 𝐶 𝑛(𝐵)
𝑛
𝑒𝑛𝑡 𝐵 ≔ lim
𝑛→∞
sup
𝑙𝑜𝑔 𝐶 𝑛(𝐵)
𝑛
A
B C
D E
C B
χ
Short-circuit version of model 4
Precision [8]: Recall [8]:
𝑒𝑛𝑡•(𝐿 ∩ 𝑀)
𝑒𝑛𝑡•(𝑀)
𝑒𝑛𝑡•(𝐿 ∩ 𝑀)
𝑒𝑛𝑡•(𝐿)
[8] Artem Polyvyanyy, Andreas Solti, Matthias Weidlich, Claudio Di Ciccio, Jan Mendling: Monotone Precision and Recall Measures for Comparing Executions and Specifications of
Dynamic Systems. CoRR abs/1812.07334 (2018)
.
8. Entropy-Based Precision and Recall Examples
Artem Polyvyanyy and Anna Kalenkova | Monotone Conformance Checking for Partially Matching Designed and Observed Processes | June 24-26, 2019, Aachen, Germany 8
A 2
C
B A
B
A 2
C
B A
B
D
M1 M2
L1 L2 L3
AB
BA
𝜺
AB
BA
AC
BAB
BAB
BAD
L1
L2
L3
AB
BA
AC
BAB
BAD
BA(DA)+
M2
M1
ε
Magic numbers ?!
9. Properties of Conformance Checking Measures
Artem Polyvyanyy and Anna Kalenkova | Monotone Conformance Checking for Partially Matching Designed and Observed Processes | June 24-26, 2019, Aachen, Germany 9
Entropy-based precision
A1: Precision is deterministic;
A2:
A3: Precision between an event log and the flower
model (over the same alphabet) is the lowest;
A4: Precisions of an event log on two language
equivalent models should be equal;
A5:
prec 𝐿1, 𝑀1 ≤ prec(𝐿2, 𝑀1)
prec 𝐿1, 𝑀2 ≤ prec(𝐿1, 𝑀1)
[10] Niek Tax, Xixi Lu, Natalia Sidorova, Dirk Fahland, Wil M. P. van der Aalst: The imprecisions of precision measures in process mining. Inf. Process. Lett. 135: 1-8 (2018)
Precision properties [10]:
A5’ [8]:
A2’ [8]:
𝑒𝑛𝑡•(𝐿1∩𝑀1)
𝑒𝑛𝑡•(𝑀1)
<
𝑒𝑛𝑡•(𝐿2∩𝑀1)
𝑒𝑛𝑡•(𝑀1)
𝑒𝑛𝑡•(𝐿1∩𝑀2)
𝑒𝑛𝑡•(𝑀2)
<
𝑒𝑛𝑡•(𝐿1∩𝑀1)
𝑒𝑛𝑡•(𝑀1)
L1 M1 M2
L1 L2 M1 .
10. Entropy-Based Precision and Recall Revisited
Artem Polyvyanyy and Anna Kalenkova | Monotone Conformance Checking for Partially Matching Designed and Observed Processes | June 24-26, 2019, Aachen, Germany 10
A 2
C
B A
B
A 2
C
B A
B
D
M1 M2
L1 L2 L3
AB
BA
𝜺
AB
BA
AC
BAB
BAB
BAD
L1
L2
L3
AB
BA
AC
BAB
BAD
BA(DA)+
M2
M1
ε
A2’~A2’
11. Artem Polyvyanyy and Anna Kalenkova | Monotone Conformance Checking for Partially Matching Designed and Observed Processes | June 24-26, 2019, Aachen, Germany 11
Problem 2: Partially Matching Traces (I)
A C EB B
A
B C
D E
C B
M
L
Precision: Recall:
𝑒𝑛𝑡•(𝐿′ ∩ 𝑀′)
𝑒𝑛𝑡•(𝑀′)
𝑒𝑛𝑡•(𝐿′ ∩ 𝑀′)
𝑒𝑛𝑡•(𝐿′)
L
ABBCE
M
ABCDE
ACBDE
12. Artem Polyvyanyy and Anna Kalenkova | Monotone Conformance Checking for Partially Matching Designed and Observed Processes | June 24-26, 2019, Aachen, Germany 12
Problem 2: Partially Matching Traces (II)
A C EB B
τ
A
B C
D E
C B τ
M
L
Precision: Recall:
𝑒𝑛𝑡•(𝐿′ ∩ 𝑀′)
𝑒𝑛𝑡•(𝑀′)
𝑒𝑛𝑡•(𝐿′ ∩ 𝑀′)
𝑒𝑛𝑡•(𝐿′)
“Dilute” model and log traces to
identify commonalities:
ABCE
L
ABBCE
M
ABCDE
ACBDE
ACBE
13. Artem Polyvyanyy and Anna Kalenkova | Monotone Conformance Checking for Partially Matching Designed and Observed Processes | June 24-26, 2019, Aachen, Germany 13
Problem 2: Partially Matching Traces (III)
A C EB B
τ τ τττ
A
B C
D E
C B τ ττ
τ τ
τ τ
M’
L’
Precision: Recall:
𝑒𝑛𝑡•(𝐿′ ∩ 𝑀′)
𝑒𝑛𝑡•(𝑀′)
𝑒𝑛𝑡•(𝐿′ ∩ 𝑀′)
𝑒𝑛𝑡•(𝐿′)
“Dilute” model and log traces to
identify commonalities:
L’ M ’
L
ABBCE
M
ABCDE
ACBDE
ACBE
ABCE
ABC
ABE
ACE
BCE
AB...
BBCE
ABBE
ABBC
BBE
BBC
BB
ABB
ACB
CBE
CB
BCDE
ACDE
ABDE
ABCD
ACBD
...
14. Artem Polyvyanyy and Anna Kalenkova | Monotone Conformance Checking for Partially Matching Designed and Observed Processes | June 24-26, 2019, Aachen, Germany 14
Problem 2: Partially Matching Traces (IV)
A
B C
D E
C B τ ττ
τ τ
τ τ
χ
A C EB B
τ τ τττ
χ
Precisionτ: Recallτ:
𝑒𝑛𝑡•(𝐿′ ∩ 𝑀′)
𝑒𝑛𝑡•(𝑀′)
𝑒𝑛𝑡•(𝐿′ ∩ 𝑀′)
𝑒𝑛𝑡•(𝐿′)
“Dilute” model and log traces to
identify commonalities:
L’ M ’
L
ABBCE
M
ABCDE
ACBDE
ACBE
ABCE
ABC
ABE
ACE
BCE
AB...
BBCE
ABBE
ABBC
BBE
BBC
BB
ABB
ACB
CBE
CB
BCDE
ACDE
ABDE
ABCD
ACBD
...
M’
L’
15. Entropy-Based Precision and Recall Revisited
Artem Polyvyanyy and Anna Kalenkova | Monotone Conformance Checking for Partially Matching Designed and Observed Processes | June 24-26, 2019, Aachen, Germany 15
L1
L2
L3
AB
BA
AC
BAB
BAD
BA(DA)+
M2
M1
ε
16. A Selection of New Properties
Artem Polyvyanyy and Anna Kalenkova | Monotone Conformance Checking for Partially Matching Designed and Observed Processes | June 24-26, 2019, Aachen, Germany 16
L1
M1
L2
L1 L2
M1
L M
L’=M ’
𝑒𝑛𝑡•(𝐿′) ≤ 𝑒𝑛𝑡•(𝑀′)
Consequently, A2 and A5 hold.
𝑒𝑛𝑡• 𝐿′ < 𝑒𝑛𝑡•(𝑀′)
Consequently, A2’ and A5’ hold.
∃𝛼 ∈ 𝑀: 𝛼 ∉ 𝐿′ ∀𝛽 ∈ 𝐿: 𝛽 ∈ 𝑀′and
M ’
α
L M
L’
𝐿′ ⊂ 𝑀′ and
𝑒𝑛𝑡•(𝐿1 ∩ 𝑀1)
𝑒𝑛𝑡•(𝑀1)
𝑒𝑛𝑡•(𝐿2 ∩ 𝑀1)
𝑒𝑛𝑡•(𝑀1)
≤
𝑒𝑛𝑡•(𝐿1 ∩ 𝑀1)
𝑒𝑛𝑡•(𝑀1)
𝑒𝑛𝑡•(𝐿2 ∩ 𝑀1)
𝑒𝑛𝑡•(𝑀1)
<*
* For each k, there are at least as many words of length k in 𝐿2⋂ 𝑀1 as in 𝐿1⋂ 𝑀1, and
sometimes more.
** A possible outcome
T3: T5:
T7:T6:
17. Entropy-Based Precision and Recall Revisited
Artem Polyvyanyy and Anna Kalenkova | Monotone Conformance Checking for Partially Matching Designed and Observed Processes | June 24-26, 2019, Aachen, Germany 17
L1
L2
L3
AB
BA
AC
BAB
BAD
BA(DA)+
M2
M1
ε
T3&T7
T3&T7
T5
18. Experiment Using Synthetic Event Data [14,15]
Artem Polyvyanyy and Anna Kalenkova | Monotone Conformance Checking for Partially Matching Designed and Observed Processes | June 24-26, 2019, Aachen, Germany 18
Trace
ABDEI
ACDGHFI
ACGDHFI
ACDHFI
ACHDFI
Event log 1
1
2
3
19. Precision for Synthetic Event Data
Artem Polyvyanyy and Anna Kalenkova | Monotone Conformance Checking for Partially Matching Designed and Observed Processes | June 24-26, 2019, Aachen, Germany 19
2
3
1
A
B
C
E
F
I
G H
Ʈ
D
A
B
C
D
E
F
I
G H
Process model 2 Process model 3
𝐸𝐵 = 0.568
𝐸𝐵 𝜏
= 0.933
𝐸𝐵 = 0.758
𝐸𝐵 𝜏 = 0.970
Set Difference (SD);
Negative Events (NE);
Alignment-based ETC
precision (ETCa);
Projected conformance
checking (PCC);
Anti-alignment precision (AA);
k-order Markovian abstraction
(MAPk);
Entropy-based precision (EB);
EB with partial matching (EBτ).
20. Experiment Using Real-life Event Data
Artem Polyvyanyy and Anna Kalenkova | Monotone Conformance Checking for Partially Matching Designed and Observed Processes | June 24-26, 2019, Aachen, Germany 20
Analyzed 5 real-life event logs;
Filtered out infrequent events that appear less
than in 80% of traces using Filter Log using
Simple Heuristics ProM plugin;
From 596 to 11,636 traces;
From 1,403 to 164,144 events;
The Inductive miner was used to discover
models from event logs;
During determinization, automata sizes
increased up to ten times;
The experiment was conducted using the Intel
CoreTM i7-3970X CPU@3.50 GHz, 64 GB RAM;
Tool is publicly available at:
https://github.com/akalenkova/eigen-measure.
21. Contributions, Limitations, and Future Work
Artem Polyvyanyy and Anna Kalenkova | Monotone Conformance Checking for Partially Matching Designed and Observed Processes | June 24-26, 2019, Aachen, Germany 21
Contributions:
Extension of the entropy-based precision and recall measures to support partially matching traces;
Properties for conformance measures that address partially matching traces;
An evaluation that demonstrates the feasibility of using the new measures in industrial settings.
Limitations:
Efficiency due to the automata determinization step;
No support of process models that describe non-finite spaces of reachable states;
Frequencies of traces are ignored.
Future/current work:
Efficient computation of the entropy of an event log;
Entropy-based precision and recall for unbounded-state-systems;
Controlled “dilution” of traces, i.e., precision and recall over traces that differ in up to k events.