Process mining is a discipline that aims at discovering, monitoring, and improving real-life processes by extracting knowledge from event logs. Process discovery and conformance checking are the two main process mining tasks. Process discovery techniques learn a process model from example traces in an event log, whereas conformance checking compares the behavior observed in the event log with the modeled behavior. In this paper, we propose an approach based on temporal logic query checking, which sits between process discovery and conformance checking. It can be used to discover the LTL-based business rules that are valid in the log, by checking a (user-defined) class of rules against the log. The proposed approach is not limited to providing a boolean answer about the validity of a business rule in the log; it also provides diagnostics in terms of the traces in which the rule is satisfied (witnesses) and the traces in which the rule is violated (counterexamples). We have implemented our approach as a proof of concept and conducted extensive experiments using both synthetic and real-life logs.
Log-Based Understanding of Business Processes through Temporal Logic Query Checking
1. Log-Based Understanding
of Business Processes
through Temporal Logic Query Checking
Margus Räim, Claudio Di Ciccio, Fabrizio Maria Maggi, Massimo Mecella, and Jan Mendling
22nd International Conference on Cooperative Information Systems
Amantea, Italy
claudio.di.ciccio@wu.ac.at
20. Log-based understanding
of a process model
1. Which activities require another one to follow?
2. Which activities require another one to precede?
3. Which activities are mutually exclusive?
…
We want to leave the user free to specify the rules to be discovered
21. LTLf
Linear Temporal Logic (LTL) was originally a specification language for the execution of (endless) concurrent programs (Pnueli, 1977)
Syntax (let A be a propositional symbol):
Interpretation over infinite traces, i.e., infinite sequences of consecutive instants of time
LTLf formulae are meant to be interpreted over finite traces
Temporal operators: “Next”, “Until”, “Eventually”, “Always”
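For reference, the standard LTL grammar can be written as follows. This is the textbook formulation, assumed here to match the slide's syntax; the symbol \varphi for formulae is an illustrative choice, and “Eventually” and “Always” are given as the usual abbreviations in terms of “Until”.

```latex
\varphi ::= A \mid \neg\varphi \mid \varphi_1 \wedge \varphi_2
        \mid \bigcirc\varphi \mid \varphi_1 \,\mathcal{U}\, \varphi_2
\qquad
\Diamond\varphi \equiv \mathit{true} \,\mathcal{U}\, \varphi,
\qquad
\Box\varphi \equiv \neg\Diamond\neg\varphi
```

Here \bigcirc is “Next”, \mathcal{U} is “Until”, \Diamond is “Eventually”, and \Box is “Always”.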
22. Log-based understanding
1. Which activities require another one to follow?
2. Which activities require another one to precede?
3. Which activities are mutually exclusive?
23. Log-based understanding:
An example
A A B C A B C A C B C
C C C C C A A B C A A B A A B
A B B B D
B A B D
A B B D
C A B A A C C B B
B D A D B D
A B C A A B B C
D D D D D
C A A C C C A A B C B C C B D
24. Log-based understanding:
An example
A A B C A B C A C B C
C C C C C A A B C A A B A A B
A B B B D
B A B D
A B B D
C A B A A C C B B
B D A D B D
A B C A A B B C
D D D D D
C A A C C C A A B C B C C B
A always requires B to follow (10/10)
0 Counterexamples
10 Witnesses
25. Log-based understanding:
An example
A A B C A B C A C B C
C C C C C A A B C A A B A A B
A B B B D
B A B D
A B B D
C A B A A C C B B
B D A D B D
A B C A A B B C
D D D D D
C A A C C C A A B C B C C B
B does not always require A to precede (8/10)
2 Counterexamples
8 Witnesses
26. Log-based understanding:
An example
A A B C A B C A C B C
C C C C C A A B C A A B A A B
A B B B D
B A B D
A B B D
C A B A A C C B B
B D A D B D
A B C A A B B C
D D D D D
C A A C C C A A B C B C C B
C and D are mutually exclusive (10/10)
10 Witnesses
0 Counterexamples
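The three checks in this example can be reproduced with a minimal Python sketch. It is only an illustration of the diagnostics (witnesses and counterexamples), not the technique proposed in the paper: the function names are hypothetical, the rules are hard-coded rather than expressed in LTLf, and a trace counts as a witness whenever it satisfies the rule, including vacuously (e.g. D D D D D for "A requires B to follow").

```python
# Example log from the slides: one trace per string, events separated by spaces.
LOG = [t.split() for t in [
    "A A B C A B C A C B C",
    "C C C C C A A B C A A B A A B",
    "A B B B D",
    "B A B D",
    "A B B D",
    "C A B A A C C B B",
    "B D A D B D",
    "A B C A A B B C",
    "D D D D D",
    "C A A C C C A A B C B C C B",
]]


def requires_to_follow(trace, x, y):
    """Every occurrence of x is followed, later in the trace, by some y."""
    return all(y in trace[i + 1:] for i, e in enumerate(trace) if e == x)


def requires_to_precede(trace, x, y):
    """Every occurrence of x has some y earlier in the trace."""
    return all(y in trace[:i] for i, e in enumerate(trace) if e == x)


def mutually_exclusive(trace, x, y):
    """x and y never occur in the same trace."""
    return not (x in trace and y in trace)


def diagnose(rule, x, y, log=LOG):
    """Split the log into witnesses (rule holds) and counterexamples."""
    witnesses = [t for t in log if rule(t, x, y)]
    counterexamples = [t for t in log if not rule(t, x, y)]
    return witnesses, counterexamples


for rule, x, y in [
    (requires_to_follow, "A", "B"),
    (requires_to_precede, "B", "A"),
    (mutually_exclusive, "C", "D"),
]:
    w, c = diagnose(rule, x, y)
    print(f"{rule.__name__}({x}, {y}): "
          f"{len(w)} witnesses, {len(c)} counterexamples")
```

On this log the sketch yields 10/0, 8/2, and 10/0 witnesses/counterexamples, matching the counts on the three slides; the two counterexamples for the second rule are the traces that start with B.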
27. Log-based understanding:
LTLf query checking
1. Which activities require another one to follow?
2. Which activities require another one to precede?
3. Which activities are mutually exclusive?
Each query contains placeholders
Placeholders are meant to be assigned one of the activities in the log alphabet (in the example, A, B, C, or D)
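The idea of assigning placeholders can be sketched by brute force: try every assignment of two distinct activities to the placeholders and keep those for which the query holds in all traces. This is a naive illustration under stated assumptions, not the algorithm from the paper; the names `TEMPLATES`, `check_query`, and `valid_assignments` are hypothetical, and the log is the ten-trace example used on the previous slides.

```python
from itertools import permutations

LOG = [t.split() for t in [
    "A A B C A B C A C B C",
    "C C C C C A A B C A A B A A B",
    "A B B B D",
    "B A B D",
    "A B B D",
    "C A B A A C C B B",
    "B D A D B D",
    "A B C A A B B C",
    "D D D D D",
    "C A A C C C A A B C B C C B",
]]

# Each template takes a trace and the activities assigned to placeholders x, y.
TEMPLATES = {
    # every x is eventually followed by y
    "x requires y to follow": lambda t, x, y: all(
        y in t[i + 1:] for i, e in enumerate(t) if e == x),
    # every x has some earlier y
    "x requires y to precede": lambda t, x, y: all(
        y in t[:i] for i, e in enumerate(t) if e == x),
    # x and y never share a trace
    "x and y are mutually exclusive": lambda t, x, y: not (x in t and y in t),
}


def check_query(template, log):
    """Map each assignment (x, y) to the number of traces satisfying it."""
    alphabet = sorted({event for trace in log for event in trace})
    return {
        (x, y): sum(1 for trace in log if template(trace, x, y))
        for x, y in permutations(alphabet, 2)
    }


def valid_assignments(template, log):
    """Assignments for which the query holds in every trace of the log."""
    return [xy for xy, n in check_query(template, log).items() if n == len(log)]


for name, template in TEMPLATES.items():
    print(name, "->", valid_assignments(template, LOG))
```

The number of assignments grows with the alphabet size raised to the number of placeholders, which is one reason a naive enumeration like this is only suitable for small examples.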
29. Recap
An event log is given
The user wants to understand what went on there, to gain knowledge about the process behind the log
To this end, (s)he formulates queries, asking for activities that satisfy given conditions expressed as temporal constraints
Our technique aims at answering such queries
We take advantage of the fact that:
1. we know the finite set of activities of which the process consists,
2. the queries are formulated in a well-known formal language, and
3. …
31. Traces are finite
A A B C A B C A C B C ¶
C C C C C A A B C A A B A A B ¶
A B B B D ¶
B A B D ¶
A B B D ¶
C A B A A C C B B ¶
B D A D B D ¶
A B C A A B B C ¶
D D D D D ¶
C A A C C C A A B C B C C B D ¶
35. Divide et impera:
the query evaluation tree
The algorithm is designed to recursively call sub-procedures
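The recursive, divide-and-conquer flavor can be illustrated with a naive evaluator that walks the syntax tree of an LTLf formula and evaluates each sub-formula on a finite trace. This is a sketch under assumptions, not the query evaluation tree of the paper: the tuple-based formula encoding and the name `holds` are hypothetical, and no intermediate results are shared or cached.

```python
def holds(formula, trace, i=0):
    """Evaluate an LTLf formula at position i of a finite trace.

    Formulas are nested tuples, e.g. ("always", ("implies", ("atom", "A"),
    ("eventually", ("atom", "B")))). Each case recurses on the sub-formulas.
    """
    op = formula[0]
    if op == "atom":
        return i < len(trace) and trace[i] == formula[1]
    if op == "not":
        return not holds(formula[1], trace, i)
    if op == "and":
        return holds(formula[1], trace, i) and holds(formula[2], trace, i)
    if op == "implies":
        return (not holds(formula[1], trace, i)) or holds(formula[2], trace, i)
    if op == "next":
        # On finite traces, "next" is false at the last position.
        return i + 1 < len(trace) and holds(formula[1], trace, i + 1)
    if op == "eventually":
        return any(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "always":
        return all(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "until":
        # formula[2] holds at some j >= i, and formula[1] holds from i to j-1.
        return any(
            holds(formula[2], trace, j)
            and all(holds(formula[1], trace, k) for k in range(i, j))
            for j in range(i, len(trace))
        )
    raise ValueError(f"unknown operator: {op}")


# "A always requires B to follow", written as Always(A implies Eventually B).
response_a_b = ("always", ("implies", ("atom", "A"),
                           ("eventually", ("atom", "B"))))

print(holds(response_a_b, "A B B D".split()))
print(holds(response_a_b, "B A".split()))
```

The first call evaluates to True and the second to False, since in the second trace the final A has no later B. Re-evaluating sub-formulas at every position makes this sketch far slower than a purpose-built algorithm.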
36. Evaluation:
performance w.r.t. query
Default log:
100 traces of 10 events each, log alphabet of 10 activities
Windows 7 OS, Intel Core i7 CPU, 8 GB of main memory
Prototype implemented in C (https://github.com/r2im/pickaxe)
37. Evaluation:
performance w.r.t. query
Default log:
100 traces of 10 events each, log alphabet of 10 activities
Windows 7 OS, Intel Core i7 CPU, 8 GB of main memory
Prototype implemented in C (https://github.com/r2im/pickaxe)
39. Conclusions
What we saw:
A novel technique for the log-based understanding of a process model
More in the paper:
Formal definition of the folded temporal structure
The algorithm for answering LTLf queries
Proof of the theorem stating the soundness of the proposed algorithm
Experiments in detail
Future work:
Improve performance
Support user interaction for iteratively refining the query formulation
41. Log-Based Understanding
of Business Processes
through Temporal Logic Query Checking
Margus Räim, Claudio Di Ciccio, Fabrizio Maria Maggi, Massimo Mecella, and Jan Mendling
Extra
42. Verifying constraints on log
(state of the art)
Automaton transition labels: B|C|D, A|C|D, A, B
A A B C A B C A C B C
C C C C C A A B C A A B A A B
A B B B D
B A B D
A B B D
C A B A A C C B B
B D A D B D
A B C A A B B C
D D D D D
C A A C C C A A B C B C C B D
43. Verifying constraints on log
(state of the art)
Automaton transition labels: (1) B|C|D, A|C|D, A, B; (2) C|D, A|B|C|D, A, B, A|B|C|D
A A B C A B C A C B C
C C C C C A A B C A A B A A B
A B B B D
B A B D
A B B D
C A B A A C C B B
B D A D B D
A B C A A B B C
D D D D D
C A A C C C A A B C B C C B D
44. Verifying constraints on log
(state of the art)
Automaton transition labels: (1) B|C|D, A|C|D, A, B; (2) C|D, A|B|C|D, A, B, A|B|C|D; (3) C|D, A|C|D, B|C|D, A, B, A, B, A|B|C|D, A|B|C|D
Here we already know which activities
are meant to be constrained
A A B C A B C A C B C
C C C C C A A B C A A B A A B
A B B B D
B A B D
A B B D
C A B A A C C B B
B D A D B D
A B C A A B B C
D D D D D
C A A C C C A A B C B C C B D
45. Intuition
Replay turns out to be the best technique to
maintain the history in the current state, and
wait for future moves, which are unknown
Working with logs, we have an advantage…
Automaton transition labels: B|C|D, A|C|D, A, B
C C C C C A A B C A A B A A B C C A C A B A C B B A B C
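Replay on an automaton can be sketched as follows, assuming the four transition labels above describe a two-state automaton for "A requires B to follow": one state loops on B|C|D and moves on A, the other loops on A|C|D and moves back on B. That reading of the figure is an assumption, and the function name is hypothetical.

```python
def replay_response(trace, a, b):
    """Replay a trace on a two-state automaton for 'a requires b to follow'.

    The current state is all the history that is kept:
    waiting == False: no occurrence of a is pending (accepting state)
    waiting == True:  some a has occurred and no b has followed it yet
    """
    waiting = False
    for event in trace:
        if event == a:
            waiting = True      # a pending obligation is (re)opened
        elif event == b:
            waiting = False     # b discharges every pending a
        # any other activity leaves the state unchanged
    return not waiting


trace = "C C C C C A A B C A A B A A B C C A C A B A C B B A B C".split()
print(replay_response(trace, "A", "B"))
print(replay_response("A B A".split(), "A", "B"))
```

The first call returns True, because the last A in the trace is followed by a B; the second returns False, because the final A is never followed by B. During replay the verdict is only known once the trace ends, since future moves are unknown while reading.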
47. Verifying constraints on log
Automaton transition labels for the pair (A, B): [^A], [^B], A, B
A A B C A B C A C B C
C C C C C A A B C A A B A A B
A B B B D
B A B D
A B B D
C A B A A C C B B
B D A D B D
A B C A A B B C
D D D D D
C A A C C C A A B C B C C B D
Automaton transition labels for the other pairs (X, Y) shown, each of the form [^X], [^Y], X, Y: (A, C), (A, D), (B, A), (B, C), (B, D), (C, A), (C, B), (C, D), (D, A)
48. Verifying constraints on log
Automaton transition labels, one automaton per pair (X, Y), each of the form [^X], [^Y], X, Y: (A, B), (A, C), (A, D), (B, A), (B, C), (B, D), (C, A), (C, B), (C, D), (D, A)