Slides supporting the book "Process Mining: Discovery, Conformance, and Enhancement of Business Processes" by Wil van der Aalst. See also http://springer.com/978-3-642-19344-6 (ISBN 978-3-642-19344-6) and the website http://www.processmining.org/book/start providing sample logs.
2. Overview
Chapter 1: Introduction
Part I: Preliminaries
Chapter 2: Process Modeling and Analysis
Chapter 3: Data Mining
Part II: From Event Logs to Process Models
Chapter 4: Getting the Data
Chapter 5: Process Discovery: An Introduction
Chapter 6: Advanced Process Discovery Techniques
Part III: Beyond Process Discovery
Chapter 7: Conformance Checking
Chapter 8: Mining Additional Perspectives
Chapter 9: Operational Support
Part IV: Putting Process Mining to Work
Chapter 10: Tool Support
Chapter 11: Analyzing “Lasagna Processes”
Chapter 12: Analyzing “Spaghetti Processes”
Part V: Reflection
Chapter 13: Cartography and Navigation
Chapter 14: Epilogue
3. Conformance checking
[Figure: process mining overview. A software system supports/controls the “world” (business processes, people, machines, components, organizations) and records events, e.g., messages, transactions, etc., in event logs. A (process) model specifies, configures, implements, and analyzes. Event logs and models are connected through discovery, conformance, and enhancement.]
5. Context
• Corporate governance, risk, compliance, and
legislation such as the Sarbanes-Oxley Act (US), Basel
II/III (EU), J-SOX (Japan), C-SOX (Canada), 8th EU
Directive (EURO-SOX), BilMoG (Germany), MiFID
(EU), Law 262/05 (Italy), Code Lippens (Belgium), and
Code Tabaksblat (Netherlands).
• ISO 9001:2008 requires organizations to model their
operational processes.
• Business alignment: make sure that the information
systems and the real business processes are well
aligned.
6. Auditing
• The term auditing refers to the evaluation of
organizations and their processes.
• Audits are performed to ascertain the validity and
reliability of information about these organizations
and associated processes.
• This is done to check whether business processes
are executed within certain boundaries set by
managers, governments, and other stakeholders.
• Obviously, process mining can help to detect fraud,
malpractice, risks, and inefficiencies.
• All events in a business process can be evaluated
and this can be done while the process is still
running.
7. Deviations?
• Is the model or the log “wrong”?
• “Desirable” or “undesirable” deviations?
• “Breaking the glass” may save lives!
8. Replay: Connecting events to model elements is essential for process mining
• Play-In: event log → process model
• Play-Out: process model → event log
• Replay: event log + process model → extended model showing times, frequencies, etc.; diagnostics; predictions; recommendations
9. Play Out (Classical use of models)
[Figure: Petri net with places start, p1, p2, p3, p4, end and transitions A, B, C, D, E. Playing out the model generates traces such as ABCD, ACBD, and AED.]
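Play-Out can be made concrete with a few lines of code. The sketch below enumerates every complete firing sequence of a small Petri net. The net structure (which places each transition consumes from and produces into) is an assumption reconstructed from the place and transition names on the slide; the function name `play_out` is illustrative only.

```python
# Assumed structure of the example net: transition -> (input places, output places).
NET = {
    "A": ({"start"}, {"p1", "p2"}),
    "B": ({"p1"}, {"p3"}),
    "C": ({"p2"}, {"p4"}),
    "E": ({"p1", "p2"}, {"p3", "p4"}),
    "D": ({"p3", "p4"}, {"end"}),
}

def play_out(net, marking=frozenset({"start"}), final=frozenset({"end"}), prefix=""):
    """Return every complete firing sequence of a safe, acyclic net."""
    if marking == final:
        return {prefix}
    traces = set()
    for name, (inputs, outputs) in net.items():
        if inputs <= marking:  # the transition is enabled
            next_marking = (marking - inputs) | outputs
            traces |= play_out(net, next_marking, final, prefix + name)
    return traces

traces = sorted(play_out(NET))
print(traces)  # ['ABCD', 'ACBD', 'AED']
```

The recursion only terminates because this net has no loops; a model with a loop would need a bound on trace length.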
10. Play In (Process Discovery)
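[Figure: an event log containing the traces ABCD, ACBD, AED, ACBD, AED, ABCD, … is fed to a process discovery algorithm like the α algorithm, which produces the Petri net with places start, p1, p2, p3, p4, end and transitions A, B, C, D, E.]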
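As a rough illustration of Play-In, the sketch below computes the ordering relations that the α algorithm derives from a log before it constructs any places: causality (->), parallelism (||), and choice (#). It covers only this first step, not the full algorithm, and the function name `footprint` is an illustrative choice.

```python
from itertools import product

# The traces listed on the slide, one string per case.
LOG = ["ABCD", "ACBD", "AED", "ACBD", "AED", "ABCD"]

def footprint(log):
    """Classify each ordered activity pair as '->', '<-', '||' or '#'."""
    activities = sorted(set("".join(log)))
    # (x, y) is in `follows` if x is directly followed by y in some trace.
    follows = {(t[i], t[i + 1]) for t in log for i in range(len(t) - 1)}
    table = {}
    for x, y in product(activities, repeat=2):
        if (x, y) in follows and (y, x) in follows:
            table[x, y] = "||"   # both orders observed: parallel
        elif (x, y) in follows:
            table[x, y] = "->"   # x causes y
        elif (y, x) in follows:
            table[x, y] = "<-"   # y causes x
        else:
            table[x, y] = "#"    # never adjacent: choice
    return table

fp = footprint(LOG)
print(fp["A", "B"], fp["B", "C"], fp["B", "E"])  # -> || #
```

B and C are classified as parallel because both BC and CB occur, while B and E never follow each other directly, which is what lets a discovery algorithm place E as an alternative to the B/C branch.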
11. Replay
[Figure: the trace ABCD is replayed on the Petri net with places start, p1, p2, p3, p4, end and transitions A, B, C, D, E.]
12. Replay can detect problems
[Figure: replaying the trace ACD on the same Petri net reveals two problems: a token is left behind and a token is missing.]
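The two problems can be reproduced with a minimal token replay. The sketch below fires each event of the trace, adds an artificial token whenever an input place is empty, and reports which places had missing tokens and which still hold tokens at the end. The net structure is an assumption reconstructed from the slide, and `replay` is an illustrative name.

```python
from collections import Counter

# Assumed structure of the example net: transition -> (input places, output places).
NET = {
    "A": ({"start"}, {"p1", "p2"}),
    "B": ({"p1"}, {"p3"}),
    "C": ({"p2"}, {"p4"}),
    "E": ({"p1", "p2"}, {"p3", "p4"}),
    "D": ({"p3", "p4"}, {"end"}),
}

def replay(net, trace, initial="start", final="end"):
    """Replay a trace; return (missing, remaining) token counts per place."""
    marking = Counter({initial: 1})
    missing = Counter()
    for activity in trace:
        inputs, outputs = net[activity]
        for place in inputs:
            if marking[place] == 0:
                missing[place] += 1   # force the firing with an artificial token
                marking[place] += 1
            marking[place] -= 1
        for place in outputs:
            marking[place] += 1
    # The environment consumes the token from the final place.
    if marking[final] == 0:
        missing[final] += 1
    else:
        marking[final] -= 1
    remaining = +marking              # unary plus drops places with zero tokens
    return missing, remaining

missing, remaining = replay(NET, "ACD")
print(dict(missing), dict(remaining))  # {'p3': 1} {'p1': 1}
```

Under the assumed structure, skipping B leaves its input token in p1 and forces D to fire without a token in p3.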
13. Replay can extract timing information
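[Figure: the timed trace A5 B8 C9 D13 (each activity followed by its timestamp) is replayed on the Petri net; places and transitions are annotated with the timing values obtained during replay.]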
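One simple kind of timing information is how long a token waits in each place. The sketch below replays the timed trace from the slide and records, per place, the time between the event that produced the token and the event that consumed it. It assumes the net structure shown in the code and a trace that fits the model; `token_waiting_times` is an illustrative name.

```python
# Assumed structure of the example net: transition -> (input places, output places).
NET = {
    "A": ({"start"}, {"p1", "p2"}),
    "B": ({"p1"}, {"p3"}),
    "C": ({"p2"}, {"p4"}),
    "E": ({"p1", "p2"}, {"p3", "p4"}),
    "D": ({"p3", "p4"}, {"end"}),
}

def token_waiting_times(net, timed_trace):
    """Return {place: time a token spent there} for one fitting trace."""
    produced_at = {}
    waits = {}
    for activity, time in timed_trace:
        inputs, outputs = net[activity]
        for place in inputs:
            if place in produced_at:  # the initial token has no production time
                waits[place] = time - produced_at.pop(place)
        for place in outputs:
            produced_at[place] = time
    return waits

waits = token_waiting_times(NET, [("A", 5), ("B", 8), ("C", 9), ("D", 13)])
print(sorted(waits.items()))  # [('p1', 3), ('p2', 4), ('p3', 5), ('p4', 4)]
```

Collecting these values over many cases would give the averages, minima, and maxima that an extended model can display.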
14. Let us now focus on conformance checking based on Replay
[Figure: the Play-In, Play-Out, and Replay diagram of slide 8, repeated.]
15. Four models, one log
[Figure: four Petri nets, N1 to N4, modeling the same process. Transition labels: a = register request, b = examine thoroughly, c = examine casually, d = check ticket, e = decide, f = reinitiate request, g = pay compensation, h = reject request.]
• N1: places start, p1, p2, p3, p4, p5, end; transitions a to h.
• N2: places start, p1, p2, p3, p4, end; transitions a to h.
• N3: places start, p1, p2, p3, p4, p5, end; transitions a, c, d, e, h only.
• N4: places start, p1, end; transitions a to h.
17. Replaying (1/3): σ1 on N1
[Figure: snapshots of N1 during replay, annotated with the counters p (produced), c (consumed), m (missing), and r (remaining tokens).]
p=0, c=0, m=0, r=0
p=1, c=0, m=0, r=0
p=3, c=1, m=0, r=0
18. Replaying (2/3)
[Figure: snapshots of N1 during replay of σ1, continued.]
p=4, c=2, m=0, r=0
p=5, c=3, m=0, r=0
19. Replaying (3/3)
[Figure: final snapshots of N1 during replay of σ1.]
p=6, c=5, m=0, r=0
p=7, c=6, m=0, r=0
p=7, c=7, m=0, r=0
No problems found!
20. Replaying (1/3): σ3 on N2
[Figure: snapshots of N2 during replay, annotated with the counters p, c, m, r.]
p=0, c=0, m=0, r=0
p=1, c=0, m=0, r=0
p=2, c=1, m=0, r=0
21. Replaying (2/3)
[Figure: snapshots of N2 during replay of σ3, continued; the missing token is marked m.]
p=3, c=2, m=1, r=0
p=4, c=3, m=1, r=0
22. Replaying (3/3)
[Figure: final snapshots of N2 during replay of σ3; the missing token is marked m and the remaining token r.]
p=5, c=4, m=1, r=0
p=6, c=5, m=1, r=0
p=6, c=6, m=1, r=1
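The counter values on these slides can be reproduced step by step. The sketch below rests on two assumptions that the extracted text does not state: that N2 is the sequential net encoded in `N2`, and that σ3 is the trace a, d, c, e, h. Both are consistent with the counters shown, but they are reconstructions. The generator name `replay_counters` is illustrative.

```python
from collections import Counter

# Assumed structure of N2: transition -> (input places, output places).
N2 = {
    "a": (["start"], ["p1"]),
    "b": (["p1"], ["p2"]),
    "c": (["p1"], ["p2"]),
    "d": (["p2"], ["p3"]),
    "e": (["p3"], ["p4"]),
    "f": (["p4"], ["p1"]),
    "g": (["p4"], ["end"]),
    "h": (["p4"], ["end"]),
}

def replay_counters(net, trace):
    """Yield (p, c, m, r) after each replay step."""
    marking = Counter({"start": 1})   # the environment produces the initial token
    p, c, m = 1, 0, 0
    yield p, c, m, 0
    for activity in trace:
        inputs, outputs = net[activity]
        for place in inputs:
            if marking[place] == 0:
                m += 1                # token is missing: add it artificially
                marking[place] += 1
            marking[place] -= 1
            c += 1
        for place in outputs:
            marking[place] += 1
            p += 1
        yield p, c, m, 0
    # The environment consumes the token from the final place.
    if marking["end"] == 0:
        m += 1
    else:
        marking["end"] -= 1
    c += 1
    yield p, c, m, sum(marking.values())  # r is counted only at the end

steps = list(replay_counters(N2, "adceh"))  # assumed sigma3
for step in steps:
    print("p=%d c=%d m=%d r=%d" % step)
```

The last tuple is (6, 6, 1, 1), matching the final snapshot: d fires before any token reaches p2 (one missing token), and the token that c later puts into p2 is never consumed (one remaining token).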
23. Problems encountered when replaying σ3 on N2
[Figure: N2 after replaying σ3, with final counters p=6, c=6, m=1, r=1; the missing token is marked m and the remaining token r.]
• One missing token (of 6 consumed tokens)
• One remaining token (of 6 produced tokens)
24. Computing fitness at trace level
[Figure: N2 after replaying σ3, with final counters p=6, c=6, m=1, r=1.]
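The formula itself did not survive on this slide. In the accompanying book, trace-level fitness is defined from the four replay counters as below; applying it to the counters of σ3 on N2 (p = 6, c = 6, m = 1, r = 1) gives the worked value shown.

```latex
\[
  \mathit{fitness}(\sigma, N)
    = \frac{1}{2}\left(1 - \frac{m}{c}\right)
    + \frac{1}{2}\left(1 - \frac{r}{p}\right)
\]
\[
  \mathit{fitness}(\sigma_3, N_2)
    = \frac{1}{2}\left(1 - \frac{1}{6}\right)
    + \frac{1}{2}\left(1 - \frac{1}{6}\right)
    = \frac{5}{6} \approx 0.8333
\]
```

The first term penalizes missing tokens relative to consumed tokens, the second penalizes remaining tokens relative to produced tokens; a perfectly fitting trace has m = r = 0 and fitness 1.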
25. Replaying (1/3): σ2 on N3
[Figure: snapshots of N3 during replay, annotated with the counters p, c, m, r.]
p=0, c=0, m=0, r=0
p=1, c=0, m=0, r=0
p=3, c=1, m=0, r=0
26. Replaying (2/3)
[Figure: snapshots of N3 during replay of σ2, continued; the missing token is marked m.]
p=4, c=2, m=0, r=0
p=5, c=4, m=1, r=0
27. Replaying (3/3)
[Figure: final snapshot of N3 after replaying σ2; missing tokens are marked m and remaining tokens r.]
p = 5, c = 5, m = 2, and r = 2
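With the final counters in hand, trace-level fitness is a one-line computation. The sketch below uses the token-based fitness definition from the accompanying book, fitness = ½(1 − m/c) + ½(1 − r/p), and applies it to the three sets of final counters shown on the replay slides. The function name `trace_fitness` is illustrative.

```python
def trace_fitness(p, c, m, r):
    """Token-based fitness of one trace from its replay counters."""
    return 0.5 * (1 - m / c) + 0.5 * (1 - r / p)

fit_s2_n3 = trace_fitness(p=5, c=5, m=2, r=2)   # sigma2 on N3
fit_s3_n2 = trace_fitness(p=6, c=6, m=1, r=1)   # sigma3 on N2
fit_s1_n1 = trace_fitness(p=7, c=7, m=0, r=0)   # sigma1 on N1

print(round(fit_s2_n3, 4))  # 0.6
print(round(fit_s3_n2, 4))  # 0.8333
print(round(fit_s1_n1, 4))  # 1.0
```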
29. Example values
[Figure: the four models N1 to N4 of slide 15, repeated.]
30. Diagnostics
[Figure: N2 annotated with token counts after replaying the event log. The annotations include the counts 566, 971, 1537, 461, 1391, 930, and 146; place p2 is marked +443 and -443.]
• Problem: 443 tokens remain in place p2, because d did not occur although the model expected d to happen.
• Problem: 443 tokens were missing in place p2 during replay, because d happened even though this was not possible according to the model.
31. Diagnostics
[Figure: N3 annotated with token counts after replaying the event log. Place p1 is marked +430 and -10, p2 is marked -146, p3 is marked -566, p5 is marked +607, and end is marked -461.]
• Problem: 430 tokens remain in place p1, because c did not happen while the model expected c to happen.
• Problem: 10 tokens were missing in place p1 during replay, because c happened while this was not possible according to the model.
• Problem: 146 tokens were missing in place p2 during replay, because d happened while this was not possible according to the model.
• Problem: 566 tokens were missing in place p3 during replay, because e happened while this was not possible according to the model.
• Problem: 607 tokens remain in place p5, because h did not happen while the model expected h to happen.
• Problem: 461 of the 1391 cases did not reach place end.
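The per-place counts in these diagnostics are sums over all replayed cases. The accompanying book aggregates the same counters into a single log-level fitness value: produced, consumed, missing, and remaining tokens are summed over all traces, weighted by how often each trace occurs, before the trace-level formula is applied. The sketch below illustrates this; the variant frequencies and counters in `VARIANTS` are hypothetical and are not taken from the slides.

```python
def log_fitness(variants):
    """variants: iterable of (frequency, p, c, m, r), one entry per trace variant."""
    total_p = sum(n * p for n, p, c, m, r in variants)
    total_c = sum(n * c for n, p, c, m, r in variants)
    total_m = sum(n * m for n, p, c, m, r in variants)
    total_r = sum(n * r for n, p, c, m, r in variants)
    return 0.5 * (1 - total_m / total_c) + 0.5 * (1 - total_r / total_p)

# Hypothetical log: 10 cases replay perfectly, 2 cases each have
# one missing and one remaining token.
VARIANTS = [
    (10, 7, 7, 0, 0),
    (2, 6, 6, 1, 1),
]

fitness = log_fitness(VARIANTS)
print(round(fitness, 4))  # 0.9756
```

Because the sums are weighted by frequency, a deviation in a rare trace variant lowers the log-level fitness much less than the same deviation in a frequent one.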
36. Diagnostics (x:y where x is in log or N1 and y in N2)
[Figure: models N1 and N2 shown side by side.]
37. Checking Declare specifications
[Figure: Declare specification over the activities a (register request), e (decide), g (pay compensation), and h (reject request), using the constraints branched response, precedence, and non co-existence.]
See Declare and LTL checker in ProM
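Declare constraints can be checked directly on traces, without replaying a Petri net. The sketch below implements the three constraint types named on the slide. Which activities each constraint connects is an assumption here (non co-existence of g and h, precedence of e before g and before h, branched response from a to g or h); the extracted figure does not show the arcs. This is not the ProM checker, only a plain illustration of the semantics.

```python
def non_coexistence(trace, x, y):
    """x and y never both occur in the same trace."""
    return not (x in trace and y in trace)

def precedence(trace, x, y):
    """Every y is preceded by at least one earlier x."""
    seen_x = False
    for activity in trace:
        if activity == x:
            seen_x = True
        elif activity == y and not seen_x:
            return False
    return True

def branched_response(trace, x, targets):
    """Every x is eventually followed by at least one activity in targets."""
    pending = False
    for activity in trace:
        if activity == x:
            pending = True
        elif activity in targets:
            pending = False
    return not pending

def violations(trace):
    """Return the names of the (assumed) constraints that the trace violates."""
    checks = {
        "non co-existence(g, h)": non_coexistence(trace, "g", "h"),
        "precedence(e, g)": precedence(trace, "e", "g"),
        "precedence(e, h)": precedence(trace, "e", "h"),
        "branched response(a, {g, h})": branched_response(trace, "a", {"g", "h"}),
    }
    return [name for name, ok in checks.items() if not ok]

print(violations("acdeh"))   # []
print(violations("acdegh"))  # ['non co-existence(g, h)']
print(violations("abdh"))    # ['precedence(e, h)']
print(violations("acde"))    # ['branched response(a, {g, h})']
```

Unlike a Petri net, this specification allows any behavior that no constraint forbids, so a trace such as "aeh" with no examination at all is accepted.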
38. Connecting event log and model
[Figure: class diagram with three levels connected by one-to-many relations. Model level: process and activity. Instance level: case and activity instance. Event level: event, with attributes such as timestamp, resource, transaction, costs, ...]
• Very important!
• Model may be discovered or hand-made.
• Connected during replay.
• Starting point for other types of process mining!
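The three levels can be sketched as plain data classes. The class and field names below follow the labels in the diagram, but the exact relations and the helper `events_per_activity` are assumptions made for illustration; the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Event:                      # event level
    transaction: str              # e.g. "start" or "complete"
    timestamp: Optional[int] = None
    resource: Optional[str] = None
    costs: Optional[float] = None

@dataclass
class ActivityInstance:           # instance level: one execution of an activity
    activity: str                 # the model-level activity it refers to
    events: list = field(default_factory=list)

@dataclass
class Case:                       # instance level
    case_id: str
    instances: list = field(default_factory=list)

@dataclass
class Process:                    # model level
    name: str
    activities: set = field(default_factory=set)
    cases: list = field(default_factory=list)

def events_per_activity(process):
    """Connect every recorded event to the model-level activity it belongs to."""
    index = {activity: [] for activity in process.activities}
    for case in process.cases:
        for instance in case.instances:
            index.setdefault(instance.activity, []).extend(instance.events)
    return index

# Hypothetical example: one case in which activity "a" starts and completes.
case = Case("case-1", [
    ActivityInstance("a", [Event("start", 5, "Pete"), Event("complete", 8, "Pete")]),
])
process = Process("requests", activities={"a", "b"}, cases=[case])
index = events_per_activity(process)
print(len(index["a"]), len(index["b"]))  # 2 0
```

Once events are indexed by activity, attributes such as timestamps or resources can be projected onto the corresponding model element, which is what the other types of process mining build on.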