Slides supporting the book "Process Mining: Discovery, Conformance, and Enhancement of Business Processes" by Wil van der Aalst. See also http://springer.com/978-3-642-19344-6 (ISBN 978-3-642-19344-6) and the website http://www.processmining.org/book/start providing sample logs.
2. Overview
Chapter 1: Introduction
Part I: Preliminaries
• Chapter 2: Process Modeling and Analysis
• Chapter 3: Data Mining
Part II: From Event Logs to Process Models
• Chapter 4: Getting the Data
• Chapter 5: Process Discovery: An Introduction
• Chapter 6: Advanced Process Discovery Techniques
Part III: Beyond Process Discovery
• Chapter 7: Conformance Checking
• Chapter 8: Mining Additional Perspectives
• Chapter 9: Operational Support
Part IV: Putting Process Mining to Work
• Chapter 10: Tool Support
• Chapter 11: Analyzing “Lasagna Processes”
• Chapter 12: Analyzing “Spaghetti Processes”
Part V: Reflection
• Chapter 13: Cartography and Navigation
• Chapter 14: Epilogue
3. Goal of process mining
• What really happened in the past?
• Why did it happen?
• What is likely to happen in the future?
• When and why do organizations and people deviate?
• How to control a process better?
• How to redesign a process to improve its
performance?
4. Getting the data
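[Figure: positioning of process mining. The “world” (business processes, people, machines, components, organizations) is supported and controlled by a software system. The software system records events, e.g., messages, transactions, etc., in event logs. A (process) model models and analyzes the “world” and specifies, configures, implements, and analyzes the software system. Event logs and (process) models are related through discovery, conformance, and enhancement.]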
5. From heterogeneous data sources to process mining results
[Figure: workflow from data sources to results. Data from several heterogeneous data sources is, optionally, brought into a data warehouse through Extract, Transform, and Load (ETL). Coarse-grained scoping: an extract step produces unfiltered event logs in XES, MXML, or a similar format. Fine-grained scoping: a filter step produces filtered event logs. Process mining (discovery, conformance, enhancement) turns the filtered event logs into (process) models and answers.]
6. Example log
• A process consists of
cases.
• A case consists of
events such that each
event relates to precisely
one case.
• Events within a case are
ordered.
• Events can have
attributes.
• Examples of typical
attribute names are
activity, time, costs, and
resource.
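The structure described above can be sketched as a small data model. This is a minimal illustration, not a format from the book: the names `Event` and `group_by_case` are hypothetical, and the sample values are taken from the example log shown on the next slide.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Event:
    case_id: int   # each event relates to precisely one case
    event_id: int
    attributes: dict = field(default_factory=dict)  # e.g., activity, time, costs, resource


def group_by_case(events):
    """Return {case_id: [events]}, keeping the order in which events were given."""
    cases = defaultdict(list)
    for event in events:
        cases[event.case_id].append(event)
    return dict(cases)


events = [
    Event(1, 35654423, {"activity": "register request", "resource": "Pete", "costs": 50}),
    Event(1, 35654427, {"activity": "reject request", "resource": "Pete", "costs": 200}),
    Event(2, 35654483, {"activity": "register request", "resource": "Mike", "costs": 50}),
]
log = group_by_case(events)  # the "process" is the collection of cases
```

Keeping attributes in a dictionary reflects that events can carry arbitrary attributes; only the case reference is mandatory in this sketch.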
7. Another view
Hierarchy: process, cases, events, attributes
case 1
• event 35654423: activity = register request, time = 30-12-2010:11.02, resource = Pete, costs = 50
• event 35654424: ...
• ...
• event 35654427: activity = reject request, time = 07-01-2011:14.24, resource = Pete, costs = 200
case 2
• event 35654483: activity = register request, time = 30-12-2010:11.32, resource = Mike, costs = 50
• event 35654485: ...
• ...
• event 35654489: activity = pay compensation, time = 08-01-2011:12.05, resource = Ellen, costs = 200
case 3
• event 35654521: activity = register request, time = 30-12-2010:11.32, resource = Pete, costs = 50
• event 35654522: ...
• ...
• event 35654533: activity = pay compensation, time = 15-01-2011:10.45, resource = Ellen, costs = 200
...
11. Using attributes
[Figure: a process model annotated using event attributes. Control flow: from start to end via places c1 to c5, with activities a (register request), b (examine thoroughly), c (examine casually), d (check ticket), e (decide), f (reinitiate request), g (pay compensation), and h (reject request). Activities are labeled with roles (A, E, M). Social network showing how work flows from one person to another: Pete, Mike, Ellen, Sara, Sue, Sean.]
Performance indicators per activity:
• Activity b: Frequency: 456; Waiting time: 15.6 +/- 2.5 hours; Service time: 1.2 +/- 0.5 hours; Costs: 412 +/- 55 euros
• Activity g: Frequency: 311; Waiting time: 12.4 +/- 2.1 hours; Service time: 0.5 +/- 0.2 hours; Costs: 198 +/- 35 euros
• Activity h: Frequency: 407; Waiting time: 7.4 +/- 1.8 hours; Service time: 1.1 +/- 0.3 hours; Costs: 209 +/- 38 euros
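Indicators of this kind can be derived from the costs attribute of the events. The sketch below is only an illustration: the function name `cost_indicators` is hypothetical, the "+/-" value is assumed to be a standard deviation (the slide does not say), and the input uses the costs of the events shown in the example log rather than the figures above.

```python
from collections import defaultdict
from statistics import mean, pstdev


def cost_indicators(events):
    """Map activity -> (frequency, mean cost, population std. dev. of cost)."""
    costs = defaultdict(list)
    for activity, cost in events:
        costs[activity].append(cost)
    return {a: (len(c), mean(c), pstdev(c)) for a, c in costs.items()}


# (activity, costs) pairs taken from the visible events of cases 1 to 3.
events = [
    ("register request", 50),
    ("reject request", 200),
    ("register request", 50),
    ("pay compensation", 200),
    ("register request", 50),
    ("pay compensation", 200),
]
indicators = cost_indicators(events)
```

Waiting and service times would need start and completion timestamps per activity, which this sketch does not model.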
12. XES (eXtensible Event Stream)
• See www.xes-standard.org.
• Adopted by the IEEE Task Force on Process Mining.
• Predecessors: MXML and SA-MXML.
• The format is supported by tools such as ProM (as of
version 6), Nitro, XESame, and OpenXES.
• ProMimport supports MXML.
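To give an impression of the format, the following sketch reads a tiny hand-written XES-like fragment with Python's standard XML parser. The fragment is an assumption made for illustration: it uses the standard attribute keys concept:name, org:resource, time:timestamp, and lifecycle:transition, but omits the extension and classifier declarations (and any XML namespace) that real files usually contain. The seconds and time zone in the timestamp are invented, and `read_traces` is a hypothetical helper, not part of OpenXES or ProM.

```python
import xml.etree.ElementTree as ET

XES_SAMPLE = """<log xes.version="1.0">
  <trace>
    <string key="concept:name" value="1"/>
    <event>
      <string key="concept:name" value="register request"/>
      <string key="org:resource" value="Pete"/>
      <date key="time:timestamp" value="2010-12-30T11:02:00.000+01:00"/>
      <string key="lifecycle:transition" value="complete"/>
    </event>
  </trace>
</log>"""


def read_traces(xes_text):
    """Return a list of (trace name, [event attribute dicts])."""
    root = ET.fromstring(xes_text)
    traces = []
    for trace in root.findall("trace"):
        name = None
        for child in trace:
            # Trace-level attributes are the children that are not events.
            if child.tag != "event" and child.get("key") == "concept:name":
                name = child.get("value")
        events = [
            {attr.get("key"): attr.get("value") for attr in event}
            for event in trace.findall("event")
        ]
        traces.append((name, events))
    return traces


traces = read_traces(XES_SAMPLE)
```

All values are kept as strings here; a real reader would convert them according to the element type (string, date, int, and so on).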
15. [Figure: annotated XES fragment showing the log header and the start of a trace. Annotations: extensions loaded; every trace has a name; every event has a name and a transition; classifier = name + transition; start of trace (i.e., process instance); name of trace; resource; timestamp; name of event (activity name); transition.]
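16. [Figure: annotated XES fragment showing the end of one trace and the start of the next. Annotations: end of trace (i.e., process instance); start of trace; name of trace; resource; timestamp; name of event (activity name); data associated to event.]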
17. Challenges when extracting event logs
• Correlation: Events in an event log are grouped per
case. This simple requirement can be quite challenging
as it requires event correlation, i.e., events need to be
related to each other.
• Timestamps: Events need to be ordered per case.
Typical problems: only dates, different clocks, delayed
logging.
• Snapshots: Cases may have a lifetime extending beyond
the recorded period, e.g., a case was started before the
beginning of the event log.
• Scoping: How to decide which tables to incorporate?
• Granularity: The events in the event log are at a different
level of granularity than the activities relevant for end
users.
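The timestamp challenge can be made concrete with a small sketch that orders events per case. It assumes the timestamp notation used in the example log on these slides (day-month-year:hour.minute); the tuple layout and the name `order_events` are hypothetical.

```python
from datetime import datetime

TIME_FORMAT = "%d-%m-%Y:%H.%M"  # notation used on the slides, e.g., 30-12-2010:11.02


def order_events(events):
    """Sort (case_id, activity, time_string) tuples by case, then by timestamp."""
    return sorted(
        events,
        key=lambda e: (e[0], datetime.strptime(e[2], TIME_FORMAT)),
    )


raw = [
    (1, "reject request", "07-01-2011:14.24"),
    (2, "register request", "30-12-2010:11.32"),
    (1, "register request", "30-12-2010:11.02"),
]
ordered = order_events(raw)
```

Python's sort is stable, so events of one case with identical timestamps (for example, when only dates are recorded) keep their original relative order. That order may not be the real one, which is exactly the problem named above.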
20. Order instance
[Figure: class diagram of the data set.]
• Order: OrderID : OrderID, Customer : CustID, Amount : Euro, Created : DateTime, Paid : DateTime, Completed : DateTime
• Orderline: OrderLineID : OrderLineID, OrderID : OrderID, Product : ProdID, NofItems : PosInt, TotalWeight : Weight, Entered : DateTime, BackOrdered : DateTime, Secured : DateTime, DelID : DelID
• Delivery: DelID : DelID, DelAddress : Address, Contact : PhoneNo
• Attempt: DelID : DelID, When : DateTime, Successful : Bool
• Associations: Order 1 to 1..* Orderline; Orderline 1..* to 0..1 Delivery; Delivery 1 to 0..* Attempt
With the order as the instance notion, all events of Order:91245, of its order lines (112345, 112346, 112347), and of its deliveries (882345, 882346) and their delivery attempts get Case id: 91245.
21. Order:91245
All events below have Case id: 91245. Attributes carried by every event of an object are listed once in parentheses.
Order:91245 (Customer: John, Amount: 100)
• create order, Timestamp: 28-11-2011:08.12
• pay order, Timestamp: 02-12-2011:13.45
• complete order, Timestamp: 05-12-2011:11.33
OrderLine:112345 (Product: iPhone 4G, NofItems: 1, TotalWeight: 0.250, DelID: 882345)
• enter order line, Timestamp: 28-11-2011:08.13
• secure order line, Timestamp: 28-11-2011:08.55
OrderLine:112346 (Product: iPod nano, NofItems: 2, TotalWeight: 0.300, DelID: 882346)
• enter order line, Timestamp: 28-11-2011:08.14
• create backorder, Timestamp: 28-11-2011:08.55
• secure order line, Timestamp: 30-11-2011:09.06
OrderLine:112347 (Product: iPod classic, NofItems: 1, TotalWeight: 0.200, DelID: 882345)
• enter order line, Timestamp: 28-11-2011:08.15
• secure order line, Timestamp: 29-11-2011:10.06
Delivery:882345 (DelAddress: 5513VJ-22a, Contact: 0497-2553660)
• Attempt:882345-1: delivery attempt, Timestamp: 05-12-2011:08.55, Successful: false
• Attempt:882345-2: delivery attempt, Timestamp: 06-12-2011:09.12, Successful: false
• Attempt:882345-3: delivery attempt, Timestamp: 07-12-2011:08.56, Successful: true
Delivery:882346 (DelAddress: 5513XG-45, Contact: 040-2298761)
• Attempt:882346-1: delivery attempt, Timestamp: 05-12-2011:08.43, Successful: true
22. Orderline instance
[Figure: class diagram with Order 1 to 1..* Orderline, Orderline 1..* to 0..1 Delivery, and Delivery 1 to 0..* Attempt, now with the order line as the instance notion.]
All events below have Case id: 112345. Attributes carried by every event of an object are listed once in parentheses.
OrderLine:112345 (Product: iPhone 4G, NofItems: 1, TotalWeight: 0.250, DelID: 882345)
• enter order line, Timestamp: 28-11-2011:08.13
• secure order line, Timestamp: 28-11-2011:08.55
Order:91245 (Customer: John, Amount: 100)
• create order, Timestamp: 28-11-2011:08.12
• pay order, Timestamp: 02-12-2011:13.45
• complete order, Timestamp: 05-12-2011:11.33
Delivery:882345 (DelAddress: 5513VJ-22a, Contact: 0497-2553660)
• Attempt:882345-1: delivery attempt, Timestamp: 05-12-2011:08.55, Successful: false
• Attempt:882345-2: delivery attempt, Timestamp: 06-12-2011:09.12, Successful: false
• Attempt:882345-3: delivery attempt, Timestamp: 07-12-2011:08.56, Successful: true
23. Other examples
• The life cycles of reviewers, authors, papers,
reviews, PC chairs, etc.
• The life cycles of job applications and vacancies.
• X-ray machine logs: is the instance a machine, a machine day, a patient, a treatment, a routine, etc.?
• Therefore, selection and scoping of instances are needed.
• This is like deciding which elements to put on a map; there may be many maps covering partially overlapping areas.
24. Extracting event logs
• Not just a syntactical issue.
• Different views are possible.
• Important:
− Selecting the right instance notion.
− Ordering of events.
− Selection of events.
• Proclets: the true fabric of real-life processes.
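The effect of the instance notion can be sketched with a few events from the order example. The code below is an illustration under stated assumptions: the relation tables, the tuple layout, and the function names are hypothetical, and only a subset of the events from the slides is included.

```python
# Relations taken from the order example: order line -> order, order line -> delivery.
ORDER_OF_LINE = {112345: 91245, 112346: 91245, 112347: 91245}
DELIVERY_OF_LINE = {112345: 882345, 112346: 882346, 112347: 882345}

# (activity, timestamp, object type, object id), a subset of the events on the slides.
EVENTS = [
    ("create order", "28-11-2011:08.12", "order", 91245),
    ("enter order line", "28-11-2011:08.13", "orderline", 112345),
    ("enter order line", "28-11-2011:08.14", "orderline", 112346),
    ("secure order line", "28-11-2011:08.55", "orderline", 112345),
    ("delivery attempt", "05-12-2011:08.43", "delivery", 882346),
    ("delivery attempt", "05-12-2011:08.55", "delivery", 882345),
]


def events_for_order(order_id):
    """Order as instance: events of the order, its order lines, and their deliveries."""
    lines = {line for line, order in ORDER_OF_LINE.items() if order == order_id}
    related = {("order", order_id)}
    related |= {("orderline", line) for line in lines}
    related |= {("delivery", DELIVERY_OF_LINE[line]) for line in lines}
    return [e for e in EVENTS if (e[2], e[3]) in related]


def events_for_order_line(line_id):
    """Order line as instance: events of the line, its order, and its delivery."""
    related = {
        ("orderline", line_id),
        ("order", ORDER_OF_LINE[line_id]),
        ("delivery", DELIVERY_OF_LINE[line_id]),
    }
    return [e for e in EVENTS if (e[2], e[3]) in related]


by_order = events_for_order(91245)
by_line = events_for_order_line(112345)
```

The same recorded events yield a different case depending on the chosen instance: the order view contains all six events, while the order-line view leaves out the events of the other order line and the other delivery.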