Process Mining
Data Science in Action
Wil van der Aalst
www.vdaalst.com @wvdaalst
www.processmining.org
… , but data science is here to stay!
Data Science Center Eindhoven
http://www.tue.nl/dsce/
©Wil van der Aalst & TU/e (use only with permission & acknowledgements)
DSC/e: Competences and Research Programs
28 groups and 420+ people involved
Data Science Flagship (Philips & DSC/e)
4 Strategic topics
• Data Driven Value Propositions
• Healthcare Smart Maintenance
• Optimizing Healthcare Workflows
• Continuous Personal Health
4 TU/e departments
16 PhD students
30 Data science specialists
“Data Science University” in Den Bosch
Process Mining: On the interface between process science and data science
As generic as a spreadsheet!
Spreadsheet: Killer App for early computers
• VisiCalc (killer app for Apple II, Oct. 1979)
• Lotus 1-2-3 (killer app for IBM PC, 1983)
• Microsoft Excel (1985)
Spreadsheet: Static data
(Spreadsheet screenshot; annotations: fact, derived, 31 items sold, total value, average, distribution)
How to analyze operational processes?
Process Mining: Spreadsheet for behavior
• Input: events (“things that have happened”)
• Mandatory per event:
− case identifier
− activity name
− timestamp/date
• Optional
− resource
− transaction type
− costs
− …
(Event log screenshot; column labels: case identifier, activity name, timestamp, resource)
row = event
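The "row = event" idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Event` type, the `group_into_traces` helper, and the sample rows are hypothetical and do not come from the talk or from any process mining tool.

```python
from collections import defaultdict
from datetime import datetime
from typing import NamedTuple, Optional


class Event(NamedTuple):
    """One row of the event log (hypothetical type for illustration)."""
    case_id: str                    # mandatory
    activity: str                   # mandatory
    timestamp: datetime             # mandatory
    resource: Optional[str] = None  # optional attribute


def group_into_traces(events):
    """Group events by case and order each case by timestamp."""
    cases = defaultdict(list)
    for event in events:
        cases[event.case_id].append(event)
    return {
        case_id: [e.activity for e in sorted(evts, key=lambda e: e.timestamp)]
        for case_id, evts in cases.items()
    }


# Made-up rows, for illustration only.
log = [
    Event("c1", "register", datetime(2015, 1, 5, 9, 0), "Ann"),
    Event("c2", "register", datetime(2015, 1, 5, 9, 30), "Ann"),
    Event("c1", "check", datetime(2015, 1, 5, 10, 15), "Bob"),
    Event("c1", "pay", datetime(2015, 1, 6, 8, 45)),
    Event("c2", "pay", datetime(2015, 1, 5, 11, 0), "Bob"),
]

traces = group_into_traces(log)
summary = {
    "cases": len(traces),
    "events": len(log),
    "activities": len({e.activity for e in log}),
}
print(traces)
print(summary)
```

The three counts in `summary` are the same kind of figures a tool reports for a loaded log (number of cases, events, and activities).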
Process Mining: Spreadsheet for behavior
208 cases
5987 events
74 activities
Process Mining: Spreadsheet for behavior
batching for activities “opstellen eindnota” (prepare final invoice) and “archiveren” (archive)
Loesje van der Aalst
desire line
Process Discovery
Process Mining: Spreadsheet for behavior
process discovery
NO modeling needed!
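One simple way to see how a model can be derived from events alone is a directly-follows graph: count how often one activity is immediately followed by another within a case. The sketch below is an assumption-laden illustration, not the discovery algorithm used in the talk; the function names, the `min_count` threshold, and the traces are made up.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often activity a is directly followed by activity b."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def filter_edges(counts, min_count):
    """Keep only frequent edges (hypothetical way to simplify the graph)."""
    return {edge: n for edge, n in counts.items() if n >= min_count}


# Made-up traces, one list of activities per case.
example_traces = [
    ["register", "check", "pay"],
    ["register", "check", "pay"],
    ["register", "pay"],
]

dfg = directly_follows(example_traces)
frequent = filter_edges(dfg, min_count=2)
print(dict(dfg))
print(frequent)
```

Raising `min_count` hides infrequent paths, which is one plausible way to move from a detailed model to a simpler one.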
Models shown with 74, 11, and 3 activities
event data
process model
Conformance Checking
desire line
very safe system
Conformance Checking
Process Mining: Spreadsheet for behavior
conformance checking
?
discovered or hand-made
Process Mining: Spreadsheet for behavior
conformance checking
fitness of 93.5%
Process Mining: Spreadsheet for behavior
conformance checking
final inspection is skipped 40 times
Process Mining: Spreadsheet for behavior
conformance checking
move on model (something should have happened, but did not)
move on log (something happened that should not happen)
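The two kinds of moves can be illustrated by aligning one observed trace with one allowed activity sequence. This is a simplified sketch under assumptions: real conformance checking aligns a trace against all behavior a model allows, whereas here the "model" is a single hypothetical sequence `model_run`, every non-matching move costs 1, and the activity names are made up.

```python
def align(trace, model_run):
    """Align a trace with one model run, minimizing non-synchronous moves."""
    n, m = len(trace), len(model_run)
    # cost[i][j]: fewest log/model moves needed for trace[i:] and model_run[j:]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n and j == m:
                continue
            options = []
            if i < n:
                options.append(cost[i + 1][j] + 1)   # move on log
            if j < m:
                options.append(cost[i][j + 1] + 1)   # move on model
            if i < n and j < m and trace[i] == model_run[j]:
                options.append(cost[i + 1][j + 1])   # synchronous move
            cost[i][j] = min(options)

    # Walk through the table to recover one optimal sequence of moves.
    moves, i, j = [], 0, 0
    while i < n or j < m:
        if (i < n and j < m and trace[i] == model_run[j]
                and cost[i][j] == cost[i + 1][j + 1]):
            moves.append(("sync", trace[i]))
            i, j = i + 1, j + 1
        elif j < m and cost[i][j] == cost[i][j + 1] + 1:
            moves.append(("move on model", model_run[j]))
            j += 1
        else:
            moves.append(("move on log", trace[i]))
            i += 1
    return moves


# Made-up example: the inspection is skipped and an unexpected call occurs.
model_run = ["register", "check", "final inspection", "pay"]
trace = ["register", "check", "call customer", "pay"]

moves = align(trace, model_run)
for kind, activity in moves:
    print(f"{kind:14} {activity}")
```

Counting the "move on model" entries for a given activity across all cases gives statements of the form "final inspection is skipped N times".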
Process Mining: Spreadsheet for behavior
performance analysis
average flowtime is 1.92 months
bottleneck
NO modeling needed!
Process Mining: Spreadsheet for behavior
performance analysis
waiting time of 15.74 days
NO modeling needed!
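Flow time and waiting time follow directly from the timestamps. The sketch below is an illustration under assumptions: the events are made up, the function names are hypothetical, and because each event carries a single timestamp, the gap between two consecutive events is only an approximation of waiting time (it also includes any service time).

```python
from datetime import datetime
from statistics import mean

SECONDS_PER_DAY = 86400

# Made-up events: (case id, activity, timestamp).
events = [
    ("c1", "register", datetime(2015, 1, 5, 9, 0)),
    ("c1", "check", datetime(2015, 1, 7, 9, 0)),
    ("c1", "pay", datetime(2015, 1, 8, 9, 0)),
    ("c2", "register", datetime(2015, 1, 5, 12, 0)),
    ("c2", "check", datetime(2015, 1, 9, 12, 0)),
    ("c2", "pay", datetime(2015, 1, 10, 12, 0)),
]


def flow_times_in_days(events):
    """Time between the first and last event of each case."""
    first, last = {}, {}
    for case_id, _, ts in events:
        first[case_id] = min(ts, first.get(case_id, ts))
        last[case_id] = max(ts, last.get(case_id, ts))
    return {c: (last[c] - first[c]).total_seconds() / SECONDS_PER_DAY
            for c in first}


def mean_gap_in_days(events):
    """Mean elapsed time per pair of directly-following activities."""
    by_case = {}
    for case_id, activity, ts in sorted(events, key=lambda e: e[2]):
        by_case.setdefault(case_id, []).append((activity, ts))
    gaps = {}
    for trace in by_case.values():
        for (a, t1), (b, t2) in zip(trace, trace[1:]):
            gaps.setdefault((a, b), []).append(
                (t2 - t1).total_seconds() / SECONDS_PER_DAY)
    return {pair: mean(values) for pair, values in gaps.items()}


flow = flow_times_in_days(events)
waits = mean_gap_in_days(events)
bottleneck = max(waits, key=waits.get)
print("average flow time (days):", mean(flow.values()))
print("slowest step:", bottleneck, waits[bottleneck])
```

Picking the pair with the largest mean gap is one simple, hedged notion of a bottleneck; tools typically offer richer views.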
Process Mining: Spreadsheet for behavior
animating reality
real cases
NO modeling needed!
Process Mining: Spreadsheet for behavior
16 cases are queueing
animating reality
Deviations
Where? Why?
time, costs, …
How to get started?
• Event Data
• Process Mining Tools
• Data Science Mindset
Starting point for process mining: Event data
patient  activity           timestamp        doctor      age  cost
5781     make X-ray         23-1-2014@10.30  Dr. Jones   45     70.00
5541     blood test         23-1-2014@10.18  Dr. Scott   61     40.00
5833     blood test         23-1-2014@10.27  Dr. Scott   24     40.00
5781     blood test         23-1-2014@10.49  Dr. Scott   45     40.00
5781     CT scan            23-1-2014@11.10  Dr. Fox     45   1200.00
5833     surgery            23-1-2014@12.34  Dr. Scott   24   2300.00
5781     handle payment     23-1-2014@12.41  Carol Hope  45      0.00
5541     radiation therapy  23-1-2014@13.57  Dr. Jones   61    140.00
5541     radiation therapy  23-1-2014@13.08  Dr. Jones   61    140.00
…        …                  …                …           …    …
Column roles: patient = case id; activity = activity name; timestamp = timestamp; doctor = resource; age and cost = other data
Such data is everywhere (databases, ERP/CRM/HIS/… systems, transaction logs, messaging, social media, etc.)
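Turning rows like these into ordered cases mostly comes down to parsing the timestamp. The sketch below uses the rows from the slide; the format string assumes the timestamps are day-month-year followed by hour.minute, which is an interpretation rather than something the slide states, and `parse_timestamp` is a hypothetical helper.

```python
from datetime import datetime

# Rows from the slide: (patient, activity, timestamp, doctor).
rows = [
    ("5781", "make X-ray", "23-1-2014@10.30", "Dr. Jones"),
    ("5541", "blood test", "23-1-2014@10.18", "Dr. Scott"),
    ("5833", "blood test", "23-1-2014@10.27", "Dr. Scott"),
    ("5781", "blood test", "23-1-2014@10.49", "Dr. Scott"),
    ("5781", "CT scan", "23-1-2014@11.10", "Dr. Fox"),
    ("5833", "surgery", "23-1-2014@12.34", "Dr. Scott"),
    ("5781", "handle payment", "23-1-2014@12.41", "Carol Hope"),
    ("5541", "radiation therapy", "23-1-2014@13.57", "Dr. Jones"),
    ("5541", "radiation therapy", "23-1-2014@13.08", "Dr. Jones"),
]


def parse_timestamp(text):
    """Parse 'day-month-year@hour.minute' (assumed format)."""
    return datetime.strptime(text, "%d-%m-%Y@%H.%M")


# Sort by time so that each patient's activities end up in chronological
# order, even where the rows themselves are not (see the last two rows).
ordered = sorted(rows, key=lambda row: parse_timestamp(row[2]))

patient_traces = {}
for patient, activity, _, _ in ordered:
    patient_traces.setdefault(patient, []).append(activity)

for patient, activities in patient_traces.items():
    print(patient, "->", ", ".join(activities))
```

Here the patient column plays the role of the case id, so each resulting list is the sequence of activities for one patient.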
How to get started?
• Event Data
• Process Mining Tools
• Data Science Mindset
Process Mining Software
900+ plug-ins available covering the whole process mining spectrum
How to get started?
• Event Data
• Process Mining Tools
• Data Science Mindset
Process Mining
Data Science in Action
43,000 + 25,000 people joined!
Starts again on October 7th, 2015!
Register via https://www.coursera.org/course/procmin
Conclusion
http://www.tue.nl/dsce/
Get started today!
spreadsheet for behavior
data-oriented analysis (data mining, machine learning, business intelligence)
process model analysis (simulation, verification, optimization, gaming, etc.)
performance-oriented questions, problems and solutions
compliance-oriented questions, problems and solutions

Big Data Expo 2015 - Data Science Center Eindhoven