Process Mining
Based on the
Internet of
Events
Wil van der Aalst
Scientific Director Data Science
Center Eindhoven (DSC/e)
Distinguished Univ. Professor
Eindhoven University of Technology
@wvdaalst vdaalst.com
February 2017, Düsseldorf
Data science is changing
any industry (including
manufacturing)!
Event data are
everywhere!
Agenda
• Setting the scene: uptake of data
science (in the Brainport region)
• Using event data: process mining as
a new type of spreadsheet
• Process mining tools and
applications
• Taking a step back: main challenges
in data science
From Analog to Digital
An example
World's earliest
surviving camera
photograph (1826)
Kodak box camera
developed by George
Eastman (1888)
First digital camera
by Steve Sasson
from Kodak (1975)
Sales of digital cameras
exceed those of analog
cameras (2003)
Release of first
iPhone (2007)
Release of iPad 2
(2011)
2.8 million apps in
Google Play and 2.2
million apps in Apple
App Store (2017)
analog
digital
Around 1800, Thomas Wedgwood attempted
to capture the image in a camera obscura by
means of a light-sensitive substance. The
earliest remaining photo dates from 1826.
1826
George Eastman founded Kodak
around 1890 and produced “The
Kodak” box camera, which sold
for $25 and made photography
accessible to a larger group of
people.
1888
In 1976, Kodak was responsible
for 90% of film sales and 85% of
camera sales in the United
States. Kodak developed the
first digital camera in 1975, i.e.,
at the peak of its success.
0.01 megapixel black
and white pictures
1975
In 2003, the sales of digital cameras
exceeded the sales of traditional
cameras for the first time. Kodak
and others could not adapt.
2003
Soon after their introduction,
smartphones with built-in
cameras overtook dedicated
cameras.
2007
The first iPad with a camera
(iPad 2) was presented by Steve
Jobs on March 2, 2011.
2011
Most photos are taken with mobile phones and tablets. Photos can be shared online (e.g., Flickr,
Instagram, Facebook, and Twitter), and this has changed the way we communicate and socialize. Smartphone
apps can detect eye cancer, melanoma, and other diseases by analyzing photos. A photo taken
with a smartphone may give rise to a wide range of events (e.g., sharing) with data attributes
(e.g., location) that reach far beyond the actual image.
Data explosion
Brainport
Europe's leading innovative top technology region
Internet of Events
Uptake of Data Science
data science
science, society, business, health, industry, mobility, government
Data & Processes
Process Mining: The missing link
Process Mining
Spreadsheets
for behavior
General Motors managers update a “spreadsheet” to keep track of
the flow of materials needed for wartime production (1941).
Spreadsheets
Killer App for early computers
VisiCalc
(killer app for Apple II, Oct. 1979)
Lotus 1-2-3
(killer app for IBM PC, 1983)
Followed by Microsoft Excel (1985).
Spreadsheet: Static data
fact, derived
31 items sold, total value, average, distribution
How to analyze operational processes?
case identifier, activity name, timestamp, resource
row = event
Event data
• Input: events (“things
that have happened”)
• Mandatory per event:
− case identifier
− activity name
− timestamp/date
• Optional
− resource
− transaction type
− costs
− …
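The attribute list above can be sketched as a small record type. This is only an illustration: the class name `Event`, the field names, and the sample rows are assumptions made for the example and do not come from the slides or from any tool.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Event:
    """One row of an event log: something that has happened."""
    # Mandatory per event
    case_id: str            # case identifier
    activity: str           # activity name
    timestamp: datetime     # timestamp/date
    # Optional attributes
    resource: Optional[str] = None
    transaction_type: Optional[str] = None
    cost: Optional[float] = None


# Hypothetical sample log (invented values, for illustration only).
event_log = [
    Event("order-1", "register", datetime(2017, 2, 1, 9, 0), resource="Ann"),
    Event("order-1", "check", datetime(2017, 2, 1, 9, 30), resource="Bob", cost=12.5),
    Event("order-2", "register", datetime(2017, 2, 1, 10, 0)),
]

for event in event_log:
    print(event.case_id, event.activity, event.timestamp.isoformat(), event.resource)
```

Keeping the three mandatory fields without defaults means an event cannot be created unless it has a case, an activity, and a time.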
Excel cannot deal with events and
analyze dynamic behavior
208 cases
5987 events
74 activities
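Counts such as cases, events, and activities, and the ordered behavior per case, can be derived from a flat event table in a few lines. The sketch below uses an invented five-event log, not the 208-case log shown on the slide. The function name `build_traces` is hypothetical.

```python
from collections import defaultdict

# Hypothetical event rows: (case identifier, activity name, ISO timestamp).
events = [
    ("c1", "register", "2017-02-01T09:00"),
    ("c2", "register", "2017-02-01T09:05"),
    ("c1", "check", "2017-02-01T09:30"),
    ("c2", "reject", "2017-02-01T09:40"),
    ("c1", "pay", "2017-02-01T10:00"),
]


def build_traces(events):
    """Group events by case and order each case's activities by timestamp."""
    by_case = defaultdict(list)
    for case_id, activity, timestamp in events:
        by_case[case_id].append((timestamp, activity))
    # ISO timestamps in the same format sort correctly as plain strings.
    return {case: [a for _, a in sorted(rows)] for case, rows in by_case.items()}


traces = build_traces(events)
summary = {
    "cases": len(traces),
    "events": len(events),
    "activities": len({activity for _, activity, _ in events}),
}
print(summary)
print(traces)
```

The per-case ordering is the "dynamic behavior" that a flat table of rows does not expose by itself.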
Exploring event data with ProM
batching for activities
“opstellen eindnota” (prepare final invoice) and
“archiveren” (archive)
Loesje van der Aalst
desire line
Process Discovery
NO
modeling
needed!
event data
process
model
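One very simple way to go from event data to a process model is to count which activity directly follows which. The sketch below builds such a directly-follows graph from invented traces. It is an illustration only, not the discovery algorithm used in ProM or in the talk, and the function name `directly_follows` is hypothetical.

```python
from collections import Counter

# Hypothetical traces: one ordered list of activity names per case.
traces = {
    "c1": ["register", "check", "pay"],
    "c2": ["register", "check", "reject"],
    "c3": ["register", "check", "pay"],
}


def directly_follows(traces):
    """Count how often activity b directly follows activity a in some case."""
    dfg = Counter()
    for trace in traces.values():
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg


dfg = directly_follows(traces)
for (a, b), count in sorted(dfg.items()):
    print(f"{a} -> {b}: {count}")
```

The resulting arcs and frequencies are derived purely from the observed events, with no hand-made model as input.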
Conformance Checking
desire line
very safe
system
Input: Event data and process model
(discovered or hand-made)
Question: Where do modeled and
observed behavior disagree?
• Which cases deviate?
(including being late)
• Why do they deviate?
• How do they deviate?
• What do they have in common?
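The first and third questions above (which cases deviate, and how) can be illustrated with a deliberately simplified check. Here the normative model is assumed to be just a set of allowed "directly follows" steps, and every step of a trace is tested against it. Real conformance checking techniques such as alignments are more involved. The model, the traces, and the names `allowed` and `deviations` are all invented for this sketch.

```python
START, END = "<start>", "<end>"

# Hypothetical normative model: the only allowed directly-follows steps.
allowed = {
    (START, "register"),
    ("register", "check"),
    ("check", "pay"),
    ("pay", END),
}

# Hypothetical observed behavior: one ordered list of activities per case.
traces = {
    "c1": ["register", "check", "pay"],
    "c2": ["register", "pay"],                    # "check" was skipped
    "c3": ["register", "check", "check", "pay"],  # "check" was repeated
}


def deviations(trace, allowed):
    """Return the steps of a trace that the model does not allow."""
    path = [START, *trace, END]
    return [(a, b) for a, b in zip(path, path[1:]) if (a, b) not in allowed]


report = {case: deviations(trace, allowed) for case, trace in traces.items()}
deviating = {case: steps for case, steps in report.items() if steps}
print(deviating)
```

Explaining why cases deviate, or what they have in common, would need the other event attributes (resource, costs, and so on), which this sketch leaves out.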
Assume this is the normative model
“happy flows”
Deviations
skipped 15 times skipped 12 times skipped 16 times
21 activities were
executed that should not
have happened here
shows the cases
that skipped the
1st inspection
(activity 070)
shows the cases that
executed activity 030
(address) or 070 (inspection)
at the wrong point in time
Performance
NO
modeling
needed!
Animation
NO
modeling
needed!
real cases
Demo?
ProM: Open Source
1500+ plug-ins available covering the
whole process mining spectrum
>130k downloads
Commercial process mining tools
• 25+ software vendors sell
software based on our
algorithms, ideas, etc.
• Several focused process
mining companies, e.g.,
Celonis, Fluxicon,
ProcessGold, Minit,
and myInvenio
Overview of tools (incomplete)
Short name | Full name of tool | Version | Vendor | Webpage | XES support | Academic program (available online) | Webpage academic program
Celonis | Celonis Process Mining | 4 | Celonis GmbH | www.celonis.de | yes | yes | www.celonis.de/en/company/academic-alliance
Disco | Disco | 1.9.5 | Fluxicon | www.fluxicon.com | yes | yes | fluxicon.com/academic/
EDS | Enterprise Discovery Suite | 4 | StereoLOGIC Ltd | www.stereologic.com | no | |
Fujitsu | Interstage Business Process Manager Analytics | 12.2 | Fujitsu Ltd | www.fujitsu.com | no | |
Icaro | Icaro EVERFlow | 1 | Icaro Tech | www.icarotech.com | no | |
Icris | Icris Process Mining Factory | 1 | Icris | www.processminingfactory.com | yes | no |
LANA | LANA Process Mining | 1 | Lana Labs | www.lana-labs.com | not yet | no |
Minit | Minit | 1 | Gradient ECM | www.minitlabs.com | yes | no |
myInvenio | myInvenio | 1 | Cognitive Technology | www.my-invenio.com | yes | yes | www.my-invenio.com/myinvenio-academic-alliance/
Perceptive | Perceptive Process Mining | 2.7 | Lexmark | www.lexmark.com | no | no |
ProcessGold | ProcessGold Enterprise Platform | 8 | Processgold International B.V. | www.processgold.com | yes | not yet |
ProM | ProM | 6.6 | Open Source hosted at TU/e | www.promtools.org | yes | open source | www.promtools.org
ProM Lite | ProM Lite | 1.1 | Open Source hosted at TU/e | www.promtools.org | yes | open source | www.promtools.org
QPR | QPR ProcessAnalyzer | 2015.5 | QPR | www.qpr.com | yes | no |
RapidProM | RapidProM | 4.0.0 | Open Source hosted at TU/e | www.rapidprom.org | yes | open source | www.rapidprom.org
Rialto | Rialto Process | 1.5 | Exeura | www.exeura.eu | yes | yes | www.exeura.eu/en/products/rialto-process/
Signavio | Signavio Process Intelligence | | Signavio | www.signavio.com | | |
SNP | SNP Business Process Analysis | 15.27 | SNP Schneider-Neureither & Partner AG | www.snp-bpa.com | yes | no |
PPM | webMethods Process Performance Manager | 9.9 | Software AG | www.softwareag.com | no | |
Worksoft | Worksoft Analyze & Process Mining for SAP | | Worksoft | www.worksoft.com | | |
Ecosystem
challenges
ideas
new techniques
and approaches
data
We have applied
process mining in over
200 organizations
(hospitals,
municipalities,
governments,
universities, logistics,
manufacturing,
insurance companies,
banks, etc.)
Applications
data-driven
process centric
performance
compliance
always the same questions …
Let’s take a step back
The data science pipeline
infrastructure (“volume and velocity”)
o networks & sensors
o distributed systems (e.g. Hadoop)
o databases (NoSQL)
o programming (MapReduce)
o security
o ...
analysis (“extracting knowledge”)
o statistics
o data/process mining
o machine learning
o operations research
o algorithms
o visualization
o ...
effect (“people, organizations, society”)
o ethics & privacy
o human technology interaction
o operations management
o business models
o entrepreneurship
o ...
Challenge: Making things scalable & instant
Challenge: Providing answers to known and unknown unknowns
Challenge: Doing all of this in a responsible manner!
www.responsibledatascience.org
Fairness
Accuracy
Confidentiality
Transparency
Value
All three
challenges also
apply to process
mining
But there is no reason
not to start today!
The Internet of Things will not shut up, so let’s
try to use it effectively and responsibly!
Wil van der Aalst
vdaalst.com
processmining.org
@wvdaalst
Learn more?
• www.vdaalst.com
• www.processmining.org
• www.coursera.org/learn/process-mining
• www.promtools.org
• www.springer.com/9783662498507
• www.win.tue.nl/ieeetfpm/
• www.tue.nl/dsce/
• @wvdaalst
