This document discusses process mining and predictive process monitoring. It begins with an overview of offline process mining techniques such as process discovery, conformance checking, and deviance mining. It then discusses applying these techniques online for predictive process monitoring, i.e., predicting outcomes, deviations, or failures of running cases. Various techniques are presented, such as nearest-neighbor classification of partial traces and clustering traces before classification. The goal is to accurately predict outcomes during process execution based on control flow, data attributes, and textual case data.
9. Conformance Checking with Trace Alignment
[Figure: trace alignment of the event log against the process model — each row is one aligned trace over activities A-K, with deviations highlighted]
Legend:
• Activity occurs in the log only, but occurs in the model in another path
• Activity occurs in the model only and is not observed anywhere in the log
• Activity occurs in the model only, but occurs in the log in another trace
• Activity occurs both in the model and the log
11. Conformance Checking with Behavioral Alignment
Desired conformance output:
• task C is optional in the log
• the cycle involving tasks I, G, D, F is not observed in the log
Log traces:
ABCDEH
ACBDEH
ABCDFH
ACBDFH
ABDEH
ABDFH
L. Garcia-Banuelos, N.R. van Beest, M. Dumas, M. La Rosa, W. Mertens. Complete and Interpretable Conformance Checking of Business Processes. IEEE Transactions on Software Engineering, in press.
12. Deviance Mining
Given two logs, find the differences and root causes for variation or deviance between the two logs.
[Figure: models mined from the "simple claims and quick" vs. "simple claims and slow" sub-logs]
S. Suriadi et al.: Understanding Process Behaviours in a Large Insurance Company in Australia: A Case Study. CAiSE 2013
13. Deviance Mining via Sequence Classification
• Apply discriminative sequence mining methods to extract
features characteristic of one class
• Build classification models (e.g. decision trees)
• Extract difference diagnostics from classification model
C. Sun et al. Mining explicit rules for software process evaluation. ICSSP’2013.
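The discriminative-sequence-mining step above can be sketched as follows (an illustrative toy implementation, not the exact method of Sun et al.): mine activity n-grams whose support differs strongly between the two sub-logs; such patterns can then serve as features for a decision-tree learner.

```python
from collections import Counter

def ngrams(trace, n=2):
    """All contiguous activity n-grams of a trace."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]

def discriminative_patterns(normal, deviant, n=2, min_gap=0.3):
    """Return n-grams whose support (fraction of traces containing the
    pattern) differs between the two classes by at least `min_gap`."""
    def support(log):
        c = Counter()
        for trace in log:
            for g in set(ngrams(trace, n)):
                c[g] += 1
        return {g: k / len(log) for g, k in c.items()}
    s_norm, s_dev = support(normal), support(deviant)
    patterns = {}
    for g in set(s_norm) | set(s_dev):
        gap = s_dev.get(g, 0.0) - s_norm.get(g, 0.0)
        if abs(gap) >= min_gap:
            patterns[g] = gap  # > 0: characteristic of the deviant class
    return patterns

# Toy logs: deviant cases repeat activity B
normal = [list("ABCD"), list("ABD"), list("ABCD")]
deviant = [list("ABBCD"), list("ABBBD"), list("ABBD")]
pats = discriminative_patterns(normal, deviant)
print(pats)
```

Here the bigram ('B', 'B') comes out with the maximum gap of 1.0: it occurs in every deviant trace and in no normal one.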
14. Log Delta Analysis
[Pipeline: event log → PESL; input model → unfold → PESM; merge → Partially Synchronized Product (PSP) → compare → extract differences → difference statements]
N.R. van Beest, L. Garcia-Banuelos, M. Dumas, M. La Rosa, Log Delta Analysis: Interpretable Differencing of Business Process Event Logs.
BPM 2015: 386-405
15. Sequence classification vs. log delta analysis
Datasets: L1 (short stay), 448 cases, 7,329 events; L2 (long stay), 363 cases, 7,496 events.
Sequence classification: 106-130 statements, e.g.
• IF |"Nursing Progress Notes"| > 7.5 THEN L1
• IF |"Nursing Progress Notes"| ≤ 7.5 AND |"Nursing Assessment"| > 1.5 THEN L2
• …
Log delta analysis: 48 statements, e.g.
• In L1, "Nursing Primary Assessment" is repeated after "Medical Assign" and "Triage Request", while in L2 it is not
• …
N.R. van Beest, L. Garcia-Banuelos, M. Dumas, M. La Rosa, Log Delta Analysis: Interpretable Differencing of Business Process Event Logs.
BPM 2015: 386-405
16. Apromore Process Analytics Platform
(apromore.org)
Open-source, highly scalable, SaaS BPM analytics platform
M. La Rosa, H. Reijers, W. van der Aalst, R. Dijkman, J. Mendling, M. Dumas, L. Garcia-Banuelos. "APROMORE: An Advanced Process Model Repository". Expert Systems with Applications, 2011.
17. Beyond Deviance Mining: Predictive Process Monitoring
How likely is it that a running process will become "deviant"?
• Will it end up in a negative outcome?
• Will it fail to meet its SLAs in the next 24 hours?
• Will it generate abnormal effort, costs or rework?
19. Predictive Monitoring Example: Debt Recovery Process
[Figure: normal trace — Debt repayment due → Call the debtor → Send a reminder → Payment received]
20. Predictive Monitoring Example: Debt Recovery Process (cont.)
[Figure: deviant traces with many repeated "Call the debtor", "Send a reminder" and "Send a warning" events, escalating to "Send to external debt collection agency"]
22. Predictive Monitoring: Runtime Nearest-Neighbors Approach
• Trace processor: given the current trace [Event+], kNN extraction (string-edit distance) retrieves similar execution traces from the event log; feature extraction turns these into labeled samples.
• Predictor: decision tree learning on the labeled samples produces a decision tree; class estimation on the current trace [Data+] yields the prediction.
F.M. Maggi, C. Di Francescomarino, M. Dumas, C. Ghidini. Predictive Monitoring of Business Processes. CAiSE'2014
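The kNN-extraction step can be sketched as below (toy data; a simple majority vote over the neighbors' outcome labels stands in for the decision-tree predictor of the actual approach):

```python
def edit_distance(a, b):
    """Levenshtein distance between two activity sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def knn_predict(current_prefix, log, k=3):
    """Predict the outcome of a running trace from its k nearest
    completed traces (by string-edit distance on activity labels)."""
    ranked = sorted(log, key=lambda tl: edit_distance(current_prefix, tl[0]))
    votes = [label for _, label in ranked[:k]]
    return max(set(votes), key=votes.count)

# Toy log of (completed trace, outcome) pairs; labels are hypothetical
log = [
    (list("ABCD"), "normal"),
    (list("ABCE"), "normal"),
    (list("ABBBD"), "deviant"),
    (list("ABBBE"), "deviant"),
]
print(knn_predict(list("ABBB"), log, k=3))  # -> deviant
```

The running prefix ABBB is closest (distance 1) to the two deviant traces, so the vote among its three nearest neighbors is "deviant".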
23. Evaluation Setup
• BPI Challenge 2011 dataset
• Healthcare process at a Dutch hospital
• 1,141 cases, average length 14 events/case
• Cases split into normal vs. deviant via 5 predicates: φ1–φ5
• Prediction made at:
  • Start event (initial event)
  • Early event (ca. ¼ of the trace)
  • Mid-point of the trace
24. Evaluation Results
• Reasonably accurate at the mid-point of a trace (AUC 0.78-0.88)
• High runtime overhead: 5-10 seconds per prediction
25. Predictive Process Monitoring: Cluster & Classify
Pre-processing (offline):
• Prefix extraction: historical execution traces → trace prefixes
• Control-flow encoding: trace prefixes → encoded control flow → clustering → clusters
• Per cluster: data encoding + labeling function → encoded data → supervised learning → classifiers
Runtime:
• Running trace → control-flow encoding → cluster(s) identification
• Data encoding → classification → prediction
AUC of 0.6 to 0.85, with a lot of variation.
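The offline/runtime split can be sketched compactly under simplifying assumptions: prefixes are frequency-encoded on control flow, clustered with a minimal k-means, and a per-cluster majority-vote "model" stands in for the supervised classifier; at runtime the running trace is routed to its nearest cluster's model. All data below is a toy illustration.

```python
from collections import Counter

ACTIVITIES = "ABCDE"

def encode(prefix):
    """Control-flow frequency encoding of a trace prefix."""
    c = Counter(prefix)
    return [c[a] for a in ACTIVITIES]

def kmeans(points, k=2, iters=20):
    """Minimal k-means with deterministic initialization; returns centroids."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, g in enumerate(groups):
            if g:
                centroids[i] = [sum(col) / len(g) for col in zip(*g)]
    return centroids

def nearest(p, centroids):
    """Index of the centroid closest to point p (squared Euclidean)."""
    return min(range(len(centroids)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))

# Offline: cluster historical prefixes, fit one majority-vote model per cluster
history = [("ABAB", "slow"), ("ABA", "slow"), ("ACDC", "quick"), ("ACD", "quick")]
points = [encode(t) for t, _ in history]
centroids = kmeans(points, k=2)
models = {}
for (trace, label), p in zip(history, points):
    models.setdefault(nearest(p, centroids), []).append(label)
models = {c: max(set(ls), key=ls.count) for c, ls in models.items()}

# Runtime: route the running trace to its cluster's model
print(models[nearest(encode("ABB"), centroids)])  # -> slow
```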
26. Predictive Process Monitoring: Cluster & Classify with Hyperparameter Optimization
• Each technique has its own hyperparameters
• Other parameters:
  • Trace prefix size
  • Voting mechanism
  • Interval choice in case of interval time predictions
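Candidate configurations arise from crossing such hyperparameter choices into a grid. A minimal sketch (the parameter names and values below are illustrative assumptions, not the actual grid of the cited study):

```python
from itertools import product

# Hypothetical hyperparameter space for cluster-and-classify tuning
grid = {
    "clustering": ["kmeans", "dbscan", "agglomerative"],
    "n_clusters": [2, 5, 10],
    "classifier": ["decision_tree", "random_forest"],
    "prefix_size": [5, 10, 15, 20],
}

# One configuration per combination; each is tuned on the validation set
configs = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(configs))  # -> 72 (3 * 3 * 2 * 4)
```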
27. Experimental Settings
• Four outcome labellings of a large real-life patient treatment dataset
• Dataset preparation: training set (70%), validation set (20%), testing set (10%)
• Identification of the most suitable configurations (among 160)
• Evaluation of the identified configurations with the testing set
28. Evaluation Results
• No unique best configuration
• Accuracy is consistently high, and accuracy on the testing set is consistent with that observed during tuning
Chiara Di Francescomarino, Marlon Dumas, Fabrizio Maria Maggi, Irene Teinemaa. Clustering-Based Predictive Process Monitoring. IEEE
Transactions on Services Computing, 2017.
30. Index-Based Multi-Classifier
• Idea: one classifier per index
  • Classifier for prefixes of length 1
  • Classifier for prefixes of length 2
  • Etc.
• Traces of length m are encoded using an index-based scheme
• At runtime, classify a trace of length m using the corresponding classifier
Anna Leontjeva, Raffaele Conforti, Chiara Di Francescomarino, Marlon Dumas, Fabrizio Maria Maggi: Complex Symbolic Sequence Encodings for Predictive Monitoring of Business Processes. Proc. of BPM 2015, pp. 297-313.
31. Index-Based Multi-Classifier + HMM
• Same as before, but the feature vector of a prefix is extended with the log-likelihood ratio of being in the deviant vs. regular class according to a Hidden Markov Model
39. LSTM-Based Predictive Process Monitoring (ongoing work)
Niek Tax, Ilya Verenich, Marcello La Rosa, Marlon Dumas: Predictive Business Process Monitoring with LSTM Neural Networks. CoRR abs/1612.02130 (2016).
40. Online Predictive Process Monitoring
• Accurate, robust techniques to predict case outcome, covering control flow, structured data and textual data
• LSTM-based architecture to predict:
  • Next task + timestamp + resource or other attributes
  • Remaining execution path and time
• All code available:
  • Clustering-based method: http://goo.gl/ykozBf
  • Index-based method: https://goo.gl/BQFk7k
  • Index-based method with textual features: https://goo.gl/a2DoWT
  • LSTM-based method: https://goo.gl/mkQDyy
Editor's Notes
As an example, we developed BPMN Miner, a technique that discovers BPMN models which are block-structured (since structuredness generally makes models more understandable) and hierarchical (processes and subprocesses).
Alternative approaches are based on replay, negative events etc.
Each discrepancy falls under one of a set of disjoint patterns. For each pattern, we have a verbalization of the difference.
The first statement characterizes the behavior observed in the log but not in the model: in the model, task C is compulsory, while in the log C is skippable
The second statement characterizes the behavior observed in the model but not in the log
Trace alignment would produce two optimal alignments: one between ABDEH of the log and ABCDEH of the model, the other between ABDFH of the log and ABCDFH of the model. From this one can infer that task C is optional in the log (move on log only).
1) However the number of misaligned traces is often very large, rendering this inference quite hard in practice. Visualizations, e.g. on top of Petri net, and at an aggregate level, can help, but fundamentally the problem is that trace alignment provides feedback at the level of individual traces, not at the level of behavioral relations observed in the log but not captured in the model.
2) Moreover, trace alignment would detect that there is escaping behavior starting with "Request additional information" at a trace prefix finishing with "Notify rejection", but it will not identify that the extra behavior includes tasks I and G, nor that IGDF is behavior that can be repeated in the model but not in the log. For example, task "Assess application" can be repeated in the model but not in the log.
Fair enough, our output is intuitively more interpretable, but let’s actually evaluate it.