This document discusses process mining and predictive process monitoring. It begins with an overview of offline process mining techniques such as process discovery, conformance checking, and deviance mining. It then discusses applying these techniques online for predictive process monitoring, i.e., predicting outcomes, deviations, or failures of running cases. Various techniques are presented, such as nearest-neighbor classification of partial traces and clustering traces before classification. The goal is to accurately predict outcomes during process execution based on control flow, data attributes, and textual case data.
9. Conformance Checking with Trace Alignment
[Figure: trace alignment of the event log against the process model — each row is one aligned trace over activities A-K, with deviations highlighted]
Legend:
• Activity occurs in the log only, but occurs in the model in another path
• Activity occurs in the model only and is not observed anywhere in the log
• Activity occurs in the model only, but occurs in the log in another trace
• Activity occurs both in the model and the log
11. Conformance Checking with Behavioral Alignment
Desired conformance output:
• task C is optional in the log
• the cycle involving tasks I, G, D, F is not observed in the log
Log traces:
ABCDEH
ACBDEH
ABCDFH
ACBDFH
ABDEH
ABDFH
L. Garcia-Banuelos, N.R. van Beest, M. Dumas, M. La Rosa, W. Mertens. Complete and Interpretable Conformance Checking of Business Processes. IEEE Transactions on Software Engineering, in press.
12. Deviance Mining
Given two logs, find the differences and root causes for variation or deviance between the two logs.
[Figure: models mined from the "simple claims and quick" vs. "simple claims and slow" sub-logs]
S. Suriadi et al.: Understanding Process Behaviours in a Large Insurance Company in Australia: A Case Study. CAiSE 2013
13. Deviance Mining via Sequence Classification
• Apply discriminative sequence mining methods to extract
features characteristic of one class
• Build classification models (e.g. decision trees)
• Extract difference diagnostics from classification model
C. Sun et al. Mining explicit rules for software process evaluation. ICSSP’2013.
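The discriminative-sequence-mining step above can be sketched as follows (an illustrative toy implementation, not the exact method of Sun et al.): mine activity n-grams whose support differs strongly between the two sub-logs; such patterns can then serve as features for a decision-tree learner.

```python
from collections import Counter

def ngrams(trace, n=2):
    """All contiguous activity n-grams of a trace."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]

def discriminative_patterns(normal, deviant, n=2, min_gap=0.3):
    """Return n-grams whose support (fraction of traces containing the
    pattern) differs between the two classes by at least `min_gap`."""
    def support(log):
        c = Counter()
        for trace in log:
            for g in set(ngrams(trace, n)):
                c[g] += 1
        return {g: k / len(log) for g, k in c.items()}
    s_norm, s_dev = support(normal), support(deviant)
    patterns = {}
    for g in set(s_norm) | set(s_dev):
        gap = s_dev.get(g, 0.0) - s_norm.get(g, 0.0)
        if abs(gap) >= min_gap:
            patterns[g] = gap  # > 0: characteristic of the deviant class
    return patterns

# Toy logs: deviant cases repeat activity B
normal = [list("ABCD"), list("ABD"), list("ABCD")]
deviant = [list("ABBCD"), list("ABBBD"), list("ABBD")]
pats = discriminative_patterns(normal, deviant)
print(pats)
```

Here the bigram ('B', 'B') comes out with the maximum gap of 1.0: it occurs in every deviant trace and in no normal one.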
14. Log Delta Analysis
[Pipeline: event log → PESL; input model → unfold → PESM; merge → Partially Synchronized Product (PSP) → compare → extract differences → difference statements]
N.R. van Beest, L. Garcia-Banuelos, M. Dumas, M. La Rosa, Log Delta Analysis: Interpretable Differencing of Business Process Event Logs.
BPM 2015: 386-405
15. Sequence classification vs. log delta analysis
Datasets: L1 (short stay), 448 cases, 7,329 events; L2 (long stay), 363 cases, 7,496 events.
Sequence classification: 106-130 statements, e.g.
• IF |"Nursing Progress Notes"| > 7.5 THEN L1
• IF |"Nursing Progress Notes"| ≤ 7.5 AND |"Nursing Assessment"| > 1.5 THEN L2
• …
Log delta analysis: 48 statements, e.g.
• In L1, "Nursing Primary Assessment" is repeated after "Medical Assign" and "Triage Request", while in L2 it is not
• …
N.R. van Beest, L. Garcia-Banuelos, M. Dumas, M. La Rosa, Log Delta Analysis: Interpretable Differencing of Business Process Event Logs.
BPM 2015: 386-405
16. Apromore Process Analytics Platform
(apromore.org)
Open-source, highly scalable, SaaS BPM analytics platform
M. La Rosa, H. Reijers, W. van der Aalst, R. Dijkman, J. Mendling, M. Dumas, L. Garcia-Banuelos. "APROMORE: An Advanced Process Model Repository". Expert Systems with Applications, 2011.
17. Beyond Deviance Mining: Predictive Process Monitoring
How likely is it that a running process will become "deviant"?
• Will it end up in a negative outcome?
• Will it fail to meet its SLAs in the next 24 hours?
• Will it generate abnormal effort, costs or rework?
19. Predictive Monitoring Example: Debt Recovery Process
[Figure: normal trace — Debt repayment due → Call the debtor → Send a reminder → Payment received]
20. Predictive Monitoring Example: Debt Recovery Process (cont.)
[Figure: deviant traces with many repeated "Call the debtor", "Send a reminder" and "Send a warning" events, escalating to "Send to external debt collection agency"]
22. Predictive Monitoring: Runtime Nearest-Neighbors Approach
• Trace processor: given the current trace [Event+], kNN extraction (string-edit distance) retrieves similar execution traces from the event log; feature extraction turns these into labeled samples.
• Predictor: decision tree learning on the labeled samples produces a decision tree; class estimation on the current trace [Data+] yields the prediction.
F.M. Maggi, C. Di Francescomarino, M. Dumas, C. Ghidini. Predictive Monitoring of Business Processes. CAiSE'2014
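The kNN-extraction step can be sketched as below (toy data; a simple majority vote over the neighbors' outcome labels stands in for the decision-tree predictor of the actual approach):

```python
def edit_distance(a, b):
    """Levenshtein distance between two activity sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def knn_predict(current_prefix, log, k=3):
    """Predict the outcome of a running trace from its k nearest
    completed traces (by string-edit distance on activity labels)."""
    ranked = sorted(log, key=lambda tl: edit_distance(current_prefix, tl[0]))
    votes = [label for _, label in ranked[:k]]
    return max(set(votes), key=votes.count)

# Toy log of (completed trace, outcome) pairs; labels are hypothetical
log = [
    (list("ABCD"), "normal"),
    (list("ABCE"), "normal"),
    (list("ABBBD"), "deviant"),
    (list("ABBBE"), "deviant"),
]
print(knn_predict(list("ABBB"), log, k=3))  # -> deviant
```

The running prefix ABBB is closest (distance 1) to the two deviant traces, so the vote among its three nearest neighbors is "deviant".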
23. Evaluation Setup
• BPI Challenge 2011 dataset
• Healthcare process at a Dutch hospital
• 1,141 cases, average length 14 events/case
• Cases split into normal vs. deviant via 5 predicates: φ1–φ5
• Prediction made at:
  • Start event (initial event)
  • Early event (ca. ¼ of the trace)
  • Mid-point of the trace
24. Evaluation Results
• Reasonably accurate at the mid-point of a trace (AUC 0.78-0.88)
• High runtime overhead: 5-10 seconds per prediction
25. Predictive Process Monitoring: Cluster & Classify
Pre-processing (offline):
• Prefix extraction: historical execution traces → trace prefixes
• Control-flow encoding: trace prefixes → encoded control flow → clustering → clusters
• Per cluster: data encoding + labeling function → encoded data → supervised learning → classifiers
Runtime:
• Running trace → control-flow encoding → cluster(s) identification
• Data encoding → classification → prediction
AUC of 0.6 to 0.85, with a lot of variation.
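The offline/runtime split can be sketched compactly under simplifying assumptions: prefixes are frequency-encoded on control flow, clustered with a minimal k-means, and a per-cluster majority-vote "model" stands in for the supervised classifier; at runtime the running trace is routed to its nearest cluster's model. All data below is a toy illustration.

```python
from collections import Counter

ACTIVITIES = "ABCDE"

def encode(prefix):
    """Control-flow frequency encoding of a trace prefix."""
    c = Counter(prefix)
    return [c[a] for a in ACTIVITIES]

def kmeans(points, k=2, iters=20):
    """Minimal k-means with deterministic initialization; returns centroids."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, g in enumerate(groups):
            if g:
                centroids[i] = [sum(col) / len(g) for col in zip(*g)]
    return centroids

def nearest(p, centroids):
    """Index of the centroid closest to point p (squared Euclidean)."""
    return min(range(len(centroids)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))

# Offline: cluster historical prefixes, fit one majority-vote model per cluster
history = [("ABAB", "slow"), ("ABA", "slow"), ("ACDC", "quick"), ("ACD", "quick")]
points = [encode(t) for t, _ in history]
centroids = kmeans(points, k=2)
models = {}
for (trace, label), p in zip(history, points):
    models.setdefault(nearest(p, centroids), []).append(label)
models = {c: max(set(ls), key=ls.count) for c, ls in models.items()}

# Runtime: route the running trace to its cluster's model
print(models[nearest(encode("ABB"), centroids)])  # -> slow
```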
26. Predictive Process Monitoring: Cluster & Classify with Hyperparameter Optimization
• Each technique has its own hyperparameters
• Other parameters:
  • Trace prefix size
  • Voting mechanism
  • Interval choice in case of interval time predictions
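Candidate configurations arise from crossing such hyperparameter choices into a grid. A minimal sketch (the parameter names and values below are illustrative assumptions, not the actual grid of the cited study):

```python
from itertools import product

# Hypothetical hyperparameter space for cluster-and-classify tuning
grid = {
    "clustering": ["kmeans", "dbscan", "agglomerative"],
    "n_clusters": [2, 5, 10],
    "classifier": ["decision_tree", "random_forest"],
    "prefix_size": [5, 10, 15, 20],
}

# One configuration per combination; each is tuned on the validation set
configs = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(configs))  # -> 72 (3 * 3 * 2 * 4)
```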
27. Experimental Settings
• Four outcome labellings of a large real-life patient treatment dataset
• Dataset preparation: training set (70%), validation set (20%), testing set (10%)
• Identification of the most suitable configurations (among 160)
• Evaluation of the identified configurations with the testing set
28. Evaluation Results
• No unique best configuration
• Accuracy is consistently high, and accuracy on the testing set is consistent with that observed during tuning
Chiara Di Francescomarino, Marlon Dumas, Fabrizio Maria Maggi, Irene Teinemaa. Clustering-Based Predictive Process Monitoring. IEEE
Transactions on Services Computing, 2017.
30. Index-Based Multi-Classifier
• Idea: one classifier per index
  • Classifier for prefixes of length 1
  • Classifier for prefixes of length 2
  • Etc.
• Traces of length m are encoded using an index-based scheme
• At runtime, classify a trace of length m using the corresponding classifier
Anna Leontjeva, Raffaele Conforti, Chiara Di Francescomarino, Marlon Dumas, Fabrizio Maria Maggi: Complex Symbolic Sequence Encodings for Predictive Monitoring of Business Processes. Proc. of BPM 2015, pp. 297-313.
31. Index-Based Multi-Classifier + HMM
• Same as before, but the feature vector of a prefix is extended with the log-likelihood ratio of being in the deviant vs. regular class according to a Hidden Markov Model
39. LSTM-Based Predictive Process Monitoring (ongoing work)
Niek Tax, Ilya Verenich, Marcello La Rosa, Marlon Dumas: Predictive Business Process Monitoring with LSTM Neural Networks. CoRR abs/1612.02130 (2016).
40. Online Predictive Process Monitoring
• Accurate, robust techniques to predict case outcome, covering control flow, structured data and textual data
• LSTM-based architecture to predict:
  • Next task + timestamp + resource or other attributes
  • Remaining execution path and time
• All code available:
  • Clustering-based method: http://goo.gl/ykozBf
  • Index-based method: https://goo.gl/BQFk7k
  • Index-based method with textual features: https://goo.gl/a2DoWT
  • LSTM-based method: https://goo.gl/mkQDyy
Editor's Notes
As an example, we developed BPMN Miner, a technique that discovers BPMN models which are block-structured (since structuredness generally makes models more understandable) and hierarchical (processes and subprocesses).
Alternative approaches are based on replay, negative events etc.
Each discrepancy falls under one of a set of disjoint patterns. For each pattern, we have a verbalization of the difference.
The first statement characterizes the behavior observed in the log but not in the model: in the model, task C is compulsory, while in the log C is skippable
The second statement characterizes the behavior observed in the model but not in the log
Trace alignment would produce two optimal alignments: one between ABDEH of the log and ABCDEH of the model, the other between ABDFH of the log and ABCDFH of the model. From this one can infer that task C is optional in the log (move on log only).
1) However the number of misaligned traces is often very large, rendering this inference quite hard in practice. Visualizations, e.g. on top of Petri net, and at an aggregate level, can help, but fundamentally the problem is that trace alignment provides feedback at the level of individual traces, not at the level of behavioral relations observed in the log but not captured in the model.
2) Moreover, trace alignment would detect that there is escaping behavior starting with "Request additional information" at a trace prefix finishing with "Notify rejection", but it will not identify that the extra behavior includes tasks I and G, nor that IGDF is behavior that can be repeated in the model but not in the log. For example, task "Assess application" can be repeated in the model but not in the log.
Fair enough, our output is intuitively more interpretable, but let’s actually evaluate it.