Considering Non-sequential Control Flows for Process Prediction with Recurrent Neural Networks
Andreas Metzger, Adrian Neubauer
Predictive Process Monitoring
SEAA 2018, Prague
[Figure: timeline (Time, from t to a later point) with the stages Monitoring, Prediction, and Decision. A predicted Violation leads to Proactive Adaptation; otherwise No Violation (Acceptable/Planned Situations).]

Prediction accuracy is essential
• Missing a true violation → No adaptation → Violation not prevented
• Predicting a false violation → Unnecessary adaptation
Example: completion of a transport process by a given deadline
Process Prediction with RNNs
RNN = Recurrent Neural Network
• Special type of artificial neural network
• Neuron feeds back information into itself
Advantages of RNNs
• High accuracy in general
• Can handle arbitrary number of process steps
• One prediction model sufficient to make prediction at any point in time
(“checkpoint”)
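The feedback idea can be sketched with a plain recurrent cell: the hidden state produced for one process step is fed back in when the next step is read, so the same weights handle prefixes of any length and a prediction can be read off at any checkpoint. This is a minimal illustration only, not the LSTM used in the talk; the weights, the three step types, and the one-hot encoding are hypothetical.

```python
import math

def rnn_step(h, x, W_h, W_x):
    """One recurrent update: the new state depends on the previous state h and the input x."""
    return [
        math.tanh(
            sum(W_h[i][j] * h[j] for j in range(len(h)))
            + sum(W_x[i][k] * x[k] for k in range(len(x)))
        )
        for i in range(len(h))
    ]

def encode_prefix(prefix, W_h, W_x, hidden_size):
    """Fold a process prefix of arbitrary length into one fixed-size state."""
    h = [0.0] * hidden_size
    for x in prefix:
        h = rnn_step(h, x, W_h, W_x)
    return h

# Hypothetical fixed weights (a trained model would learn these).
W_h = [[0.5, -0.2], [0.1, 0.3]]
W_x = [[1.0, 0.0, -1.0], [0.0, 1.0, 0.5]]

# Hypothetical one-hot encoding of three process step types.
ONE_HOT = {"A": [1.0, 0.0, 0.0], "B": [0.0, 1.0, 0.0], "C": [0.0, 0.0, 1.0]}

# Two checkpoints of different length, handled by the same model.
state_early = encode_prefix([ONE_HOT[s] for s in "AB"], W_h, W_x, 2)
state_late = encode_prefix([ONE_HOT[s] for s in "ABCAB"], W_h, W_x, 2)
```

Both states have the same size regardless of prefix length, which is what allows a single output layer to produce a prediction at every checkpoint.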
Problem
• RNNs devised for natural language processing (linear sequences of text)
• Non-sequential control flow (order of process steps) may make
RNN prediction more difficult
→ How to consider non-sequential control flows?
Considering Non-sequential Control Flows
Cycles
• Incremental prediction
• Direct prediction
Parallel branches
• No path: No encoding of parallel branches
• Path: Parallel branch encoded as
attribute of process step
• Slice: Encoding of steps running in parallel
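The three encodings of parallel branches can be sketched as follows. This is one possible reading of the bullet points, not the authors' implementation: the event order A, E, B, F, the (step, branch) tuple format, and all function names are assumptions made for illustration.

```python
# Hypothetical trace: (process step, parallel branch) in observed order.
# Branch 1 contains A then B; branch 2 contains E then F.
trace = [("A", 1), ("E", 2), ("B", 1), ("F", 2)]

def encode_no_path(trace):
    """No path: keep only the process step, drop the branch information."""
    return [step for step, _ in trace]

def encode_path(trace):
    """Path: attach the branch to each process step as an attribute."""
    return [f"{step}{branch}" for step, branch in trace]

def encode_slice(trace):
    """Slice: after each event, emit the steps currently active across all branches."""
    branches = sorted({branch for _, branch in trace})
    current = {}  # branch -> most recent step seen on that branch
    slices = []
    for step, branch in trace:
        current[branch] = step
        # Emit a slice only once every branch has started.
        if len(current) == len(branches):
            slices.append("".join(current[b] for b in branches))
    return slices

no_path = encode_no_path(trace)   # ['A', 'E', 'B', 'F']
path = encode_path(trace)         # ['A1', 'E2', 'B1', 'F2']
slices = encode_slice(trace)      # ['AE', 'BE', 'BF']
```

Under these assumptions the slice encoding yields a linear sequence again, at the cost of a larger vocabulary (one symbol per combination of concurrently active steps).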
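[Figure: example with two parallel branches (Parallel Branch 1: A, B; Parallel Branch 2: E, F) and the resulting encodings. No path: A E F; Path: A1 E2 . F2; Slice: AE BE BF. A second sketch for cycles (B A A A) is labeled "Accumulation of prediction error".]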
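The difference between the two strategies for cycles can be illustrated with toy stand-ins for the trained models. The sketch assumes that the prediction target is the remaining time until completion, that incremental prediction rolls a next-step model forward until it predicts the end, and that direct prediction asks for the target in a single call; the slides do not spell out these details, and the lookup tables below are hypothetical.

```python
# Hypothetical next-step model: last step -> (predicted next step, predicted duration).
NEXT_STEP = {"A": ("B", 2.0), "B": ("C", 3.0), "C": ("END", 0.0)}
# Hypothetical direct model: last step -> predicted remaining time.
REMAINING = {"A": 5.0, "B": 3.0, "C": 0.0}

def toy_predict_next(trace):
    return NEXT_STEP[trace[-1]]

def toy_predict_remaining(trace):
    return REMAINING[trace[-1]]

def incremental_remaining_time(prefix, predict_next, max_steps=100):
    """Predict step by step; every step's error is carried into the next prediction."""
    trace = list(prefix)
    total = 0.0
    for _ in range(max_steps):  # guard against cycles that never reach END
        step, duration = predict_next(trace)
        if step == "END":
            break
        trace.append(step)
        total += duration
    return total, trace

def direct_remaining_time(prefix, predict_remaining):
    """Predict the target in one shot, without intermediate predicted steps."""
    return predict_remaining(prefix)

inc_total, inc_trace = incremental_remaining_time(["A"], toy_predict_next)
dir_total = direct_remaining_time(["A"], toy_predict_remaining)
```

With a cycle in the process, the incremental loop may run for many iterations, and each predicted step becomes input for the next one, which is where prediction errors can accumulate.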
Experiment
Cargo 2000 Data Set
• 3,942 process instances
• 56,082 process steps
• Challenging: Parallel branches include
same types of process steps
Environment / Tooling
• Adaptation of the LSTM implementation of RNNs [Tax et al. @ CAiSE 2017]
• Cloud environment for training: 15 physical machines, dockerized
• Approx. 3 hours of training per model
Accuracy Metric: MCC
• Robust against class imbalances
• More challenging to score high on
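MCC (Matthews correlation coefficient) is computed from the four cells of the confusion matrix. The sketch below shows the standard formula and why it is harder to score high on than plain accuracy; treating "violation" as the positive class and the example counts are assumptions for illustration, not figures from the experiment.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0

def accuracy(tp, tn, fp, fn):
    """Fraction of correct predictions, for comparison."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical imbalanced case: 5 violations among 100 instances,
# and a model that never predicts a violation.
always_negative = dict(tp=0, tn=95, fp=0, fn=5)

acc_value = accuracy(**always_negative)  # high despite missing every violation
mcc_value = mcc(**always_negative)       # 0.0: no better than chance
```

A model that misses every violation still reaches 95% accuracy here, while its MCC is 0.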
Results
[Figure: two panels, Incremental Prediction and Direct Prediction. x-axis: Checkpoint (0% to 90%); y-axis: Accuracy [MCC] (-0.05 to 0.65); series: No path, Path, Slice; additional label: MLP.]
Direct Prediction 6.3% (avg) better than Incremental
But: Flip after 50% mark!
RNN 36% better than “traditional” Neural Network (MLP; figure annotation: 0.24750, -36%)
Impact of parallel encoding very small: 1.4% (avg)
Conclusion and Outlook
• Deep learning promising technique for predictive business
process monitoring
• Facilitates proactive business process management based on
accurate predictions
• Improvement of accuracy when explicitly considering non-sequential
control flows
  • Cycles: +6.3%
  • Parallel branches: +1.4%
• Future work
  • Further empirical evidence (port logistics, e-commerce, …)
  • Measure impact of control flow structure and complexity on
    accuracy
Thank you!
Research leading to these results has received funding from…
…the EFRE co-financed operational program NRW.Ziel2
http://www.lofip.de
…the EU’s Horizon 2020 research and innovation programme under Objective ICT-15 ‘Big Data PPP: Large Scale Pilot Actions’
http://www.transformingtransport.eu
