Successfully reported this slideshow.
Your SlideShare is downloading. ×

Mine Your Simulation Model: Automated Discovery of Business Process Simulation Models from Execution Data

Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad

Check these out next

1 of 32 Ad

Mine Your Simulation Model: Automated Discovery of Business Process Simulation Models from Execution Data

Download to read offline

Keynote talk by Marlon Dumas at the SIMULTECH 2021 conference. The talk gives an overview of ongoing research on automated construction of simulation models / digital twins from business process execution logs, including approaches that combine discrete event simulation with deep learning methods.

Keynote talk by Marlon Dumas at the SIMULTECH 2021 conference. The talk gives an overview of ongoing research on automated construction of simulation models / digital twins from business process execution logs, including approaches that combine discrete event simulation with deep learning methods.

Advertisement
Advertisement

More Related Content

Slideshows for you (20)

Similar to Mine Your Simulation Model: Automated Discovery of Business Process Simulation Models from Execution Data (20)

Advertisement

More from Marlon Dumas (20)

Advertisement

Mine Your Simulation Model: Automated Discovery of Business Process Simulation Models from Execution Data

  1. 1. Marlon Dumas Professor @ University of Tartu Co-founder @ Apromore With contributions by Manuel Camargo and Oscar González-Rojas SIMULTECH’2021, 7 July 2021
  2. 2. Model the process Define a simulation scenario Run the simulation Analyze the simulation outputs Repeat for alternative scenarios 3
  3. 3. 4
  4. 4. 5 Exp(20m) Normal(20m, 4m) Normal(10m, 2m) Normal(10m, 2m) Normal(10m, 2m) 0m
  5. 5. 6 Arrival rate = 2 applications per hour Inter-arrival time = 0.5 hour Negative exponential distribution From Monday-Friday, 9am-5pm 0.3 0.7 0.3
  6. 6. Clerk Officer System Clerk Officer Officer
  7. 7. • The simulated logs are analyzed and compared using dashboards displaying performance measures such as case duration, waiting time, cost, etc. and/or via heat maps or animations D
  8. 8. Business Process Simulation: Assumptions The process model is authoritative (always followed to the letter) • No deviations • No workarounds The simulation parameters accurately reflect reality • …whereas in reality, they are often guesstimates A resource only works on one task instance at a time / a task is performed by one resource • No multi-tasking / no multi-resource tasks (teamwork) Resources have robotic behavior (eager resources consume work items in FIFO mode) • No batching • No tiredness effects, no interruptions, no distractions beyond “stochastic” ones Undifferentiated resources • Every resource in a pool has the same performance as others No time-sharing outside the simulated process • Resources fully dedicated to one process 9
  9. 9. End Result Business process simulations based on incomplete models, guesstimates, and simplifying assumptions are not faithful  adoption of business process simulation is disappointing 0
  10. 10. Event Log
  11. 11. Given • one or more business processes, for which we have: • one or more process specifications and/or • event logs generated by the execution of the processes on top of one or more information systems. • one or more process performance measures of interest (e.g. cycle time, resource cost) • One or more changes to the process (interventions) Predict • Predict the values of the process performance measures after the given interventions.
  12. 12. Non-Functional Requirements 15 Predictions accurate. Accuracy may be measured e.g. via an error between the predicted and the actual performance measures after intervention. Predictions should be accompanied by a reliability estimate. In most cases, the reliability is high. Reliability could be captured, e.g. by confidence intervals
  13. 13. Stochastic Process Model Discovery Event log Simod Control-Flow Discovery (BPMN Model) • Filtering threshold • Parallelism threshold Log repair & branching probability discovery Trace alignment + replay to determine how many times each sequence flow is traversed Parameter discovery • Resource pools • Pool-task assignment • Resource timetables • Multi-tasking behavior • Interarrival distribution • Activities durations distribution Simulation Simulator Optimizer Simulation Parameters Discovery SplitMiner
  14. 14. 17 ε = 0.554245605 / η = 0.899926286 ε = 0.7057 η = 0.0133
  15. 15. SiMo-Discoverer - Processing 18 Branching probabilities definition Interarrival dist. • Resource pool extraction • Task assignment Task 1 Task 2 Task 3 Activities durations X Task 4 X
  16. 16. Phase Category Variable Control flow discovery Parallelism threshold (ε) Filtering threshold (η) Parameters for log repair Simulation parameters discovery Thresholds for resource pool discovery Parameters for fitting temporal distributions Paired Damerau-Levenshtein (DL) distance with penalty for temporal mismatch {T1 -> T2 -> T3} {T1 -> T2 -> T3} {T1 -> T2 -> T3} {T1 -> T2 -> T3} {T1 -> T2 -> T3} {T1 -> T2 -> T3} {T1 -> T2 -> T3} {T1 -> T2 -> T3} {T1 -> T2 -> T3} {T1 -> T2 -> T3} {T1 -> T2 -> T3} {T1 -> T2 -> T3} {T1 -> T2 -> T3} {T1 -> T2 -> T3} {T1 -> T2 -> T3} {T1 -> T2 -> T3} {T1 -> T2 -> T3} {T1 -> T2 -> T3} {T1 -> T2 -> T3} {T1 -> T2 -> T3} Test Fold of Event-Log Simulated Log
  17. 17. - Assumes undifferentiated resources with robotic behavior - Branches are selected using branching probabilities Deep Learning Sequence Gen Methods - Does not explicitly take into account resource constraints - Models resource availability via neural networks that may capture non-linear availability functions - Learns the case arrival process from data (univariate or multivariate models) - May take as input a process specification (helps with interpretability) - Models the case creation process via a probability distribution - Takes into account resource constraints - No interpretable process specification - Branching behavior modeled via neural networks (e.g. LSTM) that may capture complex relations - Models resource availability as calendars (possibly discovered from historical data) - May capture differentiated resources and robotic behavior Data-Driven (Discrete Event) Simulation - Provides a natural mechanism for capturing the effect of changes to the process - Does not have a mechanism for capturing the effect of changes to the process
  18. 18. LSTM LSTM Dense LSTM Dense LSTM LSTM Dense Concatenate Embedded Embedded activities category role category relative times contextual features LSTM LSTM Embedded Dense LSTM Embedded Dense LSTM Concatenate Dense activities category role categoryrelative times contextual features (a) Shared categorical (b) Full shared
  19. 19. Partition 2 (20%) Testing Partition 1 (80%) Training Testing Deep Learning Trainer BEST DL MODEL Trace generator Time splitting Evaluator DDS Parameter extraction BEST SIM MODEL Simulator Training (80%) Validation (20%) Event-log 1 2 3
  20. 20. • DDS Models (SIMOD) and DL models have comparable performance w.r.t. control-flow similarity (CLFS) • DL models sometimes clearly outperform DDS models on temporal metrics (MAE, ELS) Could we combine them?
  21. 21. Introducing DeepSimulator DeepSimulator Phase 1: Activity sequence generation Stochastic process model discovery • Control-flow model discovery • Trace alignment (log repair) • Branching probabilities discovery Model eval and optimization Activity sequences generation Phase 2: Case start-times generation Time-series analysis • Prophet model training Case start-times generation Model eval and optimization Phase 3: Activity timestamps generation Activity timestamps generator • Log replay • Feature engineering / embeddings • Model training Output log assembly Model eval and optimization
  22. 22. Given • one or more event logs recording the execution of one or more processes • one or more performance measures that we seek to maximize/minimize • a process model, decision rules and resource allocation rules • a set of allowed changes to the process model and associated rules Find • Possible sets of changes to the process to optimize the performance measures
  23. 23. LSTM {tanh, selu} Dense {linear} Concatenate {50, 100} LSTM {tanh, selu} {50, 100} {current act processing or waiting until next act} Embedded activities category processing or waiting time contextual features inter-case features
  24. 24. Evaluation Results Deep Simulation generally outperforms classical DDS in temporal measures (MAE, EMD)
  25. 25. Remove activity Train DSIM baseline model List of changes Update model Generate syntethic examples Update embeddings Replace embeddings of generative models Evaluator Generate log Generate log Simulate Simulation model Simulation model modified Simulate change Train DSIM modified model BASELINE MODEL UPDATED MODEL
  26. 26. Results 34 Dataset Version MAE RMSE MAPE D1 As-is 7155 22006 0.25497 What-if 17546 33137 0.42854 D2 As-is 283061 357717 0.25912 What-if 1040344 1052255 0.95599 Accuracy of “As is” simulation model vs Accuracy of “What-if” simulation model after adding an activity
  27. 27. References Limitations and pitfalls of traditional BP simulation • van der Aalst: Business Process Simulation Survival Guide. In Handbook on Business Process Management Vol. 1, 2015, 337-370 Data-Driven Simulation (discovering simulation models from logs) • Rozinat et al. Discovering simulation models. Inf. Syst. 34(3): 305-327 (2009) • Martin et al. The Use of Process Mining in Business Process Simulation Model Construction - Structuring the Field. Bus. Inf. Syst. Eng. 58(1): 73-87 • Camargo et al. Automated discovery of business process simulation models from event logs. Decis. Support Syst. 134:113284, 2020 https://arxiv.org/abs/2009.03567 • Pourbafrani et al. Extracting Process Features from Event Logs to Learn Coarse-Grained Simulation Models. CAiSE 2021: 125-140 Data-Driven Simulation and Deep Learning • Camargo et al. Discovering Generative Models from Event Logs: Data-driven Simulation vs Deep Learning, 2020 https://arxiv.org/abs/2009.03567 • Camargo et al. Learning Accurate Business Process Simulation Models from Event Logs via Automated Process Discovery and Deep Learning. Arxiv.org (2021) https://arxiv.org/abs/2103.11944

×