Business Process Simulation (BPS) is an approach to analyze the performance of business processes under different scenarios. For example, BPS allows us to estimate what would be the cycle time of a process if one or more resources became unavailable. The starting point of BPS is a process model annotated with simulation parameters (a BPS model). BPS models may be manually designed, based on information collected from stakeholders and empirical observations, or automatically discovered from execution data. Regardless of its origin, a key question when using a BPS model is how to assess its quality. In this paper, we propose a collection of measures to evaluate the quality of a BPS model w.r.t. its ability to replicate the observed behavior of the process. We advocate an approach whereby different measures tackle different process perspectives. We evaluate the ability of the proposed measures to discern the impact of modifications to a BPS model, and their ability to uncover the relative strengths and weaknesses of two approaches for automated discovery of BPS models. The evaluation shows that the measures not only capture how close a BPS model is to the observed behavior, but they also help us to identify sources of discrepancies.
Presentation delivered by David Chapela-Campa at the BPM'2023 conference, Utrecht, September 2023.
Can I Trust My Simulation Model? Measuring the Quality of Business Process Simulation Models
1. Can I Trust My Simulation Model? Measuring the Quality of Business Process Simulation Models
David Chapela-Campa (1), Ismail Benchekroun (2), Opher Baron (2), Marlon Dumas (1), Dmitry Krass (2), and Arik Senderovich (3)
21st International Conference on Business Process Management (BPM 2023)
(1) University of Tartu, Estonia
(2) University of Toronto, Canada
(3) York University, Canada
2. Introduction
3. Business Process Simulation (BPS)
BPS allows users to address "what-if" analysis questions.
What would be the cycle time of the process if the rate of arrival of new cases doubles?
5. Business Process Simulation (BPS)
BPS models can be manually created by modeling experts.
Process mining techniques can be used to automatically discover BPS models from business process event logs.
6. Business Process Simulation (BPS)
How to assess the quality of a BPS model?
Automatic assessment.
Useful to detect the sources of deviations.
7. Proposed Framework
8. Quality of a BPS model
How to assess the quality of a BPS model?
Comparing an event log with a BPS model.
Variety of different BPS model formats.
[Figure: process event log]
9. Quality of a BPS model
How to assess the quality of a BPS model?
Generate K simulated event logs.
Compare each simulated log individually with the process event log and report the average and confidence interval.
[Figure: process event log compared with K simulated event logs]
10. Quality of a BPS model
A BPS model can be very accurate in one aspect (e.g., control-flow), yet very different in another (e.g., processing times).
Three main dimensions: control-flow, temporal, congestion.
[Figure: process event log and simulated event log]
11. Proposed Framework: Control-flow measures
12. Control-flow: Control-Flow Log Distance
Control-Flow Log Distance (CFLD): given two event logs L1 and L2, the minimum average distance to transform each case in L1 into a case in L2, such that each case in L1 is paired with a different case in L2.
Camargo, M., Dumas, M., Rojas, O.G.: Discovering generative models from event logs: data-driven simulation vs deep learning. PeerJ Comput. Sci. 7, e577 (2021)
15. Control-flow: Control-Flow Log Distance
Process event log: A B C D | A B C D | A C B D | A E F G H | A E F G I
Simulated event log: A B C D | A C B D | A E F G | A E F G H | A E F G H
Pairwise distances shown: 0, 0, 0.2
16. Control-flow: Control-Flow Log Distance
$\mathrm{CFLD} = \frac{0 + 0 + 0.75 + 0 + 0.2}{5} = 0.19$
17. Control-flow: N-Gram Distance
N-Gram Distance (NGD): given two event logs L1 and L2, and a positive integer n, difference in the frequencies of the n-grams observed in both L1 and L2.
Leemans, S.J.J., Syring, A.F., van der Aalst, W.M.P.: Earth movers' stochastic conformance checking. In: BPM Forum 2019. LNBIP, vol. 360, pp. 127–143. Springer (2019)
18. Control-flow: N-Gram Distance
Process event log: A B C D | A B C D | A C B D | A E F G H | A E F G I
Simulated event log: A B C D | A C B D | A E F G | A E F G H | A E F G H
N = 3
22. Control-flow: N-Gram Distance
[Figure: histogram of 3-gram frequencies (y-axis from 0 to 6) for the process event log and the simulated event log. The 3-grams on the x-axis include: _ _ A, _ A B, _ A C, _ A E, A B C, A C B, A E F, B C D, C B D, E F G, F G H, F G I, C D _]
23. Proposed Framework: Temporal measures
24. Temporal: Absolute Event Distribution
Absolute Event Distribution (AED): given two event logs L1 and L2, distance between the time series of the events in L1 and L2.
How differently the events are distributed along the timeline of the event log.
26. Temporal: Absolute Event Distribution
[Figure: events assigned to one-hour bins, e.g., 06-10-2022 10am – 11am and 07-10-2022 11am – 12pm]
27. Temporal: Absolute Event Distribution
[Figure: the two time series compared with the earth mover's distance]
28. Temporal: Circadian Event Distribution
Circadian Event Distribution (CED): given two event logs L1 and L2, distance between the time series of the events in L1 and L2, for each day of the week.
How differently the events are distributed within each day of the week.
[Figure: events binned per weekday (Monday, Tuesday, Wednesday, Thursday, Friday), e.g., Thursday 10am – 11am and Friday 11am – 12pm]
29. Temporal: Circadian Event Distribution
[Figure: EMD between the Monday distributions of the two event logs]
30. Temporal: Relative Event Distribution
Relative Event Distribution (RED): given two event logs L1 and L2, distance between the time series of the events in L1 and L2, with respect to the start of their case.
How differently the events are distributed within each process case.
[Figure: event times relative to the case start, e.g., 00:00:00 and 01:01:47]
31. Temporal: Relative Event Distribution
[Figure: EMD between the relative event distributions of the two event logs]
32. Proposed Framework: Congestion measures
33. Congestion: Case Arrival Rate
Case Arrival Rate (CAR): given two event logs L1 and L2, distance between how the case arrivals are distributed in L1 and L2.
[Figure: case arrivals assigned to one-hour bins, e.g., 06-10-2022 10am – 11am]
34. Congestion: Case Arrival Rate
[Figure: EMD between the case arrival distributions of the two event logs]
35. Congestion: Cycle Time Distribution
Cycle Time Distribution (CTD): given two event logs L1 and L2, distance between the distribution of cycle times in L1 and L2.
[Figure: example cycle time of 01:07:02]
36. Congestion: Cycle Time Distribution
[Figure: EMD between the cycle time distributions of the two event logs]
37. Evaluation
38. Evaluation
EQ1: Are the proposed measures able to discern the impact of different known modifications to a BPS model?
EQ2: Is the N-Gram Distance's performance significantly different from the CFLD's performance?
No modifications
Control-flow
Gateway probabilities
Case arrival rate
Activity durations
Resource contention
Working calendars
Extraneous delays
40. Evaluation
EQ1: Are the proposed measures able to discern the impact of different known modifications to a BPS model?
41. Evaluation
EQ2: Is the N-Gram Distance's performance significantly different from the CFLD's performance?
Kendall rank correlation coefficient: 1.0
42. Evaluation
EQ3: Given two BPS models discovered by existing automated BPS model discovery techniques in real-life scenarios, are the proposed measures able to identify the strengths and weaknesses of each technique?
4 real-life processes, each split into disjoint training and test sets.
BPS models automatically discovered with SIMOD and Service Miner.
49. Evaluation
EQ4: Does the 1-WD report the same insights in real-life scenarios as the EMD?
50. Conclusion
51. Conclusion
Proposed a framework to measure the quality of a BPS model, decomposing it into three perspectives (control-flow, temporal, and congestion) and defining measures for each of these perspectives.
The measures proved able to detect the alterations in their corresponding perspectives.
Beyond capturing the quality of a BPS model and identifying the sources of discrepancies, the measures can also assist in eliciting areas for improvement in automated BPS model discovery techniques.
The presented computationally efficient alternatives led to similar conclusions.
52. Future Work
Explore the applicability of the proposed measures to other process mining problems, e.g., concept drift detection and variant analysis.
Study how to assess the quality of BPS models in the context of object-centric event logs.
Study other quality measures for BPS models adapted from the field of generative machine learning, for example, by using a discriminative model that attempts to distinguish between data generated by the BPS model and real data.
Editor's Notes
So, the first thing we need to know is: what is business process simulation?
BPS aims to replicate the execution of a process, mimicking its behavior in a certain scenario (set of resources, etc.), in order to analyze its performance (KPIs).
This allows users…
The starting point is a BPS model… A process model annotated with a set of simulation parameters that define the scenario (resources, calendars, activity durations…).
[NEXT]
BPS models may be manually created based on information collected via interviews or empirical observations… Or [NEXT…]
they may be automatically discovered from execution data recorded in process-aware information systems (event logs).
Regardless of the origin, a key question when using a BPS model is… [NEXT]
how to assess its quality?
Several approaches have been proposed to address this problem. However, these approaches are either manual and qualitative, or they produce a single number that does not allow one to identify the source(s) of deviations between the BPS model and the observed reality.
First, we need to decide what to compare when assessing the quality of a BPS model.
What we are comparing is a BPS model with a PROCESS.
What we usually have is…
…an event log!
Now, the first thing we asked ourselves was: should we compare a BPS model directly against an event log?
But BPS models do not follow a standard structure…
They can be built from queueing systems, with more or less detailed models (resources, more complex waiting times), and they will change over time with new research.
Thus, what we can do is simulate an event log from the BPS model
and compare log to log.
We do K runs and compute the average and the confidence interval.
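The aggregation over the K runs can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `mean_with_ci`, the example scores, and the use of a normal approximation for the interval are all assumptions (for a small K, a Student t quantile would usually be more appropriate).

```python
from statistics import NormalDist, mean, stdev

def mean_with_ci(distances, confidence=0.95):
    """Return (mean, half_width) of a normal-approximation confidence interval."""
    k = len(distances)
    if k < 2:
        raise ValueError("need at least two simulated logs")
    # Two-sided quantile of the standard normal (assumption: normal approximation).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(distances) / k ** 0.5
    return mean(distances), half_width

# Hypothetical distances between the real log and K = 5 simulated logs.
scores = [0.19, 0.21, 0.18, 0.22, 0.20]
avg, half_width = mean_with_ci(scores)
print(f"{avg:.3f} +/- {half_width:.3f}")
```

Any of the measures in the framework could play the role of the per-run distance here, since each one compares a single simulated log with the real log.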
Abstract event logs into time-series or histograms and compare them
We have two event logs and we are focusing on the control-flow, so the first step is to… [NEXT]
obtain the activity sequences of each event log.
Then, we compute the Damerau-Levenshtein distance (a string edit distance) between each pair of cases… [NEXT]
For example, [comment examples]; we repeat this for each case.
Once we have all the pairwise distances computed, we compute the matching between the cases of one log and the other (such that each case in one log is matched to one case in the other event log, with no repetitions)… [NEXT]
while minimizing the sum of distances, using the Hungarian algorithm for optimal alignment.
Finally, the CFLD measure is the average of these distances… [NEXT]
The computational complexity of computing the DL-distance for all possible pairings is $O(N^2 \times MTL^3)$, where N is the number of traces in the logs (assuming both logs have an equal number of cases, which holds in our setting) and MTL is the maximum trace length. Since all pairings are put into a matrix to compute the optimal alignment of cases (the one that minimizes the total sum of distances), CFLD's memory complexity is quadratic in the number of cases. The optimal alignment of traces using the Hungarian algorithm has cubic complexity in the number of cases.
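The steps above can be sketched on the example logs from the slides. Several parts are assumptions made for illustration: the distance is the restricted (optimal string alignment) variant of Damerau-Levenshtein, it is normalized by the length of the longer trace (which reproduces the 0.75 and 0.2 shown on the slides), and the optimal matching is found by brute force over permutations instead of the Hungarian algorithm, which is only viable for tiny logs.

```python
from itertools import permutations

def dl_distance(a, b):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[len(a)][len(b)]

def normalized_distance(a, b):
    # Assumption: normalize by the longer trace so values lie in [0, 1].
    longest = max(len(a), len(b))
    return dl_distance(a, b) / longest if longest else 0.0

def cfld(log1, log2):
    """Average distance of the cheapest one-to-one pairing of cases."""
    if len(log1) != len(log2):
        raise ValueError("both logs must have the same number of cases")
    costs = [[normalized_distance(a, b) for b in log2] for a in log1]
    # Brute-force stand-in for the Hungarian algorithm (factorial time).
    best = min(sum(costs[i][j] for i, j in enumerate(perm))
               for perm in permutations(range(len(log2))))
    return best / len(log1)

# One character per activity, as in the slide example.
real = ["ABCD", "ABCD", "ACBD", "AEFGH", "AEFGI"]
simulated = ["ABCD", "ACBD", "AEFG", "AEFGH", "AEFGH"]
print(round(cfld(real, simulated), 2))  # 0.19
```

For logs of realistic size, the permutation search would have to be replaced by a proper assignment solver; the cost matrix stays the same.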
As with the CFLD, we are focusing on the control-flow, so the first step is to obtain the activity sequences of each event log.
Leemans et al. measure the quality of a stochastic process model by mapping the model and a log to their Directly-Follows Graph (DFG), viewing each DFG as a histogram, and measuring the distance between these histograms.
We note that the histogram of 2-grams of a log is equal to the histogram of its DFG. Given this observation, we generalize the approach of Leemans et al. to n-grams, noting that the histogram of n-grams of a log is equal to the (n-1)th-Markovian abstraction of the log.
Then… [NEXT]
Let's assume a size of N=3, so the N-grams are 3-grams (sequences of three activities).
We compute all 3-grams observed in both logs, adding two dummy activities at the start and end of each trace.
Then… [NEXT]
We measure the frequency of each N-gram in each log…
And compute the sum of absolute differences between them, normalized by the sum of frequencies of all n-grams (a value between 0 and 1).
NGD is considerably more efficient than CFLD, as the construction of the histogram of n-grams is linear in the number of events in the log, and the same goes for computing the differences between the n-gram histograms.
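A minimal sketch of this computation on the example logs from the slides follows. The function names are hypothetical, and two details are assumptions: each trace is padded with n-1 dummy activities on both sides (two for N=3, as described above), and the normalization divides by the total n-gram count of both logs together.

```python
from collections import Counter

def ngram_histogram(log, n, pad="_"):
    """Count the n-grams of every trace, padding with n-1 dummy activities."""
    counts = Counter()
    for trace in log:
        padded = [pad] * (n - 1) + list(trace) + [pad] * (n - 1)
        for i in range(len(padded) - n + 1):
            counts[tuple(padded[i:i + n])] += 1
    return counts

def ngd(log1, log2, n=3):
    """Sum of absolute frequency differences, normalized to [0, 1]."""
    h1, h2 = ngram_histogram(log1, n), ngram_histogram(log2, n)
    diff = sum(abs(h1[g] - h2[g]) for g in h1.keys() | h2.keys())
    # Assumption: normalize by the n-gram counts of both logs combined.
    total = sum(h1.values()) + sum(h2.values())
    return diff / total if total else 0.0

# One character per activity, as in the slide example.
real = ["ABCD", "ABCD", "ACBD", "AEFGH", "AEFGI"]
simulated = ["ABCD", "ACBD", "AEFG", "AEFGH", "AEFGH"]
print(ngd(real, simulated, n=3))
```

Both the histogram construction and the comparison make a single pass over the data, which matches the linear cost mentioned above.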
For the temporal measures, we do the opposite of the control-flow measures: we abstract away the control-flow information… [NEXT]
retaining only the events (in this case start and end)… [NEXT]
Then we discretize these events into bins of 1h, obtaining a time series with the number of events happening in each hour of the process timeline.
Once we have the temporal distribution (not a probability distribution, just the events occurring along the timeline), we compare both time series with the EMD to measure the distance.
We measure the trend.
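The binning and comparison just described can be sketched as below. The helper names and timestamps are hypothetical, and the sketch assumes that each histogram is rescaled to unit mass and that the EMD is computed in its one-dimensional form (accumulated difference between the two cumulative distributions), so the result is expressed in hours.

```python
from collections import Counter
from datetime import datetime

def hourly_histogram(timestamps, origin):
    """Count events per one-hour bin, indexed by whole hours since `origin`."""
    return Counter(int((t - origin).total_seconds() // 3600) for t in timestamps)

def emd_1d(h1, h2):
    """Earth mover's distance between two 1D histograms, in bin (hour) units.

    Assumption: each histogram is rescaled to unit mass before comparison.
    """
    n1, n2 = sum(h1.values()), sum(h2.values())
    if n1 == 0 or n2 == 0:
        raise ValueError("both histograms need at least one event")
    bins = set(h1) | set(h2)
    carried = distance = 0.0
    for b in range(min(bins), max(bins) + 1):
        carried += h1[b] / n1 - h2[b] / n2  # surplus mass pushed to the next bin
        distance += abs(carried)
    return distance

def aed(real, simulated):
    # Align bins to the clock hour of the earliest event in either log.
    origin = min(real + simulated).replace(minute=0, second=0, microsecond=0)
    return emd_1d(hourly_histogram(real, origin), hourly_histogram(simulated, origin))

# Hypothetical event timestamps: the simulated events happen one hour later.
real = [datetime(2022, 10, 6, 10, 15), datetime(2022, 10, 6, 10, 40),
        datetime(2022, 10, 6, 11, 5)]
simulated = [datetime(2022, 10, 6, 11, 10), datetime(2022, 10, 6, 11, 30),
             datetime(2022, 10, 6, 12, 20)]
print(aed(real, simulated))
```

In this toy example every simulated event falls one bin later than its real counterpart, so the distance comes out at roughly one hour.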
The same process is followed for the next measure, but in this case discretized to weekdays. In this way, we measure the seasonality of the events happening in the process.
The temporal distribution of events for each day of the week is compared, and then we compute the average distance over the 7 days.
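A rough sketch of the weekday variant follows. It assumes one 24-bin hour-of-day histogram per weekday and a unit-mass, one-dimensional EMD. As a simplification, it averages only over the weekdays that appear in both logs; how to treat a weekday missing from one of the logs is left open here. All names and timestamps are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime

def emd_1d(h1, h2, bins):
    """1D earth mover's distance over ordered `bins`, histograms scaled to unit mass."""
    n1, n2 = sum(h1.values()), sum(h2.values())
    carried = distance = 0.0
    for b in bins:
        carried += h1[b] / n1 - h2[b] / n2
        distance += abs(carried)
    return distance

def weekday_hour_histograms(timestamps):
    """Map weekday (0 = Monday) to a histogram over the hour of the day."""
    hists = defaultdict(Counter)
    for t in timestamps:
        hists[t.weekday()][t.hour] += 1
    return hists

def ced(real, simulated):
    h_real = weekday_hour_histograms(real)
    h_sim = weekday_hour_histograms(simulated)
    # Simplification: only weekdays observed in both logs are compared.
    shared = sorted(set(h_real) & set(h_sim))
    if not shared:
        raise ValueError("the logs share no weekday")
    return sum(emd_1d(h_real[d], h_sim[d], range(24)) for d in shared) / len(shared)

# Hypothetical events: Thursday shifted by two hours, Friday unchanged.
real = [datetime(2022, 10, 6, 10, 30), datetime(2022, 10, 7, 11, 30)]
simulated = [datetime(2022, 10, 13, 12, 30), datetime(2022, 10, 14, 11, 45)]
print(ced(real, simulated))
```

Because the calendar date is discarded, events from different weeks land in the same weekday histogram, which is what lets the measure capture weekly seasonality.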
Finally, for the third one, we focus on how the events are distributed within their corresponding trace.
For this, we compute the time from the case arrival to the event happening and bin it in hours.
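The abstraction step for this measure can be sketched as follows; the comparison of the resulting histograms then proceeds with the EMD as for the other temporal measures. The input layout (a mapping from case id to event timestamps) and the function name are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime

def relative_histogram(cases):
    """Bin every event by the whole hours elapsed since its case started.

    `cases` maps a case id to the list of event timestamps of that case
    (hypothetical layout; the case start is taken as its earliest event).
    """
    hist = Counter()
    for events in cases.values():
        start = min(events)
        for t in events:
            hist[int((t - start).total_seconds() // 3600)] += 1
    return hist

# Hypothetical cases; the first one has an event 01:01:47 after its start.
cases = {
    "c1": [datetime(2022, 10, 6, 10, 0, 0), datetime(2022, 10, 6, 11, 1, 47)],
    "c2": [datetime(2022, 10, 7, 9, 0), datetime(2022, 10, 7, 9, 30)],
}
print(relative_histogram(cases))
```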
For the case arrival rate, we want to measure how different the arrival of cases is.
So, we retain only the events denoting the arrival of each case (start of the first activity instance).
Then we build the distribution in the same way as for the previous metrics… [NEXT]
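A short sketch of the arrival extraction and binning is shown below. It assumes a hypothetical layout in which each case maps to the start timestamps of its activity instances, so that the earliest one is the case arrival; the one-hour bins are aligned to the clock hour of the first arrival.

```python
from collections import Counter
from datetime import datetime

def arrival_histogram(cases):
    """Keep only the arrival of each case and count arrivals per one-hour bin."""
    # Assumption: the arrival is the earliest start timestamp of the case.
    arrivals = [min(starts) for starts in cases.values()]
    origin = min(arrivals).replace(minute=0, second=0, microsecond=0)
    return Counter(int((t - origin).total_seconds() // 3600) for t in arrivals)

# Hypothetical cases with the start times of their activity instances.
cases = {
    "c1": [datetime(2022, 10, 6, 10, 5), datetime(2022, 10, 6, 10, 45)],
    "c2": [datetime(2022, 10, 6, 10, 50), datetime(2022, 10, 6, 13, 0)],
    "c3": [datetime(2022, 10, 6, 12, 20)],
}
print(arrival_histogram(cases))
```

The histograms of the real and simulated logs would then be compared with the EMD, in the same way as the temporal measures.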