Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Complete and Interpretable Conformance Checking of Business Processes

748 views

Published on

Presentation of our paper with the same title available at: https://eprints.qut.edu.au/91552/

Published in: Education
  • Be the first to comment

  • Be the first to like this

Complete and Interpretable Conformance Checking of Business Processes

  1. 1. Complete and Interpretable Conformance Checking of Business Processes Luciano García-Bañuelos University of Tartu, Estonia Nick van Beest Data61 | CSIRO, Australia Marlon Dumas University of Tartu, Estonia Marcello La Rosa Queensland University of Technology, Australia Willem Mertens Queensland University of Technology, Australia
  2. 2. Conformance checking 1. Compliance auditing – detect deviations with respect to a normative model (unfitting behavior) 2. Model maintenance – unfitting behavior – additional model behavior 3. Automated process model discovery – Iterative model improvement
  3. 3. Given a process model M and an event log L, explain the differences between the process behavior observed in M and L
  4. 4. State of the art Current approaches: • Are designed to identify the number and exact location of the differences • Don’t provide a “high-level” diagnosis that easily allows analysts to pinpoint differences: – Are unable to identify differences across traces – Are unable to fully characterize extra model behavior not present in the log
  5. 5. An example Desired conformance output: • task C is optional in the log • the cycle including IGDF is not observed in the log Log traces: ABCDEH ACBDEH ABCDFH ACBDFH ABDEH ABDFH
  6. 6. Our approach A method for business process conformance checking that: 1. Identifies all differences between the behavior in the model and the behavior in the log 2. Describes each difference via a natural language statement
  7. 7. How does it work? Difference statements Event log Input model PESM unfold PESL merge Partially Synchronized Product (PSP) compare extract differences
  8. 8. How does it work? Difference statements Event log Input model PESM unfold PESL merge Partially Synchronized Product (PSP) compare extract differences
  9. 9. How does it work? Difference statements Event log Input model PESM unfold PESL merge Partially Synchronized Product (PSP) compare extract differences
  10. 10. How does it work? Difference statements Event log Input model PESM unfold PESL merge Partially Synchronized Product (PSP) compare extract differences
  11. 11. Prime event structure (PES) A Prime Event Structure (PES) is a graph of events, where each event e represents the occurrence of a task in the modeled system (e.g. a business process) As such, multiple occurrences of the same task are represented by different events Pairs of events in a PES can have one of the following binary relations: • Causality: event e is a prerequisite for e' • Conflict: e and e' cannot occur in the same execution • Concurrency: no order can be established between e and e'
  12. 12. From event log to PES Log: Trace Ref N A B C E t1 3 A C B E t2 2 A B E t3 2 A D E t4 3 e0:A e1:B e2:C e3:E f0:A f1:B f2:E g0:A g1:D g2:E t1, t2 → p1 t3 → p2 t4 → p3 PO runs: {e0,f0,g0}:A
  13. 13. From event log to PES Log: Trace Ref N A B C E t1 3 A C B E t2 2 A B E t3 2 A D E t4 3 PO runs: {e0,f0,g0}:A {e1,f1}:B {e2}:C e0:A e1:B e2:C e3:E f0:A f1:B f2:E g0:A g1:D g2:E t1, t2 → p1 t3 → p2 t4 → p3
  14. 14. From event log to PES Log: Trace Ref N A B C E t1 3 A C B E t2 2 A B E t3 2 A D E t4 3 PO runs: {e0,f0,g0}:A {e1,f1}:B {e2}:C {g1}:D e0:A e1:B e2:C e3:E f0:A f1:B f2:E g0:A g1:D g2:E t1, t2 → p1 t3 → p2 t4 → p3
  15. 15. From event log to PES Log: Trace Ref N A B C E t1 3 A C B E t2 2 A B E t3 2 A D E t4 3 PO runs: {e0,f0,g0}:A {e1,f1}:B {e2}:C {g1}:D e0:A e1:B e2:C e3:E f0:A f1:B f2:E g0:A g1:D g2:E t1, t2 → p1 t3 → p2 t4 → p3
  16. 16. From event log to PES Log: Trace Ref N A B C E t1 3 A C B E t2 2 A B E t3 2 A D E t4 3 PO runs: {e0,f0,g0}:A {e1,f1}:B {f2}:E {e3}:E {g2}:E {e2}:C {g1}:D e0:A e1:B e2:C e3:E f0:A f1:B f2:E g0:A g1:D g2:E t1, t2 → p1 t3 → p2 t4 → p3
  17. 17. From event log to PES Log: Trace Ref N A B C E t1 3 A C B E t2 2 A B E t3 2 A D E t4 3 PO runs: {e0,f0,g0}:A {e1,f1}:B {f2}:E {e3}:E {g2}:E {e2}:C {g1}:D e0:A e1:B e2:C e3:E f0:A f1:B f2:E g0:A g1:D g2:E t1, t2 → p1 t3 → p2 t4 → p3
  18. 18. From model to PES BPMN model Petri net
  19. 19. From model to PES Branching process
  20. 20. From model to PES Complete prefix unfolding Cutoff event Corresponding event Cutoff event Corresponding event
  21. 21. PES prefix unfolding Complete prefix unfolding PES prefix unfolding Cutoff eventCorresponding event Corresponding event Cutoff event
  22. 22. Loop relations A C D D A B C D B C
  23. 23. Comparing PESs Log PES EL Model PES prefix unfolding EM e0:A e1:B e2:C e3:D e4:E e5:E e6:E Trace Ref N A B C E t1 3 A C B E t2 2 A B E t3 2 A D E t4 3 A B D E C f0:A f1:B f2:C f3:D f4:E f5:E
  24. 24. lh = {}, rh = {} m = {} Comparing PESs (cont’d) e0:A e1:B e2:C e3:D e4:E e5:E e6:E f0:A f1:B f2:C f3:D f4:E f5:E
  25. 25. lh = {}, rh = {} m = {(e0,f0)A} match A lh = {}, rh = {} m = {} Comparing PESs (cont’d) e0:A e1:B e2:C e3:D e4:E e5:E e6:E f0:A f1:B f2:C f3:D f4:E f5:E
  26. 26. match B lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B} lh = {}, rh = {} m = {(e0,f0)A} match A lh = {}, rh = {} m = {} Comparing PESs (cont’d) e0:A e1:B e2:C e3:D e4:E e5:E e6:E match Dmatch C f0:A f1:B f2:C f3:D f4:E f5:E
  27. 27. match B lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B} match C lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B,(e2,f2)C} lh = {}, rh = {} m = {(e0,f0)A} match A lh = {}, rh = {} m = {} Comparing PESs (cont’d) e0:A e1:B e2:C e3:D e4:E e5:E e6:E match Dmatch C f0:A f1:B f2:C f3:D f4:E f5:E
  28. 28. match B lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B} match C match E lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B,(e2,f2)C} lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B,(e2,f2)C,(e5,f4)E} lh = {}, rh = {} m = {(e0,f0)A} match A lh = {}, rh = {} m = {} Comparing PESs (cont’d) e0:A e1:B e2:C e3:D e4:E e5:E e6:E match Dmatch C f0:A f1:B f2:C f3:D f4:E f5:E
  29. 29. match B lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B} rhide Cmatch C lh = {}, rh = {f2:C} m = {(e0,f0)A,(e1,f1)B} lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B,(e2,f2)C} lh = {}, rh = {} m = {(e0,f0)A} match A lh = {}, rh = {} m = {} match E lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B,(e2,f2)C,(e5,f4)E} Comparing PESs (cont’d) e0:A e1:B e2:C e3:D e4:E e5:E e6:E match Dmatch C f0:A f1:B f2:C f3:D f4:E f5:E
  30. 30. match B lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B} rhide Cmatch C match E match E lh = {}, rh = {f2:C} m = {(e0,f0)A,(e1,f1)B} lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B,(e2,f2)C} lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B,(e2,f2)C,(e5,f4)E} lh = {}, rh = {f2:C} m = {(e0,f0)A,(e1,f1)B,(e4,f4)E} lh = {}, rh = {} m = {(e0,f0)A} match A lh = {}, rh = {} m = {} Comparing PESs (cont’d) e0:A e1:B e2:C e3:D e4:E e5:E e6:E match Dmatch C In the log, C is optional after {A,B}, whereas in the model it is not (task skipping) f0:A f1:B f2:C f3:D f4:E f5:E
  31. 31. Elementary mismatch patterns Unfitting behavior patterns: • Relation mismatch patterns 1. Causality-Concurrency 2. Conflict • Event mismatch patterns 3. Task skipping 4. Task substitution 5. Unmatched repetition 6. Task relocation 7. Task insertion / absence Additional model behavior patterns: 8. Unobserved acyclic interval 9. Unobserved cyclic interval
  32. 32. Example: Causality / Concurrency
  33. 33. Example: Task substitution
  34. 34. Unobserved cyclic interval: PES and PES prefix unfolding A B C D Log PES EL Model PES prefix unfolding EM
  35. 35. Pomsets (partially ordered multisets) • A pomset is a Directed Acyclic Graph where: – the nodes are configurations – the edges represent direct causality relations between configurations – an edge is labeled by an event • Unlike an event structure, a pomset does not have any conflict relation, since a pomset represents one possible execution • The behavior of a PES can be characterized by the set of pomsets it induces • In the case of a PES prefix, the set of induced pomsets is infinite when the PES prefix captures cyclic behavior via cc-pairs • We cannot enumerate all pomsets of a PES prefix to compare with the PES of the log • Therefore, we can extract a set of elementary pomsets (inspired by the notion of elementary paths), which collectively cover all the possible pomsets induced by a PES prefix • Cyclic behavior is not required to be unfolded infinitely
  36. 36. Unobserved cyclic interval: expanded prefix with elementary pomsets
  37. 37. Unobserved cyclic interval: creating a PSP using the expanded prefix Two unobserved elementary acyclic pomsets: • s3 [a5, a9] • s9 [a8, a9] Two unobserved elementary cyclic pomsets: • s5 [a3, a6, a4, a7] • s11 [a4, a7, a3, a6]
  38. 38. Verbalization of elementary mismatch patterns Change pattern Condition Verbalization Causality / Concurrency if e' < e else In the log, after σ, λ(e') occurs before λ(e), while in the model they are concurrent In the model, after σ, λ(f') occurs before λ(f), while in the log they are concurrent Conflict if e' || e else if f' || f else if e' < e else In the log, after σ, λ(e') and λ(e) are concurrent, while in the model they are mutually exclusive In the model, after σ, λ(f') and λ(f) are concurrent, while in the log they are mutually exclusive In the log, after σ, λ(e') occurs before task λ(e), while in the model they are mutually exclusive after σ In the model, after σ, λ(f') occurs before λ(f), while in the log they are mutually exclusive Task skipping if e ≠ ┴ else In the log, after σ, λ(e) is optional In the model, after σ, λ(f) is optional Task substitution In the log, after σ, λ(f) is substituted by λ(e) Unmatched repetition In the log, λ(e) is repeated after σ Task relocation if e ≠ ┴ else In the log, λ(e) occurs after σ instead of σ' In the model, λ(f) occurs after σ instead of σ'
  39. 39. Change pattern Condition Verbalization Task insertion / absence if e ≠ ┴ else In the log, λ(e) occurs after σ and before σ' In the model, λ(f) occurs after σ and before σ' Unobserved acyclic interval In the log, interval ... does not occur after σ Unobserved cyclic interval In the log, the cycle involving interval ... does not occur after σ Verbalization of elementary mismatch patterns
  40. 40. Implementation Standalone Java tool: ProConformance OSGi plugin for Apromore: Compare – Input: BPMN process model and a log (MXML or XES format). Also accepts: • Two BPMN models for model comparison and • Two logs for log delta analysis – Output: set of difference statements
  41. 41. Evaluation 1. Qualitative evaluation on real life process: – Traffic fines management process in Italy with 150,370 traces, 231 distinct traces 2. Quantitative evaluation on two large process model collections: – IBM Business Integration Unit (BIT): 735 models – SAP R/3: 604 models 3. User evaluation (academics vs practitioners)
  42. 42. Qualitative evaluation: traffic fines model Start Create Fine Payment Send Fine Insert Fine Notification Add Penalty Appeal to Judge Send for Credit Collection Notify Result Appeal to Offender Insert Date Appeal to Prefecture Receive Result Appeal from Prefecture Send Appeal to Prefecture End Tau10
  43. 43. Qualitative evaluation: trace alignment • Replay a Log on Petri Net for Conformance Analysis: 205 misalignments out of 231 alignments • Replay a Log on Petri Net for All Optimal Alignments: 406 misalignments out of 412 alignments
  44. 44. Qualitative evaluation: verbalization 15 distinct statements in total, e.g. 1. In the log, “Send for credit collection” occurs after “Payment” and before the end state 2. In the model, after “Insert fine notification”, “Add penalty” occurs before “Appeal to judge”, while in the log they are concurrent 3. In the log, after “Add penalty”, “Receive results appeal from prefecture” is substituted by “Appeal to judge” 4. In the log, the cycle involving “Insert date appeal to prefecture, Send appeal to prefecture, Receive result appeal from prefecture, Notify result appeal to offender” does not occur after “Insert fine notification”.
  45. 45. Qualitative evaluation: verbalization 2. In the model, after “Insert fine notification”, “Add penalty” occurs before “Appeal to judge”, while in the log they are concurrent 4. In the log, the cycle involving “Insert date appeal to prefecture, Send appeal to prefecture, Receive result appeal from prefecture, Notify result appeal to offender” does not occur after “Insert fine notification”. Cannot be detected by trace alignment, as diagnostics are provided at the level of individual traces Cannot be entirely detected by trace alignment, as this difference concerns additional model behavior, while alignment-based ETC conformance only detects escaping edges
  46. 46. Qualitative evaluation: summary Verbalization: • produces a more compact yet more understandable diagnosis • exposes behavioral differences that are difficult or impossible to identify using trace alignment
  47. 47. Quantitative evaluation • For each model, we generated an event log using the ProM plugin “Generate Event Log from Petri Net” • This plugin generates a distinct log trace for each possible execution sequence in the model • The tool was only able to parse 274 models from the BIT collection, and 438 models from the R/3 collection, running into out-of-memory exceptions for the remaining models • Total models: 712 sound Workflow nets
  48. 48. Quantitative evaluation: model complexity
  49. 49. Quantitative evaluation: log size Total log size (events)
  50. 50. Quantitative evaluation: time performance 0 100 200 300 400 500 600 700 800 0 50 100 150 200 250 300 Logsize(#events) Time (ms) No noise 5% noise 10% noise 15% noise 20% noise 0 2000 4000 6000 8000 10000 12000 0 0.5 1 1.5 2 Logsize(#events) Time (s) No noise 5% noise 10% noise 15% noise 20% noise BIT SAP Trace alignmentVerbalization BIT SAP 0 100 200 300 400 500 600 700 800 0 50 100 150 200 250 300 Logsize(#events) Time (ms) No noise 5% noise 10% noise 15% noise 20% noise 0 2000 4000 6000 8000 10000 0 10 20 30 40 50 60 70 80 90 100 Logsize(#events) Time (s) No noise 5% noise 10% noise 15% noise 20% noise
  51. 51. Quantitative evaluation: results Statements Misalignments Escaping edges
  52. 52. Quantitative evaluation: summary • Verbalization, although generally slower than trace alignment, shows reasonable execution times (within 10s) • Extreme cases: (logs with over 8,000 events in distinct traces) and a high number of differences, the execution time is still below 2 minutes • Verbalization consistently produces a more compact difference diagnosis than trace alignment
  53. 53. User evaluation • Online survey: – a simple Petri net with 31 nodes (10 visible transitions), created from a real-life claims handling process model – assumed that this model was accompanied by a log with 53 traces • Output of the alignment method (misalignments + Petri net with alignment information) overlaid vs • Output of the verbalization method (list of statements)
  54. 54. User evaluation • Respondents compared both methods using the Technology Acceptance Model: 1. What is the easiest approach for checking the conformance of an event log to a process model? 2. What is the easiest approach for identifying the differences between a process model and an event log? 3. What is the most useful approach for checking the conformance of an event log to a process model? 4. What is the most useful approach for identifying the differences between a process model and an event log? 5. Which approach would you likely use for checking the conformance of an event log to a process model? 6. Which approach would you likely use for identifying the differences between a process model and an event log? • Seven point Likert-scale: “Strongly prefer Alignment” to “Strongly prefer Verbalization” • Background: academic vs professional • Experience in process modelling • Confidence in modelling with Petri nets
  55. 55. User evaluation: hypotheses H1: respondents would have a preference for verbalization H2: respondents with less experience, familiarity, confidence and competence in the use of Petri nets would have a stronger preference for verbalization
  56. 56. User evaluation: results • Academics (38 responses) – More familiar in working with Petri nets – More competent in working with Petri nets – Analysed and created more models in the past 12 months • Professionals (33 responses) – Less familiar with Petri nets – Mostly rely on professional training
  57. 57. User evaluation: results • H1: – Tested for the full sample and for the two cohorts separately – For the full sample there is no general preference for our method: the median was zero (“neutral”) – Professionals did show a preference for verbalization (especially along ease of use) while academics preferred alignment, so H1 is supported for the professionals cohort only • H2: – Respondents with more experience, familiarity, confidence and competence in working with Petri nets have a stronger preference for alignments – H2 is supported by the results
  58. 58. User evaluation: summary • Academics prefer alignment • Professionals prefer verbalization • Overall, people with less expertise in the use of Petri nets show a stronger preference for verbalization
  59. 59. Limitations of the approach • Input log is assumed to consist of sequences of event labels – timestamps are ignored – event payloads are ignored • Simplicity of the used concurrency oracle (a+), leading to occasional difficulties in the presence of – short loops – skipped and/or duplicated tasks • Lack of visual representation of differences (text only) • No option to use different levels of abstraction • No statistical support for differences: all equally important even if some may be very infrequent
  60. 60. Future work • Employing a more accurate concurrency oracle (e.g. local a)* • Group related statements to trade accuracy with greater interpretability • Add statistical support to statements • Visual representation of differences in addition to natural language statements (e.g. via “representative” runs) • Capturing non-control-flow deviance – Analysis of underlying data – Resources – Temporal aspects • Use differences as a basis for model repair *Armas-Cervantes, A., Dumas, M., & La Rosa, M. (2016) Discovering Local Concurrency Relations in Business Process Event Logs, https://eprints.qut.edu.au/97615
  61. 61. Thank you for your attention

×