Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our User Agreement and Privacy Policy.

Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our Privacy Policy and User Agreement for details.

Successfully reported this slideshow.

Like this presentation? Why not share!

- What to Upload to SlideShare by SlideShare 5314746 views
- Customer Code: Creating a Company C... by HubSpot 3989200 views
- Be A Great Product Leader (Amplify,... by Adam Nash 891111 views
- Trillion Dollar Coach Book (Bill Ca... by Eric Schmidt 1059043 views
- APIdays Paris 2019 - Innovation @ s... by apidays 1184121 views
- A few thoughts on work life-balance by Wim Vanderbauwhede 928617 views

1,343 views

Published on

Published in:
Data & Analytics

No Downloads

Total views

1,343

On SlideShare

0

From Embeds

0

Number of Embeds

90

Shares

0

Downloads

131

Comments

7

Likes

6

No notes for slide

- 1. Introduction to Business Process Monitoring and Process Mining Marlon Dumas University of Tartu, Estonia marlon.dumas@ut.ee
- 2. 3 months later 2
- 3. 1. Any process is better than no process 2. A good process is better than a bad process 3. Even a good process can be improved 4. Any good process eventually becomes a bad process • […unless continuously cared for] • Michael Hammer Back to basics… 3
- 4. 4
- 5. Part 1: Techniques for Business Process Monitoring 5
- 6. Business Process Monitoring Dashboards & reports Process miningEvent stream DB logs Event log
- 7. Process Dashboards Operational dashboards (runtime) Tactical dashboards (historical) Strategic dashboards (historical) Types of process dashboards
- 8. Operational process dashboards • Aimed at process workers & operational managers • Emphasis on monitoring (detect-and-respond), e.g.: - Work-in-progress - Problematic cases – e.g. overdue/at-risk cases - Resource load
- 9. • Aimed at process owners / managers • Emphasis on analysis and management • E.g. detecting bottlenecks • Typical process performance indicators • Cycle times • Error rates • Resource utilization Tactical dashboards
- 10. Tactical Performance Dashboard @ Australian Insurer
- 11. • Aimed at executives & managers • Emphasis on linking process performance to strategic objectives Strategic dashboards
- 12. Manage Unplanned Outages Manage Emergencies & Disasters Manage Work Programming & Resourcing Manage Procurement Customer Satisfaction 0.5 0.55 - 0.2 Customer Complaint 0.6 - - 0.5 Customer Feedback 0.4 - - 0.8 Connection Less Than Agreed Time 0.3 0.6 0.7 - Key Performance Process Strategic Performance Dashboard @ Australian Utilities Provider
- 13. Process: Manage Emergencies & Disasters Process: Manage Procurement Process: Manage Unplanned Outages Overall Process Performance Financial People Customer Excellence Operational Excellence Risk Management Health & Safety Customer Satisfaction Customer Complaint Customer Rating (%) Customer Loyalty Index Average Time Spent on Plan 1st Layer Key Result Area 2nd Layer Key Performance Satisfied Customer Index Market Share (%) 3rd & 4th Layer Process Performance Measures 0.65 0.6 0.7 0.7 0.6 0.8 0.4 0.8 0.5 0.4 0.5 0.8 0.4 0.54 0.58 0.67
- 14. Sketch operational and tactical process monitoring dashboards for CVS Pharmacy’s prescription fulfillment process. Consider the viewpoints of each stakeholder in the process. Teamwork
- 15. Business Process Monitoring Dashboards & reports Process miningEvent stream DB logs Event log
- 16. Business Process Monitoring Dashboards & reports Process miningEvent stream DB logs Event log
- 17. Process Mining 17 / event log discovered model Process Discovery Conformance Checking Variants Analysis Difference diagnostics Performance Mining input model Enhanced model event log’
- 18. Event logs structure: minimum requirements Concrete formats: • Comma-Separated Values (CSV) • XES (XML format)
- 19. Automated Process Discovery 19 Enter Loan Application Retrieve Applicant Data Compute Installments Approve Simple Application Approve Complex Application Notify Rejection Notify Eligibility CID Task Time Stamp … 13219 Enter Loan Application 2007-11-09 T 11:20:10 - 13219 Retrieve Applicant Data 2007-11-09 T 11:22:15 - 13220 Enter Loan Application 2007-11-09 T 11:22:40 - 13219 Compute Installments 2007-11-09 T 11:22:45 - 13219 Notify Eligibility 2007-11-09 T 11:23:00 - 13219 Approve Simple Application 2007-11-09 T 11:24:30 - 13220 Compute Installements 2007-11-09 T 11:24:35 - … … … …
- 20. Process Mining Tools Open-source • Apromore • ProM • bupaR Lightweight • Disco Mid-range • Minit • myInvenio • ProcessGold • QPR Process Analyzer • Signavio Process Intelligence • StereoLOGIC Discovery Analyst Heavyweight • ARIS Process Performance Manager • Celonis Process Mining • Perceptive Process Mining (Lexmark) • Interstage Process Discovery (Fujitsu) 20
- 21. Fluxicon Disco 21
- 22. Process Maps • A process map of an event log is a graph where: • Each activity is represented by one node • An arc from activity A to activity B means that B is directly followed by A in at least one trace in the log • Arcs in a process map can be annotated with: • Absolute frequency: how many times B directly follows A? • Relative frequency: in what percentage of times when A is executed, it is directly followed by B? • Time: What is the average time between the occurrence of A and the occurrence of B? 22
- 23. Process Maps – Example 23 Event log: 10: a,b,c,g,e,h 10: a,b,c,f,g,h 10: a,b,d,g,e,h 10: a,b,d,e,g,h 10: a,b,e,c,g,h 10: a,b,e,d,g,h 10: a,c,b,e,g,h 10: a,c,b,f,g,h 10: a,d,b,e,g,h 10: a,d,b,f,g,h
- 24. Process Maps – Exercise Case ID Task Name Originator Timestamp Case ID Task Name Originator Timestamp 1 File Fine Anne 20-07-2004 14:00:00 3 Reminder John 21-08-2004 10:00:00 2 File Fine Anne 20-07-2004 15:00:00 2 Process Payment system 22-08-2004 09:05:00 1 Send Bill system 20-07-2004 15:05:00 2 Close case system 22-08-2004 09:06:00 2 Send Bill system 20-07-2004 15:07:00 4 Reminder John 22-08-2004 15:10:00 3 File Fine Anne 21-07-2004 10:00:00 4 Reminder Mary 22-08-2004 17:10:00 3 Send Bill system 21-07-2004 14:00:00 4 Process Payment system 29-08-2004 14:01:00 4 File Fine Anne 22-07-2004 11:00:00 4 Close Case system 29-08-2004 17:30:00 4 Send Bill system 22-07-2004 11:10:00 3 Reminder John 21-09-2004 10:00:00 1 Process Payment system 24-07-2004 15:05:00 3 Reminder John 21-10-2004 10:00:00 1 Close Case system 24-07-2004 15:06:00 3 Process Payment system 25-10-2004 14:00:00 2 Reminder Mary 20-08-2004 10:00:00 3 Close Case system 25-10-2004 14:01:00 24
- 25. Process Maps in Disco • Disco (and other commercial process mining tools) use process maps as the main visualization technique for event logs • These tools also provide three types of operations: 1. Abstract the process map: • Show only most frequent activities • Show only most frequent arcs 2. Filter the traces in the event log… 25
- 26. Types of filters • Event filters • Retain only events that fulfil a given condition (e.g. all events of type “Create purchase order”) • Performance filter • Retain traces that have a duration above or below a given value • Event pair filter (a.k.a. “follower” filter) • Retain traces where there is a pair of events that fulfil a given condition (e.g. “Create invoice” followed by “Create purchase order”) • Endpoint filter • Retain traces that start with or finish with an event that fulfils a given condition 26
- 27. Process Maps in Disco • Disco (and other commercial process mining tools) use process maps as the main visualization technique for event logs • These tools also provide three types of operations: 1. Abstract the process map: • Show only most frequent activities • Show only most frequent arcs 2. Filter the traces in the event log 3. Enhance the process map 27
- 28. Process Map Enhancement • Nodes and arcs in a process map can be color- coded or thickness-coded to capture: • Frequency: How often a given task or a given directly- follows relation occurs? • Time performance: processing times, waiting times, cycles times of tasks • More advanced tools support enhancement by other attributes, e.g. cost, revenue, etc. if the data is available. 28
- 29. Hands-on Disco demo 29
- 30. Using Disco, answer the following questions on the PurchasingExample log: • How many cases had to settle a dispute with the purchasing agent? • Is there a difference in cycle time for the cases that had to settle a dispute with the purchasing agent, compared to the ones that did not? Make sure you only compare cases that actually reach the endpoint ‘Pay invoice’ • Are there any cases where the invoice is released and authorized by the same resource? And if so, who is doing this most often? Exercise Exercise by Anne Rozinat, Fluxicon
- 31. Consider the dataset of a refund process from an electronics manufacturer. Customer complaints and the inspection of individual cases indicate that this process suffers from inefficiencies and overly long cycle times. Assume that only cases that have reached the ‘Order completed’ event are finished. Questions: 1. Is it a problem if you take the average cycle time of all cases, also the ones that have not finished yet? 2. In general, which channel(s) have the biggest problems with missing documents that need to be requested from the customer? 3. How many customers have received a refund without the product being received by the electronics manufacturing company? This should not happen in this process. 4. Has a customer ever received a double payment? This should not happen in this process. To complete this exercise use the log of RefundProcess.fbt One more exercise: Refund process
- 32. Process Maps - Limitations • Process maps over-generalize: some paths of a process map might not exist and might not make sense • Example: Draw the process map of [ abc, adc, afce, afec ] and check which traces it can recognize, for which there is no support in the event log. • Process maps make it difficult to distinguish conditional branching, parallelism, and loops. • See previous example… or a simpler one: [abcd, acbd] • Solution: automated BPMN process discovery • More on this tomorrow… 33
- 33. Process Mining 34 / event log discovered model Process Discovery Conformance Checking Variants Analysis Difference diagnostics Performance Mining input model Enhanced model event log’
- 34. • Dotted charts • One line per trace, each line contains points, one point per event • Each event type is mapped to a colour • Position of the point denotes its occurrence time (in a relative scale) • Birds-eye view of the timing of different events (e.g. activity end times), but does not allow one to see the “processing” times of activities • Timeline diagrams • One line per trace, each line contains segments capturing the start and end of tasks • Captures process time (unlike dotted charts) • Not scalable for large event logs – good to show “representative” traces • Performance-enhanced process maps • Process maps where nodes are colour-coded w.r.t a performance measure. Nodes may represent activities (default option) • But they may represent resources and then arcs denote hands-offs between resources Process Performance Mining
- 35. Slide 36 Dotted chart
- 36. Slide 37 Timeline diagram See: http://timelines.nirdizati.org
- 37. Slide 38 Performance-enhanced process map Nodes are activities (default) Screenshot of Disco
- 38. Slide 39 Performance-enhanced process map Nodes are tasks (handoff graph) Screenshot of MyInvenio
- 39. Exercise • Consider the following event log of a telephone repair process: http://tinyurl.com/repairLogs • What are the bottlenecks in this process? • Which task has the longest waiting time and which one has the longest processing time? 40
- 40. Process Mining 41 / event log discovered model Process Discovery Conformance Checking Variants Analysis Difference diagnostics Performance Mining input model Enhanced model event log’
- 41. Given two logs, find the differences and root causes for variation or deviance between the two logs Variants Analysis ≠
- 42. Case Study: Variants Analysis at Suncorp OK OK Good Bad Expected Performance Line
- 43. Simple claims and quick Simple claims and slow Variants Analysis via Process Map Comparison ? S. Suriadi et al.: Understanding Process Behaviours in a Large Insurance Company in Australia: A Case Study. CAiSE 2013
- 44. Variants analysis - Exercise We consider a process for handling health insurance claims, for which we have extracted two event logs, namely L1 and L2. Log L1 contains all the cases executed in 2011, while L2 contains all cases executed in 2012. The logs are available in the book’s companion website or directly at: http://tinyurl.com/InsuranceLogs Based on these logs, answer the following questions using a process mining tool: 1. What is the cycle time of each log? 2. Where are the bottlenecks (highest waiting times) in each of the two logs and how do these bottlenecks differ? 3. Describe the differences between the frequency of tasks and the order in which tasks are executed in 2011 (L1) versus 2012 (L2). Hint: If you are using process maps, you should consider using the abstraction slider in your tool to hide some of the most infrequent arcs so as to make the maps more readable 45
- 45. Process Mining 46 / event log discovered model Process Discovery Conformance Checking Variants Analysis Difference diagnostics Performance Mining input model Enhanced model event log’
- 46. Conformance Checking 47 ≠
- 47. Conformance Checking: Unfitting vs. Additional Behavior Unfitting behaviour: • Task C is optional (i.e. may be skipped) in the log Additional behavior: • The cycle including IGDF is not observed in the log Event log: ABCDEH ACBDEH ABCDFH ACBDFH ABDEH ABDFH
- 48. Conformance Checking in Apromore 49 Full demo at: https://www.youtube.com/watch?v=3d00pORc9X8
- 49. Open-source tools: Apromore (apromore.org)• Open-source BPM analytics platform as Software as a Service • Focus is on end users (business analytics and operations managers), not on data scientists • Over 40 plugins ! !
- 50. Key features • Repository of process models and event logs (BPMN, AML, XPDL, EPML, AML, YAWL, XES, MXML) • Offers a range of features along the BPM lifecycle: From logs • Automated discovery of BPMN models • Filter noise from log • Visualize log • Mine process stages From models • Structure model On logs • Animate logs • Compare model-log, log-log • Detect and characterize drifts • Measure log complexity • Mine process performance On models • Measure model complexity • Compare model-model • Detect clones • Search similar models • Simulate model On models • Merge model variants • Configure model with questionnaire From logs • Animate logs • Compare model-log, log-log • Detect and characterize drifts • Mine process performance • Predict outcomes and performance (via Nirdizati)
- 51. Access Apromore You can access it in the cloud or download and install a standalone version Cloud-version • Node 1(Estonia): http://apromore.cs.ut.ee • Node 2 (Australia): http://apromore.qut.edu.au Standalone • One-click: a lightweight version of Apromore. Simply unzip and run from localhost • Full-fledged: for developers and advanced users, this distribution gives you full control over Apromore Source code • Apromore’s source code is open-source, licensed under LGPL 3.0 • The code can be accessed from GitHub
- 52. ProM: the very first process mining tool • 600+ plug-ins available for the whole process mining spectrum • Open source license • Download it from www.processmining.org
- 53. Nirdizati: predictive process monitoring (nirdizati.com)• Predict process outcome (e.g. “Is this loan offer going to be rejected?”) • Predict process performance (e.g. “Will this claim take longer than 5 days to be handled?”) • Predict future events (e.g. “What activity is likely to be executed next? And after that?”)
- 54. Part 2: Process Mining Algorithms 55
- 55. BPMN-Based Process Mining 56 / event log discovered model Process Discovery Conformance Checking Variants Analysis Difference diagnostics Performance Mining input model Enhanced model event log’
- 56. Conformance Checking: Unfitting vs. Additional Behavior Unfitting behaviour: • Task C is optional (i.e. may be skipped) in the log Additional behavior: • The cycle including IGDF is not observed in the log Event log: ABCDEH ACBDEH ABCDFH ACBDFH ABDEH ABDFH
- 57. 58 Process Model Log Unfitting behavior (lack of fitness) Additional behavior (lack of precision) Lack of generalization Accuracy of Automatically Discovered Process Models
- 58. Accuracy of Automatically Discovered Process Models • Fitness: To what extent the behaviour observed in the event log fits the process model? • No unfitting behaviour Fitness = 1 • Precision: How much additional behaviour the process model allows that is not observed in the event log • No additional behaviour Precision = 1 • Generalization (of an algorithm): If we have a (partial) event log of a process, to what extent the discovery algorithm produces models that fit the behaviour of the process that is not observed in the log 59
- 59. Measuring Fitness • Replay • Replay each trace against the model • When a parsing error occurs, repair it locally • Keep track of the “parsing error” • Does not calculate an exact distance measure! • Optimal Trace Alignment • For each trace in the model t, find the trace t’ of the process model such that the string-edit distance of t and t’ is minimal • Use the string-edit distances • Calculates a “distance” between log and model 60
- 60. Conformance Checking via Replay A B C E
- 61. Conformance Checking via Replay B C E
- 62. Conformance Checking via Replay B C E
- 63. Conformance Checking via Replay C E
- 64. Conformance Checking via Replay C E
- 65. Conformance Checking via Replay E
- 66. Conformance Checking via Replay E
- 67. Conformance Checking via Replay E
- 68. Conformance Checking via Replay
- 69. Conformance Checking via Replay
- 70. Conformance Checking via Replay A C E
- 71. Conformance Checking via Replay C E
- 72. Conformance Checking via Replay C E
- 73. Conformance Checking via Replay C E
- 74. Conformance Checking via Replay E
- 75. Conformance Checking via Replay E
- 76. Conformance Checking via Replay E missing token
- 77. Conformance Checking via Replay E
- 78. Conformance Checking via Replay
- 79. Conformance Checking via Replay
- 80. Conformance Checking via Replay remaining token
- 81. Accuracy of automatically discovered process models The accuracy of an automatically discovered process models consists of three quality dimensions: 1. Fitness: the discovered model should allow for the behavior seen in the event log. A model has a perfect fitness if all traces in the log can be replayed from the beginning to the end.
- 82. Accuracy of process models The accuracy of an automatically discovered process models consists of three quality dimensions: 1. Fitness 2. Precision (avoid underfitting): the discovered model should not allow for behavior completely unrelated to what was seen in the event log.
- 83. The accuracy of an automatically discovered process models consists of three quality dimensions: 1. Fitness: 2. Precision (avoid underfitting) 3. Generalization (avoid overfitting): the discovered model should generalize the example behavior seen in the event log. Accuracy of process models
- 84. Flower model (underfitting) L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l> 360}
- 85. Enumerating model (overfitting) L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l> 360}
- 86. Something in the middle… L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l>360}
- 87. Computing fitness: basic approach L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l>360}
- 88. Computing fitness: basic approach L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l> 360}
- 89. Computing fitness: basic approach A B I J K L
- 90. Computing fitness: basic approach A B I J K L
- 91. Computing fitness: basic approach B I J K L
- 92. Computing fitness: basic approach B I J K L
- 93. Computing fitness: basic approach I J K L
- 94. Computing fitness: basic approach I J K L
- 95. Computing fitness: basic approach I J K L non-conformance
- 96. Computing fitness: basic approach L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l> 360} A “basic approach” to compute fitness is to count the fraction of cases that can be “parsed completely” (i.e., the proportion of cases corresponding to firing sequences leading from [start] to [end]).
- 97. Computing fitness: basic approach L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l> 360} A “basic approach” to compute fitness is to count the fraction of cases that can be “parsed completely” (i.e., the proportion of cases corresponding to firing sequences leading from [start] to [end]). Fitness = 0.97
- 98. Computing fitness: Event-based approach • In the simple fitness computation, we stopped replaying a trace once we encounter a problem and mark it as non-fitting. • An event-based approach to calculate fitness consists of just continue replaying the trace on the model and: • record all situations where a transition is forced to fire without being enabled, i.e., we count all missing tokens. • record the tokens that remain at the end. • Use of four counters: • p = produced tokens • c = consumed tokens • m = missing tokens • r = remaining tokens
- 99. Computing fitness: Event-based approach L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l> 360}
- 100. Computing fitness: Event-based approach L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l> 360}
- 101. Computing fitness: Event-based approach A B I J K L
- 102. Computing fitness: Event-based approach p = 1 c = 0 m = 0 r = 0 A B I J K L
- 103. Computing fitness: Event-based approach p = 1 c = 1 m = 0 r = 0 p = 1 c = 0 m = 0 r = 0 B I J K L
- 104. Computing fitness: Event-based approach p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 B I J K L p = 1 c = 0 m = 0 r = 0
- 105. Computing fitness: Event-based approach p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 I J K L p = 1 c = 1 m = 0 r = 0 p = 1 c = 0 m = 0 r = 0
- 106. Computing fitness: Event-based approach p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 I J K L p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 0 m = 0 r = 0
- 107. Computing fitness: Event-based approach p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 I J K L p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 0 m = 0 r = 1 p = 0 c = 0 m = 1 r = 0
- 108. Computing fitness: Event-based approach p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 I J K L p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 0 m = 0 r = 1 p = 0 c = 1 m = 1 r = 0 p = 1 c = 0 m = 0 r = 0 p = 1 c = 0 m = 0 r = 0
- 109. Computing fitness: Event-based approach p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 J K L p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 0 m = 0 r = 1 p = 0 c = 1 m = 1 r = 0 p = 1 c = 0 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 0 m = 0 r = 0
- 110. Computing fitness: Event-based approach p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 K L p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 0 m = 0 r = 1 p = 0 c = 1 m = 1 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 0 m = 0 r = 0 p = 1 c = 0 m = 0 r = 0
- 111. Computing fitness: Event-based approach p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 L p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 0 m = 0 r = 1 p = 0 c = 1 m = 1 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 0 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 0 m = 0 r = 0
- 112. Computing fitness: Event-based approach p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 L p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 0 m = 0 r = 1 p = 0 c = 1 m = 1 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 0 m = 0 r = 0
- 113. Computing fitness: Event-based approach p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 0 m = 0 r = 1 p = 0 c = 1 m = 1 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 0 m = 0 r = 0
- 114. Computing fitness: Event-based approach p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 0 m = 0 r = 1 p = 0 c = 1 m = 1 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 12 c = 12 m = 1 r = 1
- 115. Computing fitness: Event-based approach L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l> 360}
- 116. Computing fitness: Event-based approach L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l> 360} p = 12 c = 12 m = 1 r = 1
- 117. Computing fitness: Event-based approach L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l>360} p = 12 c = 12 m = 1 r = 1 Fitness = 0.9166
- 118. Computing fitness: Event-based approach L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l>360}
- 119. Computing fitness: Event-based approach L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l>360} p = 13 c = 13 m = 0 r = 0
- 120. Computing fitness: Event-based approach L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l>360} p = 13 c = 13 m = 0 r = 0 Fitness = 1
- 121. Computing fitness: Event-based approach L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l>360}
- 122. Computing fitness: Event-based approach L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l>360} p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 0 m = 0 r = 1 p = 0 c = 0 m = 0 r = 0
- 123. Computing fitness: Event-based approach L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l>360} p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 1 m = 0 r = 0 p = 1 c = 0 m = 0 r = 1 p = 0 c = 0 m = 0 r = 0 p = 12 c = 11 m = 0 r = 1
- 124. Computing fitness: Event-based approach L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l>360} p = 12 c = 11 m = 0 r = 1 Fitness = 0.9583
- 125. Computing fitness: Event-based approach L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l>360}
- 126. Computing fitness: Event-based approach L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l>360} p = 13 c = 13 m = 0 r = 0
- 127. Computing fitness: Event-based approach L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l> 360} p = 13 c = 13 m = 0 r = 0 Fitness = 1
- 128. Computing fitness at log level L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l>360}
- 129. Computing fitness at log level L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l>360} Number of occurrences of a specific trace in the log (e.g., if a trace σ appears 200 times in the log, L(σ) will be equal to 200 )
- 130. Computing fitness at log level L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l>360} p = 13 c = 13 m = 0 r = 0 p = 12 c = 12 m = 1 r = 1 p = 13 c = 13 m = 0 r = 0 p = 12 c = 11 m = 0 r = 1
- 131. Computing fitness at log level L= { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l>360} p = 13 c = 13 m = 0 r = 0 p = 12 c = 12 m = 1 r = 1 p = 13 c = 13 m = 0 r = 0 p = 12 c = 11 m = 0 r = 1 Fitness = 0.998
- 132. Calculating precision • Precision = 1 the behaviour allowed by the model is contained or equal to the behavior in the log • Precision close to 0 None of the behaviour in the model is observed in the log • Precision can be calculated as a “difference” between a state space representing the behaviour of the model, and a state space representing the behaviour of the log • Adriano Augusto et al. “Abstract-and-Compare: A Family of Scalable Precision Measures for Automated Process Discovery”. In Proceedings of BPM’2018 133
- 133. Measuring generalization of a process discovery algorithm via cross-validation L L1 L2 L3 L4 L5
- 134. L L1 L2 L3 L4 L5 N1 N2 N3 N4 N5 Discovery Measuring generalization of a process discovery algorithm via cross-validation
- 135. Measuring generalization of a process discovery algorithm via cross-validation L L1 L2 L3 L4 L5 N1 N2 N3 N4 N5 Fitness
- 136. Measuring generalization of a process discovery algorithm via cross-validation L L1 L2 L3 L4 L5 N1 N2 N3 N4 N5 Fitness
- 137. Measuring generalization of a process discovery algorithm via cross-validation L L1 L2 L3 L4 L5 N1 N2 N3 N4 N5 Fitness
- 138. Measuring generalization of a process discovery algorithm via cross-validation L L1 L2 L3 L4 L5 N1 N2 N3 N4 N5 Fitness
- 139. Measuring generalization of a process discovery algorithm via cross-validation L L1 L2 L3 L4 L5 N1 N2 N3 N4 N5 Fitness
- 140. Measuring generalization of a process discovery algorithm via cross-validation L L1 L2 L3 L4 L5 N1 N2 N3 N4 N5 Avg Fitness
- 141. Automated Discovery of BPMN Process Models 142
- 142. α-algorithm: the Origin of Process Discovery van der Aalst, W. M. P. and Weijters, A. J. M. M. and Maruster, L. (2003). Workflow Mining: Discovering process models from event logs, IEEE Transactions on Knowledge and Data Engineering
- 143. α-algorithm Basic Idea: Ordering relations • Direct succession: x>y iff for some case x is directly followed by y. • Causality: xy iff x>y and not y>x. • Parallel: x||y iff x>y and y>x • Unrelated: x#y iff not x>y and not y>x. case 1 : task A case 2 : task A case 3 : task A case 3 : task B case 1 : task B case 1 : task C case 2 : task C case 4 : task A case 2 : task B ... A>B A>C B>C B>D C>B C>D E>F AB AC BD CD EF B||C C||B ABCD ACBD EF
- 144. Basic Idea: Example
- 145. Basic Idea: Example
- 146. Basic Idea: Footprints
- 147. α-Algorithm • Idea (a)
- 148. α-Algorithm • Idea (a) a b
- 149. α-Algorithm • Idea (b)
- 150. α-Algorithm • Idea (b) a b, a c and b # c
- 151. α-Algorithm • Idea (c)
- 152. α-Algorithm • Idea (c) b d, c d and b # c
- 153. α-Algorithm • Idea (d)
- 154. α-Algorithm • Idea (d) a b, a c and b || c
- 155. α-Algorithm • Idea (e)
- 156. α-Algorithm • Idea (e) b d, c d and b || c
- 157. α-algorithm: Applicative Example α(L) = ? L= { <a,b,c,d,e,g> 3, <a,c,b,d,e,g> 2, <a,b,c,d,f,g>, <a,c,b,d,f,g> 4}
- 158. α-algorithm: Applicative Example ALPHABET L= { <a,b,c,d,e,g> 3, <a,c,b,d,e,g> 2, <a,b,c,d,f,g>, <a,c,b,d,f,g> 4}
- 159. α-algorithm: Applicative Example ALPHABET {a,b,c,d,e,f,g} L= { <a,b,c,d,e,g> 3, <a,c,b,d,e,g> 2, <a,b,c,d,f,g>, <a,c,b,d,f,g> 4}
- 160. α-algorithm: Applicative Example INITIAL ACTIVITIES L= { <a,b,c,d,e,g> 3, <a,c,b,d,e,g> 2, <a,b,c,d,f,g>, <a,c,b,d,f,g> 4}
- 161. α-algorithm: Applicative Example INITIAL ACTIVITIES {a} L= { <a,b,c,d,e,g> 3, <a,c,b,d,e,g> 2, <a,b,c,d,f,g>, <a,c,b,d,f,g> 4}
- 162. α-algorithm: Applicative Example FINAL ACTIVITIES L= { <a,b,c,d,e,g> 3, <a,c,b,d,e,g> 2, <a,b,c,d,f,g>, <a,c,b,d,f,g> 4}
- 163. α-algorithm: Applicative Example FINAL ACTIVITIES {g} L= { <a,b,c,d,e,g> 3, <a,c,b,d,e,g> 2, <a,b,c,d,f,g>, <a,c,b,d,f,g> 4}
- 164. α-algorithm: Applicative Example FOOTPRINTS L= { <a,b,c,d,e,g> 3, <a,c,b,d,e,g> 2, <a,b,c,d,f,g>, <a,c,b,d,f,g> 4} a b c d e f g a # > > # # # # b < # || > # # # c < || # > # # # d # < < # > > # e # # # < # # > f # # # < # # > g # # # # < < #
- 165. α-algorithm: Applicative Example FOOTPRINTS L= { <a,b,c,d,e,g> 3, <a,c,b,d,e,g> 2, <a,b,c,d,f,g>, <a,c,b,d,f,g> 4} a b c d e f g a # > > # # # # b < # || > # # # c < || # > # # # d # < < # > > # e # # # < # # > f # # # < # # > g # # # # < < #
- 166. α-algorithm: Applicative Example a b c d e f g a # > > # # # # b < # || > # # # c < || # > # # # d # < < # > > # e # # # < # # > f # # # < # # > g # # # # < < # FOOTPRINTS L= { <a,b,c,d,e,g> 3, <a,c,b,d,e,g> 2, <a,b,c,d,f,g>, <a,c,b,d,f,g> 4}
- 167. α-algorithm: Applicative Example a b c d e f g a # > > # # # # b < # || > # # # c < || # > # # # d # < < # > > # e # # # < # # > f # # # < # # > g # # # # < < # FOOTPRINTS L= { <a,b,c,d,e,g> 3, <a,c,b,d,e,g> 2, <a,b,c,d,f,g>, <a,c,b,d,f,g> 4}
- 168. α-algorithm: Applicative Example a b c d e f g a # > > # # # # b < # || > # # # c < || # > # # # d # < < # > > # e # # # < # # > f # # # < # # > g # # # # < < # FOOTPRINTS L= { <a,b,c,d,e,g> 3, <a,c,b,d,e,g> 2, <a,b,c,d,f,g>, <a,c,b,d,f,g> 4}
- 169. α-algorithm: Applicative Example a b c d e f g a # > > # # # # b < # || > # # # c < || # > # # # d # < < # > > # e # # # < # # > f # # # < # # > g # # # # < < # FOOTPRINTS L= { <a,b,c,d,e,g> 3, <a,c,b,d,e,g> 2, <a,b,c,d,f,g>, <a,c,b,d,f,g> 4}
- 170. α-algorithm: Applicative Example a b c d e f g a # > > # # # # b < # || > # # # c < || # > # # # d # < < # > > # e # # # < # # > f # # # < # # > g # # # # < < # FOOTPRINTS L= { <a,b,c,d,e,g> 3, <a,c,b,d,e,g> 2, <a,b,c,d,f,g>, <a,c,b,d,f,g> 4}
- 171. α-algorithm: Applicative Example a b c d e f g a # > > # # # # b < # || > # # # c < || # > # # # d # < < # > > # e # # # < # # > f # # # < # # > g # # # # < < # FOOTPRINTS L= { <a,b,c,d,e,g> 3, <a,c,b,d,e,g> 2, <a,b,c,d,f,g>, <a,c,b,d,f,g> 4}
- 172. α-algorithm: Applicative Example a b c d e f g a # > > # # # # b < # || > # # # c < || # > # # # d # < < # > > # e # # # < # # > f # # # < # # > g # # # # < < # FOOTPRINTS L= { <a,b,c,d,e,g> 3, <a,c,b,d,e,g> 2, <a,b,c,d,f,g>, <a,c,b,d,f,g> 4}
- 173. α-algorithm: Applicative Example a b c d e f g a # > > # # # # b < # || > # # # c < || # > # # # d # < < # > > # e # # # < # # > f # # # < # # > g # # # # < < # FOOTPRINTS L= { <a,b,c,d,e,g> 3, <a,c,b,d,e,g> 2, <a,b,c,d,f,g>, <a,c,b,d,f,g> 4}
- 174. α-algorithm: Applicative Example a b c d e f g a # > > # # # # b < # || > # # # c < || # > # # # d # < < # > > # e # # # < # # > f # # # < # # > g # # # # < < # FOOTPRINTS L= { <a,b,c,d,e,g> 3, <a,c,b,d,e,g> 2, <a,b,c,d,f,g>, <a,c,b,d,f,g> 4}
- 175. α-algorithm: Applicative Example a b c d e f g a # > > # # # # b < # || > # # # c < || # > # # # d # < < # > > # e # # # < # # > f # # # < # # > g # # # # < < # FOOTPRINTS L= { <a,b,c,d,e,g> 3, <a,c,b,d,e,g> 2, <a,b,c,d,f,g>, <a,c,b,d,f,g> 4}
- 176. α-algorithm: Applicative Example a b c d e f g a # > > # # # # b < # || > # # # c < || # > # # # d # < < # > > # e # # # < # # > f # # # < # # > g # # # # < < # FOOTPRINTS L= { <a,b,c,d,e,g> 3, <a,c,b,d,e,g> 2, <a,b,c,d,f,g>, <a,c,b,d,f,g> 4}
- 177. α-algorithm: Applicative Example L= { <a,b,c,d,e,g> 3, <a,c,b,d,e,g> 2, <a,b,c,d,f,g>, <a,c,b,d,f,g> 4}
- 178. α-algorithm: Applicative Example α(L) = ? L= { <a,b,c,d,e,f> 4, <a,g,h,d,f,e> 2, <a,b,c,d,f,e> 3, <a,g,h,d,e,f> 3}
- 179. Limitations of alpha miner Completeness All possible traces of the process (model) need to be in the log Short loops c>b and b>c implies c||b and b||c instead of cb and bc Self-loops b>b and not b>b implies bb (impossible!)
- 180. Frequency of the Ordering Relations?
- 181. Little Thumb to Deal with Noise van der Aalst, W. M. P. and Weijters, A. J. M. M. (2003). Rediscovering workflow models from event-based data using little thumb, Integrated Computer-Aided Engineering
- 182. Heuristics Miner
- 183. Heuristics Miner number between -1 and 1 indicating strength of causal dependency between a and b
- 184. Heuristics Miner
- 185. Heuristics Miner
- 186. Heuristics Miner What’s the corresponding process model?
- 187. Automated Process Discovery Automated process discovery method Simplicity minimal size & structural complexity Precision does not parse traces not in the log Fitness parses the traces of the log Generalization parses traces of the process not included in the log 188
- 188. Process Discovery Algorithms: The Two Worlds High-Fitness High-Precision Heuristic Miner Fodina Miner High-Fitness Low-Complexity
- 189. Process Model discovered with Heuristics Miner
- 190. Process Model discovered with Inductive Miner • Structured by construction • Based on process tree
- 191. Process Discovery Algorithms: The Two Worlds High-Fitness High-Precision High complexity Heuristic Miner Fodina Miner High-Fitness Low Precision Low-Complexity Inductive Miner Evolutionary Tree Miner
- 192. Split Miner Augusto, A. and Conforti, R. and Dumas, M. and La Rosa, M. (2017). Split Miner: Discovering Accurate and Simple Business Process Models from Event Logs. ICDM 2017
- 193. Process Model discovered with Split Miner
- 194. Process Discovery Algorithms High-Fitness High-Precision Low-Complexity Split Miner
- 195. From Event Log to Process Model in 5 Steps 196 Directly-Follows Graph and Loops Discovery Filtering Concurrency Discovery Splits Discovery Joins Discovery Event Log Process Model
- 196. Trace #obs a » b » c » g » e » h 10 a » b » c » f » g » h 10 a » b » d » g » e » h 10 a » b » d » e » g » h 10 a » b » e » c » g » h 10 a » b » e » d » g » h 10 a » c » b » e » g » h 10 a » c » b » f » g » h 10 a » d » b » e » g » h 10 a » d » b » f » g » h 10 197 Directly-Follows Graph and Loops Discovery Concurrency DiscoveryEvent Log Filtering Splits Discovery Joins Discovery Process Model
- 197. Trace #obs a » b » c » g » e » h 10 a » b » c » f » g » h 10 a » b » d » g » e » h 10 a » b » d » e » g » h 10 a » b » e » c » g » h 10 a » b » e » d » g » h 10 a » c » b » e » g » h 10 a » c » b » f » g » h 10 a » d » b » e » g » h 10 a » d » b » f » g » h 10 198 Directly-Follows Graph and Loops Discovery Concurrency DiscoveryEvent Log Filtering Splits Discovery Joins Discovery Process Model
- 198. 199 Directly-Follows Graph and Loops Discovery Concurrency DiscoveryEvent Log Filtering Splits Discovery Joins Discovery Process Model (b || c) (b || d) (d || e) (e || g)
- 199. 200 Directly-Follows Graph and Loops Discovery Concurrency DiscoveryEvent Log Filtering Splits Discovery Joins Discovery Process Model
- 200. 201 Break for Make-up!
- 201. 202 Directly-Follows Graph and Loops Discovery Concurrency DiscoveryEvent Log Filtering Splits Discovery Joins Discovery Process Model (b || c) (b || d) (d || e) (e || g)
- 202. 203 Directly-Follows Graph and Loops Discovery Concurrency DiscoveryEvent Log Filtering Splits Discovery Joins Discovery Process Model
- 203. 204 Done!
- 204. Automated discovery of BPM Process Models in Apromore 205
- 205. Process Dashboards Operational dashboards Tactical dashboards Strategic dashboards Process Mining Automated process discovery Conformance checking Performance mining Variants analysis Summary
- 206. Topics not covered in this class • Event log filtering • Removing anomalous or infrequent behaviour from an event log • Business process drift detection • Detecting changes in a business process (over time) using event logs • Predictive process monitoring • Predicting the outcome or a future property of a process based on an event log containing completed cases, and an incomplete case 207
- 207. Thank you! http://fundamentals-of-bpm.org Chapter 12: Process Monitoring

No public clipboards found for this slide

Login to see the comments