Slice, Mine, Dice: Complexity-Aware Automated Discovery of Business Process Models

1,376 views

Published on

Presentation at the 11th International Conference on Business Process Management (BPM'2013), Beijing, China, 27 August 2013. Winner of the best paper award.

Published in: Education, Technology, Business
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
1,376
On SlideShare
0
From Embeds
0
Number of Embeds
35
Actions
Shares
0
Downloads
23
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

Slice, Mine, Dice: Complexity-Aware Automated Discovery of Business Process Models

  1. 1. 1
  2. 2. Event log Discovering “all-in-one” process models… Process discovery algorithms 2
  3. 3. a b a b c d e c a f g h a b a b k d e c h f g h b c p q p r a k q r s b c p h p r a k q r s a x p h y z t t u Trace clustering a b a b c d e c a f g h a b a b k d e c h f g h b c p q p r a k q r s b c p h p r a k q r s a x p h y z t t u Cluster 1 Cluster 2 Noise Event log Process variant 1 Process variant 2 Trace clustering 3
  4. 4. Process discovery Trace clustering Event log Complexity threshold e.g. Size ≤ 30 Process model Slice, Mine and Dice (SMD) >? 4 1) Slice 2) Mine 3) Dice
  5. 5. Process discovery Trace clustering Event log Complexity threshold e.g. Size ≤ 30 Slice, Mine and Dice (SMD) >? Process models 5
  6. 6. Process discovery Trace clustering Event log Complexity threshold e.g. Size ≤ 30 Slice, Mine and Dice (SMD) >? Process models 6
  7. 7. Process model M3 Process model M2 A closer look… 7
  8. 8. F10 F11 F13 F14F12 M2 RPST of M2 F20 F21 F22 F24F23 F25 RPST of M3 Refined Process Structure Tree (RPST) J. Vanhatalo, H. Volzer, J. Koehler: The Refined Process Structure Tree. Data Knowl. Eng., 2009 M3 8
  9. 9. M2 M3 F14 F25 M. Dumas, L. García-Bañuelos, M. La Rosa, and R. Uba. Fast Detection of Exact Clones in Business Process Model Repositories. Information Systems, 2013 RPSDAG F10 F11 F13 F14F12 RPST of M2 F20 F21 F22 F24F23 F25 RPST of M3 9
  10. 10. F14 M2 M3 S3 + S3 + Extracting exact clones S3 Exact clones: • SESE • Non-trivial • Identical 10
  11. 11. F12 F22M3 + S3 + S3 ? ? Extracting approximate clones C. C. Ekanayake, M. Dumas, L. García-Bañuelos, M. La Rosa, and A. H. M. ter Hofstede. Approximate Clone Detection in Repositories of Business Process Models, BPM 2012 Appr. clones: • SESE • Non-trivial • Similar • Unrelated 11 + + M2
  12. 12. Merging algorithm Fragment F12 of model M2 Fragment F22 of model M3 Configurable gateway Configurable label M. La Rosa, M. Dumas, R. Uba, and R. M. Dijkman. Business Process Model Merging: An Approach to Business Process Consolidation. ACM TOSEM, 2013. Merging approximate clones S4 12
  13. 13. S4S3 + The result… 13 + + +
  14. 14. Trace clustering • M. Song, C.W. Gunther, and W.M.P. van der Aalst, Improving Process Mining with Trace Clustering, J. Korean Inst. of Industrial Engineers 34(4), 2008 • R.P.J.C. Bose, W.M.P. van der Aalst, Trace Clustering Based on Conserved Patterns: Towards Achieving Better Process Models, BPM 2009 Workshops • A.K.A. de Medeiros, A. Guzzo, G. Greco, W.M.P. van der Aalst, A.J.M.M. Weijters, B.F. van Dongen, D. Saccà. Process Mining Based on Clustering: A Quest for Precision, BPM Workshops 2007 Discovery • A.J.M.M. Weijters, J.T.S. Ribeiro. Flexible Heuristics Miner (FHM), CIDM, 2011. Evaluation setup Log Traces Events Event classes Duplication ratio Motor 4,293 33,202 292 114 Commercial 4,852 54,134 81 668 BPI 2012 5,312 91,949 36 2,554 14
  15. 15. Evaluation – repository size and models number S: Song et al. B: Bose et al. M: Medeiros et al. • up to 64% reduction in repository size • up to 66% reduction in # of top level process models • up to 120 sub-processes extracted Motor Comm BPI Motor Comm BPI 14% 22% 66% 15 64%
  16. 16. Evaluation – individual model complexity Log Method Size CFC ACD Density CNC Avg Min Max Savings (%) Motor S 22.75 4 37 22.8 12.07 2.71 0.07 1.26 S + SMD 17.57 4 37 10.07 2.34 0.11 1.21 B 20.01 4 37 9.8 9.97 2.51 0.08 1.2 B + SMD 18.04 4 37 10.05 2.38 0.11 1.2 M 15.73 3 49 -1.1 7.36 2.14 0.11 1.12 M + SMD 15.9 4 49 8.34 2.12 0.12 1.14 Commercial S 24.07 6 34 22.4 13.65 2.96 0.06 1.32 S + SMD 18.67 2 34 11.34 2.49 0.1 1.24 B 21.11 2 34 20.3 11.04 2.65 0.07 1.23 B+ SMD 16.82 2 34 9.73 2.29 0.12 1.18 M 18.86 2 40 11.1 10.18 2.47 0.09 1.22 M + SMD 16.76 2 34 9.71 2.38 0.11 1.21 BPI S 47.32 15 56 29.7 20.77 2.34 0.03 1.24 S + SMD 33.27 4 56 20.18 2.41 0.07 1.28 B 46.54 13 56 30.6 20.48 2.35 0.03 1.23 B + SMD 32.3 4 56 19.29 2.33 0.07 1.27 M 46.48 21 61 18.9 21.16 2.34 0.03 1.24 M + SMD 37.71 7 56 25.29 2.38 0.04 1.3 16
  17. 17. + Seeks to discover process models that meet user-specified complexity thresholds + Reduces repository size and top level models number compared to trace clustering techniques + Preserves fitness, appropriateness & generalization of models discovered from trace clusters + Little impact on structural complexity of process models - Performance overhead The SMD technique 17
  18. 18. Optimizations • Parallelization of divisive trace clustering, discovery and GED computation • Incremental divisive trace clustering • Sub-polygons and sub-bonds extraction Complexity thresholds tuning In the future… 18

×