ProM 2012

Slides of my presentation at ProM meeting at Technische Universiteit Eindhoven, 6 February 2012, Eindhoven, NL

  1. Merging Event Logs in ProM
     Jan Claes, Ghent University, http://processmining.ugent.be
     Faculty of Economics and Business Administration, Department of Management Information and Operations Management
     Jan Claes for TUe 2012, 6 February 2012
  2. Merging Event Logs
     [Diagram: multiple event logs, ProM plugin (?), merged event log]
  3. Merging Event Logs
     1. Find links
     2. Merge chronologically
     3. Add unlinked traces
     4. Put in new log file
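
The four steps above can be sketched in a few lines of Python. This is only an illustration, not the ProM plugin's code: the trace layout (a dict with an "id" and a list of "events", each with a "time") and the name merge_logs are assumptions made for the example, and the links from step 1 are assumed to be given as index pairs, each joining one trace of the first log with one trace of the second.

```python
from operator import itemgetter


def merge_logs(log1, log2, links):
    """Merge two logs; `links` holds (index_in_log1, index_in_log2) pairs.

    Hypothetical data model: a log is a list of traces, a trace is
    {"id": ..., "events": [{"time": ..., "name": ...}, ...]}.
    """
    merged = []
    linked1, linked2 = set(), set()
    # Step 2: merge the events of every linked pair chronologically.
    for i, j in links:
        events = sorted(log1[i]["events"] + log2[j]["events"],
                        key=itemgetter("time"))
        merged.append({"id": log1[i]["id"], "events": events})
        linked1.add(i)
        linked2.add(j)
    # Step 3: add the traces that were not linked to anything.
    merged += [t for i, t in enumerate(log1) if i not in linked1]
    merged += [t for j, t in enumerate(log2) if j not in linked2]
    # Step 4: the returned list plays the role of the new log file.
    return merged
```

Because Python's sort is stable, events with equal timestamps keep the order "first log, then second log".
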
  4. Approaches
     Genetic Algorithm
       ▪ J. Claes, G. Poels, Integrating Computer Log Files for Process Mining: a Genetic Algorithm Inspired Technique, in CAiSE 2011 Workshops, LNBIP 83, 2011
     Artificial Immune System
       ▪ J. Claes, G. Poels, Merging Computer Log Files for Process Mining: an Artificial Immune System Technique, in BPM 2011 Workshops, LNBIP 99, 2011
     Rule Based
       ▪ J. Claes, G. Poels, Merging Event Logs for Process Mining: A Rule Based Merging Method and Rule Suggestion Algorithm, to be submitted in 2012
  5. 1. Genetic Algorithm
  6. 1. Genetic Algorithm
     [Diagram of the genetic algorithm cycle. Labels: RAND POP (random population), fitness, SEL POP (Selection), cross-over, mutation, MUT POP (Reproduction)]
  7. 7. 1. Genetic AlgorithmFitness function  Sum of weighted factor scores per link • Same trace id (STIi) • Trace order (TOi) if all start events are in the first log • Equal attribute values (EAVi) • Number of linked traces (NLTi) • Time distance (TDi)Faculty of Economics and Business Administration Jan Claes for TUe 2012Department of Management Information and Operations Management 7 / 21
  8. 8. 1. Genetic AlgorithmSimplification  Population size one  Only mutationsImprovements  More intelligent start population (not random)  More intelligent mutations (improve at least one factor of the fitness function)Attention  Intensification vs. diversificationFaculty of Economics and Business Administration Jan Claes for TUe 2012Department of Management Information and Operations Management 8 / 21
  9. 2. Artificial Immune System
  10. 2. Artificial Immune System
      [Diagram labels: Immune cells (type B-cell), Antigen, Antibodies (receptor)]
  11. 2. Artificial Immune System
      [Diagram of the algorithm stages: Initial population (RAND POP, SEED, INIT POP), Clonal selection (sorted POP from HIGH to LOW, CLONE POP), Hypermutation (MUT POP, mutations), Receptor editing (EDIT POP); also labelled: Affinity maturation]
  12. 12. 2. Artificial Immune SystemClonal selection  Clone the fittest x% solutions (I)Hypermutation  Randomly change each clone  The higher the fitness score, the less changes (I)Receptor editing  Take the best y% solutions (I)  Add totally random solutions to the set (D) (I: Intensification, D: Diversification)Faculty of Economics and Business Administration Jan Claes for TUe 2012Department of Management Information and Operations Management 12 / 21
  13. 13. 2. Artificial Immune SystemHypermutation  Choose ‘random’ indicator factor to improve • Higher chance to pick factors with positive previous effect  Choose random action • Add link, remove link or alter link  Choose random candidate • From all solutions that would improve with selected action  Choose random improvement • From all possible improvements for selected candidateFaculty of Economics and Business Administration Jan Claes for TUe 2012Department of Management Information and Operations Management 13 / 21
  14. 3. Rule Based
  15. 15. 3. Rule BasedAutomatic merging is not transparant (how good is the merging result?)Previous algorithms are (too) slowMy experience  in most cases it is about finding an attribute value (literally) in a trace of the other log  you need data experts/analyst to get the right data, they mostly have a good idea about the link between two log filesFaculty of Economics and Business Administration Jan Claes for TUe 2012Department of Management Information and Operations Management 15 / 21
  16. 16. 3. Rule BasedSemi-automatic solution  Let user configure merging rule based on attribute values • More transparent • Faster • Includes expert knowledge if available  Help user by suggesting merging rules based on the data in the logFaculty of Economics and Business Administration Jan Claes for TUe 2012Department of Management Information and Operations Management 16 / 21
  17. 17. 3. Rule BasedMerging rules  Merge all traces where… attribute <select name> from <select container> in the 1st log <select operator> attribute <select name> from <select container> in the 2nd log  E.g. Merge all traces where attribute Trace ID from a trace in the 1st log equals attribute Supplier Reference from event Send goods in the 2nd logFaculty of Economics and Business Administration Jan Claes for TUe 2012Department of Management Information and Operations Management 17 / 21
  18. 3. Rule Based
        ▪ <select name>
            • Contains all possible attribute names available in the log
        ▪ <select container>
            • From a trace
            • From any event in a trace
            • From a trace or any event in a trace
            • From event X, From event Y, From event Z, …
        ▪ <select operator>
            • equals, is not equal, greater than, greater or equal, …
            • comes before, comes after
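
To make the rule template concrete, the sketch below evaluates one rule of the form (name, container) operator (name, container) against every pair of traces. The data model, the container encoding ("trace", "any event", "trace or any event", or ("event", X)) and the function names are assumptions for illustration; only four of the listed operators are covered, and the time-based ones (comes before, comes after) are left out.

```python
import operator

# Subset of the operators named on the slide.
OPERATORS = {
    "equals": operator.eq,
    "is not equal": operator.ne,
    "greater than": operator.gt,
    "greater or equal": operator.ge,
}


def values(trace, name, container):
    """Collect the values of attribute `name` found in the given container.

    Hypothetical trace layout:
    {"attributes": {...}, "events": [{"name": ..., "attributes": {...}}]}
    """
    found = []
    if container in ("trace", "trace or any event") and name in trace["attributes"]:
        found.append(trace["attributes"][name])
    for event in trace["events"]:
        in_scope = (container in ("any event", "trace or any event")
                    or container == ("event", event["name"]))
        if in_scope and name in event["attributes"]:
            found.append(event["attributes"][name])
    return found


def find_links(log1, log2, rule):
    """Return the (i, j) trace pairs for which the rule holds."""
    (name1, cont1), op, (name2, cont2) = rule
    compare = OPERATORS[op]
    return [
        (i, j)
        for i, t1 in enumerate(log1)
        for j, t2 in enumerate(log2)
        if any(compare(a, b)
               for a in values(t1, name1, cont1)
               for b in values(t2, name2, cont2))
    ]
```

The example rule from the slides would then be written as (("Trace ID", "trace"), "equals", ("Supplier Reference", ("event", "Send goods"))).
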
  19. 19. 3. Rule BasedSuggesting rules  Look at all attribute values in the log  Make a rule for every equal match in both logs  Count the number of linked traces for every rule  Filter rules with only one link  Sort such that rule that is closer to 1-to-1 match is higher in the list • rules that make more or fewer links are lower in the list • if no 1-to-1 rule exist, the ‘best’ rule is still on topFaculty of Economics and Business Administration Jan Claes for TUe 2012Department of Management Information and Operations Management 19 / 21
  20. 20. 3. Rule BasedSome remarks  User can configure rules or select from the suggestion list  Suggestion list is currently limited to equals-rules but is calculated very fast (order n1 + n2 !)  Rules can be combined with And or Or  By explicitly selecting rules, the approach is more transparent  Possible use as shortcut for merging logs from within one systemFaculty of Economics and Business Administration Jan Claes for TUe 2012Department of Management Information and Operations Management 20 / 21
  21. Contact information
      Jan Claes
      jan.claes@ugent.be
      http://processmining.ugent.be
      Twitter: @janclaesbelgium
      Pav D8.a (until February 10)
