PhD Day 2011

Slides of my presentation at our faculty's PhD Day, 24 May 2011, Gent, B

Published in: Business, Technology

  1. Merging Log Files for Process Mining. Jan Claes. Promotor: Prof. Dr. Geert Poels. Copromotor: Prof. Dr. Ir. Birger Raa.
  2. Roadmap: 1. Process Mining; 2. Merging log files; 3. Genetic algorithm; 4. Experiment results; 5. Discussion. (Map image © http://maps.google.com)
  3. Section 1: Process Mining
  4. Analyse the ‘black box’. A plane crashed... What happened?
  5. Analyse the ‘black box’: look for historical data. A process failed... What happened? Process Mining: reconstruct and analyse processes from historical process data (log files, audit trails, database history fields/tables).
  6. Process Mining: audit trail, database fields, CSV log file; ProM Analyses; ProM Log File
  7. Section 2: Merging log files
  8. Process Mining; Merging log files
  9. Process Mining; Log Merging; Merging log files
  10. Section 3: Genetic algorithm
  11. Genetic algorithm: cross-over; survival of the fittest; mutation; 1st generation; 2nd generation; 3rd generation
  12. Genetic algorithm, with fitness function scores: 14, 18, 18; cross-over; 27, 29, 28; survival of the fittest; 6, 5, 32; mutation; 1st generation; 2nd generation; 3rd generation
  13. Section 4: Experiment results
  14. Experiment: proof of concept. Simulated data: given a model, generate a random set of logs and a single log (= solution); use the merge algorithm to merge the set of logs; check the resulting log against the solution log.
  15. Experiment: proof of concept. Advantages of using simulated data: the solution is known; parameters are controllable (e.g. noise, overlap, matching id). Disadvantages of using simulated data: limited internal validity (are results realistic?); no external validity (results not generalisable).
  16. Experiment results: results of version of 31/03/2011 (GA-inspired algorithm). Mean time: 4 min
  17. Experiment results: results of version of 31/03/2011 (GA-inspired algorithm). Mean time: 4 min
  18. Experiment results: NEW ALGORITHM. Results of version of 15/05/2011 (AIS algorithm). Mean time: 350 msec
  19. Section 5: Discussion
  20. Future work. Optimise genetic algorithm: fewer incorrect links; faster implementation; fitness function factors. Validation with real test cases: Ghent University DPO (Human Resources); Century21 (Real Estate); BNP Paribas Fortis (loan approvals); ...
  21. Contact information: Jan Claes; jan.claes@ugent.be; http://processmining.ugent.be; FEB08, Tweekerkenstraat 2, 9000 Gent, Belgium
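
The genetic algorithm slides name three steps (cross-over, mutation, survival of the fittest) applied from one generation to the next. The slides do not show the actual encoding or fitness function used for log merging, so the following is only a minimal generic sketch of that loop. The bit-string candidates, the `fitness` function (counting 1-bits), the mutation rate, and every function name here are hypothetical stand-ins, not the presenter's implementation.

```python
import random


def fitness(candidate):
    # Hypothetical stand-in score: the number of 1-bits in the candidate.
    # The slides do not define the real fitness function.
    return sum(candidate)


def crossover(parent_a, parent_b, rng):
    # Single-point cross-over: prefix of one parent, suffix of the other.
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def mutate(candidate, rng, rate=0.05):
    # Flip each bit independently with probability `rate` (assumed value).
    return [1 - bit if rng.random() < rate else bit for bit in candidate]


def evolve(population, generations, rng):
    size = len(population)
    for _ in range(generations):
        # Survival of the fittest: the better half survives unchanged.
        survivors = sorted(population, key=fitness, reverse=True)[: size // 2]
        children = []
        while len(survivors) + len(children) < size:
            parent_a, parent_b = rng.sample(survivors, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = survivors + children
    return max(population, key=fitness)


rng = random.Random(42)  # fixed seed so repeated runs behave the same
first_generation = [[rng.randint(0, 1) for _ in range(20)] for _ in range(10)]
best = evolve(first_generation, generations=30, rng=rng)
print(fitness(best))
```

Because the surviving half is carried over unchanged, the best score in this sketch can never drop from one generation to the next. A real log-merging variant would replace the bit string with candidate links between events of the two logs and score them with a domain-specific fitness function.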