INISET@CAiSE 2011

Slides of my presentation at INISET workshop at CAiSE conference, 21 June 2011, London, UK


  1. FACULTY OF ECONOMICS AND BUSINESS ADMINISTRATION
     Integrating Computer Log Files for Process Mining: A Genetic Algorithm Inspired Technique
     Jan Claes, jan.claes@ugent.be, http://processmining.ugent.be
     Ghent University, Belgium
     Faculty of Economics and Business Administration, Department of Management Information and Operations Management
     Jan Claes for INISET@CAiSE 2011, 21 June, 2011
  2. Section 1: Process Mining
  3. A plane crashed... What happened?
     Analyse the ‘black box’.
  4. A process failed... What happened?
     Analyse the ‘black box’: look for historical data.
     Process Mining:
       • Reconstruct and analyse processes
       • From historical process data
           • Log files
           • Audit trails
           • Database history fields/tables
  5. Process Mining
     Processes are supported by IT systems.
     IT systems record actual process data.
     Process data can be used to automatically:
       • Discover process model
       • Check conformance with existing process info
       • Extend existing process model
     Attention, Process Mining:
       • Only As-Is
       • Only (correctly) recorded information
  6. Process Mining steps
       • Preparation
           • Collect data: find traces
           • Merge data: from different sources
           • Structure data: group per instance
           • Convert data: to tool specific format
       • Process mining
       • Make decisions, take action
  7. Process Mining steps
  8. Section 2: Merging log files
  9. Example
     Product ordering, registered events:
       • Sales order: document creation (administration)
       • Delivery: truck load confirmation (warehouse)
       • Invoice: document creation (administration)
     Logging:
       • from administration software
       • from warehouse software
     How to merge both log files?
  10. Example 1
      Administration      Warehouse        Merged
      SO1  SO > Inv       SO1  Deliver     SO1  SO > Deliver > Inv
      SO2  SO > Inv       SO2  Deliver     SO2  SO > Deliver > Inv
      SO3  SO > Inv       SO3  Deliver     SO3  SO > Deliver > Inv
      Merge based on matching trace identifiers.
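
Example 1 can be sketched in a few lines of Python. This is only an illustration, not the implementation behind the slides: the event tuple layout `(trace_id, activity, timestamp)`, the timestamps themselves and the function name `merge_by_trace_id` are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical event format: (trace_id, activity, timestamp).
admin_log = [
    ("SO1", "SO", 1), ("SO1", "Inv", 3),
    ("SO2", "SO", 4), ("SO2", "Inv", 6),
]
warehouse_log = [
    ("SO1", "Deliver", 2),
    ("SO2", "Deliver", 5),
]

def merge_by_trace_id(*logs):
    """Group events from all logs by trace id and order them by timestamp."""
    traces = defaultdict(list)
    for log in logs:
        for trace_id, activity, timestamp in log:
            traces[trace_id].append((timestamp, activity))
    # Sort each trace chronologically and keep only the activity names.
    return {tid: [act for _, act in sorted(events)] for tid, events in traces.items()}

merged = merge_by_trace_id(admin_log, warehouse_log)
```

Because both systems use the same identifier, no guessing is involved: the merge is a plain group-by followed by a sort.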
  11. Example 2
      Administration      Warehouse              Merged
      SO1  SO > Inv       Del1  Deliver (SO1)    SO1  SO > Deliver > Inv
      SO2  SO > Inv       Del2  Deliver (SO2)    SO2  SO > Deliver > Inv
      SO3  SO > Inv       Del3  Deliver (SO3)    SO3  SO > Deliver > Inv
      Merge based on matching attribute values.
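
For Example 2 the trace identifiers differ, but a warehouse trace carries the sales order number as an attribute. A minimal sketch, assuming a hypothetical attribute name `order_ref` and a hypothetical helper `link_by_attribute` (neither appears in the slides):

```python
# Administration traces keyed by their own identifier.
admin_traces = {
    "SO1": ["SO", "Inv"],
    "SO2": ["SO", "Inv"],
}
# Warehouse traces have different identifiers but reference the sales order.
warehouse_traces = {
    "Del1": {"order_ref": "SO1", "events": ["Deliver"]},
    "Del2": {"order_ref": "SO2", "events": ["Deliver"]},
}

def link_by_attribute(admin_traces, warehouse_traces, attribute="order_ref"):
    """Link a warehouse trace to the administration trace its attribute names."""
    links = {}
    for wh_id, trace in warehouse_traces.items():
        ref = trace.get(attribute)
        if ref in admin_traces:
            links[wh_id] = ref
    return links

links = link_by_attribute(admin_traces, warehouse_traces)
```

Traces whose attribute value matches no administration identifier simply stay unlinked in this sketch.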
  12. Example 3
      Administration                Warehouse              Merged
      SO1  SO (t1) > Inv (t3)       Arr1  Deliver (t2)     SO1  SO > Deliver > Inv
      SO2  SO (t4) > Inv (t6)       Arr2  Deliver (t5)     SO2  SO > Deliver > Inv
      SO3  SO (t7) > Inv (t9)       Arr3  Deliver (t8)     SO3  SO > Deliver > Inv
      t1 < t2 < t3 << t4 < t5 < t6 << t7 < t8 < t9
      Merge based on time information.
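
Example 3 has neither shared identifiers nor shared attribute values, so only time is left. One possible reading, sketched below under the assumption that a delivery belongs to the single administration trace whose time span contains it; the integer timestamps and the names `time_span` and `link_by_time` are hypothetical.

```python
# Events are (timestamp, activity); timestamps 1..9 stand in for t1..t9.
admin_traces = {
    "SO1": [(1, "SO"), (3, "Inv")],
    "SO2": [(4, "SO"), (6, "Inv")],
    "SO3": [(7, "SO"), (9, "Inv")],
}
warehouse_traces = {
    "Arr1": [(2, "Deliver")],
    "Arr2": [(5, "Deliver")],
    "Arr3": [(8, "Deliver")],
}

def time_span(events):
    """Return (first timestamp, last timestamp) of a trace."""
    times = [t for t, _ in events]
    return min(times), max(times)

def link_by_time(admin_traces, warehouse_traces):
    """Link a warehouse trace to the only administration trace that encloses it in time."""
    links = {}
    for wh_id, wh_events in warehouse_traces.items():
        wh_start, wh_end = time_span(wh_events)
        candidates = [
            a_id for a_id, a_events in admin_traces.items()
            if time_span(a_events)[0] <= wh_start and wh_end <= time_span(a_events)[1]
        ]
        if len(candidates) == 1:  # ambiguous or empty matches stay unlinked
            links[wh_id] = candidates[0]
    return links

links = link_by_time(admin_traces, warehouse_traces)
```

This only works cleanly because the traces in the example do not overlap in time; with overlapping traces several candidates remain, which is where combining indicators becomes necessary.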
  13. Merging computer log files
      Merge based on:
        • Example 1: matching trace identifiers (indicator 1)
        • Example 2: matching attribute values (indicator 2)
        • Example 3: time information (indicator 3)
      General solution: algorithm combining different indicators.
      Genetic algorithm: indicators build up fitness function.
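
How the indicators build up a fitness function is not spelled out on the slide. The sketch below assumes a plain weighted sum of three 0/1 indicators per link; the trace dictionary layout, the equal weights and all function names are assumptions for illustration only.

```python
def id_indicator(a, w):
    """Indicator 1: identical trace identifiers."""
    return 1.0 if a["id"] == w["id"] else 0.0

def attribute_indicator(a, w):
    """Indicator 2: a warehouse attribute value matches the other trace."""
    w_values = set(w["attrs"].values())
    return 1.0 if a["id"] in w_values or w_values & set(a["attrs"].values()) else 0.0

def time_indicator(a, w):
    """Indicator 3: the warehouse events fall inside the administration time span."""
    a_times = [t for t, _ in a["events"]]
    w_times = [t for t, _ in w["events"]]
    return 1.0 if min(a_times) <= min(w_times) and max(w_times) <= max(a_times) else 0.0

def fitness(links, admin, warehouse, weights=(1.0, 1.0, 1.0)):
    """Score a candidate solution (warehouse id -> administration id)."""
    total = 0.0
    for w_id, a_id in links.items():
        a, w = admin[a_id], warehouse[w_id]
        total += (weights[0] * id_indicator(a, w)
                  + weights[1] * attribute_indicator(a, w)
                  + weights[2] * time_indicator(a, w))
    return total

admin = {
    "SO1": {"id": "SO1", "attrs": {}, "events": [(1, "SO"), (3, "Inv")]},
    "SO2": {"id": "SO2", "attrs": {}, "events": [(4, "SO"), (6, "Inv")]},
}
warehouse = {
    "Del1": {"id": "Del1", "attrs": {"order": "SO1"}, "events": [(2, "Deliver")]},
    "Del2": {"id": "Del2", "attrs": {"order": "SO2"}, "events": [(5, "Deliver")]},
}
good = fitness({"Del1": "SO1", "Del2": "SO2"}, admin, warehouse)
bad = fitness({"Del1": "SO2", "Del2": "SO1"}, admin, warehouse)
```

A solution that links the right traces scores higher than one that swaps them, which is all a search procedure needs in order to prefer it.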
  14. Section 3: Genetic algorithm
  15. Genetic algorithm
      [Figure: 1st, 2nd and 3rd generation, connected by cross-over, survival of the fittest and mutation]
  16. Genetic algorithm
      [Figure: the same 1st, 2nd and 3rd generation with cross-over, survival of the fittest and mutation, now annotated with fitness function scores: 14, 18, 18, 27, 29, 28, 6, 5, 32]
  17. Genetic algorithm inspired technique
      Find links between traces of both log files and merge them chronologically in a new log file.
      Steps:
        • Make initial solution (best individual links)
        • Make pseudo-random changes (try to improve score for one specific factor)
        • Evaluate (keep original or changed solution)
        • Stop condition (fixed number of steps)
      Only one solution, no cross-over.
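
The four steps above amount to a single-solution local search. The sketch below is a simplified illustration, not the presented algorithm: the pairwise scores, the duplicate penalty and the uniformly random change (the slides instead target one specific factor) are all assumptions.

```python
import random

# Hypothetical pairwise scores: (warehouse trace, administration trace) -> score.
PAIR_SCORE = {
    ("Del1", "SO1"): 3.0, ("Del1", "SO2"): 2.0,
    ("Del2", "SO1"): 3.0, ("Del2", "SO2"): 1.0,
}
ADMIN_IDS = ["SO1", "SO2"]
WAREHOUSE_IDS = ["Del1", "Del2"]
DUPLICATE_PENALTY = 5.0  # assumed penalty when two traces claim the same partner

def fitness(links):
    """Sum of pair scores minus a penalty for every duplicated partner."""
    total = sum(PAIR_SCORE[(w, a)] for w, a in links.items())
    duplicates = len(links) - len(set(links.values()))
    return total - DUPLICATE_PENALTY * duplicates

def initial_solution():
    # Step 1: pick the best individual link for every warehouse trace.
    return {w: max(ADMIN_IDS, key=lambda a: PAIR_SCORE[(w, a)]) for w in WAREHOUSE_IDS}

def improve(links, steps=200, seed=0):
    rng = random.Random(seed)
    current = dict(links)
    for _ in range(steps):                 # Step 4: fixed number of steps
        candidate = dict(current)
        w = rng.choice(WAREHOUSE_IDS)      # Step 2: pseudo-random change
        candidate[w] = rng.choice(ADMIN_IDS)
        if fitness(candidate) >= fitness(current):  # Step 3: keep the better solution
            current = candidate
    return current

start = initial_solution()
final = improve(start)
```

Because a change is kept only when it does not lower the score, the solution can never get worse, but it can settle in a local optimum; a population with cross-over is the usual remedy, which this technique deliberately leaves out.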
  18. Section 4: Experiment results
  19. Experiment: proof of concept
      Simulated data:
        • Given model
        • Generate
            • random set of logs
            • single log (= solution)
        • Use merge algorithm to merge set of logs
        • Check resulting log with solution log
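
The data generation step might look roughly like this. It is a toy sketch under strong assumptions: the "model" is hard-coded as SO > Deliver > Inv, traces never overlap, and no noise is added; `generate_solution_log`, `split_by_source` and the event dictionary layout are hypothetical names.

```python
import random

# Assumed mapping of activities to the system that logs them.
SOURCE_OF = {"SO": "administration", "Deliver": "warehouse", "Inv": "administration"}

def generate_solution_log(n_traces, seed=0):
    """Generate the single 'solution' log with strictly increasing timestamps."""
    rng = random.Random(seed)
    log, clock = [], 0
    for i in range(1, n_traces + 1):
        for activity in ("SO", "Deliver", "Inv"):
            clock += rng.randint(1, 5)
            log.append({"trace": f"SO{i}", "activity": activity, "time": clock})
    return log

def split_by_source(solution_log):
    """Split the solution log into one log per source system."""
    logs = {"administration": [], "warehouse": []}
    for event in solution_log:
        logs[SOURCE_OF[event["activity"]]].append(dict(event))
    return logs

solution = generate_solution_log(3)
logs = split_by_source(solution)
```

The merge algorithm then receives only `logs`, and its output is compared with `solution`.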
  20. Experiment: proof of concept
      Advantages of using simulated data:
        • Solution is known
        • Controllable parameters (e.g. noise, overlap, matching id)
      Disadvantages of using simulated data:
        • Limited internal validity (are results realistic?)
        • No external validity (results not generalisable)
  21. Experiment results
      Incorrect links related to total links identified.
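
The measure named on this slide can be read as a simple ratio. A minimal sketch, assuming links are stored as a mapping from warehouse trace to administration trace; the function name `incorrect_link_ratio` and the sample data are hypothetical and do not reproduce any result from the experiment.

```python
def incorrect_link_ratio(found_links, true_links):
    """Share of identified links that differ from the known solution."""
    if not found_links:
        return 0.0
    incorrect = sum(1 for w, a in found_links.items() if true_links.get(w) != a)
    return incorrect / len(found_links)

# Made-up example: one of three identified links is correct.
true_links = {"Del1": "SO1", "Del2": "SO2", "Del3": "SO3"}
found_links = {"Del1": "SO1", "Del2": "SO3", "Del3": "SO2"}
ratio = incorrect_link_ratio(found_links, true_links)
```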
  22. Section 5: Discussion
  23. Future work
      Optimise genetic algorithm:
        • Fewer incorrect links
        • Faster implementation (AIS algorithm)
        • Fitness function factors
      Validation with real test cases:
        • Ghent University DPO (Human Resources)
        • Century21 (Real Estate) & FlexPack (Packaging)
        • BNP Paribas Fortis (Finance)
        • ...
  24. Contact information
      Jan Claes
      jan.claes@ugent.be
      http://processmining.ugent.be
      Twitter: @janclaesbelgium