INISET@CAiSE 2011


Slides of my presentation at INISET workshop at CAiSE conference, 21 June 2011, London, UK

Published in: Business

  1. Integrating Computer Log Files for Process Mining: A Genetic Algorithm Inspired Technique
     Jan Claes, jan.claes@ugent.be, http://processmining.ugent.be
     Ghent University, Belgium
     Faculty of Economics and Business Administration, Department of Management Information and Operations Management
     Jan Claes for INISET@CAiSE 2011, 21 June 2011
  2. Section 1: Process Mining
  3. A plane crashed... What happened?
     Analyse the ‘black box’
  4. A process failed... What happened?
     Analyse the ‘black box’: look for historical data
     Process Mining:
     • Reconstruct and analyse processes
     • From historical process data
         • Log files
         • Audit trails
         • Database history fields/tables
  5. Process Mining
     Processes are supported by IT systems
     IT systems record actual process data
     Process data can be used to automatically
     • Discover process model
     • Check conformance with existing process info
     • Extend existing process model
     Attention, Process Mining covers
     • Only As-Is
     • Only (correctly) recorded information
  6. Process Mining steps
     Preparation
     • Collect data: find traces
     • Merge data: from different sources
     • Structure data: group per instance
     • Convert data: to tool specific format
     Process mining
     Make decisions, take action
  7. Process Mining steps
  8. Section 2: Merging log files
  9. Example
     Product ordering: registered events:
     • Sales order: document creation (administration)
     • Delivery: truck load confirmation (warehouse)
     • Invoice: document creation (administration)
     Logging
     • from administration software
     • from warehouse software
     How to merge both log files?
  10. Example 1
      Administration          Warehouse
      SO1  SO > Inv           SO1  Deliver
      SO2  SO > Inv           SO2  Deliver
      SO3  SO > Inv           SO3  Deliver
      Merged
      SO1  SO > Deliver > Inv
      SO2  SO > Deliver > Inv
      SO3  SO > Deliver > Inv
      Merge based on matching trace identifiers
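
The merge in Example 1 can be sketched in a few lines of Python. This is only an illustration, not the implementation behind the slides: the dictionary layout and the integer timestamps are assumptions added here so that the events of linked traces can be put in chronological order.

```python
# Hypothetical logs: trace identifier -> list of (timestamp, activity).
# The timestamps are assumed; the slide only shows the activity names.
administration = {
    "SO1": [(1, "SO"), (3, "Inv")],
    "SO2": [(4, "SO"), (6, "Inv")],
    "SO3": [(7, "SO"), (9, "Inv")],
}
warehouse = {
    "SO1": [(2, "Deliver")],
    "SO2": [(5, "Deliver")],
    "SO3": [(8, "Deliver")],
}


def merge_by_trace_id(log_a, log_b):
    """Link traces with the same identifier and order their events by time."""
    merged = {}
    for trace_id in sorted(set(log_a) | set(log_b)):
        events = log_a.get(trace_id, []) + log_b.get(trace_id, [])
        merged[trace_id] = [activity for _, activity in sorted(events)]
    return merged


merged_log = merge_by_trace_id(administration, warehouse)
for trace_id, activities in merged_log.items():
    print(trace_id, " > ".join(activities))
```

Traces that occur in only one log are kept unchanged, which is one possible design choice among several.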
  11. Example 2
      Administration          Warehouse
      SO1  SO > Inv           Del1  Deliver (SO1)
      SO2  SO > Inv           Del2  Deliver (SO2)
      SO3  SO > Inv           Del3  Deliver (SO3)
      Merged
      SO1  SO > Deliver > Inv
      SO2  SO > Deliver > Inv
      SO3  SO > Deliver > Inv
      Merge based on matching attribute values
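
For Example 2 the trace identifiers differ, so the link has to come from an attribute value. The sketch below assumes that each warehouse trace carries an attribute holding the sales order number; the attribute name `sales_order` and the data layout are hypothetical.

```python
# Hypothetical logs. Warehouse traces have their own identifiers (Del1, ...)
# and reference the sales order in an assumed attribute "sales_order".
administration = {"SO1": ["SO", "Inv"], "SO2": ["SO", "Inv"], "SO3": ["SO", "Inv"]}
warehouse = {
    "Del1": {"sales_order": "SO1", "events": ["Deliver"]},
    "Del2": {"sales_order": "SO2", "events": ["Deliver"]},
    "Del3": {"sales_order": "SO3", "events": ["Deliver"]},
}


def link_by_attribute(admin_log, warehouse_log, attribute):
    """Link a warehouse trace to the administration trace it references."""
    links = {}
    for delivery_id, trace in warehouse_log.items():
        referenced = trace.get(attribute)
        if referenced in admin_log:
            links[delivery_id] = referenced
    return links


links = link_by_attribute(administration, warehouse, "sales_order")
print(links)
```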
  12. Example 3
      Administration                Warehouse             Time order
      SO1  SO (t1) > Inv (t3)       Arr1  Deliver (t2)    t1 < t2 < t3
      SO2  SO (t4) > Inv (t6)       Arr2  Deliver (t5)    t4 < t5 < t6
      SO3  SO (t7) > Inv (t9)       Arr3  Deliver (t8)    t7 < t8 < t9
      Merged
      SO1  SO > Deliver > Inv
      SO2  SO > Deliver > Inv
      SO3  SO > Deliver > Inv
      Merge based on time information
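
When neither identifiers nor attributes match, as in Example 3, only time information is left. A minimal sketch, assuming integer timestamps and assuming that a delivery belongs to the sales order whose creation and invoice times enclose it:

```python
# Hypothetical data mirroring t1 < t2 < t3, t4 < t5 < t6, t7 < t8 < t9.
administration = {"SO1": (1, 3), "SO2": (4, 6), "SO3": (7, 9)}  # (SO time, Inv time)
warehouse = {"Arr1": 2, "Arr2": 5, "Arr3": 8}                    # Deliver time


def link_by_time(admin_log, warehouse_log):
    """Link a delivery to the single sales order whose time window contains it."""
    links = {}
    for arrival_id, delivered_at in warehouse_log.items():
        candidates = [
            order_id
            for order_id, (ordered_at, invoiced_at) in admin_log.items()
            if ordered_at < delivered_at < invoiced_at
        ]
        if len(candidates) == 1:  # ambiguous or unmatched deliveries stay unlinked
            links[arrival_id] = candidates[0]
    return links


links = link_by_time(administration, warehouse)
print(links)
```

With overlapping process instances several candidates can remain, and this sketch then leaves the delivery unlinked.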
  13. Merging computer log files
      Merge based on
      • Example 1: matching trace identifiers (indicator 1)
      • Example 2: matching attribute values (indicator 2)
      • Example 3: time information (indicator 3)
      General solution
      • algorithm combining different indicators
      Genetic algorithm
      • indicators build up fitness function
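
One way to read "indicators build up fitness function" is a weighted sum of the three indicators over all links in a candidate solution. The sketch below is an assumption made for illustration: the `Trace` class, the indicator definitions and the weights are hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Trace:
    trace_id: str
    start: int  # time of the first event
    end: int    # time of the last event
    attributes: dict = field(default_factory=dict)


def same_trace_id(a, b):
    """Indicator 1: both traces carry the same identifier."""
    return 1.0 if a.trace_id == b.trace_id else 0.0


def matching_attribute_values(a, b):
    """Indicator 2: number of attribute values of b that also occur in a."""
    values_a = set(a.attributes.values()) | {a.trace_id}
    return float(sum(1 for value in b.attributes.values() if value in values_a))


def fits_in_time(a, b):
    """Indicator 3: trace b happens within the time window of trace a."""
    return 1.0 if a.start <= b.start and b.end <= a.end else 0.0


# Assumed weights; real values would have to be tuned.
WEIGHTED_INDICATORS = [
    (3.0, same_trace_id),
    (2.0, matching_attribute_values),
    (1.0, fits_in_time),
]


def fitness(links):
    """Score a candidate solution given as a list of (trace_a, trace_b) links."""
    return sum(
        weight * indicator(a, b)
        for a, b in links
        for weight, indicator in WEIGHTED_INDICATORS
    )


order = Trace("SO1", 1, 3)
delivery_1 = Trace("Del1", 2, 2, {"sales_order": "SO1"})
delivery_2 = Trace("Del2", 5, 5, {"sales_order": "SO2"})
print(fitness([(order, delivery_1)]), fitness([(order, delivery_2)]))
```

In this toy data the link between SO1 and Del1 scores higher than the link between SO1 and Del2, because both the attribute and the time indicator agree.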
  14. Section 3: Genetic algorithm
  15. Genetic algorithm
      Figure labels: cross-over, mutation, survival of the fittest; 1st generation, 2nd generation, 3rd generation
  16. Genetic algorithm
      Same figure labels, with fitness function scores (in extraction order): 14, 18, 18, 27, 29, 28, 6, 5, 32
  17. Genetic algorithm inspired technique
      Find links between traces of both log files and merge them chronologically into a new log file
      Steps
      • Make initial solution (best individual links)
      • Make pseudo-random changes (try to improve score for one specific factor)
      • Evaluate (keep original or changed solution)
      • Stop condition (fixed number of steps)
      Only one solution, no cross-over
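
The four steps above can be sketched as a single-solution search loop. This is a simplified illustration under stated assumptions, not the presented implementation: the data, the time-distance score and the duplicate penalty are hypothetical, and the mutation here relinks a trace uniformly at random instead of targeting one specific factor.

```python
import random

# Hypothetical data: trace identifier -> timestamp of its first event.
ADMIN = {"SO1": 1, "SO2": 4, "SO3": 7}
WAREHOUSE = {"Arr1": 1, "Arr2": 2, "Arr3": 8}
DUPLICATE_PENALTY = 2  # assumed cost of linking one administration trace twice


def link_score(admin_id, warehouse_id):
    """Single assumed factor: traces that are close in time score higher."""
    return -abs(ADMIN[admin_id] - WAREHOUSE[warehouse_id])


def fitness(solution):
    """Sum of link scores minus a penalty for reused administration traces."""
    reused = len(solution) - len(set(solution.values()))
    total = sum(link_score(a, w) for w, a in solution.items())
    return total - DUPLICATE_PENALTY * reused


def initial_solution():
    """Step 1: give every warehouse trace its best individual link."""
    return {w: max(ADMIN, key=lambda a: link_score(a, w)) for w in WAREHOUSE}


def mutate(solution, rng):
    """Step 2: pseudo-randomly relink one warehouse trace."""
    changed = dict(solution)
    warehouse_id = rng.choice(list(WAREHOUSE))
    changed[warehouse_id] = rng.choice(list(ADMIN))
    return changed


def merge_links(steps=500, seed=0):
    rng = random.Random(seed)
    current = initial_solution()
    for _ in range(steps):  # Step 4: stop after a fixed number of steps
        candidate = mutate(current, rng)
        if fitness(candidate) > fitness(current):  # Step 3: keep the better one
            current = candidate
    return current


best = merge_links()
print(best, fitness(best))
```

There is only one current solution and no cross-over, so the loop behaves like a hill climber rather than a full genetic algorithm.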
  18. Section 4: Experiment results
  19. Experiment: proof of concept
      Simulated data
      • Given model
      • Generate
          • random set of logs
          • single log (=solution)
      • Use merge algorithm to merge set of logs
      • Check resulting log against solution log
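
The data generation step of such an experiment might look like the sketch below. The three-step model, the random time gaps and the field names are assumptions for illustration; the actual simulation setup (including noise and overlap) is not described in the slides.

```python
import random

# Assumed process model: activity name and the system that records it.
MODEL = [("SO", "administration"), ("Deliver", "warehouse"), ("Inv", "administration")]


def simulate(cases, seed=0):
    """Generate the single solution log: every event of every case, in time order."""
    rng = random.Random(seed)
    solution_log = []
    clock = 0
    for case in range(1, cases + 1):
        for activity, system in MODEL:
            clock += rng.randint(1, 5)  # random gap between consecutive events
            solution_log.append(
                {"case": f"SO{case}", "activity": activity, "system": system, "time": clock}
            )
    return solution_log


def split_by_system(solution_log):
    """Derive the set of logs to merge: one log per recording system."""
    logs = {}
    for event in solution_log:
        logs.setdefault(event["system"], []).append(event)
    return logs


solution_log = simulate(cases=3)
logs = split_by_system(solution_log)
print({system: len(events) for system, events in logs.items()})
```

Because the solution log is generated first, the output of any merge algorithm run on `logs` can be compared with it afterwards.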
  20. Experiment: proof of concept
      Advantages of using simulated data
      • Solution is known
      • Controllable parameters (e.g. noise, overlap, matching id)
      Disadvantages of using simulated data
      • Limited internal validity (are results realistic?)
      • No external validity (results not generalisable)
  21. Experiment results
      Incorrect links related to total links identified
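
The reported measure, incorrect links relative to the total number of links identified, can be computed as in this sketch. Representing a link as a pair of trace identifiers is an assumption, and the example data is made up.

```python
def incorrect_link_ratio(identified_links, solution_links):
    """Share of identified links that do not occur in the known solution."""
    identified = set(identified_links)
    if not identified:
        return 0.0
    incorrect = identified - set(solution_links)
    return len(incorrect) / len(identified)


# Made-up example: two of the four identified links are wrong.
solution = {("SO1", "Arr1"), ("SO2", "Arr2"), ("SO3", "Arr3"), ("SO4", "Arr4")}
identified = {("SO1", "Arr1"), ("SO2", "Arr3"), ("SO3", "Arr2"), ("SO4", "Arr4")}
ratio = incorrect_link_ratio(identified, solution)
print(ratio)
```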
  22. Section 5: Discussion
  23. Future work
      Optimise genetic algorithm
      • Fewer incorrect links
      • Faster implementation (AIS algorithm)
      • Fitness function factors
      Validation with real test cases
      • Ghent University DPO (Human Resources)
      • Century21 (Real Estate) & FlexPack (Packaging)
      • BNP Paribas Fortis (Finance)
      • ...
  24. Contact information
      Jan Claes
      jan.claes@ugent.be
      http://processmining.ugent.be
      Twitter: @janclaesbelgium
