FACULTY OF ECONOMICS AND BUSINESS ADMINISTRATION




       Integrating Computer Log Files
             for Process Mining
           A Genetic Algorithm Inspired Technique

                                                       Jan Claes
                                                       jan.claes@ugent.be
                                                       http://processmining.ugent.be
                                                       Ghent University, Belgium

Faculty of Economics and Business Administration                                      Jan Claes for INISET@CAiSE 2011
Department of Management Information and Operations Management                                            21 June, 2011




                            1. Process Mining




A plane crashed... What happened?




Analyse the ‘black box’
A process failed... What happened?


Analyse the ‘black box’: look for historical data
Process Mining:
        Reconstruct and analyse processes
        From historical process data
             • Log files
             • Audit trails
             • Database history fields/tables



Process Mining

Processes are supported by IT systems
IT systems record actual process data
Process data can be used to automatically
   Discover process model
   Check conformance with existing process info
   Extend existing process model
Attention
        Only As-Is
        Only (correctly) recorded information
Process Mining steps

 Preparation
            Collect data: find traces
            Merge data: from different sources
            Structure data: group per instance
            Convert data: to tool specific format
 Process mining
 Make decisions, take action
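
The "structure data" preparation step can be illustrated with a minimal sketch. It assumes raw events arrive as (case identifier, activity, timestamp) tuples; that layout, the sample values and the function name `group_per_instance` are illustrative assumptions, not part of the talk.

```python
from collections import defaultdict

# Hypothetical raw events: (case_id, activity, timestamp).
events = [
    ("SO2", "Invoice", 6),
    ("SO1", "Sales order", 1),
    ("SO1", "Invoice", 3),
    ("SO2", "Sales order", 4),
]

def group_per_instance(events):
    """Group events per process instance and order each trace by timestamp."""
    traces = defaultdict(list)
    for case_id, activity, timestamp in events:
        traces[case_id].append((timestamp, activity))
    # Sort on timestamp, then keep only the activity names.
    return {case: [activity for _, activity in sorted(evts)]
            for case, evts in traces.items()}

traces = group_per_instance(events)
print(traces)
```

Each resulting trace is the ordered list of activities of one instance, which is the shape most process mining tools expect before conversion to a tool-specific format.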






                          2. Merging log files




Example

Product ordering, registered events:
        Sales order: document creation (administration)
        Delivery: truck load confirmation (warehouse)
        Invoice: document creation (administration)
Logging
        from administration software
        from warehouse software
How to merge both log files?
Example 1

Administration                                                   Warehouse
          SO1       SO > Inv                                       SO1       Deliver

          SO2       SO > Inv                                       SO2       Deliver

          SO3       SO > Inv                                       SO3       Deliver

                                         SO1 SO > Deliver > Inv

                                         SO2 SO > Deliver > Inv

                                         SO3 SO > Deliver > Inv


       Merge based on matching trace identifiers
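
A minimal sketch of Example 1, assuming each log is a dictionary from trace identifier to a list of (timestamp, activity) events. The timestamps and the name `merge_by_trace_id` are hypothetical; the slide itself shows only the identifiers and activities.

```python
def merge_by_trace_id(log_a, log_b):
    """Merge traces that share an identifier and order their events by time."""
    merged = {}
    for trace_id in log_a.keys() | log_b.keys():
        events = log_a.get(trace_id, []) + log_b.get(trace_id, [])
        merged[trace_id] = sorted(events)
    return merged

# Hypothetical timestamps; activities follow the slide.
administration = {"SO1": [(1, "SO"), (3, "Inv")], "SO2": [(4, "SO"), (6, "Inv")]}
warehouse = {"SO1": [(2, "Deliver")], "SO2": [(5, "Deliver")]}

merged = merge_by_trace_id(administration, warehouse)
print([activity for _, activity in merged["SO1"]])
```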
Example 2

Administration                                                   Warehouse
           SO1      SO > Inv                                        Del1 Deliver (SO1)

           SO2      SO > Inv                                        Del2 Deliver (SO2)

           SO3      SO > Inv                                        Del3 Deliver (SO3)

                                         SO1 SO > Deliver > Inv

                                         SO2 SO > Deliver > Inv

                                         SO3 SO > Deliver > Inv


       Merge based on matching attribute values
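
A minimal sketch of Example 2, assuming each warehouse trace carries an attribute that names the sales order it belongs to. The attribute name `order_ref`, the timestamps and the function name are assumptions made for illustration.

```python
def merge_by_attribute(admin_log, warehouse_log, attribute="order_ref"):
    """Attach each warehouse trace to the administration trace named in `attribute`."""
    merged = {trace_id: list(events) for trace_id, events in admin_log.items()}
    for trace in warehouse_log.values():
        target = trace["attributes"].get(attribute)
        if target in merged:
            merged[target].extend(trace["events"])
    # Order the combined events of every trace chronologically.
    return {trace_id: sorted(events) for trace_id, events in merged.items()}

administration = {"SO1": [(1, "SO"), (3, "Inv")], "SO2": [(4, "SO"), (6, "Inv")]}
warehouse = {
    "Del1": {"attributes": {"order_ref": "SO1"}, "events": [(2, "Deliver")]},
    "Del2": {"attributes": {"order_ref": "SO2"}, "events": [(5, "Deliver")]},
}

merged = merge_by_attribute(administration, warehouse)
print(merged["SO1"])
```

Warehouse traces whose attribute value matches no administration trace are left out of this sketch; a real merge would have to decide what to do with them.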
Example 3

Administration                                                   Warehouse
           SO1      SO (t1) > Inv (t3)                              Arr1 Deliver (t2)

           SO2      SO (t4) > Inv (t6)                              Arr2 Deliver (t5)

           SO3      SO (t7) > Inv (t9)                              Arr3 Deliver (t8)

                    t1 < t2 < t3 << t4 < t5 < t6 << t7 < t8 < t9

                                         SO1 SO > Deliver > Inv

                                         SO2 SO > Deliver > Inv

                                         SO3 SO > Deliver > Inv


                 Merge based on time information
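
A minimal sketch of Example 3, assuming events are already sorted by time within each trace and that t1..t9 take the values 1..9. The function name `link_by_time` and the containment rule (a delivery belongs to the order whose time span encloses it) are illustrative assumptions.

```python
def link_by_time(admin_log, warehouse_log):
    """Link each warehouse trace to the first administration trace whose
    time span (first to last event) encloses all of its events."""
    links = {}
    for w_id, w_events in warehouse_log.items():
        w_start, w_end = w_events[0][0], w_events[-1][0]
        for a_id, a_events in admin_log.items():
            if a_events[0][0] <= w_start and w_end <= a_events[-1][0]:
                links[w_id] = a_id
                break
    return links

administration = {
    "SO1": [(1, "SO"), (3, "Inv")],
    "SO2": [(4, "SO"), (6, "Inv")],
    "SO3": [(7, "SO"), (9, "Inv")],
}
warehouse = {"Arr1": [(2, "Deliver")], "Arr2": [(5, "Deliver")], "Arr3": [(8, "Deliver")]}

links = link_by_time(administration, warehouse)
print(links)
```

The rule only gives a unique answer because the orders in this example do not overlap in time; with overlapping orders several links would be plausible.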
Merging computer log files

Merge based on
        Example 1: matching trace identifiers                        indicator 1
        Example 2: matching attribute values                         indicator 2
        Example 3: time information                                  indicator 3
General solution
  algorithm combining different indicators
Genetic algorithm
  indicators build up fitness function
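
One way to read "indicators build up fitness function" is as a weighted sum of indicator scores over all links in a candidate solution. The sketch below assumes that reading; the indicator definitions, the equal weights and the trace layout are hypothetical and not taken from the talk.

```python
def indicator_same_id(a_id, b_id, a, b):
    """Indicator 1: both traces carry the same identifier."""
    return 1.0 if a_id == b_id else 0.0

def indicator_attribute(a_id, b_id, a, b):
    """Indicator 2: an attribute value of one trace names the other trace."""
    return 1.0 if a_id in b["attrs"].values() or b_id in a["attrs"].values() else 0.0

def indicator_time(a_id, b_id, a, b):
    """Indicator 3: the events of trace b fall inside the time span of trace a."""
    a_times = [t for t, _ in a["events"]]
    b_times = [t for t, _ in b["events"]]
    return 1.0 if min(a_times) <= min(b_times) and max(b_times) <= max(a_times) else 0.0

# Hypothetical weights: every indicator counts equally.
INDICATORS = [(indicator_same_id, 1.0), (indicator_attribute, 1.0), (indicator_time, 1.0)]

def fitness(links, log_a, log_b):
    """Sum the weighted indicator scores over all (a_id, b_id) links."""
    score = 0.0
    for a_id, b_id in links:
        a, b = log_a[a_id], log_b[b_id]
        score += sum(weight * indicator(a_id, b_id, a, b) for indicator, weight in INDICATORS)
    return score

log_a = {
    "SO1": {"attrs": {}, "events": [(1, "SO"), (3, "Inv")]},
    "SO2": {"attrs": {}, "events": [(4, "SO"), (6, "Inv")]},
}
log_b = {
    "Del1": {"attrs": {"order": "SO1"}, "events": [(2, "Deliver")]},
    "Del2": {"attrs": {"order": "SO2"}, "events": [(5, "Deliver")]},
}

good = fitness([("SO1", "Del1"), ("SO2", "Del2")], log_a, log_b)
bad = fitness([("SO1", "Del2"), ("SO2", "Del1")], log_a, log_b)
print(good, bad)
```

A solution with the right links scores higher than one with the links swapped, which is what lets a search procedure tell them apart.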





                        3. Genetic algorithm




Genetic algorithm

[Figure: 1st generation > cross-over and mutation > 2nd generation > survival of the fittest > 3rd generation]

Genetic algorithm: fitness function score

                   1st generation     2nd generation     3rd generation
                         14                 18                 18
                         27                 29                 28
                          6                  5                 32
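
For readers new to the idea, a generic genetic algorithm can be sketched as below. This is a textbook-style toy that evolves bit strings towards as many ones as possible; the bit-string encoding, population size, mutation rate and fitness function are all assumptions chosen for brevity and are unrelated to log merging.

```python
import random

def evolve(fitness, length=10, population_size=6, generations=20, seed=1):
    """Evolve bit strings with cross-over, mutation and survival of the fittest."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(population_size)]
    for _ in range(generations):
        offspring = []
        for _ in range(population_size):
            mother, father = rng.sample(population, 2)
            cut = rng.randrange(1, length)
            child = mother[:cut] + father[cut:]       # cross-over
            if rng.random() < 0.2:
                child[rng.randrange(length)] ^= 1     # mutation: flip one bit
            offspring.append(child)
        # Survival of the fittest: keep the best of parents and offspring.
        population = sorted(population + offspring, key=fitness, reverse=True)[:population_size]
    return population[0]

best = evolve(fitness=sum)
print(best, sum(best))
```

Because parents compete with their offspring for survival, the best score in the population never decreases from one generation to the next.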
Genetic algorithm inspired technique

Find links between traces of both log files and
 merge them chronologically in new log file
Steps
        Make initial solution (best individual links)
        Make pseudo-random changes
         (try to improve score for one specific factor)
        Evaluate (keep original or changed solution)
        Stop condition (fixed number of steps)
Only one solution, no cross-over
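
The steps above can be sketched as a single-solution search. This is a simplified illustration under stated assumptions, not the implementation from the talk: traces are reduced to lists of timestamps, `link_score` is a toy time-based score, changes are drawn uniformly at random rather than targeted at one specific factor, and the reuse penalty is an invented device to discourage linking two traces to the same partner.

```python
import random

def link_score(a_times, b_times):
    """Toy score: 1 if trace b falls inside the time span of trace a."""
    return 1.0 if a_times[0] <= b_times[0] and b_times[-1] <= a_times[-1] else 0.0

def total_score(solution, log_a, log_b):
    """Score a whole solution (mapping b_id -> a_id), penalising reused a-traces."""
    reuse_penalty = len(solution) - len(set(solution.values()))
    return sum(link_score(log_a[a], log_b[b]) for b, a in solution.items()) - reuse_penalty

def search(log_a, log_b, steps=100, seed=42):
    rng = random.Random(seed)
    a_ids, b_ids = sorted(log_a), sorted(log_b)
    # Step 1: initial solution from the best individual link per trace.
    solution = {b: max(a_ids, key=lambda a: link_score(log_a[a], log_b[b])) for b in b_ids}
    best = total_score(solution, log_a, log_b)
    # Step 4: stop after a fixed number of steps.
    for _ in range(steps):
        # Step 2: pseudo-random change, relink one trace of log B.
        candidate = dict(solution)
        candidate[rng.choice(b_ids)] = rng.choice(a_ids)
        # Step 3: evaluate, keep the changed solution only if it is no worse.
        score = total_score(candidate, log_a, log_b)
        if score >= best:
            solution, best = candidate, score
    return solution, best

log_a = {"SO1": [1, 3], "SO2": [4, 6], "SO3": [7, 9]}
log_b = {"Arr1": [2], "Arr2": [5], "Arr3": [8]}
solution, best = search(log_a, log_b)
print(solution, best)
```

In this toy input the initial solution is already optimal, so every relinking attempt is rejected or leaves the solution unchanged; noisier data would give the random changes something to improve.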




                      4. Experiment results




Experiment: proof of concept


Simulated data
        Given model
        Generate
             • random set of logs
             • single log (=solution)
        Use merge algorithm to merge set of logs
        Check resulting log with solution log



Experiment: proof of concept

Advantages of using simulated data
        Solution is known
        Controllable parameters
         (e.g. noise, overlap, matching id)
Disadvantages of using simulated data
        Limited internal validity (are results realistic?)
        No external validity (results not generalisable)


Experiment results

Incorrect links related to total links identified
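
The measure named on this slide can be computed as the share of identified links that do not appear in the known solution. The sketch below assumes links are (administration trace, warehouse trace) pairs; the sample links are made up to exercise the function and are not experiment data.

```python
def incorrect_link_ratio(found_links, true_links):
    """Fraction of identified links that are absent from the known solution."""
    found = set(found_links)
    if not found:
        return 0.0
    incorrect = found - set(true_links)
    return len(incorrect) / len(found)

# Made-up links for illustration only.
true_links = [("SO1", "Arr1"), ("SO2", "Arr2"), ("SO3", "Arr3"), ("SO4", "Arr4")]
found_links = [("SO1", "Arr1"), ("SO2", "Arr3"), ("SO3", "Arr2"), ("SO4", "Arr4")]

ratio = incorrect_link_ratio(found_links, true_links)
print(ratio)
```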








                                    5. Discussion




Future work

Optimise genetic algorithm
        Fewer incorrect links
        Faster implementation (AIS algorithm)
        Fitness function factors
Validation with real test cases
        Ghent University DPO (Human Resources)
        Century21 (Real Estate) & FlexPack (Packaging)
        BNP Paribas Fortis (Finance)
        ...
Contact information




                                             Jan Claes
                                             jan.claes@ugent.be

                                             http://processmining.ugent.be
                                             Twitter: @janclaesbelgium



