EIS 2011

Slides of my presentation at EIS conference, 31 October 2011, Delft, NL

  1. Merging Computer Log Files for Process Mining: An Artificial Immune System Technique
     Jan Claes and Geert Poels
     http://processmining.ugent.be
     Ghent University, Faculty of Economics and Business Administration, Department of Management Information and Operations Management
     Jan Claes for EIS 2011, 30 October, 2011
  2. Process Mining
     Processes are supported by IT systems.
     IT systems record actual process data.
     Process data can be used to:
       • Discover a process model
       • Check conformance with existing process info
       • Improve or extend an existing process model
     Attention: Process Mining
       • Only As-Is
       • Only (correctly) recorded information
  3. Process data in event logs
     [Figure labels: Event log, The process, Process support, Grouped events, Recorded events]
  4. Process Mining steps
     Preparation
       • Collect data: find event information
       • Merge data: from different sources
       • Structure data: group per instance
       • Convert data: to tool-specific format
     Process mining
     Make decisions, take action
     Legend:
       • Manual task: analysts needed in most cases
       • Automated task: less human involvement needed
  5. Merging log files
     My research: Merging log files
  6. Merging log files
     1. Find links between traces
     2. Merge events chronologically
     3. Add unlinked traces
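The three steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes step 1 has already produced one-to-one links, and the log layout (a dict from trace id to a list of `(timestamp, activity)` events), the function name `merge_logs`, and the sample data are all hypothetical.

```python
from datetime import datetime
from operator import itemgetter

by_time = itemgetter(0)  # events are (timestamp, activity) tuples (assumed layout)


def merge_logs(log1, log2, links):
    """Merge two logs, given the trace links found in step 1.

    Each log maps a trace id to a list of (timestamp, activity) events;
    links is a list of (log1_trace_id, log2_trace_id) pairs.
    """
    merged = []
    # Step 2: merge the events of each linked pair chronologically.
    for id1, id2 in links:
        merged.append(sorted(log1[id1] + log2[id2], key=by_time))
    # Step 3: add the traces that were not linked to anything.
    linked1 = {id1 for id1, _ in links}
    linked2 = {id2 for _, id2 in links}
    merged += [sorted(ev, key=by_time) for tid, ev in log1.items() if tid not in linked1]
    merged += [sorted(ev, key=by_time) for tid, ev in log2.items() if tid not in linked2]
    return merged


# Hypothetical sample data.
log1 = {
    "t1": [(datetime(2011, 10, 31, 12, 0), "register"),
           (datetime(2011, 10, 31, 12, 20), "ship")],
    "t2": [(datetime(2011, 10, 31, 13, 0), "register")],
}
log2 = {
    "u1": [(datetime(2011, 10, 31, 12, 10), "check")],
    "u9": [(datetime(2011, 10, 31, 15, 0), "check")],
}
merged = merge_logs(log1, log2, links=[("t1", "u1")])
for trace in merged:
    print([activity for _, activity in trace])
```

Sorting the concatenated event lists keeps the sketch short; it relies on both logs using comparable timestamps, which may not hold for real systems with unsynchronised clocks.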
  7. Find links
     Required properties of solution
       • Finds traces in both log files that belong to the same process execution
       • Without prior knowledge about the provided log files (as generic as possible)
       • But with maximal possibilities for the (expert) user to include his knowledge about the log files
  8. Find links
     Proposed solution
       • Take the best possible guess based on assumptions
       • Include multiple indicator factors in the analysis
       • Calculate factor scores for each analysed solution
       • Combine factor scores into a global score per solution
       • The 'best guess' is the solution with the highest combined score because, based on the assumed indicators, most indicator value points to this solution
       • Provide user interaction possibilities
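The scoring idea above can be illustrated with a small sketch. The slides do not say how factor scores are combined (they refer to the paper), so the weighted sum, the weights, the factor names, and the candidate solutions below are assumptions chosen only for illustration. Picking the maximum over a fixed list also stands in for the search over solutions, which is not shown here.

```python
def global_score(factor_scores, weights):
    """Combine factor scores into one global score.

    A weighted sum is an assumption; the slides do not specify the formula.
    """
    return sum(weights[name] * score for name, score in factor_scores.items())


def best_guess(solutions, weights):
    """Return the candidate solution with the highest combined score."""
    return max(solutions, key=lambda s: global_score(s["factor_scores"], weights))


# Hypothetical weights and candidate solutions.
weights = {"same_trace_id": 2.0, "equal_attribute_values": 1.0}
solutions = [
    {"links": [("16", "16")],
     "factor_scores": {"same_trace_id": 1.0, "equal_attribute_values": 0.5}},
    {"links": [("16", "19")],
     "factor_scores": {"same_trace_id": 0.0, "equal_attribute_values": 0.75}},
]

best = best_guess(solutions, weights)
print(best["links"], global_score(best["factor_scores"], weights))
```

Exposing the weights as a parameter is one possible way to give the (expert) user a place to express knowledge about the log files.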
  9. Decisions to make
       • Which indicator factors?
       • How to calculate a score for each factor?
       • How to combine factor scores into a global score?
       • Which solutions to analyse? (analyse = calculate & compare scores)
       • Which user interactions to include (expert) user knowledge?
     See paper for more details
  10. Indicator factors
      Same trace identifier
        • Assumption: If both logs contain a trace with the same id, there is a very high chance they match
        • Not always though (e.g. customer id vs. order id)
      [Figure: trace ids from both logs, extracted as 16 10 17 12 18 14 19 16 20 18 21 20]
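A minimal sketch of the same-trace-identifier indicator follows. It assumes the numbers on the slide are two interleaved columns (log 1: 16 to 21, log 2: 10 to 20 in steps of 2); that reading, the 0/1 scoring, and the name `same_id_score` are assumptions, not taken from the paper.

```python
def same_id_score(trace_id_1, trace_id_2):
    """Score 1.0 when both traces carry the same identifier, else 0.0 (assumed scale)."""
    return 1.0 if trace_id_1 == trace_id_2 else 0.0


# Assumed reading of the slide figure: left column is log 1, right column is log 2.
log1_ids = ["16", "17", "18", "19", "20", "21"]
log2_ids = ["10", "12", "14", "16", "18", "20"]

candidates = [(a, b) for a in log1_ids for b in log2_ids
              if same_id_score(a, b) == 1.0]
print(candidates)
```

As the slide notes, an equal id is only an indicator: a customer id in one log can coincide with an order id in the other, so these pairs are candidates to be scored together with other factors, not confirmed links.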
  11. Indicator factors
      Equal attribute values
        • Assumption: The more attributes of a trace and its events from both logs are equal, the higher the chance they match
      [Figure: example rows from both logs, extracted as 16 JAN 12:00 17 JC 14 14:00 17 JAN 12:10 18 JC 15 14:10 18 JAN 12:20 19 JC 16 14:20 19 JAN 12:30 1A JC 17 14:30 20 JAN 12:40 1B JC 18 14:40 21 JAN 12:50 1C JC 19 14:50]
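The equal-attribute-values indicator could be sketched as a count of values shared between two traces. The trace layout (a dict with `attributes` and `events`), the field names, and the choice to compare values regardless of attribute name are assumptions made for this example; the sample values are loosely modelled on the slide figure.

```python
def attribute_values(trace):
    """Collect all attribute values of a trace and of its events."""
    values = set(trace["attributes"].values())
    for event in trace["events"]:
        values.update(event.values())
    return values


def equal_attribute_score(trace1, trace2):
    """Number of attribute values the two traces have in common (assumed measure)."""
    return len(attribute_values(trace1) & attribute_values(trace2))


# Hypothetical traces; field names are illustrative only.
trace_log1 = {"attributes": {"id": "16"},
              "events": [{"resource": "JAN", "time": "12:00"}]}
trace_log2_a = {"attributes": {"id": "19"},
                "events": [{"resource": "JC", "order": "16", "time": "14:20"}]}
trace_log2_b = {"attributes": {"id": "17"},
                "events": [{"resource": "JC", "order": "14", "time": "14:00"}]}

print(equal_attribute_score(trace_log1, trace_log2_a))
print(equal_attribute_score(trace_log1, trace_log2_b))
```

Here the first pair shares the value "16" (a trace id in one log, an event attribute in the other), which is the kind of overlap this indicator is meant to reward.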
  12. Test results
      Simulated data (300-400 msec on standard laptop)
        • Benefit of controllable parameters, known solution
        • Correct number of linked traces in all tests
        • Perfect results for same trace id and up to 50% noise, worse results for higher overlap of traces
      Real data (6-10 min on standard laptop)
        • Correct number of linked traces in all tests
        • Almost perfect results for same trace id and up to 50% noise, worse results for higher overlap
  13. New approach
      Rule Based Merger
        • User has to configure rules for linking traces
        • Rule = relationship between attributes in both logs
        • Events of linked traces are merged chronologically
      Example rule: "Merge all traces where attribute A of the trace in log 1 equals attribute B of any event in the trace in log 2"
      Select attributes, contexts and operator
      Research focus: suggesting merging rules
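The quoted rule can be sketched as a predicate over a pair of traces. This is an illustration under assumptions: the trace layout, the function names `rule_matches` and `find_links`, and the attribute names `order` and `ref` are hypothetical, and the context is fixed to "trace attribute in log 1 versus any event attribute in log 2" as in the example rule. Only the operator is left configurable.

```python
import operator


def rule_matches(trace1, trace2, attr_a, attr_b, op=operator.eq):
    """True if attribute A of trace1 relates (via op) to attribute B of any event in trace2."""
    value = trace1["attributes"].get(attr_a)
    if value is None:
        return False  # the rule cannot apply when attribute A is missing
    return any(attr_b in event and op(value, event[attr_b])
               for event in trace2["events"])


def find_links(log1, log2, attr_a, attr_b, op=operator.eq):
    """Return index pairs (i, j) of traces in log1 and log2 that satisfy the rule."""
    return [(i, j)
            for i, t1 in enumerate(log1)
            for j, t2 in enumerate(log2)
            if rule_matches(t1, t2, attr_a, attr_b, op)]


# Hypothetical logs.
log1 = [
    {"attributes": {"order": "16"}, "events": []},
    {"attributes": {"order": "99"}, "events": []},
]
log2 = [
    {"attributes": {}, "events": [{"ref": "14"}]},
    {"attributes": {}, "events": [{"ref": "15"}, {"ref": "16"}]},
]

links = find_links(log1, log2, attr_a="order", attr_b="ref")
print(links)
```

Once the links are found, the events of each linked pair would be merged chronologically, as the slide states.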
  14. New approach
  15. Contact information
      Jan Claes
      jan.claes@ugent.be
      http://processmining.ugent.be
      Twitter: @janclaesbelgium
