PhD Day 2011

Slides of my presentation at our faculty's PhD Day, 24 May 2011, Gent, B


Published in: Business, Technology

Transcript

  • 1. Merging Log Files for Process Mining
    Jan Claes
    Promotor: Prof. Dr. Geert Poels
    Copromotor: Prof. Dr. Ir. Birger Raa
  • 2. Roadmap
    1. Process Mining
    2. Merging log files
    3. Genetic algorithm
    4. Experiment results
    5. Discussion
    © http://maps.google.com
  • 3. 1. Process Mining
  • 4. Analyse the ‘black box’
    A plane crashed... What happened?
  • 5. Analyse the ‘black box’: look for historical data
    Process Mining:
    Reconstruct and analyse processes
    From historical process data
    Log files
    Audit trails
    Database history fields/tables
    A process failed... What happened?
  • 6. Process Mining
    Audit trail, database fields, CSV log file
    ProM Log File
    ProM Analyses
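As a rough illustration of the step from a raw CSV log file to a log that is ready for analysis, the sketch below groups CSV rows into one trace per case. The column names (`case_id`, `activity`, `timestamp`), the sample rows, and the `read_traces` helper are assumptions made for this example; they do not come from the slides, and this is not ProM's own import code.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical CSV export; the column names are assumptions for this sketch.
SAMPLE_CSV = """case_id,activity,timestamp
1,Register request,2011-05-02 09:00:00
2,Register request,2011-05-02 09:05:00
1,Approve request,2011-05-02 10:30:00
2,Reject request,2011-05-02 11:15:00
"""


def read_traces(csv_text):
    """Group events by case id and order each trace by timestamp."""
    events_per_case = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        when = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S")
        events_per_case[row["case_id"]].append((when, row["activity"]))
    # Each trace becomes the time-ordered list of activities of one case.
    return {
        case: [activity for _, activity in sorted(events)]
        for case, events in events_per_case.items()
    }


traces = read_traces(SAMPLE_CSV)
```

Grouping by case first and sorting afterwards keeps the sketch independent of the row order in the export.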
  • 7. 2. Merging log files
  • 8. Process Mining
    Merging log files
  • 9. Process Mining
    Log Merging
    Merging log files
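The slides do not spell out a merge rule, so the following is only a minimal sketch of the simplest case: both logs share a matching case id and every event carries a timestamp. The `merge_logs` function, the log layout (a dict from case id to a list of `(timestamp, activity)` pairs), and the sample data are hypothetical. When no matching id exists, the traces first have to be linked by some other means, which this sketch does not attempt.

```python
from datetime import datetime


def merge_logs(log_a, log_b):
    """Merge two logs, assuming traces with the same case id belong together."""
    merged = {}
    for case_id in sorted(set(log_a) | set(log_b)):
        events = log_a.get(case_id, []) + log_b.get(case_id, [])
        # Interleave the events of both sources in chronological order.
        merged[case_id] = sorted(events, key=lambda event: event[0])
    return merged


# Hypothetical logs from two systems that record parts of the same process.
orders = {
    "c1": [
        (datetime(2011, 5, 2, 9, 0), "Receive order"),
        (datetime(2011, 5, 2, 11, 0), "Ship order"),
    ],
}
billing = {
    "c1": [(datetime(2011, 5, 2, 10, 0), "Send invoice")],
    "c2": [(datetime(2011, 5, 3, 8, 0), "Send invoice")],
}

merged = merge_logs(orders, billing)
```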
  • 10. 3. Genetic algorithm
  • 11. Genetic algorithm
    cross-over
    survival of the fittest
    mutation
    1st generation
    2nd generation
    3rd generation
  • 12. Genetic algorithm
    Fitness function score
    cross-over
    survival of the fittest
    mutation
    1st generation
    2nd generation
    3rd generation
    Scores shown in the figure: 14, 18, 18, 27, 29, 28, 6, 5, 32
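The three steps named on these slides (cross-over, survival of the fittest, mutation) can be sketched as a generic genetic algorithm loop. This is a textbook-style illustration, not the algorithm from the presentation: the bit-string individuals, the toy fitness function (count of 1-bits), and all parameter values are assumptions.

```python
import random


def fitness(individual):
    """Toy fitness function: the number of 1-bits (an assumption for this sketch)."""
    return sum(individual)


def crossover(parent_a, parent_b, rng):
    """Single-point cross-over: head of one parent, tail of the other."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def mutate(individual, rng, rate=0.05):
    """Flip each bit with a small probability."""
    return [1 - gene if rng.random() < rate else gene for gene in individual]


def evolve(pop_size=20, length=16, generations=30, seed=0):
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)
    ]
    best_per_generation = []
    for _ in range(generations):
        # Survival of the fittest: the better half survives unchanged.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        best_per_generation.append(fitness(survivors[0]))
        # Refill the population with mutated offspring of the survivors.
        children = []
        while len(survivors) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(survivors, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = survivors + children
    return max(population, key=fitness), best_per_generation


best, history = evolve()
```

Because the survivors are carried over unchanged, the best score in `history` can never drop from one generation to the next.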
  • 13. 4. Experiment results
  • 14. Experiment: proof of concept
    Simulated data
    Given a model, generate:
    a random set of logs
    a single log (= solution)
    Use the merge algorithm to merge the set of logs
    Compare the resulting log with the solution log
  • 15. Experiment: proof of concept
    Advantages of using simulated data
    Solution is known
    Controllable parameters (e.g. noise, overlap, matching id)
    Disadvantages of using simulated data
    Limited internal validity (are results realistic?)
    No external validity (results not generalisable)
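Since the solution is known for simulated data, the check of a merge result against the solution can be sketched as a comparison of trace links. The idea of scoring "links" borrows the wording used under future work; the `score_links` function, the three counts it reports, and the trace ids below are assumptions for illustration and not the metrics used in the experiment.

```python
def score_links(found_links, solution_links):
    """Compare the links proposed by a merge algorithm with the known solution.

    Each link is a pair (trace id in log A, trace id in log B).
    """
    found, solution = set(found_links), set(solution_links)
    return {
        "correct": len(found & solution),
        "incorrect": len(found - solution),
        "missed": len(solution - found),
    }


# Hypothetical example: three true links, one of them mismatched by the merge.
solution_links = [("a1", "b1"), ("a2", "b2"), ("a3", "b3")]
found_links = [("a1", "b1"), ("a2", "b3"), ("a3", "b3")]

scores = score_links(found_links, solution_links)
```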
  • 16. Experiment results
    Results of version of 31/03/2011: GA-inspired algorithm
    Mean time: 4 min
  • 17. Experiment results
    Results of version of 31/03/2011: GA-inspired algorithm
    Mean time: 4 min
  • 18. Experiment results: NEW ALGORITHM
    Results of version of 15/05/2011: AIS algorithm
    Mean time: 350 ms
  • 19. 5. Discussion
  • 20. Future work
    Optimise genetic algorithm
      Fewer incorrect links
      Faster implementation
      Fitness function factors
    Validation with real test cases
      Ghent University DPO (Human Resources)
      Century21 (Real Estate)
      BNP Paribas Fortis (Loan approvals)
      ...
  • 21. Contact information
    Jan Claes
    jan.claes@ugent.be
    http://processmining.ugent.be
    FEB08, Tweekerkenstraat 2
    9000 Gent, Belgium