Metaheuristic Optimization for Automated Business Process Discovery
Adriano Augusto, Marlon Dumas, Marcello La Rosa
Context
Automated discovery of (business) process models from event logs
Automated Process Discovery Approach (APDA)
a » b » c » g » e » h 10
a » b » c » f » g » h 10
a » b » d » g » e » h 10
a » b » d » e » g » h 10
a » b » e » c » g » h 10
a » b » e » d » g » h 10
a » c » b » e » g » h 10
a » c » b » f » g » h 10
a » d » b » e » g » h 10
a » d » b » f » g » h 10
Process Model Quality
How good is an automatically discovered process model?
[Diagram: Event Log → APDA → Process Model; Compare the Process Model with the Event Log]
Fitness, Precision (F-score)
Generalization
Simplicity
Soundness
State-of-the-art
Automated Process Discovery Approaches
APDAs based on Directly-Follows Graphs (DFG-based APDAs)
Split Miner
Inductive Miner
Fodina Miner
Heuristics Miner
…
DFG-based APDAs
[Diagram: Event Log → DFG-based APDA (e.g. Split Miner) → Process Model. Inside the APDA: Discover DFG → Manipulate DFG (e.g. Filtering) → Convert DFG to Model (e.g. BPMN)]
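The Discover DFG and Manipulate DFG steps can be sketched in a few lines of Python. This is only an illustration: it assumes the trailing 10 on each trace is the trace frequency, and the frequency filter is a naive stand-in, not the filtering that Split Miner or any other APDA actually applies. The names `discover_dfg` and `filter_dfg` are hypothetical.

```python
from collections import Counter

# Event log from the slide: trace variant -> frequency (assumed meaning of "10").
LOG = {
    "abcgeh": 10, "abcfgh": 10, "abdgeh": 10, "abdegh": 10, "abecgh": 10,
    "abedgh": 10, "acbegh": 10, "acbfgh": 10, "adbegh": 10, "adbfgh": 10,
}

def discover_dfg(log):
    """Count how often activity x is directly followed by activity y."""
    dfg = Counter()
    for trace, freq in log.items():
        for x, y in zip(trace, trace[1:]):
            dfg[(x, y)] += freq
    return dfg

def filter_dfg(dfg, min_freq):
    """Keep only edges observed at least min_freq times (naive filter)."""
    return {edge: n for edge, n in dfg.items() if n >= min_freq}

dfg = discover_dfg(LOG)
frequent = filter_dfg(dfg, 30)
print(len(dfg), "edges in total,", len(frequent), "after filtering")
```

The threshold passed to `filter_dfg` plays the role of an input parameter: different values yield different DFGs and therefore different models.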
[Diagram: Event Log + input params → DFG-based APDA (e.g. Split Miner): Discover DFG → Manipulate DFG (e.g. Filtering) → Convert DFG to Model (e.g. BPMN) → Process Model]
What is the best input configuration?
[Diagram: for each Configuration 1..N, the DFG-based APDA (e.g. Split Miner) discovers Model (i) from the Log, Assess Quality yields Model Quality (i), and the N qualities are compared]
Model (x) is the BEST!
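A minimal sketch of this exhaustive comparison, assuming the APDA and the quality measure are available as plain functions. `toy_discover`, `toy_assess`, and the parameter names `threshold` and `parallelism` are hypothetical stand-ins so the example runs; they are not a real miner or measure.

```python
from itertools import product

def best_configuration(log, discover, assess, grid):
    """Try every configuration in the grid and keep the best-scoring one."""
    best = None
    for values in product(*grid.values()):
        config = dict(zip(grid.keys(), values))
        model = discover(log, **config)   # hypothetical APDA call
        score = assess(log, model)        # hypothetical quality measure
        if best is None or score > best[0]:
            best = (score, config, model)
    return best

# Toy stand-ins: the "model" is just the configuration, and quality peaks
# at threshold=0.4, parallelism=0.2.
def toy_discover(log, threshold, parallelism):
    return (threshold, parallelism)

def toy_assess(log, model):
    threshold, parallelism = model
    return 1.0 - abs(threshold - 0.4) - abs(parallelism - 0.2)

grid = {"threshold": [0.0, 0.2, 0.4, 0.6], "parallelism": [0.0, 0.2, 0.4]}
score, config, model = best_configuration(None, toy_discover, toy_assess, grid)
print(config, score)
```

The number of APDA runs is the product of the grid sizes (12 here), which is why this strategy becomes expensive as parameters or values are added.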
How to be more efficient?
Optimization Metaheuristics
Population Based
Evolutionary computation
Ant colony
Bee colony
Swarm particles
…
Single-solution Based
Repetitive local search
Iterative local search
Tabu search
Simulated annealing
…
Adapting the Metaheuristics to our Context
Repetitive Local Search (RLS)
Iterative Local Search (ILS)
Tabu Search (TS)
Simulated Annealing (SA)
1. Solution Space
2. Solution Neighbourhood
3. Objective Function
[Diagram: the configuration comparison (Configuration 1..N → DFG-based APDA → Model → Assess Quality → Compare) annotated with Solution Space, Objective Function, and the open question: Neighbours?]
Optimizing a DFG-based APDA
[Diagram: Event Log + input params → Discover DFG → Manipulate DFG (e.g. Filtering) → Convert DFG to Model (e.g. BPMN) → Assess Quality]
Assess Quality
Fitness, precision, generalization, or simplicity?
Fitness and precision, combined as F-score
What measure to use?
Alignment, anti-alignment, PCC, entropy, Markovian accuracy
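For reference, the F-score is the harmonic mean of fitness and precision. A minimal sketch, assuming both values are already computed by one of the measures listed and lie in [0, 1]:

```python
def f_score(fitness, precision):
    """Harmonic mean of fitness and precision; 0.0 when both are 0."""
    if fitness + precision == 0:
        return 0.0
    return 2 * fitness * precision / (fitness + precision)

print(f_score(0.9, 0.6))
```

Because it is a harmonic mean, the F-score is high only when fitness and precision are both high, which makes it a natural single objective for trading one off against the other.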
Explore Neighbour DFGs
Given a DFG, its closest neighbours are the DFGs with one more or one fewer edge.
Adding edges adds behaviour (typically increasing the fitness of the model).
Removing edges removes behaviour (typically increasing the precision of the model).
[Diagram: a DFG and its neighbour DFGs, compared by Model Quality]
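The one-edge neighbourhood can be sketched as follows. This is an illustration under assumptions: a DFG is reduced to a set of edges, and `candidate_edges` (hypothetical) is the pool of edges that may be added, for example the edges of the unfiltered DFG. The sketch does not check whether a neighbour is still a connected graph, which a real implementation would probably need to do.

```python
def neighbours(dfg_edges, candidate_edges):
    """Yield every edge set that differs from dfg_edges by exactly one edge."""
    current = frozenset(dfg_edges)
    for edge in current:
        yield current - {edge}                  # one fewer edge
    for edge in set(candidate_edges) - current:
        yield current | {edge}                  # one more edge

current = {("a", "b"), ("b", "c")}
pool = {("a", "b"), ("b", "c"), ("a", "c")}
found = list(neighbours(current, pool))
print(len(found))
```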
Check Termination Condition
Timeout
Number of iterations
Objective function threshold
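The three termination conditions can be combined in a single check. A minimal sketch; the function name and the default limits are hypothetical and would be tuned per experiment.

```python
import time

def should_stop(started_at, iteration, best_score,
                timeout_s=60.0, max_iterations=100, target_score=0.95):
    """Return True when any of the three termination conditions holds."""
    timed_out = time.monotonic() - started_at >= timeout_s
    out_of_iterations = iteration >= max_iterations
    good_enough = best_score >= target_score
    return timed_out or out_of_iterations or good_enough
```

`started_at` is expected to be a value previously read from `time.monotonic()`, which is unaffected by system clock changes.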
Optimizing a DFG-based APDA
[Diagram: Event Log + input params → Discover DFG → Manipulate DFG (e.g. Filtering) → Convert DFG to Model (e.g. BPMN) → Assess Quality. Optimization Metaheuristic loop: Explore Neighbour DFGs → Convert DFGs to Models → Assess Quality → Select Best DFG Candidate → Check Termination Condition; if not fulfilled, explore again; if fulfilled, output the Process Model]
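The loop can be sketched as a plain greedy local search. This is not RLS, ILS, TS, or SA as used in the talk: those differ in how they pick candidates and escape local optima. Here `to_model` and `assess` are hypothetical callables standing in for Convert DFG to Model and Assess Quality, the toy objective is a Jaccard similarity to a known target edge set, and stopping at a local optimum is an extra assumption of this sketch.

```python
def local_search(initial, candidate_edges, to_model, assess, max_iterations=50):
    """Greedy loop: explore neighbours, convert, assess, keep the best."""
    current = frozenset(initial)
    current_score = assess(to_model(current))
    for _ in range(max_iterations):             # termination: iteration budget
        scored = []
        for edge in candidate_edges:            # explore neighbour DFGs
            neighbour = current ^ {edge}        # add or remove exactly one edge
            scored.append((assess(to_model(neighbour)), neighbour))
        best_score, best = max(scored, key=lambda pair: pair[0])
        if best_score <= current_score:         # termination: local optimum
            break
        current, current_score = best, best_score
    return current, current_score

# Toy setup so the sketch runs: the "model" is the DFG itself.
TARGET = frozenset({("a", "b"), ("b", "c"), ("c", "d")})
ALL_EDGES = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c"), ("b", "d")]

def toy_assess(model):
    return len(model & TARGET) / len(model | TARGET)

result, score = local_search({("a", "b"), ("a", "c")}, ALL_EDGES,
                             lambda dfg: dfg, toy_assess)
print(sorted(result), score)
```

Each iteration costs one conversion and one quality assessment per neighbour, so the objective function dominates the running time.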
Optimization Framework
[Diagram: APDA – Metaheuristic Interface. Inputs: Event Log; Input Settings (Optimization metaheuristics ID, APDA ID, Objective Function ID). Components: Objective Functions, Optimization Metaheuristics, DFG-based APDAs. Output: Process Model]
Optimization Framework Instantiation
[Diagram: APDA – Metaheuristic Interface. Inputs: Event Log; Input Settings (Optimization metaheuristics ID, APDA ID, Objective Function ID). Objective function: Markovian F-score. Optimization metaheuristics: RLS, ILS, TS, SA. DFG-based APDA: Split Miner. Output: Process Model]
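One way to picture the interface is as three registries selected by the IDs in the input settings. This is a hypothetical sketch, not the framework's actual API: the class `InputSettings`, the function `resolve`, and the ID strings `SM` and `MARKOVIAN_F` are invented for illustration, and the registry values would be callables rather than names in a real implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InputSettings:
    metaheuristic_id: str
    apda_id: str
    objective_id: str

# Hypothetical registries keyed by component ID.
METAHEURISTICS = {
    "RLS": "Repetitive Local Search", "ILS": "Iterative Local Search",
    "TS": "Tabu Search", "SA": "Simulated Annealing",
}
APDAS = {"SM": "Split Miner"}
OBJECTIVES = {"MARKOVIAN_F": "Markovian F-score"}

def resolve(settings):
    """Look up the three components selected by the input settings."""
    try:
        return (METAHEURISTICS[settings.metaheuristic_id],
                APDAS[settings.apda_id],
                OBJECTIVES[settings.objective_id])
    except KeyError as missing:
        raise ValueError(f"unknown component ID: {missing}") from None

print(resolve(InputSettings("SA", "SM", "MARKOVIAN_F")))
```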
Evaluation Setup
— 20 real-life event logs (10 BPIC logs, RTFMP, SEPSIS case, and 8 private logs)
— 3 baselines without hyper-parameters optimization:
Inductive Miner (IM), Evolutionary Tree Miner (ETM), Split Miner (SM)
— 1 baseline with hyper-parameters optimization, Split Miner (HPO)
— Markovian accuracy, Alignment accuracy, simplicity, and time performance
[Chart: Markovian F-score across the event logs]
Limitations
— Slower than baselines with default input params (Inductive Miner, Split Miner)
— More complex models when optimizing fitness
— Not applicable to any APDA, only for DFG-based APDA
Thanks for attending!
Questions?
Future Work
— Add more DFG-based APDAs to our framework
(Fodina Miner and Inductive Miner)
— Explore alternative quality measures to drive the optimization metaheuristics
— Combine accuracy and simplicity measures
Results – Markovian F-score
Results – Alignment F-score
