Decision mining enriches process models with rules underlying decisions in processes using historical process execution data. Choices between multiple activities are specified through rules defined over process data. Existing decision mining methods focus on discovering mutually-exclusive rules, which allow only one out of multiple activities to be performed. These methods assume that decision making is fully deterministic and that all factors influencing decisions are recorded. In case the underlying decision rules are overlapping due to non-determinism or incomplete information, the rules returned by existing methods do not fit the recorded data well. This paper proposes a new technique to discover overlapping decision rules, which fit the recorded data better at the expense of precision, using decision tree learning techniques. An evaluation of the method on two real-life data sets confirms this trade-off. Moreover, it shows that the method returns rules with better fitness and precision under certain conditions.
Original paper: http://dx.doi.org/10.1007/978-3-319-39696-5_23
Presented at CAiSE'16
2. Scope: Mining decision rules from event logs
PAGE 1
[Figure: example application process — activities (boxes): Apply, Extensive Check, Simple Check, Request Information, Receive Information, Grant, Reject; data objects (rounded boxes): Amount, Income, Eligibility, Category]
3. Control-flow – Petri net defines order & possible choices
[Figure: Petri net with activities Apply, Extensive Check, Simple Check, Request Information, Receive Information, Grant, Reject; annotated constructs: Exclusive Choice, Sequence, Exclusive Choice]
4. Data-perspective – Data Petri Net modelling decisions
[Figure: Data Petri net annotated with a decision point, data recording, and a decision rule]
5. Comparing the Petri net notation to DMN
DMN 1.1 was released in 2016 and is widely adopted by tool vendors. For example:
Decision table:
U Eligibility Outcome
1 Yes Grant
2 No Reject
[Figure: the equivalent Petri net decision point with guards [Eligibility = Yes] → Grant and [Eligibility = No] → Reject; a DMN decision rule corresponds to a guard]
6. Why are overlapping rules needed?
Incomplete information:
• Not recorded
• Process context
• Confidential
• ...
Non-determinism:
• Expert approval
• Deferred choice
• Randomized check
• Inconsistent human behavior
• ...
7. Goal: Discover rules which may overlap
[Figure: Process Model + Event Log → Overlapping Rule Discovery → Process Model with Overlapping Decision Rules]
8. Decision point - Mutually-exclusive rule
[Figure: decision point between Grant and Reject with guards [Eligibility = Yes] → Grant and [Eligibility = No] → Reject]
Observation instances from an event log:
Count Eligibility Outcome
5x “No” Reject
20x “Yes” Grant
9. Decision point – Overlapping rule
Alternative decision table notation:
C Rating Amount Activity
1 Good - Simple Check
2 Bad - Extensive Check
3 Bad Low Simple Check
4 Bad High Request Information
5 Unknown - Request Information
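This overlapping table can be read with DMN's Collect hit policy, under which every matching rule fires. A minimal sketch of that reading (the dictionary encoding and the `collect` helper are illustrative, not the paper's implementation):

```python
# Overlapping decision table from the example; "-" (any value) is
# encoded by simply omitting that attribute from the condition.
table = [
    ({"Rating": "Good"}, "Simple Check"),                          # rule 1
    ({"Rating": "Bad"}, "Extensive Check"),                        # rule 2
    ({"Rating": "Bad", "Amount": "Low"}, "Simple Check"),          # rule 3
    ({"Rating": "Bad", "Amount": "High"}, "Request Information"),  # rule 4
    ({"Rating": "Unknown"}, "Request Information"),                # rule 5
]

def collect(case):
    """Collect hit policy: return the activities of ALL matching rules."""
    return [activity for condition, activity in table
            if all(case.get(k) == v for k, v in condition.items())]

# A case with Rating = Bad and Amount = High matches rules 2 and 4,
# so both Extensive Check and Request Information are allowed.
print(collect({"Rating": "Bad", "Amount": "High"}))
```

With a mutually-exclusive hit policy only one of these rules could fire; the Collect policy is what makes the overlap expressible.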
10. Proposed Discovery Method
[Figure: Process Model + Event Log → Overlapping Rule Discovery → Process Model with Overlapping Rules]
For each decision point:
Collect Instances → 1st Classification → Collect Misclassified → 2nd Classification → Build Rules
11. 1) Collect Instances
Observation instances collected from the event log:
Rating Amount Outcome
6x Good Low Simple
6x Good High Simple
6x Bad High Extensive
4x Bad High Request
6x Bad Low Extensive
4x Bad Low Simple
6x Unknown High Request
An alignment-based method handles:
• Cyclic behavior
• Noise (missing / additional events)
• Unassigned values
• Inconsistent recording
12. 2) 1st Classification & 3) Misclassified Instances
Rating Amount Outcome
6x Good Low Simple
6x Good High Simple
6x Bad High Extensive
4x Bad High Request
6x Bad Low Extensive
4x Bad Low Simple
6x Unknown High Request
[Figure: first decision tree splits on Rating — Good → Simple (12 OK), Bad → Extensive (12 OK, 8 NOK), Unknown → Request (6 OK)]
13. 4) 2nd Classification
Misclassified instances:
Rating Amount Outcome
4x Bad High Request
4x Bad Low Simple
[Figure: second decision tree splits on Amount — High → Request, Low → Simple]
14. 5) Build Overlapping Decision Rules
[Figure: the first tree (split on Rating) and the second tree (split on Amount) are compiled together]
Compiled to overlapping rules:
If Rating = Good then Simple
If Rating = Unknown then Request
If Rating = Bad then Extensive
If Rating = Bad AND Amount = High then Request
If Rating = Bad AND Amount = Low then Simple
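Steps 2)–5) can be sketched in a few lines of Python on the running example. For brevity, full C4.5-style tree induction (as used in the paper) is replaced here by one-level majority splits, which produce the same two trees on this data:

```python
from collections import Counter, defaultdict

# Observation instances (Rating, Amount, Outcome) from the running example.
instances = (
    [("Good", "Low", "Simple")] * 6 + [("Good", "High", "Simple")] * 6
    + [("Bad", "High", "Extensive")] * 6 + [("Bad", "High", "Request")] * 4
    + [("Bad", "Low", "Extensive")] * 6 + [("Bad", "Low", "Simple")] * 4
    + [("Unknown", "High", "Request")] * 6
)

def majority_split(data, attr):
    """One-level 'tree': map each value of attribute attr to its majority outcome."""
    groups = defaultdict(Counter)
    for row in data:
        groups[row[attr]][row[2]] += 1
    return {value: counts.most_common(1)[0][0] for value, counts in groups.items()}

# 2) 1st classification: split on Rating (attribute 0).
first = majority_split(instances, 0)

# 3) Collect the instances the first tree misclassifies.
misclassified = [row for row in instances if first[row[0]] != row[2]]

# 4) 2nd classification on the misclassified instances: split on Amount.
second = majority_split(misclassified, 1)

# 5) Build overlapping rules as the union of both rule sets. All
# misclassified instances here have Rating = Bad, hence the extra conjunct.
rules = [f"If Rating = {v} then {o}" for v, o in first.items()]
rules += [f"If Rating = Bad AND Amount = {v} then {o}" for v, o in second.items()]
```

The resulting five rules match the compiled rules listed above: the second tree does not replace the first, it adds overlapping alternatives for the cases the first tree gets wrong.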
16. Trade-off: Precise and fitting model
Rating Amount Outcome
6x Good Low Simple
6x Good High Simple
6x Bad High Extensive
4x Bad High Request
6x Bad Low Extensive
4x Bad Low Simple
6x Unknown High Request
[Figure: three rule sets for these instances — an unfitting one, an imprecise (underfitting) one, and one with a good trade-off]
17. Evaluation – Measures
Precision: how much unobserved behavior is modelled?
Fitness: how much observed behavior is modelled?
Image source (CC BY-SA): https://en.wikipedia.org/wiki/Precision_and_recall#/media/File:Precisionrecall.svg
18. Evaluation – Setup
Compared methods:
Method Description Expected Precision Expected Fitness
WO Without rules Poor Good
DTF Mutually-exclusive approach Good Poor
DTT Naïve overlapping approach Poor Good
DTO Presented overlapping approach Balanced Balanced
Datasets:
Dataset # Traces # Events # Attributes # Decisions
Road Fines 150,000 500,000 9 5
Hospital 1,000 15,000 39 11
19. Evaluation – Example rules in the hospital data
Method Intensive Care Normal Care Skip
DTO L > 0 ∧ H = true L > 0 L ≤ 0 ∨ (L > 0 ∧ H = false) (good trade-off)
DTT true L > 0 L ≤ 0 (imprecise)
DTF false L > 0 L ≤ 0 (unfitting)
20. Evaluation – Precision & Fitness
[Charts: fitness and precision per method on both datasets]
• Fitness: how often rules are violated
• DTO improves fitness over DTF (mutually-exclusive)
• Precision: how strict the rules are
• DTO improves precision over WO
• DTO sacrifices some precision vs. DTF
21. Conclusion & Future Work
• Method: Discovery of overlapping rules using event logs
• Based on decision tree induction
• ProM framework: MultiPerspectiveExplorer
http://www.promtools.org
• Results: trade-off between fitness & precision
• Improves the model fitness over standard trees
• Improves the model precision over the naïve approach
• Future work
• Better experimental validation
• Manage the complexity of discovered rules
• Imbalanced distributions
Speaker notes:
I would like to present our work on “Decision Mining – Discovering Overlapping Rules”.
My name is Felix Mannhardt; I’m a PhD student at Eindhoven University of Technology.
This is joint work with Massimiliano, Hajo and Wil.
First, to scope our work, I would like to introduce some of the assumptions and notations underlying it.
We want to analyze decisions that took place in processes.
We assume that processes can be represented by process models.
Notation: activities are drawn as boxes; data objects as rounded boxes.
The control-flow of a process can be described with a process model.
A process model, such as a Petri net, defines the ordering and dependencies between activities.
We chose Petri nets as notation to be independent of the actual process modelling language (such as BPMN or similar).
For example: …
Next to the order and dependencies between activities, decisions are at the heart of processes.
For example:
data is recorded during the execution of activities;
exclusive choices in the process are decision points;
decision rules govern which activities can be executed.
A decision point: an exclusive choice between two activities.
A mutually-exclusive rule is defined over it.
- DMN decision table using the Collect hit policy
Public “Road Fines” dataset, IEEE taskforce
Private hospital dataset
Simplified Model of the care-path at the hospital
DTO gets better scores for fitness and precision compared to DTT.
Lactate levels are related to admission,