Decision Mining Revisited - Discovering Overlapping Rules

Decision mining enriches process models with the rules underlying decisions in processes, using historical process execution data. Choices between multiple activities are specified through rules defined over process data. Existing decision mining methods focus on discovering mutually-exclusive rules, which allow only one out of multiple activities to be performed. These methods assume that decision making is fully deterministic and that all factors influencing decisions are recorded. When the underlying decision rules are overlapping due to non-determinism or incomplete information, the rules returned by existing methods do not fit the recorded data well. This paper proposes a new technique, based on decision tree learning, to discover overlapping decision rules, which fit the recorded data better at the expense of precision. An evaluation of the method on two real-life data sets confirms this trade-off. Moreover, it shows that the method returns rules with better fitness and precision under certain conditions.

Original paper: http://dx.doi.org/10.1007/978-3-319-39696-5_23
Presented at CAiSE'16

Published in: Science


  1. Decision Mining Revisited - Discovering Overlapping Rules. Felix Mannhardt, Massimiliano de Leoni, Hajo A. Reijers, Wil M.P. van der Aalst
  2. Scope: Mining decision rules from event logs. [Figure: example loan process with the activities Apply, Simple Check, Extensive Check, Request Information, Receive Information, Grant, and Reject, and the data attributes Amount, Eligibility, Income, and Category.]
  3. Control-flow – a Petri net defines the order and the possible choices. [Figure: the Petri net of the example process, with its Sequence and Exclusive Choice constructs highlighted.]
  4. Data perspective – a Data Petri Net models decisions. [Figure: the same net annotated with a decision point, data recordings, and decision rules.]
  5. DMN 1.1, released in 2016, is widely adopted by tool vendors. Comparing the Petri net notation to DMN: the guards [Eligibility = Yes] and [Eligibility = No] on the transitions Grant and Reject correspond to the rows of a DMN decision table:

         U  Eligibility  Outcome
         1  Yes          Grant
         2  No           Reject
  6. Why are overlapping rules needed? Two main causes:
     • Incomplete information: attributes not recorded, process context, confidential data, ...
     • Non-deterministic decisions: expert approval, deferred choice, randomized check, inconsistent human behavior, ...
  7. Goal: Discover rules which may overlap. Input: a process model and an event log; the Overlapping Rule Discovery step outputs a process model with overlapping decision rules.
  8. Decision point - mutually-exclusive rule. The guards [Eligibility = Yes] → Grant and [Eligibility = No] → Reject perfectly separate the observation instances taken from the event log:

         Count  Eligibility  Outcome
         5x     "No"         Reject
         20x    "Yes"        Grant

  9. Decision point – overlapping rule. Alternative decision table notation, in which several rows may match the same case:

         C  Rating   Amount  Activity
         1  Good     -       Simple Check
         2  Bad      -       Extensive Check
         3  Bad      Low     Simple Check
         4  Bad      High    Request Information
         5  Unknown  -       Request Information
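With overlapping rules, more than one row of the table above can match the same case, so evaluating the table yields a set of permitted activities rather than a single one. A minimal Python sketch of this semantics (names such as `permitted_activities` are illustrative, not taken from the authors' implementation):

```python
# Minimal sketch of evaluating an overlapping decision table.
# Each rule maps attribute conditions to a permitted activity;
# an absent attribute acts as a wildcard ("-" in the table).
RULES = [
    ({"Rating": "Good"},                  "Simple Check"),
    ({"Rating": "Bad"},                   "Extensive Check"),
    ({"Rating": "Bad", "Amount": "Low"},  "Simple Check"),
    ({"Rating": "Bad", "Amount": "High"}, "Request Information"),
    ({"Rating": "Unknown"},               "Request Information"),
]

def permitted_activities(case, rules=RULES):
    """Return the set of activities whose rule matches the case."""
    result = set()
    for conditions, activity in rules:
        if all(case.get(attr) == value for attr, value in conditions.items()):
            result.add(activity)
    return result

# A "Bad, High" case matches two rules, so the rules overlap:
print(permitted_activities({"Rating": "Bad", "Amount": "High"}))
```

A mutually-exclusive table is the special case where this set always has exactly one element.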
  10. Proposed Discovery Method. Input: process model and event log; output: process model with overlapping rules. For each decision point: 1) collect instances, 2) run a first classification, 3) collect the misclassified instances, 4) run a second classification, 5) build the rules.
  11. Step 1) Collect Instances. An alignment-based method extracts observation instances from the event log, coping with cyclic behavior, noise (missing or additional events), unassigned values, and inconsistent recording:

         Count  Rating   Amount  Outcome
         6x     Good     Low     Simple
         6x     Good     High    Simple
         6x     Bad      High    Extensive
         4x     Bad      High    Request
         6x     Bad      Low     Extensive
         4x     Bad      Low     Simple
         6x     Unknown  High    Request
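The collection step can be pictured as replaying each trace and, at the decision point, recording the attribute values written so far together with the chosen activity. A simplified sketch assuming a clean log (the paper's alignment-based method, which handles noise and cyclic behavior, is not reproduced here; all names are illustrative):

```python
from collections import Counter

# Activities that are the alternatives at the decision point of interest.
DECISION_ALTERNATIVES = {"Simple", "Extensive", "Request"}

def collect_instances(log, attributes=("Rating", "Amount")):
    """Replay each trace; whenever one of the decision's alternatives
    occurs, record the current attribute values plus the chosen activity."""
    instances = Counter()
    for trace in log:
        state = {}
        for event in trace:
            if event["activity"] in DECISION_ALTERNATIVES:
                key = tuple(state.get(a) for a in attributes)
                instances[key + (event["activity"],)] += 1
            state.update(event.get("data", {}))
    return instances

# Toy log with two traces:
log = [
    [{"activity": "Apply", "data": {"Rating": "Good", "Amount": "Low"}},
     {"activity": "Simple"}],
    [{"activity": "Apply", "data": {"Rating": "Bad", "Amount": "High"}},
     {"activity": "Request"}],
]
print(collect_instances(log))
```

On a real log, the alignment step is what makes this robust to traces that deviate from the model.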
  12. Steps 2) First Classification and 3) Misclassified Instances. A decision tree trained on all instances splits on Rating: Good → Simple (12 OK), Unknown → Request (6 OK), Bad → Extensive (12 OK, 8 NOK). The 8 misclassified instances all fall under Rating = Bad.
  13. Step 4) Second Classification. A second decision tree is trained on the misclassified instances only (4x Bad/High/Request, 4x Bad/Low/Simple) and splits on Amount: High → Request, Low → Simple.
  14. Step 5) Build Overlapping Decision Rules. The two trees are compiled into overlapping rules:
         If Rating = Good then Simple
         If Rating = Unknown then Request
         If Rating = Bad then Extensive
         If Rating = Bad AND Amount = High then Request
         If Rating = Bad AND Amount = Low then Simple
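The five steps can be sketched end-to-end. The sketch below replaces real decision tree induction with a one-level majority split per attribute, which is enough to reproduce the flow on the instance table from slide 11; it is an illustration of the two-stage idea, not the authors' algorithm:

```python
from collections import Counter, defaultdict

# Instances (Rating, Amount, Outcome) with multiplicities, as on slide 11.
INSTANCES = (
    [("Good", "Low", "Simple")] * 6 + [("Good", "High", "Simple")] * 6 +
    [("Bad", "High", "Extensive")] * 6 + [("Bad", "High", "Request")] * 4 +
    [("Bad", "Low", "Extensive")] * 6 + [("Bad", "Low", "Simple")] * 4 +
    [("Unknown", "High", "Request")] * 6
)

def one_level_split(instances, attr_index):
    """Majority-vote 'stump': one rule per value of the chosen attribute."""
    by_value = defaultdict(Counter)
    for inst in instances:
        by_value[inst[attr_index]][inst[2]] += 1
    return {value: counts.most_common(1)[0][0] for value, counts in by_value.items()}

# Stage 1: classify on Rating (index 0), then collect the misclassified.
first = one_level_split(INSTANCES, 0)
misclassified = [i for i in INSTANCES if first[i[0]] != i[2]]

# Stage 2: classify the misclassified instances on Amount (index 1).
second = one_level_split(misclassified, 1)

# Union of both rule sets yields overlapping rules. In this example all
# misclassified instances have Rating = Bad, hence the fixed prefix.
rules = [("Rating = %s" % v, a) for v, a in first.items()]
rules += [("Rating = Bad AND Amount = %s" % v, a) for v, a in second.items()]
for condition, activity in rules:
    print("If", condition, "then", activity)
```

The printed rules coincide with the five rules on slide 14; the paper uses proper decision tree induction at both stages instead of a single majority split.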
  15. Resulting Data-aware Process Model. [Figure: the process model annotated with the discovered overlapping rules.]
  16. Trade-off: a precise and fitting model. On the instance table from slide 11, mutually-exclusive rules are unfitting, a rule that allows everything is imprecise (underfitting), and the discovered overlapping rules strike a good trade-off.
  17. Evaluation – Measures. Precision: how much unobserved behavior is modelled? Fitness: how much observed behavior is modelled? Image source (CC BY-SA): https://en.wikipedia.org/wiki/Precision_and_recall#/media/File:Precisionrecall.svg
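As a rough illustration of the trade-off, simplified stand-ins for the two measures can be computed over observation instances. These are not the exact definitions used in the paper's evaluation: fitness here is the fraction of observed instances whose chosen activity is permitted by the rules, and precision is the fraction of permitted (case, activity) pairs that were actually observed.

```python
def fitness(instances, permitted):
    """Fraction of observed choices that the rules allow."""
    ok = sum(1 for case, chosen in instances if chosen in permitted(case))
    return ok / len(instances)

def precision(instances, permitted):
    """Fraction of allowed (case, activity) pairs that were observed."""
    observed = {(case, chosen) for case, chosen in instances}
    allowed = {(case, act) for case, _ in instances for act in permitted(case)}
    return len(observed & allowed) / len(allowed)

# Toy example: rules that always permit both activities (a naive
# overlapping rule) are perfectly fitting but imprecise.
instances = [("c1", "Grant"), ("c2", "Reject")]
always_both = lambda case: {"Grant", "Reject"}
print(fitness(instances, always_both))    # 1.0
print(precision(instances, always_both))  # 0.5
```

Tightening the rules raises precision but risks forbidding observed behavior, which is exactly the tension the evaluation quantifies.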
  18. Evaluation – Setup. Compared methods:

         Method  Description                     Expected Precision  Expected Fitness
         WO      Without rules                   Poor                Good
         DTF     Mutually-exclusive approach     Good                Poor
         DTT     Naïve overlapping approach      Poor                Good
         DTO     Presented overlapping approach  Balanced            Balanced

      Datasets:

         Dataset     # Traces  # Events  # Attributes  # Decisions
         Road Fines  150,000   500,000   9             5
         Hospital    1,000     15,000    39            11

  19. Evaluation – Example rules in the hospital data:

         Method  Intensive Care    Normal Care  Skip
         DTO     L > 0 ∧ H = true  L > 0        L ≤ 0 ∨ (L > 0 ∧ H = false)   (good trade-off)
         DTT     true              L > 0        L ≤ 0                         (imprecise)
         DTF     false             L > 0        L ≤ 0                         (unfitting)

  20. Evaluation – Precision & Fitness.
      • Fitness measures how often the rules are violated; DTO improves fitness over DTF (mutually-exclusive).
      • Precision measures how strict the rules are; DTO improves precision over WO, but sacrifices some precision compared to DTF.
  21. Conclusion & Future Work.
      • Method: discovery of overlapping rules from event logs, based on decision tree induction; implemented in the ProM framework as the MultiPerspectiveExplorer, http://www.promtools.org
      • Results: a trade-off between fitness and precision; improves model fitness over standard decision trees and model precision over the naïve approach.
      • Future work: better experimental validation, managing the complexity of discovered rules, imbalanced distributions.
  22. Questions? @fmannhardt - f.mannhardt@tue.nl - http://promtools.org - Multi-Perspective Explorer
