Process Mining meets Causal Machine Learning:
Discovering Causal Rules from Event Logs
Zahra Dasht Bozorgi, Irene Teinemaa, Marlon Dumas,
Marcello La Rosa, and Artem Polyvyanyy
1
2nd International Conference on Process Mining (ICPM 2020)
Padua, Italy, 4-9 October 2020
Introduction
An approach that addresses the following:
• Discover case-level treatment
recommendations to increase the positive
outcome rate of a process;
• Identify subsets of cases to which a
recommendation should be applied;
• Estimate the causal effect and the incremental
Return-on-Investment (ROI) of a treatment.
2
Preliminaries
What is Causal Inference?
Inferring the effects of any
treatment/policy/intervention/etc.
Examples:
• Effect of a treatment on a disease
• Effect of climate change policy on emissions
• Effect of an action on a business process outcome
3
Notation:
Y denotes outcome
T denotes treatment assignment
X denotes case attributes
Preliminaries
How do we measure causal effects?
Taking the difference between potential outcomes
Example: Loan application process
Treatment/intervention: calling the customer after offer
Outcome: whether customer selects a loan offer or not
4
[Figure: two hypothetical worlds for the same case — World 1, where the treatment is applied, with expected outcome E[Y1]; World 2, where it is not, with expected outcome E[Y0]]
Average Treatment Effect:
ATE = E[Y1 – Y0]
Conditional Average Treatment Effect:
CATE = E[Y1 – Y0 | X = x]
CATE is also known as Uplift
in marketing literature.
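To make these definitions concrete, here is a minimal sketch (not from the paper) that estimates ATE and CATE as simple differences in means on simulated data; these plug-in estimates are only unbiased under randomised treatment assignment, which is the point of the confounding discussion later in the deck.

```python
import numpy as np
import pandas as pd

# Toy data: T = treatment flag, Y = binary outcome, X = one case attribute.
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "T": rng.integers(0, 2, n),
    "X": rng.choice(["home_improvement", "car", "other"], n),
})
# Simulate an outcome whose treatment effect differs across X (purely synthetic).
lift = df["X"].map({"home_improvement": 0.20, "car": 0.05, "other": 0.0})
df["Y"] = (rng.random(n) < 0.30 + lift * df["T"]).astype(int)

# ATE estimate: difference in mean outcomes between treated and control cases.
ate = df.loc[df["T"] == 1, "Y"].mean() - df.loc[df["T"] == 0, "Y"].mean()

# CATE / uplift estimate: the same difference, computed within each subgroup X = x.
cate = df.groupby("X").apply(
    lambda g: g.loc[g["T"] == 1, "Y"].mean() - g.loc[g["T"] == 0, "Y"].mean()
)
print(f"ATE ≈ {ate:.3f}")
print(cate)
```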
Preliminaries
id   T   Y   Y1   Y0   Y1 – Y0
 1   0   0    ?    0      ?
 2   1   1    1    ?      ?
 3   1   0    0    ?      ?
 4   0   0    ?    0      ?
 5   0   1    ?    1      ?
 6   1   1    1    ?      ?
5
Fundamental Problem of Causal Inference:
Missing Data Problem
Ideal solution: Randomised Experiment, but it can be:
• Expensive
• Time-consuming
• Unethical
Next best solution: Observational Study
Aim
6
[Pipeline: Event Log → Treatment/Intervention → Causal Rules]
Proposed Framework
7
Candidate Treatment Identification
Aim: generating recommendations automatically
Method: Action rule mining
• Extension of classification rules
• Suggests actions to change class label
• Based on support
Action Rule: r = [(a: a1) ∧ (b: b1 → b2)] ⇒ [y: y1 → y2]
8
Input: case attributes, split into controllable attributes (e.g. b, which the suggested action changes) and uncontrollable attributes (e.g. a, which form the pre-condition)
Candidate Treatment Identification
9
Example: Loan application process
Attributes: Loan Goal (uncontrollable), Number of Terms (controllable)
Rule: r = [(LoanGoal: Home Improvement) ∧ (NumOfTerms: 48 → 84)] ⇒ [Selected: 0 → 1], with support = 0.05
(pre-condition ∧ action ⇒ outcome)
Interpretation: in loan applications where the loan goal is home improvement, changing the number of payback months (terms) from 48 to 84 increases the chance of the customer selecting the loan offer.
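For illustration only (this is not the ActionRules package API, and the notion of support below is simplified), the sketch checks a candidate action rule of the shape above against a table of case attributes; the column names mirror the running example.

```python
import pandas as pd

def action_rule_support(cases: pd.DataFrame) -> dict:
    """Simplified check of the rule
    [(LoanGoal: Home Improvement) AND (NumOfTerms: 48 -> 84)] => [Selected: 0 -> 1].

    Support is approximated as the share of cases matching the 'before' and
    'after' patterns of the rule, respectively (a simplification of the
    definitions used in action rule mining).
    """
    pre = cases["LoanGoal"] == "Home Improvement"                      # stable pre-condition
    before = pre & (cases["NumOfTerms"] == 48) & (cases["Selected"] == 0)
    after = pre & (cases["NumOfTerms"] == 84) & (cases["Selected"] == 1)
    n = len(cases)
    return {
        "support_before": before.sum() / n,
        "support_after": after.sum() / n,
        "support": min(before.sum(), after.sum()) / n,  # conservative overall support
    }

cases = pd.DataFrame({
    "LoanGoal": ["Home Improvement", "Home Improvement", "Car", "Home Improvement"],
    "NumOfTerms": [48, 84, 48, 84],
    "Selected": [0, 1, 0, 1],
})
print(action_rule_support(cases))
```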
Causal Rules Discovery
Aim: Discovering subgroups where an action has a
high causal effect on the outcome
Method: Uplift Trees [1]
Uplift tree: predicts which cases will have a
positive outcome because a treatment is applied.
10
[1] P. Rzepakowski and S. Jaroszewicz, "Decision trees for uplift modelling with single and multiple treatments," Knowl. Inf. Syst., vol. 32, no. 2, 2012.
Uplift Tree
11
Splitting criterion:
D(.) a divergence measure
PT probability distribution of the outcome in the treatment group
PC probability distribution of the outcome in the control group
Divergence Metrics:
Kullback-Leibler divergence
Squared Euclidean distance
Chi-squared divergence
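The formula for the splitting criterion did not survive extraction; reconstructed from [1] (up to notation), a test A is chosen to maximise the gain in divergence between the treatment and control outcome distributions:

```latex
% Split gain for a candidate test A (reconstruction based on [1]):
D_{\mathrm{gain}}(A) = D\big(P^{T}(Y) : P^{C}(Y) \mid A\big) - D\big(P^{T}(Y) : P^{C}(Y)\big),
\qquad
D\big(P^{T}(Y) : P^{C}(Y) \mid A\big) = \sum_{a} \frac{N(a)}{N}\, D\big(P^{T}(Y \mid a) : P^{C}(Y \mid a)\big)

% Divergence measures used for D(.):
\mathrm{KL}(P : Q) = \sum_{i} p_i \log \frac{p_i}{q_i}, \qquad
E(P : Q) = \sum_{i} (p_i - q_i)^2, \qquad
\chi^{2}(P : Q) = \sum_{i} \frac{(p_i - q_i)^2}{q_i}
```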
Uplift Tree
Recall:
Uplift = E[Y1 | X = x] – E[Y0 | X = x]
but, in observational data,
E[Y | T = 1, X = x] – E[Y | T = 0, X = x]  ≠  E[Y1 | X = x] – E[Y0 | X = x]
The reason is confounding.
12
Confounding
Confounding is the presence of a variable that affects both treatment assignment and the outcome.
How do we address confounding in uplift trees?
Normalisation
13
NT and NC are the numbers of cases in the treatment and control groups respectively, N is the total number of cases, A is a candidate test (split), and H(.) is entropy.
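The normalisation factor itself is missing from the extracted text; as a hedged reconstruction from [1] (KL-based variant, up to notation), the gain D_gain(A) is divided by a factor I(A) that penalises tests splitting treatment and control cases very unevenly:

```latex
% Normalisation factor for a candidate test A (KL-based variant, reconstructed from [1]):
I(A) = H\!\Big(\tfrac{N_T}{N}, \tfrac{N_C}{N}\Big)\, \mathrm{KL}\big(P^{T}(A) : P^{C}(A)\big)
     + \tfrac{N_T}{N}\, H\big(P^{T}(A)\big)
     + \tfrac{N_C}{N}\, H\big(P^{C}(A)\big)
     + \tfrac{1}{2}
```

Here P^T(A) and P^C(A) denote the distributions of the test's outcomes within the treatment and control groups.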
Example Causal Rule
Example rule:
• Action: [NumOfTerms: 48 → 84]
• Objective: [Selected: 0 → 1]
• Sub-group:
a) Customers whose loan goal is not existing loan takeover AND
b) whose credit score is less than 920 AND
c) whose offer includes a monthly cost greater than 149 AND
d) whose first withdrawal amount is less than 8,304
14
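Purely for illustration (the column names below are placeholders, not the actual event-log attribute names), this sub-group can be written as a boolean mask over the case attributes:

```python
import pandas as pd

def in_subgroup(cases: pd.DataFrame) -> pd.Series:
    """Boolean mask for the sub-group of the example causal rule.
    Column names (LoanGoal, CreditScore, MonthlyCost, FirstWithdrawalAmount)
    are illustrative placeholders."""
    return (
        (cases["LoanGoal"] != "Existing loan takeover")
        & (cases["CreditScore"] < 920)
        & (cases["MonthlyCost"] > 149)
        & (cases["FirstWithdrawalAmount"] < 8304)
    )
```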
Ranking Rules using Cost-benefit Model
Parameters:
• v: value (benefit) of a positive outcome
• c: impression cost for a treatment
• u: uplift of applying a treatment
• n: size of the treated group
Net value of applying the recommendation:
Net = n × (u × v - c)
15
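A minimal sketch of this ranking step; the rules, uplift values, and parameters below are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class CausalRule:
    name: str
    uplift: float      # u: estimated CATE of applying the treatment in the sub-group
    group_size: int    # n: number of cases in the sub-group

def net_value(rule: CausalRule, value: float, cost: float) -> float:
    """Net = n * (u * v - c), as in the cost-benefit model above."""
    return rule.group_size * (rule.uplift * value - cost)

# Hypothetical rules and parameters, purely for illustration.
rules = [
    CausalRule("NumOfTerms 48 -> 84", uplift=0.12, group_size=800),
    CausalRule("MonthlyCost cap",     uplift=0.05, group_size=2500),
]
v, c = 100.0, 3.0  # benefit of a positive outcome, cost per treated case
for r in sorted(rules, key=lambda r: net_value(r, v, c), reverse=True):
    print(f"{r.name}: net = {net_value(r, v, c):.0f}")
```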
Dataset and Experimental setup
BPI Challenge 2017:
• 31,509 applications (cases)
• 42,995 offers
• Outcome: whether the customer selects the offer
Uncontrollable attributes:
• Application Type
• Loan Goal
• Credit Score
• Requested Amount
Controllable attributes:
• Number of Offers
• Number of Payback Terms
• Monthly Cost
• Initial Withdrawal Amount

• The approach was implemented in Python 3.7, using the ActionRules package for generating candidate treatments and the CausalML package for constructing uplift trees.
• We ran the action rule discovery algorithm on this dataset and discovered 24 rules containing 17 distinct recommendations.
• An uplift tree was constructed for each of the 17 recommendations, resulting in 8 causal rules.
• Details of the recommendations can be found in the paper.
16
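A hedged sketch of the uplift-tree step using CausalML's UpliftTreeClassifier; the case attributes, treatment labels, and hyper-parameters below are placeholders, and the exact constructor arguments should be checked against the installed CausalML version.

```python
import numpy as np
import pandas as pd
# Assumed API: causalml's UpliftTreeClassifier (pip install causalml);
# parameter names reflect its documentation but may differ across versions.
from causalml.inference.tree import UpliftTreeClassifier

# Placeholder case attributes; in the paper these come from BPI Challenge 2017
# (credit score, loan goal, monthly cost, first withdrawal amount, ...).
rng = np.random.default_rng(42)
n = 5000
X = pd.DataFrame({
    "CreditScore": rng.integers(600, 1000, n),
    "MonthlyCost": rng.uniform(50, 400, n),
    "FirstWithdrawalAmount": rng.uniform(0, 20000, n),
})
# e.g. whether the candidate treatment (NumOfTerms 48 -> 84) was applied.
treatment = np.where(rng.random(n) < 0.5, "treatment", "control")
y = rng.integers(0, 2, n)  # offer selected (1) or not (0); random here, real labels in practice

tree = UpliftTreeClassifier(
    control_name="control",
    max_depth=4,
    min_samples_leaf=200,
    evaluationFunction="KL",  # KL-divergence splitting criterion (with normalisation)
)
tree.fit(X.values, treatment=treatment, y=y)

# The fitted tree's leaves define candidate sub-groups (causal rules);
# predict() returns estimated uplift for each case.
uplift = tree.predict(X.values)
print(uplift[:5])
```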
Comparison of Results
• Increase the number of terms from 48 months to more than 120 months for customers whose credit score is between 899 and 943 and whose first withdrawal amount is less than 8,304.
• Decrease the duration of the process; delays are mostly due to unresponsive customers.
17
Future Work
• Incorporating other types of recommendation
• Addressing other goals, such as reduction of cycle time and waste
• Adjusting for unobserved confounding effects
• Conducting complementary evaluations such as randomised experiments
18
Thank you
Any Questions?
Zahra Dasht Bozorgi
zdashtbozorg@student.unimelb.edu.au
School of Computing and Information Systems
University of Melbourne
