Paper presentation at the 35th International Conference on Advanced Information Systems Engineering (CAiSE'2023).
Abstract.
Increasing the success rate of a process, i.e. the percentage of cases that end in a positive outcome, is a recurrent process improvement goal. At runtime, there are often certain actions (a.k.a. treatments) that workers may execute to lift the probability that a case ends in a positive outcome. For example, in a loan origination process, a possible treatment is to issue multiple loan offers to increase the probability that the customer takes a loan. Each treatment has a cost. Thus, when defining policies for prescribing treatments to cases, managers need to consider the net gain of the treatments. Also, the effect of a treatment varies over time: treating a case earlier may be more effective than later in a case. This paper presents a prescriptive monitoring method that automates this decision-making task. The method combines causal inference and reinforcement learning to learn treatment policies that maximize the net gain. The method leverages a conformal prediction technique to speed up the convergence of the reinforcement learning mechanism by separating cases that are likely to end up in a positive or negative outcome, from uncertain cases. An evaluation on two real-life datasets shows that the proposed method outperforms a state-of-the-art baseline.
Learning When to Treat Business Processes: Prescriptive Process Monitoring with Causal Inference and Reinforcement Learning
1. Learning When to Treat Business Processes: Prescriptive Process Monitoring with Causal Inference and Reinforcement Learning
Zahra Dasht Bozorgi, Marlon Dumas, Marcello La Rosa, Artem Polyvyanyy, Mahmoud Shoush, Irene Teinemaa
35th International Conference on Advanced Information Systems Engineering (CAiSE 2023), Zaragoza, Spain, 12-16 June 2023
2. Motivation
Some process cases end with a positive outcome while others end with a negative outcome.
[Figure: two example traces (Bob and Alice) over activities such as Search, View, Book, Pay, Call — one case ends with Check in (positive outcome), the other with Cancel (negative outcome); an intervention during a case can influence its outcome.]
3. Example Problem
• Mary is the operator of an accommodation booking process.
• Sometimes, customers cancel their booking. Mary would like to minimize
the number of such customers.
• There is a treatment (intervention) that Mary and colleagues can trigger to
prevent cancellation: Offer a discount to the customer
• The company cannot offer discounts to everybody because of the
associated costs.
• For which cases should the treatment be triggered and when?
4. Baseline Solution: Predictive Monitoring
1. Train a predictive model from an event log.
2. Find cases that are predicted to end in an undesired outcome.
3. Apply the treatment to the cases with the highest probability of an undesired outcome.
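A minimal sketch of this baseline (the classifier choice, synthetic data, and the treatment budget below are illustrative assumptions, not the paper's exact setup):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 8))        # encoded case prefixes from the event log
y_train = rng.integers(0, 2, size=500)     # 1 = undesired outcome (e.g. cancellation)

# 1. Train a predictive model on historical prefixes.
model = GradientBoostingClassifier().fit(X_train, y_train)

# 2. Score the prefixes of currently running cases.
X_running = rng.normal(size=(20, 8))
p_undesired = model.predict_proba(X_running)[:, 1]

# 3. Treat the cases with the highest predicted probability of an
#    undesired outcome, subject to a treatment budget.
budget = 5
to_treat = np.argsort(p_undesired)[::-1][:budget]
print(to_treat, p_undesired[to_treat])
```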
5. Example: Accommodation Booking Process
[Figure: timeline of one case — Case start, Account Created, Further Info Requested, Property Viewed, Property Booked, Call Received, Discount Granted (treatment point), Booking Cancelled, Case end; waiting times separate the events along the time axis.]
If we treat cases based only on reliable predictions, we might miss
opportunities to apply effective treatments!
6. Empirical Thresholding
• Raise an alarm [1] if P(undesired outcome) > τ
• The optimal τ is found via empirical thresholding (a sketch of this threshold search follows the reference below)
[Example with τ = 0.65: as a case unfolds (Search, View, View), P(cancel) grows from 0.2 to 0.6 to 0.8; an alarm is raised as soon as P(cancel) exceeds τ.]
1. Teinemaa, Irene, et al. "Alarm-based prescriptive process monitoring." Business Process
Management Forum: BPM Forum, Sydney, NSW, Australia, September 9-14, 2018
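A minimal sketch of empirical thresholding, assuming a simple cost model (a treatment cost, a cost of an undesired outcome, and an assumed treatment effectiveness); the alarm-based approach in [1] uses a richer cost model, so the numbers here are purely illustrative:

```python
import numpy as np

def total_cost(p_undesired, y_true, tau, c_treat=10.0, c_undesired=50.0, eff=0.6):
    """Total cost on a validation set if we raise an alarm whenever P(undesired) > tau.
    Illustrative assumption: treating prevents the undesired outcome with probability eff."""
    alarm = p_undesired > tau
    cost_if_treated = c_treat + y_true * (1.0 - eff) * c_undesired
    cost_if_untreated = y_true * c_undesired
    return np.where(alarm, cost_if_treated, cost_if_untreated).sum()

# Empirical thresholding: evaluate a grid of thresholds on validation data
# and keep the tau with the lowest total cost.
rng = np.random.default_rng(1)
p_val = rng.uniform(size=200)                        # predicted P(undesired) per prefix
y_val = (rng.uniform(size=200) < p_val).astype(int)  # 1 = case actually ended undesirably
taus = np.linspace(0.05, 0.95, 19)
best_tau = min(taus, key=lambda t: total_cost(p_val, y_val, t))
print(f"best tau = {best_tau:.2f}")
```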
7. Online Reinforcement Learning
• Predictions, their reliability, and the prefix length are given to a reinforcement learning agent [2].
• The agent decides when treatment is needed through trial and error.
• This is shown to outperform empirical thresholding.
• But the agent’s decision is based on predictions, not treatment effectiveness.
2. Metzger, Andreas, Tristan Kley, and Alexander Palm. "Triggering proactive business process adaptations via online reinforcement
learning." Business Process Management: 18th International Conference, BPM 2020, Seville, Spain, September 13–18, 2020
9. Model Training
1. Causal Estimation Component [3]
• Any causal estimator that can produce confidence intervals
• We chose causal forest
2. Conformal Prediction Component [4]
• Any probabilistic predictive method
• We used CatBoost (both components are sketched after the references below)
3. Wager, Stefan, and Susan Athey. "Estimation and inference of heterogeneous treatment
effects using random forests." Journal of the American Statistical Association 113.523 (2018)
4. Shafer, Glenn, and Vladimir Vovk. "A Tutorial on Conformal Prediction." Journal of Machine
Learning Research 9.3 (2008).
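A sketch of how the two components could be trained, assuming encoded prefixes X, a binary treatment indicator T, and a binary outcome y from the first half of the log; the concrete libraries (econml's CausalForestDML, catboost) and hyperparameters are my assumptions — the slides only fix the methods (causal forest, CatBoost):

```python
import numpy as np
from econml.dml import CausalForestDML      # causal forest with confidence intervals
from catboost import CatBoostClassifier     # probabilistic outcome predictor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))             # encoded prefixes
T = rng.integers(0, 2, size=1000)           # 1 = treatment (discount) was applied
y = rng.integers(0, 2, size=1000)           # 1 = positive outcome (no cancellation)

# 1. Causal estimation component: CATE estimates with lower/upper confidence bounds.
cf = CausalForestDML(discrete_treatment=True, random_state=0)
cf.fit(y, T, X=X)
cate = cf.effect(X[:5])                               # point estimates of the treatment effect
lower, upper = cf.effect_interval(X[:5], alpha=0.05)  # confidence bounds used in the RL state

# 2. Conformal prediction component: any probabilistic classifier can back it.
clf = CatBoostClassifier(iterations=200, verbose=0)
clf.fit(X, y)
p_pos = clf.predict_proba(X[:5])[:, 1]                # fed into the conformal wrapper later
```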
13. Why Conformal Prediction?
Conformal prediction:
• For an unseen sample, instead of producing a single prediction L, a conformal predictor produces a prediction set {L1, L2, …, Lk} for a user-specified error tolerance level α.
• The authors prove that P(Ltrue ∈ {L1, L2, …, Lk}) ≥ 1 − α.
• For binary outcomes, the possible sets are {}, {0}, {1}, {0, 1}.
• If the conformal set is {0} or {1}, we can be highly confident about the outcome of the case.
• Providing the conformal prediction set to the RL agent should speed up convergence.
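A minimal sketch of a standard split-conformal construction for a binary outcome — one common way to obtain the prediction sets described above; the score definition and the synthetic data are illustrative, not the paper's exact procedure:

```python
import numpy as np

def conformal_sets(p_cal, y_cal, p_test, alpha=0.1):
    """Split-conformal prediction sets for a binary classifier.
    p_cal / p_test: predicted P(outcome = 1) on calibration / test prefixes.
    Returns one set per test case: {}, {0}, {1} or {0, 1}."""
    # Nonconformity score: 1 - predicted probability of the true class.
    scores = np.where(y_cal == 1, 1.0 - p_cal, p_cal)
    n = len(scores)
    # Conformal quantile with the usual finite-sample correction.
    q = np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))
    sets = []
    for p in p_test:
        s = set()
        if 1.0 - p <= q:   # label 1 is conformal enough to be included
            s.add(1)
        if p <= q:         # label 0 is conformal enough to be included
            s.add(0)
        sets.append(s)
    return sets

rng = np.random.default_rng(0)
p_cal = rng.uniform(size=300)
y_cal = (rng.uniform(size=300) < p_cal).astype(int)
print(conformal_sets(p_cal, y_cal, np.array([0.05, 0.5, 0.97])))  # e.g. [{0}, {0, 1}, {1}]
```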
14. Data Enhancement
• Generate potential outcomes for every prefix [5].
• The potential outcomes will be used in policy selection.
5. Neal, Brady, Chin-Wei Huang, and Sunand Raghupathi. "Realcause: Realistic
causal inference benchmarking." arXiv preprint arXiv:2011.15007 (2020).
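The paper uses RealCause [5] for this step; the sketch below is only a simplified stand-in (a plain T-learner with two outcome models) to show what "potential outcomes for every prefix" means:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))    # encoded prefixes
T = rng.integers(0, 2, size=1000)  # observed treatment
y = rng.integers(0, 2, size=1000)  # observed outcome

# One outcome model per treatment group (T-learner), instead of RealCause.
m1 = GradientBoostingClassifier().fit(X[T == 1], y[T == 1])  # treated cases
m0 = GradientBoostingClassifier().fit(X[T == 0], y[T == 0])  # untreated cases

# Estimated potential outcomes for every prefix: probability of a positive
# outcome with and without treatment. These feed the policy-selection rewards.
y1_hat = m1.predict_proba(X)[:, 1]
y0_hat = m0.predict_proba(X)[:, 1]
```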
15. Dynamic Treatment Policy Selection
• State description:
1. Upper and lower bound of causal effect
2. Conformal prediction set (converted into a score)
3. Prefix length
• The treatment can be applied at most once per case.
• Suppose Gain is the monetary benefit of a case ending in a positive outcome.
• Cost is the expense associated with applying the treatment.
• The table below then defines the reward function:
Agent’s treatment \ Observed outcome | Good        | Bad
Yes                                  | Gain − Cost | −Cost
No                                   | Gain        | 0
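A small sketch of this reward table and of the state the agent observes (the concrete numbers are made up):

```python
def reward(treated: bool, outcome_good: bool, gain: float, cost: float) -> float:
    """Reward table above:
    treated & good -> gain - cost, treated & bad -> -cost,
    untreated & good -> gain, untreated & bad -> 0."""
    return (gain if outcome_good else 0.0) - (cost if treated else 0.0)

# State given to the RL agent at each prefix:
# (CATE lower bound, CATE upper bound, conformal score, prefix length)
state = (0.02, 0.31, 1.0, 7)

print(reward(treated=True, outcome_good=True, gain=50.0, cost=10.0))    # 40.0
print(reward(treated=False, outcome_good=False, gain=50.0, cost=10.0))  # 0.0
```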
16. Experimental Setup
Temporal split of traces in the event log:
• 50% for training and validating the causal estimator and the conformal predictor
• 50% for policy selection using RL
• A further 50-50 split for training the PO generator and as input to the RL component.
Feature encoding:
• Aggregation encoding for event attributes
• Last state encoding for temporal features.
• One-hot for categorical attributes.
Feature generation:
• Temporal features:
• Time since case start
• Time since last event
• Time since first case
• Inter-case feature:
• Number of active cases
• Distance to the start of the log
Gain function:
NetGain = y(T) × Gain − T × Cost, where T ∈ {0, 1} indicates whether the case was treated and y(T) ∈ {0, 1} is the case outcome (1 = positive). Both the encoding above and this gain function are illustrated in the sketch below.
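A small sketch of the encoding and the gain function above, with made-up event data (activity names and values are illustrative):

```python
import pandas as pd

# One prefix of a running case: activities plus two temporal features (hours).
prefix = pd.DataFrame({
    "activity": ["Account Created", "Property Viewed", "Property Viewed", "Property Booked"],
    "time_since_case_start": [0.0, 2.5, 6.0, 9.5],
    "time_since_last_event": [0.0, 2.5, 3.5, 3.5],
})

# Aggregation encoding of the event attribute: count of each activity so far.
agg = {f"count[{a}]": int(c) for a, c in prefix["activity"].value_counts().items()}

# Last-state encoding of the temporal features: values at the most recent event.
last_state = prefix.iloc[-1][["time_since_case_start", "time_since_last_event"]].to_dict()

features = {**agg, **last_state}
print(features)

# Net gain of a completed case: NetGain = y(T) * Gain - T * Cost.
def net_gain(y: int, treated: int, gain: float = 50.0, cost: float = 10.0) -> float:
    return y * gain - treated * cost

print(net_gain(y=1, treated=1))   # 40.0
```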
17. Hypothesis: Using causal effect estimates leads to better policies than using outcome prediction estimates.
Hypothesis: Using conformal prediction speeds up convergence.
Experiments:
• Using both causal effect confidence bounds and conformal score
• Using only causal effect confidence bounds
• Baseline using predicted outcome and a reliability score
• Same baseline using our proposed reward function
Results
19. Thank you
Any Questions?
Zahra Dasht Bozorgi
zdashtbozorg@student.unimelb.edu.au
School of Computing and Information Systems
University of Melbourne
Editor's Notes
Suppose Mary is the operator of a loan origination process. She likes to handle cases as quickly as possible to increase customer satisfaction. There are a couple of actions, which from now on we call treatments, that Mary can perform to reduce the duration of a case. For example, if there are missing documents in a case, she can decide to call the customer on the phone instead of sending an automated email, because customers appear to respond better to phone calls. Another possible treatment is assigning additional staff to work on that case. But these treatments are either time-consuming or they have a cost.
So, in practice, Mary cannot do these extra actions for every customer. The question now is: for which cases should we apply these treatments, and when?
We can answer this question using predictions: we train a predictive model, find cases that are predicted to take long, and apply the treatment if the predicted duration is above a threshold. There are various prescriptive techniques that propose the best way to select this threshold.
To see why the predictive solution is not enough, let's have a closer look at the loan application process that Mary is working on. Say that at the beginning of a case, while reviewing the application documents, our predictive system tells us that this case is going to take long. So we bring in more people to finish the review more quickly. But what if the reason this case takes a long time is the activity "check fraud"? Then applying the treatment at the beginning is useless: we should have applied it later, at the check-fraud activity, and we just wasted the time of the additional employee. So we cannot make decisions based only on predictions; we have to be confident that the treatment we apply is effective.