Paper presentation at the 35th International Conference on Advanced Information Systems Engineering (CAiSE'2023).
Abstract.
Increasing the success rate of a process, i.e. the percentage of cases that end in a positive outcome, is a recurrent process improvement goal. At runtime, there are often certain actions (a.k.a. treatments) that workers may execute to lift the probability that a case ends in a positive outcome. For example, in a loan origination process, a possible treatment is to issue multiple loan offers to increase the probability that the customer takes a loan. Each treatment has a cost. Thus, when defining policies for prescribing treatments to cases, managers need to consider the net gain of the treatments. Also, the effect of a treatment varies over time: treating a case earlier may be more effective than later in a case. This paper presents a prescriptive monitoring method that automates this decision-making task. The method combines causal inference and reinforcement learning to learn treatment policies that maximize the net gain. The method leverages a conformal prediction technique to speed up the convergence of the reinforcement learning mechanism by separating cases that are likely to end up in a positive or negative outcome, from uncertain cases. An evaluation on two real-life datasets shows that the proposed method outperforms a state-of-the-art baseline.
Learning When to Treat Business Processes: Prescriptive Process Monitoring with Causal Inference and Reinforcement Learning
1. Learning When to Treat Business Processes: Prescriptive Process Monitoring with Causal Inference and Reinforcement Learning
Zahra Dasht Bozorgi, Marlon Dumas, Marcello La Rosa, Artem Polyvyanyy, Mahmoud Shoush, Irene Teinemaa
35th International Conference on Advanced Information Systems Engineering (CAiSE 2023), Zaragoza, Spain, 12-16 June 2023
2. Motivation
Some process cases end with a positive outcome while others end with a negative outcome.
[Figure: two example traces (Bob and Alice) over activities such as Search, View, Book, Pay, Call — one case ends with Check in (positive outcome), the other with Cancel (negative outcome); an intervention during a case can influence its outcome.]
3. Example Problem
• Mary is the operator of an accommodation booking process.
• Sometimes, customers cancel their booking. Mary would like to minimize
the number of such customers.
• There is a treatment (intervention) that Mary and colleagues can trigger to
prevent cancellation: Offer a discount to the customer
• The company cannot offer discounts to everybody because of the
associated costs.
• For which cases should the treatment be triggered and when?
4. Baseline Solution: Predictive Monitoring
1. Train a predictive model from an event log.
2. Find cases that are predicted to end in an undesired outcome.
3. Apply the treatment to the cases with the highest probability of an undesired outcome.
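A minimal sketch of this baseline (the classifier choice, synthetic data, and the treatment budget below are illustrative assumptions, not the paper's exact setup):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 8))        # encoded case prefixes from the event log
y_train = rng.integers(0, 2, size=500)     # 1 = undesired outcome (e.g. cancellation)

# 1. Train a predictive model on historical prefixes.
model = GradientBoostingClassifier().fit(X_train, y_train)

# 2. Score the prefixes of currently running cases.
X_running = rng.normal(size=(20, 8))
p_undesired = model.predict_proba(X_running)[:, 1]

# 3. Treat the cases with the highest predicted probability of an
#    undesired outcome, subject to a treatment budget.
budget = 5
to_treat = np.argsort(p_undesired)[::-1][:budget]
print(to_treat, p_undesired[to_treat])
```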
5. Example: Accommodation Booking Process
[Figure: timeline of one case — Case start, Account Created, Further Info Requested, Property Viewed, Property Booked, Call Received, Discount Granted (treatment point), Booking Cancelled, Case end; waiting times separate the events along the time axis.]
If we treat cases based only on reliable predictions, we might miss
opportunities to apply effective treatments!
6. Empirical Thresholding
• Raise an alarm [1] if P(undesired outcome) > τ
• The optimal τ is found via empirical thresholding (a sketch of this threshold search follows the reference below)
[Example with τ = 0.65: as a case unfolds (Search, View, View), P(cancel) grows from 0.2 to 0.6 to 0.8; an alarm is raised as soon as P(cancel) exceeds τ.]
1. Teinemaa, Irene, et al. "Alarm-based prescriptive process monitoring." Business Process
Management Forum: BPM Forum, Sydney, NSW, Australia, September 9-14, 2018
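A minimal sketch of empirical thresholding, assuming a simple cost model (a treatment cost, a cost of an undesired outcome, and an assumed treatment effectiveness); the alarm-based approach in [1] uses a richer cost model, so the numbers here are purely illustrative:

```python
import numpy as np

def total_cost(p_undesired, y_true, tau, c_treat=10.0, c_undesired=50.0, eff=0.6):
    """Total cost on a validation set if we raise an alarm whenever P(undesired) > tau.
    Illustrative assumption: treating prevents the undesired outcome with probability eff."""
    alarm = p_undesired > tau
    cost_if_treated = c_treat + y_true * (1.0 - eff) * c_undesired
    cost_if_untreated = y_true * c_undesired
    return np.where(alarm, cost_if_treated, cost_if_untreated).sum()

# Empirical thresholding: evaluate a grid of thresholds on validation data
# and keep the tau with the lowest total cost.
rng = np.random.default_rng(1)
p_val = rng.uniform(size=200)                        # predicted P(undesired) per prefix
y_val = (rng.uniform(size=200) < p_val).astype(int)  # 1 = case actually ended undesirably
taus = np.linspace(0.05, 0.95, 19)
best_tau = min(taus, key=lambda t: total_cost(p_val, y_val, t))
print(f"best tau = {best_tau:.2f}")
```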
7. Online Reinforcement Learning
• Predictions, their reliability, and the prefix length are given to a reinforcement learning agent [2].
• The agent decides when treatment is needed through trial and error.
• This is shown to outperform empirical thresholding.
• But the agent’s decision is based on predictions, not treatment effectiveness.
2. Metzger, Andreas, Tristan Kley, and Alexander Palm. "Triggering proactive business process adaptations via online reinforcement
learning." Business Process Management: 18th International Conference, BPM 2020, Seville, Spain, September 13–18, 2020
9. Model Training
1. Causal Estimation Component [3]
• Any causal estimator that can produce confidence intervals
• We chose causal forest
2. Conformal Prediction Component [4]
• Any probabilistic predictive method
• We used CatBoost (both components are sketched after the references below)
3. Wager, Stefan, and Susan Athey. "Estimation and inference of heterogeneous treatment
effects using random forests." Journal of the American Statistical Association 113.523 (2018)
4. Shafer, Glenn, and Vladimir Vovk. "A Tutorial on Conformal Prediction." Journal of Machine
Learning Research 9.3 (2008).
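A sketch of how the two components could be trained, assuming encoded prefixes X, a binary treatment indicator T, and a binary outcome y from the first half of the log; the concrete libraries (econml's CausalForestDML, catboost) and hyperparameters are my assumptions — the slides only fix the methods (causal forest, CatBoost):

```python
import numpy as np
from econml.dml import CausalForestDML      # causal forest with confidence intervals
from catboost import CatBoostClassifier     # probabilistic outcome predictor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))             # encoded prefixes
T = rng.integers(0, 2, size=1000)           # 1 = treatment (discount) was applied
y = rng.integers(0, 2, size=1000)           # 1 = positive outcome (no cancellation)

# 1. Causal estimation component: CATE estimates with lower/upper confidence bounds.
cf = CausalForestDML(discrete_treatment=True, random_state=0)
cf.fit(y, T, X=X)
cate = cf.effect(X[:5])                               # point estimates of the treatment effect
lower, upper = cf.effect_interval(X[:5], alpha=0.05)  # confidence bounds used in the RL state

# 2. Conformal prediction component: any probabilistic classifier can back it.
clf = CatBoostClassifier(iterations=200, verbose=0)
clf.fit(X, y)
p_pos = clf.predict_proba(X[:5])[:, 1]                # fed into the conformal wrapper later
```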
13. Why Conformal Prediction?
Conformal prediction:
• For an unseen sample, instead of producing a single prediction L, a conformal predictor produces a prediction set {L1, L2, …, Lk} for a user-specified error tolerance level α.
• The authors prove that P(Ltrue ∈ {L1, L2, …, Lk}) ≥ 1 − α.
• For binary outcomes, the possible sets are {}, {0}, {1}, {0, 1}.
• If the conformal set is {0} or {1}, we can be highly confident about the outcome of the case.
• Providing the conformal prediction set to the RL agent should speed up convergence.
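A minimal sketch of a standard split-conformal construction for a binary outcome — one common way to obtain the prediction sets described above; the score definition and the synthetic data are illustrative, not the paper's exact procedure:

```python
import numpy as np

def conformal_sets(p_cal, y_cal, p_test, alpha=0.1):
    """Split-conformal prediction sets for a binary classifier.
    p_cal / p_test: predicted P(outcome = 1) on calibration / test prefixes.
    Returns one set per test case: {}, {0}, {1} or {0, 1}."""
    # Nonconformity score: 1 - predicted probability of the true class.
    scores = np.where(y_cal == 1, 1.0 - p_cal, p_cal)
    n = len(scores)
    # Conformal quantile with the usual finite-sample correction.
    q = np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))
    sets = []
    for p in p_test:
        s = set()
        if 1.0 - p <= q:   # label 1 is conformal enough to be included
            s.add(1)
        if p <= q:         # label 0 is conformal enough to be included
            s.add(0)
        sets.append(s)
    return sets

rng = np.random.default_rng(0)
p_cal = rng.uniform(size=300)
y_cal = (rng.uniform(size=300) < p_cal).astype(int)
print(conformal_sets(p_cal, y_cal, np.array([0.05, 0.5, 0.97])))  # e.g. [{0}, {0, 1}, {1}]
```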
14. Data Enhancement
• Generate potential outcomes for every prefix [5].
• The potential outcomes will be used in policy selection.
5. Neal, Brady, Chin-Wei Huang, and Sunand Raghupathi. "Realcause: Realistic
causal inference benchmarking." arXiv preprint arXiv:2011.15007 (2020).
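The paper uses RealCause [5] for this step; the sketch below is only a simplified stand-in (a plain T-learner with two outcome models) to show what "potential outcomes for every prefix" means:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))    # encoded prefixes
T = rng.integers(0, 2, size=1000)  # observed treatment
y = rng.integers(0, 2, size=1000)  # observed outcome

# One outcome model per treatment group (T-learner), instead of RealCause.
m1 = GradientBoostingClassifier().fit(X[T == 1], y[T == 1])  # treated cases
m0 = GradientBoostingClassifier().fit(X[T == 0], y[T == 0])  # untreated cases

# Estimated potential outcomes for every prefix: probability of a positive
# outcome with and without treatment. These feed the policy-selection rewards.
y1_hat = m1.predict_proba(X)[:, 1]
y0_hat = m0.predict_proba(X)[:, 1]
```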
15. Dynamic Treatment Policy Selection
• State description:
1. Upper and lower bound of causal effect
2. Conformal prediction set (converted into a score)
3. Prefix length
• The treatment can be applied at most once per case.
• Suppose Gain is the monetary benefit of a case ending in a positive outcome.
• Cost is the expense associated with applying the treatment.
• The table below then defines the reward function:
Agent’s treatment \ Observed outcome | Good        | Bad
Yes                                  | Gain − Cost | −Cost
No                                   | Gain        | 0
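A small sketch of this reward table and of the state the agent observes (the concrete numbers are made up):

```python
def reward(treated: bool, outcome_good: bool, gain: float, cost: float) -> float:
    """Reward table above:
    treated & good -> gain - cost, treated & bad -> -cost,
    untreated & good -> gain, untreated & bad -> 0."""
    return (gain if outcome_good else 0.0) - (cost if treated else 0.0)

# State given to the RL agent at each prefix:
# (CATE lower bound, CATE upper bound, conformal score, prefix length)
state = (0.02, 0.31, 1.0, 7)

print(reward(treated=True, outcome_good=True, gain=50.0, cost=10.0))    # 40.0
print(reward(treated=False, outcome_good=False, gain=50.0, cost=10.0))  # 0.0
```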
16. Experimental Setup
Temporal split of traces in the event log:
• 50% for training and validating the causal estimator and the conformal predictor
• 50% for policy selection using RL
• A further 50-50 split for training the PO generator and as input to the RL component.
Feature encoding:
• Aggregation encoding for event attributes
• Last state encoding for temporal features.
• One-hot for categorical attributes.
Feature generation:
• Temporal features:
• Time since case start
• Time since last event
• Time since first case
• Inter-case feature:
• Number of active cases
• Distance to the start of the log
Gain function:
NetGain = y(T) × Gain − T × Cost, where T ∈ {0, 1} indicates whether the case was treated and y(T) ∈ {0, 1} is the case outcome (1 = positive). Both the encoding above and this gain function are illustrated in the sketch below.
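A small sketch of the encoding and the gain function above, with made-up event data (activity names and values are illustrative):

```python
import pandas as pd

# One prefix of a running case: activities plus two temporal features (hours).
prefix = pd.DataFrame({
    "activity": ["Account Created", "Property Viewed", "Property Viewed", "Property Booked"],
    "time_since_case_start": [0.0, 2.5, 6.0, 9.5],
    "time_since_last_event": [0.0, 2.5, 3.5, 3.5],
})

# Aggregation encoding of the event attribute: count of each activity so far.
agg = {f"count[{a}]": int(c) for a, c in prefix["activity"].value_counts().items()}

# Last-state encoding of the temporal features: values at the most recent event.
last_state = prefix.iloc[-1][["time_since_case_start", "time_since_last_event"]].to_dict()

features = {**agg, **last_state}
print(features)

# Net gain of a completed case: NetGain = y(T) * Gain - T * Cost.
def net_gain(y: int, treated: int, gain: float = 50.0, cost: float = 10.0) -> float:
    return y * gain - treated * cost

print(net_gain(y=1, treated=1))   # 40.0
```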
17. Hypothesis: Using causal effect estimates leads to better policies than using outcome prediction estimates.
Hypothesis: Using conformal prediction speeds up convergence.
Experiments:
• Using both causal effect confidence bounds and conformal score
• Using only causal effect confidence bounds
• Baseline using predicted outcome and a reliability score
• Same baseline using our proposed reward function
Results
19. Thank you
Any Questions?
Zahra Dasht Bozorgi
zdashtbozorg@student.unimelb.edu.au
School of Computing and Information Systems
University of Melbourne
Editor's Notes
Suppose Mary is the operator of a loan origination process. She likes to handle cases as quickly as possible to increase customer satisfaction. There are a couple of actions, which from now on we call treatments, that Mary can perform to reduce the duration of a case. For example, if there are missing documents in a case, she can decide to call the customer on the phone instead of sending an automated email, because customers appear to respond better to phone calls. Another possible treatment is assigning additional staff to work on that case. But these treatments are either time-consuming or they have a cost.
So, in practice, Mary cannot do these extra actions for every customer. The question now is: for which cases should we apply these treatments, and when?
We can answer this question using predictions: we train a predictive model, find cases that are predicted to take long, and apply the treatment if the predicted duration is above a threshold. There are various prescriptive techniques that propose the best way to select this threshold.
To see why the predictive solution is not enough, let's have a closer look at the loan application process that Mary is working on. Say that at the beginning of a case, while reviewing the application documents, our predictive system tells us that this case is going to take long. So we bring in more people to finish the review more quickly. But what if the reason this case takes a long time is the activity "check fraud"? Then applying the treatment at the beginning is useless: we should have applied it later, at the check-fraud activity, and we just wasted the time of the additional employee. So we cannot make decisions based only on predictions; we have to be confident that the treatment we apply is effective.