Simple explanations to summarise Subgroup Discovery outcomes: a case of study concerning patient phenotyping

Enrique Valero-Leal (1), M. Campos (2,3), J. M. Juarez (2)
(1) Technical University of Madrid
(2) AIKE research group (INTICO), University of Murcia
(3) IMIB-ARRIXACA Murcian Biomedical Research Institute
Sept 19, 2022, XKDD workshop, Grenoble
Funded by the Spanish Ministry of Science, Innovation and Universities
under the CONFAINCE project (Ref: PID2021-122194OB-I00), and by
the European Regional Development Fund (ERDF, FEDER).

These slides summarise the conference paper presented at the XKDD 2022 workshop @ECML-PKDD:
Enrique Valero-Leal, M. Campos, J. M. Juarez. Simple explanations to summarise Subgroup Discovery: patient phenotyping. Proceedings of the International Workshop on eXplainable Knowledge Discovery in Data Mining, XKDD 2022. Lecture Notes in Computer Science. Springer Series. 2022
FULL PAPER DOWNLOADABLE AT:
https://kdd.isti.cnr.it/xkdd2022/papers/XKDD_2022_paper_9989.pdf

• OUR RESEARCH GOAL
Generate trustworthy medical hypotheses for patient phenotyping.
– Approach: Subgroup Discovery algorithms
– Medical-friendly explanations
– SubgroupExplainer

• OUTLINE:
1. Clinical problem & research goal
2. Subgroup discovery
3. Contribution: SubgroupExplainer
4. Experiments
5. Conclusions

• SUBGROUP DISCOVERY
Clustering != Subgroup discovery
Picture from: S. Ventura and J. M. Luna (2018). Supervised Descriptive Pattern Mining. Springer books.

• SUBGROUP DISCOVERY: DEFINITIONS
Dataset: $D = (I, A)$, $A = \{a_1, a_2, \ldots, a_n\}$
Selector: $s_{sc}: I \to \mathit{Boolean}$, $s_{sc}(i) = T \iff sc$ is fulfilled by $i$
Pattern: $P = \{s_1, s_2, \ldots, s_k\}$, $P$ in conjunctive form
Subgroup: $SG = (P, s_{st})$, $SG^{\circ} = \{\, i \in I \mid s_{sc}(i) = T, \ \forall sc \in P \,\}$
Example: SG = IF (age>35, culture=Enteroc.Faecium) THEN suscept=Resistant
Quality function: $qf: P \times D \to \mathbb{R}$
SD algorithms:
– Frequent pattern mining: SD-MAP, DpSubgroup, BSD, etc.
– Beam search (heuristic): SD, CN2-SD, SD4TS
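
The definitions above can be made concrete with a small sketch. It is not taken from the paper: the toy records, the helper names (`covers`, `extension`, `wracc`) and the choice of WRAcc as the quality function are assumptions made for illustration (the deck names WRAcc only in the experiments). A selector is modelled as a Boolean function over an instance, a pattern as a conjunction of selectors, and the subgroup extension as the instances the pattern covers.

```python
from typing import Callable, Dict, List

Instance = Dict[str, object]
Selector = Callable[[Instance], bool]


def covers(pattern: List[Selector], instance: Instance) -> bool:
    """A conjunctive pattern covers an instance when every selector holds."""
    return all(selector(instance) for selector in pattern)


def extension(pattern: List[Selector], dataset: List[Instance]) -> List[Instance]:
    """Instances of the dataset covered by the pattern."""
    return [instance for instance in dataset if covers(pattern, instance)]


def wracc(pattern: List[Selector], target: Selector, dataset: List[Instance]) -> float:
    """Weighted relative accuracy: coverage * (p(target | pattern) - p(target))."""
    covered = extension(pattern, dataset)
    if not dataset or not covered:
        return 0.0
    coverage = len(covered) / len(dataset)
    p_target_in_subgroup = sum(target(i) for i in covered) / len(covered)
    p_target = sum(target(i) for i in dataset) / len(dataset)
    return coverage * (p_target_in_subgroup - p_target)


# Hypothetical toy records, invented for this example only.
toy_dataset = [
    {"age": 70, "culture": "Enteroc.Faecium", "suscept": "Resistant"},
    {"age": 52, "culture": "Enteroc.Faecium", "suscept": "Resistant"},
    {"age": 41, "culture": "E.Coli", "suscept": "Sensitive"},
    {"age": 29, "culture": "Enteroc.Faecium", "suscept": "Sensitive"},
]

# Pattern and target selector mirroring the example subgroup on the slide.
pattern = [
    lambda i: i["age"] > 35,
    lambda i: i["culture"] == "Enteroc.Faecium",
]
target = lambda i: i["suscept"] == "Resistant"

subgroup_size = len(extension(pattern, toy_dataset))
quality = wracc(pattern, target, toy_dataset)
```

On this toy data the pattern covers two of the four records, both resistant, so the subgroup scores well above the dataset-wide resistance rate.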

• CONTRIBUTION
– Overcome difficulties explaining SD to clinicians
– Simple explanations to increase trust in SD

• CONTRIBUTION
– SubgroupExplainer
• XAI characteristics
– SD model-agnostic
– Global explanations
– Surrogate model
– Tree-like explanations

• CONTRIBUTION
– SubgroupExplainer
• Tree-like explanations, why?

• CONTRIBUTION
– SubgroupExplainer: simple explanations
DATASET (#attributes: 15, #instances: 1049)
AT1   AT2   ...   AT15 (target attribute)
1     26    ...   A
...   ...   ...   B
64    12    ...   A

Black-box SD algorithm, applied to the DATASET, returns the SUBGROUPS (SG#1 ... SG#20):
SG#1: AT1>2, AT3=6 => AT15=A
SG#2: AT1=4, AT2=5, AT4=21 => AT15=B
SG#3: AT1<11, AT2=5 => AT15=A
SG#4: AT2>40, AT3=5, AT4=21 => AT15=C
...
SG#20: AT2=3, AT3<7, AT4<22 => AT15=A

Step 1: DB labelling (produces the Labelled Dataset)
A1    ...   A14   L (SUBGROUPS)
1     ...   ...   #17, #2, #5
...   ...   ...   #20
64    ...   ...   #11, #7, #2

Step 2: SURROGATE explainer building (produces the SD EXPLANATION)
[Figure: tree whose nodes carry subgroup IDs: #20; #5,2; #11; #19; #9,2; #7,14; #17; #7; #10; #3; #2,7]
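
Step 1 of the pipeline can be sketched as follows. This is an illustrative reading of the slide, not the authors' implementation: the function name `label_instances`, the toy rows and the two toy patterns (loosely based on SG#1 and SG#3) are assumptions. Each instance receives the IDs of the subgroups whose pattern covers it; a tree learner would then be fitted on the resulting labelled dataset in Step 2, which is not shown here.

```python
from typing import Callable, Dict, List, Tuple

Instance = Dict[str, float]
Pattern = List[Callable[[Instance], bool]]


def label_instances(
    dataset: List[Instance], subgroups: Dict[int, Pattern]
) -> List[Tuple[Instance, Tuple[int, ...]]]:
    """Attach to every instance the sorted IDs of the subgroups covering it."""
    labelled = []
    for instance in dataset:
        ids = tuple(
            sorted(
                sg_id
                for sg_id, pattern in subgroups.items()
                if all(selector(instance) for selector in pattern)
            )
        )
        labelled.append((instance, ids))
    return labelled


# Hypothetical subgroup patterns (antecedents only), keyed by subgroup ID.
toy_subgroups: Dict[int, Pattern] = {
    1: [lambda i: i["AT1"] > 2, lambda i: i["AT3"] == 6],
    3: [lambda i: i["AT1"] < 11, lambda i: i["AT2"] == 5],
}

# Hypothetical instances, invented for this example only.
toy_rows = [
    {"AT1": 4, "AT2": 5, "AT3": 6},   # satisfies both patterns
    {"AT1": 1, "AT2": 5, "AT3": 0},   # satisfies only subgroup 3
    {"AT1": 20, "AT2": 9, "AT3": 0},  # satisfies neither pattern
]

labelled_rows = label_instances(toy_rows, toy_subgroups)
subgroup_labels = [ids for _, ids in labelled_rows]
```

Instances covered by no subgroup get an empty label here; how such instances should be treated when building the surrogate is a design decision this sketch leaves open.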
PROPOSAL ADVANTAGES:

• EXPERIMENTS
1. Computational properties and scalability
2. Clinical reproducible use case
3. Human subjective study

• EXPERIMENTS: Computational properties and scalability
Ssg: set of subgroups
S: all selectors from Ssg
Su: unique selectors from S
card: mean cardinality, |S|/|Ssg|
T: number of vertices of the tree
purity: proportion of correctly classified instances
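
The selector statistics listed above can be computed directly from a list of subgroups. The sketch below is a minimal illustration under assumptions: selectors are represented as already-normalised strings, the function name `selector_stats` is hypothetical, and the three toy subgroups are invented. T and purity depend on the fitted tree and are left out.

```python
from typing import Dict, List


def selector_stats(subgroups: Dict[str, List[str]]) -> Dict[str, float]:
    """Compute |Ssg|, |S|, |Su| and card = |S| / |Ssg| for a set of subgroups."""
    all_selectors = [sel for pattern in subgroups.values() for sel in pattern]
    unique_selectors = set(all_selectors)
    n_subgroups = len(subgroups)
    return {
        "Ssg": n_subgroups,
        "S": len(all_selectors),
        "Su": len(unique_selectors),
        "card": len(all_selectors) / n_subgroups if n_subgroups else 0.0,
    }


# Hypothetical subgroups; each pattern is a list of selector strings.
toy_patterns = {
    "SG#1": ["AT1>2", "AT3=6"],
    "SG#2": ["AT1=4", "AT2=5", "AT4=21"],
    "SG#3": ["AT1<11", "AT2=5"],
}

stats = selector_stats(toy_patterns)
```

In this toy input the selector `AT2=5` appears in two subgroups, so the number of unique selectors is one lower than the total.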

• EXPERIMENTS
– Clinical reproducible use case
MIMIC-III dataset (60,000/1280 admissions)
[Figure panels: CN2-SD; CART+WRAcc]

• EXPERIMENTS
– Human subjective study
18 participants surveyed
Profiles: familiar with ML and unfamiliar with AI
Task-oriented questions: SD & trees
Subjective opinion

• CONCLUSIONS
– SubgroupExplainer:
A pioneering approach to explaining SD outcomes.
SD-agnostic, global, tree-like surrogate explanations.
Designed for phenotyping problems.
– Compactness: distils a myriad of subgroups
– Comparative method: compares multiple SD outcomes
– Secondary use: surrogate model

Contact:
Jose M. Juarez
jmjuarez@um.es
SubgroupExplainer:
Phenotyping method
Compact, comparative, secondary use
