Curating Naturally Adversarial Datasets for Learning-Enabled Medical Cyber-Physical Systems

Sydney Pugh (1), Ivan Ruchkin (2), James Weimer (3), and Insup Lee (1)
(1) University of Pennsylvania
(2) University of Florida
(3) Vanderbilt University
15th ACM/IEEE International Conference on Cyber-Physical Systems (ICCPS)
May 16, 2024

Outline
• Introduction
• Motivation
• Related Work
• Problem Statement
• Approach
• Results
• Conclusion
ICCPS -- 5/16/24 2

Reality of the LE-MCPS Domain
• Labeled data: primarily used for training learning-enabled medical cyber-physical systems (LE-MCPS)
• Unlabeled data: often unused! Can be used for training or analysis
• How can we make unlabeled data useful to LE-MCPS developers?
• In our paper, we investigate whether we can use unlabeled data to analyze the robustness of trained LE-MCPS

Evaluating Robustness using Unlabeled Data
Robustness is typically evaluated by observing an LE-MCPS’s performance against adversarial examples.
[Figure: Unlabeled Data → Clinician → Labeled Data → Adversarial Dataset Curation → Adversarial Dataset]
• Clinician labels are highly accurate but expensive!
• Adversarial dataset curation mostly focuses on synthetic adversarial examples!

Synthetic Adversarial Examples
• Apply adversarial perturbations to clean inputs to cause misclassification
• Given sample $x$ with true label $y$, add noise $\delta$ such that $f(x + \delta) \neq y$
• E.g., $\ell_p$ adversarial examples (Vorobeychik et al., 2018)
• Limitations: adding noise to medical data typically yields invalid/unrealistic examples
• Synthetic data generation techniques
• E.g., Patient simulators and Generative Adversarial Networks (GANs)
• Limitations: lack of realism; difficult to generate complex physiology; bias
[Figure: a clean ECG plus noise yields an ECG with noise]

Evaluating Robustness using Weakly-Labeled Data
We can avoid manual labeling with weakly-supervised data labeling!
[Figure: Unlabeled Data → Programmatic Weak Supervision (using clinician labeling functions) → Weakly-Labeled Data → Adversarial Dataset Curation → Adversarial Dataset]
• Weak labels are inexpensive but less accurate!
• Adversarial dataset curation mostly focuses on synthetic adversarial examples!
Clinician Labeling Functions
    import re

    def LF_1(x):
        # heuristic_1 stands for a clinician-provided rule (not defined on the slide)
        return heuristic_1(x)
    # ⋮
    def LF_2(x):
        # True when the text x mentions "abnormal"
        return re.search("abnormal", x) is not None
• Weak label confidences are typically overconfident
• However, we suspect the ordering of the confidences is reliable

Evaluating Robustness via Our Approach
Key Idea 1: Sample natural adversarial examples from real unlabeled medical data!
Key Idea 2: Uncertainty in weak labels is indicative of “adversarialness”
[Figure: Weakly-Labeled Data → Dataset Curation via Adversarial Ordering → Adversarially Ordered Natural Datasets, arranged from less to more adversarial]
• Low uncertainty: less adversarial; labels largely correct; predictions match labels
• High uncertainty: more adversarial; labels prone to inaccuracies; predictions disagree with labels
Analyze robustness by observing the trend in accuracy across the datasets

Natural Adversarial Examples
• Hendrycks et al. discovered that clean, realistic inputs can also degrade the performance of machine learning models
• They construct a naturally adversarial dataset from ImageNet via adversarial filtration
• Remove examples that are classified easily via very predictable classification boundaries
• Limitation: medical data often lacks spurious cues
• Possible feature-based approaches
• Density estimator with outlier detection (Aggarwal, 2013)
• Out-of-distribution (OOD) detectors (Ruff et al., 2021)
• Out of scope for this paper
Hendrycks et al., “Natural adversarial examples”, CVPR 2021.

Problem Statement
[Figure: Inputs → Dataset Curation via Adversarial Ordering → Outputs]
• Inputs: unlabeled data $X$ and labeling functions $\Lambda = \{\lambda : \mathcal{X} \to \mathcal{Y} \cup \{\text{“abstain”}\}\}$
• Outputs: natural datasets $D_1, \dots, D_N$, where $D_i = \{(x_j, \hat{y}_j)\}$

Dataset Curation via Adversarial Ordering
[Figure: four-step pipeline]
• Inputs: labeling functions (LFs) Λ and unlabeled data 𝑋
• Step 1, LF Pruning: outputs independent LFs Λ′ ⊆ Λ
• Step 2, Probabilistic Labeling: outputs LF weights 𝜇 and weak labels $\hat{Y}$
• Step 3, Confidence Intervals: outputs intervals $\Theta_L, \Theta_U$
• Step 4, Adversarial Dataset Curation: outputs natural datasets $D_1, \dots, D_N$

Labeling Function Pruning
• Weakly-supervised data labeling techniques assume LFs are conditionally independent unless otherwise specified
• We identify an independent subset of LFs Λ′ ⊆ Λ
• At a high level, we:
1. Construct a graph representation of LF dependencies
• LFs as nodes
• Edges between LFs whose Pearson correlation magnitude exceeds 𝛿
2. Rank LFs (in descending order) by the number of maximal cliques they belong to
• Break ties by giving preference to LFs with higher coverage
• Reveals subsets of LFs that tend to share similar labeling patterns
3. Iterate through the ranking to drop dependent LFs
• Goal: select the smallest subset of LFs that covers all the cliques
[Figure: worked example with 𝛿 = 0.5 showing the LF correlation matrix, the resulting dependency graph over seven LFs, a table of the maximal cliques each LF belongs to, and the resulting pruned set Λ′ of three LFs]
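The three pruning steps can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the encoding of "abstain" as -1, and the greedy pass that keeps an LF only if it covers a not-yet-covered clique are all assumptions made for this sketch.

```python
from itertools import combinations
from math import sqrt

ABSTAIN = -1  # assumed encoding of an abstaining LF output


def pearson(u, v):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    if var_u == 0 or var_v == 0:
        return 0.0  # a constant LF is treated as uncorrelated
    return cov / sqrt(var_u * var_v)


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def prune_lfs(votes, delta=0.5):
    """votes[j] holds the outputs of LF j over all samples; returns kept LF indices."""
    m = len(votes)
    # Step 1: dependency graph with an edge when |correlation| > delta.
    adj = {j: set() for j in range(m)}
    for a, b in combinations(range(m), 2):
        if abs(pearson(votes[a], votes[b])) > delta:
            adj[a].add(b)
            adj[b].add(a)
    cliques = maximal_cliques(adj)
    # Step 2: rank by clique membership count, ties broken by coverage.
    coverage = [sum(v != ABSTAIN for v in votes[j]) / len(votes[j]) for j in range(m)]
    n_cliques = [sum(j in c for c in cliques) for j in range(m)]
    ranking = sorted(range(m), key=lambda j: (-n_cliques[j], -coverage[j], j))
    # Step 3 (assumed greedy rule): keep an LF only if it covers a new clique.
    kept, uncovered = [], set(cliques)
    for j in ranking:
        hit = {c for c in uncovered if j in c}
        if hit:
            kept.append(j)
            uncovered -= hit
    return sorted(kept)
```

With two identical LFs and a third uncorrelated one, the sketch keeps one LF from the correlated pair plus the independent LF. A greedy pass of this kind is a heuristic and is not guaranteed to find the smallest covering subset.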

Probabilistic Labeling
• We weakly label the unlabeled input data 𝑋 using programmatic weak supervision
• A label model aggregates the outputs of independent LFs Λ′ via a weighted combination
• How are the LF weights 𝜇 determined? It depends on the label model used:
• Majority Vote, whose weight vector is uniform
• Snorkel, whose weight vector reflects the unknown accuracies of the LFs (Ratner et al., 2019)
• Limitation: weak label confidences are typically overconfident
[Figure: Independent LFs → Label Model → Weak Labels]
Label model output:
$$\hat{P}(Y = y \mid \boldsymbol{\lambda}) = \sum_{j=1}^{|\Lambda'|} \mu_j^{(y)} \cdot \mathbf{1}[\lambda_j(x) = y]$$
…then softmax over $\mathcal{Y}$
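As a rough illustration of the weighted combination followed by a softmax, the sketch below scores each class by summing the weights of the LFs that voted for it. The function name, the use of `None` for "abstain", and the per-class weight dictionaries are assumptions for this example. Uniform weights correspond to the Majority Vote case; weights learned by a label model such as Snorkel would be supplied by that model.

```python
from math import exp

ABSTAIN = None  # assumed encoding of an abstaining LF output


def label_model(lf_outputs, weights, classes):
    """Return class probabilities for one sample.

    lf_outputs[j] is the output of LF j (a class or ABSTAIN).
    weights[j][y] is the weight of LF j for class y.
    """
    # Weighted vote: add LF j's weight to the class it voted for.
    scores = {
        y: sum(w[y] for out, w in zip(lf_outputs, weights) if out == y)
        for y in classes
    }
    # Softmax over the classes, shifted by the max score for stability.
    top = max(scores.values())
    exps = {y: exp(s - top) for y, s in scores.items()}
    total = sum(exps.values())
    return {y: e / total for y, e in exps.items()}


# Majority Vote: every LF has weight 1 for every class.
classes = [0, 1]
uniform = [{0: 1.0, 1: 1.0} for _ in range(3)]
probs = label_model([1, 1, 0], uniform, classes)
weak_label = max(probs, key=probs.get)
```

Here two of three LFs vote for class 1, so class 1 receives the larger probability and becomes the weak label.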

Confidence Intervals for Weak Labels
• Construct intervals $[\Theta_L, \Theta_U]$ containing the true uncertainty in the weak labels $\hat{Y}$ with probability at least $1 - \alpha$
• The interval size depends on two factors:
• Weights of non-abstaining LFs
• Number of non-abstaining LFs
• We construct Clopper-Pearson confidence intervals, treating LFs as Bernoulli trials
• The number of trials $n(x)$ is the number of non-abstaining LFs
• The number of successes $s(x)$ is the normalized probability of label $\hat{y}$, scaled by $n(x)$
For $x \in X$ with weak label $\hat{y} \in \mathcal{Y}$,
$$\Theta_L(\alpha; n, s) = B\left(\frac{\alpha}{2};\ s,\ n - s + 1\right)$$
$$\Theta_U(\alpha; n, s) = B\left(1 - \frac{\alpha}{2};\ s + 1,\ n - s\right)$$
where $B(q; a, b)$ is the $q$-th quantile of a beta distribution with shape parameters $a$ and $b$, and
$$n(x) = \sum_{j=1}^{|\Lambda'|} \mathbf{1}[\lambda_j(x) \neq \text{“abstain”}]$$
$$s(x) = n(x) \cdot \frac{\exp\left(\sum_{j=1}^{|\Lambda'|} \mu_j^{(\hat{y})} \cdot \mathbf{1}[\lambda_j(x) = \hat{y}]\right)}{\sum_{y \in \mathcal{Y}} \exp\left(\sum_{j=1}^{|\Lambda'|} \mu_j^{(y)} \cdot \mathbf{1}[\lambda_j(x) = y]\right)}$$
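To make the interval computation concrete, here is a self-contained sketch that evaluates the two beta quantiles without external libraries. The beta quantile is found by bisection on the regularized incomplete beta function (computed with a standard continued-fraction expansion). All function names are hypothetical, and the handling of the edge cases s = 0, s = n, and n = 0 follows the usual Clopper-Pearson convention rather than anything stated on the slide.

```python
from math import exp, lgamma, log


def _betacf(a, b, x, max_iter=300, eps=1e-13):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-30
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        for aa in (
            m * (b - m) * x / ((qam + m2) * (a + m2)),
            -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2)),
        ):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b), i.e. the beta CDF."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b) + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def beta_quantile(q, a, b, iters=100):
    """q-th quantile B(q; a, b) by bisection on the beta CDF."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if betainc(a, b, mid) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(n, s, alpha=0.05):
    """Return (lower, upper) for n trials and s (possibly fractional) successes."""
    if n == 0:
        return 0.0, 1.0  # every LF abstained: no information
    lower = 0.0 if s <= 0 else beta_quantile(alpha / 2.0, s, n - s + 1)
    upper = 1.0 if s >= n else beta_quantile(1.0 - alpha / 2.0, s + 1, n - s)
    return lower, upper
```

For a sample with n(x) non-abstaining LFs and weak-label probability p, calling `clopper_pearson(n, n * p, alpha)` gives the interval. Fewer non-abstaining LFs or a probability closer to 0.5 yields a smaller lower bound.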

Adversarial Dataset Curation
• Curate a sequence of adversarially ordered datasets $D_1, \dots, D_N$
1. Adversarially order the data
• Intuition: samples with small CI lower bounds are more adversarial
• Order samples by CI lower bound in descending order:
$x_{i_1}, x_{i_2}, \dots, x_{i_{|X|}}$ where $\Theta_L(x_{i_1}) \geq \Theta_L(x_{i_2}) \geq \dots \geq \Theta_L(x_{i_{|X|}})$ and $i_1, \dots, i_{|X|} \in \{1, \dots, |X|\}$
2. Construct datasets from the adversarial ordering
• For each dataset $D_i$, select the top $100 \cdot i/N$ percent of ordered samples:
$D_i = \{(x_{i_j}, \hat{y}_{i_j}) : j \in \{1, \dots, i \cdot |X| / N\}\}$
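The ordering and dataset construction can be sketched in a few lines. The function name is hypothetical, and rounding the cutoff down with integer division is an assumption for when i·|X|/N is not an integer.

```python
def curate_datasets(samples, weak_labels, lower_bounds, num_datasets):
    """Build nested datasets D_1, ..., D_N from CI lower bounds.

    Samples are ordered by CI lower bound, largest first, so D_1 holds the
    least adversarial samples and D_N holds all of the data.
    """
    order = sorted(range(len(samples)), key=lambda k: lower_bounds[k], reverse=True)
    datasets = []
    for i in range(1, num_datasets + 1):
        size = (i * len(order)) // num_datasets  # assumed: round the cutoff down
        datasets.append([(samples[k], weak_labels[k]) for k in order[:size]])
    return datasets


# Toy usage with four samples and N = 2 datasets.
toy = curate_datasets(
    samples=["a", "b", "c", "d"],
    weak_labels=[1, 0, 1, 0],
    lower_bounds=[0.2, 0.9, 0.5, 0.7],
    num_datasets=2,
)
```

Each dataset extends the previous one with samples that have smaller lower bounds, so later datasets are more adversarial.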

Evaluation
• Goal: Statistically valid adversarial ordering
• Accuracy of the weak labels per dataset decreases
• Robust LE-MCPS is expected to show decreasing accuracy on our datasets
• We use Spearman’s Rank Correlation to validate adversarial ordering
• Good result indicated by negative correlation with statistically significant p-value (<0.01)
• Bad result indicated by positive correlation with statistically significant p-value (<0.01)
• Otherwise abstain
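The good / bad / abstain decision can be illustrated with a small sketch that correlates the dataset index with the per-dataset accuracy. The slide does not say how the p-value is computed; this example swaps in an exact two-sided permutation test, which is only practical for a small number of datasets. All function names are hypothetical.

```python
from itertools import permutations
from math import sqrt


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return result


def pearson(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    if var_u == 0 or var_v == 0:
        return 0.0
    return cov / sqrt(var_u * var_v)


def validate_ordering(accuracies, alpha=0.01):
    """Return (verdict, rho, p) for accuracies measured on D_1, ..., D_N."""
    idx = [float(i) for i in range(1, len(accuracies) + 1)]
    acc_ranks = ranks(accuracies)
    rho = pearson(idx, acc_ranks)  # Spearman = Pearson on ranks
    # Exact permutation p-value (assumed test): N! terms, small N only.
    extreme = total = 0
    for perm in permutations(acc_ranks):
        total += 1
        if abs(pearson(idx, perm)) >= abs(rho) - 1e-12:
            extreme += 1
    p = extreme / total
    if p < alpha:
        return ("good" if rho < 0 else "bad"), rho, p
    return "abstain", rho, p
```

With six datasets whose accuracy falls strictly from D_1 to D_6, the correlation is -1 and only 2 of the 720 permutations are as extreme, so the verdict is "good". With five datasets the same pattern gives p = 2/120, which is above 0.01, so the sketch abstains.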

Results
• Datasets:
• HR Low/High, RR Low/High, SpO2 Low: classify suppressible physiologic monitoring alarms from time-series vital sign data
• Cross-modal: classify abnormal radiography images from corresponding imaging text reports
• Crowdsourcing: classify sentiment in tweets
• Recsys: predict if a user will read and like a book given their reading history
• Spam: classify spam emails
[Figure: results comparing our approach with three ablations (without LF pruning and confidence intervals; without confidence intervals; without LF pruning); a number of entries read “Abstain”]
Takeaways:
• Our approach successfully produces natural datasets with statistically valid adversarial ordering
• And does not produce statistically invalid datasets!

Conclusion
• We proposed a weakly-supervised approach to curating adversarially ordered datasets for evaluating robustness
• Using unlabeled data
• And labeling functions
• We demonstrated our approach yields datasets with statistically valid adversarial ordering
• Future work:
• Evaluate real-world LE-MCPS on our datasets
• Create a significance detector for adversarial ordering
• Generally requires ground truth

Thank You!
Sydney Pugh
sfpugh@seas.upenn.edu
Ivan Ruchkin, Insup Lee, James Weimer
Recently defended and graduating this summer!
[QR codes: Code, Paper]
