CardioGAN: Attentive Generative Adversarial Network
with Dual Discriminators for Synthesis of ECG from PPG
Pritam Sarkar, Ali Etemad
Dept. of ECE and Ingenuity Labs Research Institute
Queen’s University, Kingston, Canada
AAAI 2021
Outline
❑ Background
▪ Problem
▪ Motivation
▪ Proposed Solution
▪ ECG and PPG
▪ Related Work
❑ Method
▪ Proposed Framework
▪ Method
▪ Objective Function (Losses)
❑ Data and Training
▪ Datasets
▪ Data Preparation and Training
❑ Results
▪ Qualitative Results
▪ Quantitative Results
❑ Analyses
▪ Ablation Study
▪ Attention Map
▪ Paired Training
❑ Application
▪ ECG Synthesis from PPG-based wearable
❑ Conclusion
▪ Summary
▪ Future Directions
Problem and Motivation
Problem Statement:
❑ Our goal is to enable the use of ECG in wrist-based wearable devices such as smart watches,
for continuous cardiac monitoring.
❑ Currently, there are no reliable solutions for continuous ECG monitoring in wrist-based
wearables that are feasible for everyday and pervasive use.
Motivation:
❑ Cardiovascular diseases cause approximately 31% of deaths worldwide.
❑ Continuous wearable-based ECG could enable early diagnosis of cardiovascular diseases, and
in turn, early preventative measures can be taken to overcome severe cardiac problems.
ECG and PPG
Electrocardiogram (ECG):
Electrical measurement of
cardiac activity
Photoplethysmogram (PPG):
Optical measurement of
volumetric changes in blood
circulation
Related Work
PPG to ECG translation was first attempted by Zhu et al. 2019b.
Approach:
The Discrete Cosine Transform (DCT) was used to map each PPG cycle to its corresponding ECG
cycle. A linear regression model was then trained to learn the relation between the DCT
coefficients of PPG segments and the corresponding ECG segments.
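The approach described above can be illustrated with a minimal sketch. It assumes every PPG and ECG cycle has already been resampled to the same number of samples, builds the DCT with plain NumPy, and fits the linear map by ordinary least squares. The function names and the toy data are hypothetical and are not taken from Zhu et al.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (each row is one basis vector)."""
    k = np.arange(n)[:, None]   # frequency index
    t = np.arange(n)[None, :]   # time index
    m = np.cos(np.pi * (2 * t + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] /= np.sqrt(2.0)     # rescale the DC row so that m @ m.T is the identity
    return m

def fit_linear_map(ppg_cycles, ecg_cycles):
    """Least-squares fit of W such that DCT(ecg) is approximately DCT(ppg) @ W."""
    d = dct_matrix(ppg_cycles.shape[1])
    x = ppg_cycles @ d.T        # DCT coefficients of each PPG cycle (one cycle per row)
    y = ecg_cycles @ d.T        # DCT coefficients of each ECG cycle
    w, *_ = np.linalg.lstsq(x, y, rcond=None)
    return w

def predict_ecg(ppg_cycles, w):
    """Map PPG cycles to the DCT domain, apply W, and invert the DCT."""
    d = dct_matrix(ppg_cycles.shape[1])
    return (ppg_cycles @ d.T) @ w @ d

# Toy data only: a purely linear relation, which real PPG/ECG pairs do not follow.
rng = np.random.default_rng(0)
ppg = rng.normal(size=(200, 32))
mix = rng.normal(size=(32, 32))
ecg = ppg @ mix
w = fit_linear_map(ppg, ecg)
recon = predict_ecg(ppg, w)
```

On this toy data the fit is exact because the relation is linear by construction.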
Limitations:
❑ The model failed to produce reliable ECG in a subject-independent manner, which limits its
application to data from previously seen subjects.
❑ The relation between PPG segments and ECG segments is not linear; therefore, in several
cases, this model failed to capture the non-linear relationships between the ECG and PPG
domains.
❑ No experiments have been performed to indicate any performance enhancement gained from
using the generated ECG as opposed to the available PPG.
Proposed Framework
The architecture of the proposed CardioGAN is presented. The original ECG and PPG signals are shown in
orange; the generated outputs are represented with green; and the reconstructed or cyclic outputs are marked
with the color black for better visibility. Moreover, connections to the generators are marked with solid lines,
whereas connections to the discriminators are marked with dashed lines.
Method
The objective is to learn the mapping between the PPG (P) and ECG (E) domains.
Generator
forward mapping: $G_E : P \rightarrow E$
fake ECG: $E' = G_E(P)$
reconstructed PPG: $P'' = G_P(G_E(P))$
Discriminator
time-domain: $D_E^t$ : $E$ vs. $G_E(P)$
frequency-domain: $D_E^f$ : $f(E)$ vs. $f(G_E(P))$
where $f(x) = \mathrm{STFT}(x)$
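As a rough illustration of the input seen by the frequency-domain discriminator, the sketch below computes a magnitude STFT with NumPy. The 128 Hz rate and 4-second segment length come from the data preparation slide. The frame length, hop size, Hann window, and the use of the magnitude are assumptions made for this example, and the sinusoid is only a stand-in for a real signal.

```python
import numpy as np

def stft_magnitude(x, frame_len=64, hop=16):
    """Magnitude STFT of a 1-D signal using a Hann window (assumed settings)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # One row per frame, one column per frequency bin (frame_len // 2 + 1 bins).
    return np.abs(np.fft.rfft(frames, axis=1))

fs = 128                                  # sampling rate in Hz
t = np.arange(4 * fs) / fs                # one 4-second segment
signal = np.sin(2 * np.pi * 8.0 * t)      # stand-in 8 Hz tone, not real ECG
spec = stft_magnitude(signal)             # time-frequency representation f(x)
```

With these settings each frequency bin is fs / frame_len = 2 Hz wide, so the 8 Hz tone concentrates in bin 4 of every frame.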
Adversarial Loss
We calculate adversarial losses as follows (forward mapping):
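$$\mathcal{L}_{adv}(G_E, D_E^t) = \mathbb{E}_{e \sim E}[\log D_E^t(e)] + \mathbb{E}_{p \sim P}[\log(1 - D_E^t(G_E(p)))]$$
$$\mathcal{L}_{adv}(G_E, D_E^f) = \mathbb{E}_{e \sim E}[\log D_E^f(f(e))] + \mathbb{E}_{p \sim P}[\log(1 - D_E^f(f(G_E(p))))]$$
where $G_E : P \rightarrow E$ is obtained as:
$$\min_{G_E} \max_{D_E^t} \mathcal{L}_{adv}(G_E, D_E^t), \qquad \min_{G_E} \max_{D_E^f} \mathcal{L}_{adv}(G_E, D_E^f)$$
Similarly, the adversarial losses corresponding to the inverse mapping are $\mathcal{L}_{adv}(G_P, D_P^t)$ and $\mathcal{L}_{adv}(G_P, D_P^f)$,
and the mapping $G_P : E \rightarrow P$ is obtained as:
$$\min_{G_P} \max_{D_P^t} \mathcal{L}_{adv}(G_P, D_P^t), \qquad \min_{G_P} \max_{D_P^f} \mathcal{L}_{adv}(G_P, D_P^f)$$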
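The adversarial objective can be evaluated numerically as in the sketch below. It assumes the discriminator outputs are probabilities in (0, 1) and replaces the expectations with batch means. The function name, the clipping constant, and the example outputs are hypothetical.

```python
import numpy as np

def adversarial_loss(d_real, d_fake, eps=1e-12):
    """Batch estimate of E[log D(real)] + E[log(1 - D(fake))]."""
    d_real = np.clip(d_real, eps, 1.0 - eps)   # avoid log(0)
    d_fake = np.clip(d_fake, eps, 1.0 - eps)
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# Hypothetical discriminator outputs on real and generated ECG segments.
confident = adversarial_loss(np.array([0.9, 0.95]), np.array([0.1, 0.05]))
fooled = adversarial_loss(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
```

The discriminator tries to increase this value and the generator tries to decrease it, so `confident` (a discriminator that separates real from generated) is larger than `fooled` (a discriminator that outputs 0.5 everywhere).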
Cycle Consistency Loss
We calculate cyclic-consistency loss as follows:
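$$\mathcal{L}_{cyclic}(G_E, G_P) = \mathbb{E}_{e \sim E}[\lVert G_E(G_P(e)) - e \rVert_1] + \mathbb{E}_{p \sim P}[\lVert G_P(G_E(p)) - p \rVert_1]$$
to ensure that the forward and inverse mappings are consistent, i.e.,
$p \rightarrow G_E(p) \rightarrow G_P(G_E(p)) \approx p$
$e \rightarrow G_P(e) \rightarrow G_E(G_P(e)) \approx e$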
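A small numerical sketch of the cycle consistency term follows. The two "generators" are toy affine functions chosen only so that the result can be checked by hand; they are not the CardioGAN networks. The L1 norm is summed over time samples and averaged over the batch, which is one reasonable reading of the formula.

```python
import numpy as np

def cycle_consistency_loss(g_e, g_p, e_batch, p_batch):
    """L1 cycle loss: sum over time samples, mean over the batch."""
    e_term = np.abs(g_e(g_p(e_batch)) - e_batch).sum(axis=1).mean()
    p_term = np.abs(g_p(g_e(p_batch)) - p_batch).sum(axis=1).mean()
    return e_term + p_term

# Toy stand-ins for the generators (hypothetical, for illustration only).
g_e = lambda p: 2.0 * p + 1.0          # "PPG -> ECG"
g_p = lambda e: (e - 1.0) / 2.0        # exact inverse of g_e
g_p_bad = lambda e: e / 2.0            # not an inverse of g_e

rng = np.random.default_rng(0)
e_batch = rng.normal(size=(8, 4))      # 8 segments of 4 samples each
p_batch = rng.normal(size=(8, 4))

consistent = cycle_consistency_loss(g_e, g_p, e_batch, p_batch)
inconsistent = cycle_consistency_loss(g_e, g_p_bad, e_batch, p_batch)
```

When the two mappings invert each other the loss is zero up to rounding; with the mismatched pair every ECG sample is off by 1 and every PPG sample by 0.5, giving 4 + 2 = 6.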
Final Loss
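$$\mathcal{L}_{CardioGAN} = \alpha \mathcal{L}_{adv}(G_E, D_E^t) + \alpha \mathcal{L}_{adv}(G_P, D_P^t) + \beta \mathcal{L}_{adv}(G_E, D_E^f) + \beta \mathcal{L}_{adv}(G_P, D_P^f) + \lambda \mathcal{L}_{cyclic}(G_E, G_P)$$
where $\alpha$ and $\beta$ are the adversarial loss coefficients corresponding to $D^t$ and $D^f$, respectively, and $\lambda$ is the
cyclic consistency loss coefficient.
We empirically set $\alpha$, $\beta$, and $\lambda$ to 3, 1, and 30, respectively.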
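The weighting can be made concrete with a short worked example. The coefficient values 3, 1, and 30 are the ones stated on the slide; the individual loss values passed in below are invented purely to show the arithmetic.

```python
def cardiogan_loss(adv_t_e, adv_t_p, adv_f_e, adv_f_p, cyclic,
                   alpha=3.0, beta=1.0, lam=30.0):
    """Weighted sum of the four adversarial terms and the cycle term."""
    time_terms = alpha * (adv_t_e + adv_t_p)   # time-domain discriminators
    freq_terms = beta * (adv_f_e + adv_f_p)    # frequency-domain discriminators
    return time_terms + freq_terms + lam * cyclic

# Made-up loss values for one training step.
total = cardiogan_loss(adv_t_e=-1.2, adv_t_p=-1.1,
                       adv_f_e=-0.9, adv_f_p=-1.0, cyclic=0.05)
```

Here the total is 3 × (−2.3) + 1 × (−1.9) + 30 × 0.05 = −7.3, which shows how strongly the cycle term is weighted relative to its raw magnitude.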
Datasets
Dataset   No. of Subjects   Average Length of Each Recording   ECG Sampling Freq. (Hz)   PPG Sampling Freq. (Hz)
BIDMC     53                8 mins.                            125                       125
CAPNO     42                8 mins.                            300                       300
DALIA     15                2 hrs.                             700                       64
WESAD     15                1 hr.                              700                       64
Data Preparation and Training
❑ Data Preparation
▪ Resampling to 128 Hz.
▪ Filtering
▪ Z-Score Normalization
▪ Segmentation into 4-second windows
▪ Min-Max Normalization [-1,1]
▪ Divide into Training Set (80% participants) and Test Set (20% participants)
▪ Shuffling the training data
❑ Training Parameters
▪ Epochs: 15
▪ Batch Size: 128
▪ Learning Rate: 1e-4
(fixed for the first 10 epochs, then linearly decreased to 0)
▪ Adam Optimizer
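The preparation steps and the learning-rate schedule listed above can be sketched as follows. Resampling and filtering are left out because the slide does not give the filter settings. Non-overlapping windows, per-segment min-max scaling, 0-based epoch indexing, and all function names are assumptions made for this example.

```python
import numpy as np

FS = 128             # target sampling rate in Hz
WINDOW = 4 * FS      # 4-second window, i.e. 512 samples

def z_score(x):
    """Zero-mean, unit-variance normalization of one recording."""
    return (x - x.mean()) / x.std()

def segment(x, window=WINDOW):
    """Split into non-overlapping windows, dropping the incomplete tail."""
    n = len(x) // window
    return x[: n * window].reshape(n, window)

def min_max(segments):
    """Scale each segment independently to [-1, 1] (assumes no constant segment)."""
    lo = segments.min(axis=1, keepdims=True)
    hi = segments.max(axis=1, keepdims=True)
    return 2.0 * (segments - lo) / (hi - lo) - 1.0

def split_subjects(subject_ids, train_frac=0.8, seed=0):
    """Split by participant so that no subject appears in both sets."""
    rng = np.random.default_rng(seed)
    ids = rng.permutation(sorted(set(subject_ids)))
    n_train = int(round(train_frac * len(ids)))
    return list(ids[:n_train]), list(ids[n_train:])

def learning_rate(epoch, base=1e-4, constant_epochs=10, total_epochs=15):
    """Constant for the first epochs, then linear decay reaching 0 at total_epochs."""
    if epoch < constant_epochs:
        return base
    return base * (total_epochs - epoch) / (total_epochs - constant_epochs)

raw = np.random.default_rng(0).normal(size=1300)   # stand-in for a resampled recording
segs = min_max(segment(z_score(raw)))
train_ids, test_ids = split_subjects(range(15))
```

Splitting by participant rather than by segment keeps the test subjects unseen during training, which is what a subject-independent evaluation requires.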
Qualitative Results
[Figure: input, original, and CardioGAN-generated signals]
We present ECG samples generated by our proposed CardioGAN. We show 2 different samples from each dataset to
better demonstrate the qualitative performance of our method.
Attention Map
Visualizations of the attention maps are presented, where the brighter parts indicate regions
to which the generator pays more attention than to the darker regions. We
present 4 samples of generated ECG segments corresponding to different subjects.
Summary
❑ We propose a novel framework called CardioGAN for generating ECG signals from PPG
inputs.
❑ Our approach has the potential to be used for continuous cardiac activity monitoring.
❑ To the best of our knowledge, no other studies have attempted to generate ECG from PPG (or
in fact any cross-modality signal-to-signal translation in the biosignal domain) using GANs or
other deep learning techniques.
❑ Heart rate (HR) estimated from the ECG generated by CardioGAN is more accurate and reliable
than HR estimated from the input PPG.
❑ CardioGAN can be integrated into existing PPG-based wearables to obtain continuous
synthetic ECG.
Future Directions
❑ The use of generated ECG in other tasks should be evaluated, for example, identification of
cardiovascular diseases and detection of abnormal heart rhythms, among others.
❑ Synthesizing multi-lead ECG can also be studied in order to extract more useful cardiac
information often missing in single-channel ECG recordings.
❑ Further research can be carried out on cross-modality signal-to-signal translation in the
biosignal domain, allowing less readily available physiological recordings to be generated from more
affordable and readily available signals.
❑ We also believe further research should be conducted towards defining more robust evaluation
metrics to quantify the quality of synthesized biosignals based on the inherent properties of
different modalities.