Denoising Unpaired
Low Dose CT Images with
Self-Ensembled CycleGAN
Joonhyung Lee, Sangjoon Park, Sun Kyoung You, Jong Chul Ye
ISBI 2020 Workshop 1 Poster (SaPcPo): “Deep Learning for Biomedical Image Reconstruction”
Introduction
2
Low Dose vs. High Dose
3
Unpaired Image Data
4
CycleGAN
5
Issues with Naïve Implementation
[Figure: input vs. output]
6
Method
7
8
Generator Architecture
Discriminator Architecture
9
Spectral Normalization
PatchGAN
Training Details
• Trained for 200 epochs with a 256 × 256 patch size.
• One Adam optimizer per domain (two in total) with 𝛽1 = 0.5, 𝛽2 = 0.999.
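For reference, the Adam update with the 𝛽 values above can be sketched in a few lines of NumPy. The learning rate and gradient value here are illustrative placeholders, not values from the presentation:

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=2e-4, beta1=0.5, beta2=0.999, eps=1e-8):
    """One Adam update; beta1/beta2 match the slide, lr is a placeholder."""
    m = beta1 * m + (1 - beta1) * grad       # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2  # second-moment (variance) estimate
    m_hat = m / (1 - beta1 ** t)             # bias correction for step t
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# toy example: one step on a scalar parameter with gradient 0.5
p, m, v = 1.0, 0.0, 0.0
p, m, v = adam_step(p, 0.5, m, v, t=1)
```

The low 𝛽1 = 0.5 (versus the common default 0.9) shortens the momentum memory, a standard choice for stabilizing GAN training.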
10
Results
11
Comparison 1
12
[Figure: input vs. output]
Difference Image 1
13
[Figure: output − input]
Comparison 2
14
[Figure: input vs. output]
Difference Image 2
15
[Figure: output − input]
Reconstruction Method
16
Issues & Future Direction
17
[Figure: input vs. output]
Thank You for Listening!
• The slides for this presentation are available at
https://www.slideshare.net/ssuserc416e2/presentations.
• Please contact the authors if you have any questions.
18


Editor's Notes

  • #5 Unlike in most denoising projects, we cannot use supervised learning with paired noisy and clean images, as acquiring such pairs would expose patients to unnecessary radiation. However, unpaired clean and noisy images acquired at different radiation doses are readily available. We are therefore faced with the problem of learning to denoise from unpaired image data.
  • #9 First, to address the problems of disappearing image segments and bright spots, we simply reduced the input patch size. Our data consist of images at least 512 pixels on a side, but most GAN models have difficulty learning from very large images. Because denoising is a low-level task, small patches still produce results of adequate quality. Second, to stabilize training, we applied spectral normalization to both the generator and the discriminator; we found that using both is better than using either alone. For the generator architecture, we use the RRDB network modified so that the output image has the same size as the input, with 32 channels in all convolutional layers. The model has no feature-normalization layers, which reduces computation and preserves the original range of values. Pooling layers are also removed; we found that removing them both improved results and shortened convergence times.
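The unpaired setting in note #5 is what CycleGAN addresses: two generators map between the low-dose and high-dose domains, and a cycle-consistency loss forces a round trip through both generators to reproduce the input. A minimal NumPy sketch of the L1 cycle-consistency term, with hypothetical identity-like functions standing in for the real generator networks:

```python
import numpy as np

def cycle_consistency_loss(x, g_low2high, g_high2low):
    """L1 cycle loss ||F(G(x)) - x||_1, the cycle-consistency term in CycleGAN."""
    reconstructed = g_high2low(g_low2high(x))
    return np.mean(np.abs(reconstructed - x))

# stand-in "generators" for illustration only (the real ones are CNNs)
g_low2high = lambda x: x + 0.1  # pretend denoiser: shifts intensities up
g_high2low = lambda x: x - 0.1  # pretend noise re-injector: inverse shift

x = np.random.default_rng(0).random((4, 4))  # toy "image" patch
loss = cycle_consistency_loss(x, g_low2high, g_high2low)
# perfect inverses give a loss near zero
```

In training, this term is minimized alongside the two adversarial losses so that the mapping to the other domain remains invertible rather than collapsing.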
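Spectral normalization, mentioned in note #9, rescales each weight matrix by an estimate of its largest singular value, typically obtained by power iteration. A minimal NumPy sketch (a real implementation, e.g. inside a deep-learning framework, carries the power-iteration vector across training steps instead of restarting it):

```python
import numpy as np

def spectral_normalize(W, n_iter=50):
    """Scale W so its largest singular value is ~1, via power iteration."""
    u = np.random.default_rng(0).normal(size=W.shape[0])
    for _ in range(n_iter):
        v = W.T @ u                 # right singular-vector estimate
        v /= np.linalg.norm(v)
        u = W @ v                   # left singular-vector estimate
        u /= np.linalg.norm(u)
    sigma = u @ W @ v               # estimated largest singular value
    return W / sigma

W = np.random.default_rng(1).normal(size=(8, 6))  # toy weight matrix
W_sn = spectral_normalize(W)
# the normalized matrix has largest singular value ~1
```

Constraining each layer's spectral norm bounds the Lipschitz constant of the network, which is why it stabilizes GAN training.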