CT Scan Synthesis | Tejas Prabhune
1. Generation of CT Scans and Segmentation Masks Using GANs
Tejas S Prabhune
2. The Importance of CT Scans in the Medical Field
• Key for diagnosing diseases like cancer.
• Greater detail and information than conventional X-rays.
• Much faster than conventional X-rays.
3. The Importance of Segmentation Masks
• Doctors extract crucial information from CT scans into
segmentation masks.
• The volumes of highlighted organs are extremely important
for analyzing the results of radiotherapy.
5. The Importance of AI in Medical Imaging
• AI used in conjunction with CT scans can diagnose diseases
up to 150x faster.
• AI can greatly reduce the radiation dose patients are
exposed to, since it can infer detail from lower-dose, grainier images.
• AI can help decrease the time and work needed to create
segmentation masks from CT scans.
6. AI Limitations
• AI researchers need huge amounts of data to gain optimal
performance
• Privacy concerns
• Legal factors
• Data integrity losses/breaches
• There is a lack of large public datasets.
7. Public datasets of CT scans and segmentation masks
for AI are limited in size.
A way to generate CT scans and segmentation masks
is needed for AI implementation in medical imaging.
My solution is a framework using dual Generative
Adversarial Networks to generate segmentation masks
and their paired CT scans.
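The two-stage framework described above can be sketched as a simple composition. The function names and stubbed bodies below are illustrative stand-ins for the trained networks, not the actual implementation:

```python
# Illustrative sketch of the dual-GAN pipeline; generate_mask and
# translate_to_ct are stand-ins for the trained WGAN and cGAN.

def generate_mask(latent_vector):
    """Stage 1: the Segmentation Mask WGAN maps a latent vector
    to a 256x256 segmentation mask (stubbed here)."""
    assert len(latent_vector) == 50           # latent size used later in the talk
    return [[0] * 256 for _ in range(256)]    # placeholder mask

def translate_to_ct(mask):
    """Stage 2: the CT Scan cGAN translates a 256x256 mask
    into its paired 256x256 CT scan (stubbed here)."""
    assert len(mask) == 256 and len(mask[0]) == 256
    return [row[:] for row in mask]           # placeholder scan

def generate_pair(latent_vector):
    """One latent vector in, one paired (mask, CT scan) sample out."""
    mask = generate_mask(latent_vector)
    return mask, translate_to_ct(mask)

mask, ct = generate_pair([0.0] * 50)
```

Because the second stage consumes the first stage's output, every generated CT scan arrives with its segmentation mask already attached, which is exactly the paired data AI researchers need.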
8. Data
1A: left lung
1B: right lung
2: heart
3: sternum
4: anterior muscles
5: trapezius muscles
6: spinal canal
7: vertebral body
161 scans provided by Stony Brook University
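For reference, the organ labels above can be captured as a small mapping. The label strings follow the slide; a real pipeline would associate each with an integer pixel value in the mask:

```python
# Organ labels annotated in the segmentation masks (from the Data slide).
ORGAN_LABELS = {
    "1A": "left lung",
    "1B": "right lung",
    "2": "heart",
    "3": "sternum",
    "4": "anterior muscles",
    "5": "trapezius muscles",
    "6": "spinal canal",
    "7": "vertebral body",
}
```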
9. Generative Adversarial Networks
• Direct competition between neural networks
• Generator vs Discriminator
• Losses help train the generator/discriminator
11. Segmentation Mask WGAN
• Generator: starts with input vector of size 50
• Scaled to 256x256 using various layers
• Dense, Reshape, Conv2D, BatchNormalization, Activation
• The Wasserstein GAN formulation mitigates mode collapse
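As a rough shape walkthrough: the talk specifies only the 50-dim input and the 256x256 output, so the 8x8 starting feature map and the channel count below are assumptions.

```python
# Trace the generator's spatial sizes from the 50-dim latent vector
# to a 256x256 mask. Starting size (8x8) and channels (256) are assumed.
latent_dim = 50
start_size, channels = 8, 256

dense_units = start_size * start_size * channels  # Dense output, then Reshape
size = start_size
upsample_steps = 0
while size < 256:
    size *= 2            # each upsampling conv block doubles the resolution
    upsample_steps += 1
```

Under these assumptions, five doubling blocks take the 8x8 map to the full 256x256 resolution.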
12. Segmentation Mask WGAN
• Discriminator: starts with image from generator
• Scaled down from 256x256 to a single score indicating how
real or fake the image is
• Dense, Reshape, Conv2D, BatchNormalization, Activation
13. Losses
• The discriminator is trained to score the source data as
real and the generated data as fake.
• Every iteration, the discriminator receives a loss value
that helps it differentiate real images from generated ones
and learn the nuances of the generated images.
• Based on the scores the discriminator gives the generated
images, the generator receives a loss value that helps it
learn as well.
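Concretely, in the Wasserstein formulation this deck uses, both loss values can be computed directly from the critic's scores. The scores below are made-up numbers for illustration:

```python
from statistics import mean

# WGAN-style losses: the critic tries to score real data high and
# generated data low; the generator tries to push its scores up.
real_scores = [2.0, 1.5, 1.8]     # critic scores on source masks (made up)
fake_scores = [-1.0, -0.5, -1.2]  # critic scores on generated masks (made up)

critic_loss = mean(fake_scores) - mean(real_scores)  # critic minimizes this
gen_loss = -mean(fake_scores)                        # generator minimizes this
```

The critic's loss drops as the score gap widens, while the generator's loss drops as its images start scoring like real ones, which is the feedback loop the bullets above describe.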
14. Game
• This exchange of losses makes both the generator and the
discriminator better, increasing the quality of the images.
• Hundreds of thousands of iterations lead to the generator
creating realistic images that look like the source data but
with variance.
15. Now we have a model that can generate as many new,
unique segmentation masks as we want.
So now, we need to translate them to detailed CT scans to
create paired data.
16. CT Scan cGAN
• We use a conditional GAN and paired data to translate the
generated segmentation masks to CT scans.
• This uses a special generator that can go from a 256x256
image to another 256x256 image in contrast to starting
with an input vector.
17. CT Scan cGAN
• Generator: encodes the input segmentation masks to a
vector then decodes it into another 256x256 image.
• Its output starts off as noise but gradually approaches
high-quality images as training progresses.
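The encode/decode structure can be sketched by its spatial sizes. The depth here (halving 256x256 all the way down to a 1x1 bottleneck, then mirroring back up) is an assumption; the talk does not give the exact layer count:

```python
# Encoder halves the resolution at each step; the decoder mirrors it back.
size = 256
encoder_sizes = []
while size > 1:
    size //= 2
    encoder_sizes.append(size)          # 128, 64, ..., 1

decoder_sizes = [s * 2 for s in reversed(encoder_sizes)]  # 2, 4, ..., 256
```

Squeezing the mask through a narrow bottleneck is what forces the generator to learn a compact representation before decoding it into a CT scan.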
18. CT Scan cGAN
• Discriminator: takes a combined paired image and
attempts to give it a score of how real it is.
• This works like the Segmentation Mask WGAN, with loss
functions that help train both models.
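The "combined paired image" means the mask and CT scan are stacked channel-wise before being scored, so the discriminator judges the pair jointly rather than the scan alone. The single-channel shapes below are assumptions:

```python
# Conditional discriminator input: mask and CT concatenated along channels.
mask_shape = (256, 256, 1)   # height, width, channels (assumed single-channel)
ct_shape = (256, 256, 1)

paired_shape = (mask_shape[0], mask_shape[1], mask_shape[2] + ct_shape[2])
```

Scoring the pair jointly is what ties the generated CT scan to its mask: a realistic scan in the wrong place relative to the mask still scores as fake.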
Now we have models that can generate new, unique
segmentation masks and their corresponding CT scans.
Onto results!
20. Segmentation Mask WGAN Results
The results of the WGAN had most of the parts that were present in the original
segmentation masks. The heart, lungs, vertebral body, and outer body were present, albeit
in a less organized manner. There was visible noise around the image that should not have
been present. Mode collapse was not observed, as every mask was different.
In masks 1, 4, 7, and 9 of Figure 3, both lungs were shaped very well with minimal
artifacts, and the hearts were of different sizes. In contrast, masks 2, 3, and 8 had very
broken-up lungs and hearts. Masks 5 and 10 shared the different-sized lungs seen in some
of the source data, while also showing artifacts.
22. CT Scan cGAN Training Results
The training of the cGAN went as expected. The cGAN translated the segmentation masks very well, so the
synthetic CT scan images were of high quality and did not have any noise. The heart textures varied
from image to image; however, they were slightly blurrier than the source CT scans. The sternum was not
clearly defined in any of the generated CT scans, and the muscles did not have the clearest definition either.
Yet, these CT scans still had high quality and variance.
In scans 1B and 2B from Figure 4, we could see the variance of the generated CT scan heart textures very well,
since the textures were not just a flat gray plane; rather, we could see creases that differed between
the scans. However, these textures were not as high quality or as detailed as in the original scans. For example,
in the heart textures of scans 3C and 6C we could see different specific parts of the heart, but these textures
were not reflected prominently in the generated scans 3B and 6B. The cGAN also tended to fill the vertebral
body in white, whereas the original CT scans shaded it a slightly darker gray, as can be noticed in scans
2B/2C, 3B/3C, and 6B/6C.
24. CT Scan cGAN Generated Results
When the cGAN translated the generated segmentation masks to CT scans, as shown in Figure 5, it
recognized the locations of the heart, lungs, and vertebral body and translated them well. The generated
heart textures were very detailed and varied greatly from scan to scan. The sternum still was not
translated properly, and the muscles in the outer body had no definition. The noise present in the generated
segmentation masks also carried over to the CT scans.
The quality of the generated pairs was exemplified in 2A/2B, 5A/5B, and 6A/6B. The noise in 2A, 5A, and
6A was very limited and helped the cGAN translate each part cleanly. The variance in textures was also
evident in these scans: the heart texture diversity apparent in Figure 4 showed up in scans 2B, 5B, and 6B of
Figure 5 as well. However, when the generated segmentation masks had more noise, the cGAN handled it
poorly, as seen in scans 1A/1B and 7A/7B. The lung shapes were not defined properly in those segmentation
masks, and this lack of definition carried over into the generated CT scans.
25. Conclusion
The proposed framework proves viable for generating synthetic CT scan/segmentation mask data for AI
researchers to innovate in the medical imaging field.
The Segmentation Mask WGAN shows no sign of mode collapse and generates masks that are unique and
different from the input data. The major parts of the segmentation are present in the generated masks. However,
work needs to be done to remove the noise and artifacts: the shapes need to be more consistently
defined, as some segmentation masks are of high quality while others suffer from noise around the image.
The cGAN translates the generated masks effectively into CT scans, with different heart textures in every
image. It does a very good job of translating the source images, but when it estimates CT scans from
generated segmentation masks that have excessive noise, the noise and artifacts degrade its performance.
Overall, the cGAN is a success, since it produces quality images that vary from image to image.
As the first effort to create a method for expanding the public synthetic dataset of unique CT scans for AI
researchers, this framework is a success. Some tuning and more processing power could improve the WGAN
results, but both GANs work and prove that this framework can fulfill its purpose.
26. Impacts
• When these models are more refined, public datasets can be enlarged,
leading to AI innovation.
• AI innovation in medical imaging has several different impacts:
• Faster diagnoses
• Faster segmentation of CT scans
• Less radiation exposure
• Faster treatment/decision making
• These can all directly help save lives.
Future Work
A PatchGAN discriminator in the WGAN could be very effective at fixing issues such as the noise and
artifacts present in the generated segmentation masks. This discriminator takes in patches of N×N size and
measures how real or fake each patch is. By running this convolutionally over the image, smaller details
like noise can be suppressed.
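A rough sketch of the PatchGAN idea follows; the patch size N is illustrative, since the effective patch (the receptive field) depends on the particular conv stack.

```python
# PatchGAN: instead of one realism score per image, the discriminator
# outputs a grid of scores, each judging an N x N local patch of the
# 256x256 image. Patch size is an illustrative assumption.
image_size = 256
patch_size = 16                       # illustrative N
grid = image_size // patch_size       # scores per side of the output grid
num_patch_scores = grid * grid        # total local realism judgments
```

Because each score covers only a small neighborhood, local defects such as speckle noise are penalized directly instead of being averaged away in a single whole-image score.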
27. Acknowledgements
The research was conducted in the Visual Analytics Lab at the Department of Computer Science at Stony
Brook University, New York, from June to August 2019. Professor Klaus Mueller and PhD student Arjun
Krishna helped with narrowing down the selection of topics, guiding the research, and providing laboratory
space.