
Remo Monti - DL for Clinical Brain MRI Segmentation


Presented at: Deep Learning Club, MDC, Berlin


  1. Deep Learning for Clinical Brain MRI Segmentation. MDC Deep Learning Club, Remo Monti, May 2018
  2. Overview • Introduction to MRI • Introduction to Image Segmentation • Brain Ventricle Segmentation • Our Machine Learning Problem • DCNN Architectures for Image Segmentation • U-Net, V-Net • Dice Coefficient and Dice Loss • Results • Discussion
  3. A Brief Introduction to MRI • MRI = (nuclear) magnetic resonance imaging • MRI scanners use strong magnetic fields and radio waves to generate images of the organs in the body. • Certain atomic nuclei can absorb and emit radio-frequency energy when placed in an external magnetic field. • Hydrogen nuclei are most often used to generate a detectable radio-frequency signal, which is received by antennas in close proximity to the anatomy being examined.
  4. A Brief Introduction to MRI • MRI can be divided into excitation, relaxation, acquisition, computing and display. Excitation: • Protons align their magnetic fields (spin axes) parallel or anti-parallel to the outer magnetic field • The imbalance between parallel and anti-parallel spins leads to a NET magnetization • This alignment can be perturbed by a radio-frequency signal; this process is called excitation. (Source: "MRI Physics for anyone who does not have a degree in physics", Evert J. Blink)
  5. A Brief Introduction to MRI • T1 images measure the tissue-specific time it takes the magnetization along the Z-axis to recover after excitation • During relaxation the hydrogen nuclei emit energy in the form of radio waves, which are measured by nearby sensors. [Figure: excitation, relaxation (T1), emission; the T1 relaxation time is tissue specific] (Source: "MRI Physics for anyone who does not have a degree in physics", Evert J. Blink)
  6. A Brief Introduction to MRI • T2 images measure the loss of phase in the XY-plane. After excitation the spins of the hydrogen nuclei are in phase; this phase coherence is lost over time. • The process of going from a totally in-phase situation to a totally out-of-phase situation is called T2 relaxation (the T2 relaxation time is tissue specific). (Source: "MRI Physics for anyone who does not have a degree in physics", Evert J. Blink)
  7. T1 vs T2 Images • T1 and T2 relaxation are two independent processes that happen simultaneously • T1 happens along the Z-axis, T2 in the X-Y plane • Different tissues appear bright/dark in T1/T2 images • T1: water = dark • T2: water = bright
  8. A Brief Introduction to Image Segmentation • In image segmentation we assign a class to every input pixel (or voxel). Figure: Pelt, Sethian (PNAS 2018)
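The per-pixel classification idea can be sketched in numpy: given per-class probability maps, the predicted segmentation is simply the argmax over the class axis (the toy values below are made up for illustration):

```python
import numpy as np

# Per-class probability maps of shape (H, W, n_classes); here a 2x2 image
# with 3 classes. A segmentation network would produce these probabilities.
probs = np.array([
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
    [[0.2, 0.3, 0.5], [0.6, 0.3, 0.1]],
])

# The label map assigns one class index to every pixel.
labels = probs.argmax(axis=-1)
print(labels)  # [[0 1]
               #  [2 0]]
```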
  9. Brain MRI Segmentation – Ventricles • The ventricular system is a set of four interconnected cavities (ventricles) in the brain, where the cerebrospinal fluid (CSF) is produced. • Hypothesis: ventricles increase in size before / during multiple sclerosis episodes; "brain inflammation" is measurable by looking at the ventricles. (Image: polygon data from BodyParts3D, generated by Life Science Databases (LSDB), CC BY-SA 2.5)
  10. Brain MRI Segmentation – Ventricles • T2 images in three views: axial (512 x 512), sagittal (512 x 44), coronal (512 x 44) • Voxel sizes: x = 0.5 mm, y = 0.5 mm, z = 3.0 mm (Segmentations: Jason Millward)
  11. Machine Learning Problem • Learn to predict the segmentation (drawn using T2 images) from T1 images • Nscans > 220, resolution 512 x 512 x 44 (0.5 x 0.5 x 3 mm) • T1 and T2 images + segmentations (Jason Millward) • the "MS dataset" • The model should also produce satisfactory results on a different dataset, "Day2Day" (Filevich 2017) • T1 images of healthy subjects measured over many time points • resolution 192 x 256 x 256 (1 x 1 x 1 mm)
  12. Machine Learning Problem • Learn from this… … predict on this
  13. DCNN Architectures for Image Segmentation • "Standard" Deep Convolutional Neural Net (DCNN) architectures, which rely on pooling to aggregate spatial context and on deconvolution/upscaling to restore the original image dimensions, are not the best choice for image segmentation • The reason is the loss of spatial information associated with pooling. [Figure: an input of 256 x 256 x 1 passes through convolutions and max pooling to 256 x 256 x c1, 128 x 128 x c2, 64 x 64 x c3 and 32 x 32 x c4 (channel counts c1 < c2 < c3 < c4), then is up-scaled back to the desired output of 256 x 256 x nclasses; dimensions are nrow x ncol x nchannel]
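How quickly pooling destroys spatial resolution can be seen with a small numpy sketch (a hand-rolled 2x2 max pool, not from any library):

```python
import numpy as np

def max_pool2x2(x):
    """2x2 max pooling with stride 2 on a (H, W) array."""
    h, w = x.shape
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.random.rand(256, 256)
shapes = [x.shape]
for _ in range(3):
    x = max_pool2x2(x)
    shapes.append(x.shape)

# After three pooling steps only a 32x32 grid remains: the per-pixel
# positions needed for a 256x256 segmentation map have been discarded.
print(shapes)  # [(256, 256), (128, 128), (64, 64), (32, 32)]
```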
  14. DCNN Architectures for Image Segmentation • There are multiple ways to tackle the loss of spatial information: • Skip connections: 2D: U-Net (Ronneberger 2015); 3D: V-Net (Milletari 2016) • Dense architectures: Huang 2016 (original publication); recent examples: MSD-Net (Pelt 2017), HyperDense-Net (Dolz 2018)
  15. Skip Connections
  16. Dense Connections
  17. 2D Models: U-Net • 2 convolutions without padding at every "level" of the network (levels 0–4) • Skip connections propagate information from early to later layers (after cropping)
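Because the unpadded convolutions shrink the feature maps, the encoder output must be center-cropped before it is concatenated with the decoder feature map. A minimal numpy sketch of that cropping step (the 64-to-56 sizes are illustrative, not taken from the actual model):

```python
import numpy as np

def center_crop(feat, target_h, target_w):
    """Center-crop a (H, W, C) feature map to (target_h, target_w, C)."""
    h, w = feat.shape[:2]
    top = (h - target_h) // 2
    left = (w - target_w) // 2
    return feat[top:top + target_h, left:left + target_w, :]

# An encoder map (64x64) is cropped to match a 56x56 decoder map before
# the skip-connection concatenation.
encoder_feat = np.zeros((64, 64, 128))
cropped = center_crop(encoder_feat, 56, 56)
print(cropped.shape)  # (56, 56, 128)
```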
  18. 3D Models: V-Net • Residual blocks of varying depths at every "level" of the network • Skip connections propagate information from early to later layers • "Down Conv" and "Up Conv" layers between levels (instead of MaxPool and UpScale) • Can be used with 2D input too! • U-Net and V-Net make use of the same idea: skip connections
  19. V-Net Building Blocks (as Keras layers) • Residual block: Conv3D, PReLU, Add • Down convolution: Conv3D (k=2, s=2) • Up convolution: Conv3DTranspose (k=2, s=2) • Skip connections: Concatenate
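Assuming the TensorFlow Keras API, the building blocks named on the slide could be sketched as follows (filter counts, depths and the toy input shape are illustrative, not the actual model):

```python
from tensorflow.keras import layers, Model, Input

def residual_block(x, depth, filters):
    """'depth' 3D convolutions with PReLU, plus an additive skip (Add)."""
    skip = x
    for _ in range(depth):
        x = layers.Conv3D(filters, kernel_size=3, padding="same")(x)
        x = layers.PReLU(shared_axes=[1, 2, 3])(x)
    return layers.Add()([x, skip])

def down_conv(x, filters):
    """Strided convolution replacing max pooling (Conv3D, k=2, s=2)."""
    x = layers.Conv3D(filters, kernel_size=2, strides=2)(x)
    return layers.PReLU(shared_axes=[1, 2, 3])(x)

inp = Input(shape=(32, 32, 16, 16))          # toy volume with 16 channels
x = residual_block(inp, depth=2, filters=16)
x = down_conv(x, filters=32)                 # halves each spatial dimension
model = Model(inp, x)
print(model.output_shape)                    # (None, 16, 16, 8, 32)
```

The up path would use `layers.Conv3DTranspose(filters, kernel_size=2, strides=2)` and `layers.Concatenate()` for the skip connections, mirroring the blocks above.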
  20. U-Net vs V-Net • One disadvantage of 3D architectures is the large memory footprint • This is especially true if we want to feed the entire scan (i.e. all slices) to the network at once • V-Net: one batch contains (batch-size * nslices) 2D images • U-Net: the batch size directly determines the number of 2D images in one batch • In other words: for the U-Net each slice is one training example, for the V-Net each scan is one training example, so Nvnet < Nunet • Is it harder to learn 3D kernels? Convolutions are not rotation-invariant, and 3D adds an additional rotational axis
  21. Quality of Predicted Segmentation • Quantitative (vs ground truth): (weighted) cross-entropy, Dice coefficient, … • Qualitative: do the predicted contours look as if they were produced by an expert? • Radiology Turing Test: if there is an attending physician on one side of a wall, and a computer or radiologist on the other, can the attending physician tell the difference? (ISMRM April 2017, Computer Aided Diagnosis)
  22. The Dice Coefficient … similar to Intersection over Union. Let R be the reference segmentation (gold standard) with voxel values rn for the foreground class and voxel n over N image elements. Let P with values pn be the corresponding predicted probabilistic map. DC = (2 · Σn rn pn + ε) / (Σn (rn + pn) + ε), where ε is a small smoothing constant. The Dice coefficient is between 0 (no overlap) and 1 (perfect overlap).
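The coefficient translates directly into numpy (the small ε smoothing term, a common implementation detail, avoids division by zero when both masks are empty):

```python
import numpy as np

def dice_coefficient(r, p, eps=1e-7):
    """Dice coefficient between a binary reference r and a (possibly
    probabilistic) prediction p, summed over all voxels n."""
    r = r.ravel().astype(float)
    p = p.ravel().astype(float)
    return (2.0 * (r * p).sum() + eps) / (r.sum() + p.sum() + eps)

ref = np.array([1, 1, 0, 0])
print(dice_coefficient(ref, np.array([1, 1, 0, 0])))  # perfect overlap -> ~1.0
print(dice_coefficient(ref, np.array([0, 0, 1, 1])))  # no overlap     -> ~0.0
```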
  23. Generalized Dice Loss. Let R be the reference segmentation (gold standard) with voxel value rln for class l and voxel n over N image elements. Let P with values pln be the corresponding predicted probabilistic map. GDL = 1 − 2 · (Σl wl Σn rln pln) / (Σl wl Σn (rln + pln)), summing over all classes l and all voxels n, with class weights wl = 1 / (Σn rln)². (Sudre 2017)
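Following the definitions above (with the wl = 1/(Σn rln)² weighting proposed by Sudre 2017, which balances classes of very different sizes), a numpy sketch of the loss could look like this:

```python
import numpy as np

def generalized_dice_loss(r, p, eps=1e-7):
    """Generalized Dice loss. r and p have shape (n_voxels, n_classes):
    r is the one-hot reference, p the predicted probabilities."""
    w = 1.0 / (r.sum(axis=0) ** 2 + eps)           # one weight per class l
    numerator = (w * (r * p).sum(axis=0)).sum()    # sum_l w_l sum_n r_ln p_ln
    denominator = (w * (r + p).sum(axis=0)).sum()  # sum_l w_l sum_n (r_ln + p_ln)
    return 1.0 - 2.0 * numerator / (denominator + eps)

# A perfect prediction gives a loss close to 0.
r = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
print(generalized_dice_loss(r, r.copy()))
```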
  24. V-Net: Our Implementation on T1 Scans
      |                                   | Milletari 2016        | Our Architecture                  |
      | Input shape (x, y, z, nchannel)   | 128, 128, 64, 1       | 256, 256, 32, 1                   |
      | Kernel size (x, y, z)             | 5, 5, 5               | 3, 3, 2                           |
      | Strides for up- and down-conv     | 2, 2, 2               | 2, 2, 1                           |
      | nfilters @ levels 1 … nlevels     | 16, 32, 64, 128, 256  | 32, 64, 128, 256 (one level less) |
      | Residual block depth @ levels     | 1, 2, 2, 2, 3         | 1, 2, 2, 3                        |
      | Loss                              | "Dice-based loss" (?) | Generalized Dice Loss             |
  25. V-Net: Our Implementation on T1 + T2 Scans • The input has 2 channels (T1, T2) • The first layer finds "useful" combinations of the T1 & T2 channels: Conv3D(filters=32, kernel_size=(1,1,1)) • This only works if the subject hasn't moved between the scans… [Figure: 32 1x1x1 convolutions map the (256, 256, 32, 2) input to (256, 256, 32, 32), which is fed into the V-Net]
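Since a 1x1x1 convolution has no spatial extent, it is just a per-voxel linear mix of the input channels. A numpy sketch with a shrunken volume (8 x 8 x 4 instead of 256 x 256 x 32) and random placeholder weights:

```python
import numpy as np

rng = np.random.default_rng(0)
volume = rng.random((8, 8, 4, 2))  # small stand-in for the (256, 256, 32, 2) T1+T2 input
weights = rng.random((2, 32))      # maps the 2 input channels to 32 output channels

# For every voxel (x, y, z) the output is volume[x, y, z, :] @ weights --
# the same channel mixing a Conv3D with kernel_size=(1, 1, 1) computes
# (up to the bias term).
mixed = np.einsum("xyzc,cf->xyzf", volume, weights)
print(mixed.shape)  # (8, 8, 4, 32)
```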
  26. Results • Currently, validation is performed on all scans of one single subject (12 timepoints) • None of the scans from that subject are part of the training set
      | Model | Input | # Param    | avg DC ± sd DC |
      | Unet* | T1    | 56,442,132 | 0.824 ± 0.036  |
      | Vnet  | T1    |  8,074,338 | 0.855 ± 0.030  |
      | Vnet  | T1+T2 |  8,092,322 | 0.869 ± 0.019  |
      * Our implementation of the Unet uses residual blocks and up- and down-convolutions just like the Vnet
  27. Results per Slice [Figure: results plotted per slice location along the z-axis (slices 0–15), together with ventricle size (Nvoxels) per slice location]
  28. Results on Validation Set (1 Patient) [Figure: Vnet T1 predictions]
  29. Results on Day2Day data • Not that good. [Figure: Vnet T1 predictions]
  30. Discussion • 3D Deep Convolutional Neural Networks provide state-of-the-art performance for image segmentation • A 3D model with a comparable architecture but just 1/8 of the parameters of a 2D model outperforms the 2D model • The performance gain comes at the cost of higher memory needs • Our V-Net does not easily generalize to the Day2Day data • Data augmentation? • Transfer learning?
  31. Acknowledgements • AG Niendorf: Jason Millward, Sonia Waiczies, Andreas Pohlmann • AG Lippert: Christoph Lippert, Aiham Taleb, Sharyar Khorasani
  32. References
      • Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. "U-Net: Convolutional networks for biomedical image segmentation." International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, 2015.
      • Milletari, Fausto, Nassir Navab, and Seyed-Ahmad Ahmadi. "V-Net: Fully convolutional neural networks for volumetric medical image segmentation." 3D Vision (3DV), 2016 Fourth International Conference on. IEEE, 2016.
      • Filevich, Elisa, et al. "Day2day: investigating daily variability of magnetic resonance imaging measures over half a year." BMC Neuroscience 18.1 (2017): 65.
      • Huang, G., Z. Liu, and K. Q. Weinberger. "Densely Connected Convolutional Networks." arXiv preprint (2016): 1-12.
      • Dolz, Jose, et al. "HyperDense-Net: A hyper-densely connected CNN for multi-modal image segmentation." arXiv preprint arXiv:1804.02967 (2018).
      • Sudre, Carole H., et al. "Generalised Dice overlap as a deep learning loss function for highly unbalanced segmentations." Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. Springer, Cham, 2017. 240-248.
      • Pelt, Daniël M., and James A. Sethian. "A mixed-scale dense convolutional neural network for image analysis." Proceedings of the National Academy of Sciences (2017): 201715832.