My presentation on our participation in the 2019 fastMRI Challenge.
Aside from theoretical considerations, it also explains key implementation issues that arise in all deep learning for MRI, such as disk I/O and CPU/GPU load balancing.
Used for the oral session presentation at ISBI 2020.
I accidentally wrote the title as "Deep Learning Sum-of-Squares Images in Accelerated Parallel MRI". Sorry for the mistake!
6. Theoretical Challenges
• The raw k-space data lies in the signal domain, not the image domain.
• The k-space data consists of complex values, not real values.
• MRI machines have many coils to provide redundancy and reduce noise.
• It is not clear how to handle the multiple coils with different sensitivities.
8. Solutions to Theoretical Challenges
• Concatenate the coils in the channel axis.
• Use real-valued magnitude images as input, discarding image phase.
• Output a single channel with all coil information (no sum-of-squares).
14. Model Training Details
• SSIM with kernel size 7 as the loss function.
• Inputs center cropped to 320x320.
• Adam optimizer with β1 = 0.9, β2 = 0.999.
• Initial learning rate of 10^-4, eventually reduced to 10^-6 during training.
• Trained for ~100 epochs on a single GTX 1080Ti or RTX 2080Ti GPU.
16. Practical Challenges
• Approximately 60% of teams could not overcome the implementation issues.
• Naïve data ETL pipeline has <10% GPU utilization.
• Disk I/O is a bottleneck due to the large size of multi-coil k-space data.
• CPU/GPU load balancing is a bottleneck for pre/post-processing of data.
18. Implementation Tips
• Compress the raw k-space data (HDF5 has native compression functionality).
• Store the data in fast SSD devices. SSDs also support parallel reads, unlike HDDs.
• Perform FFT (which is compute-heavy and parallelizable) on GPU, not CPU.
• Perform disk reading and host-to-device data transfers asynchronously and overlap them with GPU computation.
23. Thank You for Listening!
• Code for this project is available at
https://github.com/veritas9872/fastMRI-kspace.
• The slides for this presentation are available at
https://www.slideshare.net/ssuserc416e2/presentations.
• Please contact the authors if you have any questions.
Editor's Notes
Hello, ISBI 2020. My name is Joonhyung Lee and today, I will introduce our work on MRI acceleration, which was done during our participation in the 2019 fastMRI Challenge.
I will begin with a brief introduction to the problems that we face.
Magnetic Resonance Imaging, or MRI, works by acquiring radio-frequency signals in the Fourier domain. This information is arranged on a grid known as k-space. According to the Nyquist sampling theorem, a signal must be sampled at at least twice its maximum frequency to allow faithful reconstruction. However, MRI scans take a long time, and a major research goal in MRI is to reduce scanning times.
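For reference, the Nyquist criterion can be stated in one line, where f_s is the sampling rate and f_max is the highest frequency present in the signal:

```latex
% Nyquist sampling criterion: sample at least twice the maximum frequency.
f_s \ge 2 f_{\max}
```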
Because most of the information in an image is concentrated in the low-frequency region at the center of k-space, we can reduce the amount of sampling in the higher-frequency regions while retaining most of the information. We can thus reduce the acquisition time significantly with minimal loss of information. The image on the right shows 8-fold acceleration with 4% of k-space fully sampled in the low-frequency ACS (auto-calibration signal) region.
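As an illustration, here is a minimal sketch of such an undersampling mask (my own construction, loosely following the fastMRI random-mask convention rather than the official implementation), with 8-fold acceleration and a 4% fully sampled ACS center:

```python
import numpy as np

# Sketch of a 1D undersampling mask applied along the phase-encoding axis.
def make_mask(num_cols: int, acceleration: int = 8, center_fraction: float = 0.04,
              seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    num_low_freqs = int(round(num_cols * center_fraction))
    # Probability for the remaining columns so that the overall
    # sampling rate is 1 / acceleration.
    prob = (num_cols / acceleration - num_low_freqs) / (num_cols - num_low_freqs)
    mask = rng.uniform(size=num_cols) < prob
    pad = (num_cols - num_low_freqs) // 2
    mask[pad:pad + num_low_freqs] = True  # fully sample the ACS center.
    return mask

mask = make_mask(num_cols=368)  # e.g., 368 phase-encoding columns.
print(mask.mean())              # roughly 1/8 of columns are sampled.
```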
After acquiring the k-space signal, the inverse Fourier Transform is applied to create an image of the underlying object. This image will be a complex-valued image. We create a real-valued image from the magnitude values of this complex-valued image. Also, modern MRI machines have multiple radio-frequency coils, which create separate images with different sensitivities to different locations. A single image is formed by performing a root-sum-of-squares on the magnitude image of each coil.
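A minimal sketch of this classical pipeline (inverse FFT per coil, magnitude, then root-sum-of-squares), written in NumPy under my own conventions:

```python
import numpy as np

# `kspace` has shape (coils, height, width) with complex values.
def rss_reconstruction(kspace: np.ndarray) -> np.ndarray:
    # Centered inverse 2D FFT: shift, transform, shift back.
    coil_images = np.fft.fftshift(
        np.fft.ifft2(np.fft.ifftshift(kspace, axes=(-2, -1)), axes=(-2, -1)),
        axes=(-2, -1),
    )
    magnitudes = np.abs(coil_images)                 # discard the phase.
    return np.sqrt(np.sum(magnitudes ** 2, axis=0))  # combine coils (RSS).

kspace = np.random.randn(15, 640, 368) + 1j * np.random.randn(15, 640, 368)
image = rss_reconstruction(kspace)  # real-valued (640, 368) image.
```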
As you can see from the knee images, even with 8-fold acceleration of the high frequency regions, the general outline of the underlying object is still visible.
However, we wish to reconstruct an image suitable for use in medical diagnosis. To this end, we use deep learning to produce a high-quality image from the information latent within the under-sampled data.
However, due to the unique nature of MRI, several theoretical challenges arise that need to be solved.
First, k-space data lies in the signal domain, where each point contains information about the entire image.
Second, k-space consists of complex numbers, whereas almost all deep learning models use real numbers.
Third, modern MRI machines have multiple radio-frequency coils, each with different sensitivities to different locations.
Moreover, the number of coils and the location of each coil differ from device to device.
In the following section, I will discuss our solutions to these problems.
After much experimentation, we arrived at the following solutions.
First, we concatenated the coils in the channel axis.
Second, we used the magnitude images of each coil as the input.
Finally, we found that outputting a single-channel image for the final reconstruction produced far better results than producing an output for each coil separately.
Image domain reconstruction is the form used by the vast majority of deep learning image reconstruction models.
Over the last few years, through extensive experimentation, it has been proven to work on a wide array of images.
While this discards the phase information of the complex image, it solves the problem of handling complex numbers.
Image domain learning for MRI loses the phase information of the complex-valued image.
The images above show the magnitude and phase from a single coil in a slice in an MRI scan.
While this means that the images cannot be converted back into k-space to enforce k-space data consistency, we find that the magnitude images contain enough information to allow high-quality reconstruction.
Concatenating the coils in the channel axis allows the model to utilize the information from each coil while dramatically reducing the amount of computation compared to other methods. For example, concatenation in the batch axis would have entailed a 15-fold increase in memory and computation time on the fastMRI dataset. Also, the different coil sensitivities are available to the entire model all the time.
One limitation is that this requires all data to have the same number of coils, each with the same sensitivity information.
Channel concatenation was possible on the fastMRI knee dataset because it consists entirely of MRI scans with 15 coils.
Finally, we take the magnitude images from each coil and, instead of performing a root-sum-of-squares operation, input the coil images into a neural network.
The output is a single image that is compared to the ground-truth root-sum-of-squares image.
This method produces better results than separate generation of coil images.
We suspect that this is because this allows the model to learn the best method to combine the coil information.
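To make this input/output convention concrete, here is a minimal sketch; the single convolution is a hypothetical stand-in for the real network, and the L1 loss is only a placeholder:

```python
import torch
import torch.nn as nn

# The 15 coil magnitude images are concatenated along the channel axis,
# and the network outputs a single channel compared against the
# root-sum-of-squares (RSS) target.
model = nn.Conv2d(in_channels=15, out_channels=1, kernel_size=3, padding=1)

coil_magnitudes = torch.rand(4, 15, 320, 320)  # (batch, coils, H, W).
rss_target = torch.sqrt((coil_magnitudes ** 2).sum(dim=1, keepdim=True))

prediction = model(coil_magnitudes)                   # (4, 1, 320, 320).
loss = nn.functional.l1_loss(prediction, rss_target)  # placeholder loss.
```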
Our model is loosely based on the U-Net architecture but has a long chain of residual channel attention blocks, or RCABs, in the middle and multiple DenseBlocks for efficient feature extraction. The residual blocks were based on the EDSR residual block, which is optimized for image reconstruction.
No feature normalization layers were included in the model. This reduces computation and improves metric performance by preserving the original data distribution and allowing greater representational capacity.
The channel attention mechanism is identical to the Squeeze Excitation channel attention mechanism. Channel attention of this type has been shown to be very effective for both high-level tasks such as classification as well as low-level image denoising and super-resolution. We also found that it helped stabilize network training, which was very important because the lack of normalization layers caused training instability.
We termed our model “BarbellNet” due to its resemblance to a barbell.
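For illustration, here is a minimal sketch of a residual channel attention block under my own assumptions (the released BarbellNet code may differ): an EDSR-style residual block without normalization layers, followed by squeeze-and-excitation channel attention.

```python
import torch
import torch.nn as nn

class RCAB(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # EDSR-style residual body: conv-ReLU-conv, no normalization layers.
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # Squeeze-and-excitation: global pooling, bottleneck MLP, sigmoid gate.
        self.attention = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = self.body(x)
        return x + features * self.attention(features)  # residual connection.

block = RCAB(channels=64)
out = block(torch.rand(1, 64, 320, 320))  # same shape as the input.
```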
During training, we used structural similarity (SSIM) with kernel size 7 as the sole loss function as this was the primary metric of the fastMRI challenge.
Inputs were center cropped to 320x320 to reduce computation and create a uniform input size.
Learning rate decay was used to reduce the learning rate from 10^-4 to 10^-6.
Finally, all models were trained on a single GTX 1080Ti or RTX 2080Ti GPU.
These hardware requirements are within the reach of most researchers.
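A minimal sketch of this training configuration in PyTorch; the model and data are stand-ins, an L1 loss substitutes for the SSIM loss here, and the exact step-decay schedule is my assumption:

```python
import torch
import torch.nn as nn

model = nn.Conv2d(15, 1, kernel_size=3, padding=1)  # placeholder model.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.999))
# Step decay takes the learning rate from 1e-4 down to 1e-6 over training.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=40, gamma=0.1)

for epoch in range(100):
    inputs = torch.rand(4, 15, 320, 320)  # stand-in for one training batch.
    targets = torch.rand(4, 1, 320, 320)
    optimizer.zero_grad()
    loss = nn.functional.l1_loss(model(inputs), targets)
    loss.backward()
    optimizer.step()
    scheduler.step()
```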
Finally, I would like to mention some challenges in implementation.
The unique nature of k-space data poses not only theoretical challenges but also practical implementation challenges that make standard deep learning pipelines very inefficient. A naïve data pipeline will significantly under-utilize the GPU, and the causes of this under-utilization are not obvious. Because of this, approximately 60% of challenge contestants could not participate in the multi-coil challenge and entered only the single-coil track, which consists of synthetic data derived from the multi-coil raw data.
The biggest problem is the very large size of each MRI slice or volume. Reading data from HDD is very slow. This means that the CPU and GPU are idle most of the time, waiting for data to arrive.
The second issue is data pre-processing. The Fourier Transform is very computationally intensive and performing it on CPU on such a large amount of data will form a bottleneck, no matter how efficiently implemented.
To solve these problems, we offer these solutions.
First, to solve the problem of disk I/O, the simplest solution is to use an SSD storage device to store the data. Additionally, you can compress the data file to reduce the amount of raw data that must be read from disk, although this comes at the cost of increased CPU computation.
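A minimal sketch of re-saving raw k-space with HDF5's native gzip compression via h5py; the file and dataset names here are illustrative, not the fastMRI originals:

```python
import h5py
import numpy as np

kspace = (np.random.randn(15, 640, 368)
          + 1j * np.random.randn(15, 640, 368)).astype(np.complex64)

# Compression trades extra CPU work for less data read from disk.
with h5py.File('kspace_compressed.h5', 'w') as f:
    f.create_dataset(
        'kspace',
        data=kspace,
        chunks=(1, 640, 368),  # one chunk per coil enables partial reads.
        compression='gzip',
        compression_opts=4,    # moderate compression level.
    )

with h5py.File('kspace_compressed.h5', 'r') as f:
    slice_coil0 = f['kspace'][0]  # decompressed transparently on read.
```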
Second, perform Fourier Transforms on GPU, not CPU. There are many highly optimized Fourier Transform libraries on GPU.
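For example, here is a minimal sketch of a centered 2D inverse FFT on GPU using the torch.fft module available in recent PyTorch versions (our 2019 code used the older torch.fft API):

```python
import torch

def ifft2_centered(kspace: torch.Tensor) -> torch.Tensor:
    # Shift, transform, shift back so low frequencies sit at the center.
    kspace = torch.fft.ifftshift(kspace, dim=(-2, -1))
    image = torch.fft.ifft2(kspace, dim=(-2, -1), norm='ortho')
    return torch.fft.fftshift(image, dim=(-2, -1))

kspace = torch.randn(15, 640, 368, dtype=torch.complex64)
if torch.cuda.is_available():
    kspace = kspace.cuda()       # move data to GPU before the transform.
image = ifft2_centered(kspace)   # runs via cuFFT when the tensor is on CUDA.
```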
Finally, spread disk reading across multiple worker processes and overlap it with GPU computation. This will prevent the GPU from being idle for too long.
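A minimal sketch of this overlap in PyTorch, using a synthetic dataset as a stand-in for the real k-space files: worker processes read and preprocess in parallel, and pinned memory enables asynchronous host-to-device copies.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.rand(100, 15, 320, 320),
                        torch.rand(100, 1, 320, 320))
loader = DataLoader(
    dataset,
    batch_size=4,
    num_workers=4,    # parallel worker processes for reading/preprocessing.
    pin_memory=True,  # page-locked host memory for async GPU transfer.
)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
for inputs, targets in loader:
    # The copy can overlap with GPU computation because the source is pinned.
    inputs = inputs.to(device, non_blocking=True)
    targets = targets.to(device, non_blocking=True)
    # ... forward/backward pass here ...
```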
Finally, our results.
We have here a qualitative comparison between the ground truth and reconstruction images.
Compared to the inputs, they are much clearer and show no obvious artifacts.
One limitation is that there is a loss of detail in the horizontal direction, the direction of under-sampling.
The chart shows our results in more quantitative form. One can see that different accelerations and different acquisition methods have very different average metrics. 4-fold acceleration with 8% ACS sampling has superior results compared to 8-fold acceleration with 4% ACS sampling because the former has more signal in the input data.
PD, or Proton Density images have better metrics than PDFS, or Proton Density Fat Suppression because the fat suppression acquisition sequence increases the amount of noise, although it also reduces the visibility of fat.
Our methods were very close to those of competing teams in terms of metrics. Even compared to the first-place results, there was only a small difference in SSIM. Our results are therefore comparable to even state-of-the-art methods, despite using fewer computational resources.
This is the end of the session. Thank you for listening. Please contact us if you have any questions and have a good day.