Practical Points of Deep Learning
for Medical Imaging
Kyu-Hwan Jung, Ph.D
Co-founder and CTO, VUNO Inc.
Overview of Artificial Intelligence
and Its Application to Medical Imaging
Definition of Artificial Intelligence
• A machine (mechanical? biological?) that performs tasks the way humans do, or the field of study devoted to building such machines
–There is no clear consensus.
–Below is the definition of A.I. from John McCarthy, who first coined the term.
Paradigms of Artificial Intelligence
• From Knowledge-based Approach to Data-driven Approach
Artificial Intelligence: From Bust to Boom
• From Artificial Intelligence to Deep Learning
Brief History of Artificial Intelligence
Brief History of Neural Networks
Machine Learning
• "Field of study that gives computers the ability to learn without being explicitly programmed”
Toward Human-level Recognition Performance
• Deep Learning is Driving Recent Major Breakthroughs in Visual and Speech Recognition Tasks
Beyond Human-level Performance
• Now, Machines Beat Humans at Tasks Once Considered Impossible
5:0
vs Fan Hui
(Oct. 2015)
4:1
vs Lee Sedol
(Mar. 2016)
Beyond Human-level Performance
• Now, Machines Beat Humans at Tasks Once Considered Impossible
TPU Server
used against Lee Sedol
TPU Board
used against Ke Jie
Beyond Human-level Performance
• Now, Machines Beat Humans at Tasks Once Considered Impossible
Libratus (Jan 30, 2017) · DeepStack (Science, Mar 02, 2017)
Explaining Deep Learning in One Sentence
You could think of Deep Learning as the building of
learning machines, say pattern recognition systems or
whatever, by assembling lots of modules or elements
that all train the same way.
– Yann LeCun, IEEE Spectrum, Feb. 2015
Deep learning is a branch of machine learning based
on a set of algorithms that attempt to model high level
abstractions in data by using a deep graph with
multiple processing layers, composed of multiple linear
and non-linear transformations.
Brain-Inspired Learning
Convolutional Neural Networks
Elements of Convolutional Neural Networks
▪Local Connectivity
▪Parameter Sharing
▪Pooling/Subsampling
▪Nonlinearity
Convolution Layer
Pooling Layer
Activation Function
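A minimal sketch of how these four elements combine, written in PyTorch purely for illustration (the architecture, input size, and class count are assumptions, not a network from any study cited later):

```python
import torch
import torch.nn as nn

# Minimal CNN illustrating the elements listed above:
# local connectivity and parameter sharing (Conv2d), nonlinearity (ReLU),
# and pooling/subsampling (MaxPool2d).
class TinyCNN(nn.Module):
    def __init__(self, num_classes=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),  # convolution layer
            nn.ReLU(),                                   # activation function
            nn.MaxPool2d(2),                             # pooling layer
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):          # x: (N, 1, 32, 32) patches
        h = self.features(x)
        return self.classifier(h.flatten(1))

logits = TinyCNN()(torch.randn(4, 1, 32, 32))  # -> (4, 6) class scores
```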
Neural Network Architectures
Traditional Machine Learning vs Deep Learning
Hierarchical Learned Representation
From Yann LeCun and Zeiler (2013)
Feature Engineering vs Feature Learning
From Yann LeCun
Knowledge-driven Feature Engineering Data-driven Feature Learning
• Feature Learning instead of Conventional Feature Engineering Removes Barriers for Multi-modal
Studies and Data-driven Approaches in Medical Data Analysis
Feature Engineering vs Feature Learning
• Clinically-defined Features vs Data-driven Features for DILD Quantification in Chest CT
–Learned CNN features improve the classification of lung patches into 6 subtypes of DILD by a significant margin.
–Learned features are more robust in inter-scanner settings, where images are collected from different institutions or scanners.
–Presented at RSNA 2015
Feature Engineering vs Feature Learning
• Visualization of Hand-crafted Feature vs Learned Feature in 2D
Feature Engineering vs Feature Learning
• Clinically-defined Features vs Data-driven Features for Early Prediction of Arrhythmia using RNN
–The existing method uses multi-level feature extraction after ectopic-beat removal.
–Replacing the hand-crafted feature extraction steps with data-driven feature learning improved prediction accuracy by a significant margin.
Toward Fully Data-driven Medicine
• End-to-end Data-driven Workflow for Medical Research
http://tcr.amegroups.com/article/view/8705/html
End-to-end
Deep Learning for Medicine, Why Now?
Big Data · Computational Power · Algorithms
SPIE, 1993
Med. Phys. 1995
A.I. Medicine in Tech Keynotes
“So imagine that, soon every doctor
around the world just gonna have the
ability to snap a photo and as well as
the best doctors in the world be able
to diagnose your cancer. That’s gonna
save lives !”
- Mark Zuckerberg at F8 2016
“If there is one application where a lot
of very complicated, messy and
unstructured data is available, it is in the
field of medicine. And what better
application for deep learning than to
improve our health, improve life?”
- Jen-Hsun Huang, GTC 2016
Facebook F8, April 2016 · Nvidia GTC, March 2016 · Google I/O, May 2016
“It’s very very difficult to have highly
trained doctors available in many
parts of the world. Deep learning did
really good at detecting DR. We can
see the promise again, of using
machine learning.
- Sundar Pichai, Google IO 2016
AlphaGo in Medicine?
[Streams App for AKI Patient Management] [Medical Research]
A.I. Research on the Cover
Artificial Intelligence in Healthcare
A.I. for Medicine in Healthcare Investment
• Increasing investment of smart money in healthcare, especially A.I.-based imaging & diagnostics
Medical Imaging A.I. Startups by Applications
Source: Signify Research (2017)
• Number of Medical Imaging A.I. Startups Founded and Funding Volume by Quarter (2014 to 2017)
Medical Imaging A.I. Startups by Applications
Source: Signify Research (2017)
IBM Watson – Most Advanced Expert System
IBM Watson – Expert System with Perception
Common Challenges in Medical Data Analysis
using Deep Learning
Common Challenges
• Data Collection
–How many images do we need?
–What if we don’t have enough data?
–What if we don’t have enough annotations?
• Model Selection
–Do we really need ‘deep’ models?
–Are there any ‘off-the-shelf’ models?
–How can we incorporate context or priors into the models?
–Are there more trainer-friendly models?
• Result Interpretation
–Can we visually interpret the result?
–Can we obtain human-friendly interpretation?
Data
- How many images do we need?
- What if we don’t have enough data?
- What if we don’t have enough annotations?
Data – How Many Medical Images Do We Need?
• Exploratory Study Measuring the Effect of Training Data Size on Test Performance
–Predict the necessary training data size by extrapolating the performance vs. training-size curve using nonlinear least squares (sketched below).
–Not clinically meaningful, but it validates the common assumption about the performance/dataset-size trade-off.
J. Cho et al., arXiv, 2015
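A sketch of the extrapolation idea, assuming a saturating power-law learning curve and hypothetical accuracy measurements (the paper's exact functional form and numbers differ); SciPy's curve_fit performs the nonlinear least squares:

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical measured points: training-set sizes and test accuracies.
n = np.array([50, 100, 200, 400, 800, 1600], dtype=float)
acc = np.array([0.61, 0.68, 0.74, 0.79, 0.83, 0.86])

# Saturating power law acc(n) = a - b * n**(-c): one common choice of
# learning-curve model, not necessarily the paper's.
def learning_curve(n, a, b, c):
    return a - b * n ** (-c)

params, _ = curve_fit(learning_curve, n, acc, p0=(0.95, 1.0, 0.5), maxfev=10000)
a, b, c = params
print("predicted accuracy at n=10,000:", learning_curve(10000.0, a, b, c))

# Invert the curve to estimate the n needed for a target accuracy
# (the target must lie below the fitted asymptote a).
target = 0.90
n_needed = (b / (a - target)) ** (1.0 / c)
print("estimated training size for 90% accuracy:", int(n_needed))
```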
How Many Medical Images Do We Need?
• The Effect of Training Dataset Size and Number of Annotations on Fundus Image Classification
V. Gulshan et al., JAMA, 2016
How Many Medical Images Do We Need?
• Inter-observer Variability and Disagreement Are Significant
V. Gulshan et al., JAMA, 2016
How Many Medical Images Do We Need?
• Dermatologist-level Classification of Skin Cancer
–129,450 skin lesion images comprising 2,032 different diseases are used for training, and 1,942 biopsy-labelled images for testing.
–Data is collected from the ISIC Dermoscopic Archive, the Edinburgh Dermofit Library, and Stanford Hospital.
–Rotations of 0–359 degrees and flips are used for data augmentation.
A. Esteva et al., Nature, 2017
How Many Medical Images Do We Need?
• Detection of Cancer Metastases in Pathology Images
–Generated 299x299 patches from 270 slides, each roughly 10,000 x 10,000 pixels.
–Each slide contains 10,000 to 400,000 patches (median 90,000).
–But each tumor slide contains only 20 to 150,000 tumor patches (median 2,000), a class ratio from 0.01% to 70% (median 2%).
–Careful sampling strategy to reduce bias toward slides with more patches (see the sketch below): 1) select a class (normal or tumor), 2) select a slide uniformly at random, 3) select a patch randomly within it.
–To reduce class imbalance, several data augmentations are used: 1) rotation (90° x 4) and horizontal flip, 2) color perturbation (brightness, saturation, hue, contrast), 3) x/y offsets of up to 8 pixels.
–In total, 10^7 patches plus augmentation.
Y. Liu et al., 2017
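A sketch of the class-then-slide-then-patch sampling described above; the patch_index structure, names, and toy contents are illustrative assumptions:

```python
import random

# `patch_index[label][slide_id]` maps to a list of patch coordinates.
def sample_patch(patch_index):
    label = random.choice(["normal", "tumor"])           # 1) pick a class first
    slide_id = random.choice(list(patch_index[label]))   # 2) pick a slide uniformly
    coords = random.choice(patch_index[label][slide_id]) # 3) pick a patch within it
    return label, slide_id, coords

# Sampling slides uniformly (step 2) prevents slides with 400,000 patches
# from dominating slides with only 10,000 patches.
patch_index = {
    "normal": {"slide_01": [(0, 0), (299, 0)], "slide_02": [(10, 42)]},
    "tumor":  {"slide_03": [(512, 128)]},
}
print(sample_patch(patch_index))
```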
Data – What If We Don’t Have Enough Data?
• Data Augmentation for Effective Training-set Expansion
–In many cases, augmentation techniques used on natural images (flips, rotations, scale shifts, color shifts) do not make semantic sense for medical images.
–Physically plausible deformations or morphological transforms can be used in limited cases.
–There are more augmentation choices for texture classification problems; a sketch follows below.
H. R. Roth et al., MICCAI, 2015
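A sketch of label-preserving augmentations on a single patch tensor, in PyTorch; which operations are semantically valid depends on the modality, as noted above:

```python
import torch

# Augment a 2D patch tensor of shape (C, H, W). E.g. left-right flips may
# be fine for skin lesions but not for chest X-rays.
def augment(patch: torch.Tensor) -> torch.Tensor:
    k = int(torch.randint(0, 4, (1,)))              # random 90-degree rotation
    patch = torch.rot90(patch, k, dims=(1, 2))
    if torch.rand(1) < 0.5:                         # random horizontal flip
        patch = torch.flip(patch, dims=(2,))
    patch = patch + 0.02 * torch.randn_like(patch)  # mild intensity jitter
    return patch

aug = augment(torch.randn(1, 64, 64))
```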
Data – What If We Don’t Have Enough Data?
• Transfer Learning from Other Domains
–Compares off-the-shelf features vs. random initialization vs. initialization from transferred features.
–Initializing a deeper network with transferred features leads to better performance.
–A transferred network with ‘deep’ fine-tuning shows the best results (sketched below).
–Produced better results on both lymph node detection and polyp detection than networks with random initialization.
H. Shin et al., IEEE Trans. Medical Imaging, 2016; N. Tajbakhsh et al., IEEE Trans. Medical Imaging, 2016
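A sketch of transfer learning with fine-tuning, using torchvision's ResNet-18 purely as a stand-in (the cited papers evaluated other architectures, e.g. AlexNet and GoogLeNet):

```python
import torch.nn as nn
from torchvision import models

# "Deep" fine-tuning: start from ImageNet weights, replace the
# classification head, and leave all layers trainable on the medical task
# (here, a hypothetical 2-class problem).
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)

# "Shallow" fine-tuning variant: freeze early layers, train only the rest.
for name, p in model.named_parameters():
    if name.startswith(("conv1", "bn1", "layer1")):
        p.requires_grad = False
```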
Data – What If We Don’t Have Enough Annotations?
• Unsupervised Pre-training and Supervised Fine-tuning
–Stacked denoising auto-encoders are used for unsupervised pre-training on the input images.
–Sparse annotations are then used for supervised fine-tuning to improve prediction performance (a toy sketch follows below).
J. Cheng et al., Scientific Reports, 2016; H. Suk et al., MICCAI, 2013
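A toy sketch of the two-phase recipe, assuming flattened patches and placeholder sizes: unsupervised denoising auto-encoder pre-training, then reuse of the encoder for supervised fine-tuning:

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 64), nn.ReLU())
decoder = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784))

opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
x = torch.rand(32, 784)                    # stand-in for unlabeled patches
for _ in range(100):                       # unsupervised phase
    noisy = x + 0.3 * torch.randn_like(x)  # corrupt the input
    loss = nn.functional.mse_loss(decoder(encoder(noisy)), x)  # reconstruct clean x
    opt.zero_grad(); loss.backward(); opt.step()

# Supervised phase: attach a small head and fine-tune on the sparse labels.
classifier = nn.Sequential(encoder, nn.Linear(64, 2))
```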
Data – What If We Don’t Have Enough Annotations?
• Weakly and Semi-supervised Semantic Segmentation for Lung Disease Detection
–A semantic segmentation network is trained with very limited strong lesion annotations and abundant weak diagnostic labels.
–By sharing the feature extractor across tasks, disease classification with lesion localization can be obtained.
–But training the network was tricky: we pre-trained the classifier and performed semantic segmentation with a skip-connected ASPP network.
–Slight improvement in segmentation performance by exploiting the weak label (cancer) on top of the strong labels.
S. Hong et al., arXiv:1512.07928, 2015
Data – What If We Don’t Have Enough Annotations?
• Medical Image Annotation Tool
–Provide the right tools for higher-quality annotation.
–Quality monitoring and control functionality is crucial for reducing trial and error.
Model
- Do we really need ‘deep’ models?
- Are there any ‘off-the-shelf’ models?
- How can we incorporate context or priors into the model?
Model – Do We Really Need Deep Models?
• Surpassing human-level performance in medical imaging
–Detection of diabetic retinopathy in fundoscopy
V. Gulshan et.al., JAMA, 2016
Sensitivity 96.7% / specificity 84.0%; sensitivity 90.7% / specificity 93.8%; AUROC 97.4%
Model – Do We Really Need Deep Models?
• Surpassing Human-level Performance in Medical Imaging Diagnosis
–Classification of skin cancer
A. Esteva et. al, Nature 2017
Model – Do We Really Need Deep Models?
• Detection of Cancer Metastases on Pathology Image
–State-of-the-art sensitivity at 8 false positives
Y. Liu et. al. 2017
Model – Do We Really Need Deep Models?
• Increased Performance with Deeper Networks
–Deeper models learn more discriminative features for better classification performance.
Shin et al. (2016); Jung et al. (2015)
Model – Are There Any Off-the-shelf Models?
• U-net for Biomedical Image Segmentation
–Winner of various image segmentation challenges
–Shows stable performance even with few annotated images (a minimal sketch of the defining skip connection follows below)
O. Ronneberger et al., MICCAI, 2015
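A two-level U-Net-style sketch showing the defining encoder-decoder skip connection; the real U-net is deeper with more channels, but the pattern is the same:

```python
import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU())

class TinyUNet(nn.Module):
    def __init__(self, classes=2):
        super().__init__()
        self.enc1, self.enc2 = block(1, 16), block(16, 32)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec1 = block(32, 16)              # 32 = 16 (skip) + 16 (upsampled)
        self.head = nn.Conv2d(16, classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)                      # full resolution
        e2 = self.enc2(self.pool(e1))          # half resolution
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))  # skip connection
        return self.head(d1)                   # per-pixel class logits

logits = TinyUNet()(torch.randn(1, 1, 64, 64))  # -> (1, 2, 64, 64)
```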
Model – Are There Any Off-the-shelf Models?
• V-net for Volumetric Biomedical Image Segmentation
–Extension of U-net to 3D volumetric medical images such as CT and MRI
–The input of each stage is added to the output of its convolutional layers, so every stage learns a residual function (sketched below)
F. Milletari et al., 2016
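A sketch of a single V-net-style residual stage on a 3D volume (V-net uses 5x5x5 convolutions and PReLU; the channel count here is arbitrary):

```python
import torch
import torch.nn as nn

class ResidualStage3D(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv3d(channels, channels, 5, padding=2), nn.PReLU(),
            nn.Conv3d(channels, channels, 5, padding=2), nn.PReLU(),
        )

    def forward(self, x):
        return x + self.convs(x)  # stage input added to conv output: residual

out = ResidualStage3D(16)(torch.randn(1, 16, 32, 64, 64))  # (N, C, D, H, W)
```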
Model – Are There Any Off-the-shelf Models?
• Inception-V3 Network for Surpassing Human Experts in Multiple Medical Imaging Tasks
–Detection of diabetic retinopathy
–Detection of skin cancer
–Detection of tumor in histopathology image
Model – How Can We Incorporate Context Information?
• Location-sensitive CNN for the Segmentation of White Matter Hyperintensities (WMH)
–Explicit spatial location features (fused with CNN features as sketched below):
• (x, y, z) coordinates
• In-plane distances from the left ventricle, right ventricle, brain cortex, and midsagittal brain surface
• Prior probability of WMH at that location
–Comparison of Single Scale (SS), Multi-scale Early Fusion (MSEF), Multi-scale Late Fusion with Independent Weights (MSIW), and Multi-scale Late Fusion with Weight Sharing (MSWS)
M. Ghafoorian et al., 2016
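A sketch of fusing explicit location features with CNN features by concatenation before the fully-connected layers; all shapes and the number of location features are illustrative assumptions:

```python
import torch
import torch.nn as nn

class LocationSensitiveNet(nn.Module):
    def __init__(self, n_loc_features=8):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
        )
        self.head = nn.Sequential(
            nn.Linear(16 * 16 * 16 + n_loc_features, 64), nn.ReLU(),
            nn.Linear(64, 2),
        )

    def forward(self, patch, loc):  # patch: (N, 1, 32, 32), loc: (N, 8)
        # Concatenate appearance features with the location vector
        # (coordinates, landmark distances, WMH prior, ...).
        return self.head(torch.cat([self.cnn(patch), loc], dim=1))

logits = LocationSensitiveNet()(torch.randn(4, 1, 32, 32), torch.randn(4, 8))
```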
Model – How Can We Incorporate Context Information?
• DeepLung for Semantic Lung Segmentation
–A convolutional neural network is trained to semantically segment the parenchymal part of the lung in HRCT.
–High-resolution feature maps with ‘atrous’ (dilated) convolution layers are used to improve segmentation performance (see the sketch below).
–Spatial context information is used to better capture the anatomical structure of the lungs and other organs.
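A sketch of what an atrous (dilated) convolution does: the dilated 3x3 kernel covers a 9x9 receptive field with no extra parameters and no loss of feature-map resolution:

```python
import torch
import torch.nn as nn

# Dilation inserts gaps between kernel taps, enlarging the receptive field
# without downsampling, so feature maps stay at high resolution.
x = torch.randn(1, 16, 128, 128)
standard = nn.Conv2d(16, 16, kernel_size=3, padding=1)            # 3x3 field
atrous = nn.Conv2d(16, 16, kernel_size=3, padding=4, dilation=4)  # 9x9 field
print(standard(x).shape, atrous(x).shape)  # both keep 128x128 resolution
```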
Model – How Can We Incorporate Context Information?
• DeepLung for Semantic Lung Segmentation
–Improved segmentation performance using spatial information and high-resolution feature maps
–Ablation axes: spatial context information, curriculum learning, model selection
Model – How Can We Incorporate Context Information?
• DeepLung for Semantic Lung Segmentation
–Performance is further improved with a fully-connected conditional random field (Krähenbühl & Koltun, NIPS 2011).
Model – How Can We Incorporate Context Information?
• DeepLung for Semantic Lung Segmentation
–Clinical validation on totally unseen cases from different scanners and acquisition parameters: “vendor agnostic.”
–When spatial context information is used, we obtain better segmentation results in the lower part of the sequence.
Model – Are There More Trainer-friendly Models?
• Brain Lesion Detection using a Generative Adversarial Network
–Detects lesions in multi-modal brain images using a patch-wise classifier trained with a GAN (a toy sketch follows below).
–The generator produces fake non-lesion patches while the discriminator distinguishes real patches from the fake non-lesion patches.
–At inference, the discriminator is expected to output lower values for lesion patches than for non-lesion patches.
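A toy sketch of the idea, assuming patches flattened to vectors and placeholder networks: train the GAN on non-lesion data only, then use the discriminator's score as a normality measure at test time:

```python
import torch
import torch.nn as nn

D = nn.Sequential(nn.Linear(256, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1))
G = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 256))
bce = nn.BCEWithLogitsLoss()
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)

for _ in range(200):
    real = torch.randn(32, 256)               # stand-in for non-lesion patches
    fake = G(torch.randn(32, 64))
    # Discriminator: real non-lesion patches -> 1, generated patches -> 0.
    loss_d = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator: try to fool the discriminator.
    loss_g = bce(D(fake), torch.ones(32, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

# At inference, lesion patches should receive lower D scores than non-lesion
# patches, so thresholding D(x) flags candidate lesions.
score = torch.sigmoid(D(torch.randn(1, 256)))
```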
Model – Are There More Trainer-friendly Models?
• Detection of Aggressive Prostate Cancer
–Detects prostate cancer using semantic segmentation with a generative adversarial objective (sketched below).
–Instead of the generator in the original GAN, a segmentor is used to produce pixel-level lesion detections.
–Instead of a pixel-wise cross-entropy loss, an adversarial loss from the segmentor and discriminator is used for training.
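A sketch of swapping the pixel-wise loss for an adversarial one: a critic compares the image masked by the predicted segmentation against the image masked by the ground truth, and the segmentor minimizes the distance between the critic's responses. This is one common form of the objective; the paper's exact loss may differ, and S and C are placeholder networks:

```python
import torch
import torch.nn as nn

S = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(8, 1, 1), nn.Sigmoid())        # segmentor: soft mask
C = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                  nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1))  # critic

img = torch.randn(4, 1, 64, 64)
gt = (torch.rand(4, 1, 64, 64) > 0.5).float()              # ground-truth mask

pred = S(img)
# Critic responses on (image x predicted mask) vs (image x true mask); the
# segmentor is trained to make the two indistinguishable instead of
# minimizing per-pixel cross-entropy.
loss_s = nn.functional.l1_loss(C(img * pred), C(img * gt))
```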
Result
- Can we visually interpret the result?
- Can we obtain human-friendly interpretation?
Result – Can We Visually Interpret the Result?
• Class Activation Maps for Visualizing Salient Regions in the Image (computed as sketched below)
B. Zhou et al., CVPR, 2016 (examples shown for both object and action classes)
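A sketch of the CAM computation itself, for any network ending in global average pooling plus a linear classifier; the feature maps and weights below are random placeholders:

```python
import torch

feat = torch.randn(1, 32, 8, 8)  # last conv feature maps (N, C, H, W)
fc_weight = torch.randn(10, 32)  # classifier weights (classes, C)

def class_activation_map(feat, fc_weight, c):
    # CAM for class c: feature maps weighted by that class's classifier
    # weights, summed over channels -> (H, W) heatmap.
    cam = (fc_weight[c][:, None, None] * feat[0]).sum(dim=0)
    cam = cam - cam.min()
    return cam / (cam.max() + 1e-8)  # normalize to [0, 1] for display

heatmap = class_activation_map(feat, fc_weight, c=3)  # upsample to image size
```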
Result – Can We Visually Interpret the Result?
• Evidence Hotspots for Lesion Visualization
–Radiological score prediction together with suggestion of pathological evidence regions.
–Jointly learns multiple grading systems and produces evidence for its predictions.
–For training, disc volumes and their corresponding multiple labels are used as input, and a multi-class classification network is trained with a class-balanced loss.
–A ‘saliency map’ approach is used to produce the evidence hotspots (a minimal sketch follows below).
A. Jamaludin et al., MICCAI, 2016
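A minimal sketch of a gradient-based saliency map with a stand-in linear model: the gradient of the top class score with respect to the input highlights the evidence pixels:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 3))  # stand-in network
x = torch.randn(1, 1, 64, 64, requires_grad=True)

score = model(x)[0].max()          # score of the top predicted grade
score.backward()                   # d(score)/d(input)
saliency = x.grad.abs().squeeze()  # (64, 64) evidence hotspot for display
```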
Result – Can We Visually Interpret the Result?
• Bone Age Assessment from Hand-bone X-ray
–Visualization of the salient regions in the bone X-ray image
H. Lee et al., JDI, 2017
Result – Can We Visually Interpret the Result?
• Open-source Visualization Tool
–Picasso (https://github.com/merantix/Picasso)
Recent Topics
- Automatic Translation of Medical Image to Text or Image
Result – Can We Get Clinician-friendly Interpretation?
• Learning to Read Chest X-ray
–Automated X-ray annotation with a recurrent neural cascade model; a minimal CNN-to-RNN sketch follows below.
H. Shin et al., CVPR, 2016
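A minimal CNN-to-RNN cascade sketch for token-by-token annotation generation under teacher forcing; the vocabulary size, dimensions, and networks are illustrative assumptions, not the paper's model:

```python
import torch
import torch.nn as nn

# Image features from a CNN initialize a recurrent decoder that emits
# annotation tokens one by one.
cnn = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 128))
embed = nn.Embedding(5000, 128)          # 5000-token vocabulary (assumed)
rnn = nn.LSTM(128, 128, batch_first=True)
vocab_head = nn.Linear(128, 5000)

img = torch.randn(2, 1, 224, 224)
tokens = torch.randint(0, 5000, (2, 12))  # report tokens (teacher forcing)

h0 = cnn(img).unsqueeze(0)                # image feature as initial hidden state
out, _ = rnn(embed(tokens), (h0, torch.zeros_like(h0)))
logits = vocab_head(out)                  # (2, 12, 5000) next-token scores
```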
Result – Generation of Realistic Medical Images
–CT image synthesis from MRI (D. Nie et al., 2016)
–Decomposition of X-ray images (S. Albarqouni et al., 2016)
Result – Generation of Realistic Medical Images
• Image-to-image Translation without a Paired Dataset
–Unpaired image-to-image translation (sketched below) has great potential for medical imaging tasks such as segmentation, registration, decomposition, modality transfer, and so on.
J.-Y. Zhu et al., arXiv, 2017
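A sketch of the cycle-consistency term that makes unpaired training possible: translate A→B→A and require the round trip to reconstruct the input. The two translators below are placeholder networks, and the adversarial terms of full CycleGAN are omitted:

```python
import torch
import torch.nn as nn

G_ab = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 1, 3, padding=1))
G_ba = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 1, 3, padding=1))

a = torch.randn(2, 1, 64, 64)  # e.g. MR slices
b = torch.randn(2, 1, 64, 64)  # e.g. unpaired CT slices

# Round trips A->B->A and B->A->B must reconstruct the inputs, so no
# paired examples are needed.
cycle_loss = (nn.functional.l1_loss(G_ba(G_ab(a)), a) +
              nn.functional.l1_loss(G_ab(G_ba(b)), b))
# Full CycleGAN combines this term with adversarial losses on G_ab(a)
# and G_ba(b) from two discriminators.
```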
Conclusion and Future Directions
Conclusion
• Deep learning-based medical image analysis has shown promising results for data-driven medicine.
• By adopting recent progress in deep learning, many challenges in data-driven medical image analysis
have been overcome.
• Deep learning has the potential to improve the accuracy and sensitivity of image analysis tools and
will accelerate innovation and new product launches.
Future Directions in Medical Imaging
• Further studies to incorporate clinical knowledge into data-driven models.
• More studies on the application of recent advances in unsupervised and reinforcement learning to
medical image analysis.
• Studies on higher-dimensional (3D, 4D, or even higher) medical image analysis.
• However, the greatest market impact in the short-term will be from cognitive workflow solutions that
enhance radiologist productivity.
• Diagnostic decision support solutions are close to commercialization, but several market barriers need
to be overcome, e.g. regulatory clearance, legal implications and resistance from clinicians.
• A.I. will “Augment”, not “Replace” Physicians. Radiologists become “Physicians of Physicians”.
Thank You!
khwan.jung@vuno.co
kyuhwanjung@gmail.com
