CONVOLUTIONAL DEEP LEARNING
NEURAL NETWORK FOR RADIOLOGICAL
CHEST X-RAY DIAGNOSIS
ISDS 577 Capstone Seminar
Guided by Professor Daniel Soper
Team Members: Skandha Chinta,
Scott Cunningham, Apurva Desai,
Saket Dhamne
CAN A CONVOLUTIONAL NEURAL NETWORK BE USED
TO DETERMINE IF A PATIENT X-RAY HAS A
DIAGNOSABLE CONDITION?
One-Tailed T-Test:
Is the mean grayscale value of an X-ray with
a condition present greater than the mean
grayscale value of an X-ray with no condition?
Image Classification (Deep Learning CNN):
Identify whether an X-Ray has a diagnosable
condition with reasonable accuracy.
Research on identifying specific types of cancer
DATASET FROM NATIONAL INSTITUTES
OF HEALTH
1. 108,948 X-ray images from 32,717 unique patients, all from one hospital.
2. 14 disease-category image findings, plus "normal".
3. Made publicly available in September 2017 for research purposes.
NEURAL NETWORKS
o Assigns weights to input values and runs them
through a ‘black box’ of interconnected nodes
to determine which should receive the
greatest weights for a given output set.
o Several types of neural networks exist for
different applications.
o Attributes of Convolutional Neural
Networks:
• Difficult to train
• Very good at image classification
GOOGLE COLABORATORY
Google’s online notebook environment
(much like Jupyter)
Free GPU acceleration
Vastly reduces processing time for workloads
with millions of calculations
Runs Python
DATA PRE-PROCESSING

Process                                                      | Reason
Width x Height matching                                      | Consistency; aspect ratio is off
AP labeling                                                  | AP is preferred over PA
Unique patients                                              | To avoid bias
Latest patient images                                        | To avoid bias
Removing unusual and inaccurate X-rays                       | Data reduction
(e.g. you guessed it right, a PELVIS!)                       |
PRE-PROCESSING: T-TEST
• Need to prepare data for T-Test:
• Transform images to mean
grayscale values
• Per image and per cancer type
• Separate into cancer types
• Check outliers, skewness, kurtosis
• Use Histograms, Boxplots and IQR
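The preparation above can be sketched in Python: a per-image mean grayscale value, then the 1.5 × IQR rule that yields outlier bounds of the form Q1 − 1.5·IQR and Q3 + 1.5·IQR (this is the rule consistent with the bounds on the next slide). The random arrays below are stand-ins for decoded X-ray pixel data, not the project's images.

```python
import numpy as np

def mean_grayscale(image):
    """Mean grayscale value of one image (2-D array of pixel intensities)."""
    return float(np.mean(image))

def iqr_bounds(values):
    """IQR outlier bounds: (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# toy stand-ins for decoded X-ray images
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(64, 64)) for _ in range(100)]

means = [mean_grayscale(img) for img in images]
lo, hi = iqr_bounds(means)
outliers = [m for m in means if m < lo or m > hi]
```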
CHECK ON OUTLIERS, KURTOSIS, SKEWNESS

Statistic            | No_Findings  | Findings
Quartile 1           | 112.5463085  | 111.5376172
Quartile 3           | 130.4091029  | 128.6265164
Inter-Quartile Range | 17.8627944   | 17.08889915
Upper bound          | 157.2032945  | 154.2598651
Lower bound          | 85.75211692  | 85.90426848
PRE-PROCESSING: NEURAL NETWORK
Convert all images from
.png to .jpg
Photoshop image filters to
emphasize disease
characteristics
• Crop, Median filter, Smart
Sharpen (edges)
• Exported images in only 2
colors (black & white)
Separated images into
Training (70%) and Testing
(30%)
• Within each divided by
Findings & No Findings
• Compatibility with
neural network
code
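The 70/30 split above can be sketched as follows, assuming the images have already been separated by label; the filenames are hypothetical stand-ins.

```python
import random

def split_images(filenames, train_frac=0.70, seed=42):
    """Shuffle and split image filenames into training and testing sets."""
    files = list(filenames)
    random.Random(seed).shuffle(files)
    cut = int(len(files) * train_frac)
    return files[:cut], files[cut:]

# hypothetical filenames for one class (repeat per Findings / No Findings)
findings = [f"findings_{i:05d}.jpg" for i in range(1000)]
train, test = split_images(findings)   # 700 training, 300 testing
```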
T-TEST ON THE BASIS OF VISUAL PERCEPTION
(Grayscale images of a normal lung and a diseased lung shown on slide.)
F-TEST TO PERFORM THE CORRECT T-TEST
• Comparison of variances
• If the F-test value is approximately equal
to the F critical value (i.e. the variances
do not differ significantly), we choose the
t-test for two samples assuming equal
variances.
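The variance comparison can be sketched with SciPy (an assumption; the slides do not name the tooling used for the statistical tests). If the F statistic does not exceed the critical value, the equal-variance t-test is appropriate.

```python
import numpy as np
from scipy import stats

def f_test_equal_variance(a, b, alpha=0.05):
    """Two-sample F-test for equality of variances.

    Returns the F statistic, the upper critical value, and whether the
    equal-variance assumption is reasonable at the given alpha.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # convention: larger sample variance goes in the numerator
    if np.var(a, ddof=1) < np.var(b, ddof=1):
        a, b = b, a
    f_stat = np.var(a, ddof=1) / np.var(b, ddof=1)
    f_crit = stats.f.ppf(1 - alpha, len(a) - 1, len(b) - 1)
    return float(f_stat), float(f_crit), bool(f_stat <= f_crit)
```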
T-TEST HYPOTHESIS:
• Null hypothesis, based on
descriptive statistics:
H0 : μ_gNF ≥ μ_gF
• Alternate hypothesis, based on
visual perception:
HA : μ_gNF < μ_gF
(μ_gNF, μ_gF: mean grayscale values of
No-Findings and Findings X-rays)
T-TEST RESULTS
• Reject the null hypothesis at the
1% significance level.
• The mean grayscale value of a diseased lung
X-ray is greater than that of a normal lung
X-ray.
• Although the t-test is a statistical test
distinct from image-classification
techniques, it can be considered a basis
for visual perception.
NEURAL NETWORK
Neural Network architecture for image processing
using Keras library
STEP 1: BUILDING A CONVOLUTIONAL LAYER
• Number of convolutional layers: 5
• Conv2D function:
• Number of input filters varies with
the convolutional layer
• Kernel size
• Activation function = ‘ReLU’
(Rectified Linear Unit)
STEP 2: POOLING
• A pooling layer aggregates information
within a small region of the input
features and then down-samples the result
Max Pooling, Pool Size
Strides
Padding
Dropout Layer
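What max pooling does, in a plain-NumPy sketch (Keras' MaxPooling2D layer performs the same operation): each non-overlapping window keeps only its maximum, down-sampling the feature map.

```python
import numpy as np

def max_pool_2d(feature_map, pool_size=2, stride=2):
    """Max pooling: keep the largest value in each window,
    down-sampling the feature map ('valid' padding, no overlap)."""
    h, w = feature_map.shape
    out_h = (h - pool_size) // stride + 1
    out_w = (w - pool_size) // stride + 1
    out = np.empty((out_h, out_w), feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            window = feature_map[i * stride:i * stride + pool_size,
                                 j * stride:j * stride + pool_size]
            out[i, j] = window.max()
    return out

fm = np.array([[1, 3, 2, 0],
               [4, 6, 5, 1],
               [7, 2, 8, 3],
               [0, 9, 4, 4]])
max_pool_2d(fm)   # → [[6, 5], [9, 8]]
```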
STEP 3: FLATTENING
• What is Flattening?
• Purpose of
Flattening
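Flattening reshapes the stack of 2-D feature maps into a 1-D vector so a fully connected layer can consume it. The 7×7×64 shape below is an illustrative assumption, not the project's actual feature-map size.

```python
import numpy as np

# a stack of pooled feature maps: 7x7 spatial grid, 64 channels (hypothetical)
feature_maps = np.zeros((7, 7, 64))

# flattening turns the 3-D volume into a 1-D vector of 7 * 7 * 64 values,
# the form a fully connected (Dense) layer expects
flat = feature_maps.reshape(-1)
flat.shape   # → (3136,)
```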
STEP 4: FULL CONNECTION LAYER & OUTPUT LAYER
Purpose: to create a fully connected layer
The fully connected layer is the hidden layer
Number of nodes in the hidden layer
Activation function for hidden layer: ReLU
Activation function for output layer: Sigmoid
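Steps 1–4 can be sketched with Keras as below. The filter counts, kernel sizes, input size, dropout rate, and hidden-layer width are illustrative assumptions, not the team's exact settings; only the structure (5 convolutional layers, pooling, dropout, flatten, ReLU hidden layer, sigmoid output) follows the slides.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(128, 128, 1)),            # black & white X-rays
    # Steps 1-2 repeated: 5 convolutional layers with pooling
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Conv2D(128, (3, 3), activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Conv2D(128, (3, 3), activation="relu"),
    layers.Dropout(0.25),
    # Step 3: flatten feature maps to a vector
    layers.Flatten(),
    # Step 4: fully connected hidden layer and sigmoid output
    layers.Dense(128, activation="relu"),
    layers.Dense(1, activation="sigmoid"),       # Findings vs No Findings
])
```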
NEURAL NETWORKS: OTHER HYPERPARAMETERS
“Adam” optimizer
Binary cross-entropy loss
Evaluation metrics: accuracy & loss
Batch size: 32
Epochs: 20
Steps per epoch: 100
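A minimal compile-and-fit sketch with the hyperparameters listed above. The tiny stand-in model and random data exist only to make the snippet self-contained; the project trained its CNN on image generators with epochs=20 and steps_per_epoch=100.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# stand-in model; the real architecture is described in Steps 1-4
model = keras.Sequential([
    keras.Input(shape=(16,)),
    layers.Dense(8, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

model.compile(optimizer="adam",                # "Adam" optimizer
              loss="binary_crossentropy",      # binary cross-entropy
              metrics=["accuracy"])            # accuracy (loss always tracked)

x = np.random.rand(64, 16)
y = np.random.randint(0, 2, size=(64, 1))
# project settings: batch_size=32, epochs=20, steps_per_epoch=100
history = model.fit(x, y, batch_size=32, epochs=2, verbose=0)
```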
METRICS & RESULTS: GRAPHICAL REPRESENTATION
(Accuracy and loss curves shown on slides.)
RESULTS OF CONVOLUTIONAL NETWORK MODEL

Model Name                               | Validation Loss | Validation Accuracy
Convolutional Network Model (Version 55) | 0.66            | 63.13%
PRE-TRAINED MODELS
• Different types of available pre-trained models:
• ResNet50 Model
• MobileNet Model
• Comparison of different models:

Model Name                               | Validation Loss | Validation Accuracy
Convolutional Network Model (Version 55) | 0.66            | 63.13%
ResNet50 Model                           | 7.85            | 51.32%
MobileNet Model                          | 6.82            | 47.50%
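A transfer-learning sketch of how a pre-trained backbone such as ResNet50 can be adapted to the binary task; this is an assumption about the setup, not the team's exact code. `weights=None` keeps the sketch light to build, whereas a genuinely pre-trained run would pass `weights="imagenet"`; MobileNet swaps in via `keras.applications.MobileNet`.

```python
from tensorflow import keras
from tensorflow.keras import layers

# ResNet50 backbone as a frozen feature extractor
# (use weights="imagenet" for actual pre-trained weights)
base = keras.applications.ResNet50(weights=None, include_top=False,
                                   input_shape=(224, 224, 3), pooling="avg")
base.trainable = False

model = keras.Sequential([
    base,
    layers.Dense(1, activation="sigmoid"),   # Findings vs No Findings head
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```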
HEATMAP FOR TRAINED MODEL
Neuron activation is concentrated in the darker pixels
CONCLUSION: MODEL EVALUATION
• Validation Loss
• As close to 0 as possible
• Validation Accuracy
• As a percentage
• CNN Model outperformed pre-trained models: ResNet &
MobileNet
• Best image classification accuracy until 2012 was ~73.8%
FUTURE RESEARCH
• With continued tuning of hyperparameters and image pre-processing,
radiological tools can be developed to identify and flag abnormal
X-rays for medical attention.
• For future research, we want to train the model to classify by cancer
type.
• Use heatmaps to identify neuron activation for different lesion
types.
THANK YOU


Editor's Notes

  • #3: Speak about the problem statement of the project.
  • #27: The ~73.8% accuracy mentioned was for a different dataset. Tuning is a never-ending process. Metrics: decreased the validation loss from around 8 to close to 0.66, and increased the validation accuracy from approximately 51% to 64% by changing the hyperparameters.