2. What and How?
Business Problem:
● Surgery inevitably brings discomfort, and oftentimes involves significant post-surgical pain. Currently, patient pain is frequently
managed through the use of narcotics that bring a bevy of unwanted side effects
● An alternative is to place indwelling catheters that block or mitigate pain at the source
● Accurately identifying nerve structures in ultrasound images is a critical step in effectively inserting a patient’s pain management
catheter
● Doing so would improve catheter placement and contribute to a more pain-free future
Objective: The goal is to predict which pixels of an ultrasound image contain the brachial plexus
Data:
Training Images: 5,635 (as 420x580 .tif files)
Training Image Masks: 5,635 (as 420x580 .tif files)
Test Images: 5,508 (as 420x580 .tif files)
Pixels in each image belong to one of two possible classes – Nerve, Not-Nerve. (Figure: example training image and its corresponding mask.)
So, of the 243,600 pixels in every image of the test data, the result set must consist of all the pixels that belong to the
brachial plexus nerve.
3. Evaluation
This competition is evaluated on the mean Dice coefficient. The Dice coefficient can be used to compare the pixel-wise agreement between
a predicted segmentation and its corresponding ground truth. The formula is given by:
Dice = 2 |X ∩ Y| / (|X| + |Y|)
where X is the predicted set of pixels and Y is the ground truth. The Dice coefficient is defined to be 1 when both X and Y are empty.
The leaderboard score is the mean of the Dice coefficients for each image in the test set.
(https://www.kaggle.com/c/ultrasound-nerve-segmentation)
Understanding Dice coefficient:
For every image i, let yi = <yi1, yi2, yi3, ..., yin> be the binary vector identifying the ground-truth mask for image i,
and ŷi = <ŷi1, ŷi2, ŷi3, ..., ŷin> be the predicted binary vector.
The Dice coefficient for image i is then 2·Σj yij·ŷij / (Σj yij + Σj ŷij).
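A minimal sketch of this metric as it is typically implemented for training in Keras; the backend import and the small smoothing term (which keeps the coefficient defined when both masks are empty) are assumptions rather than details from the source:

from keras import backend as K

smooth = 1.0  # assumed smoothing term; keeps the metric defined when both masks are empty

def dice_coef(y_true, y_pred):
    # Flatten both masks and compute 2*|X intersect Y| / (|X| + |Y|)
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2.0 * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

def dice_coef_loss(y_true, y_pred):
    # Negative Dice, so that minimizing the loss maximizes the coefficient
    return -dice_coef(y_true, y_pred)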
4. U-Net and Data Augmentation
A message that I hear often is that "deep learning is only relevant when you have a huge amount of data". While not entirely incorrect,
this is somewhat misleading. Certainly, deep learning requires the ability to learn features automatically from the data, which is generally
only possible when lots of training data is available.
In order to make the most of our few training examples, we "augment" them via a number of random transformations, so that our
model never sees the exact same picture twice. This helps prevent overfitting and helps the model generalize better.
An improvement over a fully connected network, the U-Net model has a
contracting and an expanding path.
While upsampling, there are a large number of feature channels, which
allow the network to propagate context information to higher resolution
layers.
5. Input Reading and Preprocessing
Step 1: The images and the image masks are read, converted into NumPy ndarrays, and stored in .npy files.
Step 2: The original images are high resolution and took a long time to process, so they were resized.
Two sizes were tested: 64x96 and 64x80 (a preprocessing sketch follows).
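A hedged sketch of these two steps; the directory layout, file-name pattern, and use of scikit-image are assumptions (the competition masks share the image name with a "_mask" suffix):

import os
import numpy as np
from skimage.io import imread
from skimage.transform import resize

img_rows, img_cols = 64, 80  # one of the two tested sizes (64x96 was the other)

def create_train_npy(data_path='train'):
    # Step 1: read each image and its mask into NumPy arrays
    names = [f for f in os.listdir(data_path) if f.endswith('.tif') and 'mask' not in f]
    imgs, masks = [], []
    for name in names:
        img = imread(os.path.join(data_path, name), as_gray=True)
        mask = imread(os.path.join(data_path, name.replace('.tif', '_mask.tif')), as_gray=True)
        # Step 2: the raw 420x580 images are slow to process, so resize them
        imgs.append(resize(img, (img_rows, img_cols), preserve_range=True))
        masks.append(resize(mask, (img_rows, img_cols), preserve_range=True))
    # Store as .npy files with an explicit channel dimension
    np.save('imgs_train.npy', np.array(imgs)[..., np.newaxis])
    np.save('imgs_mask_train.npy', np.array(masks)[..., np.newaxis])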
6. Model 1 - Basic MLP Structure
● MLP Structure (a Keras sketch follows the results figure):
○ 1 Input Layer - flattened 64x80 layer with 5,120 neurons
○ 1 Hidden Layer - 512 neurons
○ 1 Output Layer - flattened 64x80 layer with 5,120 neurons
● It was clear this model was not doing well, and there was a spike where the Dice coefficient shot up to more than 1. Introducing dropout did not make any difference.
Model 1 Results
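A minimal Keras sketch of this MLP, reusing the Dice functions sketched in the Evaluation section; only the layer sizes come from the structure above, while the ReLU/sigmoid activations and the optimizer are assumptions:

from keras.models import Sequential
from keras.layers import Dense

def build_mlp():
    model = Sequential([
        # Hidden layer: 512 neurons on top of the flattened 64x80 input (5,120 values)
        Dense(512, activation='relu', input_dim=5120),
        # Output layer: one value per pixel of the flattened 64x80 mask
        Dense(5120, activation='sigmoid'),
    ])
    model.compile(optimizer='adam', loss=dice_coef_loss, metrics=[dice_coef])
    return model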
7. Model 2 - Single Level U-Net
Included a single set of convolution and up-sampling layers, to see what difference the depth of the model makes to
convergence (sketched below, after the validation plot).
Result on Validation Data
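A hedged Keras sketch of such a single-level U-Net; the filter counts and kernel sizes are assumptions, and the point is one convolution/pooling step down, one up-sampling step back, and a skip connection between them:

from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D, concatenate

def build_single_level_unet(img_rows=64, img_cols=80):
    inputs = Input((img_rows, img_cols, 1))

    # Contracting path: one convolution block followed by pooling
    c1 = Conv2D(32, (3, 3), activation='relu', padding='same')(inputs)
    p1 = MaxPooling2D((2, 2))(c1)

    # Bottom of the "U"
    c2 = Conv2D(64, (3, 3), activation='relu', padding='same')(p1)

    # Expanding path: up-sample and concatenate with the skip connection
    u1 = concatenate([UpSampling2D((2, 2))(c2), c1])
    c3 = Conv2D(32, (3, 3), activation='relu', padding='same')(u1)

    # 1x1 convolution with a sigmoid gives a per-pixel nerve probability
    outputs = Conv2D(1, (1, 1), activation='sigmoid')(c3)
    return Model(inputs=inputs, outputs=outputs)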
8. Model 3 - U-Net
1. Total of 19 convolution layers
2. Dropout layers did not make a difference
3. Tried various optimizers - SGD, RMSProp, Adam, Adadelta
4. Chose Adam with a learning rate of 1e-5 (see the training sketch after this list)
5. Used ReLU activation for all layers and sigmoid at the output layer
6. Went two levels deeper, increasing the channels up to 1024 and 2048
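A sketch of how this final configuration could be compiled and trained, assuming a hypothetical build_unet() helper for the 19-convolution-layer network described above and the Dice functions from the Evaluation section; the batch size and validation split are assumptions:

from keras.optimizers import Adam

model = build_unet()  # hypothetical builder for the U-Net described above
model.compile(optimizer=Adam(lr=1e-5), loss=dice_coef_loss, metrics=[dice_coef])

model.fit(X_train, Y_train,
          batch_size=32,         # assumed
          epochs=100,
          validation_split=0.2,  # assumed
          shuffle=True)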
9. Model 3 Results:
This model gave the best Dice coefficient for the 100-epoch run over 1,000 training images.
10. Model 4 - U-Net with Augmentation
As noted earlier, in order to make the most of our few training examples, we "augment" them via a number of random transformations,
so that the model never sees the exact same picture twice. This helps prevent overfitting and helps the model
generalize better.
These generators can then be used with the Keras model methods that accept data generators as inputs:
fit_generator, evaluate_generator and predict_generator.
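A hedged sketch of one way to wire this up with Keras' ImageDataGenerator; the transformation ranges and batch size are assumptions, and the same random seed is used so that images and masks receive identical transforms:

from keras.preprocessing.image import ImageDataGenerator

# Identical (assumed) augmentation settings for images and masks
data_gen_args = dict(rotation_range=10,
                     width_shift_range=0.1,
                     height_shift_range=0.1,
                     zoom_range=0.1,
                     horizontal_flip=True)

image_datagen = ImageDataGenerator(**data_gen_args)
mask_datagen = ImageDataGenerator(**data_gen_args)

seed = 1  # shared seed keeps image and mask transforms in sync
image_generator = image_datagen.flow(X_train, batch_size=32, seed=seed)
mask_generator = mask_datagen.flow(Y_train, batch_size=32, seed=seed)
train_generator = zip(image_generator, mask_generator)

model.fit_generator(train_generator,
                    steps_per_epoch=len(X_train) // 32,
                    epochs=100)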
11. Model 4 Results:
This model gave a slightly lower Dice coefficient than the plain U-Net for the 100-epoch run over 1,000 training
images.
12. Comparison on Predicted Images
(Predicted masks shown for: Model 2 - Single-Level U-Net, Model 3 - U-Net, and Model 4 - U-Net with Augmentation)
13. Summary
Model                   | Avg. Time Per Epoch | Train Dice Coefficient | Dice Coef Loss | Input Type | Input Size
MLP                     | 5.8 s               | 1.01                   | -1.01          | 1-D array  | 5120
Single-Level U-Net      | 271.6 s             | 0.4652                 | -0.4652        | 2-D array  | 64x80
U-Net                   | 498.3 s             | 0.6967                 | -0.6967        | 2-D array  | 64x80
U-Net with augmentation | 600.8 s             | 0.6189                 | -0.6189        | 2-D array  | 64x80
● X_train = 1,000 training images
● Y_train = masks for the 1,000 training images
● X_test = 200 test images
● The output of the predict() function, a mask for each test image, is also a scaled image saved as an n-d array in a .npy file (see the sketch below)
● The ‘Train Dice Coefficient’ and ‘Dice Coef Loss’ are the values after 100 epochs for each model
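A short sketch of that prediction step; the output file name is an assumption:

import numpy as np

# Predict a mask for each test image and save the results as an n-d array in a .npy file
predicted_masks = model.predict(X_test, verbose=1)
np.save('imgs_mask_test.npy', predicted_masks)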