The document describes a U-Net model for segmenting consolidations in COVID-19 lung CT scans. It includes the aim, introduction, dataset description, U-Net architecture, hyperparameters, and implemented source code. The source code loads CT and mask datasets, visualizes sample data, prints dataset shapes, and concatenates images and masks for model training and evaluation.
COVID 19 CT Image Segmentation
Aim:
• To study the segmentation process and to work on CT image segmentation of
radiological findings on axial lung slices using the U-Net method.
• The objective of segmenting consolidations in COVID-19 lung CT scans is to provide
a more detailed and accurate view of the degree and severity of COVID-19 lung
involvement in patients.
• The primary goal is to detect disease-affected regions of the lung and assess the degree
of involvement. This allows radiologists and doctors to better assess the severity of
each patient's condition and select the best course of treatment.
• This analysis is intended to indicate whether a person shows signs of COVID-19
infection and how much of the lungs is affected.
• To record the observations obtained from implementing the CT image segmentation.
Platform used: Google Colab.
Dataset: Segmentation of lung changes from CT images
(https://www.kaggle.com/c/covidsegmentation/data)
a. Segmentation of consolidations only with the use of a custom U-Net architecture (max 70%)
Purpose:
• The purpose of COVID-19 segmentation is to identify and separate people who are infected
from those who are not. This helps prevent the spread of the virus and ensures appropriate
care for people who have been affected.
• Aside from testing and isolation, segmentation may also entail identifying and monitoring
close contacts of individuals who have tested positive for COVID-19, as well as implementing
measures such as mask use and hand hygiene.
• The overall purpose of COVID-19 segmentation is to limit the spread of the virus and protect
public health. Segmentation of consolidations in COVID-19 lung CT scans is a useful tool for
detecting and managing COVID-19 patients, and it can help improve patient outcomes.
Goal:
• The goal of segmenting consolidations in COVID-19 lung CT scans is to provide a more detailed
and accurate view of the degree and severity of COVID-19 lung involvement in patients.
• Radiologists and physicians can assess the extent and severity of the disease in each patient
by segmenting the parts of the lung affected by consolidation, which can help guide treatment
options. Patients with more widespread consolidations, for example, may require more
intensive treatment, whereas those with only mild or isolated consolidation may benefit from
less invasive therapies.
• Segmentation can also be used to track illness progression and therapy success over time.
Changes in the size or distribution of consolidations on subsequent CT scans, for example,
may suggest that a patient is responding to treatment or that their condition is worsening.
Introduction:
Identifying and separating infected regions from healthy regions in CT images of the
lungs of COVID-19 patients is known as COVID-19 segmentation of lung imaging. The resulting
data can be used both to diagnose the disease and to monitor its development. A well-known deep
learning architecture for image segmentation problems is U-Net. It consists of an encoder network
that extracts features from the input image and a decoder network that upsamples the feature maps
to create the segmentation mask. To run COVID-19 segmentation of lung CT images with U-Net, the
CT images are first preprocessed so that they are ready for input into the network.
Segmentation of consolidations in lung CT images uses image processing techniques to
detect parts of the lung that appear denser or more opaque than normal tissue, and to construct a
visual depiction of these regions. A radiologist or other medical practitioner can do this manually,
or automated software designed to identify and segment areas of interest in medical images
can be used. The resulting segmented image can then help identify and monitor the
evolution of COVID-19 and guide treatment recommendations.
The data was kindly shared by medicalsegmentation.com. Early detection of lung infection during
the COVID-19 pandemic was crucial for patients because it could lower mortality rates and raise
survival rates while a curative course of treatment was still likely. Computed tomography (CT)
imaging is a reliable screening tool for diagnosing and identifying lung infections or lung
cancer. The doctor examines the lung tissue and makes a diagnosis from the collected CT scans.
Without the aid of a computer-aided diagnosis (CAD) system, it can be challenging for the doctor
to reach an appropriate diagnosis in many common cases.
The preprocessed images are then fed into the U-Net network, which uses its encoder-decoder
architecture to create a segmentation mask that highlights the areas affected by lung infection.
The segmentation mask is then used to separate the infected regions from the healthy portions
for further analysis and diagnosis.
Overview of Segmentation process:
In medical imaging, "segmentation" refers to recognizing and delineating certain regions or
structures within an image. In the context of COVID-19 and lung CT images, segmentation of
consolidations means identifying the parts of the lung that have been damaged by the disease
and appear as dense, opaque regions on the CT scan.
Consolidation refers to the filling of the alveoli (air sacs) of the lungs with fluid, pus, or other
substances as a result of infection or inflammation. Consolidation is a typical feature in severe
COVID-19 patients and may be an important predictor of disease progression.
• Dataset: Load the dataset into the programming environment by importing the required masks
and CT images.
• Pre-processing: Prepare the data for training with steps such as normalizing intensity
values and cropping and resizing the images.
• Data splitting: Divide the dataset into training and testing sets, for example with the
train_test_split function from the scikit-learn module.
• Data augmentation: To reduce overfitting and increase the amount of training data, use
methods such as random image rotation, translation, and flipping.
• Model development: Create the U-Net segmentation architecture. The U-Net model is made up
of two main parts: a contracting path that captures context and a symmetric expanding path
that allows precise localization.
• Model compilation: Specify the optimizer, loss function, and metrics required to evaluate the
performance of the U-Net model.
• Model training: Train the U-Net model on the training set. The optimizer uses the gradient
of the loss function to update the model's weights.
• Model evaluation: Examine the performance of the trained model on the testing data.
Intersection over Union (IoU), the Dice coefficient, and accuracy are common performance metrics.
• Model inference: Finally, segment new images using the trained U-Net model.
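The three evaluation metrics named in the overview can be illustrated on a toy binary mask. The following is a minimal sketch in NumPy, not the code used in this report; the function names and the 4x4 example masks are assumptions made for illustration. It also shows why pixel accuracy alone can look good on masks that are mostly background.

```python
import numpy as np

def dice_coefficient(y_true, y_pred, eps=1e-7):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    intersection = np.logical_and(y_true, y_pred).sum()
    return float((2.0 * intersection + eps) / (y_true.sum() + y_pred.sum() + eps))

def iou(y_true, y_pred, eps=1e-7):
    """IoU = |A and B| / |A or B| for two binary masks."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    intersection = np.logical_and(y_true, y_pred).sum()
    union = np.logical_or(y_true, y_pred).sum()
    return float((intersection + eps) / (union + eps))

def pixel_accuracy(y_true, y_pred):
    """Fraction of pixels whose predicted label equals the true label."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    return float(np.mean(y_true == y_pred))

# Hypothetical 4x4 ground truth with 4 positive pixels
truth = np.zeros((4, 4), dtype=np.uint8)
truth[1:3, 1:3] = 1
# Hypothetical prediction with 6 positive pixels, 4 of them correct
pred = np.zeros((4, 4), dtype=np.uint8)
pred[1:3, 1:4] = 1
# A prediction that marks every pixel as background
all_background = np.zeros_like(truth)

print(round(dice_coefficient(truth, pred), 3))            # 0.8
print(round(iou(truth, pred), 3))                         # 0.667
print(pixel_accuracy(truth, pred))                        # 0.875
print(pixel_accuracy(truth, all_background))              # 0.75
print(round(dice_coefficient(truth, all_background), 3))  # 0.0
```

In this toy case the all-background prediction still scores 0.75 accuracy while its Dice coefficient is essentially zero, which is why overlap metrics are usually reported alongside accuracy for segmentation.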
Dataset Description:
Two components comprise the dataset:
MedSeg part:
• 100 axial CT images from more than 40 patients with COVID-19 were converted into this
dataset. The conversion process is described in detail in "Covid-19 radiology: data gathering
and preparation for AI".
• Training images - 100 slices, 512x512 size, in images_medseg.npy.
• Training masks - 100 masks with 4 channels in masks_medseg.npy: (zero = "ground glass,"
one = "consolidations," two = "other lungs," and three = "background")
• Test images - 10 slices, 512x512 size, in test_images_medseg.npy.
Radiopedia part:
• 9 segmented axial volumetric CTs from Radiopaedia. This dataset contains both positive and
negative slices because it contains complete volumes (373 out of the total of 829 slices have
been evaluated by a radiologist as positive and segmented). As above, these volumes
are converted and normalized.
• Training images - 829 slices, 512x512 size, in images_radiopedia.npy; training masks - 829
masks with 4 channels in masks_radiopedia.npy: (zero = "ground glass," one = "consolidations,"
two = "other lungs," and three = "background").
• The figure below shows a CT image of a person's lungs, the kind of image on which the
affected regions are distinguished.
Figure 1: CT Image of Lung
U-Net Architecture:
Recognizing and separating infected regions from healthy regions in CT images of
the lungs of COVID-19 patients is termed COVID-19 segmentation of lung imaging. The resulting
data can be used both to diagnose the disease and to monitor its development.
Figure 2: Segmentation of Lung CT Image using U-NET Method
A popular deep learning architecture for image segmentation problems is U-Net. It is
made up of an encoder network that extracts features from the input image and a decoder
network that upsamples the feature maps to create the segmentation mask.
The CT images are preprocessed so that they are ready for input into the network before
COVID-19 segmentation of lung CT images is run with U-Net. The segmentation of a lung CT image
using the U-Net method is illustrated in Figure 2, and the result is the segmentation mask
of the lung CT image.
Hyperparameter Tuning:
Tuning a model's hyperparameters to maximize its performance on a given task is known as
hyperparameter tuning. Typical hyperparameters that can be tuned in a U-Net for image
segmentation include:
Number of filters in the encoder and decoder: This affects the model's capacity to learn
features from the input image.
Kernel size of the convolutional layers: This determines the size of the model's receptive field
and may affect its ability to recognize larger structures in the image.
Stride of the convolutional layers: This determines the spatial resolution of the model's
output, with larger strides giving a coarser output and smaller strides a more
detailed output.
Padding of the convolutional layers: With "same" padding and a stride of 1, the output of a
layer has the same spatial extent as its input.
Type and parameters of the activation function: The activation function introduces
non-linearity into the model and can affect its capacity to learn complex relationships
between the input and output.
Type and parameters of the optimizer: The optimizer updates the weights of the model based
on the gradient of the loss function, which affects the speed and stability of training.
Type and parameters of the loss function: The loss function measures the discrepancy between
the model's predictions and the ground truth and influences the output quality.
size_x: The height of the model's input images (default: 256).
size_y: The width of the model's input images (default: 256).
n_filters: The number of filters in the first convolutional layer (default: 16).
Kernel size: The size of the convolutional kernels (default: 3x3).
Activation: The activation function used in the convolutional and output layers (by default, 'relu' in
the convolutional layers and 'sigmoid' in the output layer).
Initializer: The weight initialization approach for the convolutional layers (he_normal is the default).
Padding: The padding approach for the convolutional and transposed convolutional layers (the
default is 'same').
Optimizer: The training optimization algorithm (default: 'adam').
Loss: The training loss function (default: 'binary_crossentropy').
Metrics: The training evaluation metric (default: 'accuracy').
The code implemented for the U-Net model uses the following hyperparameters:
• Kernel size of the Conv2D layers (3x3)
• Filter count in the Conv2D layers (4, 6, 8, 16 (default))
• Stride of the Conv2DTranspose layers, for instance (2, 2)
• Activation function ("relu")
• Padding ("same")
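To make the effect of stride and padding concrete, the sketch below traces the spatial size of the feature maps through a U-Net using the values listed above (3x3 "same" convolutions, 2x2 max pooling with stride 2, and 2x2 transposed convolutions with stride (2, 2)). It is plain arithmetic rather than Keras code; the helper names and the assumed depth of four pooling steps are illustrative assumptions, not taken from the implemented source.

```python
import math

def conv_same_out(size, stride=1):
    """Output size of a 'same'-padded convolution: ceil(size / stride)."""
    return math.ceil(size / stride)

def max_pool_out(size, pool=2, stride=2):
    """Output size of max pooling without padding."""
    return (size - pool) // stride + 1

def conv_transpose_same_out(size, stride=2):
    """Output size of a 'same'-padded transposed convolution: size * stride."""
    return size * stride

def trace_unet_sizes(size=256, depth=4):
    """Return the feature-map side length after each pooling/upsampling step."""
    sizes = [size]
    for _ in range(depth):                    # contracting path
        size = conv_same_out(size, stride=1)  # 3x3 'same' conv keeps the size
        size = max_pool_out(size)             # 2x2 pooling halves it
        sizes.append(size)
    for _ in range(depth):                    # expanding path
        size = conv_transpose_same_out(size)  # stride-2 transposed conv doubles it
        sizes.append(size)
    return sizes

sizes = trace_unet_sizes(256, depth=4)
print(sizes)  # [256, 128, 64, 32, 16, 32, 64, 128, 256]
```

Because every step halves or doubles the side length exactly, an input size that is divisible by 2 raised to the depth (such as 256) returns to its original resolution at the output.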
Steps to be followed:
Import the required libraries and load the dataset from Kaggle into Colab.
Implemented Source Code:
# Install the required Python packages (TensorFlow/Keras and NumPy)
!pip install --upgrade tensorflow
!pip install --upgrade numpy
# Import the required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from PIL import Image
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Conv2D, MaxPooling2D, UpSampling2D, Concatenate, BatchNormalization, Dropout
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import Model
# Mount Google Drive and load the .npy files
from google.colab import drive
drive.mount('/content/drive')
images_medseg = np.load('/content/drive/MyDrive/Kaggle dataset/images_medseg.npy')
masks_medseg = np.load('/content/drive/MyDrive/Kaggle dataset/masks_medseg.npy')
test_images_medseg = np.load('/content/drive/MyDrive/Kaggle dataset/test_images_medseg.npy')
images_radiopedia = np.load('/content/drive/MyDrive/Kaggle dataset/images_radiopedia.npy')
masks_radiopedia = np.load('/content/drive/MyDrive/Kaggle dataset/masks_radiopedia.npy')
Comments:
I loaded the dataset into Google Colab using the drive.mount command, which mounts my Google Drive
so that the .npy files stored there can be read with np.load.
# Choose an image and its mask to visualize
idx = 0 # index of the image you want to visualize
image = images_medseg[idx, :, :, 0]
mask = masks_medseg[idx, :, :, 0]  # channel 0 = ground glass (GGO)
fig, axs = plt.subplots(1, 3, figsize=(20, 10))
axs[0].imshow(image, cmap='gray')
axs[0].set_title('CT Image')
axs[1].imshow(mask, cmap='gray')
axs[1].set_title('GGO')
axs[2].imshow(masks_medseg[idx, :, :, 1], cmap='gray')
axs[2].set_title('Consolidations')
plt.show()
Comments:
The selected MedSeg image is shown alongside its ground-glass (GGO) mask and its
consolidation mask.
Obtained Output:
# Visualizing the data
# Plot an example image and its corresponding mask
idx = 30 # index of the image you want to visualize
image = images_radiopedia[idx, :, :, 0]
mask = masks_radiopedia[idx, :, :, 1]
fig, axs = plt.subplots(1, 3, figsize=(20, 10))
axs[0].imshow(image, cmap='gray')
axs[0].set_title('Image')
axs[1].imshow(masks_radiopedia[idx, :, :, 0], cmap='gray')
axs[1].set_title('GGO')
axs[2].imshow(mask, cmap='gray')
axs[2].set_title('Consolidations')
plt.show()
Obtained Output:
# Keep the trailing channel axis of the images, so the shape stays (N, 512, 512, 1)
images_radiopedia = images_radiopedia.astype('float32')
masks_radiopedia = masks_radiopedia.astype('float32')
# Take the consolidation channel only (channel 1)
masks_radiopedia = masks_radiopedia[..., 1]
masks_medseg = masks_medseg[..., 1]
# Restore a trailing channel axis on the masks, so the shape becomes (N, 512, 512, 1)
masks_radiopedia = np.expand_dims(masks_radiopedia, axis=-1)
masks_medseg = np.expand_dims(masks_medseg, axis=-1)
# Choose a random index to visualize
index = np.random.randint(10, images_radiopedia.shape[0]-10)
# Plot the image and mask
plt.figure()
plt.imshow(images_radiopedia[index, ..., 0], cmap='gray')
plt.imshow(masks_radiopedia[index, ..., 0], cmap='jet', alpha=0.5)
plt.axis('off')
plt.show()
Obtained Output:
Comments:
A randomly chosen Radiopedia slice is displayed with its consolidation mask overlaid.
# Print the shape of the input dataset .npy files
print("Shape of images_radiopedia:", images_radiopedia.shape)
print("Shape of images_medseg:", images_medseg.shape)
print("Shape of masks_radiopedia:", masks_radiopedia.shape)
print("Shape of masks_medseg:", masks_medseg.shape)
print("Shape of images_test_medseg:", test_images_medseg.shape)
Obtained Result:
Shape of images_radiopedia: (829, 512, 512, 1)
Shape of images_medseg: (100, 512, 512, 1)
Shape of masks_radiopedia: (829, 512, 512, 4)
Shape of masks_medseg: (100, 512, 512, 4)
Shape of images_test_medseg: (10, 512, 512, 1)
Comments:
The shape of each array is printed to decide what input shape the convolutional network should be built for.
# Concatenate the image and mask arrays from both sources
# (the test images have no masks, so they are left out to keep images and masks aligned)
images = np.concatenate([images_radiopedia, images_medseg])
masks = np.concatenate([masks_radiopedia, masks_medseg])
Comments:
• Concatenating these arrays combines the data from the two sources into a single, unified
collection that can be processed or analyzed together.
• Concatenating the arrays presupposes that the images and masks from the two sources are
compatible. If the images and masks have different dimensions, or if the
image and mask arrays are not properly aligned, the concatenation may fail or pair images
with the wrong masks.
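The compatibility conditions described above can be checked explicitly before concatenating. The following is a small sketch under stated assumptions: the helper name concat_pairs is hypothetical, and tiny 8x8 stand-in arrays replace the real 512x512 data so the example runs on its own.

```python
import numpy as np

def concat_pairs(image_sets, mask_sets):
    """Concatenate image and mask arrays along axis 0 after checking alignment."""
    if len(image_sets) != len(mask_sets):
        raise ValueError("need one mask array per image array")
    for imgs, msks in zip(image_sets, mask_sets):
        if imgs.shape[0] != msks.shape[0]:
            raise ValueError("number of images and masks differs within a source")
        if imgs.shape[1:3] != msks.shape[1:3]:
            raise ValueError("image and mask height/width differ")
    images = np.concatenate(image_sets, axis=0)
    masks = np.concatenate(mask_sets, axis=0)
    return images, masks

# Stand-in arrays: 5 slices from one source, 3 from another, each 8x8 with 1 channel
src_a_images = np.zeros((5, 8, 8, 1), dtype=np.float32)
src_a_masks = np.zeros((5, 8, 8, 1), dtype=np.float32)
src_b_images = np.ones((3, 8, 8, 1), dtype=np.float32)
src_b_masks = np.ones((3, 8, 8, 1), dtype=np.float32)

images, masks = concat_pairs([src_a_images, src_b_images],
                             [src_a_masks, src_b_masks])
print(images.shape, masks.shape)  # (8, 8, 8, 1) (8, 8, 8, 1)
```

Raising an error on a mismatch is a design choice for this sketch: it surfaces a misaligned pair immediately instead of letting it silently reach training.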
# Normalize images to [0, 1] range
images_medseg = (images_medseg - np.min(images_medseg)) / (np.max(images_medseg) - np.min(images_medseg))
images_radiopedia = (images_radiopedia - np.min(images_radiopedia)) / (np.max(images_radiopedia) - np.min(images_radiopedia))
test_images_medseg = (test_images_medseg - np.min(test_images_medseg)) / (np.max(test_images_medseg) - np.min(test_images_medseg))
# Resize the CT images and masks to 256x256 and print the resulting shapes
from skimage.transform import resize
images_radiopedia = resize(images_radiopedia, (images_radiopedia.shape[0], 256, 256, 1))
images_medseg = resize(images_medseg, (images_medseg.shape[0], 256, 256, 1))
masks_radiopedia = resize(masks_radiopedia, (masks_radiopedia.shape[0], 256, 256, 1))
masks_medseg = resize(masks_medseg, (masks_medseg.shape[0], 256, 256, 1))
test_images_medseg = resize(test_images_medseg, (test_images_medseg.shape[0], 256, 256, 1))
print("Shape of images_radiopedia:", images_radiopedia.shape)
print("Shape of images_medseg:", images_medseg.shape)
print("Shape of masks_radiopedia:", masks_radiopedia.shape)
print("Shape of masks_medseg:", masks_medseg.shape)
print("Shape of images_test_medseg:", test_images_medseg.shape)
Obtained Output:
Shape of images_radiopedia: (829, 256, 256, 1)
Shape of images_medseg: (100, 256, 256, 1)
Shape of masks_radiopedia: (829, 256, 256, 1)
Shape of masks_medseg: (100, 256, 256, 1)
Shape of images_test_medseg: (10, 512, 512, 1)
# Split the images and masks arrays into training and validation sets
x_train = images_radiopedia
y_train = masks_radiopedia
x_val = images_medseg
y_val = masks_medseg
Comments:
The Radiopedia slices are used as the training set and the MedSeg slices as the validation set, so the
split is made by data source rather than at random.
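For comparison, a random split of a combined array (the approach mentioned in the overview) could look like the sketch below. It uses only NumPy instead of scikit-learn's train_test_split so that it is self-contained; the function name random_split, the 20% validation fraction, and the fixed seed are assumptions made for illustration.

```python
import numpy as np

def random_split(images, masks, val_fraction=0.2, seed=42):
    """Shuffle indices once and split images and masks with the same permutation."""
    if len(images) != len(masks):
        raise ValueError("images and masks must have the same length")
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    n_val = int(round(len(images) * val_fraction))
    val_idx, train_idx = order[:n_val], order[n_val:]
    return images[train_idx], images[val_idx], masks[train_idx], masks[val_idx]

# Stand-in data: 10 one-pixel "images" whose value identifies the slice
demo_images = np.arange(10, dtype=np.float32).reshape(10, 1, 1, 1)
demo_masks = demo_images.copy()

x_tr, x_va, y_tr, y_va = random_split(demo_images, demo_masks, val_fraction=0.2)
print(x_tr.shape, x_va.shape)  # (8, 1, 1, 1) (2, 1, 1, 1)
```

One caveat with a random slice-level split on volumetric data is that neighbouring slices of the same CT volume can end up on both sides of the split, which tends to make validation scores look better than they would on unseen patients. Splitting by source, as done above, avoids that particular leak.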
# U-Net model creation
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D, Conv2DTranspose, concatenate
from tensorflow.keras.models import Model

def create_unet_model(size_x=256, size_y=256, n_filters=16):
    IMG_HEIGHT = size_x
    IMG_WIDTH = size_y
    IMG_CHANNELS = 1
    inputs = Input((IMG_HEIGHT, IMG_WIDTH, IMG_CHANNELS))
                                                                           'conv2d_3[0][0]']
 conv2d_14 (Conv2D)                     (None, 128, 128, 32)   18464      ['concatenate_2[0][0]']
 conv2d_15 (Conv2D)                     (None, 128, 128, 32)   9248       ['conv2d_14[0][0]']
 conv2d_transpose_3 (Conv2DTranspose)   (None, 256, 256, 16)   2064       ['conv2d_15[0][0]']
 concatenate_3 (Concatenate)            (None, 256, 256, 32)   0          ['conv2d_transpose_3[0][0]',
                                                                           'conv2d_1[0][0]']
 conv2d_16 (Conv2D)                     (None, 256, 256, 16)   4624       ['concatenate_3[0][0]']
 conv2d_17 (Conv2D)                     (None, 256, 256, 16)   2320       ['conv2d_16[0][0]']
 conv2d_18 (Conv2D)                     (None, 256, 256, 1)    17         ['conv2d_17[0][0]']
==================================================================================================
Total params: 1,940,817
Trainable params: 1,940,817
Non-trainable params: 0
__________________________________________________________________________________________________
Comments:
• The function takes three parameters: size_x and size_y, which specify the height and width of
the input image, and n_filters, which specifies the number of filters in the first convolutional
block. By default the input image size is 256x256 and the number of filters is 16.
• The network consists of a contracting path and an expanding path built from convolutional
layers with 3x3 kernels and "same" padding. Each convolutional layer uses the ReLU
(rectified linear unit) activation function and the he_normal initializer.
• Following each pair of convolutional layers in the contracting path, a max pooling layer with a
2x2 window and a stride of 2 halves the height and width of the feature maps. To upsample the
feature maps, the expanding path uses transposed convolutional layers with 2x2 kernels and
"same" padding. Each transposed convolutional layer is followed by a concatenation with the
matching feature maps from the contracting path and then by a pair of convolutional layers.
• The function builds the model with the Keras functional API, which supports more
complex network designs. The model begins with an input layer that accepts an image with
the dimensions (size_x, size_y, 1), where 1 indicates a single grayscale channel. Finally, the
model's output is produced by a final convolutional layer with a single filter (conv2d_18 in the
summary, output shape (None, 256, 256, 1)) and a sigmoid activation function, which yields a
per-pixel probability map for the consolidation class.
• Total params: 1,940,817
• Trainable params: 1,940,817
• Non-trainable params: 0
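The parameter counts in the summary can be cross-checked by hand: a convolution with a k x k kernel, c_in input channels and c_out filters has (k * k * c_in + 1) * c_out parameters, where the +1 is the bias. The sketch below applies this formula. The per-layer checks use values visible in the summary; the full-network count additionally assumes a five-level layout with 16, 32, 64, 128 and 256 filters, which is an assumption because the upper part of the summary is not reproduced here.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weights plus biases of a (transposed) convolution with a square kernel."""
    return (kernel * kernel * in_channels + 1) * out_channels

def unet_params(n_filters=16, depth=4, in_channels=1):
    """Total parameters of a U-Net with two 3x3 convs per level (assumed layout)."""
    total = 0
    ch_in = in_channels
    # Contracting path plus bottleneck: filters double at every level
    for level in range(depth + 1):
        ch_out = n_filters * 2 ** level
        total += conv_params(3, ch_in, ch_out) + conv_params(3, ch_out, ch_out)
        ch_in = ch_out
    # Expanding path: 2x2 transposed conv, skip concatenation, two 3x3 convs
    for level in reversed(range(depth)):
        ch_out = n_filters * 2 ** level
        total += conv_params(2, ch_in, ch_out)       # transposed convolution
        total += conv_params(3, 2 * ch_out, ch_out)  # input is upsampled + skip
        total += conv_params(3, ch_out, ch_out)
        ch_in = ch_out
    total += conv_params(1, ch_in, 1)                # final 1x1 output layer
    return total

print(conv_params(3, 64, 32))  # 18464, matches conv2d_14
print(conv_params(2, 32, 16))  # 2064, matches conv2d_transpose_3
print(conv_params(1, 16, 1))   # 17, matches conv2d_18
print(unet_params())           # 1940817
```

Under these assumptions the total agrees with the 1,940,817 parameters reported in the summary, and the 17 parameters of conv2d_18 are consistent with a single 1x1 filter over 16 input channels.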
# Compile the model
unet_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
Comments:
• The compile method configures the training process of a deep learning model. It takes
the following arguments:
• optimizer: The optimization algorithm used during training. The adam optimizer is used
here. Adam (Adaptive Moment Estimation) is a widely used optimization method for training
deep learning models.
• loss: The loss function used to evaluate the model's performance during training. The
binary_crossentropy loss function is used here. It is a typical loss function for binary
classification problems, in which the goal is to predict one of two classes; in this case each
pixel is classified as consolidation or not.
• metrics: The metrics reported during training and validation. Accuracy is used here.
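To show what the binary cross-entropy loss measures, here is a minimal NumPy sketch of the per-pixel formula -[y * log(p) + (1 - y) * log(1 - p)] averaged over all pixels. It is an illustration of the formula, not the Keras implementation; the function name, the clipping constant eps, and the 2x2 example values are assumptions.

```python
import math
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between a 0/1 mask and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=np.float64)
    # Clip probabilities away from 0 and 1 so that log() stays finite
    y_pred = np.clip(np.asarray(y_pred, dtype=np.float64), eps, 1.0 - eps)
    per_pixel = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return float(per_pixel.mean())

# Hypothetical 2x2 mask with one consolidation pixel
y_true = np.array([[1, 0], [0, 0]])
confident = np.array([[0.9, 0.1], [0.1, 0.1]])  # close to the mask
unsure = np.full((2, 2), 0.5)                   # no information

loss_confident = binary_crossentropy(y_true, confident)
loss_unsure = binary_crossentropy(y_true, unsure)
print(round(loss_confident, 4))  # 0.1054
print(round(loss_unsure, 4))     # 0.6931
```

Predictions close to the mask give a loss near zero, while a constant 0.5 everywhere gives log(2), about 0.693, which is a useful reference point when reading the loss values in a training log.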
# Train the model, evaluating on the validation set after each epoch
history = unet_model.fit(x_train, y_train, batch_size=32, epochs=10, validation_data=(x_val, y_val))
Obtained Output:
Epoch 1/10
26/26 [==============================] - 31s 423ms/step - loss: 0.1246 -
accuracy: 0.9571 - val_loss: 0.2757 - val_accuracy: 0.9775
Epoch 2/10
26/26 [==============================] - 6s 243ms/step - loss: 0.0182 -
accuracy: 0.9968 - val_loss: 0.1127 - val_accuracy: 0.9775
Epoch 3/10
26/26 [==============================] - 6s 246ms/step - loss: 0.0123 -
accuracy: 0.9968 - val_loss: 0.1233 - val_accuracy: 0.9775
Epoch 4/10
26/26 [==============================] - 6s 246ms/step - loss: 0.0091 -
accuracy: 0.9968 - val_loss: 0.1226 - val_accuracy: 0.9775
Epoch 5/10
26/26 [==============================] - 7s 252ms/step - loss: 0.0074 -
accuracy: 0.9968 - val_loss: 0.1192 - val_accuracy: 0.9775
Epoch 6/10
26/26 [==============================] - 6s 250ms/step - loss: 0.0069 -
accuracy: 0.9968 - val_loss: 0.1270 - val_accuracy: 0.9775
Epoch 7/10
26/26 [==============================] - 7s 252ms/step - loss: 0.0061 -
accuracy: 0.9968 - val_loss: 0.1390 - val_accuracy: 0.9775
Epoch 8/10
26/26 [==============================] - 7s 256ms/step - loss: 0.0054 -
accuracy: 0.9968 - val_loss: 0.1374 - val_accuracy: 0.9775
Epoch 9/10
26/26 [==============================] - 7s 257ms/step - loss: 0.0046 -
accuracy: 0.9968 - val_loss: 0.1437 - val_accuracy: 0.9775
Epoch 10/10
26/26 [==============================] - 7s 254ms/step - loss: 0.0043 -
accuracy: 0.9968 - val_loss: 0.1516 - val_accuracy: 0.9775
Comments:
• In this output the training loss falls from 0.1246 in the first epoch to 0.0043 in the tenth,
and the training accuracy rises from 0.9571 to 0.9968 by the second epoch and stays there. The
validation accuracy remains at 0.9775 in every epoch, while the validation loss drops to 0.1127
in the second epoch and then rises gradually to 0.1516. Because the validation accuracy does not
change at all and the validation loss drifts upward while the training loss keeps falling, these
numbers alone do not establish how well the model generalizes. Pixel accuracy can be dominated
by the background class, so overlap metrics such as IoU or the Dice coefficient would give a
more informative picture.
• Fitting the model on a given set of training data yields a History object that can be used to
study the training process. The following arguments are passed to the fit method:
• x_train: The training data, here a NumPy array of image samples.
• y_train: The ground truth labels for the training data, here a NumPy array of masks.
• batch_size: The number of samples processed in one batch during training. The batch size
is set to 32 here.
• epochs: The number of passes over the full training dataset. In this example, the model is
trained for a total of ten epochs.
• validation_data: The validation data and ground truth labels used to evaluate the model's
performance after each epoch. The argument is a pair of arrays (x_val, y_val), so the model's
performance on x_val and y_val is evaluated after each epoch.
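The "26/26" shown in each epoch of the training log follows directly from the batch size and the number of training slices. The short sketch below reproduces that arithmetic; the helper names are hypothetical, and 829 is the number of Radiopedia slices used as x_train.

```python
import math

def steps_per_epoch(n_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(n_samples / batch_size)

def last_batch_size(n_samples, batch_size):
    """Size of the final, possibly smaller, batch of an epoch."""
    remainder = n_samples % batch_size
    return remainder if remainder else batch_size

steps = steps_per_epoch(829, 32)
print(steps)                     # 26
print(last_batch_size(829, 32))  # 29
print(steps * 10)                # 260 weight updates over 10 epochs
```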
Conclusion:
The model reaches an overall validation accuracy of about 97%, which is a high pixel accuracy, and
training ran stably over the ten epochs. In computer vision, image segmentation is a very useful
technique that can be applied in a range of use cases, including medical imaging and driverless cars,
to separate an image into segments or classes, in some cases in real time. One can now experiment
with applying U-Net to other image segmentation problems or investigate other models that are
useful for image segmentation.