1. PES College of Engineering
Under the Guidance of:
Mrs. Ramyashree H P
Assistant Professor
Department of Computer Science and Engineering
PROJECT PRESENTATION ON
Multi-organ Nuclei Segmentation and Classification
Presented By:
Mohanapriya K.J [4PS20CS059]
Nanditha KP [4PS20CS062]
Thejaswini Raj M D [4PS20CS117]
Vishwas B C [4PS20CS124]
2. Data Preprocessing
• The primary objective of this project is to develop a generalized model capable of accurately
segmenting nuclei in whole-slide images from various organs, addressing the limitations of current
organ-specific methods in cancer diagnostics.
• Data Preprocessing: To implement effective preprocessing techniques on hematoxylin and eosin
(H&E)-stained images in our multi-organ dataset, enhancing the quality and consistency of the
images for improved accuracy in nuclei segmentation and classification.
• The preprocessing steps involve dividing the whole slide images into patches, discarding certain
patches based on the mean value of their masks, and splitting the dataset into training and
validation sets.
• The dataset is split into training and validation sets using an 80:20 ratio.
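The preprocessing steps above can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the patch size, the mask-mean threshold, and all function names are assumptions, since the slides do not state them.

```python
import numpy as np

def make_patches(image, mask, patch_size=256):
    """Tile a whole-slide image and its mask into non-overlapping patches."""
    h, w = image.shape[:2]
    patches = []
    for y in range(0, h - patch_size + 1, patch_size):
        for x in range(0, w - patch_size + 1, patch_size):
            patches.append((image[y:y + patch_size, x:x + patch_size],
                            mask[y:y + patch_size, x:x + patch_size]))
    return patches

def filter_patches(patches, min_mask_mean=0.01):
    """Discard patches whose mask mean is below a threshold (mostly background)."""
    return [(img, m) for img, m in patches if m.mean() >= min_mask_mean]

def train_val_split(patches, val_fraction=0.2, seed=0):
    """Shuffle the patch list and split it 80:20 into training and validation sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(patches))
    n_val = int(len(patches) * val_fraction)
    val = [patches[i] for i in idx[:n_val]]
    train = [patches[i] for i in idx[n_val:]]
    return train, val
```

In practice the threshold would be tuned so that patches containing no nuclei are dropped while sparse but informative patches are kept.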
3. Data Augmentation
• Data Augmentation: To employ data augmentation techniques, such as random rotations, flips, and
adjustments in brightness and contrast, during the training phase of our CNN model. This approach
aims to expand the dataset for multi-organ nuclei segmentation and classification, compensating
for the limited size of the original dataset.
• Data augmentation is typically used during the training phase to artificially increase the diversity of
the training dataset, thereby improving the generalization and robustness of the model.
• Rotation: Randomly rotating the image by a certain angle.
• Zoom: Randomly zooming into or out of the image.
• Brightness and Contrast: Adjusting the brightness and contrast of the image.
Example: a patch horizontally flipped, and rotated by 90, 180, and 270 degrees.
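The augmentations listed above can be sketched with numpy alone. This is an illustrative pass, not the project's actual augmentation code; the jitter ranges and probabilities are assumptions, and zoom is omitted for brevity since resizing needs an image library.

```python
import numpy as np

def augment(image, rng):
    """Apply one random augmentation pass: rotation, flips, brightness/contrast.

    `image` is a float array in [0, 1] of shape (H, W, C)."""
    # Random rotation by a multiple of 90 degrees.
    image = np.rot90(image, k=int(rng.integers(0, 4)))
    # Random horizontal / vertical flips.
    if rng.random() < 0.5:
        image = np.fliplr(image)
    if rng.random() < 0.5:
        image = np.flipud(image)
    # Random contrast (multiplicative) and brightness (additive) jitter.
    contrast = rng.uniform(0.9, 1.1)
    brightness = rng.uniform(-0.1, 0.1)
    return np.clip(image * contrast + brightness, 0.0, 1.0)
```

Because the transforms are drawn fresh each epoch, the model effectively sees a different version of every patch at each pass, which is what compensates for the small original dataset.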
4. Segmentation and Classification
• Segmentation: To implement Convolutional Neural Networks (CNN) for the precise segmentation of
individual cells in whole slide images across multiple organs. This step is crucial as it facilitates the
detailed analysis of cell distribution and morphology within the tissue.
• Segmentation is performed to identify and delineate the boundaries of nuclei within the images.
This is essential for subsequent classification tasks, where the type of cells present in each
segmented nucleus is determined (e.g., epithelial cells, lymphocytes, macrophages, and
neutrophils).
• Classification: To classify each segmented nucleus into its cell type, providing deeper insight into
cellular structures, aiding precise disease diagnosis, and informing the development of targeted
treatment strategies. Success in this objective will contribute significantly to advancing cancer
diagnostics and therapy.
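The link between segmentation and classification can be made concrete: a segmentation/classification network typically outputs a per-pixel score for each class, and taking the argmax over the class axis yields a class map. The class names and their ordering below are assumptions for illustration; the actual label order depends on the dataset.

```python
import numpy as np

# Assumed label order for illustration only.
CLASSES = ["background", "epithelial", "lymphocyte", "macrophage", "neutrophil"]

def classify_pixels(logits):
    """Turn per-pixel class scores of shape (H, W, C) into a class-index map."""
    return np.argmax(logits, axis=-1)
```

Each connected region of non-background pixels in the resulting map then corresponds to one classified nucleus.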
5. PatchEUNet
• PatchEUNet is a convolutional neural network architecture designed for semantic segmentation
tasks, particularly in medical image analysis. It combines elements of U-Net architecture with
patch-based processing for efficient context capture and precise localization. This architecture is
specifically designed to tackle the challenges of segmenting high-resolution medical images by
using a sophisticated encoder-decoder structure, incorporating the EfficientNet-B3 architecture,
and leveraging skip connections.
6. Encoder
Contracting Path (Encoder):
• EfficientNet-B3 Integration: The encoder in PatchEUNet is based on the EfficientNet-B3 architecture,
which is known for its efficiency and scalability. EfficientNet-B3 is used to extract features from the
input whole slide images (WSIs) in a hierarchical manner.
• Hierarchical Downsampling: The encoder progressively reduces the spatial dimensions of the input
images while increasing the depth of the feature maps. This downsampling process captures high-
level contextual information at multiple scales, which is crucial for segmenting nuclei across different
organs and staining variations.
• Pre-trained Weights: The encoder is initialized with weights from the ImageNet dataset, allowing the
model to leverage pre-learned features. This pre-training step enhances the model's ability to
recognize diverse patterns and textures in the WSIs, which is essential for accurate segmentation and
classification.
Overall, the encoder in PatchEUNet extracts high-level features from the input WSIs using the EfficientNet-
B3 architecture, producing multi-scale feature maps that the decoder later refines via skip connections.
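The hierarchical downsampling described above can be sketched as simple shape bookkeeping: each encoder stage halves the spatial resolution while the channel depth grows. The channel widths below are approximate EfficientNet-B3 stage widths and the 256-pixel input size is an assumption, since the slides do not state the patch resolution.

```python
def encoder_shapes(input_size=256, stage_channels=(24, 32, 48, 136, 384)):
    """Illustrative shape progression for a hierarchical encoder:
    each stage halves the spatial size while the channel depth grows."""
    shapes = []
    size = input_size
    for ch in stage_channels:
        size //= 2          # 2x spatial downsampling per stage
        shapes.append((size, size, ch))
    return shapes
```

The deepest feature map (smallest spatial size, most channels) carries the high-level context, while the earlier, larger maps are the ones reused by the decoder's skip connections.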
7. Decoder
Expanding Path (Decoder):
• Upsampling and Convolutional Layers: The decoder in PatchEUNet reverses the downsampling process of the
encoder by progressively upsampling the feature maps and applying convolutional layers. This upsampling
recovers spatial details lost during downsampling and refines the segmentation masks. Every step in the
expansive path consists of an upsampling of the feature map followed by a convolution, so the expansive
branch increases the resolution of the output. To localize the upsampled features, the expansive path
combines them with high-resolution features from the contracting path via skip connections [3]. The output of the
model is a pixel-by-pixel mask that shows the class of each pixel.
• Batch Normalization and ReLU Activation: Each convolutional layer in the decoder is followed by batch
normalization and ReLU activation. Batch normalization standardizes the activations from a layer, which helps in
stabilizing and accelerating the training process. ReLU activation introduces non-linearity, enabling the model to
learn complex mappings from the feature maps to the segmentation masks.
• Skip Connections: Skip connections are used to concatenate feature maps from the encoder (EfficientNet-B3
blocks) to the corresponding decoder layers. This mechanism helps in recovering spatial details and combining
high-resolution features from the encoder with the upsampled features in the decoder, improving the accuracy of
segmentation.
• We use 256, 128, 64, 32, and 16 filters for the convolutional layers in the decoder blocks. Each decoder block is
combined with the output of EfficientNet-B3 blocks 2, 3, 4, and 6 via skip connections.
The decoder reconstructs the segmented masks by progressively upsampling the features and refining the
segmentation details.
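One decoder step can be sketched as shape bookkeeping: upsampling doubles the spatial size, and the skip connection concatenates encoder channels onto the upsampled map before the convolution reduces them again. This is a minimal illustration; the convolution, batch normalization, and ReLU that follow in the real network are omitted, and the shapes used in the test are assumptions.

```python
import numpy as np

def decoder_step(x, skip):
    """One decoder step: 2x nearest-neighbour upsampling of the feature map,
    then concatenation with the matching encoder feature map (skip connection)."""
    # (H, W, C) -> (2H, 2W, C) by repeating rows and columns.
    x = np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)
    assert x.shape[:2] == skip.shape[:2], "skip must match upsampled resolution"
    # Channels add up; the following convolution would reduce them again.
    return np.concatenate([x, skip], axis=-1)
```

This channel bookkeeping is why each decoder block must be paired with the encoder block at the matching resolution.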
8. EfficientNet-B3
• The EfficientNet-B3 architecture [4] results from a compound scaling method applied to the baseline
network EfficientNet-B0, uniformly scaling all three dimensions (depth, width, and resolution) with a fixed ratio.
• EfficientNet-B3 specifically refers to a particular configuration of the EfficientNet architecture, where the
depth, width, and resolution of the blocks are optimized to balance between accuracy and efficiency. It's
characterized by deeper and wider layers compared to smaller variants like EfficientNet-B0, but not as
heavy as larger variants like EfficientNet-B7.
• The encoder component of PatchEUNet is based on the EfficientNet-B3 architecture, which is known for its
efficiency and effectiveness in image classification tasks.
• It consists of seven blocks that hierarchically downsample the input image while preserving important
features. Each block contains a sequence of layers, including convolutional layers and activation
functions, that efficiently process the input data and extract relevant features. By utilizing
EfficientNet-B3 as the encoder, PatchEUNet benefits from its ability to capture rich contextual
information from the input whole-slide images.
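The compound scaling mentioned above can be written out explicitly. The coefficients below (alpha = 1.2, beta = 1.1, gamma = 1.15) come from the EfficientNet paper's grid search on B0, with B3 corresponding roughly to phi = 3; the released B3 model rounds these multipliers slightly, so treat this as a sketch of the scaling rule, not the exact B3 configuration.

```python
# Compound scaling: depth scales as alpha^phi, width as beta^phi,
# resolution as gamma^phi, chosen so that alpha * beta**2 * gamma**2 ~= 2
# (i.e., each unit of phi roughly doubles the FLOPs).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15

def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for scaling factor phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi
```

For example, going from B0 (phi = 0) to roughly B3 (phi = 3) makes the network about 1.73x deeper and 1.33x wider, at about 1.52x the input resolution.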