INDIAN INSTITUTE OF TECHNOLOGY KHARAGPUR
TOPIC: IMAGE SEGMENTATION USING U-Net
INTRODUCTION
Image segmentation is the task of partitioning an image based on the
objects present and their semantic importance. This makes the image much
easier to analyze: instead of getting an approximate location from a
rectangular box, we get the exact pixel-wise location of the objects.
WHAT IS SEMANTIC SEGMENTATION
Image classification
CLASSIFICATION WITH LOCALIZATION
In localization, along with the discrete label, we also expect the computer
to localize where exactly the object is present in the image.
OBJECT DETECTION
Object detection extends localization to the next level, where images are no longer
constrained to have a single object, but can contain multiple objects. The task is to
classify and locate all the objects in the image.
SEMANTIC SEGMENTATION
The goal of semantic image segmentation is to label each pixel of an image with a
corresponding class of what is being represented. Because we’re predicting for
every pixel in the image, this task is commonly referred to as dense prediction.
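Dense prediction can be illustrated with a minimal sketch: given one score per class for every pixel, each pixel is assigned the class with the highest score. The function name `label_pixels` and the score values are hypothetical and chosen only for illustration; a real network would produce the scores.

```python
def label_pixels(scores):
    """scores[c][y][x] is the score of class c at pixel (y, x).
    Returns a label map with one class index per pixel."""
    num_classes = len(scores)
    height = len(scores[0])
    width = len(scores[0][0])
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            # Pick the class whose score is largest at this pixel.
            best = max(range(num_classes), key=lambda c: scores[c][y][x])
            row.append(best)
        labels.append(row)
    return labels


# Made-up scores for 2 classes (0 = background, 1 = object) on a 2x3 image.
scores = [
    [[0.9, 0.2, 0.1], [0.8, 0.7, 0.3]],  # class 0
    [[0.1, 0.8, 0.9], [0.2, 0.3, 0.7]],  # class 1
]
label_map = label_pixels(scores)
print(label_map)  # [[0, 1, 1], [0, 0, 1]]
```

The output has the same height and width as the input, which is what separates segmentation from classification (one label per image) and detection (one box per object).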
2D-CONVOLUTION
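A 2D convolution slides a small kernel over the image and, at each position, sums the element-wise products of the kernel and the patch under it. The sketch below is a single-channel version in plain Python, written as a cross-correlation (no kernel flip), which is the convention deep learning libraries use. The function name `conv2d` and the example values are assumptions made for illustration.

```python
def conv2d(image, kernel, stride=1, padding=0):
    """Single-channel 2D convolution (cross-correlation) with zero padding."""
    n, m = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    # Build a zero-padded copy of the image.
    padded = [[0.0] * (m + 2 * padding) for _ in range(n + 2 * padding)]
    for y in range(n):
        for x in range(m):
            padded[y + padding][x + padding] = image[y][x]
    # Output size: (input - kernel + 2 * padding) / stride + 1
    out_h = (n - kh + 2 * padding) // stride + 1
    out_w = (m - kw + 2 * padding) // stride + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    acc += padded[i * stride + a][j * stride + b] * kernel[a][b]
            row.append(acc)
        out.append(row)
    return out


image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [0, 1, 2, 3],
]
kernel = [
    [1, 0],
    [0, -1],
]
feature_map = conv2d(image, kernel)
print(feature_map)  # [[-4.0, -4.0, 2.0], [-4.0, -4.0, 4.0], [6.0, 6.0, 6.0]]
```

A 4x4 input and a 2x2 kernel with no padding and stride 1 give a 3x3 output, which matches the output-size expression in the code comment.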
POOLING LAYER
UN-POOLING
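There are three methods: 1) max un-pool 2) nearest neighbour 3) bed of nails. I
discuss only max un-pool, because it reuses the positions selected by the max
operation of the matching pooling layer (nn.MaxUnpool2d in PyTorch;
nn.ConvTranspose2d is a learned transposed convolution, not an un-pool).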
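The three un-pooling methods listed above can be compared on a small example. The sketch below is plain Python rather than a library call, and all function names are hypothetical. Max un-pooling needs the positions of the maxima remembered from the pooling step; nearest neighbour copies each value into a 2x2 block; bed of nails puts each value in the top-left corner of its block and zeros elsewhere.

```python
def max_pool_2x2(x):
    """2x2 max pooling, stride 2. Also returns where each maximum came from."""
    pooled, positions = [], []
    for i in range(0, len(x) - 1, 2):
        pooled_row, position_row = [], []
        for j in range(0, len(x[0]) - 1, 2):
            window = [(x[i + a][j + b], (i + a, j + b))
                      for a in (0, 1) for b in (0, 1)]
            value, position = max(window, key=lambda item: item[0])
            pooled_row.append(value)
            position_row.append(position)
        pooled.append(pooled_row)
        positions.append(position_row)
    return pooled, positions


def max_unpool_2x2(pooled, positions, height, width):
    """Write each pooled value back to the position its maximum came from."""
    out = [[0] * width for _ in range(height)]
    for pooled_row, position_row in zip(pooled, positions):
        for value, (r, c) in zip(pooled_row, position_row):
            out[r][c] = value
    return out


def nearest_neighbour_2x2(pooled):
    """Copy each value into a full 2x2 block."""
    out = []
    for row in pooled:
        wide = [v for v in row for _ in (0, 1)]
        out.append(wide)
        out.append(list(wide))
    return out


def bed_of_nails_2x2(pooled):
    """Put each value in the top-left corner of its 2x2 block, zeros elsewhere."""
    out = []
    for row in pooled:
        nails = []
        for v in row:
            nails.extend([v, 0])
        out.append(nails)
        out.append([0] * len(nails))
    return out


x = [
    [1, 2, 6, 3],
    [3, 5, 2, 1],
    [1, 2, 2, 1],
    [7, 3, 4, 8],
]
pooled, positions = max_pool_2x2(x)
unpooled = max_unpool_2x2(pooled, positions, 4, 4)
print(pooled)    # [[5, 6], [7, 8]]
print(unpooled)  # [[0, 0, 6, 0], [0, 5, 0, 0], [0, 0, 0, 0], [7, 0, 0, 8]]
```

Only max un-pooling restores each value at the location it originally had, which helps keep object boundaries in place during up-sampling.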
MAX UN-POOL AND CONVOLUTION COMBINED OPERATION
U-NET ARCHITECTURE
SEG-NET ARCHITECTURE
1) Conv 3*3 ReLU 2) Max-Pool 2*2 3) up_conv 2*2
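The three layer types listed above are enough to trace how the spatial size of a feature map changes through a U-shaped encoder-decoder. The sketch below assumes the unpadded layout of the original U-Net paper (two 3x3 convolutions per level, four down steps, a 572x572 input), which these slides do not state. The function names are hypothetical, and only sizes are tracked, not pixel values.

```python
def conv3x3_valid(size):
    # A 3x3 convolution without padding removes one pixel on each side.
    return size - 2


def max_pool_2x2(size):
    # 2x2 max pooling with stride 2 halves the size.
    return size // 2


def up_conv_2x2(size):
    # A 2x2 up-convolution with stride 2 doubles the size.
    return size * 2


def unet_output_size(size, depth=4):
    """Trace the spatial size through an unpadded U-Net with `depth` levels."""
    skip_sizes = []
    for _ in range(depth):                        # contracting path
        size = conv3x3_valid(conv3x3_valid(size))
        skip_sizes.append(size)                   # saved for the skip connection
        size = max_pool_2x2(size)
    size = conv3x3_valid(conv3x3_valid(size))     # bottom of the "U"
    for skip in reversed(skip_sizes):             # expanding path
        size = up_conv_2x2(size)
        # The skip feature map is larger and is cropped to `size` before
        # being concatenated, so it does not change the spatial size.
        if skip < size:
            raise ValueError("skip feature map is smaller than the decoder map")
        size = conv3x3_valid(conv3x3_valid(size))
    return size


output_size = unet_output_size(572)
print(output_size)  # 388
```

With padded convolutions the output would keep the input size; the shrinkage from 572 to 388 comes only from the unpadded 3x3 convolutions assumed here.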
IMPLEMENTATION USING PYTHON
COMPARISON BETWEEN U-NET AND SEGNET
Two comparisons are discussed here: 1) efficiency (FLOPs) 2) accuracy (training loss)
1) FLOPs:
Convolutions: FLOPs = 2 x Number of Kernels x Kernel Shape x Output Shape
= 2 * c * w * h * ((n - w + 2*p)/s + 1) * ((m - h + 2*p)/s + 1)
where c is the number of kernels (the depth of the feature map), w x h the
kernel size, n x m the input size, p the padding and s the stride.
Pooling layers: FLOPs = Height x Depth x Width of an image
= 2 * n * m * c
So the FLOPs for U-Net are 2,251,171,840 and the FLOPs for SegNet are 5,412,076,480.
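The two formulas above can be written as small functions and evaluated for one layer. This is a sketch that follows the slide formulas literally (they contain no input-channel term), reading c as the number of kernels, w x h as the kernel size, n x m as the input size, p as the padding and s as the stride. The function names and the example layer are assumptions; the sketch does not reproduce the network totals quoted above, which depend on layer configurations not given in the slides.

```python
def conv_output_size(n, k, p, s):
    # (n - k + 2p) / s + 1, using integer division for the layer size.
    return (n - k + 2 * p) // s + 1


def conv_flops(c, w, h, n, m, p=0, s=1):
    """FLOPs of one convolution layer, following the slide formula."""
    out_n = conv_output_size(n, w, p, s)
    out_m = conv_output_size(m, h, p, s)
    return 2 * c * w * h * out_n * out_m


def pool_flops(n, m, c):
    """FLOPs of one pooling layer, following the slide formula."""
    return 2 * n * m * c


# Hypothetical layer: 64 kernels of size 3x3 on a 224x224 input, padding 1, stride 1.
layer_conv_flops = conv_flops(c=64, w=3, h=3, n=224, m=224, p=1, s=1)
layer_pool_flops = pool_flops(n=224, m=224, c=64)
print(layer_conv_flops)  # 57802752
print(layer_pool_flops)  # 6422528
```

Summing these per-layer values over every convolution and pooling layer of a network gives its total under the same formulas.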
2) Accuracy (training loss):
U-Net:
Epoch 1/3: Training loss: 0.1033
Epoch 2/3: Training loss: 0.0973
SegNet:
Epoch 1/3: Training loss: 0.0215
Epoch 2/3: Training loss: 0.0215
CONCLUSION
The U-Net architecture is one of the most influential
architectures for image segmentation in deep learning.
Although the research paper that introduced U-Net targeted
biomedical image segmentation, the architecture is not
limited to that single application and is still used for a
wide range of segmentation problems. Some elements of the
original architecture are now outdated, and several
variations of it have been proposed. These include
LadderNet, U-Net with attention, the recurrent and
residual convolutional U-Net (R2U-Net), and other similar
networks derived from the original U-Net model.
THANK YOU
