2. INTRODUCTION
Image segmentation is the task of partitioning an image into regions based on the
objects present and their semantic importance. This makes the image much
easier to analyze: instead of getting an approximate location from a
rectangular box, we get the exact pixel-wise location of the objects.
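To make the contrast between a rectangular box and a pixel-wise location concrete, here is a minimal sketch in plain Python. The toy mask and the helper name `bounding_box` are illustrative assumptions, not taken from any dataset or library. It compares how many pixels the object actually covers with the area of the tightest box around it.

```python
# Toy binary mask (1 = object pixel); the values are made up for illustration.
mask = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
]

def bounding_box(mask):
    """Return (top, left, bottom, right) of the tightest box around all 1-pixels."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]

top, left, bottom, right = bounding_box(mask)
box_area = (bottom - top + 1) * (right - left + 1)  # pixels inside the rectangle
mask_area = sum(sum(row) for row in mask)           # pixels that belong to the object

print("box:", (top, left, bottom, right))
print("box area:", box_area, "mask area:", mask_area)
```

In this toy case the box also covers pixels that are not part of the object, which is the imprecision that segmentation removes.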
4. CLASSIFICATION WITH LOCALIZATION
In localization, along with the discrete label, we also expect the computer
to localize where exactly the object is present in the image.
5. OBJECT DETECTION
Object detection extends localization to the next level, where images are no longer
constrained to have a single object, but can contain multiple objects. The task is to
classify and locate all the objects in the image.
6. SEMANTIC SEGMENTATION
The goal of semantic image segmentation is to label each pixel of an image with a
corresponding class of what is being represented. Because we’re predicting for
every pixel in the image, this task is commonly referred to as dense prediction.
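Dense prediction can be sketched in a few lines of plain Python. The class names, the score values, and the helper `dense_predict` below are hypothetical and only serve the illustration: given one score map per class, each pixel receives the class with the highest score.

```python
# Hypothetical per-pixel class scores for a 2x3 image and 3 classes.
# scores[k][i][j] is the score of class k at pixel (i, j); values are made up.
CLASSES = ["background", "cat", "dog"]  # illustrative class names
scores = [
    [[0.90, 0.2, 0.1], [0.8, 0.1, 0.3]],  # background
    [[0.05, 0.7, 0.2], [0.1, 0.8, 0.1]],  # cat
    [[0.05, 0.1, 0.7], [0.1, 0.1, 0.6]],  # dog
]

def dense_predict(scores):
    """Assign each pixel the index of its highest-scoring class."""
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda k: scores[k][i][j]) for j in range(width)]
        for i in range(height)
    ]

label_map = dense_predict(scores)
for row in label_map:
    print([CLASSES[k] for k in row])
```

The output has the same height and width as the input image, with one label per pixel, which is what distinguishes this task from predicting a single label for the whole image.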
9. UN-POOLING
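There are three methods: 1) max un-pooling, 2) nearest neighbour, 3) bed of nails. I
discuss only max un-pooling, which puts each value back at the position of the maximum
recorded by the matching max-pooling layer (nn.MaxUnpool2d in PyTorch). Note that
nn.ConvTranspose2d is a learned transposed convolution and does not use a max operation.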
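A minimal sketch of max un-pooling in plain Python, assuming a 2x2 window with stride 2 and an input whose height and width are even. The function names are illustrative; in PyTorch the corresponding pairing is nn.MaxPool2d with return_indices=True followed by nn.MaxUnpool2d.

```python
def max_pool_2x2(x):
    """2x2 max pooling with stride 2; also returns the (row, col) of each maximum.
    Assumes the height and width of x are even."""
    pooled, indices = [], []
    for i in range(0, len(x), 2):
        pooled_row, index_row = [], []
        for j in range(0, len(x[0]), 2):
            # Collect the four values of the window together with their positions.
            window = [(x[i + di][j + dj], (i + di, j + dj))
                      for di in range(2) for dj in range(2)]
            value, position = max(window, key=lambda item: item[0])
            pooled_row.append(value)
            index_row.append(position)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices

def max_unpool_2x2(pooled, indices, height, width):
    """Place each pooled value back at its recorded position; fill the rest with 0."""
    out = [[0] * width for _ in range(height)]
    for pooled_row, index_row in zip(pooled, indices):
        for value, (r, c) in zip(pooled_row, index_row):
            out[r][c] = value
    return out

x = [
    [1, 2, 6, 3],
    [3, 5, 2, 1],
    [1, 2, 2, 1],
    [7, 3, 4, 8],
]
pooled, indices = max_pool_2x2(x)
restored = max_unpool_2x2(pooled, indices, 4, 4)
for row in restored:
    print(row)
```

The restored map has the original size, but only the positions of the maxima carry values; everything else is zero and has to be filled in by the convolutions that follow in the decoder.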
14. COMPARISON BETWEEN U-NET AND SEGNET
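Here we compare the two networks on two criteria: 1) efficiency (FLOPs) and 2) accuracy (loss).
1) FLOPs:
Convolution layers: FLOPs = 2 x Number of Kernels x Kernel Shape x Output Shape
= 2 * c * w * h * ((n - w + 2*p)/s + 1) * ((m - h + 2*p)/s + 1)
where c is the number of kernels, w x h the kernel size, n x m the input size,
p the padding and s the stride.
Pooling layers: FLOPs = 2 * n * m * c, which scales with Height x Width x Depth of the image
(here c is the depth of the image).
With these formulas, the FLOPs for U-Net are 2,251,171,840 and the FLOPs for SegNet are 5,412,076,480.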
2) Accuracy (training loss):
U-Net:
Epoch 1/3: Training loss: 0.1033
Epoch 2/3: Training loss: 0.0973
SegNet:
Epoch 1/3: Training loss: 0.0215
Epoch 2/3: Training loss: 0.0215
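The per-layer FLOPs formulas from item 1) can be sketched in plain Python as follows. The function names and the example layer (64 kernels of size 3x3 on a 224x224 input) are assumptions chosen for illustration; they are not the layer configurations behind the totals reported above. The sketch follows the convolution formula as written, which does not multiply by the number of input channels.

```python
def conv_output_size(n, k, p, s):
    """Output length along one axis: (n - k + 2p) / s + 1, using integer division."""
    return (n - k + 2 * p) // s + 1

def conv_flops(c, w, h, n, m, p=0, s=1):
    """FLOPs of one convolution layer, following the formula in item 1).
    c: number of kernels, w x h: kernel size, n x m: input size,
    p: padding, s: stride. Input channels are not part of this formula."""
    out_w = conv_output_size(n, w, p, s)
    out_h = conv_output_size(m, h, p, s)
    return 2 * c * w * h * out_w * out_h

def pool_flops(n, m, c):
    """FLOPs of one pooling layer on an n x m input of depth c: 2 * n * m * c."""
    return 2 * n * m * c

# Hypothetical layer, for illustration only.
example_conv = conv_flops(c=64, w=3, h=3, n=224, m=224, p=1, s=1)
example_pool = pool_flops(n=224, m=224, c=64)
print("conv FLOPs:", example_conv)
print("pool FLOPs:", example_pool)
```

Summing such per-layer values over every layer of a network gives a total comparable to the figures quoted for U-Net and SegNet.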
15. CONCLUSION
The U-Net architecture is one of the most significant
landmarks in the field of deep learning.
Although the research paper that introduced the U-Net
architecture targeted biomedical image
segmentation, the model is not limited to this single
application and is still applied to a wide range of
segmentation problems. Some elements of the
original architecture are now outdated, but
several variations build on it. These include
LadderNet, U-Net with attention, the recurrent and
residual convolutional U-Net (R2-UNet), and other similar
networks derived from the original
U-Net model.