U-Net architecture is an encoder-decoder convolutional neural network designed for biomedical image segmentation. It can perform accurate segmentation with small training datasets by leveraging features at multiple image scales. The paper presents the U-Net architecture and demonstrates its high performance on medical image segmentation tasks compared to other models, despite using less training data. Mask R-CNN is also discussed as an architecture that can perform fast object detection and instance segmentation. It outperforms other models on the COCO dataset through its unified framework for object detection and segmentation.
1. Image Segmentation Using Machine Learning
Presented by
Shreshtha Aggarwal
(1BM19CS155)
Under the guidance of
Prof. Sunayana S
Assistant Professor
Department of Computer Science and Engineering
BMS College of Engineering, Bengaluru
3. 1.1. Overview
As connectivity increases, so does the amount of data being transferred, and all kinds of data are now shared: images, voice and more. A computer can only understand numbers, and even humans may not perceive every hidden aspect of an image. To help computers understand images, the process of image segmentation was developed: partitioning a digital image into multiple segments.
The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze. This helps the computer understand the image and also helps humans analyze it. In this work we use two such architectures, known as U-Net and Mask R-CNN.
4. 1.2. Motivation
With increasing technology and modernization, the amount of data has grown. This data contains text, images and much more. Since a computer can only understand numbers, making it understand images used to be a hefty task. With the boom of machine learning and deep learning, it has become far easier for a computer to understand images and even analyze and classify them.
A few tools and architectures, such as the U-Net architecture and Mask R-CNN, have helped in this task. How these architectures work and classify images is the motivation here.
5. 1.3. Objectives
The objectives of our project are simple, and there are two of them. First, since a computer can only understand numbers, we want to make the computer understand images; the release of PyTorch and TensorFlow has helped us achieve this. This is done by understanding what exactly images are made of and how a computer perceives them. It is one of the technologies that helps the digital world interact with the physical world. The second objective is to understand how these technologies work and the various architectures they use. We will therefore demonstrate the following architectures and show how they work.
7. U-Net and Its Variants for Medical Image Segmentation: A Review of Theory and Applications
Authors: Nahian Siddique, Sidike Paheding, Colin P. Elkin and Vijay Devabhaktuni
This paper aims to provide a starting point for researchers who wish to explore U-Net, a powerful deep learning model used extensively for medical image segmentation. It begins with the definition of U-Net and then explores its many variants and their diverse applications across a multitude of image modalities. It also examines the major deep learning methods and their application areas for all of the papers in the survey. The U-Net-based architecture is quite groundbreaking and valuable in medical image analysis, and the growth of U-Net papers since 2017 lends credence to its status as a premier deep learning technique in medical image diagnosis. Thus, despite the many challenges and limitations remaining in deep-learning-based image analysis, the authors expect U-Net to be one of the major paths forward.
8. U-Net: Convolutional Networks for Biomedical Image Segmentation
Authors: Olaf Ronneberger, Philipp Fischer, and Thomas Brox
This paper introduces the U-Net architecture and discusses its design as first proposed. Experimental results show that it achieves higher accuracy than other models despite using a smaller dataset, and demonstrate its benefits for the medical field. The U-Net architecture achieves very good performance on very different biomedical segmentation applications. Thanks to data augmentation with elastic deformations, it needs only very few annotated images and has a very reasonable training time of only 10 hours on an NVidia Titan GPU (6 GB). The authors are confident that the U-Net architecture can be applied easily to many more tasks.
9. Mask R-CNN
Authors: Kaiming He, Georgia Gkioxari, Piotr Dollar and Ross Girshick
This paper presents Mask R-CNN and discusses its application and workings. Mask R-CNN extends Faster R-CNN to perform instance segmentation. The paper covers many Mask R-CNN applications and implementation details, and compares the model with many other architectures, such as FCIS+++, which it is shown to outperform. The experiments were done on the COCO 2015 and 2016 datasets. One thing is clear from the paper: the architecture can be used for many more things.
10. Classification of Image using CNN
Authors: Md. Anwar Hossain and Md. Shahriar Alam Sajib
This paper focuses on convolutional neural networks, currently the state-of-the-art technique for image classification. The authors used the CIFAR-10 dataset, which has around 60,000 RGB images in ten classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship and truck. The images are 32x32 pixels, and the dataset consists of 50,000 training and 10,000 testing examples. The optimization algorithm used is Stochastic Gradient Descent (SGD), a variation of the Gradient Descent (GD) algorithm. Upon successful training, the model misclassified a total of 661 of the 10,118 test cases after three hundred epochs, which corresponds to a 93.47% recognition rate. The authors also found that the higher the number of epochs, the higher the accuracy.
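To make the setup concrete, here is a minimal Keras sketch of a CIFAR-10 classifier trained with SGD in the spirit of the paper; the layer sizes, learning rate and epoch count are illustrative assumptions rather than the authors' exact configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# CIFAR-10: 50,000 training and 10,000 test RGB images of size 32x32.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixels to [0, 1]

model = models.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(32, 32, 3)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),  # one output per CIFAR-10 class
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))
```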
12. Algorithms and techniques were learnt and used in the implementation. Some of the techniques are listed below, and a short data-augmentation sketch follows the list.
● Image processing
● Data Augmentation
● Plotting images using matplotlib
● Transfer Learning
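As promised above, a minimal sketch of data augmentation using Keras preprocessing layers; the particular transforms and their ranges are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Random transforms applied on the fly, so each epoch sees slightly different images.
augment = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),  # rotate up to +/-10% of a full turn
    layers.RandomZoom(0.1),
])

# Typical use inside a tf.data input pipeline:
# dataset = dataset.map(lambda x, y: (augment(x, training=True), y))
```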
13. Algorithms that were learnt are listed below:
● Gradient Descent
● Stochastic Gradient Descent
● Backpropagation
The algorithms and the techniques go hand in hand for efficient and fast execution. Techniques help reduce the complexity of the data on which the algorithms work. They also get rid of some extreme edge cases that might crash our program or increase development time. Techniques are thus equally important to algorithms; a toy sketch of the SGD update follows.
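A toy NumPy sketch of the stochastic gradient descent update listed above, on a made-up least-squares problem; every value here is illustrative.

```python
import numpy as np

def grad(w, x, y):
    # Gradient of the squared error (w.x - y)^2 with respect to w, for one sample.
    return 2.0 * (w @ x - y) * x

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))           # toy inputs
y = X @ np.array([1.0, -2.0, 0.5])      # targets from a known weight vector
w = np.zeros(3)                         # parameters to learn

lr = 0.05                               # learning rate
for epoch in range(20):
    for i in rng.permutation(len(X)):   # one sample at a time: the "stochastic" part
        w -= lr * grad(w, X[i], y[i])   # gradient step
print(w)                                # approaches [1.0, -2.0, 0.5]
```

Plain gradient descent would average the gradient over all 100 samples before each step; SGD trades that exactness for far cheaper, noisier updates.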
14. 4. Tools used
The tools used to learn about and implement the CNN model were mainly frameworks and packages that are open source and implemented in Python. These packages helped us get a faster grasp of the field and eased the task of understanding and solving the problem, including some trivial sub-problems. Yet one should not just use the packages but must also understand how they work: the under-the-hood mechanisms must be learnt so that any future problem requiring a deeper understanding of the field can be solved.
Tools that were used were:
● TensorFlow
TensorFlow is an end-to-end open-source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries and community resources that lets researchers push the state of the art in ML and developers easily build and deploy ML-powered applications.
15. ● Matplotlib
Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. Matplotlib makes easy things easy and hard things possible.
● NumPy
NumPy is a package used for making efficient multi-dimensional arrays. It also includes many features such as random number generation and generating evenly spaced values over a range. Basically, this package is used to work with numbers and arrays.
These were the tools that were selected for the implementation.
The modules were implemented on Kaggle.
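A small usage sketch tying the listed tools together: NumPy generates evenly spaced values and random noise, and Matplotlib plots the result (a training-loss-style curve is used purely as an illustration).

```python
import numpy as np
import matplotlib.pyplot as plt

epochs = np.linspace(1, 30, 30)  # 30 evenly spaced values from 1 to 30
rng = np.random.default_rng(0)
loss = np.exp(-epochs / 10) + 0.05 * rng.normal(size=epochs.size)  # fake loss curve

plt.plot(epochs, loss, label="training loss")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.show()
```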
17. MODULE 1: UNET – ARCHITECTURE
U-Net is a deep learning image segmentation architecture introduced by Olaf Ronneberger, Philipp Fischer and Thomas Brox in 2015. It is an encoder-decoder convolutional neural network that was specifically designed to be used in biomedical imaging.
The main goal of this architecture was to tackle two important issues in the field of medical imaging:
1. Lack of large training datasets.
Traditional convolutional neural networks with fully connected layers require large datasets because of the large number of parameters to be learnt. Since medical imaging has small datasets, this architecture ensures maximum learning from the information provided: the fully connected layer is replaced with a series of up-convolutions on the decoder side.
18. 2. Capturing context accurately at different resolutions and scales.
The U-shaped design consists of two parts. The left side is known as the contracting path or encoder path, where repeated typical convolutions are applied, each followed by a ReLU and a max pooling operation. The right side is known as the expansive path, which has transposed 2D convolutional layers that perform the upsampling.
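A minimal Keras sketch of this encoder-decoder pattern (contracting path, expansive path with transposed convolutions, and skip connections); the depth, filter counts and input size are pared-down assumptions, not the full U-Net of the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, filters):
    # Two 3x3 convolutions with ReLU, the basic unit on both sides of the "U".
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

inputs = layers.Input((128, 128, 1))
c1 = conv_block(inputs, 32)                      # contracting path
p1 = layers.MaxPooling2D()(c1)
c2 = conv_block(p1, 64)
p2 = layers.MaxPooling2D()(c2)
b = conv_block(p2, 128)                          # bottleneck

u2 = layers.Conv2DTranspose(64, 2, strides=2, padding="same")(b)   # expansive path
c3 = conv_block(layers.concatenate([u2, c2]), 64)                  # skip connection
u1 = layers.Conv2DTranspose(32, 2, strides=2, padding="same")(c3)
c4 = conv_block(layers.concatenate([u1, c1]), 32)
outputs = layers.Conv2D(1, 1, activation="sigmoid")(c4)            # per-pixel mask

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy")
```

The skip connections from each encoder stage to the matching decoder stage are what let the network combine coarse context with fine localization.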
19. Input:
Fig. 1 Input image with its actual mask
20. Output:
Fig. 2 Input image with its actual mask and the mask predicted by the U-Net architecture
21. Graph output:
Fig. 3 Training and validation loss
22. MODULE 2: MASK-RCNN
Modern image segmentation methods use dilated convolutions at their core to extract high-resolution features. This architecture is used for instance segmentation: it extends Faster R-CNN (an architecture proposed by Shaoqing Ren et al. to eliminate selective search and allow the network to learn region proposals) by adding an object mask predictor as a parallel branch to bounding box recognition.
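A minimal inference sketch using the pre-trained Mask R-CNN that ships with torchvision (ResNet-50 FPN backbone); this off-the-shelf model stands in for the module's own setup, and "image.jpg" and the 0.5 score threshold are placeholder assumptions.

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Off-the-shelf Mask R-CNN pre-trained on COCO (assumes a recent torchvision).
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

img = to_tensor(Image.open("image.jpg").convert("RGB"))  # placeholder path
with torch.no_grad():
    pred = model([img])[0]  # dict with boxes, labels, scores and masks

keep = pred["scores"] > 0.5          # drop low-confidence detections
masks = pred["masks"][keep]          # (N, 1, H, W) soft masks in [0, 1]
print(f"{int(keep.sum())} instances above the 0.5 score threshold")
```

Note how the output carries a per-instance mask alongside each box and label: that parallel mask branch is exactly what distinguishes Mask R-CNN from Faster R-CNN.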
25. References
1. N. Siddique, S. Paheding, C. P. Elkin and V. Devabhaktuni, "U-Net and Its Variants for Medical Image Segmentation: A Review of Theory and Applications," IEEE Access, vol. 9, pp. 82031-82057, 2021.
2. O. Ronneberger, P. Fischer and T. Brox, "U-Net: Convolutional Networks for Biomedical Image Segmentation," in N. Navab, J. Hornegger, W. Wells and A. Frangi (eds), Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, Lecture Notes in Computer Science, vol. 9351, Springer, Cham, 2015.
3. K. He, G. Gkioxari, P. Dollar and R. Girshick, "Mask R-CNN," 2017 IEEE International Conference on Computer Vision (ICCV), 2017.
4. Md. Anwar Hossain and Md. Shahriar Alam Sajib, "Classification of Image using Convolutional Neural Network (CNN)," Global Journal of Computer Science and Technology, vol. 19, no. D2, pp. 13-18, 2019. Retrieved from https://computerresearch.org/index.php/computer/article/view/1821