This document describes a waste classification system that uses a convolutional neural network to classify waste items into different categories. The system was developed by a group of students under the guidance of Dr. Divya Kumar. The objective is to effectively segregate waste using image processing and artificial neural networks. The system classifies waste images into 5 categories (metal, organic, container, paper, plastic) with over 98% accuracy. It uses a CNN model with convolutional, pooling and fully connected layers to perform the classification. The model was trained on a manually collected dataset of over 3,300 images.
3. Group Members
▶ Abhinav Dixit (20151012)
▶ Ishaan Rajput (20154086)
▶ Abhishek Sharma (20154077)
▶ Harshita Rastogi (20154041)
▶ John Prasad (20154010)
4. Objective
▶ To develop a system to effectively segregate the collected waste on the basis
of different categories using the concepts of artificial neural networks and
image processing.
▶ To create a Convolutional Neural Network (CNN) model, a powerful
deep-layered network that performs the classification, and to tune its
various parameters to achieve the highest accuracy.
▶ To help eliminate the need for a middleman in the treatment of waste.
5. Motivation
▶ By designing an autonomous system we can reduce the workforce required for
waste management and also perform the task efficiently.
▶ We can classify the waste into one of three categories – recyclable,
compost, and landfill (non-recyclable waste).
▶ Hazardous waste materials which cause harm to humans can easily be
disposed of.
6. Proposed Work
▶ To carry out the image classification of waste materials using Convolutional Neural
Networks (CNN). Given an image of a random waste item, the system seeks to
categorize it into one of the 5 categories listed below.
▶ Categories for classification:
▶ 1. Metal
▶ 2. Organic waste
▶ 3. Container
▶ 4. Paper
▶ 5. Plastic
7. Artificial Neural Networks
▶ An ANN is a computational model based on the structure and functions of
biological neural networks.
▶ Information that flows through the network affects the structure of the ANN
because a neural network changes - or learns, in a sense - based on that input
and output.
▶ There are weighted connections (corresponding to synapses) between
simulated neurons; each neuron sums the signals (numbers) it receives.
▶ A signal is sent (fired) if a certain threshold is reached.
8. Artificial Neural Networks
▶ An ANN is typically defined by three types of parameters:
▶ The interconnection pattern between different layers of neurons
▶ The learning process for updating the weights of the interconnections.
▶ The activation function that converts a neuron's weighted input to its output
activation.
9. Artificial Neural Networks
▶ The model learns by repeating the following steps:
▶ Feed Forward Algorithm – The input layer of the neural network is fed with samples.
These sample values are multiplied by the corresponding weights and summed.
The bias is added to the resulting sum, and the sum is passed to the
activation function.
▶ After the feedforward pass, the output of the neural network is compared with
the target output. The difference between the expected and actual output of a
neuron gives that neuron's error.
▶ Back Propagation Algorithm – The backpropagation algorithm is applied after the
feedforward pass in order to propagate the errors in the direction opposite to
the feedforward flow, adjusting the weights to reduce the error.
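The feedforward/backpropagation cycle above can be sketched from scratch in a few lines of NumPy. The XOR toy data, layer sizes, and learning rate here are illustrative assumptions, not details from the project itself.

```python
import numpy as np

# Minimal from-scratch sketch of one feedforward / backpropagation cycle.
rng = np.random.default_rng(0)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # toy inputs (XOR)
y = np.array([[0.], [1.], [1.], [0.]])                  # target outputs

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # hidden layer
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # output layer
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 1.0

losses = []
for _ in range(2000):
    # Feedforward: multiply samples by weights, add bias, apply activation.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    err = out - y                        # actual output minus target output
    losses.append(float((err ** 2).mean()))
    # Backpropagation: push the error backwards and adjust the weights.
    d_out = err * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)
```

Each iteration performs one feedforward pass, measures the error against the target, and uses backpropagation to nudge every weight in the direction that reduces that error.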
10. Convolutional Neural Network (CNN)
▶ Why CNN?
▶ Image recognition is not an easy task to achieve. In theory, we can use
conventional neural networks for analyzing images, but in practice this is
highly expensive from a computational perspective.
▶ For instance, even a modestly sized image, e.g. 200x200x3, would require
every fully connected neuron to have 200 * 200 * 3 = 120,000 weights.
▶ Convolutional Neural Networks take advantage of the fact that the input consists
of images and they constrain the architecture in a more sensible way. The layers of
a ConvNet have neurons arranged in 3 dimensions: width, height, depth.
▶ The neurons in a layer will only be connected to a small region of the layer before
it, instead of all of the neurons in a fully-connected manner.
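The weight-count argument above is simple arithmetic; as a sketch, here is the fully connected cost next to the cost of a single shared convolutional filter (the 11 x 11 kernel size matches the first layer described later in this model):

```python
# Why CNNs: a fully connected neuron sees every pixel, while a conv filter
# has a small, shared set of weights slid across the image.
h, w, c = 200, 200, 3
fc_weights_per_neuron = h * w * c    # 200 * 200 * 3 = 120,000 weights
conv_filter_weights = 11 * 11 * c    # one 11x11x3 filter: 363 shared weights
```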
11. CNN - Structure
▶ A CNN comprises the following layers –
▶ Convolution Layer (With ReLU)
▶ Pooling Layer
▶ Flattening
▶ Followed by fully connected ANN
12. Convolution Layer
▶ The CONV layer’s parameters consist of a set of learnable filters.
▶ During the forward pass, we convolve each filter across the width and height
of the input volume and compute dot products between the entries of the
filter and the input at any position.
▶ As we slide the filter over the width and height of the input volume we will
produce a 2-dimensional activation map that gives the responses of that filter
at every spatial position.
▶ Intuitively, the network will learn filters that activate when they see some
type of visual feature such as an edge of some orientation, color patterns,
etc. on higher layers of the network.
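The forward pass described above can be sketched directly: slide a filter over the input and take a dot product at every position to build the 2-D activation map. The vertical-edge filter and toy image below are illustrative assumptions.

```python
import numpy as np

# One forward-pass convolution: dot product of the filter with every patch.
def activation_map(image, filt, stride=1):
    k = filt.shape[0]
    out = (image.shape[0] - k) // stride + 1
    fmap = np.zeros((out, out))
    for i in range(out):
        for j in range(out):
            patch = image[i*stride:i*stride+k, j*stride:j*stride+k]
            fmap[i, j] = np.sum(patch * filt)  # filter response at (i, j)
    return fmap

# A vertical-edge filter responds where intensity changes left to right.
edge = np.array([[1., 0., -1.]] * 3)
img = np.zeros((5, 5)); img[:, 3:] = 1.0   # dark left half, bright right half
fmap = activation_map(img, edge)           # every row reads [0., -3., -3.]
```

The map is flat where the image is uniform and activates strongly at the edge, which is exactly the "filters that activate on a visual feature" intuition.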
13. ReLU Layer
▶ The Convolution Layer is followed by the ReLU (Rectified Linear Unit) activation
function, defined as f(x) = max(0, x).
▶ It is also known as the ramp function.
▶ It introduces non-linearity into the output of the CONV Layer.
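Applied element-wise, the ramp function simply clamps negative activations to zero:

```python
import numpy as np

# ReLU: f(x) = max(0, x), applied element-wise to a layer's output.
relu = lambda x: np.maximum(0, x)
out = relu(np.array([-2.0, -0.5, 0.0, 3.0]))  # negatives clamped to zero
```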
14. Pooling
▶ The Pooling Layer operates independently on every depth slice of the input
and resizes it spatially, using the MAX operation.
▶ Its function is to progressively reduce the spatial size of the representation to
reduce the amount of parameters and computation in the network, and hence
to also control overfitting.
▶ The most common form is a pooling layer with filters of size 2x2 applied with
a stride of 2, which down-samples every depth slice in the input by 2 along
both width and height, discarding 75% of the activations. Each MAX operation
in this case takes a max over 4 numbers.
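The 2x2, stride-2 case above can be written as a single reshape-and-max in NumPy; the 4x4 toy input is illustrative.

```python
import numpy as np

# 2x2 max pooling with stride 2: each output is the max over a 2x2 block,
# halving width and height and keeping 4 of every 16 activations (25%).
def max_pool_2x2(x):
    h, w = x.shape[0] // 2, x.shape[1] // 2
    return x[:2*h, :2*w].reshape(h, 2, w, 2).max(axis=(1, 3))

x = np.arange(16, dtype=float).reshape(4, 4)  # toy 4x4 depth slice
pooled = max_pool_2x2(x)                      # 4x4 -> 2x2
```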
15. Flattening
▶ We need to convert the output of the convolutional part of the CNN into a 1D
feature vector, to be used by the ANN part of it. This is achieved by
Flattening.
▶ It gets the output of the convolutional layers, flattens all its structure to
create a single long feature vector to be used by the dense layer for the final
classification.
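In code, flattening is just a reshape of the convolutional output volume into one long vector; the 2x2x3 toy volume below stands in for a real feature map.

```python
import numpy as np

# Flattening: the 3-D conv output (width x height x depth) becomes a single
# 1-D feature vector consumed by the dense layers.
feature_maps = np.arange(2 * 2 * 3).reshape(2, 2, 3)  # toy 2x2x3 conv output
vec = feature_maps.flatten()                          # shape (12,)
```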
16. Fully Connected ANN
▶ Neurons in a fully connected layer have full connections to all activations in
the previous layer, as in conventional neural networks.
▶ This is followed by further dense hidden layers and finally by computing
the values of the Output Layer.
17. Dataset used
▶ The dataset was manually collected for every category from various search
engines like Bing, Google and Yahoo through a software called Extreme
Picture Finder.
▶ The dataset contains images that include the waste items in different
illumination, background colors and various angles.
▶ The dataset consists of a total of 3,302 images belonging to one of the 5
classes.
▶ Care was taken to include images of objects as they would appear when
disposed of, such as crushed bottles and crumpled paper, to make the training
more robust to real-life images.
18. Software tools used
▶ Python 3
▶ TensorFlow
▶ Keras
▶ CUDA (NVIDIA GPU Computing Toolkit)
▶ Matplotlib
▶ SciPy
19. Description Of Model
▶ After hyperparameter tuning, we found the following settings to produce
the best outcome.
▶ Pictures were rescaled to a size of 256 x 256 pixels.
▶ The train-test split ratio was taken to be around 12:1.
▶ Batch size was chosen to be 16.
▶ We used categorical cross-entropy as the loss function and the Adam
optimizer to minimize it over 30 epochs (repetitions).
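For a 5-class one-hot setup like this one, categorical cross-entropy reduces to the negative log of the probability assigned to the true class; the example probabilities below are illustrative, not model outputs.

```python
import numpy as np

# Categorical cross-entropy: -sum(t * log(p)) over the classes, averaged
# over the batch (eps guards against log(0)).
def categorical_crossentropy(targets, probs, eps=1e-12):
    return float(-np.mean(np.sum(targets * np.log(probs + eps), axis=1)))

t = np.array([[0, 0, 1, 0, 0]])                 # one-hot: true class is class 3
p = np.array([[0.05, 0.05, 0.8, 0.05, 0.05]])   # predicted probabilities
loss = categorical_crossentropy(t, p)           # = -log(0.8), about 0.223
```

Because only the true class contributes, confident correct predictions drive the loss toward zero, which is what the Adam optimizer pushes the weights toward across the 30 epochs.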
20. Description Of Model - Layers
▶ Layer 1 - CNN (followed by Batch Normalization and Max Pooling)
▶ Filters - 96, Kernel Size - 11 x 11, Strides - 4
▶ Activation Function - ReLU
▶ Layer 2 - CNN (followed by Batch Normalization and Max Pooling)
▶ Filters – 256, Kernel Size - 5 x 5
▶ Activation Function - ReLU
▶ Layer 3-4 - CNN
▶ Filters – 384, Kernel Size - 3 x 3
▶ Activation Function - ReLU
▶ Layer 5 - CNN (followed by Max Pooling)
▶ Filters – 256, Kernel Size - 3 x 3
▶ Activation Function - ReLU
▶ Layer 6 - Flattening
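As a sanity check on the stack above, the feature-map sizes can be traced from the 256 x 256 rescaled input through each layer. "Valid" (no-padding) convolutions and 2x2, stride-2 max pooling are assumptions here; the slides do not state them.

```python
# Spatial size after a valid convolution or pooling step.
def out_size(n, kernel, stride=1):
    return (n - kernel) // stride + 1

n = 256
n = out_size(n, 11, 4)   # Layer 1 conv, 96 filters, 11x11, stride 4 -> 62
n = out_size(n, 2, 2)    # max pool 2x2, stride 2                    -> 31
n = out_size(n, 5)       # Layer 2 conv, 256 filters, 5x5            -> 27
n = out_size(n, 2, 2)    # max pool                                  -> 13
n = out_size(n, 3)       # Layer 3 conv, 384 filters, 3x3            -> 11
n = out_size(n, 3)       # Layer 4 conv, 384 filters, 3x3            -> 9
n = out_size(n, 3)       # Layer 5 conv, 256 filters, 3x3            -> 7
n = out_size(n, 2, 2)    # max pool                                  -> 3
flattened = n * n * 256  # Layer 6 flatten                           -> 2304
```

Under these assumptions the flattened vector feeding the dense layers has 3 x 3 x 256 = 2,304 features.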
26. Future Work
▶ Localization — In addition to Classification of Waste, we can use techniques such as
Bounding Box Localization, Instance Segmentation etc. to locate the classified
object in the image.
▶ Generalization — Rather than restricting our search to 5 classes, we can use
pre-trained models (like VGG-16) to generalize this approach over all kinds of
waste. The model can be fine-tuned to fit our requirements and provide wider and
better results.
▶ Automation — Build a more sophisticated mechanical system to automatically
separate and segregate mixed waste.
▶ Improving the accuracy rates of the system by adding more training and test
samples to the existing dataset.