1 of 32
UNIVERSITY OF BERGAMO
ENGINEERING DEPARTMENT
MASTER OF SCIENCE IN COMPUTER ENGINEERING
CONVOLUTIONAL NEURAL NETWORKS FOR
COMPUTER VISION
Supervisor
Mario Verdicchio
Candidate
Daniele Ettore Ciriello
13 June 2016
Welcome and thank you for being here. I am Daniele Ciriello, and I
wrote my master’s thesis on convolutional neural networks for
computer vision.
Overview
Introduction
Neural Networks Fundamentals
State of the Art
Implementation
Results
Future Developments
2 of 32
The presentation is split into six parts. First I introduce computer
vision and neural networks, presenting their fundamental concepts
and the state of the art in image classification. Then I present the
parts that make up the implemented project and the results of some
experiments. Finally I show possible future developments and
practical applications.
Overview
Introduction
Neural Networks Fundamentals
State of the Art
Implementation
Results
Future Developments
3 of 32
Let’s start by introducing the main concepts of computer vision and
artificial neural networks.
Computer Vision
Acquisition, processing, analysis and understanding of images
and high-dimensional data
Application examples:
Control processes
Navigation
Event recognition
Information organization
Interaction
4 of 32
Computer vision is a field of computer science that deals with the
acquisition, processing, analysis and understanding of the information
contained in images and high-dimensional data from the real world,
with the objective of producing information in numeric or symbolic
form, for example in the form of a decision.
It is the basis for many artificial intelligence systems and finds
application, for example, in process control, navigation systems,
information organization and human-machine interaction systems.
At the core of many of these applications there is often an image
classification problem, in which a system has to assign an image to
one class from a set of predefined classes.
Neural Networks and Computer Vision
Information processing paradigm inspired by the biological nervous
system
Highly scalable non-linear decision models
Autonomously set the right parameters through learning
algorithms
Need large training data-sets
5 of 32
In recent years considerable progress has been made thanks to
convolutional neural networks, which have become the state of the
art for image classification and many other computer vision
problems. Artificial neural networks are models inspired by the
biological nervous system and can represent non-linear, highly
scalable decision models. By using learning algorithms, we let the
network set by itself the parameters that minimize the loss.
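As a minimal illustration of what a learning algorithm does, the sketch below fits a single linear unit by plain gradient descent on a mean squared error. The toy data, the learning rate and the name `fit_linear_unit` are assumptions made for this example and are not taken from the thesis code.

```python
def fit_linear_unit(xs, ys, lr=0.05, epochs=2000):
    """Fit y ≈ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Move each parameter a small step against its gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical, for illustration only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = fit_linear_unit(xs, ys)
```

Starting from w = 0 and b = 0, the loop recovers values close to 2 and 1 without either of them being set by hand, which is the sense in which the network sets its own parameters.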
Overview
Introduction
Neural Networks Fundamentals
State of the Art
Implementation
Results
Future Developments
6 of 32
Let’s see some fundamental concepts of artificial neural networks.
Neural Units
[Figure: neural unit with inputs x1, x2, ..., xn, weights w1, w2, ..., wn, bias b and output y]
y = f(w · x + b)
w: weights
b: bias
f: activation function
Activation functions
7 of 32
The basic element of an artificial neural network is the neural unit,
or artificial neuron. It takes n input values x, each with a
corresponding weight w; inputs and weights can be seen as two
vectors x and w. The unit’s output is obtained by applying a
function, called the activation function, to the sum of the dot
product of x and w and a value b called the bias.
The simplest activation function, and the first to be used, is the
step function; neural units that use it are called perceptrons. The
best-known activation function in the literature, although it is no
longer commonly used, is the sigmoid (or logistic) function, and the
literature also offers similar functions such as the hyperbolic
tangent. The most widely used activation function today is the
rectifier, and neurons that use it are called ReLUs (rectified linear
units).
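A minimal sketch of the neural unit y = f(w · x + b) with the activation functions mentioned above, assuming plain Python lists for x and w. The function names and the sample values are illustrative and do not come from PyFunt.

```python
import math


def step(z):
    """Step function: the activation used by perceptrons."""
    return 1.0 if z >= 0 else 0.0


def sigmoid(z):
    """Sigmoid (logistic) function, with values between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Rectifier: passes positive values, clamps negative ones to 0."""
    return max(0.0, z)


def neural_unit(x, w, b, f):
    """Compute y = f(w · x + b) for input vector x, weights w and bias b."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return f(z)


x = [1.0, -2.0, 0.5]
w = [0.5, 0.25, -1.0]
b = 0.1
# Here w · x + b = 0.5 - 0.5 - 0.5 + 0.1 = -0.4 for every activation.
outputs = {f.__name__: neural_unit(x, w, b, f)
           for f in (step, sigmoid, math.tanh, relu)}
```

The same weighted sum gives a different output for each activation: the step function and the rectifier both return 0 for a negative sum, while the sigmoid and the hyperbolic tangent return a smooth value.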
Feed-forward Neural Networks
[Figure: feed-forward network with an input layer, a 1st hidden layer, a 2nd hidden layer and an output layer]
y_1 = x, y_j = f(W_j · y_{j-1} + b_j)
8 of 32
By arranging these units in layers we can build neural networks. In
particular, classic feed-forward models are composed of several
layers: the first is called the input layer because it represents
the network’s input values, the last is called the output layer
because it carries the network’s output, and the internal layers are
called hidden layers, simply because they are neither input nor
output layers.
Layers like these are called affine or fully connected layers: the
inputs of each neuron are the outputs of all the neurons in the
previous layer.
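The layer-by-layer rule y_1 = x, y_j = f(W_j · y_{j-1} + b_j) can be sketched as follows. The tiny 2-3-1 network, its hand-picked weights and the choice of the rectifier for every layer (including the output) are assumptions made only to keep the example short.

```python
def relu(z):
    return max(0.0, z)


def affine_layer(W, b, y_prev, f):
    """Fully connected layer: every neuron sees all outputs of the previous layer."""
    return [f(sum(w * y for w, y in zip(row, y_prev)) + bias)
            for row, bias in zip(W, b)]


def forward(x, layers, f=relu):
    """Apply y_1 = x, y_j = f(W_j · y_{j-1} + b_j) layer by layer."""
    y = x
    for W, b in layers:
        y = affine_layer(W, b, y, f)
    return y


# Hypothetical network: 2 inputs, 3 hidden neurons, 1 output neuron.
layers = [
    ([[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]], [0.0, 0.0, -1.0]),  # hidden layer
    ([[1.0, 1.0, 1.0]], [0.5]),                                   # output layer
]
y = forward([2.0, 1.0], layers)
```

Each row of a weight matrix holds the weights of one neuron, so a layer with 3 neurons and 2 inputs needs a 3x2 matrix and 3 biases.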
Convolution Layers
Groups of neurons which share parameters with all neurons of
the same group
Each neuron takes a portion of x as input
Very efficient for image processing problems
[Figure: one-dimensional convolution layer in which identical neurons A map adjacent inputs x1, x2, ..., xn to outputs y1, y2, ..., yn]
9 of 32
Many other types of layer exist. A very important one, especially for
computer vision and, more generally, pattern recognition problems,
is the convolution layer. It consists of groups of neurons in which
each neuron takes a portion of the input values as input and shares
its parameters with every other neuron in the same group.
The figure shows a one-dimensional convolution layer composed of
neurons of type A, each of which takes a segment of x as input. In
computer vision problems the layers can be two-dimensional and take
areas of x as input.
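A minimal sketch of a one-dimensional convolution layer with a single group of neurons, assuming no padding and a stride of 1. The kernel values are hypothetical, and, as in most deep learning libraries, the kernel is applied without flipping it.

```python
def conv1d(x, kernel, bias=0.0):
    """Slide one shared set of weights over x (no padding, stride 1)."""
    k = len(kernel)
    return [sum(kernel[j] * x[i + j] for j in range(k)) + bias
            for i in range(len(x) - k + 1)]


x = [1.0, 2.0, 4.0, 7.0, 11.0]
# Every output neuron reuses the same two weights, so the whole layer
# has only 2 weights and 1 bias, whatever the length of x.
y = conv1d(x, kernel=[-1.0, 1.0])
```

With 5 inputs and a segment of 2 values per neuron the layer produces 4 outputs, each one the difference between two neighbouring inputs.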
Other Types of Layer
Pooling
Softmax
Normalization
10 of 32
Convolution layers are often interleaved with pooling layers, which
sub-sample the input while keeping the number of channels unchanged.
Another important layer is the softmax layer, used as the output
layer in many classification problems because its output can be read
as a probability distribution.
Another important layer type is the normalization layer, which
normalizes the input values to a unit interval along each channel.
The modularity of these simple concepts allows us to compose more
(or less) complex and specific models.
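Two of these layers can be sketched in a few lines on a single one-dimensional channel. Max pooling over non-overlapping windows is an assumption here (the slides do not say which pooling variant was used), and the function names are illustrative.

```python
import math


def max_pool1d(x, size=2):
    """Keep the maximum of each non-overlapping window of `size` values."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]


def softmax(scores):
    """Turn raw scores into positive values that sum to 1."""
    m = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


pooled = max_pool1d([1.0, 3.0, 2.0, 5.0, 4.0, 0.0])  # 6 values become 3
probs = softmax([2.0, 1.0, 0.1])                     # one value per class
```

The pooled channel is half as long as its input, and the softmax output keeps the ranking of the scores while summing to 1, which is why it can be read as a probability distribution over the classes.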
Overview
Introduction
Neural Networks Fundamentals
State of the Art
Implementation
Results
Future Developments
11 of 32
To illustrate the state of the art, I show the results of a
competition that takes place annually and attracts institutions from
all over the world; it has become a benchmark for the evaluation of
computer vision systems.
ILSVRC
ImageNet Large-Scale Visual Recognition Challenge
1000 classes
1.2 M training samples
50 k validation samples
12 of 32
ILSVRC is a competition made up of several computer vision tasks,
such as localization and classification; over the years other
sub-tasks have been added, such as scene classification and object
localization in videos. This competition has become a benchmark for
analysing the performance of large-scale convolutional networks.
The classification task relies on 1.2 million images and 1000
classes; in 2015 convolutional models achieved a lower error than
the average human.
ILSVRC
Blue: Traditional computer vision
Purple: Deep learning
Red: Human capacity
13 of 32
The figure shows how, since deep learning systems (that is, systems
based on neural models with more than one hidden layer) came into
use, considerable progress has been possible, to the point of
surpassing human capacity in 2015. You can also see how neural
models have supplanted the classic computer vision approaches.
Residual Networks
14 of 32
The model that won all the tasks in 2015 is a convolutional model
called residual network, proposed by MSRA. It is a convolutional
network to which the concept of residual learning is applied.
The base structure in the middle is called the plain network and is
inspired by the VGG networks, a model presented the previous year
at the same competition (image above). By carrying the input values
to the output of a group of convolutional layers through a skip
path, we obtain a residual convolutional network like the one below.
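The difference between a plain block and a residual block can be sketched on small vectors. As a simplification, the two convolutional layers of a block are replaced here by two fully connected layers without biases; this is an assumption made for brevity and not the structure implemented in the thesis.

```python
def relu_vec(v):
    return [max(0.0, a) for a in v]


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def plain_block(x, W1, W2):
    """Two stacked weight layers: the output is relu(F(x))."""
    return relu_vec(matvec(W2, relu_vec(matvec(W1, x))))


def residual_block(x, W1, W2):
    """Same layers plus a skip path: the output is relu(F(x) + x)."""
    fx = matvec(W2, relu_vec(matvec(W1, x)))
    return relu_vec([a + b for a, b in zip(fx, x)])


x = [1.0, 2.0]
zeros = [[0.0, 0.0], [0.0, 0.0]]
plain_out = plain_block(x, zeros, zeros)        # the input is lost
residual_out = residual_block(x, zeros, zeros)  # the input passes through
```

With all weights at zero the plain block outputs zeros, while the residual block still returns its input: the layers only have to learn the residual F(x), the correction to add to x.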
Overview
Introduction
Neural Networks Fundamentals
State of the Art
Implementation
Results
Future Developments
15 of 32
So my objective was to reproduce the results obtained by the residual
model on simpler data-sets.
Project Overview
PyFunt
Library for developing and training convolutional neural
networks
PyDatSet
Library for loading various data-sets and a collection of
functions for artificial training data augmentation
Deep-residual-networks-pyfunt
Implementation and training of parametric residual networks
on various data-sets
16 of 32
The project is made up of three Python repositories. PyFunt contains
the library for developing and training convolutional neural
networks; PyDatSet consists of a collection of functions for loading
various data-sets in a Python environment and a set of functions for
artificial training data augmentation; the main repository contains
the residual model implementation and the main application, which
uses the model and the two libraries to load the data-sets and train
the networks.
All three repositories are published on GitHub under an open-source
license, so that anyone can use them or contribute to their
development.
Package Diagram
17 of 32
This diagram shows how the various parts of the project interact
with each other to train the residual model. In particular, the main
application creates a ResNet object, which uses the implementations
of the various layers and initialization functions provided by
pyfunt. It then creates a Solver object, provided by the same
library, to which it passes the model, the data loaded with pydatset
and several training hyperparameters, such as the number of epochs
and the learning rate. pyfunt also contains utilities to verify the
correct implementation of the layers and to visualize the weights of
the first convolution layer.
Overview
Introduction
Neural Networks Fundamentals
State of the Art
Implementation
Results
Future Developments
18 of 32
So let’s look at the results of experiments on several data-sets.
CIFAR-10 - Canadian Institute For Advanced
Research
60 k images
(50 k + 10 k)
RGB 32x32
10 classes:
airplane
car
bird
cat
deer
dog
frog
horse
ship
truck
19 of 32
To replicate the results obtained by residual networks, I trained my
own implementation of the model on CIFAR-10, which consists of
60,000 RGB images of 32 pixels per side, spread over 10 disjoint
classes.
This data-set is much simpler than the previous one but, being
varied, it lets you easily see whether a network can learn properly
from the training samples.
CIFAR-10 – Results
Accuracy: 90.41 %; parameters: ~248 k
20 of 32
To evaluate a network’s behaviour we can analyze the learning
curves, which report the moving average of the cost at each
iteration (above), together with the error on the training data
(dotted lines) and the error on the validation data (solid lines,
below).
In this case, with a 20-layer network, the errors make evident a
typical phenomenon of these models called overfitting: the network
uses its many parameters to learn features specific to the training
set, without generalizing enough to the validation set. In a certain
sense, it memorizes the training samples.
By artificially augmenting the size of the training set, we can reduce
this phenomenon and obtain better results in terms of validation error.
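One common way to enlarge a training set artificially is to show the network a mirrored copy of an image at random. The sketch below assumes this horizontal-flip transform purely for illustration; the presentation does not say which transforms PyDatSet applies, and the function names are hypothetical.

```python
import random


def horizontal_flip(image):
    """Mirror an image stored as a list of rows."""
    return [row[::-1] for row in image]


def augment(image, rng, flip_prob=0.5):
    """Return the image or its mirrored copy, chosen at random."""
    return horizontal_flip(image) if rng.random() < flip_prob else image


image = [[1, 2, 3],
         [4, 5, 6]]
rng = random.Random(0)  # fixed seed so that the run is repeatable
batch = [augment(image, rng) for _ in range(4)]
```

Because the choice is made again at every iteration, the network rarely sees exactly the same set of samples twice, which makes plain memorization harder.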
MNIST – Modified National Institute of Standards
and Technology
70 k images
(60 k + 10 k)
B/W 28x28
handwritten
digits
10 classes
(digits from 0 to
9)
21 of 32
The MNIST data-set was first presented in 1991 and is composed of
60 thousand training images and 10 thousand validation images of
handwritten digits, black and white, 28 pixels per side.
MNIST – Results
Accuracy: 99.64 %; parameters: 442 k
22 of 32
Another way to lower the validation error in residual models is to
increase the number of layers or the number of neurons in each
layer. In this case, for the first two experiments I halved the
number of neurons used in the previous experiments and increased the
number of layers from 20 to 32. By doubling the number of neurons
again in the 32-layer model I obtained an accuracy of 99.64%, which
means that the trained network misclassifies just 36 of the 10,000
validation images.
The number of filters in each layer starts at 16 and is doubled
periodically until it reaches 64.
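The number of misclassified images follows directly from the accuracy and the size of the validation set, as this short check shows. The variable names are illustrative.

```python
validation_images = 10_000
accuracy = 0.9964

# Misclassified images implied by the accuracy; round() removes the
# tiny floating-point error left by the subtraction.
errors = round(validation_images * (1 - accuracy))

# Going the other way: correct answers divided by the total.
recomputed_accuracy = (validation_images - errors) / validation_images
```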
MNIST – Results
23 of 32
This figure shows all 36 misclassified images. In the middle of each
cell is the original image, in the top left the correct class, in
the bottom left the wrong class predicted by the network, and in the
bottom right the network’s second choice by confidence.
In many cases the second choice is the correct one; furthermore,
some examples are almost indistinguishable even to the human eye.
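The first and second choices of the network can be read from its output probabilities by sorting the classes by confidence. The probabilities below are invented for illustration, for a digit whose correct class is assumed to be 9.

```python
def top_k(probs, k=2):
    """Return the k class indices with the highest probability, best first."""
    return sorted(range(len(probs)), key=lambda c: probs[c], reverse=True)[:k]


# Hypothetical network output for one image of the digit 9.
probs = [0.01, 0.02, 0.01, 0.01, 0.48, 0.01, 0.01, 0.02, 0.01, 0.42]
first, second = top_k(probs)
```

Here the network answers 4, which counts as an error, but its second choice is the correct class 9 with almost the same confidence.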
SFDDD - State Farm Distracted Driver Detection
~22 k RGB 640x480 images
of drivers
10 distraction classes:
safe driving
texting - right
talking on the phone - right
texting - left
talking on the phone - left
operating the radio
drinking
reaching behind
hair and makeup
talking to passenger
24 of 32
The last data-set I present was provided by State Farm through
Kaggle, a platform that hosts many machine learning competitions. In
this case the insurance company wanted to find the best accuracy
obtainable in classifying 640x480 RGB images of drivers into 10
distraction classes: safe driving, texting with the right hand,
talking on the phone with the right hand, et cetera.
To simplify the data-set and the training process, I resized all the
images to 64x48 pixels, selecting a random portion of each image at
each iteration for training and the central 32x32 portion for
validation.
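The two ways of selecting a portion of an image can be sketched on a dummy 64x48 image. The 32x32 size of the training portion is an assumption (only the size of the validation portion is stated), and the function names are illustrative rather than taken from PyDatSet.

```python
import random


def crop(image, top, left, size):
    """Cut a size x size window out of an image stored as a list of rows."""
    return [row[left:left + size] for row in image[top:top + size]]


def random_crop(image, size, rng):
    """Training: pick the window position at random on every call."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - size)
    left = rng.randint(0, width - size)
    return crop(image, top, left, size)


def center_crop(image, size):
    """Validation: always take the central window."""
    height, width = len(image), len(image[0])
    return crop(image, (height - size) // 2, (width - size) // 2, size)


# Dummy image with 48 rows and 64 columns; each pixel stores its own position.
image = [[(r, c) for c in range(64)] for r in range(48)]
train_patch = random_crop(image, 32, random.Random(0))
val_patch = center_crop(image, 32)
```

The random window acts as data augmentation during training, while the fixed central window keeps the validation error comparable from one epoch to the next.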
SFDDD – Results
Accuracy: 99.75 %; parameters: ~636 k
25 of 32
Since the competition has not ended yet, the validation data-set is
not public, so I used a validation set of 2000 images, randomly
extracted and excluded from the roughly 22,000 images of the
training set. Analyzing the learning curves of two residual networks
of 32 and 44 layers, we can see that, although the difference in the
loss values is almost imperceptible and both networks can recognize
practically all the training samples after 80 epochs, the 44-layer
ResNet reaches an accuracy of 99.75% on my validation set.
SFDDD – Saliency Maps
26 of 32
By evaluating and visualizing the derivative of the cost with
respect to the input images, we obtain the so-called saliency maps,
where lighter areas represent the portions of an image that
contribute most to its correct classification by the trained
network. Here we can see some samples from the class “talking on the
phone with the right hand”. It is interesting to note, for example,
how the classification is affected by the areas of the steering
wheel where the hands usually rest, or by the areas where the arm
should be; in the fourth picture, instead, the zone with the
greatest influence is the one around the head.
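The idea of a saliency map can be sketched on a model small enough to differentiate by hand. The example assumes a linear classifier followed by softmax and a cross-entropy cost, with a 4-pixel "image" and invented weights; the residual network of the thesis is far larger, but the quantity computed is the same derivative of the cost with respect to each input value.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def loss(x, W, label):
    """Cross-entropy cost of a linear softmax classifier with scores W · x."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    return -math.log(softmax(scores)[label])


def saliency(x, W, label):
    """Absolute value of d(loss)/d(x_i) for every input pixel x_i."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    probs = softmax(scores)
    # For softmax + cross-entropy, d(loss)/d(score_c) = p_c - 1[c == label].
    d_scores = [p - (1.0 if c == label else 0.0) for c, p in enumerate(probs)]
    # Chain rule through the linear layer: d(loss)/d(x_i) = sum_c d_scores[c] * W[c][i].
    grad = [sum(d_scores[c] * W[c][i] for c in range(len(W)))
            for i in range(len(x))]
    return [abs(g) for g in grad]


# Hypothetical 2-class model over 4 pixels; the last pixel has zero weights.
W = [[1.0, -0.5, 2.0, 0.0],
     [-1.0, 0.5, 0.1, 0.0]]
x = [0.2, 0.4, 0.6, 0.8]
s_map = saliency(x, W, label=0)
```

The pixel that the model ignores gets a saliency of exactly zero, while the pixels whose weights differ most between the two classes get the largest values, the equivalent of the lighter areas in the maps.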
Overview
Introduction
Neural Networks Fundamentals
State of the Art
Implementation
Results
Future Developments
27 of 32
We now come to the conclusion, where I describe some possibilities
for future developments and practical applications.
Future Developments
Extend the framework
Implement other models
Train on other data-sets
28 of 32
In the near future I would like to continue the project by extending
the implemented library, for example to allow GPU-based
hardware-accelerated computation, by developing other types of layer
or new models as they are presented, or by training the same model
on other data-sets.
Examples of Applications in Practice
Health
Robotics
Navigation
Physics, natural sciences, art and entertainment, ...
29 of 32
Convolutional neural networks are being used, for example, in health
care to classify chest diseases from X-ray images and to localize
and segment tumour masses in the brain, or the pancreas in
cross-sections of the trunk, which is useful because the pancreas
varies in size and shape from person to person. In robotics they are
used for hand-eye coordination in robotic arms with 7 degrees of
freedom, to improve the locomotion skills of robots on irregular
ground through reinforcement learning algorithms, and to improve the
relationship between humans and robots with networks that classify
facial expressions. In navigation they are used extensively in
autonomous vehicle systems, or for example to detect and classify
roads from satellite images, or to classify images of road signs.
Furthermore, neural models are being used successfully in physics
and in the natural sciences, for example to classify flowers in
botany or plankton in marine biology, and in many more fields.
Thank you for your attention.
Image Credits (1 of 2)
Slide Image Credits URL (http://)
10 Super Vision UofT goo.gl/fbyXSK
10 UvA-Euvision UvA goo.gl/ltNvaP
12 ILSVRC samples ImageNet goo.gl/PvFCAv
13 ILSVRC results 1 NVIDIA goo.gl/QkG3Nf
13 ILSVRC results 2 NVIDIA goo.gl/THMZ1X
14 ResNet MSRA goo.gl/uR0ZXL
29 Medi1 NVIDIA goo.gl/DdPSvD
29 Medi2 BRATS goo.gl/rGgz22
29 Medi3 SPIE goo.gl/R5mbmj
29 Rob1 Google goo.gl/6BycQ7
29 Rob2 UBC goo.gl/A585Iz
29 Rob3 MSRC goo.gl/3exWqJ
31 of 32
Image Credits (2 of 2)
Slide Image Credits URL (http://)
29 Nav1 Google goo.gl/DFgPcl
29 Nav2 DeepOSM goo.gl/sR72BF
29 Nav4 IDSIA goo.gl/SR16Uk
32 of 32
CONVOLUTIONAL NEURAL NETWORKS FOR COMPUTER VISION

More Related Content

What's hot

neural networks
 neural networks neural networks
neural networksjoshiblog
 
Artifical Neural Network and its applications
Artifical Neural Network and its applicationsArtifical Neural Network and its applications
Artifical Neural Network and its applicationsSangeeta Tiwari
 
Artificial Neural Network and its Applications
Artificial Neural Network and its ApplicationsArtificial Neural Network and its Applications
Artificial Neural Network and its Applicationsshritosh kumar
 
IRJET-Breast Cancer Detection using Convolution Neural Network
IRJET-Breast Cancer Detection using Convolution Neural NetworkIRJET-Breast Cancer Detection using Convolution Neural Network
IRJET-Breast Cancer Detection using Convolution Neural NetworkIRJET Journal
 
Nonlinear image processing using artificial neural
Nonlinear image processing using artificial neuralNonlinear image processing using artificial neural
Nonlinear image processing using artificial neuralHưng Đặng
 
Machine learning and_neural_network_lecture_slide_ece_dku
Machine learning and_neural_network_lecture_slide_ece_dkuMachine learning and_neural_network_lecture_slide_ece_dku
Machine learning and_neural_network_lecture_slide_ece_dkuSeokhyun Yoon
 
NETWORK LEARNING AND TRAINING OF A CASCADED LINK-BASED FEED FORWARD NEURAL NE...
NETWORK LEARNING AND TRAINING OF A CASCADED LINK-BASED FEED FORWARD NEURAL NE...NETWORK LEARNING AND TRAINING OF A CASCADED LINK-BASED FEED FORWARD NEURAL NE...
NETWORK LEARNING AND TRAINING OF A CASCADED LINK-BASED FEED FORWARD NEURAL NE...ijaia
 
Basics of Artificial Neural Network
Basics of Artificial Neural Network Basics of Artificial Neural Network
Basics of Artificial Neural Network Subham Preetam
 
Ai and neural networks
Ai and neural networksAi and neural networks
Ai and neural networksNikhil Kansari
 
introduction to deep Learning with full detail
introduction to deep Learning with full detailintroduction to deep Learning with full detail
introduction to deep Learning with full detailsonykhan3
 
Secondary structure prediction
Secondary structure predictionSecondary structure prediction
Secondary structure predictionsamantlalit
 
A Parallel Framework For Multilayer Perceptron For Human Face Recognition
A Parallel Framework For Multilayer Perceptron For Human Face RecognitionA Parallel Framework For Multilayer Perceptron For Human Face Recognition
A Parallel Framework For Multilayer Perceptron For Human Face RecognitionCSCJournals
 
Compegence: Dr. Rajaram Kudli - An Introduction to Artificial Neural Network ...
Compegence: Dr. Rajaram Kudli - An Introduction to Artificial Neural Network ...Compegence: Dr. Rajaram Kudli - An Introduction to Artificial Neural Network ...
Compegence: Dr. Rajaram Kudli - An Introduction to Artificial Neural Network ...COMPEGENCE
 

What's hot (20)

neural networks
 neural networks neural networks
neural networks
 
Artifical Neural Network and its applications
Artifical Neural Network and its applicationsArtifical Neural Network and its applications
Artifical Neural Network and its applications
 
Project Report -Vaibhav
Project Report -VaibhavProject Report -Vaibhav
Project Report -Vaibhav
 
Neural network
Neural networkNeural network
Neural network
 
Artificial Neural Network and its Applications
Artificial Neural Network and its ApplicationsArtificial Neural Network and its Applications
Artificial Neural Network and its Applications
 
Neural Networks
Neural NetworksNeural Networks
Neural Networks
 
Deep learning
Deep learning Deep learning
Deep learning
 
IRJET-Breast Cancer Detection using Convolution Neural Network
IRJET-Breast Cancer Detection using Convolution Neural NetworkIRJET-Breast Cancer Detection using Convolution Neural Network
IRJET-Breast Cancer Detection using Convolution Neural Network
 
Nonlinear image processing using artificial neural
Nonlinear image processing using artificial neuralNonlinear image processing using artificial neural
Nonlinear image processing using artificial neural
 
Neural networks
Neural networksNeural networks
Neural networks
 
Artificial Neural Network Topology
Artificial Neural Network TopologyArtificial Neural Network Topology
Artificial Neural Network Topology
 
Machine learning and_neural_network_lecture_slide_ece_dku
Machine learning and_neural_network_lecture_slide_ece_dkuMachine learning and_neural_network_lecture_slide_ece_dku
Machine learning and_neural_network_lecture_slide_ece_dku
 
NETWORK LEARNING AND TRAINING OF A CASCADED LINK-BASED FEED FORWARD NEURAL NE...
NETWORK LEARNING AND TRAINING OF A CASCADED LINK-BASED FEED FORWARD NEURAL NE...NETWORK LEARNING AND TRAINING OF A CASCADED LINK-BASED FEED FORWARD NEURAL NE...
NETWORK LEARNING AND TRAINING OF A CASCADED LINK-BASED FEED FORWARD NEURAL NE...
 
Basics of Artificial Neural Network
Basics of Artificial Neural Network Basics of Artificial Neural Network
Basics of Artificial Neural Network
 
Ai and neural networks
Ai and neural networksAi and neural networks
Ai and neural networks
 
Neural networks
Neural networksNeural networks
Neural networks
 
introduction to deep Learning with full detail
introduction to deep Learning with full detailintroduction to deep Learning with full detail
introduction to deep Learning with full detail
 
Secondary structure prediction
Secondary structure predictionSecondary structure prediction
Secondary structure prediction
 
A Parallel Framework For Multilayer Perceptron For Human Face Recognition
A Parallel Framework For Multilayer Perceptron For Human Face RecognitionA Parallel Framework For Multilayer Perceptron For Human Face Recognition
A Parallel Framework For Multilayer Perceptron For Human Face Recognition
 
Compegence: Dr. Rajaram Kudli - An Introduction to Artificial Neural Network ...
Compegence: Dr. Rajaram Kudli - An Introduction to Artificial Neural Network ...Compegence: Dr. Rajaram Kudli - An Introduction to Artificial Neural Network ...
Compegence: Dr. Rajaram Kudli - An Introduction to Artificial Neural Network ...
 

Viewers also liked

Reti neurali di convoluzione per la visione artificiale - Tesi di Laurea Magi...
Reti neurali di convoluzione per la visione artificiale - Tesi di Laurea Magi...Reti neurali di convoluzione per la visione artificiale - Tesi di Laurea Magi...
Reti neurali di convoluzione per la visione artificiale - Tesi di Laurea Magi...Daniele Ciriello
 
Pres Tesi LM-2016+transcript_ita
Pres Tesi LM-2016+transcript_itaPres Tesi LM-2016+transcript_ita
Pres Tesi LM-2016+transcript_itaDaniele Ciriello
 
Neocognitron
NeocognitronNeocognitron
NeocognitronESCOM
 
Urs Köster - Convolutional and Recurrent Neural Networks
Urs Köster - Convolutional and Recurrent Neural NetworksUrs Köster - Convolutional and Recurrent Neural Networks
Urs Köster - Convolutional and Recurrent Neural NetworksIntel Nervana
 
Deep Learning Cases: Text and Image Processing
Deep Learning Cases: Text and Image ProcessingDeep Learning Cases: Text and Image Processing
Deep Learning Cases: Text and Image ProcessingGrigory Sapunov
 
拡がるディープラーニングの活用
拡がるディープラーニングの活用拡がるディープラーニングの活用
拡がるディープラーニングの活用NVIDIA Japan
 
Python for Image Understanding: Deep Learning with Convolutional Neural Nets
Python for Image Understanding: Deep Learning with Convolutional Neural NetsPython for Image Understanding: Deep Learning with Convolutional Neural Nets
Python for Image Understanding: Deep Learning with Convolutional Neural NetsRoelof Pieters
 

Viewers also liked (7)

Reti neurali di convoluzione per la visione artificiale - Tesi di Laurea Magi...
Reti neurali di convoluzione per la visione artificiale - Tesi di Laurea Magi...Reti neurali di convoluzione per la visione artificiale - Tesi di Laurea Magi...
Reti neurali di convoluzione per la visione artificiale - Tesi di Laurea Magi...
 
Pres Tesi LM-2016+transcript_ita
Pres Tesi LM-2016+transcript_itaPres Tesi LM-2016+transcript_ita
Pres Tesi LM-2016+transcript_ita
 
Neocognitron
NeocognitronNeocognitron
Neocognitron
 
Urs Köster - Convolutional and Recurrent Neural Networks
Urs Köster - Convolutional and Recurrent Neural NetworksUrs Köster - Convolutional and Recurrent Neural Networks
Urs Köster - Convolutional and Recurrent Neural Networks
 
Deep Learning Cases: Text and Image Processing
Deep Learning Cases: Text and Image ProcessingDeep Learning Cases: Text and Image Processing
Deep Learning Cases: Text and Image Processing
 
拡がるディープラーニングの活用
拡がるディープラーニングの活用拡がるディープラーニングの活用
拡がるディープラーニングの活用
 
Python for Image Understanding: Deep Learning with Convolutional Neural Nets
Python for Image Understanding: Deep Learning with Convolutional Neural NetsPython for Image Understanding: Deep Learning with Convolutional Neural Nets
Python for Image Understanding: Deep Learning with Convolutional Neural Nets
 

Similar to Pres Tesi LM-2016+transcript_eng

Artificial Neural Network Implementation On FPGA Chip
Artificial Neural Network Implementation On FPGA ChipArtificial Neural Network Implementation On FPGA Chip
Artificial Neural Network Implementation On FPGA ChipMaria Perkins
 
Handwritten Digit Recognition using Convolutional Neural Networks
Handwritten Digit Recognition using Convolutional Neural  NetworksHandwritten Digit Recognition using Convolutional Neural  Networks
Handwritten Digit Recognition using Convolutional Neural NetworksIRJET Journal
 
IRJET-AI Neural Network Disaster Recovery Cloud Operations Systems
IRJET-AI Neural Network Disaster Recovery Cloud Operations SystemsIRJET-AI Neural Network Disaster Recovery Cloud Operations Systems
IRJET-AI Neural Network Disaster Recovery Cloud Operations SystemsIRJET Journal
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
U-Netpresentation.pptx
U-Netpresentation.pptxU-Netpresentation.pptx
U-Netpresentation.pptxNoorUlHaq47
 
Easily Trainable Neural Network Using TransferLearning
Easily Trainable Neural Network Using TransferLearningEasily Trainable Neural Network Using TransferLearning
Easily Trainable Neural Network Using TransferLearningIRJET Journal
 
(Im2col)accelerating deep neural networks on low power heterogeneous architec...
(Im2col)accelerating deep neural networks on low power heterogeneous architec...(Im2col)accelerating deep neural networks on low power heterogeneous architec...
(Im2col)accelerating deep neural networks on low power heterogeneous architec...Bomm Kim
 
Hand Written Digit Classification
Hand Written Digit ClassificationHand Written Digit Classification
Hand Written Digit Classificationijtsrd
 
Devanagari Digit and Character Recognition Using Convolutional Neural Network




Pres Tesi LM-2016+transcript_eng

  • 1. UNIVERSITY OF BERGAMO, ENGINEERING DEPARTMENT. MASTER OF SCIENCE IN COMPUTER ENGINEERING. CONVOLUTIONAL NEURAL NETWORKS FOR COMPUTER VISION. Supervisor: Mario Verdicchio. Candidate: Daniele Ettore Ciriello. 13 June 2016.
    Welcome, and thank you for being here. I am Daniele Ciriello, and I wrote my master’s thesis on convolutional neural networks for computer vision.
  • 2. Overview: Introduction; Neural Networks Fundamentals; State of the Art; Implementation; Results; Future Developments.
    The presentation is split into six parts. First I introduce computer vision and neural networks, presenting their fundamental concepts and the state of the art in image classification. Then I present the parts that make up the implemented project and the results of some experiments. Finally, I show possible future developments and practical applications.
  • 3. Overview: Introduction; Neural Networks Fundamentals; State of the Art; Implementation; Results; Future Developments.
    Let’s start by introducing the main concepts of computer vision and artificial neural networks.
  • 4. Computer Vision. Acquisition, processing, analysis and comprehension of images and high-dimensional data. Application examples: control processes; navigation; event recognition; information organization; interaction.
    Computer vision is a field of computer science that deals with the acquisition, processing, analysis and comprehension of the information contained in images and high-dimensional data from the real world, with the objective of producing information in numeric or symbolic form, for example in the form of decisions. It is the basis for many artificial intelligence systems, finding application, for example, in control processes, navigation systems, information organization and human-machine interaction systems. At the core of many of these applications there is often an image classification problem, in which a system has to assign an image to one class from a set of predefined classes.
  • 5. Neural Networks and Computer Vision. Information-processing paradigm inspired by the biological nervous system. Highly scalable non-linear decision models. Autonomously set the right parameters through learning algorithms. Need big training data-sets.
    In recent years considerable progress has been possible thanks to convolutional neural networks, which have established the state of the art in image classification and in many other computer vision problems. Artificial neural networks are models inspired by the biological nervous system and can represent non-linear, highly scalable decision models. By using learning algorithms, we let the network set by itself the parameters that minimize the loss.
  • 6. Overview: Introduction; Neural Networks Fundamentals; State of the Art; Implementation; Results; Future Developments.
    Let’s see some fundamental concepts of artificial neural networks.
  • 7. Neural Units. Neural unit with inputs x1, x2, ..., xn, weights w1, w2, ..., wn, bias b and output y: y = f(w · x + b), where w: weights, b: bias, f: activation function. [Figure: plots of activation functions]
    The basic element of artificial neural networks is the neural unit, or artificial neuron. It takes n input values x, to which correspond n weights w; inputs and weights can be seen as two vectors x and w. The unit’s output is obtained by applying a function, called the activation function, to the sum of the dot product of x and w and a value b called the bias. The simplest activation function, and the first to be used, is the step function; neural units that use it are called perceptrons. The best-known activation function in the literature, even if it is no longer in common use, is the sigmoid (or logistic) function; similar functions, such as the hyperbolic tangent, are also found in the literature. The most widely used activation function today is the rectifier, and neurons that use it are called ReLUs (rectified linear units).
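The neural unit described in this slide can be sketched in a few lines of plain Python. This is only an illustration, not code from the thesis or from PyFunt: the function names (`step`, `sigmoid`, `relu`, `neural_unit`) and the example values are assumptions made for the sketch.

```python
import math

def step(z):
    """Step activation used by perceptrons: 1 if z > 0, else 0."""
    return 1.0 if z > 0 else 0.0

def sigmoid(z):
    """Sigmoid (logistic) activation, with output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    """Rectifier activation: max(0, z)."""
    return max(0.0, z)

def neural_unit(x, w, b, f):
    """Compute y = f(w . x + b) for inputs x, weights w and bias b."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return f(z)

x = [1.0, 2.0, 3.0]      # hypothetical inputs
w = [0.5, -1.0, 0.25]    # hypothetical weights
b = 0.5                  # hypothetical bias
# w . x + b = 0.5 - 2.0 + 0.75 + 0.5 = -0.25
print(neural_unit(x, w, b, step))       # 0.0
print(neural_unit(x, w, b, relu))       # 0.0
print(neural_unit(x, w, b, sigmoid))    # a value slightly below 0.5
print(neural_unit(x, w, b, math.tanh))  # a small negative value
```

Passing the activation as an argument keeps the unit itself unchanged while the step, sigmoid, hyperbolic tangent and rectifier functions are swapped in.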
  • 8. Feed-forward Neural Networks. [Figure: network with input layer, first hidden layer, second hidden layer and output layer] y_1 = x, y_j = f(W_j · y_{j-1} + b_j)
    By arranging these units in layers we can build neural networks. In particular, classic feed-forward models are composed of many layers: the first is called the input layer because it represents the network’s input values, the last is called the output layer because it carries the network’s output, and the internal layers are called hidden layers, simply because they are neither input nor output layers. Layers like these are called affine or fully connected layers: each neuron’s inputs are the outputs of all the neurons in the previous layer.
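The layer-by-layer recurrence on this slide can be written as a short loop. The sketch below is a simplified illustration under stated assumptions: the names `affine_layer` and `forward`, the tiny network and its weights are hypothetical, and the same activation is applied to every layer, including the output one.

```python
def relu(z):
    return max(0.0, z)

def affine_layer(W, b, y_prev, f):
    """One fully connected layer: each row of W holds one neuron's weights,
    and every neuron sees all the outputs of the previous layer."""
    return [f(sum(w * y for w, y in zip(row, y_prev)) + bias)
            for row, bias in zip(W, b)]

def forward(x, layers, f=relu):
    """Apply y_1 = x, y_j = f(W_j . y_{j-1} + b_j) layer by layer."""
    y = x
    for W, b in layers:
        y = affine_layer(W, b, y, f)
    return y

# Hypothetical network: 2 inputs -> 2 hidden units -> 1 output
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]),  # hidden layer
    ([[2.0, 1.0]], [0.5]),                     # output layer
]
print(forward([3.0, 1.0], layers))  # [5.5]
```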
  • 9. Convolution Layers. Groups of neurons which share parameters with all neurons of the same group. Each neuron takes as input a portion of x. Very efficient for image processing problems. [Figure: one-dimensional convolution layer in which identical neurons A map segments of the inputs x1, ..., xn to the outputs y1, ..., yn]
    Many other types of layer exist. A very important one, especially for computer vision and, more generally, pattern recognition problems, is the convolution layer. It consists of groups of neurons in which each neuron takes as input a portion of the input values and shares its parameters with every other neuron in the same group. In the figure you can see a one-dimensional convolution layer, composed of neurons of type A, each of which takes as input a segment of x. In computer vision problems the layers can be two-dimensional and take as input areas of x.
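Parameter sharing in a one-dimensional convolution layer can be illustrated as follows. This is a minimal sketch, not the PyFunt implementation: `conv1d` and the example kernel are assumptions, there is no padding and the stride is 1, and, as in most deep learning libraries, the kernel is not flipped (strictly speaking this is a cross-correlation).

```python
def conv1d(x, kernel, bias=0.0):
    """Slide one shared set of weights over x (no padding, stride 1).

    Every output y_i is produced by the same "neuron A": it sees only the
    segment x[i:i+k] and reuses the same kernel and bias.
    """
    k = len(kernel)
    return [sum(w * v for w, v in zip(kernel, x[i:i + k])) + bias
            for i in range(len(x) - k + 1)]

x = [1.0, 2.0, 4.0, 7.0, 11.0]
kernel = [-1.0, 1.0]          # hypothetical weights: difference of neighbours
print(conv1d(x, kernel))      # [1.0, 2.0, 3.0, 4.0]
```

However long the input is, the layer has only `len(kernel) + 1` parameters, which is what makes it efficient on images.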
  • 10. Other Types of Layer. Pooling; Softmax; Normalization.
    Convolution layers are often interspersed with pooling layers, which sub-sample the input while keeping the number of channels unchanged. Another important layer is the softmax layer, used as the output layer in many classification problems because its output can be seen as a probability distribution. Another important type is the normalization layer, which normalizes the input values to a unit interval along each channel. The modularity of these simple concepts allows the composition of more (or less) complex and specific models.
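Two of these layers are simple enough to sketch directly. The code below is an illustrative assumption rather than the thesis code: it shows non-overlapping max pooling on a single one-dimensional channel (the slide does not say which pooling variant was used) and a numerically stable softmax.

```python
import math

def max_pool1d(x, size=2):
    """Non-overlapping max pooling on one channel: keep the max of each window."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]

def softmax(scores):
    """Turn raw class scores into a probability distribution."""
    m = max(scores)                          # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

print(max_pool1d([1.0, 3.0, 2.0, 5.0, 4.0, 0.0]))  # [3.0, 5.0, 4.0]
probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))   # three positive values that sum to 1 (up to rounding)
```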
  • 11. Overview: Introduction; Neural Networks Fundamentals; State of the Art; Implementation; Results; Future Developments.
    To show the state of the art, I present the results of a competition that takes place annually and attracts institutions from all over the world, and that has become a benchmark for the evaluation of computer vision systems.
  • 12. ILSVRC: ImageNet Large-Scale Visual Recognition Challenge. 1000 classes; 1.2 M training samples; 50 k validation samples.
    ILSVRC is a competition made up of several computer vision tasks, such as localization and classification; over the years other sub-tasks have been added, such as scene classification and object localization in videos. The competition has become a benchmark for the performance analysis of large-scale convolutional networks. The classification task comprises 1.2 million images and 1000 classes; in 2015 the average human error was surpassed by convolutional models.
  • 13. ILSVRC. [Figure legend] Blue: traditional computer vision. Purple: deep learning. Red: human capacity.
    In the figure you can see how, since the adoption of deep learning systems (systems based on neural models with more than one hidden layer), considerable progress has been possible, to the point of surpassing human capacity in 2015. You can also see how neural models have supplanted the classic computer vision approaches.
  • 14. Residual Networks.
    The model that won all the tasks in 2015 is a convolutional model called residual network, proposed by MSRA: a convolutional network to which the concept of residual learning is applied. The base structure, in the middle of the figure, is called plain network and is inspired by the VGG networks, a model presented at the same competition the previous year (top of the figure). Carrying a residual of the input values to the output of a group of convolutional layers through a skip path yields a residual convolutional network like the one at the bottom.
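The skip path can be illustrated with a toy residual block that computes relu(F(x) + x), where F stands for the group of stacked layers. This is a sketch under stated assumptions: `residual_block`, `zero_stack` and `double_stack` are hypothetical stand-ins working on plain lists, not the convolutional layers used in the thesis.

```python
def relu_vec(v):
    return [max(0.0, a) for a in v]

def residual_block(x, layer_stack):
    """Compute y = relu(F(x) + x): F is the group of stacked layers and
    '+ x' is the skip path carrying the input to the block's output."""
    fx = layer_stack(x)
    assert len(fx) == len(x), "the skip path needs matching shapes"
    return relu_vec([f + xi for f, xi in zip(fx, x)])

# Hypothetical stand-ins for a group of convolutional layers.
def zero_stack(x):
    return [0.0 for _ in x]

def double_stack(x):
    return [2.0 * xi for xi in x]

print(residual_block([1.0, 2.0], zero_stack))     # [1.0, 2.0]
print(residual_block([1.0, -3.0], double_stack))  # [3.0, 0.0]
```

With `zero_stack` the block passes non-negative inputs through unchanged, which shows why the stacked layers only have to learn the residual with respect to the identity.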
  • 15. Overview: Introduction; Neural Networks Fundamentals; State of the Art; Implementation; Results; Future Developments.
    My objective was therefore to reproduce the results obtained by the residual model on simpler data-sets.
  • 16. Project Overview. PyFunt: library for developing and training convolutional neural networks. PyDatSet: library for loading various data-sets and a collection of functions for artificial training data augmentation. Deep-residual-networks-pyfunt: implementation and training of parametric residual networks on various data-sets.
    The project is composed of three Python repositories. PyFunt contains the library for developing and training convolutional neural networks. PyDatSet consists of a collection of functions for loading various data-sets in a Python environment and a set of functions for artificial training data augmentation. The main repository contains the residual model implementation and the main application, which uses the model and the two libraries to load the data-sets and train the networks. All three repositories are published on GitHub under an open-source license, so that anyone can use them or contribute to their development.
  • 17. Package Diagram.
    In this diagram you can see how the various parts of the project interact with each other to train the residual model. The main application creates a ResNet object, which uses the implementations of the various layers and the initialization functions provided by pyfunt. It then creates a Solver object, provided by the same library, to which it passes the model, the data loaded with pydatset and several training hyper-parameters, such as the number of epochs and the learning rate. pyfunt also contains utilities to verify the correct implementation of the layers and to visualize the weights of the first convolution layer.
  • 18. Overview: Introduction; Neural Networks Fundamentals; State of the Art; Implementation; Results; Future Developments.
    So let’s look at the results of the experiments on several data-sets.
  • 19. CIFAR-10 - Canadian Institute For Advanced Research. 60 k images (50 k + 10 k); RGB 32x32; 10 classes: airplane, car, bird, cat, deer, dog, frog, horse, ship, truck.
    To replicate the results obtained by residual networks, I trained my own implementation of the model on CIFAR-10, which consists of 60,000 RGB images of 32 pixels per side spread over 10 disjoint classes. This data-set is much simpler than the previous one, but it is varied enough to show easily whether a network can learn properly from the training samples.
  • 20. CIFAR-10 – Results. Accuracy: 90.41 %; parameters: ~248 k.
    To evaluate a network’s behaviour we can analyze the learning curves, which report the moving average of the cost at each iteration (top), the error on the training data (dotted lines) and the error on the validation data (solid lines, bottom). In this case, with a 20-layer network, the errors reveal a typical phenomenon of these models called overfitting: the network uses its many parameters to learn specific features of the training set without generalizing enough to the validation set, in a certain sense memorizing the training samples. By artificially augmenting the size of the training set we can reduce this phenomenon and obtain better results in terms of validation error.
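As an illustration of artificial data augmentation, the sketch below mirrors each training image left-to-right with probability 0.5. The slide does not say which transformations were applied, so the choice of horizontal flipping (a common one for CIFAR-10), the function names and the tiny 2x3 "image" are all assumptions made for the example.

```python
import random

def random_horizontal_flip(image, rng, p=0.5):
    """Mirror an image (a list of pixel rows) left-to-right with probability p."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return image

def augment_batch(images, rng):
    """Return a new batch where each image is independently flipped or kept."""
    return [random_horizontal_flip(img, rng) for img in images]

rng = random.Random(0)             # fixed seed so the sketch is repeatable
image = [[1, 2, 3],
         [4, 5, 6]]                # tiny 2x3 stand-in for a 32x32 image
batch = augment_batch([image] * 4, rng)
for img in batch:
    print(img)                     # either the original or its mirror image
```

Because a different random variant is drawn at each iteration, the network rarely sees exactly the same sample twice, which makes memorizing the training set harder.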
  • 21. MNIST – Modified National Institute of Standards and Technology. 70 k images (60 k + 10 k); B/W 28x28 handwritten digits; 10 classes (digits from 0 to 9).
    The MNIST data-set was first presented in 1991 and is composed of 70 thousand black-and-white images of handwritten digits, 28 pixels per side: 60 thousand for training and 10 thousand for validation.
  • 22. MNIST – Results. Accuracy: 99.64 %; parameters: 442 k.
    Another way to lower the validation error in residual models is to increase the number of layers or the number of neurons per layer. In this case, for the first two experiments I halved the number of neurons used in the previous experiments and increased the number of layers from 20 to 32. Then, doubling the number of neurons again in the 32-layer model, I obtained an accuracy of 99.64%, which means that the trained network misclassifies just 36 of the 10,000 validation images. The number of filters per layer starts at 16 and is doubled in stages up to 64.
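The relation between the reported accuracy and the number of misclassified images is simple arithmetic, shown here on synthetic labels. The `accuracy` helper and the made-up predictions are assumptions for the example; only the counts (36 errors out of 10,000) come from the talk.

```python
def accuracy(predictions, labels):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(1 for p, t in zip(predictions, labels) if p == t)
    return correct / len(labels)

# Synthetic stand-in for the validation set: 10,000 labels, 36 wrong predictions.
labels = [0] * 10_000
predictions = [0] * 9_964 + [1] * 36
print(f"{accuracy(predictions, labels):.2%}")  # 99.64%
```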
  • 23. MNIST – Results.
    In this figure we can see all 36 misclassified images. In the middle of each cell is the original image, in the top left the correct class, in the bottom left the wrong class predicted by the network, and in the bottom right the network’s second choice by confidence. In many cases the second choice is the correct one; moreover, some of the examples are almost indistinguishable even for the human eye.
  • 24. SFDDD - State Farm Distracted Driver Detection. ~22 k RGB 640x480 images of drivers; 10 distraction classes: safe driving; texting - right; talking on the phone - right; texting - left; talking on the phone - left; operating the radio; drinking; reaching behind; hair and makeup; talking to passenger.
    The last data-set I present was provided by State Farm through Kaggle, a platform that hosts many machine learning competitions. In this case the insurance company wanted to find the best accuracy obtainable in classifying 640x480 RGB images of drivers into 10 distraction classes: safe driving, texting with the right hand, talking on the phone with the right hand, and so on. To simplify the data-set and the training process, I resized all the images to 64x48 pixels, selecting for training a random portion of each image at each iteration, and the central 32x32 portion for validation.
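The two cropping strategies can be sketched as follows. This is an illustration under stated assumptions: the talk gives the 32x32 size only for the validation crop, so using the same size for the random training crop is an assumption, as are the function names and the stand-in image whose "pixels" simply store their own coordinates.

```python
import random

def crop(image, top, left, height, width):
    """Cut a height x width window out of an image given as a list of rows."""
    return [row[left:left + width] for row in image[top:top + height]]

def random_crop(image, size, rng):
    """Training-time crop: a size x size window at a random position."""
    h, w = len(image), len(image[0])
    top = rng.randint(0, h - size)      # randint is inclusive on both ends
    left = rng.randint(0, w - size)
    return crop(image, top, left, size, size)

def center_crop(image, size):
    """Validation-time crop: the central size x size window."""
    h, w = len(image), len(image[0])
    return crop(image, (h - size) // 2, (w - size) // 2, size, size)

# Stand-in for one channel of a resized 64x48 image (48 rows, 64 columns);
# each "pixel" stores its own (row, column) so the crops are easy to inspect.
image = [[(r, c) for c in range(64)] for r in range(48)]
val = center_crop(image, 32)
train = random_crop(image, 32, random.Random(0))
print(val[0][0], val[-1][-1])     # (8, 16) (39, 47)
print(len(train), len(train[0]))  # 32 32
```

The random crop acts as a form of data augmentation during training, while the fixed central crop keeps the validation measurements repeatable.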
  • 25. SFDDD – Results. Accuracy: 99.75 %; parameters: ~636 k.
    Since the competition has not ended yet, the validation data-set is not yet public, so I used a validation set of 2000 images, randomly extracted and excluded from the roughly 22,000 images of the training set. The learning curves are those of two residual networks of 32 and 44 layers. Although the difference in the loss values is almost imperceptible, and although both networks recognize practically all the training samples after 80 epochs, with the 44-layer ResNet I obtained an accuracy of 99.75% on my validation set.
  • 26. SFDDD – Saliency Maps.
    By evaluating and visualizing the derivative of the cost with respect to the input images, we can observe the so-called saliency maps, where lighter areas represent the portions of the image that contribute most to the correct classification by the trained network. Here we can see some samples from the class “talking on the phone with the right hand”. It is interesting to note, for example, how the classification is affected by the steering wheel area, where the hands usually rest, or by the areas where the arm should be; in the fourth picture, instead, the area with the greatest effect is the one around the head.
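The idea of a saliency map, the magnitude of the derivative of the cost with respect to each input value, can be shown on a model small enough to differentiate by hand. The sketch below swaps the trained residual network for a tiny linear softmax classifier, which is an assumption made purely for illustration; the weights `W`, the input `x` and the function names are hypothetical.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def saliency(W, x, target):
    """|d loss / d x| for loss = -log softmax(W x)[target].

    For this linear model the gradient is W^T (p - onehot(target)).
    """
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    p = softmax(scores)
    delta = [pk - (1.0 if k == target else 0.0) for k, pk in enumerate(p)]
    grad = [sum(W[k][i] * delta[k] for k in range(len(W)))
            for i in range(len(x))]
    return [abs(g) for g in grad]

# Hypothetical 2-class model over 3 "pixels"; W ignores the last pixel.
W = [[2.0, -1.0, 0.0],
     [-2.0, 1.0, 0.0]]
x = [0.5, 0.5, 0.5]
print(saliency(W, x, target=0))  # largest for pixel 0, zero for pixel 2
```

For a deep network the same quantity is obtained by back-propagating the cost all the way to the input image instead of stopping at the first layer’s weights.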
  • 27. Overview: Introduction; Neural Networks Fundamentals; State of the Art; Implementation; Results; Future Developments.
    We come then to the conclusion, where I describe some possible future developments and practical applications.
  • 28. Future Developments Extend the framework Implement other models Train on other data-sets In the near future I would like to continue the project by extending the implemented library, for example to support GPU-based hardware-accelerated computation, by developing other types of layers or new models that will be presented in the future, or by training the same model on other data-sets.
  • 29. Examples of Applications in Practice Health . . . Robotics . . . Navigation . . . Physics, Natural Sciences, art and entertainment, ... Convolutional neural networks are being used, for example, in health care to classify chest diseases from X-ray images, to localize and segment tumor masses in the brain, or to segment the pancreas in cross-sectional images of the trunk, which is useful because the pancreas varies in size and shape from person to person. In robotics, they are used for hand-eye coordination in robotic arms with 7 degrees of freedom, to improve the locomotion skills of robots on irregular ground through reinforcement learning algorithms, and to improve the relationship between humans and robots with networks that classify facial expressions. In navigation they are used extensively in autonomous vehicle systems, and for example to detect and classify roads in satellite images or to classify images of road signs. Furthermore, neural models are being used successfully in physics and the natural sciences, for example to classify flowers in botany or plankton in marine biology, and in many other fields.
  • 30. UNIVERSITY OF BERGAMO ENGINEERING DEPARTMENT MASTER OF SCIENCE IN COMPUTER ENGINEERING CONVOLUTIONAL NEURAL NETWORKS FOR COMPUTER VISION Supervisor Mario Verdicchio Candidate Daniele Ettore Ciriello 13 June 2016 Thank you for your attention.
  • 31. Image Credits (1 of 2)
      Slide | Image | Credits | URL (http://)
      10 | Super Vision | UofT | goo.gl/fbyXSK
      10 | UvA-Euvision | UvA | goo.gl/ltNvaP
      12 | ILSVRC samples | ImageNet | goo.gl/PvFCAv
      13 | ILSVRC results 1 | NVIDIA | goo.gl/QkG3Nf
      13 | ILSVRC results 2 | NVIDIA | goo.gl/THMZ1X
      14 | ResNet | MSRA | goo.gl/uR0ZXL
      29 | Medi1 | NVIDIA | goo.gl/DdPSvD
      29 | Medi2 | BRATS | goo.gl/rGgz22
      29 | Medi3 | SPIE | goo.gl/R5mbmj
      29 | Rob1 | Google | goo.gl/6BycQ7
      29 | Rob2 | UBC | goo.gl/A585Iz
      29 | Rob3 | MSRC | goo.gl/3exWqJ
  • 32. Image Credits (2 of 2)
      Slide | Image | Credits | URL (http://)
      29 | Nav1 | Google | goo.gl/DFgPcl
      29 | Nav2 | DeepOSM | goo.gl/sR72BF
      29 | Nav4 | IDSIA | goo.gl/SR16Uk