Deep Learning for Image Analysis
Levan Tsinadze
PulsarAI
Deep Learning Tbilisi
Artificial Neural Network
 Neurons
 Weights
 Graph
Neuron
Specifications
 Input Layer – Hidden Layers – Output Layer
 "Easy" Functions
 Parallelism (Matrices)
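Why matrices: a whole layer reduces to one matrix multiply plus an elementwise function, which is exactly what parallel hardware is good at. A minimal PyTorch sketch (layer sizes are arbitrary):

```python
import torch

# A whole layer is one matrix multiply: every neuron computes in parallel.
x = torch.randn(32, 784)    # batch of 32 inputs, 784 features each
W = torch.randn(784, 128)   # weights of 128 hidden neurons
b = torch.randn(128)        # one bias per neuron
h = torch.relu(x @ W + b)   # elementwise "easy" function on the result
print(h.shape)              # torch.Size([32, 128])
```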
ANN / DNN
Problems in CV
 Object recognition
 Object detection
 Semantic segmentation
Object recognition
Object detection
Semantic segmentation
Hybrid Models – Detect – Recognize
Input Image RGB
Input Image Grayscale
Input Image as Tensor
Convolutional NN
 Image as a 3-dimensional array – a tensor of pixels
 Filters
 A small matrix of weights slides over the image (tensor)
 Extracts features from the image step by step
 Local connectivity and spatial invariance (see the sketch below)
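A sketch of the sliding-filter idea with explicit loops (real layers use optimized kernels; single channel, stride 1, no padding assumed):

```python
import torch

def conv2d_single(image, kernel):
    """Slide one kernel over a 2-D image, no padding, stride 1."""
    H, W = image.shape
    k = kernel.shape[0]
    out = torch.zeros(H - k + 1, W - k + 1)
    for i in range(H - k + 1):
        for j in range(W - k + 1):
            patch = image[i:i + k, j:j + k]      # local connectivity
            out[i, j] = (patch * kernel).sum()   # same weights at every location
    return out

image = torch.randn(6, 6)
kernel = torch.randn(3, 3)
print(conv2d_single(image, kernel).shape)  # torch.Size([4, 4])
```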
Convolutional Neural Networks
CNN
Filter – Input – Output
 Input – tensor
 Output – matrix (one per filter)
 Depth is reduced
Convolutional Block
 Convolutional filters (often more than one)
 Activation
 Pooling
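One block in PyTorch, a sketch with illustrative channel counts and kernel sizes:

```python
import torch
import torch.nn as nn

# Convolution → activation → pooling: one convolutional block.
block = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3),  # several filters
    nn.ReLU(),                                                 # activation
    nn.MaxPool2d(kernel_size=2),                               # pooling
)
print(block(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 16, 15, 15])
```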
Features by Blocks - Hierarchy
Number of Filters
 Input channels – image depth (RGB)
 Output channels – hyperparameter: the number of convolutional filters (kernels)
 Feature map – the matrix generated as a filter visits each patch
 Feature maps stacked on top of each other (see the shape check below)
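A quick shape check of these terms (sizes chosen for illustration):

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=10, kernel_size=3)  # 10 filters
x = torch.randn(1, 3, 6, 6)   # input channels = image depth (RGB)
y = conv(x)                   # one feature map per filter
print(y.shape)                # torch.Size([1, 10, 4, 4]): maps stacked along depth
```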
Convolution Example
Many to One
Weight Sharing
 Feature maps – a small number of weights
 6x6 input, 10 feature matrices of 3x3 each: 90 parameters, producing a 4x4x10 tensor
 A fully connected layer from a 224x224x3 input to 2048 units: over 100 million parameters
 A fully connected 6x6 → 4x4 mapping alone: 6x6x4x4 = 576 weights (vs. 9 for one shared 3x3 filter)
 32 feature maps
 256, or even 512 or 1024, feature maps
 Still better precision with fewer parameters = less computation (see the count check below)
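These counts can be verified directly; a sketch with bias-free layers (the dense layer here counts all 224x224x3 inputs, landing in the hundreds of millions either way):

```python
import torch.nn as nn

conv = nn.Conv2d(1, 10, kernel_size=3, bias=False)  # 10 shared 3x3 filters
fc = nn.Linear(224 * 224 * 3, 2048, bias=False)     # dense layer on a flattened image

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(conv))  # 90 – the slide's 10 filters of 3x3 each
print(count(fc))    # 308281344 – hundreds of millions of weights
```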
Forward Pass – Sparse FFNN
Activations
 ConvLayer
 Activation (Elementwise)
 ConvLayer
 etc.
 Tensor → Tensor → Tensor → ...
ReLU
 f(x) = max(0, x), i.e. x = x < 0 ? 0 : x
 9:15 – Graph
 df/dx = 1 for x > 0, 0 for x < 0; taken as 0 at x = 0 by convention
 Easy to Propagate (Forward and Back)
 Easy to See Results
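ReLU and its gradient in a few lines; PyTorch takes the gradient at x = 0 to be 0:

```python
import torch

x = torch.tensor([-2.0, 0.0, 3.0], requires_grad=True)
y = torch.relu(x)       # max(0, x) elementwise
y.sum().backward()      # d/dx is 0 for x <= 0, 1 for x > 0
print(y)                # tensor([0., 0., 3.])
print(x.grad)           # tensor([0., 0., 1.])
```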
ReLU - Watch
ReLU - Graph
Pooling
Pooling Types
 Max-Pooling
 Average-Pooling
 L2-Norm-Pooling
Pooling – Down-Sampling
Pooling only on Matrices
 Operates on matrices only
 Applied to each channel separately
 Depth stays the same
 Height shrinks
 Width shrinks (see the sketch below)
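A sketch showing pooling applied per channel: height and width halve, depth is untouched (sizes illustrative):

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2)  # alternatives: nn.AvgPool2d, nn.LPPool2d
x = torch.randn(1, 16, 28, 28)      # 16 channels
y = pool(x)
print(y.shape)                      # torch.Size([1, 16, 14, 14]): depth unchanged
```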
Dropout
MNIST
LeNet
 Convolutional Layer
 Max pooling layer
 Convolutional layer
 Max pooling layer
 Dropout (if training)
 Fully connected layer
 Dropout (if training)
 Fully connected layer
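The layer list above as a PyTorch module, a LeNet-style sketch for MNIST; channel and hidden sizes are assumptions, and dropout switches off automatically in eval mode:

```python
import torch
import torch.nn as nn

class LeNetStyle(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5),   # convolutional layer
            nn.ReLU(),
            nn.MaxPool2d(2),                  # max pooling layer
            nn.Conv2d(6, 16, kernel_size=5),  # convolutional layer
            nn.ReLU(),
            nn.MaxPool2d(2),                  # max pooling layer
        )
        self.classifier = nn.Sequential(
            nn.Dropout(),                     # active only in training mode
            nn.Linear(16 * 4 * 4, 120),       # fully connected layer
            nn.ReLU(),
            nn.Dropout(),                     # active only in training mode
            nn.Linear(120, 10),               # fully connected layer
        )

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

print(LeNetStyle()(torch.randn(1, 1, 28, 28)).shape)  # torch.Size([1, 10])
```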
Training Epochs
 Train
 Shuffle
 Train again
 Shuffle only the training set
 Helps avoid local minima (see the loop sketch below)
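A sketch of the epoch loop; with shuffle=True the DataLoader reorders the training set on every pass (the dataset and model here are random stand-ins):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for MNIST: 256 random 28x28 images, 10 classes.
train_set = TensorDataset(torch.randn(256, 1, 28, 28), torch.randint(0, 10, (256,)))
loader = DataLoader(train_set, batch_size=64, shuffle=True)  # reshuffled each epoch

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.CrossEntropyLoss()

model.train()
for epoch in range(10):              # one epoch = one pass over the training set
    for images, labels in loader:    # shuffle=True gives a fresh order each epoch
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```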
PyTorch
 Dynamic execution
 Imperative
 Object-oriented
 http://pytorch.org/
Practical Example
Fine-Tuning / Transfer Learning
 Freeze lower layers
 Retrain higher layers
 Small dataset
 Fast training
 Less resources
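A common recipe, sketched with torchvision's pretrained ResNet-18 (next slide); everything is frozen except a new classification head (the 5-class head is an arbitrary example):

```python
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

for param in model.parameters():     # freeze the lower layers
    param.requires_grad = False

model.fc = nn.Linear(model.fc.in_features, 5)  # new trainable head, 5 classes
# Only model.fc.parameters() now require gradients → fast, low-resource training.
```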
ResNet-18
 Residual connections
 Vanishing gradient
 Very deep convolutional neural network
 Simple architecture
 Pretrained weights available (trained on ImageNet)
Residual Connections
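A hedged sketch of a basic residual block: the input bypasses the convolutions and is added back, giving gradients a direct path past the layers:

```python
import torch
import torch.nn as nn

class BasicBlock(nn.Module):
    """Simplified residual block (stride 1, equal channel counts)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = self.bn2(self.conv2(torch.relu(self.bn1(self.conv1(x)))))
        return torch.relu(out + x)   # residual connection: add the input back

x = torch.randn(1, 64, 56, 56)
print(BasicBlock(64)(x).shape)       # torch.Size([1, 64, 56, 56])
```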
Practical Example
One / N – Shot Learning
 Retraining problems
 Labeling problems (letters OK, numbers OK, faces NOT OK)
 Feature extractors
 Feature search
Embedding Vectors
Feature Extractor
 Text
 Image
 Features → vector (see the sketch below)
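One way to get such vectors, as a sketch: take a pretrained CNN and drop its classification head, keeping the pooled features:

```python
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
extractor = nn.Sequential(*list(backbone.children())[:-1])  # drop the final fc layer
extractor.eval()

with torch.no_grad():
    image = torch.randn(1, 3, 224, 224)       # placeholder image tensor
    embedding = extractor(image).flatten(1)   # (1, 512) feature vector
print(embedding.shape)
```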
Faces
Image Similarities
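Similarity then reduces to comparing embedding vectors, for example by cosine similarity:

```python
import torch
import torch.nn.functional as F

a = torch.randn(512)   # embedding of image A (e.g. from the extractor above)
b = torch.randn(512)   # embedding of image B
similarity = F.cosine_similarity(a, b, dim=0)  # close to 1 → similar images
print(similarity.item())
```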

Deep Learning for Image Analysis