Convolutional Neural Network

Convolutional Neural Networks
RTSS JUN YOUNG PARK

What is a ConvNet (CNN)?
[Figure: input image → C R C R P … (Convolution, ReLU, Pooling; reduce dimension) → FC → DOG / POLAR BEAR / WOLF / ELEPHANT]

Discrete Convolution
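Input data (4x4):
1 2 3 0
0 1 2 3
3 0 1 2
2 3 0 1
Kernel (3x3):
2 0 1
0 1 2
1 0 2
Output data (2x2) = Input data * Kernel:
15 16
6 15
- Apply FMA (Fused Multiply-Add) over each kernel-sized area of the input to get each output value: multiply element by element, then sum (Σ).
- Example for the top-left area. The element-wise products are
2 0 3
0 1 4
3 0 2
and their sum is 15.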
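The multiply-then-sum procedure on this slide can be sketched in NumPy. This is a minimal illustration, not the author's code; the function name `conv2d_valid` is made up for the example, and it assumes stride 1, no padding, and no kernel flip (as in the slide's arithmetic).

```python
import numpy as np

def conv2d_valid(x, k):
    """Slide kernel k over x with stride 1 and no padding (kernel is not flipped)."""
    out_h = x.shape[0] - k.shape[0] + 1
    out_w = x.shape[1] - k.shape[1] + 1
    out = np.zeros((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            window = x[i:i + k.shape[0], j:j + k.shape[1]]
            out[i, j] = np.sum(window * k)  # multiply element-wise, then sum
    return out

x = np.array([[1, 2, 3, 0],
              [0, 1, 2, 3],
              [3, 0, 1, 2],
              [2, 3, 0, 1]])
k = np.array([[2, 0, 1],
              [0, 1, 2],
              [1, 0, 2]])

result = conv2d_valid(x, k)
print(result)  # [[15 16]
               #  [ 6 15]]
```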

Padding
SAME: zero-pad the border (Padding = 1)
0 0 0 0 0 0
0 12 27 7 3 0
0 31 9 6 8 0
0 9 15 3 12 0
0 6 3 30 13 0
0 0 0 0 0 0
VALID: no padding
12 27 7 3
31 9 6 8
9 15 3 12
6 3 30 13
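As a small sketch of the zero padding shown here, `np.pad` adds a one-cell border of zeros around the 4x4 matrix. The variable names are chosen for illustration only.

```python
import numpy as np

x = np.array([[12, 27, 7, 3],
              [31, 9, 6, 8],
              [9, 15, 3, 12],
              [6, 3, 30, 13]])

# Padding = 1: add one row/column of zeros on every side (4x4 -> 6x6).
padded = np.pad(x, pad_width=1, mode="constant", constant_values=0)
print(padded.shape)  # (6, 6)
```

With a 3x3 kernel and stride 1, this padded input yields a 4x4 output, the same size as the unpadded input.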

Stride - 1
KERNEL = 3X3, STRIDE = 1, PADDING = SAME
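Padded input (7x7):
0 0 0 0 0 0 0
0 7 10 2 9 3 0
0 12 27 7 3 2 0
0 31 9 6 8 6 0
0 9 15 3 12 4 0
0 6 3 30 13 5 0
0 0 0 0 0 0 0
Kernel (3x3):
2 0 1
0 1 2
1 0 2
Output = Padded input * Kernel (5x5). The kernel moves one cell per step (Stride):
a b c d e
f g h i j
k l m n o
p .. .. .. ..
.. .. .. .. ..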

Stride - 2
KERNEL = 3X3, STRIDE = 2, PADDING = SAME
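Padded input (7x7):
0 0 0 0 0 0 0
0 7 10 2 9 3 0
0 12 27 7 3 2 0
0 31 9 6 8 6 0
0 9 15 3 12 4 0
0 6 3 30 13 5 0
0 0 0 0 0 0 0
Kernel (3x3):
2 0 1
0 1 2
1 0 2
Output = Padded input * Kernel (3x3). The kernel moves two cells per step (Stride):
A B C
D E F
G H I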

Output Size
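$$SZ_{output} = \frac{N - F}{\text{Stride}} + 1$$
Valid when $SZ_{output} \in \mathbb{N}$. Here N is the input size including padding and F is the kernel size.
Example with the padded 7x7 input and a 3x3 kernel: Stride 1 gives (7 - 3)/1 + 1 = 5, and Stride 2 gives (7 - 3)/2 + 1 = 3.
0 0 0 0 0 0 0
0 7 10 2 9 3 0
0 12 27 7 3 2 0
0 31 9 6 8 6 0
0 9 15 3 12 4 0
0 6 3 30 13 5 0
0 0 0 0 0 0 0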
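The output-size rule can be checked with a short helper. This is a sketch under the assumption that `n` is the input size including padding and `f` is the kernel size; the function name `output_size` is hypothetical.

```python
def output_size(n, f, stride):
    """Return (n - f) / stride + 1, or raise if the result is not a natural number."""
    if n < f or (n - f) % stride != 0:
        raise ValueError("kernel and stride do not fit the input evenly")
    return (n - f) // stride + 1

# Padded 7x7 input with a 3x3 kernel, as on the Stride slides.
print(output_size(7, 3, 1))  # 5
print(output_size(7, 3, 2))  # 3
```

A combination such as `output_size(6, 3, 2)` raises an error, because (6 - 3)/2 + 1 is not an integer.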

Fully Connected Layer
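[Figure: C R C R P … P (reduce dimension) → FC → DOG / POLAR BEAR / WOLF / ELEPHANT. Figure labels: "By previous procedures …", "32 x 32 x 3", "Build Classifier"]
<Feature extraction>: the convolution, ReLU, and pooling layers
<Classification>: the fully connected (FC) layer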

Convolution Layers
(32, 32, 3) → Convolution, ReLU with 6 (5,5,3) filters → (28, 28, 6) → Convolution, ReLU with 10 (5,5,6) filters → (24, 24, 10) → ……
$$N_{param} = 5 \cdot 5 \cdot 3 \cdot 6 + 5 \cdot 5 \cdot 6 \cdot 10$$
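The shapes and the parameter count on this slide can be reproduced with a few lines of Python. This is an illustrative sketch: `conv_layer` is a made-up helper, it assumes stride 1 with no padding, and it counts filter weights only (no biases), as the slide's formula does.

```python
def conv_layer(in_shape, num_filters, f):
    """Return (output shape, weight count) for f x f filters, stride 1, no padding."""
    h, w, c = in_shape
    out_shape = (h - f + 1, w - f + 1, num_filters)
    weights = f * f * c * num_filters  # each filter spans all input channels
    return out_shape, weights

shape1, params1 = conv_layer((32, 32, 3), num_filters=6, f=5)
shape2, params2 = conv_layer(shape1, num_filters=10, f=5)
total_params = params1 + params2

print(shape1, shape2)   # (28, 28, 6) (24, 24, 10)
print(total_params)     # 1950
```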

Pooling layer
[Figure: Convolution layer → Pooling (slice!) → Size reduction]

Pooling (Subsampling)
◦ Objective
◦ Reduce the number of rows and columns of the matrix
◦ Advantages
◦ No parameters to train
◦ The number of channels does not change
◦ Less sensitive to small variations in the input
◦ Methods
◦ Max Pooling: use the maximum value of the target area
◦ Mean Pooling: use the average value of the target area
Kernel: 2x2, Stride: 2
Input:
12 27 7 3
31 9 6 8
9 15 3 12
6 3 30 13
<Max Pooling>
31 8
15 30
<Mean Pooling>
19.75 6.00
8.25 14.50
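Both pooling methods can be sketched with one NumPy helper that applies a reduction to each 2x2 area. The name `pool2d` and its signature are assumptions made for this example.

```python
import numpy as np

def pool2d(x, size, stride, reduce_fn):
    """Apply reduce_fn (e.g. np.max or np.mean) to each size x size area."""
    out_h = (x.shape[0] - size) // stride + 1
    out_w = (x.shape[1] - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = reduce_fn(x[r:r + size, c:c + size])
    return out

x = np.array([[12, 27, 7, 3],
              [31, 9, 6, 8],
              [9, 15, 3, 12],
              [6, 3, 30, 13]])

max_pooled = pool2d(x, size=2, stride=2, reduce_fn=np.max)
mean_pooled = pool2d(x, size=2, stride=2, reduce_fn=np.mean)
print(max_pooled)   # [[31.  8.] [15. 30.]]
print(mean_pooled)  # [[19.75  6.  ] [ 8.25 14.5 ]]
```

The top-left mean is (12 + 27 + 31 + 9) / 4 = 19.75.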

Applications
◦ LeNet-5 (1998)
◦ AlexNet (2012)
◦ GoogLeNet (2014)
◦ Inception module: layers composed in parallel, with their outputs combined.
◦ 1x1 convolution: acts as a fully connected layer applied across the channels at every pixel position.
◦ ResNet (2015)
◦ Skip connections: the input steps over some layers and is added to their output (Residual Net)
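Two ideas from this list can be illustrated in NumPy: a 1x1 convolution computed as a per-pixel dense layer, and a skip connection that adds the input back to a layer's output. This is a sketch with made-up shapes and names (`feature_map`, `weights`, `residual_block`), not code from any of the networks listed.

```python
import numpy as np

rng = np.random.default_rng(0)
feature_map = rng.normal(size=(4, 4, 3))  # (height, width, in_channels)
weights = rng.normal(size=(3, 2))         # (in_channels, out_channels)

# 1x1 convolution: the same channel-mixing weights are applied at every pixel.
conv_out = np.einsum("hwc,co->hwo", feature_map, weights)

# The same result from a dense layer applied to each pixel's channel vector.
dense_out = (feature_map.reshape(-1, 3) @ weights).reshape(4, 4, 2)
same = np.allclose(conv_out, dense_out)

def residual_block(x, layer_fn):
    """Skip connection: add the unchanged input to the layer output."""
    return layer_fn(x) + x

skipped = residual_block(np.ones(3), lambda v: 2 * v)  # 2*1 + 1 per element
```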

Practical Use
◦ Build a neural network with TensorFlow's simplified API and a class
◦ Train with the MNIST dataset
◦ Test with the MNIST dataset and our own handwritten data
◦ Apply some techniques that we discussed before
◦ Ensemble, Dropout, Batch

Define class for a model
Model
+ keep_prob
+ X, Y
+ sess
+ name
+ bool training
+ layers(conv, pool, dropout, dense …)
+ __init__(self, sess, name)
+ _build_net(self)
+ predict(self,x_test,training)
+ get_accuracy(self,x_test,y_test,training=False)
+ train(self, x_data, y_data, training=True)

Shape of Layers
IMG [32, 32, 1]
→ FILTER1, ReLU → [32, 32, 32]
→ POOL1 (Stride = 2) → [16, 16, 32]
→ FILTER2, ReLU → [16, 16, 64]
→ POOL2 (Stride = 2) → [8, 8, 64]
→ FILTER3, ReLU → [8, 8, 128]
→ POOL3 (Stride = 2) → [4, 4, 128]
→ Flatten → [4*4*128, 625], ReLU
→ [625, 10] → Logits: 0 ~ 9
The last number in each layer shape is the # of filters. Dropout ("Drop") is marked twice in the figure.
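The layer shapes above can be traced with a few lines of Python. The sketch assumes that each convolution keeps the spatial size (SAME padding) and that each pooling step divides it by the stride, rounding up; both assumptions are inferred from the shapes on the slide, and the helper names are hypothetical.

```python
import math

def same_conv(shape, num_filters):
    """SAME-padded convolution: height and width unchanged, channels = # of filters."""
    h, w, _ = shape
    return (h, w, num_filters)

def pool(shape, stride=2):
    """Pooling with the given stride: height and width shrink, channels unchanged."""
    h, w, c = shape
    return (math.ceil(h / stride), math.ceil(w / stride), c)

shape = (32, 32, 1)
trace = [shape]
for num_filters in (32, 64, 128):
    shape = same_conv(shape, num_filters)
    trace.append(shape)
    shape = pool(shape)
    trace.append(shape)

flat_size = shape[0] * shape[1] * shape[2]  # input width of the first dense layer
print(trace[-1], flat_size)  # (4, 4, 128) 2048
```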

Implement Ensemble
[Figure: Model 1 . . . Model 10 → Predict (each model) → + (combine the predictions) → mean]
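The ensemble step can be sketched as follows: stack the per-model prediction arrays, take their mean, and pick the class with the largest averaged score. The function name `ensemble_predict` and the toy two-class outputs are made up for illustration.

```python
import numpy as np

def ensemble_predict(per_model_outputs):
    """Average the predictions of all models, then pick the best class per sample."""
    stacked = np.stack(per_model_outputs)  # (models, samples, classes)
    mean_prediction = stacked.mean(axis=0)
    return mean_prediction.argmax(axis=1)

# Toy outputs of three models for two samples and two classes.
outputs = [
    np.array([[0.6, 0.4], [0.2, 0.8]]),
    np.array([[0.3, 0.7], [0.1, 0.9]]),
    np.array([[0.7, 0.3], [0.4, 0.6]]),
]
labels = ensemble_predict(outputs)
print(labels)  # [0 1]
```

The second model alone would label the first sample as class 1, but the averaged prediction picks class 0.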

Test Result
◦ Models: 5
◦ Epochs: 15
◦ Ensemble Accuracy : 99.05%

Test ‘Hand-Written’ Data
Scan and resample each number.
Image to test: [28,28,1] bitmap

Self Assignment
◦ Complete the unfinished handwritten digit recognizer.
◦ Problems
◦ The pyplot.imshow() method automatically reads the image as 4 channels (RGBA). (Solved)
◦ Type Mismatch error: the type that TensorFlow requires may not have been set correctly.

Trial and Error
◦ Convert the RGB image to a monochrome (grayscale) bitmap

Troubleshooting
◦ Content of MNIST dataset
[Figure: sample MNIST digits …… with white content on a black background. The data is annotated as (# of Test Images, Size of each image), where 784 = 28*28]

Troubleshooting
◦ Causes of failure
◦ Unlike the MNIST dataset, the handwritten data has black content on a white background.
◦ After conversion to grayscale, the pixel values are much larger than those in the MNIST dataset.
◦ Solutions
◦ Invert the handwritten data so that it has a black background and white content.
◦ Normalize the grayscale-converted image.

Troubleshooting
◦ Read each image from the folder
◦ Convert to grayscale (28x28x3 -> 28x28)
◦ Reshape (28x28 -> 784)
◦ Append the image to the list.
◦ Convert the list to ndarray.
◦ Apply normalization.
◦ Get sum of predictions from models.
◦ Print the index of the max prediction value of each image.
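The steps above, together with the inversion fix from the previous slide, can be sketched in NumPy. Several details are assumptions made for illustration: grayscale conversion uses a plain channel mean, pixel values are 0 to 255 and are scaled to the range 0 to 1, and each model is represented by a stand-in callable that returns one row of 10 scores per image. Reading the files from the folder is left out.

```python
import numpy as np

def preprocess(images_rgb):
    """images_rgb: list of (28, 28, 3) uint8 arrays with dark ink on a light page."""
    rows = []
    for img in images_rgb:
        gray = img.astype(np.float32).mean(axis=2)  # 28x28x3 -> 28x28 (channel mean)
        rows.append(gray.reshape(784))              # 28x28 -> 784
    data = np.array(rows)                           # list -> ndarray
    data = 255.0 - data                             # invert: black background, white content
    return data / 255.0                             # normalize to [0, 1]

def predict_digits(data, models):
    """Sum the predictions of all models and return the index of the max per image."""
    total = np.zeros((data.shape[0], 10))
    for model in models:
        total += model(data)
    return total.argmax(axis=1)

# Toy input: a white page with a single black pixel.
page = np.full((28, 28, 3), 255, dtype=np.uint8)
page[14, 14] = 0
data = preprocess([page])

# Stand-in "models" that always vote for digit 3 (hypothetical, for the demo only).
dummy_models = [lambda d: np.tile(np.eye(10)[3], (d.shape[0], 1))] * 2
digits = predict_digits(data, dummy_models)
print(data.shape, digits)  # (1, 784) [3]
```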

Self Test
◦ Explain discrete convolution.
◦ Explain the two representative pooling methods.
◦ Explain stride and padding, and describe how they change the output size.
