3. Contents
- CNN components
  - Convolutional layer
  - Pooling layer
- CNN architecture
  - General architecture of CNN
  - Case study (AlexNet, VGG)
- CNN implementation
- Other computer vision tasks
  - Semantic segmentation
4. Pattern Recognition in Computer Vision
Traditional method
- Design a hand-crafted feature vector for the data
- Use a classifier such as an SVM or MLP; can be trained with a small amount of data

[Figure: data → hand-crafted feature extractor (feature representation) → simple trainable classifier → output]
5. Pattern Recognition in Computer Vision
Deep-learning-based method
- Jointly optimize the feature representation and the classifier for the data
- Makes end-to-end learning algorithms possible

[Figure: data → trainable feature extractor → trainable classifier → output; the two trainable stages form a single "black box"]
6. MLP (= Fully connected layer)
- Building a network for image data with an MLP:
  - a 32x32x3 image ≅ a 3072-dimensional input vector
  - the number of learnable parameters grows → more training data is required

Convolutional layer
- Convolve a filter with an image (preserves spatial structure)
- Slide the filter over the image spatially and compute dot products
- Parameter sharing & local connectivity
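To make the parameter-count contrast concrete, a back-of-the-envelope comparison in plain Python (the 100-neuron hidden layer is an arbitrary illustrative size, not from the slides):

```python
# MLP: flatten a 32x32x3 image to 3072 values and feed a
# fully connected hidden layer of 100 neurons (illustrative size).
fc_params = (32 * 32 * 3) * 100 + 100   # weights + biases
print(fc_params)                         # 307300

# Convolutional layer: one 5x5x3 filter is shared across all
# spatial locations, so it needs only 5*5*3 weights + 1 bias.
conv_params_per_filter = 5 * 5 * 3 + 1
print(conv_params_per_filter)            # 76
```

Parameter sharing is exactly what makes the second number independent of the image size.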
7. Structure of an MLP
- Design parameters: the number of hidden layers and the number of neurons in each layer

Convolutional layer
- Design parameters: convolution filter (kernel) size, the number of layers, the number of filters in each layer, etc.

[Figure: an MLP (input layer → hidden layer → output layer) alongside a 32x32x3 image (width x height x channel) and a 5x5x3 filter]
8. Convolutional layer
- Convolve a filter with an image (preserve spatial structure)
- Slide over the image spatially and compute dot products

[Figure: a 5x5x3 filter convolved (slid) over all spatial locations of a 32x32x3 image, producing an activation map (feature map)]

Number of parameters = 5x5x3 + 1 (bias) = 76
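The sliding dot product can be sketched in a few lines of plain Python; this is a single-channel "valid" convolution (no padding, stride 1), an assumption consistent with the slide's size arithmetic:

```python
def conv2d(image, kernel):
    """'Valid' 2D convolution: slide the kernel over every spatial
    location and take a dot product (cross-correlation, as in CNNs)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - (kh - 1), iw - (kw - 1)   # output shrinks by (k - 1)
    out = [[0.0] * ow for _ in range(oh)]
    for y in range(oh):
        for x in range(ow):
            out[y][x] = sum(image[y + i][x + j] * kernel[i][j]
                            for i in range(kh) for j in range(kw))
    return out

# 5x5 image, 3x3 kernel -> 3x3 activation map
img = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
fmap = conv2d(img, [[1.0] * 3 for _ in range(3)])
print(len(fmap), len(fmap[0]))   # 3 3
```

A real convolutional layer does the same thing per filter, summed over all input channels, plus a bias.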
9. Convolutional layer
Consider a second (green) filter:

[Figure: two 5x5x3 filters convolved (slid) over all spatial locations of the same 32x32x3 image, each producing its own activation map — one depth slice of the output]
10. Convolutional layer
- Using six 5x5 filters → 6 separate activation maps (output depth = 6)
- Activation map size: input size − (kernel size − 1)
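The size rule above, checked in plain Python for the running 32x32 example:

```python
def out_size(in_size, kernel_size):
    # 'valid' convolution, stride 1, no padding
    return in_size - (kernel_size - 1)

print(out_size(32, 5))   # a 5x5 filter maps 32x32 -> 28x28
print(out_size(28, 5))   # a second 5x5 layer maps 28x28 -> 24x24
```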
11. Convolutional layer
A convolutional neural network is a sequence of convolution layers, interspersed with activation functions.

[Figure: input → CONV + ReLU (six 5x5x3 filters) → CONV + ReLU (ten 5x5x6 filters) → CONV + ReLU → ...]

How many parameters are in the first and second convolutional layers?
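An answer sketch for the question above (assuming each second-layer filter spans the full depth of 6 produced by the first layer's six filters):

```python
def conv_layer_params(k, in_depth, num_filters):
    # each filter has k*k*in_depth weights plus one bias
    return (k * k * in_depth + 1) * num_filters

first = conv_layer_params(5, 3, 6)     # six 5x5x3 filters
second = conv_layer_params(5, 6, 10)   # ten 5x5x6 filters
print(first, second)                   # 456 1510
```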
12. Convolutional layer
Feature map size with 5x5 convolutional filters: 12x12 → (CONV 5x5) → 8x8 → (CONV 5x5) → 4x4

Receptive field
- The size of the input region that determines a single value in the activation map
- The receptive field of an activation map produced by two convolutional layers with 5x5 filters is 9x9
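Both numbers on this slide can be verified together (stride-1, no-padding convolutions; each layer shrinks the map and widens the receptive field by k − 1):

```python
def stack(in_size, kernel_sizes):
    """Track feature-map size and receptive field through a stack
    of stride-1, no-padding convolutional layers."""
    size, rf = in_size, 1
    for k in kernel_sizes:
        size -= k - 1   # 'valid' convolution shrinks the map
        rf += k - 1     # each layer widens the receptive field
    return size, rf

print(stack(12, [5, 5]))   # (4, 9): 12x12 -> 8x8 -> 4x4, RF 9x9
```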
16. Architecture
General architecture: a sequence of convolution layers (each with an activation function) and pooling layers

CNN architecture for classification
- The key idea is to use convolutional and pooling layers to reduce the spatial size of the feature map to 1x1
- The depth of the output layer equals the number of classes to be classified
17. Case study
AlexNet (2012): 8 layers
INPUT – 227x227x3
CONV1 (11x11x3x96 filter with stride 4) – 55x55x96
POOL1 (3x3 pooling with stride 2) – 27x27x96
CONV2 (5x5x96x256 filter with stride 1 and padding 2) – 27x27x256
POOL2 (3x3 pooling with stride 2) – 13x13x256
CONV3 (3x3x256x384 filter with stride 1 and padding 1) – 13x13x384
CONV4 (3x3x384x384 filter with stride 1 and padding 1) – 13x13x384
CONV5 (3x3x384x256 filter with stride 1 and padding 1) – 13x13x256
Fully-connected (4096)
Fully-connected (4096)
Fully-connected (1000)

Feature map size = (input size + 2 x padding – filter size) / stride + 1
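The feature-map-size formula, checked against the AlexNet layer sizes listed above:

```python
def fmap_size(in_size, filt, stride=1, pad=0):
    # feature map size = (input size + 2*padding - filter size) / stride + 1
    return (in_size + 2 * pad - filt) // stride + 1

print(fmap_size(227, 11, stride=4))   # CONV1 -> 55
print(fmap_size(55, 3, stride=2))     # POOL1 -> 27
print(fmap_size(27, 5, pad=2))        # CONV2 -> 27
print(fmap_size(27, 3, stride=2))     # POOL2 -> 13
print(fmap_size(13, 3, pad=1))        # CONV3-CONV5 -> 13
```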
19. Case study
VGG (2014): small filters & deeper network

Why use small 3x3 filters?
: A stack of three 3x3 convolutional layers (with stride 1) achieves the same receptive-field effect as a single 7x7 convolutional layer, using relatively fewer parameters.
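The VGG argument in numbers, for an illustrative channel count C (assuming input and output depth both equal C, as in the standard comparison; biases omitted):

```python
def receptive_field(kernel_sizes):
    # receptive field of stacked stride-1 convolutional layers
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf

C = 64                              # illustrative channel count
three_3x3 = 3 * (3 * 3 * C * C)     # 27 * C^2 weights
one_7x7 = 7 * 7 * C * C             # 49 * C^2 weights
print(receptive_field([3, 3, 3]))   # 7, same as a single 7x7 filter
print(three_3x3 < one_7x7)          # the 3x3 stack uses fewer weights
```

The stack also inserts a nonlinearity after each 3x3 layer, which the single 7x7 layer lacks.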