An Introduction to CONVOLUTIONAL NEURAL NETS
2019. 3.
김 홍 배
Contents
 Convolutional Neural Nets ?
 Applications of Convolutional Neural Nets
 How Convolutional Neural Nets work
 The evolution of Convolutional Neural Nets
 Brief intro : Invariance and Equivariance
 Limitations of CNN
 Group CovNet
 Capsule Net
CONVOLUTIONAL NEURAL NETS ?

Neural Nets: x_image (28x28) → reshape 28x28 → 784x1 vector x → y = softmax(Wx + b) → 10 digits

# of unknown parameters to estimate = # of weights + # of biases
= 784x10 + 10 = 7,850 !!!

• An ordinary neural net starts directly from the pixel values of the input image
• High-resolution images therefore cannot be processed at high speed
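As a quick check on the arithmetic above, the same count in two lines of Python (names are illustrative):

    n_inputs, n_classes = 28 * 28, 10             # 784 pixel inputs, 10 digit classes
    n_params = n_inputs * n_classes + n_classes   # weights + biases
    print(n_params)                               # 7850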
CONVOLUTIONAL NEURAL NETS ?
 Networks for deep-learning-based visual recognition

• A CNN extracts features patch by patch, using small, simply-shaped filters (kernels)
• Moving up the layer hierarchy, these features are assembled into the overall shape of the object
 The number of parameters to estimate shrinks
CONVOLUTIONAL NEURAL NETS ?
Types of inputs

• Color images are three-dimensional and so have a volume
• Time-domain speech signals are 1-d, while frequency-domain representations (e.g. MFCC vectors) take a 2-d form; they can also be viewed as a time sequence
• Medical images (such as CT/MR/etc.) are multi-dimensional
• Videos have an additional temporal dimension compared to still images
• Variable-length sequences and time-series data are again multi-dimensional
• Hence it makes sense to model them as tensors instead of vectors (see the shape sketch below)
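A minimal NumPy sketch of the input shapes listed above; the concrete sizes are illustrative assumptions, not values from the slides:

    import numpy as np

    color_image = np.zeros((224, 224, 3))      # height x width x RGB channels (a volume)
    mfcc_seq    = np.zeros((100, 13))          # time steps x MFCC coefficients (2-d)
    ct_volume   = np.zeros((64, 512, 512))     # slices x height x width (multi-dimensional)
    video_clip  = np.zeros((16, 224, 224, 3))  # frames add a temporal axis to images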
Applications of CONVOLUTIONAL NEURAL NETS
CNNs are everywhere

• Image retrieval from databases
• Object detection
• Self-driving cars
• Semantic segmentation
• Face recognition (FB tagging)
• Pose estimation
• Disease detection
• Speech recognition
• Text processing
• Analysing satellite data
 Image captioning
Combining visual recognition with an RNN that generates sentences, the network produces a description of a given image
 Object detection and recognition
Using visual recognition, the network outputs each object's class and bounding box (BB) through a stack of various convolution layers
Redmon et al. You Only Look Once: Unified, Real-Time Object Detection, CVPR 2016
 Semantic segmentation
Using visual recognition, the network labels the image pixel by pixel
 End-to-end learning for a self-driving car
The network learns to map visual input directly to driving behavior, performing autonomous driving
(sequential front-view images → convolutional layers (CLs) → fully connected layers (FCLs) → action)
https://youtu.be/qhUvQiKec2U
 Smart picking robot based on deep learning
Training an industrial robot through visual recognition and reinforcement learning
(input: 224x224 pixels, output: 90x1)
The structure of CONVOLUTIONAL NEURAL NETS
A CNN consists of a feature-extraction part and a classification part (Feature Extraction Layer → Classification Layer)
How CONVOLUTIONAL NEURAL NETS work
Networks Architecture

x_image (28x28)
 → convolution (5x5, s=1) → h_conv1 (28x28x32), 32 features      [1st convolutional layer]
 → max pooling (2x2, s=2) → h_pool1 (14x14x32), 32 channels
 → convolution (5x5, s=1) → h_conv2 (14x14x64), 64 features      [2nd convolutional layer]
 → max pooling (2x2, s=2) → h_pool2 (7x7x64)
 → reshape: 7*7*64 tensor → 3,136x1 vector
 → fully connected layer (1,024 neurons)
 → readout layer (10 digits)
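A PyTorch sketch of this exact pipeline. The layer names mirror the slide; the ReLU placement and the padding of 2 (to keep 28x28 after a 5x5 convolution) are assumptions the slide does not spell out:

    import torch
    import torch.nn as nn

    class MnistCNN(nn.Module):
        def __init__(self):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 32, kernel_size=5, stride=1, padding=2),   # h_conv1: 28x28x32
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=2, stride=2),                  # h_pool1: 14x14x32
                nn.Conv2d(32, 64, kernel_size=5, stride=1, padding=2),  # h_conv2: 14x14x64
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=2, stride=2),                  # h_pool2: 7x7x64
            )
            self.classifier = nn.Sequential(
                nn.Flatten(),                 # reshape: 7*7*64 -> 3136
                nn.Linear(7 * 7 * 64, 1024),  # fully connected layer, 1024 neurons
                nn.ReLU(),
                nn.Linear(1024, 10),          # readout layer, 10 digits
            )

        def forward(self, x):
            return self.classifier(self.features(x))

    logits = MnistCNN()(torch.randn(1, 1, 28, 28))
    assert logits.shape == (1, 10)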
CONVOLUTION

    Input or feature map       Filter          Feature map
    1 1 1 0 0
    0 1 1 1 0                  1 0 1           4 3 4
    0 0 1 1 1   convolution    0 1 0     =     2 4 3
    0 0 1 1 0                  1 0 1           2 3 4
    0 1 1 0 0

 Convolution operation: multiply the numbers at matching positions, then sum them all
 For the top-left position: 1x1 + 1x0 + 1x1 + 0x0 + 1x1 + 1x0 + 0x1 + 0x0 + 1x1 = 4
 The filter then slides one step sideways and the same operation is performed
 After sliding all the way across, the filter moves down one step and repeats
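The sliding-window computation above can be reproduced in a few lines of NumPy; a minimal "valid" convolution (no padding, stride 1), written as the cross-correlation the slide actually computes:

    import numpy as np

    def conv2d_valid(x, f):
        """Slide the filter over the input; at each position,
        multiply element-wise and sum (no padding, stride 1)."""
        fh, fw = f.shape
        out_h, out_w = x.shape[0] - fh + 1, x.shape[1] - fw + 1
        out = np.empty((out_h, out_w))
        for i in range(out_h):
            for j in range(out_w):
                out[i, j] = (x[i:i + fh, j:j + fw] * f).sum()
        return out

    x = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])
    f = np.array([[1, 0, 1],
                  [0, 1, 0],
                  [1, 0, 1]])
    print(conv2d_valid(x, f))   # [[4 3 4], [2 4 3], [2 3 4]], matching the slide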
RELU (WHY RELU INSTEAD OF SIGMOID ?)

Rectified Linear Unit (ReLU): f(x) = max(0, x), applied element-wise to each feature map

     3  0  1                3 0 1           1 -1  1                1 0 1
    -2  0  2   --ReLU-->    0 0 2           0 -1 -1   --ReLU-->    0 0 0
     0  2  3                0 2 3           3  1  0                3 1 0
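The left example above in NumPy, as a one-liner:

    import numpy as np

    x = np.array([[ 3,  0,  1],
                  [-2,  0,  2],
                  [ 0,  2,  3]])
    print(np.maximum(0, x))   # negatives clamped to 0: [[3 0 1], [0 0 2], [0 2 3]]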
POOLING LAYER
 Max pooling is the most commonly used
2X2 MAX POOLING WITH STRIDE=1

    3 0 1                    3 2          1 0 1                    1 1
    0 0 2   --max pool-->    2 3          0 0 0   --max pool-->    3 1
    0 2 3                                 3 1 0
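A minimal NumPy sketch of 2x2 max pooling with stride 1, reproducing the numbers above:

    import numpy as np

    def max_pool(x, k=2, stride=1):
        """Take the maximum over each k x k window."""
        out_h = (x.shape[0] - k) // stride + 1
        out_w = (x.shape[1] - k) // stride + 1
        out = np.empty((out_h, out_w))
        for i in range(out_h):
            for j in range(out_w):
                out[i, j] = x[i * stride:i * stride + k,
                              j * stride:j * stride + k].max()
        return out

    x = np.array([[3, 0, 1],
                  [0, 0, 2],
                  [0, 2, 3]])
    print(max_pool(x))   # [[3. 2.], [2. 3.]] as on the slide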
Why Pooling ?

 Dimension reduction
 Adds spatial (translation & rotation) invariance to the feature maps
• Able to recognize a feature regardless of angle, direction or skew
• Does not care where a feature is, as long as it maintains its relative position to the other features
Key operations in a CNN
Input image → Convolution (learned) → Non-linearity (Rectified Linear Unit, ReLU) → Spatial pooling (max) → Feature maps
Source: R. Fergus, Y. LeCun; Slide: Lazebnik
Flattening
Flattening takes the pooled feature maps and unrolls them, in sequential order, into a single vector
• This vector is used as the input to the classifier
FULLY-CONNECTED LAYER

    Pooled maps:   3 2     1 1
                   2 3     3 1

    → flattened to the vector [3 2 2 3 1 1 3 1]
    → the fully connected layer produces the class scores [2 1]
    → softmax → 0.8 Cat / 0.2 Dog
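Softmax over the two class scores, as a sketch. With scores [2, 1] the exact probabilities come out near 0.73/0.27; the slide's 0.8/0.2 is illustrative:

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())   # subtract the max for numerical stability
        return e / e.sum()

    print(softmax(np.array([2.0, 1.0])))   # ~[0.73 0.27] -> Cat, Dog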
The evolution of CONVOLUTIONAL NEURAL NETS
LeNet to ResNet: A Deep Journey

LeNet5 (1998): The origin of the convolutional neural network

Characteristics
• Repeats of Convolution – Pooling – Non-linearity
• Average pooling
• Sigmoid activation for the intermediate layers
• tanh activation at F6
• 5x5 convolution filters
• 7 layers and less than 1M parameters

Key Contributions
• Use of convolution to extract spatial features
• Subsampling using the spatial average of maps
• Sparse connection matrix between layers to avoid large computational cost

The Gap
• Slow to train
• Hard to train (neurons die quickly)
• Lack of data
• ImageNet is an image database organized according to the WordNet hierarchy
• It is formally a project aimed at (manually) labeling and categorizing images
• ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
• Training data: 1.2 million images, 1000+ categories
• Validation and test data: 150K images (50K validation, the rest test)
• ImageNet data: http://image-net.org/challenges/LSVRC/2010/browse-synsets
• Multiple challenges: object recognition, localization, etc.
IMAGENET CLASSIFICATION RESULTS

<2012 Result>
• Krizhevsky et al. (AlexNet) – 16.4% error (top-5)
• Next best (non-convnet) – 26.2% error

<2013 Result>
• All top entries use deep learning (convnets)

Revolution of Depth!
ALEXNET (2012)

Characteristics
- 11x11, 5x5 and 3x3 convolutions
- Max pooling
- 3 FC layers
- 60 million parameters

Key Contributions
• GPU training in parallel
• ReLU activation
• Dropout regularization
• Image augmentation
RELU NON-LINEARITY – A SIMPLER ACTIVATION
A 4-layer CNN with ReLUs is 6 times faster than an equivalent network with tanh in reaching a 25% error rate on the CIFAR-10 dataset
Deep learning - ReLU (Ljubljana, June 2016)
How does the sigmoid function affect learning?
• It enables easy computation of the derivative but has negative effects:
– Neurons never quite reach 1 or 0  they saturate
– The gradient reduces the magnitude of the error
• This leads to two problems:
• Slow learning when neurons are saturated, i.e. at large z values
• The vanishing gradient problem (the sigmoid's derivative is at most 0.25, so each layer passes on at most 25% of the error from the layer above!!)
Deep learning - ReLU
• Alex Krizhevsky (2011) proposed the Rectified Linear Unit in place of the sigmoid function
• Main purpose of ReLU: reduce the saturation and vanishing-gradient issues
• Still not perfect:
– It stops learning at negative z values (a piecewise-linear fix exists: Parametric ReLU, He 2015, Microsoft)
– Bigger risk of neurons saturating towards infinity
Deep learning - dropout
• Too many weights cause overfitting issues
• Weight decay (regularization) helps but is not perfect
– It also adds another hyper-parameter to set manually
• Srivastava et al. (2014) proposed a kind of "bagging" for deep nets (Alex Krizhevsky had in fact already used it in AlexNet)
• Main point:
– Robustify the network by disabling neurons
– Each neuron has a probability, usually around 0.4, of being disabled
– The remaining neurons must adapt to work without them
• Applied only to the fully connected layers
– Conv. layers are less susceptible to overfitting
Srivastava et al., Dropout: A Simple Way to Prevent Neural Networks from Overfitting, JMLR 2014
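A minimal sketch of (inverted) dropout as described above; the rescaling at training time is the standard trick so that inference needs no change:

    import numpy as np

    def dropout(a, p_drop=0.4, train=True):
        """Disable each neuron with probability p_drop during training."""
        if not train:
            return a                            # inference: all neurons active
        mask = np.random.rand(*a.shape) >= p_drop
        return a * mask / (1.0 - p_drop)        # rescale to keep the expected activation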
Deep learning – batch norm
• Input needs to be whitened, i.e. normalized (LeCun 1998, Efficient BackProp)
– Usually done on the first layer's input only
• The same reason for normalizing the first layer's input exists for the other layers as well
• Ioffe and Szegedy, Batch Normalization, 2015
– Normalize the input to each layer
– Reduces internal covariate shift
– Too slow to normalize over all input data (>1M samples)
– Instead, normalize within each mini-batch only
– Learning: normalize over the mini-batch data
– Inference: normalize with statistics accumulated over the training data
Ioffe and Szegedy, Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015
Better results, while allowing a higher learning rate, higher decay, no dropout and no LRN.
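A sketch of the training-time computation: normalize each feature over the mini-batch, then apply a learned scale gamma and shift beta (the running statistics used at inference are omitted):

    import numpy as np

    def batch_norm_train(x, gamma, beta, eps=1e-5):
        """x: (batch, features). Normalize over the mini-batch (axis 0)."""
        mu = x.mean(axis=0)
        var = x.var(axis=0)
        x_hat = (x - mu) / np.sqrt(var + eps)
        return gamma * x_hat + beta   # inference would use running estimates of mu/var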
VGG (2014)
• Small 3x3 convolutions throughout the net
• A sequence of 3x3 convolutions can emulate larger receptive fields, e.g. 5x5 or 7x7
• Use of 1x1 convolutions
• Decrease in spatial volume and increase in depth of the input

What's the advantage of using 3 layers of 3x3 instead of one layer of 7x7?
• 3 non-linear rectification layers instead of 1
• Fewer parameters: 27C^2 as opposed to 49C^2

Key Points
• Depth is important
• Simplify the network to go deep
• 140M parameters (mostly due to the FC layers)
VGG (2ND PLACE IN 2014)
 Uses only repeated 3x3 filters
 Why??
 Stacking convolution filters yields a larger receptive field
 two 3x3 filters = one 5x5 filter
 three 3x3 filters = one 7x7 filter
 The number of parameters drops compared to using the large filter
 a regularization effect
“Very Deep Convolutional Networks for Large-Scale Image Recognition”
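The 27C^2 vs 49C^2 comparison written out, assuming C input and C output channels per layer and ignoring biases:

    C = 512
    stacked_3x3 = 3 * (3 * 3 * C * C)   # three stacked 3x3 layers: 27 * C^2 weights
    single_7x7  = 7 * 7 * C * C         # one 7x7 layer, same receptive field: 49 * C^2
    print(stacked_3x3, single_7x7)      # 7077888 vs 12845056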
GOOGLENET OR INCEPTION (2014)
• 22-layer CNN
• Heavy use of 1x1 'Network in Network' convolutions
• Average pooling before the classifier
• Auxiliary classifiers connected to intermediate layers
• During training, the losses of the auxiliary classifiers are added with a discount weight (0.3)
GOOGLENET KEY IDEAS
• Which is better, 3x3 or 5x5 ?
• Just try them all (the naïve version)
 but the amount of computation explodes
• Modified idea: way too many outputs!!! Use 1x1 convolutions for dimensionality reduction

Why 1x1 convolution?
• Introduced as “Network in Network” in 2014
• A way to increase non-linearity and spatially combine features across feature maps
Only 4M parameters, compared to 60M in AlexNet
GOOGLENET KEY IDEAS
 Use 1x1 convolutions for dimension reduction
 Halving the number of feature maps keeps the total amount of computation about the same
GOOGLENET KEY IDEAS
How a 1x1 convolution reduces dimension

For each spatial position (i, j) of the input layer, let X_ij be the 1x256 vector of channel values and w_k the 1x256 weight vector of the k-th 1x1 filter. The k-th output feature map is

    y_ij,k = f(X_ij · w_k),   where f(·) is a nonlinear function

This is identical in principle to feature-dimension reduction with a fully connected NN, applied independently at every pixel.
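The formula above in NumPy: a 1x1 convolution is one dot product per pixel, i.e. a fully connected layer shared across all spatial positions (the sizes are illustrative):

    import numpy as np

    H, W, C_in, C_out = 8, 8, 256, 64
    x = np.random.randn(H, W, C_in)       # input layer: a 1x256 vector X_ij per pixel
    w = np.random.randn(C_in, C_out)      # one 1x256 weight vector w_k per output map

    y = np.einsum('hwc,ck->hwk', x, w)    # y[i,j,k] = X_ij . w_k (nonlinearity omitted)
    assert y.shape == (H, W, C_out)       # 256 feature maps reduced to 64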
RESNET (RESIDUAL NEURAL NETWORK) (2015)
• Introduces shortcut connections (which existed in prior literature in various forms)
• The key invention is to skip 2 layers; skipping a single layer did not give much improvement, for some reason
RESNET
 Are more layers always better?
 A 56-layer network turned out to have a larger training error than a 20-layer one
 A deeper model should have a lower training error,
 yet deep models turn out to be hard to optimize (even learning the identity is hard)
Causes: vanishing/exploding gradients, and the growing number of parameters to train
(compare a shallower model, 18 layers, with a deeper model, 34 layers)
“Deep Residual Learning for Image Recognition”
RESNET's KEY IDEA
 Pass the identity straight through to the upper layer, and learn only the remainder
 The goal is not to fit H(x) directly but the residual F(x) = H(x) - x
Since F(x) ~ 0, convergence is fast
 Effects of the identity shortcut:
- Very deep networks can still be optimized
- Accuracy improves in proportion to depth
“Deep Residual Learning for Image Recognition”
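A minimal residual block in PyTorch: the shortcut spans two convolution layers and the block's output is F(x) + x (batch norm, which the original ResNet also uses, is omitted for brevity):

    import torch.nn as nn

    class ResidualBlock(nn.Module):
        """y = ReLU(F(x) + x): only the residual F(x) = H(x) - x is learned."""
        def __init__(self, channels):
            super().__init__()
            self.f = nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1),
                nn.ReLU(),
                nn.Conv2d(channels, channels, 3, padding=1),  # shortcut skips 2 layers
            )
            self.relu = nn.ReLU()

        def forward(self, x):
            return self.relu(self.f(x) + x)   # identity shortcut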
 BOTTLENECK : A PRACTICAL DESIGN
1x1 conv for dimension reduction  3x3 conv  1x1 conv for dimension expansion
 done to cut the amount of computation
• # parameters: 256x64 + 64x3x3x64 + 64x256 = ~70K
• # parameters using just a 3x3, 256→256 conv layer: ~600K
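Checking the slide's parameter counts (weights only, biases ignored):

    bottleneck = 256 * 64 + 64 * 3 * 3 * 64 + 64 * 256   # 1x1 reduce + 3x3 + 1x1 expand
    plain_3x3  = 3 * 3 * 256 * 256                       # a single 3x3, 256 -> 256 conv
    print(bottleneck, plain_3x3)                         # 69632 (~70K) vs 589824 (~600K)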
Dilated convolutions
The goal of this layer is to increase the size of the receptive field (the set of input activations used to compute a given output) without downsampling, in order to preserve local information. A larger receptive field lets the network use more context (information spatially further away). The idea is to spread the filter taps apart, filling the gaps with zeros, and then compute an ordinary convolution.
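In PyTorch this is the dilation argument of nn.Conv2d; a 3x3 kernel with dilation=2 covers a 5x5 region while keeping only 9 weights, and with padding the spatial resolution is preserved:

    import torch
    import torch.nn as nn

    conv = nn.Conv2d(1, 1, kernel_size=3, dilation=2, padding=2)  # 3x3 taps spread over 5x5
    y = conv(torch.randn(1, 1, 28, 28))
    assert y.shape == (1, 1, 28, 28)   # receptive field grows, no downsampling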
Deeper the better!!!

DenseNet
At CVPR 2017, research on the Densely Connected Network was presented, bringing a radical change to network structure
Brief intro : Invariance and Equivariance
Limitation of the Conventional CovNet

CovNets are translationally equivariant
This demonstrates LeNet-5's invariance to small rotations (+/-40 degrees).
How about rotation ?
2D convolution is equivariant under translation, but not under rotation
Invariance

Transformation: X2 = T_g^1 X1 (on images X)
Mapping ft'n Φ(·): Z1 = Φ(X1), Z2 = Φ(X2) (features Z)

Z = Z1 = Φ(X1) = Z2 = Φ(X2) = Φ(T_g^1 X1)
: the mapping is independent of the transformation T_g, for all T_g

To make a Convolutional Neural Network (CNN) transformation-invariant, data augmentation with training samples is generally used
Equivariance

Transformation: X2 = T_g^1 X1 (on images X)
Mapping ft'n Φ(·): Z1 = Φ(X1), Z2 = Φ(X2) (features Z)

Z2 = T_g^2 Z1 = T_g^2 Φ(X1) = Φ(T_g^1 X1)
: the mapping preserves the algebraic structure of the transformation;
Z1 ≠ Z2, but the relationship between them is kept

Invariance is the special case of equivariance where T_g^2 is the identity.
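A small NumPy check of translation equivariance: shifting the input and then convolving gives the same result as convolving and then shifting (up to the border rows, where the circular shift wraps around):

    import numpy as np

    def conv2d_valid(x, f):
        fh, fw = f.shape
        out = np.empty((x.shape[0] - fh + 1, x.shape[1] - fw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = (x[i:i + fh, j:j + fw] * f).sum()
        return out

    x, f = np.random.randn(8, 8), np.random.randn(3, 3)
    lhs = conv2d_valid(np.roll(x, 1, axis=0), f)    # Phi(T_g X): shift, then convolve
    rhs = np.roll(conv2d_valid(x, f), 1, axis=0)    # T_g' Phi(X): convolve, then shift
    assert np.allclose(lhs[1:], rhs[1:])            # equal away from the wrapped border
    # No such identity holds for rotation: conv2d_valid(np.rot90(x), f) is not
    # a rotation of conv2d_valid(x, f) for a generic filter f.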
Equivariance : Group CovNet

To understand the rotation or proportion change of a given entity, a group of filters (a combination of rotated and mirror-reflected versions of a filter) is adopted.
For example, the group p4 contains translations and rotations by multiples of ninety degrees; p4m additionally contains mirror reflections.
(figure: the rotation and mirror-reflection elements of the group)
A filter in a G-CNN detects co-occurrences of features that have the
preferred relative pose, and can match such a feature constellation in
every global pose through an operation called the G-convolution.
(figure: filter groups 1 … N, each built from transformed copies of one base filter)
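A sketch of the first ("lifting") layer of a p4/p4m G-CNN: the filter bank is built from rotated (and optionally mirror-reflected) copies of a single learned filter, producing one output map per group element. Deeper G-conv layers, which also permute the group axis, are omitted here:

    import numpy as np

    f = np.random.randn(3, 3)                               # one learned base filter
    p4  = [np.rot90(f, k) for k in range(4)]                # rotations by 0/90/180/270 deg
    p4m = p4 + [np.rot90(f[:, ::-1], k) for k in range(4)]  # plus mirror reflections

    # Convolving the input with every element of p4 (or p4m) yields a stack of
    # feature maps indexed by the group element: the first step of a G-convolution.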
G-Convolution
(figures: classic 2D convolution vs. the G-Conv for the roto-translation group)
G-convolution is equivariant under rotation
Equivariance : Group CovNet
Latent representations learnt by a CNN and a G-CNN:
- The left part shows the result of a typical CNN, the right part that of a G-CNN.
- In both parts, the outer circle consists of the rotated images and the inner circle of the learnt representations.
- Features produced by a G-CNN are equivariant to rotation, while those produced by a typical CNN are not.
Equivariance : Capsule Net
What we need : EQUIVARIANCE (not invariance)
“Equivariance makes a CNN understand the rotation or proportion change”

“A capsule is a group of neurons whose activity vector represents the instantiation parameters of a specific type of entity such as an object or an object part.”
Equivariance : Capsule Net
Equivariance of Capsules
“A capsule is a group of neurons whose activity vector represents the
instantiation parameters of a specific type of entity such as an object or
an object part.”
Activity vector map Object
Equivariance : Capsule Net