iT Cafe - Deep Learning

Deep Learning in Python
Deep Neural Network
김동민|Dongmin Kim
Deep Learning
Dongmin Kim
dmk2436@gmail.com

Dongmin Kim, 2018
Review - Matrix
• Data arranged in rows and columns
Scalar: 0-dimensional, e.g. 𝑎
Vector: 1-dimensional, e.g. [𝑎, 𝑏, 𝑐, 𝑑, 𝑒]
Matrix: 2-dimensional, e.g. [[𝑎, 𝑏], [𝑐, 𝑑]]
Tensor: n-dimensional (𝑛 ≥ 0)

Neural Networks

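Neural Network
• An implementation inspired by the human nervous system (it does not need to replicate it exactly)
• Thinking happens in parallel through the connections between neurons
• Each connection is adjusted to fit the situation
Figure: a neuron with dendrites, cell body, axon hillock, and axon, labeled Input and Output
Dendrite
• Collects electrical signals arriving from other neurons (input)
Axon hillock
• Generates an electrical signal only when the input exceeds a certain strength
Axon
• Sends the electrical signal on to other neurons (output)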

Artificial Neuron
Input Vector (1×m) × Weight Matrix (m×n) = Weighted Sum (1×n)
Weighted Sum (1×n) → Activation Function → Output Vector (1×n)
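The shapes on this slide can be checked with a small NumPy sketch. The sizes m = 3 and n = 2, the random values, and the choice of sigmoid as the activation are assumptions made only for illustration.

```python
import numpy as np

def sigmoid(s):
    """Logistic activation, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-s))

m, n = 3, 2                      # illustrative sizes (assumed)
rng = np.random.default_rng(0)

x = rng.normal(size=(1, m))      # input vector, shape (1, m)
W = rng.normal(size=(m, n))      # weight matrix, shape (m, n)

s = x @ W                        # weighted sum, shape (1, n)
y = sigmoid(s)                   # output vector, shape (1, n)

print(s.shape, y.shape)
```

Any other element-wise activation could be substituted for `sigmoid` without changing the shapes.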

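Activation Function
Linear Neuron: simple, but limited in applicability
Binary Threshold Neuron: weights cannot be updated through differentiation
Rectified Linear Unit (ReLU): when the input is below 0, weights cannot be updated through differentiation
Leaky ReLU
Sigmoid: the exp operation is computationally inefficient
Note: for convenience, the slope at x = 0 is taken to be 1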
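The five activations named on this slide can be sketched in NumPy as follows. The threshold of 0, the Leaky ReLU slope `alpha=0.01`, and the reading that the slope-at-zero note applies to ReLU are assumptions, not values taken from the slide.

```python
import numpy as np

def linear(x):
    return x

def binary_threshold(x):
    # 1 where the input reaches the threshold (assumed to be 0), else 0
    return (x >= 0).astype(float)

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # alpha is an assumed small slope for negative inputs
    return np.where(x > 0, x, alpha * x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu_grad(x):
    # The gradient is 0 for negative inputs, so no weight update flows back.
    # The slope at x = 0 is taken to be 1, following the slide's convention.
    return (x >= 0).astype(float)

x = np.array([-2.0, 0.0, 3.0])
print(relu(x), relu_grad(x), leaky_relu(x))
```

The zero gradient of ReLU for negative inputs is what Leaky ReLU avoids by keeping a small non-zero slope there.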

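Bias
• For convenience, the constant term b inside the neuron is written as an extra input that is fixed at 1 and has weight b
Figure, left: neuron y with inputs 𝑥0, 𝑥1 and weights 𝑤0, 𝑤1
Figure, right: the same neuron with inputs 𝑥0, 𝑥1, 1 and weights 𝑤0, 𝑤1, 𝑏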
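A quick numerical check that the two forms agree. The input, weight, and bias values are arbitrary examples, not taken from the slide.

```python
import numpy as np

x = np.array([0.5, -1.0])        # x0, x1 (example values, assumed)
w = np.array([2.0, 3.0])         # w0, w1 (example values, assumed)
b = 0.25                         # bias (example value, assumed)

# Form 1: explicit constant term b
y_explicit = x @ w + b

# Form 2: append a constant input 1 and treat b as one more weight
x_aug = np.append(x, 1.0)        # [x0, x1, 1]
w_aug = np.append(w, b)          # [w0, w1, b]
y_augmented = x_aug @ w_aug

print(y_explicit, y_augmented)
```

Folding the bias into the weight vector means a single matrix product covers both weights and bias.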

Multi Class Classification
Input: 𝑥0, 𝑥1, 1
Output: 𝑦0, 𝑦1

Image Classification Task
Input: 𝑥0, 𝑥1, 1 (Image Vector, Matrix to Vector)
Output: 𝑦0, 𝑦1 (Label: Person, Cat, Vehicle, etc.)
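A minimal sketch of the matrix-to-vector step and of reading a label from the output. The 3×4 stand-in image and the score values are hypothetical; a real model would produce the scores.

```python
import numpy as np

# Stand-in for a 3x4 grayscale image (assumed; real images are larger)
image = np.arange(12).reshape(3, 4)

# Matrix to vector: flatten into a single row of 12 inputs
x = image.reshape(1, -1)

labels = ["Person", "Cat", "Vehicle"]
scores = np.array([0.1, 0.7, 0.2])    # hypothetical network outputs, one per label

predicted = labels[int(np.argmax(scores))]
print(x.shape, predicted)
```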

Learning
• The process of adjusting the model's weights toward the optimal classification
Accuracy: 46.43%
Accuracy: 57.14%
Accuracy: 89.29%
Accuracy: 100.00% (Loss function still decreases)
Accuracy: 100.00% (Optimal Point)

Learning – Decreasing Loss Function

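Loss Function
• A basic measure of how badly the model misclassifies
• Mean Squared Error: $E = \frac{1}{2} \sum_{i \in training} (f(x_i) - y_i)^2$
• Cross Entropy: $E = -\frac{1}{n} \sum_{i \in training} [y_i \ln f(x_i) + (1 - y_i) \ln(1 - f(x_i))]$
• Cross Entropy compensates for the slowdown in learning that Mean Squared Error suffers from
• Cost Function ⊂ Loss Function ⊂ Objective Function
Note: the Mean Squared Error is multiplied by $\frac{1}{2}$ so that the coefficient cancels cleanly when differentiating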
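The two loss formulas on this slide can be sketched in NumPy as follows. The prediction and target values are made up for illustration, and the `eps` clipping is an added safeguard against log(0), not part of the slide.

```python
import numpy as np

def squared_error(pred, target):
    # E = 1/2 * sum_i (f(x_i) - y_i)^2
    return 0.5 * np.sum((pred - target) ** 2)

def cross_entropy(pred, target, eps=1e-12):
    # E = -1/n * sum_i [y_i ln f(x_i) + (1 - y_i) ln(1 - f(x_i))]
    pred = np.clip(pred, eps, 1.0 - eps)   # avoid log(0); assumed safeguard
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))

pred = np.array([0.9, 0.2, 0.6])     # f(x_i), example values (assumed)
target = np.array([1.0, 0.0, 1.0])   # y_i, example values (assumed)

print(squared_error(pred, target), cross_entropy(pred, target))
```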

Learning by Computing the Gradient (Stochastic Gradient Descent, SGD)
• Compute the gradient through differentiation
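A minimal sketch of SGD on a single linear neuron with a squared-error loss. The toy target y = 2x + 1, the learning rate, and the epoch count are assumptions chosen so the example converges quickly.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=100)
Y = 2.0 * X + 1.0                  # toy target (assumed)

w, b = 0.0, 0.0                    # initial weights
lr = 0.1                           # learning rate (assumed)

for epoch in range(50):
    for i in rng.permutation(len(X)):    # one sample per update
        pred = w * X[i] + b
        err = pred - Y[i]          # dE/dpred for E = 1/2 (pred - y)^2
        w -= lr * err * X[i]       # dE/dw = err * x
        b -= lr * err              # dE/db = err

print(w, b)
```

Each update moves the weights a small step against the gradient of the loss for the current sample.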

Single Neuron’s Limitations
• Features have to be found by hand, so the information the neuron can learn from is limited
• Too simple to capture complex data (e.g. XOR)

Introducing the Hidden Layer & Deep Neural Network
• A Deep Neural Network (DNN) is many Single Neurons stacked in layers
• For the multi-layer structure to have any effect, each Neuron must be non-linear
• This is Deep Learning: the network finds features on its own, and it can learn from more information
• It can capture complex data
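A small sketch of why the hidden layer and the non-linearity both matter, using XOR. The weights are set by hand for illustration (they are not learned, and they are not from the slides), and ReLU is an assumed choice of activation.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

# Hidden layer: h1 = relu(x0 + x1), h2 = relu(x0 + x1 - 1)
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([0.0, -1.0])

# Output layer: y = h1 - 2 * h2
W2 = np.array([[1.0], [-2.0]])

H = relu(X @ W1 + b1)
y = (H @ W2).ravel()
print(y)                           # [0. 1. 1. 0.]

# Without the non-linearity the two layers collapse into one affine map,
# which cannot represent XOR.
y_linear = ((X @ W1 + b1) @ W2).ravel()
print(y_linear)
```

With the activation removed, the output equals 2 - x0 - x1, a single affine function of the inputs.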

Chain Rule
• Chain Rule: $\frac{\partial y}{\partial x} = \frac{\partial z}{\partial x} \frac{\partial y}{\partial z}$
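The chain rule can be checked numerically on a small example. The functions z = 3x + 1 and y = z² are arbitrary choices for illustration.

```python
def z_of(x):
    return 3.0 * x + 1.0          # dz/dx = 3

def y_of(z):
    return z ** 2                 # dy/dz = 2z

x = 2.0
z = z_of(x)

dz_dx = 3.0
dy_dz = 2.0 * z
dy_dx = dz_dx * dy_dz             # chain rule

# Central finite difference as an independent check
h = 1e-6
numeric = (y_of(z_of(x + h)) - y_of(z_of(x - h))) / (2 * h)

print(dy_dx, numeric)
```

Backpropagation applies this same product of local derivatives layer by layer.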

Data Segmentation
• Without model tuning: Training Set (70%), Test Set (30%)
• With model tuning: Training Set (50%), Validation Set (20%), Test Set (30%)
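One way to produce the 50/20/30 split is to shuffle the sample indices and slice them. The helper name `split_indices` and the fixed seed are hypothetical.

```python
import numpy as np

def split_indices(n, train=0.5, val=0.2, seed=0):
    """Shuffle 0..n-1 and cut it into train / validation / test index arrays."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train = int(round(n * train))
    n_val = int(round(n * val))
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train_idx, val_idx, test_idx = split_indices(100)
print(len(train_idx), len(val_idx), len(test_idx))
```

Passing `val=0.0` and `train=0.7` gives the 70/30 split used when no model tuning is done.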

Batch Learning
• Full Batch Learning: all samples at once
• Mini Batch Learning: a fixed number of samples at a time
• Stochastic Learning: one sample at a time
Figure: samples 1 to 12, grouped according to each of the three schemes
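The three schemes differ only in how many samples feed each weight update. In this sketch the helper `batches` is hypothetical and the mini-batch size of 4 is an assumed value.

```python
import numpy as np

def batches(data, batch_size):
    """Yield consecutive chunks of at most batch_size samples."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]

samples = np.arange(1, 13)        # 12 samples, numbered 1 to 12

full_batch = list(batches(samples, len(samples)))   # 1 update per epoch
mini_batch = list(batches(samples, 4))              # 3 updates per epoch
one_by_one = list(batches(samples, 1))              # 12 updates per epoch

print(len(full_batch), len(mini_batch), len(one_by_one))
```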

Initialization
W=0 (Vanilla Initialization)
Random Initialization
Xavier/He Initialization:
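The formula on this slide did not survive extraction. As a hedged sketch, the commonly cited normal-distribution variants are shown below; they are not necessarily the exact form the slide used. The function names and layer sizes are hypothetical.

```python
import numpy as np

def xavier_normal(fan_in, fan_out, rng):
    # Xavier (Glorot): Var(W) = 2 / (fan_in + fan_out)
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(0.0, std, size=(fan_in, fan_out))

def he_normal(fan_in, fan_out, rng):
    # He: Var(W) = 2 / fan_in, intended for ReLU-family activations
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_in, fan_out))

rng = np.random.default_rng(0)
W_xavier = xavier_normal(256, 128, rng)   # layer sizes are assumed
W_he = he_normal(256, 128, rng)

print(W_xavier.std(), W_he.std())
```

Initializing every weight to 0 makes all neurons in a layer compute the same thing, which is the symmetry that random schemes like these break.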

Batch Normalization
Applied at every Layer
PCA (dimensionality reduction) or Whitening (removing dependencies between variables) is sometimes applied as well
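A minimal sketch of the training-time forward pass of batch normalization for one layer. The names `gamma`, `beta`, and `eps` follow common usage and are assumptions here; the running statistics used at inference time are omitted.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(5.0, 3.0, size=(64, 4))    # batch of 64 samples, 4 features (assumed)

out = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0), out.std(axis=0))
```

With `gamma` = 1 and `beta` = 0 each feature comes out with mean near 0 and standard deviation near 1.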

Handling Learning Circumstances
Good Bad

Momentum Method
(Vanilla) SGD
SGD + Momentum
Nesterov Momentum
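The three update rules can be compared on a one-dimensional toy problem. The objective f(w) = w², the learning rate 0.1, and the momentum coefficient 0.9 are assumptions for illustration; the update forms are the commonly used ones.

```python
def grad_fn(w):
    return 2.0 * w                 # gradient of the toy objective f(w) = w^2

def sgd_step(w, grad, lr=0.1):
    return w - lr * grad

def momentum_step(w, v, grad, lr=0.1, mu=0.9):
    v = mu * v - lr * grad         # velocity accumulates past gradients
    return w + v, v

def nesterov_step(w, v, lr=0.1, mu=0.9):
    grad = grad_fn(w + mu * v)     # gradient at the look-ahead position
    v = mu * v - lr * grad
    return w + v, v

w_sgd = 5.0
w_mom, v_mom = 5.0, 0.0
w_nag, v_nag = 5.0, 0.0

for _ in range(200):
    w_sgd = sgd_step(w_sgd, grad_fn(w_sgd))
    w_mom, v_mom = momentum_step(w_mom, v_mom, grad_fn(w_mom))
    w_nag, v_nag = nesterov_step(w_nag, v_nag)

print(w_sgd, w_mom, w_nag)
```

All three reach the minimum at 0 here; the difference lies in the path taken, which matters on less well-behaved loss surfaces.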

Momentum Method
AdaGrad
RMSProp
Adam

Early Stopping

Regularization
L1 Regularization
• Loss function + $\lambda |\omega|$
L2 Regularization
• Loss function + $\frac{1}{2} \lambda \omega^2$
Elastic Net Regularization
• L1 + L2
Max Norm Constraints
• $\|\omega\|_2 < c$
Dropout
Dense-sparse-dense (DSD)
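A sketch of the penalty terms using the standard definitions, where L1 penalizes the absolute value of the weights and L2 penalizes half their square. The function names, the weight vector, and the λ values are hypothetical, and the max-norm helper rescales the vector, which is one common way to enforce the constraint.

```python
import numpy as np

def l1_penalty(w, lam):
    return lam * np.sum(np.abs(w))

def l2_penalty(w, lam):
    return 0.5 * lam * np.sum(w ** 2)

def elastic_net_penalty(w, lam1, lam2):
    return l1_penalty(w, lam1) + l2_penalty(w, lam2)

def apply_max_norm(w, c):
    # Rescale w so that its L2 norm does not exceed c
    norm = np.linalg.norm(w)
    return w if norm <= c else w * (c / norm)

w = np.array([3.0, -4.0])          # example weights (assumed)

print(l1_penalty(w, 0.1), l2_penalty(w, 0.1), apply_max_norm(w, 1.0))
```

The chosen penalty is added to the loss function before the gradient is taken.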

Restricted Boltzmann Machine(RBM)

One-hot Vector
• A vector in which exactly one element is 1 and every other element is 0
• E.g. for values from 0 to 4, 3 → [0, 0, 0, 1, 0]
• Opposite: One-cold Vector, e.g. [1, 0, 1, 1, 1]
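A minimal sketch of both encodings, assuming each value is used directly as a zero-based index. The function names are hypothetical.

```python
import numpy as np

def one_hot(value, num_classes):
    """Vector of zeros with a single 1 at position `value`."""
    vec = np.zeros(num_classes, dtype=int)
    vec[value] = 1
    return vec

def one_cold(value, num_classes):
    """Vector of ones with a single 0 at position `value`."""
    return 1 - one_hot(value, num_classes)

print(one_hot(3, 5))      # [0 0 0 1 0]
print(one_cold(1, 5))     # [1 0 1 1 1]
```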

MNIST Classification Example

Effectively dealing with Computer Vision(CV) problems
→ Convolutional Neural Network(CNN)
Figure: DNN vs. CNN

Convolution Layer

CNN Architecture

Pooling
Caution: Pooling is not a cognitively plausible approach. Capsule Networks, among others, are an alternative

Fully Connected Layer
• = DNN

CIFAR-100 Classification Example

Effectively dealing with Sequential problems:
Recurrent Neural Network(RNN)
• E.g. Natural Language Processing(NLP)
Memoryless Models
Hidden Markov Model(HMM)
Recurrent Neural Network(RNN)

Problems of RNN

Reducing the Exploding Gradients
• This is not a fundamental solution
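The extracted text does not name the technique this slide shows. Gradient clipping by norm is assumed here as one common way to limit exploding gradients; the function name and threshold are hypothetical.

```python
import numpy as np

def clip_by_norm(grad, max_norm):
    """Rescale grad so that its L2 norm is at most max_norm."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        return grad * (max_norm / norm)
    return grad

large = np.array([30.0, 40.0])     # norm 50, example values (assumed)
small = np.array([0.3, 0.4])       # norm 0.5, example values (assumed)

print(clip_by_norm(large, 5.0), clip_by_norm(small, 5.0))
```

Clipping bounds the size of each update but leaves the direction unchanged, so it treats the symptom rather than the cause.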

Solution to the Problem: Changing the Architecture

Long Short-term Memory(LSTM)

(Soft) Attention Models

Other Sequential Models
Gated Recurrent Unit(GRU)
Differentiable Neural Computer(DNC)

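Review - Neural Network
• An implementation inspired by the human nervous system (it does not need to replicate it exactly)
• Thinking happens in parallel through the connections between neurons
• Each connection is adjusted to fit the situation
Figure: a neuron with dendrites, cell body, axon hillock, and axon, labeled Input and Output
Dendrite
• Collects electrical signals arriving from other neurons (input)
Axon hillock
• Generates an electrical signal only when the input exceeds a certain strength
Axon
• Sends the electrical signal on to other neurons (output)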

Review - Learning – Decreasing Loss Function

Review - Introducing the Hidden Layer & Deep Neural Network
• A Deep Neural Network (DNN) is many Single Neurons stacked in layers
• For the multi-layer structure to have any effect, each Neuron must be non-linear
• This is Deep Learning: the network finds features on its own, and it can learn from more information
• It can capture complex data

Review - Effectively dealing with Computer Vision(CV) problems
→ Convolutional Neural Network(CNN)
Figure: DNN vs. CNN

Review - RNN
