2. Reference
◦ Machine Learning with R, by Brett Lantz
◦ Textbook for the spring 2017 course 'Modern Society and Big Data'
◦ Referenced for the data preprocessing / sample analysis process
3. Number of Parameters
From the last presentation …
How many parameters are in this linear model?
[Figure: linear classifier — a [1024x768] test image x feeds X → (W, b) → S(Y), producing a one-hot score vector over 5 classes (e.g. [0, 1, 0, 0, 0] → "Dog!")]

Size(W) + Size(b) = Image_size × Classes + Classes = 1024 × 768 × 5 + 5 = 3,932,165
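The count above can be reproduced in a few lines. A minimal sketch, using the slide's [1024x768] image and 5 classes:

```python
# Parameter count for a single-layer linear classifier S(Wx + b):
# W maps the flattened image to one score per class, b is one bias per class.
image_size = 1024 * 768    # flattened [1024x768] image
classes = 5

size_w = image_size * classes   # weight matrix W: [image_size, classes]
size_b = classes                # bias vector b: [classes]
total = size_w + size_b
print(total)  # 3932165
```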
4. Go Deep & Wide !
[Figure: three-layer network X → W1 → W2 → W3 → Y, with weight shapes [784, 256], [256, 256], [256, 10]; the two 256-unit hidden layers sit between input and output]

The hidden layers are invisible from the input/output.
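The three-layer layout on the slide can be sketched in NumPy; only the input x and the final output are visible from outside, while the 256-unit hidden activations stay internal. The batch size of 32 here is an assumption for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Weight shapes from the slide: 784 inputs, two 256-unit hidden layers,
# 10 output classes.
shapes = [(784, 256), (256, 256), (256, 10)]
weights = [rng.standard_normal(s) * 0.01 for s in shapes]
biases = [np.zeros(s[1]) for s in shapes]

def forward(x, weights, biases):
    """One pass through the network; the hidden activations are
    'invisible' from the outside -- only x and the returned y are seen."""
    for W, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(x @ W + b, 0.0)    # ReLU hidden layers
    return x @ weights[-1] + biases[-1]   # linear output (logits)

y = forward(rng.standard_normal((32, 784)), weights, biases)
print(y.shape)  # (32, 10)
```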
5. Rectified Linear Units
◦ Why not Sigmoid?
◦ The signal may shrink too close to 0 during back propagation (vanishing gradient).

R(x) = x if x ≥ 0, else 0

(d/dx) R(x) = 1 if x ≥ 0, else 0
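The piecewise definitions above translate directly to code. A small sketch comparing the ReLU gradient with the sigmoid's, whose slope never exceeds 0.25 and therefore shrinks signals layer by layer:

```python
import numpy as np

def relu(x):
    # R(x) = x for x >= 0, 0 otherwise
    return np.maximum(x, 0.0)

def relu_grad(x):
    # dR/dx = 1 for x >= 0, 0 otherwise -- no shrinking on the active side
    return (x >= 0).astype(float)

def sigmoid_grad(x):
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)   # at most 0.25 (at x = 0), so gradients vanish

x = np.array([-2.0, 0.0, 3.0])
print(relu(x))            # [0. 0. 3.]
print(relu_grad(x))       # [0. 1. 1.]
print(sigmoid_grad(0.0))  # 0.25 -- the sigmoid's maximum slope
```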
7. Weight Initialization
◦ DBN (Deep Belief Networks)
◦ Train an RBM on each pair of adjacent layers
◦ After initialization → we only need fine tuning (training).
◦ Using Gaussian random numbers
◦ Xavier (2010)
◦ Scale Gaussian random numbers by the number of inputs (divide by √n_in).
◦ He (2015)
◦ Divide the fan-in used by Xavier by 2 (i.e. scale by √(2/n_in)).
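The simplified Xavier and He rules above can be sketched in NumPy. This follows the slide's "divide by fan-in" formulation (the original papers also account for fan-out); the layer shape is an assumption:

```python
import numpy as np

rng = np.random.default_rng(42)
fan_in, fan_out = 784, 256

# Plain Gaussian initialization (std = 1, usually too large for deep nets)
w_gauss = rng.standard_normal((fan_in, fan_out))

# Xavier (2010): scale by 1/sqrt(fan_in) -> std ≈ 1/sqrt(784) ≈ 0.036
w_xavier = rng.standard_normal((fan_in, fan_out)) / np.sqrt(fan_in)

# He (2015): halve the fan-in, i.e. scale by sqrt(2/fan_in) ≈ 0.051,
# which compensates for ReLU zeroing out half of the activations.
w_he = rng.standard_normal((fan_in, fan_out)) * np.sqrt(2.0 / fan_in)
```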
8. L2 Regularization
◦ Large weights may bend the model (overfit the training data).
◦ To avoid large weights, we add the term below:

ℒ = (1/N) Σᵢ D(S(W xᵢ + b), Lᵢ) + λ Σ W²

0 ≤ λ ≤ 1 : regularization strength
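The loss above can be computed directly: the data term D is taken here to be cross-entropy over softmax scores S, and the toy W, x, and labels are assumptions for illustration:

```python
import numpy as np

def l2_regularized_loss(W, b, x, labels, lam):
    """Mean data loss D(S(W x_i + b), L_i) over N samples,
    plus the L2 penalty lambda * sum(W^2)."""
    logits = x @ W + b
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    data_loss = -np.log(probs[np.arange(len(labels)), labels]).mean()
    return data_loss + lam * np.sum(W ** 2)

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3)) * 0.1
b = np.zeros(3)
x = rng.standard_normal((8, 4))
labels = rng.integers(0, 3, size=8)

# A nonzero lambda always adds a positive penalty for nonzero W:
print(l2_regularized_loss(W, b, x, labels, lam=0.1) >
      l2_regularized_loss(W, b, x, labels, lam=0.0))  # True
```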
9. Dropout
◦ Forces the network to learn a redundant representation.
While training: apply dropout. While testing: no dropout.
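The train/test distinction above can be sketched with inverted dropout, where survivors are rescaled by 1/keep_prob so the expected activation is unchanged and test time needs no adjustment:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(h, keep_prob, training):
    """While training: zero each unit with probability 1 - keep_prob and
    rescale survivors by 1/keep_prob. While testing: pass h through."""
    if not training:
        return h                          # testing: no dropout
    mask = rng.random(h.shape) < keep_prob
    return h * mask / keep_prob           # training: apply dropout

h = np.ones((2, 5))
out = dropout(h, keep_prob=0.5, training=True)
# Surviving units become 1/0.5 = 2.0; dropped units become 0.0.
```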
10. Chain Rule
[Figure: forward pass x → F → G → y and the corresponding backward pass through G' and F']

y = g(f(x))

y' = g'(f(x)) * f'(x)

◦ To make back propagation easier, we use an operation graph like the one above.
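The chain rule y' = g'(f(x)) · f'(x) can be verified numerically. The functions f(x) = x² and g(u) = 3u here are hypothetical examples:

```python
# Numeric check of the chain rule y' = g'(f(x)) * f'(x)
def f(x): return x ** 2
def f_prime(x): return 2 * x
def g(u): return 3 * u
def g_prime(u): return 3

x = 2.0
analytic = g_prime(f(x)) * f_prime(x)   # 3 * (2 * 2.0) = 12.0

# Central finite difference of the composite g(f(x)) for comparison
eps = 1e-6
numeric = (g(f(x + eps)) - g(f(x - eps))) / (2 * eps)
print(abs(numeric - analytic) < 1e-4)   # True
```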
13. Practical Use
◦ Breast cancer diagnosis using a deep neural network
◦ Example from the book 'Machine Learning with R'
◦ Uses the dataset from the University of Wisconsin
◦ The dataset includes 32 features:
◦ diagnosis, radius, perimeter, area, and so on
14. Import/Define Methods
◦ Import packages for NumPy and TF
◦ Define the method for normalization
zₙ = (xₙ − min(x)) / (max(x) − min(x))
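A minimal sketch of that min-max normalization method, applied per feature column:

```python
import numpy as np

def min_max_normalize(x):
    """z_n = (x_n - min(x)) / (max(x) - min(x)): maps each column into [0, 1]."""
    return (x - x.min(axis=0)) / (x.max(axis=0) - x.min(axis=0))

x = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])
z = min_max_normalize(x)
# Each column is scaled so its values become 0, 0.5, 1.
```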
15. Import Dataset
◦ Dataset from University of Wisconsin.
◦ Exclude unused feature (ID).
◦ Divide dataset for x and y.
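The three steps above can be sketched as follows. The in-memory CSV rows mimic the Wisconsin dataset's layout (id, diagnosis, then numeric features); the values and truncated feature list are assumptions for illustration:

```python
import io
import numpy as np

# Hypothetical 3-row excerpt in the dataset's column order:
# id, diagnosis (M = malignant / B = benign), numeric features (truncated).
csv = io.StringIO(
    "842302,M,17.99,10.38\n"
    "842517,M,20.57,17.77\n"
    "8510426,B,13.54,14.36\n"
)
rows = [line.strip().split(",") for line in csv]

# Exclude the unused ID column, then divide into features (x) and labels (y).
y = np.array([1.0 if r[1] == "M" else 0.0 for r in rows])   # malignant = 1
x = np.array([[float(v) for v in r[2:]] for r in rows])

print(x.shape, y.tolist())  # (3, 2) [1.0, 1.0, 0.0]
```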
25. Self Test
◦ Explain how the number of parameters in a model is determined.
◦ Explain the shape of the ReLU function and its derivative, comparing them with the Sigmoid function.
◦ Explain the purpose of weight initialization and the methods for it.
◦ Explain the purpose and principle of L2 regularization.
◦ Why is Dropout necessary? How should it be configured at training and test time?
◦ Why is back propagation advantageous for neural networks?
◦ Explain ensemble learning.