GoogleNet
Everything about GoogleNet and its architecture
Bumsoo Kim (2016010646)
Korea University, Department of Computer Science & Radio Communication Engineering
Bumsoo Kim Presentation, 2017
Deeper is better?
Revolution of Depth
The winning models of 2014 were distinguished by their depth!
So, is a deeper network the key to better performance?
The answer is… NO!
ZF-Net did not provide a fundamental solution
Obstacles of Depth
ZF-Net fundamentally failed to increase depth.
The top models of 2014 succeeded in addressing this problem.
HOW?
Network in Network
Inception modules
NIN (Network In Network) – Min Lin, 2013, "Network in Network"
[Figure: local receptive field with linear features vs. local receptive field with linear + non-linear features]
[Figure: non-linear features; 90% of computation]
[Figure: Inception; average pooling]
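As a minimal sketch of the average-pooling idea NIN contributed and GoogLeNet adopted (global average pooling replacing the final fully connected layers), assuming illustrative shapes close to GoogLeNet's last stage:

```python
import numpy as np

# Hypothetical final feature map: batch of 2, 1024 channels, 7x7 spatial maps.
features = np.random.rand(2, 1024, 7, 7)

# Global average pooling: average each channel's spatial map down to a single
# number, yielding one value per channel instead of flattening everything
# into a huge fully connected layer.
pooled = features.mean(axis=(2, 3))   # shape (2, 1024)

# A single linear layer then maps the pooled vector to class scores.
num_classes = 1000
W = np.random.rand(1024, num_classes) * 0.01
scores = pooled @ W                   # shape (2, 1000)
```

The pooling layer itself has no parameters, which is a large part of why GoogLeNet is so much smaller than AlexNet or VGG.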
1x1 Convolutions
Inception modules
Original model: various filters for various feature scales
Expensive unit: larger filter sizes are very expensive!
Original model → with 1x1 convolutions
1x1 convolution: dimension reduction
Hebbian Principle: "Neurons that fire together, wire together."
Neurons activated in the same dimension can be grouped together as having similar properties.
[Figure: c2 (large), c3 (small)]
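The cost argument above can be made concrete with a back-of-the-envelope multiply count. The channel sizes below are illustrative assumptions in the spirit of an Inception module, not the paper's exact figures:

```python
# Naive 5x5 convolution: 28x28 spatial map, 192 input channels, 32 output channels.
H, W_sp, C_in, C_out = 28, 28, 192, 32
naive = H * W_sp * C_out * (5 * 5 * C_in)        # multiplies for a direct 5x5 conv

# With a 1x1 "bottleneck" reducing 192 channels to 16 before the 5x5 conv.
C_mid = 16
reduce_1x1 = H * W_sp * C_mid * (1 * 1 * C_in)   # 1x1 dimension reduction
conv_5x5 = H * W_sp * C_out * (5 * 5 * C_mid)    # 5x5 on the reduced channels
reduced = reduce_1x1 + conv_5x5

print(naive, reduced, naive / reduced)
```

With these numbers the 1x1 reduction cuts the multiply count by roughly a factor of ten, which is why the module stays affordable despite its large filters.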
Why do we need it?
Auxiliary Classifier
Gradient Vanishing Problem
ReLU doesn't provide a fundamental solution:
stacking multiple ReLUs causes the same problem!
What is wrong with it?
Gradient Vanishing
Vanishing Gradients on Sigmoids
* Deep networks use activation functions for non-linearity.
* "Saturated": the weight W is too large => z is 0 or 1 => z*(1-z) = 0

import numpy as np

z = 1 / (1 + np.exp(-np.dot(W, x)))  # forward pass: sigmoid activation
dx = np.dot(W.T, z * (1 - z))        # backward pass: local gradient for x
dW = np.outer(z * (1 - z), x)        # backward pass: local gradient for W

* The local gradient is small: 0 <= z*(1-z) <= 0.25 (maximum at z = 0.5), so the lower layers receive almost no gradient and are barely trained.
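The attenuation compounds with depth. A toy sketch, assuming a hypothetical chain of scalar sigmoid "layers" with all weights fixed to 1, shows the gradient being multiplied by z*(1-z) ≤ 0.25 at every layer:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

depth = 20    # number of stacked sigmoid layers in the toy chain
h = 0.7       # scalar input
grad = 1.0    # gradient flowing in from the loss
grads = []
for _ in range(depth):
    z = sigmoid(h)
    grad *= z * (1 - z)   # chain rule through one sigmoid: factor <= 0.25
    grads.append(grad)
    h = z

print(grads[0], grads[-1])
```

After 20 layers the gradient reaching the bottom is bounded above by 0.25**20, i.e. effectively zero, which is exactly the situation the auxiliary classifiers are meant to relieve.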
Details are not explained in the paper
Auxiliary Classifier
"Training Deeper Convolutional Networks with Deep Supervision" (Liwei Wang, Chen-Yu Lee, 2014)
Before the Gradient Vanishes!
Auxiliary Classifier
The deeper the network, the higher the risk of the gradient vanishing, so update the weights from shallower depths as well!
Empirical Decision
Auxiliary Classifier
Where to insert the auxiliary classifiers is decided purely "empirically", with no theoretical justification.
How Effective is it?
Auxiliary Classifier
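A minimal sketch of how the auxiliary classifiers enter training: their losses are added to the main loss with a discount weight (0.3 in the GoogLeNet paper), and the auxiliary heads are discarded at inference time. The loss values below are placeholders:

```python
# Combine the main classifier's loss with the auxiliary classifiers' losses,
# discounted by a fixed weight (0.3 in the GoogLeNet paper).
def total_loss(main_loss, aux_losses, aux_weight=0.3):
    return main_loss + aux_weight * sum(aux_losses)

# Placeholder loss values from the main head and two auxiliary heads.
loss = total_loss(main_loss=2.0, aux_losses=[2.5, 2.2])
print(loss)  # 2.0 + 0.3 * (2.5 + 2.2) ≈ 3.41
```

Because the auxiliary heads sit at shallower depths, their gradients reach the early layers without passing through the whole stack, counteracting the vanishing shown above.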