GoogLeNet Insights

Five Insights from GoogLeNet You Could Use In Your Own Deep Learning Nets

  1. Five Insights from GoogLeNet You Could Use In Your Own Deep Learning Nets
     Auro Tripathy
     www.shatterline.com

  2. Year 1989 Kicked Off Convolutional Neural Nets
     Ten-Digit Classifier Using a Modest Neural Network with Three Hidden Layers
     Backpropagation Applied to Handwritten Zip Code Recognition, LeCun et al.
     http://yann.lecun.com/exdb/publis/pdf/lecun-89e.pdf

     Layer             | Hidden Units                     | Connections                     | Params
     Out - H3 (FC)     | 10 visible                       | 10 x (30 W + 1 B) = 310         | 10 x (30 W + 1 B) = 310
     H3 - H2 (FC)      | 30                               | 30 x (192 W + 1 B) = 5790       | 30 x (192 W + 1 B) = 5790
     H2 - H1 (Conv)    | 12 x 4 x 4 = 192                 | 192 x (5 x 5 x 8 + 1) = 38592   | 5 x 5 x 8 x 12 + 192 biases = 2592
     H1 - Input (Conv) | 12 x 8 x 8 = 768                 | 768 x (5 x 5 x 1 + 1) = 19968   | 5 x 5 x 1 x 12 + 768 biases = 1068
     Totals            | 16 x 16 in + 990 hidden + 10 out | 64660 connections               | 9760 params

     Each of the units in H2 combines local information coming from 8 of the 12 different feature maps in H1.

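To make the table's arithmetic concrete, here is a small Python sketch (mine, not from the slides) that recomputes the connection and parameter counts, assuming 5x5 kernels, weights shared within each feature map, and one bias per unit, as in LeCun's description:

```python
# Recompute the connection and parameter counts from the LeCun-89 table.
# Assumes 5x5 kernels, weight sharing within each feature map, one bias per unit.

def conv_layer(maps, map_h, map_w, kernel, in_channels):
    units = maps * map_h * map_w
    connections = units * (kernel * kernel * in_channels + 1)   # +1 for the bias
    params = kernel * kernel * in_channels * maps + units       # shared weights + per-unit biases
    return units, connections, params

def fc_layer(out_units, in_units):
    connections = out_units * (in_units + 1)
    return out_units, connections, connections                  # fully connected: params == connections

h1  = conv_layer(maps=12, map_h=8, map_w=8, kernel=5, in_channels=1)   # -> (768, 19968, 1068)
h2  = conv_layer(maps=12, map_h=4, map_w=4, kernel=5, in_channels=8)   # -> (192, 38592, 2592)
h3  = fc_layer(30, 192)                                                # -> (30, 5790, 5790)
out = fc_layer(10, 30)                                                 # -> (10, 310, 310)

print(sum(layer[1] for layer in (h1, h2, h3, out)))   # 64660 connections
print(sum(layer[2] for layer in (h1, h2, h3, out)))   # 9760 parameters
```
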
  3. Year 2012 Marked the Inflection Point
     Reintroducing CNNs led to a big drop in error for image classification; since then,
     networks have continued to reduce it.
     [Chart: ILSVRC top-5 error % and number of layers by year]
     ILSVRC'10: 28.2 | ILSVRC'11: 25.8 | ILSVRC'12 (AlexNet): 16.4 | ILSVRC'13: 11.7 |
     ILSVRC'14: 7.3 | ILSVRC'14 (GoogLeNet): 6.7 | ILSVRC'15 (ResNet): 3.57

  4. The Trend Has Been to Increase the Number of Layers (& Layer Size)
     •  The typical 'design pattern' for Convolutional Neural Nets:
        –  Stacked convolutional layers (a linear filter followed by a non-linear activation),
        –  Followed by contrast normalization and max pooling,
        –  Penultimate layers (one or more) are fully connected layers,
        –  Ultimate layer is a loss layer, possibly more than one, in a weighted mix.
     •  Use of dropouts to address the problem of over-fitting due to many layers.
     •  In addition to classification, the architecture is good for localization and object detection
        –  despite concerns that max-pooling dilutes spatial information.

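The design pattern above can be sketched in a few lines of PyTorch. All sizes here (channels, the 64x64 input, the hidden width) are illustrative assumptions, not any particular published network: stacked conv + ReLU blocks, local response (contrast) normalization, max pooling, fully connected layers with dropout, and a loss layer at the end.

```python
import torch
import torch.nn as nn

class ClassicConvNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.LocalResponseNorm(size=5),                 # contrast normalization
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.LocalResponseNorm(size=5),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(p=0.5),                            # dropout against over-fitting
            nn.Linear(128 * 16 * 16, 1024), nn.ReLU(inplace=True),
            nn.Linear(1024, num_classes),                 # logits for the loss layer
        )

    def forward(self, x):                                 # expects a 3x64x64 input
        return self.classifier(self.features(x))

# The "ultimate layer is a loss layer": cross-entropy over the logits.
model = ClassicConvNet()
logits = model(torch.randn(2, 3, 64, 64))
loss = nn.CrossEntropyLoss()(logits, torch.tensor([1, 7]))
```
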
  5. The Challenge of Deep Networks
     1.  Adding layers increases the number of parameters and makes the network prone to over-fitting
         –  Exacerbated by a paucity of data
         –  More data means more expensive annotation
     2.  More computation
         –  A linear increase in filters results in a quadratic increase in compute
         –  If weights are close to zero, we've wasted compute resources

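A quick back-of-the-envelope sketch of the second point: the multiply-adds of a convolution scale with the product of input and output channels, so widening every layer by a factor s multiplies the cost by roughly s squared. The sizes below are made up for illustration.

```python
# Why a linear increase in filters gives a quadratic increase in compute:
# multiply-adds for one conv layer ~ H * W * k^2 * C_in * C_out.
# Widening every layer by s scales both C_in and C_out, hence cost by s^2.
def conv_madds(h, w, k, c_in, c_out):
    return h * w * k * k * c_in * c_out

base = conv_madds(28, 28, 3, 192, 192)          # illustrative sizes
wide = conv_madds(28, 28, 3, 2 * 192, 2 * 192)  # every layer 2x wider
print(wide / base)                              # -> 4.0 (2x wider => 4x the multiply-adds)
```
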
  6. Year 2014, GoogLeNet Took Aim at Efficiency and Practicality
     Resultant benefits of the new architecture:
     •  12 times fewer parameters than AlexNet
        –  Significantly more accurate than AlexNet
        –  Lower memory use and lower power use, acutely important for mobile devices
     •  Stays within the targeted 1.5 billion multiply-add budget
        –  Computational cost "less than 2x compared to AlexNet"
     http://www.youtube.com/watch?v=ySrj_G5gHWI&t=12m42s

  7. Introducing the Inception Module
     [Diagram: the Inception module, with parallel 1x1, 3x3, and 5x5 convolution branches plus a 3x3 max-pooling branch over the previous layer, concatenated into a single output]

  8. Intuition Behind the Inception Module
     •  Cluster neurons according to the correlation statistics in the dataset
        –  An optimal layered network topology can be constructed by analyzing the correlation
           statistics of the preceding layer's activations and clustering neurons with highly
           correlated outputs.
     •  We already know that, in the lower layers, there are high correlations in image patches
        that are local and near-local.
        –  These can be covered by 1x1 convolutions
        –  Additionally, a smaller number of spatially spread-out clusters can be covered by
           convolutions over larger patches, i.e., 3x3 and 5x5
        –  And there will be a decreasing number of patches over larger and larger regions
     •  This suggests that the architecture should be a combination of all the convolutions
        (1x1, 3x3, 5x5) as input to the next stage
     •  Since max-pooling has been successful, it also suggests adding a pooling layer in parallel

  9. In Images, Correlation Tends to Be Local; Exploit It
     Heterogeneous set of convolutions to cover spread-out clusters:
     •  Cover very local clusters w/ 1x1 convolutions
     •  Cover more spread-out clusters w/ 3x3 convolutions
     •  Cover even more spread-out clusters w/ 5x5 convolutions
     [Diagram: 1x1, 3x3, and 5x5 receptive fields over the previous layer]

 10. Conceiving the Inception Module
     [Diagram: the naive Inception module, with 1x1, 3x3, and 5x5 convolutions and 3x3 max pooling applied in parallel to the previous layer, then concatenated]

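As a concrete reference, here is a minimal PyTorch sketch of the naive module conceived above: parallel 1x1, 3x3, and 5x5 convolutions plus 3x3 max pooling over the same input, concatenated along the channel axis. The channel counts are illustrative assumptions; note how the output depth (filter channels plus the pass-through pooling channels) grows, which is what motivates the dimension reduction that follows.

```python
import torch
import torch.nn as nn

class NaiveInception(nn.Module):
    def __init__(self, in_ch, ch1x1, ch3x3, ch5x5):
        super().__init__()
        self.branch1 = nn.Conv2d(in_ch, ch1x1, kernel_size=1)
        self.branch3 = nn.Conv2d(in_ch, ch3x3, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(in_ch, ch5x5, kernel_size=5, padding=2)
        self.pool = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # Same spatial size on every branch, so channel-wise concatenation is well defined.
        return torch.cat([
            self.relu(self.branch1(x)),
            self.relu(self.branch3(x)),
            self.relu(self.branch5(x)),
            self.pool(x),                 # pooling branch passes all input channels through
        ], dim=1)

# Output channels = ch1x1 + ch3x3 + ch5x5 + in_ch (the pooling branch keeps in_ch).
module = NaiveInception(in_ch=192, ch1x1=64, ch3x3=128, ch5x5=32)
out = module(torch.randn(1, 192, 28, 28))
print(out.shape)                          # torch.Size([1, 416, 28, 28])
```
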
 11. Inception Module Put Into Practice: Judicious Dimension Reduction
     [Diagram: the Inception module with dimension reduction, where inexpensive 1x1 convolutions reduce depth ahead of the 3x3 and 5x5 convolutions; all branches are concatenated]

 12. Insights…
     [Diagram: GoogLeNet architecture with its Inception modules labeled 3a, 3b, 4a, 4b, 4c, 4d, 4e, 5a, 5b]

 13. GoogLeNet Insight #1
     (Summary of the previous slides)
     The intuition leads to the following architecture choices:
     •  Choosing filter sizes of 1x1, 3x3, 5x5
     •  Applying all three filters on the same "patch" of the image (no need to choose)
     •  Concatenating all filter outputs as a single output vector for the next stage
     •  Concatenating an additional pooling path, since pooling is essential to the success of CNNs

 14. GoogLeNet Insight #2
     Decrease Dimensions Wherever Computation Requirements Increase, via a 1x1 Dimension Reduction Layer
     •  Use inexpensive 1x1 convolutions to compute reductions before the expensive 3x3 and 5x5 convolutions
     •  The 1x1 convolutions include a ReLU activation, making them dual-purpose
     [Diagram: Previous Layer -> 1x1 convolution -> ReLU]

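A hedged PyTorch sketch of the module with dimension reduction as described in Insight #2: 1x1 convolution + ReLU layers shrink the channel count before the 3x3 and 5x5 convolutions and after the pooling branch. The specific channel numbers below are illustrative choices, not prescribed by the slides.

```python
import torch
import torch.nn as nn

def conv_relu(in_ch, out_ch, **kw):
    # 1x1 (or larger) convolution followed by ReLU: the "dual-purpose" building block.
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, **kw), nn.ReLU(inplace=True))

class InceptionWithReduction(nn.Module):
    def __init__(self, in_ch, ch1x1, red3x3, ch3x3, red5x5, ch5x5, pool_proj):
        super().__init__()
        self.branch1 = conv_relu(in_ch, ch1x1, kernel_size=1)
        self.branch3 = nn.Sequential(
            conv_relu(in_ch, red3x3, kernel_size=1),               # reduce, then convolve
            conv_relu(red3x3, ch3x3, kernel_size=3, padding=1))
        self.branch5 = nn.Sequential(
            conv_relu(in_ch, red5x5, kernel_size=1),
            conv_relu(red5x5, ch5x5, kernel_size=5, padding=2))
        self.branch_pool = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            conv_relu(in_ch, pool_proj, kernel_size=1))            # project pooled channels

    def forward(self, x):
        return torch.cat([self.branch1(x), self.branch3(x),
                          self.branch5(x), self.branch_pool(x)], dim=1)

# Output channels = ch1x1 + ch3x3 + ch5x5 + pool_proj, chosen by design rather than forced to grow.
m = InceptionWithReduction(192, ch1x1=64, red3x3=96, ch3x3=128, red5x5=16, ch5x5=32, pool_proj=32)
print(m(torch.randn(1, 192, 28, 28)).shape)    # torch.Size([1, 256, 28, 28])
```
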
 15. GoogLeNet Insight #3
     Stack Inception Modules Upon Each Other
     •  Occasionally insert max-pooling layers with stride 2 to halve the resolution of the grid
     •  Stacking Inception modules benefits the results when used at the higher layers (it is not strictly necessary)
        –  The lower layers are kept as traditional convolutions (for memory-efficiency reasons)
     •  This stacking allows each module to be tweaked without an uncontrolled blowup in computational complexity at later stages
        –  For example, a tweak could be to increase the width at any stage

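A brief sketch of Insight #3, reusing the InceptionWithReduction class from the previous sketch: traditional convolutions at the bottom, stacked Inception modules above, and an occasional stride-2 max pool to halve the grid. The channel choices are again illustrative assumptions.

```python
import torch
import torch.nn as nn
# InceptionWithReduction is the class defined in the previous sketch.

stem = nn.Sequential(                              # "traditional convolutions" at the bottom
    nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3), nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
    nn.Conv2d(64, 192, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
)

trunk = nn.Sequential(
    InceptionWithReduction(192, 64, 96, 128, 16, 32, 32),    # -> 256 channels
    InceptionWithReduction(256, 128, 128, 192, 32, 96, 64),  # -> 480 channels
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),        # stride 2: halve the grid
    InceptionWithReduction(480, 192, 96, 208, 16, 48, 64),   # -> 512 channels
)

features = trunk(stem(torch.randn(1, 3, 224, 224)))
print(features.shape)                              # torch.Size([1, 512, 14, 14])
```
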
 16. GoogLeNet Components
     Stacking Inception Modules
     [Diagram: Input -> Traditional Convolutions (Conv + MaxPool + Conv + MaxPool) -> Nine Inception Modules (3a, 3b, 4a, 4b, 4c, 4d, 4e, 5a, 5b, with a MaxPool between groups) -> Average Pooling -> Linear -> SoftMax w/Loss -> Label]

 17. GoogLeNet Insight #4
     Counter-Balancing Back-Propagation Downsides in Deep Networks
     •  A potential problem
        –  Back-propagating through deep networks can result in "vanishing gradients" (which may mean dead ReLUs)
     •  A solution
        –  Intermediate layers do have discriminatory power
        –  Auxiliary classifiers were appended to the intermediate layers
        –  During training, each intermediate loss was added to the total loss with a discount factor of 0.3

 18. Two Additional Loss Layers for Training to Depth
     [Diagram: the main GoogLeNet path (Input -> Traditional Convolutions (Conv + MaxPool + Conv + MaxPool) -> Nine Inception Modules -> Average Pooling -> Linear -> SoftMax w/Loss 2 -> Label) with two auxiliary branches (Average Pooling -> 1x1 Conv -> Fully Connected -> DropOut -> Linear -> SoftMax w/Loss 0 and SoftMax w/Loss 1) attached to intermediate Inception modules]

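A sketch of how the two auxiliary loss layers could be wired during training, based on the description in Insight #4: an auxiliary head (average pooling, 1x1 convolution, fully connected layer, dropout, classifier) produces logits from an intermediate feature map, and its loss is added to the main loss with a 0.3 discount. The layer sizes and the tap points named in the comments are assumptions for illustration; the heads are dropped at inference time.

```python
import torch
import torch.nn as nn

class AuxClassifier(nn.Module):
    def __init__(self, in_ch, num_classes=1000):
        super().__init__()
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(4),                      # pool the intermediate feature map
            nn.Conv2d(in_ch, 128, kernel_size=1), nn.ReLU(inplace=True),
            nn.Flatten(),
            nn.Linear(128 * 4 * 4, 1024), nn.ReLU(inplace=True),
            nn.Dropout(p=0.7),
            nn.Linear(1024, num_classes),
        )

    def forward(self, x):
        return self.head(x)

criterion = nn.CrossEntropyLoss()
aux1_logits = AuxClassifier(512)(torch.randn(2, 512, 14, 14))   # tap after an intermediate module (assumed)
aux2_logits = AuxClassifier(528)(torch.randn(2, 528, 14, 14))   # tap after a later module (assumed)
main_logits = torch.randn(2, 1000)                              # stand-in for the main head's output
labels = torch.tensor([3, 7])

# Total training loss: main loss plus the two auxiliary losses, each discounted by 0.3.
loss = (criterion(main_logits, labels)
        + 0.3 * criterion(aux1_logits, labels)
        + 0.3 * criterion(aux2_logits, labels))
```
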
 19. GoogLeNet Insight #5
     End with a Global Average Pooling Layer Instead of a Fully Connected Layer
     •  Fully connected layers are prone to over-fitting
        –  This hampers generalization
     •  Average pooling has no parameters to optimize, hence no over-fitting
     •  Averaging is more native to the convolutional structure
        –  There is a natural correspondence between feature maps and categories, leading to easier interpretation
     •  Average pooling does not exclude the use of dropout, a proven regularization method to avoid over-fitting
     [Diagram: Global Average Pooling -> Linear layer (for adapting to other label sets) -> SoftMax w/Loss -> Label]

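A minimal sketch of Insight #5: a global-average-pooling head followed by a thin linear layer for the label set, compared against a plain fully connected head over the raw feature maps to show the parameter savings. The 1024-channel, 7x7 feature-map size is an assumption here.

```python
import torch
import torch.nn as nn

gap_head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),        # global average pooling: one value per feature map
    nn.Flatten(),                   # (N, 1024, 1, 1) -> (N, 1024)
    nn.Dropout(p=0.4),              # dropout remains usable alongside average pooling
    nn.Linear(1024, 1000),          # thin linear layer to adapt to the label set
)

final_features = torch.randn(2, 1024, 7, 7)      # e.g. output of the last Inception module
logits = gap_head(final_features)
probs = torch.softmax(logits, dim=1)
print(logits.shape)                              # torch.Size([2, 1000])

# Parameter count comparison: GAP head vs. a fully connected layer over the raw 7x7 maps.
fc_head = nn.Linear(1024 * 7 * 7, 1000)
print(sum(p.numel() for p in gap_head.parameters()))   # 1,025,000
print(sum(p.numel() for p in fc_head.parameters()))    # 50,177,000
```
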
 20. Summarizing the Insights
     1.  Fully exploit the fact that, in images, correlations tend to be local
         •  Concatenate 1x1, 3x3, and 5x5 convolutions along with pooling
     2.  Decrease dimensions wherever computation requirements increase, via a 1x1 dimension reduction layer
     3.  Stack Inception modules upon each other
     4.  Counter-balance back-propagation downsides in deep networks
         •  Use intermediate losses in the final loss
     5.  End with a global average pooling layer instead of a fully connected layer

 21. References
     •  Seminal
        –  Backpropagation Applied to Handwritten Zip Code Recognition. LeCun et al.
     •  Deep Networks
        –  Going Deeper with Convolutions
        –  Network In Network
