SlideShare a Scribd company logo
1 of 6
Download to read offline
2/27/2019 vgg slides
http://127.0.0.1:8000/vgg.slides.html?print-pdf#/ 1/6
Networks Using Blocks (VGG)
Networks Using Blocks (VGG)
We use the vgg_block function to implement this basic VGG block. This function takes
the number of convolutional layers num_convs and the number of output channels
num_channels as input.
In [1]: import d2l
from mxnet import gluon, init, nd
from mxnet.gluon import nn
def vgg_block(num_convs, num_channels):
blk = nn.Sequential()
for _ in range(num_convs):
blk.add(nn.Conv2D(num_channels, kernel_size=3,
padding=1, activation='relu'))
blk.add(nn.MaxPool2D(pool_size=2, strides=2))
return blk
2/27/2019 vgg slides
http://127.0.0.1:8000/vgg.slides.html?print-pdf#/ 2/6
VGG Architecture
2/27/2019 vgg slides
http://127.0.0.1:8000/vgg.slides.html?print-pdf#/ 3/6
In [2]:
Now, we will implement VGG-11. This is a simple matter of executing a for loop over
conv_arch.
In [3]:
conv_arch = ((1, 64), (1, 128), (2, 256), (2, 512), (2, 512))
def vgg(conv_arch):
net = nn.Sequential()
# The convolutional layer part.
for (num_convs, num_channels) in conv_arch:
net.add(vgg_block(num_convs, num_channels))
# The fully connected layer part.
net.add(nn.Dense(4096, activation='relu'), nn.Dropout(0.5),
nn.Dense(4096, activation='relu'), nn.Dropout(0.5),
nn.Dense(10))
return net
net = vgg(conv_arch)
2/27/2019 vgg slides
http://127.0.0.1:8000/vgg.slides.html?print-pdf#/ 4/6
Memory usage for single input
Next, we will construct a single-channel data example with a height and width of 224 to
observe the output shape of each layer.
In [4]: net.initialize()
X = nd.random.uniform(shape=(1, 1, 224, 224))
for blk in net:
X = blk(X)
print(blk.name, 'output shape:t', X.shape)
sequential1 output shape: (1, 64, 112, 112)
sequential2 output shape: (1, 128, 56, 56)
sequential3 output shape: (1, 256, 28, 28)
sequential4 output shape: (1, 512, 14, 14)
sequential5 output shape: (1, 512, 7, 7)
dense0 output shape: (1, 4096)
dropout0 output shape: (1, 4096)
dense1 output shape: (1, 4096)
dropout1 output shape: (1, 4096)
dense2 output shape: (1, 10)
2/27/2019 vgg slides
http://127.0.0.1:8000/vgg.slides.html?print-pdf#/ 5/6
Model Training
Since VGG-11 is more complicated than AlexNet let's use a smaller network.,
In [5]: ratio = 4
small_conv_arch = [(pair[0], pair[1] // ratio) for pair in conv_arch]
net = vgg(small_conv_arch)
2/27/2019 vgg slides
http://127.0.0.1:8000/vgg.slides.html?print-pdf#/ 6/6
Training Loop
In [6]: lr, num_epochs, batch_size, ctx = 0.05, 5, 128, d2l.try_gpu()
net.initialize(ctx=ctx, init=init.Xavier())
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': lr})
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size, resize=224)
d2l.train_ch5(net, train_iter, test_iter, batch_size, trainer, ctx, num_epochs)
training on gpu(0)
epoch 1, loss 0.9533, train acc 0.654, test acc 0.852, time 37.7 sec
epoch 2, loss 0.4086, train acc 0.850, test acc 0.886, time 35.8 sec
epoch 3, loss 0.3319, train acc 0.878, test acc 0.900, time 35.9 sec
epoch 4, loss 0.2915, train acc 0.894, test acc 0.904, time 35.9 sec
epoch 5, loss 0.2605, train acc 0.905, test acc 0.911, time 35.8 sec

More Related Content

Similar to vgg.pdf

The Ring programming language version 1.5.3 book - Part 8 of 184
The Ring programming language version 1.5.3 book - Part 8 of 184The Ring programming language version 1.5.3 book - Part 8 of 184
The Ring programming language version 1.5.3 book - Part 8 of 184Mahmoud Samir Fayed
 
Gradle in 45min - JBCN2-16 version
Gradle in 45min - JBCN2-16 versionGradle in 45min - JBCN2-16 version
Gradle in 45min - JBCN2-16 versionSchalk Cronjé
 
Part II: LLVM Intermediate Representation
Part II: LLVM Intermediate RepresentationPart II: LLVM Intermediate Representation
Part II: LLVM Intermediate RepresentationWei-Ren Chen
 
The Ring programming language version 1.5.1 book - Part 7 of 180
The Ring programming language version 1.5.1 book - Part 7 of 180The Ring programming language version 1.5.1 book - Part 7 of 180
The Ring programming language version 1.5.1 book - Part 7 of 180Mahmoud Samir Fayed
 
Optimizing NN inference performance on Arm NEON and Vulkan
Optimizing NN inference performance on Arm NEON and VulkanOptimizing NN inference performance on Arm NEON and Vulkan
Optimizing NN inference performance on Arm NEON and Vulkanax inc.
 
pgconfasia2016 plcuda en
pgconfasia2016 plcuda enpgconfasia2016 plcuda en
pgconfasia2016 plcuda enKohei KaiGai
 
The Ring programming language version 1.5.4 book - Part 8 of 185
The Ring programming language version 1.5.4 book - Part 8 of 185The Ring programming language version 1.5.4 book - Part 8 of 185
The Ring programming language version 1.5.4 book - Part 8 of 185Mahmoud Samir Fayed
 
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsPL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsKohei KaiGai
 
DevFest 2022 - Skaffold 2 Deep Dive Taipei.pdf
DevFest 2022 - Skaffold 2 Deep Dive Taipei.pdfDevFest 2022 - Skaffold 2 Deep Dive Taipei.pdf
DevFest 2022 - Skaffold 2 Deep Dive Taipei.pdfKAI CHU CHUNG
 
Python + GDB = Javaデバッガ
Python + GDB = JavaデバッガPython + GDB = Javaデバッガ
Python + GDB = JavaデバッガKenji Kazumura
 
Kaggle Winning Solution Xgboost algorithm -- Let us learn from its author
Kaggle Winning Solution Xgboost algorithm -- Let us learn from its authorKaggle Winning Solution Xgboost algorithm -- Let us learn from its author
Kaggle Winning Solution Xgboost algorithm -- Let us learn from its authorVivian S. Zhang
 
Kassem2009
Kassem2009Kassem2009
Kassem2009lazchi
 
Java 5 6 Generics, Concurrency, Garbage Collection, Tuning
Java 5 6 Generics, Concurrency, Garbage Collection, TuningJava 5 6 Generics, Concurrency, Garbage Collection, Tuning
Java 5 6 Generics, Concurrency, Garbage Collection, TuningCarol McDonald
 
Native Java with GraalVM
Native Java with GraalVMNative Java with GraalVM
Native Java with GraalVMSylvain Wallez
 
The Ring programming language version 1.8 book - Part 95 of 202
The Ring programming language version 1.8 book - Part 95 of 202The Ring programming language version 1.8 book - Part 95 of 202
The Ring programming language version 1.8 book - Part 95 of 202Mahmoud Samir Fayed
 
DUAL FIELD DUAL CORE SECURE CRYPTOPROCESSOR ON FPGA PLATFORM
DUAL FIELD DUAL CORE SECURE CRYPTOPROCESSOR ON FPGA PLATFORMDUAL FIELD DUAL CORE SECURE CRYPTOPROCESSOR ON FPGA PLATFORM
DUAL FIELD DUAL CORE SECURE CRYPTOPROCESSOR ON FPGA PLATFORMVLSICS Design
 
Shifu plugin-trainer and pmml-adapter
Shifu plugin-trainer and pmml-adapterShifu plugin-trainer and pmml-adapter
Shifu plugin-trainer and pmml-adapterLisa Hua
 

Similar to vgg.pdf (20)

The Ring programming language version 1.5.3 book - Part 8 of 184
The Ring programming language version 1.5.3 book - Part 8 of 184The Ring programming language version 1.5.3 book - Part 8 of 184
The Ring programming language version 1.5.3 book - Part 8 of 184
 
Gradle in 45min - JBCN2-16 version
Gradle in 45min - JBCN2-16 versionGradle in 45min - JBCN2-16 version
Gradle in 45min - JBCN2-16 version
 
Part II: LLVM Intermediate Representation
Part II: LLVM Intermediate RepresentationPart II: LLVM Intermediate Representation
Part II: LLVM Intermediate Representation
 
The Ring programming language version 1.5.1 book - Part 7 of 180
The Ring programming language version 1.5.1 book - Part 7 of 180The Ring programming language version 1.5.1 book - Part 7 of 180
The Ring programming language version 1.5.1 book - Part 7 of 180
 
Optimizing NN inference performance on Arm NEON and Vulkan
Optimizing NN inference performance on Arm NEON and VulkanOptimizing NN inference performance on Arm NEON and Vulkan
Optimizing NN inference performance on Arm NEON and Vulkan
 
pgconfasia2016 plcuda en
pgconfasia2016 plcuda enpgconfasia2016 plcuda en
pgconfasia2016 plcuda en
 
The Ring programming language version 1.5.4 book - Part 8 of 185
The Ring programming language version 1.5.4 book - Part 8 of 185The Ring programming language version 1.5.4 book - Part 8 of 185
The Ring programming language version 1.5.4 book - Part 8 of 185
 
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsPL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
 
DevFest 2022 - Skaffold 2 Deep Dive Taipei.pdf
DevFest 2022 - Skaffold 2 Deep Dive Taipei.pdfDevFest 2022 - Skaffold 2 Deep Dive Taipei.pdf
DevFest 2022 - Skaffold 2 Deep Dive Taipei.pdf
 
Python + GDB = Javaデバッガ
Python + GDB = JavaデバッガPython + GDB = Javaデバッガ
Python + GDB = Javaデバッガ
 
Kaggle Winning Solution Xgboost algorithm -- Let us learn from its author
Kaggle Winning Solution Xgboost algorithm -- Let us learn from its authorKaggle Winning Solution Xgboost algorithm -- Let us learn from its author
Kaggle Winning Solution Xgboost algorithm -- Let us learn from its author
 
Kassem2009
Kassem2009Kassem2009
Kassem2009
 
Java 5 6 Generics, Concurrency, Garbage Collection, Tuning
Java 5 6 Generics, Concurrency, Garbage Collection, TuningJava 5 6 Generics, Concurrency, Garbage Collection, Tuning
Java 5 6 Generics, Concurrency, Garbage Collection, Tuning
 
Native Java with GraalVM
Native Java with GraalVMNative Java with GraalVM
Native Java with GraalVM
 
The Ring programming language version 1.8 book - Part 95 of 202
The Ring programming language version 1.8 book - Part 95 of 202The Ring programming language version 1.8 book - Part 95 of 202
The Ring programming language version 1.8 book - Part 95 of 202
 
DUAL FIELD DUAL CORE SECURE CRYPTOPROCESSOR ON FPGA PLATFORM
DUAL FIELD DUAL CORE SECURE CRYPTOPROCESSOR ON FPGA PLATFORMDUAL FIELD DUAL CORE SECURE CRYPTOPROCESSOR ON FPGA PLATFORM
DUAL FIELD DUAL CORE SECURE CRYPTOPROCESSOR ON FPGA PLATFORM
 
Shifu plugin-trainer and pmml-adapter
Shifu plugin-trainer and pmml-adapterShifu plugin-trainer and pmml-adapter
Shifu plugin-trainer and pmml-adapter
 
Xgboost
XgboostXgboost
Xgboost
 
00_Introduction to Java.ppt
00_Introduction to Java.ppt00_Introduction to Java.ppt
00_Introduction to Java.ppt
 
Clojure And Swing
Clojure And SwingClojure And Swing
Clojure And Swing
 

Recently uploaded

DeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesDeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesMayuraD1
 
2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projectssmsksolar
 
kiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal loadkiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal loadhamedmustafa094
 
Computer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersComputer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersMairaAshraf6
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"mphochane1998
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayEpec Engineered Technologies
 
AIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech studentsAIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech studentsvanyagupta248
 
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptxA CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptxmaisarahman1
 
Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086anil_gaur
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptMsecMca
 
Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.Kamal Acharya
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptDineshKumar4165
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxSCMS School of Architecture
 
Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdfKamal Acharya
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaOmar Fathy
 
Engineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planesEngineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planesRAJNEESHKUMAR341697
 
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...HenryBriggs2
 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityMorshed Ahmed Rahath
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueBhangaleSonal
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdfKamal Acharya
 

Recently uploaded (20)

DeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesDeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakes
 
2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects
 
kiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal loadkiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal load
 
Computer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersComputer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to Computers
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
 
AIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech studentsAIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech students
 
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptxA CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
 
Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.ppt
 
Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
 
Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdf
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS Lambda
 
Engineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planesEngineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planes
 
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna Municipality
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torque
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdf
 

vgg.pdf

  • 1. 2/27/2019 vgg slides http://127.0.0.1:8000/vgg.slides.html?print-pdf#/ 1/6 Networks Using Blocks (VGG) Networks Using Blocks (VGG) We use the vgg_block function to implement this basic VGG block. This function takes the number of convolutional layers num_convs and the number of output channels num_channels as input. In [1]: import d2l from mxnet import gluon, init, nd from mxnet.gluon import nn def vgg_block(num_convs, num_channels): blk = nn.Sequential() for _ in range(num_convs): blk.add(nn.Conv2D(num_channels, kernel_size=3, padding=1, activation='relu')) blk.add(nn.MaxPool2D(pool_size=2, strides=2)) return blk
  • 3. 2/27/2019 vgg slides http://127.0.0.1:8000/vgg.slides.html?print-pdf#/ 3/6 In [2]: Now, we will implement VGG-11. This is a simple matter of executing a for loop over conv_arch. In [3]: conv_arch = ((1, 64), (1, 128), (2, 256), (2, 512), (2, 512)) def vgg(conv_arch): net = nn.Sequential() # The convolutional layer part. for (num_convs, num_channels) in conv_arch: net.add(vgg_block(num_convs, num_channels)) # The fully connected layer part. net.add(nn.Dense(4096, activation='relu'), nn.Dropout(0.5), nn.Dense(4096, activation='relu'), nn.Dropout(0.5), nn.Dense(10)) return net net = vgg(conv_arch)
  • 4. 2/27/2019 vgg slides http://127.0.0.1:8000/vgg.slides.html?print-pdf#/ 4/6 Memory usage for single input Next, we will construct a single-channel data example with a height and width of 224 to observe the output shape of each layer. In [4]: net.initialize() X = nd.random.uniform(shape=(1, 1, 224, 224)) for blk in net: X = blk(X) print(blk.name, 'output shape:t', X.shape) sequential1 output shape: (1, 64, 112, 112) sequential2 output shape: (1, 128, 56, 56) sequential3 output shape: (1, 256, 28, 28) sequential4 output shape: (1, 512, 14, 14) sequential5 output shape: (1, 512, 7, 7) dense0 output shape: (1, 4096) dropout0 output shape: (1, 4096) dense1 output shape: (1, 4096) dropout1 output shape: (1, 4096) dense2 output shape: (1, 10)
  • 5. 2/27/2019 vgg slides http://127.0.0.1:8000/vgg.slides.html?print-pdf#/ 5/6 Model Training Since VGG-11 is more complicated than AlexNet let's use a smaller network., In [5]: ratio = 4 small_conv_arch = [(pair[0], pair[1] // ratio) for pair in conv_arch] net = vgg(small_conv_arch)
  • 6. 2/27/2019 vgg slides http://127.0.0.1:8000/vgg.slides.html?print-pdf#/ 6/6 Training Loop In [6]: lr, num_epochs, batch_size, ctx = 0.05, 5, 128, d2l.try_gpu() net.initialize(ctx=ctx, init=init.Xavier()) trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': lr}) train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size, resize=224) d2l.train_ch5(net, train_iter, test_iter, batch_size, trainer, ctx, num_epochs) training on gpu(0) epoch 1, loss 0.9533, train acc 0.654, test acc 0.852, time 37.7 sec epoch 2, loss 0.4086, train acc 0.850, test acc 0.886, time 35.8 sec epoch 3, loss 0.3319, train acc 0.878, test acc 0.900, time 35.9 sec epoch 4, loss 0.2915, train acc 0.894, test acc 0.904, time 35.9 sec epoch 5, loss 0.2605, train acc 0.905, test acc 0.911, time 35.8 sec