Review Wide Resnet

Wide Residual Networks
Sergey Zagoruyko, Nikos Komodakis
University Paris-Est, Ecole des Ponts
ParisTech,
Paris, France
2017 arXiv
발표자 정우진

/ 20
• Introduction
• Representation
• Experimental Result
• Conclusions
2
Contents

/ 203
Introduction
AlexNet, VGG,
GoogleNet
deeper
thinner
Good! : ResNet
(res block,
bottleneck)
???
?Good or Bad?
deeper
wider

/ 20
Contributions
• ResBlock의 여러 변형을 실험
• Wide ResNet 제안
• Dropout + ResBlock의 새 사용법 제안
• 제안하는 방법의 우수한 결과
4
Introduction

/ 20
표기법
• 예
5
Representation
네트워크
이름
Depth
𝑛
Width
𝑘
내부 conv층
개수 𝑙
3x3 conv 두개로
이루어진 구조

/ 20
Wide ResNet의 기본 구조
• Pre-activation ResBlock
– BN-Relu-Conv
6
Structure of Wide Residual Networks

/ 20
CIFAR 10, CIFAR 100
• Classification Benchmark
– CIFAR 10 : 10 classes
– CIFAR 100 : 100 classes
7
CIFAR10 예

/ 20
ResBlock의 구조 실험
• 전체 파라메터 개수를 유
지
• ResBlock구조를 바꿔가며
실험
– WRN-40-2 에서 실험
– B(1,3,1), B(3,1), B(1,3)
– WRN-28-2-B(3,3)
– WRN-22-2-B(3,1,3)
• B(3,3)이 가장 우수
– 파라메터 개수 대비 B(3,1),
B(3,1,3)도 우수
• 큰 차이는 없으므로 이후
실험에서는 B(3,3)을 사용
8
Exp. 1. Type of convolution in a block
>> Experimental Result

/ 20
ResBlock 내부 Conv층의 개수
• WRN-40-2 에서
• 𝑙 ∈ [1,2,3,4] 변경하며 실험
– 총 depth는 유지(=40)
• 𝑙 = 2 경우 가장 우수
• 이후 실험에서 𝑙 = 2 로 고정
9
Exp. 2. Number of convolutions per block

/ 20
폭(width), 깊이(depth), 파라메터 개수(# params) 실험 1
• WRN의 여러 변형 실험
• 같은 깊이  넓을수록 우수
• 같은 폭  깊을수록 우수
• 같은 파라메터 개수  상황마다 다름  최적 깊이, 폭 고려 필요함
10
Exp. 3. Width of residual blocks

/ 20
• WRN과 다른 방법 비교
• 앞선 실험과 비슷한 경향
• WRN이 우수함 - 폭과 깊이의 적정점을 찾아냄
11

/ 20
• ResNet-164와 WRN-28-10의 크기 차이와 학습 난이도
– # params 차이 : 1.7M vs. 36.5M
– WRN-28-10이 훨씬 큰 모델이지만 학습이 잘됨
12

/ 20
Dropout + ResBlock 실험
• K. He의 실험과는 다름
– K. He는 Dropout으로 효과를 보지 못함
– Conv - Dropout - Conv 구조
• 대체로 dropout 사용이 유리함
13
Exp. 4. Dropout in residual blocks

/ 20
Street View House Numbers (SVHN) Dataset
14
SVHN 예

/ 20
Dropout + ResBlock 실험
• ResNet을 늘려서 Wide ResNet 구성함
15
Exp. 5. ImageNet and COCO experiments

/ 20
가장 우수한 결과 모음
• COCO 버전의 경우
– MultiPathNet + LocNet + WRN
16
Exp. 5. ImageNet and COCO experiments

/ 20
계산 효율성 측정
• 조건
– TitanX, minibatch size 32
– Forward+Backward update
time
• ResNet-1001과 WRN-40-4
– 비슷한 정확도
– 8배 속도 차이
17
Exp. 6. Computational efficiency
# params : 1.7M
10.2M
8.9M
17.1M
36.5M

/ 20
• 여러 가지 ResBlock 구조에 대한 고찰
• 깊이와 폭을 동시에 고려
• Dropout과 ResBlock를 조합하는 새로운 방법
• 우수한 결과
• 폭이 넓은 구조가 계산에 유리함을 실험으로 증명
18
Conclusions

Review Wide Resnet

More Related Content

What's hot

Similar to Review Wide Resnet

Review Wide Resnet