1. Chair of Network Architectures and Services
Department of Informatics
Technical University of Munich
Capsule Networks - An Overview
Luca Dombetzki
July 13, 2018
Advisor: Marton Kajo
4. Introduction
Motivation
Figure 1: figure from [12]
Both images are seen as "face" by a typical Convolutional Neural Network
⇒ Capsule Networks
L. Dombetzki — Capsule Networks 4
5. Introduction
Where does AI come from?
Figure 2: A neuron as part of a Multi Layer Neural Network [21]
Designed after the human brain
• Advances in mathematical modeling
• Performance gains from GPUs
• Deep Learning leverages both
But no longer like the human brain:
• Black-box system
• Requires huge amounts of data
• Highly probabilistic
6. Introduction
Who is Geoffrey E. Hinton?
“The pooling operation used in convolutional neural networks is a big mistake
and the fact that it works so well is a disaster.” - Geoffrey E. Hinton (2014) [7]
• Professor at the University of Toronto
• Working at Google Brain
• Major advancements in AI [13]
• Research on Capsule Networks:
• Based on biological research
• Understanding Human vision (1981) [9]
• Talks explaining his motivation [8]
• Dynamic Routing Between Capsules (2017) [19]
• Matrix Capsules with EM-Routing (2018) [6]
Figure 3: Geoffrey E. Hinton [24]
10. Convolutional Neural Networks
Activation functions
Figure 6: Sigmoid and Rectified Linear Unit (ReLU) [20]
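The two activations in Figure 6 can be written in a few lines (a minimal NumPy sketch; the function names are my own):

```python
import numpy as np

def sigmoid(x):
    # Squashes any real-valued input into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Passes positive inputs through unchanged, zeroes out negatives
    return np.maximum(0.0, x)

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))  # values in (0, 1), exactly 0.5 at x = 0
print(relu(x))     # [0. 0. 2.]
```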
A single neuron computes σ(w₀ + Σᵢ₌₁ⁿ wᵢxᵢ): the weighted sum of its inputs x₁, …, xₙ plus a bias w₀, passed through the activation function σ.
Figure 7: A single neuron [21]
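The same computation in code (a minimal NumPy sketch using a sigmoid activation; the example weights are made up):

```python
import numpy as np

def neuron(x, w, w0):
    # sigma(w0 + sum_i w_i * x_i): weighted input sum plus bias,
    # passed through a sigmoid activation
    z = w0 + np.dot(w, x)
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 2.0, 3.0])    # inputs
w = np.array([0.5, -0.25, 0.1])  # weights (made up for illustration)
print(neuron(x, w, w0=0.0))      # sigmoid(0.3), roughly 0.574
```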
11. Convolutional Neural Networks
Pooling as a form of routing
Routing
• find important nodes (inputs)
• group them together
• pass them to the next layer
Pooling
• reduces the input data
• the next layer can “see” more than the previous one
• enables detecting full objects through locational invariance
• static routing
Figure 8: Max pooling example [2]
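Max pooling as in Figure 8 can be sketched as follows (illustrative NumPy code for non-overlapping 2×2 windows):

```python
import numpy as np

def max_pool_2x2(feature_map):
    # Keep only the strongest activation in each 2x2 window: a static
    # "route" that discards exact positions (locational invariance)
    h, w = feature_map.shape
    out = feature_map[:h - h % 2, :w - w % 2]   # crop odd remainders
    out = out.reshape(h // 2, 2, w // 2, 2)
    return out.max(axis=(1, 3))                 # max over each window

fm = np.array([[1, 3, 2, 0],
               [4, 2, 1, 5],
               [0, 1, 9, 2],
               [3, 2, 4, 8]])
print(max_pool_2x2(fm))
# [[4 5]
#  [3 9]]
```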
13. Convolutional Neural Networks
Problems of pooling
Figure 10: Distorted face from [12]
Geoffrey E. Hinton’s arguments against pooling [8]
• Unnatural
• No use of the linear structure of vision
• Static instead of dynamic routing
• Invariance instead of Equivariance
14. Convolutional Neural Networks
What does a neuron represent?
Figure 11: Face detection with a CNN, from [10]
16. Capsule Networks
Hinton’s idea
Figure 12: Hierarchical modeling in Computer Graphics [5]
Build a network to perform inverse graphics
• propagate the probability and pose of features
• dynamic routing based on pose information
• introduce the concept of an entity into the network’s architecture
⇒ The capsule
17. Capsule Networks
An abstract view on capsules
Figure 13: Capsule face detection, from [10]
18. Capsule Networks
The capsule - a group of neurons
Before: a layer of neurons — input: n values, output: one value
After: a layer of neuron groups (capsules) — input: n vectors, output: one vector
• A capsule learns parameters (skew, scale, rotation, etc.)
• n-dimensional capsule ⇒ n-dimensional output vector ⇒ n parameters ≙ pose
• probability = ||vector_out||
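Reading the vector length as a probability relies on the squashing non-linearity from [19]; a minimal NumPy sketch:

```python
import numpy as np

def squash(s):
    # Non-linearity from [19]: preserves the direction (pose) of s,
    # maps its length into [0, 1) so it can be read as a probability
    norm_sq = np.dot(s, s)
    return (norm_sq / (1.0 + norm_sq)) * s / np.sqrt(norm_sq)

s = np.array([3.0, 4.0])      # raw capsule output, length 5
v = squash(s)
print(np.linalg.norm(v))      # 25/26, roughly 0.96 -> confident detection
print(v / np.linalg.norm(v))  # pose direction unchanged: [0.6 0.8]
```

Long vectors saturate toward length 1, short vectors shrink toward 0, while the pose encoded in the direction is untouched.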
19. Capsule Networks
Architecture - The CapsNet
Figure 14: Capsule Network Architecture as described in [19]
Layer | Function
Conv1 | Convolutional layer
PrimaryCaps | Convolutional squashing capsules
DigitCaps | Normal (digit) capsules
Class predictions | Length of each DigitCapsule
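The layer output shapes for 28×28 MNIST input, as reported in [19] (this snippet only tracks tensor sizes, it is not a working network):

```python
# Layer output shapes of the CapsNet from [19] on 28x28 MNIST input
shapes = {
    "input":       (28, 28, 1),
    "Conv1":       (20, 20, 256),  # 256 9x9 kernels, stride 1
    "PrimaryCaps": (6, 6, 32, 8),  # 32 maps of 8D capsule vectors, stride 2
    "DigitCaps":   (10, 16),       # one 16D capsule per digit class
}
# Number of primary capsules routed to DigitCaps:
n_primary = 6 * 6 * 32
print(n_primary)  # 1152
```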
21. Capsule Networks
Routing by agreement
The phenomenon of “coincidence filtering”:
• high-dimensional pose-parameter space
• similar poses occurring by chance are very unlikely (curse of dimensionality)
Clustering the inputs based on their pose — repeat r times:
1. find the mean vector of the cluster
2. weigh all inputs by their distance to this mean
3. normalize the weights
Figure 16: weighted clustering [4]
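The three clustering steps above can be sketched as follows (an illustrative simplification in NumPy, not the exact routing softmax from [19]):

```python
import numpy as np

def weighted_cluster(votes, n_iters=3):
    # Illustrative weighted clustering: inputs that agree with the
    # cluster mean gain weight each round, outliers lose it
    weights = np.full(len(votes), 1.0 / len(votes))
    for _ in range(n_iters):
        mean = np.average(votes, axis=0, weights=weights)  # 1. mean vector
        dists = np.linalg.norm(votes - mean, axis=1)
        weights = np.exp(-dists)                           # 2. weigh by distance
        weights /= weights.sum()                           # 3. normalize
    return mean, weights

# Three agreeing votes plus one outlier:
votes = np.array([[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [5.0, 5.0]])
mean, w = weighted_cluster(votes)
print(w)  # the outlier [5, 5] ends up with near-zero weight
```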
22. Capsule Networks
How to train the network
Two components: a margin loss and a reconstruction (decoder) network
Figure 17: Capsule Network architectures [19]
Goal | Loss function | Learning
Parameter learning | Reconstruction loss | Unsupervised
Classification | Margin loss | Supervised
Reconstruction loss
• reconstruct the digit from the active capsule (all other capsules masked)
Margin loss
• detection: ||v|| ≥ 0.9
• no detection: ||v|| ≤ 0.1
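The margin loss from [19] (with the paper’s values m⁺ = 0.9, m⁻ = 0.1, λ = 0.5) as a NumPy sketch:

```python
import numpy as np

def margin_loss(lengths, targets, m_plus=0.9, m_minus=0.1, lam=0.5):
    # Margin loss from [19]: present classes are pushed above m_plus,
    # absent classes below m_minus (lam down-weights the absent term)
    present = targets * np.maximum(0.0, m_plus - lengths) ** 2
    absent = lam * (1 - targets) * np.maximum(0.0, lengths - m_minus) ** 2
    return np.sum(present + absent)

lengths = np.array([0.95, 0.05, 0.3])  # capsule output norms
targets = np.array([1, 0, 0])          # class 0 is present
print(margin_loss(lengths, targets))   # only the 0.3 absent class is penalized
```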
23. Capsule Networks
How does it perform? - Parameter Effects
Individual capsule dimensions encode, e.g., scale and thickness, localized parts, stroke thickness, localized skew, and width and translation
Figure 18: Effects of capsule parameters on reconstruction [19]
25. Capsule Networks
How does it perform? - MultiMNIST
The network was forced to reconstruct its false predictions
Reconstruction R vs. true label L: R:(5, 7) / L:(5, 0), R:(2, 3) / L:(4, 3), R:(0, 8) / L:(1, 8), R:(1, 6) / L:(7, 6)
Figure 20: MultiMNIST reconstructions [19]
26. Capsule Networks
Further research
Authors | Contribution
Hinton et al. | Pose-matrix capsules and EM routing [6]
Xi et al. | Hyperparameter tuning for complex data [25]
Phaye et al. | Skip connections [17]
Rawlinson et al. | Unsupervised training [18]
Bahadori et al. | New routing (eigen-decomposition) [3]
Wang et al. | Optimized routing (KL regularization) [22]
28. Discussion
Superior to CNNs?
Advantages
• Viewpoint invariance
• Less training data needed
• Fewer parameters
• Better generalization
• Robustness to white-box attacks
• Validatability

Challenges
• Scalability
• “Explain everything”
• Entity-based structure
• Loss functions
• Crowding
• Unoptimized implementation
29. Discussion
CapsNets for real world problems
Figure 21: Results from Afshar et al. [1]
Authors | Application | Benefit
Afshar et al. [1] | Brain tumor classification | Less training data
Wang et al. [23] | Sentiment analysis with RNNs | State-of-the-art performance
LaLonde et al. [14] | Medical image segmentation | Parameter reduction by 95.4%
31. Conclusion
Conclusion
Big step towards human vision
• Novel network architecture
• Inverse graphics through pose vector capsules
• Dynamic routing via routing-by-agreement
• Multiple significant advantages
• Early development phase
But not yet competitive with CNNs in “mainstream areas”
34. Bibliography
[1] P. Afshar, A. Mohammadi, and K. N. Plataniotis.
Brain tumor type classification via capsule networks.
CoRR, abs/1802.10200, 2018.
[2] Aphex34.
Convolutional neural network - max pooling.
https://en.wikipedia.org/wiki/Convolutional_neural_network#Max_pooling_shape; last accessed on 2018/06/14.
[3] M. T. Bahadori.
Spectral capsule networks.
2018.
[4] N. Bourdakos.
Understanding capsule networks - AI’s alluring new architecture.
https://medium.freecodecamp.org/understanding-capsule-networks-ais-alluring-new-architecture-bdb228173ddc;
last accessed on 2018/07/05.
[5] D. J. Eck.
Introduction to computer graphics: Hierarchical modeling.
http://math.hws.edu/graphicsbook/c2/s4.html; last accessed on 2018/07/05.
[6] G. Hinton, S. Sabour, and N. Frosst.
Matrix capsules with em routing.
2018.
[7] G. E. Hinton.
Ask Me Anything on Reddit.
https://www.reddit.com/r/MachineLearning/comments/2lmo0l/ama_geoffrey_hinton/clyj4jv/; last accessed on
2018/07/03.
35. Bibliography
[8] G. E. Hinton.
What is wrong with convolutional neural nets?
Talk recorded on youtube, https://youtu.be/rTawFwUvnLE; last accessed on 2018/06/14.
[9] G. F. Hinton and F. Cambridge.
Shape representation in parallel systems.
1981.
[10] J. Hui.
Understanding dynamic routing between capsules.
https://jhui.github.io/2017/11/03/Dynamic-Routing-Between-Capsules/; last accessed on 2018/07/05.
[11] H. Kazemi.
Image filtering.
http://machinelearninguru.com/computer_vision/basics/convolution/image_convolution_1.html; last accessed on
2018/07/05.
[12] T. Kothari.
Uncovering the intuition behind capsule networks and inverse graphics.
https://hackernoon.com/uncovering-the-intuition-behind-capsule-networks-and-inverse-graphics-part-i-7412d12179;
last accessed on 2018/06/14.
[13] A. Krizhevsky, I. Sutskever, and G. E. Hinton.
Imagenet classification with deep convolutional neural networks.
In Advances in neural information processing systems, pages 1097–1105, 2012.
[14] R. LaLonde and U. Bagci.
Capsules for Object Segmentation.
ArXiv e-prints, Apr. 2018.
36. Bibliography
[15] H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng.
Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations.
In Proceedings of the 26th annual international conference on machine learning, pages 609–616. ACM, 2009.
[16] Mathworks.
Convolutional neural network.
https://www.mathworks.com/solutions/deep-learning/convolutional-neural-network.html; last accessed on
2018/06/14.
[17] S. S. R. Phaye, A. Sikka, A. Dhall, and D. Bathula.
Dense and diverse capsule networks: Making the capsules learn better.
arXiv preprint arXiv:1805.04001, 2018.
[18] D. Rawlinson, A. Ahmed, and G. Kowadlo.
Sparse unsupervised capsules generalize better.
CoRR, abs/1804.06094, 2018.
[19] S. Sabour, N. Frosst, and G. E. Hinton.
Dynamic routing between capsules.
In Advances in Neural Information Processing Systems, pages 3859–3869, 2017.
[20] S. Sharma.
Activation functions: Neural networks.
https://towardsdatascience.com/activation-functions-neural-networks-1cbd9f8d91d6; last accessed on 2018/07/05.
[21] P. Veličković.
Tikz figure collection.
https://github.com/PetarV-/TikZ/tree/master/Multilayerperceptron; last accessed on 2018/07/05.
37. Bibliography
[22] D. Wang and Q. Liu.
An optimization view on dynamic routing between capsules.
2018.
[23] Y. Wang, A. Sun, J. Han, Y. Liu, and X. Zhu.
Sentiment analysis by capsules.
In Proceedings of the 2018 World Wide Web Conference on World Wide Web, pages 1165–1174. International World Wide
Web Conferences Steering Committee, 2018.
[24] N. Wolchover.
As machines get smarter, evidence they learn like us.
https://www.quantamagazine.org/as-machines-get-smarter-evidence-they-learn-like-us-20130723/; last accessed
on 2018/07/05.
[25] E. Xi, S. Bing, and Y. Jin.
Capsule Network Performance on Complex Data.
ArXiv e-prints, Dec. 2017.
39. Appendix
Improvements: EM-Routing
• Hinton et al. [6]
• 4×4 pose-matrix capsules
• Expectation-Maximization (EM) routing
• Error rate on the smallNORB dataset:
• CNN: 2.56%, CapsNet: 1.4%
• Testing on unseen viewpoints
40. Appendix
Routing-by-agreement algorithm
1: procedure ROUTING(û_j|i, r, l)
2:   for all capsule i in layer l and capsule j in layer (l + 1): b_ij ← 0
3:   for r iterations do
4:     for all capsule i in layer l: c_i ← softmax(b_i)
5:     for all capsule j in layer (l + 1): s_j ← Σ_i c_ij û_j|i
6:     for all capsule j in layer (l + 1): v_j ← squash(s_j)
7:     for all capsule i in layer l and capsule j in layer (l + 1): b_ij ← b_ij + û_j|i · v_j
8:   return v_j
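A runnable NumPy sketch of the procedure above (array shapes are my own choice; the line comments map to the algorithm’s numbered steps):

```python
import numpy as np

def squash(s, axis=-1):
    # Length into [0, 1), direction preserved; epsilon avoids div by zero
    norm_sq = np.sum(s ** 2, axis=axis, keepdims=True)
    return (norm_sq / (1.0 + norm_sq)) * s / np.sqrt(norm_sq + 1e-9)

def softmax(b, axis=-1):
    e = np.exp(b - b.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def routing(u_hat, r=3):
    # u_hat[i, j]: vote of lower capsule i for higher capsule j,
    # shape (n_lower, n_higher, dim)
    n_i, n_j, _ = u_hat.shape
    b = np.zeros((n_i, n_j))                       # line 2: logits start at 0
    for _ in range(r):                             # line 3
        c = softmax(b, axis=1)                     # line 4: coupling coefficients
        s = np.einsum('ij,ijd->jd', c, u_hat)      # line 5: weighted vote sum
        v = squash(s)                              # line 6
        b = b + np.einsum('ijd,jd->ij', u_hat, v)  # line 7: reward agreement
    return v

# Four lower capsules: all agree on higher capsule 0, cancel out on capsule 1
agree = np.tile([1.0, 0.0, 0.0], (4, 1))
u_hat = np.stack([agree, agree * np.array([[1], [-1], [1], [-1]])], axis=1)
v = routing(u_hat)
print(np.linalg.norm(v, axis=1))  # capsule 0 (agreement) gets the long vector
```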