1. Chair of Network Architectures and Services
Department of Informatics
Technical University of Munich
Capsule Networks - An Overview
Luca Dombetzki
July 13, 2018
Advisor: Marton Kajo
4. Introduction
Motivation
Figure 1: figure from [12]
Both images are seen as "face" by a typical Convolutional Neural Network
⇒ Capsule Networks
L. Dombetzki — Capsule Networks 4
5. Introduction
Where does AI come from?
Figure 2: A neuron as part of a Multi Layer Neural Network [21]
Designed after the human brain
• Advances in mathematical modeling
• Performance gains from GPUs
• Deep Learning leverages both
But no longer like the human brain:
• Black-box system
• Requires huge amounts of data
• Highly probabilistic
6. Introduction
Who is Geoffrey E. Hinton?
“The pooling operation used in convolutional neural networks is a big mistake
and the fact that it works so well is a disaster.” - Geoffrey E. Hinton (2014) [7]
• Professor at the University of Toronto
• Working at Google Brain
• Major advancements in AI [13]
• Research on Capsule Networks:
• Based on biological research
• Understanding Human vision (1981) [9]
• Talks explaining his motivation [8]
• Dynamic Routing Between Capsules (2017) [19]
• Matrix Capsules with EM-Routing (2018) [6]
Figure 3: Geoffrey E. Hinton [24]
10. Convolutional Neural Networks
Activation functions
Figure 6: Sigmoid and Rectified Linear Unit (ReLU) [20]
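The two activations in Figure 6 can be written in a few lines (a minimal NumPy sketch; the function names are my own):

```python
import numpy as np

def sigmoid(x):
    # Squashes any real-valued input into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Passes positive inputs through unchanged, zeroes out negatives
    return np.maximum(0.0, x)

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))  # values in (0, 1), exactly 0.5 at x = 0
print(relu(x))     # [0. 0. 2.]
```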
A single neuron computes σ(w₀ + Σᵢ₌₁ⁿ wᵢxᵢ): the weighted sum of its inputs x₁, …, xₙ plus a bias w₀, passed through the activation function σ.
Figure 7: A single neuron [21]
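The same computation in code (a minimal NumPy sketch using a sigmoid activation; the example weights are made up):

```python
import numpy as np

def neuron(x, w, w0):
    # sigma(w0 + sum_i w_i * x_i): weighted input sum plus bias,
    # passed through a sigmoid activation
    z = w0 + np.dot(w, x)
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 2.0, 3.0])    # inputs
w = np.array([0.5, -0.25, 0.1])  # weights (made up for illustration)
print(neuron(x, w, w0=0.0))      # sigmoid(0.3), roughly 0.574
```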
11. Convolutional Neural Networks
Pooling as a form of routing
Routing
• find important nodes (inputs)
• group them together
• pass them to the next layer
Pooling
• reduces the input data
• the next layer can “see” more than the previous one
• enables detecting full objects through locational invariance
• static routing
Figure 8: Max pooling example [2]
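Max pooling as in Figure 8 can be sketched as follows (illustrative NumPy code for non-overlapping 2×2 windows):

```python
import numpy as np

def max_pool_2x2(feature_map):
    # Keep only the strongest activation in each 2x2 window: a static
    # "route" that discards exact positions (locational invariance)
    h, w = feature_map.shape
    out = feature_map[:h - h % 2, :w - w % 2]   # crop odd remainders
    out = out.reshape(h // 2, 2, w // 2, 2)
    return out.max(axis=(1, 3))                 # max over each window

fm = np.array([[1, 3, 2, 0],
               [4, 2, 1, 5],
               [0, 1, 9, 2],
               [3, 2, 4, 8]])
print(max_pool_2x2(fm))
# [[4 5]
#  [3 9]]
```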
13. Convolutional Neural Networks
Problems of pooling
Figure 10: Distorted face from [12]
Geoffrey E. Hinton’s arguments against pooling [8]
• Unnatural
• No use of the linear structure of vision
• Static instead of dynamic routing
• Invariance instead of Equivariance
14. Convolutional Neural Networks
What does a neuron represent?
Figure 11: Face detection with a CNN, from [10]
16. Capsule Networks
Hinton’s idea
Figure 12: Hierarchical modeling in Computer Graphics [5]
Build a network to perform inverse graphics
• propagate the probability and pose of features
• dynamic routing based on pose information
• introduce the concept of an entity into the network’s architecture
⇒ The capsule
17. Capsule Networks
An abstract view on capsules
Figure 13: Capsule face detection, from [10]
18. Capsule Networks
The capsule - a group of neurons
Before: a layer of neurons — input: n values, output: one value
After: a layer of neuron groups (capsules) — input: n vectors, output: one vector
• A capsule learns parameters (skew, scale, rotation, etc.)
• n-dimensional capsule ⇒ n-dimensional output vector ⇒ n parameters ≙ pose
• probability = ||vector_out||
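Reading the vector length as a probability relies on the squashing non-linearity from [19]; a minimal NumPy sketch:

```python
import numpy as np

def squash(s):
    # Non-linearity from [19]: preserves the direction (pose) of s,
    # maps its length into [0, 1) so it can be read as a probability
    norm_sq = np.dot(s, s)
    return (norm_sq / (1.0 + norm_sq)) * s / np.sqrt(norm_sq)

s = np.array([3.0, 4.0])      # raw capsule output, length 5
v = squash(s)
print(np.linalg.norm(v))      # 25/26, roughly 0.96 -> confident detection
print(v / np.linalg.norm(v))  # pose direction unchanged: [0.6 0.8]
```

Long vectors saturate toward length 1, short vectors shrink toward 0, while the pose encoded in the direction is untouched.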
19. Capsule Networks
Architecture - The CapsNet
Figure 14: Capsule Network Architecture as described in [19]
Layer | Function
Conv1 | Convolutional layer
PrimaryCaps | Convolutional squashing capsules
DigitCaps | Normal (digit) capsules
Class predictions | Length of each DigitCapsule
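The layer output shapes for 28×28 MNIST input, as reported in [19] (this snippet only tracks tensor sizes, it is not a working network):

```python
# Layer output shapes of the CapsNet from [19] on 28x28 MNIST input
shapes = {
    "input":       (28, 28, 1),
    "Conv1":       (20, 20, 256),  # 256 9x9 kernels, stride 1
    "PrimaryCaps": (6, 6, 32, 8),  # 32 maps of 8D capsule vectors, stride 2
    "DigitCaps":   (10, 16),       # one 16D capsule per digit class
}
# Number of primary capsules routed to DigitCaps:
n_primary = 6 * 6 * 32
print(n_primary)  # 1152
```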
21. Capsule Networks
Routing by agreement
The phenomenon of “coincidence filtering”:
• high-dimensional pose-parameter space
• similar poses occurring by chance are very unlikely (curse of dimensionality)
Clustering the inputs based on their pose — repeat r times:
1. find the mean vector of the cluster
2. weigh all inputs by their distance to this mean
3. normalize the weights
Figure 16: weighted clustering [4]
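The three clustering steps above can be sketched as follows (an illustrative simplification in NumPy, not the exact routing softmax from [19]):

```python
import numpy as np

def weighted_cluster(votes, n_iters=3):
    # Illustrative weighted clustering: inputs that agree with the
    # cluster mean gain weight each round, outliers lose it
    weights = np.full(len(votes), 1.0 / len(votes))
    for _ in range(n_iters):
        mean = np.average(votes, axis=0, weights=weights)  # 1. mean vector
        dists = np.linalg.norm(votes - mean, axis=1)
        weights = np.exp(-dists)                           # 2. weigh by distance
        weights /= weights.sum()                           # 3. normalize
    return mean, weights

# Three agreeing votes plus one outlier:
votes = np.array([[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [5.0, 5.0]])
mean, w = weighted_cluster(votes)
print(w)  # the outlier [5, 5] ends up with near-zero weight
```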
22. Capsule Networks
How to train the network
Two components: a margin loss and a reconstruction (decoder) network
Figure 17: Capsule Network architectures [19]
Goal | Loss function | Learning
Parameter learning | Reconstruction loss | Unsupervised
Classification | Margin loss | Supervised
Reconstruction loss
• reconstruct the digit from the active capsule (all other capsules masked)
Margin loss
• detection: ||v|| ≥ 0.9
• no detection: ||v|| ≤ 0.1
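The margin loss from [19] (with the paper’s values m⁺ = 0.9, m⁻ = 0.1, λ = 0.5) as a NumPy sketch:

```python
import numpy as np

def margin_loss(lengths, targets, m_plus=0.9, m_minus=0.1, lam=0.5):
    # Margin loss from [19]: present classes are pushed above m_plus,
    # absent classes below m_minus (lam down-weights the absent term)
    present = targets * np.maximum(0.0, m_plus - lengths) ** 2
    absent = lam * (1 - targets) * np.maximum(0.0, lengths - m_minus) ** 2
    return np.sum(present + absent)

lengths = np.array([0.95, 0.05, 0.3])  # capsule output norms
targets = np.array([1, 0, 0])          # class 0 is present
print(margin_loss(lengths, targets))   # only the 0.3 absent class is penalized
```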
23. Capsule Networks
How does it perform? - Parameter Effects
Individual capsule dimensions encode, e.g., scale and thickness, localized parts, stroke thickness, localized skew, and width and translation
Figure 18: Effects of capsule parameters on reconstruction [19]
25. Capsule Networks
How does it perform? - MultiMNIST
The network was forced to reconstruct its false predictions
Reconstruction R vs. true label L: R:(5, 7) / L:(5, 0), R:(2, 3) / L:(4, 3), R:(0, 8) / L:(1, 8), R:(1, 6) / L:(7, 6)
Figure 20: MultiMNIST reconstructions [19]
26. Capsule Networks
Further research
Authors | Contribution
Hinton et al. | Pose-matrix capsules and EM routing [6]
Xi et al. | Hyperparameter tuning for complex data [25]
Phaye et al. | Skip connections [17]
Rawlinson et al. | Unsupervised training [18]
Bahadori et al. | New routing (eigen-decomposition) [3]
Wang et al. | Optimized routing (KL regularization) [22]
28. Discussion
Superior to CNNs?
Advantages
• Viewpoint invariance
• Less training data needed
• Fewer parameters
• Better generalization
• Robustness to white-box attacks
• Validatability

Challenges
• Scalability
• “Explain everything”
• Entity-based structure
• Loss functions
• Crowding
• Unoptimized implementation
29. Discussion
CapsNets for real world problems
Figure 21: Results from Afshar et al. [1]
Authors | Application | Benefit
Afshar et al. [1] | Brain tumor classification | Less training data
Wang et al. [23] | Sentiment analysis with RNNs | State-of-the-art performance
LaLonde et al. [14] | Medical image segmentation | Parameter reduction by 95.4%
31. Conclusion
Conclusion
Big step towards human vision
• Novel network architecture
• Inverse graphics through pose vector capsules
• Dynamic routing via routing-by-agreement
• Multiple significant advantages
• Early development phase
But not yet competitive with CNNs in “mainstream areas”
34. Bibliography
[1] P. Afshar, A. Mohammadi, and K. N. Plataniotis.
Brain tumor type classification via capsule networks.
CoRR, abs/1802.10200, 2018.
[2] Aphex34.
Convolutional neural network - max pooling.
https://en.wikipedia.org/wiki/Convolutional_neural_network#Max_pooling_shape; last accessed on 2018/06/14.
[3] M. T. Bahadori.
Spectral capsule networks.
2018.
[4] N. Bourdakos.
Understanding capsule networks - AI’s alluring new architecture.
https://medium.freecodecamp.org/understanding-capsule-networks-ais-alluring-new-architecture-bdb228173ddc;
last accessed on 2018/07/05.
[5] D. J. Eck.
Introduction to computer graphics: Hierarchical modeling.
http://math.hws.edu/graphicsbook/c2/s4.html; last accessed on 2018/07/05.
[6] G. Hinton, S. Sabour, and N. Frosst.
Matrix capsules with em routing.
2018.
[7] G. E. Hinton.
Ask Me Anything on Reddit.
https://www.reddit.com/r/MachineLearning/comments/2lmo0l/ama_geoffrey_hinton/clyj4jv/; last accessed on
2018/07/03.
35. Bibliography
[8] G. E. Hinton.
What is wrong with convolutional neural nets?
Talk recorded on youtube, https://youtu.be/rTawFwUvnLE; last accessed on 2018/06/14.
[9] G. F. Hinton and F. Cambridge.
Shape representation in parallel systems.
1981.
[10] J. Hui.
Understanding dynamic routing between capsules.
https://jhui.github.io/2017/11/03/Dynamic-Routing-Between-Capsules/; last accessed on 2018/07/05.
[11] H. Kazemi.
Image filtering.
http://machinelearninguru.com/computer_vision/basics/convolution/image_convolution_1.html; last accessed on
2018/07/05.
[12] T. Kothari.
Uncovering the intuition behind capsule networks and inverse graphics.
https://hackernoon.com/uncovering-the-intuition-behind-capsule-networks-and-inverse-graphics-part-i-7412d12179;
last accessed on 2018/06/14.
[13] A. Krizhevsky, I. Sutskever, and G. E. Hinton.
Imagenet classification with deep convolutional neural networks.
In Advances in neural information processing systems, pages 1097–1105, 2012.
[14] R. LaLonde and U. Bagci.
Capsules for Object Segmentation.
ArXiv e-prints, Apr. 2018.
36. Bibliography
[15] H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng.
Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations.
In Proceedings of the 26th annual international conference on machine learning, pages 609–616. ACM, 2009.
[16] Mathworks.
Convolutional neural network.
https://www.mathworks.com/solutions/deep-learning/convolutional-neural-network.html; last accessed on
2018/06/14.
[17] S. S. R. Phaye, A. Sikka, A. Dhall, and D. Bathula.
Dense and diverse capsule networks: Making the capsules learn better.
arXiv preprint arXiv:1805.04001, 2018.
[18] D. Rawlinson, A. Ahmed, and G. Kowadlo.
Sparse unsupervised capsules generalize better.
CoRR, abs/1804.06094, 2018.
[19] S. Sabour, N. Frosst, and G. E. Hinton.
Dynamic routing between capsules.
In Advances in Neural Information Processing Systems, pages 3859–3869, 2017.
[20] S. Sharma.
Activation functions: Neural networks.
https://towardsdatascience.com/activation-functions-neural-networks-1cbd9f8d91d6; last accessed on 2018/07/05.
[21] P. Veličković.
Tikz figure collection.
https://github.com/PetarV-/TikZ/tree/master/Multilayerperceptron; last accessed on 2018/07/05.
37. Bibliography
[22] D. Wang and Q. Liu.
An optimization view on dynamic routing between capsules.
2018.
[23] Y. Wang, A. Sun, J. Han, Y. Liu, and X. Zhu.
Sentiment analysis by capsules.
In Proceedings of the 2018 World Wide Web Conference on World Wide Web, pages 1165–1174. International World Wide
Web Conferences Steering Committee, 2018.
[24] N. Wolchover.
As machines get smarter, evidence they learn like us.
https://www.quantamagazine.org/as-machines-get-smarter-evidence-they-learn-like-us-20130723/; last accessed
on 2018/07/05.
[25] E. Xi, S. Bing, and Y. Jin.
Capsule Network Performance on Complex Data.
ArXiv e-prints, Dec. 2017.
39. Appendix
Improvements: EM-Routing
• Hinton et al. [6]
• 4×4 pose-matrix capsules
• Expectation-Maximization (EM) routing
• Error rate on the smallNORB dataset:
• CNN: 2.56%, CapsNet: 1.4%
• Testing on unseen viewpoints
40. Appendix
Routing-by-agreement algorithm
1: procedure ROUTING(û_j|i, r, l)
2:   for all capsule i in layer l and capsule j in layer (l + 1): b_ij ← 0
3:   for r iterations do
4:     for all capsule i in layer l: c_i ← softmax(b_i)
5:     for all capsule j in layer (l + 1): s_j ← Σ_i c_ij û_j|i
6:     for all capsule j in layer (l + 1): v_j ← squash(s_j)
7:     for all capsule i in layer l and capsule j in layer (l + 1): b_ij ← b_ij + û_j|i · v_j
8:   return v_j
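A runnable NumPy sketch of the procedure above (array shapes are my own choice; the line comments map to the algorithm’s numbered steps):

```python
import numpy as np

def squash(s, axis=-1):
    # Length into [0, 1), direction preserved; epsilon avoids div by zero
    norm_sq = np.sum(s ** 2, axis=axis, keepdims=True)
    return (norm_sq / (1.0 + norm_sq)) * s / np.sqrt(norm_sq + 1e-9)

def softmax(b, axis=-1):
    e = np.exp(b - b.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def routing(u_hat, r=3):
    # u_hat[i, j]: vote of lower capsule i for higher capsule j,
    # shape (n_lower, n_higher, dim)
    n_i, n_j, _ = u_hat.shape
    b = np.zeros((n_i, n_j))                       # line 2: logits start at 0
    for _ in range(r):                             # line 3
        c = softmax(b, axis=1)                     # line 4: coupling coefficients
        s = np.einsum('ij,ijd->jd', c, u_hat)      # line 5: weighted vote sum
        v = squash(s)                              # line 6
        b = b + np.einsum('ijd,jd->ij', u_hat, v)  # line 7: reward agreement
    return v

# Four lower capsules: all agree on higher capsule 0, cancel out on capsule 1
agree = np.tile([1.0, 0.0, 0.0], (4, 1))
u_hat = np.stack([agree, agree * np.array([[1], [-1], [1], [-1]])], axis=1)
v = routing(u_hat)
print(np.linalg.norm(v, axis=1))  # capsule 0 (agreement) gets the long vector
```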