Dynamic Routing Between Capsules
Sara Sabour, Nicholas Frosst, Geoffrey E Hinton, 10, 2017, Arxiv
LAB SEMINAR
1
2017.11.13
SNU DATAMINING CENTER
MINKI CHUNG
TABLE OF CONTENTS
▸ Intuition
▸ Problems of ConvNet
▸ How brain works, Inverse graphics
▸ Capsule Theory
▸ CapsNet
▸ Capsule
▸ CapsNet architecture
▸ Experiment
▸ Classification on MNIST
▸ Reconstruction on MNIST
▸ Dimension perturbation on MNIST
▸ Discussion
2
INTUITION
▸ Problems of ConvNet
▸ How brain works, Inverse graphics
▸ Capsule Theory
3
PROBLEMS OF CONVNET 4
▸ ConvNet Architecture
PROBLEM IS ‘POOLING’
https://hackernoon.com/what-is-a-capsnet-or-capsule-network-2bfbe48769cc
Obtain translational, rotational invariance
PROBLEMS OF CONVNET 5
▸
@REDDIT, MACHINE LEARNING
https://www.reddit.com/r/MachineLearning/comments/2lmo0l/ama_geoffrey_hinton/clyj4jv/
PROBLEMS OF CONVNET 6
▸
WHAT IS THIS PICTURE?
https://hackernoon.com/capsule-networks-are-shaking-up-ai-heres-how-to-use-them-c233a0971952
PROBLEMS OF CONVNET 7
▸
HOW ABOUT THIS?
https://hackernoon.com/capsule-networks-are-shaking-up-ai-heres-how-to-use-them-c233a0971952
PROBLEMS OF CONVNET 8
▸
NEED EQUIVARIANCE, NOT INVARIANCE
https://hackernoon.com/capsule-networks-are-shaking-up-ai-heres-how-to-use-them-c233a0971952
HOW BRAIN WORKS, INVERSE GRAPHICS 9
▸ Constructing a visual image from some internal hierarchical representation of
geometric data
▸ The internal representation is stored in the computer’s memory as arrays of geometrical
objects and matrices that represent the relative positions and orientations of these
objects
▸ Special software takes that representation and converts it into an image on the screen.
This is called rendering
▸ Brains, in fact, do the opposite of rendering. Hinton calls it inverse graphics: from the
visual information received by the eyes, the brain deconstructs a hierarchical representation
of the world around us and tries to match it with already learned patterns and relationships
stored in the brain
▸ Key idea: the representation of objects in the brain does not depend on the viewing angle
COMPUTER GRAPHICS
https://medium.com/@pechyonkin/understanding-hintons-capsule-networks-part-i-intuition-b4b559d1159b
CAPSULE THEORY 10
▸ In 3D graphics, relationships between 3D objects can be represented by a so-called
pose, which is in essence translation plus rotation
▸ Capsule approach: it incorporates relative relationships between objects (the internal
representation), represented numerically as a 4x4 pose matrix
▸ ‘Dynamic Routing’ (more details later) allows capsules to communicate with each
other and create representations similar to scene graphs in computer graphics
https://medium.com/@pechyonkin/understanding-hintons-capsule-networks-part-i-intuition-b4b559d1159b
YOU CAN EASILY RECOGNIZE THAT THIS IS THE STATUE OF LIBERTY,
EVEN THOUGH ALL THE IMAGES SHOW IT FROM DIFFERENT ANGLES
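A minimal sketch of the pose idea, assuming the common 4x4 homogeneous-matrix convention from computer graphics (rotation plus translation in one matrix). The scene (a face and a nose) and the helper names `pose`, `matmul` and `position` are hypothetical, chosen only for illustration. Composing the pose of the whole with the relative pose of a part gives the pose of the part in the world, which is the kind of part-whole relationship a scene graph stores.

```python
import math

def pose(angle_z, tx, ty, tz):
    """4x4 homogeneous pose: rotation about the z-axis plus a translation."""
    c, s = math.cos(angle_z), math.sin(angle_z)
    return [
        [c,   -s,  0.0, tx],
        [s,    c,  0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def position(p):
    """Translation part of a pose matrix."""
    return [p[0][3], p[1][3], p[2][3]]

# Hypothetical scene: a face rotated by 90 degrees and shifted along x,
# and a nose placed one unit along the face's own x-axis.
face_in_world = pose(math.pi / 2, 10.0, 0.0, 0.0)
nose_in_face = pose(0.0, 1.0, 0.0, 0.0)

# The relative pose (nose in face) does not change with the viewpoint;
# only the pose of the whole (face in world) does.
nose_in_world = matmul(face_in_world, nose_in_face)
print(position(nose_in_world))
```

Because the face is rotated by 90 degrees, the nose ends up offset along the world y-axis rather than the x-axis, while `nose_in_face` itself stays the same for every viewpoint.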
CAPSULE THEORY 11
▸ Benefits:
▸ Better understanding of 3D space
▸ Potential to achieve state-of-the-art performance using only a fraction of the data that
a CNN would use
▸ To learn to tell digits apart, the human brain needs only a couple of dozen examples,
hundreds at most, while CNNs need tens of thousands of examples
https://medium.com/@pechyonkin/understanding-hintons-capsule-networks-part-i-intuition-b4b559d1159b
CAPSNET
▸ Capsule
▸ CapsNet architecture
▸ Experiment
12
CAPSULE 13
▸ Comparison with traditional neuron
https://www.zhihu.com/question/67287444/answer/251460831
V
VEC LENGTH WORKS LIKE PROBABILITY
ACTIVATION OF NEXT CAPSULE
DYNAMIC ROUTING
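The "vector length works like probability" idea can be sketched with the squashing non-linearity from the paper, v = (|s|^2 / (1 + |s|^2)) * (s / |s|). This is a plain-Python illustration, not the referenced implementation; the `eps` term is an added assumption to avoid dividing by zero.

```python
import math

def squash(s, eps=1e-9):
    """Shrink vector s so its length lies in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    norm = math.sqrt(norm_sq)
    scale = norm_sq / (1.0 + norm_sq) / (norm + eps)
    return [scale * x for x in s]

def length(v):
    return math.sqrt(sum(x * x for x in v))

# A short input vector is squashed towards length 0 (entity probably absent),
# a long one towards length 1 (entity probably present).
short = squash([0.1, 0.0, 0.0])
long_ = squash([10.0, 0.0, 0.0])
print(length(short), length(long_))
```

A scalar neuron outputs one activation; a capsule outputs a whole vector whose direction encodes the properties and whose squashed length encodes how likely the entity is to be present.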
CAPSNET ARCHITECTURE 14
ARCHITECTURE
Sara Sabour, Nicholas Frosst, Geoffrey E Hinton, 10, 2017, Arxiv. Dynamic Routing Between Capsules
MNIST → CONV → CAPS.CONV → CAPS.FC (DYNAMIC ROUTING)
▸ CONV: LOCAL FEATURE DETECTION
▸ CAPS.CONV: 6*6*32=1152 CAPSULES, EACH HAS 8 PROPERTIES
▸ CAPS.FC: 10 CAPSULES (CLASS), EACH HAS 16 PROPERTIES
DEEPER MEANS MORE COMPLEX, DIMENSION SHOULD INCREASE
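The 6*6*32=1152 figure can be checked with a little shape arithmetic. The kernel sizes and strides below (9x9 stride 1 for the first conv layer, 9x9 stride 2 for the capsule conv layer, no padding) are taken from the paper rather than from these slides, and `conv_out` is a hypothetical helper name.

```python
def conv_out(size, kernel, stride):
    """Output width of a convolution without padding."""
    return (size - kernel) // stride + 1

image = 28                                       # MNIST images are 28x28
conv1 = conv_out(image, kernel=9, stride=1)      # first conv layer
primary = conv_out(conv1, kernel=9, stride=2)    # capsule conv layer (PrimaryCaps)

caps_channels, caps_dim = 32, 8
num_primary_caps = primary * primary * caps_channels

print(conv1, primary, num_primary_caps)  # 20 6 1152
```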
CAPSNET ARCHITECTURE 15
▸ naturomics github
CAPSNET-TENSORFLOW
[Figure: MNIST → CONV → CAPS.CONV (X 32, X 8) → CAPS.FC, DYNAMIC ROUTING]
https://github.com/naturomics/CapsNet-Tensorflow
CAPSNET ARCHITECTURE 16
▸ Place-coded Capsule
▸ Concatenate (=8 different regular conv layers)
▸ Consider each feature map as capsule (6*6*32=1152 capsules with 8
properties)
CAPS.CONV, PRIMARYCAPS
[Figure: MNIST → CAPS.CONV (X 32, X 8)]
https://github.com/naturomics/CapsNet-Tensorflow
DIRECTION
CAPSNET ARCHITECTURE 17
▸ Place-coded Capsule
▸ Concatenate (=8 different regular conv layers)
▸ Consider each feature map as capsule (6*6*32=1152 capsules with 8
properties)
▸ Apply the squashing function at the end
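The steps on this slide can be sketched as follows: take the 6x6 output of the capsule conv layer, group its channels into 32 capsule channels of 8 values each, treat every group at every position as one capsule, and squash it. The feature values are placeholders, and the channel grouping order is an assumption for illustration; the referenced implementation may lay out its tensors differently.

```python
import math

def squash(s, eps=1e-9):
    """Squashing non-linearity: keeps direction, maps length into [0, 1)."""
    norm_sq = sum(x * x for x in s)
    scale = norm_sq / (1.0 + norm_sq) / (math.sqrt(norm_sq) + eps)
    return [scale * x for x in s]

H = W = 6            # spatial size of the capsule conv output
CAPS_CHANNELS = 32   # capsule channels
CAPS_DIM = 8         # properties per capsule

# Placeholder feature tensor features[y][x][c] with 32 * 8 = 256 channels,
# standing in for the concatenated output of the 8 regular conv layers.
features = [[[0.01 * ((y + x + c) % 7) for c in range(CAPS_CHANNELS * CAPS_DIM)]
             for x in range(W)] for y in range(H)]

primary_caps = []
for y in range(H):
    for x in range(W):
        for k in range(CAPS_CHANNELS):
            # Assumed grouping: 8 consecutive channels form one capsule.
            raw = features[y][x][k * CAPS_DIM:(k + 1) * CAPS_DIM]
            primary_caps.append(squash(raw))

print(len(primary_caps), len(primary_caps[0]))  # 1152 8
```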
CAPS.CONV, PRIMARYCAPS
[Figure: MNIST → CAPS.CONV (X 32, X 8)]
https://github.com/naturomics/CapsNet-Tensorflow
CAPSNET ARCHITECTURE 18
▸ Rate-coded capsules
▸ caps: 1152 → 10
▸ vec-len: 8 → 16
▸ Dynamic Routing
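How the vector length goes from 8 to 16 can be sketched as a matrix-vector product: in the paper, each pair of a lower capsule i and a higher capsule j has its own weight matrix W_ij, and the prediction vector is u_hat = W_ij u_i. The random values below are placeholders, not trained weights, and `matvec` is a hypothetical helper.

```python
import random

IN_DIM, OUT_DIM = 8, 16      # vec-len: 8 -> 16
NUM_IN, NUM_OUT = 1152, 10   # caps: 1152 -> 10

def matvec(w, u):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(row[k] * u[k] for k in range(len(u))) for row in w]

rng = random.Random(0)
u_i = [rng.uniform(-1.0, 1.0) for _ in range(IN_DIM)]   # one primary capsule
W_ij = [[rng.gauss(0.0, 0.1) for _ in range(IN_DIM)]    # one 16x8 transform
        for _ in range(OUT_DIM)]

u_hat = matvec(W_ij, u_i)    # prediction of capsule i for digit capsule j
print(len(u_hat))            # 16

# One such matrix per (i, j) pair gives the layer's weight count.
num_weights = NUM_IN * NUM_OUT * IN_DIM * OUT_DIM
print(num_weights)           # 1474560
```

These prediction vectors are the inputs to dynamic routing, which decides how strongly each of them contributes to each digit capsule.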
CAPS.FC, DIGITCAPS
https://github.com/naturomics/CapsNet-Tensorflow
[Figure: MNIST (X 32, X 8) → CAPS.FC, DYNAMIC ROUTING]
CAPSNET ARCHITECTURE 19
▸ Dynamic Routing
▸ Top-down feedback
▸ Routing by agreement
▸ Works like attention
CAPS.FC, DIGITCAPS
https://github.com/naturomics/CapsNet-Tensorflow
IF MULTIPLE PREDICTIONS AGREE, HIGHER LEVEL CAPSULE BECOMES ACTIVE
VEC LENGTH WORKS LIKE PROBABILITY
ACTIVATION OF NEXT CAPSULE
COUPLING COEFFICIENTS
TOPDOWN FEEDBACK: IF RELATION EXISTS COUPLING COEFFICIENTS INCREASE
AGREEMENT
CAPSNET ARCHITECTURE 20
▸ Dynamic Routing
CAPS.FC, DIGITCAPS
https://github.com/naturomics/CapsNet-Tensorflow
[Figure: MNIST (X 32, X 8) → CAPS.FC, DYNAMIC ROUTING]
3 ITERATIONS WILL DO
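Routing by agreement can be sketched in plain Python as below. It follows the routing procedure described in the paper (softmax over routing logits, weighted sum, squash, agreement update), but on a toy problem with 3 lower capsules, 2 higher capsules and 2-D vectors. The prediction vectors are made up for illustration, and `route` is a hypothetical name.

```python
import math

def squash(s, eps=1e-9):
    norm_sq = sum(x * x for x in s)
    scale = norm_sq / (1.0 + norm_sq) / (math.sqrt(norm_sq) + eps)
    return [scale * x for x in s]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def route(u_hat, iterations=3):
    """u_hat[i][j] is lower capsule i's prediction vector for higher capsule j."""
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]          # routing logits
    for _ in range(iterations):
        c = [softmax(row) for row in b]               # coupling coefficients
        v = []
        for j in range(n_out):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                   for d in range(dim)]
            v.append(squash(s_j))                     # higher capsule output
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += dot(u_hat[i][j], v[j])     # agreement raises the logit
    return v, c

# Toy predictions: all three lower capsules agree about higher capsule 0,
# but disagree about higher capsule 1.
u_hat = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[1.0, 0.0], [-1.0, 0.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
v, c = route(u_hat, iterations=3)
lengths = [math.sqrt(dot(vj, vj)) for vj in v]
print(lengths, c)
```

In this toy run the capsule whose incoming predictions agree ends up with the longer output vector, and every lower capsule shifts its coupling coefficient towards it, which is the top-down feedback named on the previous slide.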
EXPERIMENT
▸ Classification on MNIST
▸ Reconstruction on MNIST
▸ Dimension perturbation on MNIST
21
EXPERIMENT 22
▸ Only the first three are introduced here
▸ Classification on MNIST (99.75%, conv 99.61%)
▸ Reconstruction on MNIST
▸ Dimension Perturbation on MNIST
▸ Robustness to Affine Transformation on MNIST (79%, conv 66%)
▸ Classification on MultiMNIST (5% error)
▸ Classification on CIFAR 10 (10.6% error - ZFNet)
▸ Classification on SVHN (4.3% error)
Sara Sabour, Nicholas Frosst, Geoffrey E Hinton, 10, 2017, Arxiv. Dynamic Routing Between Capsules
EXPERIMENT 23
▸ 99.75% (baseline 99.61%)
1. CLASSIFICATION ON MNIST
Sara Sabour, Nicholas Frosst, Geoffrey E Hinton, 10, 2017, Arxiv. Dynamic Routing Between Capsules
EXPERIMENT 24
▸
2. RECONSTRUCTION ON MNIST
Sara Sabour, Nicholas Frosst, Geoffrey E Hinton, 10, 2017, Arxiv. Dynamic Routing Between Capsules
EXPERIMENT 25
▸
3. DIMENSION PERTURBATION ON MNIST
Sara Sabour, Nicholas Frosst, Geoffrey E Hinton, 10, 2017, Arxiv. Dynamic Routing Between Capsules
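A minimal sketch of how the perturbed inputs for this experiment could be generated. The range [-0.25, 0.25] and the step 0.05 come from the paper, not from this slide. The digit capsule values are placeholders, and `perturbations` is a hypothetical helper. The decoder that turns each perturbed vector back into an image is omitted.

```python
def perturbations(capsule, dim, low=-0.25, high=0.25, step=0.05):
    """Copies of a capsule vector with one dimension shifted over a range."""
    n_steps = round((high - low) / step)
    variants = []
    for k in range(n_steps + 1):
        delta = low + k * step
        tweaked = list(capsule)
        tweaked[dim] += delta
        variants.append(tweaked)
    return variants

digit_capsule = [0.1] * 16          # placeholder 16-D DigitCaps output
variants = perturbations(digit_capsule, dim=3)
print(len(variants))                # 11
```

Feeding each variant to the decoder and comparing the reconstructions shows which property of the digit (for example stroke thickness or skew) that single dimension has learned to represent.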
DISCUSSION
26
NOVELTY 27
▸ Capsule (vector),
▸ not conventional neuron (scalar)
STILL USE CONV LAYER 28
▸ A regular conv layer is still used first for local feature extraction
▸ Can capsules not extract local features themselves?
HOW TO RESTRICT TO GET CERTAIN FEATURE?
▸ Disentangling features
▸ How to obtain ‘certain features’?
ANY Q?
29
REFERENCE
▸ Sara Sabour, Nicholas Frosst, Geoffrey E Hinton, 10, 2017, Arxiv. Dynamic Routing Between Capsules (https://arxiv.org/abs/1710.09829)
▸ Geoffrey Hinton et al., Matrix Capsules With EM Routing, Under review as a conference paper at ICLR 2018 (https://openreview.net/pdf?id=HJWLfGWRb)
▸ https://medium.com/@pechyonkin/understanding-hintons-capsule-networks-part-i-intuition-b4b559d1159b
▸ https://hackernoon.com/what-is-a-capsnet-or-capsule-network-2bfbe48769cc
▸ https://hackernoon.com/capsule-networks-are-shaking-up-ai-heres-how-to-use-them-c233a0971952
▸ https://github.com/naturomics/CapsNet-Tensorflow
▸ https://www.zhihu.com/question/67287444/answer/251460831
▸ https://www.reddit.com/r/MachineLearning/comments/2lmo0l/ama_geoffrey_hinton/clyj4jv/
▸ Geoffrey Hinton: "Does the Brain do Inverse Graphics?" (https://www.youtube.com/watch?v=TFIMqt0yT2I&feature=youtu.be)
▸ Geoffrey Hinton talk "What is wrong with convolutional neural nets?" (https://www.youtube.com/watch?v=rTawFwUvnLE&t=1214s)
▸ https://www.youtube.com/watch?v=u50nqWMQe1k
30
END OF
DOCUMENT
31
