Susang Kim(healess1@gmail.com)
Capsule Network
Dynamic Routing Between Capsules
Dynamic Routing Between Capsules
Written by Professor Hinton and researchers at Google Brain
(presented at NIPS 2017)
Points out the limitations of CNNs and proposes a new approach
to feature extraction based on capsules
(Pose, Speed, Light, Angle, ...)
A capsule is a group of neurons whose activity vector
represents the instantiation parameters of a specific type
of entity such as an object or an object part.
https://jayhey.github.io/deep%20learning/2017/11/28/CapsNet_1/
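The definition above can be made concrete with a small sketch. In the paper, the length of a capsule's activity vector is read as the probability that the entity is present, and its direction encodes the instantiation parameters. The function name `capsule_readout` and the 4-D example vector are illustrative assumptions, not taken from the slides.

```python
import math


def capsule_readout(activity):
    """Split a capsule's activity vector into (length, unit direction)."""
    length = math.sqrt(sum(a * a for a in activity))
    if length == 0.0:
        return 0.0, [0.0] * len(activity)
    return length, [a / length for a in activity]


# Hypothetical 4-D capsule output (already squashed, so its length is below 1).
activity = [0.3, -0.4, 0.0, 0.5]
presence, pose = capsule_readout(activity)

print("presence (vector length):", round(presence, 4))
print("pose (unit direction):", [round(p, 4) for p in pose])
```

A scalar neuron would carry only the first number; the capsule keeps the direction as well.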
Convolutional neural networks
A CNN extracts feature maps and shrinks their spatial size through pooling
(subsampling), so layers must be stacked deep to see a wide area -> location information is lost
Convolutional neural networks (CNNs) use translated replicas of
learned feature detectors. This allows them to translate knowledge
about good weight values acquired at one position in an image to
other positions. This has proven extremely helpful in image
interpretation.
https://becominghuman.ai/understanding-capsnet-part-1-e274943a018d
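A minimal sketch of how pooling discards location, assuming 2x2 max pooling with stride 2 on a toy 4x4 image. The helper `max_pool_2x2` and both images are made up for illustration.

```python
def max_pool_2x2(image):
    """2x2 max pooling with stride 2 on a list-of-lists image (even height and width)."""
    pooled = []
    for r in range(0, len(image), 2):
        row = []
        for c in range(0, len(image[0]), 2):
            row.append(max(image[r][c], image[r][c + 1],
                           image[r + 1][c], image[r + 1][c + 1]))
        pooled.append(row)
    return pooled


# The same feature (value 9) at two different positions inside one pooling window.
image_a = [[9, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]
image_b = [[0, 0, 0, 0],
           [0, 9, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]

pooled_a = max_pool_2x2(image_a)
pooled_b = max_pool_2x2(image_b)

# Both inputs give the same pooled map: the position inside the window is gone.
print(pooled_a, pooled_b, pooled_a == pooled_b)
```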
Geoffrey Hinton says
http://helper.ipam.ucla.edu/publications/gss2012/gss2012_10754.pdf
Translational Invariance
Objects are still recognized when their position, angle, or lighting in the image changes
Invariance means that you can recognize an object as
an object, even when its appearance varies in some
way. This is generally a good thing, because it
preserves the object's identity, category, (etc) across
changes in the specifics of the visual input, like relative
positions of the viewer/camera and the object.
https://stats.stackexchange.com/questions/208936/what-is-translation-invariance-in-computer-vision-and-convolutional-neural-netwo
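As a rough illustration of translation invariance, the sketch below slides one feature detector over a 1-D signal and over a shifted copy. The feature map moves with the pattern, while the global maximum stays the same. The kernel, the signals, and the helper `correlate_valid` are illustrative assumptions.

```python
def correlate_valid(signal, kernel):
    """Slide the kernel over the signal (no padding) and return the responses."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


kernel = [1, 2, 1]                    # a replicated feature detector
signal = [0, 1, 2, 1, 0, 0, 0, 0]     # pattern near the left
shifted = [0, 0, 0, 0, 1, 2, 1, 0]    # same pattern moved 3 steps right

resp_a = correlate_valid(signal, kernel)
resp_b = correlate_valid(shifted, kernel)

# The peak response moves with the pattern ...
print(resp_a.index(max(resp_a)), resp_b.index(max(resp_b)))
# ... but a global max over the feature map is unchanged by the shift.
print(max(resp_a), max(resp_b))
```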
Capsule Network
Capsule Network(Dynamic Routing/Squash)
https://www.slideshare.net/thinkingfactory/pr12-capsule-networks-jaejun-yoo
Capsule Network(Dynamic Routing)
import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.variable_scope, keep_dims=, dim=)

from config import cfg  # hyperparameters (batch_size, iter_routing, ...); assumed to live in the repo's config.py


def capsule(input, b_IJ, idx_j):
    """Route the 1152 PrimaryCaps outputs to the single DigitCaps capsule idx_j."""
    with tf.variable_scope('routing'):
        w_initializer = np.random.normal(size=[1, 1152, 8, 16], scale=0.01)
        W_Ij = tf.Variable(w_initializer, dtype=tf.float32)
        W_Ij = tf.tile(W_Ij, [cfg.batch_size, 1, 1, 1])

        # calc u_hat
        # [8, 16].T x [8, 1] => [16, 1] => [batch_size, 1152, 16, 1]
        u_hat = tf.matmul(W_Ij, input, transpose_a=True)
        assert u_hat.get_shape() == [cfg.batch_size, 1152, 16, 1]

        shape = b_IJ.get_shape().as_list()
        size_splits = [idx_j, 1, shape[2] - idx_j - 1]
        for r_iter in range(cfg.iter_routing):
            # coupling coefficients: softmax over the 10 output capsules
            c_IJ = tf.nn.softmax(b_IJ, dim=2)
            assert c_IJ.get_shape() == [1, 1152, 10, 1]

            # line 5:
            # weighting u_hat with c_I in the third dim,
            # then sum in the second dim, resulting in [batch_size, 1, 16, 1]
            b_Il, b_Ij, b_Ir = tf.split(b_IJ, size_splits, axis=2)
            c_Il, c_Ij, c_Ir = tf.split(c_IJ, size_splits, axis=2)
            assert c_Ij.get_shape() == [1, 1152, 1, 1]

            s_j = tf.reduce_sum(tf.multiply(c_Ij, u_hat), axis=1, keep_dims=True)
            assert s_j.get_shape() == [cfg.batch_size, 1, 16, 1]

            # line 6:
            # squash using Eq.1, resulting in [batch_size, 1, 16, 1]
            v_j = squash(s_j)
            assert v_j.get_shape() == [cfg.batch_size, 1, 16, 1]

            # line 7:
            # tile v_j from [batch_size, 1, 16, 1] to [batch_size, 1152, 16, 1]
            # [16, 1].T x [16, 1] => [1, 1], then sum over the
            # batch_size dim, resulting in [1, 1152, 1, 1]
            v_j_tiled = tf.tile(v_j, [1, 1152, 1, 1])
            u_produce_v = tf.matmul(u_hat, v_j_tiled, transpose_a=True)
            assert u_produce_v.get_shape() == [cfg.batch_size, 1152, 1, 1]
            b_Ij += tf.reduce_sum(u_produce_v, axis=0, keep_dims=True)
            b_IJ = tf.concat([b_Il, b_Ij, b_Ir], axis=2)

        return v_j, b_IJ
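The routing loop above can be hard to follow through the tensor shapes, so here is a framework-free sketch of the same procedure (softmax over the routing logits, weighted sum, squash, agreement update) on a toy problem. The sizes (3 lower capsules, 2 upper capsules, 2-D vectors), the prediction vectors, and the function names are illustrative assumptions. The sketch handles a single example and all upper capsules at once, unlike the per-capsule TensorFlow code.

```python
import math


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def squash(s):
    sq = sum(x * x for x in s)
    if sq == 0.0:
        return [0.0] * len(s)
    scale = sq / (1.0 + sq) / math.sqrt(sq)
    return [scale * x for x in s]


def route(u_hat, num_iters=3):
    """u_hat[i][j] is the prediction of lower capsule i for upper capsule j."""
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]       # routing logits start at zero
    for _ in range(num_iters):
        c = [softmax(row) for row in b]            # coupling coefficients per lower capsule
        v = []
        for j in range(n_out):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        for i in range(n_in):                      # agreement = dot(u_hat, v)
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v, c                                    # c is from the last iteration


# Lower capsules agree on upper capsule 0 and disagree on upper capsule 1.
u_hat = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[1.0, 0.1], [-1.0, 0.0]],
    [[0.9, 0.0], [0.0, 1.0]],
]
v, c = route(u_hat)
lengths = [math.sqrt(sum(x * x for x in vj)) for vj in v]
print("output lengths:", [round(l, 3) for l in lengths])
print("coupling to capsule 0:", [round(row[0], 3) for row in c])
```

In this toy case the coupling coefficients shift toward upper capsule 0, where the predictions agree, and its output vector ends up longer than that of capsule 1.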
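def squash(vector):
    """Squash each capsule vector with Eq. 1 so that its length lies in [0, 1)."""
    # one squared norm per capsule vector; the capsule dimension is axis -2
    vec_squared_norm = tf.reduce_sum(tf.square(vector), axis=-2, keep_dims=True)
    vec_abs = tf.sqrt(vec_squared_norm + 1e-9)  # small epsilon avoids division by zero
    scalar_factor = vec_squared_norm / (1 + vec_squared_norm)
    vec_squashed = scalar_factor * tf.divide(vector, vec_abs)  # element-wise
    return vec_squashed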
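To see what the squash nonlinearity does numerically, the following plain-Python sketch applies Eq. 1 of the paper, v = (|s|^2 / (1 + |s|^2)) * (s / |s|), to a short and a long vector. The input vectors and helper names are illustrative assumptions.

```python
import math


def squash(s):
    """Eq. 1: keep the direction of s, map its length into [0, 1)."""
    sq = sum(x * x for x in s)
    if sq == 0.0:
        return [0.0] * len(s)
    norm = math.sqrt(sq)
    return [(sq / (1.0 + sq)) * (x / norm) for x in s]


def length(v):
    return math.sqrt(sum(x * x for x in v))


short_vec = squash([0.1, 0.0])     # input length 0.1
long_vec = squash([30.0, 40.0])    # input length 50

print("short vector length after squash:", length(short_vec))
print("long vector length after squash:", length(long_vec))
```

Short vectors shrink to almost zero and long vectors approach length 1, which is what lets the length act as a probability.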
# Body of the capsule layer's call method (fragment: `self` and `input`
# come from the enclosing method, which the slide does not show)
if not self.with_routing:
    # the PrimaryCaps layer
    # input: [batch_size, 20, 20, 256]
    assert input.get_shape() == [cfg.batch_size, 20, 20, 256]

    capsules = []
    for i in range(self.num_units):
        # each capsule i: [batch_size, 6, 6, 32]
        with tf.variable_scope('ConvUnit_' + str(i)):
            caps_i = tf.contrib.layers.conv2d(input,
                                              self.num_outputs,
                                              self.kernel_size,
                                              self.stride,
                                              padding="VALID")
            caps_i = tf.reshape(caps_i, shape=(cfg.batch_size, -1, 1, 1))
            capsules.append(caps_i)

    assert capsules[0].get_shape() == [cfg.batch_size, 1152, 1, 1]

    # [batch_size, 1152, 8, 1]
    capsules = tf.concat(capsules, axis=2)
    capsules = squash(capsules)
    assert capsules.get_shape() == [cfg.batch_size, 1152, 8, 1]

else:
    # the DigitCaps layer
    # Reshape the input into shape [batch_size, 1152, 8, 1]
    self.input = tf.reshape(input, shape=(cfg.batch_size, 1152, 8, 1))

    # b_IJ: [1, num_caps_l, num_caps_l_plus_1, 1]
    b_IJ = tf.zeros(shape=[1, 1152, 10, 1], dtype=np.float32)
    capsules = []
    for j in range(self.num_outputs):
        with tf.variable_scope('caps_' + str(j)):
            # pass the reshaped input so capsule() sees [batch_size, 1152, 8, 1]
            caps_j, b_IJ = capsule(self.input, b_IJ, j)
            capsules.append(caps_j)

    # Return a tensor with shape [batch_size, 10, 16, 1]
    capsules = tf.concat(capsules, axis=1)
    assert capsules.get_shape() == [cfg.batch_size, 10, 16, 1]

return capsules
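The shape assertions in the layer code (20x20 input, 6x6x32 outputs, 1152 capsules) follow from simple arithmetic, sketched below. The 28x28 input, kernel size 9, and strides 1 and 2 are the MNIST settings from the paper and are assumptions here, since the slide only shows `self.kernel_size` and `self.stride`. The helper name is also made up.

```python
def conv_output_size(input_size, kernel_size, stride):
    """Spatial output size of a VALID (no padding) convolution."""
    return (input_size - kernel_size) // stride + 1


# Conv1: 28x28 MNIST image, 9x9 kernel, stride 1 (paper settings, assumed)
conv1_size = conv_output_size(28, kernel_size=9, stride=1)

# PrimaryCaps: 9x9 kernel, stride 2 on the 20x20x256 feature map
grid = conv_output_size(conv1_size, kernel_size=9, stride=2)

channels = 32                                   # capsule channels per grid position
num_primary_caps = grid * grid * channels       # number of 8-D primary capsules

# One [8, 16] transformation matrix per (primary capsule, digit capsule) pair
num_digit_caps, in_dim, out_dim = 10, 8, 16
num_routing_weights = num_primary_caps * num_digit_caps * in_dim * out_dim

print(conv1_size, grid, num_primary_caps, num_routing_weights)
```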
Loss Function
https://www.slideshare.net/aureliengeron/introduction-to-capsule-networks-capsnets
Capsule Network(Reconstruction)
Codes
def loss(self):
    # 1. The margin loss

    # [batch_size, 10, 1, 1]
    # max_l = max(0, m_plus-||v_c||)^2
    max_l = tf.square(tf.maximum(0., cfg.m_plus - self.v_length))
    # max_r = max(0, ||v_c||-m_minus)^2
    max_r = tf.square(tf.maximum(0., self.v_length - cfg.m_minus))
    assert max_l.get_shape() == [cfg.batch_size, self.num_label, 1, 1]

    # reshape: [batch_size, 10, 1, 1] => [batch_size, 10]
    max_l = tf.reshape(max_l, shape=(cfg.batch_size, -1))
    max_r = tf.reshape(max_r, shape=(cfg.batch_size, -1))

    # calc T_c: [batch_size, 10]
    # T_c is taken to be the one-hot label Y
    T_c = self.Y
    # [batch_size, 10], element-wise multiply
    L_c = T_c * max_l + cfg.lambda_val * (1 - T_c) * max_r

    self.margin_loss = tf.reduce_mean(tf.reduce_sum(L_c, axis=1))

    # 2. The reconstruction loss
    origin = tf.reshape(self.X, shape=(cfg.batch_size, -1))
    squared = tf.square(self.decoded - origin)
    self.reconstruction_err = tf.reduce_mean(squared)

    # 3. Total loss
    # The paper uses sum of squared error as reconstruction error, but we
    # have used reduce_mean in `# 2 The reconstruction loss` to calculate
    # mean squared error. In order to keep in line with the paper, the
    # regularization scale should be 0.0005*784=0.392
    self.total_loss = (self.margin_loss
                       + cfg.regularization_scale * self.reconstruction_err)
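A worked example of the margin loss for one sample may help, written in plain Python. The values m_plus = 0.9, m_minus = 0.1, and lambda = 0.5 are the ones given in the paper and are assumed here, since the slide reads them from `cfg`. The capsule lengths and the per-pixel error are made-up numbers. The second part checks the 0.0005 * 784 = 0.392 scaling mentioned in the code comment.

```python
def margin_loss(lengths, target, m_plus=0.9, m_minus=0.1, lambda_val=0.5):
    """Margin loss for one sample: lengths[c] is ||v_c||, target is the true class."""
    total = 0.0
    for c, length in enumerate(lengths):
        t_c = 1.0 if c == target else 0.0
        present = max(0.0, m_plus - length) ** 2    # penalize a short true-class capsule
        absent = max(0.0, length - m_minus) ** 2    # penalize long wrong-class capsules
        total += t_c * present + lambda_val * (1.0 - t_c) * absent
    return total


lengths = [0.05, 0.95, 0.3]                         # hypothetical capsule lengths
loss_correct = margin_loss(lengths, target=1)       # long capsule matches the label
loss_wrong = margin_loss(lengths, target=2)         # long capsule contradicts the label
print(loss_correct, loss_wrong)

# Reconstruction scale: 0.0005 * (sum of squared errors) over 784 pixels
# equals 0.392 * (mean squared error).
errors = [0.1] * 784                                # hypothetical per-pixel errors
sse = sum(e * e for e in errors)
mse = sse / len(errors)
print(0.0005 * sse, 0.392 * mse)
```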
Capsule on MNIST
Capsule on Multi MNIST
Capsules can capture spatial information
Capsule vs Neuron
Capsule vs CNN
Pros and Cons
https://www.slideshare.net/aureliengeron/introduction-to-capsule-networks-capsnets
References
[Paper]
1. Dynamic Routing Between Capsules (2017.11) - NIPS 2017
[Blog]
https://jayhey.github.io/deep%20learning/2017/11/28/CapsNet_1/
https://www.slideshare.net/thinkingfactory/pr12-capsule-networks-jaejun-yoo
https://medium.com/ai%C2%B3-theory-practice-business/understanding-hintons-capsule-networks-part-iii-dynamic-routing-between-capsules-349f6d30418
http://helper.ipam.ucla.edu/publications/gss2012/gss2012_10754.pdf
https://towardsdatascience.com/capsule-networks-the-new-deep-learning-network-bd917e6818e8
http://blog.naver.com/PostView.nhn?blogId=sogangori&logNo=221129974140&redirect=Dlog&widgetTypeCall=true&directAccess=false
[Video]
http://jaejunyoo.blogspot.com/2018/02/pr12-video-56-capsule-network.html
[Code]
https://github.com/naturomics/CapsNet-Tensorflow
Thanks
Any Questions?
You can send mail to
Susang Kim(healess1@gmail.com)

More Related Content

What's hot

Paper id 71201927
Paper id 71201927Paper id 71201927
Paper id 71201927
IJRAT
 
Image sampling and quantization
Image sampling and quantizationImage sampling and quantization
Image sampling and quantization
BCET, Balasore
 
MLHEP 2015: Introductory Lecture #2
MLHEP 2015: Introductory Lecture #2MLHEP 2015: Introductory Lecture #2
MLHEP 2015: Introductory Lecture #2
arogozhnikov
 
Lesson 25: Evaluating Definite Integrals (Section 10 version)
Lesson 25: Evaluating Definite Integrals (Section 10 version)Lesson 25: Evaluating Definite Integrals (Section 10 version)
Lesson 25: Evaluating Definite Integrals (Section 10 version)
Matthew Leingang
 
A Generalization of Minimax Distribution
A Generalization of Minimax Distribution A Generalization of Minimax Distribution
A Generalization of Minimax Distribution
Premier Publishers
 
735
735735
Building Functional Islands
Building Functional IslandsBuilding Functional Islands
Building Functional Islands
Mark Jones
 
Design and optimization of parts of a suspension system
Design and optimization of parts of a suspension systemDesign and optimization of parts of a suspension system
Design and optimization of parts of a suspension system
Shih Cheng Tung
 
Accelerating Compression Time of the Standard JPEG by Employing the Quantized...
Accelerating Compression Time of the Standard JPEG by Employing the Quantized...Accelerating Compression Time of the Standard JPEG by Employing the Quantized...
Accelerating Compression Time of the Standard JPEG by Employing the Quantized...
IJECEIAES
 

What's hot (9)

Paper id 71201927
Paper id 71201927Paper id 71201927
Paper id 71201927
 
Image sampling and quantization
Image sampling and quantizationImage sampling and quantization
Image sampling and quantization
 
MLHEP 2015: Introductory Lecture #2
MLHEP 2015: Introductory Lecture #2MLHEP 2015: Introductory Lecture #2
MLHEP 2015: Introductory Lecture #2
 
Lesson 25: Evaluating Definite Integrals (Section 10 version)
Lesson 25: Evaluating Definite Integrals (Section 10 version)Lesson 25: Evaluating Definite Integrals (Section 10 version)
Lesson 25: Evaluating Definite Integrals (Section 10 version)
 
A Generalization of Minimax Distribution
A Generalization of Minimax Distribution A Generalization of Minimax Distribution
A Generalization of Minimax Distribution
 
735
735735
735
 
Building Functional Islands
Building Functional IslandsBuilding Functional Islands
Building Functional Islands
 
Design and optimization of parts of a suspension system
Design and optimization of parts of a suspension systemDesign and optimization of parts of a suspension system
Design and optimization of parts of a suspension system
 
Accelerating Compression Time of the Standard JPEG by Employing the Quantized...
Accelerating Compression Time of the Standard JPEG by Employing the Quantized...Accelerating Compression Time of the Standard JPEG by Employing the Quantized...
Accelerating Compression Time of the Standard JPEG by Employing the Quantized...
 

Similar to [Paper] dynamic routing between capsules

[Paper] learning video representations from correspondence proposals
[Paper]  learning video representations from correspondence proposals[Paper]  learning video representations from correspondence proposals
[Paper] learning video representations from correspondence proposals
Susang Kim
 
Need help filling out the missing sections of this code- the sections.docx
Need help filling out the missing sections of this code- the sections.docxNeed help filling out the missing sections of this code- the sections.docx
Need help filling out the missing sections of this code- the sections.docx
lauracallander
 
Assignment 6.2a.pdf
Assignment 6.2a.pdfAssignment 6.2a.pdf
Assignment 6.2a.pdf
dash41
 
Time Series Analysis:Basic Stochastic Signal Recovery
Time Series Analysis:Basic Stochastic Signal RecoveryTime Series Analysis:Basic Stochastic Signal Recovery
Time Series Analysis:Basic Stochastic Signal Recovery
Daniel Cuneo
 
maXbox starter65 machinelearning3
maXbox starter65 machinelearning3maXbox starter65 machinelearning3
maXbox starter65 machinelearning3
Max Kleiner
 
Python Cheat Sheet Presentation Learning
Python Cheat Sheet Presentation LearningPython Cheat Sheet Presentation Learning
Python Cheat Sheet Presentation Learning
Naseer-ul-Hassan Rehman
 
maXbox starter68 machine learning VI
maXbox starter68 machine learning VImaXbox starter68 machine learning VI
maXbox starter68 machine learning VI
Max Kleiner
 
Xgboost
XgboostXgboost
What is TensorFlow and why do we use it
What is TensorFlow and why do we use itWhat is TensorFlow and why do we use it
What is TensorFlow and why do we use it
Robert John
 
Operations management chapter 03 homework assignment use this
Operations management chapter 03 homework assignment use thisOperations management chapter 03 homework assignment use this
Operations management chapter 03 homework assignment use this
POLY33
 
Deep cv 101
Deep cv 101Deep cv 101
Deep cv 101
Xiaohu ZHU
 
밑바닥부터 시작하는 의료 AI
밑바닥부터 시작하는 의료 AI밑바닥부터 시작하는 의료 AI
밑바닥부터 시작하는 의료 AI
NAVER Engineering
 
CNN_INTRO.pptx
CNN_INTRO.pptxCNN_INTRO.pptx
CNN_INTRO.pptx
NiharikaThakur32
 

Similar to [Paper] dynamic routing between capsules (13)

[Paper] learning video representations from correspondence proposals
[Paper]  learning video representations from correspondence proposals[Paper]  learning video representations from correspondence proposals
[Paper] learning video representations from correspondence proposals
 
Need help filling out the missing sections of this code- the sections.docx
Need help filling out the missing sections of this code- the sections.docxNeed help filling out the missing sections of this code- the sections.docx
Need help filling out the missing sections of this code- the sections.docx
 
Assignment 6.2a.pdf
Assignment 6.2a.pdfAssignment 6.2a.pdf
Assignment 6.2a.pdf
 
Time Series Analysis:Basic Stochastic Signal Recovery
Time Series Analysis:Basic Stochastic Signal RecoveryTime Series Analysis:Basic Stochastic Signal Recovery
Time Series Analysis:Basic Stochastic Signal Recovery
 
maXbox starter65 machinelearning3
maXbox starter65 machinelearning3maXbox starter65 machinelearning3
maXbox starter65 machinelearning3
 
Python Cheat Sheet Presentation Learning
Python Cheat Sheet Presentation LearningPython Cheat Sheet Presentation Learning
Python Cheat Sheet Presentation Learning
 
maXbox starter68 machine learning VI
maXbox starter68 machine learning VImaXbox starter68 machine learning VI
maXbox starter68 machine learning VI
 
Xgboost
XgboostXgboost
Xgboost
 
What is TensorFlow and why do we use it
What is TensorFlow and why do we use itWhat is TensorFlow and why do we use it
What is TensorFlow and why do we use it
 
Operations management chapter 03 homework assignment use this
Operations management chapter 03 homework assignment use thisOperations management chapter 03 homework assignment use this
Operations management chapter 03 homework assignment use this
 
Deep cv 101
Deep cv 101Deep cv 101
Deep cv 101
 
밑바닥부터 시작하는 의료 AI
밑바닥부터 시작하는 의료 AI밑바닥부터 시작하는 의료 AI
밑바닥부터 시작하는 의료 AI
 
CNN_INTRO.pptx
CNN_INTRO.pptxCNN_INTRO.pptx
CNN_INTRO.pptx
 

More from Susang Kim

[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...
[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...
[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...
Susang Kim
 
[Paper] Multiscale Vision Transformers(MVit)
[Paper] Multiscale Vision Transformers(MVit)[Paper] Multiscale Vision Transformers(MVit)
[Paper] Multiscale Vision Transformers(MVit)
Susang Kim
 
[Paper] anti spoofing for face recognition
[Paper] anti spoofing for face recognition[Paper] anti spoofing for face recognition
[Paper] anti spoofing for face recognition
Susang Kim
 
[Paper] attention mechanism(luong)
[Paper] attention mechanism(luong)[Paper] attention mechanism(luong)
[Paper] attention mechanism(luong)
Susang Kim
 
[Paper] shuffle net an extremely efficient convolutional neural network for ...
[Paper] shuffle net  an extremely efficient convolutional neural network for ...[Paper] shuffle net  an extremely efficient convolutional neural network for ...
[Paper] shuffle net an extremely efficient convolutional neural network for ...
Susang Kim
 
[Paper] EDA : easy data augmentation techniques for boosting performance on t...
[Paper] EDA : easy data augmentation techniques for boosting performance on t...[Paper] EDA : easy data augmentation techniques for boosting performance on t...
[Paper] EDA : easy data augmentation techniques for boosting performance on t...
Susang Kim
 
[Paper] auto ml part 1
[Paper] auto ml part 1[Paper] auto ml part 1
[Paper] auto ml part 1
Susang Kim
 
[Paper] eXplainable ai(xai) in computer vision
[Paper] eXplainable ai(xai) in computer vision[Paper] eXplainable ai(xai) in computer vision
[Paper] eXplainable ai(xai) in computer vision
Susang Kim
 
[Paper] DetectoRS for Object Detection
[Paper] DetectoRS for Object Detection[Paper] DetectoRS for Object Detection
[Paper] DetectoRS for Object Detection
Susang Kim
 
Long term feature banks for detailed video understanding (Action Recognition)
Long term feature banks for detailed video understanding (Action Recognition)Long term feature banks for detailed video understanding (Action Recognition)
Long term feature banks for detailed video understanding (Action Recognition)
Susang Kim
 
I3D and Kinetics datasets (Action Recognition)
I3D and Kinetics datasets (Action Recognition)I3D and Kinetics datasets (Action Recognition)
I3D and Kinetics datasets (Action Recognition)
Susang Kim
 
GroupFace (Face Recognition)
GroupFace (Face Recognition)GroupFace (Face Recognition)
GroupFace (Face Recognition)
Susang Kim
 
제11회공개sw개발자대회 금상 TensorMSA(소개)
제11회공개sw개발자대회 금상 TensorMSA(소개)제11회공개sw개발자대회 금상 TensorMSA(소개)
제11회공개sw개발자대회 금상 TensorMSA(소개)
Susang Kim
 
Sk t academy lecture note
Sk t academy lecture noteSk t academy lecture note
Sk t academy lecture note
Susang Kim
 
Python과 Tensorflow를 활용한 AI Chatbot 개발 및 실무 적용
Python과 Tensorflow를 활용한  AI Chatbot 개발 및 실무 적용Python과 Tensorflow를 활용한  AI Chatbot 개발 및 실무 적용
Python과 Tensorflow를 활용한 AI Chatbot 개발 및 실무 적용
Susang Kim
 

More from Susang Kim (15)

[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...
[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...
[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...
 
[Paper] Multiscale Vision Transformers(MVit)
[Paper] Multiscale Vision Transformers(MVit)[Paper] Multiscale Vision Transformers(MVit)
[Paper] Multiscale Vision Transformers(MVit)
 
[Paper] anti spoofing for face recognition
[Paper] anti spoofing for face recognition[Paper] anti spoofing for face recognition
[Paper] anti spoofing for face recognition
 
[Paper] attention mechanism(luong)
[Paper] attention mechanism(luong)[Paper] attention mechanism(luong)
[Paper] attention mechanism(luong)
 
[Paper] shuffle net an extremely efficient convolutional neural network for ...
[Paper] shuffle net  an extremely efficient convolutional neural network for ...[Paper] shuffle net  an extremely efficient convolutional neural network for ...
[Paper] shuffle net an extremely efficient convolutional neural network for ...
 
[Paper] EDA : easy data augmentation techniques for boosting performance on t...
[Paper] EDA : easy data augmentation techniques for boosting performance on t...[Paper] EDA : easy data augmentation techniques for boosting performance on t...
[Paper] EDA : easy data augmentation techniques for boosting performance on t...
 
[Paper] auto ml part 1
[Paper] auto ml part 1[Paper] auto ml part 1
[Paper] auto ml part 1
 
[Paper] eXplainable ai(xai) in computer vision
[Paper] eXplainable ai(xai) in computer vision[Paper] eXplainable ai(xai) in computer vision
[Paper] eXplainable ai(xai) in computer vision
 
[Paper] DetectoRS for Object Detection
[Paper] DetectoRS for Object Detection[Paper] DetectoRS for Object Detection
[Paper] DetectoRS for Object Detection
 
Long term feature banks for detailed video understanding (Action Recognition)
Long term feature banks for detailed video understanding (Action Recognition)Long term feature banks for detailed video understanding (Action Recognition)
Long term feature banks for detailed video understanding (Action Recognition)
 
I3D and Kinetics datasets (Action Recognition)
I3D and Kinetics datasets (Action Recognition)I3D and Kinetics datasets (Action Recognition)
I3D and Kinetics datasets (Action Recognition)
 
GroupFace (Face Recognition)
GroupFace (Face Recognition)GroupFace (Face Recognition)
GroupFace (Face Recognition)
 
제11회공개sw개발자대회 금상 TensorMSA(소개)
제11회공개sw개발자대회 금상 TensorMSA(소개)제11회공개sw개발자대회 금상 TensorMSA(소개)
제11회공개sw개발자대회 금상 TensorMSA(소개)
 
Sk t academy lecture note
Sk t academy lecture noteSk t academy lecture note
Sk t academy lecture note
 
Python과 Tensorflow를 활용한 AI Chatbot 개발 및 실무 적용
Python과 Tensorflow를 활용한  AI Chatbot 개발 및 실무 적용Python과 Tensorflow를 활용한  AI Chatbot 개발 및 실무 적용
Python과 Tensorflow를 활용한 AI Chatbot 개발 및 실무 적용
 

Recently uploaded

Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
roli9797
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
74nqk8xf
 
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdfUdemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Fernanda Palhano
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
manishkhaire30
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
AlessioFois2
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
javier ramirez
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
Roger Valdez
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
Walaa Eldin Moustafa
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
rwarrenll
 
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
74nqk8xf
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
AndrzejJarynowski
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
nuttdpt
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
Timothy Spann
 
一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理
一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理
一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理
g4dpvqap0
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Aggregage
 
University of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma TranscriptUniversity of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma Transcript
soxrziqu
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
sameer shah
 
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
zsjl4mimo
 

Recently uploaded (20)

Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
 
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdfUdemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
Udemy_2024_Global_Learning_Skills_Trends_Report (1).pdf
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
 
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
 
一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理
一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理
一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
 
University of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma TranscriptUniversity of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma Transcript
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
 
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
 

[Paper] dynamic routing between capsules

  • 2. Dynamic Routing Between Capsules Hinton 교수님과 Google Brain의 연구진이 작성 (NIPS 2017 발표) CNN이 가지고 있는 한계점을 지적하며 Capsule을 통한 새로운 Feature Extraction을 제시 (Pose, Speed, Light, Angle….) A capsule is a group of neurons whose activity vector represents the instantiation parameters of a specific type of entity such as an object or an object part. https://jayhey.github.io/deep%20learning/2017/11/28/CapsNet_1/
  • 3. Convolutional neural networks CNN의 Feature Map을 추출하여 Pooling(Subsampling)을 통해 Spatial Size를 축소함(깊게 쌓아야 넓게 볼 수 있음) -> Location 정보가 손실됨 Convolutional neural networks (CNNs) use translated replicas of learned feature detectors. This allows them to translate knowledge about good weight values acquired at one position in an image to other positions. This has proven extremely helpful in image interpretation. https://becominghuman.ai/understanding-capsnet-part-1-e274943a018d
  • 5. Translational Invariance 이미지 내의 위치나 각도 조명등이 바뀌어도 구분 Invariance means that you can recognize an object as an object, even when its appearance varies in some way. This is generally a good thing, because it preserves the object's identity, category, (etc) across changes in the specifics of the visual input, like relative positions of the viewer/camera and the object. https://stats.stackexchange.com/questions/208936/what-is-translation-invariance-in-computer-vision-an d-convolutional-neural-netwo
  • 9. Codes (TensorFlow 1.x)
      import numpy as np
      import tensorflow as tf  # TensorFlow 1.x API (tf.variable_scope, tf.contrib, keep_dims)

      # cfg is the project's configuration object (batch_size, iter_routing, ...),
      # defined elsewhere in the repository.

      def capsule(input, b_IJ, idx_j):
          with tf.variable_scope('routing'):
              w_initializer = np.random.normal(size=[1, 1152, 8, 16], scale=0.01)
              W_Ij = tf.Variable(w_initializer, dtype=tf.float32)
              W_Ij = tf.tile(W_Ij, [cfg.batch_size, 1, 1, 1])

              # calc u_hat
              # [8, 16].T x [8, 1] => [16, 1] => [batch_size, 1152, 16, 1]
              u_hat = tf.matmul(W_Ij, input, transpose_a=True)
              assert u_hat.get_shape() == [cfg.batch_size, 1152, 16, 1]

              shape = b_IJ.get_shape().as_list()
              size_splits = [idx_j, 1, shape[2] - idx_j - 1]
              for r_iter in range(cfg.iter_routing):
                  # coupling coefficients: softmax over the 10 output capsules
                  c_IJ = tf.nn.softmax(b_IJ, dim=2)
                  assert c_IJ.get_shape() == [1, 1152, 10, 1]

                  # line 5:
                  # weighting u_hat with c_I in the third dim,
                  # then sum in the second dim, resulting in [batch_size, 1, 16, 1]
                  b_Il, b_Ij, b_Ir = tf.split(b_IJ, size_splits, axis=2)
                  # c_Ir gets its own name so that b_Ir is not overwritten
                  c_Il, c_Ij, c_Ir = tf.split(c_IJ, size_splits, axis=2)
                  assert c_Ij.get_shape() == [1, 1152, 1, 1]

                  s_j = tf.reduce_sum(tf.multiply(c_Ij, u_hat), axis=1, keep_dims=True)
                  assert s_j.get_shape() == [cfg.batch_size, 1, 16, 1]

                  # line 6:
                  # squash using Eq.1, resulting in [batch_size, 1, 16, 1]
                  v_j = squash(s_j)
                  assert v_j.get_shape() == [cfg.batch_size, 1, 16, 1]

                  # line 7:
                  # tile v_j from [batch_size, 1, 16, 1] to [batch_size, 1152, 16, 1]
                  # [16, 1].T x [16, 1] => [1, 1], then sum over the
                  # batch_size dim, resulting in [1, 1152, 1, 1]
                  v_j_tiled = tf.tile(v_j, [1, 1152, 1, 1])
                  u_produce_v = tf.matmul(u_hat, v_j_tiled, transpose_a=True)
                  assert u_produce_v.get_shape() == [cfg.batch_size, 1152, 1, 1]
                  b_Ij += tf.reduce_sum(u_produce_v, axis=0, keep_dims=True)
                  b_IJ = tf.concat([b_Il, b_Ij, b_Ir], axis=2)

          return(v_j, b_IJ)

      def squash(vector):
          # norm of each capsule vector along the vector axis (axis 2), as in Eq.1
          vec_abs = tf.sqrt(tf.reduce_sum(tf.square(vector), axis=2, keep_dims=True))
          scalar_factor = tf.square(vec_abs) / (1 + tf.square(vec_abs))
          vec_squashed = scalar_factor * tf.divide(vector, vec_abs)  # element-wise
          return(vec_squashed)

      # Fragment of the capsule layer's forward pass:
      # self.with_routing selects between PrimaryCaps and DigitCaps.
      if not self.with_routing:
          # the PrimaryCaps layer
          # input: [batch_size, 20, 20, 256]
          assert input.get_shape() == [cfg.batch_size, 20, 20, 256]

          capsules = []
          for i in range(self.num_units):
              # each capsule i: [batch_size, 6, 6, 32]
              with tf.variable_scope('ConvUnit_' + str(i)):
                  caps_i = tf.contrib.layers.conv2d(input, self.num_outputs,
                                                    self.kernel_size, self.stride,
                                                    padding="VALID")
                  caps_i = tf.reshape(caps_i, shape=(cfg.batch_size, -1, 1, 1))
                  capsules.append(caps_i)

          assert capsules[0].get_shape() == [cfg.batch_size, 1152, 1, 1]

          # [batch_size, 1152, 8, 1]
          capsules = tf.concat(capsules, axis=2)
          capsules = squash(capsules)
          assert capsules.get_shape() == [cfg.batch_size, 1152, 8, 1]
      else:
          # the DigitCaps layer
          # Reshape the input into shape [batch_size, 1152, 8, 1]
          self.input = tf.reshape(input, shape=(cfg.batch_size, 1152, 8, 1))

          # b_IJ: [1, num_caps_l, num_caps_l_plus_1, 1]
          b_IJ = tf.zeros(shape=[1, 1152, 10, 1], dtype=np.float32)
          capsules = []
          for j in range(self.num_outputs):
              with tf.variable_scope('caps_' + str(j)):
                  caps_j, b_IJ = capsule(self.input, b_IJ, j)
                  capsules.append(caps_j)

          # Return a tensor with shape [batch_size, 10, 16, 1]
          capsules = tf.concat(capsules, axis=1)
          assert capsules.get_shape() == [cfg.batch_size, 10, 16, 1]

      return(capsules)
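The routing loop and the squash function in the code above can also be sketched without TensorFlow, which may make the three steps (softmax, weighted sum and squash, agreement update) easier to follow. The version below is a minimal pure-Python sketch under stated assumptions: it routes to all output capsules at once rather than one `idx_j` at a time, and it uses a toy problem with 3 input capsules, 2 output capsules, and 2-dimensional vectors. The names `route`, `softmax`, `length`, and the `u_hat` values are illustrative and do not come from the slides.

```python
import math


def squash(vec):
    """Eq. 1: keep the direction, map the length into [0, 1)."""
    sq_norm = sum(x * x for x in vec)
    if sq_norm == 0.0:
        return [0.0 for _ in vec]
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm)
    return [scale * x for x in vec]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def length(vec):
    return math.sqrt(sum(x * x for x in vec))


def route(u_hat, num_iters=3):
    """u_hat[i][j] is the prediction of input capsule i for output capsule j."""
    num_in, num_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * num_out for _ in range(num_in)]  # routing logits b_ij
    for _ in range(num_iters):
        # coupling coefficients: softmax over the output capsules j
        c = [softmax(row) for row in b]
        v = []
        for j in range(num_out):
            # weighted sum of predictions, then squash
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(num_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        # agreement update: b_ij += u_hat_ij . v_j
        for i in range(num_in):
            for j in range(num_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v, c


# Inputs agree about output capsule 0 and disagree about output capsule 1.
u_hat = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, -1.0]],
    [[0.9, 0.1], [1.0, 0.0]],
]
v, c = route(u_hat)
print([round(length(v_j), 3) for v_j in v])
print([[round(c_ij, 3) for c_ij in row] for row in c])
```

In this toy setup the output capsule whose predictions agree ends up with the longer vector and the larger coupling coefficients, which is the behaviour the routing procedure is designed to produce.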
  • 12. Codes
      def loss(self):
          # 1. The margin loss
          # [batch_size, 10, 1, 1]
          # max_l = max(0, m_plus-||v_c||)^2
          max_l = tf.square(tf.maximum(0., cfg.m_plus - self.v_length))
          # max_r = max(0, ||v_c||-m_minus)^2
          max_r = tf.square(tf.maximum(0., self.v_length - cfg.m_minus))
          assert max_l.get_shape() == [cfg.batch_size, self.num_label, 1, 1]

          # reshape: [batch_size, 10, 1, 1] => [batch_size, 10]
          max_l = tf.reshape(max_l, shape=(cfg.batch_size, -1))
          max_r = tf.reshape(max_r, shape=(cfg.batch_size, -1))

          # calc T_c: [batch_size, 10]
          # T_c = Y, is my understanding correct? Try it.
          T_c = self.Y
          # [batch_size, 10], element-wise multiply
          L_c = T_c * max_l + cfg.lambda_val * (1 - T_c) * max_r

          self.margin_loss = tf.reduce_mean(tf.reduce_sum(L_c, axis=1))

          # 2. The reconstruction loss
          origin = tf.reshape(self.X, shape=(cfg.batch_size, -1))
          squared = tf.square(self.decoded - origin)
          self.reconstruction_err = tf.reduce_mean(squared)

          # 3. Total loss
          # The paper uses sum of squared error as reconstruction error, but we
          # have used reduce_mean in `# 2 The reconstruction loss` to calculate
          # mean squared error. In order to keep in line with the paper, the
          # regularization scale should be 0.0005*784=0.392
          self.total_loss = self.margin_loss + cfg.regularization_scale * self.reconstruction_err
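To make the margin loss above concrete, here is a small worked example in plain Python. It assumes the paper's values m_plus = 0.9, m_minus = 0.1 and lambda = 0.5; the slides do not show the contents of `cfg`, so treat these constants, the `margin_loss` helper, and the capsule lengths as illustrative assumptions. The last lines check the arithmetic in the comment about the reconstruction scale.

```python
# Assumed to match cfg.m_plus, cfg.m_minus, cfg.lambda_val (values from the paper).
M_PLUS, M_MINUS, LAMBDA = 0.9, 0.1, 0.5


def margin_loss(v_lengths, one_hot):
    """Margin loss for one example: sum over classes of L_c."""
    total = 0.0
    for v_len, t_c in zip(v_lengths, one_hot):
        present = max(0.0, M_PLUS - v_len) ** 2   # penalty if the true class is too short
        absent = max(0.0, v_len - M_MINUS) ** 2   # penalty if a wrong class is too long
        total += t_c * present + LAMBDA * (1 - t_c) * absent
    return total


labels = [1, 0, 0]  # class 0 is the true class

# Confident and correct: only the third capsule (0.30 > m_minus) is penalised.
loss_good = margin_loss([0.95, 0.05, 0.30], labels)

# Wrong prediction: the true capsule is short and a wrong capsule is long.
loss_bad = margin_loss([0.20, 0.80, 0.10], labels)

print(loss_good, loss_bad)

# Reconstruction scale: 0.0005 applies to a sum of squared errors over
# 28 * 28 = 784 pixels, so the equivalent scale for a mean is 784 times larger.
sse_scale = 0.0005
mse_scale = sse_scale * 28 * 28
print(mse_scale)
```

The correct, confident prediction yields a much smaller loss than the wrong one, and the scale for the mean squared error comes out at roughly 0.392, in line with the code comment.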
  • 14. Capsule on Multi MNIST: capsules can capture spatial information.
  • 18. References
      [Paper]
      1. Dynamic Routing Between Capsules (2017.11) - NIPS 2017
      [Blog]
      https://jayhey.github.io/deep%20learning/2017/11/28/CapsNet_1/
      https://www.slideshare.net/thinkingfactory/pr12-capsule-networks-jaejun-yoo
      https://medium.com/ai%C2%B3-theory-practice-business/understanding-hintons-capsule-networks-part-iii-dynamic-routing-between-capsules-349f6d30418
      http://helper.ipam.ucla.edu/publications/gss2012/gss2012_10754.pdf
      https://towardsdatascience.com/capsule-networks-the-new-deep-learning-network-bd917e6818e8
      http://blog.naver.com/PostView.nhn?blogId=sogangori&logNo=221129974140&redirect=Dlog&widgetTypeCall=true&directAccess=false
      [Video]
      http://jaejunyoo.blogspot.com/2018/02/pr12-video-56-capsule-network.html
      [Code]
      https://github.com/naturomics/CapsNet-Tensorflow
  • 19. Thanks. Any questions? You can send mail to Susang Kim (healess1@gmail.com)