ICCV2017 N2NMNs

Learning to Reason: End-to-End Module
Networks for Visual Question Answering
Ronghang Hu, Jacob Andreas, Marcus Rohrbach et al.
ICCV 2017
Presented by Choi Seong Jae
2017. 11. 11

Overview: NMN
Where is the dog? Is there a red shape above a circle?

Attentional neural modules
$y = f_m(a_1, a_2, \ldots; x_{vis}, x_{txt}, \theta_m)$
$x_{txt}$: textual vector of module $m$
$x_{vis}$: spatial feature map (CNN)
$a_1, a_2, \ldots$: attention maps
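
As a rough illustration of the module signature above, the sketch below implements one possible attentional module that takes a spatial feature map and a textual vector and returns an attention map. The name `find_module`, the parameters `w_txt` and `w_out`, the softmax normalization, and all shapes are assumptions made for this example; the slides do not specify how any module is implemented.

```python
import math

def find_module(x_vis, x_txt, w_txt, w_out):
    """Hypothetical attentional module returning an H x W attention map.

    x_vis: H x W x D nested lists, spatial feature map from a CNN
    x_txt: length-T list, textual vector for this module
    w_txt: D x T nested lists, projects x_txt into feature space (part of theta_m)
    w_out: length-D list, per-channel weights like a 1x1 convolution (part of theta_m)
    """
    # Map the textual vector into the D-dimensional visual feature space.
    txt = [sum(w * t for w, t in zip(row, x_txt)) for row in w_txt]
    # Score each location: weighted sum of the elementwise product x_vis * txt.
    scores = [[sum(v * t * w for v, t, w in zip(cell, txt, w_out)) for cell in row]
              for row in x_vis]
    # Softmax over all locations so the attention map sums to 1 (an assumption).
    peak = max(max(row) for row in scores)
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]

# Tiny example: 2 x 2 feature map with D = 2 channels and T = 2 text dimensions.
x_vis = [[[1.0, 0.0], [0.0, 1.0]],
         [[0.5, 0.5], [2.0, 0.0]]]
x_txt = [1.0, 0.0]
w_txt = [[1.0, 0.0], [0.0, 1.0]]
w_out = [1.0, 1.0]
attention = find_module(x_vis, x_txt, w_txt, w_out)
```

In this toy input the textual vector selects the first feature channel, so the location with the largest first-channel value receives the most attention.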

Attentional neural modules
Q: What object is next to the table?
Layout: describe(relocate(find()))
$p(l \mid q; \theta)$
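
To make the layout expression concrete, here is a minimal sketch of how a nested layout such as describe(relocate(find())) could be turned into a flat token sequence and back, so that a sequence model could emit it one token at a time. It assumes a postfix (Reverse Polish) linearization; the `ARITY` table and the helper names `to_rpn`, `from_rpn`, and `render` are hypothetical and chosen only for this example.

```python
# Assumed number of attention inputs per module (illustrative only).
ARITY = {"find": 0, "relocate": 1, "and": 2, "describe": 1}

def to_rpn(layout):
    """Linearize a nested layout (name, [children]) into postfix tokens."""
    name, children = layout
    tokens = []
    for child in children:
        tokens.extend(to_rpn(child))
    tokens.append(name)
    return tokens

def from_rpn(tokens):
    """Rebuild the nested layout from postfix tokens using a stack."""
    stack = []
    for name in tokens:
        n = ARITY[name]
        if len(stack) < n:
            raise ValueError(f"module {name!r} needs {n} inputs")
        # Pop the last n sub-layouts; slicing from len(stack) - n also works for n == 0.
        children = stack[len(stack) - n:]
        del stack[len(stack) - n:]
        stack.append((name, children))
    if len(stack) != 1:
        raise ValueError("token sequence does not form a single layout")
    return stack[0]

def render(layout):
    """Format a nested layout as a functional expression."""
    name, children = layout
    return f"{name}({', '.join(render(child) for child in children)})"

layout = ("describe", [("relocate", [("find", [])])])
tokens = to_rpn(layout)
restored = from_rpn(tokens)
print(tokens)
print(render(restored))
```

The stack-based decoder also rejects token sequences that do not correspond to a single well-formed layout, which is one way invalid samples from a layout policy could be filtered.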

Training
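$\nabla_\theta L = \mathbb{E}_{l \sim p(l \mid q; \theta)} \left[ L(\theta, l) \, \nabla_\theta \log p(l \mid q; \theta) + \nabla_\theta L(\theta, l) \right]$
Monte-Carlo Policy Gradient
$M = 1$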
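
The first term of the gradient above can be sketched for a toy layout policy that is a softmax over a handful of candidate layouts. Everything specific here is an assumption for illustration: the three-layout policy, the fixed per-layout `losses`, and the function names. Because the toy losses do not depend on the logits, the second term of the gradient is zero in this example; in a real model it would be ordinary backpropagation through the modules.

```python
import math
import random

def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def score_function_term(logits, losses, layout):
    """L(theta, l) * grad_logits log p(l | q) for one layout index."""
    probs = softmax(logits)
    # For a softmax policy, grad_logits log p(l) = onehot(l) - probs.
    return [losses[layout] * ((1.0 if k == layout else 0.0) - p)
            for k, p in enumerate(probs)]

def single_sample_estimate(logits, losses, rng):
    """Monte-Carlo estimate with M = 1: sample one layout, return its term."""
    probs = softmax(logits)
    layout = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return layout, score_function_term(logits, losses, layout)

def exact_expectation(logits, losses):
    """Average the term over all layouts, weighted by p(l | q)."""
    probs = softmax(logits)
    grad = [0.0] * len(logits)
    for layout, p in enumerate(probs):
        term = score_function_term(logits, losses, layout)
        grad = [g + p * t for g, t in zip(grad, term)]
    return grad

logits = [0.2, -1.0, 0.5]   # hypothetical scores for three candidate layouts
losses = [1.3, 0.4, 2.0]    # hypothetical answer loss under each layout
layout, estimate = single_sample_estimate(logits, losses, random.Random(0))
expected = exact_expectation(logits, losses)
```

A single sample gives an unbiased but noisy estimate; averaging the term over the whole layout distribution, as `exact_expectation` does, is only feasible here because the toy policy has three layouts.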

Training: Behavioral cloning from expert policies
• Optimizing the loss function in Eqn. 4 from scratch is a challenging reinforcement learning problem, since it requires simultaneously:
  • Optimizing the layout policy
  • Optimizing attention weights for each module
  • Learning the parameters in the neural modules
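
One way to picture behavioral cloning for the layout policy is as supervised learning on the expert's layout tokens. The sketch below assumes the cloning objective is the negative log-likelihood of the expert layout under per-step policy logits, a common choice for behavioral cloning; the slides do not state the exact loss, and the vocabulary, the `<eos>` token, and the function names are hypothetical.

```python
import math

def log_softmax(logits):
    peak = max(logits)
    log_total = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return [z - log_total for z in logits]

def cloning_loss(step_logits, expert_tokens, vocab):
    """Negative log-likelihood of the expert layout under the layout policy.

    step_logits: one list of logits over vocab per decoding step
    expert_tokens: the expert layout as a token sequence of the same length
    """
    if len(step_logits) != len(expert_tokens):
        raise ValueError("need exactly one logit vector per expert token")
    nll = 0.0
    for logits, token in zip(step_logits, expert_tokens):
        nll -= log_softmax(logits)[vocab.index(token)]
    return nll

vocab = ["find", "relocate", "describe", "<eos>"]
expert = ["find", "relocate", "describe", "<eos>"]

# An untrained policy (uniform logits) versus one that favors the expert tokens.
uniform = [[0.0] * len(vocab) for _ in expert]
confident = [[5.0 if v == t else 0.0 for v in vocab] for t in expert]

loss_uniform = cloning_loss(uniform, expert, vocab)
loss_confident = cloning_loss(confident, expert, vocab)
```

Minimizing this loss pushes the policy toward the expert layout before any policy-gradient training, which covers only the first of the three items listed above; the attention weights and module parameters would still be trained through the answer loss.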

Training: Behavioral cloning from expert policies
Is there a red shape above a circle?
Leaves: attend
Internal: re-attend or combine
Root: measure or classify
J. Andreas, M. Rohrbach et al. Neural module networks, CVPR 2016
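
The node-to-module mapping on this slide can be sketched as a small recursive function over a parse tree. The tree for the example question, the rule that picks re-attend for one child and combine for several, and the rule that picks measure for yes/no questions and classify otherwise are all assumptions made for this illustration; `assign_modules` and `render` are hypothetical helper names.

```python
def assign_modules(node, is_yes_no, is_root=True):
    """Map a parse tree (word, [children]) to (module, word, [children])."""
    word, children = node
    if is_root:
        # Assumed rule: yes/no questions are measured, others are classified.
        module = "measure" if is_yes_no else "classify"
    elif not children:
        module = "attend"
    elif len(children) == 1:
        module = "re-attend"
    else:
        module = "combine"
    mapped = [assign_modules(child, is_yes_no, is_root=False) for child in children]
    return (module, word, mapped)

def render(layout):
    """Format the layout as module[word](children...)."""
    module, word, children = layout
    head = f"{module}[{word}]"
    if not children:
        return head
    return f"{head}({', '.join(render(child) for child in children)})"

# Hypothetical parse of "Is there a red shape above a circle?"
parse = ("is", [("and", [("red", []), ("above", [("circle", [])])])])
layout = assign_modules(parse, is_yes_no=True)
print(render(layout))
```

A layout produced this way could serve as the expert layout that the layout policy is trained to imitate.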
