Introduction to Recent Important Papers - In Connection with Understanding Video through Alignment with Text (STAIR Lab AI Symposium 2017)

Speaker: Yuta Nakashima (Osaka University)

  1. 2017/3/12
  2. [Diagram labels: Sentence (“A baby is playing a guitar.”), Sentence Embedding, Image Search, Web Images, Video, Video Embedding, Deep Semantic Feature, Embedding Space]
  3. • Xu et al., “Show, attend and tell: Neural image caption generation with visual attention,” in Proc. ICML 2015.
     • Graves, Wayne, et al., “Hybrid computing using a neural network with dynamic external memory,” Nature, vol. 538, pp. 471–476, 2016.
     • Adversarial examples: Goodfellow et al., “Explaining and harnessing adversarial examples,” in Proc. ICLR 2015.
  4. • Xu, Ba, Kiros, Cho, Courville, Salakhutdinov, Zemel, and Bengio
 “Show, attend and tell: Neural image caption generation with visual attention”
 Proc. ICML 2015
  5. Images from: [Xu et al. 2015]
  6.
  7. • Visual Question Answering
     Q: what are racing down the track with their jockies? A: horses
  8. Image from: [Nakashima et al. 2012]
  9. • Graves, Wayne, et al.
 “Hybrid computing using a neural network with dynamic external memory”
 Nature, vol. 538, pp. 471–476, 2016
  10. Differentiable neural computer (DNC). Image from: [Graves et al. 2016]
  11. [Diagram labels: DNC, Controller] Image from: [Graves et al. 2016]
  12. • RNN
       - 3D-CNN
       - Mean/Max pooling
  13. Adversarial examples • Goodfellow, Shlens, and Szegedy
 “Explaining and harnessing adversarial examples”
 Proc. ICLR 2015.
  14. Adversarial examples? • DNN. Images from: [Goodfellow et al. 2015]
  15. • DNN
  16.
  17. • Microsoft Research Video Description Corpus
       • > 2000 videos and descriptions
     • TVD: a reproducible and multiply aligned TV series dataset
       • Big Bang Theory, Game of Thrones
     • MSR VTT
       • > 1M video and description pairs
     • MPII Movie Description Dataset
       • > 100K clip and description pairs
     • YouTube 8M
     • SumMe
     • TVSum
     • UG Video Dataset
