IMAGE DESCRIPTION THROUGH FUSION BASED RECURRENT MULTI-MODAL LEARNING
Ram Manohar Oruganti¹, Shagan Sah², Suhas Pillai³ and Raymond Ptucha¹
ABSTRACT
Index Terms
1. INTRODUCTION
Fig. 1.
2. BACKGROUND
2.1 Convolutional Neural Networks
2.2 Long Short Term Memory Networks
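Notation: the input sequence is $\langle x_1, x_2, \ldots, x_{t-1}, x_t, \ldots, x_T \rangle$, where $x_{t-1}$ and $x_t$ are consecutive inputs and $x_t$ is the input at time step $t$. $i_t$, $f_t$ and $o_t$ are the input, forget and output gates, and $g_t$ is the input modulation gate. $c_t$ is the memory cell state and $h_t$ is the hidden state. $W$ and $b$ denote the weights and biases.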
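The symbols above ($x_t$, $i_t$, $f_t$, $o_t$, $g_t$, $c_t$, $h_t$, $W$, $b$) match the standard LSTM cell, so one update step can be sketched as below. This is a minimal illustration that assumes the common formulation (sigmoid gates, a tanh candidate $g_t$, $c_t = f_t \odot c_{t-1} + i_t \odot g_t$, $h_t = o_t \odot \tanh(c_t)$). The exact equations used by the authors are not shown here, and the helper names `affine`, `gate`, `lstm_step` and the layout of `params` are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function used by the i, f and o gates."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W @ v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + b_k for row, b_k in zip(W, b)]


def gate(name, z, params, squash):
    """Apply one gate: squash(W_name @ z + b_name), element-wise."""
    W, b = params[name]
    return [squash(a) for a in affine(W, z, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One step of a standard LSTM cell (assumed formulation).

    params maps each of 'i', 'f', 'o', 'g' to a (W, b) pair, where W acts
    on the concatenation [x_t, h_prev].
    """
    z = list(x_t) + list(h_prev)            # concatenate input and previous hidden state
    i_t = gate("i", z, params, sigmoid)     # input gate
    f_t = gate("f", z, params, sigmoid)     # forget gate
    o_t = gate("o", z, params, sigmoid)     # output gate
    g_t = gate("g", z, params, math.tanh)   # input modulation (candidate) values
    # New cell state: keep part of the old state, add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    # New hidden state: gated, squashed view of the cell state.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t
```

With all weights and biases set to zero, every sigmoid gate equals 0.5 and $g_t$ is 0, so the cell state is simply halved at each step, which makes a convenient sanity check.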
3. PROPOSED LEARNING MODEL
3.1 FRMM model
4.2 Training details
Caffe
4.3 Results
Model   B-1    B-2    B-3    B-4
AFRMM   70.2   52.8   38.3   27.6
Table I.
CNN layer   B-1    B-2    B-3    B-4
AFRMM+fc8   70.2   52.8   38.3   27.6
Table II.
Model B-1 B-2 B-3 B-4 METEOR
40.4
Our model 70.2 52.8 27.6 22.5
Table III.
Model B-1 B-2 B-3 B-4 METEOR
Vinyals [13] 66.3 42.3 27.7 18.3
Table IV.
5. CONCLUSION
6. REFERENCES
et al., arXiv preprint arXiv:1409.0575
26th Annual Conference on Neural Information Processing Systems 2012, NIPS 2012, December 3, 2012 - December 6, 2012
Proceedings of the IEEE
27th Annual Conference on Neural Information Processing Systems, NIPS 2013
Neural Computation
ICASSP 2013
Computer Vision and Pattern Recognition
Computer Vision and Pattern Recognition
et al., Computer Vision and Pattern Recognition
arXiv preprint arXiv:1505.00487
et al., Proceedings of the IEEE International Conference on Computer Vision
et al., arXiv preprint arXiv:1502.03044
arXiv preprint arXiv:1411.4555
21st Annual Conference on Neural Information Processing Systems, NIPS 2007
Advances in neural information processing systems
arXiv preprint arXiv:1410.4615
Computer Vision and Pattern Recognition
arXiv preprint arXiv:1412.4729
arXiv preprint arXiv:1412.6632
Transactions of the Association for Computational Linguistics
et al., Computer Vision - ECCV 2014
ICLR
Proceedings of the 40th annual meeting on association for computational linguistics
In Proceedings of the Ninth Workshop on Statistical Machine Translation
et al., arXiv preprint arXiv:1411.4389
et al., Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
arXiv preprint arXiv:1410.1090