SCATTER and GFTE
- 2. SCATTER: Selective Context Attentional Scene Text Recognizer
Litman, Ron; Anschel, Oron; Tsiper, Shahar; Litman, Roee; Mazor, Shai; Manmatha, R.
- 3. Motivation
Cascading
Intermediate supervision
Stronger feature representation
Shih-En Wei, Varun Ramakrishna, Takeo Kanade, and Yaser Sheikh. Convolutional pose machines. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4724–4732, 2016.
Alejandro Newell, Zhiao Huang, and Jia Deng. Associative embedding: End-to-end learning for joint detection and grouping. In Advances in Neural Information Processing Systems, pages 2277–2287, 2017.
Alejandro Newell, Kaiyu Yang, and Jia Deng. Stacked hourglass networks for human pose estimation. In European Conference on Computer Vision, pages 483–499. Springer, 2016.
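- 12. Training and inference procedure
Training data: MJ + Synth + Synth Add
1 V100, Adadelta (decay 0.95), batch size 128 (split 40%, 40%, 20% across the three datasets), gradient clipping ~5, 6 epochs
Random resizing and distortion on 40% of images, input size 32 × 100
Inference: if height > width, rotate by 90°; average the per-character probabilities and keep the version with the highest probability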
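The rotation step at inference can be sketched as follows. This is a minimal illustration, not the authors' code: `recognize` is a hypothetical callable standing in for the trained model, images are toy nested lists, and trying both the clockwise and counter-clockwise rotation is an assumption (the slide only says "rotate by 90°").

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[int]]  # toy grayscale image: a list of pixel rows
# Hypothetical recognizer interface: returns (text, per-character probabilities)
Recognizer = Callable[[Image], Tuple[str, Sequence[float]]]


def rotate90_cw(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def rotate90_ccw(img: Image) -> Image:
    """Rotate the image 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*img)][::-1]


def mean_char_prob(probs: Sequence[float]) -> float:
    """Average per-character probability; 0.0 for an empty prediction."""
    return sum(probs) / len(probs) if probs else 0.0


def recognize_with_rotation(img: Image, recognize: Recognizer) -> str:
    """Run the recognizer on the original image and, for tall images,
    on its rotated versions; keep the most confident prediction."""
    height, width = len(img), len(img[0])
    candidates = [img]
    if height > width:
        # Assumption: both rotation directions are tried.
        candidates += [rotate90_cw(img), rotate90_ccw(img)]
    results = [recognize(c) for c in candidates]
    best_text, _ = max(results, key=lambda r: mean_char_prob(r[1]))
    return best_text
```

Wide images cost a single forward pass; only tall images pay for the extra candidates.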
- 19. GTC: Guided Training of CTC Towards Efficient and Accurate Scene Text Recognition
Hu, Wenyang; Cai, Xiaocong; Hou, Jun; Yi, Shuai; Lin, Zhiping
- 25. GCN + CTC Decoder
https://www.desmos.com/calculator/vhvhpbbvb4
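- 26. Training and inference procedure
During training, the CTC loss updates the GCN + CTC encoder; the CE loss updates everything else
At inference, only the CTC branch is used
32 V100 GPUs, batch size 32, Adam, learning rate 10e-3, decayed by 0.1 every 30,000 iterations
Training data: MJ + Synth + Synth Add + all benchmark training sets (5.6M)
Input height 64, width up to 160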
Greedy Decoding
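Greedy decoding of a CTC output takes the most likely label per frame, merges consecutive repeats, and drops blanks. A minimal sketch, assuming the blank sits at index 0 of a hypothetical `labels` list and that the model output is a plain list of per-frame probability rows:

```python
from typing import List, Sequence


def ctc_greedy_decode(
    probs: Sequence[Sequence[float]],
    labels: Sequence[str],
    blank: int = 0,
) -> str:
    """Greedy (best-path) CTC decoding.

    probs  : one row of class probabilities per time step
    labels : symbol for each class index (labels[blank] is the blank)
    """
    # Per-frame argmax over the class dimension
    best_path = [max(range(len(frame)), key=frame.__getitem__) for frame in probs]

    decoded: List[str] = []
    prev = None
    for idx in best_path:
        # Emit a symbol only when it differs from the previous frame
        # and is not the blank; a blank between repeats keeps both.
        if idx != prev and idx != blank:
            decoded.append(labels[idx])
        prev = idx
    return "".join(decoded)


if __name__ == "__main__":
    labels = ["-", "a", "b"]  # "-" is the blank (toy alphabet)
    frames = [
        [0.1, 0.8, 0.1],    # a
        [0.2, 0.7, 0.1],    # a (repeat, merged)
        [0.9, 0.05, 0.05],  # blank
        [0.1, 0.8, 0.1],    # a (new, separated by blank)
        [0.1, 0.2, 0.7],    # b
        [0.1, 0.1, 0.8],    # b (repeat, merged)
        [0.8, 0.1, 0.1],    # blank
    ]
    print(ctc_greedy_decode(frames, labels))
```

No beam search or language model is involved, which is what keeps the CTC branch cheap at inference.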
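- 29. Summary + thoughts
In essence, both learn a stronger feature representation and better feature alignment
SCATTER proposes a selective attention decoder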
GCN vs. GCB