These slides answer the following questions:
- What is OCR?
- Why do we need it?
- Why is it difficult?
- Comparison between OCR and object detection
- Three approaches for text localization
- Three approaches for text recognition
Videos are also available at the links below:
(Korean) https://youtu.be/ckRFBl_XWFg
(English) coming soon
[Reference] Hwalsuk Lee, https://www.slideshare.net/deview/111-ai
5. Why OCR?
characters in the physical world → characters in the computer
Difficult because of the large variations!
(font, size, shape, location, noise, ...)
6. OCR vs Object detection
• OCR
  Text Localization: detect the bounding boxes that enclose text
  Text Recognition: read it
• Object detection
  Object Localization
  Object Recognition
• OCR is more challenging than object detection due to
- various aspect (W:H) ratios
- large distortions
- confusion with textures ('I', 'T')
- few pretrained models
- high density
- various languages
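The localization-then-recognition split can be sketched as a toy two-stage pipeline. This is only an illustration under strong assumptions: the "image" is a list of strings, and `localize`, `recognize`, `run_ocr`, and `aspect_ratio` are hypothetical names made up for this sketch, not part of any OCR library. A real system would replace both stages with trained models.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), hypothetical convention


def localize(image: List[str]) -> List[Box]:
    """Toy localizer: each run of non-space cells in a row becomes one text box."""
    boxes = []
    for y, row in enumerate(image):
        x = 0
        while x < len(row):
            if row[x] != " ":
                start = x
                while x < len(row) and row[x] != " ":
                    x += 1
                boxes.append((start, y, x - start, 1))
            else:
                x += 1
    return boxes


def recognize(image: List[str], box: Box) -> str:
    """Toy recognizer: 'read' the characters enclosed by the box."""
    x, y, w, _ = box
    return image[y][x:x + w]


def aspect_ratio(box: Box) -> float:
    """Width-to-height (W:H) ratio of a box."""
    return box[2] / box[3]


def run_ocr(image: List[str]) -> List[Tuple[Box, str]]:
    """Stage 1 finds boxes, stage 2 reads the text inside each box."""
    return [(box, recognize(image, box)) for box in localize(image)]


image = ["STOP    ", "   EXIT 24"]  # made-up input for illustration
results = run_ocr(image)
ratios = [aspect_ratio(box) for box, _ in results]
```

Even in this toy input the boxes have different W:H ratios, which hints at why the aspect ratio of text regions varies so much more than that of typical objects.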
7. Text Localization
Hwalsuk Lee, https://www.slideshare.net/deview/111-ai
• regression-based (like object detection)
  [TextBoxes, Liao et al., AAAI2017]
• classification-based (like semantic segmentation)
  [PixelLink, Deng et al., AAAI2018]
• end-to-end (simultaneous localization + recognition)
  [FOTS, Liu et al., CVPR2018]
[Figure: comparison by # of papers and by training stability (unstable / stable)]
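To make the difference between regression-based and classification-based localization concrete, here is a minimal sketch of how each kind of model output might be turned into boxes. Everything here is an assumption for illustration: `decode_box` uses a simplified SSD-style offset parameterization (no variance terms), and `boxes_from_score_map` groups thresholded pixels by plain 4-connectivity. Neither is the actual post-processing of TextBoxes or PixelLink (PixelLink, for instance, also predicts links between neighboring pixels).

```python
import math
from collections import deque


def decode_box(anchor, offsets):
    """Regression-based: apply predicted offsets to a default (anchor) box.

    anchor  = (center_x, center_y, width, height)
    offsets = (dx, dy, dw, dh) as regressed by the network
    Returns corner coordinates (x_min, y_min, x_max, y_max).
    """
    acx, acy, aw, ah = anchor
    dx, dy, dw, dh = offsets
    cx = acx + dx * aw
    cy = acy + dy * ah
    w = aw * math.exp(dw)
    h = ah * math.exp(dh)
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def boxes_from_score_map(scores, threshold=0.5):
    """Classification-based: threshold per-pixel text scores, then group
    4-connected text pixels and return one (x_min, y_min, x_max, y_max)
    box per group, in scan order."""
    rows, cols = len(scores), len(scores[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or scores[r][c] < threshold:
                continue
            # Breadth-first search over the connected text pixels.
            queue = deque([(r, c)])
            seen[r][c] = True
            x0 = x1 = c
            y0 = y1 = r
            while queue:
                y, x = queue.popleft()
                x0, x1 = min(x0, x), max(x1, x)
                y0, y1 = min(y0, y), max(y1, y)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and scores[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((x0, y0, x1, y1))
    return boxes


# Made-up score map: two separate text regions.
score_map = [
    [0.9, 0.8, 0.1, 0.0],
    [0.1, 0.1, 0.1, 0.7],
    [0.0, 0.0, 0.0, 0.9],
]
segmentation_boxes = boxes_from_score_map(score_map)
regressed_box = decode_box((10, 10, 4, 2), (0, 0, 0, 0))
```

The regression path outputs box coordinates directly, while the classification path only labels pixels and leaves box extraction to a grouping step.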