Terry Taewoong Um
fb.com/deeplearningtalk fb.com/terryum
Reading the text in photos:
Optical Character Recognition (OCR)
What is OCR?
• Optical Character Recognition (OCR)
Reading typed/printed/handwritten characters from image sources
[Figure labels: Speech Recognition, OCR]
Why OCR?
• Converts characters in the physical world (e.g., a printed "A") into characters in the computer
• Difficult because of the large variations!
(font, size, shape, location, noise, ...)
OCR vs Object detection
• OCR
- Text Localization: detect the bounding boxes that enclose text
- Text Recognition: read it
• Object detection
- Object Localization
- Object Recognition
• OCR is more challenging than object detection due to
- various aspect (W:H) ratios
- large distortions
- confusion w/ textures ('I', 'T')
- few pretrained models
- high density
- various languages
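The aspect-ratio point can be illustrated with a small sketch. The box coordinates below are made-up values, not measurements from any dataset, and the (x_min, y_min, x_max, y_max) layout is an assumption chosen for this example.

```python
def aspect_ratio(box):
    """Width-to-height (W:H) ratio of an (x_min, y_min, x_max, y_max) box."""
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min) / (y_max - y_min)


# Hypothetical boxes: a roughly square object and a long, thin line of text.
object_box = (10, 10, 110, 90)
text_line_box = (10, 10, 410, 30)

print(aspect_ratio(object_box))     # 1.25
print(aspect_ratio(text_line_box))  # 20.0
```

A detector whose default boxes are tuned for ratios near 1 may match long text lines poorly, which is one way to read the "various aspect ratios" item above.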
Text Localization
(Pipeline: Text Localization, then Text Recognition)
Source: 이활석 (Hwalsuk Lee), https://www.slideshare.net/deview/111-ai
• regression-based (like object detection)
[Textboxes, Liao et al., AAAI2017]
• classification-based (like semantic segmentation)
[PixelLink, Deng et al., AAAI2018]
• end-to-end (simultaneous localization + recognition)
[FOTS, Liu et al., CVPR2018]
[Figure annotations: # of papers; training: unstable vs. stable]
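As a rough illustration of the classification-based idea, the sketch below assumes a network has already labeled each pixel as text (1) or background (0), and groups connected text pixels into bounding boxes. This is plain connected-component grouping, not the actual PixelLink post-processing; the function name `boxes_from_mask` and the toy mask are hypothetical.

```python
from collections import deque


def boxes_from_mask(mask):
    """Return one (x_min, y_min, x_max, y_max) box per 4-connected blob of 1s."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if mask[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first search over this blob, tracking its extent.
            queue = deque([(x, y)])
            seen[y][x] = True
            x_min = x_max = x
            y_min = y_max = y
            while queue:
                cx, cy = queue.popleft()
                x_min, x_max = min(x_min, cx), max(x_max, cx)
                y_min, y_max = min(y_min, cy), max(y_max, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x_min, y_min, x_max, y_max))
    return boxes


# Toy text/non-text mask with two separate text regions.
mask = [
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
]
print(boxes_from_mask(mask))  # [(1, 0, 3, 1), (3, 3, 5, 3)]
```

A regression-based method would instead predict the box coordinates directly, as an object detector does.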
Text Recognition
(Pipeline: Text Localization, then Text Recognition)
• Connectionist Temporal Classification (CTC)
• Attention
[Figure: character-by-character decoding with <GO> and EOS tokens]
[Figure annotations: # of papers, speed, accuracy, rarely used]
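To make the CTC option more concrete, here is a minimal sketch of greedy CTC decoding: take the best label for each frame, merge consecutive repeats, then drop the blank symbol. The blank character "-" and the per-frame labels are assumptions made for this example; a real recognizer would produce them from network outputs.

```python
from itertools import groupby

BLANK = "-"  # assumed blank symbol for this sketch


def ctc_greedy_decode(frame_labels, blank=BLANK):
    """Merge consecutive repeated labels, then remove blanks."""
    collapsed = [label for label, _ in groupby(frame_labels)]
    return "".join(label for label in collapsed if label != blank)


# Hypothetical per-frame best labels for an image of the word "hello".
frames = list("hh-e-ll-lo-")
print(ctc_greedy_decode(frames))  # hello
```

The blank between the two runs of "l" is what lets the decoder keep a genuine double letter instead of merging it into one.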
OCR + Translation = SmartLens
Text Localization → Text Recognition → Machine translation
• What you need to know is
- Machine learning basics
- Neural network basics
- Convolutional Neural Networks (+ advanced topics)
- Recurrent Neural Networks (+ advanced topics)
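The three-stage pipeline can be sketched as a simple composition of functions. Everything here is hypothetical: `smart_lens`, the three stage functions, and the toy "image" (a dict from box to text) are stand-ins for real models, used only to show how the stages connect.

```python
def smart_lens(image, localize, recognize, translate):
    """Run localization, recognition, and translation in sequence."""
    results = []
    for box in localize(image):
        text = recognize(image, box)
        results.append((box, text, translate(text)))
    return results


# Toy stand-ins for the three models (assumptions for this sketch only).
toy_image = {(0, 0, 40, 10): "안녕"}
toy_dictionary = {"안녕": "hello"}


def toy_localize(image):
    return list(image.keys())


def toy_recognize(image, box):
    return image[box]


def toy_translate(text):
    return toy_dictionary.get(text, text)


print(smart_lens(toy_image, toy_localize, toy_recognize, toy_translate))
# [((0, 0, 40, 10), '안녕', 'hello')]
```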

A brief introduction to OCR (Optical character recognition)
