
co:op-READ-Convention Marburg - Roger Labahn

Roger Labahn (University of Rostock, DE): Handwritten Text Recognition. Key concepts

co:op-READ-Convention Marburg

Technology meets Scholarship, or how Handwritten Text Recognition will Revolutionize Access to Archival Collections.
With a special focus on biographical data in archives

Hessian State Archives Marburg Friedrichsplatz 15, D - 35037 Marburg
19-21 January 2016


  1. 1. Handwritten Text Recognition: Key concepts PD Dr. Roger Labahn Computational Intelligence Technology Lab Mathematical Optimization Group Institute for Mathematics University of Rostock co:op Convention | READ Kickoff 19.01.2016
  2. 2. Handwritten Text Recognition: Key concepts Introduction Concepts – Problems – Tasks Recognition & Training Interpretation – Decoding Epilog co:op Convention | READ Kickoff HTR Key Concepts | Introduction Roger Labahn | C I T lab
  3. 3. Framework – Workflow • Application: keyword search, transcription, ... OUT: textual information (words, positions, ...) with alternatives & confidences ⇑ • HTR-Engine ⇑ IN: writing images (lines, words, table cells, form fields, ...) • Layout Analysis: text blocks, ... co:op Convention | READ Kickoff HTR Key Concepts | Introduction Roger Labahn | C I T lab
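The workflow above can be pictured with a few toy data structures. The following Python sketch is purely illustrative: the names (LineImage, TextHypothesis, LineResult, keyword_search) are hypothetical and not part of any READ or Transkribus API; they only mirror the IN / OUT interfaces of the HTR engine described on the slide.

```python
from dataclasses import dataclass, field

@dataclass
class LineImage:
    """IN: a writing image delivered by layout analysis (line, word, cell, ...)."""
    page_id: str
    bounding_box: tuple        # (x, y, width, height) on the page image
    pixels: list               # grey values; stand-in for a real image array

@dataclass
class TextHypothesis:
    """One reading alternative with its confidence."""
    text: str
    confidence: float          # probability-like score in [0, 1]

@dataclass
class LineResult:
    """OUT: textual information for one line, with alternatives & confidences."""
    line: LineImage
    alternatives: list = field(default_factory=list)   # ranked TextHypothesis objects

def keyword_search(results, keyword, min_confidence=0.5):
    """Toy application on top of the OUT data: lines that may contain the keyword."""
    hits = []
    for result in results:
        for alt in result.alternatives:
            if keyword in alt.text and alt.confidence >= min_confidence:
                hits.append((result.line.page_id, result.line.bounding_box, alt))
                break
    return hits
```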
  5. 5. Alternative recognition strategies Topological methods • learn & read graphical substructures of writings • arcs, lines, curves, holes, ... HMM-based methods • Hidden Markov Models • learn & read states while traversing the writing RNN-based methods • Recurrent Neural Networks co:op Convention | READ Kickoff HTR Key Concepts | Introduction Roger Labahn | C I T lab
  7. 7. Recognition Engine C I T lab & MoU partner PLANET Decoding → textual output • textual interpretation of recognition results • matching external requirements / knowledge (dictionaries, language model, ...) ⇑ Recognition → recognition matrix • recognition information from image information • processing the standardized writing image ⇑ Writing preprocessing → standardized writing • corrections & normalizations • e.g.: baseline, slant, height, ... co:op Convention | READ Kickoff HTR Key Concepts | Introduction Roger Labahn | C I T lab
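As a rough illustration of the "writing preprocessing" stage, here is a minimal sketch assuming the line image arrives as a 2-D NumPy array of grey values; it only normalizes the height, whereas the actual engine also corrects baseline and slant.

```python
import numpy as np

def normalize_height(line_img: np.ndarray, target_height: int = 64) -> np.ndarray:
    """Rescale a line image to a fixed height (nearest-neighbour sampling),
    keeping the aspect ratio, as one step towards a 'standardized writing'."""
    h, w = line_img.shape
    new_w = max(1, round(w * target_height / h))
    rows = (np.arange(target_height) * h / target_height).astype(int)
    cols = (np.arange(new_w) * w / new_w).astype(int)
    return line_img[rows][:, cols]
```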
  8. 8. Introduction Concepts – Problems – Tasks Segmentation Context Language HTR Recognition & Training Interpretation – Decoding Epilog co:op Convention | READ Kickoff HTR Key Concepts | Concepts – Problems – Tasks Roger Labahn | C I T lab
  18. 18. Segmentation ? NONE ! (classical) OCR = Optical Character Recognition • Reading single characters: sub-images per character ! ? • B a d ␣ D o ??? a n Segmentation-free reading • processing the entire writing image: word, line, ... • scanning information → data sequence (signal) → character sequence • BB.ad␣.D co:op Convention | READ Kickoff HTR Key Concepts | Concepts – Problems – Tasks | Segmentation Roger Labahn | C I T lab
  20. 20. Image context is essential ! Single segment without context • u ?? OR n ?? • virtually not (sufficiently) readable Character sequence without context • ??? • virtually not (sufficiently) explainable co:op Convention | READ Kickoff HTR Key Concepts | Concepts – Problems – Tasks | Context Roger Labahn | C I T lab
  21. 21. Language context is essential ! Free reading – no restrictions on possible reading results • BB.ad␣DDolo.auu • application: figures & general numbers, ... Comparison against dictionary or keyword • task: • Read a German city name from a given list ! • Find the name Bad Doberan ! • Bad Doberan • goal: optimal / possible correspondence between writing / reading result and dictionary entry / keyword co:op Convention | READ Kickoff HTR Key Concepts | Concepts – Problems – Tasks | Language Roger Labahn | C I T lab
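To make the comparison against a dictionary concrete, here is a hedged sketch: it scores the free reading result against every permissible entry with difflib's similarity ratio and returns the closest one. The word list is invented for illustration, and the real decoder works on the confidence matrix rather than on a single free-read string (see the decoding slides further down).

```python
import difflib

def best_dictionary_match(free_reading: str, dictionary):
    """Pick the permissible entry that corresponds best to the reading result,
    scored by difflib's similarity ratio (higher = more similar)."""
    return max(dictionary,
               key=lambda entry: difflib.SequenceMatcher(None, free_reading, entry).ratio())

# Illustrative word list; expected to pick "Bad Doberan" as the closest entry.
print(best_dictionary_match("BB.ad DDolo.auu",
                            ["Bad Doberan", "Bad Segeberg", "Rostock", "Marburg"]))
```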
  26. 26. OCR ? HTR ! new paradigm – new concepts new term ! • HTR Handwritten Text Recognition • ATR Automatic Text Recognition • ... ??? co:op Convention | READ Kickoff HTR Key Concepts | Concepts – Problems – Tasks | HTR Roger Labahn | C I T lab
  27. 27. Introduction Concepts – Problems – Tasks Recognition & Training Feature extraction Writing processing Neural Network Parameter training Interpretation – Decoding Epilog co:op Convention | READ Kickoff HTR Key Concepts | Recognition & Training Roger Labahn | C I T lab
  28. 28. From pixel values to features original grey image Filtering co:op Convention | READ Kickoff HTR Key Concepts | Recognition & Training | Feature extraction Roger Labahn | C I T lab
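The filtering slides condense the grey image into a feature sequence. As a hedged illustration (not the C I T lab feature set), this sketch computes two simple per-column descriptors with NumPy:

```python
import numpy as np

def column_features(line_img: np.ndarray) -> np.ndarray:
    """Per-column descriptors of a (filtered) line image: ink density and the
    vertical centre of the ink. Returns an array of shape (width, 2)."""
    grey = line_img.astype(float) / 255.0
    ink = 1.0 - grey                               # dark pixels carry the writing
    height = ink.shape[0]
    ys = np.arange(height)[:, None]
    mass = ink.sum(axis=0) + 1e-9                  # avoid division by zero
    density = ink.mean(axis=0)
    centre = (ys * ink).sum(axis=0) / mass / height
    return np.stack([density, centre], axis=1)
```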
  34. 34. Collect & remember context ! Writing processing • scanning in different directions → data sequences (signals) • Information memory • neural networks with complex neurons (cells) • recurrent connections ⇒ memory co:op Convention | READ Kickoff HTR Key Concepts | Recognition & Training | Writing processing Roger Labahn | C I T lab
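A minimal sketch of how recurrent connections provide memory: a hidden state is carried along the scan and mixed with each new input, so later positions "remember" earlier ones. The weights here are random placeholders; the production engine uses trained LSTM-style cells (the "complex cells" on the following slides).

```python
import numpy as np

def run_recurrent_cell(feature_sequence: np.ndarray, hidden_size: int = 8, seed: int = 0):
    """Scan a (length, n_features) sequence; the hidden state h is the memory."""
    rng = np.random.default_rng(seed)
    n_features = feature_sequence.shape[1]
    W_in = rng.normal(scale=0.1, size=(hidden_size, n_features))
    W_rec = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
    h = np.zeros(hidden_size)
    outputs = []
    for x in feature_sequence:                     # left-to-right scan
        h = np.tanh(W_in @ x + W_rec @ h)          # new state mixes input and memory
        outputs.append(h.copy())
    return np.stack(outputs)                       # one hidden vector per position
```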
  36. 36. Complex cells co:op Convention | READ Kickoff HTR Key Concepts | Recognition & Training | Neural Network Roger Labahn | C I T lab
  37. 37. Complex cells – memory by recurrent connections co:op Convention | READ Kickoff HTR Key Concepts | Recognition & Training | Neural Network Roger Labahn | C I T lab
  38. 38. Hierarchical Neural Networks co:op Convention | READ Kickoff HTR Key Concepts | Recognition & Training | Neural Network Roger Labahn | C I T lab
  39. 39. From feature input to network output (Figure from GRAVES, SCHMIDHUBER: Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks) co:op Convention | READ Kickoff HTR Key Concepts | Recognition & Training | Neural Network Roger Labahn | C I T lab
  40. 40. From feature input to network output co:op Convention | READ Kickoff HTR Key Concepts | Recognition & Training | Neural Network Roger Labahn | C I T lab
  41. 41. Parameter training: Machine Learning Theory • objective: optimally adapt parameters in cells / along network connections • idea: train the network with learning data samples • optimization: minimize the error (network output vs. sample target) over the training data Practice: impression of large application cases • 10⁴ network cells • 10⁶ trainable parameters • 10⁴ learning data samples (writing images) • 150 training epochs, each processing every sample once • 4 weeks of training from scratch co:op Convention | READ Kickoff HTR Key Concepts | Recognition & Training | Parameter training Roger Labahn | C I T lab
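The following toy mirrors the idea of the slide: adapt parameters by gradient descent so that the per-sample character outputs match the targets, minimizing the error over the learning data. It is a linear softmax model on synthetic features, not the actual network (which has on the order of 10⁶ parameters and trains for weeks).

```python
import numpy as np

rng = np.random.default_rng(0)
alphabet = list(".BDadlou ")                     # 9 toy character channels
n_features, n_channels = 16, len(alphabet)

# Synthetic "learning data": feature vectors with known character labels.
X = rng.normal(size=(500, n_features))
true_W = rng.normal(size=(n_features, n_channels))
y = (X @ true_W).argmax(axis=1)                  # ground-truth channel per sample

W = np.zeros((n_features, n_channels))           # trainable parameters
for epoch in range(150):                         # each epoch sees every sample once
    logits = X @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)            # channel probabilities
    error = -np.log(p[np.arange(len(y)), y]).mean()
    grad = p.copy()
    grad[np.arange(len(y)), y] -= 1.0            # softmax cross-entropy gradient
    W -= 0.1 * (X.T @ grad) / len(y)             # gradient descent step
    if epoch % 50 == 0:
        print(f"epoch {epoch}: mean error {error:.3f}")
```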
  42. 42. Learning data • labeled training samples = ground truth; for HTR: writing images with the correct text • the more the better, BUT: start with a realistic (reasonable) number and improve while working • should represent all project data, BUT: start with HTR (networks) from similar collections / corpora • contribute to general HTR engine improvement: put into the network repository for specific application cases co:op Convention | READ Kickoff HTR Key Concepts | Recognition & Training | Parameter training Roger Labahn | C I T lab
  43. 43. Introduction Concepts – Problems – Tasks Recognition & Training Interpretation – Decoding Network output Decoding Epilog co:op Convention | READ Kickoff HTR Key Concepts | Interpretation – Decoding Roger Labahn | C I T lab
  44. 44. Channel probabilities Pre-conditions • (abstract) alphabet of (abstract) characters • text composed of exactly these characters • alphabet characters ⇐⇒ network output neurons ("channels") • example: digits, uppercase letters, lowercase letters, special characters ␣ - • much more general: any symbol unit learnable from training data • current (large) application case: up to 150 character channels • independent of (natural) language – reading/writing direction – understanding Network output = probability of a (character) channel at a writing (image) position co:op Convention | READ Kickoff HTR Key Concepts | Interpretation – Decoding | Network output Roger Labahn | C I T lab
  45. 45. Confidence Matrix – recognition / perception matrix (figure: matrix of per-position probabilities over the character channels . B D a d l o u ␣) co:op Convention | READ Kickoff HTR Key Concepts | Interpretation – Decoding | Network output Roger Labahn | C I T lab
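Below is a toy confidence matrix over the channels named above, together with the simplest possible way to read it: take the most probable channel at every writing position. The numbers are invented; the actual decoding (next slides) matches the whole matrix against permissible expressions instead of reading greedily.

```python
# Channels as on the slide; each row is one writing position, the values are
# made-up channel probabilities that sum to 1 per position.
alphabet = [".", "B", "D", "a", "d", "l", "o", "u", " "]

confidence_matrix = [
    [0.05, 0.80, 0.02, 0.03, 0.02, 0.02, 0.02, 0.02, 0.02],   # mostly "B"
    [0.05, 0.02, 0.02, 0.75, 0.04, 0.04, 0.04, 0.02, 0.02],   # mostly "a"
    [0.05, 0.02, 0.02, 0.04, 0.75, 0.04, 0.04, 0.02, 0.02],   # mostly "d"
    [0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.84],   # mostly " "
    [0.05, 0.10, 0.60, 0.05, 0.05, 0.05, 0.05, 0.03, 0.02],   # mostly "D"
    [0.05, 0.02, 0.02, 0.04, 0.04, 0.04, 0.75, 0.02, 0.02],   # mostly "o"
]

def greedy_read(matrix, channels):
    """Best channel per position -> raw character sequence (no language context)."""
    return "".join(channels[max(range(len(channels)), key=row.__getitem__)]
                   for row in matrix)

print(greedy_read(confidence_matrix, alphabet))   # -> "Bad Do"
```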
  46. 46. Expression matching • restrict to permissible words: dictionary, keyword(s), construct(s), regular expression • consider character confidences (probability measure) or their negative logarithms (distance measure) Algorithmic method • compare the confidence matrix against any permissible expression • use an extremely fast algorithm: Dynamic Programming co:op Convention | READ Kickoff HTR Key Concepts | Interpretation – Decoding | Decoding Roger Labahn | C I T lab
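A hedged sketch of the dynamic-programming comparison: match the confidence matrix against one permissible word, score with negative log probabilities as a distance, and let each character cover one or more consecutive writing positions. This simplification (no blank channel, no language model) is not the production decoder, but it shows why the matching is fast.

```python
import math

def match_cost(confidences, channels, word):
    """confidences[t][c] = probability of channels[c] at position t.
    Return the minimal summed -log probability of reading `word`,
    each character covering at least one consecutive position."""
    idx = {ch: i for i, ch in enumerate(channels)}
    T, N = len(confidences), len(word)
    INF = float("inf")

    def cost_at(t, n):                       # -log p of word[n] at position t
        return -math.log(max(confidences[t][idx[word[n]]], 1e-12))

    best = [[INF] * N for _ in range(T)]     # best[t][n]: positions 0..t, characters 0..n
    best[0][0] = cost_at(0, 0)
    for t in range(1, T):
        for n in range(min(N, t + 1)):
            stay = best[t - 1][n]                            # same character continues
            advance = best[t - 1][n - 1] if n > 0 else INF   # next character starts
            best[t][n] = min(stay, advance) + cost_at(t, n)
    return best[T - 1][N - 1]                # smaller distance = better match

# Toy usage with the matrix and channel list from the previous sketch:
# ranked = sorted(["Bad", "Dad", "Bald"], key=lambda w: match_cost(confidence_matrix, alphabet, w))
```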
  48. 48. Decoding Objective – Result • permissible expression(s) with the best match to the recognition output • best match ⇐⇒ maximal probability ⇐⇒ minimal distance • best alternatives ranked by measure (probability / distance) Practice: impression of actual application cases • decoding only on pre-processed lines • searching 1 keyword in 10,500 lines (433 pages): 2-3 sec. on average • reading 1 page against an 11,650-word dictionary: 8-9 sec. on average co:op Convention | READ Kickoff HTR Key Concepts | Interpretation – Decoding | Decoding Roger Labahn | C I T lab
  50. 50. Dynamic Programming co:op Convention | READ Kickoff HTR Key Concepts | Interpretation – Decoding | Decoding Roger Labahn | C I T lab
  52. 52. Introduction Concepts – Problems – Tasks Recognition & Training Interpretation – Decoding Epilog co:op Convention | READ Kickoff HTR Key Concepts | Epilog Roger Labahn | C I T lab
  53. 53. Results from C I T lab’s contribution to ICDAR’s HTRtS-2015 contest – test line images of increasing difficulty, each shown with the reference transcript and the C I T lab system hypothesis (in this order) and the corresponding WER / CER: • WER = 0% CER = 0% "who has with or without right the temporary possession of it : and" / "who has with or without right the temporary possession of it : and" • WER = 17% CER = 4% "operation of this act is spent upon Titius only ," / "operation of this act isspeut upon Titius only ," • WER = 67% CER = 52% "of the said first issue : the amount of such second" / "consequently gap/ to the of the and put feet the without of such ; said uitrquunity be the" • WER = 80% CER = 17% "for a simple personal Injury the Offender ’ s punish=" / "For on simple personal injury the offenders punish ." (Figure from SÁNCHEZ, TOSELLI, ROMERO, VIDAL: ICDAR2015 Competition HTRtS: Handwritten Text Recognition on the tranScriptorium Dataset) co:op Convention | READ Kickoff HTR Key Concepts | Epilog Roger Labahn | C I T lab
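For reference, the WER and CER figures above follow the usual definition: edit distance between hypothesis and reference, counted over words (WER) or characters (CER) and divided by the reference length. A minimal sketch (the contest tooling may tokenize and normalize slightly differently):

```python
def _edits(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (r != h)))
        prev = cur
    return prev[-1]

def cer(reference: str, hypothesis: str) -> float:
    return _edits(list(reference), list(hypothesis)) / len(reference)

def wer(reference: str, hypothesis: str) -> float:
    return _edits(reference.split(), hypothesis.split()) / len(reference.split())

# Second example above, with plain whitespace tokenization:
print(wer("operation of this act is spent upon Titius only ,",
          "operation of this act isspeut upon Titius only ,"))   # 0.2
```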
  54. 54. Thanks ... C I T lab Group – URO MoU Partner PLANET intelligent systems GmbH EU Funding: READ – Recognition and Enrichment of Archival Documents ... for your attention! co:op Convention | READ Kickoff HTR Key Concepts | Epilog Roger Labahn | C I T lab
  55. 55. co:op Convention | READ Kickoff HTR Key Concepts | Epilog Roger Labahn | C I T lab
