IMPACT Final Conference - Research Parallel Sessions02 research session_ncsr_tools
Presentation Transcript

  • IMPACT Tools Developed by NCSR
    IMPACT Final Conference 2011, 24-25 October 2011, London, UK
    B. Gatos, Computational Intelligence Laboratory, Institute of Informatics and Telecommunications, National Center for Scientific Research (NCSR) "Demokritos", GR-153 10 Agia Paraskevi, Athens, Greece
  • Outline
    • Research activities of our lab
    • Overview of our involvement in IMPACT project
    • Border Detection
    • Page Curl Detection
    • Character Segmentation
    • Word Spotting
    • H-DocPro Demo
    IMPACT Tools Developed by NCSR - IMPACT Final Conference 2011, 24-25 October, London, UK
  • Computational Intelligence Laboratory (CIL)
    • National Centre of Scientific Research "DEMOKRITOS":
    • The largest self-governing research organisation, under the supervision of the Greek Government.
    • It is composed of the following Institutes:
    • Biology
    • Materials Science
    • Microelectronics
    • Informatics & Telecommunications
    • Nuclear Technology & Radiation Protection
    • Nuclear Physics
    • Radioisotopes & Radiodiagnostic Products
    • Physical Chemistry
    • Focuses its activity on the research & development of methods, techniques and prototypes in the areas of Telecommunication Systems/Networks and Informatics
    • Trains new researchers (under- and post-graduate students)
    • Offers Post-Doctoral Fellowships
    • Participates in National/European/International Projects and Networks of Excellence
    Institute activities
    • Institute Personnel:
    • 15 Researchers
    • 6 Post-Doctoral Fellows
    • 30 PhD Students
    • 10 Undergraduate Students
    • 50 contract researchers
    • CIL Personnel:
    • 3 Researchers
    • 1 Post-Doctoral Fellow
    • 11 PhD Students
    • 1 Undergraduate Student
    • 5 contract researchers
    • People working on DIP:
    • 2 Researchers
    • 6 PhD Students
    • 1 Undergraduate Student
    • 2 contract researchers
  • Computational Intelligence Laboratory (CIL)
    • Core Research Activities:
    • Pattern classification and recognition
    • Multimedia processing and retrieval
    • Document Image Analysis – OCR
    • Processing and recognition of historical documents
  • Recent OCR projects
  • [Figure residue: CASAM project diagram, http://www.casam-project.eu/. Labels: information gain, web ontology language, image, video, visual information, non-visual information, text, audio, video OCR, low-level analysis, fusion, interpretation]
  • Video Logo Detection
    WP | Responsible | Main tasks
    OC3 (Evaluation tools and resources) | B. Gatos | Evaluation tools for binarization, border detection, OCR output
    OC5 (Technical architecture) | I. Pratikakis | System architecture, integration of toolkits
    TR1 (Image enhancement) | B. Gatos | SoA for binarization, border detection, page curl correction; new toolkits for border detection, page curl correction
    TR2 (Segmentation) (WP-Leader) | B. Gatos | SoA for block and character segmentation; new toolkit for character segmentation
    TR4 (Experimental OCR engines) | A. Kesidis | Word spotting
  • Border_Detection_v4 [0|1] [infile] [outfile1] [outfile2]
    parameter [0|1]: 0 -> only border removal, 1 -> border removal & page split
    parameter [infile]: Input filename (b/w or gray scale image)
    parameters [outfile1] [outfile2]: Output filenames (b/w or gray scale image)
    + web service implementation
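The usage line above maps directly onto an argument list. The following is a minimal sketch, not part of the original toolkit: the helper name `border_detection_args` and the assumption that the executable is reachable as `Border_Detection_v4` are illustrative only. It builds the command and does not run it.

```python
from typing import List


def border_detection_args(infile: str, outfile1: str, outfile2: str,
                          split_pages: bool = False,
                          exe: str = "Border_Detection_v4") -> List[str]:
    """Build the argument list: exe [0|1] infile outfile1 outfile2.

    `exe` is an assumed executable name or path; adjust it to the local install.
    """
    # 0 -> only border removal, 1 -> border removal & page split
    mode = "1" if split_pages else "0"
    return [exe, mode, infile, outfile1, outfile2]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(border_detection_args("scan.tif", "left.tif", "right.tif", split_pages=True))
```

Once the tool is installed, the returned list could be passed to `subprocess.run(args, check=True)`.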
    • We use projection profiles and a connected component labelling process to detect black borders
    • Signal cross-correlation is used in order to verify the detected noisy text areas
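A projection profile is simply a count of black pixels per column (or per row). The sketch below illustrates only that idea on a tiny binary image. It omits the connected component labelling and the cross-correlation check described above, and the 0.9 threshold (`ratio`) is an assumption made for illustration, not a value from the slides.

```python
from typing import List

Image = List[List[int]]  # 1 = black pixel, 0 = white pixel


def vertical_profile(img: Image) -> List[int]:
    """Number of black pixels in each column."""
    return [sum(col) for col in zip(*img)]


def left_border_width(img: Image, ratio: float = 0.9) -> int:
    """Count leading columns whose black-pixel share is at least `ratio`."""
    height = len(img)
    width = 0
    for count in vertical_profile(img):
        if count < ratio * height:
            break  # first column that is not almost entirely black
        width += 1
    return width


if __name__ == "__main__":
    page = [
        [1, 1, 0, 0, 1],
        [1, 1, 0, 1, 0],
        [1, 1, 0, 0, 0],
    ]
    print(vertical_profile(page), left_border_width(page))
```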
  • N. Stamatopoulos, B. Gatos, T. Georgiou, "Page frame detection for double page document images", 9th IAPR International Workshop on Document Analysis Systems (DAS 2010), pp. 401-408, Cambridge, MA, USA, June 2010.
    • We remove small noisy components
    • We detect vertical page zones based on vertical white run projections
    • Then, for every vertical zone we detect horizontal page zones based on horizontal white run projections.
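One way to picture the zone detection steps above: a column that contains a long vertical white run is treated as a separator, and the columns between separators form a vertical zone. This is a hedged sketch of that reading, not the published algorithm. The function names and the `min_run_ratio` threshold are assumptions.

```python
from typing import List, Sequence, Tuple

Image = List[List[int]]  # 1 = black pixel, 0 = white pixel


def longest_white_run(column: Sequence[int]) -> int:
    """Length of the longest run of consecutive white pixels in a column."""
    best = run = 0
    for pixel in column:
        run = run + 1 if pixel == 0 else 0
        best = max(best, run)
    return best


def vertical_zones(img: Image, min_run_ratio: float = 0.9) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, between separator columns."""
    height = len(img)
    zones: List[Tuple[int, int]] = []
    start = None
    for x, column in enumerate(zip(*img)):
        is_separator = longest_white_run(column) >= min_run_ratio * height
        if not is_separator and start is None:
            start = x  # a new zone begins
        elif is_separator and start is not None:
            zones.append((start, x))  # the current zone ends
            start = None
    if start is not None:
        zones.append((start, len(img[0])))
    return zones
```

Horizontal zones inside one vertical zone could be found the same way by cropping the zone's columns and transposing the result before calling `vertical_zones`.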
  • Border removal, rated from 1 (Bad) to 5 (Good):
    1. Final image almost destroyed!
    2. Big part of text is missing
    3. Small part of text is missing
    4. All text is there, border not completely removed.
    5. All text is there, border has been completely removed.
    21709 images to test border removal; 3003 newspaper images to test border removal. Averages shown: Av=4.3, Av=3.6.
  • Page split, rated from 1 (Bad) to 5 (Good), Av=3.3:
    1. Page split fails!
    2. Page split with problems.
    3. Page split is correct, large parts of noise remain or text is removed
    4. Page split is correct, small parts of noise remain or text is removed
    5. Page split is correct, only black noise has been removed
    3009 images to test page split (results on 50%)
  • SET A: 38718 randomly selected historical images
    Method | Metric | BL | BNE | BNF | BSB | JSI | NLB | ONB | TOTAL
    | #images | 3632 | 11126 | 12251 | 4784 | 4430 | 706 | 1789 | 38718
    IMPACT | Prec (%) | 99.49 | 99.89 | 98.88 | 98.10 | 98.91 | 99.86 | 97.29 | 99.08
    IMPACT | Rec (%) | 98.83 | 99.26 | 99.40 | 96.07 | 99.06 | 99.73 | 97.82 | 98.79
    IMPACT | FM (%) | 99.16 | 99.58 | 99.14 | 97.07 | 98.99 | 99.79 | 97.55 | 98.93
    D.X Le | Prec (%) | 94.98 | 99.68 | 98.67 | 97.70 | 97.35 | 99.80 | 97.19 | 98.30
    D.X Le | Rec (%) | 99.31 | 90.65 | 99.24 | 95.58 | 99.21 | 99.81 | 99.19 | 96.63
    D.X Le | FM (%) | 97.10 | 94.95 | 98.26 | 96.63 | 98.27 | 99.80 | 98.18 | 97.30
    BookRestorer | Prec (%) | 91.13 | 96.88 | 98.08 | 97.29 | 94.50 | 99.79 | 95.12 | 96.47
    BookRestorer | Rec (%) | 99.56 | 91.57 | 99.77 | 97.43 | 99.40 | 99.85 | 99.61 | 97.06
    BookRestorer | FM (%) | 95.16 | 94.15 | 98.91 | 97.36 | 96.89 | 99.82 | 97.31 | 96.76
    WiseBook | Prec (%) | 86.93 | 88.57 | 91.20 | 95.76 | 90.69 | 99.46 | 80.37 | 90.20
    WiseBook | Rec (%) | 98.37 | 99.47 | 99.10 | 96.40 | 97.29 | 98.45 | 98.63 | 98.56
    WiseBook | FM (%) | 92.30 | 93.71 | 94.99 | 96.08 | 93.87 | 98.95 | 88.57 | 94.20
    ScanFix | Prec (%) | 81.65 | 92.87 | 91.29 | 95.62 | 91.00 | 99.24 | 84.52 | 91.17
    ScanFix | Rec (%) | 94.97 | 98.66 | 98.66 | 97.81 | 95.66 | 80.81 | 96.98 | 97.46
    ScanFix | FM (%) | 87.81 | 95.68 | 94.83 | 96.70 | 93.27 | 89.08 | 90.32 | 94.21
  • SET B: 22383 images with noisy black border
    Method | Metric | BL | BNE | BNF | BSB | JSI | NLB | ONB | TOTAL
    | #images | 1631 | 7543 | 7677 | 2417 | 1416 | 315 | 1384 | 22383
    IMPACT | Prec (%) | 98.94 | 99.88 | 98.29 | 96.86 | 98.01 | 99.98 | 96.63 | 98.62
    IMPACT | Rec (%) | 98.18 | 99.27 | 99.26 | 93.14 | 99.15 | 99.87 | 98.24 | 98.46
    IMPACT | FM (%) | 98.56 | 99.57 | 98.77 | 94.96 | 98.58 | 99.92 | 97.43 | 98.54
    D.X Le | Prec (%) | 88.89 | 99.58 | 97.98 | 96.08 | 93.20 | 99.85 | 96.48 | 97.28
    D.X Le | Rec (%) | 99.05 | 86.64 | 98.86 | 91.53 | 99.09 | 99.97 | 99.06 | 94.01
    D.X Le | FM (%) | 93.70 | 92.66 | 98.42 | 93.75 | 96.05 | 99.91 | 97.75 | 95.62
    BookRestorer | Prec (%) | 80.30 | 95.46 | 97.00 | 95.27 | 84.26 | 99.83 | 93.76 | 94.11
    BookRestorer | Rec (%) | 99.36 | 93.02 | 99.68 | 95.22 | 99.56 | 99.96 | 99.62 | 96.92
    BookRestorer | FM (%) | 88.82 | 94.22 | 98.32 | 95.24 | 91.27 | 99.89 | 96.60 | 95.50
    WiseBook | Prec (%) | 70.98 | 83.24 | 86.10 | 92.24 | 72.44 | 99.18 | 74.77 | 83.32
    WiseBook | Rec (%) | 99.38 | 99.49 | 99.58 | 95.19 | 98.36 | 98.61 | 99.09 | 98.94
    WiseBook | FM (%) | 82.81 | 90.64 | 92.35 | 93.69 | 83.43 | 98.89 | 85.23 | 90.46
    ScanFix | Prec (%) | 59.23 | 89.57 | 86.23 | 91.96 | 73.38 | 99.04 | 80.14 | 85.00
    ScanFix | Rec (%) | 95.42 | 98.78 | 99.03 | 96.54 | 98.55 | 80.61 | 97.67 | 98.04
    ScanFix | FM (%) | 73.09 | 93.95 | 92.19 | 94.19 | 84.12 | 88.88 | 88.04 | 91.05
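The slides do not define FM. Assuming it is the usual F-measure, the harmonic mean of precision and recall, the reported values can be checked with a few lines. This is a sketch under that assumption, and the function name `f_measure` is illustrative.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, in the same units as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # IMPACT, SET A, TOTAL column: Prec 99.08 %, Rec 98.79 %
    print(round(f_measure(99.08, 98.79), 2))
```

Under this assumption the IMPACT TOTAL entry for SET A (Prec 99.08, Rec 98.79) gives 98.93, which matches the FM value in the table.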
  • 3009 images to test page split
  • 458 images from BNF to test page split
  • Page_Curl_Correction_v4 [0|1] [infile] [outfile]
    parameter [0|1]: 0 -> coarse & fine correction, 1 -> only coarse correction
    parameter [infile]: Input filename (b/w or gray scale image)
    parameter [outfile]: Output filename (b/w or gray scale image)
    + web service implementation
  • N. Stamatopoulos, B. Gatos, I. Pratikakis, S.J. Perantonis, "Goal-oriented Rectification of Camera-Based Document Images", IEEE Transactions on Image Processing, DOI: 10.1109/TIP.2010.2080280, 2010
    • A computationally low cost transformation which addresses the projection of a curved surface to a 2D rectangular area
    • Fine correction based on text line & word segmentation
  • Only Coarse Correction Coarse + Fine Correction Av=4.2 Av=4.5
    • Lot of page curl added to the result
    • Some curl added to the result
    • Only part of the page curl has been corrected
    • Most of the page curl has been corrected
    • Page Curl has been corrected
    14706 images to test Page Curl Correction
  • Only Coarse Correction Coarse + Fine Correction Av=3.9
    • Lot of page curl added to the result
    • Some curl added to the result
    • Only part of the page curl has been corrected
    • Most of the page curl has been corrected
    • Page Curl has been corrected
    Av=3.7
    3009x2 = 6018 images to test page split
  • IMPACT Page Curl Correction v.4: 87.78% (81.98% only coarse correction); BookRestorer: 80.87%
    N. Stamatopoulos, B. Gatos and I. Pratikakis, "A Methodology for Document Image Dewarping Techniques Performance Evaluation", 10th International Conference on Document Analysis and Recognition (ICDAR'09), pp. 956-960, Barcelona, Spain, July 2009.
  • Character_Segmentation_v3 [WordImageFilename] [XMLOutputFilename]
    parameter [WordImageFilename]: An image containing a word
    parameter [XMLOutputFilename]: several character segmentation variations encoded following the XML schema of IBM used in TR3 (Adaptive OCR)
    [Figure residue: values 0.21, 0.91]
  • Merged characters, broken characters, overlapped characters, noise
  • Segmentation variations encoded following the XML schema used in TR3
    N. Nikolaou, M. Makridis, B. Gatos, N. Stamatopoulos, N. Papamarkos, "Segmentation of historical machine-printed documents using Adaptive Run Length Smoothing and skeleton segmentation paths", Image and Vision Computing, Vol. 28, Issue 4, pp. 590-604, 2010.
    • Calculation of the inner/outer skeleton
    • Classification of the skeleton parts
    • Detection of feature points
    • Construct all possible segmentation paths that result in characters with width in the range [MinC*LettH, MaxC*LettH]. For the MinC and MaxC parameters the following pairs are used: (0.3, 0.4), (0.4, 0.5), (0.5, 0.6), (0.6, 0.7), (0.7, 0.8), (0.8, 0.9). As a result we have several segmentation variations, with and without noise removal applied. Confidence is based on the difference between the Average and the Dominant Ratio of Height to Width.
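The width constraint can be made concrete with a small calculation. The sketch below assumes that LettH denotes the letter height in pixels, which the slide does not spell out. The function names are illustrative, and the path construction itself is not shown.

```python
from typing import List, Tuple

# (MinC, MaxC) pairs as listed on the slide.
COEFF_PAIRS: List[Tuple[float, float]] = [
    (0.3, 0.4), (0.4, 0.5), (0.5, 0.6), (0.6, 0.7), (0.7, 0.8), (0.8, 0.9),
]


def width_limits(letter_height: float) -> List[Tuple[float, float]]:
    """Allowed character width range [MinC*LettH, MaxC*LettH] for every pair."""
    return [(min_c * letter_height, max_c * letter_height)
            for min_c, max_c in COEFF_PAIRS]


def width_allowed(width: float, letter_height: float,
                  min_c: float, max_c: float) -> bool:
    """True if a candidate character width lies inside the range for one pair."""
    return min_c * letter_height <= width <= max_c * letter_height


if __name__ == "__main__":
    for low, high in width_limits(40):
        print(f"{low:.1f} .. {high:.1f}")
```

Each pair yields one segmentation variation, so six width ranges combined with the noise-removal on/off option account for the "several variations" mentioned above.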
  • [Figure residue: values 0.61, 0.79, 0.85, 0.98, 0.94]
  • [Figure residue: values 0.83, 0.63, 0.73, 0.89, 0.90]
  • Evaluation of the result with the highest confidence [figure residue: values 0.61, 0.79, 0.94]
  • Evaluation of the best possible result [figure residue: values 0.61, 0.79, 0.94]
    • Develop an alternative technique for historical document indexing
      • based on spotting words directly on document images
      • avoiding the conventional OCR procedure
    • Provide three methods for word spotting:
      • Selecting the query from a predefined list of keywords
      • Query by example
      • Free text query
    • Incorporate the whole word spotting functionality in a GUI tool
    • The main operational parts of the Word Spotting application are:
    • Page segmentation
    • Feature extraction
    • Marking character templates
    • Word matching
    • User feedback
    • Query selection by example
    • Free text synthetic query creation
    • Searching
    • User access control
    • Main steps (1/2)
    • Document pages
      • Select pages from the documents corpus
      • Apply word segmentation to the pages
      • Apply feature extraction to all segmented words
    • Query
      • Define the list of keywords
      • Select the query keyword from the list
      • Mark the character templates
      • Create a synthetic query image
      • Apply feature extraction to the query image
    • Main steps (2/2)
    • Matching and User feedback
      • Word matching
      • User feedback
      • Selecting the final results
    • Marking character templates
      • Applied directly on a text image
      • Character baseline adjustment
      • Performed “once-for-all” and can be used for entire books or collections with similar text characteristics
    • Feature extraction & word matching
      • Describe each word (synthetic or real) by a set of features
      • Normalize
      • Match by checking similarity based on features
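The describe, normalize, match sequence can be sketched as follows. The feature used here (column ink density resampled to a fixed number of bins) and the Euclidean distance are stand-ins chosen for illustration; they are not claimed to be the features or the similarity measure of the actual tool.

```python
import math
from typing import List, Sequence

Image = List[List[int]]  # 1 = ink pixel, 0 = background


def profile_features(word: Image, n_bins: int = 8) -> List[float]:
    """Describe a word by column ink density, normalised to n_bins values.

    Dividing by the height and resampling to a fixed length makes words of
    different sizes comparable.
    """
    height, width = len(word), len(word[0])
    density = [sum(col) / height for col in zip(*word)]
    features = []
    for b in range(n_bins):
        lo = b * width // n_bins
        hi = max(lo + 1, (b + 1) * width // n_bins)
        features.append(sum(density[lo:hi]) / (hi - lo))
    return features


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors (smaller = more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    small = [[1, 0], [1, 1]]
    wide = [[1, 1, 0, 0], [1, 1, 1, 1]]  # same shape, stretched horizontally
    print(distance(profile_features(small, 2), profile_features(wide, 2)))
```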
  • Features based on word profile projections; features based on zones; hybrid features by projections and zones
    • Feature extraction & word matching
    [Figure residue: axis labels x, y; threshold y=y t; Upper Boundary; Lower Boundary]
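A rough illustration of the two feature families named above, under stated assumptions: the upper and lower profiles are taken as the distance from the top or bottom edge of the word image to the first ink pixel in each column, and the zone features as ink density per vertical strip. The exact definitions used in the tool are not given in the slides, so all names and choices below are hypothetical.

```python
from typing import List

Image = List[List[int]]  # 1 = ink pixel, 0 = background


def upper_profile(word: Image) -> List[int]:
    """Per column: rows from the top edge to the first ink pixel (height if none)."""
    height = len(word)
    return [next((y for y, p in enumerate(col) if p), height) for col in zip(*word)]


def lower_profile(word: Image) -> List[int]:
    """Per column: rows from the bottom edge to the last ink pixel (height if none)."""
    return upper_profile(word[::-1])


def zone_densities(word: Image, n_zones: int = 2) -> List[float]:
    """Share of ink pixels in each of n_zones vertical strips of the word image."""
    height, width = len(word), len(word[0])
    densities = []
    for z in range(n_zones):
        lo = z * width // n_zones
        hi = max(lo + 1, (z + 1) * width // n_zones)
        ink = sum(sum(row[lo:hi]) for row in word)
        densities.append(ink / (height * (hi - lo)))
    return densities


def hybrid_features(word: Image, n_zones: int = 2) -> List[float]:
    """Concatenate profile and zone features into one vector."""
    return [float(v) for v in upper_profile(word) + lower_profile(word)] + zone_densities(word, n_zones)
```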
  • Features by centers of masses
    • Feature extraction & word matching
    • User feedback example
    (a) Synthetic query word. (b) Initial ranking of segmented words. The highlighted words denote correct words selected by the user. (c) Ranking after user’s feedback.
    • Searching
      • Allow the user to search the image corpus for instances of query keywords that have already undergone the user feedback process.
      • The user selects one of the processed keywords and the application shows all the instances of this keyword in the images of the corpus.
      • The user can navigate through the results at the instance level (one instance at a time) or at the page level (all instances in a page).
  • A. L. Kesidis, E. Galiotou, B. Gatos and I. Pratikakis, "A word spotting framework for historical machine-printed documents", International Journal on Document Analysis and Recognition, DOI: 10.1007/s10032-010-0134-4, pp. 1-14, 2010.
  • A. L. Kesidis, E. Galiotou, B. Gatos, A. Lampropoulos, I. Pratikakis, I. Manolessou and A. Ralli, "Accessing the content of Greek historical documents", 3rd Workshop on Analytics for Noisy Unstructured Text Data (AND'09), pp. 55-62, Barcelona, Spain, July 2009.
    • Main steps
    • Document pages (applied once)
    • Query
      • Select (by cropping) a query word image from a page
      • Apply feature extraction to the query image
    • Matching
      • Match query features to all segmented words features
      • Rank the segmented words by similarity
      • Return the most similar segmented words
      • No user feedback!
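The matching steps above amount to a nearest-neighbour ranking over feature vectors. A minimal sketch, assuming Euclidean distance as the similarity measure and a plain dictionary of precomputed word features (both are illustrative choices, not details taken from the slides):

```python
import math
from typing import Dict, List, Sequence


def rank_words(query: Sequence[float],
               words: Dict[str, Sequence[float]],
               top_k: int = 3) -> List[str]:
    """Return the ids of the top_k segmented words most similar to the query."""

    def dist(features: Sequence[float]) -> float:
        return math.sqrt(sum((q - x) ** 2 for q, x in zip(query, features)))

    # Smaller distance means higher similarity, so sort ascending.
    return sorted(words, key=lambda word_id: dist(words[word_id]))[:top_k]


if __name__ == "__main__":
    # Hypothetical word ids and feature vectors.
    corpus = {"w1": [0.9, 0.1], "w2": [0.2, 0.8], "w3": [1.0, 0.0]}
    print(rank_words([1.0, 0.0], corpus, top_k=2))
```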
    • Main steps
    • Document pages (applied once)
    • Query
      • Type the query text
      • Construct a synthetic query image using letter templates (provided by an administrator)
      • Apply feature extraction to the query image
    • Matching
      • Match query features to all segmented words features
      • Rank the segmented words by similarity
      • Return the most similar segmented words
      • No user feedback!
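Building a synthetic query image from letter templates can be pictured as horizontal concatenation of small binary images. The sketch assumes that all templates share the same height and baseline and that a fixed white gap separates letters. The function name `synthesize_query` and the `gap` parameter are hypothetical.

```python
from typing import Dict, List

Image = List[List[int]]  # 1 = ink pixel, 0 = background


def synthesize_query(text: str, templates: Dict[str, Image], gap: int = 1) -> Image:
    """Concatenate letter templates left to right, separated by `gap` white columns."""
    if not text:
        raise ValueError("query text must not be empty")
    height = len(templates[text[0]])
    rows: Image = [[] for _ in range(height)]
    for i, char in enumerate(text):
        template = templates[char]  # KeyError if no template was marked for char
        if len(template) != height:
            raise ValueError("all templates must have the same height")
        for y in range(height):
            if i > 0:
                rows[y].extend([0] * gap)
            rows[y].extend(template[y])
    return rows


if __name__ == "__main__":
    # Toy two-row templates, for illustration only.
    toy = {"a": [[1], [1]], "b": [[0, 1], [1, 1]]}
    for row in synthesize_query("ab", toy):
        print(row)
```

The resulting image can then go through the same feature extraction as any segmented word.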
    • Two levels
    • Guest
    • Administrator
    Functions: Search | Word Spotting | Query by Example | Free Text Query | User management + settings
    Guest: 3 of the 5 functions (√ √ √)
    Administrator: all 5 functions (√ √ √ √ √)
  • OFFLINE PREPARATION – ADMINISTRATIVE TASKS (columns: Query by Keyword, Query by Example, Free Text)
    Page segmentation and features extraction: Admin, Admin, Admin
    Keywords definition: Admin
    Letter templates definition: Admin, Admin
    Word Spotting by User’s feedback: Admin
    ONLINE USAGE
    Searching: All Users, All Users, All Users
    • Document corpus
    • French book (153 pages, 47836 words)
    • German book (126 pages, 24596 words)
    • Segmentation
    • Projections
    • RLSA
    • USAL1 (Connected components)
    • USAL2 (Projections)
    • Features
    • Hybrid (Projections+Zones)
    • Center of Masses
    • Overall 80 experiments
    • Each experiment performed
      • Without user feedback
      • With 1, 2, and 3 user selected words
    • Keywords
    • 5 keywords per book
    • French : Le Dernier fils de France, ou le Duc de Normandie, fils de Louis XVI et de Marie-Antoinette, par A. , 1838
    • German : Aufschlüsse zur Magie aus geprüften Erfahrungen über verborgene philosophische Wissenschaften und verdeckte Geheimnisse der Natur , 1788
    • Feature extraction
      • Hybrid features gave better results and are faster than center-of-mass features
    (a) (b) Average precision vs recall diagrams of word spotting in relation to feature extraction methods for (a) Book A and (b) Book B.
    • User feedback
      • User feedback improved the results
    (a) (b) Average precision vs recall diagrams of word spotting in relation to the user’s feedback involvement for (a) Book A and (b) Book B.
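For readers unfamiliar with precision vs recall diagrams, the points of such a curve can be computed from a ranked result list as follows. This is a generic sketch of the standard definitions, not the evaluation code behind the diagrams, and the function name is illustrative.

```python
from typing import List, Sequence, Tuple


def precision_recall_points(relevance: Sequence[bool],
                            total_relevant: int) -> List[Tuple[float, float]]:
    """(precision, recall) after each rank of a ranked result list.

    relevance[i] is True if the word at rank i+1 is a correct instance of the
    keyword; total_relevant is the number of correct instances in the corpus.
    """
    points = []
    hits = 0
    for rank, is_relevant in enumerate(relevance, start=1):
        hits += int(is_relevant)
        points.append((hits / rank, hits / total_relevant))
    return points


if __name__ == "__main__":
    for precision, recall in precision_recall_points([True, False, True], total_relevant=2):
        print(f"precision={precision:.2f} recall={recall:.2f}")
```

Averaging such curves over the keywords of a book gives a diagram of the kind referred to in the captions.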
    • Segmentation issues
    • Query by Example + Free Text Query
      • In Query by Example the performance is similar to User Feedback when one relevant instance is selected
      • In both methods the results are related to the similarity threshold
  • H-DocPro v.1: http://users.iit.demokritos.gr/~bgat/H-DocPro/
  • H-DocPro v.1
  • H-DocPro v.1 Step 1: Select the directory with your images or copy your images to directory [Install Dir]/images.
  • H-DocPro v.1 Step 2: Select the directory for saving the results after pressing the "Settings" button. (default save directory: [Install Dir]/Results)
  • H-DocPro v.1 Step 3: Select one or more document images.
  • H-DocPro v.1
    • Step 4: Define a processing workflow.
    • To add a processing module: Just click on it. You can add a module in any order.
    • To remove a processing module: Just click again on it (at the bottom module line) or right click on the module at the workflow line and select "Remove".
    • To change the module order: Right click on the module at the workflow line and select "Move Right" or "Move Left".
  • H-DocPro v.1 Step 5: Select the method for every processing module by pressing "<" or ">" on every module at the workflow line. Right click on the module at the workflow line and deselect "Do not recalculate if result exists" if you want to recalculate an existing result.
  • H-DocPro v.1 Step 6: Execute the workflow by pressing "Apply Processes".
  • H-DocPro v.1 Step 7: View results on the preview window or right click on any module at the workflow line and select "View Result". If you right click on the right-most module you will view the final result; otherwise you will view the intermediate results.
  • H-DocPro v.1 - Document Image Processing Components: Binarization
    NCSR: Based on "B. Gatos, I. Pratikakis and S. J. Perantonis, Adaptive Degraded Document Image Binarization, Pattern Recognition, Vol. 39, pp. 317-327, 2006"
    FR8.1: From FineReader Engine v. 8.1. IMPORTANT NOTICES: (a) You must have the engine already installed. (b) You must edit file [Install Dir]/temp/Binarization/FRkey.txt and add your FineReader license key code.
  • H-DocPro v.1 - Document Image Processing Components: Border Removal
    Auto: Based on projection profiles and connected component analysis.
    Auto_Edit: Press inside the marked area and adjust it by dragging the black points.
  • H-DocPro v.1 - Document Image Processing Components: Page Split
    Auto: Based on "N. Stamatopoulos, B. Gatos, T. Georgiou, Page frame detection for double page document images, 9th IAPR International Workshop on Document Analysis Systems (DAS 2010), pp. 401-408, Cambridge, MA, USA, June 2010"
    Auto_Edit: Press inside the left or right marked area and adjust it by dragging the black points.
  • H-DocPro v.1 - Document Image Processing Components: Dewarping
    Auto: Based on "N. Stamatopoulos, B. Gatos, I. Pratikakis and S.J. Perantonis, Goal-oriented Rectification of Camera-Based Document Images, IEEE Transactions on Image Processing, vol. 20, no. 4, pp. 910-920, 2011." IMPORTANT NOTICES: (a) It needs the MATLAB Component Runtime Installer, (b) it can be applied only to single column documents.
    Auto_Edit: Manually correct the position of the two lines and the two curves that delimit the text area by dragging the corresponding black points. Press the ">" button to test the result.