OCRFeeder LinuxTag 2011

The slides for the presentation about OCRFeeder given at LinuxTag 2011.

Published in: Technology

Transcript

  • 1. OCRFeeder: Converting printed documents into digital formats. Joaquim Rocha, jrocha@igalia.com. Berlin, May 2011
  • 2. What is it? Document Analysis and Optical Character Recognition for GNOME. Joaquim Rocha (Igalia) · OCRFeeder · LinuxTag 2011
  • 3. Why? Paper has a number of problems. No applications for GNU/Linux do a fair job.
  • 4. Paper problems: Security. CC Photo by: http://www.flickr.com/photos/badwsky/
  • 5. Paper problems: Preservation. CC Photo by: http://www.flickr.com/photos/98469445@N00/
  • 6. Paper problems: Data processing. CC Photo by: http://www.flickr.com/photos/hugovk/
  • 7. Paper problems: Ecology. CC Photo by: http://www.flickr.com/photos/pranavsingh/
  • 8. No fair conversion apps for GNU/Linux, apart from OCR engines, but...
  • 9. OCR != Document Conversion (it only deals with chars) (does not consider the layout) (does not distinguish contents)
  • 10. What's needed is Document Analysis and Recognition (conversion of documents to an electronic format) (first projects in the 80s)
  • 11. Where were we at? * Some closed solutions * Only for proprietary systems * Various prices * Still... arguable results
  • 12. How
  • 13. So many layouts... CC Photo by: http://www.flickr.com/photos/uber-tuber/
  • 14. Layouts vary with the type of document. What works for detecting one won't work on others.
  • 15. OCRFeeder focuses on contents, not on layouts!
  • 16. Key concept: If a document image can be divided into windows of 1 (content) or 0 (not content), then it is possible to group all the 1s and outline the contents.
  • 18. Recognition: System-wide OCR engines are used. Engines are configured from the GUI or XML files.
  • 20. The best-known free OCR engines are detected and configured automatically: * Tesseract * GOCR * OCRAD * Cuneiform
  • 21. Export formats: ODT, HTML, plain text
  • 22. User interaction: Users can edit everything and review the algorithm's results. So the UI can work in attended and unattended ways. The CLI only works in unattended mode.
  • 24. Demo time!
  • 25. Other features: * PDF import * Unpaper preprocessor * Font style editing * Image deskewing * OCR results cleaning * Project saving/loading
  • 26. A11y: * OCRFeeder is a very useful tool for visually impaired users * Last year, the main target of its development was to improve a11y
  • 27. Future: * Integrate OCRopus as an alternative analysis backend * More export formats: hOCR, PDF, etc. * Make OCR engine management easier
  • 28. Webpage: http://live.gnome.org/OCRFeeder git: http://git.gnome.org/ocrfeeder Bugzilla: http://bugzilla.gnome.org product: OCRFeeder
  • 29. Manual in German: http://wiki.ubuntuusers.de/OCRFeeder
  • 30. Thank you!
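
The key concept on slide 16 (split the page image into windows, mark each as 1 for content or 0 for not content, then group the 1s and outline them) can be illustrated with a small sketch. This is not OCRFeeder's actual code: the function names, the grayscale list-of-lists image, the "any non-background pixel means content" rule, and the 4-connected grouping are all assumptions made for illustration.

```python
from collections import deque


def content_windows(image, window_size, background=255, tolerance=0):
    """Return a grid with one cell per window: 1 (content) or 0 (not content).

    `image` is a list of rows of grayscale values (an assumed representation).
    A window counts as content if any pixel differs from `background`
    by more than `tolerance` (an assumed rule, chosen for simplicity).
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    grid = []
    for top in range(0, rows, window_size):
        grid_row = []
        for left in range(0, cols, window_size):
            has_content = any(
                abs(image[y][x] - background) > tolerance
                for y in range(top, min(top + window_size, rows))
                for x in range(left, min(left + window_size, cols))
            )
            grid_row.append(1 if has_content else 0)
        grid.append(grid_row)
    return grid


def group_windows(grid):
    """Group 4-connected 1-cells and return one bounding box per group.

    Boxes are (left, top, right, bottom) in window coordinates, inclusive.
    """
    seen = set()
    boxes = []
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value != 1 or (r, c) in seen:
                continue
            # Breadth-first search over neighbouring content windows.
            queue = deque([(r, c)])
            seen.add((r, c))
            left, top, right, bottom = c, r, c, r
            while queue:
                y, x = queue.popleft()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (
                        0 <= ny < len(grid)
                        and 0 <= nx < len(grid[ny])
                        and grid[ny][nx] == 1
                        and (ny, nx) not in seen
                    ):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return boxes


# A tiny 4x6 "page": 255 is white background, 0 is ink.
page = [
    [255, 255, 255, 255, 255, 255],
    [255,   0, 255, 255, 255, 255],
    [255, 255, 255, 255,   0,   0],
    [255, 255, 255, 255,   0, 255],
]
grid = content_windows(page, window_size=2)
boxes = group_windows(grid)
```

Multiplying each box by the window size gives an outline in pixel coordinates. A real page would likely need a tolerance above zero and a larger window, so that scanner noise does not count as content.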

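
Slides 18 and 20 say that system-wide OCR engines are used and that the best-known free ones are detected automatically. One plausible way to detect them, sketched here under assumptions, is to look for each engine's executable on the PATH. The executable names below and the `detect_engines` helper are hypothetical and not taken from OCRFeeder's source; distributions may install the engines under other names.

```python
import shutil

# Assumed executable names for the engines listed on slide 20.
CANDIDATE_ENGINES = {
    "Tesseract": "tesseract",
    "GOCR": "gocr",
    "OCRAD": "ocrad",
    "Cuneiform": "cuneiform",
}


def detect_engines(candidates=CANDIDATE_ENGINES, which=shutil.which):
    """Return {engine name: executable path} for every engine found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    found = {}
    for name, executable in candidates.items():
        path = which(executable)
        if path is not None:
            found[name] = path
    return found


if __name__ == "__main__":
    for name, path in detect_engines().items():
        print(f"{name}: {path}")
```

The result depends on what is installed, so the output differs from system to system. An engine missing from the result is simply not on the PATH.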