CASE-7 Scanning and OCR the Open Source Way



Most customers have paper-based documents and faxes they would like to add to their Alfresco ECM, but until now there has been no open source capture vendor. Ephesoft fits with Alfresco as a front-end capture solution and allows a complete end-to-end open source capture-to-ECM solution. Using the CMIS connector, Ephesoft provides browser-based scanning, barcode and OCR reading, classification and separation of documents, and extraction of metadata. This saves labor costs and shortens the time to ROI. This session covers why capture technologies are important to ECM and how easily they integrate with Alfresco. It also discusses why customers and partners have chosen this partnership and the results of their implementations.

Published in: Technology


  1. Scanning and OCR the Open Source Way. Ian Pope - Ephesoft
  2. “Our goal is to allow everyone to enjoy a fully featured document and data capture system that rivals the current industry leaders at a fraction of the cost.” By 2016, Open Source Software (OSS) will be included in mission-critical software portfolios within 99% of Global 2000 enterprises.* (*Source: Gartner, Predicts 2011: Open Source Software, the Power Behind the Throne, 23 Nov 2010)
  3. Legacy Advanced Capture Systems:
     • high costs for software and services
     • click charges (priced by volume)
     • confusing pricing options
     • difficult to configure and implement
     • thick client deployment
     • “bloatware”
     • closed standards and vendor lock-in
  4. Ephesoft is the industry’s only Java and 100% browser-based Advanced Capture Solution that is Cloud ready out of the box, without the need to install software on every workstation. Ephesoft supports Firefox, Chrome, Safari and Internet Explorer. Ephesoft is based on open standards and is the first Advanced Capture Solution that can run on Linux.
  5. Capture sources:
     • Scan or import documents from virtually any source, such as fax and e-mail, with one application
     • Process documents that are stored in the repositories
     • Utilize any existing hardware, whether it is a departmental document scanner, high-speed production scanner or MFP
  6. Invoices, medical, mortgage, insurance, government, energy, telecommunication, HR, ...
  7. Advanced classification and separation technologies included:
     • Learns hundreds of documents within minutes and sorts the documents more efficiently
     • Separates documents by identifying where a document starts and ends
     • Allows operators to focus on only the exceptions
  8. Recognition and extraction features:
     • Bar Code Reading
     • Fixed Forms Extraction
     • Mark Sense and Handprint Recognition
     • PDF + text output
     • Free Form “unstructured” extraction
     • Fuzzy Database document matching
     • Cursive Handwriting Recognition
     Example of extracted fields:
     • invoice date: 6/17/2008
     • account number: 2-1006-475-1
     • reference code: 22177991
     • service code: 954381
     • amount: 29.34
     • due date: 7/7/2008
  9. Case Study: Travelcard
  10. The Challenges
     • Because of a move to another office building, the paper archive was reconsidered
     • A digital archive should replace the paper archive:
        – Reduce space (costs)
        – Documents are instantly accessible
     • The solution should be embedded within the current organization and current staff
     • The solution should be scalable without any extra costs
     • 50 documents (200 pages) per day, about 40 different document types: invoices (90%), contracts, application forms (request forms for ordering fuel cards), bank statements and a couple of standard forms, including a form to authorize payments
  11. The Solution
     • Incoming documents are scanned with a multifunction device and picked up by Ephesoft
     • Ephesoft classifies the documents and extracts metadata
     • Output from Ephesoft (PDF and metadata file) is used to save the document to the DMS
     • Saved documents go into a workflow based on their document classification
  12. The Benefits
     • Over 70% of documents are handled automatically
     • The remaining 30% (mostly variants) are handled by employees after only a few hours of training
     • No additional investment in employees or peripherals
     • All documents are digital, enabling:
        – Quick and easy access
        – Searching documents by metadata
        – Workflows (without documents getting lost)
  13. Extracted fields and export formats:
     • invoice date: 6/17/2008
     • account number: 2-1006-475-1
     • reference code: 22177991
     • service code: 954381
     • amount: 29.34
     • due date: 7/7/2008
     • CMIS, Multipage PDF/TIFF, XML
  14. The Challenges
     • Australia’s biggest satellite company, also providing technical services to the telecoms industry
     • 1,000 staff and 1,400 contractors across 8 branches
     • Multiple document-driven business processes
     • Looking for an ECM solution, for now and the future, as well as a company intranet and, specifically, an accounts payable invoice approval solution
     • Had already evaluated Microsoft SharePoint and other proprietary products
  15. The Solution
     • Zia was selected to provide an Ephesoft & Alfresco solution for Accounts Payable
     • Ephesoft handles invoice capture (scanning & OCR), creating PDFs and metadata tags
     • Using the CMIS standard, the data and all metadata tags are exported to an Alfresco repository
     • Once the document is in the Alfresco system, a workflow begins by triggering an email in Outlook with the document URL to the AP Dept. The invoice is then reviewed and commented upon before final approval
  16. The Results
     • Automated the accounts payable invoice approval process
     • Developed flexible workflows
     • Integrated Alfresco, Ephesoft, Pronto ERP
     • Incorporated OCR extraction of invoice data
     • Reduced invoice processing time
     • Improved employee productivity
  17. Ephesoft includes:
     • feature-rich, complete Advanced Capture Solution
     • easy to use and implement
     • browser based and cloud ready
     • one application for paper, email, fax documents
     • Web-based scanning application
     • document separation and classification
     • data extraction: fixed and unstructured
     • document and data release via XML, CMIS, and more
     • no volume counts on users or images
     What is the cost?
     • Zero License Cost
     • All features are included
     • Only cost is an annual Support/Maintenance Subscription
  18. Ephesoft follows the trend started by Red Hat, Apache, Android, MySQL, Alfresco, ... Ephesoft is the only open source Advanced Document Capture solution available. It is a true Enterprise Solution with full 24/7 enterprise support.
  19. Ephesoft Enterprise Edition includes:
     • disaster recovery (if more than one Ephesoft server in use)
     • load balancing (if more than one Ephesoft server in use)
     • high availability (if more than one Ephesoft server in use)
     • image enhancement
     • auto rotation based on text alignment
     • blank page deletion
     • professional level OCR
     • browser-based scanning
  20. Mountain West Financial Services Inc.
     • 225 financial documents “trained”
     • 95% classification accuracy
     • Reduced staff by half
     • System implemented in weeks
     • 4 million pages a year and growing
     • ROI in less than 6 months
     • Susan Hartsock, IT Supervisor, says: “We’re saving time, labor, and money. The solution is very intuitive and our staff loves it because it is fast and reliable. Ephesoft has been a sea change for us.”
     • Joint Alfresco and Ephesoft customers include BSA, Trax, Heifer, Gilt Groupe, City and County of Denver, Colorado, etc.
  21. Questions? Thank you
     Ian Pope
     ian.pope@ephesoft.com
     Skype: ianpopeportishead
     +44 7411 461804
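
The hand-off from capture to the repository described on slides 13 and 15 (extracted fields exported to Alfresco through CMIS) can be illustrated with a minimal Python sketch. It only shows how the plain-text field values from slide 13 might be turned into typed CMIS properties before upload. The `inv:` property ids, the `D:inv:invoice` type and the file name are hypothetical and stand in for whatever content model a real deployment defines; only `cmis:objectTypeId` and `cmis:name` are standard CMIS property ids. This is not Ephesoft's actual export code, and no CMIS client call is made.

```python
from datetime import date, datetime
from decimal import Decimal

# Field values exactly as shown on slides 8 and 13 (capture output is text).
EXTRACTED_FIELDS = {
    "invoice date": "6/17/2008",
    "account number": "2-1006-475-1",
    "reference code": "22177991",
    "service code": "954381",
    "amount": "29.34",
    "due date": "7/7/2008",
}

# ASSUMPTION: the "inv:" property ids and the "D:inv:invoice" type below are
# a hypothetical content model, not something the slides define.
FIELD_MAP = {
    "invoice date": ("inv:invoiceDate", "date"),
    "account number": ("inv:accountNumber", "text"),
    "reference code": ("inv:referenceCode", "text"),
    "service code": ("inv:serviceCode", "text"),
    "amount": ("inv:amount", "decimal"),
    "due date": ("inv:dueDate", "date"),
}


def parse_us_date(text: str) -> date:
    """Parse a month/day/year date such as '6/17/2008'."""
    return datetime.strptime(text, "%m/%d/%Y").date()


def to_cmis_properties(fields: dict, file_name: str) -> dict:
    """Convert extracted text fields into a typed CMIS property dictionary."""
    properties = {
        "cmis:objectTypeId": "D:inv:invoice",  # hypothetical custom type
        "cmis:name": file_name,
    }
    for label, raw in fields.items():
        prop_id, kind = FIELD_MAP[label]
        if kind == "date":
            properties[prop_id] = parse_us_date(raw)
        elif kind == "decimal":
            # Decimal avoids binary floating-point rounding on money amounts.
            properties[prop_id] = Decimal(raw)
        else:
            # Identifiers stay as strings so hyphens and leading zeros survive.
            properties[prop_id] = raw
    return properties


if __name__ == "__main__":
    props = to_cmis_properties(EXTRACTED_FIELDS, "invoice-22177991.pdf")
    for prop_id, value in props.items():
        print(prop_id, "=", value)
```

A dictionary of this shape is what a CMIS client library would typically accept as the property set when creating the document alongside the PDF content stream. Typing dates and amounts at this point is what makes the "searching documents by metadata" benefit from slide 12 workable for range queries.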