For the past 20 years we’ve been led to believe that paper would go away and everything would be electronic. To some extent that has happened: much of what we do today is done electronically. This is certainly the case with paper forms (structured content), but think of all the paper that is unstructured. That content cannot be easily replaced with electronic documents, so the idea that paper will simply vanish is not realistic today.
Although many companies today are using capture solutions in a centralized environment, there is a growing movement among organizations to push the capture of information to the locations where paper first enters the company. Key studies show that ad hoc/distributed capture is expected to grow significantly over the next few years and more companies are looking for solutions to automate the capture of paper at remote locations and integrate distributed capture with back-office processes.
This movement to distributed capture can be attributed to a number of factors, including lower bandwidth cost, low-priced desktop scanners, the emergence of MFP devices, and general acceptance of moving information electronically versus shipping paper.
At the same time, many companies have moved data entry and correction from the local scanning center to overseas locations with cheaper labor, such as India and countries in Southeast Asia.
Although paper has proven to be inefficient, it is not going away any time soon. If anything, there is plenty of evidence that the amount of paper in the workplace is steady or even on the rise.
All that talk in the early 1990s about the paperless office was just that: TALK.
Paper remains an issue that organizations must confront. Important business processes continue to rely upon paper documents in transactions, and increasing regulations and legal costs are forcing organizations to figure out how to get control of their paper documents.
“The ‘paperless society’ goal is a very nice and noble one…Unfortunately, I don't see any trace of progress toward it.” Arpad Horvath, University of California, Berkeley
“Paper use is rising by up to 8 percent every year with the average $1 billion company generating more than 88 million pages annually.” Lexmark
“The number of documents generated—both electronically and on paper—will soar to 20 trillion over the next five years.” Xerox
Many industries rely heavily on paper documents to communicate and deliver services to customers. Documents are received as paper files and routed through complex workflows to make important business decisions. Even when business processes are not involved, paper remains an important part of a business’s compliance or eDiscovery initiative. Unfortunately, working with large volumes of paper documents and files presents several challenges:
Managing sheer volume can be a daunting task, especially when an organization must process a wide variety of document types and formats. Relative to digital data, paper documents are expensive to process and difficult to search, locate, retrieve, share, and manage in a workflow. In addition, paper files can only be processed sequentially, one document at a time. The reality is that documents are the lifeblood of your organization. Businesses that continue to manage critical information on paper are preventing their employees from having the necessary global access to important business information. In a paper-based environment, it can be difficult to establish effective controls and monitor activity to ensure compliance with regulations.
Note to Presenter: Some important facts regarding paper documents. The typical organization:
- Spends $20 in labor to file each document
- Spends $120 in labor searching for each misfiled document
- Loses one out of every 20 documents
- Spends 25 hours recreating each lost document
- Spends $8 to process every invoice, 70 percent (or $5.60) of which is related to document handling
Sources: PricewaterhouseCoopers and IAPP (International Accounts Payable Professionals)
What they looked to do with Captiva:
- Increase visibility over the order process
- Automate the data entry process as much as possible
- Reduce errors
- Remove the complexity from order entry
Results:
- Saving a minimum of $8500 per month in error reduction
- Reduced from 16 staff (data entry) to 6
- Duplicate orders reduced by two-thirds
- Order processing has gone from 2-3 days in peak period to 92-97% same day
- Order accuracy is 99%
Future Development:
- Currently rolling out 6.0 in Europe for order processing, Americas to follow
- Will be doing Accounts Payable at some stage globally
- Workflow integration
- Increased QC via automated processes at the validation stage
Why we will continue with Captiva. Changes are leading to:
- More opportunities to increase productivity
- Better performance and stability
- More flexible development
Consider the following when planning an enterprise-wide content management solution using SharePoint with a document capture application front end:
- Paper will continue to play a major role in an organization, so the capture solution chosen to manage paper-based documents is critical.
- Addressing the challenges with paper documents requires a complete capture offering, with options to address centralized and distributed capture requirements.
- Capture and SharePoint together enable organizations to manage paper in anything from a small department to an enterprise-wide implementation.
This is where Captiva shines!
Volume: 2.5M pages per year.
A large financial services company in Canada implemented an ECM platform based on Microsoft SharePoint 3.0/MOSS 2007 technology and is starting to provide this platform to its bu
The Group Insurance Disability business unit is the first business unit using our ECM platform. For capture, they are using Ricoh MFP devices, and the documents are stored in a SharePoint library.
The Group Savings and Retirement business unit is the next one. They use capture (Captiva) in the centralized mailroom.
How the process generally works: At the prepping and capture stage, staff use a barcode sheet as a separator for each batch of documents. Only a few indexing elements are captured to ensure fast throughput. Barcodes ensure that a document and its associated metadata end up in the proper storage area. These documents are primarily stored in MOSS 2007 document libraries.
The barcode represents a mail drop location, and each mail drop location has its own document library. Once the document is in this library, specialized staff perform additional indexing for further processing of the documents.
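Note to Presenter: if the audience asks how barcode-driven routing works in practice, the idea can be sketched in a few lines of Python. This is only an illustration, not Captiva or SharePoint code. The barcode values, library names, and the rule that every separator sheet starts a new document are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical mapping from mail drop barcode values to document library names.
MAIL_DROP_LIBRARIES = {
    "MD-001": "Mail Drop 001 Library",
    "MD-002": "Mail Drop 002 Library",
}


@dataclass
class Page:
    image: str = ""                 # file name of the scanned page image
    barcode: Optional[str] = None   # barcode value if this is a separator sheet


@dataclass
class Document:
    library: str
    pages: list = field(default_factory=list)


def split_and_route(pages, libraries):
    """Split a scanned batch at separator sheets and pick a library per document."""
    documents = []
    current = None
    for page in pages:
        if page.barcode is not None:
            # Separator sheet: look up the library and start a new document.
            # The separator sheet itself is not stored.
            if page.barcode not in libraries:
                raise KeyError(f"no library configured for barcode {page.barcode!r}")
            current = Document(library=libraries[page.barcode])
            documents.append(current)
        elif current is None:
            raise ValueError("batch must start with a barcode separator sheet")
        else:
            current.pages.append(page.image)
    return documents
```

In this sketch an unknown barcode stops the batch instead of guessing a destination, so a misprinted separator sheet is caught at capture time and not after documents land in the wrong library.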
Global Target Architecture
The architecture needed to support their business and technology needs, both today and into the future. The capture software needed to support:
- OCR and barcode recognition
- Blank page suppression
- Multiple output formats such as TIF, JPEG, fully searchable PDF, etc.
- Integration with Microsoft Windows SharePoint Services 3.0
  - Standard SharePoint document library
  - Custom document library type (which inherits from the standard SharePoint document library template)
- Integration with Windows-based and Linux-based file systems
- Integration with the Vignette (Towers) EDM repository
- An Application Programming Interface (API) to allow scripting and programming
- Integration with non-Captiva capture components such as Ricoh GlobalScan software
- Server-side processing with multi-threading capabilities to support concurrency and provide viable throughput
Although many companies today are using capture solutions in a centralized environment, there is a growing movement among organizations to push the capture of information to the locations where paper first enters the company. Key studies show (see data points below) that distributed capture is expected to grow significantly over the next few years, and more companies are looking for solutions to automate the capture of paper at remote locations and integrate distributed capture with back-office processes. This movement to distributed capture can be attributed to a number of factors, including lower bandwidth cost, low-priced desktop scanners, the emergence of MFP devices, and general acceptance of moving information electronically versus shipping paper. At the same time, many companies have moved data entry and correction from the local scanning center to locations with cheaper labor, either within the U.S. or overseas in places like India. Captiva is architected to handle this need for remote indexing in a variety of ways.
Captiva solutions support various ways of capturing documents:
- Batch capture: High-volume, dedicated capture from central locations
- Distributed batch capture: High-volume capture from branch offices in different geographies, running InputAccel servers locally or connecting Scan and Index modules over a WAN
- Network-attached MFPs: Ad hoc, low-volume, distributed among various locations
- Direct-attached scanner via Web client: Low-volume, ad hoc scanning from remote offices; can work in an offline environment
- Remote index client: Low-to-mid volume indexing at remote offices utilizing eInput
- Remote Scan and Index via direct connect to InputAccel server: Running InputAccel Scan and Index over a WAN
- Remote indexing (via Citrix): Utilizing Citrix to run InputAccel index clients at remote offices
Some of the key reasons and benefits sought by organizations include:
- Improve business processes: As more services are delivered at remote locations, it has become vital that the processes are improved for ingesting content at those remote locations instead of waiting for it to be shipped to a central location.
- Improve customer service: By improving their processes for capturing content locally, companies can deliver better customer service by having access to the content they need quicker.
- Reduced document-processing time: By expediting the capture of information and documents locally, downstream processes are improved because documents are not misplaced and information is validated further upstream, before the documents are sent off to
A key element of Captiva’s intelligent capture suite is its intelligent document recognition technologies—or IDR. These technologies dramatically enhance the ability to capture, organize, and transform any type of document into usable business data. Captiva advances the technology in three key areas: classify, extract, and validate.
Captiva intelligent capture solutions leverage the Digital Mailroom’s classification capabilities, enabling documents to be quickly identified based on their physical appearance—which is great for structured forms—or by the text content within the document—which is required for less structured documents, such as invoices or even legal documents. Captiva’s classification technologies dramatically reduce the work required to prepare documents to be captured, significantly reducing the costs associated with capturing documents and speeding the transformation from paper documents into business-ready information.
A key element of transactional solutions is data extraction, completely transforming a paper document into electronic business data. In some cases, this is achievable using traditional capture techniques, either with data entry operators, or by extracting data from known areas of the document, such as the upper right-hand corner. As more and different types of documents are being captured, the exact location of the information isn’t always known.
The third important component of IDR is validation. EMC’s customers tell us that the costs of finding inaccurate information later in the process are prohibitive, so as a best practice, EMC encourages customers to take advantage of several forms of validation to ensure accurate information is captured and delivered to back-end systems. This makes documents more findable, it makes your business processes more reliable, and of course, it saves time and money. Captiva features data validation against other data sources, such as ERP or other databases, and it enables organizations to compare data against business rules to ensure that it is captured accurately and meets expectations.
Together, Captiva’s IDR technologies advance document capture far beyond traditional capture solutions, providing far more value to customers by addressing many more applications and providing better transformation of documents into usable information. Organizations like State Farm are leveraging Captiva’s automated data extraction and validation capabilities to eliminate over 60% of the data entry operators required when they were doing manual data entry.
Note to Presenter: View in Slide Show mode for animation.
By providing customers with superior document classification technology, customers can establish much more useful capture processes. Captiva’s classification employs several techniques, both graphic-based and text-based, to classify all incoming documents with maximum speed and accuracy. Graphic techniques are extremely fast (over 40 pages per second) and very accurate for more structured documents, such as forms. Documents can be identified by looking at the entire page or by looking at a very specific location on the document. But graphic techniques will not work on all documents, so Captiva complements these techniques with text-based classification. Text-based classification recognizes some, or all, of a page and analyzes the text content, either looking for keywords that indicate the document type or comparing the text against sample documents. The combination of these techniques is what makes Captiva’s solutions stand out: high performance, high accuracy, and complete coverage enable customers to receive the maximum value from this solution.
Classification is a very important part of the capture process, and implementing these automated techniques delivers high value to organizations. Without automated classification, organizations must deploy employees to sort, identify, and organize the documents before they are scanned; with automated classification, the system is able to organize and route the documents to the correct locations and appropriate business processes. We can leverage both graphic- and text-based classification together, maximizing our coverage of different document types. The software quickly determines the distinct document types and, based on the business rules established, breaks those documents into folders. From there, the application is able to automatically extract the data based on the document types and organization of these documents.
Once complete, the entire set of documents and data can be delivered to the back-end systems in an organized manner. By leveraging this classification technology, the use of barcodes and separator sheets has been avoided—and the labor of pre-sorting these documents has been eliminated—so the documents can be organized as required, with folders, documents, and pages (up to eight levels of organization are supported) with minimal human intervention.
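Note to Presenter: for a technical audience, the keyword flavor of text-based classification described above can be illustrated with a short Python sketch. It is not Captiva’s algorithm. The document types, keyword lists, and the minimum-hit threshold are all assumptions chosen for the example, and the input is assumed to be text already produced by OCR.

```python
import re
from collections import defaultdict

# Hypothetical keyword sets per document type.
KEYWORDS = {
    "invoice": {"invoice", "amount", "due", "remit"},
    "claim": {"claim", "policy", "claimant", "diagnosis"},
}


def classify_by_keywords(text, keywords=KEYWORDS, min_hits=2):
    """Return the document type whose keywords appear most often in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {doc_type: len(words & kws) for doc_type, kws in keywords.items()}
    best_type, best_score = max(scores.items(), key=lambda item: item[1])
    # Too few keyword hits: leave the page for manual review.
    return best_type if best_score >= min_hits else "unclassified"


def group_into_folders(pages):
    """Group OCR text pages into folders keyed by the classified document type."""
    folders = defaultdict(list)
    for text in pages:
        folders[classify_by_keywords(text)].append(text)
    return dict(folders)
```

Pages that fall below the threshold land in an "unclassified" folder, which mirrors the point above: automation handles the bulk of the sorting and people only look at the exceptions.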
To maximize both speed and accuracy, Captiva leverages two intelligent data extraction technologies to transform paper images into usable business data. Zonal extraction relies upon a template that identifies which fields to capture and where they are located. This technique works well when the layout of forms is the same or where clear identifiers define the format. It is most frequently used for recurring document types, such as a claims form or a high-volume vendor invoice. Freeform extraction uses keywords and text analysis to identify and extract the data. Since it does not rely upon physical location, it is very flexible and can be used to extract information from different companies or individuals. For example, two vendors may submit invoices in different formats. Freeform technology allows common information to be extracted from each without requiring individual templates for each vendor. By leveraging both techniques, organizations are able to maximize performance, accuracy, and coverage across all document types. By automating the extraction of data from most documents, manual efforts are reduced, which generally leads to increased cost savings and reduced risk of human errors. Documents are also processed more rapidly than manually keying in data from these documents, which will further accelerate your downstream business processes.
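Note to Presenter: the difference between zonal and freeform extraction can be shown with a small Python sketch. This is an illustration under simplifying assumptions, not Captiva’s implementation: the OCR result is treated as plain text lines, zones are character positions on a line instead of image coordinates, and the field names, template, and patterns are hypothetical.

```python
import re

# Zonal: a hypothetical template giving (line index, start column, end column)
# for each field on one known form layout.
ZONAL_TEMPLATE = {
    "invoice_number": (0, 9, 15),
    "total": (1, 7, 14),
}

# Freeform: hypothetical keyword patterns that find a field wherever it appears.
FREEFORM_PATTERNS = {
    "invoice_number": re.compile(r"inv(?:oice)?\s*(?:no\.?|number|#)\s*:?\s*(\w+)", re.I),
    "total": re.compile(r"(?:total|amount due)\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.I),
}


def extract_zonal(lines, template=ZONAL_TEMPLATE):
    """Read each field from its fixed position; only works for the known layout."""
    result = {}
    for name, (row, start, end) in template.items():
        result[name] = lines[row][start:end].strip() if row < len(lines) else ""
    return result


def extract_freeform(text, patterns=FREEFORM_PATTERNS):
    """Locate each field by its label, regardless of where it sits on the page."""
    result = {}
    for name, pattern in patterns.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result
```

For example, two vendors with different layouts can share the same freeform patterns:

```python
vendor_a = "Invoice No: A1001\nTotal: $1,250.00"
vendor_b = "ACME Corp\nAmount due 980.50\nINV# B2002"
print(extract_freeform(vendor_a))
print(extract_freeform(vendor_b))
```

The zonal function is simpler and faster but needs one template per layout, which is why it suits recurring forms, while the freeform function trades some precision for coverage across vendors.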
Once data is extracted, its accuracy must be ensured. A number of methods are employed to make sure that the data is correct. Other business data sources may be leveraged to compare extracted data against expected data in other systems. In this example, the PO number is compared against the ERP system to ensure that the total is correct. Business rules can also be applied to ensure that patient records are in the correct format or range or that individual data fields “add up” as expected. When discrepancies arise—either with the accuracy with which OCR has recognized the data or with the numbers themselves not making sense—these documents can be routed to a manual validation process, where operators can inspect errors and make corrections, as needed. The bottom line is that errors with the extracted data are corrected early on, compared against the context of both the documents and the business systems, and accurately delivered to the next step in the process.
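Note to Presenter: the validation step can be sketched in Python as follows. This is a simplified illustration, not Captiva code. The ERP lookup is a stand-in dictionary, and the field names, the PO data, and the two routing outcomes are assumptions made for the example.

```python
from decimal import Decimal

# Hypothetical stand-in for an ERP lookup: PO number -> expected total.
ERP_PURCHASE_ORDERS = {
    "PO-1001": Decimal("150.00"),
}


def validate_invoice(invoice, erp_orders):
    """Return a list of problems found in the extracted invoice data."""
    problems = []

    # Check against another data source: the PO must exist and the totals must agree.
    expected_total = erp_orders.get(invoice["po_number"])
    if expected_total is None:
        problems.append("unknown PO number")
    elif expected_total != invoice["total"]:
        problems.append("total does not match ERP")

    # Business rule: the individual line items must add up to the total.
    if sum(invoice["line_items"], Decimal("0")) != invoice["total"]:
        problems.append("line items do not add up to total")

    return problems


def route(invoice, erp_orders):
    """Send clean data to export and anything with discrepancies to an operator."""
    return "manual_validation" if validate_invoice(invoice, erp_orders) else "export"
```

Amounts are held as Decimal values and not floats so that the "add up" rule compares exact currency values.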
Aside from providing interfaces directly to MS SharePoint, capture applications need to interface with other systems, whether in sequential or parallel processing. Organizations have other back-end systems that play a significant role and cannot be turned off immediately to bring the SharePoint repository in line. A phased implementation is therefore inevitable, and the capture solution needs to be able to work with these other back-end systems until all the problems and issues have been resolved.
Captiva capture solutions feature direct, API-to-API integration with a back-end system, providing better performance, more reliability, and more functionality than other solutions.
For example, Captiva’s integration with Documentum offers flexibility on where documents are stored, what index values are associated with each document, and who has permission to view them. In addition, when a document and its index data are delivered to Documentum, the exporter can initiate a workflow or business process to carry out further tasks associated with the review and approval of the information and documents. Captiva also integrates with a variety of other commonly used content management systems and business process systems. This connectivity makes Captiva a strong solution regardless of the back-end systems being used.
It’s important to understand that the use of standardized capture rules will benefit the organization as a whole. Corporations have been investing over the past several years to implement standards and best practices that reduce the time it takes to process transactions. While these efforts have improved business efficiency, they have also significantly changed the expectations of customers and business partners about how long a transaction should take. Whereas taking several days to process a transaction might have been acceptable several years ago, the expectation now is that transactions can occur in less than an hour, while on a phone call, or even instantaneously. This changing expectation highlights the issues with paper document processing. Today, established standards make paper digitization a lot easier and help identify limitations that can then be addressed and resolved.
So if you are considering SharePoint as a repository for storing captured images, here is a checklist.
Capturing from all sources – this needs to include scanners, but also MFPs, email, and network sources. You should be able to manage this environment as one enterprise capture deployment rather than having a variety of capture products in different silos.
Automatic data extraction – To reduce the cost associated with processing paper, manual steps like classifying a document and recognizing the data from a form or document need to be automated. These critical capabilities help organizations to shorten processing time, do more with fewer resources, reduce cost, eliminate processing errors, and often create a competitive advantage. For example, reading barcodes or performing optical character recognition (OCR) can deliver high value in routing documents to a particular SharePoint library, as well as capturing all the relevant index data that can be associated with the various documents. Doing it manually might cost you thousands or hundreds of thousands of dollars, require additional resources, and lengthen processing time considerably.
Intelligent document classification – if you are dealing with large volumes of documents, pre-sorting and preparing documents before capturing can be extremely time-consuming. Automatic classification can easily reduce the manual classification by upwards of 80%.
Rapid and flexible configuration – Even if SharePoint is your content repository of choice, you will still need to integrate with other systems outside of SharePoint. Enterprise capture should have flexible drag-and-drop process design environments, SDKs, and pre-built out of the box integration with databases, and other repositories.
Application integration – SharePoint might be your vision, but don’t forget about the other systems that make up your business operations. This could be your ERP system, your claims management, or any other business application that the capture application will also need to interface with.
Bottom line: Automate paper within the organization – there is no better way to reduce cost, improve productivity, and control risk than to automate paper.
Take an “Enterprise” view of capture to maximize business efficiency – you need to look across your business and think about what is needed to maximize efficiency. Simply putting a capture/scanning app on everyone’s desktop won’t always cut it.
One of the safest IT investments: ROI in under 12 months – You can rest assured that many companies have come before you and have shown that an ROI in a short timeframe is possible.
Capture extends the potential value of SharePoint and allows companies to focus on paper-based processes for greater efficiency and ROI – you are either thinking "I need to get paper into SharePoint" or you’re thinking "this sounds interesting." Either way, capture plus SharePoint can drive real cost savings and improve the productivity of your employees significantly.