Painless Document Scanning and Indexing with Alfresco
1. Painless Document Scanning and Indexing with Alfresco Share
February 15, 2011
Michael Trafton
Blue Fish Development Group
2. What We’ll Cover
Common Terminology
Problems with Existing Solutions
Making it Painless
Q & A
3. Introducing Blue Fish
Alfresco Certified Partner
Focused on ECM for 12 years
  Document Management and Collaboration
  Web Content Management
  Custom Application Development
  Content Migrations
Client Base
  Global 2000
  Growing Mid-Market Companies
Fixed Price Projects
  Alfresco Quick Start
  Turnkey ECM Solutions
  Alfresco Projects in "Small Bites"
4. Common Terminology
Scanning – Converting Paper into an Electronic File
Coding or Indexing – Applying Metadata Properties
OCR (Optical Character Recognition) – Creating Searchable Text
Workflow – Routing Electronic Document for Approval
Archiving – Importing into Alfresco and Filing in Right Folder
8. Pain Points w/ Existing Solutions
Centralized
Special Hardware and Software Required
  Scanners w/ Special Scanner Cards
  Dedicated Workstations
  Segmented Network
Expensive
Requires Special Training and Set-Up
This is Only Worth it for High-Volume Scanning
20. Document Indexing for Alfresco Share
An easy-to-use extension for Alfresco Share
Use your existing scanners
Index files from anywhere (even overseas)
Document viewer and properties side by side
Use existing Alfresco properties forms or create your own
Indexing queues pull documents from multiple folders into a single location
Automatically approve documents as you index them
Index more than just scanned documents
  Documents imported via CIFS, FTP, IMAP, etc.
21. Questions?
Contact Mike Trafton for more information
512-469-9300 x101
mikey@bluefishgroup.com
Editor's Notes
Some of you may already be familiar with scanning solutions, and you are just wondering if there’s an easy way to do it with Alfresco. Others may not know the first thing about scanning, but you’ve got some paper processes that you want to get a handle on. Let me bring the folks who may be new to this up to speed.
Of course, we all know that you can put some paper on a scanner and it will create a scanned image, and depending on the scanner and the software you are using, that image might be a PDF, a JPEG, or what we call a TIFF file. In any case, your scanner will turn your paper documents into electronic files, and you can store and manage these files in a document management system like Alfresco.
But there’s more to it than just creating an electronic copy of the file. There are actually 4 things you need to do to scan a document:
  Scan it
  Get it into Alfresco
  Add metadata
  File it in the right place
I’ve been doing this a long time, so let me explain how it’s typically done. Here’s a photo of a typical high-volume scanning setup: multiple high-speed scanners, and several workstations where people are processing the images.
This is a full-time operation: someone doing document prep (removing staples), someone working the scanner, and typically a team of other people processing the images. You might have one person who just does quality control and makes sure that the image is not skewed. And you have multiple people performing what we call the indexing step: they look at the image and type in the document properties or metadata. In fact, some of these systems have two people index the same document and compare the results. If they don’t match, the document gets rejected to be re-indexed.
Then, when all this is done, the image and the metadata are bulk loaded into a document management system where retention rules are applied, normal users can search for them, etc.
This model is what the industry means when it says “Scanning,” “Imaging,” or “Document Capture.” It is characterized by scanning and processing software that is separate from the document management system, expensive hardware and software, a dedicated network, and a centralized scanning operation (typically in the mail room or the finance department), AND THIS IS OVERKILL for most companies.
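The double-indexing check described above can be sketched in a few lines. This is only an illustration of the idea, not code from any capture product: the field names, the sample values, and the normalization rule (trim whitespace, ignore case) are all assumptions.

```python
def _normalize(value):
    """Treat missing values as empty and ignore case and surrounding spaces."""
    return "" if value is None else str(value).strip().lower()


def compare_double_keyed(first_pass, second_pass):
    """Return the sorted names of fields on which two indexers disagree."""
    fields = set(first_pass) | set(second_pass)
    return sorted(
        name
        for name in fields
        if _normalize(first_pass.get(name)) != _normalize(second_pass.get(name))
    )


def route(first_pass, second_pass):
    """Accept the document if both passes match, otherwise send it back."""
    return "re-index" if compare_double_keyed(first_pass, second_pass) else "accept"


# Hypothetical metadata typed in by two different people for the same image.
indexer_a = {"invoice_number": "INV-1001", "vendor": "Acme Corp ", "amount": "250.00"}
indexer_b = {"invoice_number": "INV-1001", "vendor": "acme corp", "amount": "205.00"}

mismatches = compare_double_keyed(indexer_a, indexer_b)
decision = route(indexer_a, indexer_b)
```

Here the two vendor entries differ only in case and spacing, so they count as a match, while the transposed digits in the amount send the document back for re-indexing.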
Centralized: There’s a single scanning system for the entire company, and you have to take your documents down to it.
Special hardware and software: You can’t just use any scanner you have sitting around. It has to support special drivers called TWAIN, and you typically need a dedicated network so that you don’t bog the rest of the network down.
Expensive: It wouldn’t be unusual to spend $25k - $50k or more for a scanning system like the one I described, because you have to buy each module separately: the scan station, the indexing station, an export module, a controller module.
Requires special training and set-up: Each kind of document you want to scan has to be set up with its own workflow and its own scanning screens. You can’t just walk up with a random document and scan it.
The bottom line is that it’s only worth it to use this kind of setup for high-volume document scanning. It’s for paper-intensive processes. You wouldn’t use it for just regular departmental documents.
With Alfresco, you can configure your desktop scanner or the printer/scanner/copier you already have to scan right into Alfresco. Let’s see what that looks like.
Here’s our scanner/printer at the Blue Fish office. Nothing fancy, just a Dell. And here’s how you scan: press this button (arrow) and then move the arrow to the preset, in this case “scans folder on the network.” Unbeknownst to the scanner, this is actually a folder in Alfresco.
Now let’s go find these documents in Alfresco. Here’s a folder I have defined to hold the scanned images. So you can see that the document was scanned in, and because I have a rule on this folder to set its document type, I can type in the metadata.
And all of this is out of the box. You just have to configure it.
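To make the folder rule concrete, here is a toy model of what happens when a scanned file lands in that folder. It is not Alfresco's API and does not talk to Alfresco at all: the `Document` and `Folder` classes, the "Scans" folder name, the "invoice" type, and the file name are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Document:
    name: str
    doc_type: str = "generic"          # type before any rule has run
    properties: dict = field(default_factory=dict)


@dataclass
class Folder:
    name: str
    rule_doc_type: Optional[str] = None  # type the folder's rule assigns, if any
    documents: list = field(default_factory=list)

    def add(self, doc):
        """File a document here, applying the folder's rule on the way in."""
        if self.rule_doc_type is not None:
            doc.doc_type = self.rule_doc_type
        self.documents.append(doc)
        return doc


# The scanner only sees a network folder; the rule runs when the file arrives.
scans = Folder("Scans", rule_doc_type="invoice")
scanned = scans.add(Document("scan0001.pdf"))

# Once the type is set, the matching metadata fields can be filled in by hand.
scanned.properties["vendor"] = "Acme Corp"
```

The point of the model is the ordering: the scanner knows nothing about document types, the rule assigns one on arrival, and only then does a person type in the metadata.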