Intelligent Enterprise featuring Transform
October, 1996

TAMING THE IMAGE CAPTURE BEAST

Getting documents into your image system is called capture. It's the most expensive part of the process. It turns stacks of paper into retrievable images. To do this, imaging vendors have developed some truly wild capture products. Here's a snare to help you grab the right one by the tail.

Your boss dumps 500 customer invoices on your desk. Your mission: Turn them into electronic images for an imaging system. Sounds simple enough. Just stack 'em up, run 'em through a high-speed scanner and you're done, right?

Not quite. What if some of the invoices stick together or scan upside down? What if a few are creased or scanned with bent corners? What if several of them are too dark, too light or otherwise illegible? You could rescan them all, but without some control over the process, it would be hard to correct all the problems.

Even if all your paper scans perfectly, it doesn't mean much if you can't find an image quickly when you need it. This brings us to the next question: How are you going to index these invoices? Manually entering an index to match each invoice number is slow and laborious. It's also expensive -- you have to pay someone to do it.

Now think about what this task would be like with 5,000 invoices, 50,000 invoices or even 500,000 invoices. It sounds like a nightmare. But it doesn't have to be. With the right technology, you can tame the image capture beast and make the process work for you rather than against you.

We like to call capture applications "cap apps." They process documents in groups called "batches." A batch is a group of similar documents (invoices, tax forms, bills of lading, etc). The cap app divides the capture process into several functions -- scanning, image cleanup, indexing, quality control and exporting are the most common. Batches move from function to function based on how the cap app is configured.
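The batch flow described above can be pictured as a configurable list of steps. The following is a minimal sketch, not any vendor's design: the `Batch` class, the `run_capture` function and the placeholder steps are all hypothetical names used for illustration. Only the step names come from the article.

```python
from dataclasses import dataclass, field


@dataclass
class Batch:
    """A group of similar documents that moves through capture together."""
    name: str
    pages: list
    completed: list = field(default_factory=list)


def run_capture(batch, workflow):
    """Send the batch through each configured function, in order."""
    for step_name, step in workflow:
        step(batch)
        batch.completed.append(step_name)
    return batch


def scan(batch):
    # Placeholder: a real step would drive the scanner.
    batch.pages = [f"{batch.name}-page-{n}" for n in range(1, 4)]


def no_op(batch):
    # Placeholder for cleanup, indexing, quality control and export.
    pass


# The workflow is plain data, so reordering or dropping a step is a
# configuration change rather than a code change.
WORKFLOW = [
    ("scan", scan),
    ("cleanup", no_op),
    ("index", no_op),
    ("quality control", no_op),
    ("export", no_op),
]

batch = run_capture(Batch("invoices", []), WORKFLOW)
```

Keeping the workflow as a list mirrors the idea that batches move from function to function according to how the cap app is configured.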

After the capture process is complete, images travel to other components of the imaging/workflow/document management system for storage, editing, routing, tracking, synchronizing and other activities.

There are five major advantages of this approach:

1. Faster scanning. Without a cap app, a scanning operator typically applies an index and does a quick quality check as each document is scanned. That's expensive, because it reduces the speed of your scanner.

With a cap app, the first step is to turn paper into images as fast as possible. Other capture steps come later.

This lets your scanner strut its stuff. It frees your operator to do other things until the scanner is done. And it may eliminate the need to add more scan stations to handle your document backlog.

2. Better images. Most cap apps let you clean up (enhance) your images several ways. This includes deskewing (straightening out) crooked images, improving contrast, cropping out unwanted borders and blanking out irrelevant sections. Use a variety of filters to erase stray pixels picked up during scanning.

The obvious advantage of image enhancement is improved readability. There are less obvious advantages as well. Cleaned and deskewed images improve OCR accuracy. They reduce compressed image file size by up to 80%. This lowers storage costs and improves network performance.

3. Easier indexing. Without a cap app, you index using manual data entry. A cap app automates indexing to reduce the costs and error rates of manual key input.

Automate indexing in one of several ways:

  • Link your index function to your database.

  • Use OCR to read a specified index zone on each document.

  • Apply barcodes during document preparation.

Done correctly, automated indexing is less expensive. It's also often more accurate than manual key entry.

    4. Tighter quality control. In the document capture process, the quality control (QC) step does two things -- it confirms that all the pages were captured and it makes sure they're all readable. Cap apps expedite quality control. Operators can view several images at a time. The cap app flags missing or unreadable pages and schedules them for rescanning. (In high-volume environments, this request often goes to an off-line scanning station to avoid slowing down the primary high-speed scanners and operators.) The cap app puts the rescanned pages in the right order.

    5. Better process monitoring. About the only way you can manage the capture process without a cap app is by walking from computer to computer to complete each capture step. You then write down where each document is as you go. Very tedious. Cap apps show you a document's position in a batch, how much of the batch has been processed and so on. Some cap apps display a directory tree showing each page as it's scanned. Most let you interrupt and modify a capture step or task if necessary. You control the flow.

    The Three Layers of Every Capture Application

    Each cap app on the market boasts a few unique features to make it stand out from the competition. But most are designed according to the same basic three-layer model. (The actual name given to each layer varies from vendor to vendor.)

    1. The supervisory layer (often called administration). Here's where you configure the system to process your document batches -- your way. Use the supervisory layer to define document types, designate index fields, select indexing methods, design your capture workflow, set up user IDs, select your scanner setup and so on.

    If the capture process were a construction project, the supervisory layer would be the project engineer. It determines and controls the overall process for turning your paper into images.

    2. The batch management layer. Yup. Does just what you'd expect given the thing's name: manages batches as they travel from one capture module to the next.

    Use this layer to monitor the status of the batch in process or determine where the batch is headed next. Most cap apps have a workflow status screen showing which pages have been processed by which capture module.

    Think of the batch management layer as the cap app's foreman. It keeps track of all the details and makes sure the capture process is done according to the engineer's (supervisor layer's) specifications.

    3. The task layer. Here's where the actual capture work gets done. Modules in the task layer control the scanner, do OCR, verify index data, update the index database, clean up shoddy images, and so on.

    Think of the task layer as a crew of skilled laborers marching to the tune of the cap app's foreman (batch management layer). These guys roll up their sleeves and push your images through the system.

    Bearable Ways to Add Capture

    There are two ways you can add an image capture front-end to your imaging, document management and workflow system:

    1) Build the front-end yourself using cap app development tools or 2) Buy a packaged cap app.

    Here are some of your choices for each option.

    Build-It-Yourself Products. Since there are plenty of packaged cap apps out there that do a good job, why build your own? The main reason is that you can tailor it to your exact capture requirements.

    Most packaged products are designed to appeal to a broad spectrum of imaging users -- "one size fits all." Not all products include all the features you want. Others have features you don't need (but are forced to pay for).

    Another reason you may want to build your own cap app is so you can upgrade individual parts of the capture front-end without making wholesale modifications to the entire app. Want better image enhancement? Just tweak your code or plug in a module from a third-party vendor. No need to do a complete upgrade or buy a new front-end.

    One way to build a cap app is with a toolkit. A toolkit is a set of programming function calls (known officially as an Application Programming Interface, or API) that lets system developers control the features of today's high-speed scanners.

    Programmers choose the modules they need to perform the desired cap app functions. Then they incorporate these modules into a user interface with such programming languages as C, Visual C++ and Visual Basic.

    Image capture toolkits include the following:

    TMSSequoia's (Stillwater, OK 405-377-0880) ScanDirector is available as a C library for Windows, Windows 95, Unix and as a VBX for Visual Basic and Visual C++. Both versions of the toolkit are bundled with CompressDirector, a high-level API that provides compression in several popular image formats.

    Another of their toolkits, ScanFix, lets you incorporate a variety of cool image enhancement features. (ScanFix also comes as an app that sits on top of the toolkit.)

    Barcode reading is the forte of Seaport Imaging's (San Jose, CA 408-366-6400) AutoPilot 2.0. It reads more than 10 types of barcodes (and up to 50 barcodes per page) at any angle. AutoPilot supports several real-time image processing functions like automatic cropping and skew correction, when used with Seaport's IP20 Image Processor Subsystem. It works with most popular programming environments and production scanners. (Seaport also markets two apps based on the VB implementation of AutoPilot: SeaScan for production scanning and ShipShape for production image processing.)

    Toolkits offer the most design flexibility. But using them isn't exactly duck soup. Unless you have some really good object-oriented programmers available, you could wind up with a hapless cap app.

    Other build-it-yourself products are easier to use, such as InputAccel from Cornerstone Imaging (San Jose, CA 408-325-3800). InputAccel is an open, scalable product based on client/server technology that's optimized for Windows NT. It offers customization features similar to what you get with toolkits. But instead of writing code, you plug task modules into the application.

    Here's how it works. When you buy InputAccel, you get the top two cap app layers: Supervisor ("engineer") and Server (Cornerstone's name for the Batch Management or "foreman" layer). You select the task modules ("skilled laborers") that best meet your capture needs.

    Cornerstone offers a variety of task modules. (The program ships with preconfigured task modules -- just load and run right out of the box if you don't need customization.)

    Other task modules come from a variety of third-party vendors.

    The big advantage is you don't need programming. Once you've loaded third-party modules onto the system, InputAccel recognizes them at the Supervisor layer. Now, they're ready to be incorporated into the capture process.

    Another cap app development tool that uses interchangeable task modules is Capture from Imagination Software (Silver Spring, MD 301-588-8411). One major difference between Capture and InputAccel is the products' target markets. InputAccel appeals mainly to mid-range and high-end corporate settings and to production environments. Capture is designed for the desktop user.

    Capture's new low-cost ($50-$300) versions are for end users with modest scanning volumes. The source code and toolkit versions ($450-$2,000) are for developers or integrators who want to customize the capture front-end or extend the product to include workflow and document management.

    The 16-bit version of Capture supports 35 third-party task modules. Note: the 32-bit version is now under development.

    Packaged Capture Products

    If rapid deployment and ease of use are more important to you than flexibility and customization, take a look at some of the packaged capture products.

    Their advantages include quick setup (some products include the hardware if you want it) and single-vendor support. In case something breaks, you make one phone call. With a multi-vendor modular system, it's not always clear which module is the source of the problem. So you call everybody. Not fun.

    Packaged cap apps are going scalable. Some vendors make multiple versions of their product for different volume requirements. Others make it easy to scale up. Some do both.

    Other packaged cap app vendors focus on the high-end market. If you push mountains of paper through your capture system every day, these products are for you. For departmental apps, it's like shooting a fly with a tank.

    We've divided the packaged cap app vendors into two categories: broad market products and high-end production products. (These are marketing and not functional categories.) Broad market products normally support high-end production capture, but they place a stronger emphasis on other segments of the capture marketplace than the high-end production vendors do.

    Packaged Cap Apps -- Broad Market

    Intrafed (Potomac, MD 301-315-0240) sold the first packaged cap app to hit the marketplace, PowerScan. PowerScan has captured (pun intended) a strong following. Users range from the CIA to Sallie Mae. For speed and reliability, this product sets the standard. (That's why we gave it a Product of the Year Award last year.)

    PowerScan runs on Windows and Unix. It's compatible with almost all SCSI scanners. Quality control is one of its key strengths. During scanning, operators can view a predictable sampling of images (every 10th image, for example) to determine scan quality -- without slowing down the scanner. It flags scanned batches with missing pages by comparing the number of pages in the batch with the number of pages scanned. If the second number is less than the first, the batch goes back for rescanning.
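The two quality checks just described (a predictable sample and a page-count comparison) are simple enough to sketch. This is an illustration only, not PowerScan's code; the function names `sample_for_review` and `needs_rescan` are hypothetical.

```python
def sample_for_review(images, every=10):
    """Return every Nth image (the 10th, 20th, ...) for a spot check."""
    return images[every - 1::every]


def needs_rescan(pages_in_batch, pages_scanned):
    """Flag the batch when fewer pages were scanned than it contains."""
    return pages_scanned < pages_in_batch


# Example: a 25-page batch where one page stuck to its neighbor.
images = list(range(1, 25))          # stand-ins for 24 scanned images
to_review = sample_for_review(images)
flagged = needs_rescan(pages_in_batch=25, pages_scanned=len(images))
```

Because the sample is taken by position rather than at random, an operator always knows which images to expect on screen.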

    After scanning, images move to Intrafed's image processing module called StageWorks. Quality control is outstanding here as well. View up to 12 images simultaneously. Zoom, pan and rotate. Insert, delete or replace images. StageWorks maintains the original positions in the batch.

    StageWorks comes in three versions. StageWorks Office is a single-user version suitable for departmental apps. StageWorks Production is for pilot projects and small service bureaus. StageWorks Industrial is for big users who need the highest throughput they can get.

    The product is fully scalable. Start with Office and move up to Industrial as your volume and throughput needs grow. Intrafed also sells a bundled system with hardware.

    Ascent Capture from Kofax (Irvine, CA 714-727-1733) handles batch scanning, image processing, OCRing and document indexing. You can run it on a single workstation -- or 20 workstations. The system's unique load-balancing feature maintains a constant level of work distribution across all stations.
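The article does not say how Ascent Capture balances its load, so the following is only a sketch of one common approach: hand each new batch to whichever station currently has the least work. The `distribute` function and the page counts are assumptions made for illustration.

```python
import heapq


def distribute(batch_sizes, station_count):
    """Assign each batch to the station with the smallest current load."""
    # Heap entries are (pages already assigned, station number).
    heap = [(0, station) for station in range(station_count)]
    heapq.heapify(heap)
    assignment = {station: [] for station in range(station_count)}
    for size in batch_sizes:
        load, station = heapq.heappop(heap)
        assignment[station].append(size)
        heapq.heappush(heap, (load + size, station))
    return assignment


# Four batches (sizes in pages) spread over two stations.
plan = distribute([50, 30, 20, 10], station_count=2)
```

With these numbers, station 0 receives the 50-page and 10-page batches and station 1 receives the 30-page and 20-page batches, so each ends up with 60 and 50 pages respectively.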

    Set up multiple document classes before scanning. Define document batches with job separator sheets. Print barcodes directly on the pages. Use the product's GUI to select and arrange the processing modules you want to use for each batch.

    Create indices with barcodes, OCR and manual data entry. Validation scripts and manual verification provide increased index accuracy. The validation scripts let the index module reject any characters not prespecified as "legal" elements of an index (a letter, for example, will be rejected by a ZIP code field). Manual verification lets two index operators enter the same index sequentially. The index is verified if both operators enter the same data.
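Both accuracy checks can be sketched in a few lines. This is a minimal illustration, not Kofax's scripting interface: the names `ZIP_LEGAL`, `illegal_characters` and `double_key_verified` are hypothetical, and the set of legal ZIP code characters is an assumption.

```python
# Assumed set of characters allowed in a ZIP code field.
ZIP_LEGAL = set("0123456789-")


def illegal_characters(value, legal):
    """Return the characters in value that the field does not allow."""
    return [ch for ch in value if ch not in legal]


def double_key_verified(first_entry, second_entry):
    """The index is verified only if both operators keyed the same data."""
    return first_entry.strip() == second_entry.strip()


# A letter "O" keyed in place of a zero is rejected by the ZIP field.
rejected = illegal_characters("9021O", ZIP_LEGAL)

# Two operators key the same invoice number; a mismatch fails the check.
match = double_key_verified("INV-1001", "INV-1001 ")
mismatch = double_key_verified("INV-1001", "INV-1007")
```

Stripping surrounding whitespace before comparing is a design choice here, so that a stray trailing space does not count as a keying disagreement.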

    Another of Ascent Capture's strengths is its export module. Kofax calls it Release. (Export means sending scanned images and/or related data to a back end imaging or workflow system for permanent storage.) Transfer your images and related indices directly to the backend system. Or you can export the data to any ODBC-compliant database. Modifiable scripts control these processes to tighten the integration with your backend product.

    En Masse from Avatar Technologies (Sterling, VA 703-450-3880) is a fully scalable, client/server-based Windows cap app that includes standard capture modules and an API for custom application development. It supports most scanner interface cards and drivers. En Masse exports to more than 100 text retrieval systems, databases and image management systems.

    The Job Control Module lets you create file names, select image file format conversion and configure the OCR engine. Use the module's "At a Glance" feature to view scanned images and track each page through the capture process.

    Want faster throughput? Set the Image Scanning Module to landscape mode and rotate to portrait. This module lets you flip through a series of thumbnail images for easy editing of image folders.

    The Quality Control Module is where you enter and edit barcodes for indexing and QC. A wide range of enhancement features include deskew, rotation, text inversion (inverts region of white text on a black background) and page inversion (inverts page color). Other modules include OCR, Verification, Folder Archive and Export.

    En Masse comes in four versions. Advanced Lit-Pac edition is a single-station package for prototype review and pilot projects. Advanced is the basic network version for mid-level image capture and conversion. Advanced Plus adds server versions of the enhancement and OCR modules and support for the Kodak Imagelink scanners. Professional adds support for the Kodak Imagelink 9XX and others.

    Packaged Cap Apps: High-End

    BSCAN from Image Access (Boca Raton, FL 407-995-8334) weaves together a capture application, a scanner, a high- resolution display and an image processing board to create a high-volume, high-speed capture system with diverse indexing and image processing skills. Customization features let you tailor the system to your specific capture needs.

    BSCAN's customers include trucking and transportation companies that like its barcode recognition. The system reads 12 different types of barcodes in most positions. It can even handle multiple barcodes coexisting on a single document.

    Another strength is index validation. Validate index information using a simple database report file of all the valid values for an index entry. Alternatively, you can tell the system to communicate interactively with your central database (via DDE and DLL interfaces) for index validation.

    BSCAN's correlation feature pulls other information associated with an index from the database and joins it to the image for later use. The same feature expedites manual indexing by identifying and filling in the correct index after the operator enters a few characters.
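The lookup behind this kind of feature can be sketched with a dictionary standing in for the central database. Everything here is hypothetical (the `RECORDS` data, `complete_index` and `correlate`), and BSCAN's actual DDE and DLL interfaces are not modeled.

```python
# Made-up index values and associated data, standing in for a database.
RECORDS = {
    "INV-1001": {"customer": "Acme"},
    "INV-1002": {"customer": "Baker"},
    "INV-2001": {"customer": "Cole"},
}


def complete_index(prefix, records):
    """Fill in the index once the typed prefix matches exactly one entry."""
    matches = [key for key in records if key.startswith(prefix)]
    return matches[0] if len(matches) == 1 else None


def correlate(index_value, records):
    """Fetch the data associated with an index; None means it is invalid."""
    return records.get(index_value)


filled = complete_index("INV-2", RECORDS)      # unique match
ambiguous = complete_index("INV-1", RECORDS)   # two candidates, keep typing
details = correlate("INV-1001", RECORDS)
```

The same table serves both purposes described above: a value that is absent from it fails validation, and a value that is present brings its associated data along with the image.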

    For quality control, BSCAN puts up to 16 images on a 20-inch dual-page monitor. If image quality is acceptable, push the "POST" key and the images go directly into the image management system.

    DocuLex for Windows from DocuLex (Winterhaven, FL 941-297-3691) is a cap app for litigation support service providers and backfile conversion. Its Admin module tracks and sets up each capture project. This module controls multiple scanning stations.

    The Capture module lets you identify images and documents with a custom ID number. Each ID can be up to 18 characters long and subdivided into eight zones. The ID was designed to track documents in legal cases, but you can also use it as an audit number to track an image back to the original document in a commercial setting.

    Here's how it works: Use Zone 1 to represent a box number, Zone 2 for a folder number and Zone 3 for the page position in the folder. Put barcode separator sheets between every folder and every box.

    When the folder separator sheet is scanned, the system tells the operator to change the folder number. (The page position number automatically increments until a new folder number is entered.) When the box separator sheet is scanned, the system notifies the operator and halts until someone (such as the operator) enters the right number.
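The box, folder and page scheme can be sketched as follows. This is an illustration under stated assumptions, not DocuLex's implementation: the three six-character zone widths are invented (the article only says the ID is up to 18 characters in up to eight zones), the page counter is assumed to restart with each new folder, and the box and folder numbers arrive with the separator sheet rather than from an operator prompt.

```python
ZONE_WIDTHS = (6, 6, 6)  # hypothetical widths for box, folder, page


def make_id(box, folder, page):
    """Build an 18-character ID by zero-padding each zone."""
    parts = (box, folder, page)
    return "".join(str(v).zfill(w) for v, w in zip(parts, ZONE_WIDTHS))


def assign_ids(stream):
    """Walk a scan stream of separator sheets and pages, numbering pages."""
    box = folder = page = 0
    ids = []
    for kind, number in stream:
        if kind == "box":
            box, folder, page = number, 0, 0
        elif kind == "folder":
            folder, page = number, 0   # assumed: page count restarts
        else:
            page += 1                  # page position increments itself
            ids.append(make_id(box, folder, page))
    return ids


stream = [
    ("box", 1), ("folder", 1), ("page", None), ("page", None),
    ("folder", 2), ("page", None),
]
ids = assign_ids(stream)
```

Each resulting ID points back to a physical location, which is what makes it usable as an audit number for tracing an image to its original document.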

    Another great Capture module feature is sound. By plugging a sound card into the scan station, you audibly instruct operators to perform an action.

    Audible feedback is also given to ensure that the system accurately understands what the operator is trying to do. Other modules include OCR, Production Print, QC and Image Viewer.

    DpuScan from J&K Imaging (Tucker, GA 770-414-0010) is designed for the J&K DPU 16+ document processing board. The board lets you perform a number of "on-the-fly" functions like barcode search, image rotation and deskewing -- without slowing down the scanner. It supports scan rates of up to 500 images per minute.

    Barcode support is one of DpuScan's strongest features. Search for several barcode types simultaneously during scanning (the system recognizes 12 types). Recognition is very fast -- about 50 milliseconds. The software lets you specify barcode search parameters (height, stripe thickness, minimum and maximum stripes required for a meaningful barcode and so on). This improves recognition of minuscule, scattered or partially corrupted barcodes.

    Another nice feature: Define document classes and save the configuration (scanner parameters, barcode types, file path) for each class. Just load the configuration file the next time you scan the same class and you're off and running.

    Photomatrix's (Culver City, CA 310-417-3800) Vision Series 6000 Image Capture System is great if you're looking for a tight fit between a cap app and capture hardware.

    The system consists of:

  • A high-speed (100 ppm) Photomatrix Vision Series 6000 duplex scanner.

  • A host PC with monitor.

  • A modular cap app for high-end service bureau and corporate needs. Modules include VisionCapture, VisionQC, VisionKey and VisionFlow.

    VisionCapture controls selection of predefined document types and sizes, scanner options, index record definition, file path, file format and so on. VisionQC provides quality control functions such as onscreen viewing and page flip. Enhancement tools such as deskewing, edge cropping and rotating are included in this module.

    VisionKey is for index data entry. Zone OCR and barcode recognition are also available. VisionFlow schedules and tracks modules through all the capture and image processing steps. Connectivity modules transfer images to a FileNet IMS server running WorkFlo or to a Unix server. You can also export to document management applications including Concordance, InMagic, JFS and SearchExpress.

    Eureka Software Solutions' (Austin, TX 512-459-9292) QC Factory is for service bureaus and high-end corporate imaging users. QC Factory is a Windows product with open architecture and independently scalable components. Quality control (QC) and image enhancement are among the product's highlights. Flip through pages and documents for online review. Rescan, delete, rotate, crop and deskew. Merge and break up documents to resolve scan-time errors. Append and insert new pages and documents -- including pages with oversized formats such as engineering plans and documents.

    Image enhancement, OCR, printing and exporting are offloaded for batch processing to improve capture efficiency. Managers can monitor and modify any batch-processing queue. The View Factory, an image processing module, also integrates with The QC Factory.

    Although both products can be used out of the box, they are often customized by the vendor to meet individual specifications and needs.

    SeaScan from Seaport Imaging uses Seaport's image processing board to rotate, deskew, crop and register on the fly, without degrading scanner performance.

    Use SeaScan's "To Do List" to configure the capture process with a few mouse clicks. Define and save capture job definitions and use them over and over.

    SeaScan's novel "Ensigns" (special barcoded pages) control your scanner on the fly. Interleave Ensigns into your document batch and automatically set contrast, dpi, simplex or duplex scanning. They'll also control the naming and routing of document image files -- without interrupting the scanner.

    SeaScan automatically generates an audit file each time you run the capture module. Audit files contain a record of the processing run -- barcode data, path names and process parameters for each scanned image, plus error information. Use the audit file to troubleshoot errors that are flagged during processing.

    Seaport also has another document image processing product called ShipShape. Enhanced versions of both products (SeaScan Plus and ShipShape Plus) are available from Optimis Systems (Fort Collins, CO 970-226-3466). The Optimis products are more robust than Seaport's and better suited to high-end production capture.

    With all these front-end capture products to choose from, you're sure to find something your organization can go wild about.



    A Publication of the Network Computing Enterprise Architecture Group
    Brought to you by CMP Media LLC, Copyright © 2005