Intelligent Enterprise featuring Transform

September 1998

Top of Forms: A Guide to 18 Systems

by David Wood

Which is the most expensive automatic data entry system of all? Answer: The one that doesn't work. This is an old joke in the forms processing arena, but it still applies today. Products are getting better and better at automating common data entry operations, but it's still possible to pick the wrong system for your application and expected demands.

How do you select the vendor and product that will be most successful for you?

This article will help you deal with this thorny issue by asking nine questions that will better define your application and requirements. Along the way, we'll point out some of the strengths of specific vendors, which should help you focus in on a few suppliers that are a strong potential fit for you.

The products we'll consider include two from recent newsmakers FormWare and Wheb Systems. In late July, the two companies announced they would merge to form one larger company, Captiva Software. The combined company will boast $25 million in sales and an installed base of 26,000 users. We cover both the FormWare line and the Captiva line (formerly from Wheb Systems), which will remain core products for the combined company.

We'll also consider Entrylink Bridge from Adaptive Solutions; Cardiff Software's Teleform Standard, Elite and Enterprise; ELA/NT from Com Com Systems; Transform from Dakota Imaging; Task Master 2000 from Datacap; the Automated Forms Processing system from Gtess; OCR For Forms from Microsystems Technology; Cartouche from RAF Technology; Eyes and Hands from ReadSoft; Formworks from Recognition Research; and VDE+ from Viking Software Services.

We'll cover two other systems that have recently been in the news. AFPS Pro Version 2.5 from Top Image Systems recently gained North American distribution through Dakota Imaging. We'll also look at Vista Capture from Southern Computer Systems, which was recently purchased by Scan Optics. For complete contact information for all of these companies, refer to the product comparison chart on page 34.

The first major challenge in buying a forms processing system is to understand the size, complexity and characteristics of your requirements. Forms processing software ranges in cost from $500 for a shrink-wrapped product to hundreds of thousands of dollars for a large, hand-tailored system. The scale and expense of your system should match the scope and cost of your data entry operation. Purchasing an inadequate system can be a larger mistake than investing in a "Cadillac" system because the former offers no return on investment.

1. How quickly will you recoup your investment? In general, forms processing systems should deliver a 100% return on investment in about one year in order to be worth implementing. ROI is delivered primarily through reducing the data entry labor cost. This is accomplished by:

  • Reducing data entry volume with automatic recognition,
  • Reducing labor cost by routing images to remote locations for data entry,
  • Improving data-entry productivity with key from image.

As the data entry volume goes up, the potential cost savings of automatic recognition also increase, making pricier products and customization more attractive. High-end vendors such as Dakota, Captiva Software (including both FormWare and Wheb Systems), GTESS, RAF, TIS, ReadSoft and Recognition Research can all tune their systems for high-volume capture, and those systems become cost effective once your manual data entry costs approach $500,000 per year.

If you are using full-time manual data entry operators in the U.S., your fully loaded data entry costs are probably close to $1.72 per 1,000 characters, according to The Association for Work Process Improvement (Boston, 617-426-1167). If personnel working on other tasks, such as accounts payable, perform data entry ad hoc, the cost can be considerably higher.
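To see what that rate means for your own operation, you can estimate annual keying cost from character volume. The sketch below is only illustrative: the $1.72 rate comes from the figure above, while the daily volume, characters per form and 250 working days are assumed inputs.

```python
# Rough annual cost of manual data entry from character volume.
# Only the $1.72 per 1,000 characters rate comes from the article;
# the volumes and the 250-day work year are hypothetical.

COST_PER_1000_CHARS = 1.72  # fully loaded U.S. cost cited above


def annual_keying_cost(forms_per_day, chars_per_form, work_days=250,
                       cost_per_1000=COST_PER_1000_CHARS):
    """Return the estimated yearly cost in dollars of keying every character."""
    chars_per_year = forms_per_day * chars_per_form * work_days
    return chars_per_year / 1000 * cost_per_1000


if __name__ == "__main__":
    # Hypothetical shop: 5,000 forms a day, 200 characters keyed per form.
    print(f"${annual_keying_cost(5000, 200):,.0f} per year")
```

Under these assumed volumes the estimate comes to about $430,000 per year, which is the kind of figure at which the high-end systems start to merit a look.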

Moving data entry overseas or to the rural U.S. can cut labor costs by more than 50% without reducing turnaround time. Many vendors support remote distribution of images, though FormWare was the first and probably still has the most reference accounts. Managing any remote data entry site depends on having reliable and adequate operator statistics. Therefore, look for the capability to generate reports detailing operator-level productivity, such as total characters entered by operator, time on task by operator and accuracy by operator.
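The operator-level report described above amounts to a simple roll-up of keying sessions. Here is a minimal sketch of that roll-up; the `KeyingSession` record and its field names are hypothetical and do not reflect any vendor's report format.

```python
from collections import defaultdict
from typing import NamedTuple


class KeyingSession(NamedTuple):
    """One block of keying work (hypothetical record layout)."""
    operator: str
    chars_entered: int
    chars_wrong: int        # errors caught in verification
    minutes_on_task: float


def operator_report(sessions):
    """Aggregate per-operator totals: characters, time on task, accuracy, speed."""
    totals = defaultdict(lambda: {"chars": 0, "wrong": 0, "minutes": 0.0})
    for s in sessions:
        t = totals[s.operator]
        t["chars"] += s.chars_entered
        t["wrong"] += s.chars_wrong
        t["minutes"] += s.minutes_on_task

    report = {}
    for name, t in totals.items():
        report[name] = {
            "total_chars": t["chars"],
            "minutes_on_task": t["minutes"],
            # Accuracy is the share of characters keyed correctly.
            "accuracy": 1 - t["wrong"] / t["chars"] if t["chars"] else None,
            "chars_per_hour": t["chars"] / (t["minutes"] / 60) if t["minutes"] else None,
        }
    return report
```

If a product cannot give you at least these three numbers per operator, comparing a remote site against your in-house staff will be guesswork.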

A forms processing system might also be cost justified if data completeness can be improved and/or if data is required so quickly that manual methods are impractical. For example, long-distance telephone companies typically generate $0.50 per day for each of their residential customers. For a large company signing up thousands of new customers, entering customer information the same day the order is received more than cost justifies an automated data entry system.
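The arithmetic behind the telephone example is easy to run for your own numbers. In this sketch only the $0.50 per customer per day comes from the text; the customer count and days saved are assumptions chosen for illustration.

```python
REVENUE_PER_CUSTOMER_DAY = 0.50  # figure cited in the article


def revenue_gained(new_customers, days_saved,
                   revenue_per_day=REVENUE_PER_CUSTOMER_DAY):
    """Revenue captured by activating new customers `days_saved` days sooner."""
    return new_customers * days_saved * revenue_per_day


if __name__ == "__main__":
    # Hypothetical: 10,000 sign-ups a month, entered three days sooner.
    print(revenue_gained(10_000, 3))
```

With those assumed inputs, faster entry is worth $15,000 a month before any labor savings are counted.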

Captiva Software, the leader in key-from-image technology, has historical data indicating that the average productivity of manual key entry can be improved by about 25% in most applications by using its software. MTI, NCS, Wheb Systems and Recognition Software corroborate that finding with their own case studies, and they offer competitive key-from-image capabilities as well. (See the sidebar on page 38 for more detail on KFI functionality and products.)

2. Where is the data coming from? All forms processing products can scan paper documents and lift data from the images, but not all can handle fax images, Internet forms or EDI input. Each input type requires capture, validation and export, and each has its own unique challenges. The vendors that are able to accept EDI input and unify it with a data stream from paper forms in the health insurance industry today are Dakota Imaging and Microsystems Technology (MTI).

At this writing, Cardiff, Com Com Systems, Dakota Imaging, MTI and NCS have the only products that can accept input from both paper and electronic forms on the Internet or by email. This feature can be handy if you are collecting the same information on your web site and also from paper forms, such as order forms or market research surveys.

If you collect the same information from both your web site and paper forms, you should consider products that permit the creation of both paper and electronic HTML- or Java Applet-based versions at the same time. MTI and Cardiff both support this capability. The end result will be higher processing accuracy and reduced overall system costs.

If your application will be primarily or entirely fax based and you will design your own forms, you should look at Cardiff's Teleform Standard. This is the least expensive product (retail $1,495) in this set, and it was designed specifically for this application.

3. Do you control form design? If your application is direct mail, time cards, surveys or loan, job or school applications, you may be able to modify the form design to reduce data entry cost. A properly designed form -- printed in an OCR dropout color and including boxes for each hand-printed character required -- offers vastly improved recognition accuracy over its black-and-white equivalent without handprint constraints. Your supplier should be willing and able to work with you to redesign your forms and/or create entirely new ones. Some will charge for their time while others will include the effort as part of the system price.

If your form is well designed and OCRable, data entry automation should be able to reduce the amount of data entry by more than 75%. Given this potential, recognition, validation and export capabilities are the most critical aspects of the forms processing products for these types of applications.

Some products that are especially strong in these areas are Task Master 2000 from Datacap, OCR for Forms from Microsystems Technology, Teleform Elite and Standard from Cardiff Software, FormWorks from Recognition Research, Accra from NCS, Eyes and Hands from ReadSoft and Captiva from Captiva Software.

4. Are you stuck with non-OCRable forms? Most forms in the real world are not designed for automatic processing, and the design of your form may be beyond your control. The keys to effective processing of black and white forms are strong image pre-processing, fast "key from image" and data verification capabilities specifically suited to your requirements.

The major challenge for automatic recognition of data on black-and-white forms is reading characters that are mixed into the form itself -- especially characters touching the lines of the boxes they are supposed to be in. Most vendors attack this problem by using image pre-processing to remove the pre-printed form image, but others (RAF, Gtess, and ReadSoft with Eyes and Hands) have modified their recognition algorithms to read characters even with interference. Some of those who use image pre-processing license third-party toolkits for this purpose.

Regardless of the technical approach, most products available today are surprisingly accurate, though they may not have a high confidence rating for the read on a character touching a line. You should insist on detailed benchmarks of your forms as well as at least three installed reference accounts you can speak with prior to committing to a vendor.

Both Teleform Elite from Cardiff Software and OCR for Forms from MTI let you try multiple image pre-processing settings and then select the best results depending upon the success of the OCR. This is critical in cases where the field may or may not contain dot-matrix printed characters; the software can try with and without the assumption that the characters are dot matrix and then select the result with the highest confidence.
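The try-several-settings approach described above boils down to "read the field once per setting, keep the most confident result." The sketch below shows that selection logic only. It is not either vendor's implementation, and `recognize` is a hypothetical callback standing in for whatever OCR engine you use.

```python
from typing import Callable, Iterable, NamedTuple


class ReadResult(NamedTuple):
    text: str
    confidence: float  # 0.0 to 1.0, as reported by the recognizer
    setting: str


def best_read(field_image, settings: Iterable[str],
              recognize: Callable[[object, str], tuple]) -> ReadResult:
    """Run `recognize` once per pre-processing setting; keep the most confident read.

    `recognize(field_image, setting)` is a hypothetical callback that must
    return a (text, confidence) pair.
    """
    results = [ReadResult(*recognize(field_image, s), s) for s in settings]
    if not results:
        raise ValueError("at least one pre-processing setting is required")
    return max(results, key=lambda r: r.confidence)


def fake_recognize(image, setting):
    """Stand-in for a real OCR engine: returns canned results per setting."""
    canned = {"standard": ("1O4.5O", 0.62), "dot_matrix": ("104.50", 0.91)}
    return canned[setting]


if __name__ == "__main__":
    print(best_read(None, ["standard", "dot_matrix"], fake_recognize))
```

The trade-off is processing time: every extra setting is another recognition pass, so it pays to reserve this for fields where the print type really is unpredictable.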

A problem with many forms that were not designed for automated forms processing is that the data is tightly packed together and can be difficult or impossible to separate reliably by drawing a box for each field. For example, many invoices, purchase orders or insurance claims have a set of line items with data in columns. ReadSoft offers a "matrix" feature that lets you define the entire area as a single field and then parse the data as required after recognition. This enhances capture results, improves the ability to accommodate variation in the form layout and reduces the labor required to define the form.
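To make the read-first, parse-later idea concrete, here is a small sketch that splits the recognized text of one large field into line items. It is not ReadSoft's "matrix" feature; the column layout (quantity, description, unit price, line total) and the regular expression are assumptions for illustration.

```python
import re

# Hypothetical column layout: quantity, description, unit price, line total.
LINE_ITEM = re.compile(
    r"^\s*(?P<qty>\d+)\s+(?P<desc>.+?)\s+"
    r"(?P<unit>\d+\.\d{2})\s+(?P<total>\d+\.\d{2})\s*$"
)


def parse_line_items(block_text):
    """Split the recognized text of one large field into line-item records.

    Returns (items, rejects); rejects are lines that did not fit the layout.
    """
    items, rejects = [], []
    for line in block_text.splitlines():
        if not line.strip():
            continue  # skip blank lines between items
        m = LINE_ITEM.match(line)
        if m:
            items.append({
                "qty": int(m["qty"]),
                "desc": m["desc"],
                "unit": float(m["unit"]),
                "total": float(m["total"]),
            })
        else:
            rejects.append(line)  # candidates for key-from-image correction
    return items, rejects
```

Because the parser works on text rather than on fixed pixel zones, a supplier who adds or drops a line item does not force you to redefine the form.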

In many applications, most or all of the data will have to be manually entered. The target information may be too intermixed with the pre-printed information. The information may be handwritten or printed by a poor-quality output device, such as a dirty line printer. Since data-entry labor cost will be the major expense of these systems, the key from image (KFI) functionality becomes the most important feature to evaluate. Captiva Software (including both the FormWare and Captiva lines), MTI, Dakota Imaging and Recognition Research all offer top-quality key from image capabilities.

All forms processing software vendors have developed features specific to particular customers or applications, and they can make a huge difference in system effectiveness. For example, health insurance customers reading HCFA forms benefit greatly from the ability to export their data as a data stream compliant with the NSF EDI format standard. Insurance claims is such an important vertical market that several of the studied vendors have produced vertical products specifically targeting this application (Cardiff Mediclaim, RAF Cartouche Medical, Recognition Research ClaimWorks, ELA for HCFA 1500 from Com Com Systems). Others have integrated complete HCFA 1500 insurance claim validation functionality into their products (Dakota Imaging, Datacap, Captiva Software, Gtess, MTI).

Gtess has specialized in the transportation market and so offers both specialized validation features for waybills and the ability to read driver logs automatically. ReadSoft has developed specific products for address parsing and invoice processing. Almost all vendors license the Postalsoft address verification routines for city, state and zip code validation. Adaptive Solutions offers a unique add-on hardware module that lets it utilize the Kodak Imagelink 70 Microfilmer as a scanner, eliminating the cost of purchasing a new scanner for that group of users. You should make sure you carefully evaluate the validation features of any product you are considering and understand exactly how to apply them to your form.

5. How many different types of forms will you process? Some systems are designed to process only a single form or a very small number of known forms. Others can handle batches with large numbers of different forms. The key to this capability is automatic form identification, and the state of the art is topological processing. Such software identifies forms by looking for clues such as lines, line intersections, logos and the placement of logos and blocks of pre-printed text.

Products including FormWare, Teleform Elite, Accra, Eyes and Hands, OCR for Forms, Formworks, Entrylink Bridge and AFPS Pro all use topological processing for form identification. Some back this up with a secondary form identification process that reads the form name or form number in a specified zone on the page. MTI goes a step further by permitting hierarchical structures using libraries of form IDs specific to various applications.
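The two-stage identification described above (layout clues first, printed form number second) can be pictured with a toy example. This is a deliberately simplified stand-in, not how any of the named products work: it assumes the layout clues have already been extracted as labels, and it scores templates by plain set overlap with an arbitrary 0.8 threshold.

```python
def jaccard(a, b):
    """Overlap between two collections of clues, from 0.0 to 1.0."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def identify_form(page_features, templates, zone_text=None, threshold=0.8):
    """Return the best-matching template ID, or None if nothing matches.

    page_features: layout clues found on the page (hypothetical labels).
    templates:     {form_id: iterable of the clues expected for that form}.
    zone_text:     text read from the form-number zone, used as a fallback.
    """
    scored = [(jaccard(page_features, feats), form_id)
              for form_id, feats in templates.items()]
    if scored:
        score, form_id = max(scored)
        if score >= threshold:
            return form_id
    # Secondary check: does the printed form number name a known template?
    if zone_text and zone_text.strip() in templates:
        return zone_text.strip()
    return None
```

When you benchmark vendors, ask what happens in the last branch: a page that matches nothing should be routed to an operator, not forced into the nearest template.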

If the number of different forms your application needs to handle is very high, the time required to define, manage and modify those forms can become a significant portion of the system cost. MTI's OCR for Forms is clearly the leader in the ability to simply and quickly modify a form definition. Eyes and Hands for Invoices, a new product from ReadSoft, automatically builds a library of form identifications over time for each supplier's invoice.

6. How many form variants do you have to deal with? If you process standard forms that have been in use for more than a year or two, you almost certainly will find form variants in the document flow. Variants can be the result of internal actions, such as the marketing department changing the design of a direct mail order form or mortgage application to test whether it improves response rates. Other variants are created by changes in requirements. For example, accident report forms may be changed to reflect new regulatory requirements. Finally, users may create variants by photocopying the original form or sending out copies to a local printer to produce additional copies.

Variants usually look identical to the original to a person, but most forms processing software is not that smart. Captiva's FormWare, OCR For Forms, Task Master 2000, Teleform Enterprise and FormWorks all have some flexibility programmed in to permit the location of data slightly outside of the specified zone, but this feature has its limitations. OCR For Forms also has an interactive "debug" feature that reduces the time required to program variants or tune the form definition in real time. However, each variant is still classified as a separate form with a separate ID number. When you are planning or implementing a system, make sure that you identify your variants and know how to handle them in the processing stream.

Within the vertical market for processing health claims (HCFA 1500 forms), Dakota and Cardiff (with its newly shipping Mediclaim product) are able to process HCFA 1500 forms without separately defining all of the variants. Other vendors, such as Recognition Research, Adaptive Solutions, GTESS, Captiva Software with FormWare, NCS, Com Com Systems and Datacap, have pre-defined libraries of HCFA variants, reducing the implementation time for that application.

7. What are your data entry requirements? Picture two companies with different forms processing demands. One is a shipper that captures 5,000 pages per day, reading only a single barcode from each form. The other is an insurance company that scans only 500 pages per day but reads up to 200 characters located in dozens of zones that require a mix of recognition technologies and validations. The insurance company clearly has the tougher challenge.

Products created for general document capture are best if your scanning volume is high and your data entry requirement is low. But the forms processing products mentioned throughout this article are a better choice if you are reading many characters and/or fields per form. These products vary in pricing and functionality, but in general they cost more and offer more functionality than the generic capture products.

8. Which vendors are good long-term choices? The automated forms processing market is growing and changing extremely quickly. The impact of the Internet and improved recognition technology promises to accelerate that change over the next three years. In this environment, it is important for you to protect your investment by selecting vendors who will be able to incorporate new technology and keep up with the industry.

This gives major vendors an advantage over smaller competitors, even if they are not necessarily "the best" in every feature category right now. If you are purchasing from a vendor who does not have thousands of currently installed accounts, you should be confident that they are focused directly on your vertical application and that their products add specific value for your requirements.

9. What is your budget? Most forms processing systems are cost justified based on saving money relative to the current manual data entry system you have in place.

It is reasonable to expect a good automated system to generate at least 20% annual savings over your current cost. As a rule of thumb, you should also set your system budget at this amount. If, for example, you spend $100,000 per year on data entry, expect to spend in the neighborhood of $20,000 on your system.

If you've done your homework and chosen wisely, you'll get a one-year payback. Since the system is a working asset paid for by operating cost reductions, you might be able to get a lease with monthly payments that are less than the cost savings achieved.
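The rule of thumb above is simple enough to check in a few lines. This sketch uses the 20% savings rate from the text; the optional `system_cost` parameter is an added assumption that lets you test a quote that differs from the rule-of-thumb budget.

```python
def budget_and_payback(annual_entry_cost, savings_rate=0.20, system_cost=None):
    """Apply the rule of thumb: system budget equals expected annual savings.

    Returns (annual_savings, system_budget, payback_years).
    """
    annual_savings = annual_entry_cost * savings_rate
    budget = annual_savings if system_cost is None else system_cost
    payback_years = budget / annual_savings if annual_savings else float("inf")
    return annual_savings, budget, payback_years


if __name__ == "__main__":
    print(budget_and_payback(100_000))                      # rule-of-thumb budget
    print(budget_and_payback(100_000, system_cost=22_000))  # a pricier quote
```

For the $100,000 example, the rule gives a $20,000 budget and a one-year payback; a hypothetical $22,000 quote stretches the payback to about 1.1 years.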

Be sure to include enough money in the implementation budget to cover any hardware purchases, training, customization and support required.

Most vendors price their products based on the number of modules required and the volume of forms being processed. They generally charge separately for consulting time and services, so be sure you solicit and receive proposals outlining the total cost for the system, not just the product cost.

In forms processing software, you get what you pay for. Spending 10% more for the Cadillac system will lengthen your payback time a little but could very well decrease the risk of the new system substantially.

Above all, make sure that your budget and the product you buy with it are adequate to meet all of your current and future forms processing system needs.

--David Wood is president of David Wood Associates (Boulder Creek, CA 408-338-1551).

A Publication of the Network Computing Enterprise Architecture Group
Brought to you by CMP Media LLC, Copyright © 2005