September 1998
Top of Forms: A Guide to 18 Systems
by David Wood
Which is the most expensive automatic data entry system of all? Answer:
The one that doesn't work. This is an old joke in the forms processing
arena, but it still applies today. Products are getting better and better
at automating common data entry operations, but it's still possible to
pick the wrong system for your application and expected demands.
How do you select the vendor and product that will be most successful
for you?
This article will help you deal with this thorny issue by asking nine
questions that will better define your application and requirements.
Along the way, we'll point out some of the strengths of specific vendors,
which should help you focus in on a few suppliers that are a strong
potential fit for you.
The products we'll consider include two from recent newsmakers,
FormWare and Wheb Systems. In late July, the two companies announced they
would merge to form one larger company, Captiva Software. The combined
company will boast $25 million in sales and an installed base of 26,000
users. We cover both the FormWare and Captiva (formerly Wheb Systems)
lines, which will remain core products for the combined company.
We'll also consider Entrylink Bridge from Adaptive Solutions; Cardiff
Software's Teleform Standard, Elite and Enterprise; ELA/NT from Com Com
Systems; Transform from Dakota Imaging; Task Master 2000 from Datacap;
Automated Forms Processing system from Gtess; OCR For Forms from
Microsystems Technology; Cartouche from RAF Technology; Eyes and Hands
from ReadSoft; Formworks from Recognition Research; and VDE+ from Viking
Software Services.
We'll cover two other systems that have recently been in the news.
AFPS Pro Version 2.5 from Top Image Systems recently gained North
American distribution through Dakota Imaging. We'll also look at Vista
Capture from Southern Computer Systems, which was recently purchased by
Scan Optics. For complete contact information for all of these companies,
refer to the product comparison chart on page 34.
The first major challenge in buying a forms processing system is to
understand the size, complexity and characteristics of your requirements.
Forms processing software ranges in cost from $500 for a shrink-wrapped
product to hundreds of thousands of dollars for a large, hand-tailored
system. The scale and expense of your system should match the scope and
cost of your data entry operation. Purchasing an inadequate system can be
a larger mistake than investing in a "Cadillac" system because the
former offers no return on investment.
1. How quickly will you recoup your investment? In general,
forms processing systems should deliver a 100% return on investment in
about one year in order to be worth implementing. ROI is delivered
primarily through reducing the data entry labor cost. This is
accomplished by:
- Reducing data entry volume with automatic recognition,
- Reducing labor cost by routing images to remote locations for data
entry,
- Improving data-entry productivity with key from image.
As the data entry volume goes up, the potential cost savings of
automatic recognition also increase, making pricier products and
customization more attractive. High-end vendors such as Dakota, Captiva
Software (including both FormWare and Wheb Systems), GTESS, RAF, TIS,
ReadSoft and Recognition Research all can tune their systems for
high-volume capture and become cost effective if your manual data entry
costs approach $500,000 per year.
If you are using full-time manual data entry operators in the U.S.,
your fully loaded data entry costs are probably close to $1.72 per 1,000
characters, according to The Association for Work Process
Improvement (Boston, 617-426-1167). If personnel working on other
tasks, such as accounts payable, perform data entry ad hoc, the cost can
be considerably higher.
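As a rough illustration of how the per-character rate turns into an annual figure, the short Python sketch below multiplies out a workload. The $1.72 rate is the one quoted above; the form counts, characters per form and number of working days are hypothetical placeholders to be replaced with your own numbers.

```python
# Rough estimate of annual manual data entry cost.
# RATE_PER_1000_CHARS is the figure quoted in the article; every
# workload number used below is a hypothetical example.

RATE_PER_1000_CHARS = 1.72  # dollars, fully loaded


def annual_entry_cost(forms_per_day, chars_per_form, working_days=250,
                      rate_per_1000=RATE_PER_1000_CHARS):
    """Return the estimated yearly cost in dollars of keying every character."""
    chars_per_year = forms_per_day * chars_per_form * working_days
    return chars_per_year / 1000 * rate_per_1000


if __name__ == "__main__":
    # Hypothetical workload: 5,000 forms a day, 200 keyed characters each.
    cost = annual_entry_cost(forms_per_day=5000, chars_per_form=200)
    print(f"Estimated annual data entry cost: ${cost:,.0f}")
```

With these assumed inputs the estimate comes to about $430,000 a year, which is in the range where the high-end systems start to pay off.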
Moving data entry overseas or to the rural U.S. can cut labor costs by
more than 50% without reducing turnaround time. Many vendors support
remote distribution of images, though FormWare was the first and probably
still has the most reference accounts. Managing any remote data entry
site depends on having reliable and adequate operator statistics.
Therefore, look for the capability to generate reports
detailing operator level productivity, such as total characters entered
by operator, time on task by operator and accuracy by operator.
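To make the reporting requirement concrete, here is a minimal Python sketch of the kind of per-operator summary described above. It is not taken from any vendor's product; the record layout (the keys `operator`, `chars_entered`, `chars_in_error` and `seconds_on_task`) is an assumption made for illustration.

```python
from collections import defaultdict


def operator_report(batches):
    """Summarize per-operator totals from a list of batch records.

    Each record is a dict with hypothetical keys:
    'operator', 'chars_entered', 'chars_in_error', 'seconds_on_task'.
    """
    totals = defaultdict(lambda: {"chars": 0, "errors": 0, "seconds": 0})
    for b in batches:
        t = totals[b["operator"]]
        t["chars"] += b["chars_entered"]
        t["errors"] += b["chars_in_error"]
        t["seconds"] += b["seconds_on_task"]

    report = {}
    for op, t in totals.items():
        hours = t["seconds"] / 3600
        report[op] = {
            "total_chars": t["chars"],
            "hours_on_task": hours,
            # Guard against division by zero for empty records.
            "chars_per_hour": t["chars"] / hours if hours else 0.0,
            "accuracy": 1 - t["errors"] / t["chars"] if t["chars"] else 0.0,
        }
    return report
```

A report like this, run per site and per shift, is the sort of evidence you would want before trusting a remote keying operation with production work.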
A forms processing system might also be cost justified if data
completeness can be improved and/or if data is required so quickly that
manual methods are impractical. For example, long-distance telephone
companies typically generate $.50 per day for each of their residential
customers. For a large company signing up thousands of new customers,
entering their customer information the same day as receipt of their
order more than cost justifies an automated data entry system.
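The arithmetic behind that example is simple enough to sketch in a few lines of Python. The $0.50 daily figure is the one quoted above; the customer count and the delay are hypothetical, and the sketch assumes that service and billing begin only once an order has been keyed.

```python
REVENUE_PER_CUSTOMER_PER_DAY = 0.50  # dollars, figure quoted in the text


def revenue_deferred(new_customers, days_of_delay,
                     daily_revenue=REVENUE_PER_CUSTOMER_PER_DAY):
    """Revenue pushed back when order entry lags receipt by some days.

    Assumes (hypothetically) that service and billing start only once
    the order has been keyed.
    """
    return new_customers * days_of_delay * daily_revenue


if __name__ == "__main__":
    # Hypothetical campaign: 10,000 sign-ups keyed three days late.
    print(f"${revenue_deferred(10000, 3):,.2f} deferred")
```

Under those assumptions, a three-day backlog on 10,000 sign-ups defers $15,000 in revenue, and the loss repeats with every campaign.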
Captiva Software, the leader in key-from-image technology, has
historical data indicating that the average productivity of manual key
entry can be improved by about 25% in most applications by using its
software. MTI, NCS, Wheb Systems and Recognition Research corroborate
that finding with their own case studies, and they offer competitive
key-from-image capabilities as well. (See the sidebar on page 38 for more
detail on KFI functionality and products.)
2. Where is the data coming from? All forms processing products
can scan paper documents and lift data from the images, but not all can
handle fax images, Internet forms or EDI input. Each input type requires
capture, validation and export, and each has its own unique challenges.
The vendors that are able to accept EDI input and unify it with a data
stream from paper forms in the health insurance industry today are Dakota
Imaging and Microsystems Technology (MTI).
At this writing, Cardiff, Com Com Systems, Dakota Imaging, MTI and NCS
have the only products that can accept input from both paper and
electronic forms on the Internet or by email. This feature can be handy
if you are collecting the same information on your web site and also from
paper forms, such as order forms or market research surveys.
If you collect the same information from both your web site and paper
forms, you should consider products that permit the creation of both
paper and electronic HTML- or Java Applet-based versions at the same
time. MTI and Cardiff both support this capability. The end result will
be higher processing accuracy and reduced overall system costs.
If your application will be primarily or entirely fax based and you
will design your own forms, you should look at Cardiff's Teleform
Standard. This is the least expensive product (retail $1,495) in this
set, and it was designed specifically for this application.
3. Do you control form design? If your application is direct
mail, time cards, surveys or loan, job or school applications, you may be
able to modify the form design to reduce data entry cost. A properly
designed form -- printed in an OCR dropout color and including boxes for
each hand-printed character required -- offers vastly improved
recognition accuracy over its black-and-white equivalent without
handprint constraints. Your supplier should be willing and able to work
with you to redesign your forms and/or create entirely new ones. Some
will charge for their time while others will include the effort as part
of the system price.
If your form is well designed and OCRable, data entry automation
should be able to reduce the amount of data entry by more than 75%. Given
this potential, recognition, validation and export capabilities are the
most critical aspects of the forms processing products for these types of
applications.
Some products that are especially strong in these areas are Task
Master 2000 from Datacap, OCR for Forms from Microsystems Technology,
Teleform Elite and Standard from Cardiff Software, FormWorks from
Recognition Research, Accra from NCS, Eyes and Hands from ReadSoft and
Captiva from Captiva Software.
4. Are you stuck with non-OCRable forms? Most forms in the real
world are not designed for automatic processing, and the design of your
form may be beyond your control. The keys to effective processing of
black-and-white forms are strong image pre-processing, fast "key from
image" and data verification capabilities specifically suited to your
requirements.
The major challenge for automatic recognition of data on
black-and-white forms is reading characters that are mixed into the form
itself -- especially characters touching the lines of the boxes they are
supposed to be in. Most vendors attack this problem by using image
pre-processing to remove the pre-printed form image, but others (RAF,
Gtess and ReadSoft, with Eyes and Hands) have modified their recognition algorithms to
read characters even with interference. Some of those who use image
pre-processing license third-party toolkits for this purpose.
Regardless of the technical approach, most products available today
are surprisingly accurate, though they may not have a high confidence
rating for the read on a character touching a line. You should insist on
detailed benchmarks of your forms as well as at least three installed
reference accounts you can speak with prior to committing to a vendor.
Both Teleform Elite from Cardiff Software and OCR for Forms from MTI
let you try multiple image pre-processing settings and then select the
best results depending upon the success of the OCR. This is critical in
cases where the field may or may not contain dot-matrix printed
characters; the software can try with and without the assumption that the
characters are dot matrix and then select the result with the highest
confidence.
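The "try several settings, keep the best" idea can be sketched in Python as follows. This is not Cardiff's or MTI's implementation; `recognize` stands in for whatever recognition engine is in use and is assumed to return a text string with a confidence score between 0.0 and 1.0. The `fake_recognize` function and its canned results exist only so the sketch can run.

```python
def best_read(image, settings, recognize):
    """Run `recognize` once per setting and keep the most confident result.

    `recognize(image, setting)` is a hypothetical callable returning a
    (text, confidence) pair, with confidence between 0.0 and 1.0.
    Returns (setting, text, confidence), or None if `settings` is empty.
    """
    best = None
    for setting in settings:
        text, confidence = recognize(image, setting)
        if best is None or confidence > best[2]:
            best = (setting, text, confidence)
    return best


def fake_recognize(image, setting):
    """Stand-in recognizer with canned results, for demonstration only."""
    canned = {
        "plain": ("1Z34S", 0.62),
        "dot_matrix": ("12345", 0.91),
    }
    return canned[setting]


if __name__ == "__main__":
    print(best_read(None, ["plain", "dot_matrix"], fake_recognize))
```

The cost of this approach is one extra recognition pass per setting, so it is usually reserved for fields where the print type genuinely cannot be known in advance.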
A problem with many forms that were not designed for automated forms
processing is that the data is tightly packed together and can be
difficult or impossible to separate reliably by drawing a box for each
field. For example, many invoices, purchase orders or insurance claims
have a set of line items with data in columns. ReadSoft offers a "matrix"
feature that lets you define the entire area as a single field and then
parse the data as required after recognition. This enhances capture
results, improves the ability to accommodate variation in the form layout
and reduces the labor required to define the form.
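The general idea of capturing a whole table area and parsing it afterward can be illustrated with a short Python sketch. It does not reflect how ReadSoft's feature works internally; the column layout assumed here (quantity, description, unit price, extended amount) is a hypothetical example, and lines that do not fit are set aside for an operator.

```python
import re

# One line item: quantity, free-text description, unit price, extended amount.
# This layout is an assumption chosen for illustration.
LINE_ITEM = re.compile(
    r"^\s*(?P<qty>\d+)\s+(?P<desc>.+?)\s+"
    r"(?P<unit>\d+\.\d{2})\s+(?P<total>\d+\.\d{2})\s*$"
)


def parse_line_items(block_text):
    """Split the recognized text of a whole table area into line items.

    Lines that do not match the assumed layout are returned separately
    so an operator can key them from the image.
    """
    items, rejects = [], []
    for line in block_text.splitlines():
        if not line.strip():
            continue  # skip blank lines in the recognized block
        m = LINE_ITEM.match(line)
        if m:
            items.append({
                "qty": int(m.group("qty")),
                "desc": m.group("desc"),
                "unit": float(m.group("unit")),
                "total": float(m.group("total")),
            })
        else:
            rejects.append(line)
    return items, rejects
```

Because the parsing rule looks at the content of each line rather than fixed pixel positions, it tolerates rows that drift up or down from one supplier's layout to the next.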
In many applications, most or all of the data will have to be manually
entered. The target information may be too intermixed with the
pre-printed information. The information may be handwritten or printed by
a poor-quality output device, such as a dirty line printer. Since
data-entry labor cost will be the major expense of these systems, the key
from image (KFI) functionality becomes the most important feature to
evaluate. Captiva Software (including both the FormWare and Captiva
lines), MTI, Dakota Imaging and Recognition Research all offer
top-quality key from image capabilities.
All forms processing software vendors have developed features specific
to particular customers or applications, and they can make a huge
difference in system effectiveness. For example, health insurance
customers reading HCFA forms benefit greatly from the ability to export
their data as a data stream compliant with the NSF EDI format standard.
Insurance claims processing is such an important vertical market that several of the
studied vendors have produced vertical products specifically targeting
this application (Cardiff Mediclaim, RAF Cartouche Medical, Recognition
Research ClaimWorks, ELA for HCFA 1500 from Com Com Systems). Others have
integrated complete HCFA 1500 insurance claim validation functionality
into their products (Dakota Imaging, Datacap, Captiva Software, Gtess,
MTI).
Gtess has specialized in the transportation market and so offers both
specialized validation features for waybills and the ability to read
driver logs automatically. ReadSoft has developed specific products for
address parsing and invoice processing. Almost all vendors license the
Postalsoft address verification routines for city, state and zip code
validation. Adaptive Solutions offers a unique add-on hardware module
that lets it utilize the Kodak Imagelink 70 Microfilmer as a scanner,
eliminating the cost of purchasing a new scanner for that group of users.
You should make sure you carefully evaluate the validation features of
any product you are considering and understand exactly how to apply them
to your form.
5. How many different types of forms will you process? Some
systems are designed to process only a single form or a very small number
of known forms. Others can handle batches with large numbers of different
forms. The key to this capability is automatic form identification, and
the state of the art is topological processing. Such software identifies
forms by looking for clues such as lines, line intersections, logos and
the placement of logos and blocks of pre-printed text.
Products including FormWare, Teleform Elite, Accra, Eyes and Hands,
OCR for Forms, Formworks, Entrylink Bridge and AFPS Pro all use
topological processing for form identification. Some back this up with a
secondary form identification process that reads the form name or form
number in a specified zone on the page. MTI goes a step further by
permitting hierarchical structures using libraries of form IDs specific
to various applications.
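A much-simplified Python sketch of this two-stage identification follows. Real products extract lines, intersections and logos from the image itself; here those features are assumed to be already extracted and represented as simple tuples, the overlap score is a plain Jaccard measure chosen for illustration, and the form IDs and the 0.6 threshold are hypothetical.

```python
def jaccard(a, b):
    """Overlap between two feature sets: size of intersection over union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def identify_form(page_features, templates, zone_text=None, threshold=0.6):
    """Return the ID of the best-matching template, or None.

    page_features: set of layout features found on the scanned page.
    templates: dict mapping form ID -> set of expected features.
    zone_text: optional text read from a form-number zone, used as a
               fallback when no template scores at or above `threshold`.
    """
    best_id, best_score = None, 0.0
    for form_id, features in templates.items():
        score = jaccard(page_features, features)
        if score > best_score:
            best_id, best_score = form_id, score
    if best_score >= threshold:
        return best_id
    # Secondary identification: trust the printed form number if it is known.
    if zone_text is not None and zone_text.strip() in templates:
        return zone_text.strip()
    return None
```

Grouping templates into separate dictionaries per application, and searching only the relevant one, would be one simple way to approximate the hierarchical libraries mentioned above.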
If the number of different forms your application needs to handle is
very high, the time required to define, manage and modify those forms can
become a significant portion of the system cost. MTI's OCR for Forms is
clearly the leader in the ability to simply and quickly modify a form
definition. Eyes and Hands for Invoices, a new product from ReadSoft,
automatically builds a library of form identifications over time for each
supplier's invoice.
6. How many form variants do you have to deal with? If you
process standard forms that have been in use for more than a year or two,
you almost certainly will find form variants in the document flow.
Variants can be the result of internal actions, such as the marketing
department changing the design of a direct mail order form or mortgage
application to test whether it improves response rates. Other variants
are created by changes in requirements. For example, accident report
forms may be changed to reflect new regulatory requirements. Finally,
users may create variants by photocopying the original form or sending
out copies to a local printer to produce additional copies.
Variants usually look identical to the original to a person, but most
forms processing software is not that smart. Captiva's FormWare, OCR For
Forms, Task Master 2000, Teleform Enterprise and FormWorks all have some
flexibility programmed in to permit the location of data slightly outside
of the specified zone, but this feature has its limitations. OCR For
Forms also has an interactive "debug" feature that reduces the time
required to program variants or tune the form definition in real time.
However, each variant is still classified as a separate form with a
separate ID number. When you are planning or implementing a system, make
sure that you identify your variants and know how to handle them in the
processing stream.
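The zone flexibility described above can be pictured with a small Python sketch. It is a generic illustration rather than any vendor's algorithm: zones and detected data are modeled as pixel rectangles, and the 15-pixel tolerance is an arbitrary assumed value.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """A rectangle in pixel coordinates."""
    left: int
    top: int
    right: int
    bottom: int


def expand(zone, tolerance):
    """Grow a zone by `tolerance` pixels on every side."""
    return Box(zone.left - tolerance, zone.top - tolerance,
               zone.right + tolerance, zone.bottom + tolerance)


def contains(outer, inner):
    """True if `inner` lies entirely inside `outer`."""
    return (outer.left <= inner.left and outer.top <= inner.top and
            outer.right >= inner.right and outer.bottom >= inner.bottom)


def field_found(zone, data_box, tolerance=15):
    """Accept data that sits inside the zone or within `tolerance` pixels of it."""
    return contains(expand(zone, tolerance), data_box)
```

The limitation is visible in the sketch: a field that moves farther than the tolerance, as it might on a redesigned variant, falls outside the expanded zone and needs its own form definition.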
Within the vertical market for processing health claims (HCFA 1500
forms), Dakota and Cardiff (with its newly shipping Mediclaim product)
are able to process HCFA 1500 forms without separately defining all of
the variants. Other vendors, such as Recognition Research, Adaptive
Solutions, GTESS, Captiva Software with FormWare, NCS, Com Com Systems
and Datacap, have pre-defined libraries of HCFA variants, reducing the
implementation time for that application.
7. What are your data entry requirements? Picture two companies
with different forms processing demands. One is a shipper that captures
5,000 pages per day reading only a single barcode from each form. The other
is an insurance company that scans only 500 pages per day but reads
up to 200 characters located in dozens of zones that require a
mix of recognition technologies and validations. The insurance company
clearly has the tougher challenge.
Products created for general document capture are best if your
scanning volume is high and your data entry requirement is low. But the
forms processing products mentioned throughout this article are a better
choice if you are reading many characters and/or fields per form. These
products vary in pricing and functionality, but in general they cost more
and offer more functionality than the generic capture products.
8. Which vendors are good long-term choices? The automated
forms processing market is growing and changing extremely quickly. The
impact of the Internet and improved recognition technology promises to
accelerate that change over the next three years. In this environment, it
is important for you to protect your investment by selecting vendors who
will be able to incorporate new technology and keep up with the industry.
This gives major vendors an advantage over smaller competitors, even
if they are not necessarily "the best" in every feature category right
now. If you are purchasing from a vendor who does not have thousands of
currently installed accounts, you should be confident that they are
focused directly on your vertical application and that their products add
specific value for your requirements.
9. What is your budget? Most forms processing systems are cost
justified based on saving money relative to the current manual data entry
system you have in place.
It is reasonable to expect a good automated system to generate at
least 20% annual savings over your current cost. As a rule of thumb, you
should also set your system budget at this amount. If, for example, you
spend $100,000 per year on data entry, expect to spend in the
neighborhood of $20,000 on your system.
If you've done your homework and chosen wisely, you'll get a one-year
payback. Since the system is a working asset paid for by operating cost
reductions, you might be able to get a lease with monthly payments that
are less than the cost savings achieved.
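The rule of thumb and the lease test can be written out as a short Python sketch. The 20% savings rate and the $100,000 example come from the text above; the lease payment is a hypothetical figure, and the functions are simple arithmetic rather than a full financial model.

```python
def annual_savings(annual_entry_cost, savings_rate=0.20):
    """Expected yearly savings, using the 20% rule of thumb by default."""
    return annual_entry_cost * savings_rate


def payback_years(system_cost, yearly_savings):
    """Years needed for savings to cover the total system cost."""
    return system_cost / yearly_savings


def lease_is_self_funding(monthly_payment, yearly_savings):
    """True if the lease payment is below the average monthly savings."""
    return monthly_payment < yearly_savings / 12


if __name__ == "__main__":
    savings = annual_savings(100_000)             # $20,000 a year
    print(payback_years(20_000, savings))         # budget equal to savings
    print(payback_years(22_000, savings))         # a system costing 10% more
    print(lease_is_self_funding(1_500, savings))  # hypothetical lease payment
```

A system bought at the rule-of-thumb budget pays back in one year, while one costing 10% more takes about 1.1 years under the same assumed savings.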
Be sure to include enough money in the implementation budget to cover
any hardware purchases, training, customization and support required.
Most vendors price their products based on the number of modules
required and the volume of forms being processed. They generally charge
separately for consulting time and services, so be sure you solicit and
receive proposals outlining the total cost for the system, not just the
product cost.
In forms processing software, you get what you pay for. Spending 10%
more for the Cadillac system will lengthen your payback time a little but
could very well decrease the risk of the new system substantially.
Above all, make sure that your budget and the product you buy with it
are adequate to meet all of your current and future forms processing
system needs.
--David Wood is president of David Wood Associates (Boulder
Creek, CA 408-338-1551).