Intelligent Enterprise featuring Transform
March 2002

A Thin Take on Data Capture

by Penny Lunt

SourceNet Solutions, a provider of payroll and accounts payable services, was coping with major changes. One of the company's biggest customers, a major energy trading organization, had recently imploded. SourceNet was moving its headquarters from Houston to College Station, TX, in an effort to cut costs. The company was also seeking to change its business model.

"Outsourcing can be a hard sell at times," says Matthew Childress, product manager. "You have to be able to prove that you can do things less expensively and more efficiently than the customer can in-house. We wanted to be able to slice and dice our services into an array of offerings and let our customers pick and choose."

For example, customers could have their own employees do the scanning, data entry, data verification and workflow-based approval for invoices and time sheets, with SourceNet simply hosting the processing software, ASP-style. At the other extreme, customers could have SourceNet handle all the data entry and workflow steps and feed the final results into their enterprise resource planning (ERP) systems.

To improve efficiency and flexibility, SourceNet is implementing a TaskMaster 5.0 data capture system from Datacap, Tarrytown, NY. The Web-deployable system will let SourceNet offer any combination of software and services. Because most functions can be carried out through browsers, SourceNet and customer employees will be able to work remotely. Now, employees who don't want to move to College Station will be able to work from their homes in and around Houston, which will soften the blow of relocation.

Automated data capture software goes through a series of steps starting with document scans, continuing with data recognition and validation and ending with exporting properly formatted data to applications such as ERP systems. TaskMaster 5.0 is the first high-volume data capture system that enables most of these steps to be performed through ordinary browsers as well as thick-client software.
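The sequence of steps described above can be sketched as a simple chain of functions. This is only an illustration of the general scan, recognize, validate and export flow; the `Batch` structure, the function names and the stand-in "recognition" logic are all assumptions made for the example and do not reflect TaskMaster's actual interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class Batch:
    """A batch of documents moving through capture (hypothetical structure)."""
    batch_id: str
    pages: list
    fields: dict = field(default_factory=dict)
    history: list = field(default_factory=list)


def scan(batch):
    # Stand-in for image acquisition; the pages are assumed to exist already.
    batch.history.append("scan")
    return batch


def recognize(batch):
    # Stand-in for OCR/ICR: pretend each page yields exactly one field.
    for i, page in enumerate(batch.pages):
        batch.fields[f"field_{i}"] = page.upper()
    batch.history.append("recognize")
    return batch


def validate(batch):
    # Minimal rule for the sketch: no recognized field may be empty.
    empty = [name for name, value in batch.fields.items() if not value]
    if empty:
        raise ValueError(f"empty fields need operator review: {empty}")
    batch.history.append("validate")
    return batch


def export(batch):
    # Stand-in for handing formatted data to a downstream system such as ERP.
    batch.history.append("export")
    return batch


PIPELINE = [scan, recognize, validate, export]


def run_pipeline(batch):
    """Run every capture step in order and return the finished batch."""
    for step in PIPELINE:
        batch = step(batch)
    return batch
```

Calling `run_pipeline(Batch("b1", ["invoice 42"]))` walks the batch through all four steps in order; a batch with an empty field stops at validation instead of reaching export.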

Synopsis

Vendor: Datacap, Tarrytown, NY
www.datacap.com

Product: TaskMaster 5.0

Description: Production-level data capture software that allows many functions — including scanning, verifying, monitoring and administration — to be handled remotely through browsers.

Strengths: The system is flexible and scalable because most functions can be handled through either a thick client or a browser. Behind the scenes, servers can be added as needed to handle high volumes. Use of XML eases integration with multiple input sources (such as fax, EDI and e-forms) and multiple output destinations.

Weaknesses: Complex setup requires help from Datacap staff or a VAR, and installation typically takes three months. Browser-based validation client has limited flexibility.

Price: A typical five-user configuration costs about $20,000.

As with other applications, the overriding benefit of the thin-client approach is that it doesn't require software installation and administration on each user PC. This is particularly attractive for data entry and validation tasks that can be distributed to home workers or service bureaus overseas.

Once employees are logged into TaskMaster through a browser (with a password, station ID and proper permission levels), they can scan documents, verify recognition results, monitor workflows, reroute batches of work and administer the system from any PC.

A technical and philosophical debate is now raging over browser-based scanning. Some argue fervently that high-speed scanning cannot and should not be done through browsers — that a browser can't accommodate the transfer of batches of images from one server to another or the load balancing required in high-volume data capture.

"If you're willing to suspend disbelief and take advantage of what's available today, using TWAIN 1.7 you definitely can scan from a browser using a high-speed scanner," contends Scott Blau, president of Datacap.

TaskMaster 5.0 uses an ActiveX control to enable TWAIN scanning through a browser. (The software's thick scanning client also works with ISIS and Kofax drivers.) The system scans to the scan station's C drive and queues the batch for upload through the Web server to the TaskMaster server for processing and recognition. Using multiple browser sessions, a user can scan in one browser while another browser uploads images. To support production-level scanning and validation, Datacap assumes users will rely on DSL, cable modems or other fast connections, although the system also supports dial-up connectivity.
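The scan-then-upload arrangement described above is essentially a producer and a consumer sharing a queue: one activity keeps scanning batches to local disk while another drains the queue and uploads. The sketch below models that idea with two threads. The batch names, the `run_station` helper and the in-memory "upload" are assumptions for illustration; the real product uses separate browser sessions and a Web server, which are not modeled here.

```python
import queue
import threading


def scan_batches(count, upload_queue):
    """Simulate scanning batches to local disk and queueing each for upload."""
    for i in range(count):
        upload_queue.put(f"batch-{i:03d}")
    upload_queue.put(None)  # sentinel: no more batches will be scanned


def upload_batches(upload_queue, uploaded):
    """Drain the queue, 'uploading' each batch until the sentinel arrives."""
    while True:
        batch = upload_queue.get()
        if batch is None:
            break
        uploaded.append(batch)  # stand-in for sending images to the server


def run_station(count):
    """Scan and upload concurrently; return batches in upload order."""
    upload_queue = queue.Queue()
    uploaded = []
    uploader = threading.Thread(target=upload_batches,
                                args=(upload_queue, uploaded))
    uploader.start()
    scan_batches(count, upload_queue)  # scanning continues while uploads run
    uploader.join()
    return uploaded
```

Because the queue is first-in, first-out and there is a single uploader, batches reach the server in the order they were scanned, and a slow connection delays uploads without blocking the scanner.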

The benefit of browser-based scanning is that documents can be captured in a distributed approach using low-cost scanners at remote offices, which saves time, eliminates shipping costs and provides higher-quality images that are easier to automate than faxes. TaskMaster automatically detects poor-quality images and reroutes them back to the originating scan station for rescanning.

In the next stage of data capture, recognition, TaskMaster uses RecoStar and Kadmos recognition engines to read both machine-printed and handprinted data. Adjustable voting between these engines improves accuracy, and customers can add other engines for specialized applications.
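One common way to combine engines is weighted voting per character, with the weights acting as the adjustable part. The sketch below assumes each engine returns one (character, confidence) pair per position and that both engines return readings of the same length. The engine names and weights are hypothetical, and this is not a description of how TaskMaster, RecoStar or Kadmos actually implement voting.

```python
from collections import defaultdict


def vote(readings, weights):
    """Pick, per position, the character with the highest weighted confidence.

    readings: {engine_name: [(char, confidence), ...]} with equal lengths.
    weights:  {engine_name: weight}; engines not listed default to 1.0.
    """
    length = len(next(iter(readings.values())))
    result = []
    for pos in range(length):
        scores = defaultdict(float)
        for engine, chars in readings.items():
            char, confidence = chars[pos]
            scores[char] += weights.get(engine, 1.0) * confidence
        result.append(max(scores, key=scores.get))
    return "".join(result)


# Two hypothetical engines disagree on the middle character ("0" vs. "O").
readings = {
    "engine_a": [("1", 0.9), ("0", 0.4), ("5", 0.8)],
    "engine_b": [("1", 0.8), ("O", 0.7), ("5", 0.9)],
}
```

With equal weights, `vote(readings, {})` follows the more confident engine and returns "1O5". Doubling the weight of `engine_a` with `vote(readings, {"engine_a": 2.0})` tips the middle character the other way and returns "105", which is the kind of tuning an adjustable voting scheme allows.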

Data validation can be handled centrally or at remote locations. In addition to supporting home workers or offshore data entry, the browser-based remote option can help allay security concerns. For example, insurance companies must comply with the privacy elements of the HIPAA regulations, which stipulate that unauthorized persons can't view complete patient records. Instead of shipping paper forms or complete images to a service provider, a company can have TaskMaster split tasks, sending contact information validation to one worker and medical information validation to another.
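The task-splitting idea can be sketched as partitioning a record's fields into disjoint groups so that no single worker receives the complete record. The field names and group names below are invented for the example, and the function is only an assumption about how such a split might be modeled, not TaskMaster's mechanism.

```python
def split_tasks(record, groups):
    """Split one record into per-group validation tasks.

    record: {field_name: value}
    groups: {group_name: set_of_field_names}; groups must not overlap.
    """
    seen = set()
    for names in groups.values():
        overlap = seen & names
        if overlap:
            raise ValueError(
                f"fields assigned to more than one group: {sorted(overlap)}")
        seen |= names
    return {
        group: {k: v for k, v in record.items() if k in names}
        for group, names in groups.items()
    }


# Hypothetical claim form fields.
record = {
    "name": "J. Doe",
    "phone": "555-0100",
    "diagnosis_code": "X00",
    "procedure": "exam",
}
groups = {
    "contact": {"name", "address", "phone"},
    "medical": {"diagnosis_code", "procedure"},
}
tasks = split_tasks(record, groups)
```

Here `tasks["contact"]` holds only the name and phone number, and `tasks["medical"]` holds only the diagnosis code and procedure. The overlap check rejects any configuration that would hand the same field to two groups.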

Like most data capture systems, TaskMaster 5.0 displays image zones next to the recognition results for those zones. If there is a low confidence level for a particular character, that character is highlighted and awaits an operator's edit or approval. While Datacap's thick-client interface offers some of the flexible viewing options found in the best systems, the browser-based interface is somewhat limited. The system displays only low-confidence OCR results, and viewing is limited to image zones rather than the full page.
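Highlighting low-confidence characters amounts to comparing each character's confidence against a threshold. The sketch below marks doubtful characters with square brackets as a plain-text stand-in for on-screen highlighting. The 0.8 threshold and the (character, confidence) input format are assumptions chosen for the example.

```python
def flag_low_confidence(chars, threshold=0.8):
    """Return the positions whose confidence falls below the threshold."""
    return [i for i, (_, conf) in enumerate(chars) if conf < threshold]


def render(chars, threshold=0.8):
    """Show the recognized text with doubtful characters in brackets."""
    return "".join(
        f"[{char}]" if conf < threshold else char
        for char, conf in chars
    )


# A three-digit zone in which the middle digit was read with low confidence.
zone = [("4", 0.99), ("7", 0.55), ("2", 0.95)]
```

For this zone, `flag_low_confidence(zone)` returns `[1]` and `render(zone)` returns "4[7]2", so an operator would be asked to confirm or correct only the middle digit.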

Browser-based administrative access is a boon to system supervisors because they can spot bottlenecks, change the priority of a batch, balance workloads, change passwords and perform other tasks from anywhere. They could even log into the system from home, see how many batches are queued and decide how many temp workers to call in before heading for the office.

TaskMaster's XML-based workflow can be integrated with other back-end systems, and it can accept and verify data from outside sources such as EDI, faxes and other XML applications. This was one of the features that attracted the attention of SourceNet.

"TaskMaster can treat a faxed invoice or an EDI as it would a scanned document," Childress says. Over time, Childress envisions suppliers providing invoice information via a Web form straight into TaskMaster.
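Treating a fax, an EDI transmission and a scanned page alike generally means normalizing each into one common representation before validation. The sketch below wraps field data from any source in the same small XML document using Python's standard library. The element and attribute names (`document`, `field`, `source`, `name`) are a made-up schema for illustration and are not TaskMaster's actual XML format.

```python
import xml.etree.ElementTree as ET


def to_capture_xml(source, fields):
    """Wrap fields from any input source in one common XML shape."""
    doc = ET.Element("document", {"source": source})
    for name, value in sorted(fields.items()):
        element = ET.SubElement(doc, "field", {"name": name})
        element.text = value
    return ET.tostring(doc, encoding="unicode")


def read_fields(xml_text):
    """Read the fields back, regardless of where the document came from."""
    doc = ET.fromstring(xml_text)
    return {f.get("name"): f.text for f in doc.findall("field")}


invoice = {"invoice_no": "A-17", "amount": "120.00"}
from_fax = to_capture_xml("fax", invoice)
from_edi = to_capture_xml("edi", invoice)
```

Downstream steps can call `read_fields` on either document and get the same dictionary back; only the `source` attribute records whether the invoice arrived by fax or EDI.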

The closest competitors to TaskMaster include FormWare 4.0 from Captiva, San Diego, CA, and InputAccel 4.0 from ActionPoint, San Jose, CA. While both these systems offer distributed scanning and validation options that can take advantage of the Internet, neither has embraced a thin-client approach.

TaskMaster 5.0 offers a useful combination of back-office forms processing features and Web accessibility. It's a solid choice for processing complex forms such as health care claims and tax returns. The appeal of the system's thin-client functionality depends on the application. If you need highly flexible deployment of scanning, validation or administrative access, TaskMaster is just a browser click away.





A Publication of the Network Computing Enterprise Architecture Group
Brought to you by CMP Media LLC, Copyright © 2005