December 1998
Heavyweight Competition
Leading OCR vendor Caere (Los Gatos, CA 408-395-5148 www.caere.com) has upped the ante in the recognition toolkit business with the recent release of Developer's Kit 2000, an integrated offering packing nine recognition engines in all. Designed to simplify the integration of multiple recognition modules, the toolkit is priced at $5,495.
Developer's Kit 2000 is a single-source toolkit that includes OCR (machine print), ICR (handprint), bar code, OCR-A, OCR-B, MICR (E13B) and OMR. Two OCR engines are included. The omnifont machine-print OCR module, the one used in Caere OmniPage, recognizes English, UK English, Spanish, French, German, Italian, Portuguese, Swedish, Danish, Dutch, Norwegian and Brazilian Portuguese. A second OCR module from Caere's Recognita unit recognizes more than 110 languages.
The output can be generic Unicode or ANSI text, or it can be filtered through INSO conversion modules into popular word processing formats such as Microsoft Word, WordPerfect and RTF (Rich Text Format), as well as HTML.
Developer's Kit 2000 offers an open API architecture, so 32-bit software developers have the flexibility to integrate other image capture technologies or products not shipped in the Caere product. ActiveX or C interfaces help drive document processing.
Last January, Caere rival ScanSoft, a Xerox company (Peabody, MA 978-977-2000 www.textbridge.com), unveiled V4.5 of the TextBridge Application Programmer Interface (API). The toolkit is designed to help developers build and customize their own OCR applications. It employs cooperating expert subsystems, working through central controllers, that contribute to the analysis and recognition of characters and words, as well as to an understanding of the underlying page. It is priced at about $5,000.
The TextBridge API is designed for the Microsoft Visual C and C++ development environment. In addition to improved recognition, the TextBridge API gives developers automatic preprocessing capabilities such as page segmentation, rotation, fax/dot matrix detection and adjustment, line art detection, reverse-video detection, noise removal and deskew.
The API supports 12 languages (English, Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Russian, Spanish and Swedish) as well as 15 built-in lexical classes. An additional feature of the API is the Expanded Lexifier(tm), a natural language system that increases recognition accuracy for classes of text commonly found in business documents that are not true words, including telephone numbers, dates and social security numbers. Once text has been processed, it can be output in ISO or XDOC format (XDOC being Xerox's fully documented rich text mark-up language). Output data can be sent to either a file or an application-defined buffer.
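To make the idea of lexical classes concrete, here is a minimal Python sketch of how a recognizer might use them to repair non-word tokens. It is not the TextBridge API or the Lexifier's actual method: the class names, the US-style patterns, the letter-to-digit confusion table and the `classify` and `correct_token` functions are all hypothetical, chosen only for illustration.

```python
import re

# Hypothetical lexical classes. The patterns are illustrative US-style
# formats, not the rules used by any shipping product.
LEXICAL_CLASSES = {
    "phone": re.compile(r"\(?\d{3}\)?[-. ]\d{3}-\d{4}"),
    "date": re.compile(r"\d{1,2}/\d{1,2}/(\d{2}|\d{4})"),
    "ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
}

# Assumed table of common OCR confusions between letters and digits.
DIGIT_FIXES = str.maketrans(
    {"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"}
)


def classify(token):
    """Return the name of the first lexical class the whole token matches."""
    for name, pattern in LEXICAL_CLASSES.items():
        if pattern.fullmatch(token):
            return name
    return None


def correct_token(token):
    """Return (text, class), applying digit fixes only if they yield a match."""
    cls = classify(token)
    if cls:
        return token, cls
    candidate = token.translate(DIGIT_FIXES)
    cls = classify(candidate)
    if cls:
        return candidate, cls
    # No class fits, so leave the token exactly as it was recognized.
    return token, None
```

A token such as "4O8-395-5l48" fails a dictionary lookup, but after the substitutions it matches the phone pattern, so the sketch accepts the corrected digits. Ordinary words match no class and pass through unchanged.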