Intelligent Enterprise featuring Transform

September 1998

The Case for Color in Imaging

by Harvey Spencer

Color not only looks better, on some documents it carries important meaning. With costs falling in line, it's time to give color a look.

Once a distant frontier, high-volume color document imaging is well on its way to becoming a reality. Costs are coming into line, and the benefits of color for both users and recognition accuracy are becoming compelling.

Today's imaging systems use black-and-white (bitonal) images because they are fairly small (around 50 Kbytes compressed on average) and because the compression used (TIFF Group 4) does not lose data. But despite many tricks, bitonal has drawbacks, particularly when image shadings vary or the image contains a stamp or other item that requires a better view.

Clearly, color images would make the documents easier to read. But in some cases color is critical to an image's usefulness. This is increasingly the case as color is used in print and in Internet-published documents.

Pictured below are a few examples of documents that a user might want to store and subsequently retrieve. Documents 1, 2 and 3 convert into black and white fairly well. The critical information is easy enough to read, but the reader does get a clearer understanding when the document is in color. In document 2, for example, the first four digits of the client identification number are more clearly seen to be pre-printed when the image is in color. On document 3, the "Cancelled" notation is much more noticeable in color.

While it's nice to have color with the first three documents, it's more than a luxury in the case of documents 4 and 5. The information conveyed in color is important and can be obliterated by a bitonal scanner despite careful adjustment (see document 5).

What About Storage?

So much for the theory, but can you afford to store color images? To give you some idea, the 200 dpi bitonal scan (TIFF Group 4) of document 4 compressed to 112 Kbytes. Surprisingly, the same document scanned in color (JPEG) compressed to 131 Kbytes -- only about 17% larger. The JPEG format used introduced a 10% loss in image detail, but this was not discernible to the naked eye.

In some cases, color images will be considerably larger. The 200 dpi color (JPEG) scan of document 6, for example, compressed to 434 Kbytes -- roughly five times the size of the 84 Kbyte compressed bitonal. JPEG is the current standard for color, but it was designed for photographs, so companies are working on more efficient formats geared to text and line art. One of these, DjVu (http://djvu.research.att.com), was demonstrated by AT&T at AIIM and provided lossless 100x color compression.
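
The size comparisons above reduce to simple ratios. As a minimal sketch in Python, using only the compressed sizes quoted for documents 4 and 6 (the function and variable names are illustrative, not part of any imaging product):

```python
# Compressed file sizes in Kbytes, as quoted in the article.
sizes_kb = {
    "document 4": {"bitonal": 112, "color": 131},
    "document 6": {"bitonal": 84, "color": 434},
}


def color_overhead(bitonal_kb, color_kb):
    """Return (ratio, percent_increase) of the color file relative to bitonal."""
    ratio = color_kb / bitonal_kb
    return ratio, (ratio - 1.0) * 100.0


for name, sizes in sizes_kb.items():
    ratio, pct = color_overhead(sizes["bitonal"], sizes["color"])
    print(f"{name}: color is {ratio:.2f}x the bitonal size ({pct:.0f}% larger)")
```

The spread between the two documents suggests that the color penalty depends heavily on page content, so it is worth running the same comparison on a sample of your own documents before sizing storage.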

If size is a problem, you can reduce the dots-per-inch resolution without losing readability, because color and grayscale images carry much more information per pixel. A color image usually has three color channels (red, green and blue), each with 8 bits of shading information, making a total of 24 bits of information per pixel versus one bit per pixel in bitonal. Therefore, you can usually substantially reduce the dots-per-inch resolution and still get a very legible image. I scanned the color bar chart (document 5) at 150 dpi, and it is still completely readable due to the increased tonal information. Still, most OCR requires a minimum of 200 dpi, so in a forms processing application you will need the higher resolution.
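
To see what the bits-per-pixel and resolution figures mean before any compression, here is a rough sketch in Python. It assumes a letter-size page (8.5 x 11 inches), which the article does not specify, and the function name is made up for illustration:

```python
def raw_page_bytes(dpi, bits_per_pixel, width_in=8.5, height_in=11.0):
    """Uncompressed size in bytes of one scanned page side.

    The letter-size default is an assumption, not a figure from the article.
    """
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


bitonal_200 = raw_page_bytes(200, 1)   # 1 bit per pixel
color_200 = raw_page_bytes(200, 24)    # 3 channels x 8 bits
color_150 = raw_page_bytes(150, 24)

print(f"bitonal, 200 dpi: {bitonal_200:,} bytes")
print(f"color,   200 dpi: {color_200:,} bytes")
print(f"color,   150 dpi: {color_150:,} bytes")
print(f"150 dpi keeps {color_150 / color_200:.0%} of the 200 dpi pixel data")
```

Because pixel count grows with the square of the resolution, dropping from 200 to 150 dpi removes a little under half of the raw data while each remaining pixel still carries its full 24 bits.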

However, even if color image files are several times larger than their bitonal counterparts, it's important to consider the new realities of storage cost. In the early days of imaging you could buy a 120MB disk for $200. Now you can buy something 30 times larger for the same price. Years ago we were working with ISA buses, 33MHz 386 processors and maybe 2MB of RAM on the desktop. Today you can easily multiply these capacities and speeds by a factor of ten.
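
The disk prices above can be turned into a cost per stored image. The following Python sketch uses the article's figures ($200 for 120MB then, 30 times the capacity for the same price now, a 50 Kbyte average bitonal image and the 434 Kbyte color scan of document 6). Treating 1MB as 1,024 Kbytes is my assumption, and the names are illustrative:

```python
KB_PER_MB = 1024  # assumption: binary megabytes


def cost_per_image(image_kb, disk_mb, disk_price):
    """Dollar cost of the disk space one image occupies."""
    return (image_kb / KB_PER_MB) * (disk_price / disk_mb)


EARLY_DISK_MB = 120
CURRENT_DISK_MB = EARLY_DISK_MB * 30  # "30 times larger for the same price"
DISK_PRICE = 200.0

bitonal_then = cost_per_image(50, EARLY_DISK_MB, DISK_PRICE)
color_then = cost_per_image(434, EARLY_DISK_MB, DISK_PRICE)
bitonal_now = cost_per_image(50, CURRENT_DISK_MB, DISK_PRICE)
color_now = cost_per_image(434, CURRENT_DISK_MB, DISK_PRICE)

print(f"bitonal then: ${bitonal_then:.4f}   color then: ${color_then:.4f}")
print(f"bitonal now:  ${bitonal_now:.4f}   color now:  ${color_now:.4f}")
```

Under these assumptions, storing the 434 Kbyte color scan on the newer disk comes out cheaper than storing a 50 Kbyte bitonal image did on the early one.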

Many people say to me that most business documents are black and white so you don't need the color overhead. True, many documents are strictly black and white, but take a look at document 6. This invoice is mostly black and white, but it contains shadings. When converted to bitonal, these shadings interfere with the text and can make recognition impossible. If you remove the shadings, you might lose the data.
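
The trade-off with shadings comes down to where the bitonal threshold falls. This toy Python sketch is not how any particular scanner works. The gray values are hypothetical, chosen only to illustrate a shaded band, dark text and a lighter stamp on the same scan line:

```python
def to_bitonal(gray_row, threshold):
    """Convert 8-bit gray values (0 = black, 255 = white) to 1 (black) or 0 (white)."""
    return [1 if value < threshold else 0 for value in gray_row]


# Hypothetical gray levels for one scan line of a shaded invoice.
PAPER, SHADING, TEXT, STAMP = 255, 170, 30, 190
row = [PAPER, SHADING, TEXT, SHADING, STAMP, PAPER]

# High threshold: the shading turns black and swallows the text inside it.
keep_shading = to_bitonal(row, threshold=200)

# Low threshold: the shading drops out, but so does the lighter stamp.
drop_shading = to_bitonal(row, threshold=100)

print("threshold 200:", keep_shading)
print("threshold 100:", drop_shading)
```

With a single threshold there is no setting in this example that keeps both the text legible and the stamp present, which is the problem grayscale and color storage avoid.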

Color gives you better readability, as does grayscale (though grayscale files are nearly as large as color files). This is clearly visible in the case of document 7, which is an enlargement of the back of a check. If you look at the bitonal image, you cannot clearly make out the endorsements. Using grayscale, the layered aspects of the endorsement jump out; the user can read the data and discern which stamp came first. The best of all, however, is the color image, which takes the data to a greater level of understanding and clarity.

It is true that most OCR and ICR work on bitonal patterns, but color and grayscale can improve forms processing in other ways. ScanOptics (Manchester, CT 860-645-7878) uses grayscale in its 9000 scanner, and SCS (Birmingham, AL 205-251-2985), which was recently acquired by ScanOptics, can use color images to improve key entry in its Vista Capture forms processing product. OrboGraph (Israel, 972-8-942-3769), which specializes in courtesy amount recognition, says that it can improve recognition by as much as 20% if it uses grayscale or color images. NCS (Lincoln, RI 401-334-4811) says its Mark Sense accuracy is improved using grayscale.

What About Scanner Costs?

If color is compelling now, then why aren't more document imaging systems working in color -- or at least able to store color as needed? One reason is the lack of affordable high-speed color scanners. Another reason has been the difficulty of compressing and storing larger files fast enough once they are scanned. This latter issue seems to have been solved with specialized JPEG chips, such as those now available from Picture Elements (Boulder, CO 303-444-6767).

Until recently, there were only two high-performance (60 or more pages per minute) color scanners available on the market: the ImageTrac from Imaging Business Machines LLC (Birmingham, AL 205-956-4071) and the RecoScan from CGK/Siemens Nixdorf Information Systems (Vienna, VA 703-848-2117). BancTec (Dallas 972-341-4000) demonstrated a new color version of its S-Series scanner at AIIM, and it is expected to ship this fall. All three of these scanners cost more than $50,000, making them expensive for mainstream use, though they have their place in specialized applications that demand color. Fujitsu sells the 600C, a 15-ppm bitonal/2-ppm color scanner, for under $2,000, but it's a simplex scanner that lacks compression. You wouldn't use this scanner for production color scanning.

What we really need is a higher-end 20- to 30-ppm color duplex scanner priced at around $25,000 to $30,000 with on-board compression. It doesn't have to be 300 dpi. Whoever comes up with this device will transform document imaging systems.

 





A Publication of the Network Computing Enterprise Architecture Group
Brought to you by CMP Media LLC, Copyright © 2005