Intelligent Enterprise featuring Transform
September 2003

Putting it Together: Taxonomy, Classification & Search

by Jeff Morris


Coming Soon to a Web Site Near You

According to Woods of Ovum, a real transformation in search and information discovery is coming as several innovations are integrated: taxonomy navigation, personalization, the bridging of structured and unstructured information and the use of visualization techniques. He foresees a flexible categorization of information and knowledge sources (structured and unstructured, explicit and tacit) that can be easily visualized, navigated and tailored to the specific search requirements of both organizations and individual users.

In fact, this transformation has already begun. Led by consumer demand for faster, more complete product information, many enterprises are discovering that the e-commerce model speeds access to all kinds of corporate information. With the leading search vendors already on board, innovative ways of combining taxonomy and classification with search are proliferating, making information retrieval not only faster and easier, but also more accurate and cost-effective than ever before.


A Quick Game of Tag

What if you have massive quantities of information that need to be tagged, and fast? Before any search can draw on taxonomy or classification, information has to be tagged so that it can be categorized and, ultimately, found.
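The dependency described above (tag first, then categorize, then find) can be illustrated with a minimal sketch. It assumes a hypothetical two-category taxonomy and plain keyword matching; the category names, keywords and sample documents are invented for illustration, and commercial classification products rely on far richer linguistic and statistical methods.

```python
import re
from collections import defaultdict

# Hypothetical taxonomy: category -> keywords that signal it (illustrative only).
TAXONOMY = {
    "Legislation/Appropriations": {"appropriations", "budget", "spending"},
    "Legislation/Health": {"medicare", "health", "prescription"},
}

def tag(text):
    """Return the sorted list of taxonomy categories whose keywords appear in text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(cat for cat, keys in TAXONOMY.items() if words & keys)

def build_index(docs):
    """Map each category to the ids of the documents tagged with it."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for cat in tag(text):
            index[cat].append(doc_id)
    return dict(index)

# Invented sample documents.
docs = {
    "doc1": "The committee approved the spending bill.",
    "doc2": "Debate on Medicare prescription coverage continued.",
    "doc3": "A budget amendment affecting health programs.",
}
index = build_index(docs)
```

Once the index exists, a lookup such as `index["Legislation/Health"]` returns the matching document ids without rescanning any text, which is why tagging has to come before taxonomy-driven search.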

Congressional Quarterly (CQ), a Washington, D.C.-based publishing company, faced a particularly daunting tagging challenge. Among the company's products is a Web-based legislative tracking service encompassing 24 databases, including five databases of documents from the Government Printing Office (GPO). While some of these documents are digests, most are large, 3- to 4-MB files. Yet end users want to be able to locate the relevant portions of these documents almost immediately.

Resources

Antarctica Systems www.antarctica.net
Autonomy www.autonomy.com
Convera www.convera.com
Inxight www.inxight.com
iPhrase www.iphrase.com
Stratify www.stratify.com
Verity www.verity.com

Susan Shipp, CQ's managing editor for new media, explains that although these documents are freely available to the public through government Web sites, CQ is able to attract subscribers by adding value in two ways:

• First, adding structure: The way in which bill texts and legal language are presented is very important, but the original documents could be searched only as full text. By converting these documents to XML, CQ can provide versions that look exactly like the print version while also providing a computer-navigable structure. The Library of Congress Web site offers only ASCII text versions of these same documents.

• Second, adding metadata and contextual linking: CQ is able to link to specific documents and portions of documents from its own content. As Shipp notes, "the Congressional Record can run up to 100 pages per day, and it's a daunting task to go through all that material just to find the small portion that deals with the bill or legislation in which you're interested." At the document level, CQ is able to provide links to related bills. (To preserve the integrity of the source document, these links appear in the margin.) The links are accomplished through a combination of metadata and CQ's own editorial system.
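To see why XML structure plus metadata makes it possible to jump to the relevant portion of a long document, consider the following sketch. It is not CQ's actual schema or editorial system: the element names, the `bills` attribute and the bill identifiers are all hypothetical, chosen only to show how a structured document can be queried by section where a flat ASCII file cannot.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: element names, the "bills" attribute and the bill
# identifiers are invented for illustration.
RECORD_XML = """
<record>
  <section id="s1" bills="HR-A S-A">
    <heading>Sample energy debate</heading>
    <para>Debate on the sample energy bill continued.</para>
  </section>
  <section id="s2" bills="HR-B">
    <heading>Sample appropriations debate</heading>
    <para>Members considered amendments.</para>
  </section>
  <section id="s3">
    <heading>Routine business</heading>
    <para>Routine remarks.</para>
  </section>
</record>
""".strip()

def sections_for_bill(xml_text, bill):
    """Return (section id, heading) pairs for sections tagged with the given bill."""
    root = ET.fromstring(xml_text)
    hits = []
    for section in root.iter("section"):
        # Metadata lives in an attribute, so the source text itself is untouched.
        tagged = section.get("bills", "").split()
        if bill in tagged:
            hits.append((section.get("id"), section.findtext("heading")))
    return hits
```

Calling `sections_for_bill(RECORD_XML, "HR-B")` returns only the one section tagged with that bill, so a reader can be pointed straight to it. Keeping the bill tags in attributes, not in the running text, mirrors the article's point about preserving the integrity of the source document.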

The key to these value-added capabilities lies in the ability to quickly metatag great quantities of information. That has been made possible by DataStream Conversion Services, a College Park, Md.-based company that was born out of the University of Maryland's Technology Advancement Program business incubator.

A Publication of the Network Computing Enterprise Architecture Group
Brought to you by CMP Media LLC, Copyright © 2005