September 2003
Putting it Together: Taxonomy, Classification & Search
by Jeff Morris
The more complex the enterprise, the greater the need to search among multiple sources, but the one- or two-word search used by most people "doesn't give much complexity in the results," says Eric Woods, research director in the software and service group at Ovum, a London-based technology consultancy.
The problem with search technology on its own, says Sue Feldman, an analyst at Framingham, MA-based IDC, lies in the vagaries of language: Each word can have many meanings, and to fix that, you need to narrow down the topic. The solution? "As soon as you've categorized [classified]," Feldman points out, "you've narrowed it down."
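Feldman's point can be illustrated with a minimal Python sketch. The documents, category labels and the `search` function below are invented for illustration and do not represent any vendor's product; the sketch assumes each document already carries a category label.

```python
# Hypothetical mini-corpus: each document carries a category label
# assigned by some earlier classification step.
DOCUMENTS = [
    {"id": 1, "category": "Automotive", "text": "Jaguar unveils a new sedan"},
    {"id": 2, "category": "Wildlife", "text": "The jaguar is the largest cat in the Americas"},
    {"id": 3, "category": "Automotive", "text": "Dealers report strong sedan sales"},
    {"id": 4, "category": "Software", "text": "Apple ships the Jaguar release of its operating system"},
]


def search(query, category=None):
    """Return ids of documents containing the query word, optionally within one category."""
    word = query.lower()
    hits = []
    for doc in DOCUMENTS:
        # Skip documents that do not contain the query as a whole word.
        if word not in doc["text"].lower().split():
            continue
        # If a category was chosen, skip documents filed elsewhere.
        if category is not None and doc["category"] != category:
            continue
        hits.append(doc["id"])
    return hits
```

Searching for "jaguar" alone matches documents about a car, an animal and an operating system; passing `category="Wildlife"` leaves only the one about the animal.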
Combining taxonomy and classification with search, notes Woods, "gives people a map of the resources available to them. This kind of taxonomy, classification and search combination is becoming essential for the major search vendors."
Another reason taxonomy, classification and search are being combined, says Feldman, is that not everybody knows exactly how to search. "Often what you want to do is browse a directory because you're not quite sure how to ask the question," she explains. "Taxonomy gives you a display of information that doesn't require you to put your need into words."
Knowledge experts now agree that, as Feldman puts it, "taxonomy, classification and search need one another." Autonomy, Convera, Inxight, Stratify and Verity are among the leading vendors attempting to bring all the pieces together.
Taxonomy: A Brief History of Structure
A taxonomy is basically a structure: It defines the relationship among categories or nodes of information, and it's typically hierarchical. Ovum defines a corporate taxonomy as "a way of representing the information available within an enterprise. In its simplest form, it is a hierarchy of categories that is used to classify documents and other information within the corporate knowledge base."
While taxonomies make it possible to do a certain amount of reasoning about the relationships among classes of information, Feldman warns that "the danger of a taxonomy is in its rigidity. The taxonomy is vital, but you need to have ways of changing it."
One basis for change is the need to tailor the search experience to the differing requirements of various users. "The Marketing and R&D departments, for example, have different ways of looking at the same information," Feldman explains. "Some things can be irrelevant to those in either group, so you may need multiple taxonomies or views of the same information."
Woods of Ovum points out that a document may be of interest to different departments in an organization for different reasons, and that "forcing it into a single predefined category may be neater but may also reduce its usefulness. Corporate taxonomies need to be flexible and pragmatic as well as consistent."
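One way to picture multiple views over the same information is sketched below in Python. The department names come from Feldman's example, but the category names, document ids and helper functions are hypothetical; the only assumption is that a document may be filed under more than one category and in more than one view.

```python
# Hypothetical departmental views: each maps category names to the ids of
# the documents filed there. A document may appear in several categories.
VIEWS = {
    "Marketing": {
        "Competitor Analysis": {"doc-17", "doc-42"},
        "Product Launches": {"doc-42", "doc-58"},
    },
    "R&D": {
        "Materials Research": {"doc-42", "doc-63"},
        "Patents": {"doc-17"},
    },
}


def categories_for(doc_id, view):
    """List the categories in one view that contain the given document."""
    return sorted(name for name, docs in VIEWS[view].items() if doc_id in docs)


def views_containing(doc_id):
    """List the views in which the document appears at least once."""
    return sorted(
        view
        for view, categories in VIEWS.items()
        if any(doc_id in docs for docs in categories.values())
    )
```

Here `doc-42` sits in two Marketing categories and one R&D category, while `doc-58` is visible only in the Marketing view, so neither department has to wade through material that matters only to the other.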
There was a time when taxonomic systems were all that was used, according to Daniel Dabney, head of the Taxonomy and Search Access department at Eagan, MN-based West, a division of Thomson Publishing. "Then in reaction to that, it was thought that you needed only free-text search," recalls Dabney. "The enthusiasm for pure free-text goes back to the 1970s. Old structured knowledge lists were thought to be a vestige of the past. For a while, it was assumed that coming at things from a taxonomic viewpoint was to be pitied, like we were afraid of computers or something."
Dabney contends that we're now achieving a midpoint. "Only in the last six to eight years has there been a renewed interest in taxonomic systems," he says, signaling a return to some of "the old-fashioned values we learned in library school."
With a taxonomic interface, you don't need to know what all the words are; the system itself will guide you into the area you're looking for. "That's a sort of recapture of old paper classification techniques," Dabney notes.
What, then, is the difference between taxonomy and classification? "Taxonomy is the framework," says Roy Rodenstein, product marketing manager at San Mateo, CA-based iPhrase, a natural language search and navigation software vendor. "Classification is a way of mapping the content to fill that taxonomy, the process of taking that content and populating it into that taxonomy."
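Rodenstein's distinction can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the category names, the keyword rules (a stand-in for whatever classifier a real product uses) and the two helper functions. It does not describe how iPhrase or any other vendor implements classification.

```python
# Taxonomy: the framework. A hypothetical hierarchy mapping each category
# to its child categories.
TAXONOMY = {
    "Products": ["Laptops", "Printers"],
    "Laptops": [],
    "Printers": [],
}

# Hypothetical keyword rules standing in for a real classifier.
RULES = {
    "Laptops": {"laptop", "notebook"},
    "Printers": {"printer", "toner"},
}


def classify(documents):
    """Classification: file each document under every category whose keywords it mentions."""
    filed = {category: [] for category in TAXONOMY}
    for title in documents:
        words = set(title.lower().split())
        for category, keywords in RULES.items():
            if words & keywords:
                filed[category].append(title)
    return filed


def documents_under(category, filed):
    """Collect documents filed at a category or anywhere beneath it."""
    found = list(filed[category])
    for child in TAXONOMY[category]:
        found.extend(documents_under(child, filed))
    return found
```

The `TAXONOMY` dictionary exists before any content is seen; `classify` is the separate step that populates it. A title that matches no rule, such as "Quarterly earnings", simply stays unfiled.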
While search technology is the primary method employed by users to access information, the iPhrase One Step application organizes those search results into folders. "The folders that appear on the left-hand side of the screen are a taxonomy, as are the tabs across the top on a retail site," explains Rodenstein. And, in the case of retail, there may be multiple taxonomies, he adds. "That tends to be fairly common, because there are many categories of metadata."