Intelligent Enterprise featuring Transform
August 2002

Choose the Right Tools for Content Editing

by Bill Trippe

It's ironic. Technology providers have been eager to shelve the term document management and embrace content management, but when you look at what many organizations need to manage, the most challenging problem remains ... um ... document content.

Why are documents so challenging? Because they have structure and detail that other applications require — everything from manufacturing part numbers to the names on contracts. As content management systems (CMSs) are integrated with other enterprise applications, a key question is, "How can documents be created and kept up to date by human editors, while at the same time providing the structure and detail that the CMS needs?"

For most CMS implementations, template-based and HTML forms-based interfaces still predominate, especially in applications that connect human users to "fielded" data. For example, marketing staff can change a product price by entering the new value in the CMS interface's price box.

But forms-based interfaces don't hold up well with document-length content. Users are much more comfortable with a familiar word processing interface, including formatting and editorial tools such as spell checking that have become a normal part of document workflow. As a result, many CMS vendors have adopted mechanisms for at least loosely coupling word processing tools with their applications.

In light of the ongoing need to manage document content, it's natural to wonder whether the documents themselves, or some components of the documents, should be maintained as XML data. Would this choice mean that the people responsible for creating and updating the documents would need to learn XML coding and use XML editing tools?

These questions are only beginning to be answered, and as a result, the state of the art in editorial interfaces is a mixed bag. The CMS implementations now in the field reveal a combination of XML editing tools, template editing and some specialized tools for content conversion, pretagging and posttagging.

Different Needs Demand Different Interfaces

If organizations are going to take full advantage of CMSs, knowledge workers must be able to create and update content themselves and add intelligence to the content being managed. As more and more people in the enterprise contribute and update content, the need increases for simple-to-learn yet powerful tools that allow contributors to concentrate on the material they're creating or updating. They shouldn't have to worry about the intricacies of how content should be tagged for manipulation by the CMS.

When considering editorial interfaces for content management, organizations have a number of options:

Template (or forms-based) interfaces have become de rigueur for CMSs, mainly because CMS applications are often based on relational repositories, and the shortest distance between a thin client and a relational database is an HTML- or Java-based form. Most CMS applications rely on this kind of interface as the primary means of entering and updating content and assigning metadata at a coarse level.

Word processing applications such as Microsoft Word are used to create content to be stored in its native form or saved as HTML, XML or another neutral format. The challenge, of course, is integrating the proprietary structures in Word with the more neutral structures the CMS stores, whether relational databases, some sort of object storage or XML. The granularity of tagging available with this option can vary greatly. While Word itself typically applies styles at a paragraph level, some editorial interfaces based on Word can impose XML or object-level granularity.

HTML editors are used to create and edit content stored as HTML or smaller, relatively discrete "chunks" of content that can be mapped from HTML to the underlying data structures. This approach offers little efficiency for situations in which content must be finely grained and/or of high quality.

XML editing tools can be used to create and edit all or some of the content and can also interact with metadata. This approach provides the strongest capability for finely grained content tagging but also can be more expensive in terms of software seats, training and support.

Preprocessing and postprocessing tools are available. For example, proprietary formats such as Microsoft Word and Quark XPress can be "debinarized" and then run through filters on their way into the CMS (including into XML-based relational database storage). The reverse process can be performed on the way out. When using this approach, the level of tag granularity depends on the quality and capabilities of the filters that process binary formats into XML or another object structure.
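To make the filter idea concrete, here is a minimal sketch in Python. It assumes the "debinarized" document has already been reduced to a list of (style name, text) pairs; the style names, the element names and the to_xml and from_xml helpers are all hypothetical and do not represent any vendor's filter.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from word processor paragraph styles to XML elements.
STYLE_TO_TAG = {
    "Heading 1": "title",
    "Body Text": "para",
    "Part Number": "partNumber",
}

def to_xml(paragraphs, root_tag="document"):
    """Inbound filter: turn (style, text) pairs into an XML string."""
    root = ET.Element(root_tag)
    for style, text in paragraphs:
        # Styles the filter does not know about fall back to a plain paragraph.
        tag = STYLE_TO_TAG.get(style, "para")
        ET.SubElement(root, tag).text = text
    return ET.tostring(root, encoding="unicode")

def from_xml(xml_text):
    """Outbound filter: rebuild (style, text) pairs from the stored XML."""
    tag_to_style = {tag: style for style, tag in STYLE_TO_TAG.items()}
    return [(tag_to_style[el.tag], el.text or "") for el in ET.fromstring(xml_text)]

paragraphs = [
    ("Heading 1", "Installation Guide"),
    ("Body Text", "Shut off the water supply."),
    ("Part Number", "PN-1234"),
]
stored = to_xml(paragraphs)
print(stored)
print(from_xml(stored) == paragraphs)
```

Note that any style missing from the mapping is stored as an ordinary paragraph and cannot be recovered on the way out, which is the granularity limit described above.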

Enterprises that have implemented content management typically use a combination of interfaces, with template interfaces being the most popular choice. Each of these tools has different strengths and weaknesses and may be more or less appropriate for your content needs, staffing situation and user requirements.

As a general rule, users should be provided with authoring tools that match their level of content contribution, training and flexibility (see "Authoring Tools by Type of User").

Treat Editing as a CM Application

Vendors and end-user organizations should treat the editorial interface to the content database as an application unto itself. In effect, the template interface is an application, yet it has too often been little more than a loose coupling of scripted, HTML-based forms linked to some kind of back-end repository. If the coupling is too loose, you'll lose strict control over the content that's being entered in the forms, and you may also lose the ability to effectively manage longer text elements.

Early CMS projects were plagued with difficulties in establishing template interfaces and later with maintaining them, but better editing tools are now available to integrate with CMSs. The management applications have also matured, gaining improved support for file formats such as Microsoft Word documents. CMSs are also better at controlling the editing of text via templates or forms, mainly because specialized editing tools for integration with templates have emerged.

Two examples of better editing tools for use in templating environments are from RealObjects, Saarbrücken, Germany, and Ektron, Amherst, NH. RealObjects' Edit-on is a Java applet that has been integrated with Fatwire and other systems. Ektron offers eWebEditPro, an editing control that can be added to template interfaces, and eWebEditPro+XML, a tool for adding and managing XML. Ektron's editing tools are used by CMS vendors including Vignette, divine, Eprise and Microsoft.

Used properly, editing controls such as those from Ektron and RealObjects provide much more granular handling of text that was previously only crudely tagged. These newer editing controls are, at the least, improving on content accuracy and, at best, introducing more fine-grained markup, including XML markup. With these and other improved tools for editing, newer-generation CMSs are relying more on XML and the greater flexibility it provides in modeling content.

In the past, if a CMS customer implemented an editorial interface and then later needed to change underlying data structures, the enterprise likely had to heavily modify or completely rewrite the interface — especially with CMSs that used relational databases as the underlying data store.

One of XML's advantages is that it makes modifications to data structure much easier. If the underlying data store is relational and the interface is a heavily programmed template, changing the underlying data structure is a complex, programming-heavy task. If the underlying data structure is XML, changing that structure typically means modifying the Document Type Definition (DTD) or XML Schema and then running a process to update the XML editing interface, often automatically. The XML editor is then ready to parse the text according to the revised DTD or Schema.
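As a rough illustration of an interface that follows the schema, the Python sketch below derives a list of form fields directly from a small XML Schema. The schema, the element names and the form_fields helper are assumptions made for this example; commercial XML editors use their own, far more complete configuration mechanisms.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

def form_fields(xsd_text):
    """Return (element name, type) pairs for every typed element in the schema."""
    root = ET.fromstring(xsd_text)
    return [
        (el.get("name"), el.get("type"))
        for el in root.iter(XS + "element")
        if el.get("type") is not None
    ]

# Hypothetical schema for a simple product record.
SCHEMA_V1 = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="product">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="price" type="xs:decimal"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# The data structure changes: a part number is added after the price.
SCHEMA_V2 = SCHEMA_V1.replace(
    '<xs:element name="price" type="xs:decimal"/>',
    '<xs:element name="price" type="xs:decimal"/>\n'
    '        <xs:element name="partNumber" type="xs:string"/>',
)

print(form_fields(SCHEMA_V1))
print(form_fields(SCHEMA_V2))
```

No interface code changes between the two calls. The new field appears because the revised schema declares it, which is the kind of regeneration step the paragraph above describes.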

"Integrating an XML editor with a [CMS] provides customers with immediate benefits," says Bruce Sharpe, executive vice president of XML Content Solutions at Corel, Ottawa, Ontario, Canada, the company that recently acquired the XML editing tool, XMetal. "Information can be accessed and edited through a single interface. This makes a huge difference for customer[s] and their productivity."

While XML editors offer clear benefits, not all editorial interfaces must be XML editors. Indeed, the nature of enterprise content management is that underlying data structures will be a combination of relational, XML and other data types. There will also be all manner of content in terms of length, value and shelf life. The editorial interface(s) should then be appropriate to:

1. The content type — both data type and length.

2. The user type — from occasional contributor to IT administrator.

3. The content's point in the lifecycle — initial creation through editing and updating.

4. The content's shelf life.

5. The requirement (or lack thereof) for content to be tagged at a granular level.

With these factors in mind, those implementing CMSs should consider supporting a variety of interfaces. For example, regular contributors of complex and lengthy documents to an XML database would be well served by a tightly integrated XML editor. But ad hoc contributors to that same database should be given a simple-to-use tool that exposes precisely the content they need to edit and validate before returning it to the repository. Ad hoc users could also be supported by a workflow process that forwards their revised content to more skilled editors to ensure that content was entered or updated correctly. As another example, corporate users of an intranet that supports a small number of simple document types could use a set of Microsoft Word templates. The Word files could then be processed through a tool that normalizes the files into the format required by the CM repository. When the documents need to be modified, reverse processes could reconstruct the Word files for further updating and editing.
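The ad hoc contributor case can be sketched in a few lines of Python. The example assumes a flat XML record and a fixed list of editable elements; the element names and the expose and apply_edits helpers are hypothetical, and a real tool would also validate the result against the DTD or Schema before returning it to the repository.

```python
import xml.etree.ElementTree as ET

# Hypothetical: the only elements an ad hoc contributor may change.
EDITABLE = ("summary", "price")

def expose(xml_text, editable=EDITABLE):
    """Return just the editable fields, ready to show in a simple form."""
    root = ET.fromstring(xml_text)
    return {tag: (root.findtext(tag) or "") for tag in editable}

def apply_edits(xml_text, edits, editable=EDITABLE):
    """Write the contributor's changes back, rejecting anything outside the view."""
    root = ET.fromstring(xml_text)
    for tag, value in edits.items():
        if tag not in editable:
            raise ValueError(f"{tag} is not editable in this view")
        node = root.find(tag)
        if node is None:
            raise ValueError(f"{tag} is missing from the document")
        node.text = value
    return ET.tostring(root, encoding="unicode")

doc = "<product><title>Faucet</title><summary>Old text</summary><price>10.00</price></product>"
print(expose(doc))
print(apply_edits(doc, {"price": "12.50"}))
```

The rest of the document passes through untouched, so the contributor never sees the full tagging scheme.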

Metadata is likely a combination of XML, relational data and other data types. By its nature, metadata is often structured, discrete and relatively short in length. For example, a CMS for an electronics company could maintain detailed information on parts, product availability and maintenance procedures, all as metadata. Such information could include fixed values, choice groups ("yes" or "no," or "X," "Y" or "Z") and other data types that lend themselves to structured interfaces and enforced validation.
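Enforced validation of this kind of metadata might look like the following Python sketch. The field names, the allowed values and the validate_metadata helper are invented for illustration and do not come from any particular CMS.

```python
# Hypothetical field definitions for a parts record.
FIELD_SPECS = {
    "part_number": {"type": str},
    "in_stock": {"type": str, "choices": {"yes", "no"}},
    "grade": {"type": str, "choices": {"X", "Y", "Z"}},
}

def validate_metadata(record, specs=FIELD_SPECS):
    """Return a list of problems; an empty list means the record is valid."""
    problems = []
    for name, spec in specs.items():
        if name not in record:
            problems.append(f"missing field: {name}")
            continue
        value = record[name]
        if not isinstance(value, spec["type"]):
            problems.append(f"{name}: expected {spec['type'].__name__}")
        elif "choices" in spec and value not in spec["choices"]:
            problems.append(f"{name}: {value!r} is not an allowed value")
    for name in record:
        if name not in specs:
            problems.append(f"unknown field: {name}")
    return problems

print(validate_metadata({"part_number": "PN-1234", "in_stock": "yes", "grade": "X"}))
print(validate_metadata({"part_number": "PN-1234", "in_stock": "yes", "grade": "Q"}))
```

A forms-based interface could present each choice group as a drop-down list built from the same definitions, so that invalid values cannot be entered in the first place.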

If we consider again our range of authors, from the occasional contributor to the power user, they likely have a similar range of needs for editing metadata. An infrequent contributor adding a Microsoft Word file could be required to fill out a simple form or even the "Property Sheet" embedded within Word (File menu, then Properties). On the other end of the spectrum, a knowledge worker could be provided with an XML editing tool as an interface to the required metadata. The GUI of a commercial XML editor such as Arbortext Epic or Corel XMetal can be configured to behave like a forms-based interface while capturing and storing XML data.

Not all content will have to be XML. However, if you handle complex or lengthy documents, need content to be "componentized" and need it to be accessible to a variety of editorial interfaces, XML makes sense. XML has helped fixture manufacturer Kohler Plumbing of Kohler, WI, better manage the creation and maintenance of some 10,000 active documents ranging from in-box literature, service documents, service manuals and tech sheets, to presales sheets and installation guides.

"With our needs for multiple output, multilingual, multibrand, multilocalization and multicustomers, we couldn't stay working at the document level," says Mark Peterson, manager of Kohler Plumbing's North America technical communications department. "We had to move to working at an object level." Moving to the object level has meant moving to XML editing.

Peterson has a team of 12, including writers, illustrators (many of the smaller documents are 50 percent illustration), translators and layout specialists. Many documents find form in print, CD and online, as well as in support of catalog production. Peterson admits that the move brought some "culture shock," a condition that a full-time, specialized worker may have to adjust to but that the average business user may not.

According to Tony White, senior director of product marketing at Broadvision, Redwood City, CA, the management system should be transparent to business users. "They should be able to create and modify content using tools such as Microsoft Word or Web-based forms, without needing to know their work is being submitted," White says.

As detailed in "CM System Editing Tools and Integrations," content management vendors have worked hard to provide easy-to-use tools for business users. For the wider enterprise, it is the range of options that counts. Not every user will be productive with the same kind of interface, and not every organization will have the same kind of content. Try to match your content and users with the most appropriate interfaces.

The good news is that as CMSs mature, the challenge of providing a better user experience is being met, and both organizations and individual users will be the beneficiaries.

Bill Trippe (btrippe@nmpub.com) is president of New Millennium Publishing (www.nmpub.com), Boston, a consultancy specializing in electronic publishing, content management, SGML and XML.





A Publication of the Network Computing Enterprise Architecture Group
Brought to you by CMP Media LLC, Copyright © 2005