December 2001
XML ANSWERS
It's Time for an XML Audit
by Bill Trippe
As a colleague of mine likes to joke, "XML is the answer we've been waiting for. Now we can all
go home." Oh, if life were only so simple. If the answer to every data question were "XML," and the
solution were straightforward, inexpensive and quick to deploy. But it's not, of course.
If your organization is like most, you have multiple data sources: structured and unstructured,
short-lived and long-lived, some in very good shape and others in questionable shape. These sources
likely live on heterogeneous systems ranging from individual PCs and servers to client-server
networks and, yes, still, mainframes.
Is the answer to convert everything to XML, or to somehow make everything available in XML form?
Not in your lifetime. But neither should you do nothing, nor deal with each data source ad hoc as
questions come up and needs arise. You will require XML at some point, for both internal and
external data sharing, and you need to be prepared for it.
The answer is to conduct what I call an XML audit. This is an exercise where you profile each of
your data sources for its readiness to be converted to XML. An XML data audit differs from data
auditing practices you may already have in place. Some organizations, notably those engaged in
scientific and medical research, already conduct data audits. These audits look at data quality
(including completeness, accuracy and validity), as well as issues of freshness and the
applicability of data to its purpose. Database administrators, especially those who work with
relational databases, also sometimes look at the human aspects of data auditing: who can alter the
database structure, who can access the data itself and so on.
An XML data audit is specific to XML readiness and is more of a management exercise than a
technical exercise. The result should be a concise document that details the sources of data, their
potential uses as XML and the "barriers to entry" that stand between the current data format and an
XML version of the data. Some attention should be paid to the unique technical requirements of
getting data into and out of XML.
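To make the shape of that document concrete, here is a minimal sketch in Python of one way to record audit entries and print them as a concise report. The `AuditEntry` class, its field names and the `render_report` function are all hypothetical, invented for illustration; your own audit may well need different fields.

```python
from dataclasses import dataclass, field


@dataclass
class AuditEntry:
    """One data source profiled in the audit (hypothetical structure)."""
    source: str                      # name of the data source
    current_format: str              # how the data is stored today
    potential_xml_uses: list[str] = field(default_factory=list)
    barriers: list[str] = field(default_factory=list)


def render_report(entries: list[AuditEntry]) -> str:
    """Render the audit entries as a short plain-text report."""
    lines = []
    for entry in entries:
        lines.append(f"Source: {entry.source} ({entry.current_format})")
        for use in entry.potential_xml_uses:
            lines.append(f"  Potential XML use: {use}")
        for barrier in entry.barriers:
            lines.append(f"  Barrier to entry: {barrier}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Sample entry with made-up values, for illustration only.
    sample = AuditEntry(
        source="Document management system",
        current_format="source files plus relational metadata",
        potential_xml_uses=["XML-based abstracts for a Web site"],
        barriers=["Cost of add-on modules for metadata conversion"],
    )
    print(render_report([sample]))
```

Keeping uses and barriers as simple lists matches the management focus of the audit: the goal is a readable inventory, not a technical specification.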
For example, you may have a document management system that currently stores the documents in
their source format and the document metadata in a relational database. You likely have several
potential XML uses for the data, such as providing XML-based abstracts of the documents to an
internal or external Web site, or converting some of the documents themselves into XML. The barriers
to entry might include the expense of purchasing, and the effort of mastering, new tools or add-on
modules that would convert the relational metadata in the document management system to XML. The
barriers could also include the need to develop or contract out a capability to convert the source
documents to XML.
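As a rough illustration of the first barrier, the sketch below shows what converting relational document metadata to XML can look like in Python, using only the standard `sqlite3` and `xml.etree.ElementTree` modules. The `documents` table, its columns and the sample row are assumptions made up for this example; a real document management system will have its own schema and may offer its own export tools.

```python
import sqlite3
import xml.etree.ElementTree as ET


def metadata_to_xml(conn: sqlite3.Connection) -> str:
    """Serialize rows of the (hypothetical) documents table as XML."""
    root = ET.Element("documents")
    cursor = conn.execute(
        "SELECT id, title, author, abstract FROM documents ORDER BY id"
    )
    for doc_id, title, author, abstract in cursor:
        doc = ET.SubElement(root, "document", id=str(doc_id))
        ET.SubElement(doc, "title").text = title
        ET.SubElement(doc, "author").text = author
        ET.SubElement(doc, "abstract").text = abstract
    # ElementTree escapes characters such as & and < on output.
    return ET.tostring(root, encoding="unicode")


def build_sample_db() -> sqlite3.Connection:
    """Create an in-memory database with one made-up metadata row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        "id INTEGER PRIMARY KEY, title TEXT, author TEXT, abstract TEXT)"
    )
    conn.execute(
        "INSERT INTO documents (title, author, abstract) VALUES (?, ?, ?)",
        ("Style Guide", "A. Editor", "House rules for markup & layout."),
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    print(metadata_to_xml(build_sample_db()))
```

The metadata is usually the easy half. Converting the source documents themselves is typically the larger effort, which is why the audit lists it as a separate barrier.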
Armed with the information from your audit, you will have a snapshot of your data sources, their
potential use as XML, and a high-level idea of what work stands between the data in its current form
and the data in XML. What needs for XML might arise, and on what timetable? What would it take to
make certain data available in XML to a business partner or potential customer? Which of your
current systems have tools, utilities or programming interfaces to go from current formats to
XML?
An XML audit is an invaluable planning tool as questions and opportunities arise. Think of it as
the sensible middle ground between doing nothing and doing everything, neither of which would work.
Think of it also as the start of your road map to XML, your first step to creating an enterprise in
which all your key data sources are open and available for integration with other applications, and
in which XML is a growing part of your infrastructure.
Bill Trippe (btrippe@nmpub.com) is president of New
Millennium Publishing (www.nmpub.com), Boston, a consultancy
specializing in electronic publishing, content management, SGML and XML.