Intelligent Enterprise featuring Transform
July 2002

ON STORAGE

Innovative Storage Tames 'Fixed' Content

by Lowell Rapaport

According to storage giant EMC of Hopkinton, MA, up to three-quarters of all digital data is "fixed" content. Once saved, fixed content never changes; it is only retrieved and viewed. Examples include archived e-mails, documents, content posted to Web sites, product manuals and specifications, and scientific content such as medical images and geologic data. EMC developed a new storage technology, called Centera, specifically to address the needs of fixed, unstructured content. "Storage for fixed content needs to guarantee authenticity, online access, long-term retention, scalability, location-independent access and low cost," explains Ken Steinhardt, EMC's director of technology analysis.

To satisfy all these needs, EMC developed what the company calls a new storage category, content virtualization. Content virtualization essentially replaces the mounted volumes, drive letters, directories and subdirectories found in existing storage systems. Rather than creating a path through a complex directory structure, developers use Centera's application programming interface (API) to send files to a self-managed storage device. The storage device returns an identification number to the application based on the content of the file. Every file gets a unique identifier.

"The only way for two files to have identical ID numbers is for the content of the files to be identical," says Steinhardt.

Basing file identifiers on content makes Centera inherently reliable. If a file's contents don't match its ID number, then Centera knows that the content has been corrupted. If Centera discovers two files with the same ID number, then it knows the contents of the two files are identical. This helps the system avoid file duplication. The ID number scheme also supports versioning. When a file is changed, the modified file gets a new ID number based on the modified content, so the earlier version is not overwritten. In this way Centera protects and preserves multiple versions of content under development and prevents tampering with old content.
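
The behavior described above (an ID derived from content, corruption detection, duplicate avoidance and a new ID for each changed version) can be illustrated with a minimal sketch. This is not Centera's API, which the article does not show. The `ContentStore` class and its method names are hypothetical, and SHA-256 is an assumed stand-in for whatever identifier scheme the device actually uses.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressed store (illustrative only)."""

    def __init__(self):
        self._blobs = {}  # content ID -> bytes

    @staticmethod
    def content_id(data: bytes) -> str:
        # Assumption: SHA-256 stands in for the device's real ID scheme.
        return hashlib.sha256(data).hexdigest()

    def put(self, data: bytes) -> str:
        cid = self.content_id(data)
        # Identical content yields the same ID, so a duplicate is stored once.
        self._blobs.setdefault(cid, data)
        return cid

    def get(self, cid: str) -> bytes:
        data = self._blobs[cid]
        # Recompute the ID on read; a mismatch means the bytes were altered.
        if self.content_id(data) != cid:
            raise ValueError(f"content does not match ID {cid}")
        return data

    def __len__(self):
        return len(self._blobs)


store = ContentStore()
v1 = store.put(b"Product manual, revision A")
dup = store.put(b"Product manual, revision A")  # same content, same ID
v2 = store.put(b"Product manual, revision B")   # edited content, new ID
```

Here `v1` and `dup` are equal and the store holds only two objects, while `v2` differs from `v1`, so both revisions remain retrievable under their own IDs.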

Synopsis

Vendor: EMC, Hopkinton, MA
www.emc.com

Product: Centera

Description: Network-attached storage optimized for fixed content.

Strengths: Strong redundancy and scalability. Platform independent.

Weaknesses: Requires specific support from application developers.

Price: $210,000 for five terabytes.

Another advantage of content virtualization is that it makes storage system-independent. Centera is a network-attached storage device, so applications interact with Centera over the Internet Protocol (IP). This approach spares application developers from the messy business of keeping track of drive volumes and paths. Content repositories store only a file's ID number along with metadata identifying the content.

"Before content virtualization, archives had to store both metadata and a file's complete path," says Steinhardt. "Because the path was open to the operating system, there was always the possibility of a file getting lost. Users and applications don't interact directly with Centera's internal file structure, so there is less risk of losing a file."
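
As a rough illustration of what a repository record looks like once the path is gone, the sketch below keeps only descriptive metadata and an opaque content ID. The `CatalogEntry` fields, the `find_ids` helper and the sample IDs are all hypothetical; the article does not describe any particular repository schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    content_id: str  # opaque ID returned by the storage device
    title: str
    author: str
    # Deliberately no volume, drive letter or path field.


def find_ids(catalog, **criteria):
    """Return content IDs of entries whose metadata matches every criterion."""
    return [
        entry.content_id
        for entry in catalog
        if all(getattr(entry, key) == value for key, value in criteria.items())
    ]


catalog = [
    CatalogEntry("id-0001", "Q2 pricing memo", "lee"),
    CatalogEntry("id-0002", "Pump spec sheet", "ortiz"),
    CatalogEntry("id-0003", "Q3 pricing memo", "lee"),
]
lee_ids = find_ids(catalog, author="lee")
```

The application searches its own metadata, then hands the resulting IDs to the storage device. Where the bytes physically live never appears in the record.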

Rounding out the Centera package is redundancy based on large arrays of clustered servers. Entry-level systems use a cluster of 16 servers. Each server holds up to four drives of up to 160 GB each. To keep costs down, Centera relies on standard hardware and software: Pentium III processors, IDE hard drives and Linux as the underlying operating system. Clustering the servers gives Centera load balancing and automatic failover. In addition, Centera's internal software mirrors data across drives and servers. More complex than RAID 1, which simply mirrors drives, Centera's mirroring puts a file and its copy on different servers.
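
The key property of this mirroring is that a file and its copy never share a server. The article does not say how Centera chooses servers, so the following is only a sketch of one way to guarantee that property. The `place_copies` function, the hash-based selection and the node names are assumptions made for illustration.

```python
import hashlib


def place_copies(content_id: str, servers: list[str]) -> tuple[str, str]:
    """Pick two different servers for a file and its mirror copy."""
    if len(servers) < 2:
        raise ValueError("mirroring across servers needs at least two servers")
    digest = hashlib.sha256(content_id.encode()).digest()
    primary = int.from_bytes(digest[:8], "big") % len(servers)
    # The offset is between 1 and len(servers) - 1, so the mirror index
    # can never wrap around to the primary index.
    offset = 1 + int.from_bytes(digest[8:16], "big") % (len(servers) - 1)
    mirror = (primary + offset) % len(servers)
    return servers[primary], servers[mirror]


cluster = [f"node{i:02d}" for i in range(16)]  # entry-level cluster of 16
primary, mirror = place_copies("example-content-id", cluster)
```

Because the two copies sit on separate servers, losing one server (not just one drive) still leaves a readable copy, which is what makes failover between cluster members possible.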

Centera systems scale from five terabytes across 16 servers to more than a petabyte across nearly 3,600 individual clustered servers.
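
These capacity figures are consistent with the hardware described: fully populated servers with four 160 GB drives each, with every file stored twice. The short calculation below works through the numbers. It assumes exactly two copies of each file, decimal units (1 TB = 1,000 GB) and no other overhead, none of which the article states outright. The price figure comes from the Synopsis.

```python
DRIVES_PER_SERVER = 4
DRIVE_GB = 160
COPIES = 2        # assumption: each file is stored twice (file plus mirror)
GB_PER_TB = 1000  # assumption: decimal units


def usable_tb(servers: int) -> float:
    """Usable capacity in TB for a fully populated cluster."""
    raw_gb = servers * DRIVES_PER_SERVER * DRIVE_GB
    return raw_gb / COPIES / GB_PER_TB


entry_tb = usable_tb(16)    # 10.24 TB raw, 5.12 TB usable
max_tb = usable_tb(3600)    # 1152.0 TB usable, about 1.15 PB
price_per_tb = 210_000 / 5  # $42,000 per TB for the five-terabyte system
```

Under these assumptions, 16 servers yield roughly the five terabytes quoted, and 3,600 servers (the article says "nearly 3,600") yield a little over a petabyte.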

"It's a pretty impressive solution," says Galen Schreck, an analyst with Forrester Research in Cambridge, MA. "Content virtualization software has been around for some time, but EMC is one of the few companies with the clout to attract developers and become a standard."

E-mail archiving and document management integrations with Centera are already available. OTG, Tower Technology and Artesia Technologies are among the many content management software vendors that have announced or completed integrations with Centera.

As for customers, one of the early adopters of Centera is Framingham, MA-based Connected Corp., a backup software vendor with an application service provider business protecting 102 TB of data for its online customers.

"We specialize in software that backs up every computer our clients own, including desktop PCs, workstations and servers," says Tom Hickman, Connected's engineering operations manager. "An enterprise backup is the result of multiple incremental backups made on a daily basis." Hickman says a large backup customer for Connected can have as many as 80,000 individual computers, resulting in 80,000 incremental backups every day. Prior to deploying Centera, Connected used a two-stage backup solution. Computers were first backed up to a RAID and then moved from RAID to tape.

"Centera's built-in mirroring makes it reliable enough for backup," says Hickman. "The clustering makes it fast enough for an incremental enterprise backup. Plus, since it's online storage, restoration of files is quick."

Hickman says Connected expects to save money with Centera. "It's too early to determine total cost of ownership, [but] we currently need an administrator for every 25 TB of tape storage. Centera should let a single administrator manage up to 100 TBs."
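
To give a sense of scale for Hickman's estimate, the following back-of-the-envelope calculation applies both administrator ratios to the 102 TB that Connected protects. Treating that 102 TB as the volume being administered is an assumption made here for illustration; the article does not give Connected's actual staffing, and the `admins_needed` helper is hypothetical.

```python
import math


def admins_needed(total_tb: float, tb_per_admin: float) -> int:
    """Whole administrators required to cover the given capacity."""
    return math.ceil(total_tb / tb_per_admin)


tape_admins = admins_needed(102, 25)      # 5 at one admin per 25 TB
centera_admins = admins_needed(102, 100)  # 2 at one admin per 100 TB
```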





A Publication of the Network Computing Enterprise Architecture Group
Brought to you by CMP Media LLC, Copyright © 2005