Over the last year or so, the cost of hard drive storage has dropped and disk capacity has
risen so much that a desktop computer can have as much storage as a big RAID system had just a few
years earlier. This has been a boon to heavy users of storage and it has propelled RAID into wider use
in document applications including Web publishing.
As RAID costs have declined and capacities increased, performance has improved with higher-bandwidth
interfaces like Ultra3 SCSI and fibre channel. At 160 megabytes per second, Ultra3 is the fastest
single-channel interface available. Fibre channel is also becoming a popular host interface for RAID
subsystems. A single fibre channel connection supports 100 Mbytes/sec throughput.
Document management applications in particular are ready to take advantage of faster, larger and less
expensive RAID systems. The database operations upon which document management systems depend can be
sped up by a high speed RAID 0 drive set. Combining striping and mirroring (i.e., RAID levels 0 and 1)
improves reliability.
RAID subsystems can be used as a front-end disk cache for optical jukeboxes and
tape archives. The low cost of disk storage means that you can now build a RAID cache large enough to
contribute a performance boost to the entire archive rather than to just a few selected files. RAID
makes it possible to keep an active tape archive or to replace an expensive MO system with a less
expensive DVD or CD alternative.
RAID is an essential part of any Web strategy. If you plan to make
your document archive available on the Internet or an intranet, RAID will help give hundreds or even
thousands of simultaneous users immediate access to documents and other information. While Internet
connections are more often the impediment to speedy delivery, RAID is important for reducing the
latency those users will face as they wait for their particular requests to be fulfilled.
What to Look for in RAID
Before you even look at any type of storage, you should first determine the requirements of your
document management system. Just how does this system store files? Document management systems handle
several different types of files including database files, annotation files and scanned document
images. Each has different storage requirements. Some RAID controllers can be adapted to the types of
files being stored.
Say you start out believing you will only need to store compressed bitonal
images. These images take about 10 kilobytes each. If you then switch to storing color scans, your
RAID will need to be optimized for the larger file sizes. A RAID controller that can adapt to the file
size can maintain performance as your storage habits change. Controllers from CMD Technologies
(www.cmd.com), for example, not only support multiple RAID levels (as do many controllers and
subsystems), they also allow you to vary stripe sizes (the size of data blocks) to match file sizes.
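To see why stripe size matters when file sizes change, here is a minimal Python sketch. It assumes a file starts on a stripe boundary and that stripe units are laid out round-robin across the drives. The function name, the 64-Kbyte and 1,024-Kbyte stripe sizes, the five-drive set and the 2,048-Kbyte color scan are all hypothetical values chosen for illustration, not figures from any vendor.

```python
def drives_touched(file_size_kb: int, stripe_size_kb: int, num_drives: int) -> int:
    """Return how many distinct drives hold part of one file in a striped set.

    Simplifying assumptions: the file starts on a stripe boundary and stripe
    units are placed round-robin across the drives.
    """
    # Number of stripe units the file occupies, rounded up.
    units = -(-file_size_kb // stripe_size_kb)
    # A file can never touch more drives than the set contains.
    return min(units, num_drives)


# A 10-Kbyte bitonal image fits in one stripe unit, so one drive serves it.
print(drives_touched(10, 64, 5))       # 1
# A hypothetical 2,048-Kbyte color scan spreads across all five drives.
print(drives_touched(2048, 64, 5))     # 5
# With a much larger stripe unit the same scan touches only two drives.
print(drives_touched(2048, 1024, 5))   # 2
```

Under these assumptions, small stripe units spread a large file over many drives, while large stripe units keep each request on fewer drives and leave the others free for other requests.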
Flexibility will also be enhanced if your RAID controllers and subsystems can support additional
storage on the fly by either adding more disks or using higher-capacity drives. This makes it easier
for fast-growing companies to increase storage capacities without prior planning. RAID systems from
Procom (www.procom.com) and RAIDtec (www.raidtec.com) can add disk storage without shutting down at
all.
Flexibility and customizability can be more important than common performance benchmarks according to
Steve Ferrari, product marketing manager at CMD Technologies. Many vendors will quote throughput in
Mbytes/sec and transaction rates in number of input/output operations per second, but Ferrari says
these specifications can be misleading.
"RAID vendors quote theoretical specs that aren't achievable in the real world," he
says. "You have to benchmark RAID systems with software that will avoid cache hits and that are
tailored to the user's production environment."
If write or read requests go to the cache on the controller card (a cache hit) rather than to the disk
drives, then the measure of RAID performance will be artificially high. Cache hits don't
reflect the controller's true ability to deliver redundant storage. Ferrari recommends
Intel's Iometer benchmark software (which can be downloaded for free at www.intel.com) to test
RAID systems. The software measures end-to-end performance of storage systems without cache hits.
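A rough way to see how much cache hits can inflate a benchmark is to blend the two service times. The Python sketch below is a simple model, not a description of how Iometer works. It assumes one outstanding request at a time, and the 0.1-millisecond cache time and 10-millisecond disk time are hypothetical round numbers.

```python
def measured_iops(hit_ratio: float, cache_ms: float, disk_ms: float) -> float:
    """Estimate I/O operations per second with one request outstanding at a time.

    hit_ratio is the fraction of requests served from controller cache;
    cache_ms and disk_ms are assumed service times in milliseconds.
    """
    # Average service time is the hit-weighted blend of cache and disk times.
    avg_ms = hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms
    return 1000.0 / avg_ms


# No cache hits: every request pays the full disk service time.
print(round(measured_iops(0.0, 0.1, 10.0)))   # 100
# 90 percent cache hits: the same drives appear roughly nine times faster.
print(round(measured_iops(0.9, 0.1, 10.0)))   # 917
```

In this model, a benchmark that happens to hit the cache most of the time reports a figure that the drives themselves could not sustain under a production workload.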
Most RAID vendors claim that their systems can support high throughput and high transaction rates
at the same time. High throughput (the measure of data delivered) is dependent on having a large pipe,
such as multiple SCSI or fibre channel connections. High transaction rates (input/output requests per
second) are much more dependent on the speed of the controller and drive latency. If you need to move
large amounts of data (more than 100 Mbytes/sec), but you anticipate light loads (in terms of number
of files or users), then you may be able to make do with a multiple-channel RAID system. In this case,
it's always a good idea to use enough drives on each channel to take full advantage of the
available bandwidth (e.g., an 80 Mbyte/sec. Ultra2 SCSI channel can support seven to eight 10
Mbyte/sec. drives).
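The arithmetic behind that rule of thumb can be sketched in a few lines of Python. The function names are hypothetical, and the sketch assumes each drive delivers its full sustained rate and ignores protocol overhead, so treat the results as upper bounds rather than measured figures. The 150-Mbyte/sec target in the second example is also hypothetical.

```python
import math


def drives_to_fill_channel(channel_mb_s: float, drive_mb_s: float) -> int:
    """Largest number of drives whose combined sustained rate fits in one channel."""
    return int(channel_mb_s // drive_mb_s)


def channels_needed(target_mb_s: float, channel_mb_s: float) -> int:
    """Smallest number of channels whose combined bandwidth meets the target."""
    return math.ceil(target_mb_s / channel_mb_s)


# The Ultra2 example from the text: 80 Mbytes/sec shared by 10-Mbyte/sec drives.
print(drives_to_fill_channel(80, 10))   # 8
# A hypothetical 150-Mbyte/sec requirement on 80-Mbyte/sec Ultra2 channels.
print(channels_needed(150, 80))         # 2
```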
If you're supporting a large customer service operation, a trading floor or a busy Web site, you'll
need a system capable of supporting a high transaction rate. Here, fast controllers and low-latency
disk drives (those with short seek times) are much more important. In high-end applications, RAID
vendors will dispense with disk drives completely, replacing them with solid state hard drives. These
devices don't actually have a "drive"; they're just arrays of memory chips emulating a hard drive.
A final performance enhancement is the use of dual
active controllers that are external to the server. External RAID systems often accommodate dual RAID
controllers in what is called an active-active configuration. The two RAID controllers operate in
parallel, splitting the processing load between them. Originally, external controllers were
active-passive. The passive controller took over only if the active controller failed. RAID designers
realized that if both controllers ran in parallel, they could improve performance while ensuring
redundancy.
Today, with few exceptions, external RAID controllers support active-active operation with automatic
failover. A few even support hot swapping of the controllers. These features are available only on
external RAID controllers. Internal controllers (a.k.a. PCI RAID host controllers) are not hot
swappable and they don't support redundancy at the controller level. While you lose some measure
of reliability, internal controllers are less expensive and they do not require a separate
enclosure.
Redundancy: The Other Half of the RAID Story
High performance is important, but it's not enough if a hard drive fails and your company is shut
down while the drive is replaced and the data is restored from backup. This is where redundancy comes
in. All professional RAID enclosures include redundant power supplies, cooling systems and
hot-swappable drive bays. Some support hot-swappable RAID controllers. Even if you are creating JBOD
storage (just a bunch of disks) with no RAID features at all, a good enclosure will keep your data
safe and your drives operating through power surges and heat waves.
When it comes to redundancy, there has been a trend away from parity RAID (RAID levels 3, 4 and 5)
and towards mirroring (RAID level 1). With the low cost of drives, mirroring is not as cost
prohibitive as it used to be. And if you need to maximize write performance, mirroring avoids the
write penalty of calculating parity bits that comes with the parity RAID levels.
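For readers who want to see what "calculating parity" involves, the following Python sketch uses exclusive-OR (XOR), the scheme parity RAID commonly relies on. The article itself does not name the operation, so treat that as an assumption. The four-byte blocks and the function name are made up for illustration; real controllers work on much larger blocks in hardware.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equal-length byte blocks."""
    # zip(*blocks) groups the bytes found at the same offset in every block.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks, one per data drive.
data = [b"DOCS", b"SCAN", b"NOTE"]

# The parity block is computed on every write; this is the extra work
# that mirroring avoids.
parity = xor_blocks(data)

# Simulate losing the second drive: XOR the survivors with the parity block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt)   # b'SCAN'
```

The same operation that protects the data also explains the cost: each write to a data block means the matching parity has to be recomputed and rewritten as well.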
In document management environments, RAID level 0+1 (striping and mirroring) is an effective
combination for critical data that must be accessed at high speeds, such as databases. RAID level 5
(striping with parity) or 0+5 (multiple stripe sets with striped parity) is better suited to large
data archives because it provides redundancy with high storage efficiency. If you are going to use
RAID as the front end (cache) of a tape or optical archive, you may be able to dispense with the
redundancy features altogether (using RAID 0) since the data is already "backed up" in the archive.
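The storage-efficiency trade-off among these levels is easy to work out. The Python sketch below assumes two-way mirroring for RAID 1 and 0+1, and a single drive's worth of parity for RAID 3, 4 and 5. The function name and the eight 18-Gbyte drives are hypothetical.

```python
def usable_capacity_gb(level: str, drives: int, drive_gb: int) -> int:
    """Usable capacity of a drive set under simplified RAID assumptions."""
    if level == "0":
        data_drives = drives            # striping only, no redundancy
    elif level in ("1", "0+1"):
        data_drives = drives // 2       # two-way mirroring halves capacity
    elif level in ("3", "4", "5"):
        data_drives = drives - 1        # one drive's worth of parity
    else:
        raise ValueError(f"unsupported RAID level: {level}")
    return data_drives * drive_gb


# Eight hypothetical 18-Gbyte drives configured three different ways.
for level in ("0", "0+1", "5"):
    print(level, usable_capacity_gb(level, 8, 18))
# 0 144
# 0+1 72
# 5 126
```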
Almost all RAID controllers support multiple logical drive arrays in a single subsystem. In this
scenario, a single controller (or pair of controllers external to your server) can take a large array
of drives and divide them up into separate RAID 0, RAID 1 and RAID 5 sets. The number of permutations
is generally limited only by the number of drives you have available. Using this approach, a single
high-end array can take the place of several low- or mid-range arrays. This is a good feature for
smaller systems where it would be too expensive to run several separate RAID systems.
Building Vs. Buying Complete RAID Subsystems
Depending on your technical proficiency and interest in customizing your storage solution, you can
start with complete RAID subsystems or the controller boards that are the building blocks of complete
systems.
Most end users should consider the complete systems (see table starting on page 48). In
addition to putting everything together for you, suppliers of complete systems provide services such
as drive certification, installation and maintenance. This should not be taken lightly. RAID systems
are complicated to set up properly. If a drive or RAID controller fails, you will receive a
replacement quickly. Suppliers of high-end systems will even monitor the condition of the RAID
remotely and conduct preventative maintenance. All you have to do as the customer is provide floor
space and a clean power supply.
RAID controllers are purchased by integrators, value added resellers, storage specialists and
knowledgeable do-it-yourselfers seeking less expensive or customized RAID subsystems. Most of these
controllers are also sold on the OEM market; if you buy a complete RAID system, chances are it will
feature a controller made by one of these manufacturers.
Host and Drive Interface: This indicates the type and number of channels to the server and the disk
drives, respectively. Ultra, Ultra2 and Ultra3 SCSI are 40-, 80-, and 160-Mbyte/sec., respectively.
Several systems now employ Ultra3 SCSI interfaces. Ultra3 is the fastest single-channel interface
available. The latest PCI RAID host adapters from DPT (www.dpt.com) and Mylex (www.mylex.com) support
these interfaces directly from the server. Fibre channel (FCAL) is also becoming a popular interface.
A single fibre channel connection supports 100 Mbytes/sec throughput.
RAID systems depart from these interfaces in just a few cases. An Ethernet host interface is used
for network attached storage. A couple of low-end RAID systems use Fast and Wide SCSI (SCSI-II) or
ATA-2 interfaces to the drives.
Total Number of Drives: This is simply the total number of drives the vendor says can be connected to
a RAID controller. Be careful with this number. While some vendors simply list the total number of
drives that can be supported by the interface used by their RAID controller, a few are more honest and
list only the number of drives that it is practical to connect. The more drives you connect to a
controller, the more likely you are to exceed the controller's ability to process data. Too many
drives and you can slow a RAID system's performance to a crawl.
Furthermore, it is all too easy to saturate a bus's ability to move data. If you assume a hard drive
can transfer 15 Mbytes per second, it takes only five or six drives to fully saturate an 80-Mbyte/sec
Ultra2 SCSI channel and about seven drives to saturate a 100-Mbyte/sec fibre channel loop.
Cache: This indicates how much solid state memory resides on the controller. A large cache may improve
performance, but unless the cache has some sort of battery backup or is mirrored across multiple
controllers, it is unsafe storage that can be lost to a power dropout or controller failure. All the
active-active controllers in this listing are capable of mirroring their cached data.
RAID Levels
RAID 0: Striping. RAID 0 specifies that data is striped across two or more drives. This
allows multiple drives to be used when accessing data and makes more efficient use of SCSI bandwidth.
RAID 0 carries no redundancy in case of a drive failure.
RAID 1: Mirroring. RAID 1 makes duplicate copies of data on each drive in the RAID
system. It is the ultimate in redundancy.
RAID 0 + 1 (10): RAID 10 combines mirroring and striping in a single RAID subsystem. This
provides the maximum redundancy with no loss in performance. Other RAID levels require a small loss in
performance to provide redundancy.
RAID 3: This level takes a block of data and breaks it up into stripes that are recorded
across two or more drives. Parity information for each data stripe is recorded on a single additional
drive. RAID 3 is infrequently used in hard drive RAID systems, but it is used in tape arrays.
RAID 4: This is similar to RAID 3 except that instead of creating parity for each stripe
of data, parity is created for the entire data block. RAID 4 supports higher transaction rates. Parity
is checked on each block rather than each stripe.
RAID 5: Similar to RAID 4 in that parity is generated for each block. However, instead of
a single dedicated parity disk, the parity information is striped on the data disks along with the
blocks themselves. Transaction rates are high, but write speed is penalized because each write also
requires the controller to read and update the associated parity information, which it keeps on a
different disk from the data blocks it protects.
RAID 5 + 3: Despite its name, a combination of RAID 0 and RAID 3. Each of the "drives" of a RAID
0 system is set up as a RAID 3 subsystem.
RAID Glossary
Block. A string of data elements recorded, processed or transmitted as a unit. The elements can be
characters, words or physical records.
Cache Memory. High-speed random access memory used to speed up I/O operations. It can be used to store
frequently accessed data and is used for intermediate storage of data retrieved from disk or data
that's to be written to disk.
Checksum. A number that represents the sum of the bits within an arbitrary length of binary data. The
checksum of 0100110 is 3.
Controller. Software or hardware that handles the striping or mirroring of data across the drives and
manages the drives.
Data Availability. The level of fault tolerance within an array. The more component failures that can
occur without losing access to data, the higher the level of availability. The level of availability
provided by RAID systems varies from simple disk redundancy to total component redundancy.
Disk. A randomly addressable, rewritable mass storage device.
Disk Array. A collection of disks presented as one or more virtual disks to the host.
Disk Striping. A type of disk array mapping in which consecutive stripes of data are mapped
round-robin to consecutive array members; the act of binding a group of two or more physical disks to
form a single logical disk. Striping maps data across the entire disk array, breaking up "hot spots"
(performance bottlenecks caused by frequent access to a chunk of data).
DGR. Dynamic Growth and Reconfiguration. The ability to add storage and change RAID levels
without taking your entire system off-line.
Fault Tolerant. Having no single point of failure that would result in loss of data availability.
Fibre Channel. A high-speed interface that can connect a RAID system to a host computer and permit
high data transfer rates.
Hot-Pluggable. Able to accept added components to the subsystem while it's still operating.
Hot Spare. A spare drive that's continuously spinning. If a drive fails, the spare drive
immediately replaces it.
Hot Swap. To manually replace a defective disk, fan, power source or controller while the rest of the
RAID system is running.
Mirroring. Data written to two drives at the same time. If one drive fails, the other provides the
data immediately.
Parity. Extra information used in RAID. If a disk fails, the parity, with the data on the remaining
drives, can be used to recreate the lost drive's data.
RAID. Redundant Array of Independent Disks. A method of data storage where you store information over
many disk drives.
Read Cache. The cache used to accelerate read operations by retaining data previously read, written or
erased, based on the prediction that it will be reread.
Regeneration. The process of rebuilding user data that was stored on a failed RAID 1, 3 or 5 array
disk. Regeneration may be used to recover data when a member disk has failed. It can also be used to
recover data when an unrecoverable media error is encountered on a member disk.
SCSI. Small Computer System Interface. An interface standard that lets devices such as hard drives and
optical drives communicate with a computer's main processor. The latest version is Ultra3 with a
maximum speed of 160 Mbytes/sec.
Transaction Rate. The number of I/O requests satisfied per unit of time, such as a second.
Throughput. The speed at which data can be moved from one place to another, usually expressed in
megabytes per second.
Ultra SCSI. Parallel interface that transfers data at more than twice the speed of previous SCSI
interfaces, using fast-wide bandwidths. Also known as SCSI-3.
Write-Back Cache. Deploys controller cache for write operations, dynamically allocating memory
as needed to both read and write operations. This lets all of your different applications continue
without waiting for completions of writes to disk, while batteries protect cached write data from
power interruptions.
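The Checksum entry above can be checked with a couple of lines of Python. This sketch follows the glossary's definition literally, summing the individual bits of a binary string; checksums used in practice typically sum larger units such as bytes or words. The function name is hypothetical.

```python
def bit_sum(bits: str) -> int:
    """Sum the bits of a binary string, per the glossary's checksum definition."""
    # Each "1" contributes one to the sum and each "0" contributes nothing.
    return bits.count("1")


# The example from the Checksum entry.
print(bit_sum("0100110"))   # 3
```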