July 2000
Storage Area Networks
By Lowell Rappaport
A fact of life in any information-driven enterprise is that the demand for storage always increases. The trouble is, adding storage the old-fashioned way by adding servers, network resources and more software is costly and complicated. A storage area network (SAN) simplifies management and scaling by placing all enterprisewide storage on a dedicated, high-speed network.
"[SANs let you] pool storage so that multiple servers can reach data,
easing scaling and reducing management costs," says Skip Jones, president
of the Fibre Channel Industry Association (www.fibrechannel.org).
Jones says that SANs operate independently of the local and wide area networks that users connect to. This enables databases and other applications to access data without having to compete with user traffic for bandwidth. Also, by putting all the storage in one place, an administrator can manage it from a single console rather than logging into multiple independent storage servers. Scaling can be as easy as plugging a storage device into a network port.
Despite the advantages of a SAN, don't go charging into building one
without careful thought. "SAN components are expensive," warns Eric
Herzog, vice president of marketing at Mylex (www.mylex.com).
"Fibre channel host adapters are two to three times as expensive as the
SCSI adapters; cabling, hubs and routers are about twice as expensive [as components
used in traditional networks]."
As a storage area network's capacity gets larger, the portion of cost dedicated to fibre channel hardware becomes less significant. Once a system passes the $1 million mark, commodity items like hard drives and other storage account for more than 85 percent of the cost. Therefore, if you are building a lot of storage, or if you anticipate rapidly increasing demand for storage, the extra cost of fibre channel equipment is more than made up by savings stemming from ease of scalability and management.
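The cost argument above can be sketched with a little arithmetic. The dollar figures below are invented purely for illustration (they are not from any vendor or from the sources quoted here), and the function name is hypothetical. The sketch also assumes the fibre channel outlay stays fixed while storage grows, which is a simplification.

```python
def fibre_channel_share(fc_hardware_cost, storage_cost):
    """Return the fraction of total SAN cost spent on fibre channel hardware."""
    total = fc_hardware_cost + storage_cost
    if total <= 0:
        raise ValueError("total cost must be positive")
    return fc_hardware_cost / total


# Hypothetical budgets: the same fibre channel outlay, two storage sizes.
FC_COST = 150_000
small = fibre_channel_share(FC_COST, 100_000)   # $250,000 system
large = fibre_channel_share(FC_COST, 850_000)   # $1 million system

print(f"small SAN: {small:.0%} of cost is fibre channel hardware")
print(f"large SAN: {large:.0%} of cost is fibre channel hardware")
```

With these made-up figures, the fibre channel share falls from 60 percent of a $250,000 system to 15 percent of a $1 million system, which matches the pattern described above.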
Traditionally, storage devices are attached to a server or cluster of servers via a SCSI interface. The software that manages storage, such as tape or optical library management and hierarchical storage management, all runs on the same server. Access to the storage is made exclusively through the server.
In a SAN, the storage devices and servers are all placed on a network. This should be as easy to put together as an ethernet network, but there are a few caveats. First, the state of the fibre channel technology used for SANs is where ethernet and SCSI were 15 years ago. One of the reasons ethernet is easy is that it has been under steady development for decades. SAN vendors caution about the interoperability of fibre channel, and most build sophisticated labs where they test the compatibility of hardware from different manufacturers.
Building a SAN is also complicated by the fact that there is no overall operating system for storage area networks. Instead, management software for a SAN is spread across several layers of machines. The first difficulty in building a SAN is deciding where the intelligence should lie. In traditional server-attached storage, all the intelligence lies in the server the storage device is connected to. Storage is completely under the server's control regardless of whether the storage is a RAID system, optical jukebox or tape archive.
The development of network attached storage (NAS) has suggested a direction that SAN storage may eventually follow. Originally developed to add value to storage devices, NAS devices make storage independent of the server. On an ethernet, they are quick and easy to set up. Unfortunately, NAS devices that connect to the LAN via ethernet are generally not fast enough to support hundreds or thousands of simultaneous users across an enterprise.
In a SAN environment, network attached technology comes into its own. Fibre channel is fast enough to support the demands of an enterprise, so storage devices can function independently of any server. For tape libraries and optical jukeboxes, this means that the storage device manages its own media and physical volumes while virtual volumes are managed by a separate server.
Other functions found in storage management systems are also spread out among several layers on a SAN: The file system is handled by a metadata server, storage volume management is handled by a storage management application, and file migration is handled by a separate hierarchical management system.
Because storage management functions are easy to separate on a SAN, a SAN can accommodate a wide variety of storage devices, including new and legacy equipment. Building a storage architecture from many different devices can be complicated, but a SAN makes it easy to incorporate legacy equipment such as optical jukeboxes and older RAID systems. You simply give the legacy storage servers their own fibre channel interfaces and then connect them to the SAN.
Starting With the Basics
At the lowest management level, SANs use host-to-LUN (logical unit number) mapping. This is similar to traditional SCSI-attached storage in that the device is partitioned into multiple logical volumes, each of which has a dedicated connection to a single server. The advantage in a SAN is that all of the servers and their respective storage volumes can share the same fibre connectivity. Adding storage to a server is just a matter of adding storage to the fibre network.
The disadvantage of using host-to-LUN mapping is that storage devices are still tied to their servers. Users still have to go through a server to access their files. The server can become a performance bottleneck and, when you have a large number of servers, a management headache as well.
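As a rough illustration of the host-to-LUN mapping described above, the following sketch models the mapping as a table in which each logical volume belongs to exactly one server. The class and method names are hypothetical and do not correspond to any vendor's software. It shows only the bookkeeping, not how a real SAN enforces it.

```python
class LunMap:
    """Toy host-to-LUN table: each logical volume is dedicated to one server."""

    def __init__(self):
        self._owner = {}  # LUN number -> name of the server that owns it

    def assign(self, lun, host):
        """Dedicate a LUN to a host; refuse if another host already owns it."""
        current = self._owner.get(lun)
        if current is not None and current != host:
            raise ValueError(f"LUN {lun} is already mapped to {current}")
        self._owner[lun] = host

    def can_access(self, host, lun):
        """Only the owning server may reach the volume."""
        return self._owner.get(lun) == host

    def luns_for(self, host):
        """List the volumes dedicated to one server, in LUN order."""
        return sorted(lun for lun, h in self._owner.items() if h == host)


lun_map = LunMap()
lun_map.assign(0, "db-server")
lun_map.assign(1, "db-server")
lun_map.assign(2, "web-server")
print(lun_map.luns_for("db-server"))
print(lun_map.can_access("web-server", 0))
```

Adding storage to a server amounts to one more `assign` call, but every volume still has a single owner, so users who need those files must still go through that server.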
The Real-time Data Intelligence (REDI) software sold with Magnitude SANs from
Xiotech (www.xiotech.com) assigns storage volumes
to servers via host-to-LUN mapping. The software supports server failover for
reliability, and servers can be clustered together using software from Veritas,
Novell or Microsoft. Clustering permits load balancing across multiple servers
accessing a single storage volume. Other portions of the REDI software suite
enable mirroring of SAN storage volumes for redundancy and ease of administration
via a single console.
Xiotech's Magnitude SAN integrates hardware and software. "Architecting these together improves reliability and control over the SAN," says Dick Blaschke, the company's executive vice president of marketing. Blaschke adds that design integration helps Xiotech overcome the potential bottlenecks associated with host-to-LUN mapping, and he contends that bottlenecks are better addressed with additional servers and a faster SAN network.
Virtual File Systems
Virtual file systems eliminate the potential bottleneck of sending everything through a single server by giving multiple servers full access to all storage volumes. A virtual file system places all storage on a SAN into a single pool. The pool can be separated into distinct virtual partitions, but the key feature is that all storage is available to all the servers connected to the SAN.
Virtual file systems are not unlike network file systems such as Novell's. But because a virtual file system sits on a SAN, access to data is not restricted to a single server or server cluster. It is also important to note that the servers in a virtual file system are not storage servers that manage file access. Rather, they are application servers. The applications can include any software that interacts with storage.
Adding storage is simply a matter of plugging in a storage device and telling the virtual file system how it is to be used. When you need to scale up to more users, you add an application server such as a computer running a database. This gives users an additional avenue to their files without affecting the systems already in place.
Obviously, you can't allow all servers unfettered access to a storage volume. When multiple servers try to write to a single file, there must be a mechanism that grants each server permission to overwrite the file and that keeps a journal of changes made to the file system. This job is given to a metadata server that keeps track of security and keeps different servers from making undocumented changes to commonly used files. Metadata servers can be tied directly to the SAN or they can be tied in through an ethernet connection.
Although virtual file systems are more complex to install and set up, they have advantages over most host-to-LUN mapped SANs. They are very scalable, allowing you to add capacity just by adding application servers. Virtual file systems also improve performance by removing the storage server from the loop.
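The metadata server's role described above, granting write permission to one server at a time and journaling the changes, can be sketched as follows. This is a minimal illustration under stated assumptions: the class, method and server names are hypothetical, and it does not reflect how any particular virtual file system product works.

```python
class MetadataServer:
    """Toy metadata server: one writer per file, every grant is journaled."""

    def __init__(self):
        self._writer = {}   # file path -> application server allowed to write
        self.journal = []   # ordered (server, path, action) records

    def request_write(self, server, path):
        """Grant write permission unless another server already holds it."""
        holder = self._writer.get(path)
        if holder == server:
            return True          # already granted, nothing new to record
        if holder is not None:
            return False         # someone else is writing, so deny
        self._writer[path] = server
        self.journal.append((server, path, "write-granted"))
        return True

    def release(self, server, path):
        """Give up write permission so another server can obtain it."""
        if self._writer.get(path) != server:
            raise PermissionError(f"{server} does not hold {path}")
        del self._writer[path]
        self.journal.append((server, path, "released"))


md = MetadataServer()
print(md.request_write("nt-app", "/docs/report.doc"))    # first writer
print(md.request_write("bsd-web", "/docs/report.doc"))   # second writer, denied
md.release("nt-app", "/docs/report.doc")
print(md.request_write("bsd-web", "/docs/report.doc"))   # now allowed
```

Note that in this sketch only permissions and journal entries pass through the metadata server. The file data itself would travel directly between the application server and the storage device over the SAN.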
Eliminating storage servers has enormous implications for document management applications. Databases and Web servers can access storage directly rather than going through a storage server, and multiple database servers can run off the same database. Since document management systems are highly dependent on the performance of a database, you can spread the processing load over several computers or computer clusters.
The Sanergy virtual file system from Tivoli (www.tivoli.com)
is built on either NTFS (Microsoft NT file system) or UFS (Unix file system)
depending on the type of metadata controller used. Once you pick a metadata
platform, any application server can access the storage regardless of platform.
In other words, the storage systems use NTFS or UFS, but the application servers
can be Windows NT, Solaris, IRIX, AIX, Tru64 or even Macintosh.
Sanergy lets you use existing file management utilities to set up security and user-level permissions. For administrators long used to working with traditional server-attached storage, this makes for an easier transition to a SAN environment. Sanergy also supports metadata server failover as well as traditional clustering of both metadata servers and application servers.
While Tivoli's software relies on operating systems, Centravision from
ADIC (www.adic.com) uses its own proprietary file
system. Like Sanergy, Centravision employs a metadata server that controls access
to the storage devices. And like a proper virtual file system, the application
servers can access storage directly without having to funnel files through a
server.
Centravision uses its own file system rather than relying on Windows or Solaris for file system services. This lets just about any computer serve as a metadata server. In theory, Windows, Solaris, Linux, FreeBSD, any other Unix or even a Macintosh can serve metadata. And any machine can act as an application server. The only requirement is that the metadata server be scalable to support a growing number of application servers. This requirement can easily be met with clustering software.
Another advantage of virtual file systems is that you can match your application servers to the platforms your end users are on or to your chosen applications. For example, if you need to serve documents both internally and on the Web, you can use a Windows NT machine for internal use and a FreeBSD machine with Apache Web server software for the Internet. Both machines can serve the same files using the same database. In exchange for the added complexity of a virtual file system, you get greater flexibility in supporting operating systems and applications. You can also save money by buying only the servers you need for current needs, confident that you can always scale up as needed.
Command and Control
One of the major advantages of a SAN is that it brings all your storage into
one place, but the SAN isn't going to work if you can't monitor and
control it. Vixel (www.vixel.com) is well known
in SAN circles as a maker of fibre channel hubs, switches and routers. But the
company also offers SAN Insight 2000, a comprehensive SAN management utility.
"SANs incorporate hardware from many different vendors," says Brett Oxenhandler, senior product marketing manager for Vixel. "We try to make SAN Insight work with every possible piece of SAN hardware available."
SAN Insight provides a single console that monitors the performance of every piece of equipment on a storage area network, including devices that may not be on the fibre channel network at all. When an administrator wants to work with a piece of equipment, SAN Insight will automatically load the necessary management software. It does this by managing devices through an ethernet running in parallel with a SAN's fibre channel network.
It is a given that data files on a SAN are transported across the fibre network, but you can choose how SAN devices communicate with each other. It is possible to run communications network protocols over fibre channel. Devices that would normally communicate with each other via TCP/IP over ethernet can also use IP over fibre channel, but this is not necessarily the best way for devices on a SAN to talk to each other, according to Eric Herzog of Mylex.
"TCP is a packet technology, but storage is not transferred the same way," Herzog says, adding that data file transfer is controlled by SCSI commands. The two would have to be made to work together.
There are other reasons not to use TCP/IP over fibre channel. TCP/IP was designed for the Internet, a network built on top of hardware that varies widely in speed and quality. TCP/IP is designed to work on crowded networks that transfer millions of packets every second. The protocol has a lot of built-in redundancies that are a waste of bandwidth on a high-speed network.
Virtual Interface is a communications protocol designed just for SANs. It is designed for high-reliability interconnects like fibre channel, and it lets devices talk with each other over SAN hardware. Virtual Interface has less overhead than a conventional networking protocol like TCP/IP and is designed to coexist with the SCSI commands that serve as the data transfer protocol used on fibre channel.
Design Considerations
Building a SAN is not a trivial matter. Hardware interoperability problems are just the start. You have to design a SAN for the way your enterprise works. Like LANs, SANs can be segmented with hubs, routers and switches. These devices work the same way on a SAN that they do on an ethernet, but traffic patterns differ. For example, all the devices on a SAN will need to communicate with the backup devices, but there may not be a lot of traffic between storage devices on a SAN. At the same time, you'll have multiple servers fighting for bandwidth while trying to access storage devices.
The few SANs that are advertised as plug and play are relatively simple systems that closely resemble network attached storage. If they include both hard disk and tape backup storage, they generally will not allow very much integration with third-party hardware or software.
Plug and play for SANs is not a pipe dream, but it is about two to three years
away, according to Larry Krantz, chairman of the board of the Storage Networking
Industry Association (www.snia.org).
"The big breakthrough in SANs will be to develop a mapping device that will locate all the storage and data on the SAN," Krantz says.
Until then, SANs will have to be carefully constructed to avoid hardware incompatibilities,
and the software will have to be carefully matched to the hardware. In building
SANs with its Shark Enterprise Storage Servers, IBM (www.ibm.com/storage)
uses StoreWatch software for low-level partitioning and host-to-LUN mapping
and Tivoli software for virtual file systems and management. According to Chris
Saul, IBM's disk systems consultant in the storage subsystem division,
the company still lacks a unified console for managing SANs, but it is in development.
"A SAN with a management console can make administrators three times as efficient as with conventional storage systems," Saul says.
Tape libraries figure prominently in SANs installed by StorageTek (www.storagetek.com).
The company combines RAID subsystems with tape libraries to create enterprise-scale
storage area networks. StorageTek relies on host-to-LUN mapping, and a key component
of its SAN strategy is serverless backup using LAN-free backup software.
"With a SAN, you only need a few tape drives to back up rather than a tape drive on every server," says Don Kleinschnitz, StorageTek's vice president of corporate strategy.
This resource sharing ability extends to all storage devices in a SAN. In a pure fibre network, where even the disk drives are fibre channel, you only need a handful of spare hard drives for failover. In traditional RAID systems, each RAID volume must have its own spare drive. When you have dozens or even hundreds of RAID volumes, it can get expensive to keep all those drives online yet unused, just waiting for drive failures.
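The spare-drive comparison above is simple arithmetic, sketched below. The volume count and pool size are invented for illustration, and the function names are hypothetical. The sketch assumes one dedicated spare per RAID volume in the traditional case, as the text describes.

```python
def dedicated_spares(raid_volumes, spares_per_volume=1):
    """Idle spare drives needed when every RAID volume keeps its own spare."""
    return raid_volumes * spares_per_volume


def spares_saved(raid_volumes, shared_pool_size, spares_per_volume=1):
    """Drives freed by replacing per-volume spares with one shared SAN pool."""
    return dedicated_spares(raid_volumes, spares_per_volume) - shared_pool_size


# Hypothetical site: 100 RAID volumes, or a shared pool of 5 spares on the SAN.
print(dedicated_spares(100))
print(spares_saved(100, shared_pool_size=5))
```

How large the shared pool should be depends on expected failure rates and rebuild times, which this sketch does not model.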
Kleinschnitz says that buying a SAN is a strategic decision, one where starting
small and growing later on makes financial sense. "SANs are not sold purely
for performance," he concludes. "They are sold based on future growth."