RAID: Saving Your Data
RAID, the Redundant Array of Inexpensive Drives (or Disks, depending on whom you ask), has been around for years, originally fueled by the need to ensure high data integrity and availability. The basic premise of storing data on more than one drive to make sure nothing gets lost, together with the ability to recover from failed disk drives without shutting down the system, has made RAID a staple for many larger systems, mostly UNIX-based, but also gradually easing into the DOS and Windows worlds as those operating systems became the drivers of critical servers. The goal of RAID is to reach a point where Mean Time to Data Availability (MTDA) is minimal and Mean Time to Data Loss (MTDL) is infinitely large. In other words, you never have to wait for your system to recover from a disk problem, and you never lose information.
Windows NT is ideally suited as a mission-critical operating system because of its high reliability and strong security. RAID subsystems have been available for Windows NT from the earliest versions, and Windows NT 4 extends the support to provide built-in RAID capabilities and a reasonably long list of supported RAID controllers. In this article we look at what RAID is, how it works, how you use it, and how to get it installed on your systems. We’ll try and show you why RAID may be important to you and why you should consider implementing it. But first, let’s figure out what RAID is.
The RAID Levels
RAID methodology was first proposed at the University of California at Berkeley, although it was not envisioned in quite the same way we use RAID today. RAID was defined as a memory architecture using a subsystem of two or more hard disk drives treated as a single, larger logical drive. The purpose of this proposed architecture was to take advantage of data redundancy inherent in the multiple drive design, as well as to capitalize on the lower costs of smaller drives. (When RAID was first proposed it was considerably cheaper to buy five 200MB hard drives than one larger 1GB drive, for example. This is no longer true, so the current focus for RAID is data integrity and reliability instead of cost saving.)
Originally there were six levels of RAID (called RAID 0 through RAID 5), and a few more have been added lately to combine features of other levels. Although RAID 6 follows the general numbering process, most of the new levels break with the number sequence and create numbers of their own, usually for marketing reasons. Not all of the RAID levels are commercially available, and some are supported by only a few products. An example is RAID 6, which very few products implement. Only RAID levels 0, 1, and 5 are supported by Windows NT 4 without additional hardware and software from RAID vendors. Those three levels, though, can be supported by a variety of hard disk and controller combinations.
RAID 0 is the base level of the RAID technology. RAID 0 stripes data across all drives. Striping means that all available hard drives are combined into a single large virtual filesystem, with the blocks of the filesystem arrayed so that they are spread evenly across all the drives. For example, if you have three 500MB hard drives, RAID 0 provides for a 1.5GB virtual hard drive. When you store files, they are written across all three drives. When a large file, such as a 100MB multimedia presentation, is saved to the virtual drive, a part of it may be written to the first drive, the next chunk to the second, more to the third, and perhaps more wrapping back to the first drive to start the sequence again. The exact manner in which the chunks of data move from physical drive to physical drive depends on the way the virtual drive has been set up, which includes considering drive capacity and the way in which blocks are allocated on each drive.
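The way chunks wrap from drive to drive can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not part of any particular RAID product: all drives are the same size, one filesystem block is written per drive before moving on, and the function name locate_block is made up for the example.

```python
def locate_block(logical_block: int, num_drives: int) -> tuple[int, int]:
    """Map a block of the virtual drive to (drive index, block offset on that drive).

    Assumes simple round-robin striping over equal-size drives.
    """
    drive = logical_block % num_drives    # which physical drive gets the block
    offset = logical_block // num_drives  # how far into that drive it lands
    return drive, offset


if __name__ == "__main__":
    # Walk the first few blocks of a three-drive array to see the wrap-around.
    for block in range(7):
        drive, offset = locate_block(block, 3)
        print(f"logical block {block} -> drive {drive}, offset {offset}")
```

With three drives, logical blocks 0, 1, and 2 land on drives 0, 1, and 2, and block 3 wraps back to drive 0 at the next offset, which is the sequence described above.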
No parity control is used with RAID 0, which is the major problem with this level. There is no data redundancy at all with RAID 0, either, as the data is written only once. RAID 0 does have a number of advantages, though. Most important for many users is that striping across three or more drives does provide some increase in performance through load-balancing. This is especially noticeable with large files. For example, to return to our 100MB file save, if 33MB were written to each of the three hard drives in our RAID 0 system, all three drives could save data at the same time. The total save time for the file is theoretically one third that of writing the full file on one drive. Reading data is similar, in that the three thirds of the large file can be read simultaneously and combined by the RAID software, again theoretically reducing the read time to one third. In practice, the savings are not that dramatic due to factors such as drive latency and head seek times, but RAID 0 arrays of three or more drives can noticeably improve read and write performance. On the author’s RAID 0 system comprising four drives, read and write times are less than half those of a single large drive.
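The one-third figure is simple arithmetic, shown here as a small Python sketch. The per-drive transfer rate of 5MB per second is an invented number chosen only to make the sums easy, and the function deliberately ignores latency, seek time, and controller overhead, so it gives the theoretical best case rather than anything you would measure.

```python
def ideal_transfer_seconds(file_mb: float, drive_mb_per_s: float, num_drives: int) -> float:
    """Best-case transfer time when a file is split evenly across striped drives.

    Ignores latency, seek time, and controller overhead (assumption).
    """
    share_per_drive = file_mb / num_drives   # each drive handles an equal slice
    return share_per_drive / drive_mb_per_s  # all slices move at the same time


if __name__ == "__main__":
    rate = 5.0  # hypothetical per-drive rate in MB per second
    for drives in (1, 3):
        seconds = ideal_transfer_seconds(100, rate, drives)
        print(f"{drives} drive(s): {seconds:.1f} seconds")
```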
RAID 0 has a side advantage that many users will like in that it allows drives of differing sizes and layouts to be used. For example, if you have a couple of drives of only a few hundred megabytes capacity that have been replaced in a server by high capacity drives, you can use the smaller units in a RAID 0 array with no penalty. Windows NT 4 supports RAID 0 setups of two to thirty-two disks.
RAID 1 is known as disk mirroring or disk shadowing. With RAID 1, each hard drive on the system has a duplicate drive which contains an exact copy of the first drive’s contents. Since every bit written to the filesystem is duplicated, there is data redundancy with RAID 1. If one drive in the RAID 1 array fails or develops a problem of any kind (such as a bad sector), the mirror drive can take over and maintain all normal filesystem operations while the faulty drive is diagnosed and fixed. Many RAID 1 disk controllers have software routines that will automatically take a faulty drive off-line, run diagnostics on it and, if possible, reformat the drive and copy all data back from the mirror image all while the filesystem proceeds as if nothing had happened. Users are usually unaware of faults with RAID 1 controllers except for alert messages that can be triggered.
One big disadvantage of RAID 1 is the use of disks. If you have two 2GB drives, you can only have a total filesystem of 2GB (the other 2GB is mirrored). You’re only getting half the disk space you’re paying for, but you do have fully redundant drives. In case of catastrophic failure of a drive, a controller, or a motherboard, you can remove a mirror drive and boot on another controller or server.
RAID 1 offers an increase in read performance in most implementations, as the controller card allows both of the drives (primary and mirror) to be read at the same time, resulting in a faster read operation. Write operations are not faster, though, as there are two drives that must be written. In many RAID 1 systems that do not use separate drive controllers for the primary and mirror drives, writing can even slow down as two complete write operations must be performed in sequence. Testing on one of the author’s systems using a single SCSI controller card shows that the speed increase of the read operation is more than offset by write delays, with a net effect of a slight degradation in system performance. With two controller cards (one for each drive), performance increases slightly due to the read advantages, but only when the two controllers are able to cooperate with each other.
Implementations of RAID 1 usually require two drives of similar size. If you use a 1.5GB and a 2GB drive, for example, the extra 0.5GB on the second drive is wasted. Some controllers will allow you to combine drives of different sizes, with the extra used for non-mirrored partitions. Windows NT 4 supports RAID 1 with two or more drives.
RAID 2 tries to overcome the speed limitations of RAID 1 by using a technique called Hamming codes. Hamming showed that data can be organized in such a manner that if an error affects only one bit in each byte, the error can be detected and corrected without an inordinate amount of system overhead. In the classic example of RAID 2, nine disk drives can be used in an array which has the first bit of every byte written on the first drive, the second bit of the byte on the second drive, and so on. The ninth drive has an error correction bit. A fault affecting one drive’s read of a bit will be detected by the correction code, and fixed with a simple algorithm. The same method can be used with fewer than nine drives by writing more than one bit per drive, but this increases the chances that a damaged hard drive will lead to unrecoverable errors.
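To see how a Hamming code can both find and fix a single bad bit, here is a small Python sketch of the textbook Hamming(7,4) code, which protects four data bits with three check bits. It is not the nine-drive layout described above and it is not taken from any RAID product; the function names are invented for the example.

```python
def hamming74_encode(data_bits: list[int]) -> list[int]:
    """Encode 4 data bits as a 7-bit codeword (check bits at positions 1, 2, 4)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(codeword: list[int]) -> list[int]:
    """Return a copy of the codeword with a single flipped bit (if any) repaired."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4  # 0 means clean, otherwise the bad position
    if syndrome:
        c[syndrome - 1] ^= 1
    return c


def hamming74_data(codeword: list[int]) -> list[int]:
    """Pull the 4 data bits back out of a 7-bit codeword."""
    return [codeword[2], codeword[4], codeword[5], codeword[6]]


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    damaged = list(word)
    damaged[5] ^= 1  # simulate one drive returning the wrong bit
    print("repaired data:", hamming74_data(hamming74_correct(damaged)))
```

If each of the seven bits is imagined as living on its own drive, a single drive returning a wrong bit is located by the syndrome and flipped back, which is the kind of correction the paragraph above describes.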
All drives in a RAID 2 array read and write data in parallel (as with RAID 0), so throughput is faster. In theory, throughput increases by the same factor as the number of drives involved, so a nine-drive array will result in read and write operations taking one eighth the amount of time (ignoring the error-correction drive). In practice, RAID 2 systems don’t come near this theoretical figure, but they do show a noticeable increase in performance. The major problem with RAID 2 is that it requires large disk arrays, which are seldom practical for small server systems. RAID 2 is not inherently supported by Windows NT 4, but some commercial implementations are adaptable. The sheer size and cost of a RAID 2 subsystem generally rule it out in favor of other RAID systems.
RAID 3 tries to combine the techniques of RAID 0 and RAID 2 into a more reasonably-priced system. With RAID 3, two or more drives hold data while an extra drive holds error correction. Data is interleaved across the data drives so that the first byte is on the first drive, the second byte on the second, and so on. When all the data drives are used, the storage loops back to the first data drive. The number of drives affects which bytes are stored where, so in a two-drive array drive one will have the odd-numbered bytes and drive two will have the even-numbered bytes. The error correction drive holds a bit-sensitive value for each sector on the data drives (it actually uses the same algorithm as that employed with parity RAM). With eight data drives, the system is very similar to RAID 2 except the error correction algorithm is different.
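The interleaving and the dedicated error correction drive can be modelled in Python as follows. This is a sketch under stated assumptions: the parity is computed as a plain XOR across the data drives (the usual parity technique), the data is padded with zero bytes so every drive holds the same amount, and the function names are invented for the example.

```python
def stripe_with_parity(data: bytes, num_data_drives: int) -> tuple[list[bytearray], bytearray]:
    """Interleave bytes across the data drives and build a separate parity drive."""
    padding = (-len(data)) % num_data_drives  # zero-fill so drives are equal length
    padded = data + bytes(padding)
    # Drive i receives bytes i, i + n, i + 2n, ... of the padded data.
    drives = [bytearray(padded[i::num_data_drives]) for i in range(num_data_drives)]
    parity = bytearray(len(drives[0]))
    for drive in drives:
        for i, value in enumerate(drive):
            parity[i] ^= value
    return drives, parity


def rebuild_drive(drives: list[bytearray], parity: bytearray, failed: int) -> bytearray:
    """Recreate the contents of one failed data drive from the survivors and parity."""
    rebuilt = bytearray(parity)
    for index, drive in enumerate(drives):
        if index == failed:
            continue  # the failed drive contributes nothing
        for i, value in enumerate(drive):
            rebuilt[i] ^= value
    return rebuilt


if __name__ == "__main__":
    drives, parity = stripe_with_parity(b"RAID level 3", 3)
    print("drive 1 rebuilt correctly:", rebuild_drive(drives, parity, 1) == drives[1])
```

Because XOR is its own inverse, combining the parity drive with every surviving data drive yields exactly the bytes that were on the missing one.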
As you would expect, RAID 3 offers a performance increase because of simultaneous reads and writes. RAID 3 is often found on larger minicomputer systems where huge files must be handled and the performance advantage RAID 3 offers is noticeable. As with RAID 2, though, the cost of setting up a RAID 3 subsystem can be prohibitive, and there are better methods to achieve the same result. RAID 3 is not supported by Windows NT 4, although some commercial systems are available.
RAID 4 tries to solve the primary problem that occurs with RAID 2 and RAID 3, the need for sequential disk I/Os and the need for large block sizes on the drives. With RAID 4, larger blocks of data (such as a sector) are written to the first drive, the next to the second, and so on, instead of bits as with RAID 2 and RAID 3. A single error correction drive is involved. In case of a data disk error, missing data can be reconstructed from the error correction codes.
Because of the use of larger disk blocks, small file reads and writes can be performed simultaneously with RAID 4, spread over the entire disk array. Optimization algorithms can adjust the layout of data across the drives so that frequently read files are spread out for fastest retrieval. As with RAID 2 and RAID 3, RAID 4 is not supported by Windows NT.
RAID 5 addresses a significant flaw apparent with RAID 4. Since RAID 4 uses a dedicated error correction drive, write limitations are inherent as codes are written sequentially to the single error correction drive. Although this is not a problem for most operating systems, for those involved in high data throughput (such as transaction processing) this single error-correction write process can become a bottleneck. RAID 5 addresses this problem by removing the dedicated error correction drive and spreading the error correction data across all the hard drives in the array. Each drive in the array then holds data and error correction information. RAID 5 is often called striping with parity, since the data is striped as with RAID 0 and a parity code is also written to allow recovery of corrupted data.
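How the error correction data gets spread around can be pictured with a short Python sketch. The rotation rule used here (parity starts on the last drive and moves one drive to the left with each stripe) is just one possible scheme picked for illustration; real controllers and Windows NT may order things differently, and the function name is invented.

```python
def raid5_layout(num_stripes: int, num_drives: int) -> list[list[str]]:
    """Return one row per stripe showing what each drive holds.

    "P" marks the parity block; "D0", "D1", ... are data blocks in order.
    """
    rows = []
    next_data_block = 0
    for stripe in range(num_stripes):
        # Assumed rotation: last drive first, then one drive to the left per stripe.
        parity_drive = (num_drives - 1 - stripe) % num_drives
        row = []
        for drive in range(num_drives):
            if drive == parity_drive:
                row.append("P")
            else:
                row.append(f"D{next_data_block}")
                next_data_block += 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in raid5_layout(num_stripes=4, num_drives=4):
        print("  ".join(f"{cell:>3}" for cell in row))
```

Printing the layout for a few stripes shows the parity block landing on a different drive each time, so no single drive has to absorb every parity write.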
On heavily loaded systems, RAID 5 doesn’t improve read times over RAID 4 but write times are better because they can be performed in parallel. Tests of heavily loaded UNIX workstations running RAID 5 show write times halved by this technique over an equivalent RAID 4 subsystem. In theory, RAID 5 allows all drives to read and write in parallel, so as the number of drives in the array increases, transfer times drop in inverse proportion (so if four drives are used, read and write operations take a quarter of the time they would with one drive). The primary problem with RAID 5 is that when a drive encounters an error or fails, the recovery algorithm required to rebuild lost data can place a significant load on the operating system.
RAID 5 is supported by Windows NT 4 and most RAID vendors as it is a good compromise between data integrity, speed, and cost. RAID 5 offers better performance than RAID 1 (mirroring). RAID 5 usually requires at least three drives, with more drives preferable. The overhead RAID 5 imposes on RAM can be significant, too, so Microsoft recommends at least 16MB RAM when RAID 5 is used. As with RAID 1, though, drives of disparate capacities may result in a lot of unused disk space, as most RAID 5 systems use the smallest drive capacity in the array for all the RAID 5 drives. Extra disk space can be used for un-striped partitions, but these are not protected by the RAID system.
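The disk space trade-offs for the three levels Windows NT 4 supports can be summed up in a short Python sketch. It rests on assumptions: RAID 0 is taken to use the full capacity of every drive, as the discussion of RAID 0 suggests; RAID 1 is modelled as a single two-drive mirror; and RAID 5 is taken to give up one drive’s worth of space to parity, with every drive limited to the size of the smallest. The function name is invented for the example.

```python
def usable_capacity_mb(level: int, drive_sizes_mb: list[int]) -> int:
    """Estimate the usable space of an array, in megabytes, for RAID 0, 1, or 5."""
    count = len(drive_sizes_mb)
    smallest = min(drive_sizes_mb)
    if level == 0:
        # Assumption: striping uses all the space on every drive.
        return sum(drive_sizes_mb)
    if level == 1:
        if count != 2:
            raise ValueError("this sketch models a single two-drive mirror")
        # The mirror can only be as large as the smaller drive.
        return smallest
    if level == 5:
        if count < 3:
            raise ValueError("RAID 5 needs at least three drives")
        # One drive's worth of space goes to parity, spread over the array.
        return smallest * (count - 1)
    raise ValueError("only RAID levels 0, 1, and 5 are modelled here")


if __name__ == "__main__":
    print("RAID 0, three 500MB drives:", usable_capacity_mb(0, [500, 500, 500]))
    print("RAID 1, 1.5GB plus 2GB:", usable_capacity_mb(1, [1500, 2000]))
    print("RAID 5, three 2GB drives:", usable_capacity_mb(5, [2000, 2000, 2000]))
```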
As mentioned earlier, there are a few new RAID levels and systems that have appeared over the years, although none have gained a significant following. Support for Windows NT 4 varies. RAID 10 was introduced by RAID vendor ECCS, offering mirrored pairs of drives with block striping between the drives. It is essentially a combination of RAID 1 and RAID 0 with a few bells and whistles. RAID 10 offers very good scalability, with the ability to add new drives as needed, as well as good data redundancy. RAID 10 subsystems tend to be expensive, usually enough to price them out of the realistic range for Windows NT servers.
Another highly-touted system, RAID 53, was introduced by RAID vendor Hi-DATA. RAID 53 is intended to combine the reliability of RAID 3 with the lower cost of RAID 5. Hi-DATA’s system uses a RAID 3 array for each drive in a RAID 5 model. As with ECCS’ RAID 10, the cost of a RAID 53 subsystem becomes higher than is reasonable for an average server.
RAID enjoyed a surge in popularity in the late eighties as desktop UNIX systems became attractive in terms of performance and reliability. Many companies saw a chance to dump dedicated minicomputers and use a PC with an Intel UNIX such as SCO UNIX. The ability to use RAID subsystems on these servers made data integrity and reliability just as good as dedicated minicomputer drive arrays, for a lot less money. Several companies sprang up to offer RAID products to this growing desktop PC market, and while many have faded away, the strongest have continued to refine their Intel platform products, including ports to Windows NT.
When they first appeared for desktop PC machines, RAID systems were composed of ESDI and SCSI drives. ESDI, of course, virtually disappeared within a couple of years, but SCSI has remained popular and is still the best choice in terms of support and performance for RAID. IDE and EIDE RAID subsystems are also appearing now, although the EIDE limit of four drives (only two with IDE) restricts their applicability to mirroring (RAID 1) and, to a lesser extent, RAID 5. According to the RAID vendors, SCSI subsystems represent over ninety-five percent of the RAID market. SCSI’s high throughput and features like hot-swappable drives (where a drive can be removed without powering down the system) make it very attractive compared with all other drive formats. Also, the development of Ultra SCSI with throughputs of up to 40MB per second has made SCSI subsystems so fast that the host server processors have almost become the limiting factor.
Implementing RAID on your Windows NT system
Windows NT 4 supports only RAID 0, 1, and 5 natively, and while there are a few others such as RAID 3 available, we’ll focus on the three levels Microsoft deems most important. In general, RAID can be implemented either in software or in hardware (apart from the disk array, of course). The specific RAID version you want to employ determines which method is best, but most operating systems rely primarily on software. Windows NT is no exception. RAID 5, for example, is usually implemented only in software, although a multi-channel SCSI hard disk controller can be used to increase performance.
RAID 1 is almost always software-based, and since the drivers are so small they impose virtually no drain on the system. Hardware-based implementation of RAID 1 usually requires dual drive adapters, which can actually slow the system down more than the RAID 1 software implementation does (primarily because of dual DMA calls).
Is there a performance issue that you must deal with in deciding which RAID level to employ on your system? To develop a concrete answer, we constructed a test suite based on a Windows NT 4 server installed on a 150MHz Pentium Pro with 128MB RAM, running Lotus Notes to provide a variable load of read and write operations. We wanted to use the same controller card throughout the tests to eliminate SCSI controller card performance variations, so we chose the DPT SmartRAID IV Ultra controller. The SmartRAID IV Ultra is a PCI SCSI controller with three channels (we only used two) and 4MB cache RAM. The DPT system supports RAID levels 0, 1, and 5 and includes software add-ons for Windows NT 4. All hard drives used in the test were off-the-shelf 2GB Ultra SCSI units from the same manufacturer.
We started with a single 2GB hard drive and measured performance times with light, medium, and heavy loads. We then set up RAID 1 (disk mirroring) with a second 2GB drive, first with both drives on the same controller channel and then with each drive on a separate channel. This was followed by a RAID 0 test with four drives on one controller channel. Finally, we installed RAID 5 with three 2GB drives on two different Ultra SCSI channels. The performance numbers were normalized to the non-RAID values and are shown in Table 1. Since significant differences between read and write operation times were recorded, both numbers are reported. All values have been rounded for clarity.
As you can see, all three RAID implementations showed a noticeable improvement in performance for read operations, but write performance differed noticeably. Also bear in mind that some of the performance improvements are due to the 4MB cache on our DPT controller card. RAID 0 performance increases as drives are added to the system, and a quick test with only two drives showed our performance numbers dropped to almost the non-RAID numbers (we tested RAID 0 with four drives). RAID 5 read operations are faster than those of either of the other RAID levels (due to the higher number of drives) but write operations are not as fast as RAID 0. RAID 1 read operations are fast, as you would expect from parallel reads, but writes are not improved noticeably (the slight performance increase in the table is probably due to the controller cache).
Fault tolerance is the big separator when it comes to choosing a RAID level. With RAID 0 you have no redundancy: a drive failure means loss of data. RAID 1 allows automatic switch-over to a mirror unless both drives fail at the same time (unlikely except for surge problems). In general, though, no data is lost with RAID 1. RAID 5 is similar, in that a drive failure means no lost data, except when two drives fail. When a RAID 5 drive fails, though, performance is degraded due to the algorithmic reconstruction of lost data.
RAID 5 is the RAID version most often recommended, especially for business systems where data availability is critical. It is the most successful of the commonly implemented RAID versions, and with large disk subsystems RAID 5 approaches both the MTDA and MTDL goals mentioned at the beginning of this article.
To implement one of the RAID levels you need to decide whether you will purchase a complete RAID subsystem, or build your own from controllers and hard disks. The former is the easiest, but usually more expensive. Several vendors offer RAID subsystems that require only the installation of a card in your PC, some installation software, then a cable run from the controller card to the disk subsystem to provide the entire RAID setup for you. The main vendors of these plug-and-go RAID subsystems which have been certified by Microsoft are Micropolis (Radion subsystems), DEC and Hitachi (both of whom offer several subsystems), Storage Solutions (the STAC-a-ray subsystem), and Compaq (their Ragingsrv RAID system).
For high-performance, high-reliability servers, the plug-and-go subsystems are often the best choice. A good example is Boxhill System’s Fibre Box, a distributed RAID subsystem for Windows NT that allows all disk subsystem processing to take place away from the server’s motherboard. As the name suggests, the subsystem uses fibre optics to provide data transfer rates up to a whopping 200Mbps. With disk capacities of over 72GB per box, and a maximum size over a terabyte, this type of subsystem is for the serious server.
Alternatively, you can rely on Windows NT to do most of the work with a standard SCSI card (better is a SCSI card with RAID support provided on-card) and a number of drives. To install RAID on your machine, all you need is the drives, a controller that is supported by Windows NT, and the patience to set it up correctly. This is, luckily, quite easy. The choice of the SCSI controller card is often personal, and there are many available that support Windows NT RAID, including cards from Adaptec, BusLogic, DPT, DEC, Future Domain, Mylex, NCR, Qlogic, Trantor, and UltraStor.
Some of the controller cards, such as the DPT SmartRAID IV used in our performance tests, are intended specifically for RAID and provide add-on software to enhance the RAID subsystem’s performance. As an example, DPT includes Storage Manager as a GUI-based disk array control package, which makes setting up and using the RAID system almost trivial. The software automatically inventories all devices attached to the controller, provides the best options for a RAID subsystem, and lets you monitor performance at any time. Total time for installation on our test server, from opening the box to completing the Windows NT software installation, was just under one hour, most of which was spent reading the Windows NT CD-ROM.
Is RAID worth the effort? If you value your data, without a doubt. A simple two-drive RAID 1 implementation is inexpensive compared to data loss, involving only the purchase of a second hard drive. The entire package can be added to existing systems for a few hundred dollars, depending on the drive. Since the software for mirroring is part of the Windows NT Disk Administrator package, there’s nothing special about installation and configuration. RAID 0 and RAID 5 are more expensive to implement, but do offer faster throughput and, in the case of RAID 5, more security. A four-disk RAID 5 subsystem can be assembled for a couple of thousand dollars, for example. The magic of RAID 1 and RAID 5 is usually ignored until the first drive failure, and then when you see RAID do its magic and recover gracefully, you’ll wonder why you didn’t buy RAID earlier.