RAID

by Parm Mann on 14 June 2008, 00:00

RAID 4, 5 and 6

RAID 4: Block-level Striping with Dedicated Parity Drive

A commonly used implementation of RAID before the advent of RAID 5, RAID 4 provides block-level striping (like RAID 0) with a dedicated parity disk. If a disk fails, the parity data is combined with the data on the surviving disks to rebuild its contents on a replacement disk.

Parity provides redundancy without the overhead costs of a mirrored array, where RAID 1 gives up 50% of the total capacity. The principle of parity is to take “X” pieces of data and use them to compute one extra piece of data; these “X+1” pieces are then stored on “X+1” drives. If any one of these pieces is lost, it can be restored from the pieces that remain. With parity, a 4-disk array has the effective space of 3 disks, whereas with mirroring you would only get the equivalent of 2 disks.
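To make the “X+1” idea concrete, here is a minimal sketch in Python. It assumes the parity piece is the byte-wise XOR of the data pieces, which is the usual choice for single-parity RAID levels; the helper name `xor_blocks` and the sample blocks are purely illustrative.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data pieces ("X" = 3), one per data drive (illustrative contents).
data = [b"AAAA", b"BBBB", b"CCCC"]

# The extra piece, stored on a fourth drive.
parity = xor_blocks(data)

# Simulate losing the second drive: XOR the survivors with the parity piece.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

assert rebuilt == data[1]
```

XOR works here because XOR-ing a value in twice cancels it out, so combining the parity with every surviving piece leaves only the missing one.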

Parity protection is layered on top of a striped (RAID 0 style) array, and the “X” pieces are normally the blocks or bytes distributed across the array. The parity data can either sit on a dedicated parity drive, as with RAID 4, or be spread amongst the drives, as in RAID 5.

Parity has an obvious advantage over mirroring in overhead costs: mirroring has a 50% overhead for its redundancy, whereas parity has an overhead of 100/D percent, where D is the number of drives in the array (25% for a 4-drive array). As parity is used with a striped array, the performance benefits of striping also apply.
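The overhead comparison is simple arithmetic, sketched below in Python. The function names are illustrative, and the sketch assumes identical drives with a single drive's worth of parity.

```python
def parity_overhead_percent(drives: int) -> float:
    """Share of raw capacity spent on parity: 100/D percent."""
    if drives < 2:
        raise ValueError("a parity array needs at least 2 drives")
    return 100 / drives


def usable_drives_parity(drives: int) -> int:
    """Drives' worth of usable space with one drive's worth of parity."""
    return drives - 1


def usable_drives_mirror(drives: int) -> int:
    """Drives' worth of usable space when every drive is mirrored once."""
    return drives // 2


for d in (3, 4, 8):
    print(d, parity_overhead_percent(d), usable_drives_parity(d), usable_drives_mirror(d))
```

For the 4-drive example used earlier, this gives a 25% overhead and 3 usable drives with parity, against 2 usable drives with mirroring. The parity overhead shrinks as drives are added, while the mirroring overhead stays at 50%.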

The major disadvantage of parity is the millions of calculations that have to be performed every second. The additional processing power required makes a hardware controller necessary for high performance; software RAID with striping and parity uses a large amount of CPU power and slows the system as a whole. Similarly, should a drive fail, the missing data has to be reconstructed, again requiring millions of calculations, which takes time. A mirrored array, by contrast, is quick and simple to recover, particularly if hot-pluggable drives are used. The use of a single parity drive can also create a bottleneck, particularly for writes, since every write to the array must also update the parity drive, slowing overall performance.

RAID 5: Block interleaved Distributed Parity

RAID 5 is perhaps the most popular form of RAID on the market (more so for businesses than end users). It uses a striped system with the dedicated parity drive replaced by a distributed parity algorithm, which spreads the parity data across all the drives in the array. This removes the bottleneck of a single parity drive and allows increases in system performance, although the performance cost of the parity operations themselves is still present. Fault tolerance is maintained by keeping the parity for a given set of data blocks on a different drive from the drives on which that data is stored.
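One way to picture distributed parity is a layout in which the parity block moves to a different drive on each successive stripe. The sketch below assumes a simple rotation starting from the last drive; real controllers use several different rotation schemes, and the function names and the "D0"/"P" labels are illustrative only.

```python
def parity_drive(stripe: int, drives: int) -> int:
    """Index of the drive holding parity for a stripe (assumed rotation)."""
    return (drives - 1 - stripe) % drives


def stripe_layout(stripe: int, drives: int) -> list:
    """Label each drive's block in a stripe as data (D0, D1, ...) or parity (P)."""
    p = parity_drive(stripe, drives)
    layout = []
    block = 0
    for drive in range(drives):
        if drive == p:
            layout.append("P")
        else:
            layout.append(f"D{block}")
            block += 1
    return layout


# Print the first four stripes of a 4-drive array, one row per stripe.
for s in range(4):
    print(s, stripe_layout(s, 4))
```

Over any run of four consecutive stripes in this 4-drive example, each drive holds parity exactly once, so parity updates are shared evenly instead of all landing on one disk as in RAID 4.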

Due to the additional complexity of RAID 5, a fairly high-end system and RAID controller are required. A minimum of 3 drives is required, preferably identical. Without a dedicated controller card and a well-specified system, RAID 5 can severely slow down a system due to the number of calculations required for distributed parity.

RAID 6: Independent Data Disks with Double Parity

Based on RAID 5, this variant is designed solely to improve data redundancy: two independent sets of parity data are computed, and they are written to two separate disks. Where most RAID types can tolerate the loss of a single drive from the array, RAID 6 is able to recover from the loss of two drives. Some performance is lost relative to RAID 5 due to the additional calculations, although random read times may be slightly improved as data is spread over an additional disk.
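The following Python sketch shows how two parity values can recover two lost data blocks. It assumes one common construction, in which the first parity (P) is a plain XOR and the second (Q) is a weighted sum in the finite field GF(2^8) with reducing polynomial 0x11D and generator 2; the article does not specify a scheme, controllers may differ, and all function names are illustrative.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reducing by the polynomial 0x11D."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(base: int, exp: int) -> int:
    """Raise a field element to a non-negative integer power."""
    result = 1
    for _ in range(exp):
        result = gf_mul(result, base)
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse: a^254, since a^255 == 1 for nonzero a."""
    return gf_pow(a, 254)


def pq_parity(blocks):
    """Return (P, Q): P is the XOR of all blocks, Q weights block i by 2^i."""
    length = len(blocks[0])
    p, q = bytearray(length), bytearray(length)
    for i, block in enumerate(blocks):
        coeff = gf_pow(2, i)
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def recover_two(blocks, x, y, p, q):
    """Rebuild data blocks x and y (given as None in blocks) from P and Q."""
    length = len(p)
    # Treat the missing blocks as zeros to get the survivors' contribution.
    survivors = [b if b is not None else bytes(length) for b in blocks]
    p_part, q_part = pq_parity(survivors)
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    denom_inv = gf_inv(gx ^ gy)
    dx, dy = bytearray(length), bytearray(length)
    for j in range(length):
        p_xy = p[j] ^ p_part[j]   # equals dx ^ dy
        q_xy = q[j] ^ q_part[j]   # equals gx*dx ^ gy*dy
        dx[j] = gf_mul(denom_inv, q_xy ^ gf_mul(gy, p_xy))
        dy[j] = p_xy ^ dx[j]
    return bytes(dx), bytes(dy)


data = [b"RAID", b"six!", b"demo"]  # illustrative data blocks
p, q = pq_parity(data)
lost = [None, data[1], None]        # first and third data drives fail
assert recover_two(lost, 0, 2, p, q) == (data[0], data[2])
```

With only P there is one equation per byte, so only one unknown block can be solved for. Q supplies a second, independent equation, which is what allows two failures to be survived, at the cost of the extra field arithmetic.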

A minimum of 4 drives is required for this type of array, as is a specialized (read: expensive) dedicated controller card. For this reason RAID 6 is infrequently used: the chances of 2 disks failing simultaneously are slim, except in situations where the entire array fails, in which case no amount of error correction will help.