RAID Systems And Data Recovery

Hard drives are reliable, but they do fail occasionally. To make a storage system more reliable, we can use multiple disks and spread the data across them. This is the principle of RAID, a Redundant Array of Inexpensive Disks (today more often expanded as Redundant Array of Independent Disks). The array behaves as one large disk and offers several advantages over a single hard drive. Except at RAID level 0, the usable capacity of a RAID system is less than the sum of the capacities of its drives, because some of the space holds redundant data spread over the drives. This redundant data prevents data loss when one of the drives fails, which makes a RAID system very reliable. But reliability is not the only advantage.

Higher Data Security

With the exception of RAID level 0, all RAID levels provide additional protection for the data. This protection allows any single drive in the system to fail without losing any data. When a single drive fails (at some RAID levels even more drives can fail), the RAID system continues to operate and users do not notice any change. The system administrator is of course notified that the system is degraded and should replace the failed drive as soon as possible.

Fault Tolerance

Because of the redundancy, the chance that the whole system fails is lower than with a single hard drive. The additional hardware components do increase the chance that some component will fail, but the RAID system as a whole is more robust because it keeps working when that happens.
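The trade-off described here can be put into rough numbers. The sketch below is only an illustration: the failure probability `p` is an assumed value, drive failures are assumed to be independent, and rebuild time is ignored, so real arrays will behave differently.

```python
def p_any_drive_fails(p: float, n: int) -> float:
    """Probability that at least one of n independent drives fails."""
    return 1 - (1 - p) ** n


def p_mirror_loses_data(p: float) -> float:
    """Probability that both drives of a two-disk mirror fail."""
    return p * p


# Assumed failure probability of one drive over some period (illustrative only).
p = 0.03

print(f"single drive loses data:     {p:.4f}")
print(f"some drive out of 4 fails:   {p_any_drive_fails(p, 4):.4f}")
print(f"two-disk mirror loses data:  {p_mirror_loses_data(p):.4f}")
```

With more drives it becomes more likely that some drive fails, yet with redundancy the chance of actually losing data drops well below that of a single drive.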

Improved Availability

By providing fault tolerance and additional recovery features, a RAID system can provide very high availability.

Increased Integrated Capacity

RAID systems make it easy to build large storage by combining smaller disks into one big system. Even though part of the total capacity is given up for redundancy, the overall gain is large.
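How much capacity goes to redundancy depends on the RAID level (the levels are described further down). As a rough sketch, assuming all disks have the same size, usable capacity can be estimated as follows. The function name `usable_capacity` and its parameters are made up for this example, and RAID 2 is left out because its overhead depends on the Hamming code used.

```python
def usable_capacity(level: int, disks: int, disk_size_tb: float) -> float:
    """Estimate usable capacity in TB for an array of identical disks."""
    if level == 0:
        # Striping only: nothing is reserved for redundancy.
        minimum, redundant = 2, 0
    elif level == 1:
        # Mirroring: every disk holds the same data, so only one disk counts.
        minimum, redundant = 2, disks - 1
    elif level in (3, 4, 5):
        # Single parity: the equivalent of one disk holds parity.
        minimum, redundant = 3, 1
    elif level == 6:
        # Double parity: the equivalent of two disks holds parity.
        minimum, redundant = 4, 2
    else:
        raise ValueError(f"RAID level {level} is not covered by this sketch")
    if disks < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} disks")
    return (disks - redundant) * disk_size_tb


for level in (0, 1, 5, 6):
    print(f"RAID {level}, 4 x 2 TB: {usable_capacity(level, 4, 2.0)} TB usable")
```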

Improved Performance

Because the data is stored on many drives, the RAID controller can write parts of a file to several drives simultaneously, which increases performance. The same applies to reading.

Of course, RAID systems also have disadvantages. They cost more than a single drive: you need several hard drives and, for hardware RAID, a special RAID controller. But you have to look at this as a system. You pay more, but you get security, fault tolerance, availability and performance. And because hard drives are cheap and RAID controllers are built into many modern motherboards, RAID systems are becoming common in home servers as well.

There are two approaches to RAID: hardware-based and software-based. A hardware-based RAID system uses a dedicated RAID controller, which does all the hard work of distributing the data over the drives and managing the array. The software-based approach needs no additional hardware, only the individual hard drives. The RAID software runs under the operating system and takes care of writing to and reading from the individual drives. The advantage of this approach is lower cost, at the price of additional load on the main processor.

There are several types, or levels, of RAID systems. Each level emphasizes one or a few of these advantages.

RAID level 0

This is only a collection of disks without any redundancy. The main purpose of this level is striping: data is broken into stripes which are spread over all disks. This RAID level improves performance but offers no data protection; if any one disk fails, the data on the whole array is lost.
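Striping can be pictured as dealing the data out to the disks like cards. The following is a minimal sketch, not how any particular controller works: the "disks" are plain Python lists, and the names `stripe`, `unstripe` and `stripe_size` are invented for the example.

```python
from typing import List


def stripe(data: bytes, disks: int, stripe_size: int) -> List[List[bytes]]:
    """Cut data into stripes and deal them round-robin across the disks."""
    layout: List[List[bytes]] = [[] for _ in range(disks)]
    for offset in range(0, len(data), stripe_size):
        index = offset // stripe_size
        layout[index % disks].append(data[offset:offset + stripe_size])
    return layout


def unstripe(layout: List[List[bytes]]) -> bytes:
    """Read the stripes back in the same round-robin order."""
    pieces = []
    depth = max(len(disk) for disk in layout)
    for row in range(depth):
        for disk in layout:
            if row < len(disk):
                pieces.append(disk[row])
    return b"".join(pieces)


layout = stripe(b"ABCDEFGHIJ", disks=3, stripe_size=2)
print(layout)
print(unstripe(layout))
```

Each disk ends up holding only every third stripe, which is why losing a single disk makes the whole data set unreadable.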

RAID level 1

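RAID 1 is disk mirroring. You use two disks to store the same data. If one disk fails, you still have the other one. RAID 1 offers protection but no gain in write performance, since every write must go to both disks; reads can be faster on some implementations because either disk can serve them.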

RAID level 2

RAID 2 uses bit-level striping with Hamming code ECC. It is expensive and rarely used.

RAID level 3

RAID 3 uses byte-level striping with a dedicated parity disk. It differs from RAID 4 only in the size of the stripes sent to the various disks. Any single disk can fail without data loss.

RAID level 4

RAID 4 uses block-level striping with a dedicated parity disk. Because every write must also update the parity disk, that disk becomes a bottleneck, which hurts random write performance in particular.

RAID level 5

RAID 5 is very popular because it offers fault tolerance and improves performance. Data and parity are striped across three or more drives. A RAID 5 system can tolerate the loss of one drive. Hot spares are often used for additional reliability: as soon as one drive fails, the hot spare drive is automatically brought into the array and rebuilt from the data on the other drives.
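Single-parity levels such as RAID 3, 4 and 5 usually compute parity as the XOR of the data blocks, which is what makes the rebuild possible. The sketch below illustrates the idea with three small in-memory blocks; the helper `xor_blocks` and the sample data are invented for the example, and a real array also rotates parity across drives and works on much larger blocks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, as if stored on three different drives.
data_blocks = [b"RAID", b"five", b"demo"]

# The parity block would be stored on a fourth drive.
parity = xor_blocks(data_blocks)

# Simulate losing the second drive, then rebuild its block
# from the surviving data blocks plus the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data_blocks[1])
```

The same calculation cannot recover two missing blocks at once, which is why a single-parity array tolerates only one failed drive.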

RAID level 6

RAID 6 is similar to RAID 5 but can handle the failure of any two disks. Its write performance is slightly lower than that of RAID 5 because two sets of parity information have to be calculated.

RAID Data Recovery

When a “normal” hard drive fails, whether because of its electronics or its mechanics, the data is usually still present on the platters, and data recovery companies can recover it. It costs some money, but in general the data can “easily” be restored. RAID systems provide fault tolerance, but it is still possible that more than one drive fails. For example, RAID 5 with two failed drives is not operational. All the data is inaccessible and therefore lost, unless you find a RAID data recovery company which will examine the system and recover the data. Data recovery is not a trivial task, and RAID data recovery is further complicated because individual files are distributed over many disks. For data recovery experts, however, this is not a problem. Any RAID system can be recovered unless the damage is such that it prevents data recovery from the individual hard drives.
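The point at which an array stops being operational follows directly from the level descriptions above. As a small sketch, the table below records how many failed drives each level survives; the names `TOLERATED_FAILURES` and `is_operational` are invented for the example, RAID 1 is assumed to be a two-disk mirror, and RAID 2 is omitted because the article gives no figure for it.

```python
# Number of failed drives each level survives, per the level descriptions above.
TOLERATED_FAILURES = {0: 0, 1: 1, 3: 1, 4: 1, 5: 1, 6: 2}


def is_operational(level: int, failed_drives: int) -> bool:
    """Return True if the array can still serve data without outside help."""
    return failed_drives <= TOLERATED_FAILURES[level]


for level, failed in [(5, 1), (5, 2), (6, 2), (0, 1)]:
    state = "operational" if is_operational(level, failed) else "needs recovery"
    print(f"RAID {level} with {failed} failed drive(s): {state}")
```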

To avoid paying for data recovery services, treat a RAID system as a highly reliable hard drive that can still fail sometimes. Failure is very unlikely, but not impossible. You should not treat a RAID system as a backup or an archive. Any RAID system is only a highly reliable hard drive that provides fault tolerance, additional security and performance.
