Raid

RAID
Definition and Use of the Different RAID Levels
Operating Systems: Introduction, Chr. Vogt
Contents
The different RAID levels:
Definition
Cost / Efficiency
Reliability
Performance
Further High Availability Aspects
Performance Optimization Features

What is RAID?
RAID Levels
Evaluation Criteria for the RAID Levels
RAID 0 or Striping
RAID 1 or Mirroring
RAID 0+1 (1)
RAID 0+1 (2)
RAID 3 (1)
RAID 3 (2)
RAID 5 (1)
RAID 5 (2)
Adaptive RAID 3+5
RAID 6
Further Aspects for High Availability
Performance Optimization Features

What is RAID?
RAID (Redundant Array of Independent Disks)
was first defined by Patterson, Gibson, and Katz at the University of California, Berkeley, in 1987,
uses logical volumes consisting of several disks.
The RAID level defines how the disks are organized.
Depending on the RAID level, the reliability and/or the performance will be improved.
RAID systems can be implemented
in software
in hardware
RAID controllers
external RAID subsystems
RAID functionality offered by the storage arrays in a SAN (Storage Area Network)
RAID Levels
Originally, five RAID levels were defined:
RAID 1: Mirrored disks
RAID 2: Hamming code for error correction
RAID 3: Single parity disk, bit-interleaved parity
RAID 4: Single parity disk, block-level parity
RAID 5: Spread data/parity over all disks (block-level distributed parity)
Later, further RAID levels were defined:
RAID 0: Striped disks (no redundancy)
RAID 0+1: a combination of RAID 0 and RAID 1
RAID 6: dual redundancy using a Reed-Solomon code
Only RAID 0, RAID 1, RAID 0+1, and RAID 5 are commonly used today
(occasionally also RAID 3 and RAID 6).

Evaluation Criteria for the RAID Levels
The following aspects need to be considered when evaluating the different RAID levels:
Space efficiency / cost: How much additional disk space is required for redundant data?
Reliability: How many disks may fail without losing data?
Also: What reliability remains after a disk failure?
Performance:
during normal operation,
after a disk failure,
during the restoration of a failed disk.
In many cases
read and write operations have to be considered separately,
the performance also depends on the size of the I/O.
Restoration: What needs to be done in order to restore the data of a failed disk.
RAID 0 or Striping
For a RAID 0 array with n disks, data is divided into strips, which are written to the
n disks in a round-robin fashion.
Space efficiency / cost: No redundancy, 100% storage efficiency, no extra cost.
Reliability: No redundancy; the failure of a single disk causes data loss.
Performance: Very good, when the disks can work in parallel.
Depends on the stripe size, the I/O size, and on the number of
simultaneous requests.
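
The round-robin placement described above can be sketched as a simple address calculation. This is a minimal illustration, assuming that strips are numbered from 0 and that all disks are the same size; the function name locate_strip is hypothetical and not taken from any RAID implementation.

```python
def locate_strip(logical_strip: int, n_disks: int) -> tuple[int, int]:
    """Map a logical strip number to (disk index, strip index on that disk)."""
    disk = logical_strip % n_disks        # round-robin over the disks
    offset = logical_strip // n_disks     # position of the strip on that disk
    return disk, offset


# Placement of the first eight strips on a 4-disk RAID 0 array.
for strip in range(8):
    disk, offset = locate_strip(strip, 4)
    print(f"strip {strip} -> disk {disk}, offset {offset}")
```

Consecutive strips land on different disks, which is why large or concurrent requests can keep several disks busy at once.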

RAID 1 or Mirroring
The same data is written to two or more disks.
Space efficiency / cost: 1/n for an n-fold mirror / n times as many disks are needed
Reliability: n-1 disks can fail without losing data
Performance: Read performance is improved: read from any disk (in parallel)
Write performance is unchanged: parallel writes to all disks
Restoration: Copy the data from a remaining disk in the mirror set
RAID 0+1 (1)
RAID 0+1 is sometimes also called RAID 10. It can be built in two different ways:
as mirrored stripesets or as striped mirrorsets.
Space efficiency / cost: 1/n when using n-fold mirrors / n times as many disks are needed
Performance: As with RAID 0

RAID 0+1 (2)
Reliability:
Mirrored Stripesets: Any n-1 disks can fail without losing data (n-fold mirroring).
A failing disk makes the whole stripe set inaccessible.
Striped Mirrorsets: Any n-1 disks can fail without losing data (n-fold mirroring).
n-1 disks can fail per mirror set in the stripe set.
Restoration (of a single failed disk):
Mirrored Stripesets: A whole stripeset needs to be copied from one of the remaining
copies in the mirrorset.
Striped Mirrorsets: The data on the failed disk needs to be copied from one of the
remaining disks in the mirrorset.
RAID 0+1
should always be built as striped mirrorsets,
is the best possible solution when cost is not the main concern.
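
The recommendation for striped mirrorsets can be checked by enumerating failure combinations. The sketch below assumes six disks with 2-fold mirroring and counts which double failures leave the data accessible under the two rules stated above; the disk numbering and function names are illustrative assumptions.

```python
from itertools import combinations


def survives_mirrored_stripesets(failed, stripesets):
    # A failing disk makes its whole stripe set inaccessible, so the data
    # survives only if at least one stripe set has no failed disk.
    return any(not (set(s) & failed) for s in stripesets)


def survives_striped_mirrorsets(failed, mirrorsets):
    # The data survives if every mirror set keeps at least one working disk.
    return all(set(m) - failed for m in mirrorsets)


def count_survivable(n_disks, k, survives, groups):
    """Count the k-disk failure combinations that do not lose data."""
    return sum(survives(set(f), groups) for f in combinations(range(n_disks), k))


stripesets = [(0, 1, 2), (3, 4, 5)]      # two 3-disk stripe sets, mirrored
mirrorsets = [(0, 3), (1, 4), (2, 5)]    # three 2-disk mirror sets, striped

ms = count_survivable(6, 2, survives_mirrored_stripesets, stripesets)
sm = count_survivable(6, 2, survives_striped_mirrorsets, mirrorsets)
print(f"double failures survived: mirrored stripesets {ms}/15, striped mirrorsets {sm}/15")
```

Both layouts survive every single failure, but under these assumptions the striped mirrorsets survive more of the possible double failures.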
RAID 3 (1)
For a RAID 3 array with n disks, data
is divided into n-1 strips (typically small ones),
parity information (bitwise XOR) is calculated over the n-1 strips and stored on the n-th disk.
Space efficiency / cost: (n-1)/n when using n disks / 1 additional parity disk is needed
Reliability: 1 disk may fail without losing data
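
The bitwise XOR parity, and why one missing strip can be recovered from it, can be illustrated with a minimal sketch. The strip contents are made-up example bytes and the helper name xor_strips is hypothetical.

```python
from functools import reduce


def xor_strips(strips):
    """Bitwise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


# Three data strips (n-1 = 3) and the parity stored on the n-th disk.
data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_strips(data)

# Assume the disk holding data[1] fails: XOR of the remaining data strips
# and the parity yields the missing strip.
rebuilt = xor_strips([data[0], data[2], parity])
print(rebuilt == data[1])
```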

RAID 3 (2)
Performance:
Normal operation:
Good read performance (disks are accessed in parallel).
Good write performance for large I/O requests (covering all n-1 data strips of a stripe), because all disks are
accessed in parallel.
Very poor performance for small writes that access only one or a few disks:
read the old data from the disks, which will not be overwritten,
calculate the new parity,
write the new data and the new parity.
After a disk failure: Performance decreases, because the missing data must be
reconstructed from the remaining data and the parity information.
In either situation, the single parity disk can become a bottleneck.
Restoration: In order to restore the contents of a failed disk, all remaining disks,
including the parity disk, must be read. Hence, there will be a heavy
I/O load on the disks during the restoration.
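
The three small-write steps listed above for RAID 3 can be sketched as follows. This is an in-memory illustration only, assuming one stripe held as a list of byte strings; small_write and xor_strips are hypothetical helper names.

```python
def xor_strips(strips):
    """Bitwise XOR of equally sized byte strings."""
    out = bytearray(len(strips[0]))
    for strip in strips:
        for i, byte in enumerate(strip):
            out[i] ^= byte
    return bytes(out)


def small_write(data_strips, index, new_strip):
    """Overwrite one strip of a stripe; return (updated strips, new parity)."""
    # 1. Read the old data from the disks that will not be overwritten.
    untouched = [s for i, s in enumerate(data_strips) if i != index]
    # 2. Calculate the new parity from the untouched data and the new data.
    new_parity = xor_strips(untouched + [new_strip])
    # 3. Write the new data and the new parity.
    updated = list(data_strips)
    updated[index] = new_strip
    return updated, new_parity


stripe = [b"\x01\x02", b"\x10\x20", b"\x0f\x0f"]
updated, parity = small_write(stripe, 1, b"\xff\x00")
print(parity == xor_strips(updated))
```

Even though only one strip changes, every other data disk has to be read, which is where the poor small-write performance comes from.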
RAID 5 (1)
For a RAID 5 array with n disks, data
is divided into n-1 strips (typically large ones),
parity information (bitwise XOR) is calculated over the n-1 strips,
the parity information is distributed over the disks.
Space efficiency / cost: (n-1)/n when using n disks / 1 additional disk is needed
Reliability: 1 disk may fail without losing data
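
How the parity might be distributed over the disks can be sketched with a simple rotation. The slides do not specify a particular layout, so the rotation rule below (parity moves one disk to the left with each stripe) is an assumption chosen for illustration, and raid5_layout is a hypothetical name.

```python
def raid5_layout(stripe: int, n_disks: int) -> list[str]:
    """Return what each disk holds in the given stripe ('P' marks parity)."""
    # Assumed rotation: parity starts on the last disk and moves left.
    parity_disk = (n_disks - 1 - stripe) % n_disks
    row = []
    data_index = stripe * (n_disks - 1)   # number of the first data strip in this stripe
    for disk in range(n_disks):
        if disk == parity_disk:
            row.append("P")
        else:
            row.append(f"D{data_index}")
            data_index += 1
    return row


for stripe in range(4):
    print(raid5_layout(stripe, 4))
```

Over n consecutive stripes every disk holds the parity exactly once, so no single disk carries all parity writes.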

RAID 5 (2)
Performance:
Normal operation:
Good read performance (disks are accessed in parallel).
Because of the large stripe size, I/Os typically access only one or a few disks.
Very poor performance for small writes:
read the old data that is to be changed and the old parity information,
calculate the new parity (from the old data, the old parity, and the new data),
write the new data and the new parity.
After a disk failure: Performance decreases, because the missing data must be
reconstructed from the remaining data and the parity information.
Restoration: In order to restore the contents of a failed disk, all remaining disks
must be read (data and parity). Hence, there will be a heavy
I/O load on the disks during the restoration.
(Same as with RAID 3.)
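
The RAID 5 small-write steps listed above rely on the fact that the new parity can be derived from the old data, the old parity, and the new data alone. A minimal sketch with made-up example bytes (the helper names are hypothetical):

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equally sized byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def read_modify_write(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """New parity = old parity XOR old data XOR new data."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


strips = [b"\x01\x02", b"\x10\x20", b"\x0f\x0f"]
old_parity = xor_bytes(xor_bytes(strips[0], strips[1]), strips[2])

# Overwrite strip 1: only that strip and the parity are read and rewritten.
new_strip = b"\xff\x00"
new_parity = read_modify_write(strips[1], old_parity, new_strip)

# Cross-check against a parity recomputed over the whole updated stripe.
full_recompute = xor_bytes(xor_bytes(strips[0], new_strip), strips[2])
print(new_parity == full_recompute)
```

A single-strip update therefore costs two reads and two writes, regardless of how many disks are in the array.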
Adaptive RAID 3+5
Some vendors claim that their RAID systems (software or hardware) perform adaptive
RAID 3 / RAID 5 operations.
The placement of the parity information is fixed, and cannot be changed dynamically.
The adaptation lies in the choice of the write algorithm, depending on the I/O size:
perform large I/Os by writing all data and the parity in parallel,
perform small I/Os by using the algorithm described for RAID 5.
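
The choice between the two write algorithms can be sketched as a small decision function. The slides do not state where the boundary between "large" and "small" lies, so the rule used here (full-stripe write only when all n-1 data strips of a stripe are written) is an assumption, as are the name plan_write and the returned fields.

```python
def plan_write(strips_written: int, n_disks: int) -> dict:
    """Pick a write algorithm and count the disk operations it needs."""
    data_strips = n_disks - 1
    if not 1 <= strips_written <= data_strips:
        raise ValueError("a write must cover between 1 and n-1 strips of a stripe")
    if strips_written == data_strips:
        # Large I/O: the parity is computed from the new data alone,
        # then all data and the parity are written in parallel.
        return {"algorithm": "full-stripe", "reads": 0, "writes": n_disks}
    # Small I/O: read the old data and old parity, then write new data and parity.
    return {
        "algorithm": "read-modify-write",
        "reads": strips_written + 1,
        "writes": strips_written + 1,
    }


print(plan_write(4, 5))
print(plan_write(1, 5))
```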

RAID 6
For a RAID 6 array with n disks, data
is divided into n-2 strips,
two pieces of parity information are calculated over the n-2 strips (using a Reed-Solomon code),
the parity information is distributed over the disks.
Space efficiency / cost: (n-2)/n when using n disks / 2 additional disks are needed
Reliability: 2 disks may fail without losing data.
Performance and Restoration: Similar to RAID 5.
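
The space-efficiency and reliability figures given for the individual levels can be collected in one small function for comparison. This is only a restatement of the formulas from the slides; the function name raid_properties and the example disk counts are illustrative assumptions.

```python
from fractions import Fraction


def raid_properties(level: str, n: int) -> tuple[Fraction, int]:
    """Return (space efficiency, number of disk failures always tolerated).

    For RAID 1 and RAID 0+1, n is the mirroring factor (n-fold mirror);
    for RAID 3, 5 and 6, n is the number of disks in the array.
    """
    if level == "0":
        return Fraction(1), 0
    if level in ("1", "0+1"):
        return Fraction(1, n), n - 1
    if level in ("3", "5"):
        return Fraction(n - 1, n), 1
    if level == "6":
        return Fraction(n - 2, n), 2
    raise ValueError(f"unknown RAID level: {level}")


for level, n in [("0", 4), ("1", 2), ("0+1", 2), ("3", 5), ("5", 5), ("6", 6)]:
    efficiency, tolerated = raid_properties(level, n)
    print(f"RAID {level:<3} n={n}: efficiency {efficiency}, tolerates {tolerated} failure(s)")
```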
Further Aspects for High Availability
RAID sets protect against the loss of data in case of a disk failure.
In addition, measures need to be taken to
make the access to the data highly available,
improve the performance of I/O operations.
In order to make data access highly available, a RAID system should provide
redundant power supplies, fans, etc.
several busses for attaching the disks,
redundant controllers
with load balancing for the I/O requests,
with cache takeover in case of a controller failure,
hot-swappable components (disks, power supplies, etc.),
hot spare disks to reduce the time during which a disk is missing from a RAID set.

Performance Optimization Features
Caches, which can in particular improve the poor write performance of RAID 3 and RAID 5 sets.
To avoid data inconsistencies ("write hole"), caches must be protected with a battery backup.
Load balancing between redundant controllers.
In mirrorsets,
read from the disk with the optimal head position, and/or
load balance the read requests between the disks in the mirrorset.
Since I/O performance often depends on the size of the I/Os, a RAID system should
gather statistical information about the I/O sizes,
allow the stripe size (or: chunk size) to be adjusted for each RAID set individually.
Since restoring a failed disk causes a heavy additional I/O load, it should be possible to
decide whether to give preference to
the restore operation, in order to restore redundancy as quickly as possible,
the application I/Os, in order to affect the applications as little as possible.
