This document discusses different RAID levels including RAID 1, RAID 2, RAID 3, RAID 4, and RAID 5. RAID 1 uses disk mirroring to duplicate all data across two disks. RAID 2 uses bit-level striping with Hamming codes for error correction. RAID 3 uses byte-level striping with a dedicated parity disk. RAID 4 uses block-level striping with a dedicated parity disk. RAID 5 spreads data and parity across all disks rather than dedicating a disk to parity. RAID 5 provides improved write performance over RAID 4 and is commonly used today for its balance of performance, redundancy and cost effectiveness.
RAID 1 to 5: Understanding Disk Mirroring and Parity in RAID Levels
1. RAID 1: MIRRORED DISKS
• Mirroring
• It is also called “data mirroring”.
• Duplicate every disk.
2. RAID 1: MIRRORED DISKS
CHARACTERISTICS
• The automatic duplication of the data means there is
little likelihood of data loss.
• Data can be read from either disk but is written to
both disks.
• There is no overhead of storing parity information.
• Recovery from failure is simple. If one drive fails, we
just have to access the data from the second drive.
• Data is permanently lost only if the second disk fails
before the first failed disk is replaced.
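The mirroring behaviour described on this slide can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class name MirroredPair, its methods, and the in-memory lists standing in for disks are all hypothetical and are not taken from the slides.

```python
class MirroredPair:
    """Toy RAID 1 array: two in-memory 'disks' holding identical blocks."""

    def __init__(self, num_blocks):
        # Each disk is a list of blocks; a disk set to None has failed.
        self.disks = [[b""] * num_blocks, [b""] * num_blocks]

    def write(self, block_no, data):
        # Every write goes to all surviving disks.
        for disk in self.disks:
            if disk is not None:
                disk[block_no] = data

    def read(self, block_no):
        # A read can be served by any surviving disk.
        for disk in self.disks:
            if disk is not None:
                return disk[block_no]
        raise IOError("both disks failed: data is lost")

    def fail(self, disk_no):
        # Simulate a drive failure.
        self.disks[disk_no] = None


array = MirroredPair(num_blocks=4)
array.write(0, b"hello")
array.fail(0)              # the first disk fails
survivor = array.read(0)   # the block is still served by the second disk
```

Failing the second disk before the first is replaced makes read raise an error, which corresponds to the permanent data loss case in the last bullet.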
3. RAID 1: MIRRORED DISKS
ADVANTAGES
• It provides fault tolerance.
• Good read performance.
• Reasonable write performance.
• Simple to implement.
• Random read performance: better than a single
disk
• Sequential read performance: fair, same as a single
disk
• Sequential write performance: good
• Random write performance: same as a single disk
4. RAID 1: MIRRORED DISKS
DISADVANTAGES
• High cost
• High check disk overhead
5. RAID 2: Hamming code for ECC
• Uses bit-level striping
with Hamming codes
for ECC (error-correcting
code)
• Data striping is used
6. RAID 2: Hamming code for ECC
CHARACTERISTICS
• This uses bit-level striping, i.e. instead of striping blocks across
the disks, it stripes the bits across the disks.
• Two groups of disks are required:
1. One group of disks is used to write the data.
2. Another group is used to write the ECC.
• This uses Hamming ECC and stores this information on the
redundancy disks.
• When data is written to the disks, it calculates the ECC code for
the data, stripes the data bits across the data disks, and writes the
ECC code to the redundancy disks.
• When the data is read from the disks, it also reads the
corresponding ECC code from the redundancy disks and checks
whether the data is consistent.
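The write and read checks above can be illustrated with a small Hamming code. The sketch below assumes a Hamming(7,4) code, i.e. four data disks and three redundancy disks with one bit per disk; the slides do not state which Hamming code or how many disks are used, and the function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as the 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # checks codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # checks codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # checks codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_syndrome(codeword):
    """Return 0 if the codeword is consistent, otherwise the
    1-based position of the single flipped bit."""
    c = codeword
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    return s1 + 2 * s2 + 4 * s3


# Write path: compute the ECC bits and stripe one bit per (assumed) disk.
codeword = hamming74_encode([1, 0, 1, 1])

# Read path: a zero syndrome means data and ECC agree.
clean_syndrome = hamming74_syndrome(codeword)

# Simulate the disk holding position 5 returning a flipped bit.
corrupted = list(codeword)
corrupted[4] ^= 1
bad_position = hamming74_syndrome(corrupted)
```

A non-zero syndrome names the position (and so the disk) whose bit is wrong, which is why this scheme can correct a single-bit error rather than only detect it.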
7. RAID 2: Hamming code for ECC
Advantages
– Random read performance: fair
– Sequential read performance: very good
– Sequential write performance: good
Disadvantages
– Random write performance: poor
– Requires a complex controller
– High overhead for check disks
– Not used in modern systems
8. THIRD LEVEL RAID
• RAID 3 consists of byte-level
striping with a dedicated
parity disk.
• All disk arms are
synchronized. Speed is
limited by the slowest disk.
• One parity disk:
Parity bit value = XOR
across all data bit values
9. Continued...
• If one disk fails, recover the lost data:
– XOR across all good data bit values and parity bit value
• Fast read/write.
• During a write, RAID-3 stores a portion of each block on each
data disk. It also computes the parity for the data, and writes it
to the parity drive.
• When the data is read back in, the parity is also read, and
compared to a newly computed parity, to ensure that there
were no errors.
• It is not possible to have multiple operations being performed
on the array at the same time, due to the fact that all drives are
involved in every operation.
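The parity computation and the recovery rule on this slide can be sketched as follows. The three data disks, their byte values, and the helper name xor_blocks are assumptions made for illustration only.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data disks, each holding two bytes of a striped block.
data_disks = [b"\x0f\xa0", b"\x33\x0c", b"\x55\xff"]

# Write: the parity disk stores the XOR across all data disks.
parity = xor_blocks(data_disks)

# Disk 1 fails: XOR the surviving data disks with the parity disk.
survivors = [data_disks[0], data_disks[2], parity]
recovered = xor_blocks(survivors)
```

Here recovered equals the contents of the failed disk, because XORing a value into a running total twice cancels it out.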
10. Continued…
Advantages
– Random Read Performance is good
– Sequential Read performance is very good
– Sequential Write performance is fair to good.
– Lowest overhead for check disks.
Disadvantages
– Random Write performance is poor.
– Complex controller design.
– Can tolerate only one disk failure.
– Even when a single disk fails, performance is reduced
to that of RAID 0.
11. FOURTH LEVEL RAID
• Uses block-level striping
with dedicated parity.
• On an error, the same
block is read/written on all disks.
• With no error, a write also needs
to update (read, then write) the
parity block; there is no need to read
the other disks.
12. Continued…
• The new parity can be computed from the old data, the new data, and the old parity:
New parity = (old data XOR new data) XOR old parity.
• A one-block write will be used as an example.
1) A write request for one block is issued by a program.
2) The RAID software determines which disks contain the data and
parity, and which blocks they are in.
3) The disk controller reads the old data block from disk.
4) The disk controller reads the corresponding parity block from
disk.
5) The data block just read is XORed with the parity block just read.
6) The data block to be written is XORed with the result of step 5, giving the updated parity block.
7) The new data block and the updated parity block are both written to
disk.
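The one-block write can be sketched in Python to show that the shortcut formula gives the same parity as recomputing it from the whole stripe. The block contents and the function names xor and updated_parity are hypothetical; the XOR order follows the formula, not the step numbering, and gives the same result.

```python
def xor(a, b):
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_data, old_parity, new_data):
    """New parity = (old data XOR new data) XOR old parity."""
    return xor(xor(old_data, new_data), old_parity)


# A hypothetical stripe of three data blocks plus its parity block.
stripe = [b"\x01\x02", b"\x10\x20", b"\x0a\x0b"]
parity = xor(xor(stripe[0], stripe[1]), stripe[2])

# One-block write to block 1: read old data and old parity (two reads),
# then write new data and new parity (two writes).
new_block = b"\xff\x00"
new_parity = updated_parity(stripe[1], parity, new_block)
stripe[1] = new_block

# Cross-check against a full recomputation that reads every data block.
full_recompute = xor(xor(stripe[0], stripe[1]), stripe[2])
```

The two results match, and the shortcut never touches blocks 0 and 2, which is why a small write needs no reads from the other data disks.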
13. Continued…
• Thus, in the above example, a one-block write results in two
blocks being read from disk and two blocks being written to
disk. If the blocks to be read happen to be in a buffer in the
RAID controller, the number of blocks read from disk could drop
to one, or even zero, thus improving write
performance.
Advantages
– Random Read Performance is very good.
– Sequential Read and Write performance is good.
– Lowest overhead of check disks.
Disadvantages
– Quite complex controller design.
– Random write performance is poor.
– Not commonly used.
15. Why Level 5?
• Achieve parallelism in write operations.
• In Level 4, the check disk is the bottleneck.
16. RAID Level-5
• Block-interleaved Distributed parity
• Spreads data and parity among all N+1
disks, rather than storing data in N disks
and parity in 1 disk
• Optimized for multi-thread access
17. RAID Level-5
[Figure: Level-4 and Level-5 layouts side by side, each with disks 1 to 5 and sectors S0 to S5. Level-4 shows data disks plus a separate check disk; Level-5 shows data and check information on all disks.]
18. RAID Level 5
• Capacity overhead is small: same as in RAID 4
• Parity update traffic is distributed across
the disks
          Disk 0   Disk 1   Disk 2   Disk 3
Stripe 0  D0,0     D0,1     D0,2     P0
Stripe 1  D1,0     D1,1     P1       D1,3
Stripe 2  D2,0     P2       D2,2     D2,3
Stripe 3  P3       D3,1     D3,2     D3,3
P0 = D0,0 XOR D0,1 XOR D0,2
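The rotating parity placement in the layout on this slide can be generated with a short Python sketch. The rotation rule used here (the parity block moves one disk to the left on each stripe) is inferred from the slide's four-disk layout; real arrays may rotate differently, and the function name raid5_layout is hypothetical.

```python
def raid5_layout(num_disks, num_stripes):
    """Return a grid of block labels, one row per stripe, one column per disk."""
    layout = []
    for stripe in range(num_stripes):
        # Parity starts on the last disk and moves left by one each stripe.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append(f"P{stripe}")
            else:
                row.append(f"D{stripe},{disk}")
        layout.append(row)
    return layout


layout = raid5_layout(num_disks=4, num_stripes=4)
for row in layout:
    print("  ".join(f"{cell:<5}" for cell in row))
```

Because each stripe keeps its parity on a different disk, parity updates for writes to different stripes land on different disks instead of queueing on one check disk.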
19. RAID 5 Actions
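[Figure: four panels, each showing a stripe of three data blocks and one parity block (D D D P): fault-free read, fault-free write, degraded read, and degraded write. One panel numbers its steps 1 to 4.]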
20. RAID 5 - Write Operation
1. Read old data.
2. Write new data.
3. XOR old and new data to create the “partial product”.
4. Read old parity data.
5. XOR old parity with the partial product, writing
out the result as new parity.
[Figure: Drives 1 to 5 holding Chunk 0, Chunk 1, Chunk 2, Chunk 3, and Parity.]
21. Key points of RAID Level-5
• Level-5 stripes file data and check data over
all the disks
– no longer a single check disk
– no more write bottleneck
• Drastically improves the performance of
multiple writes
– they can now be done in parallel
• Slightly improves reads
– one more disk to use for reading
22. Advantages
• Best cost/performance for transaction-oriented
networks.
• Very high data protection.
• Supports multiple simultaneous reads and writes.
• Can also be optimized for large, sequential
requests.
• Also suits processing with limited storage capacity.
• Used in supercomputer applications and transaction
processing.
24. Discussion
Hardware or software solution?
Software would give the best performance as well as the least cost.
It is not even clear whether synchronizing the disks in a group improves
RAID performance.
Each level improves one or more of:
• Data rate: supercomputer applications (sequential data,
small number of requests per second)
• I/O rate: transaction processing (random data,
large number of read-modify-writes)
• Usable storage capacity
or possibly all three.
25. Comparison of all levels on the basis of
data rate & I/O rate
                  RAID1                    RAID2      RAID3         RAID4      RAID5
Random read       Better than single disk  fair       good          very good  very good
Random write      good                     poor       poor          poor       fair
Sequential read   fair                     very good  very good     good       good
Sequential write  Same as single disk      good       Fair to good  good       good
26. Continued…
Which level is best?
• The highest performance per disk comes from
either Level 1 or Level 5.
• For transaction processing that uses less than 50% of the
storage capacity, Level 1 is best.
• If more than 50% of the storage is used, or for
supercomputer applications, or for a combination of both,
Level 5 is best.
27. Why is RAID 5 used most?
• If a disk gets an error or starts to fail, its data is
recreated from the distributed data and parity
blocks on the remaining disks.
• It allows many NAS (network-attached storage)
and server drives to be "hot-swappable".
• It is a great solution for fault tolerance.
29. Conclusion
• This paper makes two separate points:
1) The advantages of building I/O systems from
personal computer disks.
2) The advantages of five different disk array
organizations.
30. Conclusion
• RAID offers a cost-effective option to meet the
challenge of exponential growth in processor
and memory speeds.
• The size reduction of personal computer disks is the
key to the success of disk arrays,
since the number of I/Os per second for an
inexpensive disk is within a factor of two of that of a large
disk.
• RAID offers an attractive alternative to SLED (single large expensive disk) in
terms of performance, reliability, and power
consumption.
31. • The major challenge is reliability.
• We must use extra disks containing redundant
information to recover from disk failures.
• The highest performance per disk comes from
Level 1 and Level 5.
• For transaction processing using less than 50%
of the storage capacity, use Level 1.
• For transaction processing using more than
50% of the storage capacity, or for
supercomputer applications, use Level 5.
32. • We have seen the different levels of RAID:
1) Mirrored disks: duplicating all disks can
double the cost.
2) Hamming code for ECC: uses ECC to
monitor the correctness of information on disk.
3) Single check disk per group: fewer check disks.
4) Independent reads/writes: allows operations in parallel.
5) No single check disk: distributes the data
and check information across the disks,
hence multiple writes per group.
33. Open issues for RAID
• What is the impact of RAID on latency?
• What is the impact of MTTF on individual
disks?
• What is the real lifetime of a RAID vs. the calculated
MTTF?
• How do synchronized disks affect Level 4 and Level 5
RAID performance?
• How does the slowdown S actually behave?
34. • How do defective sectors affect RAID?
• How should 100 to 1000 disks be constructed and
physically connected to the processor?
• What is the impact of cabling on cost,
performance, and reliability?
• Where should a RAID be connected to the CPU so as
not to limit performance?