RAID OverviewRAID Overview
91.560
1
Computing speeds double every 3 years
Disk speeds can’t keep up
Data needs higher MTBF than any
component in system
 IO Performance and Availability
Issues!
2
The Motivation for RAID
PERFORMANCE
◦ Parallelism
◦ Load Balancing
AVAILABILITY
◦ Redundancy: Mirroring, or Striping with
Parity
FLEXIBILITY
◦ Selectable Performance/Availability/Cost
3
RAID to the Rescue!
Redundant Array of Independent Disks
(a.k.a. “Disk Array”)
◦ Multiple drives, single host disk unit
Provides opportunity to increase:
◦ Performance via Parallelism
◦ Data Availability via Redundancy
Cheap cost via commodity disks
4
What is RAID Technology?
 Chunk size tuneable for BW vs Thruput tradeoffs
◦ Large Chunk High Throughput (IO/sec)
◦ Small Chunk High Bandwidth (MB/sec)
5
Chunk 1
Parity
Chunk 11
Chunk 16
Chunk 21
Chunk 5
Chunk 10
Chunk 15
Chunk 20
Chunk 2
Chunk 7
Chunk 17
Chunk 22
Chunk 3
Chunk 8
Chunk 13
Chunk 23
DriveDrive
55
Chunk 12
Chunk 0
Chunk 18
Chunk 9
Chunk 14
Chunk 19
Chunk 24
Chunk 4
Chunk 6
DriveDrive
44
DriveDrive
33
DriveDrive
22
DriveDrive
11
Performance: Disk Striping
6
Data Stream, OR IO Stream, can be multiplexed
across multiple disks, depending on BW vs
Thruput. Parity data is also stored on disk.
Same data written to both disks.
Striping with Parity
Single Host Unit
> Add XOR’d parity for increased availability.
Mirroring
Availability: Redundancy
Parity = XOR of data from every disk in
the RAID unit
7
DATA
0
1
1
0
0 parity
- Any single disk’s
data can be recovered
by XOR’ing the data
of the surviving disks.
Parity Redundancy
Many to choose from
Each offers unique tradeoffs
◦ Performance
◦ Availability
◦ Costs
We offer levels 0, 1, 3, 5, 10
8
RAID Levels
•High Performance; Low Availability
•Data Striped on Multiple Disks
•Multi-threaded Access
9
Disk Striping with No Redundancy
Data Stream, OR IO Stream, can be Striped
across multiple disks (BW vs Thruput).
RAID 0
 Chunk size tuneable for BW or Thruput
 No redundancy
10
Chunk 1
Parity
Chunk 11
Chunk 16
Chunk 21
Chunk 5
Chunk 10
Chunk 15
Chunk 20
Chunk 2
Chunk 7
Chunk 17
Chunk 22
Chunk 3
Chunk 8
Chunk 13
Chunk 23
DriveDrive
55
Chunk 12
Chunk 0
Chunk 18
Chunk 9
Chunk 14
Chunk 19
Chunk 24
Chunk 4
Chunk 6
DriveDrive
44
DriveDrive
33
DriveDrive
22
DriveDrive
11
RAID 0 Striping
 Single-disk Performance; Expensive
Availability
 Data 100% duplicated across both spindles.
 Single-threaded access
11
Disk Mirroring
SAME data written to BOTH disks -- no
segmenting.
Single Host Unit
RAID 1
•Highest Performance; Most Expensive
Availability
•Multi-threaded Access
12
Striped Mirrors
Data Stream, OR IO Stream, can be Striped
across multiple disks (BW vs Thruput).
Single Host Unit
RAID 1/0
•High BW Performance; Cheap Availability
•Sector-granular data striping
•Single-threaded Access
13
Disk Striping with dedicated parity drive
Data Stream is Striped across N-1 disks for
high bandwidth.
XOR Parity Data
RAID 3
 Chunk size = single sector (pure RAID 3 would be single byte)
 All parity data on same spindle
14
Chunk 1
Parity
Chunk 9
Chunk 13
Chunk 17
Chunk 4
Chunk 8
Chunk 12
Chunk 16
Chunk 2
Chunk 6
Chunk 14
Chunk 18
Chunk 3
Chunk 7
Chunk 11
Chunk 19
DriveDrive
55
Chunk 10
Chunk 0
Chunk 15
Parity
Parity
Parity
Parity
Parity
Chunk 5
DriveDrive
44
DriveDrive
33
DriveDrive
22
DriveDrive
11
RAID 3 Striping
 High Read Performance, expensive Write
performance; Cheap Availability
 Tuneable Stripe granularity
 Optimized for multi-thread access
15
Disk Striping with rotating parity drive
IO Stream is Striped across N-1 disks for high
IOs per second (thruput).
XOR Parity Data
RAID 5
 Chunk size is tuned such that typical IO aligns on single disk.
 Parity rotates amongst disks to avoid write bottleneck
16
Chunk 1
Parity
Chunk 9
Chunk 16
Parity
Chunk 4
Chunk 8
Chunk 12
Parity
Chunk 2
Chunk 6
Chunk 13
Chunk 17
Chunk 3
Chunk 10
Chunk 18
Parity
DriveDrive
55
Parity
Chunk 0
Chunk 14
Chunk 7
Chunk 11
Chunk 15
Chunk 19
Parity
Chunk 5
DriveDrive
44
DriveDrive
33
DriveDrive
22
DriveDrive
11
RAID 5 Striping
17
1. Read old
data.
Old New
2. Write
new data
Old P.
3. XOR old and
new data to create
“Partial Product”.
4. Read
old parity
data.
5. Xor old parity with
partial product, writing
out result as new parity.
P. P.
}
XOR
}
XOR
New P.
Chunk 1 Chunk 2 Chunk 3
DriveDrive
55
Chunk 0 Parity
DriveDrive
44
DriveDrive
33
DriveDrive
22
DriveDrive
11
RAID 5 - Write Operation
18
RAID 0 - Data striping, Non-redundant.
High Performance, Low Availability
RAID 1 - Mirroring
Moderate Performance, Expensive High Availability
RAID 1/0 - Striping and Mirroring
High Performance, Expensively High Availability
RAID 3 - Striping, single parity disk.
High Bandwidth Performance, Cheap Availability
RAID 5 - Striping, rotating parity disk.
High Thruput Performance, Cheap Availability
RAID Level Review
Increasing performance gap between CPU
and IO
Data availability a priority
RAID meets the IO challenge:
◦ Performance via parallelism
◦ Data Availability via redundancy
◦ Flexibility via multiple RAID levels, each offer
unique performance/availability/cost tradeoffs
19
Summary

Raid intro

  • 1.
  • 2.
    Computing speeds doubleevery 3 years Disk speeds can’t keep up Data needs higher MTBF than any component in system  IO Performance and Availability Issues! 2 The Motivation for RAID
  • 3.
    PERFORMANCE ◦ Parallelism ◦ LoadBalancing AVAILABILITY ◦ Redundancy: Mirroring, or Striping with Parity FLEXIBILITY ◦ Selectable Performance/Availability/Cost 3 RAID to the Rescue!
  • 4.
    Redundant Array ofIndependent Disks (a.k.a. “Disk Array”) ◦ Multiple drives, single host disk unit Provides opportunity to increase: ◦ Performance via Parallelism ◦ Data Availability via Redundancy Cheap cost via commodity disks 4 What is RAID Technology?
  • 5.
     Chunk sizetuneable for BW vs Thruput tradeoffs ◦ Large Chunk High Throughput (IO/sec) ◦ Small Chunk High Bandwidth (MB/sec) 5 Chunk 1 Parity Chunk 11 Chunk 16 Chunk 21 Chunk 5 Chunk 10 Chunk 15 Chunk 20 Chunk 2 Chunk 7 Chunk 17 Chunk 22 Chunk 3 Chunk 8 Chunk 13 Chunk 23 DriveDrive 55 Chunk 12 Chunk 0 Chunk 18 Chunk 9 Chunk 14 Chunk 19 Chunk 24 Chunk 4 Chunk 6 DriveDrive 44 DriveDrive 33 DriveDrive 22 DriveDrive 11 Performance: Disk Striping
  • 6.
    6 Data Stream, ORIO Stream, can be multiplexed across multiple disks, depending on BW vs Thruput. Parity data is also stored on disk. Same data written to both disks. Striping with Parity Single Host Unit > Add XOR’d parity for increased availability. Mirroring Availability: Redundancy
  • 7.
    Parity = XORof data from every disk in the RAID unit 7 DATA 0 1 1 0 0 parity - Any single disk’s data can be recovered by XOR’ing the data of the surviving disks. Parity Redundancy
  • 8.
    Many to choosefrom Each offers unique tradeoffs ◦ Performance ◦ Availability ◦ Costs We offer levels 0, 1, 3, 5, 10 8 RAID Levels
  • 9.
    •High Performance; LowAvailability •Data Striped on Multiple Disks •Multi-threaded Access 9 Disk Striping with No Redundancy Data Stream, OR IO Stream, can be Striped across multiple disks (BW vs Thruput). RAID 0
  • 10.
     Chunk sizetuneable for BW or Thruput  No redundancy 10 Chunk 1 Parity Chunk 11 Chunk 16 Chunk 21 Chunk 5 Chunk 10 Chunk 15 Chunk 20 Chunk 2 Chunk 7 Chunk 17 Chunk 22 Chunk 3 Chunk 8 Chunk 13 Chunk 23 DriveDrive 55 Chunk 12 Chunk 0 Chunk 18 Chunk 9 Chunk 14 Chunk 19 Chunk 24 Chunk 4 Chunk 6 DriveDrive 44 DriveDrive 33 DriveDrive 22 DriveDrive 11 RAID 0 Striping
  • 11.
     Single-disk Performance;Expensive Availability  Data 100% duplicated across both spindles.  Single-threaded access 11 Disk Mirroring SAME data written to BOTH disks -- no segmenting. Single Host Unit RAID 1
  • 12.
    •Highest Performance; MostExpensive Availability •Multi-threaded Access 12 Striped Mirrors Data Stream, OR IO Stream, can be Striped across multiple disks (BW vs Thruput). Single Host Unit RAID 1/0
  • 13.
    •High BW Performance;Cheap Availability •Sector-granular data striping •Single-threaded Access 13 Disk Striping with dedicated parity drive Data Stream is Striped across N-1 disks for high bandwidth. XOR Parity Data RAID 3
  • 14.
     Chunk size= single sector (pure RAID 3 would be single byte)  All parity data on same spindle 14 Chunk 1 Parity Chunk 9 Chunk 13 Chunk 17 Chunk 4 Chunk 8 Chunk 12 Chunk 16 Chunk 2 Chunk 6 Chunk 14 Chunk 18 Chunk 3 Chunk 7 Chunk 11 Chunk 19 DriveDrive 55 Chunk 10 Chunk 0 Chunk 15 Parity Parity Parity Parity Parity Chunk 5 DriveDrive 44 DriveDrive 33 DriveDrive 22 DriveDrive 11 RAID 3 Striping
  • 15.
     High ReadPerformance, expensive Write performance; Cheap Availability  Tuneable Stripe granularity  Optimized for multi-thread access 15 Disk Striping with rotating parity drive IO Stream is Striped across N-1 disks for high IOs per second (thruput). XOR Parity Data RAID 5
  • 16.
     Chunk sizeis tuned such that typical IO aligns on single disk.  Parity rotates amongst disks to avoid write bottleneck 16 Chunk 1 Parity Chunk 9 Chunk 16 Parity Chunk 4 Chunk 8 Chunk 12 Parity Chunk 2 Chunk 6 Chunk 13 Chunk 17 Chunk 3 Chunk 10 Chunk 18 Parity DriveDrive 55 Parity Chunk 0 Chunk 14 Chunk 7 Chunk 11 Chunk 15 Chunk 19 Parity Chunk 5 DriveDrive 44 DriveDrive 33 DriveDrive 22 DriveDrive 11 RAID 5 Striping
  • 17.
    17 1. Read old data. OldNew 2. Write new data Old P. 3. XOR old and new data to create “Partial Product”. 4. Read old parity data. 5. Xor old parity with partial product, writing out result as new parity. P. P. } XOR } XOR New P. Chunk 1 Chunk 2 Chunk 3 DriveDrive 55 Chunk 0 Parity DriveDrive 44 DriveDrive 33 DriveDrive 22 DriveDrive 11 RAID 5 - Write Operation
  • 18.
    18 RAID 0 -Data striping, Non-redundant. High Performance, Low Availability RAID 1 - Mirroring Moderate Performance, Expensive High Availability RAID 1/0 - Striping and Mirroring High Performance, Expensively High Availability RAID 3 - Striping, single parity disk. High Bandwidth Performance, Cheap Availability RAID 5 - Striping, rotating parity disk. High Thruput Performance, Cheap Availability RAID Level Review
  • 19.
    Increasing performance gapbetween CPU and IO Data availability a priority RAID meets the IO challenge: ◦ Performance via parallelism ◦ Data Availability via redundancy ◦ Flexibility via multiple RAID levels, each offer unique performance/availability/cost tradeoffs 19 Summary