Techniques for Managing Huge Data LISA10

Slides from the USENIX LISA10 Tutorial on Techniques for Managing Huge Data

  1. Techniques for Handling Huge Storage
     Richard.Elling@RichardElling.com
     USENIX LISA'10 Conference, November 8, 2010
  2. Agenda
     - How did we get here?
     - When good data goes bad
     - Capacity, planning, and design
     - What comes next?
     Note: this tutorial uses live demos, slides not so much
  3. History
  4. Milestones in Tape Evolution
     - 1951 - magnetic tape for data storage
     - 1964 - 9 track
     - 1972 - Quarter Inch Cartridge (QIC)
     - 1977 - Commodore Datasette
     - 1984 - IBM 3480
     - 1989 - DDS/DAT
     - 1995 - IBM 3590
     - 2000 - T9940
     - 2000 - LTO
     - 2006 - T10000
     - 2008 - TS1130
  5. Milestones in Disk Evolution
     - 1954 - hard disk invented
     - 1950s - solid state disk invented
     - 1981 - Shugart Associates System Interface (SASI)
     - 1984 - Personal Computer Advanced Technology (PC/AT) Attachment, later shortened to ATA
     - 1986 - "Small" Computer System Interface (SCSI)
     - 1986 - Integrated Drive Electronics (IDE)
     - 1994 - EIDE
     - 1994 - Fibre Channel (FC)
     - 1995 - Flash-based SSDs
     - 2001 - Serial ATA (SATA)
     - 2005 - Serial Attached SCSI (SAS)
  6. Architectural Changes
     - Simple, parallel interfaces
     - Serial interfaces
     - Aggregated serial interfaces
  7. When Good Data Goes Bad
  8. Failure Rates
     - Mean Time Between Failures (MTBF): statistical interarrival error rate, often cited in literature and data sheets
       MTBF = total operating hours / total number of failures
     - Annualized Failure Rate (AFR), expressed as a percent:
       AFR = operating hours per year / MTBF
     - Example: MTBF = 1,200,000 hours; year = 24 x 365 = 8,760 hours; AFR = 8,760 / 1,200,000 = 0.0073 = 0.73%
     - AFR is easier to grok than MTBF
     Note: operating hours per year is a flexible definition
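     A minimal sketch of the AFR arithmetic above, in Python. The 1,200,000-hour MTBF and 8,760-hour year are the slide's example values; the 24x7 duty cycle is the assumption to adjust for your own "operating hours per year".

```python
def afr_percent(mtbf_hours, operating_hours_per_year=24 * 365):
    """Annualized Failure Rate (%) from MTBF; the duty cycle is a flexible definition."""
    return 100.0 * operating_hours_per_year / mtbf_hours

print(f"{afr_percent(1_200_000):.2f}%")  # 0.73%, matching the slide's example
```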
  9. Multiple Systems and Statistics
     - Consider 100 systems, each with an MTBF = 1,000 hours
     - At time = 1,000 hours, 100 failures have occurred
     - Not all systems will see one failure
     [Chart: number of systems (0-40) versus number of failures (0-4), annotated "Unlucky", "Very Unlucky", and "Very, Very Unlucky"]
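     A hedged illustration of that spread, assuming the constant failure rate implied by a fixed MTBF (so per-system failure counts are Poisson with mean 1 at t = MTBF); roughly a third of the systems see no failure at all.

```python
from math import exp, factorial

def poisson_pmf(k, mean=1.0):
    """Probability of exactly k failures when the expected count is `mean`."""
    return mean**k * exp(-mean) / factorial(k)

systems = 100  # each run to t = MTBF, so each expects one failure
for k in range(5):
    print(f"{k} failures: ~{systems * poisson_pmf(k):.0f} systems")
# ~37 systems see 0 failures, ~37 see 1, ~18 see 2, ~6 see 3, ~2 see 4
```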
 10. Failure Rates
     - MTBF is a summary metric; manufacturers estimate MTBF by stressing many units for short periods of qualification time
     - Summary metrics hide useful information
     - Example, a mortality study: mortality of children aged 5-14 during 1996-1998 measured 20.8 per 100,000, an "MTBF" of 4,807 years, yet the current world average life expectancy is 67.2 years
     - For large populations, such as huge disk farms, the summary MTBF can appear constant
     - The better question to answer: "is my failure rate increasing or decreasing?"
 11. Why Do We Care?
     - Summary statistics, like MTBF or AFR, can be misleading or risky if we do not also distinguish between stable and trending processes
     - We need to analyze the ordered times between failures in relation to system age to describe system reliability
 12. Time Dependent Reliability
     - Useful for repairable systems: the system can be restored to satisfactory operation by some action, and failures occur sequentially in time
     - Measure the age of the components of a system; distinguish age from interarrival times (time between failures)
     - Age doesn't have to be precise: a resolution of weeks works OK
     - Some devices report Power On Hours (POH): SMART for disks, the OS; clerical solutions or inventory asset systems work fine (one collection sketch follows)
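     One hedged way to collect disk age, assuming smartmontools is installed and the drive reports the common Power_On_Hours SMART attribute (attribute names and raw-value formats vary by vendor); the device path is only an example.

```python
import subprocess

def power_on_hours(device):
    """Return the raw Power_On_Hours SMART value via smartctl, or None if not reported."""
    out = subprocess.run(["smartctl", "-A", device],
                         capture_output=True, text=True).stdout
    for line in out.splitlines():
        if "Power_On_Hours" in line:
            raw = line.split()[-1]            # raw value is the last column
            return int(raw) if raw.isdigit() else raw
    return None

print(power_on_hours("/dev/sda"))  # example device; typically requires root
```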
 13. TDR Example 1
     [Chart: mean cumulative failures (0-20) versus system age in months (1-50) for Disk Set A, Disk Set B, and Disk Set C, against a target MTBF line]
 14. TDR Example 2
     - Did a common event occur?
     [Chart: mean cumulative failures versus system age in months for Disk Sets A, B, and C, against the target MTBF line]
 15. TDR Example 2.5
     [Chart: mean cumulative failures versus calendar date, Jan 1, 2010 through Feb 3, 2014]
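     The plots above are mean cumulative function (MCF) plots. Here is a minimal sketch of the bookkeeping behind them, under the simplifying assumption that every system has been in service for the full observation window; the data and names are illustrative.

```python
from collections import Counter

def mean_cumulative_failures(failure_ages_by_system, max_age_months):
    """MCF by age: cumulative failures observed so far, divided by the number of systems."""
    n_systems = len(failure_ages_by_system)
    counts = Counter(age for ages in failure_ages_by_system.values() for age in ages)
    mcf, cumulative = [], 0
    for age in range(1, max_age_months + 1):
        cumulative += counts.get(age, 0)
        mcf.append(cumulative / n_systems)   # assumes all systems reached max_age_months
    return mcf

# Illustrative data: three systems, failure ages in months
print(mean_cumulative_failures({"sysA": [3, 9], "sysB": [7], "sysC": []}, max_age_months=12))
```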
 16. Long Term Storage
     - Near-line disk systems for backup: access time and bandwidth advantages over tape
     - Enterprise-class tape for backup and archival: 15-30 years shelf life, significant ECC, read error rate of 1e-20
     - Enterprise-class HDD read error rate: 1e-15
 17. Reliability
     - Reliability is time dependent; TDR analysis reveals trends
     - Use cumulative plots, mean cumulative plots, and recurrence rates: graphs are good
     - Track failures and downtime by system versus age and calendar dates
     - Correlate anomalous behavior
     - Manage retirement, refresh, and preventative processes using real data
 18. Data Sheets
 19. Reading Data Sheets
     - Manufacturers publish useful data sheets and product guides
     - Reliability information: MTBF or AFR; UER, or equivalent; warranty
     - Performance: interface bandwidth; sustained bandwidth (aka internal or media bandwidth); average rotational delay or rpm (HDD); average response or seek time; native sector size
     - Environmentals: power
     Note: AFR operating hours per year can be a footnote
 20. Availability
 21. Nines Matter
     - Is the Internet up?
 22. Nines Matter
     - Is the Internet up?
     - Is the Internet down?
 23. Nines Matter
     - Is the Internet up?
     - Is the Internet down?
     - Is the Internet's reliability 5-9's?
 24. Nines Don't Matter
     - Is the Internet up?
     - Is the Internet down?
     - Is the Internet's reliability 5-9's?
     - Do 5-9's matter?
 25. Reliability Matters!
     - Is the Internet up?
     - Is the Internet down?
     - Is the Internet's reliability 5-9's?
     - Do 5-9's matter?
     - Reliability matters!
 26. Designing for Failure
     - Change design perspective
     - Design for success: how to make it work? What you learned in school: solve the equation. Can be difficult...
     - Design for failure: how to make it work when everything breaks? What you learned in the army: win the war. Can be difficult... at first...
 27. Example: Design for Success
     [Diagram: two x86 servers running NexentaStor with an HA-Cluster plugin, both attached to shared storage over FC, SAS, or iSCSI]
 28. Designing for Failure
     - Application-level replication: hard to implement (coding required); some activity in the open community; hard to apply to general-purpose computing
     - Examples: DoD, Google, Facebook, Amazon, ... the big guys
     - Tends to scale well with size
     - Multiple copies of data
 29. Reliability vs. Availability
     - Reliability trumps availability
     - If disks didn't break, RAID would not exist; if servers didn't break, HA clusters would not exist
     - Reliability is measured in probabilities; availability is measured in nines
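     For reference, a small sketch converting "nines" of availability into allowed downtime per year; this standard conversion is an addition, not from the slides.

```python
def downtime_minutes_per_year(nines):
    """Allowed downtime per year at an availability of `nines` nines (e.g., 5 => 99.999%)."""
    unavailability = 10 ** (-nines)
    return unavailability * 365 * 24 * 60

for n in (3, 4, 5):
    print(f"{n} nines: ~{downtime_minutes_per_year(n):.1f} minutes/year")
# 3 nines: ~525.6, 4 nines: ~52.6, 5 nines: ~5.3 minutes/year
```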
 30. Data Retention
 31. Evaluating Data Retention
     - MTTDL = Mean Time To Data Loss
     - Note: MTBF is not constant in the real world, but assuming so keeps the math simple
     - MTTDL[1] is a simple MTTDL model:
       No parity (single vdev, striping, RAID-0): MTTDL[1] = MTBF / N
       Single parity (mirror, RAIDZ, RAID-1, RAID-5): MTTDL[1] = MTBF^2 / (N * (N-1) * MTTR)
       Double parity (3-way mirror, RAIDZ2, RAID-6): MTTDL[1] = MTBF^3 / (N * (N-1) * (N-2) * MTTR^2)
       Triple parity (4-way mirror, RAIDZ3): MTTDL[1] = MTBF^4 / (N * (N-1) * (N-2) * (N-3) * MTTR^3)
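     A minimal sketch of the MTTDL[1] formulas above; units follow the inputs (hours in, hours out), and the MTBF, MTTR, and pool size in the usage line are illustrative assumptions.

```python
def mttdl1(mtbf, mttr, n_disks, parity):
    """MTTDL[1] in the same time unit as mtbf/mttr; parity = 0, 1, 2, or 3."""
    numerator = mtbf ** (parity + 1)
    denominator = mttr ** parity
    for i in range(parity + 1):
        denominator *= n_disks - i
    return numerator / denominator

hours_per_year = 24 * 365
# Illustrative: 8-disk raidz2 set (double parity), 1.2M-hour MTBF, 48-hour MTTR
print(f"{mttdl1(1_200_000, 48, 8, parity=2) / hours_per_year:,.0f} years")
```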
 32. Another MTTDL Model
     - The MTTDL[1] model doesn't take unrecoverable reads into account
     - But unrecoverable reads (UER) are becoming the dominant failure mode
     - UER is specified as errors per bits read: more bits = higher probability of loss per vdev
     - The MTTDL[2] model considers UER
 33. Why Worry about UER?
     - Richard's study: 3,684 hosts with 12,204 LUNs; 11.5% of all LUNs reported read errors
     - Bairavasundaram et al., FAST08 (www.cs.wisc.edu/adsl/Publications/corruption-fast08.pdf): 1.53M LUNs over 41 months; RAID reconstruction discovers 8% of checksum mismatches; "For some drive models as many as 4% of drives develop checksum mismatches during the 17 months examined"
     - Manufacturers trade UER for space
 34. Why Worry about UER?
     - RAID array study
 35. Why Worry about UER?
     - RAID array study
     [Chart: observed failures split between unrecoverable reads and "disk disappeared" ("disk pull") events]
     - "Disk pull" tests aren't very useful
 36. MTTDL[2] Model
     - Probability that a reconstruction will fail: Precon_fail = (N-1) * size / UER
     - The model doesn't work for non-parity schemes (single vdev, striping, RAID-0)
     - Single parity (mirror, RAIDZ, RAID-1, RAID-5): MTTDL[2] = MTBF / (N * Precon_fail)
     - Double parity (3-way mirror, RAIDZ2, RAID-6): MTTDL[2] = MTBF^2 / (N * (N-1) * MTTR * Precon_fail)
     - Triple parity (4-way mirror, RAIDZ3): MTTDL[2] = MTBF^3 / (N * (N-1) * (N-2) * MTTR^2 * Precon_fail)
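     A sketch following the MTTDL[2] formulas above. The unit handling is the usual trap: if UER is quoted as one error per so-many bits read, disk size must be converted to bits. The disk size, MTBF, MTTR, and UER in the usage line are illustrative assumptions.

```python
def mttdl2(mtbf, mttr, n_disks, parity, disk_bytes, bits_per_error):
    """MTTDL[2] per the slide's formulas; parity = 1, 2, or 3 (no non-parity case)."""
    precon_fail = (n_disks - 1) * disk_bytes * 8 / bits_per_error  # chance reconstruction hits a UER
    numerator = mtbf ** parity
    denominator = n_disks * precon_fail * mttr ** (parity - 1)
    for i in range(1, parity):
        denominator *= n_disks - i
    return numerator / denominator

# Illustrative: 8 x 2 TB disks in raidz2, 1.2M-hour MTBF, 48-hour MTTR, UER of 1 per 1e15 bits
years = mttdl2(1_200_000, 48, 8, 2, 2e12, 1e15) / (24 * 365)
print(f"{years:,.0f} years")
```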
 37. Practical View of MTTDL[1]
 38. MTTDL[1] Comparison
 39. MTTDL Models: Mirror
     - Spares are not always better...
 40. MTTDL Models: RAIDZ2
 41. Space, Dependability, and Performance
 42. Dependability Use Case
     - Customer has 15+ TB of read-mostly data; 16-slot, 3.5" drive chassis; 2 TB HDDs
     - Option 1: one raidz2 set; 24 TB available space; 12 data + 2 parity; 2 hot spares, 48-hour disk replacement time; MTTDL[1] = 1,790,000 years
     - Option 2: two raidz2 sets; 24 TB available space; 6 data + 2 parity each; no hot spares; MTTDL[1] = 7,450,000 years
 43. Planning for Spares
     - The number of systems drives the need for spares
     - How many spares do you need? How often do you plan replacements?
     - Replacing devices immediately becomes impractical; not replacing devices increases risk, but how much?
     - There is no black/white answer, it depends... (one sizing sketch follows)
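     One common way to put numbers on "how many spares": treat failures during the replenishment interval as Poisson and stock to a service level. This is a hedged sketch of that general approach, not of the SparesOptimizer tool demoed next; the drive count, AFR, interval, and service level are assumptions.

```python
from math import exp, factorial

def spares_needed(n_drives, afr, interval_years, service_level=0.95):
    """Smallest spare count that covers the interval's failures with P >= service_level."""
    expected = n_drives * afr * interval_years        # mean failures in the interval
    spares, covered = 0, 0.0
    while True:
        covered += expected**spares * exp(-expected) / factorial(spares)  # Poisson CDF term
        if covered >= service_level:
            return spares
        spares += 1

# Illustrative: 500 drives, 0.73% AFR, quarterly replenishment, 95% service level
print(spares_needed(500, 0.0073, 0.25))
```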
 44. SparesOptimizer Demo
 45. Capacity, Planning, and Design
 46. Space
     - Space is a poor sizing metric, really!
     - Technology marketing heavily pushes space
     - Maximizing space can mean compromising performance AND reliability
     - As disks and tapes get bigger, they don't get better
     - The $150 rule
     - PHBs get all excited about space
     - Most current capacity planning tools manage by space
 47. Bandwidth
     - Bandwidth constraints in modern systems are rare
     - Overprovisioning for bandwidth is relatively simple, but where to gain bandwidth can be tricky: link aggregation (Ethernet, SAS), MPIO
     - Adding parallelism beyond 2 trades off reliability
 48. Latency
     - Lower latency == better performance
     - Latency != IOPS: IOPS can also be achieved with parallelism, but parallelism only improves latency when latency is constrained by bandwidth
     - Latency = access time + transfer time
     - HDD: access time limited by seek and rotate; transfer time usually limited by media or internal bandwidth
     - SSD: access time limited by architecture more than by the speed of light; transfer time limited by architecture and interface
     - Tape: access time measured in seconds
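     A back-of-the-envelope sketch of latency = access time + transfer time for a single random HDD read; the seek time, spindle speed, and media rate are typical but assumed figures, not from any data sheet.

```python
def hdd_read_latency_ms(io_bytes, seek_ms=8.0, rpm=7200, media_mb_s=150):
    """Average latency for one random read: seek + half a rotation + transfer time."""
    rotational_ms = 0.5 * 60_000 / rpm                 # average rotational delay
    transfer_ms = io_bytes / (media_mb_s * 1e6) * 1e3  # media-bandwidth-limited transfer
    return seek_ms + rotational_ms + transfer_ms

print(f"{hdd_read_latency_ms(128 * 1024):.1f} ms")     # ~13 ms for a 128 KiB read
```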
 49. Deduplication
 50. What is Deduplication?
     - A $2.1 billion feature; 2009 buzzword of the year
     - A technique for improving storage space efficiency
     - Trades big I/Os for small I/Os; does not eliminate I/O
     - Implementation styles:
       Offline or post-processing: data is written to nonvolatile storage, and a process comes along later and dedupes the data (example: tape archive dedup)
       Inline: data is deduped as it is being allocated to nonvolatile storage (example: ZFS)
 51. Dedup How-To
     - Given a bunch of data, find data that is duplicated
     - Build a lookup table of references to data
     - Replace duplicate data with a pointer to the entry in the lookup table
     - Granularity: file, block, or byte
 52. Dedup Constraints
     - Size of the deduplication table
     - Quality of the checksums: collisions happen; all possible permutations of N bits cannot be stored in N/10 bits
     - Checksums can be evaluated by probability of collisions; multiple checksums can be used, but the gains are marginal
     - Compression algorithms can work against deduplication: dedup before or after compression?
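     To put "collisions happen" in perspective, the usual birthday-bound approximation for a b-bit checksum over n unique blocks; the block counts below are illustrative, and this estimate is an addition, not from the slides.

```python
from math import expm1

def p_collision(n_blocks, checksum_bits):
    """Birthday bound: P(any collision) ~ 1 - exp(-n^2 / 2^(b+1))."""
    return -expm1(-(n_blocks**2) / 2**(checksum_bits + 1))

n = 2**50 // (128 * 1024)        # ~8.6 billion 128 KiB blocks, about 1 PiB of unique data
print(p_collision(n, 256))       # 256-bit checksum: vanishingly small
print(p_collision(n, 64))        # 64-bit checksum: a collision is more likely than not
```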
 53. Verification
     [Flowchart of the inline dedup write path: write() -> compress -> checksum -> DDT entry lookup. On a DDT match with verification enabled, read the stored data and compare; if the data matches (or verify is off), add a reference; otherwise, and on a DDT miss, create a new entry]
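     A toy sketch of that decision flow for block-level, inline dedup. The in-memory table, zlib compression, SHA-256 checksum, and collision handling are simplified stand-ins, so treat this as an illustration of the flow rather than ZFS internals.

```python
import hashlib
import zlib

ddt = {}      # checksum -> (stored_block, reference_count): a toy dedup table
storage = []  # stand-in for nonvolatile storage

def dedup_write(block, verify=True):
    """Inline dedup: compress, checksum, DDT lookup, optional verify, then reference or store."""
    compressed = zlib.compress(block)
    key = hashlib.sha256(compressed).digest()
    entry = ddt.get(key)
    if entry is not None:
        stored, refs = entry
        if not verify or stored == compressed:   # "verify": read the existing data and compare
            ddt[key] = (stored, refs + 1)        # data matches: add a reference, allocate nothing
            return "referenced"
        # checksum collision: fall through and store the block without deduping it (toy handling)
    ddt.setdefault(key, (compressed, 1))         # new DDT entry on a miss
    storage.append(compressed)
    return "stored"

print(dedup_write(b"x" * 8192))  # stored
print(dedup_write(b"x" * 8192))  # referenced
```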
 54. Reference Counts
     [Photo: eggs courtesy of Richard's chickens]
 55. Replication
 56. Replication Services
     [Chart: replication services arranged by Recovery Point Objective (seconds to hours to days) and system I/O performance impact (faster to slower): mirror, application-level replication, block replication (DRBD, SNDR), object-level sync (databases, ZFS), file-level sync (rsync), traditional backup (NDMP, tar)]
 57. How Many Copies Do You Need?
     - Answer: at least one, more is better...
     - One production, one backup
     - One production, one near-line, one backup
     - One production, one near-line, one backup, one at DR site
     - One production, one near-line, one backup, one at DR site, one archived in a vault
     - RAID doesn't count
     - Consider 3 to 4 as a minimum for important data
 58. Tiering Example
     [Diagram: a big, honking disk array backed up file-by-file to a big, honking tape library]
     - Works great, but...
 59. Tiering Example
     [Diagram: the same file-based backup from disk array to tape library]
     - ...backups never complete: 10 million files, 1 million daily changes, a 12-hour backup window
 60. Tiering Example
     [Diagram: the disk array replicated to near-line backup storage, which is then backed up to the tape library]
     - Backups to near-line storage and tape have different policies: 10 million files, 1 million daily changes, a weekly backup window, hourly block-level replication
 61. Tiering Example
     [Diagram: the same tiering, with near-line backup between the disk array and the tape library]
     - Quick file restoration is possible
 62. Application-Level Replication Example
     [Diagram: an application storing data at different sites (Site 1, Site 2, Site 3), with a long-term archive option]
 63. Data Sheets
 64. Reading Data Sheets Redux
     - Manufacturers publish useful data sheets and product guides
     - Reliability information: MTBF or AFR; UER, or equivalent; warranty
     - Performance: interface bandwidth; sustained bandwidth (aka internal or media bandwidth); average rotational delay or rpm (HDD); average response or seek time; native sector size
     - Environmentals: power
     Note: AFR operating hours per year can be a footnote
 65. Summary
 66. Key Points
     - You will need many copies of your data; get used to it
     - The cost/byte decreases faster than kicking old habits
     - Replication is a good thing, use it often
     - Tiering is a good thing, use it often
     - Beware of designing for success; design for failure, too
     - Reliability trumps availability
     - Space, dependability, performance: pick two
 67. Thank You! Questions?
     Richard.Elling@RichardElling.com
     Richard.Elling@Nexenta.com
