Performance and Availability Tradeoffs in Replicated File Systems

  1. 1. Performance and Availability Tradeoffs in Replicated File Systems Peter Honeyman Center for Information Technology Integration University of Michigan, Ann Arbor
  2. 2. Acknowledgements • Joint work with Dr. Jiaying Zhang • Now at Google • This was a chapter of her dissertation • Partially supported by • NSF/NMI GridNFS • DOE/SciDAC Petascale Data Storage Institute • NetApp • IBM ARC
  3. 3. Storage replication • Advantages ☺ • Scalability • Reliability • Read performance
  4. 4. Storage replication • Disadvantages ☹ • Complex synchronization protocols • Concurrency • Durability • Write performance
  5. 5. Durability • If we weaken the durability guarantee, we may lose data ... • And be forced to restart the computation • But it might be worth it
  6. 6. Utilization tradeoffs • Adding replication servers enhances durability • Reduces the risk that computation must be restarted • Increases utilization ☺ • Replication increases run time • Reduces utilization ☹
  7. 7. Placement tradeoffs • Nearby replication servers reduce the replication penalty • Increases utilization ☺ • Nearby replication servers are vulnerable to correlated failure • Reduces utilization ☹
  8. 8. Run-time model • [Figure: state diagram with the labels start, run, end, recover, fail, and ok]
  9. 9. Parameters • Failure free, single server run time • Can be estimated or measured • Our focus is on 1 to 10 days
  10. 10. Parameters • Replication overhead • Penalty associated with replication to backup servers • Proportional to RTT • Ratio can be measured by running with a backup server a few msec away
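
The proportionality stated on this slide can be sketched as a one-line scaling rule: measure the overhead once against a nearby backup server, then scale it by the ratio of RTTs. The function and parameter names below are hypothetical, and strict linearity in RTT is taken from the slide rather than verified here.

```python
def scale_overhead(measured_overhead, measured_rtt_ms, target_rtt_ms):
    """Extrapolate replication overhead to a different RTT.

    Assumes overhead is strictly proportional to RTT, as the slide states.
    measured_overhead is the fractional slowdown observed with a backup
    server measured_rtt_ms away (hypothetical inputs, not data from the talk).
    """
    if measured_rtt_ms <= 0:
        raise ValueError("measured_rtt_ms must be positive")
    return measured_overhead * (target_rtt_ms / measured_rtt_ms)
```

For example, a 1% overhead measured at 2 ms would extrapolate to 50% at 100 ms under this assumption.
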
  11. 11. Parameters • Recovery time • Time to detect failure of the primary server and switch to a backup server • Not a sensitive parameter
  12. 12. Parameters • Probability distribution functions • Server failure • Successful recovery
  13. 13. Server failure • Estimated by analyzing PlanetLab ping data • 716 nodes, 349 sites, 25 countries • All-pairs, 15 minute interval, 1/04 to 6/05 • 692 nodes were alive throughout • We ascribe missing pings to node failure and network partition
  14. 14. PlanetLab failure • [Figure: cumulative failure, log-linear scale]
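  15. 15. Correlated failures • P(n nodes down | 1 node down), by nodes per site

      failed nodes (n) |        nodes per site
                       |   2      3      4      5
      -----------------+----------------------------
      2                | 0.526  0.593  0.552  0.561
      3                |        0.546  0.440  0.538
      4                |               0.378  0.488
      5                |                      0.488
      -----------------+----------------------------
      number of sites  |  259     65     21     11
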
  16. 16. Correlated failures • [Figure: average failure correlations (y axis up to 0.25) vs. RTT (ms, ticks at 25, 75, 125, 175)] • Linear fits:

      nodes | slope         | y-intercept
      ------+---------------+------------
      2     | -2.4 x 10^-4  | 0.195
      3     | -2.3 x 10^-4  | 0.155
      4     | -2.3 x 10^-4  | 0.134
      5     | -2.4 x 10^-4  | 0.119

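
As an illustration, the linear fits on this slide can be evaluated directly. The sketch below hard-codes the slope and y-intercept values from the slide; the function name, the clamp at zero, and any use outside the plotted RTT range are assumptions, not something the talk states.

```python
# Slope (per ms of RTT) and y-intercept of the average failure
# correlation, keyed by the slide's "nodes" column.
CORRELATION_FITS = {
    2: (-2.4e-4, 0.195),
    3: (-2.3e-4, 0.155),
    4: (-2.3e-4, 0.134),
    5: (-2.4e-4, 0.119),
}

def failure_correlation(nodes, rtt_ms):
    """Evaluate the linear fit for the given node count at rtt_ms.

    The result is clamped at 0 (an assumption) so that extrapolating
    far beyond the plotted range never yields a negative correlation.
    """
    slope, intercept = CORRELATION_FITS[nodes]
    return max(0.0, slope * rtt_ms + intercept)
```

Under these fits, moving a two-node placement from 0 ms to 100 ms RTT lowers the average correlation from 0.195 to about 0.171, which is the placement tradeoff described earlier in the deck.
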
  17. 17. Run-time model • Discrete event simulation for expected run time and utilization • [Figure: state diagram with the labels start, run, end, recover, fail, and ok]
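
A minimal Monte Carlo sketch of such a simulation follows. It is not the authors' simulator: it assumes exponentially distributed time to failure (the talk uses an empirical PlanetLab distribution), a fixed recovery success probability, a recovery delay charged on every failure, and utilization defined as failure-free run time divided by mean simulated run time. The transitions (run to end, failure to recovery, successful recovery back to run, failed recovery back to start) are one reading of the state diagram, and all names are hypothetical.

```python
import random

def simulate_run(run_time, overhead, recovery_time, mtbf, p_recover, rng):
    """Return the wall-clock time of one simulated job."""
    work = run_time * (1.0 + overhead)   # replication stretches the job
    elapsed = 0.0                        # wall-clock time so far
    done = 0.0                           # useful work completed so far
    while True:
        time_to_failure = rng.expovariate(1.0 / mtbf)
        remaining = work - done
        if time_to_failure >= remaining:
            return elapsed + remaining   # run -> end
        # run -> fail -> recover: pay for lost progress time and recovery
        elapsed += time_to_failure + recovery_time
        if rng.random() < p_recover:
            done += time_to_failure      # ok: resume on a backup server
        else:
            done = 0.0                   # fail: restart the computation

def expected_utilization(run_time, overhead, recovery_time, mtbf, p_recover,
                         runs=10000, seed=0):
    """Estimate utilization as run_time / mean simulated wall-clock time."""
    rng = random.Random(seed)
    total = sum(
        simulate_run(run_time, overhead, recovery_time, mtbf, p_recover, rng)
        for _ in range(runs)
    )
    return run_time / (total / runs)
```

Setting `p_recover=0.0` and `overhead=0.0` models the no-replication case; raising `p_recover` while adding a nonzero `overhead` models adding backup servers at some RTT.
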
  18. 18. Simulation results: one hour • No replication: utilization = .995 • [Figure: two panels, One backup and Four backups, plotted against RTT for write intensities 0.0001, 0.001, 0.01, and 0.1]
  19. 19. Simulation results: one day • No replication: utilization = .934 • [Figure: two panels, One backup and Four backups, plotted against RTT for write intensities 0.0001, 0.001, 0.01, and 0.1]
  20. 20. Simulation results: ten days • No replication: utilization = .668 • [Figure: two panels, One backup and Four backups, plotted against RTT]
  21. 21. Simulation discussion • Replication improves utilization for long-running jobs • Multiple backup servers do not improve utilization (due to low PlanetLab failure rates)
  22. 22. Simulation discussion • Distant backup servers improve utilization for light writers • Distant backup servers do not improve utilization for heavy writers • Implications for checkpoint interval …
  23. 23. Checkpoint interval • Calculated on the back of a napkin • [Figure: panels labeled "one day, 20% checkpoint overhead", "10 day, 2% checkpoint overhead", and "10 day, 2% checkpoint overhead", with the labels "one backup server" and "four backup servers"]
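
The slide does not say how the napkin calculation was done. As one common first-order estimate, and explicitly not the authors' method, Young's approximation picks the checkpoint interval as the square root of twice the checkpoint cost times the mean time between failures. The function name and the inputs below are hypothetical.

```python
import math

def young_interval(checkpoint_cost, mtbf):
    """Young's first-order approximation of the optimal checkpoint interval.

    checkpoint_cost and mtbf must be in the same time unit. This is a
    swapped-in textbook estimate, not the calculation behind the slide.
    """
    if checkpoint_cost < 0 or mtbf <= 0:
        raise ValueError("checkpoint_cost must be >= 0 and mtbf must be > 0")
    return math.sqrt(2.0 * checkpoint_cost * mtbf)
```

With a half-hour checkpoint cost and a 100-hour mean time between failures, this estimate suggests checkpointing roughly every 10 hours.
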
  24. 24. Work in progress • Realistic failure data • Storage and processor failure • PDSI failure data repository • Realistic checkpoint costs — help! • Realistic replication overhead • Depends on amount of computation • Less than 10% for NAS Grid Benchmarks
  25. 25. Conclusions • Conventional wisdom holds that consistent mutable replication in large-scale distributed systems is too expensive to consider • Our study suggests otherwise
  26. 26. Conclusions • Consistent replication in large-scale distributed storage systems is feasible and practical • Superior performance • Rigorous adherence to conventional file system semantics • Improved utilization
  27. 27. Thank you for your attention! www.citi.umich.edu Questions?
