Future Information Growth And Storage Device Reliability 2007


Conference presentation from 2007

  1. Future Information Growth and Storage Device Reliability. Andrei Khurshudov, Seagate Technology, 2007
  2. The History of Data Storage
     • Storage media: charcoal and dirt on stone. Data type: analog (image). Storage life: >17,000 years (in a sealed dry cave)
     • ‘Diamond Sutra’ (the world’s earliest complete survival of a dated printed book), AD 868. Storage media: ink on paper. Data type: analog (images, characters). Storage life: >1,100 years (sealed in a cave)
  3. The History of Computer Data Storage
     [Figure: timeline of storage technologies, 1940 to 2020]
     • Direct access to data: magnetic drum, RAMAC, 3340 Winchester, 5.25”, 3.5”, 2.5” and 1.8” hard disk drives, 1.8” perpendicular drive (2005), floppy, Zip, Jaz, CD-ROM, DVD, Blu-ray/HD DVD, holographic disk, hybrid drive, SSD
     • Sequential access to data: punch cards (“Do not fold, spindle or mutilate”), punched tape, magnetic tape, Compact Cassette
     • Year labels on the figure: 1956, 1962, 1980, 1983, 1988, 1991, 2005
     • Captions: “Need more storage!” and “Don’t know how to sell more storage…”
  4. The First HDD is Born
     • RAMAC stands for “Random Access Method of Accounting and Control”
     • Born: 1956
     • Capacity: 5 MB
     • Disk diameter: 24”
     • Recording surfaces: 100
     • Tracks/surface: 100
     • RPM: 1200
     • Weight: >1 ton
     • Cost: leased for $3,200 per month
     “While the storage capacity of the drive could have been increased above five megabytes, the marketing department at IBM was against a larger capacity drive because they didn't know how to sell a product with more storage” (source: Currie Munce, VP, IBM Research)
  5. Modern Disk Drive
     • About 50 years old: runs faster with every year…
     • Mass-produced electro-mechanical device: 2006 total industry output >400M drives
     • Utilizes principles of magnetic recording: most recent products utilize PMR
     • Relies on a flying magnetic element: typical mechanical separation ~5-10 nm
     • Available in several standard form factors: 1”, 1.8”, 2.5”, 3.5”
     • Designed for several distinct markets: Desktop, Enterprise, Mobile, HH, CE
     • Uses various computer interfaces: PATA, SATA, SAS, SCSI, FCAL
     • Historically high data density growth rate: CAGR of 30% to 50% over the last decades
     • Experiences constant cost pressure: cost per GB is under $0.50 and falling
     • Always under attack from disruptive technologies: destroys or assimilates competition for 50 years
     • Continually expands into new markets: most recent are CE, automotive, archival
     • Highly competitive industry: Darwinian principles in accelerated action*
     • Industry share leader: Seagate, ~40% of the total market share
     * “The Innovator’s Dilemma” by Clayton M. Christensen
  6. Disk Drive Industry Trends
     [Figures: industry trend charts, including the 0.85” drive. Sources: PC World, “The Hard Drive Turns 50”; Coughlin Associates; Bear Stearns Technology Conference, 2006; Ed Grochowski, IBM]
     • Drives get denser, smaller, faster, and cheaper
     • Reliability becomes increasingly difficult
  7. Yesterday, Today, and Tomorrow
     [Figure: panels labeled Yesterday, Today, and Tomorrow]
     There’s plenty of room at the bottom!
  8. Estimated Number of Units Shipped
     [Figure: estimated HDD units shipped per calendar year, CY00 to CY12; y-axis labeled “Units, Millions”. Source: Seagate Market Research]
     • Rapid overall HDD unit growth will continue into the foreseeable future
     • More than 1.5X increase in units shipped in 2012 compared to 2007
  9. Strong Link Between Information Growth and Storage Produced
     [Figure: “New Storage” balanced against “New Data”]
     • Sources of new data: Internet, blogs, movies, TV, music, maps, databases, archives, business, legal, science, diaries, art, gaming, literature, noise, etc.
     • Balance is required!
     • Data storage technology underpins information growth
  10. Estimated Total PB’s Capacity Shipped
      [Figure: total PB shipped per year, 2002 to 2012, with projection. Exponential fit: y = 7872.3·e^(0.3679x), R² = 0.9883. Source: Seagate Market Research]
      • Information growth trend is indeed exponential!
      • Overall information growth will scale with the HDD capacity growth
      • It is estimated that over 90% of all new information produced in the world is being stored on magnetic media, most of it on hard disk drives (Google)
      • Shipped capacity doubles every 30 months
      • Over 1M PB of storage will be produced between 2008-12
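The exponential fit quoted on slide 10 can be checked with a few lines of arithmetic. The sketch below assumes that x = 1 corresponds to 2002, the first year on the chart axis; that mapping is not stated on the slide. Under this assumption the fit gives a 2008-12 total above 1M PB, in line with the slide, while the fitted exponent alone implies a doubling time of roughly 23 months rather than the 30 months quoted.

```python
import math

# Coefficients of the exponential fit quoted on the slide: y = A * exp(K * x)
A = 7872.3   # PB
K = 0.3679   # per year

def projected_pb(year, base_year=2001):
    """Projected PB shipped in `year`.

    base_year=2001 (so that x = 1 in 2002) is an assumption, not a slide value.
    """
    return A * math.exp(K * (year - base_year))

# Doubling time implied by the fitted exponent alone
doubling_months = 12 * math.log(2) / K

# Total projected capacity shipped in 2008 through 2012
total_2008_2012 = sum(projected_pb(y) for y in range(2008, 2013))

print(f"doubling time: {doubling_months:.1f} months")
print(f"2008-12 total: {total_2008_2012:,.0f} PB")
```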
  11. Long-term Storage Growth Projection
      [Figure: long-term projection of total PB shipped, 2000 to 2050, on a log scale from 1 PB to 1,000,000,000,000 PB; levels marked Petabyte, Exabyte, Zettabyte, Yottabyte, and “Alotabyte?!!!”. Projection by Andrei Khurshudov]
      Exponential growth in storage capacity will enable the information avalanche!
  12. Definitions of reliability
      • Reliability is the probability of performing required functions for a specified time under the stated operational conditions
      • For HDD:
          • Required functions include storing and accessing data at the specified high data rate and with specified power consumption, acoustic noise, start-up time, etc.
          • Specified time is the service life, which is typically 3 to 10 years
          • Stated operational conditions are those specified by the HDD specification (temperature, humidity, shock, vibration, etc.)
      • Weibull reliability model:
          • Describes the “weakest link” in a product
          • Treats system as a series of components each having finite reliability
      [Figure: series chain R1, R2, …, Rn making up HDD reliability; components shown: HDI, PCBA, motor, code, etc.]
      • HDD fails if any one component fails! R = R1 × R2 × R3 × … × Rn
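The series model on slide 12 can be sketched as follows: each component gets a Weibull survival function, and the system reliability is the product of the component reliabilities. The component names come from the slide, but every Weibull parameter below (characteristic life `eta`, shape `beta`) is a hypothetical placeholder chosen only for illustration, not a value from the presentation.

```python
import math

def weibull_reliability(t_hours, eta_hours, beta):
    """Weibull survival function R(t) = exp(-(t / eta) ** beta)."""
    return math.exp(-((t_hours / eta_hours) ** beta))

def series_reliability(component_reliabilities):
    """Series system: R = R1 * R2 * ... * Rn (the system fails if any one component fails)."""
    return math.prod(component_reliabilities)

# Hypothetical (eta in hours, beta) pairs per component; not values from the slides
components = {
    "HDI": (2.0e6, 0.9),
    "PCBA": (5.0e6, 1.0),
    "Motor": (4.0e6, 1.2),
    "Code": (8.0e6, 1.0),
}

t = 5 * 8760  # five years of continuous operation, in hours
per_component = {
    name: weibull_reliability(t, eta, beta)
    for name, (eta, beta) in components.items()
}
r_system = series_reliability(per_component.values())

print(f"system reliability after 5 years: {r_system:.4f}")
```

Because every factor is below 1, the system reliability is always lower than that of its weakest component, which is the "weakest link" point the slide makes.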
  13. HDD Reliability Trends
      [Figure: manufacturers’ HDD MTBF specifications over time. From: Ed Grochowski, IBM]
      • MTBF indicates, on average, how many hours a product is expected to operate before failure
      • MTBF = Total Operational Time / Number of Failures
      • Current typical MTBF numbers (by product class):
          • Server: 1,400,000 hours
          • Desktop: 700,000 hours
          • Mobile: 400,000 hours
      • The Ultimate Battle: Product Reliability vs. Storage Density; Reliability vs. Cost; Reliability vs. Performance; Reliability vs. Development Time; Reliability vs. Environment; …
      • Reliability keeps increasing with time in spite of design complexities and more stringent qualification test requirements
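To relate the MTBF figures on slide 13 to an annualized failure rate (AFR, the measure used later in the deck), one common simplification is a constant failure rate with continuous operation. The sketch below uses that assumption, which the slides do not state; mobile drives in particular rarely run 24 hours a day, so the result is only indicative.

```python
import math

HOURS_PER_YEAR = 8760  # assumes 24x7 operation

def mtbf_hours(total_operational_hours, failures):
    """MTBF = total operational time / number of failures (as defined on the slide)."""
    return total_operational_hours / failures

def afr_from_mtbf(mtbf, hours_per_year=HOURS_PER_YEAR):
    """Annualized failure rate under an assumed constant (exponential) failure rate."""
    return 1.0 - math.exp(-hours_per_year / mtbf)

# MTBF values by product class, as quoted on the slide
typical_mtbf = {"Server": 1_400_000, "Desktop": 700_000, "Mobile": 400_000}

for product_class, mtbf in typical_mtbf.items():
    print(f"{product_class}: MTBF {mtbf:,} h, AFR about {100 * afr_from_mtbf(mtbf):.2f}%")
```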
  14. HDD Reliability Hierarchy
      Involvement, and what it is dealing with:
      • Customer perception of reliability: limited statistics
      • Reliability in User Environment: closing gap between expected reliability and reality
      • Manufacturing for Reliability: the last line of defense; balancing quality against cost
      • Product reliability qualification: advanced test techniques and failure modes analysis
      • Design for Reliability: Engineering and Technology Principles
      • Reliability Physics & Theory: fundamental laws of nature
      HDD reliability is built upon Tribology!
  15. A Perspective on HDD Reliability
      [Figure: bar chart of cumulative failure / repair / return rates (after 3-4 years), x-axis 0 to 50%. Items listed: Laptop computer; Refrigerator: side-by-side, with icemaker and dispenser; Rider mower; Desktop computer; Lawn tractor; Washing machine (front-loading); Self-propelled mower; Vacuum cleaner (canister); Washing machine (top-loading); Dishwasher; Gas range; Refrigerator: top- and bottom-freezer, w/ icemaker; Wall oven (electric); Push mower (gas); Microwave oven (over-the-range); Cooktop (gas); Clothes dryer; Average for CE products; Vacuum cleaner (upright); Camcorder (digital); Refrigerator: top- and bottom-freezer, no icemaker; Cooktop (electric); Range (electric); Digital camera; TV: 30- to 36-inch direct view; TV: 25- to 27-inch direct view; Proton rocket; HDD; Medical pacemakers; Sony PS3 (with HDD)]
      • When compared to many other products, HDD reliability looks very high
      • Average 3-4 year cumulative repair rate for CE products is 15%
      • HDD is a component, not a product
      Source: Consumer Reports National Research Center, 2006 Product Reliability Survey; http://en.wikipedia.org/wiki/Proton_rocket; www.seagate.com; http://www.medscape.com/viewarticle/536755
  16. The Actual Cost of Unreliability
      If a company experiences a major loss of data:
      • 60% of companies that lose their data will shut down within 6 months of the disaster (source: Bostoncomputing.net)
      • 72% of businesses that suffer major data loss disappear within 24 months (source: Realty Times)
      • 93% of companies that lost their data center for 10 days or more due to a disaster filed for bankruptcy within one year of the disaster (source: Bostoncomputing.net)
      • Recreating data from scratch is estimated to cost between $2000 and $8000 per MB (source: Realty Times)
      Of those companies participating in the 2001 Cost of Downtime Survey (source: 2001 Cost of Downtime Survey Results):
      • 8% said each hour of downtime would cost their companies more than $1 million
      • 18% said each hour would cost between $251K and $1 million
      • 28% said each hour would cost between $51K and $250K
      • 46% said each hour would cost up to $50K
  17. Aggravating Aspects of Data Loss
      • 40% of small and medium-sized businesses do not back up their data (source: Realty Times)
      • 40-50% of all backups are not fully recoverable (source: Realty Times)
      • 34% of companies fail to test their tape backups, and of those that do, 77% have found tape backup failures (source: Bostoncomputing.net)
      • “More than 109,000 TBs of unique enterprise PC data are not being regularly backed up” (IDC)
      A national Harris Interactive survey reveals (source: Realty Times):
      • Only 25% of users frequently back up digital files, even though 85% of computer users say they are very concerned about losing important digital data
      • 37% of the survey's respondents admitted to backing up their files less than once per month
      • 9% admitted they have never backed up their files
      • More than 22% said backing up information is on their to-do list, but they seldom do it
  18. What do drives fail for?
      [Figure: generic HDD failure mode Pareto. Categories: write abort, high-fly write, NTF/CND, scratch, TA, head degradation, grown defect, motor, mishandling / handling damage, PCB. Callouts: “Up to 40%; system-dependent” and “Up to 30%; system-dependent, personnel-dependent, procedure-dependent, etc.”]
      Observation: Tribology is responsible for many failure modes!
  19. Tribology inside HDD
      [Figure: drive components labeled: connectors; FDB motor; head-disk interface; ramp (friction and wear); pivot bearing; screws (wear and torque retention)]
      There are multiple ways in which tribology impacts HDD reliability
  20. The Role of Tribology in HDD Reliability
      • It is estimated that 15% to 35% of all HDD failures are linked to tribology (25% on average)
      • Improving tribological robustness enhances overall disk drive reliability
      • Major known failure modes related to tribological issues:
          • Scratch (on both head and media; with or without particles)
          • Thermal erasure (disk) and head degradation
          • New defects
          • Weak write / read
          • Crash
          • Failure of some other moving parts
          • Etc.
  21. Future Improvement Opportunities
      HDD reliability:
      • Number of drives that will not fail between 2008 and 2012 per every 0.1% AFR improvement: ~3,000,000
      • Amount of stored information that will not be lost/impacted between 2008 and 2012 per every 0.1% AFR improvement: ~1,000,000 TB (or 1 EB)
      Tribology:
      • Number of drives that will not fail between 2008 and 2012 due to tribological problems per every 0.1% AFR improvement: ~750,000
      • Amount of stored information that will not be lost/impacted between 2008 and 2012 due to tribological problems per every 0.1% AFR improvement: ~250,000 TB = 250 PB
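The figures on slide 21 can be reproduced with simple proportional arithmetic. In the sketch below, the 1M PB capacity comes from slide 10 and the 25% tribology share from slide 20; the total of about 3 billion drives for 2008-12 is an assumption back-derived from the slide's ~3,000,000 result, not a number the presentation states. Treating a 0.1% AFR improvement as a single multiplication over the whole period is likewise a simplification.

```python
from fractions import Fraction

AFR_IMPROVEMENT = Fraction(1, 1000)   # 0.1% AFR improvement
TRIBOLOGY_SHARE = Fraction(1, 4)      # ~25% of failures linked to tribology (slide 20)
DRIVES_2008_2012 = 3_000_000_000      # assumption: back-derived from the slide's result
CAPACITY_PB_2008_2012 = 1_000_000     # "over 1M PB" produced in 2008-12 (slide 10)

drives_saved = DRIVES_2008_2012 * AFR_IMPROVEMENT
pb_saved = CAPACITY_PB_2008_2012 * AFR_IMPROVEMENT
tribology_drives_saved = drives_saved * TRIBOLOGY_SHARE
tribology_pb_saved = pb_saved * TRIBOLOGY_SHARE

print(f"drives saved: {int(drives_saved):,}")
print(f"capacity saved: {int(pb_saved):,} PB")
print(f"tribology share, drives: {int(tribology_drives_saved):,}")
print(f"tribology share, capacity: {int(tribology_pb_saved):,} PB")
```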
  22. Is this worth the effort?
      Petabytes in use:
      • The “American Memory” project is one of the largest digitized archives of U.S. history, with more than 7.5 million digital records from 100 collections of manuscripts, books, maps, films, sound recordings and photographs. The total size of the project is 0.008 Petabytes [Wired]
      • As of November 2006, eBay had 2 Petabytes of data [Wikipedia]
      • Jefferson National Accelerator Facility has a 2 Petabyte storage farm used to collect data from experiments on the particle accelerator [Wikipedia]
      • RapidShare in 2007 had 3.5 Petabytes of hard-disk storage [Wikipedia]
      • The San Diego Supercomputer Center (SDSC) in the USA has a 1-Petabyte hard disk store and a 6-Petabyte robotic tape store [Wikipedia]
      • Microsoft stores on 900 servers a total of about 14 Petabytes. These are mostly imagery for Microsoft's digital model planet, Virtual Earth [Wikipedia]
      • 15 Petabytes of data will be generated each year in particle physics experiments using CERN’s Large Hadron Collider, due to be launched in May 2008 [Wikipedia]
      The total storage capacity needed for the above data is ~44 PB
      A failure rate reduction of 0.005% over the next 5 years is required to cover the above storage capacity needs
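The ~44 PB total on slide 22 is the sum of the listed archives, and the 0.005% figure is roughly that total as a share of projected shipments. The sketch below assumes the denominator is the "over 1M PB" for 2008-12 from slide 10; the slide does not say how its percentage was derived, so this is one plausible reading. The dictionary keys are shorthand labels, not names used in the deck.

```python
# Archive sizes in PB, as listed on the slide
storage_pb = {
    "American Memory": 0.008,
    "eBay": 2,
    "Jefferson Lab storage farm": 2,
    "RapidShare": 3.5,
    "SDSC hard disk store": 1,
    "SDSC robotic tape store": 6,
    "Microsoft Virtual Earth": 14,
    "CERN LHC (per year)": 15,
}

total_pb = sum(storage_pb.values())

# Assumption: capacity shipped in 2008-12, taken from slide 10 ("over 1M PB")
TOTAL_SHIPPED_PB = 1_000_000

# Share of shipped capacity that equals the listed archives, in percent
required_reduction_pct = 100 * total_pb / TOTAL_SHIPPED_PB

print(f"total: {total_pb:.3f} PB")
print(f"share of shipped capacity: {required_reduction_pct:.4f}%")
```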
  23. Future Scenario
      • Exponential growth of data over time (information avalanche)
      • Lower cost of data storage per GB
      • Many more disk drives required to accommodate all of the new data and backup
      • Continually increasing reliability of disk drives
      • Nevertheless, more total failures (in absolute terms) unless HDD reliability increases at a faster rate than the drive unit growth
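The last point on slide 23 is a statement about a product of two trends: absolute failures scale with units times failure rate. The sketch below illustrates it using the "more than 1.5X" unit growth from slide 8. The 2007 unit count and the 1% AFR baseline are hypothetical placeholders, and counting failures only against a single year's shipments is a deliberate simplification.

```python
def annual_failures(units, afr):
    """Expected failures in a year for `units` drives at annualized failure rate `afr`."""
    return units * afr

UNIT_GROWTH_2007_TO_2012 = 1.5   # "more than 1.5X" unit growth (slide 8)
units_2007 = 500_000_000         # hypothetical baseline
afr_2007 = 0.01                  # hypothetical baseline AFR (1%)

units_2012 = units_2007 * UNIT_GROWTH_2007_TO_2012
failures_2007 = annual_failures(units_2007, afr_2007)
failures_2012_same_afr = annual_failures(units_2012, afr_2007)

# AFR at which 2012 failures would only match the 2007 count
break_even_afr = afr_2007 / UNIT_GROWTH_2007_TO_2012

print(f"2007 failures: {failures_2007:,.0f}")
print(f"2012 failures at unchanged AFR: {failures_2012_same_afr:,.0f}")
print(f"AFR needed in 2012 to hold failures flat: {100 * break_even_afr:.2f}%")
```

In this reading, any AFR improvement smaller than the unit growth factor still leaves more failed drives in absolute terms.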
  24. Conclusions
      • Data storage capacity growth enables overall information growth
      • Reliability of data storage devices is a key element in this growth
      • Unreliability is extremely costly
      • Even small improvements in reliability will have huge impact on the amount of information preserved in the future
      • Tribology is, and will remain, a major enabler of the future information growth
      • Relative contribution of tribology to HDD unreliability is on the order of 25%
  25. References
      • “The Innovator’s Dilemma” by Clayton M. Christensen
      • Google: Failure Trends in a Large Disk Drive Population, E. Pinheiro, W.-D. Weber and L. André Barroso, FAST 2007
      • Wired: http://www.wired.com/science/discoveries/news/2002/10/55509
      • Wikipedia on Petabytes: http://en.wikipedia.org/wiki/Petabyte
      • Consumer Reports National Research Center, 2006 Product Reliability Survey: http://www.squaretrade.com/htm/pop/lm_failureRates.html
      • Proton rocket launcher: http://en.wikipedia.org/wiki/Proton_rocket
      • HDD specifications: www.seagate.com
      • Medical pacemaker reliability: http://www.medscape.com/viewarticle/536755
      • 2001 Cost of Downtime Survey Results: http://www.datadepositbox.com/media/data-loss-statistics.asp
      • BostonComputing.net: http://www.bostoncomputing.net/consultation/databackup/statistics
      • IDC: IDC analyst Fred Broussard, PC Backup and Higher Prioritization for the Enterprise and Consumer, July 2002
  26. Thank you!