This document discusses the history and future of data storage. It begins with early examples of data storage using materials like stone and paper. It then outlines the development of data storage technologies over time, from magnetic tapes and disks to modern solid state drives. The document notes that storage capacity and data growth are increasing exponentially, and projections estimate yottabytes of storage being produced by 2050. Reliability is also discussed, as are the tribological factors that impact hard disk drive reliability.
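The exponential-growth projection mentioned above can be sanity-checked with simple compound-growth arithmetic. The starting figure and doubling period below are illustrative assumptions of mine, not numbers taken from the document.

```python
def project_capacity(start_zettabytes, start_year, end_year, doubling_years):
    """Project storage produced per year under steady exponential growth.

    All inputs are illustrative assumptions; the point is only that a
    modest doubling period compounds into yottabyte-scale output.
    """
    periods = (end_year - start_year) / doubling_years
    return start_zettabytes * 2 ** periods

# Assumed: ~1 ZB produced around 2010, doubling roughly every 3 years.
# 1 yottabyte = 1000 zettabytes, so crossing 1000 ZB means reaching YB scale.
zb_in_2050 = project_capacity(1, 2010, 2050, 3)
```

Under these assumed parameters the 2050 figure comes out above 10,000 ZB, i.e. comfortably into the yottabyte range, which is consistent with the kind of projection the document describes.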
Galaxy has introduced a new graphics card, the GeForce GTX570 MDT x4, which features four digital video outputs capable of 1080p resolution. This allows the card to power up to four monitors simultaneously. The card is aimed at users who want increased desktop workspace for multitasking or gaming across multiple screens. It provides improved performance and efficiency over traditional single-monitor setups. Galaxy also released software to easily configure different display modes for gaming, productivity, and entertainment applications.
HD-DVD was a high-density optical disc format that was envisioned as the successor to DVD. It had a higher storage capacity than DVD, with single-layer discs holding 15GB of data and dual-layer discs holding 30GB. HD-DVD was developed primarily by Toshiba and aimed to store high-definition video. However, HD-DVD was discontinued in 2008 after losing the high-definition format war to the competing Blu-ray Disc format.
This Skip Doctor manual disc repair kit allows users to easily repair scratched CDs, DVDs, and game discs without messy pastes. The kit contains a resurfacing wheel that uniformly resurfaces the protective layer of discs while leaving the data unaffected. The kit includes a disc holder, resurfacing wheel, spray bottle with fluid, drying cloth, and buffing square stored inside the handle.
Spinning Brown Donuts: Why Storage Still Counts (Sparkhound Inc.)
Storage, next to server hardware, is pretty commoditized and probably the least exciting thing in your datacenter. However, not properly assessing your storage needs and requirements can be the difference between a great app and a resume-generating event. This session will cover topics such as: why you may not need all flash, why SAN is not just NAS spelled backwards, leveraging cloud storage, why RAID is not a sound backup solution, and cutting through the marketing to make sense of it all.
Research and technology explosion in scale-out storage (Jeff Spencer)
A view of the directions storage is taking in science & technology from Ryan Sayre, technical strategist in the office of the CTO for EMC Isilon, using examples from recent work in life science genomics and other industries taking advantage of the combination of extreme computing (HPC) and big data. As presented at the Bull sponsored Science & Innovation 2013 conference Westminster.
Digital Pragmatism with Business Intelligence, Big Data and Data Visualisation (Jen Stirrup)
Contact details:
Jen.Stirrup@datarelish.com
In a world where the HiPPO (Highest Paid Person's Opinion) is final, how can we use technology to drive the organisation towards data-driven decision making as part of its organizational DNA? R provides a range of functionality in machine learning, but we need to expose its richness in a way that makes it accessible to decision makers. Using Data Storytelling with R, we can imprint data in the culture of the organization by making it easily accessible to everyone, including decision makers. Together, the insights and process of machine learning are combined with data visualisation to help organisations derive value and insights from big and little data.
Big data is driving changes in leadership approaches. Leaders must envision how to make their organizations more customer-centric by using data and innovation networks. They also need to enable big data operations by establishing the right technology and organizational structures. Further, leaders must empower all employees to take action based on data insights. Finally, leaders must energize their organizations to become big data driven by constantly advocating changes and using data to reward performance.
Prof William Kosar: Letters of Credit as a Payment Method (William Kosar)
This is the second lesson from a five-day course on Letters of Credit (in English and Arabic) taught to Iraqi private commercial bankers both at the Banking and Finance Academy in Erbil and at the Banking Studies Center of the Central Bank of Iraq in Baghdad.
The document discusses the benefits of exercise for mental health. Regular physical activity can help reduce anxiety and depression and improve mood and cognitive functioning. Exercise causes chemical changes in the brain that may help protect against mental illness and improve symptoms.
Our talk at Social Media Summit, 28 April 2009.
Case study - Mobile Jam Fest
How To Use Social Media To Promote Creativity, Make Positive Social Changes, And Connect People
Ping pong is a sport that is easy to learn but difficult to master. It involves using forehand and backhand techniques with accuracy, angle, spin and speed to return balls in singles or doubles matches. Players must follow safety and etiquette rules, serve low and with control and spin, and aim to win games to 11 by a margin of 2 points.
This presentation was used in May 2011 at RIMS Conference, Mining and Metal Session.
Recent worldwide events have painfully shown many industries that natural hazards can impair ingress/egress capabilities in any business area. Mining, passenger transport, automotive and even electronics companies have suffered major setbacks from climate, seismic, fire, tornado and other hazards.
The presentation shows that careful risk and crisis evaluation, management and attentive mitigation can lead to higher survival rates and even a competitive edge over less-prepared companies.
Student responders allow teachers to collect formative and summative assessment data in real-time through multiple choice, true/false, and other response types. The responders provide instant feedback to teachers on student understanding, which enables teachers to adjust instruction as needed. The assessment data collected through responders gives teachers valuable insights into student mastery and areas needing reteaching or review. When paired with learning goals, effective feedback from responders ranks as one of the most important factors for increasing student achievement.
The document discusses various Earth Day and sustainability activities at McDaniel College such as planting trees and flowers to beautify campus, testing individual carbon footprints, and efforts to reduce the school's carbon footprint through switching to geothermal heating and using energy efficient lighting. It also mentions participating in Recyclemania to improve recycling rates and provides citations for recycling statistics and program details.
Accurate biochemical knowledge starting with precise structure-based criteria... (Michel Dumontier)
Biochemical ontologies aim to capture and represent biochemical entities and the relations that exist between them in an accurate and precise manner. A fundamental starting point is the use of identifiers that precisely and uniquely identify some biochemical entity, whether it be a substance, a quality or some biological process. Yet, our current approach to generating such identifiers is often haphazard and incomplete. This prevents us from accurately integrating knowledge and also leads to under-specification of our knowledge. This talk aims to initiate a discussion on plausible structure-based strategies for biochemical identity, ultimately to generate identifiers in an automatic and curator/database-independent fashion, whether at the level of a whole molecule or some part thereof (e.g. residues, collections of residues, atoms, collections of atoms, functional groups). With structure-based identifiers in hand, we will be in a position to accurately capture specific biochemical knowledge, such as how a set of residues in a binding site is involved in a chemical reaction, including the fact that a key nitrogen atom must first be de-protonated. Thus, this will enhance our current representation of biochemical knowledge and make it fundamentally more useful.
Combining Quantitative & Qualitative Data in a Single Large scale User Resear... (UserZoom)
This document summarizes a large-scale user research study conducted by Measuring Usability that compared the usability of several major airline websites. Both quantitative and qualitative data were collected, including surveys of over 1,000 airline customers, task-based testing with 130 participants, and analysis of think-aloud transcripts. The results showed Southwest had the highest usability and satisfaction scores while United had the lowest. Issues with flight selection, calendar interfaces, and difficulty changing reservations were identified for United and American. The study provided insights into key drivers of the user experience for each airline website.
The document describes an experiment that measured the temperature change of water in a calorimeter over 90 minutes when wrapped in towels containing different amounts of absorbed water. The experiment used calorimeters wrapped in towels soaked in 50ml, 100ml, 150ml and 200ml of water, with controls that were not wrapped. Graphs of the temperature change over time were produced for each condition and compared to analyze how the amount of absorbed water in the towel affected temperature change.
An updated Twitter 101 deck. Focus on "5 considerations before companies dive into Twitterville" and "Advantage for companies on the Social Media fence"
This presentation provides a pattern for scaling scrum teams to programs, offers guidance for kicking off larger programs and dealing with program stakeholders, and explores scaling alternatives.
This document summarizes the history and types of CDs and DVDs. It discusses how they are manufactured by creating a glass master that is used to injection mold the discs out of polycarbonate. It describes the differences between CDs and DVDs in terms of size, data storage capacity, and technology. CDs can store 700MB of data while DVDs can store up to 17GB. The document also compares high definition formats Blu-ray and HD-DVD in terms of storage capacity, with Blu-ray having higher capacity of up to 200GB compared to HD-DVD's 60GB maximum.
Hard disk & Optical disk (college group project) (Vshal_Rai)
- Hard disk drives (HDDs) are devices used for digital data storage. They consist of rapidly rotating discs coated with magnetic material. Magnetic heads write data to and read data from the disc surfaces.
- HDDs were first introduced in 1956 and have since decreased dramatically in size and cost, becoming standard in personal computers by the late 1980s. Capacities have also increased greatly, with modern HDDs capable of storing terabytes of data.
- Optical discs like CDs and DVDs store data in the form of pits and lands on a reflective surface. They were invented in the late 1950s and early 1960s and are now commonly used to store music, video, and computer programs and data.
1. Various data storage devices were developed over time, including magnetic tapes in 1951, hard disks in 1956, floppy disks starting in 1969, CDs in 1988, DVDs in 1995, USB flash drives in 2000, and Blu-ray discs in 2005.
2. Storage capacities of these devices have increased significantly over the decades, from kilobytes in the 1950s-1980s to megabytes in the 1990s to gigabytes and terabytes starting in the 2000s.
3. Common units used to describe digital storage capacities are bytes, kilobytes, megabytes, gigabytes, and terabytes, which in the binary convention are successive powers of 1024 rather than powers of 1000, reflecting the binary nature of digital information.
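The binary-unit convention in the list above can be sketched in a few lines; the helper name and output formatting here are my own.

```python
# Binary storage units: each step up is a factor of 1024 (2**10),
# unlike decimal SI units, which step by 1000.
UNITS = ["bytes", "KB", "MB", "GB", "TB"]

def to_human(n_bytes):
    """Convert a raw byte count to the largest binary unit with value >= 1."""
    value = float(n_bytes)
    for unit in UNITS:
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1024  # move up one binary unit
```

For example, `to_human(734003200)` reports the classic CD capacity as "700 MB", since 700 * 1024 * 1024 = 734,003,200 bytes.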
Shubham Rai's document summarizes the history, manufacturing process, types, differences, data limits, and competing formats of CDs and DVDs. It notes that CDs were first publicly demonstrated in 1976 and held 700MB of data, while DVDs introduced in the 1990s could hold 4.7GB through smaller pit sizes and tighter spacing. The document also compares high definition formats Blu-ray and HD-DVD, with Blu-ray having a higher storage capacity of up to 200GB through its smaller pit size and additional layers.
Cd &-dvd-by-aaron-rinaca-mike-ferris-mike-burker-steve-mathieu-2001-spr (Varun Kumar)
CD and DVD technology was summarized in 3 sentences:
CDs were introduced in the 1980s and could store up to 700MB of data, while DVDs, introduced in 1996, have a much higher storage capacity of 4.7-17GB thanks to a shorter-wavelength laser, smaller pit sizes, and support for multiple layers. DVDs surpassed VHS and are expected to become the leading video format; CDs will still be used for audio, but higher-end applications will move to DVD's larger storage capacity.
Blu-ray Disc (also known as BD or Blu-Ray) is an optical disc storage format designed to supersede the contemporary standard DVD format. It is a high-definition audio/video medium.
This document discusses Compact Disc Read Only Memory (CD-ROM). It provides details about:
1. CD-ROMs hold large amounts of data like text, graphics, and audio in a digital format on plastic discs. They can store up to 680 megabytes of data.
2. CD-ROMs use pits arranged in a spiral track to store data, read with a low-power laser. Unlike hard disks, CD-ROMs rotate at a constant linear velocity for reading.
3. CD-ROMs provide random access to data, allow quick retrieval of related items, and can maintain data integrity for over 100 years, making them a secure long-term storage medium.
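The constant linear velocity (CLV) point above has a simple geometric consequence: to keep the pit track passing the laser at a fixed speed, the spindle must spin faster near the hub than near the rim. The radii and the 1.2 m/s figure below are standard 1x CD parameters, but the helper itself is my own illustration.

```python
import math

def rpm_for_radius(linear_speed_m_s, radius_m):
    """Spindle RPM needed to keep a fixed linear track speed at a given radius.

    Circumference at the radius is 2*pi*r, so revolutions per second is
    v / (2*pi*r); multiply by 60 for RPM.
    """
    return linear_speed_m_s / (2 * math.pi * radius_m) * 60

inner_rpm = rpm_for_radius(1.2, 0.025)  # near the ~25 mm inner program radius
outer_rpm = rpm_for_radius(1.2, 0.058)  # near the ~58 mm outer radius
```

This reproduces the familiar behavior of 1x CD drives slowing from roughly 460 RPM at the inner edge to roughly 200 RPM at the outer edge, unlike hard disks, which spin at constant angular velocity.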
This presentation explains what a CD (Compact Disc) and a DVD are, how data is stored on them, how they are manufactured, and the differences between them.
The document discusses holographic versatile discs (HVDs), a type of optical disc storage technology. HVDs use holography to store data in a photopolymer layer, allowing storage capacities over 3.9 terabytes. Data is written to the disc using a green laser beam and read using interference between a reference beam and the stored holograms. While offering vastly higher storage than technologies like CDs, DVDs, and Blu-ray discs, HVDs also have drawbacks like complex optical systems and high production costs that have prevented widespread adoption.
This document discusses various types of auxiliary storage devices used in information technology. It describes magnetic tape formats like half inch, quarter inch, 8mm, and 4mm tapes. Hard disks are described as magnetic disks that hold more data faster than floppy disks, with storage capacities from 10MB to several GB. Floppy disk sizes and storage capacities are provided. Other auxiliary storage devices covered include Zip disks, Jaz disks, SuperDisks, CDs, DVDs, and USB flash drives. Their storage capacities and uses are summarized.
This document summarizes the evolution and technology of Blu-ray discs. It discusses how Blu-ray discs offer higher storage capacity than DVDs through the use of blue laser technology and allow for HD recording and playback. The document traces the history of Blu-ray and the Blu-ray Disc Association alliance. It also compares the disc characteristics, storage capacities, and features of CDs, DVDs, and Blu-ray discs.
Floppy disks, tapes, zip disks, and hard disks are examples of different storage media that vary in size, storage capacity, accessibility, and portability. Floppy disks have low storage capacity and portability while hard disks have high storage capacity but low portability. Storage media can be optical like CDs, DVDs, or magnetic like floppy disks, hard drives, zip disks, and tapes. Magnetic media uses magnetism to store data and can be accessed directly or sequentially, while optical media uses lasers to read binary data encoded as pits on the storage surface. Flash memory is a type of solid-state circuit media that can be electrically erased and rewritten in blocks.
The document discusses different types of secondary storage technologies. It describes floppy disks, hard disks, optical disks like CDs and DVDs, and magnetic tape. It provides details on the characteristics and uses of each technology. Floppy disks are portable but lower capacity, while hard disks provide faster access and greater capacity but are generally not portable. Optical disks can store large amounts of data but are read-only or rewriteable. Magnetic tape is best for backups due to its low cost and high capacity. Future technologies may provide even higher storage capacities.
DVD is an optical disc storage format that was introduced in 1995 and offers higher storage capacity than CDs. DVDs use a shorter wavelength laser to place pits more densely, allowing more data to be stored in the same physical space as a CD. DVDs can have multiple layers, with double-sided dual-layer discs providing up to four layers, greatly increasing potential storage capacity over single-layer discs.
The document summarizes the history and evolution of removable storage technologies from paper tape in the 1700s to modern USB thumb drives. It describes early technologies like punched cards, cassette tapes, floppy disks of various sizes, portable hard drives, Zip drives, and optical discs. Key points included the increasing storage capacities over time, from kilobytes to megabytes to gigabytes, as well as the advantages and disadvantages of each technology as newer options emerged.
Dot matrix printers use a pattern of tiny ink dots to print images in a lower quality than other printer types but are less expensive. Laser printers use a laser beam and toner cartridge to print higher quality images. Inkjet printers use an ink cartridge and printing element to produce an even finer printed image. A hard disk drive is a non-volatile storage device that stores data on rapidly rotating magnetic platters and is faster than an external disk drive.
The document discusses M-Disc, a new type of optical disc created from a material similar to rock that is much more durable than traditional discs like DVDs and Blu-rays. M-Discs are designed to last over 1,000 years and are not easily erased like other discs. They work on current DVD and Blu-ray drives but require a more powerful laser for writing. The material makes the discs highly resistant to scratches and fingerprints, and data cannot be erased from the solid recording layer. M-Discs cost around $3 each and currently have storage capacities similar to DVDs, though these may increase to Blu-ray levels.
This document discusses four main types of external memory: magnetic disks, optical disks, magnetic tape, and disk drives. Magnetic disks store data on circular platters coated with magnetic material and use read/write heads to access data. Optical disks like CDs, DVDs, and Blu-rays use lasers to read and write digital information to coated discs. Magnetic tape stores data on magnetic tape housed in cartridges and uses similar reading and writing techniques as disks. Disk drives, like hard disk drives, store large amounts of data on spinning magnetic platters that are read and written to by heads positioned very close to the disk surface.
The document discusses M-Disc, a new type of optical disc material that is much more durable than traditional discs like DVDs and Blu-rays. M-Disc uses a material with properties "similar to rock" rather than polycarbonate, making the discs highly resistant to scratches, fingerprints, heat, and UV light damage. The discs are expected to maintain data integrity for 1000 years, far surpassing normal discs. M-Discs can still be read by existing DVD and Blu-ray drives due to backward compatibility, but require a more powerful laser for writing. The discs cost around $3 each and have a storage capacity similar to DVDs currently, with plans to increase capacity to match Blu-rays.
Similar to Future Information Growth And Storage Device Reliability 2007
Hyper-Converged Infrastructure: Big Data and IoT opportunities and challenges... (Andrei Khurshudov)
The document discusses emerging technologies related to the Fourth Industrial Revolution including the Internet of Things (IoT), big data, artificial intelligence, and how they are fundamentally changing information technology. It notes that these technologies are creating massive amounts of data, especially unstructured data from machines. Realizing their full potential will require new approaches to data storage, processing, analytics and decision making delivered through solutions like cloud computing, hyper-converged infrastructure, and edge/fog computing. The integration of all these technologies promises to deliver improved productivity, living standards and actionable insights.
Short introduction to Big Data Analytics, the Internet of Things, and their s...Andrei Khurshudov
Invited talk at the 26th ASME annual conference on information and storage and processing systems (ISPS 2017) held at Hilton San Francisco District, San Francisco, California, USA from August 29–30, 2017.
The document provides tips for building a successful big data analytics organization. It outlines six rules:
1. Only hire PhDs to leverage their rigorous training in research skills. The small extra cost is worth it.
2. Candidates need experience in math, modeling or simulation to ensure strong programming skills, even if not in key languages like R and Python.
3. The most important trait is a demonstrated ability to learn quickly, as the field evolves rapidly. Talented individuals can learn new skills.
4. Still hire some highly experienced experts in areas like machine learning to guide the team, but only 10-20% need this level of experience.
5. Require all candidates to do
Health monitoring & predictive analytics to lower the TCO in a datacenterAndrei Khurshudov
The document summarizes a workshop presentation by Seagate on their Cloud Gazer technology for drive health monitoring, predictive analytics, and automation in data centers. The technology aims to lower total cost of ownership by monitoring drive health, predicting failures, automating diagnostics and management, and extending drive lifespan. It utilizes drive data aggregation, analytics, failure predictions using machine learning, and closed-loop automation. Use cases highlighted include failure detection weeks in advance, workload optimization, and finding failure triggers through root cause analysis. The technology provides a drive-centric management tool to help data centers optimize systems, reduce costs, and improve efficiency.
Seagate uses big data analytics to improve the quality and reliability of its hard disk drives. It collects data from every stage of the manufacturing process, involving thousands of variables for each drive. Analyzing this large data requires new tools beyond traditional analytics. Seagate builds infrastructure to store and access this manufacturing data, hires data scientists to apply machine learning techniques to understand complex relationships in the data. This approach has already led to dramatic quality improvements, ensuring data can be reliably stored and preserved for the future.
Seagate relies heavily on big data analytics to ensure high quality in data storage. As data storage needs grow exponentially, predictive analytics are crucial to avoid costly failures. Seagate collects terabytes of manufacturing, testing, component, and field data daily. This data is analyzed using machine learning algorithms to predict and prevent drive failures, helping ensure the reliability of over 1 billion drives expected in cloud datacenters by 2020. Seagate's big data analytics infrastructure combines comprehensive data collection, large-scale analytics capabilities, and data-driven decision making to advance quality control in high-volume data storage manufacturing.
The document describes the Seagate Hadoop Workflow Accelerator, which enables organizations to optimize Hadoop workflows and centralize data storage. It accelerates Hadoop applications by leveraging ClusterStor's high-performance Lustre parallel file system and bypassing the HDFS software layer. This provides improved Hadoop performance, flexibility to scale compute and storage independently, and reduced total cost of ownership.
This document discusses the history and challenges of long-term data storage. It describes how storage media and conditions significantly impact longevity, with some analog images surviving over 17,000 years sealed in dry caves but other works deteriorating within 200 years without special efforts. Modern applications now require digital data storage for 5-25 years, with demands expected to increase. Technologies have evolved from tape to disk drives to improve capacity and meet long-term reliability needs.
Andrei Khurshudov gave a presentation on solid state drives (SSDs) at the Symposium on Magnetic Storage Tribology and Reliability in Miami, Florida on October 20, 2008. In the presentation, he discussed SSD technology trends, challenges relating to reliability over the life of SSDs, and the need for standardization of SSD reliability testing methods. He noted that while SSDs offer benefits over hard disk drives like improved performance and lower power consumption, challenges remain regarding cost, reliability over the lifetime of the product, and write performance.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
Climate impact / sustainability of software testing discussed on the talk. ICT and testing must carry their part of global responsibility to help with the climat warming. We can minimize the carbon footprint but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be added with sustainability, and then measured continuously. Test environments can be used less, and in smaller scale and on demand. Test techniques can be used in optimizing or minimizing number of tests. Test automation can be used to speed up testing.
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIVladimir Iglovikov, Ph.D.
Presented by Vladimir Iglovikov:
- https://www.linkedin.com/in/iglovikov/
- https://x.com/viglovikov
- https://www.instagram.com/ternaus/
This presentation delves into the journey of Albumentations.ai, a highly successful open-source library for data augmentation.
Created out of a necessity for superior performance in Kaggle competitions, Albumentations has grown to become a widely used tool among data scientists and machine learning practitioners.
This case study covers various aspects, including:
People: The contributors and community that have supported Albumentations.
Metrics: The success indicators such as downloads, daily active users, GitHub stars, and financial contributions.
Challenges: The hurdles in monetizing open-source projects and measuring user engagement.
Development Practices: Best practices for creating, maintaining, and scaling open-source libraries, including code hygiene, CI/CD, and fast iteration.
Community Building: Strategies for making adoption easy, iterating quickly, and fostering a vibrant, engaged community.
Marketing: Both online and offline marketing tactics, focusing on real, impactful interactions and collaborations.
Mental Health: Maintaining balance and not feeling pressured by user demands.
Key insights include the importance of automation, making the adoption process seamless, and leveraging offline interactions for marketing. The presentation also emphasizes the need for continuous small improvements and building a friendly, inclusive community that contributes to the project's growth.
Vladimir Iglovikov brings his extensive experience as a Kaggle Grandmaster, ex-Staff ML Engineer at Lyft, sharing valuable lessons and practical advice for anyone looking to enhance the adoption of their open-source projects.
Explore more about Albumentations and join the community at:
GitHub: https://github.com/albumentations-team/albumentations
Website: https://albumentations.ai/
LinkedIn: https://www.linkedin.com/company/100504475
Twitter: https://x.com/albumentations
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slackshyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
20 Comprehensive Checklist of Designing and Developing a WebsitePixlogix Infotech
Dive into the world of Website Designing and Developing with Pixlogix! Looking to create a stunning online presence? Look no further! Our comprehensive checklist covers everything you need to know to craft a website that stands out. From user-friendly design to seamless functionality, we've got you covered. Don't miss out on this invaluable resource! Check out our checklist now at Pixlogix and start your journey towards a captivating online presence today.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications.He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfMalak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
2. The History of Data Storage
• Storage media: charcoal and dirt on stone
• Data type: analog (image)
• Storage life: >17,000 years (in a sealed dry cave)
‘Diamond Sutra’, AD 868 (the world’s earliest complete survival of a dated printed book)
• Storage media: ink on paper
• Data type: analog (images, characters)
• Storage life: >1,100 years (sealed in a cave)
Andrei Khurshudov, 2007
3. The History of Computer Data Storage
[Timeline figure, 1940-2020, running from “Need more storage!” to “Don’t know how to sell more storage…”]
• Sequential access to data: punched tape, punch cards, magnetic tape, compact cassette
• Direct access to data: magnetic drum, RAMAC hard disk drive (1956), 3340 Winchester, the floppy, 5.25”, 3.5”, 2.5”, and 1.8” drives, 1.8” perpendicular-recording drive (2005)
• Optical and emerging media: CD-ROM, CD/DVD, Blu-ray/HD DVD, holographic disk, SSD, hybrid drives
• Other milestone years marked: 1962, 1980, 1983, 1988, 1991
“Do not fold, spindle or mutilate”
4. The First HDD is Born
• Stands for “Random Access Method of Accounting and Control”
• Born: 1956
• Capacity: 5 MB
• Disk diameter: 24”
• Recording surfaces: 100
• Tracks/surface: 100
• RPM: 1200
• Weight: >1 ton
• Cost: leased for $3,200 per month
“While the storage capacity of the drive could have been increased above five megabytes, the marketing department at IBM was against a larger capacity drive because they didn’t know how to sell a product with more storage” (source: Currie Munce, VP, IBM Research)
5. Modern Disk Drive
• About 50 years old: runs faster with every year
• Mass-produced electro-mechanical device: 2006 total industry output >400M drives
• Utilizes principles of magnetic recording: most recent products utilize PMR
• Relies on a flying magnetic element: typical mechanical separation ~5-10 nm
• Available in several standard form factors: 1”, 1.8”, 2.5”, 3.5”
• Designed for several distinct markets: Desktop, Enterprise, Mobile, HH, CE
• Uses various computer interfaces: PATA, SATA, SAS, SCSI, FC-AL
• Historically high data density growth rate: CAGR of 30% to 50% over the last decades
• Experiences constant cost pressure: cost per GB is under $0.50 and falling
• Always under attack from disruptive technologies: destroys or assimilates competition for 50 years
• Continually expands into new markets: most recent are CE, automotive, archival
• Highly competitive industry: Darwinian principles in accelerated action*
• Industry share leader: Seagate, with ~40% of the total market share
* “The Innovator’s Dilemma” by Clayton M. Christensen
6. Disk Drive Industry Trends
[Charts: areal density and form-factor trends, down to the 0.85” drive]
Sources: PC World, “The Hard Drive Turns 50”; Coughlin Associates; Bear Stearns Technology Conference, 2006; Ed Grochowski, IBM
Drives get denser, smaller, faster, and cheaper
Reliability becomes increasingly difficult
7. Yesterday, Today, and Tomorrow
[Figure: storage devices of yesterday, today, and tomorrow]
There’s plenty of room at the bottom!
8. Estimated Number of Units Shipped
[Bar chart: estimated HDD units shipped (axis labeled “Units, Millions”, 0 to 900,000), CY2000-CY2012; source: Seagate Market Research]
• Rapid overall HDD unit growth will continue into the foreseeable future
• More than a 1.5X increase in units shipped in 2012 compared to 2007
9. Strong Link Between Information Growth and Storage Produced
[Diagram: New Data ↔ New Storage]
New data: Internet, blogs, movies, TV, music, maps, databases, archives, business, legal, science, diaries, art, gaming, literature, noise, etc.
Balance is required!
Data storage technology underpins information growth
10. Estimated Total PBs of Capacity Shipped
[Chart: total PBs shipped with projection, 2002-2012; exponential fit y = 7872.3·e^(0.3679x), R² = 0.9883; source: Seagate Market Research]
• The information growth trend is indeed exponential!
• Overall information growth will scale with the HDD capacity growth
• It is estimated that over 90% of all new information produced in the world is being stored on magnetic media, most of it on hard disk drives (Google)
• Shipped capacity doubles every 30 months
• Over 1M PB of storage will be produced between 2008 and 2012
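The fitted curve above makes the cumulative claim easy to check; a minimal Python sketch (assuming, since the chart axis is not fully legible, that x counts years since 2001 so the curve spans the plotted 2002-2012 range):

```python
import math

def pb_shipped(year: int) -> float:
    """Total PBs shipped in a given year, from the chart's trend line
    y = 7872.3 * e^(0.3679 x); x = years since 2001 is an assumption
    chosen so the curve matches the plotted 2002-2012 range."""
    return 7872.3 * math.exp(0.3679 * (year - 2001))

# Cumulative capacity shipped over 2008-2012
total = sum(pb_shipped(y) for y in range(2008, 2013))
print(f"2008-2012 total: {total:,.0f} PB")  # roughly 1.2 million PB
```

With that reading, the fit gives roughly 450,000 PB for 2012 and about 1.2M PB cumulatively over 2008-2012, consistent with the “over 1M PB” figure on the slide.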
11. Long-term Storage Growth Projection
[Chart: total PBs shipped on a log scale, 2000-2050, projected to climb from petabytes through exabytes and zettabytes to yottabytes and beyond (“Alotabyte?!!!”)]
Exponential growth in storage capacity will enable the information avalanche!
12. Definitions of Reliability
Reliability is the probability of performing required functions for a specified time under the stated operational conditions.
For HDD:
• Required functions include storing and accessing data at the specified high data rate and with specified power consumption, acoustic noise, start-up time, etc.
• Specified time is the service life, which is typically 3 to 10 years.
• Stated operational conditions are those given by the HDD specification (temperature, humidity, shock, vibration, etc.)
Weibull reliability model:
• Describes the “weakest link” in a product
• Treats the system as a series of components (motor, PCBA, HDI, code, etc.), each having finite reliability: R1, R2, … Rn
• The HDD fails if any one component fails!
• R = R1 × R2 × R3 × … × Rn
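The series (“weakest link”) model lends itself to a few lines of code; the component reliabilities below are illustrative numbers, not measured HDD data:

```python
import math
from functools import reduce

def series_reliability(component_rs):
    # R_system = R1 * R2 * ... * Rn: the drive works only if every part works
    return reduce(lambda a, b: a * b, component_rs, 1.0)

# Hypothetical per-component reliabilities over the service life
components = {
    "motor": 0.995,
    "PCBA": 0.998,
    "head-disk interface": 0.990,
    "code": 0.999,
}
r = series_reliability(components.values())
print(f"system reliability: {r:.4f}")  # ~0.982, below the weakest component

def weibull_survival(t, eta, beta):
    """Weibull survival function R(t) = exp(-(t/eta)**beta);
    eta = characteristic life, beta = shape parameter."""
    return math.exp(-((t / eta) ** beta))
```

Note how the product is always lower than the worst single component, which is why many individually very reliable parts can still yield a noticeable drive failure rate.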
13. HDD Reliability Trends
Manufacturer’s HDD MTBF specifications (from: Ed Grochowski, IBM)
• MTBF indicates, on average, how many hours a product is expected to operate before failure.
• MTBF = Total Product Operational Time / Number of Failures
The Ultimate Battle:
• Reliability vs. storage density
• Reliability vs. cost
• Reliability vs. performance
• Reliability vs. development time
• Reliability vs. environment
• …
Current typical MTBF numbers (by product class):
• Server: 1,400,000 hours
• Desktop: 700,000 hours
• Mobile: 400,000 hours
Reliability keeps increasing with time in spite of design complexities and more stringent qualification test requirements
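MTBF specs are easier to compare when converted to an annualized failure rate; a small sketch using the common approximation AFR ≈ (power-on hours per year) / MTBF, assuming a 100% duty cycle:

```python
HOURS_PER_YEAR = 24 * 365  # 8760 h, assuming the drive is always powered on

def afr_percent(mtbf_hours: float) -> float:
    """Approximate annualized failure rate (%) implied by an MTBF spec."""
    return 100 * HOURS_PER_YEAR / mtbf_hours

for product, mtbf in [("Server", 1_400_000),
                      ("Desktop", 700_000),
                      ("Mobile", 400_000)]:
    print(f"{product}: MTBF {mtbf:,} h -> AFR ~{afr_percent(mtbf):.2f}%/yr")
```

So the MTBF figures above correspond to roughly 0.6% (server), 1.3% (desktop), and 2.2% (mobile) of drives failing per year of continuous operation.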
14. HDD Reliability Hierarchy
(Level of involvement, and what each level deals with)
• Customer perception of reliability: limited statistics
• Reliability in user environment: closing the gap between expected reliability and reality
• Manufacturing for reliability: the last line of defense; balancing quality against cost
• Product reliability qualification: advanced test techniques and failure modes analysis
• Design for reliability: engineering and technology principles
• Reliability physics & theory: fundamental laws of nature
HDD reliability is built upon Tribology!
15. A Perspective on HDD Reliability
[Chart: cumulative failure/repair/return rates (after 3-4 years), 0-50%, for laptop and desktop computers, refrigerators (various configurations), riding and self-propelled mowers, lawn tractors, washing machines (front- and top-loading), dishwashers, vacuum cleaners, gas and electric ranges, ovens and cooktops, clothes dryers, microwave ovens, camcorders, digital cameras, TVs, the Proton rocket, HDDs, medical pacemakers, and the Sony PS3 (with HDD)]
• When compared to many other products, HDD reliability looks very high
• The average 3-4 year cumulative repair rate for CE products is 15%
• The HDD is a component, not a product
Sources: Consumer Reports National Research Center, 2006 Product Reliability Survey; http://en.wikipedia.org/wiki/Proton_rocket; www.seagate.com; http://www.medscape.com/viewarticle/536755
16. The Actual Cost of Unreliability
If a company experiences a major loss of data:
• 60% of companies that lose their data will shut down within 6 months of the disaster (source: Bostoncomputing.net)
• 72% of businesses that suffer major data loss disappear within 24 months (source: Realty Times)
• 93% of companies that lost their data center for 10 days or more due to a disaster filed for bankruptcy within one year of the disaster (source: Bostoncomputing.net)
Recreating data from scratch is estimated to cost between $2,000 and $8,000 per MB (source: Realty Times)
Of those companies participating in the 2001 Cost of Downtime Survey (source: 2001 Cost of Downtime Survey Results):
• 8% said it would cost their companies more than $1 million per hour
• 18% said each hour would cost between $251K and $1 million
• 28% said each hour would cost between $51K and $250K
• 46% said each hour of downtime would cost their companies up to $50K
17. Aggravating Aspects of Data Loss
• 40% of small and medium-sized businesses do not back up their data (source: Realty Times)
• 40-50% of all backups are not fully recoverable (source: Realty Times)
• 34% of companies fail to test their tape backups, and of those that do, 77% have found tape back-up failures (source: Bostoncomputing.net)
• “More than 109,000 TBs of unique enterprise PC data are not being regularly backed up” (IDC)
A national Harris Interactive survey reveals (source: Realty Times):
• Only 25% of users frequently back up digital files, even when 85% of computer users say they are very concerned about losing important digital data
• 37% of the survey’s respondents admitted to backing up their files less than once per month
• 9% admitted they have never backed up their files
• More than 22% said backing up information is on their to-do list, but they seldom do it
18. What Do Drives Fail For?
[Pareto chart of generic HDD failure modes: write abort, high-fly write, NTF, CND, scratch, TA, head degradation, grown defect, motor, mishandling, handling damage, PCB, etc. Annotations: NTF/CND up to 40%, system-dependent; mishandling/handling damage up to 30%, system-, personnel-, and procedure-dependent]
Observation: Tribology is responsible for many failure modes!
19. Tribology Inside the HDD
Tribological interfaces include:
• Head-disk interface and ramp (friction and wear)
• FDB motor
• Pivot bearing
• Connectors
• Screws (wear and torque retention)
There are multiple ways in which tribology impacts HDD reliability
20. The Role of Tribology in HDD Reliability
It is estimated that 15% to 35% of all HDD failures are linked to tribology (25% on average)
Improving tribological robustness enhances overall disk drive reliability
Major known failure modes related to tribological issues:
• Scratch (on both head and media; with or without particles)
• Thermal erasure (disk) and head degradation
• New defects
• Weak write / read
• Crash
• Failure of some other moving parts
• Etc.
21. Future Improvement Opportunities
HDD reliability:
• Number of drives that will not fail between 2008 and 2012, per every 0.1% AFR improvement: ~3,000,000
• Amount of stored information that will not be lost/impacted between 2008 and 2012, per every 0.1% AFR improvement: ~1,000,000 TB (or 1 EB)
Tribology:
• Number of drives that will not fail between 2008 and 2012 due to tribological problems, per every 0.1% AFR improvement: ~750,000
• Amount of stored information that will not be lost/impacted between 2008 and 2012 due to tribological problems, per every 0.1% AFR improvement: ~250,000 TB = 250 PB
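The arithmetic behind these counts is straightforward; a sketch assuming roughly 600M drives shipped per year over 2008-2012 (an assumed average consistent with the unit projections in slide 8) and the ~25% tribology share from slide 20:

```python
drives_per_year = 600e6   # assumed rough 2008-2012 average unit shipments
years = 5
afr_improvement = 0.001   # a 0.1% absolute reduction in annual failure rate

# Drives that survive thanks to the improvement
drives_saved = drives_per_year * years * afr_improvement
print(f"drives that will not fail: ~{drives_saved:,.0f}")   # ~3,000,000

tribology_share = 0.25    # ~25% of failures are tribology-linked (slide 20)
tribo_saved = drives_saved * tribology_share
print(f"of which tribology-related: ~{tribo_saved:,.0f}")   # ~750,000
```

The ~1 EB of preserved data then follows from an assumed average drive capacity of a few hundred GB in that period.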
22. Is This Worth the Effort?
Petabytes in use:
• The “American Memory” project is one of the largest digitized archives of U.S. history, with more than 7.5 million digital records from 100 collections of manuscripts, books, maps, films, sound recordings and photographs. The total size of the project is 0.008 Petabytes [Wired]
• As of November 2006, eBay had 2 Petabytes of data [Wikipedia]
• Jefferson National Accelerator Facility has a 2 Petabyte storage farm used to collect data from experiments on the particle accelerator [Wikipedia]
• RapidShare in 2007 had 3.5 Petabytes of hard-disk storage [Wikipedia]
• The San Diego Supercomputer Center (SDSC) in the USA has a 1-Petabyte hard disk store and a 6-Petabyte robotic tape store [Wikipedia]
• Microsoft stores about 14 Petabytes in total on 900 servers, mostly imagery for Microsoft’s digital model planet, Virtual Earth [Wikipedia]
• 15 Petabytes of data will be generated each year in particle physics experiments using CERN’s Large Hadron Collider, due to be launched in May 2008 [Wikipedia]
The total storage capacity needed for the above data is ~44 PB
A failure rate reduction of 0.005% over the next 5 years is required to cover the above storage capacity needs
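A quick sketch confirms both the ~44 PB total and the required AFR reduction, using the ~1 EB preserved per 0.1% AFR improvement from slide 21:

```python
pb_examples = {
    "American Memory": 0.008,
    "eBay (2006)": 2,
    "Jefferson Lab": 2,
    "RapidShare (2007)": 3.5,
    "SDSC disk": 1,
    "SDSC tape": 6,
    "Microsoft Virtual Earth": 14,
    "CERN LHC (per year)": 15,
}
total_pb = sum(pb_examples.values())
print(f"total: ~{total_pb:.0f} PB")  # ~44 PB

# Slide 21: a 0.1% AFR improvement preserves ~1 EB (1,000,000 TB) in 2008-12,
# so covering ~44 PB requires an AFR reduction of roughly:
needed = 0.1 * (total_pb * 1000) / 1_000_000
print(f"required AFR reduction: ~{needed:.4f}%")  # ~0.004%, in line with the slide's ~0.005%
```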
23. Future Scenario
• Exponential growth of data over time (information avalanche)
• Lower cost of data storage per GB
• Many more disk drives required to accommodate all of the new data and backup
• Continually increasing reliability of disk drives
• Nevertheless, more total failures (in absolute terms) unless HDD reliability increases at a faster rate than drive unit growth
24. Conclusions
• Data storage capacity growth enables overall information growth
• Reliability of data storage devices is a key element in this growth
• Unreliability is extremely costly
• Even small improvements in reliability will have a huge impact on the amount of information preserved in the future
• Tribology is, and will remain, a major enabler of future information growth; its relative contribution to HDD unreliability is on the order of 25%
25. References
• “The Innovator’s Dilemma” by Clayton M. Christensen
• Google: “Failure Trends in a Large Disk Drive Population,” E. Pinheiro, W.-D. Weber and L. A. Barroso, FAST 2007
• Wired: http://www.wired.com/science/discoveries/news/2002/10/55509
• Wikipedia on Petabytes: http://en.wikipedia.org/wiki/Petabyte
• Consumer Reports National Research Center, 2006 Product Reliability Survey: http://www.squaretrade.com/htm/pop/lm_failureRates.html
• Proton rocket launcher: http://en.wikipedia.org/wiki/Proton_rocket
• HDD specifications: www.seagate.com
• Medical pacemaker reliability: http://www.medscape.com/viewarticle/536755
• 2001 Cost of Downtime Survey Results: http://www.datadepositbox.com/media/data-loss-statistics.asp
• BostonComputing.net: http://www.bostoncomputing.net/consultation/databackup/statistics
• IDC: IDC analyst Fred Broussard, “PC Backup and Higher Prioritization for the Enterprise and Consumer,” July 2002