"Coerced Cache Eviction: Dealing with Misbehaving Disks through Discreet-Mode Journaling", presented at DSN 2011. For more details, see http://pages.cs.wisc.edu/~vijayc/cce.htm
The document discusses disk-based storage technologies and how they fit within the memory hierarchy. It describes the components of a disk drive, including platters, read/write heads, actuators, and electronics. Details are provided on disk geometry, with tracks divided into sectors, and the steps involved in disk access, including seek time, rotational latency, and data transfer time.
Xd planning guide - storage best practices, by Nuno Alves
This document provides guidelines for planning storage infrastructure for Citrix XenDesktop environments. It discusses organizational requirements like alignment with IT strategy and high availability needs. Technical requirements covered include performance needs like typical I/O rates and functional requirements like supported protocols. The document recommends avoiding bottlenecks, choosing appropriate RAID levels based on read/write ratios, validating storage performance, and involving storage vendors in planning.
This document discusses mass storage structures including magnetic disks, solid state disks, disk structure, disk attachment methods like host-attached storage, network-attached storage, and storage area networks. It also covers disk scheduling algorithms, disk management topics, swap space management, RAID structures, and stable storage implementation. Magnetic disks are organized into platters, tracks, cylinders, and sectors. Solid state disks use flash memory or DRAM instead of magnetic platters. Disks can be attached directly to hosts or accessed over a network. Disk scheduling algorithms aim to minimize seek times and rotational latency when servicing multiple requests. RAID and swap space management improve reliability, performance and memory management respectively.
This document provides an overview of various data storage technologies and devices used in client-server systems, including magnetic disks, tapes, CD-ROMs, WORM disks, optical disks, RAID configurations, network protection devices, power protection devices, and remote system management. It describes the basic workings and purposes of these different components that are crucial for reliable data storage and system uptime in client-server computing environments.
The document discusses various aspects of disk management in computer systems, including disk structure, disk scheduling, disk formatting, boot blocks, bad block recovery, swap space management, and the file system and I/O management in Windows 2000. Specifically, it covers topics like logical vs physical disk addressing, seek and rotational latency, improving access time through scheduling, low-level vs logical formatting, bootstrapping from disk, handling defective sectors, allocating and managing virtual memory using swap space, and the role of the kernel, virtual memory manager, and I/O manager in Windows 2000.
Windows 2000 is a 32-bit operating system designed for compatibility, reliability, and performance. It includes several key components like the kernel, executive services, and environmental subsystems. The kernel schedules threads and handles exceptions/interrupts. Executive services include the object manager, virtual memory manager, process manager, and I/O manager. Environmental subsystems allow running applications from other operating systems. The document also discusses disk structure, file systems, networking, and other OS concepts.
The document discusses physical storage media used in database systems, including their characteristics and performance measures. It describes the storage hierarchy from fastest volatile cache and main memory to slower non-volatile secondary storage like magnetic disks and tertiary storage like tape. It focuses on magnetic disks, explaining their mechanical components and performance optimization techniques like disk scheduling algorithms and file organization to minimize disk arm movement.
Chapter 12 discusses mass storage systems and their role in operating systems. It describes the physical structure of disks and tapes and how they are accessed. Disks are organized into logical blocks that are mapped to physical sectors. Disks connect to computers via I/O buses and controllers. RAID systems improve reliability through redundancy across multiple disks. Operating systems provide services for disk scheduling, management, and swap space. Tertiary storage uses tape drives and removable disks to archive less frequently used data in large installations.
This document discusses physical storage in database systems. It describes different types of storage media like cache, main memory, magnetic disks, flash memory, optical storage, and tape storage. It explains the storage hierarchy and performance measures of disks. The document also covers disk organization, file organization, optimization of disk access, RAID systems, and how redundancy improves reliability.
The document discusses mass storage systems, including disk structure, disk scheduling algorithms, disk management, RAID structure, disk attachment methods, stable storage implementation, and tertiary storage devices. It provides details on disk formatting, swap space management, different RAID levels, network attached storage, stable storage implementation, removable media like tapes and optical disks, operating system issues, and hierarchical storage management.
This document defines key concepts related to virtualization and disk performance. It discusses how virtualization can compound disk fragmentation issues and slow performance. The disk subsystem is identified as the main performance bottleneck for virtualized environments due to the additional processing layers from guest to host systems. Best practices for improving disk performance in virtualized servers include advanced defragmentation of host and guest systems, adjusting filesystem settings, separating disks and partitions, and using high-performance storage.
The document provides information about I/O systems and a case study, including details about disk structure, disk scheduling algorithms, disk management techniques, direct memory access, swap space management, RAID structure, disk attachment methods, and features of the Windows 2000 and MS-DOS operating systems. Key points covered include how disks are addressed as logical blocks, techniques for minimizing seek time and maximizing disk bandwidth, common disk scheduling algorithms like SSTF and SCAN, and how swap space is allocated and managed in different operating systems.
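The SSTF and SCAN algorithms named above can be sketched in a few lines. This is a schematic illustration, not code from the document; the request queue and starting head position are the classic textbook example values.

```python
# Sketch of two common disk-scheduling policies.
# Cylinder numbers and head position are illustrative example values.

def sstf(head, requests):
    """Shortest Seek Time First: always service the nearest pending cylinder."""
    pending, order = list(requests), []
    while pending:
        nxt = min(pending, key=lambda c: abs(c - head))
        pending.remove(nxt)
        order.append(nxt)
        head = nxt
    return order

def scan(head, requests):
    """SCAN (elevator): sweep toward higher cylinders, then reverse."""
    up = sorted(c for c in requests if c >= head)
    down = sorted((c for c in requests if c < head), reverse=True)
    return up + down

queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(sstf(53, queue))   # services nearest cylinder first
print(scan(53, queue))   # one sweep up, then back down
```

Note how SSTF minimizes each individual seek but can starve far-away requests, while SCAN bounds the wait by sweeping the whole surface.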
This document discusses mass storage systems. It begins with an overview of disk structure, including details on disk performance characteristics like seek time and rotational latency. It then covers topics like disk scheduling algorithms, disk management in operating systems, swap space management, RAID structures, and implementing stable storage. RAID levels like mirroring and striping with parity are explained. The document provides information on technologies like solid-state disks, magnetic tape, storage arrays, and network-attached storage.
This document discusses various topics related to disk management in computer systems. It covers disk structure, disk scheduling, disk formatting, boot blocks, bad block recovery, swap space management, and features of the Windows 2000 operating system. The key points are:
- Disks are addressed as large arrays of logical blocks, typically 512 bytes each.
- Disk scheduling aims to optimize seek time, rotational latency, and bandwidth for efficient data transfer.
- The operating system handles disk initialization, partitioning, logical formatting, and recovery of bad blocks.
- Swap space is used as an extension of main memory and can be located in the file system or a separate partition.
- Windows 2000 is a 32-bit operating system designed for compatibility, reliability, and performance.
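The first bullet, that disks are addressed as large arrays of 512-byte logical blocks, can be made concrete with a small sketch. The drive geometry below is a made-up example; real drives hide geometry behind the logical block address (LBA).

```python
# Illustrative mapping from a logical block address to a byte offset and
# to cylinder/head/sector coordinates. Geometry values are hypothetical.

BLOCK_SIZE = 512          # bytes per logical block, as in the text
HEADS, SECTORS = 16, 63   # example geometry, not from the document

def lba_to_offset(lba):
    """Byte offset of a logical block on the raw device."""
    return lba * BLOCK_SIZE

def lba_to_chs(lba):
    """Decompose an LBA into (cylinder, head, sector); sectors start at 1."""
    cylinder, rem = divmod(lba, HEADS * SECTORS)
    head, sector = divmod(rem, SECTORS)
    return cylinder, head, sector + 1

print(lba_to_offset(100))   # 51200
print(lba_to_chs(100))
```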
The document discusses various mass storage devices used in computers. It provides details on floppy disks, hard disks, CDs, DVDs, and USB drives. For floppy disks, it describes the parts of the floppy disk and drive, different sizes of floppies, and how data is written to a floppy disk. For hard disks, it explains the components, how data is read and written, low-level formatting, partitioning, and high-level formatting. It also lists characteristics of different hard disk families used in PCs such as capacity, reliability, and transfer rates.
This presentation gives an overview of physical storage technologies and the various ways of accessing storage on a computer or a server. Presented at School of Engineering and Applied Science, Ahmedabad University as a part of Software Engineering course.
This document discusses mass storage systems and disk structure. It covers topics such as disk formatting, mapping logical blocks to physical sectors, disk attachment methods like SCSI and Fibre Channel, and disk scheduling algorithms. It also summarizes disk management techniques including partitioning, file systems, and swap space. Additional sections cover RAID configurations, stable storage implementation, and snapshot and replication features.
This document discusses secure data storage mechanisms like RAID and LVM. It provides an overview of RAID, including its history and common types like RAID 0, 1, 5 and 6. LVM is introduced as a way to manage logical volumes more flexibly than traditional partitioning. Advantages of RAID and LVM include data redundancy and flexibility to change storage allocation. Disadvantages include increased storage needs and complexity of management.
The document discusses Solaris memory management. It describes Solaris' memory architecture including backing store, virtual memory system, and process memory allocation. It then discusses Solaris' memory management techniques, including swapping and demand paging. Demand paging loads pages of memory on demand to lower memory footprint and startup time, while swapping is used as a last resort. Memory is shared between processes and protected via virtual memory and page protections.
This document discusses mass storage systems and their management by operating systems. It covers disk structure, disk scheduling algorithms, disk management including partitioning and file systems, swap space management, RAID configurations, and implementing stable storage. The objectives are to describe mass storage devices, explain their performance characteristics, evaluate disk scheduling, and discuss operating system services for storage like RAID.
This document discusses physical storage media and file organization in a database system. It describes different types of storage media like magnetic disks, flash memory, and tape storage. It explains the hierarchy of storage from fastest but volatile primary storage to slower but non-volatile secondary and tertiary storage. The document also discusses techniques for improving performance and reliability of disk storage, including RAID (Redundant Arrays of Independent Disks) and how it uses data striping and redundancy across multiple disks to provide improved I/O performance and fault tolerance. It outlines several RAID levels that trade off performance, reliability, and cost in different ways.
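The data striping mentioned above distributes consecutive chunks of a file across disks round-robin so they can be read in parallel. A minimal RAID-0-style sketch, with arbitrary example chunk size and disk count:

```python
# RAID-0-style striping: consecutive fixed-size chunks go to successive
# disks round-robin. Chunk size and disk count are example values.

def stripe(data, n_disks, chunk=4):
    disks = [bytearray() for _ in range(n_disks)]
    for i in range(0, len(data), chunk):
        disks[(i // chunk) % n_disks].extend(data[i:i + chunk])
    return [bytes(d) for d in disks]

# 16 bytes over 3 disks: the 4th chunk wraps back to disk 0.
print(stripe(b"ABCDEFGHIJKL", 3))
print(stripe(b"ABCDEFGHIJKLMNOP", 3))
```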
This document discusses NVMFS, a new file system designed to take advantage of nonvolatile memory. NVMFS is POSIX compliant and leverages the functionality of an underlying flash translation layer. It exposes NVM primitives through the standard system file interface. The document describes how NVMFS, combined with atomic writes and NVM compression in MySQL, can improve performance and endurance for databases on flash storage. Performance tests showed improvements in throughput and latency compared to conventional configurations.
1. Introduction
2. OS Structures
3. Processes
4. Threads
5. CPU Scheduling
6. Process Synchronization
7. Deadlocks
8. Memory Management
9. Virtual Memory
10. File System Interface
11. File System Implementation
12. Mass Storage Systems
13. I/O Systems
14. Protection
15. Security
16. Distributed System Structures
17. Distributed File Systems
18. Distributed Coordination
19. Real-Time Systems
20. Multimedia Systems
21. Linux
22. Windows
This document provides an overview of various database implementation techniques, including RAID, file organization, indexing, and query processing. It describes the different RAID levels for improving reliability and performance of disk storage. RAID levels use disk striping and redundancy such as mirroring or parity to provide fault tolerance. The document also discusses file organization techniques for fixed and variable length records, including using a free list or slotted pages. Indexing methods like B+ trees and hashing are introduced for efficient retrieval of records from files.
This document provides an overview of physical storage media and file organization concepts for databases. It discusses various storage media like magnetic disks, flash memory, tape storage and their characteristics. The document introduces the concept of storage hierarchy with primary, secondary and tertiary storage. It describes magnetic disks in detail and optimization techniques for disk access like RAID and file organization. RAID levels 1-4 are summarized with their performance and reliability tradeoffs.
The document discusses cache design and organization. It describes how caches work, sitting between the CPU and main memory to provide fast access to frequently used data. The key aspects covered include cache size, block size, mapping techniques, replacement algorithms, write policies, and the evolution of cache hierarchies in processors like the Pentium IV with multiple levels of on-chip and off-chip caches.
The document summarizes key characteristics of cache memory including location, capacity, unit of transfer, access methods, performance, physical types, organization, and hierarchy. It discusses cache memory in terms of where it is located (internal or external to the CPU), its typical sizes (word, block), access techniques (sequential, random, associative), performance metrics (access time, transfer rate), common physical implementations (SRAM, disk), and organizational aspects like mapping functions, replacement algorithms, and write policies. A cache sits between the CPU and main memory, using fast but small memory to speed up access to frequently used data from larger but slower main memory.
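The mapping functions mentioned in the cache summaries above boil down to slicing an address into tag, index, and offset fields. The sketch below shows this for a direct-mapped cache; the 64-byte block size and 256-line cache are example parameters, not taken from the document.

```python
# Direct-mapped cache address decomposition. Parameters are illustrative.

BLOCK_BITS = 6    # 64-byte blocks -> low 6 bits are the byte offset
INDEX_BITS = 8    # 256 cache lines -> next 8 bits select the line

def split_address(addr):
    """Return (tag, index, offset) for a direct-mapped cache lookup."""
    offset = addr & ((1 << BLOCK_BITS) - 1)
    index = (addr >> BLOCK_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (BLOCK_BITS + INDEX_BITS)
    return tag, index, offset

print(split_address(0x1234ABCD))
```

A hit occurs when the line selected by `index` holds a valid block whose stored tag equals `tag`; set-associative caches use the same split but check several lines per index.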
This document discusses the key characteristics of computer memory, including location, capacity, unit of transfer, access methods, performance, physical type, physical characteristics, and organization. It covers different types of memory like CPU registers, main memory, cache, disk, and tape. The different access methods like sequential, direct, random, and associative access are explained. The memory hierarchy and performance aspects like access time, memory cycle time, and transfer rate are defined. Factors like cache size, mapping function, replacement algorithm, write policy, block size that impact cache performance are also summarized.
Cache is a small amount of fast memory located close to the CPU that stores frequently accessed instructions and data. It speeds up processing by allowing the CPU to access needed information more quickly than from main memory. Caches exploit the principle of locality of reference, where programs tend to access the same data/instructions repeatedly over short periods. There are multiple cache levels, with L1 cache being fastest but smallest and L3 cache being largest but slower. Caching improves performance dramatically by fulfilling over 90% of memory requests from the small cache rather than requiring slower access to main memory.
RocksDB is an embedded key-value store that is optimized for fast storage. It uses a log-structured merge-tree to organize data on storage. Optimizing RocksDB for open-channel SSDs would allow controlling data placement to exploit flash parallelism and minimize overhead. This could be done by mapping RocksDB files like SSTables and logs to virtual blocks that map to physical flash blocks in a way that considers data access patterns and flash characteristics. This would improve performance by reducing writes and garbage collection.
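The log-structured merge idea behind RocksDB can be sketched in miniature: writes land in an in-memory memtable, which is flushed as a sorted immutable run when full; reads check the memtable first, then runs newest-first. This is a schematic toy, not the RocksDB API, and the two-entry flush threshold is an arbitrary example.

```python
# Toy log-structured merge store. Schematic only; real LSM engines
# persist runs to disk and compact them in the background.

class TinyLSM:
    def __init__(self, memtable_limit=2):
        self.memtable, self.runs, self.limit = {}, [], memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.limit:
            self.runs.append(sorted(self.memtable.items()))  # flush sorted run
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):      # newest run shadows older ones
            for k, v in run:
                if k == key:
                    return v
        return None

db = TinyLSM()
db.put("a", 1); db.put("b", 2)   # second put triggers a flush
db.put("a", 3)                   # newer value shadows the flushed one
print(db.get("a"), db.get("b"))
```

Because runs are immutable and sequential, all writes to storage are large and append-only, which is what makes the structure a good fit for flash and for open-channel data placement.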
Facebook's Approach to Big Data Storage Challenge, DataWorks Summit
Facebook's data warehouse cluster stores more than 100 PB of data, with 500+ terabytes entering the clusters every day. To meet the capacity requirements of future data growth, storing data cost-effectively has become a top priority for the Facebook data infrastructure team. This talk presents several solutions used to reduce the warehouse cluster's data footprint: (1) smart retention: history-based Hive table retention control; (2) increasing the RCFile compression ratio through clever sorting; (3) HDFS file-level raiding to reduce the default replication factor of 3 to a lower ratio; (4) attacking the small-file raiding problem through directory-level raiding and raid-aware compaction.
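The payoff of item (3) is easy to estimate. The 100 PB figure comes from the talk summary above; the 1.4x erasure-coded overhead is a typical Reed-Solomon figure (e.g. RS(10,4)) used here purely as an assumption, not a number from the talk.

```python
# Back-of-the-envelope raw-capacity comparison: 3-way replication vs
# parity-based raiding. The 1.4x overhead is an assumed RS-style figure.

logical_pb = 100                     # logical data, per the talk summary
raw_replicated = logical_pb * 3      # HDFS default: 3 full copies
raw_raided = logical_pb * 1.4        # assumed erasure-coded overhead

print(raw_replicated, raw_raided)    # raw PB needed under each scheme
```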
This document summarizes key concepts about physical storage systems from the textbook "Database System Concepts, 7th Ed." by Silberschatz, Korth and Sudarshan. It describes the storage hierarchy from fastest volatile primary storage (e.g. cache, main memory) to slower non-volatile secondary storage (e.g. magnetic disks, flash storage) to slowest tertiary storage (e.g. magnetic tapes). It also discusses various storage media like magnetic disks, flash storage, SSDs and RAID arrays, covering their mechanisms, performance and reliability through redundancy.
1. Magnetic disks are the primary storage medium for databases due to their large storage capacity and reliability. Disks store data in circular tracks divided into sectors, with read/write heads positioning over tracks to access data.
2. RAID (Redundant Arrays of Independent Disks) organizes multiple disks for improved performance, capacity, and reliability. Techniques like mirroring duplicate data across disks for fault tolerance, while striping distributes data across disks to enable parallel access.
3. Database designers must choose an appropriate RAID level based on factors like update frequency, capacity needs, and performance requirements to optimize the physical storage structure.
This document summarizes different types of physical storage media and RAID levels. It discusses volatile primary storage like cache and main memory, and non-volatile secondary storage like magnetic disks and tapes. Tertiary storage includes slower media like magnetic tapes. RAID levels provide data redundancy across multiple disks for reliability or performance gains, with tradeoffs in cost. Common RAID levels include RAID 0 for striping without parity, RAID 1 for mirroring, and RAID 5 for block-interleaved distributed parity. Flash storage like SSDs provide faster access than HDDs but have limitations on write endurance.
This document provides an overview of NVM compression, a hybrid flash-aware application level compression solution. It discusses the drawbacks of existing row-level compression in MySQL and outlines an architecture for NVM compression that avoids these drawbacks. Key aspects of the NVM compression approach include performing compression only during flush, using sparse addressing to avoid over-provisioning flash space, and adding a new multi-threaded flush framework. Evaluation results and building blocks of the solution are also briefly mentioned.
Lec11 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Memory part3 – Hsien-Hsin Sean Lee, Ph.D.
This document discusses DRAM and storage systems. It begins by describing the basic DRAM cell and how DRAM is organized into banks, rows, and columns. It then covers DRAM operation including refreshing and different DRAM standards. The document also discusses disk organization with platters, tracks, and sectors. It provides details on disk access times and reliability techniques like RAID levels 0 through 6 which use data mirroring, striping, and error correction codes.
At StampedeCon 2012 in St. Louis, Pritam Damania presents: Reliable backup and recovery is one of the main requirements for any enterprise grade application. HBase has been very well embraced by enterprises needing random, real-time read/write access with huge volumes of data and ease of scalability. As such, they are looking for backup solutions that are reliable, easy to use, and can co-exist with existing infrastructure. HBase comes with several backup options but there is a clear need to improve the native export mechanisms. This talk will cover various options that are available out of the box, their drawbacks and what various companies are doing to make backup and recovery efficient. In particular it will cover what Facebook has done to improve performance of backup and recovery process with minimal impact to production cluster.
OSDC 2016 - Interesting things you can do with ZFS by Allan Jude & Benedict Reu... – NETWAYS
ZFS is the next-generation filesystem originally developed at Sun Microsystems. Available under the CDDL, it uniquely combines a volume manager and a filesystem into a powerful storage management solution for Unix systems, regardless of big or small storage requirements. ZFS offers features, for free, that are usually found only in costly enterprise storage solutions. This talk will introduce ZFS and give an overview of its features such as snapshots and rollback, compression, deduplication, and replication. We will demonstrate how these features can make a difference in the datacenter, giving administrators the power and flexibility to adapt to changing storage requirements.
Real world examples of ZFS being used in production for video streaming, virtualization, archival, and research are shown to illustrate the concepts. The talk is intended for people considering ZFS for their data storage needs and those who are interested in the features ZFS provides.
Deterministic Memory Abstraction and Supporting Multicore System Architecture – Heechul Yun
Presentation slides of the following paper at ECRTS'18.
Farzad Farshchi, Prathap Kumar Valsan, Renato Mancuso, Heechul Yun. "Deterministic Memory Abstraction and Supporting Multicore System Architecture." Euromicro Conference on Real-Time Systems (ECRTS), 2018
This document provides an overview of various data storage technologies including RAID, DAS, NAS, and SAN. It discusses RAID levels like RAID 0, 1, 5 which provide data striping and redundancy. Direct attached storage (DAS) connects directly to servers but cannot be shared, while network attached storage (NAS) uses file sharing protocols over IP networks. Storage area networks (SAN) use dedicated storage networks like Fibre Channel and iSCSI to provide block-level access to consolidated storage. The key is choosing the right solution based on capacity, performance, scalability, availability, data protection needs, and budget.
DRBD is a block device designed to mirror a block device across a network for high availability clustering. It can be understood as network-based RAID1. To set up DRBD, partitions must be prepared, configuration files created, and DRBD started to begin synchronization. Problems with DRBD can occur due to network errors disconnecting nodes, disk errors on mirrored devices, or role changes without synchronization. These issues are resolved by fixing the underlying problem and reattaching DRBD devices.
The document discusses Ext4 journaling and the write barrier feature. It notes that the write barrier forces a flush-to-disk call after writing the journal to ensure consistency. However, this can cause sluggishness when storage is full during OTA updates. Disabling the write barrier allows reordering of cache-to-disk writes, reducing latency and improving performance, though it introduces a small risk of filesystem corruption in the event of a power failure. Tests showed disabling the barrier reduced fsync latency and improved SQLite transactions per second on HDD and EMMC storage.
The document provides an overview of log structured file systems. It discusses how log structured file systems work by writing all data and metadata sequentially to a circular buffer called a log to improve write performance. It also describes how log structured file systems address issues like limited disk space through garbage collection and provide simpler crash recovery without requiring a file system check.
ZFS provides several advantages over traditional block-based filesystems when used with PostgreSQL, including preventing bitrot, improved compression ratios, and write locality. ZFS uses copy-on-write and transactional semantics to ensure data integrity and allow for snapshots and clones. Proper configuration such as enabling compression and using ZFS features like intent logging can optimize performance when used with PostgreSQL's workloads.
The document discusses various physical storage media used in computers including cache, main memory, flash memory, magnetic disks, optical disks, and magnetic tapes. It classifies storage based on characteristics like speed of access, cost, and reliability. RAID systems are described which provide storage virtualization through techniques like mirroring and striping across disks to improve performance and reliability. Different RAID levels are outlined including RAID 0, 1, 2, 3, 4, 5, and 6.
ZFS is a combined filesystem, volume manager, and RAID controller that provides immense storage capacity, simplifies administration, and ensures data integrity. It uses copy-on-write to prevent data corruption and supports features like snapshots, clones, replication, compression, and sharing data over NAS and SAN protocols. ZFS organizes storage into pools composed of virtual devices that provide fault tolerance and high performance.
Data deduplication is a hot topic in storage and saves significant disk space in many environments, with some trade-offs. We’ll discuss what deduplication is and where the open-source solutions stand versus commercial offerings. The presentation leans towards the practical, so attendees can use it in their real-world projects (what works, what doesn’t, whether you should use it in production, etcetera).
This document summarizes key concepts about storage devices including hard drives, RAID, and SSDs. It discusses the hardware components of hard drives including platters, read/write heads, and interfaces. It describes the different types of delays that occur with hard drives including rotational, seek, and transfer times. The document then covers RAID levels 0, 1, and their performance characteristics for sequential and random access workloads. It discusses challenges with ensuring consistent updates across mirrored drives in RAID 1 configurations.
Similar to Coerced Cache Eviction: Dealing with Misbehaving Disks through Discreet-Mode Journaling (20)
Coerced Cache Eviction: Dealing with Misbehaving Disks through Discreet-Mode Journaling
1. Coerced Cache Eviction and Discreet-Mode Journaling: Dealing with Misbehaving Disks
Abhishek Rajimwale*, Vijay Chidambaram, Deepak Ramamurthi
Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau
*Data Domain Inc
University of Wisconsin-Madison
2. Disks are not perfect
• Expanding disk fault model
• Latent Sector Errors [Bairavasundaram SIGMETRICS 07]
– RAID-6
• Block Corruption [Bairavasundaram FAST 08]
– Checksums
• The disk cache
– Always trusted so far
(Figure: the disk cache sitting between the host and the disk surface)
3/13/12 DSN 11 2
3. Disk Caches
• Disk cache improves performance
– But at the risk of data loss
• Order of writes issued by the file system:
– A, B, C
• Disks reorder writes during destaging:
– B, A, C
• File systems flush the disk cache to ensure correct ordering of writes
– A, flush, B, flush, C
(Figure: writes to disk passing through the disk cache to the disk surface)
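The A, flush, B, flush, C discipline on this slide can be sketched with POSIX calls; a minimal illustration (the file path is arbitrary, and whether the flush actually reaches the platter is exactly the question this talk raises):

```python
import os
import tempfile

def ordered_writes(path, blocks):
    """Issue blocks in order with a cache flush between each, so the drive
    cannot reorder them during destaging: A, flush, B, flush, C."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        for block in blocks:
            os.write(fd, block)
            os.fsync(fd)  # kernel flush; with barriers on, also a drive-cache flush
    finally:
        os.close(fd)

path = os.path.join(tempfile.gettempdir(), "journal.img")
ordered_writes(path, [b"A" * 512, b"B" * 512, b"C" * 512])
```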
4. Problem: Flushing doesn’t work
• Disks can fail to flush data upon request
• One reason: Bugs
– Errors in the storage stack [Bairavasundaram FAST 08]
– Improper propagation of error codes [Bairavasundaram FAST 08]
– Inadequate failure policies [Prabhakaran SOSP 05]
– Bugs in the firmware [Ghemawat SOSP 03]
5. Disks can lie!
• Misbehaving disks ignore or delay flush requests
• Increases the risk of data loss
– File systems are usually blamed for such loss
(Figure: average time in msec for sequential writes, with and without the disk cache, for write sizes from 4k to 1m)
6. Disks can lie!
• Evidence from industry experts
– Microsoft
– Seagate
• From the fcntl man page in Mac OS X:
F_FULLFSYNC
Does the same thing as fsync(2) then asks the drive to flush all buffered data to the permanent storage device (arg is ignored). This is currently implemented on HFS, MS-DOS (FAT), and Universal Disk Format (UDF) file systems. The operation may take quite a while to complete. Certain FireWire drives have also been known to ignore the request to flush their buffered data.
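An application can request the stronger flush described on this slide; a hedged sketch, where the fall-back-to-fsync policy is an illustrative assumption rather than anything the man page prescribes:

```python
import fcntl
import os
import tempfile

def full_fsync(fd):
    """Ask for the strongest flush available. On Mac OS X, F_FULLFSYNC tells
    the drive itself to empty its cache; on other systems, or if the drive
    refuses (as some FireWire drives reportedly do), fall back to fsync()."""
    op = getattr(fcntl, "F_FULLFSYNC", None)
    if op is not None:
        try:
            fcntl.fcntl(fd, op)
            return "F_FULLFSYNC"
        except OSError:
            pass  # drive or filesystem refused the full flush
    os.fsync(fd)
    return "fsync"

path = os.path.join(tempfile.gettempdir(), "durable.dat")
fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
os.write(fd, b"commit record")
used = full_fsync(fd)
os.close(fd)
```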
7. Ordering points are essential
• All modern file systems depend on ordering points
– Journaling file systems (ext3, ext4)
• Data before the commit block
– Copy-on-write file systems (ZFS)
• Data before the uber-block
• If ordering points are not enforced:
– Data corruption
– Inconsistent file system
8. Summary
• We present Coerced Cache Eviction (CCE)
– Write extra data into the cache to evict target blocks
• We show how to characterize 9 SATA disk drive caches
– Examine the wide range of caching policies
• We implement CCE in ext3
– Well-known journaling file system
• CCE provides stronger enforcement for ordering points
– At acceptable overheads
10. File System Background
• Consider deleting a file
– Removing its directory entry
– Freeing the space occupied by the file and its metadata
• Journaling file system
– Makes sure all changes get to disk or none do
– Groups writes into transactions
– Writes everything to a log first
– Checkpoints to disk later
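The log-then-checkpoint protocol above can be sketched as a toy data-journaling commit; this is not ext3's on-disk format, just the shape of the idea, with hypothetical file descriptors and a made-up "COMMIT" record:

```python
import os
import tempfile

def commit_transaction(journal_fd, home_fd, updates):
    """Toy data-journaling commit: log every block, flush, write a commit
    record, flush, then checkpoint the blocks to their home locations.
    After a crash, a complete log entry can be replayed; an incomplete
    one (no commit record) is discarded, so all-or-nothing holds."""
    log_off = 0
    for _, data in updates:                       # 1. write blocks to the log
        os.pwrite(journal_fd, data, log_off)
        log_off += len(data)
    os.fsync(journal_fd)                          # ordering point
    os.pwrite(journal_fd, b"COMMIT", log_off)     # 2. commit record
    os.fsync(journal_fd)                          # ordering point
    for home_off, data in updates:                # 3. checkpoint to home
        os.pwrite(home_fd, data, home_off)
    os.fsync(home_fd)

jfd = os.open(os.path.join(tempfile.gettempdir(), "toy.journal"),
              os.O_RDWR | os.O_CREAT, 0o644)
hfd = os.open(os.path.join(tempfile.gettempdir(), "toy.home"),
              os.O_RDWR | os.O_CREAT, 0o644)
commit_transaction(jfd, hfd, [(0, b"D" * 512), (4096, b"M" * 512)])
```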
11. File System Background
• Ext3 file system
– Semi-modern journaling file system
– Well known, well understood
• Variants of journaling
– Data journaling mode
• Everything (data, metadata) goes to the log first
– Ordered journaling mode
• Only metadata is logged
15. Coerced Cache Eviction
• Ensures that the cache has been truly flushed
• Key idea:
– Extra writes to flush the disk cache
– Desired order of writes: A, B, C
– With CCE:
• Write A
• Write to flush zone
• Write B
• Write to flush zone
• Write C
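The interleaving above can be sketched directly; a minimal illustration in which the flush-zone offset and write count are placeholder assumptions (a real flush workload comes from the fingerprinting described later in the talk):

```python
import os
import tempfile

BLOCK = 4096
# Hypothetical layout: the flush zone is a reserved run of blocks written
# only to displace cached data.
FLUSH_ZONE_OFFSET = 256 * BLOCK
FLUSH_WRITES = 32

def coerce_eviction(fd):
    """Write enough extra data that the drive cache must destage earlier
    blocks, even if the drive ignores explicit flush commands."""
    for i in range(FLUSH_WRITES):
        os.pwrite(fd, b"\0" * BLOCK, FLUSH_ZONE_OFFSET + i * BLOCK)

def ordered_writes_with_cce(fd, writes):
    """Desired order A, B, C becomes A, flush-zone writes, B, flush-zone
    writes, C: one CCE between every pair of ordered writes."""
    for i, (off, data) in enumerate(writes):
        os.pwrite(fd, data, off)
        if i < len(writes) - 1:
            coerce_eviction(fd)

fd = os.open(os.path.join(tempfile.gettempdir(), "cce.img"),
             os.O_RDWR | os.O_CREAT, 0o644)
ordered_writes_with_cce(fd, [(0, b"A" * BLOCK), (BLOCK, b"B" * BLOCK),
                             (2 * BLOCK, b"C" * BLOCK)])
```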
16. Coerced Cache Eviction
(Figure: blocks C, D, B in memory M, with the disk cache filled by flush-zone blocks F; the journal and the flush zone occupy fixed locations on the disk surface)
17. Coerced Cache Eviction
• Desired properties:
– High probability of flushing target blocks
– Low performance overhead
• Need to understand the disk cache to design the flush workload
19. Cache Fingerprinting
• Manufacturers don’t expose details about disk caches
• Disk caches can vary in:
– Read/write partition size
– Number of segments
– Replacement policy
• Poorly characterized in the literature
20. Cache Fingerprinting
• Flush micro-benchmark:
– Write target block
– Write varied flush workload – measure cost
– fsync()
– Read target – infer eviction
• Micro-benchmark is repeated
– Probability of eviction is calculated
• Vary in each workload:
– Number of writes
– Amount of data in each write
– Sequential/random writes
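One trial of the micro-benchmark might look like the sketch below. The 5 ms threshold is a placeholder, and a real tool would open the raw disk with O_DIRECT so the OS page cache does not mask the drive cache; on a plain file this only exercises the loop structure:

```python
import os
import tempfile
import time

def flush_fingerprint_trial(fd, target_off, workload, threshold_s=0.005):
    """Write the target block, write a candidate flush workload (timing it,
    for the performance fingerprint), fsync, then time a read of the target.
    A slow read suggests the block left the drive cache and came from the
    platter (the eviction fingerprint)."""
    os.pwrite(fd, b"\xaa" * 4096, target_off)
    start = time.monotonic()
    for off, data in workload:
        os.pwrite(fd, data, off)
    workload_cost = time.monotonic() - start
    os.fsync(fd)
    start = time.monotonic()
    os.pread(fd, 4096, target_off)
    evicted = (time.monotonic() - start) > threshold_s
    return evicted, workload_cost

fd = os.open(os.path.join(tempfile.gettempdir(), "fingerprint.img"),
             os.O_RDWR | os.O_CREAT, 0o644)
workload = [(8192 + i * 4096, b"f" * 4096) for i in range(4)]
evicted, cost = flush_fingerprint_trial(fd, 0, workload)
os.close(fd)
```

Repeating the trial many times per (write count, write size, sequential/random) setting yields the eviction probability plotted in the fingerprints.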
21. Cache Fingerprinting
• Eviction fingerprint
– Probability of eviction is shown visually
– Darker region indicates higher probability
(Legend: eviction probability bands 0–10, 10–30, 30–50, 50–70, 70–90, 90–100%)
22. Cache Fingerprinting
• Performance fingerprint
– Time taken to write the flush workload
– Darker region indicates more time
(Legend: flush latency bands 0–10, 10–50, 50–100, 100–500, 500+ ms)
23. Cache Fingerprinting
• Selecting a flush workload:
– Combine information from both fingerprints
– High probability of eviction
• Dark region in the eviction fingerprint
– Low performance cost
• Light region in the performance fingerprint
25. Cache Fingerprinting
• Sequential writes may be ineffective at flushing
– Regardless of the size of the write
• A number of random writes are required
(Figure: eviction fingerprint; legend as on slide 21)
26. Cache Fingerprinting
• Vertical stripes indicate that the cache is segmented
– Each write, regardless of size, is sent to one segment
(Figure: eviction fingerprint; legend as on slide 21)
27. Cache Fingerprinting
• Cache behavior of disks from the same manufacturer is qualitatively similar across their different models
(Figure: eviction fingerprint; legend as on slide 21)
28. Cache Fingerprinting
• It’s not all good news, however:
– Some caches appear to use random replacement policies
– For such caches, we cannot evict blocks with 100% certainty
– A large number of random writes are required to get high eviction probability
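The last point has a simple probabilistic reading: if each write evicts a victim chosen uniformly at random among the cache's slots, the target survives each write with probability (slots − 1)/slots, so certainty is never reached and only many writes drive the eviction probability high. A quick check (the 2048-slot count is an assumption for illustration, not a measured cache size):

```python
def eviction_probability(slots, writes):
    """P(target evicted) after `writes` insertions into a cache that picks
    its victim uniformly at random among `slots` lines."""
    return 1.0 - ((slots - 1) / slots) ** writes

# One write into a 2048-slot cache almost never evicts the target;
# a few thousand random writes make eviction likely but never certain.
p_one = eviction_probability(2048, 1)
p_many = eviction_probability(2048, 4096)
```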
29. Cache Fingerprinting – Results

Drive                  | Number of writes | Total data (MB) | Eviction probability (%) | Time (s)
Hitachi 8 MB           | 1                | 2.38            | 100                      | 0.05
Hitachi 32 MB          | 1                | 11              | 100                      | 0.087
Seagate 8 MB           | 256              | 31              | 100                      | 0.87
Seagate 16 MB          | 128              | 17              | 100                      | 0.342
Seagate 64 MB          | 128              | 37              | 100                      | 0.396
Samsung 8 MB           | 128              | 49              | ~90                      | 1.328
Samsung 16 MB          | 256              | 128             | ~90                      | 2.872
Western Digital 16 MB  | 1792             | 19              | ~90                      | 5.107
Western Digital 64 MB  | 256              | 1               | 100                      | 7.705
31. Discreet Mode Journaling
• Incorporating CCE into ext3
– Fingerprint the disk to find the optimal flush workload
– Create a flush zone of suitable size
– Modify ext3 to issue flush zone writes:
• One at each ordering point
• # of CCE operations = # of ordering points
• Can be used with any disk:
– As long as the disk is fingerprinted first
33. Evaluation
• Goal:
– CCE provides higher reliability
– At what cost? Is it practical to use?
• Experimental setup:
– File system: ext3
– Disk: Hitachi 8 MB
– Journaling mode: data journaling
• (See paper for ordered journaling results)
– Operating system: Linux 2.6.13, Linux 2.6.23
34. Evaluation
• What we compare:
– Regular journaling with the disk cache turned off
• “Safe” but slow
• Disk might not obey the command to turn off the cache!
– Regular journaling with the disk cache turned on
• Unsafe but fast
– Discreet mode journaling
• Midway option – safe but with cost
35. Evaluation
• Benchmarks:
– OpenSSH
• copy, untar, configure, make
– Postmark
• Simulates a mail server
• Single-threaded
– Filebench Webserver
• I/O intensive
– Filebench Varmail
• Multithreaded Postmark
36. Evaluation – OpenSSH (Data Journaling Mode)
(Figure: run time in seconds for regular w/o cache, discreet, and regular w/ cache)
37. Evaluation – Postmark (Data Journaling Mode)
(Figure: run time in seconds for regular w/o cache, discreet, and regular w/ cache)
40. Evaluation – Filebench Varmail
• Workload writes a small amount of data and calls fsync() repeatedly
• Each fsync() causes 3 CCEs
• A number of optimizations:
– Incorporate group commit in Varmail
• Improves throughput for all modes
– We use a few other techniques as well (see paper)
42. Summary
• Coerced Cache Eviction (CCE):
– Run file systems reliably on top of misbehaving disks
• Characterization of 9 SATA disk caches through fingerprints
• Discreet Mode Journaling:
– Implementation of CCE for the ext3 filesystem
– Acceptable performance on 3 workloads
• Only if the cache doesn’t use random replacement
– High overhead for apps which call fsync() frequently
43. Conclusion
• Trust in disks is weakening:
– Latent sector errors
– Block corruption
– Cache flushing
• Cloud computing systems:
– Virtualized hardware
– Large software stack
• Can such hardware be trusted?
• Will coercion be more widely used?
44. Thank you!
Advanced Systems Lab (ADSL)
University of Wisconsin-Madison
http://www.cs.wisc.edu/adsl