DULO: An Effective Buffer Cache Management Scheme to Exploit Both Temporal and Spatial Localities
1. 2.5
As we know, disk I/O performance is critical to data-intensive applications,
because these applications demand efficient I/O support. For example,
database systems may manage millions of records on a storage device and
access them in either small or large pieces, which requires low access
latency or high transfer rates from storage devices. Multimedia
applications often access large blocks of data in a predictable sequence
and demand a guaranteed minimum transfer rate. For scientific
applications, I/O can be a big challenge because a huge amount of data can
be requested within a short time in very-large-scale parallel systems. At Los
Alamos National Laboratory, ASCI mission-oriented programs conduct large-scale
simulation-based analysis, which requires several gigabytes per second of I/O
bandwidth to support physical simulation and visualization. Their access
2. 0.5
The performance of the hard disk is limited by mechanical constraints. To
read or write a disk block, the disk head has to be positioned on the right
track through seeking and over the right sector through platter rotation.
From the graph I just showed, you can see how slow a disk seek, the
movement of the disk arm, is. That is, the disk arm is the Achilles' heel of
disk access performance. If you access the disk sequentially, you minimize
disk seeks and make full use of disk rotation. So accessing sequential
blocks is faster than accessing randomly placed blocks by at least an order
of magnitude.
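A back-of-the-envelope calculation makes this order-of-magnitude gap concrete. The latency figures below are illustrative assumptions for a typical 7200 RPM disk, not numbers from the talk:

```python
# Rough cost model for random vs. sequential reads. All latency figures
# are illustrative assumptions for a 7200 RPM disk, not measured values.
SEEK_MS = 8.0                  # assumed average seek time
HALF_ROTATION_MS = 4.2         # assumed average rotational delay
TRANSFER_MS_PER_BLOCK = 0.05   # assumed streaming time per 4 KB block

def random_read_ms(blocks):
    # every randomly placed block pays a seek plus a rotational delay
    return blocks * (SEEK_MS + HALF_ROTATION_MS + TRANSFER_MS_PER_BLOCK)

def sequential_read_ms(blocks):
    # one positioning cost, then pure streaming transfer
    return SEEK_MS + HALF_ROTATION_MS + blocks * TRANSFER_MS_PER_BLOCK

ratio = random_read_ms(64) / sequential_read_ms(64)
print(f"random/sequential cost for 64 blocks: {ratio:.0f}x")
```

Under these assumed numbers, reading 64 randomly placed blocks costs roughly fifty times as much as one 64-block sequential read, consistent with the order-of-magnitude claim above.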
3. 0.5
This is the outline for the rest of the talk. First I will present my proposed
scheme that uses disk layout information in the buffer cache to improve disk
performance. I will show the inadequacy of current buffer cache
management in an OS. After describing how to efficiently manage disk
layout information, I'll present my proposed history-based prefetching and
miss-penalty-aware caching, followed by a performance evaluation of the
scheme in a Linux kernel implementation.
Next I will briefly introduce my proposed schemes on the coordination of
distributed caches to reduce I/O requests, including coordination of multi-level
caches in a hierarchy and cooperative management of caches in peer clients.
6. 1.75
To utilize disk layout information for buffer cache management, we need to answer two
questions before we design a new cache management scheme. The first question is which
disk layout information to use. The second question is how to efficiently manage the disk
layout information. The layout information that is interesting to us is the one that can help
locate sequentially accessed blocks. We use Logical block number (LBN) that describes
logical disk geometry provided by disk firmware. This is because disk manufacturers have
made every effort to ensure accessing of continuous LBNs has a performance close to
that of contiguous disk blocks. Another advantage of using LBN is that this interface is
easily available and highly portable across different platforms.
For the second question we know currently LBN is only used to identify disk locations for
7. Then the sequence X1..X4 is requested. The first block is fetched on
demand, and the following blocks are prefetched quickly without disk
seeking, so blocks X2, X3, and X4 are hits.
We assume that blocks are replaced from the bottom. LRU always puts
recently accessed blocks at the top. However, because random blocks are
more expensive in their
8. Then the Y sequence is accessed through prefetching. You can see that in
LRU the random blocks are replaced, while in the dual-locality policy these
blocks are retained in the cache.
10. Now the X sequence is requested. All the blocks in the request are hits
for LRU. Unfortunately, the dual-locality policy has just replaced them. But
the good thing is that these sequential blocks are cheap to reload with
another prefetch.
11. This time we want to re-access the random blocks. They are hits in the
dual-locality policy. However, the LRU policy has to perform four
time-consuming disk seeks and rotations to reload them. By considering the
different access costs of sequential and random blocks, the dual-locality
policy makes a big performance difference: reduce disk
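The cost asymmetry in this example can be captured with a tiny, hypothetical cost model: reloading an n-block contiguous sequence costs roughly one disk positioning, so long sequences are cheap eviction victims per block while random singletons are expensive. The function names and cost constant below are my illustration, not DULO's actual code:

```python
POSITION_COST_MS = 12.0  # assumed cost of one seek plus rotation

def reload_cost_per_block(seq_len):
    # an entire contiguous sequence is reloaded with one positioning
    return POSITION_COST_MS / seq_len

def pick_victim(candidates):
    """candidates: list of (name, sequence_length) cached units.
    Evict the unit that is cheapest to bring back per block."""
    return min(candidates, key=lambda c: reload_cost_per_block(c[1]))

# a 4-block sequential run competes with two random (length-1) blocks
victim = pick_victim([("X1-X4", 4), ("A", 1), ("B", 1)])
```

Here the sequence X1-X4 is chosen as the victim, mirroring the dual-locality behavior in the slides: sequences are evicted first because another prefetch restores them cheaply.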
12. Because LRU or its variants are the most widely used replacement
algorithms, we build the DULO scheme using the LRU algorithm and its
data structure, the LRU stack, as a reference point.
There are two key DULO operations: one is sequence forming. A sequence
is defined as a number of blocks whose disk
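The sequence-forming idea can be sketched as follows. I assume, consistent with the discussion above, that a sequence is a run of blocks with contiguous LBNs that were fetched together; the function and the timestamp threshold are illustrative, not DULO's actual code:

```python
# Hypothetical sketch of sequence forming: group blocks whose LBNs are
# consecutive and whose access times indicate they were fetched together.
def form_sequences(blocks):
    """blocks: list of (lbn, timestamp) pairs. Returns a list of
    sequences, each a list of blocks with contiguous LBNs."""
    if not blocks:
        return []
    blocks = sorted(blocks)  # order by LBN
    sequences, current = [], [blocks[0]]
    for prev, cur in zip(blocks, blocks[1:]):
        contiguous = cur[0] == prev[0] + 1
        fetched_together = abs(cur[1] - prev[1]) <= 1  # back-to-back access
        if contiguous and fetched_together:
            current.append(cur)
        else:
            sequences.append(current)
            current = [cur]
    sequences.append(current)
    return sequences

# three contiguous blocks form one sequence; LBN 500 stands alone
seqs = form_sequences([(100, 5), (101, 6), (102, 7), (500, 9)])
```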
13. 2
The disk block table is similar in structure to a multi-level page table in operating systems. Just as each process
has a page table, each disk has a block table.
While a page table is used for translating a page's virtual address into its physical address, we use the block
table to record and track the recent access times of a disk block through its LBN. In this illustrative example,
the block table has three levels, and each entry in a directory level corresponds to 512 entries in its next lower
level. Then LBN 5140 is mapped to this entry at the leaf level through directory entries 0 and 10. At the leaf
level of the table, a block can record up to two recent access times. Because we cannot afford to record exact
access times for each block, we let the system maintain a clock that ticks when a block on disk is accessed.
Assume the current clock time is 7 and this block has only one timestamp, which is 1. When this block is
accessed, it takes the current clock time as its most recent timestamp and records it in its corresponding
table entry. If the entry is full, the oldest timestamp is replaced. We also record the most recent timestamp at
the directory levels of the table, so timestamp 7 is also recorded in the entries at these two directory levels.
Using the block table, we can build an efficient algorithm for finding access sequences by comparing the
timestamps of neighboring leaf block entries.
You might be concerned about the space cost of the table as more and more blocks are added to it. Actually,
we only need to keep the disk working set in the table, and the table supports efficient space reclamation. We
know that an entry at a directory level records the largest timestamp among all those of the blocks under it.
When memory pressure is high and the system needs to reclaim some memory held by the table, we can
traverse the table with a threshold timestamp. When we see a directory entry whose timestamp is
smaller than the threshold, all the entries under it are removed. In this way, the space overhead can be
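The table walk for LBN 5140 and the threshold-based reclamation can be sketched as below. The three-level, 512-way structure and the two-timestamp leaves follow the description above; the Python dictionaries and names are my own illustration:

```python
# Illustrative sketch of the block table: a three-level radix tree over
# LBNs with 512-way fan-out. Leaf entries keep the two most recent clock
# times; directory entries keep the largest timestamp seen below them.
FANOUT = 512

class BlockTable:
    def __init__(self):
        self.clock = 0
        self.root = {}       # top directory: index -> (subdirs, times)
        self.root_time = {}  # top-level index -> newest timestamp below

    def access(self, lbn):
        self.clock += 1  # the clock ticks on every block access
        i0, rest = divmod(lbn, FANOUT * FANOUT)  # e.g. 5140 -> entry 0
        i1, i2 = divmod(rest, FANOUT)            # then entry 10, leaf 20
        subdirs, times = self.root.setdefault(i0, ({}, {}))
        leaf = subdirs.setdefault(i1, {})
        stamps = leaf.setdefault(i2, [])
        stamps.append(self.clock)
        if len(stamps) > 2:
            stamps.pop(0)  # entry full: replace the oldest timestamp
        # propagate the newest timestamp to both directory levels
        times[i1] = self.clock
        self.root_time[i0] = self.clock

    def reclaim(self, threshold):
        # drop every top-level subtree whose newest access is stale
        for i0 in [k for k, t in self.root_time.items() if t < threshold]:
            del self.root[i0]
            del self.root_time[i0]

bt = BlockTable()
for _ in range(3):
    bt.access(5140)  # reaches the leaf via directory entries 0 and 10
```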