A Fast File System for UNIX
    Marshall K. McKusick, William N. Joy, Samuel J. Leffler, and Robert S. Fabry

Transcript of "Fast File System"

  1. A Fast File System for UNIX
       Marshall K. McKusick, William N. Joy, Samuel J. Leffler, and Robert S. Fabry
       Slides by Aleatha Parker-Wood
       Tuesday, April 6, 2010
  2. State of the Art
       • Bell Labs UNIX file system for the PDP-11 (referred to as “old filesystem” or OldFS)
       • Disks are divided into physical partitions which contain a file system
       • Linked list of free blocks stored in superblock
       • Inodes point either directly to blocks or to indirect blocks
  3. Inode Layout in OldFS
       [Diagram: disk region with inodes at the start, followed by data]
       • All inodes are stored at the beginning of the disk region for the filesystem
           • Incurs long seek times for every access
       • Inodes for files are unlikely to be adjacent to their containing directory’s inodes or to each other
           • More seek time incurred
  4. Data Layout in OldFS
       • Completely agnostic to the physical storage device
       • Consecutive file blocks unlikely to be on the same cylinder
           • Even more seeking
       • 512-byte blocks (later increased to 1024 bytes)
           • Increasing the block size improved performance by a factor of 2
       • Ergo: room for improvement!
  5. Performance for OldFS
       • Old system used only about 4% of disk bandwidth
       • Performance good initially (175 KB/s), but degraded over time (30 KB/s)
       • Free list became increasingly disorganized as the file system was used
       • Blocks allocated in increasingly random locations
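
The free-list decay described on this slide can be illustrated with a toy model. This is only a sketch: it treats the free list as a plain LIFO stack of block numbers, which is a simplification of the real OldFS structure, and the class and method names are made up for illustration.

```python
class FreeList:
    """Toy LIFO free list: freed blocks go to the head and are reused first."""

    def __init__(self, nblocks):
        # Stored reversed so that pop() hands out block 0 first.
        self._free = list(range(nblocks - 1, -1, -1))

    def alloc(self, n):
        """Take n blocks from the head of the list."""
        return [self._free.pop() for _ in range(n)]

    def release(self, blocks):
        """Push freed blocks onto the head, in the order given."""
        self._free.extend(blocks)


fl = FreeList(8)
file_a = fl.alloc(4)   # fresh file system: consecutive blocks
file_b = fl.alloc(4)
fl.release(file_b)     # delete both files
fl.release(file_a)
file_c = fl.alloc(6)   # a new file now gets blocks in scrambled order
print(file_a, file_c)
```

On a fresh list the first file gets consecutive blocks; after a single round of deletes the next file's blocks are already out of order, which on a real disk means extra seeks.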
  6. The Fast File System (FFS)
       • Disk partitions divided into “cylinder groups”
       • 4K minimum block size
           • Ensures few levels of indirection (2 for files smaller than 4 GB)
       • Blocks are broken into fragments to accommodate small files
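
The "2 levels for files under 4 GB" claim can be checked with a little arithmetic. The sketch below assumes 12 direct block pointers per inode (consistent with the 48K threshold on a later slide) and 4-byte block pointers; both are assumptions for illustration, and the function name is hypothetical.

```python
def indirection_levels(file_size, block_size, n_direct=12, ptr_size=4):
    """Return how many levels of indirect blocks a file of file_size bytes needs.

    n_direct and ptr_size are assumed values, not taken from the slides.
    """
    ptrs_per_block = block_size // ptr_size
    reach = n_direct * block_size          # bytes addressable via direct pointers
    level = 0
    while file_size > reach:
        level += 1
        reach += (ptrs_per_block ** level) * block_size
    return level


print(indirection_levels(2**32 - 1, 4096))  # 4K blocks
print(indirection_levels(2**32 - 1, 1024))  # 1K blocks, as in OldFS
```

Under these assumptions a double indirect block with 4K blocks covers 1024 × 1024 × 4096 = 2^32 bytes, so two levels suffice, whereas 1K blocks would need a third level for the same file.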
  7. Cylinder Groups
       • Bookkeeping info stored for each cylinder group
           • Backup copy of superblock
           • Space for inodes
           • A bit map of free blocks/fragments
       • A static number of inodes allocated at creation time
       • Bookkeeping info stored at a varying offset for each group (so losing the top platter will not result in complete data loss)
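
A per-group bit map of free blocks and fragments might look roughly like the sketch below. It uses one flag per fragment and a simple first-fit search; the class name, method names, and search strategy are assumptions for illustration, not the actual FFS allocator.

```python
class CylinderGroupMap:
    """Toy free map for one cylinder group: one boolean per fragment."""

    def __init__(self, nblocks, frags_per_block):
        self.fpb = frags_per_block
        self.free = [True] * (nblocks * frags_per_block)

    def block_is_free(self, b):
        """A block is free only if every fragment in it is free."""
        start = b * self.fpb
        return all(self.free[start:start + self.fpb])

    def alloc_block(self, preferred=0):
        """Return the first wholly free block at or after `preferred` (wrapping)."""
        nblocks = len(self.free) // self.fpb
        for i in range(nblocks):
            b = (preferred + i) % nblocks
            if self.block_is_free(b):
                start = b * self.fpb
                self.free[start:start + self.fpb] = [False] * self.fpb
                return b
        return None

    def alloc_frags(self, count):
        """Return the first run of `count` free fragments inside a single block."""
        for block_start in range(0, len(self.free), self.fpb):
            for off in range(self.fpb - count + 1):
                s = block_start + off
                if all(self.free[s:s + count]):
                    self.free[s:s + count] = [False] * count
                    return s
        return None
```

Unlike a linked free list, a map like this lets the allocator ask for a block near a preferred position, which is what the layout policies on the later slides rely on.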
  8. Fragments
       • 2, 4, or 8 fragments per block (minimum fragment size is a disk sector, 512 bytes)
       • Files never use more than one fragmented block
       • Writing to a file which occupies a fragmented block either fills the current block (if room is available) or allocates a new block
       • Expanding files a fragment at a time causes frequent copying, so writing in full blocks is optimal
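
To see why fragments reduce wasted space, here is a minimal sketch that computes how a file would be stored as full blocks plus trailing fragments. The 4096-byte block and 1024-byte fragment sizes are example parameters, and the function name is hypothetical.

```python
def space_for(size, block=4096, frag=1024):
    """Return (full_blocks, trailing_fragments, wasted_bytes) for a file of `size` bytes."""
    full, rem = divmod(size, block)
    frags = -(-rem // frag)              # ceiling division
    if frags == block // frag:           # a full set of fragments is just a block
        full, frags = full + 1, 0
    allocated = full * block + frags * frag
    return full, frags, allocated - size


print(space_for(11000))
```

For an 11000-byte file this gives 2 full blocks plus 3 fragments with 264 bytes wasted, compared with 1288 bytes wasted if the tail had to occupy a whole third block.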
  9. Layout Optimizations
       • Optimize for the processor and mass storage device (usually disk)
       • Cylinder aware
       • Chooses rotationally optimal blocks (either consecutive or delayed)
       • Stores rotational layout tables to find positions with data already written nearby
       • Trade-off between localizing data references and spreading unrelated data across cylinder groups
  10. Layout Policies: Inodes
       • Inodes of files in a directory often accessed together
           • For instance, ls reads every inode in the directory
       • Keep inodes in same cylinder group
       • When creating new directories, choose cylinder group with few current inodes and directories
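
One plausible reading of the directory placement rule is sketched below: among cylinder groups with at least the average number of free inodes, pick the one holding the fewest directories. The exact rule, the `Group` record, and the function name are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Group:
    index: int        # cylinder group number
    free_inodes: int  # unused inodes in this group
    ndirs: int        # directories already placed in this group


def pick_group_for_directory(groups):
    """Prefer groups with plenty of free inodes, then the fewest directories."""
    avg_free = sum(g.free_inodes for g in groups) / len(groups)
    candidates = [g for g in groups if g.free_inodes >= avg_free]
    return min(candidates, key=lambda g: g.ndirs).index


groups = [Group(0, 100, 5), Group(1, 400, 3), Group(2, 500, 7), Group(3, 50, 0)]
print(pick_group_for_directory(groups))
```

Spreading new directories this way leaves room in each group for the files that will later be created inside them, so their inodes can stay together.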
  11. Layout Policies: Data Blocks
       • Place all data blocks for a file within the same cylinder group
       • Preferably at rotationally optimal placements
       • If file is greater than 48K (i.e., an indirect block is needed), move to new cylinder group (you had to seek anyway...)
       • Likewise for every MB thereafter
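
The 48K and per-megabyte thresholds can be expressed as a small function that counts how many times allocation has been redirected to a new cylinder group by the time a given byte offset is written. This is a sketch of the rule as stated on the slide; the function name is hypothetical, and how the next group is chosen is left out.

```python
KB = 1024
MB = 1024 * KB


def group_switches(offset, first_limit=48 * KB, step=MB):
    """Number of cylinder group changes before the byte at `offset` is allocated."""
    if offset < first_limit:
        return 0
    return 1 + (offset - first_limit) // step


for off in (0, 48 * KB - 1, 48 * KB, 48 * KB + MB):
    print(off, group_switches(off))
```

The first 48K stays with the file's inode; after that, a large file is spread over several groups so it cannot consume all the space in any single one.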
  12. So when you say “Fast” File System...
  13. Read Throughput

       Type             Processor/Bus   Speed (KB/s)   Max read bandwidth (KB/s)   % of max   % CPU
       Old 1024         750/UNIBUS        29           983                           3         11
       New 4096/1024    750/UNIBUS       221           983                          22         43
       New 8192/1024    750/UNIBUS       233           983                          24         29
       New 4096/1024    750/MASSBUS      466           983                          47         73
       New 8192/1024    750/MASSBUS      466           983                          47         54
  14. Write Throughput

       Type             Processor/Bus   Speed (KB/s)   Max write bandwidth (KB/s)   % of max   % CPU
       Old 1024         750/UNIBUS        48           983                            5         29
       New 4096/1024    750/UNIBUS       142           983                           14         43
       New 8192/1024    750/UNIBUS       215           983                           22         46
       New 4096/1024    750/MASSBUS      323           983                           33         94
       New 8192/1024    750/MASSBUS      466           983                           47         95
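
The percentage columns in the two throughput tables follow directly from the speed and maximum bandwidth figures. The sketch below recomputes them and the speedup over the old file system, using only the numbers from the slides; the variable and function names are illustrative.

```python
MAX_BANDWIDTH = 983  # maximum disk bandwidth from the tables

# (file system type, bus) -> measured speed, copied from the slides
READ = {
    ("Old 1024", "UNIBUS"): 29,
    ("New 4096/1024", "UNIBUS"): 221,
    ("New 8192/1024", "UNIBUS"): 233,
    ("New 4096/1024", "MASSBUS"): 466,
    ("New 8192/1024", "MASSBUS"): 466,
}
WRITE = {
    ("Old 1024", "UNIBUS"): 48,
    ("New 4096/1024", "UNIBUS"): 142,
    ("New 8192/1024", "UNIBUS"): 215,
    ("New 4096/1024", "MASSBUS"): 323,
    ("New 8192/1024", "MASSBUS"): 466,
}


def percent_of_max(speed):
    """Share of the maximum bandwidth, rounded to a whole percent."""
    return round(100 * speed / MAX_BANDWIDTH)


def speedup(table, key):
    """Speed of one configuration relative to the old file system."""
    return table[key] / table[("Old 1024", "UNIBUS")]


read_pct = [percent_of_max(s) for s in READ.values()]
write_pct = [percent_of_max(s) for s in WRITE.values()]
print(read_pct, write_pct)
print(round(speedup(READ, ("New 8192/1024", "MASSBUS")), 1))
print(round(speedup(WRITE, ("New 8192/1024", "MASSBUS")), 1))
```

The recomputed percentages match the tables, and the best configuration comes out roughly 16 times faster than the old file system for reads and just under 10 times faster for writes.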
  15. Other metrics...
       • When running ls on large directories containing other directories, disk accesses for inodes are cut by a factor of two
       • For large directories containing only files, disk accesses for inodes are cut by up to a factor of eight
       • Transfer rates stable over time
       • Throughput varies with amount of free space maintained (reduced by half when the file system is full)
  16. Other Enhancements
       • Arbitrary length file names (ok, up to 255 characters)
       • Advisory file locking
           • Shared or exclusive
           • Applied or removed only on open files
       • Symbolic links, à la Multics
       • Atomic rename operation
       • Quotas
  17. Conclusions
       • Taking advantage of disk geometry and access patterns resulted in improvements of up to roughly 10-fold in both read and write throughput
       • Improvements in block layout increased locality while reducing wasted space
       • Hardware matters!
  18. Thank you. Questions?
