A Fast File System for UNIX
    Marshall K. McKusick, William N. Joy, Samuel J. Leffler, and Robert S. Fabry

    Slides by Aleatha Parker-Wood




Tuesday, April 6, 2010
State of the Art


    •    Bell Labs UNIX file system for the PDP-11 (referred to as “old
         filesystem” or OldFS)

    •    Disks are divided into physical partitions which contain a file system

    •    Linked list of free blocks stored in superblock

    •    inodes point either directly to blocks or to indirect blocks




Inode Layout in OldFS

    [Figure: the disk region, with all inodes at the start followed by the data blocks]




•    All inodes are stored at the beginning of the disk region for the filesystem

      •    Incurs long seek times for every access

•    inodes for files are unlikely to be adjacent to their containing directory’s
     inodes or to each other

      •    More seek time incurred

Data Layout in OldFS

    •    Completely agnostic to physical storage device

    •    Consecutive file blocks unlikely to be on the same cylinder

          •     Even more seeking

    •    512 byte blocks (increased to 1024 bytes)

          •     Increasing the block size improved performance by a factor of 2

          •     Ergo: room for improvement!


Performance for OldFS


    •    Old system using 4% of disk bandwidth

    •    Performance good initially (175 KB/s), but degraded over time
         (30 KB/s)

    •    Free list became increasingly disorganized as file system was used...

    •    Blocks allocated in increasingly random locations
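The degradation described above can be illustrated with a toy model. This is a sketch only, not the actual kernel allocator: it assumes a free list that hands out blocks from its head and receives freed blocks at its head, and the names `simulate` and `mean_gap` and the sizes used are hypothetical.

```python
import random

def mean_gap(blocks):
    """Average distance between consecutive block numbers of one file."""
    gaps = [abs(b - a) for a, b in zip(blocks, blocks[1:])]
    return sum(gaps) / len(gaps)

def simulate(num_blocks=1000, file_blocks=10, seed=0):
    rng = random.Random(seed)

    # Fresh file system: the free list is in disk order, so a new file
    # receives consecutive blocks.
    free_list = list(range(num_blocks))
    fresh_file = [free_list.pop(0) for _ in range(file_blocks)]

    # Age the file system (assumed model): every block gets used, then
    # blocks are freed in random order and pushed onto the list head.
    in_use = fresh_file + free_list
    rng.shuffle(in_use)
    free_list = []
    for block in in_use:
        free_list.insert(0, block)

    # A file created now receives whatever happens to be at the head.
    aged_file = [free_list.pop(0) for _ in range(file_blocks)]
    return mean_gap(fresh_file), mean_gap(aged_file)

fresh_gap, aged_gap = simulate()
print(f"fresh: {fresh_gap:.1f} blocks apart, aged: {aged_gap:.1f} blocks apart")
```

In this model a fresh file's blocks are adjacent, while an aged file's blocks are scattered across the whole region, which on a real disk means a seek before most reads.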




The Fast File System (FFS)



    •    Disk partitions divided into “cylinder groups”

    •    4K minimum block size

          •     ensures few levels of indirection (2 suffice for files up to 4 GB)

    •    Blocks are broken into fragments to accommodate small files
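The block-size claim can be checked with a little arithmetic. The sketch below assumes 12 direct pointers per inode (consistent with the 48K figure on the data block policy slide) and 4-byte block pointers; both values and the function name `max_file_size` are assumptions for illustration.

```python
def max_file_size(block_size, levels, direct_pointers=12, pointer_size=4):
    """Largest file addressable with the given number of indirection levels."""
    pointers_per_block = block_size // pointer_size
    blocks = direct_pointers
    for level in range(1, levels + 1):
        # Each extra level multiplies the reachable blocks by pointers_per_block.
        blocks += pointers_per_block ** level
    return blocks * block_size

for block_size in (512, 1024, 4096):
    size = max_file_size(block_size, levels=2)
    print(f"{block_size:5d}-byte blocks: {size:>13,d} bytes, "
          f"covers 4 GB: {size >= 2**32}")
```

Under these assumptions only the 4096-byte block size reaches 2^32 bytes with two levels of indirection; smaller blocks would need a third level.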



Cylinder Groups


    •    Bookkeeping info stored for each cylinder group

          •     Backup copy of superblock
          •     Space for inodes
          •     A bit map of free blocks/fragments
          •     A static number of inodes allocated at creation time

    •    Bookkeeping info stored at a varying offset for each group (so losing
         the top platter will not result in complete data loss)
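One way to picture the varying offset is to start each group's bookkeeping info one track later than the previous group's, so that it rotates through the platter surfaces. The formula below is an assumed simplification for illustration, not the actual on-disk layout, and `bookkeeping_offset` and its parameters are hypothetical names.

```python
def bookkeeping_offset(group_index, sectors_per_track, tracks_per_cylinder):
    """Sector offset of the bookkeeping info from the start of its group.

    Assumed rule: group i starts its info on track (i mod tracks_per_cylinder),
    so consecutive groups use different platter surfaces.
    """
    track = group_index % tracks_per_cylinder
    return track * sectors_per_track

for group in range(6):
    offset = bookkeeping_offset(group, sectors_per_track=32, tracks_per_cylinder=4)
    print(f"group {group}: info starts {offset} sectors in (track {offset // 32})")
```

With four tracks per cylinder, only every fourth group keeps its bookkeeping info on the top surface, so losing that platter leaves the superblock copies in the other groups readable.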



Fragments


    •    2, 4, or 8 fragments per block (minimum size is a disk sector, 512 bytes)

    •    Files never use more than one fragmented block

    •    Writing to a file which occupies a fragmented block either fills the
         current block (if room is available) or allocates a new block.

    •    Expanding files a fragment at a time causes frequent copying; writing
         in full blocks is optimal.
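The space accounting behind fragments can be sketched as follows. The function `allocation` and its default sizes (4096-byte blocks, 1024-byte fragments) are illustrative assumptions; the rule it encodes is the one above: full blocks for the body of the file, and at most one partially used block for the tail.

```python
def allocation(file_size, block_size=4096, fragment_size=1024):
    """Return (full_blocks, fragments, wasted_bytes) for a file of file_size bytes."""
    if block_size % fragment_size or block_size // fragment_size not in (2, 4, 8):
        raise ValueError("a block must split into 2, 4, or 8 fragments")
    full_blocks, tail = divmod(file_size, block_size)
    fragments = -(-tail // fragment_size)  # ceiling division
    if fragments == block_size // fragment_size:
        # A tail that needs every fragment simply occupies a whole block.
        full_blocks, fragments = full_blocks + 1, 0
    used = full_blocks * block_size + fragments * fragment_size
    return full_blocks, fragments, used - file_size

for size in (100, 4000, 11000):
    blocks, frags, waste = allocation(size)
    print(f"{size:6d} bytes -> {blocks} blocks + {frags} fragments, {waste} bytes wasted")
```

For the 11000-byte file, whole-block allocation would occupy three 4096-byte blocks and waste 1288 bytes; with fragments the waste in this sketch drops to 264 bytes.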



Layout Optimizations

    •    Optimize for the processor and mass storage device (usually disk)

    •    Cylinder aware

    •    Chooses rotationally optimal blocks (either consecutive or delayed)

    •    Stores rotational layout tables to find positions with data already
         written nearby

    •    Trade off between localizing data references and spreading unrelated
         data across cylinder groups.
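Whether the rotationally optimal block is the consecutive one or a delayed one depends on how far the platter turns while the processor sets up the next transfer. A rough sketch, assuming a fixed rotation speed and a fixed processor delay (the name `blocks_to_skip` and the example numbers are hypothetical):

```python
import math

def blocks_to_skip(rpm, blocks_per_track, cpu_delay_ms):
    """Blocks that pass under the head while the next transfer is set up."""
    ms_per_rotation = 60_000 / rpm
    ms_per_block = ms_per_rotation / blocks_per_track
    return math.ceil(cpu_delay_ms / ms_per_block)

# Assumed example: 3600 RPM disk, 8 blocks per track.
print(blocks_to_skip(3600, 8, cpu_delay_ms=0))  # no delay: next block is consecutive
print(blocks_to_skip(3600, 8, cpu_delay_ms=4))  # 4 ms delay: leave a gap
```

A result of zero corresponds to the "consecutive" case; any other value is the gap to leave so the next block arrives under the head just as the processor is ready, rather than a full rotation later.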


Layout Policies: Inodes



    •    Inodes of files in a directory often accessed together

          •     For instance, ls reads every inode in the directory

    •    Keep inodes in same cylinder group

    •    When creating new directories, choose cylinder group with few
         current inodes and directories
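The directory placement rule can be sketched as a small selection function. This is one reading of the policy, assuming "few current inodes" means an above-average count of free inodes and that ties are broken by the fewest existing directories; `CylinderGroup` and `pick_group_for_directory` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class CylinderGroup:
    index: int
    free_inodes: int
    directories: int

def pick_group_for_directory(groups):
    """Pick a lightly used group for a new directory (assumed policy)."""
    average_free = sum(g.free_inodes for g in groups) / len(groups)
    # At least one group is always at or above the average.
    candidates = [g for g in groups if g.free_inodes >= average_free]
    return min(candidates, key=lambda g: g.directories)

groups = [
    CylinderGroup(0, free_inodes=10, directories=5),
    CylinderGroup(1, free_inodes=50, directories=9),
    CylinderGroup(2, free_inodes=40, directories=3),
    CylinderGroup(3, free_inodes=5, directories=1),
]
print(pick_group_for_directory(groups).index)
```

Group 3 has the fewest directories but almost no free inodes, so it is skipped; files later created in the new directory need room for their inodes in the same group.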


Layout Policies: Data Blocks


    •    Place all data blocks for a file within the same cylinder group

    •    Preferably at rotationally optimal placements

    •    If file is greater than 48K (i.e., an indirect block is needed), move to
         new cylinder group (you had to seek anyway...)

    •    Likewise for every MB thereafter
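The points at which a growing file is redirected to another cylinder group follow directly from the two numbers above. A minimal sketch, assuming the 1 MB interval is counted from the 48K mark (the function name and that interpretation are assumptions):

```python
def group_switch_offsets(file_size, first_switch=48 * 1024, interval=1024 * 1024):
    """Byte offsets at which allocation moves to a different cylinder group."""
    offsets = []
    offset = first_switch
    while offset < file_size:
        offsets.append(offset)
        offset += interval
    return offsets

print(group_switch_offsets(40 * 1024))        # small file: stays in one group
print(group_switch_offsets(3 * 1024 * 1024))  # 3 MB file: spread over several groups
```

Small files never leave their directory's group, while a large file is spread out so that it cannot fill a single group and crowd out everything else stored there.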




So when you say “Fast” File System....




Read Throughput
    Type             Processor/Bus   Speed (KB/s)   Max read bandwidth (KB/s)   % of bandwidth   % CPU

    Old 1024         750/UNIBUS            29                  983                     3            11
    New 4096/1024    750/UNIBUS           221                  983                    22            43
    New 8192/1024    750/UNIBUS           233                  983                    24            29
    New 4096/1024    750/MASSBUS          466                  983                    47            73
    New 8192/1024    750/MASSBUS          466                  983                    47            54

Write Throughput
    Type             Processor/Bus   Speed (KB/s)   Max write bandwidth (KB/s)   % of bandwidth   % CPU

    Old 1024         750/UNIBUS            48                  983                     5            29
    New 4096/1024    750/UNIBUS           142                  983                    14            43
    New 8192/1024    750/UNIBUS           215                  983                    22            46
    New 4096/1024    750/MASSBUS          323                  983                    33            94
    New 8192/1024    750/MASSBUS          466                  983                    47            95
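The "%" column in both tables is the measured speed as a share of the 983 maximum. The short check below recomputes it from the Speed column, along with the best-case speedup over the old file system. The dictionary labels and helper names are illustrative; the numbers are taken from the tables.

```python
MAX_BANDWIDTH = 983  # same units as the Speed column

READ_SPEED = {
    "Old 1024, UNIBUS": 29,
    "New 4096/1024, UNIBUS": 221,
    "New 8192/1024, UNIBUS": 233,
    "New 4096/1024, MASSBUS": 466,
    "New 8192/1024, MASSBUS": 466,
}
WRITE_SPEED = {
    "Old 1024, UNIBUS": 48,
    "New 4096/1024, UNIBUS": 142,
    "New 8192/1024, UNIBUS": 215,
    "New 4096/1024, MASSBUS": 323,
    "New 8192/1024, MASSBUS": 466,
}

def percent_of_bandwidth(speed):
    """Speed as a rounded percentage of the maximum bandwidth."""
    return round(100 * speed / MAX_BANDWIDTH)

def speedup(speeds):
    """Best new-file-system speed relative to the old file system."""
    return max(speeds.values()) / speeds["Old 1024, UNIBUS"]

for name, speeds in (("read", READ_SPEED), ("write", WRITE_SPEED)):
    for config, speed in speeds.items():
        print(f"{name:5s} {config:24s} {percent_of_bandwidth(speed):3d}%")
    print(f"{name:5s} best speedup over old: {speedup(speeds):.1f}x")
```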

Other metrics...


    •    When running ls for large directories containing other directories,
         disk accesses for inodes cut in half

    •    For large directories containing only files, disk accesses for inodes
         cut by up to a factor of eight

    •    Transfer rates stable over time

    •    Throughput varies with amount of free space maintained (reduced by
         half when system is full)



Other Enhancements

    •    Arbitrary length file names (ok, 255 characters)

    •    Advisory file locking

          •     Shared or exclusive

          •     Applied or removed only on open files

    •    Symbolic links, a la Multics

    •    Atomic rename operation

    •    Quotas
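Advisory locks and atomic rename are still exposed to programs today. The sketch below uses Python's `fcntl.flock` and `os.rename` on a Unix-like system to show both; the helper names `try_exclusive_lock` and `demo` and the file names are hypothetical, and this is an illustration of the interfaces rather than of the original kernel code.

```python
import fcntl
import os
import tempfile

def try_exclusive_lock(path):
    """Open path and try to take an exclusive advisory lock without blocking."""
    handle = open(path, "a")  # locks apply only to open files
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle

def demo():
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "data.txt")

        first = try_exclusive_lock(path)   # takes the lock
        second = try_exclusive_lock(path)  # refused while the lock is held
        first_locked = first is not None
        second_refused = second is None

        fcntl.flock(first, fcntl.LOCK_UN)  # release explicitly
        first.close()

        # Atomic rename: readers see either the old file or the new one,
        # never a partially written file.
        temporary = path + ".tmp"
        with open(temporary, "w") as out:
            out.write("new contents\n")
        os.rename(temporary, path)
        with open(path) as result:
            contents = result.read()
    return first_locked, second_refused, contents

print(demo())
```

The locks are advisory: a process that never calls `flock` can still read or write the file, so cooperating programs must all follow the protocol.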

Conclusions


    •    Taking advantage of disk geometry and access patterns resulted in
         10-fold improvement in both read and write throughput

    •    Improvements in block layout increased locality while reducing
         wasted space

    •    Hardware matters!




Thank you. Questions?



