The ext filesystem
A four-generation retrospective
Presented by –
Neha Kulkarni (5202)
ME Computer
Pune Institute of Computer Technology
Outline
• Linux file system
• Virtual file system
• Ext file system
• Ext2 file system
• Ext3 file system
• Ext4 file system
• Comparison
• Conclusion
• References
Linux filesystem
• A file system is an abstraction that supports the
creation, deletion, and modification of files, and
organization of files into directories.
• Linux follows the Filesystem Hierarchy Standard
(FHS).
• Modern UNIX systems allow disks to be
partitioned into two or more separate logical
entities, each containing a distinct file system.
• Modern UNIX kernels provide a virtual file system
layer, designed to handle many different types of
underlying physical file systems.
In the FHS, all files and directories appear under the
root directory /, even if they are stored on different
physical or virtual devices
• The superblock contains all of the information
about how the file system is configured, such
as block size, block address range, and mount
status.
• The i-nodes contain the file attributes and a
map indicating where the blocks of the file are
located on the disk. Each i-node is 128 bytes by default in ext2.
• The data blocks are where file contents are
stored.
• I-node number 2 is conventionally the i-node of
the root directory of the file system.
inode pointer structure
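The classic ext2/ext3 inode pointer structure has 12 direct block pointers followed by one single-, one double-, and one triple-indirect pointer. A minimal sketch of how a logical block index maps onto those levels, assuming 4-byte block pointers; the function name `indirection_level` is hypothetical and not taken from the kernel:

```c
#include <stdint.h>

/* Number of direct block pointers in an ext2/ext3 inode (i_block[0..11]). */
#define DIRECT_BLOCKS 12u

/*
 * Return how many levels of indirection are needed to reach logical block
 * `n` of a file: 0 = direct, 1 = single, 2 = double, 3 = triple indirect,
 * -1 = beyond what the pointer structure can address.
 * Assumes 4-byte block pointers, so one block holds block_size / 4 of them.
 */
int indirection_level(uint64_t n, uint32_t block_size)
{
    uint64_t per_block = block_size / 4;

    if (n < DIRECT_BLOCKS)
        return 0;
    n -= DIRECT_BLOCKS;
    if (n < per_block)
        return 1;
    n -= per_block;
    if (n < per_block * per_block)
        return 2;
    n -= per_block * per_block;
    if (n < per_block * per_block * per_block)
        return 3;
    return -1;
}
```

With 1024-byte blocks, for example, blocks 0 to 11 are direct and the next 256 blocks go through the single-indirect block.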
Linux virtual file system
ext filesystem
• Implemented in April 1992 as the first file system
specifically for the Linux kernel
• It was the first implementation that used the
virtual file system, for which support was added
in the Linux kernel in version 0.96c, and it could
handle file systems up to 2GB in size.
• Filenames were allowed up to 255 characters.
• Only one timestamp per file; no separate access,
inode-change, and data-modification timestamps
• Free space was tracked with linked lists, which
caused fragmentation
Inode description for ext
ext2 filesystem
• Designed by Remy Card in January 1993
• Designed and implemented to fix some problems
present in the first extended file system.
• Maximum file size of 2GB and maximum file system size of 4TB in the original design
• Provides long file names up to 255 characters
• Reserves around 5% of the blocks for super-user
• User can set attributes on files or directories
• Allows the administrator to choose the logical
block size when creating the file system (1024B,
2048B, 4096B)
• Implements fast symbolic links: targets of up to 60
characters are stored in the inode itself
• Keeps track of the file system state
• Immutable files can only be read. Nobody can
write or delete them.
• Directories are managed as linked lists of
variable length entries
• Data modification timestamps were available
• ext2 partitions can be converted to ext3 and vice
versa without backing up the data or
repartitioning
Bitmaps
• The superblock is defined in struct ext2_super_block, line 339
of include/linux/ext2_fs.h
• A group descriptor is defined by the ext2_group_desc structure,
line 148 of ext2_fs.h
• The block number of the first block of the inode table is stored in the
bg_inode_table field of the group descriptor.
• i_mode determines the type and access rights of a file. The macros
that decode it are defined in sys/stat.h
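Given bg_inode_table, finding an inode on disk is a matter of arithmetic. A minimal sketch, assuming inode numbers start at 1 and that the inodes-per-group count and inode size have already been read from the superblock; the struct and function names are hypothetical:

```c
#include <stdint.h>

/* Where an inode lives: which block group, and which slot in its table. */
struct inode_location {
    uint32_t group;   /* block group that holds the inode */
    uint32_t index;   /* index inside that group's inode table */
};

/* Inode numbers start at 1, hence the "ino - 1". */
struct inode_location locate_inode(uint32_t ino, uint32_t inodes_per_group)
{
    struct inode_location loc;

    loc.group = (ino - 1) / inodes_per_group;
    loc.index = (ino - 1) % inodes_per_group;
    return loc;
}

/*
 * Byte offset of the inode on the device, given the group's bg_inode_table
 * value (a block number), the block size, and the on-disk inode size.
 */
uint64_t inode_byte_offset(uint32_t inode_table_block, uint32_t block_size,
                           uint32_t index, uint32_t inode_size)
{
    return (uint64_t)inode_table_block * block_size +
           (uint64_t)index * inode_size;
}
```

For the root directory (inode 2) this gives group 0, index 1.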
Directory inode
Directory inodes in the inode table require special attention. To test if an inode refers
to a directory file we can use the S_ISDIR(mode) macro:
if (S_ISDIR(inode.i_mode)) { /* the inode describes a directory */ }
For a directory, the data blocks pointed to by i_block[] contain a list of
the files in the directory and their respective inode numbers.
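A sketch of scanning one directory data block for a name. It assumes the ext2_dir_entry_2 layout (4-byte inode, 2-byte rec_len, 1-byte name_len, 1-byte file_type, then the name, all little-endian) and that the block is already in memory; `dir_find` and `dir_put` are hypothetical helpers, the latter only there to build sample data:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DIRENT_HEADER 8u /* inode (4) + rec_len (2) + name_len (1) + file_type (1) */

/* On-disk fields are little-endian; read them byte by byte to stay portable. */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return the inode number recorded for `name`, or 0 if it is not present. */
uint32_t dir_find(const unsigned char *block, size_t size, const char *name)
{
    size_t want = strlen(name);
    size_t off = 0;

    while (off + DIRENT_HEADER <= size) {
        uint32_t ino = rd32(block + off);
        uint16_t rec_len = rd16(block + off + 4);
        uint8_t name_len = block[off + 6];

        if (rec_len < DIRENT_HEADER || off + rec_len > size)
            break; /* corrupt entry: stop rather than run off the block */
        if (ino != 0 && name_len == want &&
            DIRENT_HEADER + name_len <= rec_len &&
            memcmp(block + off + DIRENT_HEADER, name, want) == 0)
            return ino;
        off += rec_len; /* rec_len links this entry to the next one */
    }
    return 0;
}

/* Demo helper: write one entry at `off` and return the offset after it. */
size_t dir_put(unsigned char *block, size_t off, uint32_t ino,
               uint16_t rec_len, const char *name)
{
    size_t len = strlen(name);
    unsigned char *p = block + off;

    p[0] = ino & 0xff;
    p[1] = (ino >> 8) & 0xff;
    p[2] = (ino >> 16) & 0xff;
    p[3] = (ino >> 24) & 0xff;
    p[4] = rec_len & 0xff;
    p[5] = (rec_len >> 8) & 0xff;
    p[6] = (unsigned char)len;
    p[7] = 0; /* file_type: unknown */
    memcpy(p + DIRENT_HEADER, name, len);
    return off + rec_len;
}
```

The rec_len field is what makes the entries a linked list of variable-length records: each entry says how far away the next one starts.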
Locating a file
int fd = open("/home/ealtieri/hello.txt", O_RDONLY);
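The desired file is hello.txt, while its path is /home/ealtieri/hello.txt.
The path is resolved one component at a time, starting from the root directory inode:
home, then ealtieri, then hello.txt.
Once the inode of the file is known, the data blocks belonging to hello.txt are specified
by the inode's i_block[] array.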
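The lookup can be sketched as a loop over path components. This is an illustration only: `resolve_path`, `lookup_fn`, and the toy directory table (with made-up inode numbers) are all hypothetical stand-ins for reading real directory blocks:

```c
#include <stdint.h>
#include <string.h>

#define ROOT_INO 2u /* inode number of the root directory */

/* Look up `name` (length `len`) in directory `dir_ino`; 0 means not found. */
typedef uint32_t (*lookup_fn)(uint32_t dir_ino, const char *name, size_t len);

/* Walk an absolute path one component at a time, starting at the root. */
uint32_t resolve_path(const char *path, lookup_fn lookup)
{
    uint32_t ino = ROOT_INO;

    while (*path != '\0') {
        size_t len;

        while (*path == '/')
            path++;                   /* skip separators */
        len = strcspn(path, "/");     /* length of the next component */
        if (len == 0)
            break;                    /* trailing slash: nothing left */
        ino = lookup(ino, path, len);
        if (ino == 0)
            return 0;                 /* component does not exist */
        path += len;
    }
    return ino;
}

/* Toy directory tree standing in for on-disk directory blocks. */
struct toy_entry {
    uint32_t dir;
    const char *name;
    uint32_t ino;
};

static const struct toy_entry toy_tree[] = {
    { 2,  "home",      11 },
    { 11, "ealtieri",  12 },
    { 12, "hello.txt", 13 },
};

uint32_t toy_lookup(uint32_t dir_ino, const char *name, size_t len)
{
    size_t i;

    for (i = 0; i < sizeof toy_tree / sizeof toy_tree[0]; i++) {
        const struct toy_entry *e = &toy_tree[i];

        if (e->dir == dir_ino && strlen(e->name) == len &&
            memcmp(e->name, name, len) == 0)
            return e->ino;
    }
    return 0;
}
```

Each step costs one directory search, so a path with three components needs three lookups before the file's own inode is reached.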
ext3 filesystem
• ext3, or third extended filesystem, is a
journalled filesystem.
• The filesystem was merged into the mainline
Linux kernel in November 2001, from version 2.4.15
onward.
• Its main advantage over ext2 is journaling,
which improves reliability and eliminates the
need to check the file system after an unclean
shutdown.
• ext3 adds the following features to ext2:
- A journal
- Online file system growth
- HTree indexing for larger directories (32-bit
hashes)
• Maximum file size is 16GB-2TB, depending on block size
• Maximum file system size is 2TB-32TB, depending on block size
• The max number of blocks for ext3 is 2^32
• Journalling levels:
– Journal (Lowest risk) -
Both metadata and file contents are written to the
journal before being committed to the main file
system.
– Ordered (medium risk) -
Only metadata is journalled, file contents are written
to the disk before associated metadata is marked as
committed in the journal.
– Writeback (highest risk) -
Only metadata is journalled, file contents are not.
Contents might be written before or after the journal
is updated.
Block allocator
• When allocating a block for a file, the Ext3 block
allocator always starts from the block group
where the i-node structure is stored to keep the
meta-data and data blocks close to each other
• In case of multiple files allocating blocks
concurrently, the Ext3 block allocator uses block
reservation to make sure that subsequent
requests for blocks for a particular file get served
before it is interleaved with other files.
• Ext3 still uses the bitmap to search for the free
blocks to reserve
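The bitmap search itself can be sketched as follows. This is not the ext3 allocator (which adds reservation windows on top); it only shows a scan for the first clear bit starting from a goal block, assuming bit i of the bitmap lives in byte i / 8 with the least significant bit first. The function names are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* 1 if block `bit` is marked in use, 0 if it is free. */
int bit_is_set(const uint8_t *bitmap, size_t bit)
{
    return (bitmap[bit / 8] >> (bit % 8)) & 1;
}

/*
 * Search for a free block starting at `goal` and wrapping around the group.
 * Returns the bit index of the first free block, or `nbits` if none is free.
 */
size_t find_free_block(const uint8_t *bitmap, size_t nbits, size_t goal)
{
    size_t i;

    for (i = 0; i < nbits; i++) {
        size_t bit = (goal + i) % nbits;

        if (!bit_is_set(bitmap, bit))
            return bit;
    }
    return nbits;
}
```

Starting the scan at a goal near the file's existing blocks is what keeps data close to its metadata.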
Disadvantages of ext3
• Ext3 lacks recent features such as extents, dynamic
allocation of inodes, block sub-allocation.
• A directory can have at the most 31998 subdirectories.
• No online defragmentation tool
• Does not support the recovery of deleted files
• Does not have native support for snapshots
• No check-summing while writing the journal
• Ext3 stores timestamps as Unix time in four bytes in
the inode. A signed 32-bit value cannot represent
times after January 19, 2038.
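The 2038 limit can be checked with a few lines of C. A small sketch, assuming the platform's time_t can hold INT32_MAX; the function name is hypothetical:

```c
#include <stdint.h>
#include <time.h>

/*
 * Break down the largest value of a signed 32-bit seconds counter
 * (2^31 - 1 seconds after 1970-01-01 00:00:00 UTC) into UTC calendar fields.
 */
struct tm last_32bit_second(void)
{
    time_t t = (time_t)INT32_MAX;
    struct tm out = {0};
    struct tm *utc = gmtime(&t);

    if (utc != NULL)
        out = *utc;
    return out;
}
```

The result is 03:14:07 UTC on 19 January 2038; one second later a signed 32-bit counter wraps around.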
ext4 filesystem
• It supports 48-bit block addressing, which with
4KB blocks gives a maximum filesystem size of
1 EB and a maximum file size of 16 TB.
• 1 EB = 1,048,576 TB (1 EB = 1024 PB, 1 PB =
1024 TB, 1 TB = 1024 GB)
• Ext4 allows an unlimited number of
subdirectories
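The 1 EB figure follows directly from the address width. A short worked sketch, assuming 4096-byte blocks and binary units (1 TB = 2^40 bytes, as in the conversion above); the function name is hypothetical:

```c
#include <stdint.h>

/*
 * Maximum addressable bytes for a given block-address width and block size.
 * The caller must keep the product below 2^64; 48 bits with 4096-byte
 * blocks gives exactly 2^60 bytes.
 */
uint64_t max_fs_bytes(unsigned addr_bits, uint64_t block_size)
{
    return ((uint64_t)1 << addr_bits) * block_size;
}
```

Here max_fs_bytes(48, 4096) is 2^48 x 2^12 = 2^60 bytes, and 2^60 / 2^40 = 1,048,576 TB.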
Extents
• Extent mapping is used in Ext4 to represent new files, rather
than the double, triple indirect block mapping used in Ext2/3.
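An extent describes a run of contiguous blocks with a single record instead of one pointer per block. A simplified in-memory sketch; the struct below is hypothetical and only mirrors the information an on-disk extent carries (first logical block, length, first physical block):

```c
#include <stddef.h>
#include <stdint.h>

/* One extent: `len` logical blocks starting at `logical` are stored
 * contiguously on disk starting at physical block `physical`. */
struct extent {
    uint32_t logical;
    uint16_t len;
    uint64_t physical;
};

/*
 * Translate a logical block of the file to a physical block number.
 * Returns 0 if no extent covers the block (a hole in the file).
 */
uint64_t extent_map(const struct extent *ext, size_t count, uint32_t lblock)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (lblock >= ext[i].logical &&
            lblock - ext[i].logical < ext[i].len)
            return ext[i].physical + (lblock - ext[i].logical);
    }
    return 0;
}
```

A 150-block file laid out in two contiguous runs needs two such records, where indirect block mapping would need 150 individual pointers.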
Ext4 block allocator
• Main goal is to provide better allocation for
small and large files
• Ext4 uses a "multiblock allocator" (mballoc).
• Persistent pre-allocation :
Applications tell the filesystem to preallocate
the space.
• The feature is available via the
libc posix_fallocate() interface
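A minimal sketch of persistent pre-allocation through posix_fallocate(). The helper names and the example path are hypothetical; whether blocks are truly reserved without being written depends on the underlying file system:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create (or open) `path` and ask the file system to reserve `len` bytes
 * starting at offset 0. Returns 0 on success or an error number on failure.
 */
int preallocate_file(const char *path, off_t len)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    int err;

    if (fd < 0)
        return errno;
    /* posix_fallocate() returns the error number directly, not via errno. */
    err = posix_fallocate(fd, 0, len);
    close(fd);
    return err;
}

/* Size of the file in bytes, or -1 if it cannot be inspected. */
off_t file_size(const char *path)
{
    struct stat st;

    return stat(path, &st) == 0 ? st.st_size : (off_t)-1;
}
```

After a successful call the file is at least `len` bytes long, so later writes into that range should not fail for lack of space.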
Delayed Allocation
• Delayed allocation defers block allocations from
write() operation time to page flush time.
• It increases the opportunity to combine many
block allocation requests into a single request
reducing fragmentation
• Saves CPU cycles, and avoids unnecessary block
allocation for short-lived files
• The current Ext4 delayed allocation only supports
data=writeback journalling mode.
Inode features
• Larger inodes: Ext4 defaults to 256-byte inodes. This is needed
to accommodate some extra fields (like nanosecond
timestamps or inode versioning), and the remaining space
of the inode is used to store extended attributes that are
small enough to fit in that space. This makes access
to those attributes much faster, and improves the
performance of applications that use extended attributes by a
factor of 3 to 7.
• Inode reservation consists of reserving several inodes when
a directory is created, expecting that they will be used in
the future.
• Nanosecond timestamps mean that inode fields like
"modified time" can use nanosecond resolution
instead of the one-second resolution of Ext3.
Journalling
• Ext4 checksums the journal data to know if
the journal blocks are failing or corrupted.
• In Ext4 the journaling feature can be disabled,
which provides a small performance
improvement.
Other features
• Ext4 supports online defragmentation
• The e4defrag tool can defragment
individual files or the whole filesystem
• Fsck: Ext4 stores a list of unused inodes at the
end of each group's inode table (with a
checksum, for safety), so fsck does not check
those inodes. As a result, total fsck time
improves by a factor of 2 to 20, depending
on the number of used inodes.
Difference between ext versions
Point                                          ext2            ext3            ext4
Maximum individual file size                   16GB – 2TB      16GB – 2TB      16GB – 16TB
Maximum file system size                       2TB – 32TB      2TB – 32TB      1EB
Journalling                                    Not available   Available       Available and can be turned “off” too
Number of subdirectories per directory         31998           31998           Unlimited
Journal checksum                               No              No              Yes
Multi-block allocation and delayed allocation  No              No              Yes
Conclusion
• Linux file system follows the FHS standard
• Ext, ext2, ext3, ext4 are the versions of the ext
file system
• Ext is no longer supported by Linux kernel
• Ext4 filesystem supports 48-bit block
addressing, multi-block allocation, extents,
persistent pre-allocation, delayed allocation,
unlimited number of subdirectories, journal
checksumming and improved timestamps.
References
[1] "Design and Implementation of the Second Extended
Filesystem", Remy Card, Proceedings of the First Dutch
International Symposium on Linux, ISBN 9036703859
[2] Ext4, Linux Kernel Newbies – Online Documentation
[3] Introduction to Linux: Chapter 3 – General Overview of the
Linux Filesystem
[4] "Ext4 block and inode allocator improvements", Aneesh
Kumar et al, Proceedings of the Linux Symposium, 2008
[5] Linux Filesystem Hierarchy, Binh Nguyen
[6] www.tldp.org_LDP_tlk_fs_filesystem
[7] www.wikipedia.org – ext2, ext3, ext4