The evolution of Linux file systems
Gang He
ghe@suse.com
Agenda
• Local file system (LFS)
• Cluster file system (CFS)
• Distributed file system (DFS)
Local file system (LFS)
File system overview
File system concepts
• File descriptor (user space)
• struct file, struct dentry, struct inode, struct address_space (kernel space)
• struct super_block, metadata, file data, buffer/page cache
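A minimal user-space sketch of how these objects relate: open() creates a struct file (reached through the file descriptor), which points to a dentry and an inode, and read() is served through the page cache attached to the inode's address_space. The path /etc/hostname is just an arbitrary example.

/* Sketch: a user-space fd and the kernel objects behind it. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/etc/hostname", O_RDONLY);   /* fd indexes a struct file */
    if (fd < 0) {
        perror("open");
        return 1;
    }

    struct stat st;
    fstat(fd, &st);                              /* st_ino comes from the inode */
    printf("inode=%lu size=%lld\n",
           (unsigned long)st.st_ino, (long long)st.st_size);

    char buf[128];
    ssize_t n = read(fd, buf, sizeof(buf));      /* served via the page cache */
    if (n > 0)
        printf("read %zd bytes\n", n);

    close(fd);
    return 0;
}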
Ext2 → Ext3 → Ext4
• Ext2 (1993), inspired by UFS; the first popular and
stable Linux file system, but its design is simple.
- File systems were getting bigger: how to look up an entry in a large
directory, how to reduce fsck time after a crash ...
• Ext3 (2001), added journaling, hash-tree directory
indexing, etc.
- File systems kept getting bigger, and these limits had to be removed
- Various more advanced file systems put pressure on ext3 ...
• Ext4 (2008), 48-bit block addressing, no limit on
directory entries, extents, multi-block allocation,
delayed block allocation, online defragmentation, 256-byte
inodes, persistent preallocation (see the sketch below),
write barriers, etc.
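A hedged sketch of persistent preallocation, one of the Ext4 features above, using fallocate(2); the file name disk.img and the 1 GiB size are arbitrary examples.

/* Sketch: persistent preallocation on ext4 via fallocate(2). */
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("disk.img", O_RDWR | O_CREAT, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Reserve 1 GiB of on-disk blocks up front; ext4 records this as
     * unwritten extents, so later writes avoid fragmentation. */
    if (fallocate(fd, 0, 0, 1024LL * 1024 * 1024) != 0)
        perror("fallocate");

    close(fd);
    return 0;
}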
Nowadays
• Ext4, will continue to be maintained for stability and
historical reasons
• XFS, robust and scalable, good performance on large
storage, shines at handling big files (e.g. virtual
machine images)
• Btrfs, a new design (intended to replace ext4), inspired
by ZFS; carries many enterprise file system features, for
example copy-on-write, built-in RAID and volume
management, snapshot/clone support, dynamic grow and
shrink, SSD support, etc. (see the reflink sketch below)
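A hedged sketch of Btrfs copy-on-write cloning through the FICLONE ioctl (reflink); the qcow2 file names are arbitrary examples, and both files must live on the same Btrfs file system.

/* Sketch: copy-on-write file clone (reflink) with FICLONE. */
#include <fcntl.h>
#include <linux/fs.h>     /* FICLONE */
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    int src = open("image.qcow2", O_RDONLY);
    int dst = open("image-clone.qcow2", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (src < 0 || dst < 0) {
        perror("open");
        return 1;
    }

    /* Share the source's extents instead of copying data; blocks are
     * duplicated only later, when either file is modified. */
    if (ioctl(dst, FICLONE, src) != 0)
        perror("ioctl(FICLONE)");

    close(src);
    close(dst);
    return 0;
}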
Cluster file system (CFS)
Why CFS
• Independent storage devices (e.g. SAN).
• High availability requirements.
• How to scale out a file system in CPU, memory, and even
network bandwidth.
CFS common points
• POSIX file system semantics.
• Shared disk.
• Distributed lock manager (DLM); a conceptual locking sketch follows this list.
• Cluster manager stacks.
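A conceptual sketch of the DLM idea, not a real libdlm or OCFS2/GFS2 interface: cluster_lock()/cluster_unlock() and the stub bodies below are hypothetical stand-ins that only show how a node serializes access to shared on-disk metadata.

/* Hypothetical DLM-style API; the stubs only make the sketch compile. */
#include <stdio.h>

enum dlm_mode { LOCK_PR, LOCK_EX };   /* protected-read / exclusive */

static int cluster_lock(const char *resource, enum dlm_mode mode)
{
    /* A real implementation would ask the cluster DLM for this lock. */
    printf("lock %s mode=%d\n", resource, mode);
    return 0;
}

static int cluster_unlock(const char *resource)
{
    printf("unlock %s\n", resource);
    return 0;
}

int main(void)
{
    /* Each node serializes access to shared on-disk metadata through
     * the DLM, so concurrent mounts of the same shared disk stay
     * coherent without a central server. */
    cluster_lock("inode-0x1234", LOCK_EX);
    /* ... read, modify, and write the shared metadata block here ... */
    cluster_unlock("inode-0x1234");
    return 0;
}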
CFS future
• Scale out to more nodes, providing higher aggregate I/O
bandwidth.
• Higher availability: online fsck, online
defragmentation, online expand/shrink.
• File system level snapshot, file level clone.
• Tiered storage, SSD support.
• Deduplication.
• ...
Distributed file system (DFS)
Background
• Cost: storage arrays, fabric switches, HBA cards, etc.
are expensive.
• Unified storage space, linear expansion, commodity
hardware.
• Driven by the Internet industry (e.g. search, photo
sharing, big data, etc.).
• The Google File System appeared (2003).
GFS-like DFS (HDFS, MooseFS, KFS)
DFS common points
• Do not strictly comply with POSIX file system semantics;
most implementations run in user space.
• Shared-nothing architecture: metadata and file data are
stored separately, and metadata access is separated from
file data access.
• Each node keeps its own local file system; a local file
holds a logical data block, and each data block has
several replica copies.
• The metadata server usually loads all metadata into
memory at start-up; a log records incremental changes, and
flushing memory to disk (or merging the log with the
previous metadata file) produces a new metadata checkpoint
(see the sketch after this list).
• Other mechanisms: heartbeat, rack awareness, block
allocation policy, file lock management, etc.
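A hedged sketch of the checkpoint-plus-log scheme described above; the file names meta.ckpt/meta.log, the record format, and the in-memory table are invented for illustration and do not come from any particular DFS.

/* Sketch: load checkpoint, replay log, then write a new checkpoint. */
#include <stdio.h>
#include <string.h>

struct meta_rec {                      /* one in-memory metadata entry */
    char path[256];
    long long size;
};

/* Insert a record, or update it in place if the path already exists. */
static int apply(struct meta_rec *tbl, int n, int max, const struct meta_rec *r)
{
    for (int i = 0; i < n; i++) {
        if (strcmp(tbl[i].path, r->path) == 0) {
            tbl[i].size = r->size;
            return n;
        }
    }
    if (n < max)
        tbl[n++] = *r;
    return n;
}

/* Start-up: load the last checkpoint, then replay the incremental log
 * so the in-memory table reflects every change since that checkpoint. */
static int load_metadata(struct meta_rec *tbl, int max)
{
    const char *files[] = { "meta.ckpt", "meta.log" };
    struct meta_rec r;
    int n = 0;
    for (int f = 0; f < 2; f++) {
        FILE *fp = fopen(files[f], "r");
        if (!fp)
            continue;
        while (fscanf(fp, "%255s %lld", r.path, &r.size) == 2)
            n = apply(tbl, n, max, &r);
        fclose(fp);
    }
    return n;
}

/* Periodically: dump memory to a fresh checkpoint and truncate the log,
 * i.e. merge the log into the previous metadata file. */
static void write_checkpoint(const struct meta_rec *tbl, int n)
{
    FILE *ckpt = fopen("meta.ckpt", "w");
    if (!ckpt)
        return;
    for (int i = 0; i < n; i++)
        fprintf(ckpt, "%s %lld\n", tbl[i].path, tbl[i].size);
    fclose(ckpt);
    FILE *log = fopen("meta.log", "w");
    if (log)
        fclose(log);
}

int main(void)
{
    struct meta_rec tbl[1024];
    int n = load_metadata(tbl, 1024);
    printf("loaded %d metadata records\n", n);
    write_checkpoint(tbl, n);
    return 0;
}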
Scale out
• Meta-data server cluster, e.g. GFS2, Ceph.
• Fully symmetric, no central meta-data server, e.g.
GlusterFS (see the hashing sketch after this list).
• Improved cluster management mechanisms:
heartbeat/corosync → ZooKeeper cluster
• I/O flow control, reduced dependence on the meta-data
server, cost control (ECC), etc.
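A hedged sketch of path-hash placement, the core idea behind the fully symmetric designs above (GlusterFS's elastic hashing follows this spirit, though its actual algorithm differs); the brick and path names are arbitrary examples.

/* Sketch: locate a file's server by hashing its path (FNV-1a). */
#include <stdint.h>
#include <stdio.h>

static uint32_t fnv1a(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

int main(void)
{
    const char *servers[] = { "brick-a", "brick-b", "brick-c", "brick-d" };
    const char *paths[] = { "/vm/disk1.img", "/photos/2013/a.jpg", "/logs/app.log" };

    /* Any client can compute the owner directly from the path, so no
     * meta-data server lookup is needed on the data path. */
    for (int i = 0; i < 3; i++) {
        unsigned owner = fnv1a(paths[i]) % 4;
        printf("%-22s -> %s\n", paths[i], servers[owner]);
    }
    return 0;
}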
Current trends
• Linear scale out.
• Compute/storage hyper-converged systems.
• Flash technology utilization.
• High-speed network support.
• Application awareness (e.g. VM images).
• Deduplication/Compression/Snapshot/Clone.
• Object/Block/File unified storage.