Learn about log-structured file systems
Gang He
Apr. 20, 2018
Overview
A log-structured file system is a file system in which data
and meta-data are written sequentially to a circular buffer,
called a log. The design was first proposed in 1988 by John
K. Ousterhout and Fred Douglis and first implemented in
1992 by Ousterhout and Mendel Rosenblum for the Unix-
like Sprite distributed operating system.
Background
• System memories are growing, so more reads are served
from cache and file system performance is largely
determined by write performance.
• There is a large gap between random I/O
performance and sequential I/O performance.
• Writes create multiple, chronologically advancing
versions of both file data and meta-data, so it is easy to
implement a versioning file system.
• Recovery from crashes is simpler.
Design - Write Sequentially
• Boost write throughput by writing all changes to disk
contiguously.
Disk as an array of blocks, append at end.
Write data, indirect blocks, inodes together.
Write inode map and checkpoint region.
• Writes are issued in segments.
~1 MB of contiguous disk blocks.
Accumulated in cache and flushed at once.
• Data layout on disk.
“temporal locality” (good for writing), rather than
“logical locality” (good for reading).
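The append-only write path above can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: the class `ToyLog`, the tuple block format, and the 4-block segment size are hypothetical, and the checkpoint region is left out.

```python
from dataclasses import dataclass, field

SEGMENT_BLOCKS = 4  # tiny segment for illustration; the slides cite ~1 MB


@dataclass
class ToyLog:
    """Hypothetical in-memory model of an append-only log with an inode map."""
    disk: list = field(default_factory=list)       # the log, blocks in write order
    buffer: list = field(default_factory=list)     # segment being accumulated in cache
    inode_map: dict = field(default_factory=dict)  # inode number -> log address of inode

    def write_file(self, ino, data_blocks):
        """Buffer a file's data blocks, then its inode, at the end of the log."""
        base = len(self.disk) + len(self.buffer)
        addrs = []
        for i, data in enumerate(data_blocks):
            self.buffer.append(("data", ino, data))
            addrs.append(base + i)
        # The inode is written right after the data it points to.
        self.buffer.append(("inode", ino, tuple(addrs)))
        self.inode_map[ino] = base + len(data_blocks)
        # Flush once at least a segment's worth of blocks has accumulated.
        if len(self.buffer) >= SEGMENT_BLOCKS:
            self.flush()

    def flush(self):
        """Append all buffered blocks to the log in one sequential write."""
        self.disk.extend(self.buffer)
        self.buffer.clear()

    def read_file(self, ino):
        """Follow inode map -> inode -> data blocks (cached blocks included)."""
        blocks = self.disk + self.buffer
        _, _, addrs = blocks[self.inode_map[ino]]
        return [blocks[a][2] for a in addrs]
```

Rewriting a file appends a new version and repoints the inode map, while the old blocks stay in the log until they are cleaned, which is what makes the layout temporally rather than logically local.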
[Figure: Design - Write Sequentially]
Design - Garbage Collection
• The log is conceptually infinite, but the disk is finite.
Reuse the old parts of the log.
• Clean old segments to recover space.
Read in M existing segments, compact their contents into N
new segments (where N < M), and then write the N
segments to disk in new locations.
Segments ranked by "liveness" or age.
Segment cleaner "runs in background".
• Cleaning policies.
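Greedy: clean the segments with the lowest utilization first.
Cost-benefit: also weigh segment age (time of last write),
so cold segments are cleaned at higher utilization than hot ones.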
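The cleaning steps and the two policies can be sketched as follows. The `Segment` class and its fields are hypothetical. The cost-benefit score uses the ratio (1 - u) * age / (1 + u) from the original Sprite LFS paper, where u is the segment utilization. Resetting the age of compacted segments to zero is a simplification made here, not something the slides specify.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    blocks: list  # list of (payload, is_live) pairs
    age: float    # time since the segment was last written (arbitrary units)

    @property
    def utilization(self):
        """Fraction of blocks in the segment that are still live."""
        return sum(1 for _, live in self.blocks if live) / len(self.blocks)


def greedy_pick(segments):
    """Greedy policy: clean the segment with the lowest utilization."""
    return min(segments, key=lambda s: s.utilization)


def cost_benefit_score(seg):
    """Benefit (free space gained * age) over cost (read segment + rewrite live data)."""
    u = seg.utilization
    return (1.0 - u) * seg.age / (1.0 + u)


def cost_benefit_pick(segments):
    """Cost-benefit policy: clean the segment with the highest score."""
    return max(segments, key=cost_benefit_score)


def compact(victims, blocks_per_segment):
    """Copy the live blocks of M victim segments into N <= M new segments."""
    live = [(p, True) for seg in victims for p, is_live in seg.blocks if is_live]
    return [
        Segment(live[i:i + blocks_per_segment], age=0.0)  # simplification: treat as fresh
        for i in range(0, len(live), blocks_per_segment)
    ]
```

With a cold segment at 75% utilization and a hot one at 50%, the greedy policy picks the hot segment, while the cost-benefit score can favor the cold one because its free space is likely to stay free for longer.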
Design - Crash Recovery
• Log and checkpoint.
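Limited crash vulnerability: only the most recent writes are at risk.
At each checkpoint, flush the active segment and the inode map.
• No fsck required: recovery starts from the last checkpoint
instead of scanning the whole disk.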
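A minimal sketch of checkpoint-based recovery, assuming a log of `(kind, inode_number, payload)` tuples. The record format and both function names are hypothetical. The roll-forward step, which replays log entries written after the checkpoint, follows the approach described for Sprite LFS and is an assumption about what the slide intends.

```python
def take_checkpoint(log, inode_map):
    """Record the current end of the log and a copy of the inode map."""
    return {"log_end": len(log), "inode_map": dict(inode_map)}


def recover(log, checkpoint):
    """Rebuild the inode map after a crash without scanning the whole log."""
    inode_map = dict(checkpoint["inode_map"])
    # Roll forward: only entries written after the checkpoint are examined.
    for addr in range(checkpoint["log_end"], len(log)):
        kind, ino, _ = log[addr]
        if kind == "inode":
            inode_map[ino] = addr  # newest inode version wins
    return inode_map
```

Data blocks whose inode never reached the log are simply ignored, so the recovered state is consistent without a full file system check.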
Summary
• LFS introduces a new approach to updating the disk. Instead
of overwriting files in place, LFS always writes to an unused
portion of the disk, and then later reclaims the old space
through cleaning.
• The large writes that LFS generates are excellent for
performance on many different devices. On hard drives, large
writes ensure that positioning time is minimized; on parity-
based RAIDs, such as RAID-4 and RAID-5, they avoid the
small-write problem entirely.
• File system snapshots are easy to provide, and backup systems can use them.
• Recent research has even shown that large I/Os are required
for high performance on Flash-based SSDs.
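To make the RAID small-write point concrete, the sketch below counts disk I/Os under a simple model. The function names are hypothetical, and the model assumes each small write lands in a different stripe and that full-stripe writes need no reads.

```python
def small_write_ios(blocks):
    """Read-modify-write: per block, read old data and old parity,
    then write new data and new parity (4 I/Os)."""
    return 4 * blocks


def full_stripe_ios(stripes, disks):
    """Full-stripe write: parity is computed from the new data alone,
    so each disk receives exactly one write per stripe."""
    return stripes * disks


# 5-disk RAID-5, 4 data blocks per stripe, 8 data blocks to write.
scattered = small_write_ios(8)       # 8 independent small writes
sequential = full_stripe_ios(2, 5)   # the same data as 2 full stripes
```

Under these assumptions the same 8 data blocks cost 32 I/Os when scattered and 10 I/Os when written as two full stripes, which is the kind of write pattern an LFS segment flush produces.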
[Figure: SSD Architecture]
SSD Key Points
• Host interfaces.
SATA/SAS, PCIe, USB, etc.
• Program/Erase (P/E) Cycles
• Flash Translation Layer (FTL)
Address Mapping, Wear Leveling, Garbage Collection
(GC), Bad Block Management, etc.
• Over Provisioning (OP)
• Write Amplification Factor (WAF)
• TRIM/DISCARD
• Change Write Workload
Write sequentially instead of randomly to reduce WAF.
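WAF is the ratio of bytes written to flash to bytes written by the host. The sketch below adds a simplified steady-state model, an assumption that is not in the slides: if the blocks reclaimed by garbage collection still hold a fraction `valid_fraction` of valid pages, those pages must be copied before the erase, giving a WAF of 1 / (1 - valid_fraction).

```python
def write_amplification(host_bytes, nand_bytes):
    """WAF = bytes written to flash / bytes written by the host."""
    return nand_bytes / host_bytes


def gc_waf(valid_fraction):
    """Simplified model: WAF when every reclaimed erase block still
    holds `valid_fraction` of valid pages that must be copied first."""
    if not 0.0 <= valid_fraction < 1.0:
        raise ValueError("valid_fraction must be in [0, 1)")
    return 1.0 / (1.0 - valid_fraction)
```

Sequential writes tend to invalidate whole erase blocks at once, which pushes `valid_fraction` toward 0 and the WAF toward 1. Random overwrites leave partially valid blocks behind and raise it.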
[Figure: nvme-cli]
F2FS Introduction
• F2FS (Flash-Friendly File System) is a flash file system initially
developed by Samsung Electronics for the Linux kernel. The motive
for F2FS was to build a file system that, from the start, takes into
account the characteristics of NAND flash memory-based storage
devices (such as solid-state disks, eMMC, and SD cards), which are
widely used in computer systems ranging from mobile devices to
servers.
• F2FS was designed on the basis of a log-structured file system
approach. Jaegeuk Kim, the principal F2FS author, has stated that it
remedies some known issues of the older log-structured file
systems, such as the snowball effect of wandering trees and high
cleaning overhead. In addition, since a NAND-based storage device
shows different characteristics according to its internal geometry or
flash memory management scheme, it supports various parameters
not only for configuring on-disk layout, but also for selecting
allocation and cleaning algorithms.
F2FS Disk Structure
• The start address of the main area is aligned to the zone size.
• Cleaning is done in units of a section, and the section
size is matched to the FTL GC unit.
• All file system metadata is co-located in the front region.
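The alignment rule can be illustrated with a little arithmetic. F2FS groups 2 MB segments into sections and sections into zones. The helper names and the example geometry below (4 segments per section, 2 sections per zone, metadata ending at 21 MB) are hypothetical values chosen for illustration, not defaults taken from the slides.

```python
MB = 1024 * 1024
SEGMENT_BYTES = 2 * MB  # F2FS segment size


def zone_bytes(segs_per_section, sections_per_zone):
    """Size of one zone: segments -> section -> zone."""
    return SEGMENT_BYTES * segs_per_section * sections_per_zone


def align_up(offset, alignment):
    """Round `offset` up to the next multiple of `alignment`."""
    return -(-offset // alignment) * alignment


def main_area_start(metadata_end, segs_per_section, sections_per_zone):
    """First zone-aligned byte offset at or after the end of the metadata area."""
    return align_up(metadata_end, zone_bytes(segs_per_section, sections_per_zone))
```

With this example geometry a zone is 16 MB, so metadata ending at 21 MB places the main area at 32 MB. Keeping sections aligned this way lets file system cleaning and FTL garbage collection work on the same physical units.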
[Figure: F2FS - Address Wandering Tree Problem]
[Figure: F2FS - Cleaning]
[Figure: F2FS Performance, Panda board + eMMC]
