Next-gen Filesystems
a high-altitude overview
These slides are © 2014 Jim Salter, licensed under Creative Commons Attribution-ShareAlike 3.0 Unported.
http://creativecommons.org/licenses/by-sa/3.0/deed.en_US
First of all: Who?
Jim Salter
Technomancer,
Mercenary Sysadmin,
Small Business Owner
● 6+ years of ZFS in production
● 6+ [units of time] of btrfs in test
Today's slides can be found at:
http://jrs-s.net/presentations/next-gen-fs-overview/
What
… is a “next generation” filesystem?
What's a last generation filesystem?
What's a filesystem “generation” anyway?
Generation 0
punching paper holes and recording modem squeal on a tape deck
● No Files
● No Folders
● No System
Generation 1
the beginnings of random access
● Files!
● … that's all we got
Generation 2
the beginnings of organization
● Files!
● Folders!
● … that's all we got
Generation 3
in which we no longer trust everyone
● Files!
● Folders!
● Ownership!
● Permission!
Generation 4
in which we're tired of power failure
Journalling!
Generation 5
in which we really get serious
● Volume Management
● Per-block checksumming
● Self-healing RAID arrays
● Atomic COW snapshots
● Asynchronous replication
● Far-future scalability
Why do I really care about this stuff?
● Data Retention
● Data Management
● Data Longevity
● Protection from human error
● Protection from environmental failure
bit rot
in moR!l daeaces, ar% gften eboQch tk bafkrupQ
http://arstechnica.com/information-technology/2014/01/bitrot-and-atomic-cows-inside-next-gen-filesystems/
Checksumming!
per-block checksumming + redundant storage = GTFO bitrot
[87.030967] BTRFS info (device vdf): csum failed ino 258 off 0 csum 3377436548 private 796777854
[87.031188] BTRFS info (device vdf): csum failed ino 258 off 0 csum 3377436548 private 796777854
[87.031678] btrfs read error corrected: ino 258 off 0 (dev /dev/vde sector 267344)
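The "checksum fails, read gets corrected" sequence in that log can be sketched in a few lines. This is a toy model, not how btrfs or ZFS is implemented: `MirroredStore` and its methods are hypothetical names, the two "devices" are plain dicts, and `zlib.crc32` stands in for whatever checksum a real filesystem uses.

```python
import zlib


class MirroredStore:
    """Toy two-way mirror that keeps a checksum for every block (hypothetical)."""

    def __init__(self):
        self.copies = [{}, {}]  # two pretend devices: block number -> bytes
        self.csums = {}         # block number -> CRC-32 of the good data

    def write(self, blkno, data):
        # Record the checksum, then store the block on both devices.
        self.csums[blkno] = zlib.crc32(data)
        for dev in self.copies:
            dev[blkno] = data

    def read(self, blkno):
        # Try each copy in turn; return the first one whose checksum matches,
        # and overwrite any bad copies seen along the way with the good data.
        expected = self.csums[blkno]
        bad = []
        for idx, dev in enumerate(self.copies):
            data = dev[blkno]
            if zlib.crc32(data) == expected:
                for b in bad:
                    self.copies[b][blkno] = data  # the "self-healing" step
                return data
            bad.append(idx)
        raise IOError(f"block {blkno}: no copy matches its checksum")


store = MirroredStore()
store.write(0, b"important data")
store.copies[0][0] = b"imp0rtant dat!"  # simulate bitrot on the first device
recovered = store.read(0)
```

Without the redundant copy, the checksum could only tell you the block is bad. Without the checksum, the mirror could not tell which copy to trust. You need both.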
Replication
your backups suck. so much. lern 2 replicate ok
Traditional FS
in-place modification of data is just what it sounds like
Dark red: newly (re)written data blocks
Pale red: existing data blocks
White: unlinked data blocks
Copy on Write FS
“the data comet”: write a new block, unlink the old block
Dark red: newly (re)written data blocks
Pale red: existing data blocks
White: unlinked data blocks
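"Write a new block, unlink the old block" can be sketched as below. This is a minimal illustration under made-up assumptions: `CowFile`, its fixed-size `disk` list, and the first-free allocator are all hypothetical, and real filesystems track far more than a single block map.

```python
class CowFile:
    """Toy copy-on-write file: logical blocks point at physical blocks (hypothetical)."""

    def __init__(self, nblocks_on_disk):
        self.disk = [None] * nblocks_on_disk       # physical blocks
        self.free = list(range(nblocks_on_disk))   # unallocated physical blocks
        self.block_map = {}                        # logical block -> physical block

    def write(self, logical, data):
        new_phys = self.free.pop(0)
        self.disk[new_phys] = data           # 1. write the new block somewhere unused
        old_phys = self.block_map.get(logical)
        self.block_map[logical] = new_phys   # 2. repoint the file at the new block
        if old_phys is not None:
            self.free.append(old_phys)       # 3. unlink the old block (not erased)
        return new_phys


f = CowFile(4)
first = f.write(0, b"v1")
second = f.write(0, b"v2")
```

Note that the old data is never overwritten in place: after the second write, `b"v1"` is still sitting in its original physical block, just no longer referenced.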
Abstracting CoW
where the blocks are isn't important: “the data worm”
Dark red: newly (re)written data blocks
Pale red: existing data blocks
White: unlinked data blocks
Understanding CoW
visualizing “atomic COW snapshots”
Dark red: newly (re)written data blocks
Pale red: existing data blocks
White: unlinked data blocks
Blue tint: snapshot @1
Yellow tint: snapshot @2
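A snapshot on a copy-on-write filesystem is just a saved set of block pointers, so blocks a snapshot still references never get unlinked. Here is a rough sketch of that idea. `CowVolume` and its methods are hypothetical, and copying the whole block map is a simplification chosen for readability (real implementations pin a tree root instead).

```python
class CowVolume:
    """Toy copy-on-write volume with snapshots (hypothetical, for illustration)."""

    def __init__(self):
        self.blocks = {}     # physical id -> data
        self.next_id = 0
        self.live = {}       # logical block -> physical id (the live filesystem)
        self.snapshots = {}  # snapshot name -> frozen copy of the block map

    def write(self, logical, data):
        # Copy on write: new data always lands in a fresh physical block.
        self.blocks[self.next_id] = data
        self.live[logical] = self.next_id
        self.next_id += 1
        self._collect()

    def snapshot(self, name):
        # Taking a snapshot copies pointers only, never data blocks.
        self.snapshots[name] = dict(self.live)

    def read(self, logical, snapshot=None):
        table = self.live if snapshot is None else self.snapshots[snapshot]
        return self.blocks[table[logical]]

    def _collect(self):
        # Drop blocks that neither the live map nor any snapshot references.
        referenced = set(self.live.values())
        for snap in self.snapshots.values():
            referenced.update(snap.values())
        for phys in list(self.blocks):
            if phys not in referenced:
                del self.blocks[phys]


vol = CowVolume()
vol.write(0, b"a")
vol.snapshot("@1")
vol.write(0, b"b")
vol.snapshot("@2")
vol.write(0, b"c")
```

After this runs, the live volume reads `b"c"` while `@1` and `@2` still read `b"a"` and `b"b"`. Each snapshot costs only the blocks that have changed since it was taken.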
Performance?
“3GB/sec on commodity hardware? That'll do, pig. That'll do.”
Chart legend: Btrfs storage, Raw disk storage, ZFS storage, Bare Metal
Yes, performance.
… even off-cache
ZFS can't do nocache. ='(
Chart legend: Bare Metal, Raw disk storage, Btrfs storage
Maturity
(n / 2) + 7
ZFS
● 2001: Development begun
● 2004: First official announcement
● 2005: Implemented in baseline Solaris
btrfs
● Jun 2007: Chris Mason announces btrfs to the kernel mailing list
● Jan 2009: Implemented in baseline Linux kernel
● Mar 2014: OpenSUSE commits to default btrfs FS with 13.2 in Nov 2014; Facebook hires lead btrfs devs, deploys btrfs in production web tier
The Future View
betting against the GPL is usually a bad idea
● Has a 5+ year jumpstart on btrfs
● GPL licensed
● Built in to the mainline Linux kernel
● The probable near-future default Linux filesystem
● Devs much more responsive to community feature demand
● Snowballing heavy, Heavy, HEAVY development
Can't Get Enough?
then come back and get some more!
RAID, Replication, and You
Track: Operations 2
3:45PM – 4:30PM
Today's slides can be found at:
http://jrs-s.net/presentations/next-gen-fs-overview/
