SlideShare a Scribd company logo
1 of 19
ZFS filesystem++ Marc Seeger (marc-seeger.de) Now with 100% more pictures of kittens!
The basic idea
The usual features File names/Directories => FAT / Inodes Metadata => time/size/… CRUD => + truncate, appending, moving, links Security => ACL ()/capabilites ()
The good stuff Journaling => metadata only /complete Encryption Transparent compression Checksums Snapshots/Versioning
Layout
Logical Volume Manager (LVM) Operating System Logical Volume Manager Volume(consisting of HDDs)
Problems today
Silent data corruption Controller, cable, drive, firmware, ... CERN: Large Hadron Collider = >15.000 TB/year „Data integrity“ paper* Disk errors. 2 GB file to > 3,000 nodes every 2 hours  for 5 weeks=> 500 errors on 100 nodes.  Single bit errors. 10% of disk errors. Sector (512 bytes) sized errors. 10% of disk errors. 64 KB regions. 80% of disk errors. (Bug in WD disk firmware + 3Ware controller cards) RAID errors. 492 RAID systems each week for 4 weeks.Specs: Bit Error Rate of 10^14 read/written. Good news: only about 1/3 of the spec’d rate.Bad news: 2.4 petabytes of data => 300 errors. Memory errors.Good news: only 3 double-bit errors in 3 months on 1300 nodes.Bad news: according to the spec there shouldn’t have been any. (double bit errors can’t be corrected.)  CERN found an overall byte error rate of 3 * 10^7 * http://indico.cern.ch/getFile.py/access?contribId=3&resId=1&materialId=paper&confId=13797
Management Labels, partitions, volumes, provisioning, grow/shrink, /etc files... Limits:  filesystem/volume size, file size, number of files, files per directory, number of snapshots ... Different tools to manage file, block, iSCSI, NFS, CIFS ...
Slow Linear-time create fat locks fixed block size naïve prefetch dirty region logging painful RAID rebuilds growing backup time
What‘s different about ZFS “ZFS is a new kind of file system that provides: simple administration transactional semantics end-to-end data integrity immense scalability. ZFS is not an incremental improvement to existing technology; it is a fundamentally new approach to data management. We've blown away 20 years of obsolete assumptions, eliminated complexity at the source, and created a storage system that's actually a pleasure to use.”
pooled storage model completely eliminates: the concept of volumes and the associated problems of: Partitions Provisioning Wasted bandwidth Stranded storage. Thousands of file systems can draw from a common storage pool, each one consuming only as much space as it actually needs. The combined I/O bandwidth of all devices in the pool is available to all filesystems at all times.
All operations are copy-on-write transactions  the on-disk state is always valid. There is no need to fsck(1M) a ZFS file system, ever. Every block is checksummed to prevent silent data corruption (user-selectable algorithm) the data is self-healing in replicated (mirrored or RAID) configurations. If one copy is damaged, ZFS detects it and uses another copy to repair it.
RAID-Z similar to RAID-5 but: uses variable stripe width to eliminate the RAID-5 write hole. All RAID-Z writes are full-stripe writes. no read-modify-write tax no write hole no need for NVRAM in hardware.(ZFS loves cheap disks)
But cheap disks can fail! No problem: ZFS provides disk scrubbing(like ECC memory scrubbing) 256 bit block checksum works while storage pool is live and in use!
Scalability 128-bit filesystem256 quadrillion zettabytes. All metadata is allocated dynamically  no need to pre-allocate inodes or otherwise limit the scalability of the file system when it is first created. Directories can have up to 248 (256 trillion) entries No limit exists on the number of file systems … … or number of files that can be contained within a file system.
Snapshots A snapshot is a read-only copy of a file system or volume. Snapshots can be created quickly and easily. Initially, snapshots consume no additional space within the pool. Snapshots are happening at constant-time As data within the active dataset changes, the snapshot consumes space by continuing to reference the old data. Incremental backups are so efficient that they can be used for remote replication — e.g. to transmit an incremental update every 10 seconds.
Performance! ZFS has a pipelined I/O engine, similar in concept to CPU pipelines. The pipeline operates on I/O dependency graphs and provides scoreboarding, priority, deadline scheduling, out-of-order issue and I/O aggregation. I/O loads that bring other file systems to their knees are handled with ease by the ZFS I/O pipeline. (quote: sun)
Compression ZFS provides built-in compression. In addition to reducing space usage by 2-3x, compression also reduces the amount of I/O by 2-3x. For this reason, enabling compression actually makes some workloads go faster. In addition to file systems, ZFS storage pools can provide volumes for applications that need raw-device semantics. ZFS volumes can be used as swap devices, for example. And if you enable compression on a swap volume, you now have compressed virtual memory.

More Related Content

What's hot

ZFS: The Last Word in Filesystems
ZFS: The Last Word in FilesystemsZFS: The Last Word in Filesystems
ZFS: The Last Word in Filesystems
Jarod Wang
 
Zettabyte File Storage System
Zettabyte File Storage SystemZettabyte File Storage System
Zettabyte File Storage System
Amdocs
 

What's hot (20)

ZFS in 30 minutes
ZFS in 30 minutesZFS in 30 minutes
ZFS in 30 minutes
 
ZFS Tutorial USENIX June 2009
ZFS  Tutorial  USENIX June 2009ZFS  Tutorial  USENIX June 2009
ZFS Tutorial USENIX June 2009
 
Scale2014
Scale2014Scale2014
Scale2014
 
Zfs intro v2
Zfs intro v2Zfs intro v2
Zfs intro v2
 
S8 File Systems Tutorial USENIX LISA13
S8 File Systems Tutorial USENIX LISA13S8 File Systems Tutorial USENIX LISA13
S8 File Systems Tutorial USENIX LISA13
 
JetStor NAS 724UXD Dual Controller Active-Active ZFS Based
JetStor NAS 724UXD Dual Controller Active-Active ZFS BasedJetStor NAS 724UXD Dual Controller Active-Active ZFS Based
JetStor NAS 724UXD Dual Controller Active-Active ZFS Based
 
ZFS: The Last Word in Filesystems
ZFS: The Last Word in FilesystemsZFS: The Last Word in Filesystems
ZFS: The Last Word in Filesystems
 
Flourish16
Flourish16Flourish16
Flourish16
 
USENIX LISA11 Tutorial: ZFS a
USENIX LISA11 Tutorial: ZFS a USENIX LISA11 Tutorial: ZFS a
USENIX LISA11 Tutorial: ZFS a
 
ZFS Tutorial USENIX LISA09 Conference
ZFS Tutorial USENIX LISA09 ConferenceZFS Tutorial USENIX LISA09 Conference
ZFS Tutorial USENIX LISA09 Conference
 
PostgreSQL + ZFS best practices
PostgreSQL + ZFS best practicesPostgreSQL + ZFS best practices
PostgreSQL + ZFS best practices
 
Fossetcon14
Fossetcon14Fossetcon14
Fossetcon14
 
Lavigne bsdmag apr13
Lavigne bsdmag apr13Lavigne bsdmag apr13
Lavigne bsdmag apr13
 
Introduction to BTRFS and ZFS
Introduction to BTRFS and ZFSIntroduction to BTRFS and ZFS
Introduction to BTRFS and ZFS
 
MySQL on ZFS
MySQL on ZFSMySQL on ZFS
MySQL on ZFS
 
OSDC 2016 - Interesting things you can do with ZFS by Allan Jude&Benedict Reu...
OSDC 2016 - Interesting things you can do with ZFS by Allan Jude&Benedict Reu...OSDC 2016 - Interesting things you can do with ZFS by Allan Jude&Benedict Reu...
OSDC 2016 - Interesting things you can do with ZFS by Allan Jude&Benedict Reu...
 
Tlf2014
Tlf2014Tlf2014
Tlf2014
 
Btrfs and Snapper - The Next Steps from Pure Filesystem Features to Integrati...
Btrfs and Snapper - The Next Steps from Pure Filesystem Features to Integrati...Btrfs and Snapper - The Next Steps from Pure Filesystem Features to Integrati...
Btrfs and Snapper - The Next Steps from Pure Filesystem Features to Integrati...
 
Zettabyte File Storage System
Zettabyte File Storage SystemZettabyte File Storage System
Zettabyte File Storage System
 
Btrfs current status and_future_prospects
Btrfs current status and_future_prospectsBtrfs current status and_future_prospects
Btrfs current status and_future_prospects
 

Similar to ZFS

Distributed File System
Distributed File SystemDistributed File System
Distributed File System
Ntu
 
Zettabyte File Storage System
Zettabyte File Storage SystemZettabyte File Storage System
Zettabyte File Storage System
Amdocs
 
LizardFS-WhitePaper-Eng-v4.0 (1)
LizardFS-WhitePaper-Eng-v4.0 (1)LizardFS-WhitePaper-Eng-v4.0 (1)
LizardFS-WhitePaper-Eng-v4.0 (1)
Pekka Männistö
 
LizardFS-WhitePaper-Eng-v3.9.2-web
LizardFS-WhitePaper-Eng-v3.9.2-webLizardFS-WhitePaper-Eng-v3.9.2-web
LizardFS-WhitePaper-Eng-v3.9.2-web
Szymon Haly
 
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
Kyle Hailey
 

Similar to ZFS (20)

XFS.ppt
XFS.pptXFS.ppt
XFS.ppt
 
Distributed File System
Distributed File SystemDistributed File System
Distributed File System
 
I/O System and Case study
I/O System and Case studyI/O System and Case study
I/O System and Case study
 
Cassandra admin
Cassandra adminCassandra admin
Cassandra admin
 
Linux%20 memory%20management
Linux%20 memory%20managementLinux%20 memory%20management
Linux%20 memory%20management
 
Zettabyte File Storage System
Zettabyte File Storage SystemZettabyte File Storage System
Zettabyte File Storage System
 
LizardFS-WhitePaper-Eng-v4.0 (1)
LizardFS-WhitePaper-Eng-v4.0 (1)LizardFS-WhitePaper-Eng-v4.0 (1)
LizardFS-WhitePaper-Eng-v4.0 (1)
 
LizardFS-WhitePaper-Eng-v3.9.2-web
LizardFS-WhitePaper-Eng-v3.9.2-webLizardFS-WhitePaper-Eng-v3.9.2-web
LizardFS-WhitePaper-Eng-v3.9.2-web
 
Massively Parallel Architectures
Massively Parallel ArchitecturesMassively Parallel Architectures
Massively Parallel Architectures
 
Cluster Filesystems and the next 1000 human genomes
Cluster Filesystems and the next 1000 human genomesCluster Filesystems and the next 1000 human genomes
Cluster Filesystems and the next 1000 human genomes
 
Storage for next-generation sequencing
Storage for next-generation sequencingStorage for next-generation sequencing
Storage for next-generation sequencing
 
File Management in Operating Systems
File Management in Operating SystemsFile Management in Operating Systems
File Management in Operating Systems
 
FILE STRUCTURE IN DBMS
FILE STRUCTURE IN DBMSFILE STRUCTURE IN DBMS
FILE STRUCTURE IN DBMS
 
os
osos
os
 
Fsck Sx
Fsck SxFsck Sx
Fsck Sx
 
Fsck Sx
Fsck SxFsck Sx
Fsck Sx
 
TDS-16489U-R2 0215 EN
TDS-16489U-R2 0215 ENTDS-16489U-R2 0215 EN
TDS-16489U-R2 0215 EN
 
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
Oracle Open World 2014: Lies, Damned Lies, and I/O Statistics [ CON3671]
 
Storage
StorageStorage
Storage
 
Posscon2013
Posscon2013Posscon2013
Posscon2013
 

More from Marc Seeger

Lunch and learn: Cucumber and Capybara
Lunch and learn: Cucumber and CapybaraLunch and learn: Cucumber and Capybara
Lunch and learn: Cucumber and Capybara
Marc Seeger
 
Communitygetriebe Android Systementwicklung
Communitygetriebe Android SystementwicklungCommunitygetriebe Android Systementwicklung
Communitygetriebe Android Systementwicklung
Marc Seeger
 
Eventdriven I/O - A hands on introduction
Eventdriven I/O - A hands on introductionEventdriven I/O - A hands on introduction
Eventdriven I/O - A hands on introduction
Marc Seeger
 
Key-Value Stores: a practical overview
Key-Value Stores: a practical overviewKey-Value Stores: a practical overview
Key-Value Stores: a practical overview
Marc Seeger
 
Security In Dect
Security In DectSecurity In Dect
Security In Dect
Marc Seeger
 

More from Marc Seeger (17)

DevOps Boston - Heartbleed at Acquia
DevOps Boston - Heartbleed at AcquiaDevOps Boston - Heartbleed at Acquia
DevOps Boston - Heartbleed at Acquia
 
The current state of anonymous filesharing
The current state of anonymous filesharingThe current state of anonymous filesharing
The current state of anonymous filesharing
 
Lunch and learn: Cucumber and Capybara
Lunch and learn: Cucumber and CapybaraLunch and learn: Cucumber and Capybara
Lunch and learn: Cucumber and Capybara
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
 
building blocks of a scalable webcrawler
building blocks of a scalable webcrawlerbuilding blocks of a scalable webcrawler
building blocks of a scalable webcrawler
 
Communitygetriebe Android Systementwicklung
Communitygetriebe Android SystementwicklungCommunitygetriebe Android Systementwicklung
Communitygetriebe Android Systementwicklung
 
Eventdriven I/O - A hands on introduction
Eventdriven I/O - A hands on introductionEventdriven I/O - A hands on introduction
Eventdriven I/O - A hands on introduction
 
Alternative Infrastucture
Alternative InfrastuctureAlternative Infrastucture
Alternative Infrastucture
 
Communitygetriebene Android Systemerweiterungen
Communitygetriebene Android SystemerweiterungenCommunitygetriebene Android Systemerweiterungen
Communitygetriebene Android Systemerweiterungen
 
Key-Value Stores: a practical overview
Key-Value Stores: a practical overviewKey-Value Stores: a practical overview
Key-Value Stores: a practical overview
 
The Dirac Video CoDec
The Dirac Video CoDecThe Dirac Video CoDec
The Dirac Video CoDec
 
Anonimität - Konzepte und Werkzeuge
Anonimität - Konzepte und WerkzeugeAnonimität - Konzepte und Werkzeuge
Anonimität - Konzepte und Werkzeuge
 
Security In Dect
Security In DectSecurity In Dect
Security In Dect
 
Social Media in der Unternehmenskommunikation
Social Media in der UnternehmenskommunikationSocial Media in der Unternehmenskommunikation
Social Media in der Unternehmenskommunikation
 
xDSL, DSLAM & CO
xDSL, DSLAM & COxDSL, DSLAM & CO
xDSL, DSLAM & CO
 
Ruby Xml Mapping
Ruby Xml MappingRuby Xml Mapping
Ruby Xml Mapping
 
HdM Stuttgart Präsentationstag PPTP VPN WLAN Update
HdM Stuttgart Präsentationstag PPTP VPN WLAN UpdateHdM Stuttgart Präsentationstag PPTP VPN WLAN Update
HdM Stuttgart Präsentationstag PPTP VPN WLAN Update
 

Recently uploaded

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Recently uploaded (20)

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

ZFS

  • 1. ZFS filesystem++ Marc Seeger (marc-seeger.de) Now with 100% more pictures of kittens!
  • 3. The usual features File names/Directories => FAT / Inodes Metadata => time/size/… CRUD => + truncate, appending, moving, links Security => ACL ()/capabilites ()
  • 4. The good stuff Journaling => metadata only /complete Encryption Transparent compression Checksums Snapshots/Versioning
  • 6. Logical Volume Manager (LVM) Operating System Logical Volume Manager Volume(consisting of HDDs)
  • 8. Silent data corruption Controller, cable, drive, firmware, ... CERN: Large Hadron Collider = >15.000 TB/year „Data integrity“ paper* Disk errors. 2 GB file to > 3,000 nodes every 2 hours for 5 weeks=> 500 errors on 100 nodes. Single bit errors. 10% of disk errors. Sector (512 bytes) sized errors. 10% of disk errors. 64 KB regions. 80% of disk errors. (Bug in WD disk firmware + 3Ware controller cards) RAID errors. 492 RAID systems each week for 4 weeks.Specs: Bit Error Rate of 10^14 read/written. Good news: only about 1/3 of the spec’d rate.Bad news: 2.4 petabytes of data => 300 errors. Memory errors.Good news: only 3 double-bit errors in 3 months on 1300 nodes.Bad news: according to the spec there shouldn’t have been any. (double bit errors can’t be corrected.)  CERN found an overall byte error rate of 3 * 10^7 * http://indico.cern.ch/getFile.py/access?contribId=3&resId=1&materialId=paper&confId=13797
  • 9. Management Labels, partitions, volumes, provisioning, grow/shrink, /etc files... Limits: filesystem/volume size, file size, number of files, files per directory, number of snapshots ... Different tools to manage file, block, iSCSI, NFS, CIFS ...
  • 10. Slow Linear-time create fat locks fixed block size naïve prefetch dirty region logging painful RAID rebuilds growing backup time
  • 11. What‘s different about ZFS “ZFS is a new kind of file system that provides: simple administration transactional semantics end-to-end data integrity immense scalability. ZFS is not an incremental improvement to existing technology; it is a fundamentally new approach to data management. We've blown away 20 years of obsolete assumptions, eliminated complexity at the source, and created a storage system that's actually a pleasure to use.”
  • 12. pooled storage model completely eliminates: the concept of volumes and the associated problems of: Partitions Provisioning Wasted bandwidth Stranded storage. Thousands of file systems can draw from a common storage pool, each one consuming only as much space as it actually needs. The combined I/O bandwidth of all devices in the pool is available to all filesystems at all times.
  • 13. All operations are copy-on-write transactions  the on-disk state is always valid. There is no need to fsck(1M) a ZFS file system, ever. Every block is checksummed to prevent silent data corruption (user-selectable algorithm) the data is self-healing in replicated (mirrored or RAID) configurations. If one copy is damaged, ZFS detects it and uses another copy to repair it.
  • 14. RAID-Z similar to RAID-5 but: uses variable stripe width to eliminate the RAID-5 write hole. All RAID-Z writes are full-stripe writes. no read-modify-write tax no write hole no need for NVRAM in hardware.(ZFS loves cheap disks)
  • 15. But cheap disks can fail! No problem: ZFS provides disk scrubbing(like ECC memory scrubbing) 256 bit block checksum works while storage pool is live and in use!
  • 16. Scalability 128-bit filesystem256 quadrillion zettabytes. All metadata is allocated dynamically  no need to pre-allocate inodes or otherwise limit the scalability of the file system when it is first created. Directories can have up to 248 (256 trillion) entries No limit exists on the number of file systems … … or number of files that can be contained within a file system.
  • 17. Snapshots A snapshot is a read-only copy of a file system or volume. Snapshots can be created quickly and easily. Initially, snapshots consume no additional space within the pool. Snapshots are happening at constant-time As data within the active dataset changes, the snapshot consumes space by continuing to reference the old data. Incremental backups are so efficient that they can be used for remote replication — e.g. to transmit an incremental update every 10 seconds.
  • 18. Performance! ZFS has a pipelined I/O engine, similar in concept to CPU pipelines. The pipeline operates on I/O dependency graphs and provides scoreboarding, priority, deadline scheduling, out-of-order issue and I/O aggregation. I/O loads that bring other file systems to their knees are handled with ease by the ZFS I/O pipeline. (quote: sun)
  • 19. Compression ZFS provides built-in compression. In addition to reducing space usage by 2-3x, compression also reduces the amount of I/O by 2-3x. For this reason, enabling compression actually makes some workloads go faster. In addition to file systems, ZFS storage pools can provide volumes for applications that need raw-device semantics. ZFS volumes can be used as swap devices, for example. And if you enable compression on a swap volume, you now have compressed virtual memory.

Editor's Notes

  1. For example, deleting a file on a Unix file system involves two steps:Removing its directory entry.Marking space for the file and its inode as free in the free space map.
  2. Erweiterbarkeit der Volume Groups durch Hinzufügen von Physical Volumes (Festplatten) und der daraus folgenden Erweiterbarkeit der darin enthaltenen Logical Volumes. Unter den meisten Betriebssystemen im laufenden Betrieb möglich RAID
  3. http://storagemojo.com/2007/09/19/cerns-data-corruption-research/
  4. Dirty Region Logging (DRL) is an optional property of a volume, used to provide a speedy recovery of mirrored volumes after a system failure. DRL keeps track of the regions that have changed due to I/O writes to a mirrored volume. DRL uses this information to recover only the portions of the volume that need to be recovered.