SlideShare a Scribd company logo
1 of 37
Download to read offline
Performance evaluation of Linux Discard
Support
(Overview, benchmark results, current status)

Red Hat
Luk´ˇ Czerner
   as
May 27, 2011
Copyright ©    2011 Luk´ˇ Czerner, Red Hat.
                         as
Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation
License, Version 1.3 or any later version published by the Free
Software Foundation; with no Invariant Sections, no Front-Cover
Texts, and no Back-Cover Texts. A copy of the license is included
in the COPYING file.
Part I
Discard Background
Agenda



1 SSD Description



2 Thinly Provisioned Storage



3 Introducing Linux Discard Support
SSD Description




Solid-State Drive




   Flash memory block device
   Wear-leveling needed
   Firmware = black box
SSD Description




ATA TRIM Command


  Helps handle garbage collection overhead
  Subsequent READ of TRIM’ed blocks
    1   Read data should NOT change between READ’s
    2   Read data should NOT be retrieved from data previously
        written to any other LBA.
  As long as the device has enough free pages to write to we do
  not necessarily need it.
  In a nutshell: TRIM command tells the device what LBA’a is
  not used by the OS anymore.
Thinly Provisioned Storage




Thin Provisioning




   Unlike in traditional storage, there is no fixed one-to-one
   logical bock to physical storage mapping
   More efficient use of storage space
   Block reclamation interface needed
Thinly Provisioned Storage




SCSI UNMAP / WRITE SAME



  Storage space reclamation interface
  Subsequent READ of unmapped blocks
    1   Read data should NOT change between READ’s
    2   Read data should NOT be retrieved from data previously
        written to any other LBA.
  Unlike with SSD’s we can not afford to wait until we run out
  of space for reclamation.
Introducing Linux Discard Support




Linux Discard Implementation



   Abstraction for the two underlying specifications:
     1   ATA TRIM Command
     2   SCSI UNMAP / WRITE SAME
   API for user-space
         BLKDISCARD ioctl
         Added with v2.6.27-rc9-30-gd30a260
   API for File Sytems
     1   sb issue discard()
     2   blkdev issue discard()
Part II
Discard Performance
Agenda




4 Testing Methodology



5 Results
Testing Methodology




What do we need to find out ?



   Does discard really work ? Is it reliable ?
   How fast/slow is it ?
   Is there any difference between devices from different vendors
   ?
   What is the ideal discard size ?
   SSD performance degradation
Testing Methodology




How do we test it ?


   BLKDISCARD ioctl()
   Automatic discard of different ranges
   Different discard patterns
     1   sequential performance
     2   random IO peformance
     3   discard already discarded blocks
   test-discard - discard benchmarking tool
         http://sourceforge.net/projects/test-discard/
   impression - filesystem aging tool
Results




Sequential discard performance
               300                                          1200
                                     Duration Summary
                                            Throughput

               250                                          1000


               200                                          800




                                                                   Throughput [MB/s]
Duration [s]




               150                                          600


               100                                          400


               50                                           200


                0                                            0
                     10              100                 1000
                          Record size [kB]
Results




Different modes comparison
                    1200
                                          Throughput (sequential)
                                          Throughput (random IO)
                                            Throughput (discard2)
                    1000


                     800
Throughput [MB/s]




                     600


                     400


                     200


                      0
                           10               100                     1000
                                Record size [kB]
Results




Difference between various vendors
                    1200
                                             Throughput Vendor 1
                                             Throughput Vendor 2
                                             Throughput Vendor 3
                    1000


                     800
Throughput [MB/s]




                     600


                     400


                     200


                      0
                           10               100                    1000
                                Record size [kB]
Results




SSD performance degradation
                            70
                                                                             Vendor1
                                                                             Vendor2
                                                                             Vendor3
                            60                                         Vendor3 discard
 Write performance [MB/s]




                            50


                            40


                            30


                            20


                            10



                                 0   50   100    150      200       250     300      350   400
                                                Filesystem saturation [%]
Part III
Discard Support for Linux File systems
Agenda



6 Periodic Discard



7 Discard Batching



8 Different Approach
Periodic Discard




Periodic discard



   Easy to implement
   File system support
     1   ext4 (v2.6.27-5185-g8a0aba7)
     2   btrfs (since upstream)
     3   gfs2 (v2.6.29-9-gf15ab56)
     4   fat, swap, nilfs
   mount -o discard /dev/sdc /mnt/test
   TRIM is non-queueable command - implications ?
Periodic Discard




Benchmarking periodic discard



   Expectations ?
   Testing methodology
     1   Metadata intensive load
     2   Load with removing files
     3   Reasonable file size distribution
   Discard-kit
     1   Using PostMark
     2   http://sourceforge.net/projects/test-discard/files/
Periodic Discard




Ext4 performance (18% hit)
               800
                                       Deleted/s
                                       Append/s
                                         Read/s
               700               Files-created/s
                                 Transactions/s
                                      Read[B/s]
               600                    Write[B/s]
 Opetation/s




               500


               400


               300


               200


               100
                     nodiscard       discard
Periodic Discard




Performance with various file systems
                   750
                                                                    ext4
                   700                                              btrfs

                   650                                              gfs2
  Transactions/s




                   600

                   550

                   500

                   450

                   400

                   350
                         no


                                   di




                                              no


                                                        di




                                                                   no


                                                                             di
                                     sc




                                                          sc




                                                                               sc
                          di




                                               di




                                                                    di
                                      ar




                                                           ar




                                                                                ar
                            sc




                                                 sc




                                                                      sc
                                          d




                                                               d




                                                                                    d
                              ar




                                                   ar




                                                                        ar
                                 d




                                                      d




                          -18%                  -7%                 -63%   d
Discard Batching




Discard Batching - The idea



   Fine-grained discard is not necessarily needed
   Small extents are slow
   With time, freed extents tends to coalesce
   Disadvantages
     1 There is a price for tracking freed extents
     2 Discarding already discarded blocks should be easy, but...
     3 Daemon (in-kernel, user-space) needed.
     4 File system independent solution would most likely be pain to
       do right (if possible).
Discard Batching




Batched discard support



   File system specific solution
   Provide ioctl() interface - FITRIM
   Do not disturb other ongoing IO too much
     1   Prevent allocations while trimming
     2   How to handle huge filesystem ?
   File system support
     1   ext4 (v2.6.36-rc6-35-g7360d17)
     2   ext3 (v2.6.37-11-g9c52749)
     3   xfs (v2.6.37-rc4-63-ga46db60)
Discard Batching




FITRIM ioctl


   Ioctl with one RW parameter defined in linux/fs.h
   struct fstrim range {
     u64 start;
     u64 len;
     u64 minlen;
   }
   fstrim tool
        http://sourceforge.net/projects/fstrim/
   util-linux-ng
        Since v2.18-165-gd9e2d0d
Discard Batching




Batched discard benchmark results
                 6
                                                      FITRIM on ext4
                                                       BLKDISCARD

                 5


                 4
  Duration [s]




                 3


                 2


                 1


                 0
                     0   20       40             60           80         100
                              Filesystem saturation [%]
Different Approach




Alternative approach




   It is always a compromise
   The future of SSD’s and thinly provisioned LUN’s (???)
Part IV
Discard Support in user-space
Agenda




9 e2fsprogs



10 Other utilities
e2fsprogs




Discard in e2fsprogs tools

   Using BLKDISCARD ioctl()
   mke2fs
     1   Refresh SSD’s garbage collector
     2   discard zeroes data - significant speed boost
     3   mkfs.ext4 -E discard /dev/sdc
   e2fsck
     1   After the last check discard free space
     2   Non detected file system errors ? oops
     3   fsck.ext4 -E discard /dev/sdc
   resize2fs
     1   Refresh SSD’s garbage collector
     2   discard zeroes data - significant speed boost
     3   resize2fs -E discard /dev/sdc
e2fsprogs




File system creation
               30
                                         nodiscard
                                           discard

               25


               20
Duration [s]




               15


               10


               5


               0
                    EXT4                 XFS
                           File system
Other utilities




Fstrim tool




   Very simple tool to invoke FITRIM ioctl on mounted file
   system
   Stand-alone tool
       http://sourceforge.net/projects/fstrim/
   Since v2.18-165-gd9e2d0d part of util-linux-ng
Part V
Summary
Summary
  Linux Discard support is a abstraction for underlying
  specification
  Exported via BLKDISCARD ioctl to user-space and
  blkdev issue discard() for filesystems
  Discard testing kit (Discard-kit)
    1   test-discard
    2   PostMark
  Filesystem support
    1   Fine grained (online) discard - mount -o discard
    2   Batched discard support - fstrim from util-linux-ng
  Support in user-space utilities
    1   Filesystem creation (mkfs)
    2   e2fsprogs - mkfs,e2fsck,resize2fs
    3   xfsprogs - mkfs
    4   fstrim
The end.
Thanks for listening.
Useful links




   http://sourceforge.net/projects/fstrim/
   http://sourceforge.net/projects/test-discard/
   http://people.redhat.com/lczerner/discard/

More Related Content

Similar to Performance evaluation of Linux Discard Support

Bottlenecks, Bottlenecks, and more Bottlenecks: Lessons Learned from 2 Years ...
Bottlenecks, Bottlenecks, and more Bottlenecks: Lessons Learned from 2 Years ...Bottlenecks, Bottlenecks, and more Bottlenecks: Lessons Learned from 2 Years ...
Bottlenecks, Bottlenecks, and more Bottlenecks: Lessons Learned from 2 Years ...Enkitec
 
Presentation sun storage tek™ 2500 series
Presentation   sun storage tek™ 2500 seriesPresentation   sun storage tek™ 2500 series
Presentation sun storage tek™ 2500 seriesxKinAnx
 
Evaluating the networking performance of linux based home router platforms fo...
Evaluating the networking performance of linux based home router platforms fo...Evaluating the networking performance of linux based home router platforms fo...
Evaluating the networking performance of linux based home router platforms fo...Alpen-Adria-Universität
 
Practical experiences and best practices for SSD and IBM i
Practical experiences and best practices for SSD and IBM iPractical experiences and best practices for SSD and IBM i
Practical experiences and best practices for SSD and IBM iCOMMON Europe
 
An introduction and evaluations of a wide area distributed storage system
An introduction and evaluations of  a wide area distributed storage systemAn introduction and evaluations of  a wide area distributed storage system
An introduction and evaluations of a wide area distributed storage systemHiroki Kashiwazaki
 
Présentation Aerys
Présentation Aerys Présentation Aerys
Présentation Aerys iCOMMUNITY
 
Diamondmax plus 8_data_sheet
Diamondmax plus 8_data_sheetDiamondmax plus 8_data_sheet
Diamondmax plus 8_data_sheetceed2013
 
Perf Storwize V7000 Eng
Perf Storwize V7000 EngPerf Storwize V7000 Eng
Perf Storwize V7000 EngOleg Korol
 
How to Create a High-Speed Template Engine in Python
How to Create a High-Speed Template Engine in PythonHow to Create a High-Speed Template Engine in Python
How to Create a High-Speed Template Engine in Pythonkwatch
 
Toshiba Storage Protfolio 2014
Toshiba Storage Protfolio 2014Toshiba Storage Protfolio 2014
Toshiba Storage Protfolio 2014Mustafa Kuğu
 
Lustre client performance comparison and tuning (1.8.x to 2.x)
Lustre client performance comparison and tuning (1.8.x to 2.x)Lustre client performance comparison and tuning (1.8.x to 2.x)
Lustre client performance comparison and tuning (1.8.x to 2.x)inside-BigData.com
 
M series technical presentation-part 1
M series technical presentation-part 1M series technical presentation-part 1
M series technical presentation-part 1xKinAnx
 
CPU Subsystem Total Power Consumption: Understanding the Factors and Selectin...
CPU Subsystem Total Power Consumption: Understanding the Factors and Selectin...CPU Subsystem Total Power Consumption: Understanding the Factors and Selectin...
CPU Subsystem Total Power Consumption: Understanding the Factors and Selectin...CAST, Inc.
 
IBM Solid State in eX5 servers
IBM Solid State in eX5 serversIBM Solid State in eX5 servers
IBM Solid State in eX5 serversTony Pearson
 
Calton pu experimental methods on performance in cloud and accuracy in big da...
Calton pu experimental methods on performance in cloud and accuracy in big da...Calton pu experimental methods on performance in cloud and accuracy in big da...
Calton pu experimental methods on performance in cloud and accuracy in big da...jins0618
 

Similar to Performance evaluation of Linux Discard Support (20)

Bottlenecks, Bottlenecks, and more Bottlenecks: Lessons Learned from 2 Years ...
Bottlenecks, Bottlenecks, and more Bottlenecks: Lessons Learned from 2 Years ...Bottlenecks, Bottlenecks, and more Bottlenecks: Lessons Learned from 2 Years ...
Bottlenecks, Bottlenecks, and more Bottlenecks: Lessons Learned from 2 Years ...
 
Presentation sun storage tek™ 2500 series
Presentation   sun storage tek™ 2500 seriesPresentation   sun storage tek™ 2500 series
Presentation sun storage tek™ 2500 series
 
Evaluating the networking performance of linux based home router platforms fo...
Evaluating the networking performance of linux based home router platforms fo...Evaluating the networking performance of linux based home router platforms fo...
Evaluating the networking performance of linux based home router platforms fo...
 
SPC BENCHMARK 2/ENERGY™ EXECUTIVE SUMMARY
SPC BENCHMARK 2/ENERGY™  EXECUTIVE SUMMARYSPC BENCHMARK 2/ENERGY™  EXECUTIVE SUMMARY
SPC BENCHMARK 2/ENERGY™ EXECUTIVE SUMMARY
 
VyattaCore TIPS2013
VyattaCore TIPS2013VyattaCore TIPS2013
VyattaCore TIPS2013
 
Windows Azure Storage – Architecture View
Windows Azure Storage – Architecture ViewWindows Azure Storage – Architecture View
Windows Azure Storage – Architecture View
 
Practical experiences and best practices for SSD and IBM i
Practical experiences and best practices for SSD and IBM iPractical experiences and best practices for SSD and IBM i
Practical experiences and best practices for SSD and IBM i
 
An introduction and evaluations of a wide area distributed storage system
An introduction and evaluations of  a wide area distributed storage systemAn introduction and evaluations of  a wide area distributed storage system
An introduction and evaluations of a wide area distributed storage system
 
Présentation Aerys
Présentation Aerys Présentation Aerys
Présentation Aerys
 
Diamondmax plus 8_data_sheet
Diamondmax plus 8_data_sheetDiamondmax plus 8_data_sheet
Diamondmax plus 8_data_sheet
 
Perf Storwize V7000 Eng
Perf Storwize V7000 EngPerf Storwize V7000 Eng
Perf Storwize V7000 Eng
 
How to Create a High-Speed Template Engine in Python
How to Create a High-Speed Template Engine in PythonHow to Create a High-Speed Template Engine in Python
How to Create a High-Speed Template Engine in Python
 
Toshiba Storage Protfolio 2014
Toshiba Storage Protfolio 2014Toshiba Storage Protfolio 2014
Toshiba Storage Protfolio 2014
 
Lustre client performance comparison and tuning (1.8.x to 2.x)
Lustre client performance comparison and tuning (1.8.x to 2.x)Lustre client performance comparison and tuning (1.8.x to 2.x)
Lustre client performance comparison and tuning (1.8.x to 2.x)
 
M series technical presentation-part 1
M series technical presentation-part 1M series technical presentation-part 1
M series technical presentation-part 1
 
CPU Subsystem Total Power Consumption: Understanding the Factors and Selectin...
CPU Subsystem Total Power Consumption: Understanding the Factors and Selectin...CPU Subsystem Total Power Consumption: Understanding the Factors and Selectin...
CPU Subsystem Total Power Consumption: Understanding the Factors and Selectin...
 
Sending Push Notifications using the Windows Push Notification Service and Wi...
Sending Push Notifications using the Windows Push Notification Service and Wi...Sending Push Notifications using the Windows Push Notification Service and Wi...
Sending Push Notifications using the Windows Push Notification Service and Wi...
 
IBM Solid State in eX5 servers
IBM Solid State in eX5 serversIBM Solid State in eX5 servers
IBM Solid State in eX5 servers
 
Linux on System z disk I/O performance
Linux on System z disk I/O performanceLinux on System z disk I/O performance
Linux on System z disk I/O performance
 
Calton pu experimental methods on performance in cloud and accuracy in big da...
Calton pu experimental methods on performance in cloud and accuracy in big da...Calton pu experimental methods on performance in cloud and accuracy in big da...
Calton pu experimental methods on performance in cloud and accuracy in big da...
 

Recently uploaded

.NET 8 ChatBot with Azure OpenAI Services.pptx
.NET 8 ChatBot with Azure OpenAI Services.pptx.NET 8 ChatBot with Azure OpenAI Services.pptx
.NET 8 ChatBot with Azure OpenAI Services.pptxHansamali Gamage
 
2024.03.12 Cost drivers of cultivated meat production.pdf
2024.03.12 Cost drivers of cultivated meat production.pdf2024.03.12 Cost drivers of cultivated meat production.pdf
2024.03.12 Cost drivers of cultivated meat production.pdfThe Good Food Institute
 
Novo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNovo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNeo4j
 
UiPath Studio Web workshop Series - Day 3
UiPath Studio Web workshop Series - Day 3UiPath Studio Web workshop Series - Day 3
UiPath Studio Web workshop Series - Day 3DianaGray10
 
Outage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedIn
Outage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedInOutage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedIn
Outage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedInThousandEyes
 
UiPath Studio Web workshop series - Day 1
UiPath Studio Web workshop series  - Day 1UiPath Studio Web workshop series  - Day 1
UiPath Studio Web workshop series - Day 1DianaGray10
 
March Patch Tuesday
March Patch TuesdayMarch Patch Tuesday
March Patch TuesdayIvanti
 
Stobox 4: Revolutionizing Investment in Real-World Assets Through Tokenization
Stobox 4: Revolutionizing Investment in Real-World Assets Through TokenizationStobox 4: Revolutionizing Investment in Real-World Assets Through Tokenization
Stobox 4: Revolutionizing Investment in Real-World Assets Through TokenizationStobox
 
Oracle Database 23c Security New Features.pptx
Oracle Database 23c Security New Features.pptxOracle Database 23c Security New Features.pptx
Oracle Database 23c Security New Features.pptxSatishbabu Gunukula
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightSafe Software
 
UiPath Studio Web workshop series - Day 2
UiPath Studio Web workshop series - Day 2UiPath Studio Web workshop series - Day 2
UiPath Studio Web workshop series - Day 2DianaGray10
 
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxEmil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxNeo4j
 
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENT
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENTSIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENT
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENTxtailishbaloch
 
How to release an Open Source Dataweave Library
How to release an Open Source Dataweave LibraryHow to release an Open Source Dataweave Library
How to release an Open Source Dataweave Libraryshyamraj55
 
EMEA What is ThousandEyes? Webinar
EMEA What is ThousandEyes? WebinarEMEA What is ThousandEyes? Webinar
EMEA What is ThousandEyes? WebinarThousandEyes
 
AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024Brian Pichman
 
My key hands-on projects in Quantum, and QAI
My key hands-on projects in Quantum, and QAIMy key hands-on projects in Quantum, and QAI
My key hands-on projects in Quantum, and QAIVijayananda Mohire
 
CyberSecurity - Computers In Libraries 2024
CyberSecurity - Computers In Libraries 2024CyberSecurity - Computers In Libraries 2024
CyberSecurity - Computers In Libraries 2024Brian Pichman
 
Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)Muhammad Tiham Siddiqui
 

Recently uploaded (20)

.NET 8 ChatBot with Azure OpenAI Services.pptx
.NET 8 ChatBot with Azure OpenAI Services.pptx.NET 8 ChatBot with Azure OpenAI Services.pptx
.NET 8 ChatBot with Azure OpenAI Services.pptx
 
2024.03.12 Cost drivers of cultivated meat production.pdf
2024.03.12 Cost drivers of cultivated meat production.pdf2024.03.12 Cost drivers of cultivated meat production.pdf
2024.03.12 Cost drivers of cultivated meat production.pdf
 
Novo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNovo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4j
 
UiPath Studio Web workshop Series - Day 3
UiPath Studio Web workshop Series - Day 3UiPath Studio Web workshop Series - Day 3
UiPath Studio Web workshop Series - Day 3
 
Outage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedIn
Outage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedInOutage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedIn
Outage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedIn
 
UiPath Studio Web workshop series - Day 1
UiPath Studio Web workshop series  - Day 1UiPath Studio Web workshop series  - Day 1
UiPath Studio Web workshop series - Day 1
 
March Patch Tuesday
March Patch TuesdayMarch Patch Tuesday
March Patch Tuesday
 
Stobox 4: Revolutionizing Investment in Real-World Assets Through Tokenization
Stobox 4: Revolutionizing Investment in Real-World Assets Through TokenizationStobox 4: Revolutionizing Investment in Real-World Assets Through Tokenization
Stobox 4: Revolutionizing Investment in Real-World Assets Through Tokenization
 
Oracle Database 23c Security New Features.pptx
Oracle Database 23c Security New Features.pptxOracle Database 23c Security New Features.pptx
Oracle Database 23c Security New Features.pptx
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and Insight
 
UiPath Studio Web workshop series - Day 2
UiPath Studio Web workshop series - Day 2UiPath Studio Web workshop series - Day 2
UiPath Studio Web workshop series - Day 2
 
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxEmil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
 
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENT
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENTSIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENT
SIM INFORMATION SYSTEM: REVOLUTIONIZING DATA MANAGEMENT
 
How to release an Open Source Dataweave Library
How to release an Open Source Dataweave LibraryHow to release an Open Source Dataweave Library
How to release an Open Source Dataweave Library
 
EMEA What is ThousandEyes? Webinar
EMEA What is ThousandEyes? WebinarEMEA What is ThousandEyes? Webinar
EMEA What is ThousandEyes? Webinar
 
AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024
 
My key hands-on projects in Quantum, and QAI
My key hands-on projects in Quantum, and QAIMy key hands-on projects in Quantum, and QAI
My key hands-on projects in Quantum, and QAI
 
SheDev 2024
SheDev 2024SheDev 2024
SheDev 2024
 
CyberSecurity - Computers In Libraries 2024
CyberSecurity - Computers In Libraries 2024CyberSecurity - Computers In Libraries 2024
CyberSecurity - Computers In Libraries 2024
 
Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)
 

Performance evaluation of Linux Discard Support

  • 1. Performance evaluation of Linux Discard Support (Overview, benchmark results, current status) Red Hat Luk´ˇ Czerner as May 27, 2011
  • 2. Copyright © 2011 Luk´ˇ Czerner, Red Hat. as Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the COPYING file.
  • 4. Agenda 1 SSD Description 2 Thinly Provisioned Storage 3 Introducing Linux Discard Support
  • 5. SSD Description Solid-State Drive Flash memory block device Wear-leveling needed Firmware = black box
  • 6. SSD Description ATA TRIM Command Helps handle garbage collection overhead Subsequent READ of TRIM’ed blocks 1 Read data should NOT change between READ’s 2 Read data should NOT be retrieved from data previously written to any other LBA. As long as the device has enough free pages to write to we do not necessarily need it. In a nutshell: TRIM command tells the device what LBA’a is not used by the OS anymore.
  • 7. Thinly Provisioned Storage Thin Provisioning Unlike in traditional storage, there is no fixed one-to-one logical bock to physical storage mapping More efficient use of storage space Block reclamation interface needed
  • 8. Thinly Provisioned Storage SCSI UNMAP / WRITE SAME Storage space reclamation interface Subsequent READ of unmapped blocks 1 Read data should NOT change between READ’s 2 Read data should NOT be retrieved from data previously written to any other LBA. Unlike with SSD’s we can not afford to wait until we run out of space for reclamation.
  • 9. Introducing Linux Discard Support Linux Discard Implementation Abstraction for the two underlying specifications: 1 ATA TRIM Command 2 SCSI UNMAP / WRITE SAME API for user-space BLKDISCARD ioctl Added with v2.6.27-rc9-30-gd30a260 API for File Sytems 1 sb issue discard() 2 blkdev issue discard()
  • 12. Testing Methodology What do we need to find out ? Does discard really work ? Is it reliable ? How fast/slow is it ? Is there any difference between devices from different vendors ? What is the ideal discard size ? SSD performance degradation
  • 13. Testing Methodology How do we test it ? BLKDISCARD ioctl() Automatic discard of different ranges Different discard patterns 1 sequential performance 2 random IO peformance 3 discard already discarded blocks test-discard - discard benchmarking tool http://sourceforge.net/projects/test-discard/ impression - filesystem aging tool
  • 14. Results Sequential discard performance 300 1200 Duration Summary Throughput 250 1000 200 800 Throughput [MB/s] Duration [s] 150 600 100 400 50 200 0 0 10 100 1000 Record size [kB]
  • 15. Results Different modes comparison 1200 Throughput (sequential) Throughput (random IO) Throughput (discard2) 1000 800 Throughput [MB/s] 600 400 200 0 10 100 1000 Record size [kB]
  • 16. Results Difference between various vendors 1200 Throughput Vendor 1 Throughput Vendor 2 Throughput Vendor 3 1000 800 Throughput [MB/s] 600 400 200 0 10 100 1000 Record size [kB]
  • 17. Results SSD performance degradation 70 Vendor1 Vendor2 Vendor3 60 Vendor3 discard Write performance [MB/s] 50 40 30 20 10 0 50 100 150 200 250 300 350 400 Filesystem saturation [%]
  • 18. Part III Discard Support for Linux File systems
  • 19. Agenda 6 Periodic Discard 7 Discard Batching 8 Different Approach
  • 20. Periodic Discard Periodic discard Easy to implement File system support 1 ext4 (v2.6.27-5185-g8a0aba7) 2 btrfs (since upstream) 3 gfs2 (v2.6.29-9-gf15ab56) 4 fat, swap, nilfs mount -o discard /dev/sdc /mnt/test TRIM is non-queueable command - implications ?
  • 21. Periodic Discard Benchmarking periodic discard Expectations ? Testing methodology 1 Metadata intensive load 2 Load with removing files 3 Reasonable file size distribution Discard-kit 1 Using PostMark 2 http://sourceforge.net/projects/test-discard/files/
  • 22. Periodic Discard Ext4 performance (18% hit) 800 Deleted/s Append/s Read/s 700 Files-created/s Transactions/s Read[B/s] 600 Write[B/s] Opetation/s 500 400 300 200 100 nodiscard discard
  • 23. Periodic Discard Performance with various file systems 750 ext4 700 btrfs 650 gfs2 Transactions/s 600 550 500 450 400 350 no di no di no di sc sc sc di di di ar ar ar sc sc sc d d d ar ar ar d d -18% -7% -63% d
  • 24. Discard Batching Discard Batching - The idea Fine-grained discard is not necessarily needed Small extents are slow With time, freed extents tends to coalesce Disadvantages 1 There is a price for tracking freed extents 2 Discarding already discarded blocks should be easy, but... 3 Daemon (in-kernel, user-space) needed. 4 File system independent solution would most likely be pain to do right (if possible).
  • 25. Discard Batching Batched discard support File system specific solution Provide ioctl() interface - FITRIM Do not disturb other ongoing IO too much 1 Prevent allocations while trimming 2 How to handle huge filesystem ? File system support 1 ext4 (v2.6.36-rc6-35-g7360d17) 2 ext3 (v2.6.37-11-g9c52749) 3 xfs (v2.6.37-rc4-63-ga46db60)
  • 26. Discard Batching FITRIM ioctl Ioctl with one RW parameter defined in linux/fs.h struct fstrim range { u64 start; u64 len; u64 minlen; } fstrim tool http://sourceforge.net/projects/fstrim/ util-linux-ng Since v2.18-165-gd9e2d0d
  • 27. Discard Batching Batched discard benchmark results 6 FITRIM on ext4 BLKDISCARD 5 4 Duration [s] 3 2 1 0 0 20 40 60 80 100 Filesystem saturation [%]
  • 28. Different Approach Alternative approach It is always a compromise The future of SSD’s and thinly provisioned LUN’s (???)
  • 29. Part IV Discard Support in user-space
  • 31. e2fsprogs Discard in e2fsprogs tools Using BLKDISCARD ioctl() mke2fs 1 Refresh SSD’s garbage collector 2 discard zeroes data - significant speed boost 3 mkfs.ext4 -E discard /dev/sdc e2fsck 1 After the last check discard free space 2 Non detected file system errors ? oops 3 fsck.ext4 -E discard /dev/sdc resize2fs 1 Refresh SSD’s garbage collector 2 discard zeroes data - significant speed boost 3 resize2fs -E discard /dev/sdc
  • 32. e2fsprogs File system creation 30 nodiscard discard 25 20 Duration [s] 15 10 5 0 EXT4 XFS File system
  • 33. Other utilities Fstrim tool Very simple tool to invoke FITRIM ioctl on mounted file system Stand-alone tool http://sourceforge.net/projects/fstrim/ Since v2.18-165-gd9e2d0d part of util-linux-ng
  • 35. Summary Linux Discard support is a abstraction for underlying specification Exported via BLKDISCARD ioctl to user-space and blkdev issue discard() for filesystems Discard testing kit (Discard-kit) 1 test-discard 2 PostMark Filesystem support 1 Fine grained (online) discard - mount -o discard 2 Batched discard support - fstrim from util-linux-ng Support in user-space utilities 1 Filesystem creation (mkfs) 2 e2fsprogs - mkfs,e2fsck,resize2fs 3 xfsprogs - mkfs 4 fstrim
  • 36. The end. Thanks for listening.
  • 37. Useful links http://sourceforge.net/projects/fstrim/ http://sourceforge.net/projects/test-discard/ http://people.redhat.com/lczerner/discard/