NexentaStor Performance Tuning - OpenStorage Summit 2011

Published in: Technology, Business

  1. NexentaStor Performance Tuning
     Richard Elling, Senior Director of Solutions Engineering, Nexenta
  2. Agenda
     • Read Performance Model
     • Device Performance Characterization
     • NexentaStor Tunables
  3. Read Performance Model
  4. NexentaStor Performance
     • Performance of NexentaStor systems is difficult to predict
     • Generally better than proprietary RAID systems
       – Proprietary systems tend to use wimpy CPUs with limited amounts of memory for cache
       – NexentaStor systems scale with the latest processor and memory technology
     • Best NexentaStor performance achieved by choosing the best hardware configuration for the job, not by “tuning” NexentaStor software
  5. Good Hardware Choices
     • NexentaStor uses block devices
       – HDDs
       – SSDs
       – Anything that looks like a set of blocks
         • Size must be greater than 64 MB
         • Sorry, floppy disks are too small
     • NexentaStor block drivers
       – Initiators: ATA, IDE, SATA, SAS, Parallel SCSI, iSCSI, FC, DDRdrive, USB, SD, CF, XD, MMC
       – Others: files, ramdisk, BD
  6. Hybrid Storage Pool
     Optimize performance and cost
     Adaptive Replacement Cache (ARC)

                        Separate intent log            Main pool               Level 2 ARC
     Device             Write optimized device (SSD)   HDD                     Read optimized device (SSD)
     Size (GBytes)      1 - 10 GByte                   large                   big
     Cost               write iops/$                   size/$                  size/$
     Use                sync writes                    persistent storage      read cache
     Performance        low-latency writes             secondary optimization  low-latency reads
     Need more speed?   stripe                         more, faster devices    stripe
  7. Working Set Size
     • Average Working Set Size (WSS) is the amount of space needed to satisfy the immediate storage needs of applications or the frequently used space
     • Reduce WSS by
       – Snapshots & clones (most effective)
       – Compression
       – Deduplication
  8. Performance Envelope
     [Chart: 4KB random read IOPS (log scale, 1,000 to 10,000,000) vs. Working Set Size (GB, 0 to 1000); curve: Expected Max Performance]
  9. Performance Envelope
     [Chart: 4KB random read IOPS (log scale, 1,000 to 10,000,000) vs. Working Set Size (GB, 0 to 1000); curve: Expected Max Performance; regions labeled ARC, L2ARC, and Pool Disk]
  10. Performance Envelope
      [Chart: 4KB random read IOPS (log scale, 1,000 to 10,000,000) vs. Working Set Size (GB, 0 to 1000); curve: Expected Max Performance; regions labeled ARC, L2ARC, and Pool Disk; levels marked ARC Hit Performance, L2ARC Hit Performance, and Pool Performance; x-axis marks at ARC Size and ARC + L2ARC Size]
  11. [Chart: 4KB random read IOPS (0 to 1,500,000) vs. Working Set Size (GB, 0 to 1000); curves: Small Config Expected Performance, Medium Config Expected Performance, Large Config Expected Performance; reference line: 10 GbE wire speed]

      Configuration                             Small     Medium      Large
      RAM size (GB)                                24         96        192
      100% ARC hit rate performance           600,000    900,000  1,300,000
      L2ARC size (GB)                               0        250        480
      L2ARC device small random read IOPS           0     30,000     60,000
      Pool small random read IOPS               1,400      3,600      8,000
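
The shape of these curves can be approximated with a simple hit-weighted model. What follows is a minimal sketch under stated assumptions, not the model used to draw the slide: reads are assumed to be spread uniformly over the working set, the ARC is assumed to be as large as RAM (in practice it gets less), and the 10 GbE wire-speed limit is ignored. The names `Config` and `expected_iops` are hypothetical; the numbers are taken from the configuration table.

```python
from dataclasses import dataclass


@dataclass
class Config:
    """One column of the configuration table (field names are hypothetical)."""
    name: str
    arc_gb: float      # assumption: ARC capacity equals RAM size
    arc_iops: float    # 100% ARC hit rate performance
    l2arc_gb: float    # L2ARC size
    l2arc_iops: float  # L2ARC device small random read IOPS
    pool_iops: float   # pool small random read IOPS


SMALL = Config("small", 24, 600_000, 0, 0, 1_400)
MEDIUM = Config("medium", 96, 900_000, 250, 30_000, 3_600)
LARGE = Config("large", 192, 1_300_000, 480, 60_000, 8_000)


def expected_iops(cfg: Config, wss_gb: float) -> float:
    """Estimate 4KB random read IOPS for a uniformly accessed working set."""
    if wss_gb <= 0:
        raise ValueError("working set size must be positive")
    # Fraction of reads served by each tier, filled in order ARC, L2ARC, pool.
    arc_frac = min(1.0, cfg.arc_gb / wss_gb)
    l2_frac = min(1.0 - arc_frac, cfg.l2arc_gb / wss_gb)
    pool_frac = max(0.0, 1.0 - arc_frac - l2_frac)
    # Average time per read is the hit-weighted sum of per-tier service times.
    seconds_per_io = arc_frac / cfg.arc_iops
    if l2_frac > 0:
        seconds_per_io += l2_frac / cfg.l2arc_iops
    if pool_frac > 0:
        seconds_per_io += pool_frac / cfg.pool_iops
    return 1.0 / seconds_per_io


for cfg in (SMALL, MEDIUM, LARGE):
    row = [round(expected_iops(cfg, wss)) for wss in (10, 100, 500, 1000)]
    print(cfg.name, row)
```

Because the slowest tier dominates the average service time, even a small miss fraction pulls the estimate toward pool performance, which is why the envelope falls off so sharply once the working set outgrows the caches.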
  12. Real World Example

                                    Read     Read     Read    Write    Write    Write
      Server  Time         NFSOPS    OPS       BW  Latency      OPS       BW  Latency
                                         (KB/sec)   (usec)          (KB/sec)   (usec)
      1       5:31:03 AM    9,699    6,780  125,163      271    2,865   29,432      242
      1       5:31:04 AM    9,263    6,464  111,200      297    2,682  142,730      496
      1       5:31:05 AM   11,703    7,969  131,949      258    3,535  206,254      551
      1       5:31:06 AM   14,751   11,030  184,239      179    3,581  219,542      705
      1       5:31:07 AM   14,318   10,916  183,431      158    3,246   88,383      353
      1       5:31:08 AM   11,396    7,334  114,184      318    3,973   39,423      351
      1       5:31:09 AM   10,766    7,152  123,791      274    3,518   34,355      235
      2       5:21:24 AM    4,138    2,352   45,295    2,525    1,598   16,193    2,122
      2       5:21:25 AM    6,050    2,366   55,238    1,211    3,209  175,509    1,193
      2       5:21:26 AM    8,902    2,958   85,980    1,907    5,735  281,881      996
      2       5:21:27 AM    3,456    1,669   34,443    2,212    1,526   46,251    2,291
      2       5:21:28 AM    3,463    1,790   35,542    5,307    1,571   17,157    4,052
      2       5:21:29 AM    3,306    1,711   29,829    3,641    1,462   40,895    2,532
      2       5:21:30 AM    3,697    2,111   41,909    1,921    1,478   31,911      877
  13. Device Performance Characterization
  14. Characterizing Device Performance
      • Modern storage devices vary widely in performance
      • “Datasheets don’t lie” ... but the information is vague and unhelpful
      • Need a comprehensive device characterization suite
  15. SNIA SSS-PTS
      • SNIA recognizes difficulty in comparing devices
      • Proposes Solid State Storage Performance Test Specification (SSS-PTS)
      • Nexenta’s implementation in NexentaStor.org repository
        – Using open source vdbench
        – Results cannot be used for SSS-PTS publication, but are very useful for systems architects
      • Works great for HDDs, too
  16. SSS-PTS IOPS Measurement
      • Precondition and iterate until results are consistent
        – Helps to eliminate out-of-the-box optimizations
      • Read/write ratio
        – 100:0, 95:5, 65:35, 50:50, 35:65, 5:95, 0:100
      • Block I/O sizes (KB)
        – 0.5, 4, 8, 16, 32, 64, 128, 1024
      • Execute random I/O
      • Measure IOPS
  17. SSS-PTS Concurrent I/O Operations
      • Number of concurrent I/Os (or threads) can be very important
        – NexentaStor architects need to choose best concurrency value for the entire platform
        – Nexenta tests add thread counts:
          • 1, 2, 4, 8, 16, 32
      • For SSS-PTS results publication
        – vendors can choose which to report
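
The test points described on the last two slides form a small grid. The sketch below enumerates that grid only; it does not drive vdbench or reproduce Nexenta's test scripts. The function name `build_test_matrix` and the dictionary keys are hypothetical, and the read-only ratio is written 100:0 to match the sample results table.

```python
from itertools import product

# Parameter lists as given on the SSS-PTS slides.
READ_WRITE_RATIOS = [(100, 0), (95, 5), (65, 35), (50, 50), (35, 65), (5, 95), (0, 100)]
BLOCK_SIZES_KB = [0.5, 4, 8, 16, 32, 64, 128, 1024]
THREAD_COUNTS = [1, 2, 4, 8, 16, 32]  # the thread counts Nexenta's tests add


def build_test_matrix():
    """Yield one dict per (threads, block size, read:write ratio) test point."""
    for threads, block_kb, (read_pct, write_pct) in product(
        THREAD_COUNTS, BLOCK_SIZES_KB, READ_WRITE_RATIOS
    ):
        yield {
            "threads": threads,
            "block_kb": block_kb,
            "read_pct": read_pct,
            "write_pct": write_pct,
        }


points = list(build_test_matrix())
print(len(points), "test points per device")
print(points[0])
```

Each point is a random I/O run that is repeated until results are consistent, so the size of this grid is a reasonable first estimate of how long a full characterization takes.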
  18. Sample NexentaStor ZVol Test

      Block Size                       Read:Write Ratio
      (KiB)         100:0     95:5    65:35    50:50    35:65     5:95    0:100
      0.5         620,880  117,710   30,625   22,197   19,010   17,512   37,404
      4           603,126   96,684   19,284   15,584   13,869    9,957   19,247
      8           647,250  126,288   20,177   13,769   12,348    7,405    8,741
      16          338,106   48,965    9,598    7,413    5,313    4,437    4,423
      32          164,678   28,759    4,574    3,483    2,428    1,983    2,264
      64           84,688   11,496    2,166    1,503    1,172      829    1,076
      128          46,126    5,705      965      770      611      502      571
      1024          4,978      715      107       84       78       64       75

      Local test, closed course, professional driver
      Test clearly shows effects of caching and single HDD pool
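
One way to read the table is to convert each cell to throughput. The sketch below does that for the 100:0 column, assuming the cells are IOPS (the slide does not state the unit, but the preceding slides describe an IOPS measurement). The helper name `throughput_mib_per_s` is hypothetical.

```python
# 100:0 (read-only) column of the sample ZVol table, keyed by block size in KiB.
# Assumption: the table cells are IOPS.
READ_ONLY_IOPS = {
    0.5: 620_880,
    4: 603_126,
    8: 647_250,
    16: 338_106,
    32: 164_678,
    64: 84_688,
    128: 46_126,
    1024: 4_978,
}


def throughput_mib_per_s(iops: float, block_kib: float) -> float:
    """Convert an IOPS figure at a given block size to MiB/s."""
    return iops * block_kib / 1024


for block_kib, iops in READ_ONLY_IOPS.items():
    mib_s = throughput_mib_per_s(iops, block_kib)
    print(f"{block_kib:>6} KiB  {iops:>8,} IOPS  {mib_s:>8,.0f} MiB/s")
```

Under that assumption, the smallest blocks post some of the highest IOPS while moving the least data, which is one reason an IOPS number means little without its block size.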
  19. ZVol Performance
  20. Another SSS-PTS Result
  21. Comparing Devices
  22. All IOPS are not Created Equal
      [Chart: Avg resp time (ms) vs. IOPS, by Threads (1, 2, 4, 6, 8, 10), paneled by Read % (0, 100) and IO size (512, 4096, 8192, 32768, 65536, 131072 bytes)]
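
Response time, IOPS, and thread count are tied together by Little's law: the average number of I/Os in flight equals throughput multiplied by average response time. The sketch below applies it under the assumption of a closed-loop test in which every thread keeps exactly one I/O outstanding. The function names and the example numbers are hypothetical, not read from the chart.

```python
def implied_iops(threads: int, avg_resp_ms: float) -> float:
    """IOPS implied by Little's law when each thread keeps one I/O in flight."""
    return threads / (avg_resp_ms / 1000.0)


def implied_resp_ms(threads: int, iops: float) -> float:
    """Average response time (ms) implied by the same relationship."""
    return threads / iops * 1000.0


# Hypothetical example: two ways of reaching roughly the same IOPS figure.
print(implied_resp_ms(threads=1, iops=160))   # 6.25 ms per I/O
print(implied_resp_ms(threads=8, iops=160))   # 50.0 ms per I/O
```

Two results with the same IOPS can therefore hide very different latencies, depending on how much concurrency was needed to reach that figure.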
  23. NexentaStor Tunables
  24. Choose Appropriate Components
      • The biggest tuning knob
      • Have the right components for the job
      • Choose reliable components
      • Leverage hybrid storage pool concepts
      • In general, go wide then deep
  25. Recordsize and Block Size
      • 2nd biggest tuning knob
        – Recordsize is “max block size” for file systems
        – Block size is “only block size” for block devices
      • For fixed-record-length workloads
        – Match recordsize/block size to avoid I/O amplification for bandwidth-constrained systems
        – Multiples can be ok
          • Experiment and observe trade-offs
        – Smaller block sizes require more metadata per unit of available storage
      • For variable workloads (e.g., files), large recordsize is ok
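
The I/O amplification mentioned above can be put into numbers. This is a minimal sketch that assumes aligned, uncached random reads and that every touched block is read in full; `read_amplification` is a hypothetical helper, and the record and block sizes are illustrative.

```python
import math


def read_amplification(app_io_kb: float, block_kb: float) -> float:
    """Data read from the pool per unit of data the application asked for.

    Assumes an aligned, uncached read and that whole blocks are always read.
    """
    blocks_touched = math.ceil(app_io_kb / block_kb)
    return blocks_touched * block_kb / app_io_kb


# Illustrative case: an application reading fixed 8 KB records.
for block_kb in (4, 8, 32, 128):
    factor = read_amplification(8, block_kb)
    print(f"{block_kb:>4} KB blocks: {factor:.0f}x")
```

With 128 KB blocks, each 8 KB read moves sixteen times the requested data, which matters most when bandwidth is the constraint. Blocks smaller than the record do not amplify reads in this model, but they cost more metadata, as the slide notes.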
  26. I/O Concurrency
      • For current ZFS implementations, zfs_vdev_max_pending is a global, per-device setting
        – Older releases, default is 35
        – Current releases, default is 10
        – Consider changing to match devices to workload
        – Setting can have availability implications
      • Room for improvement, stay tuned...
  27. Prefetching
      • By default, intelligent prefetching is enabled
        – Adaptive algorithm
        – If prefetching seemed to work, prefetch more
        – Generally works well
      • For high-concurrency environments, consider disabling prefetching
        – Not tunable from NexentaStor 3.x UI
      • Room for improvement, stay tuned...
  28. Compression
      • Compression turns big I/O into small I/O, when possible
        – Algorithms do not suffer from “compression growth”
        – Various algorithms available
        – Enabled by default in NexentaStor 3.x
        – Amaze your friends: zeros compress to nothing
      • For high performance environments, consider disabling compression
        – When bandwidth is over-provisioned
        – When space is inexpensive ($/GB)
        – When low variance of latency is desired
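
The "no compression growth" and "zeros compress to nothing" points can be illustrated with a toy model. This is a sketch only: Python's `zlib` stands in for whatever algorithm the dataset uses, the rule of keeping the compressed copy only when it saves at least 12.5% is an assumption of this example, and `stored_size` is a hypothetical helper rather than a ZFS or NexentaStor interface.

```python
import random
import zlib

BLOCK_SIZE = 128 * 1024   # illustrative block size in bytes
MIN_SAVINGS = 0.125       # assumed threshold for keeping the compressed copy


def stored_size(block: bytes) -> int:
    """Bytes stored for a block: compressed if it pays off, raw otherwise."""
    compressed = zlib.compress(block)
    if len(compressed) <= len(block) * (1 - MIN_SAVINGS):
        return len(compressed)
    # Falling back to the raw block means a block never grows on disk.
    return len(block)


zeros = bytes(BLOCK_SIZE)
rng = random.Random(0)  # fixed seed so the example is repeatable
noise = bytes(rng.getrandbits(8) for _ in range(BLOCK_SIZE))

print("zeros:", stored_size(zeros), "bytes")
print("noise:", stored_size(noise), "bytes")
```

The all-zero block shrinks to a tiny fraction of its size, while the incompressible block is stored at its original size. The CPU time spent attempting the compression is still paid, which is one source of the latency variance mentioned in the last bullet.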
  29. Deduplication
      • Deduplication turns large I/O into small I/O
        – Does not eliminate I/O!
        – Avoid use for big, slow HDDs (IOPS is constrained)
      • In general, deduplication and high performance are not the best of friends
  30. Measure and Manage
      • Performance management is always a work in progress
      • Generalizations are becoming more difficult as workloads become more diverse
      • Experiment and measure prior to production
      • Measure and manage in production
      • Performance management has room for improvement, stay tuned...
  31. Questions?
      Richard.Elling@Nexenta.com