NexentaStor Performance Tuning
Richard Elling
Senior Director of Solutions Engineering
Nexenta
Agenda

• Read Performance Model
• Device Performance Characterization
• NexentaStor Tunables
Read Performance Model
NexentaStor Performance
• Performance of NexentaStor systems is
  difficult to predict
• Generally better than proprietary RAID
  systems
    – Proprietary systems tend to use wimpy CPUs with
      limited amounts of memory for cache
    – NexentaStor systems scale with the latest
      processor and memory technology
• Best NexentaStor performance achieved by
  choosing the best hardware configuration for
  the job, not by “tuning” NexentaStor software
Good Hardware Choices
• NexentaStor uses block devices
    – HDDs
    – SSDs
    – Anything that looks like a set of blocks
      • Size must be greater than 64 MB
      • Sorry, floppy disks are too small
• NexentaStor block drivers
    – Initiators: ATA, IDE, SATA, SAS, Parallel SCSI,
      iSCSI, FC, DDRdrive, USB, SD, CF, XD, MMC
    – Others: files, ramdisk, BD
Hybrid Storage Pool
Optimize performance and cost

Adaptive Replacement Cache (ARC)

                      Separate intent log           Main pool               Level 2 ARC
    Device            Write optimized device (SSD)  HDDs                    Read optimized device (SSD)
    Size (GBytes)     1 - 10 GByte                  large                   big
    Cost              write iops/$                  size/$                  size/$
    Use               sync writes                   persistent storage      read cache
    Performance       low-latency writes            secondary optimization  low-latency reads
    Need more speed?  stripe                        more, faster devices    stripe
Working Set Size
• Average Working Set Size (WSS) is the
  amount of space needed to satisfy the
  immediate storage needs of
  applications or the frequently used
  space
• Reduce WSS by
    – Snapshots & clones (most effective)
    – Compression
    – Deduplication
Performance Envelope
[Figure: expected max performance, 4KB random read IOPS (log scale, 1,000 to 10,000,000) vs. working set size (0 to 1000 GB)]
Performance Envelope
[Figure: the same plot of 4KB random read IOPS vs. working set size (GB), with regions labeled ARC, L2ARC, and Pool Disk along the working set size axis]
Performance Envelope
[Figure: the same plot of 4KB random read IOPS vs. working set size (GB), annotated with ARC hit performance, L2ARC hit performance, and pool performance levels; the ARC size and the ARC + L2ARC size are marked on the working set size axis]
[Figure: expected 4KB random read IOPS (0 to 1,500,000) vs. working set size (0 to 1000 GB) for the small, medium, and large configurations, with 10 GbE wire speed shown for reference]

Configuration                            Small     Medium      Large
RAM size (GB)                               24         96        192
100% ARC hit rate performance          600,000    900,000  1,300,000
L2ARC size (GB)                              0        250        480
L2ARC device small random read IOPS          0     30,000     60,000
Pool small random read IOPS              1,400      3,600      8,000
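The shape of these curves can be approximated with a simple tiered model. The sketch below is not the model used to draw the slides: it assumes reads are spread uniformly over the working set, that the ARC is as large as RAM (in practice it is smaller), and that each read is served by exactly one tier. The function name and formula are illustrative assumptions; only the numbers come from the configuration table.

```python
# Illustrative tiered read model. The formula is an assumption made for this
# sketch, not Nexenta's published method.
CONFIGS = {
    # name: (RAM GB, ARC-hit IOPS, L2ARC GB, L2ARC device IOPS, pool IOPS)
    "small": (24, 600_000, 0, 0, 1_400),
    "medium": (96, 900_000, 250, 30_000, 3_600),
    "large": (192, 1_300_000, 480, 60_000, 8_000),
}


def expected_read_iops(wss_gb, ram_gb, arc_iops, l2arc_gb, l2arc_iops, pool_iops):
    """Estimate 4KB random read IOPS for a uniformly accessed working set."""
    if wss_gb <= ram_gb:
        return float(arc_iops)  # whole working set fits in the ARC
    arc_frac = ram_gb / wss_gb
    l2_frac = min(1.0 - arc_frac, l2arc_gb / wss_gb)
    pool_frac = 1.0 - arc_frac - l2_frac
    # Average time per read is the hit-weighted sum of the per-tier times.
    seconds_per_read = arc_frac / arc_iops
    if l2_frac > 0:
        seconds_per_read += l2_frac / l2arc_iops
    if pool_frac > 0:
        seconds_per_read += pool_frac / pool_iops
    return 1.0 / seconds_per_read


if __name__ == "__main__":
    for name, cfg in CONFIGS.items():
        row = [round(expected_read_iops(wss, *cfg)) for wss in (10, 250, 500, 1000)]
        print(name, row)
```

Under these assumptions the estimate falls steeply once the working set outgrows the ARC, and again once it outgrows the ARC plus L2ARC, which is the step pattern the envelope plots describe.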
Real World Example
                                           Read                                 Write
Server  Time        NFSOPS     OPS  BW (KB/sec)  Latency (usec)     OPS  BW (KB/sec)  Latency (usec)
     1  5:31:03 AM   9,699   6,780      125,163             271   2,865       29,432             242
     1  5:31:04 AM   9,263   6,464      111,200             297   2,682      142,730             496
     1  5:31:05 AM  11,703   7,969      131,949             258   3,535      206,254             551
     1  5:31:06 AM  14,751  11,030      184,239             179   3,581      219,542             705
     1  5:31:07 AM  14,318  10,916      183,431             158   3,246       88,383             353
     1  5:31:08 AM  11,396   7,334      114,184             318   3,973       39,423             351
     1  5:31:09 AM  10,766   7,152      123,791             274   3,518       34,355             235

     2  5:21:24 AM   4,138   2,352       45,295           2,525   1,598       16,193           2,122
     2  5:21:25 AM   6,050   2,366       55,238           1,211   3,209      175,509           1,193
     2  5:21:26 AM   8,902   2,958       85,980           1,907   5,735      281,881             996
     2  5:21:27 AM   3,456   1,669       34,443           2,212   1,526       46,251           2,291
     2  5:21:28 AM   3,463   1,790       35,542           5,307   1,571       17,157           4,052
     2  5:21:29 AM   3,306   1,711       29,829           3,641   1,462       40,895           2,532
     2  5:21:30 AM   3,697   2,111       41,909           1,921   1,478       31,911             877
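Two derived quantities can make samples like these easier to compare: the average I/O size (bandwidth divided by operations) and the average number of I/Os in flight (Little's law: rate times latency). The slide does not compute either; the sketch below is one way to do so, assuming the latency column is a mean over the one-second sample. The function names are illustrative.

```python
def avg_io_kb(bw_kb_per_sec, ops_per_sec):
    """Average size of one operation in KB."""
    return bw_kb_per_sec / ops_per_sec


def outstanding_ios(ops_per_sec, latency_usec):
    """Little's law: mean I/Os in flight = arrival rate x mean latency."""
    return ops_per_sec * latency_usec / 1_000_000


# (read OPS, read BW in KB/sec, read latency in usec), taken from the table.
SAMPLES = {
    "server 1, 5:31:03 AM": (6_780, 125_163, 271),
    "server 2, 5:21:24 AM": (2_352, 45_295, 2_525),
}

if __name__ == "__main__":
    for label, (ops, bw, lat) in SAMPLES.items():
        print(f"{label}: {avg_io_kb(bw, ops):.1f} KB per read, "
              f"{outstanding_ios(ops, lat):.2f} reads in flight")
```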
Device Performance Characterization
Characterizing Device Performance
• Modern storage devices vary widely in
  performance
• “Datasheets don’t lie”
  ... but the information is vague and
  unhelpful
• Need a comprehensive device
  characterization suite



SNIA SSS-PTS

• SNIA recognizes difficulty in comparing
  devices
• Proposes Solid State Storage Performance
  Test Specification (SSS-PTS)
• Nexenta’s implementation in
  NexentaStor.org repository
     – Using open source vdbench
     – Results cannot be used for SSS-PTS publication,
       but are very useful for systems architects
• Works great for HDDs, too

SSS-PTS IOPS Measurement

• Precondition the device, then iterate until results
  are consistent
     – Helps to eliminate out-of-the-box
       optimizations
• Read/write ratio
     – 100:0, 95:5, 65:35, 50:50, 35:65, 5:95, 0:100
• Block I/O sizes (KB)
     – 0.5, 4, 8, 16, 32, 64, 128, 1024
• Execute random I/O
• Measure IOPS
SSS-PTS Concurrent I/O Operations
• Number of concurrent I/Os (or threads)
  can be very important
     – NexentaStor architects need to choose
       best concurrency value for the entire
       platform
     – Nexenta tests add thread counts:
       • 1, 2, 4, 8, 16, 32
• For SSS-PTS results publication
     – vendors can choose which to report
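The test space described on the last two slides can be enumerated as a run matrix. This is a minimal sketch for sizing the test plan, not the vdbench configuration used by Nexenta; the names `build_run_matrix`, `read_pct`, `write_pct`, `block_kib`, and `threads` are hypothetical.

```python
from itertools import product

# Values as listed on the slides; the pure-read ratio is taken as 100:0.
READ_WRITE_RATIOS = [(100, 0), (95, 5), (65, 35), (50, 50), (35, 65), (5, 95), (0, 100)]
BLOCK_SIZES_KIB = [0.5, 4, 8, 16, 32, 64, 128, 1024]
THREAD_COUNTS = [1, 2, 4, 8, 16, 32]


def build_run_matrix():
    """Return one dict per (read:write ratio, block size, thread count) run."""
    return [
        {"read_pct": r, "write_pct": w, "block_kib": b, "threads": t}
        for (r, w), b, t in product(READ_WRITE_RATIOS, BLOCK_SIZES_KIB, THREAD_COUNTS)
    ]


if __name__ == "__main__":
    runs = build_run_matrix()
    print(len(runs), "runs per device, each repeated until results are consistent")
```

Seven ratios, eight block sizes, and six thread counts give 336 runs per device, which is worth knowing before scheduling a characterization pass.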

Sample NexentaStor ZVol Test

Block Size                     Read:Write Ratio
   (KiB)      100:0     95:5    65:35    50:50    35:65     5:95    0:100
     0.5    620,880  117,710   30,625   22,197   19,010   17,512   37,404
       4    603,126   96,684   19,284   15,584   13,869    9,957   19,247
       8    647,250  126,288   20,177   13,769   12,348    7,405    8,741
      16    338,106   48,965    9,598    7,413    5,313    4,437    4,423
      32    164,678   28,759    4,574    3,483    2,428    1,983    2,264
      64     84,688   11,496    2,166    1,503    1,172      829    1,076
     128     46,126    5,705      965      770      611      502      571
    1024      4,978      715      107       84       78       64       75

Local test, closed course, professional driver

Test clearly shows effects of caching and single HDD pool
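Converting the 100:0 column from IOPS to bandwidth makes the caching effect easier to see. The sketch below only multiplies the table values by the block size; the function name is illustrative and the interpretation in the note that follows is an inference, not a statement from the slide.

```python
# 100% read IOPS by block size (KiB), copied from the table above.
ZVOL_READ_IOPS = {
    0.5: 620_880, 4: 603_126, 8: 647_250, 16: 338_106,
    32: 164_678, 64: 84_688, 128: 46_126, 1024: 4_978,
}


def bandwidth_mib_s(block_kib, iops):
    """Throughput in MiB/s implied by an IOPS figure at a given block size."""
    return block_kib * iops / 1024


if __name__ == "__main__":
    for block_kib, iops in ZVOL_READ_IOPS.items():
        print(f"{block_kib:>6} KiB: {iops:>8,} IOPS = "
              f"{bandwidth_mib_s(block_kib, iops):8.1f} MiB/s")
```

Read IOPS stay near 600,000 up to 8 KiB and then roughly halve with each doubling of block size, so the implied bandwidth levels off at about 5,000 MiB/s. A single HDD cannot deliver that, which is consistent with the reads being served from cache.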


ZVol Performance
Another SSS-PTS Result




Comparing Devices




All IOPS are not Created Equal
[Figure: average response time (ms, 0 to 80) vs. IOPS (60 to 190), one curve per thread count (1, 2, 4, 6, 8, 10), in panels by read % (0 and 100) and I/O size (512, 4096, 8192, 32768, 65536, and 131072 bytes)]
NexentaStor Tunables




Choose Appropriate Components
•    The biggest tuning knob
•    Have the right components for the job
•    Choose reliable components
•    Leverage hybrid storage pool concepts
•    In general, go wide then deep




Recordsize and Block Size
• 2nd biggest tuning knob
     – Recordsize is “max block size” for file systems
     – Block size is “only block size” for block devices
• For fixed-record-length workloads
     – Match recordsize/block size to avoid I/O
       amplification for bandwidth-constrained systems
     – Multiples can be ok
        • Experiment and observe trade-offs
     – Smaller block sizes require more metadata per unit of
       available storage
• For variable workloads (e.g., files), a large recordsize
  is ok
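The trade-off between I/O amplification and metadata can be put into numbers. This is a rough sketch: it assumes aligned, uncached reads of full-size records, and it counts only one 128-byte block pointer per data block, ignoring higher indirect levels and redundant metadata copies. Both function names are illustrative.

```python
import math

BLKPTR_BYTES = 128  # size of one ZFS block pointer


def read_amplification(io_kib, recordsize_kib):
    """KiB read from the pool per KiB requested, for an aligned read."""
    records_touched = math.ceil(io_kib / recordsize_kib)
    return records_touched * recordsize_kib / io_kib


def block_pointer_overhead(recordsize_kib):
    """Fraction of data space spent on one block pointer per data block."""
    return BLKPTR_BYTES / (recordsize_kib * 1024)


if __name__ == "__main__":
    for recordsize in (4, 8, 128):
        print(f"recordsize {recordsize:>3} KiB: "
              f"8 KiB read amplification {read_amplification(8, recordsize):.0f}x, "
              f"pointer overhead {block_pointer_overhead(recordsize):.3%}")
```

For an 8 KiB fixed-record workload, a 128 KiB recordsize reads 16 times the requested data, while an 8 KiB recordsize reads exactly what is asked for at the cost of 16 times as many block pointers.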

I/O Concurrency

• For current ZFS implementations,
  zfs_vdev_max_pending is a global setting
  applied per device
     – Older releases, default is 35
     – Current releases, default is 10
     – Consider changing to match devices to
       workload
     – Setting can have availability implications
• Room for improvement, stay tuned...
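A back-of-the-envelope calculation shows why the queue depth matters. The sketch below assumes the device services its queue at a steady rate, so a newly arriving I/O waits behind everything already pending; the 150 IOPS figure for an HDD and the 30,000 IOPS figure for an SSD are illustrative assumptions, and the function name is hypothetical.

```python
def worst_case_wait_ms(max_pending, device_iops):
    """Time for a device to drain a full queue of pending I/Os, in ms."""
    return max_pending / device_iops * 1000.0


if __name__ == "__main__":
    for label, iops in (("HDD, ~150 IOPS", 150), ("SSD, ~30,000 IOPS", 30_000)):
        for pending in (35, 10):
            print(f"{label}, {pending} pending: "
                  f"{worst_case_wait_ms(pending, iops):.2f} ms to drain")
```

With these assumed numbers, a queue of 35 on a slow HDD takes about 233 ms to drain versus about 67 ms for a queue of 10, while on an SSD either setting drains in roughly a millisecond. This is one reason a single global value is hard to choose when a pool mixes device types.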

Prefetching

• By default, intelligent prefetching is
  enabled
     – Adaptive algorithm
     – If prefetching seemed to work, prefetch more
     – Generally works well
• For high-concurrency environments,
  consider disabling prefetching
     – Not tunable from NexentaStor 3.x UI
• Room for improvement, stay tuned...

Compression
• Compression turns big I/O into small I/O, when
  possible
     –   Algorithms do not suffer from “compression growth”
     –   Various algorithms available
     –   Enabled by default in NexentaStor 3.x
     –   Amaze your friends: zeros compress to nothing
• For high performance environments, consider
  disabling compression
     – When bandwidth is over-provisioned
     – When space is inexpensive ($/GB)
     – When low variance of latency is desired
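The "no compression growth" and "zeros compress to nothing" points can be demonstrated outside ZFS. The sketch below uses Python's zlib as a stand-in for the ZFS compression algorithms and assumes a rule that keeps the compressed copy only when it saves at least 12.5% of the record; treat both the threshold and the function name as assumptions of this example rather than NexentaStor behavior.

```python
import os
import zlib

RECORD_BYTES = 128 * 1024
MIN_SAVINGS = 0.125  # assumed threshold for keeping the compressed copy


def stored_size(record: bytes) -> int:
    """Bytes written for one record: compressed if worthwhile, else raw."""
    compressed = zlib.compress(record)
    if len(compressed) <= len(record) * (1 - MIN_SAVINGS):
        return len(compressed)
    return len(record)  # incompressible data is stored as-is, never larger


if __name__ == "__main__":
    print("zeros :", stored_size(bytes(RECORD_BYTES)), "bytes")
    print("random:", stored_size(os.urandom(RECORD_BYTES)), "bytes")
```

A 128 KiB record of zeros shrinks to a few hundred bytes at most, while random data falls back to its original size instead of growing.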

Deduplication

• Deduplication turns large I/O into
  small I/O
     – Does not eliminate I/O!
     – Avoid use for big, slow HDDs (IOPS is
       constrained)
• In general, deduplication and high
  performance are not the best of friends



Measure and Manage

• Performance management is always a work
  in progress
• Generalizations are becoming more
  difficult as workloads become more diverse
• Experiment and measure prior to
  production
• Measure and manage in production
• Performance management has room for
  improvement, stay tuned...

Questions?

     Richard.Elling@Nexenta.com




Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

NexentaStor Performance Tuning - OpenStorage Summit 2011

  • 1. NexentaStor Performance Tuning Richard Elling Senior Director of Solutions Engineering Nexenta
  • 2. Agenda • Read Performance Model • Device Performance Characterization • NexentaStor Tunables
  • 4. NexentaStor Performance • Performance of NexentaStor systems is difficult to predict • Generally better than proprietary RAID systems – Proprietary systems tend to use wimpy CPUs with limited amounts of memory for cache – NexentaStor systems scale with the latest processor and memory technology • Best NexentaStor performance is achieved by choosing the best hardware configuration for the job, not by “tuning” NexentaStor software
  • 5. Good Hardware Choices • NexentaStor uses block devices – HDDs – SSDs – Anything that looks like a set of blocks • Size must be greater than 64 MB • Sorry, floppy disks are too small • NexentaStor block drivers – Initiators: ATA, IDE, SATA, SAS, Parallel SCSI, iSCSI, FC, DDRdrive, USB, SD, CF, XD, MMC – Others: files, ramdisk, BD
  • 6. Hybrid Storage Pool: optimize performance and cost [Diagram: the Adaptive Replacement Cache (ARC) above a separate intent log, the main pool, and a Level 2 ARC]
                           Separate intent log            Main pool                Level 2 ARC
        Device             Write-optimized device (SSD)   HDDs                     Read-optimized device (SSD)
        Size               1 - 10 GByte                   large                    big
        Cost               write IOPS/$                   size/$                   size/$
        Use                sync writes                    persistent storage       read cache
        Performance        low-latency writes             secondary optimization   low-latency reads
        Need more speed?   stripe                         more, faster devices     stripe
  • 7. Working Set Size • Average Working Set Size (WSS) is the amount of space needed to satisfy the immediate storage needs of applications or the frequently used space • Reduce WSS by – Snapshots & clones (most effective) – Compression – Deduplication
  • 8. Performance Envelope [Chart: 4KB random read IOPS (log scale, 1,000 to 10,000,000) versus working set size (0 to 1,000 GB), showing expected max performance]
  • 9. Performance Envelope [Same chart, with the working set regions served by ARC, L2ARC, and pool disk marked]
  • 10. Performance Envelope [Same chart, annotated with ARC hit performance, L2ARC hit performance, and pool performance levels, and with ARC size and ARC + L2ARC size marked on the working set size axis]
  • 11. [Chart: 4KB random read IOPS (0 to 1,500,000) versus working set size (0 to 1,000 GB), showing expected performance for the small, medium, and large configurations and 10 GbE wire speed]
        Configuration                           Small     Medium    Large
        RAM size (GB)                           24        96        192
        100% ARC hit rate performance           600,000   900,000   1,300,000
        L2ARC size (GB)                         0         250       480
        L2ARC device small random read IOPS     0         30,000    60,000
        Pool small random read IOPS             1,400     3,600     8,000
  • 12. Real World Example
                                        Read     Read BW    Read Latency   Write    Write BW   Write Latency
        Server   Time         NFSOPS    OPS      (KB/sec)   (usec)         OPS      (KB/sec)   (usec)
        1        5:31:03 AM    9,699     6,780   125,163      271          2,865     29,432      242
        1        5:31:04 AM    9,263     6,464   111,200      297          2,682    142,730      496
        1        5:31:05 AM   11,703     7,969   131,949      258          3,535    206,254      551
        1        5:31:06 AM   14,751    11,030   184,239      179          3,581    219,542      705
        1        5:31:07 AM   14,318    10,916   183,431      158          3,246     88,383      353
        1        5:31:08 AM   11,396     7,334   114,184      318          3,973     39,423      351
        1        5:31:09 AM   10,766     7,152   123,791      274          3,518     34,355      235
        2        5:21:24 AM    4,138     2,352    45,295    2,525          1,598     16,193    2,122
        2        5:21:25 AM    6,050     2,366    55,238    1,211          3,209    175,509    1,193
        2        5:21:26 AM    8,902     2,958    85,980    1,907          5,735    281,881      996
        2        5:21:27 AM    3,456     1,669    34,443    2,212          1,526     46,251    2,291
        2        5:21:28 AM    3,463     1,790    35,542    5,307          1,571     17,157    4,052
        2        5:21:29 AM    3,306     1,711    29,829    3,641          1,462     40,895    2,532
        2        5:21:30 AM    3,697     2,111    41,909    1,921          1,478     31,911      877
  • 13. Device Performance Characterization
  • 14. Characterizing Device Performance • Modern storage devices vary widely in performance • “Datasheets don’t lie” ... but the information is vague and unhelpful • Need a comprehensive device characterization suite
  • 15. SNIA SSS-PTS • SNIA recognizes difficulty in comparing devices • Proposes Solid State Storage Performance Test Specification (SSS-PTS) • Nexenta’s implementation in NexentaStor.org repository – Using open source vdbench – Results cannot be used for SSS-PTS publication, but are very useful for systems architects • Works great for HDDs, too
  • 16. SSS-PTS IOPS Measurement • Precondition and iterate until results are consistent – Helps to eliminate out-of-the-box optimizations • Read/write ratio – 100:0, 95:5, 65:35, 50:50, 35:65, 5:95, 0:100 • Block I/O sizes (KB) – 0.5, 4, 8, 16, 32, 64, 128, 1024 • Execute random I/O • Measure IOPS
  • 17. SSS-PTS Concurrent I/O Operations • Number of concurrent I/Os (or threads) can be very important – NexentaStor architects need to choose best concurrency value for the entire platform – Nexenta tests add thread counts: • 1, 2, 4, 8, 16, 32 • For SSS-PTS results publication – vendors can choose which to report
  • 18. Sample NexentaStor ZVol Test
        Block Size   Read:Write Ratio
        (KiB)        100:0     95:5      65:35    50:50    35:65    5:95     0:100
        0.5          620,880   117,710   30,625   22,197   19,010   17,512   37,404
        4            603,126    96,684   19,284   15,584   13,869    9,957   19,247
        8            647,250   126,288   20,177   13,769   12,348    7,405    8,741
        16           338,106    48,965    9,598    7,413    5,313    4,437    4,423
        32           164,678    28,759    4,574    3,483    2,428    1,983    2,264
        64            84,688    11,496    2,166    1,503    1,172      829    1,076
        128           46,126     5,705      965      770      611      502      571
        1024           4,978       715      107       84       78       64       75
        Local test, closed course, professional driver. Test clearly shows effects of caching and single HDD pool.
  • 22. All IOPS are not Created Equal [Chart: average response time (ms) versus IOPS, broken out by thread count, read percentage (0 and 100), and I/O size (512, 4096, 8192, 32768, 65536, and 131072 bytes)]
  • 24. Choose Appropriate Components • The biggest tuning knob • Have the right components for the job • Choose reliable components • Leverage hybrid storage pool concepts • In general, go wide then deep
  • 25. Recordsize and Block Size • 2nd biggest tuning knob – Recordsize is “max block size” for file systems – Block size is “only block size” for block devices • For fixed-record-length workloads – Match recordsize/block size to avoid I/O amplification for bandwidth-constrained systems – Multiples can be ok • Experiment and observe trade-offs – Smaller block sizes require more metadata per unit of available storage • For variable workloads (e.g., files), large recordsize is ok
  • 26. I/O Concurrency • For current ZFS implementations, zfs_vdev_max_pending is a global, per-device setting – Older releases, default is 35 – Current releases, default is 10 – Consider changing to match devices to workload – Setting can have availability implications • Room for improvement, stay tuned...
  • 27. Prefetching • By default, intelligent prefetching is enabled – Adaptive algorithm – If prefetching seemed to work, prefetch more – Generally works well • For high-concurrency environments, consider disabling prefetching – Not tunable from NexentaStor 3.x UI • Room for improvement, stay tuned...
  • 28. Compression • Compression turns big I/O into small I/O, when possible – Algorithms do not suffer from “compression growth” – Various algorithms available – Enabled by default in NexentaStor 3.x – Amaze your friends: zeros compress to nothing • For high performance environments, consider disabling compression – When bandwidth is over-provisioned – When space is inexpensive $/GB – When low variance of latency is desired
  • 29. Deduplication • Deduplication turns large I/O into small I/O – Does not eliminate I/O! – Avoid use for big, slow HDDs (IOPS is constrained) • In general, deduplication and high performance are not the best of friends
  • 30. Measure and Manage • Performance management is always a work in progress • Generalizations are becoming more difficult as workloads become more diverse • Experiment and measure prior to production • Measure and manage in production • Performance management has room for improvement, stay tuned...
  • 31. Questions? Richard.Elling@Nexenta.com
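
The performance envelope on slides 8 through 11 can be approximated with a small model. The sketch below is not from the presentation. It assumes reads are spread uniformly at random over the working set, treats all of RAM as ARC (which overstates the usable cache), and combines the per-tier rates by averaging service time per I/O. The function and parameter names are hypothetical; the input numbers are the Small and Medium columns from slide 11.

```python
def expected_read_iops(wss_gb, arc_gb, arc_iops, l2arc_gb, l2arc_iops, pool_iops):
    """Rough estimate of 4KB random read IOPS for a given working set size.

    Assumption (not from the slides): accesses are uniform over the working
    set, so each tier serves a fraction equal to its share of the working set.
    """
    if wss_gb <= 0:
        raise ValueError("working set size must be positive")

    # Fraction of reads served by each tier, filled in order ARC, L2ARC, pool.
    arc_frac = min(1.0, arc_gb / wss_gb)
    l2_frac = max(0.0, min(1.0 - arc_frac, l2arc_gb / wss_gb))
    pool_frac = max(0.0, 1.0 - arc_frac - l2_frac)

    # Average time per I/O is the fraction-weighted sum of 1/IOPS per tier.
    time_per_io = arc_frac / arc_iops
    if l2_frac > 0:
        time_per_io += l2_frac / l2arc_iops
    if pool_frac > 0:
        time_per_io += pool_frac / pool_iops
    return 1.0 / time_per_io


if __name__ == "__main__":
    # Medium configuration from slide 11.
    medium = dict(arc_gb=96, arc_iops=900_000,
                  l2arc_gb=250, l2arc_iops=30_000, pool_iops=3_600)
    for wss in (50, 100, 250, 500, 1000):
        print(f"WSS {wss:>5} GB: {expected_read_iops(wss, **medium):>10,.0f} IOPS")
```

While the working set fits in ARC the estimate stays at the ARC hit rate; once it spills to L2ARC and then to pool disks, the slowest tier quickly dominates, which matches the steep shape of the envelope charts.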

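The SSS-PTS style test matrix from slides 16 and 17 can be enumerated as follows. This is a minimal sketch, not Nexenta's vdbench configuration: the function name and dictionary keys are hypothetical, and the first read/write ratio is taken as 100:0, the value that appears in the slide 18 results table.

```python
from itertools import product

# Values listed on slides 16 and 17 (first ratio assumed to be 100:0).
RW_RATIOS = [(100, 0), (95, 5), (65, 35), (50, 50), (35, 65), (5, 95), (0, 100)]
BLOCK_SIZES_KIB = [0.5, 4, 8, 16, 32, 64, 128, 1024]
THREAD_COUNTS = [1, 2, 4, 8, 16, 32]


def build_test_matrix():
    """Return one dict per (read/write ratio, block size, thread count) run."""
    return [
        {"read_pct": read, "write_pct": write, "block_kib": block, "threads": threads}
        for (read, write), block, threads in product(
            RW_RATIOS, BLOCK_SIZES_KIB, THREAD_COUNTS
        )
    ]


if __name__ == "__main__":
    matrix = build_test_matrix()
    print(len(matrix), "random I/O runs per measurement round")
    print(matrix[0])
```

Each run would be repeated after preconditioning until results are consistent, so the number of combinations is a lower bound on the number of measurements.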
Editor's Notes

  4. The true performance of NexentaStor systems can be difficult to predict. The key variables for getting the best performance from a NexentaStor system are the choice of hardware and dataset configuration.
     In general, NexentaStor systems can perform better than proprietary storage systems. The reason is simple: by using good, scalable storage software, Nexenta customers can leverage improvements in component technologies. For example, changing a NexentaStor server’s processor for a faster version, installing a new motherboard, or adding memory is as simple and straightforward as upgrading a compute server. NexentaStor licensing is based on total storage, not on the number or speed of the processors or the amount of RAM. New or faster network interfaces can be added to improve client performance without incurring additional storage licensing costs.
     NexentaStor systems can increase their performance cost-effectively, incrementally, and efficiently over their lifetime.
  5. NexentaStor system building blocks include persistent storage devices. Many different block devices are supported, providing an easy migration path from legacy storage. New technologies are easily added to existing NexentaStor systems, protecting your investment against technology obsolescence.
     Obviously, not all hardware choices are fast. The cost, performance, and reliability of system components have a significant impact on overall system performance and dependability. This flexibility offers optimization options unparalleled in the industry.
  6. The key to the NexentaStor system’s optimization is the Hybrid Storage Pool (HSP). Main memory (DRAM) is used as an Adaptive Replacement Cache (ARC), efficiently storing both frequently used and recently used data.
     The Separate Intent Log is used to optimize the latency of synchronous write workloads, such as NFS. The log satisfies synchronous write semantics while the transactional object store optimizes and allocates space on the main pool storage. The log does not need to be very large; size it according to the amount of data expected to be written in 30 seconds.
     A level-2 ARC, or cache device, can be used to cost-effectively grow the size of the ARC. For large, read-intensive workloads, the cost per gigabyte of SSDs is lower than that of main memory DRAM. Excellent system read performance can be achieved using modestly priced SSDs.
     The main pool performance can be less critical when the system is configured with enough RAM and when the log and L2ARC devices are fast.
  7. The Working Set Size (WSS) describes the amount of space most commonly used by applications or workloads. Using WSS to describe performance is becoming increasingly useful as the size of disks increases. In many cases, systems are configured with tens or hundreds of terabytes of storage, while only a small fraction of the space is used at any given time. This fraction is the working set size.
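
Note 6 suggests sizing the separate intent log for the data expected to be written in 30 seconds. That rule of thumb is simple arithmetic, sketched below. The function name is hypothetical, and the 200 MB/s synchronous write rate is an illustrative assumption rather than a figure from the presentation.

```python
def slog_size_gb(sync_write_mb_per_s, window_s=30):
    """Estimate log device capacity as sustained sync write rate times a window.

    Uses decimal units (1 GB = 1000 MB). The 30 second default window comes
    from the editor's note; the write rate must be measured or assumed.
    """
    if sync_write_mb_per_s < 0 or window_s <= 0:
        raise ValueError("write rate must be non-negative and window positive")
    return sync_write_mb_per_s * window_s / 1000


if __name__ == "__main__":
    # Assumed workload: 200 MB/s of synchronous writes.
    print(slog_size_gb(200), "GB")
```

With the assumed rate, the result falls inside the 1 - 10 GByte range that slide 6 gives for the separate intent log.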