IBM SAN Volume Controller Performance Analysis




Speaker Notes
  • The goal of this presentation is to provide practical tips for storage administrators. Author introduction: end-to-end performance support for Managed Storage Service; 17 DS8000 systems as of 5/25; over 125 ESS Model 800 systems; over 2 petabytes of managed storage; proactive and reactive support for all customers.
  • Slide graphics on pages 3–4: Copyright 2006, IBM Corporation, version 7/24/2008, "SAN Volume Controller – What's Under the Hood".
  • LUNs are provided by the storage systems and are treated as managed disks (mdisks) in SVC; there is a 1:1 relationship between a storage-system LUN and an mdisk. Mdisks are grouped into logical groups called managed disk groups. The smallest logical unit of mdisk storage is called an extent; the extents in a managed disk are pooled into the managed disk group. Vdisks are created from extents and are assigned to hosts. Slide graphic: Copyright 2004, IBM Corporation, version March 2004, "SAN Volume Controller Competition", page 36.
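The LUN → mdisk → extent → vdisk hierarchy described in this note can be sketched in a few lines of Python. This is a simplified illustration, not SVC code: the class names are invented, allocation is sequential rather than striped, and 16 MB is just one of the extent sizes SVC supports.

```python
# Simplified model of SVC storage virtualization:
# LUN -> mdisk (1:1) -> extents pooled in a managed disk group -> vdisk.

EXTENT_MB = 16  # one of several SVC extent sizes; chosen for illustration

class ManagedDiskGroup:
    def __init__(self, name):
        self.name = name
        self.free_extents = []          # pool of (mdisk_name, extent_index)

    def add_mdisk(self, mdisk_name, size_mb):
        # Each LUN surfaces as exactly one mdisk; carve it into extents.
        for i in range(size_mb // EXTENT_MB):
            self.free_extents.append((mdisk_name, i))

    def create_vdisk(self, name, size_mb):
        needed = size_mb // EXTENT_MB
        if needed > len(self.free_extents):
            raise ValueError("not enough free extents in " + self.name)
        # Allocate extents from the pool (real SVC stripes them across mdisks).
        allocation = [self.free_extents.pop(0) for _ in range(needed)]
        return {"name": name, "extents": allocation}

mdg = ManagedDiskGroup("mdiskgrp0")
mdg.add_mdisk("mdisk0", 10 * 1024)   # a 10 GB LUN becomes one mdisk
mdg.add_mdisk("mdisk1", 10 * 1024)
vdisk = mdg.create_vdisk("vdisk0", 1024)
print(len(vdisk["extents"]))  # 64 extents of 16 MB = 1 GB vdisk
```

The point of the model is the indirection: a vdisk owns extents, not LUNs, which is what makes SVC provisioning flexible and performance analysis harder.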
  • When a Fibre Channel network becomes congested, the FC switches stop accepting additional frames until the congestion clears, in addition to occasionally dropping frames. This congestion quickly moves "upstream" and prevents the end devices (such as the SVC) from communicating anywhere, not just over the congested links. (This is referred to in the industry as head-of-line blocking.) As a result, your SVC could be unable to communicate with your disk arrays or to mirror write cache because of a single congested link leading to an edge switch. This is why SVC only supports SVC inter-node links and SVC-to-controller links within the same high-speed backplane of a single switch.
  • Application: configuration issues such as improper installation or application caching can introduce significant latency; design issues can lead to single-threaded application components such as serialized I/O streams; defects can cause performance issues, delays, and timeouts; poorly tuned SQL can lead to significant delays. Host: multi-pathing software must be supported to ensure proper load balancing and redundancy; HBA microcode and device drivers must be at supported levels to ensure connectivity and support; the OS must be compatible and supported. SVC: microcode fixes are available to improve performance (see the latest microcode); front-end contention includes I/O group and node port congestion, CPU contention, or cache contention; backend contention means saturated managed disk groups. Backend storage: front-end port, cache, or NVS contention; backend controller and disk-group contention. Fabric: all of the components can have design issues, but fabric design is especially important, as it is imperative to avoid latency and congestion.
  • Response time/queue metrics available in 4.x and later
  • The customer was copying files during this time as part of their ETL process.
  • If the SVC preferred node is not used for communication with a host's vdisk, significant latency is introduced due to additional SVC node-to-node communication. This should be avoided by properly configuring the multi-pathing software. The preferred-node issue is removed in 4.3.1, where SVC becomes fully active-active.
  • "SVC 4.2 code improvements: A large number of software-managed locks were modified so as to either reduce the scope of locking, reduce the duration for holding the lock, or both. The improved lock management granularity allowed some locks to be eliminated altogether, since in the new design they would have been used by only one process. Scheduling of CPU use was modified to permit greater balance of load and fewer task switches. Logic was introduced to dynamically adapt to observed MDisk stress levels, thus better matching SVC memory and processing resources to the current capability of the underlying storage." Copyright 2007, IBM Corporation, version 9/21/2007, SVC Performance Guidelines, Section 16.2, "SVC 4.2 Code Improvements".
  • The peak target
  • IBM SAN Volume Controller Performance Analysis

    1. SAN Volume Controller Performance Analysis (July 25, 2008)
    2. Trademarks & Disclaimer
       • The following terms are trademarks of the IBM Corporation:
         - Enterprise Storage Server® (ESS)
         - TotalStorage® Expert (TSE)
         - FAStT/DS4000/DS8000
         - AIX®
         - IBM SAN Volume Controller
       • SANavigator and EFCM are trademarks of McDATA.
       • UNIX is a registered trademark of The Open Group in the United States and other countries, licensed exclusively through X/Open Company Limited.
       • Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
       • EMC is a registered trademark of EMC Inc.
       • HP-UX is a registered trademark of HP Inc.
       • Solaris is a registered trademark of Sun Microsystems, Inc.
       • Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
       • Other trademarks appearing in this report may be considered trademarks of their respective companies.
       • Disclaimer: The views in this presentation are those of the author and are not necessarily those of IBM.
    3. Abstract
       • SAN Volume Controller (SVC) is a flexible, scalable platform for block-level storage virtualization. While SVC adds flexibility in provisioning storage and supports higher availability, it also adds complexity to performance design. The impact is most acute in performance analysis, because the new striping layer in the data path makes analysis more complex. We provide a technical overview of a SAN environment with SVC and explore the performance and scalability considerations when using SVC. We then review the tools, metrics, and methods needed to identify root causes for the most common performance issues.
    4. Table of Contents
       • Introduction
       • Storage Problems and Limitations with Native Storage
       • SVC Overview
       • SVC Physical and Logical Overview
       • Performance and Scalability Implications
       • Types of Problems
       • Performance Analysis Techniques
       • Performance Analysis Tools for SVC
       • Performance Analysis Metrics for SVC
       • Online Banking Example
       • Summary
    6. SVC High Level Logical View (diagram slide)
    7. SVC Combined Physical & Logical View (diagram slide)
       • Virtual disks are associated with particular I/O groups; managed disk groups are accessible by all I/O groups in the cluster.
       • The diagram shows four FAStT LUNs (10 GB each) mapped 1:1 to mdisk0–mdisk3 and three ESS LUNs (20 GB each) mapped to mdisk4–mdisk6; the mdisks are pooled into mdiskgrp0 [FAStT group, 40 GB] and mdiskgrp1 [ESS group, 60 GB]; vdisk0–vdisk4 (20 GB each) are mapped to hosts through I/O Group 1 and I/O Group 2 of the SVC cluster.
    8. Performance and Scalability Limitations
       • Shared resources: cache, fibre ports, CPU, fabric
       • Cache implications
         - Completely random workloads are "cache unfriendly"
         - Highly sequential workloads, e.g. large database hot backups
       • Fabric implications
         - Increases the number of fabric hops!
         - Additional fabric traffic to synchronize write data
         - Traffic flows in and out of the same ports: read cache misses, write synchronizations
    9. Types of Problems
       • Application: configuration, design issues, defects, DB queries, etc.
       • Host: multi-pathing software compatibility, HBA microcode/device driver, OS compatibility
       • SVC: microcode level and performance features; front-end contention (I/O group, node); backend contention (MDG, mdisk)
       • Backend storage: front-end port, cache, NVS; backend controller, RAID group (disks)
       • Fabric: ISL congestion
    10. Performance Analysis Process
       • Gather host multi-pathing, SVC, and storage configuration/firmware details
       • Ensure device support and compatibility
         - SVC support matrix: if host or storage devices are unsupported, resolve!
         - Update SVC firmware to the latest level (ensure host multi-pathing is supported and configured correctly)
       • After resolving configuration issues:
         - Gather end-to-end response time (e.g. host iostat/perfmon data)
         - If elongated response time exists, drill down to the next layer
       • Measurement points
         - Application: transactional latency
         - Host: LV and disk I/O response times, disk utilization, throughput
         - Fabric: throughput, utilization
         - SVC: I/O group, MD group, mdisks, vdisks
         - Storage (depends on technology): EMC – FA, cache, DA, disk, volume; DS8K/ESS – front-end port, array (physical), volume
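The drill-down step on this slide can be expressed as a simple rule: compare each layer's observed response time against a threshold, working from the host toward the backend. The thresholds and layer names below are illustrative assumptions, not official SVC guidance.

```python
# Sketch of the drill-down: flag each measurement layer (host -> backend)
# whose response time breaches its threshold. Thresholds are illustrative.

thresholds_ms = {"host": 20.0, "svc_vdisk": 15.0, "svc_mdisk": 25.0, "backend": 25.0}

def drill_down(observed_ms):
    """Return the layers, ordered host -> backend, with elongated response time."""
    suspects = []
    for layer in ["host", "svc_vdisk", "svc_mdisk", "backend"]:
        if observed_ms[layer] > thresholds_ms[layer]:
            suspects.append(layer)
    return suspects

# Hypothetical reading: slow at the host, vdisk, and mdisk layers, but the
# backend controller itself responds quickly -> queueing inside the SVC.
print(drill_down({"host": 120.0, "svc_vdisk": 110.0, "svc_mdisk": 90.0, "backend": 12.0}))
```

The useful signal is where the elongation stops: if the backend is fast while every layer above it is slow, the bottleneck sits in the virtualization layer, not the storage controller.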
    11. Performance Analysis Tools for SVC
       • Tivoli TotalStorage Productivity Center (TPC)
         - Complex and expensive to deploy
         - Provides a great deal of detail
       • Native command-line interface: data is in XML format, but there is no publicly available post-processing tool
         - A custom-written text parser is not ideal
         - XSL and ANT are good options, as are other XML parsers/viewers
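Since the CLI emits XML and no public post-processor exists, a small standard-library script is often enough. The element and attribute names below are hypothetical placeholders, not the real SVC statistics schema, which you would need to map from an actual dump.

```python
# Post-process an SVC-style XML statistics dump with the standard library.
# NOTE: <stats>/<mdisk reads=... writes=.../> is an invented placeholder
# schema for illustration; the real SVC dump uses different names.
import xml.etree.ElementTree as ET

sample = """
<stats>
  <mdisk id="mdisk0" reads="2077" writes="681"  read_ms="8.3"  write_ms="1.9"/>
  <mdisk id="mdisk1" reads="1973" writes="1053" read_ms="10.3" write_ms="6.9"/>
</stats>
"""

root = ET.fromstring(sample)
# Find the busiest mdisk by total I/O count.
busiest = max(root.iter("mdisk"),
              key=lambda m: int(m.get("reads")) + int(m.get("writes")))
print(busiest.get("id"))
```

The same pattern (parse, iterate, rank) scales to per-vdisk and per-node statistics once the real element names are substituted.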
    12. SVC Key Performance Metrics
       • I/O group: front-end and backend latency (read/write), queue time (read/write), throughput (read/write), transfer size (read/write), I/O rates (read/write); cache hits
       • Node: same as I/O group, plus CPU and port-to-local-node send/receive I/O rates
       • MD group: front-end and backend latency (read/write), queue time, throughput, transfer size, I/O rates
       • MDisk: backend latency (read/write), queue time, throughput, transfer size, I/O rates
       • Vdisk: front-end latency, queue time (read/write), throughput, transfer size, I/O rates; NVS full and delays, cache hits
       • Explanations
         - Overall response time = vdisk response time
         - If an I/O is a cache hit, you see only the vdisk response time
         - Backend response time = mdisk fabric response time (from the point the I/O is sent to the controller until it returns)
         - Backend queue = mdisk queue time (inside the SVC waiting to be sent onto the fabric, plus fabric response time)
         - Backend transfers are done in 32 KB tracks, so a vdisk doing 256 KB I/O needs many backend I/Os to complete a cache miss; most of these are issued concurrently
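The last explanation above is worth putting into numbers: with 32 KB backend tracks, the backend I/O count for one front-end I/O is a simple ceiling division (assuming track-aligned I/O, which real requests need not be).

```python
# Backend I/O fan-out for one front-end request, given 32 KB backend tracks.
# Assumes track-aligned I/O for simplicity.
import math

TRACK_KB = 32

def backend_ios(frontend_kb, cache_hit):
    if cache_hit:
        return 0                         # served from SVC cache, no backend I/O
    return math.ceil(frontend_kb / TRACK_KB)

print(backend_ios(256, cache_hit=False))  # one 256 KB miss -> 8 backend I/Os
print(backend_ios(256, cache_hit=True))   # cache hit -> 0
```

This is why large-transfer, cache-hostile workloads (such as the backup streams in the example that follows) multiply load on the mdisk layer.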
    13. Real World Example: Online Banking Application (OLB) – Problem Statement
       • An online banking application, and other applications that rely on SAN I/O, are experiencing intermittent, severe performance impacts
       • The impacts are typified by daily performance degradation between 3:15 AM and 6:00 AM
       • SVC response time outside the problem window is acceptable
    14. OLB – Host Impact: Increase in Copy Times (chart slide)
    15. OLB: Performance Analysis – Host Configuration
       • Collect host configuration data
         - Prior to microcode 4.3.1 it is very important that the host multi-pathing software communicates with the SVC preferred node!
         - Prefer IBM SDD/PCM; they are known to work
         - If using others (DMP/MPxIO), only one multi-pathing software package should be active
         - Special procedures and/or configuration changes may be required for non-IBM multi-pathing
       • Hosts were running improperly configured MPxIO: needed a patch and an SVC configuration change
       • Hosts were running an unsupported DMP configuration:
         - Needed a patch from Veritas to fix
         - VxVM 5.0 requires RP3 (Rolling Patch 3) and Hotfix 127320-02
       • Identify and repair the host configuration
    16. OLB: Upgrade SVC to Latest Firmware
       • Make sure you are at least at 4.x
       • The latest SVC firmware (4.2.x) has many fixes
       • Fixes to increase mdisk queue-depth settings
       • Versus 3.x, SVC 4.x takes advantage of all node ports
       • Cache partitioning is available for governing workloads
       • 4.x provides enhanced performance metrics
    17. OLB: Gather End-to-End Response Time
       • Initially gather enough information to confirm there are I/O-related issues
       • Identify whether the I/O throughput degradation is systemic
         - All devices on a given host
         - All devices on all hosts
         - All devices on a given SVC or SAN component
       • In this case all hosts were impacted by throughput degradation
       • Watch for large transfer sizes, as destages from cache to backend storage are done in 32 KB writes
    18. OLB: Gather SVC MD Group Data
       Focus on the MDGs with the most throughput during the problem period:

       SVC    | MD Group        | Read IO/s | Write IO/s | Total IO/s | Read MB/s | Write MB/s | Total MB/s | Avg Read KB | Avg Write KB | Avg KB
       SVC001 | SVC1_33333_R5_4 | 191.80    | 87.80      | 279.60     | 16.50     | 3.00       | 19.40      | 76.30       | 37.70        | 62.10
       SVC001 | SVC1_33333_R5_1 | 197.30    | 105.30     | 302.60     | 10.30     | 6.90       | 17.20      | 41.10       | 76.40        | 70.50
       SVC001 | SVC1_33333_R5_0 | 207.70    | 68.10      | 275.80     | 8.30      | 1.90       | 10.20      | 30.20       | 22.00        | 31.30
       SVC001 | SVC1_33333_R5_2 | 224.70    | 78.20      | 302.90     | 8.40      | 2.30       | 10.80      | 32.20       | 33.10        | 34.00
       SVC001 | SVC1_12345_R5_4 | 233.30    | 102.20     | 335.60     | 17.50     | 13.10      | 30.60      | 77.90       | 242.90       | 99.10
       SVC001 | SVC1_33333_R5_9 | 286.70    | 91.30      | 378.00     | 14.60     | 7.40       | 22.00      | 56.00       | 75.70        | 67.90
       SVC001 | SVC1_12345_R5_2 | 293.10    | 60.20      | 353.30     | 17.70     | 18.00      | 35.70      | 62.50       | 359.70       | 105.00
       SVC001 | SVC1_12345_R5_3 | 309.30    | 124.00     | 433.30     | 22.80     | 20.50      | 43.40      | 74.80       | 308.70       | 106.00
       SVC001 | SVC1_22222_R1_3 | 395.10    | 78.50      | 473.60     | 28.10     | 29.30      | 57.40      | 72.80       | 381.90       | 125.40
       SVC001 | SVC1_12345_R5_1 | 459.80    | 119.80     | 579.60     | 29.00     | 28.00      | 57.00      | 64.70       | 518.90       | 109.20
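"Focus on the MDGs with the most throughput" is itself a one-liner once the statistics are in a list. The sketch below uses a few rows from the MD group data on this slide, shaped as (name, total IO/s, total MB/s) tuples, and ranks by data rate.

```python
# Rank managed disk groups by total data rate and keep the top few.
# Rows are (mdg_name, total_io_rate, total_mb_per_s), taken from the slide data.

rows = [
    ("SVC1_33333_R5_4", 279.6, 19.4),
    ("SVC1_12345_R5_3", 433.3, 43.4),
    ("SVC1_22222_R1_3", 473.6, 57.4),
    ("SVC1_12345_R5_1", 579.6, 57.0),
]

def top_mdgs(rows, n=2):
    # Sort descending by MB/s (index 2) and return the MDG names.
    return [name for name, io_rate, mb_s in
            sorted(rows, key=lambda r: r[2], reverse=True)[:n]]

print(top_mdgs(rows))  # ['SVC1_22222_R1_3', 'SVC1_12345_R5_1']
```

Ranking by MB/s rather than IO/s matters here: large-transfer backup streams can dominate bandwidth while posting unremarkable I/O rates.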
    19. OLB: Drill Down to Vdisk
       What are these hosts doing during this time period?

       VDISK  | Servers      | Read IO/s | Write IO/s | Total IO/s | Read MB/s | Write MB/s | Total MB/s | Avg Read KB | Avg Write KB | Avg KB
       vdisk6 | Host4        | 5.4       | 2.1        | 7.4        | 0.5       | 0          | 0.5        | 11.1        | 6.1          | 15.3
       vdisk5 | Host3        | 19.7      | 2.4        | 22.1       | 1.9       | 0          | 1.9        | 12.2        | 6.7          | 17
       vdisk4 | Host1, Host2 | 40.9      | 4          | 45         | 2.6       | 0          | 2.6        | 17.3        | 8            | 17.1
       vdisk3 | Host1, Host2 | 68.4      | 39.7       | 108.1      | 3.2       | 1.6        | 4.8        | 58.7        | 41           | 47.2
       vdisk2 | Host1, Host2 | 69.4      | 0.1        | 69.5       | 2.9       | 0          | 2.9        | 13.8        | 8            | 13.8
       vdisk1 | Host1, Host2 | 72.8      | 1.1        | 73.9      | 3         | 0          | 3          | 13.7        | 8            | 13.7
    20. OLB: Identify Processes and Scheduled Jobs Initiating I/O
       • Check native schedulers (cron/at) for:
         - Application users
         - DB users
         - root
       • Check third-party schedulers (Autosys)
       • Cron entries for the DB servers on the hosts with high I/O identified 103 database backup schedules within the problem period!
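The cron review above can be partly automated: count the entries whose scheduled time falls inside the 3:15–6:00 problem window. This sketch handles only plain numeric minute/hour fields; real crontabs (ranges, steps, lists) need fuller parsing, and the entries shown are invented examples.

```python
# Flag crontab entries scheduled inside a problem window (default 03:15-06:00).
# Only simple numeric minute/hour fields are handled; entries are hypothetical.

crontab = [
    "15 3 * * * /opt/oracle/bin/rman_backup.sh DB01",
    "20 3 * * * /opt/oracle/bin/rman_backup.sh DB02",
    "0 12 * * * /usr/local/bin/logrotate.sh",
]

def jobs_in_window(entries, start=(3, 15), end=(6, 0)):
    hits = []
    for entry in entries:
        minute, hour = entry.split()[:2]
        if minute.isdigit() and hour.isdigit():
            t = (int(hour), int(minute))          # compare as (hour, minute)
            if start <= t <= end:
                hits.append(entry)
    return hits

print(len(jobs_in_window(crontab)))  # 2 of the 3 jobs fall in the window
```

Run across every DB server's crontab, this is how a count like "103 backup schedules in 20 minutes" surfaces quickly.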
    21. OLB – Root Cause
       • The root cause of the online banking performance degradation is a flooding of the SAN Volume Controller by streaming read I/Os originating from Oracle RMAN backups initiated on 103 databases within a 20-minute period.
       • This read I/O flood is cache hostile, causing other read and write requests to queue and creating performance degradation.
       • With the current host read-ahead settings, at peak (concurrent Oracle RMAN incremental backups between 3 AM and 6 AM), the SVC is not able to process the combined volume and composition of I/O without a flow-on performance impact.
    22. OLB: Actions Taken During Analysis

       Stage                  | SVC MB/s | SVC CPU % | SVC Read Resp (ms)
       Initial inspection     | 600      | 80        | 120
       SVC 4.2.03 upgrade     | 800      | 80        | 90
       Host – DMP patch       | 1300     | 70        | 65
       Host – MPxIO corrected | 2350     | 60        | 60
       Target peak            | 2500     | 60        | 15
    23. OLB: Final Recommendations (by priority)
       • Implement the production backup policy/strategy in the test environment
         - Veritas snapshot backups for hosts operating large databases: reduce the data transferred and schedule carefully!
       • Tune RMAN, Oracle, and the SVC to control I/O composition and I/O availability
         - Scheduling / transfer size / isolation / governance on the vdisk
       • Add a new I/O group to SVC001
         - Isolation!
       • Replace the SVC 2145-8F4 nodes currently in use with 2145-8G4 nodes
         - Hardware upgrade!
    24. SVC Performance Analysis Summary
       • Identify performance requirements and expectations!
       • Determine compatibility; resolve incompatibilities
       • Use the latest SVC firmware if possible
       • Measure hosts
       • Measure the SVC
       • Measure backend storage
       • Identify bottlenecks and resolve them
    25. Appendix A: Additional Resources
       These publications are also relevant as further information sources:
       • IBM System Storage SAN Volume Controller, SG24-6423-05
       • Get More Out of Your SAN with IBM Tivoli Storage Manager, SG24-6687
       • IBM Tivoli Storage Area Network Manager: A Practical Introduction, SG24-6848
       • IBM System Storage: Implementing an IBM SAN, SG24-6116
       • IBM System Storage Open Software Family SAN Volume Controller: Planning Guide, GA22-1052
       • IBM System Storage Master Console: Installation and User's Guide, GC30-4090
       • IBM System Storage Open Software Family SAN Volume Controller: Installation Guide, SC26-7541
       • IBM System Storage Open Software Family SAN Volume Controller: Service Guide, SC26-7542
       • IBM System Storage Open Software Family SAN Volume Controller: Configuration Guide, SC26-7543
       • IBM System Storage Open Software Family SAN Volume Controller: Command-Line Interface User's Guide, SC26-7544
       • IBM System Storage Open Software Family SAN Volume Controller: CIM Agent Developers Reference, SC26-7545
       • IBM TotalStorage Multipath Subsystem Device Driver User's Guide, SC30-4096
       • IBM System Storage Open Software Family SAN Volume Controller: Host Attachment Guide, SC26-7563
    26. Biography
       Brett Allison has been doing distributed-systems performance work since 1997, including J2EE application analysis, UNIX/NT, and storage technologies. His current role is Performance and Capacity Management team lead at ITDS. He has developed tools, processes, and service offerings to support storage performance and capacity. He has spoken at a number of conferences and is the author of several white papers on performance.