
DS8000 Practical Performance Analysis (P04, 2006-07-18)



The goal of this presentation is to provide practical storage performance tips for administrators of IBM DS8000 storage arrays.


  1. IBM Global Services: Practical Performance Analysis of DS8000 Storage Subsystems. Brett Allison, session P04. Las Vegas, NV, July 24-28, 2006. © IBM Corporation 2006
  2. Trademarks & Disclaimer
     - The following terms are trademarks of the IBM Corporation:
       - Enterprise Storage Server® (abbreviated ESS)
       - TotalStorage® Expert (TSE)
       - FAStT/DS4000/DS8000
       - AIX®
       - z/OS®
     - Other trademarks appearing in this report may be considered trademarks of their respective companies:
       - SANavigator and EFCM are trademarks of McDATA Corporation.
       - UNIX is a registered trademark of The Open Group in the United States and other countries, licensed exclusively through X/Open Company Limited.
       - Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
       - EMC is a registered trademark of EMC Inc.
       - HP-UX is a registered trademark of HP Inc.
       - Solaris is a registered trademark of Sun Microsystems, Inc.
       - Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
       - Disk Magic is a trademark of IntelliMagic ( http:// ).
     - Disclaimer: The views in this presentation are those of the author and are not necessarily those of IBM.
  3. Table of Contents
     - Introduction
     - Hardware Overview
     - Performance Implications/Observations
     - Disk Magic Observations
     - Performance Analysis Techniques
     - "Bubba Numbers"
     - Performance Tools
       - PDCU
       - TPC
  4. DS8000 Performance Enhancers
     [Diagram: two 2-way POWER5 570 servers with memory DIMMs and L3 cache, connected by RIO-G, driving switched fibre channel disk packs over 2 Gb fibre links]
     - RIO-G
       - 1 GB/s per link, full duplex
       - Spatial reuse (all links usable concurrently)
       - At 50% utilization a loop supports 2 GB/s sustained data transfer
     - Host adapter
       - A Fibre Channel host port can sustain a 206 MB/s data transfer
     - Back-end
       - XOR parity operations remain in the adapter; no cache bandwidth is consumed
       - Switched fibre: two concurrent operations per loop
     - POWER5
       - Near-linear SMP scaling
       - Simultaneous multithreading
       - Large L1, L2, and L3 caches
       - L3 cache directory on die
     - Cache
       - SARC provides up to 100% improvement in cache hits over LRU
  5. Disk Magic Introduction
     - Disk Magic is a tool that models current and future performance of disk subsystems, IBM or other, attached to Open, iSeries, or zSeries servers.
     - Disk Magic is a product of IntelliMagic ( http:// ), developed in close cooperation with the IBM performance team in Tucson, AZ, and is licensed to the IBM Server Group for marketing support purposes.
     - Techline provides the service: Sales Support Connect (SPC) at 1-877-707-2727 for US and Canada, or 506-646-7498 for Latin America.
  6. Disk Magic Observations – 70/30/50 – Varying Block Size
  7. Disk Magic Observations – 70/30/50 – Varying # I/Os
  8. Performance Analysis Process – I/O – Bottom Up
     [Flowchart: Phase 1, host analysis: is there a host resource issue? If yes, fix it; if no, identify hot/slow host disks. Phase 2, storage server analysis: use storage server performance data to identify and fix the issue.]
  9. Storage Subsystem Performance Analysis Process
     [Flowchart: Always collect performance data! Application problem? If yes, fix it. If no, look at the performance data. Disk contention? If no, identify another resource. Then identify the ripest fruit and harvest.]
  10. Collecting Performance Data – Storage Subsystem

  Tool | Pros | Cons
  TPC 3.1 | Scalable; documented; supported; broad range of data collected | Requires infrastructure; complicated; expensive; limited views/analysis
  Performance Data Collection Utility (PDCU) | Free Excel macro for post-processing provided; low collection system overhead; collects port, array, volume data; easy installation/usage | No longer available for customer download; limited performance data; limited documentation
  11. Bubba Numbers

  Component | Metric | Threshold | Comment
  Port | Average RT | 1 msec | Not available in PDCU
  Port | Utilization | 50% | Not available, but derived as (the greater of Avg Read KB/s or Avg Write KB/s) / 200,000
  Array | Avg Read RT | 6 msec | Backend disk response time
  Array | Avg Write RT | 10 msec | Backend disk response time
  Array | Utilization | 50% | SUM(read times + write times) / interval length
  Volume | Avg Read RT | 6 msec | Backend disk response time; population 1
  Volume | Avg Write RT | 10 msec |
  Volume | NVS Delayed DFW I/Os | 5% |
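The derived port-utilization rule above can be sketched in a few lines. This is a minimal illustration, not part of the original deck; the function names are mine, and the 200,000 KB/s divisor comes from the roughly 200 MB/s a DS8000 Fibre Channel host port can sustain.

```python
# "Bubba number" port utilization, per the slide's derivation:
# utilization = (greater of Avg Read KB/s or Avg Write KB/s) / 200,000.
PORT_CAPACITY_KBS = 200_000  # ~200 MB/s sustained per FC host port (assumption from the deck)

def port_utilization(avg_read_kbs: float, avg_write_kbs: float) -> float:
    """Derived port utilization as a fraction (0.0 to 1.0)."""
    return max(avg_read_kbs, avg_write_kbs) / PORT_CAPACITY_KBS

def exceeds_threshold(avg_read_kbs: float, avg_write_kbs: float,
                      threshold: float = 0.50) -> bool:
    """Flag a port whose derived utilization exceeds the 50% rule of thumb."""
    return port_utilization(avg_read_kbs, avg_write_kbs) > threshold
```

For example, a port moving 120,000 KB/s of reads comes out at 0.60 utilization and would be flagged against the 50% threshold.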
  12. Read Response Time by Port – PDCU
  13. All Volume Read Response Time – PDCU
  14. Analyzing Volume Data – PDCU/Excel [Pivot table: volume data summary]
  15. Analyzing Volume Data – PDCU/Excel Continued

  Volume ID | Avg R RT | Avg Read I/O Rate | Total I/O RT | % Total | Cumulative %
  0x5000 | 8.10 | 102.32 | 828.30 | 21% | 21%
  0x4f0b | 8.12 | 101.84 | 826.57 | 20% | 41%
  0x4e02 | 8.14 | 97.87 | 796.91 | 20% | 61%
  0x4e0b | 8.06 | 96.86 | 780.99 | 19% | 80%
  0x5001 | 8.14 | 93.23 | 759.37 | 19% | 99%
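The pivot-table math behind this summary can be sketched as follows: weight each volume's average read response time by its read I/O rate to get a total read I/O time, then rank volumes and accumulate each one's share of the grand total. This is my own minimal reconstruction of the Excel calculation, not the author's script.

```python
def rank_volumes(stats):
    """stats: list of (volume_id, avg_read_rt_ms, avg_read_io_rate).
    Returns rows (volume_id, total_io_rt, pct_total, cumulative_pct),
    hottest volume first."""
    # Total read I/O time per volume = avg response time * I/O rate.
    weighted = [(vol, rt * rate) for vol, rt, rate in stats]
    weighted.sort(key=lambda row: row[1], reverse=True)
    grand_total = sum(total for _, total in weighted)
    rows, running = [], 0.0
    for vol, total in weighted:
        running += total
        rows.append((vol, total, total / grand_total, running / grand_total))
    return rows
```

Feeding in the slide's per-volume averages reproduces the ranking shown: 0x5000 (8.10 ms at 102.32 I/Os) lands on top, and the cumulative column reaches 100% at the last row.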
  16. Analyzing Volume Configuration – Map Volumes to Arrays
     - Within DS CLI, issue 'lsfbvol' and save the output. Fields: Name, ID, accstate, datastate, configstate, deviceMTM, datatype, extpool, sam, captype, cap (2^30B), cap (10^9B), cap (blocks), volgrp
     - Within DS CLI, issue 'lsarray' and save the output. Fields: Array, State, Data, RAIDtype, arsite, Rank, DA Pair, DDMcap (10^9B)
     - Correlate and convert the array/rank ID to hexadecimal:

  Volume ID | Extent Pool ID | Rank ID | Rank ID (hex)
  0x5001 | P16 | R16 | 0x10
  0x4e0b | P10 | R10 | 0xA
  0x4e02 | P10 | R10 | 0xA
  0x4f0b | P9 | R9 | 0x9
  0x5000 | P16 | R16 | 0x10
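The correlation step above can be sketched in code. This is a hypothetical helper of my own, assuming you have already parsed 'lsfbvol' into a volume-to-extent-pool map and 'lsarray' into an extent-pool-to-rank map; it then converts the rank number to hexadecimal so it matches the rank IDs reported by the performance tools.

```python
def map_volumes_to_ranks(vol_to_pool, pool_to_rank):
    """vol_to_pool: {volume_id: extent_pool_id} parsed from 'lsfbvol'.
    pool_to_rank: {extent_pool_id: rank_id} parsed from 'lsarray', e.g. 'R16'.
    Returns {volume_id: (extent_pool_id, rank_id, rank_hex)}."""
    result = {}
    for vol, pool in vol_to_pool.items():
        rank = pool_to_rank[pool]              # e.g. 'R16'
        rank_hex = hex(int(rank.lstrip('R')))  # 'R16' -> '0x10'
        result[vol] = (pool, rank, rank_hex)
    return result
```

With the slide's data, volume 0x5000 in extent pool P16 maps to rank R16, hex 0x10, which is the ID to look for in the PDCU array reports.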
  17. Analyze the Arrays Associated with the Hot Volumes – PDCU
  18. Performing Bottom-Up Analysis using TPC for Disk – Array Utilization Report
  19. Drill Down from the Array Table – TPC
     - Select the magnifying glass icon to drill down to volumes.
     - From the volumes table you can chart all volumes.
  20. Getting Performance Data – tpctool
     - Syntax is very particular; read the documentation.
     - Prior to 3.1.2, the output did not contain the component ID!
     - See the CLI Guide.
  21. Performance Analysis Process – I/O – Top Down
     [Flowchart, same two-phase flow as the bottom-up process: Phase 1, host analysis: is there a host resource issue? If yes, fix it; if no, identify hot/slow host disks. Phase 2, storage server analysis: use storage server performance data to identify and fix the issue.]
  22. Host I/O Analysis – Example of an AIX Server
     - Gather LUN-to-hdisk information (see Appendix A):

  Disk | Path | Location | Adapter | LUN SN | Type
  vpath197 | hdisk42 | 09-08-01[FC] | fscsi0 | 75977014E01 | IBM 2107-900

     - Gather response time data with 'filemon' (see Appendix B):

  ------------------------------------------------------------------------
  Detailed Physical Volume Stats (512 byte blocks)
  ------------------------------------------------------------------------
  VOLUME: /dev/hdisk42 description: IBM FC 2107
  reads: 1723 (0 errs)
  read sizes (blks): avg 180.9 min 8 max 512 sdev 151.0
  read times (msec): avg 4.058 min 0.163 max 39.335 sdev 4.284

     - Format the data (email me for the script). Note: the formatted data can be used in Excel pivot tables to perform a top-down examination of I/O subsystem performance:

  DATE | TIME | SERVER | DS8000 | LUN | HDISK | #READS | READ_TIMES | AVG_READ_KB
  May/30/2006 | 18:04:05 | test1 | 7597701 | 75977014E01 | hdisk42 | 1605 | 3.832 | 93.3
  May/30/2006 | 18:04:05 | test1 | 7597701 | 75977010604 | hdisk1278 | 1978 | 2.868 | 91.8
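A formatting script of the kind the slide mentions could start from something like the sketch below, which pulls the per-hdisk read statistics out of filemon's "Detailed Physical Volume Stats" section. This is my own minimal parser, not the author's script, and the regular expressions assume filemon's layout exactly as shown in the excerpt above.

```python
import re

# Patterns matching the filemon excerpt on this slide.
VOLUME_RE = re.compile(r"VOLUME:\s+/dev/(\S+)")
READS_RE = re.compile(r"reads:\s+(\d+)")
READ_TIMES_RE = re.compile(r"read times \(msec\):\s+avg\s+([\d.]+)")

def parse_filemon(text):
    """Return {hdisk: {'reads': int, 'avg_read_ms': float}} from filemon output."""
    stats, current = {}, None
    for line in text.splitlines():
        if m := VOLUME_RE.search(line):
            current = m.group(1)       # start a new per-volume record
            stats[current] = {}
        elif current and (m := READS_RE.search(line)):
            stats[current]['reads'] = int(m.group(1))
        elif current and (m := READ_TIMES_RE.search(line)):
            stats[current]['avg_read_ms'] = float(m.group(1))
    return stats
```

Run against the excerpt above, it yields 1723 reads at a 4.058 ms average for hdisk42, ready for loading into an Excel pivot table.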
  23. Host I/O Analysis – Helpful Views – Pivot Tables from filemon Data and 'datapath query essmap'
     - LSS view: LSS 7 & 10 I/Os make up 47% of total RT.
     - Rank view (rank 'ffff'): LUNs '0703' & '0709' make up 46% of total RT to LSS 7 & 10.
  24. DS8000 Port Layout – 'datapath query essmap'

  Excerpt from 'datapath query essmap':
  Disk | hdisk | Connection | Port
  vpath5 | hdisk42 | R1-B4-H1-ZA | 300

     - Connection = 2107 port information
     - B4 = 4th I/O enclosure, i.e. I/O enclosure 3
     - H1 = 1st four-port slot, i.e. H0
     - ZA = fabric A in a dual fabric
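The decoding rules above can be captured in a small helper. This is an illustrative sketch following exactly the rules on this slide (the field names and the dictionary shape are my own): the bay and slot numbers are one-based in the string and zero-based in the hardware naming, and the last letter of the final field identifies the fabric.

```python
def decode_connection(conn: str) -> dict:
    """Decode a 2107 'Connection' string such as 'R1-B4-H1-ZA'."""
    _rack, bay, slot, fabric = conn.split('-')   # e.g. 'R1', 'B4', 'H1', 'ZA'
    return {
        'enclosure': int(bay[1:]) - 1,   # B4 -> 4th enclosure -> I/O enclosure 3
        'slot': int(slot[1:]) - 1,       # H1 -> 1st four-port slot -> H0
        'fabric': fabric[-1],            # ZA -> fabric 'A'
    }
```

So the slide's example 'R1-B4-H1-ZA' decodes to I/O enclosure 3, slot H0, fabric A.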
  25. Correlating the LUN from SDD with the DS8000 Volume
     - SDD 'datapath query essmap': DS8000 SN 7597701, Volume ID 0000
     - DS CLI 'lsfbvol': Name NX3DA0001, Volume ID 0000
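The correlation works because an 11-character SDD LUN serial such as '75977014E01' (from the hdisk42 example earlier) concatenates the 7-character DS8000 storage-unit serial and the 4-hex-digit volume ID. A one-line split, sketched here as my own hypothetical helper, is enough to match SDD output against 'lsfbvol':

```python
def split_lun_serial(lun_sn: str):
    """Split an SDD LUN serial into (ds8000_sn, volume_id).
    '75977014E01' -> ('7597701', '4E01')."""
    return lun_sn[:7], lun_sn[7:]
```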
  26. Summary
     - The top-down approach is the most efficient way to identify hot LUNs.
     - PDCU is no longer available to customers, but it works!
     - The TPC GUI is limited in analysis views.
     - tpctool is the best way to get raw data.
     - Use the extent pool as the key to correlate a volume with an array.
     - Develop your own Bubba numbers!
     - Look for hot arrays, especially those with large DDM capacity.
     - The number of I/Os drives disk utilization more than transfer size does.
     - Spreading data and load balancing still need to be done!
  27. Appendix A: Configuration – Getting LUN Serial Numbers for DS8000 Devices

  OS | Tool | Command | Key | Other Metrics
  zOS | RMF | RMF PP and online displays | VOLSER | LCU ID, ChPID, devnum
  AIX | SDD 1.6.X | datapath query essmap | LUN SN | VG, hostname, Connection, hdisk, LSS
  Linux | SDD | datapath query device | LUN SN | Device Name, vpath
  Wintel | SDD | datapath query device | Serial | Device Name
  HP-UX, Solaris | SDD | datapath query device | LUN SN | Device Name
  28. Appendix B: Measure End-to-End Host Disk I/O Response Time

  OS | Native Tool | Command/Object | Metric(s)
  zOS | RMF | RMF Mon3 DEVR, etc. | RespTime, ActRate
  AIX 5.3 | iostat | iostat -D | avgserv
  AIX 5.x – 5.2 | filemon | filemon -o /tmp/filemon.log -O all | read time (ms), write time (ms)
  NT/Wintel | perfmon | Physical Disk | Avg. Disk sec/Read
  Linux | *iostat | iostat -d 2 5 | svctm (ms)
  Solaris | iostat, sar | iostat -xcn 2 5, sar -d | svc_t (ms)
  HP-UX | sar | sar -d | avserv (ms)
  29. Appendix C: Resources
     - AIX documentation
     - Linux – iostat
     - HP-UX documentation
     - Solaris documentation
     - DS8000 Redbooks
       - IBM TotalStorage DS8000 Series: Performance Monitoring and Tuning
       - IBM TotalStorage DS8000 Series: Concepts and Architecture, SG24-6452
     - TPC documentation
       - Managing Disk Subsystems using IBM TotalStorage Productivity Center, SG24-7097
       - IBM TotalStorage Productivity Center Installation and Configuration Guide, GC32-1774
       - IBM TotalStorage Productivity Center User's Guide, GC32-1775
  30. Biography
  Brett Allison has been doing distributed-systems performance work since 1997, including J2EE application analysis, UNIX/NT, and storage technologies. His current role is Performance and Capacity Management team lead, ITDS. He has developed tools, processes, and service offerings to support storage performance and capacity, has spoken at a number of conferences, and is the author of several white papers on performance.