Capacity Management for SAN


  • A good first step toward effective capacity management for SAN-attached storage is to make sure you are managing the non-SAN-specific aspects of storage first. A second important step is recognizing what limitations and gaps exist from the host perspective.
  • Keep in mind the level at which disk space runs out (e.g. file systems, drives, volumes). Typically this is where monitoring is configured, but it can be proactive. Also remember that multiple I/O requests can be in flight at the same time, just as in other networking protocols, controlled by queue depth settings.
  • Aggregate data to the appropriate level for reporting to a given audience.
  • Highlight storage for IT, unknown, and other unbillable storage. If customers have a blank check they will consume a lot more storage. Having many tools that all consume data can add up. Athene consolidates your data for capacity management. Make sure all allocated storage has an owner.
  • Not all storage is created equal. Growth and the decreasing cost of storage are opposing forces. If costs stop decreasing, the way CPU speeds stopped increasing, look out. Physical limits can be reached for storage density. The primary focus of billing is creating accountability, rather than ensuring exact financial accounting of real costs. It may not all be real, but it’s better than an open checkbook.
  • Ideally you could do a business study, then create a business plan based on those results (i.e. cost/benefit analysis). You need a compelling story to generate interest.
  • How much storage can administrators manage? It depends on many factors.
  • Are we talking about utilization on the host or the SAN side? Does it include overheads for file systems, RAID, DR, etc.? Right-size for backups, growth, variability, etc. Start with the most important low-hanging fruit.
  • Proactive management with automated trending. Be aware that fighting fires is more glamorous and visible. It’s easy to get buried in data; filter out the noise with exceptions and filters (10% of 10 GB vs. 10% of 1 TB). Not all trend lines are created equal.
  • Storage vMotion in vSphere 5 will load balance based on datastore performance. Thin provisioning may not be appropriate in situations where delays for expanding storage are not acceptable.
  • Compare the advantages of using virtual storage to distribute over more spindles versus specific placement, administration, and performance. Mention types of in-band versus out-of-band virtualization. Host, SAN, and array components are required.
  • How do you find dark and hidden storage? Compare what is allocated against what shows up on hosts and in asset management.
  • Also, the proportion of samples over a threshold, and variability.
  • It can also be the reverse, where the host looks okay but there is an impact. Measured I/O response is the best way to determine what the OS is experiencing. Also, significant changes from normal can indicate problems.
  • If the line waiting for service grows, either your throughput or your service time has increased. Queues don’t typically grow in a linear fashion; things can fall apart quickly when this spikes. Queue length can be good for monitoring and diagnosis, but not for planning.
  • Individual disks may map to completely different areas of back-end storage. An impact in one area can be traced back through to the root problem.
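The note above that queues do not grow in a linear fashion can be illustrated with a minimal sketch. It assumes a single-server M/M/1 queue (random arrivals, exponentially distributed service times); that model is an illustrative assumption here, not necessarily the one behind the charts in the deck. The function names and the 5 ms service time are hypothetical.

```python
def mm1_response_time(service_time_ms: float, utilization: float) -> float:
    """Mean response time (service + waiting) for an M/M/1 queue.

    Uses R = S / (1 - rho), where S is the mean service time and rho is
    utilization expressed as a fraction between 0 and 1.
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in the range [0, 1)")
    return service_time_ms / (1.0 - utilization)


def mm1_mean_in_system(utilization: float) -> float:
    """Mean number of requests in the system (queued plus in service)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in the range [0, 1)")
    return utilization / (1.0 - utilization)


if __name__ == "__main__":
    service_ms = 5.0  # hypothetical per-I/O service time
    for rho in (0.50, 0.70, 0.80, 0.90, 0.95):
        print(
            f"utilization {rho:.0%}: "
            f"response {mm1_response_time(service_ms, rho):6.1f} ms, "
            f"mean in system {mm1_mean_in_system(rho):5.1f}"
        )
```

Under these assumptions, response time doubles between idle and 50% busy and then climbs steeply as utilization approaches 100%, which is one way to see why things can fall apart quickly once a queue starts to spike.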

    1. Metron Capacity Management for SAN Attached Storage. Warning: Low Disk Space
    2. Metron-Athene • Established 1986 • Stable ownership • Consistent focus on CM • Industry leadership
    3. Athene (diagram text): z/OS, HP-UX, AIX, Solaris, Linux; Data Source Acquire Framework; DB/Application; Virtual Server; Custom; Control Center; Capacity Database
    4. Objectives • Trends in storage technology. • Define two distinct aspects of storage capacity. • Examine key areas related to capacity management of SAN attached storage. • Equate with business value. • Show how tools like Athene can help you achieve your goals. • Provide ideas about how to proceed with improving storage capacity management processes in your environment.
    5. Trends • Solid state devices • Cloud storage • Embedded storage (e.g. Exadata, vBlock) • Big data (e.g. Hadoop) • Tiered storage • Primary de-duplication • FCoE, 16 Gbps Fibre Channel, and 10 Gbps Ethernet
    6. Two Distinct Aspects of Storage Capacity • Disk performance capacity: response, IOPS • Disk space capacity: bytes
    7. Space Capacity – Growth (measurable): Changing demands for storage – slope of line
    8. Space Capacity – History: Growth can result in increasing cost and complexity.
    9. Space Capacity – Growth and Cost Factors: Growth • Business as usual (trend) • Acquisitions • New applications and projects. Costs • Equipment, including power • Resource management, including people • Storage use by application (billable customers)
    10. Space Capacity – Storage as a Service: How much are customers consuming? Don’t forget about the IT department and other insiders!
    11. Space Capacity – Tiered Service Model: Define what the tiers are (platinum, gold, silver, etc.). Rates should be adjusted on a frequent basis. Estimate growth versus storage cost declines. Billing is an effective way to create accountability.
    12. Space Capacity – Management Support: Effective storage management happens with a bridge to business results, and building that bridge begins with a solid foundation. Show business value so that it is self-evident.
    13. Space Capacity – Business View: With management backing, important processes can be implemented. Business / IT • Capacity budgeting and inventory management • Mandatory storage request process • Storage mapping to determine ownership • Chargeback of some form • Define executive reporting requirements. Once the bridge is built, reporting information can flow freely.
    14. Space Capacity – Who is Responsible: Managing storage capacity requires work. Storage administrators typically have limited time and higher priorities in their complex environments.
    15. Space Capacity – Over and Under Provisioning: Administrators may have no choice but to overallocate, which results in low utilization. It is important to define exactly what ‘utilization’ is for your storage. Many factors determine what ‘right sized’ means for each system. But running out of space means only one thing to everyone.
    16. Space Capacity – Doing the Technical Work: After roles and responsibilities are assigned and business requirements are complete, technical solutions can be implemented to optimize storage space management, including databases. Trending, forecasting, and exceptions.
    17. Space Capacity – Different Viewpoints: Business, Application, Host, Storage Array, Billing Tier. If billing for storage, ensure transparency with detailed reports.
    18. Space Capacity – Virtual Environments and Clusters: Managing storage in a clustered and/or virtual environment can be challenging because it is shared among all hosts and virtual machines running on it. • Manage capacity at a high level • Account for storage use at a low level, e.g. VM or DB • If billing, be cautious of different tiers being allocated to the same cluster • Don’t forget about overhead. Overcommit with thin provisioning.
    19. Space Capacity – Storage Virtualization: Pooling physical storage from multiple sources into logical groupings • Simplifies administration • Can be a centralized source for collecting data • If using it as a data source, beware of double counting with the back end • Don’t forget about overhead for replication. There is a wide variety of techniques for virtualizing storage; be aware of the implications for data collection and reporting.
    20. Space Capacity – Best Practices: Find dark and hidden storage, where it has been allocated and never used, or plugged into a different box. Use thin provisioning and de-duplication where possible. Include data retention policies for storage space management. Account for overhead from RAID, replication, file systems, etc. Understand the value of data in deciding where to put it, how to protect it, and how long to keep it.
    21. Space Capacity – Best Practices: Understand the limitations of linear regression when trending and forecasting data. Use statistics like R^2 to confirm. Be sure to account for all variables when ‘right sizing’! Include directory and file level reporting for file servers if possible.
    22. Performance Capacity – Response Impacts: SAN or storage array performance problems can have serious impacts over a long duration, and be difficult to identify.
    23. Performance Capacity – Metrics: Understand the limitations of certain metrics • Measured response is the best metric for identifying trouble. • Host utilization only shows busy time; it doesn’t give capacity for the SAN. • Physical IOPS is an important measure of throughput; all disks have their limits. • Queue length is a good indicator that a limit has been reached somewhere.
    24. Performance Capacity – Metric Thresholds: Many times critical host disk metrics are not breached during impactful events. Consider using Statistical Process Control. Are these potential problems having a real impact?
    25. Performance Capacity – Metric Thresholds (Host): Other times certain metrics like utilization indicate impactful events, but ample capacity is still available.
    26. Performance Capacity – Metric Thresholds (Host): Queue lengths from the previous utilization indicate that it may not currently be impacting response, but headroom is unknown.
    27. Performance Capacity – Metric Thresholds (Host): The high utilization can be seen generating large amounts of I/O in this chart.
    28. Performance Capacity – Architecture (Array) • Front End Processors • Shared Cache • Back End Processors • Disk Storage
    29. Performance Capacity – Metric Thresholds (Array): Front end processors are typically the first to bottleneck.
    30. Performance Capacity – Metric Thresholds (Array): Impact of utilization on response for a single processor. Curves based on simple queuing with normal distribution.
    31. Performance Capacity – Component Breakdown: Service time versus response time – different metrics.
    32. Performance Capacity – Workload Profiles: I/O profile has a big impact on performance. Be sure to include it when comparing applications. Test with tools like Iometer, IOzone, Bonnie, etc.
    33. Performance Capacity – Best Practices
    34. Performance Capacity – Best Practices: Trending, forecasting, and exceptions with Athene.
    35. Performance Capacity – Best Practices • Choose service levels and establish baselines. • Use available data sources, vendor utilities, etc. • Consolidate reporting tools and data (Athene).
    36. Storage Capacity – Final Thoughts • Talk with the storage team about the current state of reporting and fill in the gaps. • Fabric and network utilization might be in scope. • Set priorities for where to spend time and effort. • Simplify where possible. • Work to establish formal naming conventions where needed. • Tools won’t help without knowledge, experience, and commitment.
    37. Storage Capacity – Thank you for attending. Capacity Management for SAN Attached Storage, Dale Feiste, Metron-Athene Inc.
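
Slide 21 recommends checking a statistic such as R^2 before trusting a linear trend. A minimal sketch of that check follows, using ordinary least squares in plain Python. The function names, the sample figures, and the idea of reporting "days until full" are illustrative assumptions and do not describe how Athene computes its forecasts.

```python
from statistics import mean


def linear_trend(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared). xs might be day numbers and
    ys the used space in GB sampled on those days.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired samples")
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # A perfectly flat series has no variance to explain; treat it as a perfect fit.
    r_squared = 1.0 - ss_res / ss_tot if ss_tot else 1.0
    return slope, intercept, r_squared


def days_until_full(slope, intercept, capacity_gb, today):
    """Days from `today` until the fitted line reaches capacity_gb.

    Returns None when usage is flat or shrinking, since the line never
    reaches capacity in that case.
    """
    if slope <= 0:
        return None
    return (capacity_gb - intercept) / slope - today


if __name__ == "__main__":
    days = [0, 7, 14, 21, 28]            # hypothetical sample days
    used_gb = [410, 422, 431, 445, 452]  # hypothetical used space
    slope, intercept, r2 = linear_trend(days, used_gb)
    print(f"growth {slope:.2f} GB/day, R^2 {r2:.3f}")
    if r2 < 0.8:  # arbitrary cut-off chosen for illustration
        print("weak linear fit; treat the forecast with caution")
    print("days until 500 GB:", days_until_full(slope, intercept, 500, days[-1]))
```

A low R^2 suggests the growth is not well described by a straight line (step changes from new projects or acquisitions, for example), in which case a linear forecast is a poor basis for right-sizing.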