Capacity Planning for Itanium
  Paul O’Sullivan and Prem S. Sinha, PhD
  PerfCap Corporation, 76-39A Northeastern Blvd., Nashua, NH 03062
  www.PerfCap.com; Info@PerfCap.com; 603-594-0222
PerfCap Corporation
  Group started within Digital/Compaq (now HP) over 21 years ago
  Operating as an independent corporation since 2001
  Privately owned, zero debt
  Currently focused on performance monitoring, capacity planning, and asset management
  20+ years of solid engineering and development
  Worldwide presence: HP and other resellers continue to sell it worldwide
  Partnerships: HP, IBM, Sun
  Microsoft Certified Partner
Some Current Customers
  Barclays UK, Commerzbank, Deutsche Bank UK, SIAC, Mary Kay, Certegy, Analog Devices, Royal Bank of Scotland
  BNP Paribas (3rd largest retail bank in Europe): Enterprise License, unlimited use (3000+ deployed)
  ISE (largest options stock exchange): Enterprise License, unlimited use
  US Postal Services: monitoring 450 nodes
  Thomson Reuters: up to 45,000+
  International Papers, Vodafone, British Telecom, MDS Pharmacy, Pfizer, Qwest, Lockheed Martin, Caremark, Swedish Customs, Netherlands Army, CNS Dubai, UPMC Medical Center, UIC Medical Center, University Hospital, Zurich, US Dept. of Education, SUNY Buffalo Univ.
Capacity Planning Endorsement
  Adrian Cockcroft, winner of the A.A. Michelson lifetime achievement award at CMG 2007, wrote in his personal blog:
  “The most interesting commercial tool I saw at CMG earlier this month is a capacity monitoring tool called PAWZ from PerfCap Corporation. The key thing they have worked on is taking the human out of the loop as much as possible with sophisticated capacity modeling algorithms and a simple and scalable operational model. ... The core idea is that you care about "headroom" in a service, and anything that limits that headroom is taken into account. Running out of CPU power, network bandwidth, memory, threads etc. will increase response time of the service, so monitor them all, track trends in headroom and calculate the point in time where lack of headroom will impact service response time. At eBay we used to call this the "time to live" for a service. You can easily focus on the services that have the shortest time to live, and proactively make sure that you have a low probability of poor response time.”
Challenges
  Do more with less
  Large number of geographically dispersed resources
  Multi-platform
  Automate the process on a daily basis:
    Collect data
    Consolidate/analyze data
    Generate performance and capacity reports
    Send “need-to-know” exception notifications
  Information availability: anytime, anywhere web access
Data Management: Hierarchical Approach
  [Diagram: data levels (Raw Data, Key Performance Data, Risk Data) and their audiences (Performance Analysts, Capacity Planners)]
[Architecture diagram: a desktop browser on the intranet connects to the PAWZ / FindIT server (NT/W2K)]
  Report areas: Networks, Storage, Events, Trending, Clusters, Real Time, Applications, Performance Reports (daily, weekly), Health Reports, Critical Systems, Asset Location, Change Report, Configuration, Asset Reports
  Monitored platforms: Windows NT/2000/XP, Sun Solaris, HP-UX, IBM AIX, OpenVMS Cluster, Linux, Tru64 UNIX
PAWZ Components
  PAWZ Agent/Monitor:
    Resides on each node to be monitored
    Collects performance data 24x7
    Sends collected data to the PAWZ Server in real time and/or once a day
  PAWZ Server:
    Resides on a Windows-based server and communicates with hundreds of PAWZ Agents
    Receives data from the PAWZ Agents
    Processes the data and produces real-time, daily, and historical charts and reports
    Produces trend graphs for simple projections
    Runs a queuing network modeler for capacity planning
  PAWZ Browser:
    Resides on any corporate desktop
    Shows all reports and charts within an Internet browser
    Manages most PAWZ functions
PAWZ Key Functionality
  Collects performance data 24x7
  Provides real-time and daily alerts based on performance thresholds
  Provides performance reports:
    Real time
    Daily
    Historical (for trending)
  Performs saturation analysis every day for each node, for capacity planning
  Performs risk analysis to detect systems that could be at risk
  Provides a consolidated data center configuration report
Capacity Planning
  Definition: a process to determine how much computing resource is required to meet business growth, or how much the business can grow before some device runs out of capacity
  It answers “what if” questions such as:
    Can my current configuration handle three times the current workload, and when will it saturate?
    What will be the impact of a new application on current system performance?
    What will be the impact of upgrading a current server or adding a new server?
    Can I reduce the number of servers without violating my service level agreement (a.k.a. server consolidation)?
Sizing Methods
  [Chart: sizing methods plotted against cost and accuracy]
  Rules of thumb, linear projections, analytic models, simulation models, benchmarks, real system
Capacity Planning via Trending
  [Chart: performance metric (average or peak CPU utilization) vs. time, January to December, marking Today, the Capacity Limit, and the Remaining Capacity]
  Simple to produce and follow
  Issues:
    Defining the right capacity limit
    Single vs. composite metric
    End-user satisfaction
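The trending approach on this slide can be sketched as a least-squares line fitted to periodic utilization samples and extended until it meets the capacity limit. This is a minimal illustration, not PAWZ's own algorithm; the function names, the monthly sampling interval, and the sample figures are assumptions made for the example.

```python
def fit_line(samples):
    """Least-squares fit of samples taken at x = 0, 1, 2, ...; returns (slope, intercept)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def months_until_limit(monthly_utilization, capacity_limit):
    """Months from the latest sample until the fitted trend reaches the limit.

    Returns None when the trend is flat or falling (the limit is never reached).
    """
    slope, intercept = fit_line(monthly_utilization)
    if slope <= 0:
        return None
    crossing = (capacity_limit - intercept) / slope
    latest = len(monthly_utilization) - 1
    return max(0.0, crossing - latest)


# Hypothetical monthly peak CPU utilization (%), with an assumed 80% capacity limit.
remaining = months_until_limit([50.0, 52.0, 54.0, 56.0], 80.0)
print(f"Remaining capacity lasts about {remaining:.1f} months")
```

The issues listed on the slide apply directly to a sketch like this: the result depends entirely on the chosen limit and on which single metric is trended.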
PAWZ Planner: Where do you want to operate?
  Response Time = Service Time + Queuing Time
  [Chart: response time vs. workload, marking the Current Workload, the Saturation Point, and the Headroom between them]
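The relationship between workload, response time, and headroom can be illustrated with the simplest queuing model, a single M/M/1 queue, where response time is S / (1 - U) for service time S and utilization U. The slide does not say which model PAWZ Planner uses, so the M/M/1 choice, the function names, and the definition of the saturation point as the workload at which response time reaches a chosen maximum are all assumptions for this sketch.

```python
def response_time(service_time, arrival_rate):
    """M/M/1 response time: service time plus queuing time."""
    utilization = service_time * arrival_rate
    if utilization >= 1.0:
        return float("inf")  # the queue grows without bound
    return service_time / (1.0 - utilization)


def saturation_rate(service_time, max_response_time):
    """Arrival rate at which response time reaches the acceptable maximum.

    Solves S / (1 - rate * S) = R_max for rate.
    """
    if max_response_time <= service_time:
        raise ValueError("max_response_time must exceed the service time")
    return 1.0 / service_time - 1.0 / max_response_time


def headroom(service_time, arrival_rate, max_response_time):
    """Fraction of the saturation workload still unused (0.0 when saturated)."""
    limit = saturation_rate(service_time, max_response_time)
    return max(0.0, 1.0 - arrival_rate / limit)


# Hypothetical figures: 0.1 s service time, 5 requests/s, 0.5 s acceptable response time.
print(response_time(0.1, 5.0))
print(headroom(0.1, 5.0, 0.5))
```

A production modeler would treat each device (CPU, disks, network) as its own queue; the single queue here only shows why response time climbs steeply near the saturation point.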
Capacity Planning via Modeling
  Steps:
    Data collection
    Identifying peak interval(s)
    Workload characterization
    Model validation
    Saturation analysis
    “What if” analysis
PAWZ Planner
Remaining Headroom (Capacity) Trend
Headroom Risk Analysis
  [Chart: headroom vs. time, marking the current state, the headroom threshold, the point where headroom crosses the threshold, the point where headroom reaches 0, and the lead time before each]
  Amber status: system is within lead time of dropping below the headroom threshold.
  Red status: system is within lead time of exhausting capacity.
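The amber and red rules above can be sketched as a small classifier. The sketch assumes that headroom declines linearly at a known rate per day, and it adds a "green" status for systems that trigger neither rule; the linear decline, the "green" label, and all names and figures are assumptions, not part of the slide.

```python
import math


def days_until(level_now, target, decline_per_day):
    """Days until a linearly declining level reaches the target level."""
    if level_now <= target:
        return 0.0
    if decline_per_day <= 0:
        return math.inf  # headroom is flat or growing
    return (level_now - target) / decline_per_day


def risk_status(headroom_pct, decline_per_day, threshold_pct, lead_time_days):
    """Classify a system as 'red', 'amber', or 'green' (assumed label)."""
    # Red: capacity would be exhausted within the lead time.
    if days_until(headroom_pct, 0.0, decline_per_day) <= lead_time_days:
        return "red"
    # Amber: headroom would drop below the threshold within the lead time.
    if days_until(headroom_pct, threshold_pct, decline_per_day) <= lead_time_days:
        return "amber"
    return "green"


# Hypothetical system: 30% headroom, losing 0.5 points per day,
# with a 20% headroom threshold and a 30-day lead time.
print(risk_status(30.0, 0.5, 20.0, 30.0))
```

The lead time stands for how long it takes to act on a warning, for example to procure and install new hardware, which is why both rules compare against it rather than against zero.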
Risk Analysis
Risk Analysis
Risk Analysis
“What if”
  PAWZ Planner has a “what-if” capacity planning module to forecast:
    How much the business can grow before some device runs out of capacity
    Whether the current configuration can handle three times the current workload, and when it will saturate
    The impact of a new application on current system performance
    The impact of upgrading a current server or adding a new server
“What if”: CPU & Disk Upgrade
  [Charts: before and after the upgrade]
Itanium Capacity Study
  Typical study
  Capability to model a move from any platform to any other platform (e.g., Alpha to Integrity)
  Hardware:
    Customer on an Integrity server cluster with HP-UX and Oracle
    rx8620 (4/4/16), 64 GB memory
    SAN
Itanium Capacity Study
  Alternate models considered:
    rx8640, 32 core
    p 570, 32 core
    M8000, 32 core
  3- or 4-node cluster considered
Itanium Capacity Study
  Reason for study:
    Expected substantial application growth
    System already peaking at CPU
    What alternate configurations would provide adequate growth of at least 200% of the current workload?
    HP and non-HP configurations considered
Itanium Capacity Study
CPU by Image / Disk I/O Rate
CPU by Core
Memory vs Process Count
Total IO Counts
IO Rates
Disk Response Time
Performance Data from Benchmark
  CPU utilization: 86.3%
  Disk I/O rate: 1514/s
  Hard page fault rate: 1.2/s
  Memory utilization: 73%
Current Response Time Curve
Where should your system live?
Headroom - Current System
  At peak sustained load, 9% headroom
  CPU is the primary resource bottleneck
  Possible solutions:
    Horizontal scaling
    Integrity upgrade
    Alternate hardware platform
Configuration Alternatives (3 or 4 nodes)
  HP rx8620 (1.1 GHz, Itanium 2): current configuration
  HP rx8640 (1.6 GHz, 24MB L3 cache), 16 core
  HP rx8640 (1.6 GHz, 25MB L3 cache), 32 core
  IBM p 570 (2.2 GHz, Power 5), 16 core
  IBM p 570 (2.2 GHz, Power 5), 32 core
  IBM p 570 (4.7 GHz, Power 6), 16 core
  Sun SPARC Enterprise M8000 (2.4 GHz), 16 core
  Sun SPARC Enterprise M8000 (2.4 GHz), 32 core
  Requirement: configuration must support 200% workload growth
Response Time vs. Workload Growth: 3-node RAC
Response Time vs. Workload Growth: 4-node RAC
Projection Conclusions
  CPU is the constraining resource
  Memory and disk will support 200% growth
  3 platforms support the growth, in these configurations:
    HP rx8640 (1.6 GHz, 25MB L3 cache), 32 core
    IBM p 570 (2.2 GHz, Power 5), 32 core
    IBM p 570 (4.7 GHz, Power 6), 16 core
    Sun SPARC Enterprise M8000 (2.4 GHz), 32 core
  Horizontal scaling to 4 nodes will not change the qualifying platforms. However, core counts may be adjusted.
Minimal Cores, 3-node RAC
Minimal Cores, 4-node RAC
Mixing 1.1 GHz and 1.6 GHz Itanium Cores
Minimal Number of Cores per Node Supporting 200% Growth
  Platform                                3-node   4-node
  Sun SPARC Enterprise M8000 (2.4 GHz)        32       24
  HP rx8640 (1.6 GHz, 25MB L3 cache)          30       24
  IBM p 570 (2.2 GHz, Power 5)                26       20
  IBM p 570 (4.7 GHz, Power 6)                12       10
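Because the table gives cores per node, comparing the 3-node and 4-node options on total core count takes one more multiplication. The sketch below does that arithmetic on the figures from the table, assuming every node in a cluster carries the same number of cores; the dictionary and function names are made up for the example.

```python
# Minimal cores per node from the table, keyed by cluster size (number of nodes).
MIN_CORES_PER_NODE = {
    "Sun SPARC Enterprise M8000 (2.4 GHz)": {3: 32, 4: 24},
    "HP rx8640 (1.6 GHz, 25MB L3 cache)": {3: 30, 4: 24},
    "IBM p 570 (2.2 GHz, Power 5)": {3: 26, 4: 20},
    "IBM p 570 (4.7 GHz, Power 6)": {3: 12, 4: 10},
}


def total_cores(per_node_by_cluster_size):
    """Total cores in the cluster, assuming identical nodes."""
    return {nodes: nodes * cores for nodes, cores in per_node_by_cluster_size.items()}


for platform, per_node in MIN_CORES_PER_NODE.items():
    totals = total_cores(per_node)
    print(f"{platform}: {totals[3]} cores on 3 nodes, {totals[4]} cores on 4 nodes")
```

For the HP rx8640 row, for instance, this gives 90 cores in total on three nodes and 96 on four, so adding a node lowers the per-node requirement without lowering the cluster-wide core count.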
Itanium Capacity Study
  Customer satisfied:
    Had options
    Could reduce Oracle cost by reducing the number of cores
    Forecast from real data
    Could approach vendors with confidence
  Today, 90% of this study is automated via PAWZ:
    Same graphs
    Same results
Modelling Capability
  Hardware:
    Alpha to Integrity
    Integrity to new models and beyond
    Other vendors to Integrity
  Software:
    Increases in workload
    Optimization
    Decreases in workload
    Virtualization
Summary
  PerfCap offers integrated performance management and capacity planning software that is:
    Out-of-the-box (no scripting required)
    Fully automated
    Multi-platform
    Web based
    Highly scalable
  Pricing is independent of the number and class of CPUs in a server
More Information
  Sales: [email_address]
  Web site: www.PerfCap.com
  Hot line: 603-594-0222

Hp Connect 10 06 08 V5
