Relative Capacity by Eduardo Oliveira and Joseph Temple


  • IBM Systems for an On Demand Business – Agenda chart. This presentation is intended to highlight three key areas: 1) Why is innovation in business important? Innovation is the most likely way for businesses to address the challenges they face today: to grow while maintaining or better managing costs. 2) Second, we believe that IBM Systems can help you innovate through technologies, people, and solutions that ignite innovation through business and technology integration, using technology as an innovation catalyst by combining it with business and market insights; in other words, by becoming an On Demand Business. 3) So before we go further, let's touch on why you may want to become an On Demand Business (or continue your current path toward becoming one). At IBM, we believe an On Demand Business drives innovation more effectively. Why? Because an On Demand Business is dynamically responsive to customer demands, market opportunities, and external threats. The real-time exchange of ideas, insights, and experience is critical. The objective is to achieve growth and profit, not necessarily by aggressively cutting costs, but through innovations that improve product or service delivery, allow entry into new markets, and increase productivity. Transition line: many executives today believe that for a company to grow revenue it must innovate, to do new things that drive different results.
  • Large machines are required for parallel hell and parallel purgatory. Blades, Rack optimized clusters, and MPPs work well in parallel nirvana. Distributed Client Server build out is in the center of the chart.

    1. 1. Relative Capacity Joseph Temple – Distinguished Engineer Eduardo Oliveira – Executive IT Specialist © Copyright IBM Corporation, 2008
    2. 2. Trademarks The following are trademarks of the International Business Machines Corporation in the United States, other countries, or both. The following are trademarks or registered trademarks of other companies. * All other products may be trademarks or registered trademarks of their respective companies. Notes : Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here. IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply. All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions. This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area. All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. 
IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography. Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries. Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom. Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. UNIX is a registered trademark of The Open Group in the United States and other countries. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office. IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency, which is now part of the Office of Government Commerce. 
For a complete list of IBM Trademarks, see *, AS/400® , e business(logo)® , DBE, ESCO, eServer, FICON, IBM® , IBM (logo)®, iSeries® , MVS, OS/390® , pSeries® , RS/6000® , S/30, VM/ESA® , VSE/ESA, WebSphere® , xSeries® , z/OS® , zSeries® , z/VM® , System i, System i5, System p, System p5, System x, System z, System z9® , BladeCenter® Not all common law marks used by IBM are listed on this page. Failure of a mark to appear does not mean that IBM does not use the mark nor does it mean that the product is not actively marketed or is not significant within its relevant market. Those trademarks followed by ® are registered trademarks of IBM in the United States; all others are trademarks or common law marks of IBM in the United States.
    3. 3. Agenda <ul><li>General Ideas </li></ul><ul><li>Pfister’s Diagram </li></ul><ul><li>Workload Type/Characteristics </li></ul><ul><li>Workload Factor </li></ul><ul><li>Benchmarks </li></ul><ul><li>White Space </li></ul><ul><li>Virtualization and Utilization </li></ul><ul><li>Relative Capacity </li></ul><ul><li>Ideas International/Performance Comparison </li></ul><ul><li>Non-Functional requirements </li></ul><ul><li>Total Cost of Ownership (TCO) </li></ul><ul><li>Consolidation: Case Study </li></ul><ul><li>Platform Choices </li></ul>
    4. 4. General Ideas <ul><li>It is the nature of computers that any computer can be programmed to accomplish any task. </li></ul><ul><li>However, functionality alone is not enough. </li></ul><ul><li>Rational platform selection is based on the solution’s “non-functional requirements” </li></ul>
    5. 5. Simplifying the Client Server Build Out (chart adapted from: In Search of Clusters, The ongoing battle in lowly parallel computing by Greg Pfister, p. 461). The chart plots synchronization time against bulk data traffic and groups platforms into four regions: <ul><li>Shared Nothing, High Latency: Blades, Clusters, Squadrons HV (read-only web serving, some DSS) </li></ul><ul><li>Shared Memory, Low-to-Medium Latency: F6800, rx8400, rp8400, P670, Squadrons ML (OLTP, legacy SMP) </li></ul><ul><li>Shared Memory, High-to-Medium Latency: F12000, F15000, SuperDome, P690 (data warehouse, some DSS) </li></ul><ul><li>Shared Everything, Low Latency: zSeries, Squadrons HE (OLTP, mixed workload) </li></ul>An “architectural divide” separates blades and midrange client-server machines (optimized for price/performance) from high-end UNIX servers and mainframes (optimized for total capacity, WLM & BI function, and virtualization). The distributed client-server build out sits in the center of the chart.
    6. 7. Workload Characterization (workload types, numbered roughly from I/O driven to CPU driven): 1. Data Intensive – large working set and/or high I/O content applications. 2. I/O Bound – e.g. high I/O content applications. 3. Mixed Low – e.g. multiple, data-intense applications or skewed OLTP, MQ. 4. Mixed High – e.g. multiple, cpu-intense simple applications. 5. Database – e.g. Oracle DBMS or dynamic HTTP server. 6. Java Light – e.g. data intensive java applications. 7. Java Heavy – e.g. cpu intensive java applications. 8. Skewless OLTP – e.g. simple and predictable transaction processing. 9. Protocol Serving – e.g. static HTTP, firewall, etc. 10. CPU Intensive – e.g. numerically intensive, etc.
    7. 8. Industry Benchmarks: TPC-C, TPC-E?? The positioning of “parallel hell” on the chart is empirical and folklore-driven.
    8. 9. Workload Considerations <ul><li>Workloads may benefit from being physically close to the data </li></ul><ul><li>Most servers run in low to medium utilization </li></ul><ul><li>High I/O workloads Vs. High CPU workloads </li></ul><ul><li>Workloads may require high availability </li></ul><ul><ul><li>Mission critical applications </li></ul></ul>
    9. 10. Comparing servers using relative capacity: Given system B with capacity C_B processing a workload at utilization U_B, the capacity C_A needed by system A to process the same workload is computed from C_B, U_B, and the Workload Factor (WLF). With WLF we try to compensate for all the architectural differences between system A and system B. This is a simplification: in reality, WLF = f(U_B).
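The slide's sizing relation can be sketched in code. The multiplicative form below (capacity needed = source capacity × source utilization × workload factor) is an assumption consistent with the definitions above, and the function name is mine, not from the deck:

```python
def capacity_needed(c_b: float, u_b: float, wlf: float) -> float:
    """Capacity C_A that system A needs in order to absorb the workload
    currently running on system B.

    c_b : capacity of system B (any metric: MIPS, tpmC, RPE2, ...)
    u_b : measured utilization of system B, as a fraction (0.0 - 1.0)
    wlf : workload factor converting B's capacity units into A's;
          as the deck notes, WLF is really a function of u_b.
    """
    return c_b * u_b * wlf

# A 4000-unit box running at 30% utilization, with a WLF of 0.5,
# needs only 600 units of system A for the same work.
capacity_on_a = capacity_needed(4000, 0.3, 0.5)
```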
    10. 11. Parallel vs. Serial (diagram: each processor node in both designs is characterized by processor speed, cache, and RAS).
    11. 12. Industry Standard Benchmarks <ul><li>System Level </li></ul><ul><li>Characteristics </li></ul><ul><li>No disk I/O </li></ul><ul><li>No network I/O </li></ul><ul><li>No database </li></ul><ul><li>Data sharing </li></ul><ul><li>Global memory </li></ul><ul><li>Interconnect important </li></ul><ul><li>Examples </li></ul><ul><li>SPEC OMPL2001 </li></ul><ul><li>SPEC OMPM2001 </li></ul><ul><li>Component </li></ul><ul><li>Characteristics </li></ul><ul><li>No disk I/O </li></ul><ul><li>No network I/O </li></ul><ul><li>No database </li></ul><ul><li>No data sharing </li></ul><ul><li>Cache / local memory </li></ul><ul><li>Scales w/QTY of cores </li></ul><ul><li>Examples </li></ul><ul><li>SPECint2000 </li></ul><ul><li>SPECint_rate2000/2006 </li></ul><ul><li>SPECfp2000/2006 </li></ul><ul><li>SPECfp_rate2000 </li></ul><ul><li>SPECjbb2000/2005 </li></ul><ul><li>Specialty </li></ul><ul><li>Characteristics </li></ul><ul><li>Disk I/O </li></ul><ul><li>Network I/O </li></ul><ul><li>No data sharing </li></ul><ul><li>Local memory </li></ul><ul><li>Scales w/QTY of cores </li></ul><ul><li>Examples </li></ul><ul><li>SPECweb2005 </li></ul><ul><li>NotesBench </li></ul><ul><li>SPECjAppServer 2004 </li></ul><ul><li>SPECsfs97_R1.v3 </li></ul><ul><li>Database Examples </li></ul><ul><li>TPC/H (Read Only) </li></ul><ul><li>SAP SD 2-Tier (Limited I/O) </li></ul><ul><li>System Level Database </li></ul><ul><li>Characteristics </li></ul><ul><li>Disk I/O </li></ul><ul><li>Network (except batch) </li></ul><ul><li>Global memory </li></ul><ul><li>Data sharing </li></ul><ul><li>Read/Write database </li></ul><ul><li>Examples </li></ul><ul><li>TPC/C </li></ul><ul><li>SAP SD 3-Tier </li></ul><ul><li>Oracle Apps </li></ul><ul><li>Oracle RAC </li></ul><ul><li>Oracle Batch </li></ul><ul><li>PeopleSoft Financials </li></ul>(Diagram: for each benchmark class, blocks of CPU + cache with memory and I/O joined by a local interconnect; the system-level classes add a global interconnect spanning the blocks.)
    12. 13. System Design and Benchmarks <ul><li>Industry standard benchmarks are used by vendors to establish performance or price/performance leadership. </li></ul><ul><ul><li>Benchmarks are chosen to show off a platform, not to allow comparisons </li></ul></ul><ul><li>Benchmarks frequently do not match real customer workloads </li></ul><ul><ul><li>Small – very limited stress on data delivery infrastructure </li></ul></ul><ul><ul><li>Throughput oriented with highly parallelized processing – scales with quantity of processors for all vendors </li></ul></ul><ul><li>Real workloads (including virtualization and mixed workloads) place tremendous stress on cache and system interconnect. </li></ul>(Chart: benchmarks plotted by workload/server size against data sharing/workload complexity. Small benchmarks (SPECintRate, SPECfpRate, SPECjbb2005, SPECweb, TPC-H, SAP SD 2-Tier) stress on-chip cache and a few threads (1-2 sockets), and their throughput scales with the quantity of cores/threads for any vendor; TPC-C, SAP SD 3-Tier, virtualization, and mixed workloads also stress the system interconnect, cache architecture, schedulers, and number of processors.)
    13. 14. 'White space' = wasted capacity (chart contrasting shared systems with separate dedicated systems).
    14. 15. Peak and Average <ul><li>The desired peak utilization is the utilization-at-saturation design point, Usd </li></ul><ul><li>Average utilization is given by an expression in terms of: </li></ul>Where: P is the Peak Load, A is the Average Load, and s is the number of servers used to implement the capacity
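The equation itself did not survive extraction. A dimensionally consistent reconstruction, assuming the s servers are collectively sized so that the peak load P fits at the saturation design point Usd (this is my reconstruction, not the original slide's rendering), is:

```latex
U_{avg} \;=\; \frac{A}{\,s \cdot P / U_{sd}\,} \;=\; \frac{U_{sd} \cdot A}{s \cdot P}
```

That is, total installed capacity is $s \cdot P / U_{sd}$, and the average load $A$ divided by that capacity gives the average utilization; splitting the same capacity across more dedicated servers ($s$ larger) drives average utilization down.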
    15. 16. Virtualization enables higher CPU Utilization <ul><li>Single workload model assumptions: </li></ul><ul><ul><li>Average Utilization: 20.7% </li></ul></ul><ul><ul><li>Peak: 79% </li></ul></ul><ul><li>As more copies of this workload are added, average utilization approaches peak </li></ul><ul><ul><li>8:1 39% Average, peak 76% </li></ul></ul><ul><ul><li>16:1 48% Average, peak 78% </li></ul></ul><ul><ul><li>64:1 61% Average, peak 78% </li></ul></ul><ul><li>As workload is added the number of CPUs required for the work grows at a much lower rate. </li></ul>
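The effect described above is statistical multiplexing: independent bursty workloads rarely peak together, so the aggregate peak grows more slowly than the aggregate average. A small simulation illustrates it; the burst distribution (exponential, mean 0.21 of one CPU, capped at 1.0) is my assumption for illustration, not data from the study:

```python
import random
import statistics

random.seed(1)  # deterministic run for reproducibility

def avg_over_peak(n_workloads: int, intervals: int = 2000) -> float:
    """Ratio of average to peak aggregate load when n_workloads bursty
    workloads share one server sized for the observed peak."""
    totals = []
    for _ in range(intervals):
        # Each workload's load in this interval: exponential bursts,
        # mean ~0.21 of a single CPU, capped at one CPU.
        total = sum(min(random.expovariate(1 / 0.21), 1.0)
                    for _ in range(n_workloads))
        totals.append(total)
    return statistics.mean(totals) / max(totals)

# Pooling narrows the peak-to-average gap: the shared server runs at a
# higher average utilization for the same per-workload headroom.
assert avg_over_peak(1) < avg_over_peak(16) < 1.0
```

The same mechanism explains the slide's numbers: as more copies of the workload are consolidated, average utilization climbs toward the (roughly constant) peak design point.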
    16. 17. Why Larger Servers for Virtualization? <ul><li>Hardware Advantages </li></ul><ul><ul><li>Higher utilization due to shared headroom. </li></ul></ul><ul><ul><li>More internal bandwidth to improve performance </li></ul></ul><ul><ul><li>Fewer disk & network adapters and ports </li></ul></ul><ul><ul><li>Able to share memory more effectively </li></ul></ul><ul><ul><li>More fault tolerant features </li></ul></ul><ul><li>People Advantages </li></ul><ul><ul><li>Fewer servers to order, install, track, maintain, and retire </li></ul></ul><ul><ul><li>Fewer Hypervisor instances to manage </li></ul></ul><ul><ul><li>Fewer firmware patches to apply </li></ul></ul><ul><li>Data Center Advantages </li></ul><ul><ul><li>Better power utilization </li></ul></ul><ul><ul><li>Reduced floor space </li></ul></ul>Hardware Capacity Usable VM Capacity Smaller Servers Medium Peak to Avg. Utilization Gap Larger Servers Small to Very Small Peak to Avg. Utilization Gap
    17. 18. Relative Capacity Criteria <ul><li>What is the number and utilization of servers? </li></ul><ul><ul><li>How big is this? What is the potential for virtualization or workload management? Need to profile utilization by intervals. </li></ul></ul><ul><li>How Parallel is the work? </li></ul><ul><ul><li>Read only partitioned data, mostly read minimum sharing, mostly read shared data, read/write partitioned data, read/write shared data, etc. (Leverage of shared v. partitioned resources) </li></ul></ul><ul><li>How large are the working sets for the DB, Memory, and Cache? </li></ul><ul><ul><li>What is the &quot;Workload Factor?&quot; Need throughput v. utilization by intervals </li></ul></ul><ul><li>What are the testing and QA practices? </li></ul><ul><ul><li>How much nonproduction hardware is there as a result? </li></ul></ul>
    18. 19. Relative capacity (cont’d) <ul><li>Capacity Metric can vary (MIPS, MHz, tpm, tpc-c, n° engines, ...) </li></ul><ul><li>Utilization (%) can be measured with various tools (vmstat, top, Task Manager, ...) </li></ul><ul><li>WLF is measured in [ C B ]/[ C A ] units </li></ul><ul><ul><li>i.e., LSPR values are MIPS/MIPS workload factors for two zSeries machines at same utilization </li></ul></ul><ul><li>WLF difficult to measure! </li></ul><ul><ul><li>not enough benchmarks to cover all the cases </li></ul></ul><ul><ul><li>driven by cache miss rate , which cannot be directly measured </li></ul></ul><ul><ul><li>“ Cloud of uncertainty” around measured values </li></ul></ul>
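Since WLF cannot be measured directly, one way to handle the "cloud of uncertainty" is to carry it as a range rather than a point estimate. The sketch below is a hypothetical illustration (function name and numbers are mine, not a published tool):

```python
def capacity_range(c_b: float, u_b: float,
                   wlf_low: float, wlf_high: float) -> tuple:
    """Bound the capacity C_A needed on system A when the workload
    factor is only known to lie in [wlf_low, wlf_high]."""
    base = c_b * u_b  # utilized capacity on system B
    return base * wlf_low, base * wlf_high

# A 4000-unit box at 30% utilization with WLF somewhere in [0.4, 0.9]:
low, high = capacity_range(4000, 0.3, 0.4, 0.9)
# Prudent sizing plans for the pessimistic end of the range (high).
```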
    19. 20. Ideas International <ul><li>Independent Organization </li></ul><ul><li>Publishes the Consolidated Analysis Report (CAR) </li></ul><ul><li>CAR contains the RPE2 performance index </li></ul><ul><li>The RPE2 index can be used in cross-platform selection </li></ul><ul><li>Requires a License </li></ul><ul><li>RPE2 = Relative Performance Estimate 2 ( Copyright © 2008 - Ideas International Limited) </li></ul>
    20. 21. Performance Comparison <ul><li>Server Utilization </li></ul><ul><li>Workload Type </li></ul><ul><li>I/O Latency </li></ul><ul><li>Cache </li></ul><ul><li>Clock Speed </li></ul><ul><li>Architecture </li></ul><ul><li>Non-functional requirements </li></ul>
    21. 22. Cost is a quantification of non-functional requirements <ul><li>Costs go way beyond hardware, software, and maintenance </li></ul><ul><li>There are different ways to surface them </li></ul><ul><li>Here is one way to organize them: </li></ul><ul><ul><li>Cost of outages </li></ul></ul><ul><ul><li>Prioritization </li></ul></ul><ul><ul><li>Growth </li></ul></ul><ul><ul><li>Administration Costs </li></ul></ul><ul><ul><li>Time to Market vs. Code Quality </li></ul></ul><ul><ul><li>Project Costs </li></ul></ul><ul><ul><li>Environmental Costs </li></ul></ul>
    22. 23. A full range of TCO factors considerations – often ignored <ul><li>Integration </li></ul><ul><ul><li>Integrated Functionality vs. Functionality to be implemented (possibly with 3rd party tools) </li></ul></ul><ul><ul><li>Balanced System </li></ul></ul><ul><ul><li>Integration of / into Standards </li></ul></ul><ul><li>Further Availability Aspects </li></ul><ul><ul><li>Planned outages </li></ul></ul><ul><ul><li>Unplanned outages </li></ul></ul><ul><ul><li>Automated Take Over </li></ul></ul><ul><ul><li>Uninterrupted Take Over (especially for DB) </li></ul></ul><ul><ul><li>Workload Management across physical borders </li></ul></ul><ul><ul><li>Business continuity </li></ul></ul><ul><ul><li>Availability effects for other applications / projects </li></ul></ul><ul><ul><li>End User Service </li></ul></ul><ul><ul><li>End User Productivity </li></ul></ul><ul><ul><li>Virtualization </li></ul></ul><ul><li>Skills and Resources </li></ul><ul><ul><li>Personnel Education </li></ul></ul><ul><ul><li>Availability of Resources </li></ul></ul><ul><li>Availability </li></ul><ul><ul><li>High availability </li></ul></ul><ul><ul><li>Hours of operation </li></ul></ul><ul><li>Backup / Restore / Site Recovery </li></ul><ul><ul><li>Backup </li></ul></ul><ul><ul><li>Disaster Scenario </li></ul></ul><ul><ul><li>Restore </li></ul></ul><ul><ul><li>Effort for Complete Site Recovery </li></ul></ul><ul><ul><li>SAN effort </li></ul></ul><ul><li>Infrastructure Cost </li></ul><ul><ul><li>Space </li></ul></ul><ul><ul><li>Power </li></ul></ul><ul><ul><li>Network Infrastructure </li></ul></ul><ul><ul><li>Storage Infrastructure </li></ul></ul><ul><li>Additional development and implementation </li></ul><ul><ul><li>Investment for one platform – reproduction for others </li></ul></ul><ul><li>Controlling and Accounting </li></ul><ul><ul><li>Analyzing the systems </li></ul></ul><ul><ul><li>Cost </li></ul></ul><ul><li>Operations Effort </li></ul><ul><ul><li>Monitoring, Operating 
</li></ul></ul><ul><ul><li>Problem Determination </li></ul></ul><ul><ul><li>Server Management Tools </li></ul></ul><ul><ul><li>Integrated Server Management – Enterprise Wide </li></ul></ul><ul><li>Security </li></ul><ul><ul><li>Authentication / Authorization </li></ul></ul><ul><ul><li>User Administration </li></ul></ul><ul><ul><li>Data Security </li></ul></ul><ul><ul><li>Server and OS Security </li></ul></ul><ul><ul><li>RACF vs. other solutions </li></ul></ul><ul><li>Deployment and Support </li></ul><ul><ul><li>System Programming </li></ul></ul><ul><ul><ul><li>Keeping consistent OS and SW Level </li></ul></ul></ul><ul><ul><ul><li>Database Effort </li></ul></ul></ul><ul><ul><li>Middleware </li></ul></ul><ul><ul><ul><li>SW Maintenance </li></ul></ul></ul><ul><ul><ul><li>SW Distribution (across firewall) </li></ul></ul></ul><ul><ul><li>Application </li></ul></ul><ul><ul><ul><li>Technology Upgrade </li></ul></ul></ul><ul><ul><ul><li>System Release change without interrupts </li></ul></ul></ul><ul><li>Operating Concept </li></ul><ul><ul><li>Development of an operating procedure </li></ul></ul><ul><ul><li>Feasibility of the developed procedure </li></ul></ul><ul><ul><li>Automation </li></ul></ul><ul><li>Resource Utilization and Performance </li></ul><ul><ul><li>Mixed Workload / Batch </li></ul></ul><ul><ul><li>Resource Sharing </li></ul></ul><ul><ul><ul><li>shared nothing vs. shared everything </li></ul></ul></ul><ul><ul><li>Parallel Sysplex vs. Other Concepts </li></ul></ul><ul><ul><li>Response Time </li></ul></ul><ul><ul><li>Performance Management </li></ul></ul><ul><ul><li>Peak handling / scalability </li></ul></ul>
    23. 24. Overview of Techline’s case study (diagram: SAR reports from the web servers feed a Server Consolidation Tool, which projects utilization for the consolidated target; the question is whether this projection matches the actual utilization of Apache on zLinux under z/VM, as measured with VMXT and a capacity tool).
    24. 25. Platform Choices (diagram: platform choice weighs legacy, quality of service, total cost, application platform support, and application structure).