
IBM z/OS V2R2 Performance and Availability Topics


Published on

Presentation given at ITSO z Systems 2015, São Paulo, Brazil, from 19 to 22 October 2015. Presentation created by the IBM ITSO technical team.

Published in: Technology

  1. 1. © 2015 IBM CorporationITSO-1 Welcome DAY 2 Performance & Availability
  2. 2. © 2015 IBM CorporationITSO-2 Topics Covered • Software Pricing and You • IBM z13 Performance • IBM z13 SIMD • IBM z13 SMT • IBM z13 Coupling • Erase-on-Scratch Enhancements in z/OS 2.1 • zEnterprise Data Compression (zEDC) • Planned Outage Considerations • Focus of all sections is on price/performance – getting the maximum value from your investment in System z.
  3. 3. © 2015 IBM CorporationITSO-3 Agenda • 09:00 Start • 10:30 – 10:45 Coffee Break • 12:30 – 13:30 Lunch • 14:45 – 15:00 Coffee Break • 17:00 Finish!
  4. 4. © 2015 IBM CorporationITSO-44 The following are trademarks of the International Business Machines Corporation in the United States, other countries, or both. The following are trademarks or registered trademarks of other companies. * All other products may be trademarks or registered trademarks of their respective companies. Notes: Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here. IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply. All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions. This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area. All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. 
IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography. Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries. Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom. Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are registered trademarks of Microsoft Corporation in the United States, other countries, or both. Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. UNIX is a registered trademark of The Open Group in the United States and other countries. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office. IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency, which is now part of the Office of Government Commerce. 
For a complete list of IBM Trademarks, see *BladeCenter®, DB2®, e business(logo)®, DataPower®, ESCON, eServer, FICON, IBM®, IBM (logo)®, MVS, OS/390®, POWER6®, POWER6+, POWER7®, Power Architecture®, PowerVM®, S/390®, System p®, System p5, System x®, System z®, System z9®, System z10®, WebSphere®, X-Architecture®, zEnterprise, z9®, z10, z/Architecture®, z/OS®, z/VM®, z/VSE®, zSeries® Not all common law marks used by IBM are listed on this page. Failure of a mark to appear does not mean that IBM does not use the mark nor does it mean that the product is not actively marketed or is not significant within its relevant market. Those trademarks followed by ® are registered trademarks of IBM in the United States; all others are trademarks or common law marks of IBM in the United States. Trademarks
  5. 5. © 2015 IBM CorporationITSO-5 Software Pricing and You Performance and Availability
  6. 6. © 2015 IBM CorporationITSO-6 Topics covered in this section • Software pricing basics • Why techies need to understand software pricing • Mobile Workload Pricing • z Systems Collocated Application Pricing • Country Multiplex Pricing • References • Summary DISCLAIMERS: Any prices used in this section are notional, based on a mix of z/OS products. They may not represent actual prices, and are used purely for comparison purposes. This presentation also focuses solely on IBM MLC products. You also need to factor in IPLA and non-IBM products when deciding on the optimum configuration for your enterprise.
  7. 7. © 2015 IBM CorporationITSO-7 IBM Software Pricing Options • The System Programmers’ cure for insomnia: – AEWLC – AWLC – CMLC – EWLC – MULC – MWP – PSLC – SALC – SVC – TTO – TUP – ULC – VU – zCAP – zELC – zIPLA – zNALC – zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz The thrill of IBM software pricing – who needs sky-diving when you can learn about this stuff??!!
  8. 8. © 2015 IBM CorporationITSO-8 Software Pricing Basics • First, a 1-slide introduction to IBM MLC Software Pricing… – Most major IBM monthly license charge (MLC) software products for z Systems are charged using sub-capacity pricing. This is based on the peak Rolling 4-Hour Average of the LPARs they run in, NOT on the actual CPU utilization, and NOT on the CPU time they use. [Chart: Actual MSUs & R4HA; series: Total, R4 - Total]
  9. 9. © 2015 IBM CorporationITSO-9 Software Pricing Basics • Well, a 1(ish)-slide introduction to IBM MLC Software Pricing… – To be precise, the charge is based on the lower of: the peak Rolling 4-Hour Average (R4HA – measured in MSUs), or the highest defined capacity (specified in MSUs) for all the LPARs on that CPC running that product for the month (00:00 on 2nd to 23:59 on the 1st) – Remember that if you do something to lower the peak, some other interval becomes the new peak and might be unaffected by the change you made. [Chart: MSUs over Time (00 to 23); labels: R4HA, z/OS MSUs, Adjusted z/OS MSUs]
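The "lower of peak R4HA or defined capacity" rule can be sketched in a few lines of Python. This is a minimal illustration, not how SCRT works internally: it assumes a single LPAR, evenly spaced hourly MSU samples, and a partial window at the start of the series. The function names and the sample data are hypothetical.

```python
from collections import deque


def rolling_average(samples, window):
    """Trailing average over the last `window` samples (fewer at the start)."""
    recent = deque(maxlen=window)
    averages = []
    for sample in samples:
        recent.append(sample)
        averages.append(sum(recent) / len(recent))
    return averages


def billable_msus(samples, window, defined_capacity=None):
    """Lower of the peak rolling average and the defined capacity, if one is set."""
    peak = max(rolling_average(samples, window))
    if defined_capacity is None:
        return peak
    return min(peak, defined_capacity)


# Hypothetical hourly MSU consumption for one LPAR.
hourly_msus = [200, 200, 400, 800, 800, 400, 200, 200]
r4ha = rolling_average(hourly_msus, window=4)

print(max(hourly_msus))                                          # instantaneous peak: 800
print(max(r4ha))                                                 # peak R4HA: 600.0
print(billable_msus(hourly_msus, window=4, defined_capacity=500))  # 500
```

Note that the peak R4HA (600) is well below the instantaneous peak (800), which is why short spikes matter less to the bill than sustained load.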
  10. 10. © 2015 IBM CorporationITSO-10 Software Pricing Basics • 1-slide introduction to IBM MLC Software Pricing (cont)… – There is a bulk discount – the more MSUs you consume, the lower the price per additional MSU. – The AVERAGE cost per MSU is the total cost / peak R4HA. – The INCREMENTAL cost per MSU is always less than the average and is the price you pay for the next MSU. [Chart: Pricing Curve, Monthly Cost vs MSUs] [Chart: $ per Additional MSU, by pricing tier: 1372.35, 386.4, 316.05, 226.8, 120.75, 92.4, 65.1, 49.35, 39.9]
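A tiered pricing curve like this one can be sketched as follows. The nine per-MSU prices are the notional values from the slide's chart. The tier widths are an assumption: the slides do not state them, and the widths used here were chosen because they reproduce the $93,184 (315 MSUs) and $185,138 (1400 MSUs) figures quoted elsewhere in this deck. Other totals quoted in the deck may differ slightly from what this sketch produces.

```python
# (tier width in MSUs, notional $ per additional MSU).
# Prices come from the slide; widths are ASSUMED, not taken from the deck.
# None marks the open-ended top tier.
TIERS = [
    (3, 1372.35),
    (42, 386.40),
    (130, 316.05),
    (140, 226.80),
    (260, 120.75),
    (300, 92.40),
    (440, 65.10),
    (660, 49.35),
    (None, 39.90),
]


def monthly_cost(msus):
    """Total monthly cost: walk up the tiers, charging each tier's price."""
    remaining = msus
    total = 0.0
    for width, price in TIERS:
        if remaining <= 0:
            break
        used = remaining if width is None else min(remaining, width)
        total += used * price
        remaining -= used
    return total


def average_cost(msus):
    """Average $ per MSU: total cost divided by the billed MSUs."""
    return monthly_cost(msus) / msus


def incremental_cost(msus):
    """Price of the next MSU once `msus` are already being billed."""
    return monthly_cost(msus + 1) - monthly_cost(msus)


print(round(monthly_cost(315)))      # 93184
print(round(monthly_cost(1400)))     # 185138
print(average_cost(315) > incremental_cost(315))  # True
```

The last line shows the slide's point directly: at 315 MSUs the average is far higher than the price of the next MSU, so removing MSUs saves less than the average would suggest.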
  11. 11. © 2015 IBM CorporationITSO-11 Software Pricing Basics • 1-slide introduction to Software Pricing (continued)… – Basic rule is that each CPC is looked at in isolation to determine your incremental $ per MSU. – Assume you have 3 CPCs, all running monoplexes, peak R4HA in each CPC is 315 MSUs: – 3 x $93,184 = $279,552/Mth. – But, if 1 sysplex accounts for > 50% of used MVS MIPS on multiple CPCs, software is priced based on aggregated MSUs across those CPCs. – If the 3 CPCs qualified for sysplex aggregation, the total MSUs would be 945, and the cost would be: 1 x $156,949/Mth. – This group of CPCs is called a PricePlex. [Chart: Pricing Curve, Monthly Cost vs MSUs, with the 315-MSU point marked x3]
  12. 12. © 2015 IBM CorporationITSO-12 Software Pricing Basics • 1-slide introduction to Software Pricing (I TOLD you this wasn’t simple….)…. – Sysplex aggregation determines the $ per MSU you pay – where you are on the pricing curve. – But the IBM software bill for each MLC product is based on the sum of the peak Rolling 4-hour Averages for each LPAR that that product is used in for each CPC. – So the highest interval for CPC1 is used, plus the highest interval for CPC2 (which is probably at a different time), plus the highest interval for CPC3 (which is also probably at a different time). – QUESTION for you to think about – how would your software bill be affected if you moved a major workload: – From one LPAR to another on the same CPC? – From one LPAR to an LPAR on a different CPC?
  13. 13. © 2015 IBM CorporationITSO-13 Why techies need to know SW pricing • Traditionally, software contract staff worked with vendors and decided on which software pricing metric to use, often working independently of mainframe technical staff. • And mainframe technical staff aimed to deliver the best performance from the available capacity, often without understanding or thinking much about software pricing metrics. – Generally, the same pricing metric (PSLC, VWLC, AWLC, etc) was used for every LPAR on a CPC and for every CPC in the installation, so you didn’t really need to be aware of the pricing metric when deciding what to put where. – There are (OF COURSE) a small number of exceptions like zNALC (for ‘new’ applications) or MULC (measured usage, for very small usage of some products in large LPARs), but these are not very widely used.
  14. 14. © 2015 IBM CorporationITSO-14 Why techies need to know SW pricing • Between z900 and z10, IBM provided a financial incentive (‘technology dividend’) to move to newer CPCs by increasing the number of MIPS per Software MSU with each generation. Or, to put it another way, the number of MSUs required to process a given amount of work DEcreased. Software MSUs are the base for most software pricing, so this let you do the same amount of work for less money. THIS WAS GOODNESS! [Chart: MSUs needed to do same amount of work, for z900, z990, z9, z10, z196]
  15. 15. © 2015 IBM CorporationITSO-15 Why techies need to know SW pricing • The downside was that capacity management became more complex. An LPAR on a z10 with a 1000 MSU cap could process more work than an LPAR on a z9 with the same (1000 MSU) cap. • This complicated the process of managing LPAR sizes and routing work to the ‘best’ system. [Chart: MIPS for 1000 MSUs, for z900, z990, z9, z10, z196]
  16. 16. © 2015 IBM CorporationITSO-16 Why techies need to know SW pricing • An aside….. When is an MSU not an MSU? • The original idea of MSUs was as an indicator of CPC capacity. – The MSU for a CPC was: – (SU/Sec for that box x number of engines x 3600) / 1,000,000, which gives millions of service units per hour. • When IBM started altering the number of MIPS in an MSU as a way of discounting software, you now had TWO MSUs: – “Hardware” MSU – calculated using the original formula – this is the basis for reporting in RMF Type 72 records & Workload Activity Reports and service units reporting in Type 30 records. – “Software” MSU – used as the basis for software charging and is used in the RMF Type 70 records & CPU Activity reports….
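The original "hardware" MSU formula is simple enough to write out directly. The sketch below only restates the arithmetic from the slide; the SU/sec value and engine count in the example are made-up numbers, not figures for any real CPC.

```python
def hardware_msu(su_per_sec, engines):
    """Millions of service units per hour: SU/sec x engines x 3600 / 1,000,000."""
    return su_per_sec * engines * 3600 / 1_000_000


# Hypothetical box: 50,000 SU/sec per engine, 10 engines.
example_msu = hardware_msu(50_000, 10)
print(example_msu)  # 1800.0
```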
  17. 17. © 2015 IBM CorporationITSO-17 Why techies need to know SW pricing • And what is a MIPS (Millions of Instructions Per Second)? • In theory, MIPS is an indication of the speed of the processor…. • However, you can imagine that the MIPS for a processor depends on how long the instructions take to complete. – Some instructions take a LOT longer to complete than others – for example, moving characters from one location in memory to another takes MUCH longer than adding the numbers in two registers. • As a result, the ‘MIPS’ for a processor is very workload dependent – there is no single MIPS number for any box and no tool that reports MIPS numbers. The typical range between high and low is about 34%. So you need to be very careful any time you use MIPS, ESPECIALLY in contracts…. We’ll come back to this again later… • Now, to return to our originally scheduled program….. MSUs and z196….
  18. 18. © 2015 IBM CorporationITSO-18 Why techies need to know SW pricing • On z196, IBM stopped increasing the number of MIPS per MSU and instead used a new pricing option called AWLC (or AEWLC) that charged a lower price per MSU than the predecessor pricing option (VWLC) to incent customers to move to z196. • Starting with zEC12, discounts are applied during the IBM billing process, so that the price per MSU on a zEC12 (or z13) is lower than on a z196, but the number of MIPS per MSU is roughly the same on zEC12 (or z13) as on a z196. • The financial effect is similar (you pay less per MIPS on newer CPCs), however the capacity management complexity is somewhat simplified. [Chart: MIPS for 1000 MSUs, for z196, zEC12, z13]
  19. 19. © 2015 IBM CorporationITSO-19 Why techies need to know SW pricing • Despite all the complexity of software contracts, one thing that has been consistent up until now is the average price per MSU for a given LPAR on a given CPC – 1000 MSUs costs xxxx dollars regardless of the mix of work running in the LPAR… 1000 MSUs is 1000 MSUs. • But the world is changing. Workloads are changing. IBM is incenting customers to put new and more workloads on z/OS by reducing the cost per MSU for certain workloads (this is GOOD). It is also making it possible to mix new and traditional workloads in the same LPAR while still getting a discount for the new workloads – this provides far more flexibility for how you configure your systems. However this also means that the days of consistent $ per MSU for an LPAR are over (this is …… EXCITING!).
  20. 20. © 2015 IBM CorporationITSO-20 Recent IBM MLC SW Pricing Options • The three most recent pricing options are: – Mobile Workload Pricing, announced in May 2014. – z Systems Collocated Application Pricing, announced in April 2015. – Country Multiplex Pricing, announced in July 2015. • Let’s look at each of these and see how they will impact YOU.
  21. 21. © 2015 IBM CorporationITSO-21 Mobile Workload Pricing • What is Mobile Workload Pricing (MWP)? • Headline is that it offers a 60% discount on MSUs consumed by CICS/DB2/IMS/MQ/WAS transactions that originated on a mobile device. • 60 …. PERCENT …. OFF! WOW! What else is there to say?? • Quite a bit….
  22. 22. © 2015 IBM CorporationITSO-22 Introduction to MWP • First, mobile is not a fad, it is not going away. – There are already large z/OS customers where mobile consumes up to 50% of their z/OS capacity. – Some banks are incenting customers to interact with them using mobile apps rather than PCs, partly so that they can benefit from MWP. – And these are only the early days – we are still in the 3277 phase… • IBM (and many others) believe mobile use will out-accelerate all other platforms over the next few years, so MWP is IBM’s attempt to capture the mobile workloads that exploit existing z/OS applications, rather than having customers host these applications on other platforms. – Important to note that MWP is aimed at customers that are re-using existing z/OS applications with mobile platforms.
  23. 23. © 2015 IBM CorporationITSO-23 Introduction to MWP • MWP IS a REALLY significant offering from IBM - it indicates that IBM acknowledges that it must improve the cost-competitiveness of z/OS if customers are to grow and roll out new applications on this platform. – zCAP and CMP (both previewed with the z13 announcement) continue this trend. • The latest pricing options are all aimed at reducing the cost of GROWTH. – They might not immediately reduce your SW bills, BUT, IF you grow your z/OS workloads and exploit the new pricing options, at some point the bulk of your work will be priced at the new, more competitive price points, and your traditional (higher-priced) work will be a decreasing portion of the total work (and cost).
  24. 24. © 2015 IBM CorporationITSO-24 Introduction to MWP • If you sign up for Mobile Workload Pricing (it is optional, and you must sign an agreement and supplements if you want to use it), IBM will reduce the R4HA FOR EVERY IBM MLC PRODUCT IN THAT LPAR in each interval by 60% of the corresponding R4HA of the MSUs consumed by CICS, DB2, IMS, MQ, or WAS transactions that originated from a mobile device. • Important point here is that it is not only the subsystem where the transaction ran (CICS, for example) that is discounted. It is EVERY subcapacity IBM MLC product in that LPAR – SDSF, DB2, PL/1, you name it.
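The per-interval adjustment described on this slide can be sketched as below. This is an illustration of the arithmetic only, not of what MWRT does internally. The interval values are hypothetical, and the function name is made up for this example.

```python
MWP_DISCOUNT_PCT = 60  # the 60% figure from the slide


def adjust_for_mwp(total_r4ha, mobile_r4ha):
    """Subtract 60% of the mobile R4HA from the total R4HA, interval by interval."""
    if len(total_r4ha) != len(mobile_r4ha):
        raise ValueError("both series must cover the same intervals")
    return [
        total - mobile * MWP_DISCOUNT_PCT / 100
        for total, mobile in zip(total_r4ha, mobile_r4ha)
    ]


# Hypothetical R4HA values (MSUs) for four intervals in one LPAR.
total_r4ha = [500, 900, 1000, 700]
mobile_r4ha = [0, 100, 500, 0]

adjusted = adjust_for_mwp(total_r4ha, mobile_r4ha)
print(adjusted)       # [500.0, 840.0, 700.0, 700.0]
print(max(adjusted))  # 840.0
```

In this made-up data the unadjusted peak is 1000 MSUs in the third interval, but after the adjustment the peak moves to the second interval at 840 MSUs. The billable peak fell by 160 MSUs, not by the 300 MSUs that were discounted in the original peak interval.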
  25. 25. © 2015 IBM CorporationITSO-25 Introduction to MWP • Initial questions from techies after they hear about MWP are normally: – Precisely what qualifies as a ‘mobile device’? – How do you get the CPU time used by those applications so you can input it to MWRT (MWRT is a PC-based version of SCRT that you use if you sign up for MWP)? • But maybe your initial questions should be: – How much mobile do I have now – using MWP generates additional work (for you and for the system), so would the savings justify the work? Or should we concentrate on getting better prepared now, and sign up later? – Mobile users are usually customers that expect instant responses, so how do I give them the capacity they need, while also controlling my costs? – How do I manage my budgets/capacity when one (constantly varying) part of my workload has a different price per MSU than the rest of my workload? – EXACTLY how does MWP impact my bills?
  26. 26. © 2015 IBM CorporationITSO-26 Understanding MWP • ‘Success’ is all about expectation setting…. • If you promise this……………………... • And deliver this….... – You are a hero • If you promise this…………………….. • And deliver this….., – You get to experience a ‘career transitioning event’
  27. 27. © 2015 IBM CorporationITSO-27 Understanding MWP • To ensure that MWP is perceived as a successful project, it is vital that you control the expectations because the MWP message is already being misinterpreted… – MWP is aimed at reducing the cost of growing z/OS workloads. It MIGHT reduce your current costs, depending on how much of your work is MWP-eligible and whether that coincides with your peak Rolling 4-Hour Average (R4HA). – But the real intent is to let you add mobile workloads to z/OS at a much lower cost than was the case previously. So MWP is more about reducing the cost of adding workloads to z/OS than reducing your SW bill today. – Let’s look at an example….
  28. 28. © 2015 IBM CorporationITSO-28 Understanding MWP [Chart: Impact of MWP on R4HA, MSUs over Time; series: z/OS MSUs, MWP MSUs, Adjusted z/OS MSUs]
  29. 29. © 2015 IBM CorporationITSO-29 Understanding MWP • The first expectation (misunderstanding) that you must control: ‘signing up for MWP will reduce my SW bill by 60%’. It might reduce your bill, but it WILL NOT reduce it by 60%. • If you reduce the total number of MSUs on your bill by some amount, you are reducing them at the incremental cost, not the average cost, so your bill will not decrease by the same percent as your MSUs. • Let’s say you have a 2400 MSU system and you did not sign up for MWP – the bill would be $230,123 (an average of $95.88 per MSU). • Now assume that 1680 of the 2400 MSUs were MWP-eligible and that you DID sign up for MWP. The MWP discount would be roughly 1000 MSUs (60% of 1680). So (absolute best case) that would bring your bill back to 1400 MSUs – $185,138. That’s a reduction of nearly 20%, which should be great. But it is not 60%. [Chart: $ per Additional MSU, by pricing tier: 1372.35, 386.4, 316.05, 226.8, 120.75, 92.4, 65.1, 49.35, 39.9]
  30. 30. © 2015 IBM CorporationITSO-30 Understanding MWP • And, in reality, your peak R4HA will now be some other interval, so you are VERY unlikely to actually see your peak R4HA reduce by 1000 MSUs. • But let’s look at a growth scenario instead. Let’s say that you ADDED 1680 MSUs of mobile workload to the peak rolling 4 hour interval on your 2400-MSU system, but did NOT sign that system up for MWP pricing….. • The average $ per MSU for the 2400 MSUs was $95.88. Because of the SW price curve, the additional 1680 MSUs would have cost an extra $67,037. That is an average of just $39.90 per MSU for those extra MSUs. • But if you DID sign that system up for MWP and met all the requirements, the additional cost for the 1680 MSUs of mobile work would have been $26,815, which is $40,222 less than if you didn’t have MWP. • By exploiting MWP, you grew the actual used MSUs by 70% but your bill only increased by 11.6% - just $15.96 per MSU. – Of course, this assumes that the additional capacity was only used by MWP-eligible workloads.
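The growth-scenario arithmetic on this slide can be checked with a short calculation. The inputs ($230,123 base bill, $67,037 for the extra 1680 MSUs without MWP) are the notional figures quoted in the slides; the variable names are only for illustration, and the sketch assumes, as the slide does, that all of the added capacity is MWP-eligible.

```python
BASE_MSUS = 2400
BASE_BILL = 230_123          # notional monthly bill before growth, from the slides
ADDED_MOBILE_MSUS = 1680
ADDED_COST_NO_MWP = 67_037   # cost of the extra MSUs without MWP, from the slides
MWP_DISCOUNT_PCT = 60

# With MWP, only 40% of the added mobile MSUs remain billable.
added_cost_with_mwp = ADDED_COST_NO_MWP * (100 - MWP_DISCOUNT_PCT) / 100
saving = ADDED_COST_NO_MWP - round(added_cost_with_mwp)

msu_growth_pct = ADDED_MOBILE_MSUS / BASE_MSUS * 100
bill_growth_pct = added_cost_with_mwp / BASE_BILL * 100
cost_per_added_msu = added_cost_with_mwp / ADDED_MOBILE_MSUS

print(round(added_cost_with_mwp))    # 26815
print(saving)                        # 40222
print(msu_growth_pct)                # 70.0
print(round(cost_per_added_msu, 2))  # 15.96
```

The bill growth works out to roughly 11.65%, which the slide quotes as 11.6%.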
  31. 31. © 2015 IBM CorporationITSO-31 Understanding MWP • Expectation number 2 – ‘signing up for MWP will definitely reduce my existing bill by something’. • Actual: Signing up for MWP will reduce the R4HA for each hour in the month by 60% of the MSUs used by MWP-eligible workloads. • As we saw in the earlier chart, IF you run a lot of MWP-eligible work at the time of your current peak R4HA, MWP will probably save you money. • If your current peak R4HA is at a time when there is little or no MWP-eligible work (during the batch window, for example), then MWP probably will not reduce your current bill by much. – BUT, it may allow you to add mobile workload at other times of the day at zero additional cost, even if the peak total MSUs exceeds the batch shift peak.
  32. 32. © 2015 IBM CorporationITSO-32 Understanding MWP • Expectation 3 – ‘MWP reduces the basis on which I get billed by 60% of the capacity used by mobile’. • Actual – MWP reduces the R4HA of every interval by 60% of the MSUs used by MWP-eligible work. • You pay software bills based on the lower of: the peak R4HA for the month, or the highest defined capacity for the month. • It is possible that the use of a defined capacity is already saving you most of what MWP would save you if you did not have a softcap. • BUT, as the volume of your mobile work grows, the number of MSUs in that 60% discount will increase, so over time it will probably deliver more savings than a softcap alone. • Let’s look at an example…..
  33. 33. © 2015 IBM CorporationITSO-33 Understanding MWP [Chart: Impact of MWP on R4HA and SoftCap, MSUs over Time; series: z/OS MSUs, MWP MSUs, Adjusted z/OS MSUs, Defined Capacity]
  34. 34. © 2015 IBM CorporationITSO-34 Understanding MWP • To summarize the financial aspects of MWP: • Given the global trend towards people being more reliant on their mobile devices, it is reasonable to expect that MWP WILL deliver real savings – some customers are already saving money with it. • No one will reduce their IBM SW bill by 60% due to mobile. – But real savings can be made and the cost of growth can be significantly reduced. – Just make sure that expectations are not set unrealistically high. • If you don’t have much mobile work on your system today, don’t ignore MWP. – Having little mobile work today is actually good, because it gives you time to investigate and determine the best way for YOU to implement MWP and to work with subsystem sysprogs, application architects and developers, and contract administrators. • Now let’s look at the technical considerations for how MWP affects your system and subsystem topology.
  35. 35. © 2015 IBM CorporationITSO-35 Managing an environment that has MWP • Part of the terms and conditions of MWP is that you must have a way to identify the CPU time consumed by transactions that originated on a mobile device, and you are responsible for providing that information to MWRT (or a new version of SCRT). So everyone wants to know how to calculate the number of MSUs used by MWP-eligible work.
  36. 36. © 2015 IBM CorporationITSO-36 Managing an environment that has MWP • But before you break out your SMF and FORTRAN VS manuals, you need to pause and consider something else: how do you control your SW costs in an MWP environment? – Today, you can easily determine a reliable average and incremental cost per MSU for each of your LPARs based on the peak R4HA and product mix in each LPAR. – Then you take your monthly SW budget and divide by the average cost per MSU amount, and that gives you your total Defined Capacity value. – If your business requires predictable monthly bills, this is the most effective way to achieve that. – But how do you do that when SOME (very variable) subset of your workload effectively has a different cost per MSU?
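The budget-to-defined-capacity step, and the problem MWP introduces, can be sketched as follows. All figures are hypothetical and the function names are made up for this example. The sketch assumes a flat average cost per MSU, which is a simplification of the real pricing curve.

```python
import math

MWP_DISCOUNT_PCT = 60


def defined_capacity_for_budget(monthly_budget, avg_cost_per_msu):
    """Traditional approach: budget divided by average $ per MSU, rounded down."""
    return math.floor(monthly_budget / avg_cost_per_msu)


def billed_msus(used_msus, mobile_share_pct):
    """MSUs left on the bill once 60% of the mobile portion is discounted."""
    mobile_msus = used_msus * mobile_share_pct / 100
    return used_msus - mobile_msus * MWP_DISCOUNT_PCT / 100


# Hypothetical: $200,000 budget at an average of $100 per MSU.
cap = defined_capacity_for_budget(200_000, 100.0)
print(cap)  # 2000

# The same 2000 used MSUs produce different billed MSUs as the mobile share varies.
for share in (0, 25, 50):
    print(share, billed_msus(cap, share))
# 0 2000.0
# 25 1700.0
# 50 1400.0
```

A cap expressed in MSUs therefore no longer maps to a single dollar amount: the same cap corresponds to a different bill depending on how much of the work in the peak interval is mobile.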
  37. 37. © 2015 IBM CorporationITSO-37 Managing an environment that has MWP [Chart: Same total used capacity, three very different costs: 147,000, 140,000, 154,000]
  38. 38. © 2015 IBM CorporationITSO-38 Managing an environment that has MWP • This is a very fundamental (and new) challenge for any site that is interested in fully exploiting IBM’s recent software pricing options – your budgets are managed in dollars, but your LPARs are managed using MSUs, and the average price per MSU can constantly change depending on the workload mix. • If you don’t add more capacity, you might not have sufficient capacity to deliver the required service level. • But if you DO add more capacity, how much do you add? And how do you stop your traditional workloads from using all that capacity and increasing your costs beyond your budgeted amounts? – You can use products to dynamically manage your defined capacities, but they also operate based on MSUs, so you still have the same challenge.
  39. 39. © 2015 IBM CorporationITSO-39 Managing an environment that has MWP • Ideally, you would be able to: – Identify, in near-real time, how many MSUs are being used by each pricing option. – Have a tool that would use that information to dynamically set and manage a defined capacity that would maximize the number of available MSUs, but without exceeding your financial targets. • Today, the cost of gathering that information in real time at a transaction level might be higher than the savings that MWP would provide. – Recently previewed WLM APAR OA47042 (z/OS 2.1 and later), combined with support in CICS and IMS, may provide relief IF the WLM classification criteria can be used to identify all MWP-eligible transactions. APAR is still open, due for delivery in December, and all details are not available yet. But if you are interested in MWP, this is an APAR to follow. – Also, have a look at MXG 33.216 which already has the definitions for the new fields! • With that in mind, let’s look at your options for how you could provide an environment for your MWP workloads.
  40. 40. © 2015 IBM CorporationITSO-40 Managing an environment that has MWP • You basically have 3 options: – Run your MWP-eligible transactions in the same regions and subsystems as your traditional workloads. – Provide regions and subsystems that are dedicated to MWP-eligible transactions, but that run in shared LPARs. – Provide dedicated LPARs for the MWP-eligible transactions. • Let’s look at the benefits and drawbacks of each of these.
  41. 41. © 2015 IBM CorporationITSO-41 Managing an environment that has MWP • Shared regions • Benefits: – EASY to set up – just use existing regions and subsystems. • Drawbacks: – Currently, you MUST process transaction-level SMF data to identify CPU consumption of MWP-eligible transactions. This could be a LOT of data. – Identifying the source of the transaction from the SMF records might not be possible. – How do you identify the original source of transactions that are called by other txns? – Maintenance effort for programs that extract CPU usage info is not insignificant – every time a new MWP-eligible application is deployed or modified, you need to update your programs. And not every application will use the same mechanism for identifying its source. – Transaction-level SMF records do not capture region management time – about 80% is captured, at best. – MQ does not provide transaction-level CPU usage info in its SMF records, so you are limited to collecting whatever MQ charges back to CICS/IMS/etc. – Categorizing CPU usage in real time is currently expensive, maybe impossible (but OA47042 might help).
  42. 42. © 2015 IBM CorporationITSO-42 Managing an environment that has MWP • Dedicated regions in shared LPARs • Benefits: – Might be easier to identify the transaction source in the network and route it to the dedicated regions – removes the need to identify this from transaction-level SMF records. – Because identification is done based on mobile device name, maintenance effort should be a lot lower than if you are gathering this info from transaction-level SMF or log records. – IBM will accept data extracted from SMF Type 30 records – massive reduction in volume of SMF data to be processed. – Because Type 30 records are used, you capture all the management time as well. – Possible to economically identify CPU consumption of these regions in real time, even without the WLM MWP support. • Drawbacks: – Requires additional regions/subsystems, meaning more work to set up and manage, plus the resources required for more address spaces. – Requires data sharing if you want to extend this to database manager.
  43. 43. © 2015 IBM CorporationITSO-43 Managing an environment that has MWP • Dedicated systems • Benefits: – All of the benefits of dedicated regions, plus…. – Dramatically easier to manage LPAR capacity, because nearly all work in the LPAR has the same average price per MSU. – Easier to provide dedicated capacity for MWP work and have less important traditional work subject to capping in other LPARs. – IBM will accept data from just the Type 70 and Type 89 records – no need to collect, keep, and post-process transaction-level or even address space-level SMF records. – There might be security advantages to isolating transactions originating on a mobile device into their own LPARs. • Drawbacks: – Setting up new systems means more work to set up and manage, plus the resources required for more LPARs. – Requires data sharing, assuming that you want to share data between MWP and traditional applications.
  44. 44. © 2015 IBM CorporationITSO-44 Implementing MWP • Regardless of which topology you decide to use, you are responsible for getting CPU usage data into a format that can be used by MWRT. • IBM currently does not provide any mainstream tool to do this processing. – They do have a product called Transaction Analysis Workbench 1.2 (5697-P37) that purportedly helps you gather data for MWP if APAR PI29291 is applied, but I have not been able to get any more information about this. – In time, the WLM MWP support might collect all the data you need in the Type 72 and Type 99 SMF records, however you will still be responsible for getting it from there into a format that can be input to MWRT/SCRT. • Al Sherkow and Barry Merrill have produced some tools based on MXG. – But they are still limited by the information that can be found in the SMF records. – MXG already supports the new WLM MWP support, which might or might not identify all mobile transactions. – And they still require customer programming.
  45. 45. © 2015 IBM CorporationITSO-45 Implementing MWP • In order to be able to avail of MWP, you must: – Have a zBC12 or zEC12 or later in your enterprise. – The MWP-eligible workloads must run on a z114/z196 or later. – Be running z/OS (V1 or V2) and one or more of CICS (V4 or V5), DB2 (V9, V10, or V11), IMS (V11, V12, or V13), MQ (V7 or V8), or WAS (V7 or V8). – Be using a sub-capacity pricing option – AWLC, AEWLC, or zNALC. – Sign the MWP supplement. – And agree with IBM which applications will be eligible, and how you will gather the usage data for those applications. And, especially, exactly how you will identify the MWP-eligible transactions. – Also, any time you add new MWP transactions/applications, you must inform IBM and complete a new supplement. – Provide your own mechanism to create the MWP input to MWRT (or SCRT 23.10 or later). – Use MWRT or SCRT 23.10 or later to report your utilization to IBM.
  46. 46. © 2015 IBM CorporationITSO-46 MWP Summary • Investigate if MWP would help you today, and to what extent. • Set management’s expectations to a realistic level and position this as a strategic direction to reduce future costs. • Work with subsystem sysprogs, application developers, whoever owns the WLM policy, and contract administrators to identify the most efficient topology for your company, bearing in mind zCAP and other similar options that may follow. – And don’t forget that you need some way to ensure that the additional capacity you provide for MWP work is not used by traditional work. • Work with subsystem sysprogs and application developers to investigate how you can identify MWP-eligible transactions – if possible, use a consistent mechanism to simplify the programs that extract MWP CPU time info. • Create and test programs to extract the required data into MWRT-readable format. • Sign the IBM agreements and supplements. • Plan for what you will spend your MASSIVE bonus on….
  47. 47. © 2015 IBM CorporationITSO-47 z Systems Collocated Application Pricing (zCAP) • z Systems New Application License Charging (zNALC) has been available since 2007. – It significantly reduces the software costs for applications that meet certain criteria. – However, it requires that the applications run in dedicated zNALC LPAR(s). – zNALC LPARs can be in the same sysplex as traditional workloads and can share data with traditional workloads. But z/OS in the zNALC LPAR will be priced using zNALC prices. [Chart: AWLC to zNALC z/OS comparison; series: AWLC and zNALC]
  48. 48. © 2015 IBM CorporationITSO-48 z Systems Collocated Application Pricing • To address the needs of customers that have new applications but that don’t want to have to set up dedicated LPARs for those workloads, IBM introduced a new pricing option called z Systems Collocated Application Pricing (zCAP). • zCAP is conceptually similar to MWP in that discounts are based on the middleware CPU consumption of applications that meet the criteria for zCAP and that are described in your zCAP agreement and supplements with IBM. • However, because the applications are NEW, they should be a lot easier to identify than MWP transactions, which use existing applications (meaning that you don’t have the complexity of trying to determine the source of the transaction).
  49. 49. © 2015 IBM CorporationITSO-49 z Systems Collocated Application Pricing • What is a ‘new’ workload? – Must be a new application to z/OS in your enterprise. – Does not have to be new ‘in the universe’ – for example, SAP has been around for many years, but if you are not using SAP on z/OS now, then it is eligible to be considered ‘new’ for zCAP purposes. – If you move SAP from another platform in your enterprise to z/OS, that also counts as being ‘new’ for zCAP purposes. – The zCAP definition of ‘new’ is a lot more flexible than the zNALC definition of new. The application must use at least one of CICS/DB2/IMS/MQ/WAS, but that is all. • The objective is to provide you with more flexibility to help you add new z/OS applications. • Organic growth of existing applications does not count as ‘new’ for zCAP purposes. • For gray areas, speak to IBM and make a case for why the application should be considered ‘new’. • Also, in the words of IBM’s David Chase, ‘newness does not wear off. Applications that qualified as ‘new’ 5 years ago are still considered new today’.
  50. 50. © 2015 IBM CorporationITSO-50 z Systems Collocated Application Pricing • Like MWP, you have to identify the MSUs used by the zCAP-eligible workload (CICS/DB2/IMS/MQ/WAS). – Then you subtract 50% of that amount from the z/OS R4HA. – And you subtract 100% of that amount from all other MLC products in the LPAR (CICS, DB2, IMS, MQ, WAS, COBOL, NetView, etc.) – Then you pay for the MSUs for the subsystems used by the zCAP-eligible workload using the same pricing metric that is being used by the LPAR the application is running in. • Let’s look at two scenarios…. – First one is where a new application is the only user of a ‘zCAP-defining’ subsystem (CICS/DB2/IMS/MQ/WAS) – Second one is where the new application uses an existing subsystem.
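The subtraction rules on this slide can be sketched in a few lines of Python. This is only an illustration, not IBM's MWRT/SCRT logic: the function name, the parameter names, and the idea of passing the pre-existing usage of the zCAP-defining subsystem as a separate argument are assumptions made for the example.

```python
def zcap_billable_msus(lpar_r4ha, zcap_msus, existing_subsystem_msus=0):
    """Sketch of the zCAP adjustments described on the slide.

    lpar_r4ha               -- standard LPAR value (peak R4HA), new workload included
    zcap_msus               -- MSUs used by the zCAP-eligible workload
    existing_subsystem_msus -- MSUs already billed for the zCAP-defining
                               subsystem before the new workload arrived
                               (0 if the new application is its only user);
                               this split is an assumption for illustration
    """
    return {
        # z/OS: subtract 50% of the zCAP workload from the R4HA
        "z/OS": lpar_r4ha - 0.5 * zcap_msus,
        # every other MLC product in the LPAR: subtract 100%
        "other MLC products": lpar_r4ha - zcap_msus,
        # the subsystem used by the new workload pays for what it uses
        "zCAP-defining subsystem": existing_subsystem_msus + zcap_msus,
    }


# An LPAR that grows from 1,000 to 1,100 MSUs because of 100 MSUs of new MQ work
net_new = zcap_billable_msus(1100, 100)            # MQ is new to the LPAR
incremental = zcap_billable_msus(1100, 100, 1000)  # MQ was already in use
print(net_new)
print(incremental)
```

The two calls correspond to the two scenarios named on this slide: a new application that is the only user of a subsystem, and a new application that uses an existing subsystem.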
  51. 51. © 2015 IBM CorporationITSO-51 z Systems Collocated Application Pricing Net New MQ Example = 100 MSUs of new MQ workload *
MSUs used for subcap billing:
                      1. Existing LPAR   2. New MQ, standard rules   3. New MQ with zCAP pricing
z/OS                  1,000              1,100                       1,050
DB2 and CICS          1,000              1,100                       1,000
MQ                    (not present)      1,100 (LPAR value)          100 (usage value)
Standard LPAR Value   1,000              1,100                       1,100 (z/OS, other programs adjusted)
* Assumes workloads peak at same time. Example courtesy of David Chase, IBM
  52. 52. © 2015 IBM CorporationITSO-52 z Systems Collocated Application Pricing • Consider what would have happened if you had used zNALC for this application… – You would have paid a discounted price for z/OS based on a 100 MSU R4HA. – You would have paid for 100 MSUs of MQ • Because you are using zCAP in this example: – The MSU value used for CICS & DB2 was reduced by 100% of the capacity used by the new application because it didn’t use either of those products – so you paid for 1000 MSU of CICS or DB2, rather than 1100 MSUs. – You reduced the total z/OS R4HA number by 50% of the capacity used by the new application (50 MSU reduction) so you paid for 1050 MSUs of z/OS. – You only paid for 100 MSUs of MQ, even though it lived in an LPAR that was using 1100 MSUs. • So the net effect may be similar to zNALC, but without the need for a separate LPAR.
  53. 53. © 2015 IBM CorporationITSO-53 z Systems Collocated Application Pricing Incremental MQ Example = 100 MSUs of MQ growth *
MSUs used for subcap billing:
                      1. Existing LPAR   2. MQ growth, standard rules   3. MQ growth with zCAP pricing
z/OS                  1,000              1,100                          1,050
DB2 and CICS          1,000              1,100                          1,000
MQ                    1,000              1,100 (w/growth)               1,100 (w/growth)
Standard LPAR Value   1,000              1,100                          1,100 (z/OS, other programs adjusted)
* Assumes workloads peak at same time. Example courtesy of David Chase, IBM
  54. 54. © 2015 IBM CorporationITSO-54 z Systems Collocated Application Pricing • In this example, the new application used a product (MQ) that was already being used by existing applications: – The MQ cost went up by the 100 MSUs that the application was using. – The R4HA value used for CICS & DB2 was reduced by 100 because the new application didn’t use CICS or DB2. – The total z/OS MSU number was reduced by 50% of the capacity used by the new application (50 MSU reduction). – The R4HA for every other MLC product would be reduced by the 100 MSUs. • So, again, the net effect is similar to zNALC, but without the need for a separate LPAR. – With zNALC you would pay for 100 MSUs of z/OS at the very-reduced zNALC rate. With zCAP, you would pay for 50 MSUs of z/OS at your incremental price for z/OS (with the price depending on where you are on the pricing curve for z/OS). – The relative costs of MQ would depend on if you use AWLC or Value Unit Edition (IPLA, only available with zNALC) and where you are on the pricing curve.
  55. 55. © 2015 IBM CorporationITSO-55 z Systems Collocated Application Pricing • As with MWP, you are responsible for identifying the capacity used by the new workload and translating that into a CSV file that is input to MWRT (or new SCRT). – If the new application is the only user of a subsystem (as in the 1st example), it is acceptable to use data from the Type 89 SMF records. – If the application is using an existing subsystem product (MQ, in example 2), but runs in its own dedicated region, IBM will accept data from the Type 30 records for that region. – If the application is using an existing subsystem AND an existing region, then you need to use transaction-level information to determine the MSUs used by the new application.
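The three cases on this slide amount to a small decision rule, sketched here in Python. The function and parameter names are hypothetical; the returned strings simply restate what the slide says IBM will accept.

```python
def zcap_usage_data_source(sole_user_of_subsystem, dedicated_region):
    """Pick the coarsest SMF data the slide says is acceptable for a zCAP workload.

    Both flags are illustrative inputs, not fields of any IBM tool.
    """
    if sole_user_of_subsystem:
        # New application is the only user of the subsystem
        return "SMF Type 89 records"
    if dedicated_region:
        # Existing subsystem product, but the application has its own region
        return "SMF Type 30 records for that region"
    # Existing subsystem AND existing region
    return "transaction-level information"


print(zcap_usage_data_source(True, False))
print(zcap_usage_data_source(False, True))
print(zcap_usage_data_source(False, False))
```

Whichever source applies, the result still has to be turned into the CSV file that MWRT (or the new SCRT) takes as input.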
  56. 56. © 2015 IBM CorporationITSO-56 z Systems Collocated Application Pricing Requirements • zCAP is only available for new applications that run in a z114/z196 or later with AWLC, AEWLC, CMLC, or zNALC sub-capacity pricing. • Supports both z/OS V1 and V2, and current and recent versions of CICS/DB2/IMS/MQ/WAS. • Data must be submitted to IBM using MWRT 3.3.0 or later (current version is 3.3.5) or SCRT 23.10 or later. • There is a new contract Addendum and accompanying Supplement: – Addendum for z Systems Collocated Application Pricing (Z126-6861) – Terms and conditions to receive zCAP benefit for AWLC, AEWLC, zNALC billing • Supplement to the Addendum for zCAP (Z126-6862) – Customer explains how they measure their zCAP application CPU time – Agreement to and compliance with the terms and conditions specified in the zCAP contract Addendum is required
  57. 57. © 2015 IBM CorporationITSO-57 z Systems Collocated Application Pricing Summary • zCAP has a similar objective to zNALC – reduce the cost of adding ‘new’ applications to z/OS. • But it is intended to give you an alternative to running dedicated zNALC LPARs – you can now select a topology that makes both financial and technical sense. • It is not possible to make a blanket statement about which option (zNALC or zCAP) will have lower costs. Recommend that you work with your IBMer to price the following options: – Straight AWLC/AEWLC. – zCAP. – zNALC with AWLC/AEWLC for subsystems. – zNALC with IPLA for subsystems. – Don’t forget to factor in cost of dedicated LPAR for zNALC.
  58. 58. © 2015 IBM CorporationITSO-58 Country Multiplex Pricing • The most recent pricing option is Country Multiplex Pricing (CMP), announced in July 2015. • Its primary objective is to address customer issues with sysplex aggregation and provide customers with much more flexibility regarding how you configure your systems and sysplexes – it aims to eliminate financial incentives to create configurations that make no technical sense. • For any customer that has or would like to have a sysplex, this is THE BEST THING EVER! • Let’s look at some of the issues that it addresses. And then we will look at some scenarios to see how it would affect your SW bills.
  59. 59. © 2015 IBM CorporationITSO-59 Country Multiplex Pricing • Sysplex Aggregation – loved and loathed. • The great thing about sysplex aggregation is that it reduces the incremental price per MSU (i.e. how much additional MSUs will cost you) for your software by summing your MSUs across your CPCs to move you onto the lower priced tiers. [Chart: $ per Additional MSU versus MSUs; data labels, highest to lowest: 1372.35, 386.4, 316.05, 226.8, 120.75, 92.4, 65.1, 49.35, 39.9]
  60. 60. © 2015 IBM CorporationITSO-60 Country Multiplex Pricing • The not-so-great thing is that your business structure might not be consistent with creating a sysplex that accounts for >50% of all used MVS MIPS. – Companies have production systems, development systems, test systems, quality assurance systems, and sysprog systems – they each have a specific purpose and objectives that might clash with each other. – Your business might consist of multiple companies that do not share data or applications, so there is no logical reason for them to be in the same sysplex. • But to get over the magical 50% sysplex aggregation threshold, some customers create sysplexes that are sysplexes in name only. – Mixing test and production. – Mixing completely unrelated systems in the same sysplex. – Only criteria is the number of MSUs used by the system, not its relationship to other systems in the sysplex. • Valuable and scarce technical resource is expended on creating and maintaining an environment that delivers zero business advantage to the enterprise. It would be far more valuable to use those skills to implement new business functions and products.
  61. 61. © 2015 IBM CorporationITSO-61 Country Multiplex Pricing • If you switch to Country Multiplex Pricing, the R4HA for every LPAR across every CPC in a country is used to determine your incremental software cost, regardless of whether the systems are in the same sysplex (or ANY sysplex) or not. • No more financial encouragement to create shamplexes (as long as you are already using CMP) – YIPEE!
  62. 62. © 2015 IBM CorporationITSO-62 Country Multiplex Pricing • Mixing different types of system (test and production, for example) in the same sysplex can cause system and sysplex outages. – This is why IBM’s best practice guidelines say not to mix test and production in the same sysplex. – Test systems are used for ….. Testing. It is the nature of those systems to have new, untested software. Compare that to production, which requires stability, control, consistency, manageability • Despite the known problems, people still created such sysplexes because of the short term financial savings. • With CMP, there is no connection between the use of sysplex and your software costs. • So, after you move to CMP, there is ZERO incentive to ever create nonsensical sysplexes again…. YIPEE (again)
  63. 63. © 2015 IBM CorporationITSO-63 Country Multiplex Pricing • For technical reasons, you might wish to keep production and non-production systems on separate CPCs. – For example, you want to be able to test new HW functions in a safe environment before moving them to production. – Or you want to place a production CF on a CPC that doesn’t have any production z/OS systems. This config has the same failure-isolation characteristics as a standalone CF, but at a lower cost. • But because of the financial benefit of sysplex aggregation, there was a very strong incentive to include as many CPCs as possible in the sysplex, making it very difficult to have a completely failure-isolated CPC. • With CMP, the number of CPCs that a sysplex is spread across has zero impact on MLC prices. So you could have 2 production CPCs and 2 test CPCs, or 4 production/test CPCs – the MLC SW cost would be the same. • Now you can really configure for the optimum configuration without being constrained by financial considerations.
  64. 64. © 2015 IBM CorporationITSO-64 Country Multiplex Pricing • There are many factors that play into identifying the optimum physical location of your CPCs: – Availability and cost of data center space – Disaster recovery considerations – Location and condition of existing corporate data centers – Availability of skills – Infrastructure and natural hazards – earthquakes, flooding, ice storms, reliable power supply – And, prior to CMP, sysplex distances (so you can include both data centers in sysplex aggregation) • With CMP, because the sysplex aggregation requirement has gone away, the location of your CPCs (as long as they are in the same country) has no impact on your MLC software costs, so you are free to determine their location based purely on business and technical considerations.
  65. 65. © 2015 IBM CorporationITSO-65 Country Multiplex Pricing • Prior to CMP, when calculating your software bill for the month, IBM uses the sum of the peak R4HAs for each CPC for the month. • It is unlikely that all your CPCs will peak at exactly the same time. As a result, your bill is probably based on more MSUs than you actually use at any one point in time.
  66. 66. © 2015 IBM CorporationITSO-66 Country Multiplex Pricing
        CPC1                           CPC2                      CPC3
Time    LP1  LP2  LP3  LP4  AWLC SUM   LP1  LP2  LP3  AWLC SUM   LP1  LP2  AWLC SUM  CMLC SUM
0:00     55  232   13  563       863   217  101  392       710   148  183       331      1904
1:00     64  481   49  246       840   276  392  384      1052    71   62       133      2025
2:00     60  454   15  255       784   235  382   65       682   179  288       467      1933
3:00     73  279   38  342       732   166  269  202       637   348  321       669      2038
4:00     75  257   37  671      1040   108  218  347       673   260  115       375      2088
5:00     52  442   32  329       855   369   86  122       577   450  123       573      2005
6:00     61  415   17  172       665   315  342  123       780   241   74       315      1760
7:00     75  406   12  168       661   366  293  155       814   148  340       488      1963
8:00     66  465   12  159       702   117   64  100       281   103  363       466      1449
9:00     68  374   18  390       850   154  264  347       765   446  155       601      2216
10:00    63  350   50  571      1034   266   83  220       569   229  399       628      2231
11:00    66  395   22  382       865   339  120  336       795   244  373       617      2277
12:00    52  459   24  263       798   342  247  318       907   304  211       515      2220
13:00    74  412   46  508      1040   233  239  132       604   140  207       347      1991
14:00    53  443   48  164       708   122  144  270       536   286  191       477      1721
15:00    63  296   26  691      1076   256  378  152       786   447  227       674      2536
16:00    60  342   21  178       601    86  335  176       597   315  348       663      1861
17:00    61  417   33  199       710   132  106  163       401   151  153       304      1415
18:00    72  495    9  535      1111   188  219   81       488   409  215       624      2223
19:00    73  304   22  460       859   185  160  384       729   210  445       655      2243
20:00    53  459   30  694      1236   321  361  149       831   269  306       575      2642
21:00    56  463   39  453      1011   198  370   67       635   158  115       273      1919
22:00    72  201   37  418       728    66  392  286       744   217  340       557      2029
23:00    58  283   17  602       960   243  133  154       530   257  269       526      2016
0:00     59  321   44  528       952   384   72   91       547   155  177       332      1831
1:00     53  471   46  406       976    54  344  373       771   224  203       427      2174
Peak                            1236                      1052                 674      2642
Sum of the three AWLC peaks (1236 + 1052 + 674) = 2962; CMLC peak = 2642
  67. 67. © 2015 IBM CorporationITSO-67 Country Multiplex Pricing • With CMP, your peak R4HA is determined by summing every LPAR on every CPC, effectively working as if every LPAR was in the one CPC. • The result is likely to be a lower peak R4HA number than would be calculated using pre-CMP rules.
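The difference between the two methods can be sketched in Python. The three intervals below are the 1:00, 15:00, and 20:00 AWLC SUM values from the preceding table (the intervals in which CPC2, CPC3, and CPC1 respectively reach their peaks); the function names and the dictionary layout are assumptions made for the illustration.

```python
def sum_of_cpc_peaks(r4ha_by_cpc):
    """Pre-CMP method: take each CPC's own peak R4HA, then add the peaks."""
    return sum(max(series) for series in r4ha_by_cpc.values())


def multiplex_peak(r4ha_by_cpc):
    """CMP method: add all CPCs interval by interval, then take the peak."""
    return max(sum(interval) for interval in zip(*r4ha_by_cpc.values()))


# Per-CPC R4HA totals for three intervals taken from the table: 1:00, 15:00, 20:00
r4ha_by_cpc = {
    "CPC1": [840, 1076, 1236],
    "CPC2": [1052, 786, 831],
    "CPC3": [133, 674, 575],
}

print(sum_of_cpc_peaks(r4ha_by_cpc))  # 1236 + 1052 + 674 = 2962
print(multiplex_peak(r4ha_by_cpc))    # 20:00 interval: 1236 + 831 + 575 = 2642
```

Because the three CPCs peak in different intervals, the interval-by-interval sum never reaches the sum of the individual peaks.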
  68. 68. © 2015 IBM CorporationITSO-68 Country Multiplex Pricing • Because your bill was based on the peak R4HA for the month for each CPC, if you moved an application from one CPC to another, you would end up paying for the capacity used by that application on BOTH CPCs for that month. • For the same reason, some customers are unwilling to enable queue sharing or dynamic workload routing (especially across two sites) because that could result in work moving between CPCs more than would happen with static routing. – But by not exploiting these technologies, you are losing a lot of the benefit of data sharing and probably getting longer response times and less efficient resource usage than if you let WLM or shared queue manager control the routing. • Because CMP calculates your peak R4HA by summing every LPAR on every CPC, moving work from one CPC to another should have no impact on your MLC software bill • There is now no financial reason NOT to fully exploit the workload routing options that are available to you or to move workloads between CPCs.
  69. 69. © 2015 IBM CorporationITSO-69 Country Multiplex Pricing • Single Version Charging (SVC) saves you money by letting you pay for two versions of a product as if they were one version (GOOD). – Remember that you pay based on LPAR sizes, so if you didn’t have SVC, you would pay for both versions based on the LPAR’s peak R4HA. With SVC you only pay for the latest version. • However, you generally only have 1 year to complete the migration to the new version (NOT SO GOOD). – COBOL V5 now offers 1.5 years for migration. • CMP provides a feature known as Multiple Version Migration. With MVM, you pay for all installed versions of a product as if they were the most recent version (similar to SVC), however there is no limit on how long you take to migrate. If you wish, you could run both versions indefinitely. You can even run more than two versions.
  70. 70. © 2015 IBM CorporationITSO-70 Country Multiplex Pricing • Because your software bill is based on peak R4HA (or peak defined capacity) for each CPC, increasing the defined capacity on one CPC would probably result in an increase in your software bill for that month even if you reduced the defined capacity on another CPC by a similar amount. • Because CMP is based on the peak R4HA/peak defined capacity across all your CPCs, decreasing the defined capacity on one CPC would allow you to increase the defined capacity on another CPC without impacting your MLC software bill (just as moving a defined capacity from one LPAR to another on the same CPC today would not impact your software bill). • This allows you to get the full benefit of installed capacity spread across multiple CPCs without your MLC SW bill going up. Ideal if you have different CPCs that service different time zones, or if you have affinities between workloads and specific LPARs.
  71. 71. © 2015 IBM CorporationITSO-71 Country Multiplex Pricing • What’s the catch? CMP is primarily designed to increase flexibility, separate financial considerations from technical decisions, and help improve availability – these benefits are available to anyone that signs up for CMP. And it lets you reconfigure into a more sensible sysplex topology (no longer spreading one sysplex over every CPC, for example), without increasing your software costs. • While it should also enable growth at reduced costs, that is not its primary objective. – If your CPCs are not aggregated today, CMP should reduce the cost of adding capacity. – If your CPCs ARE aggregated today, most of the CMP financial benefit will probably come above 2500 MSUs - up to 2500 MSUs, CMLC prices are the same as AWLC. • In return for the greater flexibility that CMP provides, future bills are calculated as a delta off your current bill. • How does this work???
  72. 72. © 2015 IBM CorporationITSO-72 Country Multiplex Pricing • Prior to moving to CMP, IBM calculates 2 baselines for each product: • One is based on the average of the peak R4HA across all your CPCs for the 3 months your last 3 bills are based on – this is called the MSU Base. – Note that this value is arrived at using the same methodology as CMP – the total R4HA for each interval is calculated by summing the R4HA for every LPAR on every CPC. – As a result, this value probably will be different to the values that were used to calculate your bill for those 3 months, but it is consistent with how your bill will be calculated after you move to CMP. • The other baseline is the average of the billed amount ($s) for each of the prior 3 months – this is called the MLC Base. • The % difference between the MLC base and what the price would have been, based on the CMLC rules and tiers is calculated – this is called the MLC Base Factor. • These values will all be documented in your CMLC agreement.
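A minimal Python sketch of the three baseline values follows. Two things here are assumptions rather than statements from the slide: the MLC Base Factor is expressed relative to the CMLC price of the MSU Base (this is the reading under which a bill with no growth equals the MLC Base), and the CMLC price is passed in as a number because the pricing curve itself is not reproduced in this deck. All input figures are hypothetical.

```python
from statistics import mean


def cmp_baselines(monthly_peak_msus, monthly_billed, cmlc_price_of_msu_base):
    """Return (MSU Base, MLC Base, MLC Base Factor) for one product.

    monthly_peak_msus      -- peak R4HA summed across all LPARs on all CPCs,
                              one value per baseline month
    monthly_billed         -- amount actually billed in each of those months
    cmlc_price_of_msu_base -- what the MSU Base would cost on the CMLC curve
                              (supplied by the caller; curve not modelled here)
    """
    msu_base = mean(monthly_peak_msus)
    mlc_base = mean(monthly_billed)
    # Assumed definition: difference as a fraction of the CMLC price
    mlc_base_factor = (mlc_base - cmlc_price_of_msu_base) / cmlc_price_of_msu_base
    return msu_base, mlc_base, mlc_base_factor


# Hypothetical three-month baseline period
msu_base, mlc_base, factor = cmp_baselines(
    [3800, 3827, 3854], [360000, 365000, 370000], 292000
)
print(msu_base, mlc_base, factor)
```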
  73. 73. © 2015 IBM CorporationITSO-73 Country Multiplex Pricing [Chart: MLC Base compared with CMLC Price for Prod A, annotated with the MLC Base Factor (x%)]
  74. 74. © 2015 IBM CorporationITSO-74 Country Multiplex Pricing • Reported MSUs from SCRT Multiplex report for the product = 4,000 • MSU Base = 3,827 1. Price the actual MSUs from monthly Multiplex report on CMLC curve: 4,000 MSUs = $301,995 2. Price the 3,827 MSU Base on CMLC: 3,827 MSUs = $295,514 3. Multiply resulting price by MLC Base Factor to determine Base uplift: $295,514 * .23391 = $69,123 4. Calculate total MLC list price including Base uplift: $301,995 + $69,123 = $371,118
  75. 75. © 2015 IBM CorporationITSO-75 Country Multiplex Pricing • After you move to CMP, your bill is calculated as follows: 1. The peak R4HA is used to calculate what the CMLC price would be. 2. Then the current CMLC price of the MSU Base is calculated. 3. Multiply the answer from 2 by the MLC Base Factor to get the MLC uplift 4. Add 3 to 1 to determine your actual CMLC bill • Let’s look at some scenarios to see how this might affect YOU.
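The four steps can be written as a short Python function. It is a sketch only: the function and argument names are made up for the example, and the two CMLC prices are passed in as inputs because the CMLC pricing curve is not given in this deck. The figures in the call are the ones from the preceding worked example.

```python
def cmlc_bill(cmlc_price_of_peak, cmlc_price_of_msu_base, mlc_base_factor):
    """Combine the four steps on the slide into one calculation.

    cmlc_price_of_peak     -- step 1: CMLC price of this month's peak R4HA
    cmlc_price_of_msu_base -- step 2: current CMLC price of the MSU Base
    mlc_base_factor        -- factor documented in the CMLC agreement
    """
    uplift = cmlc_price_of_msu_base * mlc_base_factor  # step 3
    return cmlc_price_of_peak + uplift                 # step 4


# 4,000 reported MSUs priced at $301,995; 3,827 MSU Base priced at $295,514
bill = cmlc_bill(301995, 295514, 0.23391)
print(round(bill))
```

With these inputs the result lands within a dollar of the $371,118 shown in the worked example; the small difference comes from the factor being quoted to five decimal places.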
  76. 76. © 2015 IBM CorporationITSO-76 CMP Sample Scenarios – Scenario 1: You qualify for sysplex aggregation today and you move to CMP and change NOTHING. – Result: Your bill will not change. – Reasoning: Your CMLC bill is calculated based on the difference between your current Peak R4HA (after you move to CMP) and your MSU Base. If the R4HA is the same as the MSU Base, there is no delta, so your bill stays the same.
  77. 77. © 2015 IBM CorporationITSO-77 CMP Sample Scenarios – Scenario 2: You qualify for sysplex aggregation today then move to CMP and break up shamplexes but everything else stays the same – Result: Your bill will not change. – Reasoning: Again, because your new R4HA is the same as the MSU Base, there is no delta, so your bill stays the same. – Note that if you had done this BEFORE you moved to CMP, your bill would probably have increased dramatically.
  78. 78. © 2015 IBM CorporationITSO-78 CMP Sample Scenarios – Scenario 3: You do NOT qualify for sysplex aggregation today. Then you sign up for CMP and don’t change anything. – Result: Your bill will not change. – Reasoning: Remember that the MSU Base is calculated by summing across all CPCs. The MLC Base depends on whether you were aggregated before, but the MSU Base does not. So, because your new R4HA is the same as the MSU Base, there is no delta, so your bill stays the same. Even though CMP does not require sysplex aggregation, the MLC Base at the time you move to CMP determines your future bills. So, it doesn’t matter if you stay aggregated AFTER you move to CMP, but you want to stay aggregated up until you make the move.
  79. 79. © 2015 IBM CorporationITSO-79 CMP Sample Scenarios – Scenario 4: You do NOT qualify for sysplex aggregation today. Then you sign up for CMP and your configuration changes so that you would have qualified for sysplex aggregation under the old rules. – Result: Your bill will not change. You have the option of moving back to AWLC, but you must stay there for 12 months before moving back to CMP. – Reasoning: It IS possible to move back to AWLC. But we think this is probably not a very likely scenario. There is no incentive to meet the old sysplex aggregation rules after you signed up for CMP, so your systems are likely to move in the opposite direction. Also, the increased cost associated with moving back to AWLC might offset any gains from moving to a lower MLC Base (and remember that the new MLC Base will be based on your configuration and utilization at least 12 months after you move back to AWLC).
  80. 80. © 2015 IBM CorporationITSO-80 CMP Sample Scenarios – Scenario 5: You have 2 priceplexes today. You sign up for CMP. And grow by 1000 MSUs. – Result: Your bill will increase. The amount of the increase is likely to be less than would have been the case if you had grown by the same amount under AWLC. – Reasoning: Each priceplex is likely to be on a steeper part of the pricing curve. When all the processors are in CMP, the peak R4HA will be calculated across all CPCs, very likely resulting in the incremental price per MSU being lower because the configuration is on the flatter part of the pricing curve.
  81. 81. © 2015 IBM CorporationITSO-81 CMP Sample Scenarios – Scenario 6: You have 1 priceplex today. In the middle of the month you move a workload from CPC1 to CPC2. Peak MSUs on CPC1 is 750 MSUs before the move, and the peak R4HA on CPC2 is 750 after the move. Even though the combined peak never exceeds 850 MSUs, the bill would be for 1500 MSUs based on the two peak MSUs. Then you sign up for CMP and make the same move in reverse but everything else remains the same. – Result: Moving the application will not cause your bill to increase. – Reasoning: Because the Peak R4HA is calculated based on the sum of all LPARs across all CPCs, moving a workload from one CPC to another under CMP has the same effect as moving a workload from one LPAR to another prior to CMP.
  82. 82. © 2015 IBM CorporationITSO-82 CMP Requirements • Must be running z/OS V1 or later. • If you sign up for CMP, ALL CPCs in your enterprise in the country that run z/OS must be included. • You can only sign up for CMP if ALL your z/OS CPCs are z196 or later. – To be precise, “Machines eligible to be included in a new Multiplex cannot be older than two generations prior to the most recently available server at the time a client first implements a Multiplex” and “Going forward, any machine to be added to an existing Multiplex must conform to the machine types that satisfy the generation N, N-1, and N-2 criteria at the time that machine is added” • Must use SCRT V23 R10.0 or later (was made available on October 2).
  83. 83. © 2015 IBM CorporationITSO-83 CMP Requirements • Sysplex aggregation considerations: – From the CMP announcement letter: – “Clients with existing sysplexes that use sysplex aggregation pricing and are to become part of a Multiplex must be in compliance with announced sysplex rules prior to entering the Multiplex. Otherwise, the MLC Base will be calculated on a non-aggregated basis. Clients must have submitted a valid Sysplex Verification Package within the prior 12 months. Sysplex aggregation rules and related reporting requirements (SVP) are eliminated under CMP for clients who were sysplex compliant before entering CMP.”
  84. 84. © 2015 IBM CorporationITSO-84 CMP Requirements • Considerations for Outsourcers: – “Clients acting as service providers, using z Systems software to host applications or infrastructure for a third party, may implement CMP only for eligible machines that are dedicated to a particular end-user client. Service providers implementing CMP may have one Multiplex (as defined below) per dedicated end-user client environment within a country. Multi-tenant (non-dedicated) machines or sysplexes are not eligible for CMP.”
  85. 85. © 2015 IBM CorporationITSO-85 Country Multiplex Pricing Recommendations • Ensure that whoever is responsible for your system topology understands the flexibility that CMP introduces. • Your aim should be for all ‘PlatinumPlexes’ – that is, sysplexes that share all system infrastructure data sets (single RACFplex, single SMSplex, single HSMplex, single RMMplex, possibly single JESplex, and so on) plus shared data and applications… Ideally each sysplex would represent a Single System Image to users, and a single point of control to sysprogs and operators. This improves: – Manageability and simplicity (= fewer mistakes and more efficient operations) – Capacity utilization (work can run wherever there is available capacity) – Application availability (if every application runs on at least 2 z/OS systems, outages (planned or unplanned) are masked from users and customers.) • Start by separating systems that have a history of problems or many outages from production systems. • Try to separate developers from production systems – auditors generally much prefer such configurations.
  86. 86. © 2015 IBM CorporationITSO-86 Country Multiplex Pricing Recommendations • From a financial perspective, you want to do everything reasonable to minimize your MLC and MSU baselines because they play such a large role in your monthly bills moving forward: – Agree with IBM which months will be used to calculate your baselines. – Remember that your bill for month N is based on the usage for month N-1. – Don’t choose a time when stress or load testing is being carried out. – Avoid peak business periods. – Make the optimal use of available capping capabilities to reduce peaks: – The important number is the peak R4HA across all LPARs, not the total consumed capacity, so aim to limit peaks and shift non-critical work to quieter periods – flatter peaks and fewer valleys.
  87. 87. © 2015 IBM CorporationITSO-87 Country Multiplex Pricing Recommendations • More: – Do NOT disaggregate BEFORE you switch to CMP!! – If you are not meeting sysplex aggregation criteria today, determine if it would be possible to do so for the 3 months leading up to the switch to CMP. – Move to SCRT 23.10 NOW and ensure that the process is running flawlessly. You don’t want to have one of your 3 months disqualified because of a problem with the SCRT process. – If you are in the middle of an SVC migration, complete it before you move to CMP, or move to CMP before the SVC period runs out. – If you buy a new product after you go to CMP, all use of that product from day one qualifies for CMP rules.
  88. 88. © 2015 IBM CorporationITSO-88 CMP Summary • From a technical perspective, CMP is possibly the biggest leap forward since the introduction of sysplex: – The original intent of sysplex aggregation was great – to incent customers to implement Parallel Sysplex by discounting software to offset the additional hardware costs to use sysplex – sadly that message got twisted over the years, and achievement of the cost reduction became the objective rather than achieving the business advantages that sysplex can provide. – CMP provides the financial benefits of sysplex aggregation without requiring unnatural acts. You can now configure your systems in whatever way delivers the most value and advantage without software cost considerations overriding the technical considerations. • Once you get to CMP, configuring and managing your systems and sysplexes should be much easier and more logical. • Getting the best value from the move requires careful planning, starting at least 6 months in advance. – Your decisions at this time will determine your MLC base, and the MLC base will constitute a large part of your bill for years into the future. So invest now, to save later.
  89. 89. © 2015 IBM CorporationITSO-89 Overall SW Pricing Summary • These new pricing options are intended to reduce the cost of adding new applications to z/OS and extending the use of existing ones. • ALL of them are of interest to system programmers: – MWP and zCAP have an impact on how you manage the capacity available to your LPARs, how you configure your subsystems and LPARs, and even down to which SMF record types you need to collect and keep. – CMP frees you to configure your systems and sysplexes in a way that delivers the maximum business value and improves availability and manageability. • To get the maximum value from your z/OS investment, z/OS sysprogs, subsystem sysprogs, application architects, and contract administrators must all work together. • It is also vital to take time to look at all the options, look at how your applications can exploit them, and then decide on the best topology for your site – ‘haste makes waste’.
  90. 90. © 2015 IBM CorporationITSO-90 z13 Performance Performance and Availability
  91. 91. © 2015 IBM CorporationITSO-91 Introduction • The purpose of this section is not to show you how fast z13 is, but to help you understand what contributes to z13 (and zEC12, and z196, and so on) performance and how you can configure your CPCs, LPARs, and applications to optimize performance. – We will also touch on variability and what you can do to minimize it. • We’ll look at what’s new with z13 in terms of hardware structure and how those changes contribute to the performance you see. • We will also see what you can do to squeeze the most out of your system, which does not necessarily mean using it up to its last drop.
  92. 92. © 2015 IBM CorporationITSO-92 Introduction • What challenges are facing z and all chip manufacturers? – No relief from ever-increasing demands for additional capacity. – Slowing rate of cycle time reductions (Moore’s Law) and flat memory access times. – Increasing volumes of data. – New applications require faster (realtime) processing of more data. – Urgent need for increased data and network security. Speed Capacity Big data, 64-bit Analytics Encryption
  93. 93. © 2015 IBM CorporationITSO-93 z13 Overview • 3 PU chips per node, 2 nodes per drawer, up to 4 drawers. • Up to 8 processor units (cores) per chip, providing up to 141 configurable processor units • SMT2 for zIIPs and IFLs – Includes metering for capacity, utilization, and adjusted chargeback (zIIPs) • z13 clock speed is lower than zEC12 (5.0 GHz vs 5.5), but this is offset by greater parallelism in the processor design. – For example, 2x instruction pipe width, re-optimized pipe depth for power/performance. z13 can decode 6 instrs / cycle compared to 3 / cycle on zEC12. • Improved (reduced) CPI (Cycles per Instruction) • Larger L1, L2, L3, L4 caches. • Concept of LPAR affinities extended from PUs to memory. • z13 supports nearly 3x as much configurable memory as zEC12 – Up from 4TB to 10TB. Continued focus on keeping data "closer" to the processor unit – Ask IBM about 3x and ‘mega’ memory offers. • New SIMD instructions, particularly helpful for analytics • Performance improvements for both CPACF and CryptoExpress (5S replaces 4S) Speed Capacity Big data, 64-bit Analytics Encryption
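The trade-off between the lower clock speed and the improved CPI can be worked through with simple arithmetic. The sketch below uses only the two clock speeds quoted on the slide; the function names are illustrative, and no measured CPI values are assumed.

```python
def instructions_per_second(clock_ghz, cpi):
    """Per-core instruction rate for a given clock (GHz) and cycles per instruction."""
    return clock_ghz * 1e9 / cpi


def break_even_cpi_ratio(new_ghz, old_ghz):
    """CPI ratio (new / old) at which both cores deliver the same instruction rate."""
    return new_ghz / old_ghz


ZEC12_GHZ = 5.5  # from the slide
Z13_GHZ = 5.0    # from the slide

ratio = break_even_cpi_ratio(Z13_GHZ, ZEC12_GHZ)
print(f"z13 matches zEC12 per-core throughput when its CPI is {ratio:.3f}x the zEC12 CPI")
print(f"that is a CPI reduction of about {(1 - ratio) * 100:.1f}%")
```

Any CPI improvement beyond that break-even point shows up as a net per-core gain despite the slower clock.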
  94. 94. © 2015 IBM CorporationITSO-94 z13 PU Chip • Up to eight active cores (PUs) per chip – 5.0 GHz (vs 5.5 GHz on zEC12) – L1 cache per core – L2 cache per core • Single Instruction/Multiple Data (SIMD) • Single thread or 2-way simultaneous multithreaded (SMT) operation • Improved instruction execution bandwidth: – Greatly improved branch prediction and instruction fetch to support SMT – Instruction decode, dispatch, complete increased to 6 instructions per cycle* – Issue up to 10 instructions per cycle* – Integer and floating point execution units • On-chip 64 MB eDRAM L3 cache – Shared by all cores • I/O buses – One GX++ I/O bus – Two PCIe I/O buses • Memory Controller (MCU) – Interface to controller on memory DIMMs – Supports RAIM design • Chip area – 678.8 mm² – 28.4 x 23.9 mm – 17,773 power pins – 1,603 signal I/Os • 14S0 22nm SOI technology – 17 layers of metal – 3.99 billion transistors – 13.7 miles of copper wire * zEC12 decodes 3 instructions and executes 7
  95. 95. © 2015 IBM CorporationITSO-95 z13 PU Core • 2X instruction pipe width – Improves IPC for all modes – Symmetry simplifies dispatch/issue rules – Required for effective SMT • Added FXU and BFU execution units – 4 FXUs – 2 BFUs – 2 DFUs – 2 new SIMD units (VXUs) • SIMD unit plus additional registers • Pipe depth re-optimized for power/performance – Product frequency reduced – Processor performance increased • SMT support – Wide, symmetric pipeline – Full architected state per thread – SMT-adjusted CPU usage metering [Figure: CP chip floorplan with units labelled IFB, ICM, LSU, ISU, IDU, FXU, RU, L2D, L2I, XU, PC, VFU, COP]
  96. 96. © 2015 IBM CorporationITSO-96 z13 Drawer based Topology [Figure: fully populated drawer with Node 0 and Node 1, each showing PU chips, an SC chip and memory DIMMs; X-Bus inside each node, S-Bus between the two SC chips, A-Bus to other drawers] • Physical node (two per drawer): – Three PU chips, one SC chip – RAIM memory – Memory controllers are in the PU chips – Five DDR3 DIMM slots per controller: either 20 or 25 DIMMs per drawer • SC and CP chip interconnects – X-bus: SC and CPs to each other (same node) – S-bus: SC to SC chip in the same drawer – A-bus: SC to SC chips in the remote drawers
  97. 97. © 2015 IBM CorporationITSO-97 zEC12 Book based Topology • Fully connected 4 Book system: – 120* total cores – Total system cache: – 1536 MB shared L4 (eDRAM) (5632) – 576 MB L3 (eDRAM) (1536) – 144 MB L2 private (SRAM) (564) – 19.5 MB L1 private (SRAM) (31.5) [Figure: one book with CP0 to CP5, SC0, SC1, Mem0 to Mem2 and FBCs] * Of the maximum 144 PUs only 120 are used
  98. 98. © 2015 IBM CorporationITSO-98 Comparing the z13 structure with zEC12 • The z13 hardware structure is significantly different from that of z10/z196/zEC12 – the step from zEC12 to z13 is similar to the step from z9 to z10: – Every time System z has a major new design, some workloads will benefit more than others. – The generations that are incremental refinements (e.g. zEC12 over z196) have less variability because they do things the same way, only faster. • z13 has direct point-to-point connectivity among processors in the same node. This was not available in the previous design. • z13 has a fast bus (S-Bus) connecting the two nodes within the same drawer. This makes intra-drawer communication very efficient. • z13 lacks the any-to-any node connectivity that was available in the previous design. This makes communication between opposite nodes in different drawers (aka ‘far nodes’) less efficient than in the past. • The new structure is needed to accommodate a larger number of processors (up from 101 to 141) and provide growth.
  99. 99. © 2015 IBM CorporationITSO-99 Relevance of Nest Performance [Chart: zSeries CPI history, 9672 to zEC12; y-axis: cycles per instruction (CPI), split into on-core and off-core components]
  100. 100. © 2015 IBM CorporationITSO-100 Impact of Relative Nest Intensity 50,000 MIPS difference! 30 engine difference!
  101. 101. © 2015 IBM CorporationITSO-101 Relevance of Nest Performance – the z196 example • Cache latency for a z196 system (1, 4, 12, 32, 77 are relative access times) • Dispatching without HiperDispatch: – PR/SM dispatching attempts to re-dispatch a logical processor on the same physical processor but can’t guarantee that – In z/OS all logical processors select work from the same work unit queue, therefore it is completely unpredictable where a Unit of Work gets processed
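The relative access times above can be turned into a rough average cost per access. This is a sketch only: the five values come from the slide, but treating them as five tiers ordered from nearest to farthest, and the two sourcing profiles, are assumptions made for illustration.

```python
# Relative access times quoted on the slide, assumed ordered nearest to farthest.
RELATIVE_ACCESS_TIME = [1, 4, 12, 32, 77]


def average_access_cost(sourcing_fractions):
    """Weighted average relative access time.

    sourcing_fractions: fraction of accesses satisfied at each tier,
    in the same order as RELATIVE_ACCESS_TIME; must sum to 1.
    """
    if len(sourcing_fractions) != len(RELATIVE_ACCESS_TIME):
        raise ValueError("one fraction per tier is required")
    if abs(sum(sourcing_fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    return sum(f * t for f, t in zip(sourcing_fractions, RELATIVE_ACCESS_TIME))


# Hypothetical profiles: work that stays near its caches vs work that wanders.
stays_local = [0.90, 0.06, 0.03, 0.008, 0.002]
wanders = [0.80, 0.08, 0.06, 0.04, 0.02]

print("average cost, local work:    ", round(average_access_cost(stays_local), 2))
print("average cost, wandering work:", round(average_access_cost(wanders), 2))
```

A small shift of accesses toward the far tiers more than doubles the average cost in this example, which is why unpredictable dispatching hurts.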
  102. 102. © 2015 IBM CorporationITSO-102 HiperDispatch design objectives and implementation • HiperDispatch was introduced with z10 • The objective is to keep work as local as possible to a physical processor to optimize the usage of the processor caches. Expected result: – Cache reloads should occur much less frequently – Cache misses and fetches from other books (and chips) should be avoided as much as possible • Implemented through the interaction between z/OS and PR/SM to optimize work unit and logical processor placement to physical processors. Consists of 2 parts: – One in z/OS (aka Dispatcher Affinity) because it attempts to create a temporary affinity between work and processors – One in PR/SM (aka Vertical CPU Management) because it attempts to assign physical processors exclusively to logical processors as much as possible
  103. 103. © 2015 IBM CorporationITSO-103 z13 PR/SM Enhancements – Memory affinity • Memory affinity added to PR/SM on z13 – Tries to allocate memory for each LPAR within just one drawer – Makes a 2-node drawer look like one memory-node – Dispatch logical processors for each LPAR on same drawer as memory – Important side effect: Drawer-based L4 cache affinity – Re-arrange memory and processor allocation as needed to maintain affinity – LPAR activation / de-activation / size change / Config CP ON/OFF, IRD – Hardware design supports high-performance memory re-assignment – Builds on existing Enhanced Drawer Availability function • Memory affinity smooths performance behavior – Minimal cross-drawer data traffic in steady-state operations – Almost all LPARs expected to fit within single z13 drawer – Drawers can have up to 2.5TB of memory on z13
  104. 104. © 2015 IBM CorporationITSO-104 zEC12 and z13 Cache Hierarchy [Figure: per-core L1 and L2 caches feeding a per-chip L3 cache, then the shared L4 cache and memory] • zEC12 single book view: – L1 private 64k I, 96k D – L2 private 1 MB I + 1 MB D – L3 shared 48 MB / chip – L4 shared 384 MB / book • z13 single drawer view: – L1 private 96k I, 128k D – L2 private 2 MB I + 2 MB D – L3 shared 64 MB / chip – L4 shared 480 MB / node – plus 224 MB NIC
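As a cross-check, the per-unit z13 cache sizes can be multiplied out to system totals. The sketch below combines the sizes on this slide with the topology from the z13 Overview slide (3 PU chips per node, 2 nodes per drawer, 4 drawers, 141 configurable cores); counting only configurable cores for L1 and L2 is an assumption.

```python
# Topology from the z13 Overview slide.
DRAWERS = 4
NODES_PER_DRAWER = 2
CHIPS_PER_NODE = 3
CONFIGURABLE_CORES = 141  # assumption: only configurable cores are counted

# Per-unit cache sizes from the z13 single drawer view.
L1_KB_PER_CORE = 96 + 128  # instruction + data
L2_MB_PER_CORE = 2 + 2     # instruction + data
L3_MB_PER_CHIP = 64
L4_MB_PER_NODE = 480
NIC_MB_PER_NODE = 224

nodes = DRAWERS * NODES_PER_DRAWER
chips = nodes * CHIPS_PER_NODE

totals = {
    "L1 (KB)": CONFIGURABLE_CORES * L1_KB_PER_CORE,
    "L2 (MB)": CONFIGURABLE_CORES * L2_MB_PER_CORE,
    "L3 (MB)": chips * L3_MB_PER_CHIP,
    "L4 + NIC (MB)": nodes * (L4_MB_PER_NODE + NIC_MB_PER_NODE),
}

for level, size in totals.items():
    print(f"{level}: {size}")
```

Under these assumptions the L2, L3 and L4 totals come out at 564, 1536 and 5632 MB, the same values shown in parentheses on the zEC12 topology slide.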
  105. 105. © 2015 IBM CorporationITSO-105 Workload’s Relative Nest Intensity • A workload’s performance is sensitive to how deep into the memory hierarchy the processor must go to retrieve the workload’s instructions and data for execution. Best performance occurs when the instructions and data are found in the cache(s) nearest the processor (remember those relative access times on the earlier slide). • To identify a workload profile, IBM introduced a new term, “Relative Nest Intensity (RNI)”, which indicates the level of activity to shared cache and memory resources (L3, L4, memory). The higher the RNI, the deeper into the memory hierarchy the processor must go to retrieve the instructions and data for that workload.
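The idea behind RNI can be sketched as a weighted sum of how often L1 misses are sourced from each shared level. The weights below are hypothetical placeholders chosen only to show the shape of the calculation; the published IBM coefficients differ per machine generation and are not reproduced here.

```python
# Hypothetical weights: deeper levels cost more. NOT the published IBM coefficients.
ILLUSTRATIVE_WEIGHTS = {"L3": 1.0, "L4": 2.0, "memory": 4.0}


def nest_intensity(sourced_pct, weights=ILLUSTRATIVE_WEIGHTS):
    """Weighted sum of the percentage of L1 misses sourced from each shared level.

    sourced_pct: mapping of level name to percentage (0 to 100).
    """
    return sum(weights[level] * sourced_pct[level] for level in weights) / 100.0


# Hypothetical workload profiles.
cache_friendly = {"L3": 3.0, "L4": 1.0, "memory": 0.5}
cache_hungry = {"L3": 6.0, "L4": 4.0, "memory": 3.0}

print("friendly:", nest_intensity(cache_friendly))
print("hungry:  ", nest_intensity(cache_hungry))
```

Whatever the real coefficients are, the ordering is what matters: a workload that reaches further into the nest gets the higher value.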
  106. 106. © 2015 IBM CorporationITSO-106 A system’s Relative Nest Intensity varies with the workload Sample customer data – not z13
  107. 107. © 2015 IBM CorporationITSO-107 Why is understanding YOUR RNI so important? RNI causes significant variability in effective capacity, and z13 is more sensitive to RNI than zEC12.
  108. 108. © 2015 IBM CorporationITSO-108 The importance of assigning the right LPAR weight In HD mode, LPARs use Vertical Low logical processors to consume above guaranteed capacity Sample customer data – not z13
  109. 109. © 2015 IBM CorporationITSO-109 The importance of assigning the right LPAR weight • The previous chart shows a week’s worth of data about the CPU consumption of a production system – GSY7. In the chart, the blue line represents the processing capacity assigned to the LPAR based on its weight. In many intervals GSY7 uses more than its guaranteed capacity. In HiperDispatch mode this is done using Vertical Low logical processors. These processors use what’s left by other LPARs and can be dispatched on any available physical processor. • For this reason, Vertical Low logical processors, depending on the workload’s relative nest intensity, show less cache efficiency. This is reflected in their CPI. • Let us see the effect of cache efficiency on CPI using some customer data.
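The guaranteed capacity that the weight provides can be calculated as the LPAR's share of the shared physical processor pool. A minimal sketch follows; the weights, the pool size and the LPAR names other than GSY7 are hypothetical.

```python
def guaranteed_processors(weights, physical_cps):
    """Guaranteed share of a shared processor pool, in physical-processor units.

    weights: mapping of LPAR name to its processing weight.
    physical_cps: number of shared physical CPs in the pool.
    """
    total_weight = sum(weights.values())
    return {lpar: physical_cps * weight / total_weight
            for lpar, weight in weights.items()}


# Hypothetical configuration: three LPARs sharing 10 physical CPs.
weights = {"GSY7": 500, "LPAR_B": 300, "LPAR_C": 200}
share = guaranteed_processors(weights, physical_cps=10)

for lpar, cps in share.items():
    print(f"{lpar}: guaranteed {cps:.1f} CPs")
```

Demand above this figure is what has to be satisfied by Vertical Low logical processors.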
  110. 110. © 2015 IBM CorporationITSO-110 Impact of data sourcing on CPI – Vertical high processor Sample customer data – not z13 CPI – The lower the better
  111. 111. © 2015 IBM CorporationITSO-111 Impact of data sourcing on CPI – Vertical high processor • Vertical high logical processors are always dispatched on the same physical processor. This increases the efficiency of L1 and L2 caches, which are private to each PU (Processing Unit), and the L3 cache, which is located in the multi-core PU chip. • The previous chart shows how the CPI mainly depends on L1 cache efficiency, but also shows how, for vertical high logical processors, most of the data needed to keep processing is sourced by L1, L2 and L3, which are closer to the processor. • This is the effect of the persistent affinity generated by HiperDispatch for vertical high processors.
  112. 112. © 2015 IBM CorporationITSO-112 Impact of data sourcing on CPI – Vertical medium processor Sample customer data – not z13 CPI – The lower the better
  113. 113. © 2015 IBM CorporationITSO-113 Impact of data sourcing on CPI – Vertical medium processor • Vertical medium logical processors are assigned a home physical processor of which they own a significant share. However, unlike vertical highs, they can be dispatched elsewhere by PR/SM should the home physical processor be busy when needed. • PR/SM knows the CEC’s hardware topology, and keeps track of where logical processors have been previously dispatched. This allows it to try to maximize cache efficiency when it needs to dispatch a logical processor on a PU different than its home one. • In the previous chart we see that the medium processor has less L3 cache efficiency than the vertical high one, but that it enjoys a good L4 efficiency. L4 is shared by PUs in the same Book / Drawer.
  114. 114. © 2015 IBM CorporationITSO-114 Impact of data sourcing on CPI – Vertical low processor Sample customer data – not z13 CPI – The lower the better
  115. 115. © 2015 IBM CorporationITSO-115 Impact of data sourcing on CPI – Vertical low processor • Vertical low logical processors are usually parked and are not used until the LPAR needs more capacity than it is allowed by its relative share. Vertical low processors are dispatched wherever there are available cycles (in any drawer). This results in them having lower cache hit rates AND in polluting caches of other logical processors. Because it is difficult for PR/SM to maximize cache efficiency for vertical low logical processors, their RNI (and hence their performance) tends to be much less consistent than vertical mediums or vertical highs. • In the previous chart you can see how vertical low processors show less cache efficiency from shared caches (L3 and L4) because they keep moving between chips and drawers. Their CPI is highly dependent on L1 and L2, which in turn depend on the data locality of the workload.
  116. 116. © 2015 IBM CorporationITSO-116 Things you may consider to maximize performance • Be aware of your workload’s cache profile: use CPU MF (SMF Type 113 records) data to determine it, and tools such as zPCR or SAS/MXG to plot and monitor its use of cache. • Assign your LPARs the right processor weight. Try to make sure that vertical low logical processors are seldom used. – If possible assign a processor weight that makes PR/SM use as many Vertical high processors as possible. Use Alain Maneville’s excellent LPAR Design tool to plan your LPAR configuration. • Try not to saturate your physical processing capacity. If possible, over-provision it, as lower CPU utilization brings more efficiency and has a potential for cost reduction. See IBM White Paper WP101208 – ‘Running IBM System z at High Utilization’ by Gary King. – Also consider the use of subcapacity models.
  117. 117. © 2015 IBM CorporationITSO-117 Relationship between CPU Util and CPU per Txn [Chart: Impact of CPU Utilization on Txn CPU Time; CPU consumption per transaction against CPU utilization for Measurements 1 to 4, taken from an actual customer production environment; lower CPU utilization goes with lower CPU per transaction, higher CPU utilization with higher CPU per transaction]
  118. 118. © 2015 IBM CorporationITSO-118 Things you may consider to maximize performance • Try to minimize cache disruptions due to interrupts. Larger memory configurations allow for fewer I/Os and better RNIs. – Consider using DB2 page-fixed Buffer Pools and large (1MB) pages. – z13 supports larger memory configurations. But try to ensure that each LPAR’s memory fits in a single drawer. • Increase TLB efficiency by using Large Pages. This is especially important when moving to larger memory configurations. – 1MB pages use a separate TLB and take pressure off the 4KB TLB. – Make sure you have enough real memory to avoid RSM breaking up large pages to back 4K ones. • If using sysplex data sharing, aim to maximize proportion of synchronous requests. Make use of the fastest available link technology. – ICA on z13 and CIB 12X-IFB3 on zEC12 for short distances, CIB 1X for long distances.
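The TLB benefit of large pages follows from simple arithmetic: the same storage needs far fewer translations. The sketch below counts pages for a hypothetical 4 GB buffer pool; the pool size is an assumption chosen for illustration.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def pages_needed(region_bytes, page_bytes):
    """Number of pages (and so distinct translations) needed to map a region."""
    return -(-region_bytes // page_bytes)  # ceiling division


buffer_pool = 4 * GIB  # hypothetical buffer pool size

small_pages = pages_needed(buffer_pool, 4 * KIB)
large_pages = pages_needed(buffer_pool, 1 * MIB)

print("4 KB pages:", small_pages)
print("1 MB pages:", large_pages)
print("reduction factor:", small_pages // large_pages)
```

Each 1 MB page replaces 256 4 KB translations, which is why large pages matter more as memory configurations grow.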
  119. 119. © 2015 IBM CorporationITSO-119 z13 Memory Location Source: z13 Technical Guide -
  120. 120. © 2015 IBM CorporationITSO-120 Preparing to measure z13 Efficiency • We’ve seen how caching efficiency is key to z13 processor performance. The previous charts are produced using hardware instrumentation data (CPU MF Counters). IBM recommends activating CPU MF (counters) and keeping the SMF 113 records. Collecting counters has negligible CPU cost and provides invaluable insights. If you haven’t activated them yet just DO IT! Here are some links with additional information and instructions: • z/OS CPU MF Enablement Education • Collecting CPU MF (Counters) on z/OS – Detailed Instructions • IBM Redpaper Setting Up and Using the IBM System z CPU Measurement Facility with z/OS, REDP-4727
  121. 121. © 2015 IBM CorporationITSO-121 z13 CPU MF enhancements • On the z13, CPU MF uses the same metrics as previous processors – New formulas. – zEC12 RNI formula also updated (the RNI formula gets updated with every new CPC) • New “Miss” cycles measurement for L1 cache provides more insights (SCPL1M in John’s paper referenced below). • On z13, CPU MF provides metrics at Logical Processor or Thread level – When running SMT 1 CPU MF Counters are provided at Logical Processor level – When running SMT 2 CPU MF Counters are provided at Thread level • See John Burg’s SHARE presentation for details Also attached to the back of this PDF (thanks John!)
  122. 122. © 2015 IBM CorporationITSO-122 Understanding your LPARs’ topology • Assigning proper weights lets you influence the number of vertical high/medium processors an LPAR will use. – Use Alain Maneville’s LPAR Design tool. • In a z13 configuration, aim to have all logical processors and memory for a given LPAR fit in a single drawer. • To see how successful you are with this, you need to know how physical memory and Processor Units are distributed across drawers, and how PR/SM allocates home PUs to logical processors. – Logical processor topology information can be obtained by using SMF99.14. – There is no way to determine actual in-use memory in each drawer.
  123. 123. © 2015 IBM CorporationITSO-123 z13 Processor Unit (Core) Location • PUs can be purchased as CPs, IFLs, Unassigned IFLs, zIIPs, ICFs or Additional SAPs − CPs and zIIPs initial placement in 1st drawer working up − IFLs and ICFs initial placement in highest drawer working down − zIIP to CP purchase ratio is 2:1 − Additional SAPs + Permanent SAPs may not exceed 32 − Any un-configured PU can act as an additional Spare PU
    Per-drawer values are Cust PUs / SAPs / IFP / Spare
    Model  Cust PUs  1st Drawer  2nd Drawer  3rd Drawer  4th Drawer
    NE1    141       34/6/1/1    35/6/0/1    36/6/0/0    36/6/0/0
    NC9    129       31/6/1/1    32/6/0/1    33/6/0/0    33/6/0/0
    N96     96       31/6/1/1    32/6/0/1    33/6/0/0
    N63     63       31/6/1/1    32/6/0/1
    N30     30       30/6/1/2
  124. 124. © 2015 IBM CorporationITSO-124 What is SMF 99.14 • SMF 99 Subtype 14 contains HiperDispatch Topology data for this LPAR, including: – Logical Processor characteristics: Polarization (VH, VM, VL), Affinity Node, etc. – Physical topology information – Logical Processors allocation to zEC12 Books / Chips – Logical Processors allocation to z13 Drawers / Nodes / Chips • Low volume recording - Written every 5 minutes or when topology changes • Recommend collecting them FROM EVERY LPAR to help understand why performance changed
  125. 125. © 2015 IBM CorporationITSO-125 The WLM Topology Reporter New WLM Topology Report available to process SMF 99 subtype 14 records Steps: 1. Download tool from web site above 2. Collect SMF 99 Subtype 14 records 3. Run provided host program to create topology file in CSV format 4. Download topology file to workstation 5. Load it into provided Excel spreadsheet to generate topology reports
  126. 126. © 2015 IBM CorporationITSO-126 WLM Topology Reporter
  127. 127. © 2015 IBM CorporationITSO-127 WLM Topology Reporter - Spreadsheet 1 – Create copy of current spreadsheet 2 – Open CSV file containing SMF99.14 data 3 – Select Interval to be analyzed 4 – Copy data into main sheet 5 – Create Report
  128. 128. © 2015 IBM CorporationITSO-128 WLM Topology Reporter – Interpreting the results
  129. 129. © 2015 IBM CorporationITSO-129 WLM Topology Reporter – Sample use case • In the following slides, we’ll see an example of a Topology Report and review how it can be used to understand what happens during a system’s reconfiguration. • To do this we started with a z13 LPAR configuration using 6 dedicated CPs and 8 dedicated zIIPs and dynamically varied online another zIIP. – NOTE: While the Type 113 records provide information at the thread level (when running in SMT2 mode), the 99.14 records are at the core level.
  130. 130. © 2015 IBM CorporationITSO-130 Starting configuration – 6 CPs and 8 zIIPs (all dedicated) SMFID Affinity Node Polarity CP Type CPU Num Note: Topology report shows CPU Num in decimal, RMF shows it in hex.
  131. 131. © 2015 IBM CorporationITSO-131 Adding a zIIP engine
  132. 132. © 2015 IBM CorporationITSO-132 PR/SM reaction – Dynamic Processor Reassignment After a while (up to 5 mins), PR/SM performs dynamic processor reassignment to move the last added zIIP to the same node where other processors of the same system reside
  133. 133. © 2015 IBM CorporationITSO-133 z13 Performance topics summary • Be familiar with your workload’s cache profile so that you can spot unexpected changes or the impact of tuning efforts. – Collect CPU MF Counters, use CPU MF data to determine your workload profile. • Assign your LPARs the right processor weight to maximize use of VH and VM CPs. – Get familiar with the LPAR Design Tool, SMF99.14 and the WLM Topology Reporter • If possible, over-provision CPU capacity as it can bring more efficiency. • Exploit larger memory configurations and attractive pricing to reduce I/O and improve RNI. • Implement Large Pages to increase TLB efficiency. • For sysplex data sharing, make use of the fastest available link technology.
  134. 134. © 2015 IBM CorporationITSO-134 New z13 Single Instruction Multiple Data instructions Performance and Availability
  135. 135. © 2015 IBM CorporationITSO-135 Introduction • z13 is the first System z CEC providing specialized hardware to improve the performance of complex mathematical models and analytic workloads through vector processing and new complex instructions, which can process multiple data items with only a single instruction. • This section will give you introductory information about SIMD, including motivations, implementation, exploiters, and performance.
  136. 136. © 2015 IBM CorporationITSO-136 SIMD - Single Instruction Multiple Data - Overview • Motivation / Background – The amount of data is increasing exponentially - IT shops need to respond to the diversity of data – Enterprises use traditional integer and floating point data, but now also string and XML character-based data – As the volume of data from operational systems continues to increase, it becomes more important to be able to perform the computations and analytics closer to the data • SIMD Objective – Leverage data intensity and be competitive with large data volumes; compete by doing more operations on a given byte of data, extract more interesting insight. • Use Cases – Reporting functions: Querying and populating reports, often in batch fashion to process lots of data quickly – Numerically intensive processing, e.g. time forecasting, simulation – Modelers, matrix intensive computations
  137. 137. © 2015 IBM CorporationITSO-137 Single Instruction Multiple Data (SIMD) Vector Processing [Figure: two instruction pool / data pool / results diagrams; software stack showing workloads on Java.Next, C/C++ compiler built-ins for SIMD operations (z/OS and Linux on z Systems) and MASS & ATLAS math libraries (z/OS and Linux on z Systems), all on top of the SIMD registers and instruction set] MASS - Mathematical Acceleration Sub-System; ATLAS - Automatically Tuned Linear Algebra Software • A type of data parallel computing that can accelerate operations on integer, string, character, and floating point data types • Provide optimized SIMD compilers and libraries that will minimize the effort on the part of middleware/application developers • Operating System/Hypervisor Support: − z/OS: 2.1 SPE available at GA − Compiler exploitation • IBM Java V8 => 1Q2015 • XL C/C++ on z/OS => 1Q2015 • XL C/C++ on Linux on z => 2Q2015 • Enterprise COBOL => 1Q2015 • Enterprise PL/I => 1Q2015 − Linux: IBM is working with its Linux Distribution partners to support new functions/features − No z/VM Support for SIMD
  138. 138. © 2015 IBM CorporationITSO-138 SIMD (Single Instruction Multiple Data) conceptual view • [Significantly] smaller amount of code, hence improved execution efficiency • Number of elements processed in parallel = (size of SIMD / size of element)
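The element-count formula can be worked through for each element size. The sketch below assumes a 128-bit vector register, which is inferred from the "16 x Byte ... 1 x QW" figures on the SIMD Hardware Accelerator slide; the function names and the 1,000-character example are illustrative.

```python
VECTOR_REGISTER_BITS = 128  # assumption inferred from 16 x Byte = 1 x QuadWord

ELEMENT_BITS = {
    "byte": 8,
    "halfword": 16,
    "word": 32,
    "doubleword": 64,
    "quadword": 128,
}


def elements_per_instruction(element_bits, vector_bits=VECTOR_REGISTER_BITS):
    """Number of elements processed in parallel = size of SIMD / size of element."""
    if vector_bits % element_bits != 0:
        raise ValueError("element size must divide the vector size")
    return vector_bits // element_bits


def vector_operations_needed(element_count, element_bits):
    """Vector operations needed to touch every element once (ceiling division)."""
    lanes = elements_per_instruction(element_bits)
    return -(-element_count // lanes)


for name, bits in ELEMENT_BITS.items():
    print(f"{name:10s}: {elements_per_instruction(bits)} elements per instruction")

# A 1,000-character single-byte string: scalar steps vs vector steps.
print("scalar steps:", 1000)
print("vector steps:", vector_operations_needed(1000, ELEMENT_BITS["byte"]))
```

This counts operations only; real speed-ups also depend on alignment, loop overhead and how the compiler or JIT generates the vector code.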
  139. 139. © 2015 IBM CorporationITSO-139 SIMD Hardware Accelerator • Operates on three distinct data types: • Integer: 16 x Byte, 8 x HW, 4 x W, 2 x DW, 1 x QW – Byte to QuadWord add, sub, compare – Byte to DoubleWord min, max, ave. – Byte to Word multiply, multiply/add (4 - 32 x 32 multiply/adds) – Logical ops, shifts – CRC (GF multiply up to 64b), Checksum (32b) – Loads efficient with 8B alignment though minor penalties for byte alignment – Gather by Step • String – Find 8b, 16b, 32b, equal or not equal with zero character end – Range compare – Find any equal – Load to block boundary, load/store with length • Floating-point: BFP DP only, 32 x 2 x 64b – 2 BFUs with an increase in architected registers – Exceptions suppressed
  140. 140. © 2015 IBM CorporationITSO-140 Single Instruction Multiple Data • Quick recap – the following pictures illustrate the principle of Single Instruction Multiple Data (SIMD): When I first heard that z13 was going to implement SIMD, I didn't see the value for business applications in it, since I only knew about SIMD advantages in scientific applications like image processing, for example – but I was wrong…
  141. 141. © 2015 IBM CorporationITSO-141 Single Instruction Multiple Data and string processing • SIMD is very well suited whenever one has to process large arrays of data of the same type, which also means large arrays of character data – also known as strings • Character array: • Situations when processing on character arrays occurs: – String comparison – Single character / substring search – String conversion • All these operations are heavily used by [Java] application programmers
  142. 142. © 2015 IBM CorporationITSO-142 Java acceleration with SIMD IBM z13 running Java 8 on z/OS® Single Instruction Multiple Data (SIMD) vector engine exploitation java.lang.String exploitation - compareTo - compareToIgnoreCase - contains - contentEquals - equals - indexOf - lastIndexOf - regionMatches - toLowerCase - toUpperCase - getBytes java.util.Arrays - equals (primitive types) String encoding converters For ISO8859-1, ASCII, UTF8, and UTF16 - encode (char2byte) - decode (byte2char) Auto-SIMD - Simple loops (eg. matrix multiplication)
  143. 143. © 2015 IBM CorporationITSO-143 Java Sample – Read a large text file into a string 10,000 times
  144. 144. © 2015 IBM CorporationITSO-144 Java Sample – Same as before plus perform case conversion The toLowerCase method converts the text string to lower case. When running on z13, Java 8 exploits SIMD to do it.
  145. 145. © 2015 IBM CorporationITSO-145 Java Sample – CPU time comparison zEC12 vs z13 z13 for Java enhancements & SIMD Effect z13 for Java enhancements Specific test case, your mileage MAY vary !
  146. 146. © 2015 IBM CorporationITSO-146 SIMD Migration, and Fallback Considerations • This is new functionality and code will have to be developed to take advantage of it • Some mathematical function replacement can be done without code changes by inclusion of the scalar MASS library before the standard math library – Different accuracy for MASS vs. the standard MATH library – IEEE is the only mode allowed for MASS – Migration Action: Assess the accuracy of the functions in the context of the user application when deciding whether to use the MASS and ATLAS libraries LOADxx “MACHMIG VEF” can be used to disable SIMD at IPL time