A few days ago, I was in a store, watching a store employee try to use simple technology to record the details about my purchase and me. Oh, it was painful! (I think the word is excruciating.) I found myself getting increasingly frustrated and muttering, "Get out of the way; I can do that quicker myself!" This may have nothing to do with the store employee; it could just be the signs of age creeping up on me. I have felt the same at airports when changing a flight -- just what are the ticket agents typing for all that time? I have a theory that they are doing their weekly shopping online while I stand there, trying to remain pleasant (maybe another age thing, but I am finding it increasingly difficult to keep cool). Get out of the way; I can do that quicker myself! It is frustrating when there is a barrier between technology and me, which is why I love self-check-in, either at the airport or online, because I get to use the technology -- hooray! I also like to shop online because I get to use the technology -- hooray! And I like to fill up my car at an automated gas pump because I get to use the technology -- hooray! Putting technology in front of the front office has been the most significant development in the evolution of IT in a long time. We IT professionals just love it (being at the forefront of business, that is). However, let's spare a thought for the poor IT customer who is responsible for the business side of the IT service. Imagine being responsible for traditional airline check-in when the technology fails. Fortunately, there are agents at the check-in desks to handle the situation. In fact, not long ago, I was manually checked in at Dublin airport because of a computer fault -- there was hardly any delay, and the situation was handled with typical Irish charm. (Funny how much more impressive it is when problems are handled well than when things simply work in the first place.)
Now, imagine being a business manager responsible for automated check-in: What do you do if the technology fails? You won't have lots of agents on hand, because the technology has replaced them. What would have happened at Dublin airport? It makes me shiver to think about it. In such circumstances, time is of great importance and knowledge is critical. That is where BSM comes in. Unfortunately, IT has become the new frustration barrier. For example, many gas stations are becoming totally automated, and we are beginning to see the pay booth disappear completely, yet the business managers are reliant on IT to update them on failures. Can you imagine how frustrating it is for a business manager responsible for the automated gas pumps to have to wait for a status update from IT? Get out of the way; I can do that quicker myself! We must start to hand control of some of the tools that IT has used to manage the infrastructure over to the business. For example, business managers want to see the status of all automated gas pumps themselves, or see which automated check-in devices have failed, or be able to order their new technology online, enter change requests online, check the status of those change requests, or look at their SLAs online. In other words, they want to control their own resources. Get out of the way; I can do that quicker myself! The technology is available to allow the business to drive its resources itself; it is the IT mindset that must change. The IT job is to manage the infrastructure, not operate the business.
Many of us know how extra weight can just creep up on us. A little nibble here, a little nibble there, and all of a sudden, our clothes don't fit. Well, Capacity Management can be just the same -- a few extra records here, a little more bandwidth there, a little more memory there, and all of a sudden, the service is running out of capacity. Dieting is the human equivalent of Capacity Management, but you know, Capacity Management is also, in theory, a simple process that consists of three basic questions: What have I got? What is coming? And, will it fit? Simple questions, but devilishly difficult to answer. Let's look at Capacity Management from a different angle. Pretend that you have borrowed a car from a friend, but there is one tiny fault with the car that will cause problems for you. What could the failure be? Simple: the gasoline gauge is not working, and, as a result, you have no idea how much gas there is in the car. So, what do you do? You fill up the car with gas even though the tank is over half full, and continue to fill up the car at regular intervals rather than risk running out of gas. The result is wasted time filling the tank and the expense of always carrying excess gas in the tank. Too often, this is how IT addresses capacity -- by over-investing, and not regularly measuring capacity volumes (for example, utilizing a high-capacity server for a single service of low capacity). In my car, in addition to a gas gauge, I have a warning indicator that comes on when the gas tank has enough gas left to take the car about 55 miles. Think of it -- this means that I get a warning when I have reached a threshold. Again, the parallel is clear: IT needs to set planned thresholds that are within the capacity limits, and to enable indicators that show immediately when any of those thresholds are breached. Suppose you had to take seven people on a 100-mile journey; which would deliver the people first: the fast, two-seater sports car or a seven-seat van?
It is the tortoise and the hare all over again. Capacity Management is not just a case of managing volume, but of using the correct technologies to manage volume and performance. That way, when that 55-mile indicator light comes on, and I don't know where the next gas station is located, I can reduce my speed to extend the remaining range. Therefore, the challenge is to manage not just capacity, but also performance, because the two are so closely related. We must also remember that capacity covers all IT infrastructure items, from disk, to bandwidth, and even to staffing levels. Most important, it is the capacity of a service that matters, not just the capacity of, say, a server. IT has often closely monitored the capacity of specific technology components (e.g., mainframes and servers), but has forgotten that it is the capacity of the end-to-end service that matters to the customer. For example, if the capacity of the Stock Control Service is great, but a customer's workstation does not have the required memory, the customer will be unhappy because the end-to-end service is slow. IT has for too long ignored Capacity Management as a practice. However, with ever-tightening budgets, increasing demands for performance, governance pressure, and business demands, capacity is fast becoming a concern for IT. We need to remember those simple questions -- What have I got? What is coming? Will it fit? -- and understand that it is the capacity of services that must be managed.
Monitoring: The activities shown above are employed by each of the sub-processes of Capacity Management. The major difference between the sub-processes is in the data that is being monitored and collected, and the perspective from which it is analyzed. For example, the level of utilization of individual components in the infrastructure is of interest to Resource Capacity Management, while transaction throughput rates and response times are of interest to Service Capacity Management. For Business Capacity Management, the transaction throughput rates for the online service need to be translated into business volumes (e.g., in terms of sales invoices raised or orders taken). Monitoring, analysis, tuning, and implementation form a natural cycle that is continuously iterated by each of the sub-processes. Monitors should be established on all the components and for each of the services. The data should be analyzed and the results included in reports and recommendations. Some form of tuning may then be put in place to act on the recommendations.
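The monitor/analyze/tune cycle described above can be sketched minimally in code. This is only an illustration of the idea; the metric names and threshold values are invented for the example, not taken from any BMC product.

```python
# Illustrative sketch of one pass of the monitor -> analyze -> tune cycle.
# Metric names and planned thresholds are hypothetical.

THRESHOLDS = {"cpu_util": 0.80, "disk_util": 0.75}  # planned, within capacity limits

def analyze(samples):
    """Return only the monitored metrics that breach their planned threshold."""
    return {metric: value for metric, value in samples.items()
            if value > THRESHOLDS.get(metric, 1.0)}

def tune(breaches):
    """Turn threshold breaches into recommendations (a report, in practice)."""
    return [f"{metric} at {value:.0%} exceeds planned threshold"
            for metric, value in breaches.items()]

# One monitoring interval's worth of data:
samples = {"cpu_util": 0.87, "disk_util": 0.60}
print(tune(analyze(samples)))
```

Each sub-process would run the same loop over different data: component utilization for Resource Capacity Management, transaction rates for Service Capacity Management, business volumes for Business Capacity Management.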
Summary: Enterprise Performance Assurance is an automatically managed, closed-loop methodology that continually measures the resources needed to support business services and builds a historical database that can be analyzed to understand the performance relationships and behaviors of those services. With prediction, performance and response time problems are cost-effectively prevented before they occur.
Optional – one “how to” slide for the previous slide. Perceive is a way for your customers to graphically view the performance of their systems in one place. Managers, VPs, and sys admins love this product. Easy to install; easy to use. Since Performance Assurance can be an “invisible” back-office project, Perceive allows end users to check on IT’s success with Performance Assurance and satisfy themselves that their business services have been, are, and will continue to be delivered with high confidence and at optimized cost.
Non-optional, second step in the EPA value proposition: the heart of Enterprise Performance Assurance is turning “raw” performance measurements, which must be gathered continually, into a business context – a process known as “workload characterization.” This ensures that all subsequent and ongoing management of performance, response time, and throughput is done in the context of the application and the business. Nobody ever called a helpdesk to report server utilization of 87.2%; they care about responsiveness and throughput. To prevent response time and throughput problems, one must manage to business workload performance.
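Workload characterization, as described above, amounts to rolling raw per-process measurements up into named business workloads. The sketch below shows the idea under invented assumptions: the process names, the process-to-workload mapping, and the CPU figures are all hypothetical, not from any real configuration.

```python
# Sketch of workload characterization: aggregating raw per-process CPU
# measurements into business workloads. The mapping is hypothetical.
from collections import defaultdict

PROCESS_TO_WORKLOAD = {
    "oracle": "order-entry",   # database serving the order-entry service
    "httpd": "order-entry",    # web tier of the same business service
    "javaw": "reporting",      # reporting application
}

def characterize(raw_samples):
    """Aggregate (process, cpu_seconds) samples into CPU seconds per workload."""
    workloads = defaultdict(float)
    for process, cpu_seconds in raw_samples:
        workloads[PROCESS_TO_WORKLOAD.get(process, "other")] += cpu_seconds
    return dict(workloads)

raw = [("oracle", 12.0), ("httpd", 3.5), ("javaw", 6.0), ("cron", 0.2)]
print(characterize(raw))
# -> {'order-entry': 15.5, 'reporting': 6.0, 'other': 0.2}
```

Once measurements carry business labels like these, every later step (tracking, alerting, prediction) can speak in terms of business workloads rather than raw server counters.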
Non-optional, next slide. Workloads can be analyzed both historically and in near real time, to understand their underlying resource requirements and bottlenecks. This is the foundation for using predictive modeling to optimize performance and response time on an ongoing basis, and it is also extremely useful for delivering reports such as “actual versus planned” workload performance.
Optional “drill down/how to”: Here we see a month’s worth of response time for a workload. It is easy to rapidly identify that each Monday, mid-day, there is a spike in response time. But since this spike does NOT exceed normal behavior, it does not represent a problem that needs investigation. However, on Wednesday the 24th, the response time exceeded two standard deviations from normal performance. This is clearly a potential problem needing investigation. By the way, this type of longer-term workload performance and response time perspective is vital for customers considering server consolidation, as all such consolidations must be based on a baseline of “business peak” to ensure that no under-provisioning decisions are made.
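The “two standard deviations from normal” filter mentioned above is easy to sketch. The sample data below is invented to mirror the slide’s story: one mid-week spike against an otherwise steady baseline.

```python
# Flag response-time samples that exceed "normal" by more than two
# standard deviations, as in the slide's Wednesday-the-24th example.
import statistics

def flag_anomalies(samples, threshold_sigmas=2.0):
    """Return indices of samples more than N standard deviations above the mean."""
    mean = statistics.mean(samples)
    stdev = statistics.pstdev(samples)  # population std dev of the window
    return [i for i, s in enumerate(samples)
            if stdev > 0 and (s - mean) / stdev > threshold_sigmas]

# Hypothetical daily mid-day response times in seconds; day 3 spikes.
times = [0.8, 0.9, 0.85, 2.6, 0.9, 0.82, 0.88]
print(flag_anomalies(times))  # -> [3]
```

In practice the baseline would be computed per weekday and time-of-day, so the routine Monday spike is part of “normal” and only genuinely abnormal behavior is flagged.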
Non-optional, next step. Workloads are then automatically managed across time, tracking and analyzing their performance and identifying normal and abnormal behaviors. All information is stored in a data repository and made available to the stakeholders who need it, both ad hoc (native console or web accessible) and as scheduled, published reports. This ensures that all performance problems can be filtered to determine whether they are “out of normal,” and then rapidly identified, prioritized, and resolved.
Non-optional – the start of the EPA value proposition in the context of BSM.
Optional drill-down slide: to see the components of the poor response time. The bottleneck is in the CPU resources.
Non-optional: The value of Predict is paramount. Here we see how Predict can model the impact of business changes – numbers of users, numbers of transactions, business growth scenarios, consolidation scenarios, etc. – on the underlying IT resources. This has several huge benefits: when used as part of an ongoing Performance Assurance methodology, it can identify potential business service performance breakpoints before they become critical and thus prevent business service degradation. It can ensure that an enterprise can understand and project its resource requirements to successfully implement business change (“capacity planning”), and it can help ensure that customers can confidently deliver to service level agreements at optimal cost across business change.
Optional drill-down/how to: With the Predict solution, the user can see and predict when response time will go off the charts – long before it ever becomes a problem. The Predict solution uses network queueing theory to take the data collected and calculate what workload response times will look like when additional transaction load or users are added to the application or workload. Trending off raw measured performance metrics leads to highly inaccurate answers.
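The slide does not give the queueing model itself, but the classic single-server (M/M/1) formula illustrates why queueing theory beats linear trending: response time R = S / (1 - ρ), where S is service time and ρ is utilization, so response time explodes non-linearly as load approaches saturation. The service time and arrival rates below are invented for the illustration.

```python
# Minimal queueing-theory sketch (M/M/1), assuming a hypothetical
# 50 ms service time. Not the actual model used by Predict.

def mm1_response_time(service_time, arrival_rate):
    """Mean response time R = S / (1 - rho), with rho = arrival_rate * S."""
    rho = arrival_rate * service_time  # utilization
    if rho >= 1.0:
        raise ValueError("system is saturated (utilization >= 100%)")
    return service_time / (1.0 - rho)

s = 0.05  # 50 ms per transaction (hypothetical)
for rate in (4, 10, 16, 19):  # transactions per second
    print(f"{rate} tps -> {mm1_response_time(s, rate):.3f} s")
```

Doubling load from 4 to 10 tps barely moves response time, but going from 16 to 19 tps quadruples it. A straight-line trend fitted to the low-load points would wildly underestimate response time near saturation, which is the slide’s point about trending off raw metrics.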
Non-optional value: Prediction also enables the enterprise to predict the impact of a considered IT change on the business service’s response time. Changes that can be predicted include server-side hardware changes as well as decisions such as workload balancing scenarios. Used in an ongoing fashion, Predict thus enables the “right timing” and “right sizing” of the resource deployments needed to run the business, and ensures consistent performance across change scenarios.
Optional drill down/how to: here we can see “inside” workload response time to see how it changes (for better or worse) subject to a considered IT change. Here, the “financialsapp” application can be seen to have a response time of 0.02 msec, which is 71% of its previous value (i.e., 29% better) after the modeled change. And we can also see WHY the response time is better, and where most of the service time is spent. In this example, financialsapp spends about half its time dependent on CPU and the other half on I/O. There is I/O wait time, so if a 29% improvement is not sufficient, the next step would be to model I/O changes.
Here’s an example of how system needs change over time. First <Click> service levels are established, say 3 seconds. <Click> Hardware is acquired to support the service level. <Click> And, as new applications are rolled out, new modules and users are added. Utilization is measured, <Click> but it really offers NO INSIGHT into response time – which always behaves non-linearly with respect to resource utilization. In an optimal world, you would deploy new resources just before your SLA is exceeded <Click>, but in the real world, you can’t afford the risk of doing it too late (service down!), so you buy resources early – wasting money. <Click> The cycle repeats: <Click> more resources, <Click> more measurements, <Click> more response time spikes, perhaps some server consolidations to reduce costs, etc. <Click> <Click> <Click> <Click until you get “Over versus Under Provision”> If the blue line is shifted too far to the left, you have over-provisioned and are wasting money. The only “good thing” about this choice is that it is VERY quantifiable – measure the money! If the blue line is shifted too far to the right, you have under-provisioned (deferred hardware purchases too long) and put service levels at risk. It’s an age-old balancing act, and one on which most enterprises have always erred on the side of over-provisioning (hence today’s server consolidation craze!). There are multiple stages in the life cycle of every application and business service. If you can predict when response time will hit the wall at each stage, you can minimize the hardware and software resources that you need to have in place. The goal is to put in the right hardware and software at the most cost-effective time. The organization saves money by retaining flexibility and deferring purchases; thanks to increasing price/performance (Moore’s Law), deferral frees capital and increases business agility.
Only BMC Performance Assurance can predict the response time/resource relationship with the confidence needed to ensure the best possible balance between expenditures and assurance of service levels – across your enterprise.
Ameritrade – Ameritrade is a pioneer in the online trading industry, with a 25-year history of providing customers a self-directed approach to brokerage services. The objectives of their performance assurance project were to increase speed, value, and customer experience, and to increase the number of trades from 50,000 to 500,000 without a significant increase in hardware or software costs. BMC Software and partner Maryville Technologies were able to implement a strategy that not only provided capacity planning based on the metrics, but also provided an ongoing management plan to assure the availability and performance of their infrastructure. Visit them online and see how applying Performance Assurance internally enables them to COMPETE and WIN – market-leading trading-time guarantees!
Partner win with SAIC
Business issue: transition to competition/deregulation and reduce operating costs
- 1,000 servers, 39 mission-critical apps, 175 remote sites
IT issues (why they needed capacity planning):
- Unacceptably high IT hardware costs
- Very high rate of server proliferation
- Decisions based on instinct, conjecture, and “gut feel”
- Server over-engineering is the norm
- Underutilized and/or unused equipment
Results:
- Reduced server growth rate
- Lowered maintenance costs
- Systems analysts make recommendations instead of delivering graphs
The process of Change Management is potentially the most important one in IT to be addressed, because… And change is implicit in implementing consolidation, virtualization, and IT resource automation solutions. Change is accelerated by adopting them, and the actual configuration changes will occur in real time in virtualized environments – by definition.
If we think of Data Center Optimization, we should first take a view as to what key challenges exist that have created a non-optimized data center. Perhaps most significantly, in spite of multiple server consolidations in the past, the average data center is still running at only 10-15% average resource utilization. Silo purchasing, organizational constraints, application design policies, and poor or non-existent capacity planning created this problem, and the implications are significant. In previous days, the costs of the servers were a natural barrier to such wastefulness, but with the advent of truly “industry standard” hardware configurations (Intel/AMD x86) running Windows, Linux, or Solaris, pure hardware costs have been driven way down. The costs today are not hardware; they relate to the number of servers in IT, whether used or not. Electricity, data center physical costs, and most significantly ongoing management costs (maintenance, software licenses, management tools, personnel) all scale in direct proportion to the number of servers. The response is overwhelming: virtually every IT organization is adopting consolidation and virtualization strategies to optimize their data centers. Ironically, most believe that the obstacles are IT process related: managing the organizational, process, and/or maturity-related aspects of these changes, not the technology itself or the potential ROI of the effort. Source: CNET, “Electric Slide for Tech Industry?” by Steven Shankland, February 1, 2006.
Data Center Optimization – BMC offers a unique, rapid time-to-value, high-ROI solution. Consider the example of a bank merger with the goals of consolidating the customer base and reducing costs by growing business scale. Consider also that at the same time the bank is looking to roll out new e-banking services. The bank can use BMC Discovery solutions to identify all server-related infrastructure, configuration, applications, and key business services, populating a CMDB. They can use BMC Performance Assurance solutions to determine what capacity and configurations would be required to sustain all existing business services. They also can use Performance Assurance to evaluate the resource requirements associated with their business growth plans post-merger. The output of planning will generate a specific Resource Plan, which can then be automatically implemented via BMC Change and Configuration Management solutions to ensure that the right software is provisioned to the right hardware. Each and every change can then be implemented according to policy, tracked for compliance, and reflected in the CMDB. Finally, the Resource Plan can be used to generate automation policies so the bank can roll out the new e-banking services on an architecture built to support Capacity on Demand. This approach is critical because the bank does not know in advance what business demand will be, and yet has a business requirement to ensure service levels. BMC Virtualizer enables the customer to automatically allocate shared resources from lower-priority business services whenever required to sustain service levels. The results are a reduced total server count, increased assurance of service levels in spite of unpredictable demand, and increased average server utilization. Let’s take a short walk through how BMC delivers these values, quickly and simply.
- Automatically provision correctly sized resources in time to meet business demand
- Lower business service costs; minimize business service risks; optimize performance
- Flexibly adapt to change; speed adoption of commodity hardware and virtualization technology
- Eliminate human error associated with resource availability and capacity changes
- Ensure business availability; ensure business capacity and performance
- Proactively analyze and model application resource requirements; properly size server assets
- Minimize service risk; ensure appropriate business service response time and throughput
- Policy-based orchestration and provisioning of resources; balance competing business priorities
- Minimize costs and risks by tightly managing software configuration
- Create virtual pools of IT resources
- Balance server, power, floor-space, cooling, licensing, and management costs
- Increase overall IT resource utilization
Very simply, BMC Performance Assurance solutions build a highly accurate resource requirements plan. They uniquely deliver a highly accurate and reliable picture of exactly what resources will be needed to sustain service level requirements. They factor in business changes, such as changes in demand, and technology changes, such as application stacking, consolidation, and/or virtualization. Everything is evaluated in terms of answering questions such as: What resources are needed? When and under what circumstances? How will performance and response time be affected? Why is this ITIL discipline of Capacity Management so critical to the success of data center optimization through consolidation and virtualization? You can’t consolidate or virtualize without the discipline of Capacity Management, because virtualized and consolidated applications share common resources. Two business transactions cannot share the same physical resource at the same time unless one waits. Capacity Management is knowing in advance how to best prevent transactions from waiting or failing. Successful use of BMC Performance Assurance generates the resource plan you need to successfully implement data center configuration changes. You will be able to build change plans that address predictable changes: from business demand changes, to technology and configuration choices and changes. And, should you be implementing Capacity on Demand or other automated resource provisioning strategies, it delivers the information you need to trigger automation actions. You can predict in advance when capacity will run out, and automatically provide resources before that point is reached – which is the only way to cost-effectively deal with unpredictable capacity change. BMC Performance Assurance answers the three key challenges of Capacity Management: What and how much resource will we need? When and why will we need it? What is the impact of change on service levels?
Just to give you one real-world example, here’s the experience of a major book retailer with tens of thousands of employees and a few billion dollars in revenues. This is ROI that was conservatively documented and published by Forrester Research, and we can make this available to you if you’d like. As they spun up their asset management program, they wanted to support 150 different types of IT Assets in 1,400 retail and corporate sites. The main issues they wanted to target were the discovery of underused assets that they suspected were out there somewhere, and to control a problem with lease penalties. Using Remedy Asset Management, they discovered 6000 assets that they previously didn’t know about. Some of them were being used effectively, but another 30% were idle or underused. By redeploying these where they were needed rather than buying new assets, they saved $180K. By tying their lease contracts and schedules to assets, and generating reports and alerts to ensure these assets were returned or renewed on time, they were able to save $120K in lease penalties. They also tied warranty contracts and manufacturer recalls to these assets, allowing them to save $131K by avoiding maintenance and support costs on those assets. So even with their modest goals, they achieved $431,000 in their first year alone using BMC’s Remedy Asset Management.
Implementing Performance Assurance can significantly and very rapidly deliver high ROI to IT organizations while maintaining service levels. (Company: MetLife.) These benefits are both capital (expenditure avoidance) and operational in nature, and are typically quickly realized. (Company: Entergy.)
An interesting anecdote on the rapid value that can be obtained by getting control of just some of the processes involved in consolidation. Here is the case of Siebel, who integrated the disciplines/processes of Discovery, Asset, and Capacity Management, and saved several $M in capital and operational costs.
- The company completed significant downsizing, resulting in a new CIO
- The CIO was still receiving POs for 85 new servers per month – even after the significant personnel reduction
- No visibility into what they currently had
- No visibility into how much was needed (or why)
- Implemented BMC Performance Assurance® Solutions
- Process change: issued a mandate that no servers – physical or virtual – will be acquired without validating against what they have (Discovery, Asset) and validating what is required with Performance Assurance
NOTE: Siebel has since been purchased by Oracle.
Archer Daniels Midland reduced an application server configuration from 75 active and 75 passive failover servers to 75 active and 5 passive, reducing their server footprint (and costs) by 41% with improved levels of service availability using Virtualizer.
BMC’s Data Center Optimization solutions are available incrementally, in a manner that maps easily to customers’ problems as well as their organizational and technology readiness. Each step delivers intrinsic value and ROI, building toward the ultimate goal of successfully implementing the fully optimized data center – process and technology. If a customer only needs to understand their current infrastructure prior to making IT configuration decisions such as consolidation or virtualization, they can implement BMC Discovery solutions – solving that problem and simultaneously initiating the foundational step needed for Change Management. Or if a customer only needs to accurately size a consolidation or virtualization project, they can implement BMC Performance Assurance solutions – as a product or a service – solving that problem while simultaneously generating the knowledge needed to implement the change more quickly with automated Change and Configuration Management solutions. If a customer wants to factor in the asset perspective for a financial view of their environment, they can implement BMC Remedy Asset Management.
If a customer needs to get complete control of data center configuration change, they can implement BMC’s solutions for closed-loop Change and Configuration Management, including automation to ensure accuracy and timeliness of change implementation. And customers who want the advantages of the real-time infrastructure today can implement BMC Virtualizer. All of the BMC Data Center Optimization solutions can – and, for optimal ongoing value, should – be used in conjunction with BMC’s other BSM solutions, such as Service Impact Management, Service Level Management, etc.
Because the Change process and the implementation actions associated with configuration changes are so tightly interconnected, BMC offers a comprehensive “closed loop” change and configuration management solution to manage each and every aspect of the actual implementation of configuration change.
Request: From the Discovery phase, we already have a fully populated CMDB with comprehensive configuration information. At this point a Change Request is opened to implement the project. Based on policy, the changes are subjected to change approval.
Plan: The appropriate software configuration is specified in the DSL. The previously created Resource Plan identifies the target servers for deployment. The organizationally appropriate schedule is built and approved.
Implement: Configuration Management pulls the right software from the DSL and automatically ensures the right software configurations “meet” the right hardware configurations. Multiple approaches are used to ensure timeliness and accuracy.
Verify: The planned configuration changes are verified against actual results to ensure the change is implemented “as approved.” Upon confirmation of success, the change ticket is closed and the CMDB is updated to reflect this new business/IT reality. The Change and Configuration Management loop is closed.
“Get Out of the Way, I Can Do It Faster Myself” Joely Scott-Thomas, Senior Account Manager, BMC Software
As a customer: endless examples of customer-facing technology delivering valuable services that can also frustrate.
Business view: With these endless examples of customer-facing technology, business managers must perform to KPIs using technology they are responsible for, without really having the timely information – and therefore the control – to deliver the result. They are currently heavily dependent on IT management for updates.
IT view: Technology is finally in the front office – a significant development in the evolution of IT – and IT professionals are just loving it.
What are our goals: Capacity Management – to align to ITIL or not.
ITIL Definition Statement: To ensure that cost-justifiable IT capacity always exists and that it is matched to the current and future identified needs of the business.
The CDB (Capacity Management Database) is the cornerstone of a successful capacity management process, operating as a subset of a Configuration Management Database (CMDB).
Or not – take a simpler approach: Does my database look big in this? A few extra records here, a few more users there, more bandwidth, more memory, more, more, more. The computer says NO.
Capacity Management is the IT equivalent of dieting. All that extra nibbling and suddenly your clothes don’t fit.
What is our commitment: Gartner’s IT Management Process Maturity model may be an aim. Source: Gartner Research, Inc., “Conference Polling Indicates Improvement in IT Management Process Maturity,” Deb Curtis, April 2006. Slide title created by BMC Software.
What is our commitment: Take a Capacity Management view of Gartner’s maturity model. Without Capacity Management, you can’t get to high levels of value… or the ability to automate change in near real time… or virtualization…
How are we going to get there: ITIL Process Relationships
Capacity Management interacts with other ITIL processes, as well as the Service Desk…
Service Support Service Delivery
Source: ITIL Service Management handbook
Service Level Management Incident Management Service Desk Problem Management Change Management Capacity Management Availability Management Financial Management Configuration Management Release Management IT Service Continuity Management
Inputs, Sub-processes and Outputs of Capacity Management
The CDB provides the necessary data to create performance and Capacity Management reports, including the Capacity Plan.
SERVICE AND COMPONENT BASED REPORTS
Reports which illustrate how the service and its constituent components are performing and how much of their maximum capacity is being used.
Reports that show management and technical staff when the Capacity and performance of a particular component or service becomes unacceptable.
Documents the current levels of resource utilization and service performance and, taking into consideration business strategy and plans, forecasts the future resource requirements to support the IT services that underpin the business activities.
Capacity Management is a closed-loop methodology: Measure → Analyze → Predict → Manage
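As a rough sketch of how the Measure → Analyze → Predict → Manage loop fits together (the threshold, growth rate, and resource names below are illustrative assumptions, not product behavior):

```python
# Hypothetical sketch of a Measure -> Analyze -> Predict -> Manage loop.
# The 80% threshold and 5%/month growth are assumed for illustration.

def measure(samples):
    """Measure: average observed utilization (0.0-1.0) from raw samples."""
    return sum(samples) / len(samples)

def analyze(utilization, threshold=0.8):
    """Analyze: flag a resource running too close to capacity."""
    return utilization >= threshold

def predict(utilization, monthly_growth=0.05, horizon_months=6):
    """Predict: project utilization forward assuming compound growth."""
    return utilization * (1 + monthly_growth) ** horizon_months

def manage(resource, utilization, threshold=0.8):
    """Manage: decide an action from current and predicted utilization."""
    if analyze(utilization, threshold):
        return f"{resource}: add capacity now"
    if analyze(predict(utilization), threshold):
        return f"{resource}: plan capacity within 6 months"
    return f"{resource}: no action"

cpu_samples = [0.55, 0.62, 0.58, 0.65]  # made-up measurement data
print(manage("db01-cpu", measure(cpu_samples)))
```

The point of the loop is that each pass feeds the next: today's measurements drive the analysis, the analysis seeds the prediction, and the prediction triggers a management action before users feel the shortfall.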
"Consumer view" of performance: ad-hoc, enterprise-stakeholder access to application and server performance information, from real time through historical
Map: understand the performance relationship between resources and the business
Characterize: business workload resource utilization
Manage: performance of business "units of work"
Identify the underlying cause of performance and response-time problems
[Chart: Workload response time (seconds) from 12:00 to 4:30, broken down into CPU service, IO service, CPU wait, IO wait, other service, and other wait]
Identify underlying workload resource bottlenecks
Optimize performance requirements to meet service level agreements
Performance Analysis Identify Business Variance and its impact on IT Resources
Maintain consistent performance in a constant-change environment
Track workload resource utilization and responsiveness
Rapidly identify, prioritize and solve performance and response time incidents by filtering abnormal workload performance
Automate and Publish workload performance information to all stakeholders
Determine resource utilizations across entire application and server infrastructure
Populate historical repository of utilization data
Visualize and communicate the performance interdependencies between IT and business
See inside business resource response time and usage; determine which resource is constrained, and when
[Chart: Workload response-time detail for workload MoneyWeb@lwl11dv83 (seconds) from 12:00 PM to 4:30, broken down into CPU service, IO service, CPU wait, IO wait, other service, and other wait]
Predict Performance Impact of Business Changes
Model response time impact of business changes
Identify Business Service breakpoints before failure
Maintain consistent performance and response time across change
Modeling business change and resource impact: grow the workload and determine the "knee of the curve"
[Chart: Transaction response time (seconds) for <ALL> transactions [MoneyWeb] in Money_Growth, projected from the current load through +100% growth in 10% increments; workloads include Oracle@lwlgcaa1]
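The "grow the workload" modeling can be approximated, for illustration, with a simple single-server (M/M/1) queueing formula, R = S / (1 − U): response time stays flat at low utilization, then bends sharply at the knee. The service time and arrival rate below are made-up assumptions; the real modeling works from measured data.

```python
# Illustrative "knee of the curve" sketch using an M/M/1 approximation:
# R = S / (1 - U), where S is service time and U is utilization.
# SERVICE_TIME and BASE_RATE are assumed values, not measurements.

SERVICE_TIME = 0.5   # seconds of work per transaction (assumed)
BASE_RATE = 1.0      # current transactions per second (assumed)

def response_time(growth):
    """Projected response time after growing the arrival rate by `growth`."""
    utilization = SERVICE_TIME * BASE_RATE * (1 + growth)
    if utilization >= 1.0:
        return float("inf")  # past saturation: the queue grows without bound
    return SERVICE_TIME / (1 - utilization)

# Sweep growth from current to +100%, like the chart's grow10%..grow100% series
for pct in range(0, 101, 10):
    r = response_time(pct / 100)
    print(f"grow{pct:3d}%: {r:5.2f}s")
```

Running the sweep makes the breakpoint obvious: response time roughly doubles by +60% growth, then explodes as utilization approaches saturation, which is exactly the "knee" the modeling is meant to find before it happens in production.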
Predict Business Performance Impact of IT Change
Predict Business Service impact of IT Resource change
CPU, Disk, Memory, etc.
Configuration, Load Balancing, etc.
Right-size and Right-time resource expenditures
Maintain consistent performance and response time across change
Predict Business Performance Impact of IT Change See Response time impact of resource change
Capacity Management - Right Timing and Right Sizing
Over-provisioning wastes resource investment; under-provisioning degrades service levels.
Moore's Law: price decreases and performance increases over time.
Capacity Management must deliver the predictive determination of the RESPONSE CURVE vs. resource utilization.
[Chart: SLA response time, hardware spend, and business resource utilization (20-100%) by quarter, across events such as Go Live, Add New Module, Add New Dept, and Consolidate Servers]
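The right-timing trade-off can be shown with a toy calculation: under an assumed annual hardware price decline (a crude stand-in for the Moore's Law trend), deferring purchases until capacity is actually needed beats buying everything up front. All figures below are invented for illustration.

```python
# Toy right-timing illustration. The 15% annual price decline, unit cost,
# and purchase schedule are assumptions, not sourced figures.

ANNUAL_PRICE_DECLINE = 0.15
UNIT_COST_TODAY = 10_000.0   # cost of one server today (assumed)

def unit_cost(years_from_now):
    """Projected server cost after the assumed annual price decline."""
    return UNIT_COST_TODAY * (1 - ANNUAL_PRICE_DECLINE) ** years_from_now

# Over-provision: buy 4 servers now for capacity not needed until year 2.
buy_now = 4 * unit_cost(0)
# Right-time: buy 2 now, 1 in year 1, 1 in year 2, tracking actual demand.
right_timed = 2 * unit_cost(0) + unit_cost(1) + unit_cost(2)

print(f"buy now:     ${buy_now:,.0f}")
print(f"right-timed: ${right_timed:,.0f}")
```

The deferred schedule is cheaper in capital alone, and that is before counting the degraded-service cost of the opposite mistake, under-provisioning, which the response-curve prediction is meant to rule out.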
Cost reduction required by deregulation of the business
Capacity Management Process and Tools
Achieved annual savings target in 3 months
To date exceeded savings target by 2X
Capital HW budget: 62%
Expense HW budget: 27%
Administration costs: 12%
Quick Fact “ On average, 40 percent of unplanned mission-critical application downtime is caused by application failures and another 40 percent is caused by operations errors. Improving change management processes is one of the best investments that enterprises can make, as availability can increase by 25 percent to 35 percent.” - Gartner Research, Inc., “Best Practices for Continuous Application Availability,” D. Scott and E. Holub, Gartner Data Center Conference, December 2005
How can we apply Capacity Management: Today’s Data Center Situation
Average server utilization is 10-15% (85-90% of resources sit idle)
Electricity costs are rising
Google forecasting electricity costs to outweigh server costs this year
Real estate, cooling are very large fixed costs
Management costs increase for every server, used or idle
Gartner Data Center Survey 1
95% of data centers are planning or executing virtualization (e.g. VMware, etc.) and/or server consolidation projects (Opening, question 2)
69% are doing so to control server sprawl and reduce overall TCO (A3, question 1)
66% believe projects will take more than one year to complete (A3, question 4)
Many are moving to a real-time infrastructure or RTI (think “Capacity on Demand”)
73% see moving concretely to RTI as “business imperative” (k2, question 1)
70% see obstacles as organizational, process, and/or low maturity — not technology or ROI (k2, question 4)
1 Gartner Research, Inc., “Interactive Polling Results: 2005 Data Center Conference,” December 2005. Percentages calculated and additive to obtain figures.
Capacity Management as an enabler to Proactively Managing the Lifecycle and Costs of the Datacentre
Integrated approach for Data Center lifecycle management blending best practices from:
Change and Configuration Management
BMC solutions for Data Center Optimization enable IT to:
Proactively identify, plan, change, and manage server assets
Support new, agile business application demands
Optimize capacity against business requirements, financial constraints, and contractual obligations
Datacenter Optimization Lifecycle: business agility, risk mitigation, cost reduction
Identify under-performing assets
Audit and optimize licenses
Reduce risk of unpredicted change by mapping resources to business cycles
Reduce costs by tightly managing software configuration and change
Identify resource consolidation or re-purposing opportunities
Minimize business service risks with prediction
Balance server, power, floor-space, cooling, licensing, and management costs through "scenario planning"
Ensure appropriate business service response time and throughput
Balance competing business priorities
Properly size and quantify server assets
Flexibly adapt to dynamic change
Lower the costs of highly available services
Create shared, virtual resource pools
Increase overall resource utilization
Ensure real-time business service availability, performance, and capacity
[Diagram: four-stage optimization lifecycle]
Acquisition = 25% growth in new users within two months – would they fit?
Solution must give answers within three weeks
Initial capacity study
Completed within two weeks
Success drove customer to one-year subscription with additional studies
Quick Fact A large insurance company reduced infrastructure costs by over $1 million in one month by adopting Capacity Management delivered as a managed service. By ensuring Capacity Management was part of its change process, a large regional energy company reduced its CapEx budget by 67% and its OpEx budget by 27%, and achieved 12 months of planned savings in three months.
Quick Fact A $30 billion food manufacturer was keeping 75 servers as overflow capacity for important applications. The company reduced its production infrastructure by over 41% by using dynamic provisioning capability.
A large technology company saved over $1 million by effectively merging the disciplines of Discovery, Asset, and Capacity Management.
Found 1000 unused servers, reducing operational expense
Ended server purchasing for 18 months
Instituted Change Management process to enforce Capacity Management
Data Center Optimization Summary: Discovery, Analysis and Planning
Know what resources you have, what services they support, how busy they are, and what resources you will require to successfully implement technology changes such as consolidation or virtualization.
Reduce Server Infrastructure costs by 40% or more
Recognize ROI within three months
Accelerate virtualization and consolidation projects
Data Center Optimization Summary: Implementation and Optimization
Implementing a unique marriage of Closed-loop Change and Configuration Management with automated server provisioning to:
Ensure compliance and reduce risk by improving IT process and technology maturity
Optimize infrastructure costs
Improve IT’s ability to respond to dynamic business change
Implementation: BMC ® Change and Configuration Management