© 2006 IBM Corporation Where's the Bottleneck?

  • Have you ever monitored a business environment that spans multiple systems and learned that some users are experiencing miserable response times, but wondered what the actual performance problem was? Have you ever monitored a system and determined that processor utilization was appropriate, yet work wasn’t processing as quickly as you’d like? Have you ever wondered how much operating system processes contribute to CPU utilization versus application-level transactions, so you could determine what work is consuming the most resources? If you can answer yes to any of these questions, you are in the right place! If you answered no, then you’re still in the right place, because everyone should be concerned about performance! In this presentation, I’ll share how you can use a robust performance management tool to pinpoint the exact cause of a performance problem. This tool is EWLM, which stands for Enterprise Workload Manager.
  • In this presentation, first we will go over EWLM product structure and then…
  • This picture illustrates a fairly typical three-tier application structure consisting of web servers, application servers and a backend database server. For the sake of completeness, let’s say that we also have a front-end load balancing device. A set of users sits out in the world and drives various transactions. For some users it takes 12.6 seconds to get a response, while for others the transaction completes in 5 seconds. Right away, two questions come to mind: Is a 12.6 second response time a problem? It may or may not be, depending on the performance goals of these transactions and the expectations of the user. If a 12.6 second response time is a problem, then where is the problem? Is the delay in the web servers, the application servers, the database server, or spread out across all of them? EWLM, as you’ll see, is designed to help address these two fundamental questions.
  • EWLM is an expansion of the same concepts that were introduced by the z/OS Workload Manager. It applies the proven and mature goal-oriented z/OS workload management technology to the multi-tier heterogeneous environment. EWLM monitors performance against end-user-provided goals and optimizes resources dynamically across the servers to achieve those goals. EWLM allows you to monitor all transactions, or specific transactions, that an instrumented application processes. EWLM allows you to monitor all or specific operating system processes. EWLM influences the routing decisions of compatible network load balancers to achieve the business performance goals defined in the policy. EWLM dynamically adjusts LPAR CPU allocation on p and i series systems to help work meet its externally defined goal.
  • A very basic notion of EWLM is the time-stamping of the transaction flow as it traverses the web server, the application server and the database server. That provides a way to know how much time is being spent in each “hop” of the application structure. To do this, EWLM provides “agent code” that runs on each of the servers you wish to monitor in your environment. This agent code implements an industry standard called ARM (Application Response Measurement). The application components in each hop are “instrumented” to make use of the ARM APIs available in the EWLM agent. The agent code feeds information back up to a central management server called the “EWLM Domain Manager.” The Domain Manager collects and correlates the information and provides reports to the user through a graphical interface known as the “Control Center.”
  • Transition to this slide: Before using EWLM to determine where performance problems exist, it is beneficial to understand how EWLM monitors transactions. And you might be curious as to how it gathers the granular performance data that it does. Hop 0 = Entry application = the location where EWLM classifies the work according to the transaction classes defined for the particular application that is the Entry application. Transactions are classified only once. The transaction is classified to the first transaction class whose classification rules it satisfies. Therefore, be sure that the most specific transaction classes are in the lower-numbered positions, starting with 1, and that the transaction classes with more general classification rules are in the higher-numbered positions.
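The first-match behavior described in this note can be sketched in a few lines of Python. This is only an illustration of the ordering rule, not EWLM's classifier: the filter names (`remote_user`, `system`, `os`), the `System2` host and the wildcard matching via `fnmatchcase` are assumptions made for the example. The class names are borrowed from the example policy later in the deck.

```python
from fnmatch import fnmatchcase

# Hypothetical transaction classes, listed by position (position 1 first).
# Each rule maps an assumed filter name to a wildcard pattern.
TRANSACTION_CLASSES = [
    ("Bank Teller Web transactions",
     {"remote_user": "teller*", "system": "System1", "os": "Windows"}),
    ("IHS Web transactions",
     {"system": "System2", "os": "Linux"}),
    ("Default - IBM Webserving plugin", {}),  # no rules: matches everything
]


def classify(transaction):
    """Return the name of the first class whose rules all match."""
    for name, rules in TRANSACTION_CLASSES:
        if all(fnmatchcase(transaction.get(key, ""), pattern)
               for key, pattern in rules.items()):
            return name  # classified once; later positions are never checked
    return None


teller = {"remote_user": "teller42", "system": "System1", "os": "Windows"}
other = {"remote_user": "guest", "system": "System9", "os": "AIX"}
print(classify(teller))
print(classify(other))
```

The teller transaction also satisfies the default class, so if the default class sat in position 1 it would capture everything. That is why the most general class belongs in the last position.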
  • Transition: Then, the transaction moves to the second application [hop 1] to continue processing. EWLM recognizes that the transaction has been classified to a transaction class already. EWLM does not classify it again. EWLM updates the correlator with performance data that is specific to hop 1.
  • Before we get into the example scenario, let me explain how EWLM computes the end-to-end response time. When a business transaction spans server platforms, it’s necessary to “correlate” the time spent in each tier so an overall transaction time can be determined. The ARM specification calls for an “ARM Correlator” to be created by the agent and passed along from server to server. Middleware is responsible for passing the correlator as the transaction hops from one application to the next.
  • If some of the middleware in the transaction flow is not instrumented, there is still some value in using EWLM. If the first hop is instrumented, you still get the end-to-end response time. The details at the later hops are lost, however, and you won’t be able to derive the complete end-to-end topology. EWLM won’t be able to manage transactions running on un-instrumented middleware, but it can still manage the middleware processes.
  • If an application is not using the ARM APIs, EWLM cannot report response times for its transactions and cannot manage to transaction goals. But EWLM can report and manage resource consumption by middleware processes. The EWLM agent also understands when batch jobs start and end. So you can monitor and manage short-running “batch” processing to response time goals without any instrumentation. Having applications ARMed is preferable, but some monitoring and management is possible even without it.
  • So, using EWLM instrumentation, you can learn that transactions are completing in 12.6 seconds on average. Is that good or bad? Is this response time typical, or have you seen better response times at other times, with response time getting worse only for a limited set of transactions? Is there other important work running on the system that demands the resources these transactions are consuming? To answer these questions, we have to compare the observed response times to defined goals to know whether there’s a problem or not.
  • Transition: Now that you are familiar with how EWLM monitors an application-level transaction as it moves from one hop to the next, let’s define an example environment that I’ll use for the remainder of the presentation. I’ll use this environment for illustrative purposes; note that EWLM is beneficial both in large, complex environments and in simple environments such as this example. To begin, we must define our IT business environment. zSeries LPAR 1 is an EWLM managed server running z/OS and one application that instruments the ARM 4.0 APIs. xSeries Guest 1 is a second EWLM managed server running xLinux and one application that instruments the ARM 4.0 APIs. xSeries Guest 2 is a third EWLM managed server running Windows and one application that instruments the ARM 4.0 APIs. All systems process both application-level transactions and operating system processes, and use the same physical resources to process the work. You want to monitor the application-level work separately from the operating system processes to ensure that the Web transactions have the resources they need and complete with higher importance than the operating system processes. Note: For EWLM to obtain granular performance data at each hop, each application must instrument the Application Response Measurement (ARM) 4.0 standard APIs. You may have service level agreements with business partners that specify a level of service that you will provide them. For example, Bank A may require 90% of online transactions to complete in 2 seconds or less. A URI or URL that drives a servlet requires a WAS runtime server for servlets. The HTTP server catches the URI or URL request, which drives the servlet invocation. Servlets must run in some runtime environment. Instead of a servlet, it could be an EJB or a web service: anything that a URI can address.
  • Transition: So, within this environment we are particularly interested in monitoring 2 types of work. For EWLM to function properly, it is important that you identify the ‘Entry Application’. The ‘Entry application’ is the first ARM-instrumented application in the domain that receives the work request. During this time, EWLM compares the work request to the transaction class rules defined for that particular application. In this example, our entry application is IHS powered by Apache, which is an IBM Webserving plugin that is shipped with WebSphere. Therefore, we must create a transaction class for this application so that EWLM can monitor the work accordingly.
  • Additional examples: Work over a specific port is higher priority. For example, Bank A might use your servers to process their web application over a specific port and Bank B might use a different port. Identify the work based on the port on which the transaction is processed, so that you can assign the appropriate service class performance goal to each kind of work.
  • After using Linux tools such as “top” to identify the processes we want EWLM to monitor, we can use the EWLM Control Center to create a domain policy containing service classes with performance goals for each process that we identified. Velocity: defines how fast work should run when ready, without delays due to processor constraints, storage problems, and I/O delays (for managed system resources). Use a velocity goal for work for which response time goals are not appropriate, such as service processes, daemons, and long-running batch work. Example values: Fastest, Fast, Moderate, Slow, Slowest. Discretionary: defines that the work is to complete when resources are available, with no time interval and no importance. Use this for work with low priority.
  • Application environment: IBM Webserving Plugin [IBM_HTTP_Server/6.0.1 Apache/2.0.47 (Unix)]. The group name is in brackets. The registered application name, as known by ARM, is IBM Webserving Plugin. Use the “ewlmWinAdTool” utility to check whether IBM HTTP Server is instrumented on a Windows system. This transaction class includes all transactions that the RemoteUser teller(*) originates on ‘System1’, which runs the ‘Windows’ operating system. This transaction class is the most specific compared to the next two examples that I’m about to show. Therefore, it should be in Position 1.
  • This transaction class includes only the transactions that originate on a particular system in the domain running the Linux operating system. You may use a classification rule similar to this if you have multiple operating system instances in your EWLM domain that run Webserving plugins but you want to monitor them separately. It does not include all IHS application instances like the default transaction class does. This transaction class is more general than the previous one but more specific than the next. So, in this set of transaction classes it should be in position 2.
  • This is the most general transaction class; therefore, it belongs in the last (or highest-numbered) position in the domain policy. Transition: Now that you have created the performance goals in service classes for your transaction-level work, you need to identify the operating system processes that EWLM is to monitor. Application environment: IBM Webserving Plugin [IBM_HTTP_Server/6.0.1 Apache/2.0.47 (Unix)]. The group name is in brackets. The registered application name, as known by ARM, is IBM Webserving Plugin. Q: How do users know what the application instance name is? A: They must look it up in their product documentation. The application instance name, defined by the ARM standard, is the default host operating system instance name (a string), concatenated with “/PID=”, concatenated with the web server PID (an unsigned integer) converted to the string representation of its decimal value.
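The instance-name format described in this note is a simple string concatenation. A minimal Python sketch follows. The function name is made up for illustration, and using `socket.gethostname()` as the "host operating system instance name" is an assumption; the product documentation remains the authority on the actual value.

```python
import os
import socket


def application_instance_name(host_instance_name, pid):
    """Build '<host OS instance name>/PID=<decimal pid>' as described in the note."""
    if pid < 0:
        raise ValueError("PID must be an unsigned integer")
    return f"{host_instance_name}/PID={pid}"


# Assumption: the local host name stands in for the OS instance name,
# and this script's own PID stands in for the web server's PID.
print(application_instance_name(socket.gethostname(), os.getpid()))
```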
  • Transition: Let’s assume that you have activated a service policy on the EWLM domain. Let’s use the EWLM Control Center reports and monitors to determine if the performance goals are met.
  • Now that we’ve defined our goals and (hopefully) achieved classification of the workload in the manner we intended, it’s now time to look at how you can monitor the actual workload and see how it’s doing against the goals. This is done through the Control Center, and has three levels of monitoring: High-Level – monitoring performance at the Service Class level. This will include all the Transaction or Process Classes that map into the Service Class, so it will involve a rollup of many different things. But since you mapped them all into the same Service Class, they must have similar characteristics, so there’s value in seeing this rollup. Mid-Level – monitoring performance at the Transaction Class or Process Class level. This will provide you a way to see how the work that matches a particular Transaction Class (or Process Class) is doing against the goal. That provides a closer level of granularity. Low-Level – monitoring down at the server level, and seeing things like the average time spent in each hop, or the amount of CPU utilization. You see the options under “Monitor” ... there are six of them, and they go from highest level of monitoring down to the lowest level. Let’s now look at the first, or highest, level of monitoring.
  • The Exceptions Report provides a quick snapshot of those Service Classes that are not meeting their performance goals. It does this by computing something called a “Performance Index” (“PI” for short), which is merely the actual performance divided by the goal. If actual is twice the goal, then the PI is “2”. The Exceptions Report shows all Service Classes whose PI value is greater than 1. A Service Class that is meeting or exceeding its goal will not be seen on this Exceptions Report. What this provides is a way to quickly see where problems might exist. There’s no reason to spend time sifting through information about Service Classes that are doing just fine. At a first cut the issue is: “Who is not performing to their goal?” The Exception Report provides that. Note: it is important to remember that a Service Class may have multiple Transaction Classes mapped to it, which means the “PI” value for a Service Class is a representation of the performance for many different transactions. It may be the case that the Service Class’s PI value is greater than 1 even though some or perhaps many of the individual transactions are doing okay. That’s why you should dig deeper into a Service Class reporting a PI greater than 1. Don’t assume that all the workload under the Service Class is performing badly; it may not be.
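As a worked example of the Performance Index arithmetic, here is a small Python sketch for average response time goals only. The goals are taken from the example service classes in this deck, with the default class treated as an average of 3 seconds. The observed values are invented for illustration, and the function names are not EWLM APIs.

```python
def performance_index(actual_seconds, goal_seconds):
    """PI for a response time goal: actual performance divided by the goal."""
    if goal_seconds <= 0:
        raise ValueError("goal must be positive")
    return actual_seconds / goal_seconds


# (observed average response time, goal), both in seconds.
# Observed values are hypothetical.
OBSERVED = {
    "High priority Web transactions": (2.0, 1.0),
    "Web transactions": (1.5, 2.0),
    "Default service class": (3.0, 3.0),
}


def exceptions_report(observed):
    """Names of service classes whose PI is greater than 1."""
    return sorted(name for name, (actual, goal) in observed.items()
                  if performance_index(actual, goal) > 1)


for name, (actual, goal) in OBSERVED.items():
    print(name, performance_index(actual, goal))
print("Exceptions:", exceptions_report(OBSERVED))
```

With these numbers only the high-priority class appears as an exception. It runs at twice its goal, so its PI is 2.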
  • The “Service Classes” selection will allow you to see a listing of all the Service Classes and view the PI for each. This will show you Service Classes with PI’s higher than 1 (like the Exception Report did), as well as those with PI lower than 1 (which are Service Classes that are meeting their goals). When the list is displayed, you also have the opportunity to see the Importance designation of the Service Class, the actual performance seen (expressed in the terms used for the goal of the Service Class), and the goal itself. Finally, there’s a pulldown with a list of actions you can use against the Service Classes. We’ll see what those actions are in a few charts.
  • Here’s an example of what a “Service Class Details” report looks like. Let’s go through what some of these things are telling us: Goal – the goal that’s defined in the Service Class. Transaction Classes – this shows what Transaction Classes are associated with this Service Class. This is a good way to understand what kind of workload is included within the Service Class reporting. And it shows how some thought ought to be given to the naming convention of your Transaction Classes so that with a quick scan you can easily tell what those transactions relate to. Performance and Performance Index – shows the actual performance seen for the Service Class. The Performance will be expressed in the same language as the goal: if the goal is average, the performance will be average; if the goal is percentile, the performance will be percentile, etc. The Performance Index is computed by dividing actual by goal. If the PI is greater than 1, the Service Class is not meeting its goal. Less than 1 means it is meeting its goal. Note: a PI of 0 means no work has been seen. This is important if you’re trying to determine if your Transaction Classes are classifying properly. If you see a Service Class with a PI of zero, you know nothing’s been classified to the class. Transaction Counts – lists the counts for various completion states of the transactions seen
  • Here’s what the Goal Achievement Monitor looks like. This is an applet that loads in your browser and provides a historical representation of the actual performance seen versus the goal. The goal is expressed as a horizontal line. In this example the goal is “80% in 2.0 seconds.” We see that, in actuality, 100% of transactions are completing within 2 seconds. Hence, this Service Class is meeting its goal. Note: at the present time EWLM does not maintain more than 1 hour of historical data. That shortcoming is being addressed, and we can expect to see a database with more extensive historical archiving.
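A percentile goal such as “80% in 2.0 seconds” can be checked with a few lines of Python. This is a sketch of the arithmetic only, not of how EWLM evaluates the goal internally. The response time samples and function names are invented for illustration.

```python
def fraction_within_goal(response_times, goal_seconds):
    """Fraction of completed transactions that finished within goal_seconds."""
    if not response_times:
        return None  # no completed work seen in this interval
    within = sum(1 for t in response_times if t <= goal_seconds)
    return within / len(response_times)


def meets_percentile_goal(response_times, goal_seconds, goal_fraction):
    """True when at least goal_fraction of transactions met the time goal."""
    fraction = fraction_within_goal(response_times, goal_seconds)
    return fraction is not None and fraction >= goal_fraction


GOAL_SECONDS = 2.0    # "in 2.0 seconds"
GOAL_FRACTION = 0.80  # "80%"

good_interval = [0.4, 0.9, 1.2, 1.8, 2.0]  # hypothetical samples, in seconds
bad_interval = [0.5, 1.0, 2.5, 3.0, 1.5]   # hypothetical samples, in seconds

print(meets_percentile_goal(good_interval, GOAL_SECONDS, GOAL_FRACTION))
print(meets_percentile_goal(bad_interval, GOAL_SECONDS, GOAL_FRACTION))
```

In the first interval every transaction finishes within 2 seconds, which matches the "100%" case in the note. In the second interval only three of five do, so the 80% goal is missed.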
  • The “Topology” representation shows a graphical picture of the transactions (or servers, if you choose that display) in the hop structure seen by EWLM for this service class. What is perhaps more useful is the “Table View,” which can be accessed by clicking on the little table icon. This will bring up a table that shows detailed statistics for the transactions per hop. Note: understanding and analyzing the numbers in this table, and the relationships between them, is beyond the scope of this presentation. There’s a lot in that table. That’s deep-dive performance analysis stuff. For this presentation, our objective is to let you know that the statistics are there, and how to get to them.
  • Similar to the server topology, EWLM provides an application topology, which shows the relationships between the different application environments through which a business transaction flows. EWLM provides alerts when it detects that goals are not being met because of a bottleneck at a specific application environment.
  • Green: good. Red: bad.
  • Manage CPU resources across partitions based on the EWLM end-to-end policy. This approach is similar to how IRD on z/OS manages LPAR weights. Move resources dynamically to where they are needed to meet the goals stated in the policy. If all goals cannot be met, sacrifice the least important work. Multiple partition workload groups are allowed on the same hardware box. You can run a mixture of operating systems within a partition workload group. EWLM manages the partitions that use shared processors. Processing capacity on partitions that use dedicated processors cannot be changed dynamically.

    1. 1. Where's the Bottleneck? Tracking Down the Cause of Performance Problems Using EWLM Hiren Shah [email_address] Chicago, IL
    2. 2. Agenda
        • Enterprise Workload Manager (EWLM) Overview
        • How does EWLM monitor transactions?
        • Example environment
            • Identify work for EWLM to monitor
            • Create a domain policy to identify work EWLM is to monitor
        • Description of reports, monitors, and topologies available
        • Case Study: EWLM load balancing
        • Case Study: EWLM Virtual Machine (Partition) Management
    3. 4. EWLM Overview
        • Ability to monitor all or specific transactions that an application processes.
        • Ability to monitor all or specific operating system processes.
        • Monitor application transactions separate from operating system processes.
        • Obtain end-to-end transaction data
            • Example transaction flow:
            • Hop 0 – IBM HTTP Server (IHS)
            • Hop 1 – WebSphere Application Server
            • Hop 2 – DB2 Universal Database
        • Autonomic resource management based on business goals and importance specified in the customer supplied policy.
    4. 6. How does EWLM monitor transactions? Example Transaction Flow: Hop 0
        • EWLM process to monitor a transaction:
        • First ARM instrumented application in the domain receives the work request. This is the IHS Webserving plugin and is considered hop 0.
        • EWLM classifies the transaction to a transaction class.
        • EWLM updates the correlator that is attached to the transaction with the classification information.
    5. 7. How does EWLM monitor transactions? Example Transaction Flow: Hop 1
        • Transaction moves to a second application [hop 1] to continue processing.
        • EWLM recognizes that the transaction has been classified to a transaction class already. EWLM does not classify it again.
        • EWLM updates the correlator with performance data that is specific to hop 1.
    6. 8. How does EWLM monitor transactions? Example Transaction Flow: Hop 2
        • Transaction moves to third application [hop 2] to complete processing.
        • EWLM updates the correlator with performance data about this hop.
        • The transaction completes.
        • EWLM calculates end-to-end performance data for the transaction and reports the result in the Control Center.
    7. 9. ARM – Application Response Measurement
        When a business transaction spans server platforms, it’s necessary to “correlate” the time spent in each so an overall transaction time can be determined. The ARM specification calls for an “ARM Correlator” to be created by the agent and passed along from server to server.
        [Diagram labels: Webserver Plugin, WebSphere AppServer, DB2, each with ARM Services (EWLM managed server); ARM Correlator “12345”; EWLM Domain Manager]
        Server      Corr.   Start        Stop         Duration
        Plugin      12345   11:00:00.0   11:00:00.3   00.3 seconds
        AppServer   12345   11:00:00.3   11:00:02.3   02.0 seconds
        DB2         12345   11:00:02.3   11:00:12.3   10.0 seconds
        AppServer   12345   11:00:12.3   11:00:12.5   00.2 seconds
        Plugin      12345   11:00:12.5   11:00:12.6   00.1 seconds
        Overall Response Time                         12.6 seconds
        Conceptual View of Things! In reality it’s more sophisticated than this.
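The conceptual table on this slide can be reproduced with a short Python sketch: group the timed segments by correlator, add them up, and see which server accounts for most of the time. The data structure and function name are illustrative assumptions; as the slide says, the real mechanism is more sophisticated. Durations are kept as integer milliseconds to avoid floating-point rounding.

```python
from collections import defaultdict

# (server, correlator, duration in milliseconds), copied from the slide's table.
SEGMENTS = [
    ("Plugin", "12345", 300),
    ("AppServer", "12345", 2000),
    ("DB2", "12345", 10000),
    ("AppServer", "12345", 200),
    ("Plugin", "12345", 100),
]


def summarize(segments, correlator):
    """Return (total_ms, per_server_ms) for all segments with this correlator."""
    per_server = defaultdict(int)
    for server, corr, duration_ms in segments:
        if corr == correlator:
            per_server[server] += duration_ms
    return sum(per_server.values()), dict(per_server)


total_ms, per_server = summarize(SEGMENTS, "12345")
slowest = max(per_server, key=per_server.get)

print("Overall response time:", total_ms / 1000, "seconds")
for server, ms in per_server.items():
    print(f"  {server}: {ms / 1000} seconds")
print("Largest share of time:", slowest)
```

The per-server totals come out to 0.4 seconds for the plugin, 2.2 seconds for the application server and 10.0 seconds for DB2, which points at the database tier as the place to look first.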
    8. 10. What happens if some piece isn’t ARMed?
        There’s still value!
        • Key Points:
            • Transaction flow not interrupted – flows just as it would were EWLM not in the picture at all
            • First “Hop” can determine “end-to-end” response time, but details at later “hops” get lost
            • EWLM does have a mechanism to monitor platform-initiated “processes”
        [Diagram labels: WebServer with Plugin (ARM API, EWLM code, Operating System); Other AppServer (ARM API, EWLM Code, Operating System; App doesn’t drive API); DB2 Database Server (ARM API, EWLM Code, Operating System); EWLM Domain Manager (EWLM Manager, Messaging Services, Control Center); EWLM Management Console]
    9. 11. Managing un-ARMed middleware / applications
        If an application doesn’t make use of the ARM APIs provided by EWLM, then EWLM can’t monitor the response times. But some monitoring is possible. The EWLM agent can “see” elements of the platform operating environment, and detect the starting of processes.
        • Two things:
            • Report on server statistics, like CPU utilization
            • Can make use of “Process Classes” or “Partition Classes” to monitor un-ARMed applications
        Message: having applications ARMed is preferable, but some monitoring and management is possible even without it.
        [Diagram labels: EWLM Agent; Operating System; Processes]
    10. 12. Information gathered … Is there a problem?
        • It Depends …
            • Is this a critical application? Is 12.6 seconds good or bad for this application?
            • Is this 12.6 second response time typical? Or does this usually run quickly but had a one-time slow response?
            • Is there something else running that’s even more important than this application? Is that more-critical thing consuming resource?
        To answer these questions we have to compare the response times seen to defined goals to know whether there’s a problem or not. (We do this in our minds … key here is to capture it in EWLM)
        [Diagram labels: Webserver 0.4 Seconds; AppServer 2.2 Seconds; Database 10.0 Seconds]
    11. 13. Example environment
        • Business environment:
        • Web application [Bank application, online sales, etc]
        • Work running in this domain:
        • Web transactions
        • Operating system processes
        • Typical end-to-end transaction flow:
        • Hop 0 = IHS
            • Entry application
        • Hop 1 = WebSphere Application Server
            • Application server environment for servlets
        • Hop 2 = DB2
            • Database that stores account information
        [Diagram labels: zSeries – z9, Lpar1: z/OS, DB2; xSeries, Guest 1: xLinux, IBM HTTP Server; Guest 2: Windows, WebSphere]
    12. 14. Work for EWLM to monitor
        • Application-level transactions
        • Entry application = IHS
            • First ARM-instrumented application in the domain that processes the transaction.
        • Result: Need transaction class(es) for the IHS Webserving plugin.
        • xLinux processes
        • Result: Need process class(es) for xLinux.
        [Diagram labels: zSeries – z9, Lpar1: z/OS, DB2; xSeries, Guest 1: xLinux, IBM HTTP Server; Guest 2: Windows, WebSphere]
    13. 15. Create a domain policy - core steps
        • Define service classes that contain performance goals.
        • Add the applications to the domain policy.
        • Create transaction classes for the Entry applications.
        • Add the platforms whose operating system process EWLM is to monitor.
        • Create process classes.
        • Note: Items in bold represent components of the domain policy that display in the left pane of the domain policy editor in the EWLM Control Center.
    14. 16. Step 1: Define service classes
        Example service classes for application-level transactions
        • Can use any performance goal: Average response time, percentile response time, velocity or discretionary
        • Service class name: High priority Web transactions
            • Average response time of 1 second
            • Use for transactions initiated by bank tellers or customers.
        • Service class name: Web transactions
            • Average response time of 2 seconds
            • Use for transactions for a specific IHS application instance
        • Service class name: Default service class
            • 90% in Average response time of 3 seconds
            • Use for general IHS transactions on any server or instance in the domain.
    15. 17. Step 2: Define service classes
        Example service classes for operating system processes
        • For short-running batch processes a Response Time goal can be used; otherwise use a Velocity or Discretionary goal
        • Service class name: Mail Processing
            • Goal = Fastest velocity
            • For un-instrumented application running on xLinux that handles mail client.
        • Service class name: Print Processing
            • Goal = Slowest velocity
            • For un-instrumented print server running on xLinux
        • Service class name: Development processes
            • Goal = Moderate velocity
            • For development application processes
    16. 18. Step 3: Define transaction classes
        Example transaction classes for IBM Webserving plugin
        Transaction class name: Bank Teller Web transactions
        Service Class: High Priority Web transactions
        Average response time = 1 second
    17. 19. Step 3: Define transaction classes
        Example transaction classes for IBM Webserving plugin
        Transaction class name: IHS Web transactions
        Service Class: Web transactions
        Average response time = 2 seconds
    18. 20. Step 3: Define transaction classes
        Example transaction classes for IBM Webserving plugin
        • Transaction class name:
        • Default - IBM Webserving plugin
        • Rules:
        • (*) = (*)
        • Service Class: Default Service Class
            • Average response time = 3 seconds
    19. 21. Review – Steps to create a domain policy
        • Define service classes that contain performance goals for transactions and processes.
        • Add the applications to the domain policy that instrument Application Response Measurement (ARM) 4.0 standard APIs.
        • Create transaction classes for the Entry applications to identify specific transactions that an ARM-instrumented application processes.
        • Add the platforms whose operating system process EWLM is to monitor.
        • Create process classes to identify un-instrumented work.
        • Next steps…
        • Deploy and activate your policy on the EWLM domain.
        • View the monitors and reports in the Control Center to determine the cause of performance problems, if they exist.
    20. 22. Step 5: Define process classes
        Example Linux process classes
        EWLM provides another type of class called a “Process Class.” These are used for work requests that are not transaction-oriented, or for un-ARMed applications.
        • How Process Classes are the same as / differ from Transaction Classes:
            • Same: Process Classes use Rules and Filters and map to Service Classes
            • Differ: Some different filter types and different Service Class goal
        Transaction Classes – when an ARM-enabled application initiates the work
        Process Classes – when the server initiates the work
        [Diagram labels: EWLM code; Mail Server; xLinux]
        Process Class
            Name: PC_mail_process
            • Rule:
                • Filter Type: Executable Path
                • Filter Operation: Equal
                • Filter Value: optmyCOmailServer
            Service Class: Mail Processing
        Service Class
            Name: “Mail Processing”
            Goal: Velocity: Fastest
    21. 23. Monitoring can be done at three levels of detail The Control Center provides a section for monitoring activity in the Domain. You can drill down from high-level to lower-level: <ul><li>Displays only those Service Classes that are not meeting their goals </li></ul><ul><li>Displays all Service Classes so you can see which are meeting goals, which are not </li></ul><ul><li>Displays all Transaction Classes so you can see how individual classes are performing against goals </li></ul><ul><li>Allows you to drill down to the managed servers themselves and see things like overall processor utilization and delays within Service Classes </li></ul>
    22. 24. Exception Report This report shows the Service Classes that are not meeting their goals: <ul><li>Uses the “Performance Index” (PI) as the gauge: </li></ul><ul><ul><li>“PI” greater than 1: Not meeting goal </li></ul></ul><ul><ul><li>“PI” less than 1: Meeting goal </li></ul></ul><ul><ul><ul><li>If less than 1, won’t show on “Exceptions” </li></ul></ul></ul>This provides a first-level indication of where a problem may exist. (Remember: a Service Class may comprise multiple Transaction Classes.)
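As a rough illustration of the exception report logic described on this slide, the sketch below keeps only the service classes whose PI is greater than 1. The function name, the input dictionary, and the choice to list the worst PI first are assumptions made for the example; the Control Center's actual ordering is not stated in the slides.

```python
def exception_report(pi_by_service_class):
    """Return (service class, PI) pairs with PI > 1, worst PI first.

    Sorting by descending PI is an illustrative choice, not documented
    Control Center behavior.
    """
    failing = {
        name: pi for name, pi in pi_by_service_class.items() if pi > 1
    }
    return sorted(failing.items(), key=lambda item: item[1], reverse=True)
```

For example, given PI values of 0.8, 1.4 and 2.1 for three hypothetical service classes, only the last two would appear in the report.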
    23. 25. Service Class Detail This shows all the Service Classes and gives you a snapshot view of how they’re doing: Column with PI values. May sort the table by this column. Pulldown of actions that can be executed against the selected Service Class. Actual performance of Service Class The goal of the Service Class
    24. 26. Service Class Details Provides a snap-shot of the details under the Service Class: Goal defined in the Service Class In this example, three different Transaction Classes are associated with the Service Class Actual performance. “0” means no activity mapping to this Service Class. Transaction counts. Use this to determine if a significant number of transactions aren’t completing as designed
    25. 27. Transaction classes Examine transaction classes to determine if performance goals are met. Note: This view allows you to view performance of only application-level work.
    26. 28. Transaction class details <ul><li>Examine details for more information such as: </li></ul><ul><li>Transaction states </li></ul><ul><li># of transactions </li></ul><ul><li>Note: Only includes the application-level work specific to this transaction class. </li></ul>
    27. 29. Process classes Examine process classes to determine if performance goals are met for work included in each process class. Note: This view allows you to view performance of only operating system or un-instrumented processes.
    28. 30. Partition classes Examine partition classes to determine if performance goals are met for work included in each partition class.
    29. 31. EWLM Control Center Monitors <ul><li>Monitors provide the following: </li></ul><ul><li>Automatically refresh data every 30 seconds. </li></ul><ul><li>Display up to 24 hours of data. </li></ul><ul><li>Provide export function to save the data. </li></ul><ul><li>Types of monitors </li></ul><ul><li>Goal achievement monitor </li></ul><ul><li>Transaction rate monitor </li></ul><ul><li>Processor utilization monitor </li></ul><ul><li>Performance index monitor </li></ul>
    30. 32. Goal achievement monitor <ul><li>Displays the actual performance compared to the goal. </li></ul><ul><li>Use this to determine if there are specific time intervals when goals are not met. </li></ul><ul><li>Compare to the transaction rate monitor to determine if high workloads directly relate to missed performance goals. </li></ul>Blue line indicates actual Yellow line represents goal
    31. 33. Performance index (PI) monitor <ul><li>Displays the actual performance index compared to a PI of 1 . </li></ul><ul><li>Use this monitor to view the performance index as it fluctuates over time. </li></ul><ul><li>PI>1 indicates the goal is not met </li></ul><ul><li>PI=1 indicates the goal is met </li></ul><ul><li>PI<1 indicates the goal is exceeded </li></ul>Yellow line represents goal Blue line indicates actual
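The slides give the meaning of PI values but not the formula behind them. The sketch below assumes that, for an average response time goal, PI is the ratio of actual to goal response time, so that a slower-than-goal class scores above 1. That ratio is an assumption for illustration and should be checked against the EWLM documentation; the function names are hypothetical. The three status strings follow the slide.

```python
def response_time_pi(actual_seconds, goal_seconds):
    """Assumed PI for a response time goal: actual divided by goal."""
    if goal_seconds <= 0:
        raise ValueError("goal_seconds must be positive")
    return actual_seconds / goal_seconds


def pi_status(pi):
    """Map a PI value to the three states listed on the slide."""
    if pi > 1:
        return "goal is not met"
    if pi == 1:
        return "goal is met"
    return "goal is exceeded"
```

Under this assumption, a class with a 3 second goal and a 6 second actual average would have a PI of 2 and would be reported as not meeting its goal.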
    32. 34. Transaction rate monitor <ul><li>Displays the number of transactions (or processes) that complete per second. </li></ul><ul><li>Use this to determine when peak workloads exist, which may directly relate to a missed performance goal. </li></ul>
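To make the "transactions per second" figure concrete, the sketch below converts per-interval completion counts into rates and finds the interval with the peak workload. It assumes one sample per 30-second refresh interval, which is a guess based on the monitor refresh period mentioned earlier; the function names and the input format are hypothetical.

```python
REFRESH_SECONDS = 30  # assumed sampling interval, taken from the monitor refresh period


def transaction_rates(completed_per_interval, interval_seconds=REFRESH_SECONDS):
    """Convert completion counts per interval into transactions per second."""
    return [count / interval_seconds for count in completed_per_interval]


def peak_interval(completed_per_interval, interval_seconds=REFRESH_SECONDS):
    """Return the index of the interval with the highest transaction rate."""
    rates = transaction_rates(completed_per_interval, interval_seconds)
    if not rates:
        raise ValueError("no samples supplied")
    return max(range(len(rates)), key=rates.__getitem__)
```

The peak index can then be compared with the same interval in the goal achievement data to see whether a missed goal coincides with high load.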
    33. 35. EWLM topology views <ul><li>Provides end-to-end view of work that EWLM monitors in the EWLM domain. </li></ul><ul><li>Displays a warning icon on a specific node that might be the cause of performance problems. </li></ul><ul><li>Topology views - </li></ul><ul><li>Server topology </li></ul><ul><li>Application topology </li></ul>
    34. 36. Server topology <ul><li>Displays all managed servers in the domain that EWLM is monitoring. </li></ul>
    35. 37. Application topology <ul><li>Displays all applications in the domain that EWLM is monitoring. </li></ul>
    36. 38. Application topology details View Average active time to determine if time allocated to each hop is appropriate. Use to determine if a hop (application instance) does not adhere to a performance goal.
    37. 39. Managed Servers and Server Details It’s even possible to drill down and see statistics on the server platform itself: For all servers in the Domain
    38. 40. Problem identified – report it or do something? Today, EWLM’s capabilities provide some autonomic management. Over time, more and more “management” capabilities will be rolled into EWLM. Human Operator Manual Action Goal EWLM Domain Manager EWLM Agent EWLM dynamic LPAR management TCP Load Balancing Devices TIO Server Provisioning
    39. 41. Domain Manager Load balancers ask the Domain Manager for recommendations (weight) using SASP Management Domain with servers of different capacity Transactions Server and Application Health and Performance Statistics Tran 6 Tran 5 Transactions Tran 8 Tran 9 Tran 10 Tran 11 <ul><li>EWLM understands: </li></ul><ul><li>End to end performance goal </li></ul><ul><li>Convoluted application and server topology </li></ul><ul><li>Hardware characteristics: CPUs, Memory, IO </li></ul><ul><li>Application performance: response time, resource utilization. </li></ul>Transaction Routing Using EWLM’s Recommendations Tran 1 Tran 2 Tran 4 Tran 3 Tran 7 IP/ Port/ Protocol IP web 6 WAS 2 Sys A (Managed Server) web 7 WAS 5 Sys B (Managed Server) web 1 WAS 4 Sys D (Managed Server) Load Balancer Application Group 1 Group 2 Load Balancer Group 3 System Group 4 web 1 DB2 3 Sys C (Managed Server) WAS 4
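Once a load balancer has a weight for each server, it still has to turn those weights into routing decisions. The sketch below shows one common way to do that, a smooth weighted round-robin. It does not implement SASP, and it is not a description of how any particular load balancer or EWLM itself routes work; the function name and the weight values are hypothetical.

```python
def route_transactions(weights, count):
    """Pick a server for each of `count` transactions, in proportion to weight.

    Uses smooth weighted round-robin: every round each server's running
    score grows by its weight, the highest score wins, and the winner's
    score is reduced by the total weight.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one server needs a positive weight")
    scores = {server: 0 for server in weights}
    picks = []
    for _ in range(count):
        for server, weight in weights.items():
            scores[server] += weight
        chosen = max(scores, key=scores.get)
        scores[chosen] -= total
        picks.append(chosen)
    return picks
```

With hypothetical weights of 3 for Sys A and 1 for Sys B, four consecutive transactions would be split three to one, with the lighter server's turn interleaved among the heavier server's turns.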
    40. 42. Load Balancing Case Study Overview <ul><li>Workload Generator: Trade Application </li></ul><ul><li>Workload Generation </li></ul><ul><ul><li>1 engine simulating 100 active clients, for a total of 100,000 page requests </li></ul></ul><ul><li>Machines </li></ul><ul><ul><li>3 Managed Servers are on Windows running in VMware on 2-CPU xSeries servers </li></ul></ul><ul><ul><li>1 Managed Server is on Solaris 9 with 2 CPUs; the Domain Manager is on xLinux. </li></ul></ul><ul><li>Load Balancer </li></ul><ul><ul><li>CISCO Catalyst 6509 w/Content Switch Module </li></ul></ul><ul><li>Tests repeated 4 times </li></ul>
    41. 43. Case Study: Application Topology Solaris Solaris Solaris Windows Windows Windows Windows Windows Windows Windows Windows Windows IHS WAS DB2 CISCO CSM
    42. 44. Average Page Response Time
    43. 45. Page Throughput
    44. 46. Domain Manager User Interface EWLM LPAR Management AIX AIX i5/OS Linux AIX <ul><li>Management of CPU resources across partitions on Power 5 hardware (similar to IRD on zSeries) </li></ul><ul><li>Mixture of operating systems </li></ul><ul><li>Multiple partition groups </li></ul>HV
    45. 47. Case Study: Partition Management <ul><li>Transactions have equal business goal and importance. </li></ul><ul><li>3 partitions on pSeries 570 box. </li></ul><ul><li>The database is not instrumented with ARM. </li></ul><ul><li>EWLM LPAR management was not turned on initially. </li></ul>hci088_AIX hci090_AIX hci092_AIX IHS WebSphere Un-Instrumented DataBase
    46. 48. Performance: Without EWLM Partition Mgmt.
    47. 49. EWLM LPAR Mgmt. Adjustments hci088_AIX : IHS hci090_AIX : WAS hci092_AIX : Un-Instrumented Database <ul><li>Virtual Processor Adjustment </li></ul><ul><li>Processing Capacity Adjustment </li></ul><ul><li>Weight Adjustment </li></ul>Actions <ul><li>EWLM manages partition capacity for un-instrumented work. </li></ul><ul><li>Improved transaction rate </li></ul>Results Before After
    48. 50. Performance: After EWLM LPAR Mgmt. Transaction rate comparison Before After
    49. 51. Where to go for more information <ul><li>IBM Enterprise Workload Manager Red Book: SG24-6350 </li></ul><ul><ul><li>URL: http://www.redbooks.ibm.com/abstracts/sg246785.html?Open </li></ul></ul><ul><li>EWLM Information Center: </li></ul><ul><ul><li>URL: http://publib.boulder.ibm.com/infocenter/eserver/v1r2/index.jsp </li></ul></ul><ul><li>Hardening the EWLM performance Data – Red paper </li></ul><ul><ul><li>URL: http://www.redbooks.ibm.com/abstracts/redp4018.html?Open </li></ul></ul><ul><li>EWLM interpreting control center performance reports – Red paper </li></ul><ul><ul><li>URL : http://www.redbooks.ibm.com/abstracts/redp3963.html?Open </li></ul></ul><ul><li>EWLM Class </li></ul><ul><ul><li>Enterprise Workload Manager Planning and Implementation (Course Code OZ200) </li></ul></ul>
    50. 52. Notices <ul><ul><li>Produced in the United States of America, 08/04, All Rights Reserved </li></ul></ul><ul><ul><li>IBM, IBM eServer logo, IBM logo, e-business on demand, DB2, DB2 Connect, DB2 Universal Database, HiperSockets, Enterprise Storage Server, Performance Toolkit for VM, Tivoli, TotalStorage, VM/ESA, WebSphere, z/OS, z/VM and zSeries are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries or both. </li></ul></ul><ul><ul><li>Java and all Java-based trademarks and logos are trademarks of Sun Microsystems, Inc. in the United States, other countries or both. </li></ul></ul><ul><ul><li>UNIX is a registered trademark of The Open Group in the United States and other countries. </li></ul></ul><ul><ul><li>Intel is a trademark of Intel Corporation in the United States, other countries or both. </li></ul></ul><ul><ul><li>Linux is a trademark of Linus Torvalds in the United States, other countries, or both. </li></ul></ul><ul><ul><li>Microsoft, Windows and Windows NT are registered trademarks of Microsoft Corporation </li></ul></ul><ul><ul><li>Other company, product and service names may be trademarks or service marks of others. </li></ul></ul><ul><ul><li>Information concerning non-IBM products was obtained from the suppliers of their products or their published announcements. Questions on the capabilities of the non-IBM products should be addressed with the suppliers. </li></ul></ul><ul><ul><li>IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply. </li></ul></ul><ul><ul><li>IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area. 
</li></ul></ul><ul><ul><li>All statements regarding IBM’s future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. </li></ul></ul>
