

  • 1. IBM DB2 QUERY PATROLLER V7.1 Data Management Solutions White Paper
  • 2. First Edition (July 2000) © Copyright International Business Machines Corporation 2000. All rights reserved. Note to U.S. Government Users -- Documentation related to restricted rights -- Use, duplication or disclosure is subject to restrictions set forth in GSA ADP Schedule Contract with IBM Corp.
  • 4. Notices
References in this publication to IBM products, programs, or services do not imply that IBM intends to make these available in all countries in which IBM operates. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any of the intellectual property rights of IBM may be used instead of the IBM product, program, or service. The evaluation and verification of operation in conjunction with other products, except those expressly designated by IBM, are the responsibility of the user.
IBM may have patents or pending patent applications covering subject matter in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to:
    IBM Corporation
    IBM Director of Licensing
    208 Harbor Drive
    Stamford, Connecticut 06904
    U.S.A.
  • 5. Trademarks
AIX, DB2, NUMA-Q, and QMF are trademarks of International Business Machines Corporation. Microsoft, Windows, and Windows NT are registered trademarks of Microsoft Corporation. Java and all Java-based trademarks and logos, and Solaris, are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. Other company, product, and service names used in this publication may be trademarks or service marks of others.
  • 6. Contents
Notices
Trademarks
Contents
Large Scale Warehouse Challenges
    EMPOWERING USERS WITH INFORMATION
    ROBUST SCALABILITY
    HIGH AVAILABILITY AND STABILITY
    CORPORATE ASSET PROTECTION
    DESCRIPTION
    ARCHITECTURE OVERVIEW
        Overview
        Query Patroller
        System Parameters
        Query Monitor
        Query Enabler
        DB2 Query Patroller Tracker
DB2 Query Patroller - Brings it all together in one product
    SETTING THE STANDARD FOR ROBUST QUERY MANAGEMENT
        Ad Hoc Query Tools
        Hardware Load Levelers
        Proprietary Query Management Tools
        DB2 Query Patroller All Management in One Tool
    HIGH AVAILABILITY AND STABILITY
  • 7. Contents (continued)
        Proactive Query Capture
        Robust Query Termination
    ROBUST SCALABILITY
        Frees User Desktop, Improving Productivity
        Query Cost Analysis
        User Prioritization
        Load Balancing
        Data Warehouse Optimization
For More Information
Additional Information
  • 8. Large Scale Warehouse Challenges
Today, database management personnel are facing increasing challenges. While they want to deliver information to end users as quickly as possible, they are finding that it takes an enormous amount of resources to be responsive to the growing number of users demanding information. Even worse, the users’ needs for information continue to change and evolve as new information becomes available.
EMPOWERING USERS WITH INFORMATION
Innovative data mining, information queries, and trend analysis techniques are providing companies with much-needed competitive advantage, and are enabling radical breakthroughs within industries. As a result, there is great demand for rapid, innovative query capabilities against increasingly large, mission-critical data warehouses.
In the past, end users typically called upon their Information Systems (IS) department and asked for a report on the market, sales, or inventory. IS generated the report on the user’s behalf and provided it, usually days or even weeks later. These reports were static in nature and did not allow the user to change the selection criteria. Sometimes requirements were not communicated properly, and users waited weeks for a report that did not answer their business questions.
In the late 1980s, desktop query and reporting tools entered the market, allowing end users to perform their own queries against the corporate or departmental database. This immediately gave the companies using these applications a competitive advantage. While end users at other companies waited weeks for a report in order to make a decision, companies that allowed their business analysts and key managers to query the database directly were making decisions in minutes.
ROBUST SCALABILITY
The steady increase of end users performing queries against the corporate database presents a huge challenge for database administrators. As the number of users performing their own queries increases, the response time of the system may decline due to increased contention. Large scale data warehouses that provide breakthrough business value pose a challenge: How can a company’s data warehouse continue to provide quick response time across large amounts of data to an ever-increasing number of end users tapping the power of ad hoc query tools on their desktops?
  • 9. One solution to this problem has been to add more hardware. The new symmetric multi-processing (SMP) and massively parallel processing (MPP) systems available in recent years can, in part, help handle the increased load or help spread the load over several machines to improve query performance. This is the direction many leading edge organizations have taken with their Decision Support Systems (DSS).
HIGH AVAILABILITY AND STABILITY
Even with the addition of new, more powerful hardware systems, users may still unknowingly submit “runaway” queries. When users submit these queries during peak business hours, they can bring even the largest system to a crawl. With the advancement in query tools, it is now possible for every business analyst in a company to quickly generate a query without knowing anything about the back-end database or about SQL.
When too many end users submit complex “runaway” queries to a database running on a large MPP system at the same time, they can potentially bring the large, multi-terabyte system to its knees. The drop in response is due primarily to poor query management. The database load management capability is overwhelmed because all the queries reach the data warehouse at the same time. If the query submissions had been controlled, the response time would not have been significantly impacted.
CORPORATE ASSET PROTECTION
All of these trends point to the need to grow and protect the data warehouse as the vital corporate asset that it is. The robustness, availability, performance, and security of large data warehouses are of paramount importance since they enable radical business breakthroughs while maintaining competitive advantage in the marketplace.
  • 10. DB2 Query Patroller Developed To Manage Your Workloads
DESCRIPTION
DB2 Query Patroller greatly improves the scalability of a data warehouse by allowing hundreds of users to safely submit queries on multi-terabyte class systems. The product is a true three-tier architecture solution. Its components span the client/server environment to better manage and control all aspects of query submission.
DB2 Query Patroller acts as an agent on behalf of the end user. It prioritizes and schedules queries so that query completion is more predictable and computer resources are efficiently utilized. After an end user submits a query, DB2 Query Patroller frees up the user’s desktop so they can perform other work, or even submit additional queries, while waiting for the original query results.
DB2 Query Patroller is integrated with the DB2 optimizer. It performs cost analysis on queries and then schedules and dispatches them so that the load is balanced across the database partitions.
DB2 Query Patroller sets individual user and group priorities, as well as user query limits. This enables the data warehouse to deliver the needed results to its most important users as quickly as possible. It can also limit usage of the system by stopping “runaway” queries before they even start. If desired, an end user can choose to receive e-mail notification of scheduled query completion or query failure.
ARCHITECTURE OVERVIEW
Overview
DB2 Query Patroller consists of several components running on the database server and on end users’ desktops, each with a specific task in providing query and resource management.
  • 11. [Figure 1 – DB2 Query Patroller Overview. Labels in the diagram: End User, DB2 CAE, Query Monitor, Internet Server, Administrator, Query Administrator, Query Patroller Tracker, Enterprise Network, DB2 EEE, DB2 EE, Query Patroller Server, Agent, Data Warehouse, Datamart. Platforms: AIX, Solaris, NT, 2000, HP-UX, NUMA-Q.]
Query Patroller Server
The Server is the core component of DB2 Query Patroller. It provides an environment for storing user profiles, storing system parameters, maintaining job lists, scheduling queries, and storing node information. The Server component executes on a node with the DB2 database server.
Query Administrator
The Administrator component gives a DBA or system administrator the tools needed to manage the DB2 Query Patroller environment. This Java interface provides menus to configure user profiles, system parameters, and node parameters.
System Parameters
The system administrator can set up system-wide, partition, user, or group level thresholds for governing the data warehouse, including:
    • Maximum number of queries running on the system at any given time.
    • Maximum cost threshold for the entire system. The cumulative cost of all queries running cannot exceed this number.
    • Maximum cost threshold for each defined user or group in the system.
    • Maximum number of jobs a user can initiate. This value can be configured differently for each of your users or groups.
    • Specific amount of time to retain temporary result tables. When DB2 Query Patroller takes control of a query, a temporary table is created in the database to store the query results. DB2 Query Patroller automatically cleans up these tables after the period of time specified by the administrator.
Query Patroller also allows users to share result sets, so that a query can be executed once and all authorized users can reuse the result set.
  • 12. Query Patroller Agent
The Agent component of DB2 Query Patroller resides on each of the database server nodes. It processes database requests on behalf of the Query Patroller Server and gathers resource utilization statistics, which allow for query workload balancing as well as monitoring of the resource utilization of each partition. On a uni-processor or non-clustered SMP machine, the agent and server components run on the same machine. On an MPP or clustered SMP system, the server runs on one node and the agents run on all of the database nodes.
Query Monitor
The Query Monitor component of DB2 Query Patroller provides both administrators and end users with a Java-based interface for viewing and managing their queries. It enables end users to view a job’s status, submit and cancel queries, and drop result tables. End users can display information only for their own queries and jobs, while administrators can use the Query Monitor to manage all queries in the system.
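The thresholds listed under System Parameters can be read as an admission check applied to each incoming query. The following is a minimal Python sketch of that idea, not Query Patroller's actual logic. The names `Limits` and `decide` are hypothetical, and the choice to queue (rather than reject) a query that hits a system-wide or job-count limit is an assumption; the paper only states that a query exceeding the user's cost threshold is placed on hold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """Hypothetical container for the administrator-set thresholds."""
    max_running_queries: int   # system-wide cap on concurrently running queries
    max_system_cost: float     # cap on the cumulative cost of running queries
    max_user_cost: float       # cost threshold for a single user's query
    max_user_jobs: int         # cap on jobs one user can have running


def decide(user, cost, running, limits):
    """Classify a new query as "run", "queue", or "hold".

    `running` is a list of (user, cost) pairs for queries already running.
    """
    if cost > limits.max_user_cost:
        return "hold"   # exceeds the user's cost threshold
    if sum(1 for u, _ in running if u == user) >= limits.max_user_jobs:
        return "queue"  # assumption: wait until one of the user's jobs ends
    if len(running) >= limits.max_running_queries:
        return "queue"  # assumption: wait for a free slot
    if sum(c for _, c in running) + cost > limits.max_system_cost:
        return "queue"  # assumption: wait until cumulative cost drops
    return "run"


limits = Limits(max_running_queries=3, max_system_cost=100.0,
                max_user_cost=50.0, max_user_jobs=2)
running = [("ann", 30.0), ("bob", 40.0)]
print(decide("cat", 20.0, running, limits))  # fits under every limit
print(decide("cat", 60.0, running, limits))  # above the per-user cost threshold
print(decide("cat", 45.0, running, limits))  # would exceed the system cost cap
```

Checking the per-user cost first mirrors the paper's emphasis on stopping runaway queries before they start; the order of the remaining checks is arbitrary in this sketch.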
  • 13. Figure 2 – DB2 Query Monitor Job List
The DB2 Query Monitor job list maintains the queries submitted by end users. The job list contains information about each query submitted through Query Patroller. The system administrator or end user can use the job list to view information for the queries in the system, including:
    • Job sequence number
    • Query priority
    • Query status
    • Query source
    • Node on which the query was submitted
    • Type of application submitting the query
    • ID of user submitting the query
    • Date and time the query was submitted
Users and administrators may also view more detailed information on any of the queries listed in the job list table, such as query run time, cost of the query, and the SQL statement.
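To make the job list fields concrete, here is a small Python sketch of a job record and of the visibility rule described for the Query Monitor (end users see only their own jobs, administrators see all). The `Job` class, the `visible_jobs` function, the field names, and the sample status values are all hypothetical; the paper does not give the actual table layout.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Job:
    """One hypothetical job list entry, mirroring the fields listed above."""
    sequence_number: int      # job sequence number
    priority: int             # query priority
    status: str               # query status
    source: str               # query source
    node: str                 # node on which the query was submitted
    application_type: str     # type of application submitting the query
    user_id: str              # ID of the submitting user
    submitted_at: datetime    # date and time of submission


def visible_jobs(jobs, viewer, is_admin=False):
    """Return all jobs for administrators, or only the viewer's own jobs."""
    return [job for job in jobs if is_admin or job.user_id == viewer]


jobs = [
    Job(1, 5, "running", "dynamic SQL", "node1", "query tool", "ann",
        datetime(2000, 7, 3, 9, 0)),
    Job(2, 3, "held", "dynamic SQL", "node2", "query tool", "bob",
        datetime(2000, 7, 3, 9, 5)),
]
for job in visible_jobs(jobs, "ann"):
    print(job.sequence_number, job.status, job.user_id)
```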
  • 14. Query Enabler
The Query Enabler component of DB2 Query Patroller executes inside the DB2 client. This component intercepts dynamic SQL statements being passed to DB2 from any front-end query tool. Query Enabler interacts with other DB2 Query Patroller components and with the user to execute or schedule the query and to return results from previously completed queries.
Query Enabler intercepts queries submitted by end users. If matching queries exist on the Query Patroller Server, Query Enabler provides the end user with a display of those queries and prompts the user to indicate whether or not a new result set should be returned.
Whenever a user wants to submit a query, Query Enabler provides the option to set scheduled run times or to submit and wait for the results. If the end user does not want to wait for the query results, Query Enabler releases the desktop application and passes the query to the DB2 Query Patroller Server. Query Patroller then takes control of the query and runs it in the background on behalf of the end user. The next time the user submits that same query, the result set for that query is returned to the application.
Query Enabler can also run in a silent mode in which the end user does not interact with Query Enabler at all; end-user tools continue to submit queries directly to the server, just as they do today. This also enables 3-tier or n-tier applications to use Query Patroller without additional software on the client desktop.
DB2 Query Patroller Tracker
Figure 3 – DB2 Query Patroller Tracker
  • 15. The DB2 Query Patroller Tracker product enables a user to manage the databases by displaying usage history in a graphical, user-friendly format. It provides two key features that support system administrators in managing the database. First, it gives the system administrator the ability to monitor the database load and activity over time. Second, it provides the administrator with details on table and column access to assist in tuning the system. The Query Patroller server stores the historical information in DB2 tables, so administrators can drill down on whatever aspects of database usage they desire using the query tool of their choice.
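As an illustration of the table and column access details mentioned above, the following Python sketch tallies how often each table and column appears in a job history. It is only a sketch: the `history` records and the `access_counts` function are hypothetical, and the real product keeps this information in DB2 tables whose layout the paper does not describe.

```python
from collections import Counter

# Hypothetical job history: each job lists the (table, column) pairs it read.
history = [
    {"job": 1, "columns": [("SALES", "REGION"), ("SALES", "AMOUNT")]},
    {"job": 2, "columns": [("SALES", "REGION"), ("STORES", "CITY")]},
]


def access_counts(history):
    """Count table accesses (once per job) and column accesses."""
    tables = Counter()
    columns = Counter()
    for job in history:
        # A table counts once per job, however many of its columns were read.
        tables.update({table for table, _ in job["columns"]})
        columns.update(job["columns"])
    return tables, columns


tables, columns = access_counts(history)
for table, hits in tables.most_common():
    print(table, hits)
for (table, column), hits in columns.most_common():
    print(f"{table}.{column}", hits)
```

Columns that appear near the top of such a tally are the natural candidates when an administrator considers new indexes.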
  • 16. DB2 Query Patroller - Brings it all together in one product
SETTING THE STANDARD FOR ROBUST QUERY MANAGEMENT
To understand DB2 Query Patroller functionality and how it differs from query tool management systems, it is necessary to understand the problems each tool is designed to solve. At present, four technologies are available to manage queries and resources:
    • Ad hoc query tools
    • Three-tier proprietary tools
    • Server-based query and resource managers
    • Hardware resource managers
Each has its own strengths that make it appropriate for particular types of query situations. Figure 4 illustrates the queries divided into four classes based upon the resource load leveling provided and the management of the query before it runs against the database.
                      Resource Load Balancing: No    Resource Load Balancing: Yes
Query Managed         Proprietary Query Tools        DB2 Query Patroller
Query Not Managed     Ad Hoc Query Tools             Hardware Load Leveling
Figure 4 – Classes of Query and Resource Management
  • 17. Ad Hoc Query Tools
Ad hoc query tools do a good job of allowing end users to ask questions of the database directly, without having to go to IS personnel every time they need additional information. Generally, with relatively small databases and few users, there is little need for query and resource management. However, as databases grow larger and the number of users increases, the strain on the data warehouse becomes evident. In some cases, a query tool may have a rudimentary scheduling facility. However, that requires users to keep their PCs powered on overnight to run the scheduled query, or puts the burden on end users to share resources in good faith with other users. QMF for Windows is a unique exception that provides predictive query governing. However, it requires managing a distributed environment rather than having all query governing centrally managed by the database server.
Hardware Load Levelers
Some Database Management Systems and MPP hardware companies offer system software products that spread the query load across the database-specified nodes. Queries are routed to free nodes for processing. Even though this makes good use of the hardware resources, it does not look at the type of query being submitted. Any query that comes into this type of system is run immediately, regardless of the time or cost that the query will consume.
Proprietary Query Management Tools
Three-tier query tools have a server component that provides some capabilities for scheduling queries. This component releases the desktop and submits the query at the pre-scheduled time on behalf of the end user. Typically, these types of tools work only with their own front end and provide a canned query interface for end users, which makes them less adaptable for ad hoc querying. Typically, many users submit very predictable optimized queries. Three-tier query tools provide little user prioritization and resource balancing.
DB2 Query Patroller: All Management in One Tool
The first three categories of query and resource management tools fail to provide end users with acceptable query response times and IS with the control they need. The DB2 Query Patroller product addresses these challenges. DB2 Query Patroller is the only product of its kind on the market today that controls and monitors queries. DB2 Query Patroller works with dynamic SQL query tools to prioritize and schedule user queries based on user profiles and cost analysis performed on each query.
  • 18. Large queries are put on hold and can then be scheduled for a later time during off-peak hours. Queries with high priority (based on user profiles) are promoted to the top of the schedule. In addition, DB2 Query Patroller monitors resource utilization statistics to determine which partitions are the least busy and provides load distribution functionality that evens out the workload across the system.
HIGH AVAILABILITY AND STABILITY
Proactive Query Capture
At the core of DB2 Query Patroller’s breakthrough functionality is its ability to proactively capture queries. Query Patroller’s proactive approach to query management helps it guarantee the high levels of availability and stability required in a mission-critical data warehouse. As queries are submitted against the data warehouse, Query Patroller traps the queries, assesses their cost, and prioritizes their execution. Without this proactive query trap, users could submit “runaway” queries that compromise system availability, and IS could only report in retrospect why the system failed. DB2 Query Patroller serves as a vigilant eye over vital corporate data warehouses. Because the queries are captured, should the database server fail for any reason, Query Patroller automatically restarts them on behalf of the end user.
Robust Query Termination
The proactive query capture approach is enhanced through Query Patroller’s robust query termination. One of its strengths is its ability to effectively terminate queries. Many ad hoc query tools give end users a terminate option, but in reality the query is terminated only on the client workstation. The processes already started on the database server may not be terminated. Users who assume that the query has been cancelled may be more likely to submit other queries and repeat the same submit-and-terminate process.
The end result could be that the server gets bogged down with multiple orphan queries that continue to run, wasting valuable resources. DB2 Query Patroller addresses this problem by truly terminating the query. It ensures that both the end user workstation and the database server are released from a terminated query. This ensures that the cycles used for processing on the database server are fully utilized by needed queries, and it frees the system administrator from having to monitor and kill orphan queries.
  • 19. ROBUST SCALABILITY
Frees User Desktop, Improving Productivity
With most front-end query tools, after a user submits a query, the user’s application is “hung up” in a “pending output” state until the results of the query return. Users must wait until the query completes for their desktops to become available, which can greatly reduce their productivity. Users need to be able to perform other tasks, even submit additional queries, while earlier queries run in the background. In many cases, users don’t need their query results back until the next day or the following Monday morning. Thus, instead of being submitted for immediate execution, the query could be scheduled for a later time when the system load may be lower. DB2 Query Patroller frees up the user application and improves user productivity by allowing the user to submit and schedule queries based on their response requirements.
Query Cost Analysis
DB2 Query Patroller is integrated with the optimizer, which performs cost analysis of each query entering the system to determine the static cost of the query. Query Patroller enables the system administrator to modify a user’s profile and specify a query cost threshold for each user or group. After completing cost analysis, DB2 Query Patroller compares the returned value to the value in the user profile. If the returned value exceeds the user threshold, DB2 Query Patroller places the query on hold so that the query can run at a later time. Query Patroller also notifies the end user that their query is on hold for future execution.
User Prioritization
The majority of ad hoc query tools do not take into account a user’s priority with respect to other users submitting queries into the system. For example, the CEO of a company may need a report right away for a meeting, but the system is so overloaded that the query does not complete in time. If the CEO’s priority class level is high, the query request would automatically move to the top of the query submission queue and be executed immediately.
DB2 Query Patroller provides an environment that facilitates the prioritized completion of queries. It maintains a user profile for each user that submits queries into the system. The user profile defines a priority class, which identifies the relative priority a user has when submitting a query into the database. A higher priority class places the user’s query closer to the top of the query submission queue.
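The cost check and the priority-ordered submission queue described above can be sketched together in a few lines of Python. This is an illustrative model only, not the product's implementation: the `SubmissionQueue` class, the profile keys `priority` and `cost_threshold`, and the first-in-first-out ordering within a priority class are all assumptions, and the query cost is passed in as a plain number rather than obtained from the DB2 optimizer.

```python
import heapq
import itertools


class SubmissionQueue:
    """Hypothetical queue: holds over-threshold queries, orders the rest by priority."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker: FIFO within a priority class
        self.held = []                   # queries placed on hold for later execution

    def submit(self, user, sql, cost, profile):
        """Hold the query if its cost exceeds the user's threshold, else enqueue it."""
        if cost > profile["cost_threshold"]:
            self.held.append((user, sql))
            return "held"
        # heapq is a min-heap, so negate the priority class to pop the highest first.
        heapq.heappush(self._heap, (-profile["priority"], next(self._order), user, sql))
        return "queued"

    def next_query(self):
        """Pop the query belonging to the highest priority class."""
        _, _, user, sql = heapq.heappop(self._heap)
        return user, sql


profiles = {
    "analyst": {"priority": 1, "cost_threshold": 1000.0},
    "ceo": {"priority": 9, "cost_threshold": 5000.0},
}
queue = SubmissionQueue()
print(queue.submit("analyst", "SELECT 1", 200.0, profiles["analyst"]))
print(queue.submit("analyst", "SELECT 2", 2000.0, profiles["analyst"]))
print(queue.submit("ceo", "SELECT 3", 300.0, profiles["ceo"]))
print(queue.next_query())  # the CEO's query is dispatched before the analyst's
```

A heap keeps both submission and dispatch at logarithmic cost, which matters little at this scale but keeps the ordering rule in one place.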
  • 20. The system administrator sets individual user and group priorities, thus enabling the data warehouse to deliver the needed results to your most important users as quickly as possible. DB2 Query Patroller also enables the system administrator to limit the number of queries that each individual user can simultaneously submit. This feature gives other users the opportunity to have their queries processed in a timely fashion.
Load Balancing
Ideally, query workload should be balanced across available resources. However, in an MPP environment, ad hoc query tools may submit queries to only one or two nodes, heavily using some nodes and underutilizing others. In comparison, a server-based query manager can balance node utilization more intelligently. This prevents bottlenecks at the nodes being bombarded with ad hoc query requests.
DB2 Query Patroller provides the system administrator with the ability to set system and user parameters to govern the queries entering the database. The system administrator may specify the maximum number of concurrent queries for each user or group, for each node, and for the entire system.
DB2 Query Patroller provides load leveling across MPP hardware environments and clustered servers. By tracking node or server utilization, Query Patroller routes queries to idle nodes or servers and spreads the query load across the system.
Data Warehouse Optimization
DB2 Query Patroller enables system administrators to monitor the database load by providing access to the following information:
    • What tables are being accessed for all jobs
    • Columns accessed for each table
    • Number of rows returned, by table, for all jobs
    • Detailed view of job activity over time
    • Historical view of job activity
DB2 Tracker displays this information in an easy-to-view format by determining the total number of tables accessed in the database and calculating the total number of times each specific table is accessed.
For each table displayed, the user is able to drill down to view the columns accessed for queries against that table. This enables the administrator to decide if new indexes should be created on the columns used most in the table, if an Automatic Summary Table (AST) may improve performance, or if certain tables should be considered for archival.
  • 21. DB2 Query Patroller also provides robust chargeback mechanisms. Administrators can track usage by user, group, client hostname, or application submitting the query. The resources that can be accumulated for chargeback include elapsed query execution time, rows returned, or query cost. All of this information is stored in DB2 tables, which allows chargeback reporting using any query tool.
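A chargeback report of the kind described here amounts to grouping job records by a chosen dimension and summing the chargeable resources. The Python sketch below shows that aggregation under stated assumptions: the `chargeback` function and the record keys (`user`, `group`, `elapsed_seconds`, `rows_returned`, `cost`) are hypothetical stand-ins, since the paper does not describe the DB2 tables that actually hold this data.

```python
from collections import defaultdict

# Hypothetical completed-job records; the real data lives in DB2 tables.
jobs = [
    {"user": "ann", "group": "finance", "elapsed_seconds": 12.0,
     "rows_returned": 100, "cost": 40.0},
    {"user": "ann", "group": "finance", "elapsed_seconds": 8.0,
     "rows_returned": 50, "cost": 10.0},
    {"user": "bob", "group": "sales", "elapsed_seconds": 30.0,
     "rows_returned": 5, "cost": 75.0},
]


def chargeback(jobs, key="user"):
    """Sum elapsed time, rows returned, and query cost per value of `key`."""
    totals = defaultdict(
        lambda: {"elapsed_seconds": 0.0, "rows_returned": 0, "cost": 0.0}
    )
    for job in jobs:
        bucket = totals[job[key]]
        bucket["elapsed_seconds"] += job["elapsed_seconds"]
        bucket["rows_returned"] += job["rows_returned"]
        bucket["cost"] += job["cost"]
    return dict(totals)


for name, usage in chargeback(jobs, key="user").items():
    print(name, usage)
for name, usage in chargeback(jobs, key="group").items():
    print(name, usage)
```

Passing the grouping key as a parameter reflects the paper's point that usage can be tracked by user, group, client hostname, or application.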
  • 23. Additional Information
If you are interested in learning more about DB2 Query Patroller and the other products in the DB2 UDB database family, please contact your local IBM representative or visit our Web sites at: .