Bottom line: Y2K introduced a drop dead date for re-hosting. This has stemed the move to distributed systems. Is this true of your accounts ? If not rehosted, re-investment into M/F solution and this is unlikely to be abandoned. ERPs are now in the Mainframe space. Announce may 10th SAP application server to run on OS/390. For e-business, the mainframe webserver supports high volume transactions.
Pre-Joined Tables - Used when the cost of joining is prohibitive. Report Tables - Used when specialized critical reports are needed. Mirror Tables - Used when tables are accessed concurrently by multiple different types of environments. Split Tables - Used when distinct groups use different parts of a table. Combined Tables - Used to consolidate one-to-one or one-to-many relationships into a single table. Redundant Data - Used to reduce the number of table joins required. Repeating Groups - Used to reduce I/O and (possibly) storage usage. Derivable Data - Used to eliminate calculations and algorithms. Speed Tables - Used to make the processing of hierarchies more efficient. Physical Denormalization - Used to optimize for specialized physical DBMS characteristics.
Bottom line: E-commerce is driving workload growth. The data we are seeing shows that transaction based processes are expanding. This is significant for us as we move forward. Gartner: Te internet gives rise to unpredictable and unthrottable transaction volumes. The explosion in electronic delivery channels will lead to a mammoth surge in online transaction volumes. Number of IMS transactions growing at a rate of 25% Researchers forecast a three-fold increase in IT transactions between now and the year 2000. Every day, within the US alone, S/390-based systems transfer approximately two trillion dollars in electronic funds with near 100 percent availability. The number of US home banking users is expected to grow 75 percent annually until the year 2000. Ninety percent of the worlds largest businesses run their mission critical transactions on CICS and IMS processing monitors. IBM technology holds a potential 60 million North American households to enable access to the home-banking highway. The world’s largest organizations rely on a continuous link to information and logic residing on large-scale hosts. IBM CICS and IMS provide that access to business critical data and logic, keeping information accurate and consistent while providing timely access to many end users. Investments in these mission critical systems are profound. Changes that affect the corporate ‘jewels’ are evaluated intensely with conservative perspectives. The challenge to business is to leverage existing investments in mission critical systems while extending the business through new delivery platforms and the Internet. What is needed is an evolutionary approach to the new revolution.
Atomicity means that a transaction must exhibit an “all or nothing” behavior. Either all of the instructions within the transaction happen, or none of them happen. Atomicity preserves the “completeness” of the business process. Consistency refers to the state of the data both before, and after, the transaction is executed. A transaction maintains the consistency of the state of the data. In other words, after running a transaction all data in the database is “correct.” Isolation means that transactions can run at the same time. Any transactions running in parallel have the illusion that there is no concurrency. In other words, it appears that the system is running only a single transaction at a time. No other concurrent transaction has visibility to the uncommitted database modifications made by any other transactions. To achieve isolation a locking mechanism is required. Durability refers to the impact of an outage or failure on a running transaction. A durable transaction will not impact the state of data if the transaction ends abnormally. The data will survive any failures.
Specifying read uncommitted isolation implements read-through locks and is sometimes referred to as a dirty read . It applies to read operations only. With this isolation level data may be read that never actually exists in the database, because the transaction can read data that has been changed by another process but is not yet committed. Read uncommitted isolation provides the highest level availability and concurrency of the isolation levels, but the worst degree of data integrity. It should be used only when data integrity problems can be tolerated. Certain types of applications, such as those using analytical queries, estimates, and averages are likely candidates for read uncommitted locking. A dirty read can cause duplicate rows to be returned where none exist or no rows may be returned when one (or more) actually exists. When choosing read uncommitted isolation the programmer and DBA must ensure that these types of problems are acceptable for the application. The read committed isolation level, also called cursor stability , provides more integrity than read uncommitted. When read committed is specified the transaction will never read data that is not yet committed; only committed data can be read. With an isolation level of repeatable read a further restriction is placed on reads, namely the assurance that the same data can be accessed multiple times during the course of the transaction without its value changing. The lower isolation levels (read uncommitted and read committed) permit the underlying data to change if it is accessed more than once. Use repeatable read only when data can be read multiple different times during the course of the transaction and the data values must be consistent. Finally, the greatest integrity is provided by the serializable isolation level. Serializable isolation removes the possibility of phantoms. A phantom occurs when the transaction opens a cursor that retrieves data and subsequently another process inserts a value that would satisfy the request and should be in the result set.
Unplanned outages - driven by problems Element failure Performance degradation Capacity limitation Application logic error Transaction backout Data corruption Planned outages - driven by change Database maintenance Application migrations Configuration upgrade Data propagation Source of 70% and 30% is CIO feed back during Customer briefings and meetings.
Type of Business Estimated Hourly Financial Impact of an Outage Retail Brokerage $6.45 million Credit Card Sales Authorizations $2.6 million Home Shopping Channel $113,750 Catalog Sales Centers $90,000 Airline Reservation Centers $89,500 Package Shipping Service $28,250 ATM Service Fees $14,500 Percentage Approximate Downtime Per Year 99.999% 5 minutes .08 hours 99.99% 53 minutes .88 hours 99.95% 262 minutes 4.37 hours 99.9% 526 minutes 8.77 hours 99.8% 1,052 minutes 17.5 hours 99.5% 2,628 minutes 43.8 hours 99% 5,256 minutes 87.6 hours 98% 10,512 minutes 175.2 hours (or 7.3 days)
This is the most basic strategy (and the only one actually supported by IBM). You must be able to determine a common recovery point for a series of table spaces . A DB2 quiesce certainly works fine but if that’s not available, a quiet point report can be generated to show common points of consistency. One of the listed points of consistency can be used as a recovery point. Of course there is the question of determining what exactly had run since the bad transaction. For this the summary report (by Unit of Recovery, Table) is the ideal weapon. It is assumed that the user would rerun or reentry “Good Transaction 2”.
For example, suppose you took a full image copy of a database object early Monday morning at 2:00 AM; and then took an incremental image copy at the same time the following three mornings. The full image copy plus all three incremental image copies need to be applied to recover the tablespace. If the same column of the same row was updated on Tuesday to “A”, Wednesday to “B”, and Thursday to “C”, the recovery process would have to apply these three changes before arriving at the final, accurate data. If a full image copy were taken each night, the recovery process would only need to apply the last image copy backup which would contain the correct value.
The simplest version of SQL based transaction recovery is the generation of UNDO SQL (inserts turned into deletes, etc.) for the transactions in error. Note that in this case there is no work that needs to be redone. But, the potential for anomalies causing failures in the UNDO is certainly a consideration.
XML is getting a lot of publicity these days. If you believe everything you read, then XML is going to solve all of our interoperability problems, completely replace SQL, and possibly even deliver world peace. Okay, that last one is an exaggeration, but you get the point. In actuality, XML stands for eXtensible Markup Language. The need for extensibility, structure, and validation is the basis for the evolution of the web towards XML. XML, like HTML, is based upon SGML (Standard Generalized Markup Language) which allows documents to be self-describing, through the specification of tag sets and the structural relationships between the tags. HTML is a small, specifically defined set of tags and attributes, enabling users to bypass the self-describing aspect for a document. XML, on the other hand, retains the key SGML advantage of self-description, while avoiding the complexity of full-blown SGML.
19 The concept that inspire our product lines is exemplified here by the Cockpit of a Boeign 777 perhaps the most advanced and automated aircraft today. Like in a real aircraft cockpit our products are focused at keeping your mission critical applications running (availability) and running efficiently (performance). Transition Also like in a real cockpit we not only display valuable information on the equipment and the applications that comprise your Distributed Enterprise but ...
DBA 101 Database Administration Practices & ProceduresCraig S. MullinsDirector, Technology Planninghttp://www.craigsmullins.comhttp://www.BMC.comhttp://www.DBAzine.com Craig S. Mullins 20041
Agenda What is a DBA? DBA Tasks and Roles Summary2 Craig S. Mullins, 2004
What is a DBA? A day in the life of a DBA... 0010 001010100 101011101011 101011010010101 010010011010101110000113 Craig S. Mullins, 2004 11001010100100101000100101
The DBA is a “Jack of all Trades” OS/390 C++ Windows V$ Tables Linux Oracle SQL XML Java DB2 Informix applet Unix MQ Connect DB2 DNS gateway HTML VTAM CGI TCP/IP ISP connection ZPARMs 3GL database schema COBOL ASP operating system Java HTTP hardware SQL Server network software application code bridge/router/hub VB network cabling CICS SQL*Net4 Craig S. Mullins, 2004 JCL
DBA as a Management Discipline The term discipline implies planning and then implementing according to that plan. Creating the Database Environment Database Security Database Design Backup and Recovery Application Design Disaster Planning Design Reviews Storage Management Database Change Management Distributed Database Management Data Availability Data Warehouse Administration Performance Management Database Utility Management System Performance Database Connectivity Database Performance Procedural DBA Application Performance Soft Skills Data Integrity5 Craig S. Mullins, 2004
Creating the Database Environment Choosing a DBMS vendor, platform, and architecture of DBMS MVS, OS/390, z/OS Enterprise Windows NT / 2000 / XP Departmental Unix Personal AIX Mobile Sun Solaris HP-UX Linux Parallel Edition others? Others (VSE, VMS, MPE, OS/400, etc.) Desktop OS Windows 98 / ME / XP PostgreSQL Linux Mac?6 Craig S. Mullins, 2004
Choosing the DBMS Operating System Support Benchmarks (TPC, homegrown) Scalability Availability of Tools Availability of Technicians (DBAs, Programmers, SAs, etc.) Cost of Ownership Release Schedule (Versions, Releases) Reference Customers7 Craig S. Mullins, 2004
Installing the DBMS Hardware Requirements CPU (version/speed), firmware, memory, etc. Storage Requirements System, Applications Software Requirements Allied Agents (TP, MQ, middleware) Languages and Compilers Configuration …of the DBMS …of connecting software Verification8 Craig S. Mullins, 2004
Upgrading the DBMS Analysis of New Features Check all Requirements Hardware and Software (see Installation Checklist) Planning the Upgrade Impact to system, applications Scheduling Fallback Strategy Migration Verification9 Craig S. Mullins, 2004
Database Standards & Procedures Naming Conventions (in conjunction with DA) Roles & Responsibilities (in conjunction with DA and SA) Programming Guidelines (in conjunction with App Dev) Database Guidelines Security (in conjunction with Security Admin or SA) Migration & Turnover Procedures Design Review Guidelines (in conjunction with DA, SA, App Dec) Operational Support (in conjunction with Operations) DBMS Education10 Craig S. Mullins, 2004
Database Design Translation of Logical Model to Physical Database Entities to Tables, Attributes to Columns, etc. …But differences CAN and WILL occur Create DDL Create Storage Structures for Database Files for data and indexes Raw Files versus OS Files Partitioning Clustering Placement Interleaving Data11 Craig S. Mullins, 2004
Database Table space Table Table Table space Table Table Disk Drive Disk Drive Table space Table Table File File File12 Craig S. Mullins, 2004
Types of Indexes B-Tree Indexes Custom er Data Bit-map Indexes Warehouse Data Reverse Key Indexes Partitioning Indexes Supplier Da ta Clustering Indexes Unique/Non-unique Indexes Hashing14 Craig S. Mullins, 2004
When and How to Index A proper indexing strategy can be the #1 factor to ensure optimal performance First take care of unique & PK constraints Then for foreign keys (usually) Heavily used queries - predicates Overloading of columns for IXO Index to avoid sorting ORDER BY, GROUP BY, DISTINCT Consider I/U/D implications Choose first column (high cardinality) Indexing variable columns?15 Craig S. Mullins, 2004
Denormalization? Pre-Joined Tables Report Tables Mirror Tables Split Tables (Splitting Long Text Columns) Combined Tables Redundant Data Repeating Groups Derivable Data Hierarchies Special Physical Implementation Needs16 Craig S. Mullins, 2004
Application Design Database Application Development and SQL SQL Set-at-a-Time Processing and Relational Closure Embedding SQL in a Program SQL Middleware and APIs Code Generators Object Orientation and SQL Types of SQL ad hoc versus planned embedded versus stand-alone static versus dynamic SQL Coding for Performance Hints and Tips17 Craig S. Mullins, 2004
Defining Transactions Defining Transactions Atomicity Consistency Isolation Durability Unit of Work Ensure proper definition and coding18 Craig S. Mullins, 2004
TP System Versus DBMS (Stored Procs) Presentation (Client) Presentation (Client) Workflow Controller Stored Procedures Transaction Server Relational Relational DBMS DBMS (2) Relational Disk DBMS (1) Disk Disk19 Craig S. Mullins, 2004
Transactions and Locking Locking Types of Locks Share, eXclusive Intent Locks Row, Page, Table, Table Space Timeouts Deadlocks Example on next page20 Craig S. Mullins, 2004
Deadlocks Process A Process B . Table X . . Request row 3 . data…data…data... . lock . . . . Request row 7 . . Request row 7 data…data…data... . lock . Request row 3 Process A is waiting on Process B Process B is waiting on Process A21 Craig S. Mullins, 2004
Lock Duration Isolation Level Read uncommitted Read committed Repeatable read Serializable Lock Escalation Techniques to Minimize Lock Problems standardize the sequence of updates within all programs save all data modification requests until the end of the UOW22 Craig S. Mullins, 2004
Design Reviews Design reviews are conducted to analyze and review all aspects of the database and application code for efficiency, effectiveness, and accuracy. Types of Design Reviews The Conceptual Design Review The Logical Design Review The Physical Design Review The Organizational Design Review SQL and Application Code Review The Pre-Implementation Design Review The Post-Implementation Design Review23 Craig S. Mullins, 2004
Database Change Management The DBA is the custodian of database changes. Types of Changes DBMS Software Hardware Configuration Logical and Physical Design Applications Physical Database Structures24 Craig S. Mullins, 2004
The Limitations of ALTER Changes Not Likely to be Supported • Changing the name of some objects. • Moving an object to another database. • Changing the number of table space partitions or data files or removing a partition. • Moving a table from one table space to another. • Rearranging the order of columns in a table. • Removing columns from a table. • Changing the definition of a key (P or F) • Changing a view (adding or removing cols). • Modifying the contents of a trigger. • Changing a hash key. • Changing table clustering. • And so on...25 Craig S. Mullins, 2004
Database Table Space or DB Space Table Index View Column Trigger Synonym26 Craig S. Mullins, 2004
Adding a Column to Middle of a Table Retrieve current table def by Drop the table, which in turn drops any querying sys catalog. related objects. Retrieve def for any views on that Re-create table adding the new column. table. Reload the table using the unloaded data Retrieve def for all indexes on that from step 8. table. Re-create any referential constraints. Retrieve def for all triggers on that table. Re-create any triggers, views and indexes for the table. Capture all referential constraints for the table & related tables. Re-create the security authorizations from step 6. Retrieve all security. Examine each application program to Obtain a list of all programs that determine if changes are required for it access the table. to continue functioning appropriately. Unload the data in the table.27 Craig S. Mullins, 2004
Change Management Guidelines Requesting Database Changes Standardized Change Requests Automated or Paper Forms DBA Checklists Service Level Standards Communication To DBA from Requesters ...and from DBA back to Requesters28 Craig S. Mullins, 2004
Data Availability Availability is the condition where a given resource can be accessed by its consumers. Availability can be broken down into four distinct components: • Manageability – the ability to create and maintain an effective environment that delivers service to users. • Recoverability – the ability to re-establish service in the event of an error or component failure. • Reliability – the ability to deliver service to specified levels for a stated period of time. • Serviceability – the ability to effectively determine the existence of problems, diagnosis their cause(s), and repair the problem.29 Craig S. Mullins, 2004
Causes of Availability Problems Loss of the Data Center Corruption of Data Network Problems Loss of Database Objects Loss of the Server Loss of Data Hardware Data Replication and Disk-Related Outages Propagation Failures Operating System Failure Severe Performance Failure of the DBMS Problems Software Recovery Issues Application Problems DBA Mistakes Security and Planned Versus Authorization Problems Unplanned Outages30 Craig S. Mullins, 2004
Planned Versus Unplanned Outages 30% of Outages 70% of Outages Application Availability Planned Outages Unplanned Outages31 Craig S. Mullins, 2004
How Much Availability is Enough? Five 9’s? 99.999% equals 5 minutes per year! The Cost of Downtime Varies by Industry Varies by Applications Requires Analysis Cost of Assuring Availability Vs. Cost of Downtime Service Level Agreements32 Craig S. Mullins, 2004
Techniques to Improve Availability Perform Routine Maintenance While Systems Remain Operational Online Utilities Automate DBA Functions Exploit High Availability Features of the DBMS Parallelism Clustering and data sharing Exploit Hardware Technologies Storage technologies (e.g. RAID)33 Craig S. Mullins, 2004
Performance Management Defining Performance (Database) performance is the optimization of resource use to increase throughput and minimize contention, enabling the largest possible workload to be processed. Monitoring Versus Management Reactive Versus Proactive Identification Versus Correction Historical Trending34 Craig S. Mullins, 2004
Monitoring Versus Management Identify Problems FIND IT Determine How to Fix Optimize ANALYZE IT Environment Monitoring FIX IT Management35 Craig S. Mullins, 2004
Types of Performance Management System DBMS System Interaction with Allied Agents Database Database Design Database Structure Application Program Design SQL36 Craig S. Mullins, 2004
System Performance Interaction w/the OS and Allied Agents Hardware configuration Disk Storage and I/O DBMS Installation and Configuration Issues Memory Usage Data Cache Details “Open” Database Objects Database Logs Locking and Contention The System Catalog Other Configuration Options System Monitoring38 Craig S. Mullins, 2004
Database Performance Partitioning FilePlacement and Indexing Allocation Database Log Placement Denormalization Distributed Data Placement Clustering Disk Allocation Interleaving Data Database Reorganization Free Determining When to Space Reorganize Compression Online Reorganization Page Size (Block Size) Automation39 Craig S. Mullins, 2004
Optimization Hint(s) Determine Optimal System Access Catalog SQL Request Other System Information Exit -or- Save Run SQL? Access No Path Yes Execute the Optimized Database Tables SQL Query Results Multiple Rows41 Craig S. Mullins, 2004
Understanding & Optimizing Joins Join predicates Multiple Table Access No Cartesian OUTER - (COMPOSITE) INNER - (NEW) Products TABLE 1 TABLE 2 Minimize the size of intermediate tables JOIN OUTER INTERIM (COMPOSITE) INNER - (NEW) TABLE 3 Index for performance Type of Joins JOIN Nested Loop OUTER RESULT Merge Scan Hybrid Star Join42 Craig S. Mullins, 2004
Data Integrity Database structure integrity Consistency Options Database Checking Memory Usage Additional Options Semantic data integrity Entity Integrity Referential Integrity User- vs. System-Managed RI Unique Constraints Data Types Default Values Check Constraints Triggers43 Craig S. Mullins, 2004
An Example - dbcc Database Consistency EXTENT etc. dbcc Table page database44 Craig S. Mullins, 2004
Referential Integrity Primary Key Foreign Key DELETE RULE DEPT e mp l o ys Restrict e mp l o ye d - b y EP M Cascade Set NULL Managing Referential Sets Backup & Recovery LOAD CHECK45 Craig S. Mullins, 2004
Database Security Granting and Revoking Authority Types of Privileges Granting Table Privileges Granting Database Object Privileges Granting System Privileges Granting Program and Procedure Privileges Granting to Public Revoking Privileges Cascading Revokes Chronology and Revokes Security Reporting46 Craig S. Mullins, 2004
Additional Database Security Authorization Roles and Groups Roles Groups Limit the Number of SA Users Group-Level Security and Cascading Revokes Other Database Security Mechanisms Using Views for Security Using Stored Procedures for Security Auditing External Security Job Scheduling and Security External OS and File Security47 Craig S. Mullins, 2004
Database Backup Image Copy Backups Full Versus Incremental Image Copy Backups Merging Incremental Copies Database Objects and Backups Copying Indexes DBMS Control Backup Consistency When to Create a Point of Consistency Log Archival and Backup Determining the Backup Schedule DBMS Instance Backup Designing the DBMS Environment for Recovery Database Object Definition Backups48 Craig S. Mullins, 2004
Database Recovery Recovering from Image Copy Backups Determining Recovery Options Error Analysis Image Copy Analysis Types of Recovery Recover to Current Point in Time Recovery SQL Based Recovery Index Recovery Testing Your Recovery Plan Recovering a Dropped Database Object Recovery of Broken Blocks and Pages49 Craig S. Mullins, 2004
A Recovery Scenario Good Transaction DiskTimeline Failure Recovery started Recover to current to address media failure. Backup Log Log(s)50 Craig S. Mullins, 2004
Another Example 12 12 12 12 11 1 11 1 11 1 11 1 10 2 10 2 10 2 10 2 9 3 9 3 9 3 9 3 8 4 8 4 8 4 8 4 7 5 7 5 7 5 7 5 6 6 6 6 Sun Mon Tues Wed Thu Fri Sat Incr Just the changes Full from Monday Incr Backup of the entire database Just the changes object from Tuesday Incr Just the changes51 Craig S. Mullins, 2004 from Wednesday
SQL-Based Recovery Good Transaction 1 Good Transaction 2 Bad Transaction UNDO Bad Transactions Generate UNDO SQL Recovery started Apply UNDO SQL UNDO SQL, generated from the database log, can be used to get rid of bad transactions. And the database can remain online.52 Craig S. Mullins, 2004
Disaster Planning Disaster: any unplanned, extended loss of critical business applications due to lack of computer processing capabilities for more than a 48-hour period. - Sungard Recovery Services DBAs must integrate database recovery into the corporate disaster recovery plan DBAs must test the disaster plan DBAs must work with the application owners/sponsors to accurately gauge the criticality of each piece of data to create a valid database disaster plan53 Craig S. Mullins, 2004
Timeline Disaster occurs taking down the local site. Database modifications applied and logged at the local site. Log Backups Log(s) Image copy backups are taken and one is sent off-site to the remote location.54 Craig S. Mullins, 2004
Data & Storage Management Files and Data Sets Storage Options File Placement on Disk RAID Space Management JBOD SANs Data Page Layouts Allocation Pages Netword Attached Storage (NAS) Data Record Layouts DAFS Calculating Table Size Index Page Layouts Planning for the Calculating Index Size Future Transaction Logs Capacity Planning55 Craig S. Mullins, 2004
Distributed Database Management Setting up a Distributed Environment Seattle Data Distribution Standards Accessing Distributed Data Two-Phase Commit Distributed Performance Problems Sydney56 Craig S. Mullins, 2004
Data Warehouse Administration Data Warehouse Design Data Movement ETL Data Cleansing Data Warehouse Scalability Data Warehouse Performance Automated Summary Tables Data Freshness Data Content Data Usage Query Design: OLAP versus OLTP Financial Chargeback Backup and Recovery Integration with Production Systems That Feed the Data Warehouse57 Craig S. Mullins, 2004
Other DBA Issues DBA Tools Evaluation, budget, installation, training, usage... Database Utility Management LOAD, UNLOAD, COPY, RECOVER, REORG, CHECK Database Connectivity Client/Server Internet XML Coping With New Technologies Internet/Java/XML Triggers, Stored Procedures, UDFs PDAs58 Craig S. Mullins, 2004
Procedural DBA Stored Procedures Performance Monitors D E D V E Admin. DBMS E Triggers B Process L U O G P External Design Libraries Review Functions59 Craig S. Mullins, 2004
Role of the Procedural DBA DBCO Impl (COMMIT in em entation Ensuring proc, write o r guide) Reuse n m stratio ini set) Des i CO Ad rder, proc DB r firing o Rev gn igge iews (tr EX An PLAI On Call a ly sis N Coding for DBCO Complex Abends Queries QL i n gS Debuggi ng Schema un SQL T R esolutionDBCO = Database Code Objectshorthand for trigger, UDF, stored procedure60 Craig S. Mullins, 2004
Soft Skills Communication Collaboration Calm disposition and demeanor Documentation Educator Packrat Thirst for knowledge61 Craig S. Mullins, 2004
Summary DBAs require many different types of skills not just technology business knowledge is important DBA is not an easy job Cooperation is required Communication is essential Technology-wise need to know where to find answers don’t need to know all the answers off the top of your head No DBA is an “island”63 Craig S. Mullins, 2004
Contact Information BMC Software http://www.bmc.com Assuring Business Availability Craig S. Mullins DB2 Technology Planning Craig_Mullins@BMC.com firstname.lastname@example.org http://www.craigsmullins.com http://www.craigsmullins.com/dba_book.htm64 Craig S. Mullins, 2004