

  1. 1. Improving Database Performance <ul><li>So far, we have looked at many aspects of designing, creating, populating and querying a database. We have (briefly) explored ‘optimisation’ which is used to ensure that query execution time is minimised </li></ul><ul><li>In this lecture we are going to look at some techniques which are used to improve performance and availability </li></ul><ul><li>WHY ? </li></ul><ul><li>Because databases are required to be available, in many installations and applications, 24 hours a day, 7 days a week, 52 weeks every year - think of the ‘user’ demands in e-business </li></ul>
  2. 2. Improving Database Performance <ul><li>There are many ‘solutions’ - including </li></ul><ul><li>parallel processors </li></ul><ul><li>faster processors </li></ul><ul><li>higher speed communications </li></ul><ul><li>more memory </li></ul><ul><li>faster disks </li></ul><ul><li>more disk units on line </li></ul><ul><li>higher capacity disks </li></ul><ul><li>any others ? </li></ul>
  3. 3. Improving Database Performance <ul><li>We are going to look at a technique called ‘clustering’ - an architecture for improving ‘power’ and availability </li></ul><ul><li>What are the ‘dangers’ to non-stop availability </li></ul><ul><li>Try these :- </li></ul><ul><ul><ul><ul><li>System outages (planned) </li></ul></ul></ul></ul><ul><ul><ul><ul><ul><li>Maintenance, tuning </li></ul></ul></ul></ul></ul><ul><ul><ul><ul><li>System outages (unplanned) </li></ul></ul></ul></ul><ul><ul><ul><ul><ul><li>hardware failure, bugs, virus attacks </li></ul></ul></ul></ul></ul>
  4. 4. Improving Database Performance <ul><li>E-business is not the only focus </li></ul><ul><li>Businesses are tending to be ‘global organisations’ - remember one of the early lectures ? </li></ul><ul><li>So what is one of the ‘solutions’ ? </li></ul><ul><li>In a single word - clustering </li></ul><ul><li>Clustering is based on the premise that multiple processors can provide better, faster and more reliable processing than a single computer </li></ul>
  5. 5. Improving Database Performance <ul><li>However, as in most ‘simple’ solutions in Information Technology, the problem is in the details </li></ul><ul><li>How can clustering be achieved ? </li></ul><ul><li>Which technologies and architectures offer the best approach to clustering ? </li></ul><ul><ul><li>and, what is the measure, or metric, of ‘best’ ? </li></ul></ul>
  6. 6. Improving Database Performance <ul><li>What are some of the advantages of clustering ? </li></ul><ul><ul><li>Improved availability of services </li></ul></ul><ul><ul><li>Scalability </li></ul></ul><ul><li>Clustering involves multiple and independent computing systems which work together as one </li></ul><ul><li>When one of the independent systems fails, the cluster software can distribute work from the failing system to the remaining systems in the cluster </li></ul>
  7. 7. Improving Database Performance <ul><li>‘Users’ normally would not notice the difference </li></ul><ul><ul><li>They interact with a cluster as if it were a single server - and importantly the resources they require will still be available </li></ul></ul><ul><li>Clustering can provide high levels of availability </li></ul>
  8. 8. Improving Database Performance <ul><li>What about ‘scalability’ ? </li></ul><ul><li>Loads will (sometimes) exceed the capabilities of the systems which make up the cluster. </li></ul><ul><li>Additional facilities can be incrementally added to increase the cluster’s computing power and ensure processing requirements are met </li></ul><ul><li>As transaction and processing loads become established, the cluster (or parts of it) can be increased in size or number </li></ul>
  9. 9. Improving Database Performance <ul><li>Clustering is NOT a ‘new’ concept </li></ul><ul><li>A company named DEC introduced clusters for VMS systems in the early 1980s - about 20 years ago </li></ul><ul><li>Which firms offer clustering packages now ? </li></ul><ul><li>IBM, Microsoft and Sun Microsystems </li></ul>
  10. 10. Improving Database Performance <ul><li>What are the different types of Clustering ? </li></ul><ul><li>There are 2 architectures; </li></ul><ul><ul><li>Shared nothing and </li></ul></ul><ul><ul><li>Shared disk </li></ul></ul><ul><ul><li>In a shared nothing architecture, each system has its own private memory and one or more disks </li></ul></ul><ul><ul><li>And each server in the cluster has its own independent subset of the data it can work on independently without meeting resource contention from other servers </li></ul></ul>
  11. 11. Improving Database Performance <ul><li>This might explain better:- </li></ul><ul><li>CPU 1 CPU 2 CPU 3 </li></ul><ul><li>Memory 1 Memory 2 Memory ..n </li></ul><ul><li>A Shared Nothing Architecture </li></ul>Interconnection network
  12. 12. Improving Database Performance <ul><li>As you saw on the previous overhead, in a shared nothing environment each system has its own ‘private memory’ and one or more disks </li></ul><ul><li>And each server in the cluster has its own independent subset of the data it can work on without meeting resource conflicts from other servers </li></ul><ul><li>The clustered processors communicate by passing messages through a network which interconnects the computers </li></ul>
  13. 13. Improving Database Performance <ul><li>Client requests are automatically directed to the system which owns the particular resource </li></ul><ul><li>Only one of the clustered systems can ‘own’ and access a particular resource at a time. </li></ul><ul><li>When a failure occurs, resource ownership can be dynamically transferred to another system in the cluster </li></ul><ul><li>Theoretically, a shared nothing multiprocessor could scale up to thousands of processors - the processors don’t interfere with one another - no resources are shared </li></ul>
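The ownership rule on this overhead can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `SharedNothingCluster`, its methods, and the round-robin handover on failure are all hypothetical, and real cluster software uses its own policies.

```python
class SharedNothingCluster:
    """Toy model: each resource is owned by exactly one node at a time."""

    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.owner = {}  # resource name -> owning node

    def assign(self, resource, node):
        if node not in self.nodes:
            raise ValueError(f"unknown node: {node}")
        self.owner[resource] = node

    def route(self, resource):
        # A client request is directed to the node that owns the resource.
        return self.owner[resource]

    def fail(self, node):
        # On failure, ownership is transferred to the surviving nodes
        # (round-robin here, purely as an assumed policy).
        self.nodes.discard(node)
        if not self.nodes:
            raise RuntimeError("no surviving nodes in the cluster")
        survivors = sorted(self.nodes)
        orphaned = sorted(r for r, n in self.owner.items() if n == node)
        for i, resource in enumerate(orphaned):
            self.owner[resource] = survivors[i % len(survivors)]


cluster = SharedNothingCluster(["n1", "n2", "n3"])
cluster.assign("orders", "n1")
cluster.assign("customers", "n2")
cluster.assign("invoices", "n1")
cluster.fail("n1")
```

After `cluster.fail("n1")`, requests for `orders` and `invoices` are routed to the surviving nodes, while `customers` stays where it was.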
  14. 14. Improving Database Performance <ul><li>CPU 1 CPU 2 CPU …n </li></ul><ul><li>Memory 1 Memory 2 Memory ..n </li></ul><ul><li>A Shared All Environment </li></ul>Interconnecting Network
  15. 15. Improving Database Performance <ul><li>In a ‘shared all’ (shared disk) environment, you noticed that all of the connected systems shared the same disk devices </li></ul><ul><li>Each processor has its own private memory, but all the processors can directly access all the disks </li></ul><ul><li>In addition, each server has access to all the data </li></ul>
  16. 16. Improving Database Performance <ul><li>In this arrangement, ‘shared all’ clustering doesn’t scale as effectively as shared-nothing clustering for small machines. All the nodes have access to the same data, so a controlling facility must be used to direct processing to make sure that all nodes have a consistent view of the data as it changes </li></ul><ul><li>Attempts by more than one node to update the same data at the same time need to be prohibited. This can cause performance and scalability problems </li></ul><ul><li>(similar to the concurrency aspect) </li></ul>
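A minimal sketch of the ‘controlling facility’ idea, assuming it behaves like a simple exclusive lock table. The `LockManager` class and its method names are hypothetical and are not taken from any particular cluster product.

```python
class LockManager:
    """Toy lock table: at most one node may hold the lock on a data item."""

    def __init__(self):
        self._holder = {}  # data item -> node currently allowed to update it

    def acquire(self, item, node):
        # Grants the lock if the item is free or already held by this node.
        holder = self._holder.setdefault(item, node)
        return holder == node

    def release(self, item, node):
        # Only the current holder can release the lock.
        if self._holder.get(item) == node:
            del self._holder[item]


locks = LockManager()
first = locks.acquire("row 42", "node 1")    # granted
second = locks.acquire("row 42", "node 2")   # refused: node 1 holds it
locks.release("row 42", "node 1")
third = locks.acquire("row 42", "node 2")    # granted after release
```

Every update has to pass through this one table, which is why the approach can become a bottleneck as nodes are added.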
  17. 17. Improving Database Performance <ul><li>Shared-all architectures are well suited to the large scale processing found in main frame environments </li></ul><ul><li>Main frames are large processors capable of high work loads. The number of clustered PC’s and midrange processors, even with the newer, faster processors, which would equal the computing power from a few clustered mainframes, would be high - about 250 nodes. </li></ul>
  18. 18. Improving Database Performance <ul><li>This chart might help : </li></ul><ul><li>Shared Disk | Shared Nothing </li></ul><ul><li>Quick adaptability to changing workloads | High possibility of simpler, cheaper hardware </li></ul><ul><li>High availability | Almost unlimited scalability </li></ul><ul><li>Data need not be partitioned | Data may need to be partitioned across the cluster </li></ul>
  19. 19. Improving Database Performance <ul><li>There is another technique - InfiniBand architecture which can reduce bottlenecks in the Input/Output level, and which has a further appeal of reducing the cabling, connector and administrative overheads of the database infrastructure </li></ul><ul><li>It is an ‘intelligent’ agent - meaning software. </li></ul><ul><li>Its main attraction is that it can change the way information is exchanged in applications. It removes unnecessary overheads from a system </li></ul>
  20. 20. Improving Database Performance <ul><li>Peripheral Component Interconnect (PCI) remains a bus-based system - this allows the transfer of data between 2 (yes, 2!) of the members at a time </li></ul><ul><li>Many PCI buses cause bottlenecks in the bridge to the memory subsystems. Newer versions of PCI allowed only a minor improvement - only 2 64 bit 66 MHz adapters on the bus </li></ul><ul><li>A bus allows only a small number of devices to be interconnected, is limited in its electrical paths, and cannot adapt to meet high availability demands </li></ul>
  21. 21. Improving Database Performance <ul><li>A newer device, called a fabric, can scale to thousands of devices with parallel communications between each node </li></ul><ul><li>In 1999 a group of vendors (including Intel, Microsoft, IBM, Compaq and Sun) merged the competing Next Generation I/O and Future I/O proposals to form the InfiniBand Trade Association </li></ul><ul><li>Their objective was to develop and ensure one standard for communication interconnects and system I/O. </li></ul><ul><li>One of their early findings was that replacing a bus architecture by fabric was not the full story - or solution </li></ul>
  22. 22. Improving Database Performance <ul><li>Their early solution needed to be synchronised with software changes. If not, a very high speed network could be developed, but actual application demands would not be met </li></ul><ul><li>InfiniBand comprises </li></ul><ul><li>1. Host Channel Adapters (HCA) </li></ul><ul><li>2. Target Channel Adapters (TCA) </li></ul><ul><li>3. Switches </li></ul><ul><li>4. Routers </li></ul><ul><li>1 and 2 define end nodes in the fabric, 3 and 4 are interconnecting devices </li></ul>
  23. 23. Improving Database Performance <ul><li>The HCA (host channel adapter) manages a connection and interfaces with a fabric </li></ul><ul><li>The TCA (target channel adapter) delivers required data such as a disk interface which replaces the existing SCSI interface </li></ul><ul><li>An InfiniBand switch links HCAs and TCAs into a network </li></ul><ul><li>The router allows the interface to other networks AND the translation of legacy components and networks. It can be used for MAN and WAN interfaces. </li></ul>
  24. 24. Improving Database Performance <ul><li>InfiniBand link speeds are identified in multiples of the base 1x (a 0.5 GB per second full duplex link, 0.25 GB per second in each direction) </li></ul><ul><li>Other defined widths are 4x (2 GB per second full duplex) and 12x (6 GB per second full duplex). </li></ul><ul><li>Just for size : </li></ul><ul><li>A fast SCSI adapter could accommodate a throughput rate of 160 MB per second </li></ul><ul><li>A single InfiniBand 4x adapter can deliver between 300 and 500 MB per second </li></ul>
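The link-width arithmetic can be checked with a short calculation. This sketch assumes the slide's figures are gigabytes per second and that throughput scales linearly with link width; `BASE_FULL_DUPLEX` and `link_rate` are illustrative names only.

```python
# Base 1x figure from the slide: 0.5 full duplex (0.25 in each direction).
# Assumed unit: gigabytes per second.
BASE_FULL_DUPLEX = 0.5


def link_rate(width):
    """Return (full duplex rate, rate per direction) for an Nx link."""
    full_duplex = width * BASE_FULL_DUPLEX
    return full_duplex, full_duplex / 2


for width in (1, 4, 12):
    full, one_way = link_rate(width)
    print(f"{width}x: {full} full duplex, {one_way} each direction")
```

The 4x and 12x results reproduce the 2 and 6 full duplex figures quoted on the slide.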
  25. 25. Improving Database Performance End Nodes Routers Switch
  26. 26. Improving Database Performance <ul><li>So far we have looked at improving database performance by </li></ul><ul><li>1. The use of ‘shared-all’ or ‘shared-nothing’ architectures </li></ul><ul><li>2. Implementing an InfiniBand communications interface and network facility </li></ul>
  27. 27. Improving Database Performance <ul><li>Now we are going to look at another option </li></ul><ul><li>It’s known as the ‘Federated Database’ environment </li></ul><ul><li>So, what is a ‘Federated Database’ ? </li></ul><ul><li>Try this: It is a collection of data stored on multiple autonomous computing systems connected to a network. </li></ul><ul><li> The user or users is presented with what appears to be one integrated database </li></ul>
  28. 28. Improving Database Performance <ul><li>A federated database presents ‘views’ to users which look exactly the same as views of data from a centralised database </li></ul><ul><li>This is very similar to the use of the Internet where many sites have multiple sources - but the user doesn’t see them </li></ul><ul><li>In a federated database approach, each data resource is defined (as you have done) by means of table schemas, and the user is able to access and manipulate data </li></ul>
  29. 29. Improving Database Performance <ul><li>The ‘queries’ actually access data from a number of databases at a number of locations </li></ul><ul><li>One of the interesting aspects of a federated database is that the individual databases may consist of any DBMS (IBM, Oracle, SQL Server, possibly MS Access) and run on any operating system (Unix, VMS, MS-XP) and on different hardware (Hewlett-Packard servers, Unisys, IBM, Sun Microsystems …..) </li></ul>
  30. 30. Improving Database Performance <ul><li>However, there are some reservations : </li></ul><ul><li>Acceptable performance requires the inclusion of a smart optimiser using the cost-based technique which has intelligence about both the distribution (perhaps a global data dictionary) and also the different hardware and DBMS at each accessed site. </li></ul>
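A minimal sketch of the federated idea, assuming two independent databases that happen to expose the same `customer` table. In-memory SQLite databases stand in for the autonomous member systems; `make_source`, `FederatedView` and the sample rows are hypothetical, and a real federation would span different DBMS products and need a proper optimiser.

```python
import sqlite3


def make_source(rows):
    """Create one autonomous member database holding some customers."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER, name TEXT, city TEXT)")
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)
    return conn


class FederatedView:
    """Presents several member databases as if they were one."""

    def __init__(self, sources):
        self.sources = sources

    def query(self, sql, params=()):
        # Run the same query at every member site and merge the results.
        rows = []
        for conn in self.sources:
            rows.extend(conn.execute(sql, params).fetchall())
        return sorted(rows)


site_a = make_source([(1, "Ann", "Melbourne")])
site_b = make_source([(2, "Bob", "Sydney"), (3, "Cy", "Melbourne")])
federation = FederatedView([site_a, site_b])
result = federation.query(
    "SELECT id, name FROM customer WHERE city = ?", ("Melbourne",)
)
```

The caller sees one result set and never learns which site supplied each row.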
  31. 31. Improving Database Performance <ul><li>Another attractive aspect of the federated arrangement is that additional database servers can be added to the federation at any time - and servers can also be deleted. </li></ul><ul><li>As a general comment, any multisource database can be implemented in either a centralised or federated architecture </li></ul><ul><li>In the next few overheads, there are some comments on this </li></ul>
  32. 32. Improving Database Performance <ul><li>The centralised approach has some disadvantages, the major one being that investment is large, and the return on investment may take many months, or years </li></ul><ul><li>The process includes these steps: </li></ul><ul><li>1. Concept development and data model for collecting data needed to support business decisions and processes </li></ul><ul><li>2. Identification of useful data sources (accurate, timely, comprehensive, available …) </li></ul>
  33. 33. Improving Database Performance <ul><li>3. Obtain a database server platform to support the database (and probably lead to data warehousing). </li></ul><ul><li>4. Capture data, or extract data, from the source(s) </li></ul><ul><li>5. Clean, format, and transform data to the quality and formats required </li></ul><ul><li>6. Design an integrated database to hold this data </li></ul><ul><li>7. Load the database (and review quality) </li></ul>
  34. 34. Improving Database Performance <ul><li>8. Develop systems to ensure that content is current (probably transaction systems) </li></ul><ul><li>From this point, that database becomes ‘usable’ </li></ul><ul><li>So, what is different with the Federated Database approach </li></ul><ul><li>1. Firstly, the economics are different - the investment in the large, high speed processor is not necessary </li></ul><ul><li>2. Data is not centralised - it remains with and on the systems used to maintain it </li></ul>
  35. 35. Improving Database Performance <ul><li>3. The database server can be a mid-range, or several servers. </li></ul><ul><li>4. Another aspect is that it is most unlikely that a query will regularly need access to all of the individual databases - but with the centralised approach all of the data needs to be ‘central’. </li></ul><ul><li>5. Local databases support local queries - that’s probably why the local databases were introduced. </li></ul>
  36. 36. Improving Database Performance <ul><li>The Internet offers the capability of large federations of content servers </li></ul><ul><li>Distributed application architectures built around Web servers and many co-operating databases are (slowly) becoming common both </li></ul><ul><ul><li>within and </li></ul></ul><ul><ul><li>between </li></ul></ul><ul><ul><li>enterprises (companies). </li></ul></ul><ul><li>Users are normally unaware of the interfacing and supporting software necessary for federated databases to be accessible </li></ul>
  37. 37. Improving Database Performance <ul><li>Finally, there is another aspect which is used to improve the availability and performance of a database </li></ul><ul><li>This occurs at the ‘configuration stage’ which is when the database and its requirements are being ‘created’ </li></ul><ul><li>- quite different from the ‘create table’ which you have used </li></ul><ul><li>It is the responsibility of the System Administration and Database Management (and of course Senior / Executive Management) </li></ul>
  38. 38. Improving Database Performance <ul><li>Physical Layouts </li></ul><ul><li>The physical layout very much influences </li></ul><ul><ul><li>How much data a database can hold </li></ul></ul><ul><ul><li>The number of concurrent database users </li></ul></ul><ul><ul><li>How many concurrent processes can execute </li></ul></ul><ul><ul><li>Recovery capability </li></ul></ul><ul><ul><li>Performance (response time) </li></ul></ul><ul><ul><li>Nature of Database Administration </li></ul></ul><ul><ul><li>Cost </li></ul></ul><ul><ul><li>Expansion </li></ul></ul>
  39. 39. Improving Database Performance Oracle Architecture <ul><li>Oracle8i and 9i are object-relational database management systems. They contain the capabilities of relational and object-oriented database systems </li></ul><ul><li>They utilise database servers for many types of business applications including </li></ul><ul><ul><li>On Line Transaction Processing (OLTP) </li></ul></ul><ul><ul><li>Decision Support Systems </li></ul></ul><ul><ul><li>Data Warehousing </li></ul></ul>
  40. 40. Improving Database Performance Oracle Architecture <ul><li>In perspective, Oracle is NOT a ‘high end’ application DBMS </li></ul><ul><li>A high end system has one or more of these characteristics: </li></ul><ul><ul><li>Management of a very large database (VLDB) - probably hundreds of gigabytes or terabytes </li></ul></ul><ul><ul><li>Provides access to many concurrent users - in the thousands, or tens of thousands </li></ul></ul><ul><ul><li>Gives a guarantee of constant database availability for mission critical applications - 24 hours a day, 7 days a week. </li></ul></ul>
  41. 41. Improving Database Performance Oracle Architecture <ul><li>High end applications environments are not normally controlled by Relational Database Management Systems </li></ul><ul><li>High end database environments are controlled by mainframe computers and non-relational DBMSs. </li></ul><ul><li>Current RDBMSs cannot manage very large amounts of data, or perform well under demanding transaction loads. </li></ul>
  42. 42. Improving Database Performance Oracle Architecture <ul><li>There are some guidelines for designing a database with files distributed so that optimum performance, from a specific configuration, can be achieved </li></ul><ul><li>The primary aspect which needs to be clearly understood is the nature of the database </li></ul><ul><ul><li>Is it transaction oriented ? </li></ul></ul><ul><ul><li>Is it read-intensive ? </li></ul></ul>
  43. 43. Improving Database Performance <ul><li>The key items which need to be understood are </li></ul><ul><ul><li>Identifying Input/Output contention among datafiles </li></ul></ul><ul><ul><li>Identifying Input/Output bottlenecks among all database files </li></ul></ul><ul><ul><li>Identifying concurrent Input/Output among background processes </li></ul></ul><ul><ul><li>Defining the security and performance goals for the database </li></ul></ul><ul><ul><li>Defining the system hardware and mirroring architecture </li></ul></ul><ul><ul><li>Identifying disks which can be dedicated to the database </li></ul></ul>
  44. 44. Improving Database Performance <ul><li>Let’s look at tablespaces : </li></ul><ul><li>These ones will be present in some combination </li></ul><ul><li>System: Data dictionary </li></ul><ul><li>Data: Standard-operation tables </li></ul><ul><li>Data_2: Static tables used during standard operation </li></ul><ul><li>Indexes: Indexes for the standard-operation tables </li></ul><ul><li>Indexes_2: Indexes for the static tables </li></ul><ul><li>RBS: Standard-operation RollBack Segments </li></ul><ul><li>RBS_2: Special RollBack segments used for data loads </li></ul><ul><li>Temp: Standard operation temporary segments </li></ul><ul><li>Temp_user: Temporary segments created by a temporary user </li></ul>
  45. 45. Improving Database Performance <ul><li>Tools: RDBMS tools tables </li></ul><ul><li>Tools_1: Indexes for the RDBMS tools tables </li></ul><ul><li>Users: User objects in development tables </li></ul><ul><li>Agg_data: Aggregation data and materialised views </li></ul><ul><li>Partitions: Partitions of a table or index segments; create multiple tablespaces for them </li></ul><ul><li>Temp_Work: Temporary tables used during data load processing </li></ul>
  46. 46. Improving Database Performance <ul><li>(A materialised view stores replicated data based on an underlying query. The data it stores is replicated from within the current database. </li></ul><ul><li>A Snapshot stores data from a remote database. </li></ul><ul><li>The system optimiser may choose to use a materialised view instead of a query against a larger table if the materialised view will return the same data and thus improve response time. A materialised view does however incur an overhead of additional space usage and maintenance) </li></ul>
  47. 47. Improving Database Performance <ul><li>Each of the tablespaces will require a separate datafile </li></ul><ul><li>Monitoring of I/O performance among datafiles is done after the database has been created, and the DBA must estimate the I/O load for each datafile (based on what information ?) </li></ul><ul><li>The physical layout planning is commenced by estimating the relative I/O among the datafiles, with the most active tablespace given a weight of 100. </li></ul><ul><li>Estimate the I/O from the other datafiles relative to the most active datafile </li></ul>
  48. 48. Improving Database Performance <ul><li>Assign a weight of 35 to the System tablespace files, and give the index tablespaces a value of 1/3 of their data tablespaces </li></ul><ul><li>Rbs may go as high as 70 (depending on the database activity) - between 30 and 50 is ‘normal’ </li></ul><ul><li>In production, Temp will be used by large sorts </li></ul><ul><li>Tools will be used rarely in production - as will the Tools_1 tablespace </li></ul>
  49. 49. Improving Database Performance <ul><li>So, what do we have ? - Something like this - </li></ul><ul><li>Tablespace | Weight | % of Total </li></ul><ul><li>Data | 100 | 45 </li></ul><ul><li>Rbs | 40 | 18 </li></ul><ul><li>System | 35 | 16 </li></ul><ul><li>Indexes | 33 | 15 </li></ul><ul><li>Temp | 5 | 2 </li></ul><ul><li>Data_2 | 4 | 2 </li></ul><ul><li>Indexes_2 | 2 | 1 </li></ul><ul><li>Tools | 1 | 1 </li></ul><ul><li>Total | (220) | </li></ul>
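The percentage column can be reproduced from the weights. The sketch below uses the weights from the table above; the variable names are illustrative only.

```python
# Relative I/O weights from the table (most active tablespace = 100).
weights = {
    "Data": 100, "Rbs": 40, "System": 35, "Indexes": 33,
    "Temp": 5, "Data_2": 4, "Indexes_2": 2, "Tools": 1,
}

total = sum(weights.values())
percent = {name: 100 * w / total for name, w in weights.items()}

# Share of all I/O taken by the four most active tablespaces.
top_four = sorted(weights, key=weights.get, reverse=True)[:4]
top_four_share = sum(percent[name] for name in top_four)

for name, w in weights.items():
    print(f"{name:<10} {w:>4} {percent[name]:5.1f}%")
print(f"Top four share: {top_four_share:.1f}%")
```

The exact share of the top four is about 94.5%; adding the rounded column values (45 + 18 + 16 + 15) gives the 94% usually quoted.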
  50. 50. Improving Database Performance <ul><li>94% of the Input/Output is associated with the top four tablespaces </li></ul><ul><li>This indicates then that in order to properly distribute the datafile activity, 5 disks would be needed, AND that NO other database files should be put on the disks which are accommodating the top 4 tablespaces </li></ul><ul><li>There are some rules which apply : </li></ul><ul><li>1 . Data tablespaces should be stored separately from their Index tablespaces </li></ul><ul><li>2 . RBS tablespaces should be stored separately from their Index tablespaces </li></ul>
  51. 51. Improving Database Performance <ul><li>and 3 . The System tablespace should be stored separately from the other tablespaces in the database </li></ul><ul><li>In my example, there is only 1 Data tablespace. In production databases there will probably be many Data tablespaces (which will happen if Partitions are used). </li></ul><ul><li>If/when this occurs, the weightings of each of the Data tablespaces will need to be made (but for my efforts, 1 Data tablespace will be used). </li></ul>
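The three placement rules can be turned into a simple check on a proposed layout. This is a sketch only: the `violations` function, the upper-case tablespace names and the sample layout are all assumptions made for illustration.

```python
def violations(layout):
    """Check a {disk number: [tablespace names]} layout against the 3 rules."""
    problems = []
    for disk, spaces in sorted(layout.items()):
        names = set(spaces)
        # Rule 1: Data apart from its Indexes.
        if {"DATA", "INDEXES"} <= names:
            problems.append((disk, "DATA shares a disk with INDEXES"))
        # Rule 2: RBS apart from the Indexes.
        if {"RBS", "INDEXES"} <= names:
            problems.append((disk, "RBS shares a disk with INDEXES"))
        # Rule 3: System apart from every other tablespace.
        if "SYSTEM" in names and len(names) > 1:
            problems.append((disk, "SYSTEM shares a disk"))
    return problems


proposed = {1: ["SYSTEM"], 2: ["DATA", "INDEXES"], 3: ["RBS"]}
found = violations(proposed)
```

Here only disk 2 is reported, because it holds both DATA and INDEXES.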
  52. 52. Improving Database Performance <ul><li>As you have probably guessed, there are other files which need to be considered - many used by the many and various ‘processes’ of Oracle </li></ul><ul><li>One of these considerations is the on-line redo log files (you remember these and their purpose ?) </li></ul><ul><li>They store the records of each transaction. Each database must have at least 2 online redo log files available to it - the database will write to one log in sequential mode until the redo log file is filled, then it will start writing to the second redo log file. </li></ul>
  53. 53. Improving Database Performance <ul><li>Redo log files (cont’d) </li></ul><ul><li>The Online Redo Log files maintain data about current transactions and they cannot be recovered from a backup unless the database is/was shut down prior to backup - this is a requirement of the ‘Offline Backup’ procedure (if we have time we will look at this) </li></ul><ul><li>On line redo log files need to be ‘mirrored’ </li></ul><ul><li>A method of doing this is to employ redo log groups - which dynamically maintain multiple sets of the online redo logs </li></ul><ul><li>The operating system is also a good ally for mirroring files </li></ul>
  54. 54. Improving Database Performance <ul><li>Redo log files should be placed away from datafiles because of the performance implications, and this means knowing how the 2 types of files are used </li></ul><ul><li>Every transaction (unless it is tagged with the nologging parameter) is recorded in the redo log files </li></ul><ul><li>The entries are written by the LogWriter (LGWR) process </li></ul><ul><li>The data in the transaction is concurrently written to a number of tablespaces (the RBS rollback segments and the Data tablespace come to mind) via the DataBase Writer (DBWR) and this raises possible contention issues if a datafile is located on the same disk as a redo log file </li></ul>
  55. 55. Improving Database Performance <ul><li>Redo log files are written sequentially </li></ul><ul><li>Datafiles are written in ‘random’ order - it is a good move to have these 2 different demands separated </li></ul><ul><li>If a datafile must be stored on the same disk as a redo log file, then it should not belong to the System tablespace, the RBS tablespace, or a very active Data or Index tablespace </li></ul><ul><li>So what about Control Files ? </li></ul><ul><li>There is much less traffic here, and they can be internally mirrored (via the config.ora or init.ora file). The database will maintain the control files as identical copies of each other. </li></ul><ul><li>There should be 3 copies, across 3 disks </li></ul>
  56. 56. Improving Database Performance <ul><li>The LGWR background process writes to the online redo files in a cyclical manner </li></ul><ul><li>When the 1st redo file is full, it directs writing to the 2nd file …. </li></ul><ul><li>When the ‘last’ file is full, LGWR starts overwriting the contents of the 1st file .. and so on </li></ul><ul><li>When ARCHIVELOG mode is used, the contents of the ‘about to be overwritten file’ are written to an archived redo log file on a disk device </li></ul>
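The cyclical writing described on this overhead can be modelled in a few lines. This is a simplified sketch that follows the slide's description, not Oracle's actual implementation: the `RedoLogs` class, its capacity measured in entries, and the in-memory `archive` list are all assumptions.

```python
class RedoLogs:
    """Toy model of LGWR cycling through a fixed set of online redo files."""

    def __init__(self, n_files, capacity, archivelog=False):
        self.files = [[] for _ in range(n_files)]
        self.capacity = capacity      # entries per file (assumed unit)
        self.current = 0              # index of the file being written
        self.archivelog = archivelog
        self.archive = []             # copies kept in ARCHIVELOG mode

    def write(self, entry):
        if len(self.files[self.current]) == self.capacity:
            self._switch()
        self.files[self.current].append(entry)

    def _switch(self):
        # Move to the next file, wrapping round to the 1st after the last.
        nxt = (self.current + 1) % len(self.files)
        if self.files[nxt]:
            # The file is about to be overwritten: archive it first if asked.
            if self.archivelog:
                self.archive.append(list(self.files[nxt]))
            self.files[nxt] = []
        self.current = nxt


logs = RedoLogs(n_files=2, capacity=2, archivelog=True)
for txn in range(1, 6):
    logs.write(txn)
```

After five entries the first file has been archived and reused, so it holds entry 5 while the second file still holds entries 3 and 4.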
  57. 57. Improving Database Performance <ul><li>There will be contention on the online redo log as LGWR will be attempting to write to one redo log file while the Archiver (ARCH) will be trying to read another. </li></ul><ul><li>The solution is to distribute the redo log files across multiple disks </li></ul><ul><li>The archived redo log files are high I/O and therefore should NOT be on the same device as System, Rbs, Data, or Indexes tablespaces </li></ul><ul><li>Neither should they be stored on the same device as any of the online redo log files. </li></ul>
  58. 58. Improving Database Performance <ul><li>The database will stall if there is not enough disk space, and the archived files should be directed to a disk which contains small and preferably static files </li></ul><ul><li>Concurrent I/O </li></ul><ul><li>A commendable goal, and one which needs careful planning to achieve. </li></ul><ul><li>Placing two random access files which are never accessed at the same time on the same disk will quite happily avoid contention for I/O capability </li></ul>
  59. 59. Improving Database Performance <ul><li>What we have just covered is known as </li></ul><ul><li>1. Concurrent I/O - when concurrent processes are being performed against the same device (disk) </li></ul><ul><li>This is overcome by isolating data tables from their Indexes for instance </li></ul><ul><li>2. Interference - when sequential writing is interfered with by reads or writes to other files on the same disk </li></ul>
  60. 60. Improving Database Performance <ul><li>At the risk of labouring this a bit, </li></ul><ul><li>The 3 background processes to watch are </li></ul><ul><li>1. DBWR, which writes in a random manner </li></ul><ul><li>2. LGWR, which writes sequentially </li></ul><ul><li>3. ARCH, which reads and writes sequentially </li></ul><ul><li>LGWR and ARCH write to 1 file at a time, but DBWR may be attempting to write to multiple files at once - (can you think of an example ?) </li></ul><ul><li>Multiple DBWR processes for each instance or multiple I/O slaves for each DBWR is a solution </li></ul>
  61. 61. Improving Database Performance <ul><li>What are the disk layout goals ? </li></ul><ul><li>Are they (1) recoverability or (2) performance </li></ul><ul><li>Recoverability must address all processes which impact disks (storage area for archived redo logs and for Export dump files - (which so far we haven’t mentioned) come to mind). </li></ul><ul><li>Performance calls for file I/O performance and relative speeds of the disk drives </li></ul>
  62. 62. Improving Database Performance <ul><li>What are some recoverability issues ? </li></ul><ul><li>All critical database files should be placed on mirrored drives, and the database run in ARCHIVELOG mode </li></ul><ul><li>The online redo log files must also be mirrored (Operating system or mirrored redo log groups) </li></ul><ul><li>Recoverability issues involve a few disks </li></ul><ul><li>and this is where we start to look at hardware specification </li></ul>
  63. 63. Improving Database Performance <ul><li>Mirroring architecture leads to specifying </li></ul><ul><ul><li>the number of disks required </li></ul></ul><ul><ul><li>the models of disks (capacity and speed) </li></ul></ul><ul><ul><li>the strategy </li></ul></ul><ul><ul><li>If the hardware system is heterogeneous, the faster drives should be dedicated to Oracle database files </li></ul></ul><ul><ul><li>RAID systems should be carefully analysed as to their capability and the optimum benefit sought - RAID-1, RAID-3 and RAID-5 have different processes relating to parity </li></ul></ul>
  64. 64. Improving Database Performance <ul><li>The disks chosen for mirroring architecture must be dedicated to the database </li></ul><ul><li>This guarantees that non-database load on these disks will not interfere with database processes </li></ul>
  65. 65. Improving Database Performance <ul><li>Goals for disk layout : </li></ul><ul><ul><li>The database must be recoverable </li></ul></ul><ul><ul><li>The online redo log files must be mirrored via the system or the database </li></ul></ul><ul><ul><li>The database file I/O weights must be estimated </li></ul></ul><ul><ul><li>Contention between DBWR, LGWR and ARCH must be minimised </li></ul></ul><ul><ul><li>Contention between disks for DBWR must be minimised </li></ul></ul><ul><ul><li>The performance goals must be defined </li></ul></ul><ul><ul><li>The disk hardware options must be known </li></ul></ul><ul><ul><li>The disk mirroring architecture must be known </li></ul></ul><ul><ul><li>Disks must be dedicated to the database </li></ul></ul>
  66. 66. Improving Database Performance <ul><li>So where does that leave us ? </li></ul><ul><li>We’re going to look at ‘solutions’ from Optimal to Practical </li></ul><ul><li>and we’ll assume that : </li></ul><ul><li>the disks are dedicated to the database </li></ul><ul><li>the online redo log files are being mirrored by the Operating System </li></ul><ul><li>the disks are of identical size </li></ul><ul><li>the disks have identical performance characteristics </li></ul><ul><li>(obviously the best case scenario !) </li></ul>
  67. 67. Improving Database Performance <ul><li>So, with that optimistic outlook let’s proceed </li></ul><ul><li>Case 1 - The Optimum Physical Layout </li></ul><ul><li>Disk No. Contents Disk No. Contents </li></ul><ul><li>1 Oracle Software 12 Control file 2 </li></ul><ul><li>2 SYSTEM tablespace 13 Control file 3 </li></ul><ul><li>3 RBS tablespace 14 Application software </li></ul><ul><li>4 DATA tablespace 15 RBS_2 </li></ul><ul><li>5 INDEXES tablespace 16 DATA_2 </li></ul><ul><li>6 TEMP tablespace 17 INDEXES_2 </li></ul><ul><li>7 TOOLS tablespace 18 TEMP_USER </li></ul><ul><li>8 Online Redo Log 1 19 TOOLS_1 </li></ul><ul><li>9 Online Redo Log 2 20 USERS </li></ul><ul><li>10 Online Redo Log 3 21 Archived redo dest. disk </li></ul><ul><li>11 Control file 1 22 Archived dump file </li></ul>
  68. 68. Hardware Configurations <ul><li>The 22 disk solution is an optimal solution. </li></ul><ul><li>It may not be feasible for a number of reasons, including hardware costs </li></ul><ul><li>In the following overheads there will be efforts to reduce the number of disks, commensurate with preserving performance </li></ul>
  69. 69. Hardware Configurations <ul><li>This leads to - 17 disk configuration </li></ul><ul><li>Disk Contents Disk Contents </li></ul><ul><li>1 Oracle software 11 Application software </li></ul><ul><li>2 SYSTEM tablespace 12 RBS_2 </li></ul><ul><li>3 RBS tablespace 13 DATA_2 </li></ul><ul><li>4 DATA tablespace 14 INDEXES_2 </li></ul><ul><li>5 INDEXES tablespace 15 TEMP_USER </li></ul><ul><li>6 TEMP tablespace 16 Archived redo log destination disk </li></ul><ul><li>7 TOOLS tablespace 17 Export dump destination disk </li></ul><ul><li>8 Online Redo log 1, Control file 1 </li></ul><ul><li>9 Online Redo log 2, Control file 2 </li></ul><ul><li>10 Online Redo log 3, Control file 3 </li></ul>
  70. 70. Hardware Configurations <ul><li>The Control Files are candidates for placement onto the three redo log disks. The altered arrangement reflects this. </li></ul><ul><li>The Control files will interfere with the online redo logfiles but only at log switch points and during recovery </li></ul>
  71. 71. Hardware Configurations <ul><li>The TOOLS_1 tablespace will be merged with the TOOLS tablespace </li></ul><ul><li>In a production environment, users will not have resource privileges, and the USERS tablespace can be ignored </li></ul><ul><li>However, what will be the case if users require development and test access ? </li></ul><ul><li>Create another database ? (test ?) </li></ul>
  72. 72. Hardware Configurations <ul><li>The RBS and RBS_2 tablespaces have special rollback segments used during data loading. </li></ul><ul><li>Data loads should not occur during production usage, and so if the 17 disk option is not practical, we can look at combining RBS and RBS_2 - there should be no contention </li></ul><ul><li>TEMP and TEMP_USER can be placed on the same disk </li></ul><ul><li>The TEMP tablespace weighting (5 in the previous table) can vary. It should be possible to store these 2 tablespaces on the same disk. </li></ul><ul><li>TEMP_USER is dedicated to a specific user (such as Oracle Financials) whose temporary segment requirements are greater than those of the system’s other users </li></ul>
  73. 73. Hardware Configurations <ul><li>The revised solution is now - 15 disks </li></ul><ul><li>Disk Contents Disk Contents </li></ul><ul><li>1 Oracle software 11 Application software </li></ul><ul><li>2 SYSTEM tablespace 12 DATA_2 </li></ul><ul><li>3 RBS, RBS_2 tablespaces 13 INDEXES_2 </li></ul><ul><li>4 DATA tablespace 14 Archived Redo Log destination disk </li></ul><ul><li>5 INDEXES tablespace 15 Export dump file destination disk </li></ul><ul><li>6 TEMP, TEMP_USER tablespaces </li></ul><ul><li>7 TOOLS tablespace </li></ul><ul><li>8 Online Redo Log 1, Control file 1 </li></ul><ul><li>9 Online Redo Log 2, Control file 2 </li></ul><ul><li>10 Online Redo Log 3, Control file 3 </li></ul>
  74. 74. Hardware Configurations <ul><li>What if there aren’t 15 disks ? -->> Move to attempt 3 </li></ul><ul><li>Here the online Redo Logs will be placed onto the same disk. When the database runs in ARCHIVELOG mode, this will cause concurrent I/O and contention between LGWR and ARCH on that disk </li></ul><ul><li>What we can deduce from this is that the combination about to be proposed is NOT appropriate for a high transaction system or for systems running in ARCHIVELOG mode </li></ul><ul><li>(why is this so - Prof. Julius Sumner Miller ?) </li></ul>
  75. 75. Hardware Configurations <ul><li>The ‘new’ solution - 12 disks </li></ul><ul><li>Disk Contents </li></ul><ul><li>1 Oracle software </li></ul><ul><li>2 SYSTEM tablespace, Control file 1 </li></ul><ul><li>3 RBS, RBS_2 tablespaces, Control file 2 </li></ul><ul><li>4 DATA tablespace, Control file 3 </li></ul><ul><li>5 INDEXES tablespace </li></ul><ul><li>6 TEMP, TEMP_USER tablespaces </li></ul><ul><li>7 TOOLS, INDEXES_2 tablespaces </li></ul><ul><li>8 Online Redo Logs 1, 2 and 3 </li></ul><ul><li>9 Application software </li></ul><ul><li>10 DATA_2 </li></ul><ul><li>11 Archived redo log destination disk </li></ul><ul><li>12 Export dump file destination disk </li></ul>
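The warning about LGWR and ARCH can be illustrated with a small check over the 12 disk layout. This is only a sketch in Python: the dictionary mirrors the table above, while the function names and the rule of thumb (risk exists when ARCHIVELOG mode is on and more than one online redo log shares a disk) are assumptions made for illustration, not an Oracle feature.

```python
# The 12 disk layout from the slide above, as disk number -> contents.
TWELVE_DISK_LAYOUT = {
    1: ["Oracle software"],
    2: ["SYSTEM", "Control file 1"],
    3: ["RBS", "RBS_2", "Control file 2"],
    4: ["DATA", "Control file 3"],
    5: ["INDEXES"],
    6: ["TEMP", "TEMP_USER"],
    7: ["TOOLS", "INDEXES_2"],
    8: ["Online redo log 1", "Online redo log 2", "Online redo log 3"],
    9: ["Application software"],
    10: ["DATA_2"],
    11: ["Archived redo log destination"],
    12: ["Export dump file destination"],
}


def disks_sharing_redo_logs(layout):
    """Return the disks that hold more than one online redo log."""
    shared = []
    for disk, contents in sorted(layout.items()):
        redo_logs = [c for c in contents if c.startswith("Online redo log")]
        if len(redo_logs) > 1:
            shared.append(disk)
    return shared


def lgwr_arch_contention_risk(layout, archivelog_mode):
    """Hypothetical rule of thumb: in ARCHIVELOG mode, ARCH reads a filled
    log while LGWR writes the current one, so logs sharing a disk compete."""
    return archivelog_mode and bool(disks_sharing_redo_logs(layout))


print(disks_sharing_redo_logs(TWELVE_DISK_LAYOUT))
print(lgwr_arch_contention_risk(TWELVE_DISK_LAYOUT, archivelog_mode=True))
```

Running the same check with `archivelog_mode=False` reports no risk, which matches the slide's point that this layout only suits systems that are not archiving.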
  76. 76. Hardware Configurations <ul><li>Notice that the Control Files have been moved to Disks 2, 3 and 4 </li></ul><ul><li>The Control Files are not I/O demanding, and can safely coexist with SYSTEM, RBS and DATA </li></ul><ul><li>What we have done so far is to ‘move’ the contents of the high numbered disks onto the ‘low’ numbered disks - the low numbered disks are the most critical in the database. </li></ul><ul><li>The next attempt to ‘rationalise’ the disk arrangement is to look carefully at the high numbered disks. </li></ul>
  77. 77. Hardware Configurations <ul><ul><li>DATA_2 can be combined with the TEMP tablespaces (this disk has 4% of the I/O load). </li></ul></ul><ul><ul><li>This should be safe as the static tables (which ones are those ?) are not as likely to have group operations performed on them as the ones in the DATA tablespace </li></ul></ul><ul><ul><li>The Export dump files have been moved to the Online Redo disk (the Redo log files are about 100 MB and don’t increase in size - is that correct ?). Exporting causes minor transaction activity. </li></ul></ul><ul><ul><li>The other change is the combination of the application software with the archived redo log file destination area. This leaves ARCH space to write log files, and avoids conflicts with DBWR </li></ul></ul>
  78. 78. Hardware Configurations <ul><li>Disk Contents (9 disks) </li></ul><ul><li>1 Oracle software </li></ul><ul><li>2 SYSTEM tablespace, Control file 1 </li></ul><ul><li>3 RBS tablespace, RBS_2 tablespace, Control file 2 </li></ul><ul><li>4 DATA tablespace, Control file 3 </li></ul><ul><li>5 INDEXES tablespace </li></ul><ul><li>6 TEMP, TEMP_USER, DATA_2 tablespaces </li></ul><ul><li>7 TOOLS, INDEXES_2 tablespaces </li></ul><ul><li>8 Online Redo logs 1, 2 and 3, Export dump file </li></ul><ul><li>9 Application software, Archived Redo log destination disk </li></ul>
  79. 79. Hardware Configurations <ul><li>Can the number of required disks be further reduced ? </li></ul><ul><li>Remember that the performance characteristics will deteriorate </li></ul><ul><li>It’s now important to look closely at the weights set during the I/O estimation process. </li></ul>
  80. 80. Hardware Configurations <ul><li>Estimated Weightings for the previous (9 disk) solution are </li></ul><ul><li>Disk Weight Contents </li></ul><ul><li>1 Oracle software </li></ul><ul><li>2 35 SYSTEM tablespace, Control file 1 </li></ul><ul><li>3 40 RBS, RBS_2 tablespaces, Control file 2 </li></ul><ul><li>4 100 DATA tablespace, Control file 3 </li></ul><ul><li>5 33 INDEXES tablespace </li></ul><ul><li>6 9 TEMP, TEMP_USER, DATA_2 tablespaces </li></ul><ul><li>7 3 TOOLS, INDEXES_2 tablespaces </li></ul><ul><li>8 40+ Online Redo logs 1, 2 and 3, Export dump file destination disk </li></ul><ul><li>9 40+ Application software, archived redo log destination disk </li></ul>
  81. 81. Hardware Configurations <ul><li>A further compromise distribution could be </li></ul><ul><li>Disk Weight Contents </li></ul><ul><li>1 Oracle software </li></ul><ul><li>2 38 SYSTEM, TOOLS, INDEXES_2 tablespaces, Control file 1 </li></ul><ul><li>3 40 RBS, RBS_2 tablespaces, Control file 2 </li></ul><ul><li>4 100 DATA tablespace, Control file 3 </li></ul><ul><li>5 42 INDEXES, TEMP, TEMP_USER, DATA_2 tablespaces </li></ul><ul><li>6 40+ Online redo logs 1, 2 and 3, Export dump file destination disk </li></ul><ul><li>7 40+ Application software, Archived redo log destination disk </li></ul>
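The weights in this 7 disk distribution follow from adding together the weights of the disks that were combined. A minimal Python sketch of that arithmetic is shown here; the weights come from the 9 disk table, while the helper `merge_disks` is a hypothetical name and the "40+" entries are treated as exactly 40 purely for illustration.

```python
# Estimated I/O weights for the 9 disk layout (disk number -> weight).
# Disk 1 (Oracle software) has no weight listed, so it is left out.
# The "40+" entries are treated as 40 here, which is an assumption.
nine_disk_weights = {
    2: 35,   # SYSTEM, Control file 1
    3: 40,   # RBS, RBS_2, Control file 2
    4: 100,  # DATA, Control file 3
    5: 33,   # INDEXES
    6: 9,    # TEMP, TEMP_USER, DATA_2
    7: 3,    # TOOLS, INDEXES_2
    8: 40,   # Online redo logs, export dump file (40+)
    9: 40,   # Application software, archived redo logs (40+)
}


def merge_disks(weights, source, target):
    """Return a new weight table with disk `source` folded into `target`."""
    merged = dict(weights)
    merged[target] += merged.pop(source)
    return merged


# Fold disk 7 (TOOLS, INDEXES_2) into disk 2 (SYSTEM), then
# disk 6 (TEMP, TEMP_USER, DATA_2) into disk 5 (INDEXES).
# Disk numbers are not renumbered: old disks 8 and 9 are the
# new disks 6 and 7 in the slide.
seven_disk_weights = merge_disks(merge_disks(nine_disk_weights, 7, 2), 6, 5)

print(seven_disk_weights[2], seven_disk_weights[5])  # 38 42
```

The DATA disk keeps its weight of 100 throughout, which is why nothing else is merged onto it apart from a control file.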
  82. 82. Hardware Configurations <ul><li>A few thoughts for a small database system - 3 disks </li></ul><ul><li>1. Suitable for an OLTP application. Assumes that the transactions are small in size, large in number and variety, and randomly scattered among the available tables. </li></ul><ul><li>The application should be as index intensive as possible, and full table scans must be kept to the minimum possible </li></ul>
  83. 83. Hardware Configurations <ul><li>2. Isolate the SYSTEM tablespace. This stores the data dictionary, which is accessed many times for every query </li></ul><ul><li>In a ‘typical case’, executing a query against a table called CODES_TABLE requires checking </li></ul><ul><ul><li>the column names in the CODES_TABLE table </li></ul></ul><ul><ul><li>the user’s privilege of access to the CODES_TABLE table </li></ul></ul><ul><ul><li>the user’s privilege to access the Code column of the CODES_TABLE table </li></ul></ul><ul><ul><li>the user’s role definition(s) </li></ul></ul><ul><ul><li>the indexes defined on the CODES_TABLE table </li></ul></ul><ul><ul><li>the columns of the indexes defined on the CODES_TABLE table </li></ul></ul>
  84. 84. Hardware Configurations <ul><li>3. Isolate the INDEXES tablespace. This probably accounts for 35 to 40% of the I/O </li></ul><ul><li>4. Separate the rollback segments and DATA tablespaces </li></ul><ul><li>There is a point to watch here - with 3 disks there are 4 tablespaces - SYSTEM, INDEXES, DATA and RBS. </li></ul><ul><li>The placement of RBS is determined by the volume of transactions. If high, RBS and DATA should be kept apart. </li></ul><ul><li>If low, RBS and DATA should work together without causing contention </li></ul>
  85. 85. Hardware Configurations <ul><li>The 3 disk layout would be one of these : </li></ul><ul><li>Disk 1: SYSTEM tablespace, control file, redo log </li></ul><ul><li>Disk 2 : INDEXES tablespace, control file, redo log, RBS tablespace </li></ul><ul><li>Disk 3 : DATA tablespace, control file, redo log </li></ul><ul><li>OR </li></ul><ul><li>Disk 1 : SYSTEM tablespace, control file, redo log </li></ul><ul><li>Disk 2 : INDEXES tablespace, control file, redo log </li></ul><ul><li>Disk 3 : DATA tablespace, control file, redo log, RBS tablespace </li></ul>
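The choice between the two 3 disk layouts depends only on where RBS goes. A short Python sketch of that decision is given below; the function name and the boolean `high_transaction_volume` flag are assumptions for illustration, since the slides do not define a threshold for "high".

```python
def three_disk_layout(high_transaction_volume):
    """Build the 3 disk layout, placing RBS according to transaction volume.

    High volume: keep RBS away from DATA, so it shares disk 2 with INDEXES.
    Low volume:  RBS can share disk 3 with DATA without much contention.
    """
    shared = ["control file", "redo log"]  # one copy of each on every disk
    layout = {
        1: ["SYSTEM tablespace"] + shared,
        2: ["INDEXES tablespace"] + shared,
        3: ["DATA tablespace"] + shared,
    }
    rbs_disk = 2 if high_transaction_volume else 3
    layout[rbs_disk].append("RBS tablespace")
    return layout


for disk, contents in three_disk_layout(high_transaction_volume=True).items():
    print(f"Disk {disk}: {', '.join(contents)}")
```

In both cases the SYSTEM tablespace stays isolated on disk 1, as point 2 recommends.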
  86. 86. Hardware Configurations <ul><li>Summary : </li></ul><ul><li>Database Type : Small development database </li></ul><ul><li>Tablespaces : </li></ul><ul><li>SYSTEM </li></ul><ul><li>DATA </li></ul><ul><li>INDEXES </li></ul><ul><li>RBS </li></ul><ul><li>TEMP </li></ul><ul><li>USERS </li></ul><ul><li>TOOLS </li></ul>
  87. 87. Hardware Configurations <ul><li>Summary : </li></ul><ul><li>Database Type : Production OLTP database </li></ul><ul><li>Tablespaces : </li></ul><ul><li>SYSTEM </li></ul><ul><li>DATA </li></ul><ul><li>DATA_2 </li></ul><ul><li>INDEXES </li></ul><ul><li>INDEXES_2 </li></ul><ul><li>RBS </li></ul><ul><li>RBS_2 </li></ul><ul><li>TEMP </li></ul><ul><li>TEMP_USER </li></ul><ul><li>TOOLS </li></ul>
  88. 88. Hardware Configurations <ul><li>Summary : </li></ul><ul><li>Database Type : Production OLTP with historical data </li></ul><ul><li>Tablespaces : </li></ul><ul><li>SYSTEM </li></ul><ul><li>DATA </li></ul><ul><li>DATA_2 </li></ul><ul><li>DATA_ARCHIVE </li></ul><ul><li>INDEXES </li></ul><ul><li>INDEXES_2 </li></ul><ul><li>INDEXES_ARCHIVE </li></ul><ul><li>RBS </li></ul><ul><li>RBS_2 </li></ul><ul><li>TEMP </li></ul><ul><li>TEMP_USER </li></ul><ul><li>TOOLS </li></ul>