Recovery Tuning


  1. 1. Recovery Tuning <ul><li>Main techniques </li></ul><ul><ul><li>Put the log on a dedicated disk </li></ul></ul><ul><ul><li>Delay writing updates to the database disks as long as possible </li></ul></ul><ul><ul><li>Set proper intervals for DB dumping and checkpointing </li></ul></ul><ul><ul><li>Reduce the size of large update transactions </li></ul></ul>
  2. 2. Separate Disk for the Log <ul><li>DB2 UDB v7.1 on Windows 2000 </li></ul><ul><li>5% performance improvement if the log is located on a different disk </li></ul><ul><li>Controller cache hides negative impact </li></ul><ul><ul><li>Mid-range server with an Adaptec RAID controller (80 MB RAM) and 2x18 GB disk drives. </li></ul></ul>Figure 2.18 in the textbook shows a 30% improvement
  3. 3. Tuning Database Writes <ul><li>Database writes caused by transactions tend to be random </li></ul><ul><ul><li>Better to be delayed as much as possible </li></ul></ul><ul><ul><ul><li>Sufficient info in the log for recovery </li></ul></ul></ul><ul><ul><li>But eventually they need to be written to the disk </li></ul></ul><ul><ul><ul><li>To reduce recovery time </li></ul></ul></ul><ul><ul><li>When to write? </li></ul></ul><ul><ul><ul><li>Forced: when the buffer is full (or nearly full) </li></ul></ul></ul><ul><ul><ul><li>Opportunistic: when no extra overhead for disk seeking </li></ul></ul></ul><ul><ul><ul><li>Checkpoint: force all committed writes to disk </li></ul></ul></ul>
  4. 4. Writing Dirty Pages to the Disk <ul><li>When the number of dirty pages is greater than a given parameter (Oracle 8) </li></ul><ul><li>When the number of dirty pages crosses a given threshold (less than 3% of free pages in the database buffer for SQL Server 7) </li></ul><ul><li>When the log is full, a checkpoint is forced. This can have a significant impact on performance. </li></ul>
  5. 5. Tune Checkpoint Intervals <ul><li>Oracle 8i on Windows 2000 </li></ul><ul><li>A checkpoint (partial flush of dirty pages to disk) occurs at regular intervals or when the log is full: </li></ul><ul><ul><li>Impacts the performance of on-line processing </li></ul></ul><ul><ul><li>Reduces the size of the log </li></ul></ul><ul><ul><li>Reduces the time to recover from a crash </li></ul></ul>
  6. 6. Group Commit <ul><li>Log writing is a bottleneck if every committing transaction needs a write to the log </li></ul><ul><li>Group commit </li></ul><ul><ul><li>Write the logs of multiple transactions in a batch </li></ul></ul><ul><ul><li>Need to use a “log buffer” (another thing to tune!) </li></ul></ul><ul><ul><li>Better throughput if there are many concurrent short update transactions </li></ul></ul><ul><ul><li>Longer response time for individual transactions </li></ul></ul><ul><ul><ul><li>This is a problem if they hold locks </li></ul></ul></ul><ul><ul><ul><li>Early release of locks can cause problems, but the risk is remote </li></ul></ul></ul>
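
The batching idea behind group commit can be sketched in a few lines of Python. This is a toy model, not how any particular DBMS implements it: the `LogBuffer` class, its `group_size` parameter, and the `run` helper are hypothetical names introduced for illustration, and the final `flush()` stands in for the timeout a real log writer would use.

```python
class LogBuffer:
    """Toy log buffer: commits wait in memory until a group is flushed together."""

    def __init__(self, group_size):
        self.group_size = group_size  # assumed tuning knob: commits per log write
        self.pending = []             # commit records not yet on disk
        self.durable = []             # commit records already "written"
        self.flushes = 0              # number of simulated log writes

    def commit(self, txn_id):
        # The transaction is only durable after the next flush,
        # which is why its response time gets longer.
        self.pending.append(txn_id)
        if len(self.pending) >= self.group_size:
            self.flush()

    def flush(self):
        # One simulated disk write covers every pending commit record.
        if self.pending:
            self.durable.extend(self.pending)
            self.pending.clear()
            self.flushes += 1


def run(num_txns, group_size):
    """Commit num_txns transactions and return the number of log writes."""
    log = LogBuffer(group_size)
    for txn_id in range(num_txns):
        log.commit(txn_id)
    log.flush()  # stand-in for a timeout that flushes a partial group
    return log.flushes


if __name__ == "__main__":
    for size in (1, 10, 100):
        print(size, run(1000, size))
```

With a group size of 1 every commit costs one log write; larger groups trade fewer writes for a longer wait before each commit becomes durable.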
  7. 7. Log IO - Data <ul><li>Settings: </li></ul><ul><ul><li>lineitem ( L_ORDERKEY, L_PARTKEY , L_SUPPKEY , L_LINENUMBER , L_QUANTITY, L_EXTENDEDPRICE , L_DISCOUNT, L_TAX , L_RETURNFLAG, L_LINESTATUS , L_SHIPDATE, L_COMMITDATE, L_RECEIPTDATE, L_SHIPINSTRUCT , L_SHIPMODE , L_COMMENT ); </li></ul></ul><ul><ul><li>READ COMMITTED isolation level </li></ul></ul><ul><ul><li>Empty table </li></ul></ul><ul><ul><li>Dual Xeon (550 MHz, 512 KB), 1 GB RAM, internal RAID controller from Adaptec (80 MB), 4x18 GB drives (10000 RPM), Windows 2000. </li></ul></ul>
  8. 8. Log IO - Transactions <ul><li>No Concurrent Transactions: </li></ul><ul><ul><li>Insertions [300 000 inserts, 10 threads], e.g., </li></ul></ul><ul><ul><li>insert into lineitem values (1,7760,401,1,17,28351.92,0.04,0.02,'N','O','1996-03-13','1996-02-12','1996-03-22','DELIVER IN PERSON','TRUCK','blithely regular ideas caj'); </li></ul></ul>
  9. 9. Group Commits <ul><li>DB2 UDB v7.1 on Windows 2000 </li></ul><ul><li>Log records of many transactions are written together </li></ul><ul><ul><li>Increases throughput by reducing the number of writes </li></ul></ul><ul><ul><li>At the cost of increased minimum response time. </li></ul></ul>
  10. 10. Transaction Chopping <ul><li>Some transactions, in particular batch transactions, can be very long </li></ul><ul><ul><li>A lot of log information </li></ul></ul><ul><ul><li>Very costly for recovery </li></ul></ul><ul><li>Solution </li></ul><ul><ul><li>Transaction chopping </li></ul></ul><ul><ul><li>An easy to understand concept </li></ul></ul><ul><ul><ul><li>Formal work in appendix B of the textbook </li></ul></ul></ul>
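
As a rough illustration of the idea, the sketch below splits one long batch update into several short transactions using Python's built-in `sqlite3` module. The `account` table, the `chopped_update` function, and the chunk size are assumptions made up for this example. Whether a given chopping preserves correctness depends on the conditions worked out formally in the textbook appendix, which this sketch does not check.

```python
import sqlite3


def chopped_update(conn, ids, chunk_size):
    """Apply the same update to every id, committing after each chunk."""
    commits = 0
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        # "with conn" commits on success and rolls back on an exception,
        # so each chunk is its own short transaction.
        with conn:
            conn.executemany(
                "UPDATE account SET balance = balance + 1 WHERE id = ?",
                [(i,) for i in chunk],
            )
        commits += 1
    return commits


# Hypothetical demo data: 1000 accounts, all with balance 0.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, 0)", [(i,) for i in range(1000)])
conn.commit()

commits = chopped_update(conn, list(range(1000)), chunk_size=100)
total = conn.execute("SELECT SUM(balance) FROM account").fetchone()[0]

if __name__ == "__main__":
    print(commits, total)
```

Each chunk generates only a small amount of log before it commits, so a crash in the middle of the batch leaves less work to undo or redo than one transaction covering all rows.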
  11. 11. Summary <ul><li>In this module, we have covered: </li></ul><ul><ul><li>The principles of recovery </li></ul></ul><ul><ul><li>How to optimise recovery-related options </li></ul></ul><ul><ul><ul><li>Put the log on a dedicated disk </li></ul></ul></ul><ul><ul><ul><li>Delay writing updates </li></ul></ul></ul><ul><ul><ul><li>Use checkpoints and dumps properly </li></ul></ul></ul><ul><ul><ul><li>Reduce the size of update transactions </li></ul></ul></ul>
  12. 12. CS5226 Week 6 Operating System & Database Performance Tuning
  13. 13. Outline <ul><li>Part 1: Operating systems and DBMS </li></ul><ul><li>Part 2: OS-related tuning </li></ul>
  14. 14. Operating System <ul><li>The operating system is an interface between hardware and other software, supporting: </li></ul><ul><ul><li>Processes and threads </li></ul></ul><ul><ul><li>Paging, buffering and IO scheduling </li></ul></ul><ul><ul><li>Multi-tasking </li></ul></ul><ul><ul><li>File system </li></ul></ul><ul><ul><li>Other utilities such as timing, networking and performance monitoring </li></ul></ul>Figure labels: hardware, operating system, other software, DBMS
  15. 15. Scheduling <ul><li>Process versus thread </li></ul><ul><ul><li>Scheduling based on time-slicing, IO, priority etc </li></ul></ul><ul><ul><ul><li>Different from transaction scheduling </li></ul></ul></ul><ul><ul><li>The cost of context switching </li></ul></ul><ul><ul><ul><li>When is a switch desirable, and when is it not? </li></ul></ul></ul><ul><li>The administrator can assign priorities to processes/threads </li></ul><ul><ul><li>Case 1: The DBMS runs at a lower priority </li></ul></ul><ul><ul><li>Case 2: Different transactions run at different priorities </li></ul></ul><ul><ul><li>Case 3: Online transactions with higher priority than offline transactions </li></ul></ul>
  16. 16. Priority Inversion <ul><li>Let the priorities be T1 > T2s > T3 </li></ul>… a solution: priority inheritance. Figure labels: T1, T2s, T3, lock x, request x
  17. 17. Database Buffers <ul><li>An application can have its own in-memory buffers (e.g., variables in the program; cursors); </li></ul><ul><li>A logical read/write is issued to the DBMS when data needs to be read from or written to the database; </li></ul><ul><li>A physical read/write is issued by the DBMS according to its page replacement algorithm, and the request is passed to the OS. </li></ul><ul><li>The OS may initiate IO operations to support the virtual memory that the DBMS buffer is built on. </li></ul>Figure labels: application buffers, DBMS buffers, OS buffers
  18. 18. Database Buffer Size <ul><li>Buffer too small: the hit ratio is too low </li></ul><ul><li>hit ratio = (logical acc. - physical acc.) / (logical acc.) </li></ul><ul><li>Buffer too large: the OS starts paging </li></ul><ul><li>Recommended strategy: monitor the hit ratio and increase the buffer size until the hit ratio flattens out. If there is still paging, then buy memory. </li></ul>Figure labels: database processes, database buffer, RAM, paging disk, log, data
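
The hit ratio formula and the "increase until it flattens out" strategy can be written down directly. In this minimal sketch the function names, the `tolerance` threshold, and the sample measurements are all made up for illustration; real access counts would come from the DBMS's own monitoring tools.

```python
def hit_ratio(logical_accesses, physical_accesses):
    """hit ratio = (logical acc. - physical acc.) / (logical acc.)"""
    if logical_accesses <= 0:
        raise ValueError("logical_accesses must be positive")
    return (logical_accesses - physical_accesses) / logical_accesses


def flattened(ratios, tolerance=0.01):
    """True when the latest buffer increase improved the hit ratio by less
    than `tolerance` (an assumed threshold, not a vendor recommendation)."""
    return len(ratios) >= 2 and (ratios[-1] - ratios[-2]) < tolerance


# Hypothetical measurements: (buffer size in MB, logical acc., physical acc.)
samples = [(64, 10_000, 4_000), (128, 10_000, 1_500), (256, 10_000, 1_450)]
ratios = [hit_ratio(logical, physical) for _, logical, physical in samples]

if __name__ == "__main__":
    for (size, _, _), ratio in zip(samples, ratios):
        print(size, round(ratio, 3))
    print("stop growing the buffer:", flattened(ratios))
```

In the sample data, doubling the buffer from 128 to 256 MB barely changes the ratio, which is the signal to stop adding buffer space.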
  19. 19. Buffer Size - Data <ul><li>Settings: </li></ul><ul><ul><li>employees ( ssnum , name, lat, long, hundreds1, hundreds2); </li></ul></ul><ul><ul><li>clustered index c on employees(lat); (unused) </li></ul></ul><ul><ul><li>10 distinct values of lat and long, 100 distinct values of hundreds1 and hundreds2 </li></ul></ul><ul><ul><li>20000000 rows (630 MB); </li></ul></ul><ul><ul><li>Warm Buffer </li></ul></ul><ul><ul><li>Dual Xeon (550 MHz, 512 KB), 1 GB RAM, internal RAID controller from Adaptec (80 MB), 4x18 GB drives (10000 RPM), Windows 2000. </li></ul></ul>
  20. 20. Buffer Size - Queries <ul><li>Queries: </li></ul><ul><ul><li>Scan Query </li></ul></ul><ul><ul><li>select sum(long) from employees; </li></ul></ul><ul><ul><li>Multipoint query </li></ul></ul><ul><ul><li>select * from employees where lat = ?; </li></ul></ul>
  21. 21. Database Buffer Size <ul><li>SQL Server 7 on Windows 2000 </li></ul><ul><li>Scan query: </li></ul><ul><ul><li>LRU (least recently used) does badly when table spills to disk as Stonebraker observed 20 years ago. </li></ul></ul><ul><li>Multipoint query: </li></ul><ul><ul><li>Throughput increases with buffer size until all data is accessed from RAM. </li></ul></ul>
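
The poor behaviour of LRU on repeated scans is easy to reproduce with a small simulation. The sketch below is a toy model with hypothetical names (`lru_hit_ratio`, page counts chosen for illustration) and says nothing about how SQL Server manages its buffer internally.

```python
from collections import OrderedDict


def lru_hit_ratio(num_pages, buffer_pages, scans):
    """Scan pages 0..num_pages-1 `scans` times through an LRU buffer
    and return the fraction of accesses served from the buffer."""
    buffer = OrderedDict()  # oldest entry first, most recently used last
    hits = accesses = 0
    for _ in range(scans):
        for page in range(num_pages):
            accesses += 1
            if page in buffer:
                hits += 1
                buffer.move_to_end(page)        # mark as most recently used
            else:
                if len(buffer) >= buffer_pages:
                    buffer.popitem(last=False)  # evict least recently used
                buffer[page] = True
    return hits / accesses


if __name__ == "__main__":
    print(lru_hit_ratio(100, 100, 5))  # table fits in the buffer
    print(lru_hit_ratio(100, 99, 5))   # table is one page too large
```

When the table is even one page larger than the buffer, each page is evicted just before the next scan needs it, so every access misses. When the table fits, only the first scan misses.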
  22. 22. Multiprogramming Levels <ul><li>More concurrent users </li></ul><ul><ul><li>Better utilization of CPU cycles (and other system resources) </li></ul></ul><ul><ul><li>Risk of excessive page swapping </li></ul></ul><ul><ul><li>More lock conflicts </li></ul></ul><ul><li>So how many exactly? </li></ul><ul><ul><li>Depends on transaction profiles </li></ul></ul><ul><ul><ul><li>Experiments to find the best value </li></ul></ul></ul><ul><ul><ul><li>And this parameter may change when application patterns change </li></ul></ul></ul><ul><ul><ul><li>Feedback control mechanism </li></ul></ul></ul>
  23. 23. Disk Layout and Access <ul><li>Larger disk allocation chunks improve write performance </li></ul><ul><ul><li>At the cost of disk utilisation </li></ul></ul><ul><li>Setting the disk usage factor </li></ul><ul><ul><li>Low when expecting updates/inserts </li></ul></ul><ul><ul><li>Higher for scan-type queries </li></ul></ul><ul><li>Prefetching within the DBMS, not the OS </li></ul><ul><ul><li>For non-random accesses </li></ul></ul>
  24. 24. Scan Performance - Data <ul><li>Settings: </li></ul><ul><ul><li>lineitem ( L_ORDERKEY, L_PARTKEY , L_SUPPKEY , L_LINENUMBER , L_QUANTITY, L_EXTENDEDPRICE , L_DISCOUNT, L_TAX , L_RETURNFLAG, L_LINESTATUS , L_SHIPDATE, L_COMMITDATE, L_RECEIPTDATE, L_SHIPINSTRUCT , L_SHIPMODE , L_COMMENT ); </li></ul></ul><ul><ul><li>600 000 rows </li></ul></ul><ul><ul><li>Lineitem tuples are ~ 160 bytes long </li></ul></ul><ul><ul><li>Cold Buffer </li></ul></ul><ul><ul><li>Dual Xeon (550 MHz, 512 KB), 1 GB RAM, internal RAID controller from Adaptec (80 MB), 4x18 GB drives (10000 RPM), Windows 2000. </li></ul></ul>
  25. 25. Scan Performance - Queries <ul><li>Queries: </li></ul><ul><ul><li>select avg(l_discount) from lineitem; </li></ul></ul>
  26. 26. Usage Factor <ul><li>DB2 UDB v7.1 on Windows 2000 </li></ul><ul><li>Usage factor is the percentage of the page used by tuples and auxiliary data structures (the rest is reserved for future inserts and updates) </li></ul><ul><li>Scan throughput increases with usage factor. </li></ul>
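
A back-of-the-envelope calculation shows why scans benefit from a higher usage factor: the same rows occupy fewer pages, so a scan reads less. The sketch reuses the 600 000 rows of roughly 160 bytes from the scan experiment settings; the 4096-byte page size and the `pages_needed` function are assumptions for illustration, and per-page overhead is ignored.

```python
import math


def pages_needed(rows, tuple_bytes, page_bytes, usage_factor):
    """Pages a full scan must read when each page is filled only
    up to `usage_factor` (between 0 and 1) of its capacity."""
    tuples_per_page = int(page_bytes * usage_factor) // tuple_bytes
    if tuples_per_page < 1:
        raise ValueError("usage factor too low to fit a single tuple")
    return math.ceil(rows / tuples_per_page)


if __name__ == "__main__":
    # Page size of 4096 bytes is an assumption, not taken from the slides.
    for usage in (0.5, 0.7, 1.0):
        print(usage, pages_needed(600_000, 160, 4096, usage))
```

Under these assumptions, going from a 50% to a 100% usage factor roughly halves the number of pages a full scan has to read.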
  27. 27. Prefetching <ul><li>DB2 UDB v7.1 on Windows 2000 </li></ul><ul><li>Throughput increases up to a certain point when prefetching size increases. </li></ul>
  28. 28. Summary <ul><li>In this module, we have covered: </li></ul><ul><ul><li>A review of OS from the DBMS perspective </li></ul></ul><ul><ul><li>How to optimise OS-related parameters and options </li></ul></ul><ul><ul><ul><li>Thread </li></ul></ul></ul><ul><ul><ul><li>Buffer, and </li></ul></ul></ul><ul><ul><ul><li>File system </li></ul></ul></ul>
