Compression ow2009 r2
Published in: Technology, Education
  • The author has ‘experienced’ the following results in tests conducted on 11.1.0.6 (Windows beta) with a 500,000-row sales table, updating 90% of rows with no lengthening of any row. Conventional table: 19 seconds. Table with OLTP compression: 2 hours 12 minutes and 23 seconds! OK, an unfair test, but it certainly warrants a warning.
  • Transcript

    • 1. Data Compression in Oracle
        Carl Dudley, University of Wolverhampton, UK
        UKOUG Director, Oracle ACE Director
        carl.dudley@wlv.ac.uk
    • 2. Introduction
        • Working with Oracle since 1986
        • Oracle DBA - OCP Oracle7, 8, 9, 10
        • Oracle DBA of the Year – 2002
        • Oracle ACE Director
        • Regular presenter at Oracle conferences
        • Consultant and trainer
        • Technical editor for a number of Oracle texts
        • UK Oracle User Group Director
        • Member of IOUC
        • Day job – University of Wolverhampton, UK
    • 3. Main Topics
        • Oracle 9i and 10g compression - major features
        • Compression in data warehousing
        • Sampling the data to predict compression
        • Pre-sorting the data for compression
        • Behaviour of DML/DDL on compressed tables
        • Compression internals
        • Advanced Compression in Oracle11g (for OLTP operations)
        • Shrinking unused space
    • 4. Oracle Data Compression – Main Features
    • 5. Compression : Characteristics
        • Trades physical I/O against CPU utilization
            – Transparent to applications
            – Can increase I/O throughput and buffer cache capacity
        • Useful for read-mostly applications
            – Decision support and OLAP
        • Compression is performed only when Oracle considers it worthwhile
            – Depends on column length and amount of duplication
        • Compression occurs only when duplicate values are present within and across columns within a single database block
            – Compression algorithms have caused little change to the kernel code
                • Modifications only to block formatting and accessing rows and columns
            – No compression within individual column values or across blocks
            – Blocks are retrieved in compressed format into the buffer cache
    • 6. Getting Compressed
        • Building a new compressed table
            CREATE TABLE <table_name> ... COMPRESS;
            CREATE TABLE <table_name> COMPRESS AS SELECT ...
        • Altering an existing table to be compressed
            ALTER TABLE <table_name> MOVE COMPRESS;
            – No additional copy created, but temp space and an exclusive table-level lock are required for the compression activity
            ALTER TABLE <table_name> COMPRESS;
            – Future bulk inserts may be compressed – existing data is not
        • Compressing individual partitions
            ALTER TABLE <table_name> MOVE PARTITION <partition_name> COMPRESS;
            – Existing data and future bulk inserts are compressed in a specific partition
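As a rough check after either form of ALTER TABLE, the data dictionary shows whether the compression attribute is set and which indexes need attention. The sketch below assumes Oracle 10g or later (where USER_TABLES has a COMPRESSION column) and uses a hypothetical table EMP with a hypothetical index EMP_PK; neither name comes from the slides.

```sql
-- Hypothetical names: table EMP, index EMP_PK. Substitute your own.

-- Is the compression attribute set on the table?
SELECT table_name, compression
FROM   user_tables
WHERE  table_name = 'EMP';

-- ALTER TABLE ... MOVE relocates every row (rowids change),
-- so indexes on the table are left UNUSABLE afterwards.
SELECT index_name, status
FROM   user_indexes
WHERE  table_name = 'EMP';

-- Rebuild each index reported as UNUSABLE.
ALTER INDEX emp_pk REBUILD;
```

ALTER TABLE ... COMPRESS without MOVE only changes the attribute, so it leaves rowids and indexes untouched.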
    • 7. Tablespace Level Compression
        • Entire tablespaces can be compressed by default
            CREATE | ALTER TABLESPACE <tablespace_name> DEFAULT [ COMPRESS | NOCOMPRESS ] ...
            – All objects in the tablespace will be compressed by default
    • 8. Compressing Table Data
        • Uncompressed conventional emp table
            CREATE TABLE emp (empno NUMBER(4)
                             ,ename VARCHAR2(12)
                             ...);
        • Compressed emp table
            CREATE TABLE emp (empno NUMBER(4)
                             ,ename VARCHAR2(12)
                             ...) COMPRESS;
        • Could consider sorting the data on columns which lend themselves to compression
    • 9. Table Data Compression
        • Uncompressed emp table
            7369  CLERK    2000  1550  ACCOUNTING
            7782  MANAGER  4975  1600  PLANT
            7902  ANALYST  4000  2100  OPERATIONS
            7900  CLERK    2750  1500  OPERATIONS
            7934  CLERK    2200  1200  ACCOUNTING
            7654  PLANT    3000  1100  RESEARCH
        • Compressed emp table
            – Similar to index compression
            [SYMBOL TABLE] [A]=CLERK, [B]=ACCOUNTING, [C]=PLANT, [D]=OPERATIONS
            7369  [A]      2000  1550  [B]
            7782  MANAGER  4975  1600  [C]
            7902  ANALYST  4000  2100  [D]
            7900  [A]      2750  1500  [D]
            7934  [A]      2200  1200  [B]
            7654  [C]      3000  1100  RESEARCH
    • 10. Scanning Compressed Tables – Tests (1)
        • Compressing can significantly reduce disk I/O - good for queries?
            – Possible increase in CPU activity
            – May need to unravel the compression, but logical reads will be reduced

        SELECT table_name,compressed,num_rows FROM my_user_tables;

        TABLE_NAME COMPRESSED NUM_ROWS
        ---------- ---------- --------
        EMP_NC     DISABLED    1835008
        EMP_DSS    ENABLED     1835008

        SELECT COUNT(ename) FROM emp_nc

        call     count      cpu    elapsed       disk      query    current       rows
        ------- ------ -------- ---------- ---------- ---------- ---------- ----------
        Parse        1     0.00       0.00          0          0          0          0
        Execute      1     0.00       0.00          0          0          0          0
        Fetch        2     0.31       1.88      10974      10978          0          1
        ------- ------ -------- ---------- ---------- ---------- ---------- ----------
        total        4     0.31       1.88      10974      10978          0          1

        SELECT COUNT(ename) FROM emp_dss

        call     count      cpu    elapsed       disk      query    current       rows
        ------- ------ -------- ---------- ---------- ---------- ---------- ----------
        Parse        1     0.00       0.00          0          0          0          0
        Execute      1     0.00       0.00          0          0          0          0
        Fetch        2     0.56       0.71       3057       3060          0          1
        ------- ------ -------- ---------- ---------- ---------- ---------- ----------
        total        4     0.56       0.71       3057       3060          0          1
    • 11. Scanning Compressed Tables – Tests (2)
        • Force all rows to be uncompressed
            – Increases logical I/O?

        SELECT ename FROM emp_nc

        call     count      cpu    elapsed       disk      query    current       rows
        ------- ------ -------- ---------- ---------- ---------- ---------- ----------
        Parse        1     0.00       0.00          0          0          0          0
        Execute      1     0.00       0.00          0          0          0          0
        Fetch   122335     1.68       2.00      10974     132607          0    1835008
        ------- ------ -------- ---------- ---------- ---------- ---------- ----------
        total   122337     1.68       2.00      10974     132607          0    1835008

        SELECT ename FROM emp_dss

        call     count      cpu    elapsed       disk      query    current       rows
        ------- ------ -------- ---------- ---------- ---------- ---------- ----------
        Parse        1     0.00       0.00          0          0          0          0
        Execute      1     0.00       0.00          0          0          0          0
        Fetch   122335     2.17       2.24       3057     125286          0    1835008
        ------- ------ -------- ---------- ---------- ---------- ---------- ----------
        total   122337     2.17       2.24       3057     125286          0    1835008
    • 12. Space Reduction Due to Compression
        Space usage summary

        Statistic                      EMP_NC      EMP_DSS
        ----------------------- ------------- ------------
        Unformatted Blocks                  0            0
        FS1 Blocks (0-25)                   0            0
        FS2 Blocks (25-50)                  0            0
        FS3 Blocks (50-75)                  0            0
        FS4 Blocks (75-100)                 0            0
        Full Blocks                    10,974        3,057
        Total Blocks                   11,776        3,200
        Total Bytes                96,468,992   26,214,400
        Total Mbytes                       92           25
        Unused Blocks                     653           85
        Unused Bytes                5,349,376      696,320
        Last Used Ext FileId                4            4
        Last Used Ext BlockId         264,841      263,561
        Last Used Block                   371           43

        – Summary routine adapted from Tom Kyte’s example
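The figures above come from a routine that is not reproduced on the slide. A minimal sketch of how such numbers could be gathered with the DBMS_SPACE package is shown below; it is not the author's or Tom Kyte's routine. It assumes the table EMP_DSS lives in the current schema and in an ASSM tablespace (SPACE_USAGE does not work otherwise), and all variable names are illustrative.

```sql
SET SERVEROUTPUT ON
DECLARE
  -- Free-space buckets reported by SPACE_USAGE (ASSM segments only)
  l_unf_blocks NUMBER;  l_unf_bytes NUMBER;
  l_fs1_blocks NUMBER;  l_fs1_bytes NUMBER;
  l_fs2_blocks NUMBER;  l_fs2_bytes NUMBER;
  l_fs3_blocks NUMBER;  l_fs3_bytes NUMBER;
  l_fs4_blocks NUMBER;  l_fs4_bytes NUMBER;
  l_full_blocks NUMBER; l_full_bytes NUMBER;
  -- Allocation figures reported by UNUSED_SPACE
  l_total_blocks  NUMBER; l_total_bytes  NUMBER;
  l_unused_blocks NUMBER; l_unused_bytes NUMBER;
  l_file_id NUMBER; l_block_id NUMBER; l_last_block NUMBER;
BEGIN
  DBMS_SPACE.SPACE_USAGE(
    segment_owner      => USER,
    segment_name       => 'EMP_DSS',
    segment_type       => 'TABLE',
    unformatted_blocks => l_unf_blocks,  unformatted_bytes => l_unf_bytes,
    fs1_blocks         => l_fs1_blocks,  fs1_bytes         => l_fs1_bytes,
    fs2_blocks         => l_fs2_blocks,  fs2_bytes         => l_fs2_bytes,
    fs3_blocks         => l_fs3_blocks,  fs3_bytes         => l_fs3_bytes,
    fs4_blocks         => l_fs4_blocks,  fs4_bytes         => l_fs4_bytes,
    full_blocks        => l_full_blocks, full_bytes        => l_full_bytes);

  DBMS_SPACE.UNUSED_SPACE(
    segment_owner             => USER,
    segment_name              => 'EMP_DSS',
    segment_type              => 'TABLE',
    total_blocks              => l_total_blocks,
    total_bytes               => l_total_bytes,
    unused_blocks             => l_unused_blocks,
    unused_bytes              => l_unused_bytes,
    last_used_extent_file_id  => l_file_id,
    last_used_extent_block_id => l_block_id,
    last_used_block           => l_last_block);

  DBMS_OUTPUT.PUT_LINE('Full Blocks ..... ' || l_full_blocks);
  DBMS_OUTPUT.PUT_LINE('Total Blocks .... ' || l_total_blocks);
  DBMS_OUTPUT.PUT_LINE('Unused Blocks ... ' || l_unused_blocks);
END;
/
```

Running the block once per table (EMP_NC, then EMP_DSS) would give the two columns being compared.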
    • 13. Compression in Data Warehousing
    • 14. Compression in Data Warehousing
        • Fact tables are good candidates for compression
            – Large and have repetitive values
            – Repetitive data tends to be clustered
        • Dimension tables are often too small for compression
        • Large block size leads to greater compression
            – Typical in data warehouses
            – More rows available for compression within each block
        • Materialized views can be compressed (and partitioned)
            – Naturally sorted on creation due to GROUP BY
            – Especially good for ROLLUP views and join views
                • Tend to contain repetitive data
    • 15. Compression of Individual Table Partitions
        • Partition level
            – Partitioning must be range or list (or composite)

        CREATE TABLE sales
          (sales_id NUMBER(8)
           :
           :
          ,sales_date DATE)
        PARTITION BY RANGE(sales_date)
          (PARTITION sales_jan2009 VALUES LESS THAN (TO_DATE('01/02/2009','DD/MM/YYYY')) COMPRESS,
           PARTITION sales_feb2009 VALUES LESS THAN (TO_DATE('01/03/2009','DD/MM/YYYY')),
           PARTITION sales_mar2009 VALUES LESS THAN (TO_DATE('01/04/2009','DD/MM/YYYY')),
           PARTITION sales_apr2009 VALUES LESS THAN (TO_DATE('01/05/2009','DD/MM/YYYY')));

            – The first partition will be compressed
            – Could consider compressing read-only partitions of historical data
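To confirm which partitions actually carry the compression attribute, the partition dictionary view can be queried. This is a small sketch that assumes the SALES table from the slide exists in the current schema.

```sql
-- One row per partition; COMPRESSION shows ENABLED or DISABLED
SELECT partition_name, compression
FROM   user_tab_partitions
WHERE  table_name = 'SALES'
ORDER  BY partition_position;

-- Later, once a month becomes read-only history, it can be compressed in place
ALTER TABLE sales MOVE PARTITION sales_feb2009 COMPRESS;
```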
    • 16. Effect of Partition Operations
        • Consider individual partitions compressed as shown
            PARTITION p1 COMPRESS   VALUES LESS THAN 100
            PARTITION p2 COMPRESS   VALUES LESS THAN 200
            PARTITION p3 NOCOMPRESS VALUES LESS THAN 300
            PARTITION p4 NOCOMPRESS VALUES LESS THAN 400
        • Splitting a compressed partition
            ALTER TABLE s1 SPLIT PARTITION p1 AT (50)
            INTO (PARTITION p1a, PARTITION p1b);
            – Produces two new compressed partitions
            PARTITION p1a COMPRESS   VALUES LESS THAN 50
            PARTITION p1b COMPRESS   VALUES LESS THAN 100
            PARTITION p2  COMPRESS   VALUES LESS THAN 200
            PARTITION p3  NOCOMPRESS VALUES LESS THAN 300
            PARTITION p4  NOCOMPRESS VALUES LESS THAN 400
    • 17. Effect of Partition Operations (contd)
        • Effect of merging compressed partitions
            PARTITION p1a COMPRESS   VALUES LESS THAN 50
            PARTITION p1b COMPRESS   VALUES LESS THAN 100
            PARTITION p2  COMPRESS   VALUES LESS THAN 200
            PARTITION p3  NOCOMPRESS VALUES LESS THAN 300
            PARTITION p4  NOCOMPRESS VALUES LESS THAN 400
        • Merge of two compressed partitions
            ALTER TABLE s1 MERGE PARTITIONS p1b,p2 INTO PARTITION p1b_2;
            PARTITION p1a   COMPRESS   VALUES LESS THAN 50
            PARTITION p1b_2 NOCOMPRESS VALUES LESS THAN 200
            PARTITION p3    NOCOMPRESS VALUES LESS THAN 300
            PARTITION p4    NOCOMPRESS VALUES LESS THAN 400
            – New partition p1b_2 is not compressed by default
                • Same applies if any of the partitions to be merged are initially uncompressed
    • 18. Forcing Compression During Partition Maintenance
        • Force compression of the new partition(s) after a split operation
            ALTER TABLE s1 SPLIT PARTITION p1 AT (50)
            INTO (PARTITION p1a COMPRESS,PARTITION p2a);
        • Force compression of the new partition after a merge operation
            ALTER TABLE s1 MERGE PARTITIONS p2,p3 INTO PARTITION p2_3 COMPRESS;
        • Partitions may be empty or contain data during maintenance operations involving compression
    • 19. Effect of Partitioned Bitmap Indexes
        • Scenario :
            – Table having no compressed partitions has locally partitioned bitmap indexes
            – The presence of usable bitmap indexes will prevent the first operation that compresses a partition

        SQL> ALTER TABLE sales MOVE PARTITION p4 COMPRESS;
        ORA-14646: Specified alter table operation involving compression
        cannot be performed in the presence of usable bitmap indexes

        SQL> ALTER TABLE part2 SPLIT PARTITION p1a AT (25)
             INTO (PARTITION p1c COMPRESS,PARTITION p1d);
        ORA-14646: Specified alter table operation involving compression
        cannot be performed in the presence of usable bitmap indexes
    • 20. Compression of Partitions with Bitmap Indexes in Place
        • Uncompressed partitioned table with bitmap index in 3 partitions

        CREATE TABLE emp_part
        PARTITION BY RANGE (deptno)
          (PARTITION p1 VALUES LESS THAN (11),
           PARTITION p2 VALUES LESS THAN (21),
           PARTITION p3 VALUES LESS THAN (31))
        AS SELECT * FROM emp;

        CREATE BITMAP INDEX part$empno ON emp_part(empno) LOCAL;
    • 21. Compression of Partitions with Bitmap Indexes in Place (continued)
        • First compression operation requires the following
            1. Mark bitmap indexes unusable (or drop them)
                ALTER INDEX part$empno UNUSABLE;
            2. Compress the first (and any subsequent) partition as required
                ALTER TABLE emp_part MOVE PARTITION p1 COMPRESS;
            3. Rebuild the bitmap indexes (or recreate them)
                ALTER INDEX part$empno REBUILD PARTITION p1;
                ALTER INDEX part$empno REBUILD PARTITION p2;
                ALTER INDEX part$empno REBUILD PARTITION p3;
                – Each index partition must be individually rebuilt
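Because every index partition has to be rebuilt separately, it can help to check the partition status and generate the rebuild statements from the dictionary. The following is a sketch that assumes the part$empno index from the slide is owned by the current user.

```sql
-- Which index partitions are still unusable after the compression step?
SELECT index_name, partition_name, status
FROM   user_ind_partitions
WHERE  index_name = 'PART$EMPNO'
ORDER  BY partition_position;

-- Generate one REBUILD statement per unusable partition;
-- the output is meant to be reviewed and then run as a script.
SELECT 'ALTER INDEX ' || index_name ||
       ' REBUILD PARTITION ' || partition_name || ';' AS rebuild_stmt
FROM   user_ind_partitions
WHERE  index_name = 'PART$EMPNO'
AND    status = 'UNUSABLE';
```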
    • 22. Compression of Partitions with Bitmap Indexes in Place (continued)
        • Oracle needs to know the maximum number of records per block
            – Correct mapping of bits to blocks can then be done
            – On compression this value increases
        • Oracle has to rebuild bitmaps to accommodate a potentially larger number of values, even if no data is present in the partition(s)
            – Could result in larger bitmaps for uncompressed partitions
                • Increase in size can be offset by the actual compression
        • Once rebuilt, the indexes can cope with any compression
            – Subsequent compression operations do not invalidate bitmap indexes
        • Recommended to create each partitioned table with at least one compressed (dummy/empty?) partition
            – Can be subsequently dropped
        • Compression activity does not affect B-tree index usability
    • 23. Table Level Compression for Partitioned Tables
        • Compression can be the default for all partitions

        CREATE TABLE sales
          (sales_id NUMBER(8),
           :
           :
           sales_date DATE)
        COMPRESS
        PARTITION BY RANGE (sales_date) ...

            – Can still specify individual partitions to be NOCOMPRESS
        • Default partition maintenance actions on compressed tables
            – Splitting non-compressed partitions results in non-compressed partitions
            – Merging non-compressed partitions results in a compressed partition
            – Adding a partition will result in a new compressed partition
            – Moving a partition does not alter its compression
    • 24. Finding the Largest Tables
        • Useful for finding candidates for compression

        SELECT owner
              ,name
              ,SUM(gb)
              ,SUM(pct)
        FROM (SELECT owner
                    ,name
                    ,TO_CHAR(gb,'999.99') gb
                    ,TO_CHAR((RATIO_TO_REPORT(gb) OVER())*100,'999,999,999.99') pct
              FROM (SELECT owner
                          ,SUBSTR(segment_name,1,30) name
                          ,SUM(bytes/(1024*1024*1024)) gb
                    FROM dba_segments
                    WHERE segment_type IN ('TABLE','TABLE PARTITION')
                    GROUP BY owner
                            ,segment_name
                   )
             )
        WHERE pct > 3
        GROUP BY ROLLUP(owner
                       ,name)
        ORDER BY 3;
    • 25. Finding the Largest Tables (contd)

        OWNER         NAME                 SUM(GB)   SUM(PCT)
        ------------- -------------- ------------- ----------
        SH            COSTS                    .03       8.23
        SH            SALES                    .05      14.44
        SH            SALES_HIST               .13      32.93
        SH                                     .21      55.61
        SYS           IDL_UB2$                 .01       3.86
        SYS           SOURCE$                  .02       6.43
        SYS                                    .03      10.29
                                               .24      65.90
    • 26. Sampling Data to Predict Compression
    • 27. Compression Factor and Space Saving
        • Compression Factor (CF)
            CF = (non-compressed blocks) / (compressed blocks)
        • Space Savings (SS)
            SS = ((non-compressed blocks - compressed blocks) / (non-compressed blocks)) * 100
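As a worked example, take the full-block counts reported on the space-usage slide for EMP_NC (10,974) and EMP_DSS (3,057). The calculation below assumes that the compression factor is a plain ratio, as returned by the compression_ratio function on the next slide, and that space savings are expressed as a percentage of the uncompressed size; both are assumptions about the intended definitions.

```latex
\begin{align*}
CF &= \frac{10974}{3057} \approx 3.59 \\[4pt]
SS &= \frac{10974 - 3057}{10974} \times 100 \approx 72.1\% \\[4pt]
SS &= \left(1 - \frac{1}{CF}\right) \times 100
\end{align*}
```

Under these definitions the two measures carry the same information: a compression factor of about 3.6 corresponds to a saving of roughly 72% of the original blocks.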
    • 28. Predicting the Compression Factor

        CREATE OR REPLACE FUNCTION compression_ratio (tabname VARCHAR2)
        RETURN NUMBER IS
          pct     NUMBER := 0.000099;  -- sample percentage
          blkcnt  NUMBER := 0;         -- original block count (should be < 10K)
          blkcntc NUMBER;              -- compressed block count
        BEGIN
          EXECUTE IMMEDIATE 'CREATE TABLE temp_uncompressed PCTFREE 0
                             AS SELECT * FROM ' || tabname || ' WHERE ROWNUM < 1';
          WHILE ((pct < 100) AND (blkcnt < 1000)) LOOP  -- until > 1000 blocks in sample
            EXECUTE IMMEDIATE 'TRUNCATE TABLE temp_uncompressed';
            EXECUTE IMMEDIATE 'INSERT INTO temp_uncompressed SELECT * FROM ' ||
                              tabname || ' SAMPLE BLOCK (' || pct || ',10)';
            EXECUTE IMMEDIATE 'SELECT COUNT(DISTINCT(dbms_rowid.rowid_block_number(rowid)))
                               FROM temp_uncompressed' INTO blkcnt;
            pct := pct * 10;
          END LOOP;
          EXECUTE IMMEDIATE 'CREATE TABLE temp_compressed COMPRESS
                             AS SELECT * FROM temp_uncompressed';
          EXECUTE IMMEDIATE 'SELECT COUNT(DISTINCT(dbms_rowid.rowid_block_number(rowid)))
                             FROM temp_compressed' INTO blkcntc;
          EXECUTE IMMEDIATE 'DROP TABLE temp_compressed';
          EXECUTE IMMEDIATE 'DROP TABLE temp_uncompressed';
          RETURN (blkcnt/blkcntc);
        END;
        /
    • 29. Predicting the Compression Factor (continued)

        CREATE OR REPLACE PROCEDURE compress_test(p_comp VARCHAR2)
        IS
          comp_ratio NUMBER;
        BEGIN
          comp_ratio := compression_ratio(p_comp);
          dbms_output.put_line('Compression factor for table ' ||
                               p_comp || ' is ' || comp_ratio);
        END;
        /

        • Run the compression test for the emp table
            EXEC compress_test('EMP')
            Compression factor for table EMP is 1.6
    • 30. Compression Test – Clustered Data

        CREATE TABLE noclust (col1 VARCHAR2(1000)) COMPRESS;
        CREATE TABLE clust (col1 VARCHAR2(1000)) COMPRESS;

        noclust (values cycle from row to row):
            INSERT INTO noclust VALUES ('VV...VV');
            INSERT INTO noclust VALUES ('WW...WW');
            INSERT INTO noclust VALUES ('XX...XX');
            INSERT INTO noclust VALUES ('YY...YY');
            INSERT INTO noclust VALUES ('ZZ...ZZ');
            INSERT INTO noclust VALUES ('VV...VV');
            INSERT INTO noclust VALUES ('WW...WW');
            INSERT INTO noclust VALUES ('XX...XX');
            INSERT INTO noclust VALUES ('YY...YY');
            INSERT INTO noclust VALUES ('ZZ...ZZ');
            :
            INSERT INTO noclust VALUES ('VV...VV');
            :
            INSERT INTO noclust VALUES ('ZZ...ZZ');

        clust (equal values inserted together):
            INSERT INTO clust VALUES ('VV...VV');
            INSERT INTO clust VALUES ('VV...VV');
            INSERT INTO clust VALUES ('VV...VV');
            INSERT INTO clust VALUES ('VV...VV');
            INSERT INTO clust VALUES ('VV...VV');
            INSERT INTO clust VALUES ('WW...WW');
            :
            INSERT INTO clust VALUES ('WW...WW');
            INSERT INTO clust VALUES ('XX...XX');
            :
            INSERT INTO clust VALUES ('YY...YY');
            :
            INSERT INTO clust VALUES ('ZZ...ZZ');

        • Every value for column col1 is 390 bytes long
        • Both tables have a total of 25 rows stored in blocks of size 2K
            – So a maximum of four uncompressed rows will fit in each block
        • Both have the same amount of repeated values but the clustering is different
    • 31. Compression Test (continued)
        • noclust - 4 rows per block (7 blocks in total)
            – The 5th row to be inserted must go in the next block as it contains different data
        • clust - 20 rows per block
            – Rows 2,3,4,5 are duplicates of the first row in the block
            – Rows 7,8,9,10 are duplicates of the 6th row in the block, and this pattern is repeated
            – The residual space in the first block is used by the compressed data
        [Figure: block diagrams for noclust and clust, each block drawn with a header and its row values (vv…vv, ww…ww, xx…xx, yy…yy, zz…zz)]
    • 32. Compression Test - Compression Factors
        • Compression test routine is accurate due to sampling of actual data
            – Make sure default tablespace is correctly set
                • Temporary sample tables are physically built for the testing

        EXEC compress_test('CLUST')
        Compression factor for table CLUST is 3.5

        EXEC compress_test('NOCLUST')
        Compression factor for table NOCLUST is 1
    • 33. Testing Compression : Sampling Rows
        • Tables can be sampled at row or block level
            – Block level samples a random selection of whole blocks
            – Row level (default) samples a random selection of rows
            SELECT * FROM emp SAMPLE (10);
                • Selects a 10% sample of rows
                • If repeated, a different sample will be taken
        • Samples can be fixed in Oracle10g using SEED
            – SEED can have integer values from 0 to 100
            – Can also have higher numbers ending in 00
            SELECT * FROM emp SAMPLE (10) SEED (1);
            – Shows a 10% sample of rows
            – If repeated, the exact same sample will be taken
                • Also applies to block level sampling
                • The sample set will change if DML is performed on the table
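Since compression works within a block, block-level sampling is the form used by the prediction function earlier in the deck. The sketch below combines SAMPLE BLOCK with SEED so that repeated runs see the same blocks; it assumes an emp table with a job column, and the 10% figure is arbitrary.

```sql
-- Repeatable 10% sample of whole blocks
SELECT COUNT(*)            AS sampled_rows
      ,COUNT(DISTINCT job) AS distinct_jobs
FROM   emp SAMPLE BLOCK (10) SEED (1);

-- The same sample can feed a scratch table for a compression trial
-- (emp_sample is a hypothetical name)
CREATE TABLE emp_sample COMPRESS AS
SELECT * FROM emp SAMPLE BLOCK (10) SEED (1);
```

A block sample preserves the way rows are physically clustered, which a row sample does not, so it is likely to give the more realistic estimate.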
    • 34. Pre-Sorting the Data for Compression
    • 35. Sorting the Data for Compression
        • Reorganize (pre-sort) rows in segments that will be compressed to cause repetitive data within blocks
        • For multi-column tables, order the rows by the low cardinality column(s)
            CREATE TABLE emp_comp COMPRESS
            AS SELECT * FROM emp
            ORDER BY <some unselective column(s)>;
            – For a single-column table, order the table rows by the column value
        • Presort the data on a column which has :
            no. of distinct values ~ no. of blocks (after compression)
        • Information on column cardinality is shown in:
            ALL_TAB_COL_STATISTICS
            ALL_PART_COL_STATISTICS
            ALL_SUBPART_COL_STATISTICS
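One way to apply the rule of thumb is to list each column's number of distinct values next to the table's block count. This sketch uses the USER_ variants of the views named above and assumes a table called LARGE_EMP in the current schema. Note that BLOCKS here is the current block count of the table, so it only approximates the "blocks after compression" figure the rule refers to.

```sql
-- Refresh optimizer statistics so NUM_DISTINCT and BLOCKS are current
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'LARGE_EMP')

-- Candidate sort columns: look for NUM_DISTINCT close to the block count
SELECT c.column_name
      ,c.num_distinct
      ,t.blocks
FROM   user_tab_col_statistics c
JOIN   user_tables t ON t.table_name = c.table_name
WHERE  c.table_name = 'LARGE_EMP'
ORDER  BY c.num_distinct;
```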
    • 36. Sorting the Data for Compression (continued)
        • Presort data on column having no. of distinct values ~ no. of blocks

        SELECT * FROM large_emp;     (114368 rows)

        EMPNO   ENAME        JOB
        ------- ------------ ----------
        43275   25*****      CLERK
        47422   128****      ANALYST
        79366   6******      MANAGER
        :       :            :

        SELECT COUNT(DISTINCT job) FROM large_emp;      5 jobs
        SELECT COUNT(DISTINCT ename) FROM large_emp;    170 enames
    • 37. Sorting the Data for Compression (continued)

        CREATE TABLE nocomp
        AS SELECT empno,ename,job FROM large_emp;
            Non-compressed table : Number of used blocks = 360

        CREATE TABLE cjob COMPRESS
        AS SELECT empno,ename,job FROM large_emp ORDER BY job;
            Compressed table sorted on job : Number of used blocks = 243

        CREATE TABLE cename COMPRESS
        AS SELECT empno,ename,job FROM large_emp ORDER BY ename;
            Compressed table sorted on ename : Number of used blocks = 172

        • Sorting on the job column is not the most effective
    • 38. Behaviour of DML/DDL on Tables with Default (Direct Load) Compression
    • 39. Default Compressed Table Behaviour
        • Ordinary DML produces UNcompressed data
            – UPDATE
                • Wholesale updates lead to large increases in storage (>250%)
                • Performance impact on UPDATEs can be around 400%
                • Rows are migrated to new blocks (default value of PCTFREE is 0)
            – DELETE
                • Performance impact of around 15% for compressed rows
        • Creating a compressed table can take 50% longer
    • 40. Default Compressed Table Behaviour (continued)
        • Operations which perform compression
            – CREATE TABLE ... AS SELECT ...
            – ALTER TABLE ... MOVE ...
            – INSERT /*+APPEND*/ (single threaded)
            – INSERT /*+PARALLEL(sales,4)*/
                • Requires ALTER SESSION ENABLE PARALLEL DML;
                • Both of the above inserts work with data from database tables and external tables
            – SQL*Loader DIRECT = TRUE
            – Various partition maintenance operations
        • Cannot be used for:
            – Certain LOB and VARRAY constructs (see 11g docs)
            – Index organized tables
                • Can use index compression on IOTs
            – External tables, index clusters, hash clusters
            – Tables with more than 255 columns
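The two insert forms in the list could look like the sketch below. The tables sales_hist (created with COMPRESS) and sales_staging are hypothetical names used only for illustration; the degree of parallelism of 4 follows the hint shown on the slide.

```sql
-- Serial direct-path insert: rows are written above the high-water mark
-- and compressed as the blocks are formatted
INSERT /*+ APPEND */ INTO sales_hist
SELECT * FROM sales_staging;
COMMIT;  -- the table cannot be queried in this session until the commit

-- Parallel direct-path insert
ALTER SESSION ENABLE PARALLEL DML;
INSERT /*+ PARALLEL(sales_hist,4) */ INTO sales_hist
SELECT * FROM sales_staging;
COMMIT;
```

A conventional INSERT ... VALUES or an INSERT without the hint would go through the buffer cache and leave the new rows uncompressed under default (direct load) compression.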
    • 41. Compression Internals
    • 42. Hexadecimal Dump of Compressed Data
        [Figure: hexadecimal block dump, annotated with the symbol table (14 unique names, 5 unique jobs) and the start of the next block]
    • 43. Oracle Dump - Two Uncompressed Employee Rows

        ALTER SYSTEM DUMP DATAFILE 8 BLOCK 35;
        • Creates a trace file in USER_DUMP_DEST (10g) or the trace directory (11g)

        ...
        block_row_dump:
        tab 0, row 0, @0x4a1
        tl: 41 fb: --H-FL-- lb: 0x2 cc: 8
        col 0: [ 3] c2 03 38
        col 1: [ 5] 43 4c 41 52 4b
        col 2: [ 7] 4d 41 4e 41 47 45 52
        col 3: [ 3] c2 4f 28
        col 4: [ 7] 77 b5 06 09 01 01 01
        col 5: [ 3] c2 19 33
        col 6: *NULL*
        col 7: [ 2] c1 0b
        tab 0, row 1, @0x4ca
        tl: 40 fb: --H-FL-- lb: 0x2 cc: 8
        col 0: [ 3] c2 03 39
        col 1: [ 5] 53 43 4f 54 54
        col 2: [ 7] 41 4e 41 4c 59 53 54
        col 3: [ 3] c2 4c 43
        col 4: [ 7] 77 bb 04 13 01 01 01
        col 5: [ 2] c2 1f
        col 6: *NULL*
        col 7: [ 2] c1 15
        ...
    • 44. Oracle Dump of Table of empno,ename,job

        tl: 18 fb: --H-FL-- lb: 0x0 cc: 2
        col 0: [ 9] 50 52 45 53 49 44 45 4e 54                     <- PRESIDENT
        col 1: [ 4] 4b 49 4e 47                                    <- KING
        bindmp: 00 0b 02 d1 50 52 45 53 49 44 45 4e 54 cc 4b 49 4e 47
        tab 0, row 1, @0x746
        tl: 9 fb: --H-FL-- lb: 0x0 cc: 2
        col 0: [ 7] 41 4e 41 4c 59 53 54
        col 1: [ 4] 46 4f 52 44                                    <- FORD
        bindmp: 00 0a 02 11 cc 46 4f 52 44
        .....
        tab 0, row 13, @0x6cc
        tl: 10 fb: --H-FL-- lb: 0x0 cc: 2
        col 0: [ 5] 43 4c 45 52 4b
        col 1: [ 5] 53 4d 49 54 48                                 <- SMITH
        bindmp: 00 0b 02 0e cd 53 4d 49 54 48
        tab 0, row 14, @0x780
        tl: 8 fb: --H-FL-- lb: 0x0 cc: 1
        col 0: [ 5] 43 4c 45 52 4b                                 <- CLERK
        bindmp: 00 04 cd 43 4c 45 52 4b
        .....
        tab 0, row 17, @0x761
        tl: 10 fb: --H-FL-- lb: 0x0 cc: 1
        col 0: [ 7] 41 4e 41 4c 59 53 54                           <- ANALYST
        bindmp: 00 02 cf 41 4e 41 4c 59 53 54
        tab 1, row 0, @0x6c3
        tl: 9 fb: --H-FL-- lb: 0x0 cc: 3
        col 0: [ 5] 43 4c 45 52 4b
        col 1: [ 5] 53 4d 49 54 48
        col 2: [ 3] c2 02 34
        bindmp: 2c 00 02 02 0d cb c2 02 34
    • 45. Oracle Avoids Unnecessary Compression
        • Create two tables with repeating small values in one column

        CREATE TABLE tnocomp (
           col1 VARCHAR2(2)
          ,col2 VARCHAR2(6))
        PCTFREE 0;

        CREATE TABLE tcomp (
           col1 VARCHAR2(2)
          ,col2 VARCHAR2(6))
        COMPRESS;

        • Insert data (320 rows) as follows

        COL1 COL2
        ---- ------
        1A   1ZZZZZ
        2A   2ZZZZZ
        3A   3ZZZZZ
        4A   4ZZZZZ
        5A   5ZZZZZ
        1A   6ZZZZZ
        2A   7ZZZZZ
        ...
        4A   319ZZZ
        5A   320ZZZ

            – Values unique in col2
            – Values repeat in col1 every 5 rows
    • 46. Evidence of Minimal Compression

        SELECT dbms_rowid.rowid_block_number(ROWID) block
              ,dbms_rowid.rowid_relative_fno(ROWID) file
              ,COUNT(*) num_rows
        FROM &table_name
        GROUP BY dbms_rowid.rowid_block_number(ROWID)
                ,dbms_rowid.rowid_relative_fno(ROWID);

        tnocomp                      tcomp
        BLOCK FILE NUM_ROWS          BLOCK FILE NUM_ROWS
        ----- ---- --------          ----- ---- --------
           34    8      126             66    8      128
           35    8      126             67    8      132
           36    8       68             68    8       60

        • Evidence of compression in the compressed table
    • 47. Further Evidence of Compression

        ALTER SYSTEM DUMP DATAFILE 8 BLOCK 67;

        • Symbol table (tab 0) holds 1A, 2A, 3A, 4A, 5A; HEX(A) = 41

        block_row_dump:
        tab 0, row 0, @0x783
        tl: 5 fb: --H-FL-- lb: 0x0 cc: 1
        col 0: [ 2] 31 41
        bindmp: 00 1b ca 31 41
        tab 0, row 1, @0x77e
        tl: 5 fb: --H-FL-- lb: 0x0 cc: 1
        col 0: [ 2] 32 41
        bindmp: 00 1b ca 32 41
        tab 0, row 2, @0x779
        tl: 5 fb: --H-FL-- lb: 0x0 cc: 1
        col 0: [ 2] 33 41
        bindmp: 00 1b ca 33 41
        tab 0, row 3, @0x774
        tl: 5 fb: --H-FL-- lb: 0x0 cc: 1
        col 0: [ 2] 34 41
        bindmp: 00 1a ca 34 41
        tab 0, row 4, @0x76f
        tl: 5 fb: --H-FL-- lb: 0x0 cc: 1
        col 0: [ 2] 35 41
        bindmp: 00 19 ca 35 41
        tab 1, row 0, @0x763
        tl: 12 fb: --H-FL-- lb: 0x0 cc: 2
        col 0: [ 2] 31 41
        col 1: [ 6] 31 30 31 30 30 30
        bindmp: 2c 00 02 02 00 c9 31 30 31 30 30 30
        tab 1, row 1, @0x757
        tl: 12 fb: --H-FL-- lb: 0x0 cc: 2
        col 0: [ 2] 32 41
        col 1: [ 6] 31 30 32 30 30 30
        bindmp: 2c 00 02 02 01 c9 31 30 32 30 30 30
    • 48. Compression not Performed on Unsuitable Data
        • Both tables recreated with values in col1 now set to TO_CHAR(MOD(ROWNUM,50))
            – Much less repetition of values (only every 50 rows), allowing less compression

        COL1 COL2
        ---- ------
        1    1ZZZZZ
        2    2ZZZZZ
        3    3ZZZZZ
        4    4ZZZZZ
        5    5ZZZZZ
        ...
        49   49ZZZZ
        50   50ZZZZ
        1    51ZZZZ
        2    52ZZZZ
        3    53ZZZZ

        tnocomp                      tcomp
        BLOCK FILE NUM_ROWS          BLOCK FILE NUM_ROWS
        ----- ---- --------          ----- ---- --------
           34    8      128             66    8      128
           35    8      128             67    8      128
           36    8       64             68    8       64

        • Oracle decides not to compress (if the compression factor is likely to be less than 1.03)
    • 49. Comparison of Heap and IOT Compression
        IOT = Index Organized Table
    • 50. Comparison of IOT and Heap Tables
        • Tests constructed using a standard set of data in emptest
            – Six columns with absence of nulls

        EMPNO   ENAME      JOB       HIREDATE  SAL    DEPTNO
        ------- ---------- --------- --------- ------ ------
              1 KING       PRESIDENT 17-NOV-81   5000     10
              2 FORD       ANALYST   03-DEC-81   3000     20
              3 SCOTT      ANALYST   09-DEC-82   3000     20
              4 JONES      MANAGER   02-APR-81   2975     20
              5 BLAKE      MANAGER   01-MAY-81   2850     30
              6 CLARK      MANAGER   09-JUN-81   2450     10
              7 ALLEN      SALESMAN  20-FEB-81   1600     30
              8 TURNER     SALESMAN  08-SEP-81   1500     30
              9 MILLER     CLERK     23-JAN-82   1300     10
             10 WARD       SALESMAN  22-FEB-81   1250     30
             11 MARTIN     SALESMAN  28-SEP-81   1250     30
             12 ADAMS      CLERK     12-JAN-83   1100     20
             13 JAMES      CLERK     03-DEC-81    950     30
             14 SMITH      CLERK     17-DEC-80    800     20
             15 KING       PRESIDENT 17-NOV-81   5000     10
             16 FORD       ANALYST   03-DEC-81   3000     20
            ... ...        ...       ...          ...    ...
        2000000 SMITH      CLERK     17-DEC-80    800     20
    • 51. Creation of IOTs
        • empi : Conventional IOT based on emptest data

        CREATE TABLE empi (empno,ename,job,hiredate,sal,deptno,
          CONSTRAINT pk_empi
          PRIMARY KEY(ename,job,hiredate,sal,deptno,empno))
        ORGANIZATION INDEX
        PCTFREE 0
        AS SELECT * FROM emptest

        • empic : First five columns compressed

        CREATE TABLE empic (empno,ename,job,hiredate,sal,deptno,
          CONSTRAINT pk_empic
          PRIMARY KEY(ename,job,hiredate,sal,deptno,empno))
        ORGANIZATION INDEX
        PCTFREE 0
        COMPRESS 5
        AS SELECT * FROM emptest
    • 52. Test Tables
        • Four tables built having heap/IOT structures and compressed/noncompressed data

        Table   Table   Compress  Blocks  Average
        name    type                      row length
        ------  ------  --------  ------  ----------
        emph    Heap    No          9669          33
        emphc   Heap    Yes         3288          33
        empi    IOT     No         10240          33
        empic   IOT     Yes         2560          33

        • Average row length obtained from user_tables (avg_row_len)
            – Compressed tables show no reduction in average row length
    • 53. Compression Data
      Number of blocks in heap tables obtained using dbms_rowid
        SELECT COUNT(DISTINCT sys.dbms_rowid.rowid_block_number(ROWID)) BLOCKS
        FROM &table_name;
      Number of blocks in IOTs obtained from index validation
        VALIDATE INDEX &index_name;
        SELECT * FROM index_stats;
      Compressed IOTs have compression shown as DISABLED in user_tables, but ENABLED in user_indexes
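As a hedged illustration of the index validation step, the sketch below uses the documented ANALYZE INDEX ... VALIDATE STRUCTURE form and picks out a few index_stats columns instead of SELECT *. It assumes the primary key index of the compressed IOT is named pk_empic, matching the constraint name used when the table was created.

```sql
-- Populate index_stats for the IOT's primary key index.
-- index_stats holds a single row describing the index most recently
-- validated in the current session.
ANALYZE INDEX pk_empic VALIDATE STRUCTURE;

-- Block counts plus Oracle's own estimate of the best prefix length.
SELECT name,
       blocks,            -- blocks allocated to the index segment
       lf_blks,           -- leaf blocks
       br_blks,           -- branch blocks
       opt_cmpr_count,    -- prefix length Oracle considers optimal
       opt_cmpr_pctsave   -- estimated percentage saving at that prefix length
FROM   index_stats;

-- Compression status as recorded for the index.
SELECT index_name, compression, prefix_length
FROM   user_indexes
WHERE  index_name = 'PK_EMPIC';
```

Note that validating an index structure in this way takes a lock that blocks DML on the table while it runs, so it is best kept to test systems or quiet periods.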
    • 54. Timings to Scan Tables (1)
        SELECT deptno FROM <table_name>;
        Table   Table   Compress   CPU    Elapsed   DISK I/O   Query
        Name    Type               Time   Time
        emph    Heap    No         2.40   2.34          9670   142386
        emphc   Heap    Yes        2.95   3.13          3289   136316
        empi    IOT     No         2.59   3.85          9507   142531
        empic   IOT     Yes        2.60   2.63          2398   135590
      Repeat queries on empi and empic have 0 physical reads
      – Could be suffering from cold buffer flooding
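Because repeat queries can be satisfied entirely from the buffer cache, scan timings are only comparable when each run starts from the same cache state. The following SQL*Plus sketch shows one way to take a cold-cache measurement. It assumes a test system where flushing the cache is acceptable, a user with the privileges needed for AUTOTRACE, and uses emphc purely as an example table.

```sql
-- SQL*Plus settings: report elapsed time and execution statistics
-- without printing the two million result rows.
SET TIMING ON
SET AUTOTRACE TRACEONLY STATISTICS

-- Empty the buffer cache so the scan has to read from disk.
-- Only do this on a test system.
ALTER SYSTEM FLUSH BUFFER_CACHE;

-- Cold run: expect physical reads close to the table's block count.
SELECT deptno FROM emphc;

-- Warm run: repeat without flushing to see the effect of caching.
SELECT deptno FROM emphc;
```

The "physical reads" and "consistent gets" lines in the AUTOTRACE output are the likely counterparts of the DISK I/O and Query columns reported on the slide.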
    • 55. Updates to Heap and IOT Tables
        UPDATE <table_name> SET ename = 'XXXXXXX';
      – Lengthens each employee name by at least one character
        Table   Table   Compress   Blocks   Blocks   PCT        CPU      Elapsed
        Name    type               before   after    Increase   time     time
                                   update   update
        emph    Heap    No           9669    11020    13%        61.53   148.70
        emphc   Heap    Yes          3288    20010   509%       112.54   224.13
        empi    IOT     No          10240    19864    94%       155.06   275.73
        empic   IOT     Yes          2560     4792    87%        52.45   208.82
      Results for same update on Oracle9i with 230,000 row table
        Table   Table   Compress   Blocks   Blocks   PCT        Elapsed
        Name    type               before   after    Increase   time
                                   update   update
        emph    Heap    No           1092     1280    17%        5mins
        emphc   Heap    Yes           361     2291   535%       15mins
        empi    IOT     No           1077     2218   106%        4mins
        empic   IOT     Yes           261      527   101%       12mins
      Note the 'explosion' in size of the compressed heap table
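The percentage increases for the 2,000,000-row tables can be recomputed from the block counts before and after the update. The sketch below is illustrative arithmetic only: the block counts are taken from the slide, while the inline view and column aliases are assumptions made for the example.

```sql
-- Growth in blocks caused by the update, as a percentage of the original size.
SELECT table_name,
       blocks_before,
       blocks_after,
       ROUND(100 * (blocks_after - blocks_before) / blocks_before) AS pct_increase
FROM  (SELECT 'emph'  AS table_name,  9669 AS blocks_before, 11020 AS blocks_after FROM dual
       UNION ALL
       SELECT 'emphc',  3288, 20010 FROM dual
       UNION ALL
       SELECT 'empi',  10240, 19864 FROM dual
       UNION ALL
       SELECT 'empic',  2560,  4792 FROM dual);
```

The computed values should agree with the slide to within a percentage point; small differences come down to whether the original figures were rounded or truncated. The compressed heap table grows to roughly six times its original size, which is the "explosion" the slide refers to.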
    • 56. Advanced Compression in Oracle11g (for OLTP Operations)
    • 57. Advanced Compression in Oracle 11g
      Conventional DML maintains the compression
      – Inserted and updated rows remain compressed
      The compress activity is kept at a minimum
      Known as the Advanced Compression option (extra cost)
      Syntax : COMPRESS [FOR OLTP]|[BASIC]
      Compression settings tracked in user_tables
      – PCTFREE is 10 by default
        SELECT table_name, compression, compress_for, pct_free
        FROM user_tables;
        TABLE_NAME      COMPRESSION COMPRESS_FOR PCT_FREE
        --------------- ----------- ------------ --------
        EMP_TEST        DISABLED                       10
        EMP_DSS_COMP    ENABLED     BASIC               0
        EMP_OLTP_COMP   ENABLED     OLTP               10
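The three rows in the user_tables output could have come from statements along the following lines. This is a sketch under assumptions: the slide does not show how the tables were created, so the CREATE TABLE ... AS SELECT form is a guess, and the BASIC and OLTP keywords assume an 11g Release 2 database.

```sql
-- Basic (direct-path) compression: PCTFREE defaults to 0.
CREATE TABLE emp_dss_comp
COMPRESS BASIC
AS SELECT * FROM emp_test;

-- OLTP compression (needs the Advanced Compression option): PCTFREE stays at the default of 10.
CREATE TABLE emp_oltp_comp
COMPRESS FOR OLTP
AS SELECT * FROM emp_test;

-- Confirm what the data dictionary recorded.
SELECT table_name, compression, compress_for, pct_free
FROM   user_tables
WHERE  table_name IN ('EMP_TEST', 'EMP_DSS_COMP', 'EMP_OLTP_COMP');
```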
    • 58. Compressing for OLTP Operations
      Requires the COMPATIBLE initialization parameter to be set to 11.1.0 (or higher)
        CREATE TABLE t1 ... COMPRESS FOR ALL OPERATIONS;
      Conventionally inserted rows stay uncompressed until PCTFREE is reached
      – Minimises compression operations
      [Diagram: one block (header, rows, free space) shown at five stages as transaction activity proceeds]
        1. Conventional inserts are not compressed
        2. Block becomes full
        3. Rows are now compressed
        4. More uncompressed inserts fill the block
        5. Rows are compressed again
    • 59. OLTP Compression Behaviour
      Table of employee data
      Rows repeat every 7th row – lots of duplication
      8K block filled up to PCTFREE with 166 rows
      – Rows are uncompressed
      – Block dump reports 826 bytes of free space
      Additional 10 rows inserted into table
      – Block dump reports 5636 bytes of free space
      – Shows evidence of compression of 166 rows
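A block dump of the kind referred to above can be produced by locating the block through a ROWID and then asking the instance to dump it. The sketch below is a hypothetical walk-through: the table name emp_oltp_comp and the file and block numbers 8 and 34 are placeholders, not values from the test on the slide.

```sql
-- Find the file and block that hold one of the rows of interest.
SELECT dbms_rowid.rowid_relative_fno(ROWID) AS file_no,
       dbms_rowid.rowid_block_number(ROWID) AS block_no
FROM   emp_oltp_comp
WHERE  ROWNUM = 1;

-- Dump that block to a trace file (requires the ALTER SYSTEM privilege).
-- Substitute the values returned by the query above; DUMP DATAFILE expects
-- the absolute file number, which usually matches the relative one in
-- small databases.
ALTER SYSTEM DUMP DATAFILE 8 BLOCK 34;
```

The dump is written to a trace file in the session's diagnostic trace directory. Taking one dump before and one after the extra inserts allows the reported free space to be compared.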
    • 60. OLTP Compression Some early results Table containing 3million parts records No Compression Compression compression for direct path for all operations operations 35M 12M 16M — Avoid large scale (batch) updates on OLTP compressed tables • Significant processing overheads 60 Carl Dudley – University of Wolverhampton, UK
    • 61. OLTP Compression Test Table containing 500000 sales records PROD_ID CUST_ID TIME_ID CHANNEL_ID PROMO_ID AMOUNT_SOLD ACCOUNT_TYPE ------- ------- --------- ---------- -------- ----------- ------------- 13 5590 10-JAN-04 3 999 1232.16 Minor account 19 6277 23-FEB-05 1 123 7690.00 Minor account 16 6859 04-NOV-05 3 999 66.16 Minor account : : : : : : : Three tables created CREATE TABLE s_non_c PCTFREE 0 AS SELECT * FROM sales; — Non-compressed table with fully packed blocks CREATE TABLE s_c COMPRESS AS SELECT * FROM sales; — Compressed table for DSS operations CREATE TABLE s_c_all PCTFREE 0 COMPRESS FOR OLTP AS SELECT * FROM sales; — Compressed table for OLTP operations 61 Carl Dudley – University of Wolverhampton, UK
    • 62. OLTP Compression Test (continued)
      Update stress test – update 94% of the rows
      – No lengthening of any data
        UPDATE sales SET account_type = 'Major account' WHERE prod_id > 13;
                                     Non-compressed   Compressed table   Compressed table
                                     table            for DSS            for OLTP
        Original size (blocks)       3043             848                848
        Size after update (blocks)   3043             5475               5570
        Elapsed time for update      18 secs          40 secs            1:32:03 secs
      Test somewhat unfair on OLTP compression
      – Update is large and batch orientated
      – I/O subsystem was single disk
      – But still interesting?
    • 63. OLTP Compression Characteristics
      Tests performed on 500,000 row employee table with data repeating every 14th row
      Transactions update 1500, 15000 and 50,000 rows with no increase in length of any data
      – Update performance of OLTP compression gets worse as we move more into a batch environment
        Type of   Time to   Initial    Time to   Update   Space      Update   Space      Update    Space
        Table     create    Space      Scan      1500     after      15000    after      50000     after
                  (secs)    (blocks)             rows     update     rows     update     rows      update
                                                          (blocks)            (blocks)             (blocks)
        Regular   11.45     3712       1.29      0.07     3712       2.09     3712       3.30      3712
        DSS       5.26      768        0.35      1.38     802        4.91     976        9.41      1280
        OLTP      4.23      896        0.40      1.42     980        10.04    1024       1m18.50   1938
    • 64. Test for Conventional Inserts
      Two empty tables created
      – Uncompressed (emp_ins)
      – Compressed for all operations (emp_ins_c_all)
      1,900,000 employee rows inserted via conventional single-row inserts
      – Elapsed time to insert into emp_ins : 3m 54s
      – Elapsed time to insert into emp_ins_c_all : 4m 33s
      Further tests show a 40% overhead
    • 65. Introducing OLTP Compression
      Changing a non-compressed table to OLTP compressed
        ALTER TABLE t1 COMPRESS FOR OLTP;
      – Will not compress existing rows
      – Compresses new rows, including those inserted into partially full blocks below the high water mark
      Compression advisor for OLTP compression (test kit) can be obtained from
      – http://www.oracle.com/technology/products/database/compression/download.htm
      – dbmscomp.sql : package header build
      – prvtcomp.plb : package body build
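Since ALTER TABLE ... COMPRESS FOR OLTP leaves existing rows as they are, compressing the rows already in the table needs a rebuild of the segment. One common approach is sketched below as an assumption rather than as the presenter's method: it moves the table, and the index name t1_pk is hypothetical.

```sql
-- Rebuild the segment so that existing rows are compressed as well.
-- The move changes ROWIDs, so indexes on the table are left UNUSABLE.
ALTER TABLE t1 MOVE COMPRESS FOR OLTP;

-- Find the indexes that need attention.
SELECT index_name, status
FROM   user_indexes
WHERE  table_name = 'T1';

-- Rebuild each affected index (t1_pk is a hypothetical name).
ALTER INDEX t1_pk REBUILD;
```

A plain move is an offline operation for DML, so this sketch fits a maintenance window better than a busy OLTP period.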
    • 66. Shrinking Unused Space in Oracle10g
    • 67. Reclaiming Space in Oracle10g
      Unused space below the High Water Mark can be reclaimed online
        ALTER TABLE <table_name> SHRINK SPACE;
      – Space caused by delete operations can be returned to free space
      – Object must be in a tablespace with ASSM
      Two-step operation
        1. Rows are moved to blocks available for insert
           • Requires row movement to be enabled
             ALTER TABLE <table_name> ENABLE ROW MOVEMENT;
        2. High water mark (HWM) is repositioned, requiring a table-level lock
      Newly freed blocks are not returned to free space if COMPACT is used
      – HWM is not repositioned
        ALTER TABLE <table_name> SHRINK SPACE COMPACT;
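The two-step behaviour can be exercised explicitly by running the COMPACT phase first and repositioning the high water mark later. The sketch below assumes a hypothetical table named emp_test that has had many rows deleted and that lives in an ASSM tablespace.

```sql
-- Segment size before the shrink.
SELECT blocks FROM user_segments WHERE segment_name = 'EMP_TEST';

-- Shrinking moves rows, so row movement must be enabled first.
ALTER TABLE emp_test ENABLE ROW MOVEMENT;

-- Step 1 only: pack the rows into fewer blocks, leave the HWM where it is.
ALTER TABLE emp_test SHRINK SPACE COMPACT;

-- Step 2: reposition the HWM and release the freed blocks.
-- This is the part that needs the brief table-level lock.
ALTER TABLE emp_test SHRINK SPACE;

-- Segment size after the shrink.
SELECT blocks FROM user_segments WHERE segment_name = 'EMP_TEST';
```

Splitting the operation this way lets the bulk of the row movement happen during normal activity, with the short locking step deferred to a quieter moment.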
    • 68. Repositioning of HWM During Shrinks
      [Diagram: HWM position before and after the shrink; labels: allocated block above the high water mark, free blocks]
      Rows are physically moved to blocks that have free space
      – Shrinking causes ROWIDs to change
      – Indexes (bitmap and btree) are updated accordingly
      Shrinking does not work with compressed tables
    • 69. Shrinking Objects
      Why Shrink?
      – To reclaim space
      – To increase speed of full table scans
      – To allow access to table during the necessary reorganisation
      – The shrink operation can be terminated/interrupted at any time
        • Can be continued at a later time from point of termination
      Objects that can be SHRINKed
      – Tables
      – Indexes
      – Materialized views
      – Materialized view logs
      – Dependent objects may be shrunk when a table shrinks
        ALTER TABLE <table_name> SHRINK SPACE [CASCADE];
    • 70. Compression Outside of the Database
      RMAN backups and LOBs (SecureFiles) can be compressed at three levels: LOW, MEDIUM and HIGH
      Data Pump exports can be compressed
      – Inline operation with the actual imp/exp job
      Log Transport services in Data Guard can compress redo data to reduce network traffic
      For further information, see
      http://www.oracle.com/technology/products/database/oracle11g/pdf/advanced-compression-whitepaper.pdf
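For the SecureFiles case, the compression level is declared in the LOB storage clause. The sketch below is illustrative only: the table docs and its columns are hypothetical, and it assumes an 11g database with the Advanced Compression option and an ASSM tablespace, both of which SecureFiles compression depends on.

```sql
-- A hypothetical document table whose CLOB is stored as a compressed SecureFile.
CREATE TABLE docs (
  id   NUMBER PRIMARY KEY,
  doc  CLOB
)
LOB (doc) STORE AS SECUREFILE (COMPRESS HIGH);

-- The level can be changed later for the same LOB column.
ALTER TABLE docs MODIFY LOB (doc) (COMPRESS MEDIUM);
```

Higher levels trade more CPU for a smaller footprint, so MEDIUM is a reasonable starting point unless storage is the dominant concern.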
    • 71. Data Compression in Oracle
      Carl Dudley
      University of Wolverhampton, UK
      UKOUG SIG Director
      carl.dudley@wlv.ac.uk
