Maaz Anjum - IOUG Collaborate 2013 - An Insight into Space Realization on ODA and Exadata


  • ----- Meeting Notes (4/10/13 01:06) -----
    Stop. Ask the group the following question:
    Have you implemented or experimented with AC?
    If yes, keep an eye out for who said yes.
    ----- Meeting Notes (4/10/13 20:20) -----
    The format I'd like to follow today is to go over some of the basic concepts in Advanced Compression and how I leveraged them to compress the client's data.
  • Data compression algorithms have been around for decades, but only today are they being put to use within mainstream information systems processing. All of the industrial-strength databases offer some form of data compression (Oracle, DB2, CA-IDMS), while it is unknown within simpler data engines such as Microsoft Access and SQL Server.
    There are several places where data can be compressed: either external to the database, or internally, within the DBMS software.
    Physical database compression
    Hardware-assisted compression - IMS, the first commercially available database, offered Hardware-Assisted Data Compression (HDC), which interfaces with the 3380 DASD to compress IMS blocks at the hardware level, completely transparently to the database engine.
    Block/page-level compression
    Historical database compression uses external mechanisms that are invisible to the database. As blocks are written from the database, user exits invoke compression routines to store the compressed blocks on disk.
    Logical database compression
    Table/Segment, Row level
  • 9i offered basic compression, which was undone if a row was updated.
    10g offered compression with direct loads, which worked with direct-path operations such as INSERT /*+ APPEND */.
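As a minimal sketch of this pre-11g behavior (table names are illustrative, not from the deck), basic compression only takes effect for direct-path operations:

```sql
-- Basic (pre-11g) table compression: only direct-path operations
-- write compressed blocks; conventional DML leaves rows uncompressed.
CREATE TABLE sales_hist COMPRESS AS
  SELECT * FROM sales WHERE 1 = 0;

-- Direct-path load via the APPEND hint stores the rows compressed.
INSERT /*+ APPEND */ INTO sales_hist
  SELECT * FROM sales;
COMMIT;
```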
  • ----- Meeting Notes (4/10/13 20:20) -----
    Data Pump: with 10g, only metadata could be compressed.
    With 11g, there is a compression algorithm that compresses the backup file as it is written.
    How many of you have used DBFS?
  • Basic compression is inherited from 10g. Updated rows are uncompressed and would need to be recompressed at a later time, which requires downtime, an outage, etc. This was a good option when using partitioning, because only older compressed partitions that were updated would need to be recompressed.
    OLTP compression, new with 11g, is the newer version of basic compression: updated rows stay compressed.
    The last two are used with HCC.
  • ----- Meeting Notes (4/10/13 20:20) -----
    Lookup or journal table.
    How many in the room actually have a table that is smaller than its index?
  • ----- Meeting Notes (4/10/13 20:39) -----
    tiered approach

    1. 1. About Maaz • Maaz Anjum • Marietta, Georgia • Solutions Architect: OEM12c, Golden Gate, Engineered Systems • Member of IOUG • Using Oracle products since 2001 • Blog: Email:
    2. 2. • Founded in 2000 • Distinguished Oracle Leader: Technology Momentum, Portal Blazer Award, Titan Award, Red Stack + HW Momentum, Excellence in Innovation • Management team is ex-Oracle • Location(s): headquartered in Atlanta; regional office in Washington D.C.; offshore in Hyderabad and Chennai, India • 200+ staff with 10+ years of Oracle experience on average (75 openings today…) • Inc. 500 fastest-growing private company in the U.S. for the 3rd time • Voted Best Place to Work in Atlanta for the 2nd year
    3. 3. • Oracle Platinum Certified Systems Integrator & Oracle GSA Software Reseller • Consulting expertise and past performance across the entire Oracle Stack • One stop shop for all things Oracle including Hardware, Software, Consulting, Managed Services, and Staff Augmentation • 350 customers across Federal Civilian Agencies, Department of Defense, State and Local Government, and Fortune 500 • 1500 successful implementations since 2000 • Certified across the entire Oracle Stack (1 of 8 partners out of 10,000) • Top 5 Oracle reseller in the United States • Business Intelligence Pillar Partner for Public Sector and Commercial
    4. 4. Oracle’s Advanced Compression An Insight into Space Realization on ODA and Exadata #523
    5. 5. AGENDA • Overview • A Brief History • Pre-11g • 11g: New Features • Case Study - Story
    6. 6. OVERVIEW • Combined with Oracle Database 11g, Advanced Compression helps businesses manage more data in a cost-effective manner while providing for the storing and auditing of historical data. • Delivers compression rates of 2-4x across all types of data and applications, improving query performance. • Includes compression for structured data (numbers, characters, etc.), unstructured data (documents, images, etc.), backups (RMAN and Data Pump), and network transport (redo log transport during Data Guard gap resolution).
    7. 7. OVERVIEW • Reduces database storage requirements and associated costs • Compresses transaction processing and data warehousing application tables • Compresses structured, unstructured, backup, and Data Guard redo log network transport data • Includes Total Recall for storing and auditing historical data • Cascades storage savings throughout the data center
    8. 8. AGENDA • Overview • A Brief History • Pre-11g • 11g: New Features • Case study - Story
    9. 9. A Brief History • Given many names • Data Compression • Source Coding • Bit-Rate Compression
    10. 10. AGENDA • Overview • A Brief History • Pre-11g • 11g: New Features • Case Study - Story
    11. 11. Pre-11G • First introduced in Oracle 9i with COMPRESS • A trade-off between CPU and disk I/O • The use of spare CPU cycles to decrease the bytes written and read • Transparent to applications, SQL, and PL/SQL • May improve performance by requiring the transfer of fewer bytes from disk, through the network, into the CPU, to be stored in the buffer cache • Increases the amount of data stored on existing disk
    12. 12. AGENDA • Overview • A Brief History • Pre-11g • 11g: New Features • Case Study - Story
    13. 13. 11G New Features The Advanced Compression Option includes: • Data Guard Network Compression • Data Pump Compression • Fast RMAN Compression • OLTP Table Compression • SecureFile Compression and Deduplication • Leveraged in 11gR2 DBFS (DataBase File System)
    14. 14. 11G New Features • Compressed Tablespaces • Segment Compression • COMPRESS • COMPRESS FOR BASIC • COMPRESS FOR OLTP column • Hybrid Columnar Compression • Warehouse Compression (Query) • Archival Compression (Archive) • user_tablespaces.compress_for 15
    15. 15. 11G New Features Fully supported with… • B-Tree, Bitmap, and Text indexes • Materialized Views • Exadata Server and Cells • Partitioning • Parallel Query, PDML, PDDL • Schema Evolution support: online, metadata-only add/drop columns • Data Guard Physical Standby
    16. 16. 11.2 Table Segment Compression
     Compress for OLTP:
     CREATE TABLE ct1 COMPRESS FOR OLTP AS SELECT * FROM dba_objects;
     Compress for Query:
     CREATE TABLE ct2 COMPRESS FOR QUERY HIGH AS SELECT * FROM dba_objects;
     Compress for Archive:
     CREATE TABLE ct3 COMPRESS FOR ARCHIVE LOW AS SELECT * FROM dba_objects;
    17. 17. 11.2 Table Segment Compression
    18. 18. Types of Compression
    19. 19. Compression Characteristics
    20. 20. What Can Be Compressed? • Tablespaces • Tables • Partitions • Indexes • SecureFiles • RMAN Backups • Data Pump Backups
    21. 21. What Can Be Compressed? Tablespaces
     CREATE TABLESPACE test_ts DATAFILE '/u01/app/oracle/oradata/DB11G/test_ts01.dbf' SIZE 1M
       DEFAULT COMPRESS FOR ALL OPERATIONS;
     SELECT def_tab_compression, compress_for FROM dba_tablespaces WHERE tablespace_name = 'TEST_TS';
     DEF_TAB_ COMPRESS_FOR
     -------- ------------------
     ENABLED  FOR ALL OPERATIONS
    22. 22. What Can Be Compressed? Partitions
     CREATE TABLE test_tab_2 (
       id           NUMBER(10) NOT NULL,
       description  VARCHAR2(50) NOT NULL,
       created_date DATE NOT NULL
     )
     PARTITION BY RANGE (created_date) (
       PARTITION test_tab_q1 VALUES LESS THAN (TO_DATE('01/01/2008', 'DD/MM/YYYY')) COMPRESS,
       PARTITION test_tab_q2 VALUES LESS THAN (TO_DATE('01/04/2008', 'DD/MM/YYYY')) COMPRESS FOR OLTP,
       PARTITION test_tab_q3 VALUES LESS THAN (TO_DATE('01/07/2008', 'DD/MM/YYYY')) COMPRESS FOR OLTP,
       PARTITION test_tab_q4 VALUES LESS THAN (MAXVALUE) NOCOMPRESS
     );
    23. 23. What Can Be Compressed? Key-Compressed Indexes • Creating an index using key compression enables you to eliminate repeated occurrences of key column prefix values. • Key compression breaks an index key into a prefix and a suffix entry. • Compression is achieved by sharing the prefix entries among all the suffix entries in an index block. • This sharing can lead to huge savings in space, allowing you to store more keys for each index block while improving performance. CREATE INDEX emp_ename ON emp(ename) TABLESPACE users COMPRESS 1;
    24. 24. What Can Be Compressed? SecureFiles • SecureFile compression does not entail table or index compression, and vice versa. • A server-wide default SecureFile compression algorithm is used. • MEDIUM and HIGH options provide varying degrees of compression. The higher the degree of compression, the higher the latency incurred. The HIGH setting incurs more work, but will compress the data better. The default is MEDIUM.
    25. 25. What Can Be Compressed? SecureFiles • Compression can be specified at a partition level. The lob_storage_clause enables specification for partitioned tables on a per-partition basis. • SecureFile compression is performed on the server side and enables random reads and writes to LOB data. Client-side compression utilities like utl_compress cannot provide random access. • DBMS_LOB.SETOPTIONS can be used to enable and disable compression on individual LOBs. • LOB compression is applicable only to SECUREFILE LOBs.
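As a hedged sketch (the table, column, and variable names are illustrative, not from the deck), a compressed SecureFile LOB and a per-LOB toggle via DBMS_LOB.SETOPTIONS might look like:

```sql
-- Declare a SecureFile CLOB stored with HIGH compression.
CREATE TABLE documents (
  id  NUMBER PRIMARY KEY,
  doc CLOB
) LOB (doc) STORE AS SECUREFILE (COMPRESS HIGH);

-- Disable compression on one individual LOB; OPT_COMPRESS and
-- COMPRESS_OFF are DBMS_LOB package constants.
DECLARE
  l_doc CLOB;
BEGIN
  SELECT doc INTO l_doc FROM documents WHERE id = 1 FOR UPDATE;
  DBMS_LOB.SETOPTIONS(l_doc, DBMS_LOB.OPT_COMPRESS, DBMS_LOB.COMPRESS_OFF);
  COMMIT;
END;
/
```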
    27. 27. What Can Be Compressed? RMAN Backups
    28. 28. What Can Be Compressed? RMAN Backups Sample “Medium” Algorithm compression results
    29. 29. What Can Be Compressed? Data Pump Backups • The ability to compress the metadata associated with a Data Pump job was first provided in Oracle Database 10g Release 2. • In Oracle Database 11g, this compression capability has been extended so that table data can be compressed on export.
    30. 30. What Can Be Compressed? Data Pump Backups Full Data Pump functionality is available using a compressed file. Any command that is used on a regular file will also work on a compressed file. Users have the following options to determine which parts of a dump file set should be compressed: • ALL enables compression for the entire export operation. • DATA_ONLY results in all data being written to the dump file in compressed format. • METADATA_ONLY results in all metadata being written to the dump file in compressed format. This is the default. • NONE disables compression for the entire export operation.
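As an illustrative command line (the directory object, dump file, and schema names are assumptions, not from the deck), these options map to expdp's COMPRESSION parameter:

```
expdp scott DIRECTORY=dump_dir DUMPFILE=scott_comp.dmp \
      SCHEMAS=scott COMPRESSION=ALL
```

Note that COMPRESSION=ALL and COMPRESSION=DATA_ONLY require the Advanced Compression option; METADATA_ONLY does not.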
    31. 31. CONSIDERATIONS • When compression is specified at multiple levels, the most specific setting is always used • As such, partition settings always override table settings, which always override tablespace settings
     Object Type   Compression
     -----------   ------------
     Tablespace    OLTP
     Table         OLTP
     Partition 1   ARCHIVE LOW
     Partition 2   ARCHIVE HIGH
    32. 32. Hybrid Columnar Compression • Two types: Warehouse Compression and Archive Compression • Works with Exadata and now ZFS Storage • ZFS Storage can be attached to an ODA as NAS storage
    33. 33. Hybrid Columnar Compression
    34. 34. MYTHS • Myth: data is decompressed while being read • Oracle Database does not need to decompress table blocks when reading data. Oracle can keep blocks compressed in memory and read them directly. Hence, more data can be packed in memory, which results in an improved cache hit ratio and reduced I/O. • Myth: data needs to be recompressed once updated • Not true with 11gR2: the COMPRESS FOR OLTP algorithm compresses newer data without uncompressing updated rows.
    35. 35. CONSIDERATIONS • When should I compress? • What should I compress?
    36. 36. Compression & Partitioning • OLTP Applications • Table Partitioning • Heavily accessed data: partitions using OLTP Table Compression • Cold or historical data: partitions using Online Archival Compression • Data Warehouses
    37. 37. New Compression Advisors DBMS_COMPRESSION built-in package • GET_COMPRESSION_RATIO • Returns the possible compression ratio for an uncompressed table or materialized view and estimates achievable compression • GET_COMPRESSION_TYPE • Inspects data and reports what compression type is in use, by row Enterprise Manager Segment Advisor • Estimates OLTP Table Compression automatically • Identifies tables that will benefit from OLTP Compression
    38. 38. Compression Ratio Estimate GET_COMPRESSION_RATIO
     CREATE TABLE comp_test1 AS SELECT * FROM dba_objects;

     set serveroutput on
     DECLARE
       blkcnt_comp PLS_INTEGER;
       blkcnt_uncm PLS_INTEGER;
       row_comp    PLS_INTEGER;
       row_uncm    PLS_INTEGER;
       comp_ratio  NUMBER;
       comp_type   VARCHAR2(30);
     BEGIN
       dbms_compression.get_compression_ratio('UWDATA', 'UWCLASS', 'COMP_TEST1', NULL,
                                              dbms_compression.comp_for_oltp,
                                              blkcnt_comp, blkcnt_uncm,
                                              row_comp, row_uncm,
                                              comp_ratio, comp_type);
       dbms_output.put_line('Block Count Compressed:   ' || TO_CHAR(blkcnt_comp));
       dbms_output.put_line('Block Count Uncompressed: ' || TO_CHAR(blkcnt_uncm));
       dbms_output.put_line('Row Count Compressed:     ' || TO_CHAR(row_comp));
       dbms_output.put_line('Row Count Uncompressed:   ' || TO_CHAR(row_uncm));
       dbms_output.put_line('Compression Ratio:        ' || TO_CHAR(comp_ratio));
       dbms_output.put_line('Compression Type:         ' || comp_type);
     END;
     /
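A companion sketch for GET_COMPRESSION_TYPE (the UWDATA schema and COMP_TEST1 table follow the deck's GET_COMPRESSION_RATIO example; the query itself is an illustration, not from the slides):

```sql
-- Report, per compression type, how many rows use it; the returned
-- number maps to DBMS_COMPRESSION constants such as COMP_NOCOMPRESS
-- and COMP_FOR_OLTP.
SELECT DBMS_COMPRESSION.GET_COMPRESSION_TYPE('UWDATA', 'COMP_TEST1', ROWID) AS comp_type,
       COUNT(*) AS row_count
FROM   uwdata.comp_test1
GROUP  BY DBMS_COMPRESSION.GET_COMPRESSION_TYPE('UWDATA', 'COMP_TEST1', ROWID);
```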
    39. 39. AGENDA • Overview • A Brief History • Pre-11g • 11g: New Features • Case study of BIAS' implementation for a customer • Background • Requirements • Storage savings achieved: tablespaces and tables
    40. 40. CASE STUDY Challenge – 8TB Database Uncompressed and Unpartitioned – ODA had only 2.3TB of usable space. Goals – Compress customer data and achieve similar (if not better) performance – Use Database Replay to simulate workload – Perform Detailed Analysis of Performance Statistics
    41. 41. CASE STUDY Hardware – Platform: ODA – 2 Nodes running Oracle Enterprise Linux 5.7 64bit – CPU: 24 cores per node – RAM: 96GB per node Database Version – 64bit Instance parameters – SGA: 48GB – PGA: 10GB – Block Size: 8K
    42. 42. CASE STUDY Data Characteristics Billions of Rows Across Several Tables VARCHAR2 and NUMBER data types Repeating Patterns within each table
    43. 43. CASE STUDY Steps – Create a compressed export dump of a schema. – Create compressed tablespaces – in our case, for OLTP. – Import metadata only: users, grants, objects (excluding indexes and constraints). – Alter tables for compression. ALTER TABLE CARBON COMPRESS FOR OLTP;
    44. 44. CASE STUDY Steps – Import only table data – with appropriate parallel degree. – Import index creation scripts from export dump. – Alter relevant indexes to add compression factor of 1. – Create indexes with appropriate parallel option. – Import only constraints. – For good measure, generate statistics on the schema.
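The closing steps might be sketched as follows (the index name, schema name, and parallel degree are illustrative, not the client's actual settings):

```sql
-- Rebuild a relevant index with a prefix compression factor of 1,
-- using a parallel degree suited to the 24-core nodes.
ALTER INDEX carbon_ix REBUILD COMPRESS 1 PARALLEL 8;

-- For good measure, gather statistics on the imported schema.
BEGIN
  DBMS_STATS.GATHER_SCHEMA_STATS(ownname => 'UWDATA', degree => 8);
END;
/
```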
    45. 45. CHALLENGES • • • • What should I compress first? Multiple iterations with import Tweaked level of compression Time is the biggest enemy
    46. 46. RESULTS What’s that? You want to see proof of compression?? 8 TB Database, compressed to less than 1.5TB on an Oracle Database Appliance
    47. 47. RESULTS
    48. 48. RESULTS
    49. 49. RESULTS
    50. 50. RESULTS
    51. 51. RESULTS Database Replay Results Capture: 3 ½ Hours Replay: Nearly 10 Hours
    52. 52. RESULTS Database Replay Results • Database was CPU bound • Presumably because indexes were compressed
    53. 53. RESULTS Database Replay Results Two INSERT statements are the top consumers (over 75%) of the total sql statements
    54. 54. RESULTS Database Replay Results Average Active Sessions show database was CPU bound
    55. 55. What Did We Learn? • Compression ratios vary, but depend mainly on block redundancy • Choose the appropriate compression type • Database Replay (conditions need to be perfect) • Ensure workload capture is done with a consistent backup • Ensure the same number of clients can be spawned • Spend adequate time analyzing the results • Patience is a golden virtue!
    56. 56. QUESTIONS
    57. 57. Blog: Email: Twitter: @maaz_anjum Session: 523