Hybrid Columnar Compression in a non-Exadata System

Published in: Technology

  1. Hybrid Columnar Compression for Non-Exadata Databases
     Peter Brink, Credit Suisse
     Enkitec E4 conference, Dallas, 13/14 August 2012
  2. About me: Peter Brink
     • 15 years' experience in Data Warehouse projects within Financials
     • 13 years working with Oracle databases
  3. Agenda
     • Hybrid Columnar Compression without Exadata?
     • HCC overview
     • HCC proof of concept with ZFS appliance
       - Planning the PoC
       - Test cases and test results
     • Get the most out of HCC
     • Discussion
  4. Hybrid Columnar Compression without Exadata?
  5. Hybrid Columnar Compression without Exadata
     Oracle press release, 30 September 2011:
     • Oracle Announces Hybrid Columnar Compression Support for ZFS Storage Appliances and Pillar Axiom Storage Systems
  6. Hybrid Columnar Compression without Exadata
     • HCC compression is built into the 11g database, but is only enabled when running on Oracle storage

         SQL> select table_name, compression, compress_for
           2  from dba_tables
           3  where table_name = 'T1_ARCHIVE_HIGH';

         TABLE_NAME                     COMPRESS COMPRESS_FOR
         ------------------------------ -------- ------------
         T1_ARCHIVE_HIGH                ENABLED  ARCHIVE HIGH

         SQL> select c1 from t1_archive_high;
         select c1 from t1_archive_high
                        *
         ERROR at line 1:
         ORA-64307: hybrid columnar compression is only supported in tablespaces residing on Exadata storage

       See Jonathan Lewis' trick on how to create (and keep) an HCC table.
     • dbms_compression (11.2.0.1) creates HCC compressed tables during estimation of compression ratios
  7. Hybrid Columnar Compression overview
     Table compression types
     • Basic (9i): only available for bulk operations
     • OLTP or Advanced Compression (11g): available for all operations
     • Hybrid Columnar Compression
       - query low, query high: Warehouse compression
       - archive low, archive high: Online archival compression
  8. Hybrid Columnar Compression overview
     OLTP Compression
     • dictionary compression, self-contained within single blocks
     • tables with >255 columns are silently not compressed
     • locking works the same as for uncompressed blocks
     Block fill cycle: initial inserts are uncompressed -> PCTFREE triggers compression -> further uncompressed inserts -> PCTFREE triggers compression
  9. Hybrid Columnar Compression overview
     • HCC compression is only used for bulk operations
       - conventional inserts and updates fall back to OLTP compression
       - updates result in rows migrated to new blocks
     • Tables are organised in Compression Units (CU)
       - Logical structure spanning multiple blocks
       - Organised by column during data load
       - Columns are compressed individually
     Figure: Compression Unit, Database Concepts Guide
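To make the Compression Unit idea concrete, here is a minimal Python sketch. It is not Oracle's on-disk format: it only illustrates "organised by column, columns compressed individually" by pivoting the rows of a bulk load into columns and compressing each column on its own with zlib. The function names `build_cu` and `read_cu`, the newline-separated text encoding and the sample rows are all assumptions made for illustration.

```python
import zlib

def build_cu(rows):
    """Pivot rows into columns and compress each column separately."""
    columns = list(zip(*rows))  # row-major -> column-major
    return [zlib.compress("\n".join(col).encode("utf-8")) for col in columns]

def read_cu(cu):
    """Decompress every column and pivot the columns back into rows."""
    columns = [zlib.decompress(blob).decode("utf-8").split("\n") for blob in cu]
    return [tuple(row) for row in zip(*columns)]

# Hypothetical bulk load: (id, status, close-of-business date) as strings
rows = [(str(i), "OPEN" if i % 3 else "CLOSED", "2012-08-13") for i in range(1000)]
cu = build_cu(rows)
```

In this toy layout a scan of one column only has to decompress that column's blob (for example `zlib.decompress(cu[1])`), while changing a single row touches every column blob in the unit. That gives some intuition for why the slides favour bulk loads and warn against updates.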
  10. Hybrid Columnar Compression overview
      HCC compression types

      Type          Description
      ------------  ---------------------------------------------------------------
      Query Low     • LZO compression algorithm
                    • Optimised for speed rather than high compression ratio
      Query High    • ZLIB compression algorithm
      Archive Low   • ZLIB with higher compression level
                    • Does not necessarily yield higher compression than Query High
      Archive High  • BZIP2 compression
                    • Yields the highest compression ratio
                    • Significant CPU overhead
  11. Hybrid Columnar Compression overview
      • Different compression types can be used for partitions of a single table:

          create table T1 (col1 date, col2 number …)
          partition by range (col1) (
            partition P2009   … compress for archive high,
            partition P2010   … compress for archive low,
            partition P2011   … compress for query high,
            partition P2012Q1 … compress for OLTP,
            partition P2012Q2 … nocompress
          )

      • alter table T1 compress for query high; does not result in existing blocks getting compressed
      • alter table T1 nocompress; will not decompress blocks in a table
      • The table needs to be rebuilt to apply changed compression settings to existing data
  12. Introducing the ZFS Storage Appliance
      • Oracle hardware
      • Primarily a NAS storage device
        - it runs ZFS inside the device
        - also capable of providing Fibre Channel, iSCSI, etc.
      • Provides "full service" as NAS storage
        - Cloning
        - Snapshots
        - Data deduplication
      • Options available from shelves to full racks: 3.3 TB to 1.7 PB of raw capacity
  13. HCC proof of concept with ZFS storage appliance
      Prerequisites for HCC compression with ZFS SA
      • Database must be running on 11.2.0.3 + patch p13041324
      • Database must use NFS datafiles (ideally dNFS)
      • Standard requirements for performant NFS implementations
        - 10 GigE networking
        - Network adjacency (one hop between host and storage)
        - Jumbo Frames throughput (MTU 9000)
  14. HCC proof of concept – Application Characteristics
      • Risk application chosen for POC: hybrid between Data Warehouse and OLTP
      • 10 TB, 2-node RAC cluster used for data loading and reporting
      • 2.5 TB read-only archive database (not in scope for POC)
        - uses OLTP compression
        - subset of tables with data older than 6 months
      • Growth in data volumes driven by:
        - business growth
        - demand for more detailed analysis
        - increased data granularity and risk factors
        - more complex risk methodologies
  15. HCC proof of concept – Application Characteristics
      Data loading
      • More than 500 GB of trade-level detail data loaded every week
        -> detail data gets purged within a period of one to six weeks
      • Aggregated into > 50 GB of final data
        -> final data is kept forever
      • 24 x 5.5 loading activity
      • No direct path loads!
  16. HCC proof of concept – Application Characteristics
      Reporting and Analysis
      • Only aggregated data is used for reporting purposes
      • Reporting activity accounts for 80% of database workload
      • 80% of reports are generated for the last 2 COB dates
        -> data for the last 2 COB dates will be in the buffer cache
      • Data extracts rely on nested loop joins / index access!
  17. HCC proof of concept – Compression Advisor
      DBMS_COMPRESSION.GET_COMPRESSION_RATIO
      • Does not require Exadata storage, only needs an 11gR2 database
      • The advisor gets the ratio by compressing sample data. It will consume CPU and requires disk space
      • Accuracy depends on how representative the sample set is

      Sample results:

      Compression type  Predicted ratio  Ratio during POC
      ----------------  ---------------  ----------------
      Archive High                 17.1       16.7 – 27.5
      Archive Low                  13.5       12.9 – 20.8
      Query High                   10.9       10.7 – 17.8
      Query Low                     5.8         6.6 – 9.5
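As a worked check of the advisor figures, the short Python sketch below takes the numbers from the sample results and asks two things: does each predicted ratio fall inside the range seen during the POC, and what space saving does the predicted ratio imply (saving = 1 - 1/ratio)? The variable and function names are illustrative assumptions; only the numbers come from the slide.

```python
# Figures copied from the sample results: (predicted ratio, POC low, POC high)
results = {
    "Archive High": (17.1, 16.7, 27.5),
    "Archive Low": (13.5, 12.9, 20.8),
    "Query High": (10.9, 10.7, 17.8),
    "Query Low": (5.8, 6.6, 9.5),
}

def space_saving(ratio):
    """Fraction of space saved for a given compression ratio."""
    return 1.0 - 1.0 / ratio

# Did the prediction land inside the range observed during the POC?
within_range = {
    name: low <= predicted <= high
    for name, (predicted, low, high) in results.items()
}

# Space saving implied by the predicted ratio, in percent
predicted_saving_pct = {
    name: round(space_saving(predicted) * 100, 1)
    for name, (predicted, _, _) in results.items()
}
```

On these numbers the predictions sit at or below the low end of the observed ranges, and Query Low was underestimated outright, so in this POC the advisor erred on the conservative side.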
  18. HCC proof of concept – DML performance
      DML Performance Impact
      • Updates to compressed data will take longer and expand the space used
      • Sample results (update primary key to a new value for all rows):

        Compression      Size before (MB)  Size after (MB)  Elapsed time (s)
        ---------------  ----------------  ---------------  ----------------
        Uncompressed                  860              860                30
        HCC - Query Low               154              694               246

      • An update of a single row of an HCC table locks the Compression Unit containing the row
      Avoid updates of HCC compressed rows
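The cost of that update can be put into a few ratios. The Python sketch below uses only the figures from the sample results and assumes, as the table layout suggests, that both rows describe the same data set, so that 860 MB is the uncompressed size of the HCC table. The variable names are illustrative.

```python
uncompressed_mb = 860            # size of the same data without compression (assumed)
before_mb, after_mb = 154, 694   # HCC Query Low segment before and after the update
plain_seconds, hcc_seconds = 30, 246

growth_factor = after_mb / before_mb          # how much the HCC segment grew
ratio_before = uncompressed_mb / before_mb    # compression ratio before the update
ratio_after = uncompressed_mb / after_mb      # effective ratio after the update
slowdown = hcc_seconds / plain_seconds        # update time relative to uncompressed

print(f"segment grew {growth_factor:.2f}x, "
      f"ratio fell from {ratio_before:.2f} to {ratio_after:.2f}, "
      f"update took {slowdown:.1f}x longer")
```

A full-table update therefore removed most of the space benefit in this test, which is the arithmetic behind the advice to avoid updates of HCC compressed rows.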
  19. HCC proof of concept – Query Performance
      • Positive impact when disk IO can be reduced or avoided
      • Negative impact caused by CPU usage for decompression and additional logical IO when the majority of blocks are cached
      • Test result from a representative set of report queries:

        State     Compression   Elapsed time (s)  CPU time (s)  IO wait time (s)
        --------  ------------  ----------------  ------------  ----------------
        Uncached  Uncompressed             24356          3542             20652
        Uncached  Query Low                 5071          3726              1352
        Uncached  Query High                6681          5064               246
        Cached    Uncompressed              2171          1711               372
        Cached    Query Low                 2593          2556                33
        Cached    Query High                3515          3492                30
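To read the trade-off off these figures, the following Python sketch expresses each elapsed time relative to the uncompressed run in the same cache state. The dictionary layout and the helper `relative_elapsed` are illustrative assumptions; the numbers are the ones reported in the test result.

```python
# (elapsed s, CPU s, IO wait s) per (cache state, compression)
timings = {
    ("uncached", "Uncompressed"): (24356, 3542, 20652),
    ("uncached", "Query Low"): (5071, 3726, 1352),
    ("uncached", "Query High"): (6681, 5064, 246),
    ("cached", "Uncompressed"): (2171, 1711, 372),
    ("cached", "Query Low"): (2593, 2556, 33),
    ("cached", "Query High"): (3515, 3492, 30),
}

def relative_elapsed(state, compression):
    """Elapsed time as a multiple of the uncompressed run in the same cache state."""
    baseline = timings[(state, "Uncompressed")][0]
    return timings[(state, compression)][0] / baseline

for state in ("uncached", "cached"):
    for compression in ("Query Low", "Query High"):
        print(state, compression, round(relative_elapsed(state, compression), 2))
```

Values below 1 mean the compressed run was faster. On these figures HCC cut uncached elapsed time to roughly a fifth to a quarter of the baseline, while the cached runs took about 1.2 to 1.6 times as long.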
  20. HCC proof of concept – Application Characteristics
      Disk space consumption:
      • HCC will give no benefits on the 34% of space used for indexes
      • Challenges adapting HCC for the 44% of table space used for trade level data
      -> less than 30% of total database size in scope
  21. HCC proof of concept – Configuration
      Compression matrix by data type

      Data type    Compression                               Size before (GB)  Size after (GB)
      -----------  ----------------------------------------  ----------------  ---------------
      Trade level  Uncompressed                                          2037             2037
      Aggregated   Query High for partitions older than one              1512              139
                   week, latest partitions are uncompressed
      Static       Uncompressed                                           270              270
      Audit data   Rebuild with Query High, then set to                   745               46
                   nocompress without rebuild
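Summing the matrix gives the effect on the listed table data as a whole. This Python sketch uses only the sizes shown in the compression matrix; the names are illustrative, and the result covers just these four data types, not indexes or anything else in the database.

```python
# (size before GB, size after GB) per data type, from the compression matrix
matrix = {
    "Trade level": (2037, 2037),
    "Aggregated": (1512, 139),
    "Static": (270, 270),
    "Audit data": (745, 46),
}

total_before = sum(before for before, _ in matrix.values())
total_after = sum(after for _, after in matrix.values())
reduction_pct = round((1 - total_after / total_before) * 100, 1)

# Achieved compression ratio for the data types that actually shrank
achieved_ratio = {
    name: round(before / after, 1)
    for name, (before, after) in matrix.items()
    if after < before
}

print(total_before, total_after, reduction_pct, achieved_ratio)
```

The two compressed data types shrink by a factor of more than ten, but because trade level and static data stay uncompressed, the listed table data as a whole shrinks by a bit under half. Index space, which HCC does not help, would pull a whole-database figure lower still.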
  22. HCC proof of concept – Application performance test
      • Upgrade to 11.2.0.3 resulted in regressed performance for the loading component. Setting optimizer_features_enable to 11.2.0.2 resolved this issue
      • No further impact on data loads. Occasional updates to HCC rows resulted in rows migrating, but no noticeable performance impact and no locking issues
      • Reporting application validated previous test results. Performance improvement for a small percentage of reports retrieving HCC rows
  23. Get the most out of HCC
      Design your application to take advantage of HCC
      • Get your partitioning right
      • Use bulk loads that can use HCC compression
      • Avoid updates
      • Minimize single row lookups
      • In-memory parallel execution
      • Ensure you have sufficient CPU capacity
  24. Get the most out of HCC
      Moving data from an Exadata to a non-Exadata platform
      • Without ZFS or Pillar storage, data in HCC format cannot be used by a non-Exadata database
      • Using Oracle storage allows accessing HCC blocks when a failover / restore of an Exadata database to a non-Exadata platform occurs
  25. Conclusion
      • The Oracle ZFS Storage Appliance offers genuine storage saving capabilities through the Hybrid Columnar Compression feature
      • Actual storage savings depend on what objects can be compressed. For the application evaluated, only a 20% reduction was achievable
      • Natural fit for any Oracle database already using Oracle storage
      • Being allowed to use HCC compression without a storage platform change would be appreciated
  26. Bibliography
      • Exadata Hybrid Columnar Compression – Leveraging it fully, Christo Kutrovsky, UK OUG 2011
      • Expert Oracle Exadata, Kerry Osborne, Randy Johnson, Tanel Poder, Apress, 2011
      • Oracle Scratchpad, Jonathan Lewis, http://jonathanlewis.wordpress.com/category/oracle/exadata/
      • Exadata Hybrid Columnar Compression, Oracle, http://www.oracle.com/technetwork/database/features/availability/311358-132337.pdf
      • oracle.com for documentation on HCC and the ZFS storage appliance
  27. HCC for Non-Exadata Databases: Discussion
  28. Thanks to Duncan Lawie, Credit Suisse, for his work on this. Thank you for your attention. peter.brink@credit-suisse.com
