Five Tuning Tips For Your Datawarehouse

Five tuning tips for enhancing the performance and manageability of a data warehouse.

Published in: Technology

    1. 1. Five Tuning Tips For Your Data Warehouse Jeff Moss
    2. 2. My First Presentation
       • Yes, my very first presentation
           • For BIRT SIG
           • For UKOUG
       • Useful advice from friends and colleagues
           • Use graphics where appropriate
           • Find a friendly or familiar face in the audience
           • Imagine your audience is naked!
           • … but like Oracle, be careful when combining advice!
    3. 3. Be Careful Combining Advice!
       • Thanks for the opportunity Mark!
    4. 4. Agenda
       • My background
       • Five tips
           • Partition for success
           • Squeeze your data with data segment compression
           • Make the most of your PGA memory
           • Beware of temporal data affecting the optimizer
           • Find out where your query is at
       • Questions
    5. 5. My Background
       • Independent Consultant
       • 13 years Oracle experience
       • Blog: http://oramossoracle.blogspot.com/
       • Focused on warehousing / VLDB since 1998
       • First project
           • UK Music Sales Data Mart
           • Produces BBC Radio 1 Top 40 chart and many more
           • 2 billion row sales fact table
           • 1 Tb total database size
       • Currently working with Eon UK (Powergen)
           • 4Tb Production Warehouse, 8Tb total storage
           • Oracle Product Stack
    6. 6. What Is Partitioning?
       • "Partitioning addresses key issues in supporting very large tables and indexes by letting you decompose them into smaller and more manageable pieces called partitions." (Oracle Database Concepts Manual, 10gR2)
       • Introduced in Oracle 8.0
       • Numerous improvements since
       • Subpartitioning adds another level of decomposition
       • Partitions and subpartitions are logical containers
    7. 7. Partition To Tablespace Mapping
       • Partitions map to tablespaces
           • A partition can only be in one tablespace
           • A tablespace can hold many partitions
           • Highest granularity is one tablespace per partition
           • Lowest granularity is one tablespace for all the partitions
       • Tablespace volatility
           • Read / Write
           • Read Only
       [Figure: monthly partitions P_JAN_2005 to P_MAR_2006 mapped to quarterly tablespaces T_Q1_2005 to T_Q1_2006, each tablespace marked Read / Write or Read Only]
    8. 8. Why Partition? - Performance
       • Improved query performance
           • Pruning or elimination
           • Partition wise joins
       • Read only partitions
           • Quicker checkpointing
           • Quicker backup
           • Quicker recovery
           • … but it depends on the mapping of partition:tablespace:datafile
       Example query against the Sales Fact Table:
           SELECT SUM(sales) FROM part_tab WHERE sales_date BETWEEN '01-JAN-2005' AND '30-JUN-2005'
       [Figure: Sales Fact Table with monthly partitions JAN to DEC]
       * Oracle 10gR2 Data Warehousing Manual
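To make the pruning idea on this slide concrete, here is a minimal Python sketch. It is only an illustration of the principle, not how Oracle implements pruning: the monthly partition bounds and the `prune` helper are assumptions made up for this example.

```python
from datetime import date

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

# Hypothetical monthly partitions for 2005:
# name -> (first day of month, first day of the following month).
PARTITIONS = {
    name: (date(2005, m, 1),
           date(2006, 1, 1) if m == 12 else date(2005, m + 1, 1))
    for m, name in enumerate(MONTHS, start=1)
}

def prune(lo, hi):
    """Return the partitions whose range overlaps the closed interval [lo, hi]."""
    return [name for name, (start, end) in PARTITIONS.items()
            if start <= hi and lo < end]

# The BETWEEN '01-JAN-2005' AND '30-JUN-2005' predicate from the slide.
touched = prune(date(2005, 1, 1), date(2005, 6, 30))
print(touched)
```

Only the partitions returned by `prune` need to be scanned; the remaining months are eliminated before any I/O is done.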
    9. 9. Why Partition? - Manageability
       • Archiving
           • Use a rolling window approach
           • ALTER TABLE … ADD/SPLIT/DROP PARTITION…
       • Easier ETL processing
           • Build a new dataset in a staging table
           • Add indexes and constraints
           • Collect statistics
           • Then swap the staging table for a partition on the target
               • ALTER TABLE…EXCHANGE PARTITION…
       • Easier maintenance
           • Table partition move, e.g. to compress data
           • Local index partition rebuild
    10. 10. Why Partition? - Scalability
       • Partition is generally consistent and predictable
           • Assuming an appropriate partitioning key is used
           • …and data has an even distribution across the key
       • Read only approach
           • Scalable backups: read only tablespaces are ignored
           • …so partitions in those tablespaces are ignored
       • Pruning allows consistent query performance
    11. 11. Why Partition? - Availability
       • Offline data impact minimised
           • … depending on granularity
           • Quicker recovery
           • Pruned data not missed
           • EXCHANGE PARTITION
               • Allows offline build
               • Quick swap over
       [Figure: monthly partitions P_JAN_2005 to P_MAR_2006 mapped to quarterly tablespaces T_Q1_2005 to T_Q1_2006, each tablespace marked Read / Write or Read Only]
    12. 12. Fact Table Partitioning
       Partitioning by Load Date
       • Easier ETL processing
           • Each load deals with only 1 partition
       • No use to end user queries!
           • Can't prune: full scans!
       Partitioning by Transaction Date
       • Harder ETL processing
           • But still uses EXCHANGE PARTITION
       • Useful to end user queries
           • Allows full pruning capability

       Rows placed by Load Date:
       Partition   Tran Date     Customer     Load Date
       January     07-JAN-2005   Customer 1   09-JAN-2005
       January     15-JAN-2005   Customer 2   17-JAN-2005
       February    22-JAN-2005   Customer 3   01-FEB-2005
       February    02-FEB-2005   Customer 4   05-FEB-2005
       February    26-FEB-2005   Customer 5   28-FEB-2005
       March       06-MAR-2005   Customer 2   07-MAR-2005
       March       12-MAR-2005   Customer 3   15-MAR-2005
       April       21-JAN-2005   Customer 7   04-APR-2005
       April       09-APR-2005   Customer 9   10-APR-2005

       Rows placed by Transaction Date:
       Partition   Tran Date     Customer     Load Date
       January     07-JAN-2005   Customer 1   09-JAN-2005
       January     15-JAN-2005   Customer 2   17-JAN-2005
       January     21-JAN-2005   Customer 7   04-APR-2005
       January     22-JAN-2005   Customer 3   01-FEB-2005
       February    02-FEB-2005   Customer 4   05-FEB-2005
       February    26-FEB-2005   Customer 5   28-FEB-2005
       March       06-MAR-2005   Customer 2   07-MAR-2005
       March       12-MAR-2005   Customer 3   15-MAR-2005
       April       09-APR-2005   Customer 9   10-APR-2005
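The trade-off on this slide can be checked with the slide's own nine rows. The Python sketch below is an illustration only; the `partitions_holding` helper is a made-up name. It shows where January transactions end up under each partitioning key.

```python
from datetime import date

# (transaction date, customer, load date) rows as shown on the slide.
ROWS = [
    (date(2005, 1, 7), "Customer 1", date(2005, 1, 9)),
    (date(2005, 1, 15), "Customer 2", date(2005, 1, 17)),
    (date(2005, 1, 22), "Customer 3", date(2005, 2, 1)),
    (date(2005, 2, 2), "Customer 4", date(2005, 2, 5)),
    (date(2005, 2, 26), "Customer 5", date(2005, 2, 28)),
    (date(2005, 3, 6), "Customer 2", date(2005, 3, 7)),
    (date(2005, 3, 12), "Customer 3", date(2005, 3, 15)),
    (date(2005, 1, 21), "Customer 7", date(2005, 4, 4)),
    (date(2005, 4, 9), "Customer 9", date(2005, 4, 10)),
]

def partitions_holding(tran_month, key):
    """Months of the partitions that hold rows with the given transaction month.

    key selects the partitioning column: 0 = transaction date, 2 = load date.
    """
    return sorted({row[key].month for row in ROWS if row[0].month == tran_month})

by_tran_date = partitions_holding(1, key=0)
by_load_date = partitions_holding(1, key=2)
print(by_tran_date, by_load_date)
```

With transaction-date partitioning, January sales live in one partition. With load-date partitioning they are spread over the January, February and April partitions, and since a query on transaction date gives the optimizer no way to know that in advance, it cannot prune at all.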
    13. 13. Watch out for…
       • Partition exchange and table statistics [1]
           • Partition stats updated
           • … but Global stats are NOT!
           • Affects queries accessing multiple partitions
           • Solution
               • Gather stats on staging table prior to EXCHANGE
               • Gather stats on partitioned table using GLOBAL
       [1] Jonathan Lewis: Cost-Based Oracle Fundamentals, Chapter 2
    14. 14. Partitioning Feature: Characteristic Reason Matrix
       [Matrix: one row per feature (Partition Truncation, Exchange Partition, Archiving, Pruning (Partition Elimination), Partition wise joins, Parallel DML, Local Indexes, Read Only Partitions) against one column per characteristic (Availability, Scalability, Manageability, Performance)]
    15. 15. What Is Data Segment Compression?
       • Compresses data by eliminating intra block repeated column values
       • Reduces the space required for a segment
           • …but only if there are appropriate repeats!
       • Self contained
       • Lossless algorithm
    16. 16. Where Can Data Segment Compression Be Used?
       • Can be used with a number of segment types
           • Heap & Nested Tables
           • Range or List Partitions
           • Materialized Views
       • Can't be used with
           • Subpartitions
           • Hash Partitions
           • Indexes (but they have row level compression)
           • IOT
           • External Tables
           • Tables that are part of a Cluster
           • LOBs
    17. 17. How Does Segment Compression Work?
       Source rows:
       ID    DESCRIPTION                   CONTACT TYPE   OUTCOME   FOLLOWUP
       100   Call to discuss bill amount   TEL            NO        YES
       101   Call to discuss new product   MAIL           NO        N/A
       102   Call to discuss new product   TEL            YES       N/A

       Database Block, Symbol Table:
       1  = 100
       2  = Call to discuss bill amount
       3  = TEL
       4  = NO
       5  = YES
       6  = 101
       7  = Call to discuss new product
       8  = MAIL
       9  = N/A
       10 = 102

       Database Block, Row Data Area:
       Row 100:  1  2  3  4  5
       Row 101:  6  7  8  4  9
       Row 102: 10  7  3  5  9
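The symbol-table picture on this slide can be reproduced with a few lines of Python. This is a sketch of the slide's simplified model only, assuming one symbol per distinct value in first-seen order. It is not Oracle's actual block format, and `compress_block` and `decompress_block` are hypothetical names.

```python
def compress_block(rows):
    """Replace each column value with a reference into a per-block symbol table."""
    symbols = {}   # value -> symbol number, numbered in first-seen order
    encoded = []
    for row in rows:
        encoded_row = []
        for value in row:
            if value not in symbols:
                symbols[value] = len(symbols) + 1
            encoded_row.append(symbols[value])
        encoded.append(encoded_row)
    return symbols, encoded

def decompress_block(symbols, encoded):
    """Rebuild the original rows; the scheme is lossless."""
    by_number = {number: value for value, number in symbols.items()}
    return [[by_number[n] for n in row] for row in encoded]

# The three rows from the slide (ID, DESCRIPTION, CONTACT TYPE, OUTCOME, FOLLOWUP).
rows = [
    ["100", "Call to discuss bill amount", "TEL", "NO", "YES"],
    ["101", "Call to discuss new product", "MAIL", "NO", "N/A"],
    ["102", "Call to discuss new product", "TEL", "YES", "N/A"],
]
symbols, encoded = compress_block(rows)
print(encoded)
```

Fifteen column values are stored as ten symbols plus fifteen short references; the saving grows with the number of repeats that land in the same block.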
    18. 18. Pros & Cons
       • Pros
           • Saves space
               • Reduces LIO / PIO
               • Speeds up backup/recovery
               • Improves query response time
           • Transparent
               • To readers
               • … and writers
           • Decreases time to perform some DML
               • Deletes should be quicker
               • Bulk inserts may be quicker
       • Cons
           • Increases CPU load
           • Can only be used on Direct Path operations
               • CTAS
               • Serial Inserts using INSERT /*+ APPEND */
               • Parallel Inserts (PDML)
               • ALTER TABLE…MOVE…
               • Direct Path SQL*Loader
           • Increases time to perform some DML
               • Bulk inserts may be slower
               • Updates are slower
    19. 19. Ordering Your Data For Maximum Benefits
       • Colocate data to maximise compression benefits
       • For maximum compression
           • Minimise the total space required by the segment
           • Identify the most "compressible" column(s)
       • For optimal access
           • We know how the data is to be queried
           • Order the data by
               • Access path columns
               • Then the next most "compressible" column(s)
       Uniformly distributed: 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5
       Colocated:             1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 5 5 5 5
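A rough way to see why colocation helps is to count how many distinct values each block must hold. The sketch below uses the two sequences from the slide and assumes, purely for illustration, that a block holds four rows and keeps its own symbol table; `symbols_needed` is a hypothetical helper.

```python
def symbols_needed(values, rows_per_block):
    """Total symbol-table entries when every block stores its own distinct values."""
    blocks = [values[i:i + rows_per_block]
              for i in range(0, len(values), rows_per_block)]
    return sum(len(set(block)) for block in blocks)

uniform = [1, 2, 3, 4, 5] * 4      # 1 2 3 4 5 1 2 3 4 5 ...
colocated = sorted(uniform)        # 1 1 1 1 2 2 2 2 ...

uniform_symbols = symbols_needed(uniform, rows_per_block=4)
colocated_symbols = symbols_needed(colocated, rows_per_block=4)
print(uniform_symbols, colocated_symbols)
```

The same twenty values need far fewer symbol entries once they are sorted, which is the effect an ORDER BY on the load is meant to achieve.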
    20. 20. Get Max Compression Order Package
       PROCEDURE mgmt_p_get_max_compress_order
       Argument Name                  Type                    In/Out Default?
       ------------------------------ ----------------------- ------ --------
       P_TABLE_OWNER                  VARCHAR2                IN     DEFAULT
       P_TABLE_NAME                   VARCHAR2                IN
       P_PARTITION_NAME               VARCHAR2                IN     DEFAULT
       P_SAMPLE_SIZE                  NUMBER                  IN     DEFAULT
       P_PREFIX_COLUMN1               VARCHAR2                IN     DEFAULT
       P_PREFIX_COLUMN2               VARCHAR2                IN     DEFAULT
       P_PREFIX_COLUMN3               VARCHAR2                IN     DEFAULT

       BEGIN
         mgmt_p_get_max_compress_order(p_table_owner => 'AE_MGMT'
                                      ,p_table_name  => 'BIG_TABLE'
                                      ,p_sample_size => 10000);
       END;
       /

       Running mgmt_p_get_max_compress_order...
       ----------------------------------------------------------------------------------------------------
       Table        : BIG_TABLE
       Sample Size  : 10000
       Unique Run ID: 25012006232119
       ORDER BY Prefix:
       ----------------------------------------------------------------------------------------------------
       Creating MASTER Table  : TEMP_MASTER_25012006232119
       Creating COLUMN Table 1: COL1
       Creating COLUMN Table 2: COL2
       Creating COLUMN Table 3: COL3
       ----------------------------------------------------------------------------------------------------
       The output below lists each column in the table and the number of blocks/rows and space used when the table data is ordered by only that column, or in the case where a prefix has been specified, where the table data is ordered by the prefix and then that column.
       From this one can determine if there is a specific ORDER BY which can be applied to the data in order to maximise compression within the table whilst, in the case of a prefix being present, ordering data as efficiently as possible for the most common access path(s).
       ----------------------------------------------------------------------------------------------------
       NAME                           COLUMN                         BLOCKS       ROWS         SPACE_GB
       ============================== ============================== ============ ============ ========
       TEMP_COL_001_25012006232119    COL1                                    290        10000    .0022
       TEMP_COL_002_25012006232119    COL2                                    345        10000    .0026
       TEMP_COL_003_25012006232119    COL3                                    555        10000    .0042
    21. 21. Data Warehousing Specifics
       • Star Schema compresses better than Normalized
           • More redundant data
       • Focus on…
           • Fact Tables and Summaries in Star Schema
           • Transaction tables in Normalized Schema
       • Performance Impact [1]
           • Space Savings
               • Star schema: 67%
               • Normalized: 24%
           • Query Elapsed Times
               • Star schema: 16.5%
               • Normalized: 10%
       [1] Table Compression in Oracle 9iR2: A Performance Analysis
    22. 22. Things To Watch Out For
       • DROP COLUMN is awkward
           • ORA-39726: Unsupported add/drop column operation on compressed tables
           • Uncompress the table and try again: still gives ORA-39726!
       • After UPDATEs data is uncompressed
           • Performance impact
           • Row migration
       • Use appropriate physical design settings
           • PCTFREE 0: pack each block
           • Large blocksize: reduce overhead / increase repeats per block
    23. 23. PGA Memory: What For?
       • Sorts
           • Standard sorts [SORT]
           • Buffer [BUFFER]
           • Group By [GROUP BY (SORT)]
           • Connect By [CONNECT-BY (SORT)]
           • Rollup [ROLLUP (SORT)]
           • Window [WINDOW (SORT)]
       • Hash Joins [HASH-JOIN]
       • Indexes
           • Maintenance [IDX MAINTENANCE SOR]
           • Bitmap Merge [BITMAP MERGE]
           • Bitmap Create [BITMAP CREATE]
       • Write Buffers [LOAD WRITE BUFFERS]
       Key: [] = V$SQL_WORKAREA.OPERATION_TYPE
       [Figure: PGA of a serial process (dedicated server) holding Cursors, Variables and the Sort Area]
    24. 24. PGA Memory Management: Manual
       • The "old" way of doing things
           • Still available though, even in 10g R2
       • Configuring
           • ALTER SESSION SET WORKAREA_SIZE_POLICY=MANUAL;
           • Initialisation parameter: WORKAREA_SIZE_POLICY=MANUAL
       • Set memory parameters yourself
           • HASH_AREA_SIZE
           • SORT_AREA_SIZE
           • SORT_AREA_RETAINED_SIZE
           • BITMAP_MERGE_AREA_SIZE
           • CREATE_BITMAP_AREA_SIZE
       • Optimal values depend on the type of work [1]
           • One size does not fit all!
       [1] Richmond Shee: If Your Memory Serves You Right
    25. 25. PGA Memory Management: Automatic
       • The "new" way from 9i R1
           • Default OFF in 9i R1/R2
               • Enabled by setting at session/instance level:
                   • WORKAREA_SIZE_POLICY=AUTO
                   • PGA_AGGREGATE_TARGET > 0
           • Default ON since 10g R1
       • Oracle dynamically manages the available memory to suit the workload
           • But of course, it's not perfect!
       Jože Senegačnik - Advanced Management Of Working Areas In Oracle 9i/10g, presented at UKOUG 2005
    26. 26. Auto PGA Parameters: Pre 10gR2
       • WORKAREA_SIZE_POLICY
           • Set to AUTO
       • PGA_AGGREGATE_TARGET
           • The target for summed PGA across all processes
           • Can be exceeded if too small
               • Over Allocation
       • _PGA_MAX_SIZE
           • Target maximum PGA size for a single process
           • Default is a fixed value of 200Mb
           • Hidden / Undocumented Parameter
               • Usual caveats apply
    27. 27. Auto PGA Parameters: Pre 10gR2
       • _SMM_MAX_SIZE
           • Limit for a single workarea operation for one process
           • Derived Default
               • LEAST(5% of PGA_AGGREGATE_TARGET, 50% of _PGA_MAX_SIZE)
               • Hits limit of 100Mb
                   • When PGA_AGGREGATE_TARGET is >= 2000Mb
                   • And _PGA_MAX_SIZE is left at default of 200Mb
           • Hidden / Undocumented Parameter
               • Usual caveats apply
    28. 28. Auto PGA Parameters: Pre 10gR2
       • _SMM_PX_MAX_SIZE
           • Limit for all the parallel slaves of a single workarea operation
           • Derived Default
               • 30% of PGA_AGGREGATE_TARGET
           • Hidden / Undocumented Parameter
               • Usual caveats apply
           • Parallel slaves still limited
               • _SMM_MAX_SIZE
           • Impacts only when…
       [Figure: PGA_AGGREGATE_TARGET = 3000Mb, _PGA_MAX_SIZE = 200Mb, _SMM_MAX_SIZE = 100Mb, _SMM_PX_MAX_SIZE = 900Mb. With 8 parallel sessions each session gets 100Mb; with 12 parallel sessions each session gets 75Mb.]
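The derived defaults on the last two slides are simple arithmetic, so a small Python sketch can reproduce the figures in the diagram. Treat it as a worked example under assumptions: the function names are invented here, the parameters are hidden and undocumented, and the even split of _SMM_PX_MAX_SIZE across slaves is inferred from the diagram's numbers rather than stated on the slide.

```python
def pre_10gr2_limits(pga_aggregate_target_mb, pga_max_size_mb=200):
    """Return (_SMM_MAX_SIZE, _SMM_PX_MAX_SIZE) in Mb using the slide's derived defaults."""
    smm_max = min(pga_aggregate_target_mb * 5 / 100,   # 5% of PGA_AGGREGATE_TARGET
                  pga_max_size_mb * 50 / 100)          # 50% of _PGA_MAX_SIZE
    smm_px_max = pga_aggregate_target_mb * 30 / 100    # 30% of PGA_AGGREGATE_TARGET
    return smm_max, smm_px_max

def per_slave_limit_mb(pga_aggregate_target_mb, dop):
    """Workarea limit per parallel slave, assuming an even share of _SMM_PX_MAX_SIZE."""
    smm_max, smm_px_max = pre_10gr2_limits(pga_aggregate_target_mb)
    return min(smm_max, smm_px_max / dop)

limits = pre_10gr2_limits(3000)
print(limits, per_slave_limit_mb(3000, 8), per_slave_limit_mb(3000, 12))
```

With a 3000Mb target, eight slaves are each capped by _SMM_MAX_SIZE, while twelve slaves are capped by their share of _SMM_PX_MAX_SIZE, matching the two cases in the diagram.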
    29. 29. 10gR2 Improvements
       • _SMM_MAX_SIZE now the driver
           • More advanced algorithm
           • _PGA_MAX_SIZE = 2 * _SMM_MAX_SIZE
       • Parallel operations
           • _SMM_PX_MAX_SIZE = 50% * PGA_AGGREGATE_TARGET
           • When DOP <= 5, _SMM_MAX_SIZE is used
           • When DOP > 5, _SMM_PX_MAX_SIZE / DOP is used

       PGA_AGGREGATE_TARGET   _SMM_MAX_SIZE
       <= 500Mb               20% * PGA_AGGREGATE_TARGET
       500Mb - 1000Mb         100Mb
       1000Mb +               10% * PGA_AGGREGATE_TARGET

       Jože Senegačnik - Advanced Management Of Working Areas In Oracle 9i/10g, presented at UKOUG 2005
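The 10gR2 rules on this slide can be written down as a worked example in the same way. This Python sketch only encodes the thresholds as the slide states them; the function names are hypothetical and actual behaviour may differ by version and patch level.

```python
def smm_max_size_10gr2_mb(pga_aggregate_target_mb):
    """_SMM_MAX_SIZE in Mb, by PGA_AGGREGATE_TARGET band as listed on the slide."""
    if pga_aggregate_target_mb <= 500:
        return pga_aggregate_target_mb * 20 / 100
    if pga_aggregate_target_mb <= 1000:
        return 100.0
    return pga_aggregate_target_mb * 10 / 100

def workarea_limit_10gr2_mb(pga_aggregate_target_mb, dop=1):
    """Per-process workarea limit in Mb: serial and DOP <= 5 use _SMM_MAX_SIZE,
    higher degrees of parallelism use _SMM_PX_MAX_SIZE / DOP."""
    if dop <= 5:
        return smm_max_size_10gr2_mb(pga_aggregate_target_mb)
    smm_px_max = pga_aggregate_target_mb * 50 / 100
    return smm_px_max / dop

print(smm_max_size_10gr2_mb(4000),
      workarea_limit_10gr2_mb(4000, dop=4),
      workarea_limit_10gr2_mb(4000, dop=8))
```

Unlike the earlier releases, the limit is no longer pinned at 100Mb once the target passes 1000Mb; it keeps scaling with PGA_AGGREGATE_TARGET.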
    30. 30. PGA Target Advisor
       select trunc(pga_target_for_estimate/1024/1024) pga_target_for_estimate
       ,      to_char(pga_target_factor * 100,'999.9') ||'%' pga_target_factor
       ,      trunc(bytes_processed/1024/1024) bytes_processed
       ,      trunc(estd_extra_bytes_rw/1024/1024) estd_extra_bytes_rw
       ,      to_char(estd_pga_cache_hit_percentage,'999') || '%' estd_pga_cache_hit_percentage
       ,      estd_overalloc_count
       from   v$pga_target_advice
       /

       PGA Target For PGA Tgt                  Estimated Extra    Estimated PGA   Estimated
       Estimate Mb    Factor  Bytes Processed  Bytes Read/Written Cache Hit %     Overallocation Count
       -------------- ------- ---------------- ------------------ --------------- --------------------
                5,376   12.5%        5,884,017          7,279,799             45%                  113
               10,752   25.0%        5,884,017          3,593,510             62%                    8
               21,504   50.0%        5,884,017          3,140,993             65%                    0
               32,256   75.0%        5,884,017          3,104,894             65%                    0
               43,008  100.0%        5,884,017          2,300,826             72%                    0
               51,609  120.0%        5,884,017          2,189,160             73%                    0
               60,211  140.0%        5,884,017          2,189,160             73%                    0
               68,812  160.0%        5,884,017          2,189,160             73%                    0
               77,414  180.0%        5,884,017          2,189,160             73%                    0
               86,016  200.0%        5,884,017          2,189,160             73%                    0
              129,024  300.0%        5,884,017          2,189,160             73%                    0
              172,032  400.0%        5,884,017          2,189,160             73%                    0
              258,048  600.0%        5,884,017          2,189,160             73%                    0
    31. 31. Beware Of Temporal Data Affecting The Optimizer
       • Slowly Changing Dimensions
           • Cover ranges of time
           • "From" and "To" DATE columns define applicability
           • Need BETWEEN operator to retrieve rows for a reporting point in time
               SELECT * FROM d_customer
               WHERE '15/01/2005' BETWEEN valid_from AND valid_to

       Month 1 (1st Jan, 2004)
       CUSTOMER
       CUSTOMER_ID   NAME           CUSTOMER_TYPE
       487438        Jeff Moss      SME
       D_CUSTOMER
       CUSTOMER_ID   NAME           CUSTOMER_TYPE   VALID_FROM   VALID_TO
       487438        Jeff Moss      SME             01/01/2004

       Month 2 (1st Feb, 2004)
       CUSTOMER
       CUSTOMER_ID   NAME           CUSTOMER_TYPE
       487438        Jeff Moss      I & C
       839398        Mark Rittman   SME
       D_CUSTOMER
       CUSTOMER_ID   NAME           CUSTOMER_TYPE   VALID_FROM   VALID_TO
       487438        Jeff Moss      SME             01/01/2004   31/01/2004
       487438        Jeff Moss      I & C           01/02/2004
       839398        Mark Rittman   SME             01/02/2004
    32. 32. Dependent Predicates
       • When multiple predicates exist, individual selectivities are combined using standard probability math [1]:
           • P1 AND P2
               • S(P1 & P2) = S(P1) * S(P2)
           • P1 OR P2
               • S(P1 | P2) = S(P1) + S(P2) - [S(P1) * S(P2)]
       • Only valid if the predicates are independent, otherwise…
           • Incorrect selectivity estimate
           • Incorrect cardinality estimate
           • Potentially suboptimal execution plan
       • BETWEEN is multiple predicates!
       • Also known as Correlated Columns [2]
       [1] Wolfgang Breitling, Fallacies Of The Cost Based Optimizer
       [2] Jonathan Lewis, Cost-Based Oracle Fundamentals, Chapter 6
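The two formulas on this slide, and the way they go wrong for correlated date columns, can be shown with exact fractions. The Python sketch below is an illustration under assumptions: the 12-row table with one validity range per month of 2005 is hypothetical, and the selectivities are true row fractions rather than the optimizer's own statistics-based estimates, so the numbers will not match a real execution plan.

```python
import calendar
from datetime import date
from fractions import Fraction

def sel_and(s1, s2):
    """S(P1 & P2) = S(P1) * S(P2), valid only for independent predicates."""
    return s1 * s2

def sel_or(s1, s2):
    """S(P1 | P2) = S(P1) + S(P2) - S(P1) * S(P2)."""
    return s1 + s2 - s1 * s2

# Hypothetical SCD table: one (from_date, to_date) row per month of 2005.
ROWS = [(date(2005, m, 1), date(2005, m, calendar.monthrange(2005, m)[1]))
        for m in range(1, 13)]
point = date(2005, 10, 9)

# "point BETWEEN from_date AND to_date" is really two predicates.
s_from = Fraction(sum(f <= point for f, _ in ROWS), len(ROWS))   # from_date <= point
s_to = Fraction(sum(t >= point for _, t in ROWS), len(ROWS))     # to_date >= point

estimated_rows = sel_and(s_from, s_to) * len(ROWS)
actual_rows = sum(f <= point <= t for f, t in ROWS)
print(estimated_rows, actual_rows)
```

Multiplying the two selectivities predicts 2.5 rows, yet exactly one range can contain any given day, because from_date and to_date move together. That gap is the dependent-predicate problem.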
    33. 33. Some Test Tables…
       • Consider these 3 test tables…
       • 12 records in an SCD type table
       [Figure: contents of TEST_12_DISTINCT_TD, TEST_2_DISTINCT_TD and TEST_1_DISTINCT_TD]
    34. 34. Optimizer Gets Incorrect Cardinality
       select * from test_1_distinct_td
       where to_date('09-OCT-2005','DD-MON-YYYY') between from_date and to_date;

              KEY NON_KEY_AT FROM_DATE TO_DATE
       ---------- ---------- --------- ---------
                1 Jeff       01-JAN-05 31-DEC-05
                2 Mark       01-FEB-05 31-DEC-05
                3 Doug       01-MAR-05 31-DEC-05
                4 Niall      01-APR-05 31-DEC-05
                5 Tom        01-MAY-05 31-DEC-05
                6 Jonathan   01-JUN-05 31-DEC-05
                7 Lisa       01-JUL-05 31-DEC-05
                8 Cary       01-AUG-05 31-DEC-05
                9 Mogens     01-SEP-05 31-DEC-05
               10 Anjo       01-OCT-05 31-DEC-05

       10 rows selected.

       Execution Plan
       ----------------------------------------------------------------------------------------
       | Id  | Operation         | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
       ----------------------------------------------------------------------------------------
       |   0 | SELECT STATEMENT  |                    |    11 |   264 |     3   (0)| 00:00:01 |
       |*  1 |  TABLE ACCESS FULL| TEST_1_DISTINCT_TD |    11 |   264 |     3   (0)| 00:00:01 |
       ----------------------------------------------------------------------------------------
    35. 35. …And Again
       select * from test_2_distinct_td
       where to_date('09-OCT-2005','DD-MON-YYYY') between from_date and to_date;

              KEY NON_KEY_AT FROM_DATE TO_DATE
       ---------- ---------- --------- ---------
                7 Lisa       01-JUL-05 31-DEC-05
                8 Cary       01-AUG-05 31-DEC-05
                9 Mogens     01-SEP-05 31-DEC-05
               10 Anjo       01-OCT-05 31-DEC-05

       4 rows selected.

       Execution Plan
       ----------------------------------------------------------------------------------------
       | Id  | Operation         | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
       ----------------------------------------------------------------------------------------
       |   0 | SELECT STATEMENT  |                    |    11 |   264 |     3   (0)| 00:00:01 |
       |*  1 |  TABLE ACCESS FULL| TEST_2_DISTINCT_TD |    11 |   264 |     3   (0)| 00:00:01 |
       ----------------------------------------------------------------------------------------
    36. 36. …And Again
       select * from test_12_distinct_td
       where to_date('09-OCT-2005','DD-MON-YYYY') between from_date and to_date;

              KEY NON_KEY_AT FROM_DATE TO_DATE
       ---------- ---------- --------- ---------
               10 Anjo       01-OCT-05 31-OCT-05

       1 row selected.

       Execution Plan
       -----------------------------------------------------------------------------------------
       | Id  | Operation         | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
       -----------------------------------------------------------------------------------------
       |   0 | SELECT STATEMENT  |                     |     4 |    96 |     3   (0)| 00:00:01 |
       |*  1 |  TABLE ACCESS FULL| TEST_12_DISTINCT_TD |     4 |    96 |     3   (0)| 00:00:01 |
       -----------------------------------------------------------------------------------------
    37. 37. Workarounds
       • Ignore it
           • If your query still gets the right plan of course!
       • Hints
           • Force the optimizer to do as you tell it
       • Stored outlines
       • Adjust statistics held against the table
           • Affects any SQL that accesses that object
       • Optimizer Profile (10g)
           • Offline Optimisation [1]
       • Dynamic sampling level 4 or above
           • Samples "single table predicates that reference 2 or more columns"
           • Takes extra time during the parse: minimal but often worth it
       [1] Jonathan Lewis: Cost-Based Oracle Fundamentals, Chapter 2
    38. 38. Dynamic Sampling With A Hint
       select /*+ dynamic_sampling(test_1_distinct_td,4) */ *
       from test_1_distinct_td
       where to_date('09-OCT-2005','DD-MON-YYYY') between from_date and to_date;

              KEY NON_KEY_AT FROM_DATE TO_DATE
       ---------- ---------- --------- ---------
                1 Jeff       01-JAN-05 31-DEC-05
                2 Mark       01-FEB-05 31-DEC-05
                3 Doug       01-MAR-05 31-DEC-05
                4 Niall      01-APR-05 31-DEC-05
                5 Tom        01-MAY-05 31-DEC-05
                6 Jonathan   01-JUN-05 31-DEC-05
                7 Lisa       01-JUL-05 31-DEC-05
                8 Cary       01-AUG-05 31-DEC-05
                9 Mogens     01-SEP-05 31-DEC-05
               10 Anjo       01-OCT-05 31-DEC-05

       10 rows selected.

       Execution Plan
       ----------------------------------------------------------------------------------------
       | Id  | Operation         | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
       ----------------------------------------------------------------------------------------
       |   0 | SELECT STATEMENT  |                    |    10 |   240 |     3   (0)| 00:00:01 |
       |*  1 |  TABLE ACCESS FULL| TEST_1_DISTINCT_TD |    10 |   240 |     3   (0)| 00:00:01 |
       ----------------------------------------------------------------------------------------
    39. 39. Find Out Where Your Query Is At
       • Data Warehouses are big, big, BIG!
           • Big on rows
           • Big on disk storage
           • Big on hardware
           • Big SQL statements issued
               • Lots of data to scan, join and sort
               • Many operations
               • Long running
       • So where is my long running query at?
           • No solid answers here, just food for thought…
    40. 40. A "Big" Query Execution Plan
       | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)|
       --------------------------------------------------------------------------------------------------------------
       | 0 | SELECT STATEMENT | | 1 | 124 | | 49722 (10)|
       | 1 | PX COORDINATOR | | | | | |
       | 2 | PX SEND QC (RANDOM) | :TQ20006 | 1 | 124 | | 49722 (10)|
       | 3 | HASH JOIN | | 1 | 124 | | 49722 (10)|
       | 4 | BUFFER SORT | | | | | |
       | 5 | PX RECEIVE | | 207K| 9510K| | 25982 (9)|
       | 6 | PX SEND BROADCAST | :TQ20000 | 207K| 9510K| | 25982 (9)|
       | 7 | VIEW | | 207K| 9510K| | 25982 (9)|
       | 8 | WINDOW SORT | | 207K| 10M| 26M| 25982 (9)|
       | 9 | MERGE JOIN | | 207K| 10M| | 25976 (9)|
       | 10 | TABLE ACCESS BY INDEX ROWID| AML_T_ANALYSIS_DATE | 1 | 22 | | 2 (0)|
       | 11 | INDEX UNIQUE SCAN | AML_I_ANL_PK | 1 | | | 0 (0)|
       | 12 | SORT AGGREGATE | | 1 | 9 | | |
       | 13 | PX COORDINATOR | | | | | |
       | 14 | PX SEND QC (RANDOM) | :TQ10000 | 1 | 9 | | |
       | 15 | SORT AGGREGATE | | 1 | 9 | | |
       | 16 | PX BLOCK ITERATOR | | 1 | 9 | | 2 (0)|
       | 17 | TABLE ACCESS FULL | AML_T_ANALYSIS_DATE | 1 | 9 | | 2 (0)|
       | 18 | FILTER | | | | | |
       | 19 | FILTER | | | | | |
       | 20 | TABLE ACCESS FULL | AML_T_BILLING_ACCOUNT_DIM| 82M| 2371M| | 5457 (5)|
       | 21 | HASH JOIN | | 18M| 1340M| | 23704 (10)|
       | 22 | HASH JOIN | | 10M| 500M| | 17005 (11)|
       | 23 | PX RECEIVE | | 10M| 265M| | 11304 (14)|
       | 24 | PX SEND HASH | :TQ20003 | 10M| 265M| | 11304 (14)|
       | 25 | BUFFER SORT | | 1 | 124 | | |
       | 26 | VIEW | AML_V_MD_CUH_SID | 10M| 265M| | 11304 (14)|
       | 27 | HASH JOIN | | 10M| 337M| | 11304 (14)|
       | 28 | PX RECEIVE | | 17M| 310M| | 5228 (18)|
       | 29 | PX SEND HASH | :TQ20001 | 17M| 310M| | 5228 (18)|
       | 30 | PX BLOCK ITERATOR | | 17M| 310M| | 5228 (18)|
       | 31 | TABLE ACCESS FULL | AML_T_MEASURE_DIM | 17M| 310M| | 5228 (18)|
       | 32 | PX RECEIVE | | 34M| 461M| | 5958 (10)|
       | 33 | PX SEND HASH | :TQ20002 | 34M| 461M| | 5958 (10)|
       | 34 | PX BLOCK ITERATOR | | 34M| 461M| | 5958 (10)|
       | 35 | TABLE ACCESS FULL | AML_T_CUSTOMER_DIM | 34M| 461M| | 5958 (10)|
       | 36 | PX RECEIVE | | 55M| 1212M| | 5562 (3)|
       | 37 | PX SEND HASH | :TQ20004 | 55M| 1212M| | 5562 (3)|
       | 38 | PX BLOCK ITERATOR | | 55M| 1212M| | 5562 (3)|
       | 39 | TABLE ACCESS FULL | AML_T_CUSTOMER_DIM | 55M| 1212M| | 5562 (3)|
       | 40 | PX RECEIVE | | 94M| 2516M| | 6483 (5)|
       | 41 | PX SEND HASH | :TQ20005 | 94M| 2516M| | 6483 (5)|
       | 42 | PX BLOCK ITERATOR | | 94M| 2516M| | 6483 (5)|
       | 43 | MAT_VIEW ACCESS FULL | AML_M_CD_BAD | 94M| 2516M| | 6483 (5)|

       Features highlighted in the plan:
       • Sorts
       • Aggregations
       • Hash joins
       • Merge joins
       • Table scans
       • Materialized View scans
       • Analytics
       • Parallel Query
       • Pruning
       • Temp Space Use
    41. 41. V$ Views To The Rescue? <ul><li>V$SESSION – Identify your session </li></ul><ul><li>V$SQL_PLAN – Get the execution plan operations </li></ul><ul><li>V$SQL_WORKAREA – Get all the work areas which will be required </li></ul><ul><li>V$SESSION_LONGOPS – Get information on long plan operations </li></ul><ul><li>V$SQL_WORKAREA_ACTIVE – Get the work area(s) being used right now </li></ul><ul><li>V$SESSION columns: SID, SERIAL#, PROGRAM, USERNAME, SQL_ID, SQL_CHILD_NUMBER, SQL_ADDRESS, SQL_HASH_VALUE </li></ul><ul><li>V$SQL_PLAN columns: SQL_ID, CHILD_NUMBER, ADDRESS, HASH_VALUE, OPERATION, ID, PARENT_ID </li></ul><ul><li>V$SESSION_LONGOPS columns: SID, SERIAL#, OPNAME, TARGET, MESSAGE, SQL_ID, SQL_ADDRESS, SQL_HASH_VALUE, ELAPSED_SECONDS </li></ul><ul><li>V$SQL_WORKAREA_ACTIVE columns: SQL_ID, SQL_HASH_VALUE, WORKAREA_ADDRESS, OPERATION_ID, OPERATION_TYPE, POLICY, SID, QCSID, ACTIVE_TIME </li></ul><ul><li>V$SQL_WORKAREA columns: SQL_ID, CHILD_NUMBER, WORKAREA_ADDRESS, OPERATION_ID, OPERATION_TYPE </li></ul>
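The views listed on this slide can be tied together through the session's SID and the cursor's SQL_ID and child number. The queries below are a minimal sketch of that idea, not the script used in the demonstration. The bind variable :sid is a hypothetical placeholder for the SID of the session running the query (the coordinator's SID for a parallel query), and SOFAR, TOTALWORK, TIME_REMAINING, START_TIME and the QCSID column of V$SESSION_LONGOPS are standard columns that the slide does not list.

```sql
-- Sketch only: :sid is a hypothetical bind variable holding the SID of the
-- session (or parallel query coordinator) whose progress you want to see.

-- 1. Plan operations for the session's current cursor, with any work areas
--    that are active right now. Work areas owned by parallel slaves are
--    matched through QCSID.
SELECT p.id,
       p.parent_id,
       p.operation,
       w.operation_type,
       w.policy,
       w.sid AS workarea_sid,
       w.active_time
FROM   v$session s
       JOIN v$sql_plan p
         ON  p.sql_id       = s.sql_id
         AND p.child_number = s.sql_child_number
       LEFT JOIN v$sql_workarea_active w
         ON  w.sql_id       = p.sql_id
         AND w.operation_id = p.id
         AND s.sid IN (w.sid, w.qcsid)
WHERE  s.sid = :sid
ORDER  BY p.id, w.sid;

-- 2. Long-running operations that have not finished yet, for the session
--    itself or for slaves working on its behalf.
SELECT l.sid,
       l.opname,
       l.target,
       l.sofar,
       l.totalwork,
       l.elapsed_seconds,
       l.time_remaining,
       l.message
FROM   v$session_longops l
WHERE  (l.sid = :sid OR l.qcsid = :sid)
AND    l.sofar < l.totalwork
ORDER  BY l.start_time;
```

Plan rows with no matching work area come back with NULLs in the work area columns, so the first query still shows the whole plan when nothing is sorting or hashing at that moment.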
    42. 42. Demonstration
    43. 43. Problems <ul><li>V$SQL_PLAN Bug </li></ul><ul><ul><li>Service Request: 4990863.992 </li></ul></ul><ul><ul><li>Broken in 10gR1, works in 10gR2 </li></ul></ul><ul><ul><li>PARENT_ID corruption </li></ul></ul><ul><ul><ul><li>Rows in this view cannot be linked to their parents because the bug corrupts the PARENT_ID values </li></ul></ul></ul><ul><ul><ul><li>Shows up in TEMP TABLE TRANSFORMATION operations </li></ul></ul></ul><ul><li>Multiple work areas can be active at once, or none at all </li></ul><ul><li>Some operations are not shown in V$SESSION_LONGOPS </li></ul><ul><li>V$SESSION SQL_ID may not identify the executing cursor </li></ul><ul><ul><li>E.g. when refreshing a Materialized View </li></ul></ul>* Test case for bug: http://www.oramoss.demon.co.uk/Code/test_error_v_sql_plan.sql
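Linking each plan row to its parent is exactly what the PARENT_ID problem on this slide breaks. As a minimal sketch, assuming a release where PARENT_ID is intact (10gR2 according to the slide), a hierarchical query can rebuild the plan tree from V$SQL_PLAN. The bind variables :sql_id and :child_number are hypothetical placeholders for the cursor taken from V$SESSION.

```sql
-- Sketch only: :sql_id and :child_number are hypothetical bind variables
-- identifying the cursor (V$SESSION.SQL_ID and SQL_CHILD_NUMBER).
-- Relies on PARENT_ID being correct; where the values are corrupted the
-- tree comes back incomplete or wrongly nested.
SELECT LPAD(' ', 2 * (LEVEL - 1)) || p.operation AS operation,
       p.id,
       p.parent_id
FROM   (SELECT id, parent_id, operation
        FROM   v$sql_plan
        WHERE  sql_id       = :sql_id
        AND    child_number = :child_number) p
START  WITH p.id = 0
CONNECT BY PRIOR p.id = p.parent_id
ORDER  SIBLINGS BY p.id;
```

Filtering to one cursor in the inline view before the CONNECT BY keeps rows from other cursors out of the hierarchy. Comparing the number of rows returned with a plain count of the cursor's rows in V$SQL_PLAN is a quick way to spot a broken parent link.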
    44. 44. Questions ?
    45. 45. References: Papers <ul><li>Table Compression in Oracle 9iR2: A Performance Analysis </li></ul><ul><li>Table Compression in Oracle 9iR2: An Oracle White Paper </li></ul><ul><li>“Fallacies Of The Cost Based Optimizer”, Wolfgang Breitling </li></ul><ul><li>“Scaling To Infinity, Partitioning In Oracle Data Warehouses”, Tim Gorman </li></ul><ul><li>Advanced Management Of Working Areas in Oracle 9i/10g, UKOUG 2005, Joze Senegacnik </li></ul><ul><li>Oracle9i Memory Management: Easier Than Ever, Oracle Open World 2002, Sushil Kumar </li></ul><ul><li>Working with Automatic PGA, Christo Kutrovsky </li></ul><ul><li>Optimising Oracle9i Instance Memory, Ramaswamy, Ramesh </li></ul><ul><li>Oracle Metalink Note 223730.1: Automatic PGA Memory Managment in 9i </li></ul><ul><li>Oracle Metalink Note 147806.1: Oracle9i New Feature: Automated SQL Execution Memory Management </li></ul><ul><li>Oracle Metalink Note 148346.1: Oracle9i Monitoring Automated SQL Execution Memory Management </li></ul><ul><li>Memory Management and Latching Improvements in Oracle Database 9i and 10g, Oracle Open World 2005, Tanel Põder </li></ul><ul><li>If Your Memory Serves You Right…, IOUG Live! 2004, April 2004, Toronto, Canada, Richmond Shee </li></ul><ul><li>Decision Speed: Table Compression In Action </li></ul>
    46. 46. References: Online Presentation / Code <ul><li>http://www.oramoss.demon.co.uk/presentations/fivetuningtipsforyourdatawarehouse.ppt </li></ul><ul><li>http://www.oramoss.demon.co.uk/Code/mgmt_p_get_max_compression_order.prc </li></ul><ul><li>http://www.oramoss.demon.co.uk/Code/test_dml_performance_delete.sql </li></ul><ul><li>http://www.oramoss.demon.co.uk/Code/test_dml_performance_insert.sql </li></ul><ul><li>http://www.oramoss.demon.co.uk/Code/test_dml_performance_update.sql </li></ul><ul><li>http://www.oramoss.demon.co.uk/Code/test_error_v_sql_plan.sql </li></ul><ul><li>http://www.oramoss.demon.co.uk/Code/run_big_query.sql </li></ul><ul><li>http://www.oramoss.demon.co.uk/Code/run_big_query_parallel.sql </li></ul><ul><li>http://www.oramoss.demon.co.uk/Code/get_query_progress.sql </li></ul>
