Five Tuning Tips For Your Datawarehouse
Upcoming SlideShare
Loading in...5
×
 

Like this? Share it with your network

Share

Five Tuning Tips For Your Datawarehouse

on

  • 3,389 views

Five tuning tips for enhancing the performance and manageability of a data warehouse.

Five tuning tips for enhancing the performance and manageability of a data warehouse.

Statistics

Views

Total Views
3,389
Views on SlideShare
3,363
Embed Views
26

Actions

Likes
1
Downloads
113
Comments
2

5 Embeds 26

http://www.ukocn.com 15
http://www.slideshare.net 6
http://www.linkedin.com 3
http://sys-as-sysdba.blogspot.com 1
https://www.linkedin.com 1

Accessibility

Categories

Upload Details

Uploaded via as Microsoft PowerPoint

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
  • Thanks!
    Are you sure you want to
    Your message goes here
    Processing…
  • Thanks for the presentation.

    Now embedded at www.ukocn.com

    http://www.ukocn.com/forums/business-intelligence-epm/data-warehousing/five-tuning-tips-your-datawarehouse-jeff-moss
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

Five Tuning Tips For Your Datawarehouse Presentation Transcript

  • 1. Five Tuning Tips For Your Data Warehouse Jeff Moss
  • 2. My First Presentation
    • Yes, my very first presentation
      • For BIRT SIG
      • For UKOUG
    • Useful Advice from friends and colleagues
      • Use graphics where appropriate
      • Find a friendly or familiar face in the audience
      • Imagine your audience is naked!
      • … but like Oracle, be careful when combining advice!
  • 3. Be Careful Combining Advice!
    • Thanks for the opportunity Mark!
  • 4. Agenda
    • My background
    • Five tips
      • Partition for success
      • Squeeze your data with data segment compression
      • Make the most of your PGA memory
      • Beware of temporal data affecting the optimizer
      • Find out where your query is at
    • Questions
  • 5. My Background
    • Independent Consultant
    • 13 years Oracle experience
    • Blog: http://oramossoracle.blogspot.com/
    • Focused on warehousing / VLDB since 1998
    • First project
      • UK Music Sales Data Mart
      • Produces BBC Radio 1 Top 40 chart and many more
      • 2 billion row sales fact table
      • 1 Tb total database size
    • Currently working with Eon UK (Powergen)
      • 4Tb Production Warehouse, 8Tb total storage
      • Oracle Product Stack
  • 6. What Is Partitioning ?
    • “ Partitioning addresses key issues in supporting very large tables and indexes by letting you decompose them into smaller and more manageable pieces called partitions .” Oracle Database Concepts Manual, 10gR2
    • Introduced in Oracle 8.0
    • Numerous improvements since
    • Subpartitioning adds another level of decomposition
    • Partitions and Subpartitions are logical containers
  • 7. Partition To Tablespace Mapping
    • Partitions map to tablespaces
      • Partition can only be in One tablespace
      • Tablespace can hold many partitions
      • Highest granularity is One tablespace per partition
      • Lowest granularity is One tablespace for all the partitions
    • Tablespace volatility
      • Read / Write
      • Read Only
    P_JAN_2005 P_FEB_2005 P_MAR_2005 P_APR_2005 P_MAY_2005 P_JUN_2005 P_JUL_2005 P_AUG_2005 P_SEP_2005 P_OCT_2005 P_NOV_2005 P_DEC_2005 T_Q1_2005 T_Q2_2005 T_Q3_2005 T_Q4_2005 T_Q1_2006 P_JAN_2006 P_FEB_2006 P_MAR_2006 T_Q3_2005 Read / Write Read Only
  • 8. Why Partition ? - Performance
    • Improved query performance
      • Pruning or elimination
      • Partition wise joins
    • Read only partitions
      • Quicker checkpointing
      • Quicker backup
      • Quicker recovery
      • … but it depends on mapping of:
      • partition:tablespace:datafile
    SELECT SUM(sales) FROM part_tab WHERE sales_date BETWEEN ‘01-JAN-2005’ AND ’30-JUN-2005’ Sales Fact Table * Oracle 10gR2 Data Warehousing Manual JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC
  • 9. Why Partition ? - Manageability
    • Archiving
      • Use a rolling window approach
      • ALTER TABLE … ADD/SPLIT/DROP PARTITION…
    • Easier ETL Processing
      • Build a new dataset in a staging table
      • Add indexes and constraints
      • Collect statistics
      • Then swap the staging table for a partition on the target
        • ALTER TABLE…EXCHANGE PARTITION…
    • Easier Maintenance
      • Table partition move, e.g. to compress data
      • Local Index partition rebuild
  • 10. Why Partition ? - Scalability
    • Partition is generally consistent and predictable
      • Assuming an appropriate partitioning key is used
      • …and data has an even distribution across the key
    • Read only approach
      • Scalable backups - read only tablespaces are ignored
      • …so partitions in those tablespaces are ignored
    • Pruning allows consistent query performance
  • 11. Why Partition ? - Availability
    • Offline data impact minimised
      • … depending on granularity
      • Quicker recovery
      • Pruned data not missed
      • EXCHANGE PARTITION
        • Allows offline build
        • Quick swap over
    P_JAN_2005 P_FEB_2005 P_MAR_2005 P_APR_2005 P_MAY_2005 P_JUN_2005 P_JUL_2005 P_AUG_2005 P_SEP_2005 P_OCT_2005 P_NOV_2005 P_DEC_2005 T_Q1_2005 T_Q2_2005 T_Q3_2005 T_Q4_2005 T_Q1_2006 P_JAN_2006 P_FEB_2006 P_MAR_2006 T_Q3_2005 Read / Write Read Only
  • 12. Fact Table Partitioning Transaction Date Load Date
    • Easier ETL Processing
      • Each load deals with only 1 partition
    • No use to end user queries!
      • Can’t prune – Full scans!
    • Harder ETL Processing
      • But still uses EXCHANGE PARTITION
    • Useful to end user queries
      • Allows full pruning capability
    07-JAN-2005 Customer 1 09-JAN-2005 15-JAN-2005 Customer 2 17-JAN-2005 January Partition February Partition 22-JAN-2005 Customer 3 01-FEB-2005 02-FEB-2005 Customer 4 05-FEB-2005 26-FEB-2005 Customer 5 28-FEB-2005 March Partition 06-MAR-2005 Customer 2 07-MAR-2005 12-MAR-2005 Customer 3 15-MAR-2005 Tran Date Customer Load Date April Partition 21-JAN-2005 Customer 7 04-APR-2005 09-APR-2005 Customer 9 10-APR-2005 07-JAN-2005 Customer 1 09-JAN-2005 15-JAN-2005 Customer 2 17-JAN-2005 21-JAN-2005 Customer 7 04-APR-2005 22-JAN-2005 Customer 3 01-FEB-2005 January Partition February Partition 02-FEB-2005 Customer 4 05-FEB-2005 26-FEB-2005 Customer 5 28-FEB-2005 March Partition 06-MAR-2005 Customer 2 07-MAR-2005 12-MAR-2005 Customer 3 15-MAR-2005 Tran Date Customer Load Date April Partition 09-APR-2005 Customer 9 10-APR-2005
  • 13. Watch out for…
    • Partition exchange and table statistics 1
      • Partition stats updated
      • … but Global stats are NOT!
      • Affects queries accessing multiple partitions
      • Solution
        • Gather stats on staging table prior to EXCHANGE
        • Gather stats on partitioned table using GLOBAL
    Jonathan Lewis: Cost-Based Oracle Fundamentals, Chapter 2
  • 14. Partitioning Feature: Characteristic Reason Matrix    Partition Truncation     Exchange Partition    Archiving    Pruning (Partition Elimination)   Partition wise joins  Parallel DML     Local Indexes    Read Only Partitions Availability Scalability Manageability Performance Characteristic: Feature:
  • 15. What Is Data Segment Compression ?
    • Compresses data by eliminating intra block repeated column values
    • Reduces the space required for a segment
      • …but only if there are appropriate repeats!
    • Self contained
    • Lossless algorithm
  • 16. Where Can Data Segment Compression Be Used ?
    • Can be used with a number of segment types
      • Heap & Nested Tables
      • Range or List Partitions
      • Materialized Views
    • Can’t be used with
      • Subpartitions
      • Hash Partitions
      • Indexes – but they have row level compression
      • IOT
      • External Tables
      • Tables that are part of a Cluster
      • LOBs
  • 17. How Does Segment Compression Work ? Database Block Symbol Table Row Data Area 100 Call to discuss bill amount TEL NO YES 3 TEL 4 NO 5 YES 2 Call to discuss bill amount 1 100 1 2 3 4 5 101 Call to discuss new product MAIL NO N/A 8 MAIL 9 N/A 7 Call to discuss new product 6 101 6 7 8 4 9 102 Call to discuss new product TEL YES N/A 10 7 3 5 9 10 102 ID DESCRIPTION CONTACT TYPE OUTCOME FOLLOWUP
  • 18. Pros & Cons
    • Pros
      • Saves space
        • Reduces LIO / PIO
        • Speeds up backup/recovery
        • Improves query response time
      • Transparent
        • To readers
        • … and writers
      • Decreases time to perform some DML
        • Deletes should be quicker
        • Bulk inserts may be quicker
    • Cons
      • Increases CPU load
      • Can only be used on Direct Path operations
        • CTAS
        • Serial Inserts using INSERT /*+ APPEND */
        • Parallel Inserts (PDML)
        • ALTER TABLE…MOVE…
        • Direct Path SQL*Loader
      • Increases time to perform some DML
        • Bulk inserts may be slower
        • Updates are slower
  • 19. Ordering Your Data For Maximum Benefits
    • Colocate data to maximise compression benefits
    • For maximum compression
      • Minimise the total space required by the segment
      • Identify most “compressable” column(s)
    • For optimal access
      • We know how the data is to be queried
      • Order the data by
        • Access path columns
        • Then the next most “compressable” column(s)
    Uniformly distributed Colocated 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 5 5 5 5
  • 20. Get Max Compression Order Package
      • PROCEDURE mgmt_p_get_max_compress_order
      • Argument Name Type In/Out Default?
      • ------------------------------ ----------------------- ------ --------
      • P_TABLE_OWNER VARCHAR2 IN DEFAULT
      • P_TABLE_NAME VARCHAR2 IN
      • P_PARTITION_NAME VARCHAR2 IN DEFAULT
      • P_SAMPLE_SIZE NUMBER IN DEFAULT
      • P_PREFIX_COLUMN1 VARCHAR2 IN DEFAULT
      • P_PREFIX_COLUMN2 VARCHAR2 IN DEFAULT
      • P_PREFIX_COLUMN3 VARCHAR2 IN DEFAULT
      • BEGIN
      • mgmt_p_get_max_compress_order(p_table_owner => ‘AE_MGMT’
      • ,p_table_name =>’BIG_TABLE’
      • ,p_sample_size =>10000);
      • END:
      • /
    Running mgmt_p_get_max_compress_order... ---------------------------------------------------------------------------------------------------- Table : BIG_TABLE Sample Size : 10000 Unique Run ID: 25012006232119 ORDER BY Prefix: ---------------------------------------------------------------------------------------------------- Creating MASTER Table : TEMP_MASTER_25012006232119 Creating COLUMN Table 1: COL1 Creating COLUMN Table 2: COL2 Creating COLUMN Table 3: COL3 ---------------------------------------------------------------------------------------------------- The output below lists each column in the table and the number of blocks/rows and space used when the table data is ordered by only that column, or in the case where a prefix has been specified, where the table data is ordered by the prefix and then that column. From this one can determine if there is a specific ORDER BY which can be applied to to the data in order to maximise compression within the table whilst, in the case of a a prefix being present, ordering data as efficiently as possible for the most common access path(s). ---------------------------------------------------------------------------------------------------- NAME COLUMN BLOCKS ROWS SPACE_GB ============================== ============================== ============ ============ ======== TEMP_COL_001_25012006232119 COL1 290 10000 .0022 TEMP_COL_002_25012006232119 COL2 345 10000 .0026 TEMP_COL_003_25012006232119 COL3 555 10000 .0042
  • 21. Data Warehousing Specifics
    • Star Schema compresses better than Normalized
      • More redundant data
    • Focus on…
      • Fact Tables and Summaries in Star Schema
      • Transaction tables in Normalized Schema
    • Performance Impact 1
      • Space Savings
        • Star schema: 67%
        • Normalized: 24%
      • Query Elapsed Times
        • Star schema: 16.5%
        • Normalized: 10%
    1 - Table Compression in Oracle 9iR2: A Performance Analysis
  • 22. Things To Watch Out For
    • DROP COLUMN is awkward
      • ORA-39726: Unsupported add/drop column operation on compressed tables
      • Uncompress the table and try again - still gives ORA-39726!
    • After UPDATEs data is uncompressed
      • Performance impact
      • Row migration
    • Use appropriate physical design settings
      • PCTFREE 0 - pack each block
      • Large blocksize - reduce overhead / increase repeats per block
  • 23. PGA Memory: What For ?
    • Sorts
      • Standard sorts [SORT]
      • Buffer [BUFFER]
      • Group By [GROUP BY (SORT)]
      • Connect By [CONNECT-BY (SORT)]
      • Rollup [ROLLUP (SORT)]
      • Window [WINDOW (SORT)]
    • Hash Joins [HASH-JOIN]
    • Indexes
      • Maintenance [IDX MAINTENANCE SOR]
      • Bitmap Merge [BITMAP MERGE]
      • Bitmap Create [BITMAP CREATE]
    • Write Buffers [LOAD WRITE BUFFERS]
    Serial Process PGA Dedicated Server Cursors Variables Sort Area [] V$SQL_WORKAREA.OPERATION_TYPE
  • 24. PGA Memory Management: Manual
    • The “old” way of doing things
      • Still available though – even in 10g R2
    • Configuring
      • ALTER SESSION SET WORKAREA_SIZE_POLICY=MANUAL;
      • Initialisation parameter: WORKAREA_SIZE_POLICY=MANUAL
    • Set memory parameters yourself
      • HASH_AREA_SIZE
      • SORT_AREA_SIZE
      • SORT_AREA_RETAINED_SIZE
      • BITMAP_MERGE_AREA_SIZE
      • CREATE_BITMAP_AREA_SIZE
    • Optimal values depend on the type of work 1
      • One size does not fit all!
    1 - Richmond Shee: If Your Memory Serves You Right
  • 25. PGA Memory Management: Automatic
    • The “new” way from 9i R1
      • Default OFF in 9i R1/R2
        • Enabled by setting at session/instance level:
          • WORKAREA_SIZE_POLICY=AUTO
          • PGA_AGGREGATE_TARGET > 0
      • Default ON since 10g R1
    • Oracle dynamically manages the available memory to suit the workload
      • But of course, it’s not perfect!
    Jože Senegačnik - Advanced Management Of Working Areas In Oracle 9i/10g, presented at UKOUG 2005
  • 26. Auto PGA Parameters: Pre 10gR2
    • WORKAREA_SIZE_POLICY
      • Set to AUTO
    • PGA_AGGREGATE_TARGET
      • The target for summed PGA across all processes
      • Can be exceeded if too small
        • Over Allocation
    • _PGA_MAX_SIZE
      • Target maximum PGA size for a single process
      • Default is a fixed value of 200Mb
      • Hidden / Undocumented Parameter
        • Usual caveats apply
  • 27. Auto PGA Parameters : Pre 10gR2
    • _SMM_MAX_SIZE
      • Limit for a single workarea operation for one process
      • Derived Default
        • LEAST(5% of PGA_AGGREGATE_TARGET
        • , 50% of _PGA_MAX_SIZE)
        • Hits limit of 100Mb
          • When PGA_AGGREGATE_TARGET is >= 2000Mb
          • And _PGA_MAX_SIZE is left at default of 200Mb
      • Hidden / Undocumented Parameter
        • Usual caveats apply
  • 28. Auto PGA Parameters : Pre 10gR2
    • _SMM_PX_MAX_SIZE
      • Limit for all the parallel slaves of a single workarea operation
      • Derived Default
        • 30% of PGA_AGGREGATE_TARGET
      • Hidden / Undocumented Parameter
        • Usual caveats apply
      • Parallel slaves still limited
        • _SMM_MAX_SIZE
      • Impacts only when…
    PGA_AGGREGATE_TARGET: 3000Mb _PGA_MAX_SIZE = 200Mb _SMM_MAX_SIZE = 100Mb _SMM_PX_MAX_SIZE = 900Mb Session 1 100Mb Session 2 100Mb Session 3 100Mb Session 4 100Mb Session 5 100Mb Session 6 100Mb Session 7 100Mb Session 8 100Mb Session 9 75Mb Session 10 75Mb Session 11 75Mb Session 12 75Mb Session 1 75Mb Session 2 75Mb Session 3 75Mb Session 4 75Mb Session 5 75Mb Session 6 75Mb Session 7 75Mb Session 8 75Mb
  • 29. 10gR2 Improvements
    • _SMM_MAX_SIZE now the driver
      • More advanced algorithm
      • _PGA_MAX_SIZE = 2 * _SMM_MAX_SIZE
    • Parallel operations
      • _SMM_PX_MAX_SIZE = 50% * PGA_AGGREGATE_TARGET
      • When DOP <=5 then _smm_max_size is used
      • When DOP > 5 _smm_px_max_size / DOP is used
    Jože Senegačnik - Advanced Management Of Working Areas In Oracle 9i/10g, presented at UKOUG 2005 10% * PGA_AGGREGATE_TARGET 1000Mb + 100Mb 500Mb – 1000Mb 20% * PGA_AGGREGATE_TARGET <= 500Mb _SMM_MAX_SIZE PGA_AGGREGATE_TARGET
  • 30. PGA Target Advisor
    • select trunc(pga_target_for_estimate/1024/1024) pga_target_for_estimate
    • , to_char(pga_target_factor * 100,'999.9') ||'%' pga_target_factor
    • , trunc(bytes_processed/1024/1024) bytes_processed
    • , trunc(estd_extra_bytes_rw/1024/1024) estd_extra_bytes_rw
    • , to_char(estd_pga_cache_hit_percentage,'999') ||
    • '%' estd_pga_cache_hit_percentage
    • , estd_overalloc_count
    • from v$pga_target_advice
    • /
    • PGA Target For PGA Tgt Estimated Extra Estimated PGA Estimated
    • Estimate Mb Factor Bytes Processed Bytes Read/Written Cache Hit % Overallocation Count
    • -------------- ------- ---------------- ------------------ --------------- --------------------
    • 5,376 12.5% 5,884,017 7,279,799 45% 113
    • 10,752 25.0% 5,884,017 3,593,510 62% 8
    • 21,504 50.0% 5,884,017 3,140,993 65% 0
    • 32,256 75.0% 5,884,017 3,104,894 65% 0
    • 43,008 100.0% 5,884,017 2,300,826 72% 0
    • 51,609 120.0% 5,884,017 2,189,160 73% 0
    • 60,211 140.0% 5,884,017 2,189,160 73% 0
    • 68,812 160.0% 5,884,017 2,189,160 73% 0
    • 77,414 180.0% 5,884,017 2,189,160 73% 0
    • 86,016 200.0% 5,884,017 2,189,160 73% 0
    • 129,024 300.0% 5,884,017 2,189,160 73% 0
    • 172,032 400.0% 5,884,017 2,189,160 73% 0
    • 258,048 600.0% 5,884,017 2,189,160 73% 0
  • 31. Beware Of Temporal Data Affecting The Optimizer
    • Slowly Changing Dimensions
      • Cover ranges of time
      • “ From” and “To” DATE columns define applicability
      • Need BETWEEN operator to retrieve rows for a reporting point in time
        • SELECT * FROM d_customer
        • WHERE ’15/01/2005’ BETWEEN valid_from AND valid_to
    Month 1 1 st Jan, 2004 Month 2 1 st Feb, 2004 CUSTOMER CUSTOMER_ID NAME CUSTOMER_TYPE 487438 Jeff Moss I & C 839398 Mark Rittman SME D_CUSTOMER CUSTOMER_ID NAME CUSTOMER_TYPE VALID_FROM VALID_TO 487438 Jeff Moss SME 01/01/2004 31/01/2004 487438 Jeff Moss I & C 01/02/2004 839398 Mark Rittman SME 01/02/2004 CUSTOMER CUSTOMER_ID NAME CUSTOMER_TYPE 487438 Jeff Moss SME D_CUSTOMER CUSTOMER_ID NAME CUSTOMER_TYPE VALID_FROM VALID_TO 487438 Jeff Moss SME 01/01/2004
  • 32. Dependent Predicates
    • When multiple predicates exist, individual selectivities are combined using standard probability math 1 :
      • P1 AND P2
        • S(P1 & P2) = S(P1) * S(P2)
      • P1 OR P2
        • S(P1 | P2) = S(P1) + S(P2) – [S(P1) * S(P2)]
    • Only valid if the predicates are independent otherwise…
      • Incorrect selectivity estimate
      • Incorrect cardinality estimate
      • Potentially suboptimal execution plan
    • BETWEEN is multiple predicates!
    • Also known as Correlated Columns 2
    1 – Wolfgang Breitling, Fallacies Of The Cost Based Optimizer 2 – Jonathan Lewis, Cost-Based Oracle Fundamentals, Chapter 6
  • 33. Some Test Tables…
    • Consider these 3 test tables…
    • 12 records in an SCD type table
    TEST_12_DISTINCT_TD TEST_2_DISTINCT_TD TEST_1_DISTINCT_TD
  • 34. Optimizer Gets Incorrect Cardinality select * from test_1_distinct_td where to_date('09-OCT-2005','DD-MON-YYYY') between from_date and to_date; KEY NON_KEY_AT FROM_DATE TO_DATE ---------- ---------- --------- --------- 1 Jeff 01-JAN-05 31-DEC-05 2 Mark 01-FEB-05 31-DEC-05 3 Doug 01-MAR-05 31-DEC-05 4 Niall 01-APR-05 31-DEC-05 5 Tom 01-MAY-05 31-DEC-05 6 Jonathan 01-JUN-05 31-DEC-05 7 Lisa 01-JUL-05 31-DEC-05 8 Cary 01-AUG-05 31-DEC-05 9 Mogens 01-SEP-05 31-DEC-05 10 Anjo 01-OCT-05 31-DEC-05 10 rows selected. Execution Plan ---------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 11 | 264 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_1_DISTINCT_TD | 11 | 264 | 3 (0)| 00:00:01 | ----------------------------------------------------------------------------------------
  • 35. …And Again select * from test_2_distinct_td where to_date('09-OCT-2005','DD-MON-YYYY') between from_date and to_date; KEY NON_KEY_AT FROM_DATE TO_DATE ---------- ---------- --------- --------- 7 Lisa 01-JUL-05 31-DEC-05 8 Cary 01-AUG-05 31-DEC-05 9 Mogens 01-SEP-05 31-DEC-05 10 Anjo 01-OCT-05 31-DEC-05 4 rows selected. Execution Plan ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 11 | 264 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_2_DISTINCT_TD | 11 | 264 | 3 (0)| 00:00:01 | ----------------------------------------------------------------------------------------
  • 36. … And Again select * from test_12_distinct_td where to_date('09-OCT-2005','DD-MON-YYYY') between from_date and to_date; KEY NON_KEY_AT FROM_DATE TO_DATE ---------- ---------- --------- --------- 10 Anjo 01-OCT-05 31-OCT-05 1 row selected. Execution Plan ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 4 | 96 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_12_DISTINCT_TD | 4 | 96 | 3 (0)| 00:00:01 | -----------------------------------------------------------------------------------------
  • 37. Workarounds
    • Ignore it
      • If your query still gets the right plan of course!
    • Hints
      • Force the optimizer to do as you tell it
    • Stored outlines
    • Adjust statistics held against the table
      • Affects any SQL that accesses that object
    • Optimizer Profile (10g)
      • Offline Optimisation 1
    • Dynamic sampling level 4 or above
      • Samples “ single table predicates that reference 2 or more columns ”
      • Takes extra time during the parse – minimal but often worth it
    1 - Jonathan Lewis: Cost-Based Oracle Fundamentals, Chapter 2
  • 38. Dynamic Sampling With A Hint select /*+ dynamic_sampling(test_1_distinct_td,4) */ * from test_1_distinct_td where to_date('09-OCT-2005','DD-MON-YYYY') between from_date and to_date; KEY NON_KEY_AT FROM_DATE TO_DATE ---------- ---------- --------- --------- 1 Jeff 01-JAN-05 31-DEC-05 2 Mark 01-FEB-05 31-DEC-05 3 Doug 01-MAR-05 31-DEC-05 4 Niall 01-APR-05 31-DEC-05 5 Tom 01-MAY-05 31-DEC-05 6 Jonathan 01-JUN-05 31-DEC-05 7 Lisa 01-JUL-05 31-DEC-05 8 Cary 01-AUG-05 31-DEC-05 9 Mogens 01-SEP-05 31-DEC-05 10 Anjo 01-OCT-05 31-DEC-05 10 rows selected. Execution Plan ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 240 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_1_DISTINCT_TD | 10 | 240 | 3 (0)| 00:00:01 | ----------------------------------------------------------------------------------------
  • 39. Find Out Where Your Query Is At
    • Data Warehouses are big, big, BIG!
      • Big on rows
      • Big on disk storage
      • Big on hardware
      • Big SQL statements issued
        • Lots of data to scan, join and sort
        • Many operations
        • Long running
    • So where is my long running query at ?
      • No solid answers here, just food for thought…
  • 40. A “Big” Query Execution Plan
    • | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)|
    • --------------------------------------------------------------------------------------------------------------
    • | 0 | SELECT STATEMENT | | 1 | 124 | | 49722 (10)|
    • | 1 | PX COORDINATOR | | | | | |
    • | 2 | PX SEND QC (RANDOM) | :TQ20006 | 1 | 124 | | 49722 (10)|
    • | 3 | HASH JOIN | | 1 | 124 | | 49722 (10)|
    • | 4 | BUFFER SORT | | | | | |
    • | 5 | PX RECEIVE | | 207K| 9510K| | 25982 (9)|
    • | 6 | PX SEND BROADCAST | :TQ20000 | 207K| 9510K| | 25982 (9)|
    • | 7 | VIEW | | 207K| 9510K| | 25982 (9)|
    • | 8 | WINDOW SORT | | 207K| 10M| 26M| 25982 (9)|
    • | 9 | MERGE JOIN | | 207K| 10M| | 25976 (9)|
    • | 10 | TABLE ACCESS BY INDEX ROWID| AML_T_ANALYSIS_DATE | 1 | 22 | | 2 (0)|
    • | 11 | INDEX UNIQUE SCAN | AML_I_ANL_PK | 1 | | | 0 (0)|
    • | 12 | SORT AGGREGATE | | 1 | 9 | | |
    • | 13 | PX COORDINATOR | | | | | |
    • | 14 | PX SEND QC (RANDOM) | :TQ10000 | 1 | 9 | | |
    • | 15 | SORT AGGREGATE | | 1 | 9 | | |
    • | 16 | PX BLOCK ITERATOR | | 1 | 9 | | 2 (0)|
    • | 17 | TABLE ACCESS FULL | AML_T_ANALYSIS_DATE | 1 | 9 | | 2 (0)|
    • | 18 | FILTER | | | | | |
    • | 19 | FILTER | | | | | |
    • | 20 | TABLE ACCESS FULL | AML_T_BILLING_ACCOUNT_DIM| 82M| 2371M| | 5457 (5)|
    • | 21 | HASH JOIN | | 18M| 1340M| | 23704 (10)|
    • | 22 | HASH JOIN | | 10M| 500M| | 17005 (11)|
    • | 23 | PX RECEIVE | | 10M| 265M| | 11304 (14)|
    • | 24 | PX SEND HASH | :TQ20003 | 10M| 265M| | 11304 (14)|
    • | 25 | BUFFER SORT | | 1 | 124 | | |
    • | 26 | VIEW | AML_V_MD_CUH_SID | 10M| 265M| | 11304 (14)|
    • | 27 | HASH JOIN | | 10M| 337M| | 11304 (14)|
    • | 28 | PX RECEIVE | | 17M| 310M| | 5228 (18)|
    • | 29 | PX SEND HASH | :TQ20001 | 17M| 310M| | 5228 (18)|
    • | 30 | PX BLOCK ITERATOR | | 17M| 310M| | 5228 (18)|
    • | 31 | TABLE ACCESS FULL | AML_T_MEASURE_DIM | 17M| 310M| | 5228 (18)|
    • | 32 | PX RECEIVE | | 34M| 461M| | 5958 (10)|
    • | 33 | PX SEND HASH | :TQ20002 | 34M| 461M| | 5958 (10)|
    • | 34 | PX BLOCK ITERATOR | | 34M| 461M| | 5958 (10)|
    • | 35 | TABLE ACCESS FULL | AML_T_CUSTOMER_DIM | 34M| 461M| | 5958 (10)|
    • | 36 | PX RECEIVE | | 55M| 1212M| | 5562 (3)|
    • | 37 | PX SEND HASH | :TQ20004 | 55M| 1212M| | 5562 (3)|
    • | 38 | PX BLOCK ITERATOR | | 55M| 1212M| | 5562 (3)|
    • | 39 | TABLE ACCESS FULL | AML_T_CUSTOMER_DIM | 55M| 1212M| | 5562 (3)|
    • | 40 | PX RECEIVE | | 94M| 2516M| | 6483 (5)|
    • | 41 | PX SEND HASH | :TQ20005 | 94M| 2516M| | 6483 (5)|
    • | 42 | PX BLOCK ITERATOR | | 94M| 2516M| | 6483 (5)|
    • | 43 | MAT_VIEW ACCESS FULL | AML_M_CD_BAD | 94M| 2516M| | 6483 (5)|
    • Sorts
    • Aggregations
    • Hash joins
    • Merge joins
    • Table scans
    • Materialized View scans
    • Analytics
    • Parallel Query
    • Pruning
    • Temp Space Use
  • 41. V$ Views To The Rescue ?
    • V$SESSION – Identify your session
    • V$SQL_PLAN – Get the execution plan operations
    • V$SQL_WORKAREA – Get all the work areas which will be required
    • V$SESSION_LONGOPS – Get information on long plan operations
    • V$SQL_WORKAREA_ACTIVE – Get the work area(s) being used right now
    V$SESSION SID SERIAL# PROGRAM USERNAME SQL_ID SQL_CHILD_NUMBER SQL_ADDRESS SQL_HASH_VALUE V$SQL_PLAN SQL_ID CHILD_NUMBER ADDRESS HASH_VALUE OPERATION ID PARENT_ID V$SESSION_LONGOPS SID SERIAL# OPNAME TARGET MESSAGE SQL_ID SQL_ADDRESS SQL_HASH_VALUE ELAPSED_SECONDS V$SQL_WORKAREA_ACTIVE SQL_ID SQL_HASH_VALUE WORKAREA_ADDRESS OPERATION_ID OPERATION_TYPE POLICY SID QCSID ACTIVE_TIME V$SQL_WORKAREA SQL_ID CHILD_NUMBER WORKAREA_ADDRESS OPERATION_ID OPERATION_TYPE
  • 42. Demonstration
  • 43. Problems
    • V$SQL_PLAN Bug
      • Service Request: 4990863.992
      • Broken in 10gR1, Works in 10gR2
      • PARENT_ID corruption
        • Can’t link rows in this view to their parents as the values are corrupted due to this bug
        • Shows up in TEMP TABLE TRANSFORMATION operations
    • Multiple Work Areas can be active…or None
    • Some operations are not shown in Long ops
    • V$SESSION sql_id may not be the executing cursor
      • E.g. for refreshing Materialized View
    * Test case for bug: http://www.oramoss.demon.co.uk/Code/test_error_v_sql_plan.sql
  • 44. Questions ?
  • 45. References: Papers
    • Table Compression in Oracle 9iR2: A Performance Analysis
    • Table Compression in Oracle 9iR2: An Oracle White Paper
    • “Fallacies Of The Cost Based Optimizer”, Wolfgang Breitling
    • “ Scaling To Infinity, Partitioning In Oracle Data Warehouses”, Tim Gorman
    • Advanced Management Of Working Areas in Oracle 9i/10g, UKOUG 2005, Joze Senegacnik
    • Oracle9i Memory Management: Easier Than Ever, Oracle Open World 2002 , Sushil Kumar
    • Working with Automatic PGA , Christo Kutrovsky
    • Optimising Oracle9i Instance Memory, Ramaswamy, Ramesh
    • Oracle Metalink Note 223730.1 : Automatic PGA Memory Managment in 9i
    • Oracle Metalink Note 147806.1 : Oracle9i New Feature: Automated SQL Execution Memory Management
    • Oracle Metalink Note 148346.1 : Oracle9i Monitoring Automated SQL Execution Memory Management
    • Memory Management and Latching Improvements in Oracle Database 9i and 10g , Oracle Open World 2005, Tanel Pőder
    • If Your Memory Serves You Right… , IOUG Live! 2004, April 2004, Toronto, Canada, Richmond Shee
    • Decision Speed: Table Compression In Action
  • 46. References: Online Presentation / Code
    • http://www.oramoss.demon.co.uk/presentations/fivetuningtipsforyourdatawarehouse.ppt
    • http://www.oramoss.demon.co.uk/Code/mgmt_p_get_max_compression_order.prc
    • http://www.oramoss.demon.co.uk/Code/test_dml_performance_delete.sql
    • http://www.oramoss.demon.co.uk/Code/test_dml_performance_insert.sql
    • http://www.oramoss.demon.co.uk/Code/test_dml_performance_update.sql
    • http://www.oramoss.demon.co.uk/Code/test_error_v_sql_plan.sql
    • htt p://www.oramoss.demon.co.uk/Code/run_big_query.sql
    • htt p://www.oramoss.demon.co.uk/Code/run_big_query_parallel.sql
    • htt p://www.oramoss.demon.co.uk/Code/get_query_progress.sql