Your SlideShare is downloading. ×
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Understanding histogramppt.prn
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Understanding histogramppt.prn

1,217

Published on

Very simple and clear explaination for

Very simple and clear explaination for

Published in: Technology
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
1,217
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
94
Comments
0
Likes
1
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. Understanding Histograms in the Oracle Database Robert Gaydos R Wave Solutions Robert.Gaydos@rwavesolutions.com Copyright © R Wave Solutions 2008 All rights reserved
  • 2. Addenda What is a Histogram? Types of Histograms How to create Histograms How to identify existing Histograms Rules and tips when creating Histograms Q&A Copyright © R Wave Solutions 2008 All rights reserved
  • 3. What is a Histogram? A histogram holds the data distribution of values within a column of a table. – Number of occurrences for a specific value/range – Used by CBO to optimize a query. It is collected by using DBMS_STATS. – By default DBMS_STATS does not collect histogram stats. Min/Max Value and # of distinct values NOTE: DBMS_STAT is used to delete histogram data. Copyright © R Wave Solutions 2008 All rights reserved
  • 4. Types of Histograms Two types of Histograms – frequency histograms – height-balanced histograms – Type of histogram is stored in the HISTOGRAM column of the *tab_col_statistics views (USER/DBA) – Value = (‘HEIGHT BALANCED’, ‘FREQUENCY’, or ‘NONE’) Copyright © R Wave Solutions 2008 All rights reserved
  • 5. Histogram and Buckets When Histograms are created the number of buckets can be specified. It is this number that controls the type of histogram created. # Buckets = # of Rows of information. When Distinct Values <= Buckets – Then Frequency Histogram is created – Else Height-Balanced Histogram is created Copyright © R Wave Solutions 2008 All rights reserved
  • 6. Frequency Histograms Each value of the column corresponds to a single bucket of the histogram. Each bucket contains the number of occurrences of that single value. Copyright © R Wave Solutions 2008 All rights reserved
  • 7. Frequency Histograms Example Data Sorted A A Results B B C B Bucket 1 A = 1 B B C C Bucket 2 B = 3 C C C C Bucket 3 C = 6 C C C C Bucket 4 D = 1 B C D D Bucket 5 E = 1 E E Copyright © R Wave Solutions 2008 All rights reserved
  • 8. Height-Balanced Histograms In a height-balanced histogram, the column values are divided into bands so that each band contains approximately the same number of rows. The useful information that the histogram provides is where in the range of values the endpoints fall. Copyright © R Wave Solutions 2008 All rights reserved
  • 9. Height-Balanced Example 1 Data Sorted A A Results B B C B EPN EPV B B C C 0 B C C C C 1 C C C C C 2 C B C D D 3 E E E Buckets = 4 Copyright © R Wave Solutions 2008 All rights reserved
  • 10. The DB only stores 0-B Height-Balanced Example 1 2-C 3-E Data Sorted A A Results Notice that B B C EPN EPV C B Crosses Buckets. B B C C 0 B C C This is howC C 1 C database knows C C 2 C C C which values are B C D popular D 3 E E E Buckets = 4 Copyright © R Wave Solutions 2008 All rights reserved
  • 11. How to Create a Histogram Created by using DBMS_STATS.GATHER_TABLE_STATS METHOD_OPT => 'FOR COLUMNS SIZE <# of buckets> <Column Name>‘ execute dbms_stats.gather_table_stats (ownname => 'oe', tabname => 'inventories', METHOD_OPT => 'FOR COLUMNS SIZE 10 quantity_on_hand'); Copyright © R Wave Solutions 2008 All rights reserved
  • 12. Identify Existing Histograms In 10g – HISTOGRAM column of the *tab_col_statistics views (USER/DBA) In 9 select owner, table_name, column_name, count(*) buckets from dba_histograms where endpoint_number not in (0,1) group by owner, table_name, column_name Copyright © R Wave Solutions 2008 All rights reserved
  • 13. Rules On When To Create Histograms First there are no rules of thumb that are always correct. When creating using dbms_stats, the use of the DEGREE (for parallel) can skew the value of density Running dbms_stats can drop histograms – Use method_opt==>'REPEAT‘ Copyright © R Wave Solutions 2008 All rights reserved
  • 14. Rules On When To Create Histograms Index maintenance overhead is performed during update, insert and delete operations. Histogram maintenance overhead is performed during the analyze/dbms_stats process. Histograms are NOT just for indexed columns. – Adding a histogram to an un-indexed column that is used in a where clause can improve performance. Copyright © R Wave Solutions 2008 All rights reserved
  • 15. Histogram Example select sum(t1.d2*t2.d3*t3.d3) from t1, t2, t3 where t1.fk1 = t2.pk1 and t3.pk1 = t2.fk1 and t3.d2 = 35 and t1.d3 = 0; << This column has 2 values in it In this example column T1.D3 has 2 values and is our most selective criterion, but without a histogram CBO assumes an even distribution of 2 distinct values (density = 0.5). Copyright © R Wave Solutions 2008 All rights reserved
  • 16. SQL*Plan Without Histogram Rows Row Source Operation ------- --------------------------------- 1 SORT AGGREGATE 2088 HASH JOIN 601 TABLE ACCESS BY INDEX ROWID 601 INDEX RANGE SCAN OBJ#(57612) 1000 HASH JOIN 4 TABLE ACCESS FULL OBJ#(57604) filter("T1"."D3"=0) 62500 TABLE ACCESS FULL OBJ#(57605) Copyright © R Wave Solutions 2008 All rights reserved
  • 17. Timing Without Histogram call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 2 1.04 4.91 19013 20231 0 1 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 4 1.04 4.91 19013 20231 0 1 Query = 20231 Copyright © R Wave Solutions 2008 All rights reserved
  • 18. Use DBMS_STATS to Add Histogram execute dbms_stats.gather_table_stats(NULL, 'T1', estimate_percent=>null, method_opt=>'FOR COLUMNS SIZE AUTO d3'); Copyright © R Wave Solutions 2008 All rights reserved
  • 19. Timing With Histogram call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.01 0.00 0 0 0 0 Fetch 2 0.56 2.84 9841 10666 0 1 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 4 0.57 2.84 9841 10666 0 1 Now Query = 10666 It was Query = 20231 ½ the IO Copyright © R Wave Solutions 2008 All rights reserved
  • 20. Plan With Histogram Rows Row Source Operation ------- --------------------------------------- 1 SORT AGGREGATE 2088 HASH JOIN 601 TABLE ACCESS BY INDEX ROWID OBJ#(57606) 601 INDEX RANGE SCAN OBJ#(57612) 1000 TABLE ACCESS BY INDEX ROWID OBJ#(57605) 1005 NESTED LOOPS 4 TABLE ACCESS FULL OBJ#(57604) filter("T1"."D3"=0) 1000 INDEX RANGE SCAN OBJ#(57609) access("T1"."FK1"="T2"."PK1") Copyright © R Wave Solutions 2008 All rights reserved
  • 21. Histograms Opportunities Any column used in a where clause with skewed data Columns that are not queried all the time Reduced overhead for insert, update, delete Copyright © R Wave Solutions 2008 All rights reserved
  • 22. When Not To Create A Histogram No rule of thumb is ever perfect. Do not create Histograms – Evenly distributed data in the column – Columns that are not queried – Do not create them on every column of every table. – PK of a table Copyright © R Wave Solutions 2008 All rights reserved
  • 23. Creating / Maintaining Histograms Table maintenance procedures are important. – Accidentally deleting histogram data is possible. – Histogram data is ONLY Collected/Maintained when stats are collected (DBMS_STATS). – Changing data in the table does not change data stored in the histogram. Copyright © R Wave Solutions 2008 All rights reserved
  • 24. Maintaining Histograms Always check if histogram exists on table before DBMS_STATS is run. – Do not forget to note the # of buckets. – Create tables to hold information. Use METHOD_OPT FOR ALL COLUMNS REPEAT to prevent deletion of histograms data. Distribution ratio change -> Recollect Copyright © R Wave Solutions 2008 All rights reserved
  • 25. Creating Histograms What is the best way to start out creating Histograms? Three popular ways to create. – Salt and Pepper - Create them as you need them. – Little Hammer DBMS_STATS using METHOD_OPT AUTO Should NOT be used after fresh restarted database – Big Hammer DBMS_STATS using SKEWONLY – Over Kill Hammer - FOR ALL COLUMNS Copyright © R Wave Solutions 2008 All rights reserved
  • 26. Other Comments It is the Density of a column that the optimizer considers when accessing a table. Changing the number of buckets changes the density. More Buckets DOES NOT mean better density. Using degree != 1 can affect density. Copyright © R Wave Solutions 2008 All rights reserved
  • 27. Review What is a Histogram? Two types of Histograms in the database How to create Histograms Data is collected by running DBMS_STATS. Different methods to create histograms Bucket size affects density of column. Density of the column affects CBO. Copyright © R Wave Solutions 2008 All rights reserved
  • 28. Thank You for Your Time Copyright © R Wave Solutions 2008 All rights reserved

×