SlideShare a Scribd company logo
1 of 46
Download to read offline
拨开Oracle CBO优化器迷雾,探
究Histogram直方图之秘

刘相兵(Maclean Liu)        ORA-ALLSTARS Exadata用
liu.maclean@gmail.com   户组QQ群:23549328
                                     www.askmaclean.com
About Me
   Email:liu.maclean@gmail.com
   Blog:www.askmaclean.com
   Oracle Certified Database Administrator Master 10g and 11g
   Over 7 years experience with Oracle DBA technology
   Over 8 years experience with Linux technology
   Member Independent Oracle Users Group
   Member All China Users Group
Presents for advanced Oracle topics: RAC, DataGuard, Performance
Tuning and Oracle Internal.




                           www.askmaclean.com            www.askmaclean.com
                                                          www.askmaclean.com
About Me




www.askmaclean.com   www.askmaclean.com
                      www.askmaclean.com
神秘的Histogram直方图迷雾




      www.askmaclean.com   www.askmaclean.com
                            www.askmaclean.com
预备知识:CBO术语
NDV – number of distinct values
Card – cardinality
NULLS: Number of Nulls in Column
TYPE: Histogram Type
#BKTS: Histogram Buckets
UNCOMPBKTS: Histogram Uncompressed Buckets
ENDPTVALS: Histogram End Point Values
Freq 频率直方图
HtBal 高度平衡直方图
DEN: Column Density 密度
sel – selectivity 选择性
详见 http://www.askmaclean.com/archives/cbo-
  terms.html

                www.askmaclean.com   www.askmaclean.com
                                      www.askmaclean.com
预备知识: histgrm$中char的存放
直方图基表histgrm$中使用16进制存放char类型,需要
 转换16进制为str才能看懂
转换函数可以使用hexstr 函数,下载地址:
 http://t.cn/zYkqHoi




            www.askmaclean.com   www.askmaclean.com
                                  www.askmaclean.com
预备知识: DBMS_STATS.AUTO_SAMPLE_SIZE

默认的自动采样大小参数
 DBMS_STATS.AUTO_SAMPLE_SIZE

AUTO_SAMPLE_SIZE时会优先采样5500
 行,dbms_stats包内部算法会评估采样的5500行数据是
 否有效,若无效则会再次采样55000行数据,依此类
 推。

为什么是5500这个数字?

5500这个数字是经过数学论证的。 可以90%以上保证采
  样获得的直方图Histogram的桶buckets中数据分布式
  均匀。


              www.askmaclean.com   www.askmaclean.com
                                    www.askmaclean.com
啥是直方图Histogram?

Histogram直方图用来描绘 columns上的数据分布情况
当数据分布倾斜时Histogram可以有效提升cardinality 评
  估的准确度

或曰:“如果数据分布极均匀,那么是否不需要
 Histogram”

是的!例如绝大多数情况下单列主键 上不必要创建直方
 图Histogram(即便存在gap )

使用直方图的直接目的是为了让谓词列(column
 predicate)得到更好的选择性评估


             www.askmaclean.com   www.askmaclean.com
                                   www.askmaclean.com
什么是数据倾斜?Data Skew?
select source,count(*) from data_skew group by source order
  by 2,1;


                                        倾斜的数据分布
                 600000

                 500000

                 400000

                 300000

                 200000                                           合计

                 100000

                      0




                          www.askmaclean.com     www.askmaclean.com
                                                  www.askmaclean.com
数据倾斜+没有直方图啥后果?
SQL> exec dbms_stats.gather_table_stats(user,‘DATA_SKEW’,method_opt=>‘FOR ALL
  COLUMNS SIZE 1’);           ==》SIZE 1 不收histogram直方图
SQL> select count(*) from DATA_SKEW where source='Google Search';
 COUNT(*)
----------
   566566
SQL> explain plan for select 1 from DATA_SKEW where source='Google Search';
SQL> SELECT * FROM TABLE(dbms_xplan.display);
-------------------------------------------------------------------------------
| Id | Operation             | Name         | Rows | Bytes | Cost (%CPU)| Time       |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT |                          | 6341 | 82433 | 1256 (2)| 00:00:16 |
|* 1 | TABLE ACCESS FULL| DATA_SKEW | 6341 | 82433 | 1256 (2)| 00:00:16 |


1 - filter("SOURCE"='Google Search')


优化器评估 source=‘Google Search’; Cardinality为 6431 ,实际source=‘Google Search’ 共
  566566,


差88倍!! 数据倾斜+没直方图==》优化器Optimizer
 很迷茫、很惆怅。。。。。。
                                                   www.askmaclean.com                     www.askmaclean.com
                                                                                           www.askmaclean.com
没有直方图,优化器怎么算Cardinality?
alter session set events '10053 trace name context forever,level 1';
explain plan for select 1 from DATA_SKEW where source='Google Search';


SINGLE TABLE ACCESS PATH
 -----------------------------------------
 BEGIN Single Table Cardinality Estimation
 -----------------------------------------
 Column (#2): SOURCE(VARCHAR2)
  AvgLen: 13.00 NDV: 158 Nulls: 0 Density: 0.0063291
 Table: DATA_SKEW Alias: DATA_SKEW
  Card: Original: 1001859 Rounded: 6341 Computed: 6340.88 Non Adjusted: 6340.88


Computed: 6340.88  原始计算获得的 Cardinality
Rounded: 6341             round(Computed: 6340.88) 后的结果


非NULL比例= (Card: Original – Nulls ) / Card: Original = 1
没有直方图的情况下


Sel= (1/NDV)*非NULL比例= (1/158) * 1 = 0.0063291

Computed Cardinality= Card Original * Sel= 1001859 * 0.0063291 = 6340.88

                                             www.askmaclean.com        www.askmaclean.com
                                                                        www.askmaclean.com
上例中如果有直方图Cardinality呢?
exec dbms_stats.gather_table_stats(user,'DATA_SKEW',method_opt=>'FOR ALL
  COLUMNS SIZE SKEWONLY', estimate_percent=>100);


SQL> explain plan for select 1 from DATA_SKEW where source='Google Search';


SQL> SELECT * FROM TABLE(dbms_xplan.display);
-------------------------------------------------------------------------------
| Id | Operation             | Name         | Rows | Bytes | Cost (%CPU)| Time    |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT |                          | 566K| 7192K| 1256 (2)| 00:00:16 |
|* 1 | TABLE ACCESS FULL| DATA_SKEW | 566K| 7192K| 1256 (2)| 00:00:16 |
-------------------------------------------------------------------------------


Column (#2): SOURCE(VARCHAR2)
   AvgLen: 13.00 NDV: 161 Nulls: 0 Density: 9.9500e-05
   Histogram: Freq #Bkts: 161 UncompBkts: 999999 EndPtVals: 161
 Table: DATA_SKEW Alias: DATA_SKEW
   Card: Original: 999999 Rounded: 566566 Computed: 566566.00 Non Adjusted:
    566566.00


Histogram: Freq 频率直方图                               #Bkts: 161  161个buckets
UncompBkts  采样获得的 ( Card Original – NULLS)

                                                   www.askmaclean.com                   www.askmaclean.com
                                                                                         www.askmaclean.com
10053 trace中并不列出完整的直方图信息哦
直方图查看: dba_tab_cols、dba_tab_histograms、 sys. histgrm$

SQL> select HISTOGRAM from dba_tab_cols where table_name='DATA_SKEW' and
  column_name='SOURCE';
HISTOGRAM
---------------
FREQUENCY             频率直方图


SOURCE 是字符类型 需要 hexstr(endpoint_value)才能看懂
select endpoint_number, hexstr(endpoint_value) source_char from dba_tab_histograms
   where table_name=‘DATA_SKEW’ and column_name=‘SOURCE’;




                                 www.askmaclean.com                   www.askmaclean.com
                                                                       www.askmaclean.com
拨开云雾见月明—开始探究
Histogram直方图之旅




      www.askmaclean.com   www.askmaclean.com
                            www.askmaclean.com
Histogram直方图的历史
最早追述到Oracle 7版本开始引入Histogram的思想

Oracle 7中使用CBO时选择性评估很不准确

在Oracle 8中引入了Histogram作为optimizer statistics
 新特性,当时Frequency Histograms被称作 "value-
 based" histogram




                 www.askmaclean.com   www.askmaclean.com
                                       www.askmaclean.com
Histogram直方图的历史
版本8~9i默认不收集直方图Histogram,需要手动指定
 method_opt size (默认size=1 仅收集最小/大值,
 distinct值等信息)
从版本10g开始Oracle会自动收集Histogram,
 Histogram是否收集取决于col_usage$中记录的关于
 该列用作SQL中谓词条件的信息和数据分布情况
                                       SMON定期将shared pool中
                                       的谓词使用状况刷新到
                                       col_usage$表中

                                       例如:
                                       Select * from tab where
                                       colA=1;

                                       则记录为对colA充当
                                       EQUALITY_PREDS 
                                       equality predicates等式谓词

                  www.askmaclean.com        www.askmaclean.com
                                             www.askmaclean.com
Optimizer优化器何时使用Histogram?

• 不使用绑定变量,硬编码且cursor_sharing=EXACT造成大
  量硬解析时

• 绑定变量,且_optim_peek_user_binds=true(默认)的硬解
  析时;当_optim_peek_user_binds=false时虽然选择性不参
  考直方图,当硬解析仍可能load histogram

• 11g中ACS(Adaptive Cursor Sharing) bind aware引发再
  次硬解析时

• 其他一些导致现有子游标无法使用的原因

• 。。。。。。。。。。。。。。


                 www.askmaclean.com   www.askmaclean.com
                                       www.askmaclean.com
Optimizer优化器在哪里使用Histogram?

• 有2个地方Oracle Optimizer优化器会用到
  Histogram:
   • 过滤谓词的选择性评估
   • Join连接基数(Cardinality)评估

• 在做Join连接基数(Cardinality)评估时,往往差之
  毫厘谬之千里:
  • 优化器可能选择错误的Join连接方式或顺序。

• 例如分页排序查询 因为基数评估误差导致去用了
  HASH JOIN…..


            www.askmaclean.com   www.askmaclean.com
                                  www.askmaclean.com
Histogram的Buckets桶数
• 对于绝大多数情况默认的75个桶总是合适的

• 最大桶数= 最小值(254,其他因素限制桶数)

• 若频繁出现地distinct值的数据并不多,则将桶数设
  置为大于这个数目往往是有益的。

• 版本12c之前直方图最大桶数为254,12c之后………




           www.askmaclean.com   www.askmaclean.com
                                 www.askmaclean.com
Histogram的其他特性
• 早期版本中远程DBLINK查询时表上的直方图可能不被使用

• 直方图是静态的,当数据更新频繁时可能变得”陈旧”

• 对于超过32个字符的字符型列,超出的那一部分无法在直方
  图中体现。 例如rpad(‘A’,100,‘Z’)和rpad(‘A’,101,‘Z’)被认为
  是同一个value

• 数字和日期在直方图上被精确表示

• Popular and non-popular值被用做帮助判断选择性




                 www.askmaclean.com   www.askmaclean.com
                                       www.askmaclean.com
如何收集直方图
• 推荐使用dbms_stats收集直方图,虽然analyze命令也能收
  集直方图
 EXECUTE DBMS_STATS.GATHER_TABLE_STATS
 ('scott','emp',METHOD_OPT => 'FOR COLUMNS SIZE 20 ename');


• Method_opt参数
 FOR ALL [INDEXED | HIDDEN] COLUMNS [size_clause]
 FOR COLUMNS [size clause] column [size_clause] [,column...]

• Size指定直方图的桶数
 SIZE {integer | REPEAT | AUTO | SKEWONLY}
 Integer – 人工指定直方图桶数 从1~254
 REPEAT - 刷新现有的直方图列上的信息
 AUTO    - 基于数据分布和负载收集直方图
 SKEWONLY- 基于数据分布收集直方图



• 若对直方图不甚了解,推荐使用AUTO或SKEWONLY

                            www.askmaclean.com                www.askmaclean.com
                                                               www.askmaclean.com
Histogram直方图的种类
版本12c之前有2种类型Histogram

• Height Balanced Histogram高度平衡直方图
   • 列值被分成多个buckets
   • 每个bucket包含大致一样数目的行数
   • 当NDV>254时会采用高度平衡直方图(注意dbms_stats采
     样到的NDV未必是实际的NDV)

• Frequency Histogram(Value-Based )频率直方图
   • 该列上每一个值都会具有频率信息
   • 当NDV( number of distinct values)的个数<=最大桶数
     buckets 254个时使用频率直方图

• 在Oracle数据字典基表sys. histgrm$ 上这2种类型Histogram
  的存放方式一样
                 www.askmaclean.com   www.askmaclean.com
                                       www.askmaclean.com
Height Balanced Histogram
• 每个buckets桶里的行数都大致相同,除了最后一个桶
• 最后一个桶中的可能比其他桶中的少
• 每个桶中最大值成为bucket value endpoint_value
• 每个value值占有一个桶的一部分,按比例
select 1 from DATA_SKEW_HB where source='Google Search';
Column (#2):
  NewDensity:0.000403, OldDensity:0.000663 BktCnt:254, PopBktCnt:229, PopValCnt:11, NDV:255
 Column (#2): SOURCE(
  AvgLen: 13 NDV: 255 Nulls: 0 Density: 0.000403
  Histogram: HtBal #Bkts: 254 UncompBkts: 254 EndPtVals: 36
 Table: DATA_SKEW_HB Alias: DATA_SKEW_HB
  Card: Original: 1000093.000000 Rounded: 566982 Computed: 566981.86 Non Adjusted:
   566981.86




                                   www.askmaclean.com                 www.askmaclean.com
                                                                       www.askmaclean.com
Height Balanced Histogram
SQL> select count(distinct source) from data_skew_hb;
COUNT(DISTINCTSOURCE)
---------------------
               255
SQL> exec dbms_stats.gather_table_stats(user,'DATA_SKEW_HB',method_opt=>'FOR ALL
    COLUMNS SIZE SKEWONLY',estimate_percent=>100);

SQL> select histogram from dba_tab_cols where table_name='DATA_SKEW_HB' and
    column_name='SOURCE';
HISTOGRAM
---------------
HEIGHT BALANCED

SQL> select count(*) from dba_tab_histograms where table_name='DATA_SKEW_HB' and
    column_name='SOURCE';
  COUNT(*)
----------
       36
select endpoint_number,hexstr(endpoint_value) from dba_tab_histograms where
table_name='DATA_SKEW_HB' and column_name='SOURCE' order by 1,2

                                      www.askmaclean.com              www.askmaclean.com
                                                                       www.askmaclean.com
Height Balanced Histogram
•     每个buckets桶里的行数都大致相同,除了最后一个桶
•     最后一个桶中的行数可能比其他桶中的少
•     每个桶中最大值成为bucket value endpoint_value
•     每个value值占有一个桶的一部分,按比例
select 1 from DATA_SKEW_HB where source='Google Search';
Column (#2):
     NewDensity:0.000403, OldDensity:0.000663 BktCnt:254, PopBktCnt:229, PopValCnt:11, NDV:255
    Column (#2): SOURCE(
    AvgLen: 13 NDV: 255 Nulls: 0 Density: 0.000403
     Histogram: HtBal #Bkts: 254 UncompBkts: 254 EndPtVals: 36
    Table: DATA_SKEW_HB Alias: DATA_SKEW_HB
     Card: Original: 1000093.000000 Rounded: 566982 Computed: 566981.86 Non Adjusted:
      566981.86




                                     www.askmaclean.com                  www.askmaclean.com
                                                                          www.askmaclean.com
Height Balanced Histogram压缩Bucket




              www.askmaclean.com   www.askmaclean.com
                                    www.askmaclean.com
10053 trace中的UNCOMPBKTS和ENDPTVALS


• 当Histogram直方图类型为frequency histograms( Histogram:
  Freq)时UncompBkts 等于统计信息中采样行数-NULLS, 而
  EndPtVals 等于bucket总数,或者说NDV,因为frequency
  histograms中 NDV=number of buckets




• 当直方图类型为height balanced histograms (Histogram:
  HtBal) UncompBkts 等于bucket的数目(其实也等于10053
  trace中#Bkts的数目),而EndPtVals 等于已经被压缩的
  Histogram的大小,换句话说是等于: select count(*) from
  dba_tab_histograms where table_name=’&TAB’ and
  column_name=’&COL’的实际总和。 通过这2个值对比,可以了
  解到popular值的多少以及数据的倾斜度, 是有很多个大量重复
  的值(popular value)还是仅有几个巨大的重复值。

                  www.askmaclean.com   www.askmaclean.com
                                        www.askmaclean.com
Height Balanced Histogram- popular value

重复出现为endpoint_value的值称为popular value,其Cardinality计算
  公式为:
Comp_Card = Orig_Card * Sel
Sel = (该Popular值的桶数 /总的桶数) * 非NULL比例
非NULL比例=(Orig_Card - NNulls) / Orig_Card
如上例中source=‘Google Search’是一个popular值:

该Popular值的桶数=215-71 =144
非NULL比例= (1000093-0)/1000093=1

则sel = 144/254

Comp_Card = 1000093* 144/254= 566981.86




                      www.askmaclean.com   www.askmaclean.com
                                            www.askmaclean.com
Height Balanced Histogram- non popular value

• 没有重复出现为endpoint_value、或者没有充当过
  endpoint_value的值称为non popular value,其
  Cardinality计算公式为:

• Comp_Card = Orig_Card * Sel

• Sel= Density * 非空比例

• 10.2.0.4以后的 New Density= (总的buckets数- 所有
  popular值的buckets数PopBktCnt)/总的buckets数/ (NDV-
  popular值总的个数POPVALCNT)




                   www.askmaclean.com   www.askmaclean.com
                                         www.askmaclean.com
Height Balanced Histogram- non popular value

select count(*) from data_skew_hb where source='ABC100';

COUNT(*)
----------
      666

10g中10053 trace默认不显示PopBktCnt和POPVALCNT 但是可以利用
  后面的脚本获得
11g 10053 level 1 trace :

NewDensity:0.000403, OldDensity:0.000663 BktCnt:254, PopBktCnt:229,
  PopValCnt:11, NDV:255
Column (#2): SOURCE(
 AvgLen: 13 NDV: 255 Nulls: 0 Density: 0.000403
 Histogram: HtBal #Bkts: 254 UncompBkts: 254 EndPtVals: 36
Table: DATA_SKEW_HB Alias: DATA_SKEW_HB
 Card: Original: 1000093.000000 Rounded: 403 Computed: 403.42 Non Adjusted:
  403.42
                            www.askmaclean.com             www.askmaclean.com
                                                            www.askmaclean.com
POPVALCNT
select count(*) PopValCnt from (select
  endpoint_number,hexstr(endpoint_value) endstr,
  endpoint_number- lag( endpoint_number,1,0) over (order by
  endpoint_number ) as buckets from dba_tab_histograms where
  table_name='DATA_SKEW_HB' and column_name='SOURCE')
  end_diff where buckets>1;

 POPVALCNT
----------
       11




                       www.askmaclean.com       www.askmaclean.com
                                                 www.askmaclean.com
PopBktCnt
select sum(buckets) PopValCnt from (select
  endpoint_number,hexstr(endpoint_value) endstr,
  endpoint_number- lag( endpoint_number,1,0) over (order by
  endpoint_number ) as buckets from dba_tab_histograms where
  table_name='DATA_SKEW_HB' and column_name='SOURCE')
  end_diff where buckets>1;



 POPVALCNT
----------
      229




                       www.askmaclean.com       www.askmaclean.com
                                                 www.askmaclean.com
Popular value的列表
select endstr popvalue,buckets from (select
  endpoint_number,hexstr(endpoint_value) endstr,
  endpoint_number- lag( endpoint_number,1,0) over (order by
  endpoint_number ) as buckets from dba_tab_histograms where
  table_name='DATA_SKEW_HB' and column_name='SOURCE')
  end_diff where buckets>1;




                       www.askmaclean.com       www.askmaclean.com
                                                 www.askmaclean.com
High Balanced Histogram 计算
      Density

10.2.0.4 以后New Desnsity = (总的buckets数- 所有popular值的buckets
  数PopBktCnt)/总的buckets数/ (NDV- popular值总的个数
  POPVALCNT)

= (254-229)/254/(255-11)= 25/255/244= 0.000403

Comp_Card = Orig_Card * Sel= 1000093 * 0.000403 = 403




                         www.askmaclean.com      www.askmaclean.com
                                                  www.askmaclean.com
Frequency Histogram频率直方图
• 每一个bucket桶代表一个列值

• 列上的所有的值均有对应的桶

• 当NDV(采样到的)<min(指定的buckets数量,254)时
  创建频率直方图




            www.askmaclean.com   www.askmaclean.com
                                  www.askmaclean.com
Frequency Histogram频率直方图-等式谓词
Exec dbms_stats.gather_table_stats(user,'DATA_SKEW',estimate_percent=>100);

SQL> select histogram from dba_tab_cols where table_name='DATA_SKEW' and
    column_name='SOURCE';
HISTOGRAM
---------------
FREQUENCY

SQL> select count(*) from data_skew where source='Bing Search';

  COUNT(*)
----------
     9009

Column (#2): SOURCE(VARCHAR2)
  AvgLen: 13.00 NDV: 161 Nulls: 0 Density: 9.9500e-05
  Histogram: Freq #Bkts: 161 UncompBkts: 999999 EndPtVals: 161
 Table: DATA_SKEW Alias: DATA_SKEW
  Card: Original: 999999 Rounded: 9009 Computed: 9009.00 Non Adjusted: 9009.00


                             www.askmaclean.com               www.askmaclean.com
                                                               www.askmaclean.com
Frequency Histogram频率直方图-等式谓词
select endpoint_number,
    hexstr(endpoint_value) endstr,
    endpoint_number - lag(endpoint_number, 1, 0) over(order by endpoint_number)
   as rows_cnt
 from dba_tab_histograms
where table_name = 'DATA_SKEW'
  and column_name = 'SOURCE';




                             www.askmaclean.com               www.askmaclean.com
                                                               www.askmaclean.com
Frequency Histogram频率直方图-非闭包范围
SQL> select count(*) from data_skew where source>= 'Sougo Search';
  COUNT(*)
----------
   150150                               Column (#2): SOURCE(VARCHAR2)
                                      AvgLen: 13.00 NDV: 161 Nulls: 0 Density:
                                    9.9500e-05
                                      Histogram: Freq #Bkts: 161 UncompBkts:
                                    999999 EndPtVals: 161
                                     Table: DATA_SKEW Alias: DATA_SKEW
                                      Card: Original: 999999 Rounded: 150150
                                    Computed: 150149.50 Non Adjusted: 150149.50




                                    Sel= Sum(Bucketsize)/Numrows
                                    = 150150/ 999999




                          www.askmaclean.com            www.askmaclean.com
                                                         www.askmaclean.com
Frequency Histogram频率直方图-闭包范围
SQL> select count(*) from data_skew where source between 'ABC904' and 'ABC94';

  COUNT(*)
----------                              Column (#2): SOURCE(VARCHAR2)
     2664
                                          AvgLen: 13.00 NDV: 161 Nulls: 0 Density:
                                        9.9500e-05
                                          Histogram: Freq #Bkts: 161 UncompBkts:
                                        999999 EndPtVals: 161
                                         Table: DATA_SKEW Alias: DATA_SKEW
                                          Card: Original: 999999 Rounded: 2664
                                        Computed: 2664.00 Non Adjusted: 2664.00



                                        Sel= Sum(Bucketsize)/Numrows
                                        = (666+666+666+666)/ 999999




                             www.askmaclean.com               www.askmaclean.com
                                                               www.askmaclean.com
Histogram带来的一些问题

• 在11g acs特性出现前,绑定变量SQL若第一次代入的变量
  是倾斜值,由于游标共享可能导致今后的执行计划不佳

• 若应用程序没有很好地绑定变量,且列上有直方图,则频繁
  的硬解析可能加剧latch: row cache object

• 由于ACS和其他一些原因作用导致child cursors数量过多:
   • 进一步导致latch:shared pool、library cache等问题
   • 造成shared pool碎片化

• 收集系统中过多列的直方图会导致row cache膨胀,管理成
  本增加

• 欢迎补充。。。。。。。。。。。。
                www.askmaclean.com   www.askmaclean.com
                                      www.askmaclean.com
目前版本Histogram的欠缺
Height-Balanced Histogram中 non popular值使用density
 评估选择性,对于那些出现较频繁,但仍排不上popular榜
 单的值,使用density的评估很难准确
          SQL> select count(*) from DATA_SKEW_HB where
          source='Maclean Search';

            COUNT(*)
          ----------
               2000

          Column (#2): SOURCE(VARCHAR2)
            AvgLen: 13.00 NDV: 256 Nulls: 0 Density: 4.0174e-04
            Histogram: HtBal #Bkts: 254 UncompBkts: 254 EndPtVals: 36
           Table: DATA_SKEW_HB Alias: DATA_SKEW_HB
            Card: Original: 1002093 Rounded: 403 Computed: 402.58
          Non Adjusted: 402.58




                       www.askmaclean.com          www.askmaclean.com
                                                    www.askmaclean.com
Histogram的未来-12c Top Frequency histograms
SQL> select banner from v$version where rownum=1;
Oracle Database 12c Enterprise Edition Release 12.1.0.0.2 - 64bit Beta


SQL> exec dbms_stats.gather_table_stats(user,'DATA_SKEW_HB');


SQL> select histogram from dba_tab_cols where table_name='DATA_SKEW_HB' and
  column_name='SOURCE';


HISTOGRAM
---------------
TOP-FREQUENCY


select count(*) from DATA_SKEW_HB where source='Maclean Search';


 Column (#2):
   NewDensity:0.000001, OldDensity:0.000000 BktCnt:1002091.000000, PopBktCnt:1001999.000000,
    PopValCnt:162, NDV:256
 Column (#2): SOURCE(VARCHAR2)
   AvgLen: 13 NDV: 256 Nulls: 0 Density: 0.000001
   Histogram: Top-Freq #Bkts: 1002091 UncompBkts: 1002091 EndPtVals: 254 ActualVal: yes
 Table: DATA_SKEW_HB Alias: DATA_SKEW_HB
   Card: Original: 1002093.000000 Rounded: 2000 Computed: 2000.00 Non Adjusted: 2000.00

                                    www.askmaclean.com                   www.askmaclean.com
                                                                          www.askmaclean.com
Histogram的未来-12c Hybrid Histograms
在data_skew_hb表上生成更多NDV
SQL> exec dbms_stats.gather_table_stats(user,'DATA_SKEW_HB');
SQL> select histogram from dba_tab_cols where table_name='DATA_SKEW_HB' and
  column_name='SOURCE';
HISTOGRAM
---------------
HYBRID


SQL> select count(*) from DATA_SKEW_HB where source='Maclean Search';


 COUNT(*)
----------
    2000


Column (#2):
   NewDensity:0.000085, OldDensity:0.000000 BktCnt:5405.000000, PopBktCnt:671.000000,
    PopValCnt:5, NDV:10255
 Column (#2): SOURCE(VARCHAR2)
   AvgLen: 13 NDV: 10255 Nulls: 0 Density: 0.000085
   Histogram: Hybrid #Bkts: 254 UncompBkts: 5405 EndPtVals: 254 ActualVal: yes
 Table: DATA_SKEW_HB Alias: DATA_SKEW_HB
   Card: Original: 6001593.000000 Rounded: 2221 Computed: 2220.76 Non Adjusted: 2220.76

                                    www.askmaclean.com                 www.askmaclean.com
                                                                        www.askmaclean.com
你离彻底搞懂10053 trace这天书又近了一步

                             敬请期待,
                             下一讲 揭秘
                             Oracle优化
                             器内幕,一
                             次性彻底搞
                             明白10053
                             trace !!




        www.askmaclean.com    www.askmaclean.com
                               www.askmaclean.com
更多信息


   www.askmaclean.com
     tuning




or
http://www.askmaclean.com/archives/tag/tuning



              www.askmaclean.com   www.askmaclean.com
                                    www.askmaclean.com
Question & Answer




   If you have more questions later, feel free to
   ask




                   www.askmaclean.com               www.askmaclean.com
                                                     www.askmaclean.com

More Related Content

What's hot

Performance tuning a quick intoduction
Performance tuning   a quick intoductionPerformance tuning   a quick intoduction
Performance tuning a quick intoductionRiyaj Shamsudeen
 
Oracle 10g Performance: chapter 00 sampling
Oracle 10g Performance: chapter 00 samplingOracle 10g Performance: chapter 00 sampling
Oracle 10g Performance: chapter 00 samplingKyle Hailey
 
Optimizer Cost Model MySQL 5.7
Optimizer Cost Model MySQL 5.7Optimizer Cost Model MySQL 5.7
Optimizer Cost Model MySQL 5.7I Goo Lee
 
Dbms plan - A swiss army knife for performance engineers
Dbms plan - A swiss army knife for performance engineersDbms plan - A swiss army knife for performance engineers
Dbms plan - A swiss army knife for performance engineersRiyaj Shamsudeen
 
PL/SQL User-Defined Functions in the Read World
PL/SQL User-Defined Functions in the Read WorldPL/SQL User-Defined Functions in the Read World
PL/SQL User-Defined Functions in the Read WorldMichael Rosenblum
 
Fatkulin presentation
Fatkulin presentationFatkulin presentation
Fatkulin presentationEnkitec
 
[Pgday.Seoul 2021] 2. Porting Oracle UDF and Optimization
[Pgday.Seoul 2021] 2. Porting Oracle UDF and Optimization[Pgday.Seoul 2021] 2. Porting Oracle UDF and Optimization
[Pgday.Seoul 2021] 2. Porting Oracle UDF and OptimizationPgDay.Seoul
 
A close encounter_with_real_world_and_odd_perf_issues
A close encounter_with_real_world_and_odd_perf_issuesA close encounter_with_real_world_and_odd_perf_issues
A close encounter_with_real_world_and_odd_perf_issuesRiyaj Shamsudeen
 
DBA Commands and Concepts That Every Developer Should Know
DBA Commands and Concepts That Every Developer Should KnowDBA Commands and Concepts That Every Developer Should Know
DBA Commands and Concepts That Every Developer Should KnowAlex Zaballa
 
Data Tracking: On the Hunt for Information about Your Database
Data Tracking: On the Hunt for Information about Your DatabaseData Tracking: On the Hunt for Information about Your Database
Data Tracking: On the Hunt for Information about Your DatabaseMichael Rosenblum
 
Oracle 10g Performance: chapter 09 enqueues
Oracle 10g Performance: chapter 09 enqueuesOracle 10g Performance: chapter 09 enqueues
Oracle 10g Performance: chapter 09 enqueuesKyle Hailey
 
UKOUG, Oracle Transaction Locks
UKOUG, Oracle Transaction LocksUKOUG, Oracle Transaction Locks
UKOUG, Oracle Transaction LocksKyle Hailey
 
Introduction to Parallel Execution
Introduction to Parallel ExecutionIntroduction to Parallel Execution
Introduction to Parallel ExecutionDoug Burns
 
Advanced RAC troubleshooting: Network
Advanced RAC troubleshooting: NetworkAdvanced RAC troubleshooting: Network
Advanced RAC troubleshooting: NetworkRiyaj Shamsudeen
 
A New View of Database Views
A New View of Database ViewsA New View of Database Views
A New View of Database ViewsMichael Rosenblum
 
Deep review of LMS process
Deep review of LMS processDeep review of LMS process
Deep review of LMS processRiyaj Shamsudeen
 

What's hot (20)

Rac 12c optimization
Rac 12c optimizationRac 12c optimization
Rac 12c optimization
 
Performance tuning a quick intoduction
Performance tuning   a quick intoductionPerformance tuning   a quick intoduction
Performance tuning a quick intoduction
 
Oracle 10g Performance: chapter 00 sampling
Oracle 10g Performance: chapter 00 samplingOracle 10g Performance: chapter 00 sampling
Oracle 10g Performance: chapter 00 sampling
 
Optimizer Cost Model MySQL 5.7
Optimizer Cost Model MySQL 5.7Optimizer Cost Model MySQL 5.7
Optimizer Cost Model MySQL 5.7
 
Dbms plan - A swiss army knife for performance engineers
Dbms plan - A swiss army knife for performance engineersDbms plan - A swiss army knife for performance engineers
Dbms plan - A swiss army knife for performance engineers
 
PL/SQL User-Defined Functions in the Read World
PL/SQL User-Defined Functions in the Read WorldPL/SQL User-Defined Functions in the Read World
PL/SQL User-Defined Functions in the Read World
 
Fatkulin presentation
Fatkulin presentationFatkulin presentation
Fatkulin presentation
 
[Pgday.Seoul 2021] 2. Porting Oracle UDF and Optimization
[Pgday.Seoul 2021] 2. Porting Oracle UDF and Optimization[Pgday.Seoul 2021] 2. Porting Oracle UDF and Optimization
[Pgday.Seoul 2021] 2. Porting Oracle UDF and Optimization
 
A close encounter_with_real_world_and_odd_perf_issues
A close encounter_with_real_world_and_odd_perf_issuesA close encounter_with_real_world_and_odd_perf_issues
A close encounter_with_real_world_and_odd_perf_issues
 
DBA Commands and Concepts That Every Developer Should Know
DBA Commands and Concepts That Every Developer Should KnowDBA Commands and Concepts That Every Developer Should Know
DBA Commands and Concepts That Every Developer Should Know
 
Px execution in rac
Px execution in racPx execution in rac
Px execution in rac
 
Data Tracking: On the Hunt for Information about Your Database
Data Tracking: On the Hunt for Information about Your DatabaseData Tracking: On the Hunt for Information about Your Database
Data Tracking: On the Hunt for Information about Your Database
 
Noinject
NoinjectNoinject
Noinject
 
Oracle 10g Performance: chapter 09 enqueues
Oracle 10g Performance: chapter 09 enqueuesOracle 10g Performance: chapter 09 enqueues
Oracle 10g Performance: chapter 09 enqueues
 
UKOUG, Oracle Transaction Locks
UKOUG, Oracle Transaction LocksUKOUG, Oracle Transaction Locks
UKOUG, Oracle Transaction Locks
 
Introduction to Parallel Execution
Introduction to Parallel ExecutionIntroduction to Parallel Execution
Introduction to Parallel Execution
 
Redo internals ppt
Redo internals pptRedo internals ppt
Redo internals ppt
 
Advanced RAC troubleshooting: Network
Advanced RAC troubleshooting: NetworkAdvanced RAC troubleshooting: Network
Advanced RAC troubleshooting: Network
 
A New View of Database Views
A New View of Database ViewsA New View of Database Views
A New View of Database Views
 
Deep review of LMS process
Deep review of LMS processDeep review of LMS process
Deep review of LMS process
 

Viewers also liked

【Maclean liu技术分享】深入理解oracle中mutex的内部原理
【Maclean liu技术分享】深入理解oracle中mutex的内部原理【Maclean liu技术分享】深入理解oracle中mutex的内部原理
【Maclean liu技术分享】深入理解oracle中mutex的内部原理maclean liu
 
Essential oracle security internal for dba
Essential oracle security internal for dbaEssential oracle security internal for dba
Essential oracle security internal for dbamaclean liu
 
11g新特性 在线实施补丁online patching
11g新特性 在线实施补丁online patching11g新特性 在线实施补丁online patching
11g新特性 在线实施补丁online patchingmaclean liu
 
了解Oracle critical patch update
了解Oracle critical patch update了解Oracle critical patch update
了解Oracle critical patch updatemaclean liu
 
Z-krant: wordt vriend van de Z-krant!
Z-krant: wordt vriend van de Z-krant!Z-krant: wordt vriend van de Z-krant!
Z-krant: wordt vriend van de Z-krant!Paul Rowold
 
09. permendikbud nomor 71 tahun 2013 ttg buku teks pelajaran layak
09. permendikbud nomor 71 tahun 2013 ttg buku teks pelajaran layak09. permendikbud nomor 71 tahun 2013 ttg buku teks pelajaran layak
09. permendikbud nomor 71 tahun 2013 ttg buku teks pelajaran layakIrma Muthiara Sari
 
Content Marketing 101: A Step-by-Step Playbook
Content Marketing 101: A Step-by-Step PlaybookContent Marketing 101: A Step-by-Step Playbook
Content Marketing 101: A Step-by-Step PlaybookJay Acunzo
 
The State of Social PR
The State of Social PR The State of Social PR
The State of Social PR Vuelio
 
Google IOF presentation
Google IOF presentationGoogle IOF presentation
Google IOF presentationJason Potts
 
Benchmarks riet picknicken
Benchmarks riet picknickenBenchmarks riet picknicken
Benchmarks riet picknickenlaurenztack
 
介绍Dbms registry plsql程序包1
介绍Dbms registry plsql程序包1介绍Dbms registry plsql程序包1
介绍Dbms registry plsql程序包1maclean liu
 
Collaboaration tools for non profit agencies
Collaboaration tools for non profit agenciesCollaboaration tools for non profit agencies
Collaboaration tools for non profit agenciesmewren
 
Prioritization Survey Results
Prioritization Survey ResultsPrioritization Survey Results
Prioritization Survey ResultsMichal Bularz
 
Varamobaden Vision 2025
Varamobaden Vision 2025 Varamobaden Vision 2025
Varamobaden Vision 2025 Bjorn Orrenius
 

Viewers also liked (20)

【Maclean liu技术分享】深入理解oracle中mutex的内部原理
【Maclean liu技术分享】深入理解oracle中mutex的内部原理【Maclean liu技术分享】深入理解oracle中mutex的内部原理
【Maclean liu技术分享】深入理解oracle中mutex的内部原理
 
Essential oracle security internal for dba
Essential oracle security internal for dbaEssential oracle security internal for dba
Essential oracle security internal for dba
 
11g新特性 在线实施补丁online patching
11g新特性 在线实施补丁online patching11g新特性 在线实施补丁online patching
11g新特性 在线实施补丁online patching
 
了解Oracle critical patch update
了解Oracle critical patch update了解Oracle critical patch update
了解Oracle critical patch update
 
Z-krant: wordt vriend van de Z-krant!
Z-krant: wordt vriend van de Z-krant!Z-krant: wordt vriend van de Z-krant!
Z-krant: wordt vriend van de Z-krant!
 
09. permendikbud nomor 71 tahun 2013 ttg buku teks pelajaran layak
09. permendikbud nomor 71 tahun 2013 ttg buku teks pelajaran layak09. permendikbud nomor 71 tahun 2013 ttg buku teks pelajaran layak
09. permendikbud nomor 71 tahun 2013 ttg buku teks pelajaran layak
 
Content Marketing 101: A Step-by-Step Playbook
Content Marketing 101: A Step-by-Step PlaybookContent Marketing 101: A Step-by-Step Playbook
Content Marketing 101: A Step-by-Step Playbook
 
BPM & KM: They Converge Quite Nicely
BPM & KM: They Converge Quite NicelyBPM & KM: They Converge Quite Nicely
BPM & KM: They Converge Quite Nicely
 
Devon house
Devon houseDevon house
Devon house
 
Investigacion
InvestigacionInvestigacion
Investigacion
 
The State of Social PR
The State of Social PR The State of Social PR
The State of Social PR
 
Google IOF presentation
Google IOF presentationGoogle IOF presentation
Google IOF presentation
 
Develop Mobile App
Develop Mobile AppDevelop Mobile App
Develop Mobile App
 
Perímetro craneal
Perímetro cranealPerímetro craneal
Perímetro craneal
 
Benchmarks riet picknicken
Benchmarks riet picknickenBenchmarks riet picknicken
Benchmarks riet picknicken
 
介绍Dbms registry plsql程序包1
介绍Dbms registry plsql程序包1介绍Dbms registry plsql程序包1
介绍Dbms registry plsql程序包1
 
Didactisch atelier hb
Didactisch atelier hbDidactisch atelier hb
Didactisch atelier hb
 
Collaboaration tools for non profit agencies
Collaboaration tools for non profit agenciesCollaboaration tools for non profit agencies
Collaboaration tools for non profit agencies
 
Prioritization Survey Results
Prioritization Survey ResultsPrioritization Survey Results
Prioritization Survey Results
 
Varamobaden Vision 2025
Varamobaden Vision 2025 Varamobaden Vision 2025
Varamobaden Vision 2025
 

Similar to 【Maclean liu技术分享】拨开oracle cbo优化器迷雾,探究histogram直方图之秘 0321

Query optimizer vivek sharma
Query optimizer vivek sharmaQuery optimizer vivek sharma
Query optimizer vivek sharmaaioughydchapter
 
Postgres performance for humans
Postgres performance for humansPostgres performance for humans
Postgres performance for humansCraig Kerstiens
 
New Tuning Features in Oracle 11g - How to make your database as boring as po...
New Tuning Features in Oracle 11g - How to make your database as boring as po...New Tuning Features in Oracle 11g - How to make your database as boring as po...
New Tuning Features in Oracle 11g - How to make your database as boring as po...Sage Computing Services
 
Postgres Performance for Humans
Postgres Performance for HumansPostgres Performance for Humans
Postgres Performance for HumansCitus Data
 
Advanced pg_stat_statements: Filtering, Regression Testing & more
Advanced pg_stat_statements: Filtering, Regression Testing & moreAdvanced pg_stat_statements: Filtering, Regression Testing & more
Advanced pg_stat_statements: Filtering, Regression Testing & moreLukas Fittl
 
Tutorial - Learn SQL with Live Online Database
Tutorial - Learn SQL with Live Online DatabaseTutorial - Learn SQL with Live Online Database
Tutorial - Learn SQL with Live Online DatabaseDBrow Adm
 
Histograms : Pre-12c and Now
Histograms : Pre-12c and NowHistograms : Pre-12c and Now
Histograms : Pre-12c and NowAnju Garg
 
ITB2019 Faster DB Development with QB - Andrew Davis
ITB2019 Faster DB Development with QB - Andrew DavisITB2019 Faster DB Development with QB - Andrew Davis
ITB2019 Faster DB Development with QB - Andrew DavisOrtus Solutions, Corp
 
Riyaj: why optimizer_hates_my_sql_2010
Riyaj: why optimizer_hates_my_sql_2010Riyaj: why optimizer_hates_my_sql_2010
Riyaj: why optimizer_hates_my_sql_2010Riyaj Shamsudeen
 
Histograms: Pre-12c and now
Histograms: Pre-12c and nowHistograms: Pre-12c and now
Histograms: Pre-12c and nowAnju Garg
 
In Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry OsborneIn Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry OsborneEnkitec
 
Oracle Database In-Memory Option in Action
Oracle Database In-Memory Option in ActionOracle Database In-Memory Option in Action
Oracle Database In-Memory Option in ActionTanel Poder
 
Managing Statistics for Optimal Query Performance
Managing Statistics for Optimal Query PerformanceManaging Statistics for Optimal Query Performance
Managing Statistics for Optimal Query PerformanceKaren Morton
 
A few things about the Oracle optimizer - 2013
A few things about the Oracle optimizer - 2013A few things about the Oracle optimizer - 2013
A few things about the Oracle optimizer - 2013Connor McDonald
 
Database & Technology 1 _ Tom Kyte _ SQL Techniques.pdf
Database & Technology 1 _ Tom Kyte _ SQL Techniques.pdfDatabase & Technology 1 _ Tom Kyte _ SQL Techniques.pdf
Database & Technology 1 _ Tom Kyte _ SQL Techniques.pdfInSync2011
 
Informix Warehouse Accelerator (IWA) features in version 12.1
Informix Warehouse Accelerator (IWA) features in version 12.1Informix Warehouse Accelerator (IWA) features in version 12.1
Informix Warehouse Accelerator (IWA) features in version 12.1Keshav Murthy
 

Similar to 【Maclean liu技术分享】拨开oracle cbo优化器迷雾,探究histogram直方图之秘 0321 (20)

Query optimizer vivek sharma
Query optimizer vivek sharmaQuery optimizer vivek sharma
Query optimizer vivek sharma
 
Postgres performance for humans
Postgres performance for humansPostgres performance for humans
Postgres performance for humans
 
Oracle 12c SPM
Oracle 12c SPMOracle 12c SPM
Oracle 12c SPM
 
New Tuning Features in Oracle 11g - How to make your database as boring as po...
New Tuning Features in Oracle 11g - How to make your database as boring as po...New Tuning Features in Oracle 11g - How to make your database as boring as po...
New Tuning Features in Oracle 11g - How to make your database as boring as po...
 
Postgres Performance for Humans
Postgres Performance for HumansPostgres Performance for Humans
Postgres Performance for Humans
 
Advanced pg_stat_statements: Filtering, Regression Testing & more
Advanced pg_stat_statements: Filtering, Regression Testing & moreAdvanced pg_stat_statements: Filtering, Regression Testing & more
Advanced pg_stat_statements: Filtering, Regression Testing & more
 
Oracle training in hyderabad
Oracle training in hyderabadOracle training in hyderabad
Oracle training in hyderabad
 
Tutorial - Learn SQL with Live Online Database
Tutorial - Learn SQL with Live Online DatabaseTutorial - Learn SQL with Live Online Database
Tutorial - Learn SQL with Live Online Database
 
Histograms : Pre-12c and Now
Histograms : Pre-12c and NowHistograms : Pre-12c and Now
Histograms : Pre-12c and Now
 
ITB2019 Faster DB Development with QB - Andrew Davis
ITB2019 Faster DB Development with QB - Andrew DavisITB2019 Faster DB Development with QB - Andrew Davis
ITB2019 Faster DB Development with QB - Andrew Davis
 
Riyaj: why optimizer_hates_my_sql_2010
Riyaj: why optimizer_hates_my_sql_2010Riyaj: why optimizer_hates_my_sql_2010
Riyaj: why optimizer_hates_my_sql_2010
 
Histograms: Pre-12c and now
Histograms: Pre-12c and nowHistograms: Pre-12c and now
Histograms: Pre-12c and now
 
In Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry OsborneIn Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry Osborne
 
Oracle Database In-Memory Option in Action
Oracle Database In-Memory Option in ActionOracle Database In-Memory Option in Action
Oracle Database In-Memory Option in Action
 
Managing Statistics for Optimal Query Performance
Managing Statistics for Optimal Query PerformanceManaging Statistics for Optimal Query Performance
Managing Statistics for Optimal Query Performance
 
A few things about the Oracle optimizer - 2013
A few things about the Oracle optimizer - 2013A few things about the Oracle optimizer - 2013
A few things about the Oracle optimizer - 2013
 
SQLQueries
SQLQueriesSQLQueries
SQLQueries
 
Database & Technology 1 _ Tom Kyte _ SQL Techniques.pdf
Database & Technology 1 _ Tom Kyte _ SQL Techniques.pdfDatabase & Technology 1 _ Tom Kyte _ SQL Techniques.pdf
Database & Technology 1 _ Tom Kyte _ SQL Techniques.pdf
 
Oracle SQL Tuning
Oracle SQL TuningOracle SQL Tuning
Oracle SQL Tuning
 
Informix Warehouse Accelerator (IWA) features in version 12.1
Informix Warehouse Accelerator (IWA) features in version 12.1Informix Warehouse Accelerator (IWA) features in version 12.1
Informix Warehouse Accelerator (IWA) features in version 12.1
 

More from maclean liu

Mysql企业备份发展及实践
Mysql企业备份发展及实践Mysql企业备份发展及实践
Mysql企业备份发展及实践maclean liu
 
Oracle専用データ復旧ソフトウェアprm dulユーザーズ・マニュアル
Oracle専用データ復旧ソフトウェアprm dulユーザーズ・マニュアルOracle専用データ復旧ソフトウェアprm dulユーザーズ・マニュアル
Oracle専用データ復旧ソフトウェアprm dulユーザーズ・マニュアルmaclean liu
 
【诗檀软件 郭兆伟-技术报告】跨国企业级Oracle数据库备份策略
【诗檀软件 郭兆伟-技术报告】跨国企业级Oracle数据库备份策略【诗檀软件 郭兆伟-技术报告】跨国企业级Oracle数据库备份策略
【诗檀软件 郭兆伟-技术报告】跨国企业级Oracle数据库备份策略maclean liu
 
基于Oracle 12c data guard & far sync的低资源消耗两地三数据中心容灾方案
基于Oracle 12c data guard & far sync的低资源消耗两地三数据中心容灾方案基于Oracle 12c data guard & far sync的低资源消耗两地三数据中心容灾方案
基于Oracle 12c data guard & far sync的低资源消耗两地三数据中心容灾方案maclean liu
 
TomCat迁移步骤简述以及案例
TomCat迁移步骤简述以及案例TomCat迁移步骤简述以及案例
TomCat迁移步骤简述以及案例maclean liu
 
PRM DUL Oracle Database Health Check
PRM DUL Oracle Database Health CheckPRM DUL Oracle Database Health Check
PRM DUL Oracle Database Health Checkmaclean liu
 
dbdao.com 汪伟华 my-sql-replication复制高可用配置方案
dbdao.com 汪伟华 my-sql-replication复制高可用配置方案dbdao.com 汪伟华 my-sql-replication复制高可用配置方案
dbdao.com 汪伟华 my-sql-replication复制高可用配置方案maclean liu
 
Vbox virtual box在oracle linux 5 - shoug 梁洪响
Vbox virtual box在oracle linux 5 - shoug 梁洪响Vbox virtual box在oracle linux 5 - shoug 梁洪响
Vbox virtual box在oracle linux 5 - shoug 梁洪响maclean liu
 
【诗檀软件】Mysql高可用方案
【诗檀软件】Mysql高可用方案【诗檀软件】Mysql高可用方案
【诗檀软件】Mysql高可用方案maclean liu
 
Shoug at apouc2015 4min pitch_biotwang_v2
Shoug at apouc2015 4min pitch_biotwang_v2Shoug at apouc2015 4min pitch_biotwang_v2
Shoug at apouc2015 4min pitch_biotwang_v2maclean liu
 
Apouc 4min pitch_biotwang_v2
Apouc 4min pitch_biotwang_v2Apouc 4min pitch_biotwang_v2
Apouc 4min pitch_biotwang_v2maclean liu
 
使用Oracle osw analyzer工具分析oswbb日志,并绘制系统性能走势图1
使用Oracle osw analyzer工具分析oswbb日志,并绘制系统性能走势图1使用Oracle osw analyzer工具分析oswbb日志,并绘制系统性能走势图1
使用Oracle osw analyzer工具分析oswbb日志,并绘制系统性能走势图1maclean liu
 
诗檀软件 Oracle开发优化基础
诗檀软件 Oracle开发优化基础 诗檀软件 Oracle开发优化基础
诗檀软件 Oracle开发优化基础 maclean liu
 
Orclrecove 1 pd-prm-dul testing for oracle database recovery_20141030_biot_wang
Orclrecove 1 pd-prm-dul testing for oracle database recovery_20141030_biot_wangOrclrecove 1 pd-prm-dul testing for oracle database recovery_20141030_biot_wang
Orclrecove 1 pd-prm-dul testing for oracle database recovery_20141030_biot_wangmaclean liu
 
诗檀软件 – Oracle数据库修复专家 oracle数据块损坏知识2014-10-24
诗檀软件 – Oracle数据库修复专家 oracle数据块损坏知识2014-10-24诗檀软件 – Oracle数据库修复专家 oracle数据块损坏知识2014-10-24
诗檀软件 – Oracle数据库修复专家 oracle数据块损坏知识2014-10-24maclean liu
 
追求Jdbc on oracle最佳性能?如何才好?
追求Jdbc on oracle最佳性能?如何才好?追求Jdbc on oracle最佳性能?如何才好?
追求Jdbc on oracle最佳性能?如何才好?maclean liu
 
使用Virtual box在oracle linux 5.7上安装oracle database 11g release 2 rac的最佳实践
使用Virtual box在oracle linux 5.7上安装oracle database 11g release 2 rac的最佳实践使用Virtual box在oracle linux 5.7上安装oracle database 11g release 2 rac的最佳实践
使用Virtual box在oracle linux 5.7上安装oracle database 11g release 2 rac的最佳实践maclean liu
 
Prm dul is an oracle database recovery tool database
Prm dul is an oracle database recovery tool   databasePrm dul is an oracle database recovery tool   database
Prm dul is an oracle database recovery tool databasemaclean liu
 
Oracle prm dul, jvm and os
Oracle prm dul, jvm and osOracle prm dul, jvm and os
Oracle prm dul, jvm and osmaclean liu
 
Oracle dba必备技能 使用os watcher工具监控系统性能负载
Oracle dba必备技能   使用os watcher工具监控系统性能负载Oracle dba必备技能   使用os watcher工具监控系统性能负载
Oracle dba必备技能 使用os watcher工具监控系统性能负载maclean liu
 

More from maclean liu (20)

Mysql企业备份发展及实践
Mysql企业备份发展及实践Mysql企业备份发展及实践
Mysql企业备份发展及实践
 
Oracle専用データ復旧ソフトウェアprm dulユーザーズ・マニュアル
Oracle専用データ復旧ソフトウェアprm dulユーザーズ・マニュアルOracle専用データ復旧ソフトウェアprm dulユーザーズ・マニュアル
Oracle専用データ復旧ソフトウェアprm dulユーザーズ・マニュアル
 
【诗檀软件 郭兆伟-技术报告】跨国企业级Oracle数据库备份策略
【诗檀软件 郭兆伟-技术报告】跨国企业级Oracle数据库备份策略【诗檀软件 郭兆伟-技术报告】跨国企业级Oracle数据库备份策略
【诗檀软件 郭兆伟-技术报告】跨国企业级Oracle数据库备份策略
 
基于Oracle 12c data guard & far sync的低资源消耗两地三数据中心容灾方案
基于Oracle 12c data guard & far sync的低资源消耗两地三数据中心容灾方案基于Oracle 12c data guard & far sync的低资源消耗两地三数据中心容灾方案
基于Oracle 12c data guard & far sync的低资源消耗两地三数据中心容灾方案
 
TomCat迁移步骤简述以及案例
TomCat迁移步骤简述以及案例TomCat迁移步骤简述以及案例
TomCat迁移步骤简述以及案例
 
PRM DUL Oracle Database Health Check
PRM DUL Oracle Database Health CheckPRM DUL Oracle Database Health Check
PRM DUL Oracle Database Health Check
 
dbdao.com 汪伟华 my-sql-replication复制高可用配置方案
dbdao.com 汪伟华 my-sql-replication复制高可用配置方案dbdao.com 汪伟华 my-sql-replication复制高可用配置方案
dbdao.com 汪伟华 my-sql-replication复制高可用配置方案
 
Vbox virtual box在oracle linux 5 - shoug 梁洪响
Vbox virtual box在oracle linux 5 - shoug 梁洪响Vbox virtual box在oracle linux 5 - shoug 梁洪响
Vbox virtual box在oracle linux 5 - shoug 梁洪响
 
【诗檀软件】Mysql高可用方案
【诗檀软件】Mysql高可用方案【诗檀软件】Mysql高可用方案
【诗檀软件】Mysql高可用方案
 
Shoug at apouc2015 4min pitch_biotwang_v2
Shoug at apouc2015 4min pitch_biotwang_v2Shoug at apouc2015 4min pitch_biotwang_v2
Shoug at apouc2015 4min pitch_biotwang_v2
 
Apouc 4min pitch_biotwang_v2
Apouc 4min pitch_biotwang_v2Apouc 4min pitch_biotwang_v2
Apouc 4min pitch_biotwang_v2
 
使用Oracle osw analyzer工具分析oswbb日志,并绘制系统性能走势图1
使用Oracle osw analyzer工具分析oswbb日志,并绘制系统性能走势图1使用Oracle osw analyzer工具分析oswbb日志,并绘制系统性能走势图1
使用Oracle osw analyzer工具分析oswbb日志,并绘制系统性能走势图1
 
诗檀软件 Oracle开发优化基础
诗檀软件 Oracle开发优化基础 诗檀软件 Oracle开发优化基础
诗檀软件 Oracle开发优化基础
 
Orclrecove 1 pd-prm-dul testing for oracle database recovery_20141030_biot_wang
Orclrecove 1 pd-prm-dul testing for oracle database recovery_20141030_biot_wangOrclrecove 1 pd-prm-dul testing for oracle database recovery_20141030_biot_wang
Orclrecove 1 pd-prm-dul testing for oracle database recovery_20141030_biot_wang
 
诗檀软件 – Oracle数据库修复专家 oracle数据块损坏知识2014-10-24
诗檀软件 – Oracle数据库修复专家 oracle数据块损坏知识2014-10-24诗檀软件 – Oracle数据库修复专家 oracle数据块损坏知识2014-10-24
诗檀软件 – Oracle数据库修复专家 oracle数据块损坏知识2014-10-24
 
追求Jdbc on oracle最佳性能?如何才好?
追求Jdbc on oracle最佳性能?如何才好?追求Jdbc on oracle最佳性能?如何才好?
追求Jdbc on oracle最佳性能?如何才好?
 
使用Virtual box在oracle linux 5.7上安装oracle database 11g release 2 rac的最佳实践
使用Virtual box在oracle linux 5.7上安装oracle database 11g release 2 rac的最佳实践使用Virtual box在oracle linux 5.7上安装oracle database 11g release 2 rac的最佳实践
使用Virtual box在oracle linux 5.7上安装oracle database 11g release 2 rac的最佳实践
 
Prm dul is an oracle database recovery tool database
Prm dul is an oracle database recovery tool   databasePrm dul is an oracle database recovery tool   database
Prm dul is an oracle database recovery tool database
 
Oracle prm dul, jvm and os
Oracle prm dul, jvm and osOracle prm dul, jvm and os
Oracle prm dul, jvm and os
 
Oracle dba必备技能 使用os watcher工具监控系统性能负载
Oracle dba必备技能   使用os watcher工具监控系统性能负载Oracle dba必备技能   使用os watcher工具监控系统性能负载
Oracle dba必备技能 使用os watcher工具监控系统性能负载
 

【Maclean liu技术分享】拨开oracle cbo优化器迷雾,探究histogram直方图之秘 0321

  • 1. 拨开Oracle CBO优化器迷雾,探 究Histogram直方图之秘 刘相兵(Maclean Liu) ORA-ALLSTARS Exadata用 liu.maclean@gmail.com 户组QQ群:23549328 www.askmaclean.com
  • 2. About Me  Email:liu.maclean@gmail.com  Blog:www.askmaclean.com  Oracle Certified Database Administrator Master 10g and 11g  Over 7 years experience with Oracle DBA technology  Over 8 years experience with Linux technology  Member Independent Oracle Users Group  Member All China Users Group Presents for advanced Oracle topics: RAC, DataGuard, Performance Tuning and Oracle Internal. www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 3. About Me www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 4. 神秘的Histogram直方图迷雾 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 5. 预备知识:CBO术语 NDV – number of distinct values Card – cardinality NULLS: Number of Nulls in Column TYPE: Histogram Type #BKTS: Histogram Buckets UNCOMPBKTS: Histogram Uncompressed Buckets ENDPTVALS: Histogram End Point Values Freq 频率直方图 HtBal 高度平衡直方图 DEN: Column Density 密度 sel – selectivity 选择性 详见 http://www.askmaclean.com/archives/cbo- terms.html www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 7. 预备知识: DBMS_STATS.AUTO_SAMPLE_SIZE 默认的自动采样大小参数 DBMS_STATS.AUTO_SAMPLE_SIZE AUTO_SAMPLE_SIZE时会优先采样5500 行,dbms_stats包内部算法会评估采样的5500行数据是 否有效,若无效则会再次采样55000行数据,依此类 推。 为什么是5500这个数字? 5500这个数字是经过数学论证的。 可以90%以上保证采 样获得的直方图Histogram的桶buckets中数据分布式 均匀。 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 8. 啥是直方图Histogram? Histogram直方图用来描绘 columns上的数据分布情况 当数据分布倾斜时Histogram可以有效提升cardinality 评 估的准确度 或曰:“如果数据分布极均匀,那么是否不需要 Histogram” 是的!例如绝大多数情况下单列主键 上不必要创建直方 图Histogram(即便存在gap ) 使用直方图的直接目的是为了让谓词列(column predicate)得到更好的选择性评估 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 9. 什么是数据倾斜?Data Skew? select source,count(*) from data_skew group by source order by 2,1; 倾斜的数据分布 600000 500000 400000 300000 200000 合计 100000 0 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 10. 数据倾斜+没有直方图啥后果? SQL> exec dbms_stats.gather_table_stats(user,‘DATA_SKEW’,method_opt=>‘FOR ALL COLUMNS SIZE 1’); ==》SIZE 1 不收histogram直方图 SQL> select count(*) from DATA_SKEW where source='Google Search'; COUNT(*) ---------- 566566 SQL> explain plan for select 1 from DATA_SKEW where source='Google Search'; SQL> SELECT * FROM TABLE(dbms_xplan.display); ------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 6341 | 82433 | 1256 (2)| 00:00:16 | |* 1 | TABLE ACCESS FULL| DATA_SKEW | 6341 | 82433 | 1256 (2)| 00:00:16 | 1 - filter("SOURCE"='Google Search') 优化器评估 source=‘Google Search’; Cardinality为 6431 ,实际source=‘Google Search’ 共 566566, 差88倍!! 数据倾斜+没直方图==》优化器Optimizer 很迷茫、很惆怅。。。。。。 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 11. 没有直方图,优化器怎么算Cardinality? alter session set events '10053 trace name context forever,level 1'; explain plan for select 1 from DATA_SKEW where source='Google Search'; SINGLE TABLE ACCESS PATH ----------------------------------------- BEGIN Single Table Cardinality Estimation ----------------------------------------- Column (#2): SOURCE(VARCHAR2) AvgLen: 13.00 NDV: 158 Nulls: 0 Density: 0.0063291 Table: DATA_SKEW Alias: DATA_SKEW Card: Original: 1001859 Rounded: 6341 Computed: 6340.88 Non Adjusted: 6340.88 Computed: 6340.88  原始计算获得的 Cardinality Rounded: 6341  round(Computed: 6340.88) 后的结果 非NULL比例= (Card: Original – Nulls ) / Card: Original = 1 没有直方图的情况下 Sel= (1/NDV)*非NULL比例= (1/158) * 1 = 0.0063291 Computed Cardinality= Card Original * Sel= 1001859 * 0.0063291 = 6340.88 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 12. 上例中如果有直方图Cardinality呢? exec dbms_stats.gather_table_stats(user,'DATA_SKEW',method_opt=>'FOR ALL COLUMNS SIZE SKEWONLY', estimate_percent=>100); SQL> explain plan for select 1 from DATA_SKEW where source='Google Search'; SQL> SELECT * FROM TABLE(dbms_xplan.display); ------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 566K| 7192K| 1256 (2)| 00:00:16 | |* 1 | TABLE ACCESS FULL| DATA_SKEW | 566K| 7192K| 1256 (2)| 00:00:16 | ------------------------------------------------------------------------------- Column (#2): SOURCE(VARCHAR2) AvgLen: 13.00 NDV: 161 Nulls: 0 Density: 9.9500e-05 Histogram: Freq #Bkts: 161 UncompBkts: 999999 EndPtVals: 161 Table: DATA_SKEW Alias: DATA_SKEW Card: Original: 999999 Rounded: 566566 Computed: 566566.00 Non Adjusted: 566566.00 Histogram: Freq 频率直方图 #Bkts: 161  161个buckets UncompBkts  采样获得的 ( Card Original – NULLS) www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 13. 10053 trace中并不列出完整的直方图信息哦 直方图查看: dba_tab_cols、dba_tab_histograms、 sys. histgrm$ SQL> select HISTOGRAM from dba_tab_cols where table_name='DATA_SKEW' and column_name='SOURCE'; HISTOGRAM --------------- FREQUENCY  频率直方图 SOURCE 是字符类型 需要 hexstr(endpoint_value)才能看懂 select endpoint_number, hexstr(endpoint_value) source_char from dba_tab_histograms where table_name=‘DATA_SKEW’ and column_name=‘SOURCE’; www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 14. 拨开云雾见月明—开始探究 Histogram直方图之旅 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 15. Histogram直方图的历史 最早追述到Oracle 7版本开始引入Histogram的思想 Oracle 7中使用CBO时选择性评估很不准确 在Oracle 8中引入了Histogram作为optimizer statistics 新特性,当时Frequency Histograms被称作 "value- based" histogram www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 16. Histogram直方图的历史 版本8~9i默认不收集直方图Histogram,需要手动指定 method_opt size (默认size=1 仅收集最小/大值, distinct值等信息) 从版本10g开始Oracle会自动收集Histogram, Histogram是否收集取决于col_usage$中记录的关于 该列用作SQL中谓词条件的信息和数据分布情况 SMON定期将shared pool中 的谓词使用状况刷新到 col_usage$表中 例如: Select * from tab where colA=1; 则记录为对colA充当 EQUALITY_PREDS  equality predicates等式谓词 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 17. Optimizer优化器何时使用Histogram? • 不使用绑定变量,硬编码且cursor_sharing=EXACT造成大 量硬解析时 • 绑定变量,且_optim_peek_user_binds=true(默认)的硬解 析时;当_optim_peek_user_binds=false时虽然选择性不参 考直方图,当硬解析仍可能load histogram • 11g中ACS(Adaptive Cursor Sharing) bind aware引发再 次硬解析时 • 其他一些导致现有子游标无法使用的原因 • 。。。。。。。。。。。。。。 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 18. Optimizer优化器在哪里使用Histogram? • 有2个地方Oracle Optimizer优化器会用到 Histogram: • 过滤谓词的选择性评估 • Join连接基数(Cardinality)评估 • 在做Join连接基数(Cardinality)评估时,往往差之 毫厘谬之千里: • 优化器可能选择错误的Join连接方式或顺序。 • 例如分页排序查询 因为基数评估误差导致去用了 HASH JOIN….. www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 19. Histogram的Buckets桶数 • 对于绝大多数情况默认的75个桶总是合适的 • 最大桶数= 最小值(254,其他因素限制桶数) • 若频繁出现地distinct值的数据并不多,则将桶数设 置为大于这个数目往往是有益的。 • 版本12c之前直方图最大桶数为254,12c之后……… www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 20. Histogram的其他特性 • 早期版本中远程DBLINK查询时表上的直方图可能不被使用 • 直方图是静态的,当数据更新频繁时可能变得”陈旧” • 对于超过32个字符的字符型列,超出的那一部分无法在直方 图中体现。 例如rpad(‘A’,100,‘Z’)和rpad(‘A’,101,‘Z’)被认为 是同一个value • 数字和日期在直方图上被精确表示 • Popular and non-popular值被用做帮助判断选择性 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 21. 如何收集直方图 • 推荐使用dbms_stats收集直方图,虽然analyze命令也能收 集直方图 EXECUTE DBMS_STATS.GATHER_TABLE_STATS ('scott','emp',METHOD_OPT => 'FOR COLUMNS SIZE 20 ename'); • Method_opt参数 FOR ALL [INDEXED | HIDDEN] COLUMNS [size_clause] FOR COLUMNS [size clause] column [size_clause] [,column...] • Size指定直方图的桶数 SIZE {integer | REPEAT | AUTO | SKEWONLY} Integer – 人工指定直方图桶数 从1~254 REPEAT - 刷新现有的直方图列上的信息 AUTO - 基于数据分布和负载收集直方图 SKEWONLY- 基于数据分布收集直方图 • 若对直方图不甚了解,推荐使用AUTO或SKEWONLY www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 22. Histogram直方图的种类 版本12c之前有2种类型Histogram • Height Balanced Histogram高度平衡直方图 • 列值被分成多个buckets • 每个bucket包含大致一样数目的行数 • 当NDV>254时会采用高度平衡直方图(注意dbms_stats采 样到的NDV未必是实际的NDV) • Frequency Histogram(Value-Based )频率直方图 • 该列上每一个值都会具有频率信息 • 当NDV( number of distinct values)的个数<=最大桶数 buckets 254个时使用频率直方图 • 在Oracle数据字典基表sys. histgrm$ 上这2种类型Histogram 的存放方式一样 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 23. Height Balanced Histogram • 每个buckets桶里的行数都大致相同,除了最后一个桶 • 最后一个桶中的可能比其他桶中的少 • 每个桶中最大值成为bucket value endpoint_value • 每个value值占有一个桶的一部分,按比例 select 1 from DATA_SKEW_HB where source='Google Search'; Column (#2): NewDensity:0.000403, OldDensity:0.000663 BktCnt:254, PopBktCnt:229, PopValCnt:11, NDV:255 Column (#2): SOURCE( AvgLen: 13 NDV: 255 Nulls: 0 Density: 0.000403 Histogram: HtBal #Bkts: 254 UncompBkts: 254 EndPtVals: 36 Table: DATA_SKEW_HB Alias: DATA_SKEW_HB Card: Original: 1000093.000000 Rounded: 566982 Computed: 566981.86 Non Adjusted: 566981.86 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 24. Height Balanced Histogram SQL> select count(distinct source) from data_skew_hb; COUNT(DISTINCTSOURCE) --------------------- 255 SQL> exec dbms_stats.gather_table_stats(user,'DATA_SKEW_HB',method_opt=>'FOR ALL COLUMNS SIZE SKEWONLY',estimate_percent=>100); SQL> select histogram from dba_tab_cols where table_name='DATA_SKEW_HB' and column_name='SOURCE'; HISTOGRAM --------------- HEIGHT BALANCED SQL> select count(*) from dba_tab_histograms where table_name='DATA_SKEW_HB' and column_name='SOURCE'; COUNT(*) ---------- 36 select endpoint_number,hexstr(endpoint_value) from dba_tab_histograms where table_name='DATA_SKEW_HB' and column_name='SOURCE' order by 1,2 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 25. Height Balanced Histogram • 每个buckets桶里的行数都大致相同,除了最后一个桶 • 最后一个桶中的行数可能比其他桶中的少 • 每个桶中最大值成为bucket value endpoint_value • 每个value值占有一个桶的一部分,按比例 select 1 from DATA_SKEW_HB where source='Google Search'; Column (#2): NewDensity:0.000403, OldDensity:0.000663 BktCnt:254, PopBktCnt:229, PopValCnt:11, NDV:255 Column (#2): SOURCE( AvgLen: 13 NDV: 255 Nulls: 0 Density: 0.000403 Histogram: HtBal #Bkts: 254 UncompBkts: 254 EndPtVals: 36 Table: DATA_SKEW_HB Alias: DATA_SKEW_HB Card: Original: 1000093.000000 Rounded: 566982 Computed: 566981.86 Non Adjusted: 566981.86 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 26. Height Balanced Histogram压缩Bucket www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 27. 10053 trace中的UNCOMPBKTS和ENDPTVALS • 当Histogram直方图类型为frequency histograms( Histogram: Freq)时UncompBkts 等于统计信息中采样行数-NULLS, 而 EndPtVals 等于bucket总数,或者说NDV,因为frequency histograms中 NDV=number of buckets • 当直方图类型为height balanced histograms (Histogram: HtBal) UncompBkts 等于bucket的数目(其实也等于10053 trace中#Bkts的数目),而EndPtVals 等于已经被压缩的 Histogram的大小,换句话说是等于: select count(*) from dba_tab_histograms where table_name=’&TAB’ and column_name=’&COL’的实际总和。 通过这2个值对比,可以了 解到popular值的多少以及数据的倾斜度, 是有很多个大量重复 的值(popular value)还是仅有几个巨大的重复值。 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 28. Height Balanced Histogram- popular value 重复出现为endpoint_value的值称为popular value,其Cardinality计算 公式为: Comp_Card = Orig_Card * Sel Sel = (该Popular值的桶数 /总的桶数) * 非NULL比例 非NULL比例=(Orig_Card - NNulls) / Orig_Card 如上例中source=‘Google Search’是一个popular值: 该Popular值的桶数=215-71 =144 非NULL比例= (1000093-0)/1000093=1 则sel = 144/254 Comp_Card = 1000093* 144/254= 566981.86 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 29. Height Balanced Histogram- non popular value • 没有重复出现为endpoint_value、或者没有充当过 endpoint_value的值称为non popular value,其 Cardinality计算公式为: • Comp_Card = Orig_Card * Sel • Sel= Density * 非空比例 • 10.2.0.4以后的 New Density= (总的buckets数- 所有 popular值的buckets数PopBktCnt)/总的buckets数/ (NDV- popular值总的个数POPVALCNT) www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 30. Height Balanced Histogram- non popular value select count(*) from data_skew_hb where source='ABC100'; COUNT(*) ---------- 666 10g中10053 trace默认不显示PopBktCnt和POPVALCNT 但是可以利用 后面的脚本获得 11g 10053 level 1 trace : NewDensity:0.000403, OldDensity:0.000663 BktCnt:254, PopBktCnt:229, PopValCnt:11, NDV:255 Column (#2): SOURCE( AvgLen: 13 NDV: 255 Nulls: 0 Density: 0.000403 Histogram: HtBal #Bkts: 254 UncompBkts: 254 EndPtVals: 36 Table: DATA_SKEW_HB Alias: DATA_SKEW_HB Card: Original: 1000093.000000 Rounded: 403 Computed: 403.42 Non Adjusted: 403.42 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 31. POPVALCNT select count(*) PopValCnt from (select endpoint_number,hexstr(endpoint_value) endstr, endpoint_number- lag( endpoint_number,1,0) over (order by endpoint_number ) as buckets from dba_tab_histograms where table_name='DATA_SKEW_HB' and column_name='SOURCE') end_diff where buckets>1; POPVALCNT ---------- 11 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 32. PopBktCnt select sum(buckets) PopValCnt from (select endpoint_number,hexstr(endpoint_value) endstr, endpoint_number- lag( endpoint_number,1,0) over (order by endpoint_number ) as buckets from dba_tab_histograms where table_name='DATA_SKEW_HB' and column_name='SOURCE') end_diff where buckets>1; POPVALCNT ---------- 229 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 33. Popular value的列表 select endstr popvalue,buckets from (select endpoint_number,hexstr(endpoint_value) endstr, endpoint_number- lag( endpoint_number,1,0) over (order by endpoint_number ) as buckets from dba_tab_histograms where table_name='DATA_SKEW_HB' and column_name='SOURCE') end_diff where buckets>1; www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 34. High Balanced Histogram 计算 Density 10.2.0.4 以后New Desnsity = (总的buckets数- 所有popular值的buckets 数PopBktCnt)/总的buckets数/ (NDV- popular值总的个数 POPVALCNT) = (254-229)/254/(255-11)= 25/255/244= 0.000403 Comp_Card = Orig_Card * Sel= 1000093 * 0.000403 = 403 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 35. Frequency Histogram频率直方图 • 每一个bucket桶代表一个列值 • 列上的所有的值均有对应的桶 • 当NDV(采样到的)<min(指定的buckets数量,254)时 创建频率直方图 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 36. Frequency Histogram频率直方图-等式谓词 Exec dbms_stats.gather_table_stats(user,'DATA_SKEW',estimate_percent=>100); SQL> select histogram from dba_tab_cols where table_name='DATA_SKEW' and column_name='SOURCE'; HISTOGRAM --------------- FREQUENCY SQL> select count(*) from data_skew where source='Bing Search'; COUNT(*) ---------- 9009 Column (#2): SOURCE(VARCHAR2) AvgLen: 13.00 NDV: 161 Nulls: 0 Density: 9.9500e-05 Histogram: Freq #Bkts: 161 UncompBkts: 999999 EndPtVals: 161 Table: DATA_SKEW Alias: DATA_SKEW Card: Original: 999999 Rounded: 9009 Computed: 9009.00 Non Adjusted: 9009.00 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 37. Frequency Histogram频率直方图-等式谓词 select endpoint_number, hexstr(endpoint_value) endstr, endpoint_number - lag(endpoint_number, 1, 0) over(order by endpoint_number) as rows_cnt from dba_tab_histograms where table_name = 'DATA_SKEW' and column_name = 'SOURCE'; www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 38. Frequency Histogram频率直方图-非闭包范围 SQL> select count(*) from data_skew where source>= 'Sougo Search'; COUNT(*) ---------- 150150 Column (#2): SOURCE(VARCHAR2) AvgLen: 13.00 NDV: 161 Nulls: 0 Density: 9.9500e-05 Histogram: Freq #Bkts: 161 UncompBkts: 999999 EndPtVals: 161 Table: DATA_SKEW Alias: DATA_SKEW Card: Original: 999999 Rounded: 150150 Computed: 150149.50 Non Adjusted: 150149.50 Sel= Sum(Bucketsize)/Numrows = 150150/ 999999 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 39. Frequency Histogram频率直方图-闭包范围 SQL> select count(*) from data_skew where source between 'ABC904' and 'ABC94'; COUNT(*) ---------- Column (#2): SOURCE(VARCHAR2) 2664 AvgLen: 13.00 NDV: 161 Nulls: 0 Density: 9.9500e-05 Histogram: Freq #Bkts: 161 UncompBkts: 999999 EndPtVals: 161 Table: DATA_SKEW Alias: DATA_SKEW Card: Original: 999999 Rounded: 2664 Computed: 2664.00 Non Adjusted: 2664.00 Sel= Sum(Bucketsize)/Numrows = (666+666+666+666)/ 999999 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 40. Histogram带来的一些问题 • 在11g acs特性出现前,绑定变量SQL若第一次代入的变量 是倾斜值,由于游标共享可能导致今后的执行计划不佳 • 若应用程序没有很好地绑定变量,且列上有直方图,则频繁 的硬解析可能加剧latch: row cache object • 由于ACS和其他一些原因作用导致child cursors数量过多: • 进一步导致latch:shared pool、library cache等问题 • 造成shared pool碎片化 • 收集系统中过多列的直方图会导致row cache膨胀,管理成 本增加 • 欢迎补充。。。。。。。。。。。。 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 41. 目前版本Histogram的欠缺 Height-Balanced Histogram中 non popular值使用density 评估选择性,对于那些出现较频繁,但仍排不上popular榜 单的值,使用density的评估很难准确 SQL> select count(*) from DATA_SKEW_HB where source='Maclean Search'; COUNT(*) ---------- 2000 Column (#2): SOURCE(VARCHAR2) AvgLen: 13.00 NDV: 256 Nulls: 0 Density: 4.0174e-04 Histogram: HtBal #Bkts: 254 UncompBkts: 254 EndPtVals: 36 Table: DATA_SKEW_HB Alias: DATA_SKEW_HB Card: Original: 1002093 Rounded: 403 Computed: 402.58 Non Adjusted: 402.58 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 42. Histogram的未来-12c Top Frequency histograms SQL> select banner from v$version where rownum=1; Oracle Database 12c Enterprise Edition Release 12.1.0.0.2 - 64bit Beta SQL> exec dbms_stats.gather_table_stats(user,'DATA_SKEW_HB'); SQL> select histogram from dba_tab_cols where table_name='DATA_SKEW_HB' and column_name='SOURCE'; HISTOGRAM --------------- TOP-FREQUENCY select count(*) from DATA_SKEW_HB where source='Maclean Search'; Column (#2): NewDensity:0.000001, OldDensity:0.000000 BktCnt:1002091.000000, PopBktCnt:1001999.000000, PopValCnt:162, NDV:256 Column (#2): SOURCE(VARCHAR2) AvgLen: 13 NDV: 256 Nulls: 0 Density: 0.000001 Histogram: Top-Freq #Bkts: 1002091 UncompBkts: 1002091 EndPtVals: 254 ActualVal: yes Table: DATA_SKEW_HB Alias: DATA_SKEW_HB Card: Original: 1002093.000000 Rounded: 2000 Computed: 2000.00 Non Adjusted: 2000.00 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 43. Histogram的未来-12c Hybrid Histograms 在data_skew_hb表上生成更多NDV SQL> exec dbms_stats.gather_table_stats(user,'DATA_SKEW_HB'); SQL> select histogram from dba_tab_cols where table_name='DATA_SKEW_HB' and column_name='SOURCE'; HISTOGRAM --------------- HYBRID SQL> select count(*) from DATA_SKEW_HB where source='Maclean Search'; COUNT(*) ---------- 2000 Column (#2): NewDensity:0.000085, OldDensity:0.000000 BktCnt:5405.000000, PopBktCnt:671.000000, PopValCnt:5, NDV:10255 Column (#2): SOURCE(VARCHAR2) AvgLen: 13 NDV: 10255 Nulls: 0 Density: 0.000085 Histogram: Hybrid #Bkts: 254 UncompBkts: 5405 EndPtVals: 254 ActualVal: yes Table: DATA_SKEW_HB Alias: DATA_SKEW_HB Card: Original: 6001593.000000 Rounded: 2221 Computed: 2220.76 Non Adjusted: 2220.76 www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 44. 你离彻底搞懂10053 trace这天书又近了一步 敬请期待, 下一讲 揭秘 Oracle优化 器内幕,一 次性彻底搞 明白10053 trace !! www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 45. 更多信息 www.askmaclean.com tuning or http://www.askmaclean.com/archives/tag/tuning www.askmaclean.com www.askmaclean.com www.askmaclean.com
  • 46. Question & Answer If you have more questions later, feel free to ask www.askmaclean.com www.askmaclean.com www.askmaclean.com