Oracle Data Buffer Cache

Oracle Database internal info

    Oracle Data Buffer Cache: Presentation Transcript

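    • A Brief Analysis of Data Buffer Cache Management
    • Contents
      • How does Oracle find the buffer it needs?
      • How does Oracle manage the blocks in the data buffer cache?
      • How does Oracle decide which blocks should be written to the data files, and how does it write them?
      • Wait events related to the data buffer cache
    • Introduction
      • The largest memory area in the instance
      • The db_cache_size parameter
      • Allocation unit: granule
      • Provides three types of cache: default, keep, and recycle
      • Buffer caches for several block sizes (2, 4, 8, 16, or 32 KB), each serving data blocks of the matching block size: db_nk_cache_size
    • Locating a buffer
      • Where a buffer is stored: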
      • The hash bucket for a particular block header is determined based on the modulus of the Data Block Address (DBA) and the value of the _DB_BLOCK_HASH_BUCKETS parameter. For example, hash bucket = MOD(DBA, _DB_BLOCK_HASH_BUCKETS).
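      • A buffer descriptor structure is first built in the PGA and passed, together with the required lock mode, to the search function; the hash algorithm then locates the corresponding bucket
      • Search function: kcbget(descriptor, lock_mode)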
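The bucket calculation quoted above can be sketched in a few lines of Python. This is only an illustration of the simplified MOD formula, not Oracle's internal hash function: the bucket count of 1021 and both helper names are hypothetical. The file#/block# decoding assumes a 32-bit rdba with a 10-bit relative file number and a 22-bit block number, which agrees with the rdba values printed in the BH dumps in this transcript (0x00402432 decodes to 1/9266).

```python
def decode_rdba(rdba: int) -> tuple[int, int]:
    """Split a relative DBA into (file#, block#).

    Assumption: 32-bit rdba, upper 10 bits = relative file#,
    lower 22 bits = block#.
    """
    return rdba >> 22, rdba & 0x3FFFFF


def hash_bucket(dba: int, n_buckets: int) -> int:
    """Simplified bucket choice: MOD(DBA, _DB_BLOCK_HASH_BUCKETS)."""
    return dba % n_buckets


if __name__ == "__main__":
    N_BUCKETS = 1021  # hypothetical value for _DB_BLOCK_HASH_BUCKETS
    for rdba in (0x00402432, 0x0B417F8F):
        file_no, block_no = decode_rdba(rdba)
        print(
            f"rdba=0x{rdba:08x} file#={file_no} block#={block_no} "
            f"bucket={hash_bucket(rdba, N_BUCKETS)}"
        )
```

Every block with the same remainder lands on the same hash chain, which is why the chain search that follows still has to compare the RDBA of each buffer header.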
    • Buffer cache diagram: _db_block_hash_buckets, _db_block_hash_latches
    • BH structure
      • Views: X$BH, V$BH
      • BH (0x0x5cfce8c4) file#: 45 rdba: 0x0b417f8f (45/98191) class 1 ba: 0x0x5c73e000
      • set: 5 dbwrid: 0 obj: 198268 objn: 198268
      • hash: [6893c08c,6893c08c] lru: [61ff9348,57fee018]
      • LRU flags: hot_buffer
      • ckptq: [NULL] fileq: [NULL]
      • st: XCURRENT md: NULL rsop: 0x(nil) tch: 2
      • LRBA: [0x0.0.0] HSCN: [0xffff.ffffffff] HSUB: [255] RRBA: [0x0.0.0]
    • Hash chain+BH
      • CHAIN: 4889 LOC: 0x0x5a6e4100 HEAD: [51fdf58c,56fcc874]
      • BH (0x0x51fdf58c) file#: 1 rdba: 0x00402432 (1/9266) class 1 ba: 0x0x51a1a000
      • set: 12 dbwrid: 0 obj: 3807 objn: 3807
      • hash: [56fcc874,5a6e4100] lru: [51fdf8c4,51fdeff4]
      • LRU flags:
      • ckptq: [NULL] fileq: [NULL]
      • st: XCURRENT md: NULL rsop: 0x(nil) tch: 7
      • LRBA: [0x0.0.0] HSCN: [0xffff.ffffffff] HSUB: [255] RRBA: [0x0.0.0]
      • buffer tsn: 0 rdba: 0x00402432 (1/9266)
      • scn: 0x0000.000071f9 seq: 0x01 flg: 0x04 tail: 0x71f90601
      • frmt: 0x02 chkval: 0x42ef type: 0x06=trans data
    • Searching a hash chain
      • For each buffer in the chain:
      • - Ignore buffers that do not match RDBA
      • - Wait for READING buffers, then return them
      • - Skip CR (consistent read) buffers
      • - If the CUR (current) buffer is held in a compatible mode, then use it
      • - Otherwise if all other users are CR state objects
      • – Make it a CR copy and create a new EXLCUR copy of the buffer
      • – Or wait for the current buffer to be released
      • If no usable buffers exist in cache, read from disk
      • The cache buffers chains latch must be held during the search
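The chain walk listed above might look roughly like this in Python. It is a sketch under stated assumptions, not Oracle's kcbget logic: the BufferHeader fields, the mode names "SHARE" and "EXCL", the compatibility rule, and the returned action strings are all hypothetical, and a threading.Lock stands in for the latch that protects the chain.

```python
import threading
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class BufferHeader:
    rdba: int
    state: str                       # "XCUR", "CR" or "READING"
    held_mode: Optional[str] = None  # None, "SHARE" or "EXCL" (hypothetical)


def compatible(held: Optional[str], wanted: str) -> bool:
    """Assumed rule: a free buffer suits any mode; SHARE coexists with SHARE."""
    return held is None or (held == "SHARE" and wanted == "SHARE")


def search_chain(
    chain: List[BufferHeader],
    rdba: int,
    wanted_mode: str,
    latch: threading.Lock,
) -> Tuple[str, Optional[BufferHeader]]:
    """Return (action, buffer) for the first usable buffer on the chain."""
    with latch:  # the chain is only walked while the latch is held
        for bh in chain:
            if bh.rdba != rdba:
                continue                      # different block, ignore
            if bh.state == "READING":
                return "wait_for_read", bh    # another session is loading it
            if bh.state == "CR":
                continue                      # skip consistent-read copies
            if compatible(bh.held_mode, wanted_mode):
                return "use", bh              # current buffer, compatible mode
            return "clone_or_wait", bh        # make a CR copy or wait for release
    return "read_from_disk", None             # nothing usable in the cache
```

Returning an action instead of blocking keeps the sketch free of real waits; a caller would act on the string after the latch has been released.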
    • Working sets: _db_block_lru_latches defaults to the number of DBWR processes × 8 (the maximum number of buffer pools allowed)
    • Working sets
      • Each list that is shown above will have sublists called the auxiliary write list (AUX) and a MAIN list. For example, the LRU-P list will have a LRUP-AUX and a LRUP-MAIN list.
      • LRU-XR, LRU-XO and LRU-P are also called write lists. Buffers are linked to them because of a specific write action.
      • Buffers on these lists are candidates for immediate write-out by DBWR.
      • The separate lists enable write prioritization.
      • A BH can be on either the LRU or the LRUW, but not both; it can, however, appear on more than one write list
    • working sets
      • Dump of buffer cache at level 10
      • (WS) size: 501 wsid: 1 state: 0
      • (WS_REPL_LIST) main_prev: 5a6ff9bc main_next: 5a6ff9bc aux_prev: 58fffd08 aux_next: 58fa4048 curnum: 501 auxnum: 501
      • cold: 5a6ff9bc hbmax: 0 hbufs: 0
      • (WS_WRITE_LIST) main_prev: 5a6ff9d8 main_next: 5a6ff9d8 aux_prev: 5a6ff9e0 aux_next: 5a6ff9e0 curnum: 0 auxnum: 0
      • (WS_XOBJ_LIST) main_prev: 5a6ff9f4 main_next: 5a6ff9f4 aux_prev: 5a6ff9fc aux_next: 5a6ff9fc curnum: 0 auxnum: 0
      • (WS_XRNG_LIST) main_prev: 5a6ffa10 main_next: 5a6ffa10 aux_prev: 5a6ffa18 aux_next: 5a6ffa18 curnum: 0 auxnum: 0
      • (WS) fbwanted: 0
      • (WS) bgotten: 0 sumwrt: 0 sumscan: 0
      • (WS) numscan: 0 hotscan: 0 dmoves: 0
      • MAIN RPL_LST Queue header (NEXT_DIRECTION)[NULL]
      • MAIN RPL_LST Queue header (PREV_DIRECTION)[NULL]
      • AUXILIARY RPL_LST Queue header (NEXT_DIRECTION)[58fa4048,58fffd08]
      • 0x58fa4000=>0x58fa42f0=>0x58fa45e0=>0x58fa48d0=>0x58fa4bc0=>0x58fa4eb0=>0x58fa51a0=>0x58fa5490
      • 0x58fa5780=>0x58fa5a70=>0x58fa5d60=>0x58fa6050=>0x58fa6340=>0x58fa6630=>0x58fa6920=>0x58fa6c10
      • 0x58fa6f00=>0x58fa71f0=>0x58fa74e0=>0x58fa77d0=>0x58fa7ac0=>0x58fa7db0=>0x58fa80a0=>0x58fa8390
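    • How a reusable buffer is identified
      • 8i: scan from the tail of the list and sacrifice the contents that the buffer header at the cold end points to
      • 9i: when a block required by a query has to be read in from disk and linked onto the LRU chain, the scan starts from the tail of the list, covering the auxiliary list first and then the main list
      • LRU algorithm combined with the touch count
      • Hot blocks move to the main list, where they are inserted at the middle. Once the auxiliary list is empty, the same algorithm is run over the LRU to find blocks that can be sacrificed, and these are moved to the auxiliary list
    • LRU algorithm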
      • IF ( touch count of scanned buffer >_db_aging_hot_criteria )
      • THEN
      • Give buffer another chance (do not select as a victim)
      • IF (_db_aging_stay_count >= _db_aging_hot_criteria) THEN
      • Halve the buffer's touch count
      • ELSE
      • Set the buffer's touch count to _db_aging_stay_count
      • END IF
      • ELSE
      • Select buffer as a victim
      • END IF
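The pseudocode above translates almost line for line into Python. The function name and the return shape are made up for this sketch, and the default arguments (2 for _db_aging_hot_criteria, 0 for _db_aging_stay_count) are assumptions about typical settings rather than values taken from the slides.

```python
from typing import Tuple


def age_buffer(
    touch_count: int,
    hot_criteria: int = 2,  # assumed value of _db_aging_hot_criteria
    stay_count: int = 0,    # assumed value of _db_aging_stay_count
) -> Tuple[bool, int]:
    """Decide the fate of one scanned buffer.

    Returns (is_victim, new_touch_count).
    """
    if touch_count > hot_criteria:
        # Hot buffer: spare it, but cool its touch count down.
        if stay_count >= hot_criteria:
            return False, touch_count // 2
        return False, stay_count
    # Cold buffer: select it as the victim, touch count unchanged.
    return True, touch_count


if __name__ == "__main__":
    for tch in (1, 2, 7):
        print(tch, age_buffer(tch))
```

With the assumed defaults, a spared buffer has its touch count reset to the stay count, so it has to be touched again before the next scan to avoid being chosen as a victim.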
    • LRUW
      • LRUW list:
      • The LRU-W (write) list is used to hold buffers that aged out of the LRU but need to be written to disk before they can be reused.
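    • How a block is modified
      • Steps in updating a block:
      • Suppose I want to modify the contents of the block that BH2 points to
      • 1) Oracle unlinks BH2 from the auxiliary LRU list and inserts it into the middle of the main LRU list, that is, between BH1 and BH4, and increments BH2's touch count.
      • 2) BH2 is marked as pinned (pin); this step is protected by a latch.
      • 3) The contents of the in-memory data block that BH2 points to are updated.
      • 4) Once the update is complete, the pin is released.
      • 5) BH2 is moved from the main LRU list to the main LRUW list.
      • 6) If another process now issues an update against the in-memory block that BH2 points to, BH2 is pinned again, updated, and unpinned.
      • 7) When DBWR starts, it moves BH2 to the auxiliary LRUW list while scanning the main LRUW list.
      • 8) DBWR writes the data block for BH2, now on the auxiliary LRUW list, to the data file.
      • 9) After the write to the data file is confirmed, BH2 is moved from the auxiliary LRUW list to the auxiliary LRU list.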
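The list movements in steps 1 to 9 can be traced with a toy model. Everything here is a hypothetical simplification: the class, the list names, and the use of plain Python lists in place of linked buffer headers are illustration only, and pinning is reduced to adding and removing a name from a set.

```python
class WorkingSet:
    """Toy model of the main/auxiliary LRU and LRUW lists of one working set."""

    def __init__(self, lru_main, lru_aux):
        self.lists = {
            "LRU_MAIN": list(lru_main),
            "LRU_AUX": list(lru_aux),
            "LRUW_MAIN": [],
            "LRUW_AUX": [],
        }
        self.touch = {}      # buffer header name -> touch count
        self.pinned = set()  # buffer headers currently pinned

    def move(self, bh, src, dst, middle=False):
        """Unlink bh from src and link it into dst (middle or tail)."""
        self.lists[src].remove(bh)
        target = self.lists[dst]
        target.insert(len(target) // 2 if middle else len(target), bh)

    def update(self, bh):
        if bh in self.lists["LRU_AUX"]:
            # Step 1: auxiliary LRU -> middle of main LRU, bump touch count.
            self.move(bh, "LRU_AUX", "LRU_MAIN", middle=True)
            self.touch[bh] = self.touch.get(bh, 0) + 1
        self.pinned.add(bh)      # step 2: pin
        # Step 3: the block contents would be changed here.
        self.pinned.discard(bh)  # step 4: unpin
        if bh in self.lists["LRU_MAIN"]:
            # Step 5: the buffer is now dirty, main LRU -> main LRUW.
            self.move(bh, "LRU_MAIN", "LRUW_MAIN")
        # Step 6: a buffer already on the LRUW is only pinned and unpinned.

    def dbwr_cycle(self):
        # Step 7: DBWR scans the main LRUW and moves buffers to the auxiliary LRUW.
        for bh in list(self.lists["LRUW_MAIN"]):
            self.move(bh, "LRUW_MAIN", "LRUW_AUX")
        written = []
        for bh in list(self.lists["LRUW_AUX"]):
            written.append(bh)                    # step 8: write to the data file
            self.move(bh, "LRUW_AUX", "LRU_AUX")  # step 9: back to auxiliary LRU
        return written
```

Starting from a main LRU list of BH1 and BH4 and an auxiliary LRU list holding BH2, calling update("BH2") twice and then dbwr_cycle() leaves BH2 back on the auxiliary LRU list, which matches the round trip described in the steps.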
    • checkpoint
      • Checkpoints ( checkpoint queue latch )
      • Purpose: to ensure that data blocks whose redo was generated up to a certain point in the redo log (RBA) are written to disk
      • Checkpoint structure includes:
      • – Checkpoint SCN
      • – Checkpoint RBA
      • – Thread that allocated the checkpoint
      • – Enabled thread bitmap
      • – Timestamp
      • About DDL:
      • The Oracle server ensures that the DDL is successfully mini-checkpointed before the DROP (which depends on the SCN and seq# of the data blocks within the object).
    • CKPTQ & FQ
      • Before Oracle8, DBWR scanned the entire cache to find buffers with the checkpoint bit set. The CKPTQ and FQs eliminate this scan.
      • – When a buffer is first modified, it is inserted into the CKPTQ in RBA order
      • – The buffer is also inserted into the appropriate FQ
      • When a checkpoint is initiated, DBWR writes all buffers on the queue until the checkpoint RBA is less than the RBA at the head of the CKPTQ.
      • A full checkpoint occurs when:
      • – The command alter system checkpoint is issued
      • – The database is shut down normally, that is, by any shutdown other than shutdown abort
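A minimal Python sketch of the checkpoint queue behaviour described for the CKPTQ. It assumes RBAs can be represented as monotonically increasing integers, so appending a buffer on its first modification keeps the queue in RBA order; the class and method names are hypothetical, and file queues (FQ) are left out.

```python
from collections import deque


class CheckpointQueue:
    """Toy CKPTQ: dirty buffers ordered by the RBA of their first change."""

    def __init__(self):
        self.queue = deque()  # (first_dirty_rba, buffer_id), oldest at the head
        self.queued = set()

    def on_modify(self, buffer_id, rba):
        # Only the first modification links the buffer; later changes
        # leave its position (and its first-dirty RBA) untouched.
        if buffer_id not in self.queued:
            self.queue.append((rba, buffer_id))
            self.queued.add(buffer_id)

    def checkpoint(self, checkpoint_rba):
        """Write from the head until the head RBA exceeds the checkpoint RBA."""
        written = []
        while self.queue and self.queue[0][0] <= checkpoint_rba:
            _, buffer_id = self.queue.popleft()
            self.queued.discard(buffer_id)
            written.append(buffer_id)  # stands in for the actual disk write
        return written
```

Because the queue is ordered, the loop stops at the first buffer whose RBA is past the checkpoint RBA, so no full scan of the cache is needed.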
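    • Conditions that trigger DBWR:
      • 1. While the LRU chain is being scanned for a buffer header that can be overwritten, the number of buffer headers already scanned reaches a limit (set by the hidden parameter _db_block_max_scan_pct; 40 in my database).
      • 2. While DBWR searches the main LRUW list for buffer headers that have been updated and are waiting to be written to the data files, the number of buffer headers found exceeds a limit (set by the hidden parameter _db_writer_scan_depth_pct; 25 in my database).
      • 3. The total number of dirty blocks on the main and auxiliary LRUW lists exceeds a limit, which is set by the hidden parameter _db_large_dirty_queue (25 in my database).
      • 4. A full checkpoint triggers DBWR.
      • 5. Taking a tablespace offline triggers DBWR.
      • 6. Issuing alter tablespace … begin backup, which puts the tablespace into hot backup mode, triggers DBWR.
      • 7. Making a tablespace read-only triggers DBWR.
      • 8. Dropping an object (for example, a table) triggers DBWR.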
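    • Writing data: DBWR copies the buffer headers of the dirty blocks to be written into a structure called the write batch. The DBWR process for each working set can copy buffer headers into this structure. The actual I/O takes place only once the number of buffer headers in the write batch reaches a certain quota.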
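The batching idea can be shown with a short Python sketch. The class, the batch_size quota, and the do_io callback are hypothetical names for this illustration; the slides do not say how the quota is set.

```python
from typing import Callable, List


class WriteBatch:
    """Collect dirty buffer headers and issue one I/O per full batch."""

    def __init__(self, batch_size: int, do_io: Callable[[List[str]], None]):
        self.batch_size = batch_size  # hypothetical quota of headers per I/O
        self.do_io = do_io            # stand-in for the real write call
        self.headers: List[str] = []

    def add(self, header: str) -> None:
        self.headers.append(header)
        if len(self.headers) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        """Hand the collected headers to the I/O callback and start over."""
        if self.headers:
            self.do_io(list(self.headers))
            self.headers.clear()
```

Passing a list's append method as do_io is enough to watch when batches are issued: nothing is written until the quota is reached.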
    • Wait events
      • buffer busy waits
      • P1 = file#, P2 = block_id, P3 = reason code
      • Reason codes 130 and 220 are the most common
      • Free buffer waits
      • If a session spends a lot of time on the free buffer waits event, it is usually due to one or a combination of the following five reasons:
      • Inefficient SQL statements
      • Not enough DBWR processes
      • Slow I/O subsystem
      • Delayed block cleanouts.
      • Small buffer cache
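    • Wait events
      • Latch free
      • – Cache buffers chains
      • Hot block problem
      • The p1raw column of v$session_wait shows whether latch free waits are caused by a hot block. If p1raw stays the same across waits, the sessions are waiting on the same latch address, and the system has a hot block.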
      • – Cache buffers lru chains
      • – Checkpoint queue latch
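The p1raw check for hot blocks can be automated once rows from v$session_wait have been sampled. The sketch below assumes the samples are already available as (event, p1raw) tuples; how they are fetched is left out, and both the function name and the min_share threshold are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def hot_latch_addresses(
    samples: Iterable[Tuple[str, str]],
    min_share: float = 0.5,  # hypothetical threshold, tune to taste
) -> List[str]:
    """Return p1raw values that dominate the sampled 'latch free' waits.

    samples: (event, p1raw) rows sampled repeatedly from v$session_wait.
    """
    addrs = [p1raw for event, p1raw in samples if event == "latch free"]
    if not addrs:
        return []
    counts = Counter(addrs)
    return [a for a, n in counts.most_common() if n / len(addrs) >= min_share]
```

An address that keeps reappearing across samples points to a single latch, and therefore to the hash chains it protects, as the likely home of the hot block.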