RMOUG Training Days 2004: LOB Internals and Performance Tuning
1. RMOUG Training Days
2004
LOB Internals and Performance
Tuning
Tanel Põder
independent consultant
http://integrid.info
11-Feb-04
Tanel Põder RMOUG Training Days
2. Agenda
• Introduction & about
• Storing large content in Oracle database
• LOB internal architecture
• LOB physical storage planning
• LOB cache layer tuning
• Accessing and loading LOBs
• Dev/architecture strategies
• Temporary LOBs
• Summary
3. Introduction & About
• Name: Tanel Põder
• Experience: 7 years as Oracle DBA
• Occupation: independent consultant
• Company: integrid.info
• Other: Europe, Middle-East & Africa Oracle
User Group BoD member
• Oracle Certified Master DBA
• More information and presentations:
http://integrid.info
4. Storing Content in Oracle
• CHAR/VARCHAR2 - 2000/4000 byte limit
• RAW - 2000 byte limit
• LONG/LONG RAW - always stored in row
- no random access
- one column per table
- hard to maintain
LOBs - a powerful way for storing,
accessing and maintaining large content
in Oracle database
5. Large Content in VARCHARs
• DBA_SOURCE
SQL> desc dba_source
 Name                 Null?    Type
 -------------------- -------- ---------------
 OWNER                         VARCHAR2(30)
 NAME                          VARCHAR2(30)
 TYPE                          VARCHAR2(12)
 LINE                          NUMBER
 TEXT                          VARCHAR2(4000)
• Oracle holds its PL/SQL source code in
VARCHAR2 pieces
• Source is split up by rows
6. Large content in LONGs
• DBA_VIEWS
SQL> desc dba_views
 Name                 Null?    Type
 -------------------- -------- --------------
 OWNER                NOT NULL VARCHAR2(30)
 VIEW_NAME            NOT NULL VARCHAR2(30)
 TEXT_LENGTH                   NUMBER
 TEXT                          LONG
 ...
• LONGs are still used in the data dictionary even though
they were deprecated a long time ago
7. LOB Concepts
• Able to store large content
– 4GB in 9i, 2-128TB in 10g (2^32-1 blocks)
• Must also cope with some small content
• Store both binary and character content
• Ability for random access
• Be manageable
– Partitioning, reorganizing
• Can be used in parallel operations
• External objects accessible and domain
indexable
8. LOB Architecture
• A LOB can be stored inline with the row
– enable storage in row option
– Will be automatically moved out of line if it grows large
• … or can be defined to always remain out-of-line
– disable storage in row option
• A LOB can also reside outside the database
(BFILEs)
– Can be indexed using Oracle Text
9. Inline LOB Architecture
• create table emp (id number, name varchar2(100), fingerprint blob)
lob (fingerprint) store as lob_fingerprint (enable storage in row);
[Figure: EMP rows (ID, NAME, FINGERPRINT). Small FINGERPRINT values are held in the LOB column itself, one row is null, one row holds empty_clob(), and larger values hold an inode that points to chunks in the LOB segment (segment header, bitmap, data chunks). A LOB index leaf block is shown next to the LOB segment.]
10. Multiple LOBs in a Table
• create table emp (id number, name varchar2(100), fingerprint blob,
picture blob) lob (fingerprint) store as lob_fprint (enable storage in
row) lob (picture) store as lob_picture (enable storage in row);
[Figure: EMP rows (ID, NAME, FINGERPRINT, PICTURE). Each LOB column keeps small values in the row and an inode for larger ones. Each LOB column has its own LOB index leaf block and its own LOB segment (segment header, bitmap, data chunks).]
11. Out-of-line LOB Architecture
• create table emp (id number, name varchar2(100), fingerprint blob,
picture blob) lob (fingerprint) store as lob_fprint (enable storage in
row) lob (picture) store as lob_picture (disable storage in row);
[Figure: EMP rows with FINGERPRINT values stored in the row and PICTURE holding only a LOB ID (or null). The LOB IDs are resolved through the LOB index (branch and leaf blocks) to chunks in the LOB segment (segment header, bitmap, data chunks).]
12. LOB Column Internal Structures
• enable storage in row, inline LOB
  LOB LOCATOR (20 bytes) | LOB INODE (16 bytes) | DATA (max 3964 bytes)
  – Max. total column size 4000 bytes
  – No LOB index entries
  – No LOB segment entries
• enable storage in row, out-of-line LOB
  LOB LOCATOR (20 bytes) | LOB INODE (16 bytes) | POINTERS TO DATA (0-48 bytes, 12 x RDBA)
  – Up to 12 4-byte relative DBAs are stored inline. If LOB grows larger, LOB
    index will be used to find data chunks
• disable storage in row
  LOB LOCATOR (20 bytes)
  – DATA is allocated in chunks
  – inodes are kept in index leafs
  – LOB index stores inodes for large LOB items, also for old chunk versions for
    providing read consistency
13. LOB Locator Structure
• LOB locator is a pointer to a LOB instance's
physical location
– Locator works the same way for both persistent
and temporary LOBs
– 20 bytes in size
– Contains 10 byte LOB ID (2B+8B LOB OID)
– Includes some metainformation
• LOB locator column dump and structure:
len(2), vsn(2), flg(4), bytl(2), lobid(10), inode(16), data
col 1: [36]
00 54 00 01 02 0c 80 00 00 02 00 00 00 01 00 00 00 02 8c d1 <- LOB locator
00 10 09 00 00 00 00 00 00 00 00 00 00 00 00 00 <- kdlinode
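The field layout above can be applied to the dumped bytes with a few lines of Python. This is only a sketch: the field names and widths are taken from the len/vsn/flg/bytl/lobid/inode line on the slide, and `parse_locator` is a hypothetical helper, not an Oracle API.

```python
# Split the 36-byte column dump into the fields named on the slide.
# Layout assumed from the slide: len(2), vsn(2), flg(4), bytl(2), lobid(10), inode(16).
DUMP = (
    "00 54 00 01 02 0c 80 00 00 02 00 00 00 01 00 00 00 02 8c d1 "
    "00 10 09 00 00 00 00 00 00 00 00 00 00 00 00 00"
)

FIELDS = [("len", 2), ("vsn", 2), ("flg", 4), ("bytl", 2), ("lobid", 10), ("inode", 16)]


def parse_locator(hex_dump):
    """Return a dict mapping each field name to its raw bytes."""
    raw = bytes.fromhex(hex_dump)
    out, pos = {}, 0
    for name, size in FIELDS:
        out[name] = raw[pos:pos + size]
        pos += size
    out["data"] = raw[pos:]  # anything after the inode: inline data or chunk pointers
    return out


fields = parse_locator(DUMP)
for name, value in fields.items():
    print(f"{name:6s} {value.hex(' ')}")
```

The first five fields add up to the 20-byte locator; for this empty LOB nothing follows the 16-byte inode.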
14. LOB inode Structure
• LOB inode is a structure for keeping track of
chunks belonging to a LOB item
• LOB inode information can be kept in-row
– Stores RDBAs (relative DBAs) for chunks
– Requires 16 bytes + 4 bytes per chunk in lobitem
– max 12 chunks inline
• LOB inode information can be kept in LOB
index
– disable storage in row LOBs
– enable storage in row LOBs with over 12 chunks
– Old versions for LOB chunks
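The placement rules on this slide can be written down as a small decision function. This is a sketch under stated assumptions: the function name and return strings are made up for illustration, and every chunk is assumed to hold exactly `chunk_bytes` of LOB data (per-block overhead is ignored).

```python
import math

LOCATOR_BYTES = 20          # LOB locator size from the slides
INODE_BYTES = 16            # inode header size from the slides
MAX_INLINE_COLUMN = 4000    # max total column size for an inline LOB
MAX_INLINE_POINTERS = 12    # max chunk pointers (RDBAs) kept in the row

# Largest LOB value that still fits in the row: 4000 - 20 - 16
MAX_INLINE_DATA = MAX_INLINE_COLUMN - LOCATOR_BYTES - INODE_BYTES


def inode_placement(lob_bytes, chunk_bytes, storage_in_row=True):
    """Say where the data or the chunk pointers live for one LOB item."""
    if not storage_in_row:
        return "inode in LOB index"
    if lob_bytes <= MAX_INLINE_DATA:
        return "data inline"
    chunks = math.ceil(lob_bytes / chunk_bytes)  # assumption: no per-chunk overhead
    if chunks <= MAX_INLINE_POINTERS:
        return "chunk pointers inline"
    return "inode in LOB index"


print(MAX_INLINE_DATA)
for size in (2000, 4000, 12 * 8192, 12 * 8192 + 1):
    print(size, "->", inode_placement(size, chunk_bytes=8192))
```

The 8192-byte chunk size in the usage loop is an arbitrary example value, not something the slides specify.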
15. An Inline LOB inode Example
create table t (a number, b clob) tablespace lobtest lob (b)
store as lob_tx (enable storage in row tablespace lobtest
nocache nologging);
insert into t values (1,NULL);
• The column simply has NULL value (or isn't stored in row)
insert into t values (2,empty_clob());
col 1: [36]
00 54 00 01 02 0c 80 00 00 02 00 00 00 01 00 00 00 02 8c d1
00 10 09 00 00 00 00 00 00 00 00 00 00 00 00 00
• The column stores 20-byte LOB locator, 16-byte inode structure, but no pointers
to chunks, since the LOB is empty
insert into t values (3,'X');
col 1: [38]
00 54 00 01 02 0c 80 00 00 02 00 00 00 01 00 00 00 02 8c d2
00 12 09 00 00 00 00 00 00 02 00 00 00 00 00 01 58 00
• The inserted value itself is stored in row (CLOBs are always in fixed width 2-byte
charset)
16. An Inline LOB inode Example Cont'd
insert into t values (4, rpad('X', 1000, 'X'));
col 1: [2036]
00 54 00 01 02 0c 80 00 00 02 00 00 00 01 00 00 00 02 8c d3 <-loc.
07 e0 09 00 00 00 00 00 07 d0 00 00 00 00 00 01 <-inode
58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 <- data
58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 ...
• The total column size doesn't exceed 4000 bytes, thus stored in the row
insert into t values (5, rpad('X', 2000, 'X'));
col 1: [40]
00 54 00 01 02 0c 80 00 00 02 00 00 00 01 00 00 00 02 8c d4 <-loc.
00 14 05 00 00 00 00 00 0f a0 00 00 00 00 00 02 03 c0 00 14 <-RDBA
• The column is stored out of line, since its total size with the 20-byte locator and
16-byte inode structure would exceed 4000 bytes
• Instead a 4-byte relative datablock address pointing to the first block of the LOB
chunk is stored inline
• Still no LOB index entries are inserted
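The column sizes in these dumps can be cross-checked against the inode bytes. The sketch below rests on an assumption inferred only from the four dumps shown, not on documented structure: the first two inode bytes appear to hold the inode size including any trailing data or pointers, and the two bytes at offset 8 appear to hold the LOB length in bytes.

```python
# Inode bytes (everything after the 20-byte locator) for rows 2 to 5 above.
# Row 4's inline data is truncated on the slide, so only its 16-byte header is listed.
INODES = {
    2: "00 10 09 00 00 00 00 00 00 00 00 00 00 00 00 00",
    3: "00 12 09 00 00 00 00 00 00 02 00 00 00 00 00 01 58 00",
    4: "07 e0 09 00 00 00 00 00 07 d0 00 00 00 00 00 01",
    5: "00 14 05 00 00 00 00 00 0f a0 00 00 00 00 00 02 03 c0 00 14",
}
LOCATOR_BYTES = 20


def decode(hex_inode):
    """Return (inode_size, lob_bytes) under the assumptions stated above."""
    raw = bytes.fromhex(hex_inode)
    inode_size = int.from_bytes(raw[0:2], "big")   # assumed: header plus trailing payload
    lob_bytes = int.from_bytes(raw[8:10], "big")   # assumed: LOB length in bytes
    return inode_size, lob_bytes


for row, dump in INODES.items():
    inode_size, lob_bytes = decode(dump)
    print(f"row {row}: column = {LOCATOR_BYTES + inode_size} bytes, LOB = {lob_bytes} bytes")
```

Under that reading the computed column sizes match the dumped lengths [36], [38], [2036] and [40], and the 4000-byte LOB in row 5 costs only one 4-byte pointer in the row.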
18. LOB Index Structure
• A B-tree like index structure
– Leafs contain pointers to LOB chunks (RDBAs)
– Contains the inode if disable storage in row LOB
– One LOB item may have several associated rows
in index, when lobitem consists of many chunks
– Is subject to normal caching, logging and undo
mechanisms (unlike LOB segments)
• Has to be stored in the same tablespace
with the LOB segment
– That way addressing can be done using RDBAs
– Logically related segments also grouped
physically
19. disable storage in row LOB index
row#9[7536] flag: ------, lock: 2, len=50, data:(32):
00 20 03 00 00 00 00 7d 1d 4c 00 00 00 00 00 01 <- kdlinode
01 80 00 eb 01 80 00 ec 01 80 00 ed 01 80 00 ee <- RDBAs
col 0; len 10; (10): 00 00 00 01 00 00 00 00 2f 46 <- LOB ID
col 1; len 4; (4): 00 00 00 00
row#10[7486] flag: ------, lock: 2, len=50, data:(32):
01 80 00 ef 01 80 00 f0 01 80 01 39 01 80 01 3a <- only RDBAs
01 80 01 3b 01 80 01 3c 01 80 01 3d 01 80 01 3e <- only RDBAs
col 0; len 10; (10): 00 00 00 01 00 00 00 00 2f 46 <- LOB ID
col 1; len 4; (4): 00 00 00 04 <- offset for above chunks in LOB
row#11[7436] flag: ------, lock: 2, len=50, data:(32):
01 80 01 3f 01 80 01 40 01 80 01 41 01 80 01 42
01 80 01 43 01 80 01 44 01 80 01 45 01 80 01 46
col 0; len 10; (10): 00 00 00 01 00 00 00 00 2f 46
col 1; len 4; (4): 00 00 00 0c
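The chunk pointers in this dump can be decoded with a short script. It is a sketch that assumes the common relative DBA layout of 10 bits for the relative file number and 22 bits for the block number; the helper names are hypothetical.

```python
def rdbas(hex_bytes):
    """Read consecutive 4-byte big-endian relative DBAs from a hex string."""
    raw = bytes.fromhex(hex_bytes)
    return [int.from_bytes(raw[i:i + 4], "big") for i in range(0, len(raw), 4)]


def split_rdba(rdba):
    """Assumed layout: upper 10 bits = relative file#, lower 22 bits = block#."""
    return rdba >> 22, rdba & 0x3FFFFF


# (chunk offset from col 1, chunk pointers) for row#9, row#10 and row#11 above.
# The 16-byte kdlinode at the start of row#9 is left out.
ROWS = [
    (0x00, "01 80 00 eb 01 80 00 ec 01 80 00 ed 01 80 00 ee"),
    (0x04, "01 80 00 ef 01 80 00 f0 01 80 01 39 01 80 01 3a "
           "01 80 01 3b 01 80 01 3c 01 80 01 3d 01 80 01 3e"),
    (0x0C, "01 80 01 3f 01 80 01 40 01 80 01 41 01 80 01 42 "
           "01 80 01 43 01 80 01 44 01 80 01 45 01 80 01 46"),
]

for offset, pointers in ROWS:
    chunks = [split_rdba(r) for r in rdbas(pointers)]
    print(f"offset {offset:2d}: {len(chunks)} chunks, "
          f"first (file, block) = {chunks[0]}, last = {chunks[-1]}")
```

Each row's offset equals the number of chunk pointers held by the rows before it (0, then 4, then 12), which is consistent with col 1 being the offset of that row's chunks within the LOB.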
20. LOB Segment Structure
• A bunch of datablocks organized in chunks
– Chunks are sometimes called pages or fatblocks
• A Chunk is the minimum size of IO for LOBs
• Read consistency and rollback are
implemented by creating versions of chunks
– During update, a new LOB ID is created
– A new version of updated chunks is created
– New locator's inode and chunk pointers in LOB
indexes point to new chunk version
• Old chunks are retained in LOB segment
– Depending on PCTVERSION or RETENTION
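The versioning idea can be illustrated with a toy model. This is not Oracle's implementation: the class, its methods and the integer "addresses" are all hypothetical, and the sketch only shows that an update writes a new chunk version while old pointers stay readable until the old version is reclaimed.

```python
class ToyLobSegment:
    """Toy stand-in for a LOB segment: chunks are written once, never updated in place."""

    def __init__(self):
        self.chunks = {}       # chunk address -> chunk bytes
        self.next_addr = 0

    def write_chunk(self, data):
        addr = self.next_addr
        self.chunks[addr] = data
        self.next_addr += 1
        return addr

    def reclaim(self, addr):
        # Models an old version being overwritten once PCTVERSION/RETENTION allows it.
        del self.chunks[addr]

    def read(self, pointers):
        try:
            return b"".join(self.chunks[a] for a in pointers)
        except KeyError:
            raise LookupError("old chunk version is gone (compare ORA-1555)") from None


seg = ToyLobSegment()
v1 = [seg.write_chunk(b"AAAA"), seg.write_chunk(b"BBBB")]   # pointers seen by an old locator
v2 = [v1[0], seg.write_chunk(b"bbbb")]                      # update rewrites only chunk 2

print(seg.read(v1))   # old locator still reads the old version
print(seg.read(v2))   # new locator reads the new version

seg.reclaim(v1[1])    # the old chunk version is reused
try:
    seg.read(v1)
except LookupError as exc:
    print("old reader fails:", exc)
```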
21. Changes Inside LOB Segment
• Users access LOBs using locators
• A locator allows read consistent
access to a LOB instance
– Even when LOB chunks are updated by another transaction
– Old locators' pointers are kept in LOB index
– Old chunk versions are read; if they have been overwritten, ORA-1555 is returned
[Figure: LOB index root with entries for chunk versions 1 and 2, pointing to chunks in the LOB segment (segment header, bitmap, data chunks)]
22. (N)CLOB charset issues
• (N)CLOBs are always fixed-width starting
from 8i (UTF-8, AL32UTF16)
– Varying width charsets are converted to fixed
width internally
– Fixed width charsets will remain unchanged
• AL16UTF16 is always Big-Endian encoded
– big-endian means that most significant byte of a
word is stored first
• UCS2 encoding is platform dependent
• Comparing different-endian internal data
structures may use more CPU
23. Little Endian vs. Big Endian
• Little endian stores less significant values of a
word first (to lower memory address):
                 Original   Little Endian   Big Endian
Character 'X':   00 58      58 00           00 58
• Transparent to user, but may have
performance impact
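The two byte orders are easy to reproduce outside the database. The sketch below uses Python's standard UTF-16 codecs as a stand-in for a fixed-width 2-byte character set; it does not touch Oracle at all.

```python
ch = "X"
big = ch.encode("utf-16-be")      # most significant byte first
little = ch.encode("utf-16-le")   # least significant byte first

print("big endian:   ", big.hex(" "))
print("little endian:", little.hex(" "))

# Reading little-endian bytes with the wrong byte order yields a different character.
misread = little.decode("utf-16-be")
print("misread code point: U+%04X" % ord(misread))
```

The inline CLOB data in the earlier column dumps appears as 58 00 per character, which matches the little-endian form shown here.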
• 10g automatically stores all (N)CLOB in Big-E
– Use this query to find old-fashioned LOBs
– Convert them to a new table using CTAS
select o.name, c.name from obj$ o, col$ c, lob$ l
where l.obj#=o.obj#
and o.obj#=c.obj#
and l.intcol# = c.intcol#
and c.type#=112
and bitand(l.property, 512) = 1;
24. LOB Storage Planning and Tuning
• LOB block Size
– Blocksize can be different from table block size
– Chunk sizes are multiples of Oracle blocks
• Chunk Size
– Bigger for big LOBs
– Bigger chunks allow Oracle to do multiblock reads
– Bigger chunks keep LOB indexes smaller
• PCTVERSION / RETENTION
• Caching (file system read, HW read/write)
• Asynch IO
25. Block Size Considerations
SQL> create table tl (a clob);
SQL> insert into tl values (rpad('X', 1982, 'X'));
SQL> insert into tl values (rpad('X', 1982, 'X'));
SQL> analyze table tl compute statistics;
SQL> select blocks, avg_space from user_tables where table_name = 'TL';
    BLOCKS  AVG_SPACE
---------- ----------
         2       4070
SQL> truncate table tl;
SQL> alter table tl pctfree 0;
SQL> insert into tl values (rpad('X', 1982, 'X'));
SQL> insert into tl values (rpad('X', 1982, 'X'));
SQL> analyze table tl compute statistics;
SQL> select blocks, avg_space from user_tables where table_name = 'TL';
    BLOCKS  AVG_SPACE
---------- ----------
         1         62
[Figure: a datablock with the PCTFREE 10% and PCTUSED 40% thresholds marked]
• May cause space and IO wastage, especially in FREELIST managed tables
26. Partitioned LOBs
• Similar to regular table partitioning
• LOB segment partitions have to be in same
blocksize
– But LOB segment block size may differ from
parent table segment's blocksize
• Partitioned LOBs are fully supported for
tables
• 10g supports partitioned IOT with LOBs
• Eases backup & recovery in VLDBs
27. LOB Cache Layer Tuning
• LOBs can have following internal caching
attributes:
– CACHE
– NOCACHE
– CACHE READS (8.1.6+)
create table t (a clob)
lob (a) store as lob_a
(disable storage in row
nocache nologging);
• Despite any settings, in-line LOBs are cached
anyway, they are regular columns in a row
• LOB indexes are always cached
• Client side caching can be done using OCI
and Thick JDBC
– Uses lob buffering subsystem (LBS)
28. CACHE LOB
• Reads & writes are done through buffer cache
• Any modifications will be logged
– Only redo vectors for actual changes in the block
are logged
• Modifications are asynchronously written to
disk by DBWR
• Can saturate buffer cache heavily
– Especially for older databases without buffer
cache touch count mechanism
– There is no _small_table_threshold parameter
for LOBs to avoid large buffered scans
29. CACHE LOB Continued
• Consider different blocksize and buffer pool
for LOBs
• Wait event:
– db file seq/scattered read
– P1: file number
– P2: first DBA
– P3: blockcount
• Good for accessing and
modifying the same
data frequently
[Figure: server process (PGA), cached LOB block, DBWR]
30. NOCACHE LOB
• Server process reads blocks directly to PGA
• Writes are done also by server process
• Server process waits until writes completed
• Even though buffer cache is bypassed, some
coordination work is still required
• Wait event:
– direct path read / write (lob)
– P1: file number
– P2: first DBA
– P3: blockcount
[Figure: server process reading LOB blocks directly into its PGA]
31. CACHE READS LOB
• Reads will be done through buffer cache
• Writes are direct
• Good for frequently read, but rarely modified
data
• Particularly good in environments with huge
incoming data feeds, where writes can be
cached and batched on client side or in OCI
• Where caching writes would cause too much
logging
32. LOB Logging Attributes
• Inline LOBs are always logged!
• LOB indexes are always logged!
• CACHE LOBs are always logged!
• NOLOGGING works only for NOCACHE and
CACHE READS LOBs
• Using LOGGING NOCACHE LOB, the whole
chunks will be logged, despite the size of
updated content – performance impact
• NOLOGGING NOCACHE LOB introduces
backup & recovery issues
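The rules on this slide can be condensed into one function. This is an illustrative sketch only: `redo_generated` is a hypothetical helper that encodes the slide's statements, and it says nothing about how much redo is written.

```python
def redo_generated(inline, cache_mode, logging):
    """Return True if changes to the LOB data are logged, per the rules on the slide.

    inline:     True if the value is stored in the row
    cache_mode: "CACHE", "NOCACHE" or "CACHE READS"
    logging:    True for LOGGING, False for NOLOGGING
    """
    if inline:
        return True        # inline LOBs are always logged
    if cache_mode == "CACHE":
        return True        # CACHE LOBs are always logged
    return logging         # NOLOGGING only takes effect for NOCACHE / CACHE READS


for mode in ("CACHE", "NOCACHE", "CACHE READS"):
    for logging in (True, False):
        label = "LOGGING" if logging else "NOLOGGING"
        result = "logged" if redo_generated(False, mode, logging) else "not logged"
        print(f"{mode:12s} {label:10s} -> {result}")
```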
33. Nologging Direct IO Implications
• Controlfile updates on every NOLOGGING
operation
– Extreme controlfile parallel read and write
events
• Set event 10359 at level 1 to avoid
controlfile updates
– Documented in 9.0.1 docs and Note 1058851.6
– The event doesn't change recoverability, only
RMAN's ability to detect files needing backup
– Affects unrecoverable_change# in V$DATAFILE
• Backup & Recovery considerations
34. Developer Strategies
• Simply storing and reading LOBs is relatively
simple
• Frequently manipulated LOBs may become
resource hungry
• Use temporary LOBs
– Support in PL/SQL, OCI, JDBC
– dbms_lob.createtemporary()
• Use LOB Buffering Subsystem (LBS)
– Allows buffering, modifying and batching updates to
LOBs on the client side
35. Temporary LOBs
• Regular LOBs stored in database are called
internal persistent LOBs
• Temporary LOBs are in-session structures
used for efficient LOB manipulation
– In-memory objects in UGA
– “paged” to temporary tablespace
– No logging, rollback or CR mechanisms
– Is empty when instantiated
– Can be instantiated using CACHE or NOCACHE
attribute
• Very lightweight
36. Architectural Strategies
• Uniform access to LOBs whether stored
inside or outside the database (BFILEs)
– The same code can be reused despite the
storage type (although BFILEs are read only)
• When loading, buffer and bundle at client
side as much as possible
– Means fewer calls to database and fewer direct
write requests for NOCACHE or CACHE READS
LOBs
• Use LOB random access facilities if required
– A major benefit over LONGs
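The "buffer and bundle at client side" advice can be sketched as a small write buffer that only calls the database when at least one full chunk has accumulated. Everything here is hypothetical: `ChunkBuffer` and `write_fn` are illustrative names, and `write_fn` stands in for whatever LOB write call the client API provides.

```python
class ChunkBuffer:
    """Collect small appends and pass them on in multiples of the chunk size."""

    def __init__(self, chunk_bytes, write_fn):
        self.chunk_bytes = chunk_bytes
        self.write_fn = write_fn      # hypothetical callback doing the real LOB write
        self.buf = bytearray()

    def append(self, data):
        self.buf += data
        full = len(self.buf) // self.chunk_bytes * self.chunk_bytes
        if full:
            self.write_fn(bytes(self.buf[:full]))
            del self.buf[:full]

    def flush(self):
        if self.buf:
            self.write_fn(bytes(self.buf))
            self.buf.clear()


calls = []                                   # records each simulated write call
buffer = ChunkBuffer(chunk_bytes=4096, write_fn=calls.append)
pieces = [bytes([i]) * 100 for i in range(100)]   # one hundred 100-byte appends
for piece in pieces:
    buffer.append(piece)
buffer.flush()

print(len(pieces), "appends became", len(calls), "write calls")
```

With these example sizes the hundred appends collapse into two full-chunk writes plus one final partial write.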
37. General Recommendations
• The best way for optimizing performance of
an operation is not to do the operation at all
– Cache, bundle, etc.
– However, caching is not always the best
approach
• Several bugs related to ASSM and LOBs
– Free space never reused, etc.
– Use manual (freelist) segment space mgmt.
– Any new functionality (such as partitioned IOT with
LOBs) should be evaluated and tested carefully
• Incremental backups of fairly static content
38. Summary
• LOBs should be used instead of LONGs
– However in some cases RAW or VARCHAR may be
better
• Can be enable storage in row or disable
storage in row
– Disable storage in row always out of line, leaving
20-byte LOB locator in row
– Content accessed through LOB index
– Enable storage in row keeps content in row if
content+overhead <= 4000 bytes (3964+20+16)
• NOCACHE, CACHE, CACHE READS,
LOGGING, NOLOGGING options
39. LOB Internals and Performance
Tuning
Tanel Põder
Thank you!
http://integrid.info
tanel@integrid.info