Online index rebuild automation

Online Index Rebuild
Automation
on a multitenant fleet
Carlos Sierra

Motivation
• Improve performance of Execution Plans which include Index Fast Full
Scans (IFFS)
• Or large Index Range Scans operations
• Shrink large bloated indexes to reduce Tablespace growth
• Thus reduce respective space alerts

Case study
• Custom key-value infrastructure application installed in 30+ CDBs and
700+ PDBs
• Multi-versioning implemented at the application layer
• Sequence-based transaction id as versioning mechanism
• Most indexes monolithically increasing in one or more columns
• Target query execution time in the order of few milliseconds, or < 1ms
• “Optimal” execution plans often call for IFFS or long index scans
• Many indexes with wasted space > 25%, some with > 95%

Sample indexes
• DB_SYSTEMS_PK (ID, TXNID)
• I_DB_SYS_BY_COMPARTMENT (COMPARTMENTID, TXNID)
• TRANSACTIONS_PK (TRANSACTIONID)
• TRANSACTIONS_AK (COMMITTRANSACTIONID, STATUS, TRANSACTIONID)
• TRANSACTIONS_AK2 (TRANSACTIONID, BEGINTIME)
• TRANSACTIONS_AK3 (STATUS, COMMITTRANSACTIONID, TRANSACTIONID)
• TRANSACTIONKEYS_PK (TRANSACTIONID, STEPNUMBER)
• TRANSACTIONKEYS_AK (COMMITTRANSACTIONID, BUCKETID)

Typical query
SELECT
<bunch of columns>
FROM
DB_SYSTEMS
WHERE
(id, TxnID, 1) IN (
SELECT
id, TxnID,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY TxnID DESC) rn
FROM
DB_SYSTEMS
WHERE
TxnID <= :1
)
AND Live = 'Y'
AND compartmentId = :2
ORDER BY
compartmentId ASC, id ASC
FETCH FIRST :3 ROWS ONLY

Typical execution plan
---------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | Buffers |
---------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 | 70976 |
|* 1 | VIEW | | 1 | 31 | 1 | 70976 |
|* 2 | WINDOW SORT PUSHED RANK | | 1 | 31 | 1 | 70976 |
| 3 | NESTED LOOPS | | 1 | 31 | 1 | 70976 |
|* 4 | TABLE ACCESS BY INDEX ROWID BATCHED| DB_SYSTEMS | 1 | 2375 | 2462 | 2383 |
|* 5 | INDEX RANGE SCAN | I_DB_SYS_BY_COMPARTMENT | 1 | 2375 | 2462 | 200 |
|* 6 | VIEW PUSHED PREDICATE | VW_NSO_1 | 2462 | 1 | 1 | 68593 |
| 7 | WINDOW BUFFER | | 2462 | 2 | 1 | 68593 |
|* 8 | INDEX RANGE SCAN DESCENDING | DB_SYSTEMS_PK | 2462 | 2 | 1 | 68593 |
---------------------------------------------------------------------------------------------------------------
• Scan entire set of matching rows for given compartment, then for each row scan all matching rows for
specific id, finding the one with largest transaction id. If row from outer scan matches the max from inner
scan then return it.
• For popular values this nested loop scan means O(N2) complexity

Alternative execution plans
• Nested loops with subquery as outer rows source
• Hash join with row sources accessed by
• Full scans on table
• Full scans on indexes
• Large range scans

Performance of some known execution plans
AVG_ET_SECS_AWR PLAN_HASH_VALUE NL HJ P99_ET_SECS P95_ET_SECS
--------------- --------------- --- --- ----------- -----------
0.040205 1339591056 0 1 0.054080 0.054080
0.058834 979865086 0 1 0.064525 0.064329
0.072383 3083557534 0 1 0.081679 0.081338
1.545154 1910733940 1 0 3.384472 3.217428
• Nested Loops plan with historical average performance per execution of 1.5s, 99th
percentile of 3.4s and 95th percentile of 3.2s
• 3 Hash-Join plans with average performance in the 40ms to 72ms range
• Best performant HJ plan with 40ms on average, and 95th percentile of 54ms

Two of the “optimal” HJ plans

Case study rules of thumb
• Queries are constrained by FETCH FIRST :b ROWS ONLY
• If unconstrained, the query were to return many rows, then NL plans
may perform better on average than HJ plans
• If unconstrained, the query were to return few or no rows, then HJ
plans may perform better on average than NL plans
• Consistent performance is perceived as more valuable than better
performance
• Then HJ plans are often the right answer!

Candidate indexes for rebuild
• Larger than 10 MB
• Wasted space of over 25%
• They are accessed through a FULL SCAN
• As per execution plans in memory or last 7 days of AWR history

Exclusions and inclusions
• Exclude indexes from a TRANSACTIONS table for which the
application performs an explicit LOCK EXCLUSIVE
• Regardless of size or wasted space
• Include indexes from a TRANSACTIONKEYS table that are known to be
the largest on many PDBs
• Only if > 10 MB and wasted space > 25%

Some large bloated indexes
• Not referenced by IFFS or large
range scans
• Average size of a Tablespace for
a PDB is 100 GB
• These 4 indexes are usually the
largest ones on a PDB
• Wasted space for some indexes
in the order of 90%

Online Index Rebuild Automation
• OEM Job executed weekly on every CDB of the fleet
• Loop over all PDBs
• Loop over all candidate indexes per PDB
• Considering thresholds (space and savings) plus exceptions
• Execute online index rebuild
• Wait 2 minutes between indexes being rebuilt
• Log into 3 places
• Trace file
• Report (spool file)
• Alert log

Results: sensible space reduction
• 130s to 135s per index including 120s between them. Thus ~10s to ~15s per index on average
• No visible contention during “eyes-on-glass” on two monitored “online rebuild” cycles of entire fleet
• 105 to 142 GB of saved space per CDB on first cycle (including large bloated indexes). Savings ~90%

Results: small “overall” performance gain
• Real gain is on short-latency queries performing full scans on rebuilt indexes
• Performance of IFFS operation is proportional to index size

Moving forward
• Tune online index rebuild autonomous job
• Adjust size (10 MB) and savings (25%) thresholds
• Reduce transaction’s retention window
• Target: as short as business permits (maybe 2h)
• Perform aggressive garbage collection
• Implement monitoring job with alarming capabilities
• Implement index compression
• On leading wide-columns with low to medium cardinality
• Evaluate shrinking tables with a new autonomous job
• Using online table redefinition

Further improvements
• Adapt and adopt SPM to achieve plan stability on critical SQL
• Implement a smart algorithm based on current and historical performance
• Some of the chosen plans will contain IFFS and HJ
• Address currently excluded table
• Promote EXCLUSIVE TABLE LOCK to a DBMS_LOCK
• Improve dynamic SQL eliminating “versioning” subquery
• Include ending transaction on each row
• Pair Oracle system change number (SCN) to transactions, and use AS OF

Online index rebuild automation

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Online index rebuild automation

Similar to Online index rebuild automation (20)

Recently uploaded

Recently uploaded (20)

Online index rebuild automation