Advertisement
Advertisement

More Related Content

Advertisement
Advertisement

Online index rebuild automation

  1. Online Index Rebuild Automation on a multitenant fleet Carlos Sierra
  2. Motivation • Improve performance of Execution Plans which include Index Fast Full Scans (IFFS) • Or large Index Range Scans operations • Shrink large bloated indexes to reduce Tablespace growth • Thus reduce respective space alerts
  3. Case study • Custom key-value infrastructure application installed in 30+ CDBs and 700+ PDBs • Multi-versioning implemented at the application layer • Sequence-based transaction id as versioning mechanism • Most indexes monolithically increasing in one or more columns • Target query execution time in the order of few milliseconds, or < 1ms • “Optimal” execution plans often call for IFFS or long index scans • Many indexes with wasted space > 25%, some with > 95%
  4. Sample indexes • DB_SYSTEMS_PK (ID, TXNID) • I_DB_SYS_BY_COMPARTMENT (COMPARTMENTID, TXNID) • TRANSACTIONS_PK (TRANSACTIONID) • TRANSACTIONS_AK (COMMITTRANSACTIONID, STATUS, TRANSACTIONID) • TRANSACTIONS_AK2 (TRANSACTIONID, BEGINTIME) • TRANSACTIONS_AK3 (STATUS, COMMITTRANSACTIONID, TRANSACTIONID) • TRANSACTIONKEYS_PK (TRANSACTIONID, STEPNUMBER) • TRANSACTIONKEYS_AK (COMMITTRANSACTIONID, BUCKETID)
  5. Typical query SELECT <bunch of columns> FROM DB_SYSTEMS WHERE (id, TxnID, 1) IN ( SELECT id, TxnID, ROW_NUMBER() OVER (PARTITION BY id ORDER BY TxnID DESC) rn FROM DB_SYSTEMS WHERE TxnID <= :1 ) AND Live = 'Y' AND compartmentId = :2 ORDER BY compartmentId ASC, id ASC FETCH FIRST :3 ROWS ONLY
  6. Typical execution plan --------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | Buffers | --------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 1 | 70976 | |* 1 | VIEW | | 1 | 31 | 1 | 70976 | |* 2 | WINDOW SORT PUSHED RANK | | 1 | 31 | 1 | 70976 | | 3 | NESTED LOOPS | | 1 | 31 | 1 | 70976 | |* 4 | TABLE ACCESS BY INDEX ROWID BATCHED| DB_SYSTEMS | 1 | 2375 | 2462 | 2383 | |* 5 | INDEX RANGE SCAN | I_DB_SYS_BY_COMPARTMENT | 1 | 2375 | 2462 | 200 | |* 6 | VIEW PUSHED PREDICATE | VW_NSO_1 | 2462 | 1 | 1 | 68593 | | 7 | WINDOW BUFFER | | 2462 | 2 | 1 | 68593 | |* 8 | INDEX RANGE SCAN DESCENDING | DB_SYSTEMS_PK | 2462 | 2 | 1 | 68593 | --------------------------------------------------------------------------------------------------------------- • Scan entire set of matching rows for given compartment, then for each row scan all matching rows for specific id, finding the one with largest transaction id. If row from outer scan matches the max from inner scan then return it. • For popular values this nested loop scan means O(N2) complexity
  7. Alternative execution plans • Nested loops with subquery as outer rows source • Hash join with row sources accessed by • Full scans on table • Full scans on indexes • Large range scans
  8. Performance of some known execution plans AVG_ET_SECS_AWR PLAN_HASH_VALUE NL HJ P99_ET_SECS P95_ET_SECS --------------- --------------- --- --- ----------- ----------- 0.040205 1339591056 0 1 0.054080 0.054080 0.058834 979865086 0 1 0.064525 0.064329 0.072383 3083557534 0 1 0.081679 0.081338 1.545154 1910733940 1 0 3.384472 3.217428 • Nested Loops plan with historical average performance per execution of 1.5s, 99th percentile of 3.4s and 95th percentile of 3.2s • 3 Hash-Join plans with average performance in the 40ms to 72ms range • Best performant HJ plan with 40ms on average, and 95th percentile of 54ms
  9. Two of the “optimal” HJ plans
  10. Case study rules of thumb • Queries are constrained by FETCH FIRST :b ROWS ONLY • If unconstrained, the query were to return many rows, then NL plans may perform better on average than HJ plans • If unconstrained, the query were to return few or no rows, then HJ plans may perform better on average than NL plans • Consistent performance is perceived as more valuable than better performance • Then HJ plans are often the right answer!
  11. Candidate indexes for rebuild • Larger than 10 MB • Wasted space of over 25% • They are accessed through a FULL SCAN • As per execution plans in memory or last 7 days of AWR history
  12. Exclusions and inclusions • Exclude indexes from a TRANSACTIONS table for which the application performs an explicit LOCK EXCLUSIVE • Regardless of size or wasted space • Include indexes from a TRANSACTIONKEYS table that are known to be the largest on many PDBs • Only if > 10 MB and wasted space > 25%
  13. Some large bloated indexes • Not referenced by IFFS or large range scans • Average size of a Tablespace for a PDB is 100 GB • These 4 indexes are usually the largest ones on a PDB • Wasted space for some indexes in the order of 90%
  14. Online Index Rebuild Automation • OEM Job executed weekly on every CDB of the fleet • Loop over all PDBs • Loop over all candidate indexes per PDB • Considering thresholds (space and savings) plus exceptions • Execute online index rebuild • Wait 2 minutes between indexes being rebuilt • Log into 3 places • Trace file • Report (spool file) • Alert log
  15. Results: sensible space reduction • 130s to 135s per index including 120s between them. Thus ~10s to ~15s per index on average • No visible contention during “eyes-on-glass” on two monitored “online rebuild” cycles of entire fleet • 105 to 142 GB of saved space per CDB on first cycle (including large bloated indexes). Savings ~90%
  16. Results: small “overall” performance gain • Real gain is on short-latency queries performing full scans on rebuilt indexes • Performance of IFFS operation is proportional to index size
  17. Moving forward • Tune online index rebuild autonomous job • Adjust size (10 MB) and savings (25%) thresholds • Reduce transaction’s retention window • Target: as short as business permits (maybe 2h) • Perform aggressive garbage collection • Implement monitoring job with alarming capabilities • Implement index compression • On leading wide-columns with low to medium cardinality • Evaluate shrinking tables with a new autonomous job • Using online table redefinition
  18. Further improvements • Adapt and adopt SPM to achieve plan stability on critical SQL • Implement a smart algorithm based on current and historical performance • Some of the chosen plans will contain IFFS and HJ • Address currently excluded table • Promote EXCLUSIVE TABLE LOCK to a DBMS_LOCK • Improve dynamic SQL eliminating “versioning” subquery • Include ending transaction on each row • Pair Oracle system change number (SCN) to transactions, and use AS OF
Advertisement