Due Diligence with Exadata

Published in: Technology

  1. Due Diligence
     Examining Exadata
     Jonathan Lewis
     jonathanlewis.wordpress.com
     www.jlcomp.demon.co.uk
     Jonathan Lewis, Examining Exadata, © 2012

     Who am I?
     Independent Consultant
     28+ years in IT
     24+ using Oracle
     Strategy, Design, Review, Briefings, Educational, Trouble-shooting
     Member of the Oak Table Network
     Oracle ACE Director
     Oracle author of the year 2006
     Select Editor’s choice 2007
     UKOUG Inspiring Presenter 2011
     UKOUG Council member 2012
     ODTUG 2012 Best Presenter (d/b)
     O1 visa for USA
  2. Small print (a)

     Small print (b)
  3. Small print (c)

     Why choose Exadata?
     • Political
     • Economic
     • Technical
  4. Political
     • Single Supplier
       – No finger-pointing (+)
       – Stranglehold (-)
     • Single point of management
       – No finger-pointing (+)
       – No "lost" databases (+)
     • What about needs for different versions (-)

     Economic (a)
     • Black Box
       – Single supplier, pre-installed (+)
         • No build time
       – What about upgrades / patches (-)
       – It's "just" Oracle (+)
         • but not as we know it, Jim (-)
  5. Economic (b)
     • Bang per buck
       – Quantity of hardware
       – Licence costs for "matching" system
       – Special features (cp SAN vs. JBOD)

     Technical
     • Why is it good?
     • Why is it irrelevant?
     • Where are the nasty surprises?
  6. USPs
     • Smart Scans / offload
     • Storage Indexes
     • Hybrid Columnar Compression
     • Smart flash cache
       – (vs. Database flash cache)

     Earliest Experiences (a)
  7. Earliest Experiences (b)
     Query

     Earliest Experiences (c)

     select small_vc, sum(rep_col)
     from t1
     group by small_vc
     having sum(rep_col) > 1000

     SELECT A1.C0, SUM(A1.C1)
     FROM :Q613000 A1
     GROUP BY A1.C0
     HAVING SUM(A1.C1)>1000

     SELECT /*+ NO_EXPAND ROWID(A1) */ A1."SMALL_VC" C0, A1."REP_COL" C1
     FROM "T1" PX_GRANULE(0, BLOCK_RANGE, DYNAMIC) A1

     WAIT #1: nam=direct path read ela= 12 p1=11 p2=2582 p3=4
     WAIT #1: nam=direct path read ela= 14747 p1=11 p2=2722 p3=8
     WAIT #1: nam=PX Deq Credit: send blkd ela= 2 p1=268500994 p2=1 p3=0
     WAIT #1: nam=direct path read ela= 3 p1=11 p2=2730 p3=8
     WAIT #1: nam=direct path read ela= 295 p1=11 p2=2738 p3=8
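The WAIT lines in the trace extract above can be totalled per event with a few lines of Python. This is only a sketch: the regular expression assumes nothing beyond the `nam=... ela= ... p1=` layout visible in the extract, the name `summarise_waits` is made up for this illustration, and no unit is assumed for the `ela` figures.

```python
import re
from collections import defaultdict

# Matches lines shaped like the ones in the extract, e.g.
#   WAIT #1: nam=direct path read ela= 12 p1=11 p2=2582 p3=4
WAIT_RE = re.compile(r"WAIT #\d+: nam=(?P<name>.+?) ela=\s*(?P<ela>\d+) p1=")


def summarise_waits(lines):
    """Return {event name: (number of waits, total ela)} for the WAIT lines."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        match = WAIT_RE.search(line)
        if match is None:
            continue  # ignore anything that is not a WAIT line
        entry = totals[match.group("name")]
        entry[0] += 1
        entry[1] += int(match.group("ela"))
    return {name: tuple(entry) for name, entry in totals.items()}


# The five WAIT lines quoted on the slide.
TRACE = """\
WAIT #1: nam=direct path read ela= 12 p1=11 p2=2582 p3=4
WAIT #1: nam=direct path read ela= 14747 p1=11 p2=2722 p3=8
WAIT #1: nam=PX Deq Credit: send blkd ela= 2 p1=268500994 p2=1 p3=0
WAIT #1: nam=direct path read ela= 3 p1=11 p2=2730 p3=8
WAIT #1: nam=direct path read ela= 295 p1=11 p2=2738 p3=8
""".splitlines()

if __name__ == "__main__":
    for name, (count, ela) in summarise_waits(TRACE).items():
        print(f"{name}: {count} waits, total ela {ela}")
```

In this extract almost all of the elapsed time sits in a single `direct path read`, which is the kind of skew a per-event total makes easy to spot.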
  8. Earliest Experiences (d)
     • Highs
       – Local discs
       – Hard-wired "network"
       – Result sets (row/column projections) sent across n/w
     • Lows
       – Every node was a full Oracle instance
       – No detailed information about which node to call
       – Instances had to access discs on remote nodes
         • (implemented at the chip level)

     Smart Scan / Offload
     (Diagram labels: Query, Database instance(s), Decomposition, ASM instance, Block Range Request, Cell code)
  9. Flash Cache
     (Diagram labels: Database Flash Cache, "Level 2 cache", Cell / Smart Flash Cache)

     USPs
     • Smart Scans / offload
     • Smart flash cache
       – (vs. Database flash cache)
     • Storage Indexes
     • Hybrid Columnar Compression
  10. Storage Indexes (a)
      Up to 8 columns
      Only for "tables"
      (Diagram: Memory holds, for each 1MB region on Disk, a set of entries of the form "Col Low High Nulls" for columns such as ColX, ColA, ColY, ColZ, ColB ... ColF)

      Storage Indexes (a)
      "cell smart table scan" request: where colX = 99
      (Diagram: the same three 1MB regions, now showing these ColX entries)
      ColX 10 100 Yes
      ColX 200 350 No
      ColX Low High Nulls
      (Diagram labels: Will visit, Must visit, Will skip, Will load SI)
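The `where colX = 99` example above can be sketched as a toy model in Python. This is an illustration, not Oracle's implementation: the `Entry` class, the `plan_scan` function and the decision strings are invented here, and pairing the decisions with the slide's labels is one reading of the diagram. The only behaviour modelled is the low/high range check per 1MB region.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Entry:
    """One storage-index entry for one column in one 1MB region (toy model)."""
    low: int
    high: int
    has_nulls: bool


def plan_scan(regions: List[Dict[str, Entry]], column: str, value: int) -> List[str]:
    """Decide, region by region, what an equality predicate has to do."""
    decisions = []
    for region in regions:
        entry = region.get(column)
        if entry is None:
            # Nothing known about this column here: read the region and
            # take the chance to build an entry for next time.
            decisions.append("visit, build entry")
        elif entry.low <= value <= entry.high:
            # The value could be present, so the region must be read.
            decisions.append("must visit")
        else:
            # The value is outside the low/high range: no need to read.
            decisions.append("skip")
    return decisions


# The three regions from the slide: two with ColX entries, one without.
REGIONS = [
    {"colX": Entry(10, 100, True)},
    {"colX": Entry(200, 350, False)},
    {},
]

if __name__ == "__main__":
    print(plan_scan(REGIONS, "colX", 99))
```

Note that a range check can only prove absence: region one must still be read even though 99 may not actually occur between 10 and 100.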
  11. Storage Indexes (b)
      select … from t1 where col1 = k1; -- Slow
      select … from t1 where col1 = k2; -- Quicker
      select … from t1 where col2 = k1; -- Slow
      select … from t1 where col2 = k2; -- Quicker
      ...
      select … from t1 where col8 = k1; -- Slow
      select … from t1 where col8 = k2; -- Quicker
      select … from t1 where col9 = k1; -- Slow
      select … from t1 where col9 = k2; -- ????
      select … from t1 where col1 = k1; -- ????
      select … from t1 where col2 = k1; -- ????

      Storage Indexes (c)
      select … from t1 where colA = k1 and colB = k2; -- Slow
      select … from t1 where colA = k1; -- Quick
      select … from t1 where colB = k2; -- Quick

      select … from t1 where colA = k1; -- Slow
      select … from t1 where colB = k2; -- Slow
      select … from t1 where colA = k1 and colB = k2; -- Quick

      select … from t1 where colA = k1; -- Slow
      select … from t1 where colA = k1 and colB = k2; -- Quick
      select … from t1 where colB = k2; -- ????
  12. Storage Indexes (d)
      create table t1
      as
      select
          mod(rownum,1e3)        scattered,  -- 1,000 rows per value
          trunc((rownum-1)/1e3)  clustered,  -- 1,000 rows per value
          ...
      from {very large rowsource}
      where rownum <= 1e6
      ;

      Scattered: 0, 1, 2, ..., 998, 999, 0, 1, 2, ..., 1, ... 998, 999
          Every value appears in every MB
      Clustered: 0, 0, 0, 0, ..., 1, 1, 1, 1, ... 999, 999, 999
          Any given value appears in just one MB

      USPs
      • Smart Scans / offload
      • Smart flash cache
        – (vs. Database flash cache)
      • Storage Indexes
      • Hybrid Columnar Compression
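The contrast between the scattered and clustered columns in Storage Indexes (d) can be checked with a small simulation. A sketch only: `ROWS_PER_REGION = 10_000` is an assumption made for this illustration (the slides do not say how many rows fit in 1MB), and the function names are invented here. The two column definitions mirror the `mod` and `trunc` expressions in the `create table` statement.

```python
ROWS = 1_000_000
ROWS_PER_REGION = 10_000  # assumption: rows held in one 1MB storage region


def scattered(rownum: int) -> int:
    """mod(rownum, 1e3): the values cycle through 0..999."""
    return rownum % 1000


def clustered(rownum: int) -> int:
    """trunc((rownum - 1) / 1e3): each value fills 1,000 consecutive rows."""
    return (rownum - 1) // 1000


def region_ranges(column):
    """Return the (low, high) pair a storage index would hold per region."""
    ranges = []
    for start in range(1, ROWS + 1, ROWS_PER_REGION):
        values = [column(r) for r in range(start, start + ROWS_PER_REGION)]
        ranges.append((min(values), max(values)))
    return ranges


def skippable(ranges, value: int) -> int:
    """Count the regions whose low/high range excludes the value."""
    return sum(1 for low, high in ranges if not (low <= value <= high))


if __name__ == "__main__":
    for name, column in (("scattered", scattered), ("clustered", clustered)):
        ranges = region_ranges(column)
        print(f"{name}: {skippable(ranges, 500)} of {len(ranges)} regions skippable")
```

Under these assumptions every region of the scattered column spans 0 to 999, so nothing can be skipped, while the clustered column lets a query for one value skip all but one region.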
  13. Compression
      • More rows per block
        – Fewer blocks to be read from disk
        – More CPU to extract rows
        – More CPU to load rows
        – More contention on modification
      • Compress for OLTP - "deduplication"
      • Compress for query/archive - HCC

      HCC (a)
      (Diagram labels: CU Header, Column 1, Column 2, Column 3, Column 4, Bitmap)
      Maximum size for "archive" ca. 256KB (probably)
      Limited to ca. 32KB for "query" (probably)
      Up to 32,759 rows (almost certainly) for both
  14. HCC (b)
      (Diagram labels: Block boundaries, Block header + Row directory)

      HCC (c)
      (Diagram labels: CU Header, Column 1, Column 2, Column 3, Column 4, Bitmap, Row Header, Row Directory, Block Header)
      A CU is stored as a "single row"
      The rows it holds are referenced individually by rowid
      The row_number of a rowid is the row number within the CU
      The block address of a rowid is for the first block of the CU
  15. HCC (d)
      • Compression
        – Takes place at the database server
      • Decompression
        – Takes place at the db server for indexed access
        – Takes place at the cell server for tablescans (usually)
        – May have to take place at the db server for t/scans
        – Can use a LOT of CPU.

      HCC (e)
      • Deletion
        – We set one bit in the bitmap (and "lock" the CU)
      • Updates
        – We copy (migrate) the row to another block
          • Which is stored as "compress for OLTP"
        – We set the "deleted" bit (and "lock" the CU)
        – We update every relevant index with the new rowid
        – Smart scan disabled for this CU/MB
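The delete and update steps listed in HCC (e) can be walked through with a toy model. Everything here is hypothetical scaffolding for illustration (the `CompressionUnit` class, the `update_row` function, tuple rowids, and a plain dict standing in for an index); it mirrors only the steps on the slide and says nothing about Oracle's internal structures.

```python
class CompressionUnit:
    """Toy CU: a list of rows, a delete bitmap, and a CU-wide "lock" flag."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.deleted = [False] * len(self.rows)  # stands in for the bitmap
        self.locked = False                      # stands in for the CU "lock"

    def delete(self, row_number):
        # Deletion: set one bit in the bitmap and "lock" the whole CU.
        self.deleted[row_number] = True
        self.locked = True


def update_row(cu, cu_id, row_number, new_row, oltp_block, index):
    """Update = migrate the row, mark the old copy deleted, fix the index."""
    old_rowid = (cu_id, row_number)
    # 1. Copy (migrate) the new version of the row to another block.
    oltp_block.append(new_row)
    new_rowid = ("oltp_block", len(oltp_block) - 1)
    # 2. Set the "deleted" bit for the old copy (and "lock" the CU).
    cu.delete(row_number)
    # 3. Point every index entry for the old rowid at the new rowid.
    for key, rowid in index.items():
        if rowid == old_rowid:
            index[key] = new_rowid
    return new_rowid


if __name__ == "__main__":
    cu = CompressionUnit([(1, "a"), (2, "b"), (3, "c")])
    index = {1: ("cu1", 0), 2: ("cu1", 1), 3: ("cu1", 2)}  # key -> rowid
    oltp_block = []
    print(update_row(cu, "cu1", 1, (2, "B"), oltp_block, index))
    print(cu.deleted, cu.locked, index)
```

The point the model makes visible is that an update never modifies the compressed data in place: the CU keeps the old row, flagged as deleted, and the live copy ends up outside the CU.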
  16. HCC - access by rowid (example)
      • Load entire CU into db server cache.
      • For each column used in query:
        – "table fetch continued row" to start of column
        – Decompress column into local memory
        – Select column value
      • How long can we keep the column?
        – Only for the equivalent of "buffer is pinned".

      HCC - access by rowid (example)
      select max(padding)
      from t1_ah                             -- 33M (32 * 2^20) rows
      where n_128k between 1000 and 1999     -- 256,000 rows
      ;
      http://jonathanlewis.wordpress.com/2012/07/27/compression-units-3/

      No compression      0.70 CPU seconds
      Query high         77.24 CPU seconds
      Archive high    3,022.83 CPU seconds
      16 seconds after manual optimization

      (Diagram: CU 1, CU 2, CU 3, CU 4, CU 5, with the numbers 1 6, 2 7, 3 8, 4 9, 5 ... shown against them)
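The CPU figures quoted above are easier to compare when expressed per row fetched. The sketch below only does arithmetic on the numbers from the slide (256,000 rows and the three CPU timings); the function name `per_row_costs` is invented for this illustration.

```python
ROWS_FETCHED = 256_000  # rows matching "n_128k between 1000 and 1999"

# CPU seconds quoted on the slide for each compression level.
CPU_SECONDS = {
    "No compression": 0.70,
    "Query high": 77.24,
    "Archive high": 3022.83,
}


def per_row_costs(cpu_seconds, rows):
    """Return microseconds of CPU per row and the ratio to the uncompressed run."""
    baseline = cpu_seconds["No compression"]
    return {
        name: {
            "us_per_row": seconds / rows * 1_000_000,
            "ratio": seconds / baseline,
        }
        for name, seconds in cpu_seconds.items()
    }


if __name__ == "__main__":
    for name, cost in per_row_costs(CPU_SECONDS, ROWS_FETCHED).items():
        print(f"{name:15s} {cost['us_per_row']:10.1f} us/row  {cost['ratio']:8.1f}x")
```

On these figures the archive-high run costs roughly 12 milliseconds of CPU per row, against under 3 microseconds without compression.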
  17. Optimizer Problem (a)
      (Diagram labels: SegHeader, The rows we want, 1 Read (four times))
      What strategies can the optimizer adopt to acquire these four rows when we run a query like:
      select * from TABX where COLY = {constant}

      Optimizer Problem (b)
      select * from TABX where COLY = {constant}
      (Diagram labels: 1 Read (four times))
  18. Optimizer Problem (c)
      Traditional Hardware
          Say 200 byte rows, 40 per block
          Say we can read 16 blocks as fast as 1
          The break point could be as high as 1 row in 640.
      Exadata
          Say 32 blocks = 32,000 rows - handled in one read
          You could get 14 reads in-flight
          The cell server CPU does the work
          The break point could easily be 1 row in 450,000
          (without allowing for storage index and cell flash cache effects)
          (and then indexed access wipes out your db server CPU !)

      Summary
      • Exadata seems to offer little to OLTP
      • Putting OLTP and its DSS/DW on the same box may be nice
      • Political or economic arguments may apply
      • Go and see Frits Hoogland !
      • Storage indexes may create instability
      • Choose compression levels carefully
      • Indexing strategies are harder to decide
      • And the optimizer is ignorant of Exadata features
      • Go and see Richard Foote and Maria Colgan !
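The break-point figures in Optimizer Problem (c) appear to follow from simple multiplication, which the sketch below reproduces. It assumes one indexed row access costs about one read, and that the slide's "1 row in 450,000" is a rounding of 32,000 rows per read times 14 reads in flight; both are this sketch's reading of the slide, and the function name `break_point` is invented here.

```python
def break_point(rows_per_read: int, reads_in_flight: int = 1) -> int:
    """Rows a scan covers in the time of one single-row indexed read.

    If the query wants fewer than 1 row in this many, indexed access
    should win; otherwise the scan does.
    """
    return rows_per_read * reads_in_flight


# Traditional hardware: 40 rows per block, 16 blocks read as fast as 1.
TRADITIONAL = break_point(rows_per_read=40 * 16)

# Exadata: 32 blocks = 32,000 rows in one read, 14 reads in flight.
EXADATA = break_point(rows_per_read=32_000, reads_in_flight=14)

if __name__ == "__main__":
    print(f"Traditional: 1 row in {TRADITIONAL:,}")
    print(f"Exadata:     1 row in {EXADATA:,}")
```

The model ignores storage index and cell flash cache effects, as the slide itself notes, so the real Exadata break point could be further out still.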
