BGOUG 2012 - XML Index Strategies


updated version

Published in: Technology

  1. XML Indexing Strategies: Choosing the Right Index for the Right Job (Marco Gralike)
  2. Richard Foote (Mr. Index)
      • OakTable Member
      • ACE Director
      • Oracle Certified Professional
      • 22+ years DBA,
  3. Refinement
  4. Structured or Semi-Structured or… [figure panels: Structured; Semi-Structured]
  5. Unstructured Content
  6. Document Driven / Data Driven
  7. XML Container (in memory or via storage)
      • In Memory (document)
      • CLOB (document)
      • Object Relational (data)
      • Binary XML (data)
  8. Design: Width and Height and … [figure: document tree with nodes 1–6 and X, Y, Z]
      • Content Height: minOccurs="0" maxOccurs="unbounded"
      • Content Width: type="xs:string", restriction…?
      • Content Distribution: histogram, statistics, skew, cardinality?
  9. XMLIndex Use Cases [diagram; recoverable labels: Binary XML (Schema based); Binary XML (Schema less, Schema based); Mixed; XMLIndex Structured Component; XMLIndex Structured Component w/ Text index]
  10. Storage Models (XMLType)
      • CLOB
        – Default until (deprecated in 12.1)
        – Non-Schema Based
      • Binary XML
        – Oracle 11 and Onwards
        – Schema and Non-Schema Based
      • Object Relational (+Hybrid)
        – Nested Tables, Types, Varrays
        – Schema Based
  11. Querying XML Content in XML DB [architecture diagram; recoverable labels: SQL/XML and XQuery over the XMLType Abstraction; DB XQuery; Procedural XQuery; XQuery Rewrite; Pushdown; XVM (use “no query rewrite”); Relational Access; Streaming XPath Evaluation; DOM Tree Model; SQL Execution Methods; XMLIndex; Object-Relational Storage; Binary XML Storage; Relational; SecureFiles]
      Source: S317428: Building Really Scalable XML Applications with Oracle XML DB and Oracle Text
  12. Index Methods (XMLType)
  13. Storage Index Defaults (XMLType)
      • Binary XML / CLOB
        – LOB Index
      • Object Relational
        – DBMS_XMLSCHEMA → “OPTIONS”
        – Oracle 10g: Index Organized Tables
        – Oracle 11g: B-Tree Indexes
        – xdb:annotations
          • Storage type → xdb:SQLType
          • Storage type → xdb:ColumnProps / xdb:TableProps
  14. Index Methods (10.x) [diagram: bookstore document tree (book, whitepaper, title, author, chapter, id, paragraph, content) annotated with B-Tree Index, Function-based Index (XPath), and Oracle Text Index]
  15. Function-Based Index
      • Deprecated in 11.2
      • Object Relational XMLType Storage (can, but shouldn’t, on CLOB when hybrid)
      • Performance-wise the lesser option…
      SQL> CREATE INDEX function_based_index
             ON xml_data_table (extractValue(OBJECT_VALUE, '/Root/TextID'));
  16. BTree / Bitmap Index
      • Structured XML Data
        – Ordered Collection Tables (OCT)
        – ComplexTypes…
        – “dot” notation using the “xmldata” pseudocolumn
      SQL> CREATE INDEX dot_notation_index
             ON xml_data_table ("XMLDATA"."TEXTID");
  17. Index Alternatives
      SQL> CREATE INDEX function_based_again_idx
             ON xml_data_table (CAST("XMLDATA"."TEXTID" AS VARCHAR2(10)));
      SQL> CREATE INDEX oracle_11_applicable_only_index
             ON xml_data_table xdt
                (XMLCast(XMLQuery('$i/Root/TextID'
                                  PASSING xdt.OBJECT_VALUE AS "i"
                                  RETURNING CONTENT)
                         AS VARCHAR2(10)));
  18. Oracle Text Index
      • Unstructured Data in XML
        – CLOB storage XML part in Object Relational XML
        – Secondary index on XMLIndex
      • Can only index XML data TEXT nodes (<12.1!)
      • Result Set Interface (new in
        – Specify Query request and hit list requirements in XML
      SQL> CREATE INDEX oracle_text_index
             ON xml_data_table (OBJECT_VALUE)
             INDEXTYPE IS CTXSYS.CONTEXT;
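A text index such as the one on slide 18 is normally queried through the CONTAINS operator. The following is a minimal sketch, assuming the xml_data_table and oracle_text_index names from the slide; the search term 'strategy' is purely illustrative and not taken from the presentation. It also assumes the index was created with default settings, under which a CONTEXT index is not synchronized automatically after DML.

```sql
-- Count documents whose text nodes contain the (illustrative) word "strategy".
SELECT COUNT(*)
FROM   xml_data_table
WHERE  CONTAINS(OBJECT_VALUE, 'strategy') > 0;

-- With default settings a CONTEXT index is not maintained on DML,
-- so pending changes are synchronized explicitly.
BEGIN
  CTX_DDL.SYNC_INDEX('oracle_text_index');
END;
/
```

The CONTAINS score is compared with 0 because the operator returns a relevance score rather than a boolean.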
  19. Index Methods (11.1) [diagram: bookstore document tree (book, whitepaper, title, author, chapter, id, paragraph, content) annotated with B-Tree Index, Function-based Index (XPath), Unstructured XMLIndex, and Secondary Oracle Text Index]
  20. Usage: Unstructured XMLIndex
      • XML Document contains:
        – Semi Structured Data and Structured Data
        – Supports searching and fragment extraction
        – When XPath queried is not known beforehand
      • XMLType CLOB or Binary XML content
      • If you use an XMLIndex and/or combine it with Structured XMLIndex(es)
  21. Simple: Unstructured XMLIndex
      SQL> CREATE INDEX xmlindex_idx
             ON "XMLTYPE_COLUMN" (xdata)
             INDEXTYPE IS XDB.XMLINDEX;
      Index created.
      SQL> CREATE INDEX xmlindex_idx
             ON "XMLTYPE_TABLE" (object_value)
             INDEXTYPE IS XDB.XMLINDEX;
      Index created.
  23. Path Table [diagram: Unstructured XMLIndex f(x) → Path Table]
  24. Unstructured XMLIndex (UXI) [diagram: Unstructured XMLIndex f(x) → Path Table]
      • One Path Table
      • Use Path Subsetting
        – Full Blown XMLIndex can be BIG
      • Token Tables (XDB.X$......)
        – Query re-write on Tokens
        – Fuzzy Searches, //
      • Optimizer Statistics
      • Can be maintained Manually
        – Recorded in Pending Table
      • Secondary indexes possible
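Path subsetting, as mentioned on slide 24, restricts which XPath locations end up in the path table. The statement below is a sketch under stated assumptions: the index name, the path table name, and the paths /Root/TextID and /Root/Header//* are all illustrative, and the statement is an alternative to the full index on slide 21 rather than an addition to it, since only one XMLIndex is allowed per column.

```sql
-- Index only the listed paths instead of every node in the document.
-- All object names and paths here are hypothetical.
CREATE INDEX xmlindex_subset_idx ON xmltype_table (OBJECT_VALUE)
  INDEXTYPE IS XDB.XMLINDEX
  PARAMETERS ('PATH TABLE xmltype_path_table
               PATHS (INCLUDE (/Root/TextID /Root/Header//*))');

-- Optional secondary index on the VALUE column of the named path table.
CREATE INDEX xmltype_path_value_idx
  ON xmltype_path_table (SUBSTR(VALUE, 1, 100));
```

Naming the path table explicitly makes it easier to find later when gathering statistics or adding secondary indexes.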
  25. Index Methods (11.2) [diagram: bookstore document tree (book, whitepaper, title, author, chapter, id, paragraph, content) annotated with B-Tree Index, Function-based Index (XPath), Structured XMLIndex (Highly Structured Islands of Data), Unstructured XMLIndex, and Secondary Oracle Text Index]
  26. Usage: Structured XMLIndex
      • With highly Structured Data
      • Likely candidates: ComplexTypes
      • Structured Islands of Data
        – Can be nested, but officially only one level
        – XMLTABLE “virtual” nested column hint
      • Will create (multiple) “Content Tables”
        – Multiple XPath defined same columns with different purpose
      They deliver relational performance…!
  27. Simple: Structured XMLIndex
      • “XMLTABLE” Driven Syntax (slide callout on the parameter string: “Be aware”)
      SQL> CREATE INDEX xmlindex_sxi
             ON xmldata_table (doc)
             INDEXTYPE IS XDB.XMLINDEX
             PARAMETERS ('GROUP elementinfo_group
                          XMLTABLE xml_cnt_tab_elementinfo ''/root/element''
                          COLUMNS infocol VARCHAR2(4000) PATH ''info''');
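A structured XMLIndex is only useful when queries project the same row pattern and columns that the index defines. The query below is a hedged sketch of such a match, reusing xmldata_table, doc, /root/element, and infocol from slide 27; the filter value 'some value' is made up for illustration.

```sql
-- XMLTABLE with the same row pattern and column definition as the index,
-- which lets the optimizer consider the content table instead of the XML.
SELECT x.infocol
FROM   xmldata_table t,
       XMLTABLE('/root/element'
                PASSING t.doc
                COLUMNS infocol VARCHAR2(4000) PATH 'info') x
WHERE  x.infocol = 'some value';
```

Whether the rewrite actually happens is best confirmed from the execution plan, where the content table (xml_cnt_tab_elementinfo on the slide) should show up.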
  28. Content Table(s) [diagram of the content table layout]
      • Columns: KEY (RAW, primary key, not null), RID (rowid, not null), your columns
      • KEY INDEX (KEY): Unique BTREE Index
      • RID INDEX (RID): Non-Unique BTREE Index
      • Your Columns
  29. Structured XMLIndex (SXI) [diagram: Structured XMLIndex f(x) → Content Tables]
      • Content Table(s)
      • Based on XMLTABLE syntax
      • XMLTable construct can be nested but: “Only ONE XMLType column allowed”
        – VIRTUAL column
      • Can be maintained Manually
      • Secondary indexes possible
      • LOCAL parameter (partitioning)
  30. Adding Structured Indexes
      SQL> ALTER INDEX xmlindex_sxi
             PARAMETERS ('ADD_GROUP GROUP my_new_group
                          XMLTABLE xml_content_tab_new ''/root/extra''
                          COLUMNS extracol VARCHAR2(35) PATH ''new_element''');
  31. Mixed XMLIndex Options [diagram: bookstore document tree (book, whitepaper, title, author, chapter, id, paragraph, content) annotated with Unstructured XMLIndex, two Structured XMLIndex groups, and a Secondary (text) Index]
  32. Mixed XMLIndex structures
      CREATE INDEX xmlindex
        ON TEST_RANGE_XML (doc)
        INDEXTYPE IS XDB.XMLINDEX
        PARAMETERS ('PATH TABLE path_table
                     PATHS (EXCLUDE (/root/ElementInfo))');

      BEGIN
        DBMS_XMLINDEX.registerParameter(
          'StructuredXML',
          'ADD_GROUP GROUP ElementInfo
           XMLTABLE xml_cnttable_valueinfo ''/root/ElementInfo''
           COLUMNS ValueInfo VARCHAR2(100) PATH ''ValueInfo''');
      END;
      /

      ALTER INDEX xmlindex PARAMETERS ('PARAM StructuredXML');
  34. “There Can Be Only One…”
  35. Syntax Awareness
      • SYNC=ALWAYS
        – Mandatory when Combined XMLIndex
      • SYNC=MANUAL
        – Locking
      • STALE=FALSE | TRUE
        – Hmmm…
      • Empty XMLIndex tables
        – OOPS → I got my “XMLTABLE” Syntax etc. “wrong”
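The SYNC settings on slide 35 are written in shorthand; in the index parameter string they appear inside an ASYNC clause. The following sketch shows how manual maintenance might be switched on and off for the xmlindex_idx index from slide 21. The schema name MY_SCHEMA is a placeholder, and the statements assume an unstructured XMLIndex only, since the slides state that mixed indexes require SYNC ALWAYS.

```sql
-- Defer index maintenance: DML is recorded in the pending table
-- instead of updating the index synchronously.
ALTER INDEX xmlindex_idx PARAMETERS ('ASYNC (SYNC MANUAL)');

-- Apply the pending changes on demand (MY_SCHEMA is a placeholder).
BEGIN
  DBMS_XMLINDEX.syncIndex('MY_SCHEMA', 'XMLINDEX_IDX');
END;
/

-- Return to synchronous maintenance.
ALTER INDEX xmlindex_idx PARAMETERS ('ASYNC (SYNC ALWAYS)');
```

Between the first and second statements the index can lag behind the base table, which is the situation the STALE setting on the slide refers to.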
  36. Notes on XMLIndex (1)
      • Only ONE XMLIndex is allowed per XMLType column or XMLType table
        – Add extra XMLIndex structures (structured or unstructured) via ADD_GROUP syntax
        – Only SYNC=ALWAYS is allowed when using mixed XMLIndex structures or when adding more than one (11g)
  37. Notes on XMLIndex (2)
      • You need the LOCAL parameter to create locally partitioned XMLIndexes
      • An XMLIndex on a HASH partitioned XMLType column or XMLType table is not allowed (11g)
        – But you can create an Oracle Text Index on such structures
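The LOCAL keyword from slide 37 goes between the index type and the parameter string. This is a minimal sketch, assuming that test_range_xml (the table name used on slide 32) is range partitioned and has no XMLIndex yet; the index and path table names are hypothetical.

```sql
-- Locally partitioned XMLIndex: one index partition per table partition.
CREATE INDEX xmlindex_local_idx ON test_range_xml (doc)
  INDEXTYPE IS XDB.XMLINDEX
  LOCAL
  PARAMETERS ('PATH TABLE test_range_path_table');
```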
  38. Recap
      • True understanding of Storage and Index options will provide:
        – Optimal performance
        – Performance that can outperform (Java-based) XML processing
      • A lot of choice:
        – Problems are Complex
        – Also provides Solutions
      • Good design beforehand is the path to success
  39. References (1)
      Oracle Whitepapers
        – Oracle XML DB: Choosing the Best XMLType Storage Option for Your Use Case (PDF)
        – Oracle XML DB: Best Practices to Get Optimal Performance out of XML Queries (PDF)
      Blog
        • (Dedicated XMLDB blog)
        • Semi-Structured XMLIndex section
        • Structured XMLIndex section
  40. References (2)
      • Oracle Open World Presentation on XML DB
        – S317428: Building Really Scalable XML Applications with Oracle XML DB and Oracle Text
      • XML DB OTN / FAQ Thread
        – mID=34
        – eadID=410714