
Miracle Open World 2011 - XML Index Strategies



Published in: Technology


  1. 1. XML Indexing Strategies: Choosing the Right Index for the Right Job<br />Marco Gralike<br />
  2. 2. Strange Chainsaw Tree Guy<br />OakTable Member<br />ACE Director<br />Customer Advisory Board Member<br />17+ years DBA, etc.<br />
  3. 3. Richard Foote (Mr. Index)<br />OakTable Member<br />ACE Director<br />Oracle Certified Professional<br />22+ years DBA, etc.<br />
  4. 4. Refinement<br />
  5. 5. Structured or Semi-Structured or…<br />Structured<br />Semi-Structured<br />
  6. 6. Unstructured Content<br />
  7. 7. Document Driven / Data Driven<br />
  8. 8. Design: Width and Height and …<br />[Diagram: nodes labelled 1 to 6 and X, Y, Z]<br />Content Height: minOccurs="0" maxOccurs="unbounded"<br />Content Width: type="xs:string", restriction…?<br />Content Distribution: histogram, statistics, skew, cardinality?<br />
  9. 9. XMLType “Under the Hood”<br />DOM Tree Model<br />XQuery<br />SQL/XML<br />XMLType Abstraction<br />Procedural XQuery<br />DB XQuery<br />XVM (use “no query rewrite”)<br />XQuery Rewrite<br />Pushdown<br />SQL Execution<br />XMLIndex<br />Streaming XPath Evaluation<br />Relational Access Methods<br />Binary XML<br />Object-Relational<br />Relational Storage<br />Secure Files<br />Source: OOW 2010 presentation S317428: Building Really Scalable XML Applications with Oracle XML DB and Oracle Text<br />
  16. 16. XML Storage/Index Use Cases<br />Binary XML<br />(Schema based)<br />XMLIndex Structured Component<br />
  17. 17. XMLType<br />In Memory (document)<br />CLOB (document)<br />Object Relational (data)<br />Binary XML (data)<br />
  18. 18. XMLType Storage Models<br />CLOB<br />Default until<br />Non-Schema Based<br />Binary XML<br />Oracle 11 and Onwards<br />Schema and Non-Schema Based<br />Object Relational (+Hybrid)<br />Nested Tables, Types, Varrays<br />Schema Based<br />
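As a rough illustration of the three storage models on this slide, the statements below create one XMLType table per model. This is only a sketch: the table names are hypothetical, and the object-relational variant assumes an XML schema with a `Root` element has already been registered under the placeholder URL shown.

```sql
-- CLOB storage: the document is kept as text.
CREATE TABLE xml_clob_tab OF XMLType
  XMLTYPE STORE AS CLOB;

-- Binary XML storage (Oracle 11g and onwards), schema-based or not.
CREATE TABLE xml_binary_tab OF XMLType
  XMLTYPE STORE AS BINARY XML;

-- Object-relational storage: needs a registered XML schema.
-- The schema URL and element name below are placeholders.
CREATE TABLE xml_or_tab OF XMLType
  XMLTYPE STORE AS OBJECT RELATIONAL
  XMLSCHEMA "http://localhost/example.xsd"
  ELEMENT "Root";
```

The storage model chosen here determines which of the index methods on the following slides are available.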
  19. 19. Storage Index Defaults (xmltype)<br />Binary XML / CLOB<br />LOB Index<br />Object Relational<br />DBMS_XMLSCHEMA “OPTIONS”<br />Oracle 10g: Index Organized Tables<br />Oracle 11g: B-Tree Indexes<br />xdb:annotations<br />Storage type xdb:SQLType<br />Storage type xdb:ColumnProps / xdb:TableProps<br />
  20. 20. Index Methods (xmltype)<br />CLOB<br /><ul><li>Oracle Text</li><li>XMLIndex (>= 11g)</li></ul>Binary XML<br /><ul><li>XMLIndex / Oracle Text</li><li>Virtual Columns (BTree)</li></ul>Object Relational<br /><ul><li>Standard Indexes</li><li>Oracle Text</li></ul>
  23. 23. Index Methods (10.x)<br />[Diagram: a bookstore document tree (book and whitepaper, with title, author, chapter, id, paragraph and content nodes), annotated with BTree Index, Oracle Text Index and Function based Index (XPath)]<br />
  24. 24. Function-Based Index<br />Deprecated in 11.2<br />Object Relational XMLType Storage<br />(can be used, but shouldn’t, on CLOB)<br />Performance-wise the lesser option…<br />SQL> CREATE INDEX function_based_index<br /> ON xml_data_table<br /> (extractValue(OBJECT_VALUE, '/Root/TextID'));<br />
  25. 25. BTree / Bitmap Index (O.R.)<br />Structured XML Data<br />Ordered Collection Tables (OCT)<br />ComplexTypes…<br />“dot” notation using the “xmldata” pseudocolumn<br />SQL> CREATE INDEX dot_notation_index<br /> ON xml_data_table<br /> ("XMLDATA"."TEXTID");<br />
  26. 26. Syntax Index Alternatives<br />SQL> CREATE INDEX function_based_again_idx<br /> ON xml_data_table<br /> (CAST("XMLDATA"."TEXTID" AS VARCHAR2(10)));<br />SQL> CREATE INDEX oracle_11_applicable_only_index<br /> ON xml_data_table xdt<br /> (XMLCast(XMLQuery<br /> ('$i/Root/TextID'<br /> PASSING xdt.OBJECT_VALUE AS "i"<br /> RETURNING CONTENT)<br /> AS VARCHAR2(10)));<br />
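For a function-based index to be considered at all, the query predicate generally has to use the same expression as the index definition. The queries below are a sketch of predicates that mirror the `extractValue` index and the 11g `XMLCast(XMLQuery(...))` index from the slides. The literal `'A100'` is an invented sample value, and whether the optimizer actually picks the index depends on storage model, statistics and XPath rewrite.

```sql
-- Mirrors the 10.x-style index on extractValue().
SELECT COUNT(*)
FROM   xml_data_table xdt
WHERE  extractValue(xdt.OBJECT_VALUE, '/Root/TextID') = 'A100';

-- Mirrors the 11g-style index on XMLCast(XMLQuery(...)).
SELECT COUNT(*)
FROM   xml_data_table xdt
WHERE  XMLCast(XMLQuery('$i/Root/TextID'
                        PASSING xdt.OBJECT_VALUE AS "i"
                        RETURNING CONTENT)
               AS VARCHAR2(10)) = 'A100';
```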
  27. 27. Oracle Text Index<br />Unstructured Data in XML<br />CLOB-stored XML parts in Object Relational XML<br />Secondary index on XMLIndex<br />Can only index the TEXT nodes of XML data<br />Result Set Interface (new in<br />Specify Query request and hit list requirements in XML<br />SQL> CREATE INDEX oracle_text_index<br /> ON xml_data_table<br /> (OBJECT_VALUE) INDEXTYPE IS CTXSYS.CONTEXT;<br />
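An Oracle Text index is queried through the `CONTAINS` operator rather than through XPath predicates. The sketch below assumes the `oracle_text_index` from the slide exists on `xml_data_table`. The search term is an invented example, and the explicit sync call assumes the index uses the default, non-transactional synchronization and that the session may execute `CTX_DDL`.

```sql
-- Make recently inserted documents searchable (CONTEXT indexes are
-- not maintained at commit time unless configured to be).
BEGIN
  CTX_DDL.SYNC_INDEX('oracle_text_index');
END;
/

-- Full-text search; label 1 ties SCORE() to the CONTAINS() call.
SELECT SCORE(1) AS relevance
FROM   xml_data_table x
WHERE  CONTAINS(x.OBJECT_VALUE, 'whitepaper', 1) > 0
ORDER  BY SCORE(1) DESC;
```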
  28. 28. Index Methods (11.1)<br />[Diagram: the bookstore document tree (book and whitepaper, with title, author, chapter, id, paragraph and content nodes), annotated with BTree Index, Secondary Oracle Text Index, Function based Index (XPath) and Unstructured XMLIndex]<br />
  29. 29. Usage: Unstructured XMLIndex<br />XML Document contains:<br />Semi Structured Data and Structured Data<br />Supports searching and fragment extraction<br />When the queried XPath is not known beforehand<br />XMLType CLOB or Binary XML content<br />If you use an XMLIndex and/or combine it with Structured XMLIndex(es)<br />
  30. 30. Simple: Unstructured XMLIndex<br />SQL> CREATE INDEX xmlindex_idx<br /> ON "XMLTYPE_COLUMN" (xdata)<br /> INDEXTYPE IS XDB.XMLINDEX;<br />Index created.<br />SQL> CREATE INDEX xmlindex_idx<br /> ON "XMLTYPE_TABLE" (object_value)<br /> INDEXTYPE IS XDB.XMLINDEX;<br />Index created.<br />
  31. 31. Creating Unstructured XMLIndex<br />CREATE INDEX XMLIDX<br /> ON XMLBINARY_TAB (object_value)<br /> INDEXTYPE IS XDB.XMLIndex<br /> PARAMETERS<br /> ('PATHS (INCLUDE (/ROOT/ID /ROOT/INFO/INFO_ID)<br /> NAMESPACE MAPPING (xmlns="http://localhost/xmlschema_bin.xsd"))<br /> PATH TABLE path_table (TABLESPACE XML_DATA)<br /> PATH ID INDEX pathid_idx (TABLESPACE XML_INDX)<br /> ORDER KEY INDEX orderkey_idx (TABLESPACE XML_INDX)<br /> VALUE INDEX value_idx (TABLESPACE XML_INDX)<br /> ASYNC (SYNC ALWAYS) STALE (FALSE)')<br /> PARALLEL LOGGING;<br />
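Once an index with path subsetting like the one above exists, only queries on the included paths can benefit from it. The sketch below first lists the index and its path table from the `USER_XML_INDEXES` dictionary view, then runs a query on one of the included paths. The value `42` is an invented sample, and the namespace declaration assumes the documents use the default namespace given in the index definition.

```sql
-- Which XMLIndexes exist, and which path table backs each one?
SELECT index_name, table_name, path_table_name, async, stale
FROM   user_xml_indexes;

-- A predicate on /ROOT/ID, one of the two indexed paths.
SELECT COUNT(*)
FROM   xmlbinary_tab t
WHERE  XMLExists('declare default element namespace "http://localhost/xmlschema_bin.xsd";
                  /ROOT[ID="42"]'
                 PASSING t.OBJECT_VALUE);
```

Checking the execution plan for an access to `PATH_TABLE` is one way to confirm that the index is being used.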
  32. 32. Path Table<br />PATH TABLE<br />PATH INDEX<br /><ul><li>(PATHID, RID), BTREE</li></ul>ORDER INDEX<br /><ul><li>(RID, ORDER_KEY), BTREE</li></ul>VALUE INDEX<br /><ul><li>(SUBSTRB("VALUE",1,1599)), FUNCTION BASED</li></ul>SECONDARY INDEXES<br />[Diagram: Unstructured XMLIndex, f(x), Path Table]<br />
  33. 33. Unstructured XMLIndex (UXI)<br /><ul><li>One Path Table</li><li>Use Path Subsetting</li><li>Full Blown XMLIndex can be BIG</li><li>Token Tables (XDB.X$......)</li><li>Query re-write on Tokens</li><li>Fuzzy Searches, //</li><li>Optimizer Statistics</li><li>Can be maintained Manually</li><li>Recorded in Pending Table</li><li>Secondary indexes possible</li></ul>[Diagram: Unstructured XMLIndex, f(x), Path Table]<br />
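Two of the points above, optimizer statistics and the size of a full-blown XMLIndex, can be checked with ordinary dictionary queries. The sketch below assumes the table, path table and index names from the "Creating Unstructured XMLIndex" slide, and assumes that gathering table statistics with `cascade` also covers the XMLIndex structures.

```sql
-- Gather statistics on the base table and its indexes.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,
    tabname => 'XMLBINARY_TAB',
    cascade => TRUE);
END;
/

-- How big did the path table and its secondary indexes become?
SELECT segment_name, ROUND(bytes / 1024 / 1024) AS size_mb
FROM   user_segments
WHERE  segment_name IN ('PATH_TABLE', 'PATHID_IDX',
                        'ORDERKEY_IDX', 'VALUE_IDX')
ORDER  BY bytes DESC;
```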
  43. 43. Index Methods (11.2)<br />[Diagram: the bookstore document tree (book and whitepaper, with title, author, chapter, id, paragraph and content nodes), annotated with BTree Index, Secondary Oracle Text Index, Function based Index (XPath), Unstructured XMLIndex, and Structured XMLIndex for Highly Structured Islands of Data]<br />
  44. 44. Usage: Structured XMLIndex<br />With highly Structured Data<br />Likely candidates: ComplexTypes<br />Structured Islands of Data<br />Can be nested, but officially only one level<br />XMLTABLE “virtual” nested column hint<br />Will create (multiple) “Content Tables”<br />Multiple XPath defined same columns with different purpose<br />They deliver relational performance…!<br />
  45. 45. Simple: Structured XMLIndex<br />“XMLTABLE” Driven Syntax<br />SQL> CREATE INDEX xmlindex_sxi<br /> on xml_data_table (xmlcol)<br /> indextype is xdb.xmlindex<br /> parameters<br /> ('GROUP employee_info_group<br /> XMLTABLE EMP_CONTENT_TABLE<br /> ''/employees/emp''<br /> COLUMNS<br /> empid NUMBER(10) PATH ''id'' ');<br />Be aware: '' is two single quotes, not one double quote<br />
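A structured XMLIndex is most likely to be used when the query's `XMLTable` expression lines up with the one in the index definition (same row path, same column paths and types). The sketch below queries the `empid` column defined above and then adds a secondary B-tree index on the content table. The value `1001` and the name `emp_content_empid_idx` are invented for illustration.

```sql
-- XMLTable expression matching the index group definition.
SELECT e.empid
FROM   xml_data_table t,
       XMLTable('/employees/emp'
                PASSING t.xmlcol
                COLUMNS empid NUMBER(10) PATH 'id') e
WHERE  e.empid = 1001;

-- The content table is a regular table, so it can carry
-- ordinary secondary indexes on its columns.
CREATE INDEX emp_content_empid_idx
  ON emp_content_table (empid);
```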
  46. 46. Content Table(s)<br />KEY INDEX<br /><ul><li>(KEY)</li><li>Unique BTREE Index</li><li>Primary Key</li></ul>RID INDEX<br /><ul><li>(RID)</li><li>NON Unique BTREE Index</li></ul>Your Columns<br />[Diagram: CONTENT TABLE(s), each with the columns RID (rowid), Key (RAW, not null) and your own columns]<br />
  50. 50. Structured XMLIndex (SXI)<br />Content Table(s)<br />Based on XMLTABLE syntax<br />XMLTable construct can be nested but:<br />“Only ONE XMLType column allowed”<br /> VIRTUAL column<br />Can be maintained Manually<br />Secondary indexes possible<br />LOCAL parameter (partitioning)<br />[Diagram: Structured XMLIndex, f(x), Content Tables]<br />
  51. 51. Adding Structured Indexes<br />SQL> ALTER INDEX xmlindex_sxi<br /> parameters<br /> ('ADD_GROUP<br /> GROUP my_new_group<br /> XMLTABLE xml_content_tab_new<br /> ''/root/extra''<br /> COLUMNS<br /> extracol VARCHAR2(35) PATH ''new_element'' ');<br />
  52. 52. Mixed XMLIndex Options<br />[Diagram: the bookstore document tree (book and whitepaper, with title, author, chapter, id, paragraph and content nodes), covered by an Unstructured XMLIndex, two Structured XMLIndexes and a Secondary (text) Index]<br />
  53. 53. Mixed XMLIndex structures<br />CREATE INDEX xmlindex on TEST_RANGE_XML (doc)<br /> indextype is xdb.xmlindex<br /> PARAMETERS<br /> ('PATH TABLE path_table PATHS (EXCLUDE (/root/ElementInfo))');<br />BEGIN<br /> DBMS_XMLINDEX.registerParameter<br /> ('StructuredXML',<br /> 'ADD_GROUP GROUP ElementInfo<br /> XMLTABLE xml_cnttable_valueinfo ''/root/ElementInfo''<br /> COLUMNS ValueInfo VARCHAR2(100) PATH ''ValueInfo''');<br />END;<br />/<br />ALTER INDEX xmlindex PARAMETERS ('PARAM StructuredXML');<br />
  54. 54. XMLIndex Maintenance<br />ALTER INDEX<br />XMLIndex Parameter Changes<br />DBMS_XMLINDEX.DROPPARAMETER<br />DBMS_XMLINDEX.MODIFYPARAMETER<br />DBMS_XMLINDEX.REGISTERPARAMETER<br />Manually Synchronizing an XMLIndex<br />DBMS_XMLINDEX.SYNCINDEX<br />Pending Tables<br />
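Manual synchronization could look roughly like the sketch below: the index is created with deferred maintenance, DML is recorded in the pending table, and `DBMS_XMLINDEX.SYNCINDEX` applies it later. The table `xml_docs_tab` and the index name `xml_docs_uxi` are hypothetical. As the slides note, this only applies to a purely unstructured XMLIndex, because mixed indexes require SYNC ALWAYS.

```sql
-- Hypothetical binary XML table with an unstructured XMLIndex
-- that is NOT maintained as part of each DML statement.
CREATE TABLE xml_docs_tab OF XMLType
  XMLTYPE STORE AS BINARY XML;

CREATE INDEX xml_docs_uxi
  ON xml_docs_tab (OBJECT_VALUE)
  INDEXTYPE IS XDB.XMLIndex
  PARAMETERS ('ASYNC (SYNC MANUAL)');

-- Later, for example after a bulk load: bring the index up to date.
BEGIN
  DBMS_XMLINDEX.SYNCINDEX(USER, 'XML_DOCS_UXI');
END;
/
```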
  55. 55. “There Can Be Only One…”<br />
  56. 56. Syntax Awareness<br />SYNC=ALWAYS<br />Mandatory for a Combined XMLIndex<br />SYNC=MANUAL<br />Locking<br />STALE=FALSE | TRUE<br />Hmmm…<br />Empty XMLIndex tables<br />OOPS, I got my “XMLTABLE” Syntax etc. “wrong”<br />
  57. 57. Notes on XMLIndex (1)<br />Only ONE XMLIndex is allowed in a user schema<br />Add extra XMLIndex structures (structured or unstructured)<br /> via ADD_GROUP syntax<br />Only SYNC=ALWAYS is allowed when using mixed XMLIndex structures or when adding more than one<br />
  58. 58. Notes on XMLIndex (2)<br />You need the LOCAL parameter to create local partitioned XMLIndexes<br />An XMLIndex on a HASH partitioned XMLType column or XMLType table is not (yet) allowed<br />But you can create an Oracle Text Index on such structures<br />
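A locally partitioned XMLIndex could be set up along these lines. This is a sketch with hypothetical table, partition and index names. It uses RANGE partitioning because, as noted above, HASH partitioned structures are not (yet) supported.

```sql
-- Range-partitioned table with a binary XML column.
CREATE TABLE xml_range_tab (
  id  NUMBER,
  doc XMLType
)
XMLTYPE COLUMN doc STORE AS BINARY XML
PARTITION BY RANGE (id) (
  PARTITION p_low  VALUES LESS THAN (1000),
  PARTITION p_high VALUES LESS THAN (MAXVALUE)
);

-- LOCAL equipartitions the XMLIndex with the base table.
CREATE INDEX xml_range_uxi
  ON xml_range_tab (doc)
  INDEXTYPE IS XDB.XMLIndex
  LOCAL;
```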
  59. 59. Recap<br />True understanding of Storage<br /> and Index options will provide:<br />Optimal performance<br />Outperform XML (Java based)<br />A lot of choice:<br />Problems are Complex<br />Also provides Solutions<br />Good design beforehand is the path to success<br />
  60. 60. References (1)<br />Oracle Whitepapers<br />Oracle XML DB: Choosing the Best XMLType Storage Option for Your Use Case (PDF)<br />Oracle XML DB: Best Practices to Get Optimal Performance out of XML Queries (PDF)<br />Blog (Dedicated XMLDB blog)<br />Semi-Structured XMLIndex section<br />Structured XMLIndex section<br />
  61. 61. References (2)<br />Oracle Open World Presentation on XML DB<br />S317428: Building Really Scalable XML Applications with Oracle XML DB and Oracle Text<br />XML DB OTN / FAQ Thread<br />