Miracle Open World 2011 - XML Index Strategies


  • It all comes down to packaging
  • Definitions of Structured, Semi-Structured and Unstructured data
  • Emp/Dept tables, Foreign/Primary Keys…Showing here ONLY 1 XML document…
  • See also OOW 2010, S317428: Building Really Scalable XML Applications with Oracle XML DB and Oracle Text – Nipun Agarwal, Oracle

    1. 1. XML Indexing Strategies: Choosing the Right Index for the Right Job<br />Marco Gralike<br />
    2. 2. Strange Chainsaw Tree Guy<br />OakTable Member<br />ACE Director<br />Customer Advisory Board Member<br />17+ years DBA, etc.<br />Blog.gralike.com<br />Technology.amis.nl<br />
    3. 3. Richard Foote (Mr. Index)<br />OakTable Member<br />ACE Director<br />Oracle Certified Professional<br />22+ years DBA, etc.<br />RichardFoote.wordpress.com<br />
    4. 4. Refinement<br />
    5. 5. Structured or Semi-Structured or…<br />Structured<br />Semi-Structured<br />
    6. 6. Unstructured Content<br />
    7. 7. Document Driven / Data Driven<br />
    8. 8. Design: Width and Height and …<br />[Diagram with nodes labeled 1 to 6 and X, Y, Z]<br />Content Height: minOccurs="0" maxOccurs="unbounded"<br />Content Width: type="xs:string", restriction…?<br />Content Distribution: histogram, statistics, skew, cardinality?<br />
    9. 9. XMLType “Under the Hood”<br />DOM Tree Model<br />XQuery<br />SQL/XML<br />XMLType Abstraction<br />Procedural XQuery<br />DB XQuery<br />XVM (use “no query rewrite”)<br />XQuery Rewrite<br />Pushdown<br />SQL Execution<br />XMLIndex<br />Streaming XPath Evaluation<br />Relational Access Methods<br />Binary XML<br />Object-Relational<br />Relational Storage<br />Secure Files<br />Source: OOW 2010 presentation S317428: Building Really Scalable XML Applications with Oracle XML DB and Oracle Text<br />
    16. 16. XML Storage/Index Use Cases<br />Binary XML<br />(Schema based)<br />XMLIndex Structured Component<br />
    17. 17. XMLType<br />In Memory<br />(document)<br />CLOB<br />(document)<br />Object Relational<br />(data)<br />Binary XML<br />(data)<br />
    18. 18. XMLType Storage Models<br />CLOB<br />Default until<br />Non-Schema Based<br />Binary XML<br />Oracle 11 and Onwards<br />Schema and Non-Schema Based<br />Object Relational (+Hybrid)<br />Nested Tables, Types, VARRAYs<br />Schema Based<br />
    19. 19. Storage Index Defaults (xmltype)<br />Binary XML / CLOB<br />LOB Index<br />Object Relational<br />DBMS_XMLSCHEMA “OPTIONS”<br />Oracle 10g: Index Organized Tables<br />Oracle 11g: B-Tree Indexes<br />xdb:annotations<br />Storage type xdb:SQLType<br />Storage type xdb:ColumnProps / xdb:TableProps<br />
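The storage models listed on the slide above are chosen in the DDL of the XMLType table or column. A minimal sketch, assuming Oracle 11g and non-schema-based content; all table and column names here are hypothetical and not taken from the presentation. Object-relational storage is left out because it additionally needs a registered XML Schema.

```sql
-- Hypothetical names; non-schema-based examples only.

-- XMLType table stored as Binary XML in a SecureFile LOB
CREATE TABLE xml_binary_tab OF XMLType
  XMLTYPE STORE AS SECUREFILE BINARY XML;

-- XMLType table stored as CLOB
CREATE TABLE xml_clob_tab OF XMLType
  XMLTYPE STORE AS CLOB;

-- XMLType column (instead of an XMLType table) stored as Binary XML
CREATE TABLE xml_column_tab (id NUMBER, xdata XMLType)
  XMLTYPE COLUMN xdata STORE AS SECUREFILE BINARY XML;
```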
    20. 20. Index Methods (xmltype)<br />CLOB<br /><ul><li>Oracle Text</li><li>XMLIndex (>= 11g)</li></ul>Binary XML<br /><ul><li>XMLIndex / Oracle Text</li><li>Virtual Columns (BTree)</li></ul>Object Relational<br /><ul><li>Standard Indexes</li><li>Oracle Text</li></ul>
    23. 23. Index Methods (10.x)<br />[Diagram: bookstore XML tree (book and whitepaper, with title, author, id, chapter, paragraph and content nodes) annotated with the available index types: BTree Index, Function based Index (XPath), Oracle Text Index]<br />
    24. 24. Function-Based Index<br />Deprecated in 11.2<br />Object Relational XMLType Storage<br />(can be used, but shouldn’t, on CLOB)<br />Performance wise the lesser option…<br />SQL> CREATE INDEX function_based_index<br /> ON xml_data_table<br /> (extractValue(OBJECT_VALUE, '/Root/TextID')); <br />
    25. 25. BTree / Bitmap Index (O.R.)<br />Structured XML Data<br />Ordered Collection Tables (OCT)<br />ComplexTypes…<br />“dot” notation using the “xmldata” pseudocolumn<br />SQL> CREATE INDEX dot_notation_index<br /> ON xml_data_table<br /> ("XMLDATA"."TEXTID");<br />
    26. 26. Syntax Index Alternatives<br />SQL> CREATE INDEX function_based_again_idx<br /> ON xml_data_table<br /> (CAST("XMLDATA"."TEXTID" as VARCHAR2(10)));<br />SQL> CREATE INDEX oracle_11_applicable_only_idx<br /> ON xml_data_table xdt<br /> (XMLCast(XMLQuery<br />('$i/Root/TextID'<br /> PASSING xdt.OBJECT_VALUE as "i" <br /> RETURNING content)<br />as VARCHAR2(10)));<br />
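A function-based index is only considered when the query repeats the indexed expression. The following is a sketch of a query shaped like the XMLCast/XMLQuery index on the slide above; the literal 'ABC' is a made-up value, and whether the optimizer actually picks the index depends on storage model and statistics.

```sql
-- Predicate mirrors the indexed expression:
-- XMLCast(XMLQuery('$i/Root/TextID' ...) AS VARCHAR2(10))
SELECT COUNT(*)
FROM   xml_data_table xdt
WHERE  XMLCast(
         XMLQuery('$i/Root/TextID'
                  PASSING xdt.OBJECT_VALUE AS "i"
                  RETURNING CONTENT)
         AS VARCHAR2(10)) = 'ABC';   -- 'ABC' is a hypothetical TextID value
```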
    27. 27. Oracle Text Index<br />Unstructured Data in XML<br />CLOB storage XML part in Object Relational XML<br />Secondary index on XMLIndex<br />Can only index XML data TEXT nodes<br />Result Set Interface (new in<br />Specify Query request and hit list requirements in XML<br />SQL> CREATE INDEX oracle_text_index<br /> ON xml_data_table<br /> (OBJECT_VALUE) INDEXTYPE IS CTXSYS.CONTEXT; <br />
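An Oracle Text CONTEXT index is used through the CONTAINS operator. A minimal sketch against the index created on the slide above; the search term 'chainsaw' is only an illustration.

```sql
-- Full-text search on the text nodes of the stored XML documents.
-- CONTAINS returns a relevance score; > 0 means the document matched.
SELECT COUNT(*)
FROM   xml_data_table xdt
WHERE  CONTAINS(xdt.OBJECT_VALUE, 'chainsaw') > 0;  -- hypothetical search term
```

Because a CONTEXT index is not maintained transactionally by default, newly inserted documents may not be found until the index has been synchronized.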
    28. 28. Index Methods (11.1)<br />[Diagram: the same bookstore XML tree, now annotated with BTree Index, Function based Index (XPath), Unstructured XMLIndex and a Secondary Oracle Text Index]<br />
    29. 29. Usage: Unstructured XMLIndex<br />XML document contains semi-structured data and structured data<br />Supports searching and fragment extraction<br />When the XPath to be queried is not known beforehand<br />XMLType CLOB or Binary XML content<br />If you use an XMLIndex and/or combine it with Structured XMLIndex(es)<br />
    30. 30. Simple: Unstructured XMLIndex<br />SQL> CREATE INDEX xmlindex_idx<br /> ON "XMLTYPE_COLUMN" (xdata) <br />INDEXTYPE IS XDB.XMLINDEX;<br />Index created.<br />SQL> CREATE INDEX xmlindex_idx<br /> ON "XMLTYPE_TABLE" (object_value) <br />INDEXTYPE IS XDB.XMLINDEX;<br />Index created.<br />
    31. 31. Creating Unstructured XMLIndex<br />CREATE INDEX XMLIDX <br />ON XMLBINARY_TAB (object_value) <br />INDEXTYPE IS XDB.XMLIndex<br />PARAMETERS <br /> ('PATHS (INCLUDE (/ROOT/ID /ROOT/INFO/INFO_ID ) <br />NAMESPACE MAPPING (xmlns="http://localhost/xmlschema_bin.xsd") ) <br />PATH TABLE path_table (TABLESPACE XML_DATA)<br />PATH ID INDEX pathid_idx (TABLESPACE XML_INDX)<br />ORDER KEY INDEX orderkey_idx (TABLESPACE XML_INDX)<br />VALUE INDEX value_idx (TABLESPACE XML_INDX)<br />ASYNC (SYNC ALWAYS) STALE (FALSE) ') <br />PARALLEL LOGGING;<br />
    32. 32. Path Table<br />PATH TABLE<br />PATH INDEX<br /><ul><li>(PATHID, RID), BTREE</li></ul>ORDER INDEX<br /><ul><li>(RID, ORDER_KEY), BTREE</li></ul>VALUE INDEX<br /><ul><li>(SUBSTRB("VALUE",1,1599))</li><li>FUNCTION BASED</li></ul>SECONDARY INDEXES<br />
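With path subsetting as above, only queries on the included paths can be answered from the index. A sketch of such a query, assuming the documents use the default namespace from the NAMESPACE MAPPING clause; the value "42" is hypothetical.

```sql
-- /ROOT/ID is one of the INCLUDEd paths, so this predicate is a
-- candidate for evaluation against the XMLIndex path table.
SELECT COUNT(*)
FROM   xmlbinary_tab t
WHERE  XMLExists(
         'declare default element namespace "http://localhost/xmlschema_bin.xsd";
          /ROOT[ID="42"]'
         PASSING t.OBJECT_VALUE);
```

A predicate on a path outside the INCLUDE list would fall back to other access methods, such as streaming XPath evaluation on Binary XML.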
    33. 33. Unstructured XMLIndex (UXI)<br /><ul><li>One Path Table</li><li>Use Path Subsetting</li><li>Full Blown XMLIndex can be BIG</li><li>Token Tables (XDB.X$......)</li><li>Query re-write on Tokens</li><li>Fuzzy Searches, //</li><li>Optimizer Statistics</li><li>Can be maintained Manually</li><li>Recorded in Pending Table</li><li>Secondary indexes possible</li></ul>
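To see which path table and pending table belong to an XMLIndex, the data dictionary can be queried. A sketch, assuming the USER_XML_INDEXES view with the column names as found in the 11g dictionary; verify them with DESCRIBE on your own release before relying on them.

```sql
-- One row per XMLIndex owned by the current user.
-- Column names are assumed from the 11g data dictionary.
SELECT index_name,
       table_name,
       path_table_name,   -- the path table of the unstructured component
       pend_table_name,   -- pending table used for manual/deferred sync
       async,
       stale
FROM   user_xml_indexes;
```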
    43. 43. Index Methods (11.2)<br />[Diagram: the same bookstore XML tree, annotated with BTree Index, Function based Index (XPath), Unstructured XMLIndex, Secondary Oracle Text Index and, for the highly structured islands of data, Structured XMLIndex]<br />
    44. 44. Usage: Structured XMLIndex<br />With highly Structured Data<br />Likely candidates: ComplexTypes<br />Structured Islands of Data<br />Can be nested, but officially only one level<br />XMLTABLE “virtual” nested column hint<br />Will create (multiple) “Content Tables” <br />Multiple XPath defined same columns with different purpose<br />They deliver relational performance…!<br />
    45. 45. Simple: Structured XMLIndex<br />“XMLTABLE” Driven Syntax<br />SQL> CREATE INDEX xmlindex_sxi<br /> on xml_data_table (xmlcol) <br />indextype is xdb.xmlindex<br /> parameters <br /> ('GROUP employee_info_group<br />XMLTABLE EMP_CONTENT_TABLE<br /> ''/employees/emp'' <br /> COLUMNS <br />empid NUMBER(10) PATH ''id'' '); <br />Be aware: '' is two single quotes (an escaped quote inside the PARAMETERS string), not a double quote<br />
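A structured XMLIndex is matched against queries that use the same XMLTABLE row pattern and column definitions. A sketch of a query shaped like the index above; the value 100 is made up.

```sql
-- Same row pattern ('/employees/emp') and column definition
-- (empid NUMBER(10) PATH 'id') as in the index group, so the query
-- can be rewritten against the content table EMP_CONTENT_TABLE.
SELECT x.empid
FROM   xml_data_table t,
       XMLTable('/employees/emp'
                PASSING t.xmlcol
                COLUMNS empid NUMBER(10) PATH 'id') x
WHERE  x.empid = 100;   -- hypothetical employee id
```

Note that inside a normal query the XPath literals take single quotes; the doubled quotes are only needed inside the PARAMETERS string.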
    46. 46. Content Table(s)<br />KEY INDEX<br /><ul><li>(KEY)</li><li>Unique BTREE Index</li><li>Primary Key</li></ul>RID INDEX<br /><ul><li>(RID)</li><li>NON Unique BTREE Index</li></ul>[Diagram: each content table holds RID (rowid), Key (RAW, not null) and your own columns]<br />
    50. 50. Structured XMLIndex (SXI)<br />Content Table(s)<br />Based on XMLTABLE syntax<br />XMLTable construct can be nested but:<br />“Only ONE XMLType column allowed”<br /> VIRTUAL column<br />Can be maintained Manually<br />Secondary indexes possible<br />LOCAL parameter (partitioning)<br />
    51. 51. Adding Structured Indexes<br />SQL> ALTER INDEX xmlindex_sxi<br />parameters<br /> ('ADD_GROUP<br /> GROUP my_new_group<br /> XMLTABLE xml_content_tab_new<br /> ''/root/extra'' <br /> COLUMNS <br />extracol VARCHAR2(35) PATH ''new_element'' ');<br />
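The content table only gets the KEY and RID indexes automatically, so "your columns" are not indexed until a secondary index is added. A minimal sketch, assuming the content table EMP_CONTENT_TABLE and column empid from the earlier structured XMLIndex example; the index name is hypothetical.

```sql
-- Secondary B-tree index on a projected column of the content table.
-- emp_content_empid_idx is a made-up name.
CREATE INDEX emp_content_empid_idx
  ON emp_content_table (empid);
```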
    52. 52. Mixed XMLIndex Options<br />[Diagram: the same bookstore XML tree, covered by one Unstructured XMLIndex with a Secondary (text) Index and by two Structured XMLIndex components]<br />
    53. 53. Mixed XMLIndex structures<br />CREATE INDEX xmlindex on TEST_RANGE_XML (doc) <br />indextype is xdb.xmlindex<br />PARAMETERS <br />(' PATH TABLE path_table PATHS (EXCLUDE(/root/ElementInfo)) '); <br />BEGIN <br />DBMS_XMLINDEX.registerParameter<br /> ('StructuredXML', <br /> 'ADD_GROUP GROUP ElementInfo<br />XMLTABLE xml_cnttable_valueinfo ''/root/ElementInfo'' <br /> COLUMNS ValueInfo VARCHAR2(100) PATH ''ValueInfo'' '); <br />END; <br />/<br />ALTER INDEX xmlindex PARAMETERS('PARAM StructuredXML'); <br />
    54. 54. XMLIndex Maintenance<br />ALTER INDEX<br />XMLIndex Parameter Changes<br />DBMS_XMLINDEX.DROPPARAMETER<br />DBMS_XMLINDEX.MODIFYPARAMETER<br />DBMS_XMLINDEX.REGISTERPARAMETER<br />Manually Synchronizing an XMLIndex<br />DBMS_XMLINDEX.SYNCINDEX <br />Pending Tables<br />
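Manual synchronization applies when the index was created with a deferred ASYNC setting, so that changes are queued in the pending table. A sketch of the call, assuming an index named XMLINDEX_IDX owned by the current user; the index name is only an example.

```sql
-- Apply the changes recorded in the pending table to the XMLIndex.
-- First argument: index owner, second argument: index name.
BEGIN
  DBMS_XMLINDEX.syncIndex(USER, 'XMLINDEX_IDX');
END;
/
```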
    55. 55. “There Can Be Only One…”<br />
    56. 56. Syntax Awareness<br />SYNC=ALWAYS<br />Mandatory when Combined XMLIndex<br />SYNC=MANUAL<br />Locking<br />STALE=FALSE | TRUE<br />Hmmm…<br />Empty XMLIndex tables<br />OOPS  I got my “XMLTABLE” Syntax etc. “wrong”<br />
    57. 57. Notes on XMLIndex (1)<br />Only ONE XMLIndex is allowed in a user schema<br />Add extra XMLIndex structures (structured or unstructured) <br /> via ADD_GROUP syntax<br />Only SYNC=ALWAYS is allowed when using mixed XMLIndex structures or when adding more than one<br />
    58. 58. Notes on XMLIndex (2)<br />You need the LOCAL parameter to create local partitioned XMLIndexes<br />An XMLIndex on a HASH partitioned XMLType column or XMLType table is not (yet) allowed<br />But you can create an Oracle Text Index on such structures<br />
    59. 59. Recap<br />True understanding of Storage <br /> and Index options will provide:<br />Optimal performance<br />Outperform XML (Java based)<br />A lot of choice:<br />Problems are Complex<br />Also provides Solutions<br />Good design beforehand is the path to success<br />
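A sketch of a locally partitioned XMLIndex on a range-partitioned table, in line with the note above that hash partitioning is not supported. The table, partition and index names are hypothetical, and the example assumes Binary XML storage on 11g.

```sql
-- Range-partitioned table with an XMLType column (hypothetical names).
CREATE TABLE range_part_xml (id NUMBER, doc XMLType)
  XMLTYPE COLUMN doc STORE AS SECUREFILE BINARY XML
  PARTITION BY RANGE (id)
  (PARTITION p_low  VALUES LESS THAN (1000),
   PARTITION p_high VALUES LESS THAN (MAXVALUE));

-- LOCAL makes the XMLIndex (and its path table) equipartitioned
-- with the base table.
CREATE INDEX range_part_xml_idx
  ON range_part_xml (doc)
  INDEXTYPE IS XDB.XMLIndex
  LOCAL;
```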
    60. 60. References (1)<br />Oracle Whitepapers<br />Oracle XML DB : Choosing the Best XMLType Storage Option for Your Use Case (PDF)<br />Oracle XML DB : Best Practices to Get Optimal Performance out of XML Queries (PDF)<br />Blog<br />http://blog.gralike.com <br />(Dedicated XMLDB blog)<br />Semi-Structured XMLIndex section<br />Structured XMLIndex section<br />
    61. 61. References (2)<br />Oracle Open World Presentation on XML DB<br />S317428: Building Really Scalable XML Applications with Oracle XML DB and Oracle Text<br />XML DB OTN / FAQ Thread<br />http://forums.oracle.com/forums/forum.jspa?forumID=34 <br />http://forums.oracle.com/forums/thread.jspa?threadID=410714<br />