UKOUG 2010 (Birmingham) - XML Indexing strategies - Choosing the Right Index for the Right Job

  1. XML Indexing Strategies: Choosing the Right Index for the Right Job (Marco Gralike)
  2. Strange Chainsaw Tree Guy
      • OakTable Member
      • ACE Director
      • Customer Advisory Board Member
      • 17+ years DBA, etc.
      • Blog.gralike.com, Technology.amis.nl
  3. Richard Foote (Mr. Index)
      • OakTable Member
      • ACE Director
      • Oracle Certified Professional
      • 22+ years DBA, etc.
      • RichardFoote.wordpress.com
  4. Refinement
  5. Structured or Semi-Structured or…
      [Panels: Structured, Semi Structured]
  6. Unstructured Content
  7. Document Driven / Data Driven
  8. XML Container (in memory or via storage)
      • In Memory (document)
      • CLOB (document)
      • Object Relational (data)
      • Binary XML (data)
  9. Design: Width and Height and …
      [Diagram: axes X, Y, Z with numbered elements 1 to 6]
      • Content Height: minOccurs="0" maxOccurs="unbounded"
      • Content Width: type="xs:string", restriction…?
      • Content Distribution: histogram, statistics, skew, cardinality?
  10. XMLIndex Use Cases
      • Binary XML (Schema based): XMLIndex Structured Component
      • Binary XML / CLOB Column (Schema less, Schema based): XMLIndex Structured Component, mixed with Text index
  11. Storage Models (xmltype)
      • CLOB
        – Default until 11.2.0.2.0
        – Non-Schema Based
      • Binary XML
        – Oracle 11 and Onwards
        – Schema and Non-Schema Based
      • Object Relational (+Hybrid)
        – Nested Tables, Types, VARRAYs
        – Schema Based
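The storage model can be requested explicitly when the table is created, which avoids depending on the version-specific default. The following is a minimal sketch, assuming Oracle 11g or later. The table and column names are hypothetical and do not come from the slides. Object-relational storage is left out because it first needs an XML Schema registered through DBMS_XMLSCHEMA.

```sql
-- Hypothetical table names, for illustration only.

-- XMLType table stored as CLOB (document is kept as text)
CREATE TABLE xml_clob_tab OF XMLType
  XMLTYPE STORE AS CLOB;

-- XMLType table stored as Binary XML (post-parse, compact format)
CREATE TABLE xml_binary_tab OF XMLType
  XMLTYPE STORE AS BINARY XML;

-- Relational table with one XMLType column stored as Binary XML
CREATE TABLE xml_column_tab (
  id  NUMBER PRIMARY KEY,
  doc XMLType
)
XMLTYPE COLUMN doc STORE AS BINARY XML;
```

An XMLType table is addressed through OBJECT_VALUE, while an XMLType column is addressed by its column name (doc here). Both forms appear in the index examples on later slides.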
  12. Querying XML Content in XML DB
      [Architecture diagram; labels: XQuery, SQL/XML, XMLType Abstraction, XVM (use "no query rewrite"), Pushdown, XQuery Rewrite, Procedural XQuery, DB XQuery, SQL Execution, Relational Access Methods, XMLIndex, DOM Tree Model, Streaming XPath Evaluation, Object-Relational, Relational Storage, Secure Files, Binary XML]
      Source: S317428: Building Really Scalable XML Applications with Oracle XML DB and Oracle Text
  13. Index Methods (xmltype)
  14. Storage Index Defaults (xmltype)
      • Binary XML / CLOB
        – LOB Index
      • Object Relational
        – DBMS_XMLSCHEMA: "OPTIONS"
        – Oracle 10g: Index Organized Tables
        – Oracle 11g: B-Tree Indexes
        – xdb:annotations
          • Storage type: xdb:SQLType
          • Storage type: xdb:ColumnProps / xdb:TableProps
  15. Index Methods (10.x)
      [Diagram: sample "bookstore" document tree (nodes: bookstore, book, whitepaper, title, author, id, chapter, paragraph, content) annotated with: Function based Index (XPath), Oracle Text Index, BTree Index]
  16. Function-Based Index
      • Deprecated in 11.2
      • Object Relational XMLType Storage (can, but shouldn’t on CLOB)
      • Performance-wise the lesser option…
      SQL> CREATE INDEX function_based_index
             ON xml_data_table (extractValue(OBJECT_VALUE, '/Root/TextID'));
  17. BTree / Bitmap Index
      • Structured XML Data
        – Ordered Collection Tables (OCT)
        – ComplexTypes…
        – "dot" notation using the "xmldata" pseudocolumn
      SQL> CREATE INDEX dot_notation_index
             ON xml_data_table ("XMLDATA"."TEXTID");
  18. Index Alternatives
      SQL> CREATE INDEX oracle_11_applicable_only_index
             ON xml_data_table xdt
             (XMLCast(XMLQuery('$i/Root/TextID'
                               PASSING xdt.OBJECT_VALUE AS "i"
                               RETURNING CONTENT)
                      AS VARCHAR2(10)));
      SQL> CREATE INDEX function_based_again_idx
             ON xml_data_table
             (CAST("XMLDATA"."TEXTID" AS VARCHAR2(10)));
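A function-based index is only a candidate when the query repeats the indexed expression. The sketch below shows a query written to match the XMLCast(XMLQuery(...)) index from this slide. The literal 'ID42' is a made-up value, and whether the optimizer actually picks the index depends on statistics and storage model.

```sql
-- The WHERE clause uses the same expression as
-- oracle_11_applicable_only_index, so that index can be considered.
-- 'ID42' is a hypothetical TextID value.
SELECT xdt.OBJECT_VALUE
FROM   xml_data_table xdt
WHERE  XMLCast(
         XMLQuery('$i/Root/TextID'
                  PASSING xdt.OBJECT_VALUE AS "i"
                  RETURNING CONTENT)
         AS VARCHAR2(10)) = 'ID42';
```

Checking the execution plan (for example with EXPLAIN PLAN and DBMS_XPLAN.DISPLAY) is the usual way to confirm that the index is used.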
  19. Oracle Text Index
      • Unstructured Data in XML
        – CLOB storage XML part in Object Relational XML
        – Secondary index on XMLIndex
      • Can only index XML data TEXT nodes
      • Result Set Interface (new in 11.2.0.2)
        – Specify Query request and hit list requirements in XML
      SQL> CREATE INDEX oracle_text_index
             ON xml_data_table (OBJECT_VALUE)
             INDEXTYPE IS CTXSYS.CONTEXT;
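An Oracle Text CONTEXT index is queried through the CONTAINS operator. This is a minimal sketch against the index created on this slide. The search term is arbitrary, and the explicit synchronization step assumes the index was created with default settings, where changes are not indexed at commit time.

```sql
-- Count documents whose text nodes contain the (arbitrary) word "oracle".
SELECT COUNT(*)
FROM   xml_data_table t
WHERE  CONTAINS(t.OBJECT_VALUE, 'oracle') > 0;

-- With default settings a CONTEXT index is not maintained at commit time;
-- pending DML is applied by an explicit sync (requires execute
-- privilege on CTX_DDL).
BEGIN
  CTX_DDL.SYNC_INDEX('oracle_text_index');
END;
/
```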
  20. Index Methods (11.1)
      [Diagram: sample "bookstore" document tree (nodes: bookstore, book, whitepaper, title, author, id, chapter, paragraph, content) annotated with: Unstructured XMLIndex, Function based Index (XPath), Secondary Oracle Text Index, BTree Index]
  21. Simple: Unstructured XMLIndex
      SQL> CREATE INDEX xmlindex_idx
             ON "XMLTYPE_COLUMN"(xdata)
             INDEXTYPE IS XDB.XMLINDEX;
      Index created.
      SQL> CREATE INDEX xmlindex_idx
             ON "XMLTYPE_TABLE"(object_value)
             INDEXTYPE IS XDB.XMLINDEX;
      Index created.
  22. Usage: Unstructured XMLIndex
      • XML Document contains:
        – Semi Structured Data and Structured Data
        – Supports searching and fragment extraction
        – When the XPath queried is not known beforehand
      • XMLType CLOB or Binary XML content
      • If you use an XMLIndex and/or combine it with Structured XMLIndex(es)
  23. [Diagram: Unstructured XMLIndex, f(x), Path Table]
  24. Unstructured XMLIndex (UXI)
      • One Path Table
      • Use Path Subsetting
      • Full Blown XMLIndex can be BIG
      • Token Tables (XDB.X$......)
      • Query re-write on Tokens
      • Fuzzy Searches, //
      • Optimizer Statistics
      • Can be maintained Manually
      • Recorded in Pending Table
      • Secondary indexes possible
      [Diagram: Unstructured XMLIndex, f(x), Path Table]
  25. Creating Unstructured XMLIndex
      CREATE INDEX XMLIDX
        ON XMLBINARY_TAB (object_value)
        INDEXTYPE IS XDB.XMLIndex
        PARAMETERS ('PATHS (INCLUDE (/ROOT/ID
                                     /ROOT/INFO/INFO_ID)
                            NAMESPACE MAPPING
                              (xmlns="http://localhost/xmlschema_bin.xsd"))
                     PATH TABLE path_table (TABLESPACE XML_DATA)
                     PATH ID INDEX pathid_idx (TABLESPACE XML_INDX)
                     ORDER KEY INDEX orderkey_idx (TABLESPACE XML_INDX)
                     VALUE INDEX value_idx (TABLESPACE XML_INDX)
                     ASYNC (SYNC ALWAYS) STALE (FALSE)')
        PARALLEL LOGGING;
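With path subsetting, only queries on the included paths can benefit from the index. The sketch below shows a query on /ROOT/ID that could be served by the index above, followed by a dictionary query to confirm which path table backs it. The value "42" is hypothetical, and the namespace declaration mirrors the NAMESPACE MAPPING in the index definition.

```sql
-- Query on an included path (/ROOT/ID); "42" is a made-up value.
-- The XQuery prolog is kept on the same line as the path so that no
-- line ends in a semicolon when pasted into SQL*Plus.
SELECT COUNT(*)
FROM   XMLBINARY_TAB t
WHERE  XMLExists(
         'declare default element namespace "http://localhost/xmlschema_bin.xsd"; /ROOT[ID = "42"]'
         PASSING t.OBJECT_VALUE);

-- Confirm the path table that belongs to the index.
SELECT index_name, table_name, path_table_name
FROM   user_xml_indexes
WHERE  index_name = 'XMLIDX';
```

A query on a path that was not included (for example another child of /ROOT) falls back to the storage model's default evaluation.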
  26. Index Methods (11.2)
      [Diagram: sample "bookstore" document tree (nodes: bookstore, book, whitepaper, title, author, id, chapter, paragraph, content) annotated with: Structured XMLIndex (Highly Structured Islands of Data), Unstructured XMLIndex, Function based Index (XPath), Secondary Oracle Text Index, BTree Index]
  27. Simple: Structured XMLIndex
      • "XMLTABLE" Driven Syntax
      SQL> CREATE INDEX xmlindex_sxi
             ON xmldata_table (doc)
             INDEXTYPE IS xdb.xmlindex
             PARAMETERS ('GROUP elementinfo_group
                          XMLTABLE xml_cnt_tab_elementinfo ''/root/element''
                          COLUMNS infocol VARCHAR2(4000) PATH ''info'' ');
      Be aware of the doubled single quotes ('') inside the PARAMETERS string.
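A structured XMLIndex is aimed at queries whose XMLTABLE expression lines up with the index definition. The sketch below repeats the row pattern, column path and data type from the index on this slide. The filter value is hypothetical.

```sql
-- Row pattern '/root/element' and column path 'info' match the
-- structured XMLIndex group, so the query can be rewritten against
-- the content table xml_cnt_tab_elementinfo.
-- 'some value' is a made-up filter.
SELECT x.infocol
FROM   xmldata_table t,
       XMLTABLE('/root/element'
                PASSING t.doc
                COLUMNS infocol VARCHAR2(4000) PATH 'info') x
WHERE  x.infocol = 'some value';
```

Note that inside a normal query the XPath literals use single quotes once. The doubling is only needed where the XMLTABLE definition is itself embedded in the PARAMETERS string.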
  28. Usage: Structured XMLIndex
      • With highly Structured Data
      • Likely candidates: ComplexTypes
      • Structured Islands of Data
        – Can be nested, but only one level
        – XMLTABLE "virtual" nested column hint
      • Multiple "content tables"
        – Multiple XPath definitions of the same columns, for different purposes
      They deliver relational performance…!
  29. Content Table(s)
      • KEY INDEX (KEY)
        – Unique BTREE Index
        – Primary Key
      • RID INDEX (RID)
        – NON Unique BTREE Index
      • Your Columns
      • CONTENT TABLE(s) columns: RID (rowid), Key (RAW, not null), YOUR column(s) X
  30. Structured XMLIndex (SXI)
      • Content Table(s)
      • Based on XMLTABLE syntax
      • XMLTable construct can be nested but: "Only ONE XMLType column allowed" (VIRTUAL column)
      • Can be maintained Manually
      • Secondary indexes possible
      • LOCAL parameter (partitioning)
      [Diagram: Structured XMLIndex, f(x), Content Tables]
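Because a content table is an ordinary relational table, a secondary index can be created on its columns. The following is a sketch that assumes the content table xml_cnt_tab_elementinfo from the earlier structured XMLIndex example exists in the current schema under its default upper-case name. The index name is hypothetical, and indexing a VARCHAR2(4000) column assumes a block size large enough for the key (8K or more).

```sql
-- List the columns of the content table; RID and KEY are expected
-- next to the user-defined column(s).
SELECT column_name, data_type
FROM   user_tab_columns
WHERE  table_name = UPPER('xml_cnt_tab_elementinfo')
ORDER  BY column_id;

-- Hypothetical secondary B-tree index on the user-defined column.
CREATE INDEX xml_cnt_tab_info_idx
  ON xml_cnt_tab_elementinfo (infocol);
```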
  31. Adding Structured Indexes
      SQL> ALTER INDEX xmlindex_sxi
             PARAMETERS ('ADD_GROUP GROUP my_new_group
                          XMLTABLE xml_content_tab_new ''/root/extra''
                          COLUMNS extracol VARCHAR2(35) PATH ''new_element'' ');
  32. Mixed XMLIndex Options
      [Diagram: sample "bookstore" document tree (nodes: bookstore, book, whitepaper, title, author, id, chapter, paragraph, content) annotated with: Unstructured XMLIndex, Structured XMLIndex (twice), Secondary (text) Index]
  33. Mixed XMLIndex structures
      CREATE INDEX xmlindex
        ON TEST_RANGE_XML (doc)
        INDEXTYPE IS xdb.xmlindex
        PARAMETERS ('PATH TABLE path_table
                     PATHS (EXCLUDE (/root/ElementInfo))');
      BEGIN
        DBMS_XMLINDEX.registerParameter
          ('StructuredXML',
           'ADD_GROUP GROUP ElementInfo
            XMLTABLE xml_cnttable_valueinfo ''/root/ElementInfo''
            COLUMNS ValueInfo VARCHAR2(100) PATH ''ValueInfo'' ');
      END;
      /
      ALTER INDEX xmlindex PARAMETERS ('PARAM StructuredXML');
  34. XMLIndex Maintenance
      • ALTER INDEX
      • XMLIndex Parameter Changes
        – DBMS_XMLINDEX.DROPPARAMETER
        – DBMS_XMLINDEX.MODIFYPARAMETER
        – DBMS_XMLINDEX.REGISTERPARAMETER
      • Manually Synchronizing an XMLIndex
        – DBMS_XMLINDEX.SYNCINDEX
        – Pending Tables
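The maintenance calls listed above can be sketched as follows. The index name MY_XMLINDEX is hypothetical and is assumed to be an XMLIndex in the current schema that was created with manual synchronization. 'StructuredXML' is the parameter name registered on the earlier "Mixed XMLIndex structures" slide.

```sql
-- Apply the changes recorded in the pending table of a manually
-- synchronized XMLIndex (hypothetical index name).
BEGIN
  DBMS_XMLINDEX.SYNCINDEX(USER, 'MY_XMLINDEX');
END;
/

-- Remove a previously registered parameter string once it is no
-- longer needed.
BEGIN
  DBMS_XMLINDEX.DROPPARAMETER('StructuredXML');
END;
/
```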
  35. "There Can Be Only One…"
  36. Syntax Awareness
      • SYNC=ALWAYS
        – Mandatory when Combined XMLIndex
      • SYNC=MANUAL
        – Locking
      • STALE=FALSE | TRUE
        – Hmmm…
      • Empty XMLIndex tables
        – OOPS, I got my "XMLTABLE" syntax etc. "wrong"
  37. Notes on XMLIndex (1)
      • Only ONE XMLIndex is allowed per XMLType column or table
        – Add extra XMLIndex structures (structured or unstructured) via ADD_GROUP syntax
        – Only SYNC=ALWAYS is allowed when using mixed XMLIndex structures or when adding more than one structure
  38. Notes on XMLIndex (2)
      • You need the LOCAL parameter to create local partitioned XMLIndexes
      • An XMLIndex on a HASH partitioned XMLType column or XMLType table is not (yet) allowed
        – But you can create an Oracle Text Index on such structures
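A locally partitioned XMLIndex can be sketched as follows. The table, partition and index names are hypothetical. Range partitioning is used because, as noted on this slide, hash partitioning is not supported for XMLIndex.

```sql
-- Hypothetical range-partitioned table with a Binary XML column.
CREATE TABLE sales_docs (
  id  NUMBER,
  doc XMLType
)
XMLTYPE COLUMN doc STORE AS BINARY XML
PARTITION BY RANGE (id) (
  PARTITION p_low  VALUES LESS THAN (1000),
  PARTITION p_high VALUES LESS THAN (MAXVALUE)
);

-- LOCAL makes the XMLIndex equipartitioned with the base table.
CREATE INDEX sales_docs_xidx
  ON sales_docs (doc)
  INDEXTYPE IS XDB.XMLINDEX
  LOCAL;
```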
  39. Recap
      • True understanding of Storage and Index options will provide:
        – Optimal performance
        – Performance that beats (Java-based) XML processing
      • A lot of choice:
        – Problems are Complex
        – Also provides Solutions
      • Good design beforehand is the path to success
  40. References (1)
      • Oracle Whitepapers
        – Oracle XML DB: Choosing the Best XMLType Storage Option for Your Use Case (PDF)
        – Oracle XML DB: Best Practices to Get Optimal Performance out of XML Queries (PDF)
      • Blog
        – http://blog.gralike.com (dedicated XMLDB blog)
          • Semi-Structured XMLIndex section
          • Structured XMLIndex section
  41. References (2)
      • Oracle Open World Presentation on XML DB
        – S317428: Building Really Scalable XML Applications with Oracle XML DB and Oracle Text
      • XML DB OTN / FAQ Thread
        – http://forums.oracle.com/forums/forum.jspa?forumID=34
        – http://forums.oracle.com/forums/thread.jspa?threadID=410714
