10. If you’re a performance nerd, this is actually cool… No one figured out XML yet… Solving the customer problem… Back to basics… Deeper understanding of the data handling issues… So why the “….” XML…?
12. Free Format…”XML is cool”… (aka no design effort) Have to Uphold the “Coding Granny Argument”… Everyone for themselves… Waiting for “Codd, Date”… Square wheels… What’s spoiling the soup…?
13. Impedance Mismatch. Different data models: XPath models an XML document as a tree, while most general-purpose programming languages have no native data types for a tree. Different programming paradigms: XSLT is a functional language, while Java is object-oriented and Perl is procedural.
14. Impedance Mismatch: Effects, Costs. Unnecessary CPU and memory overhead. A lot of expensive type and encoding conversions.
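To make the conversion cost on the two Impedance Mismatch slides concrete, here is a minimal Python sketch. It is an illustration only, not something shown in the deck: the `order` document and the `to_native` helper are hypothetical. Every value has to be parsed out of the tree and converted by hand before the host language can work with it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
DOC = "<order><id>42</id><total>19.95</total><item>pen</item><item>ink</item></order>"

def to_native(xml_text):
    """Copy an XML tree into native Python types, converting each value by hand."""
    root = ET.fromstring(xml_text)                    # text -> in-memory tree
    return {
        "id": int(root.findtext("id")),               # string -> int conversion
        "total": float(root.findtext("total")),       # string -> float conversion
        "items": [e.text for e in root.findall("item")],  # repeated nodes -> list
    }

order = to_native(DOC)
print(order)
```

Each hop between XML and the host language repeats this parse-and-convert work, which is where the CPU and memory overhead comes from.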
22. The “Dimensions” in 1 XML doc. [Diagram: dimensions numbered 1 to 6 along axes X, Y, Z; n x rows.] Elements with maxOccurs="unbounded".
23. Multi Dimensional Issues… It's a database… It's row based. It's column based. It's multiple databases… More than 1 XML doc. Not uncommon: more than 1 Mb.
24. XMLType – Not just a “Container”. Complexities of a database: “Relations”, “Redundancy”, “Nullology”, design, etc… It can contain a database. 10 Mb or bigger nowadays, more often than not… Enormously complex XSDs.
25. XMLType – Not just a “Datatype”. Checked on XML well-formedness: one root element; begin & end tags. If there is an XML Schema reference: XOB methods will be used if an XML Schema is available; DOM methods will be used if registered XML Schema information is not available.
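The well-formedness rules on this slide (one root element, matching begin and end tags) can be sketched with Python's standard parser. This is an analogy only, not Oracle's XMLType implementation; the `is_well_formed` helper is a hypothetical name.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as a single well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # Raised for unbalanced tags, more than one root element, and so on.
        return False

print(is_well_formed("<a><b>ok</b></a>"))            # one root, balanced tags
print(is_well_formed("<a><b>missing end tag</a>"))   # <b> is never closed
print(is_well_formed("<a/><b/>"))                    # two root elements
```

Note that this checks well-formedness only; validation against an XML Schema is a separate and more expensive step.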
26. What you want in access… Fast DDL Selects Inserts, Deletes, Updates Specific / Smart Small XML Fragments Direct Access
27. Mistakes are very, very Painful! Inserts, Updates & Deletes Fast Efficient Selects Precise Via Indexes ? XML Validation Strict, Lazy, really needed ? Client Side Possibilities
35. Common XML Parsers Often DOM or Infoset based CPU intensive Memory intensive Serializing, parsing, tree traversals, happen in memory…
36. In Memory: Common XML Parsers. Often handle XML tree traversals via only ONE method. They are not aware of whether the XML content is structured, semi-structured or unstructured. They are not very “smart” / “content aware” in their XML handling, based on the XML trees and/or XML data content.
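As a rough illustration of why DOM-style parsing is memory intensive, the sketch below counts the same elements twice: once by building the whole tree, and once by streaming and discarding each row after use. It uses Python's standard library as a stand-in, not the parsers discussed in the deck; `build_doc` and both counting functions are hypothetical names.

```python
import io
import xml.etree.ElementTree as ET
from xml.dom import minidom

def build_doc(n):
    """Generate a synthetic document with n <row> elements."""
    rows = "".join(f"<row><id>{i}</id></row>" for i in range(n))
    return f"<rows>{rows}</rows>"

def count_rows_dom(xml_text):
    """DOM approach: the complete tree is materialised in memory first."""
    dom = minidom.parseString(xml_text)
    return len(dom.getElementsByTagName("row"))

def count_rows_streaming(xml_text):
    """Streaming approach: handle each <row> as it ends, then empty it."""
    count = 0
    source = io.BytesIO(xml_text.encode("utf-8"))
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "row":
            count += 1
            elem.clear()  # drop the row's children and text once processed
    return count

doc = build_doc(1000)
print(count_rows_dom(doc), count_rows_streaming(doc))
```

Both functions return the same count; the difference is how much of the document has to be held in memory at once.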
37. XMLType Physical Storage CLOB LOB LOB index Object Relational Varray, Types, Nested Tables IOT, B-Tree, XML Schema Binary XML LOB, LOB Index Stored in Post Parse Representation
38. Hybrid CLOB Mixed complex[n] un/structured XSD [y] B-Tree, IOT Document na unstructured XSD [n] XMLIndex Relational World XMLDB World XML Data Storage XMLType column/tables XMLType Views Obj.Rel. Binary XML Content complex[n] structured XSD [y] B-Tree, IOT (Object) Relational Objects Mixed complex[y] un/structured XSD [y/n] XMLIndex Relational Tables
43. Structured XMLIndex (SXI), 11.2. Content table(s). Based on XMLTABLE syntax. XMLTable construct can be nested, but: only 1 extra XMLType allowed; VIRTUAL column is passed. Can be maintained manually. Secondary indexes possible. [Diagram: Structured XMLIndex, f(x), Content Tables.]
44. Driving access on CONTENT BTree Index bookstore Secondary Oracle Text Index Function based Index (XPath) B-Tree Index book whitepaper StructuredXMLIndex Unstructured XMLIndex title author author chapter title author id paragraph content structured content
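The idea of driving access on content can be sketched outside Oracle: project the repeating elements of a document into relational rows, then put an ordinary B-tree index on the projected column. The sketch below uses Python with SQLite as a stand-in; it is not Oracle XMLIndex syntax. The element names come from the diagram, while the sample values, the `book_content` table, the `book_title_idx` index and the `shred_books` helper are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample data using the bookstore/book/title/author names from the slide.
DOC = """<bookstore>
  <book id="1"><title>XML DB Basics</title><author>A. Writer</author></book>
  <book id="2"><title>Indexing XML</title><author>B. Author</author></book>
</bookstore>"""

def shred_books(xml_text):
    """Project each repeating <book> element into one relational row."""
    root = ET.fromstring(xml_text)
    return [
        (int(book.get("id")), book.findtext("title"), book.findtext("author"))
        for book in root.findall("book")
    ]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book_content (id INTEGER, title TEXT, author TEXT)")
conn.executemany("INSERT INTO book_content VALUES (?, ?, ?)", shred_books(DOC))
# A secondary index on the projected column, so lookups need not touch the XML.
conn.execute("CREATE INDEX book_title_idx ON book_content (title)")

rows = conn.execute(
    "SELECT id, author FROM book_content WHERE title = ?", ("Indexing XML",)
).fetchall()
print(rows)
```

Once the content lives in rows, the query is answered from the indexed table rather than by traversing the XML tree.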
48. XML Schema Advantages. XML Schema will be parsed only once. XML Schema will be cached in memory (SGA). No additional parsing. No additional validation. XML Schema registration doesn’t have to create types/tables… Binary XML is part of the solution.
49. XML Schema Advantages. XML document structure is known, therefore: no parsing is needed when loaded from disk into memory; XML OBject (XOB) structures can be applied; memory footprint is much less compared to a DOM structure; needed specific nodes can now be handled efficiently in memory.
53. XDB Utilities Toolset Object Relational Storage And a bit regarding Binary XML Makes xdb:annotation easy Helper Packages for index creation Whitepaper on “best practices” Not a Replacement for proper XML (Schema) Design
54. XML Schema - Query Rewrite String CHAR String Float bookstore CLOB VARCHAR2 (20) book whitepaper title author author chapter title author id paragraph NUMBER (15) content content
55. XML Design. Avoid cyclic references in XML Schemata. For ease of maintenance: xdb:annotations. Is DOM validation, fidelity needed? CPU / XML parsing: XML Schema validation “overhead”? Index maintenance overhead, when using “disk” solutions.
56. Be aware of what you are doing! Avoid unneeded (full) XML Schema validation during storage (inserts) and when generating XML: xdb:maintainDOM="false". Avoid impedance mismatch: Java XML Java XML Relational XML Java (“All In One Go Objective”). Avoid XML fragments // and/or via XMLEXISTS. Use indexes.
58. Keep XML small. Do not use / enforce Pretty Print if not needed. Avoid namespace reference “overkill”: the most used namespace is leading; use short namespace references. Make XML data as “sparse” as possible: <employee><name>Marco</name></employee> becomes <employee name="Marco"/>. XML data partitioning. Binary XML if needed.
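The size effect of these choices is easy to measure. The following Python sketch compares the element form and the attribute form from the slide, plus a pretty-printed copy of the element form. It assumes Python 3.9 or later for `ET.indent`; the `pretty` helper and the `sizes` dictionary are hypothetical names.

```python
import xml.etree.ElementTree as ET

element_form = "<employee><name>Marco</name></employee>"
attribute_form = '<employee name="Marco"/>'

def pretty(xml_text):
    """Return an indented ("pretty printed") copy of the document."""
    root = ET.fromstring(xml_text)
    ET.indent(root)  # adds newlines and indentation; requires Python 3.9+
    return ET.tostring(root, encoding="unicode")

# Size in bytes of each serialisation of the same information.
sizes = {
    "attribute form": len(attribute_form.encode("utf-8")),
    "element form": len(element_form.encode("utf-8")),
    "element form, pretty printed": len(pretty(element_form).encode("utf-8")),
}
for label, size in sizes.items():
    print(f"{label}: {size} bytes")
```

The difference per element is small, but it is multiplied by every occurrence in a multi-megabyte document, and by every time that document is stored, transferred or parsed.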
59. Keep XML small (OR specific). Don’t use “meaningful element names”: 64Kb DDL “create table” buffer; ORA-01792: maximum number of columns in a table or view is 1000. Break XML up: out of line; CLOB (unstructured); not accessed data. Don’t create objects if you don’t need them. Use xdb:defaultTable="" for global types.
61. Customer Use Case Memory / DOM Memory / DOM CLOB Oracle Advanced Queue XMLType BLOB Process Checks Validation XML Schema (JAVA) Store in ETL Tables Shred Elements Via XMLDOM
62. New XML Approach Rewrite on Disk / XOB (Relational) CLOB Oracle Advanced Queue BLOB Store in ETL Tables Oracle Workflow Validation Against XML Schema Checks XMLType Table (O.R)
63. Using the CBO as an XML Parser… ORA-31186: Document contains too many nodes. Cause: Unable to load the document because it has exceeded the maximum allocated number of DOM nodes.
66. Using the (XML) Relational Mindset. Design the XSD as you would with E(E)R. Design for proper physical access and performance: storage, index, content awareness, partitioning. Avoid overkill of “meaningful” data parsing. Avoid redundancy, whitespace, “Pretty Print”. Design with the future in mind.
67. So in short: Balanced Design Inserts, Updates & Deletes XML Future Changes Index Maintenance Selects In Memory Via Indexes XML Validation Strict, Lazy Client Side Possibilities
68. Reward. Optimal performance. Outperforming standard XML solutions. PL/SQL, SQL access optimized for best performance on XML. PL/SQL, SQL, design, access: efficient, fast.
70. References (1) Oracle XML DB http://download.oracle.com/docs/cd/E11882_01/appdev.112/e16659/toc.htm XML DB OTN / FAQ Thread http://forums.oracle.com/forums/forum.jspa?forumID=34 http://forums.oracle.com/forums/thread.jspa?threadID=410714
71. References (2) Oracle Whitepapers Oracle XML DB : Choosing the Best XMLType Storage Option for Your Use Case (PDF) Oracle XML DB : Best Practices to Get Optimal Performance out of XML Queries (PDF) Blog http://technology.amis.nl/blog http://blog.gralike.com (Dedicated XMLDB blog)
Editor's Notes
Square wheel JSON?
Emp/Dept tables, Foreign/Primary Keys…Showing here ONLY 1 XML document…