OPP2010 (Brussels) - Programming with XML in PL/SQL - Part 2

XML Programming in PL/SQL (Part 2)
“Exploring Oracle XML Features in Depth”
Marco Gralike

“The foundation is there; so why not use it?” (referring to the Relational Model)
Chris Date, Hotsos keynote, 2009

So why the “….” XML…?
- If you’re a performance nerd, this is actually cool…
- No one has figured out XML yet…
- Solving the customer problem…
- Back to basics…
- Deeper understanding of the data handling issues…

What’s spoiling the soup…?
- Free format…
- “XML is cool”… (aka no design effort)
- Have to uphold the “Coding Granny Argument”…
- Everyone for themselves…
- Waiting for “Codd, Date”…
- Square wheels…

Impedance Mismatch
- Different data models: XPath models an XML document as a tree, while most general-purpose programming languages have no native data types for a tree.
- Different programming paradigms: XSLT is a functional language, while Java is object-oriented and Perl is procedural.

Impedance Mismatch: Effects, Costs
- Unnecessary CPU and memory overhead
- A lot of expensive type and encoding conversions
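The conversion cost named above can be made concrete with a small sketch. It is not taken from the talk: it uses Python's standard `xml.etree.ElementTree` module, and the `order` document and the helper names `order_from_xml` / `order_to_xml` are hypothetical, chosen only for illustration. Every hop between the XML tree and native values needs an explicit encoding or type conversion.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
XML_BYTES = '<order id="42"><amount>19.95</amount><paid>true</paid></order>'.encode("utf-8")


def order_from_xml(raw: bytes) -> dict:
    """Turn the XML tree into native values, one explicit conversion per field."""
    root = ET.fromstring(raw)                      # bytes -> parsed tree (decoding + parsing)
    return {
        "id": int(root.get("id")),                 # attribute text -> int
        "amount": float(root.findtext("amount")),  # element text -> float
        "paid": root.findtext("paid") == "true",   # element text -> bool
    }


def order_to_xml(order: dict) -> bytes:
    """Go back the other way: native values -> text nodes -> encoded bytes."""
    root = ET.Element("order", id=str(order["id"]))
    ET.SubElement(root, "amount").text = repr(order["amount"])
    ET.SubElement(root, "paid").text = "true" if order["paid"] else "false"
    return ET.tostring(root, encoding="utf-8")


if __name__ == "__main__":
    order = order_from_xml(XML_BYTES)
    print(order)
    print(order_to_xml(order))
```

Each additional hop in a processing chain repeats this work, which is where the CPU and memory overhead comes from.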

The “Dimensions” in 1 XML doc.
[Figure: diagram with the labels 1 to 6, X, Y, Z and “n x rows”]
- Elements with maxOccurs="unbounded"

Multi-Dimensional Issues…
- It’s a database…
- It’s row based
- It’s column based
- It’s multiple databases…
- More than 1 XML doc
- Not uncommon: 1 MB or more

XMLType: Not just a “Container”
- Complexities of a database
  - “Relations”
  - “Redundancy”
  - “Nullology”
  - Design, etc…
- It can contain a database
  - 10 MB or bigger nowadays
  - More often than not…
  - Enormously complex XSDs

XMLType: Not just a “Datatype”
- Checked on XML well-formedness
  - One root element
  - Begin & end tags
- If there is an XML Schema reference
  - XOB methods will be used if an XML Schema is available
  - DOM methods will be used if registered XML Schema information is not available
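The two well-formedness rules listed above (one root element, matching begin and end tags) can be tried out with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` purely as an illustration; it says nothing about how XMLType performs the check internally, and `is_well_formed` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    samples = [
        "<a><b></b></a>",   # one root, tags balanced
        "<a></a><b></b>",   # two root elements
        "<a><b></a>",       # <b> is never closed
    ]
    for sample in samples:
        print(sample, is_well_formed(sample))
```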

What you want in access…
- Fast
  - DDL
  - Selects
  - Inserts, Deletes, Updates
- Specific / Smart
  - Small XML fragments
  - Direct access

Mistakes are very, very painful!
- Inserts, Updates & Deletes
  - Fast
  - Efficient
- Selects
  - Precise
  - Via indexes?
- XML validation
  - Strict, lazy, really needed?
  - Client-side possibilities

Oracle XMLType “Containers”

XMLType
- Object Relational
- Binary
- CLOB

XMLType Solutions
- Physical Design
- Logical Design

Structured / Semi-Structured
- Structured
- Semi-structured

Common XML Parsers
- Often DOM or Infoset based
- CPU intensive
- Memory intensive
- Serializing, parsing and tree traversals all happen in memory…

In Memory: Common XML Parsers
- Often handle XML tree traversals via only ONE method
- Not aware of whether the XML content is structured, semi-structured or unstructured
- Not very “smart” / “content aware”: XML handling does not take the XML tree and/or the XML data content into account
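As a rough illustration of the memory point, the sketch below counts elements in two ways with Python's standard `xml.etree.ElementTree`: once by building the whole tree in memory, and once by streaming with `iterparse` and discarding content as soon as it has been seen. The function names and the `rows`/`row` sample are hypothetical, and this is a generic parser comparison, not a description of Oracle's internals.

```python
import io
import xml.etree.ElementTree as ET


def count_with_tree(xml_bytes: bytes, tag: str) -> int:
    """Build the complete tree in memory first, then traverse it."""
    root = ET.fromstring(xml_bytes)
    return sum(1 for _ in root.iter(tag))


def count_streaming(xml_bytes: bytes, tag: str) -> int:
    """Process elements as their end tags arrive and release their content."""
    count = 0
    for _event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        if elem.tag == tag:
            count += 1
        elem.clear()  # drop text, attributes and children of the finished element
    return count


if __name__ == "__main__":
    doc = b"<rows>" + b"<row><v>1</v></row>" * 1000 + b"</rows>"
    print(count_with_tree(doc, "row"), count_streaming(doc, "row"))
```

Both functions return the same count; the difference is how much of the document has to be held in memory at once.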

XMLType Physical Storage
- CLOB
  - LOB, LOB index
- Object Relational
  - Varray, Types, Nested Tables
  - IOT, B-Tree, XML Schema
- Binary XML
  - LOB, LOB index
  - Stored in post-parse representation

XML Data Storage
- Relational World
  - XMLType Views
  - (Object) Relational Objects
  - Relational Tables
- XMLDB World: XMLType column/tables
  - Hybrid: Mixed, complex [n], un/structured, XSD [y], B-Tree, IOT
  - CLOB: Document, na, unstructured, XSD [n], XMLIndex
  - Obj.Rel.: Content, complex [n], structured, XSD [y], B-Tree, IOT
  - Binary XML: Mixed, complex [y], un/structured, XSD [y/n], XMLIndex

Unstructured XMLIndex (UXI), 11.1
- Path Table
- Use path subsetting
- A full-blown XMLIndex can be BIG
- Token Tables (XDB.X$......)
- Query rewrite on tokens
- Fuzzy searches, //
- Optimizer statistics
- Can be maintained manually
- Recorded in Pending Table
- Secondary indexes possible
[Figure: Unstructured XMLIndex, f(x), Path Table]

Structured XMLIndex (SXI), 11.2
- Content Table(s)
- Based on XMLTABLE syntax
- XMLTable construct can be nested, but:
  - Only 1 extra XMLType allowed
  - VIRTUAL column is passed
- Can be maintained manually
- Secondary indexes possible
[Figure: Structured XMLIndex, f(x), Content Tables]

Driving access on CONTENT
[Figure: XML tree with the nodes bookstore, book, whitepaper, title, author, chapter, id, paragraph and content, annotated with index options: B-Tree Index, Secondary Oracle Text Index, Function-based Index (XPath), Structured XMLIndex, Unstructured XMLIndex]

There can be only one XMLIndex…

XML Schema Advantages
- XML Schema will be parsed only once
- XML Schema will be cached in memory (SGA)
  - No additional parsing
  - No additional validation
- XML Schema registration doesn’t have to create types/tables…
- Binary XML has part of the solution

XML Schema Advantages
- The XML document structure is known, therefore:
  - No parsing is needed when loaded from disk into memory
  - XML OBject (XOB) structures can be applied
  - Memory footprint is much smaller than that of a DOM structure
  - The specific nodes that are needed can now be handled efficiently in memory

XDB Annotations
- Hybrid: CLOB within OR

XDB Annotations (OR/Binary XML)
- Levels: Root, simpleType, complexType
- xmlns:xdb="http://xmlns.oracle.com/xdb"
  - xdb:storeVarrayAsTable
  - xdb:defaultTable
  - xdb:maintainDOM
  - xdb:maintainOrder
  - xdb:SQLInline
- Oracle V.11.1.0.7.0 - Partitioning
  - xdb:tableProps

Mixing Logical and Physical Design

XDB Utilities Toolset
- Object Relational Storage
  - And a bit regarding Binary XML
- Makes xdb:annotation easy
- Helper packages for index creation
- Whitepaper on “best practices”
- Not a replacement for proper XML (Schema) design

XML Schema - Query Rewrite
[Figure: XML tree with the nodes bookstore, book, whitepaper, title, author, chapter, id, paragraph and content, labelled with XML Schema types (String, Float) and SQL types (CHAR, VARCHAR2(20), NUMBER(15), CLOB)]

XML Design
- Avoid cyclic references in XML Schemata
- For ease of maintenance: xdb:annotations
- Is DOM validation, fidelity needed?
- CPU / XML parsing: XML Schema validation “overhead”?
- Index maintenance overhead when using “disk” solutions

Be aware of what you are doing!
- Avoid unneeded (full) XML Schema validation
  - During storage (inserts), generating XML
  - xdb:maintainDOM=false
- Avoid impedance mismatch
  - Java -> XML -> Java -> XML -> Relational -> XML -> Java (“All In One Go Objective”)
- Avoid XML fragments
  - // and/or via XMLEXISTS
- Use indexes

Keep XML small
- Do not use / enforce Pretty Print if not needed
- Avoid namespace reference “overkill”
  - Most used namespace is leading
  - Use short namespace references
- Make XML data as “sparse” as possible
  - <employee><name>Marco</name></employee>
  - <employee name="Marco"/>
- XML data partitioning
- Binary XML if needed
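The size effect of the two `employee` forms and of pretty printing can be measured directly. The sketch below is an illustration only, using Python's standard `xml.dom.minidom`; `utf8_size` and `pretty` are hypothetical helper names, and actual savings in a database depend on the storage model chosen.

```python
import xml.dom.minidom

ELEMENT_FORM = "<employee><name>Marco</name></employee>"
ATTRIBUTE_FORM = '<employee name="Marco"/>'


def utf8_size(text: str) -> int:
    """Size of the serialized XML in bytes when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def pretty(text: str) -> str:
    """Re-serialize the document element with indentation and line breaks."""
    doc = xml.dom.minidom.parseString(text)
    return doc.documentElement.toprettyxml(indent="  ")


if __name__ == "__main__":
    print("element form:  ", utf8_size(ELEMENT_FORM), "bytes")
    print("attribute form:", utf8_size(ATTRIBUTE_FORM), "bytes")
    print("pretty printed:", utf8_size(pretty(ELEMENT_FORM)), "bytes")
```

The per-document difference is small, but it is multiplied by every row stored, parsed and transferred.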

Keep XML small (OR specific)
- Don’t use “meaningful element names”
  - 64 KB DDL “create table” buffer
  - ORA-01792: maximum number of columns in a table or view is 1000
- Break XML up
  - Out of line
  - CLOB (unstructured)
  - Data that is not accessed
- Don’t create objects if you don’t need them
  - Use xdb:defaultTable="" for global types

Customer Use Case
[Figure: flow diagram with the labels Memory / DOM (twice), CLOB, Oracle Advanced Queue, XMLType, BLOB, Process Checks, Validation XML Schema (Java), Shred Elements Via XMLDOM, Store in ETL Tables]

New XML Approach
[Figure: flow diagram with the labels Rewrite on Disk / XOB (Relational), CLOB, Oracle Advanced Queue, BLOB, XMLType Table (O.R), Validation Against XML Schema, Checks, Oracle Workflow, Store in ETL Tables]

Using the CBO as an XML Parser…
ORA-31186: Document contains too many nodes
Cause: Unable to load the document because it has exceeded the maximum allocated number of DOM nodes.

Demonstration
- XDB Utilities
- JDeveloper
- XML Schema xdb:annotations
- Effect on queries

Using the (XML) Relational Mindset
- Design the XSD as you would with E(E)R
- Design for proper physical access, performance:
  - Storage, Index
  - Content awareness
  - Partitioning
- Overkill of “meaningful” data parsing
- Avoid redundancy, whitespace, “Pretty Print”
- Design with the future in mind

So in short: Balanced Design
- Inserts, Updates & Deletes
  - XML future changes
  - Index maintenance
- Selects
  - In memory
  - Via indexes
- XML validation
  - Strict, lazy
  - Client-side possibilities

Reward
- Optimal performance
- Outperforming standard XML solutions
- PL/SQL, SQL access optimized for best performance on XML
- PL/SQL, SQL, Design, Access:
  - Efficient
  - Fast

References (1)
- Oracle XML DB: http://download.oracle.com/docs/cd/E11882_01/appdev.112/e16659/toc.htm
- XML DB OTN / FAQ Thread:
  - http://forums.oracle.com/forums/forum.jspa?forumID=34
  - http://forums.oracle.com/forums/thread.jspa?threadID=410714

References (2)
- Oracle Whitepapers
  - Oracle XML DB: Choosing the Best XMLType Storage Option for Your Use Case (PDF)
  - Oracle XML DB: Best Practices to Get Optimal Performance out of XML Queries (PDF)
- Blog
  - http://technology.amis.nl/blog
  - http://blog.gralike.com (Dedicated XMLDB blog)
