The Ultimate Performance Challenge: “How to Make XML Perform?!” Marco Gralike
Agenda
“XML is not a ‘fast’ thing, there is a ton of parsing involved. Sorry, I never saw the point in huge XML files – they are many times larger than they should be and the amount of work involved in parsing them is incredible.” Tom Kyte, January 9, 2009, AskTom
“The foundation is there; so why not use it?” …referring to the Relational Model… Chris Date, Hotsos keynote, 2009
Relational…
XML…?
Evolution…
So why the “Hxxx” XML…?
  • If you’re a performance nerd, this is actually cool…
  • No one figured out XML yet…
  • Solving the customer problem…
  • Back to basics…
  • Deeper understanding of the data handling issues…
What’s spoiling the soup…?
  • Free format… “XML is cool”… (aka no design effort)
  • Have to uphold the “Coding Granny Argument” (among others: meaningful names)
  • Everyone for themselves…
  • Waiting for “Codd, Date”…
  • Square wheels…
Impedance Mismatch
  • Different data models: XPath models an XML document as a tree, while most general-purpose programming languages have no native data types for a tree.
  • Different programming paradigms: XSLT is a functional language, while Java is object-oriented and Perl is a procedural one.
Impedance Mismatch: Effects, Costs
  • Unnecessary CPU and memory overhead
  • A lot of expensive type and encoding conversions
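To make the conversion cost concrete, here is a minimal sketch of one XML to object to XML round trip using only the Python standard library. The `Employee` class, the sample document and the function names are assumptions made up for illustration; they do not come from the slides.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical sample document, used only for this illustration.
XML = ("<employees><employee id='7'><name>Marco</name>"
       "<salary>1200.50</salary></employee></employees>")


@dataclass
class Employee:
    emp_id: int
    name: str
    salary: float


def to_objects(xml_text):
    """Parse the text into a tree, then copy the tree into typed objects."""
    root = ET.fromstring(xml_text)                  # text -> tree
    result = []
    for node in root.findall("employee"):           # tree traversal
        result.append(Employee(
            emp_id=int(node.get("id")),             # string -> int
            name=node.findtext("name"),
            salary=float(node.findtext("salary")),  # string -> float
        ))
    return result


def to_xml(employees):
    """Copy the typed objects back into a tree and serialize it again."""
    root = ET.Element("employees")
    for e in employees:
        node = ET.SubElement(root, "employee", id=str(e.emp_id))  # int -> string
        ET.SubElement(node, "name").text = e.name
        ET.SubElement(node, "salary").text = f"{e.salary:.2f}"    # float -> string
    return ET.tostring(root, encoding="unicode")


employees = to_objects(XML)
round_trip = to_xml(employees)
```

Every hop between the XML world and the object world repeats the parse, the traversal and the type conversions, which is the overhead the slide refers to.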
Containerization
The “Dimensions” in 1 XML doc
  [Figure: dimensions numbered 1 to 6 on axes X, Y and Z]
  • n × rows: elements with maxOccurs=“unbounded”
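As a rough illustration of how repeating elements turn one document into many rows, the sketch below flattens two nested repeating levels into tuples. The `order`, `line` and `serial` names are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: "line" and "serial" both repeat
# (maxOccurs="unbounded" in XML Schema terms).
DOC = """
<order id="1">
  <line sku="A"><serial>s1</serial><serial>s2</serial></line>
  <line sku="B"><serial>s3</serial></line>
</order>
"""


def shred(xml_text):
    """Flatten the two repeating levels into one row per serial number."""
    order = ET.fromstring(xml_text)
    rows = []
    for line in order.findall("line"):          # first repeating level
        for serial in line.findall("serial"):   # second repeating level
            rows.append((order.get("id"), line.get("sku"), serial.text))
    return rows


rows = shred(DOC)
```

Each additional repeating level multiplies the row count, so a single document can behave like several related tables.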
Multi Dimensional Issues…
  • It’s a database…
  • It’s row based
  • It’s column based
  • It’s multiple databases…
  • More than 1 XML doc
  • Not uncommon: 1 MB or larger
XMLType – Not just a “Container”
  • Complexities of a database: “Relations”, “Redundancy”, “Nullology”, design, etc…
  • It can contain a database
  • 10 MB or bigger nowadays, more often than not…
  • Enormously complex XSDs
XMLType – Not just a “Container”
  • Checked on XML well-formedness: one root element, begin & end tags
  • If there is an XML Schema reference: XOB methods will be used if an XML Schema is available
  • DOM methods will be used if registered XML Schema information is not available
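The well-formedness rules named above (one root element, matching begin and end tags) can be sketched with a generic parser. This uses Python's standard library and only illustrates the rule itself; it says nothing about how XMLType performs the check internally.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as a well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # Raised for, among other things, a second root element
        # or a begin tag without a matching end tag.
        return False
```

For example, `is_well_formed("<a><b/></a>")` is true, while `"<a><b></a>"` (mismatched tags) and `"<a/><b/>"` (two root elements) are rejected.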
What you want in access…
  • Fast: DDL; selects; inserts, deletes, updates
  • Specific / smart: small XML fragments; direct access
Document-Driven versus Data-Driven
Structured / Semi-Structured
  [Figure: two panels, “Structured” and “Semi-Structured”]
Common XML Parsers
  • Often DOM or Infoset based
  • CPU intensive
  • Memory intensive
  • Serializing, parsing and tree traversals happen in memory…
In Memory: Common XML Parsers
  • Often handle XML tree traversals via only ONE method
  • Not aware of whether the XML content is structured, semi-structured or unstructured
  • Not very “smart” / “content aware” regarding XML handling based on its XML trees and/or XML data content
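The memory cost of a tree-based parser can be sketched by comparing a full in-memory parse with an incremental one. The example uses Python's standard library; the `rows`/`row`/`id` document shape and the function names are assumptions for illustration.

```python
import io
import xml.etree.ElementTree as ET


def build_doc(n):
    """Generate a hypothetical document with n repeating <row> elements."""
    rows = "".join(f"<row><id>{i}</id></row>" for i in range(n))
    return f"<rows>{rows}</rows>".encode("utf-8")


def sum_ids_tree(data):
    """DOM-style: the entire tree is built in memory before any work starts."""
    root = ET.fromstring(data)
    return sum(int(r.findtext("id")) for r in root.iter("row"))


def sum_ids_streaming(data):
    """Incremental: handle each <row> as soon as its end tag is seen."""
    total = 0
    for _event, elem in ET.iterparse(io.BytesIO(data), events=("end",)):
        if elem.tag == "row":
            total += int(elem.findtext("id"))
            # Release the row's children; the emptied <row> element
            # itself stays attached to the root.
            elem.clear()
    return total
```

Both functions return the same sum; the streaming variant keeps far fewer nodes alive at once, which matters once documents reach the megabyte sizes mentioned earlier.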
XMLType Physical Storage
  • CLOB: LOB, LOB index
  • Object Relational: varray, types, nested tables, IOT, B-Tree, XML Schema
  • Binary XML: LOB, LOB index; stored in post-parse representation
Choosing a Storage Model
XML Data Storage (XMLType column/tables, XMLType views; “Relational World” versus “XMLDB World”)
  • Hybrid: mixed, complex [n], un/structured, XSD [y], B-Tree / IOT
  • CLOB: document, complex n/a, unstructured, XSD [n], XMLIndex
  • Object Relational: content, complex [n], structured, XSD [y], B-Tree / IOT
  • Binary XML: mixed, complex [y], un/structured, XSD [y/n], XMLIndex
  • Also shown: (Object) relational objects, relational tables
Partition XML data
  [Figure labels: EMPLOYEES_PROJ_TAB, PROJ_DETAILS_TAB, EMP_PROJ_P11, EMP_PROJ_P12, “employees”.“employee”, reference_id]
XML Partitioning
  • Object Relational partitioning: equi-partitioning since Oracle 11.1.0.7.0
  • Binary XML partitioning: range, list, hash
  • Local partitioned XMLIndex: LOCAL keyword in the XMLIndex create syntax
  • XMLIndex is not supported for HASH partitioning
  • Partition key on a virtual column (Binary XML)
  • Partition key on a column (Object Relational)
Index Quick Sheet
Unstructured XMLIndex (UXI)
  • Path Table
  • Use path subsetting: a full-blown XMLIndex can be BIG
  • Token tables (XDB.X$......)
  • Query rewrite on tokens
  • Fuzzy searches, //
  • Optimizer statistics
  • Can be maintained manually; recorded in pending table
  • Secondary indexes possible
  [Diagram labels: Unstructured XMLIndex, f(x), Path Table]
Path Table
  • INDEXED COLUMNS: PATH INDEX, ORDER INDEX, VALUE INDEX
  • FUNCTION BASED
  • Not indexed: LOCATOR column, pointer to XML fragments (XDB.X$...)
  • SECONDARY INDEXES
  [Diagram labels: Unstructured XMLIndex, f(x), Path Table]
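The idea of a path table can be sketched with a toy model: one row per node, holding its path, a document-order key and its value, plus a lookup keyed on path. This is an illustration of the concept only, not Oracle's internal structure; the row layout, the dotted order keys and the function names are all assumptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_path_table(xml_text):
    """Return one (path, order_key, value) row per element, in document order."""
    root = ET.fromstring(xml_text)
    table = []

    def walk(node, path, order_key):
        full_path = f"{path}/{node.tag}"
        text = (node.text or "").strip()
        table.append((full_path, order_key, text))
        for position, child in enumerate(node, start=1):
            # Dotted keys such as "1.2.1" record the position of each node.
            walk(child, full_path, f"{order_key}.{position}")

    walk(root, "", "1")
    return table


def build_path_index(table):
    """Group rows by path so a path lookup does not scan every row."""
    index = defaultdict(list)
    for path, order_key, value in table:
        index[path].append((order_key, value))
    return index
```

With a small bookstore document, looking up `/bookstore/book/title` in the index returns only the title rows, without walking the document again. Indexing every path of a large document produces many rows, which is why a full-blown index can get big and path subsetting helps.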
Structured XMLIndex (SXI)
  • Content table(s)
  • Based on XMLTABLE syntax
  • XMLTable construct can be nested, but: only 1 extra XMLType allowed; a VIRTUAL column is passed
  • Can be maintained manually
  • Secondary indexes possible
  [Diagram labels: Structured XMLIndex, f(x), Content Tables]
Content Table(s)
  • INDEXED COLUMNS: KEY INDEX, RID INDEX
  • Indexes needed for combined XMLIndex types (mixing unstructured and structured XMLIndexes)
  • Your defined columns: secondary indexes
  [Diagram labels: Structured XMLIndex, f(x), Content Tables]
Driving access on CONTENT
  [Figure: bookstore XML tree (nodes: book, whitepaper, title, author, chapter, id, paragraph, content) annotated with index types: B-Tree index, secondary Oracle Text index, function-based index (XPath), structured XMLIndex, unstructured XMLIndex]
There can be only one XMLIndex…
Design
XML Schema Advantages
  • If registered in the XDB Repository, the XML Schema is parsed only once and cached in memory (SGA)
  • No additional parsing
  • No additional validation
XML Schema Advantages
  XML document structure is known, therefore:
  • No parsing is needed when loaded from disk into memory
  • XML Object (XOB) structures can be applied
  • Memory footprint is much smaller than with a DOM structure
  • The specific nodes that are needed can now be handled efficiently in memory
XDB Annotations
  • Hybrid: CLOB within OR (Object Relational)
XDB Annotations (OR/Binary XML)
  • Levels: root, simpleType, complexType
  • xmlns:xdb="http://xmlns.oracle.com/xdb"
  • xdb:storeVarrayAsTable
  • xdb:defaultTable
  • xdb:maintainDOM
  • xdb:maintainOrder
  • xdb:SQLInline
  • Oracle 11.1.0.7.0, partitioning: xdb:tableProps
Mixing Logical and Physical Design
XML Schema - Query Rewrite
  [Figure: bookstore XML tree (nodes: book, whitepaper, title, author, chapter, id, paragraph, content) with XML Schema types (String, Float) mapped to SQL types (CHAR, VARCHAR2(20), NUMBER(15), CLOB)]
XML Design
  • Avoid cyclic references in XML Schemata
  • For ease of maintenance: xdb annotations
  • Is DOM validation / fidelity needed?
  • CPU / XML parsing: XML Schema validation “overhead”?
  • Index maintenance overhead when using “disk” solutions
Be aware of what you are doing!
  • Avoid unneeded (full) XML Schema validation during storage (inserts) and when generating XML
  • xdb:maintainDOM=false
  • Avoid impedance mismatch: Java → XML → Java → XML → Relational → XML → Java (“All In One Go Objective”)
  • Avoid XML fragments, // and/or via XMLEXISTS
  • Use indexes
Keep XML small
  • Do not use / enforce Pretty Print if not needed
  • Avoid namespace reference “overkill”: the most used namespace is leading
  • Use short namespace references (aliases)
  • Make XML data as “sparse” as possible: <employee><name>Marco</name></employee> versus <employee name="Marco"/>
  • XML data partitioning
  • Binary XML if needed
Keep XML small (OR specific)
  • Don’t use “meaningful element names”: 64 KB DDL “create table” buffer
  • ORA-01792: maximum number of columns in a table or view is 1000
  • Break XML up: out of line, CLOB (unstructured) for data that is not accessed
  • Don’t create objects if you don’t need them
  • Use xdb:defaultTable="" for global types
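The size effect of the element form, the attribute form and pretty printing can be checked with a few lines of Python. This sketch assumes Python 3.9 or later (for `ET.indent`); the `size` helper is a made-up name for illustration.

```python
import xml.etree.ElementTree as ET

# The two forms shown on the slide.
element_form = "<employee><name>Marco</name></employee>"
attribute_form = '<employee name="Marco"/>'


def size(xml_text):
    """Size in bytes of the serialized document, encoded as UTF-8."""
    return len(xml_text.encode("utf-8"))


# Pretty-printed variant of the element form: same data, extra whitespace.
root = ET.fromstring(element_form)
ET.indent(root)                     # assumption: Python 3.9+
pretty = ET.tostring(root, encoding="unicode")

sizes = {
    "attribute": size(attribute_form),
    "element": size(element_form),
    "pretty": size(pretty),
}
```

The attribute form is the smallest and the pretty-printed form the largest, even though all three carry the same single value. Multiplied over millions of nodes, the difference shows up in storage, parsing and network time.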
Holistic Approach (Recap)
Customer Use Case
  [Figure: processing flow; labels: CLOB, Oracle Advanced Queue, XMLType, BLOB, process checks, validation against XML Schema (Java), store in ETL tables, shred elements via XMLDOM; two steps marked “Memory / DOM”]
Duration (1000 Cases)
New XML Approach
  [Figure: processing flow; labels: CLOB, Oracle Advanced Queue, BLOB, store in ETL tables, Oracle Workflow, validation against XML Schema, checks, XMLType table (O.R.); marked “Rewrite on disk / XOB (Relational)”]
Using the CBO as an XML Parser…
  ORA-31186: Document contains too many nodes
  Cause: Unable to load the document because it has exceeded the maximum allocated number of DOM nodes.
Using the (XML) Relational Mindset
  • Design XSD as you would with E(E)R
  • Design for proper physical access and performance: storage, index, content awareness, partitioning
  • Overkill of “meaningful” data parsing
  • Avoid redundancy, whitespace, “Pretty Print”
  • Design with the future in mind
So in short: Balanced Design
  • Inserts, updates & deletes: XML future changes, index maintenance
  • Selects: in memory, via indexes
  • XML validation: strict, lazy, client-side possibilities
Reward
  • Optimal performance, outperforming XML
  • Proper design will give you a 10- to 100-fold performance increase over XML handling…
  • …also known as… ehh… standard relational database performance…

Hotsos 2010 - The Ultimate Performance Challenge: How To Make Xml Perform?

  • 1. The Ultimate Performance Challenge:“How to Make XML Perform ?!” Marco Gralike
  • 3.
  • 4.
  • 6. “XML is not a ‘fast’ thing, there is a ton of parsing involved. Sorry, I never saw the point in huge XML files – they are many times larger than they should be and the amount of work involved in parsing them is incredible”. Tom Kyte - Januari 9, 2009, AskTom
  • 7. “The foundation is there; So why not use it?” …referring to the Relational Model… Chris Date- Hotsos keynote, 2009
  • 11. If you’re a performance nerd, this is actually cool… No one figured out XML yet… Solving the customer problem… Back to basics… Deeper understanding of the data handling issues… So why the “Hxxx” XML…?
  • 13. Free Format…”XML is cool”… (aka no design effort) Have to uphold the “Coding Granny Argument” (among others meaningful names) Everyone for themselves… Waiting for “Codd, Date”… Square wheels… What’s spoiling the soup…?
  • 14. Different data models XPath models an XML document as a tree while most general purpose programming languages have no native data types for a tree. Different programming paradigms XSLT is a functional language, while Java is object-oriented and Perl is a procedural one. Impedance Mismatch
  • 15. Effects, Costs Unnecessary CPU and Memory Overhead A lot of expensive type and encoding conversions Impedance Mismatch
  • 17.
  • 18.
  • 19.
  • 20.
  • 21.
  • 23. The “Dimensions” in 1 XML doc. 1 3 4 5 2 X Y 6 Z nx rows Elements with maxoccurs=“unbounded”
  • 24. Multi Dimensional Issues… Its a database… Its Row based Its Column based Its multiple databases… More then 1 XML doc Not uncommon 1 Mb >>
  • 25. Complexities of a database “Relations” “Redundancy” “Nullology” Design, etc… It can contain a database 10 Mb or bigger nowadays More often than less… Enormous complex XSD’s XMLType – Not just a “Container”
  • 26. Checked on XML Well-Formedness One root element Begin & End tags If XML Schema reference XOB methods will be used if an XML Schema is available DOM methods will be used if registered XML Schema information is not available XMLType – Not just a “Container”
  • 27. What you want in access… Fast DDL Selects Inserts, Deletes, Updates Specific / Smart Small XML Fragments Direct Access
  • 30. Structured / Semi-Structured Structured Semi Structured
  • 31. Common XML Parsers Often DOM or Infoset based CPU intensive Memory intensive Serializing, parsing, tree traversals, happen in memory…
  • 32. In Memory: Common XML Parsers Often handle XML tree traversals only via ONEmethod It is not structured, semi-structured or unstructured XML content aware It is not very “smart” / “content aware” regarding XMLhandling based on its XML tree’s and/or XML data content
  • 33. XMLType Physical Storage CLOB LOB LOB index Object Relational Varray, Types, Nested Tables IOT, B-Tree, XML Schema Binary XML LOB, LOB Index Stored in Post Parse Representation
  • 35. Hybrid CLOB Mixed complex[n] un/structured XSD [y] B-Tree, IOT Document na unstructured XSD [n] XMLIndex Relational World XMLDB World XML Data Storage XMLType column/tables XMLType Views Obj.Rel. Binary XML Content complex[n] structured XSD [y] B-Tree, IOT (Object) Relational Objects Mixed complex[y] un/structured XSD [y/n] XMLIndex Relational Tables
  • 36.
  • 37. Partition XML data EMPLOYEES_PROJ_TAB PROJ_DETAILS_TAB EMP_PROJ_P11 “employees”.”employee” reference_id EMP_PROJ_P12
  • 38. XML Partitioning Object Relational Partitioning Equi-Partitioning since version Oracle 11.1.0.7.0 Binary XML Partitioning Range, List, Hash Local partitioned XMLIndex LOCAL keyword in XMLIndex create syntax XMLIndex is not supported for HASH partitioning Partition Key on virtual Column (Binary XML) Partition Key on column (Object Relational)
  • 41. Unstructured XMLIndex (UXI) PathTable UsePath Subsetting FullBlown XMLIndex canbe BIG Token Tables (XDB.X$......) Query re-writeonTokens Fuzzy Searches, // Optimizer Statistics CanbemaintainedManually Recorded inPending Table Secondaryindexespossible Unstructured XMLIndex f (x) Path Table
  • 42.
  • 43. FUNCTION BASEDNotIndexed: LOCATOR column, pointer to XML fragments (XDB.X$...) SECONDARY INDEXES Unstructured XMLIndex f (x) Path Table
  • 44. Structured XMLIndex (SXI) Content Table(s) BasedonXMLTABLE syntax XMLTable construct canbe nestedbut: Only 1 extra XMLType allowed VIRTUAL column is passed CanbemaintainedManually Secondaryindexespossible Structured XMLIndex f (x) Content Tables
  • 46. Driving access on CONTENT BTree Index bookstore Secondary Oracle Text Index Function based Index (XPath) book whitepaper Structured XMLIndex Unstructured XMLIndex title author author chapter title author id paragraph content structured content Structured XMLIndex
  • 47. There can be only one XMLIndex…
  • 50. XML Schema Advantages. XML Schema will be parsed only once. If registered in the XDB Repository, the XML Schema will be cached in memory (SGA). No additional parsing. No additional validation.
  • 51. XML Schema Advantages. XML Document structure is known, therefore: no parsing is needed when loaded from disk into memory; XML OBject (XOB) structures can be applied; memory footprint is much less compared to DOM structure; needed specific nodes can now be handled efficiently in memory.
  • 52. XDB Annotations Hybrid: CLOB within OR
  • 53. XDB Annotations (OR/Binary XML). Levels: Root, Simpletype, Complextype. xmlns:xdb="http://xmlns.oracle.com/xdb". Annotations: xdb:storeVarrayAsTable, xdb:defaultTable, xdb:maintainDOM, xdb:maintainOrder, xdb:SQLInline. Oracle V.11.1.0.7.0, Partitioning: xdb:tableProps.
  • 54. Mixing Logical and Physical Design
  • 55. XML Schema - Query Rewrite String CHAR String Float bookstore CLOB VARCHAR2 (20) book whitepaper title author author chapter title author id paragraph NUMBER (15) content content
  • 56. XML Design. Avoid Cyclic References in XML Schemata. For ease of Maintenance: xdb:annotations. Is DOM validation, fidelity needed? CPU / XML parsing: XML Schema validation “overhead”? Index maintenance overhead, when using “disk” solutions.
  • 57. Be aware of what you are doing! Avoid unneeded (full) XML Schema validation: during Storage (Inserts), Generating XML. xdb:maintainDOM=false. Avoid Impedance mismatch: Java → XML → Java → XML → Relational → XML → Java (“All In One Go Objective”). Avoid XML fragments // and/or via XMLEXISTS. Use Indexes.
  • 59. Keep XML small. Do not use / enforce Pretty Print if not needed. Avoid namespace reference “Overkill”: most used Namespace is Leading; use short Namespace References (aliases). Make XML data as “sparse” as possible: <employee><name>Marco</name></employee> versus <employee name="Marco"/>. XML Data Partitioning. Binary XML if needed.
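The size advice on this slide can be checked with a small experiment. The following is a minimal sketch in Python (standard library only, not Oracle XML DB); the 1000 repeated "Marco" records and the helper names are assumptions made purely for illustration. It compares the serialized size of the element form and the attribute form from the slide, with and without pretty printing.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom


def build_element_form(names):
    """Build <employees> with one <employee><name>...</name></employee> per name."""
    root = ET.Element("employees")
    for name in names:
        employee = ET.SubElement(root, "employee")
        ET.SubElement(employee, "name").text = name
    return root


def build_attribute_form(names):
    """Build <employees> with one <employee name="..."/> per name."""
    root = ET.Element("employees")
    for name in names:
        ET.SubElement(root, "employee", name=name)
    return root


def compact_size(root):
    """Size in bytes of the document serialized without extra whitespace."""
    return len(ET.tostring(root, encoding="utf-8"))


def pretty_size(root):
    """Size in bytes of the same document after pretty printing."""
    compact = ET.tostring(root, encoding="utf-8")
    pretty = minidom.parseString(compact).toprettyxml(indent="  ", encoding="utf-8")
    return len(pretty)


# Hypothetical sample data, chosen only to make the difference visible.
names = ["Marco"] * 1000
element_form = build_element_form(names)
attribute_form = build_attribute_form(names)

print("element form, compact:  ", compact_size(element_form))
print("element form, pretty:   ", pretty_size(element_form))
print("attribute form, compact:", compact_size(attribute_form))
print("attribute form, pretty: ", pretty_size(attribute_form))
```

The absolute numbers depend on the serializer, so treat them as indicative only; the ordering (attribute form smaller than element form, compact smaller than pretty printed) is the point.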
  • 60. Keep XML small (OR specific). Don’t use “meaningful” element names: 64 KB DDL “create table” buffer; ORA-01792: maximum number of columns in a table or view is 1000. Break XML up: Out of Line; CLOB (unstructured); Not Accessed Data. Don’t create objects if you don’t need them. Use xdb:defaultTable="" for global types.
  • 62. Customer Use Case Memory / DOM Memory / DOM CLOB Oracle Advanced Queue XMLType BLOB Process Checks Validation XML Schema (JAVA) Store in ETL Tables Shred Elements Via XMLDOM
  • 64. New XML Approach Rewrite on Disk / XOB (Relational) CLOB Oracle Advanced Queue BLOB Store in ETL Tables Oracle Workflow Validation Against XML Schema Checks XMLType Table (O.R)
  • 65. Using the CBO as an XML Parser… ORA-31186: Document contains too many nodes. Cause: Unable to load the document because it has exceeded the maximum allocated number of DOM nodes.
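ORA-31186 is a limit hit when a whole document is materialized as DOM nodes. As a rough illustration of the alternative idea (process the document without holding the full tree), here is a minimal sketch using Python's standard `xml.etree.ElementTree.iterparse`. This is not Oracle code and not the approach shown in the deck; the function name, the `employee` record tag and the sample document are assumptions for illustration.

```python
import io
import xml.etree.ElementTree as ET


def count_elements_streaming(source, record_tag):
    """Count all elements and all record elements without keeping full subtrees.

    Each element is visited once, when its end tag has been read. Finished
    record elements are cleared, so their children and text are released;
    the emptied record shells stay attached to the root element.
    """
    total = 0
    records = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        total += 1
        if elem.tag == record_tag:
            records += 1
            elem.clear()
    return total, records


# Hypothetical sample: 1000 employee records under one root element.
sample = (
    b"<employees>"
    + b"<employee><name>Marco</name></employee>" * 1000
    + b"</employees>"
)
total, records = count_elements_streaming(io.BytesIO(sample), "employee")
print(total, records)
```

The design choice is the same one the slide hints at: avoid building every node in memory when only a pass over the content is needed.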
  • 66. Using the (XML) Relational Mindset. Design XSD as you would with E(E)R. Design for proper physical access, performance: Storage, Index, Content Awareness, Partitioning. Overkill of “meaningful” data parsing. Avoid Redundancy, whitespace, “Pretty Print”. Design with the future in mind.
  • 67. So in short: Balanced Design Inserts, Updates & Deletes XML Future Changes Index Maintenance Selects In Memory Via Indexes XML Validation Strict, Lazy Client Side Possibilities
  • 68. Reward: Optimal performance, outperforming XML. Proper design will give you a 10- to 100-fold performance increase over XML handling… …also known as…ehh… …standard relational database performance…
  • 70. References Oracle XML DB http://www.oracle.com/pls/db112/homepage XML DB FAQ Thread http://forums.oracle.com/forums/thread.jspa?threadID=410714 Blog http://technology.amis.nl/blog http://blog.gralike.com

Editor's Notes

  1. Square wheel → JSON?
  2. Emp/Dept tables, Foreign/Primary Keys…Showing here ONLY 1 XML document…