Design Concepts for XML Applications That Will Perform

S307479 - Oracle XMLDB - Design Concepts for XML Applications That Will Perform - AMIS - Marco Gralike

Oracle Open World 2009 Presentation

Presentation Transcript

  • Oracle XML Database
    Design Concepts for XML Applications That Will Perform !
    Marco Gralike, AMIS, 2009
  • Started as DBA with Oracle 7 on Windows NT 3.1 (1994)
    Experienced with Oracle 7.x / 8.x / 9.x / 10.x and 11.1
    Oracle 11g Beta tester for Oracle XMLDB
    Active Oracle OTN XMLDB Forum Member
    Oracle ACE Award for XMLDB Community Contributions
    OakTable Network member
    Introductions
  • Or a short story
    “Why XML on Disk can be faster than XML in Memory…”
  • Disclaimer
    The following are “Rules of Thumb”
    Bear in mind: every environment has its own criteria regarding business needs, architecture, etc…
    “Maintainability”
    “Extendibility”
    …so pay attention to:
    “Choice”
    “Design”
    “Testing”
    “Performance”
  • A Customer Use-Case
  • Initial State
    No performance
    12.000 “Cases” / night (4 Hour Window)
    4 hours are not enough anymore
    The “XML” part “looks like it takes too long”
    Original database system version 8.1.X
    Future Wishes
    The need to be able to handle 120.000 “Cases” / night
    In the near future hardware/OS from OpenVMS to HPUX
    Customer Case
  • An overview
    [Diagram components: Oracle Advanced Queue; BLOB; CLOB; XMLType; Memory / DOM (twice); Validation against XML Schema (Java); Process Checks; Shred Elements via XMLDOM; Store in ETL Tables; Oracle Workflow]
  • 10.000 “Cases” (~ 10 Mb size)
  • How expensive are 1.000 “Cases” ?
  • The Cost of Mixing Worlds
  • BLOB2CLOB and CLOB2XMLType
  • Feeding data to the database
    Why BLOB? → XML data & PDF data
    Why CLOB? → Conversion needed for XML handling
    Why XMLType? → Needed to check XML element content
    XML Validation (well-formedness)
    [Diagram components: Oracle Advanced Queue; BLOB; CLOB; XMLType; Memory / DOM]
  • Different data models.
    XPath models an XML document as a tree, while most general-purpose programming languages have no native data types for a tree.
    Different programming paradigms.
    XSLT is a functional language, while Java is object-oriented and Perl is a procedural one.
    Effect/Costs
    Unnecessary CPU and Memory Overhead
    A lot of expensive type and encoding conversions
    Impedance Mismatch
  • If you deal with XML → handle it via XML(DB)
    So if it is relational, do it the relational way…
    If it is XML, use XQuery or related languages such as XPath…
    If you mix worlds, be careful regarding:
    Information loss (PK/FK → XML)?
    Whitespace → NULL → Whitespace?
    Impedance mismatch
    The General Rule !
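The “Whitespace → NULL → Whitespace?” warning above can be sketched as a small round trip. This is only an illustration in Python, not Oracle code: the `<row>` document, the helper names `to_relational` and `to_xml`, and the loader behaviour are all hypothetical, and the one assumption carried over from the database side is that an empty string ends up stored as NULL (as Oracle does for VARCHAR2).

```python
import xml.etree.ElementTree as ET

def to_relational(xml_text):
    """Shred <row> children into a dict, the way a naive loader might (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    row = {}
    for child in root:
        value = (child.text or "").strip()
        # Assumption: the target column stores an empty string as NULL
        row[child.tag] = value if value else None
    return row

def to_xml(row):
    """Regenerate XML from the relational row (hypothetical helper)."""
    root = ET.Element("row")
    for name, value in row.items():
        child = ET.SubElement(root, name)
        if value is not None:
            child.text = value
    return ET.tostring(root, encoding="unicode")

original = "<row><a> </a><b></b><c>x</c></row>"
round_tripped = to_xml(to_relational(original))

# <a> held a single space in the original; after the round trip it is empty
print(repr(ET.fromstring(original).find("a").text))
print(repr(ET.fromstring(round_tripped).find("a").text))
```

The whitespace-only element and the empty element become indistinguishable once they have passed through the relational side, which is the kind of information loss the slide warns about.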
  • XML Document Validation
  • Validate XML Document via its XML Schema
  • Validation on content and structure
    XML Schema → Validation on XML structure
    → PL/SQL Wrapper with JAVA XML Parser
    [Diagram components: XMLType; Validation against XML Schema (Java based); Memory / DOM; Shred Elements via XMLDOM; Process Checks]
  • Java XML Parser
  • XML Parsers
    Often DOM or Infoset based
    CPU intensive
    Memory intensive
    Parsing, serializing or tree traversals, happen in memory
    Often handle XML tree traversals via only ONE method
    Not aware of whether the XML content is structured, semi-structured or unstructured
    Not very “smart” / “content aware” regarding XML handling based on its XML trees and/or XML data content
  • XML Schema will be parsed only once
    XML Schema will be cached in memory
    No additional parsing
    No additional validation
    XML Document structure is known, therefore:
    No parsing is needed when loaded from disk into memory
    XML Object (XOB) structures can be applied
    Memory footprint is much smaller than that of a DOM structure
    The specific nodes needed can now be handled efficiently in memory
    XML Schema Registration Advantages
  • XML Schema based - Query Rewrite
    [Diagram: XML tree with the nodes bookstore, book, whitepaper, title, author, chapter, id, paragraph and content, annotated with data types: String, CHAR, VARCHAR2(20), Float, CLOB, NUMBER(15)]
  • Checked on
    XML Well-Formedness
    One root element
    Begin & End tags
    If XML Schema reference
    XOB methods will be used if an XML Schema is available
    DOM methods will be used if XML Schema information is not available
    XMLType – Not just a “Datatype”
  • Some XSD Design Rules
  • Keep XML small !
    Do not use / enforce Pretty Print if not needed
    Avoid namespace reference “Overkill”
    Most used Namespace is Leading
    Use short Namespace References
    Make XML data as “sparse” as possible
    <employee><name>Marco</name></employee>
    <employee name="Marco"/>
    XML Data Partitioning
    Binary XML if possible
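The size argument behind “Keep XML small” can be checked with a rough sketch. It uses Python's standard library purely as a measuring tool and takes the two `<employee>` forms from the slide; the helper `name_of` is hypothetical, and the byte counts say nothing about how Oracle stores either form internally.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

element_form = "<employee><name>Marco</name></employee>"
attribute_form = '<employee name="Marco"/>'

def name_of(xml_text):
    """Read the employee name from either form (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return root.get("name") or root.findtext("name")

# Pretty printing adds an XML declaration, newlines and indentation
pretty = minidom.parseString(element_form).toprettyxml(indent="  ")

sizes = {
    "element": len(element_form.encode("utf-8")),
    "attribute": len(attribute_form.encode("utf-8")),
    "pretty": len(pretty.encode("utf-8")),
}
print(sizes)
```

Both forms carry the same value, but the attribute form is shorter and the pretty-printed form is the longest of the three. Multiplied over many thousands of “Cases”, that difference is what the slide is pointing at.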
  • XML Design
    Avoid Cyclic References in XML Schemata
    For ease of Maintenance: xdb:annotations
    Is DOM validation / DOM fidelity needed?
    CPU: XML parsing / XML Schema validation “overhead”?
    Index maintenance overhead, if implemented via disk
  • XML Document Handling
    Shredding & Storing XML
  • Check Total Amount
  • XML Content
    TABLE “B”
    TABLE “A”
    TABLE “C”
  • Think in “3D” or in “Driving Table” terms
    maxOccurs="unbounded"
    → Give me the <title> and <content> where <content> contains…
    [Diagram: nodes numbered 1 to 6 along X, Y and Z axes, × n rows]
  • Checking the Amount…
  • Used Setup
    OpenVMS
    Version 9.2.0.5.0
    1.000 “Cases”
    1) l_xpath := '//case['||i||']/amount_charged/text()' ;
    2) l_xpath := '/case_data/case['||i||']/amount_charged/text()' ;
    3) select sum(to_number(extract(value(tr),'/case_data/case/amount_charged/text()'))
    All in memory: COLLECTION ITERATOR PICKLER FETCH
    The Effect of // (for 1.000 “Cases”)
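The three access patterns above can be mimicked outside the database. The sketch below uses Python's `xml.etree.ElementTree` as a stand-in, so it is not the PL/SQL from the slide and its timings will not match Oracle's; the element names `case_data`, `case` and `amount_charged` come from the slide, while the function names and the generated amounts are made up for illustration.

```python
import xml.etree.ElementTree as ET

def build_document(n):
    """Build <case_data> with n <case> children; amounts are 1..n (illustrative data)."""
    root = ET.Element("case_data")
    for i in range(1, n + 1):
        case = ET.SubElement(root, "case")
        ET.SubElement(case, "amount_charged").text = str(i)
    return root

def total_per_case_descendant(root, n):
    # Variant 1: one descendant (//) search per case, so the tree is scanned again for every i
    return sum(int(root.findtext(f".//case[{i}]/amount_charged")) for i in range(1, n + 1))

def total_per_case_path(root, n):
    # Variant 2: one fully specified path per case, still n separate lookups
    return sum(int(root.findtext(f"case[{i}]/amount_charged")) for i in range(1, n + 1))

def total_single_pass(root):
    # Variant 3: one pass over all matching nodes, comparable to the single SUM statement
    return sum(int(e.text) for e in root.iterfind("case/amount_charged"))

root = build_document(200)
print(total_per_case_descendant(root, 200),
      total_per_case_path(root, 200),
      total_single_pass(root))
```

All three variants return the same total; they differ only in how much of the tree is walked to get there, which is why the per-case `//` lookup degrades fastest as the number of “Cases” grows.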
  • CLOB XMLType (V 11.1.0.6.0)
    ORA-31186
  • Effect of //
    In memory
    10.000 Cases:
    ORA-31186: Document contains too many nodes
    maxOccurs=unbounded
    maxLength, totalDigits, etc
    Increasing volume – XMLType CLOB
    ORA-31186: Document contains too many nodes
    Cause: Unable to load the document because it has exceeded the maximum allocated number of DOM nodes.
    Action: Reduce the size of the document.
  • XML Document Handling
    Object Relational, Binary XML
  • A Solution based on XMLType O.R.
    [Diagram components: Oracle Advanced Queue; BLOB; CLOB; XMLType Table (O.R.); Rewrite on Disk / XOB (Relational); Validation Against XML Schema; Checks; Store in ETL Tables; Oracle Workflow]
  • Driving Access on CONTENT (11gR1, on Disk)
    [Diagram: XML tree with the nodes bookstore, book, whitepaper, title, author, chapter, id, paragraph and content, annotated with indexing options: BTree Index, Function based Index (XPath), Secondary Oracle Text Index, XMLIndex (structured and unstructured)]
  • Can be influenced via
    Statistics
    Indexes
    XML Schema Registration (XOB)
    Encoding in Binary XML storage
    SQL Re-Write of XPath, XQuery
    Partitioning
    Cost Based Optimizer Advantages
  • O.R. XMLType (V 11.1.0.6.0)
    ORA-31186
    ORA-31186
  • So why can DISK outperform MEMORY?
    XML Schema validation based on Registered XML Schema
    Query re-write possible
    Based on plain “old” SQL/database methods
    Optimized CPU handling
    Optimized Memory handling (if needed)
    Multiple optimized solutions possible via Optimizer instead of one XML parser method
    Specific parts of XML can be handled / be driven via:
    specific indexing
    or content
    Full blown validation can be avoided
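The “plain old SQL” point above can be sketched by shredding the document once into an indexed table and letting the database engine answer the questions. This uses Python's built-in `sqlite3` as a stand-in for Oracle object-relational XMLType storage, which it only loosely resembles; the table name `cases`, the index name and the generated data are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Illustrative document: 1000 cases with amounts 1..1000
doc = "<case_data>" + "".join(
    f"<case><amount_charged>{i}</amount_charged></case>" for i in range(1, 1001)
) + "</case_data>"

def shred(xml_text, conn):
    """Parse the XML once and store each case as a relational row (hypothetical helper)."""
    conn.execute(
        "CREATE TABLE cases (case_no INTEGER PRIMARY KEY, amount_charged INTEGER)"
    )
    root = ET.fromstring(xml_text)
    rows = [
        (case_no, int(case.findtext("amount_charged")))
        for case_no, case in enumerate(root.iterfind("case"), start=1)
    ]
    conn.executemany("INSERT INTO cases VALUES (?, ?)", rows)
    # The index lets the engine drive access on content instead of walking a tree
    conn.execute("CREATE INDEX cases_amount_idx ON cases (amount_charged)")

conn = sqlite3.connect(":memory:")  # a file path would give on-disk storage instead
shred(doc, conn)

total = conn.execute("SELECT SUM(amount_charged) FROM cases").fetchone()[0]
big = conn.execute(
    "SELECT COUNT(*) FROM cases WHERE amount_charged > ?", (990,)
).fetchone()[0]
print(total, big)
```

Once the data is shredded, every later question is an ordinary SQL statement that the engine can plan with statistics and indexes, rather than another traversal of an in-memory tree.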
  • Recap…
  • Be aware of what you are doing !
    Avoid unneeded (full) XML Schema validation
    During Insert
    Generating XML
    Avoid Impedance mismatch
    Java → XML → Java → XML → Relational → XML → Java
    “All In One Go Objective”
    Avoid intermediate XML fragments
    //
    XMLEXISTS
    Use Indexes
    xdb:MaintainDOM=false
  • XML Data Handling and Design
    Handle XML Smart
    Keep XML Small
    Restrict XML where possible
    Be precise !
    maxOccurs, maxLength
    Provide Oracle with extra / precise information (XSD)
    Register XML Schema
    If possible…
  • Balanced Design
    • Inserts, Updates & Deletes
    • XML Future Changes
    • Index Maintenance
    • Selects
    • In Memory
    • Via Indexes
    • XML Validation
    • Strict, Lazy
    • Client Side Possibilities
  • Now you know why DISK can be faster than MEMORY
    100.000 “Cases” shredded & validated in 5 minutes
    Instead of 1.000 “Cases” in 3 minutes…
    Avoiding
    ORA-31186: Document contains too many nodes
    Scalable
    Efficient with Memory and CPU
    Checked in production on a 9.2.0.5.0 database version
    Extra:
    …decreased used PL/SQL code by half…
    …but you will have to KNOW what you are doing…
  • Oracle Open World 2009 - XMLDB Sessions
  • References
    XMLDB Developer's Guide
    http://www.oracle.com/pls/db112/homepage
    The XMLDB Forum
    http://forums.oracle.com/forums/forum.jspa?forumID=34
    XML DB FAQ Thread
    http://forums.oracle.com/forums/thread.jspa?threadID=410714
    Blog
    http://technology.amis.nl/blog
    http://blog.gralike.com