Relational Data as XML


Database Presentation

Published in: Education

  1. Efficiently Publishing Relational Data as XML Documents
     • Authors: J. Shanmugasundaram, Michael Carey, et al. (IBM Almaden Research Center)
     • Presented by Harshavardhan Achrekar (University of Massachusetts-Lowell)
  2. What drove them?
     • XML is emerging as the standard for business data exchange on the World Wide Web.
     • A mechanism is needed to publish data currently stored in relational databases as XML documents.
  3. Primary Issues
     • Language specification: how to structure and tag data from tables as hierarchical XML documents.
     • Best implementation technique: study the characteristics and performance of the alternatives for constructing XML documents.
         • When should tags and structure be added?
         • How much of the processing is done within the relational engine?
  4. Roadmap
     • Language specification based on SQL
     • Implementation
         • Early tagging, early structuring
         • Late tagging, late structuring
         • Early structuring, late tagging
     • Performance evaluation
  5. Sample XML Document for Customer
         <customer id="C1">
           <name> John Doe </name>
           <accounts>
             <account id="A1"> 1894654 </account>
             <account id="A2"> 3849342 </account>
           </accounts>
           <porders>
             <porder id="PO1" acct="A1"> <!-- first purchase order -->
               <date>1 Jan 2000</date>
               <items>
                 <item id="I1"> Shoes </item>
                 <item id="I2"> Bungee Ropes </item>
               </items>
               <payments>
                 <payment id="P1"> due Jan 15 </payment>
                 <payment id="P2"> due Jan 20 </payment>
                 <payment id="P3"> due Feb 15 </payment>
               </payments>
             </porder>
             <porder id="PO2" acct="A2"> <!-- second purchase order -->
               …
             </porder>
           </porders>
         </customer>
     • Note the elements, names/tags, ID refs, attributes, and nested sub-elements.
  6. Underlying tables
     • Customer (id int, name varchar)
     • Account (id varchar, custID int, acctnum int)
     • Item (id int, poID int, desc varchar)
     • PurchOrder (id int, custID int, acctID varchar, date varchar)
     • Payment (id int, poID int, desc varchar)
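
To make the five tables concrete, here is a minimal sketch that creates them in an in-memory SQLite database using Python's standard `sqlite3` module. SQLite is an assumption made purely for illustration (the slides do not name an engine), the function name `build_sample_db` is hypothetical, and the rows are adapted from the sample customer document, with integer ids substituted where the table definitions declare `int`.

```python
import sqlite3

def build_sample_db() -> sqlite3.Connection:
    """Create the five tables from the slide and load a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Customer   (id INTEGER, name TEXT);
        CREATE TABLE Account    (id TEXT, custID INTEGER, acctnum INTEGER);
        CREATE TABLE PurchOrder (id INTEGER, custID INTEGER, acctID TEXT, date TEXT);
        -- "desc" is an SQL keyword, so it has to be quoted as a column name.
        CREATE TABLE Item       (id INTEGER, poID INTEGER, "desc" TEXT);
        CREATE TABLE Payment    (id INTEGER, poID INTEGER, "desc" TEXT);

        INSERT INTO Customer   VALUES (1, 'John Doe');
        INSERT INTO Account    VALUES ('A1', 1, 1894654), ('A2', 1, 3849342);
        INSERT INTO PurchOrder VALUES (1, 1, 'A1', '1 Jan 2000');
        INSERT INTO Item       VALUES (1, 1, 'Shoes'), (2, 1, 'Bungee Ropes');
        INSERT INTO Payment    VALUES (1, 1, 'due Jan 15'), (2, 1, 'due Jan 20');
    """)
    return conn

if __name__ == "__main__":
    db = build_sample_db()
    for row in db.execute('SELECT id, "desc" FROM Item ORDER BY id'):
        print(row)
```

The foreign-key columns (`custID`, `poID`, `acctID`) are what later queries join on; no constraints are declared here because the slide lists none.
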
  7. SQL-based language specifications
     • SQL functions:
           Define XMLConstruct CUST (Custid: integer, CustName: varchar) AS {
             <Customer id=$Custid>$CustName</Customer>
           }
     • SQL aggregates:
           Select XMLAGG ( ITEM (, item.desc) )
           From Item item
           // returns an XML aggregation of items
  8. Customer: Definition of XML Constructor
           Define XML Constructor CUST (custId: integer,
                                        custName: varchar(20),
                                        acctList: xml,
                                        porderList: xml) AS {
             <customer id=$custId>
               <name> $custName </name>
               <accounts> $acctList </accounts>
               <porders> $porderList </porders>
             </customer>
           }
     • Input: the constructor arguments. Output: a customer XML element.
     • The aggregate function XMLAGG concatenates the XML fragments produced by an XML constructor.
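
The constructor and the aggregate can be mimicked outside any database to show what each contributes. The following is a minimal sketch in plain Python, not the paper's implementation: `cust`, `acct`, and `xmlagg` are hypothetical helpers named after the slide's CUST, ACCT, and XMLAGG, and escaping via `xml.sax.saxutils` is an added assumption.

```python
from xml.sax.saxutils import escape, quoteattr

def xmlagg(fragments) -> str:
    """Concatenate XML fragments, as the slide describes XMLAGG doing."""
    return "".join(fragments)

def acct(acct_id, acctnum) -> str:
    """Constructor for one account element (hypothetical analogue of ACCT)."""
    return f"<account id={quoteattr(str(acct_id))}>{escape(str(acctnum))}</account>"

def cust(cust_id, cust_name, acct_list: str, porder_list: str) -> str:
    """Constructor for one customer element (analogue of CUST).

    acct_list and porder_list are already-built XML fragments, so they are
    inserted as-is; only the scalar values are escaped.
    """
    return (
        f"<customer id={quoteattr(str(cust_id))}>"
        f"<name>{escape(str(cust_name))}</name>"
        f"<accounts>{acct_list}</accounts>"
        f"<porders>{porder_list}</porders>"
        "</customer>"
    )

if __name__ == "__main__":
    accounts = xmlagg(acct(i, n) for i, n in [("A1", 1894654), ("A2", 3849342)])
    print(cust("C1", "John Doe", accounts, ""))
```

The split mirrors the slide: constructors turn scalar values into one tagged element, while the aggregate only concatenates fragments, which is why nesting is expressed by passing an aggregated fragment as a constructor argument.
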
  9. Sample SQL query that constructs XML from relational tables
           Select, CUST(,,
                  (Select XMLAGG(ACCT(, acct.acctnum))
                   From Account acct
                   Where,
                  (Select XMLAGG(PORDER(, porder.acct,,
                          (Select XMLAGG(ITEM(, item.desc))
                           From Item item
                           Where
                          (Select XMLAGG(PAYMENT(,pay.desc))
                           From Payment pay,
                           Where
                   From PurchOrder porder
                   Where
           From Customer cust
     • Slide callouts: correlated sub-query for the customer's accounts; correlated sub-query for purchase orders; top-level query returns each customer from the Customer table; correlated sub-query returns an XML fragment; lines 1-14 produces; scalar function returning the customer XML.
  10. Implementation Alternatives
      • Two main differences:
          • Nesting (structuring)
          • Tagging
      • Space of alternatives:
          • Early tagging, early structuring: inside the engine (CLOB) or outside the engine (stored procedures)
          • Late tagging, late structuring: inside or outside the engine
          • Late tagging, early structuring: inside or outside the engine
  11. Early tagging and structuring
      • Stored procedure: the outside-the-engine approach
      • Explicitly issue nested queries.
      • Algorithm:
          • First, issue a query to retrieve the root elements (customer id, name).
          • Using the customer id, issue a query to retrieve the account information.
          • Next, for the same customer id, issue a query to retrieve the customer's purchase orders.
              • For each purchase order retrieved, issue queries to get its item and payment information.
          • At this point, processing of one customer is complete.
          • Repeat for the next customer until the entire XML document is ready.
      • This amounts to a fixed-order nested loop join performed outside the engine.
      • Tag and structure as soon as the data is ready.
      • Many SQL queries are issued per tuple for tables with nested structure.
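
The algorithm above can be sketched as a client-side nested loop. This is an illustrative sketch only, assuming SQLite through Python's `sqlite3` module rather than a real stored procedure; the helpers `make_db`, `elem`, `text`, and `publish` are hypothetical, as are the sample rows. The query counter is there to make the "many queries" cost visible.

```python
import sqlite3
from xml.sax.saxutils import escape, quoteattr

def make_db() -> sqlite3.Connection:
    """Tiny in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Customer   (id INTEGER, name TEXT);
        CREATE TABLE Account    (id TEXT, custID INTEGER, acctnum INTEGER);
        CREATE TABLE PurchOrder (id INTEGER, custID INTEGER, acctID TEXT, date TEXT);
        CREATE TABLE Item       (id INTEGER, poID INTEGER, "desc" TEXT);
        CREATE TABLE Payment    (id INTEGER, poID INTEGER, "desc" TEXT);
        INSERT INTO Customer   VALUES (1, 'John Doe');
        INSERT INTO Account    VALUES ('A1', 1, 1894654), ('A2', 1, 3849342);
        INSERT INTO PurchOrder VALUES (1, 1, 'A1', '1 Jan 2000');
        INSERT INTO Item       VALUES (1, 1, 'Shoes'), (2, 1, 'Bungee Ropes');
        INSERT INTO Payment    VALUES (1, 1, 'due Jan 15');
    """)
    return conn

def elem(tag: str, body: str, **attrs) -> str:
    """Wrap an already-serialized body in a tagged element."""
    attr_text = "".join(f" {k}={quoteattr(str(v))}" for k, v in attrs.items())
    return f"<{tag}{attr_text}>{body}</{tag}>"

def text(value) -> str:
    """Escape a scalar value for use as element content."""
    return escape(str(value))

def publish(conn: sqlite3.Connection):
    """Return (xml, number_of_queries_issued) using one query per nesting step."""
    issued = 0

    def run(sql, params=()):
        nonlocal issued
        issued += 1
        return conn.execute(sql, params).fetchall()

    customers = []
    for cid, name in run("SELECT id, name FROM Customer ORDER BY id"):
        accounts = "".join(
            elem("account", text(num), id=aid)
            for aid, num in run(
                "SELECT id, acctnum FROM Account WHERE custID = ? ORDER BY id", (cid,)))
        porders = []
        for pid, acct_id, date in run(
                "SELECT id, acctID, date FROM PurchOrder WHERE custID = ? ORDER BY id", (cid,)):
            items = "".join(
                elem("item", text(d), id=iid)
                for iid, d in run('SELECT id, "desc" FROM Item WHERE poID = ? ORDER BY id', (pid,)))
            pays = "".join(
                elem("payment", text(d), id=yid)
                for yid, d in run('SELECT id, "desc" FROM Payment WHERE poID = ? ORDER BY id', (pid,)))
            # Tag the purchase order as soon as its children are available.
            porders.append(elem(
                "porder",
                elem("date", text(date)) + elem("items", items) + elem("payments", pays),
                id=pid, acct=acct_id))
        customers.append(elem(
            "customer",
            elem("name", text(name)) + elem("accounts", accounts)
            + elem("porders", "".join(porders)),
            id=cid))
    return "".join(customers), issued

if __name__ == "__main__":
    xml, issued = publish(make_db())
    print(issued, "queries issued")
    print(xml)
```

With one customer holding one purchase order, the loop already issues five queries: one for customers, one each for that customer's accounts and purchase orders, and two for the order's items and payments. The count grows with every additional customer and purchase order, which is the cost the last bullet points at.
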
  12. Early tagging and structuring
      • Correlated CLOB: the inside-the-engine approach
          • Push the queries into the engine.
          • Plug XMLAGG and XMLCONSTRUCT support into the engine.
          • XML fragments are represented as Character Large Objects (CLOBs).
          • Performance issue: the engine has to handle huge CLOBs.
          • The join order is fixed, which implies a nested loop join strategy.
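
As a rough analogue of the correlated approach, a single SQL statement can build the tagged text inside the engine, with one correlated sub-query per nested list. The sketch below assumes SQLite via Python's `sqlite3`: string concatenation stands in for the XML constructors and `group_concat` with an empty separator stands in for XMLAGG. The helper names, the sample rows, and the join columns (`custID`) are assumptions taken from the table definitions, and no XML escaping is done.

```python
import sqlite3

# One statement; each nested list is a correlated sub-query on the outer customer.
CORRELATED_QUERY = """
SELECT '<customer id="' || c.id || '"><name>' || c.name || '</name>'
    || '<accounts>' || COALESCE(
         (SELECT group_concat('<account id="' || a.id || '">' || a.acctnum || '</account>', '')
          FROM Account a WHERE a.custID = c.id), '') || '</accounts>'
    || '<porders>' || COALESCE(
         (SELECT group_concat('<porder id="' || p.id || '" acct="' || p.acctID || '"/>', '')
          FROM PurchOrder p WHERE p.custID = c.id), '') || '</porders>'
    || '</customer>'
FROM Customer c
ORDER BY c.id
"""

def make_db() -> sqlite3.Connection:
    """Tiny in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Customer   (id INTEGER, name TEXT);
        CREATE TABLE Account    (id TEXT, custID INTEGER, acctnum INTEGER);
        CREATE TABLE PurchOrder (id INTEGER, custID INTEGER, acctID TEXT, date TEXT);
        INSERT INTO Customer   VALUES (1, 'John Doe'), (2, 'Jane Roe');
        INSERT INTO Account    VALUES ('A1', 1, 1894654);
        INSERT INTO PurchOrder VALUES (1, 1, 'A1', '1 Jan 2000');
    """)
    return conn

def correlated_clob(conn: sqlite3.Connection) -> list:
    """Return one tagged text value per customer, built entirely by the engine."""
    return [row[0] for row in conn.execute(CORRELATED_QUERY)]

if __name__ == "__main__":
    for fragment in correlated_clob(make_db()):
        print(fragment)
```

Every value travelling through the query is already a tagged string, which is the CLOB-handling cost the slide mentions. The sub-queries are conceptually evaluated once per outer customer row, matching the fixed nested-loop order. `COALESCE` is needed because `group_concat` yields NULL when a customer has no matching rows.
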
  13. Efficiently Publishing Relational Data as XML Documents
  14. Early tagging and structuring
      • De-correlated CLOB: the inside-the-engine approach
          • Decorrelate the query and use outer joins, so the join order is no longer fixed.
          • Compute the account lists associated with all customers.
          • Compute the purchase order lists associated with all customers.
          • Join the results above on customer id.
          • CLOBs are still carried around (due to early tagging).
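
The decorrelated steps can be sketched in the same spirit: compute each per-customer list once with a GROUP BY, then outer-join the lists back to the customers. This is a sketch under assumptions, not the paper's plan: it uses SQLite via Python's `sqlite3`, `group_concat` as a stand-in for XMLAGG, hypothetical names (`acct_lists`, `po_lists`, `decorrelated_clob`), hypothetical sample rows, and no XML escaping.

```python
import sqlite3

DECORRELATED_QUERY = """
WITH acct_lists AS (      -- account lists for all customers at once
    SELECT custID,
           group_concat('<account id="' || id || '">' || acctnum || '</account>', '') AS accts
    FROM Account
    GROUP BY custID
),
po_lists AS (             -- purchase order lists for all customers at once
    SELECT custID,
           group_concat('<porder id="' || id || '" acct="' || acctID || '"/>', '') AS pos
    FROM PurchOrder
    GROUP BY custID
)
SELECT '<customer id="' || c.id || '"><name>' || c.name || '</name>'
    || '<accounts>' || COALESCE(al.accts, '') || '</accounts>'
    || '<porders>'  || COALESCE(pl.pos, '')   || '</porders>'
    || '</customer>'
FROM Customer c
LEFT JOIN acct_lists al ON al.custID = c.id   -- outer joins keep customers
LEFT JOIN po_lists   pl ON pl.custID = c.id   -- that have empty lists
ORDER BY c.id
"""

def make_db() -> sqlite3.Connection:
    """Tiny in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Customer   (id INTEGER, name TEXT);
        CREATE TABLE Account    (id TEXT, custID INTEGER, acctnum INTEGER);
        CREATE TABLE PurchOrder (id INTEGER, custID INTEGER, acctID TEXT, date TEXT);
        INSERT INTO Customer   VALUES (1, 'John Doe'), (2, 'Jane Roe');
        INSERT INTO Account    VALUES ('A1', 1, 1894654);
        INSERT INTO PurchOrder VALUES (1, 1, 'A1', '1 Jan 2000');
    """)
    return conn

def decorrelated_clob(conn: sqlite3.Connection) -> list:
    """Return one tagged text value per customer using grouped lists and outer joins."""
    return [row[0] for row in conn.execute(DECORRELATED_QUERY)]

if __name__ == "__main__":
    for fragment in decorrelated_clob(make_db()):
        print(fragment)
```

Because the lists are ordinary grouped relations joined on customer id, the engine is free to pick any join method and order. The joined values are still tagged strings, so the CLOB overhead noted in the last bullet remains.
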
  15. Efficiently Publishing Relational Data as XML Documents