DB2 Native XML

Transcript

  • 1. IBM DB2 v9 pureXML amolpujari@gmail.com
  • 2. Agenda
      • XML in DB2
      • Saving hordes of lines of code, and precious time too
      • SQL/XML
      • XQuery
      • RSS Generator
      • Workflow example
      • XML in Oracle
      • DB2 vs. Oracle (XQuery performance)
  • 3. XML in DB2 (Native XML Database)
      • DB2 9 was code-named "Viper"
      • Native XML support
          – Native storage for XML
          – Stores parsed XML
          – Big documents are divided into regions
      • Indexes on XML columns
          – Internal indexes
          – User-created indexes
      • XML Schema
      • XQuery
          – XPath
          – FLWOR
  • 4. Native XML support
      • XML is stored hierarchically
      • Keep XML as XML
      • DB2 stores XML in a parsed hierarchical format (similar to the DOM representation)
      • "Native" = the best-suited on-disk representation of XML
          CREATE TABLE dept ( deptID char(8), … , doc xml);
      • Relational columns are stored in relational format
      • XML columns are stored natively
      • All XML data is stored in XML-typed columns
  • 5. Native XML Storage
      • One string table per database
      • Database-wide dictionary for all tags in all XML columns
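The idea of a shared tag dictionary can be sketched in a few lines of Python. This is only an illustration of dictionary encoding, not DB2's actual on-disk format; the `StringTable` class and the `encode` helper are hypothetical names introduced for the example.

```python
import xml.etree.ElementTree as ET


class StringTable:
    """Hypothetical tag dictionary: each distinct name is stored once and mapped to a small integer ID."""

    def __init__(self):
        self.ids = {}     # name -> ID
        self.names = []   # ID -> name

    def intern(self, name):
        # Assign the next free ID the first time a name is seen.
        if name not in self.ids:
            self.ids[name] = len(self.names)
            self.names.append(name)
        return self.ids[name]


def encode(element, table):
    """Return a nested (tag_id, text, children) tuple with tag names replaced by IDs."""
    return (
        table.intern(element.tag),
        (element.text or "").strip(),
        [encode(child, table) for child in element],
    )


table = StringTable()
doc1 = ET.fromstring("<dept><emp>Ann</emp><emp>Bob</emp></dept>")
doc2 = ET.fromstring("<dept><emp>Cy</emp></dept>")
encoded = [encode(doc1, table), encode(doc2, table)]
print(table.names)  # the distinct tag names shared by both documents
print(encoded[1])
```

Both documents reuse the same two dictionary entries, which is the space saving a shared dictionary is meant to provide when tag names repeat across many documents.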
  • 6. XML Node Storage Layout
  • 7. XML in DB2 (architecture diagram; labels only)
      • DB2 Client/Application
      • SQL/XML (Relational Interface) and XQuery (XML Interface)
      • DB2 Engine: SQL/XML Parser, XQuery Parser, Hybrid SQL/XQuery Compiler, Query Evaluation and Runtime, XML Navigation
  • 8. XML in DB2
      • CREATE TABLE msg ( item XML )
      • INSERT INTO msg VALUES ( XMLPARSE(DOCUMENT '<?xml version="1.0"?><root>…</root>' PRESERVE WHITESPACE) )
      • REGISTER XMLSCHEMA 'http://sample/po' FROM 'file:item.xsd' AS xscma COMPLETE
      • INSERT INTO msg VALUES ( XMLVALIDATE(XMLPARSE(DOCUMENT '<?xml version="1.0"?><root>…</root>' PRESERVE WHITESPACE) ACCORDING TO XMLSCHEMA ID xscma) )
      • CREATE INDEX xind_newsgroup ON msg(item) GENERATE KEY USING XMLPATTERN '//@newsgroup' AS SQL VARCHAR(50)
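What an index on the pattern //@newsgroup buys can be illustrated with a small Python sketch: a lookup table from attribute value to the documents that contain it. This is an assumption-laden analogy, not how DB2 implements XML indexes; `build_attribute_index` and the sample documents are made up for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_attribute_index(documents, attribute):
    """Map each value of `attribute`, found on any element, to the IDs of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        root = ET.fromstring(text)       # raises ParseError if the text is not well-formed
        for element in root.iter():      # the root and all descendants, like the // axis
            if attribute in element.attrib:
                index[element.attrib[attribute]].add(doc_id)
    return index


# Hypothetical stored documents, keyed by a row ID.
documents = {
    1: '<msg newsgroup="comp.lang.c"><item><title>A</title></item></msg>',
    2: '<msg newsgroup="ibm.software.unicode"><item><title>B</title></item></msg>',
    3: '<msg newsgroup="comp.lang.c"><item><title>C</title></item></msg>',
}
index = build_attribute_index(documents, "newsgroup")
print(sorted(index["comp.lang.c"]))
```

A query that filters on @newsgroup can then consult the index instead of scanning every stored document.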
  • 9. Saving hordes of lines of code
      • Web applications use databases
      • What they get from the database is relational data
      • That relational data must then be turned into XML, which involves DOM/SAX operations
      • But what if the required XML came straight from the database, formed by a single XQuery?
      • With DB2 XML, you
          – Don't involve so many relational tables
          – Don't keep fetching relational records
          – Don't need external DOM/SAX operations
          – Just need a single XQuery, and the required XML document is ready in one fetch
          – Save a lot of execution time and also hordes of lines of code
  • 10. SQL/XML
      • A standardized mechanism for using SQL and XML together
      • Retrieve data as XML from relational objects
      • A set of functions:
          – xmlelement(): creates an XML element, allowing the name to be specified
          – xmlattributes(): creates XML attributes from columns, using the name of each column as the name of the corresponding attribute
          – xmlroot(): creates the root node of an XML document
          – xmlcomment(): creates an XML comment
          – xmlpi(): creates an XML processing instruction
          – xmlparse(): parses a string as XML and returns the resulting XML structure
          – xmlforest(): creates XML elements from columns, using the name of each column as the name of the corresponding element
          – xmlconcat(): combines a list of individual XML values to create a single value containing an XML forest
          – xmlagg(): combines a collection of rows, each containing a single XML value, to create a single value containing an XML forest
  • 11. SQL/XML
          SELECT XMLELEMENT(NAME "Department",
                   XMLATTRIBUTES(e.department AS "name"),
                   XMLAGG(XMLELEMENT(NAME "emp", e.firstname))
                 ) AS "department_list"
          FROM employee e
          WHERE . . .
          GROUP BY e.department
      • Input rows (employee):
          firstname  lastname  department
          SEAN       LEE       A00
          MICHAEL    JOHNSON   B01
          VINCENZO   BARELLI   A00
          CHRISTINE  SMITH     A00
      • Result (department_list):
          <Department name="A00">
            <emp>CHRISTINE</emp>
            <emp>VINCENZO</emp>
            <emp>SEAN</emp>
          </Department>
          <Department name="B01">
            <emp>MICHAEL</emp>
          </Department>
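For comparison, here is roughly what an application would have to do by hand to get the same shape of result from plain relational rows. It is a minimal Python sketch using the four sample rows from the slide; the variable names are illustrative, and the order of `<emp>` elements inside a group follows input order here, which need not match the order shown on the slide.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The relational rows from the slide: (firstname, lastname, department).
rows = [
    ("SEAN", "LEE", "A00"),
    ("MICHAEL", "JOHNSON", "B01"),
    ("VINCENZO", "BARELLI", "A00"),
    ("CHRISTINE", "SMITH", "A00"),
]

# GROUP BY department: collect first names per department.
groups = defaultdict(list)
for firstname, _lastname, department in rows:
    groups[department].append(firstname)

# One <Department> element per group, one <emp> child per row in the group.
department_list = []
for department in sorted(groups):
    dept = ET.Element("Department", {"name": department})
    for firstname in groups[department]:
        ET.SubElement(dept, "emp").text = firstname
    department_list.append(ET.tostring(dept, encoding="unicode"))

for xml_text in department_list:
    print(xml_text)
```

With SQL/XML the grouping and element construction happen inside the database, so the application receives the finished XML instead of running a loop like this.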
  • 12. SQL/XML
      • Traditional way:
          db2 select empno, firstnme, lastname from employee

          EMPNO  FIRSTNME     LASTNAME
          ------ ------------ ---------------
          000010 CHRISTINE    HAAS
          000020 MICHAEL      THOMPSON
          000030 SALLY        KWAN
          .
          .
          200340 ROY          ALONZO
            42 record(s) selected.

          // fetching relational data
          // construct html table
          <table …> <!-- setting table attributes -->
          <% while (rs.next()) // 42 fetches
          { %>
            // construct table rows
            <tr…> <!-- setting row attributes -->
            // construct table columns <!-- setting column attributes -->
            <td…><%=(rs.getString("EMPNO"))%>
            <td…><%=(rs.getString("FIRSTNME"))%>
            <td…><%=(rs.getString("LASTNAME"))%>
            </tr>
          <% } %>
      • SQL/XML way:
          SELECT XMLSerialize(
                   XMLELEMENT(NAME "TABLE",
                     -- XMLATTRIBUTES('80%' AS "width")
                     XMLAGG(
                       XMLELEMENT(NAME "TR",
                         XMLELEMENT(NAME "TD", empno),
                         XMLELEMENT(NAME "TD", firstnme),
                         XMLELEMENT(NAME "TD", lastname))))
                   AS varchar(4000))
          FROM employee

          // single fetch and html (xml) is ready
          <% rs.next(); %>
          <%=(rs.getString(1))%>
          // job done
  • 13. XQuery: new kid on the block
      • A language for running queries against XML-tagged documents in files and "databases"
      • Provides XPath compatibility
      • Supports conditional expressions and element constructors
      • FLWOR expressions (the syntax for retrieving, filtering, and transforming), plus operators, functions, and path expressions
      • The result of an XQuery is an instance of the XML Query Data Model
      • Uses XML Schema types, offers static typing at compile time and dynamic typing at run time, supports primitive and derived types
      • An XQuery can evaluate to nodes (such as elements and attributes) or to atomic values (such as strings and numbers). It can also evaluate to sequences of both nodes and atomic values.
      • XQuery update is planned
  • 14. FLWOR Expression
      • FOR: iterates through a sequence, binding a variable to each item
      • LET: binds a variable to a sequence
      • WHERE: eliminates items of the iteration
      • ORDER: reorders items of the iteration
      • RETURN: constructs query results
      • Query (XQuery keywords are lowercase):
          for $movie in db2-fn:xmlcolumn('MOVIE.DOC')
          let $actors := $movie/actor
          where $movie/duration > 90
          order by $movie/@year
          return <movie> {$movie/title, $actors} </movie>
      • Result:
          <movie>
            <title>Chicago</title>
            <actor>Renée Zellweger</actor>
            <actor>Richard Gere</actor>
            <actor>Catherine Zeta-Jones</actor>
          </movie>
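The five FLWOR clauses map closely onto ordinary loop, filter and sort operations. The following Python sketch walks through the same steps over a few made-up movie documents; the sample data and the flat document shape (a `<movie>` root with `year`, `title`, `duration` and `actor`) are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical contents of the MOVIE.DOC column, one document per row.
movie_docs = [
    "<movie year='2002'><title>Chicago</title><duration>113</duration>"
    "<actor>Richard Gere</actor></movie>",
    "<movie year='1995'><title>Short Film</title><duration>12</duration>"
    "<actor>Someone</actor></movie>",
    "<movie year='1997'><title>Long Film</title><duration>194</duration>"
    "<actor>Someone Else</actor></movie>",
]

movies = [ET.fromstring(text) for text in movie_docs]                  # for $movie in ...
selected = [m for m in movies if float(m.findtext("duration")) > 90]   # where $movie/duration > 90
selected.sort(key=lambda m: int(m.get("year")))                        # order by $movie/@year

results = []
for movie in selected:
    actors = movie.findall("actor")        # let $actors := $movie/actor
    out = ET.Element("movie")              # return <movie>{$movie/title, $actors}</movie>
    out.append(movie.find("title"))
    out.extend(actors)
    results.append(ET.tostring(out, encoding="unicode"))

for xml_text in results:
    print(xml_text)
```

The short film is filtered out by the where step, and the remaining two movies come back ordered by year.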
  • 15. XQuery (sample data)
      • Column Item (xml), two sample rows:
          <msg id='12' newsserver='news.persistent.co.in' newsgroup='comp.lang.c'>
            <item>
              <title>Re: SIGPIPE - Finding the thread</title>
              <link><e1lqu9$2ee$1@news.intranet.pspl.co.in></link>
              <author>sushrut bidwai <sushrut_bidwai@persistent.co.in></author>
              <pubDate>Thu, 13 Apr 2006, 09:49:39 +0530</pubDate>
              <description>some description here…</description>
            </item>
          </msg>

          <msg id='12' newsserver='news.software.ibm.com' newsgroup='ibm.software.unicode'>
            <item>
              <title>Gold Mobile</title>
              <link><d1nl7v$4lug$5@news.boulder.ibm.com></link>
              <author>Nadine <Nadine.grantham@gmail.com></author>
              <pubDate>Tue, 22 Mar 2005, 04:58:39 +0530</pubDate>
              <description>some description here…</description>
            </item>
          </msg>
  • 16. XQuery examples
      • Getting the list of messages whose description contains a particular string ("uninitialized" in this case):
          xquery
          for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg
          where contains($a/item/description, "uninitialized")
          return $a
      • Getting the first 3 messages sent by an author to the newsgroup:
          xquery
          let $a := ( for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg
                      where contains($b/item/author, "Shridhar")
                      return $b )
          return $a[position() < 4]
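The two queries above boil down to a substring filter and a positional slice. A minimal Python sketch of the same logic, over invented sample messages (the rows and the `where_contains` helper are assumptions, not part of the deck):

```python
import xml.etree.ElementTree as ET

# Hypothetical rows of the MSG.ITEM column, in stored order.
rows = [
    "<msg id='1'><item><author>Shridhar</author><description>uninitialized pointer</description></item></msg>",
    "<msg id='2'><item><author>Nadine</author><description>gold mobile</description></item></msg>",
    "<msg id='3'><item><author>Shridhar</author><description>thread question</description></item></msg>",
    "<msg id='4'><item><author>Shridhar</author><description>uninitialized variable</description></item></msg>",
    "<msg id='5'><item><author>Shridhar</author><description>follow-up</description></item></msg>",
]
messages = [ET.fromstring(row) for row in rows]


def where_contains(msgs, path, needle):
    """Keep messages whose text at `path` contains `needle`, similar to fn:contains."""
    return [m for m in msgs if needle in (m.findtext(path) or "")]


# where contains($a/item/description, "uninitialized")
with_keyword = where_contains(messages, "item/description", "uninitialized")

# $a[position() < 4]: positions are 1-based, so this is the first three matches.
first_three = where_contains(messages, "item/author", "Shridhar")[:3]

print([m.get("id") for m in with_keyword])
print([m.get("id") for m in first_three])
```

Note the off-by-one between the two notations: XQuery positions start at 1, so `position() < 4` corresponds to the Python slice `[:3]`.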
  • 17. XQuery examples
      • Getting the last 5 messages sent by an author to the newsgroup:
          xquery
          let $a := ( for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg
                      where contains($b/item/author, "Shridhar")
                      return $b )
          let $c := count($a)
          let $d := $c - 5
          return $a[position() > $d]
      • Returning the list of authors and the number of messages each has sent to the group:
          xquery
          let $a := db2-fn:xmlcolumn('MSG2.ITEM')/msg/item/author
          let $b := distinct-values($a)
          for $e in ($b)
          let $d := count( for $c in db2-fn:xmlcolumn('MSG2.ITEM')/msg/item
                           where $c/author = $e
                           return $c )
          return
            <result>
              <author>{$e}</author>
              <message-count>{$d}</message-count>
            </result>
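Both queries have direct counterparts in everyday collection code: "last N" is a tail slice, and "distinct values plus a count per value" is a frequency table. A small Python sketch with invented authors (the sample data and the `last_n` helper are assumptions):

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Hypothetical authors of eight stored messages, in stored order.
authors = ["Shridhar", "Nadine", "Shridhar", "Shridhar",
           "Nadine", "Shridhar", "Shridhar", "Shridhar"]
messages = [
    ET.fromstring(f"<msg id='{i}'><item><author>{name}</author></item></msg>")
    for i, name in enumerate(authors, start=1)
]


def last_n(items, n):
    """Tail of a list; like $a[position() > count($a) - n], it returns everything if there are fewer than n items."""
    return items[-n:] if n > 0 else []


by_author = [m for m in messages if "Shridhar" in m.findtext("item/author")]
last_five = last_n(by_author, 5)

# distinct-values + count per value, done in a single pass.
counts = Counter(m.findtext("item/author") for m in messages)
results = []
for author, count in counts.items():
    result = ET.Element("result")
    ET.SubElement(result, "author").text = author
    ET.SubElement(result, "message-count").text = str(count)
    results.append(ET.tostring(result, encoding="unicode"))

print([m.get("id") for m in last_five])
print(results)
```

The XQuery on the slide rescans the column once per distinct author; the single-pass count here is a different technique that produces the same per-author totals.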
  • 18. RSS Generator
      • Really Simple Syndication (a lightweight XML format designed for sharing data)
      • A web application to generate RSS and ATOM feeds
      • Source: data (messages) from news servers
      • Messages are uploaded from the news server into DB2 as XML documents
      • XML Schema definition support is used for validation at the database level
      • XML indexes are created as needed, based on the XQueries
      • A single XQuery fetch is enough to generate RSS/ATOM feeds
  • 19. RSS example
          <rss version="2.0">
            <channel>
              <title>news.persistent.co.in: comp.lang.c</title>
              <link>http://news.persistent.co.in</link>
              <description>The latest content from news.persistent.co.in: comp.lang.c</description>
              <lastBuildDate>Thu, 13 Apr 2006, 17:58:13 +0530</lastBuildDate>
              <language>en-us</language>
              <copyright>Copyright 2006 Persistent System Private Limited</copyright>
              <item>
                <title>Re: SIGPIPE - Finding the thread</title>
                <link><e1lqu9$2ee$1@news.intranet.pspl.co.in></link>
                <author>sushrut bidwai <sushrut_bidwai@persistent.co.in></author>
                <pubDate>Thu, 13 Apr 2006, 09:49:39 +0530</pubDate>
                <description>some description here…</description>
              </item>
              . .
              <item> . .
              <item> .
            </channel>
          </rss>
  • 20. RSS Generator (Administration)
      • Uploading newsgroup messages to NXD (diagram labels): News Server, news message, News Updater, xml record, NXD Database
      • Example xml record:
          <msg id='12' newsserver='news.software.ibm.com' newsgroup='ibm.software.unicode'>
            <item>
              <title>Gold Mobile</title>
              <link><d1nl7v$4lug$5@news.boulder.ibm.com></link>
              <author>Nadine <Nadine.grantham@gmail.com></author>
              <pubDate>Tue, 22 Mar 2005, 04:58:39 +0530</pubDate>
              <description>some description here…</description>
            </item>
          </msg>
  • 21. RSS Generator (diagram labels): Web Browser, request, Generator, XQuery, NXD, <xml>, response, Web Browser
  • 22. RSS Generator
      • One XQuery and the job is done
      • The result of the XQuery is a single record, which is an RSS document
      • No DOM/SAX stuff
      • Not even a second fetch
      • Query template (newsServer, newsGroup, subject, author, description and n are placeholders; "and/or" stands for whichever operator is needed):
          xquery
          for $a in ( 1 to 1 )
          return
          <rss version="2.0">
            <channel>
              <title> newsServer:newsGroup </title>
              <link>http://newsServer</link>
              <description>The latest content from newsServer:newsGroup</description>
              <lastBuildDate>Thu, 13 Apr 2006, 17:58:13 +0530</lastBuildDate>
              {
                let $e := (
                  for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg[@newsserver="newsServer"][@newsgroup="newsGroup"]
                  where $b/item[fn:contains(title,"subject")]
                    and/or $b/item[fn:contains(author,"author")]
                    and/or $b/item[fn:contains(description,"description")]
                  order by fn:number($b/@id) descending
                  return $b
                )
                for $i in ( 1 to n )
                return $e[$i]/item
              }
            </channel>
          </rss>
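To make the shape of that query concrete, here is a rough Python equivalent of its core steps: filter by server and group, sort by numeric id descending, keep the first n, and wrap the `<item>` elements in a channel. It is a sketch under assumptions: `build_rss` and the sample messages are invented, and the optional title/author/description filters and `lastBuildDate` are left out for brevity.

```python
import xml.etree.ElementTree as ET


def build_rss(messages, news_server, news_group, limit):
    """Assemble an RSS 2.0 tree from stored <msg> elements (illustrative helper)."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"{news_server}: {news_group}"
    ET.SubElement(channel, "link").text = f"http://{news_server}"
    ET.SubElement(channel, "description").text = (
        f"The latest content from {news_server}: {news_group}"
    )
    # msg[@newsserver=...][@newsgroup=...]
    matching = [
        m for m in messages
        if m.get("newsserver") == news_server and m.get("newsgroup") == news_group
    ]
    # order by fn:number($b/@id) descending: compare ids as numbers, not strings
    matching.sort(key=lambda m: float(m.get("id")), reverse=True)
    # for $i in (1 to n) return $e[$i]/item
    for msg in matching[:limit]:
        channel.extend(msg.findall("item"))
    return rss


# Hypothetical stored messages.
messages = [
    ET.fromstring("<msg id='9' newsserver='s' newsgroup='g'><item><title>old</title></item></msg>"),
    ET.fromstring("<msg id='10' newsserver='s' newsgroup='g'><item><title>new</title></item></msg>"),
    ET.fromstring("<msg id='11' newsserver='s' newsgroup='other'><item><title>skip</title></item></msg>"),
]
rss = build_rss(messages, "s", "g", limit=5)
titles = [t.text for t in rss.findall("channel/item/title")]
print(titles)
```

Sorting on the numeric value matters: compared as strings, id "9" would sort after "10", which is presumably why the query wraps @id in fn:number.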
  • 23. RSS Generator
  • 24. Workflow Example: A Document Approval System (one simple content-management use case)
      • A web application
      • Uses native XML features
      • Just a single XQuery fetch and the HTML (XML) is ready
      • Simple and easy to use
      • Facilitates the document review process
      • Uses NXD to store document state information
      • Makes it easy to query requests by assignee, reviewer, request state, etc.
  • 25. Workflow Example
  • 26. XML in Oracle: Storage
      • XMLType storage (gone relational, not native)
          – CLOB
              • Whole document stored in one column
              • Requires DOM operations
              • Text indexing
              • Inefficient update
          – Object-relational
              • Document shredded across tables, rows and columns
              • Requires XML Schema
              • Insert/retrieval requires (de)composition
  • 27. XML in Oracle: Indexing XML
      • CTXXPATH (gone relational, not native)
          – Used when you need to speed up existsNode() queries on an XMLType column
          – e.g.
              CREATE INDEX [schema.]index on [schema.]table(XMLType column) INDEXTYPE IS ctxsys.CTXXPATH [PARAMETERS ([storage storage_pref] [memory memsize])];
          – Looks a bit complicated
          – No XML-specific index support
  • 28. XML in Oracle: XQuery
      • No full support for XQuery, e.g.
          SELECT XMLQuery('Xquery for $a in ora:view("MSG")/ROW/ITEM/msg[@newsgroup="pspl.misc"]
                           return <root>{$a/item/title}</root>')
          FROM MSG
      • This query returns "null" values for the rows where the newsgroup condition does not match
      • To get the proper result, the query has to be modified to suppress the "null" values:
          SELECT XMLQuery('Xquery for $a in ora:view("MSG")/ROW/ITEM/msg[@newsgroup="pspl.misc"]
                           return <root>{$a/item/title}</root>')
          FROM MSG
          WHERE ExistsNode(ITEM, '/msg[@newsgroup="pspl.misc"]') = 1
      • The more conditions there are, the bigger the query grows, with more ExistsNode calls
  • 29. XML in Oracle: XQuery
      • No full support for XQuery, another example:
          SELECT XMLQuery('Xquery for $a in ora:view("MSG")/ROW/ITEM/msg
                           where contains($a/item/title, "join")
                           return <root>{$a/item/title}</root>')
          FROM MSG
      • This query returns "null" values for the rows where contains() returns false
          – There is no query-level workaround here: the "contains" function cannot be specified within ExistsNode, so the query cannot be modified to give the proper result
          – A possible workaround is to add code at the application level to suppress the "null" values
  • 30. DB2 vs. Oracle (Sample Database Design): XQuery performance
      • Table1: msg, with column Item (xml). Sample rows:
          <msg id='12' newsserver='news.persistent.co.in' newsgroup='comp.lang.c'>
            <item>
              <title>Re: SIGPIPE - Finding the thread</title>
              <link><e1lqu9$2ee$1@news.intranet.pspl.co.in></link>
              <author>sushrut bidwai <sushrut_bidwai@persistent.co.in></author>
              <pubDate>Thu, 13 Apr 2006, 09:49:39 +0530</pubDate>
              <description>some description here…</description>
            </item>
          </msg>

          <msg id='12' newsserver='news.software.ibm.com' newsgroup='ibm.software.unicode'>
            <item>
              <title>Gold Mobile</title>
              <link><d1nl7v$4lug$5@news.boulder.ibm.com></link>
              <author>Nadine <Nadine.grantham@gmail.com></author>
              <pubDate>Tue, 22 Mar 2005, 04:58:39 +0530</pubDate>
              <description>some description here…</description>
            </item>
          </msg>
      • DB2 XML index:
          create index xind_newsserver on msg(item) generate key using xmlpattern '//@newsserver' as sql varchar(50);
      • Oracle CTXXPATH index:
          CREATE INDEX on MSG (ITEM) INDEXTYPE IS ctxsys.CTXXPATH
      • Around 4,50,000 (450,000) XML records on both sides, DB2 and Oracle
  • 31. DB2 vs. Oracle: XQuery performance
      • Db2_1.sql:
          xquery
          for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg
          where contains($a/item/description, "sample")
          return $a
        Execution time in milliseconds: 187525
      • ora_1.sql:
          select xmlquery('for $a in /msg
                           where contains($a/item/description, "sample")
                           return $a'
                          passing item returning content) result
          from msg
        ORA-04030: out of process memory
  • 32. DB2 vs. Oracle: XQuery performance
      • Db2_2.sql:
          xquery
          for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg
          where contains($a/item/title, "Lint")
          return $a
        Execution time in milliseconds: 198474
      • ora_2.sql:
          select xmlquery('for $a in /msg
                           where contains($a/item/title, "Lint")
                           return $a'
                          passing item returning content) result
          from msg
        ORA-04030: out of process memory
  • 33. DB2 vs. Oracle: XQuery performance
      • Db2_3.sql:
          xquery
          for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg
          where $a/@newsgroup = "control.cancel"
          return $a
        Execution time in milliseconds: 126858
      • ora_3.sql:
          select xmlquery('for $a in /msg return $a'
                          passing item returning content) result
          from msg
          where existsNode(ITEM, '/msg[@newsgroup="control.cancel"]') = 1
        Took more than an hour to fetch all records
  • 34. DB2 vs. Oracle: XQuery performance
      • Db2_4.sql:
          xquery
          let $a := ( for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg
                      where contains($b/item/author, "Shantanu Gadgil")
                      return $b )
          return $a[position() < 10]
        Execution time in milliseconds: 173419
      • ora_4.sql:
          select * from (
            select xmlquery('for $a in /msg
                             where contains($a/item/author, "Shantanu Gadgil ")
                             order by $a[@id]
                             return $a'
                            passing item returning content) result
            from msg
          ) where rownum <= 10
        ORA-04030: out of process memory
  • 35. Thank You