Introduction to pureXML in DB2 9 for z/OS

1,232 views

Published on

0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
1,232
On SlideShare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
Downloads
20
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Introduction to pureXML in DB2 9 for z/OS

  1. 1. IBM Software Group | DB2 Information Management Software ® IBM Software Group Introduction to pureXML in DB2 9 for z/OS December 2008 Guogen (Gene) Zhang, gzhang@us.ibm.com 1
  2. 2. IBM Software Group | DB2 Information Management Software IBM Silicon Valley Lab in San Jose, CA The first IBM lab dedicated to software development, built 30 years ago. 2
  3. 3. IBM Software Group | DB2 Information Management Software Our Mission • Build the world’s most scalable XML database system that is easy to use. 3
  4. 4. IBM Software Group | DB2 Information Management Software DB2 for z/OS XML Design Principles • Learn from relational scalability, leveraging mature infra-structure in DB2 • Provide application developers “revolutionary” new tools (query languages) in processing XML data productively and efficiently • Provide DBAs familiar objects to administrate for XML data easily with familiar tools – Table spaces, indexes, buffer pools etc. – Locking – Performance monitoring 4
  5. 5. IBM Software Group | DB2 Information Management Software Agenda • What is XML and where is XML used? • Overview of DB2 XML – SQL/XML feature in DB2 for z/OS V8 – what you can do with pureXML in DB2 9 for z/OS – Language interfaces and tools • Usage scenarios and business value of pureXML • DBA’s tasks for XML • Performance and scalability 5
  6. 6. IBM Software Group | DB2 Information Management Software XML document XML = Extensible Markup Language XML Declaration Namespace Declaration <?xml version =“1.0” encoding=“utf-8” ?> PI <?xml-stylesheet type="text/xml" href="#style1"?> <abc:process xmlns:abc=“http://abc.com” Attribute abc:title=“How to crack open an egg”> Start Tag <abc:step>Strike egg against an edge.</abc:step> <abc:step> Pull apart the shells <abc:warning>carefully</abc:warning> </abc:step> End Tag </abc:process> <!-- This is a simple example --> Value Comment QName Content Element 6
  7. 7. IBM Software Group | DB2 Information Management Software An XML Purchase Order <?xml version="1.0" encoding="UTF-8"?> <?xml version="1.0" encoding="UTF-8"?> <ipo:purchaseOrder <ipo:purchaseOrder xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ipo="http://www.example.com/IPO" orderDate="1999-12-01"> xmlns:ipo="http://www.example.com/IPO" orderDate="1999-12-01"> <shipTo exportCode="1" xsi:type="ipo:UKAddress"> <shipTo exportCode="1" xsi:type="ipo:UKAddress"> <name>Helen Zoe</name> <name>Helen Zoe</name> <street>47 Eden Street</street> <street>47 Eden Street</street> <city>Cambridge</city> <city>Cambridge</city> <items> <items> <postcode>CB1 1JR</postcode> <postcode>CB1 1JR</postcode> <item partNum="833-AA"> <item partNum="833-AA"> </shipTo> </shipTo> <productName>Lapis necklace</productName> <productName>Lapis necklace</productName> <billTo xsi:type="ipo:USAddress"> <billTo xsi:type="ipo:USAddress"> <quantity>1</quantity> <quantity>1</quantity> <name>Robert Smith</name> <name>Robert Smith</name> <USPrice>99.95</USPrice> <USPrice>99.95</USPrice> <street>8 Oak Avenue</street> <street>8 Oak Avenue</street> <comment>Want this for the <comment>Want this for the <city>Old Town</city> <city>Old Town</city> holidays!</comment> holidays!</comment> <state>PA</state> <state>PA</state> <shipDate>1999-12-05</shipDate> <shipDate>1999-12-05</shipDate> <zip>95819</zip> <zip>95819</zip> </item> </item> </billTo> </billTo> <item partNum="926-AA"> <item partNum="926-AA"> <productName>Baby Monitor</productName> <productName>Baby Monitor</productName> <quantity>1</quantity> <quantity>1</quantity> <USPrice>39.98</USPrice> <USPrice>39.98</USPrice> <shipDate>1999-12-21</shipDate> <shipDate>1999-12-21</shipDate> </item> </item> </items> </items> </ipo:purchaseOrder> </ipo:purchaseOrder> 7
  8. 8. IBM Software Group | DB2 Information Management Software An XML Purchase Order <?xml version="1.0" encoding="UTF-8"?> <?xml version="1.0" encoding="UTF-8"?> <ipo:purchaseOrder <ipo:purchaseOrder Attribute Element Start Tag xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ipo="http://www.example.com/IPO" orderDate="1999-12-01"> xmlns:ipo="http://www.example.com/IPO" orderDate="1999-12-01"> <billTo xsi:type="ipo:USAddress"> <shipTo exportCode="1" xsi:type="ipo:UKAddress"> <shipTo exportCode="1" xsi:type="ipo:UKAddress"> <name>Helen Zoe</name> <name>Helen Zoe</name> <name>Robert Smith</name> <street>47 Eden Street</street> <street>47 Eden Street</street> <city>Cambridge</city> <city>Cambridge</city> Content <street>8 Oak Avenue</street> <postcode>CB1 1JR</postcode> <postcode>CB1 1JR</postcode> <items> <items> <item partNum="833-AA"> <item partNum="833-AA"> </shipTo> </shipTo> <city>Old Town</city> <billTo xsi:type="ipo:USAddress"> <billTo xsi:type="ipo:USAddress"> <productName>Lapis necklace</productName> <productName>Lapis necklace</productName> <quantity>1</quantity> <quantity>1</quantity> <state>PA</state> <name>Robert Smith</name> <name>Robert Smith</name> <street>8 Oak Avenue</street> <USPrice>99.95</USPrice> <USPrice>99.95</USPrice> <street>8 Oak Avenue</street> <comment>Want this for the <zip>95819</zip> <city>Old Town</city> <city>Old Town</city> <comment>Want this for the holidays!</comment> holidays!</comment> <state>PA</state> <state>PA</state> </billTo> <zip>95819</zip> <shipDate>1999-12-05</shipDate> <shipDate>1999-12-05</shipDate> End Tag <zip>95819</zip> </billTo> </billTo> </item> </item> <item partNum="926-AA"> <item partNum="926-AA"> <productName>Baby Monitor</productName> <productName>Baby Monitor</productName> <quantity>1</quantity> <quantity>1</quantity> <USPrice>39.98</USPrice> <USPrice>39.98</USPrice> <shipDate>1999-12-21</shipDate> <shipDate>1999-12-21</shipDate> </item> </item> </items> </items> </ipo:purchaseOrder> </ipo:purchaseOrder> 8
  9. 9. IBM Software Group | DB2 Information Management Software XML Characteristics • XML can represent flexibly structured data in text – Nesting – Repeating – Self-describing • XML is a universal language to represent e- Business data and transactions. • Platform-independent, and Unicode compliant • Easy to understand and easy to process (with the right tools) 9
  10. 10. IBM Software Group | DB2 Information Management Software XML Parsing and Serialization <book> <authors> <author id=“47”>John Doe</author> XML Parsing <author id=“58”>Peter Pan</author> </authors> <title>Database systems</title> <price>29</price> <keywords> <keyword>SQL</keyword> book <keyword>relational</keyword> </keywords> </book> authors title price keywords Database author author 29 keyword keyword Systems Serialization id=47 John Doe id=58 Peter Pan SQL relational 10
  11. 11. IBM Software Group | DB2 Information Management Software Simple Parsing v.s. Validation XML Document <employee> <name>Jason</name> <job>Software Engineer</job> </employee> Simple Parsing myschema.xsd (well-formedness checking) Validating Parsing with XML Schema (optional) XDM Instance XDM Instance 11
  12. 12. IBM Software Group | DB2 Information Management Software XML vs. Relational Relational XML <DEPARTMENT deptid="15" deptname="Sales"> Set oriented Sequences (ordered!) <EMPLOYEE> Structure Flexibly-structured <EMPNO>10</EMPNO> Strong schema Schema-variability <FIRSTNAME>CHRISTINE</FIRSTNAME> <LASTNAME>SMITH</LASTNAME> Strongly typed Optionally typed <PHONE>408-463-4963</PHONE> Tabular data model XML data model <SALARY>52750.00</SALARY> Flat Nested, hierarchical </EMPLOYEE> "Null" Not there at all <EMPLOYEE> <EMPNO>27</EMPNO> <FIRSTNAME>MICHAEL</FIRSTNAME> <LASTNAME>THOMPSON</LASTNAME> <SALARY>41250.00</SALARY> </EMPLOYEE> Department </DEPARTMENT> DEPTID DEPTNAME 15 Sales Employee DEPTID EMPNO FIRSTNAME LASTNAME PHONE SALARY 15 27 MICHAEL THOMPSON NULL 41250 15 10 CHRISTINE SMITH 408-463-4963 52750 12
  13. 13. IBM Software Group | DB2 Information Management Software Schema Evolution “Employees are now allowed to have multiple phone numbers…” <DEPARTMENT deptid="15" deptname="Sales"> Requires: <EMPLOYEE> • Normalization of existing data ! <EMPNO>10</EMPNO> <FIRSTNAME>CHRISTINE</FIRSTNAME> • Change of applications <LASTNAME>SMITH</LASTNAME> <PHONE>408-463-4963</PHONE> <PHONE>415-010-1234</PHONE> Phone <SALARY>52750.00</SALARY> EMPNO PHONE </EMPLOYEE> 27 406-463-1234 <EMPLOYEE> 10 415-010-1234 <EMPNO>27</EMPNO> <FIRSTNAME>MICHAEL</FIRSTNAME> 10 408-463-4963 <LASTNAME>THOMPSON</LASTNAME> <PHONE>406-463-1234</PHONE> Department Costly! <SALARY>41250.00</SALARY> DEPTID DEPTNAME </EMPLOYEE> </DEPARTMENT> 15 Sales Employee DEPTID EMPNO FIRSTNAME LASTNAME PHONE SALARY 15 27 MICHAEL THOMPSON 406-463-1234 41250 15 10 CHRISTINE SMITH 408-463-4963 52750 13
  14. 14. IBM Software Group | DB2 Information Management Software Where is XML Used? • Data interchange and Web development – HTML 4.0 => XHTML – Web services, SOAP, SOA – Web 2.0 AJAX • Documentation and content • Database development – Managing XML data – Use XML as a flexible hierarchical data model • … 14
  15. 15. IBM Software Group | DB2 Information Management Software Industry-specific XML Formats • To model complex and variable business data • To integrate businesses and applications Elements + Attributes HL7 CDA 3 945 + 477 Health Level 7, Clinical Document Architecture STAR 77319 + 625 Standards for Technology in Automotive Retail (OAGIS) … … FpML 4.2: 1867 + 196 Financial products Markup Language FIXML 4.4: 619 + 2593 Financial Information eXchange Protocol 15
  16. 16. IBM Software Group | DB2 Information Management Software Generate a Relational Schema for FpML In Relational: 485+ tables In pureXML: create table T(ID int, trade XML); 16
  17. 17. IBM Software Group | DB2 Information Management Software pureXML in DB2 9 • SQL XML data type and native storage • Designed specifically for XML – Supports XML hierarchical structure storage – Native operations and languages: XPath, SQL/XML, (XQuery in the future) • Not transforming into relational • Not using objects or nested tables • Not using LOBs • Integrated with relational engine, with all the utilities and tools support 17
  18. 18. IBM Software Group | DB2 Information Management Software Agenda • What is XML and where is XML used? • Overview of DB2 XML – SQL/XML feature in DB2 for z/OS V8 – what you can do with pureXML in DB2 9 for z/OS – Language interfaces and tools • Usage scenarios and business value of pureXML • DBA’s tasks for XML • Performance and scalability 18
  19. 19. IBM Software Group | DB2 Information Management Software SQL/XML Publishing Functions in V8 • Scalar functions – Formally called XML constructors – XMLELEMENT, XMLATTRIBUTES – XMLNAMESPACES – XMLFOREST – XMLCONCAT • Aggregate function: XMLAGG • Transient XML data type – internal use only • Cast function – Cast internal XML values to CLOB: XML2CLOB • Easier to use • Highly optimized – good performance XML Extender is available on V8, deprecated in V9, will be removed in the future. 19
  20. 20. IBM Software Group | DB2 Information Management Software V8 XML Functions Example <Department name="Shipping"> <emp>Lee</emp> <emp>Martin</emp> <emp>Oppenheimer</emp> </Department> SELECT XML2CLOB( XMLELEMENT(NAME "Department", XMLATTRIBUTES (e.dept AS "name" ), XMLAGG(XMLELEMENT(NAME "emp", e.lname) ORDER BY e.lname ) ) ) AS "dept_list" FROM employees e GROUP BY dept; 20
  21. 21. IBM Software Group | DB2 Information Management Software What You Can Do with pureXML - Managing XML data the same way as relational data • Create tables with XML columns XML DOC XML or alter table add XML columns • Insert XML data, optionally validated against schemas • Create indexes on XML data • Efficiently search XML data • Extract XML data XML Index • Decompose XML data into relational data or create relational view • Construct XML documents from relational and XML data XML Column • Handle XML in all the utilities and tools 21
  22. 22. IBM Software Group | DB2 Information Management Software What you can do with pureXML (1) CREATE TABLE PurchaseOrders ( ponumber varchar(10) not null, podate date not null, status char(1), XMLpo xml) IN MYDB.MYTS; CREATE TABLE PO LIKE PurchaseOrders; CREATE VIEW ValidPurchaseOrders as SELECT ponumber, podate, XMLpo FROM PurchaseOrders WHERE status = ‘A’; ALTER TABLE PurchaseOrders ADD revisedXMLpo xml; 22
  23. 23. IBM Software Group | DB2 Information Management Software XML Storage on Mature Infrastructure DocID index XML index (user) NodeID index B+tree B+tree B+tree B+tree B+tree B+tree Base Table Regular Internal XML Table Table space DocID … XMLCol DOCID MIN_NODEID XMLData (DB2_GENERATED_DOCID_FOR_XML) 1 1 02 2 2 02 3 2 0208 3 02 A table with an XML column has a DocID column, used to link from Each XMLData column is a VARBINARY, the base table to the XML table. containing a subtree or a sequence of A DocID index is used for getting subtrees, with context path. Rows in XML to base table rows from XML table are freely movable, linked with a indexes. NodeID index. 23
  24. 24. IBM Software Group | DB2 Information Management Software Storing XML Trees - Tree Packing 02 Node0 Node0 Each node contains Each node contains local node id, length and local node id, length and 02 optional number of optional number of Node1 Node1 children. children. Proxy nodes are used Proxy nodes are used 02 04 06 Node2 Node2 Node6 Node6 Node7 Node7 as placeholder for as placeholder for subtrees in aaseparate subtrees in separate record. record. 02 02 04 06 Node3 Node3 Node4 Node4 Node5 Node5 Node8 Node8 ItItsupports traversal supports traversal using firstChild, using firstChild, rid1 nextSibling, or nextSibling, or Rec Hdr Node0 (1) Node1 (3) Node 2 (p) nextNode. nextNode. RecHdr contains context Node6 Node7 (1) Node8 RecHdr contains context path information for the path information for the record ––absolute ID, rid2 Node3 record absolute ID, Rec Hdr Node 2 (3) Node4 path, in-scope path, in-scope namespaces namespaces Node5 All names use stringIDs. All names use stringIDs. 24
  25. 25. IBM Software Group | DB2 Information Management Software StringIDs to Optimize Storage and Processing String table dept 0 dept 4 employee 1 name employee employee 5 id 2 phone id=901 name phone office id=902 name phone office 3 office Tag names encoded John Doe 408-555-1212 344 Peter Pan 408-555-9918 216 as unique integers 0 4 4 5=901 1 2 3 5=902 1 2 3 John Doe 408-555-1212 344 Peter Pan 408-555-9918 216 25
  26. 26. IBM Software Group | DB2 Information Management Software XML Objects for Non-partitioned Base Table Default: Maxpartitions – 256, SEGSIZE 4, DSSIZE 4G NODEID XML INDEX Index Cols: DOCID MIN_NODEID XMLDATA DOCID Table for XMLCol1 INDEX PBG TS for XMLCol1 Cols: DOCID XML NODEID XMLCol1 INDEX Index XMLCol2 BASE Table Cols: DOCID MIN_NODEID Non-Partioned Base TS XMLDATA (simple, segmented, PBG) Table for XMLCol1 PBG TS for XMLCol2 26
  27. 27. IBM Software Group | DB2 Information Management Software XML Objects for Partitioned Base Table DSSIZE depends on base table page size (critical: max num parts) NodeID XML INDEX Index (NPI) (NPI) Part1 Part2 DOCID DOCID MIN_NODEID MIN_NODEID XMLDATA XMLDATA DOCID INDEX (NPI) Partitioned TS for XMLCol1 Cols: Cols: DOCID DOCID XMLCOL1 XMLCOL1 XMLCOL2 XMLCOL2 NodeID XML INDEX Index BASE Table BASE Table (NPI) (NPI) Part1 Part2 Part1 Part2 Partitioned Base TS 2 Parts, Table has 2 DOCID MIN_NODEID DOCID MIN_NODEID XMLDATA XMLDATA XML Coumns Partitioned TS for XMLCol2 27
  28. 28. IBM Software Group | DB2 Information Management Software What you can do with pureXML (2) EXEC SQL BEGIN DECLARE SECTION; SQL TYPE IS XML AS CLOB(1M) xmlPo; EXEC SQL END DECLARE SECTION; INSERT INTO PurchaseOrders VALUES (‘200300001’, CURRENT DATE, ‘A’, :xmlPo); INSERT INTO PurchaseOrders VALUES(‘200300002’, CURRENT DATE, ‘A’, XMLPARSE(DOCUMENT :vchar PRESERVE WHITESPACE) ); INSERT into PurchaseOrders VALUES( '200300001', CURRENT DATE, 'A', XMLPARSE(DOCUMENT DSN_XMLValidate(:lobPo,’SYSXSR.myPOSchema’)) ); UPDATE PurchaseOrders SET XMLpo = :XMPpo_new WHERE ponumber = ‘12345’; DELETE FROM PurchaseOrders WHERE ponumber = ‘12345’ OR XMLEXISTS(‘/purchaseOrder/items/item[shipDate < “2000-01-01”]’ PASSING XMLPO); 28
  29. 29. IBM Software Group | DB2 Information Management Software - ipo:purchaseOrder orderDate="1999-12-01" - shipTo exportCode="1" xsi:type="ipo:UKAddress" A Data Record -name: Helen Zoe View and -street: 47 Eden Street XPath Examples -city: Cambridge -postcode: CB1 1JR - billTo xsi:type="ipo:USAddress" ShipTo name: /ipo:purchaseOrder/shipTo/name -name: Robert Smith -street:8 Oak Avenue -city: Old Town BillTo: /ipo:purchaseOrder/billTo -state: PA -zip: 95819 Total amount: - items fn:sum(/ipo:purchaseOrder/items/item/xs:decimal(USPrice)) -item partNum="833-AA" -productName: Lapis necklace -quantity: 1 Product names: -USPrice: 99.95 /ipo:purchaseOrder/items/item/productName -comment: Want this for the holidays! -shipDate: 1999-12-05 Ship date of the first item: -item partNum="926-AA" /ipo:purchaseOrder/items/item[1]/shipDate -productName: Baby Monitor -quantity: 1 Item info with product “Baby Monitor”: -USPrice: 39.98 /ipo:purchaseOrder/items/item[productName = “Baby Monitor”] -shipDate: 1999-12-21 Sold price of product “Baby Monitor”: /ipo:purchaseOrder/items/item[productName = “Baby Monitor”]/USPrice 29
  30. 30. IBM Software Group | DB2 Information Management Software What you can do with pureXML (3) SELECT XMLpo INTO :xmlPo FROM PurchaseOrders WHERE ponumber = ‘200300001’; CREATE INDEX IDX1 ON PurchaseOrders(XMLPO) Generate Keys Using XMLPATTERN ‘//items/item/desc’ as SQL VARCHAR(100); SELECT XMLQUERY(‘declare namespace ipo="http://www.example.com/IPO"; /ipo:purchaseOrder/items/item/quantity’ PASSING XMLpo) FROM PurchaseOrders; SELECT XMLPO FROM PurchaseOrders WHERE XMLEXISTS(‘//items/item[desc = “Shoe”]’ PASSING XMLpo); SELECT XMLPO FROM PurchaseOrders, Product WHERE XMLEXISTS(‘//items/item[desc = $n]’ PASSING XMLpo, Product.name as “n”); This is a join between XML and a regular column, it can join XML documents also 30
  31. 31. IBM Software Group | DB2 Information Management Software XML Indexes <?xml version="1.0"?> <?xml version="1.0"?> <purchaseOrder orderDate="1999-10-20"> • XPath value index: index values <purchaseOrder orderDate="1999-10-20"> <shipTo country="US"> <shipTo country="US"> of elements or attributes inside a <name>Alice Smith</name> <name>Alice Smith</name> document. ... ... </shipTo> </shipTo> • Index entries include: <billTo country="US"> <billTo country="US"> <name>Robert Smith</name> (key value, DocID, NodeID, <name>Robert Smith</name> ... ... RIDx) </billTo> </billTo> <comment>Hurry, my lawn is going wild!</comment> <comment>Hurry, my lawn is going wild!</comment> • Support string (VARCHAR) or <items> <items> <item partNum="872-AA"> numeric (DECFLOAT) key type <item partNum="872-AA"> <desc>Lawnmower</desc> <desc>Lawnmower</desc> <quantity>1</quantity> <quantity>1</quantity> <USPrice>148.95</USPrice> CREATE INDEX IX1 ON <USPrice>148.95</USPrice> <comment>Confirm this is electric</comment> <comment>Confirm this is electric</comment> PurchaseOrders(XMLPO) </item> </item> <item partNum="926-AA"> Generate Keys Using <item partNum="926-AA"> <desc>Baby Monitor</desc> <desc>Baby Monitor</desc> XMLPATTERN <quantity>1</quantity> <quantity>1</quantity> <USPrice>39.98</USPrice> ‘/purchaseOrder/items/item/desc’ <USPrice>39.98</USPrice> <shipDate>2003-05-21</shipDate> <shipDate>2003-05-21</shipDate> as SQL VARCHAR(100); </item> </item> </items> </items> </purchaseOrder> This index can be used for predicate: </purchaseOrder> XMLEXISTS(‘/purchaseOrder/items/item[desc = “Baby Monitor”]’ passing XMLPO) 31
  32. 32. IBM Software Group | DB2 Information Management Software New Access Methods Access Methods Description DocScan “R” Base algorithm: given a document, scan and (QuickXScan) evaluate XPath DocID list access “DX” XMLExists(‘/Catalog/Categories/Product[RegPrice > unique DocID list from an 100]’ passing catalog) with index on XML index, then access ‘/Catalog/Categories/Product/RegPrice’ as SQL the base table and XML DECFLOAT table. DocID ANDing/ORing XMLExists(‘/Catalog/Categories/Product[RegPrice > “DX/DI/DU” union or 100 and Discount > 0.1]’ … ) intersect (unique) DocID With indexes on: lists from XML indexes, ‘//RegPrice’ as SQL DECFLOAT and then access the base ‘//Discount’ as SQL DECFLOAT table and XML table.
  33. 33. IBM Software Group | DB2 Information Management Software What you can do with pureXML (4) -- DML (construct XML from relational) Can construct XHTML for web pages. SELECT XMLDOCUMENT( XMLELEMENT(NAME “hr:Department", XMLNAMESPACES(‘http://example.com/hr’ as “hr”), XMLATTRIBUTES (e.dept AS "name" ), XMLCOMMENT(‘names in alphabetical order’), XMLAGG(XMLELEMENT(NAME “hr:emp", e.lname) ORDER BY e.lname ) ) ) AS "dept_list” FROM employees e GROUP BY dept; <?xml version=“1.0” encoding=“UTF-8”> <?xml version=“1.0” encoding=“UTF-8”> <hr:Department xmlns:hr=“http://example.com/hr” <hr:Department xmlns:hr=“http://example.com/hr” firstname lname dept name="Shipping"> name="Shipping"> Sean Lee Shipping <!-- names in alphabetical order --> <!-- names in alphabetical order --> Michael Johnson Accounting <hr:emp>Lee</hr:emp> <hr:emp>Lee</hr:emp> <hr:emp>Martin</hr:emp> <hr:emp>Martin</hr:emp> Vicky Oppenheimer Shipping <hr:emp>Oppenheimer</hr:emp> <hr:emp>Oppenheimer</hr:emp> Christine Martin Shipping </hr:Department> </hr:Department> 33
  34. 34. IBM Software Group | DB2 Information Management Software Another XML Publishing Example SELECT XMLSERIALIZE Query Result: (formatted for easy viewing) ( XMLELEMENT <DEPT DEPTNO="D01" NAME="DEVELOPMENT CENTER"> ( NAME "DEPT", XMLATTRIBUTES <PROJ PROJNO="AD3100" NAME="ADMIN SERVICES"> ( D.DEPTNO AS "DEPTNO", <EMP EMPNO="000010" EMPNAME="BRIAN BARTAK"> D.DEPTNAME AS "NAME" ) , </EMP> ( SELECT XMLAGG </PROJ> ( XMLELEMENT ( NAME "PROJ", <PROJ PROJNO="MA2100" NAME="WELD LINE AUTOMATION"> XMLATTRIBUTES <EMP EMPNO="000010" EMPNAME="BRIAN BARTAK"> (P.PROJNO AS "PROJNO", </EMP> P.PROJNAME AS "NAME" ) , <EMP EMPNO="000110" EMPNAME="VINCENZO LUCCHESI"> ( SELECT XMLAGG (XMLELEMENT </EMP> ( NAME "EMP", </PROJ> XMLATTRIBUTES </DEPT> (E.EMPNO AS "EMPNO", E.FIRSTNME || ' ' || E.LASTNAME AS "EMPNAME" ) ) ) FROM DBA015.EMPPROJACT EP, DBA015.EMP E WHERE EP.PROJNO = P.PROJNO AND EP.EMPNO = E.EMPNO ) ) XML ) FROM DBA015.PROJ P WHERE P.DEPTNO = D.DEPTNO ) ) ) AS CLOB FROM DBA015.DEPT D XML WHERE D.DEPTNO = 'D01' 34
  35. 35. IBM Software Group | DB2 Information Management Software What you can do with pureXML (5) SELECT XMLDocument( XMLElement(NAME "invoice", XMLAttributes( '12345' as "invoiceNo'), XMLQuery ('/purchaseOrder/billTo' PASSING xmlpo), XMLElement(NAME “purchaseOrderNo”, PO.ponumber) XMLElement(NAME "amount", XMLQuery ('fn:sum(/purchaseOrder/items/item/xs:decimal(USPrice))' PASSING xmlpo) ) ) ) FROM PurchaseOrders PO, WHERE PO.ponumber = ‘200300001’; 35
  36. 36. IBM Software Group | DB2 Information Management Software Aggregate/concatenate documents • From a same row: SELECT XMLELEMENT(NAME “doc”, XMLCONCAT(xd1, xd2, xd3) ) FROM T1, T2 WHERE … • From different rows: SELECT XMLELEMENT(NAME “doc”, XMLAGG(xd) ) FROM T1, T2 WHERE … • xd1, xd2, xd3, xd … can be any XML expressions 36
  37. 37. IBM Software Group | DB2 Information Management Software What you can do with pureXML (6) Leveraging the Power of SQL CREATE VIEW ORDER_VIEW AS XMLTable function was delivered via PTF UK33493 SELECT PO.POID, X.* FROM PurchaseOrders PO, XMLTABLE( '//item' PASSING PO.XMLPO COLUMNS "orderDate" DATE PATH '../../@orderDate', "shipTo City" VARCHAR(20) PATH '../../shipTo/city', "shipTo State" CHAR(2) PATH '../../shipTo/state', "Part #" CHAR(6) PATH '@partnum', "Product Name" CHAR(20) PATH 'productName', "Quantity" INTEGER PATH 'quantity', "US Price" DECIMAL(9,2) PATH 'USPrice', "Ship Date" DATE PATH 'shipDate', "Comment" VARCHAR(60) PATH 'comment' ) AS X; SELECT "Product Name", "shipTo State", SUM("US Price" * "Quantity") AS TOTAL_SALE Can join with other tables. FROM ORDER_VIEW GROUP BY "Product Name", "shipTo State"; SELECT "shipTo City", "shipTo State", RANK() OVER(ORDER BY SUM("Quantity")) AS SALES_RANK FROM ORDER_VIEW WHERE "Product Name" = 'Baby Monitor' GROUP BY "shipTo State", "shipTo City" ORDER BY SALES_RANK; 37
  38. 38. IBM Software Group | DB2 Information Management Software XMLTABLE: Table view of XML Data SELECT X.* FROM PurchaseOrders PO, XMLTABLE( '//item' PASSING PO.XMLPO COLUMNS "Seqno" FOR ORDINALITY, "Part #" CHAR(6) PATH '@partnum', "Product Name" CHAR(20) PATH 'productName', "Quantity" INTEGER PATH 'quantity', "US Price" DECIMAL(9,2) PATH 'USPrice', "Ship Date" DATE PATH 'shipDate', "Comment" VARCHAR(60) PATH 'comment' ) AS X; Seqno Part # Product Quantity US Price Ship Date Comment Name 1 833-AA Lapis 1 99.95 1999-12-05 Want this for necklace the holidays! 2 926-AA Baby Monitor 1 39.98 1999-12-21 38
  39. 39. IBM Software Group | DB2 Information Management Software What you can do with pureXML (7) Consume Web Services from SQL: How much is EUR1000 worth in USD? SELECT 1000 * XMLCAST( XMLQUERY('$d//*:ConversionRateResult' PASSING XMLPARSE (DOCUMENT DB2XML.SOAPHTTPNV( 'http://www.webservicex.net/CurrencyConvertor.asmx', 'http://www.webserviceX.NET/ConversionRate', '<?xml version="1.0" encoding="utf-8"?> <soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"> <soap:Body> <ConversionRate xmlns="http://www.webserviceX.NET/"> <FromCurrency>EUR</FromCurrency> <ToCurrency>USD</ToCurrency> </ConversionRate> </soap:Body> </soap:Envelope>') ) AS "d") AS DECIMAL(10,5)) FROM SYSIBM.SYSDUMMYU# 39
  40. 40. IBM Software Group | DB2 Information Management Software What you can do with pureXML (8) • XML Schema Repository (XSR), register schemas for – Schema validation – Annotated schema decomposition • Remote DRDA support • Host language support: C/C++, COBOL, PL/I, Assembly, Java, .Net • Utilities support for XML: – LOAD/UNLOAD record text data or file references – CHECK DATA – REORG – COPY – RECOVER – REBUILD – … 40
  41. 41. IBM Software Group | DB2 Information Management Software Encoding for XML • Textual XML Data Encoding – Internally-encoded (within XML data itself) • Encoding Declaration option or Byte Order Mark within XML data • Default: UTF-8 • Applies to binary variables (DB2 detects encoding) – Externally-encoded • Character host variables CCSID (override internal encoding) • DB2 uses UTF-8 to handle XML data. Character data in XML always stored as UTF-8 in XML column. • During insert, DB2 converts XML data to UTF-8 from the corresponding CCSID. • During select, DB2 converts XML in UTF-8 to host var encoding locally, or sends serialized XML data in UTF-8 format to remote requester. 41
  42. 42. IBM Software Group | DB2 Information Management Software XML Schema Support • XML Schema adds constraints on XML data (W3C Standard). • Register a schema in XML Schema Repository (XSR) – A set of DB2 provided user tables, and stored procedures • Schema validation (type annotation not kept) INSERT into PurchaseOrders VALUES( '200300001', CURRENT DATE, 'A', SYSFUN.DSN_XMLValidate(:lobPO,’SYSXSR.ORDERSCHEMA’)); • Annotated schema-based decomposition – store using tables. (XDBDECOMPXML stored proc) E.g. orderID ->PORDER.ORDERID <attribute name="orderID" type=“xs:string” db2-xdb:rowSet = “PORDER” db2-xdb:column= “ORDERID” /> annotations 42
  43. 43. IBM Software Group | DB2 Information Management Software Example: Use CLP to Register XML Schema • Register schema SchemaLocation REGISTER XMLSCHEMA http://www.n1.com/order.xsd FROM file://C:/xmlschema/order.xsd AS SYSXSR.ORDERSCHEMA SQL Identifier SchemaLocation ADD http://www.n1.com/lineitem.xsd FROM file://C:/xmlschema/lineitem.xsd ADD http://www.n1.com/parts.xsd FROM file://C:/xmlschema/parts.xsd COMPLETE [ENABLE DECOMPOSITION]; • Remove schema REMOVE XMLSCHEMA SYSXSR.ORDERSCHEMA; 43
  44. 44. IBM Software Group | DB2 Information Management Software Application Interfaces • XML type is supported in – Java (JDBC, SQLJ), ODBC, – C/C++, COBOL, PL/I, Assembly – .NET • Applications use: – XML as CLOB(n), XML as CLOB_FILE – XML as DBCLOB(n), XML as DBCLOB_FILE – XML as BLOB(n), XML as BLOB_FILE – All character or binary string types are supported • XMLParse and XMLSerialize apply (implicitly or explicitly) 44
  45. 45. IBM Software Group | DB2 Information Management Software Java JDBC Example Use standard interface: PreparedStatement pstmt = connection.prepareStatement("INSERT INTO PurchaseOders VALUES(?, ?)"); // second column: XML type … InputStream fin = new FileInputStream(file); pstmt.setBinaryStream( 2, fin, flen ); pstmt.execute(); Statement s = connection.createStatement(); ResultSet rs = s.executeQuery ("select ponumber, xmlpo from purchaseOrders"); while (rs.next()) { int po_no = rs.getInt ("ponumber"); String spo= rs.getString(2); System.out.println (spo); // uninterpreted flat xml text } Or use com.ibm.db2.jcc.DB2Xml interface 45
  46. 46. IBM Software Group | DB2 Information Management Software COBOL Application with Embedded SQL Example EXEC SQL BEGIN DECLARE SECTION END-EXEC. 01 xmlBuff USAGE IS SQL TYPE IS XML as CLOB(5K). EXEC SQL END DECLARE SECTION END-EXEC. EXEC SQL SELECT xmlCol INTO :xmlBuf from myTable where id = '001'. After Translation 01 xmlBuff. 49 xmlBuff-LENGTH PIC 9(9) COMP. 49 xmlBuff-DATA PIC X(length). If the length of the host variables is greater than 32K, translation becomes 01 xmlBuff. 02 xmlBuff-LENGTH PIC 9(9) COMP. 02 xmlBuff-DATA. 49 filler PIC X(32767). 49 filler PIC X(32767). : 49 filler PIC X(length - n * 32767). 46
  47. 47. IBM Software Group | DB2 Information Management Software Use Host Vars and Parameter Markers • XML parameter marker: CAST(? AS XML) • Inside XPath: XMLQUERY(‘/purchaseorder/items/item[desc = $x]’ PASSING XMLCOL, cast(? AS VARCHAR(20)) as “x”) XMLEXISTS(‘/purchaseorder/items/item[desc = $x]’ PASSING XMLCOL, :hv as “x”) XMLQUERY(‘/purchaseorder/items/item[desc = $x]’ PASSING CAST(? AS XML), cast(? AS VARCHAR(20)) as “x”) • Especially important for performance for short queries or OLTP applications. 47
  48. 48. IBM Software Group | DB2 Information Management Software Validate a doc – Example: COBOL using LOB • 1) XML in applications' memory (hostvar), if it's UTF-8, use BLOB hostvar is the best (use CLOB or DBCLOB if it's not UTF-8): EXEC SQL INCLUDE COBIIVAR END-EXEC. EXEC SQL INCLUDE COBSVAR END-EXEC. EXEC SQL INCLUDE SQLCA END-EXEC. ... 01 BLOB-VAR USAGE IS SQL TYPE IS BLOB(16M). ... EXEC SQL INSERT INTO T VALUES (XMLPARSE(DOCUMENT DSN_XMLVALIDATE(:BLOB-VAR, 'SYSXSR.MYSCHEMA')) ) END-EXEC. ... • 2) XML in a file, use LOB file reference variable is best. Assuming XML in UTF-8: ... 01 BLOBFR1 SQL TYPE IS BLOB-FILE. ... *** example file is in HFS MOVE '/u/li637/blob10a.txt' TO BLOBFR1-NAME. MOVE 20 TO BLOBFR1-NAME-LENGTH. MOVE SQL-FILE-READ TO BLOBFR1-FILE-OPTION. EXEC SQL INSERT INTO T VALUES (XMLPARSE(DOCUMENT DSN_XMLVALIDATE(:BLOBFR1, 'SYSXSR.MYSCHEMA')) ) END-EXEC. • 3) Use LOB locator to hold XML value… 48
  49. 49. IBM Software Group | DB2 Information Management Software Development Tools • Tool choices: – IBM Data Studio(http://www14.software.ibm.com/webapp/download/search.jsp?go=y&rs=swg-ids) – Rational Data Architect – Rational Application Developer – .NET – QMF – SPUFI – CLP (Command Line Processor) • They can help to perform: – Schema registration, validation – Annotation for decomposition – Mapping relational to XML schema for XML generation – XML editor, schema editor (in Data Studio) 49
  50. 50. IBM Software Group | DB2 Information Management Software V9 XML Feature Summary • First-class XML type, native storage of XQuery Data Model (XDM) • Complete SQL/XML constructor functions – XMLDOCUMENT, XMLTEXT, XMLCOMMENT, XMLPI – From V8: XMLELEMENT, XMLATTRIBUTES, XMLNAMESPACES, XMLFOREST, XMLCONCAT, XMLAGG • XMLPARSE, XMLSERIALIZE, XMLCAST • XML indexes – use for XMLEXISTS and XMLTABLE • Other SQL/XML functions with XPath – XMLEXISTS, XMLQUERY, XMLTABLE • XML Schema repository, Validation UDF, and decomposition • DRDA (distributed support) and application interfaces • Utilities and tools XMLTABLE and XMLCAST: available through PK51571, PK51572, PK51573 50
  51. 51. IBM Software Group | DB2 Information Management Software DB2 for z/OS XML Unique Features • Mature optimized storage infrastructure, compact storage with optional high-ratio compression – As little as 0.3 of original document size • Synergy with z/OS XML system services for highly efficient parsing, achieving unparalleled XML insertion and load performance • Schema validation exploiting latest XML validating parser (XLXP-C); • Efficient XML indexes for scalability; Patent pending highly efficient XPath algorithm for query performance; • Partitioned table space and data sharing support; • Redirection of XML processing to the zIIP specialty engine from DRDA workload; • 100% redirection of XML parsing to the zIIP or zAAP specialty engine in the near future. 51
  52. 52. IBM Software Group | DB2 Information Management Software Agenda • What is XML and where is XML used? • Overview of DB2 XML – SQL/XML feature in DB2 for z/OS V8 – what you can do with pureXML in DB2 9 for z/OS – Language interfaces and tools • Usage scenarios and business value of pureXML • DBA’s tasks for XML • Performance and scalability 52
  53. 53. IBM Software Group | DB2 Information Management Software Usage Scenarios: Directly Processing XML XML for data in motion - storing what is exchanged –SEPA/UNIFI, ACORD, FIXML, FpML, MIMSO, XBRL, … –DJXDM, HR-XML, HL7, ARTS, HIPAA, NewsML, XForms, … –Insurance policy, contract, purchase order, emails, etc. UNIFI (ISO 20022) X9 NACHA ECCHO Financial Services XML Standards 53
  54. 54. IBM Software Group | DB2 Information Management Software Usage Scenarios: Business Monitoring • Data analysis of intermittent process status – Immediate access to arriving data via XML tools and SQL • Data warehouse • Auditing Extracting Or • ... XML Operational Mapping Data Store Diverse Data XML in Sources Cleansing Data DB2 Warehouse 54
  55. 55. IBM Software Group | DB2 Information Management Software Usage Scenarios: Generating web pages: XHTML,... Feeds JSON Output XSLT Any text format including XML, PDF • Flexibility in Web service input and output formats • Alleviates ‘top-down’ Web service format requirements 55
  56. 56. IBM Software Group | DB2 Information Management Software Usage Scenarios: Relational Schema vs. XML • Versatile schemas and enable end-user customizable applications. –very difficult to reflect in a relational schema • Object persistence (single XML column v.s. many tables) –avoiding mapping of complex XML into many relational tables • Sparse attribute values (null v.s. absence) –Product features (e.g. car feature) 56
  57. 57. IBM Software Group | DB2 Information Management Software DB2 as Web Services Consumer … • Implementation – DB2 server-side routines (SOAP UDFs) – Light-weight native implementation – Can be called inside an SQL statement or from within other server-side routines • Process flow – Receive input parms from SQL statement – Compose HTTP/SOAP or HTTPS/SOAP request – Invoke TCPIP socket call and send Web Service request – Receive reply from Web Service Provider • Validate HTTP headers • Strip SOAP envelope and return SOAP Body • Tooling support for DB2 in RAD – Create a SQL UDF specifically for a WS operation from WSDL – Creates ‘wrapper’ SQL UDF to hide XML/relational mapping. 57
  58. 58. IBM Software Group | DB2 Information Management Software … and DB2 as Web Service Provider • Tooling integrated in IBM Data Studio – Query builder • Integrated Query Editor for SQL and XQuery • Graphical SQL Builder – Stored Procedure builder • Integrated debugger for Java and SQL PL procedures • Run and test procedures with DB2 and IDS • Rapid generation of Web services from database operations using intelligent defaults • No programming required • Assembles a “ready-to-deploy” Data Web Services (DWS) Web application • Integrated deploy and test environment • Automated Web service artifact creation – WSDL documents • Web service exploration and test tools • Support for DB2 pureXML and XSLT 58
  59. 59. IBM Software Group | DB2 Information Management Software DataStudio Tooling for Data Web Services 1. Develop Statements 2. Create Service 3. Drag ‘n drop Service assembly 4. Deploy Service 5. Test Service 59
  60. 60. IBM Software Group | DB2 Information Management Software Tax Forms • Usually hundreds- thousands of different tax forms Schema Diversity • Typically not every field in a form is used Sparse Data • Many forms change every year Schema Evolution A case for XML ! 60
  61. 61. IBM Software Group | DB2 Information Management Software Current relational storage, inefficient, anonymous Generic columns XML columns, requires complex mappings in the application col1 col2 col3 col4 col5 … col1000 134 NULL 11/23/05 NULL NULL … NULL NULL 276 NULL NULL Yes … NULL 12 NULL NULL 99.99 NULL … NULL NULL NULL NULL 123.23 NULL … No <form> … <wages>134</wages> <date>11/23/05</date> New XML format: </form> XML: Avoids sparsity. Proper data labeling. 2 columns, not 1000. Transformable. Extensible. Simplifies mapping. 61
  62. 62. IBM Software Group | DB2 Information Management Software XML and the Web before DB2 pureXML XML Client Application Server Relational Database Mapping Mapping XML Objects Relational schema • XML to object mapping and object to relational mapping – Very costly – Complex – Inflexible 62
  63. 63. IBM Software Group | DB2 Information Management Software DB2 pureXML and the Web XML SOA Gateway XML relational Client DB2 pureXML • End-to-End XML – No expensive object mapping – Pass Thru XML from/to database • SOA-Gateway – Device/application to handle network protocols, security, reliability, performance – Easy to manage • Simple pre- and post-processing of XML – e.g. via XSLT 63
  64. 64. IBM Software Group | DB2 Information Management Software XML as front to backend systems Interface Tables 1 Backend 1 XML Interface Tables 2 Backend 2 XML Interface Tables 3 Backend 3 XML relational DB2 pureXML Interface Tables n Backend n XML Physical tables or logical views 64
  65. 65. IBM Software Group | DB2 Information Management Software XML-Enabled Databases: Two Main Options CLOB/Varchar Shredding XML DOC Extract XML selected DOC Fixed “Decomposition” elements/attr. Mapping Shredder Side Tables XML DOC XML DOC XML DOC Varchar or clob (regular relational tables) (regular tables for column faster lookup) 65
  66. 66. IBM Software Group | DB2 Information Management Software DB2 pureXML Advantages • Directly store XML, no decomp/comp, normalize/de-normalize • Eliminates database schema evolution bottleneck Up to 10 times • Declarative language, reduce complexity, dramatically improve application development productivity • Native processing, high performance • Unparalleled reliability, availability, scalability 66
  67. 67. IBM Software Group | DB2 Information Management Software Business Value of pureXML • Speed up Application Development – Reduced system and development complexity – Improved developer productivity • Greater Business Agility – Easily accommodate changes to data and schemas – Update applications rapidly and reduce maintenance costs • Improved Business Insight – Access to information in otherwise unexploited documents – Unprecedented application performance • Consolidating converged information on System z – Reduce floor space, power consumption, cooling cost – Consolidate information resources and reduce admin cost 67
  68. 68. IBM Software Group | DB2 Information Management Software Agenda • What is XML and where is XML used? • Overview of DB2 XML – SQL/XML feature in DB2 for z/OS V8 – what you can do with pureXML in DB2 9 for z/OS – Language interfaces and tools • Usage scenarios and business value of pureXML • DBA’s tasks for XML • Performance and scalability 68
  69. 69. IBM Software Group | DB2 Information Management Software A few Things that are New related to XML 1. Implicitly created objects – DBA cannot create explicitly 2. XML indexes: XPath and keys 3. XML Schema registration 4. XMLDATA contains StringIDs in a catalog table SYSIBM.SYSXMLSTRINGS (dictionary) • UNLOAD FROMCOPY restricted, DSN1COPY 5. A new XML lock type, ID '35'x in traces • IFCID 20, 21, 107, 150, 172, and 196. 6. XML keyword in some utilities 69
  70. 70. IBM Software Group | DB2 Information Management Software System Configurations • Basic XML parsing requires z/OS XMLSS: z/OS 1.8 or z/OS 1.7 with APAR OA16303 • XML schemas requires IBM 31-bit SDK for z/OS, Java 2 Technology Edition, V5 (5655-N98), SDK5. And Java stored procedure setup. • Zparms for virtual storage: XMLVALA and XMLVALS – Default: 200MB and 10GB. – Also LOBVALA and LOBVALS impact bind-in and bind-out of XML • Buffer pool for XML tables (default BP16K0), authorization for users who create or alter tables with XML columns. – DEFAULT BUFFER POOL FOR USER XML DATA ===> BP16K0 BP16K0 - BP16K9 70
  71. 71. IBM Software Group | DB2 Information Management Software Overview of utilities for XML • No new special utilities for XML. • DB2 utilities support the XML data type and the related database objects – The XML data type in LOAD and UNLOAD, with file reference support – Support for the XML table spaces and tables used to store XML column values – Support for auxiliary relationships used to connect base tables to XML tables – Support for the base table space DocID index – Support for the XML table space NodeID index and XML indexes • No partition level checking for XML PBG UTS(APAR pk49033) • DSN1COPY copied data set containing XML objects cannot be moved to another system due to StringIDs (dictionary specific to each system). 71
  72. 72. IBM Software Group | DB2 Information Management Software DBA Considerations • Database administrators should treat XML database objects as they do in LOB database objects. • Like LOB objects, XML objects contain data stored outside the base table space. – XML table spaces and index spaces must be consistent also with their related base table • All the utilities either support or tolerate XML, with some specific XML keyword support. • In addition, XML has more considerations, such as table space size limit, compression, and indexes. 72
  73. 73. IBM Software Group | DB2 Information Management Software Table Space Size Consideration • Basic XML storage is about 0.3 (strip whitespace w/ compression) to 1.5 (preserve ws w/o compression) of original doc size • An XML table space always use 16KB pages. – For non range-partitioned base table spaces, PBG table space is used for XML. • Range-partitioned base table spaces: XML partitioning follows base table partitioning. • The number of rows to fit into a relational partition is limited by the number of documents to fit into an XML partition. – For example, 4K doc size, 32GB partition can roughly store 8M documents (or 7M to be safe). 73
  74. 74. IBM Software Group | DB2 Information Management Software REPORT TABLESPACESET - Output Contains both a LOB column and an XML column: POTABLE(I INT, CHARPO CLOB, PO XML). (New XML text is shown in red) TABLESPACE SET REPORT: TABLESPACE : DBMYPO.TSMYPO TABLE : ADMF001.POTABLE INDEXSPACE : DBMYPO.IRDOCIDP INDEX : ADMF001.I_DOCIDPOTABLE LOB TABLESPACE SET REPORT: TABLESPACE : DBMYPO.TSMYPO BASE TABLE : ADMF001.POTABLE TABLESPACE COLUMN LOB TABLESPACE : : CHARPO : DBMYPO.TSMYPO DBMYPO.TSLOBPO BASE TABLE AUX TABLE : : ADMF001.POTABLE ADMF001.TBLOBPO AUX INDEXSPACE : DBMYPO.IXLOBPO COLUMN AUX INDEX : : PO ADMF001.IXLOBPO XML TABLESPACE XML TABLESPACE SET REPORT: : DBMYPO.XPOT0000 XML TABLE : ADMF001.XPOTABLE TABLESPACE : DBMYPO.TSMYPO XML NODEID : ADMF001.POTABLE DBMYPO.IRNODEID BASE TABLE INDEXSPACE: COLUMN XML NODEID : PO INDEX : ADMF001.I_NODEIDXPOTABLE XML TABLESPACE : DBMYPO.XPOT0000 XMLXML INDEXSPACE TABLE : ADMF001.XPOTABLE DBMYPO.XMLIDX : XML NODEID INDEXSPACE: DBMYPO.IRNODEID XMLXML INDEX NODEID INDEX : ADMF001.XMLIDX : ADMF001.I_NODEIDXPOTABLE XML INDEXSPACE : DBMYPO.XMLIDX XML INDEX : ADMF001.XMLIDX 74
  75. 75. IBM Software Group | DB2 Information Management Software LISTDEF – define a list of related objects • Example: LISTDEF listname INCLUDE TABLESPACES TABLESPACE tsname [RI] [ALL | BASE | LOB | XML] • Valid specifications: – BASE (non-LOB and non-XML objects) – LOB (LOB objects) – XML (XML objects) – ALL (BASE, LOB, and XML objects) – TABLESPACES (related table spaces) – INDEXSPACES (related index spaces) – RI (related by referential constraints, including informational referential constraints) – Initial object can be: DB, TS, IS, Table, Index 75
  76. 76. IBM Software Group | DB2 Information Management Software UNLOAD XML Data • To unload XML data directly to output records, specify XML as the field type. – non-delimited format: a 2-byte length will precede the value of the XML. – For delimited output, no length field is present. – Limit to 32K in length • To unload XML data to separate files: – Specify CHAR(n)/VARCHAR(n) BLOBF, CLOBF or DBCLOBF for file names – Use the template control statement to create the XML output file and filename • UNLOAD FROMCOPY is restricted 76
  77. 77. IBM Software Group | DB2 Information Management Software File References – UNLOAD to HFS //CRTMNT //CRTMNT EXEC PGM=BPXBATCH, EXEC PGM=BPXBATCH, // PARM='sh mkdir /u/sample; chmod 777 /u/sample; // PARM='sh mkdir /u/sample; chmod 777 /u/sample; // // chown sysadm /u/sample' chown sysadm /u/sample' //STEP1 //STEP1 EXEC PGM=BPXBATCH, EXEC PGM=BPXBATCH, // PARM='sh mkdir /u/sample/clobf‘ // PARM='sh mkdir /u/sample/clobf‘ ... ... TEMPLATE TCLOBF DSN /u/sample/clobf TEMPLATE TCLOBF DSN /u/sample/clobf DIR(5) DSNTYPE(HFS) DIR(5) DSNTYPE(HFS) UNLOAD TABLESPACE SAMPLEDB.SAMPLETS UNLOAD TABLESPACE SAMPLEDB.SAMPLETS PUNCHDDN SYSPUNCH UNLDDN SYSREC PUNCHDDN SYSPUNCH UNLDDN SYSREC FROM TABLE ADMF001.SAMPLETB FROM TABLE ADMF001.SAMPLETB ( MYCOL1 ( MYCOL1 POSITION(*) DECIMAL(5,2) POSITION(*) DECIMAL(5,2) ,MYXML1 ,MYXML1 POSITION(*) VARCHAR CLOBF TCLOBF POSITION(*) VARCHAR CLOBF TCLOBF ) ) SAMPLE.UFILEREF.SYSREC: SAMPLE.UFILEREF.SYSREC: ß@ ß@ /u/sample/clobf/BI1OFEHQ /u/sample/clobf/BI1OFEHQ Ñ@ Ñ@ /u/sample/clobf/BI1OFEH0 /u/sample/clobf/BI1OFEH0 `@ `@ /u/sample/clobf/BI1OFEH1 /u/sample/clobf/BI1OFEH1 i@ i@ /u/sample/clobf/BI1OFEIN /u/sample/clobf/BI1OFEIN r@ r@ /u/sample/clobf/BI1OFEIO /u/sample/clobf/BI1OFEIO $ cd /u/sample/clobf $ cd /u/sample/clobf $ ls $ ls BI1OFEH0 BI1OFEH1 BI1OFEHQ BI1OFEH0 BI1OFEH1 BI1OFEHQ BI1OFEIN BI1OFEIN BI1OFEIO BI1OFEIO 77
  78. 78. IBM Software Group | DB2 Information Management Software LOAD XML Data • LOAD uses internal INSERT for XML data – honors LOG(NO). • To load XML directly from input records, specify XML as the field type. – LOAD DATA INDDN(INFILE) LOG NO RESUME(NO) FORMAT DELIMITED INTO TABLE PURCHASEORDERS – LOAD DATA INDDN(INFILE) LOG NO RESUME(NO) … XMLPO POSITION(20) XML PRESERVE WHITESPACE INTO TABLE PURCHASEORDERS • To load XML from files, specify CHAR or VARCHAR along with either BLOBF, CLOBF or DBCLOBF. • Schema validation not supported for LOAD. • XML compression takes effect after first REORG, not on initial LOAD. Same for FREEPAGE, PCTFREE. 78
  79. 79. IBM Software Group | DB2 Information Management Software File References – LOAD from HFS LOAD DATA INDDN SYSREC LOAD DATA INDDN SYSREC LOG NO RESUME YES LOG NO RESUME YES EBCDIC CCSID(00037,00000,00000) EBCDIC CCSID(00037,00000,00000) SORTKEYS SORTKEYS 10 10 INTO TABLE "ADMF001"."SAMPLETB“ INTO TABLE "ADMF001"."SAMPLETB“ WHEN(00001:00002) = X'0003‘ WHEN(00001:00002) = X'0003‘ ( "DSN_NULL_IND_00001" POSITION( 00003) ( "DSN_NULL_IND_00001" POSITION( 00003) CHAR(1) CHAR(1) , "MYCOL1" , "MYCOL1" POSITION( 00004:00006) DECIMAL PACKED POSITION( 00004:00006) DECIMAL PACKED NULLIF(DSN_NULL_IND_00001)=X'FF' NULLIF(DSN_NULL_IND_00001)=X'FF' , "DSN_NULL_IND_00002" POSITION( 00007) , "DSN_NULL_IND_00002" POSITION( 00007) CHAR(1) CHAR(1) , "MYXML1“ , "MYXML1“ POSITION( 00008) POSITION( 00008) VARCHAR CLOBF VARCHAR CLOBF NULLIF(DSN_NULL_IND_00002)=X'FF' NULLIF(DSN_NULL_IND_00002)=X'FF' ) ) 79
  80. 80. IBM Software Group | DB2 Information Management Software Operation and Recovery • To recover base table space, take image copies of all related objects – Use LISTDEF to define a list of related objects – (QUIESCE not needed in V9) • COPYTOCOPY may be used to replicate image copies of XML objects. • MERGECOPY may be used to merge incremental copies of XML table spaces. • Point in Time recovery (RECOVER TORBA, TOLOGPOINT) – All related objects, including XML objects must be recovered to a consistent point in time • Optional: CHECK utilities to validate base table spaces with XML columns, XML indexes and related XML table spaces. • If there is an availability issue with one object in the related set, availability of the others may be impacted. 80
  81. 81. IBM Software Group | DB2 Information Management Software Diagnosing Problem Related to XML Objects • Identify XML tables and their related objects – Run REPORT TABLESPACESET • Validate that the auxiliary index is consistent with the underlying table spaces – Run CHECK INDEX on all indexes, DocID, NodeID and XML value indexes • Validate the logical connection between the base table and XML table. – Run CHECK DATA against the base table space. • Use Repair to diagnose problem related to base table spaces with XML columns and their DocID index – Use REPAIR LOCATE KEY to locate a row using DocID key in the DocID index • Use Repair to diagnose problem related to XML table spaces and their NodeID index or XML Value Index – Use REPAIR LOCATE RID to locate a row using a RID. 81
  82. 82. IBM Software Group | DB2 Information Management Software Checking data integrity CHECK INDEX on DOCID, NODEID, XML indexes CHECK DATA on base tablespace ƒ SCOPE AUXONLY XML table space XML table space ƒ AUXERROR REPORT ƒ AUXERROR INVALIDATE 2 - CHECK INDEX Base table space 1 - CHECK INDEX NODEID XML Index Index DOCID Index Cols: Cols: DOCID DOCID XMLCOL NODEID XML Record Value 3 - CHECK DATA CHECK INDEX(2) 82
  83. 83. IBM Software Group | DB2 Information Management Software Performance Monitoring and Tuning • There are no special changes in DB2 Performance Expert to support XML other than minor points - such as new XML locks. • XML performance problem can be analyzed through accounting traces and performance traces. • There is a new LOAD MODULE for XML: DSNNXML in ADMF address space • XML indexes have the same consideration as other indexes. • The REORG utility should be used to maintain order and free space. • Run RUNSTATS for statistics to help pick XML indexes. 83
  84. 84. IBM Software Group | DB2 Information Management Software Checking query plan CREATE TABLE ACORD.REQUEST ( ID BIGINT NOT NULL PRIMARY KEY, REQUESTXML XML, RESPONSEXML XML ) IN DATABASE DBACORD CREATE INDEX ACORD.ACORDINDEX1 ON ACORD.REQUEST(REQUESTXML) GENERATE KEYS USING XMLPATTERN 'declare default element namespace "http://ACORD.org/Standards/Life/2"; /TXLife/TXLifeRequest/TransRefGUID' as SQL VARCHAR(24) CREATE INDEX ACORD.ACORDINDEX2 ON ACORD.REQUEST(REQUESTXML) GENERATE KEYS USING XMLPATTERN 'declare default element namespace "http://ACORD.org/Standards/Life/2"; /TXLife/TXLifeRequest/OLifE/Holding/Policy/@id' AS SQL VARCHAR(9) 84
  85. 85. IBM Software Group | DB2 Information Management Software Query plan (cont’ed) EXPLAIN PLAN SET QUERYNO = 101 FOR SELECT XMLQuery('declare default element namespace "http://ACORD.org/Standards/Life/2"; /TXLife/TXLifeRequest/OLifE/Holding/Policy/Life/Coverage/LifeParticipant' PASSING R.REQUESTXML), XMLQuery('declare default element namespace "http://ACORD.org/Standards/Life/2"; /TXLife/TXLifeRequest/OLifE/Party [@id = /TXLife/TXLifeRequest/OLifE/ Find participant information Holding/Policy/Life/Coverage/ LifeParticipant/@PartyID ] ' about a policy. PASSING R.REQUESTXML) FROM ACORD.REQUEST R WHERE XMLExists('declare default element namespace "http://ACORD.org/Standards/Life/2"; /TXLife/TXLifeRequest[TransRefGUID="2004-1217-141016-000012"]/ OLifE[Holding/Policy/@id="POLICY12"]' PASSING R.REQUESTXML) +-------------------------------------------------------------------------------------+ | PLANNO | ACCESSTYPE | MATCHCOLS | ACCESSCREATOR | ACCESSNAME | MIXOPSEQ | +-------------------------------------------------------------------------------------+ 1_| 1 | M | 0 | | | 0 | 2_| 1 | DX | 1 | ACORD | ACORDINDEX2 | 1 | 3_| 1 | DX | 1 | ACORD | ACORDINDEX1 | 2 | 4_| 1 | DI | 0 | | | 3 | +-------------------------------------------------------------------------------------+ 85
  86. 86. IBM Software Group | DB2 Information Management Software Steps in a picture DOCID list 4 3 INTERSECT DOCID IDX DOCID IDX DOCID list 1 DOCID list 2 RID list 6 1 2 5 Base Table NODEID IDX NODEID IDX XML IDX1 XML IDX1 XML IDX2 XML IDX2 DocID index XML index (user) NodeID index B+tree B+tree B+tree B+tree B+tree B+tree Base Table Internal XML Table DocID … XMLCol DOCID MIN_NODEID XMLData 1 1 02 2 2 02 3 2 0208 3 02 86
  87. 87. IBM Software Group | DB2 Information Management Software Agenda • What is XML and where is XML used? • Overview of DB2 XML – SQL/XML feature in DB2 for z/OS V8 – what you can do with pureXML in DB2 9 for z/OS – Language interfaces and tools • Usage scenarios and business value of pureXML • DBA’s tasks for XML • Performance and scalability 87
  88. 88. IBM Software Group | DB2 Information Management Software Performance and Scalability • XML storage leverages mature optimized storage infrastructure. Compact format, compressed well. • Next generation parsers: z/OS XML System Services and XLXP-C. – Validation is more expensive than well-formedness checking only • Highly efficient XPath streaming algorithm • Support partitioned table spaces and data sharing. • Initial sweet spot: a large number of small documents. 88
  89. 89. IBM Software Group | DB2 Information Management Software Customer Sample Performance Test 1 XML column 1 CLOB column 3 relational tables Application DB2 / ZIIP No indexes CPU El CPU EL CPU Elapsed XML/ XML/ CPU Elapsed XML/ XML/ XML Insert 260 10715 13375 26482 CLOBInsert 255 11554 1.02 0.93 12860 32715 1.04 0.81 RELInsert 998 29145 0.26 0.37 31396 66485 0.43 0.40 XMLRetr1 128 22106 4722 5148 XMLRetr2 148 15824 4798 5251 CLOBRetr 1228 17019 5309 11107 RELRetr 376 33874 17660 39970 XMLUpdate 262 5250 8744 15579 CLOBUpdate 178 5054 1.47 1.04 7272 22820 1.20 0.68 RELUpdate 781 22868 0.34 0.23 25698 51285 0.34 0.30 89
  90. 90. IBM Software Group | DB2 Information Management Software Storage for UNIFI Messages 180 160 140 Storage (KB) 120 100 70% Uncompress (KB) 80 52% 71% 58% Compress (KB) 67% 60 40 20 0 Original Docs DB2XML Strip WS DB2XML Pres WS 96 sample documents Strip WS: Strip Whitespaces Pres WS: Preserve Whitespaces 90
  91. 91. IBM Software Group | DB2 Information Management Software Table Compression Non compressed XML table Compressed XML table 500 400 Time in second 300 200 100 0 EL CPU EL CPU insert Xscan 91
  92. 92. IBM Software Group | DB2 Information Management Software Insert Performance (Batch) Elapsed CPU 300 250 3.9 millions 10K Time (Sec) 200 docs per hour or 1100 docs/sec 150 100 50 0 1K x 1000000 10K x 100000 100K x 10000 1M x 1000 10M x 100 Doc Size x Number Measurement in March 2007, z9 DS8300, Single thread, Docs in EBCDIC 92
  93. 93. IBM Software Group | DB2 Information Management Software Insert Performance – compare w/ CLOB XML XML w/ One index CLOB 140% 111% 116% 120% 100% 100% 100% 80% 71.36% 64% 60% 40% 20% 0% Elapsed CPU (average of 1K to 10M document insert performance) 93

×