ANSI SQL Transparent DynamicMultipath Hierarchical Structured       Data Processing For Relational, XML, & Legacy Data    ...
Previous Database HistoryHierarchical processing popular 40 years ago  Used user navigational processingMultipath hierarch...
SQL/XML Industry ProblemsOnly proprietary vendor solutions  All vendor solutions incompatible with each other  Integration...
SQL/XML Structured Data    Processing ProblemsLimited to linear single path processing  Query data selection limitations  ...
A Working ANSI SQL XML Solution Limits processing to only structured data   Allows unambiguous nonprocedural querying   En...
Structured Data Automatic        Processing BenefitsSQL transparent XML integration  ANSI SQL-92 syntax and semantics, not...
Current XML Hierarchical DataXML Was Created as a Markup LanguageUnstructured Mapped to SemistructuredStructured Vs. Semis...
Why Hierarchical Data     structures are PowerfulAutomatic Data and Path ReuseExtending Path Increases Data ValueAdding Pa...
Why Hierarchical Structured   Data Processing is PowerfulDynamic Multipath Query CombinationsMultipath Data Value Grows Co...
Hierarchical Data Structure TypesXML is self defining contiguous and nested  No foreign key relation needed to make struct...
Hierarchical Structured Data Makeup  Multiple Node Types  Nodes Support Multiple Data Occurrences  Naturally Built With 1 ...
SQL Hierarchical Structured Data Can Be Stored in Relational Rowsets Multiple Node Types in Relational Rowset Node Multipl...
Mapping Relational SQL to            Hierarchical XML Input View   Conceptual Hierarchical        Result                  ...
Relational Logical Hierarchical View CREATE VIEW EmpView AS SELECT * FROM Emp                             Emp             ...
XML Physical Hierarchical ViewCREATE XML CustView                                      CREATE VIEW CustView ASCust(       ...
Joining Heterogeneous ViewsSELECT EmpID, DpndID, EaddrID,       CustID InvID, AddrID    <root>FROM EmpView                ...
Logical Hierarchical Structures Offer Flexible Relational/XML Integration Logical      Physical                Data Model:...
Heterogeneous Data Structure Mashup Uses Linking Below RootSELECT EmpID, DpndID, InvID,    <root>       AddrID, EaddrID, C...
Association Table Use  SELECT EmpID, DpndID, EaddrID, CustID, InvoiceID          AddrID, IntersectData  FROM EmpView  LEFT...
Hierarchical Data Filtering, AutoOutput & Structure-free Processing Input View   Conceptual Hierarchical   Result         ...
Optimization, Global Views           and Node Promotion Input View   Conceptual Hierarchical   Result                     ...
Global Query Uses Global Views                           EmpCust              Global views allow entire         Global Que...
Node Promotion and Nested View CREATE VIEW EmpCust AS SELECT *                        Hierarchical views can be nested. FR...
Node Promotion Override and XMLFormat Change    <emp>                  <empid>Emp01</empid> SELECT EmpID, DpndID,         ...
Data Structure Mashup With Node   Promotion = Data Virtualization SELECT EmpID, DpndID,                     <root>        ...
Multipath Query Processing and its Required LCA ProcessingMultipath query references multiple pathways-  and uses a WHERE ...
Multipath LCA Query Logic for     Single SELECT Data  SELECT DpndID                      Lowest Common Ancestor (LCA)  FRO...
Multipath LCA Query Logic for    Multiple SELECT Data  SELECT DpndID, InvID               Lowest Common Ancestor (LCA)  FR...
Compound WHERE Clauses also         Require Their Own LCA                                       Lowest Common Ancestor (LC...
Data Driven       Variable Structure Generation SELECT EmpID, DpndID, InvID,          <root>        AddrID, EaddrID,      ...
Variable Structure Generation       Using Multiple ChoiceSELECT EmpID, DpndID, InvID,        Variable Structure       Addr...
Data Structure TransformationTwo Basic Types of Data Structure Transformation  Restructuring: uses data relationships in t...
Restructure TransformationSELECT X.EmpID, X.DpndID, Y.InvID, X.AddrIDFROM EmpCust YLEFT JOIN EmpCust X ON Y.InvCustID=X.Em...
Semantic Any-to-Any Structure      Reshaping TransformationSELECT X.DpndID, Y.EmpID, Y.EaddrIDFROM EmpView X              ...
Polymorphic Any-to-Any ReshapingSELECT X.DpndID, Y.EaddrID, Z.EmpIDFROM EmpView XLEFT JOIN EmpView Y ON X.DpndID=Y.DpndIDL...
Reshaping to Multipath StructureSELECT X.DpndID, Y.EaddrID, Z.EmpIDFROM EmpView XLEFT JOIN EmpView Y ON X.DpndID=Y.DpndIDL...
Multipath Querying Vs. Transform Multipath Multipath                 Transformed Transformed  ViewX ViewX Data            ...
Renaming, Replicating, & SplittingSELECT EmpID EmpName, DpndID AS DpndName, Nodes        Addr1.EaddrID AS AddrName,       ...
XML Duplicate and Shared  Unambiguous Node Processing                                                  This same SQL   SEL...
Nonlinear Hierarchical ORDER BY SELECT CustID, InvID, AddrID <root> FROM CustView                 <cust custid="Cust03"> O...
Hier & XML Middleware Enabler                                          Before Query: Query     SQL Pre Processing         ...
Dynamic Data Structure Creation Recap  SELECTed Data Output    Controls processing and output result structure    With aut...
The Power of Hierarchical   LEFT Outer Join Syntax RecapViewA LEFT JOIN ViewXON C.c=Z.c AND A.a=10              Shown on t...
Combining Basic Capabilities     ViewR                         Hierarchical     Examples:       A                 XML Resu...
Combining Advanced Capabilities ViewR     ViewX                     Data Mashup    Examples:                   CREATE XML ...
Advanced Capabilities 1Multipath nonlinear processing  Dynamic increase of data value using structure semantics  Processes...
Advanced Capabilities 2Dynamic automatic variable structure generation  Dynamically builds structures based on values in d...
Advanced Capabilities 3Polymorphic any-to-any structure transform  Uses data relationships or just structure semantics  Ut...
Advanced Capabilities Summary1) ANSI SQL standard and mathematically sound2) Ease of use (nonprocedural, navigationless, s...
Upcoming SlideShare
Loading in …5
×

ANSI SQL Transparent Multipath Hierarchical Structured Data Processing for Relational, XML & Legacy Data

942 views

Published on

ANSI SQL transparent XML multipath hierarchical processing with advanced new relational and hierarchical processing capabilities.

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
942
On SlideShare
0
From Embeds
0
Number of Embeds
21
Actions
Shares
0
Downloads
0
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

ANSI SQL Transparent Multipath Hierarchical Structured Data Processing for Relational, XML & Legacy Data

  1. 1. ANSI SQL Transparent DynamicMultipath Hierarchical Structured Data Processing For Relational, XML, & Legacy Data Michael M David and Lee Fesperman Advanced Data Access Technologies, Inc. www.adatinc.com The example results shown in this presentation can be reproduced on our online interactive prototype at www.adatinc.com/demo.html
  2. 2. Previous Database HistoryHierarchical processing popular 40 years ago Used user navigational processingMultipath hierarchical query processing used Used nonprocedural navigationless processing Made possible by automatic semantic processingRelational processing replaces hierarchical Also uses nonprocedural navigationless processing But more flexibility with data independence from joinHierarchical processing technology forgotten No Internet to store hierarchical processing knowledgeHierarchical structures popular again with XML User procedural navigation is back again Where is navigationless access for structured access?2
  3. 3. SQL/XML Industry ProblemsOnly proprietary vendor solutions All vendor solutions incompatible with each other Integration solutions are XML centric and proceduralNo satisfactory XML integration solution Hierarchical data value loss XML not fully integrated into SQL XML structured data processed as semistructuredNo hierarchical processing standard Invalid hierarchical processing result is possible Invalid hierarchical structure result is possibleMarkup and structured data processed the same Structured data needs to be processed differently! 3
  4. 4. SQL/XML Structured Data Processing ProblemsLimited to linear single path processing Query data selection limitations Relational join single path mindsetRequires user database navigation Loss of dynamic hierarchical processing Limits complex hierarchical processing Prevents seamless & transparent processingRequires specific vendor user training Requires XML and vendor trained users 4
  5. 5. A Working ANSI SQL XML Solution Limits processing to only structured data Allows unambiguous nonprocedural querying Enables navigationless schema-free operation Only performs hierarchical operations Only uses hierarchical Left Outer Join operation Naturally supports full hierarchical structures Allows hierarchical structure-aware operation Supports inherent hierarchical processing This solves relational/XML data integration This also solves SQL to XML system mapping Transparent multipath hierarchical processing 5
  6. 6. Structured Data Automatic Processing BenefitsSQL transparent XML integration ANSI SQL-92 syntax and semantics, not XML centric Nonprocedural and navigationless, Interactive No knowledge of structure necessary, schema-freeMultipath nonlinear hierarchical processing Hierarchically accurate results automatically Dynamic hierarchical processing optimization Dynamic output uses hierarchical result structureSQL hierarchical views fully functional Global views with no overhead and maximum reuse Global hierarchical queries now possible Hierarchical data filtering & XML keyword search 6
  7. 7. Current XML Hierarchical DataXML Was Created as a Markup LanguageUnstructured Mapped to SemistructuredStructured Vs. Semistructured DataSame as Fixed Vs. Fuzzy MeaningWhy Semistructured Requires Navigation Structured Semistructured Data Data The multiple B nodes in this A A structure make B C B C it ambiguous for querying D E B D without user Unambiguous Ambiguous navigation. 7 Query Structure Query Structure
  8. 8. Why Hierarchical Data structures are PowerfulAutomatic Data and Path ReuseExtending Path Increases Data ValueAdding Paths Increase Data Value FurtherHierarchical XML Becoming UbiquitousData Structures are Unambiguous Data/Path Creating more value than is 1) A/B A collected, 2) A/C B C automatic 3) A/C/D nonlinear data 4) A/C/E D E value increase. 8
  9. 9. Why Hierarchical Structured Data Processing is PowerfulDynamic Multipath Query CombinationsMultipath Data Value Grows ContinuallyEvery Node is Related to Every Other NodeNo. of Possible Queries Becomes UnlimitedAutomatic Processing = Unlimited Complexity Increasingly Complex Naturally utilizes the inherent Multipath Queries hierarchical semantics inMview multipath queries. 1) SELECT B,C FROM Mview A 2) SELECT B,D FROM Mview WHERE A=2B C 3) SELECT A,B FROM Mview WHERE E=5 4) SELECT B,C FROM Mview WHERE D=1 AND E=5 D E 5) SELECT D,E FROM Mview WHERE B=3 OR C=4 9
  10. 10. Hierarchical Data Structure TypesXML is self defining contiguous and nested No foreign key relation needed to make structureIBM’s IMS database is discontinuous Internally linkedStructured VSAM is contiguous With hierarchical level & data occurrence countsFlat tabular objects hierarchically modeled Relational tables, flat files, spread sheetsAll support same hierarchical operation and principles 10
  11. 11. Hierarchical Structured Data Makeup Multiple Node Types Nodes Support Multiple Data Occurrences Naturally Built With 1 to M Relationships Hierarchical Data Preservation Operation- Naturally Support Variable Length Paths Co C01 Not to be confused with 1 Co2 function driven single node M external hierarchical data Emp3 structure processing. Emp Emp1 1 1 Emp2 Suitable for IMS, Structured Proj3 VSAM, XML, COBOL FD and M M Proj4 hierarchically modeled Dpnd Proj Dpnd1 Proj2 Relational and flat data. 11
  12. 12. SQL Hierarchical Structured Data Can Be Stored in Relational Rowsets Multiple Node Types in Relational Rowset Node Multiple Data Occurrences in Rowset Multiple Pathways in Rowset Variable Length Pathways in Rowset C01 Relational Rowset Co2 Hierarchical Co Emp Dpnd Proj structure Emp3 Co1 Emp1 Dpnd1 preservedEmp1 in rowset, Emp2 Co1 Emp2 Proj2 in and back Proj3 Co2 Emp3 Proj3 out again. Proj4Dpnd1 Proj2 Co2 Emp3 Proj4 12
  13. 13. Mapping Relational SQL to Hierarchical XML Input View Conceptual Hierarchical Result Query A A Shown on this slide: SELECT A.a, D.d, E.e BB C FROM GlobalView D E 1. Conceptual Level WHERE B.b=“B1” D EE XML 2. Multipath ProcessingX D Output 3. Hierarchical Filtering 4. Dynamic Data Select Working Set Result Set 5. Access OptimizationA B C D E A D E 6. Global View SupportA1 B1 C1 D1 E1 A1 D1 E1A1 B1 C1 D2 E1 A1 D2 E1 7. Schema-free QueryA1 B1 C1 D1 E2 A1 D1 E2 8. Node PromotionA1 B1 C1 D2 D2 A1 D2 D2 9. Auto Output FormatA1 B2 C1 D1 E1 13
  14. 14. Relational Logical Hierarchical View CREATE VIEW EmpView AS SELECT * FROM Emp Emp Hierarchical SQL LEFT JOIN Dpnd data modeling and ON EmpID=DpndEmpID processing Dpnd Eaddr semantics defined. AND DpndCode=’D’ LEFT JOIN Eaddr ON EmpCustID=EaddrCustID; SELECT EmpID, DpndID, <root> <emp empid="Emp01"> EaddrID <dpnd dpndid="Dpnd01"/> FROM EmpView <eaddr eaddrid="Addr01"/> </emp> Logical hierarchical SQL <emp empid="Emp02"> view and semantics <eaddr eaddrid="Addr03"/> processed directly by relational engine making it </emp> operate fully hierarchically. </root> 14
  15. 15. XML Physical Hierarchical ViewCREATE XML CustView CREATE VIEW CustView ASCust( SELECT * FROM Cust CustID Char(8), LEFT JOIN Invoice CustStoreID Char(8)), ON CustID=InvIDInvoice( LEFT JOIN ADDR Converted to InvID Char(8), SQL CustView ON CustID=AddrID; InvCustID Char(8), SELECT Cust, Invoice, Addr InvStatus Char(8)) Parent Cust,Addr( FROM CustView AddrID Char(8), <root> AddrCustID Char(8), <cust custid="Cust01"> AddrState Char(8)) Parent Cust <invoice invid="Inv01"/> <invoice invid="Inv02"/> <addr addrid="Addr01"/> Cust This SQL created outer join view syntax is Invoice Addr used as a hierarchical map of the physical 15 IMS CustView for seamless operation.
  16. 16. Joining Heterogeneous ViewsSELECT EmpID, DpndID, EaddrID, CustID InvID, AddrID <root>FROM EmpView <emp empid="Emp01">LEFT JOIN CustView <dpnd dpndid="Dpnd01"/> <eaddr eaddrid="Addr01"/>ON EmpCustID=CustID <cust custid="Cust01"> <invoice invid="Inv01"/> Hierarchical Heterogeneous <invoice invid="Inv02"/> Structure Join Logical View <addr addrid="Addr01"/> Emp of Result </cust> </emp>Dpnd Eaddr Emp <emp empid="Emp02"> <eaddr eaddrid="Addr03"/> Cust Dpnd Eaddr Cust <cust custid="Cust03"> <addr addrid="Addr03"/>Invoice Addr Invoice Addr </cust>Emp logical and Cust physical views are </emp>both hierarchically modeled in the same </root>way in SQL making for a seamless, 16unified and logical hierarchical join.
  17. 17. Logical Hierarchical Structures Offer Flexible Relational/XML Integration Logical Physical Data Model: Logical Data Type Tables XML Natural hierarchical Mapping Common structures Features: solve Common Outer Join Modeling Abstraction hierarchical Data Modeling Consistency fixed structure Log. and Phy. Seamless problems. Structures have Capabilities: Emp same hierarchy Data Integration Dpnd Eaddr op principles Solves Rel and Hier Problems Separates Structure from Data Heterogeneous Hierarchical Structure Flexibility Cust Logical Offers Flexibility to Fixed Structures Invoice Addr Structure 17
  18. 18. Heterogeneous Data Structure Mashup Uses Linking Below RootSELECT EmpID, DpndID, InvID, <root> AddrID, EaddrID, CustID <emp empid="Emp01"> <dpnd dpndid="Dpnd01"/>FROM EmpView LEFT JOIN <eaddr eaddrid="Addr01"> CustView ON EaddrID=AddrID <cust custid="Cust01">WHERE CustID <> ”Cust02” <invoice invid="Inv01"/> <invoice invid="Inv02"/>Link Below Root Result Structure <addr addrid="Addr01"/> </cust> Emp Emp </eaddr> Dpnd Eaddr Dpnd Eaddr </emp> ...</root> Cust Cust Mashups allow linking anywhere into the lower level structure. WeInvoice Addr Invoice Addr determined this was valid and whatDashed arrow is linkage. the new semantic structure is.Solid arrow is new structure. 18
  19. 19. Association Table Use SELECT EmpID, DpndID, EaddrID, CustID, InvoiceID AddrID, IntersectData FROM EmpView LEFT JOIN AssociationTable On DpndID=DpndX LEFT JOIN CustView ON CustID=CustX Result Emp EmpView Structure Emp Association Table Dpnd Eaddr Association Dpnd Capabilities added: Eaddr Table • External RelationshipsDpndX CustX IntersectData • M to M Relationships IntersectData • Intersecting Data • Can be Transparent Cust CustView Cust • Retains Hier Structure Invoice Addr Invoice Addr 19
  20. 20. Hierarchical Data Filtering, AutoOutput & Structure-free Processing Input View Conceptual Hierarchical Result Automatic Query A A Structured XML SELECT A.a, D.d, E.e Output Formatting BB C FROM GlobalView D E WHERE B.b=“B1” D EEX D Hierarchical WHERE clause global data filtering Cousin nodes like node C above are filtered from B node This makes this a more internally complex multipath query Structure-free processing Navigationless XML access, no need to know structure Automatic structure-aware output formatting 20 Result structure known from outer join syntax modeling
  21. 21. Optimization, Global Views and Node Promotion Input View Conceptual Hierarchical Result This is a conceptual Query A A query where the input SELECT A.a, D.d, E.e structure is not known BB C FROM GlobalView D E and the output adapts WHERE B.b=“B1” D EE to the dynamic result.X D Hierarchical optimization can remove from access - Unreferenced data nodes not on path to referenced data Dynamically controlled by SQL’s variable SELECT list Optimization makes all views global views Because they have no overhead for unreferenced fields Node promotion closes around unselected nodes 21 This happens in relational processing too
  22. 22. Global Query Uses Global Views EmpCust Global views allow entire Global Query Emp structures to be defined and queried with noSELECT * Dpnd Eaddr Cust overhead, more userFROM EmpCust friendly. Global query Invoice Addr allows hierarchicalWHERE InvStatus=‘O’ filtering of entire view. Global Data Filter<root> <emp empid="Emp01" empstoreid="Store01" empcustid="Cust01" empstatus="F"> <dpnd dpndid="Dpnd01" dpndempid="Emp01" dpndcode="D"/> <eaddr eaddrid="Addr01" eaddrcustid="Cust01" eaddrstate="CA"/> <cust custid="Cust01" custstoreid="Store01"> <invoice invid="Inv02" invcustid="Cust01" invstatus="O"/> <addr addrid="Addr01" addrcustid="Cust01" addrstate="CA"/> </cust> </emp></root> 22
  23. 23. Node Promotion and Nested View CREATE VIEW EmpCust AS SELECT * Hierarchical views can be nested. FROM EmpView LEFT JOIN CustView Views can contain views. ON EmpCustID=CustID SELECT EmpID, DpndID, <root> InvID, AddrID <emp empid="Emp01"> FROM EmpCust <dpnd dpndid="Dpnd01"/> <invoice invid="Inv01"/> <invoice invid="Inv02"/> EmpCust Node <addr addrid="Addr01"/> Emp Promotion </emp> <emp empid="Emp02">Dpnd Eaddr Emp <addr addrid="Addr03"/> Dpnd Invoice Addr </emp> Cust </root>Invoice Addr Not SELECTed 23
  24. 24. Node Promotion Override and XMLFormat Change <emp> <empid>Emp01</empid> SELECT EmpID, DpndID, <dpnd> <dpndid>Dpnd01</dpndid> InvID, AddrID </dpnd> FROM EmpCust <cust> FOR XML ELEMENT KEEP NODE <invoice> <invid>Inv01</invid> EmpCust No Node </invoice> <invoice> Emp Promotion <invid>Inv02</invid> Emp </invoice> Dpnd Eaddr <addr> Dpnd Cust <addrid>Addr01</addrid> Cust </addr>Invoice Addr Invoice Addr </cust> </emp>Without node promotion, unselected <emp>nodes are output empty. XML output <empid>Emp02</empid>format was changed to Element style. 24
  25. 25. Data Structure Mashup With Node Promotion = Data Virtualization SELECT EmpID, DpndID, <root> InvID, AddrID <emp empid="Emp01"> FROM EmpView <dpnd dpndid="Dpnd01"/> LEFT JOIN CustView <invoice invid="Inv01"/> <invoice invid="Inv02"/> ON EaddrID=AddrID <addr addrid="Addr01"/>Link Below Root </emp> Result Output with <emp empid="Emp02"> Emp Node Promotion <addr addrid="Addr03"/> Dpnd Eaddr Emp </emp> </root> Dpnd Invoice Addr CustInvoice Addr Mashups with node promotion gives aggregated result of the data. This hasDashed boxes are unselected, not output. same effect as data virtualization. 25
  26. 26. Multipath Query Processing and its Required LCA ProcessingMultipath query references multiple pathways- and uses a WHERE data filtering operation For example: Selecting data based on data in another path This requires a special processing using LCALowest Common Ancestor (LCA) Node The LCA node is the lowest common ancestor node - Located between two pathway node points in the structure Used to keep the hierarchical query result meaningfulTwo types of SQL multipath LCA usage SELECT with WHERE clause referencing two different paths Compound WHERE clause referencing two different paths This LCA processing solves the XML Keyword Search problem These two uses of LCA can be combined causing nesting Each SELECT data item can have its own LCA 26
  27. 27. Multipath LCA Query Logic for Single SELECT Data SELECT DpndID Lowest Common Ancestor (LCA) FROM EmpCust processing for the SQL SELECT WHERE AddrID=’Addr01’ clause will automatically process multiple LCA nodes when multiple specified output data is located EmpCust across different pathways. Emp SELECT LCA Dpnd Eaddr Cust <root> <dpnd dpndid="Dpnd01"/> Invoice Addr </root>SELECT data types Dpnd andInvoice generate different LCAs. 27
  28. 28. Multipath LCA Query Logic for Multiple SELECT Data SELECT DpndID, InvID Lowest Common Ancestor (LCA) FROM EmpCust processing for the SQL SELECT WHERE AddrID=’Addr01’ clause will automatically process multiple LCA nodes when multiple specified output data is located EmpCust across different pathways. Emp SELECT LCAs <root> Dpnd Eaddr Cust <dpnd dpndid="Dpnd01"/> Invoice Addr <invoice invid=“Inv01"/> <invoice invid=“Inv02"/>SELECT data types Dpnd andInvoice generate different LCAs. </root> 28
  29. 29. Compound WHERE Clauses also Require Their Own LCA Lowest Common Ancestor (LCA) node SELECT DpndID processing for multipath queries is FROM EmpCust necessary. We found it working in SQL WHERE InvID=’Inv01’ naturally on both the SELECT and WHERE operations. Academic projects AND AddrID=’Addr01’ are currently researching how to add it to XQuery. LCA is the lowest node EmpCust between node points on separate paths. Emp SELECT LCA WHERE LCA <root>Dpnd Eaddr Cust <dpnd dpndid="Dpnd01"/> Invoice Addr <dpnd dpndid="Dpnd03"/>This LCA example is actually a </root>combined LCA and nested LCA example 29
  30. 30. Data Driven Variable Structure Generation SELECT EmpID, DpndID, InvID, <root> AddrID, EaddrID, <emp empid="Emp01" CustID, EmpStatus empstatus="F"> FROM EmpView <dpnd dpndid="Dpnd01"/> LEFT JOIN CustView <cust custid="Cust01"> <invoice invid="Inv01"/> ON EmpCustID=CustID <invoice invid="Inv02"/> AND EmpStatus=“F” <addr addrid="Addr01"/> </cust>Join Structure Variable Structure </emp> <emp empid="Emp02" Emp Emp empstatus="">Dpnd Eaddr </emp> Dpnd </root> Cust Cust Current EmpStatus data valueInvoice Addr controls if this CustView data Invoice Addr occurrence expands or not. 30
  31. 31. Variable Structure Generation Using Multiple ChoiceSELECT EmpID, DpndID, InvID, Variable Structure AddrID, EaddrID, <root> CustID, EmpStatus <emp empid="Emp01"FROM EmpView empstatus="F"> LEFT JOIN CustViewF <dpnd dpndid="Dpnd01"/> ON EmpCustID=CustID Insert Full Time View Here AND EmpStatus=“F” </emp> <emp empid="Emp02" LEFT JOIN CustViewP empstatus=“P"> ON EmpCustID=CustID Insert Part Time View Here AND EmpStatus=“P” </emp> </root> Emp Current EmpStatus data value Dpnd CustViewF CustViewP controls which view expands. Control value is up or down path. 31
  32. 32. Data Structure TransformationTwo Basic Types of Data Structure Transformation Restructuring: uses data relationships in the data Restructuring: Uses only the structure semantics in the dataRestructuring Used to bring out new data relationships Also removes data and nodes New structure is secondary, driven new relationships desiredReshaping Used to reshape data structure to a desired new structure No data relationships dependency Any-to-any data structure transformation is possible Reshaping is driven by structures natural semanticsSQL transformation operation assures accuracy Restructuring and Reshaping operations can be combined 32
  33. 33. Restructure TransformationSELECT X.EmpID, X.DpndID, Y.InvID, X.AddrIDFROM EmpCust YLEFT JOIN EmpCust X ON Y.InvCustID=X.EmpCustID <root> <invoice invid="Inv01"/>EmpCust X.Emp <emp empid="Emp01"> <dpnd dpndid="Dpnd01"/> X.Dpnd Eaddr X.Cust <addr addrid="Addr01"/> </emp> Y.Invoice Addr </Invoice> ....</root> Result Invoice Restructuring SQL transformation: 1) Takes structure apart Emp 2) Disregards unwanted portions 3) Recombines it differently -- Dpnd Addr 4) Uses different data relationships 5) This introduces new semantics 33
  34. 34. Semantic Any-to-Any Structure Reshaping TransformationSELECT X.DpndID, Y.EmpID, Y.EaddrIDFROM EmpView X <root>LEFT JOIN EmpView Y <dpnd dpndid="Dpnd01">ON X.DpndID=Y.DpndID <emp empid="Emp01"> <eaddr eaddrid="Addr01"/> Result </emp> EmpView Structure </dpnd> X Emp Dpnd </root> Dpnd Eaddr Emp Reshaping uses only the semantics of the data structure to transform it into the desiredY Emp Eaddr structure. No new data relationships are needed so this method can always be used.Dpnd Eaddr Linking below the root is utilized. 34
  35. 35. Polymorphic Any-to-Any ReshapingSELECT X.DpndID, Y.EaddrID, Z.EmpIDFROM EmpView XLEFT JOIN EmpView Y ON X.DpndID=Y.DpndIDLEFT JOIN EmpView Z ON Y.EaddrID=Z.EaddrID <root> EmpView Target <dpnd dpndid="Dpnd01"> Structure Emp <eaddr eaddrid="Addr01"> X Dpnd <emp empid="Emp01"/> Dpnd Eaddr </eaddr> Eaddr </dpnd> Y Emp Emp </root> Dpnd Eaddr This reshaping example moves only one node at a time which makes its operation polymorphic allowing any Z Emp shaped input structure that uses the same data names. Reshaping uses comparing the structure to itself for Dpnd Eaddr 35 positioning since data relationships are not relied on.
  36. 36. Reshaping to Multipath StructureSELECT X.DpndID, Y.EaddrID, Z.EmpIDFROM EmpView XLEFT JOIN EmpView Y ON X.DpndID=Y.DpndIDLEFT JOIN EmpView Z ON X.DpndID=Z.DpndID EmpView Target <root> Structure Emp <dpnd dpndid="Dpnd01"> X Dpnd <eaddr eaddrid="Addr01"/> Dpnd Eaddr <emp empid="Emp01"/> Eaddr Emp </dpnd> Emp Y </root> Dpnd Eaddr This reshaping example demonstrates reconstructing Emp Z to a multipath nonlinear structure using SQL’s hierarchical data modeling. The SQL hierarchicalDpnd Eaddr semantic operation helps preserve correctness. 36
  37. 37. Multipath Querying Vs. Transform Multipath Multipath Transformed Transformed ViewX ViewX Data ViewX ViewX Data B 12 16 A 10 A 10 10 B C 12 18 16 C 18 18SELECT A, C SELECT A, C A simpleFROM ViewX FROM ViewX multipathWHERE B>10 WHERE B>10 query can Output Data usually avoid Structure Replication Specifically transformsMultiple use and preservemultipath transforms 10 A 10 10 structure withschema-free structure A/B to more accuratequery uses & B/A changing 18 C 18 18 results.preserves semantics fromdata structure 1-to-M to M-to-1 37
  38. 38. Renaming, Replicating, & SplittingSELECT EmpID EmpName, DpndID AS DpndName, Nodes Addr1.EaddrID AS AddrName, Addr1.EaddrID AS AddrName, Addr2.Eaddrstate AS AddrStateFROM Emp AS EmployeeLEFT JOIN Dpnd AS Dependent ON EmpID=DpndEmpID and DpndCode=DLEFT JOIN EAddr AS Addr1 ON EmpCustID=Addr1.EAddrCustIDLEFT JOIN EAddr AS Addr2 ON Addr1.EaddrCustID=Addr2.EAddrCustID <root> <employee empname="Emp01">Dpnd Emp Eaddr <dependent dpndname="Dpnd01"/> Result <addr1 addrname="Addr01"> <addr2 addrstate="CA"/> Employee </addr1> </employee> Dependent Addr1 <employee empname="Emp02"> Addr2 <addr1 addrname="Addr03"> <addr2 addrstate="NV"/>The AS keyword renames XML data </addr1>items and nodes, it can be used to </employee> 38replicate nodes to help split them. </root>
  39. 39. XML Duplicate and Shared Unambiguous Node Processing This same SQL SELECT * FROM Dept handles both LEFT JOIN Cust ON DeptID=CustDeptID duplicate and shared data. LEFT JOIN Emp ON DeptID=EmpDeptID LEFT JOIN Addr AS AddrC ON CustID=AddrCustID LEFT JOIN Addr AS AddrE ON EmpID=AddrEmpID Ambiguous Unambiguous AmbiguousShared Structure Hierarchical Structure Duplicate Element Type Dept Dept Dept Cust Emp Cust Emp Cust Emp 39 Addr AddrC AddrE Addr Addr
  40. 40. Nonlinear Hierarchical ORDER BY SELECT CustID, InvID, AddrID <root> FROM CustView <cust custid="Cust03"> ORDER BY AddrID Desc, <addr addrid="Addr03"/> </cust> InvID Desc, <cust custid="Cust02"> CustID Desc <invoice invid="Inv03"/> <addr addrid="Addr04"/> CustView Ordering <addr addrid="Addr02"/> Cust </cust> <cust custid="Cust01"> Invoice AddrOrdering Inv before Cust is Trouble: Hierarchical ops like ORDER BY require special nonlinear hierarchical processing Cust1 Cust2 Cust1 to make sense for SQL Hierarchical processing. With Order By, each path is Inv1 Inv2 Inv3 ordered independently as in XML above. 40 Cust becomes out-of-order
  41. 41. Hier & XML Middleware Enabler Before Query: Query SQL Pre Processing Establish Views User’s SQL Preload XML (ETL) Real-time Processor SQL Pre Processing: Hier Access Operating Determines structure Hierarchically Optimizes structure Rewrite & submit SQL SQL Post Processing Real-time Access: Structure-aware This Unique Implementation: Retrieve XML (EII) • Supports XML enables hier processing Return XML as rowset • Uses in place SQL processor Retain XML data order • Needs no restaging of data SQL Post processing: • Not XML centric, no learning curve • ANSI SQL-92, vendor neutral Remove replicated data • Operates seamlessly & transparently Nonlinear data ordering • Protects current SQL investment Rowset to XML 41
  42. 42. Dynamic Data Structure Creation Recap SELECTed Data Output Controls processing and output result structure With automatic node promotion and collection Combining Data Structures Joining and mashup of heterogeneous structures Mashup & node promotion = data virtualization Generating Variable Data Structures Multiple choice of view structure generation Driven by data value further up or down the path Transforming Data Structures Restructuring using data relationships Reshaping using structure semantics Node splitting and renaming 42
  43. 43. The Power of Hierarchical LEFT Outer Join Syntax RecapViewA LEFT JOIN ViewXON C.c=Z.c AND A.a=10 Shown on this slide: Views Expanded • Automatically Combines and Unifies Heterogeneous StructuresA LEFT JOIN B ON A.a=B.a LEFT JOIN C ON A.a=C.a • Path Filtering Based Up/Down PathLEFT JOINX LEFT JOIN Y ON X.x=Y.y • Directly Executable by SQL Engine LEFT JOIN Z ON X.x=Z.zON C.c=Z.c AND A.a=10 • Flexible Recombinant Properties A A • Referencing Below Root is valid Outer join models joinB C B C of views, • Access optimization: paths sliced out semantics X X • Processing optimization: reorganized define result structure. • Naturally Hierarchically Distributable 43Y Z Y Z
  44. 44. Combining Basic Capabilities ViewR Hierarchical Examples: A XML Result Hier View Support B C Data Modeling X D E Hier. Preservation A1 Var. Length PathsCREATE View ViewR AS B1 D1 E1 Multiple Paths B2 E2SELECT * FROM A Multi-node TypesLEFT JOIN B ON A.a=B.a Dyn. Data SelectLEFT JOIN C ON A.a=C.a A1B1D1E1 Hier. Data FilteringLEFT JOIN D ON C.c=D.c A1B2 E2 Node PromotionLEFT JOIN E ON C.c=E.c Node Collection Optimized Access A1B1C2D1E1 SELECT A.a,B.b,D.d,E.e FROM ViewR Global View A1 C1D1 A1B2C2 E2 WHERE C=‘C2’ Schema-free 44
  45. 45. Combining Advanced Capabilities ViewR ViewX Data Mashup Examples: CREATE XML A X ViewX AS A XML View B C Y Z X(XID Char(8)), C Heterogeneous Y(YID Char(8), Data Integration YFK Char(8)) X MashupCREATE View Parent X,ViewR AS Z(ZID Char(8), Y Z Var. StructuresSELECT * FROM A ZFK Char(8)) LCA ProcessingLEFT JOIN B Parent X ON A.a=B.a A Linear FilteringLEFT JOIN C Converted to Hier. Ordering ON A.a=C.a B C Outer Join Preserve Nodes X Unified View SELECT A.a, C.c, Y.y, Z.z XML Keyword FROM ViewR LEFT JOIN ViewX Y Z Search ON C.c=Z.c AND A=‘A1’ WHERE Y=‘Y1’ OR Z=‘Z2’ A1C2X1 Y1 Z1 Hierarchical ORDER BY A.a, Y.y, Z.z A2C2X2 Y2 Z2 XML Result FOR XML KEEP NODE A3C3X3 Y3 Z3 45
  46. 46. Advanced Capabilities 1Multipath nonlinear processing Dynamic increase of data value using structure semantics Processes all queries regardless of internal complexity Hierarchical data filtering-- XML Keyword SearchMultipath hierarchical data structure joins Performed by simple join of hierarchical structure views Can be performed interactively and heterogeneously Dynamically combines hierarchical structure semanticsLinking below root allows structure mashup Enables capability to mashup structures meaningfully Includes powerful look-back and look-ahead capability Mashup + node promotion = data virtualization Enables SQL Transformations 46
  47. 47. Advanced Capabilities 2Dynamic automatic variable structure generation Dynamically builds structures based on values in data Can utilize cascading and embedded view operation Operates across multiple paths independentlyDynamic user control of structure Dynamic SELECT list controls processing structure Dynamic combining views builds data structure Transformations Change StructureDynamic Structured Output Formatting Structure-aware processing knows active structure Uses structure of result to format output data Active structure is dynamically controlled by user 47
  48. 48. Advanced Capabilities 3Polymorphic any-to-any structure transform Uses data relationships or just structure semantics Utilizes SQL and hierarchical rowset data Two types of transforms, Restructure and ReshapingMultipath nonlinear hierarchical Operations Single Order By orders multiple pathways separately Aggregation operates on separate pathwaysMultipath internal LCA nonlinear processing LCA processing limits the domain across pathways Is automatic in ANSI SQL, but not in XQuery Responsible for keeping multipath result meaningfulUntested natural and automatic operations Natural hierarchical distributed processing Automatic parallel processing is possible Semantic web RDF to SQL, SQLfX driven by RDF 48
  49. 49. Advanced Capabilities Summary1) ANSI SQL standard and mathematically sound2) Ease of use (nonprocedural, navigationless, schema-free)3) Hierarchically correct (principled multipath processing)4) Greater efficiency (hierarchical processing optimization)5) Fully interactive (dynamically process native XML)6) Conceptual hierarchical processing on full structures7) Queries can operate across the entire hierarchical structure8) Nonlinear multipath LCA hierarchical processing9) Full nonlinear hierarchical data structure mashupsA) Variable data generated structure controlB) Any-to-any polymorphic structure transformationsC) All operations are semantically controlled and accurateD) Data virtualization supportedE) Natural distributed hierarchical processingF) Automatic parallel processing is possibleAll of the capabilities shown in this presentation can be reproduced 49 on our online interactive demo at www.adatinc.com/demo.html

×