
XML In The Real World - Use Cases For Oracle XMLDB

XML In The Real World - Use Cases For Oracle XMLDB: why to use XML in the first place, and why unstructured data is wanted as a data source.


  1. 1. XML In The Real World Use Cases for Oracle XML Database Marco Gralike – AMIS
  2. 2. Overview <ul><li>Unstructured Data </li></ul><ul><li>Unstructured Data Solutions </li></ul><ul><li>Why XML ? </li></ul><ul><li>Oracle XML Database - Use Cases </li></ul>
  3. 3. Unstructured Data
  4. 4. What is “Unstructured Data” <ul><li>Documents, Whitepapers </li></ul><ul><li>Reports </li></ul><ul><li>Spreadsheets </li></ul><ul><li>WebPages, Blog Content </li></ul><ul><li>Email </li></ul><ul><li>Images, Audio, Video </li></ul><ul><li>Organizations can also use this data to gain a competitive advantage </li></ul>
  5. 5. Is Unstructured Data Relevant? <ul><li>80 percent of business is conducted on unstructured information (Gartner Group) </li></ul><ul><li>85 percent of all data stored is held in an unstructured format (Butler Group) </li></ul><ul><li>Unstructured data doubles every three months (Gartner Group) </li></ul><ul><li>7 million web pages are added every day (Gartner Group) </li></ul><ul><li>Source: http://www.b-eye-network.com/view/2098 (2005) </li></ul>
  6. 6. Context: “Which result do you expect?” <ul><li>You get “the correct” answer, because it is hardcoded… </li></ul>
  7. 7. Labels…? <ul><li>Use the wrong “labels” and you will NOT find the data! </li></ul>
  8. 8. Labels…? (Cont.) <ul><li>Diablo Game Forum </li></ul><ul><li>A User Forum </li></ul><ul><li>Discussions about Religion </li></ul><ul><li>Info about Greek Heroes </li></ul><ul><li>Baby Names </li></ul>
  9. 9. The Actual Info Needed… <ul><li>Context: “This info is relevant to ME, but maybe only to ME” </li></ul>
  10. 10. Unstructured Data “Solutions”
  11. 11. How to Deal with Unstructured Data? <ul><li>Store it: in Memory or on Disk </li></ul><ul><ul><li>File servers, Content Management Systems, Data Warehouse, Websites </li></ul></ul><ul><li>Categorize the data </li></ul><ul><ul><li>Labeling </li></ul></ul><ul><ul><li>Foldering </li></ul></ul><ul><ul><li>Attempts to define Relationships (Context: “label” + “folder”) </li></ul></ul><ul><ul><li>Business Taxonomy </li></ul></ul><ul><li>Purpose: </li></ul><ul><ul><li>Trying to Retrieve and Store Unstructured data in such a way that the context of the information is still valid. </li></ul></ul>
  12. 12. Everyone Tries to Find “The” Solution? <ul><li>Everyone is in need of answers: </li></ul><ul><ul><li>E.g. “Google” </li></ul></ul><ul><ul><li>Data Warehouses - Business Intelligence </li></ul></ul><ul><ul><li>Content Management Software </li></ul></ul><ul><ul><li>Etc. </li></ul></ul><ul><li>Because the content has been created by humans, it has a free format </li></ul><ul><li>To date there is no UNIVERSAL solution for </li></ul><ul><ul><li>How to store it </li></ul></ul><ul><ul><li>How to access it </li></ul></ul><ul><ul><li>How to make information retrieval performant </li></ul></ul>
  13. 13. Unstructured Data Issues <ul><li>Keeping it intact? </li></ul><ul><li>Much empty space! </li></ul><ul><li>How to find “labels”? </li></ul><ul><li>Access Optimization? </li></ul><ul><li>Break it up into “labels” (shredding) </li></ul><ul><li>How to Rebuild the Tree after it has been pulverized? </li></ul><ul><li>Is it still the Same Tree after rebuilding? </li></ul>
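
The last three questions on slide 13 (shredding, rebuilding, and whether the result is still the same tree) can be made concrete with a small sketch. It assumes Python's standard `xml.etree.ElementTree`; the `shred` and `rebuild` helpers and the sample document are hypothetical illustrations, not a description of how Oracle XML DB stores XML.

```python
import xml.etree.ElementTree as ET

def shred(root):
    """Break a tree into flat rows: (node_id, parent_id, tag, text)."""
    rows = []

    def visit(elem, parent_id):
        node_id = len(rows)  # pre-order position doubles as the row id
        rows.append((node_id, parent_id, elem.tag, (elem.text or "").strip()))
        for child in elem:
            visit(child, node_id)

    visit(root, None)
    return rows

def rebuild(rows):
    """Reassemble a tree from the rows produced by shred()."""
    nodes = {}
    root = None
    # Pre-order ids guarantee that a parent is created before its children.
    for node_id, parent_id, tag, text in sorted(rows, key=lambda r: r[0]):
        if parent_id is None:
            elem = ET.Element(tag)
            root = elem
        else:
            elem = ET.SubElement(nodes[parent_id], tag)
        elem.text = text or None
        nodes[node_id] = elem
    return root

# Hypothetical document with mixed content (text on both sides of <b>).
original = "<report><title>Q1</title><body>Sales <b>rose</b> sharply.</body></report>"
rows = shred(ET.fromstring(original))
rebuilt = ET.tostring(rebuild(rows), encoding="unicode")
same_tree = rebuilt == original
```

With this naive row layout the element structure survives, but the text that follows `<b>` and the whitespace around "Sales" are not captured, so `same_tree` is false for the mixed-content sample. A shredding scheme has to record such details explicitly if the rebuilt document must match the original.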
  14. 14. Why XML…?
  15. 15. What is XML? <ul><li>The Extensible Markup Language (XML) is a general-purpose specification for creating custom markup languages </li></ul><ul><li>It is classified as an extensible language because it allows its users to define their own elements </li></ul><ul><li>It is designed to be relatively human-legible </li></ul><ul><li>Its primary purpose is to help information systems share structured data, particularly via the Internet </li></ul><ul><li>It is used both to encode documents and to serialize data </li></ul><ul><li>It is a fee-free open standard </li></ul>
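
As a minimal sketch of "users define their own elements" and "serialize data", the following round-trips a flat record through a made-up `<invoice>` vocabulary. It assumes Python's standard `xml.etree.ElementTree`; the element names, the sample values, and the `to_xml` / `from_xml` helpers are invented for illustration.

```python
import xml.etree.ElementTree as ET

def to_xml(record):
    """Serialize a flat dict into an <invoice> element with one child per key."""
    root = ET.Element("invoice")
    for name, value in record.items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")

def from_xml(text):
    """Parse the XML text back into a dict of tag -> text."""
    return {child.tag: child.text for child in ET.fromstring(text)}

# Hypothetical record; the tags are our own vocabulary, not a standard schema.
record = {"customer": "Example Corp", "amount": "125.00", "currency": "EUR"}
xml_text = to_xml(record)
round_trip = from_xml(xml_text)
```

The receiving side only needs to agree on the tag names to read the data back, which is the "glue between systems" role the deck goes on to describe.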
  16. 16. Why use XML with unstructured data…? <ul><li>It is universally accepted by software vendors as the glue between systems for (unstructured) data transport </li></ul><ul><li>It is easy to use: its free-format nature appeals to humans </li></ul><ul><li>Due to its free-format nature, it is (mis)used for everything: </li></ul><ul><ul><li>Transport of data </li></ul></ul><ul><ul><li>Storage of data </li></ul></ul><ul><ul><li>Data transformation, evaluation, description, retrieval </li></ul></ul><ul><li>It seems to be the best fit </li></ul><ul><li>It has the ability to solve the problem </li></ul>
  17. 17. The Ability to Solve the Problem <ul><li>Store it: in Memory or on Disk </li></ul><ul><ul><li>File servers, Content Management Systems, Data Warehouse, Websites </li></ul></ul><ul><li>Categorize the data </li></ul><ul><ul><li>Labeling </li></ul></ul><ul><ul><li>Foldering </li></ul></ul><ul><ul><li>Attempts to define Relationships (Context: “label” + “folder”) </li></ul></ul><ul><ul><li>Business Taxonomy </li></ul></ul><ul><li>Purpose: </li></ul><ul><ul><li>Trying to Retrieve and Store Unstructured data in such a way that the context of the information is still valid. </li></ul></ul><ul><li>Vendor Storage Solution: e.g. Oracle XMLType </li></ul><ul><li>XML Schemata (XBRL, Health Care, etc.) </li></ul><ul><li>XQuery, XPath, etc. </li></ul>
  18. 18. XML Schemata – Taxonomy <ul><li>Mathematical </li></ul><ul><ul><li>MathML - Mathematical Markup Language </li></ul></ul><ul><li>Graphical User Interfaces </li></ul><ul><ul><li>GLADE - GNOME's User Interface Language (GTK+) </li></ul></ul><ul><ul><li>KParts - KDE's User Interface Language (Qt) </li></ul></ul><ul><ul><li>XUL - XML User Interface Language (Native) </li></ul></ul><ul><ul><li>XAML - Microsoft's Extensible Application Markup Language </li></ul></ul><ul><li>Metadata </li></ul><ul><ul><li>RDF - Resource Description Framework </li></ul></ul><ul><ul><li>ONIX - ONline Information eXchange </li></ul></ul><ul><ul><li>DDML - Document Definition Markup Language, a reformulation of XML DTDs </li></ul></ul>
  19. 19. XML Schemata – Taxonomy (Cont.) <ul><li>Bookmarks </li></ul><ul><ul><li>XBEL - XML Bookmark Exchange Language </li></ul></ul><ul><li>Business </li></ul><ul><ul><li>ARTSXML - Retail XML schema specification by the Association for Retail Technology Standards </li></ul></ul><ul><ul><li>UBL - Defining a common XML library of business documents (purchase orders, invoices, etc.) by OASIS </li></ul></ul><ul><ul><li>HR-XML </li></ul></ul><ul><ul><li>XBRL - Extensible Business Reporting Language for business accounting under International Financial Reporting Standards (IFRS) and United States Generally Accepted Accounting Principles (GAAP) </li></ul></ul>
  20. 20. Business XML Schemata… (source: Altova) <ul><li>EAD SPACE/XML AgXML STAR BioML CellML EMBLxml </li></ul><ul><li>B2MML BatchML BPEL4WS CAM ebXML UBL xCBL Chem </li></ul><ul><li>eStandards CML DocBook SCORM FIXML FpML IFX MDDL </li></ul><ul><li>XBRL gdmxml KML GML CAP EDXL EML HL7 SNOMED-CT </li></ul><ul><li>HR-XML ACORD for Life ACORD XML for Property & Casualty </li></ul><ul><li>Global JXDM OWL RDF/XML XMI SMIL SVG PIDX </li></ul><ul><li>XMML cTOC PIM DES Google Sitemap Protocol Atom ICE </li></ul><ul><li>JDF PROSE/XML RSS CDF ML SAML Microsoft Office 2003 </li></ul><ul><li>Open Office XML WSDL VoiceXML IRS Tax XML OTA </li></ul><ul><li>DWML, etc. </li></ul>
  21. 21. Origins by Industry / Field (source: Altova) <ul><li>Archiving, Advertising, Agriculture, Automotive, Biology, </li></ul><ul><li>Business, Chemistry, Documentation, Education, Financial, </li></ul><ul><li>Genealogy, Government, Healthcare, Human Resources, </li></ul><ul><li>Insurance, Legal, Manufacturing, Mathematics, Media, Metadata, </li></ul><ul><li>Natural Resources, Pharmaceuticals, Publishing, Real Estate, </li></ul><ul><li>Recreation, Science, Security, Software, Speech, Taxes, Travel, </li></ul><ul><li>Weather, etc. </li></ul>
  22. 22. Oracle XML(DB) Use Cases
  23. 23. “Universal” Data Storage (UWV Example) <ul><li>The historical data storage for end-of-life systems </li></ul><ul><ul><li>Data Sources: UVI relational, hierarchical, network, object-oriented databases </li></ul></ul><ul><ul><li>Data Target: UWV – Historische Gegevens Opslag (XMLDB) </li></ul></ul><ul><li>Reasons: </li></ul><ul><ul><li>Data still needed </li></ul></ul><ul><ul><ul><li>CWI: Reference Data during Unemployment Benefit intake </li></ul></ul></ul><ul><ul><ul><li>Fraud Investigation </li></ul></ul></ul><ul><ul><li>Ease of sharing data (web services) </li></ul></ul><ul><ul><li>Law (Deletion of data after 7 years) </li></ul></ul><ul><ul><li>Intelligent ways to search data </li></ul></ul>
  24. 24. Content Management System (for XML) <ul><li>Storage and handling of XML data </li></ul><ul><ul><li>Koninklijke Bibliotheek </li></ul></ul><ul><ul><li>Built-in standard gateways to the outside world: </li></ul></ul><ul><ul><ul><li>WebDAV, HTTP, FTP, SOAP web services </li></ul></ul></ul><ul><ul><ul><li>(Binary) XMLType support in Oracle Streams </li></ul></ul></ul><ul><ul><ul><li>XMLType APIs for C / Java / .Net </li></ul></ul></ul><ul><li>Reasons: </li></ul><ul><ul><li>Complies with almost all (W3C) XML standards </li></ul></ul><ul><ul><li>Built-in XML-aware repository </li></ul></ul><ul><ul><li>Versioning capabilities </li></ul></ul><ul><ul><li>Built-in security (ACLs, on top of the known relational features) </li></ul></ul>
  25. 25. Exposure of Legacy Systems <ul><li>Relational data to XML conversion capabilities </li></ul><ul><ul><li>XML can be exposed via web services etc. </li></ul></ul><ul><ul><li>Built-in standard gateways to the outside world: </li></ul></ul><ul><ul><ul><li>WebDAV, HTTP, FTP, SOAP web services </li></ul></ul></ul><ul><ul><ul><li>(Binary) XMLType support in Oracle Streams </li></ul></ul></ul><ul><ul><ul><li>XMLType APIs for C / Java / .Net </li></ul></ul></ul><ul><li>Reasons: </li></ul><ul><ul><li>Adaptation of current systems to the XML / SOA world </li></ul></ul><ul><ul><li>Sharing of data via XML Taxonomy </li></ul></ul>
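
The "relational data to XML conversion" idea on slide 25 can be sketched as follows. Oracle XML DB provides SQL/XML publishing functions such as `XMLELEMENT` and `XMLAGG` for this purpose; the sketch below deliberately does not use them and instead stands in with Python's `sqlite3` and `xml.etree.ElementTree`. The `employees` table, its rows, and the `rows_to_xml` helper are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

def rows_to_xml(cursor, root_tag, row_tag):
    """Wrap each row of an executed query in <row_tag>, one child per column."""
    columns = [description[0] for description in cursor.description]
    root = ET.Element(root_tag)
    for row in cursor:
        item = ET.SubElement(root, row_tag)
        for column, value in zip(columns, row):
            ET.SubElement(item, column).text = "" if value is None else str(value)
    return root

# Hypothetical "legacy" relational data held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO employees VALUES (?, ?)", [(1, "Ada"), (2, "Linus")])

cursor = conn.execute("SELECT id, name FROM employees ORDER BY id")
xml_text = ET.tostring(rows_to_xml(cursor, "employees", "employee"), encoding="unicode")
conn.close()
```

The resulting XML string is what a web service or one of the listed gateways would hand to a consumer, so the legacy schema itself never has to be exposed.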
  26. 26. Database as a Service / SOA environments <ul><li>Database content shared only via (native database) web services (built-in database functionality) / SOA </li></ul><ul><ul><li>Decoupled </li></ul></ul><ul><ul><li>Stateless, etc. </li></ul></ul><ul><li>Reasons: </li></ul><ul><ul><li>Adaptation of current systems </li></ul></ul><ul><ul><li>SOA “API” compliant: SOAP, WS-* </li></ul></ul><ul><ul><li>Web browser compatible. </li></ul></ul>
  27. 27. Application Metadata, Data and Security store <ul><li>Integrated solution to securing the database and application user communities </li></ul><ul><ul><li>Useful for “lightweight” fusion application topologies </li></ul></ul><ul><li>Reasons: </li></ul><ul><ul><li>The database can provide a uniform security model across all tiers and support multiple application user stores, including the associated roles, authentication credentials, database attributes, and application-defined attributes. </li></ul></ul><ul><ul><li>Enables users to have a single unique global identity across the enterprise. </li></ul></ul>
  28. 28. Intermediate Hub in ETL Topologies <ul><li>Handling, filtering, and extraction of data based on XML schemata and/or XML, to be shipped to and processed by other participating processes and/or (data store) systems </li></ul><ul><ul><li>Koninklijke Bibliotheek </li></ul></ul><ul><ul><li>Built-in standard gateways to the outside world: </li></ul></ul><ul><ul><ul><li>WebDAV, HTTP, FTP, SOAP web services </li></ul></ul></ul><ul><ul><ul><li>(Binary) XMLType support in Oracle Streams </li></ul></ul></ul><ul><ul><ul><li>XMLType APIs for C / Java / .Net </li></ul></ul></ul><ul><li>Reasons: </li></ul><ul><ul><li>Supporting extraction languages and functionality available: SQL, PL/SQL, Java, C, .Net, XQuery, XPath, XSQL, ACL, WebDAV, Oracle Search, Text, Spatial, Media, etc. </li></ul></ul>
  29. 29. XML Data Exchange based on XML Schema <ul><li>Exchange of XML data, e.g. XBRL, with other systems (e.g. non-Oracle databases, systems, web services, etc.) </li></ul><ul><li>Reasons: </li></ul><ul><ul><li>Data exchange based on defined XML Schema, WSDL, etc. via native XML handling functionality: XQuery, XPath </li></ul></ul><ul><ul><li>Avoiding Impedance Mismatch </li></ul></ul><ul><ul><ul><li>Different data models. </li></ul></ul></ul><ul><ul><ul><ul><li>E.g. XPath models an XML document as a tree, while most general-purpose programming languages have no native data types for a tree. </li></ul></ul></ul></ul><ul><ul><ul><li>Different programming paradigms. </li></ul></ul></ul><ul><ul><ul><ul><li>XSLT is a functional language, while Java is object-oriented and Perl is procedural. </li></ul></ul></ul></ul>
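
The data-model side of the impedance mismatch on slide 29 can be illustrated with a short sketch. It assumes Python's standard `xml.etree.ElementTree`, which supports a limited subset of XPath; the `<report>` / `<fact>` vocabulary is invented for the example and is not real XBRL.

```python
import xml.etree.ElementTree as ET

# Hypothetical financial facts; element and attribute names are made up.
doc = ET.fromstring(
    "<report>"
    "<fact name='Revenue' unit='EUR'>1200</fact>"
    "<fact name='Costs' unit='EUR'>800</fact>"
    "<fact name='Headcount' unit='people'>12</fact>"
    "</report>"
)

# Tree side: one XPath expression selects nodes by position and attribute.
euro_facts = doc.findall("./fact[@unit='EUR']")

# Program side: the selected nodes must be mapped by hand onto native types
# (a dict of ints here); this conversion step is where the mismatch shows up.
totals = {fact.get("name"): int(fact.text) for fact in euro_facts}
```

Keeping the query in XPath or XQuery, next to the stored XML, avoids writing and maintaining that mapping layer in every consuming program, which is the argument the slide makes for native XML handling.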
  30. 30. XML Native Environment <ul><li>Using the XML DB environment in topologies and architectures that are XML-based (or XML-aware) </li></ul><ul><li>Reasons: </li></ul><ul><ul><li>Data shipping / transport of XML via XML </li></ul></ul><ul><ul><li>User interfaces that “understand” or are based on XML </li></ul></ul>
  31. 31. Questions? <ul><li>[email_address] </li></ul><ul><li>http://www.amis.nl </li></ul>
