Michael Lang Sr. Presentation



  1. Semantic Software Architecture: Using Semantic Technology to Build the Enterprise Information Web. Michael Lang
  2. Emergent Analytics
     • Extensible enterprise information management paradigm
     • Add semantics to all aspects of the enterprise's information systems
       − All information becomes easily accessible using SPARQL
       − Add new information easily
       − Understand how everything is related and what it is
     • Provides the capability to analyze information enterprise-wide
  3. IT
  4. Information Technology
     • The technology that enables the management of all types of information
       − Create it: works great
       − Store it: works great
       − Change it: works great
       − Find it: not so good
       − Analyze it: very complex, very difficult
       − Use it: works great if you are inside the application that creates it, otherwise BIG problem
       − Commonly called SILOS
     • We all want FEDERATION
  5. The term “Semantic Web” will not appear in this presentation
  6. Semantic Technology IT
  7. New Information Management Paradigm
     • Semantic Technology is a layer of description that sits within the current IT infrastructure
       − We build the descriptions using OWL and RDF
       − We access the descriptions at run time using SPARQL
     • OWL and RDF are unusual in that they are both a description language and an information model
       − Enables a radical transformation of IT capabilities
       − Completely distributed information management
       − FEDERATION
  8. Information Federation
     • Enterprises are made up of many domains within domains
       − Sales, Operations, R&D, Executive management, manufacturing, …
       − Logistics, HR, Finance, intelligence, …
     • Each domain fields its own applications and creates its own information to execute its mission
     • It is normally not possible to federate and integrate applications within domains, across domains, or with partners
     • Enterprises will not take the next step in analytic capability until they first solve the INFORMATION federation problem
  9. What are RDF and OWL for? They are only used for one thing: to DESCRIBE things. ANYTHING. Machines can UNDERSTAND the descriptions.
  10. Federation Requires Description
     • Information discovery, reuse, and integration all depend on description
       − If we do not know what something is, we cannot possibly know how to integrate it with other things or even how it should be used
     • If we describe everything well enough, we are in a position to have a knowledge-based web
       − Integrate and interoperate
       − Analyze any combination of information
     • RDF and OWL enable information federation
       − Both machines and people can understand the descriptions
  11. Defense Advanced Research Projects Agency
     • Relational Database Technology
     • TCP/IP
     • OWL/RDF
       − DARPA creates the DARPA Agent Markup Language (DAML) program in 2000 to facilitate information federation
       − W3C takes the work funded by DARPA and others to create the Resource Description Framework (RDF) and Web Ontology Language (OWL) specifications
     • These standards comprise an excellent information management technology architecture
     • There are no other standards that can be used to accomplish the goal of information federation
  12. World Wide Web Architecture [diagram; legend labels: Mature, Active Research and Standards Activity, Commercial Cutting Edge]
  13. Semantic Software Architecture
  14. Semantic Software Architecture
     • All components support RDF, OWL and/or SPARQL as well as other web technologies
       − OWL modeling tools
       − RDF stores
       − Spyders
       − Federators
       − SPARQL endpoints
       − Visualization tools
       − Analytic tools
       − SPARQL endpoint registry
  15. Spyder
     • Software component that transforms relational data formats to RDF using the mapping ontology
     • Adds the semantics of any domain ontology to any database
     • Provides a SPARQL endpoint for relational databases
     • Generates information about sources to optimize performance
     • Exposes the full power of SQL
     • Allows the mappings themselves to be analyzed
     • Minimizes or eliminates the need for triple stores
     • Easier to use than ETL
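
The row-to-RDF transformation a Spyder performs can be pictured with a minimal Python sketch. This is not the Spyder implementation: the namespaces, the `MAPPING` dictionary layout, and the `rows_to_triples` helper are all assumptions made for illustration, and a plain dictionary stands in for the mapping ontology.

```python
import sqlite3

# Hypothetical namespaces and mapping, invented for this illustration only.
DOMAIN = "http://example.org/hr#"
DATA = "http://example.org/data/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Stand-in for a mapping ontology: one table mapped to a domain class,
# and its columns mapped to domain properties.
MAPPING = {
    "table": "employee",
    "key": "id",
    "class": DOMAIN + "Person",
    "columns": {"name": DOMAIN + "name", "dept": DOMAIN + "worksIn"},
}


def rows_to_triples(conn, mapping):
    """Yield (subject, predicate, object) triples for each row of the mapped table."""
    cols = [mapping["key"], *mapping["columns"]]
    sql = f"SELECT {', '.join(cols)} FROM {mapping['table']}"
    for row in conn.execute(sql):
        record = dict(zip(cols, row))
        subject = f"{DATA}{mapping['table']}/{record[mapping['key']]}"
        # Every row becomes an instance of the mapped domain class.
        yield (subject, RDF_TYPE, mapping["class"])
        for column, prop in mapping["columns"].items():
            if record[column] is not None:  # NULL columns produce no triple
                yield (subject, prop, record[column])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ada", "R&D"), (2, "Grace", None)],
)
triples = list(rows_to_triples(conn, MAPPING))
```

Because the triples are generated on demand from the database, nothing has to be copied into a separate triple store, which is the point the slide makes about minimizing triple stores.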
  16. Federator
     • Enables users to query multiple RDF graphs exposed by Spyders as if they were a single graph
       − Uses the source metadata provided by Spyders to optimize performance
     • Works against the native information sources
       − Does not require RDF to be moved into a triple store before it is queried
       − Delegates the maximum amount of processing as far down as possible
     • Better solution than traditional ETL-based processes
       − Uses the domain ontology and mapping ontology
     • Supports complex analytics
       − Integrated with rules engine
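
The idea of querying several graphs as if they were one, while using source metadata to avoid unnecessary work, can be sketched in a few lines of Python. The `Source` and `Federator` classes below are hypothetical stand-ins, not the product's API; each source only advertises which predicates it holds, which is an assumed and much simplified form of the metadata the slide mentions.

```python
def match(graph, pattern):
    """Return the triples in graph that match pattern; None acts as a wildcard."""
    return [t for t in graph
            if all(p is None or p == v for p, v in zip(pattern, t))]


class Source:
    """Stand-in for one Spyder endpoint: a graph plus metadata about its predicates."""

    def __init__(self, name, triples):
        self.name = name
        self.triples = list(triples)
        self.predicates = {p for _, p, _ in self.triples}

    def query(self, pattern):
        # The pattern is evaluated at the source, not in the federator.
        return match(self.triples, pattern)


class Federator:
    """Answers one triple pattern over many sources as if they were a single graph."""

    def __init__(self, sources):
        self.sources = sources
        self.contacted = []  # names of the sources actually queried

    def query(self, pattern):
        _, predicate, _ = pattern
        results = []
        for source in self.sources:
            # Use source metadata to skip sources that cannot contribute.
            if predicate is not None and predicate not in source.predicates:
                continue
            self.contacted.append(source.name)
            results.extend(source.query(pattern))
        return results


hr = Source("hr", [("emp:1", "hr:name", "Ada"), ("emp:1", "hr:worksIn", "dept:rd")])
finance = Source("finance", [("dept:rd", "fin:budget", 500000)])

federator = Federator([hr, finance])
budgets = federator.query((None, "fin:budget", None))
```

In this sketch only the finance source is contacted for the budget pattern, and the pattern itself is delegated to the source, a toy version of pushing processing as far down as possible.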
  17. Spyder [component diagram: Optimizer, Indexes, Planner, Re-Writer, SPARQL Endpoint]
  18. Federator [component diagram: Optimizer, Cache, Indexes, Planner, Re-Writer, Rules, SPARQL Endpoint]
  19. [Architecture diagram. Components shown: Dashboard; SPARQL Endpoint Registry; Federator (Optimizer, Cache, Indexes, Planner, Re-Writer, Rules, SPARQL Endpoint); two Spyders (each with Optimizer, Indexes, Planner, Re-Writer, SPARQL Endpoint); two Data Sources; Domain Ontology; Mapping Ontology and Metadata Ontology for each Spyder. Connections are labeled SPARQL and SQL.]
  20. [Architecture diagram. Components shown: Ontology Repository; Dashboard; SPARQL Endpoint Registry; Federator (Optimizer, Cache, Indexes, Planner, Re-Writer, Rules, SPARQL Endpoint); two Spyders (each with Optimizer, Indexes, Planner, Re-Writer, SPARQL Endpoint); two Data Sources; Domain Ontology; Mapping Ontology; Metadata Ontology. Connections are labeled SPARQL and SQL.]
  21. Ontology Architecture
  22. Ontology Architecture
     • An ontology architecture is the system of ontologies required to accomplish a goal
       − Very much like a software architecture
     • The goal for an Enterprise Information Web (EIW) is federation of information sources across business units to enable enterprise reporting and analysis
       − The ontology architecture of an EIW is designed to solve the information federation problem
       − While enabling sophisticated analytics
  23. EIW Ontology Architecture [diagram; labels: Human Resources Domain Ontology, Relational Mapping Ontology (two), Process Ontology, RDBMS (two), Standards Ontology, Analytics Ontology, Source Ontology (two), Discussion Ontology, Community Ontology; directions marked Top-down and Bottom-up]
  24. EIW Ontology Architecture for Federation [diagram; labels: Human Resources Domain Ontology, Relational Mapping Ontology (two), RDBMS (two), Source Ontology (two), Reporting/Analytics, SPARQL, The Federator]
  25. Domain Ontology
     • The Domain Ontology is a conceptual description of a business domain
       − The “domain” is defined by the business processes, rules, information sources, and any required analytics
     • Instances in this ontology are the same instances that are currently stored in information sources (databases)
     • Exposes all information of the domain to any user or application using the business terminology of the domain
       − In some cases, these business terms are defined by standards
  26. Relational Mapping Ontology
     • Describes how concepts in the domain ontology relate to data in databases
     • Enables the translation of data from a relational format to RDF format, using terminology defined in the Domain Ontology
     • We have created a document that defines the Relational Mapping Ontology
       − This document should be released to the public this year
       − The D2RQ language was not sufficient for our mission
  27. Relational Schema Ontology
     • Represents metadata about a relational database schema as instance data
       − All columns are instances and have properties relating them to their tables
     • Enables analysis of the way a database is mapped to the Domain Ontology (via the Relational Mapping Ontology)
       − How many columns are mapped to properties in the Domain Ontology?
       − How many are mapped to classes?
       − How is Person represented in the customer management system?
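
Once columns are instances, the three questions on this slide become ordinary queries over triples. The Python sketch below is only illustrative: the `rs:` and `map:` property names, the sample columns, and the helper functions are all invented here, since the source does not publish its vocabulary.

```python
# Hypothetical schema metadata: each column is an instance related to its table.
schema_triples = [
    ("col:customer.id", "rs:inTable", "tbl:customer"),
    ("col:customer.full_name", "rs:inTable", "tbl:customer"),
    ("col:customer.fax", "rs:inTable", "tbl:customer"),
]

# Hypothetical mapping statements linking columns to Domain Ontology terms.
mapping_triples = [
    ("col:customer.id", "map:mapsToClass", "hr:Person"),
    ("col:customer.full_name", "map:mapsToProperty", "hr:name"),
]


def columns(schema):
    """All column instances described in the schema metadata."""
    return {s for s, p, _ in schema if p == "rs:inTable"}


def mapped_to(mappings, predicate):
    """Columns that carry a mapping statement with the given predicate."""
    return {s for s, p, _ in mappings if p == predicate}


all_columns = columns(schema_triples)
property_columns = mapped_to(mapping_triples, "map:mapsToProperty")
class_columns = mapped_to(mapping_triples, "map:mapsToClass")

# Columns with no mapping at all are candidates for review.
unmapped_columns = all_columns - property_columns - class_columns

# "How is Person represented?" becomes: which columns map to hr:Person?
person_columns = sorted(s for s, _, o in mapping_triples if o == "hr:Person")
```

In a deployment along the lines the slides describe, the same questions would presumably be asked in SPARQL against the schema and mapping graphs rather than in Python.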
  28. Analytics Ontology
     • Enables us to describe questions, queries, reports, forms
       − We represent questions as instances and relate them to the queries that provide their answers
     • Queries are related to Domain Ontology concepts
     • Domain Ontology concepts are mapped to data sources
     • Enables "gap analysis" of analytic requirements
       − Are the concepts used in the query to answer this question mapped to the necessary data sources?
     • Long term, it can be used to model-drive a reporting tool
       − Create instances of "Reports" and the tool builds them
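
The gap analysis described here follows a chain: question, then query, then Domain Ontology concepts, then data sources. A minimal Python sketch of that chain, using invented identifiers and plain dictionaries in place of ontology instances, might look like this:

```python
# All identifiers below are hypothetical and exist only for this illustration.

# Questions are related to the queries that answer them.
question_to_query = {"q:headcount_by_dept": "query:headcount"}

# Queries are related to the Domain Ontology concepts they use.
query_concepts = {"query:headcount": {"hr:Person", "hr:worksIn", "hr:Department"}}

# Domain Ontology concepts are mapped to data sources (hr:Department is not).
concept_sources = {
    "hr:Person": {"src:hr_db"},
    "hr:worksIn": {"src:hr_db"},
}


def gap_analysis(question):
    """Return the concepts needed by the question's query that no data source provides."""
    query = question_to_query[question]
    return sorted(c for c in query_concepts[query] if not concept_sources.get(c))


gaps = gap_analysis("q:headcount_by_dept")
```

An empty result would mean every concept the question depends on is backed by at least one source; here the unmapped `hr:Department` concept is reported as the gap.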
  29. Process Ontology
     • Enables description of business processes
       − RDF/OWL version of BPMN
     • Enables analysis of the information flows of business process steps in terms of the HR Domain Ontology
     • Long term, will enable execution of processes described as instances of the ontology
     • Short term, enables us to link processes with other artifacts in the domain
       − Domain Ontology concepts
       − Standards
       − Documentation
       − Discussions, or anything else
  30. How Hard is this?
     • Many people believe that it is too hard, that there are not enough trained people, and that it takes too long to build the descriptions
       − So millions of dollars and many years have been spent trying to develop an automated way of doing the modeling
       − Automated machine learning has not been invented
       − The machines must be bootstrapped with descriptions
     • The first bullet is a fallacy
       − It is not very hard
       − There are plenty of people who can do this work
       − It does not take very long to build the models
  31. Federation Solution
     • Enterprise Information Web
       − Any information from any system can be shared with any other system on the enterprise networks or the World Wide Web
     • Steps
       − Describe all of the terms and artifacts in each domain using RDF and OWL
         − We currently do this description work, but not in machine-readable standards; we use Excel, Word, PowerPoint, Visio
         − The formal description of a domain is called a domain ontology
       − Describe how all of the information managed in each domain is related to the domain vocabulary
       − Use these descriptions to say how domains are related
       − Query the domain vocabularies for any information
     • The result is an Enterprise Information Web that meets the goals of information sharing and analysis
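
The last two steps, relating domains to each other and then querying through one domain's vocabulary, can be illustrated with a small Python sketch. The graphs, the `equivalent_properties` table, and `query_by_property` are assumptions made for this example; a real system would express the cross-domain link in OWL rather than in a dictionary.

```python
# Two hypothetical domain graphs, each using its own vocabulary.
hr_graph = [("emp:1", "hr:name", "Ada")]
finance_graph = [("payee:7", "fin:payeeName", "Grace")]

# Hypothetical cross-domain description: these two properties mean the same thing.
equivalent_properties = {
    "hr:name": {"fin:payeeName"},
    "fin:payeeName": {"hr:name"},
}


def query_by_property(graphs, prop):
    """Return (subject, value) pairs for prop and any property declared equivalent to it."""
    wanted = {prop} | equivalent_properties.get(prop, set())
    return [(s, o) for graph in graphs for (s, p, o) in graph if p in wanted]


# A user asks in HR terms and also receives the matching finance data.
names = query_by_property([hr_graph, finance_graph], "hr:name")
```

The user never needs to know the finance vocabulary; the description of how the domains relate does the translation.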
  32. Enterprise Information Web [diagram]
     • Elements shown: Relational DB’s (Finance, HR, Logistics), Web sites, Applications, Web Services (sensors, weather, location), Domain Descriptions, Knowledgebase, RDF Web Service, Federator
     • Numbered callouts:
       1. Information Systems
       2. Expose as RDF web services or SPARQL endpoints
       3. EIW contains self-described data
       4. ESB is a big federated knowledgebase of any information user
       5. Any authorized user or system can query the ESB for any information
  33. Leverage Existing Investment
     • We leverage existing infrastructure
       − Same networks, same security, same applications, same organizations
     • A lot of this description work is being done now; it simply requires some redirection
     • Must use standards, like any other federation
     • The result of this relatively minor change and expense is an astounding advance in information management capability
  34. EIW Demo: Community, Content, Security, Discussions, OWL editing, ASK queries, View Designer/Views
  35. Visual Ontology Web Language
  36. Visualization
     • There is no W3C-adopted standard for visual representation of OWL or RDF models
     • OWL and RDF will not become widely used standards without good visualization of models
     • We do not believe any existing modeling standard will do; OWL is too different
     • We need OWL design patterns to fundamentally change information management capability at DOD and elsewhere
     • The capability will be in beta test in December on