Ontology 101 - Kendall & McGuinness


Elisa Kendall and Deborah McGuinness - Ontology 101: Knowledge Representation, OWL and Tools

Published in: Technology

  1. Ontology 101: An Introduction to Knowledge Representation, the Web Ontology Language (OWL) & Ontology Development
     Elisa Kendall, Thematix & Deborah McGuinness, RPI / McGuinness Associates
     6 June 2011
  2. Content management
     • Mars was photographed by the Hubble Space Telescope in August 2003 as the planet passed closer to Earth than it had in nearly 60,000 years. Image credit: NASA, J. Bell (Cornell U.) and M. Wolff (SSI)
     • A sunset on Mars creates a glow due to the presence of tiny dust particles in the atmosphere. This photo is a combination of four images taken by Mars Pathfinder, which landed on Mars in 1997. Image credit: NASA/JPL
     • Recent images from instruments on board the Mars Reconnaissance Orbiter provide much more detailed, narrower views of specific features of the Martian surface. Image credit: NASA/JPL
     • The Planetary Data System (PDS) is a distributed repository of 40+ years of imagery & data taken by a range of instruments on many diverse missions, available for scientific research.
     Copyright © 2011 Thematix & McGuinness Associates
  3. Smart search
     • Provenance/sources for tracking family members in the 19th century include early census data (often error-prone), military records, passenger & immigration lists, and online documents (e.g., county histories, church histories)
     • Historical/forensic research requires cross-domain search of a wide variety of resources within a given geo-spatial/temporal context
     • Similar capabilities are essential for business intelligence, law enforcement, and government applications; all require terminology reconciliation
  4. Merchandising
     [Figure: a list of flight attributes (Attribute 1 … Attribute N) in which the named attributes Leg Room, On-Off Access, Flight Media Options, Meal Options, Time of Day, Distance to Gate, and Flight Characteristics appear alongside the label “Comfort-Oriented”.]
  5. Search Engine Optimization
     • Use search engines as a front door to transactions
     • Dramatically increase relevance by including more meaningful, synonymic tags
     • Enrich search results by adding ratings, video, prices, availability, amenities, locality, etc.
  6. Tutorial Outline
     • Introduction to Knowledge Representation
     • OWL Basics
     • Tools & Applications
  7. Part 1: Introduction to Knowledge Representation
     • A little background and a few definitions
     • Layers of abstraction & conceptual modeling
     • Classifying ontologies
     • A little methodology
  8. Definitions
     • An ontology is a specification of a conceptualization. – Tom Gruber
     • Knowledge engineering is the application of logic and ontology to the task of building computable models of some domain for some purpose. – John Sowa
     • Artificial Intelligence can be viewed as the study of intelligent behavior achieved through computational means. Knowledge Representation then is the part of AI that is concerned with how an agent uses what it knows in deciding what to do. – Brachman and Levesque, KR&R
     • Knowledge representation means that knowledge is formalized in a symbolic form, that is, to find a symbolic expression that can be interpreted. – Klein and Methlie
     • The task of classifying all the words of language, or what's the same thing, all the ideas that seek expression, is the most stupendous of logical tasks. Anybody but the most accomplished logician must break down in it utterly; and even for the strongest man, it is the severest possible tax on the logical equipment and faculty. – Charles Sanders Peirce, letter to editor B. E. Smith of the Century Dictionary
  9. What is an Ontology?
     An ontology specifies a rich description of the following, relevant to a particular domain or area of interest:
     • Terminology, concepts, nomenclature
     • Properties explicitly defining concepts
     • Relations among concepts (hierarchical and lattice)
     • Rules distinguishing concepts, refining definitions and relations (constraints, restrictions, regular expressions)
  10. Logic and Ontology
     • Predicate logic is harder to read than the original English, but is more precise:
       Every semi-trailer truck has at least 3 axles.
       (∀x)(((SemiTrailerTruck(x) ∧ (∃y)(SemiTrailer(y) ∧ hasPart(x,y))) ∧ (SemiTrailerTruck(x) ∧ (∃z)(TractorUnit(z) ∧ hasPart(x,z)))) ⊃ (∃s)(set(s) ∧ count(s, ≥3) ∧ (∀w)(member(w,s) ⊃ (Axle(w) ∧ hasPart(x,w)))))
     • Logic is a simple language with few basic symbols.
     • The level of detail depends on the choice of predicates; these predicates represent an ontology of the relevant concepts in the domain.
     • Different choices of predicates represent different ontological commitments.
     * Derived from Knowledge Representation: Logical, Philosophical, and Computational Foundations, John F. Sowa, Brooks/Cole, Pacific Grove, CA, 2000.
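     The axle rule can also be checked mechanically against a small model. The following is a minimal sketch, not part of the original slides: the `Thing` class and the `has_part` and `satisfies_axle_rule` helpers are hypothetical names chosen for illustration, and the part-whole relation is simplified to a flat list of parts.

```python
from dataclasses import dataclass, field


@dataclass
class Thing:
    """A hypothetical individual with a kind (predicate name) and direct parts."""
    kind: str
    parts: list = field(default_factory=list)


def has_part(whole, kind):
    """True if `whole` has at least one direct part of the given kind."""
    return any(part.kind == kind for part in whole.parts)


def satisfies_axle_rule(x):
    """Evaluate the implication from the slide for one individual x."""
    antecedent = (
        x.kind == "SemiTrailerTruck"
        and has_part(x, "SemiTrailer")
        and has_part(x, "TractorUnit")
    )
    if not antecedent:
        return True  # an implication with a false antecedent holds vacuously
    axles = [part for part in x.parts if part.kind == "Axle"]
    return len(axles) >= 3


def make_truck(axle_count):
    """Build a semi-trailer truck with the given number of axles."""
    parts = [Thing("SemiTrailer"), Thing("TractorUnit")]
    parts += [Thing("Axle") for _ in range(axle_count)]
    return Thing("SemiTrailerTruck", parts)
```

     Choosing `kind` strings and a `parts` list is itself an ontological commitment in the sense of the slide: a model that counted wheels rather than axles would need different predicates.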
  11. Ontology-based Technologies
     • Ontologies provide a common vocabulary for use by independently developed resources, processes, and services
     • Agreements among organizations sharing common services can be made with regard to their usage; the meaning of relevant concepts can be expressed unambiguously
     • By composing / mapping ontologies and mediating terminology across participating events, resources and services, independently developed services can work together to share information and processes consistently, accurately, and completely
     • Ontologies also ensure
       - Valid conversations among agents to collect, process, fuse, and exchange information
       - Accurate searching by ensuring context using concept definitions and relations instead of, or in addition to, statistical relevance of keywords
  12. Features of KR languages
     • Vocabulary: a collection of symbols
       - Domain-independent logical symbols (e.g., ∀ or ⊃)
       - Domain-dependent constants, identifying individuals, properties, or relations in the application domain or universe of discourse
       - Variables, whose range is governed by quantifiers
       - Punctuation that separates or groups other symbols
     • Syntax
       - Formation rules that determine how symbols can be combined in well-formed expressions
       - Rules may be stated in a linear grammar, graph grammar, or independent abstract syntax
     • Semantics
       - A theory of reference that determines how the constants and variables are associated with things in the universe of discourse
       - A theory of truth that distinguishes true statements from false statements
     • Rules of Inference
       - Rules that determine how one pattern can be inferred from another
       - If the logic is sound, the rules of inference must preserve truth as determined by the semantics
  13. Logic
     Logics vary from classical FOL along six dimensions:
     • Syntax
     • Subsets limit permissible operators or combinations, e.g., propositional logic (without quantifiers), Horn-clause logic (excludes disjunctions in conclusions, as in Prolog), terminological or definitional logics (containing additional restrictions, e.g., description logics)
     • Proof Theory restricts or extends permissible proofs:
       - Intuitionistic logic and relevance logic rule out certain extraneous information
       - Non-monotonic logics allow introduction of default assumptions
       - Access-limited logic restricts the number of times a proposition can be used in a proof; linear logic allows a proposition to be used only once
       - Modal logic incorporates modal auxiliaries (◊p means p is possibly true; □p means p is necessarily true); temporal logic extends modal logic to include always, sometimes
       - Intensional logics express concepts such as need, ought, hope, fear, wish, believe, know, expect, and intend
     • Model Theory modifies the truth value of statements in terms of some model of the world: classical FOL is two-valued; a three-valued logic introduces unknowns; fuzzy logic uses the same notation as FOL but with an infinite range of certainty factors (0.0 to 1.0)
     • Ontology: frameworks may include support for built-in components, such as set theory or time
     • Meta-language: language for encoding information about objects
  14. Description logic
     • A family of logic-based Knowledge Representation formalisms
       - Descendants of semantic networks and KL-ONE
       - Describe a domain in terms of concepts (classes), roles (relationships), and individuals (instances)
     • Distinguished by
       - Formal semantics: decidable fragments of FOL, closely related to Propositional, Modal, and Dynamic Logics
       - Provision of inference services: sound and complete decision procedures for key problems; implemented systems (highly optimized)
     • Applications include
       - Configuration: product configurators, consistency checking, constraint propagation; first significant industrial application (e.g., CLASSIC)
       - Ontologies: ontology engineering (design, maintenance, integration), reasoning with ontology-based mark-up, service description and discovery
       - Databases: consistency of conceptual schemata, schema integration, query subsumption (w.r.t. conceptual schemata)
  15. Knowledge bases, databases, & ontology
     • An ontology is a conceptual model of some aspect of a particular universe of discourse (or of a domain of discourse)
       - Typically, ontologies contain only “rarified” or “special” individuals, metadata, representing elemental concepts critical to the domain
     • A knowledge base is a persistent repository for
       - Ontology & metadata representing individuals, facts, & rules about how they can be combined or relate to one another
       - Metadata, facts & rules only; in some applications and frameworks the ontology is separately maintained
     • Most inference engines (including commercially available reasoners) require in-memory deductive databases for efficient reasoning
     • A knowledge base may be implemented in a physical, external database, such as a relational database, but reasoning is typically done on a subset (partition) of that knowledge base in memory
  16. Reasoning & Truth Maintenance
     • Reasoning is the mechanism by which the assertions made in an ontology and related knowledge base are evaluated by an inference engine.
     • In classical logic, the validity of a particular conclusion is retained even if new information is received.
       - This may change if some of the preconditions are actually hypothetical assumptions invalidated by the new information.
       - The same idea applies for arbitrary actions: new information can make preconditions invalid.
     • Generally, there are two issues that a reasoner must address:
       - If some conclusion is invalid, which other conclusions are also invalid?
       - If some action cannot be performed, which others are at risk?
     • The “housekeeping” associated with tracking the threads that support answering these questions is called truth maintenance.
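     The first question, which conclusions fall when an assumption is withdrawn, can be illustrated with a very small justification-based sketch. This is an assumption-laden toy, not a description of any particular reasoner: the `TruthMaintenance` class, its method names, and the car-rental facts are all hypothetical, and each conclusion is given only one justification for simplicity.

```python
class TruthMaintenance:
    """Toy justification-based bookkeeping: each conclusion records its premises."""

    def __init__(self):
        self.assumptions = set()
        self.justifications = {}  # conclusion -> frozenset of premises

    def assume(self, fact):
        self.assumptions.add(fact)

    def justify(self, conclusion, premises):
        self.justifications[conclusion] = frozenset(premises)

    def believed(self):
        """Forward-chain from the assumptions until nothing new can be derived."""
        beliefs = set(self.assumptions)
        changed = True
        while changed:
            changed = False
            for conclusion, premises in self.justifications.items():
                if conclusion not in beliefs and premises <= beliefs:
                    beliefs.add(conclusion)
                    changed = True
        return beliefs

    def retract(self, fact):
        """Withdraw an assumption and return every belief lost as a result."""
        before = self.believed()
        self.assumptions.discard(fact)
        return before - self.believed()


tms = TruthMaintenance()
tms.assume("car_available")
tms.assume("customer_has_license")
tms.justify("can_rent", ["car_available", "customer_has_license"])
tms.justify("can_bill", ["can_rent"])
lost = tms.retract("car_available")
```

     Here `lost` contains the retracted assumption together with `can_rent` and `can_bill`, which depended on it directly or indirectly. Recomputing all beliefs on each retraction is the simplest possible strategy; practical systems track dependencies incrementally.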
  17. Negation
     • If all new information is “positive”, then all prior conclusions should remain valid.
     • Problems arise if new information negates a prior assumption, causing it to be withdrawn.
     • What does it mean to negate (withdraw) an assumption?
       - Conclusive information is not available?
       - The assumption cannot be proven?
       - The assumption is not provable using certain methods?
       - The assumption is not provable given a fixed quantity of time?
     • The answers to these questions can result in different definitions of negation and differing interpretations by non-monotonic reasoners.
     • Solutions include chronological and “intelligent” backtracking algorithms, heuristics, circumscription algorithms, justification- or assumption-based retraction, and so forth, depending on the reasoner and methods used for truth maintenance.
     • Reasoning efficiency is dependent, in part, on the algorithms applied for truth maintenance.
  18. Explanations and Proofs
     • When a reasoner draws a particular conclusion, many users and applications want to understand why.
     • Primary motivations include interoperability, reuse, and trust
     • Understanding the provenance of the information and results is crucial, especially when web-based information is involved
       - What information sources were used (source)
       - How recently they were updated (currency)
       - How reliable these sources are (authoritativeness)
       - Whether the information was directly available or derived, and if derived, how (method of reasoning)
     • Methods used to explain why a reasoner reached a particular conclusion include explanation generation and proof specification
  19. Abstraction Layers
     Layers, from most abstract to most concrete:
     • Contextual: identify subject areas
     • Conceptual: define the meaning of things in the organization
     • Logical: describe the logical representation of properties
     • Physical: describe the physical means by which data is stored
     • Definition: represent the coding language on a specific development platform
     • Instance: hold the values of the properties applied to the data in a schema
     Example artifacts shown alongside the layers, in the same top-to-bottom order: Ontology; Ontology, Conceptual ER, Conceptual Business Process …; ER, Relational, XML Schema; Physical XML, source code, scripting languages, stored procedures…; Physical KBs, databases, asset repositories…
     * Layering diagram courtesy Kenn Hussey
     Knowledge Representation / Management for Large Scale Applications
     • Provide broad metadata, process, service & asset management facilities (including feedback/lessons learned…)
     • Enable rich cross-domain, cross-process, cross-organizational modeling supported by mapping & transformation services to provide maximum flexibility and interoperability
     • Leverage standards and best practices in information architecture, metadata modeling, management, registration, and governance, and asset management & registration
     • Provide incremental reasoning capabilities for model validation and transformation services
     • Repeatable, reusable, interoperable
  20. Hypothetical Conceptual Model “EU-Rent”
     [Figure. Produced using Embarcadero EA/Studio Business Modeler Edition, courtesy Kenn Hussey]
  21. Hypothetical Logical Model “EU-Rent”
     [Figure. Produced using Embarcadero ER/Studio, courtesy Kenn Hussey]
  22. Hypothetical Physical Model “EU-Rent”
     [Figure. Produced using Embarcadero ER/Studio, courtesy Kenn Hussey]
  23. Classifying ontologies
     Classification techniques are as diverse as conceptual models, and generally include understanding
     • Level of Expressivity
     • Level of Complexity / Structure
     • Granularity
     • Target Usage, Relevance
     • Amount of Automation, Reasoning Requirements
     • Prescriptive vs. Descriptive / Reliability / Level of Authoritativeness
     • Design Methodology
     • Governance
     • Vocabulary Management, Metrics
     [Figure: model types plotted by Level of Complexity (horizontal axis) against Level of Expressivity (vertical axis), in increasing order: Glossary, Simple Taxonomy, Hierarchical Taxonomy, XML Schema, Database Schema, Topic Map, Concept Map, Entity-Relationship Model, OO Software Model, KR System]
  24. Framework of dimensions
     • Semantic Dimensions
       - Expressiveness: represents how well a KR language addresses increasingly complex semantics
       - Structure: represents how well an ontology encodes semantics, with the same or less expressivity than the KR language
       - Granularity: represents the level of detail specified in an ontology
     • Pragmatic Dimensions
       - Intended use: the original use case(s), or purpose for developing a particular ontology
       - Automated reasoning: the extent to which the ontology is designed to be used for automated reasoning
       - Prescriptive vs. Descriptive: the extent to which an ontology was intended to be used for descriptive purposes vs. normative, prescriptive use (i.e., with a high degree of concern for correctness)
     Reference: http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2007
  25. Model dynamics
     Model-centric perspectives characterize the ontologies themselves and are concerned with their structure, formalism and dynamics. Each perspective ranges between two extremes:
     • Level of Authoritativeness: least authoritative, broader, shallowly defined ontologies vs. most authoritative, narrower, more deeply defined ontologies
     • Source of Structure: passive (transcendent), structure originates outside the system vs. active (immanent), structure emerges from data or behavior
     • Degree of Formality: informal or primarily taxonomic vs. formal, having rigorously defined types, relations, and theories or axioms
     • Model Dynamics: read-only, ontologies are static vs. volatile, ontologies are fluid & changing
     • Instance Dynamics: read-only, resource instances are static vs. volatile, resource instances change continuously
  26. Application characteristics
     Application-centric perspectives are concerned with how applications use and manipulate ontologies. Each perspective ranges between two extremes:
     • Control / Degree of Manageability: externally focused, public (little or no control) vs. internally focused, private (full control)
     • Application Changeability: static (with periodic updates) vs. dynamic
     • Coupling: loosely coupled vs. tightly coupled
     • Integration Focus: information integration vs. application integration
     • Lifecycle Usage: design time vs. run time
  27. 27. + Language evaluation 27 27 KR Expressivity Requirements Reasoning Requirements Classes Exceptions Slots/Attributes Automatic Classification Metaclasses Number Restrictions Complex Class Extensions (union, intersection) Inheritance Subsumption Hierarchies Monotonic, Nonmonotonic Value Restrictions Simple, Multiple Behaviors, Procedural Definitions, Methods Procedural Support, Execution Relations / Functions Slots/Attributes Subsumption Hierarchies Constraint Checking, Deduction Built-in Functions, Equations, Formulae Instances / Individuals / Facts Axioms Reasoning with Rules Production Rules (forward and backward chaining) * Derived from “Evaluating Knowledge Representation and Reasoning Capabilities of Ontology Specification Languages”, Oscar Corcho and Asuncion Gomez-Perez, January 2002Copyright © 2011 Thematix & McGuinness Associates
  28. Considerations
     • Intended use of ontologies, including domain requirements (e.g., scientific and engineering apps require formulas, units of measure, computations that may be challenging to represent)
     • Intended use of the KRSs that implement them, including reasoning requirements and questions to be answered
     • For distributed environments, the number and kinds of resources, processes, and services requiring ontologies: how distributed, how unique, developed collaboratively or independently, dynamic community participation or static
     • What kinds of transformations are required among processes, resources, and services to support semantic mediation
     • Ontology and KRS alignment / de-confliction / ambiguity resolution requirements
     • Ontology and KRS composition requirements, dynamic vs. static composition, in what environment and under what constraints
     • Performance, sizing, and timing requirements of the target environment
  29. A Little Methodology …
     • Requirements, domain & use case analysis are critical
       - Develop initial source/reference material
       - Focus on system or application requirements
     • Iterative development starting with a “thread” that covers basic capabilities can ground the work and prioritize decisions
     • Need to understand and communicate
       - Architectural trade-offs, cost & technical benefits
       - The nature of the information & kinds of questions that need to be answered drive the architecture, approach, and ontology scoping and design
     • Reuse standards and well-tested, available ontologies whenever possible
  30. What to look for
     • A controlled vocabulary
       FAA & IATA airport codes, ACRISS car codes, …
     • A hierarchical or taxonomic structure (for query expansion)
       Vehicle, Ground-based Vehicle, Wheeled Vehicle, Powered Vehicle, Automobile, Sedan…
     • Knowledge supporting structured queries
       Find all available hybrid SUVs or hybrid sedans that can seat four adults within a reasonable taxi ride of Planet Hollywood, Las Vegas for 3-4 days the week of June 20th
     • Efficient inference (i.e., limited expressive power) vs. increased expressivity (potentially expensive or resource-bounded computation)
     • Custom reasoning for temporal relations, geospatial, dynamic algorithm / equation evaluation, process-specific, conditional operations
     • Computational tractability
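     Query expansion over a taxonomy can be sketched as follows. This is an illustrative assumption rather than anything from the slides: the `NARROWER` table and `expand_query` function are hypothetical names, the chain of vehicle terms is read as a strict broader-to-narrower hierarchy, and `SUV` is placed under `Automobile` only because the sample query mentions SUVs.

```python
# Broader term -> directly narrower terms (hypothetical reading of the slide's list).
NARROWER = {
    "Vehicle": ["Ground-based Vehicle"],
    "Ground-based Vehicle": ["Wheeled Vehicle"],
    "Wheeled Vehicle": ["Powered Vehicle"],
    "Powered Vehicle": ["Automobile"],
    "Automobile": ["Sedan", "SUV"],  # SUV placement is an assumption
}


def expand_query(term):
    """Return the term followed by every narrower term beneath it, depth-first."""
    expanded = [term]
    for child in NARROWER.get(term, []):
        expanded.extend(expand_query(child))
    return expanded
```

     A search for `Automobile` would then also match items tagged `Sedan` or `SUV`, which is the practical payoff of keeping the hierarchy explicit.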
  31. Start with canonical definitions
     • General concepts as well as domain-specific knowledge
     • Basic starting point: cross-domain definitions
       - Namespace definitions, metadata, naming conventions, governance policies
       - Commonly used structures & vocabularies, such as domain-specific vocabulary & messaging standards, international country & language codes (ISO), national postal addressing or other government standards, industry best practices
       - Common metadata for ontology & schema management (e.g., Dublin Core for documents & models, ISO 1087 for synonyms & similar relations, MIME media types for images & multimedia, etc.)
     • Domain vocabularies must be prioritized and selected based on business requirements and clear ROI
     • Common early targets include
       - Smart search (pull); customer experience & cross-sell / upsell (push)
       - Richer interoperability among trading partners
       - Service registration, description, discovery & management
       - Asset/artifact repository search & retrieval
       - Automated verification
  32. IDEF5 (Integrated Definition Methods) Ontology Capture Method Analysis
     • Lay out a high-level architecture for ontology elements
     • Identify the relationships among elements: roles, domain, interface, process, utility
     • For each ontology element
       - Describe its domain and scope, and how it will be used
       - Identify example questions and anticipated/sample answers for the application(s) it will support
       - Identify key stakeholders, ownership, maintenance, resources for instance knowledge
       - Describe the anticipated reuse/evolution path
       - Identify critical standards, resources that it must interoperate with, dependencies
     • Resources
       - http://www.idef.com/IDEF5/html
       - http://www.kbsi.com/technology/methods/sbont.htm
  33. Principles & Methods for Terminology Work (ISO 704)
     • Methodology for describing concepts & terms
       - Uses ISO 1087 for terminology
       - Uses ISO 860 for terminology “harmonization” (alignment) methods
     • Basis for typical methods used for taxonomy development today
       - Describes how to flesh out definitions
       - Recommends strategies for relating terms to one another using standard vocabulary
     • ISO 1087: a great resource for language to describe kinds of relationships, acronyms & other designations, preferred vs. deprecated terms, etc.
     • ISO 860 augments this with recommendations for vocabulary comparison
  34. Part 2: OWL Basics
     • A quick intro to RDF
     • Basic OWL constructs: classes, properties & slots
     • Inheritance & constraints
       - Class & slot inheritance
       - Property, slot & slot value constraints
     • Individuals & literals
     • Disjoint & equivalence relations
     • New features of OWL 2
     • A few rules of thumb
       - Depth vs. breadth, naming, synonymous terms
       - Class vs. property value, class vs. individual
       - Inverse properties
       - Limiting scope
  35. Semantic Web
     "The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." -- Tim Berners-Lee
  36. Guiding Principles
     • Historically, knowledge representation and reasoning systems have operated under closed world assumptions
     • Uncertainty is magnified under open-world, “wild, wild web” conditions, making reasoning much more difficult
     • Semantic web languages are designed to support less certainty, to provide “better” search results and informed answers to questions, not absolutes
     • Because they are based on XML, such languages can assist businesses in leveraging existing investment in mark-up, content, and data
       - To augment business intelligence/analysis and knowledge mining
       - To support knowledge sharing and collaboration, and augment enterprise information integration
       - To enrich web services and other applications
       - To support policy-based applications and ensure compliance with policy at a lower cost, with higher potential ROI than traditional computing methods
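     The difference between closed-world and open-world assumptions can be made concrete with a small sketch. Everything here is hypothetical and chosen for illustration: the `Answer` enum, the two `ask_*` functions, and the example statements are not taken from the slides or from any reasoner's API.

```python
from enum import Enum


class Answer(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"


def ask_closed_world(known_true, statement):
    """Closed world: anything not known to be true is taken to be false."""
    return Answer.TRUE if statement in known_true else Answer.FALSE


def ask_open_world(known_true, known_false, statement):
    """Open world: absence of information yields 'unknown', not 'false'."""
    if statement in known_true:
        return Answer.TRUE
    if statement in known_false:
        return Answer.FALSE
    return Answer.UNKNOWN


facts = {"HiltonSFUnionSquare is a Hotel"}
closed = ask_closed_world(facts, "HiltonSFUnionSquare has a pool")
opened = ask_open_world(facts, set(), "HiltonSFUnionSquare has a pool")
```

     The closed-world query answers `FALSE` while the open-world query answers `UNKNOWN`, which is why open-world languages tend to return informed answers rather than absolutes.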
  37. Resource Description Framework (RDF)
     • Describes relationships
     • Uses URIs for naming
     • Language has
       - a graph-based model
       - an RDF/XML serialization (exchange syntax)
       - other presentation syntaxes (N3, Turtle, …)
     • Specifications, W3C presentations, and tools are available at
       - Semantic Web: http://www.w3.org/standards/semanticweb/
       - Linked Data: http://www.w3.org/standards/semanticweb/data
       - RDF: http://www.w3.org/standards/techs/rdf#w3c_all
     • A new RDF “next steps” revision working group is under way to make specific enhancements and “fixes” to the standard
       - RDF Working Group: http://www.w3.org/2011/rdf-wg/wiki/Main_Page
  38. RDF Notation Options
     • Graph:
       Subject: http://semtech2011.semanticweb.com/travel.owl#Conference
       Predicate: http://semtech2011.semanticweb.com/travel.owl#heldAt
       Object: http://semtech2011.semanticweb.com/travel.owl#HiltonSFUnionSquare
     • RDF/XML:
       <rdf:Description rdf:ID="Conference">
         <semtech:heldAt rdf:resource="#HiltonSFUnionSquare"/>
       </rdf:Description>
     • N3:
       semtech:Conference semtech:heldAt semtech:HiltonSFUnionSquare .
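     The same statement can be held in code as a plain subject-predicate-object tuple. The sketch below uses only the Python standard library rather than an RDF toolkit; `SEMTECH`, `to_ntriples`, and `objects` are hypothetical names, and the serializer handles only URI-valued terms (no literals or blank nodes), writing one line per triple in the W3C N-Triples style.

```python
# Namespace taken from the slide; the helper names are illustrative only.
SEMTECH = "http://semtech2011.semanticweb.com/travel.owl#"

triples = [
    (SEMTECH + "Conference", SEMTECH + "heldAt", SEMTECH + "HiltonSFUnionSquare"),
]


def to_ntriples(graph):
    """Serialize URI-only triples, one '<s> <p> <o> .' statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in graph)


def objects(graph, subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]
```

     Calling `objects(triples, SEMTECH + "Conference", SEMTECH + "heldAt")` returns a one-element list holding the hotel's URI, which mirrors following the single edge in the graph rendering.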
  39. RDF Schema (RDFS)
     An RDF vocabulary that provides for identifying:
     • classes,
     • subsumption (inheritance) relations for classes,
     • subsumption (inheritance) relations for properties,
     • domain and range for properties
  40. RDF Schema (RDFS)
     • Graph:
       semtech:Hotel rdf:type rdfs:Class
       semtech:ConferenceHotel rdf:type rdfs:Class
       semtech:ConferenceHotel rdfs:subClassOf semtech:Hotel
       semtech:HiltonSFUnionSquare rdf:type semtech:ConferenceHotel
     • RDF/XML:
       <rdf:Description rdf:ID="Hotel">
         <rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
       </rdf:Description>
       <rdfs:Class rdf:ID="ConferenceHotel">
         <rdfs:subClassOf rdf:resource="#Hotel"/>
       </rdfs:Class>
       <semtech:ConferenceHotel rdf:ID="HiltonSFUnionSquare"/>
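     What the `rdfs:subClassOf` assertion buys is inference: an individual typed as a conference hotel is also a hotel. The sketch below applies two of the standard RDFS entailment patterns (subclass transitivity and type propagation along subclass links) to a fixed point. It is a naive illustration over prefixed-name strings, not an RDFS reasoner, and `rdfs_closure` is a hypothetical helper name.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Repeatedly apply subclass transitivity and type propagation until stable."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p != SUBCLASS:
                continue
            for s2, p2, o2 in inferred:
                if p2 == SUBCLASS and s2 == o:
                    new.add((s, SUBCLASS, o2))  # s ⊑ o and o ⊑ o2, so s ⊑ o2
                if p2 == TYPE and o2 == s:
                    new.add((s2, TYPE, o))      # s2 is an s and s ⊑ o, so s2 is an o
        if new <= inferred:
            return inferred
        inferred |= new


graph = {
    ("semtech:ConferenceHotel", SUBCLASS, "semtech:Hotel"),
    ("semtech:HiltonSFUnionSquare", TYPE, "semtech:ConferenceHotel"),
}
closure = rdfs_closure(graph)
```

     After the call, `closure` also contains the triple stating that `semtech:HiltonSFUnionSquare` has type `semtech:Hotel`, a fact that was never asserted directly.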
  41. The Web Ontology Language (OWL)
     • Two languages emerged in parallel to address semantic web requirements
       - DAML-ONT, supported by the DARPA/DAML program
       - OIL (Ontology Inference Layer), developed by EU & US researchers
     • Merged DAML+OIL was submitted to the W3C in 2002 and formed the basis for the WebOnt Working Group
     • OWL extends RDF Schema
       - Has an RDFS-based syntax and reuses some RDF vocabulary (e.g., subClassOf, domain, range)
       - Adds rich primitives and redefines others (transitivity, inverse, cardinality constraints, complex class definitions)
       - Describes the structure of a domain in terms of classes and properties
       - Uses RDFS for class/property membership assertions (ground facts), XML Schema Datatypes
     • OWL specifications became W3C recommendations in February 2004
     • OWL 2 specifications were adopted in October 2009
  42. General nature of descriptions
     • General categories: a WINE is a LIQUID, a POTABLE
     • Structured components:
       - grape: chardonnay, ... [>= 1]
       - sugar-content: dry, sweet, off-dry
       - color: red, white, rose
       - price: a PRICE
       - winery: a WINERY
     • Interconnections between parts:
       - grape dictates color (modulo skin)
       - harvest time and sugar are related
  43. General nature of descriptions (annotated)
     • Class: a WINE
     • Superclasses (general categories): a LIQUID, a POTABLE
     • Roles / properties (structured components): grape, sugar-content, color, price, winery
       - Number / cardinality restrictions: grape: chardonnay, ... [>= 1]
       - Value restrictions: sugar-content: dry, sweet, off-dry; color: red, white, rose; price: a PRICE; winery: a WINERY
     • Interconnections between parts:
       - grape dictates color (modulo skin)
       - harvest time and sugar are related
  44. Description development
     • Define domain terms and inter-relationships
       - Define concepts in the domain (classes, nouns)
       - Identify subclass/superclass relationships
       - Identify attributes/properties/slots (verbs)
       - Identify any general properties (relations, functions, verbs)
       - Restrict slot values
     • Define a representative set of individuals for testing & evaluation and prototype development
       - Define individuals (instances)
       - Define relationships between individuals (filling in slots)
  45. Classes & class hierarchy
     • A class is a concept in the domain
       - Vintage: a wine made from grapes grown in a specified year
       - A class of properties (flavor, body, color, sugar…)
     • A class is a collection of elements with similar properties
       - White wine: wines made from white grapes
       - White table wine: wines made from white grapes that are not appellations or regional (not “quality wine” in the EU)
     • A class contains necessary conditions for membership (specific varietal, field, micro-climate, date picked, date bottled)
     • Instances of classes
       - Andrew Murray 2005 Late Harvest Viognier
       - Robert M. Parker, Jr.'s Wine Advocate review, dated February 28, 2002
       - Los Olivos, Santa Barbara County, California, USA
  46. Class inheritance
     • Classes are organized into subclass-superclass (or generalization-specialization) hierarchies
     • True subclass relationships are the basis of a formal is-a hierarchy
       Classes are “is-a” related if every instance of the subclass is also an instance of the superclass
     • Classes may be viewed as sets
     • The members of a subclass form a subset of the members of its superclass
     • Examples
       - RedWine is a subclass of Wine
         Every red wine is a wine; every instance of a red wine (e.g., Marietta Old Vines Red) is an instance of wine
       - NapaValleyWine is a subclass of CaliforniaWine
         Every wine from Napa Valley is a wine from California
  47. Levels in the class hierarchy
     • Different modes of development
       - Top-down: define the most general concepts first and then specialize them
       - Bottom-up: define the most specific concepts and then organize them into more general classes
       - Combination (typical): breadth at the top level and depth along a few branches to test the design
     • Class inheritance is transitive
       - If A is a subclass of B (viognier is a subclass of white wine; late harvest wine is a subclass of dessert wine)
       - and B is a subclass of C (white wine and dessert wine are subclasses of wine)
       - then A is a subclass of C (late harvest viognier, a subclass of both viognier and late harvest wine, is also a subclass of white wine, dessert wine, & wine)
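     Transitivity means the full set of superclasses can be computed by walking the direct subclass links. A minimal sketch, assuming the wine hierarchy described on the slide: the `SUPERCLASSES` table and `all_superclasses` function are hypothetical names, and the direct links for `LateHarvestViognier` are an interpretation of the slide's example.

```python
# Direct superclass links only; indirect ones are derived by the function below.
SUPERCLASSES = {
    "LateHarvestViognier": {"Viognier", "LateHarvestWine"},
    "Viognier": {"WhiteWine"},
    "LateHarvestWine": {"DessertWine"},
    "WhiteWine": {"Wine"},
    "DessertWine": {"Wine"},
}


def all_superclasses(cls):
    """Collect every direct and indirect superclass of cls."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in SUPERCLASSES.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found
```

     For `LateHarvestViognier` this yields `Viognier`, `LateHarvestWine`, `WhiteWine`, `DessertWine`, and `Wine`, even though only two of those links were stated directly.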
  48. 48. + Classes in OWL 2
      • 6 kinds of class expressions in OWL (a named class plus 5 anonymous kinds)
      • Definitions can be nested using set-theoretic constructs
  49. 49. + Properties & slots
      • Slots in a class definition describe attributes of members of a class
          − Each wine will have color, sugar content, flavor, body, etc.
      • Types of properties
          − “intrinsic” properties: flavor and color of wine
          − “extrinsic” properties: name and price of wine
          − parts: ingredients in a recipe
          − relations to other objects: producer of wine (winery)
      • Data and object properties
          − simple (datatype) properties contain primitive values (strings, numbers)
          − complex (object) properties may contain other objects (e.g., a winery instance)
  50. 50. + Properties in OWL 2
  51. 51. + Class & slot inheritance
      • A subclass inherits all the slots from its superclass
          − If a wine has a name and flavor, a red wine also has a name and flavor
      • If a class has multiple superclasses, it inherits slots and restrictions from all of them
          − Latin is both an individual language and an ancient language. It inherits “has attested literature: true” from ancient language and “has indigenous name: lingua latina” from individual language
  52. 52. + Property (slot) constraints / restrictions
      • Number restrictions describe or limit the number of possible values a particular property can have
          − A language must be associated with at least one English name and at least one French name
          − A language may be associated with zero or more Indigenous names
      • Cardinality – similar in meaning to classical set theory: it counts the elements of a set, here the set of values a property has for an individual of the restriction class
          − Cardinality – cardinality N means each individual of the class defined by the property restriction must have exactly N values (individual or literal values) for the property
          − Minimum cardinality - 1 means that there must be at least one value (required), 0 means that the value is optional
          − Maximum cardinality - 1 means that there can be at most one value (single-valued), N means that there can be up to N values (N > 1, multi-valued)
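As a rough illustration of the minimum, maximum, and exact cardinality bounds described above, the following sketch checks a list of asserted values against the bounds. The function names are hypothetical. It counts asserted values only, which corresponds to a closed-world, unique-names reading; an OWL reasoner working under the open-world assumption would not treat a missing value as a violation.

```python
def satisfies_cardinality(values, min_card=0, max_card=None):
    """Check the number of distinct asserted values against min/max bounds.

    max_card=None means "no upper bound".
    """
    count = len(set(values))
    if count < min_card:
        return False
    return max_card is None or count <= max_card


def satisfies_exact_cardinality(values, n):
    """Exact cardinality N is min cardinality N plus max cardinality N."""
    return satisfies_cardinality(values, min_card=n, max_card=n)


if __name__ == "__main__":
    # "At least one English name": min cardinality 1.
    print(satisfies_cardinality(["Latin"], min_card=1))
    # "Zero or more Indigenous names": min cardinality 0, no maximum.
    print(satisfies_cardinality([], min_card=0))
```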
  53. 53. + Slot value constraints
      • Slot value type – defines the set of possible values for the property
          − String: a string of characters (“McGinley Vineyard”)
          − Number: an integer or a float (3.5 acres)
          − Boolean: a true/false flag
          − Enumerated type: a list of allowed values (red, white, rosé)
          − Filler: a single value (e.g., the color slot for a RedWine must be filled with the single value “red”)
          − Object type – a class defined in an ontology (e.g., Winery is the value restriction on the hasMaker slot on the class Wine)
  54. 54. + Domain & range properties
      • In OWL and many other KR languages, relations (properties, slots) are strictly binary
      • The domain & range represent the source & target arguments, respectively, for the property
      • Domain of a slot – the class (or classes) that may have the slot
          − Wine is the domain of the slot hasWineColor
      • Range of a slot – the class (or classes) to which slot values belong
          − everything that fills the hasWineColor slot is an instance of the enumerated class {red, white, rose}
      • Some KR languages that inherently support n-ary relations, such as Common Logic (CL), do not make this distinction
          − More flexible, intuitively more like mathematics, where functions have ranges (or return types) but not all relations are functions
          − Requires additional relations to specify argument order, which can be critical for ontology alignment
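In RDFS and OWL, domain and range declarations drive inference rather than input validation: using a property lets a reasoner conclude the types of its subject and object. A minimal sketch of that behavior follows, with hypothetical lookup tables and with `WineColor` as an assumed name for the enumerated class {red, white, rose}.

```python
# Hypothetical declarations: property -> domain class / range class.
DOMAIN = {"hasWineColor": "Wine"}
RANGE = {"hasWineColor": "WineColor"}  # assumed name for {red, white, rose}


def infer_types(triples):
    """Infer rdf:type-style facts from (subject, property, object) triples."""
    types = {}
    for subject, prop, obj in triples:
        if prop in DOMAIN:
            types.setdefault(subject, set()).add(DOMAIN[prop])
        if prop in RANGE:
            types.setdefault(obj, set()).add(RANGE[prop])
    return types


if __name__ == "__main__":
    print(infer_types([("MariettaOldVinesRed", "hasWineColor", "red")]))
```

Nothing in the input says that `MariettaOldVinesRed` is a wine; that conclusion follows from the domain of `hasWineColor`.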
  55. 55. + Property inheritance
      • A subclass inherits all the slots of its superclass(es)
      • A subclass can add constraints to “narrow” the set of allowed values
          − Make the cardinality range smaller
          − Replace a class in the range with a subclass
  56. 56. + Individuals
      • An individual (instance; “object” in other paradigms)
          − Any class that an individual is a member of is a type of the individual
          − Any superclass of such a class is an ancestor (or type) of the individual
      • Specify slot values for the individual
          − Slot values should conform to constraints such as range, value type, and cardinality restrictions
  57. 57. + OWL Individuals
      [Figure: individuals Graco, Garmin, Chevrolet, Sony, BMW]
  58. 58. + OWL Statements
      [Figure: individuals Graco, Garmin, Chevrolet, Sony, BMW; a Mini Cooper; a Camaro LT Convertible]
  59. 59. + OWL ObjectProperty
      [Figure: individuals Graco, Garmin, Chevrolet, Sony, BMW; a Mini Cooper; a Camaro LT Convertible]
      <owl:ObjectProperty rdf:ID="builtBy">
        <rdfs:range rdf:resource="#Enterprise"/>
        <rdfs:domain rdf:resource="#DurableGood"/>
        <owl:inverseOf rdf:resource="#hasBuilt"/>
      </owl:ObjectProperty>
  60. 60. + OWL ObjectProperty
      [Figure: same diagram, with the enterprises (Graco, Garmin, Chevrolet, Sony, BMW) labeled “range”]
      <owl:ObjectProperty rdf:ID="builtBy">
        <rdfs:range rdf:resource="#Enterprise"/>
        <rdfs:domain rdf:resource="#DurableGood"/>
        <owl:inverseOf rdf:resource="#hasBuilt"/>
      </owl:ObjectProperty>
  61. 61. + OWL ObjectProperty
      [Figure: same diagram, with the enterprises labeled “range” and the goods (a Mini Cooper, a Camaro LT Convertible) labeled “domain”]
      <owl:ObjectProperty rdf:ID="builtBy">
        <rdfs:range rdf:resource="#Enterprise"/>
        <rdfs:domain rdf:resource="#DurableGood"/>
        <owl:inverseOf rdf:resource="#hasBuilt"/>
      </owl:ObjectProperty>
  62. 62. + Inverse Properties
      [Figure: same diagram, with the enterprises (Graco, Garmin, Sony, Chevrolet, BMW) labeled “domain” and the goods (a Mini Cooper, a Camaro LT Convertible) labeled “range”]
      <owl:ObjectProperty rdf:ID="hasBuilt">
        <rdfs:range rdf:resource="#DurableGood"/>
        <rdfs:domain rdf:resource="#Enterprise"/>
        <owl:inverseOf rdf:resource="#builtBy"/>
      </owl:ObjectProperty>
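The effect of the `owl:inverseOf` declarations for `builtBy` and `hasBuilt` can be sketched as a simple rule over triples: for every statement using one property, the statement with subject and object swapped holds for the other. The `INVERSE` table and the individual names below are illustrative placeholders, not part of the slides' ontology.

```python
# Mirrors the owl:inverseOf declarations on builtBy / hasBuilt.
INVERSE = {"builtBy": "hasBuilt", "hasBuilt": "builtBy"}


def with_inverses(triples):
    """Return the triples plus the statements implied by inverse properties."""
    result = set(triples)
    for subject, prop, obj in triples:
        if prop in INVERSE:
            result.add((obj, INVERSE[prop], subject))
    return result


if __name__ == "__main__":
    asserted = {("MiniCooper", "builtBy", "BMW")}
    for triple in sorted(with_inverses(asserted)):
        print(triple)
```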
  63. 63. + OWL Class
      [Figure: individuals Graco, Garmin, Chevrolet, Sony, BMW]
      <owl:Class rdf:ID="ManufacturingEnterprise"/>
      <owl:Class rdf:ID="DiscreteManufacturingEnterprise">
        <rdfs:subClassOf>
          <owl:Class rdf:about="#ManufacturingEnterprise"/>
        </rdfs:subClassOf>
      </owl:Class>
  64. 64. + Class Axioms
      • Subsumption (necessary)
          − A ⊆ B, where B is a class description
          − partial or primitive class
      • Definition (necessary and sufficient)
          − C ≡ D, where D is a class description
          − complete or defined class
      [Figure: set diagrams labeled A, B and C, D]
      Courtesy Evan Wallace, NIST
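  65. 65. + Meta-Properties – Global Cardinality Restriction
      • Functional (each individual in the domain has at most one value in the range)
      • Inverse Functional (each value in the range belongs to at most one individual in the domain)
      [Figure: domain-to-range diagrams for the two cases]
      Courtesy Evan Wallace, NIST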
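One consequence of these global restrictions is worth a small sketch. When a functional property has two asserted values for the same subject, OWL does not report an error by default; it concludes that the two values denote the same individual (and likewise for two subjects sharing a value of an inverse functional property). The property and individual names below are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def inferred_same_as(triples, functional=(), inverse_functional=()):
    """Return pairs of names that must denote the same individual."""
    groups = defaultdict(set)
    for subject, prop, obj in triples:
        if prop in functional:
            # Same subject and property: all objects are one individual.
            groups[("f", prop, subject)].add(obj)
        if prop in inverse_functional:
            # Same property and object: all subjects are one individual.
            groups[("if", prop, obj)].add(subject)
    pairs = set()
    for names in groups.values():
        pairs.update(combinations(sorted(names), 2))
    return pairs


if __name__ == "__main__":
    data = [
        ("W1", "hasMaker", "Elyse"),
        ("W1", "hasMaker", "ElyseWinery"),
        ("p1", "hasPassportNumber", "123"),
        ("p2", "hasPassportNumber", "123"),
    ]
    print(inferred_same_as(data, functional={"hasMaker"},
                           inverse_functional={"hasPassportNumber"}))
```

If the names were additionally declared to be different individuals, a reasoner would report an inconsistency instead; that case is not modeled here.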
  66. 66. + Class Descriptions – Property Restriction
      [Figure: an individual of Class C and its property P]
      • Quantified property restriction (type)
          − Universally quantified – allValuesFrom
          − Existentially quantified - someValuesFrom
      • hasValue property restriction (value)
      • Property cardinality restriction (# of values)
      Courtesy Evan Wallace, NIST
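The difference between the quantified restrictions is easy to see on asserted data. The sketch below is a closed-world check over a list of values, not how a reasoner evaluates restrictions, and the `WINERIES` set is a hypothetical class extension.

```python
def some_values_from(values, class_members):
    """Existential: at least one value belongs to the class."""
    return any(v in class_members for v in values)


def all_values_from(values, class_members):
    """Universal: every value belongs to the class (true if there are none)."""
    return all(v in class_members for v in values)


def has_value(values, required):
    """hasValue: one specific value must be present."""
    return required in values


WINERIES = {"Elyse", "AndrewMurray"}  # hypothetical members of Winery

if __name__ == "__main__":
    print(some_values_from(["Elyse", "Unknown"], WINERIES))
    print(all_values_from(["Elyse", "Unknown"], WINERIES))
    print(all_values_from([], WINERIES))
```

Note that `all_values_from` holds for an individual with no values at all, which matches the OWL reading of allValuesFrom: it constrains the values that exist but does not require any to exist.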
  67. 67. + Disjoint classes
      [Figure: non-overlapping sets A and B]
      • Classes are disjoint if they cannot have common instances
      • Disjoint classes cannot have any common subclasses (other than unsatisfiable, i.e., necessarily empty, ones)
          − If winery and wine are disjoint, then there is no instance that is both a winery and a wine; there is no class that is both a subclass of winery and a subclass of wine
      • Disjointness is often used to aid consistency checking
      • Disjointness is also helpful in teasing out subtle distinctions among classes across multiple ontologies
      • Equivalence is also often used to identify the same concepts across ontologies that may be named differently, or to name classes defined through class axioms
  68. 68. + New features in OWL 2
      • New property constructs
          − Reflexive, irreflexive, & asymmetric properties
          − Property chains
          − Disjoint properties
          − Keys
          − Self-reflexive properties
          − Qualified cardinality restrictions
          − Negative property assertions
      • New class axioms
          − Disjoint unions
          − Disjoint classes
  69. 69. + More new features
      • Extended datatype handling
          − Better support for XML Schema Datatypes
          − New datatypes for OWL specifically (owl:real, owl:rational, rdf:PlainLiteral)
          − Custom datatype definition & use of xsd facets / datatype restrictions
          − New data range restrictions: intersections, unions, complements
      • Support for punning
          − Limited support for a particular element to be defined as both a class & individual, for example
      • Improved support for annotations (e.g., metadata about ontologies, provenance, transformation detail, etc.)
      • See http://www.w3.org/TR/owl2-new-features/ & http://www.w3.org/TR/owl2-quick-reference/ for detail
      • Current work specifically focused on provenance: http://www.w3.org/2011/prov/wiki/Main_Page
  70. 70. + Rules of Thumb
      • Cycles are common in many KR systems, though rarely “a good thing”
      • Cycles are disallowed by some tools because they prohibit “code generation” and export, including RDF/OWL
      • In a subclass cycle (A ⊆ B ⊆ C ⊆ A), classes A, B, and C have equivalent sets of instances
      • By many definitions, A, B, and C are equivalent
      • Use owl:equivalentClass instead of creating cycles
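A tool that wants to flag subclass cycles, and suggest `owl:equivalentClass` in their place, could group classes that are mutually reachable through subclass links. The following is a small sketch of that idea over a hypothetical `edges` table (child to parents); it favors clarity over efficiency.

```python
def reachable(start, edges):
    """All classes reachable from start by following subclass links."""
    seen, stack = set(), list(edges.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(edges.get(node, ()))
    return seen


def equivalence_groups(edges):
    """Groups of two or more classes that lie on a common subclass cycle."""
    nodes = set(edges) | {p for parents in edges.values() for p in parents}
    reach = {n: reachable(n, edges) for n in nodes}
    groups = set()
    for n in nodes:
        # m is equivalent to n when each is reachable from the other.
        group = frozenset(m for m in reach[n] if n in reach[m])
        if len(group) > 1:
            groups.add(group)
    return groups


if __name__ == "__main__":
    cyclic = {"A": {"B"}, "B": {"C"}, "C": {"A"}, "D": {"A"}}
    print(equivalence_groups(cyclic))
```

Here `A`, `B`, and `C` form one group, while `D` is merely a subclass of that group and is not reported.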
  71. 71. + Siblings in the class hierarchy
      • All siblings should be specified at roughly the same level of generality
      • Compare to sections and subsections in a book
  72. 72. + Class specification
      • If a class has only one child, there may be a modeling problem – often a sign that a definition is incomplete
          − If the only Red Burgundy we have is Côtes d’Or, why introduce the subclass?
      • Subclasses of a class usually
          − Have additional properties
          − Have additional slot restrictions
          − Participate in different relationships
      • Compare to bullets in a bulleted list
  73. 73. + Creating levels & subclasses
      • If a class has a large number of subclasses, it may be useful to define intermediate levels
          − For example, in the domain of wines, there are natural groupings around wine color
          − However, if no natural classification exists, the long list may be appropriate
  74. 74. + Inheritance, naming, synonyms
      • A “wine” is not a subclass of “wines”
          − A particular vintage should be classified as an instance of the class Wines
      • Class names should be either
          − all singular
          − all plural
      • Synonym names for the same concept are not different classes
          − Many systems and metadata standards support synonymous terms as part of a class definition
          − OWL allows necessary and sufficient conditions to be defined, thereby allowing synonym definitions to be “first class” terms
      [Figure: a Class linked by instance-of to the Instance MariettaOldVinesRed]
  75. 75. + Class vs. property value
      • Do concepts with different slot values become restrictions for different slots?
      • How important is the distinction for the domain?
      • Class definitions for most domains should be fairly stable – i.e., they should not change frequently once the definitions are established and individuals created
  76. 76. + Class vs. individual
      • Individual instances are the most specific objects in an ontology
      • If concepts form a natural hierarchy, represent them as classes
      • If they will have instances below them, represent them as classes
  77. 77. + Limiting scope
      • An ontology should not contain all the possible information about the domain
          − No need to specialize or generalize more than the application requires
          − No need to include all possible properties of a class
              • Only the most salient properties
              • Only the properties that the application requires
      • Ontologies of wine, food, and their pairings probably will not include details such as:
          − Bottle size (half bottle, full bottle, magnum, …)
          − Label color
          − Wine bottle color (green, amber, …)
  78. 78. + Part 3: Putting It All Together
      • Syntax Checking
      • Consistency Checking
      • Instance Data Analysis & Evaluation Services
      • Applications
          − Lighter Weight
          − Richer Applications
      • Examples – showing a combination of semantic web components in action: TW Wine Agent
      • Understanding, Trust & Proof: Inference Web
      • eScience Applications: Virtual Solar Terrestrial Observatory, SONet, & PopSciGrid
      • Leveraging Explanations in a Cognitive Assistant
      • Questions
  79. 79. + Syntax checking
      For RDF & OWL Ontologies
      • RDF syntax checking, graph visualization
          − W3C RDF Validator (http://www.w3.org/RDF/Validator/)
          − Jena API & Toolkit (http://jena.sourceforge.net/)
      • OWL syntax checking, OWL dialect determination
          − OWL Consistency Checker (http://clarkparsia.com/pellet/, also provides full OWL 2 DL reasoning capabilities)
          − OWL 2 Validator (Univ. of Manchester, http://owl.cs.manchester.ac.uk/validator/)
          − Protégé OWL (http://protege.stanford.edu/)
          − Jena API & Toolkit (http://jena.sourceforge.net/)
      • Every tool provides unique capabilities; sophisticated projects may require multiple approaches
      • Tools listed are open source; commercial options are also available
  80. 80. + Consistency checking
      Requirements are typically application specific
      • RDF vocabularies should be checked by an RDF validator or rule engine
          − Jena Semantic Web Framework - http://jena.sourceforge.net/
          − Pychinko, MindSwap’s Rete-based RDF-friendly rule engine (CWM clone) – http://www.mindswap.org/~katz/pychinko/
      • OWL ontologies should be run through a consistency-checking reasoner
          − Pellet (open source, originally from Mindswap, supported by Clark & Parsia, LLC) – http://clarkparsia.com/pellet/
          − RacerPro – http://www.racer-systems.com/
          − FaCT++ – http://owl.man.ac.uk/factplusplus/
          − HermiT OWL Reasoner – http://www.hermit-reasoner.com/
          − KAON2 infrastructure for managing OWL-DL, SWRL, and F-Logic ontologies – http://kaon2.semanticweb.org/
          − VIStology’s ConsVISor OWL consistency checker
      • OWL instance data may also be run through checking tools
          − TW Instance data evaluation - http://onto.rpi.edu/demo/oie/
  81. 81. + Example process
      [Figure: validation pipeline. Labels: Valid RDF/XML Documents (Valid OWL Syntax Representation); Pellet OWL DL Reasoner, yielding Valid OWL (DL consistency checked, concept satisfiability checked); TW Instance Data Evaluation, yielding Valid OWL (FOL consistency checked, deductive closure); Ontology Registry & Library; Application Environment]
      Ontologies should be checked to ensure consistency and to limit the potential for invalid conclusions
  82. 82. + Inconsistencies caused by distributed OWL data
      [Figure: in the Wine ontology, wine:EarlyHarvest and wine:LateHarvest are each rdfs:subClassOf wine:Wine and are owl:disjointWith each other. A new ontology declares wine:BadWineClass rdfs:subClassOf both. New instance data declares wi:BadWineInstance with rdf:type both.]
      • Case 1: semantic inconsistency caused by a new class
      • Case 2: semantic inconsistency caused by a new instance
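The two cases on this slide can be reproduced with a toy check. The sketch below is an assumption-laden simplification: it hard-codes the class names from the figure, treats disjointness as a list of pairs, and only follows asserted subclass links, whereas a DL reasoner such as Pellet derives these clashes from the full semantics.

```python
# Asserted subclass links (child -> parents), following the slide's figure.
SUBCLASS_OF = {
    "EarlyHarvest": {"Wine"},
    "LateHarvest": {"Wine"},
    "BadWineClass": {"EarlyHarvest", "LateHarvest"},  # case 1: new class
}
DISJOINT_PAIRS = [("EarlyHarvest", "LateHarvest")]
# Case 2: new instance data typed with both disjoint classes.
TYPES = {"BadWineInstance": {"EarlyHarvest", "LateHarvest"}}


def with_ancestors(classes):
    """The given classes plus everything above them in the hierarchy."""
    seen, stack = set(), list(classes)
    while stack:
        cls = stack.pop()
        if cls not in seen:
            seen.add(cls)
            stack.extend(SUBCLASS_OF.get(cls, ()))
    return seen


def disjointness_clashes(classes):
    """Disjoint pairs that are both implied by the given set of classes."""
    expanded = with_ancestors(classes)
    return [(a, b) for a, b in DISJOINT_PAIRS if a in expanded and b in expanded]


def unsatisfiable_classes():
    return sorted(c for c in SUBCLASS_OF if disjointness_clashes({c}))


def inconsistent_individuals():
    return sorted(i for i, types in TYPES.items() if disjointness_clashes(types))


if __name__ == "__main__":
    print(unsatisfiable_classes())
    print(inconsistent_individuals())
```

Strictly speaking, the new class in case 1 is unsatisfiable (it can have no instances) rather than making the ontology inconsistent by itself; the new instance in case 2 does make the combined data inconsistent.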
  83. 83. + TW OWL instance-data evaluation
      • Motivation
          − Objective metrics are needed to evaluate whether semantic web instance data is ready for practical use
          − Next generation Chimaera for more diverse and more instance-oriented applications
      • Challenges
          − Criteria for “ready for use” vary across applications
          − Existing syntax and logical-consistency checking is not enough for applications using some standard practical assumptions (e.g. closed world, unique names, …)
      • Solution
          − Extensible & customizable service-oriented architecture
          − Integrated environment for evaluating issues beyond standard syntactic correctness and logical consistency
      • See http://onto.rpi.edu/demo/oie/
  84. 84. + Some issues in OWL instance data
      [Figure, left: Wine is rdfs:subClassOf an owl:Restriction with owl:onProperty hasMaker and owl:cardinality "1"; individual W has rdf:type Wine but no hasMaker value (MPV). Right: Wine is rdfs:subClassOf an owl:Restriction with owl:onProperty hasColor and owl:cardinality "1"; individual W has rdf:type Wine and both hasColor Red and hasColor White (EPV).]
      • Missing property value (MPV)
      • Excessive property value (EPV)
  85. 85. + “Pluggable” instance evaluation services
      <?xml version="1.0"?>
      <!DOCTYPE rdf:RDF [
        <!ENTITY rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#" >
        <!ENTITY wine "http://tw.rpi.edu/2008/04/wine-lite.owl#" >
      ]>
      <rdf:RDF xmlns:wine = "&wine;" xmlns:rdf = "&rdf;">
        <!-- A wine instance missing a value for hasMaker. Therefore, we need
             to report a warning indicating a missing property value. -->
        <wine:Zinfandel rdf:about="#W1">
          <wine:hasColor rdf:resource="&wine;Red"/>
          <wine:hasFlavor rdf:resource="&wine;Moderate"/>
          <wine:hasSugar rdf:resource="&wine;Dry"/>
          <wine:hasBody rdf:resource="&wine;Full"/>
        </wine:Zinfandel>
        <!-- A wine instance missing a value for hasColor; however, the value
             for hasColor for all instances of Zinfandel has been defined in
             the wine ontology. Therefore, we don't need to report a warning
             indicating a missing property value. -->
        <wine:Zinfandel rdf:about="#W3">
          <wine:hasFlavor rdf:resource="&wine;Moderate"/>
          <wine:hasSugar rdf:resource="&wine;Dry"/>
          <wine:hasBody rdf:resource="&wine;Full"/>
          <wine:hasMaker>
            <wine:Winery rdf:about="#Elyse" />
          </wine:hasMaker>
        </wine:Zinfandel>
      </rdf:RDF>
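The behavior described in the two XML comments, warning for W1 but not for W3, can be sketched as follows. Everything here is a hypothetical simplification rather than the TW evaluation service itself: the dictionaries stand in for the ontology and the instance data, and the class-level value `Red` for Zinfandel's `hasColor` is an assumption made for the example.

```python
# Properties that the ontology requires (cardinality 1) on each class.
REQUIRED = {"Wine": ["hasMaker", "hasColor"]}
SUPERCLASS = {"Zinfandel": "Wine"}
# Values the ontology fixes for every member of a class (assumed value).
CLASS_LEVEL_VALUES = {"Zinfandel": {"hasColor": "Red"}}

INSTANCES = {
    "W1": ("Zinfandel", {"hasColor": ["Red"], "hasFlavor": ["Moderate"],
                         "hasSugar": ["Dry"], "hasBody": ["Full"]}),
    "W3": ("Zinfandel", {"hasFlavor": ["Moderate"], "hasSugar": ["Dry"],
                         "hasBody": ["Full"], "hasMaker": ["Elyse"]}),
}


def class_chain(cls):
    """The class followed by its superclasses, most specific first."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = SUPERCLASS.get(cls)
    return chain


def missing_property_values(name):
    """Required properties with neither an asserted nor a class-level value."""
    cls, props = INSTANCES[name]
    chain = class_chain(cls)
    missing = []
    for ancestor in chain:
        for prop in REQUIRED.get(ancestor, []):
            asserted = bool(props.get(prop))
            defined_on_class = any(
                prop in CLASS_LEVEL_VALUES.get(c, {}) for c in chain
            )
            if not asserted and not defined_on_class:
                missing.append(prop)
    return missing


if __name__ == "__main__":
    for name in INSTANCES:
        print(name, missing_property_values(name))
```

The point of the design is that a purely syntactic check would flag W3 for its missing `hasColor`, while a check that consults the ontology does not.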
  86. 86. + Ontology-enhanced applications
      • Simple ‘ontologies’, taxonomy driven applications
          − Controlled, shared vocabularies (content management, faceted search)
          − Web site organization & navigation, expectation setting
          − Browsing & search engine optimization via mapping to tagged structures such as the new schema.org from Yahoo!, Google & Bing, and the Good Relations ontology – see http://purl.org/goodrelations/
          − “Smarter” search support (query expansion such as FindUR, e-Cyc; SPARQL for linked data applications)
      • More expressive ontologies provide more options for
          − Configuration, e.g., AT&T PROSE/Questar
          − Data Integration, e.g., Virtual Observatories (VSTO, …)
  87. 87. + Semantic Sommelier
      A semantically enabled advisor that utilizes
      • ontologies
      • reasoning
      • social
      • mobile
      • provenance
      • context
  88. 88. + TW Wine Agent – semantic interoperability
      Wine Agent receives a meal description and retrieves a selection of matching wines available on the Web, using an ensemble of standards and tools
      • RDF & OWL − for encoding wine/food listings and pairing recommendations
      • Semantic MediaWiki − for publishing user-contributed recommendations
      • Twitter and Facebook − for posting recommendations
      • Pellet − for deriving knowledge from wine ontology and recommendations
      • SPARQL − for querying wine/food listings with recommendations
      • Inference Web − for explaining TW Wine Agent’s intelligent behavior
  89. 89. + Social Semantic Web data publishing
      Collaborative wine recommendations
      • Using Semantic MediaWiki to manage users and user-contributed food/wine pairing recommendations
      • Using a semantic form to add OWL instance data
  90. 90. + Semantic Query: food hierarchy & recommendation
  91. 91. + Wine Agent processing
      Given a description of a meal
      • Combine the wine ontology and the OWL data published on the wiki
      • Use Pellet and SPARQL to state a premise (the meal) and query the knowledge base for a suggestion for a wine description or a set of wine instances
      • Use Inference Web to explain results (descriptions, instances, provenance, reasoning engines, etc.)
  92. 92. + Wine Agent for iPhone
      • Client application that talks to a semantic web (SW) service
      • Make requests for dishes and wines using auto-generated interfaces
      • Make recommendations to the system for others
  93. 93. + Dynamic interfaces as function of ontology
      [Figure: property conditions used to select the generated interface element]
      • owl:Functional or owl:maxCardinality == 1
      • rdfs:range == owl:Class
      • rdfs:range != owl:Class ∪ rdfs:Literal
      • rdfs:range == rdfs:Literal
      • Otherwise
  94. 94. + Descriptions
      • Resulting descriptions are converted to classes (for creating recommendations) or to instances (for obtaining recommendations)
      • When going to a class, all “Is A” statements become rdfs:subClassOf and all other properties become owl:Restrictions
      • Instances are realized by the reasoner to identify appropriate recommendations
  95. 95. + Getting the Recommendation
      • Recommendations are made up of two classes: 1 dish, 1 wine
      • When the instance is classified, the agent returns the matching recommendations
      • Tapping a particular recommendation causes the wine agent to look for items belonging to the matching pair
  96. 96. + Extensibility
      • Wine Agent Ontology is extensible
      • “Menu file” in RDF
          − Restaurants can introduce new classes and instances to provide better matches for users
          − Searches can be restricted to imported menus
  97. 97. + Future Work
      • Group recommendations using multiple iPhones/iPod touches
      • Recommendations based on personal dietary needs and preferences
      • ‘iPellet’, an OWL Reasoner designed for iPhone, for offline reasoning
  98. 98. + Observations from the Wine Agent
      • Background knowledge is reasonably simple and built in OWL (includes foods and wine and pairing information similar to the OWL Guide, Ontology Engineering 101, CLASSIC Tutorial, …)
      • Background knowledge can be used for simple query expansion over wine sources to retrieve, for example, documents about red wines (including zinfandel, syrah, …)
      • Background knowledge is used to interact with structured queries such as those possible on wine.com
      • Constraints allow a reasoner like Pellet to infer consequences of the premises and query
      • Explanation system (Inference Web) can provide provenance information such as information on the knowledge source (McGuinness’ wine ontology) and data sources (such as wine.com)
      • Services work could allow automatic “matchmaking” instead of hand-coded linkages with web resources
  99. 99. + Trust & Understanding
      If users (humans and agents) are to use, reuse, and integrate system answers, they must trust them.
      System transparency supports understanding and trust.
      Even simple “lookup” systems benefit from providing information about their sources.
      Systems that manipulate information (with sound deduction or potentially unsound heuristics) benefit from providing information about their manipulations.
      Goal: Provide interoperable infrastructure that supports explanations of sources, assumptions, and answers as an enabler for trust.
  100. 100. + Explanations, Proof Analysis
      • Framework for explaining question answering tasks
      • Stores and manages meta-information about proofs and explanations through a distributed repository
      • Uses the Proof Markup Language (PML) for proof interchange (OWL-based)
      • Services include proof presentation, strategy-based views, filtering, trust, combination, expansion, checking, search, and other capabilities
      • Integrated browsing and display of PML documents from diverse sources
      • Rewriting capabilities for improved understanding
      • Multi-modal dialogue options including alternative strategies for presenting explanations, visualizations, and summaries
  101. 101. + Inference Web (IW)
      [Figure: End Users reach Distributed PML data through End-User Interaction Services (Explanation via Graph, Explanation via Summary, Explanation via Annotation) and Data Access & Data Analysis services (Validate published PML data, Access published PML data)]
      Inference Web is a semantic web-based knowledge provenance management infrastructure:
      • PML for encoding and interchanging provenance metadata in a distributed environment
      • Interactive explanation services for end-users
      • Data access and analysis services for enriching the value of knowledge provenance
  102. 102. + Proof Markup Language (PML)
      • A new kind of linked data on the Web
      [Figure: PML data linked to data items (D) across the World Wide Web and Enterprise Webs]
      • Modularized & extensible
          − Provenance: annotate provenance properties
          − Justification: encodes provenance relations
          − Trust: add trust annotation
      • Semantic Web based
  103. 103. + Technical Architecture
      [Figure: architecture diagram. Components: IW Registrar (Web Service Interface, Edit Interface, PML database); machine agents; domain experts; PML API; PML Translator (N3, KIF); Proofs; PML (provenance, justification, trust); Computation Tools (validation, conflict detection, abstraction); Data Access; Presentation Tools (IW Browser); PML Java Objects; PML Web Documents; IW Search. Relations shown: call, generate, edit, translate, is-stored-as, parse, reads, uses, harvests & indexes, searches, subscribes]
  104. 104. Making Systems Actionable using Knowledge Provenance
      [Figure: example applications: Mobile Wine Agent; CALO; Combining Proofs in TPTP; Intelligence Analyst Tools; GILA; Knowledge Provenance in Virtual Observatories; NOW including Data-gov]
  105. 105. + Example Proof Tree Browser