Successfully reported this slideshow.
Your SlideShare is downloading. ×

Ontology 101 - New York Semantic Technology Conference

Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad

Check these out next

1 of 152 Ad

More Related Content

Similar to Ontology 101 - New York Semantic Technology Conference (20)

Recently uploaded (20)

Advertisement

Ontology 101 - New York Semantic Technology Conference

  1. 1. + Ontology 101: An Introduction to Knowledge Representation, the Web Ontology Language (OWL) & Ontology Development Elisa Kendall, Thematix Partners LLC & Deborah McGuinness, RPI / McGuinness Associates 15 October 2012
  2. 2. + Content management Mars was photographed by the Hubble Space Telescope in August 2003 as the 2 2 planet passed closer to Earth than it had in nearly 60,000 years. Image Credit: NASA, J. Bell (Cornell U.) and M. Wolff (SSI) A sunset on Mars creates a glow due to the presence of tiny dust particles in the atmosphere. This photo is a combination of four images taken by Mars Pathfinder, which landed on Mars in 1997. Image credit: NASA/JPL Recent images from instruments on board the Mars Reconnaissance Orbiter take much more detailed, narrower views of specific features of the Martian surface. Image credit: NASA/JPL The Planetary Data Store (PDS) is a distributed repository of 40+ years’ imagery & data taken by a range of instruments on many diverse missions, available for scientific research. Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  3. 3. + Smart search 3 3 Provenance/sources for tracking family members in the 19th century include early census data (often error prone), military records, passenger & immigration lists, online documents (e.g., county histories, church histories, etc.) • Historical/forensic research requires cross-domain search of a wide variety of resources within a given geo-spatial/temporal context • Similar capabilities are essential for business intelligence, law enforcement, government applications – all require terminology reconciliation Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  4. 4. + Merchandising Attribute 1 4 4 Attribute 2 Leg Room Attribute 4 Attribute 5 Attribute 5 On-Off Access “Comfort- Attribute 7 Attribute 8 Oriented” Attribute 9 Flight Media Options Attribute 11 Attribute 12 Meal Options Attribute 14 Attribute 15 Time of Day Attribute 17 Attribute 18 Distance to Gate Attribute 20 Attribute 21 Flight Characteristics Attribute N … Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  5. 5. + Search Engine Optimization 5 5  Use search engines as a front door to transaction  Dramatically increase relevance by inclusion of more meaningful, synonymic tags  Enrich search results, adding ratings, video, prices, avails, amenities, locality etc. Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  6. 6. + Tutorial Outline 6 6  Introduction to Knowledge Representation  OWL Basics  Tools & Applications Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  7. 7. + Part 1: Introduction to Knowledge Representation 7 7  A little background and a few definitions  Layers of abstraction & conceptual modeling  Classifying ontologies  A little methodology Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  8. 8. + Definitions 8 8  An ontology is a specification of a conceptualization. – Tom Gruber  Knowledge engineering is the application of logic and ontology to the task of building computable models of some domain for some purpose. – John Sowa  Artificial Intelligence can be viewed as the study of intelligent behavior achieved through computational means. Knowledge Representation then is the part of AI that is concerned with how an agent uses what it knows in deciding what to do. – Brachman and Levesque, KR&R  Knowledge representation means that knowledge is formalized in a symbolic form, that is, to find a symbolic expression that can be interpreted. – Klein and Methlie  The task of classifying all the words of language, or what's the same thing, all the ideas that seek expression, is the most stupendous of logical tasks. Anybody but the most accomplished logician must break down in it utterly; and even for the strongest man, it is the severest possible tax on the logical equipment and faculty. – Charles Sanders Peirce, letter to editor B. E. Smith of the Century Dictionary Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  9. 9. + What is an ontology? An ontology specifies a rich description of the 9 9  Terminology, concepts, nomenclature  Properties explicitly defining concepts  Relations among concepts (hierarchical and lattice)  Rules distinguishing concepts, refining definitions and relations (constraints, restrictions, regular expressions) relevant to a particular domain or area of interest. Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  10. 10. + Logic and ontological commitment 10 10  Predicate logic is harder to read than the original English, but is more precise: Every semi-trailer truck has at least 3 axles. ( x)(((SemiTrailerTruck(x)  ( y)(SemiTrailer(y)  (hasPart (x,y)))  (SemiTrailerTruck(x)  ( z)(TractorUnit(z)  (hasPart (x,z))))  ( s)(set(s)  (count(s,(≥3))  ( w)(member(w,s)  (Axle(w)  hasPart(x,w))) )).  Logic is a simple language with few basic symbols.  The level of detail depends on the choice of predicates – these predicates represent an ontology of the relevant concepts in the domain.  Different choices of predicates represent different ontological commitments. * Derived from Knowledge Representation: Logical, Philosophical, and Computational Foundations, John F. Sowa, Brooks/Cole, Pacific Grove, CA, 2000. Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  11. 11. + Ontology-based technologies 11 11  Ontologies provide a common vocabulary for use by independently developed resources, processes, services  Agreements among organizations sharing common services can be made with regard to their usage; the meaning of relevant concepts can be expressed unambiguously  By composing / mapping ontologies and mediating terminology across participating events, resources and services, independently-developed services can work together to share information and processes consistently, accurately, and completely  Ontologies also ensure  Valid conversations among agents to collect, process, fuse, and exchange information  Accurate searching by ensuring context using concept definitions and relations instead of/in addition to statistical relevance of keywords Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  12. 12. + KR language features  Vocabulary – a collection of symbols 12 12  Domain-independent logical symbols (e.g.,  or )  Domain-dependent constants, identifying individuals, properties, or relations in the application domain or universe of discourse  Variables, whose range is governed by quantifiers  Punctuation that separates or groups other symbols  Syntax –  formation rules that determine how symbols can be combined in well- formed expressions  rules may be stated in a linear grammar, graph grammar, or independent abstract syntax  Semantics –  a theory of reference that determines how the constants and variables are associated with things in the universe of discourse  a theory of truth that distinguishes true statements from false statements  Rules of Inference –  rules that determine how one pattern can be inferred from another  if the logic is sound, the rules of inference must preserve truth as determined by the semantics Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  13. 13. + Classifying logics Logics vary from classical FOL along six dimensions:  Syntax 13 13  Subsets limit permissible operators or combinations, e.g., propositional logic (without quantifiers), Horn-clause (excludes disjunctions in conclusions, such as Prolog), terminological or definitional logics (containing additional restrictions, e.g., description logics)  Proof Theory restricts or extends permissible proofs: • Intuitionistic logic and relevance logic rule out certain extraneous information • Non-monotonic logics allow introduction of default assumptions • Access-limited logic restricts the number of times a proposition can be used in a proof; Linear logic allows a proposition to be used only once • Modal logic incorporates modal auxiliaries (p means p is possibly true; •p means p is necessarily true), temporal logic extends model logic to include always, sometimes • Intensional logics express concepts such as need, ought, hope, fear, wish, believe, know, expect, and intend  Model Theory modifies the truth value of statements in terms of some model of the world: classical FOL is two-valued; a three-valued logic introduces unknowns; fuzzy logic uses the same notation as FOL but with an infinite range of certainty factors (1.0 to 0.0)  Ontology – frameworks may include support for built-in components, such as set theory or time  Meta-language – language for encoding information about objects Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  14. 14. + Description logic  A family of logic-based Knowledge Representation formalisms 14 14  Descendants of semantic networks and KL-ONE  Describe domain in terms of concepts (classes), roles (relationships), and individuals (instances)  Distinguished by  Formal semantics • Decidable fragments of FOL • Closely related to Propositional, Modal, and Dynamic Logics  Provision of inference services • Sound and complete decision procedures for key problems • Implemented systems (highly optimized)  Applications include  Configuration – product configurators, consistency checking, constraint propagation, first significant industrial application (e.g., CLASSIC)  Ontologies – ontology engineering (design, maintenance, integration), reasoning with ontology-based mark-up, service description and discovery  Databases – consistency of conceptual schemata, schema integration, query subsumption (w.r.t. conceptual schemata) Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  15. 15. + Knowledge bases, databases, and ontology 15 15  An ontology is a conceptual model of some aspect of a particular universe of discourse (or of a domain of discourse)  Typically, ontologies contain only “rarified” or “special” individuals, metadata, representing elemental concepts critical to the domain  A knowledge base is a persistent repository for  Ontology & metadata representing individuals, facts, & rules about how they can be combined or relate to one another  Metadata, facts & rules only – in some applications and frameworks the ontology is separately maintained  Most inference engines require in-memory deductive databases for efficient reasoning (including commercially available reasoners)  A knowledge base may be implemented in a physical, external database, such as a relational database, but reasoning is typically done on a subset (partition) of that knowledge base in memory Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  16. 16. + Reasoning and truth maintenance 16 16  Reasoning is the mechanism by which the assertions made in an ontology and related knowledge base are evaluated by an inference engine.  In classical logic, the validity of a particular conclusion is retained even if new information is received.  This may change if some of the preconditions are actually hypothetical assumptions invalidated by the new information.  The same idea applies for arbitrary actions – new information can make preconditions invalid.  Generally, there are two issues that a reasoner must address:  If some conclusion is invalid, which other conclusions are also invalid?  If some action cannot be performed, which others are at risk?  The “housekeeping” associated with tracking the threads that support answering these questions is called truth maintenance. Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  17. 17. + Negation  If all new information is “positive”, then all prior conclusions should 17 17 remain valid.  Problems arise if new information negates a prior assumption, causing it to be withdrawn.  What does it mean to negate (withdraw) an assumption?  Conclusive information is not available?  The assumption cannot be proven?  The assumption is not provable using certain methods?  The assumption is not provable given a fixed quantity of time?  The answers to these questions can result in different definitions of negation and differing interpretations by non-monotonic reasoners.  Solutions include chronological and “intelligent” backtracking algorithms, heuristics, circumscription algorithms, justification or assumption based retraction, and so forth, depending on the reasoner and methods used for truth maintenance.  Reasoning efficiency is dependent, in part, on the algorithms applied for truth maintenance. Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  18. 18. + Explanations and proofs 18 18  When a reasoner draws a particular conclusion, many users and applications want to understand why?  Primary motivations include interoperability, reuse, and trust  Understanding the provenance of the information and results is crucial, especially when web-based information is involved  What information sources were used (source)  How recently they were updated (currency)  How reliable these sources are (authoritativeness)  Was the information directly available or derived, and if derived, how (method of reasoning)  Methods used to explain why a reasoner reached a particular conclusion include explanation generation and proof specification Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  19. 19. + Abstraction layers 19 19 Ontology Ontology, Conceptual ER, Conceptual Business Process … ER, Relational, XML Schema XML, source code, scripting languages, stored procedures… Physical KBs, databases, asset repositories… *Layering diagram courtesy Kenn Hussey  Knowledge Representation / Management for Large Scale Applications  Provide broad metadata, process, service & asset management facilities (including feedback/lessons learned…)  Enable rich cross-domain, cross-process, cross organizational modeling supported by mapping & transformation services to provide maximum flexibility, interoperability  Leverage standards and best practices in information architecture, metadata modeling, management, registration, and governance, and asset management & registration  Provide incremental reasoning capabilities for model validation, transformation services  Repeatable, reusable, interoperable Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  20. 20. + Hypothetical conceptual model “EU-Rent” 20 20 Produced using Embarcadero EA/Studio Business Modeler Edition, courtesy Kenn Hussey Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  21. 21. + Hypothetical logical model “EU-Rent” 21 21 Produced using Embarcadero ER/Studio, courtesy Kenn Hussey Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  22. 22. + Hypothetical physical model “EU-Rent” 22 22 Produced using Embarcadero ER/Studio, courtesy Kenn Hussey Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  23. 23. + Classifying ontologies 23 23 Classification techniques are as diverse KR System as conceptual models; and generally include understanding OO Software Model  Level of Expressivity Entity – Relationship Model  Level of Complexity / Structure Concept Map  Granularity Topic Map  Target Usage, Relevance Amount of Automation, Reasoning Requirements Level of Expressivity  Database Schema  Prescriptive vs. Descriptive / Reliability / Level of XML Schema Authoritativeness Hierarchical Taxonomy  Design Methodology Simple Taxonomy  Governance Glossary  Vocabulary Management, Metrics Level of Complexity Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  24. 24. + Framework of dimensions 24 24  Semantic Dimensions  Expressiveness: represents how well a KR language addresses increasingly complex semantics  Structure: represents how well an ontology encodes semantics, with the same or less expressivity than the KR language  Granularity: represents the level of detail specified in an ontology  Pragmatic Dimensions  Intended use: the original use case(es), or purpose for developing a particular ontology  Automated reasoning: the extent to which the ontology is designed to be used for automated reasoning  Prescriptive vs. Descriptive: the extent to which an ontology was intended to be used for descriptive purposes vs. normative prescriptive use (i.e., with high degree of concern for correctness) Reference: http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2007 Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  25. 25. + Considerations  Intended use of ontologies, including domain requirements (e.g., 25 25 scientific and engineering apps require formulas, units of measure, computations that may be challenging to represent)  Intended use of KRSs that implement them, including reasoning requirements, questions to be answered  For distributed environments, the number and kinds of resources, processes, services requiring ontologies – how distributed, how unique, developed collaboratively or independently, dynamic community participation or static  What kinds of transformations are required among processes, resources, services to support semantic mediation  Ontology and KRS alignment / de-confliction / ambiguity resolution requirements  Ontology and KRS composition requirements, dynamic vs. static composition, in what environment and under what constraints  Performance, sizing, timing requirements of target environment Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  26. 26. + A little methodology … 26 26  Requirements, domain & use case analysis are critical  Develop initial source/reference material  Focus on system or application requirements  Iterative development starting with a “thread” that covers basic capabilities can ground the work and prioritize decisions  Need to understand and communicate  Architectural trade-offs, cost & technical benefits  The nature of the information & kinds of questions that need to be answered drive the architecture, approach, and ontology scoping and design  Reuse standards and well-tested, available ontologies whenever possible Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  27. 27. + What to look for  A controlled vocabulary 27 27 FAA & IATA airport codes, ACRISS car codes, …  A hierarchical or taxonomic structure (for query expansion) Vehicle, Ground-based Vehicle, Wheeled Vehicle, Powered Vehicle, Automobile, Sedan…  Knowledge supporting structured queries Find all available hybrid SUVs or hybrid sedans that can seat four adults within a reasonable taxi ride of Planet Hollywood, Las Vegas for 3-4 days the week of June 20th  Efficient inference (i.e., limited expressive power) vs. increased expressivity (potentially expensive or resource bounded computation)  Custom reasoning for temporal relations, geospatial, dynamic algorithm / equation evaluation, process-specific, conditional operations  Computational tractability Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  28. 28. + Start with canonical definitions  General concepts as well as domain-specific knowledge 28 28  Basic starting point – cross-domain definitions  Namespace definitions, metadata, naming conventions, governance policies  Commonly used structures & vocabularies, such as domain-specific vocabulary & messaging standards, international country & language codes (ISO), national postal addressing or other government standards, industry best practices  Common metadata for ontology & schema management (e.g., Dublin Core, for documents & models, ISO 1087 for synonyms & similar relations, MIME media types for images & multimedia, SKOS for annotations, etc.)  Domain vocabularies must be prioritized, selected based on business requirements, clear ROI  Common early targets include  Smart search (pull); customer experience & cross-sell / upsell (push) – RDFa, extensions to schema.org  Richer interoperability among trading partners – extensions to messaging agreements  Service registration, description, discovery & management  Augmented decision support and business rule applications  Asset/artifact repository search & retrieval  Automated verification Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  29. 29. + IDEF5 (Integrated Definition Methods) Ontology Capture Method Analysis 29 29  Layout a high-level architecture for ontology elements  Identify the relationships among elements – roles, domain, interface, process, utility  For each ontology element  Describe its domain and scope, how it will be used  Identify example questions and anticipated/sample answers for the application(s) it will support  Identify key stakeholders, ownership, maintenance, resources for instance knowledge  Describe anticipated reuse/evolution path  Identify critical standards, resources that it must interoperate with, dependencies  Resources  http://www.idef.com/IDEF5/html  http://www.kbsi.com/technology/methods/sbont.htm Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  30. 30. + Principles & Methods for Terminology Work (ISO 704) 30 30  Methodology for describing concepts & terms  Uses ISO 1087 for terminology  Uses ISO 860 for terminology “harmonization” (alignment) methods  Basis for typical methods used for taxonomy development today  Describes how to flesh out definitions  Recommendations strategies for relating terms to one another using standard vocabulary  ISO 1087 – great resource for language to describe kinds of relationships, acronyms & other designations, preferred vs. deprecated terms, etc.  ISO 860 augments this with recommendations for vocabulary comparison Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  31. 31. + Key methodology issues  Naming conventions and versioning policies are critical for – 31 31 every– organization  Namespace definitions for ontologies are critical, along with policies for their management that are well understood  http://<authority>/<subdomain>/<topic>/<date, in YYYYMMDD form>/filename.extension is common practice in some organizations, with content negotiation to the latest version  Levels of hierarchy may be added for large organizations or modularized ontologies  Use of “id.name.ext” or “ontology.name.ext” vs. “www.name.ext,” for the authority component of the URL is increasing  Namespace prefixes (abbreviations) for individual modules, especially where there are multiple modules, can be important due to longevity Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  32. 32. + More on naming  For naming entities in an ontology – there are some rules of 32 32 thumb that vary by community of practice  data modelers often use underscores at word boundaries, spaces in names – which semantic web tools may not handle well (ok in labels)  some name properties <domain><predicate><range>, others <predicate><range>, & others <predicate>, but not necessarily consistently  semantic web practitioners typically use camel case; property naming varies by domain & community of practice  for very large ontologies, some use unique identifiers to name concepts and properties, with human readable names in labels only –depends on tooling that helps people interpret the content  The key is to establish & consistently use guidelines tailored to your organization  Guidelines for classes/concepts (e.g., OWL classes, CL names, SBVR concepts), properties (e.g., OWL properties, CL relations, functions, SBVR roles), individuals (objects in UML) – so that conceptual models developed in one tool can be exported consistently to other tools & formats essential Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  33. 33. + Change management  Change management and traceability is not well defined at the element level for ontologies, still a research topic 33 33  Current approach to element-level versioning to support reasoning about change is to track two parallel streams: one for additions to the ontology, one for retractions; often managed as two separate files  UML/MOF versioning suggests linking to a workspace; use of a default workspace would get the latest version, and tools such as MagicDraw© & repositories such as Adaptive can provide an indication of the differences  UML-based versioning does not support reasoning about the differences so that users can determine the impact of applying the changes  addition or deletion of axioms can change downstream reasoning results, especially where there are complex dependencies  Mechanisms that preserve the versioning detail in artifacts that are generated from such models, such as RDF/XML serialized OWL, and are compatible with the above approaches, is an ongoing topic of discussion  Consider the use cases/justification for whatever level of versioning detail is needed on a project by project basis  Balance with usability/performance Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  34. 34. + Modularity  Guidance on modularity is limited 34 34  When is any model “too big”?  Can individual parts evolve independently, and if so, shouldn‟t that dictate module boundaries?  Can individual parts be used independently, and if so, shouldn‟t that also dictate module boundaries?  For ontologies, particularly OWL ontologies, individuals should be managed separately from “schema” to facilitate reasoning during development (i.e., if you add an individual that creates a logical inconsistency, most reasoners won‟t load the ontology at all)  Current practices in the semantic web community address modularity either statically or dynamically  Static approaches include adherence to “DL-Safe” rules and suggestions in papers by Alan Rector (University of Manchester)  Dynamic approaches include use of tools that can assist in determining whether or not an ontology is appropriately modularized, using reasoning to perform rewriting with respect to soundness and completeness for a given ontology Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  35. 35. + Metadata for ontologies and elements  Metadata should be standardized at both the model level and the 35 35 element level for every model (not just ontologies)  Model level metadata can reuse properties from the Dublin Core Metadata Terms, from the Simple Knowledge Organization System (SKOS), ISO 11179 Metadata Registry standard with ISO 1087 Terminology support, the Ontology Metadata Vocabulary (OMV), emerging W3C work on provenance and others  Most annotations should be optional at the element level, but a minimal set, including names, labels, and formal, text definitions, is important for reusability & collaboration  Consistent use of the same annotations (properties, tags) improves readability, facilitates automated documentation generation, and enables better search over ontology repositories  Model level metadata may reuse organization-specific taxonomies to enable better search  through RDFa tagging, for example  good relations & schema.org are both well known & in use in industry today Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  36. 36. + Part 2: OWL Basics  A quick intro to RDF and RDF Schema 36 36  Basic constructs: descriptions & classes  Class expressions  Properties & restricting property values  Individuals & data ranges  Miscellaneous class axioms  New features of OWL 2  A few rules of thumb  Depth vs. breadth, naming, synonymous terms  Class vs. property value, class vs. individual  Limiting scope Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  37. 37. + Semantic Web 37 37 "The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." -- Tim Berners-Lee Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  38. 38. + Guiding Principles 38 38  Historically, knowledge representation and reasoning systems have operated under closed world assumptions  Uncertainty is magnified under open-world, “wild, wild web” conditions, making reasoning much more difficult  Semantic web languages are designed to support less certainty, to provide “better” search results, informed answers to questions, not absolutes  Because they are based on XML, such languages can assist businesses in leveraging existing investment in mark-up, content, and data  To augment business intelligence/analysis and knowledge mining  To support knowledge sharing and collaboration, augment enterprise information integration  Enrich web services and other applications  Support policy-based applications and ensure compliance with policy at a lower cost with higher potential ROI than traditional computing methods Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  39. 39. + Resource Description Framework (RDF) 39 39  Describes relationships  Uses URIs used for naming  Language has  graph based model  RDF/XML serialization (exchange syntax)  other presentation syntaxes (N3, Turtle, …)  Specification, W3C presentations, tools are available at  Semantic Web: http://www.w3.org/standards/semanticweb/  Linked Data: http://www.w3.org/standards/semanticweb/data  RDF: http://www.w3.org/standards/techs/rdf#w3c_all  New RDF “next steps” revision working group under way to make specific enhancements, “fixes” to the standard  RDF Working Group: http://www.w3.org/2011/rdf-wg/wiki/Main_Page Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  40. 40. + RDF Notation Options 40 40 Subject http://semtech2012.semanticweb.com/travel.owl#Conference Predicate Graph: http://semtech2012.semanticweb.com/travel.owl#heldAt http://semtech2012.semanticweb.com/travel.owl#HiltonSFUnionSquare Object <rdf:Description rdf:ID=“Conference"/> XML/RDF: <semtech:heldAt rdf:resource="#HiltonSFUnionSquare"/> </rdf:Description> N3: semtech:Conference semtech:heldAt semtech:HiltonSFUnionSquare . Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  41. 41. + RDF Schema (RDFS)  An RDF vocabulary that provides for identifying: 41 41  classes,  subsumption (inheritance) relations for classes,  subsumption (inheritance) relations for properties,  domain and range for properties Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  42. 42. + RDF Schema (RDFS) 42 42 semtech:Hotel rdfs:subClassOf Graph: rdf:type rdfs:Class semtech:ConferenceHotel rdf:type rdf:type semtech:HiltonSFUnionSquare <rdf:Description rdf:ID="Hotel"> XML/RDF: <rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/> </rdf:Description> <rdfs:Class rdf:ID=“ConferenceHotel"> <rdfs:subClassOf rdf:resource="#Hotel"/> </rdfs:Class> <semtech:ConferenceHotel rdf:ID=“HiltonSFUnionSquare“/> Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  43. 43. + The Web Ontology Language (OWL)  Two languages emerged in parallel to address semantic web 43 43 requirements  DAML-ONT, developed through the DARPA/DAML program  OIL (Ontology Inference Layer) developed by EU & US researchers  Merged DAML+OIL was submitted to the W3C 2002, formed the basis for the WebOnt Working Group  OWL extends RDF Schema  Has an RDFS based syntax and reuses some RDF vocabulary (e.g., subClassOf, domain, range)  Adds rich primitives and redefines others (transitivity, inverse, cardinality constraints, complex class definitions)  Describes the structure of a domain in terms of classes and properties  Uses RDFS for class/property membership assertions (ground facts), XML Schema Datatypes  OWL specifications became W3C recommendations in February 2004  OWL 2 specifications were adopted in October 2009 Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  44. 44. + General nature of descriptions 44 44 a WINE a LIQUID General Categories a POTABLE grape: chardonnay, ... [>= 1] sugar-content: dry, sweet, off-dry Structured color: red, white, rose Components price: a PRICE winery: a WINERY grape dictates color (modulo skin) Interconnections harvest time and sugar are related Between Parts Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  45. 45. + General nature of descriptions 45 45 Class a WINE Superclass a LIQUID General Categories a POTABLE Number / Card Restrictions grape: chardonnay, ... [>= 1] sugar-content: dry, sweet, off-dry Structured Roles / color: red, white, rose Components Properties price: a PRICE winery: a WINERY Value Restrictions grape dictates color (modulo skin) Interconnections harvest time and sugar are related Between Parts Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  46. 46. + Describing classes  A class is a concept in the domain 46 46  Vintage – a wine made from grapes grown in a specified year  A set of characteristics (flavor, body, color, sugar…)  A class is a collection of elements with similar properties  White wine – wines made from white grapes  White table wine – wines made from white grapes that are not appellations or regional (not “quality wine” in the EU)  A class expression defines necessary conditions for membership (specific varietal, field, micro-climate, date picked, date bottled)  Instances of classes  Daniel Gehrs 2009 Merlot  Robert M. Parker, Jr,‟s Wine Advocate review, dated February 28, 2002  Los Olivos, Santa Barbara County, California, USA Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  47. 47. + Class inheritance  Classes are organized into subclass-superclass (or generalization- 47 47 specialization) hierarchies  True subclass relationships are the basis of a formal is-a hierarchy  Classes are “is-a” related if an instance of the subclass is an instance of the superclass  is-a hierarchies are essential for classification  Violation of true is-a hierarchical relationships can have unintended consequences & and cause errors in reasoning  Class expressions should be viewed as sets, and subclasses as a subset of the superclass, in contrast with a collection of attributes as classes are sometimes specified in object oriented programming  Examples  RedWine is a subclass of Wine Every red wine is a wine; every instance of a red wine (e.g., Foxen Cabernet Franc) is an instance of wine  NapaValleyWine is a subclass of CaliforniaWine Every wine from Napa Valley is a wine from California Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  48. 48. + Class expressions 48 48  6 primary kinds of class expressions in OWL  5 anonymous  Definitions can be nested using set-theoretic constructs Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  49. 49. + Example class hierarchy with other expressions 49 49  Example from ISO 1087 combining subclass-superclass and union expressions  Provides a more accurate representation of the relationships defined in the ISO standard than a strict is-a hierarchy would Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  50. 50. + Example class hierarchy with other expressions 50 50 Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  51. 51. + Modeling considerations for class hierarchy development 51 51  Practitioners tend to start  Top-down - define the most general concepts first and then specialize them  Bottom-up - define the most specific concepts and then organize them into more general classes  Combination (typical – breadth at the top level and depth along a few branches to test design)  Class inheritance is transitive  A is a subclass of B (white wine, dessert wine are subclasses of wine)  B is a subclass of C (sauvignon blanc is a subclass of white wine, late harvest wine is a subclass of dessert wine)  therefore A is a subclass of C (late harvest sauvignon blanc is a subclass of white wine, dessert wine, & wine)  Start with a more formal is-a approach, tease out whether other expressions are more appropriate based on use cases, formal definitions when available, etc. Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  52. 52. + Properties  Properties describe characteristics, features, or attributes of the 52 52 members of a class Every flowering plant has a bloom color, color pattern, flower form, petal form, etc.  Classes of properties  intrinsic: properties of the plant itself such as the optimal sunlight, soil conditions, expected height, and so forth  extrinsic: properties imposed externally such as the grower and price  whole-part relations  geospatial, mereonomic relations: what microclimates are this plant well suited to, where in a particular historic garden will you find it  Data and object properties  simple attributes typically have primitive data values (e.g., strings, numbers)  complex properties refer to other entities (e.g., an individual grower, such as Monrovia, or organization such as the American Orchid Society)  OWL reasoning requires strict separation of data and object properties Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  53. 53. + Properties in OWL 2 53 53 Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  54. 54. + Domain & range properties 54 54  In OWL and many other KR languages, relations (properties) are strictly binary  The domain & range represent the source & target arguments, respectively, for the property  Domain – the class (or classes) that may have the property  Wine is the domain of the property hasWineColor  Range – the class (or classes) defining valid property values  everything that fulfills the hasWineColor property is an instance of the enumerated class {red, white, rose}  Some KR languages that inherently support n-ary relations, such as CL, do not make this distinction  More flexible, intuitively more like mathematics, where functions have ranges (or return types) but not all relations are functions  Requires additional relations to specify argument order, which can be critical for ontology alignment Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  55. 55. + Class & property inheritance 55 55  A subclass inherits all the properties from its super class If a wine has a name and flavor, a red wine also has a name and flavor  If a class has multiple super classes, it inherits properties and restrictions from all of them Latin is both an individual language and an ancient language. It inherits “has attested literature: true” from ancient language and “has indigenous name: lingua latina” from individual language Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  56. 56. + Class expressions that restrict property values 56 56  Number restrictions describe or limit the number of possible values a particular property can have  A language must be associated with at least one English name and at least one French name  A language may be associated with zero or more Indigenous names  Cardinality – similar meaning to classical set theory, measures the number of elements in the set (restriction class)  Cardinality – cardinality N means the class defined by the property restriction must have exactly N values (individual or literal values)  Minimum cardinality - 1 means that there must be at least one value (required), 0 means that the value is optional  Maximum cardinality - 1 means that there can be at most one value (single-valued), N means that there can be up to N values (N > 1, multi- valued) Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  57. 57. + Simple number restriction examples <owl:ObjectProperty rdf:about="&re;hasColor"> <rdfs:range rdf:resource="&re;Color"/> 57 57 <rdfs:domain rdf:resource="&re;ColoredThing"/> </owl:ObjectProperty> <owl:Class rdf:about="&re;Color"/> <owl:Class rdf:about="&re;ColoredThing"/> <owl:Class rdf:about="&re;MultiColoredThing"> <owl:equivalentClass> <owl:Restriction> <owl:onProperty rdf:resource="&re;hasColor"/> <owl:minCardinality rdf:datatype="&xsd;nonNegativeInteger">2</owl:minCardinality> </owl:Restriction> </owl:equivalentClass> </owl:Class> <owl:Class rdf:about="&re;SingleColoredThing"> <owl:equivalentClass> <owl:Restriction> <owl:onProperty rdf:resource="&re;hasColor"/> <owl:cardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:cardinality> </owl:Restriction> </owl:equivalentClass> </owl:Class>  Creating a class of individuals that have exactly one color & a class of individuals that have more than one color  Note that the relationship between SingleColoredThing and the restriction class is modeled as an equivalence relationship, meaning that membership in the restriction class is a necessary and sufficient condition for being a SingleColoredThing Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  58. 58. + Qualified cardinality number restriction example 58 58  OWL 2 provides the capability to further qualify cardinality by specific classes and data ranges  Patterns that reuse a smaller number of significant properties, building up more complex class expressions that limit the set of valid values are common, useful in complex classification systems Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  59. 59. + Qualified cardinality number restriction example <owl:ObjectProperty rdf:about="&re;hasCharacteristic"> <rdfs:range rdf:resource="&re;Characteristic"/> 59 59 </owl:ObjectProperty> <owl:Class rdf:about="&re;BloomColor"> <rdfs:subClassOf rdf:resource="&re;Characteristic"/> <rdfs:subClassOf rdf:resource="&re;Color"/> </owl:Class> <owl:Class rdf:about="&re;Characteristic"/> <owl:Class rdf:about="&re;Color"/> <owl:Class rdf:about="&re;FloweringPlantType"> <rdfs:subClassOf> <owl:Restriction> <owl:onProperty rdf:resource="&re;hasCharacteristic"/> <owl:onClass rdf:resource="&re;BloomColor"/> <owl:minQualifiedCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:minQualifiedCardinality> </owl:Restriction> </rdfs:subClassOf> </owl:Class> Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  60. 60. + Class expressions that restrict possible values by type 60 60  Universal (allValuesFrom) and existential (someValuesFrom) quantification  allValuesFrom is the OWL equivalent for (x), and restricts the possible values for a property to members of a particular class or data range  someValuesFrom is the OWL equivalent for (x), and restricts the possible values for a property to at least one member of a particular class or data range  Specific value (hasValue) restrictions  a single data value (e.g., the color property for a RedWine must be filled with the value “red”)  an individual member of a class (e.g., Winery is the value restriction on the hasMaker property on the class Wine)  Enumerations – lists of allowable individuals or data elements  Combinations of the above that include both restrictions and boolean class expressions (union – logical or, intersection – logical and, complement – logical not) Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  61. 61. + Class expression restricting all values for a given property 61 61 * courtesy Azalea Society of America  Azaleas are members of the set of flowering plants whose bloom color must be either an RHSColor (Royal Horticultural Society) or ASAColor (Azalea Society of America) Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  62. 62. + And the OWL for that … <owl:Class rdf:about="&re;Azalea"> 62 62 <rdfs:subClassOf rdf:resource="&re;FloweringPlantType"/> <rdfs:subClassOf> <owl:Restriction> <owl:onProperty rdf:resource="&re;hasBloomColor"/> <owl:allValuesFrom> <owl:Class> <owl:unionOf rdf:parseType="Collection"> <rdf:Description rdf:about="&re;ASAColor"/> <rdf:Description rdf:about="&re;RHSColor"/> </owl:unionOf> </owl:Class> </owl:allValuesFrom> </owl:Restriction> </rdfs:subClassOf> </owl:Class> * courtesy Azalea Society of America Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  63. 63. + Another typical pattern combining restrictions & boolean connectives 63 63  Use of equivalence expressions like this, describing RHSFloweringPlantType as a flowering plant and something whose bloom colors must be from the set of colors defined by the Royal Horticultural Society is a common pattern Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  64. 64. + And the corresponding OWL for that … 64 64 <owl:Class rdf:about="&re;RHSFloweringPlantType"> <owl:equivalentClass> <owl:Class> <owl:intersectionOf rdf:parseType="Collection"> <rdf:Description rdf:about="&re;FloweringPlantType"/> <owl:Restriction> <owl:onProperty rdf:resource="&re;hasBloomColor"/> <owl:allValuesFrom rdf:resource="&re;RHSColor"/> </owl:Restriction> </owl:intersectionOf> </owl:Class> </owl:equivalentClass> </owl:Class> Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  65. 65. + Class expression restricting some & exact values for a given property 65 65 Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  66. 66. + And the OWL for that … <owl:Class rdf:about="&re;Azalea"> <rdfs:subClassOf rdf:resource="&re;FloweringPlantType"/> 66 66 <rdfs:subClassOf> <owl:Restriction> <owl:onProperty rdf:resource="&re;hasBloomColor"/> <owl:allValuesFrom> <owl:Class> <owl:unionOf rdf:parseType="Collection"> <rdf:Description rdf:about="&re;ASAColor"/> <rdf:Description rdf:about="&re;RHSColor"/> </owl:unionOf> </owl:Class> </owl:allValuesFrom> </owl:Restriction> </rdfs:subClassOf> <rdfs:subClassOf> <owl:Restriction> <owl:onProperty rdf:resource="&re;hasBloomColorPattern"/> <owl:someValuesFrom rdf:resource="&re;ASAColorPattern"/> </owl:Restriction> </rdfs:subClassOf> </owl:Class> Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  67. 67. + ... <owl:Class rdf:about="&re;SingleColoredAzalea"> <owl:equivalentClass> <owl:Class> 67 67 <owl:intersectionOf rdf:parseType="Collection"> <owl:Restriction> <owl:onProperty rdf:resource="&re;hasBloomColorPattern"/> <owl:hasValue rdf:resource="&re;Solid"/> </owl:Restriction> <owl:Restriction> <owl:onProperty rdf:resource="&re;hasBloomColor"/> <owl:cardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:cardinality> </owl:Restriction> </owl:intersectionOf> </owl:Class> </owl:equivalentClass> <rdfs:subClassOf rdf:resource="&re;Azalea"/> </owl:Class> <owl:NamedIndividual rdf:about="&re;Solid"> <rdf:type rdf:resource="&re;ColorPattern"/> </owl:NamedIndividual> Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  68. 68. + Individuals and data ranges  An individual (instance, object in other paradigms) 68 68  Any class that an individual is a member of, or is an individual of, is a type of the individual  Any superclass of a class is an ancestor of (or type of) the individual  Specify property values for the individual  Property values should conform to the constraints such as range, value type, cardinality restrictions, etc.  Enumerated classes & data ranges are commonly used to specify the complete set of valid values for a particular property Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  69. 69. + Example enumerated class of individuals 69 69 Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  70. 70. + Class axioms 70 70  Subsumption (necessary) B  A ⊆ B where B is a class description partial or primitive class A  Definition (necessary and sufficient)  C ≡ D where D D is a class description complete or defined class C Courtesy Evan Wallace, NIST Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  71. 71. + Meta-properties – global cardinality restrictions 71 71 Range  Functional Domain Range  Inverse Functional Courtesy Evan Wallace, NIST Domain Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  72. 72. + Disjoint classes A 72 72 B  Classes are disjoint if they cannot have common instances  Disjoint classes cannot have any common subclasses  If winery and wine are disjoint, then there is no instance that is both a winery and a wine; there is no class that is both a subclass of winery and a subclass of wine  Disjointness is often used to aid consistency checking  Disjointness is also helpful in teasing out subtle distinctions among classes across multiple ontologies  Equivalence is also often used to identify the same concepts across ontologies that may be named differently, or to name classes defined through class axioms Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  73. 73. + New features in OWL 2 73 73  New property constructs  Reflexive, irreflexive, & asymmetric properties  Property chains  Disjoint properties  Keys  Self-reflexive properties  Qualified cardinality restrictions  Negative property assertions  New class axioms  Disjoint unions  Disjoint classes Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  74. 74. + More new features  Extended datatype handling 74 74  Better support for XML Schema Datatypes  New datatypes for OWL specifically (owl:real, owl:rational, rdf:PlainLiteral)  Custom datatype definition & use of xsd facets / datatype restrictions  New data range restrictions  Intersections, unions, complements  Support for punning  Limited support for a particular element to be defined as both a class & individual, for example  Improved support for annotations (e.g., metadata about ontologies, provenance, transformation detail, etc.)  See http://www.w3.org/TR/owl2-new-features/ & http://www.w3.org/TR/owl2- quick-reference/ for detail  Current work specifically focused on provenance: http://www.w3.org/2011/prov/wiki/Main_Page Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  75. 75. + Rules of Thumb 75 75  Cycles are common in many KR systems, though rarely “a good thing”  Cycles are disallowed by some tools because they prohibit “code generation”, export -- including RDF/OWL  Classes A, B, and C have equivalent sets of instances  By many definitions, A, B, and C are equivalent  Use owl:equivalentClass instead of creating cycles Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  76. 76. + Class hierarchy analysis 76 76  Siblings should be specified at roughly the same level of generality -- compare to section and subsections in a book Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  77. 77. + More on class definition analysis  Classes that have single subclasses may be incompletely 77 77 defined, unless additional children are specified in subordinate modules  Subclasses of a class typically have more detailed definitions  Additional properties  Constraints/restrictions on inherited properties  Participate in further qualified relations  Compare to bullets in a bulleted list  If a class has a large number of subclasses, it may be useful to define intermediate levels  For wine, for example there are natural groupings around wine color  If no natural classification exists, the long list may be appropriate Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  78. 78. + Inheritance, naming, synonyms  A “wine” is not a subclass of “wines” 78 78  A particular vintage should be classified as an instance of the class Wines  Class names should be either all singular or all plural Class  Synonymous names for the same concept should be modeled as alternate designations, not as unique classes (in the same ontology) instance-of  Many systems & standards facilitate synonymous Instance term representation (e.g., ISO 1087) MariettaOldVinesRed  OWL allows defining necessary and sufficiency condition definitions thereby allowing synonym definitions to be “first class” terms (as appropriate, through equivalence / same as relations) Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  79. 79. + Class vs. property value 79 79  Are concepts with distinct property values used in restrictions on other properties?  How important is the distinction for the domain?  Class definitions for most domains should be fairly stable – i.e., they should not change frequently once the definitions are established and individuals created Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  80. 80. + Class vs. individual 80 80  Individuals are the most specific entities in an ontology  If concepts form a natural hierarchy, represent them as classes  If they might have individuals, represent them as classes Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  81. 81. + Limiting scope 81 81  An ontology should not contain all the possible information about the domain  No need to specialize or generalize more than the application requires  No need to include all possible properties of a class • Only the most salient properties • Only the properties that the application requires  Ontologies of wine, food, and their pairings probably will not include details such as:  Bottle size (half bottle, full bottle, magnum, …)  Label color  Wine bottle color (green, amber, …) Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  82. 82. Part 3: Putting It All Together + 82 82  Ontology-powered applications (both light weight and richer applications)  Wine Agent  Semantically-enabled Environmental Monitoring  Population Science Health Data Exploration  Scientific Observations  Virtual Observatories  Cognitive Assistant, Learning, and Explanation  Technical Underpinnings  Understanding, Trust, & Proof: InferenceWeb  Semantic Methodology  Checking: Syntax, Consistency, Evaluation Services  Linked Data  Questions Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  83. 83. + Ontology-enhanced applications  Simple „ontologies‟, taxonomy driven applications 83 83  Controlled, shared vocabularies (content management, faceted search) – e.g., PopSciGrid  Web site organization & navigation, expectation setting  Browsing & search engine optimization via mapping to tagged structures such as the new schema.org from Yahoo!, Google & Bing, Good Relations ontology – see http://purl.org/goodrelations/  “Smarter” search support – Wine Agent, Environmental portal  More expressive ontologies provide more options for  Configuration, e.g., AT&T PROSE/Questar  Data Integration, e.g., Virtual Observatories (VSTO, …) Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  84. 84. 84 Semantically-enabled advisors utilize:  Ontologies  Reasoning  Social  Mobile  Provenance  Context Patton & McGuinness.et. al tw.rpi.edu/web/project/Wineagent
  85. 85. + TW Wine Agent – semantic interoperability Wine Agent receives a meal description and retrieves a 85 85 selection of matching wines available on the Web, using an ensemble of standards and tools  RDF & OWL  for encoding wine/food listings and pairing recommendations  Semantic MediaWiki  for publishing user-contributed recommendations*  Twitter and Facebook  for posting recommendations  Pellet  for deriving knowledge from wine ontology and recommendations  SPARQL  for querying wine/food listings with recommendations  InferenceWeb  for explaining Wine Agent‟s recommendations Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  86. 86. + Social Semantic Web data publishing 8686 Collaborative wine recommendations  Using Semantic MediaWiki to manage users and user-contributed food/wine pairings recommendations  Using semantic form to add OWL instance data Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  87. 87. + Semantic Query: food hierarchy & recommendation 87 87 Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  88. 88. + Semantic Sommelier 88 Previous versions used ontologies to infer descriptions of wines for meals and query for wines New version uses Context: GPS location, local restaurants and wine lists, user preferences Social input: Twitter, Facebook, Wiki, mobile, … Source variability in quality, contradictions exist, Maintenance is an issue… however new models emerging
  89. 89. + Extensibility  Wine Agent Ontology is extensible 89 89  “Menu file” in RDF  Restaurants can introduce new classes and instances to provide better matches for users  Searches can be restricted to imported menus and/or imported wine lists  In process:  Group recommendations using multiple iPhones/iPod touches  Recommendations based on personal dietary needs and preferences  „iPellet‟, an OWL Reasoner designed for iPhone, for offline reasoning Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  90. 90. + Observations from the Wine Agent 90 90  Background knowledge is reasonably simple and built in OWL (includes foods and wine and pairing information similar to the original Living with CLASSIC paper, CLASSIC tutorial, OWL Guide, Ontology Engineering 101, …)  Background knowledge can be used for simple query expansion over wine sources to retrieve for example documents about red wines (including zinfandel, syrah, …)  Background knowledge used to interact with structured queries such as those possible on wine sites like wine.com  Constraints allows a reasoner like Pellet to infer consequences of the premises and query.  Explanation system (Inference Web) can provide provenance information such as information on the knowledge source (McGuinness‟ wine ontology) and data sources (such as wine reviews or particular restaurants or wine stores)  Services work could allow automatic “matchmaking” instead of hand coded linkages with web resources Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  91. 91. + SemantEco/SemantAqua  Enable/Empower citizens & scientists to explore pollution 91 sites, facilities, regulations, an d health impacts along with provenance. 5 4  Demonstrates semantic monitoring possibilities. 2 3  Map presentation of analysis  Explanations and Provenance 1 available http://was.tw.rpi.edu/swqp/map.html and http://aquarius.tw.rpi.edu/projects/semantaqua 1. Map view of analyzed results 2. Explanation of pollution 3. Possible health effect of contaminant (from EPA) 4. Filtering by facet to select type of data 5. Link for reporting problems
  92. 92. +System Architecture 92 92 Virtuoso access
  93. 93. + Ontology 93 93  Extends existing best practice ontologies, e.g. SWEET, OWL-Time.  Includes terms for relevant pollution concepts  Can be used to conclude: “any water source that has a measurement outside of its allowable range” is a polluted water source. Portion of the SemantEco and SemantAqua ontologies.
  94. 94. + Ontology 94 94  Regulation Ontology  models the federal and state water quality regulations for drinking water sources  We can recognize pollution, e.g. “any water source that contains 0.01 mg/L of Arsenic or more is a polluted water source.” Portion of Cal. Regulation Ontology.
  95. 95. + Querying 95 95  With OWL reasoning during query: SELECT DISTINCT ?site ?lat ?lng WHERE { ?site a pol:PollutedSite ; geo:lat ?lat ; geo:long ?lng . }  OWL reasoning hides the complexity of the relationships allowing the developer (or user) to ask simple questions without requiring deep knowledge of environmental regulations
  96. 96. + Reflections 96 96  Uses semantic technologies along with simple ontologies and provenance-encoding tools  Was generated by students in a regular term class, then extended to be the basis of a masters degree (Wang), and the foundation of 3 other class projects  Small background extensible ontology  Now being expanded to include endangered species, migration, health effects, etc. Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  97. 97. + SONet: Scientific Observations Network  A Community-Driven Scientific Observations Network to 97 97 achieve Semantic Interoperability of Environmental and Ecological Data  Build a community-sanctioned, evolving data model for observational data  Build a network of practitioners  Build repository of motivating use cases  All to enable interoperability among existing data resources, supporting cross-disciplinary synthetic research in the environmental sciences Schildhauer1, Bowers2, Gries3, McGuinness4, Dibner5, Madin6, Jones1, Bermudez7 1NCEASUC Santa Barbara, 2Gonzaga University 3NTL/LTER and Univ. of Wisconsin, 4McGuinness & Associates, 5OGC Interoperability Institute, 6Macquarie University, 7Southeastern Universities Research Association https://sonet.ecoinformatics.org Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  98. 98. + Population Sciences Grid Goals 98 98  Convey complex health-related information to consumer and public health decision makers for community health impact  Leverage the growing evidence base for communicating health information on the Internet  Inform the development of future research opportunities effectively utilizing cyberinfrastructure for cancer prevention and control Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  99. 99. + Computer Science Perspective on PopSciGrid Goals 99 99  How can semantic technologies be used to integrate, present, and analyze data for a wide range of users?  Can tools allow lay people to build their own demos and support public usage and accurate interpretation?  How do we facilitate collaboration and “viral” applications?  Within PopSciGrid:  Which policies (taxation, smoking bans, etc) impact health and health care costs?  What data should be displayed to help scientists and lay people evaluate related questions?  What data might be presented so that people choose to make (positive) behavior changes?  What does the data show? why should someone believe that?  What are appropriate follow ups? Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  100. 100. PopSciGrid Workflow + 100 100 Publish Ban coverage CSV2RDF4LOD Direct Visualize derive derive archive Archive SemDiff CSV2RDF4LOD derive Enhance Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  101. 101. + PopSciGrid: Conn 101 101 Extensible Mashups via Linked Data  Diverse datasets from NIH  Potentially linking to other content (e.g. “unemployment rate”) Accountable Mashups via Provenance  Annotate datasets used in demos  Feedback users‟ comment to gov contact (e.g. %) http://logd.tw.rpi.edu/demo/trends_in_smoking_prevalence  Annotation capabilities in process _tobacco_policy_coverage_and_tobacco_prices Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  102. 102. PopSciGrid II + 102 102 Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  103. 103. + PopSciGrid II 103
  104. 104. + Reflections 104 Successful but…. 104  What if we could allow data experts to build their own demos?  What if we could allow non-subject matter experts to function as subject- literate staff?  What if team members could interchange roles (and thus make contributions in other areas)?  What technological infrastructure is required?  Claim: all of this is being done now – but not at scale
  105. 105. + Updates and Motivations from a Computer Science Perspective 105 105 Old: New:  Raw conversions  Enhanced conversions  Per-dataset vocabularies  Vocabulary reuse  Custom queries  Generic queries  Custom data management code  Re-usable data management  Limited use because of Google code Visualization licenses  Unlimited use of new open  State-level data source visualization toolkit  State and county-level data
  106. 106. + RDF Data Cube Vocabulary 106 106  Integrated with the LOGD  For publishing multi- data conversion dimensional data, such as infrastructure statistics, on the web in such a way that it can be  Integratedwith other tooling linked to related data sets like Stats2RDF and concepts using RDF.  Compatible with the cube model that underlies SDMX (Statistical Data and Metadata eXchange).  Also compatible with:  SKOS, SCOVO, VoiD, FOAF, Dublin Core Terms
  107. 107. 107 County average life expectancy (Summary Measures of Health
  108. 108. + Reflections 108 108  Uses semantic technologies along with simple ontologies and provenance-encoding tools  Provides an operational specification for other policy exploration tools – initially tobacco data and related policies, now also physical education and nutrition policies in elementary, junior and senior high school (CLASS - http://class.cancer.gov/About.aspx ).  Similar model being used in the Tetherless SemantEco / SemantAqua / SemantAir Portal family Copyright © 2012 Thematix Partners LLC & McGuinness Associates
  109. 109. Why did I present wine, water, and linked data applications? 109  Wine advisor shows semantic technology in action making actionable recommendations  Water application shows a “typical” semantic integration web 3.0 application  Both of these styles are needed for many semantic applications and these features are realizable today  Next – they depend on  Understandable and Actionable applications  Semantic web stack  Data availability – linked data cloud is growing  Ontologies  Semantic methodologies
  110. 110. + Foundations: Trust & Understanding 110 110 If users (humans and agents) are to use, reuse, and integrate system answers, they must trust them. System transparency supports understanding and trust. Even simple “lookup” systems benefit from providing information about their sources. Systems that manipulate information (with sound deduction or potentially unsound heuristics) benefit from providing information about their manipulations. Goal: Provide interoperable infrastructure that supports explanations of sources, assumptions, and answers as an enabler for trust. Copyright © 2012 Thematix Partners LLC & McGuinness Associates

Editor's Notes

  • Suppliers can merchandise offerings in innovative ways, unique to the brand, that better fit customer needs and desires and differentiate from the competition. Take for example an airline wishing to allow the customer to search for “comfort oriented flights” that use certain attributes of the flight to create “synthetic categories” of availability. Here, key attributes such as leg room, media options, and other facts create a class of availabilities that offer greater comfort.One can imagine other ‘synthetic search’ scenarios, such as “disability friendly,” “commuter friendly,” “easiest connections” and the like.
  • Enhance search across many search engines at once. Control the quality and criteria for relevance. Avoid the expense, upkeep and currency issues of intermediaries. As search engine functionality extends, take advantage of transactional as well as descriptive aspects of search results.Hilton could provide same as Pegasus. One point of leverage is semantics disintermediates aggregators.Ability to produce many more meaningful description tags – synonyms – that enhance relevance.-- provide results from Bing, Yahoo and Google as illustration-- not as good as it can be: we need to mock-up something even better.
  • http://tw.rpi.edu/web/project/Wineagent
  • http://was.tw.rpi.edu/swqp/map.htmlReport Problem: Depends on the state, problem report link will be changing to show correct link that user show report problem to in that state (currectly encoded 4 states:NY, RI, MASS, CA. + EPA)Currently, we only model health effect for water sites, and haven’t implement health effect based reasoning.Note: these are new features and were discussed in as “future work” in the paper.tml
  • We are using regulation data from 4 states: MASS, CA, RI, NY and 1 regulation data from EPA (total 5)Preprocessing regulation data: identify correct limit for each contaminant(some of data contain English words, not just number), write adhoc code to convert them into the format that our converter is able to process.Some links to regulation data: http://www.dem.ri.gov/pubs/regs/regs/water/h20q09.pdf page 100 (RI)http://water.epa.gov/drink/contaminants/index.cfm (EPA)http://www.mass.gov/dep/water/drinking/standards/dwstand.htm (MA)Data range:the echo data range: 10/31/2007-09/30/2010the usgs date range: 1955-05-26 to 1999-11-09
  • ImpacTeen: part of Bridging the Gap: Research Informing Practice and Policy for Healthy Youth Behavior, supported by the Robert Wood Johnson Foundation and administered by Univ. of Illinois at Chicago.http://www.impacteen.org/
  • Stats2RDF is a plugin for OntoWiki that makes it easy to convert spreadsheet-based data to RDF Data Cubes.Built on SKOS- Get links for all the compatible vocabularies (from email from dlm to mccus)
  • Components of note:Measure selectorInfo boxes that expand with additional infoScatter plot matrixTimeline Use of RDF datacubeLevels of detail at county, state, and “stratum” level (from Community Health Status Indicators dataset)Slide on summary of what’s newWhat datasets are being used?Star the new onesLevel of detail (State, County, and Stratum)Description of what we are using from CHSI, what could be used, what isn’t yet used.Summary of what’s new from the dataset perspective, data representation perspective (including why, what it enables)Data granularity (adding county-level data)Demo functionality: statter matrix, provenance, info boxesBackup slide on d3, why it’s useful, what could be done with it. No licensing problems like google viz (include comment about compositionality)Justification of use of SemWebTech. Linked DataIdentify relationships between measure using concept terms (DataFAQs: include Stephan’s data quality label as well. http://aksw.org/Projects/Stats2RDF
  • Many Benefits:Reduced query formation from 8 to 3 steps and reduced choices at each stageAllowed scientists to get data from instruments they never knew of before (e.g., photometers in example)Supported augmentation and validation of dataUseful and related data provided without having to be an expert to ask for itIntegration and use (e.g. plotting) based on inferenceAsk and answer questions not possible beforeBut Needed Provenance (SPCDIS, PML), reusability &amp; modularity (SESF)Deborah McGuinness, Peter Fox, Luca Cinquini, Patrick West, Jose Garcia, James L. Benedict, and Don Middleton. The Virtual Solar-Terrestrial Observatory: A Deployed Semantic Web Application Case Study for Scientific Research. In the Proceedings of the Nineteenth Conference on Innovative Applications of Artificial Intelligence (IAAI-07). Vancouver, British Columbia, Canada, July 22-26, 2007. Peter Fox, Deborah L. McGuinness, Luca Cinquini, Patrick West, Jose Garcia, James L. Benedict, and Don Middleton. Ontology-supported Scientific Data Frameworks: The Virtual Solar-Terrestrial Observatory Experience. In Computers and Geosciences - Elsevier. Volume 35, Issue 4 (2009).
  • Many Benefits:Reduced query formation from 8 to 3 steps and reduced choices at each stageAllowed scientists to get data from instruments they never knew of before (e.g., photometers in example)Supported augmentation and validation of dataUseful and related data provided without having to be an expert to ask for itIntegration and use (e.g. plotting) based on inferenceAsk and answer questions not possible beforeBut Needed Provenance (SPCDIS, PML), reusability &amp; modularity (SESF)Deborah McGuinness, Peter Fox, Luca Cinquini, Patrick West, Jose Garcia, James L. Benedict, and Don Middleton. The Virtual Solar-Terrestrial Observatory: A Deployed Semantic Web Application Case Study for Scientific Research. In the Proceedings of the Nineteenth Conference on Innovative Applications of Artificial Intelligence (IAAI-07). Vancouver, British Columbia, Canada, July 22-26, 2007. Peter Fox, Deborah L. McGuinness, Luca Cinquini, Patrick West, Jose Garcia, James L. Benedict, and Don Middleton. Ontology-supported Scientific Data Frameworks: The Virtual Solar-Terrestrial Observatory Experience. In Computers and Geosciences - Elsevier. Volume 35, Issue 4 (2009).
  • How to:In the “Regulation” box, check the “CA Regulation”, and Click “Go”Results:We can see that there are more polluted water sites, polluting facilities based on CA Regulation”.

×