Lect6-An introduction to ontologies and ontology development


Lecture 6 of MAS course at URV, 2010

Published in: Technology, Education

  1. 1. LECTURE 6: An introduction to ontologies and ontology development Artificial Intelligence II – Multi-Agent Systems Introduction to Multi-Agent Systems URV, Winter-Spring 2010 (Based on a presentation by Dr David Sánchez)
  2. 2. Outline of the lecture Ontologies Definition Components Use in MAS OWL: Web Ontology Language A method for ontology development
  3. 3. Ontologies in FIPA-ACL You have come across ontologies before in this course:
        (cfp
          :sender (agent-identifier :name j)
          :receiver (set (agent-identifier :name i))
          :content "((action (agent-identifier :name i) (sell book “The Lord of the Rings”) (any ?x (and (= (price book) ?x) (< ?x 10)))))"
          :ontology book-market
          :language fipa-sl)
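
As a rough illustration of what the `:ontology` slot is for, the sketch below models an ACL message as a small Python record and lets a receiving agent check whether it knows the named ontology before trying to interpret the content. The class `AclMessage`, the helper `can_interpret` and the shortened content string are assumptions made for this example; they are not part of FIPA or of any agent platform.

```python
from dataclasses import dataclass


@dataclass
class AclMessage:
    """Hypothetical, simplified record of a FIPA-ACL message."""
    performative: str
    sender: str
    receiver: str
    content: str
    ontology: str
    language: str

    def render(self) -> str:
        # Produce a textual form similar to the message on the slide.
        return (
            f"({self.performative}\n"
            f"  :sender (agent-identifier :name {self.sender})\n"
            f"  :receiver (set (agent-identifier :name {self.receiver}))\n"
            f'  :content "{self.content}"\n'
            f"  :ontology {self.ontology}\n"
            f"  :language {self.language})"
        )


def can_interpret(message: AclMessage, known_ontologies: set) -> bool:
    # The receiver can only make sense of the content if it shares
    # the vocabulary named in the :ontology slot.
    return message.ontology in known_ontologies


# Simplified content; the slide's full SL expression is not reproduced here.
msg = AclMessage(
    performative="cfp",
    sender="j",
    receiver="i",
    content="((action (agent-identifier :name i) (sell book)))",
    ontology="book-market",
    language="fipa-sl",
)
```

An agent that only knows a medical ontology would reject this message, while one that knows `book-market` could go on to parse the content.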
  4. 4. What Is An Ontology (I) Tom Gruber: Short answer: An ontology is a specification of a conceptualization Long answer: […] an ontology is a description (like a formal specification of a program) of the concepts and relationships that can exist in a given domain of discourse for an agent or a community of agents
  5. 5. What Is An Ontology (II) An ontology is an explicit description of a domain: concepts; properties and attributes of concepts; restrictions on properties and attributes; individuals (often, but not always). An ontology defines: a common vocabulary; a shared understanding of a domain among a set of agents
  6. 6. Why Develop an Ontology? To share a common understanding of the structure of information among people among software agents To make domain assumptions explicit To enable reuse of domain knowledge to avoid “re-inventing the wheel” to introduce standards to allow interoperability
  7. 7. Ontology components Concepts: Disease, Treatment, Symptom. Properties and attributes of concepts: Causes, OccursIn, Receives. Restrictions on properties and attributes: Cancer always Receives Radiotherapy. Individuals (often, but not always): “John Smith’s cough” is a particular Symptom
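
The four kinds of components can be pictured as a small data structure. This is only a sketch: the `Ontology` class and its field layout are hypothetical, and the domain and range given to `Receives` are an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Hypothetical container for the four kinds of components."""
    concepts: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)    # name -> (domain, range)
    restrictions: list = field(default_factory=list)  # (concept, property, filler)
    individuals: dict = field(default_factory=dict)   # name -> concept


medical = Ontology()

# Concepts
medical.concepts |= {"Disease", "Treatment", "Symptom", "Cancer", "Radiotherapy"}

# Property (domain and range are assumed here, not given on the slide)
medical.properties["Receives"] = ("Disease", "Treatment")

# Restriction: Cancer always Receives Radiotherapy
medical.restrictions.append(("Cancer", "Receives", "Radiotherapy"))

# Individual: a particular Symptom
medical.individuals["John Smith's cough"] = "Symptom"
```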
  8. 8. What Is “Ontology Engineering”? Defining concepts in the domain (classes) Arranging the concepts in a hierarchy (subclass-superclass hierarchy) Defining attributes and properties that classes can have and restrictions on their values Defining individuals and filling in property values
  9. 9. Size and scope of an ontology Two extremes (the reality is usually something in between): One huge ontology that captures "everything" in the domain; One (small) ontology for each specific application [Figure: agents (A) and ontologies (O) in the two extremes]
  10. 10. "One large ontology" approach (I) Benefits Few or no internal inconsistencies Easier to find for an application developer Uniform documentation Less overlapping work!
  11. 11. "One large ontology" approach (II) Drawbacks Who maintains it? Who is responsible? Heavy and slow to use (both for human users and for applications) Difficult to take into account everybody's opinions and wishes at design time and when updating Difficult and expensive construction
  12. 12. Example: Unified Medical Language System (UMLS) Metathesaurus Over 1 million biomedical concepts Integrates 100 vocabularies and classification systems ICD-10: International Classification of Diseases (more than 12,400 codes) MeSH: Medical Subject Headings (more than 25,000 descriptors) LOINC: Logical Observation Identifiers Names and Codes (58,000 observation terms) SNOMED CT: Systematized Nomenclature of Medicine - Clinical Terms (over 1 million medical concepts)
  13. 13. "Several small ontologies" approach (I) Benefits Ontologies fit perfectly the application demands Smaller, and thus faster to use Easier to understand and to form the complete picture of an ontology (fewer concepts and interrelations)
  14. 14. "Several small ontologies" approach (II) Drawbacks Different ontologies do not fit together without either a central coordination body or ontology alignment software Overlapping work: the same concepts are defined in multiple ontologies, either in the same way or (even worse!) differently Application developers have to choose between multiple incomplete options
  15. 15. Basic example of ontology alignment
  16. 16. Some mixed approaches [Figure: agents (A) and ontologies (O) organised around an upper ontology, by domains, and through mediating software (M)]
  17. 17. Importance in MAS (I) Agent systems are typically distributed systems Different agents may well have access to different domain ontologies Agent systems consisting of a single agent are rare and often not useful Agents typically need to communicate with each other Agents should understand each other
  18. 18. Importance in MAS (II) People often design and implement agents independently, unaware of other developers working in the same domain Agents' understanding of each other is mostly based on ontologies
  19. 19. Usual alternatives (I) Single, common ontology Uses: A MAS developed by a single group [Practical exercise]; A well-structured domain with a jointly agreed standard vocabulary [Medicine] [Figure: several agents (A) sharing one ontology (O)]
  20. 20. Usual alternatives (II) Common core ontology (e.g. high-level ontologies like WordNet) complemented with specialised lower-level classes and instances defined locally by each agent [Figure: agents (A) with local ontologies (O) around a common core ontology]
  21. 21. Usual alternatives (III) Application that maps the concepts and relationships in different partial domain ontologies Usually quite complex, and with human supervision Use: union of previously existing MASs [Figure: agents (A) with their own ontologies (O) connected through a mediator (M)]
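
At its very simplest, the mediating application can be thought of as a table of term correspondences that is applied to every statement passing between two agents. The sketch below assumes statements are (subject, property, object) triples; the second vocabulary (`Disease`, `showsSign`, `Pyrexia`) and the function `translate` are invented for the example, and real alignment tools are far more involved.

```python
# Hypothetical correspondences between the vocabularies of two agents.
MAPPING_A_TO_B = {
    "Disorder": "Disease",
    "hasSymptom": "showsSign",
    "Fever": "Pyrexia",
}


def translate(triple, mapping):
    """Rewrite each term of a triple; unmapped terms are left unchanged."""
    return tuple(mapping.get(term, term) for term in triple)


statement_from_a = ("Flu", "hasSymptom", "Fever")
statement_for_b = translate(statement_from_a, MAPPING_A_TO_B)
```

Terms without a correspondence pass through untouched, which is exactly where human supervision is usually needed.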
  22. 22. Outline of the lecture Ontologies Definition Components Use in MAS OWL: Web Ontology Language A method for ontology development
  23. 23. Representation of ontologies Languages: RDF, DAML, OIL, OWL Tools to edit ontologies: Protégé OilEd OntoEdit
  24. 24. OWL: Web Ontology Language The newest ontology representation language Since October 2009, OWL2 Standard worldwide notation Designed to bring semantic content to the Web (Semantic Web) WebOnt group developed the OWL formalism OWL language now a W3C recommendation http://www.w3.org/TR/2009/REC-owl2-quick-reference-20091027/ OWL2 Quick Reference Guide (October 2009)
  25. 25. OWL: Language components RDF Schemas Features Equality and Inequality Property Characteristics Property Restrictions Logical Operators
  26. 26. RDF Schemas Features They define basic ontological components Classes Subclasses Individuals Properties Subproperties Domain Range
  27. 27. Classes Classes are sets of individuals with common characteristics A Class should be described such that it is possible for it to contain Individuals Classes that cannot possibly contain any individuals are said to be inconsistent Eg: Disorder, Patient, Treatment, Symptom
  28. 28. Subclasses Define class specializations by constraining their coverage Ex: Breast Cancer is a subclass of Cancer Class hierarchies can be specified by making one or more statements that a class is a subclass of another class
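
A class hierarchy built from individual subclass statements can be sketched as a dictionary, with the full set of superclasses obtained by following the statements transitively. The dictionary layout and the function name `superclasses` are assumptions for this example.

```python
# Each entry is one or more "is a subclass of" statements.
SUBCLASS_OF = {
    "Breast_Cancer": {"Cancer"},
    "Cancer": {"Disorder"},
}


def superclasses(cls, subclass_of):
    """Return every direct or indirect superclass of cls."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


ancestors = superclasses("Breast_Cancer", SUBCLASS_OF)
```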
  29. 29. Individuals (Instances) Individuals are the specific objects in the domain Individuals may be (and are likely to be) members of multiple Classes Ex. St_Johns_Hospital, Peter_Smith_disorder
  30. 30. Properties Properties can be used to state relationships between individuals or from individuals to data values Relationships in OWL are binary Subject predicate Object Individual a hasProperty Individual b Individual hasProperty Value Eg: hasSymptom, isCausedBy, Author
  31. 31. Property types Object Property: relates individuals Establishes a relationship between objects Datatype Property: relates individuals to data (int, string, float etc) Can be considered “attributes” of the instance Annotation Property: for attaching metadata to classes, individuals or properties E.g. version, author, comment
  32. 32. Property examples isCausedBy (Object Property): Disorder → Cause; hasScientificName (Datatype Property): Disorder → String; comment (Annotation Property): Disorder → Meta-data
  33. 33. Built-in datatypes Basic datatypes: http://www.w3.org/2001/XMLSchema#name xsd:string, xsd:boolean, xsd:decimal, xsd:float, xsd:double, xsd:dateTime, xsd:time, xsd:date, xsd:gYearMonth, xsd:gYear, xsd:gMonthDay, xsd:gDay, xsd:gMonth, xsd:hexBinary, xsd:base64Binary, xsd:anyURI, xsd:normalizedString, xsd:token, xsd:language, xsd:NMTOKEN, xsd:Name, xsd:NCName, xsd:integer, xsd:nonPositiveInteger, xsd:negativeInteger, xsd:long, xsd:int, xsd:short, xsd:byte, xsd:nonNegativeInteger, xsd:unsignedLong, xsd:unsignedInt, xsd:unsignedShort, xsd:unsignedByte and xsd:positiveInteger.
  34. 34. Sub Properties Define property specializations by constraining their coverage Hierarchies are built from one or more statements that a property is a subproperty of one or more other properties Ex. hasScientificName is a subPropertyOf hasName
  35. 35. Domain It indicates the individuals to which the property can be applied If a property relates an individual A to another individual B, and the property has a class C as its domain, then the individual A must belong to class C Ex. hasSymptom has the domain Disorder: X hasSymptom Y ⇒ X is a Disorder
  36. 36. Range It indicates the individuals (or data values) that the property can take as its values If a property relates an individual A to another individual B, and the property has class C as its range, then the individual B must belong to class C Ex. hasSymptom has a range of Symptom: X hasSymptom Y ⇒ Y is a Symptom
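
Domain and range are used to infer class membership rather than to reject data: from a single statement such as `Flu hasSymptom Fever`, a reasoner can conclude that `Flu` is a Disorder and `Fever` is a Symptom. The following sketch imitates that inference with plain dictionaries; `infer_types` is a hypothetical helper, not a call from any OWL library.

```python
DOMAIN = {"hasSymptom": "Disorder"}
RANGE = {"hasSymptom": "Symptom"}


def infer_types(triples):
    """Derive class memberships implied by property domains and ranges."""
    types = {}
    for subject, prop, obj in triples:
        if prop in DOMAIN:
            types.setdefault(subject, set()).add(DOMAIN[prop])
        if prop in RANGE:
            types.setdefault(obj, set()).add(RANGE[prop])
    return types


inferred = infer_types([("Flu", "hasSymptom", "Fever")])
```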
  37. 37. Equality and Inequality OWL terms that allow expressing equalities and inequalities between ontological components EquivalentClass: two classes are equivalent EquivalentProperty: two properties are equivalent SameAs: different names that refer to the same individual DifferentFrom: two individuals are different AllDifferent: all the individuals in a list are pairwise different
  38. 38. Property Characteristics They define the semantics of properties InverseOf: one property is the inverse of another TransitiveProperty: the property is transitive SymmetricProperty: the property is symmetric FunctionalProperty: the property has at most one value for each individual InverseFunctionalProperty: the inverse of the property is functional
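
One way to see what these characteristics mean is to treat a property as a set of (subject, object) pairs and compute what each characteristic entails. This is a minimal sketch; the helper names and the example properties `isPartOf` and `isRelatedTo` are assumptions, and the individuals are toy data.

```python
def inverse(pairs):
    """Pairs of the inverse property: (b, a) for every (a, b)."""
    return {(b, a) for a, b in pairs}


def symmetric_closure(pairs):
    """Add (b, a) for every (a, b), as a symmetric property entails."""
    return set(pairs) | inverse(pairs)


def transitive_closure(pairs):
    """Add (a, c) whenever (a, b) and (b, c) hold, until nothing new appears."""
    closure = set(pairs)
    while True:
        new = {(a, d) for a, b in closure for c, d in closure if b == c}
        if new <= closure:
            return closure
        closure |= new


def is_functional(pairs):
    """True if no subject is related to more than one distinct value."""
    subjects = [a for a, _ in set(pairs)]
    return len(subjects) == len(set(subjects))


is_caused_by = {("Flu", "Influenza_Virus")}
causes = inverse(is_caused_by)

is_part_of = {("Alveolus", "Lung"), ("Lung", "Respiratory_System")}
all_parts = transitive_closure(is_part_of)

is_related_to = symmetric_closure({("Flu", "Cold")})
```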
  39. 39. Property Restrictions (I) Define some constraints on the use of properties AllValuesFrom: all the values in the range of a property belong to a given class Cancer isTreatedWith [AllValuesFrom Radiotherapy] SomeValuesFrom: at least one value in the range of a property belongs to a given class Flu hasSymptom [SomeValuesFrom Fever]
  40. 40. Property Restrictions (II) MinCardinality, MaxCardinality: minimum/maximum number of values (individuals or data values) that an individual can have for a given property
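
The difference between the restriction types is easiest to see on concrete values. The sketch below treats a class as a set of individuals and checks the values that one individual has for a property. Note the simplification: an OWL reasoner works under the open-world assumption and uses restrictions to draw inferences, whereas these hypothetical helpers simply test the data at hand. All individual names are invented.

```python
def all_values_from(values, allowed):
    """Every value belongs to the class; vacuously true with no values."""
    return all(v in allowed for v in values)


def some_values_from(values, allowed):
    """At least one value belongs to the class."""
    return any(v in allowed for v in values)


def cardinality_ok(values, minimum=0, maximum=None):
    """Number of values lies between minimum and maximum (if given)."""
    count = len(values)
    return count >= minimum and (maximum is None or count <= maximum)


# Classes as sets of (invented) individuals.
RADIOTHERAPY = {"radio_plan_1", "radio_plan_2"}
FEVER = {"fever_episode_7"}

cancer_treatments = ["radio_plan_1"]
flu_symptoms = ["fever_episode_7", "cough_episode_3"]
```

Here `all_values_from([], RADIOTHERAPY)` holds even though nothing is related, while `some_values_from([], FEVER)` does not; this is the usual source of confusion between the two.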
  41. 41. Logical Operators (I) Define classes out of other classes IntersectionOf: Tuberculosis_Symptoms = Fever IntersectionOf Coughing_Blood; UnionOf: Flu_Symptoms = Fever UnionOf Vomit; ComplementOf: StandardDisorder = ComplementOf ContagiousDisorder
  42. 42. Logical Operators (II) DisjointWith: two classes are disjoint, e.g. Symptom DisjointWith Cause; OneOf: defines a class by enumerating all the individuals that belong to it, e.g. Hospitals is OneOf {University-Hospital, St_John, City-Clinic}
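
If classes are viewed as sets of individuals, the logical operators correspond directly to set operations. The following sketch uses Python sets with invented individuals; the complement is taken relative to an explicitly chosen universe, which is an assumption of the example (in OWL the complement is relative to all individuals).

```python
# Classes as sets of invented individuals (symptom occurrences).
FEVER = {"s1", "s2"}
COUGHING_BLOOD = {"s2", "s3"}
VOMIT = {"s4"}

# IntersectionOf and UnionOf
tuberculosis_symptoms = FEVER & COUGHING_BLOOD
flu_symptoms = FEVER | VOMIT

# ComplementOf, relative to an assumed universe of disorders
ALL_DISORDERS = {"flu", "measles", "fracture"}
CONTAGIOUS_DISORDER = {"flu", "measles"}
standard_disorder = ALL_DISORDERS - CONTAGIOUS_DISORDER

# DisjointWith: the two classes share no individual
SYMPTOM = FEVER | COUGHING_BLOOD | VOMIT
CAUSE = {"c1", "c2"}
symptom_disjoint_with_cause = SYMPTOM.isdisjoint(CAUSE)

# OneOf: a class defined by enumerating its individuals
HOSPITALS = frozenset({"University-Hospital", "St_John", "City-Clinic"})
```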
  43. 43. OWL - Conclusions OWL is a language for representing ontologies, which extends frame languages OWL has a rich set of features There exist reasoners to check the consistency of an ontology Before building a knowledge base (ontology), a study of the domain is required (in order to determine constraints, relationships and incompatibilities)
  44. 44. Outline of the lecture Ontologies Definition Components Use in MAS OWL: Web Ontology Language A method for ontology development
  45. 45. Ontology Development Process In this talk: determine scope → consider reuse → enumerate terms → define classes → define properties → define restrictions → create instances. In reality, an iterative process: determine scope → consider reuse → enumerate terms → consider reuse → define classes → enumerate terms → define classes → define properties → define classes → define properties → define restrictions → create instances → define classes → create instances → consider reuse → define properties → define restrictions → create instances
  46. 46. General golden rules There is not one ‘correct’ way to model a domain There are always different structuring possibilities Ontology development is necessarily an iterative process Concepts in the ontology should be close to (physical or logical) objects (nouns) and relationships (verbs) in the domain of interest
  47. 47. I-Determine Domain and Scope Goal: limit the scope of the model What is the domain that the ontology will cover? For what are we going to use the ontology? To what types of questions (competency questions) should the information in the ontology provide answers? Who will use and maintain the ontology? Answers to these questions may change during the lifecycle
  48. 48. Limiting the scope An ontology should not contain all the possible information about the domain No need to specialize or generalize more than the application requires No need to include all the possible properties of a class Only the most relevant properties Only the properties that the application requires
  49. 49. Competency Questions Incremental explicit list of questions that the final ontology knowledge base should be able to answer Is cancer contagious or not? Which symptoms define the flu disorder? Which are the causes of hypertension? Which treatment should I apply for a patient that is allergic to penicillin and has flu?
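
Competency questions can double as test cases: once some facts are in place, each question should translate into a query that the knowledge base can answer. The sketch below stores facts as triples and answers two of the questions. The facts are toy data chosen for illustration (not medical statements), and the property names and the helper `objects` are assumptions.

```python
# Toy facts as (subject, property, value) triples; illustrative only.
TRIPLES = {
    ("Flu", "hasSymptom", "Fever"),
    ("Flu", "hasSymptom", "Cough"),
    ("Cancer", "isContagious", False),
}


def objects(subject, prop, triples=TRIPLES):
    """All values that `subject` has for property `prop`."""
    return {o for s, p, o in triples if s == subject and p == prop}


# "Which symptoms define the flu disorder?"
flu_symptoms = objects("Flu", "hasSymptom")

# "Is cancer contagious or not?"
cancer_contagious = objects("Cancer", "isContagious")
```

A question that returns an empty set, such as the causes of hypertension in this toy data, signals that the ontology does not yet cover it.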
  50. 50. II-Consider Reuse Why reuse other ontologies? To avoid repeating the work To interact with the tools that use other ontologies To use ontologies that have been validated through use in previous applications To make the final knowledge base compatible with predefined standards (e.g. MeSH, UMLS)
  51. 51. What to Reuse? Domain-specific standard terminology Unified Medical Language System (UMLS) MeSH, ICD10
  52. 52. Example: Gene Ontology
  53. 53. III-Enumerate Important Terms Goal: build a complete list of terms in the delimited scope. Are they the appropriate ones to answer all the Competency Questions? Which are the terms we need to talk about? What do we want to say about the terms? Make a comprehensive list of the terms without considering (here) the overlap between concepts they represent, relations among terms, or whether the concepts are classes or properties
  54. 54. Enumerating Terms – Medical Ontology Disorder, symptom, treatment, cause, … Disorder contagiousness, disorder scientific name, disorder standard code, … Cancer, blood disorder, flu, hepatitis, … Fever, icterus, vomit, cough, …
  55. 55. IV-Define Classes and a Class Hierarchy Goal: identify, in the list of terms, those that represent concepts in the domain A class is a concept in the domain: A class of Disorders; A class of Symptoms; A class of Cancers A class is a collection of elements with similar properties A class can later be instantiated: John’s blood disorder
  56. 56. Class Inheritance Classes usually constitute a taxonomic hierarchy (a subclass-superclass hierarchy) An instance of a subclass is an instance of a superclass If you think of a class as a set of elements, a subclass is a subset that has a certain common characteristic
  57. 57. Class Inheritance - Example Cancer is a subclass of Disorder Every cancer is a disorder Lung cancer is a subclass of Cancer Every lung cancer is a cancer
  58. 58. Levels in the Hierarchy
  59. 59. Modes of Development Top-down: define the most general concepts first and then specialize them Bottom-up: start with the most specific concepts and then organize them in more general classes Combination: define the more relevant concepts first and then generalize and specialize them as necessary
  60. 60. Documentation Classes (and properties) usually have documentation Describing the class in natural language Listing domain assumptions relevant to the class definition Listing synonyms Different labels for different languages
  61. 61. Some hints If a class only has one child, there may be a modelling problem If a class has more than a dozen children, additional subcategories may be necessary Subclasses of a class usually … have additional properties have different restrictions participate in different relationships
  62. 62. V-Define Properties of Classes Describe attributes of instances of the class and relations to other instances [Attributes] For each disorder we want to know its natural language name, its scientific name, its ICD-10 code, etc. [Relations to other concepts] Each disorder has symptoms, causes, treatments, etc.
  63. 63. Relationships in a medical ontology
  64. 64. Properties Datatype vs Object properties Datatype properties (attributes): Contain primitive values (strings, numbers) Disorder name: string; Disorder scientific name: string; Disorder ICD-10 code: string (ICD-10 codes are alphanumeric); Disorder contagiousness: boolean Object properties (relationships): Contain (or point to) other objects A syndrome has a set of symptoms A disease can be the cause of a syndrome An intervention plan is associated with a syndrome
  65. 65. Property and Class Inheritance A subclass inherits all the properties from its superclass If a disorder has a name and a contagiousness, a Flu disorder also has a name and a contagiousness If a class has multiple superclasses, it inherits properties from all of them Leukemia is both a Blood disorder and a Cancer
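
Inheritance with multiple superclasses can be sketched by collecting properties recursively along every superclass link. The properties `name` and `contagiousness` come from the example above; `stage` and `affectedCellType` are invented so that each superclass of Leukemia contributes something, and the function name is hypothetical.

```python
SUPERCLASSES = {
    "Flu": ["Disorder"],
    "Cancer": ["Disorder"],
    "Blood_Disorder": ["Disorder"],
    "Leukemia": ["Blood_Disorder", "Cancer"],
}

# Properties declared directly on each class (the last two are invented).
OWN_PROPERTIES = {
    "Disorder": {"name", "contagiousness"},
    "Cancer": {"stage"},
    "Blood_Disorder": {"affectedCellType"},
}


def inherited_properties(cls):
    """Own properties plus those of every direct or indirect superclass."""
    props = set(OWN_PROPERTIES.get(cls, ()))
    for parent in SUPERCLASSES.get(cls, ()):
        props |= inherited_properties(parent)
    return props


flu_props = inherited_properties("Flu")
leukemia_props = inherited_properties("Leukemia")
```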
  66. 66. VI-Property Restrictions Property restrictions constrain or limit the set of possible values for a property The scientific name of a disorder is a string The symptoms of any disorder have to be instances of the Symptom class A disorder is required to have at least one MeSH code
  67. 67. Common Restrictions Cardinality: the number of values a property has Value type: the type of values a property has Minimum and maximum value: a range of values for a numeric property Default value: the value a property has unless explicitly specified otherwise
  68. 68. Domain and Range of a Property Domain of a property: the class (or classes) that have the property More precisely: class (or classes) of instances which can have the property Which are the classes that can use a property? Range of a property: the class (or classes) to which property values belong Which are the classes restricting the property possible values?
  69. 69. Restrictions and Class Inheritance A subclass inherits all the property restrictions from its superclass A subclass can override the restrictions to “narrow” the list of allowed values: Make the cardinality range smaller; Replace a class in the range with a subclass
  70. 70. VII-Create Instances (Individuals) Choose the class of the instance to be created Create an instance of a class The class becomes a direct type of the instance Any superclass of the direct type is a type of the instance Assign property values for the instance frame Property values should conform to the restrictions Knowledge-acquisition tools often check this
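
The kind of check a knowledge-acquisition tool performs when an instance is created can be sketched as follows. The restriction table encodes the earlier examples (the scientific name is a string, at least one MeSH code is required); the limit of one scientific name, the property names, the placeholder code value and the `validate` helper are all assumptions for illustration.

```python
# Restrictions per class: value type plus minimum/maximum cardinality.
RESTRICTIONS = {
    "Disorder": {
        "scientificName": {"type": str, "min": 1, "max": 1},  # max is assumed
        "meshCode": {"type": str, "min": 1, "max": None},
    }
}


def validate(values_by_property, cls):
    """Return a list of violations; an empty list means the instance conforms."""
    errors = []
    for prop, rule in RESTRICTIONS[cls].items():
        values = values_by_property.get(prop, [])
        if len(values) < rule["min"]:
            errors.append(f"{prop}: too few values")
        if rule["max"] is not None and len(values) > rule["max"]:
            errors.append(f"{prop}: too many values")
        if not all(isinstance(v, rule["type"]) for v in values):
            errors.append(f"{prop}: wrong value type")
    return errors


# The MeSH code below is a placeholder, not a real descriptor identifier.
good_flu = {"scientificName": ["Influenza"], "meshCode": ["MESH-PLACEHOLDER"]}
bad_flu = {"scientificName": ["Influenza", 42]}

good_errors = validate(good_flu, "Disorder")
bad_errors = validate(bad_flu, "Disorder")
```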
  71. 71. Extra material in Moodle space OWL official description Link to Protégé web page