Can there be such a thing as Ontology Engineering?
Robert Stevens
BioHealth Informatics Group
University of Manchester
Invited talk at Carleton University, Ottawa


  1. Can there be such a thing as Ontology Engineering?
     Robert Stevens, BioHealth Informatics Group, University of Manchester
  2. Introduction
     • A bit of ontology introduction if required;
     • What is engineering?
     • Predictability in ontology engineering
     • The application of deterministic principles
     • The role of strict semantics
     • The role of philosophy
     • Acquiring some level of reproducibility.
  3. A World of Instances
     • The world (of information) is made up of things and lots of them
     • Instances, individuals, objects, tokens, particulars.
     • The Earth is a kind of Planet
     • Robert Stevens (NE 67 41 58 A) is a Person
     • All the individual Alpha Haemoglobins in my many Instances of Red Blood Cell
     • Each cell instance in my Body has copies of some 30,000 Genes
     • A Word, language, idea, etc.
     • This Table, those Chairs,
     • Any Thing with “A”, “The”, “That”, etc. before it….
  4. We Put Things into Categories
     • All these instances hang about making our world
     • Putting these things into categories is a fundamental part of human cognition
     • Psychologists study this as concept formation
     • The same instances are put into a category
     • The capitalised and italicised words in the slide before last
  5. We have Labels for the Categories and their Instances
     • We label categories with symbols: Words
     • “Lion” is a category of big cat with big teeth
     • Gene, Protein, Cell, Person, Hydrolase Activity, etc.
     • …and, as we’ve already seen, each category can have many labels and any particular label can refer to more than one category
     • Semantic Heterogeneity
     • “A lion” is an instance in that category
     • Does the category “Lion” exist?
     • Lions exist, but the category could just be a human way of talking about lions
     • … we like putting things into categories
  6. A Controlled Vocabulary
     • A specified set of words and phrases for the categories in which we place instances
     • Natural language definitions for those words and phrases
     • A glossary defines, but doesn’t control
     • The UniProt keywords define and control
     • Control is placed upon which labels are used to represent the categories (concepts) we’ve used to describe the instances in the world
     • …, but there is nothing about how things in these categories are related
     Example vocabulary: Biopolymer, DNA, Enzyme, Nucleic acid, mRNA, Polypeptide, snRNA, tRNA
  7. We also like to Relate Things Together
     • Categories have subcategories
     • Instances in one category can be related in some way to instances in another
     • Can relate instances to each other in many different ways
     • Is-a, part-of, develops-from, etc. axes
     • We can use these relationships to classify categories
     • Things in category A are part is
     • If all instances in category A are also in category B then As are kinds of Bs
     Diagram (category hierarchy):
       Biopolymer
         Nucleic Acid
           DNA
           RNA
             tRNA
             mRNA
             snRNA
         Polypeptide
           Enzyme
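The last bullet of slide 7 ("if all instances in category A are also in category B then As are kinds of Bs") can be read as a subset test. The following is only a sketch under that reading: the instance names and the `instances` table are made up for illustration, and a real OWL reasoner decides subsumption over all possible interpretations, not over one listed set of individuals.

```python
# Hypothetical toy data: each category maps to the set of its known instances.
instances = {
    "Biopolymer": {"dna1", "trna1", "enz1"},
    "NucleicAcid": {"dna1", "trna1"},
    "RNA": {"trna1"},
    "Polypeptide": {"enz1"},
}


def is_kind_of(a, b, extension):
    """True when every listed instance of category a is also in category b."""
    return extension[a] <= extension[b]


def subcategories(b, extension):
    """All other categories whose instances all fall inside category b."""
    return sorted(a for a in extension if a != b and is_kind_of(a, b, extension))
```

For example, `is_kind_of("RNA", "NucleicAcid", instances)` holds in this toy data, while `is_kind_of("Polypeptide", "NucleicAcid", instances)` does not.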
  8. Categories and sub-categories
       biopolymer
         polypeptide
           enzyme
         Nucleic acid
           DNA
           RNA
  9. Describing Category Membership
     • We can make conditions that any instance must fulfil in order to be a member of a particular category
     • A Phosphatase must have a phosphatase catalytic domain
     • A Receptor must have a transmembrane domain
     • A codon has three nucleotide residues
     • A limb has part that is a joint
     • A man has a Y chromosome and an X chromosome
     • A woman has only X chromosomes
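The membership conditions on slide 9 can be sketched as necessary conditions checked against what is known about an instance. This is an illustration only: the property and class names (`hasPart`, `TransmembraneDomain`, and so on) are assumptions based on the slide's wording. The check is also closed-world, whereas OWL's open-world semantics would not treat a missing fact as a violation.

```python
# Necessary conditions per category, written as (property, filler) pairs.
# All names here are illustrative, loosely following the slide's examples.
NECESSARY = {
    "Phosphatase": {("hasPart", "PhosphataseCatalyticDomain")},
    "Receptor": {("hasPart", "TransmembraneDomain")},
    "Limb": {("hasPart", "Joint")},
}


def missing_conditions(category, facts):
    """Return the necessary conditions of `category` not present in `facts`."""
    return NECESSARY[category] - set(facts)


def may_be_member(category, facts):
    """True when the known facts satisfy every necessary condition."""
    return not missing_conditions(category, facts)


# A hypothetical instance described by the facts we happen to know about it.
arm = {("hasPart", "Joint"), ("hasPart", "Hand")}
```

Here `may_be_member("Limb", arm)` holds, while `missing_conditions("Receptor", arm)` reports the transmembrane domain condition as unmet.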
  10. Relationships
     • These conditions are made from a property and a successor relationship
     • isPartOf, hasPart
     • isDerivedFrom
     • developsFrom
     • isHomologousTo
     • …and many, many more
  11. A Structured Controlled Vocabulary
     • Not only can we agree on the labels we give categories
     • Can also agree on how the instances of categories are related
     • And agree on the labels we give the relations
     • Structure aids querying and captures knowledge with greater fidelity
     Diagram labels: Biopolymer, Nucleic Acid, Polypeptide, Enzyme, DNA, RNA, tRNA, mRNA, snRNA, Gene; relations: regionOf, transcribedFrom, translatedFrom
  12. Manchester Mercury, January 1st 1754
     List of diseases & casualties this year

       Cause                                  Count
       Executed                                  18
       Found Dead                                34
       Frighted                                   2
       Kill'd by falls and other accidents       55
       Kill'd themselves                         36
       Murdered                                   3
       Overlaid                                  40
       Poisoned                                   1
       Scalded                                    5
       Smothered                                  1
       Stabbed                                    1
       Starved                                    7
       Suffocated                                 5
       Aged                                    1456
       Consumption                             3915
       Convulsion                              5977
       Dropsy                                   794
       Fevers                                  2292
       Smallpox                                 774
       Teeth                                    961
       Bit by mad dogs                            3
       Broken Limbs                               5
       Bruised                                    5
       Burnt                                      9
       Drowned                                   86
       Excessive Drinking                        15

     19276 burials; 15444 christenings
     Deaths by centile
  13. Uses of Ontology in Bioinformatics
  14. What is engineering?
     • American Engineers' Council for Professional Development defines "engineering" as:
     • “The creative application of scientific principles to design or develop structures, machines, apparatus, or manufacturing processes, or works utilizing them singly or in combination; or to construct or operate the same with full cognizance of their design; or to forecast their behavior under specific operating conditions; all as respects an intended function, economics of operation and safety to life and property.”
     • Taken from http://en.wikipedia.org/wiki/Engineering
  15. What Type of Artefact? The Rise of the Computer Science Ontology
     • A term borrowed from philosophy
     • Not supposed to be the same thing, but…
     • Meant to deliver formal, computational semantics to applications and humans
     • Necessarily involves consensus
  16. Software engineering life cycle
     [Figure: life cycle diagram, labelled “Ontology”; http://www.samsvb.co.uk]
  17. Where are we in the Development of Ontology Engineering?
     • At about 1975…
     • There’s a lot of craft involved;
     • Too much reliance on gurus
     • Could two independent sets of ontologists develop two ontologies for the same domain with the same utility?
     • Can we cost ontology building?
     • Do we know when we have success?
  18. The Waterfall Method
     Requirements → Conceptualisation → Development + Coding → Quality + Testing → Maintenance + Support
     Getting it right first time
  19. Something a bit more agile
     Cycle: Requirements, scoping, competency questions → Knowledge acquisition → Conceptualisation, pattern forming → Axiomatization → Testing / evaluation?
     • Repeated, small iterations
     • Users always involved
  20. Four Broad Areas of Ontology Engineering
     1. Technical aspects: Code repositories, issue trackers, editors, and so on
     2. Coding styles and naming conventions, etc.
     3. Choosing a class, placing it in a hierarchy and choosing relationships and entities by which it is described.
     4. The rhetoric behind how (2) and (3) are done. One can have philosophical justification for any decision, or it can just be practically useful….
  21. Getting the Requirements Right
     • Truth and beauty is an easy requirement to state
     • Just model the world as it is and all else will flow from this;
     • Not necessarily helpful;
     • Have to set a scope;
     • Have to set priorities – what do we most need to represent?
     • Competency questions – what do I need to be able to answer?
     • Separating “what the ontology must answer” and “what the ontology must enable to be answered”;
     • Requirements change; keeping it “agile”
     • Setting priorities.
  22. Strict Semantics
     • Languages such as OWL have a strict semantics;
     • Statements have a precise and interpretable meaning;
     • Deductions can follow from a series of statements;
     • Can be used to aid development and use of the ontology
  23. Correct, but Wrong…
     • An automated reasoner for OWL can make sure all your axioms are coherent;
     • One can make sure the ontology is structurally robust
     • The statements in the ontology can still be rubbish though…
     • A strict semantics lends some kind of predictability to an ontology;
     • A pure description logic approach of all defined classes has some appeal…
  24. Total Definition
     • In OWL a defined class can find its own place in the hierarchy
     • A parent is any person that has a child;
     • A mother is any woman that has a child;
     • As a woman is a kind of person, we can infer a mother to be a kind of parent;
     • Do this for all classes; press the button and you have an ontology
     • Definition is hard (but that may be a good thing) and the tools may be lacking
     • Requires discipline from the authors
     • …and it all grounds out to a primitive somewhere along the line…
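The parent/mother inference on slide 24 can be sketched with a very small structural check. This is not how an OWL reasoner works internally. It assumes each defined class is just a primitive genus plus a set of (property, filler) restrictions, and the names `primitive_parent`, `defined` and `hasChild` are hypothetical.

```python
# Asserted primitive hierarchy: class -> asserted superclass (None at the top).
primitive_parent = {"Person": None, "Woman": "Person"}

# Defined classes: (primitive genus, set of (property, filler) restrictions).
defined = {
    "Parent": ("Person", frozenset({("hasChild", "Person")})),
    "Mother": ("Woman", frozenset({("hasChild", "Person")})),
}


def primitive_ancestors(name):
    """Return `name` followed by all of its asserted primitive superclasses."""
    chain = []
    while name is not None:
        chain.append(name)
        name = primitive_parent[name]
    return chain


def subsumed_by(c, d):
    """True when defined class c can be inferred to be a kind of defined class d.

    c is a kind of d if c's genus is (a subclass of) d's genus and c carries
    at least all of d's restrictions.
    """
    genus_c, restrictions_c = defined[c]
    genus_d, restrictions_d = defined[d]
    return genus_d in primitive_ancestors(genus_c) and restrictions_d <= restrictions_c
```

With these definitions `subsumed_by("Mother", "Parent")` holds without the subclass link ever being asserted, which is the "press the button" step on the slide. The primitives `Person` and `Woman` are where it grounds out.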
  25. Normalisation
     • An “engineering” method to manage polyhierarchies in ontology through reasoning;
     • Make a strict tree of primitive classes using one criterion;
     • Put all other criteria as restrictions upon those classes;
     • Re-establish the polyhierarchy through defined classes with the “other” criteria….
     • http://ontogenesis.knowledgeblog.org/49
  26. Authoring Tools
     • These are really just axiom editors
     • Support for the surrounding processes is nascent
     • Lots of “hand-crafting” of even large ontologies
     • Knowledge gathering tools; organising tools; axiom generation tools; checking and validation tools; …
  27. Protégé 4
  28. Patterns and Components
     • Software Design Patterns: Accepted design solutions to common problems;
     • Application building at the level of components;
     • Design pattern analogy in ontologies;
     • Patterns or regularities that are not ODP;
     • Ontologies tend to be repetitious and humans tend to be bad at repetition – tedium kicks in….
     • Calls for automation
  29. Ontology Pre-Processor Language
     Pattern: A cell type is equivalent to a cell type that is part of some anatomy
  30. Ontology Pre-Processor Language
     Pattern: A cell type is equivalent to a cell type that is part of some anatomy
     OPPL Script:
         ?cell:CLASS,
         ?anatomyPart:CLASS,
         ?anatomy:CLASS = (CL:0000000 part_of some ?anatomyPart)
         BEGIN
             ADD ?cell equivalentTo ?anatomy
         END;
     Callouts: “Variables” (the declarations), “Create axioms” (the BEGIN … END block)
  31. Ontology Pre-Processor Language
     Same pattern and OPPL script as on the previous slide, with a variable mapper:
         ?cell -> ‘Kidney Cell’[CL:0003523]
         ?anatomyPart -> ‘Kidney’[FMA:629093]
  32. Resulting OWL axioms
     Example generated OWL (Manchester Syntax):
         Class: CL:0003523
             Annotations: rdfs:label "Kidney Cell"
             EquivalentTo: CL:0000000 and OBO_REL:part_of some FMA:629093
     A ‘Kidney Cell’ is equivalent to a cell that is part of the ‘Kidney’
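The step from pattern plus variable mapping to generated axioms can be imitated with plain string templating. This sketch is not OPPL and does not use any OWL library. It only shows the "fill in the pattern once per row" idea, and the field names (`cell_id`, `cell_label`, `anatomy_id`) are invented for the illustration.

```python
from string import Template

# One Manchester Syntax frame per cell type; placeholders are hypothetical names.
PATTERN = Template(
    "Class: $cell_id\n"
    "    Annotations: rdfs:label \"$cell_label\"\n"
    "    EquivalentTo: CL:0000000 and OBO_REL:part_of some $anatomy_id\n"
)


def expand(rows):
    """Instantiate the pattern once for every variable mapping in `rows`."""
    return [PATTERN.substitute(row) for row in rows]


# The mapping from the slides: 'Kidney Cell' and 'Kidney'.
rows = [
    {"cell_id": "CL:0003523", "cell_label": "Kidney Cell", "anatomy_id": "FMA:629093"},
]
frames = expand(rows)
```

Adding another cell type then means adding another row, not hand-writing another axiom, which is the point the later Automation slide makes about consistency.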
  33. Automation
     • Moving from hand-crafting to production line
     • Can try things out and then re-model (as long as the entities involved don’t change)
     • Documents what has been done;
     • Ruthlessly consistent;
     • Also need support in repetitious knowledge gathering as well as axiom generation.
  34. Populous
     • Generic tool for populating ontology templates
     • Spreadsheet style interface
     • Supports validation at the point of data entry
     • Expressive Pattern language for OWL Ontology generation
     • http://www.e-lico.eu/populous
  35. Evaluation
     • A big “can of worms”
     • Closely linked to requirements
     • Closely linked to what one believes an ontology to be…;
     • “Just do what I say and it will be OK” isn’t an evaluation strategy;
     • Nor is saying “just model reality” and that’s all you need to evaluate;
     • No really convincing way of doing it.
  36. The Role of philosophy
     [Diagram labels: Biology, Computer Science, Philosophy]
  37. Angels on the head of a pin
  38. The role of philosophy
     [Diagram labels: Biology, Computer Science, Philosophy]
  39. Can we have Ontology Engineering?
     • Probably, but you’ll have to wait;
     • Not much predictability, except to say “it’s hard” and “people will disagree with you”
     • So, much like software engineering;
     • Much to learn from SE and it should be quicker;
     • Programming is not software engineering
     • Axiom authoring is not ontology engineering;
     • At the moment we’re writing axioms, but realise we need to engineer;
     • Once we can demonstrate, with predictability, that two independent groups can take a method and each produce an ontology that meets some needs then I’ll begin to relax.
