Can there be such a thing as Ontology Engineering?

Invited talk at Carleton University, Ottawa

Transcript

  • 1. Can there be such a thing as Ontology Engineering? Robert Stevens BioHealth Informatics Group University of Manchester
  • 2. Introduction  A bit of ontology introduction if required;  What is engineering?  Predictability in ontology engineering  The application of deterministic principles  The role of strict semantics  The role of philosophy  Acquiring some level of reproducibility.
  • 3. A World of Instances  The world (of information) is made up of things and lots of them  Instances, individuals, objects, tokens, particulars.  The Earth is a kind of Planet  Robert Stevens (NE 67 41 58 A) is a Person  All the individual Alpha Haemoglobins in my many Instances of Red Blood Cell  Each cell instance in my Body has copies of some 30,000 Genes  A Word, language, idea, etc.  This Table, those Chairs,  Any Thing with “A”, “The”, “That”, etc. before it….
  • 4. We Put things into Categories  All these instances hang about making our world  Putting these things into categories is a fundamental part of human cognition  Psychologists study this as concept formation  The same instances are put into a category  The capitalised and italicised in the slide before last
  • 5. We have Labels for the Categories and their Instances  We label categories with symbols: Words  “Lion” is a category of big cat with big teeth  Gene, Protein, Cell, Person, Hydrolase Activity, etc.  …and, as we’ve already seen, each category can have many labels and any particular label can refer to more than one category  Semantic Heterogeneity  “A lion” is an instance in that category  Does the category “Lion” exist?  Lions exist, but the category could just be a human way of talking about lions  … we like putting things into categories
  • 6. A Controlled Vocabulary  A specified set of words and phrases for the categories in which we place instances  Natural language definitions for those words and phrases  A glossary defines, but doesn’t control  The Uniprot keywords define and control  Control is placed upon which labels are used to represent the categories (concepts) we’ve used to describe the instances in the world  …, but there is nothing about how things in these categories are related Biopolymer DNA Enzyme Nucleic acid mRNA Polypeptide snRNA tRNA
  • 7. We also like to Relate Things Together  Categories have subcategories  Instances in one category can be related in some way to instances in another  Can relate instances to each other in many different ways  Is-a, part-of, develops-from, etc. axes  We can use these relationships to classify categories  Things in category A are part is  If all instances in category A are also in category B then As are kinds of Bs Biopolymer Nucleic Acid Polypeptide Enzyme DNA RNA tRNA mRNA smRNA
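The "kinds of" rule on slide 7 can be illustrated with a minimal Python sketch, assuming categories are modelled as plain sets of instances. The instance names (`dna1`, `trna1`, and so on) and the helper `is_kind_of` are hypothetical and are not taken from the talk.

```python
# Hypothetical instances; each category is modelled as the set of its members.
CATEGORIES = {
    "Biopolymer": {"dna1", "trna1", "mrna1", "enzyme1"},
    "Nucleic Acid": {"dna1", "trna1", "mrna1"},
    "RNA": {"trna1", "mrna1"},
    "Polypeptide": {"enzyme1"},
}


def is_kind_of(a, b):
    """True if every instance in category `a` is also in category `b`."""
    return CATEGORIES[a] <= CATEGORIES[b]


if __name__ == "__main__":
    for sub, sup in [("RNA", "Nucleic Acid"), ("Nucleic Acid", "RNA")]:
        print(sub, "is a kind of", sup, "->", is_kind_of(sub, sup))
```

A check over a finite sample of instances is only evidence for the relationship; an ontology asserts it for all possible instances.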
  • 8. Categories and sub-categories biopolymer polypeptide Nucleic acid enzyme DNA RNA
  • 9. Describing Category Membership  We can make conditions that any instance must fulfil in order to be a member of a particular category  A Phosphatase must have a phosphatase catalytic domain  A Receptor must have a transmembrane domain  A codon has three nucleotide residues  A limb has part that is a joint  A man has a Y chromosome and an X chromosome  A woman has only an X chromosome
  • 10. Relationships  These conditions are made from a property and a successor relationship  isPartOf, hasPart  isDerivedFrom  DevelopsFrom  isHomologousTo  …and many, many more
  • 11. A Structured Controlled Vocabulary  Not only can we agree on the labels we give categories  Can also agree on how the instances of categories are related  And agree on the labels we give the relations  Structure aids querying and captures knowledge with greater fidelity Biopolymer Nucleic Acid Polypeptide Enzyme DNA RNA tRNA mRNA smRNA Gene regionOf transcribedFrom translatedFrom
  • 12. Manchester Mercury, January 1st 1754. List of diseases & casualties this year: Executed 18; Found Dead 34; Frighted 2; Kill'd by falls and other accidents 55; Kill'd themselves 36; Murdered 3; Overlaid 40; Poisoned 1; Scalded 5; Smothered 1; Stabbed 1; Starved 7; Suffocated 5; Aged 1456; Consumption 3915; Convulsion 5977; Dropsy 794; Fevers 2292; Smallpox 774; Teeth 961; Bit by mad dogs 3; Broken Limbs 5; Bruised 5; Burnt 9; Drowned 86; Excessive Drinking 15. 19276 burials; 15444 christenings. Deaths by centile
  • 13. Uses of Ontology in Bioinformatics
  • 14. What is engineering?  American Engineers' Council for Professional Development defines "engineering" as:  “The creative application of scientific principles to design or develop structures, machines, apparatus, or manufacturing processes, or works utilizing them singly or in combination; or to construct or operate the same with full cognizance of their design; or to forecast their behavior under specific operating conditions; all as respects an intended function, economics of operation and safety to life and property.”  Taken from http://en.wikipedia.org/wiki/Engineering
  • 15. What Type of Artefact? The Rise of the Computer Science Ontology  A term borrowed from philosophy  Not supposed to be the same thing, but…  Meant to deliver formal, computational semantics to applications and humans  Necessarily involves consensus
  • 16. Software engineering life cycle 06/27/14 18 http://www.samsvb.co.uk Ontology
  • 17. Where are we in the Development of Ontology Engineering?  At about 1975…  There’s a lot of craft involved;  Too much reliance on gurus  Could two independent sets of ontologists develop two ontologies for the same domain with the same utility?  Can we cost ontology building?  Do we know when we have success?
  • 18. The Waterfall Method: Requirements; Conceptualisation; Development + Coding; Quality + Testing; Maintenance + Support. Getting it right first time
  • 19. Something a bit more agile: Requirements, scoping, Competency questions; Knowledge acquisition; Conceptualisation, pattern forming; Axiomatization; Testing / evaluation? Repeated, small iterations. Users always involved.
  • 20. Four Broad Areas of Ontology Engineering 1. Technical aspects: Code repositories, issue trackers, editors, and so on 2. Coding styles and naming conventions, etc. 3. Choosing a class, placing it in a hierarchy and choosing relationships and entities by which it is described. 4. The rhetoric behind how (2) and (3) are done. One can have philosophical justification for any decision, or it can just be practically useful….
  • 21. Getting the Requirements Right  Truth and beauty is an easy requirement to state  Just model the world as it is and all else will flow from this;  Not necessarily helpful;  Have to set a scope;  Have to set priorities – what do we most need to represent?  Competency questions – what do I need to be able to answer?  Separating “what the ontology must answer” and “what the ontology must enable to be answered”;  Requirements change; keeping it “agile”  Setting priorities.
  • 22. Strict Semantics  Languages such as OWL have a strict semantics;  Statements have a precise and interpretable meaning;  Deductions can follow from a series of statements;  Can be used to aid development and use of the ontology
  • 23. Correct, but Wrong…  An automated reasoner for OWL can make sure all your axioms are coherent;  One can make sure the ontology is structurally robust  The statements in the ontology can still be rubbish though…  A strict semantics lends some kind of predictability to an ontology;  A pure description logic approach of all defined classes has some appeal…
  • 24. Total Definition  In OWL a defined class can find its own place in the hierarchy  A parent is any person that has a child;  A mother is any woman that has a child;  As a woman is a kind of person, we can infer a mother to be a kind of parent;  Do this for all classes; press the button and you have an ontology  Definition is hard (but that may be a good thing) and the tools may lack  Requires discipline from the authors  …and it all grounds out to a primitive somewhere along the line…
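The parent/mother inference on slide 24 can be sketched in Python as a toy structural subsumption check. This is an illustration under simplifying assumptions, not an OWL reasoner: each defined class is a genus plus a set of existential restrictions, and the names `PRIMITIVE_PARENT`, `DEFINED`, `subsumed_by` and `classify` are hypothetical.

```python
# Toy structural subsumption check; NOT a real OWL reasoner.
# Asserted primitive hierarchy: class -> its asserted parent (None at the root).
PRIMITIVE_PARENT = {"Woman": "Person", "Person": None}

# Defined classes: (genus, set of (property, filler) existential restrictions).
DEFINED = {
    "Parent": ("Person", frozenset({("hasChild", "Person")})),
    "Mother": ("Woman", frozenset({("hasChild", "Person")})),
}


def primitive_is_a(sub, sup):
    """True if primitive class `sub` equals `sup` or lies below it."""
    while sub is not None:
        if sub == sup:
            return True
        sub = PRIMITIVE_PARENT[sub]
    return False


def subsumed_by(sub, sup):
    """True if defined class `sub` is inferred to be a kind of `sup`."""
    sub_genus, sub_restrictions = DEFINED[sub]
    sup_genus, sup_restrictions = DEFINED[sup]
    # The genus must specialise, and every restriction of `sup` must be met
    # by a restriction of `sub` on the same property with a specialised filler.
    return primitive_is_a(sub_genus, sup_genus) and all(
        any(p == q and primitive_is_a(f, g) for (p, f) in sub_restrictions)
        for (q, g) in sup_restrictions
    )


def classify():
    """Return the inferred (sub, sup) pairs among the defined classes."""
    return sorted(
        (a, b) for a in DEFINED for b in DEFINED
        if a != b and subsumed_by(a, b)
    )


if __name__ == "__main__":
    print(classify())
```

Only the Woman-is-a-Person link is asserted by hand; the Mother-is-a-Parent link comes out of the definitions, which is the "press the button" step the slide describes.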
  • 25. Normalisation  An “engineering” method to manage polyhierarchies in ontology through reasoning;  Make a strict tree of primitive classes using one criterion;  Put all other criteria as restrictions upon those classes;  Re-establish the polyhierarchy through defined classes with the “other” criteria….  http://ontogenesis.knowledgeblog.org/49
  • 26. Authoring Tools  These are really just axiom editors  Support for the surrounding processes is nascent  Lots of “hand-crafting” of even large ontologies  Knowledge gathering tools; organising tools; axiom generation tools; checking and validation tools; …
  • 27. Protégé 4
  • 28. Patterns and Components  Software Design Patterns: Accepted design solutions to common problems;  Application building at the level of components;  Design pattern analogy in ontologies;  Patterns or regularities that are not ODP;  Ontologies tend to be repetitious and humans tend to be bad at repetition – tedium kicks in….  Calls for automation
  • 29. Ontology Pre-Processor Language A cell type is equivalent to a cell type that is part of some anatomy Pattern
  • 30. Ontology Pre-Processor Language ?cell:CLASS, ?anatomyPart:CLASS, ?anatomy:CLASS = (CL:0000000 part_of some ?anatomyPart) BEGIN ADD ?cell equivalentTo ?anatomy END; Variables Create axioms A cell type is equivalent to a cell type that is part of some anatomy Pattern OPPL Script
  • 31. Ontology Pre-Processor Language ?cell:CLASS, ?anatomyPart:CLASS, ?anatomy:CLASS = (CL:0000000 part_of some ?anatomyPart) BEGIN ADD ?cell equivalentTo ?anatomy END; A cell type is equivalent to a cell type that is part of some anatomy Pattern OPPL Script Variable mapper ?cell -> ‘Kidney Cell’[CL:0003523] ?anatomyPart -> ‘Kidney’[FMA:629093]
  • 32. Resulting OWL axioms Class: CL:0003523 Annotation: rdfs:label ‘Kidney Cell’ EquivalentTo: CL:0000000 and OBO_REL:part_of some FMA:629093 A ‘Kidney Cell’ is equivalent to a cell that is part of the ‘Kidney’ Example Generated OWL (Manchester Syntax)
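The pattern-to-axiom step on slides 29 to 32 can be approximated with a small Python sketch that fills one template per row of variable bindings. It is not OPPL and does not use any OPPL API; the function names `cell_type_frame` and `generate` are hypothetical. The identifiers are the ones shown on the slides, and the output uses the `Annotations:` keyword as spelled in the OWL 2 Manchester Syntax.

```python
def cell_type_frame(cell_id, cell_label, anatomy_id):
    """Fill the 'cell that is part of some anatomy' pattern for one binding."""
    return "\n".join([
        f"Class: {cell_id}",
        f'    Annotations: rdfs:label "{cell_label}"',
        f"    EquivalentTo: CL:0000000 and OBO_REL:part_of some {anatomy_id}",
    ])


def generate(bindings):
    """Apply the pattern to every (cell_id, cell_label, anatomy_id) row."""
    return [cell_type_frame(*row) for row in bindings]


if __name__ == "__main__":
    # Bindings taken from the variable mapper on slide 31.
    rows = [("CL:0003523", "Kidney Cell", "FMA:629093")]
    print("\n\n".join(generate(rows)))
```

Adding a row to `rows` yields another frame with an identical shape, which is the consistency argument the talk makes for automation.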
  • 33. Automation  Moving from hand-crafting to production line  Can try things out and then re-model (as long as the entities involved don’t change)  Documents what has been done;  Ruthlessly consistent;  Also need support in repetitious knowledge gathering as well as axiom generation.
  • 34. Populous  Generic tool for populating ontology templates  Spreadsheet style interface  Supports validation at the point of data entry  Expressive Pattern language for OWL Ontology generation http://www.e-lico.eu/populous
  • 35. Evaluation  A big “can of worms”  Closely linked to requirements  Closely linked to what one believes an ontology to be…;  “Just do what I say and it will be OK” isn’t an evaluation strategy;  Nor is saying “just model reality” and that’s all you need to evaluate;  No really convincing way of doing it.
  • 36. The Role of philosophy Biology Computer Science Philosophy
  • 37. Angels on the head of a pin
  • 38. Biology Computer Science Philosophy The role of philosophy
  • 39. Can we have Ontology Engineering?  Probably, but you’ll have to wait;  Not much predictability, except to say “it’s hard” and “people will disagree with you”  So, much like software engineering;  Much to learn from SE and it should be quicker;  Programming is not software engineering  Axiom authoring is not ontology engineering;  At the moment we’re writing axioms, but realise we need to engineer;  Once we can demonstrate, with predictability, that two independent groups can take a method and each produce an ontology that meets some needs then I’ll begin to relax.