Collaborative Ontology building: So
much more than authoring an
Ontology
Robert Stevens
BioHealth Informatics Group
The University of Manchester
Manchester
United Kingdom
Robert.Stevens@manchester.ac.uk
Overview
• An experiment in collaborative authoring
• Issues raised
• Observations made
• The process and the artefact
• Bits of technology
Ontologists: What’s their
Problem?
David Randall
Manchester Metropolitan University
What do I Know about
Collaborative Ontology Authoring?
• “you’ve never built a real ontology”
• Advisor in projects
• Experiments in collaborative authoring
• Doing it for real in a Kidney and Urinary Pathway
Ontology
• Informal observational studies with collaborative
Protégé
The Software Engineering Life-Cycle
Ontology
Issues in Ontology
Authoring
SCOPE
COMPLEXITY
COST
AUTHORING
EVALUATION
http://ontogenesis.ontonet.org/ppt/Issues_mindmapSB.pdf
The NCL Study
• A small group met to normalise the OBO Cell
Ontology (CL)
• Transform an axiomatically lean hand-crafted
“tangled” ontology to:
• An axiomatically rich ontology where the structure is
computationally maintained
• Study the process and deliver the artefact
• http://www.gong.manchester.ac.uk/CTON.html
• Two two-day meetings; videoed and observed by an
ethnographer
• Part of the OntoGenesis network
Contractile cell CL
What is Ontology Normalisation?
• Hand-crafted ontologies with multiple
inheritance are “tangled”
• Usually axiomatically lean
• We classify along one axis and use
“restrictions” to other modules to capture
other axes
• Then re-build the multiple inheritance using
the axiomatically rich ontology
Tangled Ontology of Cars
Tangled Untangled Inferred
Contractile cell nCL
The People
• Ten people “friends and family”
• All some sort of biologists
• All familiar with OWL and normalisation
• All “singing from the same hymn sheet”
The Overall Process
• Analyse issues in current OBO CL
• Determine primary axis of classification
• Identify supporting ontologies
• Identify properties and design patterns; determine
representation
• Gather knowledge
• Generate OWL encoding
• Evaluate, iterate
• Two face-to-face meetings; separate work; email and
Skype
Questions Raised
• When do we work as a larger group; smaller
groups and singly?
• What resources do we use?
• Who knows what?
• What strategies do we use?
• What expertise do we need?
• What are the vested interests?
Producing the “schema”
• What is it we want to say about cells?
• How do we want to say it?
• Most time was spent on these questions (one day)
• Best done face to face as the whole group
• Perhaps a fait accompli in the large
• Lots of modifications through debate
• Strong chair and process (“benevolent
dictatorship”)
“what about sea
urchins?”
Ethnographer’s
Observations!
I don’t know
about plants
NCL Schema Captured in a
Spreadsheet
Term Name | CTO id | ploidy | morphology | Cellular component | size | germ line | nucleation | process
slow muscle cell | CL:0000189 | | PATO:0001873 | GO:0030017; GO:0005739 | Large | n/a | PATO:0001908 | GO:0031444
blue sensitive photoreceptor cell | CL:0000495 | PATO:0001394 | PATO:0001154; PATO:0001873 | | Large | Somatic | PATO:0001407 | GO:0050908; GO:0007603
green sensitive photoreceptor cell | CL:0000496 | PATO:0001394 | PATO:0001154; PATO:0001873 | | Large | Somatic | PATO:0001407 | GO:0050908; GO:0007603
R1 photoreceptor cell | CL:0000687 | PATO:0001394 | ?? | | Variable | Somatic | PATO:0001407 | GO:0050908; GO:0007603
CL normalisation Workflow
Ontology API
CL Spreadsheet
The Ontology
Preprocessor Language
• Adding “select”, “add” and “remove” keywords to
MOS
• A “scripting” language for OWL
• We generate a list of instructions to build an
ontology
• We can embed patterns in to this generation
• Saves “mouse clicks”
• Rapid production of large amounts of ontology
• Easy to apply changes; acts as a macro language
OPPL sample
ADD Class: CL_0000811;
REMOVE subClassOf owl:Thing;
ADD label ``CD8-positive, alpha-beta immature T cell'';
ADD subClassOf cto:Cell;
ADD subClassOf cto:has_ploidy some pato:PATO_0001394;
ADD comment ``MORPHOLOGY: pleiomorphic'';
ADD comment ``CELLULAR COMPONENT: '';
ADD subClassOf cto:has_size some cto:Small;
ADD comment ``GERM LINE: n/a'';
ADD subClassOf cto:has_nucleation some pato:PATO_0001407;
ADD subClassOf cto:participates_in some go:GO_2456;
ADD subClassOf cto:participates_in some go:GO_0021700;
ADD subClassOf cto:participates_in some go:GO_0032940;
ADD comment ``PROCESS: '';
ADD comment ``LINEAGE: mesoderm'';
ADD subClassOf cto:appears_in some cto:Animalia;
ADD comment ``ORGANISM COMMENT: '';
ADD subClassOf cto:potentiality some cto:TerminallyDifferentiated;
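The step from a spreadsheet row to a list of instructions like the sample above could be sketched as follows. This is a minimal illustration in Python, not the generator used in the study: the column-to-property mapping, the function names and the row layout are assumptions, and the output only imitates the instruction style shown on the slide rather than guaranteeing valid OPPL.

```python
# Hypothetical mapping from spreadsheet column to object property.
# The property names follow the sample on the slide; the mapping itself is assumed.
COLUMN_TO_PROPERTY = {
    "ploidy": "cto:has_ploidy",
    "nucleation": "cto:has_nucleation",
    "process": "cto:participates_in",
}


def curie_to_name(curie):
    """Turn 'PATO:0001394' into 'pato:PATO_0001394', the form used in the sample."""
    prefix, local = curie.strip().split(":")
    return f"{prefix.lower()}:{prefix}_{local}"


def row_to_instructions(row):
    """Build a list of ADD/REMOVE instructions for one spreadsheet row."""
    class_name = row["id"].replace(":", "_")
    lines = [
        f"ADD Class: {class_name};",
        "REMOVE subClassOf owl:Thing;",
        f"ADD label ``{row['name']}'';",
        "ADD subClassOf cto:Cell;",
    ]
    for column, prop in COLUMN_TO_PROPERTY.items():
        # A cell may hold several identifiers separated by ';'.
        for value in row.get(column, "").split(";"):
            if value.strip():
                lines.append(f"ADD subClassOf {prop} some {curie_to_name(value)};")
    return lines


# Example row; the split of values into columns is an assumption.
row = {
    "name": "blue sensitive photoreceptor cell",
    "id": "CL:0000495",
    "ploidy": "PATO:0001394",
    "nucleation": "PATO:0001407",
    "process": "GO:0050908; GO:0007603",
}
instructions = row_to_instructions(row)
print("\n".join(instructions))
```

Because the pattern lives in one function, changing how a column is represented means editing the mapping and regenerating, rather than revisiting every class by hand.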
What we Generate
Class: 'CD8-positive alpha-beta immature T cell'
SubClassOf: Cell,
has_morphology some pleomorphic,
has_nucleation some mononucleate,
has_ploidy some diploid,
has_potentiality some TerminallyDifferentiated,
derives_from some 'double-positive alpha-beta immature T cell',
located_in some 'Animalia',
not (participates_in some gametogenesis),
participates_in some 'T cell mediated immunity',
participates_in some 'developmental maturation',
participates_in some 'secretion by cell'
A Defined Class
Class: “diploid cell”
EquivalentTo: cell
That has_ploidy some diploid
• Picks up all cells that has_ploidy some diploid
• Trivial, but difficult to do by hand and be complete
Class: “germline cell”
EquivalentTo: cell
That (participates_in some gametogenesis) or
(directly_derived_from some gamete)
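How defined classes "pick up" their members can be illustrated with a toy matcher. This is only a sketch under strong assumptions: it treats each cell as a flat set of (property, filler) pairs and uses plain set inclusion, whereas a real OWL reasoner also handles class and property hierarchies, negation and much more. The cell names and descriptions are invented for illustration and are not taken from CL.

```python
# Toy descriptions: each cell is a set of existential restrictions (property, filler).
# These cells are hypothetical and carry no biological claim.
CELLS = {
    "cell_a": {("has_ploidy", "diploid"), ("participates_in", "secretion")},
    "cell_b": {("has_ploidy", "diploid"), ("directly_derived_from", "gamete")},
    "cell_c": {("participates_in", "gametogenesis")},
}

# Each defined class is a list of alternatives (an "or");
# each alternative is a set of restrictions that must all hold (an "and").
DEFINED_CLASSES = {
    "diploid cell": [{("has_ploidy", "diploid")}],
    "germline cell": [
        {("participates_in", "gametogenesis")},
        {("directly_derived_from", "gamete")},
    ],
}


def inferred_parents(description):
    """Return the defined classes whose definition the description satisfies."""
    parents = []
    for name, alternatives in DEFINED_CLASSES.items():
        if any(required <= description for required in alternatives):
            parents.append(name)
    return sorted(parents)


def classify(cells):
    """Compute the inferred defined-class parents for every cell."""
    return {cell: inferred_parents(desc) for cell, desc in cells.items()}


for cell, parents in classify(CELLS).items():
    print(cell, "->", parents)
```

Here cell_b ends up under both defined classes without anyone asserting either parent, which is the multiple inheritance the normalised ontology leaves to the machine.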
The Representation
• Aligning with RO and most OBO conventions
• Red_blood_cell participates_in some
Oxygen_transport
• Red_blood_cell has_disposition some
(realisable_entity that is_realised_in some
oxygen_transport)
• First is simple and useful, but not actually true
• Second is more ontologically formal and “right”
• Can easily expand the “schema” to either representation
• Do experiments with patterns
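Expanding the same schema entry into either representation could look like the sketch below. The function names and the pattern registry are assumptions for illustration; the two output forms follow the Red_blood_cell examples on this slide.

```python
def simple_pattern(cell, process):
    """The simple form: the cell participates in the process."""
    return f"{cell} participates_in some {process}"


def disposition_pattern(cell, process):
    """The more formal form: a disposition that is realised in the process."""
    return (
        f"{cell} has_disposition some "
        f"(realisable_entity that is_realised_in some {process})"
    )


# Hypothetical registry so that one switch regenerates every axiom.
PATTERNS = {"simple": simple_pattern, "disposition": disposition_pattern}


def expand(entries, pattern):
    """Render every (cell, process) entry with the chosen pattern."""
    render = PATTERNS[pattern]
    return [render(cell, process) for cell, process in entries]


entries = [("Red_blood_cell", "Oxygen_transport")]
for name in PATTERNS:
    print(name, "->", expand(entries, name))
```

Switching the pattern name and regenerating is the kind of cheap experiment with patterns that the last bullet refers to.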
Entity Quality or
Entity Property Quality Pattern?
• At least two ways of representing qualities
• Need only one instance of a quality type inhering in
each entity
• has_quality exactly 1 diploid
• coupled with has_quality max 1 ploidy
• Otherwise:
• has_ploidy some diploid
• has_ploidy is functional and in property hierarchy
under has_quality
• Again, applying patterns is easy; do experiments;
gain consistency
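Either pattern relies on each entity having only one quality of a given type. A check of the spreadsheet before generation could be sketched as below; the function name, the row layout and the example rows are assumptions. It complements the reasoner rather than replacing it, since a reasoner would only flag two ploidy values if the quality classes are declared disjoint.

```python
def multiple_value_violations(rows, single_valued_columns):
    """Report rows that give more than one value in a column meant to hold one."""
    violations = []
    for row in rows:
        for column in single_valued_columns:
            values = [v.strip() for v in row.get(column, "").split(";") if v.strip()]
            if len(values) > 1:
                violations.append((row["name"], column, values))
    return violations


# Hypothetical rows; cell_b deliberately breaks the rule.
rows = [
    {"name": "cell_a", "ploidy": "diploid"},
    {"name": "cell_b", "ploidy": "diploid; haploid"},
    {"name": "cell_c", "ploidy": ""},
]
problems = multiple_value_violations(rows, ["ploidy"])
print(problems)
```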
Time Spent
• First two day meeting
• One day “planning the schema”
• Half a day describing 30 cells and producing
an ontology
• An hour or so evaluating and re-generating
• Quick iterations and always having an
ontology to look at
The Second Meeting
• Six months gathering material
• An hour or so of review all together
• Pairs adding more material
• A review
• More pair work
• More review
• Then dispersed activity (all “spare time”)
• Short iteration periods (in terms of work spent)
Resources used
• Brain power
• The Web – Wikipedia is our friend
• Other ontologies
• Text books (minor use)
• Research papers
• The developing ontology and the reasoner
• Phone a friend (who is an authority in the
field?)
Identifying Issues in OBO CL
• CL generated in a few days and not really
touched (not true now)
• Lots of well recognised issues: Wrong biology;
missing biology; ontological defects; …
• Still observed to be very useful
• Issues gave us some “tests”
Identifying Supporting
Ontologies
CL
Ontology
PATO
Qualities
GO
Biological Process
GO
Cellular Component
NCBI
Taxonomy
FMA Anatomy
Nucleation
Morphology
Size
Ploidy
Muscle Contraction
Secretion
Bacillus anthracis str. Ames
Chloroplast
Cell Membrane
Epithelium
Kidney
“It lets me do the biology”
• Is what one of our biologists said
• I can see what we’ve said about a cell
• I can see where it is in the structure
• I relate the two
• The work is “turned around”: thinking about the biology and
its consequences
• P1: flight muscle cell, that’s interesting ... no, a cardiac muscle
cell is not a skeletal muscle cell!!
• P2: a flight muscle cell is never a cardiac muscle cell.
• “Why has it put it there?”
• Here “it” is the reasoner
Strategies
• Pinning down the scope: Only cells in vivo
• Dealing with a representative set of cells:
developing a test plan
• Collective wisdom: testing against current
knowledge – “pericytes”
• Concentrating on biology and less on ontology
engineering
• Using the owners and authorities
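A test plan over a representative set of cells can be written down as expected statements and checked against the inferred hierarchy after each regeneration. The sketch below is an assumption about how that might be done, not the study's tooling: the inferred hierarchy is a hand-written stand-in for reasoner output, and the expectations are the two statements the participants made about muscle cells.

```python
# Hypothetical stand-in for the reasoner's output: class -> inferred superclasses.
INFERRED_PARENTS = {
    "flight muscle cell": {"muscle cell"},
    "cardiac muscle cell": {"muscle cell"},
}

# Each test: (class, candidate superclass, whether the subsumption should hold).
TEST_PLAN = [
    ("flight muscle cell", "cardiac muscle cell", False),
    ("cardiac muscle cell", "skeletal muscle cell", False),
]


def run_test_plan(inferred, plan):
    """Return the tests whose expected answer differs from the inferred hierarchy."""
    failures = []
    for child, parent, expected in plan:
        actual = parent in inferred.get(child, set())
        if actual != expected:
            failures.append((child, parent, expected, actual))
    return failures


print(run_test_plan(INFERRED_PARENTS, TEST_PLAN))

# A deliberately wrong hierarchy shows what a failing test looks like.
broken = {"flight muscle cell": {"muscle cell", "cardiac muscle cell"}}
print(run_test_plan(broken, TEST_PLAN))
```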
Being “Agile”
• Software engineering has moved on from
simplistic life cycles
• Agile methods are the fashion
• Embedding users
• Always have something working
• Test driven development
• Short iterations
• Deliver early
Observations on Collaboration
• The work is not mechanical
• It involves extensive synchronous face-to-face work on
deciding on scope and purpose
• It relies on a socially distributed expertise, and ‘knowing
who knows’
• It involves the synchronous or rapid use of a number of
different artefacts, and an understanding of how best to use
them.
• It involves constant ‘testing’ and the delaying of final
decisions through ambiguity resolution and error checking,
and the constant recording of rationales for decision-making
The New KUPO Process
Collaborative Spreadsheet
Individual Spreadsheet
Semantic Wiki
Issue Tracker
OPPL Script Formulation
Generate OWL
Reasoned Ontology
View Ontology
Summary
• Mass direct authoring of an ontology seems bad
• In NCL we only used Protégé to “look at it” – no
hand-building
• Mass knowledge gathering and commenting seems
good
• Keeping “Agile” seems good
• Doing too much by hand seems bad
• Developing the schema in a team seems good
• The team should have coherent, non-clashing
interests
Acknowledgements
• Mikel Aranguren and Simon Jupp for slides
• Mikel Aranguren, Simon Jupp, Helen
Parkinson, Phil Lord, David Shotton, James
Malone, Jonathan Bard, Midori Harris did the
work
• Dave Randall did the ethnography
• The EPSRC for funding OntoGenesis

Presentation Vikram Lander by Vedansh Gupta.pptx
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 

Collaborative Ontology building: So much more than authoring an Ontology

  • 1. Collaborative Ontology building: So much more than authoring an Ontology Robert Stevens BioHealth Informatics Group The University of Manchester Manchester United Kingdom Robert.Stevens@manchester.ac.uk
  • 2. Overview • An experiment in collaborative authoring • Issues raised • Observations made • The process and the artefact • Bits of technology
  • 3. Ontologists: What’s their Problem? David Randall Manchester Metropolitan University
  • 4. What do I Know about Collaborative Ontology Authoring? • “you’ve never built a real ontology” • Advisor in projects • Experiments in collaborative authoring • Doing it for real in a Kidney and Urinary Pathway Ontology • Informal observational studies with collaborative Protégé
  • 5. The Software Engineering Life-Cycle Ontology
  • 7. The NCL Study • A small group met to normalise the OBO Cell Ontology (CL) • Transform an axiomatically lean hand-crafted “tangled” ontology to: • An axiomatically rich ontology where the structure is computationally maintained • Study the process and deliver the artefact • http://www.gong.manchester.ac.uk/CTON.html • Two two-day meetings; videoed and observed by an ethnographer • Part of the OntoGenesis network
  • 9. What is Ontology Normalisation? • Hand-crafted ontologies with multiple inheritance are “tangled” • Usually axiomatically lean • We classify along one axis and use “restrictions” to other modules to capture other axes • Then re-build the multiple inheritance using the axiomatically rich ontology
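The normalisation idea on slide 9 can be sketched in a few lines of Python. This is a toy illustration, not the NCL tooling: it uses plain subset matching instead of an OWL reasoner, and the names `ASSERTED`, `DEFINED`, `inferred_parents`, the "mononucleate cell" defined class and the second example cell are assumptions made up for the sketch. Each cell has a single asserted parent plus restrictions; the extra parents are rebuilt from those restrictions.

```python
# Toy illustration only: subset matching on restrictions, not a real OWL reasoner.
from typing import Dict, FrozenSet, Set, Tuple

Restriction = Tuple[str, str]  # (property, filler), read as "property some filler"

# Untangled, asserted side: one asserted parent per cell plus restrictions.
# The T cell restrictions follow the generated class shown later in the deck.
ASSERTED: Dict[str, Tuple[str, FrozenSet[Restriction]]] = {
    "CD8-positive alpha-beta immature T cell": (
        "cell",
        frozenset({("has_ploidy", "diploid"), ("has_nucleation", "mononucleate")}),
    ),
    # Hypothetical entry, included only to show a partial match.
    "example cell without ploidy": (
        "cell",
        frozenset({("has_nucleation", "mononucleate")}),
    ),
}

# Defined classes: membership conditions along the other axes.
# "mononucleate cell" is a hypothetical defined class for this sketch.
DEFINED: Dict[str, FrozenSet[Restriction]] = {
    "diploid cell": frozenset({("has_ploidy", "diploid")}),
    "mononucleate cell": frozenset({("has_nucleation", "mononucleate")}),
}


def inferred_parents(cell_name: str) -> Set[str]:
    """Return the asserted parent plus every defined class whose conditions are met."""
    asserted_parent, restrictions = ASSERTED[cell_name]
    parents = {asserted_parent}
    for defined_class, required in DEFINED.items():
        if required <= restrictions:  # all required restrictions are present
            parents.add(defined_class)
    return parents
```

Calling `inferred_parents` on the T cell yields three parents even though only one was asserted, which is the "re-build the multiple inheritance" step in miniature. A real reasoner also handles subsumption between fillers, negation and disjunction, which this sketch ignores.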
  • 10. Tangled Ontology of Cars Tangled Untangled Inferred
  • 12. The People • Ten people “friends and family” • All some sort of biologists • All familiar with OWL and normalisation • All “singing from the same hymn sheet”
  • 13. The Overall Process • Analyse issues in current OBO CL • Determine primary axis of classification • Identify supporting ontologies • Identify properties and design patterns; determine representation • Gather knowledge • Generate OWL encoding • Evaluate, iterate • Two face-to-face meetings; separate work; email and Skype
  • 14. Questions Raised • When do we work as a larger group, in smaller groups, and singly? • What resources do we use? • Who knows what? • What strategies do we use? • What expertise do we need? • What are the vested interests?
  • 15. Producing the “schema” • What is it we want to say about cells? • How do we want to say it? • Most time was spent on these questions (one day) • Best face to face as the whole group • Perhaps a fait accompli in the large • Lots of modifications through debate • Strong chair and process (“benevolent dictatorship”) • Ethnographer’s observations: “What about sea urchins?”; “I don’t know about plants”
  • 16. NCL Schema Captured in a Spreadsheet
      Term Name | CTO id | ploidy | morphology | Cellular component | size | germ line | nucleation | process
      slow muscle cell | CL:0000189 | | PATO:0001873 | GO:0030017 ; GO:0005739 | Large | n/a | PATO:0001908 | GO:0031444
      blue sensitive photoreceptor cell | CL:0000495 | PATO:0001394 | PATO:0001154 ; PATO:0001873 | | Large | Somatic | PATO:0001407 | GO:0050908 ; GO:0007603
      green sensitive photoreceptor cell | CL:0000496 | PATO:0001394 | PATO:0001154 ; PATO:0001873 | | Large | Somatic | PATO:0001407 | GO:0050908 ; GO:0007603
      R1 photoreceptor cell | CL:0000687 | PATO:0001394 | ?? | | Variable | Somatic | PATO:0001407 | GO:0050908 ; GO:0007603
  • 19. The Ontology Preprocessor Language • Adding “select”, “add” and “remove” keywords to MOS (Manchester OWL Syntax) • A “scripting” language for OWL • We generate a list of instructions to build an ontology • We can embed patterns into this generation • Saves “mouse clicks” • Rapid production of large amounts of ontology • Easy to apply changes; acts as a macro language
  • 20. OPPL sample
      ADD Class: CL_0000811;
      REMOVE subClassOf owl:Thing;
      ADD label "CD8-positive, alpha-beta immature T cell";
      ADD subClassOf cto:Cell;
      ADD subClassOf cto:has_ploidy some pato:PATO_0001394;
      ADD comment "MORPHOLOGY: pleiomorphic";
      ADD comment "CELULAR COMPONENT: ";
      ADD subClassOf cto:has_size some cto:Small;
      ADD comment "GERM LINE: n/a";
      ADD subClassOf cto:has_nucleation some pato:PATO_0001407;
      ADD subClassOf cto:participates_in some go:GO_2456;
      ADD subClassOf cto:participates_in some go:GO_0021700;
      ADD subClassOf cto:participates_in some go:GO_0032940;
      ADD comment "PROCESS: ";
      ADD comment "LINEAGE: mesoderm";
      ADD subClassOf cto:appears_in some cto:Animalia;
      ADD comment "ORGANISM COMMENT: ";
      ADD subClassOf cto:potentiality some cto:TerminallyDifferentiated;
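The step from a spreadsheet row to a list of instructions can be sketched as below. This is a minimal sketch, not the script used in NCL: the function names `row_to_oppl` and `curie_to_name`, the `COLUMN_TO_PROPERTY` mapping and the dictionary keys are assumptions, and the emitted statements simply mirror the shape of the sample on slide 20 rather than claiming to be validated OPPL. Only three columns are mapped, to keep the example short.

```python
from typing import Dict, List

# Hypothetical mapping from spreadsheet column to the property used in the slide sample.
COLUMN_TO_PROPERTY: Dict[str, str] = {
    "ploidy": "cto:has_ploidy",
    "nucleation": "cto:has_nucleation",
    "process": "cto:participates_in",
}


def curie_to_name(curie: str) -> str:
    """Turn 'PATO:0001394' into 'pato:PATO_0001394', the form written in the sample."""
    prefix, local = curie.split(":", 1)
    return f"{prefix.lower()}:{prefix}_{local}"


def row_to_oppl(row: Dict[str, str]) -> List[str]:
    """Emit one ADD/REMOVE statement per fact in a spreadsheet row."""
    class_id = row["id"].replace(":", "_")
    statements = [
        f"ADD Class: {class_id};",
        "REMOVE subClassOf owl:Thing;",
        f'ADD label "{row["term"]}";',
        "ADD subClassOf cto:Cell;",
    ]
    for column, prop in COLUMN_TO_PROPERTY.items():
        # A cell may hold several identifiers separated by ';'.
        for value in (v.strip() for v in row.get(column, "").split(";")):
            if value:
                statements.append(f"ADD subClassOf {prop} some {curie_to_name(value)};")
    return statements


# Values taken from the spreadsheet slide (blue sensitive photoreceptor cell).
example_row = {
    "term": "blue sensitive photoreceptor cell",
    "id": "CL:0000495",
    "ploidy": "PATO:0001394",
    "nucleation": "PATO:0001407",
    "process": "GO:0050908 ; GO:0007603",
}

if __name__ == "__main__":
    print("\n".join(row_to_oppl(example_row)))
```

Because the pattern lives in one function, changing how a column is represented means editing the mapping and regenerating, which is the "macro language" benefit the slides describe.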
  • 21. What we Generate
      Class: 'CD8-positive alpha-beta immature T cell'
        SubClassOf:
          Cell,
          has_morphology some pleomorphic,
          has_nucleation some mononucleate,
          has_ploidy some diploid,
          has_potentiality some TerminallyDifferentiated,
          derives_from some 'double-positive alpha-beta immature T cell',
          located_in some 'Animalia',
          not (participates_in some gametogenesis),
          participates_in some 'T cell mediated immunity',
          participates_in some 'developmental maturation',
          participates_in some 'secretion by cell'
  • 22. A Defined Class • Class: “diploid cell” EquivalentTo: cell that has_ploidy some diploid • Picks up all cells that has_ploidy some diploid • Trivial, but difficult to do by hand and be complete • Class: “germline cell” EquivalentTo: cell that (participates_in some gametogenesis) or (directly_derived_from some gamete)
  • 23. The Representation • Aligning with RO and most OBO conventions • Red_blood_cell participates_in some Oxygen_transport • Red_blood_cell has_disposition some (realisable_entity that is_realised_in some oxygen_transport) • First is simple and useful, but not actually true • Second is more ontologically formal and “right” • Can easily expand the “schema” to either representation • Do experiments with patterns
  • 24. Entity Quality or Entity Property Quality Pattern? • At least two ways of representing qualities • Need only one instance of a quality type inhering in each entity • has_quality exactly 1 diploid • coupled with has_quality max 1 ploidy • Otherwise: • has_ploidy some diploid • has_ploidy is functional and in property hierarchy under has_quality • Again, applying patterns is easy; do experiments; gain consistency
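Why "applying patterns is easy" when the ontology is generated can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `quality_axioms` and the pattern labels are invented for illustration, and the output strings only reproduce the two forms written on slide 24. The declaration that `has_ploidy` is functional and sits under `has_quality` belongs on the property itself, so it is not emitted per class here.

```python
from typing import List


def quality_axioms(quality_type: str, value: str, pattern: str) -> List[str]:
    """Render one quality of an entity under either pattern named on the slide."""
    if pattern == "entity-quality":
        # Generic has_quality property with cardinality constraints.
        return [
            f"has_quality exactly 1 {value}",
            f"has_quality max 1 {quality_type}",
        ]
    if pattern == "entity-property-quality":
        # Dedicated property per quality type, e.g. has_ploidy.
        return [f"has_{quality_type} some {value}"]
    raise ValueError(f"unknown pattern: {pattern}")


if __name__ == "__main__":
    for chosen in ("entity-quality", "entity-property-quality"):
        print(chosen, "->", quality_axioms("ploidy", "diploid", chosen))
```

Switching the whole ontology from one pattern to the other then becomes a one-argument change followed by regeneration, which makes it cheap to run the experiments the slide suggests.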
  • 25. Time Spent • First two-day meeting • One day “planning the schema” • Half a day describing 30 cells and producing an ontology • An hour or so evaluating and re-generating • Quick iterations and always having an ontology to look at
  • 26. The Second Meeting • Six months gathering material • An hour or so of review all together • Pairs adding more material • A review • More pair work • More review • Then dispersed activity (all “spare time”) • Short iteration periods (in terms of work spent)
  • 27. Resources used • Brain power • The Web – Wikipedia is our friend • Other ontologies • Textbooks (minor use) • Research papers • The developing ontology and the reasoner • Phone a friend (who is an authority in the field?)
  • 28. Identifying Issues in OBO CL • CL generated in a few days and not really touched (not true now) • Lots of well-recognised issues: wrong biology; missing biology; ontological defects; … • Still observed to be very useful • Issues gave us some “tests”
  • 29. Identifying Supporting Ontologies [Diagram: the CL Ontology and its supporting ontologies, with example terms] • PATO Qualities: Nucleation, Morphology, Size, Ploidy • GO Biological Process: Muscle Contraction, Secretion • GO Cellular Component: Chloroplast, Cell Membrane • NCBI Taxonomy: Bacillus anthracis str. Ames • FMA Anatomy: Epithelium, Kidney
  • 30. “It lets me do the biology” • Is what one of our biologists said • I can see what we’ve said about a cell • I can see where it is in the structure • I relate the two • The work is “turned around”: thinking about the biology and its consequences • P1: flight muscle cell, that’s interesting ... no, a cardiac muscle cell is not a skeletal muscle cell!! • P2: a flight muscle cell is never a cardiac muscle cell. • “Why has it put it there?” • Here “it” is the reasoner
  • 31. Strategies • Pinning down the scope: Only cells in vivo • Dealing with a representative set of cells: developing a test plan • Collective wisdom: testing against current knowledge – “pericytes” • Concentrating on biology and less on ontology engineering • Using the owners and authorities
  • 32. Being “Agile” • Software engineering has moved on from simplistic life cycles • Agile methods are the fashion • Embedding users • Always have something working • Test driven development • Short iterations • Deliver early
  • 33. Observations on Collaboration • The work is not mechanical • It involves extensive synchronous face-to-face work on deciding on scope and purpose • It relies on a socially distributed expertise, and ‘knowing who knows’ • It involves the synchronous or rapid use of a number of different artefacts, and an understanding of how best to use them. • It involves constant ‘testing’ and the delaying of final decisions through ambiguity resolution and error checking, and the constant recording of rationales for decision-making
  • 34. The New KUPO Process [Workflow diagram; boxes:] Collaborative Spreadsheet • Individual Spreadsheet • Semantic Wiki • Issue Tracker • OPPL Script Formulation • Generate OWL • Reasoned Ontology • View Ontology
  • 35. Summary • Mass direct authoring of an ontology seems bad • In NCL we only used Protégé to “look at it” – no hand-building • Mass knowledge gathering and commenting seems good • Keeping “Agile” seems good • Doing too much by hand seems bad • Developing the schema in a team seems good • The team should have coherent, non-clashing interests
  • 36. Acknowledgements • Mikel Aranguren and Simon Jupp for slides • Mikel Aranguren, Simon Jupp, Helen Parkinson, Phil Lord, David Shotton, James Malone, Jonathan Bard, Midori Harris did the work • Dave Randall did the ethnography • The EPSRC for funding OntoGenesis