This presentation covers collaborative ontology development. Ontology development has shifted from lone knowledge engineers to distributed groups working together. Examples of large, collaboratively developed ontologies include the Gene Ontology, the NCI Thesaurus, and the International Classification of Diseases. Their development processes rely on issue tracking systems, multiple editors, and consensus building. The presentation also introduces WebProtégé, a tool that aims to provide a collaborative online environment for ontology development, similar to Google Docs.
3. Lots of databases and sources
The data is in different silos
Need to integrate them
Considerable benefit if you can integrate the data
Ontologies are essential to science
Monday, July 15, 13
4. Many ontologies today are large
and there are lots of them
• Gene Ontology: 28K classes
• Foundational Model of Anatomy: >80K classes
• NCI Thesaurus: 80K classes
• SNOMED CT: >300K classes
5. There are lots of ontologies and more to come
BioPortal has more than 350 ontologies in the field of biomedicine alone
Users uploaded more than 230 ontologies to WebProtégé in the first two months after its release
6. Scientists have adopted ontologies
• To provide canonical representation of scientific knowledge
• To annotate experimental data to enable interpretation, comparison, and discovery across databases
• To facilitate knowledge-based applications for decision support, natural language processing, data integration, and other applications
7. Ontology development has changed, too
• from a lone knowledge engineer
• to a few distributed users
• or to any number of users anywhere in the world
9. Collaborative Ontology Development
• Collaborative
• Several users contribute to a single developing ontology
• There are mechanisms to carry out discussions and to reach consensus
• Ontologies
• From simple taxonomies
• To expressive OWL ontologies
11. Gene Ontology (GO)
• Developed by the Gene Ontology Consortium
• Goal: create a single terminological resource for annotating genes and gene function from different model organisms:
• Drosophila, mouse, E. coli, Homo sapiens, ...
• GO: 38,000 classes
13. Key Resource: GO Annotations
Manually curated over the past 10 years
Publicly available
345,000 annotations for Homo sapiens
[Figure: a manual GO annotation links the gene product TP53 to the GO term GO:0007569 (cell aging), supported by a PubMed article]
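The annotation on this slide can be read as a small record: a gene product, a GO term, and the supporting article. The following is a minimal sketch of that structure, not the GO Consortium's actual annotation format; the class name `GoAnnotation`, its field names, and the helper `gene_products_for_term` are hypothetical, and the PubMed reference is left as a placeholder because the slide does not give an ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoAnnotation:
    """One manually curated annotation (hypothetical layout, for illustration)."""
    gene_product: str   # e.g. "TP53"
    go_id: str          # e.g. "GO:0007569"
    go_label: str       # e.g. "cell aging"
    reference: str      # supporting PubMed article; ID not given on the slide
    manual: bool = True


def gene_products_for_term(annotations, go_id):
    """Return the sorted gene products annotated with the given GO term."""
    return sorted({a.gene_product for a in annotations if a.go_id == go_id})


# The single example shown on the slide.
tp53_aging = GoAnnotation(
    gene_product="TP53",
    go_id="GO:0007569",
    go_label="cell aging",
    reference="PubMed article (ID not given)",
)
```

Grouping annotations by GO term in this way is what makes comparison across databases possible: any dataset annotated with the same term ID can be joined on it.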
15. The Gene Ontology
Terminology for consistent description of gene products
Issue tracker: anyone in the community can submit an issue or request
GO curators: 3 full-time curators have access to edit GO
[Figure: curators of biomedical databases, the issue tracker, and the GO curators]
17. The NCI Thesaurus
A reference ontology for cancer biology, translational science, and clinical oncology
~20 full-time editors making changes
Changes are not immediately visible
A “lead editor” approves the changes and assigns new tasks
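The gatekeeping workflow described on this slide (editors propose, nothing is visible until the lead editor approves) can be sketched as a review queue. This is an illustrative model under stated assumptions, not the NCI Thesaurus tooling; `ProposedChange`, `ReviewedOntology`, and all method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare by identity so each proposal is distinct
class ProposedChange:
    editor: str
    description: str
    status: str = "pending"  # "pending", "approved", or "rejected"


@dataclass
class ReviewedOntology:
    lead_editor: str
    published: list = field(default_factory=list)  # visible changes
    queue: list = field(default_factory=list)      # awaiting review

    def propose(self, editor, description):
        """An editor submits a change; it is queued, not published."""
        change = ProposedChange(editor, description)
        self.queue.append(change)
        return change

    def review(self, reviewer, change, approve):
        """Only the lead editor may approve or reject a queued change."""
        if reviewer != self.lead_editor:
            raise PermissionError("only the lead editor can review changes")
        self.queue.remove(change)
        change.status = "approved" if approve else "rejected"
        if approve:
            self.published.append(change.description)
```

A proposed change stays out of `published` until `review` is called by the lead editor, which mirrors "changes are not immediately visible".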
20. ICD – Why should you care?
Certificate of death
Policy making
Medical bills
21. Developing ICD-10:
Revision process in the 20th century
8 Annual Revision Conferences (1982-89)
17-58 countries participated
1-5 person delegations
Mainly Health Statisticians
Manual curation
List exchange
Index was done later
“Decibel” method of discussion
Output: Paper Copy
Work in English only
Limited testing in the field
22. ICD-11: the 21st century
• ICD-11 is being developed as an OWL ontology
• Being developed collaboratively, in an open editing process
• Links to other ontologies, such as SNOMED CT
• 33,000 classes
23. Over 250 domain experts from around the world
Organized in groups, which edit different parts of the ontology
T. Tudorache, S. Falconer, C. Nyulas, N. F. Noy and M. A. Musen
Will Semantic Web Technologies Work for the Development of ICD-11?
International Semantic Web Conference (ISWC 2010), In-Use Track, Shanghai, China
24. ICD-11 development process
• Each night a snapshot of the commonly edited ontology is published on a public platform to encourage feedback from the larger community: http://apps.who.int/classifications/icd11/browse/f/en
• Editorial workflow
• Centrally overseen by WHO
• Peer-reviewed process for the content and structure
• Experts may add change proposals
• WebProtégé used as the collaborative ontology development platform
32. Ontology Development as a Collaborative Process
• Ontology development is an inherently collaborative process
• It is also inherently modular, so “stepping on someone else’s toes” is not a big issue
• Users expect Web 2.0-style interaction:
• feeds, emails
• watched entities
• Web interface
• social-networking features
33. Dimensions of Collaborative Workflows
• Ontology size
• from 100s to 10,000s of concepts
• Size of the community
• Contributors (in some form): from 2-3 to dozens
• Editors: from 1-2 to 20
• Control mechanisms
• Variety of roles
• Gatekeepers, etc.
• Client-server editing
• Discussion tools
• mailing lists, message boards
• face-to-face meetings, telecons
• Synchronization and editing mechanisms
• CVS, SVN
36. Collaboration Features
• Simultaneous editing
• Change tracking
• Threaded discussions for ontology entities and changes (notes, discussions, proposals, reviews)
• Watching ontology entities and branches, with notifications
• Upload and sharing of ontologies
• Download any revision of the ontology
• Access policies
• User interface customization for domain experts
• Change analysis and statistics
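Two of the features above, change tracking and downloading any revision, fit together naturally if every edit is kept in an append-only log and a revision is rebuilt by replaying the log up to that point. The sketch below illustrates that idea only; it is not how WebProtégé is implemented, and `RevisionedOntology` with its "add" and "remove" operations is a hypothetical simplification that tracks class names alone.

```python
class RevisionedOntology:
    """Append-only change log; revision N is the state after N changes."""

    def __init__(self):
        self._log = []  # entries are (author, operation, class_name)

    def apply(self, author, operation, class_name):
        """Record a change and return the new revision number."""
        if operation not in ("add", "remove"):
            raise ValueError(f"unknown operation: {operation}")
        self._log.append((author, operation, class_name))
        return len(self._log)

    def classes_at(self, revision):
        """Rebuild the set of classes as of the given revision by replay."""
        classes = set()
        for _author, operation, name in self._log[:revision]:
            if operation == "add":
                classes.add(name)
            else:
                classes.discard(name)
        return classes

    def changes_by(self, author):
        """Return the (operation, class_name) pairs made by one author."""
        return [(op, name) for who, op, name in self._log if who == author]
```

Because the log is never rewritten, the same structure also supports per-author statistics and access to any earlier state.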
43. Research Challenges
• Human-Computer Interaction:
• How do we enable domain experts to contribute effectively?
• What are the minimal sets of constructs necessary?
• Change analysis:
• Are there patterns in how users edit ontologies?
• Can we use these patterns to guide user interfaces?
• Community dynamics:
• What are the dynamics in groups that develop ontologies collaboratively?
• Are there explicit or implicit roles?
• Do roles change over time?
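One simple way to start on the change-analysis and implicit-role questions is to count, from a change log, which kinds of edits each user makes and which edit types tend to follow one another. The sketch below assumes a log of `(user, change_type)` pairs; the function names and the change-type labels in the usage example are hypothetical and not taken from any particular tool.

```python
from collections import Counter


def edit_profile(change_log):
    """Count change types per user; a skewed profile may hint at a role."""
    per_user = {}
    for user, change_type in change_log:
        per_user.setdefault(user, Counter())[change_type] += 1
    return per_user


def transition_counts(change_log):
    """Count how often one change type is immediately followed by another."""
    types = [change_type for _user, change_type in change_log]
    return Counter(zip(types, types[1:]))


# Hypothetical log, in the order the changes were made.
sample_log = [
    ("ann", "add_class"),
    ("ann", "edit_label"),
    ("bob", "move_class"),
    ("ann", "add_class"),
    ("ann", "edit_label"),
]
```

Frequent transitions (here, a label edit right after adding a class) are the kind of pattern a user interface could anticipate, for example by prompting for a label when a class is created.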