Collaborative Construction of Large Biological Ontologies

Presentation Transcript

  • Collaborative Construction of Large Biological Ontologies
    • Jie Bao (a). This work is in collaboration with Zhiliang Hu (b), LaRon Hughes (b), Doina Caragea (c), Peter Wong (a), James Reecy (b), Vasant G. Honavar (a)
    • (a) Artificial Intelligence Research Laboratory, Department of Computer Science; Center for Computational Intelligence, Learning, and Discovery
    • (b) Department of Animal Science, Iowa State University, Ames, IA 50011, USA
    • (c) Department of Computing and Information Sciences, Kansas State University, Manhattan, KS 66506
    • Email: {baojie, zhu, laron, pwwong, jreecy, honavar}@iastate.edu, dcaragea@ksu.edu
  • Outline
    • Collaborative Ontology Building (COB) Desiderata
    • Limitations of CVS-based Collaboration
    • COB Based on Modular Ontologies
    • The COB Editor
  • Large Biological Ontologies
    • Gramineae Taxonomy
    • Plant Ontology
    • Gene Ontology
    • MGED Ontology (microarray)
  • Example: Gene Ontology
  • Non-collaborative Ontology Building
    • Download Ontology → Local Editing → Upload Ontology
    • Diagram labels: single curator; Protégé; OBO-Edit
  • Collaboration in Need
    • Example: Gene Ontology Consortium
  • Collaboration in Need (2)
    • Example 2: an animal trait ontology that involves multiple research groups across the world
    • Each group works on an ontology module for a particular species (according to the group's best expertise)
    • Species modules shown: Swine, Cattle, Chicken, Horse
  • Challenges
    • Knowledge Integration
    • Concurrency Management
    • Consistency Maintenance
    • Privilege Management
    • History Maintenance
    • Scalability
  • Solutions
    • Pipeline
      • Divide the ontology building process into sequential phases;
      • Each phase is assigned to a particular contributor.
    • CVS
      • CVS = Concurrent Versions System
      • Treat an ontology as a single file/document;
      • Use collaborative tools like CVS to build the ontology.
    • Modular Ontology
      • Build the ontology with fine-grained modules;
      • Different contributors can concurrently edit different modules.
    • Assessment: Pipeline allows only very limited collaboration; CVS allows collaboration, but at high cost; Modular Ontology is our approach.
  • Outline
    • Collaborative Ontology Building (COB) Desiderata
    • Limitations of CVS-based Collaboration
    • COB Based on Modular Ontologies
    • The COB Editor
  • CVS-based Ontology Building
    • User: Get SourceForge Account → Submit Change Request → Track the Request
      • The user submits a change suggestion (in natural language)
    • Curator: Get GO CVS Account → Get SourceForge Account → Set Up CVS Access → Take a Change Request → Download Whole GO Flat File → Local Editing → Make Local Log File → Save GO Flat File → Manual Version Control → Commit the Whole New Ontology to CVS
  • Unprincipled Authorization and Organization
    • No principled mechanism for assigning curator privileges.
    • No clear organizational division of the whole ontology into smaller manageable units.
  • Risk of Inconsistency
    • No principled way to avoid unintended couplings and over-writing.
    • The validity and consistency of the ontology are heavily dependent on
      • the curator discipline and
      • good community communications (e.g., via email lists).
  • Lack of Partial Editing/Reuse
    • A curator has to
      • download the entire ontology, before editing,
      • and submit the entire modified ontology, after editing;
    • A user cannot download and reuse only a selected subset of the ontology
    • High communication and memory overhead!
  • Expensive History Maintenance
    • Even a minor edit of the ontology causes the ontology file to be replicated in its entirety
    • Tracing the change history of a single term requires comparing entire ontology files (e.g., with diff)
  • Limited Participation
    • Since every edit has global effect, it is difficult to
      • scope privileges for different types of users (e.g., core curators versus normal curators)
      • accept/deny/modify/revert local changes made by other curators
    • The curator community has to be limited to a small number of trusted curators.
  • Outline
    • Collaborative Ontology Building (COB) Desiderata
    • Limitations of CVS-based Collaboration
    • COB based on Modular Ontologies
    • The COB Editor
  • Basic Strategy
    • Localize the interactions among different parts of a large ontology.
    • Build an ontology with fine-grained organizational structure.
    • Allow group collaboration on different ontology modules.
  • Package-based Ontologies
    • The whole ontology consists of a set of packages
    • Each package represents a fragment of the whole ontology
    • Each term has a "home package"
    • [Figure: Animal Trait ontology divided into packages. Labels as extracted: Egg, Chicken Reproduction, General; General, Cattle, Pig, Chicken]
  • Package Nesting
    • A nested package is a part of another package
    • Could be used to represent the organizational structure of an ontology
      • Arrange knowledge
      • Enforce hierarchical management of knowledge
    • [Figure: Animal trait ontology with nested packages. Labels: General, Pig, Pig Health]
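The two slides above describe packages, home packages, and nesting. A minimal Python sketch of that data model follows. It is not the COB Editor's implementation: the class names, the `path` helper, the nesting order General > Pig > Pig Health, and the term name are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Package:
    """A fragment of the ontology; may be nested inside a parent package."""
    name: str
    parent: Optional["Package"] = None
    terms: Set[str] = field(default_factory=set)

    def path(self) -> str:
        """Return the nesting path from the outermost package down to this one."""
        if self.parent is None:
            return self.name
        return f"{self.parent.path()}/{self.name}"

    def is_nested_in(self, other: "Package") -> bool:
        """True if this package is `other` or lies inside it at any depth."""
        node: Optional[Package] = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False


class Ontology:
    """The whole ontology: a set of packages plus a term -> home package map."""

    def __init__(self) -> None:
        self.home: Dict[str, Package] = {}

    def add_term(self, term: str, package: Package) -> None:
        """Register a term; each term gets exactly one home package."""
        if term in self.home:
            raise ValueError(f"{term!r} already has home package {self.home[term].name!r}")
        self.home[term] = package
        package.terms.add(term)


# Hypothetical nesting, loosely following the labels in the figure.
general = Package("General")
pig = Package("Pig", parent=general)
pig_health = Package("Pig Health", parent=pig)

ontology = Ontology()
ontology.add_term("ExampleHealthTrait", pig_health)  # made-up term name
```

Keeping the parent link on the child (rather than a child list on the parent) makes "which packages enclose this one" a simple upward walk, which is the question that hierarchical management tends to ask.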
  • Division of Labor
    • A package can be assigned to curators with the best knowledge of the relevant sub-domain.
      • e.g. Pig Health, Pig Reproduction
    • The package hierarchy helps to manage interactions among experts with different degrees of expertise.
      • e.g. Pig, Pig Health
  • Partial Reuse
    • Animal Trait Ontology (Centralized): General, Cattle, Pig, Chicken; reused by the Pork ontology
    • Animal Trait Ontology (Package-based): General, Pig, Cattle, Chicken; reused by the Pork ontology through semantic importing
    • Legend: knowledge incorporated in the Pork ontology; knowledge not present in the Pork ontology
  • Scalability
    • Reduction in communication overhead and computational time cost
      • Parsing
      • Transferring
      • Consistency check
    • Reduction in memory requirements
      • Ontology can be partially loaded into memory
    • Reduction in history tracking cost
      • Effect of changes is localized
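To illustrate why partial loading reduces memory and transfer cost, here is a small Python sketch that computes which packages must be loaded for one package of interest. The import graph below is hypothetical (the slides do not state which packages import which), and `packages_to_load` is an illustrative name, not part of the COB Editor.

```python
from typing import Dict, List, Set

# Hypothetical import graph: package -> packages it imports.
IMPORTS: Dict[str, List[str]] = {
    "Pork": ["Pig", "General"],
    "Pig": ["General"],
    "Cattle": ["General"],
    "Chicken": ["General"],
    "General": [],
}


def packages_to_load(root: str, imports: Dict[str, List[str]]) -> Set[str]:
    """Return `root` plus every package it imports, directly or transitively."""
    needed: Set[str] = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name in needed:
            continue  # already scheduled; also guards against import cycles
        needed.add(name)
        stack.extend(imports.get(name, []))
    return needed
```

Under this assumed graph, a tool working on the Pork ontology would parse and transfer only Pork, Pig, and General, and would never touch Cattle or Chicken.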
  • Broadened Participation
    • Open-community collaboration success witnessed by DMOZ and Wikipedia
    • Package-based ontology management can
      • Control the scope of an editing action
      • Minimize the risk of vandalism
    • Better tradeoff between broader participation and ontology quality
      • There are different levels of curators, e.g. ontology admins, pig experts, pig health experts.
      • An editing action can be approved or denied by a curator with higher privileges
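The approval rule on this slide can be sketched as a walk up the package hierarchy. In the Python sketch below, the hierarchy, the curator names, and the rule "a curator may approve edits in the package they are assigned to and in anything nested inside it" are assumptions chosen for illustration; the slides do not specify the exact policy.

```python
from typing import Dict, Iterator, Optional

# Hypothetical package hierarchy: package -> enclosing package.
PARENT: Dict[str, Optional[str]] = {
    "Animal Trait": None,
    "Pig": "Animal Trait",
    "Cattle": "Animal Trait",
    "Pig Health": "Pig",
}

# Hypothetical assignment of curators to the package they manage.
ASSIGNED: Dict[str, str] = {
    "ontology_admin": "Animal Trait",
    "pig_expert": "Pig",
    "pig_health_expert": "Pig Health",
}


def enclosing_packages(package: str) -> Iterator[str]:
    """Yield the package itself, then each enclosing package up to the root."""
    node: Optional[str] = package
    while node is not None:
        yield node
        node = PARENT[node]


def can_approve(curator: str, package: str) -> bool:
    """A curator may approve an edit if their package encloses the edited one."""
    return ASSIGNED[curator] in enclosing_packages(package)
```

With this rule, the scope of any one curator's decisions is bounded by a subtree of the hierarchy, which is one way to trade broader participation against ontology quality.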
  • Outline
    • Collaborative Ontology Building (COB) Desiderata
    • Limitations of CVS-based Collaboration
    • COB Based on Modular Ontologies
    • The COB Editor
  • The COB Editor [Bao et al. BIDM06]
    • Screenshot labels: Pig Package, Cattle Package, Chicken Package
  • Collaborative Ontology Building
    • Ontology modularity facilitates collaborative building
    • Each package can be independently developed
    • Different curators can concurrently edit the ontology on different packages
    • An ontology can be loaded only partially
    • Unwanted interactions are minimized by limiting term and axiom visibility
    • Module access privileges can be controlled by the package hierarchy
  • Work with COB Editor
    • Download
    • http://www.animalgenome.org/bioinfo/projects/ATO/
    • http://sourceforge.net/projects/cob (source code)
    • Curator workflow: Get Ontology Account → Check out a package → Create new package or Lock Package → Edit the Package → Commit the Package → (Auto) Server Change Log
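The check out, lock, edit, commit cycle with an automatic server-side change log can be sketched as follows. This is an in-memory stand-in written for illustration only; `PackageServer` and its method names are hypothetical and do not reflect the COB Editor's actual API or its database schema.

```python
from typing import Dict, List, Set, Tuple


class PackageServer:
    """In-memory stand-in for a shared ontology server (illustration only)."""

    def __init__(self) -> None:
        self.packages: Dict[str, Set[str]] = {}            # package -> terms
        self.locks: Dict[str, str] = {}                    # package -> curator
        self.change_log: List[Tuple[str, str, str]] = []   # (curator, package, action)

    def check_out(self, package: str) -> Set[str]:
        """Return a private copy of one package, not the whole ontology."""
        return set(self.packages.setdefault(package, set()))

    def lock(self, package: str, curator: str) -> None:
        """Give `curator` exclusive edit rights on one package."""
        holder = self.locks.get(package)
        if holder is not None and holder != curator:
            raise PermissionError(f"{package} is locked by {holder}")
        self.locks[package] = curator

    def commit(self, package: str, curator: str, terms: Set[str]) -> None:
        """Store the edited package, log each change, and release the lock."""
        if self.locks.get(package) != curator:
            raise PermissionError(f"{curator} does not hold the lock on {package}")
        old = self.packages.get(package, set())
        for term in sorted(terms - old):
            self.change_log.append((curator, package, f"add {term}"))
        for term in sorted(old - terms):
            self.change_log.append((curator, package, f"remove {term}"))
        self.packages[package] = set(terms)
        del self.locks[package]
```

Because the lock and the log are both per package, two curators editing different packages never block each other, and the history of a package is recorded without copying the rest of the ontology.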
  • More Features
    • Support import/export from/to OWL and OBO format
      • can be used for Gene Ontology and others
    • Ontology shared on a database server
    • Allows multi-relational hierarchies
      • e.g. both is-a and part-of
    • Visibility of a term can be controlled by scope limitation modifiers
      • e.g. public, private, protected
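The scope limitation modifiers can be sketched by analogy with access modifiers in programming languages. The semantics assumed below (public: visible everywhere; private: visible only in the home package; protected: visible in the home package and in packages nested inside it) are an interpretation for illustration, and the package hierarchy is hypothetical.

```python
from typing import Dict, Optional

# Hypothetical package hierarchy: package -> enclosing package.
PARENT: Dict[str, Optional[str]] = {
    "Animal Trait": None,
    "Pig": "Animal Trait",
    "Cattle": "Animal Trait",
    "Pig Health": "Pig",
}


def is_nested_in(package: str, ancestor: str) -> bool:
    """True if `package` is `ancestor` or lies inside it at any depth."""
    node: Optional[str] = package
    while node is not None:
        if node == ancestor:
            return True
        node = PARENT[node]
    return False


def is_visible(home: str, modifier: str, viewer: str) -> bool:
    """Decide whether a term with the given home package is visible from `viewer`."""
    if modifier == "public":
        return True
    if modifier == "private":
        return viewer == home
    if modifier == "protected":
        return is_nested_in(viewer, home)
    raise ValueError(f"unknown modifier: {modifier!r}")
```

For example, under these assumptions a protected term whose home package is Pig is visible from Pig Health but not from Cattle.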
  • Conclusions
    • Modular ontologies can improve collaborative ontology building in many aspects
    • Package-based Ontology offers an "importing"-based ontology language.
    • COB Editor provides the necessary tool to collaboratively build well-structured, large-scale, biomedical ontologies
  • Future Work
    • Support of inference and consistency checking
    • Accommodation and modularization of existing ontologies, e.g. GO, EC, SCOP
    • Support of ontology mapping and ontology integration
    • Support of more expressive ontologies, e.g. UMLS, SNOMED
    • Thanks!