Semantic Infrastructure to Enable Collaboration in Ontology Development


This talk was presented at the 2011 International Conference on Collaboration Technologies and Systems

In many scientific disciplines, and in biomedicine in particular, researchers rely on ontologies to enable them to annotate and integrate their data. These ontologies are living and constantly evolving artifacts, and the ontology authors must rely on their user community to ensure that the coverage of the ontologies is sufficient for annotations and other tasks for which users deploy the ontologies.

We have developed a distributed collaborative mechanism to enable users to provide feedback to ontology authors, to request new terms, and to use provisional terms in their applications. The ontology authors can use the same infrastructure to explore this feedback in their ontology-editing environment, to update the ontology, to record their decisions on the users’ requests, and to publish both the updated ontology and the information on how they acted on the requested changes.

Specifically, we present the Notes ontology that we use to represent the different types of user feedback and change requests, the service-oriented Notes API to access the information that conforms to this ontology, and the two ontology editing and publishing environments, WebProtégé and NCBO BioPortal, that use this API to provide services for their users.

  • I’m going to talk today about an architecture and set of tools that we’ve developed to enable and encourage collaboration and feedback between ontology developers and users.
  • The main piece of this functionality is something we call notes, which are essentially annotations that can be made on ontologies to support things like adding a comment in a discussion or requesting that a new term be added to an ontology. I’ll be talking a lot more about what notes are and the architecture that supports their use.
  • Before we get to that, I’d like to briefly get everyone on the same page about what ontologies are, at least in our context. Then I’ll discuss our notes functionality and architecture, followed by some related functionality to support immediate use of new terms. Finally, I’ll discuss some of our future plans in this area and our conclusions.
  • First off, let’s discuss what we mean by ontology. There are many ways to define ontology, but for our purposes we’re going to think of an ontology as an explicit description of a domain. This description contains concepts, or terms, which can have properties and attributes. The ontology we’re looking at here is the Foundational Model of Anatomy, and you can see some terms listed on the left with some potential properties and attributes for these terms listed on the right. Ontologies define a common vocabulary for a domain and as a result provide a shared understanding of what is in that domain. Presenter reminders: Show BRO or another ontology from the talk. Acknowledge that many people may not know. Slide text: An ontology is an explicit description of a domain (concepts or terms; properties and attributes of terms; constraints on properties and attributes; individuals). An ontology defines a common vocabulary and a shared understanding.
  • Now that we have a good understanding of what ontology means in our context, let’s talk a little about the ontology development process. Ontology developers use a set of tools to create and edit their ontologies. Some of you may be familiar with Protégé or its web-based version, WebProtégé, which can be seen here on the left. These editing tools allow users to create new terms, change property values, or move terms in the hierarchy. In addition, ontology libraries exist that offer developers a platform on which to publish their ontologies. BioPortal, seen on the right here, is the library we’ve created to support the biomedical community. Ideally, these platforms are integrated so that users of both systems can interact seamlessly, which is something our notes architecture enables.
  • There were several driving factors involved in our notes implementation. The first was to assist ontology developers with consensus building. For example, ontology developers can use notes to discuss term names, definitions, and placement of terms in the hierarchy.
  • We also have groups who aren’t ontology developers but who use an ontology to populate drop-down menus. When a term doesn’t exist, users of their systems can select “Other” and enter in information about the value they need, which is then stored as a string. Rather than losing this information, the notes architecture enables these groups to request that the term be added to the ontology that they’re using in the drop-down.
  • In other cases, curators who are reviewing literature will come across a term they would like to annotate that doesn’t currently exist in the ontology they’re using to support their work. The notes architecture gives these curators a way to submit a request that these terms be added.
  • Finally, there are projects that use automated processes to analyze document sets with ontology terms. The analysis produces a set of recommendations, which can then be automatically submitted to the ontology developers via the Notes API.
  • To satisfy the needs of these use cases, we came up with a set of basic requirements that would guide our implementation. The first was that the notes needed to be structured, meaning that the data we gathered would be specific to the type of note being created. We also needed programmatic access via an API so that different systems could read and write notes against our data store. We wanted to give ontology developers a way to archive notes once they had taken action so that they could be hidden in whatever UI is being used. And finally, we wanted to have a model that would allow us to create new types of notes that were domain-specific, for example to propose a new semantic type for UMLS ontologies.
  • We started by modeling the notes in an ontology, part of which you can see represented here. All of the note types inherit from the base annotation class. You can see we have simple notes like comments or examples, and then more complex notes, like term proposals, all of which require a different set of metadata in order to be of use to ontology developers. Presenter reminders: Comment notes don’t have a body. Arrow colors. Include other note types, make it clear that there are other types using arrow with “…”.
  • We took the notes ontology and built a set of APIs around it, using a Java library as the base and building a set of RESTful services on top of that. The Java Notes API uses the OWL-API as a storage mechanism. You can see that BioPortal and WebProtégé use the RESTful services. We’ll show what this actually looks like in the UI next. External applications can read and write notes using either the Java library or the RESTful services.
  • You can see here an example of BioPortal’s Web UI for creating a new term proposal. As we’ve said, the note is structured and some of the fields can be pre-populated using the BioPortal REST services.
  • Here is an ontology developer’s view of the same functionality. Notes that are submitted via BioPortal are visible here, and vice versa. Ontology developers can discuss proposals directly in WebProtégé and record decisions, which will then be reflected back to the users in BioPortal.
  • When implementing new term proposals, we found that people wanted to be able to use whatever term they were proposing right away, regardless of what the ontology developer actually decided to do with regard to implementation. To enable this, we create something called a provisional term that gets an id in the form of a PURL (Persistent Uniform Resource Locator). This id is provided back to the term requestor in the proposal, and they can use this id in their application. Later on, when the ontology developer decides to implement the term, we record the permanent id and can create a record indicating that these two terms are equivalent, or sameAs. This also enables something we’re calling a Term Marketplace, which is essentially a way for users to create one-off terms that they believe don’t exist. Ontology developers can then examine these terms and decide to implement them or create a pointer to an equivalent term that already exists. Again, we create a record tying these two together, which allows users to continue using the provisional id or switch to the permanent id.
  • To support the usage of provisional terms, we defined and implemented a set of requirements. First, users need to be able to create terms and get an id. In addition, ontology developers and other applications need a way to query and filter these terms based on a set of parameters, like suggested ontologies, submitter, or term creation date ranges. Finally, users need to be able to take the provisional id and get back the final, implemented term if one exists. All of this functionality is currently available via REST services.
  • In conclusion, we created an ontology-based system to enable collaboration between ontology editors and feedback from users. We have a service-oriented, RESTful architecture which allows multiple systems to read and write to the same store. Finally, we’ve implemented a set of RESTful services to support the immediate use of proposed and provisional terms.
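
The requirements and note types described in the speaker notes above (structured notes, a shared base annotation class, archiving, and domain-specific extensions) can be sketched roughly as follows. This is a minimal illustration in Python, not the Notes ontology or the Java Notes API itself: every class name, field name, and helper below is a hypothetical stand-in chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Annotation:
    """Hypothetical base class; every note type inherits from it."""
    author: str
    ontology_id: str
    subject: str = ""
    archived: bool = False

    def archive(self) -> None:
        # Mark the note as handled so a UI can hide it.
        self.archived = True


@dataclass
class Comment(Annotation):
    """Simple note; adds nothing to the base fields in this sketch."""


@dataclass
class TermProposal(Annotation):
    """More complex note carrying metadata specific to its type."""
    preferred_name: str = ""
    definition: str = ""
    synonyms: List[str] = field(default_factory=list)
    parent_term: str = ""


@dataclass
class SemanticTypeProposal(Annotation):
    """Example of a domain-specific extension (assumed field name)."""
    semantic_type: str = ""


def open_notes(notes: List[Annotation]) -> List[Annotation]:
    """Return only the notes that have not been archived."""
    return [note for note in notes if not note.archived]


if __name__ == "__main__":
    notes = [
        Comment(author="curator", ontology_id="BRO", subject="Is this name clear?"),
        TermProposal(author="annotator", ontology_id="GO",
                     preferred_name="example term",
                     definition="An illustrative definition."),
    ]
    notes[1].archive()  # the developer has acted on the proposal
    for note in open_notes(notes):
        print(type(note).__name__, note.subject)
```

Keeping the type-specific metadata on subclasses of one base class is what lets a new note type be added without touching the code that lists, filters, or archives notes.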

    1. Semantic Infrastructure to Enable Collaboration in Ontology Development. 2011 International Conference on Collaboration Technologies and Systems. Semantic Technologies for Information-Integrated Collaboration. Wednesday, May 25, 2011. Presenter: Paul R. Alexander. Authors: Paul R. Alexander, Csongor Nyulas, Tania Tudorache, Patricia L. Whetzel, Natalya F. Noy, Mark A. Musen. Stanford Center for Biomedical Informatics Research, Stanford University, US
    2. Notes Primer
    3. Introduction: What is an ontology?; Ontology development and publication; Problems working together; Notes and provisional terms; Conclusion
    4. What is an Ontology?
    5. Ontology Development: Editing; Publishing; Protégé, WebProtégé; BioPortal
    6. Biomedical Resource Ontology
    7. MIAMExpress
    8. Gene Ontology
    9. Notes Use Cases: Consensus-building (BRO); New term requests (MIAMExpress, Phenex); Data annotation (GO); Automated requests (ODIE)
    10. Requirements for Notes: Comprised of structured information; Accessed via REST web service; Placed into an archived state; Extended via domain-specific notes types
    11. Types of Notes
    12. Notes APIs
    13. Notes Usage: Comments/Requests
    14. Notes Usage: Editor Workflow
    15. Notes Usage: Editor Workflow
    16. Provisional Terms
    17. Requirements for Provisional Terms: Term creation; Query and filter; Get implemented term
    18. Future Work: Further integration of BioPortal and WebProtégé; Automation support for editors; Full-fledged ‘Term Marketplace’
    19. Conclusion: Ontology-based architecture to support collaboration and feedback; Service-oriented, RESTful architecture allows multiple systems to access a common store; Provisional terms enable immediate use; http://bioportal.bioontology.org
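
The three requirements for provisional terms listed on slide 17 (term creation, query and filter, get implemented term) can be illustrated with a small in-memory sketch. It is written in Python under stated assumptions and does not reproduce the BioPortal REST services: the `PURL_BASE` URL, the `ProvisionalTermStore` class, and all method and field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional

# Hypothetical PURL prefix; the real service's URL scheme is not given in the talk.
PURL_BASE = "http://purl.example.org/provisional/"


@dataclass
class ProvisionalTerm:
    term_id: str
    label: str
    submitter: str
    suggested_ontologies: List[str]
    created: date
    permanent_id: Optional[str] = None  # set once a developer implements the term


class ProvisionalTermStore:
    """In-memory stand-in for a provisional term service (assumed design)."""

    def __init__(self) -> None:
        self._terms: Dict[str, ProvisionalTerm] = {}

    def create(self, label: str, submitter: str,
               suggested_ontologies: List[str], created: date) -> ProvisionalTerm:
        # Requirement 1: create a term and hand back an id right away.
        term_id = f"{PURL_BASE}{len(self._terms) + 1}"
        term = ProvisionalTerm(term_id, label, submitter,
                               list(suggested_ontologies), created)
        self._terms[term_id] = term
        return term

    def query(self, submitter: Optional[str] = None,
              ontology: Optional[str] = None,
              created_from: Optional[date] = None,
              created_to: Optional[date] = None) -> List[ProvisionalTerm]:
        # Requirement 2: filter by submitter, suggested ontology, or date range.
        results = []
        for term in self._terms.values():
            if submitter is not None and term.submitter != submitter:
                continue
            if ontology is not None and ontology not in term.suggested_ontologies:
                continue
            if created_from is not None and term.created < created_from:
                continue
            if created_to is not None and term.created > created_to:
                continue
            results.append(term)
        return results

    def implement(self, term_id: str, permanent_id: str) -> None:
        # Record that the provisional term is equivalent to a permanent one.
        self._terms[term_id].permanent_id = permanent_id

    def resolve(self, term_id: str) -> Optional[str]:
        # Requirement 3: map a provisional id to the implemented term, if any.
        term = self._terms.get(term_id)
        return term.permanent_id if term else None
```

Because the provisional id never changes, an application can keep using it after the term is implemented and call `resolve` only when it wants to switch to the permanent id.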