Analogy for modularization of ontologies: given five lines of different colors and a set of possible angular relationships, it is easier to build different shapes and patterns.
Here is an example that will hopefully illustrate the strengths and usefulness of our ontology. NIFSTD has various neuron types with a simple asserted hierarchy within the NIF-Cell module (here is an example with five neuron types). However, we also assert various logical restrictions about these neurons.
Having the defined classes enables useful concept-based queries through the NIF search interface. For example, when a user searches for ‘GABAergic neuron’, the system recognizes the term as ‘defined’ in the ontology and looks for any neuron that has GABA as a neurotransmitter (instead of a lexical match of the search term, as in Google), then expands the query over that inferred list of neurons.
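The query expansion described in this note can be sketched roughly as follows. This is a minimal illustration, not the NIF search implementation: the dictionaries stand in for the ontology, the two "Example neuron" entries are made up, and the function name `expand_query` is hypothetical.

```python
# Hand-written stand-in for the ontology: neuron -> asserted neurotransmitters.
# Only the Purkinje cell entry comes from the slides; the others are made up.
NEURONS = {
    "Cerebellum Purkinje cell": {"GABA"},
    "Example neuron A": {"Glutamate"},  # hypothetical
    "Example neuron B": {"GABA"},       # hypothetical
}

# Defined classes: search term -> neurotransmitter that defines membership.
DEFINED_CLASSES = {"GABAergic neuron": "GABA"}


def expand_query(term):
    """Return the list of search terms to use for `term`."""
    transmitter = DEFINED_CLASSES.get(term)
    if transmitter is None:
        # Not a defined class: fall back to a plain lexical match.
        return [term]
    # Defined class: add every neuron that satisfies the definition.
    members = sorted(n for n, used in NEURONS.items() if transmitter in used)
    return [term] + members


print(" OR ".join(f'"{t}"' for t in expand_query("GABAergic neuron")))
```

A term that is not a defined class passes through unchanged, which mirrors the fallback to an ordinary keyword search.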
One of the largest roadblocks that we encountered during our ontology development was the lack of tools for domain experts to contribute their knowledge to NIFSTD. To bridge this gap, NIF has created NeuroLex (http://neurolex.org), a semantic wiki interface that gives domain experts an easy entry point to the NIFSTD contents. It has been used extensively in the area of neuronal cell types, where NIF is working with a group of neuroscientists, including Gordon Shepherd and Georgio Ascoli, to create a comprehensive list of neurons and their properties.
We envision NeuroLex as the main entry point for the broader community to access, annotate, edit and enhance the core NIFSTD content. The peer-reviewed contributions in the MediaWiki are later implemented in formal OWL modules. While the properties in NeuroLex are meant for easier interpretation, the restrictions in NIFSTD are usually based on rigorous OBO-RO standard relations. For example, the property ‘soma located in’ is translated as ‘Neuron X’ has_part some (‘Soma’ and (part_of some ‘Brain region Y’)) in NIFSTD.
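As a rough illustration of that translation step, the sketch below renders the restriction text for a given neuron and brain region. It only reproduces the pattern quoted in the note as a string; the function name `soma_located_in` is hypothetical, and the actual curation workflow is not described in the slides.

```python
def soma_located_in(neuron, region):
    """Render the NIFSTD-style restriction for a NeuroLex 'soma located in' value."""
    return f"'{neuron}' has_part some ('Soma' and (part_of some '{region}'))"


print(soma_located_in("Cerebellum Purkinje cell",
                      "Purkinje cell layer of cerebellar cortex"))
```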
While the OBO Foundry principles promote developing highly interoperable and reusable reference ontologies in ideal cases, following some of them rigidly often proves too ambitious for day-to-day development.
NIFSTD and NeuroLex: A Comprehensive Ontology Development Based on Multiple Biomedical Ontologies and Community Involvement
NIFSTD AND NEUROLEX: DEVELOPMENT OF A COMPREHENSIVE NEUROSCIENCE ONTOLOGY
Fahim IMAM, Stephen LARSON, Georgio ASCOLI, Gordon SHEPHERD, Anita BANDROWSKI, Jeffery S. GRETHE, Amarnath GUPTA, Maryann E. MARTONE
University of California, San Diego, CA
George Mason University, Fairfax, VA
Yale University, New Haven, CT
ICBO Workshop 2011
July 26, 2011
Funded in part by the NIH Neuroscience Blueprint HHSN271200800035C via NIDA
NEUROSCIENCE INFORMATION FRAMEWORK

NIF: DISCOVER AND UTILIZE WEB-BASED NEUROSCIENCE RESOURCES
• A portal for finding and using neuroscience resources
• A consistent framework for describing resources
• Provides simultaneous search of multiple types of information, organized by category
• NIFSTD Ontology, a critical component
  – Enables concept-based search
UCSD, Yale, Cal Tech, George Mason, Harvard MGH
Supported by NIH Blueprint
The Neuroscience Information Framework (NIF), http://neuinfo.org

NIF STANDARD ONTOLOGIES (NIFSTD)
• Set of modular ontologies
  – Covering neuroscience-relevant terminologies
  – Comprehensive: ~60,000 distinct concepts + synonyms
• Expressed in OWL-DL language
  – Supported by common DL reasoners
• Closely follows OBO community best practices
• Avoids duplication of efforts
  – Standardized to the same upper level ontologies
    • e.g., Basic Formal Ontology (BFO), OBO Relations Ontology (OBO-RO), Phenotypic Qualities Ontology (PATO)
  – Relies on existing community ontologies, e.g., CHEBI, GO, PRO, OBI etc.
• Modules cover orthogonal domains, e.g., Brain Regions, Cells, Molecules, Subcellular parts, Diseases, Nervous system functions, etc.
Bill Bug et al.

NIFSTD EXTERNAL COMMUNITY SOURCES
Domain | External Source | Import/Adapt | Module
Organism taxonomy | NCBI Taxonomy, GBIF, ITIS, IMSR, Jackson Labs mouse catalog | Adapt | NIF-Organism
Molecules | IUPHAR ion channels and receptors, Sequence Ontology (SO), ChEBI, and Protein Ontology (PRO); pending: NCBI Entrez Protein, NCBI RefSeq, NCBI Homologene, NIDA drug lists | Adapt IUPHAR, ChEBI; Import PRO, SO | NIF-Molecule, NIF-Chemical
Sub-cellular | Sub-cellular Anatomy Ontology (SAO). Extracted cell parts and subcellular structures. Imported GO Cellular Component | Import | NIF-Subcellular
Cell | CCDB, NeuronDB, NeuroMorpho.org. Terminologies; pending: OBO Cell Ontology | Adapt | NIF-Cell
Gross Anatomy | NeuroNames extended by including terms from BIRN, SumsDB, BrainMap.org, etc; multi-scale representation of Nervous System Macroscopic anatomy | Adapt | NIF-GrossAnatomy
Nervous system function | Sensory, Behavior, Cognition terms from NIF, BIRN, BrainMap.org, MeSH, and UMLS | Adapt | NIF-Function
Nervous system dysfunction | Nervous system disease from MeSH, NINDS terminology; Disease Ontology (DO) | Adapt/Import | NIF-Dysfunction
Phenotypic qualities | PATO is imported as part of the OBO Foundry core | Import | NIF-Quality
Investigation: reagents | Overlaps with molecules above, especially RefSeq for mRNA | Import | NIF-Investigation
Investigation: instruments, protocols | Based on Ontology for Biomedical Investigation (OBI) to include entities for biomaterial transformations, assays, data transformations | Adapt | NIF-Investigation
Investigation: Resource | NIF, OBI, NITRC, Biomedical Resource Ontology (BRO) | Adapt | NIF-Resource
Biological Process | Gene Ontology’s (GO) biological process in whole | Import | NIF-BioProcess
Cognitive Paradigm | Cognitive Paradigm Ontology (CogPO) | Import | NIF-Investigation

IMPORTING OR ADAPTING A NEW ONTOLOGY OR VOCABULARY SOURCE
Source | Import/adapt
A source already in OWL that uses the OBO-RO and the BFO and is orthogonal to existing modules | The import simply involves adding an owl:import statement
An existing orthogonal ontology that is in OWL but does not use the same foundational ontologies as NIFSTD | An ontology-bridging module (explained later) is constructed, declaring the deep-level semantic equivalencies such as foundational objects and processes
An external source that satisfies the above two rules but is observed to be too large for NIF’s scope of interests | A relevant subset is extracted. MIREOT principles have been adopted
An external source that has not been represented in OWL, or does not use the same foundation as NIFSTD | The terminology is adapted to OWL/RDF in the context of the NIFSTD foundational layer ontologies
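The four rules in this table can be read as a small decision procedure. The sketch below is one possible encoding under stated assumptions: the `Source` fields and the function name are hypothetical, and the precedence between the overlapping second and fourth rows (an OWL source on a different foundation) is an assumption, resolved here by whether the source is orthogonal to existing modules.

```python
from dataclasses import dataclass

IMPORT = "add an owl:import statement"
BRIDGE = "construct an ontology-bridging module"
EXTRACT = "extract a relevant subset (MIREOT)"
ADAPT = "adapt to OWL/RDF under the NIFSTD foundational layer"


@dataclass
class Source:
    in_owl: bool             # already represented in OWL
    uses_bfo_and_ro: bool    # built on BFO and OBO-RO
    orthogonal: bool         # does not overlap existing NIFSTD modules
    too_large: bool = False  # larger than NIF's scope of interests


def integration_strategy(src):
    """Pick the import/adapt action from the table; None if no row applies."""
    if not src.in_owl:
        return ADAPT
    if src.orthogonal:
        if src.too_large:
            return EXTRACT
        return IMPORT if src.uses_bfo_and_ro else BRIDGE
    # In OWL but not orthogonal: only the "different foundation" row remains.
    return ADAPT if not src.uses_bfo_and_ro else None


print(integration_strategy(Source(in_owl=True, uses_bfo_and_ro=True, orthogonal=True)))
```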

NIFSTD DESIGN PRINCIPLES
• Single Inheritance for Named Classes
  – Follows simple inheritance principle for named classes
  – An asserted named class can have only one named class as its superclass
  – Promotes the named classes to be univocal and to avoid ambiguities
• Classes with multiple named superclasses
  – Can be inferred using automated reasoners
  – Saves a great deal of manual labor and minimizes human errors
• Alan Rector’s Normalization principles.
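A check for the single-inheritance rule could look like the sketch below. The asserted hierarchy is a hand-written dictionary rather than a parsed OWL file, the 'Cell' parent and the 'Bad example' class are made up for illustration, and the function name is hypothetical.

```python
# class -> asserted named superclasses (hypothetical sample data)
ASSERTED_SUPERCLASSES = {
    "Neuron": ["Cell"],
    "Cerebellum Purkinje cell": ["Neuron"],
    # Deliberately violates the rule: the second parent should be inferred
    # by a reasoner, not asserted by hand.
    "Bad example": ["Neuron", "GABAergic neuron"],
}


def single_inheritance_violations(asserted):
    """Return the classes that assert more than one distinct named superclass."""
    return {cls: parents for cls, parents in asserted.items()
            if len(set(parents)) > 1}


print(single_inheritance_violations(ASSERTED_SUPERCLASSES))
```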

DESIGN PRINCIPLES
• Unique Identifiers and Annotation Properties.
  – NIFSTD entities are identified by a unique identifier and accompanied by a variety of annotation properties
    • Derived from Dublin Core Metadata (DC) and Simple Knowledge Organization System (SKOS) model.
    • Synonyms, acronyms, definition, defining source etc.
  – Reuses the same URI for classes MIREOTed from an external source
    • Avoids extra mapping annotations, e.g., class identifiers remain unaltered.

DESIGN PRINCIPLES
• Annotation properties associated with versioning different levels of content
  – creation date and modification dates
  – file-level versioning for each of the modules
  – annotations for retiring antiquated concept definitions
    • hasFormerParentClass and isReplacedByClass etc.
    • tracking former ontology graph position and replacement concepts.

DESIGN PRINCIPLES
• Object Properties and Bridge Modules.
  – Mostly drawn from OBO Relations Ontology (OBO-RO)
  – Intra-module relations are kept within the same module
    • ONLY universal restrictions are considered
      – e.g., partonomy relations within different brain regions
  – The cross-module relations are specified in separate bridging modules
    • Modules that only contain logical restrictions on a set of classes assigned between multiple modules.
    • Allows main domain modules (e.g., anatomy, cell type, etc.) to remain independent of one another

DESIGN PRINCIPLES
• Helps keep the modularity principles intact
• Facilitates extensions for broader communities without NIF-centric views
• These bridging modules can easily be excluded in order to focus on core modules
Two example bridging modules in NIFSTD
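The separation between domain modules and bridging modules can be illustrated with a small sketch that routes each restriction by the modules of the two classes it connects. The module assignment for 'GABA', the relation names other than part_of, and the bridge naming scheme are all assumptions made for this example.

```python
# class -> domain module (NIF-Molecule for GABA is an assumption)
MODULE_OF = {
    "Cerebellum Purkinje cell": "NIF-Cell",
    "Purkinje cell layer of cerebellar cortex": "NIF-GrossAnatomy",
    "Cerebellar cortex": "NIF-GrossAnatomy",
    "GABA": "NIF-Molecule",
}

# (subject, relation, target); relation names are illustrative shorthand
RESTRICTIONS = [
    ("Purkinje cell layer of cerebellar cortex", "part_of", "Cerebellar cortex"),
    ("Cerebellum Purkinje cell", "soma_located_in",
     "Purkinje cell layer of cerebellar cortex"),
    ("Cerebellum Purkinje cell", "has_neurotransmitter", "GABA"),
]


def split_restrictions(restrictions, module_of):
    """Keep intra-module restrictions in their module; route the rest to bridges."""
    domain, bridge = {}, {}
    for subject, relation, target in restrictions:
        src, dst = module_of[subject], module_of[target]
        if src == dst:
            domain.setdefault(src, []).append((subject, relation, target))
        else:
            # Hypothetical naming scheme for the bridging module.
            bridge.setdefault(f"{src}-to-{dst}-bridge", []).append(
                (subject, relation, target))
    return domain, bridge


domain, bridge = split_restrictions(RESTRICTIONS, MODULE_OF)
print(sorted(domain))
print(sorted(bridge))
```

Dropping the `bridge` dictionary leaves each domain module usable on its own, which is the point of keeping cross-module restrictions separate.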

TYPICAL KNOWLEDGE MODEL
A typical knowledge model in NIFSTD. Both cross-modular and intra-modular classes are associated through object properties mostly drawn from the OBO Relations Ontology (RO).

TYPICAL USE OF ONTOLOGY IN NIF
• Basic feature of an ontology
  – Organizing the concepts involved in a domain into a hierarchy and
  – Precisely specifying how the classes are ‘related’ with each other (i.e., logical axioms)
• Explicit knowledge is asserted, but implicit logical consequences can be inferred
  – A powerful feature of an ontology

ONTOLOGY – ASSERTED HIERARCHY
Class name: Cerebellum Purkinje cell
Asserted necessary conditions:
1. Is a ‘Neuron’
2. Its soma lies within ‘Purkinje cell layer of cerebellar cortex’
3. It has ‘Projection neuron role’
4. It uses ‘GABA’ as a neurotransmitter
5. It has ‘Spiny dendrite quality’

Class name | Asserted defining (necessary & sufficient) expression
Cerebellum neuron | Is a ‘Neuron’ whose soma lies in any part of the ‘Cerebellum’ or ‘Cerebellar cortex’
Principal neuron | Is a ‘Neuron’ which has ‘Projection neuron role’, i.e., a neuron whose axon projects out of the brain region in which its soma lies
GABAergic neuron | Is a ‘Neuron’ that uses ‘GABA’ as a neurotransmitter
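To make the inference concrete, the sketch below checks the asserted conditions of the Purkinje cell against the three defined classes. It is a hand-rolled stand-in for a DL reasoner, not how NIFSTD is actually classified; the part-of chain between the three brain regions and all Python names are assumptions for the illustration.

```python
# Assumed partonomy: region -> the region it is part of.
PART_OF = {
    "Purkinje cell layer of cerebellar cortex": "Cerebellar cortex",
    "Cerebellar cortex": "Cerebellum",
}


def region_and_ancestors(region):
    """Return the region followed by everything it is (transitively) part of."""
    chain = []
    while region is not None:
        chain.append(region)
        region = PART_OF.get(region)
    return chain


# Asserted necessary conditions from the slide.
PURKINJE = {
    "is_a": "Neuron",
    "soma_in": "Purkinje cell layer of cerebellar cortex",
    "role": "Projection neuron role",
    "neurotransmitter": "GABA",
    "quality": "Spiny dendrite quality",
}

# Defined classes: name -> test for the necessary and sufficient expression.
DEFINED = {
    "Cerebellum neuron": lambda n: bool(
        {"Cerebellum", "Cerebellar cortex"} & set(region_and_ancestors(n["soma_in"]))),
    "Principal neuron": lambda n: n["role"] == "Projection neuron role",
    "GABAergic neuron": lambda n: n["neurotransmitter"] == "GABA",
}


def infer_superclasses(neuron):
    """List every defined class whose definition the neuron satisfies."""
    if neuron["is_a"] != "Neuron":
        return []
    return [name for name, satisfied in DEFINED.items() if satisfied(neuron)]


print(infer_superclasses(PURKINJE))
```

The asserted hierarchy gives the Purkinje cell a single parent, while the three additional superclasses come out of the definitions alone.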

NIFSTD CURRENT VERSION
• Key feature: Includes useful defined concepts to infer useful classification
NIF Standard Ontologies

NIFSTD AND NEUROLEX WIKI
• Semantic wiki platform
• Provides simple forms for structured knowledge
• Can add concepts, properties
• Generate hierarchies without having to learn complicated ontology tools
• Good teaching tool for principles behind ontologies
• Community can contribute
Stephen D. Larson et al.

NeuroLex vs. NIFSTD
NeuroLex | NIFSTD
A semantic MediaWiki-based website containing the content of the NIFSTD plus additional community contributions | Collection of cohesive, unified modular ontologies deployed in OWL
Categories | Classes
Content is fluid and can be updated at any time. | Structure is based on OBO Foundry principles
Defines relationships between categories as simple properties | Defines relationships between classes as OWL restrictions derived from RO
At-a-glance guide to the differences between NeuroLex and NIFSTD
Larson et al.

Top Down vs. Bottom Up
Top-down ontology construction (NIFSTD)
• A select few authors have write privileges
• Maximizes consistency of terms with each other
• Making changes requires approval and re-publishing
• Works best when domain to be organized has: small corpus, formal categories, stable entities, restricted entities, clear edges.
• Works best with participants who are: expert catalogers, coordinated users, expert users, people with authoritative source of judgment
Bottom-up ontology construction (NEUROLEX)
• Multiple participants can edit the ontology instantly
• Control of content is done after edits are made based on the merit of the content
• Semantics are limited to what is convenient for the domain
• Not a replacement for top-down construction; sometimes necessary to increase flexibility
• Necessary when domain has: large corpus, no formal categories, no clear edges
• Necessary when participants are: uncoordinated users, amateur users, naïve catalogers
• Neuroscience is a domain that is less formal and neuroscientists are more uncoordinated
Larson et al.

NEUROLEX WIKI CONTRIBUTIONS
http://neurolex.org/wiki/Special:ContributionScores

NIFSTD/NEUROLEX CURATION WORKFLOW
‘has soma location’ in NeuroLex == ‘Neuron X’ has_part some (‘Soma’ and (part_of some ‘Brain region Y’)) in NIFSTD

ACCESS TO NIFSTD CONTENTS
• NIFSTD is available as
  – OWL Format: http://ontology.neuinfo.org
  – RDF and SPARQL Endpoint: http://ontology.neuinfo.org/sparql-endpoint.html
• Specific contents through web services
  – http://ontology.neuinfo.org/ontoquest-service.html
• Available through NCBO Bioportal
  – Provides annotation and mapping services
  – http://bioportal.bioontology.org/
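For the SPARQL route, a query for the asserted subclasses of a class might be built as in the sketch below. It only constructs the query text and does not contact any server: the slides give a documentation page rather than the endpoint address, and the assumption that NIFSTD classes carry an rdfs:label should be checked against the actual data. The function name is hypothetical.

```python
def subclass_query(label):
    """Build a SPARQL query listing direct asserted subclasses of the class
    whose rdfs:label equals `label`."""
    # Escape backslashes and double quotes so the label is a valid literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?cls ?name WHERE {\n"
        f'  ?parent rdfs:label ?plabel . FILTER (str(?plabel) = "{escaped}")\n'
        "  ?cls rdfs:subClassOf ?parent ;\n"
        "       rdfs:label ?name .\n"
        "}"
    )


print(subclass_query("Neuron"))
```

A query like this returns only asserted subclass links; inferred memberships such as ‘GABAergic neuron’ would require the endpoint to serve reasoned data.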

SUMMARY AND CONCLUSIONS
• NIF with NIFSTD provides an example of how ontologies can be practically applied to enhance search and data integration across diverse resources
• We believe we have defined a process to form complex semantics for various neuroscience concepts through NIFSTD and through the NeuroLex collaborative environment.
• NIF encourages the use of community ontologies
• Moving towards building a rich knowledgebase for neuroscience that integrates with larger life science communities.

Point of Discussion
• Gaining OBO Foundry community consensus for a production system is difficult, as we often need to move quickly along with the project
• We instead favor a system whereby we start with the minimal complexity required and add more as the ontologies evolve over time towards perfection
• What should be the most effective way to collaborate and gain community consensus?