The document discusses building biomedical ontologies. It notes problems with current approaches that lead to redundant ontologies. The OBO Foundry is proposed as a solution to create a single ontology for each domain through collaboration. The OBO Foundry relies on common principles, such as using the Basic Formal Ontology as a top-level ontology and adopting a common relation ontology. Successful ontologies in the OBO Foundry like the Gene Ontology are discussed as examples.
ENVO: The Environment Ontology (Presentation at the Genomics Standards Consor... - Barry Smith
This document discusses using ontologies to improve data integration across biological and biomedical domains. It proposes developing the Environment Ontology (EnvO) to represent environmental contexts. EnvO would align under the Basic Formal Ontology and relate environments at different levels of granularity, from ecosystems down to molecular components. Representing habitats, niches, and other key environmental terms ontologically could help integrate data across species and research groups. Developing EnvO coherently according to established ontology best practices could extend the success of projects like the Gene Ontology to the domain of environmental data.
Enriching the Gene Ontology via the Dissection of Labels using the Ontology P... - jesualdofernandez
Authors: J.T. Fernandez-Breis, L. Iannone, I. Palmisano, A. Rector, R. Stevens.
Presented at 17th International Conference on Knowledge Engineering and Knowledge Management, EKAW2010
Presentation to the ImmPort Science Meeting, February 27, 2014, on the proper treatment of value sets in the ImmPort Immunology Database and Analysis Portal.
Surveys a series of ethical, economic, clinical and also safety issues relating to the application of informatics to healthcare, focusing especially on the role of informatics in the Patient Protection and Affordable Care Act. Talk presented in the University at Buffalo Clinical/Research Ethics Seminar - Ethics, Informatics and Obamacare, November 20, 2012. Slides are available here: http://ontology.buffalo.edu/13/ethics-informatics-obamacare.pptx
Towards Joint Doctrine for Military Informatics - Barry Smith
This document discusses using semantic enhancement and ontologies to integrate siloed data from multiple sources. It describes challenges with current approaches that rely on creating a single "über-model" or virtual integration through a homogeneous data model. Instead, it proposes a virtual integration approach using a distributed cloud-based data representation and integration framework. This framework would use ontologies to semantically enhance data from different sources and provide a unified view without requiring physical data integration or homogenization. The goal is a scalable solution that allows for continuous semantic enrichment while preserving the original data and semantics.
This document discusses ontologies related to pain. It begins by introducing the Basic Formal Ontology (BFO) and Ontology for General Medical Science (OGMS) as top-level ontologies. It then discusses dependent continuants like qualities, functions, and dispositions. Examples of pain-related phenomena are provided like canonical pain from tissue damage and variant pain without damage. Borderline cases are noted as existing in nature without sharp boundaries. The methodology of canonical ontology is said to address borderline cases.
Towards Joint Doctrine for Military Informatics - Barry Smith
This document discusses using semantic enhancement and ontologies to integrate siloed data from multiple sources. It describes challenges with current approaches that rely on creating a single "über-model" or virtual integration through a homogeneous data model. Instead, it proposes a virtual integration approach using ontologies to provide a comprehensive view of domains while keeping data in its original state. Data from different sources can be semantically tagged and integrated in a cloud-based system without heavy preprocessing. This allows flexible, scalable integration while preserving data and semantics from the original sources.
Ontology of Documents, Tokyo, February 2011 - Barry Smith
The document discusses various social and economic processes and entities. It describes how a bank lending money for a beach condo led to the creation of complex financial instruments like collateralized debt obligations (CDOs). It raises ontological questions about what kinds of entities these are. CDOs seem to fall outside traditional categories as they are normative entities that can be traded but have no physical existence. The document also examines how identity, property rights, and economic interactions in African villages are anchored through basic documentary practices, showing how even rudimentary documents can create new institutional realities.
Basic Formal Ontology (BFO) and Disease - Barry Smith
The document discusses different approaches to conceptualizing health, disease, and biological kinds across multiple levels of granularity. It notes that traditional biology data conceptualized entities based on observable instances, while new biology data represents entities at the molecular level through genetic sequences. It argues that linking different kinds of phenomena represented at various levels requires annotation with terms from controlled vocabularies like ontologies. Ontologies provide a structured framework for integrating data across databases and supporting logical reasoning by standardizing references to biological entities, processes, and functions.
The document discusses the Open Biomedical Ontologies (OBO) Foundry, which is a collection of reference ontologies in biology and biomedicine that follow a set of principles to ensure logical structure and interoperability. It outlines the motivation and history of OBO Foundry, describes how the ontologies are organized, and provides examples of ontologies in the Foundry and their scopes. The goal of OBO Foundry is to facilitate sharing, integration and analysis of biological data through standardized ontologies.
Integrating Pathway Databases with Gene Ontology Causal Activity Models - Benjamin Good
The Gene Ontology (GO) Consortium (GOC) is developing a new knowledge representation approach called ‘causal activity models’ (GO-CAM). A GO-CAM describes how one or several gene products contribute to the execution of a biological process. In these models (implemented as OWL instance graphs anchored in Open Biological Ontology (OBO) classes and relations), gene products are linked to molecular activities via semantic relationships like ‘enables’, molecular activities are linked to each other via causal relationships such as ‘positively regulates’, and sets of molecular activities are defined as ‘parts’ of larger biological processes. This approach provides the GOC with a more complete and extensible structure for capturing knowledge of gene function. It also allows for the representation of knowledge typically seen in pathway databases.
Here, we present details and results of a rule-based transformation of pathways represented using the BioPAX exchange format into GO-CAMs. We have automatically converted all Reactome pathways into GO-CAMs and are currently working on the conversion of additional resources available through Pathway Commons. By converting pathways into GO-CAMs, we can leverage OWL description logic reasoning over OBO ontologies to infer new biological relationships and detect logical inconsistencies. Further, the conversion helps to increase standardization for the representation of biological entities and processes. The products of this work can be used to improve source databases, for example by inferring new GO annotations for pathways and reactions and can help with the formation of meta-knowledge bases that integrate content from multiple sources.
Cross Product Extensions to the Gene Ontology - Chris Mungall
The document discusses extending the Gene Ontology (GO) through assigning logical computable definitions to GO classes. This involves partitioning GO classes into "cross product" sets based on the ontologies used in the definitions. Over 13,000 GO classes now have provisional logical definitions assigned using this approach, covering molecular function, biological process, cellular component, and other ontologies. The logical definitions allow for nested descriptions and reasoning over GO classes. Anatomy classes are standardized in the Uberon cross-species anatomy ontology.
Basic Formal Ontology: A Common Standard - Barry Smith
The document discusses the Basic Formal Ontology (BFO), a top-level ontology intended to support information integration in scientific research. BFO defines three fundamental dichotomies: continuant vs. occurrent, independent vs. dependent, and type vs. instance. It also distinguishes between specifically dependent continuants like qualities and realizable dependent continuants like functions. The goal of BFO is to provide a framework for building consistent lower-level ontologies in a reusable manner.
This document discusses applying ontology design patterns in bio-ontologies using the Ontology Preprocessor Language (OPPL). OPPL allows complex modeling to be stored, shared, and consistently applied to ontologies. It can be used to efficiently apply ontology design patterns, like closure patterns, to encapsulate semantics. Version 2 of OPPL was developed to be more axiom-centric and supports features like variables to represent patterns. The document concludes that OPPL enables easy manipulation of ontologies and consistent application of design patterns to aid in modeling.
Function and Phenotype Prediction through Data and Knowledge Fusion - Karin Verspoor
The biomedical literature captures the most current biomedical knowledge and is a tremendously rich resource for research. With over 24 million publications currently indexed in the US National Library of Medicine’s PubMed index, however, it is becoming increasingly challenging for biomedical researchers to keep up with this literature. Automated strategies for extracting information from it are required. Large-scale processing of the literature enables direct biomedical knowledge discovery. In this presentation, I will introduce the use of text mining techniques to support analysis of biological data sets, and will specifically discuss applications in protein function and phenotype prediction, exploring the integration of literature data with complementary structured resources.
Biomedical ontologies are key to the success of Semantic Web technologies in Life Sciences; therefore, it is important to provide appropriate tools for their development and further exploitation. The Ontology Pre Processor Language (OPPL) can be used for automating the complex manipulation needed to devise biomedical ontologies with richer axiomatic content, which in turn pave the way towards advanced biological data analyses. We present OPPL-Galaxy, an OPPL wrapper for the Galaxy platform, and a series of examples demonstrating its functionality for enriching ontologies. As Galaxy provides an integrated framework to make use of various bioinformatics tools, the functionality delivered by OPPL to manipulate ontologies can be combined along with the tools and workflows devised in Galaxy. As a result, those workflows can be used to perform more thorough analyses of biological information by exploiting extant biological knowledge codified in (enriched) biomedical ontologies
P Systems Model Optimisation by Means of Evolutionary Based Search ... - Natalio Krasnogor
This document discusses using evolutionary algorithms to optimize parameters in P systems, which are computational models of biological cells. Four test cases of increasing difficulty are used to compare different algorithms. The results show that genetic algorithms, differential evolution, and opposition-based differential evolution perform better for problems with fewer parameters, while variable neighbourhood search algorithms perform better for the largest problem with 38 parameters. This is because the evolutionary algorithms are less efficient at optimizing large populations within the limited evaluation budget, whereas variable neighbourhood search focuses on a single solution.
This document describes a parallel implementation of a multi-objective evolutionary algorithm (MOEA). It presents a graph-based MOEA called MEGA that uses genetic operators like mutations and crossover on graph representations. This is parallelized as PMEGA by running MEGA on separate processor cores and subpopulations with periodic migration. Experiments on a drug design problem show PMEGA achieves similar solution quality as MEGA while providing a 1.6x speedup on a dual-core CPU.
Introduction to Ontologies for Environmental Biology - Barry Smith
1. The document introduces ontologies for environmental biology and discusses several disciplines that could benefit from their use, including GIS, ecology, environmental biology, and various "-omics" fields.
2. It describes what an ontology is and compares ontologies to legends for maps or diagrams, which allow integration and help humans and computers make sense of complex data. Ontologies provide standardized terminology and annotations.
3. The document outlines the Open Biomedical Ontologies (OBO) Foundry, a collection of interoperable reference ontologies for annotating biomedical data. Foundry ontologies include the Gene Ontology and other ontologies for molecules, cells, anatomical structures, and more. They are developed through consensus and share
University of Toronto Chemistry Librarians Workshop June 2012 - Brock University
The document discusses a bioinformatics assignment that has been part of a third year Biology course at York University since 2009, which has involved using databases from NCBI, EMBL, and other sources to analyze gene and protein data from organisms like Tetrahymena and compare coding between species. Students conduct tasks like mRNA alignment, identifying protein domains and functions, and assessing restriction enzymes to complete their assignments, which have generally received high marks along with positive feedback from students.
Computational Protein Design. 1. Challenges in Protein Engineering - Pablo Carbonell
This document discusses computational protein design and outlines several key challenges in protein engineering. It begins with an overview of the protein design cycle and discusses locating amino acid substitutions, types of protein interactions, and engineering protein activity and binding affinity. The goal of protein engineering is to alter protein structures to improve properties, with the main challenge being developing accurate models to predict substitutions that enhance desired properties. The document provides details on computational approaches for increasing thermostability, catalytic activity, and binding affinity/specificity.
Research report (alternative splicing, protein structure; retinitis pigmentosa) - avalgar
This presentation explains the two major scientific projects I have been involved in. It goes well beyond a CV but is shorter than an actual scientific paper.
Scratchpads in the Biodiversity Informatics Landscape - Vince Smith
Roberts, D., Harman, K., Rycroft, S.D. & Smith, V.S. Stockholm Biodiversity Informatics Symposium 2008, Swedish Museum of Natural History, Stockholm, Sweden 1-4 December 2008.
The document summarizes the 2012 iGEM competition project from Carnegie Mellon University. The team developed fluorescent biosensors to characterize promoters by tagging mRNA with Spinach and proteins with a fluorogen activating protein (FAP). They created new inducible promoters and used fluorescence measurements and a mathematical model to characterize transcription and translation rates. The goal was to provide a better way to measure cellular activity without disrupting cells.
Towards integration of systems biology and biomedical ontologies - Robert Hoehndorf
Systems biology is an approach to biology that emphasizes the structure and dynamic behavior of biological systems and the interactions that occur within them. To succeed, systems biology crucially depends on the accessibility and integration of data across domains and levels of granularity. Biomedical ontologies were developed to facilitate such data integration and are often used to annotate biosimulation models in systems biology. Here, I present an approach towards combining both disciplines in a common framework that enables information to flow between them.
Reasoning over phenotype diversity, character change, and evolutionary descent - Hilmar Lapp
The document discusses developing an ontology and database called Phenoscape to integrate evolutionary phenotype data across studies by representing phenotypes using an entity-quality model, which links phenotypic qualities to the anatomical entities they describe for specific taxa. This will allow mining of phenotype data to generate hypotheses about candidate genes underlying evolutionary changes by comparing mutant phenotypes in model organisms to phenotypic changes between evolutionary lineages. The project aims to enable new discoveries through computational analysis and data-mining of curated evolutionary phenotype data.
Detection of genomic homology in eukaryotic genomes - Klaas Vandepoele
i-ADHoRe 3.0--fast and sensitive detection of genomic homology in extremely large data sets.
Proost S, Fostier J, De Witte D, Dhoedt B, Demeester P, Van de Peer Y, Vandepoele K.
Nucleic Acids Res. 2012 Jan;40(2):e11.
Comparative genomics is a powerful means to gain insight into the evolutionary processes that shape the genomes of related species. As the number of sequenced genomes increases, the development of software to perform accurate cross-species analyses becomes indispensable. However, many implementations that have the ability to compare multiple genomes exhibit unfavorable computational and memory requirements, limiting the number of genomes that can be analyzed in one run. Here, we present a software package to unveil genomic homology based on the identification of conservation of gene content and gene order (collinearity), i-ADHoRe 3.0, and its application to eukaryotic genomes. The use of efficient algorithms and support for parallel computing enable the analysis of large-scale data sets. Unlike other tools, i-ADHoRe can process the Ensembl data set, containing 49 species, in 1 h. Furthermore, the profile search is more sensitive to detect degenerate genomic homology than chaining pairwise collinearity information based on transitive homology. From ultra-conserved collinear regions between mammals and birds, by integrating coexpression information and protein-protein interactions, we identified more than 400 regions in the human genome showing significant functional coherence. The different algorithmical improvements ensure that i-ADHoRe 3.0 will remain a powerful tool to study genome evolution.
We can distinguish two families of approaches to the building of ontologies -- corresponding roughly to the contrast between 'neats' and 'scruffies' in artificial intelligence research. We describe the implications of each approach for the building of an ontology of philosophy, focusing especially on the Indiana Philosophy Ontology (InPhO) project led by Colin Allen.
A video presentation based on these slides is available here: https://www.youtube.com/watch?v=5HV3M0NvyPM
An application of Basic Formal Ontology to the Ontology of Services and Commo... - Barry Smith
Basic Formal Ontology (BFO) is an upper-level ontology widely used in biomedical informatics and other domains to support information integration across disciplines. Here we apply BFO to the development of a coherent ontological treatment of the distinction between commodities and services.
Research report (alternative splicing, protein structure; retinitis pigmentosa)avalgar
This presentation explains the two major scientific projects I have been involved in.
It goes well beyond a CV but is shorter than an actual scientific paper.
Scratchpads in the Biodiversity Informatics LandscapeVince Smith
Roberts, D., Harman, K., Rycroft, S.D. & Smith, V.S. Stockholm Biodiversity Informatics Symposium 2008, Swedish Museum of Natural History, Stockholm, Sweden 1-4 December 2008.
The document summarizes the 2012 iGEM competition project from Carnegie Mellon University. The team developed fluorescent biosensors to characterize promoters by tagging mRNA with Spinach and proteins with a fluorogen activating protein (FAP). They created new inducible promoters and used fluorescence measurements and a mathematical model to characterize transcription and translation rates. The goal was to provide a better way to measure cellular activity without disrupting cells.
Towards integration of systems biology and biomedical ontologiesRobert Hoehndorf
Systems biology is an approach to biology that emphasizes the
structure and dynamic behavior of biological systems and the
interactions that occur within them. To succeed, systems biology
crucially depends on the accessibility and integration of data across
domains and levels of granularity. Biomedical ontologies were
developed to facilitate such an integration for data and are often
used to annotate biosimulation models in systems biology.
Here, I present an approach towards combining both disciplines in a common framework that enables information to flow between both.
Reasoning over phenotype diversity, character change, and evolutionary descentHilmar Lapp
The document discusses developing an ontology and database called Phenoscape to integrate evolutionary phenotype data across studies by representing phenotypes using an entity-quality model, which links phenotypic qualities to the anatomical entities they describe for specific taxa. This will allow mining of phenotype data to generate hypotheses about candidate genes underlying evolutionary changes by comparing mutant phenotypes in model organisms to phenotypic changes between evolutionary lineages. The project aims to enable new discoveries through computational analysis and data-mining of curated evolutionary phenotype data.
Detection of genomic homology in eukaryotic genomesKlaas Vandepoele
i-ADHoRe 3.0--fast and sensitive detection of genomic homology in extremely large data sets.
Proost S, Fostier J, De Witte D, Dhoedt B, Demeester P, Van de Peer Y, Vandepoele K.
Nucleic Acids Res. 2012 Jan;40(2):e11.
Comparative genomics is a powerful means to gain insight into the evolutionary processes that shape the genomes of related species. As the number of sequenced genomes increases, the development of software to perform accurate cross-species analyses becomes indispensable. However, many implementations that have the ability to compare multiple genomes exhibit unfavorable computational and memory requirements, limiting the number of genomes that can be analyzed in one run. Here, we present a software package to unveil genomic homology based on the identification of conservation of gene content and gene order (collinearity), i-ADHoRe 3.0, and its application to eukaryotic genomes. The use of efficient algorithms and support for parallel computing enable the analysis of large-scale data sets. Unlike other tools, i-ADHoRe can process the Ensembl data set, containing 49 species, in 1 h. Furthermore, the profile search is more sensitive to detect degenerate genomic homology than chaining pairwise collinearity information based on transitive homology. From ultra-conserved collinear regions between mammals and birds, by integrating coexpression information and protein-protein interactions, we identified more than 400 regions in the human genome showing significant functional coherence. The different algorithmical improvements ensure that i-ADHoRe 3.0 will remain a powerful tool to study genome evolution.
Similar to Biomedical ontology tutorial_atlanta_june2011_part2 (20)
We can distinguish two families of approaches to the building of ontologies -- corresponding roughly to the contrast between 'neats' and 'scruffies' in artificial intelligence research. We describe the implications of each approach for the building of an ontology of philosophy, focusing especially on the Indiana Philosophy Ontology (InPhO) project led by Colin Allen.
A video presentation based on these slides is available here: https://www.youtube.com/watch?v=5HV3M0NvyPM
An application of Basic Formal Ontology to the Ontology of Services and Commo...Barry Smith
Basic Formal Ontology (BFO) is an upper level ontology widely used in biomedical informatics and other domains to support information integration across disciplines. We here apply BFO to the development of a coherent ontological treatment of the distinction between commodities and services.
Ways of Worldmarking: The Ontology of the EruvBarry Smith
‘Eruv’ is a Hebrew word meaning literally ‘mixture’ or ‘mingling’. An eruv is an urban region demarcated within a larger urban region by means of a boundary made up of telephone wires or similar markers. Through the creation of the eruv, the smaller region is turned symbolically (halachically = according to Jewish law) into a private domain. So long as they remain within the boundaries of the eruv, Orthodox Jews may engage in activities that would otherwise be prohibited on the Sabbath, such as pushing prams or wheelchairs, or carrying walking sticks. There are eruvim in many towns and university campuses throughout the world. There are five eruvim in Chicago, five in Brooklyn, twenty three in Queens and Long Island, and at least three in Manhattan. The US Supreme Court is (like most other major US Federal Government buildings) located within the eruv of Washington DC. In many cases, not all of those living within or near the area of an actual or proposed eruv will themselves be Orthodox Jews, and this has sometimes led to protests against eruv creation. For further details see http://ontology.buffalo.edu/smith/articles/eruv.pdf
Contemporary philosophy of collective agency, as illustrated by the work of Searle, Bratman, Gilbert, Pettit and others, focuses predominantly on small groups of agents sharing common goals. In his groundbreaking paper “Massively Shared Agency” of 2014, Scott Shapiro shows the limits of this approach when dealing with the large groups of agents that form industrial corporations, armies, or systems of law enforcement. Such groups will involve alienated or uncommitted participants pursuing motives of their own. And as Shapiro shows, they can manifest shared agency only when the actions of all participants are coordinated through authority structures organized hierarchically. Here I wish to focus on that dimension of massively shared agency that has to do with the transmission of authority. I will show that while such transmission almost always involves communication through speech (or through the digital counterparts of speech), transmission of this sort is too transient,
and falls short of creating the type of enduring intermeshing of plans and intentions that is required for the imposition of hierarchical authority structures across large organizations. To create and maintain the needed hierarchical authority structures what is required are complexes of intermeshed documents. Such documents provide for what we can think of as a division of deontic labor, allowing plans, orders, and obligations to be meshed together over time.
Presented at the conference on Truth, Image and Normativity, Cagliari, Sardinia, October 23, 2014
Increasingly, biological and clinical scientists are using ontologies to serve integration and coordination of research across diverse organisms and scientific fields. Ontologies, in this context, are logically organized collections of terms defined in such a way that they can be used consistently across multiple disciplines to describe clinical and experimental data. Ontologies are used in aging research to unify experimental results from a broad range of fields including genetics, proteomics, (stem) cell biology, oncology, model organism biology, psychogerontology, and many more. We will explore against this background questions such as the following: What is aging? What is premature aging? And more specifically: Is aging a disease?
The document summarizes regulations and guidance around meaningful use of electronic health records (EHRs) in the US. It outlines the stages of meaningful use, including capturing health information electronically, using it to track conditions, and communicating information for care coordination. It also discusses some challenges and risks, such as EHRs decreasing doctor efficiency and increasing wait times, as well as costs to implement privacy and security requirements. An expert warns that pressure on hospitals to receive payments may cost lives if doctors are forced to use EHRs that disrupt their work. The IOM recommends coordinating efforts to identify patient safety risks from health IT.
In a lecture, delivered in Vienna in 1894 and dedicated "to the academic youth of Austria-Hungary", Franz Brentano outlined four phases of advance and decline which he saw as providing the key to the understanding of the history of Western philosophy. In the first cycle, in antiquity, the initial advancing phase culminated in the work of Aristotle, and was followed by three phases of decline, terminating in the irrational mysticism of the Neo-Pythagoreans. These four phases then repeated themselves: in the Middle Ages, beginning with Aquinas and ending with the "learned ignorance" of Nicholas of Cusa; and then in the modern period, beginning with Bacon and reaching its low point in the work of Kant, Fichte, Schelling, and Hegel. In the contemporary era we are currently witnessing the end of the fourth cycle in the work of (for example) Derrida, Rorty; but also the beginnings of a new, fifth cycle, which is described in the talk. (Presented at the conference Consequences of Realism, Rome, May 4-6, 2014.)
There is blind chess but there is no blind poker. This is because to play poker essentially involves the use of cards and chips (or representations of or proxies for cards and chips). A game of chess, in contrast, may involve only the exchange of speech acts. We draw initial conclusions for the ontology of poker from this distinction.
Talk presented on March 14, 2014
For video presentation see http://www.youtube.com/edit?video_id=PgwpR9NPKzw
Clinical trial data wants to be free: Lessons from the ImmPort Immunology Dat...Barry Smith
Presentation to the Clinical and Research Ethics Seminar, Clinical and Translational Science Center, Buffalo, January 21, 2014
https://immport.niaid.nih.gov/
http://youtu.be/booqxkpvJMg
The Philosophome: An Exercise in the Ontology of the HumanitiesBarry Smith
Presentation at the opening of the Humanomics Research Centre at the University of Copenhagen, 7 February 2014
For background links see: http://philosophome.org/
We describe the methodology of omics disciplines in biology, and consider how analogous methods might be applied in humanities disciplines, focusing specifically on philosophy. We conclude by outlining a possible strategy for a research center in humanomics, identifying possible sources of data in the philosophical domain.
IAO-Intel: An Ontology of Information Artifacts in the Intelligence DomainBarry Smith
We describe on-going work on IAO-Intel, an information artifact ontology developed as part of a suite of ontologies designed to support the needs of the intelligence community. IAO-Intel provides a controlled, structured vocabulary for the consistent formulation of metadata about documents, images, emails and other carriers of information. It will provide a resource for uniform explication of the terms used in multiple existing military dictionaries, thesauri and metadata registries, thereby enhancing the degree to which the content formulated with their aid will be available to computational reasoning.
Presented at the 2013 STIDS (Semantic Technology for Intelligence, Defense and Security) conference: http://stids.c4i.gmu.edu/
Talk presented at the conference on the Philosophy of Emerging Media, Boston University, October 26-27, 2013
If you try to find information about a gene or a molecule or a restaurant or a sports team or a politician on the web, it’s likely that some ontology will be involved in your search. An ontology is (briefly put) a semantically organized consensus representation of the types of entities in a given domain and of the relations between these entities – it is something like a large graph of the way some part of the world is structured. So important have ontologies become to organizations such as the BBC or the New York Times, that there is a running joke in the Semantic Web community to the effect that the Columbia School of Journalism is about to be renamed the Columbia School of Journalism and Ontology. I will attempt to draw conclusions from these phenomena concerning the ways in which social interactions are being influenced, and to some degree also transformed, by digital media.
e‐Human Beings: The contribution of internet ranking systems to the developme...Barry Smith
This document discusses how online rating systems can contribute to human capital development and personal identity. Simple rating apps like Uber allow both customers and drivers to rate each other, shaping reputations and future interactions. Academics are increasingly defined by their online rankings on sites like Google Scholar. Mass collaboration through virtual choirs and military operations demonstrates how digital interconnection can enable new forms of coordinated action. The author proposes that personal identity is now composed of both biological and digital aspects, with an individual's plans, skills and reputation intermeshed with others online.
The idea underlying biomedical ontology is that, if common terms are used to annotate or tag heterogeneous data collected by scientists working in different disciplines, then these data will be more easily reused for integration and
analysis. To this end, the terms in ontologies need to be carefully defined. Smith examines definitions
of terms central to ageing research in this light, focusing on the Gene Ontology (GO), the Foundational Model of Anatomy Ontology (FMA) and the Plant Ontology (PO).
This document discusses biomedical ontology work being done at the University at Buffalo. It describes three US partner institutions collaborating on biomedical ontologies and lists several biomedical ontologies co-developed at UB, including the Basic Formal Ontology and Foundational Model of Anatomy. It outlines a strategy using these ontologies to provide consistent representation of knowledge in an ontology repository at the Institute for Healthcare Informatics.
ImmPort strategies to enhance discoverability of clinical trial dataBarry Smith
Describes strategies for submission of clinical trial data to the NIAID Immunology Database and Analysis Portal in order to advance discoverability, comparability and analysis
Introduces the idea of a theory of document acts, analogous to the theory of social acts advocated in 1913 by Adolf Reinach, and to the theory of speech acts advanced by Austin and Searle.
Ontology and the National Cancer Institute Thesaurus (2005)Barry Smith
The National Cancer Institute Thesaurus is described by its authors as "a biomedical vocabulary that provides consistent, unambiguous codes and definitions for concepts used in cancer research" and which "exhibits ontology-like properties in its construction and use". We performed a qualitative analysis of the Thesaurus in order to assess its conformity with principles of good practice in terminology and ontology design.
MATERIALS AND METHODS:
We used both the on-line browsable version of the Thesaurus and its OWL-representation (version 04.08b, released on August 2, 2004), measuring each in light of the requirements put forward in relevant ISO terminology standards and in light of ontological principles advanced in the recent literature.
RESULTS:
We found many mistakes and inconsistencies with respect to the term-formation principles used, the underlying knowledge representation system, and missing or inappropriately assigned verbal and formal definitions.
CONCLUSION:
Version 04.08b of the NCI Thesaurus suffers from the same broad range of problems that have been observed in other biomedical terminologies. For its further development, we recommend the use of a more principled approach that allows the Thesaurus to be tested not just for internal consistency but also for its degree of correspondence to that part of reality which it is designed to represent.
Introduction to the Logic of DefinitionsBarry Smith
We focus on definitions for common nouns such as 'human being' and 'leukocyte'.
A definition, for terms such as these, is a statement of necessary and sufficient conditions for an entity's falling under an instance of the type to which the term refers.
Such definitions should be of Aristotelian form, which means that they specify the genus, and state what it is about certain instances of this genus in virtue of which they are instances of the type (species) defined.
This document discusses biomedical ontologies developed at the University at Buffalo. It describes partnerships with Stanford University and Mayo Clinic for biomedical informatics research. It outlines ontologies developed at UB including the Basic Formal Ontology, Foundational Model of Anatomy, and Ontology for General Medical Science. It provides examples of using ontologies to represent diseases and infectious processes. It also discusses using ontologies for consistent representation of data in a biomedical data repository.
2. Problems with UMLS-style approaches
• let a million ontologies bloom, each one close to the terminological habits of its authors
• in concordance with the “not invented here” syndrome
• then map these ontologies, and use these mappings to integrate your different pots of data
3. Mappings are hard
They create an N² problem, and are fragile and expensive to maintain
Need new authorities to maintain them (one for each pair of mapped ontologies), yielding new risk of forking – who will police the mappings?
The goal should be to minimize the need for mappings, by avoiding redundancy in the first place – one ontology for each domain
Invest resources in disjoint ontology modules which work well together – reduce need for mappings to minimum possible
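The N² growth can be checked with a line of arithmetic. The sketch below assumes one mapping per unordered pair of ontologies ("one for each pair of mapped ontologies"); `pairwise_mappings` is a hypothetical helper name and is not part of any ontology toolkit.

```python
def pairwise_mappings(n: int) -> int:
    """Count unordered pairs among n ontologies: n * (n - 1) / 2."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


if __name__ == "__main__":
    for n in (2, 10, 100):
        print(f"{n} ontologies -> {pairwise_mappings(n)} pairwise mappings")
```

Going from 10 to 100 ontologies raises the count from 45 to 4950, and each of those mappings would need its own maintainer.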
4. Why should you care?
• you need to create systems for data mining and text processing which will yield useful digitally coded output
• if the codes you use are constantly in need of ad hoc repair, huge resources will be wasted
• serious investment in annotation will be defeated from the start
• relevant data will not be found, because it will be lost in multiple semantic cemeteries
5. How to do it right?
• how to create an incremental, evolutionary process, where what is good survives, and what is bad fails
• where the number of ontologies needing to be used together is small – integration = addition
• where these ontologies are stable
• by creating a scenario in which people will find it profitable to reuse ontologies, terminologies and coding systems which have been tried and tested
6. Reasons why GO has been successful
It is a system for prospective standardization built with a coherent top level but with content contributed and monitored by domain specialists
Based on community consensus
Updated every night
Clear versioning principles ensure backwards compatibility; prior annotations do not lose their value
Initially low-tech to encourage users, with movement to more powerful formal approaches (including OWL-DL – though still proceeding with caution)
7. GO has learned the lessons of successful cooperation
• Clear documentation
• The terms chosen are already familiar
• Fully open source (allows thorough testing in manifold combinations with other ontologies)
• Subjected to considerable third-party critique
• Tracker for user input and help desk with rapid turnaround
8. GO has been amazingly successful in overcoming the data balkanization problem
but it covers only generic biological entities of three sorts:
– cellular components
– molecular functions
– biological processes
no diseases, symptoms, disease biomarkers, protein interactions, experimental processes …
9. OBO (Open Biomedical Ontology) Foundry proposal (Gene Ontology in yellow)
Columns (relation to time): CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
Rows (granularity):
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) |
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)
10. Environment Ontology
Columns (relation to time): CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
Rows (granularity):
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) |
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)
Shown alongside the grid: Environment Ontology (ENVO)
11. Population-level ontologies
Columns (relation to time): CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
Rows (granularity):
COMPLEX OF ORGANISMS | Family, Community, Deme, Population | Population Phenotype | Population Process
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) |
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)
12. The OBO Foundry: a step-by-step, evidence-based approach to expanding the GO
Developers commit to working to ensure that, for each domain, there is community convergence on a single ontology and agree in advance to collaborate with developers of ontologies in adjacent domains.
http://obofoundry.org
13. OBO Foundry Principles
Common governance (coordinating editors)
Common training
Common architecture:
• simple shared top level ontology (BFO)
• shared Relation Ontology: www.obofoundry.org/ro
14. Open Biomedical Ontologies Foundry
Seeks to create high quality, validated terminology modules across all of the life sciences which will be
• one ontology for each domain, so no need for mappings
• close to language use of experts
• evidence-based
• incorporate a strategy for motivating potential developers and users
• revisable as science advances
16. OBO Foundry coverage
Columns (relation to time): CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
Rows (granularity):
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) | Cellular Process (GO)
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)
17. ORTHOGONALITY
modularity ensures
• annotations can be additive
• division of labor amongst domain experts
• high value of training in any given module
• lessons learned in one module can benefit work on other modules
• incentivization of those responsible for individual modules
18. Benefits of coordination
• Can more easily reuse what is made by others
• Can more easily inspect and criticize what is made by others
• Leads to innovations (e.g. MIREOT strategy for importing terms into ontologies)
19. Current Foundry members in yellow
Columns (relation to time): CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
Rows (granularity):
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); XAO; ZFA | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) |
MOLECULE | Molecule (SO, RnaO); ChEBI; PRO | Molecular Function (GO) | Molecular Process (GO)
20. Foundry ontologies currently under review
Plant Ontology (PO)
Ontology for Biomedical Investigations (OBI)
Ontology for General Medical Science (OGMS)
Infectious Disease Ontology (IDO)
21. OBO Foundry Modular Organization
top level: Basic Formal Ontology (BFO)
mid-level: Information Artifact Ontology (IAO); Ontology for Biomedical Investigations (OBI); Ontology of General Medical Science (OGMS)
domain level: Anatomy Ontology (FMA*, CARO); Environment Ontology (EnvO); Infectious Disease Ontology (IDO*); Cell Ontology (CL); Cellular Component Ontology (FMA*, GO*); Phenotypic Quality Ontology (PaTO); Biological Process Ontology (GO*); Subcellular Anatomy Ontology (SAO); Sequence Ontology (SO*); Molecular Function (GO*); Protein Ontology (PRO*)
22. OBI
The Ontology for Biomedical Investigations
http://purl.org/obo/OBI_0000225
23. Purpose of OBI
To provide a resource for the unambiguous description of the components of biomedical investigations such as the design, protocols and instrumentation, material, data and types of analysis and statistical tools applied to the data
NOT designed to model biology
29. Ontology for General Medical Science
http://code.google.com/p/ogms/
(OBO) http://purl.obolibrary.org/obo/ogms.obo
(OWL) http://purl.obolibrary.org/obo/ogms.owl
31. Ontology for General Medical Science
Jobst Landgrebe (then Co-Chair of the HL7 Vocabulary Group):
“the best ontology effort in the whole biomedical domain by far”
32. EXPERIMENTAL ARTIFACTS: Ontology for Biomedical Investigations (OBI)
CLINICAL MEDICINE: Ontology of General Medical Science (OGMS)
INFORMATION ARTIFACTS: Information Artifact Ontology (IAO)
How to keep clear about the distinction between
• processes of observation
• results of such processes (measurement data)
• the entities observed
33. How is the OBO Foundry organized?
• Top-Level: Basic Formal Ontology (BFO)
• Mid-Level: IAO, OBI, OGMS ...
• Domain-Level: Foundry Bio-Ontologies
34. OBO Foundry Modular Organization
top level: Basic Formal Ontology (BFO)
mid-level: Information Artifact Ontology (IAO); Ontology for Biomedical Investigations (OBI); Ontology of General Medical Science (OGMS)
domain level: Anatomy Ontology (FMA*, CARO); Environment Ontology (EnvO); Infectious Disease Ontology (IDO*); Cell Ontology (CL); Cellular Component Ontology (FMA*, GO*); Phenotypic Quality Ontology (PaTO); Biological Process Ontology (GO*); Subcellular Anatomy Ontology (SAO); Sequence Ontology (SO*); Molecular Function (GO*); Protein Ontology (PRO*)
35. BFO: the very top
Continuant
  Independent Continuant
  Dependent Continuant
Occurrent (Process, Event)
36. Columns (relation to time): CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
Rows (granularity):
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) |
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)
37. obofoundry.org
Columns (relation to time): CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
Rows (granularity):
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) | Cellular Process (GO)
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)
38. BFO & GO
continuant
  independent continuant: cellular component
  dependent continuant: molecular function
occurrent: biological processes
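The placement of the three GO branches under BFO can be written down as a small lookup table. This is a sketch only: the dictionary and function names are hypothetical, and the parent links simply restate the slide's tree rather than being loaded from the BFO or GO files.

```python
# Parent links for the top of BFO as drawn on the slide (None marks a root).
BFO_PARENT = {
    "independent continuant": "continuant",
    "dependent continuant": "continuant",
    "continuant": None,
    "occurrent": None,
}

# Where each GO branch hangs, per the slide.
GO_BRANCH_TO_BFO = {
    "cellular component": "independent continuant",
    "molecular function": "dependent continuant",
    "biological process": "occurrent",
}


def bfo_path(go_branch: str) -> list[str]:
    """Return the BFO categories above a GO branch, most specific first."""
    path = []
    node = GO_BRANCH_TO_BFO[go_branch]
    while node is not None:
        path.append(node)
        node = BFO_PARENT[node]
    return path
```

For example, `bfo_path("molecular function")` walks from dependent continuant up to continuant, while `bfo_path("biological process")` stops at occurrent.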
40. Experience with BFO in building ontologies provides
• a community of skilled ontology developers and users (user group has 120 members)
• associated logical tools
• documentation for different types of users
• a methodology for building conformant ontologies by starting with BFO and populating downwards
42. How to build an ontology
import BFO into ontology editor such as Protégé
work with domain experts to create an initial mid-level classification
find ~50 most commonly used terms corresponding to types in reality
arrange these terms into an informal is_a hierarchy according to this universality principle
A is_a B ≡ every instance of A is an instance of B
fill in missing terms to give a complete hierarchy
(leave it to domain experts to populate the lower levels of the hierarchy)
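The universality principle lends itself to a mechanical check once some instance data is available. The following is a minimal sketch under stated assumptions: the toy data is invented for illustration, `violations` is a hypothetical helper, and real ontology editors such as Protégé rely on reasoners rather than on code like this.

```python
# Asserted is_a edges: child type -> parent type.
IS_A = {
    "leukocyte": "cell",
    "cell": "independent continuant",
}

# Toy instance data: type -> set of known instances (invented labels).
INSTANCES = {
    "leukocyte": {"leukocyte_1", "leukocyte_2"},
    "cell": {"leukocyte_1", "leukocyte_2", "neuron_1"},
    "independent continuant": {"leukocyte_1", "leukocyte_2", "neuron_1", "john"},
}


def violations(is_a, instances):
    """List every edge A is_a B where some instance of A is not an instance of B."""
    bad = []
    for child, parent in is_a.items():
        missing = instances.get(child, set()) - instances.get(parent, set())
        if missing:
            bad.append((child, parent, missing))
    return bad
```

An empty result means every asserted is_a edge satisfies "every instance of A is an instance of B" on the data at hand; it does not prove the edge holds in general.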
43. Users of BFO
PharmaOntology (W3C HCLS SIG)
MediCognos / Microsoft Healthvault
Cleveland Clinic Semantic Database in Cardiothoracic Surgery
Major Histocompatibility Complex (MHC) Ontology (NIAID)
Neuroscience Information Framework Standard (NIFSTD) and Constituent Ontologies
Interdisciplinary Prostate Ontology (IPO)
Nanoparticle Ontology (NPO): Ontology for Cancer Nanotechnology Research
Neural Electromagnetic Ontologies (NEMO)
ChemAxiom – Ontology for Chemistry
44. Users of BFO
GO Gene Ontology
CL Cell Ontology
SO Sequence Ontology
ChEBI Chemical Ontology
PATO Phenotype (Quality) Ontology
FMA Foundational Model of Anatomy Ontology
ChEBI Chemical Entities of Biological Interest
PRO Protein Ontology
Plant Ontology
Environment Ontology
Ontology for Biomedical Investigations
RNA Ontology
45. Users of BFO
Ontology for Risks Against Patient Safety (RAPS/REMINE)
eagle-i and VIVO (NCRR)
IDO Infectious Disease Ontology (NIAID)
National Cancer Institute Biomedical Grid Terminology (BiomedGT)
US Army Biometrics Ontology
US Army Command and Control Ontology
Sleep Domain Ontology
Subcellular Anatomy Ontology (SAO)
Translational Medicine On (VO)
Yeast Ontology (yOWL)
Zebrafish Anatomical Ontology (ZAO)
47. Continuants
• continue to exist through time,
preserving their identity while
undergoing different sorts of changes
• independent continuants – objects,
things, ...
• dependent continuants – qualities,
attributes, shapes, potentialities ...
50. Qualities
temperature / blood pressure / mass ...
are dimensions of variation within the
structure of the entity
a quality is something which can
change while its bearer remains one
and the same
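53. BFO: The Very Top
[Diagram: the top level divides into continuant and occurrent; continuant divides into independent continuant and dependent continuant; quality falls under dependent continuant, with temperature as a subtype of quality.]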
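54. Blinding Flash of the Obvious
[Diagram, two levels. Types: organism is an independent continuant; temperature is a quality, which is a dependent continuant. Instances: John is an instance of organism; John's temperature is an instance of temperature.]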
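56. Blinding Flash of the Obvious
[Diagram, two levels, with the relation inheres_in connecting the dependent side to the independent side. Types: temperature and organism. Instances: John's temperature inheres_in John.]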
57. temperature types
types: 37ºC, 37.1ºC, 37.2ºC, 37.3ºC, 37.4ºC, 37.5ºC
instances: John's temperature instantiates each of these types in turn, at t1, t2, t3, t4, t5 and t6 respectively
58. human types
types: embryo, fetus, neonate, infant, child, adult
instances: John instantiates each of these types in turn, at t1, t2, t3, t4, t5 and t6 respectively
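Both slides describe instantiation indexed by time: one and the same instance (John, or John's temperature) instantiates different types at different times. The following is a minimal Python sketch of that idea, not taken from the slides. The integer time indices, the rule that a type holds from its recorded time until the next record, and the names `JOHN_STAGES`, `JOHN_TEMPERATURE` and `type_at` are illustrative assumptions.

```python
from bisect import bisect_right

# Hypothetical histories: (time index, type instantiated from that time on).
# The instance itself stays the same; only the type it instantiates changes.
JOHN_STAGES = [
    (1, "embryo"), (2, "fetus"), (3, "neonate"),
    (4, "infant"), (5, "child"), (6, "adult"),
]
JOHN_TEMPERATURE = [
    (1, "37ºC"), (2, "37.1ºC"), (3, "37.2ºC"),
    (4, "37.3ºC"), (5, "37.4ºC"), (6, "37.5ºC"),
]


def type_at(history, t):
    """Return the type instantiated at time t, or None before the first record.

    Assumes history is sorted by time and that each type holds until the
    next recorded change.
    """
    times = [time for time, _ in history]
    pos = bisect_right(times, t)
    return history[pos - 1][1] if pos else None


if __name__ == "__main__":
    print(type_at(JOHN_STAGES, 3))       # type John instantiates at t3
    print(type_at(JOHN_TEMPERATURE, 6))  # type John's temperature instantiates at t6
```

Keeping the history as an ordered list reflects the point made about threshold divisions: the number of subtypes distinguished is a modelling choice, but their ordering is not.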
59. Temperature subtypes and development-stage subtypes
are threshold divisions (hence we do not have sharp boundaries, and we have a certain degree of choice, e.g. in how many subtypes to distinguish, though not in their ordering)
60. [Diagram, two levels. Types: organism is an independent continuant; temperature is a quality, which is a dependent continuant. Instances: John; John's temperature.]
61. [Diagram, two levels, now with an occurrent column. Types: organism is an independent continuant; temperature is a quality (dependent continuant); course of temperature changes is a process (occurrent). Instances: John; John's temperature; John's temperature history.]
62. [Diagram, two levels. Types: organism is an independent continuant; temperature is a quality (dependent continuant); life of an organism is a process (occurrent). Instances: John; John's temperature; John's life.]
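63. BFO: The Very Top
[Diagram: the top level divides into continuant and occurrent; continuant divides into independent continuant and dependent continuant; quality and disposition fall under dependent continuant.]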
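64. BFO: The Very Top
[Diagram: the top level divides into continuant and occurrent; continuant divides into independent continuant and dependent continuant; quality, function, role and disposition fall under dependent continuant.]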
65. disposition
- of a glass vase, to shatter if dropped
- of a human, to eat
- of a banana, to ripen
- of John, to lose hair
66. disposition
if it ceases to exist, then its bearer
and/or its immediate surrounding
environment is physically changed
its realization occurs when its bearer is in
some special physical circumstances
its realization is what it is in virtue of the
bearer’s physical make-up
67. [Diagram, two levels. Types: eye is an independent continuant; to see is a function (dependent continuant); process of seeing is a process (occurrent). Instances: John's eye; function of John's eye: to see; John seeing.]
68. OGMS
Ontology for General Medical
Science
http://code.google.com/p/ogms
69. R T U New York State
Center of Excellence in
Bioinformatics & Life
Sciences
Ontology for General Medical Science (OGMS)
• ontology for the representation of
– diseases, signs, symptoms
– clinical processes
– diagnosis, treatment and outcomes
• fundamental idea:
– a disease is a disposition rooted in some
(physical) disorder in the organism
70. Motivation
• Clarity about:
– disease etiology and progression
– disease and the diagnostic process
– phenotype and signs/symptoms
– entities in reality and observations of such entities
72. Physical Disorder
– independent continuant: fiat object part
A causally linked combination of physical components of the extended organism that is clinically abnormal.
73. Clinically abnormal
– (1) not part of the life plan for an organism
of the relevant type (unlike aging or
pregnancy),
– (2) causally linked to an elevated risk
either of pain or other feelings of illness,
or of death or dysfunction, and
– (3) such that the elevated risk exceeds a
certain threshold level.*
*Compare: baldness
75. Pathological Process
=def. A bodily process that is a
manifestation of a disorder and is clinically
abnormal.
Disease =def. – A disposition to undergo
pathological processes that exists in an
organism because of one or more
disorders in that organism.
76. Cirrhosis - environmental exposure
• Etiological process - phenobarbital-induced hepatic cell death
– produces
• Disorder - necrotic liver
– bears
• Disposition (disease) - cirrhosis
– realized_in
• Pathological process - abnormal tissue repair with cell proliferation
and fibrosis that exceed a certain threshold; hypoxia-induced cell
death
– produces
• Abnormal bodily features
– recognized_as
• Symptoms - fatigue, anorexia
• Signs - jaundice, enlarged spleen
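The chain on this slide (etiological process produces disorder, disorder bears disposition, disposition realized_in pathological process, and so on) can be written down as subject-relation-object triples. The following is a minimal Python sketch under that assumption. The relation names come from the slide; the shortened node labels and the helpers `follow` and `relation_between` are hypothetical and are not part of OGMS.

```python
# Each triple is (subject, relation, object). Relation names follow the slide;
# the node labels are shortened for illustration.
CIRRHOSIS = [
    ("phenobarbital-induced hepatic cell death", "produces", "necrotic liver"),
    ("necrotic liver", "bears", "cirrhosis"),
    ("cirrhosis", "realized_in", "abnormal tissue repair"),
    ("abnormal tissue repair", "produces", "abnormal bodily features"),
    ("abnormal bodily features", "recognized_as", "symptoms and signs"),
]


def follow(triples, start):
    """Walk the chain from start and return the nodes visited, in order.

    Assumes each subject has at most one outgoing triple; stops if a node
    would be revisited, so a cyclic input cannot loop forever.
    """
    nxt = {s: o for s, _, o in triples}
    path = [start]
    while path[-1] in nxt and nxt[path[-1]] not in path:
        path.append(nxt[path[-1]])
    return path


def relation_between(triples, subject, obj):
    """Return the relation asserted from subject to obj, or None if there is none."""
    for s, r, o in triples:
        if s == subject and o == obj:
            return r
    return None


if __name__ == "__main__":
    print(" -> ".join(follow(CIRRHOSIS, "necrotic liver")))
    print(relation_between(CIRRHOSIS, "necrotic liver", "cirrhosis"))
```

Writing the chain as data makes the central claim easy to inspect: the disease (cirrhosis) is the disposition borne by the disorder, distinct from both the disorder itself and the pathological process in which it is realized.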
77. Dispositions and Predispositions
All diseases are dispositions; not all
dispositions are diseases.
Predisposition to Disease
=def. – A disposition in an organism that
constitutes an increased risk of the
organism’s subsequently developing some
disease.
78. HNPCC - genetic pre-disposition
• Etiological process - inheritance of a mutant mismatch repair gene
– produces
• Disorder - chromosome 3 with abnormal hMLH1
– bears
• Disposition (disease) - Lynch syndrome
– realized_in
• Pathological process - abnormal repair of DNA mismatches
– produces
• Disorder - mutations in proto-oncogenes and tumor suppressor genes
with microsatellite repeats (e.g. TGF-beta R2)
– bears
• Disposition (disease) - non-polyposis colon cancer
– realized_in
• Symptoms (including pain)
79. Huntington’s Disease - genetic
• Etiological process - inheritance of >39 CAG repeats in the HTT gene
– produces
• Disorder - chromosome 4 with abnormal mHTT
– bears
• Disposition (disease) - Huntington’s disease
– realized_in
• Pathological process - accumulation of mHTT protein fragments, abnormal transcription regulation, neuronal cell death in striatum
– produces
• Abnormal bodily features
– recognized_as
• Symptoms - anxiety, depression
• Signs - difficulties in speaking and swallowing

• Symptoms & Signs
– used_in
• Interpretive process
– produces
• Hypothesis - rule out Huntington’s
– suggests
• Laboratory tests
– produces
• Test results - molecular detection of the HTT gene with >39 CAG repeats
– used_in
• Interpretive process
– produces
• Result - diagnosis that patient X has a disorder that bears the disease Huntington’s disease
81. Cirrhosis - environmental exposure
• Etiological process - phenobarbital-induced hepatic cell death
– produces
• Disorder - necrotic liver
– bears
• Disposition (disease) - cirrhosis
– realized_in
• Pathological process - abnormal tissue repair with cell proliferation and fibrosis that exceed a certain threshold; hypoxia-induced cell death
– produces
• Abnormal bodily features
– recognized_as
• Symptoms - fatigue, anorexia
• Signs - jaundice, splenomegaly

• Symptoms & Signs
– used_in
• Interpretive process
– produces
• Hypothesis - rule out cirrhosis
– suggests
• Laboratory tests
– produces
• Test results - elevated liver enzymes in serum
– used_in
• Interpretive process
– produces
• Result - diagnosis that patient X has a disorder that bears the disease cirrhosis
82. Systemic arterial hypertension
• Etiological process – abnormal reabsorption of NaCl by the kidney
– produces
• Disorder – abnormally large scattered molecular aggregate of salt in the blood
– bears
• Disposition (disease) - hypertension
– realized_in
• Pathological process – exertion of abnormal pressure against arterial wall
– produces
• Abnormal bodily features
– recognized_as
• Symptoms - headaches, dizziness
• Signs – elevated blood pressure

• Symptoms & Signs
– used_in
• Interpretive process
– produces
• Hypothesis - rule out hypertension
– suggests
• Laboratory tests
– produces
• Test results -
– used_in
• Interpretive process
– produces
• Result - diagnosis that patient X has a disorder that bears the disease hypertension
83. Type 2 Diabetes Mellitus
• Etiological process –
– produces
• Disorder – abnormal pancreatic beta cells and abnormal muscle/fat cells
– bears
• Disposition (disease) – diabetes mellitus
– realized_in
• Pathological processes – diminished insulin production, diminished muscle/fat uptake of glucose
– produces
• Abnormal bodily features
– recognized_as
• Symptoms – polydipsia, polyuria, polyphagia, blurred vision
• Signs – elevated blood glucose and hemoglobin A1c

• Symptoms & Signs
– used_in
• Interpretive process
– produces
• Hypothesis - rule out diabetes mellitus
– suggests
• Laboratory tests – fasting serum blood glucose, oral glucose challenge test, and/or blood hemoglobin A1c
– produces
• Test results -
– used_in
• Interpretive process
– produces
• Result - diagnosis that patient X has a disorder that bears the disease type 2 diabetes mellitus
84. Type 1 hypersensitivity to penicillin
• Etiological process – sensitizing of mast cells and basophils during exposure to penicillin-class substance
– produces
• Disorder – mast cells and basophils with epitope-specific IgE bound to Fc epsilon receptor I
– bears
• Disposition (disease) – type I hypersensitivity
– realized_in
• Pathological process – type I hypersensitivity reaction
– produces
• Abnormal bodily features
– recognized_as
• Symptoms – pruritus, shortness of breath
• Signs – rash, urticaria, anaphylaxis

• Symptoms & Signs
– used_in
• Interpretive process
– produces
• Hypothesis -
– suggests
• Laboratory tests –
– produces
• Test results – occasionally, skin testing
– used_in
• Interpretive process
– produces
• Result - diagnosis that patient X has a disorder that bears the disease type 1 hypersensitivity to penicillin
86. Disease vs. Disease course
Disease =def. – A disposition to undergo
pathological processes that exists in an
organism because of one or more
disorders in that organism.
Disease course =def. – The aggregate of
processes in which a disease disposition
is realized.
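The distinction between a disease (a disposition, hence a continuant) and its course (an aggregate of processes) can be sketched as a small data model. The following Python sketch is an illustration only: the `Process` class, its integer `start` and `end` fields, the sample records and the `disease_course` helper are hypothetical and are not drawn from OGMS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    """A pathological process, tagged with the disease disposition it realizes."""
    description: str
    realizes: str  # name of the disease (disposition) realized by this process
    start: int     # hypothetical time index
    end: int       # hypothetical time index


def disease_course(processes, disease):
    """Return the processes in which the given disease is realized, ordered by start."""
    return sorted(
        (p for p in processes if p.realizes == disease),
        key=lambda p: p.start,
    )


# Hypothetical records for illustration.
PROCESSES = [
    Process("abnormal tissue repair", "cirrhosis", 3, 9),
    Process("hypoxia-induced cell death", "cirrhosis", 1, 4),
    Process("acute inflammation", "flu", 2, 5),
]

if __name__ == "__main__":
    for p in disease_course(PROCESSES, "cirrhosis"):
        print(p.start, p.end, p.description)
```

An empty course does not mean the disease is absent: on the definitions above, a disposition can exist in an organism without yet being realized in any process.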
87. coronary heart disease
types: disease associated with early lesions and small fibrous plaques; disease associated with asymptomatic (‘silent’) infarction; disease associated with surface disruption of plaque; unstable angina; stable angina
instances: John’s coronary heart disease instantiates each of these types in turn, at t1, t2, t3, t4 and t5 respectively, ordered along the time axis
88. [Diagram, two levels. Types: disorder is an independent continuant; disease is a disposition (dependent continuant); course of disease is a process (occurrent). Instances: John’s disordered heart; John’s coronary heart disease; course of John’s disease.]
90. IDO (Infectious Disease Ontology) Core
Follows GO strategy of providing a
canonical ontology of what is involved
in every infectious disease – host,
pathogen, vector, virulence, vaccine,
transmission – accompanied by IDO
Extensions for specific diseases,
pathogens and vectors
Provides common terminology resources
and tested common guidelines for a
vast array of different disease
communities
91. Infectious Disease Ontology Consortium
• MITRE, Mount Sinai, UTSouthwestern –
Influenza
• IMBB/VectorBase – Vector borne diseases (A.
gambiae, A. aegypti, I. scapularis, C. pipiens,
P. humanus)
• Colorado State University – Dengue Fever
• Duke University – Tuberculosis, Staph. aureus
• Cleveland Clinic – Infective Endocarditis
• University of Michigan – Brucellosis
• Duke University, University at Buffalo – HIV
92. Influenza - infectious
• Etiological process - infection of airway epithelial cells with
influenza virus
– produces
• Disorder - viable cells with influenza virus
– bears
• Disposition (disease) - flu
– realized_in
• Pathological process - acute inflammation
– produces
• Abnormal bodily features
– recognized_as
• Symptoms - weakness, dizziness
• Signs - fever
93. Influenza – disease course
• Etiological process - infection of airway epithelial cells with influenza virus
– produces
• Disorder - viable cells with influenza virus
– bears
• Disposition (disease) - flu
– realized_in
• Pathological process - acute inflammation
– produces
• Abnormal bodily features
– recognized_as
• Symptoms - weakness, dizziness
• Signs - fever

The disorder also induces normal physiological processes (immune response) that can result in the elimination of the disorder (transient disease course).