Chris Mungall discussed his path in biocuration, which led him to focus on ontologies. Ontologies can amplify the impact of data by providing a structured knowledge framework. Early ontologies like GO became too monolithic, so the Open Biological Ontologies (OBO) Foundry was created to develop interoperable, modular ontologies through collaboration. Mungall described his work building ontologies such as Uberon and quality-control tools such as ROBOT, and set out a vision for more sophisticated ontology annotation to encode biological knowledge.
1. The document discusses using phenotypes across species to aid in interpreting genomic data from patients and improving diagnosis and treatment.
2. Building comprehensive phenotype databases from multiple sources is challenging due to disparate data on human genes/variants and model organisms.
3. The Monarch Initiative aims to link human diseases to phenotypes in model systems through an ontology-based knowledge base and portal.
4. Incorporating rich phenotypic data can improve variant filtering and interpretation by providing more context for sequencing results.
Representation of kidney structures in Uberon (Chris Mungall)
The document discusses representation of kidney structures in the Uberon anatomy ontology. It provides examples of kidney classes like glomerular capsule and S-shaped body represented in Uberon along with their relationships. It also discusses how Uberon integrates representations of kidney structures from other species and anatomy ontologies through equivalence axioms and cross-links.
Collaboratively Creating the Knowledge Graph of Life (Chris Mungall)
The document discusses collaboratively building a knowledge graph of life by connecting existing biological ontologies. It describes how ontologies can standardize and organize biological data by representing entities and their relationships in a graph. The challenges of integrating different ontology projects are addressed through initiatives like the Open Biological and Biomedical Ontologies (OBO) Foundry. The document outlines how ontologies can be formalized using OWL and connected using tools like the Ontology Development Kit to enable discovery across domains. Current efforts like the Gene Ontology, Biolink Model, and National Microbiome Data Collaborative are leveraging these techniques to create unified, semantically queryable knowledge graphs.
Causal reasoning using the Relation Ontology (Chris Mungall)
The document discusses the need for standardized relationship types in biological data and ontologies. It provides an overview of the Relation Ontology (RO), which defines over 450 standardized relationship types organized hierarchically. RO provides a foundation for integrating multiple knowledge graphs and represents relationships in ontologies, linked data, and knowledge bases. It enables logical reasoning and inference across graphs through properties like transitivity.
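As a minimal sketch of the kind of inference that transitivity allows, the snippet below computes every part-of statement implied by a small set of asserted ones. The three edges and the function name `transitive_closure` are illustrative assumptions, not content taken from RO or from the talk.

```python
from collections import defaultdict


def transitive_closure(edges):
    """Return every (subject, object) pair implied by a transitive relation."""
    graph = defaultdict(set)
    for subject, obj in edges:
        graph[subject].add(obj)

    closure = set()
    for start in list(graph):
        stack = list(graph[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            closure.add((start, node))
            # .get avoids creating new keys while walking the graph
            stack.extend(graph.get(node, ()))
    return closure


# Illustrative part-of assertions (assumed for this example only)
part_of = [
    ("glomerular capsule", "nephron"),
    ("nephron", "kidney"),
    ("kidney", "renal system"),
]

# Statements that were not asserted but follow from transitivity
inferred = transitive_closure(part_of) - set(part_of)
for subject, obj in sorted(inferred):
    print(f"{subject} part_of {obj}")
```

A production reasoner works over OWL axioms rather than plain tuples; the sketch only shows why declaring a relation transitive lets a query for parts of the kidney also return parts of the nephron.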
Experiences in the biosciences with the open biological ontologies foundry an... (Chris Mungall)
The document discusses the need for ontologies in biology to integrate data from the large number of biological databases and standards. It outlines tools for building and using ontologies, including those for end users to search and analyze data, and those for ontology engineers to develop ontologies through automated reasoning and integration. The Gene Ontology is provided as an example of an ontology that has been widely adopted for analyzing gene sets. The document advocates developing ontologies through a collaborative framework like the Open Biological and Biomedical Ontologies to promote reuse and integration across domains.
All together now: piecing together the knowledge graph of life (Chris Mungall)
The document summarizes challenges in organizing biological knowledge and progress made through collaborative ontology development. It discusses how early efforts focused on individual ontologies but challenges emerged in maintenance and linking data. New approaches focus on shared principles, standardized mappings between ontologies, and modeling knowledge as graphs. Tools like Boomer and LinkML help reconcile mappings and model data, while community efforts like OBO Foundry and Biolink Model advance integration through open collaboration. Overall progress has been made but more work is needed to operationalize ontologies and build interconnected knowledge graphs.
Uberon is an integrative multi-species anatomy ontology that contains over 11,000 classes describing anatomical structures across multiple animal species, with a focus on chordates and mammals. It uses multiple relationship types like subclass, part-of, and develops-from to connect these classes in a structured ontology. Uberon aims to bridge between existing species-specific anatomy ontologies like the Mouse Anatomy ontology and the Foundational Model of Anatomy for human. It allows cross-referencing between these ontologies and helps integrate anatomical knowledge across models and humans.
Mapping Phenotype Ontologies for Obesity and Diabetes (Chris Mungall)
This document discusses approaches to mapping phenotype ontologies across species and categories. It describes using OWL axioms to define phenotypes in a machine-interpretable way and create bridges between ontologies. This enables cross-ontology queries and integrated views of data. Challenges include modeling complex phenomena accurately in OWL and a lack of tools integrated into the ontology development process. The Monarch Initiative aims to address these issues by developing tools like TermGenie and providing integrated views of data from multiple ontologies.
The Gene Ontology (GO) provides a controlled vocabulary for describing gene and gene product attributes across species. It consists of three ontologies covering biological processes, molecular functions, and cellular components. GO terms are organized into a directed acyclic graph structure and can have relationships like "is_a" and "part_of". Genes are annotated with GO terms to capture functional information, which is shared across species to facilitate research. While useful, the GO has some limitations like unclear reasoning principles and lack of validation procedures.
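To illustrate how a directed acyclic graph of "is_a" and "part_of" links is used in practice, the sketch below walks from a term to all of its ancestors and then propagates a gene's annotations upward. The term labels, the `parents` table, and the gene name `geneA` are toy assumptions chosen for readability; they are not a copy of GO.

```python
def ancestors(term, parents):
    """Collect every term reachable from `term` via is_a or part_of links."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relation, parent in parents.get(current, []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def propagate(annotations, parents):
    """Expand each gene's direct annotations with all ancestor terms."""
    expanded = {}
    for gene, terms in annotations.items():
        full = set(terms)
        for term in terms:
            full |= ancestors(term, parents)
        expanded[gene] = full
    return expanded


# Toy DAG (assumed): a term may have more than one parent
parents = {
    "mitochondrial inner membrane": [
        ("part_of", "mitochondrion"),
        ("is_a", "organelle membrane"),
    ],
    "mitochondrion": [("is_a", "organelle")],
    "organelle membrane": [("part_of", "organelle")],
    "organelle": [("is_a", "cellular component")],
}

annotations = {"geneA": {"mitochondrial inner membrane"}}
expanded = propagate(annotations, parents)
print(sorted(expanded["geneA"]))
```

Because a term can have several parents, the traversal tracks visited terms so that shared ancestors such as "organelle" are counted once.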
Introduction to Ontologies for Environmental Biology (Barry Smith)
1. The document introduces ontologies for environmental biology and discusses several disciplines that could benefit from their use, including GIS, ecology, environmental biology, and various "-omics" fields.
2. It describes what an ontology is and compares ontologies to legends for maps or diagrams, which allow integration and help humans and computers make sense of complex data. Ontologies provide standardized terminology and annotations.
3. The document outlines the Open Biomedical Ontologies (OBO) Foundry, a collection of interoperable reference ontologies for annotating biomedical data. Foundry ontologies include the Gene Ontology and other ontologies for molecules, cells, anatomical structures, and more. They are developed through consensus and share
Genome annotation with open source software: Apollo, Jbrowse and the GO in Ga... (Nathan Dunn)
This document describes improvements to Apollo, an open-source genome annotation editor, that increase the efficiency of genome annotation refinement. Key improvements include automated processing of genomic evidence, ability to associate and export Gene Ontology annotations, variant effect prediction, and user interface enhancements. Apollo can now be launched via Docker or preconfigured Amazon cloud instances, simplifying installation. It provides web services for integration with other web tools.
Molecular scaffolds are special and useful guides to discovery, poster (36x54"). Presented at ACS National Meeting SciMix in Indianapolis, Sep 9, 2013.
This document discusses how ontologies can be used to do biology. It describes how ontologies allow biological data and knowledge to be shared and integrated by providing common definitions and vocabularies. It also discusses how ontologies can enable new discoveries by revealing unexpected connections between different data sources and facilitating automated reasoning. While ontologies help biologists find new things, real biological insights still require human analysis and experimentation. The document uses examples from kidney and urinary system research to illustrate how ontologies are built and applied in bioinformatics.
Molecular scaffolds are special and useful guides to discovery (Jeremy Yang)
Molecular scaffolds are special structures that can be used to guide discovery in fields like chemical biology and drug discovery. Scaffolds represent the core structure or framework of molecules. They are useful because they allow clustering and organization of chemical data, exploration of chemical space, and prediction of properties like bioactivity. Examples of famous drug scaffolds discussed include the beta-lactam, steroid, and benzodiazepine scaffolds. Software tools are available for scaffold analysis and applications include database clustering, navigation of chemical space, and prediction of promiscuity. While the definition of a scaffold is not always consistent, cheminformatics methods can help address challenges in scaffold analysis.
This document discusses lessons learned from developing and using the Gene Ontology (GO) over the past 20 years. It covers how GO aims to systematically annotate gene function across species using an ontology. It describes how GO uses OWL constructs like subclasses, equivalence and reasoning to leverage relationships with other ontologies. It also discusses moving beyond simple annotation to represent biology accurately using causal models and graphs. Finally, it covers the Open Biological and Biomedical Ontologies (OBO) Foundry principles of collaboration, shared standards and interconnected ontologies that GO adheres to.
The document describes a seminar on high-throughput sequencing bioinformatics. It discusses analyzing microbiome samples using 16S rRNA sequencing and tools like Mothur and QIIME. It provides an overview of analyzing 16S sequences, including quality filtering, OTU clustering, classification, and diversity analysis. It also outlines running a Mothur tutorial to analyze a mock microbiome dataset from 21 samples using the Mothur MiSeq standard operating procedure.
ContentMine Presentation for WHO Health Data Seminar (Jenny Molloy)
This document summarizes content mining technology and policy developments. It discusses what content and mining are, provides a brief history of content mining, and outlines legal considerations around copyright and database rights. It then describes the ContentMine software and pipeline for scraping, normalizing, and extracting facts from scholarly documents at scale. Examples of mining applications in chemistry, clinical trials, phylogenetics, and genome annotation are provided. The document concludes with a discussion of the potential value of content mining for public health researchers.
This document discusses bioinformatics and biology at various levels of organization. It begins by explaining that biology is extremely complex due to the hierarchical organization of life, from molecules to ecosystems. It then provides definitions of bioinformatics from Wikipedia and other sources, emphasizing that it is an interdisciplinary field that uses computer science and other approaches to study vast amounts of biological data. Examples of different types of biological data and areas of bioinformatics research are given, such as sequence analysis, databases, and structural bioinformatics. Overall, the document provides a high-level introduction to bioinformatics and its role in understanding biology.
1. The main activity of TAIR curators is producing a 'gold standard' annotated reference genome dataset by integrating experimental data from research literature. New annotations are constantly added.
2. One common use of TAIR is to infer the function of genes in agriculturally important species based on orthology to Arabidopsis genes.
3. TAIR's annotations are used in applications such as functional categorization and term enrichment. It is important to use the latest annotation file from TAIR.
Introduction to Gene Mining Part A: BLASTn-off! (adcobb)
In this lesson, students will learn to use bioinformatics portals and tools to mine plant versions of human genes. Student handout and teacher resource materials are available at www.Araport.org, Teaching Resources (Community tab). Suitable for grades 9-12 or first year undergraduate students.
International Cancer Genomics Consortium (ICGC) Data Coordinating Center (Neuro, McGill University)
The document is a presentation slide deck for the International Cancer Genome Consortium (ICGC) Data Coordinating Center (DCC) given on November 14th 2013. It provides an overview of the ICGC, including its goals to catalog genomic abnormalities in 50 different cancer types using comprehensive genome, transcriptome, methylome, and clinical data analysis. It describes the activities of the ICGC DCC, which provides tools and infrastructure for data uploading, tracking, quality control, and distribution. The DCC aims to make ICGC data accessible and useful to researchers through search and analysis capabilities on its data portal.
Functional annotation of invertebrate genomes (Surya Saha)
Functional annotation of the Asian citrus psyllid genome identified genes, assigned gene ontology terms, and mapped genes to pathways. Gene ontology and pathway analysis of differentially expressed genes between infected and uninfected psyllids identified enriched terms involved in the cytoskeleton, endocytosis, and mitochondrial dysfunction. Improved functional annotation using GOanna added depth to the gene ontology annotation and identified additional enriched pathways related to response to hypoxia and regulation of cytoskeletal remodeling.
An introduction to Web Apollo for i5K Pilot Species Projects - Hemiptera (Monica Munoz-Torres)
Introduction to Web Apollo for the i5K Pilot species project. Web Apollo is a genome annotation editor; it provides a web-based environment that allows multiple distributed users to review, edit, and share manual annotations. This presentation includes information specific to the projects of i5K, the global initiative to sequence the genomes of 5,000 species of arthropods. Let's get started!
The document describes the OGO (Orthologs and Genes Ontology) system, which provides a semantic query interface for exploring information about ortholog genes and genetic diseases. It allows users to formulate complex queries about orthologs and diseases without needing SPARQL syntax. An example query and its results are shown, finding the ortholog genes of the gene that causes prostate cancer in rats. Future plans include adding more reasoning capabilities and integrating with additional biomedical ontologies and standards.
This document provides an overview of exome analysis for identifying causal genes for Mendelian disorders. It discusses technological advances that have enabled exome sequencing, key publications in the field, strategies and tools used for data analysis, and exome sequencing service providers. The document is intended as a useful resource for those interested in how exome analysis is used to identify genes underlying Mendelian conditions.
Use of semantic phenotyping to aid disease diagnosis (mhaendel)
This document discusses using semantic phenotyping to aid disease diagnosis. It outlines using ontologies to semantically annotate phenotypes seen in patients, animal models, and genes. This allows computation of semantic similarity between phenotypes to identify potential disease candidates. The document also discusses challenges such as uneven phenotype data distribution and differences in how phenotypes are described across species. It proposes building an integrated cross-species semantic framework called Uberpheno to address these challenges and better leverage animal models for diagnosing rare diseases.
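One simple way to compute semantic similarity between phenotype profiles is to expand each profile with its ontology ancestors and compare the resulting sets with a Jaccard score. The sketch below does that; the metric is a stand-in chosen for brevity rather than the measure used in the talk, and the phenotype labels, disease names, and `parents` table are hypothetical.

```python
def closure(terms, parents):
    """Return the given terms plus all of their ancestors in the ontology."""
    result = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in result:
            result.add(term)
            stack.extend(parents.get(term, []))
    return result


def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical phenotype hierarchy: child -> list of parents
parents = {
    "small kidney": ["abnormal kidney morphology"],
    "renal cyst": ["abnormal kidney morphology"],
    "abnormal kidney morphology": ["abnormal renal system"],
    "hearing loss": ["abnormal ear physiology"],
}

patient = {"small kidney"}
candidates = {
    "disease X": {"renal cyst"},
    "disease Y": {"hearing loss"},
}

patient_profile = closure(patient, parents)
ranked = sorted(
    ((name, jaccard(patient_profile, closure(terms, parents)))
     for name, terms in candidates.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

The patient and "disease X" share no asserted term, yet they score above zero because both terms sit under the same kidney ancestors. That shared-ancestor effect is what makes ontology-based matching useful when phenotypes are described differently across sources or species.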
Getting Started with the Hymenoptera Anatomical Ontology (Katja C. Seltmann)
For Biodiversity Informatics workshop in Stockholm, Friday September 18. Describing the Hymenoptera Anatomical Ontology. Authors: Matthew Yoder, Andrew Deans, Katja Seltmann, István Mikó, Matthew Bertone
Knockout mice are produced by disrupting genes in mice through the insertion of artificial DNA, allowing researchers to observe the effects of gene deletion and gain insight into gene function. This document discusses how knockout mice are made via embryonic stem cell manipulation and gene targeting or trapping. It provides the example of cystic fibrosis knockout mice modeling the human disease. While mice are a valuable model for human genetics, there are also limitations such as differences in phenotypes between mice and humans.
This document discusses challenges and opportunities in applying mRNA sequencing (mRNAseq) to non-model organisms. It describes using digital normalization as a way to cope with having a massive amount of lamprey mRNAseq data but an incomplete genomic reference. Digital normalization enables assembly and analysis that would otherwise not be possible. The document also discusses applying digital normalization to ascidian mRNAseq data, where it results in substantial time savings and comparable transcriptomes to those assembled without normalization. Finally, it discusses next steps for lamprey transcriptome analysis including enabling various evolutionary and biological studies.
Basic Formal Ontology (BFO) and DiseaseBarry Smith
The document discusses different approaches to conceptualizing health, disease, and biological kinds across multiple levels of granularity. It notes that traditional biology data conceptualized entities based on observable instances, while new biology data represents entities at the molecular level through genetic sequences. It argues that linking different kinds of phenomena represented at various levels requires annotation with terms from controlled vocabularies like ontologies. Ontologies provide a structured framework for integrating data across databases and supporting logical reasoning by standardizing references to biological entities, processes, and functions.
Why the world needs phenopacketeers, and how to be onemhaendel
Keynote presented at the Ninth International Biocuration Conference, Geneva, Switzerland, April 10-14, 2016
The health of an individual organism results from complex interplay between its genes and environment. Although great strides have been made in standardizing the representation of genetic information for exchange, there are no comparable standards to represent phenotypes (e.g. patient disease features, variation across biodiversity) or environmental factors that may influence such phenotypic outcomes. Phenotypic features of individual organisms are currently described in diverse places and in diverse formats: publications, databases, health records, registries, clinical trials, museum collections, and even social media. In these contexts, biocuration has been pivotal to obtaining a computable representation, but is still deeply challenged by the lack of standardization, accessibility, persistence, and computability among these contexts. How can we help all phenotype data creators contribute to this biocuration effort when the data is so distributed across so many communities, sources, and scales? How can we track contributions and provide proper attribution? How can we leverage phenotypic data from the model organism or biodiversity communities to help diagnose disease or determine evolutionary relatedness? Biocurators unite in a new community effort to address these challenges.
This document discusses Gene Ontology (GO) annotation. It describes how GO provides a common language to describe gene products using controlled vocabularies and ontologies. Key points:
- GO annotation involves assigning GO terms to describe the molecular function, biological process and cellular component of a gene product.
- Literature curation and computational methods are used to make GO annotations based on different types of evidence.
- GO annotations are captured in gene annotation files along with the evidence and reference sources.
The Human Phenotype Ontology (HPO) was developed to describe phenotypic abnormalities and to support “deep phenotyping”, in which symptoms and characteristic phenotypic findings (a phenotypic profile) are captured. The HPO has been used with great success for computational comparison of phenotypes against known diseases, other patients, and model organisms to support the diagnosis of rare disease patients. Clinicians and geneticists create phenotypic profiles based on clinical evaluation, but this is time-consuming and can miss important phenotypic features. Patients are sometimes the best source of information about symptoms that might otherwise be missed in a clinical encounter. However, the HPO primarily uses medical terminology, which can be difficult for patients and their families to understand. To make the HPO accessible to patients, we systematically added non-expert (i.e., layperson) synonyms. Using semantic similarity, patient-recorded phenotypic profiles can be evaluated against those created clinically for undiagnosed patients to determine the improvement gained from patient-driven phenotyping, as well as how much the patient phenotyping narrows the diagnosis. This patient-centric HPO can be used by all: in patient-centered rare disease websites, in patient community platforms and registries, or even to post one’s hard-to-diagnose phenotypic profile on the Web.
This document discusses challenges and opportunities in applying mRNA sequencing (mRNAseq) to non-model organisms. It describes using digital normalization to cope with large amounts of lamprey mRNAseq data that would otherwise be too computationally intensive to assemble. Digital normalization was applied successfully to Molgula ascidian mRNAseq data, enabling transcriptome analysis. The lamprey transcriptome was assembled from over 5 billion reads from 50 tissues, producing over 600,000 transcripts. Next steps include addressing contamination issues and using the more complete transcriptome to enable various evolutionary and biological studies of lamprey. The document advocates making protocols and data openly available to help characterize genes in non-model organisms.
This document discusses extranuclear inheritance, which refers to inheritance patterns of genetic material located outside the cell nucleus, such as chloroplast and mitochondrial DNA. Extranuclear inheritance follows maternal inheritance patterns rather than Mendelian inheritance. A disease associated with extranuclear inheritance is Leber's hereditary optic neuropathy (LHON), which is linked to mutations in mitochondrial DNA. The document provides examples of organisms and inheritance patterns related to extranuclear DNA.
GRP170 is a molecular chaperone located in the endoplasmic reticulum that assists with protein folding. Caenorhabditis elegans contains two paralogs of the GRP170 gene, GRP170a and GRP170b, which are expressed at different times and induced at different rates. This study examines the role of the GRP170a gene in C. elegans.
Model organisms are non-human species that are widely studied in laboratories to help scientists understand biological processes. They are usually easy to maintain and breed in a lab setting. The document discusses several important model organisms including mice, fruit flies, yeast, and bacteria. It provides details on their genomes, uses for research, and similarities to humans that make them valuable models. Key model organisms like mice and fruit flies have been widely used to study genetics, development, and disease due to their small genomes and short lifecycles.
The document discusses the nematode C. elegans as a model organism for studying Alzheimer's disease. Some key points:
- C. elegans is a useful model organism for neurodegenerative diseases like Alzheimer's and Parkinson's due to its short lifespan, transparency, and genetic similarities to humans.
- Sydney Brenner first introduced C. elegans as a model organism in 1963 due to these advantages.
- About 38% of the C. elegans genome is genetically similar to humans, allowing researchers to study genetic pathways involved in neurodegeneration.
Solez Update on the Technology and Future of Medicine Course: Space, Regenera...Kim Solez ,
Dr. Kim Solez presents Update on the "Technology and Future of Medicine Course: Space, Regenerative Medicine, Large Touch Screens, and Leonard Cohen" on September 25, 2014 at Lab Medicine Pathology Grand Rounds at the University of Alberta in Edmonton, Canada.
This document discusses building communities around ontology development. It provides examples of gene ontologies, plant ontologies, and trait ontologies that are used to group genes and phenotypes. It also outlines how ontologies are developed, managed, and annotated through collaborations between various organizations. Ontology requests are monitored through bug tracking tools and mailing lists. Participation is encouraged to help drive ontology development and annotation.
Demonstration of the applicability of the Linked Data Modeling Language and CHEMROF ( https://chemkg.github.io/chemrof/) for semantic chemical sciences. Presented at MADICES 2022. https://github.com/MADICES/MADICES-2022
Scaling up semantics; lessons learned across the life sciencesChris Mungall
Semantic modeling is key to understanding the biological processes underpinning the health of humans and the health of ecosystems on this planet. There are a number of different approaches to semantic modeling, varying from modeling of *things* in the form of knowledge graphs, modeling of *data structures* in the form of semantic schemas, and modeling of *words* in the form of ultra-large language models. Taking the metaphor of modeling paradigms as planets in a semantic solar system, I will take us on a tour through the solar system, exploring the strengths of each approach, and looking through a historic lens at how we keep iterating over similar solutions with each rotation around the sun. As an alternative to the dichotomy of either resisting change or starting afresh, I urge an approach where we embrace change and adapt with each revolution. I will look specifically at how the OBO community has built powerful knowledge graphs of biological concepts, how the LinkML modeling language incorporates aspects of both frame languages and shape languages, and how language models can be integrated with semantic ontological approaches through the OntoGPT framework.
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODOChris Mungall
NOTE THAT I HAVE MOVED AWAY FROM SLIDESHARE TO ZENODO
The identical presentation is now here:
https://doi.org/10.5281/zenodo.7778641
General introduction to LinkML, The Linked Data Modeling Language.
Adapted from presentation given to NIH May 2022
https://linkml.io/linkml
Slides from the Ontology Access Kit (OAK) workshop, https://incatools.github.io/ontology-access-kit/
OAK is a pluralistic Python library for accessing a variety of ontologies, using either the command line or the Python library
This document provides an overview of LinkML, a lightweight modeling language for building data schemas and knowledge graphs. It discusses how LinkML allows users to model data in a simple yet expressive way and generate outputs like JSON Schema, OWL, and RDF. LinkML aims to be developer-friendly and integrates with popular tools and standards. Several key projects currently use LinkML for tasks like building knowledge graphs and modeling genomics and clinical data.
LinkML is a modeling language for building semantic models that can be used to represent biomedical and other scientific knowledge. It allows generating various schemas and representations like OWL, JSON Schema, GraphQL from a single semantic model specification. The key advantages of LinkML include simplicity through YAML files, ability to represent models in multiple forms like JSON, RDF, and property graphs, and "stealth semantics" where semantic representations like RDF are generated behind the scenes.
Uberon: opening up to community contributionsChris Mungall
The document discusses Uberon, an integrative multi-species anatomy ontology. It describes Uberon's taxonomic scope covering metazoans with a focus on vertebrates. It outlines how Uberon is edited on GitHub and maintained with cross-references to other species-specific anatomy ontologies. It also discusses how phenotypes from the Phenotype and Human Phenotype Ontology are directly mapped to Uberon and species-specific anatomies, as well as considerations for which anatomy ontology a phenotype ontology should use.
Modeling exposure events and adverse outcome pathways using ontologiesChris Mungall
This document discusses using ontologies to model exposure events, adverse outcome pathways, and phenotypes in order to support predictive toxicology. It describes existing ontologies like the Environment, Conditions, and Treatments Ontology (ECTO) and Gene Ontology Causal Activity Models (GO-CAMs) that can be used to represent exposure mechanisms and adverse outcomes. The document also presents challenges for developing an open predictive toxicology framework that leverages ontologies and linked data to make toxicology data more findable, accessible, interoperable, and reusable.
The document discusses the Environment Ontology (ENVO), which aims to represent environmental entities and their relationships in a structured format. It describes the main hierarchies in ENVO, including biome, environmental feature, and environmental material. ENVO represents different levels of environmental granularity from broad biomes down to specific materials. Any material entity can act as a feature determining an environmental system. The objectives for further developing ENVO are also outlined, such as representing various environmental qualities like temperature, nutrients, and toxins.
Chris Mungall presented a Bayesian approach called k-BOOM (Bayesian OWL Ontology Merging) to combine existing disease ontologies and lists into a unified framework. k-BOOM generates hypothetical logical mappings between ontologies, estimates weights for each mapping, and uses a greedy algorithm to find the set of mappings that maximizes the probability of the merged ontology. It has been applied to merge several disease ontologies into MonDO (Monarch Disease Ontology). Evaluation found high agreement with held-out data and detection of errors in source ontologies. Next steps include improving weight estimation and evaluation.
Presentation on BioMake, a GNU-Make-like utility for managing builds and complex workflows using declarative specifications. From GMOD/PAG meeting 2017
Increased Expressivity of Gene Ontology Annotations - Biocuration 2013Chris Mungall
Presentation from the Biocuration conference describing an extension to the GO annotation formalism that allows curators to capture more detailed biological context and specificity at the time of annotation. Features Portuguese Man-o-War assaults.
Uberon is a multi-species anatomy ontology covering animal anatomy. It contains over 8,000 classes describing anatomical structures across metazoans in a species-neutral way. Uberon bridges species-specific anatomy ontologies and allows cross-species analysis of high-throughput genomics and phenomics data. It is extensively connected to other biomedical ontologies and has been applied in projects involving phenomics, transcriptomics, systematics and finding disease models.
Software engineering methodologies also work for Ontology engineering. This presentation from Bio-Ontologies 2012 describes how we are using Jenkins CI in GO and other ontologies.
The document discusses ontology generation tools TermGenie and Shoge. TermGenie allows curators to generate stable IDs and logical axioms for terms by filling out predefined templates, and has been used successfully by the Gene Ontology Consortium. Shoge uses anatomy grammars to generate labels and logical axioms for entire modules, such as generating terms for limb segments and their repetition across different contexts. The document provides examples of grammars used by Shoge to model phenomena such as serial homology.
1. This document discusses building an informatics resource that encodes developmental knowledge to better understand diseases and disorders.
2. It aims to dynamically classify diseases based on their developmental origins, find disorders that affect structures derived from the neural crest, and identify genes implicated in disorders.
3. Several challenges are outlined, including different databases using different semantic languages and incomplete developmental relationships in ontologies.
4. The approach involves selecting relevant ontologies, building bridging ontologies, curating developmental graphs, and querying the integrated knowledge base to enable discovery.
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdfSelcen Ozturkcan
Ozturkcan, S., Berndt, A., & Angelakis, A. (2024). Mending clothing to support sustainable fashion. Presented at the 31st Annual Conference by the Consortium for International Marketing Research (CIMaR), 10-13 Jun 2024, University of Gävle, Sweden.
Immersive Learning That Works: Research Grounding and Paths ForwardLeonel Morgado
We will metaverse into the essence of immersive learning, into its three dimensions and conceptual models. This approach encompasses elements from teaching methodologies to social involvement, through organizational concerns and technologies. Challenging the perception of learning as knowledge transfer, we introduce a 'Uses, Practices & Strategies' model operationalized by the 'Immersive Learning Brain' and ‘Immersion Cube’ frameworks. This approach offers a comprehensive guide through the intricacies of immersive educational experiences and spotlighting research frontiers, along the immersion dimensions of system, narrative, and agency. Our discourse extends to stakeholders beyond the academic sphere, addressing the interests of technologists, instructional designers, and policymakers. We span various contexts, from formal education to organizational transformation to the new horizon of an AI-pervasive society. This keynote aims to unite the iLRN community in a collaborative journey towards a future where immersive learning research and practice coalesce, paving the way for innovative educational research and practice landscapes.
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Leonel Morgado
Current descriptions of immersive learning cases are often difficult or impossible to compare. This is due to a myriad of different options on what details to include, which aspects are relevant, and on the descriptive approaches employed. Also, these aspects often combine very specific details with more general guidelines or indicate intents and rationales without clarifying their implementation. In this paper we provide a method to describe immersive learning cases that is structured to enable comparisons, yet flexible enough to allow researchers and practitioners to decide which aspects to include. This method leverages a taxonomy that classifies educational aspects at three levels (uses, practices, and strategies) and then utilizes two frameworks, the Immersive Learning Brain and the Immersion Cube, to enable a structured description and interpretation of immersive learning cases. The method is then demonstrated on a published immersive learning case on training for wind turbine maintenance using virtual reality. Applying the method results in a structured artifact, the Immersive Learning Case Sheet, that tags the case with its proximal uses, practices, and strategies, and refines the free text case description to ensure that matching details are included. This contribution is thus a case description method in support of future comparative research of immersive learning cases. We then discuss how the resulting description and interpretation can be leveraged to change immersion learning cases, by enriching them (considering low-effort changes or additions) or innovating (exploring more challenging avenues of transformation). The method holds significant promise to support better-grounded research in immersive learning.
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...Scintica Instrumentation
Targeting Hsp90 and its pathogen Orthologs with Tethered Inhibitors as a Diagnostic and Therapeutic Strategy for cancer and infectious diseases with Dr. Timothy Haystead.
ESR spectroscopy in liquid food and beverages.pptxPRIYANKA PATEL
As the population grows, people increasingly rely on packaged foodstuffs, and packaging food requires preserving it. There are various treatments for preserving food, and irradiation is one of them. It is the most common and the most harmless method of food preservation, as it does not alter the necessary micronutrients of food materials. Although irradiated food does not harm human health, quality assessment is still required to provide consumers with the necessary information about the food. ESR spectroscopy is the most sophisticated way to investigate the quality of food and the free radicals induced during its processing. The ESR spin trapping technique is useful for detecting highly unstable radicals in food. The antioxidant capability of liquid food and beverages is mainly assessed by the spin trapping technique.
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...Travis Hills MN
By harnessing the power of High Flux Vacuum Membrane Distillation, Travis Hills from MN envisions a future where clean and safe drinking water is accessible to all, regardless of geographical location or economic status.
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...Sérgio Sacani
Context. With a mass exceeding several $10^4$ M⊙ and a rich and dense population of massive stars, supermassive young star clusters
represent the most massive star-forming environment that is dominated by the feedback from massive stars and gravitational interactions
among stars.
Aims. In this paper we present the Extended Westerlund 1 and 2 Open Clusters Survey (EWOCS) project, which aims to investigate
the influence of the starburst environment on the formation of stars and planets, and on the evolution of both low and high mass stars.
The primary targets of this project are Westerlund 1 and 2, the closest supermassive star clusters to the Sun.
Methods. The project is based primarily on recent observations conducted with the Chandra and JWST observatories. Specifically,
the Chandra survey of Westerlund 1 consists of 36 new ACIS-I observations, nearly co-pointed, for a total exposure time of 1 Msec.
Additionally, we included 8 archival Chandra/ACIS-S observations. This paper presents the resulting catalog of X-ray sources within
and around Westerlund 1. Sources were detected by combining various existing methods, and photon extraction and source validation
were carried out using the ACIS-Extract software.
Results. The EWOCS X-ray catalog comprises 5963 validated sources out of the 9420 initially provided to ACIS-Extract, reaching a
photon flux threshold of approximately $2 \times 10^{-8}$ photons cm$^{-2}$ s$^{-1}$. The X-ray sources exhibit a highly concentrated spatial distribution,
with 1075 sources located within the central 1 arcmin. We have successfully detected X-ray emissions from 126 out of the 166 known
massive stars of the cluster, and we have collected over 71 000 photons from the magnetar CXO J164710.20-455217.
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...Advanced-Concepts-Team
Presentation in the Science Coffee of the Advanced Concepts Team of the European Space Agency on the 07.06.2024.
Speaker: Diego Blas (IFAE/ICREA)
Title: Gravitational wave detection with orbital motion of Moon and artificial
Abstract:
In this talk I will describe some recent ideas to find gravitational waves from supermassive black holes or of primordial origin by studying their secular effect on the orbital motion of the Moon or satellites that are laser ranged.
The debris of the ‘last major merger’ is dynamically youngSérgio Sacani
The Milky Way’s (MW) inner stellar halo contains an [Fe/H]-rich component with highly eccentric orbits, often referred to as the
‘last major merger.’ Hypotheses for the origin of this component include Gaia-Sausage/Enceladus (GSE), where the progenitor
collided with the MW proto-disc 8–11 Gyr ago, and the Virgo Radial Merger (VRM), where the progenitor collided with the
MW disc within the last 3 Gyr. These two scenarios make different predictions about observable structure in local phase space,
because the morphology of debris depends on how long it has had to phase mix. The recently identified phase-space folds in Gaia
DR3 have positive caustic velocities, making them fundamentally different than the phase-mixed chevrons found in simulations
at late times. Roughly 20 per cent of the stars in the prograde local stellar halo are associated with the observed caustics. Based
on a simple phase-mixing model, the observed number of caustics are consistent with a merger that occurred 1–2 Gyr ago.
We also compare the observed phase-space distribution to FIRE-2 Latte simulations of GSE-like mergers, using a quantitative
measurement of phase mixing (2D causticality). The observed local phase-space distribution best matches the simulated data
1–2 Gyr after collision, and certainly not later than 3 Gyr. This is further evidence that the progenitor of the ‘last major merger’
did not collide with the MW proto-disc at early times, as is thought for the GSE, but instead collided with the MW disc within
the last few Gyr, consistent with the body of work surrounding the VRM.
The binding of cosmological structures by massless topological defectsSérgio Sacani
Assuming spherical symmetry and weak field, it is shown that if one solves the Poisson equation or the Einstein field
equations sourced by a topological defect, i.e. a singularity of a very specific form, the result is a localized gravitational
field capable of driving flat rotation (i.e. Keplerian circular orbits at a constant speed for all radii) of test masses on a thin
spherical shell without any underlying mass. Moreover, a large-scale structure which exploits this solution by assembling
concentrically a number of such topological defects can establish a flat stellar or galactic rotation curve, and can also deflect
light in the same manner as an equipotential (isothermal) sphere. Thus, the need for dark matter or modified gravity theory is
mitigated, at least in part.
Authoring a personal GPT for your research and practice: How we created the Q...Leonel Morgado
Thematic analysis in qualitative research is a time-consuming and systematic task, typically done using teams. Team members must ground their activities on common understandings of the major concepts underlying the thematic analysis, and define criteria for its development. However, conceptual misunderstandings, equivocations, and lack of adherence to criteria are challenges to the quality and speed of this process. Given the distributed and uncertain nature of this process, we wondered if the tasks in thematic analysis could be supported by readily available artificial intelligence chatbots. Our early efforts point to potential benefits: not just saving time in the coding process but better adherence to criteria and grounding, by increasing triangulation between humans and artificial intelligence. This tutorial will provide a description and demonstration of the process we followed, as two academic researchers, to develop a custom ChatGPT to assist with qualitative coding in the thematic data analysis process of immersive learning accounts in a survey of the academic literature: QUAL-E Immersive Learning Thematic Analysis Helper. In the hands-on time, participants will try out QUAL-E and develop their ideas for their own qualitative coding ChatGPT. Participants that have the paid ChatGPT Plus subscription can create a draft of their assistants. The organizers will provide course materials and slide deck that participants will be able to utilize to continue development of their custom GPT. The paid subscription to ChatGPT Plus is not required to participate in this workshop, just for trying out personal GPTs during it.
PPT on Direct Seeded Rice presented at the three-day 'Training and Validation Workshop on Modules of Climate Smart Agriculture (CSA) Technologies in South Asia' workshop on April 22, 2024.
7. Which path to AI? (circa 1990s)
[Diagram: Artificial Intelligence. Labels: Knowledge-Based, Knowledge-Free, statistics, logic, learning, encoding, Narrow AI, Broad AI, ‘knowing that’, ‘knowing how’, Biologically inspired, Cognitively inspired]
14. From sequence to genome annotation
• Analysis pipeline
• Curation tools
• Annotation database
15. Generalized community tools
• Analysis pipeline
• Curation tools
• Annotation database
Chado
Mungall, C. J., Emmert, D. B., & FlyBase Consortium, (2007). A Chado case study: an ontology-based modular schema for representing genome-associated biological information. Bioinformatics, 23(13), i337-346. http://doi.org/10.1093/bioinformatics/btm189
16. Genomes to function annotation?
• Analysis pipeline
• Curation tools
• Annotation database
• Functional annotation
What does it do?
17. Gene Ontology: tool for the unification of biology (2000)
Organize generalized biological knowledge as a graph
Attach genes to nodes
Propagate across species
Create gene lists
Interpret high throughput data
Ashburner, M., Ball, C. A., Blake, J. A., Botstein, D., Butler, H., Cherry, J. M., … Sherlock, G. (2000). Gene ontology: tool for the
unification of biology. The Gene Ontology Consortium. Nat Genet, 25(1), 25–29. http://doi.org/10.1038/75556
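The steps listed on this slide (organize knowledge as a graph, attach genes to nodes, create gene lists) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the term labels, the gene names `geneA` and `geneB`, and the helper functions are hypothetical and are not taken from the slides or from any GO software. It shows only upward propagation of annotations along parent links within one graph, not propagation across species.

```python
from collections import defaultdict

# Toy graph: child term -> parent terms (labels are illustrative only).
PARENTS = {
    "mitotic cell cycle": ["cell cycle"],
    "cell cycle": ["cellular process"],
    "cellular process": ["biological_process"],
    "biological_process": [],
}

# Hypothetical direct annotations: gene -> most specific terms.
DIRECT = {
    "geneA": ["mitotic cell cycle"],
    "geneB": ["cell cycle"],
}


def ancestors(term, parents=PARENTS):
    """Return the term itself plus every term reachable via parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def gene_lists(direct=DIRECT, parents=PARENTS):
    """Propagate annotations up the graph and invert them to term -> genes."""
    by_term = defaultdict(set)
    for gene, terms in direct.items():
        for term in terms:
            for ancestor in ancestors(term, parents):
                by_term[ancestor].add(gene)
    return dict(by_term)


GENE_LISTS = gene_lists()
```

A gene list for a general term such as "cell cycle" then includes genes annotated to its more specific descendants, which is the kind of list typically used when interpreting high-throughput data.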
18. Ontologies as force amplifiers for data
[Diagram labels: domain knowledge, data, biocuration, experimen]
19. Don’t worship the monolith
PROBLEM: GO and other ontologies were becoming monolithic
- lots of implicit overlap with other ontologies, latent structure
20. Open Biological Ontologies (OBO)
http://obofoundry.org
1. Well-integrated modular ontologies
2. Provide technical and sociotechnological framework for cooperation
3. Provide tools, best practices and infrastructure for forging new ontologies
4. Allow us to curate all of the things
@obofoundry
21. OBO Library PURLs
PURL: Persistent URL
Consistent, predictable, stable and versioned URLs for ontology objects
Can be shortened as compact URIs (CURIEs), e.g. GO:0008150
Can be registered and viewed on OBO site
http://obofoundry.org
Ontology purls
Main ontology, subsets
versionIRIs
Ontology term purls
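As a rough illustration of how a CURIE relates to a term PURL, here is a minimal Python sketch. It assumes the widely used OBO Library convention in which term PURLs take the form `http://purl.obolibrary.org/obo/PREFIX_LOCALID`. That base URL and the helper names `curie_to_purl` and `purl_to_curie` are not given on the slide and are introduced here for illustration only.

```python
# Assumed base for OBO Library term PURLs (not stated on the slide).
OBO_PURL_BASE = "http://purl.obolibrary.org/obo/"


def curie_to_purl(curie: str) -> str:
    """Expand a CURIE such as 'GO:0008150' into an OBO-style term PURL."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a CURIE: {curie!r}")
    return f"{OBO_PURL_BASE}{prefix}_{local_id}"


def purl_to_curie(purl: str) -> str:
    """Contract an OBO-style term PURL back into a CURIE."""
    if not purl.startswith(OBO_PURL_BASE):
        raise ValueError(f"not an OBO PURL: {purl!r}")
    fragment = purl[len(OBO_PURL_BASE):]
    prefix, sep, local_id = fragment.partition("_")
    if not sep or not prefix or not local_id:
        raise ValueError(f"unexpected PURL fragment: {fragment!r}")
    return f"{prefix}:{local_id}"


EXAMPLE_PURL = curie_to_purl("GO:0008150")
```

Keeping expansion and contraction as a pair of pure functions makes the round trip easy to check; a real project would more likely read prefixes from a registered prefix map than hard-code a single base.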
23. Contributions to and uses of RO
Neurocellular (virtualflybrain.org; Osumi-Sutherland, D. (2012). doi:10.1093/bioinformatics/bts113)
• Has soma location
• Has synaptic terminal in
• Upstream in neural circuit with
• …
Biotic interaction (globalbioticinteractions.org)
• Eats
• Epiphyte of
• Parasite of
• Kleptoparasitizes
• Hyperparasitizes
Gene, drug, phenotype
• Is model of
• Has phenotype
• Molecularly controls
• Allosteric inhibitor of
• Causes or contributes to condition
• ...
David Osumi-Sutherland, Anne Thessen, Matt Brush, Greg Stupp
>500 relations
26. Making the pieces fit together: GO and CHEBI
Hill, D. P., Adams, N., Bada, M., Batchelor, C., Berardini, T. Z., Dietze, H., … Lomax, J. (2013). Dovetailing biology and
chemistry: integrating the Gene Ontology with the ChEBI chemical ontology. BMC Genomics, 14(1), 513.
http://doi.org/10.1186/1471-2164-14-513
GO CHEBI
• Some relationships didn’t make sense
• E.g. nucleotide isa carbohydrate
• Acids conjugate bases
Harold Drabkin
David Hill
Jane Lomax
Tanya Berardini
Janna Hastings
27. Making the pieces fit together: GO and CHEBI
Hill, D. P., Adams, N., Bada, M., Batchelor, C., Berardini, T. Z., Dietze, H., … Lomax, J. (2013). Dovetailing biology and
chemistry: integrating the Gene Ontology with the ChEBI chemical ontology. BMC Genomics, 14(1), 513.
http://doi.org/10.1186/1471-2164-14-513
GO CHEBI
• Fixed many is-as
• E.g. nucleotide isa carbohydrate
• Acids conjugate bases
+ OWL reasoning
Harold Drabkin
David Hill
Jane Lomax
Tanya Berardini
Janna Hastings
GO CHEBI
+ Design Patterns
28. Challenges of multi-species anatomy and phenotypes
[Figure: representations of "lung" in sources labeled Mammalian Phenotype, Mouse Anatomy, FMA and Human development. Node labels: lung, lobular organ, parenchymatous organ, solid organ, pleural sac, thoracic cavity organ, thoracic cavity, abnormal lung morphology, abnormal respiratory system morphology, abnormal pulmonary acinus morphology, abnormal pulmonary alveolus morphology, alveolus, organ system, respiratory system, lower respiratory tract, alveolar sac, pulmonary acinus, lung bud, respiratory primordium, pharyngeal region. Edge types: develops_from, part_of, is_a (SubClassOf), surrounded_by]
29. The perils of mappings
Class A | Class B | Mapped? | Useful?
FMA: extensor retinaculum of wrist | MouseAnatomy: retina | Yes | No
Plant Ontology: Pith | MouseAnatomy: medulla | Yes | No
Fly Anat: femur | MouseAnatomy: femur | Yes | No*
ZfishAnat: hypophysis | MouseAnatomy: pituitary | No | Yes
TAO: fossa | AdverseReactions: depression | Yes | No
FMA: colon | GAZ: Colón, Panama | Yes | No
Quality: male | Chebi: maleate 2(-) | Yes | No
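To make the "mapped but not useful" point concrete, here is a deliberately naive Python sketch of label-based matching. It is an assumption about how such mappings might arise, not a description of any real mapping tool: two labels "map" whenever a word in one is a prefix of a word in the other.

```python
def naive_match(label_a: str, label_b: str) -> bool:
    """Return True if any word in one label is a prefix of a word in the other.

    Purely lexical and case-insensitive: it knows nothing about what the
    classes mean, which is exactly the weakness being illustrated.
    """
    words_a = label_a.lower().split()
    words_b = label_b.lower().split()
    return any(
        wa.startswith(wb) or wb.startswith(wa)
        for wa in words_a
        for wb in words_b
    )


# Pairs taken from the table on this slide.
pairs = [
    ("extensor retinaculum of wrist", "retina"),  # lexical hit, unrelated structures
    ("femur", "femur"),                           # fly vs mouse: same label
    ("male", "maleate 2(-)"),                     # quality vs chemical
    ("hypophysis", "pituitary"),                  # useful pair, but no lexical overlap
]

for a, b in pairs:
    print(f"{a!r} ~ {b!r}: {naive_match(a, b)}")
```

The first three pairs match on spelling alone, while the one pair the table marks as useful is missed, so lexical mappings need curation before they are trusted.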
30. http://uberon.org
• Initial Phase
• Bottom-up
• Create groupings of
terms
• Light curation
• Next Phase
• Top down
• 14k classes
• Design Patterns
• Periodic alignment
and feeding back to
curators
Uberon
34. dinosaurs, sponges, comb jellies
and cephalopods, oh my
Thacker, R. W., (2014). The Porifera Ontology (PORO):
enhancing sponge systematics with an anatomy ontology.
Journal of Biomedical Semantics, 5(1), 39.
http://doi.org/10.1186/2041-1480-5-39
Graphic courtesy Nizar Ibrahim, Paul Sereno, et al.
Phenotype RCN
Wasila Dahdul
Bob Thacker
obofoundry.org/ontology/ceph.html
obofoundry.org/ontology/cteno.html
35. Phenotype and Disease
Ontologies
Problem: Many ontologies, vocabularies and
condition/phenotype lists:
HP, MP, WBPhenotype, FBcv, TO, VT, FYPO, APO,
SNOMED
OMIM, Orphanet, DO, NCIT, MESH, ICD, UMLS,
MEDGEN …
ZFIN, Phenoscape: EQ
Köhler, S.. (2013).. F1000Research, 1–
12.
http://doi.org/10.3410/f1000research.2-
Standardized Design
Patterns + OWL
Reasoning
Bayesian OWL Ontology
Merging
(BOOM)
Mungall, C.J et al (2016) kBOOM.
bioRxiv 10.1101/048843
Monarch merged
‘upheno’ ontology
MonDO
Elvira Mitraka, Sue Bello, Nicole Vasilevsky
36. Combined score
Remove off-target and common variants
Whole exome
Variant Score based on allele frequency and
pathological impact
Mendelian filters
Whole or partial
phenome (HPO)
OwlSim
Gene phenotype scores
Curated
Phenotype
Data
Monarch
Integrated
KB
upheno
Curated
Orthology,
Interaction, ..
Data
+GENOMISER
40. Biological knowledge and curation
QC
Deegan, J., Dimmer, E., & Mungall, C. J. (2010). Formalization of taxon-based constraints to detect inconsistencies in annotation and
ontology development. BMC Bioinformatics, 11(1), 530. http://doi.org/10.1186/1471-2105-11-530
Annotation errors can arise for different reasons
- machine error (inappropriate propagation)
- human error
Previous versions of the GO had
various unusual annotations:
• Genes in chicken responsible
for lactation
41. Biological knowledge and curation
QC
Deegan, J., Dimmer, E., & Mungall, C. J. (2010). Formalization of taxon-based constraints to detect inconsistencies in annotation and
ontology development. BMC Bioinformatics, 11(1), 530. http://doi.org/10.1186/1471-2105-11-530
Annotation errors can arise for different reasons
- machine error (inappropriate propagation)
- human error
Previous versions of the GO had
various unusual annotations:
• Genes in chicken responsible
for lactation
• Genes in slime mold
responsible for dorsal fin
development
42. Solution: Taxon constraints
Deegan, J., Dimmer, E., & Mungall, C. J. (2010). Formalization of taxon-based constraints to detect inconsistencies in annotation and
ontology development. BMC Bioinformatics, 11(1), 530. http://doi.org/10.1186/1471-2105-11-530
Encode taxon constraints as OWL
rules in the ontology
only in taxon
never in taxon
Can be propagated across
ontologies
E.g.
dorsal fin only in vertebrata
(uberon)
dorsal fin never in tetrapod
(uberon)
lactation only in mammals (go)
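A minimal Python sketch of how such constraints can flag suspect annotations. The toy taxonomy, the dictionaries and the function names are all hypothetical simplifications. In the real ontologies the constraints are OWL axioms and an OWL reasoner does the checking.

```python
# Toy taxonomy (child -> parent); heavily simplified, for illustration only.
TAXON_PARENT = {
    "chicken": "tetrapod",
    "mouse": "mammal",
    "mammal": "tetrapod",
    "tetrapod": "vertebrata",
    "zebrafish": "vertebrata",
    "vertebrata": "cellular organisms",
    "slime mold": "cellular organisms",
}

# Constraints from the slide.
ONLY_IN = {"dorsal fin": "vertebrata", "lactation": "mammal"}
NEVER_IN = {"dorsal fin": "tetrapod"}

# Assumed cross-ontology link: a GO process inherits the constraints
# of the Uberon structure it refers to.
PROPAGATES_FROM = {"dorsal fin development": "dorsal fin"}


def lineage(taxon):
    """Return the taxon together with all of its ancestors."""
    taxa = set()
    while taxon is not None:
        taxa.add(taxon)
        taxon = TAXON_PARENT.get(taxon)
    return taxa


def violations(term, taxon):
    """List the taxon constraints broken by annotating `term` in `taxon`."""
    taxa = lineage(taxon)
    problems = []
    current = term
    while current is not None:
        only = ONLY_IN.get(current)
        if only is not None and only not in taxa:
            problems.append(f"{current} only in {only}")
        never = NEVER_IN.get(current)
        if never is not None and never in taxa:
            problems.append(f"{current} never in {never}")
        current = PROPAGATES_FROM.get(current)
    return problems


print(violations("lactation", "chicken"))
print(violations("dorsal fin development", "slime mold"))
print(violations("dorsal fin development", "zebrafish"))
```

Both of the unusual annotations mentioned earlier (lactation in chicken, dorsal fin development in slime mold) are reported, while the zebrafish annotation passes.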
43. Hi, ROBOT
How can we package things up and make
them easier to use in ontology/curation QC
pipelines?
Enter ROBOT
Design Patterns
Continuous Integration
44. Next steps for ontology
annotation
Existing ontology annotation model:
Bag of terms
[Diagram: a single gene linked to a flat, unordered set of eight terms]
53. Take homes
Knowledge is a force multiplier
Applies to all biocuration work
But pinpoints need for QC
Design for generality
But acknowledge difficulties
Better support required
Biological knowledge is multifaceted and
nuanced
Computer scientists have a tendency towards
hubris
Biology is our nemesis
Collaborative approach is vital
59. Acknowledgments
Monarch Initiative: Jeremy Nguyen-Xuan, Kent Shefcheck, Matt Brush, Tom Conlin, Lilly
Winfree, Eric Douglass, Jules Jacobsen, Craig McLachan, Suzanna Lewis, Julie McMurry, Dan
Keith, Nicole Washington, Nicole Vasilevsky, Nathan Dunn, Harry Hochheiser, William Bone, Neal
Boerkel, Damian Smedley, Tudor Groza, Sebastian Koehler, Melissa Haendel, Peter
Robinson
GO: Michael Ashburner, David Hill, Paola Roncaglia, David Osumi-Sutherland, Tanya Berardini,
Jen Deegan, Jane Lomax, Karen Christie, Pascale Gaudet, Monica Munoz-Torres, Seth
Carbon, Eric Douglass, Heiko Dietze, Ruth Lovering, Rachael Huntley, Midori Harris, Harold
Drabkin, Kimberley Van Auken, Marc Feuermann, Petra Fey, Jim Hu, Debbie Siegel, Helen
Parkinson, Tony Sawford, Stacia Engel, Sylvain Poux, Melanie Courtot, Becky Foulger, Emily
Dimmer, Rachael Huntley, Huaiyu Mi, Judy Blake, Paul Sternberg, Mike Cherry, Suzi Lewis, Paul
Thomas
OBO: Michael Ashburner, Suzanna Lewis, Barry Smith, Richard Scheuermann, Chris Stoeckert,
Jie Zheng, Melanie Courtot, Simon Jupp, Ramona Walls, Darren Natale, Melissa Haendel, Lynn
Schriml, Alan Ruttenberg, Seth Carbon, James Overton, Bjoern Peters, + all contributors
Planteome: Pankaj Jaiswal, Dennis Stevenson, Laurel Cooper, Austin Meier, Marie Angelique
Laporte, Elizabeth Arnaud
Uberon: David Osumi-Sutherland, Paula Mabee, Jim Balhoff, Wasila Dahdul, Alex Dececchi,
Nizar Ibrahim, Paul Sereno, Frederic Bastian, Ann Niknejad, Marc Robinson-Rechavi, David
Blackburn, Terry Hayamizu, Yvonne Bradford, Ceri Van Slyke, Alex Diehl, Terry Meehan,
Robert Druzinsky, Melissa Haendel
ALL OF THE BIOCURATORS
NIH ORIP R24OD011883
NHGRI U41HG 002273 NSF DEB-0956049 DOE DE-AC02-05CH11231
NSF IOS 1340112
NSF DBI 1062404
62. Give me a place to stand and with a lever I
will move the whole world
63. Uncovering latent meaning in
ontologies
Mungall, C. J. (2004). Obol: Integrating Language and Meaning in Bio-Ontologies. Comparative and
Functional Genomics, 5(7), 509–520.
regulation of Notch signaling pathway involved in heart
induction
Label parts: relation, relation, anatomic, pathway
OWL expression:
≡ ∃ regulates (NSP ⊓ ∃ part-of HI), where NSP = Notch signaling pathway and HI = heart induction
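The idea that a term label carries latent structure can be sketched in a few lines of Python. This is a rough illustration using a single regular expression, not Obol's actual grammar-based approach. The pattern, function names and output syntax are assumptions made for this example.

```python
import re

# Hypothetical pattern for labels of the form
# "regulation of <process> involved in <larger process>".
PATTERN = re.compile(r"^regulation of (?P<process>.+?) involved in (?P<whole>.+)$")


def parse_label(label: str):
    """Split a label into (regulated process, containing process), or None."""
    match = PATTERN.match(label)
    if match is None:
        return None
    return match.group("process"), match.group("whole")


def to_expression(label: str):
    """Render the parsed label as a class expression string, or None."""
    parsed = parse_label(label)
    if parsed is None:
        return None
    process, whole = parsed
    return f"∃ regulates ('{process}' ⊓ ∃ part-of '{whole}')"


label = "regulation of Notch signaling pathway involved in heart induction"
print(parse_label(label))
print(to_expression(label))
```

Labels that do not fit the pattern return None, which is a reminder that real term names need a much richer grammar than one regular expression.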
64. Open Biological Ontologies
(OBO)
To provide modular building
blocks
Not just functional annotation of
genes and gene products
Framework, tools and
infrastructure for cooperation and
harmonization
Smith, B., Ashburner, M., Rosse, C., Bard, J., Bug, W., Ceusters, W., … Lewis, S. (2007). The OBO Foundry: coordinated
evolution of ontologies to support biomedical data integration. Nat Biotechnol, 25(11), 1251–1255.
[Diagram: modular ontology building blocks: Function (GO); Anatomy; Environment; Chemicals (CHEBI); Phenotype and Disease; Genes (SO, GENO); linked by relations such as "occurs in", …]
http://obofoundry.org
66. Relations: the glue that holds it
together
RO 2005 paper
10 relations
Current RO
>500 relations
Molecular biology
Neurobiology
Biotic interactions
…
Many rules on how relations compose together
Working with wikidata
http://obofoundry.org/ontology/ro.html
67. Beyond the GO
Functional
Genomics: Gene
function
Transcriptomics:
Gene expression
Phenomics: Effects
of gene mutations
Gene Ontology
Anatomy and
Stage Ontology
Phenotype and
Trait Ontology
Links genes to
What they do
Links genes to
where they
are expressed
Links genes to
what happens
when they are
disrupted or
when they vary
Disease Ontology
Environment
Ontology
68. Uberon bridges anatomy ontologies
[Diagram: lung-related classes linked to species-specific anatomy ontologies, GO and NCBITaxon]
Classes shown: anatomical structure; organ; organ part; respiration organ; lung; lung bud; lung primordium; respiratory primordium; foregut; endoderm; endoderm of foregut; alveolus; alveolus of lung; alveolar sac; pulmonary acinus; swim bladder
Cross-referenced classes: FMA:lung; MA:lung; MA:lung alveolus; FMA: pulmonary alveolus; EHDAA: lung bud; GO: respiratory gaseous exchange; NCBITaxon: Mammalia; NCBITaxon: Actinopterygii
Edge types: is_a (taxon equivalent), develops_from, part_of, is_a (SubClassOf), capable_of, only_in_taxon
http://uberon.org
Mungall, C. J., Torniai, C., Gkoutos, G. V, Lewis, S. E., & Haendel, M. A. (2012). Uberon, an integrative multi-species anatomy
ontology. Genome Biology, 13(1), R5. doi:10.1186/gb-2012-13-1-r5
70. Uberon Core
Extensions to other animals…
Thacker, R. W., Díaz, M. C., Kerner, A., Vignes-Lebbe, R., Segerdell, E., Haendel, M. a,
& Mungall, C. J. (2014). The Porifera Ontology (PORO): enhancing sponge
systematics with an anatomy ontology. Journal of Biomedical Semantics, 5(1), 39
Non-model/human
extension
Porifera
Ontology
Ctenophore
Ontology
Cephalopod
Ontology
http://phenotypercn.org
https://github.com/obophenotype/cephalopod-ontology
https://github.com/obophenotype/ctenophore-ontology
https://github.com/obophenotype/porifera-ontology
https://github.com/obophenotype/uberon
Arthropod
Ontology
74. TODO DEPRECATED The need
for modularization
Growing pains of GO
Terms were added as-needed for curation
Hard to maintain
Scope: Encompassing all of biology is hard
Biochemistry, cell biology, plants, animal development and
physiology, …
We needed to modularize
Meanwhile
Other ontologies in the ‘style’ of GO were popping up,
for annotating other kinds of data
Challenge: how were we going to coordinate this?
75. Biological knowledge and curation
QC
Taxon constraints
CONCRETE EXAMPLE HERE
Intersection rules
(see Seth’s talk)
Deegan, J., Dimmer, E., & Mungall, C. J. (2010). Formalization of taxon-based constraints to detect inconsistencies in annotation and
ontology development. BMC Bioinformatics, 11(1), 530. http://doi.org/10.1186/1471-2105-11-530
82. Uberon/CL applications and
users
Ontology Modularization
GO
CLO
Pheno Ontologies (EQ definitions)
ENVO
Transcriptomics and genome annotation
ENCODE
FANTOM5
LINCS
BgeeDb
Phenomics
Human and Mammalian Phenotype Ontologies
Phenotype comparison algorithms
Evolutionary Phenotypes: Phenoscape
http://uberon.github.io/about/adopters.html
83. The path to AI, 1990s
Two goals
Broad AI
Narrow AI
What path to get there?
Knowledge-Based
Explicit Encoding of knowledge about the world
Analytic or deductive reasoning
Mathematical Logic vs Cognitively inspired (neats vs scruffs)
‘Knowing that’
Knowledge-Free
Machine Learning, Neural Networks
Statistics
Pattern Recognition
Biologically Inspired
‘Knowing how’