Introduction to Web Apollo for the i5K copepod research community. Web Apollo is a genome annotation editor; it provides a web-based environment that allows multiple distributed users to review, edit, and share manual annotations. This presentation includes information specific to the projects of i5K, the global initiative to sequence the genomes of 5,000 species of arthropods. Let's get started!
An introduction to Web Apollo for the Biomphalaria glabrata research community.
Monica Munoz-Torres
Web Apollo is a web-based, collaborative genomic annotation editing platform. We need annotation editing tools to modify and refine the precise location and structure of the genome elements that predictive algorithms cannot yet resolve automatically.
This presentation is an introduction to how the manual annotation process takes place using Web Apollo. It is addressed to the members of the Biomphalaria glabrata research community.
This presentation is a thorough guide to the use of Web Apollo, with details on User Navigation, Functionality, and the thought process behind manual annotation.
During this workshop, participants:
- Learn to identify homologs of known genes of interest in a newly sequenced genome.
- Become familiar with the environment and functionality of the Web Apollo genome annotation editing tool.
- Learn how to corroborate or modify automatically annotated gene models using all available evidence in Web Apollo.
- Understand the process of curation in the context of genome annotation.
An introduction to Web Apollo for i5K Pilot Species Projects - Hemiptera
Monica Munoz-Torres
Introduction to Web Apollo for the i5K Pilot species project. Web Apollo is a genome annotation editor; it provides a web-based environment that allows multiple distributed users to review, edit, and share manual annotations. This presentation includes information specific to the projects of i5K, the global initiative to sequence the genomes of 5,000 species of arthropods. Let's get started!
Apollo and i5K: Collaborative Curation and Interactive Analysis of Genomes
Monica Munoz-Torres
Precise elucidation of the many different biological features encoded in a genome requires a careful curation process that involves reviewing all available evidence to allow researchers to resolve discrepancies and validate automated gene models, protein alignments, and other biological elements. Genome annotation is an inherently collaborative task; researchers only rarely work in isolation, turning to colleagues for second opinions and insights from those with expertise in particular domains and gene families.
The i5K initiative seeks to sequence the genomes of 5,000 insect and related arthropod species. The selected species include those important to worldwide agriculture, food safety, medicine, and energy production; those used as models in biology; those most abundant in world ecosystems; and representatives of every branch of the insect phylogeny, in an effort to better understand arthropod evolution. Because computational genome analysis remains an imperfect art, each newly sequenced genome will require visualization and curation.
Apollo is an instantaneous, collaborative genome annotation editor, and the new JavaScript-based version allows researchers real-time interactivity, breaking down large amounts of data into manageable portions to mobilize groups of researchers with shared interests. The i5K is a broad and inclusive effort that seeks to involve scientists from around the world in its genome curation process, and Apollo is serving as the platform to empower this community. Here we offer details about this collaboration.
Three's a crowd-source: Observations on Collaborative Genome Annotation
Monica Munoz-Torres
It is impossible for a single individual to fully curate a genome with precise biological fidelity. Beyond the problem of scale, curators need second opinions and insights from colleagues with domain and gene family expertise, but the communications constraints imposed in earlier applications made this inherently collaborative task difficult. Apollo, a client-side JavaScript application allowing extensive changes to be rapidly made without server round-trips, placed us in a position to assess the difference this real-time interactivity would make to researchers’ productivity and the quality of downstream scientific analysis. To evaluate this, we trained and supported geographically dispersed scientific communities (hundreds of scientists and agreed-upon gatekeepers, in ~100 institutions around the world) to perform biologically supported manual annotations, and monitored their findings. We observed that: 1) Previously disconnected researchers were more productive when obtaining immediate feedback in dialogs with collaborators. 2) Unlike earlier genome projects, which had the advantage of more highly polished genomes, recent projects usually have lower coverage. Therefore, curators now face additional work correcting for more frequent assembly errors and annotating genes that are split across multiple contigs. 3) Automated annotations were improved, as exemplified by discoveries made based on revised annotations; for example, ~2800 manually annotated genes from three species of ants granted further insight into the evolution of sociality in this group, and ~3600 manual annotations contributed to a better understanding of immune function, reproduction, lactation and metabolism in cattle. 4) There is a notable trend shifting from whole-genome annotation to annotation of specific gene families or other gene groups linked by ecological and evolutionary significance. 5) The distributed nature of these efforts still demands strong, goal-oriented (i.e., publication of findings) leadership and coordination, as these are crucial to the success of each project. Here we detail these and other observations on collaborative genome annotation efforts.
Apollo annotation guidelines for i5k projects Diaphorina citri
Monica Munoz-Torres
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
From peer-reviewed to peer-reproduced: a role for research objects in scholar...
Alejandra Gonzalez-Beltran
The reproducibility of science in the digital age is attracting a lot of attention and concerns from the scientific community, where studies have shown the inability to reproduce results due to a variety of reasons, ranging from unavailability of the data to lack of proper descriptions of the experimental steps.
Multiple research object models have been proposed to describe different aspects of the research process. Investigation/Study/Assay (ISA) is a widely used general-purpose metadata tracking framework with an associated suite of open-source software, which offers a rich description of the experiment’s hypotheses and design, the investigators involved, the experimental factors, and the protocols applied. The information is organised in a three-level hierarchy where ‘Investigation’ provides the project context for a ‘Study’ (a research question), which itself contains one or more ‘Assays’ (taking analytical measurements and key data processing and analysis steps). Nanopublication (NP) is a research object model which enables specific scientific assertions, such as the conclusions of an experiment, to be annotated with supporting evidence, published and cited. Lastly, the Research Object (RO) is a model that enables the aggregation of the digital resources contributing to findings of computational research, including results, data and software, as citable compound digital objects.
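The three-level ISA hierarchy described above can be pictured as nested records. The following is only a minimal sketch in Python: the class names mirror the terms in the abstract, but the fields, the helper `count_assays`, and the example values are hypothetical illustrations and do not reproduce the actual ISA file formats or the ISA software suite.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    """An analytical measurement plus its key processing steps (illustrative fields)."""
    measurement: str
    protocols: List[str] = field(default_factory=list)


@dataclass
class Study:
    """A research question; contains one or more assays."""
    question: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    """Project context for one or more studies."""
    title: str
    investigators: List[str] = field(default_factory=list)
    studies: List[Study] = field(default_factory=list)


def count_assays(investigation: Investigation) -> int:
    """Total number of assays across all studies of an investigation."""
    return sum(len(study.assays) for study in investigation.studies)


# Hypothetical example values, invented purely for illustration.
example = Investigation(
    title="Hypothetical assembler evaluation",
    investigators=["A. Researcher"],
    studies=[
        Study(
            question="Does the assembler reproduce the reported contiguity?",
            assays=[Assay("assembly contiguity", ["assemble reads", "compute N50"])],
        )
    ],
)

if __name__ == "__main__":
    print(count_assays(example))
```

Nesting the records this way keeps each assay attached to the research question it serves, which is the property the abstract attributes to the hierarchy.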
For computational reproducibility, platforms such as Taverna and Galaxy are popular and efficient ways to represent the data analysis steps in the form of reusable workflows, where the data transformations can be specified and executed in an automatic way.
In this presentation, we will address the question of whether such research object models and workflow representation frameworks can be used to assist in the peer review process, by facilitating evaluation of the accuracy of the information provided by scientific articles with respect to their repeatability.
Our case study is based on an article on a genome assembler algorithm published in GigaScience, but due to the proven use of the respective research object models in their respective communities, we argue that the combination of models and workflow system will improve the scholarly publishing process, making science peer-reproduced.
Ontologies for life sciences: examples from the gene ontology
Melanie Courtot
A half day course presented during the Earlham Institute summer school on bioinformatics 2016, in Norwich, UK, http://www.earlham.ac.uk/earlham-institute-summer-school-bioinformatics
In this presentation, I discuss tools for submitting DNA or RNA sequences to sequence databases. The submission tools covered are BankIt, Sequin and Webin.
This presentation is a thorough guide to the use of Web Apollo, with details on User Navigation, Functionality, and the thought process behind manual annotation.
During this workshop, participants:
- Learn to identify homologs of known genes of interest in a newly sequenced genome.
- Become familiar with the environment and functionality of the Web Apollo genome annotation editing tool.
- Learn how to corroborate or modify automatically annotated gene models using available evidence in Web Apollo.
- Understand the process of curation in the context of genome annotation.
Introduction to Web Apollo for the i5K Pilot species project. Web Apollo is a genome annotation editor; it provides a web-based environment that allows multiple distributed users to review, edit, and share manual annotations. Let's get started!
Precise elucidation of the many different biological features encoded in any genome requires careful examination and review by researchers, who gather and evaluate the available evidence to corroborate and modify gene predictions and other biological elements. This curation process allows them to resolve discrepancies and validate automated gene model hypotheses and alignments. This approach is the well-established practice for well-known genomes such as human, mouse, zebrafish, Drosophila, et cetera. Desktop Apollo was originally developed to meet these needs.
The cost of sequencing a genome has fallen by several orders of magnitude in the last decade, and the natural consequence is that more and more researchers are sequencing new genomes, both within populations and across species. Because individual researchers can now readily sequence many genomes of interest, the need for a universally accessible genomic curation tool logically follows. Each new exome or genome sequenced requires visualization and curation to obtain biologically accurate genomic feature sets, even for a limited set of genes, because computational genome analysis remains an imperfect art. Additionally, unlike earlier genome projects, which had the advantage of more highly polished genomes, recent projects usually have lower coverage. Therefore, researchers now face additional work correcting for more frequent assembly errors and annotating genes split across multiple contigs.
Genome annotation is an inherently collaborative task; researchers only very rarely work in isolation, turning to colleagues for second opinions and insights from those with expertise in particular domains and gene families. The new JavaScript-based Apollo allows researchers real-time interactivity, breaking down large amounts of data into manageable portions to mobilize groups of researchers with shared interests. We are also focused on training the next generation of researchers by reaching out to educators to make these tools available as part of curricula via workshops and webinars, and through widely applied systems such as iPlant and DNA Subway. Here we offer details of our progress.
Presentation at Genome Informatics, Session (3) on Databases, Data Mining, Visualization, Ontologies and Curation.
Authors: Monica C Munoz-Torres, Suzanna E. Lewis, Ian Holmes, Colin Diesh, Deepak Unni, Christine Elsik.
Web Apollo: Lessons learned from community-based biocuration efforts.
Monica Munoz-Torres
This presentation highlights the importance and relevance of community-based curation of biological data. It describes the results of harvesting expertise from dispersed researchers assigning functions to predicted and curated peptides, as well as collaborative efforts for standardization of genes and gene product attributes across species and databases.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
An introduction on gene annotation & curation for the IAGC and BIPAA research communities.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
The i5K, an initiative to sequence the genomes of 5,000 insect and related arthropod species, is a broad and inclusive effort that seeks to involve scientists from around the world in their genome curation process, and Apollo is serving as the platform to empower this community.
This presentation is an introduction to Apollo for the members of the i5K Pilot Project working on species of the order Hemiptera.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
A Workshop at the Stowers Institute for Medical Research.
BioAssay Express: Creating and exploiting assay metadata
Philip Cheung
The challenge of accurately characterizing bioassays is a real pain point for many drug discovery organizations. Research has shown that some organizations have legacy assay collections exceeding 20,000 protocols, the great majority of which are not accurately characterized. This problem is compounded by the fact that many new protocol registrations are still not following FAIR (Findability, Accessibility, Interoperability, and Reusability) Data principles.
BioAssay Express is a tool focused on transforming the traditional protocol description from unstructured free-form text into a well-curated data store based upon FAIR Data principles. By using well-defined annotations for assays, the tool enables precise ontology-based searches without having to resort to imprecise keyword searches.
This talk explores a number of important new features designed to help scientists accelerate the drug discovery process. Example use cases include enabling drug repositioning projects, improving SAR models, identifying appropriate machine learning data sets, and fine-tuning integrative-omic pathways.
An aspirational goal for our team is to build a metadata schema based on semantic web vocabularies that is comprehensive to the extent that the text description becomes optional. One of the many possibilities is to take the initial prospective ELN entry for a bioassay protocol and feed it directly to an automated instrument. While there are many challenges involved in creating the ELN-to-robot loop, we will provide some insights into our collaborations with UCSF automation experts.
In summary, the ability to quickly and accurately search or analyze bioassay data (public or internal) is a rate limiting problem in drug discovery. We will present the latest developments toward removing this bottleneck.
https://plan.core-apps.com/acs_sd2019/abstract/6f58993d-a716-49ad-9b09-609edde5a3f4
Introduction to Apollo - i5k Research Community – Calanoida (copepod)
Monica Munoz-Torres
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
The i5k, an initiative to sequence the genomes of 5,000 insect and related arthropod species, is a broad and inclusive effort that seeks to involve scientists from around the world in their genome curation process, and Apollo is serving as the platform to empower this community.
This presentation is an introduction to Apollo for the members of the i5K Pilot Project working on species of the order Calanoida (copepod).
Biological data have many characteristics that make the management of biological information a particularly challenging problem. Here we focus mainly on the characteristics of biological information and on the multidisciplinary field called bioinformatics, which is nowadays offered as a graduate degree program at several universities.
Ajith Ranabahu, Priti Parikh, Maryam Panahiazar, Amit Sheth and Flora Logan-Klumpler: Kino: Making Semantic Annotations Easier. Presented at the 5th Intl Conf on Semantic Computing (ICSC2011), Palo Alto, CA, September 2011.
This presentation contains details about the Apollo genome annotation editor functionality. It also includes a step-by-step example about curating a gene of interest.
This presentation explains the meaning of curation and includes an introduction to the Apollo genome annotation editing tool and its curation environment.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
An introduction to use and functionality for the IAGC and BIPAA research communities.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
This is a brief update about the genome browser JBrowse and the genome annotation editor Apollo, addressed to the members of the Alliance of Genome Resources (AGR).
Learn more about JBrowse at jbrowse.org
Learn more about Apollo at GenomeArchitect.org
Apollo Genome Annotation Editor: Latest Updates, Including New Galaxy Integra...
Monica Munoz-Torres
Manual curation is crucial to improving the quality of the annotations for a genome sequencing project. During this portion of the genome sequencing workflow, curators use a variety of experimental evidence to improve on automated predictions to more accurately represent the underlying biology.
Apollo is a web-based genome annotation editor that allows curators to manually revise and edit genomic elements. It provides a reporting structure for annotated genomic elements and an ‘Annotator Panel’ that allows users to quickly browse the genome and all available annotations. Users can manually edit the structure of a genomic element as well as add metadata, including references to other databases, adding functional assignments to genes and gene products with specific lookup support for Gene Ontology (GO) terms, as well as including references to published literature in support of these annotations.
Apollo is currently used in more than one hundred genome annotation projects around the world, ranging from the annotation of a single species to lineage-specific efforts supporting annotation for dozens of organisms at a time. Apollo enables collaborative, real-time curation (akin to Google Docs); researchers may restrict access to certain annotations depending on the role of users and groups within the community, as well as share tracks of evidence data with the public. Users are able to export their manual annotations via FASTA and GFF3 files, the Chado database schema, and web services. The news hot off the presses is that Apollo is now available for integration with Galaxy via Docker! This allows users to run analyses on their genome of interest, including a step of manual curation, all from the comfort of their installation of the versatile Galaxy platform.
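To give a feel for what working with an exported GFF3 file might look like downstream, here is a minimal sketch in Python that reads GFF3 lines (nine tab-separated columns, with `key=value` attributes in the last one) and counts features by type. The sample records, the `curator` source label, and the function names are assumptions made up for illustration; they are not actual Apollo output, and the sketch does not call any Apollo API.

```python
from typing import Dict, Iterable, NamedTuple
from urllib.parse import unquote


class Gff3Feature(NamedTuple):
    seqid: str
    source: str
    type: str
    start: int
    end: int
    score: str
    strand: str
    phase: str
    attributes: Dict[str, str]


def parse_gff3_line(line: str) -> Gff3Feature:
    """Parse one GFF3 feature line into its nine columns."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    attributes: Dict[str, str] = {}
    for pair in cols[8].split(";"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        # GFF3 percent-encodes reserved characters in attribute values.
        attributes[unquote(key)] = unquote(value)
    return Gff3Feature(cols[0], cols[1], cols[2], int(cols[3]), int(cols[4]),
                       cols[5], cols[6], cols[7], attributes)


def count_feature_types(lines: Iterable[str]) -> Dict[str, int]:
    """Count features per type, skipping comments, directives and blank lines."""
    counts: Dict[str, int] = {}
    for line in lines:
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; no more features
        if not line.strip() or line.startswith("#"):
            continue
        feature = parse_gff3_line(line)
        counts[feature.type] = counts.get(feature.type, 0) + 1
    return counts


# Hypothetical records, invented for illustration only.
SAMPLE = [
    "##gff-version 3",
    "scaffold1\tcurator\tgene\t100\t900\t.\t+\t.\tID=gene1;Name=hypothetical%20gene",
    "scaffold1\tcurator\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1",
    "scaffold1\tcurator\texon\t100\t400\t.\t+\t.\tID=exon1;Parent=mrna1",
    "scaffold1\tcurator\texon\t600\t900\t.\t+\t.\tID=exon2;Parent=mrna1",
]

if __name__ == "__main__":
    print(count_feature_types(SAMPLE))
```

A per-type tally like this is one quick sanity check on an export, for instance to confirm that every curated gene carries at least one transcript and exon.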
Scientific research is inherently a collaborative task; in our case it is a dialog among different researchers to reach a shared understanding of the underlying biology. To facilitate this dialog we have developed two web-based annotation tools: Apollo (http://genomearchitect.org/), a genomic feature editor, designed to support structural annotation of gene models, and Noctua (http://noctua.berkeleybop.org/), a biological-process model builder designed for describing the functional roles of gene products. Here we wish to outline an inventory of essential requirements that, in our experience, enable an annotation tool to meet the needs of both professional biocurators as well as other members of the research community. Here are the general requirements, beyond specific functional requirements, that any annotation tool must satisfy.
Comparative genome analysis requires high quality annotations of all genomic elements. Today’s sequencing projects face numerous challenges including lower coverage, more frequent assembly errors, and the lack of closely related species with well-annotated genomes. Precise elucidation of the many different biological features encoded in any genome requires careful examination and review. We need genome annotation editing tools to modify and refine the location and structure of the genome elements that predictive algorithms cannot yet resolve automatically. During the manual annotation process, curators identify elements that best represent the underlying biology and eliminate elements that reflect systemic errors of automated analyses.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, analogous to Google Docs, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Researchers from nearly one hundred institutions worldwide are currently using Apollo for distributed curation efforts in over sixty genome projects across the tree of life: from plants to arthropods, to fungi, to species of fish and other vertebrates including human, cattle (bovine), and dog.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
The i5K, an initiative to sequence the genomes of 5,000 insect and related arthropod species, is a broad and inclusive effort that seeks to involve scientists from around the world in their genome curation process, and Apollo is serving as the platform to empower this community.
This presentation is an introduction to Apollo for the members of the i5K Pilot Project on Eurytemora affinis.
Introduction to Apollo: A webinar for the i5K Research Community
Monica Munoz-Torres
This presentation is an introduction to Apollo for the members of the i5K Pilot Project Species.
This is an introduction to conducting manual annotation efforts using Apollo. This webinar was offered to members of the i5K Research community on 2015-10-07.
GENE ONTOLOGY CONSORTIUM: tools for functional annotation
Monica Munoz-Torres
This presentation contains material taught during the Gene Ontology course at BIOS. Topics include a description of the structure of the ontology, how terms are constructed, and why it is necessary to use ontologies. We also discuss statistical analyses of term enrichment and representation. The exercises are part of the training from the GOA group at EMBL-EBI.
The presentation was given in Spanish.
Apollo is a web-based, collaborative genomic annotation editing platform. We need annotation editing tools to modify and refine precise location and structure of the genome elements that predictive algorithms cannot yet resolve automatically.
This presentation is an introduction to how the manual annotation process takes place using Apollo. It is addressed to the members of the American Chestnut & Chinese Chestnut Genomics research community.
Apollo: A workshop for the Manakin Research Coordination Network
Monica Munoz-Torres
Apollo is a web-based, collaborative genomic annotation editing platform. We need annotation editing tools to modify and refine precise location and structure of the genome elements that predictive algorithms cannot yet resolve automatically.
This presentation is an introduction to how the manual annotation process takes place using Apollo. It is addressed to the members of the Manakin Genomics research community.
Apollo - A webinar for the Phascolarctos cinereus research community
Monica Munoz-Torres
Web Apollo is a web-based, collaborative genomic annotation editing platform. We need annotation editing tools to modify and refine precise location and structure of the genome elements that predictive algorithms cannot yet resolve automatically.
This presentation is an introduction to how the manual annotation process takes place using Web Apollo. It is addressed to the members of the Phascolarctos cinereus research community.
Continuing with the theme of DNA repair via homologous recombination, I will discuss the following family during the PAINT call:
PTHR13451 CLASS II CROSSOVER JUNCTION ENDONUCLEASE MUS81
Talk at the 8th International Biocuration Conference. Beijing, China. April 23-26, 2015.
Obtaining meaningful results from genome analyses requires high quality annotations of all genomic elements. Today’s sequencing projects face challenges such as lower coverage, more frequent assembly errors, and the lack of closely related species with well-annotated genomes. Apollo is a web-based application that supports and enables collaborative genome curation in real time, analogous to Google Docs, allowing curators to improve on existing automated gene models through an intuitive interface. Apollo’s extensible architecture is built on top of JBrowse; its components are a web-based client, an annotation-editing engine, and a server-side data service. It allows users to visualize automated gene models, protein alignments, expression and variant data, and conduct structural and/or functional annotations.
Apollo is actively used within a variety of projects, including the initiative to sequence the genomes of 5,000 Arthropod species (i5K), and will become essential to the thousands of genomes now being sequenced and analyzed. Researchers from nearly 100 institutions worldwide are currently using Apollo on distributed curation efforts for over sixty genome projects across the tree of life; from plants to echinoderms, to fungi, to species of fish and other vertebrates including human, cattle (bovine), and dog. We are training the next generation of researchers by reaching out to educators to make these tools available as part of curricula, offering workshops and webinars to the scientific community, and through widely applied systems such as iPlant and DNA Subway. We are currently integrating Apollo into an annotation environment combining gene structural and functional annotation, transcriptomic, proteomic, and phenotypic annotation. In this presentation we will describe in detail its utility to users, introduce the architecture to developers interested in expanding on this open-source project, and offer details of our future plans.
Authors:
Monica Munoz-Torres(1), Nathan Dunn(1), Colin Diesh(2), Deepak Unni(2), Seth Carbon(1), Heiko Dietze(1), Christopher Mungall(1), Nicole Washington(1), Ian Holmes(3), Christine Elsik(2), and Suzanna E. Lewis(1)
(1) Lawrence Berkeley National Laboratory, Genomics Division, Berkeley, CA
(2) Divisions of Animal and Plant Sciences, University of Missouri, Columbia, MO
(3) University of California Berkeley, Bioengineering, Berkeley, CA
2. An introduction to Web Apollo.
A webinar for the Eurytemora affinis research community.
Monica Munoz-Torres, PhD | @monimunozto
Berkeley Bioinformatics Open-Source Projects (BBOP)
Genomics Division, Lawrence Berkeley National Laboratory
29 August, 2014
3. Outline
1. What is Web Apollo?
• Definition & working concept.
2. Our Experience With Community-Based Curation.
3. The Manual Annotation Process.
4. Becoming acquainted with Web Apollo.
5. Example.
4. During this webinar you will:
• Learn to identify homologs of known genes of interest
in your newly sequenced genome.
• Become familiar with the environment and
functionality of the Web Apollo genome annotation
editing tool.
5. What is Web Apollo?
• Web Apollo is a web-based, collaborative genomic
annotation editing platform.
We need annotation editing tools to modify and refine the
precise location and structure of the genome elements that
predictive algorithms cannot yet resolve automatically.
1. What is Web Apollo? 5
Find more about Web Apollo at
http://GenomeArchitect.org
and
Genome Biol 14:R93. (2013).
6. Brief history of Apollo*:
Biologists could finally visualize computational analyses and
experimental evidence from genomic features and build
manually-curated consensus gene structures. Apollo became a
very popular, open source tool (insects, fish, mammals, birds, etc.).
a. Desktop: one person at a time editing a specific region, annotations saved in local files; slowed down collaboration.
b. Java Web Start: users saved annotations directly to a centralized database; potential issues with stale annotation data remained.
7. Web Apollo
• Browser-based tool integrated with JBrowse.
• Two new tracks: “Annotation” and “DNA Sequence”
• Allows for intuitive annotation creation and editing, with gestures and pull-down menus to create and modify transcript and exon structures, insert comments (CV, freeform text), etc.
• Customizable look & feel.
• Edits in one client are
instantly pushed to all other
clients: Collaborative!
8. Working Concept
In the context of gene manual annotation,
curation tries to find the best examples
and/or eliminate most errors.
To conduct manual annotation efforts:
Gather and evaluate all available evidence
using quality-control metrics to
corroborate or modify automated
annotation predictions.
Perform sequence similarity searches
(phylogenetic framework) and use
literature and public databases to:
• Predict functional assignments from
experimental data.
• Distinguish orthologs from paralogs,
and classify gene membership in
families and networks.
Automated gene models
Evidence:
cDNAs, HMM domain searches,
alignments with assemblies or
genes from other species.
Manual annotation & curation
2. In our experience. 8
9. Dispersed, community-based gene
manual annotation efforts.
We continuously train and support
hundreds of geographically dispersed
scientists from many research
communities, to perform biologically
supported manual annotations using
Web Apollo.
– Gate keepers and monitoring.
– Written tutorials.
– Training workshops and geneborees.
– Personalized user support.
10. What we have learned.
By harvesting expertise from dispersed researchers who assigned functions to predicted and curated peptides, we have developed more interactive and responsive tools, as well as better visualization, editing, and analysis capabilities.
http://people.csail.mit.edu/fredo/PUBLI/Drawing/
11. Collaborative Efforts Improved
Automated Annotations
In many cases, automated annotations have been improved (e.g. Apis mellifera; Elsik et al. BMC Genomics 2014, 15:86).
We also learned of the challenges of newer sequencing technologies, e.g.:
– Frameshifts and indel errors
– Split genes across scaffolds
– Highly repetitive sequences
To face these challenges, we train annotators in
recovering coding sequences in agreement with all
available biological evidence.
12. It is helpful to work together.
Scientific community efforts bring together domain-specific
and natural history expertise that would
otherwise remain disconnected.
Breaking down large amounts of data into
manageable portions and mobilizing groups
of researchers to extract the most accurate
representation of the biology from all
available data distills invaluable
knowledge from genome analysis.
14. A little training goes a long way!
With the right tools, wet lab scientists make exceptional
curators who can easily learn to maximize the
generation of accurate, biologically supported gene
models.
15. Manual Annotation: How do we get there?
In a genome sequencing project…
Assembly → Automated annotation → Manual annotation → Experimental validation
3. How do we get there?
16. Gene Prediction
Identification of protein-coding genes, tRNAs, rRNAs,
regulatory motifs, repetitive elements (masked), etc.
- Ab initio (DNA composition): Augustus, GENSCAN,
geneid, fgenesh
- Homology-based: e.g. SGP2, fgenesh++
Nucleic Acids 2003 vol. 31 no. 13 3738-3741
17. Gene Annotation
Integration of data from prediction tools to generate a
consensus set of predictions or gene models.
• Models may be organized using:
- automatic integration of predicted sets, e.g. GLEAN
- packaging the necessary tools into a pipeline, e.g. MAKER
• All available biological evidence (e.g. transcriptomes) further
informs the annotation process.
In some cases algorithms and metrics used to generate
consensus sets may actually reduce the accuracy of the
gene’s representation; in such cases it is usually better to
use an ab initio model to create a new annotation.
18. Manual Genome Annotation
• Identifies elements that best represent the underlying
biology.
• Eliminates elements that reflect the systemic errors of
automated genome analyses.
• Determines functional roles through comparative
analysis of well-studied, phylogenetically similar
genome elements using literature, databases, and
the researcher’s experience.
19. The Curation Process is Necessary
1. A computationally predicted consensus gene set is
generated using multiple lines of evidence.
2. Manual annotation takes place.
3. Ideally consensus computational predictions will be
integrated with manual annotations to produce an
updated Official Gene Set (OGS).
Otherwise, “incorrect and incomplete genome annotations
will poison every experiment that uses them”.
- M. Yandell.
20. The Collaborative Curation Process at i5K
1) A computationally predicted consensus gene set has
been generated using multiple lines of evidence; e.g.
Consensus Gene EAFF_v0.5.3-Models.
2) i5K Projects will integrate consensus computational
predictions with manual annotations to produce an updated
Official Gene Set (OGS):
» If it’s not on either track, it won’t make the OGS!
» If it’s there and it shouldn’t be, it will still make the OGS!
21. Consensus set: reference and start point
• In some cases algorithms and metrics used to generate
consensus sets may actually reduce the accuracy of the gene’s
representation; e.g. use Augustus model instead to create a new
annotation.
• Isoforms: drag original and alternatively spliced form to ‘User-created
Annotations’ area.
• If an annotation needs to be removed from the consensus set,
drag it to the ‘User-created Annotations’ area and label as
‘Delete’ on Information Editor.
• Overlapping interests? Collaborate to reach agreement.
• Follow guidelines for i5K Pilot Species Projects as shown at
http://goo.gl/LRu1VY
23. Web Apollo: The Sequence Selection Window
Sort
4. Becoming Acquainted with Web Apollo.
24. Web Apollo: Graphical User Interface (GUI) for editing annotations
• Navigation tools: pan and zoom.
• Search box: go to a scaffold or a gene model.
• Grey bar of coordinates indicates location. You can also select here in order to zoom to a sub-region.
• ‘View’: change color by CDS, toggle strands, set highlight.
• ‘File’: upload your own evidence (GFF3, BAM, BigWig, VCF*); add combination and sequence search tracks.
• ‘Tools’: use BLAT to query the genome with a protein or DNA sequence.
• Login
• Available Tracks
• Evidence Tracks Area
• ‘User-created Annotations’ Track
25. Web Apollo
• Selection of features and sub-features.
• Edge-matching.
• Flags non-canonical splice sites.
• ‘User-created Annotations’ Track
• Evidence Tracks Area
The editing logic in the server:
selects longest ORF as CDS
flags non-canonical splice sites
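The splice-site check can be illustrated with a minimal sketch. This is not Web Apollo's server code: the function name `noncanonical_introns`, the 0-based half-open exon coordinates, and the assumption that only GT..AG introns count as canonical on the forward strand are all choices made for this example.

```python
def noncanonical_introns(seq, exons):
    """Return introns whose boundaries are not GT..AG.

    Hypothetical helper for illustration only.
    seq   : genomic sequence, forward strand.
    exons : sorted list of (start, end) pairs, 0-based, end-exclusive.
    """
    flagged = []
    # Each intron lies between the end of one exon and the start of the next.
    for (_, left_end), (right_start, _) in zip(exons, exons[1:]):
        donor = seq[left_end:left_end + 2].upper()            # first 2 bases of the intron
        acceptor = seq[right_start - 2:right_start].upper()   # last 2 bases of the intron
        if (donor, acceptor) != ("GT", "AG"):
            flagged.append((left_end, right_start, donor, acceptor))
    return flagged


# Two exons separated by a 6-base intron.
canonical = "ATGGCC" + "GTAAAG" + "TTTTAA"
unusual = "ATGGCC" + "GCAAAG" + "TTTTAA"
exons = [(0, 6), (12, 18)]
print(noncanonical_introns(canonical, exons))
print(noncanonical_introns(unusual, exons))
```

A curator would treat a flagged intron as a prompt to re-examine the exon boundary against the evidence tracks, not as proof of an error.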
26. Web Apollo
There are two new kinds of tracks:
• ‘User-created Annotations’ Track, for annotation editing.
• DNA Track, for sequence alteration editing.
27. Web Apollo
Annotations, annotation edits, and History: stored in a centralized database.
28. Web Apollo: The Information Editor
• DBXRefs
• PubMed IDs
• GO terms
• Comments
29. Additional Functionality
In addition to the protein-coding gene annotation that you know and love:
• Non-coding genes: ncRNAs, miRNAs, repeat regions, and TEs
• Sequence alterations (less coverage = more fragmentation)
• Visualization of stage and cell-type specific transcription data as
coverage plots, heat maps, and alignments
30. How to begin curating
To find the gene region you wish to annotate, you may use:
a) a protein sequence from another species
b) a sequence from a similar gene
c) your own alignments of gene models or transcriptomic data to the genome.
d) high quality protein and/or gene family alignments (multi or single species) in which you are able to identify conserved domains.
Option 1 – You have a sequence but don’t know where it is in this genome:
• Use BLAT in Web Apollo window, or BLAST at NAL’s i5k BLAST server, available at:
http://i5k.nal.usda.gov/blastn
• Alternatively, use any other tool; for example Geneious.
Option 2 – The genome has already been annotated with your sequences and you have a gene
identifier that has been indexed in Web Apollo.
• That is, you know where to look, so type the ID in the Search box of Web Apollo.
• Web Apollo autocompletes using a case-insensitive search anchored on the left-hand side of
the word. For example “HaGR” will show all “hagr” objects (up to 30).
• Choose one of the genes and click “Go”.
• You can do that with Domains, Alignments or Gene names provided to you (if they have been
indexed).
Option 3 – Find genes based on functional ontology terms or network membership identifiers.
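The autocomplete behaviour described under Option 2 (case-insensitive, anchored on the left-hand side of the word, at most 30 results) can be sketched as follows. The function name `autocomplete`, the plain list of names, and the alphabetical ordering of results are assumptions for illustration; how Web Apollo indexes and orders names internally is not stated here.

```python
def autocomplete(query, names, limit=30):
    """Case-insensitive prefix search over indexed names (illustrative only)."""
    q = query.lower()
    # Keep only names that start with the query, ignoring case.
    matches = [name for name in names if name.lower().startswith(q)]
    # Ordering is an assumption of this sketch; cap the list at `limit` entries.
    return sorted(matches, key=str.lower)[:limit]


indexed = ["HaGR1", "hagr2", "HaOR5", "xHaGR"]
print(autocomplete("HaGR", indexed))
```

Note that "xHaGR" is not returned: the match is anchored at the start of the name, so a query only finds identifiers that begin with the typed text.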
31. General Process of Curation
1. Select the chromosomal region of interest, e.g. scaffold.
2. Select appropriate evidence tracks.
3. Determine whether a feature in an existing evidence track will
provide a reasonable gene model to start working.
- If yes: select and drag the feature to the ‘User-created
Annotations’ area, creating an initial gene model. If necessary
use editing functions to adjust the gene model.
- Nothing available to you? Let’s have a talk.
4. Check your edited gene model for integrity and accuracy by
comparing it with available homologs.
Always remember: when annotating gene models using Web
Apollo, you are looking at a ‘frozen’ version of the genome
assembly and you will not be able to modify the assembly itself.
32. Example
Introductory demonstration using the Apis mellifera genome.
Q&A session using the Eurytemora affinis genome at
https://apollo.nal.usda.gov/euraff/selectTrack.jsp
A public Honey Bee Web Apollo Demo is available at
http://genomearchitect.org/WebApolloDemo
33. What do we know for this species?
• What data are currently available?
• At NCBI:
• 5,570 nucleotide sequences (scaffolds)
• 446 amino acid sequences (CO-I)
• 0 conserved domains identified
• 0 “gene” entries submitted
34. PubMed Search: what’s new?
Empirical examples of
beneficial reversal of
dominance:
• Warfarin resistance: mutation
of VKORC1 is associated with
increased dietary requirement
for vit. K
35. How many sequences for your gene of interest?
And what do we know about it?
• VKORC1 – vit. K epoxide reductase
complex, subunit 1.
• MF: quinone binding (IEA,
GO:0048038), vit K epoxide reductase
activity (IDA, GO:0047057).
• BP: blood coagulation (IMP,
GO:0007596), bone development
(ISS, GO:0060348).
• CC: endoplasmic reticulum membrane
(TAS, GO:0005789), integral
component of membrane (IEA,
GO:0016021).
36. BLAST at i5K
https://i5k.nal.usda.gov/blast
To Web Apollo
37. BLAST at i5K: high-scoring segment pairs (HSPs) in the “BLAST+ results” track
39. Creating a new gene model: drag and drop
• Web Apollo automatically calculates the longest open reading
frame (ORF). In this case, the ORF includes the hsp.
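To make the longest-ORF calculation concrete, here is a minimal sketch under simplifying assumptions: it scans only the three forward frames of an already-spliced transcript and requires both an ATG start and an in-frame stop codon. The function name `longest_orf` is hypothetical, and Web Apollo's own logic (for example its handling of partial ORFs) may differ.

```python
STOPS = {"TAA", "TAG", "TGA"}


def longest_orf(transcript):
    """Return (start, end) of the longest ATG..stop ORF, 0-based, end-exclusive.

    Illustrative only; returns (0, 0) when no complete ORF is found.
    """
    seq = transcript.upper()
    best = (0, 0)
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens a candidate ORF
            elif start is not None and codon in STOPS:
                if (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)      # keep the longest ORF seen so far
                start = None                   # look for the next ORF in this frame
    return best


transcript = "CC" + "ATGAAACCCTAA" + "GG"
start, end = longest_orf(transcript)
print(start, end, transcript[start:end])
```

After dragging a feature into the ‘User-created Annotations’ area, comparing the calculated ORF with the position of the BLAST hit is a quick check that the model covers the expected coding region.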
46. Arthropodcentric Thanks!
AgriPest Base
FlyBase
Hymenoptera Genome Database
VectorBase
Acromyrmex echinatior
Acyrthosiphon pisum
Apis mellifera
Atta cephalotes
Bombus terrestris
Camponotus floridanus
Helicoverpa armigera
Linepithema humile
Manduca sexta
Mayetiola destructor
Nasonia vitripennis
Pogonomyrmex barbatus
Solenopsis invicta
Tribolium castaneum… and you!
47. Thanks!
• Berkeley Bioinformatics Open-source Projects
(BBOP), Berkeley Lab: Web Apollo and Gene Ontology
teams. Suzanna E. Lewis (PI).
• Christine G. Elsik (PI). § University of Missouri.
• Ian Holmes (PI). University of California, Berkeley.
• Arthropod genomics community, i5K Steering
Committee, Monica Poelchau at USDA/NAL, fringy
Richards at HGSC-BCM, Alexie Papanicolaou at
CSIRO, Oliver Niehuis at 1KITE http://www.1kite.org/,
BGI, and the Honey Bee Genome Sequencing
Consortium.
• Web Apollo is supported by NIH grants
5R01GM080203 from NIGMS, and 5R01HG004483
from NHGRI, and by the Director, Office of Science,
Office of Basic Energy Sciences, of the U.S.
Department of Energy under Contract No. DE-AC02-
05CH11231.
• Insect images used with permission:
http://AlexanderWild.com and O. Niehuis.
• For your attention, thank you!
Colleagues at BBOP
Web Apollo
Suzanna Lewis
Gregg Helt
Colin Diesh §
Deepak Unni §
Gene Ontology
Chris Mungall
Seth Carbon
Heiko Dietze
Web Apollo: http://GenomeArchitect.org
GO: http://GeneOntology.org
i5K: http://arthropodgenomes.org/wiki/i5K