Introduction to Web Apollo for the i5K Pilot species project. Web Apollo is a genome annotation editor; it provides a web-based environment that allows multiple distributed users to review, edit, and share manual annotations. This presentation includes information specific to the projects of the Global Initiative to sequence the genomes of 5,000 species of arthropods, i5K. Let's get started!
This presentation is a thorough guide to the use of Web Apollo, with details on User Navigation, Functionality, and the thought process behind manual annotation.
During this workshop, participants:
- Learn to identify homologs of known genes of interest in a newly sequenced genome.
- Become familiar with the environment and functionality of the Web Apollo genome annotation editing tool.
- Learn how to corroborate or modify automatically annotated gene models using all available evidence in Web Apollo.
- Understand the process of curation in the context of genome annotation.
An introduction to Web Apollo for the Biomphalaria glabrata research community, by Monica Munoz-Torres
Web Apollo is a web-based, collaborative genomic annotation editing platform. We need annotation editing tools to modify and refine the precise location and structure of the genome elements that predictive algorithms cannot yet resolve automatically.
This presentation is an introduction to how the manual annotation process takes place using Web Apollo. It is addressed to the members of the Biomphalaria glabrata research community.
Introduction to Web Apollo for the i5K Pilot species project. Web Apollo is a genome annotation editor; it provides a web-based environment that allows multiple distributed users to review, edit, and share manual annotations. Let's get started!
Web Apollo Tutorial for the i5K copepod research community, by Monica Munoz-Torres
Introduction to Web Apollo for the i5K copepod research community. Web Apollo is a genome annotation editor; it provides a web-based environment that allows multiple distributed users to review, edit, and share manual annotations. This presentation includes information specific to the projects of the Global Initiative to sequence the genomes of 5,000 species of arthropods, i5K. Let's get started!
Web Apollo: Lessons learned from community-based biocuration efforts, by Monica Munoz-Torres
This presentation highlights the importance and relevance of community-based curation of biological data. It describes the results of harvesting expertise from dispersed researchers assigning functions to predicted and curated peptides, as well as collaborative efforts for standardization of genes and gene product attributes across species and databases.
Apollo and i5K: Collaborative Curation and Interactive Analysis of Genomes, by Monica Munoz-Torres
Precise elucidation of the many different biological features encoded in a genome requires a careful curation process that involves reviewing all available evidence to allow researchers to resolve discrepancies and validate automated gene models, protein alignments, and other biological elements. Genome annotation is an inherently collaborative task; researchers only rarely work in isolation, turning to colleagues for second opinions and insights from those with expertise in particular domains and gene families.
The i5k initiative seeks to sequence the genomes of 5,000 insect and related arthropod species. The selected species are known to be important to worldwide agriculture, food safety, medicine, and energy production; they also include many species used as models in biology, those most abundant in world ecosystems, and representatives of every branch of the insect phylogeny, chosen in an effort to better understand arthropod evolution and phylogeny. Because computational genome analysis remains an imperfect art, each of these newly sequenced genomes will require visualization and curation.
Apollo is an instantaneous, collaborative genome annotation editor, and the new JavaScript-based version allows researchers real-time interactivity, breaking down large amounts of data into manageable portions to mobilize groups of researchers with shared interests. The i5K is a broad and inclusive effort that seeks to involve scientists from around the world in their genome curation process, and Apollo is serving as the platform to empower this community. Here we offer details about this collaboration.
This presentation is a thorough guide to the use of Web Apollo, with details on User Navigation, Functionality, and the thought process behind manual annotation.
During this workshop, participants:
- Learn to identify homologs of known genes of interest in a newly sequenced genome.
- Become familiar with the environment and functionality of the Web Apollo genome annotation editing tool.
- Learn how to corroborate or modify automatically annotated gene models using available evidence in Web Apollo.
- Understand the process of curation in the context of genome annotation.
Precise elucidation of the many different biological features encoded in any genome requires careful examination and review by researchers, who gather and evaluate the available evidence to corroborate and modify gene predictions and other biological elements. This curation process allows them to resolve discrepancies and validate automated gene model hypotheses and alignments. This approach is the well-established practice for well-known genomes such as human, mouse, zebrafish, Drosophila, et cetera. Desktop Apollo was originally developed to meet these needs.
The cost of sequencing a genome has been reduced by several orders of magnitude in the last decade, and the natural consequence is that more and more researchers are sequencing more and more new genomes, both within populations and across species. Because individual researchers can now readily sequence many genomes of interest, the need for a universally accessible genomic curation tool logically follows. Each new exome or genome sequenced requires visualization and curation to obtain biologically accurate genomic feature sets, even for a limited set of genes, because computational genome analysis remains an imperfect art. Additionally, unlike earlier genome projects, which had the advantage of more highly polished genomes, recent projects usually have lower coverage. Therefore, researchers now face additional work correcting for more frequent assembly errors and annotating genes split across multiple contigs.
Genome annotation is an inherently collaborative task; researchers only very rarely work in isolation, turning to colleagues for second opinions and insights from those with expertise in particular domains and gene families. The new JavaScript-based Apollo allows researchers real-time interactivity, breaking down large amounts of data into manageable portions to mobilize groups of researchers with shared interests. We are also focused on training the next generation of researchers by reaching out to educators to make these tools available as part of curricula via workshops and webinars, and through widely applied systems such as iPlant and DNA Subway. Here we offer details of our progress.
Presentation at Genome Informatics, Session (3) on Databases, Data Mining, Visualization, Ontologies and Curation.
Authors: Monica C Munoz-Torres, Suzanna E. Lewis, Ian Holmes, Colin Diesh, Deepak Unni, Christine Elsik.
Science has evolved from the isolated individual tinkering in the lab, through the era of the “gentleman scientist” with his or her assistant(s), to group-based and then expansive collaboration, and now to an opportunity to collaborate with the world. With the advent of the internet, the opportunity for crowd-sourced contribution and large-scale collaboration has exploded and, as a result, scientific discovery has been further enabled. The contributions of enormous open data sets, liberal licensing policies and innovative technologies for mining and linking these data have given rise to platforms that are beginning to deliver on the promise of semantic technologies and nanopublications, facilitated by the unprecedented computational resources available today, especially the increasing capabilities of handheld devices. The speaker will provide an overview of his experiences in developing a crowdsourced platform for chemists allowing for data deposition, annotation and validation. The challenges of mapping chemical and pharmacological data, especially with regard to data quality, will be discussed. The promise of distributed participation in data analysis is already in place.
Ontomaton: NCBO BioPortal Ontology lookups in Google Spreadsheets produced by ISATeam at University of Oxford e-Research Centre (Eamonn Maguire, Alejandra Gonzalez-Beltran, Philippe Rocca-Serra and Susanna Sansone) and NCBO (Trish Whetzel).
The work was presented during ICBO 2013 in Montreal by Trish Whetzel (Thanks Trish!)
Towards Reproducible Science: a few building blocks from my personal experience, by Oscar Corcho
Invited keynote given at the Second International Workshop on Semantics for BioDiversity (http://fusion.cs.uni-jena.de/s4biodiv2017/), held in conjunction with ISWC2017 (https://iswc2017.semanticweb.org/)
Three's a crowd-source: Observations on Collaborative Genome Annotation, by Monica Munoz-Torres
It is impossible for a single individual to fully curate a genome with precise biological fidelity. Beyond the problem of scale, curators need second opinions and insights from colleagues with domain and gene family expertise, but the communications constraints imposed in earlier applications made this inherently collaborative task difficult. Apollo, a client-side JavaScript application allowing extensive changes to be rapidly made without server round-trips, placed us in a position to assess the difference this real-time interactivity would make to researchers’ productivity and the quality of downstream scientific analysis. To evaluate this, we trained and supported geographically dispersed scientific communities (hundreds of scientists and agreed-upon gatekeepers, in ~100 institutions around the world) to perform biologically supported manual annotations, and monitored their findings. We observed that: 1) Previously disconnected researchers were more productive when obtaining immediate feedback in dialogs with collaborators. 2) Unlike earlier genome projects, which had the advantage of more highly polished genomes, recent projects usually have lower coverage. Therefore, curators now face additional work correcting for more frequent assembly errors and annotating genes that are split across multiple contigs. 3) Automated annotations were improved, as exemplified by discoveries made based on revised annotations; for example, ~2800 manually annotated genes from three species of ants granted further insight into the evolution of sociality in this group, and ~3600 manual annotations contributed to a better understanding of immune function, reproduction, lactation and metabolism in cattle. 4) There is a notable trend shifting from whole-genome annotation to annotation of specific gene families or other gene groups linked by ecological and evolutionary significance. 5) The distributed nature of these efforts still demands strong, goal-oriented (i.e., publication of findings) leadership and coordination, as these are crucial to the success of each project. Here we detail these and other observations on collaborative genome annotation efforts.
Introduction to Apollo - i5k Research Community – Calanoida (copepod), by Monica Munoz-Torres
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
The i5k, an initiative to sequence the genomes of 5,000 insect and related arthropod species, is a broad and inclusive effort that seeks to involve scientists from around the world in their genome curation process, and Apollo is serving as the platform to empower this community.
This presentation is an introduction to Apollo for the members of the i5K Pilot Project working on species of the order Calanoida (copepod).
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
This presentation explains the meaning of curation and includes an introduction to the Apollo genome annotation editing tool and its curation environment.
Apollo annotation guidelines for i5k projects: Diaphorina citri, by Monica Munoz-Torres
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
The i5K, an initiative to sequence the genomes of 5,000 insect and related arthropod species, is a broad and inclusive effort that seeks to involve scientists from around the world in their genome curation process, and Apollo is serving as the platform to empower this community.
This presentation is an introduction to Apollo for the members of the i5K Pilot Project working on species of the order Hemiptera.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
An introduction to gene annotation and curation for the IAGC and BIPAA research communities.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
A Workshop at the Stowers Institute for Medical Research.
Comparative genome analysis requires high quality annotations of all genomic elements. Today’s sequencing projects face numerous challenges including lower coverage, more frequent assembly errors, and the lack of closely related species with well-annotated genomes. Precise elucidation of the many different biological features encoded in any genome requires careful examination and review. We need genome annotation editing tools to modify and refine the location and structure of the genome elements that predictive algorithms cannot yet resolve automatically. During the manual annotation process, curators identify elements that best represent the underlying biology and eliminate elements that reflect systemic errors of automated analyses.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, analogous to Google Docs, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Researchers from nearly one hundred institutions worldwide are currently using Apollo for distributed curation efforts in over sixty genome projects across the tree of life: from plants to arthropods, to fungi, to species of fish and other vertebrates including human, cattle (bovine), and dog.
Curators are necessarily detail oriented -- a trait born of, and reinforced by, our efforts to describe biological data accurately and precisely. To ensure comprehensive coverage and meaningful integration of new and existing knowledge, however, it is important to periodically step back from this fine-grained view and assess emergent features in accumulated curation. I will explore how PomBase has used the global "big picture" view of curated data to provide biological summaries, modularise content, and improve data display and access for our users. The global perspective can also be used to detect annotation errors and identify knowledge gaps, thereby improving overall annotation quality. I will also describe the progress we have made in engaging fission yeast researchers in community curation. Finally, I will show that the global curation perspective and community engagement share a common theme: both improve overall understanding, accessibility and reuse of accumulated knowledge by our user community.
Ontologies and Semantic Web technologies play an important role in the life sciences to help make data more interoperable and reusable. There are now many publicly available ontologies that enable biologists to describe everything from gene function through to animal physiology and disease.
Various efforts such as the Open Biomedical Ontologies (OBO) foundry provide central registries for biomedical ontologies and ensure they remain interoperable through a set of common shared development principles.
At EMBL-EBI we contribute to the development of biomedical ontologies and make extensive use of them in the annotation of public datasets. Biological data typically comes with rich and often complex metadata, so the ontologies provide a standard way to capture “what the data is about” and give us hooks to connect to more data about similar things.
These ontology annotations have been put to good use in a number of large-scale data integration efforts and there’s an increasing recognition of the need for ontologies in making data FAIR (Findable, Accessible, Interoperable and Reusable).
EMBL-EBI build a number of integrative data platforms where ontologies are at the core of our domain models. One example is the Open Targets platform, where data about disease from 18 different databases can be aggregated and grouped based on therapeutic areas in the ontology and used to identify potential drug targets.
The ontologies team at EMBL-EBI provide a suite of services that are aimed at making ontologies more accessible for both humans and machines. We work with scientific data curators and software developers to integrate ontologies and semantics into both the data generation and data presentation workflows. We provide:
– An ontology lookup service (OLS) that provides search and visualisation services for over 200 ontologies
– Services for automating the annotation of metadata and learning from previous annotations (Zooma)
– An ontology mapping and alignment service (OXO)
– Tools for working with metadata and ontologies in spreadsheets (Webulous)
– Software for enriching documents in search engines to support “semantic” query expansion
I’ll present how we are using these services at EMBL-EBI to scale up the semantic annotation of metadata. I’ll talk about our open source technology stack and describe how we utilise a polyglot persistence approach (graph databases, triple stores, document stores, etc.) to optimize how we deliver ontologies and semantics to our users.
Being Reproducible: SSBSS Summer School 2017
Carole Goble
Lecture 2:
Being Reproducible: Models, Research Objects and R* Brouhaha
Reproducibility is a R* minefield, depending on whether you are testing for robustness (rerun), defence (repeat), certification (replicate), comparison (reproduce) or transferring between researchers (reuse). Different forms of "R" make different demands on the completeness, depth and portability of research. Sharing is another minefield raising concerns of credit and protection from sharp practices.
In practice the exchange, reuse and reproduction of scientific experiments is dependent on bundling and exchanging the experimental methods, computational codes, data, algorithms, workflows and so on along with the narrative. These "Research Objects" are not fixed, just as research is not “finished”: the codes fork, data is updated, algorithms are revised, workflows break, service updates are released. ResearchObject.org is an effort to systematically support more portable and reproducible research exchange.
In this talk I will explore these issues in more depth using the FAIRDOM Platform and its support for reproducible modelling. The talk will cover initiatives and technical issues, and raise social and cultural challenges.
This presentation contains details about the Apollo genome annotation editor functionality. It also includes a step-by-step example about curating a gene of interest.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
An introduction to use and functionality for the IAGC and BIPAA research communities.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
This is a brief update about the genome browser JBrowse and the genome annotation editor Apollo, addressed to the members of the Alliance of Genome Resources (AGR).
Learn more about JBrowse at jbrowse.org
Learn more about Apollo at GenomeArchitect.org
Apollo Genome Annotation Editor: Latest Updates, Including New Galaxy Integra...
Monica Munoz-Torres
Manual curation is crucial to improving the quality of the annotations for a genome sequencing project. During this portion of the genome sequencing workflow, curators use a variety of experimental evidence to improve on automated predictions to more accurately represent the underlying biology.
Apollo is a web-based genome annotation editor that allows curators to manually revise and edit genomic elements. It provides a reporting structure for annotated genomic elements and an ‘Annotator Panel’ that allows users to quickly browse the genome and all available annotations. Users can manually edit the structure of a genomic element as well as add metadata, including references to other databases, adding functional assignments to genes and gene products with specific lookup support for Gene Ontology (GO) terms, as well as including references to published literature in support of these annotations.
Apollo is currently used in more than one hundred genome annotation projects around the world, ranging from the annotation of a single species to lineage-specific efforts supporting annotation for dozens of organisms at a time. Apollo enables collaborative, real-time curation (akin to Google Docs); researchers may restrict access to certain annotations depending on the role of users and groups within the community, as well as share tracks of evidence data with the public. Users are able to export their manual annotations via FASTA and GFF3 files, the Chado database schema, and web services. The news hot off the presses is that Apollo is now available for integration with Galaxy via Docker! This allows users to run analyses on their genome of interest, including a step of manual curation, all from the comfort of their installation of the versatile Galaxy platform.
Scientific research is inherently a collaborative task; in our case it is a dialog among different researchers to reach a shared understanding of the underlying biology. To facilitate this dialog we have developed two web-based annotation tools: Apollo (http://genomearchitect.org/), a genomic feature editor, designed to support structural annotation of gene models, and Noctua (http://noctua.berkeleybop.org/), a biological-process model builder designed for describing the functional roles of gene products. Here we wish to outline an inventory of essential requirements that, in our experience, enable an annotation tool to meet the needs of both professional biocurators as well as other members of the research community. Here are the general requirements, beyond specific functional requirements, that any annotation tool must satisfy.
Comparative genome analysis requires high quality annotations of all genomic elements. Today’s sequencing projects face numerous challenges including lower coverage, more frequent assembly errors, and the lack of closely related species with well-annotated genomes. Precise elucidation of the many different biological features encoded in any genome requires careful examination and review. We need genome annotation editing tools to modify and refine the location and structure of the genome elements that predictive algorithms cannot yet resolve automatically. During the manual annotation process, curators identify elements that best represent the underlying biology and eliminate elements that reflect systemic errors of automated analyses.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, analogous to Google Docs, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Researchers from nearly one hundred institutions worldwide are currently using Apollo for distributed curation efforts in over sixty genome projects across the tree of life: from plants to arthropods, to fungi, to species of fish and other vertebrates including human, cattle (bovine), and dog.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
The i5K, an initiative to sequence the genomes of 5,000 insect and related arthropod species, is a broad and inclusive effort that seeks to involve scientists from around the world in their genome curation process, and Apollo is serving as the platform to empower this community.
This presentation is an introduction to Apollo for the members of the i5K Pilot Project on Eurytemora affinis
Introduction to Apollo: A webinar for the i5K Research Community
Monica Munoz-Torres
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
The i5K, an initiative to sequence the genomes of 5,000 insect and related arthropod species, is a broad and inclusive effort that seeks to involve scientists from around the world in their genome curation process, and Apollo is serving as the platform to empower this community.
This presentation is an introduction to Apollo for the members of the i5K Pilot Project Species.
This is an introduction to conducting manual annotation efforts using Apollo. This webinar was offered to members of the i5K Research community on 2015-10-07.
GENE ONTOLOGY CONSORTIUM: tools for functional annotation
Monica Munoz-Torres
This presentation contains information delivered during the Gene Ontology course at BIOS. Topics include a description of the structure of the ontology, how terms are constructed, and why it is necessary to use ontologies. We also discuss statistical analyses of term enrichment and representation. The exercises are part of the training offered by the GOA group at EMBL-EBI.
The presentation was given in Spanish.
Apollo is a web-based, collaborative genomic annotation editing platform. We need annotation editing tools to modify and refine precise location and structure of the genome elements that predictive algorithms cannot yet resolve automatically.
This presentation is an introduction to how the manual annotation process takes place using Apollo. It is addressed to the members of the American Chestnut & Chinese Chestnut Genomics research community.
Apollo: A workshop for the Manakin Research Coordination Network
Monica Munoz-Torres
Apollo is a web-based, collaborative genomic annotation editing platform. We need annotation editing tools to modify and refine precise location and structure of the genome elements that predictive algorithms cannot yet resolve automatically.
This presentation is an introduction to how the manual annotation process takes place using Apollo. It is addressed to the members of the Manakin Genomics research community.
Apollo - A webinar for the Phascolarctos cinereus research community
Monica Munoz-Torres
Web Apollo is a web-based, collaborative genomic annotation editing platform. We need annotation editing tools to modify and refine precise location and structure of the genome elements that predictive algorithms cannot yet resolve automatically.
This presentation is an introduction to how the manual annotation process takes place using Web Apollo. It is addressed to the members of the Phascolarctos cinereus research community.
Continuing with the theme of DNA repair via homologous recombination, I will discuss the following family during the PAINT call:
PTHR13451 CLASS II CROSSOVER JUNCTION ENDONUCLEASE MUS81
Talk at the 8th International Biocuration Conference. Beijing, China. April 23-26, 2015.
Obtaining meaningful results from genome analyses requires high quality annotations of all genomic elements. Today’s sequencing projects face challenges such as lower coverage, more frequent assembly errors, and the lack of closely related species with well-annotated genomes. Apollo is a web-based application that supports and enables collaborative genome curation in real time, analogous to Google Docs, allowing curators to improve on existing automated gene models through an intuitive interface. Apollo’s extensible architecture is built on top of JBrowse; its components are a web-based client, an annotation-editing engine, and a server-side data service. It allows users to visualize automated gene models, protein alignments, expression and variant data, and conduct structural and/or functional annotations.
Apollo is actively used within a variety of projects, including the initiative to sequence the genomes of 5,000 Arthropod species (i5K), and will become essential to the thousands of genomes now being sequenced and analyzed. Researchers from nearly 100 institutions worldwide are currently using Apollo on distributed curation efforts for over sixty genome projects across the tree of life; from plants to echinoderms, to fungi, to species of fish and other vertebrates including human, cattle (bovine), and dog. We are training the next generation of researchers by reaching out to educators to make these tools available as part of curricula, offering workshops and webinars to the scientific community, and through widely applied systems such as iPlant and DNA Subway. We are currently integrating Apollo into an annotation environment combining gene structural and functional annotation, transcriptomic, proteomic, and phenotypic annotation. In this presentation we will describe in detail its utility to users, introduce the architecture to developers interested in expanding on this open-source project, and offer details of our future plans.
Authors:
Monica Munoz-Torres(1), Nathan Dunn(1), Colin Diesh(2), Deepak Unni(2), Seth Carbon(1), Heiko Dietze(1), Christopher Mungall(1), Nicole Washington(1), Ian Holmes(3), Christine Elsik(2), and Suzanna E. Lewis(1)
(1) Lawrence Berkeley National Laboratory, Genomics Division, Berkeley, CA
(2) Divisions of Animal and Plant Sciences, University of Missouri, Columbia, MO
(3) University of California Berkeley, Bioengineering, Berkeley, CA
Data Visualization And Annotation Workshop at Biocuration 2015
Monica Munoz-Torres
8th International Biocuration Conference. Beijing, China. April 23-26, 2015.
Workshop 2: Data Visualization and Annotation.
Chairs: Rama Balakrishnan, Stanford University, USA and Monica Munoz-Torres, Lawrence Berkeley National Laboratory, USA
Explaining the most intricate biological processes often requires a degree of detail beyond the scope of equations and algorithms; in fact, most biological knowledge is represented visually as illustrations, graphs, and diagrams. Genomics data in particular require specialized forms of visualization to improve our understanding and increase our chances of extracting meaningful conclusions from our analyses. Furthermore, the heterogeneity and abundance of genomic data include widely varied sources, techniques for their obtention, and intrinsic experimental error. And even data obtained under similar conditions from two or more individuals are loaded with biological variation. So what is the best way to interpret the stories the data are telling us? Given the questions we wish to answer and the data we are generating, which tools would be most useful and effective? In this workshop we will explore the tools available for human interpretation of genomic data, specifically in the context of annotation.
Presentations and perspectives, panelists/presenters:
- Lorna Richardson, IGMM, University of Edinburgh, United Kingdom
- Justyna Szostak, PMI Research & Development, Switzerland
The workshop included a brief introduction to the landscape of available tools (as up to date as the constantly changing field allows), brief presentations chosen from abstract submissions and invited speakers, as well as ample discussion to capture the contributions and questions from attendees. In the end, we hope participants walked away with a toolset in hand that may benefit the progress of their own research.
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Sérgio Sacani
We characterize the earliest galaxy population in the JADES Origins Field (JOF), the deepest imaging field observed with JWST. We make use of the ancillary Hubble optical images (5 filters spanning 0.4−0.9 µm) and novel JWST images with 14 filters spanning 0.8−5 µm, including 7 medium-band filters, and reaching total exposure times of up to 46 hours per filter. We combine all our data at > 2.3 µm to construct an ultradeep image, reaching as deep as ≈ 31.4 AB mag in the stack and 30.3-31.0 AB mag (5σ, r = 0.1” circular aperture) in individual filters. We measure photometric redshifts and use robust selection criteria to identify a sample of eight galaxy candidates at redshifts z = 11.5 − 15. These objects show compact half-light radii of $R_{1/2} \sim 50-200$ pc, stellar masses of $M_\star \sim 10^{7}-10^{8}\,M_\odot$, and star-formation rates of ${\rm SFR} \sim 0.1-1\,M_\odot\,{\rm yr}^{-1}$. Our search finds no candidates at 15 < z < 20, placing upper limits at these redshifts. We develop a forward modeling approach to infer the properties of the evolving luminosity function without binning in redshift or luminosity that marginalizes over the photometric redshift uncertainty of our candidate galaxies and incorporates the impact of non-detections. We find a z = 12 luminosity function in good agreement with prior results, and that the luminosity function normalization and UV luminosity density decline by a factor of ∼ 2.5 from z = 12 to z = 14. We discuss the possible implications of our results in the context of theoretical models for evolution of the dark matter halo mass function.
What are greenhouse gases and how many gases are there that affect the Earth?
moosaasad1975
What greenhouse gases are, how they affect the Earth and its environment, what the future of the environment and the Earth may be, and how weather and climate are affected.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
Sérgio Sacani
The return of a sample of near-surface atmosphere from Mars would facilitate answers to several first-order science questions surrounding the formation and evolution of the planet. One of the important aspects of terrestrial planet formation in general is the role that primary atmospheres played in influencing the chemistry and structure of the planets and their antecedents. Studies of the martian atmosphere can be used to investigate the role of a primary atmosphere in its history. Atmosphere samples would also inform our understanding of the near-surface chemistry of the planet, and ultimately the prospects for life. High-precision isotopic analyses of constituent gases are needed to address these questions, requiring that the analyses are made on returned samples rather than in situ.
Richard's entangled adventures in wonderland
Richard Gill
Since the loophole-free Bell experiments of 2020 and the Nobel prizes in physics of 2022, critics of Bell's work have retreated to the fortress of super-determinism. Now, super-determinism is a derogatory word - it just means "determinism". Palmer, Hance and Hossenfelder argue that quantum mechanics and determinism are not incompatible, using a sophisticated mathematical construction based on a subtle thinning of allowed states and measurements in quantum mechanics, such that what is left appears to make Bell's argument fail, without altering the empirical predictions of quantum mechanics. I think however that it is a smoke screen, and the slogan "lost in math" comes to my mind. I will discuss some other recent disproofs of Bell's theorem using the language of causality based on causal graphs. Causal thinking is also central to law and justice. I will mention surprising connections to my work on serial killer nurse cases, in particular the Dutch case of Lucia de Berk and the current UK case of Lucy Letby.
Brief information about the SCOP protein database used in bioinformatics.
The Structural Classification of Proteins (SCOP) database is a comprehensive and authoritative resource for the structural and evolutionary relationships of proteins. It provides a detailed and curated classification of protein structures, grouping them into families, superfamilies, and folds based on their structural and sequence similarities.
Cancer cell metabolism: special Reference to Lactate Pathway
AADYARAJPANDEY1
Normal Cell Metabolism:
Cellular respiration describes the series of steps that cells use to break down sugar and other chemicals to get the energy we need to function.
Energy is stored in the bonds of glucose and when glucose is broken down, much of that energy is released.
Cells utilize energy in the form of ATP.
The first step of respiration is called glycolysis. In a series of steps, glycolysis breaks glucose into two smaller molecules - a chemical called pyruvate. A small amount of ATP is formed during this process.
Most healthy cells continue the breakdown in a second process, called the Krebs cycle. The Krebs cycle allows cells to “burn” the pyruvates made in glycolysis to get more ATP.
The last step in the breakdown of glucose is called oxidative phosphorylation (Ox-Phos).
It takes place in specialized cell structures called mitochondria. This process produces a large amount of ATP. Importantly, cells need oxygen to complete oxidative phosphorylation.
If a cell completes only glycolysis, only 2 molecules of ATP are made per glucose. However, if the cell completes the entire respiration process (glycolysis - Krebs cycle - oxidative phosphorylation), about 36 molecules of ATP are created, giving it much more energy to use.
IN CANCER CELL:
Unlike healthy cells that "burn" the entire molecule of sugar to capture a large amount of energy as ATP, cancer cells are wasteful.
Cancer cells only partially break down sugar molecules. They overuse the first step of respiration, glycolysis. They frequently do not complete the final step, oxidative phosphorylation.
This results in only 2 molecules of ATP per each glucose molecule instead of the 36 or so ATPs healthy cells gain. As a result, cancer cells need to use a lot more sugar molecules to get enough energy to survive.
Introduction to the Warburg phenomenon:
WARBURG EFFECT: Usually, cancer cells are highly glycolytic (glucose addiction) and take up more glucose from outside than normal cells do.
Otto Heinrich Warburg (8 October 1883 – 1 August 1970) was awarded the Nobel Prize in Physiology or Medicine in 1931 for his "discovery of the nature and mode of action of the respiratory enzyme."
WARBURG EFFECT: the tendency of cancer cells under aerobic (well-oxygenated) conditions to metabolize glucose to lactate (aerobic glycolysis) is known as the Warburg effect. Warburg made the observation that tumor slices consume glucose and secrete lactate at a higher rate than normal tissues.
Richard's adventures in two entangled wonderlands
Richard Gill
Since the loophole-free Bell experiments of 2020 and the Nobel prizes in physics of 2022, critics of Bell's work have retreated to the fortress of super-determinism. Now, super-determinism is a derogatory word - it just means "determinism". Palmer, Hance and Hossenfelder argue that quantum mechanics and determinism are not incompatible, using a sophisticated mathematical construction based on a subtle thinning of allowed states and measurements in quantum mechanics, such that what is left appears to make Bell's argument fail, without altering the empirical predictions of quantum mechanics. I think however that it is a smoke screen, and the slogan "lost in math" comes to my mind. I will discuss some other recent disproofs of Bell's theorem using the language of causality based on causal graphs. Causal thinking is also central to law and justice. I will mention surprising connections to my work on serial killer nurse cases, in particular the Dutch case of Lucia de Berk and the current UK case of Lucy Letby.
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
Scintica Instrumentation
Intravital microscopy (IVM) is a powerful tool utilized to study cellular behavior over time and space in vivo. Much of our understanding of cell biology has been accomplished using various in vitro and ex vivo methods; however, these studies do not necessarily reflect the natural dynamics of biological processes. Unlike traditional cell culture or fixed tissue imaging, IVM allows for the ultra-fast high-resolution imaging of cellular processes over time and space as they occur in their natural environment. Real-time visualization of biological processes in the context of an intact organism helps maintain physiological relevance and provides insights into the progression of disease, response to treatments or developmental processes.
In this webinar we give an overview of advanced applications of the IVM system in preclinical research. IVIM technology is a provider of all-in-one intravital microscopy systems and solutions optimized for in vivo imaging of live animal models at sub-micron resolution. The system’s unique features and user-friendly software enable researchers to probe fast dynamic biological processes such as immune cell tracking, cell-cell interaction as well as vascularization and tumor metastasis with exceptional detail. This webinar will also give an overview of IVM being utilized in drug development, offering a view into the intricate interaction between drugs/nanoparticles and tissues in vivo and allowing for the evaluation of therapeutic intervention in a variety of tissues and organs. This interdisciplinary collaboration continues to drive the advancements of novel therapeutic strategies.
An introduction to Web Apollo for i5K Pilot Species Projects - Hemiptera
1. An introduction to Web Apollo.
A webinar for the i5K Pilot Species Projects - Hemiptera
Monica Munoz-Torres, PhD
Biocurator & Bioinformatics Analyst | @monimunozto
Genomics Division, Lawrence Berkeley National Laboratory
12+1 May, 2014
UNIVERSITY OF
CALIFORNIA
2. Outline
1. What is Web Apollo?
• Definition & working concept.
2. Community-based curation from our experience. Lessons learned.
3. Manual annotation at i5K: how do we get there?
4. Becoming acquainted with Web Apollo.
3. What is Web Apollo?
• Web Apollo is a web-based, collaborative genomic annotation editing platform.
We need annotation editing tools to modify and refine the precise location and structure of the genome elements that predictive algorithms cannot yet resolve automatically.
Find more about Web Apollo at http://GenomeArchitect.org and Genome Biol 14:R93 (2013).
4. Brief history of Apollo*:
a. Desktop: one person at a time editing a specific region, annotations saved in local files; slowed down collaboration.
b. Java Web Start: users saved annotations directly to a centralized database; potential issues with stale annotation data remained.
* Biologists could finally visualize computational analyses and experimental evidence from genomic features and build manually-curated consensus gene structures. Apollo became a very popular, open source tool (insects, fish, mammals, birds, etc.).
5. Web Apollo
• Browser-based; plugin for JBrowse.
• Allows for intuitive annotation creation and editing, with gestures and pull-down menus to create transcripts, add/delete/resize exons, merge/split exons or transcripts, insert comments (CV, freeform text), etc.
• Customizable rules and appearance.
• Edits in one client are instantly pushed to all other clients: Collaborative!
6. Working Concept
In the context of gene manual annotation, curation tries to find the best examples and/or eliminate (most) errors.
To conduct manual annotation efforts:
Gather and evaluate all available evidence using quality-control metrics to corroborate or modify automated annotation predictions.
Perform sequence similarity searches (phylogenetic framework) and use literature and public databases to:
• Predict functional assignments from experimental data.
• Distinguish orthologs from paralogs, and classify gene membership in families and networks.
Automated gene models
Evidence: cDNAs, HMM domain searches, alignments with assemblies or genes from other species.
Manual annotation & curation
7. Dispersed, community-based gene manual annotation efforts.
Using Web Apollo, we* have trained geographically dispersed scientific communities to perform biologically supported manual annotations, and monitored their findings: ~80 institutions, 14 countries, hundreds of scientists, and gate keepers.
– Training workshops and geneborees.
– Tutorials with detailed instructions.
– Personalized user support.
*Collaboration with Elsik Lab, Hymenoptera Genome Database.
8. What have we learned?
Harvesting expertise from dispersed researchers who assigned functions to predicted and curated peptides, we have developed more interactive and responsive tools, as well as better visualization, editing, and analysis capabilities.
9. It is helpful to work together.
Scientific community efforts bring together domain-specific and natural history expertise that would have otherwise remained disconnected.
10. Improved Automated Annotations*
In many cases, automated annotations have been improved (e.g., Apis mellifera; Elsik et al. BMC Genomics 2014, 15:86).
We also learned of the challenges of newer sequencing technologies, e.g.:
– Frameshifts and indel errors
– Split genes across scaffolds
– Highly repetitive sequences
To face these challenges, we train annotators in recovering coding sequences in agreement with all available biological evidence.
11. Understanding the evolution of sociality.
Comparison of the genomes of 7 species of ants contributed to a better understanding of the evolution and organization of insect societies at the molecular level.
Insights drawn mainly from six core aspects of ant biology:
1. Alternative morphological castes
2. Division of labor
3. Chemical communication
4. Alternative social organization
5. Social immunity
6. Mutualism
… groups of communities have taught us a lot!
Libbrecht et al. Genome Biology 2013, 14:212
12. A little training goes a long way!
With the right tools, wet lab scientists make exceptional curators who can easily learn to maximize the generation of accurate, biologically supported gene models.
13. Manual annotation at i5K
How do we get there?
In a genome sequencing project…
Assembly
Automated Annotation
Manual annotation
Experimental validation
14. Gene Prediction
Gene Prediction: identification of protein-coding genes, tRNAs, rRNAs, regulatory motifs, repetitive elements (masked), etc.
Ab initio or homology-based. E.g., fgenesh, Augustus, geneid, SGP2.
Nucleic Acids Res. 2003, 31(13):3738-3741
15. Gene Annotation
Gene Annotation: integration of data from prediction tools to generate a consensus set of predictions (gene models).
• Models may be organized by:
- automatic integration of predicted sets; e.g., GLEAN
- packaging necessary tools into a pipeline; e.g., MAKER
• Transcriptomes are used to further inform the annotation process.
16. The Collaborative Curation Process at
i5K
1) A computationally predicted consensus gene set has
been generated using multiple lines of evidence; e.g.
CLEC_v0.5.3-Models.
2) i5K Projects will integrate consensus computational
predictions with manual annotations to produce an updated
Official Gene Set (OGS):
» If it’s not on either track, it won’t make the OGS!
» If it’s there and it shouldn’t be, it will still make the OGS!
17. Consensus set: reference and start point
• In some cases the algorithms and metrics used to generate
consensus sets may actually reduce the accuracy of the gene’s
representation. When that happens, start from a different model
(e.g. the Augustus model) to create a new annotation.
• Isoforms: drag both the original and the alternatively spliced
form to the ‘User-created Annotations’ area.
• If an annotation needs to be removed from the consensus set,
drag it to the ‘User-created Annotations’ area and label it as
‘Delete’ in the Information Editor.
• Overlapping interests? Collaborate to reach agreement.
• Follow guidelines for i5K Pilot Species Projects as shown at
http://goo.gl/LRu1VY
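The merge rules described on the last two slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the i5K pipeline: the `Model` record, its `label` field, the `build_ogs` function, and the rule that a manual annotation replaces any consensus model it overlaps on the same scaffold and strand are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    """A gene model reduced to its outer coordinates (hypothetical record)."""
    name: str
    scaffold: str
    strand: str
    start: int       # 1-based, inclusive
    end: int         # 1-based, inclusive
    label: str = ""  # "Delete" marks a model that must not reach the OGS


def overlaps(a, b):
    """True if two models share a scaffold, a strand, and at least one base."""
    return (a.scaffold == b.scaffold and a.strand == b.strand
            and a.start <= b.end and b.start <= a.end)


def build_ogs(consensus, manual):
    """Combine consensus predictions with manual annotations.

    Assumed rules: manual annotations are kept unless labeled 'Delete';
    a consensus model is kept only if no manual annotation overlaps it.
    Anything left untouched on either track goes through unchanged.
    """
    kept_manual = [m for m in manual if m.label != "Delete"]
    kept_consensus = [c for c in consensus
                      if not any(overlaps(c, m) for m in manual)]
    return sorted(kept_manual + kept_consensus,
                  key=lambda m: (m.scaffold, m.start, m.end))


# Made-up models for illustration only.
consensus = [
    Model("CONS_1", "Scaffold1", "+", 100, 900),
    Model("CONS_2", "Scaffold1", "+", 2000, 3000),
    Model("CONS_3", "Scaffold1", "-", 5000, 6000),
]
manual = [
    Model("curated_1", "Scaffold1", "+", 150, 950),                  # replaces CONS_1
    Model("CONS_3", "Scaffold1", "-", 5000, 6000, label="Delete"),   # removes CONS_3
]

ogs = build_ogs(consensus, manual)
print([m.name for m in ogs])
```

Note that CONS_2 reaches the result without anyone having reviewed it, which is the point of the second rule on the slide: an incorrect model that nobody corrects or labels as ‘Delete’ still makes the OGS.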
18. Web Apollo: Graphical User Interface (GUI) for editing annotations
• Navigation tools: pan and zoom.
• Search box: go to a scaffold or a gene model.
• Grey bar of coordinates: indicates location. You can also select here in order to zoom to a sub-region.
• ‘View’: change color by CDS, toggle strands, set highlight.
• ‘File’: upload your own evidence (GFF3, BAM, BigWig, VCF*); add combination and sequence search tracks.
• ‘Tools’: use BLAT to query the genome with a protein or DNA sequence.
• Login
• Available Tracks
• Evidence Tracks Area
• ‘User-created Annotations’ Track
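For annotators who want to upload their own evidence through the ‘File’ menu, it may help to see what a small GFF3 file looks like. The sketch below builds one by hand in Python. It relies only on the general GFF3 layout (a `##gff-version 3` header followed by nine tab-separated columns); the scaffold name, feature IDs, coordinates, the `my_evidence` source label and both function names are made up for illustration, and the file has not been tested against any particular Web Apollo instance.

```python
def gff3_line(seqid, source, ftype, start, end, strand, attributes,
              score=".", phase="."):
    """Format one GFF3 feature line (nine tab-separated columns)."""
    attr = ";".join(f"{key}={value}" for key, value in attributes.items())
    return "\t".join([seqid, source, ftype, str(start), str(end),
                      score, strand, phase, attr])


def gene_to_gff3(seqid, gene_id, strand, exons, source="my_evidence"):
    """Write a gene, one mRNA, and its exons as GFF3 text.

    `exons` is a list of (start, end) pairs, 1-based and inclusive.
    """
    start = min(s for s, _ in exons)
    end = max(e for _, e in exons)
    mrna_id = f"{gene_id}.1"
    lines = [
        "##gff-version 3",
        gff3_line(seqid, source, "gene", start, end, strand,
                  {"ID": gene_id}),
        gff3_line(seqid, source, "mRNA", start, end, strand,
                  {"ID": mrna_id, "Parent": gene_id}),
    ]
    for number, (s, e) in enumerate(sorted(exons), start=1):
        lines.append(gff3_line(seqid, source, "exon", s, e, strand,
                               {"ID": f"{mrna_id}.exon{number}",
                                "Parent": mrna_id}))
    return "\n".join(lines) + "\n"


# Hypothetical two-exon gene on a hypothetical scaffold.
example = gene_to_gff3("Scaffold1", "gene0001", "+",
                       [(1000, 1200), (1500, 1800)])
print(example, end="")
```

Saving the printed text to a file with a `.gff3` extension gives a file in the general shape the upload dialog expects; real evidence files usually also carry CDS features with a phase column.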
4. Becoming Acquainted with Web Apollo.
19. Web Apollo
• Selection of features and sub-features
• Edge-matching
• Evidence Tracks Area
• ‘User-created Annotations’ Track
The editing logic (server):
- selects longest ORF as CDS
- flags non-canonical splice sites
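The two checks performed by the editing logic can be illustrated with a toy Python example. This is a sketch, not Web Apollo's server code: it assumes a forward-strand transcript, an ATG start codon, the three standard stop codons, and treats only GT…AG introns as canonical. The function names and the sequence are invented for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(mrna):
    """Return (start, end) of the longest ATG..stop ORF, or None.

    Coordinates are 0-based, half-open, on the spliced transcript.
    """
    mrna = mrna.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(mrna) - 2, 3):
            codon = mrna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None
    return best


def non_canonical_introns(genome, exons):
    """List introns whose boundaries are not GT (donor) and AG (acceptor).

    `exons` holds (start, end) pairs, 0-based and half-open, forward strand.
    """
    genome = genome.upper()
    exons = sorted(exons)
    flagged = []
    for (_, left_end), (right_start, _) in zip(exons, exons[1:]):
        donor = genome[left_end:left_end + 2]
        acceptor = genome[right_start - 2:right_start]
        if (donor, acceptor) != ("GT", "AG"):
            flagged.append((left_end, right_start, donor, acceptor))
    return flagged


# Toy locus: exon, GT..AG intron, exon, GC..AG intron, exon.
genome = "ATGAAA" + "GTCCCCAG" + "CCCTAG" + "GCTTTTAG" + "TTT"
exons = [(0, 6), (14, 20), (28, 31)]
mrna = "".join(genome[s:e] for s, e in exons)

cds = longest_orf(mrna)
flags = non_canonical_introns(genome, exons)
print(cds, flags)
```

In the toy locus the first intron passes the check and the second (GC…AG) is reported, which mirrors what the flag on a splice site in the ‘User-created Annotations’ track tells the curator: look at this boundary before accepting the model.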
20. DNA Track
‘User-created Annotations’ Track
Two new kinds of tracks:
annotation editing
sequence alteration editing
Web Apollo
21. Web Apollo
Annotations, annotation edits, and History: stored in a centralized database.
24. [Some of the] Functionality:
Protein-coding gene annotation (that you know and love)
Sequence alterations (less coverage = more fragmentation)
Visualization of stage and cell-type specific transcription data as
coverage plots, heat maps, and alignments
27. Thanks!
• Berkeley Bioinformatics Open-source Projects
(BBOP), Berkeley Lab: Web Apollo and Gene
Ontology teams. Suzanna E. Lewis (PI).
• Elsik Lab (§), University of Missouri. Christine G. Elsik (PI).
• Ian Holmes (PI) (*), University of California, Berkeley.
• Arthropod genomics community: i5K Steering Committee
(http://www.arthropodgenomes.org/wiki/i5K), USDA/NAL,
HGSC-BCM, BGI, and 1KITE (http://www.1kite.org/).
• Web Apollo is supported by NIH grants 5R01GM080203
from NIGMS, and 5R01HG004483 from NHGRI, and by the
Director, Office of Science, Office of Basic Energy
Sciences, of the U.S. Department of Energy under Contract
No. DE-AC02-05CH11231.
• Insect images used with permission:
http://AlexanderWild.com
• For your attention, thank you!
BBOP
Web Apollo: Ed Lee, Gregg Helt, Colin Diesh §, Deepak Unni §, Rob Buels *
Gene Ontology: Chris Mungall, Seth Carbon, Heiko Dietze
Web Apollo: http://GenomeArchitect.org
GO: http://GeneOntology.org
i5K: http://arthropodgenomes.org/wiki/i5K