Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
An introduction to use and functionality for the IAGC and BIPAA research communities.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
A Workshop at the Stowers Institute for Medical Research.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
An introduction on gene annotation & curation for the IAGC and BIPAA research communities.
This presentation contains details about the Apollo genome annotation editor functionality. It also includes a step-by-step example about curating a gene of interest.
This presentation explains the meaning of curation and includes an introduction to the Apollo genome annotation editing tool and its curation environment.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
Course: Bioinformatics for Biomedical Research (2014).
Session: 1.3- Genome Browsing, Genomic Data Mining and Genome Data Visualization with Ensembl, Biomart and IGV.
Statistics and Bioinformatisc Unit (UEB) & High Technology Unit (UAT) from Vall d'Hebron Research Institute (www.vhir.org), Barcelona.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
A Workshop at the Stowers Institute for Medical Research.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
An introduction on gene annotation & curation for the IAGC and BIPAA research communities.
This presentation contains details about the Apollo genome annotation editor functionality. It also includes a step-by-step example about curating a gene of interest.
This presentation explains the meaning of curation and includes an introduction to the Apollo genome annotation editing tool and its curation environment.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
Course: Bioinformatics for Biomedical Research (2014).
Session: 1.3- Genome Browsing, Genomic Data Mining and Genome Data Visualization with Ensembl, Biomart and IGV.
Statistics and Bioinformatisc Unit (UEB) & High Technology Unit (UAT) from Vall d'Hebron Research Institute (www.vhir.org), Barcelona.
Gene Expression - Microarrays discusses analyzing gene expression data from microarray experiments. It describes the basic workflow including experimental design, sample preparation, hybridization, image analysis, preprocessing, normalization, and statistical analysis. Key points are that microarrays allow measuring expression of thousands of genes simultaneously, and proper experimental design and data analysis are important to draw meaningful biological conclusions from microarray data.
Comparative genome analysis requires high quality annotations of all genomic elements. Today’s sequencing projects face numerous challenges including lower coverage, more frequent assembly errors, and the lack of closely related species with well-annotated genomes. Precise elucidation of the many different biological features encoded in any genome requires careful examination and review. We need genome annotation editing tools to modify and refine the location and structure of the genome elements that predictive algorithms cannot yet resolve automatically. During the manual annotation process, curators identify elements that best represent the underlying biology and eliminate elements that reflect systemic errors of automated analyses.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, analogous to Google Docs, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Researchers from nearly one hundred institutions worldwide are currently using Apollo for distributed curation efforts in over sixty genome projects across the tree of life: from plants to arthropods, to fungi, to species of fish and other vertebrates including human, cattle (bovine), and dog.
Genome resources at EMBL-EBI: Ensembl and Ensembl GenomesEBI
Event: Plant and Animal Genomes Conference
Speaker: Bert Overduin
The Ensembl project (http://www.ensembl.org) seeks to enable genomic science by providing high quality, integrated annotation on chordate and selected eukaryotic genomes. All supported species include comprehensive, evidence-based gene annotations and a selected set of genomes includes additional data focused on variation, comparative, evolutionary, functional and regulatory annotation. As of Ensembl release 65 (December 2011), 56 species are fully supported. Ensembl data are accessible through an interactive web site, flat files, the data mining tool BioMart, direct database querying and a set of Perl APIs. Moreover, Ensembl is not just a data visualisation tool, but a suite of programs for data production (e.g. gene calling and comparative genomics) that can be deployed individually according to the needs of an individual community. Ensembl Genomes (http://www.ensemblgenomes.org) consists of five sub-portals (for bacteria, protists, fungi, plants and invertebrate metazoa) designed to complement the genomes available in Ensembl. It currently contains data for over 300 species. Many of the databases that support Ensembl Genomes have been built by, or in close collaboration with, groups that maintain specialist data resources for individual species, and we are actively seeking to extend the range of these collaborations. Together Ensembl and Ensembl Genomes offer a single unified interface across the taxonomic space. This presentation will consist of a short introduction to Ensembl and Ensembl Genomes followed by a demonstration of the respective websites and the BioMart data retrieval tool. Special attention will be given to recently developed functionality like the Variant Effect Predictor, which predicts the consequences of substitutions, insertions and deletions on transcripts and protein sequences, and the possibility to visualize your own data by attaching BAM and VCF files (for example).
Introduction to Apollo: A webinar for the i5K Research CommunityMonica Munoz-Torres
This document provides an introduction and outline for a webinar on using the Apollo genome annotation editing tool. It was presented by Monica Munoz-Torres of BBOP to the i5K Research Community. The webinar aimed to help participants better understand genome curation in the context of automated and manual annotation. It also aimed to familiarize participants with Apollo's functionality and how to identify homologs of known genes, corroborate gene models using evidence, and modify automated annotations in Apollo. The document includes sections on genome sequencing projects, the objectives and uses of genome annotation, and a biological refresher on concepts relevant to manual annotation like genes, transcription, translation, and genome curation steps.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
The i5K, an initiative to sequence the genomes of 5,000 insect and related arthropod species, is a broad and inclusive effort that seeks to involve scientists from around the world in their genome curation process, and Apollo is serving as the platform to empower this community.
This presentation is an introduction to Apollo for the members of the i5K Pilot Project working on species of the order Hemiptera.
This document provides information about a training workshop on using Ensembl. It includes an agenda for the day-long workshop covering topics like introduction to Ensembl, browsing genes and data, using BioMart, and exploring genetic variation data. The workshop materials are available under a CC BY license and the document encourages attendees to cite any Ensembl papers if using the resource for their own work. Break times and locations are also listed on the agenda.
Ontologies for life sciences: examples from the gene ontologyMelanie Courtot
Ontologies for life sciences: examples from the Gene Ontology
The document discusses ontologies for life sciences, using the Gene Ontology (GO) as an example. It provides an overview of GO, describing it as a way to capture biological knowledge for gene products in a written and computable form using a set of concepts and relationships arranged hierarchically. GO allows consistent descriptions of genes/gene products across databases. Model organism databases provide annotations connecting genes to GO terms. The GO is a collaborative effort to address the need for consistent descriptions of genes.
GRC Workshop held at Churchill College on Sep 21, 2014. Talk by Bronwen Aken discussing the Ensembl approach to annotating the complete human reference assembly.
Ensembl is a genome browser that annotates genes and predicts regulatory functions for vertebrate genomes. It uses raw DNA sequence data to create a tracking database and automatically finds genes and other features. Ensembl incorporates data from other sources and provides web-based access to genomic information through views of genes, transcripts, proteins, DNA homology, and more. It aims to make genome annotation freely accessible to support research.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
The i5K, an initiative to sequence the genomes of 5,000 insect and related arthropod species, is a broad and inclusive effort that seeks to involve scientists from around the world in their genome curation process, and Apollo is serving as the platform to empower this community.
This presentation is an introduction to Apollo for the members of the i5K Pilot Project on Eurytemora affinis
Comparative genome analysis requires high quality annotations of all genomic elements. Today’s sequencing projects face numerous challenges including lower coverage, more frequent assembly errors, and the lack of closely related species with well-annotated genomes. Precise elucidation of the many different biological features encoded in any genome requires careful examination and review. We need genome annotation editing tools to modify and refine the location and structure of the genome elements that predictive algorithms cannot yet resolve automatically. During the manual annotation process, curators identify elements that best represent the underlying biology and eliminate elements that reflect systemic errors of automated analyses.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, analogous to Google Docs, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Researchers from nearly one hundred institutions worldwide are currently using Apollo for distributed curation efforts in over sixty genome projects across the tree of life: from plants to arthropods, to fungi, to species of fish and other vertebrates including human, cattle (bovine), and dog.
The document outlines topics that will be covered in a bioinformatics course, including biological databases, sequence alignments, database searching, phylogenetics, protein structure, gene prediction, gene ontologies, hidden Markov models, and non-coding RNA. It then provides more details on topics like using sequence similarity to gather information from unknown protein sequences, finding conserved patterns in alignments using profiles and hidden Markov models, and challenges around gene prediction from raw genomic sequences.
An introduction to Web Apollo for the Biomphalaria glabatra research community.Monica Munoz-Torres
Web Apollo is a web-based, collaborative genomic annotation editing platform. We need annotation editing tools to modify and refine precise location and structure of the genome elements that predictive algorithms cannot yet resolve automatically.
This presentation is an introduction to how the manual annotation process takes place using Web Apollo. It is addressed to the members of the Biomphalaria glabatra research community.
Genomics is the study of genomes through mapping, sequencing, and analysis. It involves sequencing entire genomes and analyzing gene function on a genome-wide scale. There are three main areas of genomics: structural genomics focuses on sequencing and mapping genomes; functional genomics examines gene expression and protein interactions; and comparative genomics compares genomic features between organisms to study evolution and biology. Techniques like DNA sequencing, microarrays, and bioinformatics are used to efficiently analyze entire genomes and understand how the structure and function of genomes relate to biological processes.
BITs: Genome browsers and interpretation of gene lists.BITS
Module 5 Genome browsers and interpreting gene lists.
Part of training session "Basic Bioinformatics concepts, databases and tools" - http://www.bits.vib.be/training
This document discusses NIST's work in developing genomic reference materials and methods to evaluate microbial genomics measurements. It describes three projects: 1) assessing genomic purity by detecting low levels of contaminants using sequencing and classification, 2) evaluating SNP calling methods using reference materials and replicates to establish confidence, and 3) developing characterized genomic reference materials for public health pathogens. The overall aim is to build an infrastructure to support genome-based characterization of microbial samples.
Structural genomics aims to sequence and map genomes. Genetic maps show relative gene locations based on recombination rates, while physical maps show direct DNA distances. Genetic maps have low resolution and may differ from physical distances. Physical maps use techniques like restriction mapping and sequencing to order DNA fragments. Sequencing entire genomes requires breaking DNA into small fragments that are then reassembled using overlap or genetic/physical maps. The human genome was first sequenced in 2000 using both map-based and whole genome shotgun approaches. Single nucleotide polymorphisms are also studied to compare individuals.
Apollo annotation guidelines for i5k projects Diaphorina citriMonica Munoz-Torres
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
This document provides an introduction and overview of manual genome annotation using the Apollo genome annotation tool. It begins with an outline of the webinar topics, which include an introduction to manual annotation and its necessity, an overview of the Apollo tool and its functionality for collaborative curation, and examples and demonstrations. The document then covers key concepts for manual annotation such as the definition of a gene, genome curation steps, transcription and translation including reading frames, splice sites, and phase. The goal of the webinar is to help participants better understand genome curation and manual annotation using Apollo to identify and modify gene models.
Gene Expression - Microarrays discusses analyzing gene expression data from microarray experiments. It describes the basic workflow including experimental design, sample preparation, hybridization, image analysis, preprocessing, normalization, and statistical analysis. Key points are that microarrays allow measuring expression of thousands of genes simultaneously, and proper experimental design and data analysis are important to draw meaningful biological conclusions from microarray data.
Comparative genome analysis requires high quality annotations of all genomic elements. Today’s sequencing projects face numerous challenges including lower coverage, more frequent assembly errors, and the lack of closely related species with well-annotated genomes. Precise elucidation of the many different biological features encoded in any genome requires careful examination and review. We need genome annotation editing tools to modify and refine the location and structure of the genome elements that predictive algorithms cannot yet resolve automatically. During the manual annotation process, curators identify elements that best represent the underlying biology and eliminate elements that reflect systemic errors of automated analyses.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, analogous to Google Docs, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Researchers from nearly one hundred institutions worldwide are currently using Apollo for distributed curation efforts in over sixty genome projects across the tree of life: from plants to arthropods, to fungi, to species of fish and other vertebrates including human, cattle (bovine), and dog.
Genome resources at EMBL-EBI: Ensembl and Ensembl GenomesEBI
Event: Plant and Animal Genomes Conference
Speaker: Bert Overduin
The Ensembl project (http://www.ensembl.org) seeks to enable genomic science by providing high quality, integrated annotation on chordate and selected eukaryotic genomes. All supported species include comprehensive, evidence-based gene annotations and a selected set of genomes includes additional data focused on variation, comparative, evolutionary, functional and regulatory annotation. As of Ensembl release 65 (December 2011), 56 species are fully supported. Ensembl data are accessible through an interactive web site, flat files, the data mining tool BioMart, direct database querying and a set of Perl APIs. Moreover, Ensembl is not just a data visualisation tool, but a suite of programs for data production (e.g. gene calling and comparative genomics) that can be deployed individually according to the needs of an individual community. Ensembl Genomes (http://www.ensemblgenomes.org) consists of five sub-portals (for bacteria, protists, fungi, plants and invertebrate metazoa) designed to complement the genomes available in Ensembl. It currently contains data for over 300 species. Many of the databases that support Ensembl Genomes have been built by, or in close collaboration with, groups that maintain specialist data resources for individual species, and we are actively seeking to extend the range of these collaborations. Together Ensembl and Ensembl Genomes offer a single unified interface across the taxonomic space. This presentation will consist of a short introduction to Ensembl and Ensembl Genomes followed by a demonstration of the respective websites and the BioMart data retrieval tool. Special attention will be given to recently developed functionality like the Variant Effect Predictor, which predicts the consequences of substitutions, insertions and deletions on transcripts and protein sequences, and the possibility to visualize your own data by attaching BAM and VCF files (for example).
Introduction to Apollo: A webinar for the i5K Research CommunityMonica Munoz-Torres
This document provides an introduction and outline for a webinar on using the Apollo genome annotation editing tool. It was presented by Monica Munoz-Torres of BBOP to the i5K Research Community. The webinar aimed to help participants better understand genome curation in the context of automated and manual annotation. It also aimed to familiarize participants with Apollo's functionality and how to identify homologs of known genes, corroborate gene models using evidence, and modify automated annotations in Apollo. The document includes sections on genome sequencing projects, the objectives and uses of genome annotation, and a biological refresher on concepts relevant to manual annotation like genes, transcription, translation, and genome curation steps.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
The i5K, an initiative to sequence the genomes of 5,000 insect and related arthropod species, is a broad and inclusive effort that seeks to involve scientists from around the world in their genome curation process, and Apollo is serving as the platform to empower this community.
This presentation is an introduction to Apollo for the members of the i5K Pilot Project working on species of the order Hemiptera.
This document provides information about a training workshop on using Ensembl. It includes an agenda for the day-long workshop covering topics like introduction to Ensembl, browsing genes and data, using BioMart, and exploring genetic variation data. The workshop materials are available under a CC BY license and the document encourages attendees to cite any Ensembl papers if using the resource for their own work. Break times and locations are also listed on the agenda.
Ontologies for life sciences: examples from the gene ontologyMelanie Courtot
Ontologies for life sciences: examples from the Gene Ontology
The document discusses ontologies for life sciences, using the Gene Ontology (GO) as an example. It provides an overview of GO, describing it as a way to capture biological knowledge for gene products in a written and computable form using a set of concepts and relationships arranged hierarchically. GO allows consistent descriptions of genes/gene products across databases. Model organism databases provide annotations connecting genes to GO terms. The GO is a collaborative effort to address the need for consistent descriptions of genes.
GRC Workshop held at Churchill College on Sep 21, 2014. Talk by Bronwen Aken discussing the Ensembl approach to annotating the complete human reference assembly.
Ensembl is a genome browser that annotates genes and predicts regulatory functions for vertebrate genomes. It uses raw DNA sequence data to create a tracking database and automatically finds genes and other features. Ensembl incorporates data from other sources and provides web-based access to genomic information through views of genes, transcripts, proteins, DNA homology, and more. It aims to make genome annotation freely accessible to support research.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
The i5K, an initiative to sequence the genomes of 5,000 insect and related arthropod species, is a broad and inclusive effort that seeks to involve scientists from around the world in their genome curation process, and Apollo is serving as the platform to empower this community.
This presentation is an introduction to Apollo for the members of the i5K Pilot Project on Eurytemora affinis
Comparative genome analysis requires high quality annotations of all genomic elements. Today’s sequencing projects face numerous challenges including lower coverage, more frequent assembly errors, and the lack of closely related species with well-annotated genomes. Precise elucidation of the many different biological features encoded in any genome requires careful examination and review. We need genome annotation editing tools to modify and refine the location and structure of the genome elements that predictive algorithms cannot yet resolve automatically. During the manual annotation process, curators identify elements that best represent the underlying biology and eliminate elements that reflect systemic errors of automated analyses.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, analogous to Google Docs, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Researchers from nearly one hundred institutions worldwide are currently using Apollo for distributed curation efforts in over sixty genome projects across the tree of life: from plants to arthropods, to fungi, to species of fish and other vertebrates including human, cattle (bovine), and dog.
The document outlines topics that will be covered in a bioinformatics course, including biological databases, sequence alignments, database searching, phylogenetics, protein structure, gene prediction, gene ontologies, hidden Markov models, and non-coding RNA. It then provides more details on topics like using sequence similarity to gather information from unknown protein sequences, finding conserved patterns in alignments using profiles and hidden Markov models, and challenges around gene prediction from raw genomic sequences.
An introduction to Web Apollo for the Biomphalaria glabatra research community.Monica Munoz-Torres
Web Apollo is a web-based, collaborative genomic annotation editing platform. We need annotation editing tools to modify and refine precise location and structure of the genome elements that predictive algorithms cannot yet resolve automatically.
This presentation is an introduction to how the manual annotation process takes place using Web Apollo. It is addressed to the members of the Biomphalaria glabatra research community.
Genomics is the study of genomes through mapping, sequencing, and analysis. It involves sequencing entire genomes and analyzing gene function on a genome-wide scale. There are three main areas of genomics: structural genomics focuses on sequencing and mapping genomes; functional genomics examines gene expression and protein interactions; and comparative genomics compares genomic features between organisms to study evolution and biology. Techniques like DNA sequencing, microarrays, and bioinformatics are used to efficiently analyze entire genomes and understand how the structure and function of genomes relate to biological processes.
BITs: Genome browsers and interpretation of gene lists.BITS
Module 5 Genome browsers and interpreting gene lists.
Part of training session "Basic Bioinformatics concepts, databases and tools" - http://www.bits.vib.be/training
This document discusses NIST's work in developing genomic reference materials and methods to evaluate microbial genomics measurements. It describes three projects: 1) assessing genomic purity by detecting low levels of contaminants using sequencing and classification, 2) evaluating SNP calling methods using reference materials and replicates to establish confidence, and 3) developing characterized genomic reference materials for public health pathogens. The overall aim is to build an infrastructure to support genome-based characterization of microbial samples.
Structural genomics aims to sequence and map genomes. Genetic maps show relative gene locations based on recombination rates, while physical maps show direct DNA distances. Genetic maps have low resolution and may differ from physical distances. Physical maps use techniques like restriction mapping and sequencing to order DNA fragments. Sequencing entire genomes requires breaking DNA into small fragments that are then reassembled using overlap or genetic/physical maps. The human genome was first sequenced in 2000 using both map-based and whole genome shotgun approaches. Single nucleotide polymorphisms are also studied to compare individuals.
Apollo annotation guidelines for i5k projects Diaphorina citriMonica Munoz-Torres
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
This document provides an introduction and overview of manual genome annotation using the Apollo genome annotation tool. It begins with an outline of the webinar topics, which include an introduction to manual annotation and its necessity, an overview of the Apollo tool and its functionality for collaborative curation, and examples and demonstrations. The document then covers key concepts for manual annotation such as the definition of a gene, genome curation steps, transcription and translation including reading frames, splice sites, and phase. The goal of the webinar is to help participants better understand genome curation and manual annotation using Apollo to identify and modify gene models.
Drug TanzeumDiseasesDiabetesGene and Gene Productglucago.docxjacksnathalie
Drug Tanzeum
Diseases Diabetes
Gene and Gene Product glucagon-like peptide (GLP)-1 fusion protein
Gene ID GCG
Structure 3IOL
1. Report on Gene and Related Protein (3-4 pages) (30 pts) (details on navigation of PubMed site in separate Power Point)
a. The name and gene code for your gene product/protein.
b. Determine chromosome location of your gene. Paste a picture of the chromosome from MAP VIEWER into your report and describe.
c. Identify the size of the coding region of your gene, include a link to the GENE page, and explain what the coding region is. Note: your gene sequence may be very long, do not cut and paste the sequence into your report.
d. Determine the intron /exon structure of your gene including the number of introns and exons. Cut and paste a schematic diagram of your gene into the report, indicate position of introns, and exons, and provide a caption for the figure that describes how the sequence of the primary mRNA transcript is related to the gene (DNA) sequence.
e. Identify the sequence of the mature mRNA and copy this sequence into your report with the appropriate link to this sequence. (Note: the mRNA sequence will actually be shown as a “complementary DNA” or “cDNA.” The sequence given in the database is an exact copy of the mRNA sequence, except that it has T instead of U. ) Explain how this sequence is derived from the sequence of the coding region identified above.
f. Identify the Open reading frame in the mature mRNA (Coding sequence, CDS) Find the START codon, STOP codon, and highlight the open reading frame. Explain where translation begins and ends.
g. Translate the first ten amino acids of the Open Reading Frame.
h. Identify the protein sequence, and copy this sequence into your report with a link to the appropriate page. Check the first ten amino acids of the database protein sequence against the ten amino acids you translated above.
i. Compare your (human) gene sequence with the same gene sequence from other species. Copy a representative portion of the sequence alignment into your report that shows the similarities/differences between the human protein sequence and the homologous protein sequence from at least one other species. (Note: sequences will align if you use the Courier font). Explain how the similarity/difference in sequences explains the similarities and differences between species.
j. Paste a picture of the secondary (worm view) and tertiary (space filling) structures of your protein into your report. Note: structures have not been determined for all proteins, so you may have been given an identifier for a similar protein. Describe things about the structure, for example features of the shape, charge, or hydrophobicity.
Genome database assignment and stepwise instructions
Goal: improve student understanding of relationship between genes, protein(gene products), and biological functions.
The exercise:
Each student will be assigned a gene product/protein that will be di ...
This document provides an overview of RNA-Seq analysis. It begins with considerations for RNA-Seq experiments such as computational requirements. It then describes the general RNA-Seq analysis workflow including short-read alignment, transcript reconstruction, abundance estimation, visualization, and statistics. The document focuses on explaining the "Tuxedo" analysis pipeline which includes Bowtie, Tophat, Cufflinks, Cuffmerge, Cuffdiff and CummeRbund. It provides examples of commands for each step and discusses alternative tools. The document concludes with resources for further information on RNA-Seq analysis.
Apollo: A workshop for the Manakin Research Coordination NetworkMonica Munoz-Torres
Apollo is a web-based, collaborative genomic annotation editing platform. We need annotation editing tools to modify and refine precise location and structure of the genome elements that predictive algorithms cannot yet resolve automatically.
This presentation is an introduction to how the manual annotation process takes place using Apollo. It is addressed to the members of the Manakin Genomics research community.
This document outlines the TUXEDO protocol for analyzing RNA-Seq data. It describes the basic steps as: 1) Alignment of RNA-Seq reads to a reference genome or transcriptome using splice-aware aligners like TopHat. 2) Quantification of gene and transcript expression levels using Cufflinks. 3) Quality control checks and 4) Detection of differential expression between conditions using Cuffdiff. Key points covered include RNA-Seq methodology, gene expression, alignment formats, FPKM normalization, and quality control metrics.
This presentation is a thorough guide to the use of Web Apollo, with details on User Navigation, Functionality, and the thought process behind manual annotation.
During this workshop, participants:
- Learn to identify homologs of known genes of interest in a newly sequenced genome.
- Become familiar with the environment and functionality of the Web Apollo genome annotation editing tool.
- Learn how to corroborate or modify automatically annotated gene models using available evidence in Web Apollo.
- Understand the process of curation in the context of genome annotation.
The document describes a lab experiment analyzing gene expression data from human fibroblasts in response to serum using microarray analysis. The aims are to analyze the gene expression data using Excel and the ArrayTrack workbench. Key steps include importing microarray data into Excel and pre-treating the data by centering and scaling. ArrayTrack is then used to analyze the data through descriptive statistics, exploring gene expression profiles of gene lists, and using the significance analysis of microarrays (SAM) tool. Additional online databases like Gene Atlas and ArrayExpress are queried to find expression profiles and experimental data for a specific gene, APT13A2, under different conditions.
one complete report from all the 4 labs.pdfstudy help
The document provides instructions for compiling a complete lab report from four biology labs on genomic databases, primer design, PCR, and molecular cloning. It outlines the necessary sections for the report, including an introduction describing the overall question and background, materials and methods, results with data and figures, and a discussion/conclusion section. It also provides additional details on designing a transgene reporter gene based on knowledge gained from the lab exercises, including defining a transgene, necessary gene elements, and ideas for using the transgene.
one complete report from all the 4 labs.pdfstudy help
The document provides instructions for compiling a complete lab report from four biology labs on genomic databases, primer design, PCR, and molecular cloning. It outlines the required sections of the report, including an introduction, materials and methods, results with data, and a discussion/conclusion section. It also provides discussion questions on building a reporter gene or transgene, defining key terms and outlining the necessary gene elements and ideas for using the transgene. The report should integrate results and instructions from all four labs.
Web Apollo Tutorial for the i5K copepod research community.Monica Munoz-Torres
Introduction to Web Apollo for the i5K i5K copepod research community. WebApollo is genome annotation editor; it provides a web-based environment that allows multiple distributed users to review, edit, and share manual annotations. This presentation includes information specific to the projects of the Global Initiative to sequence the genomes of 5,000 species of arthropods, i5K. Let's get started!
Three groups annotated the genome of Mycoplasma genitalium and found inconsistencies in their annotations. Of the 468 genes, 318 were annotated consistently by all three groups but 45 had conflicting annotations. Errors likely arose from insufficient sequence similarity to determine homology accurately or incorrectly inferring function based on homology alone. Database curation is needed to prevent propagation of erroneous annotations.
MYSTERY MOLECULE PROJECT, PART II(20 points total)BIO1001.docxdohertyjoetta
MYSTERY MOLECULE PROJECT, PART II
(20 points total)
BIO1001
Fall 2020
To complete the remainder of this project and prepare for your presentation, follow the instructions below. Your presentation should include the answers to all questions indicated below. For your final
presentation, you are expected to
organize your research and
present a rehearsed, 10-12
minute presentation using the rubric at the end of this document as a guideline for preparation.
Learning Objectives
At the conclusion of this phase of the mystery molecule project, students will be able to
1. Effectively navigate sequence databases of the NCBI website
a. BLAST DNA sequence against the human genome
b. Analyze alignment data from genomic
databases
c. Identify an unknown gene based on DNA sequence
2. Use
the scientific literature (primary and secondary resources)
to analyze gene products, and research the cellular and molecular basis of a disease gene.
3. Organize scientific content into a coherent presentation.
4. Communicate research results to a group of peers.
Step 1: BLAST your DNA sequence
1. Go to:
http://www.ncbi.nlm.nih.gov/
2. Click on BLAST (under “popular resources” at the right side of the screen)
3. Choose the nucleotide blast program (under “Web BLAST”)
4. In the “blastn” tab (on left) enter your DNA sequence into the “enter query sequence” box (obtain your mystery DNA sequences in the Mystery Molecule content area in D2L).
a. In “choose search set” click the “genomic + transcript databases”
b. From the drop down menu below, select “Human genomic + transcript (Human G + T)”
c. Under “program selection” in the next section down, optimize for highly similar sequences (megablast)
5. Submit query (click on “BLAST” at the bottom of the screen) and wait– this may take up to a few minutes.
6. A colorized map of sequence alignment scores will appear. Red indicates very good sequence alignment.
7. Scroll down to descriptions of sequences that produced statistically significant alignments. You should see options for genomic alignments and transcripts. An e-value close to zero, and maximum sequence identity close to 100% are optimal.
8. Identify your mystery gene/transcript from the description, e-value, and % sequence identity.
Step 2: Analysis of gene product
Click on the reference number (left, in the column labeled, “Accession”) for the mRNA transcript or genomic sequence of your mystery gene. Scroll down; you will see many references (titles, authors, years of publication) describing your gene. Clicking on any one of the PubMed reference numbers will link you directly to the original publication. You will need these publications to characterize your gene for your presentation. Bookmark this page—all your references must be from primary or secondary sources (no google, wikipedia, textbooks, blogs, etc.).
You must use a minimum of 7 references that can be found by searching NCBI.
An approach is developed to detect and correct errors in 16S RNA fragments from metagenomic sequencing data. Two algorithms are proposed - the first finds and corrects errors by studying correspondence between similar sequences, while the second fine-tunes the first algorithm's accuracy for estimating sequence errors, SNPs and detecting species. The approaches are tested on two 16S RNA fragment datasets, and classification results after error correction are compared to evaluate performance. Future work includes improving error detection and correction and validating the approach on other datasets.
Practical 7 dna, rna and the flow of genetic information5Osama Barayan
This document is a lab report for a biochemistry course that discusses DNA, RNA, and genetic information. The report summarizes databases that contain DNA and protein sequences, such as NCBI. It also discusses analyzing a DNA sequence to find an open reading frame and identify the encoded protein. Finally, the report discusses using BLAST to find homologous sequences and determine if sequences are homologous based on percent identity and E-values.
This document provides an overview of a webinar introducing the Web Apollo genome annotation tool. The webinar aims to help researchers in the Ceratitis capitata research community learn to identify homologous genes, become familiar with the Web Apollo interface and annotation process, and access resources for the Ceratitis capitata genome. The webinar covers what Web Apollo is, the manual annotation process, and a demonstration of Web Apollo's functionality.
Unison: Enabling easy, rapid, and comprehensive proteomic miningReece Hart
Unison is an online database and data integration platform that aggregates proteomic and genomic data from multiple sources and provides over 200 million precomputed predictions on protein sequences, domains, structures, and more. It aims to enable easy, rapid, and comprehensive proteomic mining through semantic integration of distinct data types and automated querying of predictions. Custom data mining projects using Unison have led to discoveries about proteins like Bcl-2 that regulate apoptosis.
This document provides instructions for exploring a pairwise ortholog matrix between genomes. It explains how to set the reference genome, ortholog detection method, and display options for the cells. Changing parameters requires clicking an apply button to update the table with the new settings.
This document provides instructions for students to conduct basic BLAST searches using unknown DNA sequences as queries to identify homologous sequences and determine the identity of the unknown sequences. It describes how to analyze the BLAST results, including the descriptions, alignments and significance values. Students are guided to find homologs of a known sequence on GenBank and explore the BLAST results, alignments and formatting options.
Similar to Editing Functionality - Apollo Workshop (20)
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
This is a brief update about the genome browser JBrowse and the genome annotation editor Apollo, addressed to the members of the Alliance of Genome Resources (AGR).
Learn more about JBrowse at jbrowse.org
Learn more about Apollo at GenomeArchitect.org
Apollo Genome Annotation Editor: Latest Updates, Including New Galaxy Integra...Monica Munoz-Torres
This document describes Apollo, an open-source web-based tool for collaborative genome annotation. It allows users to curate gene models for multiple organisms on one server, generates analysis-ready data and reports. Key features include user annotations, evidence tracks, an annotator panel, search/edit/export functions, switching between organisms, and visualizing/editing annotation details. Apollo integrates with JBrowse and can export annotations to GFF3, FASTA, and Chado databases.
Introduction to Apollo - i5k Research Community – Calanoida (copepod)Monica Munoz-Torres
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
The i5k, an initiative to sequence the genomes of 5,000 insect and related arthropod species, is a broad and inclusive effort that seeks to involve scientists from around the world in their genome curation process, and Apollo is serving as the platform to empower this community.
This presentation is an introduction to Apollo for the members of the i5K Pilot Project working on species of the order Calanoida (copepod).
This document summarizes updates from the Gene Ontology Consortium website and community. It discusses the GO Help rotation team for 2016 and their productivity in resolving issues since September 2015. It also highlights updates to the GO website homepage and contributions page, including removing outdated elements and adding suggested teaching materials. The document concludes by recognizing the Berkeley Bioinformatics Open Source Projects for their support of the GO Consortium through various NIH and DOE grants.
Scientific research is inherently a collaborative task; in our case it is a dialog among different researchers to reach a shared understanding of the underlying biology. To facilitate this dialog we have developed two web-based annotation tools: Apollo (http://genomearchitect.org/), a genomic feature editor, designed to support structural annotation of gene models, and Noctua (http://noctua.berkeleybop.org/), a biological-process model builder designed for describing the functional roles of gene products. Here we wish to outline an inventory of essential requirements that, in our experience, enable an annotation tool to meet the needs of both professional biocurators as well as other members of the research community. Here are the general requirements, beyond specific functional requirements, that any annotation tool must satisfy.
This is an introduction to conducting manual annotation efforts using Apollo. This webinar was offered to members of the i5K Research community on 2015-10-07.
CONSORCIO ONTOLOGÍA DE GENES: herramientas para anotación funcionalMonica Munoz-Torres
Este documento describe herramientas para la anotación funcional de genes utilizando la ontología de genes (GO). Explica cómo Apollo y la GO permiten editar y asignar términos GO de forma precisa y profunda para mejorar la anotación estructural y funcional de genomas. También describe dónde se encuentran los archivos de anotación GO y cómo la estructura de la GO como un grafo acíclico dirigido permite análisis complejos de conjuntos de datos.
This document provides an overview of a talk on genome curation and manual annotation using the Apollo genome annotation tool. The talk aims to help scientists understand the genome curation process from assembled genome to automated and manual annotation. It will introduce Apollo and teach how to identify homologs of known genes, corroborate and modify automated gene models using evidence in Apollo. The talk will also refresh attendees on key biological concepts like the definition of a gene, central dogma, transcription, and translation to better understand manual annotation.
Apollo - A webinar for the Phascolarctos cinereus research communityMonica Munoz-Torres
This document provides an overview of a webinar introducing the Apollo genome annotation tool. The webinar aims to help the koala genome research community better understand genome curation processes involving automated annotation and manual curation using Apollo. It outlines the webinar topics which will explain gene prediction, the Apollo interface for collaborative curation, and demonstrations of identifying gene homologs and modifying automated annotations. The webinar aims to familiarize participants with genome curation concepts and the Apollo tool.
Continuing with the theme of DNA repair via homologous recombination, I will discuss the following family during the PAINT call:
PTHR13451 CLASS II CROSSOVER JUNCTION ENDONUCLEASE MUS81
Talk at the 8th International Biocuration Conference. Beijing, China. April 23-26, 2015.
Obtaining meaningful results from genome analyses requires high quality annotations of all genomic elements. Today’s sequencing projects face challenges such as lower coverage, more frequent assembly errors, and the lack of closely related species with well-annotated genomes. Apollo is a web-based application that supports and enables collaborative genome curation in real time, analogous to Google Docs, allowing curators to improve on existing automated gene models through an intuitive interface. Apollo’s extensible architecture is built on top of JBrowse; its components are a web-based client, an annotation-editing engine, and a server-side data service. It allows users to visualize automated gene models, protein alignments, expression and variant data, and conduct structural and/or functional annotations.
Apollo is actively used within a variety of projects, including the initiative to sequence the genomes of 5,000 Arthropod species (i5K), and will become essential to the thousands of genomes now being sequenced and analyzed. Researchers from nearly 100 institutions worldwide are currently using Apollo on distributed curation efforts for over sixty genome projects across the tree of life; from plants to echinoderms, to fungi, to species of fish and other vertebrates including human, cattle (bovine), and dog. We are training the next generation of researchers by reaching out to educators to make these tools available as part of curricula, offering workshops and webinars to the scientific community, and through widely applied systems such as iPlant and DNA Subway. We are currently integrating Apollo into an annotation environment combining gene structural and functional annotation, transcriptomic, proteomic, and phenotypic annotation. In this presentation we will describe in detail its utility to users, introduce the architecture to developers interested in expanding on this open-source project, and offer details of our future plans.
Authors:
Monica Munoz-Torres(1), Nathan Dunn(1), Colin Diesh(2), Deepak Unni(2), Seth Carbon(1), Heiko Dietze(1), Christopher Mungall(1), Nicole Washington(1), Ian Holmes(3), Christine Elsik(2), and Suzanna E. Lewis(1)
1Lawrence Berkeley National Laboratory, Genomics Division, Berkeley, CA
2Divisions of Animal and Plant Sciences, University of Missouri, Columbia, MO
3University of California Berkeley, Bioengineering, Berkeley, CA
Data Visualization And Annotation Workshop at Biocuration 2015Monica Munoz-Torres
8th International Biocuration Conference. Beijing, China. April 23-26, 2015.
Workshop 2: Data Visualization and Annotation.
Chairs: Rama Balakrishnan, Stanford University, USA and Monica Munoz-Torres, Lawrence Berkeley National Laboratory, USA
Explaining the most intricate biological processes often requires a degree of detail beyond the scope of equations and algorithms; in fact, most biological knowledge is represented visually as illustrations, graphs, and diagrams. Genomics data in particular require specialized forms of visualization to improve our understanding and increase our chances of extracting meaningful conclusions from our analyses. Furthermore, the heterogeneity and abundance of genomic data include widely varied sources, techniques for their obtention, and intrinsic experimental error. And even data obtained under similar conditions from two or more individuals are loaded with biological variation. So what is the best way to interpret the stories the data are telling us? Given the questions we wish to answer and the data we are generating, which tools would be most useful and effective? In this workshop we will explore the tools available for human interpretation of genomic data, specifically in the context of annotation.
Presentations and perspectives, panelists/presenters:
- Lorna Richardson, IGMM, University of Edinburgh, United Kingdom
- Justyna Szostak, PMI Research & Development, Switzerland
The workshop included a brief introduction to a landscape of tools available - as updated as the constantly changing field allows-, brief presentations chosen from abstract submissions and invited speakers, as well as ample discussion to capture the contributions and questions from attendants. In the end, we hope participants walked away with a toolset in hand that may benefit the progress of their own research.
The document provides an update on the progress of Apollo, an open-source genome annotation tool. Key points include: Milestone 2.0 will feature a simpler, more extensible architecture; new layout features like multiple organism selection and reference sequence search; pending issues for Milestone 1.0.4 and timeline for testing/release; and a call for more developers to join the Apollo community.
Apollo and i5K: Collaborative Curation and Interactive Analysis of GenomesMonica Munoz-Torres
Precise elucidation of the many different biological features encoded in a genome requires a careful curation process that involves reviewing all available evidence to allow researchers to resolve discrepancies and validate automated gene models, protein alignments, and other biological elements. Genome annotation is an inherently collaborative task; researchers only rarely work in isolation, turning to colleagues for second opinions and insights from those with expertise in particular domains and gene families.
The i5k initiative seeks to sequence the genomes of 5,000 insect and related arthropod species. The selected species are known to be important to worldwide agriculture, food safety, medicine, and energy production as well as many used as models in biology, those most abundant in world ecosystems, and representatives in every branch of the insect phylogeny in an effort to better understand arthropod evolution and phylogeny. Because computational genome analysis remains an imperfect art, each of these new genomes sequenced will require visualization and curation.
Apollo is an instantaneous, collaborative, genome annotation editor, and the new JavaScript based version allows researchers real-time interactivity, breaking down large amounts of data into manageable portions to mobilize groups of researchers with shared interests. The i5K is a broad and inclusive effort that seeks to involve scientists from around the world in their genome curation process and Apollo is serving as the platform to empower this community. Here we offer details about this collaboration.
This document provides an introduction and overview of a manual annotation workshop using the Web Apollo genome annotation tool. It discusses manual annotation and community-based curation efforts. The workshop aims to teach participants how to identify genes of interest, become familiar with Web Apollo, learn how to corroborate and modify gene models using evidence, and understand the genome annotation process from assembly to manual curation. The document outlines the workshop activities and provides guidance on using Web Apollo, including navigating the interface, editing annotations, and annotating simple cases by adding or modifying exons.
Embracing Deep Variability For Reproducibility and Replicability
Abstract: Reproducibility (aka determinism in some cases) constitutes a fundamental aspect in various fields of computer science, such as floating-point computations in numerical analysis and simulation, concurrency models in parallelism, reproducible builds for third parties integration and packaging, and containerization for execution environments. These concepts, while pervasive across diverse concerns, often exhibit intricate inter-dependencies, making it challenging to achieve a comprehensive understanding. In this short and vision paper we delve into the application of software engineering techniques, specifically variability management, to systematically identify and explicit points of variability that may give rise to reproducibility issues (eg language, libraries, compiler, virtual machine, OS, environment variables, etc). The primary objectives are: i) gaining insights into the variability layers and their possible interactions, ii) capturing and documenting configurations for the sake of reproducibility, and iii) exploring diverse configurations to replicate, and hence validate and ensure the robustness of results. By adopting these methodologies, we aim to address the complexities associated with reproducibility and replicability in modern software systems and environments, facilitating a more comprehensive and nuanced perspective on these critical aspects.
https://hal.science/hal-04582287
Candidate young stellar objects in the S-cluster: Kinematic analysis of a sub...Sérgio Sacani
Context. The observation of several L-band emission sources in the S cluster has led to a rich discussion of their nature. However, a definitive answer to the classification of the dusty objects requires an explanation for the detection of compact Doppler-shifted Brγ emission. The ionized hydrogen in combination with the observation of mid-infrared L-band continuum emission suggests that most of these sources are embedded in a dusty envelope. These embedded sources are part of the S-cluster, and their relationship to the S-stars is still under debate. To date, the question of the origin of these two populations has been vague, although all explanations favor migration processes for the individual cluster members. Aims. This work revisits the S-cluster and its dusty members orbiting the supermassive black hole SgrA* on bound Keplerian orbits from a kinematic perspective. The aim is to explore the Keplerian parameters for patterns that might imply a nonrandom distribution of the sample. Additionally, various analytical aspects are considered to address the nature of the dusty sources. Methods. Based on the photometric analysis, we estimated the individual H−K and K−L colors for the source sample and compared the results to known cluster members. The classification revealed a noticeable contrast between the S-stars and the dusty sources. To fit the flux-density distribution, we utilized the radiative transfer code HYPERION and implemented a young stellar object Class I model. We obtained the position angle from the Keplerian fit results; additionally, we analyzed the distribution of the inclinations and the longitudes of the ascending node. Results. The colors of the dusty sources suggest a stellar nature consistent with the spectral energy distribution in the near and midinfrared domains. Furthermore, the evaporation timescales of dusty and gaseous clumps in the vicinity of SgrA* are much shorter ( 2yr) than the epochs covered by the observations (≈15yr). In addition to the strong evidence for the stellar classification of the D-sources, we also find a clear disk-like pattern following the arrangements of S-stars proposed in the literature. Furthermore, we find a global intrinsic inclination for all dusty sources of 60 ± 20◦, implying a common formation process. Conclusions. The pattern of the dusty sources manifested in the distribution of the position angles, inclinations, and longitudes of the ascending node strongly suggests two different scenarios: the main-sequence stars and the dusty stellar S-cluster sources share a common formation history or migrated with a similar formation channel in the vicinity of SgrA*. Alternatively, the gravitational influence of SgrA* in combination with a massive perturber, such as a putative intermediate mass black hole in the IRS 13 cluster, forces the dusty objects and S-stars to follow a particular orbital arrangement. Key words. stars: black holes– stars: formation– Galaxy: center– galaxies: star formation
Discovery of An Apparent Red, High-Velocity Type Ia Supernova at 𝐳 = 2.9 wi...Sérgio Sacani
We present the JWST discovery of SN 2023adsy, a transient object located in a host galaxy JADES-GS
+
53.13485
−
27.82088
with a host spectroscopic redshift of
2.903
±
0.007
. The transient was identified in deep James Webb Space Telescope (JWST)/NIRCam imaging from the JWST Advanced Deep Extragalactic Survey (JADES) program. Photometric and spectroscopic followup with NIRCam and NIRSpec, respectively, confirm the redshift and yield UV-NIR light-curve, NIR color, and spectroscopic information all consistent with a Type Ia classification. Despite its classification as a likely SN Ia, SN 2023adsy is both fairly red (
�
(
�
−
�
)
∼
0.9
) despite a host galaxy with low-extinction and has a high Ca II velocity (
19
,
000
±
2
,
000
km/s) compared to the general population of SNe Ia. While these characteristics are consistent with some Ca-rich SNe Ia, particularly SN 2016hnk, SN 2023adsy is intrinsically brighter than the low-
�
Ca-rich population. Although such an object is too red for any low-
�
cosmological sample, we apply a fiducial standardization approach to SN 2023adsy and find that the SN 2023adsy luminosity distance measurement is in excellent agreement (
≲
1
�
) with
Λ
CDM. Therefore unlike low-
�
Ca-rich SNe Ia, SN 2023adsy is standardizable and gives no indication that SN Ia standardized luminosities change significantly with redshift. A larger sample of distant SNe Ia is required to determine if SN Ia population characteristics at high-
�
truly diverge from their low-
�
counterparts, and to confirm that standardized luminosities nevertheless remain constant with redshift.
Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...PsychoTech Services
A proprietary approach developed by bringing together the best of learning theories from Psychology, design principles from the world of visualization, and pedagogical methods from over a decade of training experience, that enables you to: Learn better, faster!
TOPIC OF DISCUSSION: CENTRIFUGATION SLIDESHARE.pptxshubhijain836
Centrifugation is a powerful technique used in laboratories to separate components of a heterogeneous mixture based on their density. This process utilizes centrifugal force to rapidly spin samples, causing denser particles to migrate outward more quickly than lighter ones. As a result, distinct layers form within the sample tube, allowing for easy isolation and purification of target substances.
Microbial interaction
Microorganisms interacts with each other and can be physically associated with another organisms in a variety of ways.
One organism can be located on the surface of another organism as an ectobiont or located within another organism as endobiont.
Microbial interaction may be positive such as mutualism, proto-cooperation, commensalism or may be negative such as parasitism, predation or competition
Types of microbial interaction
Positive interaction: mutualism, proto-cooperation, commensalism
Negative interaction: Ammensalism (antagonism), parasitism, predation, competition
I. Mutualism:
It is defined as the relationship in which each organism in interaction gets benefits from association. It is an obligatory relationship in which mutualist and host are metabolically dependent on each other.
Mutualistic relationship is very specific where one member of association cannot be replaced by another species.
Mutualism require close physical contact between interacting organisms.
Relationship of mutualism allows organisms to exist in habitat that could not occupied by either species alone.
Mutualistic relationship between organisms allows them to act as a single organism.
Examples of mutualism:
i. Lichens:
Lichens are excellent example of mutualism.
They are the association of specific fungi and certain genus of algae. In lichen, fungal partner is called mycobiont and algal partner is called
II. Syntrophism:
It is an association in which the growth of one organism either depends on or improved by the substrate provided by another organism.
In syntrophism both organism in association gets benefits.
Compound A
Utilized by population 1
Compound B
Utilized by population 2
Compound C
utilized by both Population 1+2
Products
In this theoretical example of syntrophism, population 1 is able to utilize and metabolize compound A, forming compound B but cannot metabolize beyond compound B without co-operation of population 2. Population 2is unable to utilize compound A but it can metabolize compound B forming compound C. Then both population 1 and 2 are able to carry out metabolic reaction which leads to formation of end product that neither population could produce alone.
Examples of syntrophism:
i. Methanogenic ecosystem in sludge digester
Methane produced by methanogenic bacteria depends upon interspecies hydrogen transfer by other fermentative bacteria.
Anaerobic fermentative bacteria generate CO2 and H2 utilizing carbohydrates which is then utilized by methanogenic bacteria (Methanobacter) to produce methane.
ii. Lactobacillus arobinosus and Enterococcus faecalis:
In the minimal media, Lactobacillus arobinosus and Enterococcus faecalis are able to grow together but not alone.
The synergistic relationship between E. faecalis and L. arobinosus occurs in which E. faecalis require folic acid
1. Apollo
Collaborative genome annotation editing
A workshop for the International Aphid Genome Consortium research community.
Monica Munoz-Torres, PhD | @monimunozto
Berkeley Bioinformatics Open-Source Projects (BBOP)
Environmental Genomics & Systems Biology Division
Lawrence Berkeley National Laboratory
Webinar - 22 March, 2017
http://GenomeArchitect.org
12. BECOMING ACQUAINTED WITH APOLLO
Annotator
panel.
• Choose appropriate evidence from list of “Tracks” on annotator panel.
• Select & drag elements from evidence track into the ‘User-created Annotations’ area.
• Hovering over annotation in progress brings up an information pop-up.
Creating a new
annotation
21. “Simple case”:
- the predicted gene model is correct or nearly correct, and
- this model is supported by evidence that completely or
mostly agrees with the prediction.
- evidence that extends beyond the predicted model is
assumed to be non-coding sequence.
The following are simple modifications.
BECOMING ACQUAINTED WITH APOLLO SIMPLE CASES
24. Editing functionality
Example: Adding an exon supported by experimental data
• RNAseq reads show evidence in support of a transcribed product that was not predicted.
• Add exon by dragging up one of the RNAseq reads.
26. In some cases all the data may disagree with the annotation, in other cases some data
support the annotation and some of the data support one or more alternative transcripts.
Try to annotate as many alternative transcripts as are well supported by the data.
MATCHING EXON BOUNDARY TO EVIDENCE
BECOMING ACQUAINTED WITH APOLLO SIMPLE CASES
To modify an exon
boundary and match
data in the evidence
tracks: select both the
offending exon and
the element with the
correct boundary,
then right click on the
annotation to select
‘Set 3’ end’ or ‘Set 5’
end’ as appropriate.
29. Non-canonical splices are indicated
with orange circles with a white
exclamation point inside, placed over
the edge of the offending exon.
Canonical splice sites:
3’-…exon]GA / TG[exon…-5’
5’-…exon]GT / AG[exon…-3’
reverse strand, not reverse-complemented:
forward strand
SPLICE SITES
Zoom to review non-canonical
splice site warnings. Although
these may not always have to be
corrected (e.g. GC donor), they
should be flagged with a
comment.
Exon/intron splice site error warning
Curated model
BECOMING ACQUAINTED WITH APOLLO SIMPLE CASES
31. Apollo calculates the longest possible open reading
frame (ORF) that includes canonical ‘Start’ and
‘Stop’ signals within the predicted exons.
If ‘Start’ appears to be incorrect, modify it by
selecting an in-frame ‘Start’ codon further up or
downstream, depending on evidence (e.g.
proteins, RNAseq).
It may be present outside the predicted gene
model, within a region supported by another
evidence track.
In very rare cases, the actual ‘Start’ codon may
be non-canonical (non-ATG).
‘Start’ AND ‘Stop’ SITES
BECOMING ACQUAINTED WITH APOLLO SIMPLE CASES
33. Evidence may support joining two or more different gene models.
Warning: protein alignments may have incorrect splice sites and lack non-conserved regions!
1. In ‘User-created Annotations’ area shift-click to select an intron from each gene model and
right click to select the ‘Merge’ option from the menu.
2. Drag supporting evidence tracks over the candidate models to corroborate overlap, or
review edge matching and coverage across models.
3. Check the resulting translation by querying a protein database e.g. UniProt, NCBI nr. Add
comments to record that this annotation is the result of a merge.
MERGE TWO GENE PREDICTIONS
ON THE SAME SCAFFOLD
BECOMING ACQUAINTED WITH APOLLO COMPLEX CASES
Red lines around exons:
‘edge-matching’ allows annotators to confirm whether
the evidence is in agreement, without examining each
exon at the base level.
35. DNA Track
‘User-created Annotations’ Track
ANNOTATE FRAMESHIFTS AND
CORRECT SINGLE-BASE ERRORS
Always remember: when annotating gene models using Apollo, you are looking at a ‘frozen’ version of
the genome assembly and you will not be able to modify the assembly itself.
BECOMING ACQUAINTED WITH APOLLO COMPLEX CASES
40. 40 | BECOMING ACQUAINTED WITH APOLLO
Information Editor
Isoforms at BIPAA:
If the gene you are annotating does not have
multiple isoforms, add metadata only on left side
of the Information Editor (i.e. under gene).
If the gene you are annotating has multiple
isoforms, you should populate the right panel
(mRNA / transcript) for each isoform, adding a
letter (A, B, C, …) at the end of the name to
distinguish
48. • Check ‘Start’ and ‘Stop’ sites.
• Check splice sites: most splice sites display
these residues …]5’-GT/AG-3’[…
• Check if you can annotate UTRs, for example
using RNA-Seq data:
– align it against relevant genes/gene family
– blastp against NCBI’s RefSeq or nr
• Check & comment gaps in the genome.
• Additional functionality may be necessary:
– merge 2 gene predictions - same scaffold
– ‘merge’ 2 gene predictions - different
scaffolds
– split a gene prediction
– annotate frameshifts
– annotate selenocysteines, correcting
single-base and other assembly errors, etc.
48 |
• Add:
– Important project information in the form of
comments.
– IDs for this gene model in public or private
databases via DBXRefs, e.g. GenBank ID,
gene symbol(s), common name(s),
synonyms.
– Comments about the changes you made to
each gene model, if any.
– Any appropriate functional assignments, e.g.
via BLAST + HMM (e.g. InterProScan), RNA-
Seq or other data of your own, literature
searches, etc.
CHECKLIST
for accuracy and integrity
MANUAL ANNOTATION CHECKLIST
50. Apis mellifera genome data in Apollo
GenomeArchitect.org
1. Evidence in support of protein coding gene
models.
1.1 Consensus Gene Sets:
Official Gene Set v3.2
Official Gene Set v1.0
1.2 Consensus Gene Sets comparison:
OGSv3.2 genes that merge OGSv1.0 and
RefSeq genes
OGSv3.2 genes that split OGSv1.0 and RefSeq
genes
1.3 Protein Coding Gene Predictions Supported
by Biological Evidence:
NCBI Gnomon
Fgenesh++ with RNASeq training data
Fgenesh++ without RNASeq training data
NCBI RefSeq Protein Coding Genes and Low
Quality Protein Coding Genes
1.4 Ab initio protein coding gene predictions:
Augustus Set 12, Augustus Set 9, Fgenesh,
GeneID, N-SCAN, SGP2
1.5 Transcript Sequence Alignment:
NCBI ESTs, Apis cerana RNA-Seq, Forager Bee
Brain Illumina Contigs, Nurse Bee Brain Illumina
Contigs, Forager RNA-Seq reads, Nurse RNA-Seq
reads, Abdomen 454 Contigs, Brain and Ovary
454 Contigs, Embryo 454 Contigs, Larvae 454
Contigs, Mixed Antennae 454 Contigs, Ovary 454
Contigs, Testes 454 Contigs, Forager RNA-Seq
HeatMap, Forager RNA-Seq XY Plot, Nurse RNA-
Seq HeatMap, Nurse RNA-Seq XY Plot
51. Apis mellifera genome data in Apollo
GenomeArchitect.org
1. Evidence in support of protein coding gene
models (Continued).
1.6 Protein homolog alignment:
Acep_OGSv1.2
Aech_OGSv3.8
Cflo_OGSv3.3
Dmel_r5.42
Hsal_OGSv3.3
Lhum_OGSv1.2
Nvit_OGSv1.2
Nvit_OGSv2.0
Pbar_OGSv1.2
Sinv_OGSv2.2.3
Znev_OGSv2.1
Metazoa_Swissprot
2. Evidence in support of non protein coding
gene models
2.1 Non-protein coding gene predictions:
NCBI RefSeq Noncoding RNA
NCBI RefSeq miRNA
2.2 Pseudogene predictions:
NCBI RefSeq Pseudogene
54. Ceramidase
Example 54
Ceramidase is an enzyme, which cleaves fatty acids from ceramide, producing
sphingosine (SPH), which in turn is phosphorylated by a sphingosine kinase to
form sphingosine-1-phosphate (S1P). Ceramide, SPH, and S1P are bioactive
lipids that mediate cell proliferation, differentiation, apoptosis, adhesion, and
migration.
It has come to our attention that the honey bee Apis mellifera ortholog of
Ceramidase is fragmented into 2 or more genes in the current gene set (Official
Gene Set v3.2).
59. 59i5K Workspace@NAL
BIPAA resources - Apollo
You may find candidate genes from
blast results using the ‘Search’ box
with coordinates in main window.
60. Create a new annotation
Example 60
Drag and drop ‘GB40335-RA’
61. Transcriptomic data support longer gene
Example 61
RNA-Seq reads support
a large intron and
additional exons
located about 20k bp
downstream (3’) of the
last predicted exon for
GB40335-RA.
63. Merge transcripts
Example 63
Select one exon from each gene
model, holding down the ‘Shift’ key.
Then, select ‘Merge’ from right-click
menu to bring gene models together.
Note non-canonical splice sites.
64. Exon not supported by RNA-Seq data
Example 64
At the end of GB40335-RA,
select last exon and right-click
to choose the ‘Delete’ option.
65. Fix remaining non-canonical splice site
Example 65
Now on the other offending exon (was first exon of GB40336-RA), use
RNA-seq reads - or use ‘Set Downstream Splice Acceptor’, or drag the
intron/exon boundary manually - to use a canonical splice site.
71. 71i5K Workspace@NAL
Accessing the genome home page
Each genome hosted on BIPAA has a dedicated home page, accessible
from AphidBase, ParWaspDB or LepidoDB. Access may be restricted
for some species; if so, login with your BIPAA account.
To create a BIPAA account visit http://bipaa.genouest.org/account
For Phylloxera, visit
http://bipaa.genouest.org/sp/daktulosphaira_vitifoliae/
72. 72i5K Workspace@NAL
1. A computationally predicted consensus gene set has been
generated using multiple lines of evidence in MAKER.
2. BIPAA will integrate consensus computational predictions with
manual annotations to produce an updated, official gene set (OGS):
Attention!
• If it’s not on either track, it won’t make the OGS!
• If it’s there and it shouldn’t, it will still make the OGS!
Curation process with IAGC & BIPAA
73. 73i5K Workspace@NAL
3. In some cases algorithms and metrics used to generate consensus sets may
actually reduce the accuracy of the gene’s representation. Use caution!
4. Isoforms: drag original and alternatively spliced form to ‘User-created
Annotations’ area.
If the gene you are annotating does not have multiple isoforms, add metadata only on left side of the
Information Editor (i.e. under gene).
If the gene you are annotating has multiple isoforms, you should populate the right panel (mRNA / transcript)
for each isoform, adding a letter (A, B, C, …) at the end of the name to distinguish each isoform.
5. If an annotation needs to be removed from the consensus set, drag it to the
‘User-created Annotations’ area, copy the gene ID in the Name field, and mark
with ‘Delete’ radio button on the Information Editor.
6. Overlapping interests? Collaborate to reach agreement.
7. Follow guidelines for IAGC & BIPAA, at
https://bipaa.genouest.org/is/how-to-annotate-a-genome/
Curation process with IAGC & BIPAA
74. 74i5K Workspace@NAL
Annotation report
To avoid mistakes, a personal report is generated each night for each
annotator, giving access to the list of annotated genes, and the
possible corresponding errors and warnings (e.g. missing symbol,
wrong name, etc.).
75. 75i5K Workspace@NAL
Updating the OGS
• Regularly, a new OGS is released: merging the original OGS with the
manual curation set.
• If a manually curated gene overlaps a gene predicted by Maker, we
keep the manual annotation and replace the automated one.***
• Question from moni: Overlapping CDS? Or just overlapping coordinates?
• Gene IDs are conserved between each OGS release, a suffix being
incremented when a gene is modified (structure, as well as
associated information like Name or Symbol).
76. PUBLIC DEMO
76 |
APOLLO ON THE WEB
instructions
Public Honey bee demo available at:
genomearchitect.org/demo/
Username:
demo@demo.com
Password:
demo
78. Apollo Development
Nathan Dunn
Technical Lead Eric Yao
Christine Elsik’s Lab,
University of Missouri
Suzi Lewis
Principal Investigator
BBOP
Moni Munoz-Torres
Project Manager
Deepak Unni
JBrowse. Ian Holmes’ Lab
University of California, Berkeley
79. For your attention,
Thank you!
Berkeley Bioinformatics Open-Source Projects,
Environmental Genomics & Systems Biology,
Lawrence Berkeley National Laboratory
Funding
• Apollo is supported by NIH grants 5R01GM080203 from NIGMS,
and 5R01HG004483 from NHGRI.
• BBOP is also supported by the Director, Office of Science,
Office of Basic Energy Sciences, of the U.S. Department of
Energy under Contract No. DE-AC02-05CH11231
Collaborators
• Ian Holmes, Eric Yao, UC Berkeley (JBrowse)
• Chris Elsik, Deepak Unni, U of Missouri (Apollo)
• Monica Poelchau, USDA/NAL (Apollo)
• i5k Community
berkeleybop.org
UNIVERSITY OF
CALIFORNIA
Suzanna Lewis & Chris Mungall
Seth Carbon (Noctua / AmiGO)
Nathan Dunn (Apollo)
Monica Munoz-Torres (Apollo / GO)
Jeremy Nguyen Xuan (Monarch Init.)