The document summarizes key concepts from Lecture 10 of the Microbial Phylogenomics course taught by Jonathan Eisen in winter 2014. It discusses the history of genome sequencing, including the first bacterial genome sequenced. It then covers the general steps involved in genome sequencing projects, including library construction, random sequencing, closure, and annotation. Subsequent slides discuss trends in completed genomes over time, structural annotation of genes and features, functional annotation including Gene Ontology and enzyme classification, and methods for functional prediction such as membrane protein prediction and phylogeny-based approaches.
UC Davis EVE161 Lecture 8 - rRNA ecology - by Jonathan Eisen @phylogenomics
This lecture discusses using rRNA sequencing to analyze and compare microbial communities between different environmental samples. It introduces UniFrac, a new phylogenetic method for measuring distances between communities based on shared lineages. UniFrac can be used to compare multiple samples simultaneously and is more powerful than previous non-phylogenetic techniques because it accounts for evolutionary distances between sequences. The lecture applies UniFrac to compare bacterial populations in different marine environments like water, sediment, and sea ice to examine questions about culturing effects, bacterial cosmopolitanism, and habitat distinctions.
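The idea of a distance "based on shared lineages" can be sketched in a few lines. The block below is a minimal illustration of unweighted UniFrac as it is commonly defined (the fraction of branch length in the tree that leads to members of only one of the two communities). The toy tree, its branch lengths, and the names `TREE`, `branches_above`, and `unweighted_unifrac` are assumptions made for this example and are not taken from the lecture.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical toy tree: node -> (parent, length of the branch leading to the node).
Tree = Dict[str, Tuple[Optional[str], float]]

TREE: Tree = {
    "root": (None, 0.0),
    "X": ("root", 1.0),
    "Y": ("root", 1.0),
    "a": ("X", 1.0),
    "b": ("X", 1.0),
    "c": ("Y", 1.0),
    "d": ("Y", 1.0),
}


def branches_above(tips: Set[str], tree: Tree) -> Set[str]:
    """Collect every branch on a path from one of the tips up to the root.

    A branch is identified by the node at its lower end.
    """
    covered: Set[str] = set()
    for tip in tips:
        node = tip
        while tree[node][0] is not None:  # stop once the root is reached
            covered.add(node)
            node = tree[node][0]
    return covered


def unweighted_unifrac(comm_a: Set[str], comm_b: Set[str], tree: Tree) -> float:
    """Branch length unique to one community divided by all covered branch length."""
    in_a = branches_above(comm_a, tree)
    in_b = branches_above(comm_b, tree)
    total = sum(tree[n][1] for n in in_a | in_b)   # branches leading to either community
    unique = sum(tree[n][1] for n in in_a ^ in_b)  # branches leading to exactly one
    return unique / total if total else 0.0


# Two communities that occupy separate clades share no branch length.
print(unweighted_unifrac({"a", "b"}, {"c", "d"}, TREE))  # 1.0
# Identical communities share every branch.
print(unweighted_unifrac({"a", "c"}, {"a", "c"}, TREE))  # 0.0
```

Because the measure depends only on the tree and on which tips occur in each sample, it can be computed for every pair of samples to build a distance matrix for comparing many communities at once.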
UC Davis EVE161 Lecture 11 by @phylogenomics (Jonathan Eisen)
This document summarizes key points from Lecture 10 of the course Microbial Phylogenomics at UC Davis taught by Jonathan Eisen in Winter 2014. It discusses genome sequencing and comparative genomics. Specifically, it covers structural diversity of bacterial genomes including multiple genetic elements like plasmids and chromosomes. It also discusses gene content, order, density and shared genes between bacterial strains and species. Genome rearrangements through inversions or mobile elements inserting genomic islands are noted to contribute to genomic diversity.
UC Davis EVE161 Lecture 17 by @phylogenomics (Jonathan Eisen)
This document contains slides from a lecture on metagenomics given by Jonathan Eisen at UC Davis in winter 2014. The lecture discusses shotgun metagenomics and analyzing metagenomic functions and gene content from environmental samples without genome assemblies. It provides an example of a comparative metagenomics study of various microbial communities that identified habitat-specific genes and metabolic profiles reflecting the different environments. The slides include figures and references from a 2005 Science paper on this topic. Problem set 4 for the class involves selecting a relevant paper for presentation the following week.
UC Davis EVE 161 Lecture 7 - rRNA workflows - by Jonathan Eisen @phylogenomics
This document summarizes a lecture on rRNA sequencing and analysis from a microbiome phylogenomics course. The lecture covers:
- The goals that should guide rRNA analysis, including taxonomic assignment, ecological characterization of communities, and comparisons between communities.
- The general workflow for rRNA analysis, including PCR, sequencing, alignment, clustering sequences into OTUs, and assigning taxonomy to OTUs.
- Methods for measuring alpha and beta diversity from rRNA data, including OTU richness, phylogenetic diversity, and comparisons between communities.
- Specific techniques in more depth, including degenerate PCR, alignment, OTU clustering, and diversity metrics such as rarefaction curves and rank abundance curves.
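The diversity metrics named in the list can be made concrete with a small sketch. It assumes the input is a list of read counts per OTU for one sample and uses the standard analytic expectation for rarefied richness (expected number of OTUs seen when a fixed number of reads is drawn without replacement). The function names and the example counts are hypothetical and are not taken from the lecture.

```python
from math import comb
from typing import List


def observed_richness(counts: List[int]) -> int:
    """Alpha diversity as the number of OTUs with at least one read."""
    return sum(1 for c in counts if c > 0)


def rarefied_richness(counts: List[int], depth: int) -> float:
    """Expected OTU richness in a random subsample of `depth` reads.

    For each OTU, 1 - C(N - n_i, depth) / C(N, depth) is the probability
    that at least one of its n_i reads is drawn from the N total reads.
    """
    total = sum(counts)
    if not 0 <= depth <= total:
        raise ValueError("depth must be between 0 and the total read count")
    denom = comb(total, depth)
    return sum(1 - comb(total - c, depth) / denom for c in counts if c > 0)


def rarefaction_curve(counts: List[int]) -> List[float]:
    """Expected richness at every subsampling depth from 0 to the full sample."""
    return [rarefied_richness(counts, d) for d in range(sum(counts) + 1)]


def rank_abundance(counts: List[int]) -> List[int]:
    """OTU read counts sorted from most to least abundant, zeros dropped."""
    return sorted((c for c in counts if c > 0), reverse=True)


# Hypothetical sample: three OTUs with 5, 3, and 2 reads.
sample = [5, 3, 2]
print(observed_richness(sample))
print(rank_abundance(sample))
print([round(x, 2) for x in rarefaction_curve(sample)])
```

A curve that is still rising at the full sequencing depth suggests that more OTUs would be found with more reads, which is why samples are usually compared at a common depth.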
This document outlines the syllabus for a course on microbial phylogenomics taught by Jonathan Eisen at UC Davis in winter 2014. The course will cover the history of sequencing-based studies of microbial diversity through four eras: the rRNA tree of life, rRNA analysis of environmental samples, genome sequencing, and metagenomics. Students will learn about microbial diversity, phylogeny, and how to analyze research papers. The course will include lectures, readings, assignments, and a final student project to critically review a relevant research paper. Grading will be based on participation, weekly assignments, exams, a presentation, and a final exam.
The document summarizes a lecture on the modern view of the tree of life. It discusses two papers for the lecture - one that analyzes the eukaryotic tree of life using broad taxonomic sampling, and one that places eukaryotes within the Archaea based on phylogenomic analysis. The lecture covers the parts of a phylogenetic tree, character analysis, data matrices, sequence alignment, tree reconstruction methods, and challenges like long branch attraction and homoplasy. It shows tree topologies from analyses using varying numbers of taxa.
Microbial Phylogenomics (EVE161) Class 6: Era II - Culture Independent rRNA (Jonathan Eisen)
Microbial Phylogenomics (EVE161) at UC Davis Spring 2016. Co-taught by Jonathan Eisen and Holly Ganz.
Class 6:
Era II: Culture independent rRNA studies
UC Davis EVE161 Lecture 15 by @phylogenomics (Jonathan Eisen)
This document summarizes a lecture on shotgun metagenomics from a course on microbial phylogenomics. The lecture discusses how shotgun sequencing was applied to sequence microbial communities directly from environmental samples, without culturing. This allowed reconstruction of near-complete genomes from dominant species in an acid mine drainage biofilm sample. The sample was dominated by a few microbial populations, and shotgun sequencing generated enough data to assemble genomes representing Leptospirillum group II and Ferroplasma type II. Analysis of the assembled genomes provided insights into the metabolic pathways and survival strategies of these uncultivated organisms inhabiting an extreme environment.
This document contains slides from a lecture on the evolution of DNA sequencing technologies taught by Jonathan Eisen at UC Davis in winter 2014. The lecture covers the timeline of sequencing technology development from early manual Sanger and Maxam-Gilbert sequencing methods through modern next-generation sequencing platforms. It discusses the key innovations that enabled automation and high-throughput sequencing, such as labeled dideoxynucleotides, capillary electrophoresis, emulsion PCR, and sequencing by synthesis using reversible terminators. The slides illustrate sequencing workflows and compare different sequencing platforms such as 454, Illumina, SOLiD, and Helicos.
This document contains lecture slides for a course on microbial phylogenomics taught by Jonathan Eisen at UC Davis in winter 2014. The slides discuss the use of rRNA PCR and sequencing to study major microbial groups based on 16S rRNA gene sequences. They provide phylogenetic trees comparing sequences from cultivated vs uncultivated microbes in various bacterial divisions. The slides also address issues with phylogenetic analysis like unseen changes over evolutionary time and limitations in representing diversity due to a lack of cultivated microbes. Overall, the slides aim to provide students with an understanding of how rRNA gene sequencing has expanded knowledge of microbial diversity beyond what was known from culture and the challenges that remain in fully resolving deep phylogenetic relationships.
UC Davis EVE161 Lecture 14 by @phylogenomics (Jonathan Eisen)
This document contains slides from a lecture on metagenomics and microbial phylogenomics. The lecture discusses the history and development of metagenomics, which involves studying the collective genomes of microbes in an environment. It reviews key papers on metagenomics and the discovery of proteorhodopsin and the SAR11 lineage of bacteria from environmental samples. The slides also discuss previous findings on marine microbes from rRNA studies and introduce two new lineages of alpha- and gamma-proteobacteria identified from an analysis of 16S rRNA genes cloned from Sargasso Sea bacterioplankton DNA.
This document summarizes key concepts from a lecture on the early development of the tree of life. It discusses how prior to Carl Woese's work in the 1960s-1970s, constructing a universal tree of life was difficult due to a lack of homologous traits shared across all domains of life. Woese developed one of the first universal trees using sequences of 16S ribosomal RNA, which are highly conserved yet vary enough between major groups to distinguish relationships. His work established the three domain system of Archaea, Bacteria, and Eukarya and provided strong evidence that all life on Earth descended from a common ancestor.
UC Davis EVE161 Lecture 13 by @phylogenomics (Jonathan Eisen)
This document summarizes a lecture on microbial phylogenomics from a winter 2014 course at UC Davis taught by Jonathan Eisen. The lecture covered several topics:
- Analysis of fungal genomes revealed over 200 genes specific to eukaryotes involved in cytoskeleton, protein degradation and chromatin. Few genes differentiated single-celled from multicellular eukaryotes.
- Endosymbiont genomes were found to evolve more rapidly than free-living relatives due to population genetics factors rather than DNA repair differences.
- Lateral gene transfer plays a major role in prokaryote evolution, occurring through transformation, conjugation and transduction. Whole genome trees and atypical gene distributions can reveal transfer
Microbial Phylogenomics (EVE161) Class 9 (Jonathan Eisen)
Microbial Phylogenomics (EVE161) at UC Davis Spring 2016. Co-taught by Jonathan Eisen and Holly Ganz.
Class 9:
Era II: rRNA Case Study: Built Environment Metaanalysis
UC Davis EVE161 Lecture 9 by @phylogenomics (Jonathan Eisen)
This document summarizes a lecture about a case study analyzing microbial communities in dust samples from various spaces in a university building using rRNA sequencing. The study found indoor bacterial communities were highly diverse but dominated by Proteobacteria, Firmicutes, and Deinococci. Architectural characteristics like space type, building layout, ventilation sources, and human occupancy patterns significantly influenced the structure of bacterial communities between spaces. Restrooms in particular contained very distinct microbial communities. The study demonstrates how human activities and building design can shape the indoor microbiome.
The document contains slides from a course on microbial phylogenomics taught by Jonathan Eisen at UC Davis in winter 2016. The slides discuss various topics relating to metagenomics including the environmental genome shotgun sequencing of the Sargasso Sea, methods for binning sequences from metagenomic data like aligning to reference genomes or assembly, and examples of projects that used shotgun sequencing like the Wolbachia and glassy-winged sharpshooter projects. It also discusses challenges with assembly for metagenomic data due to variations in coverage and the DeLong lab's early work characterizing uncultured marine microbes.
UC Davis EVE161 Lecture 18 by @phylogenomics (Jonathan Eisen)
This document contains slides for a lecture on metagenomics. It discusses student presentation guidelines, summarizes a published article on characterizing genes from the human gut microbiome, provides details on the methods used in that study to extract and sequence DNA from fecal samples of 124 individuals, and includes some results tables. The study generated over 500 GB of sequence data and identified over 3 million non-redundant microbial genes from the gut microbiome.
This document summarizes a study that used PCR and cloning to analyze the 16S rRNA genes present in a natural marine bacterioplankton population from the Sargasso Sea. Researchers constructed a library of 51 small-subunit rRNA genes and sequenced five unique genes. In addition to genes from known marine Synechococcus and SAR11 lineages, they identified two new classes of genes belonging to alpha- and gamma-proteobacteria, confirming that many planktonic bacteria have not been previously recognized by microbiologists.
This is an introduction to conducting manual annotation efforts using Apollo. This webinar was offered to members of the i5K Research community on 2015-10-07.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
The i5K, an initiative to sequence the genomes of 5,000 insect and related arthropod species, is a broad and inclusive effort that seeks to involve scientists from around the world in their genome curation process, and Apollo is serving as the platform to empower this community.
This presentation is an introduction to Apollo for the members of the i5K Pilot Project on Eurytemora affinis
Comparative genome analysis requires high quality annotations of all genomic elements. Today’s sequencing projects face numerous challenges including lower coverage, more frequent assembly errors, and the lack of closely related species with well-annotated genomes. Precise elucidation of the many different biological features encoded in any genome requires careful examination and review. We need genome annotation editing tools to modify and refine the location and structure of the genome elements that predictive algorithms cannot yet resolve automatically. During the manual annotation process, curators identify elements that best represent the underlying biology and eliminate elements that reflect systemic errors of automated analyses.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, analogous to Google Docs, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Researchers from nearly one hundred institutions worldwide are currently using Apollo for distributed curation efforts in over sixty genome projects across the tree of life: from plants to arthropods, to fungi, to species of fish and other vertebrates including human, cattle (bovine), and dog.
1. Mutations are changes in the nucleotide sequence of DNA that can arise spontaneously during DNA replication or due to damage from mutagens.
2. DNA repair enzymes work to minimize mutations by correcting errors during replication or reacting to damaged DNA.
3. If a mismatch introduced during replication is not repaired, it will become a permanent mutation when that region is replicated again.
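Point 3 can be illustrated with a toy model of replication. The sketch assumes a duplex is written as two strings aligned position by position (strand orientation is ignored for simplicity) and that replication is semi-conservative, so each parental strand serves as the template for one new strand. The sequences and the names `complement` and `replicate` are hypothetical.

```python
from typing import List, Tuple

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

Duplex = Tuple[str, str]  # (top strand, bottom strand), aligned position by position


def complement(strand: str) -> str:
    """Base-by-base complement of a strand (orientation ignored in this toy model)."""
    return "".join(COMPLEMENT[base] for base in strand)


def replicate(duplex: Duplex) -> List[Duplex]:
    """Error-free semi-conservative replication.

    Each parental strand is kept and paired with a newly made complement.
    """
    top, bottom = duplex
    return [(top, complement(top)), (complement(bottom), bottom)]


def is_fully_paired(duplex: Duplex) -> bool:
    """True when every position is a Watson-Crick pair."""
    top, bottom = duplex
    return complement(top) == bottom


# The wild-type duplex is ACGT / TGCA. Suppose replication misincorporated a T
# opposite the G at the third position and repair did not correct it.
mismatched: Duplex = ("ACGT", "TGTA")
print(is_fully_paired(mismatched))  # False: a G:T mismatch is present

daughters = replicate(mismatched)
print(daughters[0])  # ('ACGT', 'TGCA'): wild-type sequence restored in this daughter
print(daughters[1])  # ('ACAT', 'TGTA'): A:T now fixed where G:C used to be
```

After this second round both daughters are perfectly paired, so there is no longer a mismatch for repair enzymes to recognize, and the changed daughter carries the mutation permanently.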
This document provides an introduction to genomics, proteomics, and comparative genomics. It discusses the central dogma of molecular biology involving DNA replication, transcription, and translation. It describes DNA and RNA structure and explains how genetic information flows from DNA to protein. The document also discusses genome sequencing, gene mapping, and how comparative analysis of genomes from different species can provide insights into evolutionary relationships and biological functions.
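The flow of information from DNA to protein can be sketched as two small functions. This is a minimal illustration that assumes the input is the coding (sense) strand, uses the standard genetic code, and stops at the first stop codon. The function names and the example sequence are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order at each position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map RNA codons (U instead of T) to one-letter amino acids; "*" marks a stop.
CODON_TABLE = {
    "".join(codon).replace("T", "U"): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Transcription: the mRNA matches the coding strand with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translation: read codons from the first base until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical coding sequence: start codon, one alanine codon, stop codon.
mrna = transcribe("ATGGCCTAA")
print(mrna)             # AUGGCCUAA
print(translate(mrna))  # MA
```

Real annotation pipelines also have to locate the correct reading frame and strand and handle introns in eukaryotes, which this sketch does not attempt.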
This document summarizes key differences between prokaryotic and eukaryotic genomes. Prokaryotic genomes are typically smaller, usually contained in a single circular DNA molecule within the nucleoid. The DNA is highly compacted via supercoiling. Genes have compact organization with little non-coding DNA. Operons, where genes are expressed as a unit, are common. Repetitive DNA, transposons, and pathogenicity islands can be transferred horizontally and influence virulence. In contrast, eukaryotic genomes are larger with linear chromosomes, more non-coding DNA, introns, and complex gene regulation.
This document provides an overview of a talk on genome curation and manual annotation using the Apollo genome annotation tool. The talk aims to help scientists understand the genome curation process from assembled genome to automated and manual annotation. It will introduce Apollo and teach how to identify homologs of known genes, corroborate and modify automated gene models using evidence in Apollo. The talk will also refresh attendees on key biological concepts like the definition of a gene, central dogma, transcription, and translation to better understand manual annotation.
Introduction to Apollo: A webinar for the i5K Research Community (Monica Munoz-Torres)
This document provides an introduction and outline for a webinar on using the Apollo genome annotation editing tool. It was presented by Monica Munoz-Torres of BBOP to the i5K Research Community. The webinar aimed to help participants better understand genome curation in the context of automated and manual annotation. It also aimed to familiarize participants with Apollo's functionality and how to identify homologs of known genes, corroborate gene models using evidence, and modify automated annotations in Apollo. The document includes sections on genome sequencing projects, the objectives and uses of genome annotation, and a biological refresher on concepts relevant to manual annotation like genes, transcription, translation, and genome curation steps.
The document discusses key differences between prokaryotic and eukaryotic genes. Prokaryotic genes have a simple structure with a promoter, protein coding sequence, and terminator. Eukaryotic genes have a more complex structure with exons that can be separated by large introns. The document also discusses how gene duplication, horizontal gene transfer, and mutations can drive evolution. It notes that while most mutations are deleterious, some provide adaptations that allow organisms to escape natural selection.
DNA is among the largest molecules known. A single, unbroken strand of it can contain many millions of atoms. When released from a cell, DNA typically breaks up into countless fragments. In solution, these strands carry a negative electric charge, a fact that makes for some fascinating chemistry.
This document provides an overview of genetics and DNA. It discusses the location of DNA in cells, the structure of DNA as a double helix, the discovery of DNA as the genetic material through experiments in the 1900s, and DNA's role in inheritance and protein coding. It also summarizes DNA replication, repair, cloning techniques like recombinant DNA and PCR, DNA fingerprinting, and applications of biotechnology including genetically modified bacteria, plants and animals.
This document provides an overview of genetics and DNA. It discusses the location of DNA in cells, the structure of DNA, the discovery of DNA's role as the genetic material, DNA replication, and applications of DNA technology. Some key points include: DNA is found coiled in chromatin in the nucleus; it has a double-helix structure with adenine pairing with thymine and guanine pairing with cytosine; experiments in the 1950s showed that DNA carries genetic information; DNA replicates semi-conservatively before cell division; and recombinant DNA techniques have led to uses like producing insulin from bacteria.
This document provides an overview of genetics and DNA. It discusses the location of DNA in cells, DNA structure including the double helix formation, the discovery of DNA as the genetic material through experiments, the role of DNA in inheritance and protein coding, DNA replication, repair, and cloning techniques like recombinant DNA and PCR. It also describes uses of biotechnology like producing insulin through transgenic bacteria and developing pest-resistant plants.
This document provides an overview of genetics and DNA. It discusses the location of DNA in cells, the structure of DNA as a double helix, the discovery of DNA as the genetic material through experiments in the 1900s, and DNA's role in inheritance and protein production. It also summarizes DNA replication, repair, cloning techniques like recombinant DNA and PCR, DNA fingerprinting, and applications of biotechnology including genetically modified bacteria, plants and animals.
This document provides an overview of genetics and DNA. It discusses the location of DNA in cells, the structure of DNA, the discovery of DNA's role as the genetic material, DNA replication, and applications of DNA technology. Some key points include: DNA is found coiled in chromatin in the nucleus; it has a double helix structure with adenine pairing with thymine and guanine pairing with cytosine; experiments in the 1950s showed that DNA carries genetic information; DNA replicates semi-conservatively before cell division; and recombinant DNA techniques have led to production of insulin and other proteins through transgenic bacteria.
This document provides an overview of genetics and DNA. It discusses the location of DNA in cells, its double helix structure, and the key discoveries around DNA's role as the genetic material. These include Griffith's experiments showing transformation in bacteria and Hershey and Chase's experiments demonstrating that DNA enters the host cell during viral infection. It also summarizes Watson and Crick's proposal of the DNA double helix model and the roles of DNA in replication, transcription, and protein production. The document concludes by covering DNA cloning techniques like recombinant DNA and PCR, as well as applications in biotechnology including genetically modified organisms.
Unit 1 genetics nucleic acids DNA (1) Biology aid Lassie sibanda
These slides will help those who love biology but yet find it so hard to break down see how easy and interesting life science is. hope these improve your knowledge
This document provides an overview of genetics and DNA. It discusses the location of DNA in cells, its double helix structure, and its role in inheritance and coding for proteins. Key events in discovering DNA's structure are summarized, such as experiments showing DNA is the genetic material in viruses and bacteria. The document also covers DNA replication, transcription, and repair. It introduces techniques like DNA cloning, polymerase chain reaction, and DNA fingerprinting used in biotechnology and forensics. Examples are given of genetically modified bacteria, plants, and animals produced using these methods.
This document provides an overview of genetics and DNA. It discusses the location of DNA in cells, the structure of DNA as a double helix, the discovery of DNA as the genetic material through experiments in the 1900s, and DNA's role in inheritance and protein coding. It also summarizes DNA replication, repair, cloning techniques like recombinant DNA and PCR, DNA fingerprinting, and applications of biotechnology including genetically modified bacteria, plants and animals.
Similar to UC Davis EVE161 Lecture 10 by @phylogenomics (20)
1. Lecture 10:
EVE 161: Microbial Phylogenomics
Lecture #10: Era III: Genome Sequencing
UC Davis, Winter 2014
Instructor: Jonathan Eisen
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
2. Where we are going and where we have been
• Previous lecture:
– 9: rRNA Case Study - Built Environment
• Current Lecture:
– 10: Genome Sequencing
• Next Lecture:
– 11: Genome Sequencing II
4. insight progress
1. Library construction
(i) Isolate DNA
(ii) Fragment DNA
(iii) Clone DNA
2. Random sequencing phase
(i) Sequence DNA (15,000 sequences per Mb)
3. Closure phase
(i) Assemble sequences
(ii) Close gaps
(iii) Edit
(iv) Annotation
4. Complete genome sequence
Figure 1 Diagram depicting the steps in a whole-genome shotgun sequencing project.
... analysis of the genomes of two thermophilic bacterial species, Aquifex aeolicus and Thermotoga maritima, revealed that 20–25% of the genes in these species were more similar to genes from archaea than those from bacteria [13,14]. This led to the suggestion of possible extensive gene exchanges between these species and archaeal ...
... be extensive, it is somehow constrained by phylogenetic relationships. Other evidence for a ‘core’ of particular lineages comes from the finding of a conserved core of euryarchaeal genomes [21,22] and another finding that some types of gene might be more prone to gene transfer than others [23]. It seems therefore likely that horizontal gene ...
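The "15,000 sequences per Mb" figure in the random sequencing phase can be related to expected coverage with a short calculation. The sketch below is only an illustration: the 500 bp read length is an assumption (the slide does not state one), and the uncovered-fraction estimate uses the standard Lander-Waterman approximation exp(-c) rather than anything taken from the slide.

```python
import math

def shotgun_coverage(num_reads, read_length, genome_size):
    """Average fold coverage c = N * L / G."""
    return num_reads * read_length / genome_size

def expected_uncovered_fraction(coverage):
    """Lander-Waterman approximation: fraction of bases hit by no read."""
    return math.exp(-coverage)

# 15,000 reads per Mb comes from the figure; the 500 bp read length is an assumption.
c = shotgun_coverage(num_reads=15_000, read_length=500, genome_size=1_000_000)
print(f"coverage = {c:.1f}x, uncovered fraction = {expected_uncovered_fraction(c):.6f}")
```

Even at several-fold coverage a small fraction of bases is expected to be missed by random reads, which is one way to see why a separate closure phase follows the random phase.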
7. TIGR
8. Why Completeness is Important
• Improves characterization of genome features
– Gene order, replication origins
• Better comparative genomics
– Genome duplications, inversions
• Presence and absence of particular genes can be very important
• Missing sequence might be important (e.g., centromere)
• Allows researchers to focus on biology not sequencing
• Facilitates large scale correlation studies
9. General Steps in Analysis of Complete Genomes
• Identification/prediction of genes
• Characterization of gene features
• Characterization of genome features
• Prediction of gene function
• Prediction of pathways
• Integration with known biological data
• Comparative genomics
10. General Steps in Analysis of Complete Genomes
• Structural Annotation
– Identification/prediction of genes
– Characterization of gene features
– Characterization of genome features
• Functional Annotation
– Prediction of gene function
– Prediction of pathways
– Integration with known biological data
• Evolutionary Annotation
– Comparative genomics
11. Structural Annotation I: Genes in Genomes
• Protein coding genes.
– In long open reading frames
– ORFs interrupted by introns in eukaryotes
– Take up most of the genome in prokaryotes, but only a small portion of the eukaryotic genome
• RNA-only genes
– Transfer RNA
– ribosomal RNA
– snoRNAs (guide ribosomal and transfer RNA maturation)
– intron splicing
– guiding mRNAs to the membrane for translation
– gene regulation (this is a growing list)
12. Structural Annotation II: Other Features to Find
• Gene control sequences
– Promoters
– Regulatory elements
• Transposable elements, both active and defective
– DNA transposons and retrotransposons
– Many types and sizes
• Other repeated sequences.
– Centromeres and telomeres
– Many with unknown (or no) function
• Unique sequences that have no obvious function
13. How to Find ncRNAs
• The most universal genes, such as tRNA and rRNA, are very conserved and thus easy to detect. Finding them first removes some areas of the genome from further consideration.
• One easy approach to finding common RNA genes is just looking for sequence homology with related species: a BLAST search will find most of them quite easily.
• Functional RNAs are characterized by secondary structure caused by base pairing within the molecule.
• Determining the folding pattern is a matter of testing many possibilities to find the one with the minimum free energy, which is the most stable structure.
• The free energy calculations are in turn based on experiments where short synthetic RNA molecules are melted.
• Related to this is the concept that paired regions (stems) will be conserved across species lines even if the individual bases aren’t conserved. That is, if there is an A-U pairing in one species, the same position might be occupied by a G-C in another species.
• This is an example of compensatory evolution (covariation): a deleterious mutation at one site is cancelled by a compensating mutation at another site.
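The idea that a stem stays paired across species even when the individual bases change can be checked on two columns of an RNA alignment. The following is a minimal sketch, not a published method: the function name and the rule "both columns vary and every observed combination can pair" are assumptions made for illustration.

```python
# Pairs that can form in RNA: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def covarying_stem_position(col_i, col_j):
    """Return True if two alignment columns look like a base pair that was
    maintained by compensating changes across species.

    col_i and col_j hold one base per species, in the same species order.
    """
    observed = set(zip(col_i, col_j))
    always_paired = all(pair in PAIRS for pair in observed)
    # Require variation in both columns, so a single-site change does not count.
    both_vary = len(set(col_i)) > 1 and len(set(col_j)) > 1
    return always_paired and both_vary

# Species 1 and 3 have A-U at this position, species 2 has G-C.
print(covarying_stem_position("AGA", "UCU"))  # True
```

A perfectly conserved pair (for example A-U in every species) returns False here, because it gives no evidence of compensating change even though it is consistent with a stem.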
14. RNA Structure
• RNA differs from DNA in having fairly common G-U base pairs. Also, many functional RNAs have unusual modified bases such as pseudouridine and inosine.
• The pseudoknot, pairing between a loop and a sequence outside its stem, is especially difficult to detect: computationally intense and not subject to the normal situation that RNA base pairing follows a nested pattern
– But pseudoknots seem to be fairly rare.
• Essentially, RNA folding programs start with all possible short sequences, then build to larger ones, adding the contribution of each structural element.
– There is an element of dynamic programming here as well.
– And, “stochastic context-free grammars”, something I really don’t want to approach right now!
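The "start with short sequences, then build to larger ones" strategy can be sketched with a Nussinov-style dynamic program. This is a deliberate simplification: it maximizes the number of nested base pairs instead of minimizing free energy, it ignores pseudoknots, and the minimum loop size of 3 is an assumed parameter. Real folding programs use measured thermodynamic parameters.

```python
def max_base_pairs(seq, min_loop=3):
    """Nussinov-style dynamic programming: maximum number of nested base pairs.

    best[i][j] holds the answer for the subsequence seq[i..j]; short spans are
    filled in first and longer spans are built from them.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):          # span = j - i
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]               # case 1: base j is unpaired
            for k in range(i, j - min_loop):     # case 2: base j pairs with base k
                if (seq[k], seq[j]) in pairs:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1] if n else 0

print(max_base_pairs("GGGAAAUCC"))  # 3: a three-pair stem closing an AAA loop
```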
15. Finding tRNAs
• tRNAs have a highly conserved structure, with 3 main stem-and-loop structures that form a cloverleaf structure, and several conserved bases. Finding such sequences is a matter of looking in the DNA for the proper features located the proper distance apart.
• Looking for such sequences is well-suited to a decision tree, a series of steps that the sequence must pass.
• In addition, a score is kept, rating how well the sequence passed each step. This allows a more stringent analysis later on, to eliminate false positives.
16. Bacterial / Archaeal Protein Coding Genes
• Bacteria use ATG as their main start codon, but GTG and TTG are also fairly common, and a few others are occasionally used.
– Remember that start codons are also used internally: the actual start codon may not be the first one in the ORF.
• The stop codons are the same as in eukaryotes: TGA, TAA, TAG
– stop codons are (almost) absolute: except for a few cases of programmed frameshifts and the use of TGA for selenocysteine, the stop codon at the end of an ORF is the end of protein translation.
• Genes can overlap by a small amount. Not much, but a few codons of overlap is common enough so that you can’t just eliminate overlaps as impossible.
• Cross-species homology works well for many genes. It is very unlikely that non-coding sequence will be conserved.
– But, a significant minority of genes (say 20%) are unique to a given species.
• Translation start signals (ribosome binding sites; Shine-Dalgarno sequences) are often found just upstream from the start codon
– however, some aren’t recognizable
– genes in operons don’t always have a separate ribosome binding site for each gene
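The start and stop codon rules above translate into a simple open reading frame scan. The sketch below is illustrative only: it looks at the three forward frames (a real gene finder would also scan the reverse complement), the `min_codons` cutoff is an assumed parameter, and every in-frame start codon is reported because the first one is not necessarily the real start.

```python
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TGA", "TAA", "TAG"}

def find_orfs(dna, min_codons=30):
    """Return (start, end, candidate_starts) for ORFs in the three forward frames.

    start is the first in-frame start codon after the previous stop, end is the
    position just past the stop codon, and candidate_starts lists every in-frame
    start codon inside the ORF.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        starts = []                       # start-codon positions since the last stop
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if codon in START_CODONS:
                starts.append(pos)
            elif codon in STOP_CODONS:
                if starts and (pos - starts[0]) // 3 >= min_codons:
                    orfs.append((starts[0], pos + 3, starts))
                starts = []
    return orfs

# Toy sequence: ATG at position 2, an internal GTG at position 8, TAA stop at 14.
print(find_orfs("CCATGAAAGTGCCCTAAGG", min_codons=3))  # [(2, 17, [2, 8])]
```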
17. Composition Methods
• The frequency of various codons is different in coding regions as compared to non-coding regions.
– This extends to G-C content, dinucleotide frequencies, and other measures of composition. Dicodons (groups of 6 bases) are often used
– Well documented experimentally.
• The composition varies between different proteins of course, and it is affected within a species by the amounts of the various tRNAs present
– horizontally transferred genes can also confuse things: they tend to have compositions that reflect their original species.
– A second group with unusual compositions are highly expressed genes.
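A composition method can be sketched as a log-odds score of codon usage. The example below is a simplified illustration under stated assumptions: it uses single codons rather than dicodons, a uniform 1/64 background in place of a real non-coding model, and a pseudocount of 1. The function names are hypothetical.

```python
import math
from collections import Counter

def codons(seq):
    """Split a sequence into consecutive in-frame codons."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]

def train_codon_model(coding_seqs, pseudocount=1.0):
    """Estimate codon frequencies from known coding sequences."""
    counts = Counter(c for s in coding_seqs for c in codons(s.upper()))
    total = sum(counts.values()) + 64 * pseudocount
    return lambda codon: (counts[codon] + pseudocount) / total

def coding_score(seq, model):
    """Mean log-odds per codon against a uniform background (assumed 1/64)."""
    cs = codons(seq.upper())
    if not cs:
        return 0.0
    return sum(math.log(model(c) * 64) for c in cs) / len(cs)

model = train_codon_model(["ATGGCTGCTGCTTAA"])   # toy training set
print(coding_score("GCTGCTGCT", model) > coding_score("CCCGGGAAA", model))  # True
```

A positive score means the candidate uses codons that are common in the training genes. Horizontally transferred or highly expressed genes would be expected to score unusually under such a model, as the slide notes.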
18. Eukaryotic Genes Harder to Find
• Some fundamental differences between prokaryotes and eukaryotes:
• There is lots of non-coding DNA in eukaryotes.
– First step: find repeated sequences and RNA genes
– Note that eukaryotes have 3 main RNA polymerases. RNA polymerase 2 (pol2) transcribes all protein-coding genes, while pol1 and pol3 transcribe various RNA-only genes.
• Most eukaryotic genes are split into exons and introns.
• Only 1 gene per transcript in eukaryotes.
• No ribosome binding sites: translation starts at the first ATG in the mRNA
– thus, in eukaryotic genomes, searching for the transcription start site (TSS) makes sense.
• Many fewer eukaryotic genomes have been sequenced
19. Exons
• Exon sequences can often be identified by sequence conservation, at least roughly.
• Dicodon statistics, as used for prokaryotes, are also useful
– eukaryotic genomes tend to contain many isochores, regions of different GC content, and composition statistics can vary between isochores.
• The initial and terminal exons contain untranslated regions, and thus special methods are needed to detect them.
• Predicting splice junctions is a matter of collecting information about the sequences surrounding each possible GT/AG pair, then running this information through some combination of decision tree, Markov models, discriminant analysis, or neural networks, in an attempt to massage the data into giving a reliable score.
– In general, sites are more likely to be correct if predicted by multiple methods
– Experimental data from ESTs can be very helpful here.
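The first step of splice junction prediction, listing every possible donor/acceptor pair, can be sketched as below. This assumes the canonical intron boundaries (GT at the 5' end, AG at the 3' end); the `min_len` cutoff is an assumed parameter, and the scoring models mentioned on the slide are not implemented here.

```python
def candidate_introns(dna, min_len=20):
    """List (start, end) spans that begin with GT and end with AG.

    start is the index of the G in GT; end is the index just past the G in AG,
    so dna[start:end] is the candidate intron.
    """
    dna = dna.upper()
    donors = [i for i in range(len(dna) - 1) if dna[i:i + 2] == "GT"]
    acceptors = [i + 2 for i in range(len(dna) - 1) if dna[i:i + 2] == "AG"]
    return [(d, a) for d in donors for a in acceptors if a - d >= min_len]

# Toy example with a tiny length cutoff so the result is easy to read.
print(candidate_introns("AAGTCCCCAGTT", min_len=4))  # [(2, 10)]
```

The number of candidates grows quickly with sequence length, which is why each pair then needs a score from the surrounding sequence context.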
21. Functional Classification I: GO
• The Gene Ontology (GO) consortium (http://www.geneontology.org/) is an attempt to describe gene products with a structured controlled vocabulary, a set of invariant terms that have a known relationship to each other.
• Each GO term is given a number of the form GO:nnnnnnn (7 digits), as well as a term name. For example, GO:0005102 is “receptor binding”.
• There are 3 root terms: biological process, cellular component, and molecular function. A gene product will probably be described by GO terms from each of these “ontologies”. (Ontology is a branch of philosophy concerned with the nature of being, and the basic categories of being and their relationships.)
– For instance, cytochrome c is described with the molecular function term “oxidoreductase activity”, the biological process terms “oxidative phosphorylation” and “induction of cell death”, and the cellular component terms “mitochondrial matrix” and “mitochondrial inner membrane”
• The terms are arranged in a hierarchy that is a “directed acyclic graph” and not a tree. This means simply that each term can have more than one parent term, but the direction of parent to child (i.e. less specific to more specific) is always maintained.
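The difference between a tree and a directed acyclic graph can be made concrete with a small parent map. The sketch below uses made-up term labels ("root", "A", "B", "C", "D"), not real GO identifiers, and the `ancestors` helper is hypothetical; it only shows how a term with two parents inherits all of their ancestors.

```python
def ancestors(term, parents):
    """Collect every less specific term reachable by following parent links."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen

# Hypothetical toy ontology: term C has two parents, which a tree would not allow.
toy_parents = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "C": ["A", "B"],
    "D": ["C"],
}
print(sorted(ancestors("D", toy_parents)))  # ['A', 'B', 'C', 'root']
```

A gene product annotated with the most specific term is implicitly annotated with every ancestor as well.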
22. Functional Classification II: Enzyme Nomenclature
• Enzyme functions: which reactants are converted to which products
• Enzyme functions are given unique numbers by the Enzyme Commission.
– Across many species, the enzymes that perform a specific function are usually evolutionarily related. However, this isn’t necessarily true. There are cases of two entirely different enzymes evolving similar functions.
– Often, two or more gene products in a genome will have the same E.C. number.
– E.C. numbers are four integers separated by dots. The left-most number is the least specific
– For example, the tripeptide aminopeptidases have the code "EC 3.4.11.4", whose components indicate the following groups of enzymes:
• EC 3 enzymes are hydrolases (enzymes that use water to break up some other molecule)
• EC 3.4 are hydrolases that act on peptide bonds
• EC 3.4.11 are those hydrolases that cleave off the amino-terminal amino acid from a polypeptide
• EC 3.4.11.4 are those that cleave off the amino-terminal end from a tripeptide
• Top level E.C. numbers:
– E.C. 1: oxidoreductases (often dehydrogenases): electron transfer
– E.C. 2: transferases: transfer of functional groups (e.g. phosphate) between molecules.
– E.C. 3: hydrolases: splitting a molecule by adding water to a bond.
– E.C. 4: lyases: non-hydrolytic addition or removal of groups from a molecule
– E.C. 5: isomerases: rearrangements of atoms within a molecule
– E.C. 6: ligases: joining two molecules using energy from ATP
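Because the left-most number is the least specific, an E.C. number can be expanded into its chain of increasingly specific groups. The sketch below is a small illustration: the function names are hypothetical, the class table simply copies the six top-level classes listed on the slide, and only the "EC n.n.n.n" spelling is handled.

```python
# Top-level classes as listed on the slide.
EC_CLASSES = {
    1: "oxidoreductases",
    2: "transferases",
    3: "hydrolases",
    4: "lyases",
    5: "isomerases",
    6: "ligases",
}

def ec_lineage(ec_number):
    """Expand 'EC 3.4.11.4' into ['3', '3.4', '3.4.11', '3.4.11.4']."""
    parts = ec_number.replace("EC", "").strip().split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a four-part E.C. number: {ec_number!r}")
    return [".".join(parts[:i]) for i in range(1, 5)]

def ec_top_class(ec_number):
    """Name of the top-level class, taken from the table above."""
    return EC_CLASSES[int(ec_lineage(ec_number)[0])]

print(ec_lineage("EC 3.4.11.4"))    # ['3', '3.4', '3.4.11', '3.4.11.4']
print(ec_top_class("EC 3.4.11.4"))  # hydrolases
```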
23. Functional Prediction
• BLAST searches
• HMM models of specific genes or gene families (Pfam, TIGRfam, FIGfam).
• Sequence motifs and domains. If the gene is not a good match to previously known genes, these provide useful clues.
• Cellular location predictions, especially for transmembrane proteins.
• Genomic neighbors, especially in bacteria, where related functions are often found together in operons and divergons (genes transcribed in opposite directions that use a common control region).
• Biochemical pathway/subsystem information. If an organism has most of the genes needed to perform a function, any missing functions are probably present too.
– Also, experimental data about an organism’s capacities can be used to decide whether the relevant functions are present in the genome.
24. Functional Prediction II: Membrane Spanning
• Integral membrane proteins contain amino acid sequences that go through the membrane one or several times.
– There are also peripheral membrane proteins that stick to the hydrophilic head groups by ionic and polar interactions
– There are also some that have covalently bound hydrophobic groups, such as myristate, a 14 carbon saturated fatty acid that is attached to the N-terminal amino group.
• There are 2 main protein structures that cross membranes.
– Most are alpha helices, and in proteins that span multiple times, these alpha helices are packed together in a coiled-coil. Length = 15-30 amino acids.
– Less commonly, there are proteins with membrane spanning “beta barrels”, composed of beta sheets wrapped into a cylinder. An example: porins, which form channels that let water and small hydrophilic molecules cross the membrane.
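Membrane-spanning alpha helices are usually predicted from runs of hydrophobic residues. A minimal sketch of one common approach, a sliding-window hydropathy average, is shown below. The slides do not name a method; the Kyte-Doolittle scale, the 19-residue window, and the 1.6 threshold are conventional choices assumed here, and this approach does not detect beta barrels.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(protein, window=19):
    """Average hydropathy of every window of the given length."""
    values = [KD[aa] for aa in protein.upper()]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

def has_candidate_tm_helix(protein, window=19, threshold=1.6):
    """True if any window is hydrophobic enough to suggest a membrane helix."""
    return any(score >= threshold for score in hydropathy_profile(protein, window))

print(has_candidate_tm_helix("L" * 25))  # True: uniformly hydrophobic
print(has_candidate_tm_helix("D" * 25))  # False: uniformly charged
```

The window length sits inside the 15-30 residue range the slide gives for a membrane-spanning helix.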
25. Functional Prediction by Phylogeny
• Key step in genome projects
• More accurate predictions help guide experimental and computational analyses
• Many diverse approaches
• All improved both by “phylogenomic” type analyses that integrate evolutionary reconstructions and understanding of how new functions evolve
26. Functional Prediction
• Identification of motifs
– Short regions of sequence similarity that are indicative of general activity
– e.g., ATP binding
• Homology/similarity based methods
– Gene sequence is searched against a database of other sequences
– If significantly similar genes are found, their functional information is used
• Problem
– Genes frequently have similarity to hundreds of motifs and multiple genes, not all with the same function
28. H. pylori genome - 1997
“The ability of H. pylori to perform mismatch repair is suggested by the presence of methyl transferases, mutS and uvrD. However, orthologues of MutH and MutL were not identified.”
30. Phylogenetic Tree of MutS Family
[Figure: phylogenetic tree of MutS family proteins. Tip labels include Ecoli, Bacsu, Helpy, Aquae, Deira, Yeast, Human, Mouse, Arath, and others.]
Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.
32. Overlaying Functions onto Tree
[Figure: phylogenetic tree of the MutS family with subfamily labels overlaid: MutS1, MutS2, MSH1, MSH2, MSH3, MSH4, MSH5, MSH6.]
Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.
33. MutS Subfamilies
• MutS1: Bacterial MMR
• MSH1: Euk - mitochondrial MMR
• MSH2: Euk - all MMR in nucleus
• MSH3: Euk - loop MMR in nucleus
• MSH6: Euk - base:base MMR in nucleus
• MutS2: Bacterial - function unknown
• MSH4: Euk - meiotic crossing-over
• MSH5: Euk - meiotic crossing-over
34. Functional Prediction Using Tree
[Figure: phylogenetic tree of the MutS family with each subfamily labeled by function.]
• MutS1 - Bacterial Mismatch and Loop Repair
• MutS2 - Unknown Functions
• MSH1 - Mitochondrial Repair
• MSH2 - Eukaryotic Nuclear Mismatch and Loop Repair
• MSH3 - Nuclear Repair Of Loops
• MSH4 - Meiotic Crossing Over
• MSH5 - Meiotic Crossing Over
• MSH6 - Nuclear Repair Of Mismatches
Based on Eisen, 1998 Nucl Acids Res 26: 4291-4300.
36. Blast Search of H. pylori “MutS”
Sequences producing significant alignments:            Score (bits)   E Value
sp|P73625|MUTS_SYNY3  DNA MISMATCH REPAIR PROTEIN      117            3e-25
sp|P74926|MUTS_THEMA  DNA MISMATCH REPAIR PROTEIN      69             1e-10
sp|P44834|MUTS_HAEIN  DNA MISMATCH REPAIR PROTEIN      64             3e-09
sp|P10339|MUTS_SALTY  DNA MISMATCH REPAIR PROTEIN      62             2e-08
sp|O66652|MUTS_AQUAE  DNA MISMATCH REPAIR PROTEIN      57             4e-07
sp|P23909|MUTS_ECOLI  DNA MISMATCH REPAIR PROTEIN      57             4e-07
• Blast search pulls up Syn. sp MutS#2 with a much higher score (much lower E value) than other MutS homologs
• Based on this TIGR predicted this species had mismatch repair
Based on Eisen et al. 1997 Nature Medicine 3: 1076-1078.
37. High Mutation Rate in H. pylori
Based on Eisen et al. 1997 Nature Medicine 3: 1076-1078.
38. Phylogenomics
PHYLOGENETIC PREDICTION OF GENE FUNCTION
[Figure: flowchart of the method, illustrated side by side with Example A and Example B. The steps are:]
• Choose gene(s) of interest
• Identify homologs
• Align sequences
• Calculate gene tree
• Overlay known functions onto tree
• Infer likely function of gene(s) of interest
[Final panel: actual evolution (assumed to be unknown), showing a gene duplication that gives paralogs A and B in Species 1, 2 and 3 (1A/1B, 2A/2B, 3A/3B). Some inferences in the examples are marked "Ambiguous".]
Based on Eisen, 1998 Genome Res 8: 163-167.
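The last two steps of the method (overlay known functions onto the tree, then infer the likely function of the gene of interest) can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the cited paper: the tree is a hand-written nested tuple, the function labels are invented, and the rule used here (take the smallest clade around the query that contains annotated genes, and call the result ambiguous if those genes disagree) is an assumed simplification.

```python
def leaves(node):
    """List all leaf names under a node (a leaf is a plain string)."""
    if isinstance(node, str):
        return [node]
    found = []
    for child in node:
        found.extend(leaves(child))
    return found


def path_to(node, target):
    """Return the chain of nested clades from the root down to the target leaf."""
    if isinstance(node, str):
        return [node] if node == target else None
    for child in node:
        sub = path_to(child, target)
        if sub is not None:
            return [node] + sub
    return None


def infer_function(tree, known, query):
    """Predict a function for `query` from the nearest annotated clade."""
    path = path_to(tree, query)
    if path is None:
        raise ValueError(f"{query} is not in the tree")
    # Walk outward from the smallest clade that contains the query.
    for clade in reversed(path[:-1]):
        funcs = {known[name] for name in leaves(clade) if name in known}
        if len(funcs) == 1:
            return funcs.pop()
        if len(funcs) > 1:
            return "ambiguous"
    return "unknown"


# Toy gene tree mirroring the slide: a duplication separates the A and B paralogs.
tree = ((("1A", "2A"), "3A"), (("1B", "2B"), "3B"))
known = {"1A": "function A", "3A": "function A",
         "1B": "function B", "2B": "function B"}

print(infer_function(tree, known, "2A"))
print(infer_function(tree, known, "3B"))
```

Because the prediction follows the tree rather than the single best database hit, a query that falls inside the B clade is assigned the B function even if an A paralog happens to score slightly better in a similarity search.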
40. Chemosynthetic Symbionts
Eisen et al. 1992. J. Bact. 174: 3416
41. Carboxydothermus hydrogenoformans
• Isolated from a Russian hot spring
• Thermophile (grows at 80°C)
• Anaerobic
• Grows very efficiently on CO (carbon monoxide)
• Produces hydrogen gas
• Low GC Gram positive (Firmicute)
• Genome determined (Wu et al. 2005 PLoS Genetics 1: e65.)
42. Homologs of Sporulation Genes
Wu et al. 2005 PLoS Genetics 1: e65.
43. Carboxydothermus sporulates
Wu et al. 2005 PLoS Genetics 1: e65.
44. Non-Homology Predictions: Phylogenetic Profiling
• Step 1: Search all genes in organisms of interest against all other genomes
• Ask: Yes or No, is each gene found in each other species
• Cluster genes by distribution patterns (profiles)
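The three steps above can be sketched as follows. The presence/absence calls are invented toy data standing in for real search results, the gene and genome names are hypothetical, and grouping by identical profile is the simplest possible clustering (real analyses typically allow near matches, for which the `hamming` helper gives a distance).

```python
from collections import defaultdict

# Hypothetical yes/no answers: is each gene found in each other genome?
genomes = ["sp1", "sp2", "sp3", "sp4"]
presence = {
    "geneA": {"sp1": True, "sp2": True, "sp3": False, "sp4": False},
    "geneB": {"sp1": True, "sp2": True, "sp3": False, "sp4": False},
    "geneC": {"sp1": True, "sp2": True, "sp3": True, "sp4": True},
    "geneD": {"sp1": False, "sp2": True, "sp3": False, "sp4": False},
}


def profile(gene_hits, genomes):
    """Turn one gene's yes/no answers into a 0/1 tuple in a fixed genome order."""
    return tuple(1 if gene_hits.get(g, False) else 0 for g in genomes)


def group_by_profile(presence, genomes):
    """Cluster genes that share exactly the same distribution pattern."""
    groups = defaultdict(list)
    for gene in sorted(presence):
        groups[profile(presence[gene], genomes)].append(gene)
    return dict(groups)


def hamming(p, q):
    """Number of genomes in which two profiles disagree."""
    return sum(a != b for a, b in zip(p, q))


groups = group_by_profile(presence, genomes)
for pattern, genes in groups.items():
    print(pattern, genes)
```

Genes that land in the same group (here geneA and geneB) are candidates for a shared function, even when they have no sequence similarity to each other.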
45. Sporulation Gene Profile
Wu et al. 2005 PLoS Genetics 1: e65.
46. B. subtilis new sporulation genes
47. Functional Prediction III: Colocalization
• Operon structure is often maintained over fairly large taxonomic ranges.
  – Sometimes gene order is altered, and sometimes one or more enzymes are missing.
  – But in general, this phenomenon allows recognition or verification that widely diverged enzymes do in fact have the same function.
• This is an operon that contains part of the glycolytic pathway.
  – 1: phosphoglycerate mutase
  – 2: triosephosphate isomerase
  – 3: enolase
  – 4: phosphoglycerate kinase
  – 5: glyceraldehyde 3-phosphate dehydrogenase
  – 6: central glycolytic gene regulator
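One simple way to look for conserved colocalization is to count how often two genes sit next to each other across several genomes. The sketch below does this for made-up gene orders. The genome names, the gene orders, and the filler genes `otherZ` and `geneQ` are all hypothetical, and the abbreviations (GAPDH, PGK, TPI, PGM) are shorthand for the enzymes listed above.

```python
from collections import Counter


def adjacent_pairs(gene_order):
    """Unordered pairs of genes that are immediate neighbors in one genome."""
    return {frozenset(pair)
            for pair in zip(gene_order, gene_order[1:])
            if pair[0] != pair[1]}


def conserved_neighbors(genomes, min_genomes=2):
    """Keep gene pairs that are adjacent in at least `min_genomes` genomes."""
    counts = Counter()
    for order in genomes.values():
        counts.update(adjacent_pairs(order))
    return {pair: n for pair, n in counts.items() if n >= min_genomes}


# Hypothetical gene orders: one full operon, one with genes missing,
# and one with altered order.
genomes = {
    "genomeX": ["regulator", "GAPDH", "PGK", "TPI", "PGM", "enolase"],
    "genomeY": ["GAPDH", "PGK", "TPI", "otherZ", "enolase"],
    "genomeZ": ["PGK", "GAPDH", "geneQ", "TPI", "PGM"],
}

conserved = conserved_neighbors(genomes)
for pair, n in sorted(conserved.items(), key=lambda kv: -kv[1]):
    print(sorted(pair), "adjacent in", n, "genomes")
```

Using unordered pairs means a flipped gene order (as in genomeZ) still counts as the same neighborhood, which fits the slide's point that order is sometimes altered while the association persists.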
53. After the Genomes
• Better analysis and annotation
• Comparative genomics
• Functional genomics (experimental analysis of gene function on a genome scale)
• Genome-wide gene expression studies
• Proteomics
• Genome-wide genetic experiments