Gene prediction is the process of determining where a coding gene might be in a genomic sequence. A protein-coding region must begin with a start codon (where translation begins) and end with a stop codon (where translation ends).
This document discusses nucleic acid probes and their use in hybridization experiments. It notes that probes are short sequences of nucleotides that bind to specific target sequences. The degree of homology between the probe and target determines how stable the hybridization is. Probes can range in size from 10 to over 10,000 nucleotide bases, with most common probes being 14 to 40 bases. Short probes hybridize quickly but have less specificity, while longer probes hybridize more stably. The document then describes different methods for labeling probes, including nick translation, primer extension, RNA polymerase transcription, end-labeling, and direct labeling. It also discusses factors that affect probe specificity and hybridization conditions.
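The link between probe composition and hybridization stability can be illustrated with a rough melting-temperature estimate. The sketch below uses the Wallace rule of thumb (2 °C per A/T and 4 °C per G/C), which is an assumption brought in for illustration and is not taken from the summarized document. It is only a crude approximation for short oligonucleotide probes, and the example probe sequence is hypothetical.

```python
def wallace_tm(probe: str) -> int:
    """Rough melting temperature in deg C via the Wallace rule: 2*(A+T) + 4*(G+C).

    Only a rule of thumb for short oligonucleotides; it ignores salt,
    probe concentration, and mismatches.
    """
    seq = probe.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("probe must contain only A, C, G, T")
    return 2 * at + 4 * gc


# Hypothetical 14-base probe with 7 A/T and 7 G/C bases.
print(wallace_tm("ATGCGTACGTTAGC"))
```

A GC-rich probe of the same length gets a higher estimate, consistent with the idea that base composition as well as length influences how stable the hybrid is.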
The document discusses genome sequencing and related topics. It begins by defining what a genome is - the complete set of DNA in an organism. It then discusses the different types of genomes, such as prokaryotic and eukaryotic, including nuclear, mitochondrial, and chloroplast genomes. The document also defines genomics as the comprehensive study of whole genomes and all gene interactions, distinguishing it from traditional genetics which focuses on single genes. It outlines some key milestones in genomic sequencing and the technical foundations that enabled sequencing whole genomes. Finally, it describes the main approaches used for genome sequencing projects, including hierarchical shotgun sequencing and whole genome shotgun sequencing.
Comparative genomics in eukaryotes, organelles, by KAUSHAL SAHU
Comparative genomics involves comparing the genomic features of different organisms, such as DNA sequences, genes, and gene order. This field has revealed both similarities and differences between organisms that can provide insights into evolutionary relationships. Some of the first comparative genomic studies compared large DNA viruses. Since then, many complete genome sequences have been determined, including for yeast, fruit flies, worms, plants, mice, and humans. While humans have around 35,000 genes, complexity is not solely due to gene number. Comparative analysis of human and mouse genomes shows 40% sequence similarity and similar gene numbers, but different genome sizes. Mitochondrial genomes also yield insights when compared between domains of life. Computational tools like BLAST are used to facilitate genomic
Comparative genomics involves systematically comparing genome sequences from different organisms. It uses computer programs to identify homologous genomic regions and align sequences at the base-pair level. Comparing genomes at different phylogenetic distances can provide insights into gene structure/function, evolution, and characteristics unique to each organism. Key tools for comparative genomics include genome browsers, aligners, and databases that classify orthologous gene clusters conserved across species.
Yeast artificial chromosomes (YACs) are engineered DNA molecules that can clone and replicate large DNA sequences in yeast cells. YACs contain essential yeast elements like a centromere and telomeres that allow them to behave like natural yeast chromosomes. YACs can clone very large inserts of up to 10 megabases of foreign DNA, making them useful for generating whole genome libraries.
This document discusses the use of 16S ribosomal RNA (rRNA) gene sequencing for bacterial identification and phylogenetic analysis. It explains that the 16S rRNA gene is highly conserved, making it useful for comparing distantly related organisms. The document outlines the process of 16S rRNA gene sequencing, including PCR amplification using conserved primer regions and sequencing of variable regions. It also discusses various methods that have been developed using 16S rRNA, such as TRFLP profiling and ribotyping, to study microbial communities.
The document discusses several key aspects of gene prediction including:
1. Gene prediction algorithms use signals like start/stop codons, splice sites, and open reading frames to identify genes computationally, though accuracy varies with the organism and the method.
2. There are ab initio, homology-based, and probabilistic models like Hidden Markov Models that can predict prokaryotic and eukaryotic genes.
3. Eukaryotic gene prediction is more challenging due to larger genomes, lower gene density, and intron-exon structures. Programs must consider splicing, polyadenylation, and other post-transcriptional modifications.
This document discusses nucleotide probes, which are single-stranded DNA or RNA fragments that are labeled and complementary to a target DNA sequence. Probes can range in size from 15 base pairs to several hundred kilobases. They are used to identify a specific DNA fragment through base pairing. Probes must be labeled to be detected, typically through radioactive labeling or fluorescent tags. Labeling can occur on the end of the probe or through polymerase-based incorporation of multiple labeled nucleotides during DNA synthesis. Probes have various uses, including searching DNA libraries and diagnosing genetic disorders through techniques like Southern and Northern blotting.
This document discusses molecular probes, including their definition, types, preparation, and labeling. It describes the three main types of probes - oligonucleotide probes, DNA probes, and RNA probes. It explains how to prepare probes from genomic DNA, cDNA, synthetic oligonucleotides, and RNA. Methods of radioactive labeling including nick translation and oligonucleotide labeling are covered. Non-radioactive labeling using biotin and digoxigenin is also discussed. Finally, applications of molecular probes in identification of recombinant clones, fingerprinting, in situ hybridization, and medical research are summarized.
Automated DNA sequencing; Protein sequencing, by Rima Joseph
This document discusses several methods for DNA and protein sequencing. It describes automated DNA sequencing which is based on the Sanger method but uses fluorescent labels and allows direct computer storage of sequence data. It then discusses various methods for protein sequencing including purification, amino acid composition analysis, N-terminal sequencing using Edman degradation or other methods, C-terminal sequencing, breaking disulfide bonds, cleaving the protein into peptides, ordering peptides by overlap, and locating disulfide bonds. Newer methods discussed are using genomic data and mass spectrometry techniques.
Automated sequencing of genomes requires automated gene assignment
Includes detection of open reading frames (ORFs)
Identification of the introns and exons
Gene prediction is a very difficult problem in pattern recognition
Coding regions generally do not have conserved sequences
Much progress made with prokaryotic gene prediction
Eukaryotic genes more difficult to predict correctly
Secondary structure prediction tools analyze a protein's amino acid sequence to predict its local structural elements. These tools use various methods like Chou-Fasman, GOR, neural networks, and hidden Markov models to identify alpha helices and beta sheets based on characteristics like residue propensity values, sequence homology, and patterns in windows of amino acids. Accurate prediction of secondary structure is an important step toward determining a protein's tertiary structure and biological role.
Functional proteomics, methods and tools, by KAUSHAL SAHU
INTRODUCTION
HISTORY
DEFINITION
PROTEOMICS
FUNCTIONAL PROTEOMICS
PROTEOMICS SOFTWARE
PROTEOMICS ANALYSIS
TOOLS FOR PROTEOME ANALYSIS
DIFFERENT METHODS FOR STUDY OF FUNCTIONAL PROTEOMICS
APPLICATIONS
LIMITATIONS
CONCLUSION
An open reading frame (ORF) is the part of a reading frame that contains no stop codons, that is, a continuous run of amino-acid-coding triplet codons.
An ORF starts with a start codon and ends at a stop codon.
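The ORF definition above can be sketched as a simple scan. This is a minimal illustration, assuming the standard start codon ATG and the stop codons TAA, TAG, and TGA, and looking only at the three forward-strand frames. The function name `find_orfs` and the `min_codons` parameter are hypothetical. Real gene finders also scan the reverse strand and use statistical models.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(dna: str, min_codons: int = 2):
    """Return (start, end, frame) tuples for ORFs on the forward strand.

    Coordinates are 0-based with an exclusive end, and the span includes
    the stop codon. min_codons counts the start and stop codons too.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG seen since the last stop
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


dna = "CCATGAAATGGTAACC"  # made-up sequence for illustration
for start, end, frame in find_orfs(dna):
    print(frame, start, end, dna[start:end])
```

Note that an ATG with no in-frame stop codon before the end of the sequence is not reported, since by the definition above an ORF must end at a stop codon.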
SAGE (Serial Analysis of Gene Expression), by talhakhat
SAGE (Serial Analysis of Gene Expression) is a technique that allows for the rapid and comprehensive analysis of gene expression patterns in a given cell population. It works by isolating mRNA, synthesizing cDNA, ligating short sequence tags to the cDNA, and then counting the number of times each tag is observed to quantify gene expression levels. The tags are concatenated and sequenced to generate vast amounts of data that must be analyzed computationally to identify which genes particular tags correspond to and to compare expression profiles between cell types. SAGE provides an overview of a cell's complete transcriptional activity and has been applied to study differences in cancer vs normal cells and to identify targets of oncogenes and tumor suppressor genes.
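The counting step described above can be sketched in a few lines. This is a simplified illustration that assumes the tags have already been extracted from the sequenced concatemers. The tag sequences and the helper names `count_tags` and `tags_per_million` are hypothetical, and normalizing to tags per million is one common convention rather than something stated in the summary.

```python
from collections import Counter


def count_tags(tags):
    """Count how many times each tag sequence was observed."""
    return Counter(tags)


def tags_per_million(counts):
    """Scale raw counts so libraries of different sizes can be compared."""
    total = sum(counts.values())
    return {tag: n * 1_000_000 / total for tag, n in counts.items()}


# Made-up tags from two hypothetical libraries.
normal = count_tags(["AAAC", "AAAC", "GGTT", "AAAC"])
tumor = count_tags(["AAAC", "GGTT", "GGTT", "GGTT"])

normal_tpm = tags_per_million(normal)
tumor_tpm = tags_per_million(tumor)
for tag in sorted(set(normal) | set(tumor)):
    print(tag, normal_tpm.get(tag, 0.0), tumor_tpm.get(tag, 0.0))
```

Mapping each tag back to a gene would need a separate tag-to-gene lookup table, which is omitted here.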
This document discusses DNA sequencing methods. It describes the Maxam-Gilbert sequencing method developed in 1976-1977 which uses chemical modification and cleavage of DNA at specific bases, followed by electrophoresis to separate fragments by size. It also mentions the popular Sanger sequencing method. The procedure for Maxam-Gilbert sequencing involves labeling DNA, cleaving it with chemicals, running the fragments on a gel, and analyzing the results to deduce the DNA sequence. Advantages include no premature termination and ability to sequence stretches not possible with enzymatic methods, while disadvantages include use of radioactivity and toxic chemicals.
The SCOP database classifies protein structures hierarchically and describes evolutionary relationships between proteins. It was created in 1994 at the Centre for Protein Engineering and is maintained manually. SCOP links to the Protein Data Bank to obtain structural classifications for each protein structure directly and can also be searched to find a protein's structural class, fold, and domain information.
FASTA is a bioinformatics software package that is used to compare amino acid sequences of proteins or nucleotide sequences of DNA. It was first described in 1985 by Lipman and Pearson. FASTA performs fast homology searches to find similarities between a query sequence and sequences in a database. While similar in purpose to BLAST, FASTA is generally the slower of the two. It works by identifying patches of sequence similarity that may contain gaps. Some key FASTA programs include FASTA, TFASTA, FASTS, and FASTX/Y. FASTA is useful for applications like identification of species, establishing phylogeny, DNA mapping, and understanding protein function.
Sequence assembly refers to aligning and merging fragments from a longer DNA sequence in order to reconstruct the original sequence. This is needed as DNA sequencing technology cannot read whole genomes in one go, but rather reads small pieces of between 20 and 30,000 bases, depending on the technology used. Typically the short fragments, called reads, result from shotgun sequencing of genomic DNA or of gene transcripts (ESTs).
The problem of sequence assembly can be compared to taking many copies of a book, passing each of them through a shredder with a different cutter, and piecing the text of the book back together just by looking at the shredded pieces. Besides the obvious difficulty of this task, there are some extra practical issues: the original may have many repeated paragraphs, and some shreds may be modified during shredding to have typos. Excerpts from another book may also be added in, and some shreds may be completely unrecognizable.
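The shredded-book picture can be made concrete with a toy greedy assembler that repeatedly merges the two reads with the longest suffix-prefix overlap. This is only a sketch under strong assumptions: the reads are error-free, all come from the same strand, and the sequence has no repeats. The function names and the `min_len` parameter are hypothetical, and production assemblers use far more elaborate graph-based methods.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a equal to a prefix of b (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Merge reads greedily by best overlap; returns the remaining contigs."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps any more; leave separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Made-up reads that tile a 13-base sequence.
print(greedy_assemble(["ATGGCGT", "GCGTACG", "TACGTTA"]))
```

The practical issues listed above map directly onto this sketch's blind spots: repeated paragraphs create ambiguous overlaps, typos break exact matching, and excerpts from another book correspond to contaminating reads.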
Whole genome sequencing is a technique to sequence the entire genome of an organism. It involves breaking the genome into small fragments, copying the fragments, sequencing the fragments, and reassembling the sequence data into the full genome. Key steps include isolating DNA, fragmenting it, ligating fragments into plasmids, amplifying the plasmids, sequencing the fragments using Sanger sequencing, and assembling the sequence reads into the complete genome. Whole genome sequencing allows researchers to discover coding and non-coding regions, predict disease susceptibility, and perform evolutionary studies by comparing species.
Whole genome shotgun sequencing involves randomly breaking genomic DNA into small fragments, sequencing the fragments, and then reassembling the sequences using overlapping regions. The document outlines the history and procedure of shotgun sequencing. Genomic DNA is first fragmented, end-repaired, and size-selected into small, medium, and large fragments. Libraries are created for each size fragment and sequenced. A base caller filters poor calls and an assembler finds overlaps to generate continuous nucleotide sequences or contigs of the whole genome.
ESTs are short sequences of DNA that represent genes expressed in certain tissues or organisms. They provide a quick and inexpensive way for scientists to discover new genes and map their positions in genomes. ESTs represent a snapshot of genes expressed in a tissue at a given time. Sequencing the beginning or end of cDNA clones produces 5' and 3' ESTs, which can help identify genes and study gene expression and regulation.
Genome organization in prokaryotes (molecular biology), by IndrajaDoradla
1. In prokaryotes, the genome is located in an irregularly shaped region within the cell called the nucleoid, which is not surrounded by a membrane like the eukaryotic nucleus.
2. The prokaryotic genome is generally a circular piece of DNA that can exist in multiple copies and ranges in length but is at least a few million base pairs. It is packaged into the nucleoid through supercoiling facilitated by nucleoid-associated proteins.
3. DNA supercoiling allows for very long strands of DNA to be tightly packaged into a prokaryotic cell. This involves the introduction of plectonemic supercoils that twist the DNA into loops and wind it around nucle
This document provides an overview of plasmid and phage vectors used in genetic engineering. It discusses the properties and types of plasmid vectors such as pBR322, as well as bacteriophage vectors including lambda phage and M13 phage. The mechanisms of gene cloning using these vectors are explained, along with their applications in cloning genes and producing recombinant proteins. In conclusion, bacteriophages are identified as good vectors compared to plasmids, though plasmids do not frequently destroy their host cells like bacteriophages can.
Lectut btn-202-ppt-l23. labeling techniques for nucleic acids, by Rishabh Jain
Nucleic acid probes can be labeled with radioisotopes or nonisotopic labels for use in hybridization techniques. Common labeling methods include radioactive labeling with 32P or 3H, or nonisotopic labeling with biotin, digoxigenin, or fluorescein. Labeled probes are used to detect complementary DNA or RNA sequences and can be DNA, RNA, or oligonucleotide probes. Probes are prepared through various techniques such as PCR, random priming, or in vitro transcription and must be purified before use and stored appropriately.
Modified M13 vectors have a large number of cloning sites which allow for insertion of foreign DNA. These vectors are derived from the M13 bacteriophage and are commonly used for DNA sequencing, mapping and mutagenesis experiments in molecular biology research. The document appears to be a seminar topic submission about using the M13 phage for biotechnology applications.
"Microbial Genomics @NIST" presentation at the Standards for Pathogen Identification via NGS (SPIN) workshop, hosted by the National Institute of Standards and Technology in October 2014, by Nathan Olson from NIST.
Bacteria come in different shapes including spherical (coccus), rod-shaped (bacillus), comma-shaped (vibrio), and spiral-shaped (spirillum). They can also be classified based on whether they have flagella.
Improving and validating the Atlantic Cod genome assembly using PacBio, by Lex Nederbragt
This document summarizes work using PacBio long reads to improve the Atlantic cod genome assembly. Error-corrected and raw PacBio reads were used with different assembly programs. Both helped increase contig and scaffold lengths over the previous assembly, with raw reads performing best. Bridgemapper validation found misassemblies corrected by PacBio. The improved assembly met goals of <5% gaps and scaffold N50 over 1 Mbp. Lessons included developing programs to handle cod's heterozygosity and structural variation better. The new assembly version aims to have 23 pseudochromosomes and improved annotation.
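For readers unfamiliar with the scaffold N50 statistic mentioned above, here is a minimal sketch of how it is usually computed: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the example lengths are hypothetical and are not taken from the cod assembly.

```python
def n50(lengths):
    """Return the N50 of a list of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


# Made-up scaffold lengths (total 32, so half is 16).
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

A higher N50 means more of the assembly sits in long sequences, which is why it is a common target when comparing assembly versions.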
This document summarizes a presentation on using whole genome sequencing (WGS) for rapid characterization of bacterial outbreaks. The presenter discusses transitioning public health labs from traditional typing methods to WGS-based approaches. Key points include developing automated analysis pipelines to identify bacteria, determine antimicrobial resistance and virulence genes, and construct phylogenomic trees from core genome SNPs. The goal is a cloud-based system allowing labs to securely upload and analyze sequencing data with open source tools integrated in modular pipelines.
This document discusses the history and various methods of DNA sequencing. It begins with a brief overview of DNA sequencing and its uses. It then outlines some of the major developments in DNA sequencing techniques, including the earliest RNA sequencing in 1972, Sanger sequencing in 1977, and the first complete genome of Haemophilus influenzae in 1995. The document proceeds to provide more detailed explanations of several DNA sequencing methods, such as Sanger sequencing, pyrosequencing, shotgun sequencing, Illumina sequencing, and SOLiD sequencing.
An earthquake is caused by the movement of tectonic plates deep underground. As the plates squeeze and stretch, stress builds until the rocks break, releasing energy in waves that cause the shaking felt during earthquakes. Common effects of earthquakes include damaged or destroyed buildings and homes, cracks in the ground, and potential tsunamis. While scientists are working to predict earthquakes, they currently cannot determine exactly how or when one will stop once the shaking has begun.
Cyclones form over warm ocean waters and are caused by heavy winds that can do significant damage. A document by Connah, Kiya, and Daniel discusses a cyclone disaster in Fiji that has already killed 5,000 people in the northern division and shows pictures of destroyed houses with broken windows and missing walls as well as boats stuck in place from even a small cyclone's winds.
This document provides advice on obtaining research grants. It discusses choosing funding bodies and collaborators, laying the groundwork, writing proposals, presentation, costs, justification, reviews, outcomes, and reapplying if not initially successful. Key points include focusing proposals, using positive language, addressing reviewer criteria, managing timelines, and learning from feedback to improve future applications.
This document contains a list of 14 names, some of which are Irish in origin. The list includes common Irish names like Anne, Eamon, Ian, Jackie, Jim, Ken, Mairead as well as some less common surnames like Oddling-Smee, Lo, McGrath, Given, McCann, McConvey, Osborne, MacDonald, Maginnis, Maguire.
Web 2.0 refers to a more dynamic and collaborative version of the World Wide Web that allows users to share and interact with information online through things like blogs, wikis, and web services. It represents a transition from static web pages to web applications that emphasize open communication, sharing, and user-generated content in web-based communities. While initially a technical term, Web 2.0 is now more commonly used for marketing purposes to describe modern web features that enable user participation and collaboration on the internet.
The document is the PostgreSQL 9.3.3 documentation. It includes information about PostgreSQL such as its history, licensing, and how to report bugs. It also provides tutorials and details about SQL features, data types, functions, and more.
Bio380 Human Evolution: Waking the deadMark Pallen
Bio380 Human Evolution, genes and genomes lecture on contribution of archaic populations to gene pool of anatomically modern humans, including Neanderthals and Denisovan
This document discusses assembly tools and visualization software. It describes assemblers like ABySS and SOAPdenovo that use de Bruijn graphs to assemble genomes from short reads. It also discusses tools for visualizing assemblies, like Tablet and ABySS-Explorer. Finally, it covers read mapping with SAM/BAM formats and tools like BWA, and visualization of mappings with Artemis and IGV.
Automated assemblies are one thing, good assemblies are another!
This presentation covers the basic concepts of using paired-end and mate pair read data to identify mis-assemblies. It also covers some of the tools for visualising and correcting mis-assemblies. An attempt is made to rate these tools on their feature set and scalability beyond small (<15MBase) genomes and provides some closing remakes about what the ideal genome assembly editing tool should have in terms of features.
Bio380 lecture on cancer as an evolutionary process, showing descent with modification, branching evolution and natural selection; focus on genome evolution
This was a talk given on 2014-06-19 for the Genome Center’s Bioinformatics Core as part of a 1 week workshop on using Galaxy. It concerns the Assemblathon projects as well as other aspects relating to genome assembly.
A version of this talk is also available on Slideshare with embedded notes.
Note, this is an evolving talk. There are older and newer versions of the talk also available on slideshare.
Bio303 Lecture 2 Two Old Enemies, TB and LeprosyMark Pallen
In this lecture I will focusing on another of the most serious infectious threats to humanity, tuberculosis, outlining its evolutionary origins, impact on human health and wealth and the steps taken to control and treat this infection. I will also discuss a related mycobacterial infection, leprosy and recent progress in its control.
Genomic sequencing allows researchers to determine the order of DNA nucleotides in whole genomes. There are two main approaches - hierarchical shotgun sequencing and whole genome shotgun sequencing. Hierarchical shotgun sequencing was used for the Human Genome Project. It involves first creating a physical map using markers like RFLPs, VNTRs, and STSs. The genome is then broken into large clones which are sequenced and assembled based on the physical map. Advances in genomic sequencing have led to sequencing of many important genomes like yeast, nematode, rice, fruit fly, and human. Genomic sequencing provides valuable information about gene structure and organization and aids in understanding genome function and evolution.
This document provides an overview of genomics, including its history, major research areas, and applications. Genomics is concerned with studying the genomes of organisms, including determining entire DNA sequences and genetic mapping. Major research areas discussed include bacteriophage, human, computational, and comparative genomics. Applications of genomics discussed include functional genomics, predictive medicine, metagenomics for medicine, biofuels and more. The first genomes sequenced were small viruses and mitochondria, while the human genome project aimed to map the entire human DNA sequence.
The Human Genome Project was an international scientific research project with the goal of determining the sequence of nucleotide base pairs that make up human DNA. It originally aimed to map the over three billion nucleotides contained in the human genome. The finished human genome is a mosaic assembled from sequencing a small number of individuals. The project has provided insights into human genetics and disease research.
This document provides an overview of modern genetics. It begins by defining genetics as the study of heredity and genes. It describes Gregor Mendel's foundational work in genetics and how his work led to the modern understanding of genes and inheritance being controlled by DNA. Key experiments that established DNA as the genetic material, such as Griffith's transformation experiment and Hershey and Chase's experiment, are summarized. The central dogma of biology involving DNA replication, transcription of DNA to mRNA, and translation of mRNA to proteins is explained at a high level. Concepts covered include DNA and RNA structure, mutation, genetic engineering techniques like recombinant DNA, and applications to medical genetics research.
Molecular tagging of genes involves identifying existing DNA or introducing new DNA to function as a tag or label for the gene of interest. There are four main strategies for gene tagging: marker-based tagging, transposon tagging, T-DNA tagging, and epitope tagging. Marker-based tagging uses molecular markers tightly linked to important traits to assist in plant breeding programs. Transposon tagging relies on transposons, which can move within the genome, to provide a DNA tag that can then be used to identify adjacent DNA sequences and genes.
The document provides information about the genome organization of various organisms including microbes, viruses, yeast, and Tetrahymena thermophila. It discusses the size and composition of genomes, highlighting that higher eukaryotes have more non-coding and repetitive DNA. The genomes of model organisms like E. coli, Saccharomyces cerevisiae, and Tetrahymena thermophila are described in detail, including their chromosome structure, gene content, and scientific contributions from studying these organisms.
Recombinant DNA technology allows DNA from different species to be isolated, cut, spliced together, and replicated. This creates new "recombinant" DNA molecules. Key steps include using restriction enzymes to cut DNA into fragments, inserting fragments into cloning vectors like plasmids, and transforming host cells to replicate the recombinant DNA. PCR is also used to amplify specific DNA sequences. Recombinant DNA technology has many applications, including producing human proteins, diagnosing genetic diseases, and detecting bacteria and viruses.
This document provides information about a lecture series on methods in molecular biology. The course is titled "Methods in Molecular Biology" and is worth 3 credit hours. It will be taught by Dr. Sumera Shaheen in the department of biochemistry at Govt. College Women University Faisalabad. The lectures will cover topics such as recombinant DNA technology, vectors, PCR, DNA sequencing, gel electrophoresis, expression of recombinant proteins, antibodies, and blotting techniques. Recommended textbooks for the course are also listed.
This document discusses the progress of fungal genomics. It notes that the sequencing of the yeast S. cerevisiae genome in 1996 revolutionized work in yeast genetics. The Fungal Genome Initiative was launched in 2000 to sequence genomes of fungi throughout the kingdom. To date, high quality draft genomes have been published for 10 fungi. Fungal genomes range in size from 12-40 Mb. Chromosomal genes make up the bulk of the genome. Mitochondrial, plasmid, and virus-like genes also contribute to the fungal phenotype. Transposons are rare in filamentous fungi. The S. cerevisiae genome was discussed as an example, noting its 16 chromosomes, 6183 genes, and circular 2 μm plasmid
Dr. ladli kishore (microbial genetics and variation) (1)Drladlikishore2015
This document discusses the history and key concepts of microbial genetics and variation, including:
1. It outlines the history of genetics from Mendel's experiments in 1865 to the discovery of gene sequencing in the 1970s.
2. It defines genes, chromosomes, DNA, and how genetic information is stored and expressed through proteins.
3. It explains genetic processes like transcription, translation, mutation, and gene regulation, and how genetic material can be transferred between bacteria.
The document discusses the need for increased genomic sampling of bacterial and archaeal diversity based on the tree of life. It summarizes the Genomic Encyclopedia of Bacteria and Archaea (GEBA) pilot project, which selected 200 organisms from diverse phylogenetic lineages for genome sequencing. The results showed that sequencing genomes from underrepresented lineages led to novel gene, protein, and structural discoveries that improved genome annotation and metagenomic analysis. The document argues for expanding GEBA-style systematic sequencing to better represent microbial diversity.
This document provides an overview of molecular genetics and biotechnology. It discusses the structure of DNA and the DNA double helix model. It explains DNA replication, which is essential for cells to copy genetic material before cell division. Gene expression through transcription and translation is summarized, including how DNA is copied into mRNA and then translated into proteins. The sequencing of the human genome is mentioned, along with applications like disease diagnosis. Genetic engineering techniques like isolating, inserting, and recombining genes are outlined. Uses of biotechnology in agriculture, farming, and medicine are also reviewed.
The document discusses biotechnology and its traditional and modern applications. It summarizes that biotechnology has traditionally involved techniques like using yeast to make beer/wine and selective breeding of plants and animals. Modern biotechnology focuses on genetic engineering using recombinant DNA technology to modify genes and achieve goals like understanding disease and improving agriculture. It also discusses techniques like polymerase chain reaction (PCR) and gel electrophoresis that are used in biotechnology and forensics.
This document provides an overview of DNA cloning including:
1. The basic steps in DNA cloning including isolation of vector and gene source DNA, insertion into the vector, and introduction into cells.
2. Uses of polymerase chain reaction and restriction enzymes in cloning.
3. Applications of cloning such as recombinant protein production, genetically modified organisms, DNA fingerprinting, and gene therapy.
1. Microbial genomics analyzes and compares the complete genetic material of microorganisms. It provides insights into microbial evolution, diversity, applications in biotechnology, and treatment of pathogens.
2. Key tools for studying whole microbial genomes include pulsed field gel electrophoresis, large insert cloning vectors, whole genome sequencing approaches, microarray hybridization, and genome annotation pipelines.
3. Sequencing the first free-living organism, Haemophilus influenzae, involved random sequencing of small and large insert libraries, followed by closing gaps using various methods to produce the complete genome.
DNA Fingerprinting for Taxonomy and Phylogeny.pptxsharanabasapppa
This document provides information about DNA fingerprinting and its use for taxonomy and phylogeny of insects. It defines DNA and describes the history and process of DNA fingerprinting. It explains that the cytochrome c oxidase 1 (CO1) gene of the mitochondria is used as the standard DNA barcode for identifying animal species. Choosing this locus allows identification of insects from any life stage. DNA barcoding provides benefits like enabling non-specialists to identify specimens and combating diseases by identifying vectors. It concludes by discussing applications of DNA barcoding and listing references.
Polymerase chain reaction (PCR) is a technique used to amplify specific regions of DNA. It allows scientists to make millions to billions of copies of the target DNA sequence. Real-time quantitative PCR (qPCR) allows quantification of the amount of target DNA or RNA present. In situ hybridization is a technique that uses labeled nucleic acid probes to localize specific DNA or RNA sequences within cells in preserved tissue samples.
Prokaryotic genetic material differs from eukaryotes in several key ways:
1. Prokaryotes lack a membrane-bound nucleus and have their DNA located in the nucleoid. They typically have a single circular chromosome while eukaryotes have multiple linear chromosomes.
2. Prokaryotic genes are arranged in operons and expressed together, whereas eukaryotic genes each have their own promoter and are independently expressed.
3. DNA replication in prokaryotes is rapid and ongoing, starting from a single origin of replication site, while eukaryotes tightly regulate replication during the cell cycle.
This document discusses high-resolution views of the cancer genome using various technologies including DNA microarrays, comparative genomic hybridization, tiling arrays, next-generation sequencing, and DNAse-Seq. It describes how these technologies can be used to analyze gene expression, copy number variation, chromatin structure, and more to better understand cancer at the genomic level. Integrating data from all these sources presents challenges but may help improve individual health outcomes.
The seminar covered bacterial genetics including basic terminology like genes, DNA, RNA, and the genetic code. It discussed the bacterial genome and central dogma of transcription and translation. Extrachromosomal genetic elements like plasmids that allow for gene transfer between bacteria were explained. Methods of gene transfer through transformation, transduction, and conjugation were also summarized. The presentation concluded with bacterial variation being influenced by genetic changes and gene transfer.
Nothing in Microbiology makes Sense except in the Light of EvolutionMark Pallen
Professor Mark Pallen's Inaugural Lecture at Warwick Medical School, University of Warwick, April 15th 2014.
Storified version of lecture: https://storify.com/mjpallen/palleninaugural
This lecture introduces concepts of bacterial genetics and virulence. It defines key genetic terms and describes how bacteria differ from eukaryotes in their genetics. Mobile genetic elements often facilitate the acquisition of virulence genes via horizontal gene transfer. Virulence factors do not always benefit the bacterium directly but may aid bacteriophages. Genetic methods like signature-tagged mutagenesis and Tn-seq can identify genes required for virulence in model infections.
Bio305 Lecture on Gene Regulation in Bacterial PathogensMark Pallen
This document summarizes the regulation of bacterial virulence gene expression. It discusses the hierarchical regulation of genes from the DNA to post-translational level. Key topics covered include transcription factors, operons, two-component systems, quorum sensing, and methods to study virulence gene expression such as reporter gene fusions, chromatin immunoprecipitation, and microarrays. The goal is to provide an overview of the complex regulatory networks that control bacterial pathogenesis.
This document provides an overview of the module "Bio305 Pathogen Biology" taught by Professor Mark Pallen. It begins with definitions of key terms like pathogen, virulence, infection, and pathogenesis. It then discusses concepts like the molecular basis of virulence, how bacteria sense their environment and regulate virulence genes, and the steps in successful bacterial infection. It also covers how bacterial sex and acquisition of mobile genetic elements like pathogenicity islands have driven the evolution of virulence. The document provides a sophisticated, multi-factorial view of virulence as a process.
Bio303 laboratory diagnosis of infectionMark Pallen
In this Bio303 module talk, I provide an overview of how infections are diagnosed in the clinical microbiology lab, focusing on technologies, old and new, and also on practical issues and workflows crucial to optimal use of the lab.
Bio303 Lecture Three: New Foes, Emerging InfectionsMark Pallen
This document outlines the lectures in a course on global health and emerging infections. The first three lectures discuss existing threats like malaria, tuberculosis, and leprosy. The third lecture focuses on new threats posed by emerging infections and examines case studies of SARS, pandemic flu, and a 2011 E. coli outbreak in Germany. The fourth lecture discusses disease eradication efforts for smallpox and current efforts for polio and guinea worm. The fifth lecture provides an overview of infectious disease diagnosis in clinical microbiology laboratories.
My first lecture on the second year Bio263 module on human evolution. An overview of human evolution and palaeoanthropology. Taxonomy and humanity's place in nature. Who is our closest living relative? Evidence from morphology and molecules.
See also Slidecast on YouTube
http://www.youtube.com/watch?v=28bLQIGRbWU
Bio303 Lecture 1 The Global Burden of Infection and an Old Enemy, MalariaMark Pallen
The Global Burden of Infection and an Old Enemy, Malaria. In this lecture I will survey the global burden of infection, including its human and economic costs, and examine the problem of neglected tropical diseases before focusing on one of the most serious infectious threats to humanity: malaria, outlining its evolutionary origins, impact on human health and wealth and the steps taken to control and treat this infection.
See also Bio303 Facebook page
My fourth lecture in my series on human evolution, migration, population genetics and genomics. Discussion of Polynesians, Jewish populations, origins of the English and Thomas Jefferson's black descendants.
http://www.youtube.com/watch?v=o0FSXmDlO-c
The document discusses human evolution and major population movements based on genetic evidence. It summarizes that Bushmen from Southern Africa represent an early lineage of anatomically modern humans. It also describes the Bantu expansion from West Africa across much of sub-Saharan Africa around 5,000 years ago. Additionally, it outlines five episodes of settlement in Europe over the past 50,000 years based on mitochondrial DNA and Y-chromosome evidence.
2. Microbial Genomics
General features of microbial genomes
Historical overview
Genome sequencing, annotation and analysis
Genome evolution
What can we learn from a genome sequence?
3. General features of genomes
Microbial
Small WYSIWYG genomes (Mbp)
Gene density high (>90%)
Intergenic regions short
Very little repetitive or non-coding DNA
Introns very rare
Protein-coding genes (CDS) short (~1 kbp)
Operons with promoters just upstream
Fewer non-coding RNAs
Human
Very large genomes (Gbp)
Gene density low
Only 25% is genes
Introns mean only 1% codes for protein
Genes can span ≥30 kbp
Genes have ~3 transcripts
Splicing and splice variants
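The contrast in gene density can be turned into a back-of-envelope calculation. The sketch below is illustrative only: the function names are hypothetical, the bacterial genome size (~1.83 Mbp, roughly that of Haemophilus influenzae) and the human genome size (~3.1 Gbp) are assumed round figures, and the coding fractions and mean gene length are the approximate values from the slide.

```python
def estimate_gene_count(genome_bp, coding_fraction, mean_gene_bp):
    """Rough gene count: coding bases divided by the mean gene length."""
    return round(genome_bp * coding_fraction / mean_gene_bp)


def coding_bases(genome_bp, coding_fraction):
    """Number of bases that actually code for protein."""
    return genome_bp * coding_fraction


# Assumed bacterial genome: ~1.83 Mbp, >90% coding, genes ~1 kbp each
bacterial_genes = estimate_gene_count(1_830_000, 0.90, 1_000)

# Assumed human genome: ~3.1 Gbp, of which only ~1% codes for protein
human_coding = coding_bases(3_100_000_000, 0.01)

print(f"Estimated bacterial gene count: {bacterial_genes}")
print(f"Human protein-coding bases: {human_coding / 1e6:.0f} Mbp")
```

Under these assumptions a small bacterial genome carries on the order of one to two thousand genes, while the protein-coding portion of the human genome amounts to only a few tens of Mbp despite the genome being over a thousand times larger.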
4. Bacterial genome organisation
Chromosomes
Most commonly single circular chromosome (always DNA)
BUT many species have linear chromosome(s) (e.g. Borrelia, Streptomyces, Rhodococcus)
BUT a few species with two chromosomes (e.g. Vibrio cholerae)
Can be mix of circular and linear (e.g. Agrobacterium tumefaciens, B. burgdorferi)
Plasmids
Independent autonomous replicon, can be circular or linear
May integrate into chromosome
Copy number varies 1 to 10s
Often carry non-essential genes that confer an adaptive advantage in certain conditions
5. Bacterial Genome Size
Species which occupy restricted ecological niches (e.g. obligate intracellular parasites and endosymbionts) tend to have smaller genomes (<1.5 Mb) than generalist bacteria
Smallest known bacterial genome: Carsonella ruddii, 160 kb! (Nakabachi et al. 2006)
BUT mitochondrial genomes are smaller
Largest genomes found in bacteria with complex developmental cycles, e.g. Streptomyces
Largest bacterial genome: Sorangium cellulosum, 13 Mb
6. Bacterial genomes are made from DNA
In 1944, Oswald Avery, Colin MacLeod, and Maclyn
McCarty showed that DNA (not proteins) was the genetic
material responsible for inheritance.
Identified DNA as the "transforming principle" while studying
Streptococcus pneumoniae
Avery, Oswald T., Colin M. MacLeod, and Maclyn McCarty.
Studies on the chemical nature of the substance inducing
transformation of pneumococcal types. Journal of Experimental
Medicine. 1944 Feb 1; 79(2): 137-158.
In 1952, this work was supported by Alfred Hershey and
Martha Chase who showed that only the DNA of a virus
needs to enter a bacterium to infect it.
Used radioactively labelled bacteriophage
Hershey AD and Chase M. Independent functions of viral
protein and nucleic acid in growth of bacteriophage. Journal of
General Physiology. 1952. 36: 39-56.
7. Viral genomes are variable
Use RNA or DNA but not both in genome
Some have RNA genomes!
Grouped into families depending on type of genome: DNA or RNA, single- or double-stranded
Typically dozens of genes or fewer
Large genomes in pox viruses (~200 kb)
Massive genomes in megaviruses (1 Mbp!)
8. Microbial Genomics Timeline
Year Milestone
1977 Invention of dideoxy chain terminator sequencing (“Sanger sequencing”)
1977 Sequencing of the 5.4-kilobase genome of bacteriophage phiX174
1981 First human mitochondrial genome sequence*
1982 Determination of the 48.5-kilobase genome sequence of bacteriophage lambda through first use of shotgun sequencing
1986 Development of automated fluorescent sequencing
1995 First complete genome sequences obtained of free-living bacteria (Haemophilus influenzae and Mycoplasma genitalium)
1996 Mycoplasma becomes first bacterial genus that has completely sequenced genomes from two different species (M. genitalium and M. pneumoniae)
1997 First genome sequences from Escherichia coli and Bacillus subtilis
1998 First genome sequence from Mycobacterium tuberculosis; genome sequence from Rickettsia prowazekii provides first evidence of reductive evolution
9. Microbial Genomics Timeline
Year Milestone
1999 Helicobacter pylori becomes the first species with completely sequenced genomes from two isolates
2000 Meningococcal genome sequence primes first application of reverse vaccinology
2001 Second E. coli genome sequence reveals unexpected level of horizontal gene transfer; genome sequence of M. leprae provides compelling evidence of bacterial pseudogenes and reductive evolution; first paper reporting genome sequences of two strains from one species (Staphylococcus aureus) in a single publication
2002 Genome sequencing of multiple strains of Bacillus anthracis to provide markers for forensic epidemiology
2003 Genome sequencing of uncultivable Tropheryma whipplei leads to design of axenic growth medium
2004 Genome sequence of mimivirus blurs distinctions between bacteria and viruses
2005 Whole-genome sequencing used to identify target of new anti-tuberculosis drug; Mycoplasma genitalium genome sequenced using pyrosequencing
2006-2011 Bacterial metagenomics survey of the Sargasso Sea yields >1 million new genes; rise of next-generation or high-throughput sequencing
10. The first genome sequences
The first sequenced gene was from bacteriophage MS2
The gene encoding the coat protein
1972
Min Jou W, Haegeman G, Ysebaert M, and Fiers W. Nucleotide
sequence of the gene coding for the bacteriophage MS2 coat
protein. Nature. 1972 May 12; 237(5350): 82-88.
The first sequenced genome was bacteriophage MS2
1976
RNA genome is 3,569 nucleotides
Fiers W, Contreras R, Duerinck F, Haegeman G, Iserentant
D, Merregaert J, Min Jou W, Molemans F, Raeymaekers A, Van
den Berghe A, Volckaert G, and Ysebaert M. Complete
nucleotide sequence of bacteriophage MS2 RNA: primary and
secondary structure of the replicase gene. Nature. 1976 Apr 8;
260(5551): 500-507.
11. The first genome sequences
The first sequenced DNA genome was bacteriophage ΦX174
1977
5,386 base pairs
Sanger F, Air GM, Barrell BG, Brown NL, Coulson AR, Fiddes
CA, Hutchison CA, Slocombe PM, and Smith M. Nucleotide
sequence of bacteriophage phi X174 DNA. Nature. 1977 265
(5596): 687-695.
The first sequenced bacterial genome was Haemophilus
influenzae
1995
1,830,140 base pairs
Fleischmann R, Adams M, White O, Clayton R, Kirkness
E, Kerlavage A, Bult C, Tomb J, Dougherty B, and Merrick J.
Whole-genome random sequencing and assembly of
Haemophilus influenzae Rd. Science, 1995. 269 (5223): 496-
512.
12. Overview of a genome project
Choose strain
Fresh isolate or tractable lab strain?
Choose strategy
Shotgun sequencing
Paired-end sequencing
Draft or complete?
Choose chemistry
Sanger; 454; Illumina; Ion Torrent
Assembly
Automated
Closure and finishing
Manually intensive
Difficulty depends on how repetitive
Data Release
Immediate or delayed?
Annotation
Manually intensive bottleneck
Publication
13. Methods for genome sequencing – historic
Sanger method sequencing
Sanger F and Coulson AR. A rapid method for
determining sequences in DNA by primed synthesis
with DNA polymerase. Journal of Molecular Biology.
1975 94: 441-448.
Step 1, a sequence-specific DNA primer is radiolabeled
Step 2, the primer is annealed to the template DNA
Step 3, the primer is extended by DNA polymerase
Incorporation of a deoxynucleotide - further extension possible
Incorporation of a dideoxynucleotide – chain termination
Four reactions set up
ddATP, dATP, dCTP, dGTP, dTTP
ddCTP, dATP, dCTP, dGTP, dTTP
ddGTP, dATP, dCTP, dGTP, dTTP
ddTTP, dATP, dCTP, dGTP, dTTP
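The four reactions above can be sketched as a small simulation. This is a minimal sketch, assuming an idealised reaction in which chain termination is observed at every possible position and the primer length is ignored; the function names are illustrative and do not come from any library.

```python
def sanger_fragments(new_strand):
    """Return {ddNTP: fragment lengths} for the four reactions.

    new_strand is the strand synthesised by the polymerase (5'->3').
    In the ddATP reaction every fragment ends where an A was incorporated,
    and likewise for the ddCTP, ddGTP and ddTTP reactions.
    """
    reactions = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        reactions[base].append(length)
    return reactions


def read_gel(reactions):
    """Read the sequence by ordering all fragments from shortest to longest."""
    bands = [(length, base)
             for base, lengths in reactions.items()
             for length in lengths]
    return "".join(base for _, base in sorted(bands))


fragments = sanger_fragments("GATTACA")
print(fragments)
print(read_gel(fragments))
```

Reading the four lanes together from the shortest fragment upwards recovers the synthesised strand, which is why each reaction needs only one dideoxynucleotide.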
15. Methods for genome sequencing –
automated Sanger sequencing
Smith LM, Sanders JZ, Kaiser RJ, Hughes P, Dodd C, Connell CR, Heiner C,
Kent SBH, and Hood LE. Fluorescence detection in automated DNA
sequence analysis. Nature. 1986 321: 674-679.
Replaced radioisotopes with fluorescent dyes
Safer for the researchers
Each of the four DNA bases could be dyed a different colour
Eliminated the need to run separate reactions in separate lanes
The migration of the dye could be read because of the fluorescence
This information allowed automatic gel reading
Further improvements were made
Improved dye chemistry using fluorescent dideoxy-terminators (DuPont): Prober
JM, Trainor GL, Dam RJ, Hobbs FW, Robertson CW, Zagursky RJ, Cocuzza AJ,
Jensen MA, and Baumeister K. A system for rapid DNA sequencing with
fluorescent chain-terminating dideoxynucleotides. Science 1987 238: 336-341.
Replacing slab gels with re-useable capillary tubes: Ruiz-Martinez MC, Berka J,
Belenkii A, Foret F, Miller AW, and Karger BL. DNA sequencing by capillary
electrophoresis with replaceable linear polyacrylamide and laser-induced
fluorescence detection. Analytical Chemistry 1993 65: 2851-2858.
16. Whole-Genome Shotgun Sanger Sequencing
Random shearing of bacterial chromosome
Size selection
Cloning into plasmid vector to create shotgun library
Pick colonies
Plasmid preps
Sequence each insert with two primers
17. High-throughput Sequencing
100x faster, 100x cheaper!
A disruptive technology
Several technologies in the marketplace from 2007
onwards
454 (Roche)
Illumina
Ion Torrent
PacBio
Fundamentally new approaches
Solid-phase amplification of clonal templates in “molecular
colonies”
Massive increase in number of “clones” compensates for shorter
read length
New chemistries for sequence reading
454: pyrophosphate detection on base addition
Illumina: reversible de-protection of fluorescent bases
19. 454 sequencing
Emulsion-based clonal amplification
Anneal sstDNA to an excess of DNA Capture Beads
Emulsify beads and PCR reagents in water-in-oil microreactors
Clonal amplification occurs inside microreactors
Break microreactors, enrich for DNA-positive beads
20. Pyrosequencing
DNA template with primer
mixed with the enzymes along
with the two substrates
adenosine 5′-phosphosulfate
(APS) and luciferin
1. one of the four nucleotides
added to reaction
2. If complementary to base in
template strand then DNA
polymerase incorporates it
3. Pyrophosphate (PPi)
released then converted to
ATP by sulfurylase in the
presence of APS.
4. ATP serves as a substrate to
luciferase, causing a light
reaction.
5. Excess nucleotides degraded
by apyrase.
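The cycle of steps above can be sketched as a toy flowgram. This is a minimal sketch, assuming that the light signal is exactly proportional to the number of nucleotides incorporated in a flow and that nucleotides are dispensed in the repeating order T, A, C, G; both the flow order and the function names are assumptions made for illustration.

```python
def flowgram(new_strand, flow_order="TACG"):
    """Return one (nucleotide, signal) pair per nucleotide dispensation.

    new_strand is the strand being synthesised (5'->3'). The signal is the
    number of bases incorporated in that flow, so a homopolymer run such as
    TT gives a signal of 2 and a non-matching nucleotide gives 0.
    """
    unknown = set(new_strand) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not in flow order: {sorted(unknown)}")
    signals = []
    pos = 0
    flow = 0
    while pos < len(new_strand):
        nucleotide = flow_order[flow % len(flow_order)]
        incorporated = 0
        while pos < len(new_strand) and new_strand[pos] == nucleotide:
            incorporated += 1
            pos += 1
        signals.append((nucleotide, incorporated))
        flow += 1
    return signals


def decode(signals):
    """Rebuild the synthesised strand from the flowgram."""
    return "".join(nucleotide * count for nucleotide, count in signals)


signals = flowgram("TTAGC")
print(signals)
print(decode(signals))
```

Because a run of identical bases is read from a single signal intensity, long homopolymers are the natural weak point of this kind of read-out.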
22. The Sequence Assembly Problem
Sequencing technologies generate reads of <1000
bp
These reads must be assembled into a single
continuous genomic sequence.
Shotgun sequencing exploits many overlapping
sequences (high coverage) to infer ordering directly
from the sequences themselves
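What "high coverage" means can be made concrete with a short calculation. This is a minimal sketch, assuming reads are sampled uniformly at random from the genome (the idealised Lander-Waterman model), in which coverage is c = N × L / G and the expected fraction of bases hit by no read is about e^(-c). The 500 bp read length is an assumed value; the genome length is the H. influenzae figure quoted on slide 11.

```python
import math


def coverage(n_reads, read_length, genome_length):
    """Average number of reads covering each base: c = N * L / G."""
    return n_reads * read_length / genome_length


def expected_uncovered_fraction(c):
    """Expected fraction of the genome covered by no read, under uniform sampling."""
    return math.exp(-c)


def reads_needed(target_coverage, read_length, genome_length):
    """Smallest number of reads that reaches the target coverage."""
    return math.ceil(target_coverage * genome_length / read_length)


GENOME_LENGTH = 1_830_140   # H. influenzae Rd, from slide 11
READ_LENGTH = 500           # assumed read length, for illustration only

for target in (1, 4, 8):
    n = reads_needed(target, READ_LENGTH, GENOME_LENGTH)
    gap = expected_uncovered_fraction(coverage(n, READ_LENGTH, GENOME_LENGTH))
    print(f"{target}x coverage: {n} reads, ~{gap:.4%} of bases unsequenced")
```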
23. The Repeat Problem
Repeats at read ends can be assembled in multiple
ways
Correct
ATTTATGTGTGTGTGGTGTG
GTGTGGTGTGCACTACTGCT
ACTACTGCTGACTACTGTGTGGTGTG
GTGTGGTGTGATATCCCT
Incorrect
ATTTATGTGTGTGTGGTGTG
GTGTGGTGTGATATCCCT
ACTACTGCTGACTACTGTGTGGTGTG
GTGTGGTGTGCACTACTGCT
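The ambiguity in the example above can be checked by computing suffix-prefix overlaps between the four reads. This is a minimal sketch; the read labels A to D and the minimum overlap of 8 bases are assumptions made for illustration.

```python
# The four reads from the slide, labelled A-D for convenience (labels are ours).
READS = {
    "A": "ATTTATGTGTGTGTGGTGTG",
    "B": "GTGTGGTGTGCACTACTGCT",
    "C": "ACTACTGCTGACTACTGTGTGGTGTG",
    "D": "GTGTGGTGTGATATCCCT",
}


def overlap(left, right, min_len=8):
    """Length of the longest suffix of left that is a prefix of right (0 if < min_len)."""
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def successors(reads, min_len=8):
    """For each read, list the reads that could follow it in an assembly."""
    return {
        name: sorted(other for other, other_seq in reads.items()
                     if other != name and overlap(seq, other_seq, min_len) > 0)
        for name, seq in reads.items()
    }


for name, following in successors(READS).items():
    print(name, "->", following)
```

Reads A and C both end in the repeat GTGTGGTGTG, so each can be followed by either B or D. The overlaps alone cannot distinguish the correct order A, B, C, D from the incorrect order A, D with C, B.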
24. Paired-end sequencing
Random shearing of bacterial chromosome
Size selection for 3 kb or 8 kb etc.
Add linkers
Circularise
Shear and select on size and presence of linkers
Add adapters
Paired-end sequencing: obtain sequences from either side of linker, known distance apart in genome
Create long fragments of known length
Obtain sequence from paired ends known distance apart
Allows assembly of contigs across repeats into scaffolds
25. Genome Assembly
Figure: Contigs 1, 2 and 3 linked into a scaffold, with a sequence gap and a physical gap labelled
26. Re-sequencing
Short reads (<200 bp) make
de novo assembly
inefficient
Instead they are
mapped against a
reference genome
Re-sequencing is like
assembling a jigsaw
puzzle using the image
on the lid
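The jigsaw analogy can be sketched as placing each read at the position where it matches the reference. This is a minimal sketch using exact string matching on a made-up reference; real read mappers use indexed searches and tolerate mismatches, neither of which is attempted here.

```python
def map_read(reference, read):
    """Return every 0-based position at which read matches reference exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)
    return positions


REFERENCE = "ACGTTAGCCATGGA"            # toy reference, not a real genome
READS = ["TAGCC", "CATG", "GG", "TTTT"]  # toy reads

for read in READS:
    hits = map_read(REFERENCE, read)
    print(read, "->", hits if hits else "unmapped")
```

A read that maps to several positions, or to none, is where re-sequencing runs into trouble: repeats give multiple placements, and sequence absent from the reference cannot be placed at all.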
27. Genome annotation
Annotation is the addition of information about the
predicted sequence features to the flat file of DNA code
Identification of potential coding sequences - CDS
Homology searches to predict function
Other features can be annotated as well
rRNAs
Potential promoters
tRNAs
Small non-coding RNAs
Repeat sequences
Insertion sequences (ISs), transposons, gene fragments
Location of the origin of replication
Determination of the number of bases, genes, and
G+C%.
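The simplest of these summary statistics, the number of bases and the G+C%, can be computed directly from the sequence. This is a minimal sketch with an illustrative function name; ambiguous bases such as N count towards the length but not towards G+C.

```python
def base_count_and_gc(sequence):
    """Return (number of bases, G+C percentage) for a DNA sequence."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    gc = sequence.count("G") + sequence.count("C")
    return len(sequence), 100.0 * gc / len(sequence)


length, gc_percent = base_count_and_gc("ATGAGTACCGCTAAATTAGTT")
print(f"{length} bases, {gc_percent:.1f}% G+C")
```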
29. …to this?
FT gene complement(9299..10702)
FT /db_xref="GenBank:2367266"
FT /gene="dnaA"
FT /note="b3702"
FT CDS complement(9299..10702)
FT /db_xref="GI:2367267"
FT /db_xref="PID:g2367267"
FT /function="putative regulator; DNA - replication, repair,
FT restriction/modification"
FT /codon_start=1
FT /protein_id="AAC76725.1"
FT /gene="dnaA"
FT /translation="MSLSLWQQCLARLQDELPATEFSMWIRPLQAELSDNTLALYAPNR
FT FVLDWVRDKYLNNINGLLTSFCGADAPQLRFEVGTKPVTQTPQAAVTSNVAAPAQVAQT
FT QPQRAAPSTRSGWDNVPAPAEPTYRSNVNVKHTFDNFVEGKSNQLARAAARQVADNPGG
FT AYNPLFLYGGTGLGKTHLLHAVGNGIMARKPNAKVVYMHSERFVQDMVKALQNNAIEEF
FT KRYYRSVDALLIDDIQFFANKERSQEEFFHTFNALLEGNQQIILTSDRYPKEINGVEDR
FT LKSRFGWGLTVAIEPPELETRVAILMKKADENDIRLPGEVAFFIAKRLRSNVRELEGAL
FT NRVIANANFTGRAITIDFVREALRDLLALQEKLVTIDNIQKTVAEYYKIKVADLLSKRR
FT SRSVARPRQMAMALAKELTNHSLPEIGDAFGGRDHTTVLHACRKIEQLREESHDIKEDF
FT SNLIRTLSS"
FT /product="DNA biosynthesis; initiation of chromosome
FT replication; can be transcription regulator"
FT /transl_table=11
FT /note="f467; 100 pct identical to DNAA_ECOLI SW: P03004;
FT CG Site No. 851"
31. An ORF is not a CDS!
An ORF is just an open reading frame
There are many more ORFs than protein coding genes (CDSs) in a
genome
Non-coding ORFs
CDSs
(note ORF can extend
upstream of start codon)
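The difference between an ORF and a CDS can be illustrated by listing every stop-free stretch in the three forward frames of a short sequence. This is a minimal sketch: the sequence is the 78-base example used on the frameshift slide, the 10-codon minimum is an arbitrary assumption, and starting with ATG is only a crude filter, not a gene predictor.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

SEQ = ("ATGAGTACCGCTAAATTAGTTAAATCAAAAGCGACCAATCTGCTTTATACC"
       "CGCAACGATGTCTCCGACAGCGAGAAA")


def open_reading_frames(seq, min_codons=10):
    """Return (frame, start, end) for stop-free stretches in the forward frames.

    start and end are 0-based, end exclusive. A stretch may run to the end
    of the sequence without meeting a stop codon.
    """
    orfs = []
    for frame in range(3):
        start = frame
        end_of_frame = frame + ((len(seq) - frame) // 3) * 3
        for i in range(frame, end_of_frame, 3):
            if seq[i:i + 3] in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i))
                start = i + 3
        if (end_of_frame - start) // 3 >= min_codons:
            orfs.append((frame, start, end_of_frame))
    return orfs


orfs = open_reading_frames(SEQ)
starts_with_atg = [orf for orf in orfs if SEQ[orf[1]:orf[1] + 3] == "ATG"]
print("ORFs:", orfs)
print("ORFs beginning with ATG:", starts_with_atg)
```

Even this short sequence contains an open stretch of at least 10 codons in every forward frame, but only the frame-0 stretch begins with a start codon.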
32. The Problem of Frameshift Errors
Actual sequence
10 20 30 40 50 60 70
| | | | | | |
ATGAGTACCGCTAAATTAGTTAAATCAAAAGCGACCAATCTGCTTTATACCCGCAACGATGTCTCCGACAGCGAGAAA
M S T A K L V K S K A T N L L Y T R N D V S D S E K
• V P L N • L N Q K R P I C F I P A T M S P T A R K
E Y R • I S • I K S D Q S A L Y P Q R C L R Q R E K
10 20 30 40 50 60 70
| | | | | | |
ATGAGTACCGCTAAATTAGTTAAATCAAAAAGCGACCAATCTGCTTTATACCCGCAACGATGTCTCCGACAGCGAGAA
M S T A K L V K S K S D Q S A L Y P Q R C L R Q R E
• V P L N • L N Q K A T N L L Y T R N D V S D S E K
E Y R • I S • I K K R P I C F I P A T M S P T A R K
Frameshifted sequence after single base error
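The effect of the single-base error above can be reproduced by translating both sequences in the first reading frame. This is a minimal sketch using the standard genetic code, built from the conventional TCAG codon ordering; the function name is illustrative.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codons enumerated in TCAG order, '*' marks a stop codon.
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq, frame=0):
    """Translate seq from the given frame, ignoring any incomplete final codon."""
    return "".join(CODON_TABLE[seq[i:i + 3]]
                   for i in range(frame, len(seq) - 2, 3))


ACTUAL = ("ATGAGTACCGCTAAATTAGTTAAATCAAAAGCGACCAATCTGCTTTATACC"
          "CGCAACGATGTCTCCGACAGCGAGAAA")
# Single-base error: one extra A inserted after base 30.
SHIFTED = ACTUAL[:30] + "A" + ACTUAL[30:]

print(translate(ACTUAL))
print(translate(SHIFTED))
```

The two translations agree for the first ten residues (MSTAKLVKSK) and differ at every position after the inserted base, which is why a single sequencing error can destroy a predicted CDS.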
33. Homology
Similarities in form (sequence) allow us to infer similarities in “meaning” (structure and function)
the cat sat on the mat
die Katze sass auf der Matte
Homology is not just sequence similarity
Two sequences can be similar without any common ancestry, particularly if low complexity
vge|GBant88-2 ITLITCVSVKDNSKRYVVAG
vge|GEfae9-178 LTLITCDQATKTTGRIIVIA
vge|GSpne1-403 MTLITCDPIPTFNKRLLVNF
sortase_staur LTLITCDDYNEKTGVWEKRK
34. Types of Homology
Homologues can be
divided into
Orthologues: lines of
descent congruent with
whole genome
Paralogues: result of
gene duplication
Xenologues: result of
horizontal gene transfer (HGT)
35. Homology Searches
The aim of homology searches is to identify sequences
within these databases that are homologous to your
sequence.
This involves comparing your sequence with all the
database sequences
looking for stretches of sequence that appear to be similar
then scoring the matches and ranking them
a measure of the significance of the match is given
Most common program used for homology searches is
BLAST
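The compare, score and rank steps above can be sketched with the four aligned fragments shown on slide 33. This is a minimal sketch that scores ungapped percent identity only. It is not how BLAST works: BLAST uses heuristic local alignment with substitution matrices and reports a statistical significance for each hit, none of which is attempted here.

```python
def identity(a, b):
    """Fraction of identical positions between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def rank_hits(query, database):
    """Score every database sequence against the query and rank best first."""
    scored = [(identity(query, seq), name) for name, seq in database.items()]
    return sorted(scored, reverse=True)


QUERY = "LTLITCDDYNEKTGVWEKRK"            # sortase_staur fragment from slide 33
DATABASE = {                               # remaining fragments from slide 33
    "vge|GBant88-2": "ITLITCVSVKDNSKRYVVAG",
    "vge|GEfae9-178": "LTLITCDQATKTTGRIIVIA",
    "vge|GSpne1-403": "MTLITCDPIPTFNKRLLVNF",
}

for score, name in rank_hits(QUERY, DATABASE):
    print(f"{name}\t{score:.0%} identity")
```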
36. Bacterial Genome Dynamics
Gene Gain
Gene duplication
Horizontal gene transfer by phage, plasmids, pathogenicity islands
Gene Loss
Drastic downsizing in isolated intracellular niches
Accumulation of pseudogenes and IS elements after shift to new niche
Gene Change
Recombination and rearrangements
Single nucleotide polymorphisms (SNPs)
Rapid emergence of genetically uniform pathogens from variable ancestral populations
37. Horizontal gene transfer
Horizontal (or lateral) gene transfer denotes any
transfer, exchange or acquisition of genetic material that
differs from the normal mode of transmission from
parents to offspring (vertical transmission).
Vertical gene transfer
Horizontal gene transfer
38. Bacterial mobile genetic elements
Transposons
pieces of DNA that act as “jumping genes” that change
location on a chromosome or plasmid
encode transposase that catalyses the transposition
event
can carry resistance or virulence genes
Insertion sequences (IS elements)
transposable elements that encode only the transposase
multiple copies of same IS within genome provide targets
for homologous recombination, rearrangements and
replicon fusions
Conjugative transposons
normally integrated into the chromosome
excise and are then transferred to recipient cells by conjugation
39. Bacterial mobile genetic elements
Plasmids
self-replicating extrachromosomal replicons
usually circular but can be linear
Can carry resistance or virulence genes
Bacteriophages
bacterial viruses
can carry virulence genes
can insert into bacterial chromosome as prophages
(lysogeny)
Integrons
complex natural cloning and gene expression systems
able to capture promoterless gene cassettes by site-
specific recombination
allow formation of large arrays of gene cassettes
transferred as a whole between different replicons.
40. Genomic islands
large chromosomal regions, part of the flexible gene
pool
previously transferred by other mobile genetic
elements
present in some bacteria but absent in close
relatives
carry multiple genes that increase phenotypic
versatility
contribute to dynamic character of bacterial
chromosomes and can be excised from the
chromosome and transferred to other recipients
pathogenicity islands contain dozens of genes that
allow quantum leap to complex new virulence
41. Core genomes and Pangenomes
Core genome
pool of genes shared by all members of a bacterial
species
Accessory or dispensable genome
pool of genes present in some but not all genomes within
the same bacterial species
Pangenome
global gene repertoire of a bacterial species, comprised of
core genome + accessory genome
Metagenome
global gene repertoire of mixed microbial population
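The definitions above map directly onto set operations. This is a minimal sketch; the strain names and the gene content assigned to each strain are invented purely for illustration.

```python
def core_accessory_pan(genomes):
    """Return (core, accessory, pangenome) gene sets for a dict of strain -> genes."""
    gene_sets = [set(genes) for genes in genomes.values()]
    if not gene_sets:
        raise ValueError("at least one genome is required")
    pangenome = set().union(*gene_sets)      # every gene seen in any strain
    core = set.intersection(*gene_sets)      # genes shared by all strains
    accessory = pangenome - core             # genes present in some but not all
    return core, accessory, pangenome


# Hypothetical gene content for three strains of one species.
GENOMES = {
    "strain_1": {"dnaA", "gyrB", "stx2"},
    "strain_2": {"dnaA", "gyrB", "aggR"},
    "strain_3": {"dnaA", "gyrB"},
}

core, accessory, pangenome = core_accessory_pan(GENOMES)
print("core:", sorted(core))
print("accessory:", sorted(accessory))
print("pangenome size:", len(pangenome))
```

Adding another genome can only shrink the core or leave it unchanged, and can only grow the pangenome or leave it unchanged.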
42. Escherichia coli Core and Pan-genomes
Welch et al. Proc Natl Acad Sci U S A. 2002 Dec 24;99(26):17020-4
43. Metagenomics
Environmental shotgun
sequencing
DNA extracted from
mixed microbial
communities sequenced
en masse
Assembled into contigs
Typically only small
contigs can be obtained
44. Uses of a genome sequence
Gene discovery
Fuelling hypothesis-driven research on pathogen biology
Comparative genomics
SNP discovery and genomic epidemiology
Functional genomics
Transcriptomics
Proteomics
Interactome
Structural Genomics
Mass Mutagenesis
45. Haemolytic-uraemic syndrome
Shiga-toxin-producing E. coli (STEC)
bloody diarrhoea; damage to kidneys and brain
anaemia; loss of platelets
46. German E. coli O104:H4 outbreak
May-July 2011
>4000 cases
>40 deaths
Link to sprouting seeds
High risk of haemolytic-
uraemic syndrome
Females particularly at risk
Frank et al DOI: 10.1056/NEJMoa1106483
48. Take-away messages from the genome
Pathogens don't bother with passports!
Not a new strain: something similar seen in Germany ten
years ago and in Korea
closest genome-sequenced strain was isolated from Central
African Republic in late 1990s, belongs to an
enteroaggregative lineage
German STEC probably comes from a lineage
circulating in human populations rather than from an
animal source (unlike E. coli O157)
49. Take-away messages
Bacteria evolve
quickly
Virulence factors in E.
coli can jump from one
lineage to another on
mobile genetic
elements
Pathotypes can
overlap and evolve
Antibiotic resistance
seen where no
obvious prior use of
antibiotics
51. Take-away messages from genome sequence
Genome sequencing brings the advantages of
open-endedness (revealing the “unknown unknowns”),
universal applicability
ultimate in resolution
Bench-top sequencing platforms now generate data
sufficiently quickly and cheaply to have an impact on
real-world clinical and epidemiological problems