An ontology for transposable elements and other repetitive sequences in the age of
genomics

Kate L. Hertweck, National Evolutionary Synthesis Center

As sequencing costs decrease, researchers are incorporating large-scale genomic sequencing
into their projects. The resulting data inundate the scientific community, providing ample opportunity
for myriad comparative genomic studies. A crucial step in most genome sequencing projects is to mask
repetitive sequences. This approach improves efficiency of gene assembly, but discards an informative,
diverse part of the genome. The repetitive portion of a genome comprises all sequences in very high
copy number, such as transposable elements. Transposable elements were previously thought to be
“junk” DNA, but a growing body of evidence suggests they play vital roles in genomic evolution,
affecting everything from chromosome structure to gene regulation and even the derivation of new
genes (Biemont, 2010).
Substantial work has described the classification of transposable elements (Wicker et al., 2007),
although our current knowledge of such sequences is largely based on relatively few model systems.

A majority of publicly available repeat libraries are built from long-read Sanger sequences or highly
curated, deep-coverage genome sequencing. Available approaches to repetitive element assembly from
next-generation sequencing data rely on assumptions about the genome's repeat content, including
availability of a reference genome, depth of sequencing, and length of reads. The results from these
algorithms provide invaluable information about transposable elements, especially in organisms with
very large genomes. However, results from various repeat assembly methods require an extensive
amount of metadata to be useful for other researchers. An appropriate ontology for
repetitive elements assembled from next-generation sequencing data should include characteristics of
the sequencing method (platform, length, number of reads) as well as details of the assembly (ab initio
vs de novo, stringency thresholds) and annotation methods (library used, search parameters).
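
As a rough illustration of the metadata listed above, the following Python sketch groups the three categories into one record. It is not a proposed ontology: every class name, field name, and example value here is hypothetical, and the choice of a single numeric stringency threshold and a free-text search-parameter field is an assumption made only to keep the example small.

```python
from dataclasses import dataclass, asdict

# The two assembly approaches named in the abstract.
ALLOWED_STRATEGIES = {"ab initio", "de novo"}


@dataclass(frozen=True)
class SequencingMethod:
    platform: str          # sequencing platform name
    read_length_bp: int    # read length in base pairs
    read_count: int        # number of reads


@dataclass(frozen=True)
class AssemblyDetails:
    strategy: str                 # "ab initio" or "de novo"
    stringency_threshold: float   # assumed here to be a single number

    def __post_init__(self):
        # Reject strategies outside the two named in the abstract.
        if self.strategy not in ALLOWED_STRATEGIES:
            raise ValueError(f"unknown assembly strategy: {self.strategy!r}")


@dataclass(frozen=True)
class AnnotationDetails:
    library: str             # repeat library used for annotation
    search_parameters: str   # free-text description of search settings


@dataclass(frozen=True)
class RepeatAssemblyRecord:
    sequencing: SequencingMethod
    assembly: AssemblyDetails
    annotation: AnnotationDetails


# Placeholder values only; they do not describe any real data set.
record = RepeatAssemblyRecord(
    sequencing=SequencingMethod("example-platform", 100, 1_000_000),
    assembly=AssemblyDetails("de novo", 0.9),
    annotation=AnnotationDetails("example-repeat-library", "default"),
)

# asdict() flattens the nested record into plain dictionaries for sharing.
metadata = asdict(record)
```

Keeping the three groups as separate types mirrors the split in the text between sequencing, assembly, and annotation, so a record missing any one of them cannot be constructed.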


BIEMONT, C. 2010. A Brief History of the Status of Transposable Elements: From Junk DNA to Major
      Players in Evolution. Genetics 186: 1085-1093.
WICKER, T., F. SABOT, A. HUA-VAN, J. L. BENNETZEN, P. CAPY, B. CHALHOUB, A. FLAVELL, et al. 2007. A
      unified classification system for eukaryotic transposable elements. Nature Reviews Genetics 8:
      973-982.

iEvoBio Hertweck abstract 2012
