1. An ontology for transposable elements and other repetitive sequences in the age of
genomics
Kate L. Hertweck, National Evolutionary Synthesis Center
As sequencing costs decrease, researchers are incorporating large-scale genomic sequencing into their
research. The resulting data inundate the scientific community, providing ample opportunity
for myriad comparative genomic studies. A crucial step in most genome sequencing projects is to mask
repetitive sequences. This approach improves efficiency of gene assembly, but discards an informative,
diverse part of the genome. The repetitive portion of a genome comprises all sequences present in very
high copy number, such as transposable elements. Although transposable elements were previously thought
to be “junk” DNA, a growing body of evidence suggests they play vital roles in genomic evolution,
affecting everything from chromosome structure to gene regulation and even derivation of new genes
(Biemont, 2010).
Substantial work has described the classification of transposable elements (Wicker et al., 2007),
although our current knowledge of such sequences is largely based on relatively few model systems.
A majority of publicly available repeat libraries are built from long-read Sanger sequences or highly
curated, deep coverage genome sequencing. Available approaches to repetitive element assembly from
next generation sequencing data rely on assumptions about the genome's repeat content, including
availability of a reference genome, depth of sequencing, and length of reads. The results from these
algorithms provide invaluable information about transposable elements, especially in organisms with
very large genomes. However, results from various repeat assembly methods require an extensive
amount of metadata to be useful for other researchers. Development of an appropriate ontology for
repetitive elements assembled from next generation sequencing data should include characteristics of
the sequencing method (platform, length, number of reads) as well as details of the assembly (ab initio
vs de novo, stringency thresholds) and annotation methods (library used, search parameters).
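
As a minimal sketch of how the metadata categories listed above might be captured in a structured record,
the following Python example groups them into sequencing, assembly, and annotation components. All class
names, field names, and values here are hypothetical placeholders chosen for illustration; they are not
terms from an existing ontology, and the stringency and search parameters are stored as free-form
dictionaries because the appropriate keys would depend on the tools used.

```python
from dataclasses import dataclass, asdict

# Assembly approaches named in the abstract; treated here as a closed vocabulary (an assumption).
ASSEMBLY_APPROACHES = ("ab initio", "de novo")


@dataclass
class SequencingMethod:
    """Characteristics of the sequencing method (hypothetical field names)."""
    platform: str
    read_length: int
    number_of_reads: int


@dataclass
class AssemblyMethod:
    """Details of the repeat assembly (hypothetical field names)."""
    approach: str
    stringency_thresholds: dict

    def __post_init__(self):
        # Reject approaches outside the assumed controlled vocabulary.
        if self.approach not in ASSEMBLY_APPROACHES:
            raise ValueError(f"approach must be one of {ASSEMBLY_APPROACHES}")


@dataclass
class AnnotationMethod:
    """Details of the annotation step (hypothetical field names)."""
    library_used: str
    search_parameters: dict


@dataclass
class RepeatAssemblyMetadata:
    """One metadata record accompanying a set of assembled repeats."""
    sequencing: SequencingMethod
    assembly: AssemblyMethod
    annotation: AnnotationMethod


# Placeholder values only; none of these describe a real dataset.
example = RepeatAssemblyMetadata(
    sequencing=SequencingMethod("placeholder-platform", 100, 1_000_000),
    assembly=AssemblyMethod("de novo", {"min_identity": 0.9}),
    annotation=AnnotationMethod("placeholder-library", {"e_value": 1e-5}),
)

# Convert the nested record to plain dictionaries, e.g. for export alongside the assembly.
record = asdict(example)
```

Restricting the assembly approach to a fixed set of terms, as in this sketch, is one way a controlled
vocabulary could keep records comparable across studies.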
BIEMONT, C. 2010. A Brief History of the Status of Transposable Elements: From Junk DNA to Major
Players in Evolution. Genetics 186: 1085-1093.
WICKER, T., F. SABOT, A. HUA-VAN, J. L. BENNETZEN, P. CAPY, B. CHALHOUB, A. FLAVELL, et al. 2007. A
unified classification system for eukaryotic transposable elements. Nature Reviews Genetics 8:
973-982.