Hertweck bbl2012
 

Annual presentation to NESCent scientists in Dec 2012 about my postdoctoral research projects.

    Hertweck bbl2012 Presentation Transcript

    • Genome-wide effects of transposable element evolution. Kate L. Hertweck, National Evolutionary Synthesis Center (NESCent)
    • But first... a teaching interlude ● Teaching half time for Duke Bio 202 (genetics and evolution) ● Responsible for one lab section, lab development, and lecturing ● Interesting integration of Duke course with Coursera next semester
    • Overview: 1. Transposable elements as a model system 2. Genomic contributions to life history evolution in Asparagales 3. TEs and aging in Drosophila
    • What is in a genome? ● The first step in analyzing genomes is usually to mask or filter repetitive sequences, which often comprise a large portion of the nuclear genome ● Repetitive sequences include satellites, telomeres, and other “junk” DNA elements ● “Selfish” DNA (mobile genetic elements) is a category of repetitive sequences comprising transposable elements (parasitic, self-replicating sequences derived from viruses) ● Growing evidence (including ENCODE) suggests that “junk” DNA has essential functions and provides material for evolutionary innovation ● [Figure: TE classification. Class I, retrotransposons: LTR, LINE, SINE, ERV, SVA. Class II, DNA transposons: TIR, Crypton, Helitron, Maverick. Source: www.virtualsciencefair.org]
    • TEs directly affect organisms as they move throughout a genome ● TEs interact with genes ● TE insertion within a gene disrupts function ● Exaptation of TEs into genes: Alu elements contributed to evolution of three-color vision (Dulai, 1999) ● Gene expression and regulatory changes ● TEs affect molecular evolution ● Indels ● Increased recombination (chromosomal restructuring) ● Links between TEs and adaptation/speciation
    • TEs indirectly affect organisms through changes in genome size ● Changes in overall genome size ● Physical-mechanical effects of nuclear size and mass ● Many historical hypotheses about relationships between genome size and life history (complexity, mean generation time, ecology, growth form)
    • Research questions and goals ● What are patterns of genome expansion and contraction throughout the evolutionary history of organisms? ● Patterns in genome size change ● Proliferation of TEs within lineages ● Do genomic patterns correlate with changes in life history? ● Improving methods for comparative genomics across broad taxonomic levels ● Application of phylogenetic comparative methods to genomic data (Image source: Evolutionnews.org)
    • Overview: 1. Transposable elements as a model system 2. Genomic contributions to life history evolution in Asparagales 3. TEs and aging in Drosophila ● Collaborators: J. Chris Pires and lab (U of Missouri), Patrick Edger, Dustin Mayfield
    • Genomic evolution in Asparagales ● Many edible species (onion, asparagus, agave) and ornamentals (orchid, amaryllis, yucca) ● Lots of variation in life history traits: physiology, growth habit, habitat ● Interesting patterns of genomic evolution ● Wide variation in genome size ● Bimodal karyotypes ● Although Asparagales include some of the largest angiosperm genomes, we know little about their TEs ● Possibility to test hypotheses of correlations between genomic changes and life history traits (Image sources: ag.arizona.edu, Naturehills.com)
    • Our data ● Illumina (80-120 bp single end), 6 taxa per lane ● GSS (Genome Survey Sequences): total genomic DNA! ● Data originally collected for systematics ● Assembled plastomes, mtDNA genes, and nrDNA genes from less than 10% of data (Steele et al 2012) ● Poaceae (family of grasses, model system) ● Medium-sized genomes ● Well-annotated library of repeats ● Asparagales (order of petaloid monocots, non-model system) ● Very large genomes ● Discovery of novel repeats ● Is there a way to characterize repeats when the genome is a big black box?
    • Bioinformatics approach ● Sequence assembly: ● Ab initio repeat construction: use raw sequence reads to build pseudomolecules or ancestral sequences ● De novo sequence assembly: standard genome assembly methods, screen resulting contigs ● Annotation method: ● Motif searching ● Reference library ● Sidenote: improving the ontology for transposable elements (classification and annotation): Sequence Ontology (SO), Comparative Data Analysis Ontology (CDAO)
    • Pipeline (scripts available on GitHub: AsparagalesTEscripts) 1. Raw fastq files 2. De novo genome assembly (MSR-CA) 3. Filter out scaffolds that BLAST to reference organellar genomes 4. Run RepeatMasker to identify similarity to known repeats (3110 repeats, 98.7% are from grasses) 5. Discard unknown scaffolds and “unimportant” repeats, categorize others by type 6. Map raw reads back to scaffolds to estimate relative proportion of TEs
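The last pipeline step (mapping reads back to scaffolds to estimate the relative proportion of TEs) can be illustrated with a minimal sketch. This is not the code from the AsparagalesTEscripts repository. The function name, the input dictionaries, and the choice of total sequenced reads as the denominator are all assumptions made for illustration, and the numbers are made up.

```python
from collections import defaultdict


def te_proportions(scaffold_class, mapped_reads, total_reads):
    """Percent of all reads that map to scaffolds of each repeat class.

    scaffold_class: scaffold name -> repeat class (hypothetical labels,
                    e.g. taken from a RepeatMasker summary)
    mapped_reads:   scaffold name -> number of reads mapped to it
    total_reads:    total reads sequenced for the taxon (assumed denominator)
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    reads_per_class = defaultdict(int)
    for scaffold, n_reads in mapped_reads.items():
        repeat_class = scaffold_class.get(scaffold)
        if repeat_class is None:
            # Scaffold was discarded earlier (unknown or organellar): skip it.
            continue
        reads_per_class[repeat_class] += n_reads
    return {c: 100.0 * n / total_reads for c, n in reads_per_class.items()}


# Toy input: three annotated scaffolds and one discarded scaffold.
scaffold_class = {"scf1": "Gypsy LTR", "scf2": "Copia LTR", "scf3": "Gypsy LTR"}
mapped_reads = {"scf1": 300, "scf2": 150, "scf3": 50, "scf4": 500}
proportions = te_proportions(scaffold_class, mapped_reads, total_reads=2000)
print(proportions)
```

Because every taxon would pass through the same calculation, any bias in the estimate is shared across taxa, which is the rationale given on the quality-control slide.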
    • Quality control: Poaceae ● Largest scaffolds with deepest coverage are from the chloroplast and mitochondrial genomes, but are easily identified for exclusion ● All relevant classes of repeats are present in scaffolds from a single genome ● Even long repeats can be reconstructed into a single scaffold ● Characterization of repeats is not dependent on sequence coverage ● Estimates of repeat quantity are not very accurate, but there is little consensus on TE quantification in the published literature! ● Decision: use a dataset constructed from similar data and analyzed in the same pipeline so any error is systematic and shared among all taxa ● How well do these methods work for non-model systems?
    • Example: LTR from Hosta ● Reads map across scaffold: assembly is reliable ● Some divergence in reads: measure of diversity?
    • REs in Core Asparagales
    • Genome size varies among core Asparagales [Figure: genome size (Gb) and number of reads (billions) per taxon]
    • Number of scaffolds varies among taxa [Figure: total scaffolds and nuclear scaffolds per taxon]
    • Proportion of TEs varies among taxa [Figure: percentage of each repeat category per taxon: DNA TEs, LINEs, Gypsy LTRs, Copia LTRs, other (RC, satellite, low complexity, simple repeats)]
    • Very large genomes in Core Asparagales
    • Small genomes contain variation
    • Developing genomic traits for comparative biology ● Genomic traits can be treated just like any other phenotype • Number of gene copies of a single family • Genome size, intron size, GC content, number of chromosomes, polyploidy, karyotype (sex chromosomes) • Sometimes genomic traits evolve in such a way that models need to be altered to accommodate their variation ● We finally have enough information to be able to apply these methods across robust phylogenies of organisms! ● What about transposable elements?
    • So what? ● You can peek into the black box of large plant genomes with even very limited genomic sequence data ● There is a great deal of variation in TE complements among closely related plant species ● These methods can easily be applied to extant datasets to summarize TEs
    • So what? ● Data available for most plants are low coverage, with little known about the TEs present and their direct effects on the genome and organism ● Plant genomes tolerate more plasticity than animal genomes • Polyploidy, chromosomal restructuring more common in plants • Repetitive complement comprises a higher proportion of plant genomes • Differences in gene silencing ● Pretty plants are great, but what if we want a more applied approach?
    • Overview: 1. Transposable elements as a model system 2. Genomic contributions to life history evolution in Asparagales 3. TEs and aging in Drosophila ● Collaborators: Joseph Graves (UNCG, NC A&T), Michael Rose (UC Irvine), Mira Han (NESCent)
    • Genomics of aging ● Aging as “detuning” of adaptation ● Age-related genes and expression patterns ● Does the movement of TEs throughout a genome correspond to how long an organism lives? ● Previously discussed life history traits only involve TE proliferation in gametic tissue ● Questions about aging involve changes in organisms throughout lifespan, especially if results can be transferred to human research
    • Experimental data ● Replicate populations of fruit flies selected for both short and long life spans (Burke et al 2010) ● Next-gen sequencing of pooled populations ● SNP analysis indicates allele frequency changes at many loci, but little evidence for selective sweeps ● Extensive gene expression changes
    • Experimental approach ● Does the frequency of a TE differ between control and treatment populations? ● Are there patterns consistent with type of TE? ● T-lex: Perl script for identifying presence and absence of annotated transposable elements ● 2947 transposable elements from publicly available genome sequence ● Scripts available on GitHub: flyTEscripts [Figure: TE types: FB, MITE, LINE, LTR, TIR]
    • Preliminary results ● Controls and populations selected for shorter lifespan ● All population pairs are statistically the same (Kruskal-Wallis, p=0.9414) [Figure: number of TEs (y-axis) by population 1 to 5 (x-axis); legend “final”: NA, 0, 100]
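For readers unfamiliar with the Kruskal-Wallis test cited on this slide, here is a minimal, dependency-free sketch of the H statistic with tie correction and its chi-square p-value. It is not the analysis behind the reported p=0.9414: the function names are illustrative and the input values are made up. In practice a library routine such as `scipy.stats.kruskal` would normally be used instead.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution, integer df >= 1."""
    if x <= 0:
        return 1.0
    k = 1 if df % 2 else 2
    q = math.erfc(math.sqrt(x / 2)) if k == 1 else math.exp(-x / 2)
    while k < df:  # recurrence from df = k to df = k + 2
        q += (x / 2) ** (k / 2) * math.exp(-x / 2) / math.gamma(k / 2 + 1)
        k += 2
    return q


def kruskal_wallis(groups):
    """Return (H, p) for two or more groups of observations."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, pos = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[pos:pos + len(g)])
        total += rank_sum ** 2 / len(g)
        pos += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction
    return h, chi2_sf(h, len(groups) - 1)


# Made-up TE counts for three hypothetical populations.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h, p = kruskal_wallis(groups)
print(h, p)
```

A large p-value, as reported on the slide, means the rank distributions of the groups are not distinguishable by this test.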
    • Preliminary results ● Controls and populations selected for shorter lifespan ● 153 TEs vary in one or more population ● 70 TEs vary in all five populations ● Some TE frequencies move to fixation
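The tallies on this slide (TEs that vary in one or more populations versus in all of them) can be sketched as follows. This is an illustration only, not the flyTEscripts code. The data layout, the function name, and the `min_diff` threshold for calling a TE "varying" are assumptions, since the slides do not state the criterion used, and the frequencies are made up.

```python
def count_varying_tes(freqs, min_diff=0.1):
    """Count TEs whose frequency differs between control and selected lines.

    freqs: TE name -> list of (control_freq, selected_freq), one tuple per
           population pair (hypothetical layout)
    min_diff: assumed minimum absolute difference to call a TE "varying"
    Returns (number varying in at least one pair, number varying in all pairs).
    """
    varies_any = 0
    varies_all = 0
    for pairs in freqs.values():
        changed = [abs(selected - control) >= min_diff
                   for control, selected in pairs]
        if any(changed):
            varies_any += 1
        if changed and all(changed):
            varies_all += 1
    return varies_any, varies_all


# Toy frequencies for three hypothetical TEs in two population pairs.
freqs = {
    "TE_A": [(0.5, 0.9), (0.4, 1.0)],  # differs in both pairs, fixed in one
    "TE_B": [(0.5, 0.5), (0.2, 0.8)],  # differs in one pair only
    "TE_C": [(0.3, 0.3), (0.7, 0.7)],  # no difference
}
varies_any, varies_all = count_varying_tes(freqs)
print(varies_any, varies_all)
```

A simple threshold like this is only a first pass; the next slide lists formal statistical testing for variation as remaining work.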
    • Finishing the job... ● What are patterns from other population pairs (selection for longer lifespan)? ● Formal statistical testing for variation ● Where are TEs of interest located in the genome? What genes are located nearby? ● T-lex de novo: searching for unannotated insertions – Are there unique TE insertions related to longer life spans?
    • Conclusions ● What are general patterns of TE evolution? ● Different TEs contribute to genome obesity. ● We still need better methods to compare genomes. ● Are there common patterns between TEs and life history trait evolution? ● Yes, very specific insertions, at least in Drosophila. ● How can comparative methods be adapted for genomic characteristics? ● Does TE proliferation contribute to diversification or shifts in rates of molecular evolution? ● We are getting closer to possessing enough data to answer these questions.
    • Conclusions ● There are many interesting questions to be investigated using other folks' genomic trash! ● A little sequencing data can tell you a lot about a genome. ● Many markers for systematic purposes ● You can characterize major groups of repeats even in the absence of a robust reference library for the species. ● Informatics tools and resources abound!
    • Acknowledgements ● NESCent (National Evolutionary Synthesis Center): Allen Roderigo, Karen Cranston (and bioinformatics group!) ● www.nescent.org ● Find me: k8hert.blogspot.com, Twitter @k8hert, Google+ k8hertweck@gmail.com