Assessing the impact of transposable element variation on mouse phenotypes and traits

    Assessing the impact of transposable element variation on mouse phenotypes and traits: Presentation Transcript

    • Assessing the impact of transposable element variation on mouse phenotypes and traits
      Thomas Keane, Vertebrate Resequencing Informatics, Wellcome Trust Sanger Institute, Cambridge, UK
      Christoffer Nellåker and Chris Ponting, MRC Functional Genomics Unit, University of Oxford, Oxford, UK
      Thomas Keane, WTSI 14th May, 2011
    • Transposable Elements (TEs)
      • Transposons are segments of DNA that can move within the genome
        • A minimal ‘genome’ – ability to replicate and change location
      • Dominate landscape of mammalian genomes
        • 38-45% of rodent and primate genomes
        • Genome size proportional to number of TEs
      • Class 1 (RNA intermediate) and 2 (DNA intermediate)
      • Potent genetic mutagens
        • Disrupt expression of genes
        • Genome reorganisation and evolution
        • Transduction of flanking sequence
      • Transposable elements (TEs) active amongst laboratory mouse strains
      • Mouse Genomes Project: Whole genome sequencing of 17 key laboratory mouse strains
        • 13 classical laboratory strains and 4 wild derived inbred strains
        • Average of ~25x Illumina sequencing per strain
    • Agouti Mouse Model
      • Dolinoy PNAS 2007;104:13056–13061
    • Mouse TEs
      • 3 main classes of TEs in mouse genome
        • Long interspersed nuclear elements (LINE)
        • Short interspersed nuclear elements (SINE)
        • Endogenous retrovirus superfamily (ERV)
          • ETn, IAP, MuLV, IS2, MaLR, VL30, RLTR
      • Key questions
        • What is the true extent and distribution of TEs in the germline of laboratory mouse strains?
        • What can we learn about the selective pressure acting on TEs maintained in the germline?
        • How much phenotypic variation, and how many complex traits, can we associate with TEs?
    • TE Calling
      • Terminology
        • B6+: Present in the reference genome
        • B6-: Not present in reference
        • TEV: Transposable element variant
      • Computational calling methods
        • B6+
          • SVMerge* pipeline: Integrate calls from several read-pair based SV ‘deletion’ (!) callers (Kim Wong, WTSI)
        • B6-
          • RetroSeq** pipeline developed
          • Identifies discordant mate pairs and compares to a library of known TE sequences
      • Size estimation
        • Full length element (~5-8kb) vs. solo LTR (<1kb)
        • 30-40x physical coverage long fragment (~3kb) end reads (15 strains)
        • Test if insertion point spanned by 3kb fragment read pairs
      * Wong K, Keane TM, Stalker J, Adams DJ (2010) SVMerge: Enhanced structural variant and breakpoint detection by integration of multiple detection methods and local assembly, Genome Biology, 11:R128
      ** RetroSeq available from https://github.com/tk2/RetroSeq
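The size-estimation test on this slide can be sketched roughly as follows. This is an illustrative sketch only, not the RetroSeq implementation: the `ReadPair` record, the `max_span` and `min_support` thresholds, and the status labels are all assumptions made for the example. The idea is that a short insertion such as a solo LTR can sit inside a ~3kb fragment, so read pairs from that fragment map on both sides of the insertion point, whereas a full-length element is longer than the fragment and leaves the point unspanned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPair:
    """A long-fragment read pair mapped to the reference (hypothetical record)."""
    chrom: str
    left_end: int     # last reference base covered by the leftmost mate
    right_start: int  # first reference base covered by the rightmost mate


def spanning_pairs(pairs, chrom, breakpoint, max_span=4000):
    """Return pairs whose two mates flank the breakpoint on the same chromosome.

    max_span is an assumed upper bound on the mapped distance between mates.
    """
    return [
        p for p in pairs
        if p.chrom == chrom
        and p.left_end < breakpoint < p.right_start
        and p.right_start - p.left_end <= max_span
    ]


def size_status(pairs, chrom, breakpoint, min_support=2):
    """Label a B6- call '<3kb' if enough pairs span it, else 'unspanned'."""
    support = len(spanning_pairs(pairs, chrom, breakpoint))
    return "<3kb" if support >= min_support else "unspanned"


# Hypothetical read pairs, for illustration only.
pairs = [
    ReadPair("chr1", 1000, 3200),
    ReadPair("chr1", 1500, 3600),
    ReadPair("chr1", 5000, 7500),
]
print(size_status(pairs, "chr1", 2000))  # two pairs flank position 2000
print(size_status(pairs, "chr1", 4000))  # no pair flanks position 4000
```

Requiring more than one supporting pair is a design choice in this sketch, meant to guard against a single chimeric or mis-mapped pair; a real pipeline would also check mate orientation and mapping quality.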
    • B6+ TEV Example
      • C57BL/6NJ strain has the ERV
      • Absent in DBA/2J strain
        • Flanking spanning read pairs denote absence
      • [Figure: read alignments for the DBA/2J and C57BL/6NJ strains]
    • B6- TEV Example
      • NOD/ShiLtJ 3kb fragments
        • Full length (~8kb) IAP insertion
        • Not spanned by 3kb fragment reads
      • [Figure: read alignments, with a panel zoomed into the breakpoint]
    • TEV Catalog
      • 103,798 TEVs detected
        • 28,951 SINEs
        • 40,074 LINEs
        • 34,773 ERVs
      • Evolutionary context
        • MP consensus tree based on strain distribution patterns of TEs
        • B6+: 44,401 insertions within the C57BL/6J lineage
        • B6-: 59,397 TEV insertions outside of the C57BL/6J lineage
      • TEs more frequent in wild strains
        • 13.8-22.4 vs. 4.2-6.3 per Mb
      • Notable expansion/contraction of certain classes
        • ERVs expanding relative to the other classes
        • IAPs active amongst ERVs
      • [Figure: consensus tree of strains with per-branch TEV counts and class proportions; values not recoverable]
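As a quick consistency check on the catalog figures above, the per-class counts and the B6+/B6- counts should each add up to the stated total. The snippet below uses only the numbers given on the slide; the `shares` helper is a name introduced here for illustration.

```python
# Counts transcribed from the slide.
by_class = {"SINE": 28_951, "LINE": 40_074, "ERV": 34_773}
by_lineage = {"B6+": 44_401, "B6-": 59_397}
total = 103_798


def shares(counts):
    """Return each category's fraction of the summed counts."""
    denominator = sum(counts.values())
    return {name: n / denominator for name, n in counts.items()}


# Both partitions of the catalog should reproduce the stated total.
assert sum(by_class.values()) == total
assert sum(by_lineage.values()) == total

for name, fraction in shares(by_class).items():
    print(f"{name}: {fraction:.1%}")
```

Both sums match, and the class shares come out at roughly 28% SINE, 39% LINE and 34% ERV.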
    • Callset Validation
      • B6+
        • Manually annotated all of Chr19 across 8 strains (Flint group, Oxford)
        • PCR validation of 250 randomly selected calls across 8 strains
      • B6-
        • PCR validation of 109 calls across 8 strains (Binnaz Yalcin, Oxford)
        • Initially SINE false positive rate found to be high
          • Further filtering of low complexity, microsatellites, simple repeats was required
          • Reduced false positive rate from ~30% to 9%
        • False negative rate determined by examining SDP from PCR data
      • Size status assignment accurate
        • >95% of SINEs assigned <3kb status
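The slide does not say how the low-complexity, microsatellite and simple-repeat filter was implemented. One plausible form, sketched here under that caveat, is to discard calls whose breakpoint lies in or near an annotated repeat interval. The function names, the `pad` margin and the example coordinates are all hypothetical.

```python
def overlaps_repeat(position, repeats, pad=0):
    """True if position falls inside any (start, end) repeat interval, padded."""
    return any(start - pad <= position <= end + pad for start, end in repeats)


def filter_calls(calls, repeats_by_chrom, pad=50):
    """Drop calls whose breakpoint lies in or near an annotated simple repeat.

    calls: iterable of (chrom, position) tuples.
    repeats_by_chrom: dict mapping chrom to a list of (start, end) intervals.
    pad: assumed safety margin in bases around each repeat.
    """
    kept = []
    for chrom, position in calls:
        repeats = repeats_by_chrom.get(chrom, [])
        if not overlaps_repeat(position, repeats, pad):
            kept.append((chrom, position))
    return kept


# Hypothetical inputs, for illustration only.
calls = [("chr19", 1_000), ("chr19", 5_020), ("chr19", 9_000)]
repeats_by_chrom = {"chr19": [(5_000, 5_100)]}
print(filter_calls(calls, repeats_by_chrom))
```

A linear scan over intervals is enough for a sketch; with genome-wide annotation an interval tree or a sorted list with bisection would be the usual choice.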
    • Structure of ERV Families
      • IAP Type I: 7.3 kb (full length)
        • gag-pol genes (usually defective)
        • 5’ LTR and 3’ LTR (~430 nt)
      • Solo LTR element: recombination of the flanking LTRs
      • [Figure: plots by ERV family (VL30, IS2, ETn, RLTR1B, RLTR45, IAP, RLTR10, MaLR, MuLV); axis labels not recoverable]
    • Genomic Sequence Context
      • [Figure: plot panels; legends and axis labels not recoverable]
    • 5’ and 3’ Relative Densities
      • [Figure: panels for sense and anti-sense orientation at the 5’ and 3’ ends; legends and axis labels not recoverable]
    • Density and Orientation within Genes
      • [Figure: plot panels; legends and axis labels not recoverable]
      • Distinct anti-sense bias observed in all types
        • Significantly different bias in first introns between ERVs and SINEs
      • Orientation bias remains constant despite divergence of element
        • Biphasic selection process
      • Assuming no sense/anti-sense insertion bias
        • Implies that half of sense-orientated ERVs and one third of sense-orientated SINE/LINEs are deleterious
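The step from an orientation bias to a deleterious fraction can be made explicit. The sketch below assumes, as the slide does, that insertions land in either orientation with equal probability, and additionally assumes (not stated on the slide) that anti-sense insertions are retained, so any deficit of sense insertions reflects those removed by selection. The function name is hypothetical, and the counts are illustrative ratios consistent with the stated conclusion, not observed data.

```python
from fractions import Fraction


def deleterious_sense_fraction(n_sense, n_antisense):
    """Estimate the fraction of sense-orientated insertions lost to selection.

    Assumes equal insertion rates in both orientations and that anti-sense
    insertions are retained, so the surviving sense fraction is
    n_sense / n_antisense.
    """
    if n_antisense <= 0:
        raise ValueError("n_antisense must be positive")
    if not 0 <= n_sense <= n_antisense:
        raise ValueError("expected 0 <= n_sense <= n_antisense")
    return 1 - Fraction(n_sense, n_antisense)


# Illustrative ratios only: a 1:2 sense/anti-sense ratio implies one half lost,
# and a 2:3 ratio implies one third lost.
print(deleterious_sense_fraction(1, 2))
print(deleterious_sense_fraction(2, 3))
```

Under these assumptions, "half of sense-orientated ERVs are deleterious" corresponds to observing about one sense ERV for every two anti-sense ERVs, and "one third of SINE/LINEs" to about two sense for every three anti-sense.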
    • QTLs associated with TEs
      Table 3: QTLs associated with SVs
      Phenotype | Chr | SV start | SV stop | Ancestral Event | Gene | SV overlap | LogP
      Mean platelet volume | 1 | 175158884 | 175158885 | insertion | Fcer1a | upstream | 52.833
      OFT Total activity | 2 | 144402772 | 144402974 | SINE insertion | Sec23b | intron | 15.721
      Hippocampus cellular proliferation marker | 4 | 49690364 | 49690365 | SINE insertion | Grin3a | intron | 20.119
      Home cage activity | 4 | 108951264 | 108951265 | ERV insertion | Eps15 | upstream | 15.922
      T-cells: %CD3 | 4 | 130038389 | 130038390 | SINE insertion | Snrnp40 | intron | 12.129
      Wound healing | 7 | 90731819 | 90731820 | ERV insertion | Tmc3 | upstream | 22.216
      Red cells: mean cellular haemoglobin | 7 | 111398000 | 111480000 | insertion | Trim5 | exon | 13.016
      Red cells: mean cellular haemoglobin | 7 | 111504957 | 111505193 | deletion | Trim30b | UTR | 12.806
      Red cells: mean cellular volume | 8 | 87957244 | 87957245 | LINE insertion | 4921524J17Rik | upstream | 18.141
      Serum urea concentration | 11 | 115106122 | 115106250 | deletion | Tmem104 | UTR | 13.404
      Hippocampus cellular proliferation marker | 13 | 113783196 | 113783359 | SINE deletion deletion | Gm6320 | upstream | 17.456
      T-cells: CD4/CD8 ratio | 17 | 34483680 | 34483681 | deletion | H2-Ea | upstream | 82.858
      Start and stop coordinates are given for build37 of the mouse genome, so that insertions into the reference are given as consecutive base pairs (columns headed SV start and SV stop). The part of the gene overlapped is reported in the column headed SV overlap. LogP is the negative logarithm of the P-value for association between the SV and the phenotype as assessed in outbred HS mice [22].
      Yalcin et al, under review
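To read the LogP column as a P-value, the negative logarithm has to be inverted. The table note does not state the base, so base 10 is an assumption here, exposed as a parameter. The second helper reflects the note's convention that an insertion relative to the reference is recorded as two consecutive base pairs. Both function names are hypothetical.

```python
def p_value_from_logp(logp, base=10.0):
    """Invert LogP = -log_base(P). Base 10 is an assumption, not stated in the table."""
    return base ** (-logp)


def is_consecutive_pair(start, stop):
    """True if the coordinates are two adjacent bases (stop == start + 1)."""
    return stop - start == 1


# Rows taken from the table: (phenotype, SV start, SV stop, LogP).
rows = [
    ("Mean platelet volume", 175158884, 175158885, 52.833),
    ("OFT Total activity", 144402772, 144402974, 15.721),
]
for phenotype, start, stop, logp in rows:
    p = p_value_from_logp(logp)
    print(f"{phenotype}: P = {p:.3e}, consecutive = {is_consecutive_pair(start, stop)}")
```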
    • Eps15 IAP Candidate
      • Eps15: epidermal growth factor receptor pathway substrate 15
      • [Figure: panels titled "Whole Arena Total Distance" and "Number Of Entries To Centre", with genotype labels Eps15/Eps15 -/-; other labels not recoverable]
    • Conclusions
      • Unprecedented catalog (>100k) of mouse TEV elements identified
      • False positive and negative rates are low
      • Wild derived strains contain significantly more TEs
      • Evolutionary context shows expansion of ERVs in mouse lineage
      • Distinct anti-sense bias for all elements within genes
      • Estimate that half of sense orientated ERVs and one third of SINE/LINEs are deleterious
    • Acknowledgements
      • Mouse TE Project
        • Christoffer Nellåker (Oxford)
        • Wayne Frankel (Jax)
        • Chris Ponting (Oxford)
      • Mouse Genomes Project
        • Sanger
          • Petr Danecek
          • Kim Wong
          • David Adams
          • Richard Durbin
          • Sanger Sequencing Teams
        • EBI
          • Ewan Birney
        • Wellcome Trust Centre Oxford
          • Jonathan Flint et al.
          • Binnaz Yalcin
          • Avigail Agam
          • Richard Mott
        • Jackson Lab
          • Laura Reinholdt
          • Leah Rae Donahue
      • Further Information
        • http://www.sanger.ac.uk/mousegenomes
      • Contacts
        • thomas.keane@sanger.ac.uk
        • christoffer.nellaker@gmail.com
        • chris.ponting@dpag.ox.ac.uk