Present in Bacteria or Eukaryotes (But Not Both)
Process | Bacteria | Eukaryo...
Evolution of Alkyltransferases
• All known alkyltransferases share a conserved, homologous alkyltransferase domain
• Theref...
Standard Eukaryotic Repair Genes
Pathway | Genes
Mismatch Repair | MSH2-6, MutLs
Base ...
Missing Eukaryotic Repair Genes?
Pathway | Genes
Mismatch Repair
Base Excision Repair
Nucleotide Excision...
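[Figure: phylogenetic tree with bootstrap values, including MSH6 from S. pombe; remaining labels not recoverable.]
[Figure: phylogenetic tree with bootstrap values; labels not recoverable.]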
Jonathan Eisen talk on "Phylogenomics of DNA repair" at Lake Arrowhead Small Genomes Meeting 2000
Talk by Jonathan Eisen on Phylogenomics of DNA repair at Lake Arrowhead Small Genomes meeting in 2000.


  1. TIGR
  2. Topics of Discussion
     • DNA repair
     • Why study evolution of repair?
     • Evolution of specific pathways, with examples from recent genome projects (e.g., A. thaliana, Vibrio cholerae, Shewanella putrefaciens, Buchnera aphidicola symbiont)
     • Big picture: evolutionary origins of repair
  3. Damage is not just to DNA
  4. [No text recovered from this slide]
  5. General Mechanisms of Resistance to Cellular Damaging Agents
     • Damage protection/prevention
     • Damage tolerance
     • Repair and recovery
  6. Classes of DNA Repair
     • Direct repair
       - Photoreactivation
       - Alkylation transfer
       - DNA ligation/non-homologous end joining
     • Excision repair
       - Base excision repair
       - Mismatch excision repair
       - Nucleotide excision repair
     • Recombinational repair
  7. Excision Repair Outline
     [Diagram] Nucleotide and mismatch excision: damage recognition → endonuclease → exonuclease, helicase, polymerase → ligase
     [Diagram] Base excision: N-glycosylase → AP endonuclease → exonuclease, polymerase → ligase
  8. Recombination Outline
     • Generation of single-strand overhang: RecBCD; RecE,T; RecQ,J; Rad50, MRE11, XRS2
     • Initiation, alignment: RecA; RecFOR; Rad52
     • Strand invasion, DNA synthesis: RecA; Rad51,55,57
     • Branch migration and resolution: RuvABC; RecG, Rus?; Rad54?
  9. “Nothing in biology makes sense except in the light of evolution.” Theodosius Dobzhansky (1973)
  10. Why Study Evolution and Repair?
      • Repair variation leads to differences in evolutionary patterns within and between species.
      • Evolutionary analysis can identify mutation/repair biases.
      • Evolutionary studies can improve our understanding of repair proteins and pathways.
      • Comparisons of repair genes can be used to infer evolutionary history.
      • Information on mutation processes improves sequence and phylogenetic analysis.
      • Evolutionary analysis is required to infer the origins and history of repair processes.
  11. Steps in Phylogenomic Analysis
      • Create database of genes of interest
      • Presence/absence of homologs in complete genomes
      • Phylogenetic trees of each gene family
      • Infer evolutionary events (gene origin, duplication, loss and transfer)
      • Refine presence/absence (orthologs, paralogs, subfamilies)
      • Functional predictions and functional evolution
      • Analysis of pathways
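
The presence/absence step can be illustrated with a minimal sketch. It assumes homolog detection has already been done elsewhere and only tabulates the result; the function name `presence_absence`, the genome list, and the hit sets are hypothetical and chosen for illustration, not taken from the talk's data.

```python
from typing import Dict, List, Set


def presence_absence(hits: Dict[str, Set[str]], genomes: List[str]) -> Dict[str, List[str]]:
    """Return one row of '+'/'-' symbols per gene family, in genome order."""
    return {
        family: ["+" if genome in found else "-" for genome in genomes]
        for family, found in hits.items()
    }


# Hypothetical input: gene family -> genomes in which a homolog was detected.
genomes = ["E. coli", "B. subtilis", "M. genitalium", "M. jannaschii", "S. cerevisiae"]
hits = {
    "RecA/Rad51": set(genomes),
    "UvrA": {"E. coli", "B. subtilis", "M. genitalium"},
    "MutS1": {"E. coli", "B. subtilis", "S. cerevisiae"},
}

table = presence_absence(hits, genomes)
for family, row in table.items():
    # One line per family, symbols aligned with the genome order above.
    print(f"{family:12s} {' '.join(row)}")
```

Rows produced this way have the same shape as the presence/absence tables on the following slides; the later steps (trees, orthology refinement) would then revise individual cells.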
  12. Nucleotide Excision Repair
      Columns: protein name(s) | biochemical activity(s) | presence/absence symbols across complete genomes (Bacteria, then Archaea, then Eukarya)
      Bacterial NER
      UvrA | Binds damaged DNA | + + + + + + + + + + + + + + + - - - - -
      UvrB | Helicase, 3′ incision endonuclease | + + + + + + + + + + + + + + + - - - - -
      UvrC | 5′ incision endonuclease | + + + + + + + + + + + + + + + - - - - -
      UvrD | Excision helicase | + + + + + ++ + + + ++ + + + + ++ - - - + +
      MFD | Transcription repair coupling | + + + + + + - - + + + + - + - - - - - -
      Eukaryotic NER: Recognition
      Rad14 (XPA) | Binds damaged DNA | - - - - - - - - - - - - - - - - - - + + +
      RFA1/RPA1 | ssDNA binding w/ RFA2,3 | - - - - - - - - - - - - - - ± - - - + + +
      RFA2/RPA2 | ssDNA binding w/ RFA1,3 | - - - - - - - - - - - - - - - - - - + ++ +
      RFA3/RPA3-human | ssDNA binding w/ RFA1,2 | - - - - - - - - - - - - - - - - - - - + +
      RFA3/RPA3-yeast | ssDNA binding w/ RFA1,2 | - - - - - - - - - - - - - - - - - - + +
      Eukaryotic NER: Initiation
      Rad3 (XPD) (ERCC2) | TFIIH component, helicase | - - - - - - - - - - - - - - - - - ± + + +
      Rad25 (XPB) (ERCC3) | TFIIH component, helicase | - - - - - - - - - + - + - + - - + + + + +
      SSL1 (p44) | TFIIH component | - - - - - - - - - - - - - - - - - - + + +
      TFB1 (p62) | TFIIH component | - - - - - - - - - - - - - - - - - - + + +
      TFB2 (p52) | TFIIH component | - - - - - - - - - - - - - - - - - - + + +
      TFB3 (MAT1) | TFIIH component | - - - - - - - - - - - - - - - - - - + + +
      TFB4 (p34) | TFIIH component | - - - - - - - - - - - - - - - - - - + + +
      CCL1 (CyclinH) | TFIIH component | - - - - - - - - - - - - - - - - - - + + +
      Kin28 (CDK7) | TFIIH component, protein kinase | - - - - - - - - - - - - - - - - - - + + +
      Eukaryotic NER: Incision
      Rad2 (XPG) (ERCC5) | 3′ incision (flap endonuclease) | - - - - - - - - - - - - - - + + + + + + +
      Rad10 (ERCC1) | 5′ incision endonuclease w/ Rad1 | - - - - - - - - - - - - - - - - - - + + +
      Rad1 (XPF) (ERCC4) | 5′ incision endonuclease w/ Rad10 | - - - - - - - - - - - - - + + + + + + +
      Eukaryotic NER: Specificity
      Rad4 (XPC) | Repair of inactive DNA | - - - - - - - - - - - - - - - - - - + + +
      Rad23 (HHRAD23) | Repair of inactive DNA | - - - - - - - - - - - - - - - - - - + ++ +
      Rad7 | Repair of inactive DNA | - - - - - - - - - - - - - - - - - - + +
      Rad16 | Repair of inactive DNA | - - - - - - - - - - - - - - - - - - + + +
      Rad26 (CSB) (ERCC6) | Transcription-repair coupling | - - - - - - - - - - - - - - - - - - + + +
      CSA (ERCC8) | Transcription-repair coupling | - - - - - - - - - - - - - - - - - - ± + +
  13. Recombinational Repair
      Columns: protein name(s) | biochemical activity(s) | presence/absence symbols across complete genomes (Bacteria, then Archaea, then Eukarya)
      Initiation: RecBCD pathway
      RecB | ExoV helicase | + + + - - - - - - + + - - + - - - - - -
      RecC | ExoV nuclease | + + + - - - - - - + ±+ - - + - - - - - -
      RecD | ExoV helicase | + + + - ± ± - - - + ±+ - - + - - - - - -
      Initiation: RecF pathway
      RecF | Assists RecA filamentation | + + - - + + - - + + - + - + - - ± - ± ±
      RecJ | 5′-3′ ssDNA exonuclease | + + + + + + - - + - + + + + - - - - - -
      RecO | Binds ssDNA, assists RecF? | + + + - + + - - + + - - - + - - - - - -
      RecR | ATP binding, assists RecF? | + + + ±+ + + - - + + - + + + - - - - - -
      RecN | ATP binding | + + + + + + - - + + - + + + - - ± - - -
      RecQ | 3′-5′ DNA helicase | + + + - ± + - - + - - + - + - - - - + ++ +
      Initiation: RecE pathway
      RecE/ExoVIII | 5′-3′ dsDNA exonuclease | + - - - - - - - - - - - - + - - - - - -
      RecT | Binds ssDNA, promotes pairing | + - - - + + - - - - - - - + - - - - - -
      Initiation: SbcBCD pathway
      SbcB/ExoI | 3′-5′ ssDNA exonuclease | + + - - - - - - - - - - - + - - - - - -
      SbcC | dsDNA exonuclease (w/ SbcD) | + - - - ±+ + - - + - + + + + ± ± ± ± ± ± ±
      SbcD | dsDNA exonuclease (w/ SbcC) | + - - - - + - - + - + + + + ± ± ± ± ± ± ±
      Initiation: AddAB pathway
      AddA/RexA | Exonuclease + helicase w/ AddB | - - + - + + - - - - - + - + - - - - - -
      AddB/RexB | Exonuclease + helicase w/ AddA | - - + - + + - - - - - - - + - - - - - -
      Initiation: Rad52 pathway
      Rad52, Rad59 | n/a | - - - - - - - - - - - - - - - - - - ++ + +
      Mre11/Rad32 | Nuclease w/ Rad50 | ± - - - ± ± - - ± - ± ± ± ± + + + + + + +
      Rad50 | Nuclease w/ Mre11 | ± - - - ± ± - - ± - ± ± ± ± + + + + + + +
      Recombinase
      RecA, Rad51 | DNA binding, strand exchange | + + + + + + + + + + + + + + + + + + ++ ++ ++
      Branch migration/resolution: Branch migration
      RuvA | Binds junctions. Helicase w/ RuvB | + + + + + + + + + + + + - + - - - - - -
      RuvB | 5′-3′ junction helicase w/ RuvA | + + + + + + + + + + + + - + - - - - - -
      RecG | Resolvase, 3′-5′ junction helicase | + + + + + + - - + + + + + + - - - - - -
      Branch migration/resolution: Resolvases
      RuvC | Junction endonuclease | + + + + - - - - + + - + - + - - - - - -
      RecG | Resolvase, 3′-5′ junction helicase | + + + + + + - - + + + + + + - - - - - -
      Rus | Junction endonuclease | + - - - - - - - - ±+ - - ±+ + - - - - - -
      CCE1 | Junction endonuclease | - - - - - - - - - - - - - - - - - - + +
      Other recombination proteins
      Rad54 | n/a | - - - - - - - - - - - - - - - - - - + + +
      Rad55 | n/a | - - - - - - - - - - - - - - - - - - + + +
      Rad57 | n/a | - - - - - - - - - - - - - - - - - - + + +
      Xrs2 | Assists Rad50/MRE11? | - - - - - - - - - - - - - - - - - - + +
  14. Evolution of Specific Pathways
  15. Photoreactivation and Photolyases
      • All photoreactivation is carried out by enzymes in the photolyase family
      • The two main classes of photolyases, class I and class II, are distantly related to each other and likely the result of an ancient duplication
      • PhrI and PhrII are missing from most species for which complete genomes are available
      • Many cases of functional change (e.g., CPD -> 6-4), and some family members are not even involved in DNA repair
      • Many of the eukaryotic proteins appear to be of organellar ancestry
  16. Uses of Evolution: Photoreactivation
      • All known enzymes that perform photoreactivation are part of a single large photolyase gene family
      • Some members of the family do not function as photolyases, but instead work as blue-light receptors
      • If a species does not encode a member of the photolyase gene family, it likely does not have photoreactivation capability
      • If a species encodes a photolyase homolog, one cannot conclude that it has photolyase activity
      • The position of a photolyase homolog within the photolyase tree helps predict what activities it has
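
The reasoning on this slide can be sketched as a small lookup, assuming each homolog has already been placed in a subfamily by tree building. The subfamily names follow the group labels used on the photolyase tree in this talk; the function `predict_activity` and the wording of each prediction are illustrative assumptions, not the speaker's procedure.

```python
from typing import Optional

# Subfamily -> predicted activity. The keys mirror group labels from the
# photolyase family tree; the mapping itself is an illustrative assumption.
PREDICTED_ACTIVITY = {
    "class I CPD photolyase": "CPD photoreactivation",
    "6-4 photolyase": "6-4 photoproduct photoreactivation",
    "blue-light receptor": "blue-light reception, not DNA repair",
}


def predict_activity(subfamily: Optional[str]) -> str:
    """Predict function from tree position; None means no family member is encoded."""
    if subfamily is None:
        # Absence of the whole family is informative.
        return "photoreactivation unlikely"
    # Presence alone is not enough; the subfamily decides, and unplaced
    # homologs stay unresolved.
    return PREDICTED_ACTIVITY.get(subfamily, "activity uncertain")


print(predict_activity(None))
print(predict_activity("blue-light receptor"))
print(predict_activity("unplaced homolog"))
```

The asymmetry is the point: a missing family supports a negative call, while a present homolog supports a specific call only once its position in the tree is known.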
  17. [Figure: phylogenetic tree of the photolyase family. Labeled groups: Class I CPD photolyases (MTHF type), 6-4 photolyases, blue-light receptors, and CPD photolyases (8-HDF type). Asterisks mark ORFA00965, ORF02295.Vibch, and ORF01792.Vibch.]
  18. Photolyases in A. thaliana
      [Figure: photolyase family tree including the A. thaliana homologs ARATH2 T30, ARATH3 MSJ, and ARATH5 F6A. Labeled groups: Crys (group with CRY1/hy4), α-Proteobacteria, Other Bacteria, Eukaryotic, and Cyano/Plastid.]
  19. Alkyltransferases
      • All known alkyltransferases are members of a single gene family
      • Found in most but not all species
      • Likely present in LUCA
      • Ada protein in E. coli originated by fusion between an alkyltransferase and a transcription-regulatory domain
      • Gram-positive bacteria have the Ada domain fused to an alkylation glycosylase instead of an alkyltransferase
  20. Alkylation Repair Genes
      [Diagram: domain architecture of Ada (E. coli, H. influenzae), Ogt (E. coli, H. influenzae, Gram-positives, D. radiodurans), MGMT (eukaryotes), and AlkA (Gram-positives, E. coli). Legend: AlkA domain (O6-Me-G glycosylase); Ogt domain (O6-Me-G alkyltransferase); Ada domain (transcription regulator).]
  21. DNA Ligases
      • Two major ligase families
      • Ligase I
        - NAD dependent
        - Found in all bacteria and only in bacteria
      • Ligase II
        - ATP dependent
        - Found in all Archaea and eukaryotes
        - Found in some bacteria
        - Duplicated in many eukaryotes
  22. DNA Ligases in A. thaliana
      [Figure: tree of DNA ligases. A. thaliana sequences: ARATH1 F4N21.14, ARATH1 F13F21.31, ARATH1 T23G18.1, ARATH1 T6D22.10, ARATH5 MUL3 11. Other sequences: YEAST-GP-600039, YEAST-SW-DNLI YEAST, YEAST-GP-3515, YEAST-SW-DNL4 YEAST, CELEG C29A12.3, DROME CG560, DROME-CG17227, DROME CG12176, AERPE-gi|5104764, AQUAE-gi|2983805, ARCFU-gi|2648829, ARCFU-gi|2649996, METJA-gi|1590924, METTH-gi|2622703, PYRHO-gi|3258051, PYRFU Pf 1527421.]
  23. Mismatch Excision Repair
      • Core of process highly homologous between bacteria and eukaryotes (all use MutS and MutL homologs).
      • Eukaryotes encode multiple MutS and MutL homologs, not all of which are involved in mismatch repair.
      • Two major MutS groups: MutS-I proteins, involved in MMR, and MutS-II proteins, involved in chromosome segregation.
      • MutS1 and MutL missing from many bacteria, especially pathogens. Other MMR proteins also defective in some.
      • Few homologs in Archaea: some encode MutS2, none encode MutS1, and some may encode MutL.
      • Some evolutionary and functional relationships to restriction-modification systems (MutH, MED1, Vsr).
  24. [Figure: MutS family tree with bootstrap values. MutS-I group (mismatch repair): MSH6, MSH3, MSH2, MSH1, MutS1. MutS-II group (chromosome crossover and segregation): MSH5, MSH4, MutS2. A proposed duplication separates the two groups.]
  25. Ancient Duplication in MutS Family
      [Figure, panels A and B: trees of bacterial MutS1 and MutS2 homologs (taxa include B. burgdorferi, S. pyogenes, T. pallidum, B. subtilis, D. radiodurans, Synechocystis sp., A. aeolicus, M. genitalium, M. pneumoniae, H. pylori, N. gonorrhoeae, H. influenzae, and E. coli), with gene duplication events marked.]
  26. Parallel Loss of MutLS
      • Lost in mycoplasmal lineage (present in B. subtilis and S. pyogenes)
      • Lost in M. tuberculosis lineage (found in some other high-GC Gram-positives)
      • Lost in H. pylori / C. jejuni lineage (present in many other Proteobacteria)
      • Possibly lost in Euryarchaeota lineage
      • Defective in many “wild” E. coli and S. typhimurium strains
      • Loss of genes may give an advantage in some conditions by increasing mutation rate or recombination rate between species.
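
Counting independent losses like those listed above can be sketched with a simple loss-only (Dollo-style) parsimony pass: assume the gene was present at the root and is never regained, then count the maximal subtrees in which every leaf lacks it. The tree topology, the placeholder leaf "high-GC relative", and the function names below are simplifying assumptions for illustration, not the tree or method used in the talk.

```python
from typing import Set, Tuple, Union

# A tree is either a leaf name or a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def _scan(node: Tree, present: Set[str]) -> Tuple[bool, int]:
    """Return (all leaves below lack the gene, losses counted strictly below)."""
    if isinstance(node, str):
        return (node not in present, 0)
    results = [_scan(child, present) for child in node]
    if all(absent for absent, _ in results):
        # Whole subtree lacks the gene: defer to one loss higher up.
        return (True, 0)
    # Each fully absent child subtree is one loss on the branch leading to it.
    losses = sum(n for _, n in results) + sum(1 for absent, _ in results if absent)
    return (False, losses)


def count_losses(tree: Tree, present: Set[str]) -> int:
    absent, losses = _scan(tree, present)
    return 1 if absent else losses


# Hypothetical, simplified topology; "high-GC relative" is a placeholder leaf.
tree = (
    ((("M. genitalium", "M. pneumoniae"), ("B. subtilis", "S. pyogenes")),
     ("M. tuberculosis", "high-GC relative")),
    (("H. pylori", "C. jejuni"), "E. coli"),
)
has_mutls = {"B. subtilis", "S. pyogenes", "high-GC relative", "E. coli"}
n_losses = count_losses(tree, has_mutls)
print(n_losses)
```

Under these assumptions the two mycoplasmas count as one loss, as do H. pylori and C. jejuni, which is the sense in which the losses are "parallel" rather than inherited from a single ancestor.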
  27. Nucleotide Excision Repair
      • Bacterial and eukaryotic systems are not homologous, despite having very similar mechanisms
      • Most of the eukaryotic and bacterial proteins originated within each of these domains
      • Some of the eukaryotic proteins are shared with Archaea (Rad1, Rad2, Rad25).
      • All free-living bacteria encode UvrABCD. B. aphidicola encodes Mfd but not UvrABCD.
      • UvrABC also found in one archaeon.
      • Some functional and evolutionary relationships with drug resistance and transport
  28. Evolution of UvrA Family
      [Figure, panel A: tree of ABC transporters (NrtDC, OppDF, UUP, NodI, LivF, XylG, PstB, MDR, HlyB, TAP1, CFTR, SUR) with UvrA1 and UvrA2 branches and a marked duplication in the UvrA family.]
      [Figure, panel B: UvrA subfamily tree. UvrA from H. influenzae, E. coli, N. gonorrhoeae, R. prowazekii, S. mutans, S. pyogenes, S. pneumoniae, B. subtilis, M. luteus, M. tuberculosis, M. thermoautotrophicum, H. pylori, C. jejuni, P. gingivalis, C. tepidum, D. radiodurans (UvrA1), T. thermophilus, T. pallidum, B. burgdorferi, T. maritima, A. aeolicus, and Synechocystis sp.; UvrA2 from S. coelicolor and D. radiodurans; DrrC from S. peucetius.]
  29. UvrA Evolution
      [Diagram: an ancestral ABC domain diversified into the ABC family (ABC1, ABC2); a tandem duplication produced UvrA with two halves (UvrAN, UvrAC); a later gene duplication produced UvrA1 (UvrA1N, UvrA1C) and UvrA2 (UvrA2N, UvrA2C).]
  30. Base Excision Repair Glycosylases
      • Distribution patterns highly uneven, but some glycosylases have been found in all species
      • Some are ancient enzymes, probably present in LUCA (e.g., MutY-Nth); others are more recent (e.g., TagI).
      • Many families are distantly related to each other (e.g., Ogg, AlkA, MutY-Nth)
      • Many cases of gene duplication, loss and possibly transfer, especially from organellar genomes to nucleus
      • Orthologs frequently have different specificity
  31. A. thaliana TAG homologs
      [Figure: tree of TAG homologs from A. thaliana (5 K23L20 1, 3 MBK21.7, 1 F23A5.15, 1 T24D18.7, 5 MTI20 23, 1 F9E10.6) with C. crescentus, V. cholerae, H. influenzae, E. coli, M. tuberculosis, and N. meningitidis A and B.]
  32. AP Endonucleases
      • All species encode either Nfo or Xth homologs. Some encode both.
      • Only Nfo: mycoplasmas, Aquifex, M. jannaschii, yeast
      • Only Xth: many bacteria, A. fulgidus, humans (so far)
      • Both: E. coli, B. subtilis, M. tuberculosis, M. thermoautotrophicum
      • Both Nfo and Xth are likely ancient.
      • Many cases of gene loss of one or the other, but never both
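
The distribution pattern above amounts to a small classification, sketched here with the species named on the slide. The gene sets are transcribed from the slide's three groups; the function `classify` and the dictionary layout are assumptions made for illustration.

```python
from typing import Dict, Set


def classify(genes: Set[str]) -> str:
    """Classify a genome by which AP endonuclease families it encodes."""
    has_nfo, has_xth = "Nfo" in genes, "Xth" in genes
    if has_nfo and has_xth:
        return "both"
    if has_nfo:
        return "only Nfo"
    if has_xth:
        return "only Xth"
    return "neither"


# Gene content as listed on the slide (groups, not individual strains).
ap_endonucleases: Dict[str, Set[str]] = {
    "mycoplasmas": {"Nfo"},
    "Aquifex": {"Nfo"},
    "M. jannaschii": {"Nfo"},
    "yeast": {"Nfo"},
    "A. fulgidus": {"Xth"},
    "humans": {"Xth"},
    "E. coli": {"Nfo", "Xth"},
    "B. subtilis": {"Nfo", "Xth"},
    "M. tuberculosis": {"Nfo", "Xth"},
    "M. thermoautotrophicum": {"Nfo", "Xth"},
}

calls = {species: classify(genes) for species, genes in ap_endonucleases.items()}
# The slide's claim that one family is always retained corresponds to this check.
never_both_lost = all(call != "neither" for call in calls.values())
print(never_both_lost)
```

A check like `never_both_lost` is what one would rerun as new genomes are added, since the "(so far)" on the slide signals that the pattern is provisional.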
  33. Recombinational Repair
      • RecA homologs found in all free-living species (B. aphidicola encodes RecBCD but not RecA)
      • Most recombination initiation pathways are of recent origin
        - RecBCD, RecE within Proteobacteria/Gram-positives
        - RecF within bacteria
        - AddAB within low-GC Gram-positives
        - SbcCD may be of ancient origin (possibly homologous to MRE11/Rad50)
      • Resolution pathways are also of somewhat recent origin
        - CCE1 within eukaryotes
        - RuvABC, RecG near origin of bacteria
        - Rus within bacteria (phage origin?)
      • Many cases of gene loss in initiation and resolution pathways.
  34. [Figure: phylogenetic tree of proteobacterial sequences grouped into the γ, β, α, δ, and ε subdivisions; asterisks mark the sequence MBBAD17TF; scale bar 0.1.]
  35. [Figure: tree with labels A05970, MucB, UmuCs, ImpB, RumB, RumB R391, RulB, DinP1, DinP2, DinP3, and UvrX; asterisks highlight one branch.]
  36. Big Picture: Evolutionary Origin
  37. Likely Ancient Repair Processes/Proteins
      Process | Proteins
      Mismatch repair | MutL, MutS
      AP endonuclease | Xth, Nfo
      Recombinase | RecA/RadA/Rad51
      Alkylation reversal | Ogt/MGMT
      Photolyase | PhrII, PhrI
      dGTP/GTP clean up | MutT
      Base excision glycosylases | MutY/Nth, AlkA, Ung?
      Recombination endonuclease | SbcC/Rad50, SbcD/MRE11
      Other | SMS, Lon, UmuC
  38. Originated within Bacteria
      Process | Proteins
      Mismatch repair | MutH, Vsr
      Alkylation reversal | Ada (fusion of Ogt)
      Base excision | Fpg-Nei, TagI
      Recombination initiation | RecFJNOR, AddAB, RecBCD, RecET, SbcB
      Recombination resolution | RecG, RuvABC, Rus
      Nucleotide excision repair | UvrABCD
      Transcription-coupled repair | MFD
      Induction | LexA
      Other | SSB, LigaseI
  39. Originated within Eukaryotes
      Process | Proteins
      Mismatch repair | duplications of MutS, MutL
      Base excision | 3MG?
      Recombinase | duplication of RecA
      Recombination initiation | duplications of RecQ
      Recombination resolution | CCE1
      Nucleotide excision repair | Most XPs, TFIIH, etc.
      Transcription-coupled repair | CSA, CSB
      Induction | P53
      Non-homologous end joining | XRCC4, Kus, DNA-PKcs
      Other | RFAs, Rad52-59, XRS2
  40. Originated in Eukaryote-Archaea Lineage
      Process | Proteins
      Base excision | Ogg
      Nucleotide excision repair | Rad1, Rad2, Rad25?
      Ligation | LigaseII
  41. Ambiguous Origin
  42. Repair Genes in Archaea
      • All species: RecA, MRE11, Rad50, MutY-Nth, Ogt, Rad2, Lig-II, PCNA
      • UvrABCD in M. thermoautotrophicum
      • PhrI and PhrII in some species
      • Variety of glycosylases in some species
      • No Ung homologs in any species, but alternative glycosylases have Ung activity
      • Rad1 in many species.
      • New Holliday junction resolvase
  43. [No text recovered from this slide]
  44. DNA Repair Genes in D. radiodurans Complete Genome
      Process | Genes in D. radiodurans
      Nucleotide excision repair | UvrABCD, UvrA2
      Base excision repair | AlkA, Ung, Ung2, GT, MutM, MutY-Nths, MPG
      AP endonuclease | Xth
      Mismatch excision repair | MutS, MutL
      Recombination: initiation | RecFJNRQ, SbcCD, RecD
      Recombination: recombinase | RecA
      Recombination: migration and resolution | RuvABC, RecG
      Replication | PolA, PolC, PolX, phage Pol
      Ligation | DnlJ
      dNTP pools, cleanup | MutTs, RRase
      Other | LexA, RadA, HepA, UVDE, MutS2
  45. Problem: The list of DNA repair gene homologs in the D. radiodurans genome is not significantly different from that of other bacterial genomes of similar size
  46. Unusual Features of D. radiodurans DNA Repair Genes
      Process | Genes
      Nucleotide excision repair | Two UvrAs
      Base excision repair | Four MutY-Nths
      Recombination | RecD but not RecBC
      Replication | Four Pol genes
      dNTP pools | Many MutTs, two RRases
      Other | UVDE
  47. Gain and Loss of Repair Genes
      [Figure: repair gene events mapped onto a tree of Bacteria (Mycge, Mycpn, Bacsu, Strpy, Synsp, Borbu, Trepa, Neigo, Haein, Helpy, Ecoli), Archaea (Metth, Metjn, Arcfu), and Eukaryotes (Human, Yeast). Branches carry gene names prefixed with +, -, d, or t. Labeled events include +RecA, +SbcCD, +UmuC, +MutT, +Lon, dMutSI/MutSII, dRecA/SMS, dPhrI/PhrII, tUvrABCD, and -MutLS; some gains are annotated "from mitochondria". The assignment of events to individual branches is not recoverable.]
  48. Repair Studies in Different Species (determined by Medline searches as of 1998)
      Species | Studies
      Humans | 7028
      E. coli | 3926
      S. cerevisiae | 988
      Drosophila | 387
      B. subtilis | 284
      S. pombe | 116
      Xenopus | 56
      C. elegans | 25
      A. thaliana | 20
      Methanogens | 16
      Haloferax | 5
      Giardia | 0
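
The skew in these counts is easier to see as shares of the total. The sketch below only re-expresses the numbers from the slide; treating "Methanogens" and "Haloferax" as the archaeal entries is an assumption about how the slide's categories map to domains.

```python
# Study counts transcribed from the slide (Medline searches as of 1998).
studies = {
    "Humans": 7028, "E. coli": 3926, "S. cerevisiae": 988, "Drosophila": 387,
    "B. subtilis": 284, "S. pombe": 116, "Xenopus": 56, "C. elegans": 25,
    "A. thaliana": 20, "Methanogens": 16, "Haloferax": 5, "Giardia": 0,
}

total = sum(studies.values())
# Assumption: these two rows are the only archaeal entries in the table.
archaeal = studies["Methanogens"] + studies["Haloferax"]
top_two = studies["Humans"] + studies["E. coli"]

print(f"total studies: {total}")
print(f"archaeal share: {archaeal / total:.2%}")
print(f"human + E. coli share: {top_two / total:.2%}")
```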
  49. Evolution of Repair Summary
      • Mycoplasmas have lost many repair genes, which may explain their high mutation rate.
      • Mismatch repair genes absent in many pathogens (is high mutation rate advantageous?)
      • Whole pathways frequently lost as units (e.g., MutLS).
      • May be able to predict pathway interactions by correlated loss of genes.
      • Archaeal genomes have few homologs of bacterial or eukaryotic repair proteins.
      • Some eukaryotic repair proteins have likely mitochondrial and plastid ancestry
      • Many ancient duplications (MutS, SNF2, UvrC).
      • Some unusual distributions (XPB, UvrABCD)
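
The idea of predicting pathway interactions from correlated loss can be sketched by comparing presence/absence profiles. The measure used here, Jaccard similarity over the sets of genomes lacking each gene, is one simple choice among many and is not stated in the talk; the profiles are hypothetical strings with one character per genome.

```python
from itertools import combinations


def jaccard_absence(a: str, b: str) -> float:
    """Jaccard similarity of the sets of genomes in which each gene is absent."""
    lost_a = {i for i, symbol in enumerate(a) if symbol == "-"}
    lost_b = {i for i, symbol in enumerate(b) if symbol == "-"}
    union = lost_a | lost_b
    return len(lost_a & lost_b) / len(union) if union else 0.0


# Hypothetical profiles: '+' present, '-' absent, same genome order in each row.
profiles = {
    "MutS": "++-+-++-",
    "MutL": "++-+-++-",
    "RecA": "++++++++",
    "RuvC": "+++--+++",
}

scores = {
    (g1, g2): jaccard_absence(profiles[g1], profiles[g2])
    for g1, g2 in combinations(profiles, 2)
}
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 2))
```

Genes that are always lost together score 1.0, which matches the observation that MutL and MutS tend to disappear as a unit; a real analysis would also need to correct for shared ancestry, since losses inherited from one ancestor are not independent events.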
  50. [No text recovered from this slide]
  51. Acknowledgements
      • TIGR: Craig Venter, Claire Fraser, John Heidelberg, Owen White, Steve Salzberg
      • Stanford: Phil Hanawalt, Rick Myers, D. Crowley
      • U.C. Berkeley: Michael Eisen, A. J. Clark
      • NIEHS: Ben Van Houten
      • Louisiana State University: John Battista
      • Other: J. Laval, F. Taddei, A. Britt, J. Miller
      • Funding: DOE, OBER; NIH; NSF
  52. [No text recovered from this slide]
  53. 53. Unusual Distributions• XP-B like gene in some bacteria and some Archaea.• LigaseII in M. tuberculosis, B. subtilis, and A. aeolicus• UvrABCD in M. thermoatuotrophicum• Mycoplasmas and some low GC gram positives do not have any Holliday junction resolving homologs (RuvC, RecG, Rus)• Mycoplasmas are the only species without MutY-Nth homologs• MutS2 unevenly distributed among bacteria, Archaea• Genes in RecF pathway not always present as a unit• Uracil glycosylase missing from Archaea and some bacteria TIGR
  54. 54. Big Picture: Duplication and LossTIGR
  55. 55. Genes Lost in Mycoplasmal LineageProcess ProteinBase excision repair MutY/Nth, AlkARecombination initiation RecF pathway, SbcCDRecombination resolution RecG, RuvCMismatch repair MutLSTranscription coupled repair MFDInduction LexADirect repair PhrI, OgtAP endonuclease XthOther MutT, Dut, PriA, SMS TIGR
  56. 56. Parallel Loss of MutLSLost in mycoplasmal lineage (present in B. subtilis and S. pyogenes)Lost in M. tuberculosis lineage (found in some other highGC Gram-positives)Lost in H. pylori lineage (present in many other Proteobacteria)Possibly lost in Euryarchaeota lineageDefective in many “wild” E. coli and S. typhimurium strainsLoss of genes may give an advantage in some conditions by increasing mutation rate or recombination rate between species. TIGR
  57. 57. Need for Experimental Studies in Archaea • No novel repair genes cloned in Archaea. All repair genes show homology to repair genes in other species. • Many novel repair genes found in bacteria and eukaryotes because of experimental work in these species. • Since novel repair pathways appear to evolve frequently in bacteria and eukaryotes, there is a need for more genetic and experimental studies of repair in Archaea. TIGR
  58. Repair Genes in all Archaea
      Process: Protein
      Nucleotide excision repair: Rad2, Rad1 ±
      Recombination: RecA, Mre11, Rad50
      Replication: PolB, PCNA
      Ligase: Ligase II
      Base excision repair: MutY-Nth
      dNTP pools: MutT family
      Alkyltransferase: Ogt in all species
  59. DNA Repair Gene Summary
      • Most of the standard eukaryotic DNA repair genes are found
      • Some likely plastid repair genes are found
      • Some duplications relative to other species
  60. Acknowledgements
      • Genome duplications: S. Salzberg, J. Heidelberg, O. White, A. Stoltzfus, J. Peterson
      • Genome sequences and analysis: J. Heidelberg, T. Read, H. Tettelin, K. Nelson, J. Peterson, R. Fleischmann, D. Bryant
      • Horizontal transfers: K. Nelson, W. F. Doolittle
      • TIGR: C. Fraser, J. Venter, M-I. Benito, S. Kaul, Seqcore
      • $$$: DOE, NSF, NIH, ONR
  61. Evolution of Uracil Glycosylase
      • Many non-homologous proteins have uracil-DNA glycosylase activity (Ung, GAPDH, MUG, cyclin)
      • Therefore, absence of homologs of these genes should not be used to infer likely absence of the activity
      • However, presence of homologs of the Ung and MUG genes can be used to indicate presence of the activity, because all homologs of these genes have this activity
  62. Ambiguous Origin
      Process: Proteins
      Base excision: 3MG, GT MMR, Ung
      Nucleotide excision repair: Rad25
      Recombination initiation: RecQ
      Other: Dut
  63. Big Picture: Distribution Patterns
  64. Present in All Bacteria
      Process: Proteins
      Nucleotide excision repair: (none listed)
      Recombinase: (none listed)
      Replication: PolA, C
      Single-strand DNA binding: SSB
      Ligase: LigaseI
  65. Present in All Free-Living Bacteria
      Process: Proteins
      Nucleotide excision repair: UvrABCD
      Recombinase: RecA
      Replication: PolA, C
      Single-strand DNA binding: SSB
      Ligase: LigaseI
  66. Present in Most Bacteria
      Process: Protein
      Nucleotide excision repair: UvrABCD
      Holliday junction resolution: RuvABC
      Recombination: RecA; RecJ, RecG
      Replication: PolA, C; PriA; SSB
      Ligase: DnlJ
      Transcription-coupled repair: Mfd
      Base excision repair: Ung, MutY-Nth
      AP endonuclease: Xth
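The three distribution slides above ("all", "all free-living", "most") amount to sorting each gene by how widely its homologs occur across complete genomes. A minimal sketch of that sorting step follows, assuming a presence/absence table is already available. The function name `classify_distribution`, the placeholder genome labels, and the 0.75 cutoff for "most" are illustrative assumptions and are not taken from the slides.

```python
from typing import Set


def classify_distribution(
    genomes_with_homolog: Set[str],
    all_genomes: Set[str],
    free_living: Set[str],
    most_fraction: float = 0.75,  # assumed cutoff for "most"; not from the slides
) -> str:
    """Place one gene into a distribution category from presence/absence data."""
    present = genomes_with_homolog & all_genomes
    if present == all_genomes:
        return "all"
    if free_living <= present:
        # Every free-living genome has a homolog; some other genome lacks one.
        return "all free-living"
    if len(present) / len(all_genomes) >= most_fraction:
        return "most"
    return "some"


# Hypothetical toy data: five placeholder genomes, four of them free-living.
ALL_GENOMES = {"g1", "g2", "g3", "g4", "g5"}
FREE_LIVING = {"g1", "g2", "g3", "g4"}

if __name__ == "__main__":
    toy_genes = {
        "gene_a": {"g1", "g2", "g3", "g4", "g5"},
        "gene_b": {"g1", "g2", "g3", "g4"},
        "gene_c": {"g1", "g2", "g3", "g5"},
        "gene_d": {"g1", "g2"},
    }
    for gene, genomes in toy_genes.items():
        print(gene, classify_distribution(genomes, ALL_GENOMES, FREE_LIVING))
```

The categories are checked from strictest to loosest, so a gene found in every genome is reported as "all" rather than also as "most".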
  67. Present in Bacteria or Eukaryotes (But Not Both)
      Process | Bacteria | Eukaryotes
      Transcription-coupled repair | | CSB, CSA
      Mismatch strand recognition | MutH | -
      Nucleotide excision repair | UvrABC | XPs, TFIIH, etc.
      Recombination initiation | RecBCD, RecF | KU, DNA-PK
      Holliday junction resolution | RuvABC | CCE1
      Base excision | - |
      Inducible responses | LexA | P53
  68. Evolution of Alkyltransferases
      • All known alkyltransferases share a conserved, homologous alkyltransferase domain
      • Therefore, if a species does not encode any protein with this domain, it likely does not have alkyltransferase activity
      • If a species does encode a member of this gene family, it likely has alkyltransferase activity
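The alkyltransferase rule above and the uracil glycosylase rule on the earlier slide differ in one respect: absence of homologs is informative only when every known enzyme with the activity belongs to a single gene family. A minimal sketch of that decision logic is given below. The names `ActivityCall` and `infer_activity` are hypothetical, and the function encodes only the reasoning stated on the slides, not a tested prediction method.

```python
from enum import Enum


class ActivityCall(Enum):
    LIKELY_PRESENT = "likely present"
    LIKELY_ABSENT = "likely absent"
    UNKNOWN = "cannot infer from homology"


def infer_activity(has_homolog: bool, single_family: bool) -> ActivityCall:
    """Infer a repair activity from presence or absence of homologs.

    has_homolog:   the genome encodes a member of a gene family whose
                   homologs all carry the activity.
    single_family: every known enzyme with this activity belongs to that
                   one family (true for alkyltransferases on the slides,
                   false for uracil-DNA glycosylases).
    """
    if has_homolog:
        return ActivityCall.LIKELY_PRESENT
    if single_family:
        # No member of the only known family, so the activity is likely absent.
        return ActivityCall.LIKELY_ABSENT
    # Non-homologous enzymes may supply the activity, so absence proves little.
    return ActivityCall.UNKNOWN


if __name__ == "__main__":
    # Alkyltransferase-style case: no homolog, single known family.
    print(infer_activity(has_homolog=False, single_family=True).value)
    # Uracil-glycosylase-style case: no Ung or MUG homolog, several families.
    print(infer_activity(has_homolog=False, single_family=False).value)
```

Returning an explicit "cannot infer" value keeps a missing homolog from being read as a missing activity when several unrelated families can perform the same reaction.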
  69. Standard Eukaryotic Repair Genes
      Pathway: Genes
      Mismatch Repair: MSH2-6, MutLs
      Base Excision Repair: Ogg, MutY-Nth, Tag, 3MG, Ung
      Nucleotide Excision Repair: XPA, Rad1, Rad2, Rad3, Rad10, Rad25, etc.
      Recombination: MRE11, Rad50, Rad51
      Direct Repair: Phr, Dnl1
      Other: PCNA, Dut, Lon
  70. Missing Eukaryotic Repair Genes?
      Pathway: Genes
      Mismatch Repair: (none listed)
      Base Excision Repair: (none listed)
      Nucleotide Excision Repair: XPA, Btf2, Btf3, Kin28
      Recombination: (none listed)
      Direct Repair: Ogt
      Other: (none listed)
  71. [Figure: phylogenetic tree of MutS family proteins with bootstrap values. Labeled groups: MSH6, MSH3, MutS1, MSH2, MSH1, MSH4, MSH5, MutS2.]
  72. [Figure: phylogenetic tree of bacterial sequences with bootstrap values; taxon labels such as reca.wolbachia.blast suggest RecA. Labeled groups: γ, β, α, ε, δ, Spirochetes, Cyanobacteria, Mycoplasmas, D/T, Green Non-Sulfur, Chlamydia, High GC Gram +, Green Sulfur, Low GC Gram +, Hydrogenobacteria.]
