Gutell 024.jme.1992.35.0318

172 views

Published on

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
172
On SlideShare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
Downloads
0
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Gutell 024.jme.1992.35.0318

  1. 1. J Mol Evol (1992) 35:318-328Journal ofMolecularEvolution©Springer-VerlagNewYorkInc.1992The Nucleotide Sequence of the Entire Ribosomal DNA Operon and theStructure of the Large Subunit rRNA of Giardia murisHarry van Keulen, 1 Robin R. Gutell, 2 Scott R. Campbell, 1 Stanley L. Erlandsen, 3 andEdward L. Jarroll 1Department of Biology, Cleveland State University, 1983 E. 24th Street, Cleveland, OH 44115, USAz Molecular Cellular and Developmental Biology, Campus Box 347 University of Colorado, Boulder, CO 80309, USA3 Department of Cell Biology and Neuroanatomy, University of Minnesota Medical School, Minneapolis, MN, USASummary. The total nucleotide sequence of therDNA of Giardia muris, an intestinal protozoanparasite of rodents, has been determined. The re-peat unit is 7668 basepairs (bp) in size and consistsof a spacer of 3314 bp, a small-subunit rRNA (SSU-rRNA) gene of 1429, and a large-subunit rRNA(LSU-rRNA) gene of 2698 bp. The spacer containslong direct repeats and is heterogeneous in size.The LSU-rRNA of G. muris was compared to thatof the human intestinal parasite Giardia duodenalis,to the bird parasite Giardia ardeae, and to that ofEscherichia coli. The LSU-rRNA has a size com-parable to the 23S rRNA of E. coli but shows struc-tural features typical for eukaryotes. Some variableregions are typically small and account for the over-all smaller size of this rRNA. The structure of theG. muris LSU-rRNA is similar to that of the otherGiardia rRNA, but each rRNA has characteristicfeatures residing in a number of variable regions.Key words: Giardia muris-- Protozoan parasiteRibosomal rRNA genes -- Ribosomal DNA sequenceLarge-Subunit rRNA ~ Sequence comparisonIntroductionThe majority of eukaryotes studied to date carry aset of tandemly repeated units containing the ribo-Offprint requests to: H. van Keulensomal RNA (rRNA) genes which consist of a non-transcribed spacer, a transcribed spacer, the small-subunit rRNA (SSU-rRNA) gene, an internaltranscribed spacer, the 5.8S rRNA gene, a secondinternal transcribed spacer, and the large-subunitrRNA (LSU-rRNA) gene (Gerbi 1985). Variationson this theme include the size of the nontranscribedspacer, which can be either uniform or non-uniformin size within a species, and the size and sequenceof the variable regions in the mature rRNA genes.The SSU-rRNAs have proven useful in providinga phylogenetic framework to study taxonomic seg-regation of organisms (Woese 1987). Based on thiscriterion, the protozoan parasite Giardia duodena-lis (synonymous for G. lamblia and G. intestinalis)has been described as representing the earliest-diverging lineage in the eukaryotic line of descent(Sogin et al. 1989) The Giardia rRNA genes areunusual in several respects: not only is the entiretranscription unit small, but also the SSU-rRNAgene is small and resembles in size prokaryoticrDNA more closely than eukaryotic rDNA. How-ever, the G + C content of the rRNA genes of G.duodenalis is high (75%), which could make com-parison with other rRNA genes with lower G + Ccontent problematic. Giardia duodenalis is only oneof five known species of Giardia. The others are G.muris, G. ardeae, G. psittaci, and G. agilis (Filice1952; Erlandsen and Bemrick 1987; Erlandsen et al.1990). One of these species, G. muris, has a rDNArepeat unit that is larger and has a different restric-
  2. 2. 319151i0115120125130135140145150155160165170175180185190195110Ol1051Ii01115112011251130113511401145115011551160116511701175118011851190119512001205121012151220122512301235124012451250125512601265127012751280128512901295130013051310131513201GATCCGGACT CGAGATCAAA AGACCGGAGGCCCTCGAAOG GGGGGTCGTA CAGGGGT~CCCTGGGCACG GAACGGATCG GGGGTCCATGTGGGTCCTCA CGGGGGCCTT CCCTCCAGGG~GGTCCCCTG QGTCCATGGG GCCCTTGAAG~ C C G GTCCTCACGG GGTGGGGCCCCTCCATCGGG GGGCGGGGAC TGGGGTCCACATCCGGGTCA TCGGACCATG TGGTCATATGCGGTCCCTCG GGGGTGGGTC ATCGAATGGGTTGGGGTCCG GGGTCCCCCT CGACCCTCACATGAACGGGA CGGAGGTCCA ~ C C A GTTGAATCTCT GGGGTCTCGG GCCGGACCCTCCCAAGCCCT CACGGGATCA TCGTACGGGTCATACGAGGG ACGGGGGCT~ CCGGGGCCCTACGGGGCCTC CCCCCATAGG TCCATGGACCCCATCATACG GATGGGGGTT C ~ T C GG C G A ~ G T C ~ TCCGTGGGCCC ACGGCGTCCCCCCTGGCCCG ~AGTCCACAGTCATTCGAG GTGTCCCCGGTCCTTCAAGA TGAACGCKT-GATCACACGGGG GGGCTCATGGCCCCCGCCCT CCTTCGAC-C4/GATGGGCCTT ~ C AATGGGGGACC ATCITCCCCAAGGTCCCCCT CGAGTCCTCCGGGATCCCCG TGGGGTCCCCCACGGGATTA TAATC-C-C-Cs2~A ~ T C ATGGCTCCATCACGGGGTCC TIW_AAGACGATTCATGGGTG GGGGTGTGGTGGGGTCCGTC CGGTCCTCACGGGACAGGAT .C_CAGGGGGCCCGGGTCATCA GGGGGATCCGCATGGACGGG GACTCCCCCACTCACC-GATC ~ITCAGAGGGGCCCCCACGG GTCATCGGACGTAGAGGTAG TGGGTCCATGGGGAGGGAAC CCTTCCCGCTGGCGCCGAAG CGA~CCACGGAGGCCA CTGTG~CACGGGCC4SCCCG CGTCTCCGTGCTCATCACCT CTCATCACTTTTATCGTCTC TTATCATCTCCATCGCGTCC CCAGCCCCCCCTCATGGATC ~I~ATGGATCCCTGACTCCT ~ T C A CTTCCTCCCTC AGCCCTCCCCCTACCCCACG TCGC4~AGCCCTTCACGTCCC CTCATACGGG CCACATGGACGGTCATCATA TGGGGCCCGG ~CCTCATATC~ k ~ CCCCTTCATC GATCGGGTCCACACGGCATG GGAt.-J.-J.-~CAGGGACGATCGA ACGGGCCCCC GAGGGCACGAGGGCCCGGGA CGACGCTCAT CATTCCAGACTTCTCCTTCA TTCAAGTTTA GGACCGGAGGCCgCAGAGGG GGACCGAAGG GGCGGCCCGACGGCCACTGA CTGGGCGTGT CTGCGTCCCCTCCCTCGCAT CTCTCGCATC TCCCGCGTCTATTATCGTCC CTTATCGTGT CTCGTGTCCCTTATCGTCCC CCGTCATCIX2 TCATCGCTCTCACTCCTCTC GTGGATCCCC CTGGCCCCCT~TCATGCCCC CTCATGCCCC CCCATGCCCCCTCGTGGCCC CCCCTCTCAG GCCTCCCCCCACGCCTGCAC GCACGGGCCC CCATCCCCTCTCTCCCACAT CCGTCTGACT CGTCCCCCCAACCCCCAGGTGGGGAATCCCCCCGTCCCCC~ C C GTCCIIV,-GC4~AGGGGCCTATGGGGCCGGGGACGGACACCAAGGCCGAt2-r~-J."A ~ A C GCTCCAGATCC TC~TTAGGACAATAGA GGGTTCGATC~CCCTT TCGACe-GTGT~ T C G ~ T G G CCGGGTCGAGA G G G A ~AGCGAhI~x-*-I"~TCV~CGGGTCGAGG CCTGGTGC4~CCTTACACCTA A A A ~CGGCTACTAT AGGGGAAGGAAAGTG~ATATTTACTA ATAAAATCCACATCATCTCG TCCCATGACTTACGTGACTG TGCACGCAGTAGGAGACTTG CATGGGGTCAGC-CCACCTAA GGGCCTTATGGAGGCTCCCG GGACGCTCCTCA~t-~-L-~*,~AAACTCGACTTGAAAGTGCAC GAAAGATGGAGGAAGGCCGG G C C ~AC-G~4%AC-~_A_~Q~GAGGGGTCACCTGGGGA CCCTCAATAA GTC-GACGAAA GGTGGAGGCC GAL-FFICCGCTACTGTAGGG GCCAGGAGGC CGGGGCCCGG ~zGGCCCGAAG GGCGAGGCCGGCCGA~CCCCCCTC AAATCGGACG ~ CGAGAGGCCCAAATCGGACG GGGAGGAGGG CTCCAGGGC~CCCCCCCT CAAATCGAACGGGGGAGGGG CGAGAGC-CCC AAATCGGACG GGGAGGAGGG CTCCAGGGCC........................................ C **********GCCCCCCCTC AAATCGAACG ~ CGAGAGGCCC AAATCGGACGGGGAGGAGGG CTCC~Q~CCCCCCCTC AAATCGAACG GGCGAGGCGAGAGGCCCAAA TCGGACGGGA GGAGGC-CTCC AGGC~CCC CCCCTCAAATCGAACGGGGG GAGGGGCGAG ~.C.-GCCCCCAAATCGGACGGG ~ C T CCAGGC~CC CCCCCTCAAA TCGAACGGGG G G ~ GAGGCCCCCA********** ********** *****Vr*t** ******vr*** **********AATCGGACGG C-C~GGAGGCC CC~CGGC CCCCTCCCCC CTCC-GTC********** ******** ............... C .......... C .......TTCCTCCCTC CCCCCCTCGC CCGGTCCTGA ACGAAGTGCC GTCGCAGGGAGGGGAGTGAG CTGGGGAAGA AGGACCACAA CCCCCACCTA GAGACCTTCAATAAGTGGCC CAAAGATGGA GGCCGAI-I-FI"CGGCTACTAT AGGGAAC-GCACGACGAGGCC GGACCGGGAC GAGAAGACCC ACCCGACGAC GAGTGTGC2_A~~GTACGACGA GAGGACGGAA GGGTCCGGCC CACAACGGAC GGATCGACCGVlATCGACAGAC CATCCGGTTG ATCCTGCCGG AGTACTACGC TACeCCAAGGA~C, CCA TGCAAGCGGA CACGAGGTAT G A A ~ ACGGCTCGGTACAACC~TAC GAGTCTGACC GGGGGTGAAG GCTAGACGC~ TACCGCTGGCAACCCAGCGC C/%AGACGAGT GCTCAAGAGC GGGGAAGGAA AGCACGCGATGGAGCGAATG CCCGGATGAG GTTCCGAGGT ATTACCTAGT CGGTAGAGTAGTGGTCTACG GAGGGGATGA TGCCTGGCGG AGGATC/kGGG TTTGACTCCGFig. 1. The nucleotide sequence of Giardia muris rDNA. TheRNA-like DNA strand of the longest rDNA is given correspond-ing to the map. (See insert.) The positions of the 5 and 3 endsof the SSU-rRNA (numbers 1 and 2) and 5.8S rRNA (numbers 3and 4) genes are indicated with numbered arrowheads. Here,only the first 23 and last 37 bases of the LSU-rDNA are indicatedwith arrowheads (5, 6). The sequence of a shorter spacer-specificSau3AI fragment, position 2139-2901, is aligned with and writtenunder the DNA sequence; the identical base sequences are indi-cated by dots (.) under the bases, and deletions are indicated byasterisks (*). The positions of a number of direct repeats areindicated by underlining the beginning of each repeated sequencewith a short arrow (---~). The various restriction enzyme sites,3251 GAGAACGGGC CTGAGAGACG GCCCGTACAT CC/%AGGACGG CAGCAGGCGC3301 C,GAACTTGCC CAATGCGTGA AGGCGTC.AGG CAGCAACGGG GGATCCCATG3351 AAATGGGAAG ACTGGGGGGT AGATGACCCC AGCACAAGTC GAGGGAAAGG3401 TCTGGTGCCA GCAGCCGCGG TAATTCC.AC~ TCGGCAGGCG TCGTACGGCG3451 CTGTTGCAGT TAAAACGTCC GGAGTCGAGA CGTCCAC, CCG GGAGGAAAGA3501 GGAGCGCTTA AGGCGGGAGT GAGTACGAGA AAGCCCGGGA CGGACATGAA3551 GGTGAATGGG TAAGGGCATG TGTATTGGTG ~CGGGT GAAATAGGAT3601 GATCCGACCA AGACAGAeAA AGGCGTAGQC ACTTGCCAAG ACCATATCAG3651 TCGAACCAGG ACGAKGCCCG GGGGCGAGAK GGCGATTAGA CACCACCGTA3701 TTCCCGGGCG TAAACGATGC CACCGAGAGA CTGGCCAGGT CGTO_AGGATC3751 GAAGGGAAAC CGATCAGGGT ACGGGCTCTG GGGGGAGTAT GGCCGCAAGG3801 CTGGAACTTG I%AGGCATTGA CGGAGGGGTA ~d%CCAGAeG TGGAGTCTGC3851 GGCTCAATTT ~ G C GAACACCTTA CCAGGCCCAG ACGTACGGAG3901 GATCGACGGT TGAGAGGACe TTCGTGATCG TACGAGTGGT GGTGCATGGC3951 CGTTCACAGC e e Q ~ AGCCGTCTGC TTGACTGCGA cAKeGAGCGA4001 GACCCTAACC TGGATGGGAC CGCe~ATGGT G/%ATTGGAC,G AAGGTGGGGC4051 GAT/%ACAGGT CTGTGATGCC CTTAGACGCC CTC,GGCTGCA CGCGTACTAC4101 ACTGTGGGGA TC,AAACCACG TCGAGTTGTG /%AGCTTGATG AGATC/%ACCC4151 CCACGTGGTT GGGATCGTGG ACTGGAKCGT CCTCGTGAAC CTC,GAATGTC4201 TAGTAGC,CGT AGGTCATCAA TCTACC,CCGG ATACGTCCCT C,CCCCTTGTA4251 CACACCGCCC GTCGCTCCTA C C G A ~ CTTCTGGCGA GCTCCTGGGA4301 GGGATGAACC GAACAGGGAC GAACeGCGAG GCTTGGAGGA AGGAGAAGTC"1724351 GTAACAAGGT ATCCGTAGGT GAACCTGCGG ATC,GATCCAT CGAGAGGGA/~V34401 GGTACGAGGA TGAAGC-CAAT GAGATGACGC GACCCGGTGG ATGCCTTGGC4451 T C ~ C G ATGAAGGACG TGGCTGACGA CGATACTeGA TGTC,GTCCAAV44501 GCACGTGACA TCGATCTTCG AATGGATCAT CGGGGTGGTG GGGTGATCTAV54551 CGAACTGGGT TGTTCGAGGG GTTGAACGAT GGGGAATGAG ATGAGGATCC4601 CCCCCCACTC CGATGAAGAT <Large subunit rDNA>V67251 CGCCTGAGGT TTGGTTCGGG TTGGCATTCC CCCCGAGCCC CCGAGGGTCG7301 AAGACTGAGG GGGTGAGGGA AC~GGCCGGAA CAACCCCCCC GAGGGTCGTG7351 AGACGAAGGG G ~ AGGCCCCCCG CCCCGAGGAA GACCTGAAGG7401 GGGGTGAGGG CATCCCCCGT CATACCAGC~ GGG GAA GGCTCCCTCC7451 GGTCGTGGGA C C A ~ T GAGGGCCCCC GGTCGTAAGG GGGCATCCGA7501 GGGGTGAGAC TTGAGGCGCC GAGGGTCATA C ~ T~%GGC42CCA7551 ~ T CCCCCGCCCG GATCGACCTG AAGCGAGATC AGACCCCCGT7601 GGGTCGAAGG ~ T A C G GGGGCCTACA GAAGGATACC AGATTTGGAA7651 CGGTTACAGT GGAI-I-I-I~SI II II IBX BXB BBB B88 S T B K HBBK PP O8,, H • B,I-,..,,., . . . . . i. , , i ..i , i , ,- | ,1kElP| icorresponding to the map, are indicated by underlining. Theclosed arrows, position 128-973, indicate an open reading framefor 281 amino acids. Insert: physical and genetic map of the G.rnuris rDNA. The positions of the rRNA genes are indicated byboxes over the restriction enzyme map; from left to right theboxes represent SSU-rDNA, 5.8S rDNA, and LSU-rDNA. Thearrows indicate the direction and extent of DNA sequence anal-ysis. An oligonucleotide, used to supplement the sequence anal-ysis, is indicated at the 5 end of the SSU-rDNA as an upright baron an arrow and has the sequence 5GATCCTGCCGGAGY.Restriction enzyme sites: BamHI (B); HindlII (H); KpnI (K);PvulI (P); Sau3AI (S); TaqI (T); XhoI (X); the short bars withoutletters are the SmaI sites.
  3. 3. 320tion enzyme recognition pattern than G. duodenalis(van Keulen et al. 1991a). The rDNA of G. murisalso has a much lower G + C content (61.9%) thanthat of G. duodenalis (Boothroyd et al. 1987). Toprovide a basis for further analysis of the taxonomicand evolutionary position of the genus Giardia, theDNA sequence of the entire rDNA operon of Giar-dia rnuris was determined from available clones.This report will focus on the spacer and LSU-rRNAgene and its RNA structure. A comparison betweenthe SSU-rRNA of G. muris, G. duodenalis, and G.ardeae and its implication in phylogeny will be pre-sented in a later publication.Materials and MethodsIsolation of DNA. Giardia muris trophozoites were obtainedfrom cysts isolated from CFI mice by standard methods (Robert-Thomson et al. 1976; Sauch 1984). Trophozoites were harvestedfrom G. muris cysts which were induced to excyst as previouslydescribed (Campbell et al. 1990). Total genomic DNA was iso-lated according to a standard procedure described previously(van Keulen et al. 1985). Plasmid DNA was isolated by the al-kaline minipreparation procedure of Birnboim (1983) modified bythe use of ammonium acetate for the precipitation of the genomicDNA complex (Bird and Wu 1989).Southern Blot Analysis. Total genomic DNA was digested (4~g per digestion) with various restriction enzymes and fraction-ated according to size on 1% agarose gels. The gels were stainedwith ethidium bromide, photographed, and prepared for South-ern transfer to nitrocellulose fdters according to standard proce-dures (Maniatis et al. 1982).All cloned DNAs used as probes were digested with the ap-propriate restriction enzymes and separated from vector DNAby agarose gel electrophoresis. The DNA fragments were puri-fied by the freeze-phenol procedure (Benson 1984). Nick-translations and hybridizations were carried out as describedpreviously (Campbell et al. 1990).DNA Sequence Analysis. The appropriate DNA restrictionenzyme fragments were size fractionated, purified, and cloned inM13mp18 or rap19 vector DNA with JM 101 or JM 109 as hoststrains (Messing 1983). For the single-stranded DNA isolationsand the sequencing reactions, the protocols supplied with theSequenase and TAQuence kits were used (United States Bio-chemical Corp., Cleveland, OH). Areas with extreme band com-pressions were analyzed by using dlTP with Sequenase or7-deaza-dGTP with TAQuence in the reaction labeling mixture.When necessary the standard 8% acrylamide/8 M urea sequenc-ing gels were substituted for 8% acrylamide/8 M urea/20% form-amide gels. Most of the sequence was determined by analysis ofboth strands; in small remaining sections where this could not bedone the sequence analysis was repeated using the two differentenzymes. The oligonucleotide used as internal primer for se-quence analysis was synthesized in the microchemical facility atthe University of Minnesota and had the sequence 5GATCCT-GCCGGAC3. The sequence data were analyzed with theDNASTAR sequence programs (DNASTAR Inc., Madison,WI). The sequence of the LSU-rRNA of G. muris was comparedto those from G. duodenalis (Healey et al. 1990), G. ardeae (vanKeulen et al. 1991b), Escherichia coli (Brosius et al. 1980), andHalobacteriurn halobiurn Mankin and Kagramanova 1986). Se-quence similarity was determined by the DNASTAR alignmentprogram, based on the procedure of Wilbur and Lipman (1983)with a K-tuple size of 5, range of 20, and gap penalty of 6.Sequence alignment was performed by aligning the sequencesaccording to the procedure of Woese et al. (1983).Secondary Structure. The secondary structure diagrams weredrawn in a format based on that of E. coli 23S rRNA model,which may be regarded as the standard or prototype structure(Gutell et al. 1990). The figures were laid out with the assistanceof a new RNA graphics program, XRNA, developed by BrynWeiser (unpublished). Identification of some of the variable re-gions and estimates of their sizes were based on the compendiumof 23S-like rRNAs by Gutell et al. (1990).ResultsStructure of the rDNA Operon and Spacer DNAThe rDNA repeat of G. muris contains two HindlIIand two KpnI sites (Fig. 1). Cloning of the operonwas performed by isolation of the two HindlII andthree KpnI fragments from genomic DNA digestsand insertion in plasmid vectors. The clones consistof HindlII fragments of 5.2-kilobasepair (kbp)(pGmr5.2) and 2.55 kbp (pGmr2.5) and of KpnIfragments of 6.5-kbp (pGmr6.5), 6.2 kbp (pGmr6.2),and 1.15 kbp (pGmrl.15). Digestion of the clonespGmr6.5 and pGmr6.2 with BamHI revealed thatthe size difference resided in 1.8- and 1.5-kbpBamHI fragments, respectively, which hybridizedto SSU-rDNA probes (not shown). The size differ-ence was caused by the presence of differentlysized Sau3AI fragments in the various clones whichare located 5 to the SSU-rRNA gene (Fig. 1). Sincedifferently sized clones could be artifacts from ex-tension/deletion of the cloned DNAs in E. coli dueto repeated sequences, the possible presence of var-iously sized BamHI fragments in genomic DNAwas determined by hybridization of nick-translatedSau3AI fragment to size-fractionatedrestriction en-zyme digests of G. muris genomic DNA. Indeed,four bands hybridized to the Sau3AI fragment, in-dicating that there were several differently sizedspacer DNAs present. These same bands also hy-bridized to a SSU-rDNA probe (data not shown).The restriction enzyme map and entire sequenceFig. 2. Sequence alignment of G. muris, G. ardeae, G. duode-nalis LSU-rRNA with E. coli 23SrRNA. The sequence of theLSU-rRNA of G. muris was aligned with the sequence of G.ardeae (van Keulen et al. 1991b), with G. duodenalis from po-sition 856-3745 of the sequence published by Healey et al. (1990)for G. intestinalis except that a third C was inserted after posi-tion 3500 (unpublished observation) and with E. coli (Brosius etal. 1980). The dashes (-) indicate spaces created in the sequencefor optimal alignment.
  4. 4. 321£.~5"O3.m6a¢l~Om~¢*O.~,~O ~Ez~Gmd~G~E~O.,,w~Om~m0 i,,md,~EuO.~,~E.c~EGJ,w,~E.coEG.,,.M~G.¢,I*~G~tL*£z*l~G.,,.M*Ez*t~G.,,.M~G..~,uG~,&,.*O ~Ec.aGm,L,~O ~O ~O ~ / ~GJ~a~O ~O ~O~,~£.~E~,;J£.~G.m~G.mnr£.~IGGVUAAGCG&CUAAGCGUACACGGUGGAUGCCC UGGCAG UCAGAC, GCC.AUGAAGGA it~GUG (CUAAUC) UGCGAUAAGCGU CC~ i< .................... >AAGGU (GAUA~IA) ACCGUUA |01I...... ugagaugacgcGACC~UGCt~UGGCUCGGGGGA-CGAUGAAGGAI (~GUG {GCUGAC} GACGAUAC -ucgsugu t<gguccaa- --gcac~g- - -> ..................... $$I ...... caugcagaccc~ccGGCGGAUGCCUGGGCUCGGGCCU-CC,/tGG~GGAICGCG(GCGC~C)AGCG~AU-gcg~gc I <a{j -cccg- - - 0cacga a- - - >..................... 87I ........... alC{JCcCCGCCGGCGGAtJGCCUCC,CCCCGGGCGG- CC,^CGJ~AC.AGI CGCG(GCGGAG)CGCGAC,AC-{Jcggu{JC I<ggacccgccc{jcccc{jagaa> .....................UAACCGGCG IauuUCCGAAUGGG (GAAA )CCC [AGJGOGU (UUCG )ACACACUAUCAUUAACU- GA ........ (-AUC .... CA} ..... UA~GGCGAACCGGGGGAAC~J IGA (AACA) UC 201--Itcauc{ja I -ucUUCC~U- -- ( .... ) - -- I ....... ( .... ) ..... g gaucauc{jggg - uggtt{jg.... (..... /uccc) -cccccacucc{jaugaagaugA~U [UA (AC,CA} UA 166--gcacc{jgl -uuUCCC,~.... ( .... ) - -- I ....... (.... ) ..... c {jcucggccuc cc cc{jgga {jacc [{jgccg/-gac) uc cccc gagc g{jccggggac gACGGGCGGGACUI UA(AGCA)~ 174--gcaccga I -¢¢cucgaacgc- (a{j¢- ) -go I {jc..... (cccg) ..... slcgccgcc{jccu-cg{jc{jc cc -- (..... / -gc{j) cggcccgaggc{jgc{jgg{jgcgACGGG~CUI UA{AGCA)U^ 182t.t~Gt,~CCCCC~GG;~4+~UC (JO,CC) C~GI~WCC CCCAGU~;C(GGCC,A} ~ ~ ~ I <....................... ~CUG,~t~AG .... UGUGUGUGUUAG~ 29"/UCN;UACOCCCAGGAGJU~GAC~cc(AACC) GGGAt3~.CtPOC.i~ (GG<3C,A) GCC,AUCCA(;GAGUAGt/O~GI <accuc g (aa{j) cg auga gc guug >.... u{jaggacuggg{jggauguccuaaacc¢ 2J3UCAGI~CGCC~C~CACC (~CA) GGG&UUCCCG--GUAGC (GGCGA)GCC,AOG~GCCCG I <- -c~¢ (gaa ) gggcc- Oc~> ......... gggcguggcc{ji {j¢cgggg{j{ja 2g|t3C.NCUACGCCCC~CC (~CC) GGGAUUCCCC--17JAGC(GGCGA)GCGACGCGGGN~C,CCCGI <--ccc¢ (gaa ) gg-c{jcgcu{jug> ......... {jg{jcgc aggc gcag gccc gccg 289C,AAGCGUCU( ~ - )AGG~UACJ~<23 (UG~CJU;C) CCCGUAI CACJ~AAAUGCJ~CAUGOUGUGAGC~U....... IGAGUAGGGCGGG&(CA~UA )UCCUGUCUG3~UAUGGGGG 4]0gaagaugg{j(~gaaa)ccca-caccaagGJF~Q3(UGCCAGU)CC-GUA|gg{jau--gga¢acaa--ccc ......... ¢aacgacIGAGUACCUC~X;CO(---ugagag)-GUAGAGOGIa--~GGAG 3Mg~&gug~gg~ggg~gc~cuc-r-g~¢gga~G~P`~UC.J~C-~GC~CCc(~3C~u~cc--ggg¢g¢gg~-gcc ......... aCgcggglC--,N;UT++C-CGCCGCt3(---u.gagag}UGC.lV~,~,CGII41--cC,OC,3~ 3~4~{Jagggggc(ccgagg~g~¢-cg~{JaC`J¢.~<23~t+@PJ~ACC~CC~GUA~gg~---ggc~g~gggccu ......... {Jcgc{JgclGAGUAC,CGCt~Ct3(---ugagcg}UGCJV3~CGa---aG(3CJ.G 392(G-ACCAU -}CCOCCAAC,C~UAAAUACUCCUGACUGACCGAUAGUGAAC- CA(~JACC {Gt~,A )GGGAAAGGCGAAAAG~CCCC (~-) ~ ~ C C - ~ C C ~ A ~ < - -- 52~~GUGGUA~C)CUUCt3g~-~3P.~CU~UA~G~CCCCC`J~A~C`Cm%U~CC3~AGUAGU{GL~C~A~A<:Gj@+AGGUC~A~-~C~AUGC~(ga~a)G~CA-cg~.~-~~¢~g{Jg~g<~ 505~CUUC~UACCQ~CC(~3AC~AcCC`AUAC~K:GC~j~C`~J~`GtJAGU(QL`GA~ACGAA~[;;GL~.~`~J~C~ACGCC(ccc{Ja{J)G~CC~gcu-~C`AcC-~C~ccc{Jg¢uggcg<-c~ 503(GCGCGGcC~CU11CCAAGGcU~UA~CC~cCC~3~ACCC`AUAGO~C~ACCJ~Jk~A~(~)G~~CC~u~gc-~g~CC~CC~CC~CC<gc~ 509............................... >ACAA CGC{WAG )C~~"0GUGAC UG O~ UA CC UUr~JGUAUAAUGGGUCAGCGAC t~AUAUI~C UG UAGCAAGGUU~C C -- (- --GAAU 6|5cc¢{jaccugucga(-uga)ucgauuggucgg> ................... ( .... ) ........... -GCCCGUC~CACC, GACCGAGGAGt23AGCJ,CCGAUCGCCJ~+GUCauguggc{aaa{Jgga 593ccgllgggcccagg(gCgll)Cc~4{jgccgtlc{jg> ................... { .... ) ........... -.GCCCGUCUCC,/~CACGGACCCJ+GGAGCCGCCCCCCGGCC.CGAGUCag&gga- (- - c {j{j{j{j 5~ggcc ......... {¢{jcc] .......... {jg> ................... ( .... ) ............ CCCCGUCUCG~,AACACGGACCG/tGGAGCCACC,CGCCGCGGCC,~GCCc{Jaggg- (- - - agcc ~7~A) -GGG--GAGCCGAAGG (GARA) CCG< ............................................. >AGUCU (UAACU )GGGCG~G~AG~JC, CAGGGUAUAGA¢CCGAAACCCGGUGAUCUAG 690a) gucaa{jacuucaa{jg{j ({jc¢c) cuu<a¢ ................. (aa{jg{j) ................. ca> ................... GAGCC~UCG~.ICCUAC,ACCCC,M~4GGUGGUGAUCUAC 663-) -,JCC-UgUGGCGC~GC|C,ACG)C,CG<Agau¢ccccg....... ga ({jcc¢-) ug .......... ¢ggggcc> ................... GAGCGGC~C,GGGGCG~.,GCCC~AAAGGCGGUGAUCUAC 672-)-~--g~GCGC~`~`G~(GAGC)GC~¢g¢~ggg~g~g~g~c~c!-tgcgggcg~J{Jc{Jc{Jggc~> ................... GAGCCGCGGCGCGUGGGCCCG~GCGGUC~UCUAU 675~.,~R]G~GUUC;GG [U~C )A C U ~ C C C , J~CCC~C-UA-AU (G ~ ] AUUAGCGGAUC~CUUGUGGC~U(~ G ) ~ ~ U ~ ~ 812AC~I~ACCAGGGUGAAGCCAGAC (GAAA] CC~G -GUGCUGA (CG~T~AAA) UCC~UCGGUCGAGUUGAGUGUAGGGGC {GAAAG }ACUCAUCGAACCACC~JC~IA~ 786GCCCGGCCAC-,GGCGAA~JCCAGGC (GAAA )GCC~GGAGGCCC~CCC~G -GUGCUGA (CG~?~3AA} ~ ~ C ~ C (~ G )A ~ U ~ ~ ~ 79~GCCCGGCGAGGGC~ AC,GCCGGGC (GAAA )G C C ~ C C G C CG~G-GUGCOC,~ (CGCGCAGA} U~GCUCGUCGGAGC ~ C (~ G )A C ~ U ~ C C ~ ~ 798t~CCCGAAAGCOAU {UUAG )GUAGCGCCU~G~JGAA~CAU !......... CUCCGG ]GGGt~GAGCAC~I/J-OCGGCAA- -GGGGGUC (AUCCC) GAC~JACCAACCCGAUGCAAACUGCC, AAUA |CC~ 924CCUCCG~AAUG~JCJ [CCCA )GGAUAGCCGACAUC ...... ~uga~ca{J~u{Jc¢cu~AGG1~ACGcCAc~AUCGgJGGGA~JAGGGAG~gu~a~UcCJ-UAU~cA~CcUCGAA~CAUGAAcA~g{Jg 901CCt~CGA,4AUUUCU [CCCA) C,C,~U~CCGCCACC...... I¢ggcagc--u{jcg{j¢ ICGGU^GGGCCGOGACCGC,CGGGAGCGGC,GGA(¢cu{j- ) UCCCU-CGUCCGCCCCU~CUCCG~kcgI g¢cg 908CCUCCGAAAUGUCU(CCCA) GGACAGCCGCCGCC...... ~c~gcagu-~u{Jcggc~CGUAGAGCGcUGGCCGGC@C~GAGCGG~C`G+~J(c¢ug-)CCCCt~-CGCCCGCCCc~CAAA~UCC~G~g~cg 911................................. J~UGtT~UCAC~,C,AC-ACACAC(;~CGGGU(GCU,~C) GUCCGUCGUG~C,AGGG(~ ) C~ C ~ C ~ C ~ 1017~Jgg~g~gg{J{J~sacuu~gug{J~c¢cccuca~gu;pJ~u~G~`UGUCC~A~GGGCCUCUCCUG(G~U`qAG~CA~`~A~GGGC.~J~G^CGG~C`AUG~CGACAGUCG-GGOC~J~G~`1JG1~GGa~UG 1023¢{Jcc~cgggg~-g¢uuggcu{Jggcgc~gu{Jggg~gug~g~GPGUGG;cG~UG~+J~GGCCUCUCCUG(G~Uj~J+.G~cAGG^~*GCCJ~GACGG~C~AUGAU~C~CG-~CCcAC~GUGCGG-ga~CJ~+~G~ 1029c{Jccgcccc{Jc~-gcuggc~ugg{J~{J{Jggcggg<~g~augc{J~cC~C~CGCG~C~GCCCCU~UG~G~UAAG~cA(;~`A~;G~GAGG~GG~C~AcGAU)~CGCJ+~G~GGC~AC~GGUGCGC-cg~cGC~GG 1032UUAAGUGC,GPJ~CGA~CCAC, J~4GCC~,C~UGLV.K~ (Ut~N;JU~C,CA} C,C C A U C A ~ {~J,4AUA)GCI~ C ~ C ~ C ~ ~ -U 1141CUcgau -c as a- C G G U G U C C G U C G ~ ~ C (tTO~C~GUA} ~,M, G C C U C U ~ {GUN,CA)AC I C C A C C A C , , C C C , ~ ~ C - AC~ 1145cGga g¢ -g aaa- CGGCGUCGGUCC,GUCC~CO(:W,J~C, GUGGCC(CO~,GUC ) (IC.,~,UCCUCC~A~ (~ )AC I ~ C ~ ~ ~ C - ~ 1111~gc--g{Jaa~cGGCGUCc~GCCGGUccCGA~AGCUGGA~.~3GUGGCC~ccAGNkGUc]GGCAUCCU~CAGG;AGUGU~G-GAAcA)A~t~CA~cAGCccg~UcGG~GGCCCGGAAAAUc`GAG~-C~C~cG |153AAACCAUC,C l ACCr.,A%C,CUGCGG| CAGCC-,ACGC|~JAO- ) GCG~UIGUUGGGUAGGGGA~UGUaa- GCCUG-CGAAGGUG~ (G~ ] ~ IA ~ . U ~ ~ 1260AAGCAUACGI AUCC,Ggt,ACCCGACI ¢ga{ja.... [u{ju{j-) .... .~cucgCGUAGGAGGUcguc~`ug~uc--ga{Ju-cgasgc~augg{J(GUGA]cac{Ju{Jgu{Jg~au~{JagucAUGAUUGC1J~AUCUCGG 12.%C,AGCCU<3C,G IACUCACGCCCGGCI ¢ gcc a.... [¢gCg-] .... ugg~gG~U`~GC~A~Ccgc{J{J¢gcu~--gggu~{Ja~ggc{Jggg{J~C~CCA~ccc{J~gcu`gg~gg-{J~;CCt3GUGCACAUcL+C~G 1261C,&GCCCCC~IACCCC-CGCCCGGCI cgCc;r .... (¢gcg-| .... cggcgC,(;UAGGAGGCcgcI{J ag{jc ¢ccgggg{jc{ja aggcg{jc{jc (GCAG) gc cc cg¢cgg I accgg- cCUCUGGUGCAGAUCUCGG 1266CA~P~GU~J~CC~U~G(:GGGU( ~ ) GCCCC,CUICC~CC,C,~- C~CCN~GGGU,JCC~G~J~CG (U/J~AU}CGI GGGCAGGG~,AGUCC,ACCCCUAAGC,CC,J~GGCC(C.~) <............. 136"/G%GUN;U~,GCCAINACUCC- - - (au¢ -u } o--GG~G IGCCU~aCgU~TJ~;kOGGGUUCCGU~UCCCOG(UCGAU) CAI ~ ~ g a -ucc~ a......... (.... ) <{jucc{jauggu {gu 131~t.~GCAGUAGC~COACUCCG-- (ucacc } - ~ IGCCI~gg -uGGAC,J~GGGUUCC~UGG~CCC (GGC~ )~ | A C ~ gg-C ccual ......... (.... ) <-{JC{JCgU{J{Ju (gU 1361C3+.C-CAGt~CCCC~CUAC`UC~--~g~--CGGAG~GAC13g~gggGG]~G~11~G~)~~C~{Jg~gccuaa ......... (.... ) <-{jcggcgggu ({jm 1367................................... >GGCGUAGUCC~U-C.,GGJU~ (UUJI~UAIPJ} CCUGUACUUGGUGUUACU<................. >GC~CGGACJO~,GCUA 14~9a¢ ] gccguaccaugggg (ucou )ccucllcaggacg> ......... ~tu~g~ugA~CGG~N~CAUc}~CGG-U~guaugau{J{J{Jg<u~u~ag~gu{Ja)cau~u>uu~a~aa~{Jguacuugaa~gaag~ 1465c{j } accagaggacgua¢ (ccc¢ ) guacgc¢¢gucg> .......... aaafigg{JaA~3G~(;G`~J~`C~CU~ccGGU~G~{Jcg~{Jg~uc<-gc~g(gc{Ja)cg¢~{Ju>ccg~ccgag~a~-gugca~gg{Jcc 1468ag)accgggaaggg{jcg(ugcc)c{j¢ccguc{jaac> ............. g{JggaGcC~G~CGC~I.`~GA`CU~(-~``~`3CAGgcgcg{J~cc~c<~g~g~gaga)~gc-cc>g~c~¢gg~ga~gc{Jci~g{Jgga 14"/1UGUUGGCCGGG(C C . ~ ) CCC~CGUGUAGGC I .... UGGUU-UUCC~GG(CAJ~U) CCGC,J ~ ~ G G C G U G A U C ~ (UACG} GUGC~GC~AC3u~UGCC 155"/~tcaug{Ju{J{J{Ja(gucauccc{J-}u~cu~tc~cc~guc¢cc~---~--cauggccag~-~ugu~(w4~g~gu¢gV`g¢augg¢ca-u{Jaug{Jat4ca ............ (.... ) ......... CaUguc-cuu¢ 15~a{jgcgau{jg{ja({jccgccgcg-}ucccagccccc{jcu¢¢ .... I cgcgugcccccga{j-a¢ (c~U;lC) gug{jcgccg{jg{jcac{jg{ja{jcg~tcu .......... ..;{ .... ) ......... c{jcc.gc{j{jcc¢ 1~61cc{jcggc{jg{jc (ggc{jc¢¢c{j- } {jccc{jcgaacgcccc ..... I{jcigccccc-{jgac-gc(cctugc)gcg{ja{ja{jg{j{jggcccgg{j{j{jcggac¢¢ ........ (.... ) ......... c{jcgc-guccc |~4CI.P.,CbUCC-AGC.J~.JUU~,CCt,CU;O,C,CJI+IJCAGGI.t~ C.&UC.JU~UCGUACCCCJU~CC(~ )~ I~ ( ~ ) ~ C ~ = I I ~CUCGGG~CtJAI]:OC,t,A~IA 15"/9g-gucaucc~u{jaaacccguugu- --guCCCCcccauccU{JaUCGUAC-CACJU~CC(GCk~,CA} G(31A~U {~UCA) G C ~ - ! Ig{jga gAC.,AC¢.~.P~ ~ ~ ~ * • C¢~ ~ ~ 1673g- 9uca¢cccugaaa -ccucg{j{j- -¢{jgggcgugcu{jcgcugCCGt3Aco - -CC&CC (GCIkGCA)GGIA ~ (GAGC~),)GCCUCUGGGC- I | g g g a g A C J ~ C ~ 16"16cggccgccccu{jaaaagccgg{jg .... g{jcgcc{jgc ¢gcgcgCCGUAC-- -CGACC (C,CAGCA} GGIACOCCGGGGU(CkC~.A)G C ~ I I g{jgagcc,,l~A GC 1610UGGUGCCGUJU~(UUCG)(~,AG~GGCACI GCU~UJ~U~J~,GUGA~IX:C(CUCGC) <............ > G G A ~ U C ~ I . U t C ~ U G G C ~ - C ~ (AUUJ~)JU~ 1717UJ~C,CUCCGU~C [CUCG)GC,AGAJ~GG,~,UIGGCUCU{Jau............ ( ..... }< .......... ¢g> .......... I+UCAC,k~ CGG~C~kCCGJ~GG~,~r~gCC~CUGU~(A~I/~) CJUt, 175"]UJ~GCUCCGU~ (CUCG)G~tJ~GGJ~U IGGC~Ugau............ ( ..... ) <c¢(gc¢-] gga-> .......... cu ~CGG,V=~--~,.T, GGAU~ -~OLIGO~ (AJ~l~ ) CJ~ 17(6CGGCUCCGUJU~(CUCG) GGAJU~GGJ~U I GGCUCUgI¢............ ( ..... ) <gg (¢gcg) ¢cgg> .......... g U ~ C G ~ ~ - CGAOLIGUL~{ACUAG)~A 17"/1CACA~..;~.CUGUGC-AJ~ACAC[ - {jama) GUaGACGUAt~a;tGUGUr_,ACGCCUGCCCaGUIC,CCG-CJO,OGUT3~,k~JI~C,J~DI~(~t.U~C{C,C~ ] ~ ~ I CCGGUjUU~.CI~C 1G 1906CAt,I~GCGUCGUGCcAGtCG%t[caug{j )A~cAcr.,AcGIJG&UUUCUGCCCJ~GU I~Ccgucl¢ .... ¢augs~ .... (gagg] ..... uucau .... ¢c~agC~ I CUG(P,JgUt~ IGGCIG 1162CACAGcGUCGUG~.~C~GU~c~-u{J~v`~CGA~CA~:~`~J~GLP`J~+.UUUCUGCCC~J|GCCJ~C¢guc~-~-~gug~a .... (gig{J) ..... t.~C41~¢.... ¢¢&¢g~ | ~GUA~ |GGC|(; I1~1CAP-AC,C~J~-~.,CCG (- C~¢¢ ) ¢ ~ ¢ 1 3 ~ 0 ~ * ~ ~ i ~CCCNCa¢ .... ¢gUgla .... (gCgi) ..... UCCg¢.... ¢gaIgCCI~IGC,~IO 1174C,CCGU(~CIJ~UA) AC(~/CC~,AGO (U~CC,,lUh~) CCbUIGt..CG~ (IJIk~3) U C C C , ~ ~ ~ ~ ~ ~ {U 2021Gr.,AGU(JO,CUAUG) ~ G G (U~,CC,~AUG) CCUCi GUC~GG( ~ ) OCCr.,AC .~C,AUCCP-.&CU~.~C~CU¢CgUpa- ccuacag,~CCG (G 1913GGAGU(~ } A~-~J~,GG (UJV-,CC.AA,~UG) CCUCIGUCGGG(UAAt~,I) UCOC~C~U~;,~Jt, UGGACC,kACC~J~-,AUCCC~L~I.t~.~CCGAGI~C~GC¢,dC¢ F gl t- ~ , ~¢~ (G 1919GGAGU~AAC~&~G}AC~q~r~RLqAG~(~GCCAAAL~CCUC~G~CGGG(c~UU}UC~J~C~JGCA~C~AA~C~AC~~~~g-gag-~(~ 1994C-,2~UG ) P-AGUGOACCC"C-,O~~~CGt]GA IACCtrtX~tlq~C~CO IG A A C A U O C . A G ( : ~ ~ (r.,~CA}G~COC,C 2155CAACGGG) C C - , G G ~ C C U U U t ~ /~ C , ~ . ~ C U l GGGGC,AJOGGG~UCAUCC,GUGCAG(~r.~CAGGGGJ~ ................ (g¢cg) ...... 2016C.,~.CC,GG) Cr.,(~UC,C C G G ~ C C L ¢ , t ~ G IAP~U.~GACUCC~CCCGG~COI . C ~ C G ~ r ~ ................ (9c¢g) ...... 2092cJ~c(;~.,~ ) CGAGGGCC C,~C~CCCt-~,.q3G IAC,CUUGACUCCJU~CCGGG~COI ~UGGGGCGG~,CGGC~r~__ _n,~_-_~.~................ (gccg) ...... ~097N.~,AC,CCC,ACC~,~AUACCAC~G-~UC,~UGU~(]UUG~¢CG (~U~UC) CG~J,.P..C<............ ~ (0~} CGGd-Cl ~ 2262....... ¢~ J ~ . C C CUC~CGIJUGtr,.C~ CAUCCCA¢ucic ......... ( ..... ) ........ < ¢ g a c u u a c a u ¢ ~ ~ # ~ ; ~ Q ; ~ O ~ (G~;) C~,~GC I~CCU 2110....... ¢~C.&CCCUC~CGGU~CCACCC, CCC~ac ......... ( ..... ) ........ . , m c g g u u c g g c g s ~ (GGG)CGGO~ IgCCU 2116....... CC,CCCCt;P+..k~P-ACCCUC,ACGC,CCC,CO~CGCCCCGCUcic......... ( ..... ) ........ <¢¢ggucgcpcgg>GG~CCC,C ¢ ~ J O ~ Q (GGG)(~,a;ICl Gcco 2191cc (U;U~..~J;U~C] UaA<]C.P,GCACC,JU~; UUGC,CI Uk~.U~ (UC,~;,+.C~) ~U,JJ~,,AUOC,~ mJC,(~/¢;cG (UC,~OGG-) C~CJ~.AGG~C (a&~} ~..~ 2312GC(UAC,AUCACJ~OG) GCN~,CGUCCUA~ IUCAGUGAGG(UCGGJ~A) ~U~-,CUUACUUr4~UA~IACC-(¢cugucc } - GG~J~Clb0~C~3 (~) G~0 2299C.~(UJ~-,*~CJ~CACC) ~CCAC(=GCC, AGCI ~ (~C,J~A] CCUC~ P . . t CC,CCC~C~CCCC,CC,CCA(¢¢agucc ) CGGC~~-gg-~C (C,jO,k) ~.~ 2306C,C(UACACCUC,ACC) C,CAGG~UCCCACGG~;@C,C I ~ [3~~,A) CCUCCCGCGGADCAC,AJ~GGCkCM,p . , C C ~ C ~ C (¢¢¢~ ) ~ (~ } ~ 2312GUCAUAGLP_~UCCGGU(~-UUCUG( -AAUG- -GA )&GGC,CCAUCGCI.tCJ~CGGA~(X;GGG&~C~C~ (~ )A ~ ~ ~ 2.~04G ~ C c U l ~ c c ~ c t R , ~ l ~ c c g c u ¢ ( c g g u a l u c c J g a g c g u g g i g - g u g a c i g a - - ~ A C ~ ~ ~ ~ ( ~ U | ~ ~ ~ ~422GG~LIACCC~UC~-¢gCU¢ (gggca - -u¢ ) gagcgcgg&g- gugaci ga - -IIJ~GINACCACAGGG&UAAC~~ (~ )~ .t~"IN~GAU 2426GGCcL~CCGAUCCUUCGC-cgc ¢¢ (cggcc - - gC )gggcgcggig- guggCagl - - a+~GIPJACCkCAGGG& CC,CCC,AGO:;{CCCC,C )AGCC,Ik . <~,N3 2432G U ~ |AUCACIAUC~CUG |AAGU}AGGUCCCAAGGGUIAUGGC(UGUU¢)C,CCAUUUAAAGUGGUACC,CC.~GCIr,GGUUI~GAACGI~C(~) ~ ~ ~ ~ 2625CFJCGGCUCi UUCCOIACC<~JC-CGCAUG[CJ~OC| G<~GCGG/0kGCGUiCGGAU(UGUU¢)ACCCGUCUa.J~GGA~ .G~CCGU~(GUGA}~ C U s c ¢ g s c ¢ - 2541GOCGCCUCItFJ~CU i ACCG~rCCGCGCG(CA~JC) GGCGCGG~.(;CGU IC~qAU(~ )ACCCGCUCC-A~GUG~-Gt~-~GGGt,FJJJ~CCGU~ (GU~Ik} ~ C U a c r . g a c ¢ - 2545C,UCC,GCUCIUOCCUIACO~L~CCC,CC,CG(CkCC)C-P.,CC,C~.,J~.GC~J ItGG~U(UGUUC) ACCOGUUCt. ~ ~ ~ (G~) ~ ~ a ~ g c ¢ . ~$1CC,CUGGAUJ~CUGCUCC (UAC,OJ~ [C,+~_~)GG~C } GG~UGC~oCJU+~C~ 1UGUCAU(GCCA) AUGC,C A ~ ~ ~ (0 2"/4"/- auc u¢ caugcgu¢caguACGUCGGGU(ClkGUAC[~GA) GC,J~C } ACCCGUCGCC4k~cUccg IU .cUCCCGG |UUGIX~U(~) ~ ~ _~uc~gg.. ~ ¢ ~ (G 2651..... ~cgggg~gagagcAC(~cGGGU(cAGUA~|Gk3A)~J~GCC~C~GCC.~Q~¢g~ggu-~v~CGCGG|~(~)~~-~g~¢~gg--~g~¢~(G ~$$..... ccgggg¢CagIgcACGGCGGGC (CN~JAC (G~A) GG~AC} GCCCGCCGOGGGCcgcr~gc-CCCC~-GG IUUGCCC(GGCC)G G G ~ _ ~ g c c ~ g . . ~ (G~I~.,CAUC~ ) A ~ C ~ C C ~ C , AUC,J~JUCUCC<.......................................... >CtJ~CCCU (Ut.~} AGG~J~GG |JU~I CG~I3 (AAGACC,A] 21263L~GCCUCUA ) AG~GUCCAUc¢caccUa¢.... ucgcaug;Ig<-uugagggg ........ cuu.............. ¢cccucga> ........ (---) .......... uu¢ I g¢ I ccgg~l (uugaCgll) 2731~JcUcUA)AGcGCGC~c~cg¢¢u¢g----g~gu~<~g~g~ccuagcc¢~gcg¢&c~-~¢¢s~gg¢~> ........ (.._} ........... ¢¢lflglccgg~;(ullgacga| 2"/49A~3CCUcUA~AC~CC~CGCAC~cgc~u~g----¢~c~g~<~{Jg~{Jcg~g~c~cag¢cc~{Ju{J¢~c¢gu~g¢¢gag~ggc¢¢> ........ (_..) ........... ¢¢I g¢I ¢¢g~;g (gllg&¢¢l} 2759CC,ACG0"OC~GGCCGGGUGUGUAAGCGCA ........ (GCC.~-) ...... UGCGUUGAGCU ....... AACCGGUACUAAtP.~AC c~r~ c U-~AAC Cuu I 2~04cccu~nau-cuccgaccuac-uguacggcguggagcuuc (uug¢g)ll¢cg¢cuga---ggut.n~.g{j,..,ucggguugg¢llU+UCC¢.¢ccgag.............. I 2~11cccgc.cg- - - Cagcgcucc - ugua Cgcg;Icggag¢¢c¢ (gcgcg) accgccuga- - -ggaacgccc ......... cugccc-g¢cg¢ ............... I 2117¢¢¢ggcg---cggcgcucc-uguacggcgcagagcccu (gcg--)iucgccuga---ggga¢ gcg¢¢--ugca{jagcgcg-gggcgg{jgcg ........... I o!~
  5. 5. 322for the SSU-rDNA, 5.8S rDNA, and spacer DNAand part of the LSU-rDNA of G. muris rDNA ispresented in Fig. 1. The position of the LSU-rDNAis indicated by the first 23 and last 37 nucleotides.The entire sequence of the LSU-rDNA is shownseparately in Figs. 2 and 3. The sequence of one ofthe shorter Sau3AI fragments in the spacer is indi-cated in Fig. 1, where it is aligned with the sequenceof the same region corresponding to the largestSau3AI fragment found. The observed size differ-ence is obviously the result of variation in the num-ber of repeated sequences in the spacer DNA.These repeats are indicated in the figure. Theboundaries of the mature rRNAs that were deter-mined previously (van Keulen et al. 1991a) are in-dicated by arrowheads (Fig. 1). The exact positionof the 3 end of the LSU-rRNA is the least certain.The entire operon is 7668 nucleotide (nt) long andcontains a SSU-rRNA gene of 1429 nt and a LSU-rRNA gene of approximately 2698 nt. The overall G+ C content of G. muris rDNA is 61.9%. The G +C content of the LSU-rDNA is 57.2%. An openreading frame of 281 amino acids was found in thespacer DNA.Structure of the LSU-rRNAThe sequence alignment of the LSU-rDNA includ-ing 5.8S rDNA of G. muris with that of G. duode-nalis, G. ardeae, and E. coli is presented in Fig. 2.Overall similarity of the LSU-rDNA, not adjustedfor secondary structure, was determined from se-quence alignment of G. muris LSU-rDNA withthose of G. duodenalis, G. ardeae, E. coli, and H.halobium and was 65%, 71%, 46%, and 47%, re-spectively.Secondary structure models for G. muris,G. ardeae, and G. duodenalis LSU-rRNA wereconstructed. The structure of the G. muris rRNAis shown in Fig. 3. Domains (roman numerals)and regions of interest (A-H) are indicated whichare based on the secondary structure models de-scribed by Gutell et al. (1990). Since the 5.8S do-main differs among the three species, the ones forG. ardeae and G. duodenalis are shown separatelyin Fig. 4, where the tentative helices A and B areindicated in the G. duodenalis structure. The size ofthe same region in G. muris is smaller, 6 nt com-pared to 19. The same is true for the G. ardeaeRNA, which has only 5 nt here. The regions wherethe largest differences were found among the threespecies, namely C, D, F, and H, are shown sepa-rately in Fig. 5.Region C consists of approximately 34 nt in G.muris and 35 in G. ardeae, but only 16 nt in G.duodenalis, with the helical segment of only 8 nt.Region F shows less difference among the GiardiarRNAs, being 20 for G. muris and 26 and 28 nt forG. ardeae and G. duodenalis, respectively. RegionH is also larger in these latter two species. Of thesethree regions, the ones of G. ardeae and G. duode-nalis are more similar to each other than to that ofG. muris.DiscussionThe entire nucleotide sequence of the rDNA operonof G. muris has been determined. The organizationof the rDNA of G. muris is different from that foundin two other Giardia species (G. duodenalis and G.ardeae). The spacer of G. muris rDNA is largerthan the one in G. duodenalis (van Keulen et al.1991) and it is the only one of the three that is het-erogenous in size. The difference resides in se-quences upstream of and close to the SSU-rRNAgene. Many tandem repeats are localized in this seg-ment of the rDNA. This phenomenon of tandemrepeats in spacer DNA is common in rRNA genesand is believed to be involved in transcription reg-ulation (Jacob 1986; Sollner-Webb and Tower1986). Why there is heterogeneity in the G. murisspacer DNA and not in those of G. duodenalis andG. ardeae is unclear.An open reading frame (ORF) in the antisensestrand of the G. duodenalis LSU-rDNA has beendescribed by Upcroft et al. (1991). There was noevidence for a similar ORF in the G. muris se-quence. However, an ORF for 281 amino acids waspresent in the spacer DNA, on the same strand asthe rRNA genes, which would yield a protein ofabout 30 kDa. Whether this ORF codes for a realprotein in G. muris remains to be determined.When the SSU-rRNA of G. duodenalis was an-alyzed, the rRNA appeared to resemble in sizeprokaryotic 16S rRNA. Phylogenetic analysisshowed G. duodenalis as the earliest-branching eu-karyote studied to date (Sogin et al. 1989). A similarobservation can be made in the case of G. muris andG. ardeae (manuscript in preparation). As might beexpected, the LSU-rRNA of Giardia shows similarfeatures. It is, with respect to its size, similar toprokaryotic rRNA, but contains in its secondarystructure many typical eukaryotic features. Nota-bly, these eukaryotic features are among the small-est described so far. Space limitations prevent thediscussion of all of the eukaryotic signature featuresGiardia maintains, but a few will be discussedbelow. For an optimal analysis of similarity and dif-ferences among the Giardia LSU-rRNA, the sec-ondary structures of these molecules were con-structed. The structure of the G. duodenalis LSU-rRNA was taken from the sequence determined by
  6. 6. 323A GUA UAA AOu OX OeAAGCCU 0--6 G OU U G A AG II "III I II AA A °~uc UocUGGccAuG UGGAUAA uGCGGUGGA ACGOAA G C III1"111111111 II1"U AG_ G UuCAGAUGACCGGUAC CCUG A Cu • ccc COG--GU--AG--G GAGAG-- CO G Au A cG GUA -- U GGGU~ -- GO--C ACU G U--AC--G A--UU--A c --OG o GA u A--UC--G c cG--G G~.U U u A--UUo_ G G~C U G A--UG AGA G--C UO G~Co--c u~ GGAG O--G G A G--GA A G--COX A G C--O G--C GU AA C A C--G GU U--A G IJ A GcA AAU U A--U G GU U U A GU ~ GG GUA GAUG CAAGG %U II ..I Illl IIlil t~cAA - OUUGC cco.........,_~ %A CGGC.%U%AAG GCGCA G U ^GuGuGGUAO G CU GA GAcA C-- GUAC CAAAGUU ~0G G U O U GA GA AUAOGGG AO II IIIII1" C~^ ~"* G°. ,GA%*GC CuGU UAUGGGU AU~u_.~ 2~u GGC uG * U AA * Uo GGG / OCGG--G: .....-o°-G OG,O.,o * A U--A O--cG--G G.U --AG_ G UG CAAGGUGA0G II1" UG--G Uo uO--G G--GU--AGA O--G GA~. G ou ~ G~....A--U Cu.A C--G%G~ Go G-~o AC ~O G--Gau a U--AUu ca* uU TTI 3halfAA A c GGOGAOGU U UAG^GGGGGC GAGGAGG~ *lit IIIIIIOGUGGA AA lilt uGGG • U U UAuGGuGGUG %G A, o .,°,>G/oG-G ,.o2 ,°G°<c GGO::/ "A~ GU- o A--UU~G ~*U G~.%o *Go .. G* G--o Go~.G U--A A,o^Go~: ,o G . . . . .A ~A~ cG U--A GU ~G o A Gu G A--U CC~A U.UG GA . O--G UE c..~o ~,o G X e~ A C--G Gc. oG ¢Cc A o-~ u -c uu G ^ ."G G:c Go A % U G CGuu- AU~ a.u ,,* ~ o^ ~-,.G - A ~--~ 55.8s G o"G~ O • G O~ ~O UG~%C"~ O--C U--AAG~ G U--A G A U G o o G--oA GG G-- C GGCGGGGAAGUUGAGAAOUGAAGGGAA A * U A A U ~A 5.8S 3G. G o^°-- GA * O~ . . . . . . G..,CA u .GGU~ AGGG*GAOG <.G G--G ~:gG~ U O %% GGGG UGGUGoGA GU O--G U U GO%%GCCGC°" .... oo . ° e °,A Gu cG ,,GG GG.. uG G u %%~ C A G A ~ GA. u u ~- GOG%UP s 2asA.*G A" A ^ A ~ A * Au u G c G A G G A uG%%~ko-,g.%, G,o ~GA"............. ~G-Gu~ oG ~G "A,,U~G A GA G A -- U G UGU G A UGA AG GGGUGGGGGGAG II IIIIIIIIII GG UGA GOGGAGGGOGOGAG A AGOG ~OGuGGA AG uGA UA oGUAG U AG~,,% G C ~ ~ " I UAAAOGAGGAG~ UG~CG~U AA U A ] CaGe I I OG Ux;: ~ A %A G~A~Acu,~ G U Ao °-c; GAGAUA ~GA G--GO GGG~G G--CA G--G.... AAAG AcaG--GACCUG~GAUGA o*G~UA ~~ GGAGG .% CU Ao tcGU UAGuUoliiii III IIIII III GAG A G c AGCAAG COG ACAGG GGG I AAA C U A AUG AGoC--Gu Uo oAceFig. 3. Secondary structure of the G. muris LSU-rRNA. The secondary structure is divided in six domains, indicated with romannumerals (I-VI). The positions of some variable regions are indicated with A-H and are based on the compendium of Gutell et al.(1990). (A) Domains I-III and (B) IV-VI. Continued on page 324.Healey et al. (1990), referred to as G. intestinalis intheir paper. (Minor corrections were made based onour own sequence analysis.) The following generalfeatures appeared when the various domains andhelices were analyzed.Domain IThe first feature is what appears to be a 5.8S rRNAthat is not a part of the LSU-rRNA as is the case inprokaryotes and Vairimorpha (Vossbrinck and
  7. 7. 324°A QGA--UA--Ua °--O^--U0 0o ¢° ° ¢ ^G.V U °O--~ G--OGO ^ 0 o 0O0_ou ° 0-°O--G ~0u°-- cA G • UA A AU--AG--° U--A A U~ GUO--O AG--O G.~AO--G °--O °,~uAGUG U CUU III IIFuO% °^ueu°u %° _ o^^°u^~°%~ ouooou^~..A -- u 0 - G "Guo°-° °0-o~ %^o u°°^ "• oV--A O--G G G • °o-o o_,U--A A -U0 A CA G--O UuA°--0°~°^, °_o °_co°^co,^~_~°GGG GGAGU AUcl~ III1" O0A 00VO~ A UAuG UOAAACGGOUGAAG, UG°--0O--GA--U°--0AOO--GO~o_gG--O~°°--CGuo A5 hall IV ~CouA0A GGAAUGFig. 3.GOoGAGOAuA GA O--Ouuo 0AGO(I co tllllU OU000 A°A--UC--GU UA--Ua -- 0AG--cG--O oGGAAo-, %_~A--UG--O G--CG--O A -OG--¢U--A°--0~-" °~°~°°°^~°,i,, o^(] .°-°°o0 oGGOUA0 O _GOo -- OOACA U --A GoAUU.o II AAA--U uOA O--a A OG°~G--C GA OA G--CAAO--° A ~; A--U UA--UU--A0--0U--A GO° Uo--°OA--U ° G--O G--OG.UA • G G~ G IJ-- AC_G° U " _O_GuU 0~0 O--OA=U II I IIIIII Goo OuU°AU°Oo0 vG 0 Ua_oCU--AGU °UA A AAAGA--U g °-c ° AU --AA O°--0 o O-coU--A o-°UV ° °-°U--A A--UO--° O--0O--G 0 0A G A O--G AUAA OUAAUG0G AU °G 0 G UVili;iIIUGGUCGG GGG.II Akll ~cuUGuGGcoGcO AAGcGCc^CCAa¢CAUCC UU • • • • I I I • I I I CU UA U UAG UUUOGGUG °G AU ¢ UU ¢A a AUoo o .ooO. /--,,, ° o .o oo°^~ ° o oV o^ ....X 0,.,~ as/*" u° .% °%.°U • 00 A Lre A 0--°Q A¢/• GU O G.% GO G--O• • 0 U A °A--Ua-%-u u ^" ° o, ~o,,, A uo-°o o~ ¢ a.u....... ; ,,ooc0 III k0 UUUU000 0 0 0 AA~UAocG UCQGOAU° °0 AG ^UOO%~ /°A U¢ II IIIIII- AAo u o ~ooo°o~-o u uoo^oooouo°o ~°-o ° ° %G°¢~ g:~ °"°_0 oU. oA o .% ¢_o~.o o~,o o ~ o o~-~,° Oo Uv°oO. u o o uc %~•%Ao o ^ %,oo%. ~o°VoG0--oo--0 AOO--0U0 uA O°CoO uContinued from page 323.Woese 1986). It is possible to isolate 5.8S rRNAfrom G. duodenalis (Montanez et al. 1989) and S1mapping has identified the position of the 5 end ofthe LSU-rRNA (Boothroyd et al. 1987). The size ofthe Giardia 5.8S rRNA, however, appears to besmaller than that in other eukaryotes (Edlind andChakraborty 1987; van Keulen et al. 1991a). Al-though the exact 5 and 3 ends of the RNA are notknown, the 5.8S rRNA sequence can be estimatedfrom the base-pairing scheme. A similar, though notidentical, structure was obtained by Edlind et al.(1990) for the G. duodenalis 5.8S domain. Theshorter size of the Giardia 5.8S rRNA seems to bethe result of shortening of helices A and B in atentative G. duodenalis model. These helices areabsent in G. muris and G. ardeae. An absence ofhelix A and B is also found in Pirulla marina; helixB is absent in many but not all plastid RNAs [seeGutell et al. (1990) for a survey of these irregularhelices]. The largest variation in size is seen in re-gion C, also described as domain D2 (Michot andBachellerie 1987), which is considered a eukaryotic-
  8. 8. A uuUCGuuUG 0-~ UA*- ss.ss ~C -- cC GGU -- GAA S.8S 3U--C G--UG--C UG-- G--U C°.... s °::°:::°ooooo--u A AO -- G G U cGA OO O CG-~ U A U U o uU~,%GU tA G G "- eG~.,U C--G cc G oGo~uA 528S.9° }.:o .... o o.-uA G AGO CG C UUGG UG AG AGU GGUC GuUUAAAAGA UOU U uG-- UIIIA A U C ON UGA GO GGUCOGGGOG G GII IIIIIIIIII ACG UG UUGGGUUUGU C~, e A A~% UuGUUGAAoGAAUoAGGcGAU GOGGu UA G% G d I[[r UUAAAGO%C OUGc I I GCG~Cu- o ~ o , 2 ,o .... Ao,OG % G N G% C G cGAG% U G G~ C AA CC A%uCU A AGe AGG~CAC--GA o--ue G__ o G--OG--UG--UGu A eU,~,A ,~G- oGO %%CA ~ CG GOA sAG O UG GCGU G ~ UG A CGCUGUGGGGC GGCG GCC GCUG CCc G u U IIll-III IIII III IIII I Ge UC G A G CGGCGCGUCUGGGCUGU CGG UGGC GAA e u AUGo --G AO--GOU -- GuGAA oGGAAU--UCCC--GCGA3255 5.8S/ G G e 5.8S 3"AC ¢¢ AC IGu~° ~ ~ o% Ac_GA GG--C GOG--CG--C U--A OU--A G--C AG--cC--GA G--¢ G "~ ,G--e AUGC--GG GA "oUtAGeS 28Sc - G C C C GG~C CC -- G G U C~cOC C A CG U%. CG GGCG A ,GGAGGGCCGCUCAAAAGAc U U GGIII C--G GOo................ : AACG-C U C CII IIIIIIIII CCG UG CCGAGCCCGCCGAACcA GGcGAU AUG U G G Illll UAAGG C AGC~c COOGc IIC U--UG A A A G UUAGG_ C A UU_ ¢ G--CO--G G A--UAU AA UG -- CU U C--GGCU GU A A A G--CGA AGU%~"G GCGUGU A G G G CGuGC~, %CGC A UUUGUGGG UUU CCG GUGAGi.l.,il ill Ill lili ~GGGGU GGCA ~CGU U G A G uGU CGCAUC CCCU GAU U GAGU GCG CUGU - UCU--U¢--GC UG UAGAFig. 4. The structureof the 5.8S rRNAdomain. The 5.8S rRNA domain of G.duodenalis(A) and G. ardeae(B). HelicesA and B are indicated in the G. duodenalisstructure.specific secondary structure element (Michot andBacheUerie 1987). This region in eubacteria is about30 nt and in archaebacteria about 80 nt. In eukaryotes,however, the size varies from 213 (Crithidia)-873(Homo) nt. In contrast to these large sizes, this equiv-alent region in G. muris is only 34 nt and in G. duo-denalis only 16 nt. The size of this region in G. ardeaeis almost the same as that in G. muris (35 nt). RegionC in G. duodenalis is the most truncated found so far.Domain HRegion D is variable in size and secondary struc-ture. This region is similar in G. duodenalis and G.ardeae in contrast to what is observed for region C.No phylogenetically conserved helix can be formedfor D in G. muris. Variable region E is generallcon-sidered kingdom specific and is a typical eukaryoticsignature sequence.
  9. 9. 326oo° o° ~o oo~o ~oo~< <~-oo r~~ooIIIIIIIIIIIII II~ _ IIIIoI.IIIIIoOo ~0o Ooo~ZZ~oo<o-o2IIIIIIIII I|1 I! <°ooo oIII oO0 ~ 0oo~-~/~%<oo -%, ??o~0~0 ~~<~<oo~<~o~o~ <tttttttlltttl ~ o<o II IIIIII I.° o%~<o=o=ooo=0 • 0o IIIlllll I.o OO~<~DODot~ DIIIII IID~DODo0~ DM.oo ~O0o • o%0000~00: V~o° oo o°o~o<oO°oD D 0 0 D~Z
  10. 10. 327Domain IliA number of variable regions are present in thisdomain. In Giardia, these appear to be truncated,accounting in part for the small size of the rRNA.Domain IVThe second characteristically eukaryotic variableregion, indicated as F in Fig. 3, or D8 in the nomen-clature of Michot and Bachellerie (1987), resides inthis domain. It is 45 nt in E. coli, 22-37 nt in ar-chaebacteria, and ranges from 143 (Tetrahymena)to 718 (Homo) nt in eukaryotes, with lower eukary-otes having the smaller size. This region is only 20nt in G. muris and is 28 nt in G. duodenalis. In G.ardeae it is again similar to that of G. duodenalis,namely, 26 nt.Domain VRegion G is reduced in all three Giardia species,with a size (22 nt) which is similar to that ofprokaryotes (30 nt in E. coli). This region is smallerin Giardia than the same region in all other analyzedeukaryotes, where it ranges from 74 (Tetrahymena)to 261 (Crithidia) nt. The overall structure of thisdomain and the sequence of the unpaired bases inthe central loop are similar to E. coli LSU-rRNA,with a high degree of sequence conservation.Domain VIThis domain shows the least sequence homology toother LSU-rRNA. However, when folded, this do-main has, despite being quite variable, a secondarystructure that is similar to that same domain in allLSU-rRNA. The putative 3 end of the LSU-rRNAcan therefore be localized with a reasonable degreeof certainty. Since it is not possible to isolateenough rRNA from G. muris cysts to identify theexact position of the 3 end of the LSU-rRNA, se-quence comparison to G. ardeae and G. duodenaliswas used to estimate the possible 3 end. The loss ofsequence homology between G. ardeae and G. duo-denalis was used as an indication of the 3 end (vanKeulen et al. 1991a). This suggests that the G. duo-denalis LSU-rRNA ends much earlier than otherinvestigators have reported (Boothroyd et al. 1987;Healey et al. 1990). Based on the putative folding ofdomain VI, this conclusion still holds. Additionalevidence comes from a reevaluation of the size ofthe LSU-rRNA from G. duodenalis by gel electro-phoresis of glyoxal-treated RNA which includedmore markers than previously used. A size of about1.7 kb was obtained (unpublished results). Based onthe alignments of Fig. 2, this agrees well with theobserved size of the LSU-rRNA. The major vari-able region in domain VI is indicated with an H inFig. 3. This region is much larger in other eukary-otes (64 nt [Euglena]-230 nt [Rattus] than in Giar-dia: 19 nt in G. muris and 38--40 nt in the other twospecies of Giardia.In conclusion, the size of the LSU-rRNA of G.muris is, together with the similar structures of G.duodenalis and G. ardeae LSU-rRNA, prokaryoticrather than eukaryotic. However, a number of phy-logenetic signatures link Giardia with eukaryotes.Two variable regions in particular, C and F, but alsoG and H, are examples of a considerable shorteningof the rRNA size where these structures are amongthe shortest found in eukaryotic LSU-rRNA. Allthis supports the suggestion that Giardia appears tooccupy a unique position having one of the mosttruncated LSU-rRNAs found in eukaryotes so far.This is consistent with other observations to theeffect that Giardia belongs to the earliest and deep-est branching of the eukaryotic evolutionary tree.However, to give Giardia the status of "missinglink" between pro- and eukaryotes, as others (Kab-nick and Peattie 1991) have suggested, cannot bemaintained, since Giardia LSU-rRNA shows manytypical eukaryotic features. The rDNA gene, as awhole, has many distinctly eukaryotic features, be-ing tandemly repeated, having spacer DNA withvariable sizes, and containing a separate gene forthe 5.8S rRNA. The short size of the entire op-eron--which is the result of a short spacer, asmaller 5.8S rRNA, and the shortening in size ofimportant structural elements in the rRNAs, asshown here for the LSU-rRNA--and the finding ofopen reading frames in the rDNA operon, appear tomake Giardia rRNA genes unique.Acknowledgments. We gratefully acknowledge Bryn Weisersprogramming expertise, which resulted in the secondary struc-ture drawings. We would also like to thank the W.M. KeckFoundation for their generous support of RNA Science on theBoulder campus. This work was supported by grants from theOhio Boards of Regents Academic and Research Challenge Pro-grams (H.v.K., E.L.J.) and the U.S. Environmental ProtectionAgency cooperative agreement CR-816637-01-0 to the Universityof Minnesota (S.L.E.) and Cleveland State University (E.L.J.).The EMBL Data Library accession number is X65063. R.R.G. isan associate in the Evolutionary Biology Program of the Cana-dian Institute for Advanced Research.ReferencesBenson SA (1984) A rapid procedure for isolation of DNA frag-ments from agarose gels. BioTechniques 2:66-68Birnboim HC (1983) A rapid alkaline extraction method for theisolation of plasmid DNA. Meth Enzymol 100:243-255
  11. 11. 328Bird RC, Wu G (1989)An alternative to the dry ice/ethanol bathfor precipitation of nucleic acids. BioTechniques 7:337-338Boothroyd JC, Wang A, Campbell DA, Wang CC (1987) Anunusually compact ribosomal DNA repeat in the protozoanGiardia lamblia. Nucl Acids Res 15:4065--4084Brosius J, Dull TJ, Noller HF (1980) Complete nucleotide se-quence of a 23S ribosomal RNA gene from Escherichia coli.Proc Natl Acad Sci USA 77:201-204Campbell SR, van Keulen H, Erlandsen SL, SenturiaJB, JarrollEL (1990) Giardia sp: comparison of electrophoretic karyo-types. Exp Parasitol 71:470--482Edlind TD, Chakraborty PR (1987) Unusual ribosomal RNA ofthe intestinal parasite Giardia lamblia. Nucl Acids Res 15:7889-7901EdlindTD, Sharetzsky C, Cha ME (1990)Ribosomal RNA of theprimitive eukaryote Giardia lamblia: large subunit domain Iand potential processing signals. Gene 96:289-293Erlandsen SL, Bemrick WJ (1987)SEM evidence for a new spe-cies, Giardia psittaci. J Parasitol 73:623-629Erlandsen SL, Bemrick WJ, Wells CL, Feely DE, Knudson L,Campbell SR, van Keulen H, Jarroll EL (1990) Axenic cul-ture and characterization of Giardia ardeae from the greatblue heron (Ardea herodias). J Parasitol 76:717-724Filice FP (1952) Studies on the cytology and life history of agiardia from the laboratory rat. Univ Calif Publ Zool 57:53-146Gerbi SA (1985)Evolutionof ribosomal DNA. In: Maclntyre RJ(ed) Molecular evolutionarygenetics. Plenum, New York, pp410-517GutellRR, Schnare MN, Gray MW (1990)A compilation of largesubunit (23S-like) ribosomal RNA sequences presented in asecondary structure format. Nucl Acids Res 18(Suppl):2119-2330Healey A, Mitchell R, Upcroft JA, Boreham PFL, Upcroft P(1990) Complete nucleotide sequence of the ribosomal RNAtandem repeat unit from Giardia intestinalis. Nucl Acids Res18:4006Jacob ST (1986) Transcription of eukaryotic ribosomal RNAgene. Mol Cell Biochem 70:11-20Kabnick KS, Peattie DA (1991)Giardia: A missing link betweenprokaryotes and eukaryotes. Am Scientist 79:34-43Maniatis T, Fritsch EF, Sambrook J (1982)Molecular cloning, alaboratory manual. Cold Spring Harbor Laboratory, ColdSpring Harbor.Mankin AS, Kagramanova VK (1986) Complete nucleotide se-quence of the single ribosomal RNA operon of Halobacte-rium halobiurn: secondary structure of the archaebacterial23S rRNA. Mol Gen Genet 202:152-161Messing J (1983)New MI3 vectors for cloning. Methods Enzy-tool 101:20-78Michot B, Bachellerie J-P (1987) Comparisons of large subunitrRNAs reveal some eukaryote-specific elements of second-ary structure. Biochimie 69:11-23Montanez C, Cervantes L, Ovando C, Ortega-Pierres G (1989)Giardia lamblia: isolation of RNA. Exp Parasitol 68:354-356Roberts-Thomson IC, Stevens DP, Mahmoud AAF, Warren KS(1976) Giardiasis in the mouse: an animal model. Gastroen-terology 71:57-61Sauch JF (1984)Purification of Giardia muris cysts by velocitysedimentation. Appl Environ Microbiol 48:454-455Sogin ML, GundersonJH, Elwood HJ, Alonso RA, Peattie DA(1989) Phylogenetic meaning of the kingdom concept: an un-usual ribosomal RNA from Giardia lamblia. Science 243:75-77Sollner-Webb B, Tower J (1986)Transcriptionof cloned eukary-otic ribosomal RNA genes. Ann Rev Biochem 55:801-830Upcroft JA, Healey A, Mitchell R, Boreham PFL, Upcroft P(1991) Antigen expression from the ribosomal DNA repeatunit of Giardia intestinalis. Nucl Acids Res 18:7077-7081van Keulen H, Loverde PT, Bobek LA, Rekosh DM (1985)Or-ganization of the ribosomal RNA genes in Schistosoma man-soni. Mol Biochem Parasitol 15:215-230van Keulen H, Campbell SR, Erlandsen SL, Jarroll EL (1991a)Cloning and restriction enzyme mapping of ribosomal DNAof Giardia duodenalis, Giardia ardeae and Giardia muris.Mol Biochem Parasitol 46:275-284van Keulen H, Horvat S, Erlandsen SL, Jarroll EL (1991b)Nu-cleotide sequence of the 5.8S and large subunit rRNA genesand the internal transcribed spacer and part of the externalspacer from Giardia ardeae. Nucl Acids Res 19:6050Vossbrinck CR, Woese CR (1986) Eukaryotic ribosomes thatlack 5.8S RNA. Nature 320:287-288Wilbur WJ, Lipman DJ (1983) Rapid similarity searches of nu-cleic acid and protein data banks. Proc Natl Acad Sci USA80:726-730Woese CR, Gutell R, Gupta R, Noller HF (1983)Detailed anal-ysis of the higher order structure of 16S-likeribosomal ribo-nucleic acids. Microbiol Rev 47:621-669Woese CR (1987)Bacterial evolution. Microbiol Rev 51:221-271Received August 19, 1991/RevisedApril 22, 1992

×