Peter Hollingsworth - Plants Plenary


Published in: Education, Technology
  • Success with 2 passes, with contaminants treated as fail; success with 2 passes, with contaminants treated as fail, + 7 recovered with Phusion; success with 2 passes, with contaminants treated as missing data, + 7 recovered with Phusion
  • If anyone asks about ERIR primer, it is same as MALPR1 and comparable results, but with higher annealing temp better suited to use with Phusion Taq
  • Add gymnosperm species number
  • Compress to 6/8 failures failed in rbcl
  • Number of species in those lineages

    1. 1. Plant DNA Barcoding using matK: some work on new primer sets
        • Dr. Alan Forrest, Prof. Pete Hollingsworth, Royal Botanic Garden Edinburgh
        • Damon Little, New York Botanic Garden
        • Aron Fazekas, University of Guelph
        • Gao Lian-Ming, Kunming Institute of Botany
        • Sean Graham, University of British Columbia
        • Mehrdad Hajibabaei, CCDB, University of Guelph
        • Maria Kuzmina, CCDB, University of Guelph
        • Hollingsworth, Graham, Little (2011). "Choosing and using a plant DNA barcode." PLoS ONE 6: e19254.
    2. 3. Angiosperms: matK baseline
        How good are the current “best” matK primers?
        • Ca. 10K PCR & sequencing attempts from 5 labs:
        • Kim 1R+3F = 72% success (N=9424)
        • 2-step protocol: Kim 1R+3F and 390F+1326R: 80% success
        • Poorly performing orders include Malpighiales, Piperales, Poales, and Myrtales (especially Melastomataceae)
        • *ACDB: African Centre for DNA Barcoding, University of Johannesburg, South Africa
        • *CCDB: Canadian Centre for DNA Barcoding, University of Guelph, Canada
        • *KIB: Kunming Institute of Botany, Chinese Academy of Sciences, China
        • *NYBG: New York Botanic Garden, USA
        • *UBC: University of British Columbia, Canada
    3. 4. Angiosperms: 3 approaches to improve matK retrieval
        • 1) ePCR of existing published primers against ca. 10K matK sequences
            • Genetic algorithms to search for new primers
        • 2) CODEHOP: COnsensus DEgenerate Hybrid Oligonucleotide Primer
            • Primer cocktails with a degenerate ‘core’ coupled with variant 3’ triplets for all known exact matches in GenBank
        • 3) New primers/combinations tested alongside existing primers:
            • 1R+3F: KJ Kim, unpublished
            • 390F+1326R: Cuenoud et al (2002) Am J Bot 89
            • 472F+1248R: Yu et al (2011) J Syst Evol 49, 1-6
            • xF+MALPR1: New combination (Ford et al 2009; Dunning & Savolainen 2010)
            • 398Fb4+1311R: CODEHOP; this study
        [Figure: matK primer location]
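The ePCR step in approach 1 can be illustrated with a minimal sketch: slide a (possibly degenerate) primer along a template, count mismatches using IUPAC ambiguity codes, and report predicted amplicon lengths for a primer pair. This is not the software used in the study; the function names, the mismatch threshold, and the toy sequences are all assumptions made for illustration, and no real matK primer sequences are shown.

```python
# IUPAC nucleotide ambiguity codes mapped to the bases each one permits.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement table that also covers the ambiguity codes (e.g. R <-> Y).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a sequence that may contain IUPAC codes."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(primer, site):
    """Count positions where the template base is not permitted by the primer code."""
    return sum(base not in IUPAC[code] for code, base in zip(primer, site))


def find_sites(primer, template, max_mismatches=0):
    """Start positions of every template window within the mismatch limit."""
    width = len(primer)
    return [
        i
        for i in range(len(template) - width + 1)
        if mismatches(primer, template[i:i + width]) <= max_mismatches
    ]


def epcr(forward, reverse, template, max_mismatches=0):
    """Predicted amplicon lengths for a forward/reverse primer pair on one template."""
    fwd_sites = find_sites(forward, template, max_mismatches)
    # The reverse primer anneals to the opposite strand, so on the given strand
    # we look for its reverse complement.
    rev_sites = find_sites(reverse_complement(reverse), template, max_mismatches)
    return [
        r + len(reverse) - f
        for f in fwd_sites
        for r in rev_sites
        if r >= f + len(forward)
    ]


# Toy template and primers (hypothetical, not matK).
toy_template = "GG" + "ACGTTCGA" + "TTTTTTTTTT" + "CCATGGAC" + "AA"
toy_forward = "ACGYTCGA"   # Y permits C or T at the fourth position
toy_reverse = "GTCCATGG"
toy_amplicons = epcr(toy_forward, toy_reverse, toy_template)
```

Running `epcr` over every sequence in an alignment and counting the templates with at least one predicted amplicon would give an in-silico success rate per primer pair, which is the kind of comparison the slide describes.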
    4. 5. Angiosperms: the test sample
        • 5 plates of samples
        • Wide taxonomic sample: N=470
        • 52/61 orders and 172 families sensu APG3
        • All samples previously sequenced for rbcL
        • DNA extractions standardized, concentration equilibrated
        • A: 188 samples from accessions that worked previously for 1R+3F (retain current success rates)
        • B: 188 samples from accessions that failed previously for 1R+3F (improve on current success rates)
        • C: 94 samples from 5 orders that performed particularly poorly (check that the nightmare groups are fixed)
    5. 6. Angiosperms: testing different protocols
        • PCR: different additives (acetamide, betaine, BSA, DMSO, DTT, formamide, glycerol, sulfolane, trehalose, 2-pyrrolidone, CES solution), primer and magnesium concentrations, annealing time and temperature
            • Best results: Platinum Taq polymerase, 1M betaine, 0.2M trehalose
        • PCR clean-up: nothing, Qiagen columns, ExoSAP-IT (neat and dilute)
            • No clean-up = poor sequence quality
            • Best results: ExoSAP-IT (dilute 1:10)
        • Sequencing PCR:
            • Different additives tested (nothing, betaine, DMSO, trehalose, BDX64)
            • Best results: 0.2M trehalose increased read length by up to 150bp
        • Full details of tests available from Alan Forrest, to be posted on Connect
    6. 7. Angiosperms: PCR results from different primer pairs
        Primer pair             Total   A (worked before)   B (failed before)   C (bad clades)
        Collaborating labs:
        rbcL                    100%    100%                100%                100%
        matK 1R+3F              40%     100%                0%                  0%
        Test lab:
        rbcL                    99%     99%                 98%                 97%
        matK 390F+1326R         71%     79%                 63%                 71%
        matK 1R+3F              85%     97%                 85%                 63%
        matK 398Fb4+1311R       86%     87%                 87%                 83%
        matK 472F+1248R         88%     93%                 92%                 71%
        matK xF+MALPR1          91%     94%                 92%                 85%
    7. 8. Angiosperms: 2-step matK PCR amplification
        1st round        2nd round        Samples amplified
        1R+3F            390F+1326R       91%
        390F+1326R       398Fb4+1311R     90%
        xF+MALPR1        390F+1326R       93%
        xF+MALPR1        398Fb4+1311R     94%
        1R+3F            398Fb4+1311R     95%
        472F+1248R       1R+3F            95%
        472F+1248R       390F+1326R       95%
        xF+MALPR1        1R+3F            96%
        472F+1248R       398Fb4+1311R     97%
        xF+MALPR1        472F+1248R       98%
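A 2-step figure like those above is the share of samples amplified by the first primer pair or, failing that, by the second pair run on the first-round failures. The sketch below shows one way to compute it and to pick the best pairing from per-sample outcomes. The slides report only the combined percentages, so the per-sample data and the pair names here are hypothetical.

```python
from itertools import combinations


def two_step_success(first, second):
    """Fraction of samples amplified in round 1, or in round 2 after a round-1 failure."""
    amplified = sum(a or b for a, b in zip(first, second))
    return amplified / len(first)


def best_two_step(results):
    """Return the pair of primer sets with the highest combined success, and that rate."""
    best = max(
        combinations(results, 2),
        key=lambda pair: two_step_success(results[pair[0]], results[pair[1]]),
    )
    return best, two_step_success(results[best[0]], results[best[1]])


# Hypothetical outcomes for ten samples (True = amplified); not data from the study.
toy_results = {
    "pair_A": [True, True, True, False, False, True, True, True, False, True],
    "pair_B": [True, True, False, True, False, True, True, True, True, True],
    "pair_C": [True, True, True, True, True, False, True, False, False, True],
}
toy_best, toy_rate = best_two_step(toy_results)
```

The point the toy data makes is that the best pairing is not necessarily the two best single performers: what matters is how well the second pair covers the failures of the first.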
    8. 9. Angiosperms: 2-step protocol results: xF+MALPR1 & 472F+1248R
        • 470 samples sequenced
        • High quality bi-directional reads obtained for 94% of samples (96% inc. single reads, 97% inc. Phusion recoveries)
        • Complete failures: 3 (all failed for rbcL)
        • Sequence failures: 17 low quality, unable to contig
            • Of these failures, 10 subsequently recovered with Phusion Taq, but 3 were potentially pseudogenes
        • Single reads: 9
        • Contaminants/mix-ups: 15
            • Of these, 7 are contaminants when sequenced with rbcL as well
            • 8 are matK problems, but ok for rbcL
        • Contaminants as fails: success 91% (92% inc. Phusion recoveries)
        • Contaminants as missing: success 96% (97% inc. Phusion recoveries)
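The headline percentages can be cross-checked from the counts on the slide. The sketch below assumes that the 94% figure excludes complete failures, sequence failures and single reads, that 7 of the 10 Phusion recoveries are counted (10 minus the 3 potential pseudogenes, in line with the speaker note), and that the 15 contaminants sit among the bi-directional reads. Under those assumptions the first five figures are reproduced. The "contaminants as missing" figures are left out because the slide does not say how the denominator was adjusted.

```python
TOTAL = 470
COMPLETE_FAILURES = 3
SEQUENCE_FAILURES = 17
SINGLE_READS = 9
CONTAMINANTS = 15
# Assumption: 10 Phusion recoveries minus 3 potential pseudogenes.
PHUSION_RECOVERIES = 10 - 3


def pct(count, total=TOTAL):
    """Percentage rounded to the nearest whole number, as on the slide."""
    return round(100 * count / total)


# Assumption: bi-directional reads are everything that is not a failure or a single read.
bidirectional = TOTAL - COMPLETE_FAILURES - SEQUENCE_FAILURES - SINGLE_READS

rates = {
    "bi-directional": pct(bidirectional),
    "inc. single reads": pct(bidirectional + SINGLE_READS),
    "inc. Phusion recoveries": pct(bidirectional + SINGLE_READS + PHUSION_RECOVERIES),
    "contaminants as fails": pct(bidirectional - CONTAMINANTS),
    "contaminants as fails, inc. Phusion": pct(
        bidirectional - CONTAMINANTS + PHUSION_RECOVERIES
    ),
}

for label, value in rates.items():
    print(f"{label}: {value}%")
```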
    9. 10. Angiosperms: recommended work flow
        • Dilute DNA 1:10
        • 1st ROUND: all samples
            • PCR matK primers xF+MALPR1; 1M betaine, 0.2M trehalose, Platinum Taq
            • Clean successful PCR products
            • Sequence clean PCR products; 0.2M trehalose
        • Acquire samples and extract DNA
        • 2nd ROUND: all PCR and SEQ failures
            • 3F+1R or 472F+1248R; 1M betaine, 0.2M trehalose, Platinum Taq
        • PCR and SEQUENCE rbcL
            • Clean successful PCR products
            • Sequence clean PCR products; 0.2M trehalose
        • >95% matK sequence success rate
        • ALL poor quality sequences/mononucleotide motifs
            • PCR and sequence matK primers xF+ERIR; 1M betaine, 0.2M trehalose, Phusion Taq
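One reading of this work flow, written as a small decision function: every sample gets a first round with xF+MALPR1, failures go to a second round with 3F+1R or 472F+1248R, and poor-quality reads (including mononucleotide motifs) are redone with xF+ERIR and Phusion Taq. The reagent settings come from the slide. The function name, the outcome labels and the exact branching order are assumptions, since the original flow chart survives only as flattened text.

```python
# Reagent settings as listed on the work-flow slide.
ROUND_1 = {
    "primers": "xF+MALPR1",
    "polymerase": "Platinum Taq",
    "additives": ("1M betaine", "0.2M trehalose"),
}
ROUND_2 = {
    "primers": "3F+1R or 472F+1248R",
    "polymerase": "Platinum Taq",
    "additives": ("1M betaine", "0.2M trehalose"),
}
RESCUE = {
    "primers": "xF+ERIR",
    "polymerase": "Phusion Taq",
    "additives": ("1M betaine", "0.2M trehalose"),
}


def next_action(rounds_done, outcome=None):
    """Return the settings for the next matK attempt, or None if nothing is left to do.

    rounds_done: number of standard rounds (0, 1 or 2) the sample has been through.
    outcome: hypothetical label for the latest result: "ok", "failed" or "poor_quality".
    """
    if rounds_done == 0:
        return ROUND_1          # every sample starts with xF+MALPR1
    if outcome == "ok":
        return None             # clean sequence obtained, sample is finished
    if outcome == "poor_quality":
        return RESCUE           # poor reads / mononucleotide motifs: xF+ERIR with Phusion
    if rounds_done == 1:
        return ROUND_2          # PCR or sequencing failure: try the second primer pair
    return None                 # failed both rounds; the slide gives no further step
```

For example, `next_action(1, "failed")` returns the second-round settings, while `next_action(1, "poor_quality")` returns the Phusion-based settings.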
    10. 11. Angiosperms: recommendations and protocols
        • PCR using a good quality thermostable Taq polymerase
            • Fewer amplicons obtained with cheaper alternatives
        • Clean up amplicons and sequence using 0.2M trehalose
        • Poor sequences due to mononucleotide motifs can be sequenced using Phusion Taq and primers xF+ERIR
        • Online resources:
            • matK barcoding protocols will be made available on Connect
            • Ordinal alignments available for specific primer design for problematic taxa
            • Statistics on primer mismatch and mononucleotide motifs available, sorted by taxon
    11. 12. Angiosperms: matK barcode summary
        • The 2-step protocol recommended here allowed >90% of samples from a wide taxonomic range to be sequenced for matK
        • Need to assess whether this is robust to different laboratory environments and plant groups
    12. 14. The Guardian, 17th November, 2007
    13. 15. Gymnosperms: matK barcodes
        • Gymnosperms include ca. 1100 species
            • Many economically/ecologically important and/or rare taxa
        • Full length matK alignment for primer design:
            • >800 accessions representing all genera downloaded from GenBank
        • Gymnosperm matK quite conserved:
            • Conserved priming sites can be located, but divergent in Gnetales
        • Sample set:
            • All 86/86 genera (N=119) including Ginkgo, sensu Christenhusz et al (2011) Phytotaxa 19, 55-70
    14. 16. Gymnosperms: matK barcodes
        All gymnosperms:       Conifers (N=95)   Cycads (N=16)   Gnetophytes (N=8)
        rbcL                   89%               100%            100%
        A  GYMF1A+R1A          86%               100%            38%
        B1 GYM-F+GYM-R         86%               100%            25%
        B2 GNE-F+GNE-R         na                na              88%
        matK A+B               95%               100%            100%
        • 7 failures in conifers for matK also failed for rbcL
            • Suggests primer mismatch not the reason for failure
        • Recommendation:
            • 1st round PCR and SEQ with GYM-F1A+R1A
            • 2nd round PCR and SEQ using GYM-F+GYM-R for conifers and cycads, and GNE-F+GNE-R for gnetophytes
    15. 17. Ferns & allies: matK barcodes
        • Ferns and allies include ca. 10,000 species
            • ca. 90% of these are Polypodiales
        • Full length matK alignment for primer design:
            • 159 accessions representing all major groups derived from several published and unpublished sources
        • Fern matK very variable:
            • Difficult to locate conserved sites for primer design
        • Variability means potentially useful barcode:
            • Recent publication* supports use of rbcL + matK as the core fern barcode, but further empirical utility tests required
        • Sample set:
            • 14/14 orders and 44/48 families (N=95), sensu Christenhusz et al (2011) Phytotaxa 19, 7-54
        *Li et al (2011) PLoS ONE 6, e26597
    16. 18. Ferns & allies: matK barcodes
        • ePCR and manual examination of alignment failed to locate any universal priming sites:
            • Primers therefore designed at the ordinal level
        • Cyatheales: single primer pair amplifies 100% (8/8 accessions)
        • Polypodiales: 81% successfully sequenced
            • Single primer pair amplifies 43/57 accessions, with 2nd primer pair adding 3 accessions
            • 5/15 failures also failed for rbcL
        • Primers for lycophytes and earlier diverging orders designed but as yet untested
    17. 19. Liverworts: matK barcodes
        • Liverworts include ca. 5000 known species
            • ca. 90% of these are leafy liverworts
        • Full length matK alignment for primer design:
            • 56 accessions representing all major groups, including many de novo sequences
        • Liverwort matK very variable:
            • Difficult to locate conserved sites for primer design
        • Variability means potentially useful barcode
        • Sample set:
            • 15/15 orders and 74/82 families (N=94), sensu Crandall-Stotler et al (2009) Edin J Bot 66, 1-44
    18. 20. Liverworts: matK barcodes
        • Two-step approach:
            • A: Best single primer pair gives 72%
            • B: Four primer pairs representing major clades used separately on failures from step 1: complex thalloids (400 spp.), simple thalloids 1 (200 spp.), simple thalloids 2 (150 spp.), leafy (4300 spp.)
            • Using these 4 primer pairs as a cocktail gave lower PCR success
        • rbcL: 100% success
        • matK: A plus B results in 90% success
            • Failures include early diverging Treubiales and Calobryales (only ca. 20 spp.)
        • Full length matK sequences are the rate limiting step
    19. 21. Mosses: matK barcodes
        • Mosses include ca. 12,800 species
            • Greatest numbers and diversity in Hypnales
        • Full length matK alignment for primer design:
            • 66 accessions representing all major groups, including many de novo sequences
        • Moss matK quite conserved compared to ferns and liverworts:
            • Conserved priming sites located and range of primer pairs tested
        • matK barcode utility unknown:
            • Lack of moss matK primers has precluded any meaningful comparisons with other markers
        • Sample set:
            • 29/30 orders and 92/111 families (N=107), sensu Goffinet & Shaw
    20. 22. Mosses: matK barcodes
        • rbcL: 100% PCR success
        • matK: 4 primer pairs tested
            • Best primer pair sequences 82% (best 2-step = 94%, all 4 primers = 98%)
        • However:
            • All mosses except Sphagnum contain a mononucleotide motif in the centre of the barcode region, which is difficult to sequence across
            • Phusion Taq polymerase alleviates the problem, but PCR is more difficult to optimize
            • Best primer pair sequences 62%
            • Best 2-step = 75%, best 3-step = 82% (Hypnales = 85%)
    21. 23.
        • 2-step protocol = >95%
        • 2-step protocol = >95%
        • 2-step protocol = ca. 80% Polypodiales; 1-step protocol = 100% Cyatheales; lycophyte and early-diverging lineage primers require testing
        • 1-step protocol = >80%; 3-step protocol = >80%; further primer optimization required
        • 2-step protocol = ca. 90%; further primer optimization required
    22. 24. Acknowledgements
        • Collaborating laboratories:
            • Damon Little, New York Botanic Garden
            • Sean Graham, University of British Columbia
            • Gao Lian-Ming, Li De-Zhu, Kunming Institute of Botany
            • Maria Kuzmina, Mehrdad Hajibabaei, CCDB, University of Guelph
            • Aron Fazekas, University of Guelph
        • Suppliers of data and samples:
            • Olivier Maurin, Michelle van der Bank, ACDB, University of Johannesburg
            • Harald Schneider, Natural History Museum, London
            • Dietmar Quandt, Susann Wicke, Nees Institute, University of Bonn
            • Fay Wei Li, ChunNeng Wang and others, National Taiwan University
            • Paul Wolf, Utah State University
            • Juan Carlos Villareal, University of Connecticut