Dr Robert Hanner - Barcode Data standards for animals, plants & fungi

Establishing standards for barcoding and highlighting some of the problems with inconsistencies within the barcode library.

Published in: Education, Technology

  1. Informatics Workshop, Adelaide, 28 November 2011. The BARCODE Data Standard: Enabling Molecular Diagnostics for Biodiversity. Robert Hanner, Ph.D., Centre for Biodiversity Genomics, University of Guelph, Canada
  2. The Infrastructure of Taxonomy
     • Collections and databases of specimens
     • Codes of Taxonomic Nomenclature
     • Compilations of taxonomic names
     • Monographs
     • Floristic and faunistic surveys/inventories
     • Revisions
     • The (undigitized) Taxonomic Literature
  3. DNA Barcoding: New tools for taxonomy
     • The ability to compare genotype information across a huge range of organisms is a powerful tool
  4. Emerging Applications
  5. Couplets Consisting of: “Species Name - DNA Sequence”
     • Basis of a “look-up table” enabling molecular diagnostic applications
     • However, both elements are assertions
     • Underlying specimens and associated raw sequence data are not typically available for secondary inspection
  6. Manual Assembly: Subjective interpretation?
  7. “Only [27%] of papers had a legitimate specimens examined section, with museum numbers for each voucher, and names of the museums where the specimens used in the study could be examined”
  8. Problem Areas
     • TRANSPARENCY AND TRACEABILITY
     • Genetic Data Quality
     • Specimen Data Quality
     • Taxonomy
     • Information Access
  9. First International Barcode of Life Conference
  10. Barcoders began calling for a Paradigm Shift
  11. Barcoding: Integrating Best Practices
      • Genomics
      • Classical Taxonomy
  12. Data Standards for BARCODE Records in INSDC*
      • Community-based standards for COI
      • Creation of a reserved keyword BARCODE
        - Required & recommended data elements
        - Sequence quality and coverage
      • Recommended for identifying unknowns
      • Process to propose non-COI gene regions
      *http://barcoding.si.edu/pdf/dwg_data_standards-final.pdf
  13. Second International Barcode of Life Conference, 17-21 Sept 2007
  14. Validation demonstrates that a procedure is robust, reliable and reproducible. PCR amplification and DNA sequencing:
      • Are robust methods that produce successful results a high percentage of the time.
      • Are reliable methods that produce accurate results.
      • Are reproducible methods that produce similar results each time a sample is tested.
  15. Third International Barcode of Life Conference
  16. 2009: Barcode Markers for Plants
      • 52 authors from 24 institutions in 9 nations proposed a pair of short sequences (totaling about 1,450 base pairs) from rbcL and matK as the foundation for a DNA barcode library for plants.
      • CBOL Plant Working Group (2009) A DNA barcode for land plants. Proc Natl Acad Sci USA 106:12794–12797.
  17. Fourth International Barcode of Life Conference
  18. 2011: Barcode Marker for Fungi
      • 149 authors from 71 institutions propose ITS as fungal barcode target. It also has demonstrated utility in some plants*.
      • Fungal Barcoding Consortium (2011) The nuclear ribosomal internal transcribed spacer (ITS) region as a universal DNA barcode marker for Fungi. Proc Natl Acad Sci USA (Submitted).
      • *Hollingsworth (2011) Refining the DNA barcode for land plants. www.pnas.org/cgi/doi/10.1073/pnas.1116812108
  19. Move toward rapid data release:
      • In 2009 the community acknowledged the value of the “Ft Lauderdale Accord”
      • Raw sequence data and high-level taxonomy (e.g. order) deposited in INSDC prior to publication
      • Gave rise to “Dark taxa” in INSDC and subsequent arguments pro & con
  20. Issues that need to be addressed:
      • Legacy BARCODE records lack trace files
      • Many recent BARCODE records lack valid names
      • Not all potential BARCODE data is in the public domain
  21. Question: What is barcoding?
      • A method for species identification and discovery through the analysis of short, standardized DNA sequences
      • Should BARCODE be applied only to known species as an ID tag, or should it be used to designate a sequence entry conforming to a meta-data standard?
  22. DNA Barcodes: a tool of integrative taxonomy
      [Diagram: barcoding spans a range from DNA identification (low ambiguity, species well-known) to DNA taxonomy (high ambiguity, species unknown)]
  23. Evolution of Standards
      • Even among well-studied vertebrates, serious discrepancies exist in the application of names across labs
      • Identification accuracy of reference collections highly variable
      • Perhaps BARCODE is a better process tag unless reserved for published data
  24. 2011: BOLD 3.0
      • Supports assembly of BARCODE compliant data records for all markers
      • Includes specimen images and introduces BINs to aid data validation
      • Introduces features for 3rd party annotation of data records to facilitate library curation
  25. What other issues remain?
      • Barcode annotation of plants and fungi?
      • Registration of institutions/collections
      • Synchronization of data bases
  26. www.biorepositories.org
  27. Structured Reference to Vouchers?
  28. LinkOut to Collection Catalogs
  29. Accomplishments:
      • Integration of genomics and biodiversity science via creation of a robust molecular diagnostic interface between them
      • Increased community awareness of taxonomy and collections
  30. Acknowledgments: All Participants of the CBOL Database Work Group and many, many others!
  31. Rationale for Defining “BARCODE” keyword in GenBank
      Provides the community with reference records with verifiable and retrievable data:
      • Associated with retrievable voucher specimens (liberally defined: tissue, DNA, etc.)
      • Linked to on-line metadata
      • Meet an agreed upon standard of taxonomic identification
      • Provide an assured level of data completeness
      • On an agreed upon gene region
      • Recommended for use in identifying unknowns
  32. The Barcode Data Standard
      Establishing a new data standard for “BARCODE” keyword records in DDBJ/EMBL/GenBank:
      1. Minimum 500bp, <1% ambiguous base calls
      2. Double stranded sequence
      3. Trace files and associated quality scores
      4. Primers used to generate sequence
      5. Linkages to:
         1. A morphological voucher specimen
         2. Structured reference to collections
         3. Geospatial reference information
         4. Valid species name
         5. Who performed the identification
         6. Literature citations
  33. BARCODE Records (without trace files)
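
The checklist on slide 32 can be read as a set of tests applied to a candidate record. The Python sketch below is only an illustration of that idea, not the CBOL or INSDC implementation. The `BarcodeRecord` class, its field names, and `check_record` are hypothetical. It also makes two simplifying assumptions: any character other than A, C, G or T counts as an ambiguous base call, and the linkage checks only test that a value is present, since name validity and identification quality cannot be verified from the record alone.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumption: any character other than A, C, G, T is an ambiguous base call.
UNAMBIGUOUS = set("ACGT")


@dataclass
class BarcodeRecord:
    """Hypothetical record layout; field names are illustrative only."""
    sequence: str
    double_stranded: bool = False          # item 2
    has_trace_files: bool = False          # item 3 (traces and quality scores)
    primers: Tuple[str, ...] = ()          # item 4
    voucher: str = ""                      # item 5.1
    collection_reference: str = ""         # item 5.2
    geo_reference: str = ""                # item 5.3
    species_name: str = ""                 # item 5.4
    identified_by: str = ""                # item 5.5
    citations: Tuple[str, ...] = ()        # item 5.6


def check_record(rec: BarcodeRecord) -> List[str]:
    """Return the list of slide-32 criteria that the record fails."""
    problems: List[str] = []
    seq = rec.sequence.upper()

    # Item 1: minimum 500 bp and fewer than 1% ambiguous base calls.
    if len(seq) < 500:
        problems.append("sequence shorter than 500 bp")
    ambiguous = sum(1 for base in seq if base not in UNAMBIGUOUS)
    if seq and ambiguous / len(seq) >= 0.01:
        problems.append("1% or more ambiguous base calls")

    # Items 2 to 5: presence checks only (see assumptions in the lead-in).
    presence_checks = [
        (rec.double_stranded, "not double stranded"),
        (rec.has_trace_files, "no trace files"),
        (bool(rec.primers), "no primers listed"),
        (bool(rec.voucher), "no voucher specimen"),
        (bool(rec.collection_reference), "no structured collection reference"),
        (bool(rec.geo_reference), "no geospatial reference"),
        (bool(rec.species_name), "no species name"),
        (bool(rec.identified_by), "no identifier named"),
        (bool(rec.citations), "no literature citations"),
    ]
    problems.extend(message for ok, message in presence_checks if not ok)
    return problems
```

A record that passes returns an empty list. A legacy record of the kind shown on slide 33 would return at least `"no trace files"`, which makes that gap explicit instead of silently accepting the record.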
