Ilene Mizrachi - Opening Plenary

Barcode Sequence Dataflow into GenBank

Published in: Education, Technology

Transcript

  • 1. Ilene Mizrachi. November 30, 2011. Fourth International Barcode of Life Conference. National Center for Biotechnology Information – National Library of Medicine – Bethesda MD 20892 USA
  • 2. Barcode Project: 2003 and beyond
      • The Barcode of Life project was initiated in 2003.
      • INSDC would be the repository for raw and assembled sequence data.
      • INSDC adopts new source fields to accommodate Barcode metadata requirements.
      • The Barcode of Life Database (BOLD) was established as a community workbench and sequencing center.
  • 3. What is a Barcode?
      • A global reference library of DNA barcode sequences that is integrated with other systems of biodiversity information (e.g., databases of specimens, species, biogeographic information).
      • A mechanism to link DNA sequences to vouchered specimens and valid species names.
      • A reserved BARCODE keyword was adopted for data that met strict barcode standards.
  • 4. Barcode Standard
      • Formally described species, or a provisional label for an unpublished species
      • Voucher specimen identifier, preferably in a biorepository, using a structured field
      • Country code using the controlled vocabulary used by GenBank
      • Sequence from a gene region specified by CBOL: COI for animals; matK and rbcL for plants; ITS for fungi
      • At least 75% contiguous, high-quality bases from within the approved region
      • Electropherogram trace files for bidirectional sequencing runs
      • Sequences of all forward and reverse primers
      • Strongly recommended data elements: GPS coordinates; name of the identifier; name of the collector; date of collection
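
The required elements on slide 4 lend themselves to a simple checklist test. The following is a minimal sketch, not GenBank's or BOLD's actual compliance tool: the record layout, the field names, the group labels, and the Phred cutoff of 20 for "high quality" are all assumptions made for illustration. Only the marker list and the 75% threshold come from the slide.

```python
# Approved marker regions as listed on slide 4; the group keys are illustrative labels.
APPROVED_MARKERS = {
    "animal": {"COI"},
    "plant": {"matK", "rbcL"},
    "fungus": {"ITS"},
}

# Hypothetical field names standing in for the required data elements.
REQUIRED_FIELDS = (
    "organism", "specimen_voucher", "country",
    "forward_primer", "reverse_primer", "trace_files",
)


def longest_high_quality_fraction(qualities, min_q=20):
    """Fraction of the read covered by its longest run of bases with quality >= min_q."""
    best = run = 0
    for q in qualities:
        run = run + 1 if q >= min_q else 0
        best = max(best, run)
    return best / len(qualities) if qualities else 0.0


def compliance_problems(record, group, min_q=20):
    """Return a list of human-readable problems; an empty list means no problem was found."""
    problems = [f"missing {name}" for name in REQUIRED_FIELDS if not record.get(name)]
    if record.get("marker") not in APPROVED_MARKERS.get(group, set()):
        problems.append("marker not approved for group")
    if longest_high_quality_fraction(record.get("qualities", []), min_q) < 0.75:
        problems.append("less than 75% contiguous high-quality bases")
    return problems
```

Returning a list of problems rather than a single boolean lets a submitter see every missing element in one pass.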
  • 5. Compliant Barcode Record
  • 6. Barcode records in GenBank
  • 7. Life of an iBOL Record
  • 8. Submissions from BOLD
  • 9. Data Sharing Works
  • 10. http://www.ncbi.nlm.nih.gov/WebSub/?tool=barcode
  • 11. QA checks at GenBank. To ensure that the sequence data is of high quality, the following checks are run:
      • Barcode data element compliance
      • Consistency checks, such as: reported latitude-longitude falls within the cited country; collection date has already occurred
      • Sequence quality checks
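
The two consistency checks named on slide 11 can be sketched as follows. This is an illustration under stated assumptions, not GenBank's implementation: the function names are hypothetical, and the country boundaries are replaced by very coarse, hand-written bounding boxes, where a real check would need proper boundary polygons.

```python
from datetime import date

# Hypothetical, deliberately coarse bounding boxes:
# (min_lat, max_lat, min_lon, max_lon) in decimal degrees.
COUNTRY_BOXES = {
    "Canada": (41.0, 84.0, -142.0, -52.0),
    "Australia": (-44.0, -10.0, 112.0, 154.0),
}


def lat_lon_in_country(lat, lon, country, boxes=COUNTRY_BOXES):
    """True/False if the point is inside the country's box, None if the country is unknown."""
    box = boxes.get(country)
    if box is None:
        return None
    min_lat, max_lat, min_lon, max_lon = box
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon


def collection_date_has_occurred(collected, today=None):
    """True if the collection date is not in the future relative to `today`."""
    if today is None:
        today = date.today()
    return collected <= today
```

Returning `None` for an unknown country keeps "cannot be checked" distinct from "failed the check", which matters when records are flagged for manual review.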
  • 12. Compliance tool
  • 13. Checking Sequence Quality
      • Trim primer sequences
      • Check congruence between forward and reverse reads
      • Align sequences to check for gaps
      • Translate sequences to check for internal stops
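
Three of the steps on slide 13 (primer trimming, forward/reverse congruence, and the internal-stop test) can be illustrated with plain string operations. This is a simplified sketch, not the pipeline described in the talk. It assumes exact primer matches, reads of equal length that are already aligned, and a default stop set of TAA and TAG; the correct stop codons depend on the genetic code of the organism and marker, so `stop_codons` is a parameter. The alignment step for detecting gaps is not covered.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA string over the alphabet A, C, G, T, N."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def trim_primers(read, fwd_primer, rev_primer):
    """Strip exact-match primers from both ends of a forward-orientation read."""
    read = read.upper()
    fwd = fwd_primer.upper()
    rev_rc = reverse_complement(rev_primer)
    if fwd and read.startswith(fwd):
        read = read[len(fwd):]
    if rev_rc and read.endswith(rev_rc):
        read = read[:-len(rev_rc)]
    return read


def mismatches(fwd_read, rev_read):
    """Count positions where the forward read and the reverse-complemented reverse read differ."""
    rev_as_fwd = reverse_complement(rev_read)
    if len(fwd_read) != len(rev_as_fwd):
        raise ValueError("reads must have equal length; align them first")
    return sum(a != b for a, b in zip(fwd_read.upper(), rev_as_fwd))


def stop_codon_positions(seq, frame=0, stop_codons=frozenset({"TAA", "TAG"})):
    """Indices of codons in the given reading frame that are stop codons."""
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    return [i for i, codon in enumerate(codons) if codon in stop_codons]
```

For a protein-coding barcode such as COI, a non-empty result from `stop_codon_positions` in the expected frame suggests a sequencing error, a frameshift, or a pseudogene.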
  • 14. Updates Are Critical
      • Primary data repository: sequence records are owned by the submitter
      • Submitter is responsible for providing additional data and metadata as they become available: publication, sequence, taxonomy, voucher
      • Third-party updates are welcome!
  • 15. Challenges
      • If Reference Barcodes are to be used for species identification, phylogenetics, ecological forensics, conservation, and macro-analysis of biodiversity patterns, then the minimal requirement should be (a) high-quality sequence, (b) link to specimen, and (c) taxonomic identification.
      • Need to support rapid data release, including preliminary taxonomic classifications, similar to the "Fort Lauderdale Principles" of the genomics community.
      • Data are updated asynchronously at BOLD and in GenBank. Need to continue work on the update channel.
      • Need to work with communities to devise strict QA tests for plant and fungal Barcodes.
  • 16. Acknowledgements: Taxonomy Group, GenBank Group, Software Support. Scott Federhen, Susan Schafer, Conrad Schoch, Michael Fetchko, Lu Sun, Carol Hotton, Detlef Leipe, Colleen Bollin, Kamen Todorov, Vasuki Gobu
