Megan Milton & Mallory Van Wyngaarden - Managing Barcode Data Library Generation

How to manage barcode data library generation using BOLD systems

Published in: Education, Technology

  1. Barcode of Life Data Systems (BOLD): Managing Barcode Data Library Generation
     www.boldsystems.org (v2.5); v3.boldsystems.org (v3.0 beta)
     Fourth International Barcode of Life Conference - Workshop
     Megan Milton and Mallory Van Wyngaarden
     Monday, November 28, 2011 – University of Adelaide, Australia
  2. Barcode Library Generation
  3. Barcode Library Generation
     • Needs
       • Scope (taxonomic and/or geographic)
       • Barcode standards compliance
       • Completion of data
       • Access by all participants
       • Quality control process
       • Data curation/updates
       • Avoid duplication of effort
       • Computational power for analysis
       • Protection of data
  4. BOLD Workbench
     • How BOLD addresses these needs:
       • Secure data storage
       • Online access anywhere
       • Permission-based sharing
       • Taxonomy Browser (view progress so far)
       • Built-in quality control checks
       • Progress feeds/activity log
       • Analysis tools on BOLD compute cluster
  5. Getting Started: User Registration
     • Requesting an Account
       • Requirements:
         • Valid email address
         • Institutional affiliation
         • Password
  6. Getting Started: Project Creation Form
     • Creating a Project
       • Project identifiers
         • Project code
         • Project type
       • Markers
         • Primary
         • Secondary
       • Campaign
       • Description
       • Project permissions
  7. Getting Started
     • Barcode Record = Specimen data + Molecular data
     • Specimen Page
     • Sequence Page
  8. Getting Started
     Standard Workflow - order of upload:
       1. Specimen Data
       2. Images
       3. Traces
       4. Sequences
  9. Specimen Data Submissions: Single Specimen Upload Form
     • Specimen Data
       • Single uploads
         • Identifiers
         • Taxonomy
         • Specimen details
         • Collection data
       • Batch uploads
         • New and updated records
         • Template spreadsheet
         • Submit through BOLD to Data Management Team
  10. Image Submissions: Image Library
      • Image Data
        • Required fields
          • Sample ID
          • Process ID
          • Image file
          • Original specimen
          • View metadata
          • Licensing
        • Resolution
          • < 20 megapixels
        • Assemble package
          • Images (.jpeg format)
          • Spreadsheet (template)
          • Maximum zipped file size 190 MB
  11. Trace Submissions: Trace File Viewer
      • Trace Files
        • Sequencing details:
          • Trace file in .ab1 or .scf format
          • Phred file in .phd.1 format
          • PCR primers
          • Sequencing primer
          • Direction
          • Marker
          • Attribution to run site
        • Assemble package
          • Electropherograms
          • Spreadsheet (template)
          • Maximum zipped file size 190 MB
  12. Primer Submissions: Primer Database
      • Primer Database
        • Search by
          • Primer code
          • Submitter
          • Target marker
          • Reference/citation
  13. Primer Submissions: Primer Submission Form
      • Primers
        • Required fields
          • Primer code
          • Primer description
          • Target marker
          • Primer sequence
          • Reference/citation
          • Direction
          • *Public/private
  14. Sequence Submissions: Sequence Page
      • Sequence Data
        • Required fields
          • Aligned sequences in FASTA format
          • Header can use Process ID or Sample ID
          • Marker
          • Run site (institution)
          • Fewer than 1,000 sequences per upload
  15. Project Summary: Project Console
      • Project Console
        • Project permissions and publication
          • Project manager only
        • Project statistics
        • Uploads/downloads
        • Sequence analysis
        • Specimen aggregates
        • Activity feed
        • Tags and comments
  16. Project Summary: Record List and Icons
      • Record List
        • Identification
        • Specimen Page
          • Specimen information
          • Image data
        • Sequence Page
          • Sequence(s), trace files and primer
        • Icons and flags
        • Tagging and comments on multiple records
  17. Data Validation: Taxon ID Tree
      • Taxon ID Tree
        • Requires good-quality sequences and some level of taxonomy; images are recommended
        • Highlights common contaminations
        • Colourize by taxonomy, geography, etc.
        • Helps to catch misidentifications
        • Add pictures for comparison
        • Use to help make identifications
  18. Data Validation: Nearest Neighbour Summary
      • Nearest Neighbour
        • Tabular format
        • Requires low-level taxonomy
        • Highlights:
          • Low divergence compared to nearest neighbour
          • Divergence that is less than the intra-specific divergence
  19. Data Curation: Specimen and Sequence Pages
      • Editing Records
        • Review graphs and flags in Project Summary
        • Review and edit specimen page
        • Review sequence page
          • Sequence
          • Trace
          • Primer
        • Replace or delete images, traces, sequences
  20. Publication: Published Project
      • Publishing Project
        • Submitting to GenBank
        • Making projects public on BOLD
  21. Bibliography Submissions: Biblio Submission Form and Publication Database
      • Bibliography
        • Required fields:
          • Title
          • Authors
          • Abstract
          • Journal details
        • Connect to BOLD records
          • Primary records
          • Secondary records
  22. [email_address]
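
The sequence submission requirements on slide 14 (FASTA format, headers that use a Process ID or Sample ID, fewer than 1,000 sequences per upload) could be checked locally before uploading. The following is a minimal sketch only, not part of BOLD: the function names, the `known_ids` set, and the example IDs are hypothetical, and BOLD may apply further checks that the slides do not describe.

```python
# Hypothetical pre-upload check; none of these names come from BOLD itself.
MAX_SEQUENCES_PER_UPLOAD = 999  # slide 14: "< 1000" sequences per upload


def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_upload(records, known_ids):
    """Return a list of problems found; an empty list means none were found.

    known_ids is assumed to hold the Process IDs and Sample IDs of the
    specimen records already present in the project.
    """
    problems = []
    if len(records) > MAX_SEQUENCES_PER_UPLOAD:
        problems.append(f"{len(records)} sequences exceeds the per-upload limit")
    for header, sequence in records:
        if header not in known_ids:
            problems.append(f"header {header!r} is not a known Process ID or Sample ID")
        if not sequence:
            problems.append(f"header {header!r} has no sequence")
    return problems


if __name__ == "__main__":
    # Made-up IDs, for illustration only.
    example = ">ABC001-11\nACGT-ACGT\n>unknown\nACGTTACGT\n"
    print(check_upload(parse_fasta(example), known_ids={"ABC001-11", "SAMPLE-7"}))
```

Collecting every problem in a list, rather than stopping at the first one, lets a submitter fix a whole file in a single pass.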

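
The packaging limits on slides 10 and 11 (images under 20 megapixels, a zipped package of at most 190 MB, traces in .ab1 or .scf with Phred files in .phd.1) can also be sanity-checked before submission. This sketch uses only the Python standard library and rests on several assumptions: "190MB" is read as 190 × 1024 × 1024 bytes, the caller supplies the image dimensions, and the spreadsheet suffix `.xlsx` in the usage example is a guess because the slides do not name the template's file format. All function and constant names are hypothetical.

```python
import os
import zipfile

MAX_PACKAGE_BYTES = 190 * 1024 * 1024  # assumption: "190MB" read as 190 MiB
MAX_PIXELS = 20_000_000                # slide 10: "< 20 Megapixels"


def within_resolution_limit(width, height):
    """True if an image of width x height pixels is under 20 megapixels."""
    return width * height < MAX_PIXELS


def check_package(zip_path, allowed_suffixes):
    """Return a list of problems with a zipped submission package.

    allowed_suffixes is a tuple of lowercase file-name endings the caller
    expects in the package, e.g. (".ab1", ".scf", ".phd.1").
    """
    problems = []
    size = os.path.getsize(zip_path)
    if size > MAX_PACKAGE_BYTES:
        problems.append(f"package is {size} bytes, over the {MAX_PACKAGE_BYTES}-byte limit")
    with zipfile.ZipFile(zip_path) as package:
        for name in package.namelist():
            if name.endswith("/"):
                continue  # directory entry, not a file
            if not name.lower().endswith(allowed_suffixes):
                problems.append(f"unexpected file in package: {name}")
    return problems
```

A trace package might then be checked with `check_package("traces.zip", (".ab1", ".scf", ".phd.1", ".xlsx"))`, where the file name and the spreadsheet suffix are placeholders. Suffix matching is used instead of `os.path.splitext` because the latter would report only `.1` for a `.phd.1` file.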