Dr Aron Fazekas - Plant DNA Barcoding; data workflow
 

Dr Fazekas's process for checking and editing DNA sequences before publishing on BOLD.

Usage Rights

CC Attribution-NonCommercial License

  • Assumptions: BOLD project exists already. Just received raw data back from sequencer.
  • Every base is critical. Other principles: homology
  • Mention orientation
  • Mention orientation
  • Contigs need to agree… ABI software will make mistakes from time to time.
  • Important to look at the sequence… many gaps inserted (an extreme example, but it can happen on a smaller scale).
  • Delete the old alignment or make a new one: develop methods to back-check the aligned file against the original.
  • Relevant points: outliers, oddities. Single sequences – how do we know they are what they are?

Presentation Transcript

  • Plant DNA Barcoding: data workflow. Aron Fazekas, University of Guelph
  • Plant DNA Barcoding: data workflow. Workflow Outline:
    - raw sequence editing
    - data alignment
    - re-edit the sequence file
    - upload to BOLD
    - quality checks using BOLD / GenBank
  • Sequence editing: primer trimming
  • Sequence editing: primer trimming 5’ GTTATGCATGAACGTAATGCTC GAGCATTACGT….
  • Sequence editing: primer trimming
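The primer-trimming step on the slides above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's method: the function names are hypothetical, the reverse primer in the usage example is invented, and only exact primer matches are handled. Real reads often carry miscalls near the primer, so production tools use approximate matching.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    table = str.maketrans("ACGTN", "TGCAN")
    return seq.upper().translate(table)[::-1]


def trim_primers(read: str, fwd_primer: str, rev_primer: str) -> str:
    """Remove an exact forward primer from the 5' end and the reverse
    complement of the reverse primer from the 3' end, if present.

    Assumption: primers match exactly; no mismatches are tolerated.
    """
    seq = read.upper()
    fwd = fwd_primer.upper()
    rev_rc = reverse_complement(rev_primer)
    if fwd and seq.startswith(fwd):
        seq = seq[len(fwd):]
    if rev_rc and seq.endswith(rev_rc):
        seq = seq[:-len(rev_rc)]
    return seq


# Usage: the forward primer is the one shown on the slide; the reverse
# primer "AACCGG" and the insert "ACGTACGT" are made up for illustration.
fwd = "GTTATGCATGAACGTAATGCTC"
read = fwd + "ACGTACGT" + reverse_complement("AACCGG")
print(trim_primers(read, fwd, "AACCGG"))
```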
  • Sequence editing: editing miscalls
  • Sequence editing: congruence between forward/ reverse reads
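One way to check congruence between forward and reverse reads is to reverse-complement the reverse read and list every position where the two disagree. The sketch below assumes both reads have already been trimmed to the same region and have equal length; the function names are hypothetical, and a real contig assembler would align the reads first rather than compare them position by position.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    table = str.maketrans("ACGTN", "TGCAN")
    return seq.upper().translate(table)[::-1]


def read_disagreements(forward: str, reverse: str) -> list:
    """Return (position, forward_base, reverse_base) for each mismatch.

    Positions are 1-based and refer to the forward orientation.
    Assumption: reads cover the same region and have the same length.
    """
    fwd = forward.upper()
    rev = reverse_complement(reverse)
    if len(fwd) != len(rev):
        raise ValueError("reads must be trimmed to the same length")
    return [(i + 1, f, r) for i, (f, r) in enumerate(zip(fwd, rev)) if f != r]


# Usage with made-up reads: the reverse read disagrees at position 4.
print(read_disagreements("ACGTTA", "TATCGT"))
```

Each reported position is a candidate miscall to inspect in the trace files by eye.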
  • Sequence Alignment. After editing: need to align the data. Kelchner (2000) Ann Missouri Bot Gard
    - rbcL: easy to align; most programs work well
    - matK: tricky to align; TransAlign seems to do the best job
    - trnH: difficult (impossible between genera?)
    - ITS: difficult (impossible between genera?)
    - Clustal: www.clustal.org
    - TransAlign: http://www.biomedcentral.com/1471-2105/6/156
    - K-Align: http://www.ebi.ac.uk/Tools/msa/kalign/
  • Sequence Alignment. Problems to look for after alignment:
    - primers not trimmed
    - gaps at the ends
    - gaps in the middle (protein coding)
    - translation shows stop codons
  • trnH-psbA: primers not trimmed; gaps at the ends. Real data submitted for publication.
  • rbcL: gaps in the middle of a coding region (data submitted for publication)
  • Translate coding regions (rbcL, matK) to ensure there are no stop codons present
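Two of the post-alignment checks above, stop codons in coding regions and gaps in the middle of a coding sequence, can be approximated without a full translation table. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the reading frame is supplied by the caller, and the three stop codons TAA, TAG and TGA are assumed to apply to these plastid genes. The gap check flags internal gap runs whose length is not a multiple of three, since such gaps would shift the reading frame.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_stops(seq: str, frame: int = 0) -> list:
    """Return 1-based nucleotide positions of in-frame stop codons.

    Gaps are removed before scanning. `frame` is 0, 1 or 2 and must be
    chosen by the caller (an assumption of this sketch).
    """
    s = seq.upper().replace("-", "")
    return [i + 1 for i in range(frame, len(s) - 2, 3) if s[i:i + 3] in STOP_CODONS]


def frame_shifting_gaps(aligned: str) -> list:
    """Return (1-based start, length) of internal gap runs whose length
    is not a multiple of three. Leading and trailing gaps are ignored."""
    offset = len(aligned) - len(aligned.lstrip("-"))
    core = aligned.strip("-")
    return [
        (m.start() + offset + 1, len(m.group()))
        for m in re.finditer(r"-+", core)
        if len(m.group()) % 3 != 0
    ]


# Usage with made-up sequences.
print(in_frame_stops("ATGGCTTAAGGC"))            # stop codon starting at base 7
print(frame_shifting_gaps("--ATG-GCT---TAA--"))  # one single-base internal gap
```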
  • Edit both the alignment file and the original sequence file
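Keeping the alignment file and the original sequence file in step can be checked mechanically: strip the gaps from each aligned sequence and compare it with the original. The sketch below assumes both files have already been parsed into dictionaries keyed by sequence ID (parsing is omitted); the function name and sample IDs are hypothetical.

```python
def backcheck(original: dict, aligned: dict) -> list:
    """Return IDs whose ungapped aligned sequence differs from the
    original, or which are missing from the alignment.

    Both arguments map sequence ID -> sequence string.
    """
    out_of_sync = []
    for seq_id, orig_seq in original.items():
        ali_seq = aligned.get(seq_id)
        if ali_seq is None or ali_seq.replace("-", "").upper() != orig_seq.upper():
            out_of_sync.append(seq_id)
    return out_of_sync


# Usage with made-up records: sample "S2" was edited in the alignment only.
original = {"S1": "ATGGCT", "S2": "ATGGCA"}
aligned = {"S1": "ATG---GCT", "S2": "ATG---GCT"}
print(backcheck(original, aligned))
```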
  • Can trnH-psbA (or other non-coding sequence) be aligned across diverse species?
  • Upload to BOLD
  • After data is edited and aligned: use BOLD to create a tree
  • Check the tree:
    - Check for misplaced taxa – remove them from the dataset
    - Check for singleton species – make a list
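The list of singleton species can be produced directly from the specimen records before or alongside the tree check. This is a small sketch assuming records are available as (sample ID, species name) pairs; the function name and the example records are hypothetical and not taken from any BOLD export format.

```python
from collections import Counter


def singleton_species(records: list) -> list:
    """Return, in alphabetical order, species represented by one record.

    `records` is a list of (sample_id, species_name) pairs.
    """
    counts = Counter(species for _, species in records)
    return sorted(species for species, n in counts.items() if n == 1)


# Usage with made-up records.
records = [
    ("S1", "Acer rubrum"),
    ("S2", "Acer rubrum"),
    ("S3", "Acer saccharum"),
]
print(singleton_species(records))
```

Singletons cannot be checked against conspecifics on the tree, so they are the records most in need of the BLAST checks that follow.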
  • BOLD BLAST check
  • GenBank BLAST check
  • GenBank BLAST check
  • GenBank BLAST
  • Acknowledgements: Sujeevan Ratnasingham & BOLD Team