Non-synonymous SNP ID

"Large" data set project for Bioinformatics class identifying non-synonymous SNPs in sockeye salmon

Published in: Technology

  • Transcript

    • 1. Identifying synonymous and non-synonymous SNPs
      Bioinformatics for Environmental Sciences
      Winter 2009
      Large Data Set Analysis
      Caroline Storer
    • 2. Objective
      Identify whether a SNP is synonymous or non-synonymous
      Develop high-throughput workflow for non-synonymous SNP identification
    • 3. Context
      SNPs are becoming an abundant and accessible tool for genetic studies in non-model organisms
      Additionally,
      • In sockeye salmon, we now have > 110 SNPs to use for genetic stock identification (GSI)
      • SNPs under selection often provide high resolution for GSI
      • Non-synonymous SNPs might indicate genes under selection
    • Theory
      A single-nucleotide sequence difference could change an amino acid and thereby possibly alter protein function
      This is a non-synonymous SNP
    • 6. Non-synonymous SNPs
      Dependent on position of SNP in codon
      Dependent on the reading frame
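To illustrate the two dependencies above, here is a minimal sketch that classifies a SNP by translating the codon containing it under each of the three forward frames. It reuses the example fragment from the data slide. The name `classify_snp` and its arguments are hypothetical, the standard genetic code is assumed, and the fragment's true reading frame is not known here.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_snp(sequence, snp_index, alt_base, frame):
    """Return (ref_aa, alt_aa, label) for a SNP at 0-based snp_index.

    frame is the forward reading-frame offset (0, 1 or 2).
    """
    if snp_index < frame:
        raise ValueError("SNP lies before the first codon of this frame")
    # Start of the codon that contains the SNP in this frame.
    codon_start = frame + 3 * ((snp_index - frame) // 3)
    ref_codon = sequence[codon_start:codon_start + 3]
    if len(ref_codon) < 3:
        raise ValueError("SNP falls in an incomplete trailing codon")
    pos = snp_index - codon_start  # position of the SNP within the codon
    alt_codon = ref_codon[:pos] + alt_base + ref_codon[pos + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    label = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return ref_aa, alt_aa, label


# C allele of TGATTTCT[C/T]CATTCCATG; the SNP is at 0-based index 8.
fragment = "TGATTTCTCCATTCCATG"
for frame in range(3):
    print(frame, classify_snp(fragment, 8, "T", frame))
```

In frame 0 the SNP sits in the third codon position (CTC to CTT, both Leu) and is synonymous. In frames 1 and 2 the same substitution changes the amino acid.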
    • 7. Difficulties
      Determining the reading frame
      dsDNA: 5’-TCTAAAATGGGTGAC-3’
      RNA: 5’-UCUAAAAUGGGUGAC-3’
      Frame 1: UCU AAA AUG GGU GAC
      Frame 2: . CUA AAA UGG GUG AC
      Frame 3: . . UAA AAU GGG UGA C
      6 possible RFs for dsDNA
      3 possible RFs in each direction
      Which reading frame is correct?
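The six candidate frames can be enumerated as in the sketch below, which splits the slide's example sequence into codons for three offsets on each strand. The function names and the "+1" to "-3" frame labels are illustrative choices, not part of the original workflow.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def split_into_codons(seq, offset):
    """Split seq into triplets starting at offset; a short trailing piece is kept."""
    return [seq[i:i + 3] for i in range(offset, len(seq), 3)]


def six_frames(dna):
    """Map a frame label ('+1'..'+3', '-1'..'-3') to its list of codons."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = split_into_codons(seq, offset)
    return frames


for label, codons in six_frames("TCTAAAATGGGTGAC").items():
    print(label, " ".join(codons))
```

The three "+" frames reproduce the codon splits shown on the slide (with T in place of U). The three "-" frames come from the complementary strand.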
    • 8. Workflow
    • 9. Data: Sequences & SNPs
      • 23 sequences with 1 SNP
      • Sequence length from 102 bp – 400 bp
      1 sequence with 1 SNP:
      TGATTTCT[C/T]CATTCCATG
      2 sequences, one for each SNP allele:
      TGATTTCTCCATTCCATG
      TGATTTCTTCATTCCATG
      46 DNA sequences, 1 for each allele
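Expanding a bracketed SNP into one sequence per allele could be scripted along these lines. This is a sketch that assumes exactly one `[X/Y]` site per sequence, as in the data described above; `expand_alleles` is a hypothetical helper name.

```python
import re

# Matches a biallelic SNP written as [C/T].
SNP_PATTERN = re.compile(r"\[([ACGT])/([ACGT])\]")


def expand_alleles(sequence):
    """Return the two allele sequences for a sequence containing one [X/Y] SNP."""
    match = SNP_PATTERN.search(sequence)
    if match is None:
        raise ValueError("no [X/Y] SNP found")
    if SNP_PATTERN.search(sequence, match.end()):
        raise ValueError("more than one SNP found")
    before, after = sequence[:match.start()], sequence[match.end():]
    return [before + allele + after for allele in match.groups()]


print(expand_alleles("TGATTTCT[C/T]CATTCCATG"))
```

Applied to each of the 23 input sequences, this yields the 46 allele sequences.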
    • 11. Translation: Transeq
      http://www.ebi.ac.uk/Tools/emboss/transeq/index.html
      • Result: 276 AA sequences, 12 for each SNP locus, 6 for each allele sequence
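For readers without access to the web tool, a six-frame translation can be approximated in a few lines. This sketch is not Transeq itself: it assumes the standard genetic code, drops incomplete trailing codons, and uses made-up frame labels (F1 to F3 forward, R1 to R3 reverse).

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna, offset):
    """Translate complete codons of dna starting at offset ('*' marks a stop)."""
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


def six_frame_translation(dna):
    """Return the six translations of dna keyed by frame label."""
    reverse = dna.translate(COMPLEMENT)[::-1]
    return {f"{strand}{offset + 1}": translate(seq, offset)
            for strand, seq in (("F", dna), ("R", reverse))
            for offset in range(3)}


# The two allele sequences from the data slide: 6 translations each, 12 per locus.
for allele_seq in ("TGATTTCTCCATTCCATG", "TGATTTCTTCATTCCATG"):
    print(six_frame_translation(allele_seq))
```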
    • Protein BLAST
      iNquiry BLASTP: amino acid query/protein database
      Top hit only
      Tabular output
      INPUT (to BLASTP): 276 sequences, all 23 loci
      OUTPUT (from BLASTP): 84 sequences, 18 loci
      • 22% loss of loci due to zero hits in BLAST
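The loss figure can be checked from the tabular output with a short script. The sketch below assumes the standard 12-column BLAST tabular layout with the query ID in the first column, and a hypothetical query naming scheme `locus|allele|frame`; the example rows are invented for illustration only.

```python
def loci_with_hits(tabular_lines):
    """Collect the loci that have at least one hit row in BLAST tabular output."""
    loci = set()
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        query_id = line.split("\t")[0]
        loci.add(query_id.split("|")[0])  # hypothetical ID: locus|allele|frame
    return loci


def percent_loci_lost(total_loci, loci_hit):
    """Percentage of loci that returned no BLAST hit."""
    return 100 * (total_loci - loci_hit) / total_loci


# Invented rows, for illustration only.
example = [
    "\t".join(["locusA|C|F1", "subject1", "90.0", "40", "4", "0",
               "1", "40", "5", "44", "1e-20", "85.0"]),
    "\t".join(["locusA|T|F1", "subject1", "90.0", "40", "4", "0",
               "1", "40", "5", "44", "1e-20", "85.0"]),
    "\t".join(["locusB|A|R2", "subject2", "75.0", "30", "7", "0",
               "1", "30", "9", "38", "3e-08", "50.1"]),
]
print(sorted(loci_with_hits(example)))
print(round(percent_loci_lost(23, 18)))
```

With 18 of 23 loci returning a hit, 5 of 23 are lost, which rounds to the 22% reported above.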
    • Picking a Reading Frame
      Table columns: Query, Top Hit, E-value, Scenario
      Scenario: only 1 reading frame had hits
      Scenario: multiple reading frames had hits, 1 had a better (lower) E-value
      Query labels: reading frame, locus, allele
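One way to automate this choice is sketched below. It assumes the frame kept is the one whose top hit has the smallest E-value (in BLAST, a smaller E-value indicates a more significant hit), that E-value is the 11th column of the standard tabular output, and that query IDs follow a hypothetical `locus|allele|frame` pattern. The rows are invented for illustration.

```python
def pick_reading_frames(tabular_lines):
    """Return {(locus, allele): (frame, evalue)} keeping the lowest E-value per allele."""
    best = {}
    for line in tabular_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # not a standard 12-column hit row
        locus, allele, frame = fields[0].split("|")  # hypothetical ID layout
        evalue = float(fields[10])
        key = (locus, allele)
        if key not in best or evalue < best[key][1]:
            best[key] = (frame, evalue)
    return best


def make_row(query_id, evalue):
    """Build an invented 12-column hit row; only query ID and E-value matter here."""
    return "\t".join([query_id, "subject1", "90.0", "40", "4", "0",
                      "1", "40", "5", "44", evalue, "85.0"])


rows = [
    make_row("locusA|C|F1", "2e-30"),  # multiple frames hit; F1 is more significant
    make_row("locusA|C|R3", "0.5"),
    make_row("locusB|A|F2", "1e-12"),  # only one frame hit
]
print(pick_reading_frames(rows))
```

Both scenarios from the slide are covered by the same rule: with a single hit that frame is kept, and with several the most significant one wins.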
    • 12. Sequence Alignment & SNPs
      17 synonymous SNPs (no change in AA)
      1 non-synonymous SNP
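Because the two alleles of a locus differ by a single substitution, their translations in the chosen frame have equal length and can be compared position by position without a full alignment. A minimal sketch, with a hypothetical helper name and the example peptides derived from the data-slide fragment in forward frame 2:

```python
def amino_acid_changes(ref_protein, alt_protein):
    """List (1-based position, ref_aa, alt_aa) where two equal-length translations differ."""
    if len(ref_protein) != len(alt_protein):
        raise ValueError("translations must have the same length")
    return [(i + 1, a, b)
            for i, (a, b) in enumerate(zip(ref_protein, alt_protein))
            if a != b]


changes = amino_acid_changes("DFSIP", "DFFIP")
print("non-synonymous" if changes else "synonymous", changes)
```

An empty list means the SNP is synonymous in that frame. Any entry marks a non-synonymous change.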
    • 13. A Non-synonymous SNP
      GO Terms
      SNP U1214: [A/C]
      Gene: Sialyltransferase
    • 14. High-throughput
    • 15. Conclusions
      Can identify non-synonymous mutations
      Method is not high-throughput
      Method could be more automated with sockeye genome