SNP
 

Upload Details

Uploaded via as Microsoft PowerPoint

Usage Rights

© All Rights Reserved

    SNP Presentation Transcript

    • Mining Single Nucleotide Polymorphisms from public sequence databases. Gary Barker, IACR Long Ashton
    • What are Single Nucleotide Polymorphisms (SNPs)?
      • ATGGTAA G CCTGAG C TGACTTAGCGT-AT
      • ATGGTAA A CCTGAG T TGACTTAGCGTCAT
      • Differences, left to right: SNP (G/A), SNP (C/T), indel (-/C)
      • SNPs result from replication errors and DNA damage
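The comparison on this slide can be sketched in a few lines of Python. This is an illustration only, not code from the presentation: the function name `find_polymorphisms` is hypothetical, and it assumes the two sequences are already aligned, with `-` marking a gap.

```python
def find_polymorphisms(seq_a, seq_b, gap="-"):
    """Compare two pre-aligned sequences column by column.

    Returns a list of (position, base_a, base_b, kind) tuples, where
    position is 1-based and kind is "snp" or "indel".
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    differences = []
    for position, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a == b:
            continue
        # A gap character in either sequence marks an insertion/deletion.
        kind = "indel" if gap in (a, b) else "snp"
        differences.append((position, a, b, kind))
    return differences


# The two sequences from the slide, with the highlighting spaces removed.
slide_top = "ATGGTAAGCCTGAGCTGACTTAGCGT-AT"
slide_bottom = "ATGGTAAACCTGAGTTGACTTAGCGTCAT"
for position, a, b, kind in find_polymorphisms(slide_top, slide_bottom):
    print(position, a, b, kind)
```

On the slide's sequences this reports two SNPs and one indel, matching the labels under the alignment.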
    • Why are these polymorphisms useful? It’s sometimes possible to correlate a SNP with a particular trait. This is known as association genetics.
    • Disease-resistant population vs. disease-susceptible population
      • Genotype all individuals for thousands of SNPs.
      • Resistant: ATG A TTATAG
      • Susceptible: ATG T TTATAG
      • Resistant people all have an ‘A’ at position 4 in gene X, while susceptible people have a ‘T’.
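As a rough illustration of the idea, the sketch below counts the alleles seen at one position in each population. The genotype lists are made up for this example (three copies of each sequence from the slide), and the helper name `allele_counts` is hypothetical.

```python
from collections import Counter


def allele_counts(sequences, position):
    """Count the bases seen at a 1-based position across sequences."""
    return Counter(seq[position - 1] for seq in sequences)


# Illustrative genotypes only (made up for this sketch, not real data).
resistant = ["ATGATTATAG", "ATGATTATAG", "ATGATTATAG"]
susceptible = ["ATGTTTATAG", "ATGTTTATAG", "ATGTTTATAG"]

resistant_alleles = allele_counts(resistant, 4)
susceptible_alleles = allele_counts(susceptible, 4)
print(resistant_alleles, susceptible_alleles)

# The SNP separates the groups perfectly when they share no allele.
perfect_association = not (set(resistant_alleles) & set(susceptible_alleles))
print(perfect_association)
```

Real data rarely separates this cleanly, so in practice the counts would feed a statistical test of association rather than a simple set comparison.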
    • To use SNPs, you first have to find them.
      • Poorly studied organisms: sequence many ‘loci’ (different places in the genome) for many individuals.
      • Many well-studied organisms: the required data is already present in public sequence databases; it just needs to be processed.
    • Number of ESTs* in the EMBL database (*ESTs are single-pass, often partial, gene sequences)
    • Mining SNPs from EST sequences in the database. AutoSNP (a Perl script) can find likely SNPs in data sets downloaded from public databases.
      • 1) Marks up only those polymorphisms where each allele is supported by at least two independent sequences. This filters out most sequencing errors.
      • 2) Adds further confidence scores based on co-segregation.
      • 3) Results are written to HTML reports.
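The redundancy filter in step 1 can be sketched as follows. This is not AutoSNP's code (AutoSNP is a Perl script); it is a minimal Python illustration that assumes the reads are already aligned to equal length and that each read counts as an independent sequence. The names `candidate_snps` and `min_support` are hypothetical, and the co-segregation score from step 2 is not covered.

```python
from collections import Counter


def candidate_snps(aligned_reads, min_support=2, gap="-"):
    """Return columns where at least two alleles each have min_support reads.

    aligned_reads: equal-length strings from one alignment.
    Gaps are ignored, so this sketch reports base substitutions only.
    """
    snps = []
    for column in range(len(aligned_reads[0])):
        counts = Counter(
            read[column] for read in aligned_reads if read[column] != gap
        )
        # Keep only alleles seen in enough reads to be trusted.
        supported = {base: n for base, n in counts.items() if n >= min_support}
        if len(supported) >= 2:
            snps.append((column + 1, supported))
    return snps


# Made-up reads for illustration.
reads = [
    "ATGATTATAG",
    "ATGATTATAG",
    "ATGTTTATAG",
    "ATGTTTATAG",
    "ATGATTCTAG",  # lone C at position 7: treated as a likely sequencing error
]
print(candidate_snps(reads))
```

One design choice to note: alleles seen in a single read are simply dropped from a column here, so a column is still reported if two other alleles are well supported. A stricter reading of the slide would reject any column containing an unsupported allele.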
    • Accessing AutoSNP results 1) Search by accession number
    • Accessing AutoSNP results 2) Search with a query sequence
    • Current AutoSNP approach:
      • Cluster sequences (d2cluster)
      • Align and find SNPs (cap3)
      • MySQL database of accession # / SNP report # pairs, e.g. gi|11117503 | snip_1.htm and gi|12217138 | snip_2.htm
      • Query with accession: look up the report in the MySQL database
      • Sequence query: Blast client, then matching accessions, then links to existing SNP reports
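The accession lookup can be illustrated with a small sketch. The slide names MySQL; to keep the example self-contained it uses Python's built-in `sqlite3` as a stand-in. The table name `snp_reports`, its column names, and the function `report_for` are assumptions for illustration; only the two accession/report pairs come from the slide.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL database on the slide.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE snp_reports (accession TEXT PRIMARY KEY, report TEXT)"
)
conn.executemany(
    "INSERT INTO snp_reports VALUES (?, ?)",
    [("gi|11117503", "snip_1.htm"), ("gi|12217138", "snip_2.htm")],
)


def report_for(accession):
    """Return the SNP report file for an accession, or None if unknown."""
    row = conn.execute(
        "SELECT report FROM snp_reports WHERE accession = ?", (accession,)
    ).fetchone()
    return row[0] if row else None


print(report_for("gi|11117503"))
```

A sequence query would add one step in front of this lookup: a Blast search returns matching accessions, and each accession is then resolved to its report in the same way.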
    • Desirable:
      • Client supplies a query sequence (ATAGCGTACG……)
      • Blast search (data direct from EBI?)
      • Build contigs of results
      • Detect eSNPs
      • Client gets SNP report(s) (html) for all sequences matching the query
      • Resource labels: data and processing power (large); processing power (medium); processing power (small); < 10 seconds
    • Conclusions
      • SNPs (single nucleotide polymorphisms) are abundant and useful genetic markers.
      • Software exists to mine them from public data sets, but this doesn’t work in real time.
      • GRID technology could help to deliver up-to-date alignments to users for any query sequence, with putative SNPs marked up.
      • Related useful features would include bootstrapped trees for each alignment, generated on the fly.