SlideShare a Scribd company logo
Mining Single Nucleotide Polymorphisms from public sequence databases. Gary Barker  IACR Long Ashton
What are Single Nucleotide Polymorphisms (SNPs)? ,[object Object],[object Object],[object Object],[object Object],[object Object]
Why are these polymorphisms useful? It’s sometimes possible to correlate a  SNP with a particular trait. This is known as  association genetics.
Disease resistant population Disease susceptible population Genotype all individuals for thousands of SNPs ATG A TTATAG ATG T TTATAG Resistant people all have an ‘A’ at position 4 in  geneX ,  while susceptible people have a ‘T’ gene X
To use SNPs, you first have to find them. Poorly studied organisms:  Sequence many ‘loci’ (different places in the genome)  for many individuals.  Many well studied organisms :  Required data is already present in public sequence databases,  it just needs to be processed.
Number of ESTs* in EMBL database *ESTs are single pass (often partial) gene sequences
Mining SNPs from EST sequences in the database AutoSNP  (PERL script) can find likely SNPs in data sets downloaded from public databases. 1) Marks up only those polymorphisms where each allele is supported by at least two independent sequences. This filters out most sequencing errors. 2) Adds further confidence scores based on co-segregation 3) Results written to HTML reports.
 
 
 
Accessing AutoSNP results 1) Search by accession number:
 
Accessing AutoSNP results 2) Search with a query sequence
 
Current AutoSNP approach:   Cluster sequences (d2cluster) Align and find SNPs (cap3) Accession # / SNP report # Query with Accession MySQL database gi|11117503  |  snip_1.htm gi|12217138  |  snip_2.htm Sequence query Blast client Matching  Accessions Links to existing SNP reports
Desirable: Client supplied query Sequence (ATAGCGTACG……) Blast search (data direct from EBI?) Build contigs of results Detect eSNPs Client gets SNP report(s) (html) for all sequences matching query Data and processing power  (large) processing power (medium)  processing power (small) < 10 seconds
Conclusions SNPs (single nucleotide polymorphisms) are abundant and useful genetic markers. Software exists to mine them from public data sets, but this doesn’t work in real time. GRID technology could help to deliver up-to-date alignments to users for any query sequence with putative SNPs marked up. Related useful features would include bootstrapped trees for each alignment, generated on the fly.

More Related Content

What's hot

Gene expression introduction
Gene expression introductionGene expression introduction
Gene expression introduction
Setia Pramana
 
Multiple sequence alignment
Multiple sequence alignmentMultiple sequence alignment
Multiple sequence alignment
Subhranil Bhattacharjee
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomics
Athira RG
 
Genome annotation 2013
Genome annotation 2013Genome annotation 2013
Genome annotation 2013
Karan Veer Singh
 
Fusion proteins & inclusion bodies, M. Sc. Zoology, University of Mumbai
Fusion proteins & inclusion bodies, M. Sc. Zoology, University of MumbaiFusion proteins & inclusion bodies, M. Sc. Zoology, University of Mumbai
Fusion proteins & inclusion bodies, M. Sc. Zoology, University of Mumbai
Royston Rogers
 
Primary and secondary databases ppt by puneet kulyana
Primary and secondary databases ppt by puneet kulyanaPrimary and secondary databases ppt by puneet kulyana
Primary and secondary databases ppt by puneet kulyana
Puneet Kulyana
 
subtractive hybridization
subtractive hybridizationsubtractive hybridization
subtractive hybridization
Sakshi Saxena
 
SNPs analysis methods
SNPs analysis methodsSNPs analysis methods
SNPs analysis methods
had89
 
DiGE....2-D gel electrophoresis
DiGE....2-D gel electrophoresisDiGE....2-D gel electrophoresis
DiGE....2-D gel electrophoresis
Karan Veer Singh
 
Next generation sequencing
Next generation sequencingNext generation sequencing
Next generation sequencing
PALANIANANTH.S
 
Blast Algorithm
Blast AlgorithmBlast Algorithm
Genome Assembly
Genome AssemblyGenome Assembly
Genome Assembly
Aureliano Bombarely
 
SNP Detection Methods and applications
SNP Detection Methods and applications SNP Detection Methods and applications
SNP Detection Methods and applications
Aneela Rafiq
 
Genotyping by Sequencing
Genotyping by SequencingGenotyping by Sequencing
Genotyping by Sequencing
Senthil Natesan
 
Swiss prot database
Swiss prot databaseSwiss prot database
Swiss prot database
sagrika chugh
 
The next generation sequencing platform of roche 454
The next generation sequencing platform of roche 454The next generation sequencing platform of roche 454
The next generation sequencing platform of roche 454
creativebiogene1
 
Phylogenetic Tree, types and Applicantion
Phylogenetic Tree, types and Applicantion Phylogenetic Tree, types and Applicantion
Phylogenetic Tree, types and Applicantion
Faisal Hussain
 
2 whole genome sequencing and analysis
2 whole genome sequencing and analysis2 whole genome sequencing and analysis
2 whole genome sequencing and analysis
saberhussain9
 
Molecular Phylogenetics
Molecular PhylogeneticsMolecular Phylogenetics
Molecular Phylogenetics
Meghaj Mallick
 

What's hot (20)

Gene expression introduction
Gene expression introductionGene expression introduction
Gene expression introduction
 
Multiple sequence alignment
Multiple sequence alignmentMultiple sequence alignment
Multiple sequence alignment
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomics
 
Genome annotation 2013
Genome annotation 2013Genome annotation 2013
Genome annotation 2013
 
Fusion proteins & inclusion bodies, M. Sc. Zoology, University of Mumbai
Fusion proteins & inclusion bodies, M. Sc. Zoology, University of MumbaiFusion proteins & inclusion bodies, M. Sc. Zoology, University of Mumbai
Fusion proteins & inclusion bodies, M. Sc. Zoology, University of Mumbai
 
Primary and secondary databases ppt by puneet kulyana
Primary and secondary databases ppt by puneet kulyanaPrimary and secondary databases ppt by puneet kulyana
Primary and secondary databases ppt by puneet kulyana
 
subtractive hybridization
subtractive hybridizationsubtractive hybridization
subtractive hybridization
 
SNPs analysis methods
SNPs analysis methodsSNPs analysis methods
SNPs analysis methods
 
DiGE....2-D gel electrophoresis
DiGE....2-D gel electrophoresisDiGE....2-D gel electrophoresis
DiGE....2-D gel electrophoresis
 
Next generation sequencing
Next generation sequencingNext generation sequencing
Next generation sequencing
 
Blast Algorithm
Blast AlgorithmBlast Algorithm
Blast Algorithm
 
Genome Assembly
Genome AssemblyGenome Assembly
Genome Assembly
 
Est database
Est databaseEst database
Est database
 
SNP Detection Methods and applications
SNP Detection Methods and applications SNP Detection Methods and applications
SNP Detection Methods and applications
 
Genotyping by Sequencing
Genotyping by SequencingGenotyping by Sequencing
Genotyping by Sequencing
 
Swiss prot database
Swiss prot databaseSwiss prot database
Swiss prot database
 
The next generation sequencing platform of roche 454
The next generation sequencing platform of roche 454The next generation sequencing platform of roche 454
The next generation sequencing platform of roche 454
 
Phylogenetic Tree, types and Applicantion
Phylogenetic Tree, types and Applicantion Phylogenetic Tree, types and Applicantion
Phylogenetic Tree, types and Applicantion
 
2 whole genome sequencing and analysis
2 whole genome sequencing and analysis2 whole genome sequencing and analysis
2 whole genome sequencing and analysis
 
Molecular Phylogenetics
Molecular PhylogeneticsMolecular Phylogenetics
Molecular Phylogenetics
 

Similar to SNP

Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...
Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...
Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...
fruitbreedomics
 
Expanding Your Research Capabilities Using Targeted NGS
Expanding Your Research Capabilities Using Targeted NGSExpanding Your Research Capabilities Using Targeted NGS
Expanding Your Research Capabilities Using Targeted NGS
Integrated DNA Technologies
 
Bioinformatics MiRON
Bioinformatics MiRONBioinformatics MiRON
Bioinformatics MiRON
Prabin Shakya
 
Cloud bioinformatics 2
Cloud bioinformatics 2Cloud bioinformatics 2
Cloud bioinformatics 2
ARPUTHA SELVARAJ A
 
An analogy of algorithms for tagging of single nucleotide polymorphism and ev
An analogy of algorithms for tagging of single nucleotide polymorphism and evAn analogy of algorithms for tagging of single nucleotide polymorphism and ev
An analogy of algorithms for tagging of single nucleotide polymorphism and evIAEME Publication
 
Pathema Burkholderia Annotation Jamboree: Prokaryotic Annotation Overview
Pathema Burkholderia Annotation Jamboree: Prokaryotic Annotation OverviewPathema Burkholderia Annotation Jamboree: Prokaryotic Annotation Overview
Pathema Burkholderia Annotation Jamboree: Prokaryotic Annotation Overview
Pathema
 
31931 31941
31931 3194131931 31941
31931 31941
Amit Gupta
 
Role of bioinformatics in life sciences research
Role of bioinformatics in life sciences researchRole of bioinformatics in life sciences research
Role of bioinformatics in life sciences research
Anshika Bansal
 
Microarray biotechnologg ppy dna microarrays
Microarray biotechnologg ppy dna microarraysMicroarray biotechnologg ppy dna microarrays
Microarray biotechnologg ppy dna microarrays
ayeshasattarsandhu
 
Genome in a bottle april 30 2015 hvp Leiden
Genome in a bottle april 30 2015 hvp LeidenGenome in a bottle april 30 2015 hvp Leiden
Genome in a bottle april 30 2015 hvp Leiden
GenomeInABottle
 
A Distributed Annotation Pipeline for MSSNG
A Distributed Annotation Pipeline for MSSNGA Distributed Annotation Pipeline for MSSNG
A Distributed Annotation Pipeline for MSSNG
Simon Twigger
 
Bioinformatics_1_ChenS.pptx
Bioinformatics_1_ChenS.pptxBioinformatics_1_ChenS.pptx
Bioinformatics_1_ChenS.pptx
xRowlet
 
A Genome Sequence Analysis System Built with Hypertable
A Genome Sequence Analysis System Built with HypertableA Genome Sequence Analysis System Built with Hypertable
A Genome Sequence Analysis System Built with Hypertable
DATAVERSITY
 
Improved Algorithm for Amplicon Sequencing Assay Designs
Improved Algorithm for Amplicon Sequencing Assay DesignsImproved Algorithm for Amplicon Sequencing Assay Designs
Improved Algorithm for Amplicon Sequencing Assay Designs
Thermo Fisher Scientific
 
2015 functional genomics variant annotation and interpretation- tools and p...
2015 functional genomics   variant annotation and interpretation- tools and p...2015 functional genomics   variant annotation and interpretation- tools and p...
2015 functional genomics variant annotation and interpretation- tools and p...
Gabe Rudy
 
Predicting phenotype from genotype with machine learning
Predicting phenotype from genotype with machine learningPredicting phenotype from genotype with machine learning
Predicting phenotype from genotype with machine learning
Patricia Francis-Lyon
 
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
GenomeInABottle
 
2014 agbt giab data integration poster 140206
2014 agbt giab data integration poster 1402062014 agbt giab data integration poster 140206
2014 agbt giab data integration poster 140206GenomeInABottle
 

Similar to SNP (20)

NCBI
NCBINCBI
NCBI
 
Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...
Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...
Fruit breedomics workshop wp6 from marker assisted breeding to genomics assis...
 
Expanding Your Research Capabilities Using Targeted NGS
Expanding Your Research Capabilities Using Targeted NGSExpanding Your Research Capabilities Using Targeted NGS
Expanding Your Research Capabilities Using Targeted NGS
 
Bioinformatics MiRON
Bioinformatics MiRONBioinformatics MiRON
Bioinformatics MiRON
 
Cloud bioinformatics 2
Cloud bioinformatics 2Cloud bioinformatics 2
Cloud bioinformatics 2
 
An analogy of algorithms for tagging of single nucleotide polymorphism and ev
An analogy of algorithms for tagging of single nucleotide polymorphism and evAn analogy of algorithms for tagging of single nucleotide polymorphism and ev
An analogy of algorithms for tagging of single nucleotide polymorphism and ev
 
Pathema Burkholderia Annotation Jamboree: Prokaryotic Annotation Overview
Pathema Burkholderia Annotation Jamboree: Prokaryotic Annotation OverviewPathema Burkholderia Annotation Jamboree: Prokaryotic Annotation Overview
Pathema Burkholderia Annotation Jamboree: Prokaryotic Annotation Overview
 
31931 31941
31931 3194131931 31941
31931 31941
 
Role of bioinformatics in life sciences research
Role of bioinformatics in life sciences researchRole of bioinformatics in life sciences research
Role of bioinformatics in life sciences research
 
Microarray biotechnologg ppy dna microarrays
Microarray biotechnologg ppy dna microarraysMicroarray biotechnologg ppy dna microarrays
Microarray biotechnologg ppy dna microarrays
 
Genome in a bottle april 30 2015 hvp Leiden
Genome in a bottle april 30 2015 hvp LeidenGenome in a bottle april 30 2015 hvp Leiden
Genome in a bottle april 30 2015 hvp Leiden
 
A Distributed Annotation Pipeline for MSSNG
A Distributed Annotation Pipeline for MSSNGA Distributed Annotation Pipeline for MSSNG
A Distributed Annotation Pipeline for MSSNG
 
Bioinformatics_1_ChenS.pptx
Bioinformatics_1_ChenS.pptxBioinformatics_1_ChenS.pptx
Bioinformatics_1_ChenS.pptx
 
A Genome Sequence Analysis System Built with Hypertable
A Genome Sequence Analysis System Built with HypertableA Genome Sequence Analysis System Built with Hypertable
A Genome Sequence Analysis System Built with Hypertable
 
Improved Algorithm for Amplicon Sequencing Assay Designs
Improved Algorithm for Amplicon Sequencing Assay DesignsImproved Algorithm for Amplicon Sequencing Assay Designs
Improved Algorithm for Amplicon Sequencing Assay Designs
 
2015 functional genomics variant annotation and interpretation- tools and p...
2015 functional genomics   variant annotation and interpretation- tools and p...2015 functional genomics   variant annotation and interpretation- tools and p...
2015 functional genomics variant annotation and interpretation- tools and p...
 
Predicting phenotype from genotype with machine learning
Predicting phenotype from genotype with machine learningPredicting phenotype from genotype with machine learning
Predicting phenotype from genotype with machine learning
 
DNA Profiling_HMD_2020.pptx
DNA Profiling_HMD_2020.pptxDNA Profiling_HMD_2020.pptx
DNA Profiling_HMD_2020.pptx
 
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
 
2014 agbt giab data integration poster 140206
2014 agbt giab data integration poster 1402062014 agbt giab data integration poster 140206
2014 agbt giab data integration poster 140206
 

Recently uploaded

By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
Jen Stirrup
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
UiPathCommunity
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 

Recently uploaded (20)

By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...The Metaverse and AI: how can decision-makers harness the Metaverse for their...
The Metaverse and AI: how can decision-makers harness the Metaverse for their...
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 

SNP

  • 1. Mining Single Nucleotide Polymorphisms from public sequence databases. Gary Barker IACR Long Ashton
  • 2.
  • 3. Why are these polymorphisms useful? It’s sometimes possible to correlate a SNP with a particular trait. This is known as association genetics.
  • 4. Disease resistant population Disease susceptible population Genotype all individuals for thousands of SNPs ATG A TTATAG ATG T TTATAG Resistant people all have an ‘A’ at position 4 in geneX , while susceptible people have a ‘T’ gene X
  • 5. To use SNPs, you first have to find them. Poorly studied organisms: Sequence many ‘loci’ (different places in the genome) for many individuals. Many well studied organisms : Required data is already present in public sequence databases, it just needs to be processed.
  • 6. Number of ESTs* in EMBL database *ESTs are single pass (often partial) gene sequences
  • 7. Mining SNPs from EST sequences in the database AutoSNP (PERL script) can find likely SNPs in data sets downloaded from public databases. 1) Marks up only those polymorphisms where each allele is supported by at least two independent sequences. This filters out most sequencing errors. 2) Adds further confidence scores based on co-segregation 3) Results written to HTML reports.
  • 8.  
  • 9.  
  • 10.  
  • 11. Accessing AutoSNP results 1) Search by accession number:
  • 12.  
  • 13. Accessing AutoSNP results 2) Search with a query sequence
  • 14.  
  • 15. Current AutoSNP approach: Cluster sequences (d2cluster) Align and find SNPs (cap3) Accession # / SNP report # Query with Accession MySQL database gi|11117503 | snip_1.htm gi|12217138 | snip_2.htm Sequence query Blast client Matching Accessions Links to existing SNP reports
  • 16. Desirable: Client supplied query Sequence (ATAGCGTACG……) Blast search (data direct from EBI?) Build contigs of results Detect eSNPs Client gets SNP report(s) (html) for all sequences matching query Data and processing power (large) processing power (medium) processing power (small) < 10 seconds
  • 17. Conclusions SNPs (single nucleotide polymorphisms) are abundant and useful genetic markers. Software exists to mine them from public data sets, but this doesn’t work in real time. GRID technology could help to deliver up-to-date alignments to users for any query sequence with putative SNPs marked up. Related useful features would include bootstrapped trees for each alignment, generated on the fly.