2. Biological Background
► How can researchers hope to identify and study all the changes that occur in so many different diseases?
► How can they explain why some people respond to treatment and not others?
3. 'SNP' is the answer to these questions…
► So what exactly are SNPs?
► How are they involved in so many different aspects of health?
4. Single Nucleotide Polymorphism
A Single Nucleotide Polymorphism (SNP), pronounced "snip," is a genetic variation in which a single nucleotide (i.e., A, T, C, or G) is altered and the change is passed on through heredity.
SNP: single DNA base variation found in >1% of the population
Mutation: single DNA base variation found in <1% of the population
SNP
C T T A G C T T   94%
C T T A G T T T   6%
Mutation
C T T A G C T T   99.9%
C T T A G T T T   0.1%
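The 1% threshold can be sketched in a few lines of Python. The function name `classify_variant` is hypothetical, and treating a frequency of exactly 1% as a SNP is an assumption, since the slide only defines the cases above and below 1%.

```python
def classify_variant(variant_frequency: float) -> str:
    """Label a single-base variation by its population frequency.

    Assumption: exactly 1% is labelled "SNP"; the slide only defines
    the cases above and below 1%.
    """
    if not 0.0 <= variant_frequency <= 1.0:
        raise ValueError("frequency must be between 0 and 1")
    return "SNP" if variant_frequency >= 0.01 else "mutation"


print(classify_variant(0.06))   # the 6% variant from the slide
print(classify_variant(0.001))  # the 0.1% variant from the slide
```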
5. Single Nucleotide Polymorphism
SNPs are the most frequent form of genetic variation.
90% of human genetic variation comes from SNPs.
SNPs occur about once every 300 to 600 base pairs.
Millions of SNPs have been identified (e.g., by HapMap and Perlegen).
SNPs have become the preferred markers for association studies because of their high abundance and the availability of high-throughput SNP genotyping technologies.
6. Single Nucleotide Polymorphism
A SNP is usually assumed to be a binary variable.
The probability of a repeat mutation at the same SNP locus is quite small.
Tri-allelic cases are usually attributed to genotyping errors.
The nucleotide at a SNP locus is called
a major allele (if allele frequency > 50%), or
a minor allele (if allele frequency < 50%).
A C T T A G C T T   94%   T: major allele
A C T T A G C T C   6%    C: minor allele
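As a minimal sketch of the major/minor distinction, the snippet below counts alleles observed at one biallelic locus and reports which is major and which is minor. The function name `major_minor` and the sample data are illustrative assumptions; the slide does not say how a 50/50 split should be labelled, so the order in that case is arbitrary here.

```python
from collections import Counter


def major_minor(alleles):
    """Return ((major, freq), (minor, freq)) for a biallelic SNP locus.

    Assumption: exactly two distinct alleles are present. A 50/50 split is
    not defined on the slide; here the order then follows first appearance.
    """
    counts = Counter(alleles)
    if len(counts) != 2:
        raise ValueError("expected exactly two distinct alleles")
    total = sum(counts.values())
    (major, n_major), (minor, n_minor) = counts.most_common(2)
    return (major, n_major / total), (minor, n_minor / total)


# Hypothetical sample reproducing the 94% / 6% split from the slide.
sample = ["T"] * 94 + ["C"] * 6
major, minor = major_minor(sample)
print("major:", major, "minor:", minor)
```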
7. SNP facts
► SNPs are found in coding and (mostly) noncoding regions.
► They occur with very high frequency: estimates range from about 1 in 1,000 bases to 1 in 100 to 300 bases.
► The abundance of SNPs and the ease with which they can be measured make these genetic variations significant.
► A SNP close to a particular gene can act as a marker for that gene.
► SNPs in coding regions may alter the structure of the protein made by that coding region.
9. SNPs may or may not alter protein structure
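Whether a coding SNP changes the protein depends on whether the altered codon still encodes the same amino acid. The sketch below illustrates this with the standard genetic code; the function name `snp_effect` and the labels it returns are illustrative choices, not terminology from the slides.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def snp_effect(codon: str, position: int, new_base: str) -> str:
    """Classify a single-base change inside one codon (position is 0, 1 or 2)."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        return "synonymous"   # protein sequence unchanged
    if after == "*":
        return "nonsense"     # premature stop codon
    if before == "*":
        return "stop-loss"    # stop codon replaced by an amino acid
    return "missense"         # one amino acid replaced by another


print(snp_effect("GAA", 2, "G"))  # GAA and GAG both encode Glu
print(snp_effect("GAG", 1, "T"))  # GAG (Glu) becomes GTG (Val)
```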
10. SNP maps
► Sequence the genomes of a large number of people.
► Compare the base sequences to discover SNPs.
► Generate a single map of the human genome containing all possible SNPs => SNP maps
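The comparison step can be sketched as a column-by-column scan over sequences that are already aligned. This is a toy illustration only: the function name `find_variable_sites` is hypothetical, and real SNP discovery also involves alignment to a reference and frequency filtering, which are not shown.

```python
def find_variable_sites(sequences):
    """Return {position: set of bases} for columns where aligned sequences differ.

    Assumption: all sequences are already aligned and of equal length.
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    sites = {}
    for position, column in enumerate(zip(*sequences)):
        bases = set(column)
        if len(bases) > 1:
            sites[position] = bases
    return sites


# Hypothetical sample built from the two sequences shown on slide 4.
genomes = ["CTTAGCTT", "CTTAGTTT", "CTTAGCTT"]
snp_map = find_variable_sites(genomes)
print(snp_map)
```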
12. SNP Profiles
► The genome of each individual contains a distinct SNP pattern.
► People can be grouped based on their SNP profile.
► SNP profiles are important for identifying response to drug therapy.
► Correlations might emerge between certain SNP profiles and specific responses to treatment.
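A minimal sketch of grouping people by SNP profile and comparing treatment response per group follows. The patient identifiers, genotypes, and response flags are invented for illustration, and `response_rate_by_profile` is a hypothetical helper rather than an established method.

```python
from collections import defaultdict


def response_rate_by_profile(profiles, responded):
    """Group people with identical SNP profiles and compute each group's response rate."""
    groups = defaultdict(list)
    for person, profile in profiles.items():
        groups[tuple(profile)].append(responded[person])
    return {profile: sum(flags) / len(flags) for profile, flags in groups.items()}


# Hypothetical data: genotypes at two SNP loci per person, plus drug response.
profiles = {
    "P1": ("AA", "CT"),
    "P2": ("AA", "CT"),
    "P3": ("AG", "CC"),
    "P4": ("AG", "CC"),
}
responded = {"P1": True, "P2": True, "P3": False, "P4": True}

rates = response_rate_by_profile(profiles, responded)
print(rates)
```

With realistic cohort sizes, a difference in response rate between profile groups would still need a statistical test before being read as a correlation.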
14. How can SNPs aid research?
► Biomarkers
► Association Studies
► Genotyping
► Loss of Heterozygosity
► DNA Copy Number
15. Techniques to detect known Polymorphisms
► Hybridization Techniques
   Microarrays
   Real-time PCR
► Enzyme-based Techniques
   Nucleotide extension
   Cleavage
   Ligation
   Reaction product detection and display
► Comparison of Techniques used
16. Techniques to detect unknown Polymorphisms
► Direct Sequencing
► Microarray
► Cleavage / Ligation
► Electrophoretic mobility assays
► Comparison of Techniques used
17. Direct Sequencing
► Sanger dideoxy sequencing can detect any type of unknown polymorphism and its position, provided the majority of the DNA contains that polymorphism.
► It can miss polymorphisms and mutations when the DNA is heterozygous.
► It has limited utility for analysis of solid tumors or pooled DNA samples because of its low sensitivity.
► Once a sample is known to contain a polymorphism in a specific region, direct sequencing is particularly useful for identifying the polymorphism and its specific position.
► Even if the identity of the polymorphism cannot be discerned in the first pass, multiple sequencing attempts have proven quite successful in elucidating sequence and position information.
18. SNP Microarray Chip
► Uses a microarray platform similar to that of gene expression studies
► Hybridization of fluorescently tagged samples to probes which correspond to sequences of interest
19. Affymetrix Probe Layout
► Two alleles, A and B
► PerfectMatch (Signal)
   MisMatch (Background)
► Sense (forward)
   Antisense (reverse)
► Shifted Sequences (-2, -1, 0, 1, 2)
► Read intensity values
20. Genotyping
► Each probe gives some indication of allele A or B
► Aggregate information from all probes for a given SNP
► Create a classifier for each SNP
► Make genotype calls (AA, BB, AB, AB_A, AB_B, Unknown)
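The aggregate-then-classify idea can be sketched with a simple threshold rule on the fraction of signal coming from allele A probes. This is not the Affymetrix calling algorithm: the function name `call_genotype`, the thresholds, and the minimum-signal cutoff are all assumptions chosen for illustration, and the intermediate AB_A and AB_B calls listed on the slide are not modelled.

```python
def call_genotype(a_intensities, b_intensities,
                  low=0.3, high=0.7, min_signal=100.0):
    """Call AA, AB, BB or Unknown from per-probe intensities for one SNP.

    Assumptions: inputs are background-corrected intensities for the allele A
    and allele B probes; `low`, `high` and `min_signal` are arbitrary
    illustrative thresholds, not values from any real platform.
    """
    total_a = sum(a_intensities)
    total_b = sum(b_intensities)
    if total_a + total_b < min_signal:
        return "Unknown"  # too little signal to make a call
    fraction_a = total_a / (total_a + total_b)
    if fraction_a >= high:
        return "AA"
    if fraction_a <= low:
        return "BB"
    return "AB"


print(call_genotype([900, 1100], [50, 60]))
print(call_genotype([500, 520], [480, 510]))
print(call_genotype([10, 5], [8, 4]))
```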
21. SIGNIFICANCE OF SNPs
• In disease diagnosis
• In finding predisposition to diseases
• In drug discovery & development
• In drug responses
• Investigation of migration patterns
• All these aspects will help in developing medication & diagnosis at the individual level
Feb. 25. 2003 SI Hung
22. SNP Screening
• Two different screening strategies
  - Many SNPs in a few individuals
  - A few SNPs in many individuals
• Different strategies will require different tools
• Important in determining markers for complex genetic states
23. SNP genotyping methods for detecting genes that contribute to susceptibility or resistance to multifactorial diseases and to adverse drug reactions:
=> case-control association analysis
[Figure: case and control sequences differing at a single base, ….GCCGTTGAC…. versus ….GCCATTGAC….]
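A case-control association analysis at a single SNP can be sketched as a chi-square test on a 2×2 table of allele counts. The counts below are invented for illustration and `allele_chi_square` is a hypothetical helper; a real study would also need multiple-testing correction and checks for population structure.

```python
def allele_chi_square(case_a, case_b, control_a, control_b):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 allele-count table."""
    n = case_a + case_b + control_a + control_b
    row_case = case_a + case_b
    row_control = control_a + control_b
    col_a = case_a + control_a
    col_b = case_b + control_b
    if 0 in (row_case, row_control, col_a, col_b):
        raise ValueError("every row and column must contain at least one count")
    return (n * (case_a * control_b - case_b * control_a) ** 2
            / (row_case * row_control * col_a * col_b))


# Hypothetical allele counts: G vs A at the variable base, in cases and controls.
chi2 = allele_chi_square(case_a=60, case_b=140, control_a=30, control_b=170)
# 3.841 is the chi-square critical value for p = 0.05 with 1 degree of freedom.
print(f"chi-square = {chi2:.3f}, significant at 0.05: {chi2 > 3.841}")
```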
24. HAPLOTYPE
A set of closely linked genetic markers present on one chromosome which tend to be inherited together (not easily separable by recombination)
26. HAPLOTYPE CORRELATION WITH PHENOTYPE
Associating haplotype frequencies with the frequency of the desired phenotype in the population helps to exploit the full potential of SNPs as markers.
The "haplotype-centric" approach combines the information of adjacent SNPs into composite multilocus haplotypes.
Haplotypes are not only more informative but also capture the regional LD (linkage disequilibrium) information, which is assumed to be robust and powerful.
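To make the haplotype and LD ideas concrete, the sketch below computes haplotype frequencies and the r² measure of linkage disequilibrium between two SNPs. It assumes phased two-locus haplotypes are already available (phasing itself is not shown); the function names and the sample data are illustrative.

```python
from collections import Counter


def haplotype_frequencies(haplotypes):
    """Return {haplotype: frequency} for a list of phased haplotype strings."""
    counts = Counter(haplotypes)
    total = len(haplotypes)
    return {hap: count / total for hap, count in counts.items()}


def ld_r_squared(haplotypes, allele1, allele2):
    """r^2 between allele1 at the first SNP and allele2 at the second SNP."""
    n = len(haplotypes)
    p1 = sum(h[0] == allele1 for h in haplotypes) / n
    p2 = sum(h[1] == allele2 for h in haplotypes) / n
    p12 = sum(h[0] == allele1 and h[1] == allele2 for h in haplotypes) / n
    d = p12 - p1 * p2  # disequilibrium coefficient D
    denominator = p1 * (1 - p1) * p2 * (1 - p2)
    if denominator == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denominator


# Hypothetical phased haplotypes over two adjacent SNPs.
linked = ["AC"] * 5 + ["GT"] * 5          # alleles always travel together
unlinked = ["AC", "AT", "GC", "GT"] * 3   # all combinations equally common

print(haplotype_frequencies(linked))
print(ld_r_squared(linked, "A", "C"), ld_r_squared(unlinked, "A", "C"))
```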
27. ADVANTAGES:
• SNPs are the most frequent form of DNA variation
• They are the disease-causing mutations in many genes
• They are abundant & have slow mutation rates
• They are easy to score
• They may work as the next generation of genetic markers
28. Some important SNP database resources
1. dbSNP (http://www.ncbi.nlm.nih.gov/SNP/)
LocusLink (http://www.ncbi.nlm.nih.gov/LocusLink/list.cgi)
2. TSC (http://snp.cshl.org/)
3. SNPper (http://snpper.chip.org/bio/)
4. JSNP (http://snp.ims.u-tokyo.ac.jp/search.html)
5. GeneSNPs (http://www.genome.utah.edu/genesnps/)
6. HGVbase (http://hgvbase.cgb.ki.se/)
7. PolyPhen (http://dove.embl-heidelberg.de/PolyPhen/)
OMIM (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM)
8. Human SNP database
(http://www-genome.wi.mit.edu/snp/human/)
Editor's Notes
SNPs are simple to measure and understand.
Haplotypes have the advantage, in appropriate circumstances, of carrying more information about the genotype-phenotype link than do the underlying SNPs.