SlideShare a Scribd company logo
Statistical methods for off-target variant
genotyping on Affymetrix’Axiom® Arrays
Teresa A. Webster, Ali Pirani, Mei-mei Shen, Laurent Bellon, and Hong Gao
Affymetrix, Inc. Santa Clara, CA 95051 USA
Algorithm for OTV genotyping
We developed a statistical approach called “OTVCaller” to genotype the
OTV SNPs. This algorithm uses the posterior genotyping information
generated by the automated clustering algorithm AxiomGT1 to initiate the
2D Baum-Welch algorithm in order to search for the optimal clustering in
both intensity strength and contrast dimensions. It then iteratively
updates sample assignments and cluster centers until convergence is
reached. The detailed workflow of the OTVCaller algorithm is shown in
Figure 3 and is implemented in SNPolisher R package.
Figure 3: The workflow of the OTVCaller algorithm. Two models are considered,
including a three-genotype model (AA, BB, and OTV) and a four genotype model
(AA, AB, BB, and OTV). The last step is to select the best-fit model.
ABSTRACT
Background: Off-target variants (OTVs) (Didion, et al., BMC Genomics
13:34, 2012) are genomic markers with sequences that are significantly
different from the sequences of hybridization probes used in microarray-based
genotyping. This dissimilarity between probe and target can lower
hybridization intensities such that genotypes of the subpopulation are often
incorrectly called as heterozygotes by genotyping algorithms that cluster
intensities into three expected genotypes, namely AA, AB, and BB. OTVs tend
to group members of the divergent subpopulation into a fourth cluster
denoted as the OTV cluster. Therefore, correct identification of this OTV
cluster, a process described as OTV genotyping, is a critical step.
Results: We have developed a statistical method called “OTVCaller” to
genotype all OTVs based on the posterior genotyping information generated
by the automated clustering algorithm AxiomGT1, which is used by Axiom®
Genotyping Console™ Software. OTVCaller uses posterior genotypes as the
prior information to initiate the Baum-Welch algorithm to iteratively search for
the optimal locations of the four clusters AA, AB, BB, and OTV. It then derives
genotypes and OTV types when convergence is reached. We applied OTVCaller
to genotype data from multiple species and demonstrate OTVCaller can
successfully identify the OTV cluster in the presence of all three genotypes
(AA, AB, and BB) or in the presence of only two genotypes (AA and BB).
OTVCaller is available from Affymetrix in a SNP data analysis post-processing
R package “SNPolisher™” and is applicable to Axiom® genotyping arrays.
Background
Off-target variant (OTV) probes (Didion, et al., 2012)1 usually demonstrate
substantially low hybridization intensity and center at zero in contrast
dimension, due to double deletion or sequence nonhomology. Thus they tend
to confound the genotype calling of heterozygotes. Therefore, there is a critical
need for statistical methods to distinguish OTVs from true heterozygotes.
Application to wheat OTV genotypes
We applied the OTVCaller algorithm implemented in SNPolisher™ R package
to bread wheat genotyping data generated with Affymetrix’ Axiom® custom
array. OTVCaller accurately corrected miscalled heterozygotes, as shown in
Figure 4 where OTVs had been miscalled as heterozygotes and true
heterozygotes had been miscalled as null calls before OTV genotyping. They
are all corrected after OTV genotyping.
Figure 4: Three examples of OTV clusters before and after OTV genotyping by
OTVCaller algorithm. OTV clusters are represented by cyan diamonds.
OTV
AABBBB AA
AB
OTV
Figure 1: Two
examples of
OTV genotypes.
Left has four
genotypes, AA,
AB, BB, and OTV.
Right has three
genotypes, AA,
BB, and OTV.
Diamonds in cyan
represent off-
target variants.
Figure 2: Two examples of genotype clusters with low HetSO values. They are
identified as OTV genotypes as their HetSO values are below -0.3.
Before OTV Genotyping
Identification of OTV genotypes
We internally developed a metric for SNP quality control called heterozygote
strength offset (HetSO). It is defined as the vertical distance (as measured by
[A+B]/2 or signal size) from the center of the heterozygote cluster to the line
connecting the centers of the homozygote clusters.
Large negative values are usually associated with OTV clusters. Therefore, we
define OTVs as genotypes with good cluster resolution but with HetSO < -0.3,
which is a customizable threshold. Identification of OTV genotypes is executed
automatically by SNPolisher R package.
After OTV Genotyping
1. Didion, J.P., Yang, H., Sheppard, K., Fu, C., McMillan, L., Villena, F.P., and Churchill, G.A. Discovery of novel variants in
genotyping arrays improves genotype retention and reduces ascertainment bias. BMC Genomics 13:34 (2012).
HetSO
=‐2.24 HetSO=
‐0.72

More Related Content

What's hot

Advancement of molecular markers and crop improvement in plant breeding
Advancement of molecular markers and crop improvement in plant breedingAdvancement of molecular markers and crop improvement in plant breeding
Advancement of molecular markers and crop improvement in plant breeding
PARTNER, BADC, World Bank
 
Marker assistant selection
Marker assistant selectionMarker assistant selection
Marker assistant selection
Pawan Nagar
 
Marker assisted selection for complex traits in agricultural crops
Marker assisted selection for complex traits in agricultural cropsMarker assisted selection for complex traits in agricultural crops
Marker assisted selection for complex traits in agricultural crops
Aparna Veluru
 
Marker assisted selection (2)
Marker assisted selection (2)Marker assisted selection (2)
Marker assisted selection (2)
Shreya Lodh
 
Mapping and QTL
Mapping and QTLMapping and QTL
Mapping and QTL
FAO
 
Using Genomic Selection in Barley to Improve Disease Resistance
Using Genomic Selection in Barley to Improve Disease ResistanceUsing Genomic Selection in Barley to Improve Disease Resistance
Using Genomic Selection in Barley to Improve Disease Resistance
Borlaug Global Rust Initiative
 
MARKER ASSISTED BACKCROSS BREEDING
MARKER ASSISTED BACKCROSS BREEDINGMARKER ASSISTED BACKCROSS BREEDING
MARKER ASSISTED BACKCROSS BREEDING
sandeshGM
 
05 costa
05 costa05 costa
05 costa
fruitbreedomics
 
Genome responses of trypanosome infected cattle
Genome responses of trypanosome infected cattleGenome responses of trypanosome infected cattle
Genome responses of trypanosome infected cattle
Laurence Dawkins-Hall
 
IRJET- Analysis of Hybrid Purity in Watermelon using Microsatellite Marker in...
IRJET- Analysis of Hybrid Purity in Watermelon using Microsatellite Marker in...IRJET- Analysis of Hybrid Purity in Watermelon using Microsatellite Marker in...
IRJET- Analysis of Hybrid Purity in Watermelon using Microsatellite Marker in...
IRJET Journal
 
Genetics, structure, and prevalence of FP967 T-DNA in flax
Genetics, structure, and prevalence of FP967 T-DNA in flaxGenetics, structure, and prevalence of FP967 T-DNA in flax
Genetics, structure, and prevalence of FP967 T-DNA in flax
Jamille McLeod
 
Marker assisted selection( mas) and its application in plant breeding
Marker assisted selection( mas) and its application in plant breedingMarker assisted selection( mas) and its application in plant breeding
Marker assisted selection( mas) and its application in plant breeding
Hemantkumar Sonawane
 
MARKER-ASSISTED BREEDING FOR RICE IMPROVEMENT
MARKER-ASSISTED BREEDING FOR RICE IMPROVEMENTMARKER-ASSISTED BREEDING FOR RICE IMPROVEMENT
MARKER-ASSISTED BREEDING FOR RICE IMPROVEMENT
FOODCROPS
 
Systemic analysis of data combined from genetic qtl's and gene expression dat...
Systemic analysis of data combined from genetic qtl's and gene expression dat...Systemic analysis of data combined from genetic qtl's and gene expression dat...
Systemic analysis of data combined from genetic qtl's and gene expression dat...
Laurence Dawkins-Hall
 
Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...
Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...
Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...
t7260678
 
Marker assisted whole genome selection in crop improvement
Marker assisted whole genome     selection in crop improvementMarker assisted whole genome     selection in crop improvement
Marker assisted whole genome selection in crop improvement
Senthil Natesan
 
Identification and verification of QTL associated with frost tolerance using ...
Identification and verification of QTL associated with frost tolerance using ...Identification and verification of QTL associated with frost tolerance using ...
Identification and verification of QTL associated with frost tolerance using ...
PGS
 
QTL lecture for Bio4025
QTL lecture for Bio4025QTL lecture for Bio4025
QTL lecture for Bio4025
DanChitwood
 
Seed purity test
Seed purity testSeed purity test
Seed purity test
Senthil Natesan
 
Stem cellge arrayposter
Stem cellge arrayposterStem cellge arrayposter
Stem cellge arrayposter
Elsa von Licy
 

What's hot (20)

Advancement of molecular markers and crop improvement in plant breeding
Advancement of molecular markers and crop improvement in plant breedingAdvancement of molecular markers and crop improvement in plant breeding
Advancement of molecular markers and crop improvement in plant breeding
 
Marker assistant selection
Marker assistant selectionMarker assistant selection
Marker assistant selection
 
Marker assisted selection for complex traits in agricultural crops
Marker assisted selection for complex traits in agricultural cropsMarker assisted selection for complex traits in agricultural crops
Marker assisted selection for complex traits in agricultural crops
 
Marker assisted selection (2)
Marker assisted selection (2)Marker assisted selection (2)
Marker assisted selection (2)
 
Mapping and QTL
Mapping and QTLMapping and QTL
Mapping and QTL
 
Using Genomic Selection in Barley to Improve Disease Resistance
Using Genomic Selection in Barley to Improve Disease ResistanceUsing Genomic Selection in Barley to Improve Disease Resistance
Using Genomic Selection in Barley to Improve Disease Resistance
 
MARKER ASSISTED BACKCROSS BREEDING
MARKER ASSISTED BACKCROSS BREEDINGMARKER ASSISTED BACKCROSS BREEDING
MARKER ASSISTED BACKCROSS BREEDING
 
05 costa
05 costa05 costa
05 costa
 
Genome responses of trypanosome infected cattle
Genome responses of trypanosome infected cattleGenome responses of trypanosome infected cattle
Genome responses of trypanosome infected cattle
 
IRJET- Analysis of Hybrid Purity in Watermelon using Microsatellite Marker in...
IRJET- Analysis of Hybrid Purity in Watermelon using Microsatellite Marker in...IRJET- Analysis of Hybrid Purity in Watermelon using Microsatellite Marker in...
IRJET- Analysis of Hybrid Purity in Watermelon using Microsatellite Marker in...
 
Genetics, structure, and prevalence of FP967 T-DNA in flax
Genetics, structure, and prevalence of FP967 T-DNA in flaxGenetics, structure, and prevalence of FP967 T-DNA in flax
Genetics, structure, and prevalence of FP967 T-DNA in flax
 
Marker assisted selection( mas) and its application in plant breeding
Marker assisted selection( mas) and its application in plant breedingMarker assisted selection( mas) and its application in plant breeding
Marker assisted selection( mas) and its application in plant breeding
 
MARKER-ASSISTED BREEDING FOR RICE IMPROVEMENT
MARKER-ASSISTED BREEDING FOR RICE IMPROVEMENTMARKER-ASSISTED BREEDING FOR RICE IMPROVEMENT
MARKER-ASSISTED BREEDING FOR RICE IMPROVEMENT
 
Systemic analysis of data combined from genetic qtl's and gene expression dat...
Systemic analysis of data combined from genetic qtl's and gene expression dat...Systemic analysis of data combined from genetic qtl's and gene expression dat...
Systemic analysis of data combined from genetic qtl's and gene expression dat...
 
Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...
Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...
Trophectoderm dna fingerprinting by quantitative real time pcr successfully d...
 
Marker assisted whole genome selection in crop improvement
Marker assisted whole genome     selection in crop improvementMarker assisted whole genome     selection in crop improvement
Marker assisted whole genome selection in crop improvement
 
Identification and verification of QTL associated with frost tolerance using ...
Identification and verification of QTL associated with frost tolerance using ...Identification and verification of QTL associated with frost tolerance using ...
Identification and verification of QTL associated with frost tolerance using ...
 
QTL lecture for Bio4025
QTL lecture for Bio4025QTL lecture for Bio4025
QTL lecture for Bio4025
 
Seed purity test
Seed purity testSeed purity test
Seed purity test
 
Stem cellge arrayposter
Stem cellge arrayposterStem cellge arrayposter
Stem cellge arrayposter
 

Similar to Statistical methods for off-target variant genotyping on Affymetrix' Axiom Arrays

Cameron_Locker_variants_final_poster1
Cameron_Locker_variants_final_poster1Cameron_Locker_variants_final_poster1
Cameron_Locker_variants_final_poster1
Cameron Locker, MPH
 
A Novel High Accuracy Algorithm for Reference Assembly in Colour Space
A Novel High Accuracy Algorithm for Reference Assembly in Colour SpaceA Novel High Accuracy Algorithm for Reference Assembly in Colour Space
A Novel High Accuracy Algorithm for Reference Assembly in Colour Space
CSCJournals
 
2015 agbt sv_poster_justin
2015 agbt sv_poster_justin2015 agbt sv_poster_justin
2015 agbt sv_poster_justin
GenomeInABottle
 
June 2017: Biomedical applications of prototype-based classifiers and relevan...
June 2017: Biomedical applications of prototype-based classifiers and relevan...June 2017: Biomedical applications of prototype-based classifiers and relevan...
June 2017: Biomedical applications of prototype-based classifiers and relevan...
University of Groningen
 
160627 giab for festival sv workshop
160627 giab for festival sv workshop160627 giab for festival sv workshop
160627 giab for festival sv workshop
GenomeInABottle
 
Microarray Data Classification Using Support Vector Machine
Microarray Data Classification Using Support Vector MachineMicroarray Data Classification Using Support Vector Machine
Microarray Data Classification Using Support Vector Machine
CSCJournals
 
K Marcoe In Cell User Ge Meeting 2008
K Marcoe In Cell User Ge Meeting  2008K Marcoe In Cell User Ge Meeting  2008
K Marcoe In Cell User Ge Meeting 2008
KarenMarcoe
 
A hybrid classification model for eeg signals
A hybrid classification model for  eeg signals A hybrid classification model for  eeg signals
A hybrid classification model for eeg signals
Aboul Ella Hassanien
 
dual-event machine learning models to accelerate drug discovery
dual-event machine learning models to accelerate drug discoverydual-event machine learning models to accelerate drug discovery
dual-event machine learning models to accelerate drug discovery
Sean Ekins
 
PERFORMANCE ANALYSIS OF MULTICLASS SUPPORT VECTOR MACHINE CLASSIFICATION FOR ...
PERFORMANCE ANALYSIS OF MULTICLASS SUPPORT VECTOR MACHINE CLASSIFICATION FOR ...PERFORMANCE ANALYSIS OF MULTICLASS SUPPORT VECTOR MACHINE CLASSIFICATION FOR ...
PERFORMANCE ANALYSIS OF MULTICLASS SUPPORT VECTOR MACHINE CLASSIFICATION FOR ...
ijcsa
 
Final presentation dwi riyono
Final presentation dwi riyonoFinal presentation dwi riyono
Final presentation dwi riyono
Dwi Riyono
 
C0451823
C0451823C0451823
C0451823
IOSR Journals
 

Similar to Statistical methods for off-target variant genotyping on Affymetrix' Axiom Arrays (12)

Cameron_Locker_variants_final_poster1
Cameron_Locker_variants_final_poster1Cameron_Locker_variants_final_poster1
Cameron_Locker_variants_final_poster1
 
A Novel High Accuracy Algorithm for Reference Assembly in Colour Space
A Novel High Accuracy Algorithm for Reference Assembly in Colour SpaceA Novel High Accuracy Algorithm for Reference Assembly in Colour Space
A Novel High Accuracy Algorithm for Reference Assembly in Colour Space
 
2015 agbt sv_poster_justin
2015 agbt sv_poster_justin2015 agbt sv_poster_justin
2015 agbt sv_poster_justin
 
June 2017: Biomedical applications of prototype-based classifiers and relevan...
June 2017: Biomedical applications of prototype-based classifiers and relevan...June 2017: Biomedical applications of prototype-based classifiers and relevan...
June 2017: Biomedical applications of prototype-based classifiers and relevan...
 
160627 giab for festival sv workshop
160627 giab for festival sv workshop160627 giab for festival sv workshop
160627 giab for festival sv workshop
 
Microarray Data Classification Using Support Vector Machine
Microarray Data Classification Using Support Vector MachineMicroarray Data Classification Using Support Vector Machine
Microarray Data Classification Using Support Vector Machine
 
K Marcoe In Cell User Ge Meeting 2008
K Marcoe In Cell User Ge Meeting  2008K Marcoe In Cell User Ge Meeting  2008
K Marcoe In Cell User Ge Meeting 2008
 
A hybrid classification model for eeg signals
A hybrid classification model for  eeg signals A hybrid classification model for  eeg signals
A hybrid classification model for eeg signals
 
dual-event machine learning models to accelerate drug discovery
dual-event machine learning models to accelerate drug discoverydual-event machine learning models to accelerate drug discovery
dual-event machine learning models to accelerate drug discovery
 
PERFORMANCE ANALYSIS OF MULTICLASS SUPPORT VECTOR MACHINE CLASSIFICATION FOR ...
PERFORMANCE ANALYSIS OF MULTICLASS SUPPORT VECTOR MACHINE CLASSIFICATION FOR ...PERFORMANCE ANALYSIS OF MULTICLASS SUPPORT VECTOR MACHINE CLASSIFICATION FOR ...
PERFORMANCE ANALYSIS OF MULTICLASS SUPPORT VECTOR MACHINE CLASSIFICATION FOR ...
 
Final presentation dwi riyono
Final presentation dwi riyonoFinal presentation dwi riyono
Final presentation dwi riyono
 
C0451823
C0451823C0451823
C0451823
 

More from Affymetrix

Axiom™ Genome-Wide CEU 1 Array Plate
Axiom™ Genome-Wide CEU 1 Array PlateAxiom™ Genome-Wide CEU 1 Array Plate
Axiom™ Genome-Wide CEU 1 Array Plate
Affymetrix
 
Axiom® Genome-Wide CHB 1 & CHB 2 Array Plate Set
Axiom® Genome-Wide CHB 1 & CHB 2 Array Plate SetAxiom® Genome-Wide CHB 1 & CHB 2 Array Plate Set
Axiom® Genome-Wide CHB 1 & CHB 2 Array Plate Set
Affymetrix
 
Axiom™ Genome-Wide ASI 1 Array Plate
Axiom™ Genome-Wide ASI 1 Array PlateAxiom™ Genome-Wide ASI 1 Array Plate
Axiom™ Genome-Wide ASI 1 Array Plate
Affymetrix
 
SNP genotyping using the Affymetrix® Axiom® Genome-Wide Pan-African (PanAFR) ...
SNP genotyping using the Affymetrix® Axiom® Genome-Wide Pan-African (PanAFR) ...SNP genotyping using the Affymetrix® Axiom® Genome-Wide Pan-African (PanAFR) ...
SNP genotyping using the Affymetrix® Axiom® Genome-Wide Pan-African (PanAFR) ...
Affymetrix
 
A SNP array for human population genetics studies
A SNP array for human population genetics studiesA SNP array for human population genetics studies
A SNP array for human population genetics studies
Affymetrix
 
Axiom® Genome-Wide LAT 1 Array World Array 4
Axiom®  Genome-Wide LAT 1 Array World Array 4Axiom®  Genome-Wide LAT 1 Array World Array 4
Axiom® Genome-Wide LAT 1 Array World Array 4
Affymetrix
 
Axiom® Genome-Wide AFR 1 Array World Array 3
Axiom®  Genome-Wide AFR 1 Array World Array 3Axiom®  Genome-Wide AFR 1 Array World Array 3
Axiom® Genome-Wide AFR 1 Array World Array 3
Affymetrix
 
FFPE Applications Solutions brochure
FFPE Applications Solutions brochureFFPE Applications Solutions brochure
FFPE Applications Solutions brochure
Affymetrix
 
Solutions for Personalized Medicine brochure
Solutions for Personalized Medicine brochureSolutions for Personalized Medicine brochure
Solutions for Personalized Medicine brochure
Affymetrix
 
Download our publication archive list
Download our publication archive listDownload our publication archive list
Download our publication archive list
Affymetrix
 
Designing GWAS arrays for efficient imputation-based coverage
Designing GWAS arrays for efficient imputation-based coverageDesigning GWAS arrays for efficient imputation-based coverage
Designing GWAS arrays for efficient imputation-based coverage
Affymetrix
 
Use of Affymetrix Arrays (GeneChip® Human Transcriptome 2.0 Array and Cytosca...
Use of Affymetrix Arrays (GeneChip® Human Transcriptome 2.0 Array and Cytosca...Use of Affymetrix Arrays (GeneChip® Human Transcriptome 2.0 Array and Cytosca...
Use of Affymetrix Arrays (GeneChip® Human Transcriptome 2.0 Array and Cytosca...
Affymetrix
 
From trials evaluating drugs to trials evaluating treatment algorithms – Focu...
From trials evaluating drugs to trials evaluating treatment algorithms – Focu...From trials evaluating drugs to trials evaluating treatment algorithms – Focu...
From trials evaluating drugs to trials evaluating treatment algorithms – Focu...
Affymetrix
 
Affymetrix OncoScan®* data analysis with Nexus Copy Number™
Affymetrix OncoScan®* data analysis with Nexus Copy Number™Affymetrix OncoScan®* data analysis with Nexus Copy Number™
Affymetrix OncoScan®* data analysis with Nexus Copy Number™
Affymetrix
 
Phenotypic identification of subclones in multiple myeloma with different gen...
Phenotypic identification of subclones in multiple myeloma with different gen...Phenotypic identification of subclones in multiple myeloma with different gen...
Phenotypic identification of subclones in multiple myeloma with different gen...
Affymetrix
 
Mitigating genotyping application note
Mitigating genotyping application noteMitigating genotyping application note
Mitigating genotyping application note
Affymetrix
 
Allopolyploid Genotyping Algorithm on Affymetrix' Axiom Arrays
Allopolyploid Genotyping Algorithm on Affymetrix' Axiom ArraysAllopolyploid Genotyping Algorithm on Affymetrix' Axiom Arrays
Allopolyploid Genotyping Algorithm on Affymetrix' Axiom Arrays
Affymetrix
 
SNP genotyping of markers with nearby secondary polymorphisms using Affymetri...
SNP genotyping of markers with nearby secondary polymorphisms using Affymetri...SNP genotyping of markers with nearby secondary polymorphisms using Affymetri...
SNP genotyping of markers with nearby secondary polymorphisms using Affymetri...
Affymetrix
 
Development of a high-throughput high-density SNP genotyping array for bovine
Development of a high-throughput high-density SNP genotyping array for bovineDevelopment of a high-throughput high-density SNP genotyping array for bovine
Development of a high-throughput high-density SNP genotyping array for bovine
Affymetrix
 
SNP genotyping using Affymetrix' Axiom Genotyping Solution
SNP genotyping using Affymetrix' Axiom Genotyping SolutionSNP genotyping using Affymetrix' Axiom Genotyping Solution
SNP genotyping using Affymetrix' Axiom Genotyping Solution
Affymetrix
 

More from Affymetrix (20)

Axiom™ Genome-Wide CEU 1 Array Plate
Axiom™ Genome-Wide CEU 1 Array PlateAxiom™ Genome-Wide CEU 1 Array Plate
Axiom™ Genome-Wide CEU 1 Array Plate
 
Axiom® Genome-Wide CHB 1 & CHB 2 Array Plate Set
Axiom® Genome-Wide CHB 1 & CHB 2 Array Plate SetAxiom® Genome-Wide CHB 1 & CHB 2 Array Plate Set
Axiom® Genome-Wide CHB 1 & CHB 2 Array Plate Set
 
Axiom™ Genome-Wide ASI 1 Array Plate
Axiom™ Genome-Wide ASI 1 Array PlateAxiom™ Genome-Wide ASI 1 Array Plate
Axiom™ Genome-Wide ASI 1 Array Plate
 
SNP genotyping using the Affymetrix® Axiom® Genome-Wide Pan-African (PanAFR) ...
SNP genotyping using the Affymetrix® Axiom® Genome-Wide Pan-African (PanAFR) ...SNP genotyping using the Affymetrix® Axiom® Genome-Wide Pan-African (PanAFR) ...
SNP genotyping using the Affymetrix® Axiom® Genome-Wide Pan-African (PanAFR) ...
 
A SNP array for human population genetics studies
A SNP array for human population genetics studiesA SNP array for human population genetics studies
A SNP array for human population genetics studies
 
Axiom® Genome-Wide LAT 1 Array World Array 4
Axiom®  Genome-Wide LAT 1 Array World Array 4Axiom®  Genome-Wide LAT 1 Array World Array 4
Axiom® Genome-Wide LAT 1 Array World Array 4
 
Axiom® Genome-Wide AFR 1 Array World Array 3
Axiom®  Genome-Wide AFR 1 Array World Array 3Axiom®  Genome-Wide AFR 1 Array World Array 3
Axiom® Genome-Wide AFR 1 Array World Array 3
 
FFPE Applications Solutions brochure
FFPE Applications Solutions brochureFFPE Applications Solutions brochure
FFPE Applications Solutions brochure
 
Solutions for Personalized Medicine brochure
Solutions for Personalized Medicine brochureSolutions for Personalized Medicine brochure
Solutions for Personalized Medicine brochure
 
Download our publication archive list
Download our publication archive listDownload our publication archive list
Download our publication archive list
 
Designing GWAS arrays for efficient imputation-based coverage
Designing GWAS arrays for efficient imputation-based coverageDesigning GWAS arrays for efficient imputation-based coverage
Designing GWAS arrays for efficient imputation-based coverage
 
Use of Affymetrix Arrays (GeneChip® Human Transcriptome 2.0 Array and Cytosca...
Use of Affymetrix Arrays (GeneChip® Human Transcriptome 2.0 Array and Cytosca...Use of Affymetrix Arrays (GeneChip® Human Transcriptome 2.0 Array and Cytosca...
Use of Affymetrix Arrays (GeneChip® Human Transcriptome 2.0 Array and Cytosca...
 
From trials evaluating drugs to trials evaluating treatment algorithms – Focu...
From trials evaluating drugs to trials evaluating treatment algorithms – Focu...From trials evaluating drugs to trials evaluating treatment algorithms – Focu...
From trials evaluating drugs to trials evaluating treatment algorithms – Focu...
 
Affymetrix OncoScan®* data analysis with Nexus Copy Number™
Affymetrix OncoScan®* data analysis with Nexus Copy Number™Affymetrix OncoScan®* data analysis with Nexus Copy Number™
Affymetrix OncoScan®* data analysis with Nexus Copy Number™
 
Phenotypic identification of subclones in multiple myeloma with different gen...
Phenotypic identification of subclones in multiple myeloma with different gen...Phenotypic identification of subclones in multiple myeloma with different gen...
Phenotypic identification of subclones in multiple myeloma with different gen...
 
Mitigating genotyping application note
Mitigating genotyping application noteMitigating genotyping application note
Mitigating genotyping application note
 
Allopolyploid Genotyping Algorithm on Affymetrix' Axiom Arrays
Allopolyploid Genotyping Algorithm on Affymetrix' Axiom ArraysAllopolyploid Genotyping Algorithm on Affymetrix' Axiom Arrays
Allopolyploid Genotyping Algorithm on Affymetrix' Axiom Arrays
 
SNP genotyping of markers with nearby secondary polymorphisms using Affymetri...
SNP genotyping of markers with nearby secondary polymorphisms using Affymetri...SNP genotyping of markers with nearby secondary polymorphisms using Affymetri...
SNP genotyping of markers with nearby secondary polymorphisms using Affymetri...
 
Development of a high-throughput high-density SNP genotyping array for bovine
Development of a high-throughput high-density SNP genotyping array for bovineDevelopment of a high-throughput high-density SNP genotyping array for bovine
Development of a high-throughput high-density SNP genotyping array for bovine
 
SNP genotyping using Affymetrix' Axiom Genotyping Solution
SNP genotyping using Affymetrix' Axiom Genotyping SolutionSNP genotyping using Affymetrix' Axiom Genotyping Solution
SNP genotyping using Affymetrix' Axiom Genotyping Solution
 

Recently uploaded

原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
yqqaatn0
 
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptxThe use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
MAGOTI ERNEST
 
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Ana Luísa Pinho
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
Sérgio Sacani
 
Shallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptxShallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptx
Gokturk Mehmet Dilci
 
Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
Texas Alliance of Groundwater Districts
 
ANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptx
ANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptxANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptx
ANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptx
RASHMI M G
 
Eukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptxEukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptx
RitabrataSarkar3
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
Sérgio Sacani
 
Medical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptxMedical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptx
terusbelajar5
 
Oedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptxOedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptx
muralinath2
 
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdfTopic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
TinyAnderson
 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Texas Alliance of Groundwater Districts
 
SAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdfSAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdf
KrushnaDarade1
 
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
yqqaatn0
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
Sérgio Sacani
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
Sharon Liu
 
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốtmô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
HongcNguyn6
 
Thornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdfThornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdf
European Sustainable Phosphorus Platform
 
Nucleophilic Addition of carbonyl compounds.pptx
Nucleophilic Addition of carbonyl  compounds.pptxNucleophilic Addition of carbonyl  compounds.pptx
Nucleophilic Addition of carbonyl compounds.pptx
SSR02
 

Recently uploaded (20)

原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
 
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptxThe use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
 
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
 
Shallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptxShallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptx
 
Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
 
ANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptx
ANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptxANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptx
ANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptx
 
Eukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptxEukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptx
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
 
Medical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptxMedical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptx
 
Oedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptxOedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptx
 
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdfTopic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
 
SAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdfSAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdf
 
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
 
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốtmô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
 
Thornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdfThornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdf
 
Nucleophilic Addition of carbonyl compounds.pptx
Nucleophilic Addition of carbonyl  compounds.pptxNucleophilic Addition of carbonyl  compounds.pptx
Nucleophilic Addition of carbonyl compounds.pptx
 

Statistical methods for off-target variant genotyping on Affymetrix' Axiom Arrays

  • 1. Statistical methods for off-target variant genotyping on Affymetrix’Axiom® Arrays Teresa A. Webster, Ali Pirani, Mei-mei Shen, Laurent Bellon, and Hong Gao Affymetrix, Inc. Santa Clara, CA 95051 USA Algorithm for OTV genotyping We developed a statistical approach called “OTVCaller” to genotype the OTV SNPs. This algorithm uses the posterior genotyping information generated by the automated clustering algorithm AxiomGT1 to initiate the 2D Baum-Welch algorithm in order to search for the optimal clustering in both intensity strength and contrast dimensions. It then iteratively updates sample assignments and cluster centers until convergence is reached. The detailed workflow of the OTVCaller algorithm is shown in Figure 3 and is implemented in SNPolisher R package. Figure 3: The workflow of the OTVCaller algorithm. Two models are considered, including a three-genotype model (AA, BB, and OTV) and a four genotype model (AA, AB, BB, and OTV). The last step is to select the best-fit model. ABSTRACT Background: Off-target variants (OTVs) (Didion, et al., BMC Genomics 13:34, 2012) are genomic markers with sequences that are significantly different from the sequences of hybridization probes used in microarray-based genotyping. This dissimilarity between probe and target can lower hybridization intensities such that genotypes of the subpopulation are often incorrectly called as heterozygotes by genotyping algorithms that cluster intensities into three expected genotypes, namely AA, AB, and BB. OTVs tend to group members of the divergent subpopulation into a fourth cluster denoted as the OTV cluster. Therefore, correct identification of this OTV cluster, a process described as OTV genotyping, is a critical step. Results: We have developed a statistical method called “OTVCaller” to genotype all OTVs based on the posterior genotyping information generated by the automated clustering algorithm AxiomGT1, which is used by Axiom® Genotyping Console™ Software. OTVCaller uses posterior genotypes as the prior information to initiate the Baum-Welch algorithm to iteratively search for the optimal locations of the four clusters AA, AB, BB, and OTV. It then derives genotypes and OTV types when convergence is reached. We applied OTVCaller to genotype data from multiple species and demonstrate OTVCaller can successfully identify the OTV cluster in the presence of all three genotypes (AA, AB, and BB) or in the presence of only two genotypes (AA and BB). OTVCaller is available from Affymetrix in a SNP data analysis post-processing R package “SNPolisher™” and is applicable to Axiom® genotyping arrays. Background Off-target variant (OTV) probes (Didion, et al., 2012)1 usually demonstrate substantially low hybridization intensity and center at zero in contrast dimension, due to double deletion or sequence nonhomology. Thus they tend to confound the genotype calling of heterozygotes. Therefore, there is a critical need for statistical methods to distinguish OTVs from true heterozygotes. Application to wheat OTV genotypes We applied the OTVCaller algorithm implemented in SNPolisher™ R package to bread wheat genotyping data generated with Affymetrix’ Axiom® custom array. OTVCaller accurately corrected miscalled heterozygotes, as shown in Figure 4 where OTVs had been miscalled as heterozygotes and true heterozygotes had been miscalled as null calls before OTV genotyping. They are all corrected after OTV genotyping. Figure 4: Three examples of OTV clusters before and after OTV genotyping by OTVCaller algorithm. OTV clusters are represented by cyan diamonds. OTV AABBBB AA AB OTV Figure 1: Two examples of OTV genotypes. Left has four genotypes, AA, AB, BB, and OTV. Right has three genotypes, AA, BB, and OTV. Diamonds in cyan represent off- target variants. Figure 2: Two examples of genotype clusters with low HetSO values. They are identified as OTV genotypes as their HetSO values are below -0.3. Before OTV Genotyping Identification of OTV genotypes We internally developed a metric for SNP quality control called heterozygote strength offset (HetSO). It is defined as the vertical distance (as measured by [A+B]/2 or signal size) from the center of the heterozygote cluster to the line connecting the centers of the homozygote clusters. Large negative values are usually associated with OTV clusters. Therefore, we define OTVs as genotypes with good cluster resolution but with HetSO < -0.3, which is a customizable threshold. Identification of OTV genotypes is executed automatically by SNPolisher R package. After OTV Genotyping 1. Didion, J.P., Yang, H., Sheppard, K., Fu, C., McMillan, L., Villena, F.P., and Churchill, G.A. Discovery of novel variants in genotyping arrays improves genotype retention and reduces ascertainment bias. BMC Genomics 13:34 (2012). HetSO =‐2.24 HetSO= ‐0.72