Risk Classification with an
Adaptive Naive Bayes Kernel Machine Model
Jessica Minnier¹, Ming Yuan³, Jun Liu⁴, and Tianxi Cai²
¹ Department of Public Health & Preventive Medicine, Oregon Health & Science University
² Department of Biostatistics, Harvard School of Public Health
³ Department of Statistics, University of Wisconsin-Madison
⁴ Department of Statistics, Harvard University
June 30, 2015
ASA Oregon Chapter Meeting
Outline
1 Background and Motivation
2 Model and Methods
Kernels
Blockwise Kernel PCA Estimation
Regularized Selection of Informative Regions
Theoretical Results
3 Simulation Studies
4 Genetic Risk of Type I Diabetes
5 Conclusions
Adaptive Naive Bayes Kernel Machine Model 2
Background and Motivation
Adaptive Naive Bayes (Blockwise) Kernel Machine Classification
• Goal: genetic data → quantify disease risk, predict therapeutic
efficacy, determine disease subtypes
• Goal: build an accurate parsimonious prediction model
– reduce the cost of unnecessary marker measurements
– improve the prediction precision for future patients
– improve over modest prediction precision obtained with clinical
predictors and/or known risk alleles
• Complex diseases
– many alleles contribute to risk
– many distinct combinations of risk factors lead to disease
Background and Motivation
• Genome wide association studies (GWAS)
– identify SNPs associated with disease risk
– primary goal is testing for association
– accurate risk prediction remains difficult
• Common approach:
– select top ranked SNPs based on large scale testing
– construct a composite genetic score w/ selected SNPs
– may not work well due to: false-positive and false-negative errors in identifying predictive SNPs; over-fitting; use of only a subset of the available SNPs; additive effects only
Background and Motivation
Recent progress in prediction with high dimensional data
• Regularized estimation: LASSO (Tibshirani, 1996); SCAD (Fan and Li,
2001); Adaptive LASSO (Zou, 2006)
• Machine learning: Support vector machine (Cristianini and Shawe-Taylor,
2000); Least-squares kernel machine regression (Liu, Lin, and Ghosh, 2007);
Kernel logistic regression (Zhu and Hastie, 2005; Liu, Ghosh, and Lin,
2008)
• Screening + Regularized estimation: Sure independence screening
(Fan and Lv, 2008; Fan and Song, 2009)
Global methods: may be unstable for large p, high correlation
Approach
Challenge:
• Prediction models based on univariate testing, additive models, global
methods → low prediction accuracy, low AUC, missing heritability
• Non-linear effects? testing for interactions → low power
Our approach [Minnier et al., 2015]:
• Blockwise method:
– leverage biological knowledge to build models at the gene-set level
– gene-sets may be genes, gene-pathways, or linkage disequilibrium blocks
• Kernel machine regression:
– allow for complex and nonlinear effects
– implicitly specify the underlying complex functional form of covariate effects via similarity measures (kernels) that define the distance between two sets of covariates
Kernel Methods: similar inputs to similar outputs
• transform data to feature space H with non-linear map φ
• “kernel trick” lets us use a similarity function K(·, ·) instead of φ
• K induces the feature space
(Figure source: N. Takahashi’s webpage)
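The kernel trick can be checked numerically on a small case. The sketch below is an illustration only and is not taken from the slides: it assumes a homogeneous degree-2 polynomial kernel, and the function names `poly2_kernel` and `poly2_feature_map` are hypothetical. It compares the kernel value with the inner product of an explicit feature map φ.

```python
import numpy as np

def poly2_kernel(x, z):
    """Degree-2 polynomial kernel K(x, z) = (x . z)^2 (illustrative choice)."""
    return float(np.dot(x, z)) ** 2

def poly2_feature_map(x):
    """Explicit feature map phi(x): all pairwise products x_j * x_k."""
    return np.outer(x, x).ravel()

x = np.array([0.0, 1.0, 2.0])
z = np.array([2.0, 1.0, 0.0])

# Similarity computed directly in the input space.
k_implicit = poly2_kernel(x, z)
# Same similarity computed as an inner product in the feature space.
k_explicit = float(np.dot(poly2_feature_map(x), poly2_feature_map(z)))
```

The two values agree, so a method that only needs inner products in the feature space never has to form φ explicitly.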
Previous Methods
Blockwise methods
• Inference: Gene-set testing
– Gene burden tests
– Gene Set Enrichment Analysis (GSEA)
– SNP-set Sequence Kernel Association Test (SKAT, SKAT-O; Wu et al. 2010; Wu, Lee, et al. 2011)
Kernel machine methods
• Support Vector Machine (SVM) classification methods
• Inference
– KM SNP-set Testing (Liu et al. 2007, 2008; SKAT methods)
– Gene expression test with kernel Cox model (Li and Luan 2003)
Notations and Model Assumptions
• Data
– Response: $Y = (Y_1, \ldots, Y_n)^T$
– Predictors: $M$ blocks of genomic regions; for $b = 1, \ldots, M$,
$$X^{(b)} = \left(X^{(b)}_1, \ldots, X^{(b)}_n\right)^T_{n \times p_b}$$
• Blockwise: Partition genome into gene-sets
– Recombination hotspots, gene-pathways
• Model under blockwise Naive Bayes (NB) assumption:
$$X^{(1)}, \ldots, X^{(M)} \mid Y \ \text{independent} \;\Rightarrow\; \operatorname{logit}\{\operatorname{pr}(Y = 1 \mid X^{(1)}, \ldots, X^{(M)})\} = c + \sum_{b=1}^{M} \operatorname{logit}\{\operatorname{pr}(Y = 1 \mid X^{(b)})\}$$
– NB assumption allows separate estimation by block and reduces overfitting
– Performs well for zero-one loss $L(X) = I(\hat{Y}(X) \neq Y)$ [Domingos and Pazzani, 1997]
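A short derivation, added here as a sketch, shows why the NB assumption makes the block logits additive and what the constant c is. The symbols π = pr(Y = 1) and f_y^{(b)} (the conditional density or mass function of X^{(b)} given Y = y) are notation introduced for this illustration and do not appear in the slides.

```latex
\begin{align*}
\operatorname{logit}\{\operatorname{pr}(Y = 1 \mid X^{(1)}, \ldots, X^{(M)})\}
  &= \operatorname{logit}(\pi)
     + \sum_{b=1}^{M} \log \frac{f_1^{(b)}(X^{(b)})}{f_0^{(b)}(X^{(b)})}
  && \text{(Bayes' rule and NB independence)} \\
\operatorname{logit}\{\operatorname{pr}(Y = 1 \mid X^{(b)})\}
  &= \operatorname{logit}(\pi)
     + \log \frac{f_1^{(b)}(X^{(b)})}{f_0^{(b)}(X^{(b)})}
  && \text{(Bayes' rule for a single block)} \\
\Rightarrow\quad
\operatorname{logit}\{\operatorname{pr}(Y = 1 \mid X^{(1)}, \ldots, X^{(M)})\}
  &= -(M - 1)\operatorname{logit}(\pi)
     + \sum_{b=1}^{M} \operatorname{logit}\{\operatorname{pr}(Y = 1 \mid X^{(b)})\}
\end{align*}
```

Under these assumptions the constant is c = −(M − 1) logit(π), which depends only on the outcome prevalence and the number of blocks.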
Notations and Model Assumptions
• Within each region, the effect may be complex and interactive due to
– multiple causal variants
– un-typed causal variants in the presence of high LD
• Blockwise Kernel Machine Regression
$$\operatorname{logit}\{\operatorname{pr}(Y = 1 \mid X^{(b)})\} = a^{(b)} + h^{(b)}(X^{(b)})$$
$$h^{(b)}(X^{(b)}) = \sum_{l} \beta^{(b)}_l \psi^{(b)}_l(X^{(b)}) \in \mathcal{H}_{K^{(b)}}$$
– $\{\psi^{(b)}_l\} = \{\sqrt{\lambda^{(b)}_l}\, \phi^{(b)}_l\}$ is implicitly specified via a symmetric positive definite kernel $K^{(b)}(\cdot, \cdot)$.
– $K^{(b)}(X^{(b)}_i, X^{(b)}_j)$ defines the similarity between $X^{(b)}_i$ and $X^{(b)}_j$.
– $\mathcal{H}_{K^{(b)}}$, the function space spanned by $K^{(b)}(\cdot, \cdot)$, is a reproducing kernel Hilbert space (RKHS).
Choices of Kernel Functions
Linear kernel: $K(X_i, X_j) = \rho + X_i^T X_j$,
$$h(X) = \sum_{k=1}^{p} \beta_k X_k$$
Fitting logistic regression with linear kernel ⇔ logistic ridge regression.
IBS kernel: $K(X_i, X_j) = \sum_{k=1}^{p} (2 - |X_{ik} - X_{jk}|)$,
powerful in detecting non-linear effects with SNP data [Wu et al., 2010]
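The two kernels above can be computed as n × n matrices in a few lines. This is a minimal sketch, assuming genotypes are coded as minor-allele counts 0/1/2 and using NumPy; the function names and the default ρ = 1 are illustrative choices, not values from the slides.

```python
import numpy as np

def linear_kernel(X, rho=1.0):
    """Linear kernel matrix with entries rho + x_i^T x_j."""
    X = np.asarray(X, dtype=float)
    return rho + X @ X.T

def ibs_kernel(X):
    """IBS kernel matrix with entries sum_k (2 - |x_ik - x_jk|)."""
    X = np.asarray(X, dtype=float)
    # Pairwise absolute genotype differences, shape (n, n, p).
    diff = np.abs(X[:, None, :] - X[None, :, :])
    return (2.0 - diff).sum(axis=2)

# Toy genotype matrix: 3 subjects, 3 SNPs (made-up values).
G = np.array([[0, 1, 2],
              [0, 1, 2],
              [2, 1, 0]])
K_lin = linear_kernel(G, rho=1.0)
K_ibs = ibs_kernel(G)
```

Subjects 1 and 2 have identical genotypes, so their IBS similarity takes the maximum value 2p = 6, while subjects 1 and 3 differ by two alleles at two of the three SNPs and score 2.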
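Estimation of h: Kernel PCA
• primal form: $h = \sum_{l} \beta_l \psi_l = \sum_{l} \beta_l \sqrt{\lambda_l}\, \phi_l$
• Kernel PCA approximation:
$$\mathbb{K} = [K(X_i, X_j)]_{1 \le i,j \le n} = \sum_{l=1}^{n} \lambda_l \phi_l \phi_l^T$$
$$\widehat{\mathbb{K}} = \sum_{l=1}^{\ell_0} \lambda_l \phi_l \phi_l^T = \widehat{\Psi} \widehat{\Psi}^T; \quad \widehat{\Psi} = [\lambda_1^{1/2} \phi_1, \ldots, \lambda_{\ell_0}^{1/2} \phi_{\ell_0}]_{n \times \ell_0}$$
Schölkopf et al. [1999]; Williams and Seeger [2000]; Braun et al. [2008]; Zhang et al. [2010]
• $h^{(b)}(X^{(b)}) = \widehat{\Psi} \beta$
• obtain $(\hat{a}, \hat{\beta})$ as the minimizer of the ridge logistic objective function
$$L(Y, a, \widehat{\Psi} \beta) + \tau \|\beta\|^2$$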
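The kernel PCA step and the ridge logistic fit can be sketched as follows for one block. This is an illustration under stated assumptions, not the authors' implementation: the loss L is taken to be the mean negative log-likelihood, the intercept is left unpenalized, the number of components ℓ0 is fixed by hand, the optimizer is a simple fixed-curvature iteration (using the bound that the logistic Hessian is at most ZᵀZ/4), and the data are simulated. All function names are hypothetical.

```python
import numpy as np

def kernel_pca_features(K, n_components):
    """Return Psi (n x l0) so that Psi @ Psi.T is the rank-l0 truncation of K."""
    eigvals, eigvecs = np.linalg.eigh(K)              # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the l0 largest
    lam = np.clip(eigvals[order], 0.0, None)          # guard tiny negative values
    return eigvecs[:, order] * np.sqrt(lam)

def ridge_logistic(Psi, y, tau, n_iter=2000):
    """Minimize mean logistic loss + tau * ||beta||^2; intercept unpenalized."""
    n, l0 = Psi.shape
    Z = np.column_stack([np.ones(n), Psi])
    P = 2.0 * tau * np.eye(l0 + 1)
    P[0, 0] = 0.0                                     # no penalty on the intercept
    # Fixed curvature bound: logistic Hessian <= Z^T Z / (4 n).
    B_inv = np.linalg.inv(Z.T @ Z / (4.0 * n) + P)
    theta = np.zeros(l0 + 1)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Z @ theta)))
        grad = Z.T @ (p - y) / n + P @ theta
        theta = theta - B_inv @ grad
    return theta[0], theta[1:]

# Simulated block: 60 subjects, 5 SNPs coded 0/1/2 (made-up data).
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(60, 5)).astype(float)
y = (X[:, 0] + X[:, 1] + rng.normal(size=60) > 2.0).astype(float)

K = 1.0 + X @ X.T                                     # linear kernel with rho = 1
Psi = kernel_pca_features(K, n_components=3)
a_hat, beta_hat = ridge_logistic(Psi, y, tau=0.1)
h_hat = Psi @ beta_hat                                # fitted block effect for each subject
```

Working with the n × ℓ0 matrix Ψ instead of the full n × n kernel matrix keeps the optimization low-dimensional; how ℓ0 and τ are chosen in practice is not specified on the slide.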
Regularized Selection of Informative Regions
• For $b = 1, \ldots, M$, perform kernel PCA regression and obtain $\hat{h}^{(b)}$:
$$\operatorname{logit}\{\operatorname{pr}(Y = 1 \mid X^{(b)})\} = a^{(b)} + h^{(b)}(X^{(b)})$$
• Classify a future subject with $X = \{X^{(b)}, b = 1, \ldots, M\}$ based on
$$\sum_{b=1}^{M} \hat{h}^{(b)}(X^{(b)}) \ge c$$
• Final prediction rule with weighted block effects
– Some regions may not be predictive of the outcome due to false discovery
– Inclusion of all regions for prediction may lead to reduced accuracy
– Regularized estimation of block effects using LASSO:
$$\sum_{b=1}^{M} \hat{\gamma}_b \hat{h}^{(b)}(X^{(b)}) \ge c$$
Regularized Selection of Informative Regions
• Final prediction rule with weighted block effects
– Regularized estimation of block effects using LASSO, with pseudo-data $\widehat{H}$ estimated by cross-validation; $\hat{\gamma}$ maximizes
$$\sum_{k=1}^{K} \left[ Y_{(k)}^T \log g(b + \widehat{H}_{(k)} \gamma) + (1 - Y_{(k)})^T \log\{1 - g(b + \widehat{H}_{(k)} \gamma)\} \right] - \tau_2 \|\gamma\|_1,$$
where the subscript $(k)$ denotes the $k$th cross-validation fold
– Classify based on
$$\sum_{b=1}^{M} \hat{\gamma}_b \hat{h}^{(b)}(X^{(b)}) \ge c$$
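The second-stage weighting of block effects can be sketched as an L1-penalized logistic regression of Y on the matrix of block scores. The code below is a minimal illustration, not the authors' procedure: it uses proximal gradient descent (ISTA), scales the log-likelihood by 1/n (which changes the meaning of τ2 relative to the slide), leaves the intercept unpenalized, and replaces the cross-validated pseudo-data H with simulated scores. All names are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_logistic(H, y, tau2, n_iter=2000):
    """Minimize mean logistic loss + tau2 * ||gamma||_1; intercept unpenalized."""
    n, M = H.shape
    Z = np.column_stack([np.ones(n), H])
    # Step size 1 / L, where L = ||Z||_2^2 / (4 n) bounds the gradient's Lipschitz constant.
    step = 4.0 * n / np.linalg.norm(Z, 2) ** 2
    theta = np.zeros(M + 1)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Z @ theta)))
        theta = theta - step * (Z.T @ (p - y) / n)
        theta[1:] = soft_threshold(theta[1:], step * tau2)
    return theta[0], theta[1:]

# Simulated block scores: 6 blocks, only the first two carry signal (made-up data).
rng = np.random.default_rng(1)
n, M = 400, 6
H = rng.normal(size=(n, M))
eta = 2.0 * H[:, 0] - 2.0 * H[:, 1]
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-eta))).astype(float)

b_hat, gamma_hat = lasso_logistic(H, y, tau2=0.1)
selected = np.flatnonzero(gamma_hat != 0.0)       # blocks kept in the final rule
scores = b_hat + H @ gamma_hat                    # weighted block effects per subject
```

Blocks whose weight is shrunk exactly to zero drop out of the final rule, which is how the penalty performs gene-set selection; in practice τ2 would need to be tuned.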
Theoretical Results
• Consistency of $\hat{h}^{(b)}(x)$:
– $\hat{h}^{(b)}(x) \to h^{(b)}(x)$ at the $\sqrt{n}$ rate for finite dimensional $\mathcal{H}_K$
– Relies on convergence of sample eigenvalues and eigenvectors from kernel PCA to the true eigensystem of $\mathcal{H}_K$:
$$\widehat{\Psi} \to \Psi = \{\psi^{(b)}_1, \ldots, \psi^{(b)}_{\ell_0}\}$$
• Oracle property of $\hat{\gamma}$:
– Gene-set selection consistency
$$P(\hat{A} = A) \to 1$$
where $A = \{b \mid h^{(b)}(x) \neq 0\}$, $\hat{A} = \{b \mid \hat{h}^{(b)}(x) \neq 0\}$
Simulation Studies for NBKM
• SNP data sampled from gene-sets in a GWAS dataset (from type I
diabetes study, Affy 500k)
• 350 regions, 9256 SNPs
• Only the first 4 regions are associated with the outcome
• Joint effects of the SNPs in each of these regions set under three scenarios:
– linear for the first two regions and non-linear for the other two regions
– linear for all 4 regions
– non-linear for all 4 regions
Prediction Accuracy
Simulations: $n_t = 1000$, $n_v = 500$, # of genes = 350, total # of SNPs = 9256
Gene-set selection
Simulations: $n_t = 1000$, $n_v = 500$, # of genes = 350, total # of SNPs = 9256
Genetic Risk of Type I Diabetes
• Autoimmune disease, usually diagnosed in childhood
• T1D
– 75 SNPs have been identified as T1D risk alleles (National Human
Genome Research Institute, Hindorff et al. [2009])
– 91 genes that either contain these SNPs or flank the SNP on either
side on the chromosome
• T1D + Other autoimmune diseases (Rheumatoid arthritis, Celiac
disease, Crohn's disease, Lupus, Inflammatory bowel disease)
– 365 SNPs have been identified as T1D+other autoimmune disease risk
alleles (NHGRI)
– 375 genes that either contain these SNPs or flank the SNP on either
side on the chromosome
Genetic Risk of Type I Diabetes
GWAS data collected by the Wellcome Trust Case Control Consortium
(WTCCC)
• 2000 cases, 3000 controls of European descent from Great Britain
• segment the genome into gene-sets: a gene and a flanking region of
20 kb on either side of the gene
• The WTCCC data includes
– 350 of the gene-sets listed in the NHGRI catalog
– covering 9,256 SNPs in the WTCCC data
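The gene-set construction described above (a gene plus a 20 kb flank on either side) amounts to an interval lookup. The following is a minimal sketch with made-up gene names, coordinates, and SNP identifiers; the data structures and the function name are assumptions for illustration and do not reflect the WTCCC file formats.

```python
from bisect import bisect_left, bisect_right

FLANK_BP = 20_000  # 20 kb flanking region on either side of each gene

def assign_snps_to_gene_sets(genes, snps, flank=FLANK_BP):
    """genes: {name: (chrom, start, end)}; snps: {snp_id: (chrom, pos)}.

    Returns {gene: SNP ids within [start - flank, end + flank], ordered by position}.
    """
    # Group SNPs by chromosome and sort them by position.
    by_chrom = {}
    for snp_id, (chrom, pos) in snps.items():
        by_chrom.setdefault(chrom, []).append((pos, snp_id))
    for entries in by_chrom.values():
        entries.sort()

    gene_sets = {}
    for gene, (chrom, start, end) in genes.items():
        entries = by_chrom.get(chrom, [])
        positions = [pos for pos, _ in entries]
        lo = bisect_left(positions, start - flank)
        hi = bisect_right(positions, end + flank)
        gene_sets[gene] = [snp_id for _, snp_id in entries[lo:hi]]
    return gene_sets

# Hypothetical coordinates, for illustration only.
genes = {
    "GENE_A": ("1", 100_000, 150_000),
    "GENE_B": ("1", 160_000, 200_000),
    "GENE_C": ("2", 50_000, 60_000),
}
snps = {
    "rs_a": ("1", 85_000),
    "rs_b": ("1", 155_000),
    "rs_c": ("1", 230_000),
    "rs_d": ("2", 10_000),
}
gene_sets = assign_snps_to_gene_sets(genes, snps)
```

In this sketch a SNP lying in the flanks of two neighboring genes is assigned to both gene-sets; whether overlapping sets were handled this way in the original analysis is not stated on the slide.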
T1D Prediction Results
Conclusions
• Kernel Machine Regression provides a useful tool for incorporating
non-linear complex effects
• Blockwise KM regression strikes a balance between capturing
complex effects and avoiding overfitting
• IBS kernel performs well under both linear and non-linear settings
Remarks
• SKAT may be used to screen blocks in an initial stage
• Can be extended to data with other covariates such as clinical
variables
• Possible extensions might incorporate more complex block structure,
different types of outcomes, interactions, and beyond!
Thank you!
References
M. Braun, J. Buhmann, and K. Müller. On relevant dimensions in kernel feature spaces. The Journal of Machine Learning Research, 9:1875–1908, 2008.
P. Domingos and M. Pazzani. On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning, 29(2):103–130, 1997.
L. Hindorff, P. Sethupathy, H. Junkins, E. Ramos, J. Mehta, F. Collins, and T. Manolio. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proceedings of the National Academy of Sciences, 106(23):9362, 2009.
J. Minnier, M. Yuan, J. S. Liu, and T. Cai. Risk classification with an adaptive naive Bayes kernel machine model. Journal of the American Statistical Association, 110(509):393–404, 2015.
B. Schölkopf, S. Mika, C. Burges, P. Knirsch, K. Müller, G. Rätsch, and A. Smola. Input space versus feature space in kernel-based methods. IEEE Transactions on Neural Networks, 10(5):1000–1017, 1999.
C. Williams and M. Seeger. The effect of the input density distribution on kernel-based classifiers. In Proceedings of the 17th International Conference on Machine Learning. Citeseer, 2000.
R. Zhang, W. Wang, and Y. Ma. Approximations of the standard principal components analysis and kernel PCA. Expert Systems with Applications, 37(9):6531–6537, 2010.
 
Affymetrix OncoScan®* data analysis with Nexus Copy Number™
Affymetrix OncoScan®* data analysis with Nexus Copy Number™Affymetrix OncoScan®* data analysis with Nexus Copy Number™
Affymetrix OncoScan®* data analysis with Nexus Copy Number™
Affymetrix
 
R Analytics in the Cloud
R Analytics in the CloudR Analytics in the Cloud
R Analytics in the CloudDataMine Lab
 
The application of artificial intelligence
The application of artificial intelligenceThe application of artificial intelligence
The application of artificial intelligencePallavi Vashistha
 

Similar to Risk Classification with an Adaptive Naive Bayes Kernel Machine Model (20)

32_Nov07_MachineLear..
32_Nov07_MachineLear..32_Nov07_MachineLear..
32_Nov07_MachineLear..
 
04 1 evolution
04 1 evolution04 1 evolution
04 1 evolution
 
Data driven model optimization [autosaved]
Data driven model optimization [autosaved]Data driven model optimization [autosaved]
Data driven model optimization [autosaved]
 
Integrative Networks Centric Bioinformatics
Integrative Networks Centric BioinformaticsIntegrative Networks Centric Bioinformatics
Integrative Networks Centric Bioinformatics
 
ABC short course: final chapters
ABC short course: final chaptersABC short course: final chapters
ABC short course: final chapters
 
5 5 10
5 5 105 5 10
5 5 10
 
Diversity mechanisms for evolutionary populations in Search-Based Software En...
Diversity mechanisms for evolutionary populations in Search-Based Software En...Diversity mechanisms for evolutionary populations in Search-Based Software En...
Diversity mechanisms for evolutionary populations in Search-Based Software En...
 
. An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic .... An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic ...
 
Prototype-based models in machine learning
Prototype-based models in machine learningPrototype-based models in machine learning
Prototype-based models in machine learning
 
Large Scale Data Clustering: an overview
Large Scale Data Clustering: an overviewLarge Scale Data Clustering: an overview
Large Scale Data Clustering: an overview
 
Oral presentation at Protein Folding Consortium Workshop in Berkeley (2017)
Oral presentation at Protein Folding Consortium Workshop in Berkeley (2017)Oral presentation at Protein Folding Consortium Workshop in Berkeley (2017)
Oral presentation at Protein Folding Consortium Workshop in Berkeley (2017)
 
randomization approach in case-based reasoning: case of study of mammography ...
randomization approach in case-based reasoning: case of study of mammography ...randomization approach in case-based reasoning: case of study of mammography ...
randomization approach in case-based reasoning: case of study of mammography ...
 
Multi-omics infrastructure and data for R/Bioconductor
Multi-omics infrastructure and data for R/BioconductorMulti-omics infrastructure and data for R/Bioconductor
Multi-omics infrastructure and data for R/Bioconductor
 
Contrast Pattern Aided Regression and Classification
Contrast Pattern Aided Regression and ClassificationContrast Pattern Aided Regression and Classification
Contrast Pattern Aided Regression and Classification
 
Vahid Taslimitehrani PhD Dissertation Defense: Contrast Pattern Aided Regress...
Vahid Taslimitehrani PhD Dissertation Defense: Contrast Pattern Aided Regress...Vahid Taslimitehrani PhD Dissertation Defense: Contrast Pattern Aided Regress...
Vahid Taslimitehrani PhD Dissertation Defense: Contrast Pattern Aided Regress...
 
Introduction to Supervised ML Concepts and Algorithms
Introduction to Supervised ML Concepts and AlgorithmsIntroduction to Supervised ML Concepts and Algorithms
Introduction to Supervised ML Concepts and Algorithms
 
2008: Applied AIS - A Roadmap of AIS Research in Brazil and Sample Applications
2008: Applied AIS - A Roadmap of AIS Research in Brazil and Sample Applications2008: Applied AIS - A Roadmap of AIS Research in Brazil and Sample Applications
2008: Applied AIS - A Roadmap of AIS Research in Brazil and Sample Applications
 
Affymetrix OncoScan®* data analysis with Nexus Copy Number™
Affymetrix OncoScan®* data analysis with Nexus Copy Number™Affymetrix OncoScan®* data analysis with Nexus Copy Number™
Affymetrix OncoScan®* data analysis with Nexus Copy Number™
 
R Analytics in the Cloud
R Analytics in the CloudR Analytics in the Cloud
R Analytics in the Cloud
 
The application of artificial intelligence
The application of artificial intelligenceThe application of artificial intelligence
The application of artificial intelligence
 

Recently uploaded

Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
TravisMalana
 
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTESAdjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Subhajit Sahu
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
u86oixdj
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
dwreak4tg
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
haila53
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
ewymefz
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
balafet
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
ahzuo
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
ahzuo
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
slg6lamcq
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
oz8q3jxlp
 
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptxData_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
AnirbanRoy608946
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
74nqk8xf
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
javier ramirez
 

Recently uploaded (20)

Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTESAdjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptxData_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
 

Risk Classification with an Adaptive Naive Bayes Kernel Machine Model

  • 1. Risk Classification with an Adaptive Naive Bayes Kernel Machine Model
    Jessica Minnier (1), Ming Yuan (3), Jun Liu (4), and Tianxi Cai (2)
    (1) Department of Public Health & Preventive Medicine, Oregon Health & Science University
    (2) Department of Biostatistics, Harvard School of Public Health
    (3) Department of Statistics, University of Wisconsin-Madison
    (4) Department of Statistics, Harvard University
    June 30, 2015, ASA Oregon Chapter Meeting
  • 2. Outline
    1. Background and Motivation
    2. Model and Methods: Kernels; Blockwise Kernel PCA Estimation; Regularized Selection of Informative Regions; Theoretical Results
    3. Simulation Studies
    4. Genetic Risk of Type I Diabetes
    5. Conclusions
  • 4. Background and Motivation: Adaptive Naive Bayes (Blockwise) Kernel Machine Classification
    • Goal: genetic data → quantify disease risk, predict therapeutic efficacy, determine disease subtypes
    • Goal: build an accurate, parsimonious prediction model
      – reduce the cost of unnecessary marker measurements
      – improve the prediction precision for future patients
      – improve over the modest prediction precision obtained with clinical predictors and/or known risk alleles
    • Complex diseases
      – many alleles contribute to risk
      – many distinct combinations of risk factors lead to disease
  • 6. Background and Motivation
    • Genome-wide association studies (GWAS)
      – identifying SNPs associated with disease risk
      – primary goal of testing
      – accurate risk prediction remains difficult
    • Common approach:
      – select top-ranked SNPs based on large-scale testing
      – construct a composite genetic score with the selected SNPs
      – may not work well due to: false positive/negative errors in identifying predictive SNPs; over-fitting; using only a subset of the SNPs available; additive effects only
  • 7. Background and Motivation: Recent progress in prediction with high dimensional data
    • Regularized estimation: LASSO (Tibshirani, 1996); SCAD (Fan and Li, 2001); Adaptive LASSO (Zou, 2006)
    • Machine learning: Support vector machine (Cristianini, Shawe-Taylor, 2000); Least squares kernel machine regression (Liu, Lin, Ghosh, 2007); Kernel logistic regression (Zhu and Hastie, 2005; Liu, Ghosh and Lin, 2008)
    • Screening + regularized estimation: Sure independence screening (Fan and Lv, 2008; Fan and Song, 2009)
    Global methods: may be unstable for large p and high correlation
  • 9. Approach
    Challenge:
    • Prediction models based on univariate testing, additive models, global methods → low prediction accuracy, low AUC, missing heritability
    • Non-linear effects? Testing for interactions → low power
    Our approach [Minnier et al., 2015]:
    • Blockwise method: leverage biological knowledge to build models at the gene-set level (genes, gene pathways, linkage disequilibrium blocks)
    • Kernel machine regression: allow for complex and nonlinear effects; implicitly specify the underlying complex functional form of covariate effects via similarity measures (kernels) that define the distance between two sets of covariates
  • 10. Kernel Methods: similar inputs to similar outputs
    • transform data to feature space $\mathcal{H}$ with non-linear map $\phi$
    • the “kernel trick” lets us use the similarity function $K(\cdot, \cdot)$ instead of $\phi$
    • $K$ induces the feature space
    Figure: N. Takahashi’s webpage
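A small numerical check may make the kernel trick concrete. The sketch below uses the homogeneous quadratic kernel $(x^T z)^2$, which is chosen here only for illustration and does not appear in the slides; the names `phi_quadratic` and `k_quadratic` are hypothetical. It shows that evaluating the kernel directly gives the same number as an inner product in the explicit feature space.

```python
import numpy as np

def phi_quadratic(x):
    """Explicit feature map for the quadratic kernel (illustrative choice).

    Returns all pairwise products x_i * x_j as a vector of length p**2.
    """
    return np.outer(x, x).ravel()

def k_quadratic(x, z):
    """Kernel evaluated directly in input space: (x . z)**2."""
    return float(np.dot(x, z)) ** 2

x = np.array([1.0, 2.0, 0.0])
z = np.array([0.0, 1.0, 2.0])

# Inner product in the p**2-dimensional feature space ...
explicit = float(np.dot(phi_quadratic(x), phi_quadratic(z)))
# ... equals the kernel computed from the p-dimensional inputs.
implicit = k_quadratic(x, z)
print(explicit, implicit)
```

The feature map here has $p^2$ coordinates, while the kernel only needs one $p$-dimensional inner product, which is the practical appeal of working with $K$ rather than $\phi$.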
  • 12. Previous Methods
    Blockwise methods
    • Inference: gene-set testing
      – Gene burden tests
      – Gene Set Enrichment Analysis (GSEA)
      – SNP-set Sequence Kernel Association Test (SKAT, SKAT-O; Wu et al. 2010; Wu, Lee, et al. 2011)
    Kernel machine methods
    • Support Vector Machine (SVM) classification methods
    • Inference
      – KM SNP-set testing (Liu et al. 2007, 2008; SKAT methods)
      – Gene expression test with kernel Cox model (Li and Luan 2003)
  • 14. Notations and Model Assumptions
    • Data
      – Response: $Y = (Y_1, \dots, Y_n)^T$
      – Predictors: $M$ blocks of genomic regions; for $b = 1, \dots, M$, $X^{(b)} = (X_1^{(b)}, \dots, X_n^{(b)})^T$, an $n \times p_b$ matrix
    • Blockwise: partition the genome into gene-sets
      – Recombination hotspots, gene pathways
  • 16. Notations and Model Assumptions
    • Data
      – Response: $Y = (Y_1, \dots, Y_n)^T$
      – Predictors: $M$ blocks of genomic regions; for $b = 1, \dots, M$, $X^{(b)} = (X_1^{(b)}, \dots, X_n^{(b)})^T$, an $n \times p_b$ matrix
    • Model under the blockwise Naive Bayes (NB) assumption: $X^{(1)}, \dots, X^{(M)} \mid Y$ independent
      $\Rightarrow \mathrm{logit}\{\mathrm{pr}(Y = 1 \mid X^{(1)}, \dots, X^{(M)})\} = c + \sum_{b=1}^{M} \mathrm{logit}\{\mathrm{pr}(Y = 1 \mid X^{(b)})\}$
      – The NB assumption allows separate estimation by block and reduces overfitting
      – Performs well for zero-one loss $L(X) = I(\hat{Y}(X) \neq Y)$ [Domingos and Pazzani, 1997]
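The additive logit form follows from Bayes' rule in a few lines. The derivation below is a sketch in notation introduced here, not taken from the slides: $\pi = \mathrm{pr}(Y = 1)$ is the marginal prevalence and $f(\cdot \mid Y)$ is the conditional density of a block given the outcome. Under these assumptions the constant works out to $c = -(M-1)\,\mathrm{logit}(\pi)$.

```latex
\begin{align*}
\frac{\mathrm{pr}(Y=1 \mid X^{(1)},\dots,X^{(M)})}{\mathrm{pr}(Y=0 \mid X^{(1)},\dots,X^{(M)})}
  &= \frac{\pi}{1-\pi}\prod_{b=1}^{M}\frac{f(X^{(b)} \mid Y=1)}{f(X^{(b)} \mid Y=0)}
  && \text{(Bayes' rule and conditional independence)} \\
\frac{f(X^{(b)} \mid Y=1)}{f(X^{(b)} \mid Y=0)}
  &= \frac{\mathrm{pr}(Y=1 \mid X^{(b)})}{\mathrm{pr}(Y=0 \mid X^{(b)})}\cdot\frac{1-\pi}{\pi}
  && \text{(Bayes' rule within block } b\text{)} \\
\mathrm{logit}\{\mathrm{pr}(Y=1 \mid X^{(1)},\dots,X^{(M)})\}
  &= -(M-1)\,\mathrm{logit}(\pi) + \sum_{b=1}^{M}\mathrm{logit}\{\mathrm{pr}(Y=1 \mid X^{(b)})\}
\end{align*}
```

Taking logs of the first line and substituting the second gives the third, so each block contributes its own marginal logit and the prevalence enters only through the constant.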
  • 20. Notations and Model Assumptions
    • Within each region, the effect may be complex and interactive due to
      – multiple causal variants
      – un-typed causal variants in the presence of high LD
    • Blockwise Kernel Machine Regression
      $\mathrm{logit}\{\mathrm{pr}(Y = 1 \mid X^{(b)})\} = a^{(b)} + h^{(b)}(X^{(b)})$
      $h^{(b)}(X^{(b)}) = \sum_{l} \beta_l^{(b)} \psi_l^{(b)}(X^{(b)}) \in \mathcal{H}_{K^{(b)}}$
      $\{\psi_l^{(b)}\} = \{\sqrt{\lambda_l^{(b)}}\, \phi_l^{(b)}\}$, implicitly specified via a symmetric positive definite kernel $K^{(b)}(\cdot, \cdot)$.
      – $K^{(b)}(X_i^{(b)}, X_j^{(b)})$ defines the similarity between $X_i^{(b)}$ and $X_j^{(b)}$.
      – $\mathcal{H}_{K^{(b)}}$, the functional space spanned by $K^{(b)}(\cdot, \cdot)$, is a reproducing kernel Hilbert space (RKHS).
  • 22. Choices of Kernel Functions
    • Linear kernel: $K(X_i, X_j) = \rho + X_i^T X_j$, $h(X) = \sum_{k=1}^{p} \beta_k X_k$
      Fitting logistic regression with a linear kernel ⇔ logistic ridge regression.
    • IBS kernel: $K(X_i, X_j) = \sum_{k=1}^{p} (2 - |X_{ik} - X_{jk}|)$, powerful in detecting non-linear effects with SNP data [Wu et al., 2010]
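Both kernels are easy to compute from a genotype matrix. The following is a minimal sketch, assuming genotypes are coded as minor-allele counts 0/1/2 (a common convention, but an assumption here); the function names and the toy matrix `G` are hypothetical. The formulas are taken as written on the slide, without any extra normalization.

```python
import numpy as np

def linear_kernel(X, rho=1.0):
    """Linear kernel matrix: K[i, j] = rho + X_i . X_j."""
    X = np.asarray(X, dtype=float)
    return rho + X @ X.T

def ibs_kernel(X):
    """IBS kernel matrix: K[i, j] = sum_k (2 - |X_ik - X_jk|)."""
    X = np.asarray(X, dtype=float)
    # Absolute genotype differences for every pair (i, j) and SNP k: shape (n, n, p).
    diff = np.abs(X[:, None, :] - X[None, :, :])
    return (2.0 - diff).sum(axis=2)

# Toy genotypes: 3 subjects, 3 SNPs (subjects 0 and 1 are identical).
G = np.array([[0, 1, 2],
              [0, 1, 2],
              [2, 1, 0]])

K_lin = linear_kernel(G, rho=1.0)
K_ibs = ibs_kernel(G)
print(K_lin)
print(K_ibs)
```

With this coding, identical subjects reach the maximum IBS value of $2p$, and each discordant allele lowers the similarity by one.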
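  • 23. Estimation of h: Kernel PCA
    • Primal form: $h = \sum_{l} \beta_l \psi_l = \sum_{l} \beta_l \sqrt{\lambda_l}\, \phi_l$
    • Kernel PCA approximation:
      $\mathbf{K} = [K(X_i, X_j)]_{1 \le i,j \le n} = \sum_{l=1}^{n} \hat{\lambda}_l \hat{\phi}_l \hat{\phi}_l^T$
      $\tilde{\mathbf{K}} = \sum_{l=1}^{\ell_0} \hat{\lambda}_l \hat{\phi}_l \hat{\phi}_l^T = \hat{\Psi} \hat{\Psi}^T$; $\hat{\Psi} = [\hat{\lambda}_1^{1/2} \hat{\phi}_1, \dots, \hat{\lambda}_{\ell_0}^{1/2} \hat{\phi}_{\ell_0}]_{n \times \ell_0}$
      Scholkopf et al. [1999]; Williams and Seeger [2000]; Braun et al. [2008]; Zhang et al. [2010]
    • $h^{(b)}(X^{(b)}) = \hat{\Psi} \beta$
    • Obtain $(\hat{a}, \hat{\beta})$ as the minimizer of the ridge logistic objective function $L(Y, a, \hat{\Psi}\beta) + \tau \|\beta\|_2^2$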
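The two estimation steps above can be sketched with NumPy alone. This is an illustrative sketch, not the authors' implementation: the function names, the plain gradient-descent solver, the unpenalized intercept, and the toy data are all assumptions made here, and the number of retained components is simply passed in rather than chosen from the data.

```python
import numpy as np

def kernel_pca_features(K, n_components):
    """Return Psi (n x n_components) so that Psi @ Psi.T approximates K."""
    eigvals, eigvecs = np.linalg.eigh(K)             # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # indices of the largest ones
    lam = np.clip(eigvals[order], 0.0, None)          # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(lam)

def objective(Psi, y, a, beta, tau):
    """Negative Bernoulli log-likelihood plus ridge penalty tau * ||beta||^2."""
    eta = a + Psi @ beta
    return float(np.sum(np.logaddexp(0.0, eta) - y * eta) + tau * beta @ beta)

def fit_ridge_logistic(Psi, y, tau=1.0, n_iter=2000):
    """Gradient descent on the ridge logistic objective (intercept unpenalized)."""
    n, d = Psi.shape
    Z = np.column_stack([np.ones(n), Psi])
    pen = np.r_[0.0, np.ones(d)]
    # Step 1/L, where L bounds the largest curvature of the objective.
    step = 1.0 / (0.25 * np.linalg.norm(Z, 2) ** 2 + 2.0 * tau)
    theta = np.zeros(d + 1)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(Z @ theta)))
        grad = Z.T @ (prob - y) + 2.0 * tau * pen * theta
        theta -= step * grad
    return theta[0], theta[1:]

# Toy example: one block with 30 subjects and 4 markers, linear kernel with rho = 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
y = (rng.random(30) < 0.5).astype(float)
K = 1.0 + X @ X.T

Psi = kernel_pca_features(K, n_components=3)
a_hat, beta_hat = fit_ridge_logistic(Psi, y, tau=1.0)
h_hat = Psi @ beta_hat   # fitted block effect for the training subjects
```

Working with the $n \times \ell_0$ matrix of scaled eigenvectors turns the kernel machine fit into an ordinary low-dimensional ridge logistic regression, which is what keeps the blockwise fits cheap.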
  • 24. Regularized Selection of Informative Regions
    • For $b = 1, \dots, M$, perform kernel PCA regression and obtain $\hat{h}^{(b)}$:
      $\mathrm{logit}\{\mathrm{pr}(Y = 1 \mid X^{(b)})\} = a^{(b)} + h^{(b)}(X^{(b)})$
    • Classify a future subject with $X = \{X^{(b)}, b = 1, \dots, M\}$ based on $\sum_{b=1}^{M} \hat{h}^{(b)}(X^{(b)}) \ge c$
    • Final prediction rule with weighted block effects
      – Some regions may not be predictive of the outcome due to false discovery
      – Inclusion of all regions for prediction may lead to reduced accuracy
      – Regularized estimation of block effects using LASSO, with pseudo-data $H$ estimated with cross-validation:
        $\sum_{k=1}^{K} \left[ Y^T \log g(b + H\gamma) + (1 - Y)^T \log\{1 - g(b + H\gamma)\} \right] - \tau_2 \|\gamma\|_1$
      – Resulting rule: $\sum_{b=1}^{M} \hat{\gamma}_b \hat{h}^{(b)}(X^{(b)}) \ge c$
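The LASSO step on block effects can be illustrated with a short proximal-gradient (ISTA) routine. This is a sketch under stated assumptions, not the method used in the talk: it minimizes the negative log-likelihood plus $\tau_2 \|\gamma\|_1$ (equivalent to maximizing the penalized log-likelihood), takes $g$ to be the logistic function, leaves the intercept unpenalized, and treats the matrix `H` of block scores as given rather than building it by cross-validation. All names and the simulated data are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1, applied elementwise."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_logistic(H, y, tau2, n_iter=2000):
    """ISTA for L1-penalized logistic regression on block scores H (n x M)."""
    n, M = H.shape
    Z = np.column_stack([np.ones(n), H])
    step = 1.0 / (0.25 * np.linalg.norm(Z, 2) ** 2)   # 1 / Lipschitz bound
    b0, gamma = 0.0, np.zeros(M)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(b0 + H @ gamma)))
        resid = prob - y
        b0 -= step * resid.sum()                       # unpenalized intercept
        gamma = soft_threshold(gamma - step * (H.T @ resid), step * tau2)
    return b0, gamma

# Toy pseudo-data: 5 blocks, only the first one carries signal.
rng = np.random.default_rng(1)
H = rng.normal(size=(200, 5))
y = (H[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(float)

b0_hat, gamma_hat = lasso_logistic(H, y, tau2=5.0)
selected = np.flatnonzero(gamma_hat)   # indices of blocks kept in the final rule

# A penalty larger than any possible gradient entry removes every block.
tau_big = np.abs(H).sum(axis=0).max() + 1.0
_, gamma_none = lasso_logistic(H, y, tau2=tau_big)
```

Blocks whose weight is shrunk exactly to zero drop out of the final prediction rule, which is how the procedure selects informative regions.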
  • 26. Theoretical Results
    • Consistency of $\hat{h}^{(b)}(x)$:
      – $\hat{h}^{(b)}(x) \to h^{(b)}(x)$ at $\sqrt{n}$ rate for finite dimensional $\mathcal{H}_K$
      – Relies on convergence of the sample eigenvalues and eigenvectors from kernel PCA to the true eigensystem of $\mathcal{H}_K$: $\hat{\Psi} \to \Psi = \{\psi_1^{(b)}, \dots, \psi_{\ell_0}^{(b)}\}$
    • Oracle property of $\hat{\gamma}$:
      – Gene-set selection consistency: $P(\hat{A} = A) \to 1$, where $A = \{b \mid h^{(b)}(x) \neq 0\}$ and $\hat{A} = \{b \mid \hat{h}^{(b)}(x) \neq 0\}$
  • 27. Simulation Studies for NBKM
    • SNP data sampled from gene-sets in a GWAS dataset (from a type I diabetes study, Affy 500K)
    • 350 regions, 9,256 SNPs
    • Only the first 4 regions are associated with the outcome
    • The joint effects of the SNPs in each of these regions are set as
      – linear for the first two regions and non-linear for the other two regions
      – linear for all 4 regions
      – non-linear for all 4 regions
  • 28. Prediction Accuracy. Simulations: $n_t = 1000$, $n_v = 500$, number of genes = 350, total number of SNPs = 9,256
  • 29. Gene-set selection. Simulations: $n_t = 1000$, $n_v = 500$, number of genes = 350, total number of SNPs = 9,256
  • 31. Genetic Risk of Type I Diabetes
    • Autoimmune disease, usually diagnosed in childhood
    • T1D
      – 75 SNPs have been identified as T1D risk alleles (National Human Genome Research Institute, Hindorff et al. [2009])
      – 91 genes either contain these SNPs or flank them on either side on the chromosome
    • T1D + other autoimmune diseases (rheumatoid arthritis, celiac disease, Crohn's disease, lupus, inflammatory bowel disease)
      – 365 SNPs have been identified as T1D + other autoimmune disease risk alleles (NHGRI)
      – 375 genes either contain these SNPs or flank them on either side on the chromosome
  • 32. Genetic Risk of Type I Diabetes
    GWAS data collected by the Wellcome Trust Case Control Consortium (WTCCC)
    • 2,000 cases and 3,000 controls of European descent from Great Britain
    • Segment the genome into gene-sets: a gene plus a flanking region of 20 kb on either side of the gene
    • The WTCCC data includes
      – 350 of the gene-sets listed in the NHGRI catalog
      – covering 9,256 SNPs in the WTCCC data
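The gene-set construction described above (a gene plus 20 kb on either side) amounts to a range query over sorted SNP positions. The sketch below shows one way to do it on a single chromosome; the function name, the inclusive window boundaries, and the coordinates are hypothetical and are not taken from the WTCCC data.

```python
from bisect import bisect_left, bisect_right

FLANK_BP = 20_000  # 20 kb flanking region on either side of the gene

def snps_in_gene_set(snp_positions, gene_start, gene_end, flank=FLANK_BP):
    """Return the SNP positions inside [gene_start - flank, gene_end + flank].

    snp_positions must be sorted in ascending order and refer to the same
    chromosome as the gene.
    """
    lo = bisect_left(snp_positions, gene_start - flank)
    hi = bisect_right(snp_positions, gene_end + flank)
    return snp_positions[lo:hi]

# Made-up base-pair positions on one chromosome.
positions = [5_000, 79_999, 80_000, 100_000, 150_000, 170_000, 170_001]
block = snps_in_gene_set(positions, gene_start=100_000, gene_end=150_000)
print(block)  # [80000, 100000, 150000, 170000]
```

Each gene-set then supplies the columns of one block $X^{(b)}$; a SNP that falls in the flanking windows of two neighbouring genes would appear in both blocks under this rule.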
  • 33. T1D Prediction Results
  • 34. Conclusions
    • Kernel machine regression provides a useful tool for incorporating non-linear complex effects
    • Blockwise KM regression achieves a good balance between capturing complex effects and overfitting
    • The IBS kernel performs well under both linear and non-linear settings
    Remarks
    • May use SKAT to screen blocks in an initial stage
    • Can be extended to data with other covariates such as clinical variables
    • Possible extensions might incorporate more complex block structure, different types of outcomes, and interactions
  • 35. Thank you!
  • 36. References I
    M. Braun, J. Buhmann, and K. Müller. On relevant dimensions in kernel feature spaces. The Journal of Machine Learning Research, 9:1875–1908, 2008.
    P. Domingos and M. Pazzani. On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning, 29(2):103–130, 1997.
    L. Hindorff, P. Sethupathy, H. Junkins, E. Ramos, J. Mehta, F. Collins, and T. Manolio. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proceedings of the National Academy of Sciences, 106(23):9362, 2009.
    J. Minnier, M. Yuan, J. S. Liu, and T. Cai. Risk classification with an adaptive naive Bayes kernel machine model. Journal of the American Statistical Association, 110(509):393–404, 2015.
    B. Scholkopf, S. Mika, C. Burges, P. Knirsch, K. Muller, G. Ratsch, and A. Smola. Input space versus feature space in kernel-based methods. IEEE Transactions on Neural Networks, 10(5):1000–1017, 1999.
    C. Williams and M. Seeger. The effect of the input density distribution on kernel-based classifiers. In Proceedings of the 17th International Conference on Machine Learning. Citeseer, 2000.
    R. Zhang, W. Wang, and Y. Ma. Approximations of the standard principal components analysis and kernel PCA. Expert Systems with Applications, 37(9):6531–6537, 2010.