1. Rule Based Kernels in Semi-Parametric Mixed Models
Deniz Akdemir, Postdoctoral Researcher
Cornell University, Department of Plant Breeding and Genetics, Ithaca, NY
2. Extraction of Rules: ISLE Algorithm
Algorithm 1.1: ISLE($M$, $\nu$, $\eta$)
    $F_0(x) = 0$
    for $j = 1$ to $M$ do
        $(c_j, \theta_j) = \arg\min_{(c, \theta)} \sum_{i \in S_j(\eta)} L(y_i, F_{j-1}(x_i) + c f(x_i, \theta))$
        $T_j(x) = f(x, \theta_j)$
        $F_j(x) = F_{j-1}(x) + \nu c_j T_j(x)$
    return $\{T_j(x)\}_{j=1}^{M}$ and $F_M(x)$
Here $L(\cdot, \cdot)$ is a loss function, $S_j(\eta)$ is a subset of the indices $\{1, 2, \ldots, n\}$ chosen by a sampling scheme $\eta$, and $0 \le \nu \le 1$ is a memory parameter.
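The ISLE loop can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used for the results in these slides: the base learner $f(x, \theta)$ is taken to be a one-split regression stump on a single input, the loss is squared error (so the stump is fitted to the current residuals and the scale $c_j$ is absorbed into its leaf values), and $S_j(\eta)$ is a random subsample without replacement of size $\eta n$. The names `fit_stump`, `stump_predict`, and `isle` are hypothetical.

```python
import random

def fit_stump(xs, rs):
    """Least-squares stump on 1-D inputs: returns (threshold, left_mean, right_mean)."""
    best = None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        t = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, rs) if x < t]
        right = [r for x, r in zip(xs, rs) if x >= t]
        if not left or not right:
            continue
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:
        # No valid split (e.g. all sampled x identical): fall back to a constant.
        m = sum(rs) / len(rs)
        return (float("inf"), m, m)
    return best[1:]

def stump_predict(stump, x):
    t, lm, rm = stump
    return lm if x < t else rm

def isle(xs, ys, M=50, nu=0.1, eta=0.5, seed=0):
    """Sketch of ISLE(M, nu, eta) with squared-error loss and stump base learners."""
    rng = random.Random(seed)
    n = len(xs)
    k = max(2, int(eta * n))            # size of the subsample S_j(eta)
    F = [0.0] * n                       # F_0(x) = 0
    learners = []
    for _ in range(M):
        S = rng.sample(range(n), k)     # S_j(eta): subsample without replacement
        # Squared-error loss: fitting to residuals solves the argmin with c_j = 1.
        stump = fit_stump([xs[i] for i in S], [ys[i] - F[i] for i in S])
        learners.append(stump)          # T_j(x) = f(x, theta_j)
        F = [f + nu * stump_predict(stump, x) for f, x in zip(F, xs)]
    return learners, F
```

The returned `learners` play the role of $\{T_j(x)\}_{j=1}^{M}$, and `F` holds $F_M(x_i)$ on the training inputs.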
3. Post Processing: LASSO Post-Processing
The final ensemble models considered by the ISLE framework have an additive form:
$$F(x) = w_0 + \sum_{j=1}^{M} w_j f(x, \theta_j) \qquad (1)$$
where $\{f(x, \theta_j)\}_{j=1}^{M}$ are base learners selected from $\mathcal{F}$. ISLE produces $F(x)$ in two steps. The first step samples the space of possible models to obtain $\{\theta_j\}_{j=1}^{M}$. The second step combines the base learners by choosing the weights $\{w_j\}_{j=0}^{M}$ in (1).
Friedman and Popescu [1] recommend learning the weights $\{w_j\}_{j=0}^{M}$ using the lasso [2].
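The second step can be illustrated with a small coordinate-descent lasso. This is a minimal sketch, assuming the objective $\frac{1}{2n}\sum_i (y_i - w_0 - \sum_j w_j T_{ij})^2 + \lambda \sum_j |w_j|$ with an unpenalized intercept, where $T_{ij} = f(x_i, \theta_j)$. The function names and the choice of solver are assumptions made for illustration; the slides do not say which lasso solver was used.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator used by the lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_weights(T, y, lam, n_iter=200):
    """Coordinate descent for the weights in (1).

    T[i][j] holds the base-learner output f(x_i, theta_j); returns (w0, w).
    """
    n, M = len(T), len(T[0])
    w0 = sum(y) / n
    w = [0.0] * M
    col_sq = [sum(T[i][j] ** 2 for i in range(n)) / n for j in range(M)]
    for _ in range(n_iter):
        for j in range(M):
            if col_sq[j] == 0.0:
                continue  # a base learner that is identically zero gets no weight
            rho = 0.0
            for i in range(n):
                # Prediction from the intercept and every learner except j.
                others = w0 + sum(w[k] * T[i][k] for k in range(M) if k != j)
                rho += T[i][j] * (y[i] - others)
            w[j] = soft_threshold(rho / n, lam) / col_sq[j]
        # Unpenalized intercept: mean of the current residuals.
        w0 = sum(y[i] - sum(w[k] * T[i][k] for k in range(M)) for i in range(n)) / n
    return w0, w
```

Larger values of `lam` set more weights exactly to zero, which is what prunes the ensemble down to a small set of base learners.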
4. Post Processing: Trees to Rules
Figure: A simple regression tree, which can be represented as
$$y = 20\, I(x < 0)\, I(z < 1) + 15\, I(x < 0)\, I(z \ge 1) + 10\, I(x \ge 0).$$
Each leaf node defines a rule that can be expressed as a product of indicator functions of half spaces. Each rule specifies a 'simple' rectangular region in the input space.
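The tree on this slide can be written directly as a weighted sum of rules. The sketch below encodes each leaf as a product of half-space indicators, using the leaf values 20, 15, and 10 from the caption; the names `RULES`, `rule_features`, and `tree_predict` are hypothetical.

```python
def indicator(cond):
    """I(cond): 1.0 when the condition holds, 0.0 otherwise."""
    return 1.0 if cond else 0.0

# One (leaf value, rule) pair per leaf of the tree in the figure.
RULES = [
    (20.0, lambda x, z: indicator(x < 0) * indicator(z < 1)),
    (15.0, lambda x, z: indicator(x < 0) * indicator(z >= 1)),
    (10.0, lambda x, z: indicator(x >= 0)),
]

def rule_features(x, z):
    """0/1 vector telling which rule (rectangular region) contains the point."""
    return [rule(x, z) for _, rule in RULES]

def tree_predict(x, z):
    """Tree prediction as a weighted sum of rule indicators."""
    return sum(value * rule(x, z) for value, rule in RULES)
```

Because the leaves partition the input space, exactly one entry of `rule_features(x, z)` equals 1 for any point.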
5. Kernel Learning and Clustering With Rules: Similarities in Phenotype and Genotype Spaces
Similarities in phenotype space: $YY'$.
Similarities in genotype space: $MM'$.
In general, $YY' \ne MM'$.
We want $f(M) = K \approx YY'$: kernel learning.
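One way to make "$K \approx YY'$" concrete is to compare two similarity matrices numerically. The sketch below builds the linear Gram matrices $YY'$ and $MM'$ and scores their agreement with the normalized Frobenius inner product (often called kernel alignment). The alignment score is an assumption introduced here for illustration; the slides do not name a specific measure of closeness, and `gram` and `alignment` are hypothetical helper names.

```python
import math

def gram(A):
    """Linear Gram matrix A A' for a matrix given as a list of rows."""
    return [[sum(a * b for a, b in zip(r, s)) for s in A] for r in A]

def alignment(K1, K2):
    """Normalized Frobenius inner product <K1, K2> / (||K1|| ||K2||)."""
    dot = sum(a * b for r, s in zip(K1, K2) for a, b in zip(r, s))
    n1 = math.sqrt(sum(a * a for r in K1 for a in r))
    n2 = math.sqrt(sum(a * a for r in K2 for a in r))
    return dot / (n1 * n2)

# Toy data: three individuals, one phenotype, two markers (made-up values).
Y = [[1.0], [1.0], [-1.0]]
M = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
score = alignment(gram(M), gram(Y))
```

A score near 1 means the genotype-based similarities order pairs of individuals much as the phenotype-based similarities do.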
6. Kernel Learning and Clustering With Rules: Semi-Supervised Importance Sampling Clustering Algorithm (SS-ISCA)
Algorithm 3.1: SS-ISCA($X$, $Y$, $M$, $m$, $\nu$)
    $R_1$: a random projection of $Y$
    for $j = 1$ to $M$ do
        Generate $m$ logic rules $\{l_\ell(x)\}_{\ell=1}^{m}$ to estimate $R_j$ from $X$:
        $S_j(x) \Leftarrow \{l_\ell(x)\}_{\ell=1}^{m}$
        $T(X) \Leftarrow \{\{S_d(x_i)\}_{i=1}^{n}\}_{d=1}^{j}$
        $R_{j+1}$: a random projection of $Y$
        $R_{j+1} \Leftarrow (I - \nu P_{T(X)}) R_{j+1}$
    return $T(X)$
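The last update in the loop can be sketched with NumPy. The code assumes that $P_{T(X)}$ denotes the orthogonal projector onto the column space of the rule-indicator matrix $T(X)$, which the slide does not state explicitly, and that a "random projection of $Y$" means $Y a$ for a random Gaussian vector $a$. Rule generation is left out, and `random_projection` and `deflate` are hypothetical names.

```python
import numpy as np

def random_projection(Y, rng):
    """Project the n x d target matrix Y onto a random direction (assumed Gaussian)."""
    a = rng.standard_normal(Y.shape[1])
    return Y @ a

def deflate(R_next, T, nu):
    """Return (I - nu * P_T) R_next, with P_T the projector onto the columns of T."""
    P = T @ np.linalg.pinv(T)        # orthogonal projector onto col(T)
    return R_next - nu * (P @ R_next)

# Toy rule-indicator matrix: 6 observations, 2 rules (made-up values).
T = np.array([[1, 0], [1, 0], [0, 1], [0, 1], [1, 1], [0, 0]], dtype=float)
Y = np.arange(12, dtype=float).reshape(6, 2)
R = random_projection(Y, np.random.default_rng(0))
R_deflated = deflate(R, T, nu=1.0)
```

With `nu=1.0` the new target is orthogonal to every rule already collected, so the next batch of rules has to explain variation the current ones miss; smaller `nu` removes only part of that overlap.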
7. Kernel Learning and Clustering With Rules: Semi-Parametric Mixed Model
Selection in animal or plant breeding is usually based on genomic estimated breeding values (GEBVs) obtained with semi-parametric mixed models (SPMM).
A SPMM for the $n \times 1$ response vector $y$ is expressed as
$$y = X\beta + Zg + e \qquad (2)$$
where $X$ is the $n \times p$ design matrix for the fixed effects, $\beta$ is a $p \times 1$ vector of fixed effects coefficients, and $Z$ is the $n \times q$ design matrix for the random effects. The random effects $(g', e')'$ are assumed to follow a multivariate normal distribution with mean 0 and covariance
$$\begin{pmatrix} \sigma_g^2 K & 0 \\ 0 & \sigma_e^2 I_n \end{pmatrix}$$
where $K$ is a $q \times q$ kernel matrix.
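Given a kernel matrix $K$, predictions of $g$ under model (2) follow from the standard mixed-model equations. The sketch below assumes the variance components $\sigma_g^2$ and $\sigma_e^2$ are already known (in practice they are usually estimated, for example by REML), computes the generalized least squares estimate of $\beta$, and then the BLUP $\hat{g} = \sigma_g^2 K Z' V^{-1}(y - X\hat{\beta})$ with $V = \sigma_g^2 Z K Z' + \sigma_e^2 I_n$. The function name `spmm_blup` is hypothetical, and this is not claimed to be the software used for the results in these slides.

```python
import numpy as np

def spmm_blup(y, X, Z, K, sigma2_g, sigma2_e):
    """GLS estimate of beta and BLUP of g for y = X beta + Z g + e."""
    n = len(y)
    # Marginal covariance of y implied by the random effects.
    V = sigma2_g * Z @ K @ Z.T + sigma2_e * np.eye(n)
    Vinv_X = np.linalg.solve(V, X)
    Vinv_y = np.linalg.solve(V, y)
    beta = np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_y)
    g = sigma2_g * K @ Z.T @ np.linalg.solve(V, y - X @ beta)
    return beta, g

# Toy example: 4 individuals, intercept only, identity kernel (made-up values).
y = np.array([1.0, 2.0, 3.0, 4.0])
X = np.ones((4, 1))
Z = np.eye(4)
K = np.eye(4)
beta_hat, g_hat = spmm_blup(y, X, Z, K, sigma2_g=1.0, sigma2_e=1.0)
```

Swapping `K` for a linear, Gaussian, or rule-based similarity matrix changes only this one argument, which is what makes the kernel comparison on the later slides possible within a single model.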
8. Kernel Learning and Clustering With Rules: FHB Data Set
[Figure panels, one per clustering method, each comparing clusters 1 and 2: ISCA (p-val = 0), SS-ISCA (p-val = 0), RF (p-val = 0.567), PAM (p-val = 0.167), Mclust (p-val = 0.725)]
Figure: (FHB Data Set, Semi-Supervised Clustering) p-values from the t-tests for the different clustering approaches indicate that SS-ISCA and ISCA produce groups that differ from each other in mean FHB.
9. Kernel Learning and Clustering With Rules: FHB Data Set
Table: (FHB Data Set, Semi-Supervised Clustering) SS-ISCA and ISCA clusterings outperform other clusterings.
            silhouette   dunn      connectivity
SS-ISCA     0.108        0.419**   44.450**
ISCA        0.114*       0.344*    67.763*
RF          0.092        0.344*    111.410
PAM         0.126**      0.160     206.743
Mclust      0.114*       0.317     126.918
10. Rule Based Similarity Matrix in Mixed Model: FHB Data Set
Target variables DON and FHB are used in SS-ISCA.
[Figure panels: Linear, Gaussian, ISCA, SS-ISCA; vertical axis range 0.5 to 0.8]
Figure: (Vertical axis: Correlation between trait and GEBVs in the test data)
11. Rule Based Similarity Matrix in Mixed Model
[Figure panels: Rule Based, Linear, Gaussian; vertical axis range 0.89 to 0.95; plot title: Grincore2012Aberdeen_2011, trait: Heading Date]
Figure: Grincore2011Aberdeen Heading Date (Vertical axis: Correlation between trait and GEBVs in the test data)
12. Rule Based Similarity Matrix in Mixed Model: Yield
[Figure panels: Rule Based, Linear, Gaussian; vertical axis range 0.5 to 0.9; plot title: MN_SP2_NormN_2011_Crookston, trait: yield]
Figure: Crookston 2011 Yield (Vertical axis: Correlation between trait and GEBVs in the test data)
13. Rule Based Similarity Matrix in Mixed Model: Height
[Figure panels: Rule Based, Linear, Gaussian; vertical axis range 0.5 to 0.9; plot title: MN_SP2_NormN_2011_Crookston, trait: yield]
Figure: Crookston 2011 Height (Vertical axis: Correlation between trait and GEBVs in the test data)
14. Bibliography
[1] J.H. Friedman and B.E. Popescu. Importance sampled learning ensembles. Journal of Machine Learning Research, 94305, 2003.
[2] R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), pages 267–288, 1996.