Bayesian selection of best subsets in high-dimensional regression
Shiqiang Jin
Department of Statistics
Kansas State University, Manhattan, KS
Joint work with Gyuhyeong Goh
Kansas State University, Manhattan, KS
July 31, 2019
Shiqiang Jin (Kansas State University) · Bayesian selection of best subsets in high-dimensional regression · July 31, 2019 · 1 / 25
Bayesian linear regression model in High-dimensional Data
Consider a linear regression model
  y = Xβ + ε,   (1)
where y = (y1, . . . , yn)ᵀ is the response vector, X = (x1, . . . , xp) ∈ R^{n×p} is the model matrix, β = (β1, . . . , βp)ᵀ is the coefficient vector, and ε ∼ N(0, σ²In).
We assume p ≫ n, i.e., high-dimensional data.
We assume that only a few predictors are associated with the response, i.e., β is sparse.
Bayesian linear regression model in High-dimensional Data
To better explain the sparsity of β, we introduce a latent index set γ ⊂ {1, . . . , p} so that Xγ represents the sub-matrix of X containing the columns xj, j ∈ γ.
e.g. γ = {1, 3, 4} ⇒ Xγ = (x1, x3, x4).
The full model in (1) can then be reduced to
  y = Xγβγ + ε.   (2)
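The index-set notation maps directly onto column slicing; a minimal NumPy sketch (0-based indexing, so γ = {1, 3, 4} corresponds to columns [0, 2, 3]):

```python
import numpy as np

X = np.arange(20.0).reshape(5, 4)   # toy model matrix with n = 5, p = 4
gamma = [0, 2, 3]                   # 0-based version of gamma = {1, 3, 4}
X_gamma = X[:, gamma]               # sub-matrix (x1, x3, x4)
print(X_gamma.shape)                # (5, 3)
```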
Priors and marginal posterior distribution
Our research goals are to obtain:
(i) the k most important predictors among the (p choose k) candidate models;
(ii) a single best model from among the 2^p candidate models.
We consider
  βγ | σ², γ ∼ Normal(0, τσ²I_{|γ|}),
  σ² ∼ Inverse-Gamma(aσ/2, bσ/2),
  π(γ) ∝ I(|γ| = k),
where |γ| is the number of elements in γ.
Priors and marginal posterior distribution
Given k, it can be shown that
  π(γ|y) ∝ (τ⁻¹)^{|γ|/2} |XγᵀXγ + τ⁻¹I_{|γ|}|^{−1/2} (yᵀHγy + bσ)^{−(aσ+n)/2} I(|γ| = k)
         ≡ g(γ)I(|γ| = k),
where Hγ = In − Xγ(XγᵀXγ + τ⁻¹I_{|γ|})⁻¹Xγᵀ.
Hence, g(γ) is our model selection criterion.
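The criterion g(γ) is conveniently evaluated on the log scale for numerical stability. A minimal sketch; the hyperparameter values τ, aσ, bσ below are illustrative choices, not the authors' settings:

```python
import numpy as np

def log_g(y, X, gamma, tau=10.0, a_sigma=1.0, b_sigma=1.0):
    """Log of the selection criterion g(gamma):
    (|gamma|/2) log(1/tau) - (1/2) log|X'X + I/tau| - ((a+n)/2) log(y'Hy + b)."""
    n = len(y)
    Xg = X[:, list(gamma)]
    k = Xg.shape[1]
    A = Xg.T @ Xg + np.eye(k) / tau              # X_g' X_g + tau^{-1} I
    Hy = y - Xg @ np.linalg.solve(A, Xg.T @ y)   # H_gamma y, without forming H
    return (-0.5 * k * np.log(tau)
            - 0.5 * np.linalg.slogdet(A)[1]
            - 0.5 * (a_sigma + n) * np.log(y @ Hy + b_sigma))

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
y = X[:, 0] + rng.standard_normal(50)
print(np.isfinite(log_g(y, X, [0, 1])))   # True
```

Working with log g(γ) leaves the argmax over models unchanged while avoiding overflow in the (yᵀHγy + bσ) power.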
Best subset selection Algorithm
Recall that our goals are to obtain (i) the k most important predictors among the (p choose k) candidate models; (ii) a single best model from among the 2^p candidate models.
This falls under the classical description of best subset selection:
(i) Fixed size: for k = 1, 2, . . . , p, select the best subset model
  Mk = arg max_{|γ|=k} g(γ)
from the (p choose k) candidate models.
(ii) Varying size: pick the single best model from M1, . . . , Mp.
Challenge of best subset selection: the number of candidate models explodes.
e.g. (i) p = 1000, k = 5: (1000 choose 5) ≈ 8 × 10^12; (ii) p = 40: 2^40 ≈ 10^12.
Neighborhood Search
To avoid the exhaustive computation, we resort to the idea of neighborhood search proposed by Madigan et al. (1995) and Hans et al. (2007).
Let γ(t) be the current state of γ, with model size |γ(t)| = k.
Addition neighbor: N+(γ(t)) = {γ(t) ∪ {j} : j ∉ γ(t)}; each member has model size k + 1.
Deletion neighbor: N−(γ(t)) = {γ(t) \ {j′} : j′ ∈ γ(t)}; each member has model size k − 1.
e.g. Suppose p = 4, k = 2. Let γ(t) = {1, 2}; then
  N+(γ(t)) = {{1, 2, 3}, {1, 2, 4}},
  N−(γ(t)) = {{1}, {2}}.
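Both neighborhoods are cheap to enumerate; a minimal sketch using Python sets with 1-based indices as on the slide:

```python
def add_neighbors(gamma, p):
    """N+(gamma): add one index j not in gamma (each member has size k + 1)."""
    return [gamma | {j} for j in range(1, p + 1) if j not in gamma]

def del_neighbors(gamma):
    """N-(gamma): delete one index j from gamma (each member has size k - 1)."""
    return [gamma - {j} for j in gamma]

print(add_neighbors({1, 2}, 4))   # [{1, 2, 3}, {1, 2, 4}]
print(del_neighbors({1, 2}))
```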
Hybrid best subset search with a fixed k
Recall that Goal (i) is to find γ̂ = arg max_{|γ|=k} g(γ).
1. Initialize γ̂ s.t. |γ̂| = k.
2. Repeat  # deterministic search: local optimum
   Update γ̃ ← arg max_{γ∈N+(γ̂)} g(γ);  # N+(γ̂) = {γ̂ ∪ {j} : j ∉ γ̂}
   Update γ̂ ← arg max_{γ∈N−(γ̃)} g(γ);  # N−(γ̃) = {γ̃ \ {j} : j ∈ γ̃}
   until convergence.
In Step 2, each iteration involves p + 1 candidate models in total (all γ ∈ N+(γ̂) and all γ ∈ N−(γ̃)), so g(γ) must be computed p + 1 times per iteration.
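The add-then-delete sweep in Step 2 can be sketched with any model score. In the sketch below, ordinary least-squares residual sum of squares stands in for −g(γ) (an illustrative substitute, not the paper's criterion); because the previous model γ̂ is itself a candidate in the deletion step, each sweep never worsens the size-k fit and the loop stops at a local optimum:

```python
import numpy as np

def rss(y, X, cols):
    # residual sum of squares of the least-squares fit on the given columns
    beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
    resid = y - X[:, cols] @ beta
    return resid @ resid

def hybrid_deterministic(y, X, k, max_sweeps=50, seed=0):
    """Step-2 analogue: alternate a best single addition with a best single
    deletion until the size-k model stops changing."""
    p = X.shape[1]
    gamma = set(np.random.default_rng(seed).choice(p, size=k, replace=False))
    for _ in range(max_sweeps):
        # addition step: best member of N+(gamma), size k + 1
        j_add = min((j for j in range(p) if j not in gamma),
                    key=lambda j: rss(y, X, sorted(gamma | {j})))
        tilde = gamma | {j_add}
        # deletion step: best member of N-(tilde), size k; gamma itself is
        # among the candidates, so the size-k fit never gets worse
        j_del = min(tilde, key=lambda j: rss(y, X, sorted(tilde - {j})))
        new = tilde - {j_del}
        if new == gamma:          # local optimum reached
            break
        gamma = new
    return gamma
```

With a strong signal this typically recovers the true support; replacing `rss` with −log g(γ) gives the deterministic part of the algorithm on the slide.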
1st key feature of the proposed algorithm
  g(γ) = (τ⁻¹)^{|γ|/2} |XγᵀXγ + τ⁻¹I_{|γ|}|^{−1/2} (yᵀHγy + bσ)^{−(aσ+n)/2},
where Hγ = In − Xγ(XγᵀXγ + τ⁻¹I_{|γ|})⁻¹Xγᵀ.
Computed naively, this requires an inverse matrix and a determinant for each of the p + 1 candidate models.
We propose the following formula and can show that evaluating all candidate models in the addition neighbor is done simultaneously in this single computation:
  g(N+(γ̂)) = {(yᵀHγ̂y + bσ)1p − (XᵀHγ̂y)² / (τ⁻¹1p + diag(XᵀHγ̂X))}^{−(aσ+n)/2}
              × {τ⁻¹1p + diag(XᵀHγ̂X)}^{−1/2},   (3)
where the square, division, and powers are applied elementwise. Similarly for g(N−(γ̂)).
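Formula (3) replaces p + 1 determinant and inverse computations with one shared computation. A sketch checking it against the brute-force criterion (hyperparameter values are illustrative): for every j ∉ γ̂, the vectorized log-score differs from log g(γ̂ ∪ {j}) only by a constant not depending on j, so the rankings of addition candidates agree.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, tau, a_s, b_s = 30, 8, 10.0, 1.0, 1.0    # illustrative hyperparameters
X = rng.standard_normal((n, p))
y = X[:, 0] - X[:, 1] + 0.1 * rng.standard_normal(n)
gamma = [0, 1]                                  # current model gamma-hat

def log_g(cols):
    # brute force: one determinant and one linear solve per candidate model
    Xg = X[:, cols]
    A = Xg.T @ Xg + np.eye(len(cols)) / tau
    Hy = y - Xg @ np.linalg.solve(A, Xg.T @ y)
    return (-0.5 * len(cols) * np.log(tau)
            - 0.5 * np.linalg.slogdet(A)[1]
            - 0.5 * (a_s + n) * np.log(y @ Hy + b_s))

# single shared computation for the whole addition neighborhood, as in (3)
Xg = X[:, gamma]
A = Xg.T @ Xg + np.eye(len(gamma)) / tau
HX = X - Xg @ np.linalg.solve(A, Xg.T @ X)      # H_gamma X
Hy = y - Xg @ np.linalg.solve(A, Xg.T @ y)      # H_gamma y
d = 1.0 / tau + np.einsum('ij,ij->j', X, HX)    # tau^{-1} 1_p + diag(X'HX)
s = X.T @ Hy                                    # X'Hy
log_g_plus = (-0.5 * (a_s + n) * np.log(y @ Hy + b_s - s**2 / d)
              - 0.5 * np.log(d))                # one score per column of X
```

The identity behind (3) is the rank-one update of Hγ when one column is added, which updates both the determinant and the quadratic form in closed form.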
Hybrid best subset search with a fixed k
Recall that Goal (i) is to find γ̂ = arg max_{|γ|=k} g(γ).
1. Initialize γ̂ s.t. |γ̂| = k.
2. Repeat  # deterministic search: local optimum
   Update γ̃ ← arg max g(N+(γ̂));  # N+(γ̂) = {γ̂ ∪ {j} : j ∉ γ̂}
   Update γ̂ ← arg max g(N−(γ̃));  # N−(γ̃) = {γ̃ \ {j} : j ∈ γ̃}
   until convergence.
3. Set γ(0) = γ̂.
4. Repeat for t = 1, . . . , T:  # stochastic search: global optimum
   Sample γ∗ ∼ π(γ|y) ∝ g(γ)I{γ ∈ N+(γ(t−1))};
   Sample γ(t) ∼ π(γ|y) ∝ g(γ)I{γ ∈ N−(γ∗)};
   If π(γ̂|y) < π(γ(t)|y), then update γ̂ = γ(t), break the loop, and go to 2.
5. Return γ̂.
Note that all g(γ) are computed simultaneously within each neighborhood.
2nd key feature of the proposed algorithm
To avoid staying at a local optimum for a long time in Step 4, we use the escort distribution.
The idea of the escort distribution (used in statistical physics and thermodynamics) is introduced to stimulate the movement of the Markov chain.
The escort distribution of g(γ) is given by
  πα(γ|y) = {g(γ)}^α / Σγ {g(γ)}^α,  α ∈ [0, 1].
Escort distributions
e.g. Consider 3 candidate models with its probability:
If α = 1,
{g(γ)}α
P
γ{g(γ)}α
=





0.02 model 1
0.90 model 2
0.08 model 3
.
1 2 3
Escort
probability
0.0
0.2
0.4
0.6
0.8
1.0
α=1
0.02
0.9
0.08
1 2 3
Escort
probability
0.0
0.2
0.4
0.6
0.8
1.0
α=0.5
0.1
0.69
0.21
1 2 3
Escort
probability
0.0
0.2
0.4
0.6
0.8
1.0
α=0.05
0.3
0.37
0.33
Figure: Escort distributions of πα(γ|y).
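The escort probabilities in the figure can be reproduced in a few lines (a minimal sketch; the g values are the example probabilities from the slide):

```python
import numpy as np

def escort(g, alpha):
    """Escort distribution: g^alpha renormalized. alpha = 1 recovers the
    original distribution; alpha -> 0 approaches the uniform distribution."""
    w = np.asarray(g, dtype=float) ** alpha
    return w / w.sum()

print(np.round(escort([0.02, 0.90, 0.08], 0.5), 2))    # matches the slide: 0.10, 0.69, 0.21
print(np.round(escort([0.02, 0.90, 0.08], 0.05), 2))   # matches the slide: 0.30, 0.37, 0.33
```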
Hybrid best subset search with a fixed k
Recall that Goal (i) is to find γ̂ = arg max_{|γ|=k} g(γ).
1. Initialize γ̂ s.t. |γ̂| = k.
2. Repeat  # deterministic search: local optimum
   Update γ̃ ← arg max g(N+(γ̂));  # N+(γ̂) = {γ̂ ∪ {j} : j ∉ γ̂}
   Update γ̂ ← arg max g(N−(γ̃));  # N−(γ̃) = {γ̃ \ {j} : j ∈ γ̃}
   until convergence.
3. Set γ(0) = γ̂.
4. Repeat for t = 1, . . . , T:  # stochastic search: global optimum
   Sample γ∗ ∼ πα(γ|y) ∝ {g(γ)}^α I{γ ∈ N+(γ(t−1))};  # α ∈ [0, 1]
   Sample γ(t) ∼ πα(γ|y) ∝ {g(γ)}^α I{γ ∈ N−(γ∗)};
   If π(γ̂|y) < π(γ(t)|y), then update γ̂ = γ(t), break the loop, and go to 2.
5. Return γ̂.
Note that all g(γ) are computed simultaneously within each neighborhood.
Best subset selection with varying k
Recall Goal (ii): a single best model from among the 2^p candidate models.
We extend the "fixed" k to a varying k by assigning a prior on k.
Note that the uniform prior, k ∼ Uniform{1, . . . , K}, tends to assign larger probability to larger subsets (see Chen and Chen (2008)).
We therefore define
  π(k) ∝ {1/(p choose k)} I(k ≤ K).
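The effect of π(k) ∝ 1/(p choose k) is easy to see numerically: unlike the uniform prior, it penalizes larger subset sizes. A sketch with illustrative p and K:

```python
from math import comb

p, K = 200, 10
w = [1 / comb(p, k) for k in range(1, K + 1)]   # pi(k) proportional to 1 / C(p, k)
pi_k = [wk / sum(w) for wk in w]                # normalized prior over k = 1..K
# the probability mass falls sharply as k grows (for k <= p/2, C(p, k) increases)
```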
Hybrid best subset search with varying k
Bayesian best subset selection can be done by maximizing
  π(γ, k|y) ∝ g(γ)/(p choose k)   (4)
over (γ, k).
Our algorithm proceeds as follows:
1. Repeat for k = 1, . . . , K:
   Given k, implement the hybrid search algorithm to obtain the best subset model γ̂k.
2. Find the best model γ̂∗ by
  γ̂∗ = arg max_{k∈{1,...,K}} g(γ̂k)/(p choose k).   (5)
Consistency of model selection criterion
Theorem
Let γ∗ denote the true model. Define Γ = {γ : |γ| ≤ K, γ ≠ γ∗}. Assume that p = O(n^ξ) for ξ ≥ 1. Under the asymptotic identifiability condition of Chen and Chen (2008), if τ → ∞ as n → ∞ but τ = o(n), then the proposed Bayesian subset selection possesses Bayesian model selection consistency, that is,
  π(γ∗|y) > max_{γ∈Γ} π(γ|y)   (6)
in probability as n → ∞.
In other words, as n → ∞, the maximizer of π(γ|y) is the true model under our model selection criterion.
Simulation study
Setup
For given n = 100, we generate the data from
  yi ~ind Normal(Σ_{j=1}^p βj xij, 1),
where
- (xi1, . . . , xip)ᵀ ~iid Normal(0p, Σ) with Σ = (Σij)p×p and Σij = ρ^{|i−j|};
- βj ~iid Uniform{−1, −2, 1, 2} if j ∈ γ and βj = 0 if j ∉ γ;
- γ is an index set of size 4 randomly selected from {1, 2, . . . , p};
- we consider four scenarios for p and ρ:
  (i) p = 200, ρ = 0.1; (ii) p = 200, ρ = 0.9; (iii) p = 1000, ρ = 0.1; (iv) p = 1000, ρ = 0.9.
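The simulation setup can be generated in a few lines; a sketch for scenario (i), where the seed and generator choices are ours, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(2019)
n, p, rho = 100, 200, 0.1

# AR(1)-type correlation: Sigma_ij = rho^{|i - j|}
idx = np.arange(p)
Sigma = rho ** np.abs(idx[:, None] - idx[None, :])
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)

gamma = rng.choice(p, size=4, replace=False)          # true support of size 4
beta = np.zeros(p)
beta[gamma] = rng.choice([-1.0, -2.0, 1.0, 2.0], size=4)
y = X @ beta + rng.standard_normal(n)                 # noise variance 1
```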
Simulation study
Results (high-dimensional scenarios)
Table: 2,000 replications; FDR (false discovery rate), TRUE% (percentage of the true model detected), SIZE (selected model size), HAM (Hamming distance).
Scenario Method FDR (s.e.) TRUE% (s.e.) SIZE (s.e.) HAM (s.e.)
p = 200 Proposed 0.006 (0.001) 96.900 (0.388) 4.032 (0.004) 0.032 (0.004)
ρ = 0.1 SCAD 0.034 (0.002) 85.200 (0.794) 4.188 (0.011) 0.188 (0.011)
MCP 0.035 (0.002) 84.750 (0.804) 4.191 (0.011) 0.191 (0.011)
ENET 0.016 (0.001) 92.700 (0.582) 4.087 (0.007) 0.087 (0.007)
LASSO 0.020 (0.002) 91.350 (0.629) 4.109 (0.009) 0.109 (0.009)
p = 200 Proposed 0.023 (0.002) 88.750 (0.707) 3.985 (0.006) 0.203 (0.014)
ρ = 0.9 SCAD 0.059 (0.003) 74.150 (0.979) 4.107 (0.015) 0.480 (0.022)
MCP 0.137 (0.004) 55.400 (1.112) 4.264 (0.020) 1.098 (0.034)
ENET 0.501 (0.004) 0.300 (0.122) 7.716 (0.072) 5.018 (0.052)
LASSO 0.276 (0.004) 15.550 (0.811) 5.308 (0.033) 2.038 (0.034)
Simulation study
Results (ultra high-dimensional scenarios)
Table: 2,000 replications; FDR (false discovery rate), TRUE% (percentage of the true model detected), SIZE (selected model size), HAM (Hamming distance).
Scenario Method FDR (s.e.) TRUE% (s.e.) SIZE (s.e.) HAM (s.e.)
p = 1000 Proposed 0.004 (0.001) 98.100 (0.305) 4.020 (0.003) 0.020 (0.003)
ρ = 0.1 SCAD 0.027 (0.002) 87.900 (0.729) 4.145 (0.010) 0.145 (0.010)
MCP 0.031 (0.002) 86.550 (0.763) 4.172 (0.013) 0.172 (0.013)
ENET 0.035 (0.002) 84.850 (0.802) 4.181 (0.013) 0.206 (0.012)
LASSO 0.014 (0.001) 93.850 (0.537) 4.073 (0.007) 0.073 (0.007)
p = 1000 Proposed 0.023(0.002) 89.850 (0.675) 4.005 (0.005) 0.190 (0.013)
ρ = 0.9 SCAD 0.068 (0.003) 74.250 (0.978) 4.196 (0.014) 0.493 (0.023)
MCP 0.152 (0.004) 53.750 (1.115) 4.226 (0.017) 1.202 (0.035)
ENET 0.417 (0.005) 0.150 (0.087) 6.228 (0.068) 4.089 (0.043)
LASSO 0.265 (0.004) 19.500 (0.886) 5.139 (0.029) 1.909 (0.035)
Real data application
Data description
We apply the proposed method to Breast Invasive Carcinoma (BRCA) data
generated by The Cancer Genome Atlas (TCGA) Research Network
http://cancergenome.nih.gov.
The data set contains 17,814 gene expression measurements (recorded on the log scale) of 526 patients with primary solid tumor.
BRCA1 is a tumor suppressor gene and its mutations predispose women to
breast cancer (Findlay et al., 2018).
Real data application
Results (based on 4, 000 genes)
Our goal here is to identify the best-fitting model for estimating the association between BRCA1 (response variable) and the other genes (independent variables):
  BRCA1 = β1·NBR2 + β2·DTL + . . . + βp·VPS25 + ε.
Results:
Table: Model comparison
# of selected PMSE BIC EBIC
Proposed 8 0.60 984.45 1099.50
SCAD 4 0.68 1104.69 1166.47
MCP 4 0.68 1104.69 1166.47
ENET 5 0.68 1110.65 1186.25
LASSO 4 0.68 1104.69 1166.47
Real data application
Results (cont.)
Figure: Estimated coefficients of the genes (RPL27, C10orf76, DTL, NBR2, C17orf53, TUBG1, CRBN, YTHDC2, VPS25, CMTM5, TUBA1B) selected by LASSO, ENET, MCP, SCAD, and the proposed method. Except C10orf76, the other 7 genes selected by the proposed method are documented as disease-related genes.
Concluding remarks
Parallel computing is applicable to our algorithm with varying k.
The proposed method can be extended to multivariate linear regression models, binary regression models, and models with multivariate mixed responses (work in progress).
REFERENCES
Chen, J. and Z. Chen (2008). Extended Bayesian information criteria for model selection with large model spaces. Biometrika 95(3), 759–771.
Findlay, G. M., R. M. Daza, B. Martin, M. D. Zhang, A. P. Leith, M. Gasperini, J. D. Janizek, X. Huang, L. M. Starita, and J. Shendure (2018). Accurate classification of BRCA1 variants with saturation genome editing. Nature 562(7726), 217.
Hans, C., A. Dobra, and M. West (2007). Shotgun stochastic search for "large p" regression. Journal of the American Statistical Association 102(478), 507–516.
Madigan, D., J. York, and D. Allard (1995). Bayesian graphical models for discrete data. International Statistical Review 63(2), 215–232.
Contact: jinsq@ksu.edu
THANK YOU
Mimo system-order-reduction-using-real-coded-genetic-algorithm
 
3rd NIPS Workshop on PROBABILISTIC PROGRAMMING
3rd NIPS Workshop on PROBABILISTIC PROGRAMMING3rd NIPS Workshop on PROBABILISTIC PROGRAMMING
3rd NIPS Workshop on PROBABILISTIC PROGRAMMING
 
Module - 2 Discrete Mathematics and Graph Theory
Module - 2 Discrete Mathematics and Graph TheoryModule - 2 Discrete Mathematics and Graph Theory
Module - 2 Discrete Mathematics and Graph Theory
 
QMC: Undergraduate Workshop, Introduction to Monte Carlo Methods with 'R' Sof...
QMC: Undergraduate Workshop, Introduction to Monte Carlo Methods with 'R' Sof...QMC: Undergraduate Workshop, Introduction to Monte Carlo Methods with 'R' Sof...
QMC: Undergraduate Workshop, Introduction to Monte Carlo Methods with 'R' Sof...
 
Ica group 3[1]
Ica group 3[1]Ica group 3[1]
Ica group 3[1]
 

Recently uploaded

Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsPrecisely
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfngoud9212
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsAndrey Dotsenko
 

Recently uploaded (20)

Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power Systems
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdf
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 

Bayesian Best Subset Selection in High-D Regression

  • 1. Bayesian selection of best subsets in high-dimensional regression. Shiqiang Jin, Department of Statistics, Kansas State University, Manhattan, KS. Joint work with Gyuhyeong Goh, Kansas State University, Manhattan, KS. July 31, 2019.
  • 2. Bayesian linear regression model in high-dimensional data. Consider a linear regression model y = Xβ + ε, (1) where y = (y1, . . . , yn)ᵀ is a response vector, X = (x1, . . . , xp) ∈ ℝ^(n×p) is a model matrix, β = (β1, . . . , βp)ᵀ is a coefficient vector, and ε ∼ N(0, σ²In). We assume p ≫ n, i.e., high-dimensional data. We also assume that only a few predictors are associated with the response, i.e., β is sparse.
  • 3. Bayesian linear regression model in high-dimensional data. To better explain the sparsity of β, we introduce a latent index set γ ⊂ {1, . . . , p} so that Xγ is the sub-matrix of X containing the columns xj, j ∈ γ. E.g., γ = {1, 3, 4} ⇒ Xγ = (x1, x3, x4). The full model in (1) then reduces to y = Xγβγ + ε. (2)
  • 4. Priors and marginal posterior distribution. Our research goals are to obtain (i) the k most important predictors among the C(p, k) candidate models, and (ii) a single best model among the 2^p candidate models. We consider the priors βγ | σ², γ ∼ Normal(0, τσ²I|γ|), σ² ∼ Inverse-Gamma(aσ/2, bσ/2), and π(γ) ∝ I(|γ| = k), where |γ| is the number of elements in γ.
  • 6. Priors and marginal posterior distribution. Given k, it can be shown that π(γ|y) ∝ τ^(−|γ|/2) |Xγᵀ Xγ + τ⁻¹I|γ||^(−1/2) (yᵀ Hγ y + bσ)^(−(aσ+n)/2) I(|γ| = k) ≡ g(γ) I(|γ| = k), where Hγ = In − Xγ(Xγᵀ Xγ + τ⁻¹I|γ|)⁻¹ Xγᵀ. Hence, g(γ) is our model selection criterion.
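The criterion above can be evaluated directly from its closed form. A minimal sketch follows, computing log g(γ); the hyperparameter defaults (τ = 10, aσ = bσ = 1) are illustrative choices, not values from the slides.

```python
import numpy as np

def log_g(y, X, gamma, tau=10.0, a_sigma=1.0, b_sigma=1.0):
    """Log of the selection criterion g(gamma).

    gamma is a list of column indices of X; tau, a_sigma, b_sigma are the
    hyperparameters of the normal and inverse-gamma priors.
    """
    n = len(y)
    Xg = X[:, gamma]
    k = Xg.shape[1]
    A = Xg.T @ Xg + np.eye(k) / tau                 # Xg' Xg + tau^{-1} I
    Hy = y - Xg @ np.linalg.solve(A, Xg.T @ y)      # H_gamma y
    quad = y @ Hy + b_sigma                          # y' H_gamma y + b_sigma
    _, logdet = np.linalg.slogdet(A)
    return (-0.5 * k * np.log(tau)                   # tau^{-|gamma|/2}
            - 0.5 * logdet                           # |A|^{-1/2}
            - 0.5 * (a_sigma + n) * np.log(quad))    # (y'Hy + b)^{-(a+n)/2}
```

Larger log g means a better subset, so subsets containing the true predictors should outscore noise subsets of the same size.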
  • 7. Best subset selection algorithm. Recall that our goals are to obtain (i) the k most important predictors among the C(p, k) candidate models and (ii) a single best model among the 2^p candidate models. This fits the description of best subset selection: (i) Fixed size: for each k = 1, 2, . . . , p, select the best subset model Mk = arg max_{|γ|=k} g(γ) among the C(p, k) candidate models. (ii) Varying size: pick the single best model from M1, . . . , Mp. The challenge of best subset selection is combinatorial explosion: e.g., (i) p = 1000, k = 5 gives C(1000, 5) ≈ 8 × 10¹² models; (ii) p = 40 gives 2⁴⁰ ≈ 10¹² models.
  • 10. Neighborhood search. To avoid exhaustive enumeration, we resort to the idea of neighborhood search proposed by Madigan et al. (1995) and Hans et al. (2007). Let γ(t) be the current state of γ, with model size |γ(t)| = k. Addition neighbor: N+(γ(t)) = {γ(t) ∪ {j} : j ∉ γ(t)}, whose members have model size k + 1. Deletion neighbor: N−(γ(t)) = {γ(t) \ {j′} : j′ ∈ γ(t)}, whose members have model size k − 1. E.g., suppose p = 4 and k = 2. If γ(t) = {1, 2}, then N+(γ(t)) = {{1, 2, 3}, {1, 2, 4}} and N−(γ(t)) = {{1}, {2}}.
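Both neighborhoods are cheap to enumerate. A minimal sketch, using 1-based indices to match the example above:

```python
def addition_neighbors(gamma, p):
    """N+(gamma): every subset obtained by adding one index j not in gamma.

    Each member has size k + 1; there are p - k of them."""
    return [sorted(gamma + [j]) for j in range(1, p + 1) if j not in gamma]

def deletion_neighbors(gamma):
    """N-(gamma): every subset obtained by deleting one index from gamma.

    Each member has size k - 1; there are k of them."""
    return [sorted(set(gamma) - {j}) for j in gamma]
```

For γ = {1, 2} with p = 4 this reproduces the neighborhoods listed on the slide.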
  • 11. Hybrid best subset search with a fixed k. Goal (i) is to find γ̂ = arg max_{|γ|=k} g(γ). 1. Initialize γ̂ s.t. |γ̂| = k. 2. Repeat until convergence (deterministic search for a local optimum): update γ̃ ← arg max_{γ∈N+(γ̂)} g(γ), where N+(γ̂) = {γ̂ ∪ {j} : j ∉ γ̂}; update γ̂ ← arg max_{γ∈N−(γ̃)} g(γ), where N−(γ̃) = {γ̃ \ {j} : j ∈ γ̃}. In Step 2 there are p + 1 candidate models in total across N+(γ̂) and N−(γ̃), so each iteration computes g(γ) p + 1 times.
  • 13. First key feature of the proposed algorithm. Recall g(γ) = τ^(−|γ|/2) |Xγᵀ Xγ + τ⁻¹I|γ||^(−1/2) (yᵀ Hγ y + bσ)^(−(aσ+n)/2), where Hγ = In − Xγ(Xγᵀ Xγ + τ⁻¹I|γ|)⁻¹ Xγᵀ. A naive search computes an inverse matrix and a determinant p + 1 times per iteration. We propose the following formula, which evaluates all candidate models in the addition neighborhood simultaneously in a single computation: g(N+(γ̂)) = [(yᵀ Hγ̂ y + bσ)1p − (Xᵀ Hγ̂ y)² ⊘ (τ⁻¹1p + diag(Xᵀ Hγ̂ X))]^(−(aσ+n)/2) ⊙ [τ⁻¹1p + diag(Xᵀ Hγ̂ X)]^(−1/2), (3) where the square, quotient ⊘, and product ⊙ are element-wise over the p candidates. A similar formula holds for g(N−(γ̂)).
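Equation (3) in code: one solve against the current model's k × k matrix scores all p addition candidates at once, with no p × p inverses or determinants. A sketch with illustrative hyperparameters (τ = 10, aσ = bσ = 1); the entries for j already in γ̂ should be ignored by the caller.

```python
import numpy as np

def log_g_additions(y, X, gamma, tau=10.0, a_sigma=1.0, b_sigma=1.0):
    """Log-score of every addition candidate in one vectorized pass
    (eq. (3)), up to a constant shared by all size-(k+1) models."""
    n, p = X.shape
    Xg = X[:, gamma]
    k = Xg.shape[1]
    A = Xg.T @ Xg + np.eye(k) / tau
    Hy = y - Xg @ np.linalg.solve(A, Xg.T @ y)       # H_gamma y
    XtHy = X.T @ Hy                                   # X' H y (p-vector)
    # diag(X' H X) without forming H: x_j'x_j - x_j' Xg A^{-1} Xg' x_j
    B = np.linalg.solve(A, Xg.T @ X)                  # A^{-1} Xg' X
    dXHX = np.einsum('ij,ij->j', X, X) - np.einsum('ij,ij->j', Xg @ B, X)
    denom = 1.0 / tau + dXHX                          # tau^{-1} 1p + diag(X'HX)
    quad = (y @ Hy + b_sigma) - XtHy ** 2 / denom     # rank-one update of y'Hy + b
    return -0.5 * (a_sigma + n) * np.log(quad) - 0.5 * np.log(denom)
```

Differences of these scores match differences of the full brute-force criterion, so the argmax over additions is unchanged.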
  • 14. Hybrid best subset search with a fixed k. Goal (i) is to find γ̂ = arg max_{|γ|=k} g(γ). 1. Initialize γ̂ s.t. |γ̂| = k. 2. Repeat until convergence (deterministic search for a local optimum): update γ̃ ← arg max g(N+(γ̂)), where N+(γ̂) = {γ̂ ∪ {j} : j ∉ γ̂}; update γ̂ ← arg max g(N−(γ̃)), where N−(γ̃) = {γ̃ \ {j} : j ∈ γ̃}. 3. Set γ(0) = γ̂. 4. Repeat for t = 1, . . . , T (stochastic search for the global optimum): sample γ* ∼ π(γ|y) = g(γ)/Σγ g(γ) restricted to γ ∈ N+(γ(t−1)); sample γ(t) ∼ π(γ|y) = g(γ)/Σγ g(γ) restricted to γ ∈ N−(γ*); if π(γ̂|y) < π(γ(t)|y), then update γ̂ = γ(t), break the loop, and go to 2. 5. Return γ̂. Note that all values of g(γ) are computed simultaneously within each neighborhood.
  • 16. Second key feature of the proposed algorithm. To avoid the stochastic search in Step 4 lingering near a local optimum for a long time, we use the escort distribution. The idea of the escort distribution (used in statistical physics and thermodynamics) is introduced to stimulate the movement of the Markov chain. The escort distribution of g(γ) is given by {g(γ)}^α / Σγ {g(γ)}^α, α ∈ [0, 1].
  • 17. Escort distributions. E.g., consider three candidate models. With α = 1 the escort probabilities are {g(γ)}^α / Σγ {g(γ)}^α = (0.02, 0.90, 0.08), so the chain almost always stays at model 2; with α = 0.5 they flatten to (0.10, 0.69, 0.21); with α = 0.05 they become nearly uniform, (0.30, 0.37, 0.33). Figure: escort distributions of πα(γ|y) for α = 1, 0.5, 0.05.
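The escort transform is one line, and the three panels above follow from it directly:

```python
import numpy as np

def escort(weights, alpha):
    """Escort distribution: w_i^alpha / sum_j w_j^alpha.

    alpha = 1 recovers the original probabilities; alpha -> 0
    flattens them toward the uniform distribution."""
    w = np.asarray(weights, dtype=float) ** alpha
    return w / w.sum()
```

Smaller α makes jumps away from the dominant model more likely, which is exactly what Step 4 needs to escape a local optimum.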
  • 18. Hybrid best subset search with a fixed k. The full algorithm with the escort distribution: 1. Initialize γ̂ s.t. |γ̂| = k. 2. Repeat until convergence (deterministic search for a local optimum): update γ̃ ← arg max g(N+(γ̂)); update γ̂ ← arg max g(N−(γ̃)). 3. Set γ(0) = γ̂. 4. Repeat for t = 1, . . . , T (stochastic search for the global optimum): sample γ* ∼ πα(γ|y) = {g(γ)}^α / Σγ {g(γ)}^α restricted to γ ∈ N+(γ(t−1)), with α ∈ [0, 1]; sample γ(t) ∼ πα(γ|y) restricted to γ ∈ N−(γ*); if π(γ̂|y) < π(γ(t)|y), then update γ̂ = γ(t), break the loop, and go to 2. 5. Return γ̂. Note that all values of g(γ) are computed simultaneously within each neighborhood.
  • 19. Best subset selection with varying k. Goal (ii) is a single best model among the 2^p candidate models. We extend the fixed k to a varying k by assigning a prior on k. The uniform prior k ∼ Uniform{1, . . . , K} tends to assign larger probability to larger subsets (see Chen and Chen (2008)), so we instead define π(k) ∝ C(p, k)⁻¹ I(k ≤ K).
  • 20. Hybrid best subset search with varying k. Bayesian best subset selection can be done by maximizing π(γ, k|y) ∝ g(γ)/C(p, k) (4) over (γ, k). Our algorithm proceeds as follows: 1. For each k = 1, . . . , K, run the fixed-k hybrid search algorithm to obtain the best subset model γ̂k. 2. Report the best model γ̂* = arg max_{k∈{1,...,K}} g(γ̂k)/C(p, k). (5)
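Step 2 works on the log scale to avoid overflow in the binomial coefficient. A sketch of the size-selection rule in (5); the dictionary of best fixed-k scores below is a hypothetical input standing in for the output of Step 1.

```python
from math import lgamma, log

def log_binom(p, k):
    """log of C(p, k), computed stably via log-gamma."""
    return lgamma(p + 1) - lgamma(k + 1) - lgamma(p - k + 1)

def best_size(log_g_hat, p):
    """Pick the k maximizing log g(gamma_k) - log C(p, k)  (eq. (5)).

    log_g_hat maps each k to the log-score of the best size-k model
    found by the fixed-k hybrid search."""
    return max(log_g_hat, key=lambda k: log_g_hat[k] - log_binom(p, k))
```

The C(p, k)⁻¹ factor acts as a complexity penalty: a larger model must improve log g by more than the growth in log C(p, k) to be chosen.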
  • 21. Consistency of the model selection criterion. Theorem. Let γ* denote the true model and define Γ = {γ : |γ| ≤ K, γ ≠ γ*}. Assume p = O(n^ξ) for some ξ ≥ 1. Under the asymptotic identifiability condition of Chen and Chen (2008), if τ → ∞ as n → ∞ but τ = o(n), then the proposed Bayesian subset selection possesses Bayesian model selection consistency, that is, π(γ*|y) > max_{γ∈Γ} π(γ|y) (6) in probability as n → ∞. In other words, as n → ∞ the maximizer of π(γ|y) under our model selection criterion is the true model.
  • 22. Simulation study: setup. For given n = 100, we generate the data from yi ∼ Normal(Σ_{j=1}^p βj xij, 1) independently, where (xi1, . . . , xip)ᵀ ∼ Normal(0p, Σ) i.i.d. with Σ = (Σij)p×p and Σij = ρ^|i−j|; βj ∼ Uniform{−1, −2, 1, 2} i.i.d. if j ∈ γ and βj = 0 if j ∉ γ; and γ is an index set of size 4 randomly selected from {1, 2, . . . , p}. We consider four scenarios for (p, ρ): (i) p = 200, ρ = 0.1; (ii) p = 200, ρ = 0.9; (iii) p = 1000, ρ = 0.1; (iv) p = 1000, ρ = 0.9.
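One replication of this design can be sketched as follows; the function name and seed handling are illustrative, not from the slides.

```python
import numpy as np

def simulate(n=100, p=200, rho=0.1, k_true=4, seed=0):
    """One replication of the simulation design: AR(1)-correlated
    predictors, 4 active coefficients drawn from {-2, -1, 1, 2},
    and unit-variance Gaussian noise."""
    rng = np.random.default_rng(seed)
    # Sigma_ij = rho^{|i-j|}
    Sigma = rho ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
    X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
    gamma = rng.choice(p, size=k_true, replace=False)   # true support
    beta = np.zeros(p)
    beta[gamma] = rng.choice([-2.0, -1.0, 1.0, 2.0], size=k_true)
    y = X @ beta + rng.standard_normal(n)
    return y, X, beta, np.sort(gamma)
```

Each method is then run on (y, X), and FDR, TRUE%, SIZE, and HAM are computed by comparing the selected set against the returned γ.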
  • 23. Simulation study: results (high-dimensional scenarios). Table: 2,000 replications; FDR (false discovery rate), TRUE% (percentage of the true model detected), SIZE (selected model size), HAM (Hamming distance); standard errors in parentheses.
    Scenario p = 200, ρ = 0.1: Proposed FDR 0.006 (0.001), TRUE% 96.900 (0.388), SIZE 4.032 (0.004), HAM 0.032 (0.004); SCAD 0.034 (0.002), 85.200 (0.794), 4.188 (0.011), 0.188 (0.011); MCP 0.035 (0.002), 84.750 (0.804), 4.191 (0.011), 0.191 (0.011); ENET 0.016 (0.001), 92.700 (0.582), 4.087 (0.007), 0.087 (0.007); LASSO 0.020 (0.002), 91.350 (0.629), 4.109 (0.009), 0.109 (0.009).
    Scenario p = 200, ρ = 0.9: Proposed 0.023 (0.002), 88.750 (0.707), 3.985 (0.006), 0.203 (0.014); SCAD 0.059 (0.003), 74.150 (0.979), 4.107 (0.015), 0.480 (0.022); MCP 0.137 (0.004), 55.400 (1.112), 4.264 (0.020), 1.098 (0.034); ENET 0.501 (0.004), 0.300 (0.122), 7.716 (0.072), 5.018 (0.052); LASSO 0.276 (0.004), 15.550 (0.811), 5.308 (0.033), 2.038 (0.034).
  • 24. Simulation study: results (ultra high-dimensional scenarios). Table: 2,000 replications; FDR, TRUE%, SIZE, HAM as before; standard errors in parentheses.
    Scenario p = 1000, ρ = 0.1: Proposed FDR 0.004 (0.001), TRUE% 98.100 (0.305), SIZE 4.020 (0.003), HAM 0.020 (0.003); SCAD 0.027 (0.002), 87.900 (0.729), 4.145 (0.010), 0.145 (0.010); MCP 0.031 (0.002), 86.550 (0.763), 4.172 (0.013), 0.172 (0.013); ENET 0.035 (0.002), 84.850 (0.802), 4.181 (0.013), 0.206 (0.012); LASSO 0.014 (0.001), 93.850 (0.537), 4.073 (0.007), 0.073 (0.007).
    Scenario p = 1000, ρ = 0.9: Proposed 0.023 (0.002), 89.850 (0.675), 4.005 (0.005), 0.190 (0.013); SCAD 0.068 (0.003), 74.250 (0.978), 4.196 (0.014), 0.493 (0.023); MCP 0.152 (0.004), 53.750 (1.115), 4.226 (0.017), 1.202 (0.035); ENET 0.417 (0.005), 0.150 (0.087), 6.228 (0.068), 4.089 (0.043); LASSO 0.265 (0.004), 19.500 (0.886), 5.139 (0.029), 1.909 (0.035).
  • 25. Real data application: data description. We apply the proposed method to the Breast Invasive Carcinoma (BRCA) data generated by The Cancer Genome Atlas (TCGA) Research Network, http://cancergenome.nih.gov. The data set contains 17,814 gene expression measurements (recorded on the log scale) for 526 patients with primary solid tumor. BRCA1 is a tumor suppressor gene, and its mutations predispose women to breast cancer (Findlay et al., 2018).
  • 26. Real data application: results (based on 4,000 genes). Our goal here is to identify the best fitting model for estimating the association between BRCA1 (response variable) and the other genes (independent variables): BRCA1 = β1·NBR2 + β2·DTL + . . . + βp·VPS25 + ε. Table: model comparison.
    Proposed: 8 genes selected, PMSE 0.60, BIC 984.45, EBIC 1099.50; SCAD: 4, 0.68, 1104.69, 1166.47; MCP: 4, 0.68, 1104.69, 1166.47; ENET: 5, 0.68, 1110.65, 1186.25; LASSO: 4, 0.68, 1104.69, 1166.47.
  • 27. Real data application: results (cont.). Figure: coefficient estimates for the genes RPL27, C10orf76, DTL, NBR2, C17orf53, TUBG1, CRBN, YTHDC2, VPS25, CMTM5, and TUBA1B under each method (Proposed, SCAD, MCP, ENET, LASSO). Except C10orf76, 7 genes are documented as disease-related genes.
  • 28. Concluding remarks. Parallel computing is applicable to our algorithm with varying k, since the fixed-k searches are independent. The proposed method can be extended to multivariate linear regression models, binary regression models, and multivariate mixed-response models (work in progress).
  • 29. References. Chen, J. and Z. Chen (2008). Extended Bayesian information criteria for model selection with large model spaces. Biometrika 95(3), 759–771. Findlay, G. M., R. M. Daza, B. Martin, M. D. Zhang, A. P. Leith, M. Gasperini, J. D. Janizek, X. Huang, L. M. Starita, and J. Shendure (2018). Accurate classification of BRCA1 variants with saturation genome editing. Nature 562(7726), 217. Hans, C., A. Dobra, and M. West (2007). Shotgun stochastic search for "large p" regression. Journal of the American Statistical Association 102(478), 507–516. Madigan, D., J. York, and D. Allard (1995). Bayesian graphical models for discrete data. International Statistical Review 63(2), 215–232.
  • 30. Contact: jinsq@ksu.edu. THANK YOU.