Interpretable Sparse Sliced Inverse Regression for digitized functional data
Victor Picheny, Rémi Servien & Nathalie Villa-Vialaneix
nathalie.villa@toulouse.inra.fr
http://www.nathalievilla.org
Séminaire Institut de Mathématiques de Bordeaux, 8 April 2016
Nathalie Villa-Vialaneix | IS-SIR 1/26
Outline
1 Background and motivation
2 Presentation of SIR
3 Our proposal
4 Simulations
A typical case study: meta-model in agronomy
[Diagram: climate (daily time series: rain, temperature...) → Agronomic model → plant phenotype predictions (yield, N leaching...)]
Agronomic model:
based on biological and chemical knowledge;
computationally expensive to use;
useful for realistic predictions, but not for understanding the link between the inputs and the outputs.
Metamodeling: train a simplified, fast and interpretable model which can be used as a proxy for the agronomic model.
A first case study: SUNFLO [Casadebaig et al., 2011]
Inputs: 5 daily time series (length: one year) and 8 phenotypes for different sunflower types
Output: sunflower yield
Data: 1000 sunflower types × 190 climatic series (different places and years), i.e., n = 190,000 observations of variables in R^{5×183} × R^8
Main facts obtained from a preliminary study (R. Kpekou internship)
The study focused on the influence of the climate on the yield: 5 functional variables digitized at 183 points.
Main result: using summaries of the variables (mean, sd...) over several weeks, combined with an automatic aggregating procedure in a random forest method, led to good prediction accuracy.
Question and mathematical framework
A functional regression problem: X a functional random variable and Y a real random variable; the target is E(Y|X).
Data: n i.i.d. observations (x_i, y_i)_{i=1,...,n}. Each x_i is not perfectly known but sampled at (fixed) points: x_i = (x_i(t_1), ..., x_i(t_p))^T \in R^p. We denote by X the (n × p) matrix with rows x_1^T, ..., x_n^T.
Question: find a model which is easily interpretable and points out the intervals in the range of X that are relevant for the prediction.
Related works (variable selection in FDA)
LASSO / L1 regularization in linear models: [Ferraty et al., 2010, Aneiros and Vieu, 2014] (isolated evaluation points), [Matsui and Konishi, 2011] (selects elements of an expansion basis), [James et al., 2009] (sparsity on derivatives: piecewise constant predictors)
[Fraiman et al., 2015]: a blinding approach usable for various problems (PCA, regression...)
[Gregorutti et al., 2015]: adaptation of the variable importance in random forests to groups of variables
Our proposal: a semi-parametric (not entirely linear) model which selects relevant intervals, combined with an automatic procedure to define the intervals.
SIR in the multidimensional framework
SIR: a semi-parametric regression model for X \in R^p:
Y = F(a_1^T X, ..., a_d^T X, \epsilon)
for a_1, ..., a_d \in R^p (to be estimated), F: R^{d+1} \to R unknown, and \epsilon an error term independent of X.
Standard assumption for SIR: Y \perp X | P_A(X), in which A is the so-called EDR space, spanned by (a_k)_{k=1,...,d}.
Estimation
Equivalence between SIR and an eigendecomposition: A is included in the space spanned by the first d \Sigma-orthogonal eigenvectors of the generalized eigendecomposition problem
\Gamma a = \lambda \Sigma a, with \Sigma = Cov(X) and \Gamma = Cov(E(X|Y)).
Estimation (when n > p):
compute \bar{X} = (1/n) \sum_{i=1}^n x_i and \hat\Sigma = (1/n) X^T (X - 1_n \bar{X}^T);
split the range of Y into H slices \tau_1, ..., \tau_H and estimate \hat{E}(X|Y) = (\bar{X}_h)_{h=1,...,H}, with \bar{X}_h = (1/n_h) \sum_{i: y_i \in \tau_h} x_i and n_h = |{i : y_i \in \tau_h}|; then \hat\Gamma = \hat{E}(X|Y)^T D \hat{E}(X|Y), with D = Diag(n_1/n, ..., n_H/n);
solving the eigendecomposition problem \hat\Gamma a = \lambda \hat\Sigma a gives the eigenvectors a_1, ..., a_d, collected in the (p × d) matrix \hat{A} = (a_1, ..., a_d).
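The estimation steps above can be sketched numerically. This is a minimal illustration under assumed conventions (slices of roughly equal size, a small jitter added to \hat\Sigma for numerical stability), not the authors' code:

```python
# Minimal SIR sketch: slice y, form within-slice means of centered X,
# then solve the generalized eigenproblem Gamma a = lambda Sigma a.
import numpy as np
from scipy.linalg import eigh

def sir(X, y, H=10, d=1):
    """Estimate d EDR directions by sliced inverse regression (n > p case)."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)                       # center X
    Sigma = Xc.T @ Xc / n + 1e-8 * np.eye(p)      # empirical Cov(X), jittered
    Gamma = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), H):  # H slices of ~equal size
        ph = len(idx) / n                         # slice weight n_h / n
        mh = Xc[idx].mean(axis=0)                 # slice mean of centered X
        Gamma += ph * np.outer(mh, mh)            # Cov(E(X|Y)) estimate
    # eigh returns eigenvalues in ascending order: keep the top d vectors
    vals, vecs = eigh(Gamma, Sigma)
    return vecs[:, ::-1][:, :d]                   # p x d matrix A_hat
```

On a toy single-index model such as y = x_1^3 + noise, the leading estimated direction should align with e_1.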
Equivalent formulations
SIR as a regression problem: [Li and Yin, 2008] show that SIR is equivalent to the (double) minimization of
E(A, C) = \sum_{h=1}^H \hat{p}_h || \bar{X}_h - \bar{X} - \hat\Sigma A C_h ||^2
for \bar{X}_h = (1/n_h) \sum_{i: y_i \in \tau_h} x_i, A a (p × d)-matrix and each C_h a vector in R^d.
Rk: given A, C is obtained as the solution of an ordinary least squares problem...
SIR as a canonical correlation problem: [Li and Nachtsheim, 2008] show that SIR rewrites as the double optimization problem max_{a_j, \phi} Cor(\phi(Y), a_j^T X), where \phi is any function R \to R and the (a_j)_j are \Sigma-orthonormal.
Rk: the solution is shown to satisfy \phi(y) = a_j^T E(X|Y = y), and a_j is also obtained as the solution of the mean square error problem min_{a_j} E[(\phi(Y) - a_j^T X)^2].
SIR in large dimensions: problem
In large dimensions (or in Functional Data Analysis), n < p: \hat\Sigma is ill-conditioned and has no inverse, so Z = (X - 1_n \bar{X}^T) \hat\Sigma^{-1/2} cannot be computed.
Different solutions have been proposed in the literature, based on:
prior dimension reduction (e.g., PCA) [Ferré and Yao, 2003] (in the framework of FDA);
regularization (ridge...) [Li and Yin, 2008, Bernard-Michel et al., 2008];
sparse SIR [Li and Yin, 2008, Li and Nachtsheim, 2008, Ni et al., 2005].
SIR in large dimensions: ridge penalty / L2 regularization of \hat\Sigma
Following [Li and Yin, 2008], who show that SIR is equivalent to the minimization of E(A, C), [Bernard-Michel et al., 2008] propose to add a ridge penalty in the high-dimensional setting:
E_2(A, C) = \sum_{h=1}^H \hat{p}_h || \bar{X}_h - \bar{X} - \hat\Sigma A C_h ||^2 + \mu_2 \sum_{h=1}^H \hat{p}_h || A C_h ||^2.
They also show that this problem is equivalent to finding the eigenvectors of the generalized eigenvalue problem
\hat\Gamma a = \lambda (\hat\Sigma + \mu_2 I_p) a.
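In code, the ridge variant only changes the right-hand side of the eigenproblem, which makes it well posed even when n < p. A minimal sketch with assumed names (not the authors' implementation):

```python
# Ridge SIR sketch: same slice statistics as plain SIR, but the
# generalized eigenproblem is Gamma a = lambda (Sigma + mu2 I) a.
import numpy as np
from scipy.linalg import eigh

def ridge_sir(X, y, H=10, d=1, mu2=1.0):
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    Sigma = Xc.T @ Xc / n                          # rank-deficient if n < p
    Gamma = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), H):
        mh = Xc[idx].mean(axis=0)
        Gamma += (len(idx) / n) * np.outer(mh, mh)
    # mu2 * I makes the right-hand side positive definite
    vals, vecs = eigh(Gamma, Sigma + mu2 * np.eye(p))
    return vecs[:, ::-1][:, :d]
```

Unlike plain SIR, this runs without error on n = 30 observations of p = 50 variables.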
SIR in large dimensions: sparse versions
Specific issue when introducing sparsity in SIR: sparsity acts on a multiple-index model. Most authors use shrinkage approaches.
First version: sparse penalization of the ridge solution. If (\hat{A}, \hat{C}) are the solutions of the ridge SIR described on the previous slide, [Ni et al., 2005, Li and Yin, 2008] propose to shrink this solution by minimizing
E_{s,1}(\alpha) = \sum_{h=1}^H \hat{p}_h || \bar{X}_h - \bar{X} - \hat\Sigma Diag(\alpha) \hat{A} \hat{C}_h ||^2 + \mu_1 ||\alpha||_{L^1}
(regression formulation of SIR).
Second version: [Li and Nachtsheim, 2008] derive the sparse optimization problem from the correlation formulation of SIR:
min_{a_j^s} \sum_{i=1}^n ( P_{\hat{a}_j}(X|y_i) - (a_j^s)^T x_i )^2 + \mu_{1,j} ||a_j^s||_{L^1},
in which P_{\hat{a}_j}(X|y_i) is the projection of \hat{E}(X|Y = y_i) = \bar{X}_h onto the space spanned by the solution of the ridge problem.
Characteristics of the different approaches and possible extensions

                         [Li and Yin, 2008]        [Li and Nachtsheim, 2008]
sparsity on              shrinkage coefficients    estimates
nb of optimization pbs   1                         d
sparsity                 common to all dims        specific to each dim

Extension to block-sparse SIR (as in PCA)?
IS-SIR: a two-step approach
Background: back to the functional setting, we suppose that t_1, ..., t_p are split into D intervals I_1, ..., I_D.
First step: solve the ridge problem on the digitized functions (viewed as high-dimensional vectors) to obtain \hat{A} and \hat{C}:
min_{A,C} \sum_{h=1}^H \hat{p}_h || \bar{X}_h - \bar{X} - \hat\Sigma A C_h ||^2 + \mu_2 \sum_{h=1}^H \hat{p}_h || A C_h ||^2
Second step: sparse shrinkage using the intervals. If P_{\hat{A}}(E(X|Y = y_i)) = (\bar{X}_h - \bar{X})^T \hat{A} for the h such that y_i \in \tau_h, and if P_i = (P_i^1, ..., P_i^d)^T and P^j = (P_1^j, ..., P_n^j)^T, we solve
arg min_{\alpha \in R^D} \sum_{j=1}^d || P^j - (X \Delta(\hat{a}_j)) \alpha ||^2 + \mu_1 ||\alpha||_{L^1}
with \Delta(\hat{a}_j) the (p × D)-matrix such that \Delta_{kl}(\hat{a}_j) = \hat{a}_{jl} if t_l \in I_k and 0 otherwise.
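The second step above can be sketched as an ordinary LASSO over one coefficient per interval. Here sklearn's Lasso stands in for the L1 solver, and all names (the interval index array, the stacking of the d problems) are assumptions for illustration, not the paper's code:

```python
# Sketch of the sparse shrinkage step: build X Delta(a_j) for each
# direction j, stack the d regression problems, and fit a LASSO on
# alpha in R^D (one shrinkage weight per interval).
import numpy as np
from sklearn.linear_model import Lasso

def interval_shrinkage(X, P, A_hat, interval, D, mu1=0.01):
    """X: n x p data, P: n x d projections, A_hat: p x d ridge directions,
    interval: length-p array giving the interval index of each t_l."""
    n, p = X.shape
    d = A_hat.shape[1]
    designs, targets = [], []
    for j in range(d):
        # Delta(a_j): (p x D), row l carries a_j[l] in the column of its interval
        Delta = np.zeros((p, D))
        Delta[np.arange(p), interval] = A_hat[:, j]
        designs.append(X @ Delta)                 # n x D design for problem j
        targets.append(P[:, j])
    Z = np.vstack(designs)                        # stack the d problems
    t = np.concatenate(targets)
    lasso = Lasso(alpha=mu1, fit_intercept=False)
    return lasso.fit(Z, t).coef_                  # alpha in R^D
```

On toy data where only one interval carries signal, the corresponding entry of alpha dominates.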
IS-SIR: characteristics
uses the approach based on the correlation formulation (because the dimensionality of the optimization problem is smaller);
uses a shrinkage approach and optimizes the shrinkage coefficients in a single optimization problem;
handles the functional setting by penalizing entire intervals rather than isolated points.
Parameter estimation
H (number of slices): SIR is usually known to be not very sensitive to the number of slices (as long as H > d + 1). We took H = 10 (i.e., 10/30 observations per slice);
\mu_2 and d (ridge estimate \hat{A}):
L-fold CV for \mu_2 (for a d_0 large enough). Note that GCV as described in [Li and Yin, 2008] cannot be used, since the current version of the L2 penalty involves the use of an estimate of \Sigma^{-1};
using again L-fold CV, for every d = 1, ..., d_0, an estimate of R(d) = d - E[Tr(\Pi_d \hat\Pi_d)], in which \Pi_d and \hat\Pi_d are the projectors onto the first d dimensions of the EDR space and of its estimate, is derived similarly as in [Liquet and Saracco, 2012]. The evolution of \hat{R}(d) versus d is studied to select a relevant d;
\mu_1 (LASSO): glmnet is used, in which \mu_1 is selected by CV along the regularization path.
An automatic approach to define intervals
1 Initial state: for every k = 1, ..., p, \tau_k = {t_k}
2 Iterate:
along the regularization path, select three values of \mu_1: P% of the coefficients are zero, P% of the coefficients are non-zero, and best GCV;
define D- ("strong zeros") and D+ ("strong non-zeros");
merge consecutive "strong zeros" (resp. "strong non-zeros"), as well as "strong zeros" (resp. "strong non-zeros") separated by only a small number of intervals of undetermined type;
until no more merges can be performed.
3 Output: a collection of models (the first with p intervals, the last with 1), the GCV-optimal model M*_D, and the corresponding GCV_D versus D (number of intervals).
Final solution: minimize GCV_D over D.
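One merging pass of the iteration above can be sketched as follows. This is my reading of the heuristic, not the authors' code: each current interval carries a label, "+" for "strong non-zero", "-" for "strong zero", "?" for undetermined, and intervals sharing a strong label are merged, absorbing undetermined runs of at most `gap` intervals sitting between them:

```python
# Toy sketch of one merging pass over labelled intervals.
def merge_intervals(labels, gap=1):
    """Return groups of consecutive interval indices after one merge pass."""
    groups, k = [], 0
    while k < len(labels):
        group = [k]
        if labels[k] in "+-":
            j = k + 1
            while j < len(labels):
                # length of the undetermined ("?") run starting at j
                run = 0
                while j + run < len(labels) and labels[j + run] == "?":
                    run += 1
                # merge across the run only if it is short enough and the
                # next strong label matches the current one
                if run <= gap and j + run < len(labels) and labels[j + run] == labels[k]:
                    group.extend(range(j, j + run + 1))
                    j = j + run + 1
                else:
                    break
            k = j
        else:
            k += 1
        groups.append(group)
    return groups
```

For instance, with labels "++?+-" and gap = 1, the first four intervals merge into one group and the "strong zero" stays apart.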
Simulation framework
Data generated with:
Y = \sum_{j=1}^d log(<X, a_j>), with X(t) = Z(t) + \epsilon, in which Z is a Gaussian process with mean \mu(t) = -5 + 4t - 4t^2 and the Matern 3/2 covariance function with parameters \sigma = 0.1 and \theta = 0.2/\sqrt{3}, and \epsilon is a centered Gaussian variable, independent of Z, with standard deviation 0.1;
a_j(t) = sin( t(2+j)\pi/2 - (j-1)\pi/3 ) 1_{I_j}(t);
two models: (M1), d = 1, I_1 = [0.2, 0.4]; (M2), d = 3, I_1 = [0, 0.1], I_2 = [0.5, 0.65] and I_3 = [0.65, 0.78].
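The data-generating process of model (M1) can be sketched by sampling the Gaussian process on a grid through its Matern 3/2 covariance matrix. Two choices below are my assumptions where the slide is ambiguous: the noise \epsilon is added pointwise, and the inner product is passed through log of its absolute value to keep the log defined:

```python
# Sketch of the (M1) simulation: GP with quadratic mean and Matern 3/2
# covariance, sampled via a Cholesky factor of the covariance matrix.
import numpy as np

def simulate_m1(n=100, p=183, sigma=0.1, theta=0.2 / np.sqrt(3), seed=0):
    rng = np.random.default_rng(seed)
    t = np.linspace(0, 1, p)
    mu = -5 + 4 * t - 4 * t**2                     # mean mu(t)
    r = np.sqrt(3) * np.abs(t[:, None] - t[None, :]) / theta
    K = sigma**2 * (1 + r) * np.exp(-r)            # Matern 3/2 covariance
    L = np.linalg.cholesky(K + 1e-10 * np.eye(p))  # jitter for stability
    Z = mu + rng.normal(size=(n, p)) @ L.T         # n GP sample paths
    X = Z + 0.1 * rng.normal(size=(n, p))          # noise epsilon, sd 0.1
    j = 1                                          # (M1): d = 1
    a1 = np.sin(t * (2 + j) * np.pi / 2 - (j - 1) * np.pi / 3)
    a1 *= (t >= 0.2) & (t <= 0.4)                  # support I_1 = [0.2, 0.4]
    y = np.log(np.abs(X @ a1 / p))                 # |.| is an assumption here
    return t, X, y
```

The pointwise standard deviation of the simulated curves is roughly sqrt(sigma^2 + 0.1^2) ≈ 0.14.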
Conclusion
IS-SIR:
a sparse dimension reduction model adapted to the functional framework;
a fully automated definition of relevant intervals in the range of the predictors.
Perspectives:
application to real data;
block-wise sparse SIR?
References

Aneiros, G. and Vieu, P. (2014). Variable selection in infinite-dimensional problems. Statistics and Probability Letters, 94:12–20.

Bernard-Michel, C., Gardes, L., and Girard, S. (2008). A note on sliced inverse regression with regularizations. Biometrics, 64(3):982–986.

Casadebaig, P., Guilioni, L., Lecoeur, J., Christophe, A., Champolivier, L., and Debaeke, P. (2011). SUNFLO, a model to simulate genotype-specific performance of the sunflower crop in contrasting environments. Agricultural and Forest Meteorology, 151(2):163–178.

Ferraty, F., Hall, P., and Vieu, P. (2010). Most-predictive design points for functional data predictors. Biometrika, 97(4):807–824.

Ferré, L. and Yao, A. (2003). Functional sliced inverse regression analysis. Statistics, 37(6):475–488.

Fraiman, R., Gimenez, Y., and Svarc, M. (2015). Feature selection for functional data. Journal of Multivariate Analysis. In press.

Gregorutti, B., Michel, B., and Saint-Pierre, P. (2015). Grouped variable importance with random forests and application to multiple functional data analysis. Computational Statistics and Data Analysis, 90:15–35.

James, G., Wang, J., and Zhu, J. (2009). Functional linear regression that's interpretable. Annals of Statistics, 37(5A):2083–2108.

Li, L. and Nachtsheim, C. (2008). Sparse sliced inverse regression. Technometrics, 48(4):503–510.

Li, L. and Yin, X. (2008). Sliced inverse regression with regularizations. Biometrics, 64:124–131.

Liquet, B. and Saracco, J. (2012). A graphical tool for selecting the number of slices and the dimension of the model in SIR and SAVE approaches. Computational Statistics, 27(1):103–125.

Matsui, H. and Konishi, S. (2011). Variable selection for functional regression models via the L1 regularization. Computational Statistics and Data Analysis, 55(12):3304–3310.

Ni, L., Cook, D., and Tsai, C. (2005). A note on shrinkage sliced inverse regression. Biometrika, 92(1):242–247.

Nathalie Villa-Vialaneix | IS-SIR 26/26