Time-Series Analysis on Multiperiodic Conditional Correlation by Sparse Covariance Selection
Michael Lie
Prof. Suzuki Taiji Lab., Faculty of Science, Department of Information Science, Tokyo Institute of Technology, Japan
February 12, 2015
Agenda
To propose a new statistical model: Sparse Multiperiodic Covariance Selection (M-CovSel)
To propose an optimization method based on the Alternating Direction Method of Multipliers (ADMM)
Covariance Selection
Sparse Covariance Selection
Given i.i.d. samples
$$Y_1, \dots, Y_n \overset{\text{i.i.d.}}{\sim} N_p(\mu, \Sigma),$$
the sparse inverse covariance matrix is estimated by
$$\operatorname*{argmin}_{X \succ 0} \; -\ln\det X + \operatorname{trace}(SX) + \lambda \|X\|_1.$$
Original idea: Dempster (1972)
Application to sparse, high-dimensional matrices: Meinshausen and Bühlmann (2006)
Problem formulation: Banerjee, El Ghaoui and d'Aspremont (2008)
Solution through the graphical lasso: Friedman, Hastie and Tibshirani (2008)
Solution through ADMM: Boyd et al. (2011)
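To make the ADMM route concrete, here is a minimal numpy sketch of sparse covariance selection in the spirit of Boyd et al. (2011, Sec. 6.5); the function names, penalty value, and fixed iteration count are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def soft_threshold(A, kappa):
    """Elementwise soft-thresholding: the proximal operator of kappa * ||.||_1."""
    return np.sign(A) * np.maximum(np.abs(A) - kappa, 0.0)

def covsel_admm(S, lam, rho=1.0, n_iter=200):
    """ADMM for  min_{X > 0}  -ln det X + trace(SX) + lam * ||X||_1  (scaled form)."""
    p = S.shape[0]
    Z = np.eye(p)
    U = np.zeros((p, p))
    for _ in range(n_iter):
        # X-update: solve rho*X - X^{-1} = rho*(Z - U) - S via eigendecomposition.
        d, Q = np.linalg.eigh(rho * (Z - U) - S)
        x = (d + np.sqrt(d ** 2 + 4.0 * rho)) / (2.0 * rho)  # positive eigenvalues
        X = (Q * x) @ Q.T
        Z = soft_threshold(X + U, lam / rho)  # Z-update: elementwise shrinkage
        U = U + X - Z                         # dual update
    return Z

# Tiny demo on random data.
S = np.cov(np.random.randn(200, 5), rowvar=False)
print(np.round(covsel_admm(S, lam=0.1), 2))
```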
Application: Markowitz's Portfolio Selection
Portfolio Selection (Markowitz, 1952)
$$\min_w \; \sigma^2_{p,w} = w^\top S w \quad \text{s.t.} \quad w^\top \mathbf{1} = 1 \quad\therefore\quad w = \frac{S^{-1}\mathbf{1}}{\mathbf{1}^\top S^{-1}\mathbf{1}}.$$
Here, the inverse of the empirical covariance, $S^{-1}$, is needed!
The existing covariance selection works at a single fixed time
⇒ covariance selection over time series is needed!
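As a quick sanity check of the closed form above, a minimal numpy example on synthetic returns (the data are illustrative, not from the slides):

```python
import numpy as np

# Synthetic daily returns for 4 assets (rows: days, columns: assets).
rng = np.random.default_rng(0)
R = rng.normal(0.0, 0.01, size=(250, 4))

S = np.cov(R, rowvar=False)        # empirical covariance matrix
ones = np.ones(S.shape[0])
w = np.linalg.solve(S, ones)       # S^{-1} 1, without forming the inverse explicitly
w /= ones @ w                      # w = S^{-1} 1 / (1' S^{-1} 1)

print("weights:", np.round(w, 3), "(sum = %.3f)" % w.sum())
print("portfolio variance:", w @ S @ w)
```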
Intuition
Figure: Existing Model
By estimating X, we can construct the portfolio.
Figure: Our Model
$$S_{ij} := \frac{1}{n} \sum_{k,l} (y_{k,i} - \hat{\mu}_i)(y_{l,j} - \hat{\mu}_j).$$
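A sketch of how these block covariances could be assembled in numpy; the (n, T, p) data layout (n samples, T periods, p variables per period) is my assumption about the slide's setting, not something it states:

```python
import numpy as np

def block_covariances(Y):
    """Empirical cross-period covariance blocks S[i, j] from data of shape
    (n, T, p): n samples, T periods, p variables per period (assumed layout)."""
    n, T, p = Y.shape
    C = Y - Y.mean(axis=0)                    # center each period's variables
    S = np.empty((T, T, p, p))
    for i in range(T):
        for j in range(T):
            S[i, j] = C[:, i, :].T @ C[:, j, :] / n
    return S

S = block_covariances(np.random.randn(100, 5, 3))
print(S.shape)  # (5, 5, 3, 3)
```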
Problem Formulation
Consider a stationary time process whose multiperiodic inverse covariance matrix $X$ (of size $Tp \times Tp$: $T$ periods of $p$ variables each) can be expressed in blocks as
$$X = \begin{pmatrix}
X_{11} & X_{12} & X_{13} & \cdots & X_{1,T} \\
X_{12} & X_{22} & X_{23} & \cdots & X_{2,T} \\
X_{13} & X_{23} & X_{33} & \cdots & X_{3,T} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
X_{1,T} & X_{2,T} & X_{3,T} & \cdots & X_{T,T}
\end{pmatrix}.$$
Assumption: the process is stationary in time, so that $X_{i,i+h} = X_{j,j+h}$ for all $i, j$.
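For instance, with $T = 3$ this assumption forces a block-Toeplitz structure; writing $A$, $B$, $C$ for the blocks at lags 0, 1, 2 (shorthand of mine, not the slides'):

```latex
X =
\begin{pmatrix}
A & B & C \\
B & A & B \\
C & B & A
\end{pmatrix},
\qquad
A := X_{11} = X_{22} = X_{33}, \quad
B := X_{12} = X_{23}, \quad
C := X_{13}.
```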
Sparse Multiperiodic Covariance Selection (M-CovSel):
$$\operatorname*{argmin}_{X \succ 0} f(X) := \operatorname*{argmin}_{X \succ 0} \; -\ln\det X + \sum_{i,j} \operatorname{trace}(S_{ij} X_{ij}) + \lambda_1 \sum_{i,j} \|X_{ij}\|_1 + \lambda_2 \sum_{i,j} \sum_{k>i,\, l>j} \|X_{ij} - X_{kl}\|_F^2$$
subject to $X_{i,i+h} = X_{j,j+h}$ for all $i, j$.
Here $\|w\|_1 = \sum_i |w_i|$ and $\|w\|_F^2 = \sum_i |w_i|^2$.
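For reference, a literal numpy transcription of $f(X)$; the $T \times T$ grid of $p \times p$ blocks and the index convention in the fusion term follow my reading of the formula, so treat this as a sketch for checking a solver rather than the authors' code:

```python
import numpy as np

def mcovsel_objective(X, S, T, p, lam1, lam2):
    """f(X) = -ln det X + sum_ij tr(S_ij X_ij) + lam1 * sum_ij ||X_ij||_1
              + lam2 * sum_ij sum_{k>i, l>j} ||X_ij - X_kl||_F^2,
    where block (i, j) of a (T*p, T*p) matrix M is M[i*p:(i+1)*p, j*p:(j+1)*p]."""
    blk = lambda M, i, j: M[i*p:(i+1)*p, j*p:(j+1)*p]
    val = -np.linalg.slogdet(X)[1]
    for i in range(T):
        for j in range(T):
            Xij = blk(X, i, j)
            val += np.trace(blk(S, i, j) @ Xij) + lam1 * np.abs(Xij).sum()
            for k in range(i + 1, T):
                for l in range(j + 1, T):
                    val += lam2 * ((Xij - blk(X, k, l)) ** 2).sum()
    return val

# Tiny demo: evaluate at X = identity with a random-data S.
T, p = 3, 2
S = np.cov(np.random.randn(50, T * p), rowvar=False)
print(mcovsel_objective(np.eye(T * p), S, T, p, lam1=0.01, lam2=0.01))
```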
We separate our model into two parts:
$$f(X) \equiv g(X) + h(X),$$
$$g(X) = -\ln\det X + \sum_{i,j} \operatorname{trace}(S_{ij} X_{ij}),$$
$$h(X) = \lambda_1 \sum_{i,j} \|X_{ij}\|_1 + \lambda_2 \sum_{i,j} \sum_{k>i,\, l>j} \|X_{ij} - X_{kl}\|_F^2.$$
$g(X)$: twice differentiable and strictly convex.
$h(X)$: convex but non-differentiable.
Figure: H, the stationary-time constraint matrix (it appears in the constraint HX = 0 below).
Figure: All D, the time-difference matrix.
Figure: Simplified D, the time-difference matrix.
We reformulate the problem as
$$\text{minimize } g(X) + h(\tilde{Z}) \quad \text{subject to } X = Z_1,\; DX = Z_2,\; HX = 0 \;\Longleftrightarrow\; \tilde{X} = \tilde{Z},$$
where
$$g(X) = -\ln\det X + \sum_{i,j} \operatorname{trace}(S_{ij} X_{ij}), \qquad h(\tilde{Z}) = \lambda_1 \|Z_1\|_1 + \lambda_2 \|Z_2\|_F^2,$$
$$\tilde{X} = \begin{pmatrix} X \\ DX \\ HX \end{pmatrix}, \qquad \tilde{Z} = \begin{pmatrix} Z_1 \\ Z_2 \\ 0 \end{pmatrix}.$$
Alternating Direction Method of Multipliers (ADMM)
Solving Through ADMM
Algorithm 1 Overview of ADMM
1: for k = 0, 1, ... do
2:   X̃-update:
3:     Compute W^(0) = (X^(0))^(-1).
4:     for t = 1, 2, ... do
5:       Compute the steepest-descent direction d = −∇G(X̃).
6:       Use an Armijo-rule step-size selection to find α such that X^(t+1) = X^(t) + αd^(t) is positive definite and the objective value sufficiently decreases.
7:       Update X̃.
8:     end for
9:   Z̃-update:
10:    Update Z1: Z1^(k+1) = S_{λ1/ρ}(X^(k+1) + Y1^(k)/ρ)
11:    Update Z2: Z2^(k+1) = (ρ DX^(k+1) + Y2^(k)) / (2λ2 + ρ)
12:  Y-update: Y^(k+1) = Y^(k) + ρ(X̃^(k+1) − Z̃^(k+1))
13: end for
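Lines 10-12 of Algorithm 1 are closed-form and translate directly to numpy; a sketch under my own naming (X and DX are the current primal blocks, Y1 and Y2 the matching dual blocks), not the authors' code:

```python
import numpy as np

def soft_threshold(A, kappa):
    # S_kappa(A): proximal operator of kappa * ||.||_1, as in the earlier sketch.
    return np.sign(A) * np.maximum(np.abs(A) - kappa, 0.0)

def z_update(X, DX, Y1, Y2, rho, lam1, lam2):
    """Closed-form Z-updates (lines 10-11 of Algorithm 1)."""
    Z1 = soft_threshold(X + Y1 / rho, lam1 / rho)
    Z2 = (rho * DX + Y2) / (2.0 * lam2 + rho)
    return Z1, Z2

def y_update(Y1, Y2, X, DX, Z1, Z2, rho):
    """Dual ascent step (line 12): Y <- Y + rho * (X_tilde - Z_tilde), blockwise."""
    return Y1 + rho * (X - Z1), Y2 + rho * (DX - Z2)
```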
Recall the problem:
$$\text{minimize } g(X) + h(\tilde{Z}) \quad \text{subject to } X = Z_1,\; DX = Z_2,\; HX = 0 \;\Longleftrightarrow\; \tilde{X} = \tilde{Z}.$$
Its augmented Lagrangian (in scaled form) is
$$L_\rho(\tilde{X}, \tilde{Z}, Y) = g(\tilde{X}) + h(\tilde{Z}) + \frac{\rho}{2} \left\| \tilde{X} - \tilde{Z} + \frac{Y}{\rho} \right\|_F^2,$$
with
$$g(X) = -\ln\det X + \sum_{i,j} \operatorname{trace}(S_{ij} X_{ij}), \qquad h(\tilde{Z}) = \lambda_1 \|Z_1\|_1 + \lambda_2 \|Z_2\|_F^2.$$
1. X̃-update:
$$\tilde{X}^{(k+1)} := \operatorname*{argmin}_{\tilde{X}} \; -\ln\det X + \sum_{i,j} \operatorname{trace}(S_{ij} X_{ij}) + \frac{\rho}{2} \left\| \tilde{X} - \tilde{Z}^{(k)} + \frac{Y^{(k)}}{\rho} \right\|_F^2,$$
2. Z̃-update:
$$\tilde{Z}^{(k+1)} := \operatorname*{argmin}_{\tilde{Z}} \; \lambda_1 \|Z_1\|_1 + \lambda_2 \|Z_2\|_F^2 + \frac{\rho}{2} \left\| \tilde{X}^{(k+1)} - \tilde{Z} + \frac{Y^{(k)}}{\rho} \right\|_F^2,$$
3. Y-update:
$$Y^{(k+1)} := Y^{(k)} + \rho\,(\tilde{X}^{(k+1)} - \tilde{Z}^{(k+1)}).$$
X̃ Update
The subproblem
$$\tilde{X}^{(k+1)} := \operatorname*{argmin}_{\tilde{X}} \; -\ln\det X + \sum_{i,j} \operatorname{trace}(S_{ij} X_{ij}) + \frac{\rho}{2} \left\| \tilde{X} - \tilde{Z}^{(k)} + \frac{Y^{(k)}}{\rho} \right\|_F^2$$
is solved by steepest gradient descent, as given in lines 2-8 of Algorithm 1.
Algorithm 2 X̃ Update
1: Compute W^(0) = (X^(0))^(-1).
2: for t = 1, 2, ... do
3:   Compute the steepest-descent direction d = −∇G(X̃).
4:   Use an Armijo-rule step-size selection to find α such that X^(t+1) = X^(t) + αd^(t) is positive definite and the objective value sufficiently decreases.
5:   Update X̃.
6: end for
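A minimal numpy sketch of the Armijo-rule step-size selection with the positive-definiteness check in line 4; the defaults (beta, sigma) and function names are my own choices, not the slides':

```python
import numpy as np

def is_pos_def(X):
    """Cheap positive-definiteness test via an attempted Cholesky factorization."""
    try:
        np.linalg.cholesky(X)
        return True
    except np.linalg.LinAlgError:
        return False

def armijo_step(f, X, d, grad, alpha0=1.0, beta=0.5, sigma=1e-4, max_halvings=50):
    """Backtracking line search: shrink alpha until X + alpha*d stays positive
    definite and f decreases sufficiently (the Armijo condition)."""
    fX = f(X)
    slope = np.sum(grad * d)   # <grad f(X), d>; negative for a descent direction
    alpha = alpha0
    for _ in range(max_halvings):
        X_new = X + alpha * d
        if is_pos_def(X_new) and f(X_new) <= fX + sigma * alpha * slope:
            return alpha, X_new
        alpha *= beta
    raise RuntimeError("Armijo line search did not find an acceptable step")
```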
Z̃ Update
$$\tilde{Z}^{(k+1)} := \operatorname*{argmin}_{\tilde{Z}} \; \lambda_1 \|Z_1\|_1 + \lambda_2 \|Z_2\|_F^2 + \frac{\rho}{2} \left\| \tilde{X}^{(k+1)} - \tilde{Z} + \frac{Y^{(k)}}{\rho} \right\|_F^2.$$
This problem separates into the two subproblems below:
$$Z_1^{(k+1)} := \operatorname*{argmin}_{Z_1} \; \lambda_1 \|Z_1\|_1 + \frac{\rho}{2} \left\| X^{(k+1)} - Z_1 + \frac{Y_1^{(k)}}{\rho} \right\|_F^2,$$
$$Z_2^{(k+1)} := \operatorname*{argmin}_{Z_2} \; \lambda_2 \|Z_2\|_F^2 + \frac{\rho}{2} \left\| DX^{(k+1)} - Z_2 + \frac{Y_2^{(k)}}{\rho} \right\|_F^2.$$
Solution of the Z̃ Update
The first subproblem is solved by the soft-thresholding operator:
$$Z_1^{(k+1)} = S_{\lambda_1/\rho}\left( X^{(k+1)} + \frac{Y_1^{(k)}}{\rho} \right),$$
and the second has the closed-form solution
$$Z_2^{(k+1)} = \frac{\rho\, DX^{(k+1)} + Y_2^{(k)}}{2\lambda_2 + \rho}.$$
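The $Z_2$ formula follows by setting the gradient of the second (differentiable) subproblem to zero, a step worth spelling out:

```latex
0 = \nabla_{Z_2}\left[ \lambda_2 \|Z_2\|_F^2
      + \frac{\rho}{2} \left\| DX^{(k+1)} - Z_2 + \frac{Y_2^{(k)}}{\rho} \right\|_F^2 \right]
  = 2\lambda_2 Z_2 - \rho\left( DX^{(k+1)} - Z_2 \right) - Y_2^{(k)}
\;\Longrightarrow\;
(2\lambda_2 + \rho)\, Z_2 = \rho\, DX^{(k+1)} + Y_2^{(k)}.
```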
Numerical Results
Execution environment:
Intel Core i7-4770 CPU @ 3.40GHz (8 CPUs)
8 GB RAM
R ver. 3.3.65126.0
OS: Windows 7 Professional 64-bit (6.1, build 7601)
We verify:
Convergence speed
Sparsity of the estimates
using random data sets and real data.
Figure: All D vs. Simplified D.
Figure: Runtime for n = 10, λ1 = 0.01, λ2 = 0.01.
Figure: (i) Objective Values, (ii) Primal Residuals, and (iii) Dual Residuals for n = 10, T = 5, λ1 = 0.01, λ2 = 0.01.
Figure: The sparsity pattern of the estimates from the model with n = 10, T = 5, λ1 = 0.01, λ2 = 0.01.
Analysis of real data
Stock data of 50 randomly selected companies from NASDAQ
Period: 4 January 2011 to 31 December 2014
Ticker | Name | Sector
PDCO | Patterson Companies, Inc. | Health Care
OMER | Omeros Corporation | Health Care
HEAR | Turtle Beach Corporation | Consumer Durables
QBAK | Qualstar Corporation | Technology
UTHR | United Therapeutics Corporation | Health Care
PLCE | The Children's Place Retail Stores, Inc. | Consumer Services
SUSQ | Susquehanna Bancshares, Inc. | Finance
IDCC | InterDigital, Inc. | Miscellaneous
ELON | Echelon Corporation | Technology
BGCP | BGC Partners, Inc. | Finance
MRGE | Merge Healthcare Incorporated | Technology
TISA | Top Image Systems, Ltd. | Technology
IPXL | Impax Laboratories, Inc. | Health Care
ROVI | Rovi Corporation | Miscellaneous
IBCP | Independent Bank Corporation | Finance
BABY | Natus Medical Incorporated | Health Care
HFFC | HF Financial Corp. | Finance
ISLE | Isle of Capri Casinos, Inc. | Consumer Services
ITIC | Investors Title Company | Finance
SLGN | Silgan Holdings Inc. | Consumer Durables
ZIOP | ZIOPHARM Oncology Inc | Health Care
MXIM | Maxim Integrated Products, Inc. | Technology
NEPT | Neptune Technologies & Bioresources Inc | Health Care
UTMD | Utah Medical Products, Inc. | Health Care
… (remaining companies omitted)
Figure: (i) Objective Values, (ii) Primal Residuals, and (iii) Dual Residuals for T = 5, λ1 = 0.01, λ2 = 0.01 on real stock data.
Figure: The sparsity pattern of the estimates from the model with T = 5, λ1 = 0.01, λ2 = 0.01 on real stock data.
Figure: The covariance matrix plot of the estimates from the model with T = 5, λ1 = 0.01, λ2 = 0.01 on real stock data.
Figure: Negative covariance values of the estimates from the model with T = 5, λ1 = 0.01, λ2 = 0.01 on real stock data.
Figure: Negative covariance values of the estimates from the model with T = 5, λ1 = 0.01, λ2 = 0.01 on real stock data (zoomed in on T = 1).
Figure: The weak positivity of the estimates from the model with T = 5, λ1 = 0.01, λ2 = 0.01 on real stock data.
Figure: The weak positivity of the estimates from the model with T = 5, λ1 = 0.01, λ2 = 0.01 on real stock data (zoomed in on T = 1).
Conclusion and Discussion
Conclusions:
The ADMM algorithm with steepest gradient descent for the X̃-update successfully minimized our objective function f(X).
Computation time grows considerably as T increases.
Discussion:
Replace steepest gradient descent with a Newton direction; cf. QUIC.
Use block coordinate descent, as in BIG & QUIC.
Introduce a decay constant in D.
References I
[De72] Dempster, A. P. (1972). Covariance Selection. Biometrics 28 157-175.
[MB06] Meinshausen, N. and Bühlmann, P. (2006). High-dimensional graphs and
variable selection with the Lasso. Annals of Statistics 34 1436-1462.
[BG08] Banerjee, O., El Ghaoui, L. and d'Aspremont, A. (2008). Model selection
through sparse maximum likelihood estimation for multivariate Gaussian
or binary data. Journal of Machine Learning Research 9 485-516.
[Ti08] Friedman, J., Hastie, T. and Tibshirani, R. (2008). Sparse inverse
covariance estimation with the graphical Lasso. Biostatistics 9 432-441.
[Ma52] Markowitz, H. (1952). Portfolio Selection. The Journal of Finance 7 77-91.
[Ti96] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso.
Journal of the Royal Statistical Society: Series B 58 267-288.
[Bo11] Boyd, S., Parikh, N., Chu, E., Peleato, B. and Eckstein, J. (2011).
Distributed optimization and statistical learning via the alternating
direction method of multipliers. Foundations and Trends in Machine
Learning 3 1-122.
References II
[Hs13] Hsieh, C. J., Sustik, M. A., Dhillon, I., Ravikumar, P. and Poldrack, R.
(2013). BIG & QUIC: Sparse inverse covariance estimation for a million
variables. In Advances in Neural Information Processing Systems
3165-3173.
[Bv11] Bühlmann, P. and van de Geer, S. (2011). Statistics for High-Dimensional
Data: Methods, Theory and Applications. Springer-Verlag, Berlin.
[WB12] Wahlberg, B., Boyd, S., Annergren, M. and Wang, Y. (2012). An ADMM
algorithm for a class of total variation regularized estimation problems.
ArXiv:1203.1828.