Pricing VIX Options with Multifactor Stochastic
Volatility∗
Pascal M. Caversaccio†
First Draft: May 16, 2014
This Version: June 13, 2016
Abstract
By exploiting the flexibility of the Wishart process, we propose an application of this framework
to the pricing of Chicago Board Options Exchange (CBOE) volatility index (VIX) options. Our
methodology is analytically tractable and yet flexible enough to efficiently price CBOE VIX
options. In particular, the dynamics for the CBOE VIX is carried out in a linear affine way and
the discounted Laplace transform exhibits an exponentially affine property. The tractable model
structure lightens the computational burden and facilitates a fast identification of the parameter
estimates. We empirically show that modeling the stochastic co-volatility factor can significantly
improve the in-sample fitting results due to the improved modeling of higher conditional moments
in the underlying transition probability density.
Keywords: Matrix affine diffusion, multifactor stochastic volatility, stochastic correlation, VIX option
pricing, Wishart process.
JEL classification: C51, C52, G12, G13.
∗
The author thanks Chris Bardgett, Elise Gourier, Meriton Ibraimi, Markus Leippold, Stefan Pomberger, Lujing
Su, and Nikola Vasiljević for helpful comments, and Sergio Maffioletti for providing guidance and access to the cloud
infrastructure of the University of Zurich. Any remaining errors are mine.
†
University of Zurich – Department of Banking and Finance, Plattenstrasse 14, 8032 Zurich, Switzerland. E-Mail:
pascalmarco.caversaccio@uzh.ch.
1 Introduction
This paper exploits the analytical tractability and the flexibility of Wishart processes to price
Chicago Board Options Exchange (CBOE) volatility index (VIX) options efficiently. The CBOE VIX
reflects the market's expectation of the 30-day forward Standard & Poor's 500 index (SPX)
volatility and serves as a proxy for investor sentiment. Given this direct exposure to volatility,
VIX options have seen extended use in risk management because they provide an immediate link to
variance without the need to vega-hedge the underlying SPX. The increased demand for volatility
derivatives over recent years, notably for VIX options, calls for a pricing approach for this
class of derivatives that remains consistent with the underlying index.¹
Since the introduction of VIX futures in 2004 and plain vanilla option contracts on the VIX in 2006,
many research papers have faced the challenge of pricing this particular class of products. Specifically,
VIX option values depend by definition on the SPX through its options and therefore, for the
sake of consistent modeling, one needs to incorporate the SPX dynamics into the pricing.
The paper of Sepp (2008) provides an exact analytical formula for SPX– and VIX options under
the assumption of a stochastic volatility process with volatility jumps but no jumps in the asset
return. Another interesting approach is taken in Albanese et al. (2009), where they develop a pricing
framework that can simultaneously handle European options, forward-starts, options on the realized
variance, and options on the VIX. To do so, they make use of spectral methods. Papanicolaou and
Sircar (2014) introduce a regime-switching Heston model to widen the support of the bulk of the
volatility process’ probability distribution. Bardgett et al. (2014) develop a three-factor affine model
which yields semi-closed expressions for SPX– and VIX options. A non-affine approach is taken
in Baldeaux and Badran (2014), where the authors extend the pure diffusion 3/2 model to jumps
in the asset returns for the consistent modeling of SPX and VIX derivatives. Furthermore, we
refer, among others, to Bergomi (2008), Cont and Kokholm (2011), Kallsen et al. (2011), Drimus
and Farkas (2013), Goard and Mazur (2013), and Branger et al. (2014) for recent papers covering
volatility options in general and VIX options in particular.
¹ See Figure 8 in the supplementary appendix for the time evolution of the VIX option volume and the corresponding
open interest.
In terms of empirical investigations, Mencía and Sentana (2013) conduct an extensive empirical
analysis of VIX derivative valuation models before, during, and after the 2008–2009 financial crisis,
where their results indicate that stochastic volatility plays a much more important role for VIX
options than for VIX futures. Moreover, there is also quite consistent empirical evidence for the
fact that the implied volatility surface (IVS) of index options and the corresponding term structure
of variance swaps, which are intimately connected to the VIX, are driven by more than one latent
risk factor, see, e.g., Christoffersen et al. (2009) and Egloff et al. (2010). To capture this feature,
Christoffersen et al. (2009) introduce a two-factor Heston model which is able to generate a wide
degree of variation in the stochastic correlation between the log-returns and the volatility shocks.
Even though the stochastic correlation structure implied by their setting is restricted by two artificial
boundaries, their empirical findings show that two-factor stochastic volatility models improve
substantially on single-factor models in explaining the cross-sectional and time series patterns of index
option implied volatilities. These findings raise the question of whether a multivariate stochastic
volatility model of the Wishart type, with rich conditional dependence structures between risk
factors, can be (efficiently) employed in the pricing of VIX options.
The Wishart pure mean-reverting diffusion process of Bru (1991), which belongs to the class of
matrix affine diffusion (MAD), is studied in Da Fonseca et al. (2007), Da Fonseca et al. (2008),
Gouriéroux et al. (2009), Gouriéroux and Sufana (2010), and Da Fonseca et al. (2014) to price
financial equity derivatives. Leippold and Trojani (2010) extend this framework and allow for a rich
structure of jumps in the dynamics. The Wishart affine stochastic correlation model, developed by
Da Fonseca et al. (2007), provides prices for vanilla options consistent with the observed smile– and
skew pattern, while making it possible to detect and quantify the correlation risk in multiple-asset
derivatives like basket options. Since a stochastic SPX option pricing model embodies an implicit
dynamics for the VIX, it depends on the specification of the volatility– and jump structure in the
setting whether the model can accurately replicate the VIX dynamics. Motivated by the crucial
findings of Da Fonseca et al. (2007), we carry over the technique of highly flexible and tractable
MAD modeling to the VIX market.
2 Empirical Stylized Facts²
We emphasize first that the VIX market features some distinctive properties compared to the SPX
market. First, the underlying of VIX options is not the current VIX spot price itself but rather the
corresponding futures contract. This fact implies that no-arbitrage considerations of VIX options
must be examined relative to the VIX futures. Moreover, because the VIX itself is not a tradable
asset, there is no cost-of-carry relationship between VIX futures and the spot VIX value (see, e.g.,
Zhang and Zhu (2006) and Zhu and Lian (2012)), i.e. $F(t, T, \mathrm{VIX}_t = x) \neq x e^{r_{t,T}(T-t)}$, where $r_{t,T}$ is
the risk-free interest rate for the time period $T - t$. This circumstance clearly affects the put-call
parity for VIX options, which can however be modified trivially (see, for instance, Lian and Zhu
(2013)) and is given by
$$C(t, T, \mathrm{VIX}_t = x) - P(t, T, \mathrm{VIX}_t = x) = e^{-r_{t,T}(T-t)} F(t, T, \mathrm{VIX}_t = x) - K e^{-r_{t,T}(T-t)}. \tag{1}$$
The sole difference to the ordinary put-call parity is the replacement of the underlying price by the
discounted forward volatility, which can be traded via the VIX futures contract.
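As a small illustration of the modified parity (1), the following sketch computes the parity value of a call minus a put from a VIX futures price. The function names and all quotes are hypothetical and serve only to make the arithmetic concrete.

```python
import math

def vix_call_minus_put(futures_price, strike, rate, tau):
    """Right-hand side of (1): the discounted difference F - K."""
    return math.exp(-rate * tau) * (futures_price - strike)

def parity_gap(call, put, futures_price, strike, rate, tau):
    """(C - P) minus the parity value; zero when (1) holds exactly."""
    return (call - put) - vix_call_minus_put(futures_price, strike, rate, tau)

# Hypothetical quotes, for illustration only (not market data).
F, K, r, tau = 18.5, 20.0, 0.01, 30 / 365
put = 2.10
call = put + vix_call_minus_put(F, K, r, tau)  # call price implied by parity
gap = parity_gap(call, put, F, K, r, tau)
```

Note that the futures price, not the spot VIX, enters the check, which is exactly the modification described above.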
Figure 1 displays the joint evolution of the SPX and VIX. The figure implies a strong negative
stochastic correlation, i.e. a drop in the SPX is followed by upward moves of time-varying amplitudes
in the VIX and vice versa. Furthermore, we can deduce a mean-reverting behavior in the VIX
dynamics.
[Figure 1 about here.]
To substantiate this negative stochastic linkage between the SPX and VIX empirically, we
take a closer look at the correlation coefficients. Figure 2 depicts the rolling correlation coefficients
between the SPX log-returns and the CBOE VIX level increments for four different window
sizes, measured in days, $N \in \{25, 50, 100, 250\}$. As expected, the correlation coefficients are
strongly negative. Notice that the negative correlation coefficients induce the leverage
effect³ between the VIX and SPX since negative returns are often observed together with a rise of
volatility.
² For further empirical stylized facts concerning the SPX and VIX (options) markets we refer to Appendix D in the
supplementary appendix.
[Figure 2 about here.]
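The rolling correlations described above can be sketched as follows. This is a minimal illustration on synthetic data; the series `spx_ret` and `vix_inc` are hypothetical stand-ins for the SPX log-returns and VIX level increments, and `rolling_corr` is a helper introduced here, not part of the paper.

```python
import numpy as np

def rolling_corr(x, y, window):
    """Pearson correlation of x and y over each trailing window of `window` observations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    out = np.full(x.shape, np.nan)  # undefined until a full window is available
    for end in range(window, len(x) + 1):
        out[end - 1] = np.corrcoef(x[end - window:end], y[end - window:end])[0, 1]
    return out

# Synthetic stand-in data with a built-in negative dependence (assumption).
rng = np.random.default_rng(0)
n_days = 600
spx_ret = 0.01 * rng.standard_normal(n_days)
vix_inc = -80.0 * spx_ret + 0.5 * rng.standard_normal(n_days)

# One rolling-correlation series per window size N used in the text.
corr = {N: rolling_corr(spx_ret, vix_inc, N) for N in (25, 50, 100, 250)}
```

Shorter windows react faster but are noisier, which is why several window sizes are compared side by side.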
Table 1 reports the first four moments of the SPX log-returns and the VIX levels. We
observe a negative skewness and a high kurtosis for the log-returns of the SPX, which
reflects the empirically observed leptokurtic behavior. Moreover, the VIX levels follow a right-skewed
and leptokurtic distribution. To replicate these features we can introduce a persistent stochastic
volatility process which slows down the convergence of the return distribution to normality and
exhibits a right-skewed transition probability density for the volatility process.
[Table 1 about here.]
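For reference, sample versions of the four moments can be computed as in the sketch below. Two assumptions are made here: the kurtosis is reported in its non-excess form (a normal distribution gives 3), which may differ from the convention used in Table 1, and the VIX-like levels are synthetic lognormal draws rather than market data.

```python
import numpy as np

def first_four_moments(sample):
    """Return (mean, standard deviation, skewness, kurtosis) of a sample.

    Population (biased) estimators are used; kurtosis is non-excess.
    """
    x = np.asarray(sample, dtype=float)
    mean = x.mean()
    centered = x - mean
    var = np.mean(centered ** 2)
    skew = np.mean(centered ** 3) / var ** 1.5
    kurt = np.mean(centered ** 4) / var ** 2
    return mean, np.sqrt(var), skew, kurt

# Synthetic right-skewed "levels" (assumption), mimicking the shape described above.
rng = np.random.default_rng(0)
levels = rng.lognormal(mean=3.0, sigma=0.3, size=5000)
vix_mean, vix_std, vix_skew, vix_kurt = first_four_moments(levels)
```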
Now we take a look at the implied volatility skews of SPX and VIX options.⁴ Since it is difficult
to predict future dividends on the SPX accurately, we define the moneyness in both markets in
terms of futures prices. The log-moneyness is defined, for $i \in \{S, V\}$, by
$$m := \ln\left(K / F_t^i(T)\right),$$
where $K$ is the strike level of the European-style option and $F_t^i(T)$ denotes the closing SPX (resp.
VIX) futures price if $i = S$ (resp. $i = V$) today at time $t$ with maturity $T$. The empirical results
indicate very different shapes of the IVS slice for a fixed time-to-maturity. Indeed, as depicted in
Figure 3, the backed-out implied volatilities of the SPX options are negatively skewed, i.e. decreasing
with increasing log-moneyness for $m \lesssim 0.25$ and slightly increasing for $m \gtrsim 0.25$. The reverse is true
for VIX options. Their IVS slices are in most cases increasing with increasing log-moneyness $m$ for
a fixed time-to-maturity. The behavioral explanation of the positive skew patterns can be traced
back to the fact that out-of-the-money (OTM) call options on the VIX provide protection against a
market crash. As a consequence, the call option writer enters a risky position and charges additional
compensation for taking this risk, making the implied volatility higher for high-strike call options.
Similar behavior is observed in the SPX market, where however it is the writer of an OTM put option
who charges extra compensation. Hence, according to these observations VIX OTM call option prices
exhibit a significant time value. Translated into statistical terms, the volatility density implied by the
VIX options has more mass concentrated at high volatility levels than at low volatility levels. This
observation is essentially the major reason why the original Heston (1993) model with a chi-squared
density fails to reproduce this market feature (cf. Gatheral (2008), who was the first to point out that
VIX option prices are inconsistent with the Heston (1993) dynamics). To replicate this stylized fact,
we introduce in the following section a multifactor stochastic volatility model of the Wishart type which
exhibits a higher flexibility in the underlying transition probability density. Furthermore, as shown
in Christoffersen et al. (2009), stochastic correlation is an important model feature which helps to
reproduce the time variation in the implied volatility smirk. Since the Wishart process is equipped
with stochastic correlations, it is a natural candidate for modeling such a stylized fact.
³ The terminology "leverage effect" can be traced back to Black (1976). Economically speaking, the leverage effect
arises from the negative correlation between the stock price and its volatility. When the value of a stock drops,
the volatility of its returns tends to increase. Indeed, if the stock price decreases, the debt-to-equity ratio increases
and the risk of the firm therefore increases, which translates into a higher volatility for the firm. For other possible
explanations of the leverage effect we refer, among others, to Haugen et al. (1991), Campbell and Hentschel
(1992), Campbell and Kyle (1993), and Bekaert and Wu (2000).
⁴ The exclusionary criteria applied to the option data are outlined in Appendix C of the supplementary appendix.
[Figure 3 about here.]
3 The Multifactor Stochastic Volatility Model
In this section we derive closed-form solutions by means of transform methods for VIX– and SPX
options under multifactor stochastic volatility. A closed-form expression for VIX futures is also
presented.
3.1 Probabilistic Setup and Notation
Let $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$ be a filtered probability space, equipped with a standard Brownian motion
$W : [0, T] \times \Omega \to \mathbb{R}$, whose filtration $\mathbb{F}$ satisfies the usual conditions of right-continuity and
completeness. Moreover, we assume that $\mathbb{P}$ is the historical measure, $\mathcal{F}_0$ is $\mathbb{P}$-trivial, and we work on
the finite time horizon $[0, T]$, $T < \infty$. By $\mathbb{R}_{\geq 0}$ we denote the non-negative real numbers, i.e.
$\mathbb{R}_{\geq 0} := [0, \infty)$, by $\mathbb{R}_+$ the positive real numbers, i.e. $\mathbb{R}_+ := (0, \infty)$, and $S_k^+$ represents, for any
$k \in \mathbb{N}$, the cone of symmetric positive semi-definite $k \times k$ matrices. Furthermore, the operator
$\mathrm{tr} : \mathbb{R}^{k \times k} \to \mathbb{R}$, $Q \mapsto \mathrm{tr}(Q) = \sum_{i=1}^{k} Q_{ii}$ is the trace of a $k \times k$ matrix $Q$, and $\exp[Q]$ denotes the matrix
exponential of a real or complex $k \times k$ matrix $Q$ given by the power series
$$\exp[Q] := \sum_{i=0}^{\infty} \frac{Q^i}{i!}.$$
In addition, $\top$ stands for the transpose of a matrix (or vector), $C^k$ is the set of functions with continuous
derivatives up to the $k$-th order, $k \in \mathbb{N}_0$, $\mathrm{Re}[\cdot]$ and $\mathrm{Im}[\cdot]$ represent the real and imaginary parts
of a complex number $k = \mathrm{Re}[k] + i\,\mathrm{Im}[k]$, with $i := \sqrt{-1}$, and the identity matrix is denoted, for
$k \in \mathbb{N}$, by $I_k := \mathrm{diag}(1, 1, \ldots, 1)$ with $\mathrm{diag}(\cdot)$ the diagonal matrix. Finally, we denote the futures
price of the SPX by $F = \{F_t : t \in [0, T]\}$ and its logarithmic price process by
$S = \{S_t : t \in [0, T]\} = \{\ln(F_t) : t \in [0, T]\}$.
3.2 Model Design
This section introduces the flexible class of MAD that we use. Since we aim for a parsimonious
option pricing model that avoids over-parameterization, the model exhibits no
additional jump structure. Nonetheless, notice that our framework can easily be extended to jumps
in both the asset return and the volatility process by building on an earlier contribution of Leippold
and Trojani (2010), who provide the analytical transform analysis for this class of models.
We assume the existence of a probability measure $\mathbb{Q} \sim \mathbb{P}$ such that the SPX is a martingale with
respect to $\mathbb{Q}$, i.e. $\mathbb{Q}$ is a risk-neutral measure. Guided by the stylized facts in Section 2, we directly
specify the SPX dynamics under $\mathbb{Q}$. Let us remark that we do not consider the SPX spot price itself
as the underlying of SPX options but rather the corresponding futures contract. Note that since the
futures price $F_t(T)$ converges to the SPX spot price as $t \to T$, a European option on the underlying
spot is the same as a European option on the corresponding futures contract (maturing at the same
time as the option). To retain the analytical tractability and to preserve enough flexibility to account
for the different empirical evidence, we stay in an affine setting.
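Assumption 3.1. Let $M, Q \in \mathbb{R}^{2 \times 2}$, let $W^{\mathbb{Q}}$ and $B^{\mathbb{Q}}$ be matrices of standard Brownian motions in
$\mathbb{R}^{2 \times 2}$ under the risk-neutral measure $\mathbb{Q}$, let the Gindikin coefficient be given by $\mathbb{R}_+ \ni \beta > 1$, and let
$\sqrt{X}$ represent the unique square root on $S_2^+$. Moreover, we assume
$$W^{\mathbb{Q}} := B^{\mathbb{Q}} P^{\top} + Z^{\mathbb{Q}} \sqrt{I_2 - P P^{\top}},$$
where $Z^{\mathbb{Q}}$ is a matrix of standard Brownian motions, independent of $B^{\mathbb{Q}}$, in $\mathbb{R}^{2 \times 2}$ under the
risk-neutral measure $\mathbb{Q}$, and $P \in \mathbb{R}^{2 \times 2}$ is a deterministic correlation matrix such that $I_2 - P P^{\top}$ is
a positive semi-definite matrix. The SPX log-dynamics is determined by the stochastic differential
equations (SDEs):
$$\begin{aligned}
dS_t &= -\frac{1}{2} \mathrm{tr}(X_t)\, dt + \mathrm{tr}\left(\sqrt{X_t}\, dW_t^{\mathbb{Q}}\right), & S_0 &= s \in \mathbb{R}, \\
dX_t &= \left(\beta Q^{\top} Q + M X_t + X_t M^{\top}\right) dt + \sqrt{X_t}\, dB_t^{\mathbb{Q}} Q + Q^{\top} \left(dB_t^{\mathbb{Q}}\right)^{\top} \sqrt{X_t}, & X_0 &= x \in S_2^+,
\end{aligned} \tag{2}$$
where we suppose that all the eigenvalues of $x$ are distinct, i.e. $\lambda_{1,0} > \lambda_{2,0} \geq 0$.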
Note that since we do not need the discounted version of $F$ to obtain a semimartingale process, there is
no risk-free rate in the drift component.⁵ Assumption 3.1 defines a matrix-variate stochastic volatility
model in which we allow for multifactor volatility, stochastic correlation, and stochastic skewness.⁶
Since leverage is closely linked to the skewness of asset returns and the slope of the implied volatility
smile,⁷ our model with stochastic leverage $\mathrm{COV}_t(dS_t, dV_t) = 2\,\mathrm{tr}(P Q X_t)\, dt$, where $V_t := \mathrm{tr}(X_t)$, can
enhance the ability to replicate the time-varying skew and term structure patterns of the implied
volatilities in the SPX market (see, e.g., Figure 3).⁸ The investigation of $\mathrm{COV}_t(dS_t, dV_t)$ and
the corresponding variance of variance $\mathrm{VAR}_t(dV_t) = 4\,\mathrm{tr}(Q^{\top} Q X_t)\, dt$ is interesting because these
moments are related to the conditional skewness and kurtosis, respectively, of the SPX log-returns.
It is shown in Christoffersen et al. (2009, Online Appendix) that the paths for the conditional
skewness and kurtosis, respectively, in the one-factor Heston (1993) model are too strongly linked to
the variance path. Introducing more factors can generate a higher variation and therefore yields
a wider degree of flexibility in the term structure of higher moments.
⁵ We refer to Lemma B.1 in the supplementary appendix for the mathematical justification.
⁶ See Leippold and Trojani (2010, Section 2) for the analytical expressions of the stochastic correlation between
the log-returns and the volatility shocks with and without jumps.
⁷ More precisely, the leverage effect increases the probability of a large loss and consequently the value of OTM put
options. The leverage effect induces negative skewness in stock returns, which in turn yields a volatility smirk.
⁸ This fact is pointed out in Christoffersen et al. (2009). In particular, while single-factor stochastic volatility models
can capture the slope of the smile, they cannot explain large independent fluctuations in the corresponding level
and slope over time.
The pure diffusion Wishart process (see, e.g., Gouriéroux et al. (2009) and Gouriéroux and Sufana
(2010)) in (2) represents the matrix analogue of a squared Bessel process. In full analogy with the
Cox-Ingersoll-Ross (CIR) process (cf. Cox et al. (1985)), the term $\beta Q^{\top} Q$ is related to the expected
long-term variance-covariance matrix $X_{\infty}$ through
the solution of the linear (Lyapunov) equation (cf. Da Fonseca et al. (2014))
$$M X_{\infty} + X_{\infty} M^{\top} = -\beta Q^{\top} Q.$$
The matrix $M$ can be compared to the speed of mean reversion in the CIR process. Moreover, $Q$
can be identified as the volvol parameter, i.e. the volatility of the volatility matrix. Finally, the
condition $\mathbb{R}_+ \ni \beta > 1$ is introduced to ensure, $\mathbb{Q}$-almost surely (a.s.) for all $t \in [0, T]$, the existence
and uniqueness of the strong solution $X_t \in S_2^+$ in (2).⁹ Imposing $\mathbb{R}_+ \ni \beta \geq 3$ would yield a unique
strong solution $\mathbb{Q}$-a.s. for all $t \in [0, T]$ in the domain $S_2^+ \setminus \{0\}$ since all eigenvalues of the solution
$X_t$ are strictly positive. The assumption of distinct eigenvalues for the initial state $x$ implies that
the eigenvalues of the unique strong solution $X_t \in S_2^+$ at any time $t \in (0, T]$ never collide, i.e.
$\lambda_{1,t} > \lambda_{2,t} \geq 0$.¹⁰ This fact in turn leads to the result that $\mathrm{tr}(X_t) > 0$ a.s. for all $t \in [0, T]$.
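To make the Lyapunov equation for the long-run state concrete, the following sketch solves it by vectorisation for the calibrated parameter set quoted in (7) later in this section. The helper `long_run_state` is a name introduced here for illustration; it is a generic linear solve, not the paper's own procedure.

```python
import numpy as np

# Parameter set (7) from the text.
M = np.array([[-2.9671, 3.1310], [3.1310, -3.4838]])
Q = np.array([[0.6857, -0.4236], [-0.4236, 0.5706]])
beta = 2.0

def long_run_state(M, Q, beta):
    """Solve M X + X M^T = -beta Q^T Q for X by vectorising the equation."""
    k = M.shape[0]
    I = np.eye(k)
    # With row-major vec: vec(M X) = (M kron I) vec(X), vec(X M^T) = (I kron M) vec(X).
    A = np.kron(M, I) + np.kron(I, M)
    rhs = -beta * (Q.T @ Q)
    return np.linalg.solve(A, rhs.reshape(-1)).reshape(k, k)

X_inf = long_run_state(M, Q, beta)
long_run_variance = np.trace(X_inf)  # long-run level of V = tr(X)
```

Because this $M$ is symmetric and negative definite, the trace of the solution coincides with the unconditional mean $-\beta\,\mathrm{tr}(Q^{\top} Q (2M)^{-1})$ stated further below.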
To enforce the typical mean-reverting behavior of the volatility on $X = \{X_t : t \in [0, T]\}$, we
impose the following assumption.
Assumption 3.2. To ensure stationarity, i.e. non-explosive features of the process $X$, we assume
$M$ to be negative definite.
⁹ In daily fitting exercises (see Section 4) we noticed that this condition is in the majority of cases not satisfied if not
enforced. Also observe that this condition naturally translates to the Feller condition (cf. Feller (1951)), assuming
the eigenvalues of $x$ are non-colliding, for the univariate CIR process, which is however also not genuinely satisfied
for most option data.
¹⁰ See Bru (1991, Section 3) for the corresponding proofs.
Every stochastic SPX option pricing model embodies an implicit dynamics for the VIX. Thus,
this setting also encompasses the dynamics for the VIX. Since there are no jumps, the VIX squared
is therefore equal to the annualized expected integrated variance over a 30-day time period.
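Proposition 3.1. The VIX squared at time $t$ is given by
$$\mathrm{VIX}_t^2 := \frac{1}{\tau_{\mathrm{VIX}}} \int_t^{t + \tau_{\mathrm{VIX}}} \mathbb{E}^{\mathbb{Q}}\left[\mathrm{tr}(X_s) \mid \mathcal{F}_t\right] ds, \qquad \tau_{\mathrm{VIX}} \equiv \frac{30}{365}, \tag{3}$$
where the expectation can be carried out explicitly for a symmetric matrix $M$ in the form of
$$\mathrm{VIX}_t^2 = \alpha_{\mathrm{VIX}^2} + \mathrm{tr}\left(\beta_{\mathrm{VIX}^2} X_t\right), \tag{4}$$
with
$$\alpha_{\mathrm{VIX}^2} := \beta\, \mathrm{tr}\left[\frac{1}{\tau_{\mathrm{VIX}}} Q^{\top} Q \left(4 M^2\right)^{-1} \Theta - Q^{\top} Q (2M)^{-1}\right], \tag{5}$$
$$\beta_{\mathrm{VIX}^2} := \frac{1}{\tau_{\mathrm{VIX}}} \Theta (2M)^{-1}, \tag{6}$$
$$\Theta \equiv \exp\left[2 \tau_{\mathrm{VIX}} M\right] - I_2.$$
Proof. See Appendix A.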
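Notice that it is common in the related literature to express the VIX squared at time $t$ also by
$$\mathrm{VIX}_t^2 = \frac{1}{\tau_{\mathrm{VIX}}} \mathbb{E}^{\mathbb{Q}}\left[\int_t^{t + \tau_{\mathrm{VIX}}} \mathrm{tr}(X_s)\, ds \,\Big|\, \mathcal{F}_t\right].$$
However, observe that the interchange of the expectation and integration in Proposition 3.1 is justified
by Tonelli's theorem.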
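The affine representation of the VIX squared can be checked numerically. The sketch below evaluates the closed-form coefficients for symmetric $M$ with the parameter set (7) and compares the result with a direct time average of the conditional mean of $\mathrm{tr}(X_s)$, obtained by integrating the mean ODE $dm/ds = \beta Q^{\top} Q + M m + m M^{\top}$ with a Runge-Kutta scheme. All function names are introduced here for illustration, and the matrix exponential via an eigendecomposition relies on $M$ being symmetric.

```python
import numpy as np

TAU_VIX = 30.0 / 365.0

# Parameter set (7) from the text.
M = np.array([[-2.9671, 3.1310], [3.1310, -3.4838]])
Q = np.array([[0.6857, -0.4236], [-0.4236, 0.5706]])
X_t = np.array([[0.1442, -0.0268], [-0.0268, 0.0050]])
beta = 2.0

def expm_sym(A):
    """Matrix exponential of a symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.T

def vix2_coefficients(M, Q, beta, tau=TAU_VIX):
    """Scalar and matrix coefficient of the affine VIX^2 formula (symmetric M only)."""
    I = np.eye(M.shape[0])
    theta = expm_sym(2.0 * tau * M) - I
    inv2M = np.linalg.inv(2.0 * M)
    QtQ = Q.T @ Q
    alpha = beta * np.trace(QtQ @ inv2M @ inv2M @ theta / tau - QtQ @ inv2M)
    beta_vix = theta @ inv2M / tau
    return alpha, beta_vix

def vix2_by_quadrature(M, Q, beta, X, tau=TAU_VIX, n_steps=2000):
    """Time average of E[tr(X_s)] over [t, t + tau]: RK4 for the mean, trapezoid in time."""
    QtQ = Q.T @ Q
    drift = lambda m: beta * QtQ + M @ m + m @ M.T
    h = tau / n_steps
    m = np.array(X, dtype=float)
    traces = [np.trace(m)]
    for _ in range(n_steps):
        k1 = drift(m)
        k2 = drift(m + 0.5 * h * k1)
        k3 = drift(m + 0.5 * h * k2)
        k4 = drift(m + h * k3)
        m = m + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        traces.append(np.trace(m))
    traces = np.array(traces)
    integral = h * (traces.sum() - 0.5 * (traces[0] + traces[-1]))
    return integral / tau

alpha_vix, beta_vix = vix2_coefficients(M, Q, beta)
vix2_closed = alpha_vix + np.trace(beta_vix @ X_t)
vix2_numeric = vix2_by_quadrature(M, Q, beta, X_t)
```

The two values should agree up to discretisation error, which gives a quick sanity check on any implementation of the closed-form coefficients.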
In summary, we obtain a tractable affine setting for the VIX with at least two useful properties.
First, the model implies nonlinear persistence properties and a stochastic volatility of volatility of the
implied correlation process $\varrho_{12} := X_{12} / \sqrt{X_{11} X_{22}}$ (see, e.g., Leippold and Trojani (2010)). Second
and even more important, the multivariate risk structure, in which both factors feature stochastic
co-volatility, can enhance the accurate replication of the positive implied volatility skew for VIX
options since we can generate a wide degree of flexibility in the conditional higher moments of the
VIX dynamics. More precisely, given the fact that at any time $t$ the transition probability density
$f^{\mathbb{Q}}(X_T \mid X_t)$ in (2) follows a non-central Wishart distribution,¹¹ it is straightforward to see that
considering only orthogonal diagonal components restricts the structure of $f^{\mathbb{Q}}$ and therefore reduces the
flexibility of the implied $\mathrm{VIX}^2$ transition probability density. This feature is the particular
improvement in flexibility over the model of Christoffersen et al. (2009).
To better understand the restriction on conditional higher moments of orthogonal diagonal
components, we compute the first four moments of $V := \mathrm{tr}(X)$. We use the fact that the cumulant
generating function for $V$ is given by:
$$K(\Gamma, X_t, t, T) = \mathrm{tr}\left[\xi_{t,T}\, \Gamma \left(I_2 - 2\, \Xi_{t,T}\, \Gamma\right)^{-1} \xi_{t,T} X_t\right] - \frac{\beta}{2} \ln\left(\det\left(I_2 - 2\, \Xi_{t,T}\, \Gamma\right)\right),$$
where
$$\Xi_{t,T} := \frac{\Sigma_{t,T}}{\beta},$$
$\xi_{t,T}$ and $\Sigma_{t,T}$ are defined in (B.1) and (B.2), respectively, and $\Gamma := \theta I_2$ for $\theta \in \mathbb{R}_{\geq 0}$. Using
standard matrix calculus and applying the relationship between cumulants and moments shows that the
conditional moments are given by the following expressions:
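$$\mathbb{E}^{\mathbb{Q}}(V_T \mid V_t) = \kappa_{1,t} = \mathrm{tr}\left(\xi_{t,T}^2 X_t\right) + \beta\, \mathrm{tr}\left(\Xi_{t,T}\right),$$
$$\mathrm{VAR}^{\mathbb{Q}}(V_T \mid V_t) = \kappa_{2,t} = 4\, \mathrm{tr}\left(\xi_{t,T}\, \Xi_{t,T}\, \xi_{t,T} X_t\right) + 2 \beta\, \mathrm{tr}\left(\Xi_{t,T}^2\right),$$
$$\mathrm{SKEW}^{\mathbb{Q}}(V_T \mid V_t) = \frac{\kappa_{3,t}}{\kappa_{2,t}^{3/2}} = \frac{2 \sqrt{2} \left[3\, \mathrm{tr}\left(\xi_{t,T}\, \Xi_{t,T}^2\, \xi_{t,T} X_t\right) + \beta\, \mathrm{tr}\left(\Xi_{t,T}^3\right)\right]}{\left[2\, \mathrm{tr}\left(\xi_{t,T}\, \Xi_{t,T}\, \xi_{t,T} X_t\right) + \beta\, \mathrm{tr}\left(\Xi_{t,T}^2\right)\right]^{3/2}},$$
$$\mathrm{KURT}^{\mathbb{Q}}(V_T \mid V_t) = \frac{\kappa_{4,t}}{\kappa_{2,t}^2} = \frac{12 \left[4\, \mathrm{tr}\left(\xi_{t,T}\, \Xi_{t,T}^3\, \xi_{t,T} X_t\right) + \beta\, \mathrm{tr}\left(\Xi_{t,T}^4\right)\right]}{\left[2\, \mathrm{tr}\left(\xi_{t,T}\, \Xi_{t,T}\, \xi_{t,T} X_t\right) + \beta\, \mathrm{tr}\left(\Xi_{t,T}^2\right)\right]^2},$$
where VAR, SKEW, and KURT denote the (conditional) variance, skewness, and kurtosis, respectively,
and $\kappa_{n,t}$ is the $n$th conditional cumulant at time $t$. Incorporating the fact that $M$ is a negative
definite matrix (see Assumption 3.2) yields the following unconditional moments:
$$\mathbb{E}^{\mathbb{Q}}(V_{\infty}) = -\beta\, \mathrm{tr}\left[Q^{\top} Q (2M)^{-1}\right],$$
$$\mathrm{VAR}^{\mathbb{Q}}(V_{\infty}) = 2 \beta\, \mathrm{tr}\left[\left(Q^{\top} Q (2M)^{-1}\right)^2\right],$$
$$\mathrm{SKEW}^{\mathbb{Q}}(V_{\infty}) = -\frac{2 \sqrt{2}\, \mathrm{tr}\left[\left(Q^{\top} Q (2M)^{-1}\right)^3\right]}{\sqrt{\beta} \left(\mathrm{tr}\left[\left(Q^{\top} Q (2M)^{-1}\right)^2\right]\right)^{3/2}},$$
$$\mathrm{KURT}^{\mathbb{Q}}(V_{\infty}) = \frac{12\, \mathrm{tr}\left[\left(Q^{\top} Q (2M)^{-1}\right)^4\right]}{\beta \left(\mathrm{tr}\left[\left(Q^{\top} Q (2M)^{-1}\right)^2\right]\right)^2}.$$
¹¹ We refer to Appendix B for its explicit representation.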
To illustrate numerically that the full Wishart specification has greater flexibility than a multifactor
Heston specification, we use a calibrated parameter set from Section 4, for which we imposed
the integer restriction $\beta \in \mathbb{N} \setminus \{1\}$ in the calibration to simplify the efficient simulation of the $\mathrm{VIX}^2$
dynamics below, and compute the conditional skewness and kurtosis, respectively, of $V$ for a varying
off-diagonal latent state $X_{12}$. The initial parameter set is given by
$$M = \begin{pmatrix} -2.9671 & 3.1310 \\ 3.1310 & -3.4838 \end{pmatrix}, \quad
Q = \begin{pmatrix} 0.6857 & -0.4236 \\ -0.4236 & 0.5706 \end{pmatrix}, \quad
X_t = \begin{pmatrix} 0.1442 & -0.0268 \\ -0.0268 & 0.0050 \end{pmatrix}, \quad \beta = 2. \tag{7}$$
Furthermore, we also consider a parameterized version of the Christoffersen et al. (2009) model which
consists of orthogonal diagonal components and therefore does not contain a stochastic co-volatility
factor. This model is described in Section 4 and is denoted by SV-2F in Figure 4. There are two
conclusions to be drawn from Figure 4. First, incorporating the off-diagonal components generates
a higher conditional skewness and kurtosis, respectively, thereby widening the support of the bulk
of the $\mathrm{VIX}^2$ dynamics' probability distribution. This in turn improves the replication of a positive
implied volatility skew for VIX options due to the leverage effect, which increases the probability of a
large loss and consequently the values of VIX OTM call options. Second, by varying the off-diagonal
component $X_{12}$ we can observe the flexibility of this additional degree of freedom for the conditional
higher moments. It is also important to notice that $X_{12}$ provides an additional means to capture the
stochasticity of the skew effect implied by SPX options (cf. Da Fonseca et al. (2008)). Hence, $X_{12}$
enhances the ability to preserve consistency with the underlying index.
[Figure 4 about here.]
The unconditional skewness (resp. kurtosis) for the MAD model is 1.9165 (resp. 5.6297), whereas for
the Christoffersen et al. (2009) model it amounts to 1.5403 (resp. 3.7024). Hence, we also obtain an
improved modeling of higher unconditional moments.
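The unconditional moments of $V$ are simple to evaluate. The sketch below implements the four unconditional formulas for the parameter set (7); with these inputs it should reproduce, up to rounding of the published parameters, the MAD skewness and kurtosis figures quoted above. The function name is introduced here for illustration.

```python
import numpy as np

# Parameter set (7) from the text.
M = np.array([[-2.9671, 3.1310], [3.1310, -3.4838]])
Q = np.array([[0.6857, -0.4236], [-0.4236, 0.5706]])
beta = 2.0

def unconditional_moments(M, Q, beta):
    """Unconditional mean, variance, skewness, and kurtosis of V = tr(X)."""
    A = Q.T @ Q @ np.linalg.inv(2.0 * M)
    tr_pow = lambda n: np.trace(np.linalg.matrix_power(A, n))
    mean = -beta * tr_pow(1)
    var = 2.0 * beta * tr_pow(2)
    skew = -2.0 * np.sqrt(2.0) * tr_pow(3) / (np.sqrt(beta) * tr_pow(2) ** 1.5)
    kurt = 12.0 * tr_pow(4) / (beta * tr_pow(2) ** 2)
    return mean, var, skew, kurt

mean_V, var_V, skew_V, kurt_V = unconditional_moments(M, Q, beta)
```

Note that the kurtosis here is the ratio of the fourth cumulant to the squared second cumulant, following the definition used in the text.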
The enormous flexibility of Wishart processes can further be illustrated by simulating the $\mathrm{VIX}^2$
dynamics in (4) and plotting the corresponding VIX option prices and histograms. Since we enforced
$\beta$ to be a positive integer, we use the following proposition in combination with the Euler-Maruyama
scheme to simulate equation (4).
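Proposition 3.2. Let $\beta \in \mathbb{N} \setminus \{1\}$ and assume $Y = \{Y_t : t \in [0, T]\}$ follows a matrix
Ornstein-Uhlenbeck process in $\mathbb{R}^{2 \times \beta}$ with dynamics
$$dY_t = M Y_t\, dt + Q^{\top} d\widetilde{B}_t^{\mathbb{Q}}, \qquad Y_0 = y \in \mathbb{R}^{2 \times \beta},$$
where $\widetilde{B}^{\mathbb{Q}}$ is a matrix of standard Brownian motions in $\mathbb{R}^{2 \times \beta}$ under the risk-neutral measure $\mathbb{Q}$ and
$M, Q \in \mathbb{R}^{2 \times 2}$. Then, $X := Y Y^{\top}$ has the dynamics (2) with
$dB_t^{\mathbb{Q}} := \left(\sqrt{X_t}\right)^{-1} Y_t \left(d\widetilde{B}_t^{\mathbb{Q}}\right)^{\top}$.
Proof. See Appendix A.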
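The simulation idea behind this proposition can be sketched as follows: simulate the matrix Ornstein-Uhlenbeck process with an Euler-Maruyama step and form the outer product at the end, which keeps the simulated state positive semi-definite by construction. This is a minimal sketch under stated assumptions, not the paper's implementation: the noise is taken to enter as $Q^{\top}$ times the Brownian increment so that the drift of the product matches $\beta Q^{\top} Q$, the initial $Y_0$ is built from the symmetric square root of the initial state, and the function names, step count, and seed are hypothetical.

```python
import numpy as np

# Parameter set (7) from the text.
M = np.array([[-2.9671, 3.1310], [3.1310, -3.4838]])
Q = np.array([[0.6857, -0.4236], [-0.4236, 0.5706]])
X_t = np.array([[0.1442, -0.0268], [-0.0268, 0.0050]])
beta = 2  # integer Gindikin coefficient, required for this construction

def sqrtm_psd(X):
    """Symmetric square root of a positive semi-definite matrix."""
    w, V = np.linalg.eigh(X)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def simulate_wishart_via_ou(M, Q, X0, beta, horizon, n_steps, rng):
    """Euler-Maruyama for dY = M Y dt + Q^T dB; returns X_T = Y_T Y_T^T."""
    k = M.shape[0]
    dt = horizon / n_steps
    # Y0 must satisfy Y0 Y0^T = X0: use the symmetric root, padded with zero columns.
    Y = np.zeros((k, beta))
    Y[:, :k] = sqrtm_psd(X0)
    for _ in range(n_steps):
        dB = np.sqrt(dt) * rng.standard_normal((k, beta))
        Y = Y + (M @ Y) * dt + Q.T @ dB
    return Y @ Y.T

rng = np.random.default_rng(42)
# A handful of terminal states over a 30-day horizon (path count kept small on purpose).
terminal_states = [simulate_wishart_via_ou(M, Q, X_t, beta, 30.0 / 365.0, 30, rng)
                   for _ in range(5)]
```

Each terminal state can then be mapped to a VIX squared value through the affine relation (4).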
Figure 5 depicts on the left side the simulated VIX option prices using 30,000 Monte Carlo
simulations, a risk-free interest rate of 1%, and the parameter set given in (7). We can infer that
the MAD-implied VIX model prices are higher than those of the Christoffersen et al. (2009)
model for all strike prices and that the decay to zero is significantly slower, implying a larger support
of the bulk of the $\mathrm{VIX}^2$ dynamics' probability distribution. This observation is also confirmed by
the corresponding histograms of the simulated VIX dynamics on the right side of Figure 5, thereby
justifying the importance of the off-diagonal components for the tail distribution. Also, as pointed
out by Gatheral (2013), it is a stylized fact that the distribution of volatility (whether realized or
implied) should be roughly lognormal.¹² Such behavior is well replicated by the MAD model,
whereas the Christoffersen et al. (2009) model still exhibits too little mass concentrated at high VIX
states, which can be seen from the histograms in Figure 5.
[Figure 5 about here.]
Finally, an alternative way of interpreting the flexibility of $X$, which has been proposed by
Gruber et al. (2010) and employed by Leippold and Trojani (2010), is to use at time $t$ the spectral
decomposition $X_t = O_t \mathbf{V}_t O_t^{\top}$, where $\mathbf{V}_t$ is a diagonal matrix with the eigenvalues $V_t^1$ and $V_t^2$,
and
$$O_t := O(\alpha_t) := \begin{pmatrix} \cos(\alpha_t) & -\sin(\alpha_t) \\ \sin(\alpha_t) & \cos(\alpha_t) \end{pmatrix}$$
is a matrix of orthonormal eigenvectors written in polar coordinates with angle $\alpha_t \in [0, \pi]$. Observe
that $V = V^1 + V^2$ holds, and therefore in combination with some lengthy calculations (see
Gruber et al. (2010, Appendix B)), we can decompose $X$ into a volatility part and a structural part as
follows:
$$X_t = \frac{V_t}{2} \left[ I_2 + \left(2 \frac{V_t^1}{V_t} - 1\right) \begin{pmatrix} \cos(2\alpha_t) & \sin(2\alpha_t) \\ \sin(2\alpha_t) & -\cos(2\alpha_t) \end{pmatrix} \right],$$
where $V$ is equal to the total variance and $V^1 / V$ is the fraction of total variance explained by the first
volatility factor. If we set $\cos(2\alpha) = 1$, i.e. $\alpha = 0$ or $\alpha = \pi$, we obtain a two-factor volatility model,
and if we additionally impose $V^1 / V = 1$, the state $X$ breaks down to the original Heston (1993) model.
Thus, $\alpha \in (0, \pi)$ identifies the incremental impact of $X_{12}$. We fix the volatility level $V$ and the
volatility composition $V^1 / V$, and plot in Figure 6 the admissible domain for the VIX values by varying
$\alpha \in [0, \pi]$ and employing the parameter set given in (7). We notice that for $\alpha \in (0, \pi)$ the MAD
model generates a wide degree of variability in the VIX values, independently of the conditional
level and composition of the spot volatility, respectively. Overall, the minimum is achieved at 0.4055,
whereas the maximum amounts to 0.4652. This finding underpins the importance of the additional
degree of freedom in the MAD framework, which improves the modeling of higher conditional and
unconditional moments, respectively, in the underlying VIX transition probability density.
¹² Empirical studies showing that the distribution of volatility is lognormal include, among others, Cizeau et al.
(1997), Andersen et al. (2001a), and Andersen et al. (2001b).
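The decomposition of the state into a volatility level, a volatility composition, and an angle can be verified numerically. The sketch below builds the state once from the level-composition-angle formula and once from an eigendecomposition, assuming the eigenvector matrix is the plane rotation by the angle; both function names and the test values are introduced here for illustration.

```python
import numpy as np

def state_from_polar(V, frac1, alpha):
    """X = (V/2) [I + (2 V1/V - 1) R], with R built from cos(2 alpha) and sin(2 alpha)."""
    c, s = np.cos(2.0 * alpha), np.sin(2.0 * alpha)
    R = np.array([[c, s], [s, -c]])
    return 0.5 * V * (np.eye(2) + (2.0 * frac1 - 1.0) * R)

def state_from_eigen(V, frac1, alpha):
    """X = O diag(V1, V2) O^T, with O the rotation by alpha (assumption)."""
    O = np.array([[np.cos(alpha), -np.sin(alpha)],
                  [np.sin(alpha), np.cos(alpha)]])
    V1 = frac1 * V
    return O @ np.diag([V1, V - V1]) @ O.T

# Hypothetical level and composition; alpha sweeps the admissible range [0, pi].
V_level, composition = 0.15, 0.9
alphas = np.linspace(0.0, np.pi, 7)
max_gap = max(np.abs(state_from_polar(V_level, composition, a)
                     - state_from_eigen(V_level, composition, a)).max()
              for a in alphas)
```

Varying the angle while holding level and composition fixed changes only the off-diagonal structure, which is the degree of freedom discussed above.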
[Figure 6 about here.]
Let us remark that in order to span this incomplete market setting, which arises from the stochastic
correlation, we can consider correlation products such as correlation swaps.¹³
3.3 Transform Analysis
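The entire framework is based on an affine setup which allows us, according to Duffie et al. (2000),
to solve our financial pricing problem efficiently by means of transform methods. Therefore, we aim
at an analytical characterization of the discounted Laplace transforms of $\mathrm{VIX}_T^2$ and $S_T$, respectively, at
time $t$ under $\mathbb{Q}$:
$$\Psi^{\mathrm{VIX}^2}(\omega, X_t, t, T) := \mathbb{E}^{\mathbb{Q}}\left[\exp\left(-\int_t^T R(X_s)\, ds + \omega\, \mathrm{VIX}_T^2\right) \Big|\, \mathcal{F}_t\right], \tag{8}$$
$$\Psi^{S}(\vartheta, S_t, X_t, t, T) := \mathbb{E}^{\mathbb{Q}}\left[\exp\left(-\int_t^T R(X_s)\, ds + \vartheta S_T\right) \Big|\, \mathcal{F}_t\right], \tag{9}$$
where $\omega, \vartheta \in \mathbb{C}$, and the short rate process $R = \{R(X_t) : t \in [0, T]\}$ is affine, i.e. $R(x) = \rho_0 + \mathrm{tr}(\rho_1 x)$
with $\rho_0 \in \mathbb{R}_{\geq 0}$, $\rho_1 \in S_2^+$.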
For simplicity's sake we set the risk-free rate constant, i.e. $\rho_1 = 0$. One may argue that the error
of a constant interest rate can become substantial, but this concern applies only to long-dated
financial derivatives. Since SPX and VIX options have much shorter maturities,
we do not encounter this issue in our setting.
13 See Buraschi et al. (2014) for a recent empirical study on correlation swaps which provide a natural traded proxy
for the price of correlation risk.
Proposition 3.3. Let Assumption 3.1 and Theorem 2.4, Theorem 2.6, and Proposition 4.9 in
Cuchiero et al. (2011) be satisfied. Furthermore, assume a constant risk-free rate, i.e. $\rho_1 = 0$,
and define $\tau := T - t$. Then, the discounted Laplace transform in (8) is exponentially affine:
$$\Psi^{\mathrm{VIX}^2}(\omega, X_t, t, T) = \exp\left(\varphi_X(\tau) + \mathrm{tr}\left(\psi_X(\tau) X_t\right)\right)$$
with the functions $\varphi_X(\tau) \in \mathbb{R}$ and $\psi_X(\tau) \in S_2^+$ that solve the system of matrix Riccati differential
equations:
$$\frac{\mathrm{d}\psi_X(\tau)}{\mathrm{d}\tau} = M^\top \psi_X(\tau) + \psi_X(\tau) M + 2\psi_X(\tau) Q^\top Q \psi_X(\tau), \tag{10}$$
$$\frac{\mathrm{d}\varphi_X(\tau)}{\mathrm{d}\tau} = -\rho_0 + \beta\,\mathrm{tr}\left[Q^\top Q \psi_X(\tau)\right], \tag{11}$$
subject to the terminal conditions $\varphi_X(0) = \omega\alpha_{\mathrm{VIX}^2}$ and $\psi_X(0) = \omega\beta_{\mathrm{VIX}^2}$.
Proof. See Appendix A.
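Proposition 3.4. Let Assumption 3.1 be satisfied, assume a constant risk-free rate, i.e. $\rho_1 = 0$, and
define $\tau := T - t$. Then, the discounted Laplace transform in (9) is exponentially affine:
$$\Psi^{S}(\vartheta, S_t, X_t, t, T) = \exp(\vartheta S_t)\exp\left(\varphi_S(\tau) + \mathrm{tr}\left(\psi_S(\tau) X_t\right)\right)$$
with the functions $\varphi_S(\tau) \in \mathbb{R}$ and $\psi_S(\tau) \in S_2^+$ that solve the system of matrix Riccati differential
equations:
$$\frac{\mathrm{d}\psi_S(\tau)}{\mathrm{d}\tau} = \frac{\vartheta(\vartheta - 1)}{2} I_2 + \psi_S(\tau)\left(M + \vartheta Q^\top P^\top\right) + \left(M^\top + \vartheta P Q\right)\psi_S(\tau) + 2\psi_S(\tau) Q^\top Q \psi_S(\tau), \tag{12}$$
$$\frac{\mathrm{d}\varphi_S(\tau)}{\mathrm{d}\tau} = -\rho_0 + \beta\,\mathrm{tr}\left[Q^\top Q \psi_S(\tau)\right], \tag{13}$$
subject to the terminal conditions $\varphi_S(0) = 0$ and $\psi_S(0) = 0$.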
Proof. See Appendix A.
The closed-form solutions to the matrix Riccati equations (10), (11), (12), and (13)
can then be obtained by linearizing the flow of the differential equation using Radon’s lemma (see, e.g.,
Freiling (2002) and Lemma B.2 in the supplementary appendix).
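Corollary 3.1. Let the matrix Riccati equations (10) and (11), and Theorem 2.4, Theorem 2.6,
and Proposition 4.9 in Cuchiero et al. (2011) be satisfied. Furthermore, define $\tau := T - t$. Then,
$$\psi_X(\tau) = \left(\omega\beta_{\mathrm{VIX}^2} C_{12}(\tau) + C_{22}(\tau)\right)^{-1}\left(\omega\beta_{\mathrm{VIX}^2} C_{11}(\tau) + C_{21}(\tau)\right),$$
where $C_{11}(\tau)$, $C_{12}(\tau)$, $C_{21}(\tau)$, and $C_{22}(\tau)$ are 2 × 2 blocks of the following matrix exponential:
$$\begin{pmatrix} C_{11}(\tau) & C_{12}(\tau) \\ C_{21}(\tau) & C_{22}(\tau) \end{pmatrix} := \exp\left[\tau \begin{pmatrix} M & -2Q^\top Q \\ 0 & -M^\top \end{pmatrix}\right].$$
Given the solution for $\psi_X(\tau)$, the coefficient $\varphi_X(\tau)$ follows by direct integration:
$$\varphi_X(\tau) = -\rho_0\tau - \frac{\beta}{2}\,\mathrm{tr}\left[\ln\left(C_{22}(\tau)\right) + \tau M^\top\right] + \omega\alpha_{\mathrm{VIX}^2},$$
where $\ln(\cdot)$ denotes the matrix logarithm.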
Proof. See Appendix A in the supplementary appendix.
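To make the linearization step concrete, the following minimal sketch checks the formula for $\psi_X(\tau)$ in a scalar (1 × 1) analogue, in which M = m and Q = q are real numbers and the Riccati equation (10) reduces to dψ/dτ = 2mψ + 2q²ψ². The scalar setting, the function names, and the real-valued initial condition are illustrative assumptions and not part of the paper; in the pricing application the initial condition is complex.

```python
import math


def psi_linearized(tau, m, q, psi0):
    """Scalar analogue of Corollary 3.1 via the linearized flow.

    For 1x1 'matrices' M = m and Q = q, the block matrix
    [[m, -2 q^2], [0, -m]] is upper triangular, so its exponential
    is available in closed form.
    """
    c11 = math.exp(m * tau)
    c22 = math.exp(-m * tau)
    c21 = 0.0
    # Off-diagonal block of exp(tau * [[m, c], [0, -m]]) with c = -2 q^2.
    c12 = -2.0 * q * q * (c11 - c22) / (2.0 * m)
    return (psi0 * c11 + c21) / (psi0 * c12 + c22)


def psi_rk4(tau, m, q, psi0, steps=2000):
    """Integrate d psi / d tau = 2 m psi + 2 q^2 psi^2 with classical RK4."""
    def rhs(p):
        return 2.0 * m * p + 2.0 * q * q * p * p

    h = tau / steps
    p = psi0
    for _ in range(steps):
        k1 = rhs(p)
        k2 = rhs(p + 0.5 * h * k1)
        k3 = rhs(p + 0.5 * h * k2)
        k4 = rhs(p + h * k3)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p
```

Calling both functions with the same hypothetical inputs, e.g. `psi_linearized(0.5, -1.0, 0.5, -0.3)` and `psi_rk4(0.5, -1.0, 0.5, -0.3)`, should give values that agree up to the integration error of the Runge-Kutta scheme.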
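Corollary 3.2. Let the matrix Riccati equations (12) and (13) be satisfied. Furthermore, define
$\tau := T - t$. Then, $\psi_S(\tau) = C_{22}(\tau)^{-1} C_{21}(\tau)$, where $C_{21}(\tau)$ and $C_{22}(\tau)$ are 2 × 2 blocks of the
following matrix exponential:
$$\begin{pmatrix} C_{11}(\tau) & C_{12}(\tau) \\ C_{21}(\tau) & C_{22}(\tau) \end{pmatrix} := \exp\left[\tau \begin{pmatrix} M + \vartheta Q^\top P^\top & -2Q^\top Q \\ \frac{\vartheta(\vartheta - 1)}{2} I_2 & -\left(M^\top + \vartheta P Q\right) \end{pmatrix}\right].$$
Given the solution for $\psi_S(\tau)$, the coefficient $\varphi_S(\tau)$ follows by direct integration:
$$\varphi_S(\tau) = -\rho_0\tau - \frac{\beta}{2}\,\mathrm{tr}\left[\ln\left(C_{22}(\tau)\right) + \tau M^\top + \tau\vartheta P Q\right].$$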
Proof. See Appendix A in the supplementary appendix.
For the pricing of options on the VIX, we cannot follow the seminal fast Fourier transform (FFT)
technique developed in Carr and Madan (1999). To illustrate why, we follow the
arguments in Bardgett et al. (2014), who have pointed out the technical difficulties in this framework.
We can write the price of a call option with strike $K$ and maturity $T$ at time $t$ as
$$C(\mathrm{VIX}_t, K, t, T) = e^{-r(T-t)} \int_0^{\infty} \left(\sqrt{x} - K\right)^+ f\left(x \mid \mathrm{VIX}_t^2 = y\right)\mathrm{d}x,$$
where $x$ denotes the value of $\mathrm{VIX}^2$ at time $T$, $f(x \mid \mathrm{VIX}_t^2 = y)$ is the probability density function
(PDF) of $x$ given today’s value $y \in \mathbb{R}_+$, and $r$ is the (constant) risk-neutral interest rate. Comparing
this expression to an SPX call option payoff $(e^y - e^k)^+$, where $y := \ln(S)$ with $S$ as the underlying
index price at maturity $T$ and $k := \ln(K)$ with $K$ as the strike price, we see that to apply the Fourier
method of Carr and Madan (1999) we would need to have an affine dependence on the log of the
VIX, which is however incompatible with affine models for log-returns. To circumvent this technical
issue, we henceforth follow the transform method in Fang and Oosterlee (2008).
Fang and Oosterlee (2008) develop an option pricing method for European-style options, called
the COS method, based on Fourier-cosine series expansions. The main idea is to decompose a
density function into a linear combination of cosine functions since the series coefficients of many
density functions can be accurately obtained from their characteristic functions. This particular
decomposition allows for easy and highly efficient numerical computations which attain, in most
cases, an exponential convergence rate and a linear computational complexity.
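The density decomposition behind the COS method can be illustrated with a minimal sketch. It recovers a density on a truncation interval [a, b] from its characteristic function through a truncated cosine series with a halved first term. The function names are hypothetical, and the standard normal distribution is used purely as a test case that is not taken from the paper.

```python
import cmath
import math


def cos_density(char_fn, x, a, b, n_terms=64):
    """Approximate a density at x on [a, b] from its characteristic function."""
    width = b - a
    total = 0.0
    for k in range(n_terms):
        u = k * math.pi / width
        # Cosine-series coefficient F_k = 2/(b-a) * Re[phi(u) * exp(-i*u*a)].
        f_k = 2.0 / width * (char_fn(u) * cmath.exp(-1j * u * a)).real
        if k == 0:
            f_k *= 0.5  # the first term of the primed sum is halved
        total += f_k * math.cos(u * (x - a))
    return total


def standard_normal_cf(u):
    """Characteristic function of N(0, 1), used only as an illustration."""
    return cmath.exp(-0.5 * u * u)
```

For instance, `cos_density(standard_normal_cf, 0.0, -10.0, 10.0)` can be compared with the exact value `1 / math.sqrt(2 * math.pi)`; for smooth densities such as this one, a few dozen terms are typically sufficient.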
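Theorem 3.1. Given the truncation interval $[a, b] \subset \mathbb{R}_{\geq 0}$ for the (compact) support of the PDF
$f(x \mid \mathrm{VIX}_t^2 = y)$ with $x \in \mathbb{R}_{\geq 0}$ and $y \in \mathbb{R}_+$, the price $\Upsilon(X_t, K, t, T)$ of a European contingent claim
on the VIX with payoff function $g(\mathrm{VIX}_T^2) = \left(\sqrt{\mathrm{VIX}_T^2} - K\right)^+$ and time $t \in [0, T]$ is
$$\Upsilon(X_t, K, t, T) = {\sum_{k=0}^{N-1}}{\vphantom{\sum}}' A_k(a, b)\,\phi_k(a, b), \quad N \in \mathbb{N},$$
with the coefficients
$$A_k(a, b) := \mathrm{Re}\left[\exp\left(\varphi_X(T - t) + \mathrm{tr}\left(\psi_X(T - t) X_t\right)\right)\exp(c_k a)\right], \quad c_k \equiv -\frac{ik\pi}{b - a},$$
$$\phi_k(a, b) := \begin{cases} \chi := \dfrac{2}{b-a}\displaystyle\int_a^b \left(\sqrt{x} - K\right)^+ \cos\left(k\pi\dfrac{x-a}{b-a}\right)\mathrm{d}x, & k > 0, \\[2ex] \dfrac{2}{b-a}\left(\dfrac{2}{3}b^{3/2} + K\left(\dfrac{K^2}{3} - b\right)\right), & k = 0, \end{cases}$$
where $\chi$ can be obtained in closed form:
$$\chi = \frac{2}{b - a}\,\mathrm{Re}\left[e^{c_k a}\left(\frac{\sqrt{b}\,e^{c_k b}}{-c_k} + \frac{\sqrt{\pi}}{2c_k^{3/2}}\left(\mathrm{erfz}\left(\sqrt{c_k b}\right) - \mathrm{erfz}\left(K\sqrt{c_k}\right)\right)\right)\right],$$
with $\mathrm{erfz}(z) := \frac{2}{\sqrt{\pi}}\int_0^z e^{-t^2}\mathrm{d}t$ and $z \in \mathbb{C}$ the complex Gauss error function.14 The prime superscript
in the sum indicates that the first term in the summation is divided by two and the coefficients
$\varphi_X(T - t)$ and $\psi_X(T - t)$ are given in Corollary 3.1 with $\omega = -c_k$.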
Proof. See Appendix A.
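As a small numerical sanity check of the k = 0 payoff coefficient in Theorem 3.1, the sketch below compares the closed-form expression with a composite Simpson quadrature of the payoff. It assumes a ≤ K² ≤ b so that the kink of the payoff lies inside the truncation interval; the function names and the numerical inputs in the usage note are hypothetical.

```python
import math


def phi0_closed_form(a, b, strike):
    """k = 0 payoff coefficient of Theorem 3.1 (assumes a <= strike**2 <= b)."""
    return 2.0 / (b - a) * (2.0 / 3.0 * b ** 1.5 + strike * (strike ** 2 / 3.0 - b))


def phi0_simpson(a, b, strike, n=2000):
    """Same coefficient by composite Simpson quadrature of (sqrt(x) - K)^+."""
    lo = max(a, strike ** 2)  # the payoff vanishes below x = K^2
    h = (b - lo) / n          # n must be even for Simpson's rule
    total = 0.0
    for i in range(n + 1):
        x = lo + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * (math.sqrt(x) - strike)
    return 2.0 / (b - a) * total * h / 3.0
```

With, say, a = 0, b = 1, and K = 0.3, the two functions should agree to high precision, which confirms that the closed form is the integral of the payoff over [K², b] scaled by 2/(b − a).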
For the sake of consistency and since the COS method requires far fewer evaluations of the
characteristic function for a given level of accuracy than the FFT method, we employ this particular
technique also for the pricing of options on the SPX.
14 This error function can be evaluated using the infinite series approximation (7.1.29) in Abramowitz and Stegun
(1972).
Theorem 3.2. Let the log-asset prices be denoted by $y := \ln\left(\frac{F_T}{K}\right)$ and $y_0 := \ln\left(\frac{F_t}{K}\right)$. Given the
truncation interval $[a, b] \subset \mathbb{R}$ for the (compact) support of the PDF $f(y \mid y_0 = z)$ with $y, z \in \mathbb{R}$,
the price $\Phi(S_t, K, t, T)$ of a European contingent claim on the SPX with payoff function
$g(e^y) = K\left(e^y - 1\right)^+$ and time $t \in [0, T]$ is
$$\Phi(S_t, K, t, T) = {\sum_{k=0}^{N-1}}{\vphantom{\sum}}' B_k(a, b)\,\gamma_k(a, b), \quad N \in \mathbb{N},$$
with the coefficient
$$B_k(a, b) := \mathrm{Re}\left[\exp\left(-c_k\left(S_t - \ln(K)\right)\right)\exp\left(\varphi_S(T - t) + \mathrm{tr}\left(\psi_S(T - t) X_t\right)\right)\exp(c_k a)\right], \quad c_k \equiv -\frac{ik\pi}{b - a},$$
and the closed-form solution of $\gamma_k(a, b)$ can be found in (Fang and Oosterlee, 2008, Section 3). The
coefficients $\varphi_S(T - t)$ and $\psi_S(T - t)$ are given in Corollary 3.2 with $\vartheta = -c_k$.
Proof. See Appendix A.
Let us remark that the price of a put option on the VIX and SPX, respectively, can be calculated
analogously. Another approach is to back out the SPX (VIX) put option price via the ordinary
(modified) put-call parity.15
In summary, our analytically tractable model yields closed-form solutions for VIX and
SPX options, and we can therefore avoid supplementary numerical solutions, which would
induce additional approximation errors.
Finally, we present a closed-form solution for the VIX futures price $F(X_t, t, T)$, i.e. with
current time $t$ and settlement time $T$, which can be represented by
$$F(X_t, t, T) := \mathbb{E}^{\mathbb{Q}}\left[\sqrt{\mathrm{VIX}_T^2}\,\middle|\,\mathcal{F}_t\right] = \mathbb{E}^{\mathbb{Q}}\left[\sqrt{\alpha_{\mathrm{VIX}^2} + \mathrm{tr}\left(\beta_{\mathrm{VIX}^2} X_T\right)}\,\middle|\,\mathcal{F}_t\right].$$
Observing that the discount is not used for VIX futures and setting the strike price K in Theorem 3.1
to zero yields the following result.
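Theorem 3.3. Given the truncation interval $[a = 0, b] \subset \mathbb{R}_{\geq 0}$ for the (compact) support of the PDF
$f(x \mid \mathrm{VIX}_t^2 = y)$ with $x \in \mathbb{R}_{\geq 0}$ and $y \in \mathbb{R}_+$, the VIX futures price $F(X_t, t, T)$ with payoff function
$g(\mathrm{VIX}_T^2) = \sqrt{\mathrm{VIX}_T^2}$ and time $t \in [0, T]$ is
$$F(X_t, t, T) = {\sum_{k=0}^{N-1}}{\vphantom{\sum}}' C_k(a, b)\,\varrho_k(a, b), \quad N \in \mathbb{N},$$
with the coefficients
$$C_k(a, b) := \mathrm{Re}\left[\exp\left(\varphi_X(T - t) + \mathrm{tr}\left(\psi_X(T - t) X_t\right)\right)\exp(c_k a)\right], \quad c_k \equiv -\frac{ik\pi}{b - a},$$
$$\varrho_k(a, b) := \begin{cases} \xi := \dfrac{2}{b-a}\displaystyle\int_a^b \sqrt{x}\,\cos\left(k\pi\dfrac{x-a}{b-a}\right)\mathrm{d}x, & k > 0, \\[2ex] \dfrac{4b^{3/2}}{3(b-a)}, & k = 0, \end{cases}$$
where $\xi$ can be obtained in closed form:
$$\xi = \frac{2}{b - a}\,\mathrm{Re}\left[e^{c_k a}\left(\frac{\sqrt{b}\,e^{c_k b}}{-c_k} + \frac{\sqrt{\pi}\,\mathrm{erfz}\left(\sqrt{c_k b}\right)}{2c_k^{3/2}}\right)\right].$$
The coefficients $\varphi_X(T - t)$ and $\psi_X(T - t)$ are given in Corollary 3.1 with $\rho_0 = 0$ and $\omega = -c_k$.
15 The modified put-call parity is given in (1).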
4 Pricing Performance
Every theoretical model needs to be justified by the data, which is the aim of the
current section. To do so, we focus on the VIX market to demonstrate the model flexibility and leave
a possible extension to the joint calibration of SPX and VIX option data as an interesting avenue
for future research.
First, for the sake of model parsimony we assume that Q is a symmetric matrix. Moreover, we compare
our MAD framework with two different reference models:
1. A one-factor stochastic volatility model with jumps in both the asset return and the volatility
process. Henceforth, we denote this model by SVJJ.
2. A parameterized version of the Christoffersen et al. (2009) model, henceforth denoted by the SV-2F
model, which is naturally nested in our MAD model.
We refer to Appendix C for the detailed description of the SVJJ model. In terms of the nested
model, setting the matrices M, Q, and X as diagonal yields the SV-2F model. To see this linkage,
observe that with M, Q, and X assumed to be diagonal, we obtain, for i = 1, 2,
$$\mathrm{d}X_{ii,t} = \left(\beta\left(Q_{ii}\right)^2 + 2M_{ii}X_{ii,t}\right)\mathrm{d}t + 2Q_{ii}\sqrt{X_{ii,t}}\,\mathrm{d}B^*_{i,t}, \tag{14}$$
where $B^* = (B_1^*, B_2^*)$ defined by $B_i^* := \left(\sqrt{X_{ii}}\right)^{-1}\sum_{j=1}^{2}\sqrt{X_{ij}}\,B^{\mathbb{Q}}_{ji}$ is a vector of two independent
Brownian motions. Hence, in this simple case where the parameters are diagonal matrices, the
diagonal components of the Wishart process are independent CIR processes (cf. Benabid et al.
(2010)). Finally, by identifying the parameters in (14) with the parameters in Christoffersen
et al. (2009), we achieve the parameterized nesting, i.e. for i = 1, 2, we have
$$a_i \equiv \beta\left(Q_{ii}\right)^2, \quad b_i \equiv -2M_{ii}, \quad \sigma_i \equiv 2Q_{ii}.$$
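The parameter identification above can be written as a short helper. The sketch below maps diagonal MAD parameters to the SV-2F parameters (a_i, b_i, σ_i) and evaluates the Feller condition σ_i² < 2a_i that is used later in the calibration. The function names are hypothetical; the numerical inputs are the SV-2F estimates reported in Table 3, read as β, M11, M22, Q11, and Q22.

```python
def sv2f_from_mad(beta, m_diag, q_diag):
    """Map diagonal MAD parameters to (a_i, b_i, sigma_i) of the SV-2F model."""
    factors = []
    for m_ii, q_ii in zip(m_diag, q_diag):
        a_i = beta * q_ii ** 2      # a_i = beta * Q_ii^2
        b_i = -2.0 * m_ii           # b_i = -2 * M_ii
        sigma_i = 2.0 * q_ii        # sigma_i = 2 * Q_ii
        factors.append((a_i, b_i, sigma_i))
    return factors


def feller_satisfied(a_i, sigma_i):
    """Feller condition in the form sigma_i^2 < 2 * a_i."""
    return sigma_i ** 2 < 2.0 * a_i


# SV-2F estimates from Table 3 (August 17, 2011).
params = sv2f_from_mad(beta=1.1182, m_diag=(-0.7369, -20.4483),
                       q_diag=(-0.7910, -1.2732))
feller_flags = [feller_satisfied(a_i, sigma_i) for a_i, _, sigma_i in params]
```

Because σ_i enters the condition only through its square, the sign of Q_ii does not matter for the check.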
In this setting, the off-diagonal component X12 becomes redundant, and the stochastic skewness is
completely driven by the diagonal factors X11 and X22. Hence, the dynamics of the risk-neutral skewness
in this model is completely governed by the dynamics of the volatility components. Notice that the
comparison between the MAD and the SV-2F model allows us to assess the importance of the stochastic
co-volatility factor for the replication of the positive implied volatility skew of VIX options.
For the calibration we use the so-called market implied approach which relies on the existence of
semi-closed– or closed-form expressions for the prices of benchmark derivatives. Let I ⊂ N denote
the set of (noiseless) VIX call option data available on a particular date.16
Definition 4.1. Our loss function, which we refer to as average relative dollar error ($ARE), is
defined by
$$\$\mathrm{ARE}(\Theta) := \frac{1}{\#I}\sum_{i \in I}\frac{\left|P_i^{\mathrm{Mkt}} - P_i(\Theta)\right|}{P_i^{\mathrm{Mkt}}}, \tag{15}$$
with #I denoting the cardinality of the set I and where, for all i ∈ I, P_i^Mkt is the observed market
call price and P_i is the model-implied call price for a given parameter set Θ.
16 For the detailed data treatment we refer to Appendix C in the supplementary appendix where we describe which
exclusionary criteria are applied.
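The loss function in Definition 4.1 translates directly into code. The following minimal sketch computes the average relative dollar error for two equally long lists of market and model prices; the function name and the input validation are illustrative choices rather than part of the paper.

```python
def average_relative_error(market_prices, model_prices):
    """Average relative dollar error ($ARE) over a set of call options."""
    if len(market_prices) != len(model_prices) or not market_prices:
        raise ValueError("need two non-empty price lists of equal length")
    total = 0.0
    for p_mkt, p_model in zip(market_prices, model_prices):
        total += abs(p_mkt - p_model) / p_mkt
    return total / len(market_prices)
```

For two hypothetical options with market prices 2.0 and 4.0 and model prices 1.0 and 5.0, the relative errors are 0.5 and 0.25, so the function returns 0.375.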
For our empirical investigations we have
MAD Model (10 Parameters): Θ = {β, M11, M12, M22, Q11, Q12, Q22, X11,t, X12,t, X22,t} ,
SV-2F Model (7 Parameters): Θ = {β, M11, M22, Q11, Q22, X11,t, X22,t} ,
SVJJ Model (9 Parameters): Θ = {κ, m, σ, λ0, λ1, µS, σS, ν, vt−} .
We consider the relative absolute distance to favor particularly OTM call options since the implied
volatilities are the highest for these options (see Figure 3) and we are interested in fitting the stylized
fact of a positive implied volatility skew. Using implied volatilities directly would entail considerable
computational difficulties, i.e. the numerical inversion of the Black-Scholes formula at each minimization
step. Therefore, we use simple option prices as the metric.17 Overall, this specification is in line with the
current literature such as Papanicolaou and Sircar (2014).
The loss function (15) implies a high-dimensional non-convex optimization problem. To mitigate
the ill-posedness problem and to find the global minimum we use a differential evolution (DE)
algorithm.18
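To illustrate the idea of a population-based global search, the sketch below implements a minimal differential evolution loop of the common DE/rand/1/bin type and applies it to a toy two-parameter calibration with a relative-error loss. The paper does not state which DE variant or control parameters are used, so the variant, the population size, the mutation weight, the crossover rate, and the toy pricing function are all assumptions made for illustration only.

```python
import math
import random


def differential_evolution(loss, bounds, pop_size=20, generations=200,
                           weight=0.7, crossover=0.9, seed=0):
    """Minimal DE/rand/1/bin minimizer over a box given by `bounds`."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [loss(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct members other than the current target.
            r1, r2, r3 = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one mutated coordinate
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < crossover:
                    value = pop[r1][d] + weight * (pop[r2][d] - pop[r3][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the box
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = loss(trial)
            if trial_score <= scores[i]:  # greedy selection
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


# Toy calibration target: prices 2 * exp(-0.5 * k) at hypothetical strikes k.
STRIKES = [0.5, 1.0, 1.5, 2.0, 2.5]
MARKET = [2.0 * math.exp(-0.5 * k) for k in STRIKES]


def toy_loss(theta):
    """Average relative error of the toy model theta[0] * exp(-theta[1] * k)."""
    errors = [abs(p - theta[0] * math.exp(-theta[1] * k)) / p
              for k, p in zip(STRIKES, MARKET)]
    return sum(errors) / len(errors)
```

A call such as `differential_evolution(toy_loss, [(0.1, 5.0), (0.0, 2.0)])` returns the best parameter vector found and its loss. In a two-stage procedure, this output could serve as the starting value for a local optimizer.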
To achieve a representative result for the entire sample, we divide the data into three different
monthly subsamples which encompass different states of the economy (∅ denotes the
average):19
I. October 2008, ∅VIX = 61.18,
II. April 2010, ∅VIX = 17.42,
III. August 2011, ∅VIX = 35.03.
17 Instead, one could also use the vega approximation of Cont and Tankov (2004).
18 A DE algorithm is an efficient heuristic for global optimization over continuous spaces. The inception of such
evolutionary-based optimization strategies can be traced back to Storn and Price (1997) and some earlier papers
cited therein. In brief, a DE algorithm is a population-based optimizer that attacks the starting point problem
by sampling the objective function at multiple, randomly chosen initial points and generates new points that are
perturbations of existing points. We refer to Price et al. (2005) and Chakraborty (2008) for a review of differential
evolution algorithms.
19 A further motivation for these choices is the empirical VIX distributions during the subsamples. They
vary considerably across the data sets, which makes them challenging for a model to replicate.
Estimating the model on each weekday is computationally very challenging. Therefore, to further
lighten the computational burden, we sample the data weekly every Wednesday to avoid weekday
effects.
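The weekly Wednesday sampling can be reproduced with the standard library. The sketch below lists all Wednesdays of a given month; the function name is hypothetical. Applied to October 2008, April 2010, and August 2011, it yields the calibration dates listed in Table 2.

```python
import datetime


def wednesdays(year, month):
    """Return all Wednesdays of the given month as date objects."""
    day = datetime.date(year, month, 1)
    result = []
    while day.month == month:
        if day.weekday() == 2:  # Monday is 0, so Wednesday is 2
            result.append(day)
        day += datetime.timedelta(days=1)
    return result
```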
The calibration results are reported in Table 2. To compare the effect of enforcing the Feller
condition, i.e. $\sigma_i^2 < 2a_i$ for i = 1, 2, we calibrate two versions of the SV-2F model. The results
in the third column of Table 2 are obtained without imposing the Feller condition and violate it, whereas
the results in the fourth column are obtained with the Feller condition imposed. Another
way of enforcing this condition in our framework is by setting β ≥ 3 (see also Section 3.2). Notice
that we impose the constraint of distinct eigenvalues for the initial state of X and therefore we
assure that $\mathrm{tr}(X_t) > 0$ a.s. for all t ∈ [0, T] (see also Section 3.2). Overall, in the first and third
data sets the in-sample performance of the MAD model is comparable to that of the SVJJ
model, whereas in the second data set the SVJJ model performs better. Comparing with
the SV-2F model results, we can observe that modeling the stochastic co-volatility factor
can significantly improve the calibration results in all states. One possible reason is the
improved modeling of higher conditional moments. In particular, it seems that the MAD model
can generate a higher degree of variety in the conditional skewness and kurtosis of the
transition probability density compared to the SV-2F model. In statistical terms, the
volatility density implied by the MAD model is more right-skewed, i.e. has more mass concentrated
at high volatility levels than at low volatility levels, than the volatility density implied by the SV-2F
model. This concentration is needed to replicate the positive implied volatility skew observed in the
VIX market. As already pointed out in Section 3.2, considering only orthogonal
diagonal components restricts the structure of $f^{\mathbb{Q}}\left(\mathrm{VIX}_T^2 \mid \mathrm{VIX}_t^2\right)$, which is of high importance
for the pricing. These empirical fitting results therefore underpin the benefit of stochastic correlation
between the risk factors X11 and X22. Additionally imposing the Feller condition in the SV-2F model can
place a strong constraint on the underlying transition density, which is reflected in the partly poor
in-sample fitting results.
[Table 2 about here.]
To illustrate the contribution of the additional degree of freedom in the MAD model, we investigate
the parameter estimates, the model-implied option prices, and the model-implied implied volatilities
on August 17, 2011 in more detail. The qualitative implications for the other days remain similar.
As an additional comparison, we calibrate the original Heston (1993) model, henceforth denoted by
SV-1F. The parameter estimates are reported in Table 3. For the sake of consistency, we only
consider the original SV-2F formulation for the comparison, since the additional burden of the Feller
condition can distort the results. A clear economic interpretation of the parameters in the MAD
and SV-2F models is difficult. However, we can observe that the SV-2F model exhibits a higher
mean reversion than the MAD model since M has larger negative eigenvalues. Moreover, X12 is
significantly larger than zero, implying a positive stochastic correlation between the risk factors X11
and X22. Compared to the SV-1F model, the SVJJ model implies an almost twice as large speed of mean
reversion, whereas the volatility of volatility is much lower. This behavior is linked
to the SVJJ model’s ability to substitute volatility jumps for high volatility of volatility. Notice that the
SVJJ model exhibits approximately two and a half expected jumps per year, with a mean volatility
jump size of ν = 0.1875.
[Table 3 about here.]
To gain structural insights into the in-sample fit of the models, we plot in Figure 7 and Figure 8
the model-implied option prices and implied volatilities, respectively, in conjunction with the
corresponding market values for four different times to maturity. The closing at-the-money (ATM)
VIX futures prices are 27.65, 26.58, 25.86, and 24.91 for 35, 63, 91, and 126 days, respectively.
The one-factor stochastic volatility model SV-1F systematically fails in fitting the option prices and
implied volatilities. In particular, the implied option price decay is too high, which is due to the
chi-squared volatility density inducing too little mass at high volatility states. This in turn increases the
probability of small VIX levels and consequently the values of in-the-money (ITM) VIX call options
(or OTM VIX put options). Hence, we observe negative implied volatility skews, which confirms the
findings of Gatheral (2008). The pure diffusion model SV-2F undervalues the ATM VIX call option
prices and fails to replicate the implied volatility patterns for almost all times to maturity and
log-moneyness levels. Nonetheless, we can observe that adding a second stochastic volatility term
changes the slope of the implied volatility skews to positive. Therefore, the SV-2F model
can widen the bulk of high volatility states. In terms of option prices, we find almost no difference
between the MAD and SVJJ models. Looking at the implied volatility levels, we note
that both models are able to generate the appropriate positive implied volatility skews, but the SVJJ
model tends to increase the implied volatilities for ITM VIX call options whereas the MAD model
generates decreasing implied volatility patterns for all log-moneyness levels.
[Figure 7 about here.]
[Figure 8 about here.]
We emphasize that this empirical analysis does not include any final conclusion for model selection
since we merely contemplate the in-sample fitting performance. Moreover, we do not question the
usefulness of adding a jump process to the return and/or volatility dynamics, respectively. Instead,
we reckon that modeling an implied correlation process for the risk factors is an alternative way to
deal with model deficiencies.
We would like to stress that these fitting exercises are by no means complete. In order to fully
understand the importance of jumps and/or stochastic co-volatility factors, one would need to
implement a sequential calibration exercise for a complete time series of cross-sectional option data,
which is however beyond the scope of our current computational infrastructure. We leave this task
for future research.
It is worth mentioning that in terms of computational complexity, which is a very important issue
from a practitioner’s point of view, the MAD model is two and a half to three times faster than
the SVJJ model in daily fitting.20 Therefore, our model is practicable and yet flexible
enough to account for the positive skew in the VIX market.
20 Evidently, this fact is due to the numerical resolution of the generalized Riccati differential equations in the SVJJ
model.
5 Conclusion
Due to the high analytical tractability and the enormous flexibility of the Wishart process, we propose
an application to the pricing of CBOE VIX options. We carry out the dynamics for the CBOE VIX
in a linear affine way and the discounted Laplace transform exhibits an exponentially affine property.
The tractable model structure lightens the computational burden and facilitates a fast identification
of the parameter estimates. Finally, we empirically show that modeling the stochastic co-volatility
factor can significantly improve the in-sample fitting results due to the improved modeling of higher
conditional moments in the underlying transition probability density.
Appendices
A Proofs
This section contains the mathematical proofs of Proposition 3.1, Proposition 3.2, Proposition 3.3,
Proposition 3.4, Theorem 3.1, and Theorem 3.2.
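Proof of Proposition 3.1. It is shown in (Buraschi et al., 2008, Appendix C) that $\mathbb{E}^{\mathbb{Q}}[X_s \mid \mathcal{F}_t]$, where
$0 \leq t \leq s \leq T$, has the solution
$$\mathbb{E}^{\mathbb{Q}}[X_s \mid \mathcal{F}_t] = \exp[(s - t)M]\,X_t \exp\left[(s - t)M^\top\right] + \beta\int_0^{s-t}\exp[uM]\,Q^\top Q \exp\left[uM^\top\right]\mathrm{d}u.$$
Assume now that M is a symmetric matrix and take the trace of $\mathbb{E}^{\mathbb{Q}}[X_s \mid \mathcal{F}_t]$:
$$\mathbb{E}^{\mathbb{Q}}[\mathrm{tr}(X_s) \mid \mathcal{F}_t] = \mathrm{tr}\left(X_t \exp[2(s - t)M]\right) + \beta\,\mathrm{tr}\left[Q^\top Q\,(2M)^{-1}\left(\exp[2(s - t)M] - I_2\right)\right].$$
Finally, by integrating (3) between $t$ and $t + \tau_{\mathrm{VIX}}$, where $\tau_{\mathrm{VIX}} \equiv \frac{30}{365}$, we get to the solution
$$\mathrm{VIX}_t^2 = \frac{1}{\tau_{\mathrm{VIX}}}\,\mathrm{tr}\left[\left(X_t + \beta Q^\top Q\,(2M)^{-1}\right)(2M)^{-1}\left(\exp[2\tau_{\mathrm{VIX}}M] - I_2\right)\right] - \beta\,\mathrm{tr}\left[Q^\top Q\,(2M)^{-1}\right].$$
Doing the algebra yields the solution (4) with the coefficients in (5) and (6).21 Notice, one way to
deal with a non-symmetric matrix M could be by setting $M = \frac{M + M^\top}{2}$.22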
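21 Recall that the trace operator is invariant under cyclic permutations.
22 See also Benabid et al. (2010).
Proof of Proposition 3.2. By applying the stochastic Leibniz rule, we obtain
$$\mathrm{d}X_t = Y_t^\top\mathrm{d}Y_t + \mathrm{d}Y_t^\top Y_t + \mathrm{d}\langle Y^\top, Y\rangle_t = \left(\beta Q^\top Q + MX_t + X_tM^\top\right)\mathrm{d}t + \sqrt{X_t}\left(\sqrt{X_t}\right)^{-1}Y_t^\top\mathrm{d}\widetilde{B}_t^{\mathbb{Q}}Q + Q^\top\left(\mathrm{d}\widetilde{B}_t^{\mathbb{Q}}\right)^\top Y_t\left(\sqrt{X_t}\right)^{-1}\sqrt{X_t}.$$
Now employing Lévy’s characterization of Brownian motion, we can define the $\mathbb{R}^{2 \times 2}$-valued Brownian
motion
$$B^{\mathbb{Q}} := \left(\sqrt{X}\right)^{-1}Y^\top\widetilde{B}^{\mathbb{Q}},$$
which completes the proof.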
Proof of Proposition 3.3. Suppose that X is an adapted Markov process in the state space $S_2^+$. Under
some mild regularity conditions, the Lévy infinitesimal generator $\mathcal{A}^X$ of the matrix Markov process
X is defined for bounded $f(x) \in C^2\left(S_2^+; \mathbb{R}\right)$ functions by
$$\mathcal{A}^X f(x) := \mathrm{tr}\left[\left(\beta Q^\top Q + Mx + xM^\top\right)D + 2xDQ^\top QD\right]f(x), \tag{A.1}$$
where D is a 2 × 2 matrix of differential operators with the ij-component given by $\frac{\partial}{\partial x_{ij}}$.23 Since
the generator exhibits an affine dependence on the state space $x \in S_2^+$, we obtain, by separation
of the variables, the exponential affine property with the matrix Riccati equations given in the
proposition.
Proof of Proposition 3.4. We follow the same rationale as in the proof of Proposition 3.3 and repeat
it here for the reader’s convenience. Suppose that X is an adapted Markov process in the state space
$S_2^+$. Under some mild regularity conditions, the Lévy infinitesimal generator $\mathcal{A}^{SX}$ of (S, X) is defined
for bounded $f(s, x) \in C^{2,2}\left(\mathbb{R} \times S_2^+; \mathbb{R}\right)$ functions by
$$\mathcal{A}^{SX} f(s, x) := -\frac{1}{2}\mathrm{tr}(x)\frac{\partial f(s, x)}{\partial s} + \frac{1}{2}\mathrm{tr}(x)\frac{\partial^2 f(s, x)}{\partial s^2} + \mathrm{tr}\left[\left(\beta Q^\top Q + Mx + xM^\top\right)D + \left(DQ^\top P^\top x + xPQD\right)\frac{\partial}{\partial s} + 2xDQ^\top QD\right]f(s, x),$$
where D is a 2 × 2 matrix of differential operators with the ij-component given by $\frac{\partial}{\partial x_{ij}}$.24 Since the
generator exhibits an affine dependence on the state space $(s, x) \in \mathbb{R} \times S_2^+$, we obtain, by separation
of the variables, the exponential affine property with the matrix Riccati equations given in the
proposition.
23 The infinitesimal generator of X has been calculated by Bru (1991).
24 The infinitesimal generator of (S, X), in which S represents the logarithmic spot price process, has been calculated
by Da Fonseca et al. (2008) for a multifactor volatility Heston model.
Proof of Theorem 3.1. The general COS formula for European contingent claims is taken from (Fang
and Oosterlee, 2008, Section 3).25 The corresponding coefficients are provided in Bardgett et al.
(2014), for which we merely have to substitute our discounted Laplace transform into $A_k(a, b)$ while
$\phi_k(a, b)$ remains the same. Hence, the result follows.
Proof of Theorem 3.2. Following (Fang and Oosterlee, 2008, Section 3) and taking into account the
transformation $\Psi^{S}(\vartheta, S_t, X_t, t, T)\,e^{-\vartheta\ln(K)}$ for the discounted Laplace transform yields the result.
B The Non-Central Wishart Distribution of X
In this section we state the transition probability density $f^{\mathbb{Q}}(X_T \mid X_t)$ for a negative definite
symmetric matrix $M \in \mathbb{R}^{2 \times 2}$.
Proposition B.1. The transition probability density $f^{\mathbb{Q}}(X_T \mid X_t)$ for any time t can be calculated
explicitly in the following form:
$$f^{\mathbb{Q}}(X_T \mid X_t) = \frac{\det\left(\Sigma_{t,T}\right)^{-\beta/2}\det\left(X_T\right)^{(\beta-1)/2}}{2^{\beta}\,\Gamma_2\left(\frac{\beta}{2}\right)}\exp\left(-\frac{1}{2}\mathrm{tr}\left[\Sigma_{t,T}^{-1}\left(X_T + \xi_{t,T}X_t\xi_{t,T}\right)\right]\right) \times {}_0F_1\left(\frac{\beta}{2}, \frac{\xi_{t,T}X_t\xi_{t,T}X_T}{4}\right),$$
with the parameters
$$\xi_{t,T} := \exp[(T - t)M], \tag{B.1}$$
$$\Sigma_{t,T} := \beta Q^\top Q\,(2M)^{-1}\left(\exp[2(T - t)M] - I_2\right), \tag{B.2}$$
and where $\det(\cdot)$ stands for the determinant operator, ${}_pF_q(\cdot, \cdot)$, for $p, q \in \mathbb{N}_0$, denotes the hypergeometric
function which can be expressed in terms of zonal polynomials (see Muirhead (2005)),26 and
$\Gamma_k(\cdot)$, for $k \in \mathbb{N}$, is the multi-dimensional gamma function defined by
$$\Gamma_k(x) := \int_{\Lambda > 0}\exp\left(\mathrm{tr}(-\Lambda)\right)\left(\det(\Lambda)\right)^{x - \frac{k+1}{2}}\mathrm{d}\Lambda.$$
Proof. Applying the results of (Muirhead, 2005, Chapter 10) and bearing the symmetry of the matrix
M in mind yields the proposition.
25 Note that we consider the discounted Laplace transform in this paper, in contrast to Fang and Oosterlee (2008),
and therefore do not have an additional discount factor $e^{-r(T-t)}$ in the expression of $\Upsilon(\mathrm{VIX}_t, K, t, T)$.
26 These polynomials have no closed-form expressions, but can be computed recursively.
C The Reference Jump Model: A Synopsis
In this section we sketch our reference jump model used for comparison. It is a simplified version
of Bardgett et al. (2014) with a constant central tendency. As usual, we denote the futures price
of the SPX by F = {Ft : t ∈ [0, T]} and its logarithmic price process by S = {St : t ∈ [0, T]} =
{ln (Ft) : t ∈ [0, T]}.
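Assumption C.1. Let $\kappa, m, \sigma \in \mathbb{R}_+$, let $W^{\mathbb{Q}}$ and $B^{\mathbb{Q}}$ be standard Brownian motions in $\mathbb{R}$ under the
risk-neutral measure $\mathbb{Q}$, let $v_{t-} := \lim_{s \uparrow t} v_s$ be the left limit of the process v at time t > s, and suppose
that the Feller condition (cf. Feller (1951)) holds, i.e. $\sigma^2 < 2\kappa m$. Moreover, we assume
$$W^{\mathbb{Q}} := B^{\mathbb{Q}}\rho_v + M^{\mathbb{Q}}\sqrt{1 - \rho_v^2},$$
where $M^{\mathbb{Q}}$ is a standard Brownian motion, independent of $B^{\mathbb{Q}}$, in $\mathbb{R}$ under the risk-neutral measure
$\mathbb{Q}$, and $\rho_v \in [-1, 1]$ is a deterministic correlation coefficient. The SPX log-dynamics is determined
by the SDEs:
$$\mathrm{d}S_t = \left(-\lambda(v_{t-})\left(\theta_J^{\mathbb{Q}}(1, 0) - 1\right) - \frac{1}{2}v_{t-}\right)\mathrm{d}t + \sqrt{v_{t-}}\,\mathrm{d}W_t^{\mathbb{Q}} + Z_t^{S,\mathbb{Q}}\mathrm{d}N_t, \quad S_0 = s \in \mathbb{R},$$
$$\mathrm{d}v_t = \kappa\left(m - v_{t-}\right)\mathrm{d}t + \sigma\sqrt{v_{t-}}\,\mathrm{d}B_t^{\mathbb{Q}} + Z_t^{v,\mathbb{Q}}\mathrm{d}N_t, \quad v_0 = v^* \in \mathbb{R}_+,$$
where $N = \{N_t : t \in [0, T]\}$ is a standard Poisson process with the affine intensity
$$\lambda(v) = \lambda_0 + \lambda_1 v, \quad \lambda_0, \lambda_1 \in \mathbb{R}_{\geq 0}.$$
The joint Laplace transform of the independent and identically distributed random jump sizes
$Z := \left(Z^{S,\mathbb{Q}}, Z^{v,\mathbb{Q}}\right)$ is defined as follows:
$$\theta_J^{\mathbb{Q}}(\vartheta, \gamma) := \mathbb{E}^{\mathbb{Q}}\left[\exp\left(\vartheta Z^{S,\mathbb{Q}} + \gamma Z^{v,\mathbb{Q}}\right)\right],$$
where $\vartheta, \gamma \in \mathbb{C}$. Finally, we assume that the jump sizes in the returns are normally distributed with
mean $\mu_S \in \mathbb{R}$ and variance $\sigma_S^2 \in \mathbb{R}_{\geq 0}$, and the jump sizes in the volatility factor follow
an exponential distribution with mean $\nu$.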
This model framework also implicitly defines the dynamics for the VIX. The result follows by
the affine property and using the result of (Egloff et al., 2010, Proposition 2) for the integrated
variance.
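Proposition C.1. The VIX squared at time t is given by
$$\mathrm{VIX}_t^2 := \frac{1}{\tau_{\mathrm{VIX}}}\mathbb{E}^{\mathbb{Q}}\left[\int_t^{t+\tau_{\mathrm{VIX}}} v_{s-}\,\mathrm{d}s + 2\left(e^{Z_s^{S,\mathbb{Q}}} - 1 - Z_s^{S,\mathbb{Q}}\right)\mathrm{d}N_s\,\middle|\,\mathcal{F}_t\right], \quad \tau_{\mathrm{VIX}} \equiv \frac{30}{365},$$
where the expectation can be carried out explicitly in the affine form of
$$\mathrm{VIX}_t^2 = \alpha_{\mathrm{VIX}^2} + \beta_{\mathrm{VIX}^2}v_{t-},$$
with
$$\alpha_{\mathrm{VIX}^2} := 2\lambda_0\varpi_0 + \left(1 + 2\lambda_1\varpi_0\right)\frac{\kappa m + \nu\lambda_0}{\nu\lambda_1 - \kappa}\left(\varpi_1 - 1\right),$$
$$\beta_{\mathrm{VIX}^2} := \left(1 + 2\lambda_1\varpi_0\right)\varpi_1,$$
$$\varpi_0 \equiv e^{\mu_S + \frac{1}{2}\sigma_S^2} - \mu_S - 1,$$
$$\varpi_1 \equiv \frac{e^{(\nu\lambda_1 - \kappa)\tau_{\mathrm{VIX}}} - 1}{(\nu\lambda_1 - \kappa)\,\tau_{\mathrm{VIX}}}.$$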
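The affine coefficients of Proposition C.1 are straightforward to evaluate numerically. The following minimal sketch computes them for the SVJJ estimates reported in Table 3 and derives the corresponding model-implied VIX level. The function and variable names are hypothetical, and `w0` and `w1` stand for the two auxiliary constants defined in the proposition.

```python
import math

TAU_VIX = 30.0 / 365.0


def svjj_vix2_coefficients(kappa, m, lam0, lam1, mu_s, sigma_s, nu, tau=TAU_VIX):
    """Return (alpha, beta) with VIX^2 = alpha + beta * v in the SVJJ model."""
    # Jump correction term e^{mu_S + sigma_S^2 / 2} - mu_S - 1.
    w0 = math.exp(mu_s + 0.5 * sigma_s ** 2) - mu_s - 1.0
    c = nu * lam1 - kappa
    w1 = (math.exp(c * tau) - 1.0) / (c * tau)
    scale = 1.0 + 2.0 * lam1 * w0
    alpha = 2.0 * lam0 * w0 + scale * (kappa * m + nu * lam0) / c * (w1 - 1.0)
    beta = scale * w1
    return alpha, beta


# SVJJ estimates from Table 3 (August 17, 2011); sigma is not needed here.
alpha, beta = svjj_vix2_coefficients(kappa=10.1968, m=0.0303, lam0=1.5449,
                                     lam1=14.8902, mu_s=0.0724, sigma_s=0.1072,
                                     nu=0.1875)
vix = math.sqrt(alpha + beta * 0.0650)  # model-implied VIX at v = 0.0650
```

The volatility of volatility σ does not enter these coefficients because only the conditional mean of the variance process is required for the integrated variance.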
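Let us remark that following the quadratic variation convention for variance swaps would yield
the following expectation:
$$\mathrm{VIX}_t^2 = \frac{1}{\tau_{\mathrm{VIX}}}\mathbb{E}^{\mathbb{Q}}\left[\int_t^{t+\tau_{\mathrm{VIX}}} v_{s-}\,\mathrm{d}s + \left(Z_s^{S,\mathbb{Q}}\right)^2\mathrm{d}N_s\,\middle|\,\mathcal{F}_t\right]. \tag{C.1}$$
However, the variance swap rate inherits an approximation error due to price jumps (cf. Carr and
Wu (2009)). We adjust for the jump-induced error by replacing $\left(Z_s^{S,\mathbb{Q}}\right)^2$ with the correction term
$2\left(e^{Z_s^{S,\mathbb{Q}}} - 1 - Z_s^{S,\mathbb{Q}}\right)$ in (C.1) and hence, we replicate the VIX exactly.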
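Furthermore, the discounted Laplace transform of $\mathrm{VIX}_T^2$ at time t under $\mathbb{Q}$ is exponentially
affine:27
$$\Psi^{\mathrm{VIX}^2}(\omega, v_{t-}, t, T) := \mathbb{E}^{\mathbb{Q}}\left[\exp\left(-\rho_0\tau + \omega\,\mathrm{VIX}_T^2\right)\,\middle|\,\mathcal{F}_t\right] = \exp\left(\varphi_v(\tau) + \psi_v(\tau)v_{t-}\right),$$
where $\omega \in \mathbb{C}$, $\rho_0 \in \mathbb{R}_{\geq 0}$, and $\tau := T - t$. The functions $(\varphi_v(\tau), \psi_v(\tau)) \in \mathbb{R}^2$ solve the system of
generalized Riccati differential equations:
$$\frac{\mathrm{d}\psi_v(\tau)}{\mathrm{d}\tau} = -\kappa\psi_v(\tau) + \frac{1}{2}\sigma^2\psi_v^2(\tau) + \lambda_1\left(\theta_J^{\mathbb{Q}}(0, \psi_v(\tau)) - 1\right),$$
$$\frac{\mathrm{d}\varphi_v(\tau)}{\mathrm{d}\tau} = -\rho_0 + \psi_v(\tau)\kappa m + \lambda_0\left(\theta_J^{\mathbb{Q}}(0, \psi_v(\tau)) - 1\right),$$
subject to the terminal conditions $\varphi_v(0) = \omega\alpha_{\mathrm{VIX}^2}$ and $\psi_v(0) = \omega\beta_{\mathrm{VIX}^2}$. For the numerical resolution
of the Riccati differential equations governing the coefficients $\varphi_v(\tau)$ and $\psi_v(\tau)$, the standard
Runge-Kutta 4th order method is applied. In order to price VIX (call) options, the previously outlined COS
method is used.
27 For simplicity’s sake we again set the risk-free rate constant.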
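The Runge-Kutta step for the generalized Riccati system of the SVJJ model can be sketched as follows. The jump transform uses the exponential distribution with mean ν from Assumption C.1, for which E[exp(γZ)] = 1/(1 − νγ) whenever Re(γ) < 1/ν. The function names, the step count, and the closed-form helper for the case without jumps are illustrative additions and are not taken from the paper. Python's complex arithmetic allows the same code to be used with complex initial conditions.

```python
import math


def jump_transform(gamma, nu):
    """Laplace transform E[exp(gamma * Z)] of an exponential jump with mean nu."""
    return 1.0 / (1.0 - nu * gamma)  # requires Re(gamma) < 1 / nu


def solve_riccati_rk4(tau, psi0, phi0, kappa, m, sigma, lam0, lam1, nu,
                      rho0=0.0, steps=1000):
    """Integrate the generalized Riccati system for (phi_v, psi_v) with RK4."""
    def rhs(phi, psi):
        jump = jump_transform(psi, nu) - 1.0
        d_psi = -kappa * psi + 0.5 * sigma ** 2 * psi ** 2 + lam1 * jump
        d_phi = -rho0 + psi * kappa * m + lam0 * jump
        return d_phi, d_psi

    h = tau / steps
    phi, psi = phi0, psi0
    for _ in range(steps):
        k1 = rhs(phi, psi)
        k2 = rhs(phi + 0.5 * h * k1[0], psi + 0.5 * h * k1[1])
        k3 = rhs(phi + 0.5 * h * k2[0], psi + 0.5 * h * k2[1])
        k4 = rhs(phi + h * k3[0], psi + h * k3[1])
        phi += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        psi += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return phi, psi


def psi_no_jumps(tau, psi0, kappa, sigma):
    """Closed-form psi_v when lam1 = 0 (Bernoulli equation), for sanity checks."""
    decay = math.exp(-kappa * tau)
    return psi0 * decay / (1.0 - psi0 * sigma ** 2 / (2.0 * kappa) * (1.0 - decay))
```

Switching off the jumps (λ0 = λ1 = 0) reduces the equation for ψ_v to a Bernoulli equation, so the output of `solve_riccati_rk4` can be compared against `psi_no_jumps` as a quick consistency check before the jump terms are enabled.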
D Tables
Descriptive Statistics
Sample Period: February 27, 2006 – August 30, 2013
| | Mean | Volatility | Skewness | Kurtosis |
|---|---|---|---|---|
| Log-Returns SPX | 0.0001 | 0.0144 | -0.2995 | 12.0503 |
| VIX Levels | 22.1871 | 10.6044 | 2.0254 | 8.2538 |
Table 1 – We provide a summary of the most crucial descriptive statistical measures for the log-returns of the SPX
and the VIX levels. In particular, the sample mean, the sample volatility (measured by the empirical standard
deviation), the sample skewness, and the sample kurtosis are calculated and depicted. The sample period spans
the period of February 27, 2006 to August 30, 2013.
$ARE Results
| Date | MAD | SV-2F (Feller condition not imposed) | SV-2F (Feller condition imposed) | SVJJ | # Call Options |
|---|---|---|---|---|---|
| Data Set I. | | | | | |
| 01/10/2008 | 0.0356 | 0.0413 | 0.0444 | 0.0572 | 57 |
| 08/10/2008 | 0.0686 | 0.0771 | 0.0974 | 0.0768 | 82 |
| 15/10/2008 | 0.1005 | 0.1121 | 0.1137 | 0.1119 | 75 |
| 22/10/2008 | 0.0516 | 0.0528 | 0.0568 | 0.0896 | 55 |
| 29/10/2008 | 0.0823 | 0.1009 | 0.1597 | 0.0538 | 64 |
| Data Set II. | | | | | |
| 07/04/2010 | 0.1327 | 0.1328 | 0.1556 | 0.0609 | 61 |
| 14/04/2010 | 0.0965 | 0.1388 | 0.1509 | 0.0603 | 67 |
| 21/04/2010 | 0.0706 | 0.1264 | 0.2135 | 0.1458 | 63 |
| 28/04/2010 | 0.1506 | 0.2511 | 0.5878 | 0.0883 | 82 |
| Data Set III. | | | | | |
| 03/08/2011 | 0.0775 | 0.0978 | 0.1146 | 0.0680 | 102 |
| 10/08/2011 | 0.1284 | 0.1957 | 0.2001 | 0.1226 | 133 |
| 17/08/2011 | 0.0477 | 0.1070 | 0.1335 | 0.0822 | 84 |
| 24/08/2011 | 0.1057 | 0.1078 | 0.1215 | 0.0759 | 86 |
| 31/08/2011 | 0.0499 | 0.0518 | 0.1211 | 0.0761 | 82 |
Table 2 – The $ARE calibration results for the three data sets which encompass different states of the economy.
MAD denotes the matrix affine diffusion model outlined in Section 3, SV-2F designates the two-factor pure
diffusion model of Christoffersen et al. (2009), and SVJJ abbreviates the sketched model in Appendix C. To
mitigate the ill-posedness problem and to find the global minimum, we use a DE algorithm in the first step and
thereafter the obtained parameter values are used as starting values for a local optimizer. On each date, the data
consist of liquid VIX call options where # Call Options denotes the total number of call options to fit.
Parameter Estimates on August 17, 2011
β M11 M12 M22 Q11 Q12 Q22 X11,t X12,t X22,t
MAD 1.0010 -0.9955 0.1163 -0.0136 0.8994 -0.1846 -0.2335 0.0415 0.0268 0.0181
SV-2F 1.1182 -0.7369 -20.4483 -0.7910 -1.2732 0.0366 0.0001
κ m σ λ0 λ1 µS σS ν vt−
SVJJ 10.1968 0.0303 0.6939 1.5449 14.8902 0.0724 0.1072 0.1875 0.0650
SV-1F 5.9827 0.0787 0.9704 0.0854
Table 3 – The calibrated parameter estimates on August 17, 2011. MAD denotes the matrix affine diffusion model outlined in Section 3, SV-2F
designates the two-factor pure diffusion model of Christoffersen et al. (2009), SVJJ abbreviates the sketched model in Appendix C, and SV-1F stands
for the original Heston (1993) model. To mitigate the ill-posedness problem and to find the global minimum, we use a DE algorithm in the first step and
thereafter the obtained parameter values are used as starting values for a local optimizer. The data consist of 84 liquid VIX call options.
E Figures
[Figure 1 plot: "Joint SPX– and CBOE VIX Evolution". x-axis: Date; left y-axis: VIX Level; right y-axis: SPX Level.]
Figure 1 – The evolution of the SPX (green solid line) and the CBOE VIX (blue solid line) during the sample
period of February 27, 2006 to August 30, 2013. The y-axes correspond to the index levels.
[Figure 2 plot: "Rolling Correlations between the SPX Log-Return– and the CBOE VIX Level Increments". x-axis: Date; y-axis: Correlation Coefficient; legend: N = 25, N = 50, N = 100, N = 250.]
Figure 2 – The rolling correlations between the SPX log-return– and the CBOE VIX level increments during the
sample period of February 27, 2006 to August 30, 2013 given the four different window sizes, measured in days,
N ∈ {25, 50, 100, 250}. The y-axis corresponds to the correlation coefficient.
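A rolling correlation of the kind shown in Figure 2 can be computed as in the following minimal sketch. It assumes two equally long lists of daily closing levels and uses the plain Pearson sample correlation over each window of N observations; the function names and the short input series are hypothetical and serve only as an illustration.

```python
import math


def log_returns(levels):
    """Log-returns ln(S_t / S_{t-1}) of a level series."""
    return [math.log(b / a) for a, b in zip(levels, levels[1:])]


def increments(levels):
    """Level increments V_t - V_{t-1}."""
    return [b - a for a, b in zip(levels, levels[1:])]


def pearson(x, y):
    """Sample correlation coefficient (undefined if a window has zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rolling_correlation(spx_levels, vix_levels, window):
    """Correlation of SPX log-returns and VIX level increments over a moving window."""
    r = log_returns(spx_levels)
    dv = increments(vix_levels)
    return [pearson(r[i - window:i], dv[i - window:i])
            for i in range(window, len(r) + 1)]


# Hypothetical toy series: the VIX rises whenever the SPX falls.
spx = [100.0, 101.0, 99.0, 102.0, 98.0, 103.0]
vix = [20.0, 19.0, 22.0, 18.0, 23.0, 17.0]
correlations = rolling_correlation(spx, vix, window=3)
print(correlations)
```

With the sample windows N ∈ {25, 50, 100, 250} from the caption, a shorter window reacts faster to regime changes but produces a noisier estimate.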
[Figure 3 plot, two panels. Left: "SPX: 29/09/2011", τ ∈ {23, 51, 79, 114, 170, 261}. Right: "VIX: 29/09/2011", τ ∈ {20, 48, 83, 111, 139, 174}. x-axis: m; y-axis: Implied Volatility.]
Figure 3 – The SPX– and VIX implied volatility skews on September 29, 2011 as a function of the log-moneyness
$m := \ln\left(K / F^i_t(T)\right)$, where $K$ is the strike level of the European-style option, $F^i_t(T)$ denotes the closing SPX
(resp. VIX) futures price if $i = S$ (resp. $i = V$) today at time $t$ with maturity $T$, and $\tau := T - t$ is the option’s
time-to-maturity in daily units.
[Figure 4 plot, two panels: "Model-Implied Conditional Skewness" and "Model-Implied Conditional Kurtosis". x-axis: t; y-axes: Conditional Skewness and Conditional Kurtosis; legend: $X_{12,t}$ ∈ {−0.0268, −0.0161, −0.0054, 0.0054, 0.0161, 0.0268} and SV-2F.]
Figure 4 – The model-implied conditional skewness and kurtosis, respectively, of V := tr (X) for a varying off-
diagonal latent state X12. We use a time horizon of two years, i.e. t ∈ (0, 2]. SV-2F abbreviates a parameterized
version of the Christoffersen et al. (2009) model and is described in Section 4. The initial parameter set is given
in (7).
[Figure 5 plot, two panels. Left: "Simulated Option Prices" (x-axis: K; y-axis: Option Price). Right: "Histogram of the Simulated VIX Dynamics" (x-axis: $\mathrm{VIX}_T$; y-axis: Density). Legend: MAD, SV-2F.]
Figure 5 – On the left we plot the simulated VIX option prices using 30’000 Monte Carlo simulations and a risk-
free interest rate of 1%, and on the right the corresponding histograms of the simulated VIX dynamics are depicted.
MAD denotes the matrix affine diffusion model outlined in Section 3.2 and SV-2F abbreviates a parameterized
version of the Christoffersen et al. (2009) model and is described in Section 4. The employed parameter set is
given in (7).
[Figure 6 plot: "α-Implied VIX Values". x-axis: α; y-axis: VIX.]
Figure 6 – We fix the volatility level $V$ and the volatility composition $V_1 / V$, and plot the admissible domain for
the VIX values by varying α ∈ [0, π]. The employed parameter set is given in (7). The minimum is attained at
0.4055, whereas the maximum amounts to 0.4652.
[Figure 7 plot, four panels: Option Maturity τ = 35, τ = 63, τ = 91, and τ = 126. x-axis: K; y-axis: Option Price; legend: Market, MAD, SV-2F, SVJJ, SV-1F.]
Figure 7 – We plot the model-implied option prices in conjunction with the corresponding market values on
August 17, 2011 for four different time-to-maturities τ := T −t, measured in daily units, as a function of the strike
price K. The closing ATM VIX futures prices are 27.65, 26.58, 25.86, and 24.91 for 35, 63, 91, and 126 days,
respectively. MAD denotes the matrix affine diffusion model outlined in Section 3, SV-2F designates the two-
factor pure diffusion model of Christoffersen et al. (2009), SVJJ abbreviates the sketched model in Appendix C,
and SV-1F stands for the original Heston (1993) model. The employed parameter values are given in Table 3.
[Figure 8 plot, four panels: Option Maturity τ = 35, τ = 63, τ = 91, and τ = 126. x-axis: m; y-axis: Implied Volatility; legend: Market, MAD, SV-2F, SVJJ, SV-1F.]
Figure 8 – We plot the model-implied implied volatilities in conjunction with the corresponding market values
on August 17, 2011 for four different time-to-maturities $\tau := T - t$, measured in daily units, as a function of
the log-moneyness $m := \ln\left(K / F^V_t(T)\right)$, where $K$ is the strike level of the VIX call option and $F^V_t(T)$ denotes
the closing VIX futures price today at time $t$ with maturity $T$. The closing ATM VIX futures prices are 27.65,
26.58, 25.86, and 24.91 for 35, 63, 91, and 126 days, respectively. MAD denotes the matrix affine diffusion model
outlined in Section 3, SV-2F designates the two-factor pure diffusion model of Christoffersen et al. (2009), SVJJ
abbreviates the sketched model in Appendix C, and SV-1F stands for the original Heston (1993) model. The
employed parameter values are given in Table 3.
References
M. Abramowitz and I. A. Stegun. Handbook of mathematical functions: With formulas, graphs, and
mathematical tables, volume 55 of Applied Mathematics Series. Dover Publications, New York,
United States, 10th edition, 1972.
C. Albanese, H. Lo, and A. Mijatović. Spectral methods for volatility derivatives. Quantitative
Finance, 9(6):663–692, 2009.
T. G. Andersen, T. Bollerslev, F. X. Diebold, and H. Ebens. The distribution of realized stock return
volatility. Journal of Financial Economics, 61(1):43–76, 2001a.
T. G. Andersen, T. Bollerslev, F. X. Diebold, and P. Labys. The distribution of realized exchange
rate volatility. Journal of the American Statistical Association, 96(453):42–55, 2001b.
J. Baldeaux and A. Badran. Consistent modelling of VIX and equity derivatives using a 3/2 plus
jumps model. Applied Mathematical Finance, 21(4):299–312, 2014.
C. Bardgett, E. Gourier, and M. Leippold. Inferring volatility dynamics and risk premia from the
S&P 500 and VIX markets. Swiss Finance Institute, Research Paper Series No. 13-40, University
of Zurich, 2014.
G. Bekaert and G. Wu. Asymmetric volatilities and risk in equity markets. Review of Financial
Studies, 13(1):1–42, 2000.
A. Benabid, H. Bensusan, and N. El Karoui. Wishart stochastic volatility: Asymptotic smile and
numerical framework. Working Paper, École Polytechnique, Paris-Saclay University, 2010.
L. Bergomi. Smile dynamics III. Risk, October:90–96, 2008.
F. Black. Studies of stock price volatility changes. Proceedings of the 1976 American Statistical
Association, Business and Economical Statistics Section (American Statistical Association, Alexandria,
Virginia, United States), 177–181, 1976.
N. Branger, A. Kraftschik, and C. Völkert. The fine structure of variance: Consistent pricing of VIX
derivatives. Working Paper, University of Münster, 2014.
M.-F. Bru. Wishart processes. Journal of Theoretical Probability, 4(4):725–751, 1991.
A. Buraschi, A. Cieslak, and F. Trojani. Correlation risk and the term structure of interest rates.
Working Paper, Imperial College London, University of St. Gallen, 2008.
A. Buraschi, R. Kosowski, and F. Trojani. When there is no place to hide: Correlation risk and the
cross-section of hedge fund returns. Review of Financial Studies, 27(2):581–616, 2014.
J. Y. Campbell and L. Hentschel. No news is good news: An asymmetric model of changing volatility
in stock returns. Journal of Financial Economics, 31(3):281–318, 1992.
J. Y. Campbell and A. S. Kyle. Smart money, noise trading and stock price behaviour. Review of
Economic Studies, 60(1):1–34, 1993.
P. Carr and D. B. Madan. Option valuation using the fast Fourier transform. Journal of
Computational Finance, 2(4):61–73, 1999.
P. Carr and L. Wu. Variance risk premiums. Review of Financial Studies, 22(3):1311–1341, 2009.
U. K. Chakraborty, editor. Advances in differential evolution, volume 143 of Studies in Computational
Intelligence. Springer-Verlag, Berlin Heidelberg, Germany, 2008.
P. Christoffersen, S. L. Heston, and K. Jacobs. The shape and term structure of the index option
smirk: Why multifactor stochastic volatility models work so well. Management Science, 55(12):
1914–1932, 2009.
P. Cizeau, Y. Liu, M. Meyer, C.-K. Peng, and H. E. Stanley. Volatility distribution in the S&P500
stock index. Physica A, 245(3–4):441–445, 1997.
R. Cont and T. Kokholm. A consistent pricing model for index options and volatility derivatives.
Mathematical Finance, 23(2):248–274, 2011.
R. Cont and P. Tankov. Non-parametric calibration of jump-diffusion option pricing models. Journal
of Computational Finance, 7(3):1–50, 2004.
J. C. Cox, J. E. Ingersoll, and S. A. Ross. A theory of the term structure of interest rates.
Econometrica, 53(2):385–407, 1985.
C. Cuchiero, D. Filipović, E. Mayerhofer, and J. Teichmann. Affine processes on positive semidefinite
matrices. The Annals of Applied Probability, 21(2):397–463, 2011.
J. Da Fonseca, M. Grasselli, and C. Tebaldi. Option pricing when correlations are stochastic: An
analytical framework. Review of Derivatives Research, 10(2):151–180, 2007.
J. Da Fonseca, M. Grasselli, and C. Tebaldi. A multifactor volatility Heston model. Quantitative
Finance, 8(6):591–604, 2008.
J. Da Fonseca, M. Grasselli, and F. Ielpo. Estimating the Wishart affine stochastic correlation model
using the empirical characteristic function. Studies in Nonlinear Dynamics and Econometrics, 18
(3):253–289, 2014.
G. Drimus and W. Farkas. Local volatility of volatility for the VIX market. Review of Derivatives
Research, 16(3):267–293, 2013.
D. Duffie, J. Pan, and K. Singleton. Transform analysis and asset pricing for affine jump-diffusions.
Econometrica, 68(6):1343–1376, 2000.
D. Egloff, M. Leippold, and L. Wu. The term structure of variance swap rates and optimal variance
swap investments. Journal of Financial and Quantitative Analysis, 45(5):1279–1310, 2010.
F. Fang and C. W. Oosterlee. A novel pricing method for European options based on Fourier-cosine
series expansions. SIAM Journal on Scientific Computing, 31(2):826–848, 2008.
W. Feller. Two singular diffusion problems. Annals of Mathematics, 54(1):173–182, 1951.
G. Freiling. A survey of nonsymmetric Riccati equations. Linear Algebra and its Applications,
351–352:243–270, 2002.
J. Gatheral. Consistent modeling of SPX and VIX options. In The Fifth World Congress of the
Bachelier Finance Society, London, United Kingdom, July 2008.
J. Gatheral. Joint modeling of SPX and VIX. In National School of Development, Peking University,
Peking, China, October 2013.
J. Goard and M. Mazur. Stochastic volatility models and the pricing of VIX options. Mathematical
Finance, 23(3):439–458, 2013.
C. Gouriéroux and R. Sufana. Derivative pricing with Wishart multivariate stochastic volatility.
Journal of Business & Economic Statistics, 28(3):438–451, 2010.
C. Gouriéroux, J. Jasiak, and R. Sufana. The Wishart Autoregressive process of multivariate
stochastic volatility. Journal of Econometrics, 150(2):167–181, 2009.
P. Gruber, C. Tebaldi, and F. Trojani. Three make a smile – Dynamic volatility, skewness and
term structure components in option valuation. Working Paper, Bocconi University, University of
Lugano, 2010.
R. A. Haugen, E. Talmor, and W. N. Torous. The effect of volatility changes on the level of stock
prices and subsequent expected returns. Journal of Finance, 46(3):985–1007, 1991.
S. L. Heston. A closed-form solution for options with stochastic volatility with applications to bond
and currency options. Review of Financial Studies, 6(2):327–343, 1993.
J. Kallsen, J. Muhle-Karbe, and M. Voss. Pricing options on variance in affine stochastic volatility
models. Mathematical Finance, 21(4):627–641, 2011.
M. Leippold and F. Trojani. Asset pricing with matrix jump diffusions. Working Paper, University
of Zurich, University of Lugano, 2010.
G.-H. Lian and S.-P. Zhu. Pricing VIX options with stochastic volatility and random jumps. Decisions
in Economics and Finance, 36(1):71–88, 2013.
J. Mencía and E. Sentana. Valuation of VIX derivatives. Journal of Financial Economics, 108(2):
367–391, 2013.
R. J. Muirhead. Aspects of Multivariate Statistical Theory, volume 197 of Wiley Series in Probability
and Statistics. John Wiley & Sons, Hoboken, New Jersey, United States, 2nd edition, 2005.
A. Papanicolaou and R. Sircar. A regime-switching Heston model for VIX and S&P 500 implied
volatilities. Quantitative Finance, 14(10):1811–1827, 2014.
K. V. Price, R. M. Storn, and J. A. Lampinen. Differential evolution - A practical approach to global
optimization. Natural Computing Series. Springer-Verlag, Berlin Heidelberg, Germany, 2005.
A. Sepp. VIX option pricing in a jump-diffusion model. Risk Magazine, 84–89, 2008.
R. M. Storn and K. Price. Differential evolution - A simple and efficient heuristic for global
optimization over continuous spaces. Journal of Global Optimization, 11(4):341–359, 1997.
J. E. Zhang and Y. Zhu. VIX futures. Journal of Futures Markets, 26(6):521–531, 2006.
S.-P. Zhu and G.-H. Lian. An analytical formula for VIX futures and its applications. Journal of
Futures Markets, 32(2):166–190, 2012.
Pricing VIX Options with Multifactor Stochastic
Volatility – Supplementary Appendix∗
Pascal M. Caversaccio†
First Draft: May 16, 2014
This Version: June 13, 2016
Abstract
This supplementary appendix for Pricing VIX Options with Multifactor Stochastic Volatility
contains (1) the mathematical proofs of Corollary 3.1 and Corollary 3.2, (2) some useful
mathematical background results which we implicitly employed in the derivation of Assumption 3.1,
Corollary 3.1, and Corollary 3.2 in the main paper, (3) our data cleaning criteria, and (4)
further empirical stylized facts of the Standard & Poor’s 500 index– and Chicago Board Options
Exchange volatility index (option) market.
∗ The author thanks Chris Bardgett, Elise Gourier, Meriton Ibraimi, Markus Leippold, Stefan Pomberger, Lujing
Su, and Nikola Vasiljević for helpful comments, and Sergio Maffioletti for providing guidance and access to the cloud
infrastructure of the University of Zurich. Any remaining errors are mine.
† University of Zurich – Department of Banking and Finance, Plattenstrasse 14, 8032 Zurich, Switzerland. E-Mail:
pascalmarco.caversaccio@uzh.ch.
A Proofs
This section contains the mathematical proofs of Corollary 3.1 and Corollary 3.2.
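Proof of Corollary 3.1. According to Radon’s lemma (see, e.g., Freiling (2002) and Lemma B.2) we
can represent the function $\psi_X(\tau)$ as
\[
\psi_X(\tau) = J(\tau)^{-1} K(\tau), \tag{A.1}
\]
where $K(\tau)$ and $J(\tau)$ are square matrices in $\mathbb{R}^{2 \times 2}$ with $J(\tau)$ invertible.¹ Moreover, we define
$\partial_\tau \psi_X(\tau) := \frac{d\psi_X(\tau)}{d\tau}$, $\partial_\tau J(\tau) := \frac{dJ(\tau)}{d\tau}$, and $\partial_\tau K(\tau) := \frac{dK(\tau)}{d\tau}$. Multiplying equation (10) of the main
paper by $J(\tau)$ yields
\[
J(\tau)\,\partial_\tau \psi_X(\tau) = J(\tau) M^\top \psi_X(\tau) + J(\tau)\,\psi_X(\tau) M + 2 J(\tau)\,\psi_X(\tau)\, Q^\top Q\, \psi_X(\tau). \tag{A.2}
\]
Now, we differentiate
\[
J(\tau)\,\psi_X(\tau) = K(\tau) \tag{A.3}
\]
in light of (A.1), and obtain
\[
J(\tau)\,\partial_\tau \psi_X(\tau) = \partial_\tau \left( J(\tau)\,\psi_X(\tau) \right) - \partial_\tau J(\tau)\,\psi_X(\tau) \tag{A.4}
\]
and
\[
\partial_\tau \left( J(\tau)\,\psi_X(\tau) \right) = \partial_\tau K(\tau). \tag{A.5}
\]
Plugging (A.3), (A.4), and (A.5) into (A.2), we get
\[
\partial_\tau K(\tau) - \partial_\tau J(\tau)\,\psi_X(\tau) = J(\tau) M^\top \psi_X(\tau) + K(\tau) M + 2 K(\tau)\, Q^\top Q\, \psi_X(\tau).
\]
¹ Mathematically speaking, it is due to the fact that the matrix Riccati equations belong to a quotient manifold.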
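By collecting the coefficients of $\psi_X(\tau)$, the following matrix ordinary differential equations (ODEs)
are induced:
\[
\partial_\tau K(\tau) = K(\tau) M, \qquad
\partial_\tau J(\tau) = -2 K(\tau)\, Q^\top Q - J(\tau) M^\top,
\]
or
\[
\frac{d}{d\tau} \begin{pmatrix} K(\tau) & J(\tau) \end{pmatrix}
= \begin{pmatrix} K(\tau) & J(\tau) \end{pmatrix}
\begin{pmatrix} M & -2 Q^\top Q \\ 0 & -M^\top \end{pmatrix}. \tag{A.6}
\]
We can solve (A.6) by exponentiation:
\begin{align*}
\begin{pmatrix} K(\tau) & J(\tau) \end{pmatrix}
&= \begin{pmatrix} K(0) & J(0) \end{pmatrix}
\exp\left( \tau \begin{pmatrix} M & -2 Q^\top Q \\ 0 & -M^\top \end{pmatrix} \right) \\
&= \begin{pmatrix} \psi_X(0) & I_2 \end{pmatrix}
\exp\left( \tau \begin{pmatrix} M & -2 Q^\top Q \\ 0 & -M^\top \end{pmatrix} \right) \\
&= \begin{pmatrix} \psi_X(0)\, C_{11}(\tau) + C_{21}(\tau) & \psi_X(0)\, C_{12}(\tau) + C_{22}(\tau) \end{pmatrix} \\
&= \begin{pmatrix} \omega \beta_{\mathrm{VIX}^2}\, C_{11}(\tau) + C_{21}(\tau) & \omega \beta_{\mathrm{VIX}^2}\, C_{12}(\tau) + C_{22}(\tau) \end{pmatrix}.
\end{align*}
Finally from (A.1), we can conclude that the solution is given by
\[
\psi_X(\tau) = \left( \omega \beta_{\mathrm{VIX}^2}\, C_{12}(\tau) + C_{22}(\tau) \right)^{-1}
\left( \omega \beta_{\mathrm{VIX}^2}\, C_{11}(\tau) + C_{21}(\tau) \right).
\]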
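A moment’s reflection reveals that $\phi_X(\tau)$ is obtained by simple integration:²
\begin{align*}
\phi_X(\tau) &= -\rho_0 \int_0^\tau ds + \beta\, \mathrm{tr}\left( Q^\top Q \int_0^\tau \psi_X(s)\, ds \right) \\
&= -\rho_0 \tau - \frac{\beta}{2}\, \mathrm{tr}\left( \ln\left( C_{22}(\tau) \right) + \tau M^\top \right) + \omega \alpha_{\mathrm{VIX}^2},
\end{align*}
where $\ln(\cdot)$ denotes the matrix logarithm. This step concludes the proof.
² See, e.g., Da Fonseca et al. (2007).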
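The linearization used in this proof can be checked numerically. The sketch below assumes the Riccati equation $\partial_\tau \psi = \psi M + M^\top \psi + 2 \psi Q^\top Q \psi$ and compares a Runge-Kutta solution with the closed form $(\psi(0) C_{12} + C_{22})^{-1} (\psi(0) C_{11} + C_{21})$ built from the matrix exponential. All parameter values are hypothetical round numbers, not calibrated estimates, and the hand-written matrix helpers only keep the example dependency-free (a library routine such as `scipy.linalg.expm` would normally replace `expm`).

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def axpy(a, b, s=1.0):
    """Return a + s * b entrywise."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def identity(n):
    return [[float(i == j) for j in range(n)] for i in range(n)]


def expm(a, terms=25, squarings=8):
    """Matrix exponential via scaling and squaring with a truncated Taylor series."""
    small = [[x / 2 ** squarings for x in row] for row in a]
    total, term = identity(len(a)), identity(len(a))
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, small)]
        total = axpy(total, term)
    for _ in range(squarings):
        total = matmul(total, total)
    return total


def inv2(a):
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def psi_radon(psi0, M, Q, tau):
    """Closed form from the block exponential of [[M, -2 Q'Q], [0, -M']]."""
    QtQ, Mt = matmul(transpose(Q), Q), transpose(M)
    block = [[tau * x for x in M[i]] + [-2.0 * tau * x for x in QtQ[i]] for i in range(2)]
    block += [[0.0, 0.0] + [-tau * x for x in Mt[i]] for i in range(2)]
    C = expm(block)
    C11, C12 = [r[:2] for r in C[:2]], [r[2:] for r in C[:2]]
    C21, C22 = [r[:2] for r in C[2:]], [r[2:] for r in C[2:]]
    K = axpy(matmul(psi0, C11), C21)
    J = axpy(matmul(psi0, C12), C22)
    return matmul(inv2(J), K)


def psi_rk4(psi0, M, Q, tau, steps=2000):
    """Classical fourth-order Runge-Kutta integration of the Riccati ODE."""
    QtQ, Mt = matmul(transpose(Q), Q), transpose(M)

    def rhs(p):
        linear = axpy(matmul(p, M), matmul(Mt, p))
        return axpy(linear, matmul(matmul(p, QtQ), p), 2.0)

    h, p = tau / steps, psi0
    for _ in range(steps):
        k1 = rhs(p)
        k2 = rhs(axpy(p, k1, h / 2))
        k3 = rhs(axpy(p, k2, h / 2))
        k4 = rhs(axpy(p, k3, h))
        p = axpy(p, axpy(axpy(k1, k2, 2.0), axpy(k4, k3, 2.0)), h / 6)
    return p


# Hypothetical inputs: M deliberately non-symmetric so the transposes matter.
M = [[-1.0, 0.2], [0.1, -0.5]]
Q = [[0.3, 0.1], [0.0, 0.2]]
psi0 = [[-0.5, 0.1], [0.1, -0.3]]
closed = psi_radon(psi0, M, Q, 1.0)
numeric = psi_rk4(psi0, M, Q, 1.0)
max_gap = max(abs(x - y) for rc, rn in zip(closed, numeric) for x, y in zip(rc, rn))
print(max_gap)
```

The appeal of the linearization is that a nonlinear matrix ODE is replaced by a single matrix exponential, which is what keeps repeated evaluation inside a calibration loop cheap.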
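Proof of Corollary 3.2. We follow the same rationale as in the proof of Corollary 3.1 and repeat it
here for the reader’s convenience. The linearization of the flow of the differential equation is obtained
by Radon’s lemma. We can express the function $\psi_S(\tau)$ as
\[
\psi_S(\tau) = J(\tau)^{-1} K(\tau), \tag{A.7}
\]
where $K(\tau)$ and $J(\tau)$ are square matrices in $\mathbb{R}^{2 \times 2}$ with $J(\tau)$ invertible. Moreover, we define
$\partial_\tau \psi_S(\tau) := \frac{d\psi_S(\tau)}{d\tau}$, $\partial_\tau J(\tau) := \frac{dJ(\tau)}{d\tau}$, and $\partial_\tau K(\tau) := \frac{dK(\tau)}{d\tau}$. Multiplying equation (12) of the main
paper by $J(\tau)$ yields
\begin{align*}
J(\tau)\,\partial_\tau \psi_S(\tau) ={}& \frac{\vartheta(\vartheta - 1)}{2}\, J(\tau)\, I_2 + J(\tau)\,\psi_S(\tau) \left( M + \vartheta Q^\top P^\top \right) \\
& + J(\tau) \left( M^\top + \vartheta P Q \right) \psi_S(\tau) + 2 J(\tau)\,\psi_S(\tau)\, Q^\top Q\, \psi_S(\tau). \tag{A.8}
\end{align*}
Now, we differentiate
\[
J(\tau)\,\psi_S(\tau) = K(\tau) \tag{A.9}
\]
in light of (A.7), and obtain
\[
J(\tau)\,\partial_\tau \psi_S(\tau) = \partial_\tau \left( J(\tau)\,\psi_S(\tau) \right) - \partial_\tau J(\tau)\,\psi_S(\tau) \tag{A.10}
\]
and
\[
\partial_\tau \left( J(\tau)\,\psi_S(\tau) \right) = \partial_\tau K(\tau). \tag{A.11}
\]
Plugging (A.9), (A.10), and (A.11) into (A.8), we get
\begin{align*}
\partial_\tau K(\tau) - \partial_\tau J(\tau)\,\psi_S(\tau) ={}& \frac{\vartheta(\vartheta - 1)}{2}\, J(\tau)\, I_2 + K(\tau) \left( M + \vartheta Q^\top P^\top \right) + J(\tau) \left( M^\top + \vartheta P Q \right) \psi_S(\tau) \\
& + 2 K(\tau)\, Q^\top Q\, \psi_S(\tau).
\end{align*}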
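By collecting the coefficients of $\psi_S(\tau)$, the following matrix ODEs are induced:
\begin{align*}
\partial_\tau K(\tau) &= \frac{\vartheta(\vartheta - 1)}{2}\, J(\tau)\, I_2 + K(\tau) \left( M + \vartheta Q^\top P^\top \right), \\
\partial_\tau J(\tau) &= -2 K(\tau)\, Q^\top Q - J(\tau) \left( M^\top + \vartheta P Q \right),
\end{align*}
or
\[
\frac{d}{d\tau} \begin{pmatrix} K(\tau) & J(\tau) \end{pmatrix}
= \begin{pmatrix} K(\tau) & J(\tau) \end{pmatrix}
\begin{pmatrix} M + \vartheta Q^\top P^\top & -2 Q^\top Q \\ \frac{\vartheta(\vartheta - 1)}{2} I_2 & -\left( M^\top + \vartheta P Q \right) \end{pmatrix}. \tag{A.12}
\]
We can solve (A.12) by exponentiation:
\begin{align*}
\begin{pmatrix} K(\tau) & J(\tau) \end{pmatrix}
&= \begin{pmatrix} K(0) & J(0) \end{pmatrix}
\exp\left( \tau \begin{pmatrix} M + \vartheta Q^\top P^\top & -2 Q^\top Q \\ \frac{\vartheta(\vartheta - 1)}{2} I_2 & -\left( M^\top + \vartheta P Q \right) \end{pmatrix} \right) \\
&= \begin{pmatrix} \psi_S(0) & I_2 \end{pmatrix}
\exp\left( \tau \begin{pmatrix} M + \vartheta Q^\top P^\top & -2 Q^\top Q \\ \frac{\vartheta(\vartheta - 1)}{2} I_2 & -\left( M^\top + \vartheta P Q \right) \end{pmatrix} \right) \\
&= \begin{pmatrix} \psi_S(0)\, C_{11}(\tau) + C_{21}(\tau) & \psi_S(0)\, C_{12}(\tau) + C_{22}(\tau) \end{pmatrix} \\
&= \begin{pmatrix} C_{21}(\tau) & C_{22}(\tau) \end{pmatrix}.
\end{align*}
Finally from (A.7), we can conclude that the solution is given by
\[
\psi_S(\tau) = C_{22}(\tau)^{-1}\, C_{21}(\tau).
\]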
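A moment’s reflection reveals that $\phi_S(\tau)$ is obtained by simple integration:
\begin{align*}
\phi_S(\tau) &= -\rho_0 \int_0^\tau ds + \beta\, \mathrm{tr}\left( Q^\top Q \int_0^\tau \psi_S(s)\, ds \right) \\
&= -\rho_0 \tau - \frac{\beta}{2}\, \mathrm{tr}\left( \ln\left( C_{22}(\tau) \right) + \tau M^\top + \tau \vartheta P Q \right).
\end{align*}
This step concludes the proof.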
B Mathematical Background Results
We collect in this section some useful mathematical background results which we implicitly employed
in the derivation of Assumption 3.1, Corollary 3.1, and Corollary 3.2 in the main paper. The primary
intention is to provide the mathematically sophisticated reader with the techniques applied in the main
paper in order to better grasp the results.
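Lemma B.1. Let $R = \{R(X_{t-}) : t \in [0, T]\}$ be the short rate process and let $X = \{X_t : t \in [0, T]\}$
denote the underlying price process. Further, assume that $R \in C^0(\mathbb{R}_+; \mathbb{R}_{\geq 0})$. Then, the futures
price $F(t, T) = \frac{X_t}{B(t, T)}$, where $B(t, T) := \mathbb{E}^{\mathbb{Q}}\left[ e^{-\int_t^T R(X_{s-})\, ds} \,\middle|\, \mathcal{F}_t \right]$ is the price of a zero-coupon bond,
is a local martingale under $\mathbb{Q}$.
Proof. The result follows directly from the first fundamental theorem of asset pricing, i.e. the
discounted price process of every tradable asset is a $\mathbb{Q}$-martingale (cf. Delbaen and Schachermayer
(1994)). This theorem yields
\[
\mathbb{E}^{\mathbb{Q}}\left[ F(T, T) \,\middle|\, \mathcal{F}_t \right]
= \mathbb{E}^{\mathbb{Q}}\left[ X_T \,\middle|\, \mathcal{F}_t \right]
= X_t\, \mathbb{E}^{\mathbb{Q}}\left[ e^{\int_t^T R(X_{s-})\, ds} \,\middle|\, \mathcal{F}_t \right]
= \frac{X_t}{B(t, T)} =: F(t, T),
\]
which proves the martingale property.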
Lemma B.2 (Radon’s lemma). Assume the following Riccati differential equation
\[
\dot{W}(t) = M_{21}(t) + M_{22}(t)\, W(t) - W(t)\, M_{11}(t) - W(t)\, M_{12}(t)\, W(t), \tag{B.1}
\]
where $W(t) \in \mathbb{R}^{m \times n}$, $M_{11}(t) \in \mathbb{R}^{n \times n}$, $M_{12}(t) \in \mathbb{R}^{n \times m}$, $M_{21}(t) \in \mathbb{R}^{m \times n}$, and $M_{22}(t) \in \mathbb{R}^{m \times m}$ with
$n, m \in \mathbb{N}$ for any $t \in J \subset \mathbb{R}$, then the following holds:
i) Let $W(t)$ be a solution of (B.1) on the interval $J$ with $W(t_0) = W_0$. If $Q(t) \in \mathbb{R}^{n \times n}$ is for any
$t \in J$ the unique solution of the initial value problem
\[
\dot{Q}(t) = \left( M_{11}(t) + M_{12}(t)\, W(t) \right) Q(t), \qquad Q(t_0) = I_n,
\]
and $P(t) := W(t)\, Q(t)$, then $\begin{pmatrix} Q(t) \\ P(t) \end{pmatrix}$ is a solution of the associated linear system of differential
equations
\[
\frac{d}{dt} \begin{pmatrix} Q(t) \\ P(t) \end{pmatrix}
= \begin{pmatrix} M_{11}(t) & M_{12}(t) \\ M_{21}(t) & M_{22}(t) \end{pmatrix}
\begin{pmatrix} Q(t) \\ P(t) \end{pmatrix}. \tag{B.2}
\]
ii) If $\begin{pmatrix} Q(t) \\ P(t) \end{pmatrix}$ is on the interval $J$ a real solution of the system (B.2) such that $\det(Q(t)) \neq 0$ for
any $t \in J$, then
\[
W : J \to \mathbb{R}^{m \times n}, \qquad t \mapsto P(t)\, Q(t)^{-1} =: W(t)
\]
is a real solution of (B.1) and in particular $W(t_0) = P(t_0)\, Q(t_0)^{-1}$.
iii) In case of $W(t) \in \mathbb{C}^{m \times n}$, $M_{11}(t) \in \mathbb{C}^{n \times n}$, $M_{12}(t) \in \mathbb{C}^{n \times m}$, $M_{21}(t) \in \mathbb{C}^{m \times n}$, and $M_{22}(t) \in
\mathbb{C}^{m \times m}$ with $n, m \in \mathbb{N}$ for any $t \in J \subset \mathbb{C}$ the assertions i) and ii) remain valid.
Proof. It follows by elementary calculation. We refer to Radon (1927) and Radon (1928) for the
original work.
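The elementary calculation behind assertions i) and ii) can be spelled out as follows. This is a sketch that uses only (B.1), the definitions of $Q$ and $P$, and the product rule, with the time argument suppressed.

```latex
% Assertion i): with P := W Q and \dot{Q} = (M_{11} + M_{12} W) Q,
\begin{align*}
\dot{Q} &= M_{11} Q + M_{12} W Q = M_{11} Q + M_{12} P, \\
\dot{P} &= \dot{W} Q + W \dot{Q} \\
        &= \left( M_{21} + M_{22} W - W M_{11} - W M_{12} W \right) Q
           + W \left( M_{11} + M_{12} W \right) Q \\
        &= M_{21} Q + M_{22} W Q = M_{21} Q + M_{22} P,
\end{align*}
% which is exactly the linear system (B.2).
%
% Assertion ii): with W := P Q^{-1} and
% \frac{d}{dt} Q^{-1} = -Q^{-1} \dot{Q} Q^{-1},
\begin{align*}
\dot{W} &= \dot{P} Q^{-1} - P Q^{-1} \dot{Q} Q^{-1} \\
        &= \left( M_{21} Q + M_{22} P \right) Q^{-1}
           - W \left( M_{11} Q + M_{12} P \right) Q^{-1} \\
        &= M_{21} + M_{22} W - W M_{11} - W M_{12} W,
\end{align*}
% which recovers the Riccati equation (B.1).
```

The quadratic term of the Riccati equation cancels in the first computation and reappears in the second, which is the whole content of the lemma.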
C Data Cleaning Criteria
Every empirical analysis inevitably hinges on the data treatment. Therefore, it is very important to
carefully address the issue of data selection.
We obtain the Standard & Poor’s 500 (SPX)– and Chicago Board Options Exchange (CBOE)
volatility index (VIX) option data quotes, covering a wide range of strikes and maturities, from the
OptionMetrics database. For the continuously compounded risk-free interest rates we take the zero-
coupon yield curve, also available on OptionMetrics, covering various maturities. The data set spans
the period from February 27, 2006 to August 30, 2013, covering seven and a half years of data.
We infer the value of the SPX– and VIX futures, respectively, at closing by backing out the value using the at-the-money (ATM) forward put-call parity.³ Moreover, we follow the standard convention in the literature and take the mid-price, defined as the average of the best bid– and ask market quotes, to calculate the implied volatilities. To avoid noise in the data, five additional exclusionary criteria are applied. First, we delete all non-traded and therefore illiquid options, i.e. we remove all options which have zero open interest or were not traded for some time (volume = 0). Second, we follow the arguments of Aït-Sahalia and Lo (1998) and delete observations with a price below $0.10. This criterion can be justified by the rationale that it is not possible to quote an arbitrary decimal price due to the minimum tick size for option prices. Third, all option quotes with a time-to-maturity of less than five days or more than one year are not taken into consideration. Fourth, we delete all in-the-money (ITM) SPX options if there exist corresponding liquid out-of-the-money (OTM) SPX options, since OTM options usually contain more information due to their high liquidity. If the OTM options are not sufficiently liquid, we continue working with the more liquid of the OTM– and ITM option. Finally, we only work with highly liquid VIX call options since they exhibit, on average, a higher trading volume and open interest than the VIX put options.⁴ Concerning the implied volatilities, we use a hybrid algorithm, consisting of the (efficient) Newton-Raphson algorithm and the bisection method, for the calculations.⁵
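A hybrid Newton-Raphson and bisection solver of the kind mentioned above can be sketched as follows. The sketch assumes the Black (1976) formula for a European call on a futures price as the quoting model, which is an assumption made for illustration and not necessarily the convention used in the paper; the function names, the bracket [1e-6, 10], and the sample inputs are likewise hypothetical.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def black76_call(futures, strike, tau, rate, sigma):
    """Black (1976) price of a European call on a futures price (assumed model)."""
    vol = sigma * math.sqrt(tau)
    d1 = (math.log(futures / strike) + 0.5 * vol * vol) / vol
    d2 = d1 - vol
    return math.exp(-rate * tau) * (futures * norm_cdf(d1) - strike * norm_cdf(d2))


def black76_vega(futures, strike, tau, rate, sigma):
    vol = sigma * math.sqrt(tau)
    d1 = (math.log(futures / strike) + 0.5 * vol * vol) / vol
    return math.exp(-rate * tau) * futures * norm_pdf(d1) * math.sqrt(tau)


def implied_vol(price, futures, strike, tau, rate,
                lo=1e-6, hi=10.0, start=1.0, tol=1e-10, max_iter=200):
    """Newton-Raphson steps, falling back to bisection when a step leaves the bracket."""
    if not (black76_call(futures, strike, tau, rate, lo) <= price
            <= black76_call(futures, strike, tau, rate, hi)):
        raise ValueError("price is outside the range attainable within the bracket")
    sigma = start
    for _ in range(max_iter):
        diff = black76_call(futures, strike, tau, rate, sigma) - price
        if abs(diff) < tol:
            return sigma
        # The call price increases in sigma, so the sign of diff shrinks the bracket.
        if diff > 0.0:
            hi = sigma
        else:
            lo = sigma
        vega = black76_vega(futures, strike, tau, rate, sigma)
        newton = sigma - diff / vega if vega > 1e-12 else lo
        sigma = newton if lo < newton < hi else 0.5 * (lo + hi)
    return sigma


# Hypothetical round-trip check: price an option at sigma = 0.9, then recover it.
true_sigma = 0.9
quote = black76_call(27.65, 30.0, 35 / 365, 0.01, true_sigma)
recovered = implied_vol(quote, 27.65, 30.0, 35 / 365, 0.01)
print(recovered)
```

Newton-Raphson converges quickly near the money where vega is large, while the bisection fallback protects against overshooting for deep OTM quotes where vega is close to zero.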
³ The ATM put-call parity is defined as follows:
\[
C^{\mathrm{Mkt}}_t\left( F_t(T), K \approx F_t(T), t, T, r_{t,T} \right) + K e^{-r_{t,T}(T - t)}
= P^{\mathrm{Mkt}}_t\left( F_t(T), K \approx F_t(T), t, T, r_{t,T} \right) + F_t(T)\, e^{-r_{t,T}(T - t)},
\]
where $F_t(T)$ denotes the closing futures price today at time $t$ with maturity $T$. We denote the ATM ($K \approx F_t(T)$) observed market prices with the same maturity $T$ by $C^{\mathrm{Mkt}}_t$ and $P^{\mathrm{Mkt}}_t$, respectively. The risk-free interest rate $r_{t,T}$ with time-to-maturity $T - t$ can be extracted from, e.g., the zero-coupon yields or the implied London Interbank Offered Rate (LIBOR) swap rates for long maturities.
⁴ Indeed, a further stylized fact which is worth mentioning is the inversely proportional put-call trading ratio for SPX– and VIX options, i.e. we observe almost twice as many puts as calls traded daily in the SPX options market and the reverse is true for the VIX market.
⁵ One can also obtain implied volatility estimates from OptionMetrics. However, there exist different approaches for the construction of an implied volatility surface (IVS). In particular, OptionMetrics first computes the implied volatility for each option, and in a second step the IVS is reconstructed by a Gaussian kernel smoothing with empirically adjusted widths. Since the data treatment and the computational method are different compared to OptionMetrics, our IVS slices in Figure 3 of the main paper do not correspond to the IVS slices obtained by OptionMetrics. We refer to Homescu (2011) for a survey of the existing literature on the construction of an IVS.
⁶ High-frequency data, for example, offer more flexibility but also require special attention. There is therefore a trade-off between daily data and high-frequency data; the latter can open a Pandora’s box regarding how to partition the data.
We have to admit at this point that it is impossible to obtain a perfectly cleaned up data set, since there are some issues which cannot be resolved, at least not with daily data.⁶ For instance, the trading time of options may not be the closing time, which means that the closing price of the underlying value does not correspond to the underlying value at the time of trade. One way to deal with this issue is to consider futures prices backed out from the ATM put-call parity with highly liquid options, as we do in this paper. If such highly liquid options do not exist, we are back to the problem of non-synchronized prices. Nonetheless, we think that our data treatment removes most of the noise and allows an empirical data analysis.
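Backing out the futures price from the ATM put-call parity amounts to solving the parity relation for $F_t(T)$, which gives $F_t(T) = K + e^{r_{t,T}(T - t)} (C^{\mathrm{Mkt}}_t - P^{\mathrm{Mkt}}_t)$. The sketch below illustrates this under two assumptions that are not taken from the paper: the ATM strike is approximated by the strike at which call and put mid-prices are closest, and the quotes are hypothetical numbers constructed to be consistent with a futures price of 27.65.

```python
import math


def futures_from_parity(call_mid, put_mid, strike, rate, tau):
    """Solve C + K exp(-r tau) = P + F exp(-r tau) for the futures price F."""
    return strike + math.exp(rate * tau) * (call_mid - put_mid)


def back_out_futures(quotes, rate, tau):
    """quotes: list of (strike, call_mid, put_mid) tuples for one maturity.

    Assumption: the strike with the smallest |C - P| serves as the ATM proxy.
    """
    strike, call_mid, put_mid = min(quotes, key=lambda q: abs(q[1] - q[2]))
    return futures_from_parity(call_mid, put_mid, strike, rate, tau)


# Hypothetical quotes built so that parity holds exactly for F = 27.65.
rate, tau, true_futures = 0.01, 35 / 365, 27.65
discount = math.exp(-rate * tau)
quotes = [(k, put + discount * (true_futures - k), put)
          for k, put in [(25.0, 1.0), (27.5, 2.0), (30.0, 3.5)]]
implied = back_out_futures(quotes, rate, tau)
print(implied)
```

Because option and implied futures prices then come from the same quote snapshot, the approach sidesteps the synchronization problem only as long as the ATM options themselves are liquid.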
D Further Stylized Facts
In this section we provide further empirical stylized facts of the SPX– and CBOE VIX (option) market. The analysis is by no means exhaustive; nonetheless, in combination with Section 2 of the main paper, it encompasses a wide range of data features.
Table 1 and Table 2 report the properties of our SPX– and VIX option data set divided into different maturity and moneyness bins, respectively, where the moneyness is defined, for $i \in \{S, V\}$, by
\[
m := K / F^i_t(T),
\]
where $K$ is the strike level of the European-style option and $F^i_t(T)$ denotes the closing SPX (resp. VIX) futures price if $i = S$ (resp. $i = V$) today at time $t$ with maturity $T$. The applied exclusionary criteria are outlined in Appendix C. These adjustments leave a total of 701’347 (liquid) OTM SPX options and 122’690 VIX call options. Notice that the low number of VIX options compared to the number of SPX options is due to the inception of the VIX options market in 2006, so that the overall traded volume is lower. Moreover, it also stems from the fact that fewer maturities and strikes are traded.
[Table 1 about here.]
[Table 2 about here.]
To motivate (multifactor) stochastic volatility empirically, one can take a look at the ATM implied volatility evolution. In Figure 1 and Figure 2 the SPX– and VIX ATM implied volatility evolution for four different time-to-maturities are depicted. The overall implied volatility levels are higher for VIX options. Furthermore, the periods of financial turmoil are more distinct in the SPX market, where they show up as peaks, whereas the VIX market follows an oscillating behavior.
[Figure 1 about here.]
[Figure 2 about here.]
From Figure 3 in the main paper and from Table 3, where we depict eight different quantiles of implied volatility levels for SPX– and VIX options, we observe that the VIX implied volatility levels are substantially higher than for the SPX market. The explanation can be found in Figure 3. Plotting the evolution of the SPX– and the CBOE VIX log-returns during the sample period of February 27, 2006 to August 30, 2013 shows that the driving stochastic volatility component has a considerably larger impact in the VIX market than in the SPX market, making the implied volatilities higher for VIX options.
[Table 3 about here.]
[Figure 3 about here.]
The introduction of a (multifactor) stochastic volatility model is also supported by Figure 4, in which four different scatter plots are depicted where we plot the log-returns on the VIX against the log-returns on the SPX using different dependence structures. In particular, on the top left, the real data is plotted and we depict on the top right simulations from a fitted Frank copula, on the bottom left simulations from a fitted t-copula, and on the bottom right simulations from a fitted Gaussian copula. Clearly, it is insufficient to fit a Gaussian copula to the SPX– and VIX log-returns. Using a Frank and t-copula, respectively, captures the leptokurtic behavior much better. Figure 4 also reflects the leverage effect through the negative correlation of the SPX– and VIX log-returns.
[Figure 4 about here.]
Since option pricing is essentially about replicating the risk-neutral density (RND) at maturity, given a fixed time-to-maturity, we examine the RNDs of SPX options, which are directly linked to the RNDs of VIX options by the negative correlation. For the calibration we used the technique described in Monnier (2013), which takes available bid– and ask put option quotes and estimates the smoothest risk-neutral density compatible with the quotes. This non-parametric method is able to recover the middle part of the RND together with its full left tail and part of its right tail. Since the VIX RNDs exhibit fat right tails, we do not employ the method for the VIX market and leave the extension of the technique to variance derivatives for future research.⁷ By the implied volatility smirks in Figure 3 of the main paper, we expect asymmetry (negative skewness) and fat tails (leptokurtosis) in the risk-neutral distribution of the SPX. Indeed, as can be noted from Figure 5, the SPX RNDs exhibit negative skewness and leptokurtosis for all time-to-maturities. As pointed out in Huang and Wu (2004), a jump component in the SPX dynamics generates return non-normality over the short terms, and a persistent stochastic volatility process slows down the convergence of the return distribution to normality as the maturity increases. Notice that Figure 5 also implies, by the leverage effect, the reverse shape for the VIX market. These facts evidently motivate (multifactor) stochastic volatility in the SPX dynamics.⁸
[Figure 5 about here.]
To gain a notion of the bid-ask spreads for both markets, which is an indicator for, e.g., hedging–
and transaction costs, we plot the evolution of the average SPX– and VIX absolute bid-ask spread and
the average SPX– and VIX relative bid-ask spread, defined by ”bid-ask spread/mid-price”, during
the sample period of February 27, 2006 to August 30, 2013. Notice that we incorporate the entire
7
The recent paper of Song and Xiu (2014) recovers the VIX RNDs using another non-parametric approach. For
a comprehensive treatment of state-price density estimation techniques see, e.g., the monograph of Jackwerth
(2004).
8
The introduction of jumps in the asset return and volatility process, respectively, can obviously also be taken into
consideration. Nevertheless, we do not follow this path due to model parsimony.
10
data range in the computation, so no data filter is applied. In absolute terms, the average
SPX bid-ask spreads are substantially higher. In relative terms, the VIX bid-ask spreads are
higher than the corresponding SPX bid-ask spreads at the beginning of the period. In more
recent years, the relative bid-ask spreads of the two markets converge, implying that
liquidity in the VIX market has improved. This finding is also supported by Figure 8, in
which we plot the volume and open interest for the SPX and VIX option markets over time. We
observe that the VIX option volume and open interest have increased significantly over
recent years, whereas the same measures for the SPX market have stayed roughly stable.
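The two spread measures compared above can be illustrated with a short sketch. It assumes that quotes arrive as (bid, ask) pairs and that the mid-price is the simple average of bid and ask; the function name and the quote values are hypothetical and chosen only to show how a market can have the larger absolute spread and the smaller relative spread at the same time.

```python
def average_spreads(quotes):
    """Average absolute and relative bid-ask spread for (bid, ask) pairs.

    The relative spread is "bid-ask spread/mid-price", with the mid-price
    taken as (bid + ask) / 2 (an assumption of this sketch).
    """
    abs_spreads = []
    rel_spreads = []
    for bid, ask in quotes:
        spread = ask - bid
        mid = 0.5 * (bid + ask)
        abs_spreads.append(spread)
        rel_spreads.append(spread / mid)
    n = len(quotes)
    return sum(abs_spreads) / n, sum(rel_spreads) / n


# Hypothetical quotes, not taken from the data set of the paper.
spx_quotes = [(20.0, 21.0), (50.0, 52.0)]
vix_quotes = [(1.0, 1.2), (2.0, 2.3)]

spx_abs, spx_rel = average_spreads(spx_quotes)
vix_abs, vix_rel = average_spreads(vix_quotes)
print(f"SPX: absolute={spx_abs:.4f}, relative={spx_rel:.4f}")
print(f"VIX: absolute={vix_abs:.4f}, relative={vix_rel:.4f}")
```

In this toy example the SPX-like quotes have the larger average absolute spread, while the VIX-like quotes have the larger average relative spread, because the spread is scaled by a much smaller mid-price.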
[Figure 6 about here.]
[Figure 7 about here.]
[Figure 8 about here.]
VIX_Option_Multifactor_Final_Version_Normal_Format_incl_Supplementary_Appendix

  • 1. Pricing VIX Options with Multifactor Stochastic Volatility∗ Pascal M. Caversaccio† First Draft: May 16, 2014 This Version: June 13, 2016 Abstract By exploiting the flexibility of the Wishart process, we propose an application of this framework to the pricing of Chicago Board Options Exchange (CBOE) volatility index (VIX) options. Our methodology is analytically tractable and yet flexible enough to efficiently price CBOE VIX options. In particular, the dynamics for the CBOE VIX is carried out in a linear affine way and the discounted Laplace transform exhibits an exponentially affine property. The tractable model structure lightens the computational burden and facilitates a fast identification of the parameter estimates. We empirically show that modeling the stochastic co-volatility factor can significantly improve the in-sample fitting results due to the improved modeling of higher conditional moments in the underlying transition probability density. Keywords: Matrix affine diffusion, multifactor stochastic volatility, stochastic correlation, VIX option pricing, Wishart process. JEL classification: C51, C52, G12, G13. ∗ The author thanks Chris Bardgett, Elise Gourier, Meriton Ibraimi, Markus Leippold, Stefan Pomberger, Lujing Su, and Nikola Vasiljevi´c for helpful comments, and Sergio Maffioletti for providing guidance and access to the cloud infrastructure of the University of Zurich. Any remaining errors are mine. † University of Zurich – Department of Banking and Finance, Plattenstrasse 14, 8032 Zurich, Switzerland. E-Mail: pascalmarco.caversaccio@uzh.ch.
  • 2. Pricing VIX Options with Multifactor Stochastic Volatility June 13, 2016 Abstract By exploiting the flexibility of the Wishart process, we propose an application of this framework to the pricing of Chicago Board Options Exchange (CBOE) volatility index (VIX) options. Our methodology is analytically tractable and yet flexible enough to efficiently price CBOE VIX options. In particular, the dynamics for the CBOE VIX is carried out in a linear affine way and the discounted Laplace transform exhibits an exponentially affine property. The tractable model structure lightens the computational burden and facilitates a fast identification of the parameter estimates. We empirically show that modeling the stochastic co-volatility factor can significantly improve the in-sample fitting results due to the improved modeling of higher conditional moments in the underlying transition probability density. Keywords: Matrix affine diffusion, multifactor stochastic volatility, stochastic correlation, VIX option pricing, Wishart process. JEL classification: C51, C52, G12, G13.
  • 3. 1 Introduction This paper incorporates the high analytical tractability as well as the enormous flexibility of Wishart processes to efficiently price Chicago Board Options Exchange (CBOE) volatility index (VIX) op- tions. The CBOE VIX reflects the market’s expectation of the 30-day forward Standard & Poor’s 500 index (SPX) volatility and serves as a proxy for the investor sentiment. Hence, given the direct exposure to the volatility, we have seen an extended use of VIX options in the risk management because they provide an immediate link to variance without the need to vega-hedge the underlying SPX. The increased demand over the recent years for volatility derivatives, notably for VIX op- tions, claims of course for a consistent pricing of this particular class of derivatives while preserving consistency with the underlying index.1 Since the introduction of VIX futures in 2004 and plain vanilla option contracts on the VIX in 2006, many research papers faced the challenge of pricing this particular class of products. Specifically, VIX option values are by definition dependent on the SPX through its options and therefore, due to the sake of consistent modeling, one needs to incorporate the SPX dynamics into the pricing. The paper of Sepp (2008) provides an exact analytical formula for SPX– and VIX options under the assumption of a stochastic volatility process with volatility jumps but no jumps in the asset return. Another interesting approach is taken in Albanese et al. (2009), where they develop a pricing framework that can simultaneously handle European options, forward-starts, options on the realized variance, and options on the VIX. To do so, they make use of spectral methods. Papanicolaou and Sircar (2014) introduce a regime-switching Heston model to widen the support of the bulk of the volatility process’ probability distribution. Bardgett et al. (2014) develop a three-factor affine model which yields semi-closed expressions for SPX– and VIX options. 
A non-affine approach is taken in Baldeaux and Badran (2014), where the authors extend the pure diffusion 3/2 model to jumps in the asset returns for the consistent modeling of SPX– and VIX derivatives. Furthermore, we refer, among others, to Bergomi (2008), Cont and Kokholm (2011), Kallsen et al. (2011), Drimus and Farkas (2013), Goard and Mazur (2013), and Branger et al. (2014) for recent papers covering 1 See Figure 8 in the supplementary appendix for the time evolution of the VIX option volume and the corresponding open interest. 1
  • 4. volatility– and particularly VIX options, respectively. In terms of empirical investigations, Menc´ıa and Sentana (2013) conduct an extensive empirical analysis of VIX derivative valuation models before, during, and after the 2008–2009 financial crisis, where their results indicate that stochastic volatility plays a much more important role for VIX options than for VIX futures. Moreover, there is also quite consistent empirical evidence for the fact that the implied volatility surface (IVS) of index options and the corresponding term structure of variance swaps, which are intimately connected to the VIX, are driven by more than one latent risk factor, see, e.g., Christoffersen et al. (2009) and Egloff et al. (2010). To capture this feature, Christoffersen et al. (2009) introduce a two-factor Heston model which is able to generate a wide degree of variation in the stochastic correlation between the log-returns and the volatility shocks. Even if the stochastic correlation structure implied by their setting is restricted by two artificial boundaries, their empirical findings show that two-factor stochastic volatility models improve sub- stantially on single-factor models in explaining the cross-sectional– and time series patterns of index option implied volatilities. These findings give rise to the question whether a multivariate stochas- tic volatility model of the Wishart type, with rich conditional dependence structures between risk factors, can be (efficiently) employed in the pricing of VIX options. The Wishart pure mean-reverting diffusion process of Bru (1991), which belongs to the class of matrix affine diffusion (MAD), is studied in Da Fonseca et al. (2007), Da Fonseca et al. (2008), Gouri´eroux et al. (2009), Gouri´eroux and Sufana (2010), and Da Fonseca et al. (2014) to price financial equity derivatives. Leippold and Trojani (2010) extend this framework and allow for a rich structure of jumps in the dynamics. 
The Wishart affine stochastic correlation model, developed by Da Fonseca et al. (2007), provides prices for vanilla options consistent with the observed smile– and skew pattern, while making it possible to detect and quantify the correlation risk in multiple-asset derivatives like basket options. Since a stochastic SPX option pricing model embodies an implicit dynamics for the VIX, it depends on the specification of the volatility– and jump structure in the setting whether the model can accurately replicate the VIX dynamics. Motivated by the crucial findings of Da Fonseca et al. (2007), we carry over the technique of highly flexible and tractable MAD modeling to the VIX market. 2
  • 5. 2 Empirical Stylized Facts2 We emphasize first that the VIX market features some distinctive properties compared to the SPX market. Firstly, the underlying of VIX options is not the current VIX spot price itself but rather the corresponding futures contract. This fact implies that no-arbitrage considerations of VIX options must be examined relative to the VIX futures. Moreover, because the VIX itself is not a tradable asset, there is no cost-of-carry relationship between VIX futures and the spot VIX value (see, e.g., Zhang and Zhu (2006) and Zhu and Lian (2012)), i.e. F (t, T, VIXt = x) = xert,T (T−t) where rt,T is the risk-free interest rate for the time period T − t. This circumstance clearly affects the put-call parity for VIX options which can however be modified trivially (see, for instance, Lian and Zhu (2013)) and is given by C (t, T, VIXt = x) − P (t, T, VIXt = x) = e−rt,T (T−t) F (t, T, VIXt = x) − Ke−rt,T (T−t) . (1) The sole difference to the ordinary put-call parity is the replacement of the underlying price by the discounted forward volatility which can be traded via the VIX future contract. Figure 1 displays the joint evolution of the SPX and VIX. The figure implies a strong negative stochastic correlation, i.e. a drop in the SPX is followed by upward moves of time-varying amplitudes in the VIX and vice versa. Furthermore, we can deduce a mean-reverting behavior in the VIX dynamics. [Figure 1 about here.] To empirically prove the above fact of a negative stochastic linkage between the SPX and VIX, we need to take a closer look at the correlation coefficients. We depict the rolling correlation coefficients between the SPX log-return– and the CBOE VIX level increments given the four different window sizes, measured in days, N ∈ {25, 50, 100, 250} in Figure 2. We see that the correlation coefficients are as expected strongly negative. 
Notice that the negative correlation coefficients induce the leverage 2 For further empirical stylized facts concerning the SPX– and VIX (options) market we refer to Appendix D in the supplementary appendix. 3
  • 6. effect3 between the VIX and SPX since negative returns are often observed together with a rise of volatility. [Figure 2 about here.] In Table 1 the first four moments of the SPX log-returns and the VIX levels are depicted. We can observe a negative skewness and a high kurtosis for the log-returns of the SPX, which indeed implies the empirically observed leptokurtic behavior. Moreover, the VIX levels follow a right-skewed and leptokurtic distribution. To replicate these behaviors we can introduce a persistent stochastic volatility process which slows down the convergence of the return distribution to normality and exhibits a right-skewed transition probability density for the volatility process. [Table 1 about here.] Now we take a look at the implied volatility skews of SPX– and VIX options.4 Since it is difficult to predict accurately future dividends on the SPX, we define for both markets the moneyness in terms of futures prices. The log-moneyness is defined, for i ∈ {S, V}, by m := ln K/Fi t (T) , where K is the strike level of the European-style option and Fi t (T) denotes the closing SPX (resp. VIX) futures price if i = S (resp. i = V) today at time t with maturity T. The empirical results above indicate very different shapes of the IVS slice for a fixed time-to-maturity. Indeed, as depicted in Figure 3, the backed out implied volatilities of the SPX options are negatively skewed, i.e. decreasing with increasing log-moneyness m 0.25 and for m 0.25 shortly increasing. The reverse is true for VIX options. Their IVS slices are in most cases increasing with increasing log-moneyness m for 3 The terminology ”leverage effect” can be traced back to Black (1976). Economically speaking, the leverage effect arises from the negative correlation between the stock price and its volatility. When the value of a stock drops, the volatility of its returns tends to increase. 
Indeed, if the stock price decreases, the debt-to-equity ratio increases and the risk of the firm therefore increases, which translates into a higher volatility for the firm. For possible other explanations of the leverage effect we refer, among others, to Haugen et al. (1991), Campbell and Hentschel (1992), Campbell and Kyle (1993), and Bekaert and Wu (2000). 4 The applied exclusionary criteria on the option data are outlined in Appendix C of the supplementary appendix. 4
  • 7. a fixed time-to-maturity. The behavioral explanation of the positive skew patterns can be traced back to the fact that out-of-the-money (OTM) call options on the VIX provide protection against a market crash. As a consequence, the call option writer enters a risky position and charges additional compensation for taking this risk, making the implied volatility higher for high-strike call options. The similar behavior is observed in the SPX market, where however the writer of an OTM put option charges extra compensation. Hence, according to these observations VIX OTM call option prices exhibit a significant time value. Translated into statistical terms, the volatility density implied by the VIX options has more mass concentrated at high volatility levels than at low volatility levels. This observation is essentially the major reason why the original Heston (1993) model with a chi-squared density fails to reproduce this market feature (cf. Gatheral (2008) who pointed out as first that VIX option prices are inconsistent with the Heston (1993) dynamics). To replicate this stylized fact, we introduce a multifactor stochastic volatility of the Wishart type in the following section which exhibits a higher flexibility in the underlying transition probability density. Furthermore, as shown in Christoffersen et al. (2009), stochastic correlation is an important model feature which helps to reproduce the time variation in the implied volatility smirk. Since the Wishart process is equipped with stochastic correlations, it is a natural candidate for modeling such a stylized fact. [Figure 3 about here.] 3 The Multifactor Stochastic Volatility Model In this section we derive closed-form solutions by means of transform methods for VIX– and SPX options under multifactor stochastic volatility. A closed-form expression for VIX futures is also presented. 
3.1 Probabilistic Setup and Notation Let (Ω, F, F, P) be a filtered probability space, equipped with a standard Brownian motion W : [0, T] × Ω → R, satisfying the usual conditions of right-continuity and completeness. Moreover, we 5
  • 8. assume that P is the historical measure, F0 is P-trivial, and we work on the finite time horizon [0, T] , T < ∞. By R≥0 we denote the non-negative real numbers, i.e. R≥0 := [0, ∞), by R+ the positive real numbers are depicted, i.e. R+ := (0, ∞), and S+ k represents, for any k ∈ N, the non-negative cone of symmetric positive semi-definite k × k matrices. Furthermore, the operator tr : Rk×k → R, Q → tr (Q) = k i=1 Qii is the trace of a k × k-matrix Q, exp [Q] denotes the matrix exponential of a real– or complex k × k-matrix Q given by the power series exp [Q] := ∞ i=0 Qi i! , stands for the transpose of a matrix (or vector), let Ck be the set of functions with continuous derivatives up to the k-th order, i.e. k ∈ N0, Re [·] and Im [·] represent the real– and imaginary part of a complex number k = Re [k] + iIm [k], with i := √ −1, and the identity matrix is denoted, for k ∈ N, by Ik := diag (1, 1, . . . , 1) with diag (·) the diagonal matrix. Finally, we denote the futures price of the SPX by F = {Ft : t ∈ [0, T]} and its logarithmic price process by S = {St : t ∈ [0, T]} = {ln (Ft) : t ∈ [0, T]}. 3.2 Model Design This section introduces the flexible class of MAD that we use. Since we aim for a parsimonious option pricing model which does not encompass an over-parameterization, the model exhibits no additional jump structure. Nonetheless, notice that our framework can be easily extended to jumps in both, the asset return– and volatility process, by building on an earlier contribution of Leippold and Trojani (2010) who provide the analytical transform analysis for this class of models. We assume the existence of a probability measure Q ∼ P such that the SPX is a martingale with respect to Q, i.e. Q is a risk-neutral measure. Guided by the stylized facts in Section 2, we directly specify the SPX dynamics under Q. Let us remark that we do not consider the SPX spot price itself as the underlying of SPX options but rather the corresponding futures contract. 
Note that since the futures price Ft (T) converges to the SPX spot price as t → T, a European option on the underlying spot is the same as a European option on the corresponding futures contract (maturing at the same 6
  • 9. time as the option). To retain the analytical tractability and to preserve enough flexibility to account for the different empirical evidence, we stay in an affine setting. Assumption 3.1. Let M, Q ∈ R2×2, WQ and BQ are matrices of standard Brownian motions in R2×2 under the risk-neutral measure Q, the Gindikin coefficient is given by R+ β > 1, and √ X represents the unique square root on S+ 2 . Moreover, we assume WQ := BQ P + ZQ I2 − PP , where ZQ is a matrix of standard Brownian motions, independent of BQ, in R2×2 under the risk- neutral measure Q, and P ∈ R2×2 is a deterministic correlation matrix such that PP ≤ I2 is a positive semi-definite matrix. The SPX log-dynamics is determined by the stochastic differential equations (SDEs): dSt = − 1 2 tr (Xt) dt + tr Xt dWQ t , S0 = s ∈ R, dXt = βQ Q + MXt + XtM dt + Xt dBQ t Q + Q dBQ t Xt, X0 = x ∈ S+ 2 , (2) where we suppose that all the eigenvalues of x are distinct, i.e. λ1,0 > λ2,0 ≥ 0. Note, since we do not need the discounted version of F to obtain a semimartingale process, there is no risk-free rate in the drift component.5 Assumption 3.1 defines a matrix-variate stochastic volatility model in which we allow for multifactor volatility, stochastic correlation, and stochastic skewness.6 Since leverage is closely linked to the skewness of asset returns and the slope of the implied volatility smile,7 our model with stochastic leverage COVt (dSt, dVt) = 2tr (PQXt) dt, where Vt := tr (Xt), can enhance the ability to replicate the time-varying skew– and term structure patterns of the implied volatilities in the SPX market (see, e.g., Figure 3 above).8 The investigation of COVt (dSt, dVt) and 5 We refer to Lemma B.1 in the supplementary appendix for the mathematical justification. 6 See (Leippold and Trojani, 2010, Section 2) for the analytical expressions of the stochastic correlation between the log-returns and the volatility shocks with and without jumps. 
7 More precisely, the leverage effect increases the probability of a large loss and consequently the value of OTM put options. The leverage effect induces negative skewness in stock returns, which in turn yields a volatility smirk. 8 This fact is pointed out in Christoffersen et al. (2009). In particular, while single-factor stochastic volatility models can capture the slope of the smile, they cannot explain large independent fluctuations in the corresponding level and slope over time. 7
  • 10. the corresponding variance of variance VARt (dVt) = 4tr Q QXt dt is interesting because these moments are related to the conditional skewness and kurtosis, respectively, of the SPX log-returns. It is shown in (Christoffersen et al., 2009, Online Appendix) that the paths for the conditional skewness and kurtosis, respectively, in the one-factor Heston (1993) model are too strongly linked to the variance path. Introducing more factors can generate a higher variation and therefore we exhibit a wider degree of flexibility in the term structure of higher moments. The pure diffusion Wishart process (see, e.g., Gouri´eroux et al. (2009) and Gouri´eroux and Sufana (2010)) in (2) represents the matrix analogue of a squared Bessel process. A moment’s reflection yields the result that in full analogy with the Cox-Ingersoll-Ross (CIR) process (cf. Cox et al. (1985)) the term βQ Q is related to the expected long-term variance-covariance matrix X∞ through the solution of the linear (Lyapunov) equation (cf. Da Fonseca et al. (2014)) MX∞ + X∞M = −βQ Q. The matrix M can be compared to the speed of mean reversion in the CIR process. Moreover, Q can be identified as the volvol parameter, i.e. the volatility of the volatility matrix. Eventually, the condition R+ β > 1 is introduced to assure Q-almost surely (a.s.) for all t ∈ [0, T] the existence and uniqueness of the strong solution Xt ∈ S+ 2 in (2).9 Imposing R+ β ≥ 3 would yield a unique strong solution Q-a.s. for all t ∈ [0, T] in the domain S+ 2 {0} since all eigenvalues of the solution Xt are strictly positive. The assumption of distinct eigenvalues for the initial state x implies that the eigenvalues of the unique strong solution Xt ∈ S+ 2 at any time t ∈ (0, T] never collide, i.e. λ1,t > λ2,t ≥ 0.10 This fact in turn leads the result that tr (Xt) > 0 a.s. for all t ∈ [0, T]. To enforce the typical mean-reverting behavior of the volatility on X = {Xt : t ∈ [0, T]}, we impose the following assumption. Assumption 3.2. 
To ensure stationarity, i.e. non-explosive features of the process $X$, we assume $M$ to be negative definite.
[9] In daily fitting exercises (see Section 4) we noticed that this condition is, in the majority of cases, not satisfied unless it is enforced. Also observe that this condition naturally translates to the Feller condition (cf. Feller (1951)) for the univariate CIR process, assuming the eigenvalues of $x$ are non-colliding; the Feller condition is likewise not genuinely satisfied for most option data.
[10] See (Bru, 1991, Section 3) for the corresponding proofs.
  • 11. Every stochastic SPX option pricing model embodies an implicit dynamics for the VIX. Thus, this setting also encompasses the dynamics for the VIX. Since there are no jumps, the VIX squared is equal to the annualized expected integrated variance over a 30-day time period. Proposition 3.1. The VIX squared at time $t$ is given by
$$\mathrm{VIX}_t^2 := \frac{1}{\tau_{\mathrm{VIX}}} \int_t^{t+\tau_{\mathrm{VIX}}} \mathbb{E}^{\mathbb{Q}}\left[\operatorname{tr}(X_s) \mid \mathcal{F}_t\right] ds, \qquad \tau_{\mathrm{VIX}} \equiv \frac{30}{365}, \quad (3)$$
where the expectation can be carried out explicitly for a symmetric matrix $M$ in the form of
$$\mathrm{VIX}_t^2 = \alpha_{\mathrm{VIX}^2} + \operatorname{tr}\left(\beta_{\mathrm{VIX}^2} X_t\right), \quad (4)$$
with
$$\alpha_{\mathrm{VIX}^2} := \beta \operatorname{tr}\left[\frac{1}{\tau_{\mathrm{VIX}}} Q^\top Q \left(4M^2\right)^{-1} \varpi - Q^\top Q (2M)^{-1}\right], \quad (5)$$
$$\beta_{\mathrm{VIX}^2} := \frac{1}{\tau_{\mathrm{VIX}}} \varpi\, (2M)^{-1}, \quad (6)$$
$$\varpi \equiv \exp\left[2\tau_{\mathrm{VIX}} M\right] - I_2.$$
Proof. See Appendix A. Notice that it is common in the related literature to express the VIX squared at time $t$ also by
$$\mathrm{VIX}_t^2 = \frac{1}{\tau_{\mathrm{VIX}}} \mathbb{E}^{\mathbb{Q}}\left[\int_t^{t+\tau_{\mathrm{VIX}}} \operatorname{tr}(X_s)\, ds \,\Big|\, \mathcal{F}_t\right].$$
However, observe that the interchange of the expectation and integration in Proposition 3.1 is justified by Tonelli's theorem. In summary, we obtain a tractable affine setting for the VIX with at least two useful properties. First, the model implies nonlinear persistence properties and a stochastic volatility of volatility of the implied correlation process $\varrho_{12} := X_{12} / \sqrt{X_{11} X_{22}}$ (see, e.g., Leippold and Trojani (2010)). Second
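  • 12. and even more important, the multivariate risk structure, in which both factors feature stochastic co-volatility, can enhance the accurate replication of the positive implied volatility skew for VIX options since we can generate a wide degree of flexibility in the conditional higher moments of the VIX dynamics. More precisely, given the fact that at any time $t$ the transition probability density $f^{\mathbb{Q}}(X_T \mid X_t)$ in (2) follows a non-central Wishart distribution,[11] it is straightforward to see that only considering orthogonal diagonal components restricts the structure of $f^{\mathbb{Q}}$ and therefore reduces the flexibility of the implied $\mathrm{VIX}^2$ transition probability density. This special feature is the particular improvement in flexibility over the model of Christoffersen et al. (2009). To better understand the restriction on conditional higher moments of orthogonal diagonal components, we compute the first four moments of $V := \operatorname{tr}(X)$. We use the fact that the cumulant generating function for $V$ is given by
$$K(\Gamma, X_t, t, T) = \operatorname{tr}\left[\xi_{t,T}\, \Gamma \left(I_2 - 2\Xi_{t,T}\Gamma\right)^{-1} \xi_{t,T} X_t\right] - \frac{\beta}{2} \ln\left(\det\left(I_2 - 2\Xi_{t,T}\Gamma\right)\right),$$
where $\Xi_{t,T} := \Sigma_{t,T} / \beta$, $\xi_{t,T}$ and $\Sigma_{t,T}$ are defined in (B.1) and (B.2), respectively, and $\Gamma := \theta I_2$ for $\theta \in \mathbb{R}_{\geq 0}$. Using standard matrix calculus and applying the relationship between cumulants and moments shows that the conditional moments are given by the following expressions:
$$\mathbb{E}^{\mathbb{Q}}(V_T \mid V_t) = \kappa_{1,t} = \operatorname{tr}\left(\xi_{t,T}^2 X_t\right) + \beta \operatorname{tr}\left(\Xi_{t,T}\right),$$
$$\mathrm{VAR}^{\mathbb{Q}}(V_T \mid V_t) = \kappa_{2,t} = 4\operatorname{tr}\left(\xi_{t,T}\Xi_{t,T}\xi_{t,T} X_t\right) + 2\beta \operatorname{tr}\left(\Xi_{t,T}^2\right),$$
$$\mathrm{SKEW}^{\mathbb{Q}}(V_T \mid V_t) = \frac{\kappa_{3,t}}{\kappa_{2,t}^{3/2}} = 2\sqrt{2}\, \frac{3\operatorname{tr}\left(\xi_{t,T}\Xi_{t,T}^2\xi_{t,T} X_t\right) + \beta \operatorname{tr}\left(\Xi_{t,T}^3\right)}{\left[2\operatorname{tr}\left(\xi_{t,T}\Xi_{t,T}\xi_{t,T} X_t\right) + \beta \operatorname{tr}\left(\Xi_{t,T}^2\right)\right]^{3/2}},$$
$$\mathrm{KURT}^{\mathbb{Q}}(V_T \mid V_t) = \frac{\kappa_{4,t}}{\kappa_{2,t}^{2}} = 12\, \frac{4\operatorname{tr}\left(\xi_{t,T}\Xi_{t,T}^3\xi_{t,T} X_t\right) + \beta \operatorname{tr}\left(\Xi_{t,T}^4\right)}{\left[2\operatorname{tr}\left(\xi_{t,T}\Xi_{t,T}\xi_{t,T} X_t\right) + \beta \operatorname{tr}\left(\Xi_{t,T}^2\right)\right]^{2}},$$
[11] We refer to Appendix B for its explicit representation.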
  • 13. where VAR, SKEW, and KURT denote the (conditional) variance, skewness, and kurtosis, respectively, and $\kappa_{n,t}$ is the $n$th conditional cumulant at time $t$. Incorporating the fact that $M$ is a negative definite matrix (see Assumption 3.2) yields the following unconditional moments:
$$\mathbb{E}^{\mathbb{Q}}(V_\infty) = -\beta \operatorname{tr}\left[Q^\top Q (2M)^{-1}\right],$$
$$\mathrm{VAR}^{\mathbb{Q}}(V_\infty) = 2\beta \operatorname{tr}\left[\left(Q^\top Q (2M)^{-1}\right)^2\right],$$
$$\mathrm{SKEW}^{\mathbb{Q}}(V_\infty) = -\frac{2\sqrt{2}\, \operatorname{tr}\left[\left(Q^\top Q (2M)^{-1}\right)^3\right]}{\sqrt{\beta}\, \left(\operatorname{tr}\left[\left(Q^\top Q (2M)^{-1}\right)^2\right]\right)^{3/2}},$$
$$\mathrm{KURT}^{\mathbb{Q}}(V_\infty) = \frac{12 \operatorname{tr}\left[\left(Q^\top Q (2M)^{-1}\right)^4\right]}{\beta \left(\operatorname{tr}\left[\left(Q^\top Q (2M)^{-1}\right)^2\right]\right)^{2}}.$$
To numerically illustrate that the full Wishart specification has a greater flexibility than a multifactor Heston specification, we use a calibrated parameter set of Section 4, for which we imposed the integer restriction $\beta \in \mathbb{N} \setminus \{1\}$ in the calibration to simplify the efficient simulation of the $\mathrm{VIX}^2$ dynamics below, and compute the conditional skewness and kurtosis, respectively, of $V$ for a varying off-diagonal latent state $X_{12}$. The initial parameter set is given by
$$M = \begin{pmatrix} -2.9671 & 3.1310 \\ 3.1310 & -3.4838 \end{pmatrix}, \quad Q = \begin{pmatrix} 0.6857 & -0.4236 \\ -0.4236 & 0.5706 \end{pmatrix}, \quad X_t = \begin{pmatrix} 0.1442 & -0.0268 \\ -0.0268 & 0.0050 \end{pmatrix}, \quad \beta = 2. \quad (7)$$
Furthermore, we also consider a parameterized version of the Christoffersen et al. (2009) model which consists of orthogonal diagonal components and therefore does not contain a stochastic co-volatility factor. This model is described in Section 4 and is denoted by SV-2F in Figure 4. There are two conclusions to be drawn from Figure 4. First, incorporating the off-diagonal components generates a higher conditional skewness and kurtosis, respectively, thereby widening the support of the bulk of the $\mathrm{VIX}^2$ dynamics' probability distribution. This in turn improves the replication of a positive
  • 14. implied volatility skew for VIX options due to the leverage effect, which increases the probability of a large loss and consequently the values of VIX OTM call options. Second, by varying the off-diagonal component $X_{12}$ we can observe the flexibility of this additional degree of freedom for the conditional higher moments. It is also important to notice that $X_{12}$ provides an additional means to capture the stochasticity of the skew effect implied by SPX options (cf. Da Fonseca et al. (2008)). Hence, $X_{12}$ enhances the ability to preserve consistency with the underlying index. [Figure 4 about here.] The unconditional skewness (resp. kurtosis) for the MAD model is 1.9165 (resp. 5.6297), whereas for the Christoffersen et al. (2009) model it amounts to 1.5403 (resp. 3.7024). Hence, we also obtain an improved modeling of higher unconditional moments. The enormous flexibility of Wishart processes can further be illustrated by simulating the $\mathrm{VIX}^2$ dynamics in (4) and plotting the corresponding VIX option prices and histograms. Since we enforced $\beta$ to be a positive integer, we use the following proposition in combination with the Euler-Maruyama scheme to simulate equation (4). Proposition 3.2. Let $\beta \in \mathbb{N} \setminus \{1\}$ and assume $Y = \{Y_t : t \in [0, T]\}$ follows a matrix Ornstein-Uhlenbeck process in $\mathbb{R}^{2 \times \beta}$ with dynamics
$$dY_t = M Y_t\, dt + Q^\top d\widetilde{B}^{\mathbb{Q}}_t, \qquad Y_0 = y \in \mathbb{R}^{2 \times \beta},$$
where $\widetilde{B}^{\mathbb{Q}}$ is a matrix of standard Brownian motions in $\mathbb{R}^{2 \times \beta}$ under the risk-neutral measure $\mathbb{Q}$ and $M, Q \in \mathbb{R}^{2 \times 2}$. Then, $X := Y Y^\top$ has the dynamics (2) with $B^{\mathbb{Q}} := \sqrt{X}^{-1} Y \widetilde{B}^{\mathbb{Q}\top}$. Proof. See Appendix A. Figure 5 depicts on the left side the simulated VIX option prices using 30,000 Monte Carlo simulations, a risk-free interest rate of 1%, and the parameter set given in (7). We can deduce that the MAD implied VIX model prices are higher in comparison to the Christoffersen et al. (2009) model for all the strike prices, and the decay to zero is significantly slower, implying a larger support
  • 15. of the bulk of the $\mathrm{VIX}^2$ dynamics' probability distribution. This observation is also confirmed by the corresponding histograms of the simulated VIX dynamics on the right side of Figure 5, thereby justifying the importance of the off-diagonal components for the tail distribution. Also, as pointed out by Gatheral (2013), it is a stylized fact that the distribution of volatility (whether realized or implied) should be roughly lognormal.[12] Such a behavior is well replicated by the MAD model, whereas the Christoffersen et al. (2009) model still exhibits too little mass concentrated at high VIX states, which can be seen from the histograms in Figure 5. [Figure 5 about here.] Finally, an alternative way of interpreting the flexibility of $X$, which has been proposed by Gruber et al. (2010) and employed by Leippold and Trojani (2010), is by using at time $t$ the spectral decomposition $X_t = O_t \mathbf{V}_t O_t^\top$, where $\mathbf{V}_t$ is a diagonal matrix with the eigenvalues $V_t^1$ and $V_t^2$, and
$$O_t := O(\alpha_t) := \begin{pmatrix} \cos(\alpha_t) & -\sin(\alpha_t) \\ \sin(\alpha_t) & \cos(\alpha_t) \end{pmatrix}$$
is a matrix of orthonormal eigenvectors written in polar coordinates with angle $\alpha_t \in [0, \pi]$. Observe that $V = V^1 + V^2$ holds, and therefore, in combination with some lengthy calculations (see (Gruber et al., 2010, Appendix B)), we can decompose $X$ into a volatility part and a structural part as follows:
$$X_t = \frac{V_t}{2} \left[ I_2 + \left(2\frac{V_t^1}{V_t} - 1\right) \begin{pmatrix} \cos(2\alpha_t) & \sin(2\alpha_t) \\ \sin(2\alpha_t) & -\cos(2\alpha_t) \end{pmatrix} \right],$$
where $V$ is equal to the total variance and $V^1 / V$ is the fraction of total variance explained by the first volatility factor. If we set $\cos(2\alpha) = 1$, i.e. $\alpha = 0$ or $\alpha = \pi$, we obtain a two-factor volatility model, and if we additionally impose $V^1 / V = 1$, the state $X$ breaks down to the original Heston (1993) model. Thus, $\alpha \in (0, \pi)$ identifies the incremental impact of $X_{12}$. 
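The two representations of the state in the spectral decomposition above can be checked against each other numerically. The following is a minimal sketch, assuming that $O(\alpha)$ is a plain 2×2 rotation matrix; the function names and the eigenvalue pair (0.12, 0.03) are hypothetical and chosen only for illustration.

```python
import numpy as np


def rotation(alpha):
    """Eigenvector matrix O(alpha), assumed here to be a plain 2x2 rotation."""
    c, s = np.cos(alpha), np.sin(alpha)
    return np.array([[c, -s], [s, c]])


def state_from_eigen(v1, v2, alpha):
    """Spectral form X = O diag(V1, V2) O^T."""
    O = rotation(alpha)
    return O @ np.diag([v1, v2]) @ O.T


def state_from_level_and_share(v_total, share1, alpha):
    """Volatility/structure form X = V/2 * (I + (2 V1/V - 1) * S(2 alpha))."""
    structure = np.array([[np.cos(2 * alpha), np.sin(2 * alpha)],
                          [np.sin(2 * alpha), -np.cos(2 * alpha)]])
    return 0.5 * v_total * (np.eye(2) + (2.0 * share1 - 1.0) * structure)


# Hypothetical eigenvalues: total variance 0.15, first factor explains 80%.
v1, v2 = 0.12, 0.03
for alpha in np.linspace(0.0, np.pi, 5):
    X = state_from_level_and_share(v1 + v2, v1 / (v1 + v2), alpha)
    print(f"alpha={alpha:.3f}  X12={X[0, 1]: .4f}  tr(X)={np.trace(X):.4f}")
```

The trace stays fixed at the total variance while the off-diagonal entry $X_{12}$ moves with the angle, which is the extra degree of freedom discussed in the text.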
We fix the volatility level $V$ and the volatility composition $V^1 / V$, and plot in Figure 6 the admissible domain for the VIX values by varying $\alpha \in [0, \pi]$ and employing the parameter set given in (7). We can notice that for $\alpha \in (0, \pi)$ the MAD
[12] Empirical studies showing that the distribution of volatility is lognormal include, among others, Cizeau et al. (1997), Andersen et al. (2001a), and Andersen et al. (2001b).
  • 16. model generates a wide degree of variability in the VIX values, independently of the conditional level and composition of the spot volatility, respectively. Overall, the minimum is achieved at 0.4055 whereas the maximum amounts to 0.4652. This finding underpins the importance of the additional degree of freedom in the MAD framework, which improves the modeling of higher conditional and unconditional moments, respectively, in the underlying VIX transition probability density. [Figure 6 about here.] Let us remark that in order to span this incomplete market setting, which arises from the stochastic correlation, we can consider correlation products such as correlation swaps.[13]
3.3 Transform Analysis
The entire framework is based on an affine setup which allows us, according to Duffie et al. (2000), to efficiently solve our financial pricing problem by means of transform methods. Therefore, we aim at an analytical characterization of the discounted Laplace transform of $\mathrm{VIX}_T^2$ and $S_T$, respectively, at time $t$ under $\mathbb{Q}$:
$$\Psi_{\mathrm{VIX}^2}(\omega, X_t, t, T) := \mathbb{E}^{\mathbb{Q}}\left[\exp\left(-\int_t^T R(X_s)\, ds + \omega \mathrm{VIX}_T^2\right) \Big|\, \mathcal{F}_t\right], \quad (8)$$
$$\Psi_S(\vartheta, S_t, X_t, t, T) := \mathbb{E}^{\mathbb{Q}}\left[\exp\left(-\int_t^T R(X_s)\, ds + \vartheta S_T\right) \Big|\, \mathcal{F}_t\right], \quad (9)$$
where $\omega, \vartheta \in \mathbb{C}$, and the short rate process $R = \{R(X_t) : t \in [0, T]\}$ is affine, i.e. $R(x) = \rho_0 + \operatorname{tr}(\rho_1 x)$ with $\rho_0 \in \mathbb{R}_{\geq 0}$, $\rho_1 \in S_2^+$. For simplicity's sake we set the risk-free rate constant, i.e. $\rho_1 = 0$. One may argue that the error of a constant interest rate can become substantially large, but this concern only holds for long-dated financial derivatives. Since SPX and VIX options are much shorter dated, we do not encounter this issue in our setting. Proposition 3.3. Let Assumption 3.1 and Theorem 2.4, Theorem 2.6, and Proposition 4.9 in
[13] See Buraschi et al. (2014) for a recent empirical study on correlation swaps which provide a natural traded proxy for the price of correlation risk.
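  • 17. Cuchiero et al. (2011) be satisfied. Furthermore, assume a constant risk-free rate, i.e. $\rho_1 = 0$, and define $\tau := T - t$. Then, the discounted Laplace transform in (8) is exponentially affine:
$$\Psi_{\mathrm{VIX}^2}(\omega, X_t, t, T) = \exp\left(\phi_X(\tau) + \operatorname{tr}\left(\psi_X(\tau) X_t\right)\right)$$
with the functions $\phi_X(\tau) \in \mathbb{R}$ and $\psi_X(\tau) \in S_2^+$ that solve the system of matrix Riccati differential equations:
$$\frac{d\psi_X(\tau)}{d\tau} = M^\top \psi_X(\tau) + \psi_X(\tau) M + 2\psi_X(\tau) Q^\top Q \psi_X(\tau), \quad (10)$$
$$\frac{d\phi_X(\tau)}{d\tau} = -\rho_0 + \beta \operatorname{tr}\left(Q^\top Q \psi_X(\tau)\right), \quad (11)$$
subject to the terminal conditions $\phi_X(0) = \omega \alpha_{\mathrm{VIX}^2}$ and $\psi_X(0) = \omega \beta_{\mathrm{VIX}^2}$. Proof. See Appendix A. Proposition 3.4. Let Assumption 3.1 be satisfied, assume a constant risk-free rate, i.e. $\rho_1 = 0$, and define $\tau := T - t$. Then, the discounted Laplace transform in (9) is exponentially affine:
$$\Psi_S(\vartheta, S_t, X_t, t, T) = \exp(\vartheta S_t) \exp\left(\phi_S(\tau) + \operatorname{tr}\left(\psi_S(\tau) X_t\right)\right)$$
with the functions $\phi_S(\tau) \in \mathbb{R}$ and $\psi_S(\tau) \in S_2^+$ that solve the system of matrix Riccati differential equations:
$$\frac{d\psi_S(\tau)}{d\tau} = \frac{\vartheta(\vartheta - 1)}{2} I_2 + \psi_S(\tau)\left(M + \vartheta Q^\top P^\top\right) + \left(M^\top + \vartheta P Q\right)\psi_S(\tau) + 2\psi_S(\tau) Q^\top Q \psi_S(\tau), \quad (12)$$
$$\frac{d\phi_S(\tau)}{d\tau} = -\rho_0 + \beta \operatorname{tr}\left(Q^\top Q \psi_S(\tau)\right), \quad (13)$$
subject to the terminal conditions $\phi_S(0) = 0$ and $\psi_S(0) = 0$. Proof. See Appendix A. Hereupon, the closed-form solutions to the matrix Riccati equations (10), (11), (12), and (13)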
  • 18. can be obtained by linearizing the flow of the differential equation using Radon's lemma (see, e.g., Freiling (2002) and Lemma B.2 in the supplementary appendix). Corollary 3.1. Let the matrix Riccati equations (10) and (11), and Theorem 2.4, Theorem 2.6, and Proposition 4.9 in Cuchiero et al. (2011) be satisfied. Furthermore, define $\tau := T - t$. Then,
$$\psi_X(\tau) = \left(\omega \beta_{\mathrm{VIX}^2} C_{12}(\tau) + C_{22}(\tau)\right)^{-1} \left(\omega \beta_{\mathrm{VIX}^2} C_{11}(\tau) + C_{21}(\tau)\right),$$
where $C_{11}(\tau)$, $C_{12}(\tau)$, $C_{21}(\tau)$, and $C_{22}(\tau)$ are $2 \times 2$ blocks of the following matrix exponential:
$$\begin{pmatrix} C_{11}(\tau) & C_{12}(\tau) \\ C_{21}(\tau) & C_{22}(\tau) \end{pmatrix} := \exp\left[\tau \begin{pmatrix} M & -2Q^\top Q \\ 0 & -M^\top \end{pmatrix}\right].$$
Given the solution for $\psi_X(\tau)$, the coefficient $\phi_X(\tau)$ follows by direct integration:
$$\phi_X(\tau) = -\rho_0 \tau - \frac{\beta}{2} \operatorname{tr}\left[\ln\left(C_{22}(\tau)\right) + \tau M^\top\right] + \omega \alpha_{\mathrm{VIX}^2},$$
where $\ln(\cdot)$ denotes the matrix logarithm. Proof. See Appendix A in the supplementary appendix. Corollary 3.2. Let the matrix Riccati equations (12) and (13) be satisfied. Furthermore, define $\tau := T - t$. Then,
$$\psi_S(\tau) = C_{22}(\tau)^{-1} C_{21}(\tau),$$
where $C_{21}(\tau)$ and $C_{22}(\tau)$ are $2 \times 2$ blocks of the following matrix exponential:
$$\begin{pmatrix} C_{11}(\tau) & C_{12}(\tau) \\ C_{21}(\tau) & C_{22}(\tau) \end{pmatrix} := \exp\left[\tau \begin{pmatrix} M + \vartheta Q^\top P^\top & -2Q^\top Q \\ \frac{\vartheta(\vartheta - 1)}{2} I_2 & -\left(M^\top + \vartheta P Q\right) \end{pmatrix}\right].$$
Given the solution for $\psi_S(\tau)$, the coefficient $\phi_S(\tau)$ follows by direct integration:
$$\phi_S(\tau) = -\rho_0 \tau - \frac{\beta}{2} \operatorname{tr}\left[\ln\left(C_{22}(\tau)\right) + \tau M^\top + \tau \vartheta P Q\right].$$
Proof. See Appendix A in the supplementary appendix. For the pricing of options on the VIX, we cannot follow the seminal technique, the so-called
  • 19. fast Fourier transform (FFT), developed in Carr and Madan (1999). To illustrate why, we follow the arguments in Bardgett et al. (2014), who have pointed out the technical difficulties in this framework. We can write the price of a call option with strike $K$ and maturity $T$ at time $t$ as
$$C(\mathrm{VIX}_t, K, t, T) = e^{-r(T-t)} \int_0^\infty \left(\sqrt{x} - K\right)^+ f\left(x \mid \mathrm{VIX}_t^2 = y\right) dx,$$
where $x$ denotes the value of the $\mathrm{VIX}^2$ at time $T$, $f(x \mid \mathrm{VIX}_t^2 = y)$ is the probability density function (PDF) of $x$ given today's value $y \in \mathbb{R}_+$, and $r$ is the (constant) risk-neutral interest rate. Comparing this expression to an SPX call option payoff $\left(e^y - e^k\right)^+$, where $y := \ln(S)$ with $S$ as the underlying index price at maturity $T$ and $k := \ln(K)$ with $K$ as the strike price, we see that to apply the Fourier method of Carr and Madan (1999) we would need to have an affine dependence on the log of the VIX, which is however incompatible with affine models for log-returns. To circumvent this technical issue, we henceforth follow the transform method in Fang and Oosterlee (2008). Fang and Oosterlee (2008) develop an option pricing method for European-style options, called the COS method, based on Fourier-cosine series expansions. The main idea is to decompose a density function into a linear combination of cosine functions, since the series coefficients of many density functions can be accurately obtained from their characteristic functions. This particular decomposition allows for easy and highly efficient numerical computations which attain, in most cases, an exponential convergence rate and a linear computational complexity. Theorem 3.1. 
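Given the truncation interval $[a, b] \subset \mathbb{R}_{\geq 0}$ for the (compact) support of the PDF $f(x \mid \mathrm{VIX}_t^2 = y)$ with $x \in \mathbb{R}_{\geq 0}$ and $y \in \mathbb{R}_+$, the price $\Upsilon(X_t, K, t, T)$ of a European contingent claim on the VIX with payoff function $g(\mathrm{VIX}_T^2) = \left(\sqrt{\mathrm{VIX}_T^2} - K\right)^+$ and time $t \in [0, T]$ is
$$\Upsilon(X_t, K, t, T) = {\sum_{k=0}^{N-1}}' A_k(a, b)\, \varphi_k(a, b), \qquad N \in \mathbb{N},$$
with the coefficients
$$A_k(a, b) := \operatorname{Re}\left[\exp\left(\phi_X(T - t) + \operatorname{tr}\left(\psi_X(T - t) X_t\right)\right) \exp(c_k a)\right], \qquad c_k \equiv -\frac{i k \pi}{b - a},$$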
  • 20.
$$\varphi_k(a, b) := \begin{cases} \chi := \dfrac{2}{b - a} \displaystyle\int_a^b \left(\sqrt{x} - K\right)^+ \cos\left(k\pi \dfrac{x - a}{b - a}\right) dx, & k > 0, \\[2ex] \dfrac{2}{b - a} \left[\dfrac{2}{3} b^{3/2} + K\left(\dfrac{K^2}{3} - b\right)\right], & k = 0, \end{cases}$$
where $\chi$ can be obtained in closed form:
$$\chi = \frac{2}{b - a} \operatorname{Re}\left\{ e^{c_k a} \left[ \frac{\sqrt{b}\, e^{c_k b}}{-c_k} + \frac{\sqrt{\pi}}{2 c_k^{3/2}} \left( \operatorname{erfz}\left(\sqrt{c_k b}\right) - \operatorname{erfz}\left(K\sqrt{c_k}\right) \right) \right] \right\},$$
with $\operatorname{erfz}(z) := \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2} dt$, $z \in \mathbb{C}$, the complex Gauss error function.[14] The prime superscript in the sum indicates that the first term in the summation is divided by two, and the coefficients $\phi_X(T - t)$ and $\psi_X(T - t)$ are given in Corollary 3.1 with $\omega = -c_k$. Proof. See Appendix A. For the sake of consistency, and since the COS method requires far fewer evaluations of the characteristic function for a given level of accuracy than the FFT method, we employ this particular technique also for the pricing of options on the SPX. Theorem 3.2. Let the log-asset prices be denoted by $y := \ln(F_T / K)$ and $y_0 := \ln(F_t / K)$. Given the truncation interval $[a, b] \subset \mathbb{R}$ for the (compact) support of the PDF $f(y \mid y_0 = z)$ with $y, z \in \mathbb{R}$, the price $\Phi(S_t, K, t, T)$ of a European contingent claim on the SPX with payoff function $g(e^y) = K\left(e^y - 1\right)^+$ and time $t \in [0, T]$ is
$$\Phi(S_t, K, t, T) = {\sum_{k=0}^{N-1}}' B_k(a, b)\, \gamma_k(a, b), \qquad N \in \mathbb{N},$$
with the coefficient
$$B_k(a, b) := \operatorname{Re}\left[\exp\left(-c_k\left(S_t - \ln(K)\right)\right) \exp\left(\phi_S(T - t) + \operatorname{tr}\left(\psi_S(T - t) X_t\right)\right) \exp(c_k a)\right], \qquad c_k \equiv -\frac{i k \pi}{b - a},$$
[14] This error function can be evaluated using the infinite series approximation (7.1.29) in Abramowitz and Stegun (1972).
  • 21. and the closed-form solution of $\gamma_k(a, b)$ can be found in (Fang and Oosterlee, 2008, Section 3). The coefficients $\phi_S(T - t)$ and $\psi_S(T - t)$ are given in Corollary 3.2 with $\vartheta = -c_k$. Proof. See Appendix A. Let us remark that the price of a put option on the VIX and SPX, respectively, can be calculated analogously. Another approach is to back out the SPX (VIX) put option price via the ordinary (modified) put-call parity.[15] In summary, our highly analytical yet tractable model yields closed-form solutions for VIX and SPX options, and we can therefore circumvent the need for supplementary numerical solutions, which would induce additional approximation errors. Finally, we present a closed-form solution for the VIX futures price $F(X_t, t, T)$, i.e. with current time $t$ and settlement time $T$, which can be represented by
$$F(X_t, t, T) := \mathbb{E}^{\mathbb{Q}}\left[\sqrt{\mathrm{VIX}_T^2} \,\Big|\, \mathcal{F}_t\right] = \mathbb{E}^{\mathbb{Q}}\left[\sqrt{\alpha_{\mathrm{VIX}^2} + \operatorname{tr}\left(\beta_{\mathrm{VIX}^2} X_T\right)} \,\Big|\, \mathcal{F}_t\right].$$
Observing that the discount is not used for VIX futures and setting the strike price $K$ in Theorem 3.1 to zero yields the following result. Theorem 3.3. Given the truncation interval $[a = 0, b] \subset \mathbb{R}_{\geq 0}$ for the (compact) support of the PDF $f(x \mid \mathrm{VIX}_t^2 = y)$ with $x \in \mathbb{R}_{\geq 0}$ and $y \in \mathbb{R}_+$, the VIX futures price $F(X_t, t, T)$ with payoff function $g(\mathrm{VIX}_T^2) = \sqrt{\mathrm{VIX}_T^2}$ and time $t \in [0, T]$ is
$$F(X_t, t, T) = {\sum_{k=0}^{N-1}}' C_k(a, b)\, \zeta_k(a, b), \qquad N \in \mathbb{N},$$
[15] The modified put-call parity is given in (1).
  • 22. with the coefficients
$$C_k(a, b) := \operatorname{Re}\left[\exp\left(\phi_X(T - t) + \operatorname{tr}\left(\psi_X(T - t) X_t\right)\right) \exp(c_k a)\right], \qquad c_k \equiv -\frac{i k \pi}{b - a},$$
$$\zeta_k(a, b) := \begin{cases} \xi := \dfrac{2}{b - a} \displaystyle\int_a^b \sqrt{x} \cos\left(k\pi \dfrac{x - a}{b - a}\right) dx, & k > 0, \\[2ex] \dfrac{4 b^{3/2}}{3(b - a)}, & k = 0, \end{cases}$$
where $\xi$ can be obtained in closed form:
$$\xi = \frac{2}{b - a} \operatorname{Re}\left\{ e^{c_k a} \left[ \frac{\sqrt{b}\, e^{c_k b}}{-c_k} + \frac{\sqrt{\pi}\, \operatorname{erfz}\left(\sqrt{c_k b}\right)}{2 c_k^{3/2}} \right] \right\}.$$
The coefficients $\phi_X(T - t)$ and $\psi_X(T - t)$ are given in Corollary 3.1 with $\rho_0 = 0$ and $\omega = -c_k$.
4 Pricing Performance
Every theoretical model needs to find its justification in the data, which is the aim of the current section. To do so, we focus on the VIX market to demonstrate the model flexibility and leave a possible extension to the joint calibration of SPX and VIX option data as an interesting avenue of future research. Firstly, for reasons of model parsimony we assume that $Q$ is a symmetric matrix. Moreover, we compare our MAD framework with two different reference models:
1. A one-factor stochastic volatility model with jumps in both the asset return process and the volatility process. Henceforth, we denote this model by SVJJ.
2. A parameterized version of the Christoffersen et al. (2009) model, henceforth denoted by SV-2F model, which is naturally nested in our MAD model.
We refer to Appendix C for the detailed description of the SVJJ model. In terms of the nested model, setting the matrices $M$, $Q$, and $X$ as diagonal yields the SV-2F model. To see this linkage,
  • 23. observe that with $M$, $Q$, and $X$ assumed to be diagonal, we obtain, for $i = 1, 2$,
$$dX_{ii,t} = \left(\beta (Q_{ii})^2 + 2M_{ii} X_{ii,t}\right) dt + 2Q_{ii} \sqrt{X_{ii,t}}\, dB^*_{i,t}, \quad (14)$$
where $B^* = (B^*_1, B^*_2)$ defined by $B^*_i := \sqrt{X_{ii}}^{-1} \sum_{j=1}^{2} X_{ij} B^{\mathbb{Q}}_{ji}$ is a vector of two independent Brownian motions. Hence, in this simple case where the parameters are diagonal matrices, the diagonal components of the Wishart process are independent CIR processes (cf. Benabid et al. (2010)). Finally, by identifying the parameters in (14) with the parameters in Christoffersen et al. (2009), we achieve the parameterized nesting, i.e. for $i = 1, 2$, we have
$$a_i \equiv \beta (Q_{ii})^2, \qquad b_i \equiv -2M_{ii}, \qquad \sigma_i \equiv 2Q_{ii}.$$
In this setting, the off-diagonal component $X_{12}$ becomes futile, and the stochastic skewness is completely driven by the diagonal factors $X_{11}$ and $X_{22}$. Hence, the dynamics of the risk-neutral skewness in this model is completely governed by the dynamics of the volatility components. Notice that the comparison between the MAD and the SV-2F model allows us to assess the importance of the stochastic co-volatility factor for the replication of the positive implied volatility skew of VIX options. For the calibration we use the so-called market implied approach, which relies on the existence of semi-closed or closed-form expressions for the prices of benchmark derivatives. Let $I \subset \mathbb{N}$ denote the set of (noiseless) VIX call option data available on a particular date.[16] Definition 4.1. Our loss function, which we refer to as average relative dollar error (\$ARE), is defined by
$$\$\mathrm{ARE}(\Theta) := \frac{1}{\#I} \sum_{i \in I} \frac{\left|P_i^{\mathrm{Mkt}} - P_i(\Theta)\right|}{P_i^{\mathrm{Mkt}}}, \quad (15)$$
with $\#I$ denoting the cardinality of the set $I$ and where, for all $i \in I$, $P_i^{\mathrm{Mkt}}$ is the observed market call price and $P_i$ is the model-implied call price for a given parameter set $\Theta$.
[16] For the detailed data treatment we refer to Appendix C in the supplementary appendix where we describe which exclusionary criteria are applied.
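The loss function in Definition 4.1 is simple enough to state in a few lines. The sketch below assumes that the numerator in (15) is an absolute difference; the function name and the toy prices are hypothetical and not taken from the data set.

```python
from typing import Sequence


def average_relative_error(market_prices: Sequence[float],
                           model_prices: Sequence[float]) -> float:
    """Average relative dollar error: mean of |P_mkt - P_model| / P_mkt."""
    if len(market_prices) == 0 or len(market_prices) != len(model_prices):
        raise ValueError("price vectors must be non-empty and of equal length")
    if any(p <= 0.0 for p in market_prices):
        raise ValueError("market prices must be strictly positive")
    total = sum(abs(mkt - mdl) / mkt
                for mkt, mdl in zip(market_prices, model_prices))
    return total / len(market_prices)


# Toy example with two hypothetical VIX call quotes.
print(average_relative_error([2.0, 4.0], [1.0, 5.0]))
```

Dividing by the market price gives cheap OTM calls the same weight as expensive ITM calls, which is why this metric emphasizes the region where the implied volatility skew is most visible.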
  • 24. For our empirical investigations we have
MAD Model (10 Parameters): $\Theta = \{\beta, M_{11}, M_{12}, M_{22}, Q_{11}, Q_{12}, Q_{22}, X_{11,t}, X_{12,t}, X_{22,t}\}$,
SV-2F Model (7 Parameters): $\Theta = \{\beta, M_{11}, M_{22}, Q_{11}, Q_{22}, X_{11,t}, X_{22,t}\}$,
SVJJ Model (9 Parameters): $\Theta = \{\kappa, m, \sigma, \lambda_0, \lambda_1, \mu_S, \sigma_S, \nu, v_{t-}\}$.
We consider the relative absolute distance to favor particularly OTM call options since the implied volatilities are the highest for these options (see Figure 3) and we are interested in fitting the stylized fact of a positive implied volatility skew. By using implied volatilities directly we would face many computational difficulties, i.e. numerical inversion of the Black-Scholes formula at each minimization step. Therefore, we use simple option prices as the metric.[17] Overall, this specification is in line with the current literature such as Papanicolaou and Sircar (2014). The loss function (15) implies a high-dimensional non-convex optimization problem. To mitigate the ill-posedness problem and to find the global minimum we use a differential evolution (DE) algorithm.[18] To achieve a representative result for the entire sample, we divide the data into three different monthly subsamples which encompass different states of the economy (∅ denotes the average sign):[19]
I. October 2008, ∅VIX = 61.18,
II. April 2010, ∅VIX = 17.42,
III. August 2011, ∅VIX = 35.03.
Estimating the model on each weekday is computationally very challenging. Therefore, to further
[17] Instead, one could also use the vega approximation of Cont and Tankov (2004).
[18] A DE algorithm is an efficient heuristic for the global optimization over continuous spaces. The inception of such evolutionary-based optimization strategies can be traced back to Storn and Price (1997) and some earlier papers cited therein. 
In brief, a DE algorithm is a population-based optimizer that attacks the starting point problem by sampling the objective function at multiple, randomly chosen initial points and generates new points that are perturbations of existing points. We refer to Price et al. (2005) and Chakraborty (2008) for a review of differential evolution algorithms.
[19] A further motivation for these choices is driven by the empirical VIX distributions during the subsamples. They vary considerably across the data sets, which makes them challenging for a model to replicate.
  • 25. lighten the computational burden, we sample the data weekly every Wednesday to avoid weekday effects. The calibration results are reported in Table 2. To compare the effect of enforcing the Feller condition, i.e. $\sigma_i^2 < 2a_i$ for $i = 1, 2$, we calibrate two versions of the SV-2F model. The calibration results in the third column, marked as not enforcing the Feller condition, violate the condition, whereas the results in the fourth column, marked as enforcing it, satisfy the condition. Another way of enforcing this condition in our framework is by setting $\beta \geq 3$ (see also Section 3.2). Notice that we impose the constraint of distinct eigenvalues for the initial state of $X$ and therefore we ensure that $\operatorname{tr}(X_t) > 0$ a.s. for all $t \in [0, T]$ (see also Section 3.2). Overall, in the first and third data set, respectively, the in-sample performance of the MAD model is comparable to the SVJJ model, whereas in the second data set the SVJJ model exhibits a better performance. If we look at the SV-2F model results, we can observe that modeling the stochastic co-volatility factor can significantly improve the calibration results in all states. One possible reason is the improved modeling of higher conditional moments. In particular, it seems that the MAD model can generate a higher degree of variety in the conditional skewness and kurtosis, respectively, of the transition probability density compared to the SV-2F model. Translated into statistical terms, the volatility density implied by the MAD model is more right-skewed, i.e. has more mass concentrated at high volatility levels than at low volatility levels, than the implied volatility density of the SV-2F model. This concentration is needed to replicate the positive implied volatility skew observed in the VIX market. 
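The right skew attributed to the MAD model can be made concrete with the unconditional moment formulas of Section 3.2 evaluated at the parameter set (7). The sketch below is only an illustration under the assumption that those formulas read as products of $Q^\top Q (2M)^{-1}$ raised to the indicated powers; the function name is hypothetical.

```python
import numpy as np

# Parameter set (7) from Section 3.2.
M = np.array([[-2.9671, 3.1310], [3.1310, -3.4838]])
Q = np.array([[0.6857, -0.4236], [-0.4236, 0.5706]])
beta = 2.0


def unconditional_moments(M, Q, beta):
    """Mean, variance, skewness and kurtosis of V = tr(X) in the stationary limit."""
    A = Q.T @ Q @ np.linalg.inv(2.0 * M)

    def tr_pow(k):
        # Trace of the k-th matrix power of A.
        return np.trace(np.linalg.matrix_power(A, k))

    mean = -beta * tr_pow(1)
    var = 2.0 * beta * tr_pow(2)
    skew = -2.0 * np.sqrt(2.0) * tr_pow(3) / (np.sqrt(beta) * tr_pow(2) ** 1.5)
    kurt = 12.0 * tr_pow(4) / (beta * tr_pow(2) ** 2)
    return mean, var, skew, kurt


mean, var, skew, kurt = unconditional_moments(M, Q, beta)
print(f"skewness = {skew:.4f}, kurtosis = {kurt:.4f}")
```

Under these assumptions the output should be close to the values 1.9165 and 5.6297 quoted for the MAD model in Section 3.2.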
As already pointed out in Section 3.2, given the fact that only considering orthogonal diagonal components restricts the structure of $f^{\mathbb{Q}}\left(\mathrm{VIX}_T^2 \mid \mathrm{VIX}_t^2\right)$, which is however of high importance for the pricing, these empirical fitting results underpin the benefit of stochastic correlation within the risk factors $X_{11}$ and $X_{22}$. Additionally imposing the Feller condition in the SV-2F model can add a strong constraint to the underlying transition density, which is reflected in the partly poor in-sample fitting results. [Table 2 about here.] To illustrate the contribution of the additional degree of freedom in the MAD model, we investigate
  • 26. the parameter estimates, the model-implied option prices, and the model-implied implied volatilities on August 17, 2011 in more detail. The qualitative implications of the other days remain similar. As an additional comparison, we calibrate the original Heston (1993) model, henceforth denoted by SV-1F. The parameter estimates are reported in Table 3. For the sake of consistency, we only consider the original SV-2F formulation for the comparison, since the additional burden of the Feller condition can distort the results. The right economic interpretation of the parameters in the MAD and SV-2F model is difficult. However, we can observe that the SV-2F model exhibits a higher mean reversion than the MAD model since $M$ has larger negative eigenvalues. Moreover, $X_{12}$ is significantly larger than zero, implying a positive stochastic correlation within the risk factors $X_{11}$ and $X_{22}$. The SVJJ model implies an almost twice as large speed of mean reversion, whereas the volatility of volatility is much lower compared to the SV-1F model. This behavior is obviously linked to SVJJ's ability to substitute high volatility of volatility with volatility jumps. Notice that the SVJJ model exhibits approximately two and a half expected jumps per year of magnitude $\nu = 0.1875$. [Table 3 about here.] To get structural insights into the in-sample fit of the models, we plot in Figure 7 and Figure 8 the model-implied option prices and implied volatilities, respectively, in conjunction with the corresponding market values for four different times to maturity. The closing at-the-money (ATM) VIX futures prices are 27.65, 26.58, 25.86, and 24.91 for 35, 63, 91, and 126 days, respectively. The one-factor stochastic volatility model SV-1F systematically fails in fitting the option prices and implied volatilities. In particular, the implied option price decay is too high, which is due to the chi-squared volatility density, inducing too little mass at high volatility states. 
This in turn increases the probability of small VIX levels and consequently the values of in-the-money (ITM) VIX call options (or OTM VIX put options). Hence, we observe negative implied volatility skews which confirm the findings of Gatheral (2008). The pure diffusion model SV-2F undervalues the ATM VIX call option prices and fails to replicate the implied volatility patterns for almost all times to maturity and log-moneyness levels. Nonetheless, we can observe that adding an additional stochastic volatility term
  • 27. changes the shape of the implied volatility skews into a positive slope. Therefore, the SV-2F model can widen the bulk of high volatility states. In terms of option prices, we find almost no difference between the MAD and SVJJ model. If we look at the implied volatility levels, we note that both models are able to generate the appropriate positive implied volatility skews, but the SVJJ model tends to increase the implied volatilities for ITM VIX call options whereas the MAD model generates decreasing implied volatility patterns for all log-moneyness levels. [Figure 7 about here.] [Figure 8 about here.] We emphasize that this empirical analysis does not include any final conclusion for model selection since we merely contemplate the in-sample fitting performance. Moreover, we do not question the usefulness of adding a jump process to the return and/or volatility dynamics, respectively. Instead, we reckon that modeling an implied correlation process for the risk factors is an alternative way to deal with model deficiencies. We would like to stress that these fitting exercises are by no means complete. In order to fully understand the importance of jumps and/or stochastic co-volatility factors, one would need to implement a sequential calibration exercise for a complete time series of cross-sectional option data, which is however beyond the scope of our current computational infrastructure. We leave this task for future research. It is worth mentioning that in terms of computational complexity, which is a very important issue from a practitioner's point of view, the MAD model is two and a half to three times faster than the SVJJ model in daily fitting.[20] Therefore, our model is practicable and yet flexible enough to account for the positive skew in the VIX market.
[20] Evidently, this fact is due to the numerical resolution of the generalized Riccati differential equations in the SVJJ model.
  • 28. 5 Conclusion Due to the high analytical tractability and the enormous flexibility of the Wishart process, we propose an application to the pricing of CBOE VIX options. We carry out the dynamics for the CBOE VIX in a linear affine way and the discounted Laplace transform exhibits an exponentially affine property. The tractable model structure lightens the computational burden and facilitates a fast identification of the parameter estimates. Eventually, we empirically show that modeling the stochastic co-volatility factor can significantly improve the in-sample fitting results due to the improved modeling of higher conditional moments in the underlying transition probability density.
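The linear affine representation of the VIX squared in Proposition 3.1 can be cross-checked numerically. The sketch below is a hedged illustration, not the author's implementation: it assumes a symmetric $M$, uses the parameter set (7), and compares the closed-form coefficients (5) and (6) with a trapezoidal quadrature of the conditional expectation of $\operatorname{tr}(X_s)$ taken from the proof of Proposition 3.1. All function names are hypothetical.

```python
import numpy as np

# Parameter set (7) from Section 3.2.
M = np.array([[-2.9671, 3.1310], [3.1310, -3.4838]])
Q = np.array([[0.6857, -0.4236], [-0.4236, 0.5706]])
X0 = np.array([[0.1442, -0.0268], [-0.0268, 0.0050]])
beta = 2.0
TAU_VIX = 30.0 / 365.0


def expm_sym(A):
    """Matrix exponential of a symmetric matrix via its eigendecomposition."""
    w, U = np.linalg.eigh(A)
    return (U * np.exp(w)) @ U.T


def vix2_coefficients(M, Q, beta, tau=TAU_VIX):
    """Scalar alpha and 2x2 matrix beta of VIX^2 = alpha + tr(beta X), M symmetric."""
    inv2M = np.linalg.inv(2.0 * M)
    QtQ = Q.T @ Q
    E = expm_sym(2.0 * tau * M) - np.eye(2)
    alpha = beta * np.trace(QtQ @ inv2M @ inv2M @ E / tau - QtQ @ inv2M)
    beta_mat = E @ inv2M / tau
    return alpha, beta_mat


def vix2_by_quadrature(M, Q, beta, X, tau=TAU_VIX, n=2000):
    """Trapezoidal average of E[tr(X_s) | F_t] over the 30-day window."""
    inv2M = np.linalg.inv(2.0 * M)
    QtQ = Q.T @ Q
    vals = []
    for u in np.linspace(0.0, tau, n + 1):
        e = expm_sym(2.0 * u * M)
        vals.append(np.trace(X @ e) + beta * np.trace(QtQ @ inv2M @ (e - np.eye(2))))
    h = tau / n
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1])) / tau


alpha, beta_mat = vix2_coefficients(M, Q, beta)
vix2_closed = alpha + np.trace(beta_mat @ X0)
print(vix2_closed, vix2_by_quadrature(M, Q, beta, X0))
```

The two printed numbers should agree up to quadrature error, which is a quick sanity check on the reconstruction of (4) to (6).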
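  • 29. Appendices
A Proofs
This section contains the mathematical proofs of Proposition 3.1, Proposition 3.2, Proposition 3.3, Proposition 3.4, Theorem 3.1, and Theorem 3.2. Proof of Proposition 3.1. It is shown in (Buraschi et al., 2008, Appendix C) that $\mathbb{E}^{\mathbb{Q}}[X_s \mid \mathcal{F}_t]$, where $0 \leq t \leq s \leq T$, has the solution
$$\mathbb{E}^{\mathbb{Q}}[X_s \mid \mathcal{F}_t] = \exp[(s - t) M]\, X_t \exp\left[(s - t) M^\top\right] + \beta \int_0^{s-t} \exp[uM]\, Q^\top Q \exp\left[uM^\top\right] du.$$
Assume now that $M$ is a symmetric matrix and take the trace of $\mathbb{E}^{\mathbb{Q}}[X_s \mid \mathcal{F}_t]$:
$$\mathbb{E}^{\mathbb{Q}}[\operatorname{tr}(X_s) \mid \mathcal{F}_t] = \operatorname{tr}\left(X_t \exp[2(s - t) M]\right) + \beta \operatorname{tr}\left[Q^\top Q (2M)^{-1} \left(\exp[2(s - t) M] - I_2\right)\right].$$
Finally, by integrating (3) between $t$ and $t + \tau_{\mathrm{VIX}}$, where $\tau_{\mathrm{VIX}} \equiv \frac{30}{365}$, we get to the solution
$$\mathrm{VIX}_t^2 = \frac{1}{\tau_{\mathrm{VIX}}} \operatorname{tr}\left[\left(X_t + \beta Q^\top Q (2M)^{-1}\right) (2M)^{-1} \left(\exp[2\tau_{\mathrm{VIX}} M] - I_2\right)\right] - \beta \operatorname{tr}\left[Q^\top Q (2M)^{-1}\right].$$
Doing the algebra yields the solution (4) with the coefficients in (5) and (6).[21] Notice that one way to deal with a non-symmetric matrix $M$ could be by setting $M = \frac{M + M^\top}{2}$.[22] Proof of Proposition 3.2. By applying the stochastic Leibniz rule, we obtain
$$dX_t = Y_t\, dY_t^\top + dY_t\, Y_t^\top + d\langle Y, Y^\top \rangle_t = \left(\beta Q^\top Q + M X_t + X_t M^\top\right) dt + \sqrt{X_t} \sqrt{X_t}^{-1} Y_t\, d\widetilde{B}^{\mathbb{Q}\top}_t Q + Q^\top d\widetilde{B}^{\mathbb{Q}}_t\, Y_t^\top \sqrt{X_t}^{-1} \sqrt{X_t}.$$
Now employing Lévy's characterization of Brownian motion, we can define the $\mathbb{R}^{2 \times 2}$-valued Brownian
[21] Recall that the trace operator is invariant under cyclic permutations.
[22] See also Benabid et al. (2010).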
  • 30. motion $\mathrm{d}B_t^{\mathbb{Q}} := \sqrt{X_t}^{-1} Y_t^\top \mathrm{d}\widetilde{B}_t^{\mathbb{Q}}$, where $\widetilde{B}^{\mathbb{Q}}$ denotes the Brownian motion driving $Y$, which completes the proof.
Proof of Proposition 3.3. Suppose that $X$ is an adapted Markov process in the state space $S_2^+$. Under some mild regularity conditions, the Lévy infinitesimal generator $\mathcal{A}^X$ of the matrix Markov process $X$ is defined for bounded functions $f(x) \in C^2(S_2^+; \mathbb{R})$ by
\[ \mathcal{A}^X f(x) := \operatorname{tr}\left[\left(\beta Q^\top Q + Mx + xM^\top\right) D + 2xDQ^\top QD\right] f(x), \tag{A.1} \]
where $D$ is a $2 \times 2$ matrix of differential operators with the $ij$-component given by $\frac{\partial}{\partial x_{ij}}$.[23] Since the generator exhibits an affine dependence on the state $x \in S_2^+$, we obtain, by separation of the variables, the exponential affine property with the matrix Riccati equations given in the proposition.
Proof of Proposition 3.4. We follow the same rationale as in the proof of Proposition 3.3 and repeat it here for the reader's convenience. Suppose that $X$ is an adapted Markov process in the state space $S_2^+$. Under some mild regularity conditions, the Lévy infinitesimal generator $\mathcal{A}^{SX}$ of $(S, X)$ is defined for bounded functions $f(s, x) \in C^{2,2}(\mathbb{R} \times S_2^+; \mathbb{R})$ by
\[ \mathcal{A}^{SX} f(s, x) := -\frac{1}{2}\operatorname{tr}(x)\frac{\partial f(s, x)}{\partial s} + \frac{1}{2}\operatorname{tr}(x)\frac{\partial^2 f(s, x)}{\partial s^2} + \operatorname{tr}\left[\left(\beta Q^\top Q + Mx + xM^\top\right) D + \left(DQ^\top P^\top x + xPQD\right)\frac{\partial}{\partial s} + 2xDQ^\top QD\right] f(s, x), \]
where $D$ is a $2 \times 2$ matrix of differential operators with the $ij$-component given by $\frac{\partial}{\partial x_{ij}}$.[24] Since the generator exhibits an affine dependence on the state $(s, x) \in \mathbb{R} \times S_2^+$, we obtain, by separation of the variables, the exponential affine property with the matrix Riccati equations given in the proposition.
[23] The infinitesimal generator of $X$ has been calculated by Bru (1991).
[24] The infinitesimal generator of $(S, X)$, in which $S$ represents the logarithmic spot price process, has been calculated by Da Fonseca et al. (2008) for a multifactor volatility Heston model.
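The closed-form expression for $\mathrm{VIX}_t^2$ obtained in the proof of Proposition 3.1 can be checked numerically. The following is a minimal sketch, assuming a symmetric $M$ and purely illustrative parameter values (they are not the calibrated estimates of the paper); the function names `expected_trace` and `vix_squared` are hypothetical. It evaluates the trace formula and can be compared against a numerical integration of $\mathbb{E}^{\mathbb{Q}}[\operatorname{tr}(X_s) \mid \mathcal{F}_t]$ over the 30-day window.

```python
import numpy as np
from scipy.integrate import quad
from scipy.linalg import expm

TAU_VIX = 30.0 / 365.0  # 30-day horizon used in the VIX definition


def expected_trace(X_t, M, Q, beta, h):
    """E[tr(X_{t+h}) | F_t] for a symmetric M (trace of the conditional mean)."""
    E = expm(2.0 * h * M)
    inv2M = np.linalg.inv(2.0 * M)
    return np.trace(X_t @ E) + beta * np.trace(Q.T @ Q @ inv2M @ (E - np.eye(2)))


def vix_squared(X_t, M, Q, beta, tau=TAU_VIX):
    """Closed-form VIX^2: time average of E[tr(X_s) | F_t] over [t, t + tau]."""
    inv2M = np.linalg.inv(2.0 * M)
    QtQ = Q.T @ Q
    inner = (X_t + beta * QtQ @ inv2M) @ inv2M @ (expm(2.0 * tau * M) - np.eye(2))
    return np.trace(inner) / tau - beta * np.trace(QtQ @ inv2M)


# Illustrative (assumed) inputs: symmetric negative definite M, arbitrary Q.
M_demo = np.array([[-1.0, 0.1], [0.1, -0.5]])
Q_demo = np.array([[0.3, -0.1], [0.0, 0.2]])
X_demo = np.array([[0.04, 0.01], [0.01, 0.02]])
BETA_DEMO = 3.0

closed_form = vix_squared(X_demo, M_demo, Q_demo, BETA_DEMO)
integral, _ = quad(lambda h: expected_trace(X_demo, M_demo, Q_demo, BETA_DEMO, h),
                   0.0, TAU_VIX)
numerical = integral / TAU_VIX
```

Because $M$ commutes with its own matrix exponential, the time integral of $\exp[2hM]$ reduces to $(2M)^{-1}(\exp[2\tau M] - I_2)$, which is why the closed form needs a single matrix exponential and no quadrature.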
  • 31. Proof of Theorem 3.1. The general COS formula for European contingent claims is taken from (Fang and Oosterlee, 2008, Section 3).[25] The corresponding coefficients are provided in Bardgett et al. (2014), for which we merely have to substitute our discounted Laplace transform into $A_k(a, b)$ while $\phi_k(a, b)$ remains the same. Hence, the result follows.
Proof of Theorem 3.2. Following (Fang and Oosterlee, 2008, Section 3) and taking into account the transformation $\Psi_S(\vartheta, S_t, X_t, t, T) - \vartheta \ln(K)$ for the discounted Laplace transform yields the result.
B The Non-Central Wishart Distribution of X
In this section we state the transition probability density $f^{\mathbb{Q}}(X_T \mid X_t)$ for a negative definite symmetric matrix $M \in \mathbb{R}^{2\times 2}$.
Proposition B.1. The transition probability density $f^{\mathbb{Q}}(X_T \mid X_t)$ for any time $t$ can be calculated explicitly in the following form:
\[ f^{\mathbb{Q}}(X_T \mid X_t) = \frac{\det(\Sigma_{t,T})^{-\beta/2} \det(X_T)^{(\beta-1)/2}}{2^{\beta}\, \Gamma_2\!\left(\frac{\beta}{2}\right)} \exp\left[-\frac{1}{2}\operatorname{tr}\left(\Sigma_{t,T}^{-1}\left(X_T + \xi_{t,T} X_t \xi_{t,T}^\top\right)\right)\right] \times {}_0F_1\!\left(\frac{\beta}{2}, \frac{\xi_{t,T} X_t \xi_{t,T}^\top X_T}{4}\right), \]
with the parameters
\[ \xi_{t,T} := \exp[(T-t)M], \tag{B.1} \]
\[ \Sigma_{t,T} := \beta Q^\top Q\, (2M)^{-1}\left(\exp[2(T-t)M] - I_2\right), \tag{B.2} \]
and where $\det(\cdot)$ stands for the determinant operator, ${}_pF_q(\cdot, \cdot)$, for $p, q \in \mathbb{N}_0$, denotes the hypergeometric function which can be expressed in terms of zonal polynomials (see Muirhead (2005)),[26] and
[25] Note that we consider the discounted Laplace transform in this paper, in contrast to Fang and Oosterlee (2008), and therefore do not have an additional discount factor $e^{-r(T-t)}$ in the expression of $\Upsilon(\mathrm{VIX}_t, K, t, T)$.
[26] These polynomials have no closed-form expressions, but can be computed recursively.
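  • 32. $\Gamma_k(\cdot)$, for $k \in \mathbb{N}$, is the multi-dimensional gamma function defined by
\[ \Gamma_k(x) := \int_{\Lambda > 0} \exp\left(\operatorname{tr}(-\Lambda)\right) \left(\det(\Lambda)\right)^{x - \frac{k+1}{2}} \mathrm{d}\Lambda. \]
Proof. Applying the results of (Muirhead, 2005, Chapter 10) and bearing the symmetry of the matrix $M$ in mind yields the proposition.
C The Reference Jump Model: A Synopsis
In this section we sketch our reference jump model used for comparison. It is a simplified version of Bardgett et al. (2014) with a constant central tendency. As usual, we denote the futures price of the SPX by $F = \{F_t : t \in [0, T]\}$ and its logarithmic price process by $S = \{S_t : t \in [0, T]\} = \{\ln(F_t) : t \in [0, T]\}$.
Assumption C.1. Let $\kappa, m, \sigma \in \mathbb{R}_+$, let $W^{\mathbb{Q}}$ and $B^{\mathbb{Q}}$ be standard Brownian motions in $\mathbb{R}$ under the risk-neutral measure $\mathbb{Q}$, let $v_{t-} := \lim_{s \uparrow t} v_s$ be the left limit of the process $v$ at time $t > s$, and suppose the Feller condition (cf. Feller (1951)) holds, i.e. $\sigma^2 < 2\kappa m$. Moreover, we assume $W^{\mathbb{Q}} := \rho_v B^{\mathbb{Q}} + \sqrt{1 - \rho_v^2}\, M^{\mathbb{Q}}$, where $M^{\mathbb{Q}}$ is a standard Brownian motion, independent of $B^{\mathbb{Q}}$, in $\mathbb{R}$ under the risk-neutral measure $\mathbb{Q}$, and $\rho_v \in [-1, 1]$ is a deterministic correlation coefficient. The SPX log-dynamics is determined by the SDEs:
\[ \mathrm{d}S_t = \left[-\lambda(v_{t-})\left(\theta_J^{\mathbb{Q}}(1, 0) - 1\right) - \frac{1}{2} v_{t-}\right]\mathrm{d}t + \sqrt{v_{t-}}\, \mathrm{d}W_t^{\mathbb{Q}} + Z_t^{S,\mathbb{Q}}\, \mathrm{d}N_t, \quad S_0 = s \in \mathbb{R}, \]
\[ \mathrm{d}v_t = \kappa\left(m - v_{t-}\right)\mathrm{d}t + \sigma\sqrt{v_{t-}}\, \mathrm{d}B_t^{\mathbb{Q}} + Z_t^{v,\mathbb{Q}}\, \mathrm{d}N_t, \quad v_0 = v^* \in \mathbb{R}_+, \]
where $N = \{N_t : t \in [0, T]\}$ is a standard Poisson process with the affine intensity $\lambda(v) = \lambda_0 + \lambda_1 v$, $\lambda_0, \lambda_1 \in \mathbb{R}_{\ge 0}$.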
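For readers who want to evaluate the normalizing constant of the density in Proposition B.1, the multi-dimensional gamma function defined above admits the well-known product representation $\Gamma_k(x) = \pi^{k(k-1)/4} \prod_{j=1}^{k} \Gamma\left(x - \frac{j-1}{2}\right)$, valid for $x > \frac{k-1}{2}$. The sketch below is only an illustration: the helper name `log_multigamma` is hypothetical, and the result is compared with SciPy's `multigammaln`, which implements the same function on the log scale.

```python
import math

from scipy.special import gammaln, multigammaln


def log_multigamma(x, k):
    """log Gamma_k(x) via the product formula; requires x > (k - 1) / 2."""
    log_pi_term = 0.25 * k * (k - 1) * math.log(math.pi)
    return log_pi_term + sum(gammaln(x - 0.5 * j) for j in range(k))


# Example: Gamma_2(beta / 2) for an assumed degrees-of-freedom value beta = 3.4.
beta_demo = 3.4
own_value = log_multigamma(beta_demo / 2.0, 2)
scipy_value = multigammaln(beta_demo / 2.0, 2)
```

Working on the log scale avoids overflow when the degrees-of-freedom parameter is large.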
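  • 33. The joint Laplace transform of the independent and identically distributed random jump sizes $Z := \left(Z^{S,\mathbb{Q}}, Z^{v,\mathbb{Q}}\right)$ is defined as follows:
\[ \theta_J^{\mathbb{Q}}(\vartheta, \gamma) := \mathbb{E}^{\mathbb{Q}}\left[\exp\left(\vartheta Z^{S,\mathbb{Q}} + \gamma Z^{v,\mathbb{Q}}\right)\right], \]
where $\vartheta, \gamma \in \mathbb{C}$. Finally, we assume that the jump sizes in the returns are normally distributed with mean $\mu_S \in \mathbb{R}$ and variance $\sigma_S^2 \in \mathbb{R}_{\ge 0}$, and the jump sizes in the volatility factor follow an exponential distribution with mean $\nu$. This model framework also implicitly defines the dynamics for the VIX. The result follows by the affine property and using the result of (Egloff et al., 2010, Proposition 2) for the integrated variance.
Proposition C.1. The VIX squared at time $t$ is given by
\[ \mathrm{VIX}_t^2 := \frac{1}{\tau_{\mathrm{VIX}}} \mathbb{E}^{\mathbb{Q}}\left[\int_t^{t+\tau_{\mathrm{VIX}}} v_{s-}\, \mathrm{d}s + 2\int_t^{t+\tau_{\mathrm{VIX}}} \left(e^{Z_s^{S,\mathbb{Q}}} - 1 - Z_s^{S,\mathbb{Q}}\right)\mathrm{d}N_s \,\Big|\, \mathcal{F}_t\right], \quad \tau_{\mathrm{VIX}} \equiv \frac{30}{365}, \]
where the expectation can be carried out explicitly in the affine form $\mathrm{VIX}_t^2 = \alpha_{\mathrm{VIX}^2} + \beta_{\mathrm{VIX}^2}\, v_{t-}$, with
\[ \alpha_{\mathrm{VIX}^2} := 2\lambda_0 \varrho_0 + \left(1 + 2\lambda_1 \varrho_0\right)\frac{\kappa m + \nu\lambda_0}{\nu\lambda_1 - \kappa}\left(\varrho_1 - 1\right), \quad \beta_{\mathrm{VIX}^2} := \left(1 + 2\lambda_1 \varrho_0\right)\varrho_1, \]
\[ \varrho_0 \equiv e^{\mu_S + \frac{1}{2}\sigma_S^2} - \mu_S - 1, \quad \varrho_1 \equiv \frac{e^{(\nu\lambda_1 - \kappa)\tau_{\mathrm{VIX}}} - 1}{(\nu\lambda_1 - \kappa)\,\tau_{\mathrm{VIX}}}. \]
Let us remark that following the quadratic variation convention for variance swaps would yield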
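  • 34. the following expectation:
\[ \mathrm{VIX}_t^2 = \frac{1}{\tau_{\mathrm{VIX}}} \mathbb{E}^{\mathbb{Q}}\left[\int_t^{t+\tau_{\mathrm{VIX}}} v_{s-}\, \mathrm{d}s + \int_t^{t+\tau_{\mathrm{VIX}}} \left(Z_s^{S,\mathbb{Q}}\right)^2 \mathrm{d}N_s \,\Big|\, \mathcal{F}_t\right]. \tag{C.1} \]
However, the variance swap rate inherits an approximation error due to price jumps (cf. Carr and Wu (2009)). We adjust for the jump-induced error by replacing $\left(Z_s^{S,\mathbb{Q}}\right)^2$ with the correction term $2\left(e^{Z_s^{S,\mathbb{Q}}} - 1 - Z_s^{S,\mathbb{Q}}\right)$ in (C.1) and hence, we replicate the VIX exactly. Furthermore, the discounted Laplace transform of $\mathrm{VIX}_T^2$ at time $t$ under $\mathbb{Q}$ is exponentially affine:[27]
\[ \Psi_{\mathrm{VIX}^2}(\omega, v_{t-}, t, T) := \mathbb{E}^{\mathbb{Q}}\left[\exp\left(-\rho_0 \tau + \omega \mathrm{VIX}_T^2\right) \mid \mathcal{F}_t\right] = \exp\left(\phi_v(\tau) + \psi_v(\tau)\, v_{t-}\right), \]
where $\omega \in \mathbb{C}$, $\rho_0 \in \mathbb{R}_{\ge 0}$, and $\tau := T - t$. The functions $(\phi_v(\tau), \psi_v(\tau)) \in \mathbb{R}^2$ solve the system of generalized Riccati differential equations:
\[ \frac{\mathrm{d}\psi_v(\tau)}{\mathrm{d}\tau} = -\kappa\psi_v(\tau) + \frac{1}{2}\sigma^2\psi_v^2(\tau) + \lambda_1\left(\theta_J^{\mathbb{Q}}(0, \psi_v(\tau)) - 1\right), \]
\[ \frac{\mathrm{d}\phi_v(\tau)}{\mathrm{d}\tau} = -\rho_0 + \psi_v(\tau)\,\kappa m + \lambda_0\left(\theta_J^{\mathbb{Q}}(0, \psi_v(\tau)) - 1\right), \]
subject to the terminal conditions $\phi_v(0) = \omega\alpha_{\mathrm{VIX}^2}$ and $\psi_v(0) = \omega\beta_{\mathrm{VIX}^2}$. For the numerical resolution of the Riccati differential equations governing the coefficients $\phi_v(\tau)$ and $\psi_v(\tau)$, the standard fourth-order Runge-Kutta method is applied. In order to price VIX (call) options, the previously outlined COS method is used.
[27] For simplicity's sake we again set the risk-free rate constant.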
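The Runge-Kutta step mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the reported results: the function names and all numerical values are hypothetical, and the jump transform $\theta_J^{\mathbb{Q}}(0, \psi) = 1/(1 - \nu\psi)$ is the Laplace transform of the exponential volatility jump with mean $\nu$ (valid for $\operatorname{Re}(\nu\psi) < 1$).

```python
import numpy as np


def theta_j_vol(psi, nu):
    """E[exp(psi * Z_v)] for an exponential jump with mean nu (needs Re(nu * psi) < 1)."""
    return 1.0 / (1.0 - nu * psi)


def riccati_rhs(y, kappa, m, sigma, lam0, lam1, nu, rho0):
    """Right-hand side of the generalized Riccati system; y = (phi_v, psi_v)."""
    phi, psi = y
    jump = theta_j_vol(psi, nu) - 1.0
    dpsi = -kappa * psi + 0.5 * sigma**2 * psi**2 + lam1 * jump
    dphi = -rho0 + psi * kappa * m + lam0 * jump
    return np.array([dphi, dpsi], dtype=complex)


def solve_riccati_rk4(omega, alpha_vix2, beta_vix2, tau, params, n_steps=200):
    """Classical RK4 from tau = 0 with phi(0) = omega * alpha, psi(0) = omega * beta."""
    y = np.array([omega * alpha_vix2, omega * beta_vix2], dtype=complex)
    h = tau / n_steps
    for _ in range(n_steps):
        k1 = riccati_rhs(y, **params)
        k2 = riccati_rhs(y + 0.5 * h * k1, **params)
        k3 = riccati_rhs(y + 0.5 * h * k2, **params)
        k4 = riccati_rhs(y + h * k3, **params)
        y = y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return y[0], y[1]


# Sanity check without jumps: psi_v then solves a Bernoulli equation in closed form.
demo_params = dict(kappa=5.0, m=0.04, sigma=0.7, lam0=0.0, lam1=0.0, nu=0.2, rho0=0.01)
omega_demo, alpha_demo, beta_demo, tau_demo = -1.0, 0.0, 1.0, 0.5
phi_num, psi_num = solve_riccati_rk4(omega_demo, alpha_demo, beta_demo, tau_demo, demo_params)

psi0 = omega_demo * beta_demo
c = demo_params["sigma"] ** 2 / (2.0 * demo_params["kappa"])
decay = np.exp(-demo_params["kappa"] * tau_demo)
psi_exact = psi0 * decay / (1.0 - c * psi0 * (1.0 - decay))
```

With the jump intensities set to zero the system collapses to the square-root diffusion case, which gives a convenient closed-form benchmark before the jump terms are switched on.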
  • 35. D Tables
Descriptive Statistics
Sample Period: February 27, 2006 – August 30, 2013
| | Mean | Volatility | Skewness | Kurtosis |
|---|---|---|---|---|
| Log-Returns SPX | 0.0001 | 0.0144 | -0.2995 | 12.0503 |
| VIX Levels | 22.1871 | 10.6044 | 2.0254 | 8.2538 |
Table 1 – We provide a summary of the most crucial descriptive statistical measures for the log-returns of the SPX and the VIX levels. In particular, the sample mean, the sample volatility (measured by the empirical standard deviation), the sample skewness, and the sample kurtosis are reported. The sample period spans February 27, 2006 to August 30, 2013.
$ARE Results
| Date | MAD | SV-2F | SV-2F (Feller Condition) | SVJJ (Feller Condition) | # Call Options |
|---|---|---|---|---|---|
| Data Set I. | | | | | |
| 01/10/2008 | 0.0356 | 0.0413 | 0.0444 | 0.0572 | 57 |
| 08/10/2008 | 0.0686 | 0.0771 | 0.0974 | 0.0768 | 82 |
| 15/10/2008 | 0.1005 | 0.1121 | 0.1137 | 0.1119 | 75 |
| 22/10/2008 | 0.0516 | 0.0528 | 0.0568 | 0.0896 | 55 |
| 29/10/2008 | 0.0823 | 0.1009 | 0.1597 | 0.0538 | 64 |
| Data Set II. | | | | | |
| 07/04/2010 | 0.1327 | 0.1328 | 0.1556 | 0.0609 | 61 |
| 14/04/2010 | 0.0965 | 0.1388 | 0.1509 | 0.0603 | 67 |
| 21/04/2010 | 0.0706 | 0.1264 | 0.2135 | 0.1458 | 63 |
| 28/04/2010 | 0.1506 | 0.2511 | 0.5878 | 0.0883 | 82 |
| Data Set III. | | | | | |
| 03/08/2011 | 0.0775 | 0.0978 | 0.1146 | 0.0680 | 102 |
| 10/08/2011 | 0.1284 | 0.1957 | 0.2001 | 0.1226 | 133 |
| 17/08/2011 | 0.0477 | 0.1070 | 0.1335 | 0.0822 | 84 |
| 24/08/2011 | 0.1057 | 0.1078 | 0.1215 | 0.0759 | 86 |
| 31/08/2011 | 0.0499 | 0.0518 | 0.1211 | 0.0761 | 82 |
Table 2 – The $ARE calibration results for the three data sets which encompass different states of the economy. MAD denotes the matrix affine diffusion model outlined in Section 3, SV-2F designates the two-factor pure diffusion model of Christoffersen et al. (2009), and SVJJ abbreviates the sketched model in Appendix C. To mitigate the ill-posedness problem and to find the global minimum, we use a DE algorithm in the first step and thereafter the obtained parameter values are used as starting values for a local optimizer. On each date, the data consist of liquid VIX call options where # Call Options denotes the total number of call options to fit.
  • 36. Parameter Estimates on August 17, 2011
| | β | M11 | M12 | M22 | Q11 | Q12 | Q22 | X11,t | X12,t | X22,t |
|---|---|---|---|---|---|---|---|---|---|---|
| MAD | 1.0010 | -0.9955 | 0.1163 | -0.0136 | 0.8994 | -0.1846 | -0.2335 | 0.0415 | 0.0268 | 0.0181 |
| SV-2F | 1.1182 | -0.7369 | | -20.4483 | -0.7910 | | -1.2732 | 0.0366 | | 0.0001 |

| | κ | m | σ | λ0 | λ1 | µS | σS | ν | vt− |
|---|---|---|---|---|---|---|---|---|---|
| SVJJ | 10.1968 | 0.0303 | 0.6939 | 1.5449 | 14.8902 | 0.0724 | 0.1072 | 0.1875 | 0.0650 |
| SV-1F | 5.9827 | 0.0787 | 0.9704 | | | | | | 0.0854 |
Table 3 – The calibrated parameter estimates on August 17, 2011. MAD denotes the matrix affine diffusion model outlined in Section 3, SV-2F designates the two-factor pure diffusion model of Christoffersen et al. (2009), SVJJ abbreviates the sketched model in Appendix C, and SV-1F stands for the original Heston (1993) model. To mitigate the ill-posedness problem and to find the global minimum, we use a DE algorithm in the first step and thereafter the obtained parameter values are used as starting values for a local optimizer. The data consist of 84 liquid VIX call options.
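The two-step procedure described in the table notes (a differential evolution search followed by a local optimizer started from the DE solution) can be sketched as below. This is an illustration only: the objective `toy_objective`, its hypothetical minimizer `TARGET`, the bounds, and the choice of L-BFGS-B as the local optimizer are assumptions made for the example; the actual calibration would minimize the option pricing error instead.

```python
import numpy as np
from scipy.optimize import differential_evolution, minimize

TARGET = np.array([1.5, -0.5])  # hypothetical "true" parameter vector


def toy_objective(theta):
    """Multimodal stand-in for a pricing-error function, global minimum 0 at TARGET."""
    d = np.asarray(theta, dtype=float) - TARGET
    return float(np.sum(d**2 + 2.0 * (1.0 - np.cos(5.0 * d))))


def two_step_calibration(objective, bounds, seed=0):
    """Step 1: global DE search. Step 2: local refinement from the DE solution."""
    de_result = differential_evolution(
        objective, bounds, seed=seed, popsize=30, maxiter=500, tol=1e-8, polish=False
    )
    local_result = minimize(objective, de_result.x, method="L-BFGS-B", bounds=bounds)
    return de_result, local_result


demo_bounds = [(-5.0, 5.0), (-5.0, 5.0)]
de_res, local_res = two_step_calibration(toy_objective, demo_bounds)
```

The global step is meant to locate the right basin of attraction, while the gradient-based step only has to polish the estimate, which is typically much cheaper than running the evolutionary search to full convergence.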
  • 37. E Figures
[Figure 1 plot: "Joint SPX and CBOE VIX Evolution"; x-axis: Date (February 2006 to August 2013); left y-axis: VIX Level; right y-axis: SPX Level.]
Figure 1 – The evolution of the SPX (green solid line) and the CBOE VIX (blue solid line) during the sample period of February 27, 2006 to August 30, 2013. The y-axes correspond to the index levels.
[Figure 2 plot: "Rolling Correlations between the SPX Log-Return and the CBOE VIX Level Increments"; x-axis: Date; y-axis: Correlation Coefficient; legend: N = 25, N = 50, N = 100, N = 250.]
Figure 2 – The rolling correlations between the SPX log-return and the CBOE VIX level increments during the sample period of February 27, 2006 to August 30, 2013 given the four different window sizes, measured in days, $N \in \{25, 50, 100, 250\}$. The y-axis corresponds to the correlation coefficient.
  • 38. [Figure 3 plots: "SPX: 29/09/2011" with maturities τ = 23, 51, 79, 114, 170, 261 and "VIX: 29/09/2011" with maturities τ = 20, 48, 83, 111, 139, 174; x-axis: m; y-axis: Implied Volatility.]
Figure 3 – The SPX and VIX implied volatility skews on September 29, 2011 as a function of the log-moneyness $m := \ln\left(K / F_t^i(T)\right)$, where $K$ is the strike level of the European-style option, $F_t^i(T)$ denotes the closing SPX (resp. VIX) futures price if $i = S$ (resp. $i = V$) today at time $t$ with maturity $T$, and $\tau := T - t$ is the option's time-to-maturity in daily units.
[Figure 4 plots: "Model-Implied Conditional Skewness" and "Model-Implied Conditional Kurtosis"; x-axis: t; y-axes: Conditional Skewness, Conditional Kurtosis; legend: $X_{12,t} \in \{-0.0268, -0.0161, -0.0054, 0.0054, 0.0161, 0.0268\}$ and SV-2F.]
Figure 4 – The model-implied conditional skewness and kurtosis, respectively, of $V := \operatorname{tr}(X)$ for a varying off-diagonal latent state $X_{12}$. We use a time horizon of two years, i.e. $t \in (0, 2]$. SV-2F abbreviates a parameterized version of the Christoffersen et al. (2009) model and is described in Section 4. The initial parameter set is given in (7).
  • 39. [Figure 5 plots: "Simulated Option Prices" (x-axis: K; y-axis: Option Price) and "Histogram of the Simulated VIX Dynamics" (x-axis: $\mathrm{VIX}_T$; y-axis: Density); legend: MAD, SV-2F.]
Figure 5 – On the left we plot the simulated VIX option prices using 30,000 Monte Carlo simulations and a risk-free interest rate of 1%, and on the right the corresponding histograms of the simulated VIX dynamics are depicted. MAD denotes the matrix affine diffusion model outlined in Section 3.2 and SV-2F abbreviates a parameterized version of the Christoffersen et al. (2009) model and is described in Section 4. The employed parameter set is given in (7).
[Figure 6 plot: "α-Implied VIX Values"; x-axis: α; y-axis: VIX.]
Figure 6 – We fix the volatility level $V$ and the volatility composition $V_1 / V$, and plot the admissible domain for the VIX values by varying $\alpha \in [0, \pi]$. The employed parameter set is given in (7). The minimum is achieved at 0.4055 whereas the maximum amounts to 0.4652.
  • 40. [Figure 7 plots: four panels for option maturities τ = 35, 63, 91, and 126; x-axis: K; y-axis: Option Price; legend: Market, MAD, SV-2F, SVJJ, SV-1F.]
Figure 7 – We plot the model-implied option prices in conjunction with the corresponding market values on August 17, 2011 for four different times to maturity $\tau := T - t$, measured in daily units, as a function of the strike price $K$. The closing ATM VIX futures prices are 27.65, 26.58, 25.86, and 24.91 for 35, 63, 91, and 126 days, respectively. MAD denotes the matrix affine diffusion model outlined in Section 3, SV-2F designates the two-factor pure diffusion model of Christoffersen et al. (2009), SVJJ abbreviates the sketched model in Appendix C, and SV-1F stands for the original Heston (1993) model. The employed parameter values are given in Table 3.
  • 41. [Figure 8 plots: four panels for option maturities τ = 35, 63, 91, and 126; x-axis: m; y-axis: Implied Volatility; legend: Market, MAD, SV-2F, SVJJ, SV-1F.]
Figure 8 – We plot the model-implied implied volatilities in conjunction with the corresponding market values on August 17, 2011 for four different times to maturity $\tau := T - t$, measured in daily units, as a function of the log-moneyness $m := \ln\left(K / F_t^V(T)\right)$, where $K$ is the strike level of the VIX call option and $F_t^V(T)$ denotes the closing VIX futures price today at time $t$ with maturity $T$. The closing ATM VIX futures prices are 27.65, 26.58, 25.86, and 24.91 for 35, 63, 91, and 126 days, respectively. MAD denotes the matrix affine diffusion model outlined in Section 3, SV-2F designates the two-factor pure diffusion model of Christoffersen et al. (2009), SVJJ abbreviates the sketched model in Appendix C, and SV-1F stands for the original Heston (1993) model. The employed parameter values are given in Table 3.
  • 42. References
M. Abramowitz and I. A. Stegun. Handbook of mathematical functions: With formulas, graphs, and mathematical tables, volume 55 of Applied Mathematics Series. Dover Publications, New York, United States, 10th edition, 1972.
C. Albanese, H. Lo, and A. Mijatović. Spectral methods for volatility derivatives. Quantitative Finance, 9(6):663–692, 2009.
T. G. Andersen, T. Bollerslev, F. X. Diebold, and H. Ebens. The distribution of realized stock return volatility. Journal of Financial Economics, 61(1):43–76, 2001a.
T. G. Andersen, T. Bollerslev, F. X. Diebold, and P. Labys. The distribution of realized exchange rate volatility. Journal of the American Statistical Association, 96(453):42–55, 2001b.
J. Baldeaux and A. Badran. Consistent modelling of VIX and equity derivatives using a 3/2 plus jumps model. Applied Mathematical Finance, 21(4):299–312, 2014.
C. Bardgett, E. Gourier, and M. Leippold. Inferring volatility dynamics and risk premia from the S&P 500 and VIX markets. Swiss Finance Institute, Research Paper Series No. 13-40, University of Zurich, 2014.
G. Bekaert and G. Wu. Asymmetric volatilities and risk in equity markets. Review of Financial Studies, 13(1):1–42, 2000.
A. Benabid, H. Bensusan, and N. El Karoui. Wishart stochastic volatility: Asymptotic smile and numerical framework. Working Paper, École Polytechnique, Paris-Saclay University, 2010.
L. Bergomi. Smile dynamics III. Risk, October:90–96, 2008.
F. Black. Studies of stock price volatility changes. Proceedings of the 1976 American Statistical Association, Business and Economical Statistics Section (American Statistical Association, Alexandria, Virginia, United States), 177–181, 1976.
N. Branger, A. Kraftschik, and C. Völkert. The fine structure of variance: Consistent pricing of VIX derivatives. Working Paper, University of Münster, 2014.
  • 43. M.-F. Bru. Wishart processes. Journal of Theoretical Probability, 4(4):725–751, 1991.
A. Buraschi, A. Cieslak, and F. Trojani. Correlation risk and the term structure of interest rates. Working Paper, Imperial College London, University of St. Gallen, 2008.
A. Buraschi, R. Kosowski, and F. Trojani. When there is no place to hide: Correlation risk and the cross-section of hedge fund returns. Review of Financial Studies, 27(2):581–616, 2014.
J. Y. Campbell and L. Hentschel. No news is good news: An asymmetric model of changing volatility in stock returns. Journal of Financial Economics, 31(3):281–318, 1992.
J. Y. Campbell and A. S. Kyle. Smart money, noise trading and stock price behaviour. Review of Economic Studies, 60(1):1–34, 1993.
P. Carr and D. B. Madan. Option valuation using the fast Fourier transform. Journal of Computational Finance, 2(4):61–73, 1999.
P. Carr and L. Wu. Variance risk premiums. Review of Financial Studies, 22(3):1311–1341, 2009.
U. K. Chakraborty, editor. Advances in differential evolution, volume 143 of Studies in Computational Intelligence. Springer-Verlag, Berlin Heidelberg, Germany, 2008.
P. Christoffersen, S. L. Heston, and K. Jacobs. The shape and term structure of the index option smirk: Why multifactor stochastic volatility models work so well. Management Science, 55(12):1914–1932, 2009.
P. Cizeau, Y. Liu, M. Meyer, C.-K. Peng, and H. E. Stanley. Volatility distribution in the S&P500 stock index. Physica A, 245(3–4):441–445, 1997.
R. Cont and T. Kokholm. A consistent pricing model for index options and volatility derivatives. Mathematical Finance, 23(2):248–274, 2011.
R. Cont and P. Tankov. Non-parametric calibration of jump-diffusion option pricing models. Journal of Computational Finance, 7(3):1–50, 2004.
J. C. Cox, J. E. Ingersoll, and S. A. Ross. A theory of the term structure of interest rates. Econometrica, 53(2):385–407, 1985.
  • 44. C. Cuchiero, D. Filipović, E. Mayerhofer, and J. Teichmann. Affine processes on positive semidefinite matrices. The Annals of Applied Probability, 21(2):397–463, 2011.
J. Da Fonseca, M. Grasselli, and C. Tebaldi. Option pricing when correlations are stochastic: An analytical framework. Review of Derivatives Research, 10(2):151–180, 2007.
J. Da Fonseca, M. Grasselli, and C. Tebaldi. A multifactor volatility Heston model. Quantitative Finance, 8(6):591–604, 2008.
J. Da Fonseca, M. Grasselli, and F. Ielpo. Estimating the Wishart affine stochastic correlation model using the empirical characteristic function. Studies in Nonlinear Dynamics and Econometrics, 18(3):253–289, 2014.
G. Drimus and W. Farkas. Local volatility of volatility for the VIX market. Review of Derivatives Research, 16(3):267–293, 2013.
D. Duffie, J. Pan, and K. Singleton. Transform analysis and asset pricing for affine jump-diffusions. Econometrica, 68(6):1343–1376, 2000.
D. Egloff, M. Leippold, and L. Wu. The term structure of variance swap rates and optimal variance swap investments. Journal of Financial and Quantitative Analysis, 45(5):1279–1310, 2010.
F. Fang and C. W. Oosterlee. A novel pricing method for European options based on Fourier-cosine series expansions. SIAM Journal on Scientific Computing, 31(2):826–848, 2008.
W. Feller. Two singular diffusion problems. Annals of Mathematics, 54(1):173–182, 1951.
G. Freiling. A survey of nonsymmetric Riccati equations. Linear Algebra and its Applications, 351–352:243–270, 2002.
J. Gatheral. Consistent modeling of SPX and VIX options. In The Fifth World Congress of the Bachelier Finance Society, London, United Kingdom, July 2008.
J. Gatheral. Joint modeling of SPX and VIX. In National School of Development, Peking University, Peking, China, October 2013.
  • 45. J. Goard and M. Mazur. Stochastic volatility models and the pricing of VIX options. Mathematical Finance, 23(3):439–458, 2013.
C. Gouriéroux and R. Sufana. Derivative pricing with Wishart multivariate stochastic volatility. Journal of Business & Economic Statistics, 28(3):438–451, 2010.
C. Gouriéroux, J. Jasiak, and R. Sufana. The Wishart Autoregressive process of multivariate stochastic volatility. Journal of Econometrics, 150(2):167–181, 2009.
P. Gruber, C. Tebaldi, and F. Trojani. Three make a smile – Dynamic volatility, skewness and term structure components in option valuation. Working Paper, Bocconi University, University of Lugano, 2010.
R. A. Haugen, E. Talmor, and W. N. Torous. The effect of volatility changes on the level of stock prices and subsequent expected returns. Journal of Finance, 46(3):985–1007, 1991.
S. L. Heston. A closed-form solution for options with stochastic volatility with applications to bond and currency options. Review of Financial Studies, 6(2):327–343, 1993.
J. Kallsen, J. Muhle-Karbe, and M. Voss. Pricing options on variance in affine stochastic volatility models. Mathematical Finance, 21(4):627–641, 2011.
M. Leippold and F. Trojani. Asset pricing with matrix jump diffusions. Working Paper, University of Zurich, University of Lugano, 2010.
G.-H. Lian and S.-P. Zhu. Pricing VIX options with stochastic volatility and random jumps. Decisions in Economics and Finance, 36(1):71–88, 2013.
J. Mencía and E. Sentana. Valuation of VIX derivatives. Journal of Financial Economics, 108(2):367–391, 2013.
R. J. Muirhead. Aspects of Multivariate Statistical Theory, volume 197 of Wiley Series in Probability and Statistics. John Wiley & Sons, Hoboken, New Jersey, United States, 2nd edition, 2005.
A. Papanicolaou and R. Sircar. A regime-switching Heston model for VIX and S&P 500 implied volatilities. Quantitative Finance, 14(10):1811–1827, 2014.
  • 46. K. V. Price, R. M. Storn, and J. A. Lampinen. Differential evolution - A practical approach to global optimization. Natural Computing Series. Springer-Verlag, Berlin Heidelberg, Germany, 2005.
A. Sepp. VIX option pricing in a jump-diffusion model. Risk Magazine, 84–89, 2008.
R. M. Storn and K. Price. Differential evolution - A simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization, 11(4):341–359, 1997.
J. E. Zhang and Y. Zhu. VIX futures. Journal of Futures Markets, 26(6):521–531, 2006.
S.-P. Zhu and G.-H. Lian. An analytical formula for VIX futures and its applications. Journal of Futures Markets, 32(2):166–190, 2012.
  • 47. Pricing VIX Options with Multifactor Stochastic Volatility – Supplementary Appendix∗
Pascal M. Caversaccio†
First Draft: May 16, 2014
This Version: June 13, 2016
Abstract
This supplementary appendix for Pricing VIX Options with Multifactor Stochastic Volatility contains (1) the mathematical proofs of Corollary 3.1 and Corollary 3.2, (2) some useful mathematical background results which we implicitly employed in the derivation of Assumption 3.1, Corollary 3.1, and Corollary 3.2 in the main paper, (3) our data cleaning criteria, and (4) further empirical stylized facts of the Standard & Poor's 500 index and Chicago Board Options Exchange volatility index (option) markets.
∗ The author thanks Chris Bardgett, Elise Gourier, Meriton Ibraimi, Markus Leippold, Stefan Pomberger, Lujing Su, and Nikola Vasiljević for helpful comments, and Sergio Maffioletti for providing guidance and access to the cloud infrastructure of the University of Zurich. Any remaining errors are mine.
† University of Zurich – Department of Banking and Finance, Plattenstrasse 14, 8032 Zurich, Switzerland. E-Mail: pascalmarco.caversaccio@uzh.ch.
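  • 49. A Proofs
This section contains the mathematical proofs of Corollary 3.1 and Corollary 3.2.
Proof of Corollary 3.1. According to Radon's lemma (see, e.g., Freiling (2002) and Lemma B.2) we can represent the function $\psi_X(\tau)$ as
\[ \psi_X(\tau) = J(\tau)^{-1} K(\tau), \tag{A.1} \]
where $K(\tau)$ and $J(\tau)$ are square matrices in $\mathbb{R}^{2\times 2}$ with $J(\tau)$ invertible.[1] Moreover, we define $\partial_\tau \psi_X(\tau) := \frac{\mathrm{d}\psi_X(\tau)}{\mathrm{d}\tau}$, $\partial_\tau J(\tau) := \frac{\mathrm{d}J(\tau)}{\mathrm{d}\tau}$, and $\partial_\tau K(\tau) := \frac{\mathrm{d}K(\tau)}{\mathrm{d}\tau}$. Multiplying equation (10) of the main paper by $J(\tau)$ yields
\[ J(\tau)\,\partial_\tau \psi_X(\tau) = J(\tau) M^\top \psi_X(\tau) + J(\tau)\psi_X(\tau) M + 2J(\tau)\psi_X(\tau) Q^\top Q\, \psi_X(\tau). \tag{A.2} \]
Now, we differentiate
\[ J(\tau)\psi_X(\tau) = K(\tau) \tag{A.3} \]
in light of (A.1), and obtain
\[ J(\tau)\,\partial_\tau \psi_X(\tau) = \partial_\tau\left(J(\tau)\psi_X(\tau)\right) - \partial_\tau J(\tau)\,\psi_X(\tau) \tag{A.4} \]
and
\[ \partial_\tau\left(J(\tau)\psi_X(\tau)\right) = \partial_\tau K(\tau). \tag{A.5} \]
Plugging (A.3), (A.4), and (A.5) into (A.2), we get
\[ \partial_\tau K(\tau) - \partial_\tau J(\tau)\,\psi_X(\tau) = J(\tau) M^\top \psi_X(\tau) + K(\tau) M + 2K(\tau) Q^\top Q\, \psi_X(\tau). \]
[1] Mathematically speaking, it is due to the fact that the matrix Riccati equations belong to a quotient manifold.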
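  • 50. By collecting the coefficients of $\psi_X(\tau)$, the following matrix ordinary differential equations (ODEs) are induced:
\[ \partial_\tau K(\tau) = K(\tau) M, \qquad \partial_\tau J(\tau) = -2K(\tau) Q^\top Q - J(\tau) M^\top, \]
or
\[ \frac{\mathrm{d}}{\mathrm{d}\tau}\begin{pmatrix} K(\tau) & J(\tau) \end{pmatrix} = \begin{pmatrix} K(\tau) & J(\tau) \end{pmatrix} \begin{pmatrix} M & -2Q^\top Q \\ 0 & -M^\top \end{pmatrix}. \tag{A.6} \]
We can solve (A.6) by exponentiation:
\[ \begin{pmatrix} K(\tau) & J(\tau) \end{pmatrix} = \begin{pmatrix} K(0) & J(0) \end{pmatrix} \exp\left[\tau \begin{pmatrix} M & -2Q^\top Q \\ 0 & -M^\top \end{pmatrix}\right] = \begin{pmatrix} \psi_X(0) & I_2 \end{pmatrix} \exp\left[\tau \begin{pmatrix} M & -2Q^\top Q \\ 0 & -M^\top \end{pmatrix}\right] \]
\[ = \begin{pmatrix} \psi_X(0) C_{11}(\tau) + C_{21}(\tau) & \psi_X(0) C_{12}(\tau) + C_{22}(\tau) \end{pmatrix} = \begin{pmatrix} \omega\beta_{\mathrm{VIX}^2} C_{11}(\tau) + C_{21}(\tau) & \omega\beta_{\mathrm{VIX}^2} C_{12}(\tau) + C_{22}(\tau) \end{pmatrix}. \]
Finally from (A.1), we can conclude that the solution is given by
\[ \psi_X(\tau) = \left(\omega\beta_{\mathrm{VIX}^2} C_{12}(\tau) + C_{22}(\tau)\right)^{-1} \left(\omega\beta_{\mathrm{VIX}^2} C_{11}(\tau) + C_{21}(\tau)\right). \]
A moment's reflection reveals that $\phi_X(\tau)$ is obtained by simple integration:[2]
\[ \phi_X(\tau) = -\rho_0 \int_0^\tau \mathrm{d}s + \beta \operatorname{tr}\left(Q^\top Q \int_0^\tau \psi_X(s)\, \mathrm{d}s\right) = -\rho_0\tau - \frac{\beta}{2}\operatorname{tr}\left[\ln\left(C_{22}(\tau)\right) + \tau M^\top\right] + \omega\alpha_{\mathrm{VIX}^2}, \]
[2] See, e.g., Da Fonseca et al. (2007).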
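The linearization above lends itself to a direct numerical check. The sketch below is only an illustration: the function names are hypothetical, the parameter values are assumed, and a generic initial matrix `psi0` stands in for $\psi_X(0) = \omega\beta_{\mathrm{VIX}^2}$. It computes $\psi_X(\tau)$ from the blocks of the matrix exponential in (A.6) and, for comparison, integrates the matrix Riccati equation $\partial_\tau \psi_X = M^\top \psi_X + \psi_X M + 2\psi_X Q^\top Q\, \psi_X$ with a generic ODE solver.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm


def psi_linearized(psi0, M, Q, tau):
    """psi(tau) = J(tau)^{-1} K(tau) with (K J) = (psi0 I) exp(tau * A)."""
    n = M.shape[0]
    A = np.block([[M, -2.0 * Q.T @ Q],
                  [np.zeros((n, n)), -M.T]])
    C = expm(tau * A)
    C11, C12 = C[:n, :n], C[:n, n:]
    C21, C22 = C[n:, :n], C[n:, n:]
    K = psi0 @ C11 + C21
    J = psi0 @ C12 + C22
    return np.linalg.solve(J, K)  # J^{-1} K without forming the inverse


def psi_by_integration(psi0, M, Q, tau):
    """Reference solution: integrate the matrix Riccati ODE directly."""
    n = M.shape[0]
    QtQ = Q.T @ Q

    def rhs(_, y):
        psi = y.reshape(n, n)
        return (M.T @ psi + psi @ M + 2.0 * psi @ QtQ @ psi).ravel()

    sol = solve_ivp(rhs, (0.0, tau), psi0.ravel(), rtol=1e-10, atol=1e-12)
    return sol.y[:, -1].reshape(n, n)


# Illustrative (assumed) inputs.
M_demo = np.array([[-1.0, 0.1], [0.1, -0.5]])
Q_demo = np.array([[0.3, -0.1], [0.0, 0.2]])
psi0_demo = -0.5 * np.array([[1.0, 0.2], [0.2, 0.8]])
tau_demo = 0.5

psi_closed = psi_linearized(psi0_demo, M_demo, Q_demo, tau_demo)
psi_ode = psi_by_integration(psi0_demo, M_demo, Q_demo, tau_demo)
```

The linearized route costs one 4×4 matrix exponential and one 2×2 linear solve per evaluation, which is one reason the matrix affine model avoids the step-by-step ODE integration required by the jump model.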
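  • 51. where $\ln(\cdot)$ denotes the matrix logarithm. This step concludes the proof.
Proof of Corollary 3.2. We follow the same rationale as in the proof of Corollary 3.1 and repeat it here for the reader's convenience. The linearization of the flow of the differential equation is obtained by Radon's lemma. We can express the function $\psi_S(\tau)$ as
\[ \psi_S(\tau) = J(\tau)^{-1} K(\tau), \tag{A.7} \]
where $K(\tau)$ and $J(\tau)$ are square matrices in $\mathbb{R}^{2\times 2}$ with $J(\tau)$ invertible. Moreover, we define $\partial_\tau \psi_S(\tau) := \frac{\mathrm{d}\psi_S(\tau)}{\mathrm{d}\tau}$, $\partial_\tau J(\tau) := \frac{\mathrm{d}J(\tau)}{\mathrm{d}\tau}$, and $\partial_\tau K(\tau) := \frac{\mathrm{d}K(\tau)}{\mathrm{d}\tau}$. Multiplying equation (12) of the main paper by $J(\tau)$ yields
\[ J(\tau)\,\partial_\tau \psi_S(\tau) = \frac{\vartheta(\vartheta - 1)}{2} J(\tau) I_2 + J(\tau)\psi_S(\tau)\left(M + \vartheta Q^\top P^\top\right) + J(\tau)\left(M^\top + \vartheta PQ\right)\psi_S(\tau) + 2J(\tau)\psi_S(\tau) Q^\top Q\, \psi_S(\tau). \tag{A.8} \]
Now, we differentiate
\[ J(\tau)\psi_S(\tau) = K(\tau) \tag{A.9} \]
in light of (A.7), and obtain
\[ J(\tau)\,\partial_\tau \psi_S(\tau) = \partial_\tau\left(J(\tau)\psi_S(\tau)\right) - \partial_\tau J(\tau)\,\psi_S(\tau) \tag{A.10} \]
and
\[ \partial_\tau\left(J(\tau)\psi_S(\tau)\right) = \partial_\tau K(\tau). \tag{A.11} \]
Plugging (A.9), (A.10), and (A.11) into (A.8), we get
\[ \partial_\tau K(\tau) - \partial_\tau J(\tau)\,\psi_S(\tau) = \frac{\vartheta(\vartheta - 1)}{2} J(\tau) I_2 + K(\tau)\left(M + \vartheta Q^\top P^\top\right) + J(\tau)\left(M^\top + \vartheta PQ\right)\psi_S(\tau) + 2K(\tau) Q^\top Q\, \psi_S(\tau). \]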
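  • 52. By collecting the coefficients of $\psi_S(\tau)$, the following matrix ODEs are induced:
\[ \partial_\tau K(\tau) = \frac{\vartheta(\vartheta - 1)}{2} J(\tau) I_2 + K(\tau)\left(M + \vartheta Q^\top P^\top\right), \qquad \partial_\tau J(\tau) = -2K(\tau) Q^\top Q - J(\tau)\left(M^\top + \vartheta PQ\right), \]
or
\[ \frac{\mathrm{d}}{\mathrm{d}\tau}\begin{pmatrix} K(\tau) & J(\tau) \end{pmatrix} = \begin{pmatrix} K(\tau) & J(\tau) \end{pmatrix} \begin{pmatrix} M + \vartheta Q^\top P^\top & -2Q^\top Q \\ \frac{\vartheta(\vartheta - 1)}{2} I_2 & -\left(M^\top + \vartheta PQ\right) \end{pmatrix}. \tag{A.12} \]
We can solve (A.12) by exponentiation:
\[ \begin{pmatrix} K(\tau) & J(\tau) \end{pmatrix} = \begin{pmatrix} K(0) & J(0) \end{pmatrix} \exp\left[\tau \begin{pmatrix} M + \vartheta Q^\top P^\top & -2Q^\top Q \\ \frac{\vartheta(\vartheta - 1)}{2} I_2 & -\left(M^\top + \vartheta PQ\right) \end{pmatrix}\right] = \begin{pmatrix} \psi_S(0) & I_2 \end{pmatrix} \exp\left[\tau \begin{pmatrix} M + \vartheta Q^\top P^\top & -2Q^\top Q \\ \frac{\vartheta(\vartheta - 1)}{2} I_2 & -\left(M^\top + \vartheta PQ\right) \end{pmatrix}\right] \]
\[ = \begin{pmatrix} \psi_S(0) C_{11}(\tau) + C_{21}(\tau) & \psi_S(0) C_{12}(\tau) + C_{22}(\tau) \end{pmatrix} = \begin{pmatrix} C_{21}(\tau) & C_{22}(\tau) \end{pmatrix}. \]
Finally from (A.7), we can conclude that the solution is given by
\[ \psi_S(\tau) = C_{22}(\tau)^{-1} C_{21}(\tau). \]
A moment's reflection reveals that $\phi_S(\tau)$ is obtained by simple integration:
\[ \phi_S(\tau) = -\rho_0 \int_0^\tau \mathrm{d}s + \beta \operatorname{tr}\left(Q^\top Q \int_0^\tau \psi_S(s)\, \mathrm{d}s\right) = -\rho_0\tau - \frac{\beta}{2}\operatorname{tr}\left[\ln\left(C_{22}(\tau)\right) + \tau M^\top + \tau\vartheta PQ\right]. \]
This step concludes the proof.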
  • 53. B Mathematical Background Results
We collect in this section some useful mathematical background results which we implicitly employed in the derivation of Assumption 3.1, Corollary 3.1, and Corollary 3.2 in the main paper. The primary intention is to provide the mathematically sophisticated reader with the techniques applied in the main paper in order to better grasp the results.
Lemma B.1. Let $R = \{R(X_{t-}) : t \in [0, T]\}$ be the short rate process and let $X = \{X_t : t \in [0, T]\}$ denote the underlying price process. Further, assume that $R \in C^0(\mathbb{R}_+; \mathbb{R}_{\ge 0})$. Then, the futures price $F(t, T) = \frac{X_t}{B(t, T)}$, where $B(t, T) := \mathbb{E}^{\mathbb{Q}}\left[e^{-\int_t^T R(X_{s-})\, \mathrm{d}s} \mid \mathcal{F}_t\right]$ is the price of a zero-coupon bond, is a local martingale under $\mathbb{Q}$.
Proof. The result follows directly from the first fundamental theorem of asset pricing, i.e. the discounted price process of every tradable asset is a $\mathbb{Q}$-martingale (cf. Delbaen and Schachermayer (1994)). This theorem yields
\[ \mathbb{E}^{\mathbb{Q}}[F(T, T) \mid \mathcal{F}_t] = \mathbb{E}^{\mathbb{Q}}[X_T \mid \mathcal{F}_t] = X_t\, \mathbb{E}^{\mathbb{Q}}\left[e^{\int_t^T R(X_{s-})\, \mathrm{d}s} \mid \mathcal{F}_t\right] = \frac{X_t}{B(t, T)} =: F(t, T), \]
which proves the martingale property.
Lemma B.2 (Radon's lemma). Assume the following Riccati differential equation
\[ \dot{W}(t) = M_{21}(t) + M_{22}(t) W(t) - W(t) M_{11}(t) - W(t) M_{12}(t) W(t), \tag{B.1} \]
where $W(t) \in \mathbb{R}^{m\times n}$, $M_{11}(t) \in \mathbb{R}^{n\times n}$, $M_{12}(t) \in \mathbb{R}^{n\times m}$, $M_{21}(t) \in \mathbb{R}^{m\times n}$, and $M_{22}(t) \in \mathbb{R}^{m\times m}$ with $n, m \in \mathbb{N}$ for any $t \in J \subset \mathbb{R}$, then the following holds:
i) Let $W(t)$ be a solution of (B.1) on the interval $J$ with $W(t_0) = W_0$. If $Q(t) \in \mathbb{R}^{n\times n}$ is for any $t \in J$ the unique solution of the initial value problem
\[ \dot{Q}(t) = \left(M_{11}(t) + M_{12}(t) W(t)\right) Q(t), \quad Q(t_0) = I_n, \]
  • 54. and $P(t) := W(t) Q(t)$, then $\begin{pmatrix} Q(t) \\ P(t) \end{pmatrix}$ is a solution of the associated linear system of differential equations
\[ \frac{\mathrm{d}}{\mathrm{d}t}\begin{pmatrix} Q(t) \\ P(t) \end{pmatrix} = \begin{pmatrix} M_{11}(t) & M_{12}(t) \\ M_{21}(t) & M_{22}(t) \end{pmatrix} \begin{pmatrix} Q(t) \\ P(t) \end{pmatrix}. \tag{B.2} \]
ii) If $\begin{pmatrix} Q(t) \\ P(t) \end{pmatrix}$ is on the interval $J$ a real solution of the system (B.2) such that $\det(Q(t)) \neq 0$ for any $t \in J$, then
\[ W : J \to \mathbb{R}^{m\times n}, \quad t \mapsto P(t) Q(t)^{-1} =: W(t) \]
is a real solution of (B.1) and in particular $W(t_0) = P(t_0) Q(t_0)^{-1}$.
iii) In case of $W(t) \in \mathbb{C}^{m\times n}$, $M_{11}(t) \in \mathbb{C}^{n\times n}$, $M_{12}(t) \in \mathbb{C}^{n\times m}$, $M_{21}(t) \in \mathbb{C}^{m\times n}$, and $M_{22}(t) \in \mathbb{C}^{m\times m}$ with $n, m \in \mathbb{N}$ for any $t \in J \subset \mathbb{C}$ the assertions i) and ii) remain valid.
Proof. It follows by elementary calculation. We refer to Radon (1927) and Radon (1928) for the original work.
C Data Cleaning Criteria
Every empirical analysis inevitably hinges on the data treatment. Therefore, it is very important to carefully address the issue of data selection. We obtain the Standard & Poor's 500 (SPX) and Chicago Board Options Exchange (CBOE) volatility index (VIX) option data quotes, covering a wide range of strikes and maturities, from the OptionMetrics database. For the continuously compounded risk-free interest rates we take the zero-coupon yield curve, also available on OptionMetrics, covering various maturities. The data set spans the period from February 27, 2006 to August 30, 2013, covering seven and a half years of data. We infer the value of the SPX and VIX futures, respectively, at closing by backing out the
  • 55. value using the at-the-money (ATM) forward put-call parity.³ Moreover, we follow the standard convention in the literature and take the mid-price, defined as the average of the best bid and ask market quotes, to calculate the implied volatilities.

To avoid noise in the data, five additional exclusionary criteria are applied. First, we delete all non-traded and therefore illiquid options. Thus, we remove all options which have zero open interest or were not traded for some time, i.e. volume = 0. Second, we follow the arguments of Aït-Sahalia and Lo (1998) and delete observations below a price level of $0.10. This criterion can be justified by the rationale that it is not possible to quote an arbitrary decimal price due to the minimum tick for option prices. Third, all option quotes with a time-to-maturity of less than five days or more than one year are not taken into consideration. Fourth, we delete all in-the-money (ITM) SPX options if there exist corresponding liquid out-of-the-money (OTM) SPX options, since OTM options usually contain more information due to their high liquidity. In the particular case that the OTM options are not sufficiently liquid, we continue working with the more liquid of the OTM and ITM options.
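The first three exclusionary criteria above amount to simple row filters. The following is a minimal sketch only: the `Quote` record, its field names, the use of the mid-price as the "price level", and the reading of "one year" as 365 calendar days are all assumptions made for illustration and are not taken from the paper or from the OptionMetrics schema.

```python
from dataclasses import dataclass


@dataclass
class Quote:
    """Hypothetical container for one daily option quote."""
    bid: float
    ask: float
    volume: int
    open_interest: int
    days_to_maturity: int


def mid_price(q: Quote) -> float:
    # Mid-price: average of the best bid and ask quotes.
    return 0.5 * (q.bid + q.ask)


def passes_basic_filters(q: Quote,
                         min_price: float = 0.10,
                         min_days: int = 5,
                         max_days: int = 365) -> bool:
    # Criterion 1: drop quotes with zero open interest or zero volume.
    traded = q.volume > 0 and q.open_interest > 0
    # Criterion 2: drop quotes below the price level
    # (assumption: the level is applied to the mid-price).
    priced = mid_price(q) >= min_price
    # Criterion 3: keep maturities between five days and one year
    # (assumption: calendar days, one year = 365 days).
    in_window = min_days <= q.days_to_maturity <= max_days
    return traded and priced and in_window


# Hypothetical sample: only the first quote survives all three filters.
sample = [
    Quote(bid=1.00, ask=1.20, volume=50, open_interest=300, days_to_maturity=30),
    Quote(bid=1.00, ask=1.20, volume=0, open_interest=300, days_to_maturity=30),
    Quote(bid=0.02, ask=0.08, volume=50, open_interest=300, days_to_maturity=30),
    Quote(bid=1.00, ask=1.20, volume=50, open_interest=300, days_to_maturity=3),
]
kept = [q for q in sample if passes_basic_filters(q)]
```

The remaining criteria depend on comparing several contracts with each other and are therefore not covered by this per-quote sketch.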
Finally, we only work with highly liquid VIX call options since they exhibit, on average, a higher trading volume and open interest than the VIX put options.⁴

Concerning the implied volatilities, we use a hybrid algorithm, consisting of the (efficient) Newton-Raphson algorithm and the bisection method, for the calculations.⁵ We have to admit at this point that it is impossible to obtain a perfectly cleaned data set, since there are some issues which cannot be resolved, at least not with daily data.⁶ For instance, the trading time of options may not be the closing time, which means that the closing price of the underlying value does not correspond to the underlying value at the time of trade.

³ The ATM put-call parity is defined as follows:
$$C^{\mathrm{Mkt}}_t\left( F_t(T), K \approx F_t(T), t, T, r_{t,T} \right) + K e^{-r_{t,T}(T-t)} = P^{\mathrm{Mkt}}_t\left( F_t(T), K \approx F_t(T), t, T, r_{t,T} \right) + F_t(T) e^{-r_{t,T}(T-t)},$$
where $F_t(T)$ denotes the closing futures price today at time $t$ with maturity $T$. We denote the ATM ($K \approx F_t(T)$) observed market prices with the same maturity $T$ by $C^{\mathrm{Mkt}}_t$ and $P^{\mathrm{Mkt}}_t$, respectively. The risk-free interest rate $r_{t,T}$ with time-to-maturity $T - t$ can be extracted from, e.g., the zero-coupon yields or the implied London Interbank Offered Rate (LIBOR) swap rates for long maturities.

⁴ Indeed, a further stylized fact worth mentioning is the inverse put-call trading ratio for SPX and VIX options, i.e. we observe almost twice as many puts as calls traded daily in the SPX options market, and the reverse is true for the VIX market.

⁵ One can also obtain implied volatility estimates from OptionMetrics. However, there exist different approaches for the construction of an implied volatility surface (IVS). In particular, OptionMetrics first computes the implied volatility for each option, and in a second step the IVS is reconstructed by a Gaussian kernel smoothing with empirically adjusted widths. Since our data treatment and computational method differ from those of OptionMetrics, our IVS slices in Figure 3 of the main paper do not correspond to the IVS slices obtained by OptionMetrics. We refer to Homescu (2011) for a survey of the existing literature on the construction of an IVS.

⁶ High-frequency data, for example, offer more flexibility but also require special attention. Therefore, there is a trade-off between ordinary-frequency data and high-frequency data, which can, however, open a Pandora's box on how to partition the data.

  • 56. One way to deal with this issue is to consider futures prices backed out from the ATM put-call parity with highly liquid options, as we do in this paper. If such highly liquid options do not exist, we are back to the problem of non-synchronized prices. Nonetheless, we think that our data treatment removes most of the noise and allows an empirical data analysis.

D Further Stylized Facts

In this section we provide further empirical stylized facts of the SPX and CBOE VIX (option) market. The analysis is by no means exhaustive; nonetheless it encompasses, in combination with Section 2 of the main paper, a wide range of data features.

Table 1 and Table 2 report the properties of our SPX and VIX option data set divided into different maturity and moneyness bins, respectively, where the moneyness is defined, for $i \in \{S, V\}$, by
$$m := K / F^i_t(T),$$
where $K$ is the strike level of the European-style option and $F^i_t(T)$ denotes the closing SPX (resp. VIX) futures price if $i = S$ (resp. $i = V$) today at time $t$ with maturity $T$. The applied exclusionary criteria are outlined in Appendix C. These adjustments leave a total of 701,347 (liquid) OTM SPX options and 122,690 VIX call options. Notice that the low number of VIX options compared to the number of SPX options reflects the inception of the VIX options market in 2006, so that the overall traded volume is lower. Moreover, it also stems from the fact that fewer maturities and strikes are traded.

[Table 1 about here.] [Table 2 about here.]
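The futures back-out from the ATM put-call parity and the moneyness definition used in this appendix can be sketched in a few lines. Solving the parity relation $C + K e^{-r\tau} = P + F e^{-r\tau}$ for the futures price gives $F = K + e^{r\tau}(C - P)$ with $\tau = T - t$. The function names, the sample quotes, and the rule of picking the strike with the smallest $|C - P|$ as the ATM pair are assumptions made for illustration; the text only requires $K \approx F_t(T)$.

```python
import math


def implied_futures(call_mid, put_mid, strike, rate, tau):
    """Solve C + K*exp(-r*tau) = P + F*exp(-r*tau) for the futures price F."""
    return strike + math.exp(rate * tau) * (call_mid - put_mid)


def atm_implied_futures(quotes, rate, tau):
    """Back out F from the call-put pair closest to at-the-money.

    quotes: iterable of (strike, call_mid, put_mid) for a single maturity.
    Assumption: the ATM pair is the strike minimizing |C - P|.
    """
    strike, call_mid, put_mid = min(quotes, key=lambda q: abs(q[1] - q[2]))
    return implied_futures(call_mid, put_mid, strike, rate, tau)


def moneyness(strike, futures):
    """Moneyness m := K / F."""
    return strike / futures


# Hypothetical mid-price quotes for one maturity: (strike, call, put).
rate, tau = 0.02, 0.5
quotes = [(95.0, 7.10, 2.15), (100.0, 4.05, 4.05), (105.0, 2.00, 6.95)]
futures = atm_implied_futures(quotes, rate, tau)
m_values = [moneyness(k, futures) for k, _, _ in quotes]
```

Restricting the back-out to the most liquid near-the-money pair is what mitigates the non-synchronicity issue discussed at the end of Appendix C.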
  • 57. To motivate (multifactor) stochastic volatility empirically, one can take a look at the evolution of the ATM implied volatility. Figure 1 and Figure 2 depict the evolution of the SPX and VIX ATM implied volatility for four different times-to-maturity. The overall implied volatility levels are higher for VIX options. Furthermore, the periods of financial turmoil stand out more clearly as peaks in the SPX market, whereas the VIX market exhibits an oscillating behavior.

[Figure 1 about here.] [Figure 2 about here.]

From Figure 3 in the main paper and from Table 3, where we report eight different quantiles of implied volatility levels for SPX and VIX options, we observe that the VIX implied volatility levels are substantially higher than those of the SPX market. The explanation can be found in Figure 3. Plotting the evolution of the SPX and the CBOE VIX log-returns during the sample period of February 27, 2006 to August 30, 2013 shows that the driving stochastic volatility component has a considerably larger impact in the VIX market than in the SPX market, making the implied volatilities higher for VIX options.

[Table 3 about here.] [Figure 3 about here.]

The introduction of a (multifactor) stochastic volatility model is also supported by Figure 4, which shows four scatter plots of the log-returns on the VIX against the log-returns on the SPX under different dependence structures. In particular, the top left panel shows the real data, the top right panel simulations from a fitted Frank copula, the bottom left panel simulations from a fitted t-copula, and the bottom right panel simulations from a fitted Gaussian copula. Clearly, it is insufficient to fit a Gaussian copula to the SPX and VIX log-returns. A Frank copula or a t-copula reproduces the leptokurtic behavior much better. Figure 4 also implies
  • 58. the leverage effect by the negative correlation of the SPX and VIX log-returns.

[Figure 4 about here.]

Since option pricing is essentially about replicating the risk-neutral density (RND) at maturity, given a fixed time-to-maturity, we examine the RNDs of SPX options, which are directly linked to the RNDs of VIX options by the negative correlation. For the calibration we used the technique described in Monnier (2013), which takes available bid and ask put option quotes and estimates the smoothest risk-neutral density compatible with the quotes. This non-parametric method is able to recover the middle part of the RND together with its full left tail and part of its right tail. Considering that the VIX RNDs exhibit fat right tails, we do not employ the method for the VIX market and leave the extension of the technique to variance derivatives for future research.⁷ Given the implied volatility smirks in Figure 3 of the main paper, we expect asymmetry (negative skewness) and fat tails (leptokurtosis) in the risk-neutral distribution of the SPX. Indeed, as can be noted from Figure 5, the SPX RNDs exhibit negative skewness and leptokurtosis for all times-to-maturity. As pointed out in Huang and Wu (2004), a jump component in the SPX dynamics generates return non-normality over short horizons, and a persistent stochastic volatility process slows down the convergence of the return distribution to normality as the maturity increases. Notice that Figure 5 also implies, by the leverage effect, the reverse shape for the VIX market. These facts evidently motivate (multifactor) stochastic volatility in the SPX dynamics.⁸

[Figure 5 about here.]
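Negative skewness and leptokurtosis of a density can be quantified by its standardized third and fourth central moments. The sketch below computes them for a density tabulated on a grid, using a plain trapezoidal rule. It is not the estimation technique of Monnier (2013); the two-component Gaussian mixture is a purely hypothetical stand-in for an estimated RND, and all function names are illustrative.

```python
import math


def trapezoid(y, x):
    """Trapezoidal rule for samples y on grid x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))


def density_moments(x, pdf):
    """Return mean, variance, skewness, and excess kurtosis of a tabulated density."""
    mass = trapezoid(pdf, x)
    p = [v / mass for v in pdf]  # renormalize so the density integrates to one
    mean = trapezoid([xi * pi for xi, pi in zip(x, p)], x)

    def central(k):
        return trapezoid([(xi - mean) ** k * pi for xi, pi in zip(x, p)], x)

    var = central(2)
    skew = central(3) / var ** 1.5
    excess_kurt = central(4) / var ** 2 - 3.0
    return mean, var, skew, excess_kurt


def normal_pdf(z, mu, sigma):
    return math.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


# Hypothetical log-return density: a calm regime plus a crash regime.
grid = [-3.0 + 0.001 * i for i in range(6001)]
mixture = [0.8 * normal_pdf(z, 0.02, 0.10) + 0.2 * normal_pdf(z, -0.30, 0.25)
           for z in grid]
mean, var, skew, excess_kurt = density_moments(grid, mixture)
```

For this hypothetical mixture the skewness is negative and the excess kurtosis positive, which is the qualitative shape reported for the SPX RNDs.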
To gain a notion of the bid-ask spreads in both markets, which are an indicator of, e.g., hedging and transaction costs, we plot the evolution of the average SPX and VIX absolute bid-ask spread and the average SPX and VIX relative bid-ask spread, defined by "bid-ask spread/mid-price", during the sample period of February 27, 2006 to August 30, 2013. Notice that we incorporate the entire data range in the computation and therefore no data filter is applied.

⁷ The recent paper of Song and Xiu (2014) recovers the VIX RNDs using another non-parametric approach. For a comprehensive treatment of state-price density estimation techniques see, e.g., the monograph of Jackwerth (2004).

⁸ The introduction of jumps in the asset return and volatility process, respectively, can obviously also be taken into consideration. Nevertheless, we do not follow this path for reasons of model parsimony.

  • 59. In absolute terms, the average SPX bid-ask spreads are substantially higher. In relative terms, the VIX bid-ask spreads are higher than the corresponding SPX bid-ask spreads at the beginning of the period. In more recent years, the two markets converge in terms of relative bid-ask spreads, implying that the VIX market has become more liquid. This finding is also supported by Figure 8, in which we plot the volume and open interest for the SPX and VIX option market over time. We can see that the VIX option volume and open interest have increased significantly over recent years, whereas the same measures for the SPX market have remained roughly stable.

[Figure 6 about here.] [Figure 7 about here.] [Figure 8 about here.]
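The two spread measures compared above differ only in their normalization, which is why the ranking of the two markets can flip between absolute and relative terms. A minimal sketch of the daily averages follows; the function names and the sample quotes are hypothetical.

```python
from statistics import mean


def absolute_spread(bid, ask):
    """Absolute bid-ask spread."""
    return ask - bid


def relative_spread(bid, ask):
    """Relative spread: bid-ask spread divided by the mid-price."""
    return (ask - bid) / (0.5 * (bid + ask))


def average_spreads(quotes):
    """Average absolute and relative spread over (bid, ask) pairs of one day."""
    abs_avg = mean(absolute_spread(b, a) for b, a in quotes)
    rel_avg = mean(relative_spread(b, a) for b, a in quotes)
    return abs_avg, rel_avg


# Hypothetical quotes: high-priced options with wide absolute spreads
# versus low-priced options with narrow absolute spreads.
high_priced = [(40.0, 41.0), (25.0, 25.8)]
low_priced = [(1.00, 1.20), (2.00, 2.30)]
high_abs, high_rel = average_spreads(high_priced)
low_abs, low_rel = average_spreads(low_priced)
```

In this made-up sample the high-priced quotes have the larger absolute spread but the smaller relative spread, mirroring the pattern described for the SPX and VIX markets at the beginning of the sample period.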