HHT Report
FINAL REPORT
Application of the Hilbert-Huang Transform to the prediction of financial time series
Cyrille BEN LEMRID, Hadrien MAUPARD
Natixis supervisor: Adil REGHAI
Academic supervisor: Erick HERBIN
École Centrale Paris
March 18, 2012
A.2.3 Non parametric processes for financial time series . . . . . . . . . . . 46
A.3 General time series: the Box & Jenkins approach for prediction . . . . . . . 46
B Evaluation criteria of backtests 47
Introduction
This report presents the Hilbert-Huang research work of Cyrille Ben Lemrid and Hadrien Maupard.
The Hilbert-Huang Transform relies on two steps: first, a non-parametric Empirical Mode Decomposition, which decomposes the signal into Intrinsic Mode Functions (semi-periodic functions) of various frequencies; then a Hilbert Decomposition, projecting the IMFs onto a three-dimensional time-frequency graph. The details of the algorithm are thoroughly explained in the first chapter of this report.
Due to the lack of theoretical formulation of the latter, and in order to keep our algorithm flexible and simple, we will only use the Huang Transform, i.e. the Empirical Mode Decomposition (EMD).
Once applied to finance, it is well known that the usual tools for the prediction of time series are powerless. Stationary and linear models, such as ARIMA processes, are unable to predict financial time series, which display non-stationarity and a long memory. Hence, some extensions exist, parametric or non-parametric. The EMD belongs to the non-parametric predictors of non-linear, non-stationary time series.
In chapter 2, based on empirical observations, interesting stylized facts are derived: IMFs are uncorrelated to each other. Low frequency IMFs are periodic, and explain most of the variance of the original time series. With their smooth and regular form, they are still able to catch most of the information of the time series. High frequency IMFs are closer to random processes, and display some stationarity. These facts connect the EMD with the Box & Jenkins statistical framework: a time series can be seen as the sum of a semi-periodic or seasonal process (the low frequency IMFs) and a random semi-stationary process (the high frequency IMFs).
In chapter 3, two categories of predictors are introduced, relying on two hypotheses on the seasonal process: either it is deterministic, and can be prolonged, or it remains stochastic, and the conditional expectation is the best predictor. The hypothesis of a deterministic seasonal process gives one strategy: the Low Frequency Mean Reverting Strategy. The hypothesis of stochastic periodicity of the seasonal process gives two strategies: a Low Frequency Multi Asset Shifting Pattern Recognition Strategy, and a Low Frequency Mono Asset Shifting Pattern Recognition Strategy.
In chapter 4, the backtest method of the strategies is formulated, and underlyings for the
backtests are chosen: Implied Volatilities, Stocks, Indices and trading pairs, Commodities.
Finally, the results of these backtests are commented.
In the Annex, prerequisites about time series and asset management literature are given.
Chapter 1
Description of the Hilbert-Huang Transform, model overview
The Hilbert-Huang transform (HHT) is an empirically based data analysis method, which
is performed in two steps: first some descriptive patterns are extracted by performing an
adaptive decomposition called Empirical Mode Decomposition (Huang Transform), and then
we can capture the local behavior of these patterns by using tools coming from Hilbert
Spectral Analysis (Hilbert Transform).
1.1 The Empirical Mode Decomposition
The Empirical Mode Decomposition is based on the assumption that any data consists of
different simple intrinsic modes of oscillations. Each of these oscillatory modes is represented
by an intrinsic mode function (IMF) which satisfies two conditions:
– In the whole data set, the number of zero crossings and the number of extrema must be equal or differ at most by one.
– There exist two envelopes, one passing through the local maxima and the other through the local minima, such that at any point the mean value of the two envelopes is zero.
Definition 1.1.1 An $\mathbb{R}$-valued process $x(t)$ is called an IMF (Intrinsic Mode Function) if it is a continuous process that satisfies the following conditions:

1. The number of extrema and the number of zero-crossings must either be equal or differ at most by one: $|\#\Gamma_{max} + \#\Gamma_{min} - \#\Gamma_0| \le 1$,
with $\Gamma_0 = \{t \in I \mid x(t) = 0\}$,
$\Gamma_{max} = \{t \in I \mid \exists u > 0, \forall s \in\, ]t-u, t+u[\, \setminus \{t\},\ x(t) > x(s)\}$,
$\Gamma_{min} = \{t \in I \mid \exists u > 0, \forall s \in\, ]t-u, t+u[\, \setminus \{t\},\ x(t) < x(s)\}$.

2. The mean value $m(t) = (x_{sup}(t) + x_{inf}(t))/2$ of the envelope defined by the local maxima $x_{sup}(t)$ and the envelope defined by the local minima $x_{inf}(t)$ is zero:
$\exists x_{sup} \in Env(\Gamma_{max}), \exists x_{inf} \in Env(\Gamma_{min}), \forall t \in I,\ m(t) = 0$,
with $Env(\Gamma_{max}) = \{f \in C(I) \mid \forall t \in \Gamma_{max}, f(t) = x(t)\}$ and $Env(\Gamma_{min}) = \{f \in C(I) \mid \forall t \in \Gamma_{min}, f(t) = x(t)\}$.
An IMF represents a simple oscillatory mode as a counterpart to the simple harmonic
function, but it is much more general: instead of constant amplitude and frequency, as
in a simple harmonic component, the IMF can have a variable amplitude and frequency as
functions of time.
The first condition is apparently necessary for oscillation data; the second condition requires
that upper and lower envelopes of IMF are symmetric with respect to the x-axis.
The idea of the EMD method is to separate the data into a slowly varying local mean part and a fast varying symmetric oscillation part. The oscillation part becomes the IMF and the local mean the residue; the residue serves as input data for further decomposition, and the process repeats until no more oscillation can be separated from the residue. At each step of the decomposition, since the upper and lower envelopes of the IMF are initially unknown, a repetitive sifting process approximates the envelopes with cubic spline functions passing through the extrema of the IMF. The data serves as the initial input for the sifting process; the refined IMF is the difference between the previous version and the mean of the envelopes, and the process repeats until the predefined stop condition is satisfied. The residue is then the difference between the data and the improved IMF.
One big advantage of this procedure is that it can deal with data from nonstationary and nonlinear processes. The method is direct and adaptive, with an a posteriori basis derived from the data itself.
The intrinsic mode components can be decomposed in the following steps:
1. Take an arbitrary input signal x(t) and initialize the residual: r0(t) = x(t), i = 1
2. Extract the i-th IMF:
3. Initialize the "proto-IMF" h0 with h0(t) = r_{i-1}(t), k = 1
4. Extract the local maxima and minima of the "proto-IMF" h_{k-1}(t)
5. Interpolate the local maxima and the local minima by a cubic spline to form the upper and lower envelopes of h_{k-1}(t)
6. Calculate the mean m_{k-1}(t) of the upper and lower envelopes of h_{k-1}(t)
7. Define: h_k(t) = h_{k-1}(t) - m_{k-1}(t)
8. If the IMF criteria are satisfied, set IMF_i(t) = h_k(t); else go to (4) with k = k + 1
9. Define: r_i(t) = r_{i-1}(t) - IMF_i(t)
10. If r_i(t) still has at least two extrema, go to (2) with i = i + 1; else the decomposition is complete and r_i(t) is the "residue" of x(t).
Figure 1.1: Sifting process of the empirical mode decomposition: (a) an arbitrary input; (b)
identified maxima (diamonds) and minima (circles) superimposed on the input; (c) upper
envelope and lower envelope (thin solid lines) and their mean (dashed line); (d) prototype
intrinsic mode function (IMF) (the difference between the bold solid line and the dashed line
in Figure 2c) that is to be refined; (e) upper envelope and lower envelope (thin solid lines)
and their mean (dashed line) of a refined IMF; and (f) remainder after an IMF is subtracted
from the input.
Once a signal has been fully decomposed, the signal x(t) can be written as

$$x(t) = \sum_{i=1}^{N} IMF_i(t) + r(t)$$
1.2 Closed form formulas for IMFs
Rather than a Fourier or wavelet based transform, the Hilbert transform is used, in order to compute instantaneous frequencies and amplitudes and describe the signal more locally. The following equation displays the Hilbert transform $Y_t$, which can be written for any function $x(t)$ of $L^p$ class; PV denotes Cauchy's principal value integral.

$$Y_t = H[IMF_t] = \frac{1}{\pi}\, PV \int_{-\infty}^{+\infty} \frac{IMF_s}{t-s}\, ds = \frac{1}{\pi} \lim_{\varepsilon \to 0} \left( \int_{-\infty}^{t-\varepsilon} \frac{IMF_s}{t-s}\, ds + \int_{t+\varepsilon}^{+\infty} \frac{IMF_s}{t-s}\, ds \right)$$
Algorithm 1 Empirical Mode Decomposition
Require: Signal, threshold ∈ R+
1: curSignal ← Signal, i = 1
2: while (numberOfExtrema(curSignal) > 2) do
3:   curImf ← curSignal
4:   while (isNotAnImf(curImf, threshold) = true) do
5:     Γmax ← emdGetMaxs(curImf)
6:     Γmin ← emdGetMins(curImf)
7:     Γmax ← emdMaxExtrapolate(curImf, Γmax)
8:     Γmin ← emdMinExtrapolate(curImf, Γmin)
9:     xsup ← emdInterpolate(curImf, Γmax)
10:    xinf ← emdInterpolate(curImf, Γmin)
11:    bias ← (xsup + xinf)/2
12:    curImf ← curImf − bias
13:   end while
14:   IMFi ← curImf
15:   curSignal ← curSignal − IMFi, i = i + 1
16: end while
17: N = i − 1
18: residual ← curSignal
19: return (IMFi)i=1..N, residual
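To make Algorithm 1 concrete, the sketch below implements the sifting loop in NumPy/SciPy. It is an illustration, not the implementation used in this report: boundary handling is crude (the envelopes are simply pinned to the series endpoints), and the normalized squared difference stoppage test of section 2.2.1 is used with an arbitrary threshold.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def _local_extrema(x):
    """Indices of strict local maxima and minima of a 1-D array."""
    maxima = [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] > x[i + 1]]
    minima = [i for i in range(1, len(x) - 1) if x[i - 1] > x[i] < x[i + 1]]
    return np.array(maxima, dtype=int), np.array(minima, dtype=int)

def _envelope_mean(x):
    """Mean of the cubic-spline upper and lower envelopes, or None when
    there are too few extrema to build them (crude endpoint handling)."""
    maxima, minima = _local_extrema(x)
    if len(maxima) < 2 or len(minima) < 2:
        return None
    t = np.arange(len(x))
    upper = CubicSpline(np.r_[0, maxima, len(x) - 1],
                        np.r_[x[0], x[maxima], x[-1]])(t)
    lower = CubicSpline(np.r_[0, minima, len(x) - 1],
                        np.r_[x[0], x[minima], x[-1]])(t)
    return (upper + lower) / 2.0

def emd(signal, sd_threshold=0.2, max_imfs=12, max_sift=50):
    """Empirical Mode Decomposition: returns (list of IMFs, residue)."""
    residual = np.asarray(signal, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = _local_extrema(residual)
        if len(maxima) + len(minima) <= 2:       # nothing left to sift
            break
        h = residual.copy()
        for _ in range(max_sift):                # sifting loop
            m = _envelope_mean(h)
            if m is None:
                break
            h_new = h - m
            sd = np.sum((h - h_new) ** 2) / np.sum(h ** 2)
            h = h_new
            if sd < sd_threshold:                # normalized squared difference
                break
        imfs.append(h)
        residual = residual - h
    return imfs, residual
```

By construction, the extracted IMFs and the residue sum back exactly to the input signal.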
An analytic signal can be formed with the Hilbert transform pair:

$$Z_t = IMF_t + iY_t = A_t e^{i\theta_t}$$

where $A_t = \sqrt{IMF_t^2 + Y_t^2}$ and $\theta_t = \arctan\left(\frac{Y_t}{IMF_t}\right)$.

$A_t$ and $\theta_t$ are the instantaneous amplitude and phase functions, respectively. The instantaneous frequency $f_t$ can then be written as the time derivative of the phase:

$$f_t = \frac{1}{2\pi}\frac{d\theta_t}{dt}$$

Hence, an IMF can be expressed analytically:

$$IMF_t = A_t \cos\left(2\pi \int_0^t f_s\, ds + \psi\right) \qquad (1.1)$$
[11] and [14] showed that not all functions give ”good” Hilbert transforms, meaning those
which produce physical instantaneous frequencies. The signals which can be analyzed using
the Hilbert transform must be restricted so that their calculated instantaneous frequency
functions have physical meaning.
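In practice, the analytic signal $Z_t$ can be obtained directly with `scipy.signal.hilbert`, from which the instantaneous amplitude and frequency follow; a minimal sketch:

```python
import numpy as np
from scipy.signal import hilbert

def instantaneous_attributes(imf, dt=1.0):
    """Instantaneous amplitude A_t and frequency f_t of an IMF, computed
    from the analytic signal Z_t = IMF_t + i * H[IMF]_t."""
    z = hilbert(imf)                                   # analytic signal
    amplitude = np.abs(z)                              # A_t
    phase = np.unwrap(np.angle(z))                     # theta_t (continuous)
    frequency = np.gradient(phase, dt) / (2 * np.pi)   # f_t = (1/2pi) dtheta/dt
    return amplitude, frequency
```

For a pure cosine, the recovered amplitude is constant and the recovered frequency matches the oscillation frequency, up to edge effects of the discrete transform.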
Next, the empirical mode decomposition is essentially an algorithm which decomposes nearly
any signal into a finite set of functions which have ”good” Hilbert transforms that produce
physically meaningful instantaneous frequencies.
After IMFs have been obtained from the EMD method, one can further calculate the instantaneous phases of the IMFs by applying the Hilbert transform to each IMF component.
Chapter 2
State of the Art
2.1 Application fields of the EMD
The Empirical Mode Decomposition can be a powerful tool to separate non-linear and non-stationary time series into a trend (the residue function) and oscillations (the IMFs) on different time scales; it can describe the frequency components locally and adaptively for nearly any oscillating signal. This makes the tool extremely versatile. The decomposition finds applications in many fields traditionally dominated by Fourier or wavelet methods. For instance, the HHT has been used to study a wide variety of data including rainfall, earthquakes, sunspot number variation, heart-rate variability, financial time series, and ocean waves, to name a few. But some mathematical issues related to this decomposition have been mostly left untreated: convergence of the method, optimization problems (best IMF selection and uniqueness of the decomposition), and spline problems (best spline functions for the HHT). In the following chapters, these open issues are thoroughly developed, and the current potential solutions from the literature are gathered.
2.2 Existence and uniqueness of the Decomposition
The convergence of the proto-IMF sequence $(h_k)_{k \ge 0}$ to an IMF is equivalent to the convergence of the bias $(m_k)_{k \ge 0}$ to zero:

$$m_k \xrightarrow[k \to \infty]{L^2} 0$$

where $m_k = h_{k-1}(t) - h_k(t)$.
2.2.1 Stoppage criteria
The inner loop should be ended when the result of the sifting process meets the definition
of an IMF. In practice this condition is too strong so we need to specify a relaxed condition
which can be met in a finite number of iterations. The approximate local envelope symmetry
condition in the sifting process is called the stoppage (of sifting) criterion. In the past, several
different types of stoppage criteria were adopted: the most widely used type, which originated
from Huang et al. [14], is given by a Cauchy type of convergence test, the normalized Squared
Difference between two successive sifting operations defined as
$$SD_k = \frac{\sum_{t=0}^{T} |h_{k-1}(t) - h_k(t)|^2}{\sum_{t=0}^{T} h_{k-1}^2(t)}$$
must be smaller than a predetermined value. This definition is slightly different from the
one given by Huang et al. [14] with the summation signs operating for the numerator and
denominator separately in order to prevent the SDk from becoming too dependent on local
small amplitude values of the sifting time series.
If we assume that the local mean between the upper and lower envelopes converges to zero in the sense of the Euclidean norm, we can apply the following Cauchy criterion:

$$\log_2\!\left(\frac{\|m_{k-1}\|_{L^2}^2}{\|h_{k-1}\|_{L^2}^2}\right) \le \text{threshold}$$

In our implementation the threshold has been calibrated to −15.
These Cauchy types of stoppage criteria seem mathematically rigorous. However, they are difficult to implement, for two reasons. First, how small is small enough begs an answer. Second, the criterion does not depend on the definition of an IMF: the squared difference might be small, yet there is no guarantee that the function has the same numbers of zero crossings and extrema.
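Both stoppage tests are one-liners in practice. The sketch below implements the normalized squared difference and the log2 criterion with the −15 threshold mentioned above; the 0.2 default for SD is a common choice from the literature, not a value taken from this report.

```python
import numpy as np

def sd_stops(h_prev, h_curr, sd_threshold=0.2):
    """Cauchy-type test: the normalized squared difference between two
    successive sifting iterates must fall below sd_threshold."""
    sd = np.sum((h_prev - h_curr) ** 2) / np.sum(h_prev ** 2)
    return sd < sd_threshold

def log_criterion_stops(m_prev, h_prev, threshold=-15.0):
    """log2 of the squared L2-norm ratio of the envelope mean to the
    proto-IMF must fall below the calibrated threshold (-15 here)."""
    ratio = np.sum(m_prev ** 2) / np.sum(h_prev ** 2)
    return np.log2(ratio) <= threshold
```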
2.2.2 Cubic spline interpolation
Since the EMD is an empirical algorithm and involves a prescribed stoppage criterion to
carry out the sifting moves, we have to know the degree of sensitivity in the decomposition
of an input to the sifting process, so the reliability of a particular decomposition can further
be determined. Therefore, a confidence limit of the EMD is a desirable quantity. To compute
the upper and lower envelopes we use a piecewise-polynomial approximation.
In general, the goal of a spline interpolation is to create a function which achieves the best possible approximation to a given data set. For a smooth and efficient approximation, one has to choose polynomials of sufficiently high order. A popular choice is the piecewise polynomial of order three, the cubic spline.
The basic idea behind using a cubic spline is to fit a piecewise function of the form:

$$S(x) = \begin{cases} S_0(x), & x \in [x_0, x_1[ \\ S_1(x), & x \in [x_1, x_2[ \\ \vdots \\ S_{n-1}(x), & x \in [x_{n-1}, x_n[ \end{cases}$$

where $S_i(x)$ is a third degree polynomial with coefficients $a_i$, $b_i$, $c_i$ and $d_i$, defined for $i = 0, 1, \dots, n-1$ by:

$$S_i(x) = \left(a_i + b_i(x - x_i) + c_i(x - x_i)^2 + d_i(x - x_i)^3\right)\mathbf{1}_{\{x \in [x_i, x_{i+1}[\}}$$
More formally, given a function $f(x)$ defined on an interval $[a, b]$ and a set of nodes $a = x_0 < x_1 < \dots < x_n = b$, a cubic spline interpolant $S(x)$ for $f(x)$ is a function that satisfies the following conditions:

1. $S(x) = \sum_{i=0}^{n-1} S_i(x)\mathbf{1}_{\{x \in [x_i, x_{i+1}[\}}$ is a cubic polynomial, denoted $S_i(x)$, on each subinterval $[x_i, x_{i+1})$, $i = 0, 1, \dots, n-1$, with $S(x_i) = f(x_i)$ at every node.
2. $S_{i+1}(x_{i+1}) = S_i(x_{i+1})$ for each $i = 0, 1, \dots, n-2$.
3. $S'_{i+1}(x_{i+1}) = S'_i(x_{i+1})$ for each $i = 0, 1, \dots, n-2$.
4. $S''_{i+1}(x_{i+1}) = S''_i(x_{i+1})$ for each $i = 0, 1, \dots, n-2$.
5. One of the following sets of boundary conditions is also satisfied:
   $S''(x_0) = S''(x_n) = 0$ (free or natural boundary), or
   $S'(x_0) = f'(x_0)$ and $S'(x_n) = f'(x_n)$ (clamped boundary).
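Both boundary conditions are available in `scipy.interpolate.CubicSpline` through its `bc_type` argument. The sketch below builds the two variants on an arbitrary example node set and checks the defining constraints at the end nodes:

```python
import numpy as np
from scipy.interpolate import CubicSpline

xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.sin(xs)

# Natural boundary: S''(x_0) = S''(x_n) = 0
natural = CubicSpline(xs, ys, bc_type="natural")

# Clamped boundary: S'(x_0) = f'(x_0) and S'(x_n) = f'(x_n); here f = sin,
# so the prescribed end derivatives are cos(x_0) and cos(x_n)
clamped = CubicSpline(xs, ys, bc_type=((1, np.cos(xs[0])), (1, np.cos(xs[-1]))))
```

Calling a `CubicSpline` object with a second argument `nu` evaluates its `nu`-th derivative, which makes the boundary constraints easy to verify.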
But there are four problems with this decomposition method:
– the (cubic) spline connecting the extrema is not the real envelope,
– the resulting IMF function does not strictly guarantee symmetric envelopes,
– some unwanted overshoot may be caused by the spline interpolation,
– the spline cannot be connected at both ends of the data series.
Higher order splines do not, in theory, resolve these problems.
2.2.3 Additional boundary data points
It has already been illustrated that the cubic spline has to be kept close to the function, especially near both ends of the data series. Therefore, the creation of additional boundary data points, which are supposed to be applicable to the current data set, appears to be the key element in technically improving the EMD. All artificially added boundary data points are generated from within the original set of discrete knots to represent a characteristic natural behaviour. One routine is to add new maxima and minima to the front and rear of the data series. As a basic requirement, these data points are located off the original time span the signal was recorded in. Therefore, no information is cancelled out and the natural data series remains unaffected. One disadvantage of this method, however, is that it anticipates the future trend of the data.
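A common version of this routine mirrors the first and last detected extrema across the series boundaries. The helper below is a hypothetical illustration (the report does not specify which extension rule was used):

```python
import numpy as np

def mirror_extend(extrema_idx, extrema_val, n):
    """Add one artificial extremum on each side of a series of length n by
    reflecting the first/last detected extrema across the boundaries."""
    idx = np.asarray(extrema_idx, dtype=int)
    val = np.asarray(extrema_val, dtype=float)
    ext_idx = np.r_[-idx[0], idx, 2 * (n - 1) - idx[-1]]
    ext_val = np.r_[val[0], val, val[-1]]
    return ext_idx, ext_val
```

The envelope spline is then fitted on the extended knots but only evaluated on the original time span [0, n−1], so the added points never alter the recorded data.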
Figure 2.1: Additional boundary data points
Figure 2.2: Additional boundary data points
Chapter 3
Application to the prediction of financial time
series
3.1 The Empirical Mode Decomposition in finance:
stylized facts
In the markets, one can assume that different modes of oscillations in stock prices are
provoked by different kinds of actors. With this approach, the lowest frequency IMFs could
be considered as a statistical proxy to infer the behavior of big investors (Banks, Insurance companies, Hedge Funds, Mutual Funds, Pension Funds...) and to predict the evolution of a stock in the long run.
3.1.1 Empirical Modes and Market Structure
3.1.1.1 Asset price and IMFs
With the EMD, a time series can be represented as follows:

$$\exists N_{IMF} \in \mathbb{N}^* \mid \forall t \in \mathbb{Z},\quad X_t = \sum_{i=1}^{N_{IMF}} IMF_t^i + r_t$$

It can be interesting to observe the correlation matrix of the random vector

$$\left(X_t, IMF_t^1, \dots, IMF_t^{N_{IMF}}, r_t\right)_{t \in \mathbb{Z}}$$

This matrix can be empirically computed on an example, as shown in Figure 3.1, page 17.
This correlation matrix shows three stylized facts:
X IMF1 IMF2 IMF3 IMF4 IMF5 IMF6 IMF7 Trend
X 1.000 0.038 0.039 0.172 0.184 0.169 0.488 0.795 0.879
IMF1 0.038 1.000 0.034 -0.009 -0.026 0.002 0.022 -0.021 -0.019
IMF2 0.039 0.034 1.000 -0.040 -0.005 0.043 -0.015 -0.022 -0.025
IMF3 0.172 -0.009 -0.040 1.000 0.109 -0.114 0.102 0.049 0.059
IMF4 0.184 -0.026 -0.005 0.109 1.000 -0.035 0.018 0.035 0.040
IMF5 0.169 0.002 0.043 -0.114 -0.035 1.000 -0.096 -0.014 -0.038
IMF6 0.488 0.022 -0.015 0.102 0.018 -0.096 1.000 -0.043 0.148
IMF7 0.795 -0.021 -0.022 0.049 0.035 -0.014 -0.043 1.000 0.977
Trend 0.879 -0.019 -0.025 0.059 0.040 -0.038 0.148 0.977 1.000
Figure 3.1: Empirical correlation matrix of AXA share price and its IMFs, from 01/01/2003
to 14/11/2011
– The EMD displays some empirical orthogonality features. Hence, the following theoretical assumption can be made:

$$\forall (i,j) \in [[1; N_{IMF}]]^2,\quad i \ne j \Rightarrow \langle IMF^i, IMF^j \rangle = 0$$

with the following scalar product:

$$\langle \cdot, \cdot \rangle : L^2(\mathbb{R}) \times L^2(\mathbb{R}) \to \mathbb{R},\quad (X, Y) \mapsto Cov(X, Y)$$

– Low frequency IMFs display strong correlations to the original price series.
– High frequency IMFs are uncorrelated to the original price series.
3.1.1.2 High frequency modes
Empirical modes have a strong connection to the market structure. On top of being
uncorrelated to the general time series, high frequency IMFs present strong correlation to
daily price movements, as shown by the following empirical correlation matrix of daily yields
to the IMFs processes and the trend process, (see Figure 3.3, page 18).
High frequency IMFs are accurately following daily movements, as shown in 3.4, page 19.
Moreover, there appear to be local jumps in the amplitude of the high frequency IMFs when daily changes become sharper. This comes along with local jumps in volatility. Hence, the amplitude of high frequency IMFs is probably positively correlated to the short term implied volatility of the at-the-money options on the stock. As we see on the last graph, amplitude jumps along with volatility as daily yields exceed 10% (see Figure 3.6 and Figure 3.5).
Finally, despite their significant short term periodicity, they display some signs of stationar-
ity, as shown in Figure 3.2, page 18.
Figure 3.2: Sample Autocorrelation Function of the highest frequency IMF
              diff(Axa)/Axa   IMF1    IMF2    IMF3    IMF4    IMF5    IMF6    IMF7    Trend
diff(Axa)/Axa     1.000       0.552   0.087   0.029   0.016  -0.007  -0.008  -0.004  -0.003
IMF1              0.552       1.000   0.034  -0.009  -0.026   0.002   0.022  -0.021  -0.019
IMF2              0.087       0.034   1.000  -0.040  -0.005   0.043  -0.015  -0.022  -0.025
IMF3              0.029      -0.009  -0.040   1.000   0.109  -0.114   0.102   0.049   0.059
IMF4              0.016      -0.026  -0.005   0.109   1.000  -0.035   0.018   0.035   0.040
IMF5             -0.007       0.002   0.043  -0.114  -0.035   1.000  -0.096  -0.014  -0.038
IMF6             -0.008       0.022  -0.015   0.102   0.018  -0.096   1.000  -0.043   0.148
IMF7             -0.004      -0.021  -0.022   0.049   0.035  -0.014  -0.043   1.000   0.977
Trend            -0.003      -0.019  -0.025   0.059   0.040  -0.038   0.148   0.977   1.000
Figure 3.3: Empirical correlation of daily returns of AXA and its IMFs, from 01/01/2003 to 14/11/2011
A Ljung-Box test is a more quantitative way to assess stationarity. It tests the null hypothesis that the first m autocorrelations are zero.

Definition 3.1.1 Let $m \in \mathbb{N}$, and $(H_0): \rho_1 = \rho_2 = \dots = \rho_m = 0$. The Ljung-Box test statistic is given by:

$$Q(m) = N(N+2) \sum_{h=1}^{m} \frac{\hat\rho_h^2}{N - h}$$

and under $H_0$, $Q(m) \sim \chi^2_m$.
However, the test rejects the absence of autocorrelations for the highest frequency IMF on our previous example of AXA. Therefore, it suggests that these high frequency modes display some stationary properties, but keep a very short periodicity, making these processes not i.i.d.

Figure 3.4: Daily returns of Societe Generale and its highest frequency IMF
Figure 3.5: Daily returns of Societe Generale and its highest frequency IMF during a high volatility period
Figure 3.6: Daily returns of Societe Generale and its highest frequency IMF normalized
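The Q statistic is straightforward to compute from the sample autocorrelations; the sketch below is a direct transcription of the definition (statsmodels' `acorr_ljungbox` provides the same test with p-values):

```python
import numpy as np

def ljung_box_q(x, m):
    """Ljung-Box Q(m) statistic; compare against a chi-squared quantile
    with m degrees of freedom to accept or reject H0."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    denom = np.sum(xc ** 2)
    # sample autocorrelations rho_1 .. rho_m
    rho = np.array([np.sum(xc[h:] * xc[:-h]) / denom for h in range(1, m + 1)])
    return n * (n + 2) * np.sum(rho ** 2 / (n - np.arange(1, m + 1)))
```

A strongly autocorrelated series produces a Q far above the chi-squared critical value, so the null of no autocorrelation is rejected.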
3.1.1.3 Low frequency modes
Low frequency modes describe the long term dynamics of the stock. They reflect the positions of long term actors within the market (see Figure 3.7, page 21). They can also be interpreted in terms of economic cycles, if applied to very long time frames (see Figure 3.8, page 21).
3.1.2 Back to the Box & Jenkins framework
The EMD algorithm derives the following decomposition of a given financial time series:

$$\exists N_{IMF} \in \mathbb{N}^* \mid \forall t \in \mathbb{Z},\quad X_t = \sum_{i=1}^{N_{IMF}} IMF_t^i + r_t$$
i=1
Moreover, low frequency IMFs explain the general evolution of the stock price, and have
strong periodic patterns, whereas high frequency IMFs are linked to the daily movements.
Figure 3.9, page 22 shows all the IMFs of an example of financial time series. In red are low
frequency IMFs, in blue high frequency IMFs, in green the stock price.
Figure 3.7: Societe Generale stock price and the sum of its 2 lowest frequency IMFs
Figure 3.8: VIX and the sum of its 3 lowest frequency IMFs
Hence, we can separate the previous sum into the two components of the Box & Jenkins
approach.
Figure 3.9: Empirical Mode Decomposition of Axa, starting 01/01/2003 until 14/11/2011
$$\exists N_{IMF} \in \mathbb{N}^* \mid \forall t \in \mathbb{Z},\quad X_t = \underbrace{\sum_{i=1}^{N_{sep}} IMF_t^i}_{\text{Random part}} + \underbrace{\sum_{i=N_{sep}+1}^{N_{IMF}} IMF_t^i + \underbrace{r_t}_{\text{Trend}}}_{\text{Seasonal part}}$$
In the rest of this report, we will sometimes include the trend process in the seasonal part.
In the previous example, the decomposition is the following:
Remark 3.1.2 In Figure 3.10, page 23, the correlation between the "Random Part" and the "Seasonal Part" is:

$$Corr(X_t^{random}, X_t^{seasonal}) = -0.01$$
In order to properly differentiate low frequency IMFs and high frequency IMFs, one needs a
rule. Multiple choices are possible:
– A stationarity criterion for high frequency IMFs: statistical tests, such as Ljung Box Test,
Runs Test, KPSS Test..
– A periodicity criterion for low frequency IMFs: low frequency IMFs must display less than
p pseudo periods within the time interval. Beyond that threshold p, they are considered
as moving too quickly, and not carrying information relative to the general evolution of
the series.
Provided that the goal of this decomposition is to extract a seasonal pattern reflecting the
broad evolution of the stock price, a selection criterion for low frequency IMFs can seem
more appropriate. It is also more intuitive than statistical tests. Therefore, the criterion
chosen is the following:
Figure 3.10: Box & Jenkins decomposition of the AXA stock price based on EMD, starting 01/01/2003 until 14/11/2011
$$\forall i \in [[1; N_{IMF}]],\quad \left(IMF_t^i\right)_{1 \le t \le T} \in \left(X_t^{seasonal}\right)_{1 \le t \le T} \iff \#\Gamma_{0,i}^{[1;T]} \le 3$$

where

$$\#\Gamma_{0,i}^{[1;T]} = \#\left\{ s \in [[1;T]] \mid IMF_s^i = 0 \right\}$$

Therefore, we can now explicitly write the Box & Jenkins decomposition based on the EMD:

$$\exists N_{IMF} \in \mathbb{N}^* \mid \forall t \in [[1;T]],\quad X_t = \underbrace{\sum_{i=1}^{N_{IMF}} IMF_t^i \cdot \mathbf{1}_{\{\#\Gamma_{0,i}^{[1;T]} > 3\}}}_{\text{Random part}} + \underbrace{\sum_{i=1}^{N_{IMF}} IMF_t^i \cdot \mathbf{1}_{\{\#\Gamma_{0,i}^{[1;T]} \le 3\}} + \underbrace{r_t}_{\text{Trend}}}_{\text{Seasonal part}}$$
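Given the IMFs and the residue produced by the EMD, the zero-crossing rule above separates the two parts in a few lines. The sketch below is an illustration; the threshold of 3 zero crossings is the one chosen in the text:

```python
import numpy as np

def split_seasonal_random(imfs, residue, max_zero_crossings=3):
    """Box & Jenkins split of an EMD: IMFs with at most max_zero_crossings
    zero crossings, plus the trend (residue), form the seasonal part;
    the remaining IMFs form the random part."""
    random_part = np.zeros_like(residue)
    seasonal_part = residue.copy()           # the trend joins the seasonal part
    for imf in imfs:
        crossings = np.sum(np.sign(imf[:-1]) != np.sign(imf[1:]))
        if crossings <= max_zero_crossings:
            seasonal_part = seasonal_part + imf
        else:
            random_part = random_part + imf
    return random_part, seasonal_part
```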
Remark 3.1.3 In Figure 3.10, page 23, the "Seasonal Part" explains 91% of the variance of the original time series:

$$R^2 := \frac{Var(X_t^{seasonal} + r_t)}{Var(X_t)} = 0.91$$
3.1.3 Prediction hypotheses
We have now decomposed our signal in two parts: the seasonal component, and the random
high frequency component. These two components are moreover uncorrelated, and the
variance of the seasonal process explains most of the variance of the original time series.
We have the following decomposition:

$$\forall T \in \mathbb{N}, \exists N_{IMF} \in \mathbb{N}^* \mid \forall t \in [[0;T]],\quad X_t = \underbrace{\sum_{i=1}^{N_{sep}} IMF_t^i \cdot \mathbf{1}_{\{\#\Gamma_{0,i}^{[1;T]} > 3\}}}_{\text{Random part}} + \underbrace{\sum_{i=N_{sep}+1}^{N_{IMF}} IMF_t^i \cdot \mathbf{1}_{\{\#\Gamma_{0,i}^{[1;T]} \le 3\}} + \underbrace{r_t}_{\text{Trend}}}_{\text{Seasonal part}}$$
We can now proceed to separate estimations for each process. As we noticed on the earlier example, the Random Part process is approximately centered on zero. Therefore, we make a simple prediction for this process:

$$E\left[\left(X_s^{random}\right)_{T+1 \le s \le 2T} \,\middle|\, \left(X_t^{random}\right)_{1 \le t \le T}\right] = (0)_{T+1 \le s \le 2T}$$
We now have to formulate a prediction of the seasonal process. Following the framework of
Box & Jenkins, two hypotheses are possible, in order to formulate predictions.
3.1.3.1 Deterministic periodicity of low frequency IMFs
The first possible assumption is that the seasonal component is deterministic. Hence, we assume that in the future this periodic component will keep its properties:

– Periodicity: $\forall i \in [[i_0; N_{IMF}]], \forall t \in \mathbb{Z},\ \sum_{j=1}^{T_i} IMF_{t+j}^i = 0$, or $\forall i \in [[i_0; N_{IMF}]], \forall t \in \mathbb{Z},\ IMF_{t+T_i}^i = IMF_t^i$

– IMF structure: $\forall i \in [[i_0; N_{IMF}]], \forall t \in \mathbb{Z},\ \left| \#\Gamma_{max,i}^{[t;t+T_i]} + \#\Gamma_{min,i}^{[t;t+T_i]} - \#\Gamma_{0,i}^{[t;t+T_i]} \right| \le 1$

where

$$\Gamma_{0,i}^{[t;t+T_i]} = \left\{ s \in [t; t+T_i] \mid IMF_s^i = 0 \right\}$$
$$\Gamma_{min,i}^{[t;t+T_i]} = \left\{ s \in [t; t+T_i] \mid \exists u > 0,\ s = \arg\min_{v \in [s-u, s+u]} IMF_v^i \right\}$$
$$\Gamma_{max,i}^{[t;t+T_i]} = \left\{ s \in [t; t+T_i] \mid \exists u > 0,\ s = \arg\max_{v \in [s-u, s+u]} IMF_v^i \right\}$$

Hence, with these properties, each low frequency IMF can easily be prolonged, and so can the estimated future seasonal process.
3.1.3.2 Stochastic periodicity of low frequency IMFs
The second possible assumption is less strong than the first one. Instead of assuming that the seasonal component is deterministic, it is now assumed that it presents some periodicity while remaining stochastic. The best predictor of the future seasonal path is then its conditional expectation given the past path:

$$E\left[\left(X_s^{seasonal} + r_s\right)_{T+1 \le s \le 2T} \,\middle|\, \left(X_t^{seasonal} + r_t\right)_{1 \le t \le T}\right]$$

if we assume that future estimations should rely on sequences from the past of the same duration.
3.2 Insights of potential market predictors
3.2.1 Deterministic periodicity: Low frequency Mean Reverting
Strategy
This strategy relies on the hypothesis of deterministic periodicity of the seasonal process.
To formulate a prediction at a certain time t, within the horizon T, this strategy relies on
the following algorithm:
$$\forall s \in [[t-T; t]],\quad X_s = X_s^{random} + X_s^{seasonal}$$

The seasonal process is then normalized:

$$\forall s \in [[t-T; t]],\quad \tilde X_s^{seasonal} = \log\left(\frac{X_s^{seasonal}}{\bar X^{seasonal}}\right), \quad \text{where } \bar X^{seasonal} = \frac{1}{T+1}\sum_{s=t-T}^{t} X_s^{seasonal}$$

if $\tilde X_t^{seasonal} - 2\tilde X_{t-1}^{seasonal} + \tilde X_{t-2}^{seasonal} > \text{threshold}$,
then $\frac{X_{t+T}^{seasonal}}{X_t^{seasonal}} > 1$ is predicted and $\alpha_{MeanRevertingStrat}(t) = 1$;

else if $\tilde X_t^{seasonal} - 2\tilde X_{t-1}^{seasonal} + \tilde X_{t-2}^{seasonal} < -\text{threshold}$,
then $\frac{X_{t+T}^{seasonal}}{X_t^{seasonal}} < 1$ is predicted and $\alpha_{MeanRevertingStrat}(t) = -1$.
3.2.2 Conditional expectation: Low Frequency Multi Asset Shift-
ing Pattern Recognition Strategy and Mono Asset IMF Pat-
tern Recognition Strategy
3.2.2.1 Low Frequency Multi Asset Shifting Pattern Recognition Strategy
This strategy relies on the hypothesis of stochastic periodicity of the seasonal process. It considers a pool of N assets, among which is the asset to be predicted, $i_0$. To formulate a prediction at a certain time $t_0$, with horizon T, this strategy relies on the following algorithm:
First, each price process is decomposed.
$$\forall i \in [[1; N_{assets}]], \forall s \in [[0; t_0]],\quad X_s^i = X_s^{random,i} + X_s^{seasonal,i}$$

And the asset of interest too, since it belongs to the pool of assets:

$$\forall s \in [[0; t_0]],\quad X_s^{i_0} = X_s^{random,i_0} + X_s^{seasonal,i_0}$$

Then, the three best fitting patterns are chosen:

$$\{(i_1, t_1), (i_2, t_2), (i_3, t_3)\} = \arg\min_{(i,u) \in [[1; N_{assets}]] \times [[1; t_0 - T]]} \left\| \left( \log \frac{X_s^{seasonal,i}}{\bar X^{seasonal,i}_{[u; u+T]}} \right)_{u \le s \le u+T} - \left( \log \frac{X_s^{seasonal,i_0}}{\bar X^{seasonal,i_0}_{[t_0-T; t_0]}} \right)_{t_0-T \le s \le t_0} \right\|^2$$

where

$$\bar X^{seasonal,i_0}_{[t_0-T; t_0]} = \frac{1}{T+1} \sum_{s=t_0-T}^{t_0} X_s^{seasonal,i_0}, \quad \text{and} \quad \bar X^{seasonal,i}_{[u; u+T]} = \frac{1}{T+1} \sum_{s=u}^{u+T} X_s^{seasonal,i}$$

Let

$$Z = \frac{\mathbf{1}_{\{X^{i_1}_{t_1+T} > X^{i_1}_{t_1}\}} + \mathbf{1}_{\{X^{i_2}_{t_2+T} > X^{i_2}_{t_2}\}} + \mathbf{1}_{\{X^{i_3}_{t_3+T} > X^{i_3}_{t_3}\}}}{3}$$

Z is the decision variable. Predictions are made depending on the vote of the three best fitted scenarios:

if $Z > \frac{1}{2}$, then $\frac{X_{t_0+T}}{X_{t_0}} > 1$ is predicted and $\alpha_{ShiftingPatternStrat}(t) = 1$;
else if $Z < \frac{1}{2}$, then $\frac{X_{t_0+T}}{X_{t_0}} < 1$ is predicted and $\alpha_{ShiftingPatternStrat}(t) = -1$.
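A single-asset version of the pattern search (the multi-asset variant simply loops the same scan over every asset in the pool) can be sketched as follows; the function names and window parameters are illustrative:

```python
import numpy as np

def best_matching_windows(seasonal, t0, T, k=3):
    """Start times u of the k past windows [u, u+T] whose normalized
    log-shape is closest (in squared error) to the current window [t0-T, t0]."""
    target = seasonal[t0 - T:t0 + 1]
    target = np.log(target / target.mean())
    scored = []
    for u in range(0, t0 - T):                # candidate past windows
        win = seasonal[u:u + T + 1]
        win = np.log(win / win.mean())
        scored.append((np.sum((win - target) ** 2), u))
    scored.sort()
    return [u for _, u in scored[:k]]

def pattern_vote(prices, starts, T):
    """Decision variable Z and the resulting position: +1 if most matched
    patterns ended higher than they started, -1 if most ended lower."""
    z = np.mean([prices[u + T] > prices[u] for u in starts])
    return 1 if z > 0.5 else (-1 if z < 0.5 else 0)
```

All candidate windows end at or before $t_0$, so the outcome $X_{u+T}$ of each matched pattern is already known when the vote is taken.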
3.2.2.2 Low Frequency Mono Asset IMF Pattern Recognition Strategy
This strategy relies on the hypothesis of stochastic periodicity of the seasonal process. It is very similar to the previous strategy, but differs on two essential points:
– It does not require any other time series than the historical prices of the asset to be predicted.
– It is adapted, in the sense that it only uses information available at time $t_0$.
To formulate a prediction at a certain time $t_0$, with horizon T, this strategy relies on the following algorithm:

$$\forall s \in [[t_0 - T; t_0]],\quad X_s^{i_0} = X_s^{random,i_0} + X_s^{seasonal,i_0}$$

$$\{t_1, t_2, t_3\} = \arg\min_{u \in [[1; t_0 - T]]} \left\| \left( \log \frac{X_s^{seasonal,i_0}}{\bar X^{seasonal,i_0}_{[u; u+T]}} \right)_{u \le s \le u+T} - \left( \log \frac{X_s^{seasonal,i_0}}{\bar X^{seasonal,i_0}_{[t_0-T; t_0]}} \right)_{t_0-T \le s \le t_0} \right\|^2$$

where

$$\bar X^{seasonal,i_0}_{[t_0-T; t_0]} = \frac{1}{T+1} \sum_{s=t_0-T}^{t_0} X_s^{seasonal,i_0}, \quad \text{and} \quad \bar X^{seasonal,i_0}_{[u; u+T]} = \frac{1}{T+1} \sum_{s=u}^{u+T} X_s^{seasonal,i_0}$$

Hence, the decision variable is:

$$Z = \frac{\mathbf{1}_{\{X^{i_0}_{t_1+T} > X^{i_0}_{t_1}\}} + \mathbf{1}_{\{X^{i_0}_{t_2+T} > X^{i_0}_{t_2}\}} + \mathbf{1}_{\{X^{i_0}_{t_3+T} > X^{i_0}_{t_3}\}}}{3}$$

And the predictions computed by the strategy:

if $Z > \frac{1}{2}$, then $\frac{X_{t_0+T}}{X_{t_0}} > 1$ is predicted and $\alpha_{AutoPatternStrat}(t) = 1$;
else if $Z < \frac{1}{2}$, then $\frac{X_{t_0+T}}{X_{t_0}} < 1$ is predicted and $\alpha_{AutoPatternStrat}(t) = -1$.
Chapter 4
Strategies analysis
4.1 Portfolio management
Definition 4.1.1 Let Pt , t ∈ [1; T ] denote the stochastic process of the spot price of an asset.
4.1.1 Trading strategy
Definition 4.1.2 A trading strategy is represented as follows. At each time period, it provides an anticipation of the market: −1 is bearish (i.e. the price will decline), +1 is bullish (i.e. the price will rise).

$$\alpha : [0; T] \to \{-1; 1\}$$
4.1.2 Investment Horizon
An investment duration $T_{invest}$, in terms of business days, drives the predictions of a given strategy. It can range from 10 to 252 business days (from two weeks to a year). Therefore, portfolio management can be driven by mid term or long term earnings prospects.

$$T_{invest} \in [|50; 252|]$$

These prospects drive the PnL of the strategy, as positions will be covered after $T_{invest}$ business days.
Definition 4.1.3

$$\forall \alpha \in \{-1;1\}^{[|1;T|]}, \forall t \in [1;T],\quad PnL_\alpha^{T_{invest}}(t) = \sum_{1 \le i \le t-1} \alpha(i)\, \frac{P_{(i+T_{invest}) \wedge t} - P_i}{P_i}$$
4.1.3 Starting time
The beginning of the time series is not subject to predictions. It is kept as prerequisite information in order to compute the first predictions. Indeed, for the pattern fitting strategies, one needs a few historical patterns available for fitting. Therefore, the predictions start at the time:

$$t_{start} = 10 \cdot T_{invest}$$

Hence, the new PnL vector:

$$\forall \alpha \in \{-1;1\}^{[|1;T|]}, \forall t \in [t_{start}; T],\quad PnL_\alpha^{T_{invest}}(t) = \sum_{t_{start} \le i \le t-1} \alpha(i)\, \frac{P_{(i+T_{invest}) \wedge t} - P_i}{P_i}$$
4.1.4 Trading time span
A trading time span δt defines the duration between two portfolio rebalances. It is defined
in terms of business days. Every δt days, one trade is closed and another position is taken.
By default, and assumed in the back tests, it is equal to a fifth of the investment duration.
Therefore, for example for Tinvest = 252, there will be 5 rebalances per year, one every 50
business days. Hence, in this example, δt = 50.
δt = Tinvest / 5
Therefore, the PnL now becomes:
∀α ∈ {−1; 1}^[|1;T|] , ∀t ,
PnL_α^{Tinvest,δt} (t) = Σ_{0 ≤ i ≤ (T−tstart)/δt} α(i·δt + tstart) · (P_{(Tinvest+i·δt+tstart)∧t} − P_{i·δt+tstart}) / P_{i·δt+tstart}
4.1.5 Annualizing the PnL and reducing its variance
The PnL computed so far still carries a time dependence. It needs to be annualized, with
the following operation:
∀α ∈ {−1; 1}^[|1;T|] , ∀t ,
PnL_α^{Tinvest,δt} (t) = Σ_{0 ≤ i ≤ (T−tstart)/δt} α(i·δt + tstart) · (P_{(Tinvest+i·δt+tstart)∧t} − P_{i·δt+tstart}) / P_{i·δt+tstart} · (252 / Tinvest)
Moreover, it is valuable to reduce the volatility of the returns of a given strategy. If this is not done at the expense of its mean rate of return, it slightly improves the Sharpe ratio. Hence, usual stop losses and cash-ins are implemented in the back tests.
From this section, we now have a framework for deriving a PnL vector from a given strategy α ∈ {−1; 1}^[|1;T|] and a given price process Pt , t ∈ [1; T ]. This PnL now remains to be evaluated with objective criteria.
4.2 Underlying and target market
Three potential trading strategies were identified. They have been tested on four different
types of underlyings, for the following reasons:
- Stocks: CAC40. The goal is to find recurrent seasonality patterns within a single stock,
or between different stocks. This seasonality could be caused by important market shifts on
the initiative of big players, such as pension funds, insurance companies, asset managers, or
banks proceeding to portfolio rebalancing.
- Implied Volatilities: VIX Index, VStoxx Index, VCAC, VDAX. These indices are
computed by a closed formula relying on implied volatilities of numerous options, hence
reflecting the overall structure of the volatility smile.
- Index Pairs: based on the following most liquid worldwide indices: CAC, DAX, SX5E,
SPX, NKY, UKX, IBOV, SMI, HSI. Index pairs provide trajectories which generally follow
mean reverting processes, and have the advantage of being extremely liquid.
- Commodities: WTI. Commodities are known for displaying seasonality features. Therefore,
they may constitute interesting underlyings.
Chapter 5
Results
5.1 Empirical choices
Three strategies have been mentioned in this paper. However, in practice, tests have only
targeted one strategy, for the following reasons:
– The IMF Mean Convexity Reverting Strategy has been tested on a few examples. However,
the calibration of the threshold has proven difficult. Even the seasonal process
remains somewhat unstable, in particular in its last values, where the second order derivative
is computed.
∀s ∈ [[t − T ; t]] , Xs = Xs^random + Xs^seasonal
∀s ∈ [[t − T ; t]] , Xs^seasonal = log ( Xs^seasonal / X̄^seasonal ) , where X̄^seasonal = (1 / (T + 1)) Σ_{s=t−T}^{t} Xs^seasonal
if Xt^seasonal − 2 X_{t−1}^seasonal + X_{t−2}^seasonal > threshold ,
then X_{t+T}^seasonal / Xt^seasonal > 1 and α_MeanRevertingStrat (t) = 1
elseif Xt^seasonal − 2 X_{t−1}^seasonal + X_{t−2}^seasonal < −threshold ,
then X_{t+T}^seasonal / Xt^seasonal < 1 and α_MeanRevertingStrat (t) = −1
– The IMF Multi Asset Shifting Pattern Recognition Strategy is very similar to the Mono
Asset IMF Pattern Recognition Strategy. However, it is harder to implement because it
is a multi asset strategy: results will depend on how much data is used. Moreover,
contrary to the mono asset strategy, the multi asset strategy is not adaptive.
– Due to the small computing capacity available during the project, tests have mainly been
focused on the last strategy developed here, i.e the Mono Asset IMF Pattern Recognition
Strategy. Indeed, the idea is to derive reliable results by testing it on a wide range of
underlyings (the list of which has been provided earlier). That way, the law of large numbers
will help provide reliable results.
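The mean-reversion signal rule of the first strategy above can be sketched as follows. This is an illustrative implementation under the stated assumptions; the function name and the neutral 0 output for the no-signal case are our own additions:

```python
import math

def mean_reverting_signal(seasonal, threshold):
    """Signal of the IMF Mean Convexity Reverting Strategy (sketch).
    seasonal: the seasonal component X^seasonal over the window [t-T, t].
    A convex (resp. concave) recent log-level triggers a bullish
    (resp. bearish) anticipation; otherwise no position (0)."""
    mean = sum(seasonal) / len(seasonal)
    logs = [math.log(x / mean) for x in seasonal]
    # discrete second derivative at the last point of the window
    d2 = logs[-1] - 2.0 * logs[-2] + logs[-3]
    if d2 > threshold:
        return 1
    elif d2 < -threshold:
        return -1
    return 0
```

As the report notes, the instability of the last values of the seasonal component makes the calibration of `threshold` the delicate part in practice.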
5.2 Tables
5.2.1 Volatility: VIX Index
First are shown the most promising results, on the VIX Index. Three different investment
horizons have been tested: 50 days, 150 days, and 250 days. The result for 50 days is the
most reliable, for it relies on the highest number of trades. Bullish signals also seem to
perform better, which is expected, considering the general behavior of implied volatility.
As provided in table 5.1, page 32, the last back test, with 35% cash in, gives a 57% hit ratio,
and a 0.40 annualized Sharpe ratio.
Figure 5.1: Back test on the VIX Index, with a 50 days investment horizon and 35% cash in
Moreover, the results are also encouraging for a longer investment horizon: 150 days.
However, they rely on fewer trades, simply due to the longer horizon. Again, bullish signals
are more powerful. As provided in table 5.2, page 33, the last back test, with 100% cash in,
gives a 54% hit ratio, and a 0.54 annualized Sharpe ratio.
Figure 5.2: Back test on the VIX Index, with a 150 days investment horizon and 100% cash in
Finally, the results are now presented for 250 days. Again, they rely on fewer trades, simply
due to the longer horizon. Again, bullish signals are more powerful. As provided in table 5.3,
page 34, the last back test, with 150% cash in, gives a 66% hit ratio, and a 0.76 annualized
Sharpe ratio. However, it only relies on 15 trades, from 2005 to 2011, where bullish signals
obviously have proven quite effective. Therefore, more tests on longer data histories need to
be pursued.
5.2.2 Volatility: VStoxx Index
To confirm the long term results for the VIX Index, similar tests have been pursued on the
VStoxx Index, see table 5.4, page 35. For the 126 days horizon, results are also encouraging.
Without any cash ins, and with all signals (bullish and bearish), a 56% hit ratio is achieved
on 70 trades, with a 0.20 Sharpe ratio.
5.2.3 Volatility: other indices: aggregate performance
Here are, in table 5.5, page 36, the aggregated results for all the indices tested in the pool.
The number of trades represented is around 1000 for the 150 days table, and 100 for the
250 days table, dating from 2006 to 2011. Therefore, results are very reliable.
Figure 5.3: Back test on the VIX Index, with a 250 days investment horizon and 150% cash in
It seems that the aggregate prediction power on volatility is uncertain. However, results
remain encouraging for the VIX, i.e the most liquid index (via futures or ETFs) among the
volatility indices.
5.2.4 French stocks: CAC 40: Aggregate performance
Aggregated results for the French stock market are quite disappointing, see table 5.6,
page 37.
5.2.5 Equities Indices and trading pairs: Aggregate performance
Tests have also been pursued on the main worldwide equity indices, and their pairs.
Aggregated results on approximately 2000 trades and 20 years of historical prices show that
this strategy does not have any prediction power on this asset class, see table 5.7, page 37.
Figure 5.4: Back test on the VStoxx Index, with a 126 days investment horizon and without cash in
5.2.6 Commodities: West Texas Intermediate (WTI)
Results for the West Texas Intermediate (WTI) are shown in table 5.8, page 38.
Figure 5.6: Back test on French stocks from the CAC 40
Figure 5.7: Back test on a pool of equity indices
Figure 5.8: Back test on the West Texas Intermediate (WTI) oil price
Conclusion and outlook
HHT offers a potentially viable method for nonlinear and nonstationary data analysis. But
in all the cases studied, HHT does not give sharper results than most of the traditional time
series analysis methods. In order to make the method more robust and rigorous in application,
an analytic mathematical foundation is needed.
In our view, the most likely solutions to the problems associated with HHT can only be
formulated in terms of optimization: the selection of the spline, the extension of the ends, etc.
This may be an area of future research.
While this study tries to design some theoretical ground for the HHT, further theoretical
work is greatly needed in this direction.
On the empirical aspect, more research also needs to be pursued. While not particularly
effective on stock prices, the EMD seems better adapted to curves resembling implied
volatilities, and better able to derive meaningful dynamics from them. Strong results have
been reached concerning the main volatility indices, such as the VIX or VStoxx. Therefore,
further empirical tests on this asset class could be rewarding.
Moreover, a great variety of assets has not been tested for prediction: other types of
commodities (only WTI was tested), precious metals, fixed income assets such as sovereign
or corporate bonds.
Finally, significant tests have only been pursued for one strategy among the three that were
formulated. The code provided with this report is able to generate results for the two other
strategies, and can be the basis of wider back tests at an industrial scale.
In terms of applications, this study has limited itself to the first part of the HHT algorithm,
i.e the Empirical Mode Decomposition. Maybe further work can be done in order to properly
formalize the Hilbert spectrum, make new hypotheses, and derive potential predictors using
the same methodology as in this study.
Acknowledgements
This study has been pursued in collaboration with the Equity Quantitative Team at Natixis.
Since our arrival at the Natixis premises, we have been thoroughly assisted. Successful
professionals were kind enough to answer our questions and to give their opinion on our work
during the entire year. Without their advice, this study would not have achieved its current
findings. Our project consisted of working at Natixis every Wednesday, from October 2011 to
March 2012. Workdays were a great opportunity to work within the finance environment,
and to learn about the role of quantitative associates within the banking industry.
First, we would like to thank our supervisor Mr Adil Reghai, Head of Quantitative Analytics
at Natixis. Adil showed much interest in our project, shared our views, and gave us valuable
feedback during the whole year. He helped us design our predictors, and constantly gave
us new ideas for back tests. We would also like to thank Mr Adel Ben Haj Yedder, who
greatly contributed to our project, proofread our reports, and gave us feedback. We also
had the opportunity to discuss with Adel his daily job, the role of the team, and the
banking industry in general. His views will be valuable in helping us refine our
professional plans and goals. We are also thankful to Stephanie Mielnik and Thomas Combarel
for their contributions, and to the team in general.
Moreover, this study was pursued in collaboration with Dr Alex Langnau, Global Head
of Quantitative Analytics at Allianz. Alex is a consultant for Natixis and an academic at
the University of Munich, and told Adil about the Hilbert Huang Transform. During our
project, Alex also gave us valuable feedback, in particular about the portfolio management of
our trading strategies. Also, we would like to thank our teachers at Ecole Centrale Paris, from
the Applied Mathematics Department. Mr Erick Herbin, professor of stochastic processes,
supervised our project. He encouraged us to formalize the Hilbert Huang Algorithm. Despite
being a difficult task, it has proved essential. We are also thankful to Mr Gilles
Faÿ, professor of statistics and time series, for his lectures, which provided important theoretical
grounds for our study.
Finally, we wish to thank our colleagues from the Applied Mathematics Program who
pursued other projects in collaboration with Natixis. We have been working with them
since October, and we enjoyed having breaks with them. To name them: Marguerite de
Mailard, Lucas Mahieux, Nicolas Pai and Victor Gerard.
Bibliography
[1] Barnhart, B. L., The Hilbert-Huang Transform: theory, applications, development,
dissertation, University of Iowa, (2011)
[2] Brockwell, P.J., and Davis, R.A.,Introduction to Time Series and Forecasting, second
edition , Springer-Verlag, New York. (2002)
[3] Cohen, L., Generalized phase space distribution functions, J. Math. Phys. 7, 781 (1966)
[4] Datig, M., Schlurmann, T., Performance and limitations of the Hilbert-Huang trans-
formation (HHT) with an application to irregular water waves, Ocean Engineering, 31,
1783-1834, (2004)
[5] De Boor, C., A Practical Guide to Splines, Revised Edition, Springer- Verlag. (2001)
[6] Dos Passos, W., Numerical methods, algorithms, and tools in C#, CRC Press, (2010)
[7] Faÿ, G., Séries Chronologiques, Lecture notes, Ecole Centrale Paris, (2012)
[8] Flandrin, P., Goncalves, P., Rilling, G., EMD Equivalent Filter Banks, from Interpre-
tation to Applications, in : Hilbert-Huang Transform and Its Applications (N.E. Huang
and S.S.P. Shen, eds.), pp. 57 -74. (2005)
[9] Golitschek, M., On the convergence of interpolating periodic spline functions of high
degree, Numerische Mathematik, 19, 46-154, (1972)
[10] Guhathakurta, K., Mukherjee, I., Chowdhury, A.R., Empirical mode decomposition
analysis of two different financial time series and their comparison, Elsevier, Chaos
Solitons and Fractals, 37, 1214-1227, (2008)
[11] Holder, H.E., Bolch, A.M. and Avissar, R., Using the Empirical Mode Decomposition
(EMD) method to process turbulence data collected on board aircraft, Submitted to J.
Atmos. Ocean. Tech., (2009)
[12] Hong, L., Decomposition and Forecast for Financial Time Series with High-frequency
Based on Empirical Mode Decomposition, Elsevier, Energy Procedia, 5, 1333-1340, (2011)
[13] Huang, N.E., Shen, S.S.P., Hilbert-Huang transform and its applications, Volume 5 of
interdisciplinary mathematical sciences, (2005)
[14] Huang, N.E., Shen, Z., Long, S., Wu, M., Shih, H., Zheng, Q., Yen, N., Tung, C. and
Liu, H., The empirical mode decomposition and the Hilbert spectrum for nonlinear and
non-stationary time series analysis, Proc. R. Soc. Lond. A, 454, 903-995, (1998)
[15] Huang, N. E., Wu, Z., A review on Hilbert-Huang transform: Method and its applications
to geophysical studies, Rev. Geophys., 46, RG2006, (2008)
[16] Liu, B. , Riemenschneider, S. ,Xu, Y., Gearbox, fault diagnosis using empirical mode
decomposition and Hilbert spectrum, Mechanical Systems and Signal Processing, 20, 718-
734, (2006)
[17] Pan, H., Intelligent Finance - General Principles, International Workshop on Intelligent
Finance, Chengdu, China, (2007)
[18] Reghai, A., Goyon, S., Messaoud, M., Anane, M., Market Predictor : Prédiction
quantitative des tendances des marchés, Etude Stratégie Quant Recherche Actions,
Natixis Securities, Paris (2010)
[19] Reghai, A., Goyon, S., Combare, T., Ben Haj Yedder, A., Mielnik, S., Sharpe
Select : optimisation de l'investissement Cross Asset, Etude Stratégie Quant Recherche
Quantitative, Natixis Securities, Paris (2011)
Appendix A
Time series Prerequisites
A.1 Stationary and linear processes
A.1.1 Stationarity
Definition A.1.1 A time series is a stochastic process in discrete time, Xt ∈ R, t ∈ Z for
example. Thus, a time series is composed of realizations of a single statistical variable during
a certain time interval (for example a month, a quarter, a year, or a nanosecond).
We can expect to develop some interesting predictions if the process displays certain
structural properties:
– either some "rigidity", allowing extrapolation of some deterministic parts;
– or some form of statistical invariance, called stationarity, allowing one to learn from the
past in order to predict the future.
Definition A.1.2 Xt ∈ R, t ∈ Z is said to be strictly stationary iff its finite dimensional
distributions are invariant under any time translation, i.e:
∀τ ∈ Z,∀n ∈ N∗ , ∀(t1 , .., tn ) ∈ Zn , (Xt1 , .., Xtn ) ∼ (Xt1 −τ , .., Xtn −τ )
Definition A.1.3 Xt ∈ R, t ∈ Z is said to be stationary at the second order iff:
– (Xt )t∈Z ∈ L2 (R), i.e ∀t ∈ Z, E [Xt2 ] < ∞
– ∀t ∈ Z, E [Xt ] = E [X0 ] := µX
– ∀s, t ∈ Z, γX (t, s) := Cov (Xt , Xs ) = Cov (X0 , Xs−t ) =: γX (s − t)
Definition A.1.4 The autocorrelation function of a stochastic process Xt ∈ R, t ∈ Z is the
series
ρ(s, t) = Cov (Xt , Xs ) / (Var (Xs ) · Var (Xt ))^{1/2}
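The empirical counterpart of Definition A.1.4 for a single stationary series can be sketched as follows (a hypothetical helper, using the population variance and the usual 1/n normalization of the sample autocovariance):

```python
def sample_autocorrelation(x, h):
    """Sample autocorrelation at lag h of a (supposedly stationary)
    series, estimating rho(t, t + h) of Definition A.1.4."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    cov = sum((x[t] - mean) * (x[t + h] - mean) for t in range(n - h)) / n
    return cov / var
```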
A.1.2 Linearity
Within the family of stationary processes, an important family of processes is known as the
linear processes. They are derived from white noise processes.
Definition A.1.5 A stochastic process Xt ∈ R, t ∈ Z is a weak white noise iff it is stationary
at the second order, and:
µX = 0
∀h ∈ Z, γX (h) = σ 2 .δ0 (h)
Definition A.1.6 A stochastic process Xt ∈ R, t ∈ Z is a strong white noise iff it is i.i.d
and µX = 0.
Hence, second order linear processes can now be defined.
Definition A.1.7 Xt ∈ R, t ∈ Z is a weak (resp. strong) second order linear process iff
∃(Zt )t∈Z , ∃(ψj )j∈Z , such that:
(Zt ) weak (resp. strong) White Noise (σ 2 )
Σ_{j∈Z} |ψj | < ∞
∀t ∈ Z, Xt = Σ_{j∈Z} ψj Zt−j
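A truncated version of such a process (an MA(q), keeping finitely many coefficients ψj) can be simulated as follows. The function name and the Gaussian choice for the strong white noise are our own assumptions:

```python
import random

def linear_process(psi, n, seed=0):
    """Sample path of X_t = sum_j psi_j Z_{t-j}, with (Z_t) a strong
    Gaussian white noise and the sum truncated to len(psi) terms."""
    rng = random.Random(seed)
    q = len(psi)
    # draw enough noise so that Z_{t-j} exists for every t and j
    z = [rng.gauss(0.0, 1.0) for _ in range(n + q)]
    return [sum(psi[j] * z[t + q - 1 - j] for j in range(q))
            for t in range(n)]
```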
Second order linear processes are well known and studied. It may seem excessively restrictive
to study only these kinds of linear processes. However, Wold's decomposition provides a
strong result for them:
A.1.3 Wold's decomposition
Every second order stationary process Xt ∈ R, t ∈ Z can be written as the sum of a second
order linear process and a deterministic component:
∀t ∈ Z, Xt = Σ_{j∈Z} ψj Zt−j + η(t)
where:
(Zt ) weak White Noise (σ 2 )
Σ_{j∈Z} |ψj | < ∞
η ∈ R^Z deterministic
Hence, basic linear processes, such as ARMA models, provide a strong base for explaining
stationary processes. However, the stationarity assumption remains quite restrictive.
A.2 The particular case of financial time series: parametric and non parametric extensions
A.2.1 Non-stationary and non linear financial time series
Financial time series are known for displaying a few characteristics unknown to stationary or
linear processes:
– Fat tails: their fat-tailed distributions are incompatible with Gaussian density functions.
They are more accurately fitted by power laws, i.e processes of infinite variance. These
processes are not utilized in practice, because a measure of volatility (i.e variance) is
paramount in finance (for example, in order to price options, or to compute Sharpe ratios
of indices, stocks or strategies; see our definitions in chapter 4).
– Non-linearity: they display non constant variance. Clusters of volatility are common in
financial time series. These clusters are incompatible with linear and stationary processes
like ARMA (which have a constant variance).
– Non-stationarity: they have a long term memory.
– Time inversion: linear stationary processes are invariant under time reversal. However, a
financial time series is obviously coherent with only one time direction, and is not consistent
if time is reversed.
A.2.2 Parametric processes for financial time series
To tackle the issue of non linearity, popular parametric models are ARCH(p) and GARCH
(p,q).
Definition A.2.1 Xt ∈ R, t ∈ Z is defined as an ARCH(p) process by:
Xt = σt Zt
σt^2 = ψ0 + Σ_{j=1}^{p} ψj X^2_{t−j}
where:
∀j ∈ [[1; p]] , ψj > 0
Zt iid(0, 1)
Definition A.2.2 Xt ∈ R, t ∈ Z is defined as a GARCH(p, q) process by:
Xt = σt Zt
σt^2 = ψ0 + Σ_{j=1}^{p} ψj X^2_{t−j} + Σ_{j=1}^{q} ϕj σ^2_{t−j}
where:
∀j ∈ [[1; p]] , ψj > 0
∀j ∈ [[1; q]] , ϕj > 0
Zt iid(0, 1)
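Definition A.2.2 can be simulated directly. This sketch (hypothetical names, Gaussian innovations) also covers ARCH(p) as the special case q = 0:

```python
import random

def simulate_garch(psi0, psi, phi, n, seed=0):
    """Sample path of a GARCH(p, q) process:
    X_t = sigma_t Z_t,
    sigma_t^2 = psi0 + sum_j psi_j X_{t-j}^2 + sum_j phi_j sigma_{t-j}^2."""
    rng = random.Random(seed)
    p, q = len(psi), len(phi)
    x2 = [0.0] * p       # past squared observations X_{t-j}^2
    s2 = [psi0] * q      # past conditional variances sigma_{t-j}^2
    out = []
    for _ in range(n):
        sigma2 = (psi0
                  + sum(psi[j] * x2[j] for j in range(p))
                  + sum(phi[j] * s2[j] for j in range(q)))
        x = sigma2 ** 0.5 * rng.gauss(0.0, 1.0)
        out.append(x)
        x2 = [x * x] + x2[:-1]       # shift the lag windows
        if q:
            s2 = [sigma2] + s2[:-1]
    return out
```

With Σψj + Σϕj < 1 the conditional variance mean-reverts, which is what produces the volatility clusters mentioned above.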
A.2.3 Non parametric processes for financial time series
Numerous non parametric methods have been and are being developed to fit financial data.
The goal here is not to mention all of them. The Empirical Mode Decomposition lives within
this environment.
A.3 General time series: the Box & Jenkins approach for prediction
Within the framework of Box and Jenkins (1970), a time series can be modeled as the
realization of three simultaneous phenomena:
– The first, ηt , is a regular and smooth time evolution, called the trend.
– The second, St , is a periodic process of period T.
– The third component, Wt , is the random component. It can be a stationary process.
∀t ∈ Z, Xt = Wt + St + ηt
Definition A.3.1 St is a periodic process with period T iff:
∀t ∈ Z, St+T = St
Σ_{t=1}^{T} St = 0
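A naive decomposition in the spirit of this framework can be sketched as follows. This is an illustration with our own names (a moving-average trend and periodic means), not the method used in the report:

```python
def decompose(x, period):
    """Naive Box & Jenkins style split x_t = W_t + S_t + eta_t:
    eta is a centred moving-average trend, S the mean seasonal pattern
    (normalised to sum to zero over one period), W the remainder."""
    n = len(x)
    half = period // 2
    # centred moving average, shrinking the window at the edges
    trend = []
    for t in range(n):
        lo, hi = max(0, t - half), min(n, t + half + 1)
        trend.append(sum(x[lo:hi]) / (hi - lo))
    detrended = [x[t] - trend[t] for t in range(n)]
    # average the detrended values at each phase of the period
    means = []
    for k in range(period):
        vals = detrended[k::period]
        means.append(sum(vals) / len(vals))
    offset = sum(means) / period       # enforce sum(S) = 0 over a period
    seasonal = [means[t % period] - offset for t in range(n)]
    noise = [x[t] - trend[t] - seasonal[t] for t in range(n)]
    return trend, seasonal, noise
```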
From this framework, we will connect the EMD algorithm to the time series literature.
Some assumptions will be made to match this approach, and they will drive the prediction
algorithms formulated in this report.
Appendix B
Evaluation criteria of backtests
In order to evaluate the efficiency of the potential highlighted strategies, a few rules need to
be followed. As mentioned above, adaptability is the main one. Every back test ought to be
implemented as if the future were unknown. Therefore, it implies some obligations:
– a time frame restriction: every prediction must be computed without using any feature of
its future values;
– a class restriction: predictions must be evaluated on an aggregate basis, for every kind of
underlying. For example, no discrimination of the best or worst performers should be done,
because it is another form of non-adaptability.
However, it remains pertinent to evaluate the performance of the strategies regarding the
asset class to which they are applied. Hence, as it will be mentioned further on, some
strategies might be more efficient on stocks, trading pairs, or implied volatilities.
Within asset management theory, a few variables suffice to quantify the efficacy of a
trading strategy. They are usually defined as follows.
Let X be the random variable which denotes the annualized gain (or loss) of each trade of a
trading strategy. Its realizations are written Xi , i = 1..n.
Definition B.0.2
Average return = E [X]
Average gain = E [X|X > 0]
Average loss = E [X|X < 0]
Drawdown = - min {Xi , i = 1..n}
Hit ratio = P (X > 0)
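Definition B.0.2 translates directly into code. This is a hypothetical helper; note that the drawdown here is the report's per-trade definition, −min Xi, not the usual path-wise drawdown:

```python
def backtest_metrics(returns):
    """Per-trade evaluation criteria of Definition B.0.2, estimated
    from the realized annualized returns X_i of each trade."""
    n = len(returns)
    gains = [r for r in returns if r > 0]
    losses = [r for r in returns if r < 0]
    return {
        "average_return": sum(returns) / n,
        "average_gain": sum(gains) / len(gains) if gains else 0.0,
        "average_loss": sum(losses) / len(losses) if losses else 0.0,
        "drawdown": -min(returns),       # worst single trade
        "hit_ratio": len(gains) / n,
    }
```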
These values need to be analysed together. A hit ratio above 50% should be compared with
the average gain and loss. The drawdown also provides valuable information on the risk of
the trading strategy, and gives a hint of the Sharpe ratio. Drawdowns are a valuable tool in
order to calibrate stop-loss thresholds.
Definition B.0.3
Information ratio = E [X − T ] / √Var (X − T )
with T as a benchmark performance rate. In the results displayed in the annexes, the
benchmark performance rate plugged in the computations of Information ratios is 0%.
Definition B.0.4
Sortino ratio = E [X − T ] / √Var (X|X < T )
with T as a benchmark performance rate. In our results displayed in the annexes, the
benchmark performance rate in our computations of Sortino ratios is 0%. Therefore, this
measurement only takes into account ”negative volatility”, i.e the volatility of losses.
Definition B.0.5
Sharpe ratio = E [X − rf ] / √Var (X)
with rf as the annualized risk free rate
The analyses here rely mainly on the information ratio and the Sortino ratio. However,
in light of the current risk free rates, the information ratio can be considered a good
approximation of the Sharpe ratio.
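Definitions B.0.3 to B.0.5 can be computed together. This sketch uses standard deviations in the denominators and our own function name, and reproduces the observation above that with T = rf = 0% the information ratio equals the Sharpe ratio:

```python
def performance_ratios(returns, benchmark=0.0, risk_free=0.0):
    """Information, Sortino and Sharpe ratios of Definitions B.0.3-B.0.5,
    estimated from realized per-trade annualized returns."""
    n = len(returns)

    def std(xs):
        if len(xs) < 2:
            return float("nan")
        m = sum(xs) / len(xs)
        return (sum((v - m) ** 2 for v in xs) / len(xs)) ** 0.5

    excess = [r - benchmark for r in returns]
    downside = [r - benchmark for r in returns if r < benchmark]
    rf_excess = [r - risk_free for r in returns]
    info = (sum(excess) / n) / std(excess)
    sortino = (sum(excess) / n) / std(downside)  # "negative volatility" only
    sharpe = (sum(rf_excess) / n) / std(rf_excess)
    return info, sortino, sharpe
```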