FINAL REPORT
Application of the Hilbert Huang Transform to the prediction of financial time series
Cyrille BEN LEMRID, Hadrien MAUPARD
Natixis supervisor: Adil REGHAI
Academic supervisor: Erick HERBIN
École Centrale Paris
March 18, 2012
A.2.3 Non parametric processes for financial time series
A.3 General time series: the Box & Jenkins approach for prediction
B Evaluation criteria of backtests
Introduction

This report presents the research work of Cyrille Ben Lemrid and Hadrien Maupard on the Hilbert Huang Transform.

The Hilbert Huang Transform relies on two steps: first a non-parametric Empirical Mode Decomposition, which decomposes the signal into Intrinsic Mode Functions (semi-periodic functions) of various frequencies, and then a Hilbert Decomposition, which projects the IMFs onto a three-dimensional time-frequency graph. The details of the algorithm are explained in the first chapter of this report.

Due to the lack of theoretical formulation of the latter, and in order to keep our algorithm flexible and simple, we will only use the Huang Transform, i.e. the Empirical Mode Decomposition (EMD).

In finance, it is well known that the usual tools for the prediction of time series are powerless. Stationary and linear models, such as ARIMA processes, are unable to predict financial time series, which display non-stationarity and long memory. Hence, some extensions exist, parametric or non-parametric. The EMD belongs to the non-parametric predictors of non-linear, non-stationary time series.

In chapter 2, based on empirical observations, interesting stylized facts are derived: IMFs are uncorrelated to each other. Low frequency IMFs are periodic and explain most of the variance of the original time series. With their smooth and regular form, they still capture most of the information of the time series. High frequency IMFs are closer to random processes and have some stationarity. These facts connect the EMD with the Box & Jenkins statistical framework: a time series can be seen as the sum of a semi-periodic or seasonal process (the low frequency IMFs) and a random semi-stationary process (the high frequency IMFs).

In chapter 3, two categories of predictors are introduced, relying on two hypotheses on the seasonal process: either it is deterministic and can be prolonged, or it remains stochastic and conditional expectation is the best predictor. The hypothesis of a deterministic seasonal process gives one strategy: the Low Frequency Mean Reverting Strategy. The hypothesis of stochastic periodicity of the seasonal process gives two strategies: a Low Frequency Multi Asset Shifting Pattern Recognition Strategy and a Low Frequency Mono Asset Shifting Pattern Recognition Strategy.

In chapter 4, the backtest method of the strategies is formulated, and the underlyings for the backtests are chosen: implied volatilities, stocks, indices and trading pairs, commodities. Finally, the results of these backtests are commented.

In the Annex, prerequisites about time series and the asset management literature are given.
Chapter 1
Description of the Hilbert Huang Transform, model overview

The Hilbert-Huang transform (HHT) is an empirically based data analysis method, which is performed in two steps: first, some descriptive patterns are extracted by performing an adaptive decomposition called Empirical Mode Decomposition (Huang Transform); then, the local behavior of these patterns is captured with tools coming from Hilbert Spectral Analysis (Hilbert Transform).

1.1 The Empirical Mode Decomposition

The Empirical Mode Decomposition is based on the assumption that any data consists of different simple intrinsic modes of oscillation. Each of these oscillatory modes is represented by an intrinsic mode function (IMF), which satisfies two conditions:
– In the whole data set, the number of zero crossings and the number of extrema must be equal or differ at most by one.
– There exist two envelopes, one passing through the local maxima and the other through the local minima, such that at any point the mean value of the two envelopes is zero.

Definition 1.1.1 An $\mathbb{R}$-valued process $x(t)$, $t \in I$, is called an IMF (Intrinsic Mode Function) if it is a continuous process that satisfies the following conditions:

1. The number of extrema and the number of zero crossings must be equal or differ at most by one:
\[ \left| \#\Gamma_{max} + \#\Gamma_{min} - \#\Gamma_0 \right| \le 1 \]
with
\[ \Gamma_0 = \{ t \in I \mid x(t) = 0 \} \]
\[ \Gamma_{max} = \{ t \in I \mid \exists u > 0, \forall s \in \, ]t-u, t+u[ \, \setminus \{t\}, \; x(t) > x(s) \} \]
\[ \Gamma_{min} = \{ t \in I \mid \exists u > 0, \forall s \in \, ]t-u, t+u[ \, \setminus \{t\}, \; x(t) < x(s) \} \]

2. The mean value $m(t) = (x_{sup}(t) + x_{inf}(t))/2$ of the envelope defined by the local maxima, $x_{sup}(t)$, and the envelope defined by the local minima, $x_{inf}(t)$, is zero:
\[ \exists x_{sup} \in Env(\Gamma_{max}), \; \exists x_{inf} \in Env(\Gamma_{min}), \; \forall t \in I, \; m(t) = 0 \]
with
\[ Env(\Gamma_{max}) = \{ f \in C(I) \mid \forall t \in \Gamma_{max}, \; f(t) = x(t) \} \]
\[ Env(\Gamma_{min}) = \{ f \in C(I) \mid \forall t \in \Gamma_{min}, \; f(t) = x(t) \} \]

An IMF represents a simple oscillatory mode as a counterpart to the simple harmonic function, but it is much more general: instead of the constant amplitude and frequency of a simple harmonic component, the IMF can have an amplitude and a frequency that vary with time.

The first condition is clearly necessary for oscillatory data; the second condition requires that the upper and lower envelopes of the IMF be symmetric with respect to the x-axis.

The idea of the EMD method is to separate the data into a slowly varying local mean part and a fast varying symmetric oscillation part. The oscillation part becomes the IMF and the local mean becomes the residue. The residue then serves as input data for further decomposition, and the process repeats until no more oscillation can be separated from the residue. At each step of the decomposition, since the upper and lower envelopes of the IMF are initially unknown, a repetitive sifting process is applied to approximate the envelopes with cubic spline functions passing through the extrema of the IMF. The data serves as the initial input of the IMF sifting process, and the refined IMF is the difference between the previous version and the mean of the envelopes. The process repeats until the predefined stop condition is satisfied. The residue is then the difference between the data and the improved IMF.

One big advantage of this procedure is that it can deal with data from nonstationary and nonlinear processes. The method is direct and adaptive: its basis is defined a posteriori, derived from the data by the decomposition itself.

The intrinsic mode components are extracted in the following steps:
1. Take an arbitrary input signal $x(t)$ and initialize the residual: $r_0(t) = x(t)$, $i = 1$.
2. Extract the $i$-th IMF (steps 3 to 8).
3. Initialize the "proto-IMF" $h_0$ with $h_0(t) = r_{i-1}(t)$, $k = 1$.
4. Extract the local maxima and minima of the "proto-IMF" $h_{k-1}(t)$.
5. Interpolate the local maxima and the local minima by cubic splines to form the upper and lower envelopes of $h_{k-1}(t)$.
6. Calculate the mean $m_{k-1}(t)$ of the upper and lower envelopes of $h_{k-1}(t)$.
7. Define $h_k(t) = h_{k-1}(t) - m_{k-1}(t)$.
8. If the IMF criteria are satisfied, then set $IMF_i(t) = h_k(t)$; else go to (4) with $k = k + 1$.
9. Define $r_i(t) = r_{i-1}(t) - IMF_i(t)$.
10. If $r_i(t)$ still has at least two extrema, then go to (2) with $i = i + 1$; else the decomposition is completed and $r_i(t)$ is the "residue" of $x(t)$.
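The ten steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this report: it relies on NumPy and on `scipy.interpolate.CubicSpline`, it adds the two end points of the series as knots of both envelopes so that the splines cover the whole time span, and it replaces the strict IMF test of step 8 by the relative-energy criterion of section 2.2.1 with a natural logarithm (the base is not stated in the report). The names `local_extrema`, `envelope_mean`, `emd`, `max_sift` and `max_imfs` are hypothetical.

```python
import numpy as np
from scipy.interpolate import CubicSpline


def local_extrema(h):
    """Indices of the strict local maxima and minima of a 1-D array."""
    d = np.diff(h)
    maxima = np.where((d[:-1] > 0) & (d[1:] < 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] > 0))[0] + 1
    return maxima, minima


def envelope_mean(h, maxima, minima):
    """Mean of the cubic-spline envelopes (steps 5 and 6).

    Assumption: both end points are used as extra knots of each envelope.
    """
    t = np.arange(len(h))
    ends = np.array([0, len(h) - 1])
    upper_knots = np.union1d(maxima, ends)
    lower_knots = np.union1d(minima, ends)
    upper = CubicSpline(upper_knots, h[upper_knots], bc_type="natural")(t)
    lower = CubicSpline(lower_knots, h[lower_knots], bc_type="natural")(t)
    return (upper + lower) / 2.0


def emd(x, log_threshold=-15.0, max_sift=50, max_imfs=10):
    """Return (list of IMFs, residue) following steps 1 to 10."""
    residue = np.asarray(x, dtype=float).copy()        # step 1: r_0 = x
    imfs = []
    while len(imfs) < max_imfs:
        maxima, minima = local_extrema(residue)
        if len(maxima) + len(minima) < 2:              # step 10: stop
            break
        h = residue.copy()                             # step 3: h_0 = r_{i-1}
        for _ in range(max_sift):
            maxima, minima = local_extrema(h)          # step 4
            if len(maxima) + len(minima) < 2:
                break
            m = envelope_mean(h, maxima, minima)       # steps 5 and 6
            ratio = np.sum(m ** 2) / np.sum(h ** 2)
            h = h - m                                  # step 7
            # Step 8, relaxed: stop once the envelope mean is negligible.
            if ratio == 0.0 or np.log(ratio) <= log_threshold:
                break
        imfs.append(h)
        residue = residue - h                          # step 9
    return imfs, residue
```

By construction the IMFs and the residue add back to the input, which gives a quick sanity check of any implementation: `np.allclose(sum(imfs) + residue, x)` should hold.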
Figure 1.1: Sifting process of the empirical mode decomposition: (a) an arbitrary input; (b) identified maxima (diamonds) and minima (circles) superimposed on the input; (c) upper envelope and lower envelope (thin solid lines) and their mean (dashed line); (d) prototype intrinsic mode function (IMF) (the difference between the bold solid line and the dashed line in panel (c)) that is to be refined; (e) upper envelope and lower envelope (thin solid lines) and their mean (dashed line) of a refined IMF; and (f) remainder after an IMF is subtracted from the input.

Once a signal has been fully decomposed, the signal $x(t)$ can be written as
\[ x(t) = \sum_{i=1}^{N} IMF_i(t) + r(t) \]

1.2 Closed form formulas for IMFs

Rather than a Fourier or wavelet based transform, the Hilbert transform is used in order to compute instantaneous frequencies and amplitudes and to describe the signal more locally. The equation below displays the Hilbert transform $Y_t$, which can be written for any function $x(t)$ of $L^p$ class. PV denotes the Cauchy principal value of the integral.
\[ Y_t = H[IMF_t] = \frac{1}{\pi} \, PV \int_{-\infty}^{+\infty} \frac{IMF_s}{t-s} \, ds = \frac{1}{\pi} \lim_{\varepsilon \to 0} \left( \int_{-\infty}^{t-\varepsilon} \frac{IMF_s}{t-s} \, ds + \int_{t+\varepsilon}^{+\infty} \frac{IMF_s}{t-s} \, ds \right) \]
Algorithm 1 Empirical Mode Decomposition
Require: Signal, threshold ∈ R+
    curSignal ← Signal, i ← 1
    while numberOfExtrema(curSignal) > 2 do
        curImf ← curSignal
        while isNotAnImf(curImf, threshold) do
            Γmax ← emdGetMaxs(curImf)
            Γmin ← emdGetMins(curImf)
            Γmax ← emdMaxExtrapolate(curImf, Γmax)
            Γmin ← emdMinExtrapolate(curImf, Γmin)
            xsup ← emdInterpolate(curImf, Γmax)
            xinf ← emdInterpolate(curImf, Γmin)
            bias ← (xsup + xinf)/2
            curImf ← curImf − bias
        end while
        IMF_i ← curImf
        curSignal ← curSignal − IMF_i
        i ← i + 1
    end while
    N ← i − 1
    residual ← curSignal
    return (IMF_i), i = 1..N

An analytic function can be formed with the Hilbert transform pair:
\[ Z_t = IMF_t + i Y_t = A_t e^{i \theta_t} \]
where
\[ A_t = \sqrt{IMF_t^2 + Y_t^2}, \qquad \theta_t = \arctan\left( \frac{Y_t}{IMF_t} \right) \]
$A_t$ and $\theta_t$ are the instantaneous amplitude and phase functions, respectively.

The instantaneous frequency $f_t$ can then be written as the time derivative of the phase:
\[ f_t = \frac{1}{2\pi} \frac{d\theta_t}{dt} \]
Hence, an IMF can be expressed analytically:
\[ IMF_t = A_t \cos\left( 2\pi \int_0^t f_s \, ds + \psi \right) \qquad (1.1) \]

[11] and [14] showed that not all functions give "good" Hilbert transforms, meaning transforms which produce physically meaningful instantaneous frequencies. The signals which can be analyzed using the Hilbert transform must be restricted so that their calculated instantaneous frequency functions have physical meaning.
The empirical mode decomposition is essentially an algorithm which decomposes nearly any signal into a finite set of functions which have "good" Hilbert transforms, i.e. which produce physically meaningful instantaneous frequencies.

After the IMFs have been obtained from the EMD method, one can further calculate the instantaneous phase of each IMF by applying the Hilbert transform to each IMF component.
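As an illustration of the formulas of section 1.2, the sketch below computes the instantaneous amplitude $A_t$, phase $\theta_t$ and frequency $f_t$ of a sampled IMF. It is only a sketch under assumptions: the analytic signal is obtained with the usual FFT construction (negative frequencies set to zero, positive frequencies doubled) using NumPy, the phase is taken with the four-quadrant arctangent and unwrapped, and the derivative is a finite difference. The function names and the sampling step `dt` are hypothetical.

```python
import numpy as np


def analytic_signal(x):
    """Discrete analytic signal Z = x + iY, where Y is the Hilbert transform of x."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep the mean
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist bin
        weights[1:n // 2] = 2.0           # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)


def instantaneous_attributes(imf, dt=1.0):
    """Return (A_t, theta_t, f_t) for an IMF sampled every dt time units."""
    z = analytic_signal(imf)
    amplitude = np.abs(z)                                 # A_t = sqrt(IMF^2 + Y^2)
    phase = np.unwrap(np.angle(z))                        # theta_t, made continuous
    frequency = np.gradient(phase, dt) / (2.0 * np.pi)    # f_t = (1/2pi) dtheta/dt
    return amplitude, phase, frequency
```

For a pure cosine with an integer number of periods in the window, the amplitude is constant and the instantaneous frequency equals the frequency of the cosine, which is a convenient check before applying the function to real IMFs.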
Chapter 2
State of the Art

2.1 Application fields of the EMD

The Empirical Mode Decomposition can be a powerful tool to separate non-linear and non-stationary time series into a trend (the residue function) and oscillations (the IMFs) on different time scales. It describes the frequency components locally and adaptively for nearly any oscillating signal, which makes the tool extremely versatile. This decomposition finds applications in many fields where Fourier analysis or wavelet methods traditionally dominate. For instance, the HHT has been used to study a wide variety of data including rainfall, earthquakes, sunspot number variation, heart-rate variability, financial time series, and ocean waves, to name a few subjects. But there are still some mathematical issues related to this decomposition which have been mostly left untreated: convergence of the method, optimization problems (the best IMF selection and the uniqueness of the decomposition), and spline problems (the best spline functions for the HHT). In the following sections, these open issues are developed, and the current potential solutions from the literature are gathered.

2.2 Existence and uniqueness of the Decomposition

The convergence of the proto-IMF sequence $(h_k)_{k \ge 0}$ to an IMF is equivalent to the convergence of the bias $(m_k)_{k \ge 0}$ to zero:
\[ m_k \xrightarrow[k \to \infty]{L^2} 0 \]
where $m_{k-1}(t) = h_{k-1}(t) - h_k(t)$.
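In practice this convergence can be monitored numerically after each sifting iteration. The sketch below computes the ratio $\|m_{k-1}\|^2 / \|h_{k-1}\|^2$, which is exactly the normalized squared difference $SD_k$ between two successive proto-IMFs since $m_{k-1} = h_{k-1} - h_k$, and compares its logarithm to a threshold. The function names are hypothetical; the default threshold of -15 is the value quoted in section 2.2.1, and the use of the natural logarithm is an assumption, as the report does not state the base.

```python
import math


def sd_ratio(h_prev, h_next):
    """SD_k = sum |h_{k-1} - h_k|^2 / sum h_{k-1}^2 over the sample times."""
    num = sum((a - b) ** 2 for a, b in zip(h_prev, h_next))
    den = sum(a ** 2 for a in h_prev)
    return num / den


def sifting_converged(h_prev, h_next, log_threshold=-15.0):
    """True when log(SD_k) <= threshold (natural log assumed)."""
    sd = sd_ratio(h_prev, h_next)
    return sd == 0.0 or math.log(sd) <= log_threshold
```

A large correction between two iterations keeps the sifting loop running, while a correction of relative size around 1e-5 in amplitude (1e-10 in energy) is enough to stop it with this threshold.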
2.2.1 Stoppage criteria

The inner loop should end when the result of the sifting process meets the definition of an IMF. In practice this condition is too strong, so we need to specify a relaxed condition which can be met in a finite number of iterations. The approximate local envelope symmetry condition in the sifting process is called the stoppage (of sifting) criterion. In the past, several different types of stoppage criteria were adopted. The most widely used type, which originated from Huang et al. [14], is given by a Cauchy type of convergence test: the normalized Squared Difference between two successive sifting operations, defined as
\[ SD_k = \frac{\sum_{t=0}^{T} |h_{k-1}(t) - h_k(t)|^2}{\sum_{t=0}^{T} h_{k-1}^2(t)} \]
must be smaller than a predetermined value. This definition is slightly different from the one given by Huang et al. [14]: the summation signs operate on the numerator and the denominator separately, in order to prevent $SD_k$ from becoming too dependent on local small amplitude values of the sifting time series.

If we assume that the local mean between the upper and lower envelopes converges to zero in the sense of the Euclidean norm, we can apply the following Cauchy criterion:
\[ \log \frac{\| m_{k-1} \|_{L^2}^2}{\| h_{k-1} \|_{L^2}^2} \le \text{threshold} \]
In our implementation the threshold has been calibrated to -15.

These Cauchy types of stoppage criteria are seemingly rigorous mathematically. However, they are difficult to implement for the following reasons. First, there is no principled answer to how small is small enough. Second, the criterion does not depend on the definition of the IMFs: the squared difference might be small, but there is no guarantee that the function will have the same numbers of zero crossings and extrema.

2.2.2 Cubic spline interpolation

Since the EMD is an empirical algorithm and involves a prescribed stoppage criterion to carry out the sifting moves, we have to know how sensitive the decomposition of an input is to the sifting process, so that the reliability of a particular decomposition can be determined. Therefore, a confidence limit of the EMD is a desirable quantity. To compute the upper and lower envelopes we use a piecewise-polynomial approximation.

In general, the goal of a spline interpolation is to create a function which achieves the best possible approximation to a given data set. For a smooth and efficient approximation, one has to choose high order polynomials. A popular choice is the piecewise cubic approximation, i.e. polynomials of degree three.
The basic idea behind using a cubic spline is to fit a piecewise function of the form:
\[ S(x) = \begin{cases} S_0(x), & x \in [x_0, x_1[ \\ S_1(x), & x \in [x_1, x_2[ \\ \quad \vdots \\ S_{n-1}(x), & x \in [x_{n-1}, x_n[ \end{cases} \]
where $S_i(x)$ is a third degree polynomial with coefficients $a_i$, $b_i$, $c_i$ and $d_i$, defined for $i = 0, 1, ..., n-1$ by:
\[ S_i(x) = \left( a_i + b_i (x - x_i) + c_i (x - x_i)^2 + d_i (x - x_i)^3 \right) 1_{\{x \in [x_i, x_{i+1}[\}} \]
More formally, given a function $f(x)$ defined on an interval $[a, b]$ and a set of nodes $a = x_0 < x_1 < ... < x_n = b$, a cubic spline interpolant $S(x)$ for $f(x)$ is a function with $S(x_i) = f(x_i)$ at every node that satisfies the following conditions:
1. $S(x) = \sum_{i=0}^{n-1} S_i(x) 1_{\{x \in [x_i, x_{i+1}[\}}$ is a cubic polynomial, denoted by $S_i(x)$, on the subinterval $[x_i, x_{i+1})$ for each $i = 0, 1, ..., n-1$.
2. $S_{i+1}(x_{i+1}) = S_i(x_{i+1})$ for each $i = 0, 1, ..., n-2$.
3. $S'_{i+1}(x_{i+1}) = S'_i(x_{i+1})$ for each $i = 0, 1, ..., n-2$.
4. $S''_{i+1}(x_{i+1}) = S''_i(x_{i+1})$ for each $i = 0, 1, ..., n-2$.
5. One of the following sets of boundary conditions is also satisfied:
   $S''(x_0) = S''(x_n) = 0$ (free or natural boundary),
   $S'(x_0) = f'(x_0)$ and $S'(x_n) = f'(x_n)$ (clamped boundary).

But there are four problems with this decomposition method:
– the (cubic) spline connecting the extrema is not the real envelope,
– the resulting IMF does not strictly guarantee symmetric envelopes,
– some unwanted overshoot may be caused by the spline interpolation,
– the spline cannot be connected at both ends of the data series.
Higher order splines do not, in theory, resolve these problems.

2.2.3 Additional boundary data points

As illustrated above, the cubic spline has to be kept close to the function, especially near both ends of the data series. Therefore, the creation of additional boundary data points, which are supposed to be representative of the current data set, appears to be the key element in technically improving the EMD. All artificially added boundary data points are generated from within the original set of discrete knots to represent a characteristic natural behaviour. One routine is to add new maxima and minima to the front and rear of the data series. As a basic requirement, these data points are located outside the original time span in which the signal was recorded. Therefore, no information is cancelled out and the natural data series remains unaffected. However, one disadvantage of this method is that it anticipates the future trend of the data.
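One common way to generate such boundary points is to mirror the outermost extrema about the first and last sample times. The sketch below illustrates this idea only; mirroring is an assumption made for the example, and the report does not state which routine its implementation uses. The helper `mirror_extend` and its parameter `n_extra` are hypothetical names.

```python
def mirror_extend(extrema, t_first, t_last, n_extra=1):
    """Add reflected copies of the outermost extrema beyond both ends.

    extrema: list of (time, value) pairs sorted by time, all inside
             the recorded span [t_first, t_last].
    Returns a new sorted list; the added points lie strictly outside the
    span, so the recorded data itself is left untouched.
    """
    # Reflect the first n_extra extrema about t_first.
    left = [(2 * t_first - t, v) for t, v in extrema[:n_extra] if t > t_first]
    # Reflect the last n_extra extrema about t_last.
    right = [(2 * t_last - t, v) for t, v in extrema[-n_extra:] if t < t_last]
    return sorted(left) + list(extrema) + sorted(right)
```

For instance, with maxima at times 2, 6 and 9 on the span [0, 10], one extra maximum is added at time -2 and one at time 11, so that the upper envelope spline is constrained on both sides of the data. The added right-hand point is exactly where the method "anticipates the future trend of the data".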
Figure 2.1: Additional boundary data points
Figure 2.2: Additional boundary data points
Chapter 3
Application to the prediction of financial time series

3.1 The Empirical Mode Decomposition in finance: stylized facts

In the markets, one can assume that different modes of oscillation in stock prices are provoked by different kinds of actors. With this approach, the lowest frequency IMFs could be considered as a statistical proxy to infer the behavior of big investors (banks, insurance companies, hedge funds, mutual funds, pension funds, ...) and to predict the evolution of a stock in the long run.

3.1.1 Empirical Modes and Market Structure

3.1.1.1 Asset price and IMFs

With the EMD, a time series can be represented as follows:
\[ \exists N_{IMF} \in \mathbb{N}^* \mid \forall t \in \mathbb{Z}, \quad X_t = \sum_{i=1}^{N_{IMF}} IMF_t^i + r_t \]
It can be interesting to observe the correlation matrix of the random vector
\[ \left( X_t, \; IMF_t^1, \; \ldots, \; IMF_t^{N_{IMF}}, \; r_t \right)_{t \in \mathbb{Z}} \]
This matrix can be empirically computed on an example, as shown in Figure 3.1.
            X    IMF1    IMF2    IMF3    IMF4    IMF5    IMF6    IMF7   Trend
X       1.000   0.038   0.039   0.172   0.184   0.169   0.488   0.795   0.879
IMF1    0.038   1.000   0.034  -0.009  -0.026   0.002   0.022  -0.021  -0.019
IMF2    0.039   0.034   1.000  -0.040  -0.005   0.043  -0.015  -0.022  -0.025
IMF3    0.172  -0.009  -0.040   1.000   0.109  -0.114   0.102   0.049   0.059
IMF4    0.184  -0.026  -0.005   0.109   1.000  -0.035   0.018   0.035   0.040
IMF5    0.169   0.002   0.043  -0.114  -0.035   1.000  -0.096  -0.014  -0.038
IMF6    0.488   0.022  -0.015   0.102   0.018  -0.096   1.000  -0.043   0.148
IMF7    0.795  -0.021  -0.022   0.049   0.035  -0.014  -0.043   1.000   0.977
Trend   0.879  -0.019  -0.025   0.059   0.040  -0.038   0.148   0.977   1.000

Figure 3.1: Empirical correlation matrix of AXA share price and its IMFs, from 01/01/2003 to 14/11/2011

This correlation matrix shows three stylized facts:
– The EMD displays some empirical orthogonality features. Hence, the following theoretical assumption can be made:
\[ \forall (i, j) \in [\![1; N_{IMF}]\!]^2, \quad i \ne j \Rightarrow \langle IMF_i, IMF_j \rangle = 0 \]
with the usual scalar product:
\[ \langle \cdot, \cdot \rangle : L^2(\mathbb{R}) \times L^2(\mathbb{R}) \to \mathbb{R}, \quad (X, Y) \mapsto Cov(X, Y) \]
– Low frequency IMFs display strong correlations to the original price series.
– High frequency IMFs are uncorrelated to the original price series.

3.1.1.2 High frequency modes

Empirical modes have a strong connection to the market structure. On top of being uncorrelated to the general time series, high frequency IMFs present a strong correlation to daily price movements, as shown by the empirical correlation matrix of the daily returns with the IMF processes and the trend process (see Figure 3.3).

High frequency IMFs follow daily movements accurately, as shown in Figure 3.4. Moreover, there appear to be local jumps in the amplitude of the high frequency IMFs when daily changes become sharper. This comes along with local jumps in volatility. Hence, the amplitude of high frequency IMFs is probably positively correlated to the short term implied volatility of the at-the-money options on the stock. As the last graph shows, the amplitude jumps along with volatility when daily returns exceed 10% (see Figure 3.6 and Figure 3.5).

Finally, despite their significant short term periodicity, they display some signs of stationarity, as shown in Figure 3.2.
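Such signs of stationarity are assessed in this section with the Ljung-Box statistic $Q(m) = N(N+2) \sum_{h=1}^{m} \hat{\rho}_h^2 / (N-h)$, which is compared to a $\chi^2_m$ quantile. The sketch below computes it in plain Python, as an illustration only: the function names are hypothetical and the sample autocorrelation uses the usual biased estimator (division by the total sum of squares), which is an assumption since the report does not specify the estimator.

```python
def autocorrelation(x, h):
    """Sample autocorrelation of lag h (biased estimator)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + h] - mean) for t in range(n - h))
    return cov / var


def ljung_box_q(x, m):
    """Ljung-Box statistic Q(m) = N (N + 2) * sum_{h=1..m} rho_h^2 / (N - h)."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorrelation(x, h) ** 2 / (n - h) for h in range(1, m + 1)
    )


# Usage: a strictly alternating series is strongly autocorrelated at lag 1,
# so Q(1) is far above 3.841, the 95% quantile of the chi-square law with
# one degree of freedom, and the absence of autocorrelation is rejected.
alternating = [1.0, -1.0] * 50
q1 = ljung_box_q(alternating, 1)
rejects_h0 = q1 > 3.841
```

The same function applied to the highest frequency IMF of a stock gives the quantity that the test compares to the $\chi^2_m$ quantile.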
Figure 3.2: Sample Autocorrelation Function of the highest frequency IMF

            diff(Axa)    IMF1    IMF2    IMF3    IMF4    IMF5    IMF6    IMF7   Trend
diff(Axa)       1.000   0.552   0.087   0.029   0.016  -0.007  -0.008  -0.004  -0.003
IMF1            0.552   1.000   0.034  -0.009  -0.026   0.002   0.022  -0.021  -0.019
IMF2            0.087   0.034   1.000  -0.040  -0.005   0.043  -0.015  -0.022  -0.025
IMF3            0.029  -0.009  -0.040   1.000   0.109  -0.114   0.102   0.049   0.059
IMF4            0.016  -0.026  -0.005   0.109   1.000  -0.035   0.018   0.035   0.040
IMF5           -0.007   0.002   0.043  -0.114  -0.035   1.000  -0.096  -0.014  -0.038
IMF6           -0.008   0.022  -0.015   0.102   0.018  -0.096   1.000  -0.043   0.148
IMF7           -0.004  -0.021  -0.022   0.049   0.035  -0.014  -0.043   1.000   0.977
Trend          -0.003  -0.019  -0.025   0.059   0.040  -0.038   0.148   0.977   1.000

Figure 3.3: Empirical correlation of daily returns of AXA and its IMFs, from 01/01/2003 to 14/11/2011

A Ljung-Box test gives a more quantitative assessment. It tests the null hypothesis that the first $m$ autocorrelations of the process are jointly zero.

Definition 3.1.1 Let $m \in \mathbb{N}$ and $(H_0): \rho_1 = \rho_2 = \ldots = \rho_m = 0$. The Ljung-Box test statistic is given by:
\[ Q(m) = N(N+2) \sum_{h=1}^{m} \frac{\hat{\rho}_h^2}{N-h} \]
and, under $(H_0)$,
\[ Q(m) \sim \chi^2_m \]

However, the test rejects the absence of autocorrelations for the highest frequency IMF on our previous example of AXA. Therefore, it suggests that these high frequency modes display some stationary properties, but keep a very short periodicity, which makes these processes not i.i.d.

Figure 3.4: Daily returns of Societe Generale and its highest frequency IMF
Figure 3.5: Daily returns of Societe Generale and its highest frequency IMF during a high volatility period
Figure 3.6: Daily returns of Societe Generale and its highest frequency IMF normalized

3.1.1.3 Low frequency modes

Low frequency modes describe the long term dynamics of the stock. They reflect the positions of long term actors within the market (see Figure 3.7). They can also be interpreted in terms of economic cycles, if applied to very long time frames (see Figure 3.8).

3.1.2 Back to the Box & Jenkins framework

The EMD algorithm derives the following decomposition of a given financial time series:
\[ \exists N_{IMF} \in \mathbb{N}^* \mid \forall t \in \mathbb{Z}, \quad X_t = \sum_{i=1}^{N_{IMF}} IMF_t^i + r_t \]
Moreover, low frequency IMFs explain the general evolution of the stock price and have strong periodic patterns, whereas high frequency IMFs are linked to the daily movements. Figure 3.9 shows all the IMFs of an example of financial time series. Low frequency IMFs are in red, high frequency IMFs in blue, and the stock price in green.
Figure 3.7: Societe Generale stock price and the sum of its 2 lowest frequency IMFs
Figure 3.8: VIX and the sum of its 3 lowest frequency IMFs
Figure 3.9: Empirical Mode Decomposition of Axa, starting 01/01/2003 until 14/11/2011

Hence, we can separate the previous sum into the two components of the Box & Jenkins approach:
\[ \exists N_{IMF} \in \mathbb{N}^* \mid \forall t \in \mathbb{Z}, \quad X_t = \underbrace{\sum_{i=1}^{N_{sep}} IMF_t^i}_{\text{Random part}} + \underbrace{\sum_{i=N_{sep}+1}^{N_{IMF}} IMF_t^i}_{\text{Seasonal part}} + \underbrace{r_t}_{\text{Trend}} \]
In the rest of this report, we will sometimes include the trend process in the seasonal part. In the previous example, the decomposition is the following:

Figure 3.10: Box & Jenkins decomposition of the Axa stock price based on EMD, starting 01/01/2003 until 14/11/2011

Remark 3.1.2 In Figure 3.10, the correlation between the "Random Part" and the "Seasonal Part" is:
\[ Corr(X_t^{random}, X_t^{seasonal}) = -0.01 \]

In order to properly differentiate low frequency IMFs from high frequency IMFs, one needs a rule. Multiple choices are possible:
– A stationarity criterion for high frequency IMFs: statistical tests, such as the Ljung-Box test, the Runs test, the KPSS test, ...
– A periodicity criterion for low frequency IMFs: low frequency IMFs must display less than $p$ pseudo periods within the time interval. Beyond that threshold $p$, they are considered as moving too quickly and not carrying information relative to the general evolution of the series.

Since the goal of this decomposition is to extract a seasonal pattern reflecting the broad evolution of the stock price, a selection criterion for low frequency IMFs seems more appropriate. It is also more intuitive than statistical tests. Therefore, the criterion chosen is the following: for every $i \in [\![1; N_{IMF}]\!]$, the process $(IMF_t^i)_{1 \le t \le T}$ is assigned to the seasonal part $(X_t^{seasonal})_{1 \le t \le T}$ if and only if
\[ \#\Gamma_{0,i}^{[1;T]} \le 3 \]
where
\[ \Gamma_{0,i}^{[1;T]} = \{ s \in [\![1; T]\!] \mid IMF_s^i = 0 \} \]
Therefore, we can now explicitly write the Box & Jenkins decomposition based on the EMD:
\[ \exists N_{IMF} \in \mathbb{N}^* \mid \forall t \in [\![1; T]\!], \quad X_t = \underbrace{\sum_{i=1}^{N_{IMF}} IMF_t^i \, 1_{\{\#\Gamma_{0,i}^{[1;T]} > 3\}}}_{\text{Random part}} + \underbrace{\sum_{i=1}^{N_{IMF}} IMF_t^i \, 1_{\{\#\Gamma_{0,i}^{[1;T]} \le 3\}}}_{\text{Seasonal part}} + \underbrace{r_t}_{\text{Trend}} \]

Remark 3.1.3 In Figure 3.10, the "Seasonal Part" explains 91% of the variance of the original time series:
\[ R^2 := \frac{Var(X_t^{seasonal} + r_t)}{Var(X_t)} = 0.91 \]

3.1.3 Prediction hypotheses

We have now decomposed our signal into two parts: the seasonal component and the random high frequency component. These two components are moreover uncorrelated, and the variance of the seasonal process explains most of the variance of the original time series.
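The zero-crossing rule of section 3.1.2 can be sketched as follows. This is an illustration under assumptions, not the code used for the figures: on sampled data an IMF is rarely exactly zero, so the sketch counts sign changes as a discrete stand-in for $\#\Gamma_{0,i}^{[1;T]}$. The function names are hypothetical; IMFs and the trend are plain lists of floats of equal length.

```python
def zero_crossings(series):
    """Number of sign changes, used as a discrete proxy for #Gamma_0."""
    signs = [v > 0 for v in series if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def split_imfs(imfs, max_crossings=3):
    """Sum the IMFs into (seasonal, random) parts with the <= 3 crossings rule."""
    n = len(imfs[0])
    seasonal = [0.0] * n
    random_part = [0.0] * n
    for imf in imfs:
        target = seasonal if zero_crossings(imf) <= max_crossings else random_part
        for t, v in enumerate(imf):
            target[t] += v
    return seasonal, random_part


def variance(x):
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)


def explained_variance(seasonal, trend, original):
    """R^2 = Var(seasonal + trend) / Var(original), as in Remark 3.1.3."""
    return variance([s + r for s, r in zip(seasonal, trend)]) / variance(original)
```

With a slow sine wave and a fast alternating component as inputs, the first is routed to the seasonal part and the second to the random part, and `explained_variance` gives the share of the variance carried by the seasonal part plus the trend.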
We have the following decomposition:
\[ \forall T \in \mathbb{N}, \; \exists N_{IMF} \in \mathbb{N}^* \mid \forall t \in [\![0; T]\!], \quad X_t = \underbrace{\sum_{i=1}^{N_{IMF}} IMF_t^i \, 1_{\{\#\Gamma_{0,i}^{[1;T]} > 3\}}}_{\text{Random part}} + \underbrace{\sum_{i=1}^{N_{IMF}} IMF_t^i \, 1_{\{\#\Gamma_{0,i}^{[1;T]} \le 3\}}}_{\text{Seasonal part}} + \underbrace{r_t}_{\text{Trend}} \]
We can now proceed to separate estimations for each process. As we noticed on the earlier example, the Random Part process is approximately centered on zero. Therefore, we will make a simple prediction for this process:
\[ E\left[ (X_s^{random})_{T+1 \le s \le 2T} \mid (X_t^{random})_{1 \le t \le T} \right] = (0)_{T+1 \le s \le 2T} \]
We now have to formulate a prediction of the seasonal process. Following the framework of Box & Jenkins, two hypotheses are possible.

3.1.3.1 Deterministic periodicity of low frequency IMFs

The first possible assumption is that the seasonal component is deterministic. Hence, we assume that in the future, this periodic component will keep its properties:
– Periodicity:
\[ \forall i \in [\![i_0; N_{IMF}]\!], \; \forall t \in \mathbb{Z}, \quad \sum_{j=1}^{T_i} IMF_{t+j}^i = 0 \]
or
\[ \forall i \in [\![i_0; N_{IMF}]\!], \; \forall t \in \mathbb{Z}, \quad IMF_{t+T_i}^i = IMF_t^i \]
– IMF structure:
\[ \forall i \in [\![i_0; N_{IMF}]\!], \; \forall t \in \mathbb{Z}, \quad \left| \#\Gamma_{max,i}^{[t;t+T_i]} + \#\Gamma_{min,i}^{[t;t+T_i]} - \#\Gamma_{0,i}^{[t;t+T_i]} \right| \le 1 \]
where
\[ \Gamma_{0,i}^{[t;t+T_i]} = \{ s \in [t; t+T_i] \mid IMF_s^i = 0 \} \]
\[ \Gamma_{min,i}^{[t;t+T_i]} = \{ s \in [t; t+T_i] \mid \exists u > 0, \; s = \arg\min_{v \in [s-u, s+u]} IMF_v^i \} \]
\[ \Gamma_{max,i}^{[t;t+T_i]} = \{ s \in [t; t+T_i] \mid \exists u > 0, \; s = \arg\max_{v \in [s-u, s+u]} IMF_v^i \} \]
With these properties, each low frequency IMF can easily be prolonged, which gives the estimated future seasonal process.

3.1.3.2 Stochastic periodicity of low frequency IMFs

The second possible assumption is weaker than the first one. Instead of assuming that the seasonal component is deterministic, it is now assumed that it presents some periodicity while remaining stochastic.
\[ E\left[ (X_s^{seasonal} + r_s)_{T+1 \le s \le 2T} \mid (X_t^{seasonal} + r_t)_{1 \le t \le T} \right] = (X_s^{seasonal} + r_s)_{T+1 \le s \le 2T} \]
if we assume that future estimations should rely on sequences from the past of the same duration.

3.2 Insights of potential market predictors

3.2.1 Deterministic periodicity: Low Frequency Mean Reverting Strategy

This strategy relies on the hypothesis of deterministic periodicity of the seasonal process. To formulate a prediction at a certain time $t$, within the horizon $T$, this strategy relies on the following algorithm. The price is first decomposed:
\[ \forall s \in [\![t-T; t]\!], \quad X_s = X_s^{random} + X_s^{seasonal} \]
The seasonal part is then normalized by its mean over the window, and the result is denoted $\tilde{X}^{seasonal}$:
\[ \forall s \in [\![t-T; t]\!], \quad \tilde{X}_s^{seasonal} = \log \frac{X_s^{seasonal}}{\bar{X}^{seasonal}} \quad \text{where} \quad \bar{X}^{seasonal} = \frac{1}{T+1} \sum_{s=t-T}^{t} X_s^{seasonal} \]
The decision depends on the second order difference of the normalized seasonal part:
\[ \text{if } \tilde{X}_t^{seasonal} - 2\tilde{X}_{t-1}^{seasonal} + \tilde{X}_{t-2}^{seasonal} > \text{threshold}, \text{ then } \frac{X_{t+T}^{seasonal}}{X_t^{seasonal}} > 1 \text{ and } \alpha_{MeanRevertingStrat}(t) = 1 \]
\[ \text{else if } \tilde{X}_t^{seasonal} - 2\tilde{X}_{t-1}^{seasonal} + \tilde{X}_{t-2}^{seasonal} < -\text{threshold}, \text{ then } \frac{X_{t+T}^{seasonal}}{X_t^{seasonal}} < 1 \text{ and } \alpha_{MeanRevertingStrat}(t) = -1 \]

3.2.2 Conditional expectation: Low Frequency Multi Asset Shifting Pattern Recognition Strategy and Mono Asset IMF Pattern Recognition Strategy

3.2.2.1 Low Frequency Multi Asset Shifting Pattern Recognition Strategy

This strategy relies on the hypothesis of stochastic periodicity of the seasonal process. It considers a pool of $N_{assets}$ assets, among which is the asset to be predicted, $i_0$. To formulate a prediction at a certain time $t_0$, within the horizon $T$, this strategy relies on the following algorithm.

First, each price process is decomposed:
\[ \forall i \in [\![1; N_{assets}]\!], \; \forall s \in [\![0; t_0]\!], \quad X_s^i = X_s^{random,i} + X_s^{seasonal,i} \]
and so is the asset of interest, since it belongs to the pool of assets:
\[ \forall s \in [\![0; t_0]\!], \quad X_s^{i_0} = X_s^{random,i_0} + X_s^{seasonal,i_0} \]
Then, the three best fitting patterns are chosen:
\[ \{ (i_1, t_1), (i_2, t_2), (i_3, t_3) \} = \arg\min_{(i,u) \in [\![1; N_{assets}]\!] \times [\![1; t_0 - T]\!]} \left\| \left( \log \frac{X_s^{seasonal,i}}{\bar{X}^{seasonal,i}_{[u;u+T]}} \right)_{u \le s \le u+T} - \left( \log \frac{X_s^{seasonal,i_0}}{\bar{X}^{seasonal,i_0}_{[t_0-T;t_0]}} \right)_{t_0-T \le s \le t_0} \right\|^2 \]
where
\[ \bar{X}^{seasonal,i_0}_{[t_0-T;t_0]} = \frac{1}{T+1} \sum_{s=t_0-T}^{t_0} X_s^{seasonal,i_0} \quad \text{and} \quad \bar{X}^{seasonal,i}_{[u;u+T]} = \frac{1}{T+1} \sum_{s=u}^{u+T} X_s^{seasonal,i} \]
Let
\[ Z = \frac{1_{\{X^{i_1}_{t_1+T} > X^{i_1}_{t_1}\}} + 1_{\{X^{i_2}_{t_2+T} > X^{i_2}_{t_2}\}} + 1_{\{X^{i_3}_{t_3+T} > X^{i_3}_{t_3}\}}}{3} \]
$Z$ is the decision variable. Predictions are made depending on the vote of the three best fitted scenarios:
\[ \text{if } Z > \frac{1}{2}, \text{ then } \frac{X_{t_0+T}}{X_{t_0}} > 1 \text{ and } \alpha_{ShiftingPatternStrat}(t) = 1 \]
\[ \text{else if } Z < \frac{1}{2}, \text{ then } \frac{X_{t_0+T}}{X_{t_0}} < 1 \text{ and } \alpha_{ShiftingPatternStrat}(t) = -1 \]

3.2.2.2 Low Frequency Mono Asset IMF Pattern Recognition Strategy

This strategy relies on the hypothesis of stochastic periodicity of the seasonal process. It is very similar to the previous strategy. It differs on two essential points:
– It does not require any other time series than the historical prices of the asset that is to be predicted.
– It is adapted, i.e. non-anticipative.

To formulate a prediction at a certain time $t_0$, within the horizon $T$, this strategy relies on the following algorithm:
\[ \forall s \in [\![t_0 - T; t_0]\!], \quad X_s^{i_0} = X_s^{random,i_0} + X_s^{seasonal,i_0} \]
\[ \{ t_1, t_2, t_3 \} = \arg\min_{u \in [\![1; t_0 - T]\!]} \left\| \left( \log \frac{X_s^{seasonal,i_0}}{\bar{X}^{seasonal,i_0}_{[u;u+T]}} \right)_{u \le s \le u+T} - \left( \log \frac{X_s^{seasonal,i_0}}{\bar{X}^{seasonal,i_0}_{[t_0-T;t_0]}} \right)_{t_0-T \le s \le t_0} \right\|^2 \]
where
\[ \bar{X}^{seasonal,i_0}_{[t_0-T;t_0]} = \frac{1}{T+1} \sum_{s=t_0-T}^{t_0} X_s^{seasonal,i_0} \quad \text{and} \quad \bar{X}^{seasonal,i_0}_{[u;u+T]} = \frac{1}{T+1} \sum_{s=u}^{u+T} X_s^{seasonal,i_0} \]
Hence, the decision variable is:
\[ Z = \frac{1_{\{X^{i_0}_{t_1+T} > X^{i_0}_{t_1}\}} + 1_{\{X^{i_0}_{t_2+T} > X^{i_0}_{t_2}\}} + 1_{\{X^{i_0}_{t_3+T} > X^{i_0}_{t_3}\}}}{3} \]
and the predictions computed by the strategy are:
\[ \text{if } Z > \frac{1}{2}, \text{ then } \frac{X_{t_0+T}}{X_{t_0}} > 1 \text{ and } \alpha_{AutoPatternStrat}(t) = 1 \]
\[ \text{else if } Z < \frac{1}{2}, \text{ then } \frac{X_{t_0+T}}{X_{t_0}} < 1 \text{ and } \alpha_{AutoPatternStrat}(t) = -1 \]
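The mono asset pattern recognition strategy can be sketched as below. This is one possible reading of the algorithm, written as an illustration rather than as the code behind the backtests, and it makes two assumptions that should be stated loudly. First, the outcome of a matched window $[u; u+T]$ is measured over the $T$ steps that follow it, i.e. by comparing the price at $u+2T$ with the price at $u+T$. Second, candidate windows are restricted to those whose outcome is already known at $t_0$ ($u + 2T \le t_0$), so that the prediction stays adapted. The function names are hypothetical, indices are 0-based list positions, and the seasonal series is assumed to be strictly positive so that the logarithm is defined.

```python
import math


def normalized_log(window):
    """log(X_s / mean of the window), for each point of the window."""
    mean = sum(window) / len(window)
    return [math.log(v / mean) for v in window]


def pattern_signal(seasonal, prices, T, n_best=3):
    """Return +1 (bullish) or -1 (bearish) at t0 = len(seasonal) - 1."""
    t0 = len(seasonal) - 1
    if t0 - 2 * T < 0:
        raise ValueError("not enough history for a window and its outcome")
    target = normalized_log(seasonal[t0 - T:t0 + 1])
    scored = []
    # Assumption: only windows whose T-step outcome is known at t0 are used.
    for u in range(0, t0 - 2 * T + 1):
        candidate = normalized_log(seasonal[u:u + T + 1])
        dist = sum((a - b) ** 2 for a, b in zip(candidate, target))
        scored.append((dist, u))
    best = [u for _, u in sorted(scored)[:n_best]]
    # Vote: did the price rise over the T steps following each matched window?
    votes = sum(1 for u in best if prices[u + 2 * T] > prices[u + T])
    z = votes / len(best)
    return 1 if z > 0.5 else -1
```

On a perfectly periodic series, the three best matches are the earlier occurrences of the same phase, and the vote reproduces what happened after them, which is the behavior the stochastic periodicity hypothesis relies on.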
Chapter 4
Strategies analysis

4.1 Portfolio management

Definition 4.1.1 Let $(P_t)_{t \in [1;T]}$ denote the stochastic process of the spot price of an asset.

4.1.1 Trading strategy

Definition 4.1.2 A trading strategy is represented as follows. At each time period, it provides an anticipation of the market: -1 is bearish (i.e. the price will decline), +1 is bullish (i.e. the price will rise).
\[ \alpha : [0; T] \to \{-1; 1\} \]

4.1.2 Investment Horizon

An investment duration $T_{invest}$, in terms of business days, drives the predictions of a given strategy. It can range from 10 to 252 business days (from two weeks to a year). Therefore, portfolio management can be driven by mid term or long term earnings prospects.
\[ T_{invest} \in [|50; 252|] \]
These prospects drive the PnL of the strategy, as positions will be closed after $T_{invest}$ business days.

Definition 4.1.3
\[ \forall \alpha \in \{-1; 1\}^{[|1;T|]}, \; \forall t \in [1; T], \quad P\&L^{\alpha}_{T_{invest}}(t) = \sum_{1 \le i \le t-1} \alpha(i) \, \frac{P_{(i+T_{invest}) \wedge t} - P_i}{P_i} \]
4.1.3 Starting time

The beginning of the time series is not subject to predictions. It is kept as prerequisite information in order to compute the first predictions. Indeed, for the pattern fitting strategies, one needs to have a few historical patterns available for fitting. Therefore, the predictions start at the time:
\[ t_{start} = 10 \cdot T_{invest} \]
Hence, the new PnL vector:
\[ \forall \alpha \in \{-1; 1\}^{[|1;T|]}, \; \forall t \in [t_{start}; T], \quad P\&L^{\alpha}_{T_{invest}}(t) = \sum_{t_{start} \le i \le t-1} \alpha(i) \, \frac{P_{(i+T_{invest}) \wedge t} - P_i}{P_i} \]

4.1.4 Trading time span

A trading time span $\delta_t$ defines the duration between two portfolio rebalances. It is defined in terms of business days. Every $\delta_t$ days, one trade is closed and another position is taken. By default, and as assumed in the backtests, it is equal to a fifth of the investment duration. For example, for $T_{invest} = 252$, there will be 5 rebalances per year, one every 50 business days. Hence, in this example, $\delta_t = 50$.
\[ \delta_t = \frac{T_{invest}}{5} \]
Therefore, the PnL now becomes:
\[ \forall \alpha \in \{-1; 1\}^{[|1;T|]}, \; \forall t, \quad P\&L^{\alpha}_{T_{invest},\delta_t}(t) = \sum_{0 \le i \le \frac{T - t_{start}}{\delta_t}} \alpha(i \delta_t + t_{start}) \, \frac{P_{(T_{invest} + i \delta_t + t_{start}) \wedge t} - P_{i \delta_t + t_{start}}}{P_{i \delta_t + t_{start}}} \]

4.1.5 Annualizing the PnL and reducing its variance

The PnL computed so far still carries a time dependence. It needs to be annualized, with the following operation:
\[ \forall \alpha \in \{-1; 1\}^{[|1;T|]}, \; \forall t, \quad P\&L^{\alpha}_{T_{invest},\delta_t}(t) = \sum_{0 \le i \le \frac{T - t_{start}}{\delta_t}} \alpha(i \delta_t + t_{start}) \, \frac{P_{(T_{invest} + i \delta_t + t_{start}) \wedge t} - P_{i \delta_t + t_{start}}}{P_{i \delta_t + t_{start}}} \cdot \frac{252}{T_{invest}} \]
Moreover, it is valuable to reduce the volatility of the returns of a given strategy. If this is not achieved at the expense of the mean rate of return, it contributes to slightly improving the Sharpe ratio. Hence, usual stop losses and cash-ins are implemented in the backtests.
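The annualized PnL of sections 4.1.3 to 4.1.5 can be sketched as follows. It is a minimal illustration, not the backtest engine of this report: stop losses and cash-ins are not modeled, times are 0-based list positions, and only the positions opened strictly before the evaluation time `t` are counted, which is an assumption about how the upper bound of the sum is meant to interact with the $\wedge t$ truncation. The function name and its parameters are hypothetical; `alpha` is any function returning +1 or -1 for an entry time.

```python
def annualized_pnl(prices, alpha, t, t_invest, t_start=None, delta_t=None):
    """Sum of signed returns of the positions opened every delta_t days,
    each closed after t_invest days or at time t, scaled by 252 / t_invest."""
    if t_start is None:
        t_start = 10 * t_invest            # t_start = 10 * T_invest
    if delta_t is None:
        delta_t = max(1, t_invest // 5)    # delta_t = T_invest / 5
    total = 0.0
    entry = t_start
    while entry < t:
        exit_time = min(entry + t_invest, t)   # (entry + T_invest) ^ t
        total += alpha(entry) * (prices[exit_time] - prices[entry]) / prices[entry]
        entry += delta_t
    return total * 252.0 / t_invest


# Usage: two consecutive 5-day positions, each earning 10%.
example_prices = [100.0] * 5 + [110.0] * 5 + [121.0]
always_long = lambda time: 1
pnl_long = annualized_pnl(example_prices, always_long, t=10, t_invest=5,
                          t_start=0, delta_t=5)
```

In this usage example the two positions add up to a 20% return over holding periods of 5 days, which the factor 252/5 scales to an annualized figure; a strategy that is always short on the same prices obtains the opposite value.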
30.
From this section, we now have a framework for deriving a PnL vector from a given strategy $\alpha \in \{-1; 1\}^{[|1;T|]}$ and a given price process $P_t$, $t \in [1; T]$. The PnL vector now remains to be evaluated with objective criteria.

4.2 Underlying and target market

Three potential trading strategies were identified. They have been tested on four different types of underlyings, for the following reasons:

- Stocks: CAC 40. The goal is to find recurrent seasonality patterns within a single stock, or between different stocks. This seasonality could be caused by important market shifts initiated by big players, such as pension funds, insurance companies, asset managers, or banks rebalancing their portfolios.
- Implied volatilities: VIX Index, VStoxx Index, VCAC, VDAX. These indices are computed by a closed formula relying on the implied volatilities of numerous options, hence reflecting the overall structure of the volatility smile.
- Index pairs: based on the following most liquid worldwide indexes: CAC, DAX, SX5E, SPX, NKY, UKX, IBOV, SMI, HSI. Index pairs provide trajectories which generally follow mean reverting processes, and have the advantage of being extremely liquid.
- Commodities: WTI. Commodities are known for displaying seasonality features. Therefore, they may constitute interesting underlyings.
31.
Chapter 5
Results

5.1 Empirical choices

Three strategies have been mentioned in this paper. However, in practice, tests have only targeted one strategy, for the following reasons:

– The IMF Mean Convexity Reverting Strategy has been tested on a few examples. However, the calibration of the threshold has proven to be difficult. Even the seasonal process remains somewhat unstable, in particular in its last values, where the second order derivative is computed.

$$\forall s \in [[t - T; t]], \quad X_s = X_s^{random} + X_s^{seasonal}$$

$$\forall s \in [[t - T; t]], \quad X_s^{seasonal} = \log \frac{X_s^{seasonal}}{\overline{X}^{seasonal}} \quad \text{where} \quad \overline{X}^{seasonal} = \frac{1}{T + 1} \sum_{s = t - T}^{t} X_s^{seasonal}$$

if $X_t^{seasonal} - 2 X_{t-1}^{seasonal} + X_{t-2}^{seasonal} > threshold$, then $\frac{X_{t+T}^{seasonal}}{X_t^{seasonal}} > 1$ and $\alpha_{MeanRevertingStrat}(t) = 1$

else if $X_t^{seasonal} - 2 X_{t-1}^{seasonal} + X_{t-2}^{seasonal} < -threshold$, then $\frac{X_{t+T}^{seasonal}}{X_t^{seasonal}} < 1$ and $\alpha_{MeanRevertingStrat}(t) = -1$

– The IMF Multi Asset Shifting Pattern Recognition Strategy is very similar to the Mono Asset IMF Pattern Recognition Strategy. However, it is harder to implement because it is a multi asset strategy: results will depend on how much data is used. Moreover, the multi asset strategy is not adapted, contrary to the mono asset strategy.

– Due to the small computing capacity available during the project, tests have mainly focused on the last strategy developed here, i.e., the Mono Asset IMF Pattern Recognition Strategy. Indeed, the idea is to derive reliable results by testing it on a wide range of
32.
underlyings (listed earlier). That way, the law of large numbers will help provide reliable results.

5.2 Tables

5.2.1 Volatility: VIX Index

The most promising results, obtained on the VIX Index, are shown first. Three different investment horizons have been tested: 50 days, 150 days, and 250 days. The result for 50 days is the most reliable, since it relies on the largest number of trades. Bullish signals also seem to perform better. This is quite expected, considering the general behavior of implied volatility. As provided in table 5.1, page 32, the last back test, with 35% cash in, gives a 57% hit ratio and a 0.40 annualized Sharpe ratio.

Figure 5.1: Back test on the VIX Index, with a 50 days investment horizon and 35% cash in

Moreover, the results are also encouraging for a longer investment horizon: 150 days. However, they rely on fewer trades, simply due to the longer horizon. Again, bullish signals are more powerful. As provided in table 5.2, page 33, the last back test, with 100% cash in, gives a 54% hit ratio and a 0.54 annualized Sharpe ratio.
33.
Figure 5.2: Back test on the VIX Index, with a 150 days investment horizon and 100% cash in

Finally, the results are presented for 250 days. Again, they rely on fewer trades, simply due to the longer horizon, and bullish signals are more powerful. As provided in table 5.3, page 34, the last back test, with 150% cash in, gives a 66% hit ratio and a 0.76 annualized Sharpe ratio. However, it only relies on 15 trades, from 2005 to 2011, a period over which bullish signals have obviously proven quite effective. Therefore, more tests with longer data need to be pursued.

5.2.2 Volatility: VStoxx Index

To confirm the long term results for the VIX Index, similar tests have been pursued on the VStoxx Index, see table 5.4, page 35. For the 126 days horizon, results are also encouraging. Without any cash ins, and with all signals (bullish and bearish), a 56% hit ratio and a 0.20 Sharpe ratio are achieved on 70 trades.

5.2.3 Volatility: other indices: aggregate performance

Table 5.5, page 36, shows the aggregated results for all the indices tested in the following pool. The number of trades represented is around 1000 for the 150 days table, and 100 for
34.
Figure 5.3: Back test on the VIX Index, with a 250 days investment horizon and 150% cash in

the 250 days table, dating from 2006 to 2011. Therefore, results are very reliable.

It seems that the aggregate prediction power on volatility is uncertain. However, results remain encouraging for the VIX, i.e., the most liquid index (via futures or ETFs) among the volatility indexes.

5.2.4 French stocks: CAC 40: aggregate performance

Aggregated results for the French stock market are quite disappointing, see table 5.6, page 37.

5.2.5 Equity indices and trading pairs: aggregate performance

Tests have also been pursued on the main worldwide equity indices and their pairs. Aggregated results on approximately 2000 trades and 20 years of historical prices show that this strategy does not have any prediction power on this asset class, see table 5.7, page 37.
35.
Figure 5.4: Back test on the VStoxx Index, with a 126 days investment horizon and without cash in

5.2.6 Commodities: West Texas Intermediate (WTI)

Results for the West Texas Intermediate (WTI) are shown in table 5.8, page 38.
36.
Figure 5.5: Back test on a pool of volatility indexes
37.
Figure 5.6: Back test on French stocks from the CAC 40

Figure 5.7: Back test on a pool of equity indexes
38.
Figure 5.8: Back test on the West Texas Intermediate (WTI) oil price
39.
Conclusion and outlook

HHT offers a potentially viable method for nonlinear and nonstationary data analysis. But in all the cases studied, HHT does not give sharper results than most of the traditional time series analysis methods. In order to make the method more robust and rigorous in application, an analytic mathematical foundation is needed.

In our view, the most likely solutions to the problems associated with HHT can only be formulated in terms of optimization: the selection of the spline, the extension of the end points, etc. This may be an area of future research.

While this study tries to lay some theoretical ground for the HHT, further theoretical work is greatly needed in this direction.

On the empirical side, more research also needs to be pursued. While it does not perform particularly well on stock prices, the EMD seems better suited to curves resembling implied volatilities, and better able to derive meaningful dynamics from them. Strong results have been reached on the main volatility indexes, such as the VIX or the VStoxx. Therefore, further empirical tests on this asset class could be rewarding.

Moreover, a great variety of assets has not been tested for prediction: other types of commodities (only WTI was tested), precious metals, and fixed income assets such as sovereign or corporate bonds.

Finally, significant tests have only been pursued for one strategy among the three that were formulated. The code provided with this report is able to generate results for the two other strategies, and can be the basis of wider back tests on an industrial scale.

In terms of applications, this study has limited itself to the first part of the HHT algorithm, i.e., the Empirical Mode Decomposition. Further work could be done in order to properly formalize the Hilbert spectrum, make new hypotheses, and derive potential predictors using the same methodology as in this study.

Acknowledgements

This study has been pursued in collaboration with the Equity Quantitative Team at Natixis. Since our first arrival on the premises of Natixis, we have been thoroughly assisted. Successful
40.
professionals were kind enough to answer our questions and to give their opinion on our work during the entire year. Without their advice, this study would not have achieved its current findings. Our project consisted of working at Natixis every Wednesday, from October 2011 to March 2012. Workdays were a great opportunity to work within the finance environment, and to learn about the role of quantitative associates within the banking industry.

First, we would like to thank our supervisor Mr Adil Reghai, Head of Quantitative Analytics at Natixis. Adil showed much interest in our project, shared our views, and gave us valuable feedback during the whole year. He helped us design our predictors, and constantly gave us new ideas for back tests. We would also like to thank Mr Adel Ben Haj Yedder, who greatly contributed to our project, proofread our reports, and gave us feedback. We also had the opportunity to talk with Adel about his daily job, the role of the team, and the banking industry in general. His views will be valuable in helping us refine our professional plans and goals. We are also thankful to Stephanie Mielnik and Thomas Combarel for their contributions, and to the team in general.

Moreover, this study was pursued in collaboration with Dr Alex Langnau, Global Head of Quantitative Analytics at Allianz. Alex is a consultant for Natixis and an academic at the University of Munich, and talked to Adil about the Hilbert Huang Transform. During our project, Alex also gave us valuable feedback, in particular about the portfolio management of our trading strategies. Also, we would like to thank our teachers at Ecole Centrale Paris, from the Applied Mathematics Department. Mr Erick Herbin, professor of stochastic processes, supervised our project. He encouraged us to formalize the Hilbert Huang algorithm. Despite being a difficult task, it has proven to be essential. We are also thankful to Mr Gilles Faÿ, professor of statistics and time series, for his lectures, which provided important theoretical grounds for our study.

Finally, we wish to thank our colleagues from the Applied Mathematics Program who pursued other projects in collaboration with Natixis. We have been working with them since October, and we enjoyed having breaks with them. To name them: Marguerite de Mailard, Lucas Mahieux, Nicolas Pai and Victor Gerard.
41.
Bibliography

[1] Barnhart, B. L., The Hilbert-Huang Transform: theory, applications, development, dissertation, University of Iowa, (2011)
[2] Brockwell, P.J., and Davis, R.A., Introduction to Time Series and Forecasting, second edition, Springer-Verlag, New York, (2002)
[3] Cohen, L., Generalized phase space distribution functions, J. Math. Phys., 7, 781, (1966)
[4] Datig, M., Schlurmann, T., Performance and limitations of the Hilbert-Huang transformation (HHT) with an application to irregular water waves, Ocean Engineering, 31, 1783-1834, (2004)
[5] De Boor, C., A Practical Guide to Splines, Revised Edition, Springer-Verlag, (2001)
[6] Dos Passos, W., Numerical methods, algorithms, and tools in C#, CRC Press, (2010)
[7] Faÿ, G., Séries Chronologiques, Lecture notes, Ecole Centrale Paris, (2012)
[8] Flandrin, P., Goncalves, P., Rilling, G., EMD Equivalent Filter Banks, from Interpretation to Applications, in: Hilbert-Huang Transform and Its Applications (N.E. Huang and S.S.P. Shen, eds.), pp. 57-74, (2005)
[9] Golitschek, M., On the convergence of interpolating periodic spline functions of high degree, Numerische Mathematik, 19, 46-154, (1972)
[10] Guhathakurta, K., Mukherjee, I., Chowdhury, A.R., Empirical mode decomposition analysis of two different financial time series and their comparison, Elsevier, Chaos Solitons and Fractals, 37, 1214-1227, (2008)
[11] Holder, H.E., Bolch, A.M. and Avissar, R., Using the Empirical Mode Decomposition (EMD) method to process turbulence data collected on board aircraft, submitted to J. Atmos. Ocean. Tech., (2009)
[12] Hong, L., Decomposition and Forecast for Financial Time Series with High-frequency Based on Empirical Mode Decomposition, Elsevier, Energy Procedia, 5, 1333-1340, (2011)
[13] Huang, N.E., Shen, S.S.P., Hilbert-Huang transform and its applications, Volume 5 of Interdisciplinary Mathematical Sciences, (2005)
[14] Huang, N.E., Shen, Z., Long, S., Wu, M., Shih, H., Zheng, Q., Yen, N., Tung, C. and Liu, H., The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis, Proc. R. Soc., 454, 1971, 903, (1998)
42.
[15] Huang, N.E., Wu, Z., A review on Hilbert-Huang transform: Method and its applications to geophysical studies, Rev. Geophys., 46, RG2006, (2008)
[16] Liu, B., Riemenschneider, S., Xu, Y., Gearbox fault diagnosis using empirical mode decomposition and Hilbert spectrum, Mechanical Systems and Signal Processing, 20, 718-734, (2006)
[17] Pan, H., Intelligent Finance - General Principles, International Workshop on Intelligent Finance, Chengdu, China, (2007)
[18] Reghai, A., Goyon, S., Messaoud, M., Anane, M., Market Predictor : Prédiction quantitative des tendances des marchés, Etude Stratégie Quant Recherche Actions, Natixis Securities, Paris, (2010)
[19] Reghai, A., Goyon, S., Combare, T., Ben Haj Yedder, A., Mielnik, S., Sharpe Select : optimisation de l'investissement Cross Asset, Etude Stratégie Quant Recherche Quantitative, Natixis Securities, Paris, (2011)
43.
Appendix A
Time series prerequisites

A.1 Stationary and linear processes

A.1.1 Stationarity

Definition A.1.1 A time series is a stochastic process in discrete time, $X_t \in \mathbb{R}$, $t \in \mathbb{Z}$ for example. Thus, a time series is composed of realizations of a single statistical variable observed at regular time intervals (for example a month, a trimester, a year, or a nanosecond).

We can expect to develop some interesting predictions if the process displays certain structural properties:

– Either some "rigidity", allowing some deterministic parts to be extrapolated.
– Or some form of statistical invariance, called stationarity, allowing one to learn the present and predict the future based on the past.

Definition A.1.2 $X_t \in \mathbb{R}$, $t \in \mathbb{Z}$ is said to be strictly stationary iff its finite dimensional distributions are invariant under any time translation, i.e.:

$$\forall \tau \in \mathbb{Z},\ \forall n \in \mathbb{N}^*,\ \forall (t_1, \dots, t_n) \in \mathbb{Z}^n, \quad (X_{t_1}, \dots, X_{t_n}) \sim (X_{t_1 - \tau}, \dots, X_{t_n - \tau})$$

Definition A.1.3 $X_t \in \mathbb{R}$, $t \in \mathbb{Z}$ is said to be stationary at the second order iff:

– $(X_t)_{t \in \mathbb{Z}} \in L^2(\mathbb{R})$, i.e., $\forall t \in \mathbb{Z},\ E[X_t^2] < \infty$
– $\forall t \in \mathbb{Z},\ E[X_t] = E[X_0] := \mu_X$
– $\forall s, t \in \mathbb{Z},\ \gamma_X(t, s) := Cov(X_t, X_s) = Cov(X_0, X_{s-t}) =: \gamma(s - t)$

Definition A.1.4 The autocorrelation function of a stochastic process $X_t \in \mathbb{R}$, $t \in \mathbb{Z}$ is the series

$$\rho(s, t) = \frac{Cov(X_t, X_s)}{\left(Var(X_s) \cdot Var(X_t)\right)^{1/2}}$$
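In practice, the autocorrelation function of Definition A.1.4 is estimated from a single observed path. The sketch below shows one common estimator, under the assumption that the series is stationary at the second order, so that the autocorrelation only depends on the lag $h = s - t$. The function name `sample_autocorrelation` is illustrative and is not part of the code provided with this report.

```python
import numpy as np

def sample_autocorrelation(x, max_lag):
    """Estimate rho(h) = gamma(h) / gamma(0) for h = 0..max_lag.

    Assumes second order stationarity: the mean is constant and the
    covariance between X_t and X_{t+h} depends on the lag h only.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    centered = x - x.mean()
    gamma0 = np.dot(centered, centered) / n          # empirical variance
    rho = np.empty(max_lag + 1)
    for h in range(max_lag + 1):
        # Empirical autocovariance at lag h (normalized by n, not n - h).
        gamma_h = np.dot(centered[:n - h], centered[h:]) / n
        rho[h] = gamma_h / gamma0
    return rho

rho = sample_autocorrelation([1.0, 2.0, 3.0, 4.0, 5.0], max_lag=2)
```

Normalizing every lag by `n` rather than `n - h` is a design choice: it slightly biases the estimate towards zero at large lags, but keeps the estimated autocovariance sequence positive semi-definite.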
44.
A.1.2 Linearity

Within the family of stationary processes, an important family is known as the linear processes. They are derived from white noise processes.

Definition A.1.5 A stochastic process $X_t \in \mathbb{R}$, $t \in \mathbb{Z}$ is a weak white noise iff it is stationary at the second order, and:

$$\mu_X = 0, \qquad \forall h \in \mathbb{Z},\ \gamma_X(h) = \sigma^2 \cdot \delta_0(h)$$

Definition A.1.6 A stochastic process $X_t \in \mathbb{R}$, $t \in \mathbb{Z}$ is a strong white noise iff it is i.i.d. and $\mu_X = 0$.

Hence, second order linear processes can now be defined.

Definition A.1.7 $X_t \in \mathbb{R}$, $t \in \mathbb{Z}$ is a weak (resp. strong) second order linear process iff $\exists (Z_t)_{t \in \mathbb{Z}}$, $\exists (\psi_j)_{j \in \mathbb{Z}}$, such that:

– $(Z_t)$ is a weak (resp. strong) white noise of variance $\sigma^2$
– $\sum_{j \in \mathbb{Z}} |\psi_j| < \infty$
– $\forall t \in \mathbb{Z},\ X_t = \sum_{j \in \mathbb{Z}} \psi_j Z_{t-j}$

Second order linear processes are well known and well studied. It may seem overly restrictive to study only these kinds of linear processes. However, Wold's decomposition provides a strong result for these processes:

A.1.3 Wold's decomposition

Every second order stationary process $X_t \in \mathbb{R}$, $t \in \mathbb{Z}$ can be written as the sum of a second order linear process and a deterministic component:

$$\forall t \in \mathbb{Z},\ X_t = \sum_{j \in \mathbb{Z}} \psi_j Z_{t-j} + \eta(t)$$

where:

– $(Z_t)$ is a weak (resp. strong) white noise of variance $\sigma^2$
– $\sum_{j \in \mathbb{Z}} |\psi_j| < \infty$
– $\eta \in \mathbb{R}^{\mathbb{Z}}$

Hence, basic linear processes, such as ARMA models, provide a strong base for explaining stationary processes. However, the stationarity assumption is quite restrictive.
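As an illustration of Definition A.1.7, the sketch below builds a linear process from a given noise path when only finitely many coefficients $\psi_0, \dots, \psi_q$ are non-zero (a moving average). The restriction to a finite, one-sided filter and the function name `linear_process` are assumptions made for the example; the definition itself allows a two-sided infinite sum.

```python
import numpy as np

def linear_process(noise, psi):
    """Return X_t = sum_{j=0..q} psi[j] * Z_{t-j} for every t with a full window.

    noise : realizations Z_0, ..., Z_{n-1} of a white noise
    psi   : finite coefficient sequence psi_0, ..., psi_q (assumption)
    The first q values of X are dropped because Z_{t-j} is not observed there.
    """
    noise = np.asarray(noise, dtype=float)
    psi = np.asarray(psi, dtype=float)
    # 'valid' keeps only the positions where the whole filter overlaps the data.
    return np.convolve(noise, psi, mode="valid")

# Deterministic check on a tiny path: X_t = Z_t + 0.5 * Z_{t-1}.
x_small = linear_process([1.0, 2.0, 3.0], [1.0, 0.5])

# Simulated strong white noise (Gaussian, seed chosen arbitrarily).
rng = np.random.default_rng(0)
x_sim = linear_process(rng.standard_normal(1000), [1.0, 0.5])
```

For this moving average of order 1 with unit noise variance, the theoretical autocovariances are $\gamma(0) = 1 + 0.5^2$ and $\gamma(1) = 0.5$, and $\gamma(h) = 0$ for $|h| > 1$.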
45.
A.2 The particular case of financial time series: parametric and non parametric extensions

A.2.1 Non-stationary and non linear financial time series

Financial time series are known for displaying a few characteristics unknown to stationary or linear processes:

– Their fat-tailed distributions are not compatible with Gaussian density functions. They are more accurately fitted by power laws, i.e., processes of infinite variance. These processes are not used in practice, because a measure of volatility (i.e., variance) is paramount in finance (for example, in order to price options, or to compute Sharpe ratios of indices, stocks or strategies; see our definitions in chapter 4).
– Non-linearity: they display non constant variance. Clusters of volatility are common in financial time series. These clusters are incompatible with linear and stationary processes like ARMA (which have a constant variance).
– Non-stationarity: they have a long term memory.
– Time inversion: linear stationary processes are invariant under time reversal. However, a financial time series is obviously coherent with only one time direction, and is not consistent if time is reversed.

A.2.2 Parametric processes for financial time series

To tackle the issue of non linearity, popular parametric models are ARCH(p) and GARCH(p,q).

Definition A.2.1 $X_t \in \mathbb{R}$, $t \in \mathbb{Z}$ is defined as an ARCH(p) process by:

$$X_t = \sigma_t Z_t, \qquad \sigma_t^2 = \psi_0 + \sum_{j=1}^{p} \psi_j X_{t-j}^2$$

where: $\forall j \in [[1; p]],\ \psi_j > 0$ and $Z_t \sim iid(0, 1)$.

Definition A.2.2 $X_t \in \mathbb{R}$, $t \in \mathbb{Z}$ is defined as a GARCH(p,q) process by:

$$X_t = \sigma_t Z_t, \qquad \sigma_t^2 = \psi_0 + \sum_{j=1}^{p} \psi_j X_{t-j}^2 + \sum_{j=1}^{q} \varphi_j \sigma_{t-j}^2$$
46.
where: $\forall j \in [[1; p]],\ \psi_j > 0$, $\forall j \in [[1; q]],\ \varphi_j > 0$ and $Z_t \sim iid(0, 1)$.

A.2.3 Non parametric processes for financial time series

Numerous non parametric methods have been and are being developed to fit financial data. The goal here is not to mention all of them. The Empirical Mode Decomposition belongs to this family.

A.3 General time series: the Box & Jenkins approach for prediction

Within the framework of Box and Jenkins (1970), a time series can be modeled as the realization of simultaneous phenomena:

– The first, $\eta_t$, is a regular and smooth time evolution, called trend.
– The second, $S_t$, is a periodic process of period T.
– The third component, $W_t$, is the random component. It can be a stationary process.

$$\forall t \in \mathbb{Z},\ X_t = W_t + S_t + \eta_t$$

Definition A.3.1 $S_t$ is a periodic process with period T iff:

$$\forall t \in \mathbb{Z},\ S_{t+T} = S_t \qquad \text{and} \qquad \sum_{t=1}^{T} S_t = 0$$

The map $L^2(\mathbb{R}) \times L^2(\mathbb{R}) \to \mathbb{R}$, $(X, Y) \mapsto Cov(X, Y)$ is also used.

From this framework, we will connect the EMD algorithm to the time series literature. Some assumptions will be made to match this approach, and they will drive the prediction algorithms formulated later in this chapter.
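To make the decomposition $X_t = W_t + S_t + \eta_t$ concrete, the sketch below applies the classical moving average decomposition for a known period. This is a textbook technique used here purely as an illustration; it is not the EMD and not the method of this report. The function name `classical_decomposition` is hypothetical, the period is assumed to be known, and the series is assumed to cover at least two full periods.

```python
import numpy as np

def classical_decomposition(x, period):
    """Split x into trend, seasonal and residual parts: x = eta + S + W.

    The trend is a centered moving average over one period, the seasonal
    part is the average detrended value at each phase, re-centered so that
    it sums to zero over one period (Definition A.3.1).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if period % 2 == 0:
        # Even period: half weights at both ends keep the window centered.
        weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        weights = np.ones(period) / period
    half = len(weights) // 2
    trend = np.full(n, np.nan)                  # undefined near the edges
    trend[half:n - half] = np.convolve(x, weights, mode="valid")
    detrended = x - trend
    phase_means = np.array(
        [np.nanmean(detrended[k::period]) for k in range(period)]
    )
    phase_means -= phase_means.mean()           # enforce sum over a period = 0
    seasonal = np.tile(phase_means, n // period + 1)[:n]
    residual = x - trend - seasonal
    return trend, seasonal, residual

# Synthetic example: linear trend plus a zero-sum pattern of period 4.
t = np.arange(24)
pattern = np.array([1.0, -1.0, 2.0, -2.0])
series = 0.5 * t + np.tile(pattern, 6)
trend, seasonal, residual = classical_decomposition(series, period=4)
```

On this noise-free example the seasonal pattern is recovered exactly and the residual vanishes away from the edges, where the centered moving average is not defined.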
47.
Appendix B
Evaluation criteria of backtests

In order to evaluate the efficiency of the strategies highlighted above, a few rules need to be followed. As mentioned above, adaptability is the main one. Every back test ought to be implemented as if the future were unknown. This implies some obligations:

- a time frame restriction: every prediction must be computed without using any feature of its future values.
- a class restriction: predictions must be evaluated on an aggregate basis, for every kind of underlying. For example, the best or worst performers should not be singled out, because this is another form of unadaptability.

However, it remains pertinent to evaluate the performance of the strategies with regard to the asset class to which they are applied. Hence, as will be mentioned further on, some strategies might be more efficient on stocks, trading pairs, or implied volatilities.

Within asset management theory, a few variables suffice to quantify the efficacy of a trading strategy. They are usually noted as follows.

Let $X$ be the random variable which denotes the annualized gain (or loss) of each trade of a trading strategy. Its realizations are written $X_i$, $i = 1..n$.

Definition B.0.2

Average return = $E[X]$
Average gain = $E[X \mid X > 0]$
Average loss = $E[X \mid X < 0]$
Drawdown = $-\min\{X_i,\ i = 1..n\}$
Hit ratio = $P(X > 0)$

These values need to be analysed together. A hit ratio above 50% should be compared with the average gain and loss. The drawdown also provides valuable information on the risk of the trading strategy, and gives a hint of the Sharpe ratio. Drawdowns are a valuable tool for calibrating stop loss thresholds.
48.
Definition B.0.3

$$\text{Information ratio} = \frac{E[X - T]}{\sqrt{Var(X - T)}}$$

with $T$ as a benchmark performance rate. In the results displayed in the annexes, the benchmark performance rate plugged into the computations of information ratios is 0%.

Definition B.0.4

$$\text{Sortino ratio} = \frac{E[X - T]}{\sqrt{Var(X \mid X < T)}}$$

with $T$ as a benchmark performance rate. In our results displayed in the annexes, the benchmark performance rate in our computations of Sortino ratios is 0%. Therefore, this measurement only takes into account "negative volatility", i.e., the volatility of losses.

Definition B.0.5

$$\text{Sharpe ratio} = \frac{E[X - r_f]}{\sqrt{Var(X)}}$$

with $r_f$ as the annualized risk free rate.

The analyses in this report rely mainly on the information ratio and the Sortino ratio. However, in light of the current risk free rates, the information ratio can be considered a good approximation of the Sharpe ratio.
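The criteria of Definitions B.0.2 to B.0.5 can be estimated from the list of annualized trade returns. The sketch below is one possible empirical version, not the code delivered with this report: the function name `backtest_metrics` and the dictionary keys are hypothetical, expectations are replaced by sample means, and variances are population variances (division by n), which is an assumption.

```python
import numpy as np

def backtest_metrics(returns, benchmark=0.0, risk_free=0.0):
    """Empirical evaluation criteria for a vector of annualized trade returns."""
    x = np.asarray(returns, dtype=float)
    gains, losses = x[x > 0], x[x < 0]
    excess = x - benchmark
    downside = x[x < benchmark]          # trades below the benchmark rate
    nan = float("nan")
    return {
        "average_return": x.mean(),
        "average_gain": gains.mean() if gains.size else nan,
        "average_loss": losses.mean() if losses.size else nan,
        "drawdown": -x.min(),            # worst single trade, sign flipped
        "hit_ratio": np.mean(x > 0),
        "information_ratio": excess.mean() / excess.std(),
        "sortino_ratio": excess.mean() / (downside.std() if downside.size else nan),
        "sharpe_ratio": (x.mean() - risk_free) / x.std(),
    }

metrics = backtest_metrics([0.10, -0.05, 0.20, -0.10])
```

With a 0% benchmark and a 0% risk free rate, as in the annexes, the information ratio and the Sharpe ratio coincide by construction. A strategy with identical returns on every trade has zero variance, in which case the ratios are not defined.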