Xi’an Jiaotong-Liverpool University
Final Year Project
Final Report
Quasi-Monte Carlo Methods
in Market Risk Management
拟拟拟蒙蒙蒙特特特卡卡卡罗罗罗方方方法法法
在在在市市市场场场风风风险险险管管管理理理中中中的的的运运运用用用
Author:
Yifan Guo
ID Number:
1100163
Supervisor:
Dr. Halis Sak
Contents
1 Introduction
2 Justification of Research Problem
3 Literature Review
4 The model
4.1 Copula
4.1.1 Gauss Copula
4.1.2 t-copula
4.2 The stock portfolio model with t-copula
4.3 Naive simulation
5 Methodology
5.1 Variance reduction methods
5.1.1 Importance Sampling
5.1.2 Stratified Sampling
5.2 Quasi-Monte Carlo
5.2.1 General Principles and Discrepancy
5.2.2 Halton Sequence and Sobol Sequence
5.3 Combination of Quasi-Monte Carlo and Stratified Importance Sampling
6 Numerical Analysis
7 Conclusion
References
Appendix
A Algorithms
A.1 AOA Algorithm
A.2 Algorithms for Stratified Importance Sampling
A.3 Algorithm for the combination of Quasi-Monte Carlo and Stratified Importance Sampling
B R codes
B.1 Quasi-random number generation codes
B.2 Codes for the combination of Quasi-Monte Carlo and Stratified Importance Sampling
Abstract
In risk management, the tail loss probability measures the risk that a stock
portfolio return falls below a specific threshold level. A t-copula stock portfolio
model is used to estimate the tail loss probability. This report first outlines
how the two existing variance reduction methods, importance sampling and
stratified importance sampling, proposed by Sak et al. (2010) and Başoğlu
et al. (2013), are applied to market risk management. Our aim is to combine
quasi-Monte Carlo with stratified importance sampling to obtain a further
variance reduction. Our numerical results suggest that this combination does
not improve the variance reduction by much.
List of Figures
1 Simulation of Gauss copula
2 Simulation of t-copula
3 Dependency comparison
4 Uniform Random Pairs
5 Halton Sequence
6 Sobol Sequence
7 High Dimensional Halton Sequence
8 High Dimensional Sobol Sequence
List of Tables
1 P(R < t) ≈ 0.05
2 P(R < t) ≈ 0.001
1 Introduction
Monte Carlo simulations have been applied in many areas. Among those, one
important area is market risk management. Market risk focuses on the possibil-
ity that a bank may suffer an undesired loss due to price changes (Mehta et al.,
2012). Market risks are usually categorized as equity risk, interest rate risk, cur-
rency risk and commodity risk. Over the past decades, bankers have utilized different
mathematical models and tools to manage market risks.
It is widely accepted that Value-at-Risk (VaR) is a standard risk assessment
tool in the industry. For a given time period T and a known loss distribution
P, Value-at-Risk is the largest loss l such that the probability of it being
exceeded (L > l) is at most 1 − α (Duffie and Pan, 1997). The reverse VaR
computation tries to find α given l (Glass, 1999). The tail loss probability, the
target risk measure in this project, is such a reverse computation: the probability
that the return R of the stock portfolio is smaller than a threshold x, i.e. P(R < x).
VaR calculations are based on probability distributions, so it is important to find
an approach that effectively simulates real market scenarios. Variance-covariance
solutions, historical simulation and Monte Carlo simulation are commonly used,
and the last one is considered flexible enough for complex scenarios without
unrealistic assumptions (Glasserman et al., 2000).
Monte Carlo simulation approximates expectations by computing the sample
mean of a random sample drawn from a uniform distribution (Glasserman,
2004, Chapter 1, Page 2). However, plain Monte Carlo simulations usually
suffer from large variance and consume much time because of the huge number
of replications (Glasserman et al., 2000). In this case, variance reduction methods
are often utilized in Monte Carlo simulations. These methods improve the efficiency
of Monte Carlo simulation by reducing the variance of the simulation estimates. A
big advantage is that, instead of offering general methods for general applications,
they often exploit specific features of each problem (Glasserman, 2004, Chapter 4,
Page 185). The most commonly used methods are control variates, antithetic
variates, stratified sampling and importance sampling. Among these methods,
importance sampling designs a new sampling density so that the expectation can
be estimated with a lower variance, while stratified sampling draws fixed fractions
of the observations from the strata of a partition of the sample space (Glasserman,
2004, Chapter 4, Pages 209 and 255). Moreover, the combination of these two
methods further improves the accuracy: Başoğlu et al. (2013) provide an efficient
simulation method which reduces the variance as well as minimizing the maximum
relative error.
Another technique to improve the efficiency of Monte Carlo simulation is the quasi-
Monte Carlo method, which uses quasi-random numbers. Ordinary Monte Carlo
methods generate random numbers from pseudo-random sequences, often leading
to a clumping of observations (Caflisch, 1998). The quasi-Monte Carlo method
employs low-discrepancy sequences, like the Halton sequence, to generate determin-
istic numbers instead of random numbers. This deterministic approach increases
the accuracy because it generates evenly distributed numbers (Glasserman, 2004,
Chapter 5, Page 180). Quasi-Monte Carlo methods have been applied in numer-
ical finance, especially to price derivatives. For example, Birge (1995) stated that
low-discrepancy sequences accelerated the convergence of the estimates to the true
call option prices.
This report is organized as follows. In Section 2, the research problem is
justified. In Section 3, a literature review is given, describing the most relevant
models in previous papers. In Section 4, our stock portfolio model and the naive
simulation estimator, including the copula and t-copula background, are presented.
The methodology in Section 5 first explains the variance reduction methods,
importance sampling and stratified sampling, in detail with small examples;
quasi-Monte Carlo and the combination of QMC and stratified importance
sampling are then illustrated, also in Section 5. Finally, the numerical analysis in
Section 6 reports the results of the relevant experiments and discusses the variance
reduction achieved by the combined method.
2 Justification of Research Problem
In this project, the objective is to find an efficient simulation method combining
the quasi-Monte Carlo method, importance sampling and stratified sampling, aiming
to improve the efficiency of estimating the tail loss probability. Papers on quasi-
Monte Carlo methods mostly concern option pricing, and there are no applications
related to risk management, especially to the tail loss probability. On the other hand,
no paper proposes a combined method employing IS, SIS and QMC to improve the
efficiency of the tail loss probability estimate. This project is a first attempt at this
combination, hoping to create a simulation method with significant efficiency
improvements.
3 Literature Review
Sak et al. (2010) give an efficient risk simulation method for the t-copula model, using an
importance sampling algorithm. The method is based on a stock portfolio model
that employs a t-copula for the dependence structure of the logarithmic returns of the
portfolio. A copula gives a way to model the marginal behaviour of each risk factor of
the portfolio separately rather than to find the joint distribution function, which is
time-saving (McNeil et al., 2010, Chapter 5). Sak et al. (2010) choose the t-copula
dependence structure with generalized hyperbolic marginal distributions, and also with
t marginal distributions, because this model is more heavy-tailed and therefore more
accurate (Demarta and McNeil, 2005). Also, Prause (1999) and Aas and Haff (2006)
both suggest that the generalized hyperbolic distribution provides high flexibility and
a good fit to the marginal behaviour of financial data. In this case, a fast inversion
method, PINV, can be utilized to compute the quantiles of the generalized hyperbolic
margins (Derflinger et al., 2010). However, naive simulation with the crude Monte Carlo
method has a large variance. Therefore, Sak et al. (2010) propose an efficient importance
sampling algorithm. This method adds a mean shift vector µ to the random vector Z and
changes the scale parameter of the chi-squared variable Y. These values are obtained by
solving an optimization problem that sets the mode of the IS density equal to the mode
of the zero-variance IS density (Glasserman et al., 1999). The optimization exploits the
BFGS quasi-Newton algorithm suggested by Byrd et al. (1995). The numerical results
showed that the method can estimate the risk efficiently and accurately.
Başoğlu et al. (2013) combine importance sampling and stratified sampling to
further improve the accuracy. The method modifies the adaptive optimal allocation
(AOA) algorithm for stratification. Using conditional standard deviations,
AOA adapts the allocations stratum by stratum and finally takes a weighted sample
average of the drawings in the last iteration as the stratification estimator. The
allocation proportions converge over the iterations to the optimal allocations, for
which the variance is minimal (Etoré and Jourdain, 2010). Başoğlu et al. (2013)
state that in their modification the number of observations allocated over all
iterations need not equal the target number of observations of the stratum of
interest, and the minimum allocation size in each stratum is increased to 10. After
this modification, importance sampling can be added to the AOA algorithm:
multiplying the responses in the AOA estimator by the importance sampling (IS)
likelihood ratio gives a new estimator. When estimating tail loss probabilities, the
stratified importance sampling (SIS) algorithm is based on the IS parameters
calculated by Algorithm 3 of Sak et al. (2010). It requires the multi-normal random
vector Z to be stratified along the direction of the IS mean shift vector,
µ_d = µ/‖µ‖, and the gamma variable Y to be stratified under the new IS scale
parameter. This SIS algorithm effectively minimizes the maximum relative error of
multiple tail loss probability estimates, according to numerical experiments
(Başoğlu et al., 2013).
The two variance reduction methods summarized above use pseudo-random
number generators. These methods are limited because plain Monte Carlo
converges much more slowly than quasi-Monte Carlo. Therefore, replacing
the pseudo-random number generators with a low-discrepancy sequence is likely to
accelerate the convergence of the simulation and reduce the variance. The basic
principle of the quasi-Monte Carlo approximation is the same as that of Monte
Carlo (MC) methods. We want to estimate the expectation
\mu = E[f(U)] = \int_{[0,1)^d} f(x)\,dx,
where U = (U_1, U_2, \ldots, U_d) is a d-dimensional vector of independent uniformly
distributed random variables. The quasi-Monte Carlo estimator of this expectation is

\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} f(x_i),

where x_1, \ldots, x_n are deterministically chosen points in the unit hypercube
[0, 1)^d. Efficient QMC methods mainly rely on [1] dimension reduction methods
and [2] efficient low-discrepancy sequences. [1] QMC methods explicitly depend on
the dimension of the problem and should be applied with an upper bound on the
dimension d, otherwise they do not work well. Low dimensions are stated to have
lower relative error compared to high dimensions (Glasserman, 2004, Chapter 5,
Pages 282-283).
Caflisch (1998) emphasizes the significant efficiency improvement of a dimension
reduction method: the Brownian bridge construction reduces the effective dimension
of a path-dependent problem to a moderate size. [2] QMC methods employ low-
discrepancy sequences to fill the hypercube uniformly with the points x_i. Common
sequences are the Halton sequence and the Sobol sequence (Glasserman, 2004,
Chapter 5, Pages 293-314). Joy et al. (1996) exploited the Faure sequence for
standard European options, and the numerical results show good convergence to
the true values and lower error bounds. In the experiments of Birge (1995), for a
path-based option, the three previously mentioned sequences all reduce the number
of observations used in the simulation and the computation time; the most effective
sequence with the smallest error is the Sobol sequence, but it consumes more
computation time than the other sequences. Besides, traditional variance reduction
techniques can still improve the accuracy under quasi-Monte Carlo simulation. Joy
et al. (1996) state that Richardson extrapolation in conjunction with quasi-Monte
Carlo methods can further improve the accuracy, and Glasserman et al. (2002)
discuss the combination of IS and QMC methods in option pricing.
4 The model
The stock portfolio model that underlies our market risk computation is
based on the t-copula. Therefore, we will first illustrate copulas, especially the
t-copula, and then define our specific stock portfolio model with the t-copula.
4.1 Copula
For a random vector of variables, a copula enables us to look separately at the
marginal distribution of each variable and at their dependence structure, whereas a
joint distribution function carries that information in a more implicit way. Copulas
are applied especially in risk management, in view of the fact that we often have
access to the marginal behaviour of individual risk factors rather than to the
dependence structure of all factors. After estimating the marginal distribution
functions, we can select the dependence model from various possible dependence
structures, which is time-saving compared with finding an explicit joint distribution
function (McNeil et al., 2010).
Definition. (McNeil et al., 2010, Chapter 5, Page 185) C : [0, 1]^d → [0, 1] is
a d-dimensional copula when C is a joint cumulative distribution function of a
d-dimensional vector u = (u_1, u_2, ..., u_d) on the unit cube [0, 1]^d with standard
uniform marginal distributions. We write copulas as C(u) = C(u_1, ..., u_d).
Sklar (1959) offers an approach (Sklar's Theorem) to find a copula of a joint
distribution by first specifying the marginal distribution functions: for a joint distri-
bution F of a random vector X = (X_1, ..., X_d) with marginal CDFs
F_1, F_2, ..., F_d, there exists a copula C : [0, 1]^d → [0, 1] such that

F(x_1, \ldots, x_d) = C(F_1(x_1), \ldots, F_d(x_d)).  (1)
Several copulas, such as the Gauss, Gumbel and t-copulas, have been identified
and applied to fit real-world variables. The Gauss copula is one of the implicit
copulas that have no closed-form expression but are drawn directly from a usual
multivariate distribution via Sklar's Theorem. An introduction to this copula
and an example of its simulation are given in the next subsection.
4.1.1 Gauss Copula
Before introducing the Gauss copula, we first define the notion of the copula of a
specific distribution.
Definition. (Copula of F) When a random vector X has joint distribution
function F with continuous marginal distributions F_1, ..., F_d, the copula of F (or
of X) is defined as the distribution function C of (F_1(X_1), ..., F_d(X_d)).
Therefore, if Y ∼ N_d(µ, Σ) is a Gaussian random vector, where µ is the mean
vector and Σ is the covariance matrix of Y (for detailed explanations see (McNeil
et al., 2010, Chapter 2, Page 63)), its copula is the so-called Gauss copula. McNeil
et al. (2010) note that it is equal to the copula of X ∼ N_d(0, P), where P = ℘(Σ)
(see (McNeil et al., 2010, Chapter 3, Page 64)) is the correlation matrix of Y. By
the definition of the copula of F, the Gauss copula is thus given by

C^{Ga}_{P}(u) = P(\Phi(X_1) \le u_1, \ldots, \Phi(X_d) \le u_d) = \Phi_P(\Phi^{-1}(u_1), \ldots, \Phi^{-1}(u_d)),  (2)

where Φ denotes the standard univariate normal distribution function and Φ_P de-
notes the joint distribution function of X. The notation C^{Ga}_P emphasizes that the
d(d − 1)/2 parameters of the correlation matrix parametrize the copula; C^{Ga}_ρ is
written with ρ = ρ(X_1, X_2) for the two-dimensional copula. Now we can draw a
simulation of the Gauss copula in the following example.
Figure 1: Simulation of Gauss copula
Example. (Simulation of the Gauss copula) If we want to simulate a bivariate
Gauss copula (the dimension of the Gauss copula is 2) with ρ = 0.7, we can write the
following R code.

GaussCopula <- function(n) {
  L <- t(chol(R))                       # lower triangular Cholesky factor of R
  Z <- matrix(rnorm(dim * n), n, dim)   # iid standard normal input
  X <- matrix(0, n, dim)
  for (i in 1:n) X[i, ] <- L %*% Z[i, ] # correlated normal vectors
  pnorm(X)                              # probability transform of each margin
}
R   <- matrix(c(1, 0.7, 0.7, 1), 2, 2, byrow = TRUE)
dim <- 2
U   <- GaussCopula(2000)
plot(U[, 1], U[, 2], pch = "*", col = "red",
     main = "Gauss copula simulated points", xlab = "U1", ylab = "U2")
In addition, Figure 1 shows the 2000 simulated points for the Gauss copula example.
A generalized algorithm to simulate the Gauss copula, proposed by McNeil et al.
(2010), is shown in Algorithm 4.1.
Algorithm 4.1 Simulation of Gauss copula
Input:
the correlation matrix P ∈ R^{d×d}
Output:
the random vector U which has distribution function C^{Ga}_P
1: compute the Cholesky factor L of P
2: generate a vector Z = (Z_1, ..., Z_d) of independent Z_i ∼ N(0, 1), i = 1, ..., d
3: compute X = LZ
4: return U = (Φ(X_1), ..., Φ(X_d)), where Φ is the standard normal distribution
function.
4.1.2 t-copula
Both the Gauss copula and the t-copula are easy to construct compared with the
Gumbel and Clayton copulas, because they can both be extracted from usual
multivariate distributions.
Figure 2: Simulation of t-copula
Definition. For a t_d(ν, 0, P) distributed random vector X with mean vector 0,
correlation matrix P implied by the dispersion matrix Σ, and ν degrees of
freedom, the unique t-copula takes the form

C^{t}_{\nu,P}(u) = t_{\nu,P}\bigl(t_\nu^{-1}(u_1), \ldots, t_\nu^{-1}(u_d)\bigr),  (3)

where the d-dimensional vector u = (u_1, u_2, ..., u_d) denotes the componentwise
probability-transformed random vector (t_ν(X_1), ..., t_ν(X_d)) and t_{ν,P} is the
joint distribution function of the vector X. The t-copula can also be written in the
more explicit form given by Demarta and McNeil (2005),

C^{t}_{\nu,P}(u) = \int_{-\infty}^{t_\nu^{-1}(u_1)} \cdots \int_{-\infty}^{t_\nu^{-1}(u_d)} \frac{\Gamma\left(\frac{\nu+d}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)\sqrt{(\pi\nu)^d\,|P|}} \left(1 + \frac{x^{\top} P^{-1} x}{\nu}\right)^{-\frac{\nu+d}{2}} dx,  (4)

where t_\nu^{-1} denotes the quantile function of a standard univariate t_ν distribution.
Again, we illustrate the simulation of the t-copula with an example.
Example. (Simulation of the t-copula) We want to simulate a t-copula with pa-
rameters ν = 4 and ρ = 0.71. According to (McNeil et al., 2010, Chapter 5, Page
193), we should first generate a vector X that follows a multivariate t
distribution with ν degrees of freedom, mean vector 0 and correlation matrix
P. The algorithm for this generation is somewhat involved, so we use the R package
mvtnorm to generate X directly with the command rmvt(n, sigma, df), where n is
the number of repetitions and sigma is the correlation matrix (Genz et al., 2014).
The simulation of the t-copula thus becomes easy to implement, as the following R
code shows.
library(mvtnorm)
tcopula <- function(n) {
  X <- rmvt(n, sigma = Sigma, df = nu)  # multivariate t sample
  pt(X, df = nu)                        # probability transform of each margin
}
Sigma <- matrix(c(1, 0.71, 0.71, 1), 2, 2, byrow = TRUE)
nu <- 4
Tc <- tcopula(2000)
plot(Tc[, 1], Tc[, 2], pch = "*", col = "blue",
     main = "t-copula simulated points", xlab = "T1", ylab = "T2")
The 2000 simulated t-copula points are plotted in Figure 2. McNeil et al. (2010)
also provide algorithms to simulate the t-copula; we only show the algorithm
without the specific generation of the multivariate t distributed vector in
Algorithm 4.2.
Algorithm 4.2 Simulation of t-copula
Input:
the correlation matrix P ∈ R^{d×d}; degrees of freedom ν
Output:
the random vector T which has distribution function C^{t}_{ν,P}
1: generate X ∼ t_d(ν, 0, P)
2: return T = (t_ν(X_1), ..., t_ν(X_d)), where t_ν is the distribution function of a
standard univariate t distribution.
Figure 3: Dependency comparison
In financial risk management, several copulas are applied in risk simulation models.
Compared with other copulas, such as the Gauss copula, the t-copula is regarded as
superior because it captures the dependence between extreme values of financial
data, e.g. daily exchange rate returns, better (Demarta and McNeil, 2005). This
difference is illustrated in Figure 3. McNeil et al. (2010) suggest comparing the
meta distributions with standard normal margins for the Gauss copula and the
t-copula, obtained via the quantile function of the standard normal distribution;
their linear correlations are both roughly 70%. The Gauss copula has no tail
dependence, as shown in the left panel; in contrast, the t-copula has both upper
and lower tail dependence, which represents dependence between extreme values.
4.2 The stock portfolio model with t-copula
Assume a linear stock portfolio whose log-return vector X = (X_1, ..., X_d)
follows a t-copula with ν degrees of freedom, and let the weight vector
w = (w_1, ..., w_d) give the weight of each stock. We can then generate the
log-return vector such that

X_i = c_i\, G_i^{-1}\bigl(F_\nu(T_i)\bigr), \quad i = 1, 2, \ldots, d,  (5)

where G_i denotes the CDF of the i-th stock's marginal distribution and F_ν(·)
denotes the CDF of the t distribution with ν degrees of freedom. The scaling
factor c_i depends on the yearly volatility σ_i and the variance var_i of the
margin G_i:

c_i = \frac{\sigma_i}{\sqrt{252}} \frac{1}{\sqrt{var_i}}.  (6)

The correlation matrix Σ describes the dependence structure of the model, and we
compute its lower triangular Cholesky factor L such that LL^{\top} = Σ. A vector
of iid standard normal variables Z = (Z_1, ..., Z_d) is introduced to generate the
multivariate t-copula random vector T. We multiply L by Z to get the multi-
variate normal vector \tilde{Z} = LZ. We then generate the vector
T = (T_1, ..., T_d) satisfying

T = \tilde{Z} / \sqrt{Y/\nu},  (7)

where Y is a chi-square distributed random variable with ν degrees of freedom.
Finally we obtain the return function

R(Z, Y) = \sum_{i=1}^{d} w_i \exp\bigl(c_i\, G_i^{-1}(F_\nu(T_i))\bigr).  (8)
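As a concrete illustration of equations (5)-(8), the following R sketch generates portfolio returns from this model under the simplifying assumption that the margins G_i are standard t distributions with ν degrees of freedom (so that G_i^{-1}(F_ν(·)) reduces to the identity); the weights, volatilities and correlation matrix below are placeholder values, not fitted parameters.

set.seed(1)
d  <- 3; nu <- 5
w     <- rep(1 / d, d)                        # portfolio weights (placeholders)
sigma <- rep(0.15, d)                         # yearly volatilities (placeholders)
Sigma <- matrix(0.2, d, d); diag(Sigma) <- 1  # correlation matrix of the t-copula
L    <- t(chol(Sigma))                        # lower triangular Cholesky factor
vari <- nu / (nu - 2)                         # variance of a standard t_nu margin
ci   <- sigma / sqrt(252) / sqrt(vari)        # scaling factors, equation (6)

simulate_returns <- function(n) {
  R <- numeric(n)
  for (k in 1:n) {
    Z <- rnorm(d)                             # iid standard normals
    Y <- rchisq(1, df = nu)                   # chi-square mixing variable
    Tv <- as.vector(L %*% Z) / sqrt(Y / nu)   # multivariate t vector, equation (7)
    X  <- ci * qt(pt(Tv, df = nu), df = nu)   # log-returns, equation (5)
    R[k] <- sum(w * exp(X))                   # portfolio return, equation (8)
  }
  R
}
R <- simulate_returns(1000)

With t margins the inner inversion qt(pt(·)) is of course the identity; it is written out only to mirror the structure of equation (5), where G_i^{-1} would in practice be the generalized hyperbolic quantile function discussed below.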
Marginals and inversion Although the normal distribution is the simplest model,
it fails to capture fat tails and asymmetric dependence (Patton, 2002). Prause (1999)
and Aas and Haff (2006) both suggested that the generalized hyperbolic distribu-
tion provides high flexibility and a good fit to the marginal behaviour of financial
data. We thus use the generalized hyperbolic distribution to fit the marginal be-
haviour of realistic financial data, namely the G_i in the model. For the same
reason, Sak et al. (2010) first assumed that the logarithmic returns of stock portfolios
follow a t-copula with generalized hyperbolic marginal distributions and provided
a corresponding efficient risk simulation method.
In the simulation, one important task is to compute the inverse of the marginal dis-
tributions, namely G_i^{-1}. In the R environment, the existing methods for inverting
the t distribution and the generalized hyperbolic distributions, the quantile
function qt() and the algorithms of the package ghyp (Breymann and Lüthi, 2013) re-
spectively, are not ideal because they take much time. We therefore introduce
algorithms that give a faster inversion method.
Runuran and PINV The R package Runuran provides an interface to the C library
UNU.RAN (Universal Non-Uniform RANdom variate generators). It offers alterna-
tive, faster R functions to automatically generate random variates for a large
number of distributions and to efficiently invert CDFs. Runuran functions use
universal algorithms that can generate non-uniform pseudo-random variates for
distributions lacking a dedicated generation method, and that can also be used for
standard distributions with higher efficiency than the existing R functions. As a result,
Runuran is often applied when we need robust, ready-to-use sampling or when
sampling and simulating from truncated or unusual distributions (Leydold et al., 2014).
When using such a universal algorithm, we have to supply information about the target
distribution, usually the PDF and the CDF or a probability vector. Some algorithms,
like HINV, require the CDF, which is often difficult to obtain; this is why we
prefer the algorithm PINV. The PINV algorithm requires only the probability density func-
tion, which is very convenient. Moreover, based on Newton interpolation
and Gauss-Lobatto quadrature, the algorithm PINV saves a great deal of time in the in-
version (Derflinger et al., 2010). As Sak et al. (2010) and Derflinger et al. (2010)
mention, when the target distribution is the generalized hyperbolic distribution, us-
ing PINV accelerates the simulation by a factor of more than 1000 compared with the
quantile function of the package ghyp and by a factor of about 100 compared with
directly evaluating the CDF.
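A minimal sketch of how PINV is set up in Runuran is given below; the target here is a standard t density, purely for illustration, whereas in the portfolio model one would pass the generalized hyperbolic density of each margin instead.

library(Runuran)
nu  <- 5
# Build a PINV (polynomial inversion) generator from the density alone (no CDF needed).
gen <- pinv.new(pdf = function(x) dt(x, df = nu),
                lb = -Inf, ub = Inf, center = 0)
u <- c(0.01, 0.5, 0.99)
uq(gen, u)        # fast approximate quantiles G^{-1}(u)
qt(u, df = nu)    # reference values from the exact quantile function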
VaR and tail loss probability In risk management, Value-at-Risk (VaR) is a
basic and widely accepted risk measure. For a given time period T and a known loss
distribution P, Value-at-Risk is the largest loss l such that the probability
of it being exceeded (L > l) is at most 1 − α (Duffie and Pan, 1997). Artzner et al.
(1999) gave the mathematical definition

VaR_{\alpha}(L) = \inf\{l \in \mathbb{R} : P(L > l) \le 1 - \alpha\},  (9)

i.e. VaR_α is the upper α-quantile of the loss distribution. The VaR computation
finds this upper α-quantile when α is given; the converse computation determines
α when l is given (Glass, 1999). Similarly, in our simulation, we fix a threshold x
and aim to find the tail loss probability that the return R of the stock portfolio is
smaller than x, i.e. P(R < x).
4.3 Naive simulation
Finding P(R < x) is equivalent to estimating E[1{R(Z,Y) < x}] (Başoğlu et al.,
2013). We can now easily formulate a naive simulation. The procedure is
shown in Algorithm 4.3.
Algorithm 4.3 Computation of P(R < x) using naive simulation
Input:
Cholesky factor L of Σ, i.e., LL^{\top} = Σ
The scaling factors c_i, for i = 1, ..., d, computed using (6)
Output:
Tail loss probability P(R < x)
1: for k = 1, ..., n do
2: generate independent standard normal variates Z_i, i = 1, ..., d, and set \tilde{Z} = LZ
3: generate Y from the χ²_ν distribution
4: calculate the t-distributed vector T = \tilde{Z}/\sqrt{Y/\nu}
5: calculate the total return R^{(k)} = \sum_{i=1}^{d} w_i \exp\bigl(c_i\, G_i^{-1}(F_\nu(T_i))\bigr)
6: end for
7: return P(R < x) ≈ \frac{1}{n}\sum_{k=1}^{n} 1\{R^{(k)} < x\}, where 1{·} denotes the indicator
function
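A naive estimate of the tail loss probability and its standard error could then be obtained along the following lines, reusing the illustrative simulate_returns sketch from Section 4.2 (the threshold x below is an arbitrary placeholder).

naive_tail_prob <- function(n, x) {
  R   <- simulate_returns(n)          # illustrative model sketch from Section 4.2
  ind <- as.numeric(R < x)            # indicator 1{R < x}
  c(estimate  = mean(ind),
    std.error = sd(ind) / sqrt(n))    # crude Monte Carlo standard error
}
naive_tail_prob(10^5, x = 0.98)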
Although the naive simulation gives a very straightforward and simple way to
estimate the tail loss probability, its variance is large. Therefore, we need to
reduce the variance using variance reduction methods. Importance sampling and
stratified sampling are two important variance reduction methods and have been
applied to the simulation of tail loss probabilities.
5 Methodology
5.1 Variance reduction methods
5.1.1 Importance Sampling
Importance sampling is both a sampling tool and a variance reduction method in
Monte Carlo computing. Suppose we want to estimate the expectation
µ = E_f[h(X)] of a function h of a random variable X with density f. A crude
Monte Carlo estimate often has a large variance; in particular, if h(x) is nearly zero
outside a region D that is rarely hit when sampling from f, ordinary Monte Carlo
hardly ever places points inside the region D (Owen, 2013, Chapter 9).
Importance sampling introduces a new sampling distribution and approximates the
expectation µ = E_f[h(X)] as a weighted average of the new sample (Tokdar and
Kass, 2010). That is, our problem is to find µ = E_f[h(X)] with

\mu = \int h(x) f(x)\,dx,  (10)

where f(x) is the probability density function. According to (Owen, 2013, Chapter 9),
if we introduce another probability density function g(x) satisfying g(x) > 0 when-
ever h(x)f(x) ≠ 0, we have

\mu = \int h(x)\,\frac{f(x)}{g(x)}\,g(x)\,dx = E_g\!\left[\frac{h(X)f(X)}{g(X)}\right] = E_g[w(X)h(X)],  (11)

where w(x) = f(x)/g(x) is the adjustment factor called the likelihood ratio, and
E_g[·] denotes the expectation with respect to g(x).
The importance sampling estimate \hat{\mu}_g of µ = E_g[w(X)h(X)] is

\hat{\mu}_g = \frac{1}{n}\sum_{i=1}^{n} w(x_i)h(x_i),  (12)
where the sample x_1, x_2, ..., x_n is drawn independently from g(x). To illustrate
importance sampling in a more comprehensive way, Haugh (2004) gives a good
example, which we reproduce here in a slightly modified form.
Example 1 (Estimating P(X ≥ 2)) Suppose we wish to estimate

\theta = P(X \ge 2) = E[1_{\{X \ge 2\}}],

where X ∼ N(0, 1). We may then write

\theta = E[1_{\{X \ge 2\}}] = \int_{-\infty}^{\infty} 1_{\{x \ge 2\}}\,\frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{x^2}{2}\right)dx
= \int_{-\infty}^{\infty} 1_{\{x \ge 2\}}\,\frac{\exp(-x^2/2)}{\exp(-(x-\mu)^2/2)}\,\frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{(x-\mu)^2}{2}\right)dx
= \int_{-\infty}^{\infty} 1_{\{x \ge 2\}}\exp(-\mu x + \mu^2/2)\,\frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{(x-\mu)^2}{2}\right)dx
= E_\mu\!\left[1_{\{X \ge 2\}}\exp(-\mu X + \mu^2/2)\right],

where the notation E_µ[·] means that now X ∼ N(µ, 1). Let us now estimate θ by
simulating X from the N(µ, 1) distribution, so that

g(x) = \frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{(x-\mu)^2}{2}\right).

If we set µ = 2, for instance, we can use the following R code.
naive <- function(n) {
  x <- rnorm(n)
  h <- (x >= 2)                                 # indicator 1{X >= 2}
  theta_naive <- mean(h)
  se <- sd(h)
  CI95 <- c(theta_naive - 1.96 * se / sqrt(n),
            theta_naive + 1.96 * se / sqrt(n))
  c(theta_naive, se, CI95)
}
naive(10^5)
[1] 0.0223800 0.1479167 0.0214632 0.0232968

IS <- function(n) {
  mu <- 2
  x <- rnorm(n) + mu                            # sample from N(mu, 1)
  hprime <- (x >= 2) * exp(-mu * x + mu^2 / 2)  # weighted by the likelihood ratio
  theta_est <- mean(hprime)
  se <- sd(hprime)
  CI95 <- c(theta_est - 1.96 * se / sqrt(n),
            theta_est + 1.96 * se / sqrt(n))
  c(theta_est, se, CI95)
}
IS(10^5)
[1] 0.02298110 0.03501699 0.02276407 0.02319814

1 - pnorm(2)
[1] 0.02275013
We compare the results of the naive simulation, of importance sampling and the true
value provided by R in this example. The two simulations give close estimates and
their 95% confidence intervals both contain the true value. However, the importance
sampling estimate has a smaller variance and therefore provides a narrower
confidence interval; hence importance sampling acts as a variance reduction method.
In general, we can write an importance sampling algorithm for estimating
µ = E_f[h(X)], where we simulate with respect to the sampling density g(·), as
Algorithm 5.1.
Algorithm 5.1 Importance Sampling Algorithm for Estimating µ = E_f[h(X)]
1: for j = 1, ..., n do
2: generate X_j from density g(·)
3: set h*_j = h(X_j) f(X_j)/g(X_j)
4: end for
5: set \hat{\mu}_{is} = \sum_{j=1}^{n} h^{*}_j / n
6: set \hat{\sigma}^2_{is} = \sum_{j=1}^{n} (h^{*}_j - \hat{\mu}_{is})^2 / (n - 1)
7: set the approximate 100(1 − α)% CI = \hat{\mu}_{is} \pm z_{1-\alpha/2}\,\hat{\sigma}_{is}/\sqrt{n}
8: return \hat{\mu}_{is}, \hat{\sigma}^2_{is} and the 100(1 − α)% CI
The fundamental task when utilizing importance sampling is to find a suitable
importance distribution g(x). A good importance distribution g(x) should give a
small variance of the estimate \hat{\mu}_g, which is

Var(\hat{\mu}_g) = \frac{1}{n}\int \left(\frac{h(x)f(x)}{g(x)} - \mu\right)^2 g(x)\,dx.  (13)

If h(x) ≥ 0, we can theoretically get a zero-variance IS density by requiring
h(x)f(x)/g(x) = µ (Glasserman et al., 1999). However, we cannot construct this
zero-variance importance sampling density directly, because µ is exactly the quantity
we want to estimate. Kahn and Marshall (1953) state that the optimal g*(x), which
minimizes Var(\hat{\mu}_g), is proportional to |h(x)| f(x). Haugh (2004) suggests that
this relationship is equivalent to a similarity of shape. Therefore, by using the
maximum principle, namely finding g(x) such that g(x) and h(x)f(x) attain their
maximum at the same point x*, we can get a good IS density. In other words, the
mode of the optimal IS density is set equal to the mode of h(x)f(x).
To apply importance sampling to the simulation of the tail loss probability, Sak et al.
(2010) add a mean shift vector µ with negative entries to the normal vector Z. The
chi-squared random variable Y is treated as a gamma random variable with shape
parameter ν/2 and scale parameter 2, and the IS changes its scale parameter. The
optimal IS is selected by setting the shift vector and the scale value so that the mode
of the IS density equals the mode of the zero-variance IS function. The main
algorithm solves a multidimensional optimization problem; the IS parameters in this
application are µ and y_0, which are calculated by Algorithm 3 of Sak et al. (2010).
This algorithm makes use of an efficient quasi-Newton method, a constrained version
of the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm.
5.1.2 Stratified Sampling
Stratified sampling is another efficient tool for reducing the variance of Monte Carlo
estimates. It constrains the fraction of samples drawn from specified strata of the
sample space (Glasserman, 2004). Suppose that ξ_i, i = 1, ..., I, is a partition of
R^d into I strata and that we know p_i = P(Y ∈ ξ_i) for i = 1, ..., I. We wish to
estimate E[Y] with Y real valued; then

E[Y] = \sum_{i=1}^{I} P(Y \in \xi_i)\,E[Y \mid Y \in \xi_i] = \sum_{i=1}^{I} p_i\,E[Y \mid Y \in \xi_i].  (14)
Let N_i be the number of drawings allocated to stratum i, with N = \sum_{i=1}^{I} N_i.
If we generate independent Y_1, ..., Y_N having the same distribution as Y in random
sampling, the fraction of these samples falling in ξ_i is random; in stratified sampling
we fix this fraction π_i = N_i/N in advance, and each observation drawn from ξ_i is
constrained to have the distribution of Y conditional on Y ∈ ξ_i.
Consider the simplest case, proportional sampling, namely π_i = p_i. Then the
fraction of samples drawn from stratum ξ_i matches the theoretical probability
p_i = P(Y ∈ ξ_i): with total sample size N, we take N_i = [p_i N] samples from ξ_i.
For each i = 1, ..., I, let Y_{ij}, j = 1, ..., N_i, be independent drawings from the
conditional distribution of Y given Y ∈ ξ_i. The sample mean of the observations
from the i-th stratum,

\bar{Y}_i = \frac{1}{N_i}\sum_{j=1}^{N_i} Y_{ij},  (15)
is an unbiased estimator of E[Y | Y ∈ ξ_i]. Therefore, it is easy to formulate the
stratified Monte Carlo estimator

\hat{Y} = \sum_{i=1}^{I} \frac{p_i}{N_i}\sum_{j=1}^{N_i} Y_{ij} = \frac{1}{N}\sum_{i=1}^{I}\sum_{j=1}^{N_i} Y_{ij}.  (16)

Compared with the estimator \bar{Y} = (Y_1 + \cdots + Y_n)/n of a naive simulation
with a random sample of size n, the stratified estimator \hat{Y} eliminates the
sampling variability across strata without affecting the sampling variability within
strata.
There are two simple but important ways to generalize this formulation. First,
we introduce a second variable X and define the strata in terms of X. We call X
the stratification variable and allow it to take values in an arbitrary set. Glasserman
(2004) suggests that it is convenient to assume it is R^d-valued and to take the
strata ξ_i to be disjoint subsets of R^d with P(X ∈ \cup_i ξ_i) = 1. Then the
representation (14) generalizes to

E[Y] = \sum_{i=1}^{I} P(X \in \xi_i)\,E[Y \mid X \in \xi_i] = \sum_{i=1}^{I} p_i\,E[Y \mid X \in \xi_i],  (17)

where p_i = P(X ∈ ξ_i). In general, X and Y are dependent without either com-
pletely determining the other, while in some applications Y is a function of X.
In this representation, we should generate pairs (X_{ij}, Y_{ij}), j = 1, ..., N_i, having the
conditional distribution of (X, Y) given X ∈ ξ_i.
Second, we allow the stratum allocations N_1, ..., N_I to be arbitrary, subject to
N_1 + ... + N_I = N, rather than proportional to p_1, ..., p_I; that is, the allocation
fraction π_i need not equal p_i. In this case the previous representation of the
stratified Monte Carlo estimator is no longer valid, and we rewrite it as

\hat{Y} = \sum_{i=1}^{I} p_i \frac{1}{N_i}\sum_{j=1}^{N_i} Y_{ij} = \frac{1}{N}\sum_{i=1}^{I} \frac{p_i}{\pi_i}\sum_{j=1}^{N_i} Y_{ij}.  (18)
So far, we can conclude that to implement a stratified sampling scheme we have two
considerations:
• choosing the stratification variable X, the strata ξ_i and the allocations N_i;
• generating samples from the distribution of (X, Y) conditional on X ∈ ξ_i.
These choices determine the variance reduction obtained over the naive simulation.
To see this more concretely, we consider a modified example based on Altun (2012)
and Glasserman (2004).
Example 2 (Stratifying uniforms) Stratification is easily understood when we
stratify uniformly distributed random variables. We can partition the unit
interval (0, 1) into I strata

\xi_1 = \left(0, \tfrac{1}{I}\right], \ \xi_2 = \left(\tfrac{1}{I}, \tfrac{2}{I}\right], \ \ldots, \ \xi_I = \left(\tfrac{I-1}{I}, 1\right).

We set up the example in the simplest way, with the sample size N equal to the
number of strata I; that is, p_i = 1/I and N_i = N p_i (rounded up to the nearest
integer). First, we generate independent uniform random variables U_i ∼ U(0, 1),
i = 1, ..., I, and introduce

V_i = \frac{i-1}{I} + \frac{U_i}{I}, \quad i = 1, \ldots, I.

By this construction, V_i is uniformly distributed between (i − 1)/I and i/I, so it
has the conditional distribution of U given U ∈ ξ_i for U ∼ U(0, 1). Therefore
V_1, ..., V_I is a stratified sample from the uniform distribution. If we wish to
estimate E[Y] where Y = f(U), the stratified estimator of the corresponding
integral is

\hat{Y} = \frac{1}{I}\sum_{i=1}^{I} f(V_i),

which is the stratified estimator of Y; it is unbiased according to Glasserman (2004).
More specifically, we write the stratified procedure for Y = U² as Algorithm 5.2.

Algorithm 5.2 Stratification for estimating E[Y] when Y = U² and U is a uniform
random variable
Input:
the number of strata I and the total sample size N
Output:
the mean \bar{Y} and the variance σ² of the stratified estimator
1: for i = 1, ..., I do
2: for j = 1, ..., N_i, where N_i = [N p_i] for each stratum, do
3: generate a uniform random variable U_j ∼ Unif[0, 1]
4: compute V_j = (i − 1)/I + U_j/I
5: compute Y_j = V_j²
6: end for
7: compute the sample mean of the stratum, \bar{Y}_i = \frac{1}{N_i}\sum_{j=1}^{N_i} Y_j
8: compute the sample variance of the stratum, \sigma_i^2 = \frac{1}{N_i - 1}\sum_{j=1}^{N_i} (Y_j - \bar{Y}_i)^2
9: end for
10: return mean \bar{Y} = \sum_{i=1}^{I} p_i \bar{Y}_i and variance \sigma^2 = \sum_{i=1}^{I} p_i^2\,\sigma_i^2/N_i
If we set the number of strata I to 5 and the number of replications to 10^5, the
following R code shows the stratification process. We also add naive simulation code
so that the results of the two methods can be compared.

naive <- function(n) {
  u <- runif(n)
  qu <- u^2
  c(mean(qu), sd(qu))
}
naive(10^5)
[1] 0.3309719 0.2970378

STRS <- function(n) {
  strata <- 0:I / I                      # equiprobable strata boundaries
  p <- diff(strata)                      # stratum probabilities p_i = 1/I
  N <- ceiling(n * p)                    # proportional allocation
  for (i in 1:I) {
    u <- runif(N[i], strata[i], strata[i + 1])  # conditional draw from stratum i
    qu <- u^2
    xbar[i] <<- mean(qu)
    condStd[i] <<- sd(qu)
  }
  m <- sum(p * xbar)
  se <- sqrt((1 / I)^2 * sum(condStd^2 / N))
  c(m, se)
}
I <- 5
xbar <- rep(0, I)
condStd <- rep(1, I)
STRS(10^5)
[1] 0.333434053 0.000210357
The comparison shows that stratified sampling reduces the standard error of the
simulation by a factor of over 1400. Therefore, stratified sampling is an effective
variance reduction method.
AOA algorithm According to Glasserman (2004), the allocation fractions
π*_i, i = 1, ..., I, that minimize the variance of the stratified estimator are

\pi_i^{*} = p_i\sigma_i \Big/ \sum_{l=1}^{I} p_l\sigma_l, \quad i = 1, \ldots, I.  (19)

We cannot use the optimal allocation fractions directly because the conditional
standard deviations σ_i are unknown, but Etoré and Jourdain (2010) propose the
adaptive optimal allocation (AOA) algorithm, which estimates them over several
iterations. In this project, we make use of the AOA algorithm as modified by
Başoğlu et al. (2013), which applies stratified importance sampling to the estimation
of tail loss probabilities.
The total number of iterations is K and each iteration is denoted by k =
1, ..., K. The total sample size is denoted by N, N^k is the sample size used in
the k-th iteration, and N_i^k is the size of the sample drawn in iteration k
conditional on stratum i, so that

N = \sum_{k=1}^{K} N^{k} = \sum_{k=1}^{K}\sum_{i=1}^{I} N_i^{k}.  (20)
Başoğlu et al. (2013) set the number of iterations K to 3 and allocate 10, 40 and 50
percent of the total sample size to the successive iterations. They note that the
sample allocation of an iteration need not match the aimed sample size exactly. At
the end of the (k−1)-th iteration, the standard deviation conditional on stratum i is
estimated as \hat{\sigma}_i^{k-1}, and the allocation fractions of the next iteration,
\pi_i^{k}, are computed using the general formula

\pi_i^{k} = p_i\hat{\sigma}_i^{k-1} \Big/ \sum_{l=1}^{I} p_l\hat{\sigma}_l^{k-1}, \quad i = 1, \ldots, I, \; k = 2, \ldots, K.  (21)

The allocation sizes of the next iteration are then determined by

N_i^{k} = \max\{\lceil \pi_i^{k} N^{k}\rceil, 10\}, \quad i = 1, \ldots, I, \; k = 1, \ldots, K.  (22)
For these allocations, the minimum size in each stratum is 10, which gives better
convergence properties in our tail loss simulation. The stratified estimator of the
final iteration is

\hat{x}_{strs} = \sum_{i=1}^{I} p_i\,\hat{x}_i^{K},  (23)

where \hat{x}_i^{K} is the sample mean of the drawings in S_i (the set of all drawings
made in stratum i) at the end of iteration K. Let M_i^{k} be the size of the sample
collected in S_i by the end of iteration k, namely
M_i^{k} := \sum_{\kappa=1}^{k} N_i^{\kappa} = M_i^{k-1} + N_i^{k}, with M_i^{0} := 0.
Then \hat{x}_{strs} is an unbiased estimator of x with variance

V[\hat{x}_{strs}] = \sum_{i=1}^{I} p_i^{2}\,(\hat{\sigma}_i^{K})^{2} \big/ M_i^{K}.  (24)
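The allocation logic of equations (21)-(24) can be sketched in a few lines of R on the toy problem Y = U² from Example 2. This is a simplified illustration of the adaptive idea only, not the modified AOA algorithm of Başoğlu et al. (2013), which is reproduced in Algorithm A.1.

aoa_sketch <- function(N, I = 10, shares = c(0.1, 0.4, 0.5)) {
  p <- rep(1 / I, I)                     # equiprobable strata of (0, 1)
  sums <- sqsums <- M <- rep(0, I)       # running sums and sample sizes per stratum
  sigma <- rep(1, I)                     # initial guesses of the conditional sd
  for (k in seq_along(shares)) {
    Nk  <- round(shares[k] * N)
    pik <- if (k == 1) p else p * sigma / sum(p * sigma)   # allocation fractions, eq (21)
    Nik <- pmax(ceiling(pik * Nk), 10)                     # allocation sizes, eq (22)
    for (i in 1:I) {
      V <- (i - 1 + runif(Nik[i])) / I   # conditional draws from stratum i
      Y <- V^2
      sums[i] <- sums[i] + sum(Y); sqsums[i] <- sqsums[i] + sum(Y^2)
      M[i] <- M[i] + Nik[i]
      xbar     <- sums[i] / M[i]
      sigma[i] <- sqrt(max(sqsums[i] / M[i] - xbar^2, 0))  # updated conditional sd
    }
  }
  est <- sum(p * sums / M)               # stratified estimator, eq (23)
  v   <- sum(p^2 * sigma^2 / M)          # variance estimate, eq (24)
  c(estimate = est, std.error = sqrt(v))
}
aoa_sketch(10^5)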
The next step is to combine the AOA algorithm with IS. Recalling the importance
sampling identity of Section 5.1.1, for any x ∈ R^D let h_{IS}(x) := w(x)h(x); then

\mu = E_f[h(X)] = E_g[w(X)h(X)] = E_g[h_{IS}(X)].  (25)

Now we can implement the AOA algorithm, sampling from the IS density g, on the
simulation function h_{IS}(X) instead of h(X). The optimal choice of the IS density is

g^{*}(x) = |h(x)f(x)| \Big/ \int |h(x)f(x)|\,dx,  (26)

which yields zero variance. Başoğlu et al. (2013) indicate that we should choose an
IS density that imitates the function |h(x)f(x)|. A pseudo code of the modified
AOA algorithm is given in Algorithm A.1.
Stratified Importance Sampling: Single tail loss probability estimate
So far, we have discussed two variance reduction methods, importance sampling and
stratified sampling. Başoğlu et al. (2013) work on optimally stratified importance
sampling for simulating the tail loss probability of a stock portfolio (the model is
illustrated in Section 4). This combined method tries to minimize the maximum
relative error of the estimates by applying a two-dimensional stratification: the
multi-normal input Z is stratified along the direction of the IS shift, v := µ/‖µ‖,
and the gamma random variable Y is stratified as well.
Glasserman (2004) showed that stratifying a D-dimensional standard multi-normal
vector along a given direction v can be done by stratifying its linear projection onto
v. Başoğlu et al. (2013) use a linear transformation of the multi-normal input
described by Imai and Tan (2006) and provide a pseudo code for the construction of
the linear transformation matrix V, which can be found in Algorithm A.2. The
initial model is thus modified: we store A := LV and write the multi-t vector T as

T = (LV)Z/\sqrt{Y/\nu} = AZ/\sqrt{Y/\nu}.  (27)
The initialization of the SIS algorithm is given by Başoğlu et al. (2013) and shown
in Algorithm A.3. By this means, the first element Z_1 of the input vector Z cor-
responds to the direction v and thus becomes the only element of Z that is stratified.
Therefore, the SIS algorithm makes some modifications compared with Algorithm 3
of Sak et al. (2010). The shift θ_1 = ‖µ‖ is added to Z_1, while the other
elements Z_2, ..., Z_D are left unchanged. Also, the gamma random variable is stratified
under the IS scale parameter θ_2 = y_0/(ν/2 − 1). In the two-dimensional stratifica-
tion, the total number of strata is I = I_1 × I_2, with index sets i_1 = 1, ..., I_1
for the multi-normal input and i_2 = 1, ..., I_2 for the gamma random variable.
The two-dimensional stratification first generates two independent, identically
distributed uniform random variables U_1 and U_2. Z_1 is then generated con-
ditional on the equiprobable interval i_1 using the inverse CDF of the standard
normal distribution, Φ^{-1}:

Z_1 = \theta_1 + \Phi^{-1}\bigl((i_1 - 1 + U_1)/I_1\bigr).  (28)

Y is generated conditional on the equiprobable stratum i_2 using the inverse
CDF of the gamma distribution, F_\Gamma^{-1}:

Y = F_\Gamma^{-1}\bigl((i_2 - 1 + U_2)/I_2;\ \nu/2,\ \theta_2\bigr),  (29)

with shape parameter ν/2 and scale parameter θ_2. The returns R(Z, Y)
are still computed as in the model of Section 4.3. The likelihood ratio is

\rho(Z, Y) := \exp\bigl(-Z_1\theta_1 + \theta_1^2/2 - Y/2 + Y/\theta_2 + \log((\theta_2/2)^{\nu/2})\bigr).  (30)
The weighted responses are generated by weighting the response 1{R(Z,Y) < x}
with the IS likelihood ratio in stratum i and are stored in the set S_i. Recalling the
AOA algorithm, the allocation of the drawings in iteration k is determined by

N_i^{k} = \max\{\lceil I^{-1} N^{k}\rceil, 10\} \quad \text{for } k = 1,
N_i^{k} = \max\Bigl\{\Bigl\lceil \hat{\sigma}_i^{k-1} N^{k} \Big/ \sum_{l=1}^{I}\hat{\sigma}_l^{k-1}\Bigr\rceil, 10\Bigr\} \quad \text{for } k = 2, \ldots, K.  (31)

Finally, the tail loss probability estimate and its variance are calculated by the
same formulas as in the AOA algorithm. The pseudo code of SIS for a single
probability estimate, as illustrated by Başoğlu et al. (2013), can be found in
Algorithm A.4.
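The core of one stratified IS drawing, following equations (28)-(30), might look as follows; theta1, theta2, D and the return function R_fun are placeholders for the quantities produced by the initialization in Algorithm A.3, so this is a structural sketch rather than the full Algorithm A.4.

sis_draw <- function(i1, i2, I1, I2, theta1, theta2, nu, D, R_fun, x) {
  U1 <- runif(1); U2 <- runif(1)                  # stratum-conditional uniforms
  Z1 <- theta1 + qnorm((i1 - 1 + U1) / I1)        # shifted stratified normal, eq (28)
  Y  <- qgamma((i2 - 1 + U2) / I2,
               shape = nu / 2, scale = theta2)    # stratified gamma, eq (29)
  Z  <- c(Z1, rnorm(D - 1))                       # remaining normals are unstratified
  lr <- exp(-Z1 * theta1 + theta1^2 / 2 - Y / 2 +
            Y / theta2 + (nu / 2) * log(theta2 / 2))  # likelihood ratio, eq (30)
  as.numeric(R_fun(Z, Y) < x) * lr                # weighted response for stratum (i1, i2)
}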
5.2 Quasi-Monte Carlo
The previous variance reduction methods are mostly applied within ordinary Monte
Carlo, which relies on pseudo-random number generators. A pseudo-random sequence
attempts to emulate the behaviour of independent random variables and usually
produces clusters of samples: this tendency of random points to clump together
means that several nearby points are taken from some spots, while from other spots
no samples are taken at all. Quasi-Monte Carlo methods instead seek to generate
evenly distributed points, which are deterministic rather than random. The accuracy
can thus be increased by avoiding the clumping of points, and such a point set is
called a low-discrepancy sequence (Glasserman, 2004). Our goal is to make use of
quasi-Monte Carlo methods to modify the stratified importance sampling method
for estimating the tail loss probability of the stock portfolio model; a further
variance reduction is likely to be achieved.
5.2.1 General Principles and Discrepancy
Glasserman (2004) gives an introduction to quasi-Monte Carlo (QMC) methods
and to discrepancy, which is briefly summarized in this section. As a starting
point, we wish to estimate an expectation written as an integral,

E[f(X)] = \int f(x)\,dx,  (32)

where f is a function of a random variable X. To explore quasi-Monte Carlo
methods, we usually pose this integration problem over the unit hypercube [0, 1)^d in
d dimensions and first consider the Monte Carlo setting. Glasserman (2004) notes
that in a Monte Carlo simulation each replication can be seen as the output of a
transformation applied to an input sequence of independent uniformly distributed
random variables; in particular, the transformations act on uniform random variables
U_1, ..., U_d over the unit hypercube [0, 1)^d. Hence d is an upper bound on the
number of uniforms needed in the simulation, and a replication can be written as
f(U_1, ..., U_d). The estimation problem now becomes

E[f(U_1, \ldots, U_d)] = \int_{[0,1)^d} f(x)\,dx.  (33)

Quasi-Monte Carlo approximates this integral by

\int_{[0,1)^d} f(x)\,dx \approx \frac{1}{n}\sum_{i=1}^{n} f(x_i),  (34)

for carefully (and deterministically) chosen points x_1, ..., x_n in the unit hypercube
[0, 1)^d. For this expression, we should notice that
• f does not need to be known in closed form; that is exactly why we simulate it.
• The boundary of the unit hypercube matters for many quasi-Monte Carlo
sequences, while ordinary Monte Carlo does not need to pay attention to it.
• A QMC method depends on the dimension of the problem.
We explain the last point more specifically. QMC's dependence on the dimension is
the biggest difference compared with Monte Carlo. In a Monte Carlo simulation, one
looks for an algorithm with smaller variance, and the achievable variance does not
rely on the dimension of the unit hypercube. QMC, however, requires the dimension
to be identified before the points are generated, and lower dimensions
bring greater accuracy.
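As a small illustration of the approximation (34), the snippet below compares a pseudo-random and a Sobol-based estimate of the integral of x_1 x_2 over the unit square (true value 1/4), using the randtoolbox package introduced in the next subsection.

library(randtoolbox)
f <- function(x) x[, 1] * x[, 2]       # integrand over [0,1)^2; exact integral is 1/4
n <- 2^12
u_mc  <- matrix(runif(2 * n), n, 2)    # pseudo-random points
u_qmc <- sobol(n, dim = 2)             # deterministic Sobol points
c(mc = mean(f(u_mc)), qmc = mean(f(u_qmc)))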
Discrepancy As mentioned, quasi-Monte Carlo provides ways to sample uni-
formly over the whole space, avoiding points clumping together. The discrepancy
measures the extent to which a point set deviates from uniformity, i.e. how
unevenly the points fill the hypercube. Quasi-Monte Carlo methods are
low-discrepancy methods because of their evenly spread points. Theoretically, QMC
has the potential to accelerate convergence from the O(1/\sqrt{n}) rate associated
with Monte Carlo to nearly O(1/n) convergence: under appropriate conditions, the
error in a quasi-Monte Carlo approximation is O(1/n^{1-\epsilon}) for all
\epsilon > 0. Note that variance reduction methods, including importance sampling
and stratified sampling, only affect the implicit constant in O(1/\sqrt{n}).
Therefore, low-discrepancy sequences can bring further accuracy compared with
ordinary variance reduction techniques, but their performance depends on the
dimension. We omit the detailed discussion of the dimension dependence of QMC in
this report; we should, however, highlight the traditional view that QMC is only
appropriate for low-dimensional problems, i.e. QMC methods are only applicable
with an upper bound d on the dimension.
5.2.2 Halton Sequence and Sobol Sequence
Low-discrepancy sequences in arbitrary dimension d, such as the Halton sequence,
the Faure sequence and the Sobol sequence, have been constructed by many authors.
All of these sequences are based on Van der Corput sequences, a specific class of
one-dimensional low-discrepancy sequences. Here we omit the construction
algorithms of these sequences and use them directly in the R environment with the
package randtoolbox (Christophe and Petr, 2014). The package is based on rngWELL
and provides pseudo-random generators as well as quasi-random generators,
including the Halton sequence and the Sobol sequence, which are applied in this
project. For example, halton(n, dim = d) generates n points of the d-dimensional
Halton sequence.
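For instance, the quasi-random point sets used in the figures below can be produced and plotted roughly as follows.

library(randtoolbox)
H <- halton(2000, dim = 5)    # 2000 points of the 5-dimensional Halton sequence
S <- sobol(2000, dim = 5)     # 2000 points of the 5-dimensional Sobol sequence
par(mfrow = c(1, 2))
plot(H[, 2], H[, 3], pch = ".", main = "Halton, dimensions 2 and 3")
plot(S[, 2], S[, 3], pch = ".", main = "Sobol, dimensions 2 and 3")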
As mentioned, low-discrepancy sequences generate deterministic numbers that are
evenly distributed without clustering. To illustrate this, we first plot 2000
two-dimensional uniform random pairs in Figure 4. The Halton sequence and the
Sobol sequence are also used to generate 2000 points of 5-dimensional quasi-random
numbers, and Figures 5 and 6 plot dimensions 2 and 3 of the two sequences
respectively. Clusters are clearly visible in Figure 4, whereas the points in Figures 5
and 6 fill the square far more evenly. This illustrates the advantage of
low-discrepancy sequences.
Figure 4: Uniform Random Pairs
Figure 5: Halton Sequence    Figure 6: Sobol Sequence

Now we use an example to explain how the Halton sequence and the Sobol sequence
can be applied.

Example 3 (Halton Sequence and Sobol Sequence) We want to find the
probability that the sum of squares of two standard normal variables is smaller than
a threshold value,

P(Z_1^2 + Z_2^2 < x),

where x is the threshold value. We first compute this probability using naive sim-
ulation and then use the two low-discrepancy sequences to show their application
to a multi-dimensional problem. The naive simulation creates an (n × 2) matrix
filled with standard normal variables. For each row, the squares of the two entries
are summed and stored in an array a_i; if a_i is smaller than the threshold
value, the indicator c_i is set to 1. Finally, the mean and variance of c are
calculated once all c_i are found. With the low-discrepancy sequences, the only
difference from the naive simulation is that the matrix is filled not with
pseudo-random standard normal variables but with normal variables obtained by
inverting the low-discrepancy points. The differences can be seen directly in the R
code shown in Appendix B.1.
In this example, the variance reductions of the low-discrepancy sequences are not
obvious, but they at least estimate the probability with an accuracy similar to that
of the naive simulation.
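A compressed sketch of this comparison is shown below (the full code is in Appendix B.1; the threshold x = 1 is an arbitrary choice, and the reported standard error is only a heuristic for the deterministic sequences).

library(randtoolbox)
est_prob <- function(Z, x = 1) {
  a  <- rowSums(Z^2)                      # squared radius Z1^2 + Z2^2 of each pair
  ci <- as.numeric(a < x)                 # indicator 1{Z1^2 + Z2^2 < x}
  c(mean = mean(ci), se = sd(ci) / sqrt(nrow(Z)))
}
n <- 10^5
est_prob(matrix(rnorm(2 * n), n, 2))      # naive simulation
est_prob(qnorm(halton(n, dim = 2)))       # Halton points mapped to normals
est_prob(qnorm(sobol(n, dim = 2)))        # Sobol points mapped to normals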
Curse of dimensionality We have mentioned that low-discrepancy sequences have
problems in high dimensions: uniformity deteriorates as the dimension increases.
Figures 7 and 8 show this problem directly. When the total dimension of each se-
quence is 30, the points of dimensions 29 and 30 are plotted respectively. There-
fore, if we want to obtain a variance reduction, we should limit the dimension. In
our project, we choose the dimension of each sequence generating quasi-random
uniform variables to be 2, the same as the dimension of the stratification; to some
extent, this eliminates the impact of high dimensions.
Figure 7: High Dimensional Halton Sequence
Figure 8: High Dimensional Sobol Sequence
5.3 Combination of Quasi-Monte Carlo and Stratified Importance Sampling
After illustrating the variance reduction methods and quasi-Monte Carlo separately,
we now come to the core part of this project: the combination of quasi-Monte Carlo
and stratified importance sampling, implemented in the R environment. Section 5.2.2
has shown how quasi-random numbers are generated by the Halton sequence and the
Sobol sequence; the commands are simple with the help of the package randtoolbox.
The existing stratified importance sampling implementation performs the stratifi-
cation with random variables produced by a pseudo-random generator. The
two-dimensional stratification first generates two independent, identically distributed
uniform random variables U_1 and U_2. A low-discrepancy sequence can generate a
set QU of quasi-random uniform numbers with its default command, and this set
replaces the pseudo-random uniforms in the stratification. Accordingly, the
generation of the stratified element Z_1 of the input vector Z uses the quasi-random
uniform numbers through (28). The remaining elements of Z are generated by a
pseudo-random generator in the stratified importance sampling proposed by Başoğlu
et al. (2013), so we also replace them with standard normal variables QZ obtained
from quasi-random numbers. The other input of the stratification, the gamma random
variable, is generated through the inverse CDF of the gamma distribution, which
involves no additional pseudo-random step; once the uniform random variable in (29)
is replaced by a quasi-random uniform, no further modification is needed there. For
the rest of the stratified importance sampling we make no modifications, because no
pseudo-random generation is involved. The algorithm of the combined quasi-Monte
Carlo stratified importance sampling is shown in Algorithm A.5.
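A minimal sketch of this replacement for a single drawing is shown below: the first two coordinates of a quasi-random point feed the stratified inputs of equations (28) and (29), and the remaining coordinates replace the pseudo-random normals for Z_2, ..., Z_D. The stratum indices and the IS parameters theta1 and theta2 are placeholders, not values from the initialization.

library(randtoolbox)
D <- 5; nu <- 5; I1 <- I2 <- 22
Q <- sobol(100, dim = D + 1)       # quasi-random points of dimension D + 1
q <- Q[10, ]                       # one point, used for a single drawing
i1 <- 3; i2 <- 7                   # example stratum indices
theta1 <- 1.2; theta2 <- 1.8       # placeholder IS parameters
Z1 <- theta1 + qnorm((i1 - 1 + q[1]) / I1)          # stratified normal, eq (28)
Y  <- qgamma((i2 - 1 + q[2]) / I2,
             shape = nu / 2, scale = theta2)        # stratified gamma, eq (29)
Z  <- c(Z1, qnorm(q[3:(D + 1)]))   # quasi-random replacements for Z_2, ..., Z_D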
6 Numerical Analysis
For our experiments, stock portfolios of sizes D = 2, 5, 10, 15 and 20 are used, with
the degrees of freedom of the t-copula equal to 5. We assume that the marginal dis-
tributions of the log-returns follow a t distribution. We choose the off-diagonal
elements of the correlation matrix Σ randomly between 0 and 0.3 and the volatilities
randomly between 0.05 and 0.2. For the weights of the stocks in the portfolio,
independent and identically distributed uniform random variables on (1/1000, 1) are
chosen and then normalized so that they sum to 1. Sak et al. (2010) set up their
experiments in the same way, so that the dimension of the problem is not reduced.
Also, we directly use the fitted values from Başoğlu et al. (2013) for the parameters
of the marginal distributions and the copula parameters of the daily log-returns.
Therefore, we can be confident that this experiment obtains results similar to those
for real-world stock portfolios.
We first implement the IS algorithm proposed in Sak et al. (2010) and the SIS algo-
rithm proposed in Başoğlu et al. (2013). For the SIS algorithm and for our SIS
combined with low-discrepancy sequences, we select the number of strata for both the
multi-normal and the gamma input as 22; as a result, the total number of strata is
kept below 500. All other initial settings of the SIS algorithm are the same as in
Başoğlu et al. (2013).
In our experiments, we compare the naive method, IS, SIS, SIS combined with
the Halton sequence and SIS combined with the Sobol sequence for a single probability
estimate, in terms of the timing results (TM) and the variance reduction. The
variance reduction factor (VR) of an estimator \hat{x} is

VR(\hat{x}) := V[\hat{x}_{naive}] / V[\hat{x}].  (35)
Also, the overall performance of an estimator can be measured by the efficiency
ratio (ER), which is calculated as

ER(\hat{x}) := V[\hat{x}_{naive}]\,TM[\hat{x}_{naive}] \big/ \bigl(V[\hat{x}]\,TM[\hat{x}]\bigr).  (36)

Table 1: P(R < t) ≈ 0.05

        IS                      SIS                       Halton                    Sobol
d       VR     TM     ER        VR      TM     ER         VR      TM     ER         VR      TM     ER
2       8.05   0.643  6.720     211.23  0.975  116.337    206.40  1.079  102.721    211.99  1.076  105.796
5       7.53   1.834  6.157     233.70  2.215  158.263    253.37  2.424  156.780    226.20  2.309  146.948
10      7.49   3.271  6.238     163.58  3.601  123.740    176.80  3.758  128.155    156.98  3.836  110.325
15      7.54   4.436  5.955     186.12  4.865  134.012    186.99  5.014  130.640    193.69  5.023  135.077
20      6.91   6.187  5.509     202.25  6.739  148.015    209.51  6.937  148.952    191.22  6.878  137.11

Variance reduction factors (VR), execution times (TM) in seconds and efficiency ratios (ER) of naive simulation,
IS, SIS, SIS with the Halton sequence and SIS with the Sobol sequence. N = 10,000 is used for the naive and IS
simulations; for the remaining simulations, N ≈ 100,000.

Table 2: P(R < t) ≈ 0.001

        IS                        SIS                         Halton                      Sobol
d       VR      TM     ER         VR       TM     ER          VR       TM     ER          VR       TM     ER
2       364.85  0.569  336.641    5792.48  0.877  3467.561    6611.99  0.961  3612.171    5477.65  0.884  3253.13
5       245.78  1.802  190.680    3385.40  2.147  2204.372    3793.358 2.335  2271.141    3709.719 2.229  2326.688
10      213.76  3.178  173.673    2113.13  3.511  1554.001    2484.55  3.819  1679.788    2248.984 3.660  1586.579
15      197.57  4.385  152.875    2218.34  4.861  1548.413    2600.79  5.097  1731.311    2383.393 4.937  1638.01
20      114.88  6.131  86.716     1461.09  6.727  1005.191    1388.9   6.832  940.841     1492.713 6.815  1013.687

Variance reduction factors (VR), execution times (TM) in seconds and efficiency ratios (ER) of naive simulation,
IS, SIS, SIS with the Halton sequence and SIS with the Sobol sequence. N = 10,000 is used for the naive and IS
simulations; for the remaining simulations, N ≈ 100,000.
We choose the threshold values t so that the probabilities are approximately equal to 0.05 and 0.001, since the efficiency of variance reduction methods depends on the probability of the rare event.
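As a small illustration of (35) and (36), the sketch below computes both measures from an estimator's variance and run time; the numbers are hypothetical placeholders, not values taken from Tables 1 and 2.

```r
# variance reduction factor (35) and efficiency ratio (36)
vr <- function(var_naive, var_est) var_naive / var_est
er <- function(var_naive, tm_naive, var_est, tm_est) {
  (var_naive * tm_naive) / (var_est * tm_est)
}

# hypothetical simulation output
var_naive <- 4.7e-9;  tm_naive <- 0.54   # naive estimator
var_sis   <- 2.2e-11; tm_sis   <- 0.98   # SIS-type estimator
vr(var_naive, var_sis)                   # about 214
er(var_naive, tm_naive, var_sis, tm_sis) # about 118, since ER also accounts for run time
```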
In Tables 1 and 2, the timing results (TM), variance reduction factors (VR) and efficiency ratios (ER) are tabulated for all simulations under the two different threshold values t. From these results, we can make the following comments.
• In most of the cases, combining SIS with the Halton sequence slightly reduces the variance further compared with the pseudo-random version of SIS in terms of VR. In terms of ER, the introduction of the Halton sequence outperforms SIS only for the smaller threshold value.

• Only for the smaller threshold value t does the Sobol sequence help SIS achieve a further variance reduction compared with the pseudo-random version of SIS in terms of both VR and ER. In the other cases, the introduction of the Sobol sequence even lowers the variance reduction achieved by SIS.

• Both low-discrepancy sequences perform inconsistently across different portfolio sizes.

It seems that the combination of low-discrepancy sequences and SIS only helps SIS reduce the variance further when dealing with a lower threshold value. As for the last comment, a possible reason is the dependence of low-discrepancy sequences on the dimension of the problem, which is crucial when we apply them to real-world applications.
7 Conclusion
In this project, the underlying problem was to obtain a further variance reduction by combining the quasi-Monte Carlo method with stratified importance sampling. First, we had to understand in detail how the estimation of tail loss probabilities is carried out when the stock portfolio follows a t-copula model. Sak et al. (2010) and Başoğlu et al. (2013) show applications of importance sampling and stratified importance sampling (SIS), respectively, for this estimation problem. The core part of this project, the combination of Quasi-Monte Carlo methods and SIS, was obtained by modifying the algorithms and R code provided in those two papers.

Even though we expected an extra variance reduction, the results of the experiment imply that the combination of SIS with low-discrepancy sequences reduces the variance slightly further only in some cases. Therefore, future research should

• try to find the valid ranges of threshold values and stock portfolio dimensions for this combination of Quasi-Monte Carlo methods and SIS;

• try to find a better structure for the combination of the quasi-Monte Carlo method and stratified importance sampling.
References
Aas, K., Haff, I. H., 2006. The generalized hyperbolic skew student’s t-distribution.
Journal of financial econometrics 4 (2), 275–309.
Altun, A., 2012. A variance reduction technique: Stratified sampling. Final Year Project.
Artzner, P., Delbaen, F., Eber, J.-M., Heath, D., 1999. Coherent measures of risk.
Mathematical finance 9 (3), 203–228.
Başoğlu, İ., Hörmann, W., Sak, H., 2013. Optimally stratified importance sampling for portfolio risk with multiple loss thresholds. Optimization 62 (11), 1451–1471.
Birge, J. R., 1995. Quasi-monte carlo approaches to option pricing. Ann Arbor MI
48109, 2117.
Breymann, W., Lüthi, D., 2013. ghyp: A package on generalized hyperbolic distributions. Tech. rep., Institute of Data Analysis and Process Design, 2008. Available online at: http://cran.r-project.org.
Byrd, R. H., Lu, P., Nocedal, J., Zhu, C., 1995. A limited memory algorithm for
bound constrained optimization. SIAM Journal on Scientific Computing 16 (5),
1190–1208.
Caflisch, R. E., 1998. Monte carlo and quasi-monte carlo methods. Acta numerica
7, 1–49.
Christophe, D., Petr, S., 2014. randtoolbox: Generating and Testing Random
Numbers. R package version 1.16.
Demarta, S., McNeil, A. J., 2005. The t copula and related copulas. International
Statistical Review / Revue Internationale de Statistique 73 (1), pp. 111–129.
URL http://www.jstor.org/stable/25472643
Derflinger, G., H¨ormann, W., Leydold, J., 2010. Random variate generation by
numerical inversion when only the density is known. ACM Transactions on Mod-
eling and Computer Simulation (TOMACS) 20 (4), 18.
Duffie, D., Pan, J., 1997. An overview of value at risk. The Journal of derivatives
4 (3), 7–49.
Etoré, P., Jourdain, B., 2010. Adaptive optimal allocation in stratified sampling methods. Methodology and Computing in Applied Probability 12 (3), 335–360.
Genz, A., Bretz, F., Miwa, T., Mi, X., Leisch, F., Scheipl, F., Hothorn, T., 2014.
mvtnorm: Multivariate Normal and t Distributions. R package version 1.0-2.
URL http://CRAN.R-project.org/package=mvtnorm
Glass, D., 1999. Importance sampling applied to value at risk. Ph.D. thesis, New
York University.
Glasserman, P., 2004. Monte Carlo methods in financial engineering. Vol. 53.
Springer.
Glasserman, P., Heidelberger, P., Shahabuddin, N. P., Apr. 30 2002. Pricing of op-
tions using importance sampling and stratification/quasi-monte carlo. US Patent
6,381,586.
Glasserman, P., Heidelberger, P., Shahabuddin, P., 1999. Asymptotically opti-
mal importance sampling and stratification for pricing path-dependent options.
Mathematical finance 9 (2), 117–152.
Glasserman, P., Heidelberger, P., Shahabuddin, P., 2000. Efficient Monte Carlo
methods for value-at-risk. IBM TJ Watson Research Center.
Haugh, M., 2004. Variance reduction methods ii. Monte Carlo Simulation: IEOR
E4703, 1–20.
Imai, J., Tan, K. S., 2006. A general dimension reduction technique for derivative
pricing. Journal of Computational Finance 10 (2), 129.
Joy, C., Boyle, P. P., Tan, K. S., 1996. Quasi-monte carlo methods in numerical
finance. Management Science 42 (6), 926–938.
Kahn, H., Marshall, A. W., 1953. Methods of reducing sample size in monte carlo
computations. Journal of the Operations Research Society of America 1 (5),
263–278.
URL http://dx.doi.org/10.1287/opre.1.5.263
Leydold, J., Hörmann, W., Sak, H., 2014. An R interface to the UNU.RAN library for universal random variate generators.
McNeil, A. J., Frey, R., Embrechts, P., 2010. Quantitative risk management: con-
cepts, techniques, and tools. Princeton university press.
Mehta, A., Neukirchen, M., Pfetsch, S., Poppensieker, T., May 2012. Managing
market risk: Today and tomorrow.
Owen, A. B., 2013. Monte Carlo theory, methods and examples.
Patton, A. J., 2002. Applications of copula theory in financial econometrics. Ph.D.
thesis, University of California, San Diego.
Prause, K., 1999. The generalized hyperbolic model: Estimation, financial deriva-
tives, and risk measures. Ph.D. thesis, PhD thesis, University of Freiburg.
Sak, H., Hörmann, W., Leydold, J., 2010. Efficient risk simulations for linear asset portfolios in the t-copula model. European Journal of Operational Research 202 (3), 802–809.
Sklar, A., 1959. Fonctions de répartition à n dimensions et leurs marges.
Tokdar, S. T., Kass, R. E., 2010. Importance sampling: a review. Wiley Interdis-
ciplinary Reviews: Computational Statistics 2 (1), 54–60.
Appendix
A Algorithms
A.1 AOA Algorithm
Algorithm A.1 The modified AOA algorithm
Input:
simulation function h(x): R^D → R; density function f(x) of the random input; importance sampling density g(x); strata σ_i and respective probabilities p_i, i = 1, ..., I; number of iterations K; the aimed sample sizes N^k for each iteration k = 1, ..., K
Output:
stratified estimator x̂_strs and its variance V[x̂_strs]
1: set M_i^0 = 0 and π_i^1 = p_i for i = 1, ..., I
2: for iteration k = 1, ..., K do
3:   if k ≥ 2 then
4:     compute π_i^k for i = 1, ..., I using (21)
5:   end if
6:   for stratum index i = 1, ..., I do
7:     compute N_i^k using (22) and set M_i^k = N_i^k + M_i^{k−1}
8:     for drawing n = 1, ..., N_i^k do
9:       generate X from density g(X) conditional on X ∈ σ_i
10:      compute the likelihood ratio w(X) = f(X)/g(X)
11:      compute h_IS(X) = h(X) w(X) and add it to sample set S_i
12:     end for
13:     set M_i^k = M_i^{k−1} + N_i^k
14:     compute the sample standard deviation σ̂_i^k in set S_i
15:     if k = K then
16:       compute the sample mean x̂_i^k in set S_i
17:     end if
18:   end for
19: end for
20: return x̂_strs and V[x̂_strs] using (23) and (24), respectively
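The allocation fractions π_i^k of (21) are not reproduced here; the small sketch below only illustrates the resulting allocation rule, in the same proportional form (stratum probability times estimated conditional standard deviation) used by the STRSIS routine in Appendix B.2, where it appears as m <- n[k]*(p*sqrt(s2))/sum(p*sqrt(s2)). The helper name allocate() is our own.

```r
# Hypothetical helper: split a budget of n_k drawings over the strata in
# proportion to stratum probability times estimated conditional std. deviation.
allocate <- function(n_k, p, s_hat) {
  round(n_k * (p * s_hat) / sum(p * s_hat))
}

allocate(10000, p = rep(1/4, 4), s_hat = c(0.1, 0.4, 0.2, 0.3))
# [1] 1000 4000 2000 3000   (strata with larger conditional variability get more drawings)
```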
A.2 Algorithms for Stratified Importance Sampling
Algorithm A.2 Construction of a linear transformation matrix for a given direction v ∈ R^D
Input:
stratification direction v ∈ R^D
Output:
linear transformation matrix V ∈ R^{D×D}
1: define matrix V ∈ R^{D×D} with all entries equal to zero
2: set the first column of V to v
3: for index d = 1, ..., D − 1 do
4:   set V_{D,d−1} = 1
5:   define matrix B ∈ R^{D×D} as the sub-matrix of V between the elements V_{D−d,1} and V_{D−d,d}
6:   define vector b ∈ R^D as the sub-vector of V between the elements V_{D,1} and V_{D,d}
7:   find the unique solution of Bq = b
8:   set the elements between V_{D−d,d+1} and V_{D−1,d+1} to q ∈ R^D
9:   scale the (d + 1)th column of V to unit length
10: end for
11: return V
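Algorithm A.2 fills V column by column by solving small linear systems (this is what the ortmat() routine in Appendix B.2 does). As a hedged alternative sketch, not the report's construction, an orthonormal matrix whose first column points along v can also be obtained from a QR decomposition; the helper name orth_from_direction() below is our own.

```r
# Hypothetical helper: an orthonormal D x D matrix whose first column is v/||v||.
# Assumes the first coordinate of v is nonzero, which holds for the IS direction
# built in alg3() of Appendix B.2 (its first coordinate is fixed to 1).
orth_from_direction <- function(v) {
  v <- v / sqrt(sum(v^2))
  D <- length(v)
  # complete v with standard basis vectors and orthonormalize via QR
  Q <- qr.Q(qr(cbind(v, diag(D)[, -1, drop = FALSE])))
  if (sum(Q[, 1] * v) < 0) Q[, 1] <- -Q[, 1]   # undo a possible sign flip of column 1
  Q
}

V <- orth_from_direction(c(1, 2, 2))
round(crossprod(V), 10)   # identity matrix, so the columns are orthonormal
```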
Algorithm A.3 Stratified importance sampling: Initialization
Input:
parameters of the t-copula model; portfolio return threshold t
Output:
IS parameters θ1 and θ2; pre-multiplier matrix A
1: compute the Cholesky factor L of Σ, i.e., LL' = Σ
2: compute µ and y0 using Algorithm 3 of Sak et al. (2010)
3: set θ1 = ||µ||
4: set θ2 = y0/(ν/2 − 1)
5: set v = µ/θ1
6: call Algorithm A.2 with input v to construct the linear transformation matrix V
7: compute the pre-multiplier matrix A = LV
8: return θ1, θ2 and A
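This initialization can be sketched compactly in R, assuming the IS shift vector mu and the value y0 have already been obtained from Algorithm 3 of Sak et al. (2010) (the alg3() routine in Appendix B.2) and reusing the hypothetical orth_from_direction() helper from the previous sketch; this is an illustration of the initialization, not the exact code used for the experiments.

```r
init_sis <- function(Sigma, mu, y0, nu) {
  L      <- t(chol(Sigma))                 # lower triangular Cholesky factor, L L' = Sigma
  theta1 <- sqrt(sum(mu^2))                # norm of the IS mean shift
  theta2 <- y0 / (nu / 2 - 1)              # new IS scale parameter of the gamma variable
  v      <- mu / theta1                    # unit stratification direction
  A      <- L %*% orth_from_direction(v)   # pre-multiplier matrix A = L V
  list(theta1 = theta1, theta2 = theta2, A = A)
}
```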
Algorithm A.4 Stratified importance sampling: Single estimate
Input:
parameters of the t-copula model; portfolio return threshold t
Output:
tail loss probability estimate x̂_sis and its variance V[x̂_sis]
1: initialize with Algorithm A.3
2: set M_i^0 = 0 for i = 1, ..., I and define sample sets S_i = ∅
3: for iteration k = 1, ..., K do
4:   for stratum index i = 1, ..., I do
5:     compute N_i^k using (31) and set M_i^k = N_i^k + M_i^{k−1}
6:     for drawing n = 1, ..., N_i^k do
7:       generate U1 ∼ U(0, 1) and set Z1 using (28)
8:       generate Z_d ∼ N(0, 1), d = 2, ..., D, independently
9:       generate U2 ∼ U(0, 1) independently and set Y using (29)
10:      generate the multi-t vector T using (27) and compute R(Z, Y) using (8)
11:      compute ρ(Z, Y) 1{R(Z,Y) < t} using (30) and add it to sample set S_i
12:     end for
13:     compute the sample standard deviation σ̂_i^k in set S_i
14:     if k = K then
15:       compute the sample mean x̂_i^k in set S_i
16:     end if
17:   end for
18: end for
19: return x̂_strs and V[x̂_strs] using (23) and (24), respectively
A.3 Algorithm for the combination of Quasi-Monte Carlo
and Stratified Importance Sampling
Algorithm A.5 The combination of Quasi-Monte Carlo and Stratified Importance Sampling
Input:
parameters of the t-copula model; portfolio return threshold t
Output:
tail loss probability estimate x̂_sis and its variance V[x̂_sis]
1: initialize with Algorithm A.3
2: set M_i^0 = 0 for i = 1, ..., I and define sample sets S_i = ∅
3: for iteration k = 1, ..., K do
4:   for stratum index i = 1, ..., I do
5:     compute N_i^k using (31) and set M_i^k = N_i^k + M_i^{k−1}
6:     for drawing n = 1, ..., N_i^k do
7:       generate the two-dimensional quasi-random uniform set QU from a low-discrepancy sequence and assign its two coordinates to U1 and U2
8:       set Z1 using (28)
9:       generate the (D − 1)-dimensional standard normal quasi-random set QZ and assign its dth coordinate to the dth element of Z, d = 2, ..., D
10:      set Y using (29)
11:      generate the multi-t vector T using (27) and compute R(Z, Y) using (8)
12:      compute ρ(Z, Y) 1{R(Z,Y) < t} using (30) and add it to sample set S_i
13:     end for
14:     compute the sample standard deviation σ̂_i^k in set S_i
15:     if k = K then
16:       compute the sample mean x̂_i^k in set S_i
17:     end if
18:   end for
19: end for
20: return x̂_strs and V[x̂_strs] using (23) and (24), respectively
B R codes
B.1 Quasi-random number generation codes
1 #Naive Simulation
2 naive<-function(x){
3 set.seed(1234)
4 n<-100000
5 z<-matrix(rnorm(2*n),n,2)
6 c<-array(0,dim=n)
7 for(i in 1:n){
8 a<-(z[i,1]^2+z[i,2]^2)
9 if(a< x){
10 c[i]=1
11 }
12 }
13 res<-mean(c)
14 se<-sd(c)
15 CI95<-c(res-1.96*se/sqrt(n),res+1.96*se/sqrt(n))
16 c(res,se,CI95)
17 }
18 naive(0.1)
19 [1] 0.04825000 0.21429512 0.04692179 0.04957821
20
21 #Halton sequence
22 library("randtoolbox")
23 halton.sim<-function(x){
24 n<-100000
25 z<-halton(n,dim=2,normal=TRUE)
26 c<-array(0,dim=n)
27 for(i in 1:n){
28 a<-(z[i,1]^2+z[i,2]^2)
29 if(a< x){
30 c[i]=1
31 }
32 }
33 res<-mean(c)
34 se<-sd(c)
35 CI95<-c(res-1.96*se/sqrt(n),res+1.96*se/sqrt(n))
36 c(res,se,CI95)
37 }
38 halton.sim(0.1)
39 [1] 0.04883000 0.21551356 0.04749423 0.05016577
40
41 #Sobol sequence
42 library("randtoolbox")
43 sobol.sim<-function(x){
44 n<-100000
45 z<-sobol(n,dim=2,normal=TRUE)
46 c<-array(0,dim=n)
47 for(i in 1:n){
48 a<-(z[i,1]^2+z[i,2]^2)
49 if(a< x){
50 c[i]=1
51 }
52 }
53 res<-mean(c)
54 se<-sd(c)
55
56 CI95<-c(res-1.96*se/sqrt(n),res+1.96*se/sqrt(n))
57 c(res,se,CI95)
58 }
59 sobol.sim(0.1)
60 [1] 0.04880000 0.21545075 0.04746462 0.05013538
B.2 Codes for the combination of Quasi-Monte Carlo and
Stratified Importance Sampling
1 #LIBRARY
2 library("Runuran")
3 library("randtoolbox")
4 R2X<-function(tab,...){
5 write.table(tab,"clipboard",sep="\t",row.names=F)
6 }
7 #TOOLS
8 searchx<-function(n,p,nu,numg,L,c,w){
9 a<-0.9;b<-1;
10 fa<-Naive(n,a,nu,numg,L,c,w)[1]-p
11 fb<-Naive(n,b,nu,numg,L,c,w)[1]-p
12 err<-1
13 while(err/p>0.05){
14 ab<-(a+b)/2
15 fab<-Naive(n,ab,nu,numg,L,c,w)[1]-p
16 err<-abs(fab)
17 if(fab<0) {a<-ab;fa<-fab}
18 if(fab>=0) {b<-ab;fb<-fab}
19 }
20 ab
21 }
22 ortmat<-function(mat){
23 n<-dim(mat)[1]
24 k<-dim(mat)[2]
25 res<-matrix(0,n,n)
26 res[1:n,1:k]<-mat
27 res[1,(k+1):n]<-1
28 for(i in (k+1):n){
29 res[2:i,i]<-solve(t(res[2:i,1:(i-1)]),-res[1,1:(i-1)])
30 }
31 for(i in 1:n){
32 res[,i]<-res[,i]/sqrt(sum(res[,i]ˆ2))
33 }
34 res
35 }
36 qTLP T<-function(Z,X2,x,nu,numg,L,c,w,df=0){
37 d<-dim(Z)[1];n<-dim(Z)[2];
38 T<-L%*%Z/matrix(sqrt(X2/nu),d,n,byrow=TRUE)
39 Tmg<-qt(pt(T,nu),numg)
40 if(df==1){
41 res<-as.vector(t(w)%*%exp(c*Tmg))-x
42 }else{
43 res<-1*((as.vector(t(w)%*%exp(c*Tmg))-x)<0)
44 }
45 res
46 }
47 touch<-function(r,dir,x,nu,numg,L,c,w){
48 Z<-t(t(r*dir));X2<-nu
49 qTLP T(Z,X2,x,nu,numg,L,c,w,1)+1e-5
50 }
51 alg2<-function(dir,x,nu,numg,L,c,w){
52 dir<-dir/sqrt(sum(dirˆ2))
53 r0<-uniroot(touch,c(-10ˆ3,1e-5),dir,x,nu,numg,L,c,w)$root
54 y0<-(nu-2)/(1+r0ˆ2/nu)
55 z0<-r0*sqrt(y0/nu)*dir
56 of<-(nu/2-1)*(log(y0)-1)
57 res<-c(of,z0,y0)
58 res
59 }
60 alg3<-function(x,nu,numg,L,c,w){
61 dir<-as.vector(L%*%(c*w))
62 dir<-dir/sqrt(sum(dirˆ2))
63 d<-length(dir)-1
64 v<-dir[2:length(dir)]/dir[1]
65 f<-function(z){
66 -alg2(c(1,z),x,nu,numg,L,c,w)[1]
67 }
68 optdir<-optim(v,f,control=list(maxit=10000),lower=rep(0,d-1)
69 ,method="L-BFGS-B")[[1]]
70 optdir<-c(1,optdir)
71 optdir<-optdir/sqrt(sum(optdirˆ2))
72 res<-alg2(optdir,x,nu,numg,L,c,w)[-1]
73 res
74 }
75
76 Naive<-function(n,x,nu,numg,L,c,w){
77 d<-length(c)
78 Z<-matrix(rnorm(d*n),d,n)
79 X2<-rgamma(n,shape=nu/2,scale=2)
80 qZ<-qTLP T(Z,X2,x,nu,numg,L,c,w)
81 res<-mean(qZ)
82 res[5]<-var(qZ)/n
83 res[2]<-sqrt(res[5])*qnorm(0.975)
84 res[3]<-res[1]-res[2]
85 res[4]<-res[1]+res[2]
86 res[6]<-100*res[2]/res[1]
87 names(res)<-c("estimate","halfwidth","%95CILB","%95CIUB",
88 "est.var","rel.err")
89 res
90 }
91
92 IS<-function(n,x,nu,numg,L,c,w){
93 d<-length(c)
94 muy0<-alg3(x,nu,numg,L,c,w)
95 theta<-muy0[d+1]/(nu/2-1)
96 mu<-muy0[1:d]
97 Z<-matrix(rnorm(d*n)+mu,d,n)
98 X2<-rgamma(n,shape=nu/2,scale=theta)
99 wZ<-(dnorm(Z)/dnorm(Z,mu))
100 wIS<-wZ[1,]
101 for(i in 2:d) wIS<-wIS*wZ[i,]
102 wIS<-wIS*dchisq(X2,df=nu)/dgamma(X2,shape=nu/2,scale=theta)
103 qZ<-qTLP T(Z,X2,x,nu,numg,L,c,w)*wIS
104 res<-mean(qZ)
105 res[5]<-var(qZ)/n
106 res[2]<-sqrt(res[5])*qnorm(0.975)
107 res[3]<-res[1]-res[2]
108 res[4]<-res[1]+res[2]
109 res[6]<-100*res[2]/res[1]
110 names(res)<-c("estimate","halfwidth","%95CILB","%95CIUB",
111 "est.var","rel.err")
112 res
113 }
114
115 STRSIS<-function(n,ssize1,ssize2,x,nu,numg,L,c,w){
116 #### DIRECTION OPTIMIZATION
117 d<-length(c)
118 muy0<-alg3(x,nu,numg,L,c,w)
119 theta<-muy0[d+1]/(nu/2-1)
120 mu<-muy0[1:d]
121 A<-ortmat(t(t(mu)))
122 mu<-sqrt(sum(muˆ2))
123 #### STRATA OPTIMIZATION
124 strata1<-0:ssize1/ssize1
125 strata2<-0:ssize2/ssize2
126 #### INITIALIZATION
127 I1<-length(strata1)-1;I2<-length(strata2)-1
128 p<-diff(strata1)%o%diff(strata2);
129 xbar<-s2<-matrix(0,I1,I2)
130 #### FIRST ITERATION
131 m<-p*n[1]
132 fmcumsum<-floor(cumsum(as.vector(m)))
133 M<-matrix(c(fmcumsum[1],diff(fmcumsum)),I1,I2)
134 M<-(M<10)*10+(M≥10)*M
135 StU1<-rep(strata1[-(I1+1)],rowSums(M))+runif(sum(M))
136 *rep(diff(strata1),rowSums(M))
137 Z<-matrix(c(qnorm(StU1,mu),rnorm(sum(M)*(d-1))),d,
138 sum(M),byrow=TRUE)
139 StU2<-rep(rep(strata2[-(I2+1)],I1),as.vector(t(M)))+runif(sum(M))
140 *rep(rep(diff(strata2),I1),as.vector(t(M)))
141 X2<-qgamma(StU2,shape=nu/2,scale=theta)
142 wIS<-(dnorm(Z[1,])/dnorm(Z[1,],mu))*(dchisq(X2,df=nu)/
143 dgamma(X2,shape=nu/2,scale=theta))
144 qZ<-qTLP T(A%*%Z,X2,x,nu,numg,L,c,w)*wIS
145 Mcumsum<-cumsum(t(M))
146 sample<-list(list(qZ[1:Mcumsum[1]]))
147 xbar[1,1]<-mean(sample[[1]][[1]]);
148 s2[1,1]<-var(sample[[1]][[1]])
149 for(j in 2:I2){
150 sample[[1]][[j]]<-qZ[(Mcumsum[j-1]+1):Mcumsum[j]]
151 xbar[1,j]<-mean(sample[[1]][[j]]);
152 s2[1,j]<-var(sample[[1]][[j]])
153 }
154 for(i in 2:I1){
155 sample[[i]]<-list()
156 for(j in 1:I2){
157 sample[[i]][[j]]<-qZ[(Mcumsum[(i-1)*I2+j-1]+1)
158 :Mcumsum[(i-1)*I2+j]]
159 xbar[i,j]<-mean(sample[[i]][[j]]);
160 s2[i,j]<-var(sample[[i]][[j]])
161 }
162 }
163 N<-M
164 #### ADAPTIVE ITERATIONS
165 if(length(n)≥2){
166 for(k in 2:length(n)){
167 m<-n[k]*(p*sqrt(s2))/sum(p*sqrt(s2))
168 fmcumsum<-floor(cumsum(as.vector(m)))
169 M<-matrix(c(fmcumsum[1],diff(fmcumsum)),I1,I2)
170 M<-(M<10)*10+(M≥10)*M
171 StU1<-rep(strata1[-(I1+1)],rowSums(M))
172 +runif(sum(M))*rep(diff(strata1),rowSums(M))
173 Z<-matrix(c(qnorm(StU1,mu),rnorm(sum(M)*(d-1))),d,
174 sum(M),byrow=TRUE)
175 StU2<-rep(rep(strata2[-(I2+1)],I1),as.vector(t(M)))
176 +runif(sum(M))
177 *rep(rep(diff(strata2),I1),as.vector(t(M)))
178 X2<-qgamma(StU2,shape=nu/2,scale=theta)
179 wIS<-(dnorm(Z[1,])/dnorm(Z[1,],mu))*(dchisq(X2,df=nu)/
180 dgamma(X2,shape=nu/2,scale=theta))
181 qZ<-qTLP T(A%*%Z,X2,x,nu,numg,L,c,w)*wIS
182 Mcumsum<-cumsum(t(M))
183 sample[[1]][[1]]<-c(sample[[1]][[1]],qZ[1:Mcumsum[1]])
184 xbar[1,1]<-mean(sample[[1]][[1]]);
185 s2[1,1]<-var(sample[[1]][[1]])
186 for(j in 2:I2){
187 sample[[1]][[j]]<-c(sample[[1]][[j]],qZ[(Mcumsum[j-1]+1)
188 :Mcumsum[j]])
189 xbar[1,j]<-mean(sample[[1]][[j]]);
190 s2[1,j]<-var(sample[[1]][[j]])
191 }
192 for(i in 2:I1){
193 for(j in 1:I2){
194 sample[[i]][[j]]<-c(sample[[i]][[j]],qZ[(Mcumsum[(i-1)
195 *I2+j-1]+1):Mcumsum[(i-1)*I2+j]])
196 xbar[i,j]<-mean(sample[[i]][[j]]);
197 s2[i,j]<-var(sample[[i]][[j]])
198 }
199 }
200 N<-N+M
201 }
202 }
203 #### OUTPUT
204 res<-sum(p*xbar)
205 res[5]<-sum((pˆ2)*s2/N)
206 res[2]<-sqrt(res[5])*qnorm(0.975)
207 res[3]<-res[1]-res[2]
208 res[4]<-res[1]+res[2]
209 res[6]<-100*res[2]/res[1]
210 names(res)<-c("mean","errbound","%95CILB","%95CIUB",
211 "estvar","relperc")
212 res
213 }
214 STRSHal<-function(n,ssize1,ssize2,x,nu,numg,L,c,w){
215 #### DIRECTION OPTIMIZATION
216 d<-length(c)
217 muy0<-alg3(x,nu,numg,L,c,w)
218 theta<-muy0[d+1]/(nu/2-1)
219 mu<-muy0[1:d]
220 A<-ortmat(t(t(mu)))
221 mu<-sqrt(sum(muˆ2))
222 #### STRATA OPTIMIZATION
223 strata1<-0:ssize1/ssize1
224 strata2<-0:ssize2/ssize2
225 #### INITIALIZATION
226 I1<-length(strata1)-1;I2<-length(strata2)-1
227 p<-diff(strata1)%o%diff(strata2);
228 xbar<-s2<-matrix(0,I1,I2)
229 #### FIRST ITERATION
230 m<-p*n[1]
231 fmcumsum<-floor(cumsum(as.vector(m)))
232 M<-matrix(c(fmcumsum[1],diff(fmcumsum)),I1,I2)
233 M<-(M<10)*10+(M≥10)*M
234 HalU<-halton(sum(M),dim=2)
235 StU1<-rep(strata1[-(I1+1)],rowSums(M))+HalU[,1]
236 *rep(diff(strata1),rowSums(M))
237 Z<-t(cbind(qnorm(StU1,mu),halton(sum(M),dim=d-1,normal=T)))
238 StU2<-rep(rep(strata2[-(I2+1)],I1),as.vector(t(M)))
239 +HalU[,2]*rep(rep(diff(strata2),I1),as.vector(t(M)))
240 X2<-qgamma(StU2,shape=nu/2,scale=theta)
241 wIS<-(dnorm(Z[1,])/dnorm(Z[1,],mu))*(dchisq(X2,df=nu)/
242 dgamma(X2,shape=nu/2,scale=theta))
243 qZ<-qTLP T(A%*%Z,X2,x,nu,numg,L,c,w)*wIS
244 Mcumsum<-cumsum(t(M))
245 sample<-list(list(qZ[1:Mcumsum[1]]))
246 xbar[1,1]<-mean(sample[[1]][[1]]);
247 s2[1,1]<-var(sample[[1]][[1]])
248 for(j in 2:I2){
249 sample[[1]][[j]]<-qZ[(Mcumsum[j-1]+1):Mcumsum[j]]
250 xbar[1,j]<-mean(sample[[1]][[j]]);
251 s2[1,j]<-var(sample[[1]][[j]])
252 }
253 for(i in 2:I1){
254 sample[[i]]<-list()
255 for(j in 1:I2){
256 sample[[i]][[j]]<-qZ[(Mcumsum[(i-1)*I2+j-1]+1)
257 :Mcumsum[(i-1)*I2+j]]
258 xbar[i,j]<-mean(sample[[i]][[j]]);
259 s2[i,j]<-var(sample[[i]][[j]])
260 }
261 }
262 N<-M
263 #### ADAPTIVE ITERATIONS
264 if(length(n)≥2){
265 for(k in 2:length(n)){
266 m<-n[k]*(p*sqrt(s2))/sum(p*sqrt(s2))
267 fmcumsum<-floor(cumsum(as.vector(m)))
268 M<-matrix(c(fmcumsum[1],diff(fmcumsum)),I1,I2)
269 M<-(M<10)*10+(M≥10)*M
270 HalU<-halton(sum(M),dim=2)
271 StU1<-rep(strata1[-(I1+1)],rowSums(M))+HalU[,1]
272 *rep(diff(strata1),rowSums(M))
273 Z<-t(cbind(qnorm(StU1,mu),halton(sum(M),dim=d-1,normal=T)))
274 StU2<-rep(rep(strata2[-(I2+1)],I1),as.vector(t(M)))
275 +HalU[,2]*rep(rep(diff(strata2),I1),as.vector(t(M)))
276 X2<-qgamma(StU2,shape=nu/2,scale=theta)
277 wIS<-(dnorm(Z[1,])/dnorm(Z[1,],mu))*(dchisq(X2,df=nu)/
278 dgamma(X2,shape=nu/2,scale=theta))
279 qZ<-qTLP T(A%*%Z,X2,x,nu,numg,L,c,w)*wIS
280 Mcumsum<-cumsum(t(M))
281 sample[[1]][[1]]<-c(sample[[1]][[1]],qZ[1:Mcumsum[1]])
282 xbar[1,1]<-mean(sample[[1]][[1]]);
283 s2[1,1]<-var(sample[[1]][[1]])
284 for(j in 2:I2){
285 sample[[1]][[j]]<-c(sample[[1]][[j]],qZ[(Mcumsum[j-1]+1)
286 :Mcumsum[j]])
287 xbar[1,j]<-mean(sample[[1]][[j]]);
288 s2[1,j]<-var(sample[[1]][[j]])
289 }
290 for(i in 2:I1){
291 for(j in 1:I2){
292 sample[[i]][[j]]<-c(sample[[i]][[j]],qZ[(Mcumsum[(i-1)
293 *I2+j-1]+1):Mcumsum[(i-1)*I2+j]])
294 xbar[i,j]<-mean(sample[[i]][[j]]);
295 s2[i,j]<-var(sample[[i]][[j]])
296 }
297 }
298 N<-N+M
299 }
300 }
301 #### OUTPUT
302 res<-sum(p*xbar)
303 res[5]<-sum((pˆ2)*s2/N)
304 res[2]<-sqrt(res[5])*qnorm(0.975)
305 res[3]<-res[1]-res[2]
306 res[4]<-res[1]+res[2]
307 res[6]<-100*res[2]/res[1]
308 names(res)<-c("mean","errbound","%95CILB","%95CIUB",
309 "estvar","relperc")
310 res
311 }
312 STRSSob<-function(n,ssize1,ssize2,x,nu,numg,L,c,w){
313 #### DIRECTION OPTIMIZATION
314 d<-length(c)
315 muy0<-alg3(x,nu,numg,L,c,w)
316 theta<-muy0[d+1]/(nu/2-1)
317 mu<-muy0[1:d]
318 A<-ortmat(t(t(mu)))
319 mu<-sqrt(sum(muˆ2))
320 #### STRATA OPTIMIZATION
321 strata1<-0:ssize1/ssize1
322 strata2<-0:ssize2/ssize2
323 #### INITIALIZATION
324 I1<-length(strata1)-1;I2<-length(strata2)-1
325 p<-diff(strata1)%o%diff(strata2);
326 xbar<-s2<-matrix(0,I1,I2)
327 #### FIRST ITERATION
328 m<-p*n[1]
329 fmcumsum<-floor(cumsum(as.vector(m)))
330 M<-matrix(c(fmcumsum[1],diff(fmcumsum)),I1,I2)
331 M<-(M<10)*10+(M≥10)*M
332 SobU<-sobol(sum(M),dim=2)
333 StU1<-rep(strata1[-(I1+1)],rowSums(M))+SobU[,1]
334 *rep(diff(strata1),rowSums(M))
335 Z<-t(cbind(qnorm(StU1,mu),sobol(sum(M),dim=d-1,normal=T)))
336 StU2<-rep(rep(strata2[-(I2+1)],I1),as.vector(t(M)))
337 +SobU[,2]*rep(rep(diff(strata2),I1),as.vector(t(M)))
338 X2<-qgamma(StU2,shape=nu/2,scale=theta)
339 wIS<-(dnorm(Z[1,])/dnorm(Z[1,],mu))*(dchisq(X2,df=nu)/
340 dgamma(X2,shape=nu/2,scale=theta))
341 qZ<-qTLP T(A%*%Z,X2,x,nu,numg,L,c,w)*wIS
342 Mcumsum<-cumsum(t(M))
343 sample<-list(list(qZ[1:Mcumsum[1]]))
344 xbar[1,1]<-mean(sample[[1]][[1]]);s2[1,1]<-var(sample[[1]][[1]])
345 for(j in 2:I2){
346 sample[[1]][[j]]<-qZ[(Mcumsum[j-1]+1):Mcumsum[j]]
347 xbar[1,j]<-mean(sample[[1]][[j]]);s2[1,j]<-var(sample[[1]][[j]])
348 }
349 for(i in 2:I1){
350 sample[[i]]<-list()
351 for(j in 1:I2){
352 sample[[i]][[j]]<-qZ[(Mcumsum[(i-1)*I2+j-1]+1)
353 :Mcumsum[(i-1)*I2+j]]
354 xbar[i,j]<-mean(sample[[i]][[j]]);
355 s2[i,j]<-var(sample[[i]][[j]])
356 }
357 }
358 N<-M
359 #### ADAPTIVE ITERATIONS
360 if(length(n)≥2){
361 for(k in 2:length(n)){
362 m<-n[k]*(p*sqrt(s2))/sum(p*sqrt(s2))
363 fmcumsum<-floor(cumsum(as.vector(m)))
364 M<-matrix(c(fmcumsum[1],diff(fmcumsum)),I1,I2)
365 M<-(M<10)*10+(M≥10)*M
366 SobU<-sobol(sum(M),dim=2)
367 StU1<-rep(strata1[-(I1+1)],rowSums(M))+SobU[,1]
368 *rep(diff(strata1),rowSums(M))
369 Z<-t(cbind(qnorm(StU1,mu),sobol(sum(M),dim=d-1,normal=T)))
370 StU2<-rep(rep(strata2[-(I2+1)],I1),as.vector(t(M)))
371 +SobU[,2]*rep(rep(diff(strata2),I1),as.vector(t(M)))
372 X2<-qgamma(StU2,shape=nu/2,scale=theta)
373 wIS<-(dnorm(Z[1,])/dnorm(Z[1,],mu))*(dchisq(X2,df=nu)/
374 dgamma(X2,shape=nu/2,scale=theta))
375 qZ<-qTLP T(A%*%Z,X2,x,nu,numg,L,c,w)*wIS
376 Mcumsum<-cumsum(t(M))
377 sample[[1]][[1]]<-c(sample[[1]][[1]],qZ[1:Mcumsum[1]])
378 xbar[1,1]<-mean(sample[[1]][[1]]);
379 s2[1,1]<-var(sample[[1]][[1]])
380 for(j in 2:I2){
381 sample[[1]][[j]]<-c(sample[[1]][[j]],qZ[(Mcumsum[j-1]+1)
382 :Mcumsum[j]])
383 xbar[1,j]<-mean(sample[[1]][[j]]);
384 s2[1,j]<-var(sample[[1]][[j]])
385 }
386 for(i in 2:I1){
387 for(j in 1:I2){
388 sample[[i]][[j]]<-c(sample[[i]][[j]],qZ[(Mcumsum[(i-1)
389 *I2+j-1]+1):Mcumsum[(i-1)*I2+j]])
390 xbar[i,j]<-mean(sample[[i]][[j]]);
391 s2[i,j]<-var(sample[[i]][[j]])
392 }
393 }
394 N<-N+M
395 }
396 }
397 #### OUTPUT
398 res<-sum(p*xbar)
399 res[5]<-sum((pˆ2)*s2/N)
400 res[2]<-sqrt(res[5])*qnorm(0.975)
401 res[3]<-res[1]-res[2]
402 res[4]<-res[1]+res[2]
403 res[6]<-100*res[2]/res[1]
404 names(res)<-c("mean","errbound","%95CILB","%95CIUB",
405 "estvar","relperc")
406 res
407 }
408 #EXPERIMENTS
409 set.seed(1234);
410 d<-2
411 n<-10ˆ5;nstrs<-1:4*10ˆ4
412 ssize1<-22;ssize2<-22;
413 nu<-5;p<-0.05;
414
415 corelMatrix<-function(){
416 R <- array(-1,c(d,d))
417 for (i in 1:d){
418 for (j in 1:d){
419 if(i != j){
420 if(R[j,i] != -1)
421 R[i,j] = R[j,i]
422 else{
423 R[i,j] = runif(1)*0.3
424 }
425 }
426 else
427 R[i,j] = 1.0
428 }
429 }
430 R
431 }
432 R<-corelMatrix()
433 L<-t(chol(R))
434
435 volVector<-function(){
436 vol <- runif(d)*(0.2-0.05) + 0.05
437 vol
438 }
439 vol<-volVector()
440
441 assignWeights <- function(){
442 w <- array(0,c(d))
443 sum <- 0
444 for (j in 1:d){
445 w[j] <- runif(1,1/(1000),1)
446 sum <- sum + w[j]
447 }
448 for (j in 1:d){
449 w[j] <- w[j] / sum
450 }
451 w
452 }
453 w<-assignWeights()
454
455 numg<-5 + 45 * runif(d)
456 c<-sqrt((volˆ2/252)*(numg-2)/numg)
457
458 x<-searchx(10000,p,nu,numg,L,c,w)
459 Naive(n,x,nu,numg,L,c,w)
460 IS(n,x,nu,numg,L,c,w)
461 STRSIS(nstrs,ssize1,ssize2,x,nu,numg,L,c,w)
462 STRSHal(nstrs,ssize1,ssize2,x,nu,numg,L,c,w)
463 STRSSob(nstrs,ssize1,ssize2,x,nu,numg,L,c,w)
464
465 system.time(Naive(n,x,nu,numg,L,c,w))
466 system.time(IS(n,x,nu,numg,L,c,w))
467 system.time(STRSIS(nstrs,ssize1,ssize2,x,nu,numg,L,c,w))
468 system.time(STRSHal(nstrs,ssize1,ssize2,x,nu,numg,L,c,w))
469 system.time(STRSSob(nstrs,ssize1,ssize2,x,nu,numg,L,c,w))
60

More Related Content

What's hot

11/04 Regular Meeting: Monority Report in Fraud Detection Classification of S...
11/04 Regular Meeting: Monority Report in Fraud Detection Classification of S...11/04 Regular Meeting: Monority Report in Fraud Detection Classification of S...
11/04 Regular Meeting: Monority Report in Fraud Detection Classification of S...guest48424e
 
Six Sigma Methods and Formulas for Successful Quality Management
Six Sigma Methods and Formulas for Successful Quality ManagementSix Sigma Methods and Formulas for Successful Quality Management
Six Sigma Methods and Formulas for Successful Quality ManagementRSIS International
 
PEMF2_SDM_2012_Ali
PEMF2_SDM_2012_AliPEMF2_SDM_2012_Ali
PEMF2_SDM_2012_AliMDO_Lab
 
The Use of ARCH and GARCH Models for Estimating and Forecasting Volatility-ru...
The Use of ARCH and GARCH Models for Estimating and Forecasting Volatility-ru...The Use of ARCH and GARCH Models for Estimating and Forecasting Volatility-ru...
The Use of ARCH and GARCH Models for Estimating and Forecasting Volatility-ru...Ismet Kale
 
ModelSelection1_WCSMO_2013_Ali
ModelSelection1_WCSMO_2013_AliModelSelection1_WCSMO_2013_Ali
ModelSelection1_WCSMO_2013_AliMDO_Lab
 
COSMOS-ASME-IDETC-2014
COSMOS-ASME-IDETC-2014COSMOS-ASME-IDETC-2014
COSMOS-ASME-IDETC-2014OptiModel
 
9. the efficiency of volatility financial model with
9. the efficiency of volatility financial model with9. the efficiency of volatility financial model with
9. the efficiency of volatility financial model withikhwanecdc
 
Mathematical modeling models, analysis and applications ( pdf drive )
Mathematical modeling  models, analysis and applications ( pdf drive )Mathematical modeling  models, analysis and applications ( pdf drive )
Mathematical modeling models, analysis and applications ( pdf drive )UsairamSheraz
 
CH&Cie white paper value-at-risk in tuburlent times_VaR
CH&Cie white paper value-at-risk in tuburlent times_VaRCH&Cie white paper value-at-risk in tuburlent times_VaR
CH&Cie white paper value-at-risk in tuburlent times_VaRThibault Le Pomellec
 
Presentation on GMM
Presentation on GMMPresentation on GMM
Presentation on GMMMoses sichei
 
THE NEW HYBRID COAW METHOD FOR SOLVING MULTI-OBJECTIVE PROBLEMS
THE NEW HYBRID COAW METHOD FOR SOLVING MULTI-OBJECTIVE PROBLEMSTHE NEW HYBRID COAW METHOD FOR SOLVING MULTI-OBJECTIVE PROBLEMS
THE NEW HYBRID COAW METHOD FOR SOLVING MULTI-OBJECTIVE PROBLEMSijfcstjournal
 

What's hot (12)

11/04 Regular Meeting: Monority Report in Fraud Detection Classification of S...
11/04 Regular Meeting: Monority Report in Fraud Detection Classification of S...11/04 Regular Meeting: Monority Report in Fraud Detection Classification of S...
11/04 Regular Meeting: Monority Report in Fraud Detection Classification of S...
 
Six Sigma Methods and Formulas for Successful Quality Management
Six Sigma Methods and Formulas for Successful Quality ManagementSix Sigma Methods and Formulas for Successful Quality Management
Six Sigma Methods and Formulas for Successful Quality Management
 
PEMF2_SDM_2012_Ali
PEMF2_SDM_2012_AliPEMF2_SDM_2012_Ali
PEMF2_SDM_2012_Ali
 
The Use of ARCH and GARCH Models for Estimating and Forecasting Volatility-ru...
The Use of ARCH and GARCH Models for Estimating and Forecasting Volatility-ru...The Use of ARCH and GARCH Models for Estimating and Forecasting Volatility-ru...
The Use of ARCH and GARCH Models for Estimating and Forecasting Volatility-ru...
 
ModelSelection1_WCSMO_2013_Ali
ModelSelection1_WCSMO_2013_AliModelSelection1_WCSMO_2013_Ali
ModelSelection1_WCSMO_2013_Ali
 
COSMOS-ASME-IDETC-2014
COSMOS-ASME-IDETC-2014COSMOS-ASME-IDETC-2014
COSMOS-ASME-IDETC-2014
 
9. the efficiency of volatility financial model with
9. the efficiency of volatility financial model with9. the efficiency of volatility financial model with
9. the efficiency of volatility financial model with
 
GARCH
GARCHGARCH
GARCH
 
Mathematical modeling models, analysis and applications ( pdf drive )
Mathematical modeling  models, analysis and applications ( pdf drive )Mathematical modeling  models, analysis and applications ( pdf drive )
Mathematical modeling models, analysis and applications ( pdf drive )
 
CH&Cie white paper value-at-risk in tuburlent times_VaR
CH&Cie white paper value-at-risk in tuburlent times_VaRCH&Cie white paper value-at-risk in tuburlent times_VaR
CH&Cie white paper value-at-risk in tuburlent times_VaR
 
Presentation on GMM
Presentation on GMMPresentation on GMM
Presentation on GMM
 
THE NEW HYBRID COAW METHOD FOR SOLVING MULTI-OBJECTIVE PROBLEMS
THE NEW HYBRID COAW METHOD FOR SOLVING MULTI-OBJECTIVE PROBLEMSTHE NEW HYBRID COAW METHOD FOR SOLVING MULTI-OBJECTIVE PROBLEMS
THE NEW HYBRID COAW METHOD FOR SOLVING MULTI-OBJECTIVE PROBLEMS
 

Similar to 1100163YifanGuo

The value at risk
The value at risk The value at risk
The value at risk Jibin Lin
 
Clustering Financial Time Series and Evidences of Memory E
Clustering Financial Time Series and Evidences of Memory EClustering Financial Time Series and Evidences of Memory E
Clustering Financial Time Series and Evidences of Memory EGabriele Pompa, PhD
 
quantitative-risk-analysis
quantitative-risk-analysisquantitative-risk-analysis
quantitative-risk-analysisDuong Duy Nguyen
 
Quantum Variables in Finance and Neuroscience Lecture Slides
Quantum Variables in Finance and Neuroscience Lecture SlidesQuantum Variables in Finance and Neuroscience Lecture Slides
Quantum Variables in Finance and Neuroscience Lecture SlidesLester Ingber
 
Graham Ziervogel Masters Dissertation
Graham Ziervogel Masters DissertationGraham Ziervogel Masters Dissertation
Graham Ziervogel Masters DissertationGraham Ziervogel
 
2012-02-17_Vojtech-Seman_Rigorous_Thesis
2012-02-17_Vojtech-Seman_Rigorous_Thesis2012-02-17_Vojtech-Seman_Rigorous_Thesis
2012-02-17_Vojtech-Seman_Rigorous_ThesisVojtech Seman
 
Valuing Risky Income Streams in Incomplete Markets
Valuing Risky Income Streams in Incomplete MarketsValuing Risky Income Streams in Incomplete Markets
Valuing Risky Income Streams in Incomplete Marketsclarelindeque
 
A Cooperative Coevolutionary Approach to Maximise Surveillance Coverage of UA...
A Cooperative Coevolutionary Approach to Maximise Surveillance Coverage of UA...A Cooperative Coevolutionary Approach to Maximise Surveillance Coverage of UA...
A Cooperative Coevolutionary Approach to Maximise Surveillance Coverage of UA...Daniel H. Stolfi
 
RDO_01_2016_Journal_P_Web
RDO_01_2016_Journal_P_WebRDO_01_2016_Journal_P_Web
RDO_01_2016_Journal_P_WebSahl Martin
 
23 Equipment Selection For Mining With Case Studies
23 Equipment Selection For Mining  With Case Studies23 Equipment Selection For Mining  With Case Studies
23 Equipment Selection For Mining With Case StudiesBrittany Brown
 

Similar to 1100163YifanGuo (20)

The value at risk
The value at risk The value at risk
The value at risk
 
Samba0804
Samba0804Samba0804
Samba0804
 
vatter_pdm_1.1
vatter_pdm_1.1vatter_pdm_1.1
vatter_pdm_1.1
 
Clustering Financial Time Series and Evidences of Memory E
Clustering Financial Time Series and Evidences of Memory EClustering Financial Time Series and Evidences of Memory E
Clustering Financial Time Series and Evidences of Memory E
 
HonsTokelo
HonsTokeloHonsTokelo
HonsTokelo
 
thesis
thesisthesis
thesis
 
Sheikh-Bagheri_etal
Sheikh-Bagheri_etalSheikh-Bagheri_etal
Sheikh-Bagheri_etal
 
Crude Oil Levy
Crude Oil LevyCrude Oil Levy
Crude Oil Levy
 
pro-1
pro-1pro-1
pro-1
 
project report(1)
project report(1)project report(1)
project report(1)
 
quantitative-risk-analysis
quantitative-risk-analysisquantitative-risk-analysis
quantitative-risk-analysis
 
Quantum Variables in Finance and Neuroscience Lecture Slides
Quantum Variables in Finance and Neuroscience Lecture SlidesQuantum Variables in Finance and Neuroscience Lecture Slides
Quantum Variables in Finance and Neuroscience Lecture Slides
 
Graham Ziervogel Masters Dissertation
Graham Ziervogel Masters DissertationGraham Ziervogel Masters Dissertation
Graham Ziervogel Masters Dissertation
 
MScThesis1
MScThesis1MScThesis1
MScThesis1
 
2012-02-17_Vojtech-Seman_Rigorous_Thesis
2012-02-17_Vojtech-Seman_Rigorous_Thesis2012-02-17_Vojtech-Seman_Rigorous_Thesis
2012-02-17_Vojtech-Seman_Rigorous_Thesis
 
Valuing Risky Income Streams in Incomplete Markets
Valuing Risky Income Streams in Incomplete MarketsValuing Risky Income Streams in Incomplete Markets
Valuing Risky Income Streams in Incomplete Markets
 
A Cooperative Coevolutionary Approach to Maximise Surveillance Coverage of UA...
A Cooperative Coevolutionary Approach to Maximise Surveillance Coverage of UA...A Cooperative Coevolutionary Approach to Maximise Surveillance Coverage of UA...
A Cooperative Coevolutionary Approach to Maximise Surveillance Coverage of UA...
 
thesis
thesisthesis
thesis
 
RDO_01_2016_Journal_P_Web
RDO_01_2016_Journal_P_WebRDO_01_2016_Journal_P_Web
RDO_01_2016_Journal_P_Web
 
23 Equipment Selection For Mining With Case Studies
23 Equipment Selection For Mining  With Case Studies23 Equipment Selection For Mining  With Case Studies
23 Equipment Selection For Mining With Case Studies
 

1100163YifanGuo

  • 1. Xi’an Jiaotong-Liverpool University Final Year Project Final Report Quasi-Monte Carlo Methods in Market Risk Management 拟拟拟蒙蒙蒙特特特卡卡卡罗罗罗方方方法法法 在在在市市市场场场风风风险险险管管管理理理中中中的的的运运运用用用 Author: Yifan Guo ID Number: 1100163 Supervisor: Dr. Halis Sak
  • 3. Contents 1 Introduction 6 2 Justification of Research Problem 7 3 Literature Review 8 4 The model 10 4.1 Copula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 4.1.1 Gauss Copula . . . . . . . . . . . . . . . . . . . . . . . . . . 11 4.1.2 t-copula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 4.2 The stock portfolio model with t-copula . . . . . . . . . . . . . . . 16 4.3 Naive simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 5 Methodology 19 5.1 Variance reduction methods . . . . . . . . . . . . . . . . . . . . . . 19 5.1.1 Importance Sampling . . . . . . . . . . . . . . . . . . . . . . 19 5.1.2 Stratified Sampling . . . . . . . . . . . . . . . . . . . . . . . 23 5.2 Quasi-Monte Carlo . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 5.2.1 General Principles and Discrepancy . . . . . . . . . . . . . . 30 5.2.2 Halton Sequence and Sobol Sequence . . . . . . . . . . . . . 32 5.3 Combination of Quasi-Monte Carlo and Stratified Importance Sam- pling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 6 Numerical Analysis 36 7 Conclusion 38 References 40 Appendix 42 A Algorithms 43 A.1 AOA Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 A.2 Algorithms for Stratified Importance Sampling . . . . . . . . . . . . 44 2
  • 4. A.3 Algorithm for the combination of Quasi-Monte Carlo and Stratified Importance Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . 46 B R codes 47 B.1 Quasi-random number generation codes . . . . . . . . . . . . . . . . 47 B.2 Codes for the combination of Quasi-Monte Carlo and Stratified Im- portance Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 3
  • 5. Abstract In risk management, tail loss probability measures the risk when stock portfolio return is smaller specific threshold level. A t-copula stock portfolio model is utilized to estimate tail loss probability. This report first outlines how to use the existing two variance reduction methods importance sam- pling and stratified importance sampling proposed by Sak et al. (2010) and Ba¸so˘glu et al. (2013) are applied to market risk management. Our aim is to try to combine Quasi-Monte Carlo with stratified importance sampling to get a further variance reduction. Our numerical results suggest that this combination method does not improve the variance reduction too much. 4
  • 6. List of Figures 1 Simulation of Gauss copula . . . . . . . . . . . . . . . . . . . . . . . 12 2 Simulation of t-copula . . . . . . . . . . . . . . . . . . . . . . . . . 14 3 Dependency comparison . . . . . . . . . . . . . . . . . . . . . . . . 16 4 Uniform Random Pairs . . . . . . . . . . . . . . . . . . . . . . . . . 33 5 Halton Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 6 Sobol Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 7 High Dimensional Halton Sequence . . . . . . . . . . . . . . . . . . 35 8 High Dimensional Sobol Sequence . . . . . . . . . . . . . . . . . . . 35 List of Tables 1 P(R < t) ≈ 0.05 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 2 P(R < t) ≈ 0.001 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 5
  • 7. 1 Introduction Monte Carlo simulations have been applied in many areas. Among those, one important area is market risk management. Market risk focuses on the possibil- ity that a bank may suffer an undesired loss due to price changes (Mehta et al., 2012). Market risks are usually categorized as equity risk, interest rate risk, cur- rency risk and commodity risk. Over past decades, bankers have utilized different mathematical models and tools to manage market risks. It is widely accepted that Value-at-Risk (VaR) is a standard risk assessment tool in the industry. With a given time period T and known loss distribution P, Value-at-Risk is the largest loss number l such that the probability of being exceeded (L > l) is at most (1 − α) (Duffie and Pan, 1997). The reverse of VaR computations is trying to find α given l (Glass, 1999). The tail loss probability, the target risk measure in this project, is that reverse such that the return R of the stock portfolio is smaller than x, i.e. P(R < x). VaR calculations are basically based on probability distributions and it is important to find an approach to effectively simulate real market scenarios. Variance-covariance solutions, historical simulation and Monte Carlo simulations are commonly used and the last one is considered to be flexible for complex scenarios without unrealistic assumptions (Glasserman et al., 2000). Monte Carlo simulation approximates the expectations, by computing the sam- ple mean of a random sample drawn from a uniform distribution (Glasserman, 2004, Chapter 1, Page 2). However, plain Monte Carlo simulations are usually accompanied with huge variance and consume much time because of huge repli- cations (Glasserman et al., 2000). In this case, variance reduction methods are often utilized in Monte Carlo simulations. Those methods improve the efficiency of Monte Carlo simulation by reducing the variance of simulation estimates. A big advantage is that instead of offering general methods with general applica- tions, they are often applied to specific features to each problem (Glasserman, 2004, Chapter 4, Page 185). Most commonly used methods are control variates, antithetic variates, stratified sampling and importance sampling. Among these methods, importance sampling method design a density distribution function to 6
  • 8. estimate the expectations with lower variance. Stratified sampling is to fraction- ally draw observations from each stratum of the sample space. (Glasserman, 2004, Chapter 4, Page 209 and 255). Also, the combination of these two methods further improve the accuracy. Ba¸so˘glu et al. (2013) provide a efficient simulation method, which can reduce the variance as well as minimize the maximum relative error. Another technique to improve the efficiency of Monte Carlo simulations is quasi- Monte Carlo method for generating quasi-random numbers. Ordinary Monte Carlo method generate random numbers using pseudo-random sequences, often leading to a clumping of observations (Caflisch, 1998). Quasi-Monte Carlo method em- ploys low-discrepancy sequences, like the Halton sequence, to generate determin- istic numbers instead of random numbers. This deterministic method increases the accuracy because it generate evenly distributed numbers (Glasserman, 2004, Chapter 5, Page 180). Quasi-Monte Carlo methods have been applied to numer- ical finance, especially to price derivatives. For example, Birge (1995) stated the low-discrepancy sequence accelerated the convergence of their estimates to true call option prices. This report is written in followed structure. In Section 2, research problem will be justified. In Section 3, a literature review will be given, describing most related models in previous papers. In Section 4, an introduction of our stock port- folio model with naive simulation of estimating, including copula and t-copula are shown. Methodology part in Section 5 firstly explains variance reduction method, importance sampling and stratified sampling, in details with small exam- ples. Quasi-Monte Carlo and the combination of QMC and stratified importance sampling are then illustrated also in Section 5. Finally, numerical analysis part in Section 6 shows the results of relevant experiments, discussing the improvement of variance reduction of the combined method. 2 Justification of Research Problem In this project, the objective is to find an efficient simulation method combining quasi-Monte Carlo method, importance sampling and stratified sampling, aiming to improve the efficiency for estimating the tail loss probability. Papers of quasi- 7
  • 9. Monte Carlo methods mostly applied to option pricing, and there is no applications related to risk management, especially tail loss probability. On the other hand, no paper implies a combined method employing IS, SIS and QMC to improve the efficiency of the tail loss probability estimate. This project first tries to realize this combination and hope to create an simulation method with significant efficiency improvements. 3 Literature Review Sak et al. (2010) gives an efficient risk simulations in t-copula model, using an importance sampling algorithm. The method is based on a stock portfolio model employing t-copula for dependence structure of logarithm return of the portfo- lio. Copula gives a way to estimate marginal behavior of each risk factor of the portfolio rather than to find the joint distribution function, which is time-saving (McNeil et al., 2010, Chapter 5). Sak et al. (2010) choose t-copula dependence structure with generalized hyperbolic marginal distributions and also t marginal distributions, because this model is more heavy-tailed and accurate (Demarta and McNeil, 2005). Also, Prause (1999) and Aas and Haff (2006) both suggest that the generalized hyperbolic distribution provide high flexibility and a good fit to the marginal behaviors of financial data. In this case, a fast inversion method PINV can be utilized to compute the quantiles given generalized hyperbolic marginals (Der- flinger et al., 2010). However, naive simulation using crude Monte Carlo method has large variance. Therefore, Sak et al. (2010) propose an efficient importance sampling algorithm. This method adds a mean shift vector µ to the random vec- tor Z and changes the scale parameter of the chi-squared distribution Y . These values are obtained by solving an optimization problem: the mode of the optimal IS density is equal to the zero-variance IS density Glasserman et al. (1999). The optimization process exploits the algorithm BFGS suggested by Byrd et al. (1995). The numerical results showed the method can estimate the risk factors efficiently and accurately. Ba¸so˘glu et al. (2013) combined importance sampling and stratified sampling to further improve the accuracy. The method modifies the adaptive optimal alloca- tion algorithm (AOA) for stratification. By using conditional standard deviations, 8
  • 10. AOA adapts the allocations stratum by stratum and finally get a weighted sample average of the drawings in the last iteration as the stratification estimator. The allocation proportions converge to the optimal allocations such that the variance is minimal during iterations (Etor´e and Jourdain, 2010). Ba¸so˘glu et al. (2013) states that in their modifications, the number of observations allocated in all it- erations is not necessarily equal to the number of observations of the stratum of interest; the size of allocation in stratum is increased to 10. After the modifica- tion, the importance sampling can be added to AOA algorithm: multiplying the AOA estimator by the importance sampling (IS) likelihood ratio draws a new es- timator. When estimating tail loss probability, the stratified importance sampling (SIS) algorithm based on IS parameters calculated by Algorithm 3 of Sak et al. (2010). It requires multi-normal random vector Z to be stratified along with the IS mean shift vector’s direction, µd = µ/ µ , and the gamma variable Y to be stratified under new IS scale parameter. This SIS algorithm effectively minimizes the maximum relative error of multiple tail loss probability estimates according to numerical experiments (Ba¸so˘glu et al., 2013). The two variance reduction methods summarized above use pseudo-random number generators. These methods have limitations because Monte Carlo has a much larger convergence, compared to quasi-Monte Carlo. Therefore, replacing the pseudo-random number generators with low-discrepancy sequence is likely to accelerate the convergence of the simulation and reduce the variance. The basic principle of quasi-Monte Carlo approximation is the same as Mont Carlo (MC) Methods. We want to estimate the expectation µ = E[f(U)] = [0,1)d f(x)dx, where U is a d-dimensional vector of uniformly distributed independent random variables U = (U1, U2, ..., Ud). The quasi-Monte Carlo method estimator for the expectation is ˆµ = 1 n n i=1 f(xi), where x1, ..., xn are deterministically chosen in the unit hypercube [0, 1)d . Efficient QMC methods mainly contain [1] dimension reduction method and [2] efficient low-discrepancy sequence. [1] QMC methods explicitly depend on the dimension of the problem and should be applied under an upper bound of dimension d, otherwise they will not work. Low dimensions are stated to have lower relative 9
  • 11. error compared to high dimensions (Glasserman, 2004, Chapter 5, Page282-283). Caflisch (1998) emphasizes the significant improvement of efficiency for a dimen- sion reduction method, when Brownian bridge method reduces the high dimension of the path dependent problem to a moderate size. [2] QMC methods employ low- discrepancy sequences to fulfill xi in the hypercube uniformly. Usual sequences are Halton sequence and Sobol sequence(Glasserman, 2004, Chapter 5, Page 293-314). Joy et al. (1996) exploited Faure sequence for standard European options and the numerical results show a good convergence to true values and lower error bounds. In experiment of Birge (1995), for a path based option, the previous mentioned three sequences all reduce the observations being used in the simulation and the computation time. The most effective sequence with smallest error is Sobol se- quence, but it consumes more computation time than other sequences. Besides, traditional variance reduction techniques can still improve the accuracy under quasi-Monte Carlo simulations. Joy et al. (1996) state Richardson extrapolation in conjunction with quasi- Monte Carlo methods can further improve the accu- racy. Glasserman et al. (2002) discuss the combination of IS and QMC methods in option pricing. 4 The model The foundation model of the stock portfolio to compute the market risk is based on t-copula. Therefore, we will first illustrate copula, especially t-copula, and then define our specific stock portfolio model with t-copula. 4.1 Copula For a random vector of variables, copula enables us to separately look at the marginal distribution of each variable and their dependence structure, while joint distribution functions often illustrate that information in a more implicit way. It is applied especially in risk management, in a view of the fact that we often have access to individual risk factors marginal behavior rather than all factors’ depen- dence structure. We could find out the dependence model from various possible dependence structures after estimating the marginal distribution functions. It is 10
  • 12. time-saving compared with finding an explicit joint distribution function (McNeil et al., 2010). Definition. (McNeil et al., 2010, Chapter 5, page 185) C : [0, 1]d → [0, 1] is a d-dimensional copula when C is a joint cumulative distribution function of a d-dimensional vector u = (u1, u2, ..., ud) on the unit cube [0, 1]d with standard uniform marginal distribution. We note copulas as C(u) = C(u1, ..., ud). Sklar (1959) offers an approach (Sklar‘s Theorem)to find a copula of a joint distribution by firstly specifying marginal distribution functions: For a joint distri- bution F of a random vector X = (X1, ..., Xd) with marginal distributions CDFs F1, F2, ..., Fd, a copula C : [0, 1]d → [0, 1] exists such that F(x1, ..., xd) = C(F1(x1), ..., Fd(xd)). (1) Several copulas, such as Gauss, Gumbel copula and t-copula have been identi- fied and applied in fitting real world variables. Gauss copula is one of the implicit copulas that have no closed-form expressions but are drawn directly from usual multivariate distribution using Sklar’s Theorem. The introduction of this copula and an example of its simulation will be given in next subsection. 4.1.1 Gauss Copula Before we get to know about Gauss copula, we first try to define the notion of the copula of a specific distribution. Definition. (Copula of F) When a random vector X has joint distribution function with continuous marginal distributions F1, ..., Fd, this copula of F (or X) is then called the distribution function C of (F1(X1), ..., Fd(Xd)). Therefore, if Y ∼ Nd(µ, Σ) is a Gaussian random vector where µ is the mean vector Σ is the correlation matrix of Y (detailed explanations see (McNeil et al., 2010, Chaper 2, Page 63)) ,its copula is so-called Gauss copula. McNeil et al. (2010) suggest that Y is equal to the copula of X ∼ Nd(0, P), where P = ℘(Σ) (see (McNeil et al., 2010, Chapter 3, Page 64)) is the correlation matrix of Y . By 11
  • 13. Definition of Copula of F, the Gauss copula is thus given by CGa P (u) = P(Φ(X1) u1, ..., Φ(Xd) ud) = ΦP (Φ−1 (U1), ..., Φ−1 (Ud)), (2) where Φ denotes the standard univariate normal distribution function and Φ de- notes the joint distribution function of X. The notation CGa P emphasizes that d(d − 1)/2 parameters of the correlation matrix parametrize the copula; CGa ρ is written where ρ = ρ(X1, X2) for two dimensional copula. Now we can draw a simulation of Gauss copula in the following example. Figure 1: Simulation of Gauss copula Example. Simulation of Gauss copula If we want to simulate a bivariate Gauss copula (the dimension of Gauss copula is 2) with ρ = 0.7, we can write the following R codes. 12
  • 14. 1 U<-GaussCopula<-function(n){ 2 L<-t(chol(R)) 3 Z<-matrix(rnorm(dim*n),n,dim) 4 mu<-rep(0,dim) 5 X<-matrix(0,n,dim) 6 for(i in 1:n){ 7 X[i,]<-mu+L%*%Z[i,] 8 } 9 U<-pnorm(X) 10 U 11 } 12 R<-matrix(c(1,0.7,0.7,1),2,2,byrow="T") 13 dim<-2 14 U<-U(2000) 15 plot(U[,1],U[,2],pch="*",col="red", 16 main="Gauss copula simulated points",xlab="U1",ylab="U2") In a addition, we draw the density when the total number of simulated points is 2000 in Figure 1 for Gauss copula example. A generalized algorithm to simulate Gauss copula proposed by McNeil et al. (2010) is shown in Algorithm 4.1 Algorithm 4.1 Simulation of Gauss copula Input: the correlation matrix P ∈ Rd×d Output: the random vector U which has distribution function CGa P 1: generate Cholesky factor L of P 2: generate a vector Z = (Z1, ..., Zd) where, Zd ∼ N(0, 1), d = 1, ..., D 3: compute X = LZ 4: return U = (Φ(X1), ..., Φ(Xd)) , where Φ is the standard normal distribution function. 4.1.2 t-copula Both Gauss copula and t-copula are easy to construct compared with Gumbel and Clayton copulas because they can both extracted from usual multivariate distributions. 13
  • 15. Figure 2: Simulation of t-copula Definition. For a td(ν, 0, P) distributed random vector X, with mean vector 0, the correlation matrix P implied by the dispersion matrix Σ and the degree of freedom ν, the unique t-copula takes the form Ct ν,P (u) = tν,P (t−1 ν (u1), ..., t−1 ν (ud)), (3) where the d-dimensional random vector u = (u1, u2, ..., ud) denotes the component- wise probability transformed random vector (tν(X1), ..., tν(Xd)) and tν,P is the joint distribution function of vector X. Also, t-copula can be shown in a more specific form by Demarta and McNeil (2005) Ct ν,P (u) = t−1 ν (u1) −∞ ... t−1 ν (ud) −∞ Γ( ν + d 2 ) Γ( ν 2 ) (πν)d |P| 1 + x P−1 x ν − ν + d 2 dx, (4) where t−1 ν denotes the quantile function of a standard univariate tν distribution. Again, we try to illustrate the simulation of t-copula using an example. 14
  • 16. Example. Simulation of t-copula We want to simulate a t-copula with pa- rameter ν = 4 and ρ = 0.71. According to (McNeil et al., 2010, Chapter 5, Page 193), we should firstly try to generate a vector T which follows multivariate t distribution with degrees of freedom nu, mean vector 0 and correlation matrix P. The algorithm of this generation is complex so that we utilize a R package mvtnorm to generate T directly using command rmvt(n, sigma, df), where n is the number of repetitions and sigma is the correlation matrix Genz et al. (2014). The simulation of t-copula is thus becomes easily to implement as the following R shown. 1 library(mvtnorm) 2 T<-tcopula<-function(n){ 3 X<-rmvt(n,Sigma,df=nu) 4 T<-pt(X,df=nu) 5 T 6 } 7 Sigma<-matrix(c(1,0.71,0.71,1),2,2,byrow="T") 8 nu<-4 9 T<-T(2000) 10 plot(T[,1],T[,2],pch="*",col="blue", 11 main="t-copula simulated points",xlab="T1",ylab="T2") Density of 2000 simulated t-copula points are also plotted in Figure 2. McNeil et al. (2010) also provide algorithms to simulate t-copula and we only show the algorithm without specific generation of the multivariate distributed vector in Algorithm 4.2. Algorithm 4.2 Simulation of t-copula Input: the correlation matrix P ∈ Rd×d ; degree of freedom ν Output: the random vector T which has distribution function tt ν,P 1: generate X ∼ td(ν, 0, P) 2: return T = (tν(X1), ..., tν(Xd)) , where tν is the distribution function of a standard univariate t distribution. 15
Figure 3: Dependency comparison

In financial risk management, several copulas are applied in simulation models of risk. Compared with other copulas such as the Gauss copula, the t-copula is regarded as superior because it better captures the dependence between extreme values of financial data, for example daily exchange-rate returns (Demarta and McNeil, 2005). This difference is illustrated in Figure 3. McNeil et al. (2010) suggest comparing meta distributions with standard normal margins for the Gauss copula and the t-copula, obtained via the quantile function of the standard normal distribution. Their linear correlations are both roughly 70%. The Gauss copula has no tail dependence, as shown in the left panel; in contrast, the t-copula has both upper and lower tail dependence, which represent the dependence between extreme values.

4.2 The stock portfolio model with t-copula

We assume a linear asset stock portfolio whose log-return vector X = (X_1, ..., X_d)' follows a t-copula with ν degrees of freedom, and a weight vector w = (w_1, ..., w_d)' gives the weight of each stock. We can then generate the log-return vector componentwise as

    X_i = c_i G_i^{-1}(F_ν(T_i)),  i = 1, 2, ..., d,    (5)

where G_i denotes the CDF of the marginal distribution of stock i and F_ν(·) denotes the CDF of the univariate t distribution with ν degrees of freedom. Here c_i is a scaling factor depending on the yearly volatility σ_i and the variance var_i of
the margin G_i, such that

    c_i = \frac{\sigma_i}{\sqrt{252}} \sqrt{\frac{1}{var_i}}.    (6)

The correlation matrix Σ describes the dependence structure of the model, and we compute its lower triangular Cholesky factor L such that LL' = Σ. A collection of i.i.d. standard normal variables Z = (Z_1, ..., Z_d)' is used to generate the multivariate t-copula random vector T: we multiply L by Z to obtain the multivariate normal vector \tilde{Z} = LZ, and then form the vector T = (T_1, ..., T_d)' satisfying

    T = \tilde{Z} / \sqrt{Y/ν},    (7)

where Y is a chi-square distributed random variable with ν degrees of freedom. Finally, we obtain the return function

    R(Z, Y) = \sum_{i=1}^{d} w_i \exp(c_i G_i^{-1}(F_ν(T_i))).    (8)

Marginals and inversion. Although the normal distribution is the simplest model, it fails to capture fat tails and asymmetric dependence (Patton, 2002). Prause (1999) and Aas and Haff (2006) both suggest that the generalized hyperbolic distribution provides high flexibility and a good fit to the marginal behaviour of financial data. We therefore use the generalized hyperbolic distribution to fit the marginal behaviour of the real financial data, namely the G_i in the model. For the same reason, Sak et al. (2010) first assumed that the logarithmic returns of stock portfolios follow a t-copula with generalized hyperbolic marginal distributions and provided a corresponding efficient risk simulation method.

In the simulation, one important task is to evaluate the inverses of the marginal distributions, namely G_i^{-1}. In the R environment, the existing methods for inverting the t distribution and the generalized hyperbolic distribution, the quantile function qt() and the algorithms in the package ghyp (Breymann and Lüthi, 2013) respectively, are not ideal because they are slow. We therefore introduce a faster inversion method.

Runuran and PINV. The R package Runuran provides an interface to the C library UNU.RAN (Universal Non-Uniform RANdom variate generators).
It offers alternative, faster functions in R that automatically generate random variates for a large number of distributions and efficiently compute inverse CDFs. The Runuran functions use universal algorithms to generate non-uniform random variates for distributions that lack dedicated generation methods, and they can also be used for standard distributions with higher efficiency than the existing functions in R. As a result, Runuran is often applied when we need robust, ready-to-use sampling, or sampling and simulation from truncated or unusual distributions (Leydold et al., 2014).

When using a universal algorithm, we have to supply information about the target distribution, typically the PDF and the CDF or probability vectors. Some algorithms, like HINV, require the CDF, which is often difficult to obtain; that is why we prefer the algorithm PINV. The PINV algorithm requires only the probability density function, which is much more convenient. Moreover, being based on Newton interpolation and Gauss–Lobatto quadrature, PINV saves a considerable amount of time in the inversion (Derflinger et al., 2010). As Sak et al. (2010) and Derflinger et al. (2010) mention, when the target distribution is the generalized hyperbolic distribution, using PINV accelerates the simulation by a factor of more than 1000 compared with the quantile function of the package ghyp, and by a factor of about 100 compared with directly evaluating the CDF.

VaR and tail loss probability. In risk management, value-at-risk (VaR) is a basic and widely accepted risk measure. For a given time period T and a known loss distribution, value-at-risk is the smallest loss level l such that the probability of it being exceeded, P(L > l), is at most 1 − α (Duffie and Pan, 1997). Artzner et al. (1999) give the mathematical definition

    VaR_α(L) = inf{ l ∈ R : P(L > l) ≤ 1 − α },    (9)

i.e. VaR_α is the upper α-quantile of the loss distribution. The VaR measure finds the upper α-quantile when α is given; the converse of VaR determines α when l is given (Glass, 1999). Similarly, in our simulation we already have a threshold x and aim to find the tail loss probability that the return R of the stock portfolio is smaller than x, i.e. P(R < x).
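Before turning to the simulation algorithms, the sketch below shows how a PINV-based inverse CDF could be set up with Runuran. It is only a minimal sketch under simplifying assumptions: for brevity the t density with 5 degrees of freedom stands in for a fitted generalized hyperbolic margin, and the parameter values are arbitrary, not the fitted values used later in the experiments.

library(Runuran)

# Build a PINV generator from the density alone
nu  <- 5
gen <- pinv.new(pdf = function(x) dt(x, df = nu), lb = -Inf, ub = Inf, center = 0)

# uq() evaluates the approximate quantile function G^{-1} on uniforms,
# which is exactly the inversion needed in equation (5)
u <- c(0.01, 0.5, 0.99)
uq(gen, u)       # fast numerical inverse
qt(u, df = nu)   # reference values from the exact quantile function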
4.3 Naive simulation

Finding P(R < x) is equivalent to estimating E[1_{R(Z,Y) < x}] (Başoğlu et al., 2013). This leads directly to a naive simulation model; the algorithm is shown in Algorithm 4.3.

Algorithm 4.3 Computation of P(R < x) using naive simulation
Input: Cholesky factor L of Σ, i.e., LL' = Σ; the scaling factors c_i, i = 1, ..., d, computed using (6)
Output: tail loss probability P(R < x)
1: for k = 1, ..., n do
2:   generate independent standard normal variates Z_i
3:   generate Y from the χ²_ν distribution
4:   calculate the t-distributed vector T = \tilde{Z}/\sqrt{Y/ν}
5:   calculate the total return R^{(k)} = \sum_{i=1}^{d} w_i exp(c_i G_i^{-1}(F_ν(T_i)))
6: end for
7: return P(R < x) ≈ (1/n) \sum_{k=1}^{n} 1{R^{(k)} < x}, where 1{·} denotes the indicator function

Although the naive simulation model gives a very straightforward and simple way to estimate the tail loss probability, its variance is large. We therefore need to reduce the variance using variance reduction methods. Importance sampling and stratified sampling are two important variance reduction methods and have been applied to the simulation of tail loss probabilities.

5 Methodology

5.1 Variance reduction methods

5.1.1 Importance Sampling

Importance sampling is an important sampling tool and variance reduction method in Monte Carlo computing. If we need to estimate the expectation of a distribution,
µ = E_f[h(X)], the crude Monte Carlo estimator often has a large variance. In particular, if h(x) is concentrated in a specific region D and nearly zero outside it, ordinary Monte Carlo may place almost no sample points inside D (Owen, 2013, chap. 9).

Importance sampling introduces a new sampling distribution and approximates the expectation µ = E_f[h(X)] as a weighted average over the new sample (Tokdar and Kass, 2010). That is, the problem is to find

    µ = \int h(x) f(x) dx,    (10)

where f(x) is the probability density function. Following (Owen, 2013, chap. 9), if we introduce another probability density function g(x) satisfying g(x) > 0 whenever h(x)f(x) ≠ 0, we have

    µ = \int h(x) \frac{f(x)}{g(x)} g(x) dx = E_g\!\left[\frac{h(X) f(X)}{g(X)}\right] = E_g[w(X) h(X)],    (11)

where w(x) = f(x)/g(x) is the adjustment factor called the likelihood ratio and E_g[·] denotes expectation with respect to g(x). The importance sampling estimate ˆµ_g of µ is

    ˆµ_g = \frac{1}{n} \sum_{i=1}^{n} w(x_i) h(x_i),    (12)

where the sample x_1, x_2, ..., x_n is drawn independently from g(x).

To illustrate importance sampling more concretely, Haugh (2004) gives a good example, which we present here in a slightly modified form.

Example 1 (Estimating P(X ≥ 2)). Suppose we wish to estimate

    θ = P(X ≥ 2) = E[I_{\{X ≥ 2\}}],
where X ∼ N(0, 1). We may then write

    θ = E[I_{\{X ≥ 2\}}]
      = \int_{-\infty}^{\infty} I_{\{x ≥ 2\}} \frac{1}{\sqrt{2\pi}} e^{-x^2/2} dx
      = \int_{-\infty}^{\infty} I_{\{x ≥ 2\}} \frac{\frac{1}{\sqrt{2\pi}} e^{-x^2/2}}{\frac{1}{\sqrt{2\pi}} e^{-(x-\mu)^2/2}} \, \frac{1}{\sqrt{2\pi}} e^{-(x-\mu)^2/2} dx
      = \int_{-\infty}^{\infty} I_{\{x ≥ 2\}} e^{-\mu x + \mu^2/2} \frac{1}{\sqrt{2\pi}} e^{-(x-\mu)^2/2} dx
      = E_\mu\!\left[I_{\{X ≥ 2\}} e^{-\mu X + \mu^2/2}\right],

where E_µ[·] denotes expectation under X ∼ N(µ, 1). Let us now estimate θ by simulating X from the N(µ, 1) distribution, so that

    g(x) = \frac{1}{\sqrt{2\pi}} e^{-(x-\mu)^2/2}.

If we set µ = 2, for instance, we can use the following R code.

naive <- function(n){
  x  <- rnorm(n)
  h  <- (x >= 2)
  theta_naive <- mean(h)
  se <- sd(h)
  CI95 <- c(theta_naive - 1.96 * se / sqrt(n),
            theta_naive + 1.96 * se / sqrt(n))
  c(theta_naive, se, CI95)
}
naive(10^5)
## [1] 0.0223800 0.1479167 0.0214632 0.0232968

IS <- function(n){
  mu <- 2
  x  <- rnorm(n) + mu                          # sample from the IS density N(mu, 1)
  hprime <- (x >= 2) * exp(-mu * x + mu^2 / 2) # weighted response
  theta_est <- mean(hprime)
  se <- sd(hprime)
  CI95 <- c(theta_est - 1.96 * se / sqrt(n),
            theta_est + 1.96 * se / sqrt(n))
  c(theta_est, se, CI95)
}
IS(10^5)
## [1] 0.02298110 0.03501699 0.02276407 0.02319814
1 - pnorm(2)
## [1] 0.02275013

In this example we compare the results of naive simulation, importance sampling, and the true value provided by R. The two simulations give close estimates, and their 95% confidence intervals both contain the true value. However, the importance sampling simulation has a smaller variance and therefore provides a narrower confidence interval. In this sense, importance sampling is a variance reduction method. In general, we can write an importance sampling algorithm as Algorithm 5.1 for estimating µ = E_f[h(X)], where we simulate with respect to the sampling density g(·).

Algorithm 5.1 Importance Sampling Algorithm for Estimating µ = E_f[h(X)]
1: for j = 1, ..., n do
2:   generate X_j from density g(·)
3:   set h*_j = h(X_j) f(X_j) / g(X_j)
4: end for
5: set ˆµ_is = \sum_{j=1}^{n} h*_j / n
6: set ˆσ²_is = \sum_{j=1}^{n} (h*_j − ˆµ_is)² / (n − 1)
7: set the approximate 100(1 − α)% CI = ˆµ_is ± z_{1−α/2} ˆσ_is / \sqrt{n}
8: return ˆµ_is, ˆσ²_is and the 100(1 − α)% CI

The fundamental task when using importance sampling is to find a suitable importance distribution g(x). A good importance distribution g(x) gives a small variance of the estimate ˆµ_g, which is

    Var(ˆµ_g) = \frac{1}{n} \int \left( h(x) \frac{f(x)}{g(x)} − µ \right)^2 g(x) dx.    (13)

If h(x) ≥ 0, we could theoretically obtain a zero-variance IS density satisfying h(x)f(x)/g(x) = µ (Glasserman et al., 1999). However, we cannot construct this zero-variance importance sampler in practice, because it depends on the unknown µ that we want to estimate. Kahn and Marshall (1953) state that the optimal g*(x), which minimizes Var(ˆµ_g), is proportional to |h(x)| f(x). Haugh (2004) suggests that this relationship amounts to a similarity of shape. Therefore,
by using the maximum principle, namely choosing g(x) such that g(x) and h(x)f(x) attain their maximum at the same point x*, we can obtain a good IS density. In other words, the mode of the optimal IS density equals the mode of h(x)f(x).

To apply importance sampling to the simulation of the tail loss probability, Sak et al. (2010) add a mean shift vector µ with negative entries to the normal vector Z. The chi-square random variable Y is viewed as a gamma random variable with shape parameter ν/2 and scale parameter 2, and under IS its scale parameter is changed. The optimal IS is selected when the shift vector and the scale value are set so that the mode of the IS density equals the mode of the zero-variance IS function. The main algorithm solves a multidimensional optimization problem: the IS parameters µ and y_0 are calculated by Algorithm 3 of Sak et al. (2010), which uses an efficient quasi-Newton method, a constrained version of the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm.

5.1.2 Stratified Sampling

Stratified sampling is another efficient sampling tool to reduce the variance of Monte Carlo methods. It constrains the fraction of the samples drawn from specific strata of the sample space (Glasserman, 2004). Suppose that ξ_i, i = 1, ..., I, is a partition of R^d into I strata and that we know p_i = P(Y ∈ ξ_i) for i = 1, ..., I. We wish to estimate E[Y] with Y real valued; then

    E[Y] = \sum_{i=1}^{I} P(Y ∈ ξ_i) E[Y | Y ∈ ξ_i] = \sum_{i=1}^{I} p_i E[Y | Y ∈ ξ_i].    (14)

Let N_i be the number of drawings allocated to stratum i, with N = \sum_{i=1}^{I} N_i. If we generated independent Y_1, ..., Y_N with the same distribution as Y by plain random sampling, the fraction of the samples falling in ξ_i would be random; in stratified sampling we fix this fraction π_i = N_i/N in advance, and each observation drawn from ξ_i is constrained to have the distribution of Y conditional on Y ∈ ξ_i.

Consider first the simplest case, proportional sampling, namely π_i = p_i. Then the fraction of samples selected from stratum ξ_i matches
the theoretical probability p_i = P(Y ∈ ξ_i). Since the total sample size is N, we have N_i = [p_i N] samples from ξ_i. For each i = 1, ..., I, let Y_{ij}, j = 1, ..., N_i, be independent drawings from the conditional distribution of Y given Y ∈ ξ_i. We consider the sample mean of the observations from the ith stratum,

    \bar{Y}_i = \frac{1}{N_i} \sum_{j=1}^{N_i} Y_{ij},    (15)

which is an unbiased estimator of E[Y | Y ∈ ξ_i]. It is then easy to formulate the stratified Monte Carlo estimator

    \hat{Y} = \sum_{i=1}^{I} \frac{p_i}{N_i} \sum_{j=1}^{N_i} Y_{ij} = \frac{1}{N} \sum_{i=1}^{I} \sum_{j=1}^{N_i} Y_{ij}.    (16)

Compared with the estimator \bar{Y} = (Y_1 + ··· + Y_n)/n of a random sample of size n in the naive simulation, the stratified estimator \hat{Y} eliminates the sampling variability across strata without affecting the sampling variability within strata.

There are two simple but important ways to generalize this formulation. First, we introduce a second variable X and define the strata in terms of X. We call X the stratification variable and allow it to take values in an arbitrary set. Glasserman (2004) suggests that it is convenient to assume X is R^d-valued and to take the strata ξ_i to be disjoint subsets of R^d with P(X ∈ ∪_i ξ_i) = 1. The representation (14) then generalizes to

    E[Y] = \sum_{i=1}^{I} P(X ∈ ξ_i) E[Y | X ∈ ξ_i] = \sum_{i=1}^{I} p_i E[Y | X ∈ ξ_i],    (17)

where p_i = P(X ∈ ξ_i). In general, X and Y are dependent without either completely determining the other, while in some applications Y is a function of X. In this representation, we should generate pairs (X_{ij}, Y_{ij}), j = 1, ..., N_i, having the conditional distribution of (X, Y) given X ∈ ξ_i.

Second, we allow the stratum allocations N_1, ..., N_I to be arbitrary subject to N_1 + ··· + N_I = N, rather than proportional to p_1, ..., p_I; that is, the allocation fraction π_i need not equal p_i. In this case, the previous representation of the stratified
Monte Carlo estimator is no longer valid, and we rewrite it as

    \hat{Y} = \sum_{i=1}^{I} p_i \cdot \frac{1}{N_i} \sum_{j=1}^{N_i} Y_{ij} = \frac{1}{N} \sum_{i=1}^{I} \frac{p_i}{\pi_i} \sum_{j=1}^{N_i} Y_{ij}.    (18)

So far, we can conclude that to implement stratified sampling we have two main considerations:

• choosing the stratification variable X, the strata ξ_i, and the allocations N_i;
• generating samples from the distribution of (X, Y) conditional on X ∈ ξ_i.

These choices bring variance reductions compared with the naive simulation. To understand this more concretely, we consider a modified example following Altun (2012) and Glasserman (2004).

Example 2 (Stratifying uniforms). Stratification is easiest to understand when we stratify uniformly distributed random variables. We can partition the unit interval (0, 1) into I strata

    ξ_1 = (0, 1/I], ξ_2 = (1/I, 2/I], ..., ξ_I = ((I − 1)/I, 1).

We set up the example in the simplest way, taking the sample size N equal to the number of strata I; that is, p_i = 1/I and N_i = N p_i (rounded up to the nearest integer). First, we generate independent uniform random variables U_i ∼ U(0, 1), i = 1, ..., I, and introduce

    V_i = \frac{i − 1}{I} + \frac{U_i}{I},  i = 1, ..., I.

By this construction, V_i is uniformly distributed between (i − 1)/I and i/I and thus has the conditional distribution of U given U ∈ ξ_i, for U ∼ U(0, 1). The uniform distribution therefore has a stratified sample V_1, ..., V_I. If we wish to estimate E[Y] with Y = f(U), the corresponding estimator is

    \hat{Y} = \frac{1}{I} \sum_{i=1}^{I} f(V_i),
which is the stratified estimator of Y and is unbiased according to Glasserman (2004). More specifically, we write Algorithm 5.2 as the stratified algorithm for the case Y = U².

Algorithm 5.2 Stratification for Estimating E[Y] when Y = U² and U is a uniform random variable
Input: number of strata I and number of simulations N
Output: the mean \bar{Y} and variance σ² of the stratified estimator
1: for i = 1, ..., I do
2:   for j = 1, ..., N_i, with N_i = [N p_i] for each stratum, do
3:     generate a uniform random variable U_j ∼ Unif[0, 1]
4:     compute V_j = (i − 1)/I + U_j/I
5:     compute Y_j = V_j²
6:   end for
7:   compute the sample mean of the stratum, \bar{Y}_i = (1/N_i) \sum_{j=1}^{N_i} Y_j
8:   compute the sample variance of the stratum, σ²_i = (1/(N_i − 1)) \sum_{j=1}^{N_i} (Y_j − \bar{Y}_i)²
9: end for
10: return mean \bar{Y} = \sum_{i=1}^{I} p_i \bar{Y}_i and variance σ² = \sum_{i=1}^{I} p²_i σ²_i / N_i

If we set the number of strata I to 5 and the number of replications to 10^5, the following R code shows the stratification process. We also include the naive simulation code to compare the results of the two methods.

naive <- function(n){
  u  <- runif(n)
  qu <- u^2
  c(mean(qu), sd(qu))
}
naive(10^5)
## [1] 0.3309719 0.2970378

STRS <- function(n){
  strata <- 0:I/I
  p <- diff(strata)
  N <- ceiling(n * p)
  for(i in 1:I){
    u  <- runif(N[i], strata[i], strata[i+1])  # conditional draw from stratum i
    qu <- u^2
    xbar[i]    <<- mean(qu)
    condStd[i] <<- sd(qu)
  }
  mean <- sum(p * xbar)
  se   <- sqrt((1/I)^2 * sum(condStd^2 / N))
  c(mean, se)
}
I <- 5
xbar    <- rep(0, I)
condStd <- rep(1, I)
STRS(10^5)
## [1] 0.333434053 0.000210357

The comparison shows that stratified sampling increases the accuracy of the simulation, measured by the standard error, by a factor of over 1400. Stratified sampling is therefore an effective variance reduction method.

AOA algorithm. According to Glasserman (2004), the optimal allocation fractions π*_i, i = 1, ..., I, which minimize the variance of the estimator, are

    π*_i = \frac{p_i σ_i}{\sum_{l=1}^{I} p_l σ_l},  i = 1, ..., I.    (19)

Although we cannot use the optimal allocation fractions directly, because we do not know the conditional standard deviations σ_i, Etoré and Jourdain (2010) propose the adaptive optimal allocation (AOA) algorithm, which estimates them iteratively. In this project, we use the AOA algorithm as modified by Başoğlu et al. (2013), which applies to stratified importance sampling for estimating the tail loss probability. The total number of iterations is K and each iteration is denoted by k = 1, ..., K. The total sample size is denoted by N and N^k is the sample size used in the kth iteration; N^k_i is the size of the sample drawn in iteration k conditional on stratum i, and we have

    N = \sum_{k=1}^{K} N^k = \sum_{k=1}^{K} \sum_{i=1}^{I} N^k_i.    (20)
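As a small illustration of (19), the sketch below computes the optimal allocation fractions and the resulting integer allocations for given stratum probabilities and (assumed known) conditional standard deviations; all numbers are made up for illustration only.

# Optimal allocation fractions of equation (19); illustrative inputs only
p     <- rep(1/5, 5)                      # stratum probabilities
sigma <- c(0.02, 0.05, 0.10, 0.20, 0.40)  # assumed conditional standard deviations
N     <- 10000                            # total sample size

pi_star <- p * sigma / sum(p * sigma)     # allocation fractions
N_i     <- round(pi_star * N)             # resulting sample sizes per stratum
rbind(pi_star, N_i)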
Başoğlu et al. (2013) set the number of iterations K to 3 and allocate 10, 40 and 50 percent of the total sample size to the successive iterations. They also note that the sample allocations in each iteration do not need to match the aimed sample sizes exactly. At the end of the (k − 1)th iteration, the standard deviation conditional on stratum i is estimated as ˆσ^{k−1}_i, and the allocation fractions of the next iteration, π^k_i, are computed using the general formula

    π^k_i = \frac{p_i ˆσ^{k−1}_i}{\sum_{l=1}^{I} p_l ˆσ^{k−1}_l},  i = 1, ..., I, k = 2, ..., K.    (21)

The allocation sizes of the next iteration are determined by

    N^k_i = max{⌈π^k_i N^k⌉, 10},  i = 1, ..., I, k = 1, ..., K.    (22)

The minimum size of each allocation is set to 10, which gives better convergence in our tail loss simulation. The stratified estimator of the final iteration is

    ˆx_strs = \sum_{i=1}^{I} p_i ˆx^K_i,    (23)

where ˆx^K_i is the sample mean of the samples in S_i (the set of all drawings made in stratum i) at the end of iteration K. Let M^k_i be the size of the sample collected in S_i at the end of iteration k, namely M^k_i := \sum_{l=1}^{k} N^l_i = M^{k−1}_i + N^k_i, with M^0_i := 0. Then ˆx_strs is an unbiased estimator of x with variance

    V[ˆx_strs] = \sum_{i=1}^{I} \frac{p²_i (ˆσ^K_i)²}{M^K_i}.    (24)

The next step is to combine the AOA algorithm with IS. Recall the importance sampling identity of Section 5.1.1: for any x ∈ R^D, let h_IS(x) := w(x)h(x); then

    µ = E_f[h(X)] = E_g[w(X)h(X)] = E_g[h_IS(X)].    (25)

We can therefore run the AOA algorithm on R^D with the simulation function h_IS(X) instead of h(X). The optimal choice of the IS density is

    g*(x) = \frac{|h(x)f(x)|}{\int |h(x)f(x)| dx},    (26)

which yields zero variance when h does not change sign. Başoğlu et al. (2013) indicate that we should choose an IS density that imitates the function |h(x)f(x)|. A pseudo code of the modified AOA algorithm is given in Algorithm A.1.
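The adaptive step (21)–(22) can be sketched in a few lines of R: after each iteration the conditional standard deviations are re-estimated from the samples collected so far, and the next iteration's budget is re-allocated. The helper below is only an illustrative sketch with hypothetical names (it is not the project code, which is listed in Appendix B.2), and it assumes a vector sigma_hat of running standard-deviation estimates and an iteration budget Nk.

# One adaptive re-allocation step of the modified AOA algorithm (sketch)
adaptive_allocation <- function(p, sigma_hat, Nk, min_size = 10){
  pik <- p * sigma_hat / sum(p * sigma_hat)   # allocation fractions, equation (21)
  pmax(ceiling(pik * Nk), min_size)           # allocation sizes, equation (22)
}

# Illustrative call: equiprobable strata, pilot estimates of the conditional sd
p         <- rep(1/4, 4)
sigma_hat <- c(0.01, 0.03, 0.08, 0.15)
adaptive_allocation(p, sigma_hat, Nk = 4000)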
Stratified Importance Sampling: Single tail loss probability estimate. So far we have discussed two variance reduction methods, importance sampling and stratified sampling. Başoğlu et al. (2013) combine them into an optimally stratified importance sampling scheme for simulating the tail loss probability of a stock portfolio (the model illustrated in Section 4). This combined method tries to minimize the maximum relative error of the estimates by applying a two-dimensional stratification: the multi-normal input Z is stratified along the direction of the IS shift, v := µ/||µ||, and the gamma random variable Y is stratified as well.

Glasserman (2004) shows that a D-dimensional standard multi-normal vector can be stratified along a given direction v by stratifying its linear projection onto v. Başoğlu et al. (2013) use a linear transformation of the multi-normal input illustrated by Imai and Tan (2006) and provide a pseudo code for the construction of the linear transformation matrix V, which can be found in Algorithm A.2. The initial model is thus modified: we set A := LV and write the multi-t vector T as

    T = (LV)Z / \sqrt{Y/ν} = AZ / \sqrt{Y/ν}.    (27)

The initialization of the SIS algorithm given by Başoğlu et al. (2013) is shown in Algorithm A.3. By this construction, the first element Z_1 of the input vector Z corresponds to the direction v, and it is therefore the only element of Z that needs to be stratified. The SIS algorithm thus makes some modifications compared to Algorithm 3 of Sak et al. (2010). The shift θ_1 = ||µ|| is added to Z_1, while the other elements Z_2, ..., Z_D are left unchanged. Also, the gamma random variable is stratified under the IS scale parameter θ_2 = y_0/(ν/2 − 1). In the two-dimensional stratification, the total number of strata is I = I_1 × I_2, with index sets i_1 = 1, ..., I_1 for the multi-normal input and i_2 = 1, ..., I_2 for the gamma random variable.

The two-dimensional stratification generates two independent and identically distributed uniform random variables U_1 and U_2. Z_1 is generated conditional on the equiprobable interval i_1 using the inverse CDF of the standard normal distribution, Φ^{-1}:

    Z_1 = θ_1 + Φ^{-1}((i_1 − 1 + U_1)/I_1).    (28)

Y can be generated conditional on the equiprobable stratum i_2 using the inverse
CDF of the gamma distribution, F^{-1}_Γ:

    Y = F^{-1}_Γ((i_2 − 1 + U_2)/I_2; ν/2, θ_2),    (29)

with shape parameter ν/2 and scale parameter θ_2. The return R(Z, Y) is then computed as in the model of Section 4.3. The likelihood ratio is

    ρ(Z, Y) := exp(−Z_1 θ_1 + θ_1²/2 − Y/2 + Y/θ_2 + (ν/2) log(θ_2/2)).    (30)

The weighted responses are generated by weighting the response 1_{R(Z,Y) < x} with the IS ratio in stratum i, and they are stored in the set S_i. Recalling the AOA algorithm, the allocation of the drawings in iteration k is determined by

    N^k_i = max{⌈I^{-1} N^k⌉, 10}                                 for k = 1,
    N^k_i = max{⌈ˆσ^{k−1}_i N^k / \sum_{l=1}^{I} ˆσ^{k−1}_l⌉, 10}  for k = 2, ..., K.    (31)

Finally, the tail loss probability estimate and its variance are calculated by (23) and (24) as in the AOA algorithm. The pseudo code of SIS for a single probability estimate, as given by Başoğlu et al. (2013), can be found in Algorithm A.4.

5.2 Quasi-Monte Carlo

The variance reduction methods above are applied within ordinary Monte Carlo, which relies on pseudo-random number generators. Such generators emulate the behaviour of independent random variables and typically produce clustered samples: several points fall close together in some regions of the space while other regions receive no samples at all. Quasi-Monte Carlo methods instead generate evenly distributed points that are deterministic rather than random. The accuracy can thus be increased by avoiding clustering, and the resulting point sets are called low-discrepancy sequences (Glasserman, 2004). Our goal is to make use of quasi-Monte Carlo methods by modifying the stratified importance sampling method for estimating the tail loss probability of the stock portfolio model, in the hope of achieving a further variance reduction.

5.2.1 General Principles and Discrepancy

Glasserman (2004) gives an introduction to quasi-Monte Carlo (QMC) methods and discrepancy, which we briefly summarize in this section. As a starting
point, we wish to estimate the expectation

    E[f(X)] = \int f(x) dx,    (32)

where f(X) is a function of a random variable X. To explore quasi-Monte Carlo methods, we usually consider this integration problem over the d-dimensional unit hypercube [0, 1)^d, and it helps to first recall the Monte Carlo situation. Glasserman (2004) observes that in a Monte Carlo simulation each replication can be seen as the output of a transformation applied to an input sequence of independent uniformly distributed random variables U_1, ..., U_d on the unit hypercube [0, 1)^d. Thus d is an upper bound on the number of uniforms needed in the simulation, and the output can be written as f(U_1, ..., U_d). The estimation problem then becomes

    E[f(U_1, ..., U_d)] = \int_{[0,1)^d} f(x) dx.    (33)

Quasi-Monte Carlo approximates this integral by

    \int_{[0,1)^d} f(x) dx ≈ \frac{1}{n} \sum_{i=1}^{n} f(x_i),    (34)

for carefully (and deterministically) chosen points x_1, ..., x_n in the unit hypercube [0, 1)^d. For this expression, we should notice that

• f does not need to be known in closed form; we only need to be able to evaluate it at the chosen points, which is exactly what the simulation does.
• Many quasi-Monte Carlo sequences place points on the boundary of the unit hypercube, which requires care, whereas ordinary Monte Carlo does not need the boundary.
• The QMC method depends on the dimension of the problem.

We explain the last comment in more detail. QMC's dependence on the dimension is its biggest difference from Monte Carlo. In Monte Carlo simulation, one looks for an algorithm with less variation, and the achievable accuracy does not depend on the dimension of the unit hypercube. QMC, however, requires the dimension to be identified before the points are generated, and lower dimensions bring greater accuracy.
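As a small illustration of the approximation (34), the sketch below estimates a two-dimensional integral with pseudo-random points and with a Sobol sequence from the randtoolbox package used later in this report; the test function is chosen arbitrarily for illustration.

library(randtoolbox)

# Estimate the integral of f over [0,1)^2 via equation (34)
f <- function(x) exp(-(x[, 1]^2 + x[, 2]^2))  # arbitrary smooth test function
n <- 2^12

u_mc  <- matrix(runif(2 * n), n, 2)  # pseudo-random points
u_qmc <- sobol(n, dim = 2)           # low-discrepancy (Sobol) points

mean(f(u_mc))    # plain Monte Carlo estimate
mean(f(u_qmc))   # quasi-Monte Carlo estimate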
Discrepancy. As mentioned, quasi-Monte Carlo provides ways to sample uniformly over the whole space, avoiding points clumping together. Discrepancy is introduced to measure how unevenly a point set fills the hypercube, i.e. its deviation from uniformity. Quasi-Monte Carlo methods are low-discrepancy methods because of their evenly spread points. Theoretically, QMC has the potential to accelerate convergence from the O(1/\sqrt{n}) rate associated with Monte Carlo to nearly O(1/n): under appropriate conditions, the error in a quasi-Monte Carlo approximation is O(1/n^{1−ε}) for all ε > 0. Note that variance reduction methods, including importance sampling and stratified sampling, only affect the implicit constant in O(1/\sqrt{n}). Low-discrepancy sequences can therefore bring further accuracy compared with ordinary variance reduction techniques, but at the price of a dependence on the dimension. We omit the detailed discussion of this dependence in this report, but we should highlight the traditional view that QMC is only appropriate for low-dimensional problems, and that QMC methods are only applicable with an upper bound d on the dimension.

5.2.2 Halton Sequence and Sobol Sequence

Low-discrepancy sequences in arbitrary dimension d, such as the Halton, Faure and Sobol sequences, have been constructed by many authors. All of these sequences are based on Van der Corput sequences, a specific class of one-dimensional low-discrepancy sequences. We omit the specific construction algorithms of these sequences and instead use them directly in the R environment with the package randtoolbox (Christophe and Petr, 2014). The package is based on rngWELL and provides algorithms for pseudo-random and quasi-random generators, including the Halton and Sobol sequences applied in this project. For example, halton(n, dim=d) generates n points of a d-dimensional quasi-random sequence.

As mentioned, low-discrepancy sequences generate deterministic points that are evenly distributed, without clusters. To illustrate this, we first plot 2000 two-dimensional uniform random pairs in Figure 4. The Halton and Sobol sequences are then used to generate 2000 points of a 5-dimensional quasi-random sequence, and Figures 5 and 6 plot dimensions 2 and 3 of the two sequences,
respectively. It is obvious that there are clusters in Figure 4, whereas the points in Figures 5 and 6 are spread much more evenly. This shows the advantage of low-discrepancy sequences directly.

Figure 4: Uniform Random Pairs

We now use an example to explain how the Halton and Sobol sequences can be applied.

Example 3 (Halton Sequence and Sobol Sequence). We want to find the probability that the sum of the squared values of two standard normal variables is smaller than a threshold value,

    P(Z_1² + Z_2² < x),

where x is the threshold value. We first compute this probability using naive simulation and then use the two low-discrepancy sequences to show their application
to a multi-dimensional problem.

Figure 5: Halton Sequence    Figure 6: Sobol Sequence

The naive simulation creates an (n × 2) matrix filled with standard normal variables, so that each row contains two of them. The squares of the two entries in each row are summed and stored in an array a_i. If the value of a_i is smaller than the threshold value, the indicator c_i is set to 1. Finally, the mean and variance of c are calculated after all the c_i have been found. With low-discrepancy sequences, the only difference from the naive simulation is that the matrix is filled not with pseudo-random standard normal variables but with normal variates obtained from the low-discrepancy points. The differences can be seen directly in the R code shown in Appendix B.1. In this example, the variance reductions of the low-discrepancy sequences are not obvious, but they at least estimate the probability with an accuracy similar to that of the naive simulation.

Curse of dimensionality. We have mentioned that low-discrepancy sequences have problems in high dimensions: the uniformity deteriorates as the dimension increases. Figures 7 and 8 show this problem directly: when the total dimension of each sequence is 30, the points of dimensions 29 and 30 are plotted respectively. Therefore, if we want to obtain a variance reduction, we should limit the dimension. In this project, we choose the dimension of each sequence generating quasi-random uniform variables to be 2, the same as the dimension of the stratification. To some extent, this eliminates the impact of high dimensions.
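The following sketch shows the kind of randtoolbox calls this choice leads to: two-dimensional quasi-random uniforms for the stratified pair (U_1, U_2) and, where normal variates are needed, the sequences' built-in inverse-normal transform. The object names are illustrative only.

library(randtoolbox)

n <- 1000

# Two-dimensional quasi-random uniforms for the stratified inputs (U1, U2)
QU_halton <- halton(n, dim = 2)
QU_sobol  <- sobol(n, dim = 2)

# Quasi-random standard normal variates via the inverse-normal transform
QZ <- sobol(n, dim = 2, normal = TRUE)

head(QU_halton); head(QZ)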
Figure 7: High Dimensional Halton Sequence    Figure 8: High Dimensional Sobol Sequence

5.3 Combination of Quasi-Monte Carlo and Stratified Importance Sampling

After illustrating the variance reduction methods and quasi-Monte Carlo separately, we now come to the core part of this project: the combination of quasi-Monte Carlo and stratified importance sampling, implemented in the R environment. Section 5.2.2 showed how quasi-random numbers are generated by the Halton and Sobol sequences; the commands are simple with the help of the package randtoolbox.

The existing stratified importance sampling performs the stratification after generating random variables with a pseudo-random generator. The two-dimensional stratification first generates two independent and identically distributed uniform random variables U_1 and U_2. Low-discrepancy sequences can generate a set QU of quasi-random uniform numbers with a single command, and these should replace the pseudo-random uniforms in the stratification. Accordingly, the generation of the stratified element Z_1 of the input vector Z uses the quasi-random uniform numbers QU in (28). The remaining elements of Z are generated pseudo-randomly in the stratified importance sampling proposed by Başoğlu et al. (2013), so we also replace them with standard normal quasi-random variates QZ. The other input of the stratification, the gamma random variable, is generated using the inverse CDF of the gamma distribution, which involves no further pseudo-random step; we do not need to modify
this generation beyond replacing the uniform random variable U_2 with a quasi-random uniform in (29). For the rest of the stratified importance sampling we make no modifications, because no other pseudo-random generation is involved. The algorithm of the combined quasi-Monte Carlo stratified importance sampling is shown in Algorithm A.5.

6 Numerical Analysis

For our experiments, stock portfolios of sizes D = 2, 5, 10, 15 and 20 are used, with the degrees of freedom of the t-copula equal to 5. We assume that the marginal distributions of the log-returns follow a t distribution. We choose the off-diagonal elements of the correlation matrix Σ randomly between 0 and 0.3 and the volatilities randomly between 0.05 and 0.2. For the weights of the stocks in the portfolio, independent and identically distributed uniform random variables on (1/1000, 1) are drawn and then normalized so that they sum to 1. Sak et al. (2010) set up their experiments in the same way so that the dimension of the problem does not decrease. Also, we directly use the fitted values of Başoğlu et al. (2013) for the parameters of the marginal distributions and the copula parameters of the daily log-returns. We are therefore confident that this experiment gives results similar to those for real-world stock portfolios.

We first implement the IS algorithm proposed in Sak et al. (2010) and the SIS algorithm proposed in Başoğlu et al. (2013). For the SIS algorithm and our combined SIS with low-discrepancy sequences, we select the strata sizes for both the multi-normal and the gamma input as 22, so that the total number of strata stays below 500. All other initial settings of the SIS algorithm are the same as in Başoğlu et al. (2013). In our experiment, we compare the naive method, IS, SIS, combined SIS with the Halton sequence, and combined SIS with the Sobol sequence for a single probability estimate, in terms of timing results (TM) and variance reduction. The variance reduction factor (VR) of an estimator ˆx is

    VR(ˆx) := V[ˆx_naive] / V[ˆx].    (35)

Also, the overall performance of an estimator can be measured by the efficiency
Table 1: P(R < t) ≈ 0.05

        IS                     SIS                     Halton                  Sobol
 d   VR     TM     ER      VR      TM     ER       VR      TM     ER       VR      TM     ER
 2   8.05   0.643  6.720   211.23  0.975  116.337  206.40  1.079  102.721  211.99  1.076  105.796
 5   7.53   1.834  6.157   233.70  2.215  158.263  253.37  2.424  156.780  226.20  2.309  146.948
10   7.49   3.271  6.238   163.58  3.601  123.740  176.80  3.758  128.155  156.98  3.836  110.325
15   7.54   4.436  5.955   186.12  4.865  134.012  186.99  5.014  130.640  193.69  5.023  135.077
20   6.91   6.187  5.509   202.25  6.739  148.015  209.51  6.937  148.952  191.22  6.878  137.11

Variance reduction factors (VR), execution times (TM) in seconds and efficiency ratios (ER) of naive simulation, IS, SIS, SIS with Halton sequence and SIS with Sobol sequence. N = 10000 is used for the naive and IS simulations; for the remaining simulations, N ≈ 100,000.

Table 2: P(R < t) ≈ 0.001

        IS                       SIS                        Halton                     Sobol
 d   VR      TM     ER        VR       TM     ER         VR       TM     ER         VR       TM     ER
 2   364.85  0.569  336.641   5792.48  0.877  3467.561   6611.99  0.961  3612.171   5477.65  0.884  3253.13
 5   245.78  1.802  190.680   3385.40  2.147  2204.372   3793.358 2.335  2271.141   3709.719 2.229  2326.688
10   213.76  3.178  173.673   2113.13  3.511  1554.001   2484.55  3.819  1679.788   2248.984 3.660  1586.579
15   197.57  4.385  152.875   2218.34  4.861  1548.413   2600.79  5.097  1731.311   2383.393 4.937  1638.01
20   114.88  6.131  86.716    1461.09  6.727  1005.191   1388.9   6.832  940.8413   1492.713 6.815  1013.687

Variance reduction factors (VR), execution times (TM) in seconds and efficiency ratios (ER) of naive simulation, IS, SIS, SIS with Halton sequence and SIS with Sobol sequence. N = 10000 is used for the naive and IS simulations; for the remaining simulations, N ≈ 100,000.
(ER), which is calculated as

    ER(ˆx) := V[ˆx_naive] TM[ˆx_naive] / (V[ˆx] TM[ˆx]).    (36)

We choose the threshold values t so that the probabilities are approximately equal to 0.05 and 0.001, since the efficiency of variance reduction methods depends on the probability of the rare event. In Tables 1 and 2, the timing results (TM), variance reduction factors (VR) and efficiency ratios (ER) are tabulated for all simulations under the two threshold values t. From these results, we can make the following comments.

• In most of the cases, combining SIS with the Halton sequence slightly reduces the variance further compared with the pseudo-random SIS in terms of VR. When considering ER, the introduction of the Halton sequence performs well compared with SIS only for the smaller threshold value.

• Only for the smaller threshold value t does the Sobol sequence help SIS achieve a further variance reduction in terms of both VR and ER. In the other cases, the introduction of the Sobol sequence even reduces the variance reduction of SIS.

• Both low-discrepancy sequences perform inconsistently across the different portfolio sizes. It seems that the combination of low-discrepancy sequences and SIS only helps SIS reduce the variance further when dealing with the lower threshold value.

For the last comment, a possible reason is the dependence on the dimension of the problem, which is crucial when low-discrepancy sequences are applied to real-world problems.

7 Conclusion

In this project, the underlying problem was to obtain a further variance reduction by combining the quasi-Monte Carlo method with stratified importance sampling. First we had to understand in detail how the estimation of tail loss probabilities is carried out when the stock portfolio follows the t-copula model. Sak
et al. (2010) and Başoğlu et al. (2013) show applications of importance sampling and stratified importance sampling (SIS) to this estimation problem, respectively. The core part of this project, the combination of quasi-Monte Carlo methods and SIS, was obtained by modifying the algorithms and R codes provided by those two papers.

Even though we expected an extra variance reduction, the results of the experiment imply that only in some cases does the combination of SIS with low-discrepancy sequences further reduce the variance, and then only slightly. Therefore, future research should

• try to find the valid ranges of threshold values and stock portfolio dimensions for this combination of quasi-Monte Carlo methods and SIS;

• try to find a better structure for the combination of the quasi-Monte Carlo method and stratified importance sampling.
References

Aas, K., Haff, I. H., 2006. The generalized hyperbolic skew Student's t-distribution. Journal of Financial Econometrics 4 (2), 275–309.

Altun, A., 2012. A variance reduction technique: Stratified sampling. Final Year Project.

Artzner, P., Delbaen, F., Eber, J.-M., Heath, D., 1999. Coherent measures of risk. Mathematical Finance 9 (3), 203–228.

Başoğlu, İ., Hörmann, W., Sak, H., 2013. Optimally stratified importance sampling for portfolio risk with multiple loss thresholds. Optimization 62 (11), 1451–1471.

Birge, J. R., 1995. Quasi-Monte Carlo approaches to option pricing. Ann Arbor MI 48109, 2117.

Breymann, W., Lüthi, D., 2013. ghyp: A package on generalized hyperbolic distributions. Tech. rep., Institute of Data Analysis and Process Design. Available online at: http://cran.r-project.org.

Byrd, R. H., Lu, P., Nocedal, J., Zhu, C., 1995. A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing 16 (5), 1190–1208.

Caflisch, R. E., 1998. Monte Carlo and quasi-Monte Carlo methods. Acta Numerica 7, 1–49.

Christophe, D., Petr, S., 2014. randtoolbox: Generating and Testing Random Numbers. R package version 1.16.

Demarta, S., McNeil, A. J., 2005. The t copula and related copulas. International Statistical Review / Revue Internationale de Statistique 73 (1), 111–129. URL http://www.jstor.org/stable/25472643

Derflinger, G., Hörmann, W., Leydold, J., 2010. Random variate generation by numerical inversion when only the density is known. ACM Transactions on Modeling and Computer Simulation (TOMACS) 20 (4), 18.
Duffie, D., Pan, J., 1997. An overview of value at risk. The Journal of Derivatives 4 (3), 7–49.

Etoré, P., Jourdain, B., 2010. Adaptive optimal allocation in stratified sampling methods. Methodology and Computing in Applied Probability 12 (3), 335–360.

Genz, A., Bretz, F., Miwa, T., Mi, X., Leisch, F., Scheipl, F., Hothorn, T., 2014. mvtnorm: Multivariate Normal and t Distributions. R package version 1.0-2. URL http://CRAN.R-project.org/package=mvtnorm

Glass, D., 1999. Importance sampling applied to value at risk. Ph.D. thesis, New York University.

Glasserman, P., 2004. Monte Carlo Methods in Financial Engineering. Vol. 53. Springer.

Glasserman, P., Heidelberger, P., Shahabuddin, N. P., Apr. 30 2002. Pricing of options using importance sampling and stratification/quasi-Monte Carlo. US Patent 6,381,586.

Glasserman, P., Heidelberger, P., Shahabuddin, P., 1999. Asymptotically optimal importance sampling and stratification for pricing path-dependent options. Mathematical Finance 9 (2), 117–152.

Glasserman, P., Heidelberger, P., Shahabuddin, P., 2000. Efficient Monte Carlo methods for value-at-risk. IBM TJ Watson Research Center.

Haugh, M., 2004. Variance reduction methods II. Monte Carlo Simulation: IEOR E4703, 1–20.

Imai, J., Tan, K. S., 2006. A general dimension reduction technique for derivative pricing. Journal of Computational Finance 10 (2), 129.

Joy, C., Boyle, P. P., Tan, K. S., 1996. Quasi-Monte Carlo methods in numerical finance. Management Science 42 (6), 926–938.

Kahn, H., Marshall, A. W., 1953. Methods of reducing sample size in Monte Carlo computations. Journal of the Operations Research Society of America 1 (5), 263–278. URL http://dx.doi.org/10.1287/opre.1.5.263
Leydold, J., Hörmann, W., Sak, H., 2014. An R interface to the UNU.RAN library for universal random variate generators.

McNeil, A. J., Frey, R., Embrechts, P., 2010. Quantitative Risk Management: Concepts, Techniques, and Tools. Princeton University Press.

Mehta, A., Neukirchen, M., Pfetsch, S., Poppensieker, T., May 2012. Managing market risk: Today and tomorrow.

Owen, A. B., 2013. Monte Carlo theory, methods and examples.

Patton, A. J., 2002. Applications of copula theory in financial econometrics. Ph.D. thesis, University of California, San Diego.

Prause, K., 1999. The generalized hyperbolic model: Estimation, financial derivatives, and risk measures. Ph.D. thesis, University of Freiburg.

Sak, H., Hörmann, W., Leydold, J., 2010. Efficient risk simulations for linear asset portfolios in the t-copula model. European Journal of Operational Research 202 (3), 802–809.

Sklar, A., 1959. Fonctions de répartition à n dimensions et leurs marges.

Tokdar, S. T., Kass, R. E., 2010. Importance sampling: a review. Wiley Interdisciplinary Reviews: Computational Statistics 2 (1), 54–60.
Appendix

A Algorithms

A.1 AOA Algorithm

Algorithm A.1 The modified AOA algorithm
Input: simulation function h(x): R^D → R; density function f(x) of the random input; importance sampling density g(x); strata ξ_i and respective probabilities p_i, i = 1, ..., I; number of iterations K; the aimed sample sizes in each iteration N^k, k = 1, ..., K
Output: stratified estimator ˆx_strs and its variance V[ˆx_strs]
1: set M^0_i = 0 and π^1_i = p_i for i = 1, ..., I
2: for iteration k = 1, ..., K do
3:   if k ≥ 2 then
4:     compute π^k_i for i = 1, ..., I using (21)
5:   end if
6:   for stratum index i = 1, ..., I do
7:     compute N^k_i using (22)
8:     for drawing n = 1, ..., N^k_i do
9:       generate X from density g(x) conditional on X ∈ ξ_i
10:      compute the likelihood ratio w(X) = f(X)/g(X)
11:      compute h_IS(X) = h(X)w(X) and add it to the sample set S_i
12:     end for
13:     set M^k_i = M^{k−1}_i + N^k_i
14:     compute the sample standard deviation ˆσ^k_i of the set S_i
15:     if k = K then
16:       compute the sample mean ˆx^k_i of the set S_i
17:     end if
18:   end for
19: end for
20: return ˆx_strs and V[ˆx_strs] using (23) and (24), respectively
A.2 Algorithms for Stratified Importance Sampling

Algorithm A.2 Construction of a linear transformation matrix for a given direction v ∈ R^D
Input: stratification direction v ∈ R^D
Output: linear transformation matrix V ∈ R^{D×D}
1: define matrix V ∈ R^{D×D} with all entries equal to zero
2: set the first column of V as v
3: for index d = 1, ..., D − 1 do
4:   set V_{D,d−1} = 1
5:   define matrix B as the sub-matrix of V between the elements V_{D−d,1} and V_{D−d,d}
6:   define vector b as the sub-vector of V between the elements V_{D,1} and V_{D,d}
7:   find the unique solution q of Bq = b
8:   set the elements between V_{D−d,d+1} and V_{D−1,d+1} as q
9:   scale the (d + 1)th column of V to unit length
10: end for
11: return V

Algorithm A.3 Stratified importance sampling: Initialization
Input: parameters of the t-copula model; portfolio return threshold t
Output: IS parameters θ_1 and θ_2; pre-multiplier matrix A
1: compute the Cholesky factor L of Σ, i.e., LL' = Σ
2: compute µ and y_0 using Algorithm 3 of Sak et al. (2010)
3: set θ_1 = ||µ||
4: set θ_2 = y_0/(ν/2 − 1)
5: set v = µ/θ_1
6: call Algorithm A.2 with input v to construct the linear transformation matrix V
7: compute the pre-multiplier matrix A = LV
8: return θ_1, θ_2 and A
Algorithm A.4 Stratified importance sampling: Single estimate
Input: parameters of the t-copula model; portfolio return threshold t
Output: tail loss probability estimate ˆx_sis and its variance V[ˆx_sis]
1: initialize with Algorithm A.3
2: set M^0_i = 0 and define the sets S_i = ∅ for i = 1, ..., I
3: for iteration k = 1, ..., K do
4:   for stratum index i = 1, ..., I do
5:     compute N^k_i using (31) and set M^k_i = N^k_i + M^{k−1}_i
6:     for drawing n = 1, ..., N^k_i do
7:       generate U_1 ∼ U(0, 1) and set Z_1 using (28)
8:       generate Z_d ∼ N(0, 1), d = 2, ..., D, independently
9:       generate U_2 ∼ U(0, 1) independently and set Y using (29)
10:      generate the multi-t vector T using (27) and compute R(Z, Y) using (8)
11:      compute ρ(Z, Y) 1{R(Z,Y) < t} using (30) and add it to the sample set S_i
12:     end for
13:     compute the sample standard deviation ˆσ^k_i of the set S_i
14:     if k = K then
15:       compute the sample mean ˆx^k_i of the set S_i
16:     end if
17:   end for
18: end for
19: return ˆx_sis and V[ˆx_sis] using (23) and (24), respectively
A.3 Algorithm for the combination of Quasi-Monte Carlo and Stratified Importance Sampling

Algorithm A.5 The combination of Quasi-Monte Carlo and Stratified Importance Sampling
Input: parameters of the t-copula model; portfolio return threshold t
Output: tail loss probability estimate ˆx_sis and its variance V[ˆx_sis]
1: initialize with Algorithm A.3
2: set M^0_i = 0 and define the sets S_i = ∅ for i = 1, ..., I
3: for iteration k = 1, ..., K do
4:   for stratum index i = 1, ..., I do
5:     compute N^k_i using (31) and set M^k_i = N^k_i + M^{k−1}_i
6:     for drawing n = 1, ..., N^k_i do
7:       generate a two-dimensional quasi-random uniform point QU from a low-discrepancy sequence and assign its two coordinates to U_1 and U_2
8:       set Z_1 using (28)
9:       generate a (D − 1)-dimensional standard normal quasi-random point QZ and assign its coordinates to the elements Z_d, d = 2, ..., D
10:      set Y using (29)
11:      generate the multi-t vector T using (27) and compute R(Z, Y) using (8)
12:      compute ρ(Z, Y) 1{R(Z,Y) < t} using (30) and add it to the sample set S_i
13:     end for
14:     compute the sample standard deviation ˆσ^k_i of the set S_i
15:     if k = K then
16:       compute the sample mean ˆx^k_i of the set S_i
17:     end if
18:   end for
19: end for
20: return ˆx_sis and V[ˆx_sis] using (23) and (24), respectively
B R codes

B.1 Quasi-random number generation codes

# Naive simulation
naive <- function(x){
  set.seed(1234)
  n <- 100000
  z <- matrix(rnorm(2 * n), n, 2)
  c <- array(0, dim = n)
  for(i in 1:n){
    a <- (z[i, 1]^2 + z[i, 2]^2)
    if(a < x){
      c[i] <- 1
    }
  }
  res <- mean(c)
  se <- sd(c)
  CI95 <- c(res - 1.96 * se / sqrt(n), res + 1.96 * se / sqrt(n))
  c(res, se, CI95)
}
naive(0.1)
## [1] 0.04825000 0.21429512 0.04692179 0.04957821

# Halton sequence
library("randtoolbox")
halton.sim <- function(x){
  n <- 100000
  z <- halton(n, dim = 2, normal = TRUE)
  c <- array(0, dim = n)
  for(i in 1:n){
    a <- (z[i, 1]^2 + z[i, 2]^2)
    if(a < x){
      c[i] <- 1
    }
  }
  res <- mean(c)
  se <- sd(c)
  CI95 <- c(res - 1.96 * se / sqrt(n), res + 1.96 * se / sqrt(n))
  c(res, se, CI95)
}
halton.sim(0.1)
## [1] 0.04883000 0.21551356 0.04749423 0.05016577

# Sobol sequence
library("randtoolbox")
sobol.sim <- function(x){
  n <- 100000
  z <- sobol(n, dim = 2, normal = TRUE)
  c <- array(0, dim = n)
  for(i in 1:n){
    a <- (z[i, 1]^2 + z[i, 2]^2)
    if(a < x){
      c[i] <- 1
    }
  }
  res <- mean(c)
  se <- sd(c)
  CI95 <- c(res - 1.96 * se / sqrt(n), res + 1.96 * se / sqrt(n))
  c(res, se, CI95)
}
sobol.sim(0.1)
## [1] 0.04880000 0.21545075 0.04746462 0.05013538

B.2 Codes for the combination of Quasi-Monte Carlo and Stratified Importance Sampling

# LIBRARY
library("Runuran")
library("randtoolbox")
R2X <- function(tab, ...){
  write.table(tab, "clipboard", sep = "\t", row.names = FALSE)
}
# TOOLS
searchx <- function(n, p, nu, numg, L, c, w){
  a <- 0.9; b <- 1
  fa <- Naive(n, a, nu, numg, L, c, w)[1] - p
  fb <- Naive(n, b, nu, numg, L, c, w)[1] - p
  err <- 1
  while(err / p > 0.05){
    ab <- (a + b) / 2
    fab <- Naive(n, ab, nu, numg, L, c, w)[1] - p
    err <- abs(fab)
    if(fab < 0){ a <- ab; fa <- fab }
    if(fab >= 0){ b <- ab; fb <- fab }
  }
  ab
}
ortmat <- function(mat){
  n <- dim(mat)[1]
  k <- dim(mat)[2]
  res <- matrix(0, n, n)
  res[1:n, 1:k] <- mat
  res[1, (k+1):n] <- 1
  for(i in (k+1):n){
    res[2:i, i] <- solve(t(res[2:i, 1:(i-1)]), -res[1, 1:(i-1)])
  }
  for(i in 1:n){
    res[, i] <- res[, i] / sqrt(sum(res[, i]^2))
  }
  res
}
qTLP_T <- function(Z, X2, x, nu, numg, L, c, w, df = 0){
  d <- dim(Z)[1]; n <- dim(Z)[2]
  T <- L %*% Z / matrix(sqrt(X2 / nu), d, n, byrow = TRUE)
  Tmg <- qt(pt(T, nu), numg)
  if(df == 1){
    res <- as.vector(t(w) %*% exp(c * Tmg)) - x
  } else {
    res <- 1 * ((as.vector(t(w) %*% exp(c * Tmg)) - x) < 0)
  }
  res
}
touch <- function(r, dir, x, nu, numg, L, c, w){
  Z <- t(t(r * dir)); X2 <- nu
  qTLP_T(Z, X2, x, nu, numg, L, c, w, 1) + 1e-5
}
alg2 <- function(dir, x, nu, numg, L, c, w){
  dir <- dir / sqrt(sum(dir^2))
  r0 <- uniroot(touch, c(-10^3, 1e-5), dir, x, nu, numg, L, c, w)$root
  y0 <- (nu - 2) / (1 + r0^2 / nu)
  z0 <- r0 * sqrt(y0 / nu) * dir
  of <- (nu/2 - 1) * (log(y0) - 1)
  res <- c(of, z0, y0)
  res
}
alg3 <- function(x, nu, numg, L, c, w){
  dir <- as.vector(L %*% (c * w))
  dir <- dir / sqrt(sum(dir^2))
  d <- length(dir) - 1
  v <- dir[2:length(dir)] / dir[1]
  f <- function(z){
    -alg2(c(1, z), x, nu, numg, L, c, w)[1]
  }
  optdir <- optim(v, f, control = list(maxit = 10000), lower = rep(0, d - 1),
                  method = "L-BFGS-B")[[1]]
  optdir <- c(1, optdir)
  optdir <- optdir / sqrt(sum(optdir^2))
  res <- alg2(optdir, x, nu, numg, L, c, w)[-1]
  res
}

Naive <- function(n, x, nu, numg, L, c, w){
  d <- length(c)
  Z <- matrix(rnorm(d * n), d, n)
  X2 <- rgamma(n, shape = nu/2, scale = 2)
  qZ <- qTLP_T(Z, X2, x, nu, numg, L, c, w)
  res <- mean(qZ)
  res[5] <- var(qZ) / n
  res[2] <- sqrt(res[5]) * qnorm(0.975)
  res[3] <- res[1] - res[2]
  res[4] <- res[1] + res[2]
  res[6] <- 100 * res[2] / res[1]
  names(res) <- c("estimate", "halfwidth", "%95CILB", "%95CIUB", "est.var", "rel.err")
  res
}
IS <- function(n, x, nu, numg, L, c, w){
  d <- length(c)
  muy0 <- alg3(x, nu, numg, L, c, w)
  theta <- muy0[d+1] / (nu/2 - 1)
  mu <- muy0[1:d]
  Z <- matrix(rnorm(d * n) + mu, d, n)
  X2 <- rgamma(n, shape = nu/2, scale = theta)
  wZ <- (dnorm(Z) / dnorm(Z, mu))
  wIS <- wZ[1, ]
  for(i in 2:d) wIS <- wIS * wZ[i, ]
  wIS <- wIS * dchisq(X2, df = nu) / dgamma(X2, shape = nu/2, scale = theta)
  qZ <- qTLP_T(Z, X2, x, nu, numg, L, c, w) * wIS
  res <- mean(qZ)
  res[5] <- var(qZ) / n
  res[2] <- sqrt(res[5]) * qnorm(0.975)
  res[3] <- res[1] - res[2]
  res[4] <- res[1] + res[2]
  res[6] <- 100 * res[2] / res[1]
  names(res) <- c("estimate", "halfwidth", "%95CILB", "%95CIUB", "est.var", "rel.err")
  res
}

STRSIS <- function(n, ssize1, ssize2, x, nu, numg, L, c, w){
  #### DIRECTION OPTIMIZATION
  d <- length(c)
  muy0 <- alg3(x, nu, numg, L, c, w)
  theta <- muy0[d+1] / (nu/2 - 1)
  mu <- muy0[1:d]
  A <- ortmat(t(t(mu)))
  mu <- sqrt(sum(mu^2))
  #### STRATA OPTIMIZATION
  strata1 <- 0:ssize1 / ssize1
  strata2 <- 0:ssize2 / ssize2
  #### INITIALIZATION
  I1 <- length(strata1) - 1; I2 <- length(strata2) - 1
  p <- diff(strata1) %o% diff(strata2)
  xbar <- s2 <- matrix(0, I1, I2)
  #### FIRST ITERATION
  m <- p * n[1]
  fmcumsum <- floor(cumsum(as.vector(m)))
  M <- matrix(c(fmcumsum[1], diff(fmcumsum)), I1, I2)
  M <- (M < 10) * 10 + (M >= 10) * M
  StU1 <- rep(strata1[-(I1+1)], rowSums(M)) + runif(sum(M)) * rep(diff(strata1), rowSums(M))
  Z <- matrix(c(qnorm(StU1, mu), rnorm(sum(M) * (d-1))), d, sum(M), byrow = TRUE)
  StU2 <- rep(rep(strata2[-(I2+1)], I1), as.vector(t(M))) + runif(sum(M)) * rep(rep(diff(strata2), I1), as.vector(t(M)))
  X2 <- qgamma(StU2, shape = nu/2, scale = theta)
  wIS <- (dnorm(Z[1, ]) / dnorm(Z[1, ], mu)) * (dchisq(X2, df = nu) / dgamma(X2, shape = nu/2, scale = theta))
  qZ <- qTLP_T(A %*% Z, X2, x, nu, numg, L, c, w) * wIS
  Mcumsum <- cumsum(t(M))
  sample <- list(list(qZ[1:Mcumsum[1]]))
  xbar[1, 1] <- mean(sample[[1]][[1]])
  s2[1, 1] <- var(sample[[1]][[1]])
  for(j in 2:I2){
    sample[[1]][[j]] <- qZ[(Mcumsum[j-1]+1):Mcumsum[j]]
    xbar[1, j] <- mean(sample[[1]][[j]])
    s2[1, j] <- var(sample[[1]][[j]])
  }
  for(i in 2:I1){
    sample[[i]] <- list()
    for(j in 1:I2){
      sample[[i]][[j]] <- qZ[(Mcumsum[(i-1)*I2+j-1]+1):Mcumsum[(i-1)*I2+j]]
      xbar[i, j] <- mean(sample[[i]][[j]])
      s2[i, j] <- var(sample[[i]][[j]])
    }
  }
  N <- M
  #### ADAPTIVE ITERATIONS
  if(length(n) >= 2){
    for(k in 2:length(n)){
      m <- n[k] * (p * sqrt(s2)) / sum(p * sqrt(s2))
      fmcumsum <- floor(cumsum(as.vector(m)))
      M <- matrix(c(fmcumsum[1], diff(fmcumsum)), I1, I2)
      M <- (M < 10) * 10 + (M >= 10) * M
      StU1 <- rep(strata1[-(I1+1)], rowSums(M)) + runif(sum(M)) * rep(diff(strata1), rowSums(M))
      Z <- matrix(c(qnorm(StU1, mu), rnorm(sum(M) * (d-1))), d,
                  sum(M), byrow = TRUE)
      StU2 <- rep(rep(strata2[-(I2+1)], I1), as.vector(t(M))) + runif(sum(M)) * rep(rep(diff(strata2), I1), as.vector(t(M)))
      X2 <- qgamma(StU2, shape = nu/2, scale = theta)
      wIS <- (dnorm(Z[1, ]) / dnorm(Z[1, ], mu)) * (dchisq(X2, df = nu) / dgamma(X2, shape = nu/2, scale = theta))
      qZ <- qTLP_T(A %*% Z, X2, x, nu, numg, L, c, w) * wIS
      Mcumsum <- cumsum(t(M))
      sample[[1]][[1]] <- c(sample[[1]][[1]], qZ[1:Mcumsum[1]])
      xbar[1, 1] <- mean(sample[[1]][[1]])
      s2[1, 1] <- var(sample[[1]][[1]])
      for(j in 2:I2){
        sample[[1]][[j]] <- c(sample[[1]][[j]], qZ[(Mcumsum[j-1]+1):Mcumsum[j]])
        xbar[1, j] <- mean(sample[[1]][[j]])
        s2[1, j] <- var(sample[[1]][[j]])
      }
      for(i in 2:I1){
        for(j in 1:I2){
          sample[[i]][[j]] <- c(sample[[i]][[j]], qZ[(Mcumsum[(i-1)*I2+j-1]+1):Mcumsum[(i-1)*I2+j]])
          xbar[i, j] <- mean(sample[[i]][[j]])
          s2[i, j] <- var(sample[[i]][[j]])
        }
      }
      N <- N + M
    }
  }
  #### OUTPUT
  res <- sum(p * xbar)
  res[5] <- sum((p^2) * s2 / N)
  res[2] <- sqrt(res[5]) * qnorm(0.975)
  res[3] <- res[1] - res[2]
  res[4] <- res[1] + res[2]
  res[6] <- 100 * res[2] / res[1]
  names(res) <- c("mean", "errbound", "%95CILB", "%95CIUB", "estvar", "relperc")
  res
}
STRSHal <- function(n, ssize1, ssize2, x, nu, numg, L, c, w){
  #### DIRECTION OPTIMIZATION
  d <- length(c)
  muy0 <- alg3(x, nu, numg, L, c, w)
  theta <- muy0[d+1] / (nu/2 - 1)
  mu <- muy0[1:d]
  A <- ortmat(t(t(mu)))
  mu <- sqrt(sum(mu^2))
  #### STRATA OPTIMIZATION
  strata1 <- 0:ssize1 / ssize1
  strata2 <- 0:ssize2 / ssize2
  #### INITIALIZATION
  I1 <- length(strata1) - 1; I2 <- length(strata2) - 1
  p <- diff(strata1) %o% diff(strata2)
  xbar <- s2 <- matrix(0, I1, I2)
  #### FIRST ITERATION
  m <- p * n[1]
  fmcumsum <- floor(cumsum(as.vector(m)))
  M <- matrix(c(fmcumsum[1], diff(fmcumsum)), I1, I2)
  M <- (M < 10) * 10 + (M >= 10) * M
  HalU <- halton(sum(M), dim = 2)
  StU1 <- rep(strata1[-(I1+1)], rowSums(M)) + HalU[, 1] * rep(diff(strata1), rowSums(M))
  Z <- t(cbind(qnorm(StU1, mu), halton(sum(M), dim = d-1, normal = TRUE)))
  StU2 <- rep(rep(strata2[-(I2+1)], I1), as.vector(t(M))) + HalU[, 2] * rep(rep(diff(strata2), I1), as.vector(t(M)))
  X2 <- qgamma(StU2, shape = nu/2, scale = theta)
  wIS <- (dnorm(Z[1, ]) / dnorm(Z[1, ], mu)) * (dchisq(X2, df = nu) / dgamma(X2, shape = nu/2, scale = theta))
  qZ <- qTLP_T(A %*% Z, X2, x, nu, numg, L, c, w) * wIS
  Mcumsum <- cumsum(t(M))
  sample <- list(list(qZ[1:Mcumsum[1]]))
  xbar[1, 1] <- mean(sample[[1]][[1]])
  s2[1, 1] <- var(sample[[1]][[1]])
  for(j in 2:I2){
    sample[[1]][[j]] <- qZ[(Mcumsum[j-1]+1):Mcumsum[j]]
    xbar[1, j] <- mean(sample[[1]][[j]])
    s2[1, j] <- var(sample[[1]][[j]])
  }
  for(i in 2:I1){
    sample[[i]] <- list()
    for(j in 1:I2){
      sample[[i]][[j]]<-qZ[(Mcumsum[(i-1)*I2+j-1]+1):Mcumsum[(i-1)*I2+j]]
      xbar[i,j]<-mean(sample[[i]][[j]]); s2[i,j]<-var(sample[[i]][[j]])
    }
  }
  N<-M
  #### ADAPTIVE ITERATIONS
  if(length(n)>=2){
    for(k in 2:length(n)){
      m<-n[k]*(p*sqrt(s2))/sum(p*sqrt(s2))
      fmcumsum<-floor(cumsum(as.vector(m)))
      M<-matrix(c(fmcumsum[1],diff(fmcumsum)),I1,I2)
      M<-(M<10)*10+(M>=10)*M
      HalU<-halton(sum(M),dim=2)
      StU1<-rep(strata1[-(I1+1)],rowSums(M))+
        HalU[,1]*rep(diff(strata1),rowSums(M))
      Z<-t(cbind(qnorm(StU1,mu),halton(sum(M),dim=d-1,normal=T)))
      StU2<-rep(rep(strata2[-(I2+1)],I1),as.vector(t(M)))+
        HalU[,2]*rep(rep(diff(strata2),I1),as.vector(t(M)))
      X2<-qgamma(StU2,shape=nu/2,scale=theta)
      wIS<-(dnorm(Z[1,])/dnorm(Z[1,],mu))*
        (dchisq(X2,df=nu)/dgamma(X2,shape=nu/2,scale=theta))
      qZ<-qTLP_T(A%*%Z,X2,x,nu,numg,L,c,w)*wIS
      Mcumsum<-cumsum(t(M))
      sample[[1]][[1]]<-c(sample[[1]][[1]],qZ[1:Mcumsum[1]])
      xbar[1,1]<-mean(sample[[1]][[1]]); s2[1,1]<-var(sample[[1]][[1]])
      for(j in 2:I2){
        sample[[1]][[j]]<-c(sample[[1]][[j]],qZ[(Mcumsum[j-1]+1):Mcumsum[j]])
        xbar[1,j]<-mean(sample[[1]][[j]]); s2[1,j]<-var(sample[[1]][[j]])
      }
      for(i in 2:I1){
        for(j in 1:I2){
          sample[[i]][[j]]<-c(sample[[i]][[j]],
            qZ[(Mcumsum[(i-1)*I2+j-1]+1):Mcumsum[(i-1)*I2+j]])
          xbar[i,j]<-mean(sample[[i]][[j]]); s2[i,j]<-var(sample[[i]][[j]])
        }
      }
      N<-N+M
    }
  }
  #### OUTPUT
  res<-sum(p*xbar)
  res[5]<-sum((p^2)*s2/N)
  res[2]<-sqrt(res[5])*qnorm(0.975)
  res[3]<-res[1]-res[2]
  res[4]<-res[1]+res[2]
  res[6]<-100*res[2]/res[1]
  names(res)<-c("mean","errbound","%95CILB","%95CIUB","estvar","relperc")
  res
}
STRSSob<-function(n,ssize1,ssize2,x,nu,numg,L,c,w){
  ## STRSSob is the Sobol counterpart of STRSHal: the stratified uniforms and the
  ## remaining d-1 normal coordinates are generated from Sobol sequences.
  #### DIRECTION OPTIMIZATION
  d<-length(c)
  muy0<-alg3(x,nu,numg,L,c,w)
  theta<-muy0[d+1]/(nu/2-1)
  mu<-muy0[1:d]
  A<-ortmat(t(t(mu)))
  mu<-sqrt(sum(mu^2))
  #### STRATA OPTIMIZATION
  strata1<-0:ssize1/ssize1
  strata2<-0:ssize2/ssize2
  #### INITIALIZATION
  I1<-length(strata1)-1; I2<-length(strata2)-1
  p<-diff(strata1)%o%diff(strata2)
  xbar<-s2<-matrix(0,I1,I2)
  #### FIRST ITERATION
  m<-p*n[1]
  fmcumsum<-floor(cumsum(as.vector(m)))
  M<-matrix(c(fmcumsum[1],diff(fmcumsum)),I1,I2)
  M<-(M<10)*10+(M>=10)*M
  SobU<-sobol(sum(M),dim=2)
  StU1<-rep(strata1[-(I1+1)],rowSums(M))+
    SobU[,1]*rep(diff(strata1),rowSums(M))
  Z<-t(cbind(qnorm(StU1,mu),sobol(sum(M),dim=d-1,normal=T)))
  StU2<-rep(rep(strata2[-(I2+1)],I1),as.vector(t(M)))+
    SobU[,2]*rep(rep(diff(strata2),I1),as.vector(t(M)))
  X2<-qgamma(StU2,shape=nu/2,scale=theta)
  wIS<-(dnorm(Z[1,])/dnorm(Z[1,],mu))*
    (dchisq(X2,df=nu)/dgamma(X2,shape=nu/2,scale=theta))
  qZ<-qTLP_T(A%*%Z,X2,x,nu,numg,L,c,w)*wIS
  Mcumsum<-cumsum(t(M))
  sample<-list(list(qZ[1:Mcumsum[1]]))
  xbar[1,1]<-mean(sample[[1]][[1]]); s2[1,1]<-var(sample[[1]][[1]])
  for(j in 2:I2){
    sample[[1]][[j]]<-qZ[(Mcumsum[j-1]+1):Mcumsum[j]]
    xbar[1,j]<-mean(sample[[1]][[j]]); s2[1,j]<-var(sample[[1]][[j]])
  }
  for(i in 2:I1){
    sample[[i]]<-list()
    for(j in 1:I2){
      sample[[i]][[j]]<-qZ[(Mcumsum[(i-1)*I2+j-1]+1):Mcumsum[(i-1)*I2+j]]
      xbar[i,j]<-mean(sample[[i]][[j]]); s2[i,j]<-var(sample[[i]][[j]])
    }
  }
  N<-M
  #### ADAPTIVE ITERATIONS
  if(length(n)>=2){
    for(k in 2:length(n)){
      m<-n[k]*(p*sqrt(s2))/sum(p*sqrt(s2))
      fmcumsum<-floor(cumsum(as.vector(m)))
      M<-matrix(c(fmcumsum[1],diff(fmcumsum)),I1,I2)
      M<-(M<10)*10+(M>=10)*M
      SobU<-sobol(sum(M),dim=2)
      StU1<-rep(strata1[-(I1+1)],rowSums(M))+
        SobU[,1]*rep(diff(strata1),rowSums(M))
      Z<-t(cbind(qnorm(StU1,mu),sobol(sum(M),dim=d-1,normal=T)))
      StU2<-rep(rep(strata2[-(I2+1)],I1),as.vector(t(M)))+
        SobU[,2]*rep(rep(diff(strata2),I1),as.vector(t(M)))
      X2<-qgamma(StU2,shape=nu/2,scale=theta)
      wIS<-(dnorm(Z[1,])/dnorm(Z[1,],mu))*
        (dchisq(X2,df=nu)/dgamma(X2,shape=nu/2,scale=theta))
      qZ<-qTLP_T(A%*%Z,X2,x,nu,numg,L,c,w)*wIS
      Mcumsum<-cumsum(t(M))
      sample[[1]][[1]]<-c(sample[[1]][[1]],qZ[1:Mcumsum[1]])
      xbar[1,1]<-mean(sample[[1]][[1]])
      s2[1,1]<-var(sample[[1]][[1]])
      for(j in 2:I2){
        sample[[1]][[j]]<-c(sample[[1]][[j]],qZ[(Mcumsum[j-1]+1):Mcumsum[j]])
        xbar[1,j]<-mean(sample[[1]][[j]]); s2[1,j]<-var(sample[[1]][[j]])
      }
      for(i in 2:I1){
        for(j in 1:I2){
          sample[[i]][[j]]<-c(sample[[i]][[j]],
            qZ[(Mcumsum[(i-1)*I2+j-1]+1):Mcumsum[(i-1)*I2+j]])
          xbar[i,j]<-mean(sample[[i]][[j]]); s2[i,j]<-var(sample[[i]][[j]])
        }
      }
      N<-N+M
    }
  }
  #### OUTPUT
  res<-sum(p*xbar)
  res[5]<-sum((p^2)*s2/N)
  res[2]<-sqrt(res[5])*qnorm(0.975)
  res[3]<-res[1]-res[2]
  res[4]<-res[1]+res[2]
  res[6]<-100*res[2]/res[1]
  names(res)<-c("mean","errbound","%95CILB","%95CIUB","estvar","relperc")
  res
}
#### EXPERIMENTS
set.seed(1234)
d<-2                      # number of stocks in the portfolio
n<-10^5                   # sample size for the naive and IS estimators
nstrs<-1:4*10^4           # budgets for the first and adaptive iterations of the stratified estimators
ssize1<-22; ssize2<-22    # number of strata in each of the two stratified directions
nu<-5                     # degrees of freedom of the t-copula
p<-0.05                   # target tail loss probability

corelMatrix<-function(){
  ## random correlation matrix with off-diagonal entries drawn from (0, 0.3)
  R <- array(-1,c(d,d))
  for (i in 1:d){
    for (j in 1:d){
      if(i != j){
        if(R[j,i] != -1)
          R[i,j] = R[j,i]
        else{
          R[i,j] = runif(1)*0.3
        }
      }
      else
        R[i,j] = 1.0
    }
  }
  R
}
R<-corelMatrix()
L<-t(chol(R))             # lower-triangular Cholesky factor of the correlation matrix

volVector<-function(){
  ## annual volatilities drawn uniformly between 0.05 and 0.2
  vol <- runif(d)*(0.2-0.05) + 0.05
  vol
}
vol<-volVector()

assignWeights <- function(){
  ## random portfolio weights, normalized to sum to one
  w <- array(0,c(d))
  sum <- 0
  for (j in 1:d){
    w[j] <- runif(1,1/(1000),1)
    sum <- sum + w[j]
  }
  for (j in 1:d){
    w[j] <- w[j] / sum
  }
  w
}
w<-assignWeights()

numg<-5 + 45 * runif(d)                 # marginal degrees-of-freedom parameters, drawn from (5, 50)
c<-sqrt((vol^2/252)*(numg-2)/numg)      # scale factors matching the daily return volatilities

x<-searchx(10000,p,nu,numg,L,c,w)       # threshold whose tail loss probability is close to p
Naive(n,x,nu,numg,L,c,w)
IS(n,x,nu,numg,L,c,w)
STRSIS(nstrs,ssize1,ssize2,x,nu,numg,L,c,w)
STRSHal(nstrs,ssize1,ssize2,x,nu,numg,L,c,w)
STRSSob(nstrs,ssize1,ssize2,x,nu,numg,L,c,w)

system.time(Naive(n,x,nu,numg,L,c,w))
system.time(IS(n,x,nu,numg,L,c,w))
system.time(STRSIS(nstrs,ssize1,ssize2,x,nu,numg,L,c,w))
system.time(STRSHal(nstrs,ssize1,ssize2,x,nu,numg,L,c,w))
system.time(STRSSob(nstrs,ssize1,ssize2,x,nu,numg,L,c,w))
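The listings above assume a quasi-random number library is already attached: the halton() and sobol() calls with the normal argument match the interface of the randtoolbox package, so if the codes in B.1 do not load it, the session needs library(randtoolbox). The short sketch below is not part of the original listings; vrf() is a hypothetical helper that compares the named result vectors returned by the stratified routines and reports the variance reduction factor through their "estvar" entries.

## Minimal session setup and comparison sketch (assumptions: randtoolbox supplies
## halton()/sobol(); vrf() is a hypothetical helper, not part of the thesis code)
library(randtoolbox)

vrf <- function(ref, method) {
  ## variance reduction factor of "method" relative to "ref", using the
  ## "estvar" entries of the named result vectors produced by the routines above
  unname(ref["estvar"] / method["estvar"])
}

## Example: variance reduction of the Sobol combination over stratified IS
## vrf(STRSIS(nstrs,ssize1,ssize2,x,nu,numg,L,c,w),
##     STRSSob(nstrs,ssize1,ssize2,x,nu,numg,L,c,w))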