2. Dissertation Defense
Uncertainty Analysis Applications of GDB Copula with TSP Generating Densities
Dissertation Committee:
Johan R. van Dorp, Dissertation Director
Enrique Campos-Nanez, Committee Member
Jonathan Pierce Deason, Committee Member
Robert A. Roncace, Committee Member
Thomas Andrew Mazzuchi, Committee Member
3. Overview
Introduction
• Dissertation background
• Preview dissertation contributions
Multivariate Models
• Details of the single common risk factor multivariate model
• Details of the multiple common risk factor multivariate model
Application Examples
• Hydrological frequency analysis
• Stock returns
Concluding remarks
• Review model success
• Future work
5. Definitions – Sklar’s Theorem
Theorem (Sklar). Let H be a joint CDF with continuous margins F and G. Then there exists a unique
copula C such that:
$$H(x, y) = C\{F(x), G(y)\}$$
Corollary. Let H be a joint CDF with continuous margins F and G and copula C. Then for
$u, v \in [0, 1]$:
$$C(u, v) = H\{F^{-1}(u), G^{-1}(v)\},$$
where $F^{-1}$ and $G^{-1}$ are quantile functions.
We also have the density representation of copulas:
$$h(x, y) = c\{F(x), G(y)\} \cdot f(x)\, g(y)$$
where h, c, f, and g are densities.
6. Graphical representation of the diagonal band copula
Support region and copula PDF representation of the original diagonal band copula [4].
There are two extensions to the diagonal band copula, both using generating densities:
• Ferguson (1995) and Bojarski (2002)
• Lewandowski (2005) showed that the Ferguson and Bojarski extensions are equivalent
$$c(u, v) = \frac{1}{2}\{g(|u - v|) + g(1 - |1 - u - v|)\}, \quad 0 < u, v < 1$$ [5]
$$g(z) = \begin{cases} g_\theta(-z) + g_\theta(z) & \text{for } z \in [0, 1 - \theta] \\ 0 & \text{elsewhere} \end{cases}$$ [10]
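Ferguson's form suggests a simple way to draw from such a copula: draw U uniformly, draw a step W from g, move from U by +W or −W with equal probability, and reflect the result back into [0, 1]. The sketch below illustrates that idea and is not a procedure taken from the slides. The generating density g(w) = n(1 − w)^(n−1), sampled as W = 1 − U^(1/n), and the names `sample_pair`, `make_power_sampler`, and `pearson` are illustrative assumptions.

```python
import math
import random

def sample_pair(sample_w, rng):
    """Draw (u, v) with density 0.5 * {g(|u - v|) + g(1 - |1 - u - v|)}."""
    u = rng.random()
    w = sample_w(rng)                      # step size W ~ g on [0, 1]
    sign = 1.0 if rng.random() < 0.5 else -1.0
    v = u + sign * w
    if v < 0.0:                            # reflect at the lower edge
        v = -v
    elif v > 1.0:                          # reflect at the upper edge
        v = 2.0 - v
    return u, v

def make_power_sampler(n):
    """Assumed generating density g(w) = n * (1 - w)**(n - 1), via inverse CDF."""
    def sample_w(rng):
        return 1.0 - rng.random() ** (1.0 / n)
    return sample_w

def pearson(a, b):
    """Plain Pearson correlation of two equal-length lists."""
    m = len(a)
    ma, mb = sum(a) / m, sum(b) / m
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

rng = random.Random(12345)
sampler = make_power_sampler(2.689)        # illustrative parameter value
pairs = [sample_pair(sampler, rng) for _ in range(20000)]
us = [u for u, _ in pairs]
vs = [v for _, v in pairs]
sample_corr = pearson(us, vs)
mean_v = sum(vs) / len(vs)
```

Both coordinates should look uniform on [0, 1], so `mean_v` should sit close to 0.5, and with this illustrative parameter the sample correlation should come out at roughly 0.55.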
7. GDB TSP copula [8]
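$$c\{x, y \mid p(\cdot \mid \Psi)\} = \frac{1}{2} \times \begin{cases}
p(1 - x - y \mid \Psi) + p(1 + x - y \mid \Psi), & (x, y) \in A_1, \\
p(1 - x - y \mid \Psi) + p(1 - x + y \mid \Psi), & (x, y) \in A_2, \\
p(x + y - 1 \mid \Psi) + p(1 + x - y \mid \Psi), & (x, y) \in A_3, \\
p(x + y - 1 \mid \Psi) + p(1 - x + y \mid \Psi), & (x, y) \in A_4.
\end{cases}$$
If we consider the TSP generating density (Kotz and Van Dorp, 2010)
$$p(z \mid \Psi) = n z^{n-1} \;\rightarrow\; P(z \mid \Psi) = z^{n}$$
$$c\{x, y \mid p(\cdot \mid \Psi)\} = \frac{1}{2} \times \begin{cases}
n(1 - x - y)^{n-1} + n(1 + x - y)^{n-1}, & (x, y) \in A_1, \\
n(1 - x - y)^{n-1} + n(1 - x + y)^{n-1}, & (x, y) \in A_2, \\
n(x + y - 1)^{n-1} + n(1 + x - y)^{n-1}, & (x, y) \in A_3, \\
n(x + y - 1)^{n-1} + n(1 - x + y)^{n-1}, & (x, y) \in A_4.
\end{cases}$$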
• Nomenclature: we call this the single-parameter GDB-TSP copula
• This single-parameter copula is used in the current research model
• The research model could be extended to other copula models
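The four-case density can be written compactly if we assume the regions A1 to A4 are the four triangles cut by the two diagonals of the unit square. The slide does not define the regions, but that is what keeping each argument of p inside [0, 1] implies. Under that assumption every case reduces to 0.5 · {p(|1 − x − y|) + p(1 − |x − y|)}. The sketch below is a minimal illustration with hypothetical function names; it evaluates the density for p(z) = n z^(n−1) and checks numerically that the margin integrates to one.

```python
def tsp_density(z, n):
    """Generating density p(z) = n * z**(n - 1) on [0, 1]."""
    return n * z ** (n - 1.0)

def gdb_tsp_density(x, y, n):
    """Compact form of the four-case copula density (regions assumed to be
    the four triangles cut by the diagonals of the unit square)."""
    return 0.5 * (tsp_density(abs(1.0 - x - y), n)
                  + tsp_density(1.0 - abs(x - y), n))

def margin_integral(x, n, steps=20000):
    """Midpoint-rule integral of the density over y in [0, 1] at fixed x.
    Intended for n >= 1; for n < 1 the density is unbounded on the diagonals."""
    h = 1.0 / steps
    return h * sum(gdb_tsp_density(x, (k + 0.5) * h, n) for k in range(steps))

# Point in the triangle x + y < 1, x < y: matches the first case of the slide.
x0, y0, n0 = 0.2, 0.5, 2.689
case_one = 0.5 * (tsp_density(1.0 - x0 - y0, n0) + tsp_density(1.0 + x0 - y0, n0))
compact = gdb_tsp_density(x0, y0, n0)
margin_at_03 = margin_integral(0.3, n0)
```

A copula must have uniform margins, so `margin_at_03` should be very close to 1. With n = 1 the density is identically 1, which is the independence copula.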
8. Preview research contributions
Contributions:
• Specification of estimation procedure for a multivariate copula framework
• Bivariate model with single common risk factor
• Multivariate model with single common risk factor
• Multivariate model with multiple common risk factors
• Derivation of novel relationships between model parameters and the Spearman’s ρ and
Blomqvist’s β dependency measures
• Specification of efficient joint sampling procedures
Motivation:
• Literature review reveals focus on bivariate copulas
• Applications tend to use a select few families
• Elliptical - Gaussian, Student t
• Archimedean - Gumbel, Clayton, Frank
• Increase availability of multivariate copula models
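The dependence measures named in the contributions can be checked numerically for the single-parameter GDB-TSP copula using two standard identities: Spearman's ρ = 12·E[UV] − 3 and Blomqvist's β = 4·C(1/2, 1/2) − 1. The sketch below is a brute-force midpoint integration, not the derivation from the dissertation. It assumes the compact density form 0.5 · {p(|1 − x − y|) + p(1 − |x − y|)} with p(z) = n z^(n−1), and the function names are hypothetical.

```python
def gdb_tsp_density(x, y, n):
    """Assumed compact form of the GDB-TSP density with p(z) = n * z**(n - 1)."""
    p = lambda z: n * z ** (n - 1.0)
    return 0.5 * (p(abs(1.0 - x - y)) + p(1.0 - abs(x - y)))

def spearman_rho(n, grid=200):
    """Spearman's rho = 12 * E[U V] - 3, by the midpoint rule on [0, 1]^2."""
    h = 1.0 / grid
    total = 0.0
    for i in range(grid):
        x = (i + 0.5) * h
        for j in range(grid):
            y = (j + 0.5) * h
            total += x * y * gdb_tsp_density(x, y, n)
    return 12.0 * total * h * h - 3.0

def blomqvist_beta(n, grid=200):
    """Blomqvist's beta = 4 * C(1/2, 1/2) - 1, integrating over [0, 1/2]^2."""
    h = 0.5 / grid
    total = 0.0
    for i in range(grid):
        x = (i + 0.5) * h
        for j in range(grid):
            y = (j + 0.5) * h
            total += gdb_tsp_density(x, y, n)
    return 4.0 * total * h * h - 1.0

n_fit = 2.689                      # value reported for the rainfall example
rho = spearman_rho(n_fit)
beta = blomqvist_beta(n_fit)
```

At n = 2.689 the numerical ρ comes out close to 0.55, in line with the correlation reported alongside that parameter in the rainfall example. For this generating density the two integrals appear to reduce to ρ = 1 − 12/((n + 2)(n + 3)) and β = (n − 1)/(n + 1). These closed forms were worked out for this sketch as a cross-check and are not quoted from the slides.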
13. Extension to multiple common risk factor multivariate model
• Main addition is the incorporation of GDB-TSP copula in common risk factors model
• With multiple common risk factors, model flexibility and complexity increase
• Must now account for the Yi ↔ G(Yi) relationship
• Must relate model parameters to global dependence measure
• Increases the challenge of the optimization procedure in the parameter estimation process
• Two classes of model parameters
• Copula parameters constrained by $n_i > 0$
• Weights of the common risk factors with the following constraints:
$$0 \le w_{ji} \le 1, \qquad \sum_{i=1}^{m} w_{ji} = 1 \quad \forall\, j = 1, \dots, k$$
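One common way to keep weights inside these constraints during numerical optimization is to optimize unconstrained real numbers and map each row through a softmax, which yields values in [0, 1] that sum to one. The sketch below illustrates that general technique and is not necessarily the estimation procedure used in the dissertation. The function names and the example matrix are hypothetical.

```python
import math

def softmax_row(raw):
    """Map unconstrained reals to weights in [0, 1] that sum to 1."""
    shift = max(raw)                       # subtract the max for numerical stability
    exps = [math.exp(r - shift) for r in raw]
    total = sum(exps)
    return [e / total for e in exps]

def weights_from_raw(raw_matrix):
    """Apply the softmax to each row j, so that sum_i w[j][i] = 1."""
    return [softmax_row(row) for row in raw_matrix]

def is_feasible(weights, tol=1e-9):
    """Check 0 <= w_ji <= 1 and that every row sums to 1."""
    return all(
        all(0.0 <= w <= 1.0 for w in row) and abs(sum(row) - 1.0) <= tol
        for row in weights
    )

# Hypothetical unconstrained values: k = 2 rows, m = 3 weights per row.
raw = [[0.0, 1.0, -2.0], [3.0, 3.0, 3.0]]
W = weights_from_raw(raw)
```

An optimizer can then search over `raw` freely while every candidate `W` stays feasible. A similar trick, such as optimizing log n_i, would keep the copula parameters positive.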
16. Application Examples
• Hydrological frequency analysis
• Objective: Study relationship between rainfall duration and amounts
• Demonstrated importance of correctly modeling dependence structure
• GDB-TSP model outperforms traditional distribution approach
• Differences in model outputs have practical implications for flood mitigation strategies
• Flood example
• Objective: Study relationship between locations upstream and downstream
• Demonstrated importance of correctly modeling marginals
• Gamma distribution selected as the univariate model for both marginals
• Sediment composition
• Objective: Spatial study of relationship between composition of Cerium and Scandium
• Demonstrated importance of correctly modeling marginals
• GEV distribution selected as univariate model for Cerium marginal
• Logistic distribution selected as univariate model for Scandium marginal
• Salmonid risk assessment
• Objective: Monte Carlo salmonid risk assessment ($d_s/D_g$, $h/L_1$, $L_2/L_1$)
• Demonstrated importance of correctly modeling dependence structure
• In parts of the modeled space, the correlated model gave a higher estimate of the risk in achieving target survival rates
• Stock returns analysis
• Objective: Study effectiveness of research model on 7-dim model. Data matrix ↔ Sample matrix
• Use law of parsimony to select 3 common risk factors model over 2 common risk factors model
• Demonstrated high degree of fidelity of Sample Correlation matrix with Data Correlation matrix
17. Hydrological frequency analysis
The task is to better understand the
behavior of extreme rainfall events:
• Magnitude
• Duration
• Frequency
• Taking a distributional approach
Motivation:
• Potential for great loss
• Inform mitigation strategies
• Insurance underwriting
• Input into rainfall runoff models
An aerial view of the submerged runway at Rockhampton airport in Australia. Getty Images / Jonathan Wood
Source: http://blogs.sacbee.com/photos/2011/01/new-storms-soak-flood-weary-au.html
18. Korean rainfall example
Data:
• Seoul, Korea dataset [9]
• Bivariate dataset of rainfall maximums of amount and duration
• Study period from 1965 to 2005
Approach:
• Comparative investigation between GDB-TSP model and Gumbel mixed model
• Maintain assumption of Gumbel marginals
• Relies on the calculation of return periods
• Examine differences in model predictions of return periods
Estimated parameters for the GDB-TSP model with Gumbel marginals
Marginal   Mean (µ)   Std. dev (σ)   Scale (λ)   Location (u)   Correlation (ρ)   GDB-TSP (n)
Duration   56.25      30.55          23.82       42.5           0.55              2.689
Amount     225.23     111.9          87.2        174.9
(ρ and n describe the pair and are listed once.)
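The scale and location values in the table are consistent with the method-of-moments relations for the Gumbel distribution, λ = σ√6/π and u = µ − γλ, where γ ≈ 0.5772 is the Euler-Mascheroni constant. The slides do not state how the marginal parameters were estimated, so the sketch below is a plausibility check rather than the author's procedure. The function name is hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler-Mascheroni constant

def gumbel_from_moments(mean, std):
    """Method-of-moments Gumbel parameters: returns (scale, location)."""
    scale = std * math.sqrt(6.0) / math.pi
    location = mean - EULER_GAMMA * scale
    return scale, location

# Mean and standard deviation taken from the table on this slide.
duration_scale, duration_location = gumbel_from_moments(56.25, 30.55)
amount_scale, amount_location = gumbel_from_moments(225.23, 111.9)
```

Rounded to the precision of the table, both rows are reproduced: a scale of about 23.82 and a location of about 42.5 for duration, and about 87.2 and 174.9 for amount.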
19. Return period definitions
• T(x, y) is the joint return period of amount and duration.
• T(x|y) is the conditional return period of amount given duration.
• T′(x, y) is the non-standard joint return period of amount and duration.
• T′(x|y) is the non-standard conditional return period of amount given duration.
Bivariate return periods
Return period        Event
$T_{X,Y}(x, y)$      {(X > x or Y > y) or (X > x and Y > y)}
$T'_{X,Y}(x, y)$     {X > x and Y > y}
$T_{X|Y}(x|y)$       {X > x given Y = y}
$T'_{X|Y}(x|y)$      {X > x given Y ≤ y}
$$T(x, y) = \frac{1}{P_E(x, y)}$$
where $P_E(x, y) = 1 - F(x, y)$
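The two joint return periods differ only in the exceedance event. A minimal sketch follows, assuming one event per year (annual maxima) so that the reciprocal of an exceedance probability is a return period in years. The "and" case uses inclusion-exclusion, P(X > x and Y > y) = 1 − F_X(x) − F_Y(y) + F(x, y), which is a standard identity rather than a formula shown on the slide. The function names are hypothetical.

```python
def joint_return_period_or(f_xy):
    """T(x, y) = 1 / (1 - F(x, y)): amount or duration (or both) is exceeded."""
    return 1.0 / (1.0 - f_xy)

def joint_return_period_and(f_x, f_y, f_xy):
    """T'(x, y) = 1 / P(X > x and Y > y), using inclusion-exclusion."""
    return 1.0 / (1.0 - f_x - f_y + f_xy)

# Illustration with independent margins: F(x, y) = F_X(x) * F_Y(y).
f_x, f_y = 0.9, 0.9
f_xy = f_x * f_y
t_or = joint_return_period_or(f_xy)
t_and = joint_return_period_and(f_x, f_y, f_xy)
```

Because the "or" event always contains the "and" event, T(x, y) can never exceed T′(x, y). In this illustration the values are about 5.3 and 100 years.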
20. Candidate models
Gumbel mixed model: $F(x, y) = F_x(x)\, F_y(y) \exp\left\{-\theta \left[\frac{1}{\ln F_x(x)} + \frac{1}{\ln F_y(y)}\right]^{-1}\right\}$
GDB-TSP model: $F(x, y) = C\{F_x(x), F_y(y)\}$, where C is the GDB-TSP copula
For both models the marginals are assumed to be Gumbel:
$$F_z(z) = \exp\left\{-\exp\left[-\frac{z - u_z}{\lambda_z}\right]\right\}$$
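As a concrete reading of the formulas on this slide, the sketch below evaluates the Gumbel marginal CDF and the Gumbel mixed model joint CDF. The marginal parameters are those reported for the Seoul data, with x as amount and y as duration. The value θ = 0.5 is an arbitrary illustrative choice, not an estimate from the slides, and the function names are hypothetical.

```python
import math

def gumbel_cdf(z, location, scale):
    """Gumbel CDF: F(z) = exp(-exp(-(z - u) / lambda))."""
    return math.exp(-math.exp(-(z - location) / scale))

def gumbel_mixed_cdf(x, y, par_x, par_y, theta):
    """F(x, y) = Fx * Fy * exp(-theta * [1/ln Fx + 1/ln Fy]^(-1)).
    par_x and par_y are (location, scale) pairs; requires 0 < Fx, Fy < 1."""
    fx = gumbel_cdf(x, *par_x)
    fy = gumbel_cdf(y, *par_y)
    inner = 1.0 / math.log(fx) + 1.0 / math.log(fy)
    return fx * fy * math.exp(-theta / inner)

amount_par = (174.9, 87.2)     # (location u, scale lambda) from the slides
duration_par = (42.5, 23.82)
theta_demo = 0.5               # illustrative value only

fx = gumbel_cdf(300.0, *amount_par)
fy = gumbel_cdf(60.0, *duration_par)
f_indep = gumbel_mixed_cdf(300.0, 60.0, amount_par, duration_par, 0.0)
f_joint = gumbel_mixed_cdf(300.0, 60.0, amount_par, duration_par, theta_demo)
```

With θ = 0 the model collapses to the product of the marginals. A positive θ raises the joint CDF above that product, which is how the model encodes positive dependence between amount and duration.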
Distribution and density for GDB-TSP model
22. Goodness of fit details
• Distance from empirical CDF [6]
• $S_n = \sum_{i=1}^{n} \{F_n - F_{\theta_n}\}^2$
• $T_n = \sup \sqrt{n}\, |F_n - F_{\theta_n}|$
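Both statistics measure the distance between the empirical CDF F_n and the fitted model CDF. The sketch below computes them for a bivariate sample, under two assumptions that the slide does not spell out: the differences are evaluated at the sample points, and the supremum in T_n is approximated by the maximum over those points. The function names and the three-point sample are hypothetical.

```python
import math

def empirical_cdf(sample, x, y):
    """F_n(x, y): share of sample points with X_i <= x and Y_i <= y."""
    return sum(1 for xi, yi in sample if xi <= x and yi <= y) / len(sample)

def gof_statistics(sample, model_cdf):
    """Return (S_n, T_n) with differences evaluated at the sample points."""
    n = len(sample)
    diffs = [empirical_cdf(sample, x, y) - model_cdf(x, y) for x, y in sample]
    s_n = sum(d * d for d in diffs)                    # sum of squared distances
    t_n = math.sqrt(n) * max(abs(d) for d in diffs)    # scaled largest distance
    return s_n, t_n

# Toy sample on the unit square, tested against the independence model.
toy_sample = [(0.2, 0.3), (0.5, 0.9), (0.8, 0.6)]
s_n, t_n = gof_statistics(toy_sample, lambda x, y: x * y)
```

Smaller values indicate a closer fit, so when two candidate models are fitted to the same data the one with the smaller statistics is preferred.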
23. Model predictions comparison for the T(x,y) study
[Figure: model predictions at D = 12 hrs; axes: Amount (mm) and Return Period (yr)]
[Image: storm water runoff system. Source: http://www.abbey-associates.com/splash-splash/storm_water_management.html]
24. Findings of the comparative study
• Study compared GDB-TSP model to the Gumbel mixed model
• Based on the goodness-of-fit results, we select the GDB-TSP model
• For T(x, y) the comparison found:
• Good agreement in the low amount-duration regime
• GDB-TSP model predicted shorter joint return periods elsewhere
• For T′(x, y) the comparison found:
• Good agreement in most of the modeled space
• GDB-TSP model predicted smaller rainfall amounts in the higher duration events
• For both T(x|y) and T′(x|y) the comparison found:
• GDB-TSP model predicted smaller rainfall amounts in the shorter duration events
• GDB-TSP model predicted larger rainfall amounts in the higher duration events
25. Higher dimensional example: stock returns
• Data
• Weekly returns, January 1st 1990 – January 3rd 2011: $R_t = \frac{P_t - P_{t-1}}{P_{t-1}}$
• Stocks: XOM APA CVX SLB SU IMO NBL
• Indices: NDX DJA GSPC
• Goal
• Reproduce data correlation matrix
• Simulate from resulting distribution
• data correlation ↔ fit correlation ↔ sampled correlation
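The data-preparation and comparison steps can be sketched as follows: convert prices to simple returns with R_t = (P_t − P_{t−1}) / P_{t−1}, build a correlation matrix, and measure how far two correlation matrices are apart. The slides do not say which correlation measure or discrepancy was used. Pearson correlation and the largest absolute entry-wise difference are assumptions here, and the prices are made-up numbers used only for illustration.

```python
import math

def simple_returns(prices):
    """R_t = (P_t - P_{t-1}) / P_{t-1} for consecutive prices."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]

def pearson(a, b):
    """Pearson correlation of two equal-length return series."""
    m = len(a)
    ma, mb = sum(a) / m, sum(b) / m
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def correlation_matrix(series):
    """Correlation matrix of a list of return series."""
    return [[pearson(a, b) for b in series] for a in series]

def max_abs_difference(m1, m2):
    """Largest absolute entry-wise gap between two matrices."""
    return max(abs(x - y) for r1, r2 in zip(m1, m2) for x, y in zip(r1, r2))

# Made-up weekly prices for two hypothetical assets.
prices_a = [100.0, 110.0, 99.0, 104.0, 108.0]
prices_b = [50.0, 53.0, 49.0, 52.0, 51.0]
returns_a = simple_returns(prices_a)
returns_b = simple_returns(prices_b)
data_corr = correlation_matrix([returns_a, returns_b])
```

Calling `max_abs_difference(data_corr, sampled_corr)`, with `sampled_corr` computed from simulated returns, would then summarize how well the sampled correlation reproduces the data correlation.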
26. Stocks estimated parameters
Objective function = 0.00036
[Figure panels: Parameter Vector, Weight Vector]
Objective function = 0.035
[Figure panels: Parameter Vector, Weight Vector]
27. Correlation matrix comparison
[Figure: three panels titled Data Correlation Matrix, Fitted Correlation Matrix, and Sampled Correlation Matrix; axis labels in each panel: XOM, APA, CVX, SLB, SU, NDX, DJA]
29. Research Summary
Research contributions
• Novel relations linking copula parameters to traditional dependence measures
• Copula models
• A two parameter bivariate copula model based on a common risk factor
• A multivariate copula model based on a common risk factor
• A multivariate copula model based on multiple common risk factors
• Estimation procedures and sampling routines
Application examples
• A flood example demonstrating improvement over traditional distribution approach
• A geochemical sediment composition example leveraging the flexibility of arbitrary marginals
• A hydrology example of return periods
• A Monte Carlo simulation for risk assessment
• A multivariate example of stock market returns
Future work
• Investigate alternatives to numerical integration
• Investigate the interpretation of the common risk factors
31. References
[1] D. L. Barrow and P. W. Smith. Spline notation applied to a volume problem. The American
Mathematical Monthly, 86:50–51, Jan. 1979.
[2] R. W. Carter. Floods in Georgia. Geological Survey Circular, 1951. No. 100. [24.3-1].
[3] R. Dennis Cook and Mark E. Johnson. A family of distributions for modelling non-elliptically
symmetric multivariate data. Journal of the Royal Statistical Society. Series B (Methodological),
43(2):210–218, 1981.
[4] Roger M. Cooke and Rudi Waij. Monte Carlo sampling for generalized knowledge dependence
with application to human reliability. Risk Analysis, 6(3):335–343, 1986.
[5] T.F. Ferguson. A class of symmetric bivariate uniform distributions. Statistical Papers,
36(1):31–40, 1995.
[6] Christian Genest, Bruno Remillard, and David Beaudoin. Goodness-of-fit tests for copulas: A
review and a power study. Insurance: Mathematics and Economics, 44:199–213, 2009.
[7] Christian Genest and Louis-Paul Rivest. Statistical inference procedures for bivariate Archimedean
copulas. Journal of the American Statistical Association, 88(423):1034–1043, September 1993.
[8] Samuel Kotz and Johan Rene Van Dorp. Generalized diagonal band copulas with two-sided
generating densities. Decision Analysis, 7(2):196–214, 2010.
[9] Chang Lee, Tae-Woong Kim, Gunhui Chung, Minha Choi, and Chulsang Yoo. Application of
bivariate frequency analysis to the derivation of rainfall–frequency curves. Stochastic Environmental
Research and Risk Assessment, 24:389–397, 2010. 10.1007/s00477-009-0328-9.
[10] Daniel Lewandowski. Generalized diagonal band copulas. Insurance: Mathematics and
Economics, 37:49–67, 2005.
[11] Fu-Chun Wu and Yin-Phan Tsang. Second-order Monte Carlo uncertainty/variability analysis
using correlated model parameters: application to salmonid embryo survival risk assessment.
Ecological Modelling, 177:393–414, 2004.
32. GDB-TSP Canonical Correlation
[Figure: correlation matrix plot with regions labeled X correlation, Y correlation, Cross-correlation, and Zero covariance; color scale from −1.0 to 1.0]
• Reminiscent of factor rotation
• Independent X’s
• Significant dependence Y ↔ X
• Structural differences
• Factor rotation is a linear model
• GDB-TSP model provides full distribution approach