Climate Extremes Workshop - Semiparametric Estimation of Heavy Tailed Density: Including Covariate Information - Surya Tokdar, May 17, 2018
1. 1/30
Semiparametric Estimation of Heavy Tailed Density:
Including Covariate Information
Surya T Tokdar
Duke University
Partially supported by NSF DMS 1613173
3. 3/30
Transformation of pdfs
Setting.
Data range = (a, b), with a = −∞ and/or b = ∞
{gθ : θ ∈ Θ} a parametric family of pdfs on (a, b)
Gθ denotes the CDF of gθ
Lemma
For any pdf f on (a, b) and any θ ∈ Θ there exists a unique pdf
h = hθ,f on (0, 1) such that
f(y) = gθ(y) h(Gθ(y)), y ∈ (a, b).
Proof. Take Y ∼ f and take h to be the pdf of U = Gθ(Y).
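The lemma can be checked numerically. The sketch below is illustrative only: it assumes (a, b) = (0, ∞), takes gθ to be a generalized Pareto pdf with a hypothetical shape XI = 0.5, and takes f to be the standard exponential pdf. None of these choices come from the slides. It builds h by the change of variables in the proof, h(u) = f(Qθ(u)) / gθ(Qθ(u)) with Qθ the quantile function of gθ, then confirms that h integrates to 1 and that f(y) = gθ(y) h(Gθ(y)).

```python
import math

XI = 0.5  # hypothetical shape parameter (theta) of the generalized Pareto family


def g(y):
    """Generalized Pareto pdf on (0, inf) with shape XI and unit scale."""
    return (1.0 + XI * y) ** (-1.0 / XI - 1.0)


def G(y):
    """CDF of g."""
    return 1.0 - (1.0 + XI * y) ** (-1.0 / XI)


def G_inv(u):
    """Quantile function of g."""
    return ((1.0 - u) ** (-XI) - 1.0) / XI


def f(y):
    """Target pdf (assumed for illustration): standard exponential."""
    return math.exp(-y)


def h(u):
    """pdf of U = G(Y) when Y ~ f, obtained by change of variables."""
    y = G_inv(u)
    return f(y) / g(y)


# h should integrate to 1 on (0, 1); midpoint rule on a uniform grid.
n = 20000
h_mass = sum(h((i + 0.5) / n) for i in range(n)) / n

# The factorisation f(y) = g(y) * h(G(y)) should hold pointwise.
max_gap = max(abs(f(y) - g(y) * h(G(y))) for y in (0.1, 1.0, 5.0, 20.0))

print(h_mass, max_gap)
```

Here gθ has a heavier tail than f, so h decays to zero as u approaches 1.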
5. 4/30
gθ has heavier tails than f
[Figure: left panel, pdfs f and gθ plotted against y on (−6, 6); right panel, pdf hθ,f plotted against u on (0, 1).]
6. 5/30
gθ has matching tails with f
[Figure: left panel, pdfs f and gθ plotted against y on (−6, 6); right panel, pdf hθ,f plotted against u on (0, 1).]
7. 6/30
gθ has thinner tails than f
[Figure: left panel, pdfs f and gθ plotted against y on (−6, 6); right panel, pdf hθ,f plotted against u on (0, 1).]
8. 7/30
Tail-identified transformation
Definition
The family {gθ : θ ∈ Θ} is tail-identified if θ ≠ θ′ implies gθ and
gθ′ have distinct right and/or left tail indices.
Lemma
If {gθ : θ ∈ Θ} is tail-identified then for any pdf f on (a, b) there is
at most one θf ∈ Θ with h = hθf ,f satisfying 0 < h(0), h(1) < ∞.
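A small numerical illustration of why at most one θ can work, under assumptions that are not in the slides: both f and gθ are generalized Pareto pdfs on (0, ∞), f has a hypothetical shape XI_F = 0.5, and h is evaluated very close to u = 1. When gθ has the heavier tail h collapses to 0, when gθ has the thinner tail h blows up, and only the matching shape keeps h finite and positive.

```python
def gpd_pdf(y, xi):
    """Generalized Pareto pdf on (0, inf) with shape xi and unit scale."""
    return (1.0 + xi * y) ** (-1.0 / xi - 1.0)


def gpd_quantile_from_tail(s, xi):
    """Quantile at level u = 1 - s, written in terms of the tail probability s."""
    return (s ** (-xi) - 1.0) / xi


XI_F = 0.5  # hypothetical tail shape of the target density f


def h_near_one(xi_g, s=1e-9):
    """h(u) at u = 1 - s when f = GPD(XI_F) and g_theta = GPD(xi_g)."""
    y = gpd_quantile_from_tail(s, xi_g)
    return gpd_pdf(y, XI_F) / gpd_pdf(y, xi_g)


# xi_g = 1.0: heavier than f; xi_g = 0.5: matching; xi_g = 0.25: thinner.
for xi_g in (1.0, 0.5, 0.25):
    print(xi_g, h_near_one(xi_g))
```

The same pattern is what the three preceding figure slides show for a density on the whole real line.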
10. 8/30
Semiparametric density model for bulk + tail
{gθ : θ ∈ Θ} a tail-identified family
H := {h(·) a continuous pdf on [0, 1] : ‖log h‖∞ < ∞}
F := {f(·) = gθ(·) h(Gθ(·)) : θ ∈ Θ, h ∈ H}
Model: Y1, Y2, . . . IID ∼ f, f ∈ F
11. 9/30
Distribution and quantile functions
pdf : f (y) = gθ(y)h(Gθ(y))
cdf : F(y) = H(Gθ(y))
qf : Q(p) = Qθ(ζ(p))
q-density : q(p) = qθ(ζ(p)) ζ̇(p)
where
ζ = H⁻¹ is a diffeomorphism of [0, 1] onto itself and
‖log ζ̇‖∞ = ‖log h‖∞
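These four identities can be verified on a closed-form toy case. The sketch assumes a generalized Pareto gθ with hypothetical shape XI = 0.5 and a hypothetical h(u) = 0.5 + u, chosen only because H and ζ = H⁻¹ then have explicit formulas. Neither choice is from the slides.

```python
import math

XI = 0.5  # hypothetical generalized Pareto shape


def g(y):
    return (1.0 + XI * y) ** (-1.0 / XI - 1.0)


def G(y):
    return 1.0 - (1.0 + XI * y) ** (-1.0 / XI)


def Q_theta(u):
    return ((1.0 - u) ** (-XI) - 1.0) / XI


def h(u):
    """Hypothetical pdf on [0, 1] with bounded log."""
    return 0.5 + u


def H(u):
    """CDF of h."""
    return 0.5 * u + 0.5 * u * u


def zeta(p):
    """zeta = H^{-1}, solved in closed form from u^2/2 + u/2 = p."""
    return (-1.0 + math.sqrt(1.0 + 8.0 * p)) / 2.0


def zeta_dot(p):
    """Derivative of zeta; equals 1 / h(zeta(p))."""
    return 2.0 / math.sqrt(1.0 + 8.0 * p)


def f(y):
    return g(y) * h(G(y))


def F(y):
    return H(G(y))


def Q(p):
    return Q_theta(zeta(p))


def q(p):
    """Quantile density dQ/dp = q_theta(zeta(p)) * zeta_dot(p)."""
    u = zeta(p)
    q_theta = 1.0 / g(Q_theta(u))
    return q_theta * zeta_dot(p)


levels = (0.1, 0.5, 0.9, 0.99)
cdf_gap = max(abs(F(Q(p)) - p) for p in levels)          # F(Q(p)) = p
qd_gap = max(abs(q(p) * f(Q(p)) - 1.0) for p in levels)  # q(p) = 1 / f(Q(p))

# h and zeta_dot are monotone here, so the sup norms sit at the endpoints.
sup_log_h = max(abs(math.log(h(0.0))), abs(math.log(h(1.0))))
sup_log_zeta_dot = max(abs(math.log(zeta_dot(0.0))), abs(math.log(zeta_dot(1.0))))

print(cdf_gap, qd_gap, sup_log_h, sup_log_zeta_dot)
```

Both sup norms equal log 2 in this example, matching the last identity on the slide.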
12. 10/30
Prior for ζ
Define transformation
T : C([0, 1]) → {diffeomorphisms of [0, 1]}
as
(Tw)(p) = ∫_0^p e^{w(t)} dt / ∫_0^1 e^{w(t)} dt
w ∼ GP induces a prior distribution on ζ = Tw.
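A minimal sketch of the map T, using a midpoint rule on a uniform grid. The slides do not specify the covariance of the Gaussian process, so the draw of w below is only a stand-in: a short cosine series with independent Gaussian coefficients, which is a finite-rank Gaussian process. The grid size, seed, and coefficient scales are all assumptions made for illustration.

```python
import math
import random


def T(w, n=2000):
    """Approximate zeta = Tw on the grid {0, 1/n, ..., 1}.

    Returns a list zeta with zeta[i] approximating (Tw)(i / n).
    """
    cells = [math.exp(w((i + 0.5) / n)) / n for i in range(n)]  # midpoint rule
    total = sum(cells)
    zeta = [0.0]
    acc = 0.0
    for c in cells:
        acc += c
        zeta.append(acc / total)
    return zeta


random.seed(0)
# Hypothetical stand-in for a GP draw: random cosine series with decaying scales.
coef = [random.gauss(0.0, 1.0 / (k + 1)) for k in range(6)]


def w(t):
    return sum(c * math.cos(math.pi * k * t) for k, c in enumerate(coef))


zeta = T(w)
print(zeta[0], zeta[len(zeta) // 2], zeta[-1])
```

Because e^{w(t)} is strictly positive, every draw of w gives a strictly increasing ζ with ζ(0) = 0 and ζ(1) = 1, which is why no constraint on w is needed.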
13. 11/30
Hurricane intensity (North Atlantic 1981-2005)
[Figure: left panel, histogram of WmaxST (x-axis WmaxST, y-axis Density); right panel, return level Q(p) plotted against p for p between 0.90 and 1.00.]
ν = 4.0, 95% CI = (3.2, 5.3)
Q(0.999) = 168, 95% CI = (157, 225)
34. 27/30
Easy extensions
Censoring is trivial to accommodate: a one-line change in the code
Can be extended to “dependent response” by using copulas (not done yet)
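To illustrate why censoring is a small change, here is a sketch of a log-likelihood under the model f(y) = gθ(y) h(Gθ(y)) with right censoring. This is not the author's code. It assumes right censoring at known values, a generalized Pareto gθ with hypothetical shape XI = 0.5, and a hypothetical h(u) = 0.5 + u. A censored observation contributes the survival probability 1 − F(c) = 1 − H(Gθ(c)) in place of the density.

```python
import math

XI = 0.5  # hypothetical generalized Pareto shape


def g(y):
    return (1.0 + XI * y) ** (-1.0 / XI - 1.0)


def G(y):
    return 1.0 - (1.0 + XI * y) ** (-1.0 / XI)


def h(u):
    """Hypothetical pdf on [0, 1]."""
    return 0.5 + u


def H(u):
    """CDF of h."""
    return 0.5 * u + 0.5 * u * u


def log_lik(data):
    """Log-likelihood for a list of (value, censored) pairs.

    censored=True means only Y > value is known (right censoring).
    """
    total = 0.0
    for y, censored in data:
        u = G(y)
        if censored:
            total += math.log(1.0 - H(u))            # survival term 1 - F(y)
        else:
            total += math.log(g(y)) + math.log(h(u))  # density term f(y)
    return total


sample = [(1.0, False), (2.5, False), (4.0, True)]
print(log_lik(sample))
```

The only difference between the two branches is the single line that swaps the density term for the survival term.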
35. 28/30
Analysis of species abundance with zeros
US National Forest Inventory data from 1211 sites
Response = relative basal area of black cherry trees. Several covariates measured. Response is zero at sites with no black cherry trees
Illustrate with covariate “Winter Temperature”