1. Review
Measuring Quality
Bandwidth Selection
Multivariate Density Estimation
Nonparametric Econometrics
Kernel Methods for Density Estimation
James Nordlund
April 21, 2011
Nordlund Nonparametric Econometrics
Example Problem
How useful are kernel density estimates?
How many sample observations should we have?
Are kernel functions always reliable or did I just provide
one lucky example?
Modes of Convergence
Convergence in rth Mean
Big O notation
Definitions
Definition (Convergence in rth Mean)
We say that $x_n$ converges to $X$ in the $r$th mean if, for some $r > 0$,
$$\lim_{n \to \infty} E\left[\|x_n - X\|^{r}\right] = 0.$$
We write this as $x_n \xrightarrow{r\text{th}} X$.
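As a small worked illustration (not from the slides), take $r = 2$ and let $x_n$ be the sample mean $\bar{X}_n$ of $n$ i.i.d. draws with mean $\mu$ and finite variance $\sigma^2$, so that the limit $X$ is the constant $\mu$:

```latex
\[
E\big[\,|\bar{X}_n - \mu|^{2}\,\big] = \operatorname{Var}(\bar{X}_n) = \frac{\sigma^{2}}{n}
\;\longrightarrow\; 0 \quad \text{as } n \to \infty .
\]
```

The $r = 2$ case is also what a mean squared error measures: if $MSE(\hat{f}(x)) \to 0$, then $\hat{f}(x)$ converges to $f(x)$ in the 2nd mean.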
Definitions
Definition (Order: Big O)
For a positive integer $n$, we write $a_n = O(1)$ if, as $n \to \infty$, $a_n$
remains bounded, i.e., $|a_n| \le C$ for some constant $C$ and for all
large values of $n$ ($a_n$ is a bounded sequence). Similarly, we write
$a_n = O(b_n)$ if $a_n / b_n = O(1)$, or equivalently $|a_n| \le C|b_n|$, for some
constant $C$ and for all $n$ sufficiently large.
Main Theorem
Theorem
Let $X_1, X_2, \ldots, X_n$ denote independent, identically distributed
observations with a twice differentiable p.d.f., $f(x)$, and let
$f^{(s)}(x)$ denote the $s$th order derivative of $f(x)$ ($s = 1, 2$). Let $x$
be an interior point in the support of $X$, and let
$$\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} k\left(\frac{X_i - x}{h}\right).$$
Assume that the kernel function, $k(\cdot)$, is bounded and has $\mu_2 < \infty$. Assume that
$\sup_{\xi \in S(X)} |f^{(l)}(\xi)| < \infty$ for $l = 0, 1, 2$, where $S(X)$ denotes the
support of $X$. Assume that $\int |u^3 k(u)|\,du < \infty$. Also assume that, as $n \to \infty$,
$h \to 0$ and $nh \to \infty$. Then
$$MSE(\hat{f}(x)) = O\left(h^4 + \frac{1}{nh}\right).$$
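The estimator in the theorem can be sketched in a few lines. This is a minimal illustration, assuming a Gaussian kernel for $k$ and simulated standard normal data; the function names, the seed, and the bandwidth `h=0.3` are illustrative choices, not part of the original slides.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel k(u) (an assumption)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kde(x, data, h):
    """Evaluate f_hat(x) = (1/(n h)) * sum_i k((X_i - x) / h) at each point in x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    data = np.asarray(data, dtype=float)
    # Rows index evaluation points, columns index observations.
    u = (data[None, :] - x[:, None]) / h
    return gaussian_kernel(u).sum(axis=1) / (data.size * h)

rng = np.random.default_rng(0)      # fixed seed so the sketch is reproducible
sample = rng.normal(size=500)       # hypothetical data: 500 draws from N(0, 1)
grid = np.linspace(-6.0, 6.0, 1201)
f_hat = kde(grid, sample, h=0.3)    # h = 0.3 is an arbitrary illustrative bandwidth

# A density estimate built from a proper kernel should integrate to about 1.
area = f_hat.sum() * (grid[1] - grid[0])
```

Broadcasting the data against the grid keeps the code close to the formula: each row of `u` holds the scaled distances $(X_i - x)/h$ for one evaluation point.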
Inside the Proof
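Recall that
$$MSE(\hat{f}(x)) = E\left\{[\hat{f}(x) - f(x)]^2\right\} = Var(\hat{f}(x)) + Bias(\hat{f}(x))^2.$$
Along the proof, we obtain
$$Bias(\hat{f}(x)) = \frac{h^2}{2} f^{(2)}(x) \int u^2 k(u)\,du + O(h^3)$$
and
$$Var(\hat{f}(x)) = \frac{1}{nh}\left\{ f(x) \int k(u)^2\,du + O(h) \right\}.$$
Notice that there is a trade-off between minimizing variance
and bias: a smaller $h$ shrinks the bias (of order $h^2$) but inflates the variance (of order $1/(nh)$).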
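The trade-off can be checked by simulation. The following is a rough Monte Carlo sketch, assuming a Gaussian kernel and standard normal data; the sample size, number of replications, and the two bandwidths are arbitrary illustrative values, not taken from the slides.

```python
import numpy as np

def kde_at(x0, data, h):
    """Gaussian-kernel estimate f_hat(x0) = (1/(n h)) * sum_i k((X_i - x0) / h)."""
    u = (data - x0) / h
    return np.exp(-0.5 * u**2).sum() / (np.sqrt(2.0 * np.pi) * data.size * h)

def simulate_bias_var(h, n=200, reps=500, x0=0.0, seed=0):
    """Monte Carlo bias and variance of f_hat(x0) when the data are N(0, 1)."""
    rng = np.random.default_rng(seed)
    estimates = np.array([kde_at(x0, rng.normal(size=n), h) for _ in range(reps)])
    true_value = np.exp(-0.5 * x0**2) / np.sqrt(2.0 * np.pi)   # N(0, 1) density at x0
    return estimates.mean() - true_value, estimates.var()

bias_small, var_small = simulate_bias_var(h=0.1)   # undersmoothing: low bias, high variance
bias_large, var_large = simulate_bias_var(h=1.0)   # oversmoothing: high bias, low variance
```

Under these assumptions the small bandwidth should show the larger variance and the large bandwidth the larger absolute bias, which is the pattern the two expansions predict.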
How do we balance variance and bias?
Important Tools
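$$ISE(h) = \int \left[\hat{f}(x) - f(x)\right]^2 dx$$
$$MISE(h) = E \int \left[\hat{f}(x) - f(x)\right]^2 dx$$
$$AMISE(h) = \frac{1}{nh} \int k(x)^2\,dx + \frac{1}{4} h^4 \int f^{(2)}(x)^2\,dx \left(\int x^2 k(x)\,dx\right)^2$$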
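To see how the two AMISE terms pull against each other, the expression can be evaluated numerically. This sketch assumes a Gaussian kernel and a true density $N(0, \sigma^2)$, for which the three integrals have the closed forms noted in the comments; those closed forms and all names here are assumptions made for illustration, not something stated on the slide.

```python
import numpy as np

def amise_gaussian(h, n, sigma=1.0):
    """AMISE(h) for a Gaussian kernel when the true density is N(0, sigma^2)."""
    roughness_k = 1.0 / (2.0 * np.sqrt(np.pi))               # integral of k(x)^2
    second_moment_k = 1.0                                    # integral of x^2 k(x)
    roughness_f2 = 3.0 / (8.0 * np.sqrt(np.pi) * sigma**5)   # integral of f''(x)^2
    variance_term = roughness_k / (n * h)                    # falls as h grows
    bias_term = 0.25 * h**4 * roughness_f2 * second_moment_k**2   # rises as h grows
    return variance_term + bias_term

n = 1000                                     # hypothetical sample size
h_grid = np.linspace(0.05, 1.0, 1901)        # candidate bandwidths
h_best = h_grid[np.argmin(amise_gaussian(h_grid, n))]
```

With these inputs the grid minimizer should sit close to $(4/(3n))^{1/5}$, the closed-form minimizer for this particular kernel and density.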
Optimal h
$$h_{opt,AMISE} = \left[ \frac{\int k(x)^2\,dx}{n \int f^{(2)}(x)^2\,dx \left(\int x^2 k(x)\,dx\right)^2} \right]^{1/5}$$
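The optimal bandwidth follows from the first-order condition of AMISE with respect to $h$. A short derivation, written out here as a sketch of the step the slide skips:

```latex
\begin{align*}
\frac{d}{dh}\,\mathrm{AMISE}(h)
  &= -\frac{1}{nh^{2}} \int k(x)^{2}\,dx
     + h^{3} \int f^{(2)}(x)^{2}\,dx \left(\int x^{2} k(x)\,dx\right)^{2} = 0 \\
\Longrightarrow\quad h^{5}
  &= \frac{\int k(x)^{2}\,dx}
          {n \int f^{(2)}(x)^{2}\,dx \left(\int x^{2} k(x)\,dx\right)^{2}} .
\end{align*}
```

Because the solution is proportional to $n^{-1/5}$, both the variance term $1/(nh)$ and the squared-bias term $h^4$ are of order $n^{-4/5}$ at the optimum.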
Rule of Thumb Methods
A popular method is to assume that the unknown density f is
normal. Then we know what S(α) should
look like. This gives
$$h_{ROT} \approx 1.06\,\hat{\sigma}\,n^{-1/5},$$
where $\hat{\sigma}$ is the sample standard deviation.
Of course, if we knew what f looked like, we'd stick to
parametric estimation techniques.
Importantly, $h_{ROT}$ is close to optimal for symmetric, unimodal
densities.
In this case, we call $h_{ROT}$ the normal reference rule of thumb.
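The rule of thumb is simple enough to compute directly. The sketch below assumes a Gaussian kernel, in which case the constant 1.06 is a rounding of $(4/3)^{1/5}$; the function name and the simulated data are hypothetical.

```python
import numpy as np

def rule_of_thumb_bandwidth(data):
    """Normal reference rule of thumb: h = 1.06 * sigma_hat * n^(-1/5)."""
    data = np.asarray(data, dtype=float)
    sigma_hat = data.std(ddof=1)          # sample standard deviation
    return 1.06 * sigma_hat * data.size ** (-0.2)

# Where 1.06 comes from: with a Gaussian kernel and f = N(0, sigma^2),
# the AMISE-optimal bandwidth is (4/3)^(1/5) * sigma * n^(-1/5).
constant = (4.0 / 3.0) ** 0.2

rng = np.random.default_rng(1)
sample = rng.normal(loc=2.0, scale=3.0, size=400)   # hypothetical data
h_rot = rule_of_thumb_bandwidth(sample)
```

The bandwidth scales with the spread of the data and shrinks slowly with $n$, so quadrupling the sample size reduces it by only about a quarter.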
Plug-In Methods
The plug-in method is a two-step process:
(1) Find $h_{ROT}$ (usually just by taking the normal reference
rule of thumb).
(2) Use $h_{ROT}$ to estimate $\int f^{(2)}(x)^2\,dx$ in
$$\left[ \frac{\int k(x)^2\,dx}{n \int f^{(2)}(x)^2\,dx \left(\int x^2 k(x)\,dx\right)^2} \right]^{1/5}$$
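The two steps can be sketched as follows. The slides do not say how the integral of $f^{(2)}(x)^2$ is estimated, so this example makes its own assumptions: a Gaussian kernel, a kernel estimate of the second derivative using the rule-of-thumb bandwidth as the pilot, and a simple Riemann sum on a grid. Published plug-in selectors choose the pilot bandwidth more carefully; treat this as a naive illustration only.

```python
import numpy as np

def phi(u):
    """Standard normal density (the Gaussian kernel, an assumption)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def plug_in_bandwidth(data, grid_size=1000):
    """Two-step plug-in bandwidth for a Gaussian kernel (illustrative sketch)."""
    data = np.asarray(data, dtype=float)
    n = data.size

    # Step 1: pilot bandwidth from the normal reference rule of thumb.
    h_rot = 1.06 * data.std(ddof=1) * n ** (-0.2)

    # Step 2: estimate the integral of f''(x)^2 from a kernel estimate of f''.
    # For the Gaussian kernel, k''(u) = (u^2 - 1) * k(u).
    grid = np.linspace(data.min() - 4 * h_rot, data.max() + 4 * h_rot, grid_size)
    u = (data[None, :] - grid[:, None]) / h_rot
    f2_hat = ((u**2 - 1.0) * phi(u)).sum(axis=1) / (n * h_rot**3)
    roughness_f2 = (f2_hat**2).sum() * (grid[1] - grid[0])

    # Plug the estimate into the AMISE-optimal formula.
    roughness_k = 1.0 / (2.0 * np.sqrt(np.pi))   # integral of k(x)^2
    second_moment_k = 1.0                        # integral of x^2 k(x)
    h_plug = (roughness_k / (n * roughness_f2 * second_moment_k**2)) ** 0.2
    return h_rot, h_plug

rng = np.random.default_rng(2)
sample = rng.normal(size=2000)                   # hypothetical data
h_rot, h_plug = plug_in_bandwidth(sample)
```

For normal data the two bandwidths should be of similar size, since the rule of thumb is already built for that case; the plug-in step matters more when the density is far from normal.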
This improves the asymptotic rate of convergence of the kernel
estimate.
Further iterations add no additional benefit.
To be fair, we still make some assumption about the form of f
when we use the rule of thumb method.
However, the assumption has less influence on our estimate,
and in applied settings the plug-in method does fairly well.
More data-driven methods exist (e.g., least squares
cross-validation), but these can have a very slow rate of
convergence.
What about multivariate density?
Multivariate Kernel
Univariate:
$$\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} k\left(\frac{X_i - x}{h}\right)$$
Multivariate:
$$\hat{f}(x) = \frac{1}{n h_1 h_2 \cdots h_q} \sum_{i=1}^{n} K\left(\frac{X_i - x}{h}\right)$$
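One common way to build the multivariate kernel $K$ is as a product of univariate kernels, one per coordinate, each with its own bandwidth $h_s$. The sketch below assumes that product form with Gaussian factors; the function name, data, and bandwidths are hypothetical.

```python
import numpy as np

def product_kernel_kde(x, data, h):
    """f_hat(x) = (1/(n h_1...h_q)) * sum_i prod_s k((X_is - x_s) / h_s)."""
    x = np.asarray(x, dtype=float)          # evaluation point, shape (q,)
    data = np.asarray(data, dtype=float)    # observations, shape (n, q)
    h = np.asarray(h, dtype=float)          # one bandwidth per dimension, shape (q,)
    u = (data - x) / h                      # broadcasts to shape (n, q)
    k = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)   # Gaussian kernel per coordinate
    return k.prod(axis=1).sum() / (data.shape[0] * h.prod())

rng = np.random.default_rng(3)
sample = rng.normal(size=(5000, 2))         # hypothetical data: bivariate N(0, I)
f_hat_origin = product_kernel_kde([0.0, 0.0], sample, h=[0.3, 0.3])
```

For this simulated sample the true density at the origin is $1/(2\pi)$, so the estimate should land in that neighborhood, slightly below it because smoothing flattens the peak.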
Multivariate Properties
Univariate:
$$MSE(\hat{f}(x)) = O\left(h^4 + \frac{1}{nh}\right)$$
Multivariate:
$$MSE(\hat{f}(x)) = O\left(\left(\sum_{s=1}^{q} h_s^2\right)^2 + (n h_1 \cdots h_q)^{-1}\right)$$
Same trade-off between minimizing bias and variance.
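A worked special case shows how the dimension $q$ enters. Assume, for illustration only, a common bandwidth $h_s = h$ in every dimension, and take the squared-bias term as $\left(\sum_s h_s^2\right)^2$, which reduces to $h^4$ when $q = 1$:

```latex
\begin{align*}
MSE(\hat{f}(x)) &= O\!\left(q^{2} h^{4} + \frac{1}{n h^{q}}\right), \\
h^{4} \propto \frac{1}{n h^{q}}
  \;&\Longrightarrow\; h \propto n^{-1/(q+4)}
  \;\Longrightarrow\; MSE(\hat{f}(x)) = O\!\left(n^{-4/(q+4)}\right).
\end{align*}
```

With $q = 1$ this recovers the univariate rate $n^{-4/5}$; as $q$ grows the rate slows, which is why multivariate density estimation needs far more observations.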
Real Example
DiNardo and Tobias (2001): growth in female wage inequality.
Parametric methods missed the sharp lower bound from the minimum
wage in 1979.
Dr. Olofsson and Trinity Mathematics
Härdle, W. and Linton, O. (1994). Applied Nonparametric
Methods. Handbook of Econometrics. 2297-2339.
Jones, M., Marron, J., and Sheather, S. (1996). A Brief Survey
of Bandwidth Selection for Density Estimation. Journal of
the American Statistical Association. Vol. 91, No. 433.
401-407.
Li, Q. and Racine, J. (2007). Nonparametric Econometrics.