Review
             Measuring Quality
           Bandwidth Selection
Multivariate Density Estimation




      Nonparametric Econometrics
     Kernel Methods for Density Estimation


                     James Nordlund


                       April 21, 2011




                      Nordlund    Nonparametric Econometrics



Example Problem








How useful are kernel density estimates?
    How many sample observations should we have?
    Are kernel estimates always reliable, or did I just provide
    one lucky example?







Modes of Convergence




      Convergence in rth Mean

      Big O notation







Definitions



  Definition (Convergence in rth Mean)
  We say that $x_n$ converges to $X$ in the rth mean if, for some
  $r > 0$,
  \[ \lim_{n \to \infty} E\left[ \lVert x_n - X \rVert^r \right] = 0 . \]
  We write this as $x_n \xrightarrow{r\text{th}} X$.
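As a worked illustration (not taken from the slides), take $r = 2$ and let $\bar{X}_n$ be the mean of $n$ i.i.d. draws with mean $\mu$ and finite variance $\sigma^2$. These symbols are assumptions introduced only for this example.

```latex
\[
E\left[ (\bar{X}_n - \mu)^2 \right]
  = \operatorname{Var}(\bar{X}_n)
  = \frac{\sigma^2}{n}
  \longrightarrow 0
  \quad \text{as } n \to \infty ,
\]
```

so $\bar{X}_n$ converges to $\mu$ in the 2nd mean. Applying the same $r = 2$ criterion to an estimator $\hat{f}(x)$ and its target $f(x)$ gives the mean squared error.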







Definitions



  Definition (Order: Big O)
  For a positive integer $n$, we write $a_n = O(1)$ if, as $n \to \infty$, $a_n$
  remains bounded, i.e., $|a_n| \le C$ for some constant $C$ and for all
  large values of $n$ ($a_n$ is a bounded sequence). Similarly, we write
  $a_n = O(b_n)$ if $a_n / b_n = O(1)$, or equivalently $|a_n| \le C |b_n|$, for some
  constant $C$ and for all $n$ sufficiently large.
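A small worked example of the definition, with a sequence chosen purely for illustration:

```latex
\[
a_n = \frac{5}{n} + \frac{2}{n^2},
\qquad
\frac{a_n}{1/n} = 5 + \frac{2}{n} \le 7 \quad \text{for all } n \ge 1 ,
\]
```

so $a_n = O(1/n)$ with $C = 7$. The slower term $5/n$ sets the order, and the faster-vanishing term $2/n^2$ is absorbed into the constant.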







Main Theorem
  Theorem
  Let $X_1, X_2, \ldots, X_n$ denote independent, identically distributed
  observations with a twice differentiable p.d.f. $f(x)$, and let
  $f^{(s)}(x)$ denote the $s$th order derivative of $f(x)$ ($s = 1, 2$). Let $x$
  be an interior point in the support of $X$, and let
  \[ \hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} k\left( \frac{X_i - x}{h} \right) . \]
  Assume that the kernel function $k(\cdot)$ is bounded and has $\mu_2 < \infty$.
  Assume that $\sup_{\xi \in S(X)} |f^{(l)}(\xi)| < \infty$ for $l = 0, 1, 2$, where
  $S(X)$ denotes the support of $X$. Assume that $\int |u^3 k(u)| \, du < \infty$.
  Also assume that, as $n \to \infty$, $h \to 0$ and $nh \to \infty$. Then
  \[ MSE(\hat{f}(x)) = O\left( h^4 + \frac{1}{nh} \right) . \]
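The estimator in the theorem can be sketched in a few lines. This is a minimal illustration, assuming a Gaussian kernel (the slides do not fix a particular kernel); the names `gaussian_kernel` and `kde` and the bandwidth value are hypothetical choices made for the example.

```python
import math
import random
from typing import Sequence


def gaussian_kernel(u: float) -> float:
    """Standard normal density, used as the kernel k(u) (an assumed choice)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x: float, sample: Sequence[float], h: float) -> float:
    """Evaluate f_hat(x) = (1 / (n h)) * sum_i k((X_i - x) / h)."""
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    n = len(sample)
    return sum(gaussian_kernel((xi - x) / h) for xi in sample) / (n * h)


# Illustration: estimate a standard normal density at x = 0 from simulated data.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(1000)]
estimate = kde(0.0, sample, h=0.3)  # h = 0.3 is an arbitrary illustrative value
truth = gaussian_kernel(0.0)        # true N(0, 1) density at 0
print(f"f_hat(0) = {estimate:.4f}, f(0) = {truth:.4f}")
```

Because the kernel integrates to one, the estimate is itself a density for any sample and any positive `h`.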





Inside the Proof
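   Recall that
   \[ MSE(\hat{f}(x)) = E\left[ (\hat{f}(x) - f(x))^2 \right] = Var(\hat{f}(x)) + Bias(\hat{f}(x))^2 . \]
   Along the proof, we obtain
   \[ Bias(\hat{f}(x)) = \frac{h^2}{2} f^{(2)}(x) \int u^2 k(u) \, du + O(h^3) \]
   and
   \[ Var(\hat{f}(x)) = \frac{1}{nh} \left[ f(x) \int k(u)^2 \, du + O(h) \right] . \]
   Notice the trade-off between variance and bias: a smaller $h$ shrinks the
   bias, which is of order $h^2$, but inflates the variance, which is of
   order $1/(nh)$.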
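The trade-off can be seen in a small Monte Carlo experiment. The sketch below assumes a Gaussian kernel and standard normal data, and estimates the bias and variance of the kernel estimate at a single point for several bandwidths; the sample size, replication count, and function names are illustrative assumptions, not part of the original slides.

```python
import math
import random
from typing import Sequence, Tuple


def gaussian_kernel(u: float) -> float:
    """Standard normal density, used as the kernel (an assumed choice)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x: float, sample: Sequence[float], h: float) -> float:
    """f_hat(x) = (1 / (n h)) * sum_i k((X_i - x) / h)."""
    return sum(gaussian_kernel((xi - x) / h) for xi in sample) / (len(sample) * h)


def simulate_bias_variance(h: float, x: float = 0.0, n: int = 200,
                           reps: int = 200, seed: int = 0) -> Tuple[float, float]:
    """Monte Carlo bias and variance of f_hat(x) when the data are N(0, 1)."""
    rng = random.Random(seed)
    true_density = gaussian_kernel(x)  # N(0, 1) density at x
    estimates = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        estimates.append(kde(x, sample, h))
    mean_estimate = sum(estimates) / reps
    variance = sum((e - mean_estimate) ** 2 for e in estimates) / (reps - 1)
    return mean_estimate - true_density, variance


for h in (0.1, 0.4, 1.0):
    bias, variance = simulate_bias_variance(h)
    print(f"h = {h:.1f}: bias = {bias:+.4f}, variance = {variance:.6f}")
```

Under these assumptions the smallest bandwidth should show the largest variance and the largest bandwidth the largest (negative) bias, since oversmoothing flattens the peak of the density at zero.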





How do we balance variance and bias?







Important Tools



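  \[ ISE(h) = \int \left[ \hat{f}(x) - f(x) \right]^2 dx \]

  \[ MISE(h) = E \int \left[ \hat{f}(x) - f(x) \right]^2 dx \]

  \[ AMISE(h) = \frac{1}{nh} \int k(x)^2 \, dx + \frac{h^4}{4} \int f^{(2)}(x)^2 \, dx \left( \int x^2 k(x) \, dx \right)^2 \]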
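To see how the two AMISE terms pull in opposite directions, the sketch below evaluates AMISE(h) on a few bandwidths. It assumes a Gaussian kernel, for which the integral of k squared is 1/(2 sqrt(pi)) and the second moment is 1, and a standard normal f, for which the integral of the squared second derivative is 3/(8 sqrt(pi)). The function and constant names are hypothetical.

```python
import math


def amise(h: float, n: int, roughness_k: float, mu2_k: float,
          roughness_f2: float) -> float:
    """AMISE(h) = R(k) / (n h) + (h^4 / 4) * R(f'') * mu2(k)^2.

    roughness_k  : integral of k(x)^2 dx
    mu2_k        : integral of x^2 k(x) dx
    roughness_f2 : integral of f''(x)^2 dx
    """
    variance_term = roughness_k / (n * h)
    bias_term = 0.25 * h ** 4 * roughness_f2 * mu2_k ** 2
    return variance_term + bias_term


# Constants for a Gaussian kernel and a standard normal f (assumptions).
ROUGHNESS_K = 1.0 / (2.0 * math.sqrt(math.pi))
MU2_K = 1.0
ROUGHNESS_F2 = 3.0 / (8.0 * math.sqrt(math.pi))

for h in (0.05, 0.1, 0.2, 0.3, 0.5, 1.0):
    value = amise(h, 500, ROUGHNESS_K, MU2_K, ROUGHNESS_F2)
    print(f"h = {h:.2f}: AMISE = {value:.5f}")
```

The printed values fall and then rise again: very small bandwidths are penalised by the 1/(nh) variance term and very large ones by the h^4 bias term.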







Optimal h




  \[ h_{opt,AMISE} = \left[ \frac{\int k(x)^2 \, dx}{n \int f^{(2)}(x)^2 \, dx \left( \int x^2 k(x) \, dx \right)^2} \right]^{1/5} \]
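The optimal bandwidth is the minimiser of AMISE(h). A short derivation from the first-order condition, using only the AMISE expression from the Important Tools slide:

```latex
\begin{align*}
\frac{d}{dh} AMISE(h)
  &= -\frac{1}{n h^2} \int k(x)^2 \, dx
     + h^3 \int f^{(2)}(x)^2 \, dx \left( \int x^2 k(x) \, dx \right)^2 = 0 \\
\Longrightarrow \quad
h^5 &= \frac{\int k(x)^2 \, dx}
            {n \int f^{(2)}(x)^2 \, dx \left( \int x^2 k(x) \, dx \right)^2} .
\end{align*}
```

Taking the fifth root gives the optimal bandwidth, which is proportional to $n^{-1/5}$. At that bandwidth both AMISE terms are of order $n^{-4/5}$.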







Rule of Thumb Methods

  A popular method is to assume that the unknown density $f$ is
  normal. Then we know what $\hat{S}(\alpha)$ should
  look like. This gives

  \[ h_{ROT} \approx 1.06 \, \hat{\sigma} \, n^{-1/5} \]

  Of course, if we knew what $f$ looked like, we'd stick to
  parametric estimation techniques.

  Importantly, $h_{ROT}$ is close to optimal for symmetric, unimodal
  densities.

  In this case, we call $h_{ROT}$ the normal reference rule of thumb.
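The constant 1.06 can be recovered numerically. The sketch below assumes a Gaussian kernel and a normal density with standard deviation sigma, for which the integral of the squared second derivative is 3/(8 sqrt(pi) sigma^5); plugging these into the AMISE-optimal bandwidth gives (4/3)^(1/5) * sigma * n^(-1/5). The function names are hypothetical, and the sample standard deviation is used as the estimate of sigma.

```python
import math
import statistics
from typing import Sequence


def normal_reference_constant() -> float:
    """Constant c in h = c * sigma * n^(-1/5) for a Gaussian kernel and normal f."""
    roughness_k = 1.0 / (2.0 * math.sqrt(math.pi))        # integral of k^2
    mu2_k = 1.0                                           # integral of x^2 k
    roughness_f2_unit = 3.0 / (8.0 * math.sqrt(math.pi))  # integral of f''^2 at sigma = 1
    return (roughness_k / (roughness_f2_unit * mu2_k ** 2)) ** 0.2


def rule_of_thumb_bandwidth(sample: Sequence[float]) -> float:
    """Normal reference rule of thumb: 1.06 * sigma_hat * n^(-1/5)."""
    n = len(sample)
    sigma_hat = statistics.stdev(sample)  # sample standard deviation
    return 1.06 * sigma_hat * n ** (-0.2)


print(f"derived constant: {normal_reference_constant():.4f}")
print(f"h_ROT for a toy sample: {rule_of_thumb_bandwidth([1.0, 2.0, 3.0, 4.0, 5.0]):.4f}")
```

The derived constant equals (4/3)^(1/5), which rounds to the 1.06 used on the slide.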





Plug-In Methods


  The plug-in method is a two-step process:

      Find $h_{ROT}$ (usually just by taking the normal reference
      rule of thumb)

      Use $h_{ROT}$ to estimate $\int f^{(2)}(x)^2 \, dx$ in

      \[ \left[ \frac{\int k(x)^2 \, dx}{n \int f^{(2)}(x)^2 \, dx \left( \int x^2 k(x) \, dx \right)^2} \right]^{1/5} \]
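The two steps can be sketched as follows. This is one possible reading of the procedure, not necessarily the author's implementation: it assumes a Gaussian kernel, uses the rule-of-thumb bandwidth as the pilot, estimates the integral of f'' squared by numerically integrating the squared second derivative of the pilot kernel estimate on a grid, and then plugs that value into the bandwidth formula. All function names and the grid settings are hypothetical.

```python
import math
import random
import statistics
from typing import Sequence, Tuple

SQRT_2PI = math.sqrt(2.0 * math.pi)


def kde_second_derivative(x: float, sample: Sequence[float], g: float) -> float:
    """f_hat''(x) = (1 / (n g^3)) * sum_i phi''((X_i - x) / g), phi''(u) = (u^2 - 1) phi(u)."""
    total = 0.0
    for xi in sample:
        u = (xi - x) / g
        total += (u * u - 1.0) * math.exp(-0.5 * u * u) / SQRT_2PI
    return total / (len(sample) * g ** 3)


def estimate_roughness_f2(sample: Sequence[float], g: float,
                          grid_points: int = 400) -> float:
    """Midpoint-rule approximation of the integral of f_hat''(x)^2 dx."""
    lo = min(sample) - 5.0 * g
    hi = max(sample) + 5.0 * g
    step = (hi - lo) / grid_points
    total = 0.0
    for j in range(grid_points):
        x = lo + (j + 0.5) * step
        total += kde_second_derivative(x, sample, g) ** 2
    return total * step


def plug_in_bandwidth(sample: Sequence[float]) -> Tuple[float, float]:
    """Return (h_rot, h_plug_in) for a Gaussian kernel."""
    n = len(sample)
    h_rot = 1.06 * statistics.stdev(sample) * n ** (-0.2)  # step 1: pilot bandwidth
    roughness_f2 = estimate_roughness_f2(sample, h_rot)    # step 2: estimate integral of f''^2
    roughness_k = 1.0 / (2.0 * math.sqrt(math.pi))         # integral of k^2
    mu2_k = 1.0                                            # integral of x^2 k
    h_plug_in = (roughness_k / (n * roughness_f2 * mu2_k ** 2)) ** 0.2
    return h_rot, h_plug_in


rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(200)]
h_rot, h_plug_in = plug_in_bandwidth(data)
print(f"h_ROT = {h_rot:.4f}, plug-in h = {h_plug_in:.4f}")
```

Practical plug-in selectors usually choose a separate pilot bandwidth tuned for derivative estimation; reusing the rule-of-thumb value here simply mirrors the two steps listed above.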







Plug-In Methods




  This improves the asymptotic rate of convergence of the kernel
  estimator

  Further iterations bring no additional benefit








To be fair, we still make some assumption about the form of f,
because the first step uses the rule of thumb method

However, the assumption has less influence on our estimation,
and in applied settings the plug-in method does fairly well

More data-driven methods exist (e.g. Least Squares
Cross-Validation), but these can have a very slow rate of
convergence
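For comparison, a least squares cross-validation criterion can be sketched as below. This is a hedged illustration assuming a Gaussian kernel: the score is the integral of the squared estimate minus twice the average leave-one-out estimate at the data points, and the bandwidth is picked by a simple grid search. The function names and the grid are hypothetical.

```python
import math
import random
from typing import Sequence


def gaussian_kernel(u: float) -> float:
    """Standard normal density, used as the kernel (an assumed choice)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def convolution_kernel(v: float) -> float:
    """Convolution of the Gaussian kernel with itself: the N(0, 2) density."""
    return math.exp(-0.25 * v * v) / math.sqrt(4.0 * math.pi)


def lscv_score(sample: Sequence[float], h: float) -> float:
    """CV(h) = integral of f_hat^2 minus (2 / n) * sum_i f_hat_{-i}(X_i)."""
    n = len(sample)
    integral_sq = 0.0
    leave_one_out = 0.0
    for i in range(n):
        for j in range(n):
            v = (sample[i] - sample[j]) / h
            integral_sq += convolution_kernel(v)
            if i != j:
                leave_one_out += gaussian_kernel(v)
    integral_sq /= n * n * h
    leave_one_out /= n * (n - 1) * h
    return integral_sq - 2.0 * leave_one_out


def lscv_bandwidth(sample: Sequence[float], grid: Sequence[float]) -> float:
    """Grid value with the smallest cross-validation score."""
    return min(grid, key=lambda h: lscv_score(sample, h))


rng = random.Random(2)
data = [rng.gauss(0.0, 1.0) for _ in range(100)]
grid = [0.1 + 0.05 * i for i in range(15)]
print(f"LSCV bandwidth on the grid: {lscv_bandwidth(data, grid):.2f}")
```

The criterion uses no reference density at all, which is what makes it more data-driven than the rule of thumb or plug-in approaches.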








What about multivariate density?







Multivariate Kernel


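   Univariate:
   \[ \hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} k\left( \frac{X_i - x}{h} \right) \]

   Multivariate:
   \[ \hat{f}(x) = \frac{1}{n h_1 h_2 \cdots h_q} \sum_{i=1}^{n} K\left( \frac{X_i - x}{h} \right) \]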
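A minimal sketch of the multivariate estimator, assuming that K is a product kernel, i.e. the product over coordinates s of k((X_is - x_s) / h_s) with a Gaussian k. The slides do not spell out the form of K, so the product form and the function names below are assumptions made for illustration.

```python
import math
from typing import Sequence


def gaussian_kernel(u: float) -> float:
    """Standard normal density, used as the univariate kernel k (an assumed choice)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def product_kde(x: Sequence[float], sample: Sequence[Sequence[float]],
                bandwidths: Sequence[float]) -> float:
    """f_hat(x) = (1 / (n h_1 ... h_q)) * sum_i prod_s k((X_is - x_s) / h_s)."""
    n = len(sample)
    total = 0.0
    for xi in sample:
        total += math.prod(
            gaussian_kernel((xis - xs) / hs)
            for xis, xs, hs in zip(xi, x, bandwidths)
        )
    return total / (n * math.prod(bandwidths))


# Illustration with q = 2: three observations, one bandwidth per coordinate.
points = [(0.0, 0.0), (1.0, 0.5), (-0.5, 1.0)]
print(f"f_hat(0, 0) = {product_kde((0.0, 0.0), points, (0.8, 0.6)):.4f}")
```

With q = 1 the function reduces to the univariate estimator, which is a quick way to check an implementation.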







Multivariate Properties

   Univariate:
   \[ MSE(\hat{f}(x)) = O\left( h^4 + \frac{1}{nh} \right) \]
   Multivariate:
   \[ MSE(\hat{f}(x)) = O\left( \left( \sum_{s=1}^{q} h_s^2 \right)^2 + (n h_1 \cdots h_q)^{-1} \right) \]

   Same trade-off between minimizing bias and variance
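A short worked consequence, assuming a common bandwidth $h_1 = \cdots = h_q = h$ and taking the squared-bias term to be of order $\left( \sum_{s=1}^{q} h_s^2 \right)^2$, which reduces to the univariate $h^4$ when $q = 1$:

```latex
\begin{align*}
MSE(\hat{f}(x)) &= O\left( h^4 + \frac{1}{n h^q} \right), \\
h^4 \propto \frac{1}{n h^q}
  \;&\Longrightarrow\; h \propto n^{-1/(q+4)}
  \;\Longrightarrow\; MSE(\hat{f}(x)) = O\left( n^{-4/(q+4)} \right) .
\end{align*}
```

For $q = 1$ this recovers the $n^{-4/5}$ rate of the univariate case. The rate slows as $q$ grows, which is the usual curse of dimensionality for kernel estimators.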







Real Example




  DiNardo and Tobias (2001) - growth in female wage inequality

  Parametric methods missed the sharp lower bound created by the
  minimum wage in 1979








Dr. Olofsson and Trinity Mathematics
Härdle, W. and Linton, O. (1994). Applied Nonparametric
Methods. Handbook of Econometrics. 2297-2339.
Jones, M., Marron, J. and Sheather, S. (1996). A Brief Survey
of Bandwidth Selection for Density Estimation. Journal of
the American Statistical Association. Vol 91. No. 433.
401-407.
Li, Q. and Racine, J. (2007). Nonparametric Econometrics.



