In many statistical problems, estimators are derived by maximizing a data-dependent function of the parameter, usually in the form of an empirical average, e.g. maximum likelihood and least squares. For asymptotic studies, ordinary calculus fails if the parameter is non-Euclidean or has an infinite-dimensional component. In such cases, it becomes useful to capitalize on general M-estimation theorems formulated in terms of Banach-space-valued parameters. Although these theorems are built on sophisticated empirical process results and arguments, the underlying ideas are usually simple enough to grasp. For example, if the objective function converges in probability to a limit function, then the estimator should also tend in probability to the maximizer of that limit function (consistency). Likewise, if a centered and re-scaled objective function, taking a centered and re-scaled parameter as argument, converges weakly to a tight limit process, then the centered and re-scaled estimator should also converge weakly to the (hopefully tight) maximizer of that limit process. Along this line, it is clear that the proper re-scaling rate on the parameter (argument) should render the standardized objective function (in the form of an empirical process) asymptotically tight. Therefore, it is not surprising that the re-scaling rate is intimately related to the "modulus of continuity" of the empirical process. This is essentially the spirit of rate theorems such as Theorem 3.2.5 of van der Vaart and Wellner (1996) and Theorem 7.4 of van de Geer (2000), where technical tools such as the "peeling device" are employed to make the mathematics go through.
This presentation tries to deliver the intuitive ideas behind the general M-estimation theory. For a complete and rigorous treatment, please refer to the relevant parts of van der Vaart and Wellner (1996) and van de Geer (2000), among others. An application to one-sample estimation with current status data is briefly described; for details, please refer to Groeneboom and Wellner (1992).
The slides were originally presented in the class STOR 831 Advanced Probability in Fall 2013 at UNC Chapel Hill as a final project.
Rates of convergence in M-Estimation with an example from current status data
Rates of Convergence in M-Estimation
With an Example from Current Status Data
Lu Mao
Department of Biostatistics
The University of North Carolina at Chapel Hill
Email: lmao@unc.edu
Lu Mao (UNC CH) Dec 4, 2013 1 / 23
1 M-Estimation
Motivation
Z-Estimation
Consistency
Distribution
Rate of convergence
2 Example: Current Status Data
Description of problem
Characterization of MLE
Consistency and distributional results
MOTIVATION
Examples:
  Parametric model: $\{p_\theta(x) : \theta \in \Theta\}$,
    $\theta_0 = \arg\max_{\theta \in \Theta} P \log p_\theta$
  Regression: $Y = g_0(X) + \epsilon$, $E(\epsilon \mid X) = 0$, $g_0 \in \mathcal{G}$,
    $g_0 = \arg\min_{g \in \mathcal{G}} P(Y - g(X))^2$
Natural estimators:
  MLE: $\hat\theta = \arg\max_{\theta \in \Theta} \mathbb{P}_n \log p_\theta$
  LSE: $\hat g = \arg\min_{g \in \mathcal{G}} \mathbb{P}_n (Y - g(X))^2$
INTRODUCTION
M-estimator:
  $\hat\theta_n = \arg\max_{\theta \in \Theta} M_n(\theta)$
  $M_n$: data-dependent criterion function
  $M_n \to M$, and $\theta_0 = \arg\max_{\theta \in \Theta} M(\theta)$
  Typically $M_n(\theta) = \mathbb{P}_n m_\theta$
Analysis steps:
  Consistency: $\hat\theta_n \to_p \theta_0$
  Rate of convergence: $r_n(\hat\theta_n - \theta_0) = O_p(1)$
  Asymptotic distribution: $r_n(\hat\theta_n - \theta_0) \rightsquigarrow Z$
Z-ESTIMATION
Special case: $m_\theta$ smooth in the parameter ($P\dot m_{\theta_0} = 0$)
(Approximately) solve:
  $\mathbb{P}_n \dot m_{\hat\theta_n} = o_p(n^{-1/2})$
Provided that
  $\mathbb{G}_n \dot m_{\hat\theta_n} = \mathbb{G}_n \dot m_{\theta_0} + o_p(1)$   (consistency + Donskerness)
we have
  $\sqrt{n}(P\dot m_{\hat\theta_n} - P\dot m_{\theta_0}) = -\mathbb{G}_n \dot m_{\theta_0} + o_p(1)$
  $V_{\theta_0} \sqrt{n}(\hat\theta_n - \theta_0) = -\mathbb{G}_n \dot m_{\theta_0} + o_p(1 + \|\sqrt{n}(\hat\theta_n - \theta_0)\|)$
  $\sqrt{n}(\hat\theta_n - \theta_0) = -V_{\theta_0}^{-1} \mathbb{G}_n \dot m_{\theta_0} + o_p(1)$
where
  $V_{\theta_0} = \frac{\partial}{\partial\theta} P\dot m_\theta \big|_{\theta = \theta_0}$
Z-ESTIMATION
Example (Sample median)
  $\hat\theta_n = \arg\min_\theta \mathbb{P}_n |X - \theta|$
or equivalently
  $-1/n \le \mathbb{P}_n \operatorname{sign}(X - \hat\theta_n) \le 1/n$
Since $P\operatorname{sign}(X - \theta) = 1 - 2F(\theta)$, we get $V_{\theta_0} = -2f(\theta_0)$, so
  $\sqrt{n}(\hat\theta_n - \theta_0) = (2f(\theta_0))^{-1} \mathbb{G}_n \operatorname{sign}(X - \theta_0) + o_p(1) \rightsquigarrow N\big(0, \{4f(\theta_0)^2\}^{-1}\big)$
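The limiting variance $\{4f(\theta_0)^2\}^{-1}$ can be checked numerically. The sketch below is not part of the original slides; it assumes standard normal data, for which $\theta_0 = 0$, $f(\theta_0) = 1/\sqrt{2\pi}$, and the limiting variance is $\pi/2 \approx 1.571$.

```python
# Monte Carlo sketch (not from the slides) of the sample-median result
#   sqrt(n) * (median_n - theta0) ~ N(0, 1 / (4 f(theta0)^2)).
# Assumption: X ~ N(0, 1), so theta0 = 0, f(theta0) = 1/sqrt(2*pi),
# and the limiting variance is pi/2 ~ 1.5708.
import math
import random

random.seed(2013)

def sample_median(xs):
    """Midpoint of the two central order statistics."""
    s = sorted(xs)
    n = len(s)
    return (s[(n - 1) // 2] + s[n // 2]) / 2

n, reps = 400, 2000
scaled = [math.sqrt(n) * sample_median([random.gauss(0.0, 1.0) for _ in range(n)])
          for _ in range(reps)]
var_hat = sum(z * z for z in scaled) / reps
print(var_hat)  # close to pi/2 for large n and reps
```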
M-ESTIMATION
M-Estimation in general:
  Non-smoothness in the parameter
  Constraints on the parameter
  Resulting estimator possibly not root-n consistent, i.e.
  $r_n(\hat\theta_n - \theta_0) = O_p(1)$, where $r_n \ne \sqrt{n}$
CONSISTENCY
Theorem 1.1 (VW Corollary 3.2.3)
Let $M_n(\theta)$ be a stochastic process indexed by a metric space $\Theta$, and let $M : \Theta \to \mathbb{R}$ be a deterministic function.
(a) Suppose $\|M_n - M\|_\Theta \to_p 0$ and the true parameter $\theta_0$ satisfies
  $M(\theta_0) > \sup_{\theta \notin G} M(\theta)$
for every open set $G$ containing $\theta_0$. Then if $M_n(\hat\theta_n) \ge \sup_\theta M_n(\theta) - o_p(1)$, we have $\hat\theta_n \to_p \theta_0$.
(b) Suppose that $\|M_n - M\|_K \to_p 0$ for every compact $K \subset \Theta$ and that the map $\theta \mapsto M(\theta)$ is upper-semicontinuous with a unique maximum at $\theta_0$. Then the same conclusion is true provided that $\hat\theta_n = O_p(1)$.
CONSISTENCY
Example (Sample median)
  $\hat\theta_n = \arg\min_\theta \mathbb{P}_n |X - \theta|$
Use Theorem 1.1(b):
  $|\hat\theta_n| \le \mathbb{P}_n|X - \hat\theta_n| + \mathbb{P}_n|X| \le 2\mathbb{P}_n|X| = O_p(1)$
  $\{|x - \theta| : \theta \in K\}$ is Glivenko-Cantelli
  Uniqueness of $\theta_0$ as minimizer of $P|X - \theta|$:
    $\frac{\partial}{\partial\theta} P|X - \theta| = 2F(\theta) - 1, \quad \frac{\partial^2}{\partial\theta^2} P|X - \theta| = 2f(\theta)$
  so $P|X - \theta|$ is strictly convex on the support of $X$
Conclusion: $\hat\theta_n \to_p \theta_0$
CONSISTENCY
Theorem 1.2 (Wald)
Let $\theta \mapsto m_\theta(x)$ be upper-semicontinuous for every $x$, and for every sufficiently small ball $U \subset \Theta$,
  $P \sup_{\theta \in U} m_\theta < \infty.$
Then if $\theta_0$ is the unique maximizer of $Pm_\theta$, $\hat\theta_n = O_p(1)$ and $\mathbb{P}_n m_{\hat\theta_n} \ge \mathbb{P}_n m_{\theta_0} - o_p(1)$, we have
  $\hat\theta_n \to_p \theta_0.$
DISTRIBUTION
Suppose $\hat h_n := r_n(\hat\theta_n - \theta_0) = O_p(1)$; then
  $\hat h_n = \arg\max_h r_n^2 \{M_n(\theta_0 + r_n^{-1} h) - M_n(\theta_0)\} =: \arg\max_h \mathbb{H}_n(h)$
Theorem 1.3 (Argmax)
Suppose that $\mathbb{H}_n \rightsquigarrow \mathbb{H}$ in $\ell^\infty(K)$ for every compact $K \subset \mathbb{R}^d$, for a limit process $\mathbb{H}$ with continuous sample paths that have a unique point of maximum $\hat h$. If $\hat h_n = O_p(1)$ and $\mathbb{H}_n(\hat h_n) \ge \sup_h \mathbb{H}_n(h) - o_p(1)$, then
  $\hat h_n \rightsquigarrow \hat h.$
DISTRIBUTION
Example (Parametric MLE), with $r_n = \sqrt{n}$:
  $\mathbb{H}_n(h) = r_n^2 \mathbb{P}_n(\log p_{\theta_0 + r_n^{-1} h} - \log p_{\theta_0})$
  $\qquad = \log \prod_{i=1}^n \frac{p_{\theta_0 + h/\sqrt{n}}}{p_{\theta_0}}(X_i)$
  $\qquad = h^T \mathbb{G}_n \dot\ell_{\theta_0} - \frac{1}{2} h^T I_{\theta_0} h + o_p(1)$   (LAN)
  $\qquad \rightsquigarrow h^T Z - \frac{1}{2} h^T I_{\theta_0} h =: \mathbb{H}(h), \quad Z \sim N(0, I_{\theta_0})$
Therefore
  $\sqrt{n}(\hat\theta_n - \theta_0) = \hat h_n \rightsquigarrow \arg\max_h \mathbb{H}(h) = I_{\theta_0}^{-1} Z \sim N(0, I_{\theta_0}^{-1})$
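The argmax/LAN conclusion can be checked on a concrete family. The sketch below is an illustration, not from the slides; it uses the exponential model $p_\theta(x) = \theta e^{-\theta x}$, for which $\hat\theta_n = 1/\bar X_n$ and $I_{\theta_0} = \theta_0^{-2}$, so $\sqrt{n}(\hat\theta_n - \theta_0) \rightsquigarrow N(0, \theta_0^2)$.

```python
# Numerical check (an illustration, not from the slides) of the argmax/LAN
# result for the exponential model p_theta(x) = theta * exp(-theta * x):
# theta_hat_n = 1 / mean(X), I_theta = 1 / theta^2, so
#   sqrt(n) * (theta_hat_n - theta0) ~ N(0, theta0^2).
import math
import random

random.seed(831)

theta0, n, reps = 1.0, 400, 2000
scaled = []
for _ in range(reps):
    xbar = sum(random.expovariate(theta0) for _ in range(n)) / n
    scaled.append(math.sqrt(n) * (1.0 / xbar - theta0))

mean_hat = sum(scaled) / reps
var_hat = sum((z - mean_hat) ** 2 for z in scaled) / reps
print(mean_hat, var_hat)  # mean near 0, variance near theta0^2 = 1
```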
DISTRIBUTION
In general,
  $\mathbb{H}_n(h) = r_n^2 \mathbb{P}_n(m_{\theta_0 + r_n^{-1} h} - m_{\theta_0})$
  $\qquad = \frac{r_n^2}{\sqrt{n}} \mathbb{G}_n(m_{\theta_0 + r_n^{-1} h} - m_{\theta_0}) + r_n^2 P(m_{\theta_0 + r_n^{-1} h} - m_{\theta_0})$
  $\qquad = \frac{r_n^2}{\sqrt{n}} \mathbb{G}_n(m_{\theta_0 + r_n^{-1} h} - m_{\theta_0}) + \frac{1}{2} h^T V_{\theta_0} h + o_p(1), \quad V_\theta = \frac{\partial^2}{\partial\theta^2} P m_\theta$
  $\qquad \rightsquigarrow \mathbb{G}(h) + \frac{1}{2} h^T V_{\theta_0} h \qquad (1)$
for some zero-mean Gaussian process $\mathbb{G}$.
Note that convergence of the $\mathbb{G}_n$ term concerns empirical processes indexed by
  $\mathcal{F}_n := \frac{r_n^2}{\sqrt{n}} \mathcal{M}_{K/r_n}, \quad \text{where } \mathcal{M}_\delta = \{m_\theta - m_{\theta_0} : d(\theta, \theta_0) \le \delta\}$
DISTRIBUTION
Empirical processes on index sets changing with $n$: VW Section 2.11.
If (1) does hold, the variance function of $\mathbb{G}$ is given by
  $E(\mathbb{G}(h) - \mathbb{G}(g))^2 = \lim_{n \to \infty} \frac{r_n^4}{n} P(m_{\theta_0 + h/r_n} - m_{\theta_0 + g/r_n})^2$
The remaining (key) issue: finding the right rate $r_n$.
RATE OF CONVERGENCE
Theorem 1.4 (Rate of Convergence, VW Theorem 3.2.5)
Let $M_n$ be stochastic processes indexed by a semimetric space $(\Theta, d)$ and $M : \Theta \to \mathbb{R}$ a deterministic function, such that for every $\theta$ in a neighborhood of $\theta_0$,
  $M(\theta) - M(\theta_0) \lesssim -d^2(\theta, \theta_0).$
Suppose that, for all sufficiently small $\delta$,
  $E \sup_{d(\theta, \theta_0) < \delta} |(M_n - M)(\theta) - (M_n - M)(\theta_0)| \lesssim \frac{\phi_n(\delta)}{\sqrt{n}},$
for functions $\phi_n$ such that $\delta \mapsto \phi_n(\delta)/\delta^\alpha$ is decreasing for some $\alpha < 2$. Let
  $r_n^2 \phi_n(1/r_n) \le \sqrt{n}, \quad \text{for every } n.$
If the sequence $\hat\theta_n$ satisfies $M_n(\hat\theta_n) \ge M_n(\theta_0) - O_p(r_n^{-2})$ and converges in probability to $\theta_0$, then
  $r_n d(\hat\theta_n, \theta_0) = O_p(1).$
RATE OF CONVERGENCE
Remark:
For empirical-type criterion functions,
  $E\|\mathbb{G}_n\|_{\mathcal{M}_\delta} \lesssim \phi_n(\delta)$
Use the maximal inequality
  $E\|\mathbb{G}_n\|_{\mathcal{M}_\delta} \lesssim J_{[\,]}(1, \mathcal{M}_\delta, L_2(P)) (P M_\delta^2)^{1/2},$
where $M_\delta$ is the envelope function of $\mathcal{M}_\delta$. If
  $J_{[\,]}(1, \mathcal{M}_\delta, L_2(P)) = \int_0^1 \sqrt{1 + \log N_{[\,]}(\epsilon \|M_\delta\|_{P,2}, \mathcal{M}_\delta, L_2(P))}\, d\epsilon \lesssim 1,$
uniformly in $\delta$, then take
  $\phi_n(\delta) = (P M_\delta^2)^{1/2}$
If $\phi_n(\delta) = \delta^\alpha$ for some $\alpha < 2$, then
  $r_n = n^{\frac{1}{2(2-\alpha)}}$
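For the common case $\phi_n(\delta) = \delta^\alpha$ the rate can be read off mechanically: $r_n^2 \phi_n(1/r_n) = r_n^{2-\alpha} \le \sqrt{n}$ gives $r_n = n^{1/(2(2-\alpha))}$. A tiny helper (an illustration, not from the slides) makes the bookkeeping explicit:

```python
# Bookkeeping helper (an illustration, not from the slides): solve
#   r_n^2 * phi_n(1/r_n) <= sqrt(n)   with   phi_n(delta) = delta**alpha,
# i.e. r_n^(2 - alpha) <= n^(1/2), giving r_n = n^(1 / (2 * (2 - alpha))).
def rate_exponent(alpha):
    """Exponent gamma such that r_n = n**gamma when phi_n(delta) = delta**alpha."""
    assert alpha < 2, "theorem requires delta -> phi_n(delta)/delta^alpha decreasing, alpha < 2"
    return 1.0 / (2.0 * (2.0 - alpha))

print(rate_exponent(1.0))  # 0.5 : the Lipschitz case, r_n = sqrt(n)
print(rate_exponent(0.5))  # 1/3 : the current status case, r_n = n^(1/3)
```

Here $\alpha = 1$ recovers the $\sqrt{n}$ rate of the Lipschitz example, while $\alpha = 1/2$ gives the $n^{1/3}$ rate that appears in the current status application.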
RATE OF CONVERGENCE
Example (Lipschitz in parameter):
If for every $\theta_1, \theta_2$ in a neighborhood of $\theta_0$,
  $|m_{\theta_1} - m_{\theta_2}| \le \dot m(x) \|\theta_1 - \theta_2\|,$
with $P \dot m^2(x) \lesssim 1$, then
  $\phi_n(\delta) = (P M_\delta^2)^{1/2} \lesssim \delta.$
This gives
  $r_n = \sqrt{n}$
CURRENT STATUS
Interval censoring Case 1 (current status):
  Time-to-event data, each subject examined only once
  Example: a cross-sectional antibody test of people of various ages against Hepatitis A virus (Keiding, 1991)
Statistical problem: observe i.i.d. $(U, \Delta)$,
  $U \sim G$ on $\mathbb{R}^+$
  $\Delta = I(T \le U)$, $T \sim F$ on $\mathbb{R}^+$, $T \perp U$
Aim: estimate $F$
Method: nonparametric MLE (NPMLE)
CHARACTERIZATION OF MLE
Regularity conditions: $F_0$ and $G$ admit Lebesgue densities $f$ and $g$ respectively
Likelihood:
  $l_n(F) = \mathbb{P}_n(\Delta \log F(U) + (1 - \Delta) \log(1 - F(U)))$
NPMLE: denoted $\hat F_n$
CHARACTERIZATION OF MLE
Theorem 2.1 (GW Proposition 1.2)
Re-order the observation times in ascending order such that $U_1 \le \cdots \le U_n$. Let $H_n$ be the greatest convex minorant (GCM) of the points $(i, \sum_{j=1}^i \Delta_j)$. Then $\hat F_n(U_i)$ is the left derivative of $H_n$ at $i$. Algebraically,
  $\hat F_n(U_i) = \max_{1 \le j \le i} \min_{i \le k \le n} \frac{\sum_{m=j}^k \Delta_m}{k - j + 1}$
Corollary 2.2
Denote by $D_n$ the right-continuous step function defined by the points $(i/n, n^{-1} \sum_{j=1}^i \Delta_j)$; then
  $\hat F_n(U_i) \le a \iff \arg\min_s \{D_n(s) - a s\} \ge i/n.$
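The max-min formula in Theorem 2.1 can be implemented directly. The sketch below is a naive $O(n^2)$ illustration (not the algorithm of GW; in practice one would use the pool-adjacent-violators algorithm, which runs in $O(n)$):

```python
# Direct implementation (a naive O(n^2) sketch) of the max-min formula
#   F_hat(U_i) = max_{j<=i} min_{k>=i} (Delta_j + ... + Delta_k) / (k - j + 1).
def npmle_current_status(deltas):
    """NPMLE values F_hat(U_1), ..., F_hat(U_n) from the indicators Delta_i
    listed in increasing order of the observation times U_i."""
    n = len(deltas)
    prefix = [0]  # prefix[i] = Delta_1 + ... + Delta_i
    for d in deltas:
        prefix.append(prefix[-1] + d)
    fhat = []
    for i in range(1, n + 1):
        best = 0.0
        for j in range(1, i + 1):
            # min over k >= i of the running mean of Delta_j, ..., Delta_k
            m = min((prefix[k] - prefix[j - 1]) / (k - j + 1)
                    for k in range(i, n + 1))
            best = max(best, m)
        fhat.append(best)
    return fhat

print(npmle_current_status([0, 1, 0, 1, 1]))  # [0.0, 0.5, 0.5, 1.0, 1.0]
```

The output agrees with the left derivatives of the GCM of the cumulative-sum diagram, and is non-decreasing as a distribution function must be.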
CONSISTENCY OF MLE
Theorem 2.3 (Consistency of $\hat F_n(t)$)
Fix $t$ and assume that $f(t), g(t) > 0$; then
  $\hat F_n(t) \to_p F_0(t)$
Proof. See Example 5.17 ([V], p. 49) for a proof by Wald's consistency (Theorem 1.2), with $(\Theta, d)$ = the space of distribution functions equipped with the weak topology.
DISTRIBUTION OF MLE
Theorem 2.4 (Groeneboom, 1987)
Fix $t$ and assume that $f(t), g(t) > 0$; then
  $n^{1/3}\{\hat F_n(t) - F_0(t)\} \rightsquigarrow \left(\frac{4 F_0 (1 - F_0) f}{g}(t)\right)^{1/3} \arg\min_h \{Z(h) + h^2\},$
where $Z$ is a two-sided Brownian motion process originating from zero.
Proof. We first use Theorem 1.4 and the subsequent Remark to establish that $r_n = n^{1/3}$, and then use the Argmax Theorem (Theorem 1.3) to find the asymptotic distribution. See Example 3.2.15 ([VW], p. 298) for details.
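The limit variable $\arg\min_h \{Z(h) + h^2\}$ (a Chernoff-type distribution) can be approximated by brute force. The sketch below is not from the slides, and the grid step and truncation window are arbitrary choices; it discretizes a two-sided Brownian motion on $[-3, 3]$ and records the minimizer of $Z(h) + h^2$:

```python
# Brute-force sketch (not from the slides) of argmin_h { Z(h) + h^2 },
# Z a two-sided Brownian motion started at 0. Grid step and the
# truncation window [-3, 3] are arbitrary simulation choices.
import math
import random

random.seed(42)

def chernoff_argmin(step=0.01, span=3.0):
    """One draw of the grid minimizer of Z(h) + h**2 over [-span, span]."""
    m = int(span / step)
    sd = math.sqrt(step)           # BM increment scale over one grid step
    best_h, best_v = 0.0, 0.0      # value at h = 0 is Z(0) + 0 = 0
    for sign in (1.0, -1.0):       # the two halves are independent BMs
        z = 0.0
        for i in range(1, m + 1):
            z += random.gauss(0.0, sd)
            h = sign * i * step
            v = z + h * h
            if v < best_v:
                best_h, best_v = h, v
    return best_h

draws = [chernoff_argmin() for _ in range(2000)]
mean = sum(draws) / len(draws)
sd = math.sqrt(sum((d - mean) ** 2 for d in draws) / len(draws))
print(mean, sd)  # mean near 0 by symmetry; the distribution is concentrated
```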
References
Groeneboom, P. (1987). Asymptotics for interval censored observations. Report 87-18.
[GW] Groeneboom, P. and Wellner, J. A. (1992). Information Bounds and Nonparametric Maximum Likelihood Estimation (Vol. 19). Springer.
[V] van der Vaart, A. W. (2000). Asymptotic Statistics (Vol. 3). Cambridge University Press.
[VW] van der Vaart, A. W. and Wellner, J. A. (1996). Weak Convergence and Empirical Processes. Springer.