Transcript

  • 1. Approximate regeneration scheme for Markov chains, with applications to U-statistics and extreme values. Patrice Bertail (MODAL'X and CREST). CREST, 30 January 2013.
  • 2. Outline
      The renewal or regenerative approach: notations; the regenerative approach (atomic case); Nummelin splitting trick; approximate regeneration scheme.
      U-statistics for Markovian data: U-statistics; a block Hoeffding decomposition; moment-type conditions.
      Asymptotic results and the bootstrap: CLT; variance estimation; Berry-Esseen bounds; second-order validity of the bootstrap?
      Simulation results.
      P. Bertail (MODAL'X and CREST), U-statistics of Markovian data, 2012.
  • 3. Notations
      $X = (X_n)_{n \in \mathbb{N}}$ denotes a $\psi$-irreducible, time-homogeneous Markov chain valued in a measurable space $(E, \mathcal{E})$ ($\psi$ being a maximal irreducibility measure), with transition probability $\Pi(x, dy)$ and initial distribution $\nu$ (non-stationary case).
      $P_\nu$ (respectively, $P_x$ for $x \in E$) is the probability measure such that $X_0 \sim \nu$ (resp., conditioned upon $X_0 = x$). For $A$ such that $\psi(A) > 0$, $P_A[\cdot]$ denotes the probability measure on the underlying space such that $X_0 \in A$, and $E_A[\cdot]$ the $P_A$-expectation.
      $E_\nu[\cdot]$ is the $P_\nu$-expectation (resp. $E_x[\cdot]$ the $P_x$-expectation).
      Hypothesis: the chain $X$ is Harris recurrent, i.e., for any subset $B \in \mathcal{E}$ such that $\psi(B) > 0$,
      $$P_x\Big(\sum_{n \ge 1} \mathbb{I}\{X_n \in B\} = \infty\Big) = 1 \quad \text{for all } x \in E.$$
  • 6. The regenerative approach (Meyn and Tweedie, 1996)
      Definition: a Markov chain is said to be regenerative if it possesses an accessible atom, i.e., a measurable set $A$ such that $\psi(A) > 0$ and $\Pi(x, \cdot) = \Pi(y, \cdot)$ for all $(x, y) \in A^2$.
      Hypothesis: the chain is positive recurrent (if and only if the expected return time to the atom is finite, i.e. $E_A[\tau_A] < \infty$).
      Invariant measure: for all $B \in \mathcal{E}$,
      $$\mu(B) = \frac{1}{E_A[\tau_A]}\, E_A\Big[\sum_{i=1}^{\tau_A} \mathbb{I}\{X_i \in B\}\Big].$$
  • 10. Regenerative blocks of observations
      Successive return times to an atom $A$:
      $$\tau_A = \tau_A(1) = \inf\{n \ge 1 : X_n \in A\}, \qquad \tau_A(j) = \inf\{n > \tau_A(j-1) : X_n \in A\} \ \text{ for } j \ge 2.$$
      Regenerative blocks of observations between consecutive visits to the atom:
      $$B_0 = (X_1, \ldots, X_{\tau_A(1)}), \quad B_1 = (X_{\tau_A(1)+1}, \ldots, X_{\tau_A(2)}), \ \ldots, \ B_j = (X_{\tau_A(j)+1}, \ldots, X_{\tau_A(j+1)}), \ \ldots$$
      The sequence $B_1, B_2, \ldots, B_j, \ldots$ is i.i.d. ($B_0$ is independent of the others but depends on $\nu$) by the strong Markov property.
      $l_n$ = number of regenerations in a sequence of length $n$; it is a random variable (depending on the lengths $l(B_i)$ of the blocks), and $l_n / n \to (E_A \tau_A)^{-1}$ almost surely.
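The block construction on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `regeneration_blocks` and the toy integer trajectory are hypothetical and do not come from the talk, and the atom is passed as a membership test `in_atom`.

```python
from typing import Callable, List, Sequence, Tuple


def regeneration_blocks(
    path: Sequence[float], in_atom: Callable[[float], bool]
) -> Tuple[List[float], List[List[float]], List[float]]:
    """Split X_1, ..., X_n at the successive visits to the atom.

    Returns (B_0, [B_1, B_2, ...], last incomplete block). Every complete
    block ends with a visit to the atom, as on the slide.
    """
    # 0-based positions of the visits; position t holds X_{t+1}.
    visits = [t for t, x in enumerate(path) if in_atom(x)]
    if not visits:
        # No regeneration observed: everything belongs to B_0.
        return list(path), [], []
    first = list(path[: visits[0] + 1])  # B_0 = (X_1, ..., X_{tau_A(1)})
    blocks = [
        list(path[a + 1 : b + 1])  # (X_{tau_A(j)+1}, ..., X_{tau_A(j+1)})
        for a, b in zip(visits, visits[1:])
    ]
    last = list(path[visits[-1] + 1 :])  # data after the last regeneration
    return first, blocks, last


# Toy trajectory on the integers; the atom is the single state 0 (assumption).
path = [2, 1, 0, 3, 0, 0, 1, 2, 0, 4]
b0, blocks, tail = regeneration_blocks(path, lambda x: x == 0)
```

With $l_n$ visits to the atom, the sketch returns $l_n - 1$ complete blocks, plus the initial block $B_0$ and the incomplete final segment.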
  • 11. Harris recurrent Markov chains (non-atomic case): Nummelin splitting technique (1978)
      All recurrent Markov chains can be extended to be atomic using the Nummelin splitting technique.
      Notion of small set and minorization condition. Definition: a set $S \in \mathcal{E}$ is said to be small for the chain if there exist $m \in \mathbb{N}^*$, $\delta > 0$ and a probability measure $\Phi$ supported by $S$ such that
      $$\forall (x, B) \in S \times \mathcal{E}, \quad \Pi^m(x, B) \ge \delta\, \Phi(B),$$
      denoting by $\Pi^m$ the $m$-th iterate of the transition kernel $\Pi$. We assume that $m = 1$ here and throughout, with no loss of generality.
      Idea: a mixture with a component independent of $x$,
      $$\Pi(x, B) = \delta\, \Phi(B) + (1 - \delta)\, \frac{\Pi(x, B) - \delta\, \Phi(B)}{1 - \delta}, \quad \text{for all } B \subset S,\ x \in S.$$
  • 12. Nummelin splitting trick
      $Y = (Y_n)_{n \in \mathbb{N}}$ is a sequence of independent Bernoulli r.v.'s with parameter $\delta$ such that $(X, Y)$ is a bivariate Markov chain, referred to as the split chain, with state space $E \times \{0, 1\}$. The split chain is atomic with atom $A_S = S \times \{1\}$.
      Reference measure $\lambda(dy)$ dominating $\{\Pi(x, dy);\ x \in E\}$:
      $$\Pi(x, dy) = \pi(x, y)\, \lambda(dy), \quad \Phi(dy) = \phi(y)\, \lambda(dy), \quad \forall (x, y) \in S^2,\ \pi(x, y) \ge \delta\, \phi(y).$$
      Conditioned upon $X^{(n)} = (X_1, \ldots, X_n)$, the random variables $Y_1, \ldots, Y_n$ are mutually independent and, for all $i \in \{1, \ldots, n\}$, $Y_i$ is drawn from a Bernoulli distribution with parameter
      $$\delta\, \mathbb{I}\{X_i \notin S\} + \frac{\delta\, \phi(X_{i+1})}{\pi(X_i, X_{i+1})}\, \mathbb{I}\{X_i \in S\}.$$
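A minimal sketch of the conditional Bernoulli draws, assuming the transition density $\pi$, the density $\phi$, the constant $\delta$ and the small set $S$ are all known. The function name `split_draws` and the toy density $\pi(x, y) = 1 + 0.5\,(2x - 1)(2y - 1)$ on $[0, 1]^2$ are illustrative assumptions, not taken from the talk. For this toy density, $\pi \ge 0.5$ on $[0, 1]^2$, so $S = [0, 1]$, $\phi \equiv 1$ and $\delta = 0.5$ satisfy the minorization condition.

```python
import random
from typing import Callable, List, Sequence


def split_draws(
    path: Sequence[float],
    in_small_set: Callable[[float], bool],
    delta: float,
    phi: Callable[[float], float],
    pi: Callable[[float, float], float],
    rng: random.Random,
) -> List[int]:
    """Draw Y_i given the trajectory, one draw per observed pair (X_i, X_{i+1}).

    From n observations only n - 1 pairs are available, so n - 1 draws are
    returned.
    """
    draws = []
    for x, x_next in zip(path, path[1:]):
        if in_small_set(x):
            # delta * phi(X_{i+1}) / pi(X_i, X_{i+1}) when X_i is in S
            prob = delta * phi(x_next) / pi(x, x_next)
        else:
            prob = delta
        # Guard against rounding; the minorization condition gives prob <= 1 on S^2.
        prob = min(1.0, prob)
        draws.append(1 if rng.random() < prob else 0)
    return draws


# Toy example (assumption): pi(x, y) = 1 + 0.5 (2x - 1)(2y - 1) on [0, 1]^2.
toy_pi = lambda x, y: 1.0 + 0.5 * (2.0 * x - 1.0) * (2.0 * y - 1.0)
toy_path = [0.1, 0.7, 0.4, 0.9, 0.2]
toy_draws = split_draws(
    toy_path, lambda x: 0.0 <= x <= 1.0, 0.5, lambda y: 1.0, toy_pi, random.Random(0)
)
```

Times $i$ with $X_i \in S$ and $Y_i = 1$ are the visits of the split chain to the atom $A_S$, so they can be used to cut the trajectory into blocks.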
  • 13. Approximate regeneration scheme (Bertail and Clémençon, 2005-2007)
      Suppose that an estimate $\hat\pi_n(x, y)$ of the transition density over $S \times S$, such that $\forall (x, y) \in S^2$, $\hat\pi_n(x, y) \ge \delta\, \phi(y)$, is available (we may choose $S = \hat S$ and $\delta = \hat\delta$).
      Conditioned upon $X^{(n)} = (X_1, \ldots, X_n)$, the random variables $\hat Y_1, \ldots, \hat Y_n$ are mutually independent and, for all $i \in \{1, \ldots, n\}$, $\hat Y_i$ is drawn from a Bernoulli distribution with parameter
      $$\delta\, \mathbb{I}\{X_i \notin S\} + \frac{\delta\, \phi(X_{i+1})}{\hat\pi_n(X_i, X_{i+1})}\, \mathbb{I}\{X_i \in S\}.$$
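The slide does not say which estimate $\hat\pi_n$ is used. As one possible choice, the sketch below assumes a Gaussian-kernel estimate of the transition density built from the observed pairs $(X_i, X_{i+1})$, then plugs it into the Bernoulli parameter. The names `transition_density_estimate` and `regeneration_probability`, the kernel, and the bandwidth are all assumptions made for illustration.

```python
import math
from typing import Callable, Sequence


def gaussian_kernel(u: float) -> float:
    """Standard normal density, used here as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def transition_density_estimate(
    path: Sequence[float], bandwidth: float
) -> Callable[[float, float], float]:
    """Kernel estimate of pi(x, y) built from the pairs (X_i, X_{i+1})."""
    pairs = list(zip(path, path[1:]))

    def pi_hat(x: float, y: float) -> float:
        weights = [gaussian_kernel((x - a) / bandwidth) for a, _ in pairs]
        total = sum(weights)
        if total == 0.0:  # x is numerically far from every observed X_i
            return 0.0
        joint = sum(
            w * gaussian_kernel((y - b) / bandwidth) / bandwidth
            for w, (_, b) in zip(weights, pairs)
        )
        return joint / total  # weighted average of kernel densities in y

    return pi_hat


def regeneration_probability(
    x: float,
    x_next: float,
    pi_hat: Callable[[float, float], float],
    delta: float,
    phi: Callable[[float], float],
    in_small_set: Callable[[float], bool],
) -> float:
    """Bernoulli parameter of the slide with pi replaced by its estimate."""
    if not in_small_set(x):
        return delta
    density = pi_hat(x, x_next)
    if density <= 0.0:
        return 0.0  # convention of this sketch, not taken from the talk
    return min(1.0, delta * phi(x_next) / density)
```

The `min(1.0, ...)` guard is needed because the constraint $\hat\pi_n \ge \delta\phi$ is not enforced by the kernel estimate itself; in practice $\delta$ would be chosen from $\hat\pi_n$ so that the constraint holds.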
  • 14. Empirical choice of the small set
      Trade-off between the size of the small set and the number of regenerations (knowing the trajectory).
      Choose a neighborhood $V_{x_0}(\varepsilon)$ of size $\varepsilon$ around a point $x_0$ (typically the mean):
      $$N_n(\varepsilon) = E\Big(\sum_{i=1}^{n} \mathbb{I}\{X_i \in V_{x_0}(\varepsilon),\ Y_i = 1\} \,\Big|\, X^{(n+1)}\Big) = \frac{\delta(\varepsilon)}{2\varepsilon} \sum_{i=1}^{n} \frac{\mathbb{I}\{(X_i, X_{i+1}) \in V_{x_0}(\varepsilon)^2\}}{p(X_i, X_{i+1})}.$$
      Since the transition density $p$ and its minimum over $V_{x_0}(\varepsilon)^2$ are unknown, an empirical criterion $\hat N_n(\varepsilon)$ to optimize is obtained by replacing $p$ by an estimate $\hat p_n$ and $\delta(\varepsilon)/2\varepsilon$ by a lower bound $\hat\delta_n(\varepsilon)/2\varepsilon$ for $\hat p_n$ over $V_{x_0}(\varepsilon)^2$.
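The empirical criterion can be sketched as follows, assuming an estimate `p_hat` of the transition density is supplied by the user. Two simplifications are assumptions of this sketch rather than part of the talk: the lower bound $\hat\delta_n(\varepsilon)/2\varepsilon$ is approximated by the minimum of `p_hat` over a finite grid of $V_{x_0}(\varepsilon)^2$, and $\varepsilon$ is selected from a finite list of candidates.

```python
from typing import Callable, Sequence


def estimated_regenerations(
    path: Sequence[float],
    x0: float,
    eps: float,
    p_hat: Callable[[float, float], float],
    grid_size: int = 11,
) -> float:
    """Empirical criterion: estimated number of regenerations for V = [x0 - eps, x0 + eps]."""
    lo, hi = x0 - eps, x0 + eps
    grid = [lo + (hi - lo) * k / (grid_size - 1) for k in range(grid_size)]
    # Grid approximation of the lower bound of p_hat over V x V,
    # which plays the role of delta_hat(eps) / (2 eps).
    lower_bound = min(p_hat(x, y) for x in grid for y in grid)
    total = 0.0
    for x, x_next in zip(path, path[1:]):
        if lo <= x <= hi and lo <= x_next <= hi:
            density = p_hat(x, x_next)
            if density > 0.0:  # skip pairs where the estimate vanishes
                total += lower_bound / density
    return total


def choose_epsilon(
    path: Sequence[float],
    x0: float,
    candidates: Sequence[float],
    p_hat: Callable[[float, float], float],
) -> float:
    """Pick the candidate eps that maximizes the empirical criterion."""
    return max(candidates, key=lambda e: estimated_regenerations(path, x0, e, p_hat))


# Toy check with a constant density estimate (assumption): each pair in V x V counts 1.
toy_path = [0.1, -0.2, 0.05, 3.0, 0.0, 0.1]
toy_value = estimated_regenerations(toy_path, 0.0, 0.5, lambda x, y: 2.0)
```

Enlarging $\varepsilon$ brings more pairs into $V_{x_0}(\varepsilon)^2$ but lowers the minimum of the density over the set, which is the trade-off the slide describes.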
  • 15. Empirical choice of the small set
  • 17. U-statistics
      The parameter:
      $$\mu(h) = \int_{(x, y) \in E^2} h(x, y)\, \mu(dx)\, \mu(dy) = \frac{1}{(E_A \tau_A)^2}\, E_A\Big[\sum_{i=1}^{\tau_A(1)} \sum_{j=1+\tau_A(1)}^{\tau_A(2)} h(X_i, X_j)\Big],$$
      where $h : \mathbb{R}^2 \to \mathbb{R}$ is a symmetric kernel.
      U-statistic of degree 2:
      $$U_n(h) = \frac{2}{n(n-1)} \sum_{1 \le i < j \le n} h(X_i, X_j).$$
  • 18. Examples
      The Gini index: $G_n = \frac{2}{n(n-1)} \sum_{1 \le i < j \le n} |X_i - X_j|$.
      The Wilcoxon statistic: $W_n = \frac{2}{n(n-1)} \sum_{1 \le i < j \le n} \{2\, \mathbb{I}\{X_i + X_j > 0\} - 1\}$.
      AUC (ROC curve).
      The Takens estimator, linked to $C_n(r) = \frac{1}{n(n-1)} \sum_{1 \le i \ne j \le n} \mathbb{I}\{\|X_i - X_j\| \le r\}$:
      $$T_n = \frac{1}{n(n-1)} \sum_{1 \le i \ne j \le n} \log \frac{\|X_i - X_j\|}{r_0}.$$
      Regular part (second order) of Fréchet differentiable functionals.
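As a small illustration of the definition of $U_n(h)$, the following sketch evaluates a degree-2 U-statistic for an arbitrary symmetric kernel and applies it to the Gini and Wilcoxon kernels listed above. The function name `u_statistic` and the toy samples are illustrative assumptions.

```python
from itertools import combinations
from typing import Callable, Sequence


def u_statistic(sample: Sequence[float], kernel: Callable[[float, float], float]) -> float:
    """U_n(h) = 2 / (n (n - 1)) * sum over i < j of h(X_i, X_j)."""
    n = len(sample)
    if n < 2:
        raise ValueError("a U-statistic of degree 2 needs at least two observations")
    total = sum(kernel(x, y) for x, y in combinations(sample, 2))
    return 2.0 * total / (n * (n - 1))


# Gini kernel h(x, y) = |x - y| on a toy sample.
gini = u_statistic([1.0, 2.0, 4.0], lambda x, y: abs(x - y))

# Wilcoxon kernel h(x, y) = 2 * I{x + y > 0} - 1 on a toy sample.
wilcoxon = u_statistic([-3.0, 1.0, 2.0], lambda x, y: 2.0 * (x + y > 0) - 1.0)
```

The direct evaluation costs $O(n^2)$ kernel calls, which is fine for a sketch but may need a specialized formula for large samples.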
  • 19. Block representation of the U-statistics
      Four different contributions: the true U-statistic on blocks (center), the contribution of $B_0$, the contribution of $B_{l_n}$, and the diagonal contribution of incomplete blocks.
      Main tool: Hoeffding decomposition of the U-statistic based on blocks.
  • 20. Hoeffding decomposition on blocks
      Notations:
      $$\omega_h(B_k, B_l) = \sum_{i=\tau_A(k)+1}^{\tau_A(k+1)} \ \sum_{j=\tau_A(l)+1}^{\tau_A(l+1)} \big(h(X_i, X_j) - \mu(h)\big).$$
      U-statistic of regenerative blocks:
      $$R_L(h) = \frac{2}{L(L-1)} \sum_{1 \le k < l \le L} \omega_h(B_k, B_l),$$
      $$U_n(h) = \mu(h) + \frac{(l_n - 1)(l_n - 2)}{n(n-1)}\, R_{l_n - 1}(h) + W_n(h).$$
      Hoeffding decomposition of the U-statistic on blocks: $R_L(h) = 2 S_L(h) + D_L(h)$, where
      $$S_L(h) = \frac{1}{L} \sum_{k=1}^{L} h_1(B_k) \quad \text{and} \quad D_L(h) = \frac{2}{L(L-1)} \sum_{1 \le k < l \le L} h_2(B_k, B_l).$$
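The block quantities $\omega_h(B_k, B_l)$ and $R_L(h)$ translate directly into code. In this sketch the centering constant $\mu(h)$ is passed as an argument `center`, since it is unknown in practice; the function names and the toy blocks are illustrative assumptions.

```python
from typing import Callable, List, Sequence


def omega(
    block_k: Sequence[float],
    block_l: Sequence[float],
    kernel: Callable[[float, float], float],
    center: float,
) -> float:
    """omega_h(B_k, B_l): sum of h(x, y) - center over x in B_k and y in B_l."""
    return sum(kernel(x, y) - center for x in block_k for y in block_l)


def block_u_statistic(
    blocks: List[Sequence[float]],
    kernel: Callable[[float, float], float],
    center: float,
) -> float:
    """R_L(h) = 2 / (L (L - 1)) * sum over k < l of omega_h(B_k, B_l)."""
    num_blocks = len(blocks)
    if num_blocks < 2:
        raise ValueError("at least two blocks are needed")
    total = sum(
        omega(blocks[k], blocks[l], kernel, center)
        for k in range(num_blocks)
        for l in range(k + 1, num_blocks)
    )
    return 2.0 * total / (num_blocks * (num_blocks - 1))


# Toy blocks with the Gini kernel and center = 0 (assumption, for illustration).
toy_blocks = [[1.0], [2.0, 3.0], [5.0]]
r_value = block_u_statistic(toy_blocks, lambda x, y: abs(x - y), 0.0)
```

Because the blocks are i.i.d. in the regenerative case, $R_L(h)$ is an ordinary U-statistic in the block variables, which is what makes the Hoeffding decomposition applicable.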
  • 24. Moment-type conditions
      A0 (Block-length: moment assumption.) Let $q \ge 1$; we have $E_A[\tau_A^q] < \infty$.
      A1 (Non-regenerative block.) Let $l \ge 1$; we have $E_\nu[\tau_A^l] < \infty$ as well as
      $$E_\nu\Big[\Big(\sum_{i=1}^{\tau_A} \sum_{j=1}^{\tau_A} |h(X_i, X_j)|\Big)^{l}\Big] < \infty, \qquad E_\nu\Big[\Big(\sum_{i=1}^{\tau_A} \sum_{j=1+\tau_A}^{\tau_A(2)} |h(X_i, X_j)|\Big)^{l}\Big] < \infty.$$
      A2 (Block-sums: moment assumptions.) Let $k \ge 1$; we have
      $$E_A\Big[\Big(\sum_{i=1}^{\tau_A} \sum_{j=1}^{\tau_A} |h(X_i, X_j)|\Big)^{k}\Big] < \infty, \qquad E_A\Big[\Big(\sum_{i=1}^{\tau_A} \sum_{j=1+\tau_A}^{\tau_A(2)} |h(X_i, X_j)|\Big)^{k}\Big] < \infty.$$
      A2bis (Uniform moment assumptions.) Let $p \ge 0$; we have
      $$\sup_{x \in S} E_\nu\Big[\Big(\sum_{j=1}^{\tau_A} \bar h(x, X_j)\Big)^{p}\Big] < \infty, \qquad \sup_{x \in S} E_A\Big[\Big(\sum_{j=1}^{\tau_A} \bar h(x, X_j)\Big)^{p+2}\Big] < \infty.$$
  • 25. Additional conditions in the general case
      B1. The MSE of $\hat\pi_n$ is of order $\alpha_n$ when the error is measured by the sup norm over $S^2$:
      $$E_\nu\Big[\sup_{(x, y) \in S^2} |\hat\pi_n(x, y) - \pi(x, y)|^2\Big] = O(\alpha_n),$$
      where $(\alpha_n)$ denotes a sequence of nonnegative numbers decaying to zero at infinity.
      B2. The parameters $S$ and $\phi$ are chosen so that $\inf_{x \in S} \phi(x) > 0$.
      B3. We have $\sup_{(x, y) \in S^2} \pi(x, y) < \infty$ and $\sup_{n \in \mathbb{N}} \sup_{(x, y) \in S^2} \hat\pi_n(x, y) < \infty$ $P_\nu$-a.s.
  • 26. Asymptotic results
      Theorem 1 (Central Limit Theorem). Suppose that assumptions A0-A2 (or A2bis) with $q = k = l = 2$ are fulfilled. Then we have the convergence in distribution under $P_\nu$:
      $$\sqrt{n}\, \big(U_n(h) - \mu(h)\big) \Rightarrow N(0, \sigma^2(h)), \quad \text{as } n \to \infty,$$
      where $\sigma^2(h) = 4\, E_A[h_1(B_1)^2] / \alpha^3$ (here $\alpha$ denotes the mean block length $E_A[\tau_A]$, consistent with the norming $(l_n/n)^3$ used for the variance estimator). The bootstrap analog also holds.
  • 27. Asymptotic bias
      Theorem 2 (Asymptotic bias). Suppose that assumptions A0 with $q = 4 + \delta$ for some $\delta > 0$, A1 with $k = 2$, A2 with $p = 2$, plus a Cramér condition, hold. Then, as $n \to \infty$, we have
      $$E_\nu[U_n(h)] = \mu(h) + \frac{2\Delta + 2\phi_\nu - 2\beta/\alpha + 2\gamma}{n} + O(n^{-3/2}),$$
      where
      $$\phi_\nu = E_\nu\Big[\sum_{i=1}^{\tau_A} h_0(X_i)\Big], \qquad \gamma = E_A\Big[\sum_{i=1}^{\tau_A} (\tau_A - i)\, h_0(X_i)\Big] \Big/ \alpha$$
      (contribution of the first and last blocks),
      $$\Delta = E_A\Big[\sum_{1 \le k < j \le \tau_A} h(X_k, X_j)\Big] \Big/ \alpha, \qquad \beta = E_A\Big[\tau_A \sum_{i=1}^{\tau_A} h_0(X_i)\Big]$$
      (contribution of incomplete blocks plus the randomness of $l_n$), with $h_0(x) = \int_{y \in E} \{h(x, y) - \mu(h)\}\, \mu(dy)$ for all $x \in E$.
      Proof: partitioning arguments, Malinovskii (1985).
  • 28. Estimation of the variance
      Jackknife-type estimator (Callaert and Veraverbeke, 1981):
      $$\hat h_{1,-j}(b) = \frac{1}{L-1} \sum_{k=1,\, k \ne j}^{L} \omega_h(b, B_k) - \frac{2}{L(L-1)} \sum_{1 \le k < l \le L} \omega_h(B_k, B_l),$$
      $$\hat s_L^2(h) = \frac{1}{L} \sum_{k=1}^{L} \hat h_{1,-k}^2(B_k).$$
      With norming constant related to the U-statistic $U_n(h)$:
      $$\hat\sigma_n^2(h) = 4\, (l_n / n)^3\, \hat s_{l_n - 1}^2(h).$$
      Define similarly the bootstrap version of the variance, say $\hat\sigma_n^{*2}(h)$, on blocks taken with replacement (either with $l_n$ fixed or by drawing the blocks sequentially until the size of the bootstrap sample is $n$, in which case the number $l_n^*$ of blocks is random).
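A sketch of the jackknife-type variance estimator, written directly from the three formulas of this slide. It assumes the complete blocks $B_1, \ldots, B_{l_n - 1}$ are given as a list, so that $L = l_n - 1$, and that a centering value standing in for the unknown $\mu(h)$ is supplied through the hypothetical argument `center` (for instance a plug-in value). The function names are illustrative.

```python
from typing import Callable, List, Sequence


def omega(
    block_k: Sequence[float],
    block_l: Sequence[float],
    kernel: Callable[[float, float], float],
    center: float,
) -> float:
    """omega_h(B_k, B_l): sum of h(x, y) - center over x in B_k and y in B_l."""
    return sum(kernel(x, y) - center for x in block_k for y in block_l)


def jackknife_variance(
    blocks: List[Sequence[float]],
    kernel: Callable[[float, float], float],
    center: float,
    n: int,
) -> float:
    """sigma_hat_n^2(h) = 4 (l_n / n)^3 s_hat^2_{l_n - 1}(h), with L = len(blocks) = l_n - 1."""
    num_blocks = len(blocks)
    if num_blocks < 2:
        raise ValueError("at least two complete blocks are needed")
    # Matrix of omega_h(B_k, B_l); the diagonal is never used.
    om = [
        [omega(bk, bl, kernel, center) if k != l else 0.0 for l, bl in enumerate(blocks)]
        for k, bk in enumerate(blocks)
    ]
    pair_sum = sum(om[k][l] for k in range(num_blocks) for l in range(k + 1, num_blocks))
    r_value = 2.0 * pair_sum / (num_blocks * (num_blocks - 1))
    # h_hat_{1,-k}(B_k): leave-one-out average of omega_h(B_k, .) minus R_L(h).
    s2 = 0.0
    for k in range(num_blocks):
        h1 = sum(om[k][l] for l in range(num_blocks) if l != k) / (num_blocks - 1) - r_value
        s2 += h1 * h1
    s2 /= num_blocks
    regenerations = num_blocks + 1  # l_n
    return 4.0 * (regenerations / n) ** 3 * s2


# Toy blocks, Gini kernel, center = 0 and n = 8 (all assumptions for illustration).
toy_sigma2 = jackknife_variance([[1.0], [2.0, 3.0], [5.0]], lambda x, y: abs(x - y), 0.0, 8)
```

The bootstrap version would apply the same function to blocks resampled with replacement.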
  • 29. Estimation of the variance and bootstrap
      Theorem 3 (Variance estimation). Suppose that assumptions A0-A2 (or A2bis and B1-B3) are fulfilled with $q = 4$. Then the statistic $\hat\sigma_n^2(h)$ is a strongly consistent estimator of $\sigma^2(h)$:
      $$\hat\sigma_n^2(h) \to \sigma^2(h) \quad P_\nu\text{-almost surely, as } n \to \infty,$$
      and the bootstrap version satisfies $\hat\sigma_n^{*2}(h) - \hat\sigma_n^2(h) \to 0$ in probability, a.s., as $n \to \infty$.
  • 30. Rate of convergence and the bootstrap
      Theorem (A Berry-Esseen bound and a rate for the bootstrap). Under assumptions A0 with $q = 3 + \varepsilon$, $\varepsilon > 0$, A1 with $k = 2$, A2 (or A2bis) with $l = 3$, there exists an explicit constant $C(h)$ depending only on the moments involved in hypotheses A1-A3 such that, as $n \to \infty$,
      $$\sup_{x \in \mathbb{R}} \Big| P_\nu\big(\sqrt{n}\, \sigma(h)^{-1} (U_n(h) - \mu(h)) \le x\big) - \Phi(x) \Big| \le C(h)\, n^{-1/2},$$
      where $\Phi$ denotes here the standard normal distribution function.
      It follows that (if in addition B1-B3 hold for the general case)
      $$\sup_{x \in \mathbb{R}} \Big| P^*\big(\sqrt{n}\, \hat\sigma_n(h)^{-1} (U_n^*(h) - U_n(h)) \le x\big) - P_\nu\big(\sqrt{n}\, \sigma(h)^{-1} (U_n(h) - \mu(h)) \le x\big) \Big| = O(n^{-1/2}).$$
      Proof: Stein method plus some simple partitioning arguments (improves over the results of Bolthausen (1986) for the mean).
  • 35. Second-order validity of the bootstrap?
      DOES NOT HOLD if the U-statistic is constructed on all the observations!
      DOES NOT HOLD if $l_n$ is held fixed in the bootstrap procedure.
      The first blocks should be dropped even in the case of the mean. The approximate regeneration scheme gives a rule to eliminate the data from a "burn-in period".
      Difficulty in proving second-order correctness (true in the case of the mean): partitioning arguments reduce to obtaining a local Edgeworth expansion for
      $$P\Big(\sum_{1 \le i \ne j \le m} \omega_h(B_i, B_j) / \sigma_h^2 \le y,\ \sum_{j=1}^{m} l(B_j) = k\Big).$$
      Usual methods (Dubinskaite, 1985-86) are very technical...
      Simultaneous control of the degenerate part of the U-statistic and the lattice part...
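Since the number of blocks should not be held fixed in the bootstrap, one way to resample is to draw blocks with replacement sequentially until the bootstrap trajectory reaches length $n$. The sketch below illustrates that idea only; the function name `sequential_block_bootstrap` is hypothetical, and stopping at the first time the length reaches or exceeds $n$ (keeping the last block) is an assumption of this sketch, since the slides do not spell out the stopping rule.

```python
import random
from typing import List, Sequence


def sequential_block_bootstrap(
    blocks: List[Sequence[float]], n: int, rng: random.Random
) -> List[Sequence[float]]:
    """Draw blocks with replacement until the total length reaches n.

    The number of drawn blocks is random, unlike a scheme that always draws
    a fixed number of blocks.
    """
    if not blocks or any(len(b) == 0 for b in blocks):
        raise ValueError("blocks must be a non-empty list of non-empty blocks")
    drawn: List[Sequence[float]] = []
    length = 0
    while length < n:
        block = rng.choice(blocks)
        drawn.append(block)
        length += len(block)
    return drawn


# Toy blocks (assumption); the seed only makes the illustration reproducible.
toy_blocks = [[3, 0], [0], [1, 2, 0]]
resampled = sequential_block_bootstrap(toy_blocks, 10, random.Random(0))
```

A bootstrap replicate of the statistic is then computed on the concatenation of the drawn blocks, or on the drawn blocks themselves for block-based statistics.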
  • 36. Simulation results
      Graph panel: Gini index, EXP-AR(1) model with $\alpha_1 = 0.8$ and …
  • 37. Basic lemma for establishing a Berry-Esseen type of bound
      Let $W_n$ be a r.v. such that $E W_n^2 = 1$ and such that, for some constant $C$, $W_n$ admits a Berry-Esseen type bound
      $$\| P(W_n < x) - \Phi(x) \|_\infty \le \frac{C}{n^{1/2}};$$
      then, for any random sequence $\Delta_n$, we have
      $$\| P(W_n + \Delta_n < x) - \Phi(x) \|_\infty \le \frac{C}{n^{1/2}} + 8\, E(\Delta_n^2)^{1/2}.$$
      Proof: the proof follows easily from the Stein method for establishing Berry-Esseen bounds; see for instance Shorack (2000), Lemma 1.3, p. 261.
  • 38.
      $$E_A\Big( |l_n - [n/\alpha]|^{-1/2} \sum_{j \in\, ]l_n - 1,\ [n/\alpha] - 1[} f(B_j) / \sigma(f) \Big)^2 = \sum_{l=1}^{n} \sum_{r=1}^{n} \sum_{s=1}^{n} a_{l,r,s} = I + II,$$
      with
      $$I = \sum_{l=1}^{[n/\alpha] - 1} \sum_{r=1}^{n} \sum_{s=1}^{n} a_{l,r,s} \quad \text{and} \quad II = \sum_{l=[n/\alpha] + 1}^{n} \sum_{r=1}^{n} \sum_{s=1}^{n} a_{l,r,s},$$
      and
      $$a_{l,r,s} = \int P_\nu(B_0 \in du,\ \tau_A = r)\, P_A(B_l \in dv,\ \tau_{l+1} > s) \int x^2\, P_A\Big( |l - [n/\alpha]|^{-1/2} \sum_{j \in\, ]l - 1,\ [n/\alpha] - 1[} f(B_j) / \sigma(f) \in dx,\ \sum_{j=1}^{l-1} l(B_j) = n \Big).$$
  • 39. Extreme values : dependence and tail index computation
  • 40. Bootstrap confidence intervals for the extremal index
  • 41. Thank you for your attention
