PART I
Approximation, Bounds, and Inequalities
©2001 CRC Press LLC
1
Nonuniform Bounds in Probability
Approximations Using Stein’s Method
Louis H. Y. Chen
National University of Singapore, Republic of Singapore
ABSTRACT Most of the work on Stein’s method deals with uniform
error bounds. In this paper, we discuss non-uniform error bounds using
Stein’s method in Poisson, binomial, and normal approximations.
Keywords and phrases Stein’s method, non-uniform bounds, proba-
bility approximations, Poisson approximation, binomial approximation,
normal approximation, concentration inequality approach, binary expan-
sion of a random integer
1.1 Introduction
In 1972 Stein introduced a method of normal approximation which does
not depend on Fourier analysis but involves solving a diﬀerential equa-
tion. Although his method was for normal approximation, his ideas
are applicable to other probability approximations. The method also
works better than the Fourier analytic method for dependent random
variables, particularly if the dependence is local or of a combinatorial
nature. Since the publication of this seminal work of Stein, numerous
papers have been written and Stein’s ideas applied in many diﬀerent con-
texts of probability approximation. Most notable of these works are in
normal approximation, Poisson approximation, Poisson process approxi-
mation, compound Poisson approximation and binomial approximation.
An account of Stein’s method and a brief history of its developments can
be found in Chen (1998).
In this paper we discuss another aspect of the application of Stein’s
method, not in terms of the approximating distribution but in terms of
the nature of the error bound. Most of the papers on Stein’s method
deal with uniform error bounds. We show that Stein’s method can also
be applied to obtain non-uniform error bounds of the best possible
order. Roughly speaking, a uniform bound is a bound on a metric between
two distributions, whereas a non-uniform bound on the discrepancy
between two distributions, L(W) and L(Z), is a bound on |Eh(W) − Eh(Z)|
that depends on h, for every h in a separating class. We will consider
non-uniform bounds in three diﬀerent contexts, Poisson, binomial, and
normal. In the exposition below, we will focus more on ideas than on
technical details.
1.2 Poisson Approximation
Poisson approximation using Stein’s method was ﬁrst investigated by
Chen (1975a). Since then many developments have taken place and
Poisson approximation has been applied to such diverse ﬁelds as ran-
dom graphs, molecular biology, computer science, probabilistic number
theory, extreme value theory, spatial statistics, and reliability theory,
where many problems can be phrased in terms of dependent events. See
for example Arratia, Goldstein, and Gordon (1990), Barbour, Holst, and
Janson (1992) and Chen (1993). All these results of Poisson approxima-
tion concern error bounds on the total variation distance between the
distribution of a sum of dependent indicator random variables and a
Poisson distribution. These bounds are therefore uniform bounds.
The possibility of nonuniform bounds in Poisson approximation using
Stein’s method was ﬁrst mentioned in Chen (1975b). For independent
indicator random variables, nonuniform bounds were ﬁrst obtained for
small and moderate λ by Chen and Choi (1992) and for unrestricted λ
with improved results by Barbour, Chen, and Choi (1995). To explain
the ideas behind obtaining nonuniform bounds, we ﬁrst illustrate how
a uniform bound is obtained in the context of independent indicator
random variables.
Let $X_1, \ldots, X_n$ be independent indicator random variables with
$P(X_i = 1) = 1 - P(X_i = 0) = p_i$, $i = 1, \ldots, n$. Define
$W = \sum_{i=1}^{n} X_i$, $W^{(i)} = W - X_i$, $\lambda = \sum_{i=1}^{n} p_i$
and $Z$ to be a Poisson random variable with mean $\lambda$. Let $f_h$ be
the solution (which is unique except at 0) of the Stein equation
$$
\lambda f(w + 1) - w f(w) = h(w) - Eh(Z)
$$
where $h$ is a bounded real-valued function defined on
$\mathbb{Z}_+ = \{0, 1, 2, \ldots\}$.
Then we have
$$
\begin{aligned}
Eh(W) - Eh(Z) &= E\{h(W) - Eh(Z)\} \\
&= E\{\lambda f_h(W + 1) - W f_h(W)\} \\
&= \sum_{i=1}^{n} p_i^2 \, E\,\Delta f_h(W^{(i)} + 1)
\end{aligned}
\tag{1.2.1}
$$
where $\Delta f(w) = f(w + 1) - f(w)$. A result of Barbour and Eagleson
(1983) states that $\|\Delta f_h\|_\infty \le 2(1 \wedge \lambda^{-1}) \|h\|_\infty$.
Applying this result, we obtain
$$
\begin{aligned}
d_{TV}(L(W), L(Z)) &= \sup_{A} |P(W \in A) - P(Z \in A)| \\
&= (1/2) \sup_{|h|=1} |Eh(W) - Eh(Z)| \\
&\le (1 \wedge \lambda^{-1}) \sum_{i=1}^{n} p_i^2
\end{aligned}
\tag{1.2.2}
$$
where $d_{TV}$ denotes the total variation distance. It is known that the
absolute constant 1 is best possible and the factor $(1 \wedge \lambda^{-1})$
has the correct order for both small and large values of $\lambda$. The
significance of the factor $(1 \wedge \lambda^{-1})$ is explained in
Chapter 1 of Barbour, Holst, and Janson (1992).
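The uniform bound (1.2.2) can be checked numerically for a concrete choice of the $p_i$. The following is a minimal sketch, not taken from the cited papers: the function names are hypothetical, the distribution of $W$ is computed by direct convolution, and the Poisson mass beyond $n$ is added in full because $W$ cannot exceed $n$.

```python
import math


def poisson_binomial_pmf(ps):
    """Exact pmf of W = X_1 + ... + X_n for independent indicators, P(X_i = 1) = ps[i]."""
    pmf = [1.0]
    for p in ps:
        nxt = [0.0] * (len(pmf) + 1)
        for r, mass in enumerate(pmf):
            nxt[r] += mass * (1.0 - p)   # case X_i = 0
            nxt[r + 1] += mass * p       # case X_i = 1
        pmf = nxt
    return pmf


def poisson_pmf(lam, r_max):
    """Poisson(lam) probabilities for r = 0..r_max, using P(r) = P(r-1) * lam / r."""
    pmf = [math.exp(-lam)]
    for r in range(1, r_max + 1):
        pmf.append(pmf[-1] * lam / r)
    return pmf


def tv_distance_and_bound(ps):
    """Return (d_TV(L(W), L(Z)), (1 ^ 1/lambda) * sum p_i^2) with lambda = sum p_i."""
    lam = sum(ps)
    w = poisson_binomial_pmf(ps)
    z = poisson_pmf(lam, len(ps))
    # W puts no mass above n, so the Poisson tail beyond n counts in full.
    tail = max(0.0, 1.0 - sum(z))
    tv = 0.5 * (sum(abs(a - b) for a, b in zip(w, z)) + tail)
    bound = min(1.0, 1.0 / lam) * sum(p * p for p in ps)
    return tv, bound
```

For example, `tv_distance_and_bound([0.1] * 20)` compares the binomial distribution with parameters (20, 0.1) against the Poisson distribution with mean 2; the second returned value is the right hand side of (1.2.2).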
To obtain a nonuniform bound, we let
$$
A_i(r) = \frac{P(W^{(i)} = r)}{P(Z = r)}.
$$
Then (1.2.1) can be rewritten as
$$
Eh(W) - Eh(Z) = \sum_{i=1}^{n} p_i^2 \, E A_i(Z) \Delta f_h(Z + 1)
\tag{1.2.3}
$$
where $h$ is no longer assumed to be bounded.
Let $C^* = \sup_{1 \le i \le n} \sup_{r \ge 0} A_i(r)$. Then
$$
|Eh(W) - Eh(Z)| \le C^* \sum_{i=1}^{n} p_i^2 \, E|\Delta f_h(Z + 1)|.
$$
What remains to be done is to calculate or bound $C^*$ and
$E|\Delta f_h(Z + 1)|$. In Barbour, Chen, and Choi (1995), it is shown that
for $\max_{1 \le i \le n} p_i \le 1/2$, $C^* \le 4e^{13/12}\sqrt{\pi}$, and
the following theorem was proved.
THEOREM 1.2.1 [Theorem 3.1 in Barbour, Chen, and Choi (1995)]
Let $h$ be a real-valued function defined on $\mathbb{Z}_+$ such that
$E Z^2 |h(Z)| < \infty$. We have
$$
|Eh(W) - Eh(Z)| \le C^* \sum_{i=1}^{n} p_i^2 \,
\bigl[ 4(1 \wedge \lambda^{-1}) E|h(Z + 1)| + E|h(Z + 2)|
- 2E|h(Z + 1)| + E|h(Z)| \bigr] / 2.
\tag{1.2.4}
$$
If $|h| = 1$, then we have
$$
d_{TV}(L(W), L(Z)) = (1/2) \sup_{|h|=1} |Eh(W) - Eh(Z)|
\le C^* (1 \wedge \lambda^{-1}) \sum_{i=1}^{n} p_i^2
$$
where the upper bound has the same order as that of (1.2.2), but it has a
larger absolute constant. However, the bound in (1.2.4) allows a very wide
choice of possible functions $h$, and therefore contains more information
than the total variation distance bound in (1.2.2).
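The constant $C^*$ can be evaluated exactly for a given set of $p_i$, which gives a feel for how conservative the general bound $4e^{13/12}\sqrt{\pi}$ is. The sketch below is illustrative only; the function names are hypothetical. It uses the fact that $A_i(r) = 0$ for $r \ge n$, so the supremum over $r \ge 0$ reduces to a maximum over $r = 0, \ldots, n - 1$.

```python
import math


def leave_one_out_pmf(ps, skip):
    """pmf of W^(i) = W - X_i: the sum of all indicators except index `skip`."""
    pmf = [1.0]
    for j, p in enumerate(ps):
        if j == skip:
            continue
        nxt = [0.0] * (len(pmf) + 1)
        for r, mass in enumerate(pmf):
            nxt[r] += mass * (1.0 - p)
            nxt[r + 1] += mass * p
        pmf = nxt
    return pmf


def c_star(ps):
    """C* = max over i and r of P(W^(i) = r) / P(Z = r), with Z ~ Poisson(sum ps)."""
    lam = sum(ps)
    best = 0.0
    for i in range(len(ps)):
        z_r = math.exp(-lam)              # P(Z = 0)
        for r, mass in enumerate(leave_one_out_pmf(ps, i)):
            best = max(best, mass / z_r)
            z_r *= lam / (r + 1)          # advance to P(Z = r + 1)
    return best
```

For twenty indicators with $p_i = 0.1$ the computed value is only slightly above 1, far below the general bound, which suggests that the constant in (1.2.4) is pessimistic in benign cases.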
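By iterating (1.2.1), we obtain
$$
\begin{aligned}
Eh(W) - Eh(Z) &= \sum_{i=1}^{n} p_i^2 \, E\Delta f_h(Z + 1) + \text{second order terms} \\
&= -\frac{1}{2} \sum_{i=1}^{n} p_i^2 \, E\Delta^2 h(Z) + \text{second order terms}
\end{aligned}
$$
where $E\Delta f_h(Z + 1) = -(1/2) E\Delta^2 h(Z)$ (see, for example, Chen and
Choi (1992), p. 1871).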
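One way to see the identity $E\Delta f_h(Z + 1) = -(1/2) E\Delta^2 h(Z)$ is sketched below. The sketch assumes that all expectations involved are finite and uses the Poisson identity $E\,Z g(Z) = \lambda E\, g(Z + 1)$. The second difference is taken to be $\Delta^2 h(z) = h(z + 2) - 2h(z + 1) + h(z)$, which is the usual convention but is an assumption here, since the text does not spell it out.

```latex
% Evaluate the Stein equation at w = Z + 1 and at w = Z + 2, then take expectations.
\begin{align*}
Eh(Z+1) - Eh(Z) &= \lambda E f_h(Z+2) - E (Z+1) f_h(Z+1) = -E f_h(Z+1), \\
Eh(Z+2) - Eh(Z) &= \lambda E f_h(Z+3) - E (Z+2) f_h(Z+2) = -2 E f_h(Z+2), \\
E \Delta f_h(Z+1) &= E f_h(Z+2) - E f_h(Z+1) \\
  &= -\tfrac{1}{2} \{ Eh(Z+2) - Eh(Z) \} + \{ Eh(Z+1) - Eh(Z) \} \\
  &= -\tfrac{1}{2} \{ Eh(Z+2) - 2 Eh(Z+1) + Eh(Z) \}
   = -\tfrac{1}{2} E \Delta^2 h(Z).
\end{align*}
```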
In Barbour, Chen, and Choi (1995), a more reﬁned result (Theorem
3.2) was obtained by bounding the second order error terms in the same
way the ﬁrst order error terms were bounded. From this theorem, a large
deviation result (Theorem 4.2) was proved which produces the following
corollary.
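COROLLARY 1.2.2
Let $z = \lambda + \xi\sqrt{\lambda}$. Suppose $\max_{1 \le i \le n} p_i \to 0$
and $\xi = o\bigl( [\lambda / \sum_{i=1}^{n} p_i^2]^{1/2} \bigr)$ as
$n \to \infty$. Then, as $n$, $z$ and $\xi \to \infty$,
$$
\frac{P(W \ge z)}{P(Z \ge z)} - 1 \sim -\frac{\xi^2}{2\lambda} \sum_{i=1}^{n} p_i^2 .
$$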
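The corollary is an asymptotic statement, but its prediction can be compared with exact tail probabilities at finite $n$. The sketch below does this under illustrative assumptions: the function names are hypothetical, $z$ is taken to be an integer, and the example uses identical $p_i$ purely for convenience.

```python
import math


def poisson_binomial_pmf(ps):
    """Exact pmf of a sum of independent indicators with success probabilities ps."""
    pmf = [1.0]
    for p in ps:
        nxt = [0.0] * (len(pmf) + 1)
        for r, mass in enumerate(pmf):
            nxt[r] += mass * (1.0 - p)
            nxt[r + 1] += mass * p
        pmf = nxt
    return pmf


def poisson_upper_tail(lam, z, terms=500):
    """P(Z >= z) for Z ~ Poisson(lam), summed term by term to avoid cancellation."""
    term = math.exp(-lam + z * math.log(lam) - math.lgamma(z + 1))
    total = 0.0
    for r in range(z, z + terms):
        total += term
        term *= lam / (r + 1)
    return total


def relative_tail_error(ps, z):
    """Return (P(W >= z)/P(Z >= z) - 1, -(xi^2 / (2 lambda)) * sum p_i^2)."""
    lam = sum(ps)
    w_tail = sum(poisson_binomial_pmf(ps)[z:])
    actual = w_tail / poisson_upper_tail(lam, z) - 1.0
    xi = (z - lam) / math.sqrt(lam)
    predicted = -(xi ** 2) / (2.0 * lam) * sum(p * p for p in ps)
    return actual, predicted
```

With `relative_tail_error([0.01] * 1000, 20)` one has $\lambda = 10$ and $\xi = \sqrt{10}$, so the predicted relative error is $-0.05$; the exact value is negative and of the same size, in line with the corollary.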
The following asymptotic result was also deduced.
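THEOREM 1.2.3
Let $N$ be a standard normal random variable. Let $h$ be a nonnegative
function defined on $\mathbb{R}$ which is continuous almost everywhere and not
identically zero. Suppose that
$\bigl\{ \bigl( \frac{Z - \lambda}{\sqrt{\lambda}} \bigr)^4 h\bigl( \frac{Z - \lambda}{\sqrt{\lambda}} \bigr) : \lambda \ge 1 \bigr\}$
is uniformly integrable. Then as $\lambda \to \infty$ such that
$\max_{1 \le i \le n} p_i \to 0$,
$$
\sum_{r=0}^{\infty} h\Bigl( \frac{r - \lambda}{\sqrt{\lambda}} \Bigr) |P(W = r) - P(Z = r)|
\sim \frac{1}{2\lambda} \Bigl( \sum_{i=1}^{n} p_i^2 \Bigr) E|N^2 - 1| h(N).
$$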
By letting $h \equiv 1$, $E|N^2 - 1| h(N) = E|N^2 - 1| = 2\sqrt{2/(\pi e)}$, and
Theorem 1.2.3 yields a result of Barbour and Hall (1984a, p. 477) and
Theorem 1.2 of Deheuvels and Pfeifer (1986).
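The value of $E|N^2 - 1|$ follows from a short calculation, sketched here with $\varphi$ denoting the standard normal density (a symbol introduced only for this sketch). It uses $E(N^2 - 1) = 0$ and the fact that $x\varphi(x)$ has derivative $(1 - x^2)\varphi(x)$.

```latex
\begin{align*}
E|N^2 - 1| &= E(N^2 - 1) + 2 E (1 - N^2) I(|N| \le 1)
            = 2 \int_{-1}^{1} (1 - x^2) \varphi(x) \, dx \\
           &= 2 \bigl[ x \varphi(x) \bigr]_{-1}^{1} = 4 \varphi(1)
            = \frac{4}{\sqrt{2 \pi e}} = 2 \sqrt{\frac{2}{\pi e}} \approx 0.968 .
\end{align*}
```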
Nonuniform bounds in compound Poisson approximation on a group
for small and moderate λ were ﬁrst obtained by Chen (1975b) and later
generalized and reﬁned by Chen and Roos (1995). In these papers, the
techniques were inspired by Stein’s method. The ﬁrst paper on com-
pound Poisson approximation using Stein’s method directly was by Bar-
bour, Chen, and Loh (1992).
1.3 Binomial Approximation: Binary Expansion of
a Random Integer
In his monograph, Stein (1986) considered the following problem. Let
$n$ be a natural number and let $X$ denote a random variable uniformly
distributed over the set $\{0, 1, \ldots, n - 1\}$. Let $W$ denote the number
of ones in the binary expansion of $X$ and let $Z$ be a binomial random
variable with parameters $(k, 1/2)$, where $k$ is the unique integer such
that $2^{k-1} < n \le 2^k$. If $n = 2^k$, then $W$ has the same distribution
as $Z$, otherwise it is a sum of dependent indicator random variables.
By using the solution of the Stein equation
$$
(k - x) f(x) - x f(x - 1) = h(x) - Eh(Z)
\tag{1.3.1}
$$
where $h = I_{\{r\}}$ and $r = 0, 1, \ldots, k$, Stein (1986) proved that
$$
\sup_{0 \le r \le k} |P(W = r) - P(Z = r)| \le 4/k.
$$
Diaconis (1977), jointly with Stein, proved a normal approximation
result for $W$ with an error bound of order $1/\sqrt{k}$. A combination of
this result with the normal approximation to the binomial distribution shows
that $\sup_{0 \le r \le k} |P(W \le r) - P(Z \le r)|$ is of the order of
$1/\sqrt{k}$.
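For moderate $n$ the distribution of $W$ can be enumerated directly and compared with the binomial distribution with parameters $(k, 1/2)$. The sketch below is a brute-force illustration with hypothetical function names; it assumes $n \ge 2$ and is only practical when looping over all of $\{0, \ldots, n - 1\}$ is affordable.

```python
import math


def ones_count_pmf(n):
    """pmf of W = number of ones in the binary expansion of X ~ Uniform{0..n-1}."""
    k = (n - 1).bit_length()            # unique k with 2^(k-1) < n <= 2^k (n >= 2)
    counts = [0] * (k + 1)
    for x in range(n):
        counts[bin(x).count("1")] += 1
    return [c / n for c in counts]


def compare_with_binomial(n):
    """Return (k, sup_r |P(W=r) - P(Z=r)|, d_TV(L(W), L(Z))) for Z ~ Bin(k, 1/2)."""
    w = ones_count_pmf(n)
    k = len(w) - 1
    z = [math.comb(k, r) / 2 ** k for r in range(k + 1)]
    diffs = [abs(a - b) for a, b in zip(w, z)]
    return k, max(diffs), 0.5 * sum(diffs)
```

When $n = 2^k$ both discrepancies are exactly zero, as stated above; for other $n$ the pointwise discrepancy can be compared with the bound $4/k$ of Stein (1986).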
Loh (1992) obtained a bound on the solution of a multivariate version
of (1.3.1) using the probabilistic approach of Barbour (1988). Using this
result of Loh and arguments in Stein (1986), we can obtain a bound of
order $1/\sqrt{k}$ on the total variation distance between L(W) and L(Z).
In an unpublished work of Chen and Soon (1994), which was based
on the Ph.D. dissertation of the latter, the method of obtaining nonuniform
bounds in Poisson approximation was applied to the approximation
of L(W) by L(Z). Apart from proving other results, this work shows
that the total variation distance between L(W) and L(Z) is, in many
instances, of much smaller order than $1/\sqrt{k}$.
Let $X = \sum_{i=1}^{k} X_i 2^{k-i}$ be the binary expansion of $X$, so that
$W = \sum_{i=1}^{k} X_i$. In Stein (1986, pp. 44–45), it is shown that
$$
Eh(W) - Eh(Z) = E\, Q f_h(W)
\tag{1.3.2}
$$
where $Q = |\{ j : X_j = 0 \text{ and } X + 2^{k-j} \ge n \}|$ and $f_h$ is the
solution of (1.3.1) with $h$ being a real-valued function defined on
$\{0, 1, \ldots, k\}$. Define
$$
\psi(r) = E[Q \mid W = r] \quad \text{and} \quad A(r) = \frac{P(W = r)}{P(Z = r)}.
$$
Then (1.3.2) can be written as
$$
Eh(W) - Eh(Z) = E\, \psi(Z) A(Z) f_h(Z).
\tag{1.3.3}
$$
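The identity behind (1.3.2) can be verified exactly by enumeration. The sketch below reads $Q$ as the number of zero digits of $X$ that cannot be switched to one without reaching $n$, that is, the number of $j$ with $X_j = 0$ together with $X + 2^{k-j} \ge n$. Under this reading, $E\{(k - W) f(W) - W f(W - 1)\} = E\, Q f(W)$ holds for every function $f$, and taking $f = f_h$ in the Stein equation for the binomial distribution gives $Eh(W) - Eh(Z)$ on the left. The function name and the test function are hypothetical.

```python
from fractions import Fraction


def stein_identity_sides(n, f):
    """Return (E[(k-W) f(W) - W f(W-1)], E[Q f(W)]) for X ~ Uniform{0..n-1}, n >= 2.

    f must be an integer-valued function on {0..k}; exact arithmetic is used.
    """
    k = (n - 1).bit_length()
    lhs = 0
    rhs = 0
    for x in range(n):
        w = bin(x).count("1")
        # Q counts positions j whose digit is 0 and where setting it would reach n.
        q = sum(
            1
            for j in range(1, k + 1)
            if ((x >> (k - j)) & 1) == 0 and x + (1 << (k - j)) >= n
        )
        lhs += (k - w) * f(w) - (w * f(w - 1) if w > 0 else 0)
        rhs += q * f(w)
    return Fraction(lhs, n), Fraction(rhs, n)
```

The two sides agree for any $n$, and both vanish when $n = 2^k$, which matches the fact that $W$ and $Z$ then have the same distribution.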
Let $l_k$ be the number of consecutive 1s, starting from the beginning, in
the binary expansion of $n - 1$. The relationship between $n - 1$ and $l_k$ is
given by
$$
n - 1 = \sum_{i=1}^{l_k} 2^{k-i} + m
$$
where $0 \le m < 2^{k - l_k - 1}$. It is shown in Chen and Soon (1994) that for
$0 \le r \le k - 1$, $l_k/k \le A(r) \le 2$. By obtaining upper and lower bounds
on the right hand side of (1.3.3), the following theorem was proved.
THEOREM 1.3.1
Assume that $2^{k-1} < n < 2^k$.
(i) If $\lim_{k \to \infty} l_k / \sqrt{k} = \infty$, then
$$
d_{TV}(L(W), L(Z)) \asymp 2^{-l_k}.
$$
(ii) If $\limsup_{k \to \infty} l_k / \sqrt{k} < \infty$, then
$$
d_{TV}(L(W), L(Z)) \asymp 2^{-l_k} \, \frac{l_k}{\sqrt{k}}
$$
where $x_k \asymp y_k$ means that there exist positive constants $a < b$ such
that $a \le x_k / y_k \le b$ for sufficiently large $k$.
From this theorem it follows that
$$
d_{TV}(L(W), L(Z)) \asymp \frac{1}{\sqrt{k}}
$$
if and only if
$$
0 < \liminf_{k \to \infty} l_k \le \limsup_{k \to \infty} l_k < \infty.
$$
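The quantity $l_k$ and the remainder $m$ are easy to read off from the binary digits of $n - 1$. The sketch below computes them and also evaluates the smallest and largest value of $A(r)$ over $0 \le r \le k - 1$ by enumeration, so that the range $l_k/k \le A(r) \le 2$ quoted from Chen and Soon (1994) can be inspected for a particular $n$. The function names are hypothetical and $n \ge 2$ is assumed.

```python
import math


def leading_ones(n):
    """Return (k, l_k, m) with n - 1 = sum_{i=1}^{l_k} 2^(k-i) + m, for n >= 2."""
    bits = format(n - 1, "b")                  # k binary digits, first digit is 1
    k = len(bits)
    l = len(bits) - len(bits.lstrip("1"))      # length of the leading run of 1s
    m = (n - 1) - sum(1 << (k - i) for i in range(1, l + 1))
    return k, l, m


def density_ratio_range(n):
    """Return (min, max) of A(r) = P(W = r) / P(Z = r) over r = 0..k-1."""
    k, _, _ = leading_ones(n)
    counts = [0] * (k + 1)
    for x in range(n):
        counts[bin(x).count("1")] += 1
    ratios = [(counts[r] / n) / (math.comb(k, r) / 2 ** k) for r in range(k)]
    return min(ratios), max(ratios)
```

For $n = 1000$ one has $n - 1 = 1111100111$ in binary, so $k = 10$, $l_k = 5$ and $m = 7$, and the computed ratios lie between $0.512$ and $1.024$.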
The following theorems were also proved.
THEOREM 1.3.2
$$
|Eh(W) - Eh(Z)| \le \frac{13}{\sqrt{k}} \,
E \left| \frac{Z - [k/2] - 1}{\sqrt{k/4}} \right|
\bigl( |h(Z)| + |h(Z + 1)| + 2|Eh(Z)| \bigr).
$$
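THEOREM 1.3.3
Let $a = [k/2] + b_k$ where $b_k / \sqrt{k} \to \infty$ and $b_k / k \to 0$ as
$k \to \infty$. If $l_k = l$ for all sufficiently large $k$, then
$$
\frac{P(W \ge a)}{P(Z \ge a)} - 1 \sim -2 \psi\bigl( [k/2] \bigr) \frac{b_k}{k}
$$
as $k \to \infty$, where
$l \, (1/2 - (l - 1)/[2(k - l + 1)])^{l+1} < \psi([k/2]) \le 3$.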
Theorem 1.3.3 is in fact a corollary of a more general large deviation
theorem.
1.4 Normal Approximation
Let $X_1, \ldots, X_n$ be independent random variables with $EX_i = 0$,
$\mathrm{var}(X_i) = \sigma_i^2$, $E|X_i|^3 = \gamma_i < \infty$ and
$\sum_{i=1}^{n} \sigma_i^2 = 1$. Let $F$ be the distribution function of
$\sum_{i=1}^{n} X_i$ and let $\Phi$ be the standard normal distribution
function. The Berry-Esseen Theorem states that
$$
\sup_{-\infty < x < \infty} |F(x) - \Phi(x)| \le C \sum_{i=1}^{n} \gamma_i
$$
where $C$ is an absolute constant. The smallest value of $C$, obtained so
far by Van Beek (1972) (without using computers), is 0.7975.
If $X_1, \ldots, X_n$ are independent and identically distributed, then
$$
\sup_{-\infty < x < \infty} |F(x) - \Phi(x)| \le C n \gamma
$$
where $\gamma = \gamma_i$ for $i = 1, \ldots, n$. Nonuniform bounds were first
obtained by Esseen (1945) who proved that for the i.i.d. case
$$
|F(x) - \Phi(x)| \le \frac{\lambda \log n}{\sqrt{n} (1 + x^2)}
$$
and
$$
|F(x) - \Phi(x)| \le \frac{\lambda \log(2 + |x|)}{\sqrt{n} (1 + x^2)}
$$
where $\lambda$ depends on $n^{3/2} \gamma$. Nagaev (1965) improved the upper
bounds to $C n \gamma / (1 + |x|^3)$, also for the i.i.d. case. This was
generalized by Bikelis (1966) who proved that, for independent and not
necessarily identically distributed random variables,
$$
|F(x) - \Phi(x)| \le \frac{C \sum_{i=1}^{n} \gamma_i}{1 + |x|^3}
$$
where $C$ is an absolute constant. Paditz (1977) calculated $C$ to be 114.7
and Michel (1981) reduced it to 30.54 for the i.i.d. case. All the above
proofs used the Fourier analytic method. Chen and Shao (2000) used
Stein’s method to prove the following more general result:
$$
|F(x) - \Phi(x)| \le C \sum_{i=1}^{n} \left\{
\frac{E X_i^2 I(|X_i| > 1 + |x|)}{(1 + |x|)^2}
+ \frac{E|X_i|^3 I(|X_i| \le 1 + |x|)}{(1 + |x|)^3}
\right\}
$$
where the existence of third moments is no longer assumed. Their proof
is based on truncation and the concentration inequality approach. The
concentration inequality approach was originally used by Stein for the
i.i.d. case (see Ho and Chen (1978)). It was extended by Chen (1986)
to dependent and non-identically distributed random variables with ar-
bitrary index set. A proof of the Berry-Esseen Theorem for independent
and non-identically distributed random variables using the concentration
inequality approach is given in Section 2 of Chen (1998).
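The uniform and nonuniform bounds quoted above can be illustrated on the simplest i.i.d. example, where the exact distribution function is available. The sketch below takes $X_i = \pm 1/\sqrt{n}$ with probability 1/2 each, so that $\sigma_i^2 = 1/n$ and $\sum \gamma_i = 1/\sqrt{n}$; this choice and the function names are assumptions made for illustration. Since $F$ is a step function, the uniform discrepancy is attained at an atom (from the left or from the right). The weighted discrepancy is evaluated at the atoms only, so it is a lower estimate of the supremum over all $x$.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def rademacher_discrepancies(n):
    """For X_i = +-1/sqrt(n), return (sup |F - Phi|,
    max over atoms of (1 + |x|^3) |F - Phi|, sum of gamma_i)."""
    sum_gamma = 1.0 / math.sqrt(n)          # n * E|X_i|^3 = n * n^(-3/2)
    uniform = 0.0
    weighted = 0.0
    cdf = 0.0
    for j in range(n + 1):                  # j = number of +1 signs
        x = (2 * j - n) / math.sqrt(n)      # location of the atom
        phi = normal_cdf(x)
        left = abs(cdf - phi)               # discrepancy just below the atom
        cdf += math.comb(n, j) / 2 ** n
        right = abs(cdf - phi)              # discrepancy at the atom
        d = max(left, right)
        uniform = max(uniform, d)
        weighted = max(weighted, (1.0 + abs(x) ** 3) * d)
    return uniform, weighted, sum_gamma
```

For instance, `rademacher_discrepancies(25)` returns a uniform discrepancy well below $0.7975 \sum \gamma_i$ and a weighted discrepancy well below $30.54 \sum \gamma_i$, the constants of Van Beek (1972) and Michel (1981) quoted in the text.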
The concentration inequality approach is not the only approach for
obtaining Berry-Esseen bounds using Stein’s method. Another approach
based on inductive arguments has been used by Barbour and Hall (1984b),
Bolthausen (1984) and Stroock (1993).
We would like to mention in passing that Stein’s method has also been
applied to obtain bounds on the total variation distances between the
standard normal distribution and distributions satisfying certain varia-
tional inequalities. See Utev (1989) and Cacoullos, Papathanasiou, and
Utev (1994).
1.5 Conclusion
We would like to conclude by saying that there is much more to be done
in the direction of nonuniform bounds, particularly for dependent ran-
dom variables both in Poisson approximation and normal approximation.
The large deviation results referred to in the above sections are actually
those of moderate deviation. A related question therefore is how Stein’s
method can be applied to obtain results which cover both moderate and
really large deviations.
Acknowledgement This work is partially supported by grant
RP3982719 at the National University of Singapore. I would like to
thank K. P. Choi and Qi-Man Shao for their help in preparing the
manuscript and for their helpful comments.
References
1. Arratia, R., Goldstein, L., and Gordon, L. (1990). Poisson approxi-
mation and the Chen-Stein method. Statistical Science 5, 403–434.
2. Barbour, A. D. (1988). Stein’s method and Poisson process con-
vergence. Journal of Applied Probability 25 (A), 175–184.
3. Barbour, A. D., Chen, L. H. Y., and Choi, K. P. (1995). Poisson
approximation for unbounded functions, I: independent summands.
Statistica Sinica 5, 749–766.
4. Barbour, A. D., Chen, L. H. Y., and Loh, W. L. (1992). Com-
pound Poisson approximation for nonnegative random variables
via Stein’s method. Annals of Probability 20, 1843–1866.
5. Barbour, A. D. and Eagleson, G. (1983). Poisson approximation for
some statistics based on exchangeable trials. Advances in Applied
Probability 15, 585–600.
6. Barbour, A. D. and Hall, P. (1984a). On the rate of Poisson conver-
gence. Mathematical Proceedings of the Cambridge Philosophical
Society 95, 473–480.
7. Barbour, A. D. and Hall, P. (1984b). Stein’s method and the Berry-
Esseen theorem. Australian Journal of Statistics 26, 8–15.
8. Barbour, A. D., Holst, L., and Janson, S. (1992). Poisson Ap-
proximation. Oxford Studies in Probability 2, Clarendon Press,
Oxford.
9. Bikelis, A. (1966). Estimates of the remainder in the central limit
theorem. Litovsk. Mat. Sb. 6(3), 323–346 (in Russian).
10. Bolthausen, E. (1984). An estimate of the remainder in a combina-
torial central limit theorem. Zeitschrift für Wahrscheinlichkeitstheorie
und Verwandte Gebiete 66, 379–386.
11. Cacoullos, T., Papathanasiou, V. and Utev, S. A. (1994). Varia-
tional inequalities with examples and an application to the central
limit theorem. Annals of Probability 22, 1607–1618.
12. Chen, L. H. Y. (1975a). Poisson approximation for dependent tri-
als. Annals of Probability 3, 534–545.
13. Chen, L. H. Y. (1975b). An approximation theorem for convolu-
tions of probability measures. Annals of Probability 3, 992–999.
14. Chen, L. H. Y. (1986). The rate of convergence in a central limit
theorem for dependent random variables with arbitrary index set.
IMA Preprint Series #243, University of Minnesota.
15. Chen, L. H. Y. (1993). Extending the Poisson approximation. Sci-
ence 262, 379–380.
16. Chen, L. H. Y. (1998). Stein’s method: some perspectives with
applications. Probability Towards 2000 (Eds., L. Accardi and C.
Heyde), pp. 97–122. Lecture Notes in Statistics No. 128. Springer
Verlag.
17. Chen, L. H. Y. and Choi, K. P. (1992). Some asymptotic and large
deviation results in Poisson approximation. Annals of Probability
20, 1867–1876.
18. Chen, L. H. Y. and Roos, M. (1995). Compound Poisson approx-
imation for unbounded functions on a group, with application to
large deviations. Probability Theory and Related Fields 103, 515–
528.
19. Chen, L. H. Y. and Shao, Q. M. (2000). A non-uniform Berry-
Esseen bound via Stein’s method. Preprint.
20. Chen, L. H. Y. and Soon, S. Y. T. (1994). On the number of
ones in the binary expansion of a random integer. Unpublished
manuscript.
21. Deheuvels, P. and Pfeifer, D. (1986). A semigroup approach to
Poisson approximation. Annals of Probability 14, 663–676.
22. Diaconis, P. (1977). The distribution of leading digits and uniform
distribution mod 1. Annals of Probability 5, 72–81.
23. Esseen, C.-G. (1945). Fourier analysis of distribution functions: a
mathematical study of the Laplace-Gaussian law. Acta Mathemat-
ica 77, 1–125.
24. Ho, S. T. and Chen, L. H. Y. (1978). An Lp bound for the remain-
der in a combinatorial central limit theorem. Annals of Probability
6, 231–249.
25. Loh, W. L. (1992). Stein’s method and multinomial approximation.
Annals of Applied Probability 2, 536–554.
26. Michel, R. (1981). On the constant in the non-uniform version of
the Berry-Esseen Theorem. Zeitschrift für Wahrscheinlichkeitstheorie
und Verwandte Gebiete 55, 109–117.
27. Nagaev, S. V. (1965). Some limit theorems for large deviations.
Theory of Probability and its Applications 10, 214–235.
28. Paditz, L. (1977). Über die Annäherung der Verteilungsfunktionen
von Summen unabhängiger Zufallsgrößen gegen unbegrenzt teil-
bare Verteilungsfunktionen unter besonderer Beachtung der
Verteilungsfunktion der standardisierten Normalverteilung. Dis-
sertation A, TU Dresden.
29. Soon, S. Y. T. (1993). Some Problems in Binomial and Compound
Poisson Approximations. Ph.D. dissertation, National University
of Singapore.
30. Stein, C. (1972). A bound for the error in the normal approx-
imation to the distribution of a sum of dependent random vari-
ables. Proceedings of the Sixth Berkeley Symposium on Mathe-
matical Statistics and Probability 2, 583–602, University of California
Press, Berkeley, California.
31. Stein, C. (1986). Approximate Computation of Expectations.
Lecture Notes 7, Institute of Mathematical Statistics, Hayward,
California.
32. Stroock, D. W. (1993). Probability Theory: An Analytic View.
Cambridge University Press, Cambridge, U.K.
33. Utev, S. A. (1989). Probability problems connected with a certain
integrodifferential inequality. Siberian Mathematical Journal 30,
490–493.
34. Van Beek, P. (1972). An application of Fourier methods to
the problem of sharpening the Berry-Esseen inequality. Zeitschrift
für Wahrscheinlichkeitstheorie und Verwandte Gebiete 23, 187–196.