On Nonparametric Density Estimation for Size Biased Data
Yogendra P. Chaubey
Department of Mathematics and Statistics
Concordia University, Montreal, Canada H3G 1M8
E-mail: yogen.chaubey@concordia.ca
Talk to be presented at the Indian Statistical Institute, November 19, 2014
Yogendra Chaubey (Concordia University) Department of Mathematics & Statistics November 19, 2014 1 / 70
Abstract
This talk will highlight some recent developments in the area of nonparametric functional estimation, with emphasis on nonparametric density estimation for size biased data. Such data entail constraints that many traditional nonparametric density estimators may not satisfy. A lemma attributed to Hille, and its generalization [see Lemma 1, Feller (1965), An Introduction to Probability Theory and Its Applications, §VII.1], is used to propose estimators in this context. After describing the asymptotic properties of the estimators, we present the results of a simulation study comparing various nonparametric density estimators.
Outline
1. Introduction/Motivation
   1.1 Kernel Density Estimator
   1.2 Smooth Estimation of Densities on R+
2. An Approximation Lemma and Some Alternative Smooth Density Estimators
   2.1 Some Alternative Smooth Density Estimators on R+
   2.2 Asymptotic Properties of the New Estimator
   2.3 Extensions to Non-iid Cases
3. Estimation of Density in Length-biased Data
   3.1 Smooth Estimators Based on the Estimators of G
   3.2 Smooth Estimators Based on the Estimators of F
4. A Comparison Between Different Estimators: Simulation Studies
   4.1 Simulation for χ²₂
   4.2 Simulation for χ²₆
   4.3 Simulation for Some Other Standard Distributions
1. Introduction/Motivation 
1.1 Kernel Density Estimator 
Consider X as a non-negative random variable with density f(x) and distribution function

F(x) = \int_0^x f(t)\,dt, \quad x \ge 0. \qquad (1.1)

Such random variables occur frequently in practice, e.g. in life testing and reliability.
Based on a random sample (X_1, X_2, \dots, X_n) from a univariate density f(\cdot), the empirical distribution function (edf) is defined as

F_n(x) = \frac{1}{n} \sum_{i=1}^{n} I(X_i \le x). \qquad (1.2)

The edf is not smooth enough to provide an estimator of f(x).
Various smoothing methods are available (viz., kernel smoothing, histogram methods, splines, orthogonal functions).
The most popular is the kernel method (Rosenblatt, 1956).
[See the text Nonparametric Functional Estimation by Prakasa Rao (1983) for a theoretical treatment of the subject, or Silverman (1986).]
\hat f_n(x) = \frac{1}{n} \sum_{i=1}^{n} k_h(x - X_i) = \frac{1}{nh} \sum_{i=1}^{n} k\left(\frac{x - X_i}{h}\right), \qquad (1.3)

where the function k(\cdot), called the kernel function, has the following properties:

(i) k(-x) = k(x);
(ii) \int_{-\infty}^{\infty} k(x)\,dx = 1;

and

k_h(x) = \frac{1}{h}\, k\left(\frac{x}{h}\right).

h is known as the bandwidth and is made to depend on n, i.e. h \equiv h_n, such that h_n \to 0 and n h_n \to \infty as n \to \infty.

Basically, k is a symmetric probability density function on the entire real line. This may present problems in estimating the densities of non-negative random variables.
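As a quick numerical illustration (mine, not from the talk) of estimator (1.3) and the boundary problem just described, consider a Gaussian kernel applied to non-negative data:

```python
import numpy as np

def kde(x, data, h):
    """Eq. (1.3): f_hat(x) = (1/(n h)) * sum_i k((x - X_i)/h),
    here with the standard Gaussian kernel (symmetric, integrates to 1)."""
    u = (x[None, :] - data[:, None]) / h            # (n, m) matrix of scaled gaps
    k = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)  # kernel values
    return k.mean(axis=0) / h

rng = np.random.default_rng(0)
sample = rng.exponential(scale=1.0, size=500)       # non-negative data
grid = np.linspace(-1.0, 8.0, 400)
fhat = kde(grid, sample, h=0.3)

# The estimate leaks probability mass below 0, although f(x) = 0 there:
leak = np.trapz(fhat[grid < 0], grid[grid < 0])
```

The positive `leak` is exactly the defect motivating the estimators on R+ discussed next.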
Kernel Density Estimators for Suicide Data

[Figure 1. Kernel density estimators for the suicide study data of Silverman (1986); the four curves use the default, SJ, UCV and BCV bandwidth choices.]
1.2. Smooth Estimation of Densities on R+

\hat f_n(x) might take positive values even for x \in (-\infty, 0], which is not desirable if the random variable X is positive. Silverman (1986) mentions some adaptations of the existing methods, through transformations and other methods, for the case where the support of the density to be estimated is not the whole real line.

1.2.1 Bagai-Prakasa Rao Estimator
Bagai and Prakasa Rao (1996) proposed the following adaptation of the kernel density estimator for non-negative support [which does not require any transformation or corrective strategy]:

f_n(x) = \frac{1}{n h_n} \sum_{i=1}^{n} k\left(\frac{x - X_i}{h_n}\right), \quad x \ge 0. \qquad (1.4)
Here k(\cdot) is a bounded density function with support (0, \infty), satisfying

\int_0^{\infty} x^2 k(x)\,dx < \infty,

and h_n is a sequence such that h_n \to 0 and n h_n \to \infty as n \to \infty.

The only difference between \hat f_n(x) and f_n(x) is that the former is based on a kernel whose support may extend beyond (0, \infty).

One undesirable property of this estimator is that for x such that X_{(r)} \le x < X_{(r+1)}, only the first r order statistics contribute to the estimator f_n(x).
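This truncation property is easy to verify numerically; here is a minimal sketch (my illustration) of estimator (1.4) with the kernel k(u) = e^{-u} on (0, ∞), which satisfies the moment condition above:

```python
import numpy as np

def bp_rao(x, data, h):
    """Eq. (1.4) with kernel k(u) = exp(-u) on (0, inf): since k((x - X_i)/h)
    is zero for X_i > x, only observations below x contribute."""
    u = (x - data[:, None]) / h
    k = np.where(u >= 0.0, np.exp(-np.maximum(u, 0.0)), 0.0)
    return k.mean(axis=0) / h

rng = np.random.default_rng(1)
sample = np.sort(rng.exponential(size=100))
x0 = np.array([sample[49] + 1e-9])          # a point between X_(50) and X_(51)

full = bp_rao(x0, sample, h=0.5)[0]
# Recompute using only the 50 smallest order statistics:
from_lower_half = np.sum(np.exp(-(x0[0] - sample[:50]) / 0.5)) / (100 * 0.5)
```

The two values agree, confirming that the upper half of the sample is ignored at this x.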
Bagai-Prakasa Rao Density Estimators for Suicide Data

[Figure 2. Bagai-Prakasa Rao density estimators for the suicide study data of Silverman (1986); the four curves use the default, SJ, UCV and BCV bandwidth choices.]
2.1 An Approximation Lemma

The following discussion gives a general approach to density estimation, which may be specialized to the case of non-negative data. The key result for the proposal is the following lemma, given in Feller (1965, §VII.1).

Lemma 1: Let u be any bounded and continuous function, and let G_{x,n}, n = 1, 2, \dots, be a family of distributions with mean \mu_n(x) and variance h_n^2(x) such that \mu_n(x) \to x and h_n(x) \to 0. Then

\tilde u(x) = \int_{-\infty}^{\infty} u(t)\,dG_{x,n}(t) \to u(x). \qquad (2.1)

The convergence is uniform in every subinterval in which h_n(x) \to 0 uniformly and u is uniformly continuous.
This generalization may be adapted for smooth estimation of the distribution function by replacing u(x) by the empirical distribution function F_n(x), as given below:

\tilde F_n(x) = \int_{-\infty}^{\infty} F_n(t)\,dG_{x,n}(t). \qquad (2.2)

Note that F_n(x) is not a continuous function, as required by the above lemma; hence the lemma is not used directly in proposing the estimator, but it works as a motivation for the proposal. The estimator can be considered a stochastic adaptation, in the sense that the mathematical convergence is transformed into stochastic convergence paralleling the strong convergence of the empirical distribution function, as stated in the following theorem.
Theorem 1: Let h_n(x) be the variance of G_{x,n} as in Lemma 1, such that h_n(x) \to 0 for every fixed x as n \to \infty; then we have

\sup_x |\tilde F_n(x) - F(x)| \overset{a.s.}{\to} 0 \qquad (2.3)

as n \to \infty.

Technically, G_{x,n} can have any support, but it may be prudent to choose it to have the same support as the random variable under consideration, because this removes the problem of the estimator assigning positive mass to undesired regions.

For \tilde F_n(x) to be a proper distribution function, G_{x,n}(t) must be a decreasing function of x, which can be shown using an alternative form of \tilde F_n(x):

\tilde F_n(x) = 1 - \frac{1}{n} \sum_{i=1}^{n} G_{x,n}(X_i). \qquad (2.4)
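For instance (my illustration, not from the talk), taking G_{x,n} to be the N(x, h²) distribution function gives a smooth, nondecreasing estimate of F via Eq. (2.4):

```python
import numpy as np
from math import erf, sqrt

def smooth_edf(x, data, h):
    """Eq. (2.4): F~(x) = 1 - (1/n) sum_i G_{x,n}(X_i), with
    G_{x,n}(t) = Phi((t - x)/h) the N(x, h^2) distribution function.
    G_{x,n}(X_i) decreases in x, so F~ is nondecreasing in x."""
    z = (np.asarray(data) - x) / h
    Phi = np.array([0.5 * (1.0 + erf(v / sqrt(2.0))) for v in z])
    return 1.0 - Phi.mean()

rng = np.random.default_rng(2)
data = rng.uniform(size=1000)
vals = [smooth_edf(x, data, h=0.05) for x in np.linspace(-0.5, 1.5, 41)]
```

`vals` rises smoothly from 0 to 1 and tracks the true uniform distribution function.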
Equation (2.4) suggests a smooth density estimator given by

\tilde f_n(x) = \frac{d \tilde F_n(x)}{dx} = -\frac{1}{n} \sum_{i=1}^{n} \frac{d}{dx} G_{x,n}(X_i). \qquad (2.5)

The potential of this lemma for smooth density estimation was recognized by Gawronski (1980) in his doctoral thesis written at Ulm. Gawronski and Stadtmüller (1980, Scand. J. Statist.) investigated mean square error properties of the density estimator when G_{x,n} is obtained by putting the Poisson weight

p_k(nx) = e^{-nx} \frac{(nx)^k}{k!} \qquad (2.6)

on the lattice points k/n, k = 0, 1, 2, \dots
Other developments:
This lemma has been further used to motivate the Bernstein polynomial estimator (Vitale, 1975) for densities on [0, 1] by Babu, Canty and Chaubey (1999). Gawronski (1985, Period. Math. Hung.) investigates other lattice distributions, such as the negative binomial distribution.
Some other developments:
Chaubey and Sen (1996, Statist. Dec.): survival functions, though in a truncated form.
Chaubey and Sen (1999, JSPI): mean residual life. Chaubey and Sen (1998a, Persp. Stat., Narosa Pub.): hazard and cumulative hazard functions. Chaubey and Sen (1998b): censored data.
Chaubey and Sen (2002a, 2002b): multivariate density estimation.
Smooth density estimation under some constraints: Chaubey and Kochar (2000, 2006); Chaubey and Xu (2007, JSPI).
Babu and Chaubey (2006): density estimation on hypercubes. [See also Prakasa Rao (2005), Kakizawa (2011) and Bouezmarni et al. (2010, JMVA) for generalised Bernstein polynomials and Bernstein copulas.]
A Generalised Kernel Estimator for Densities with Non-Negative Support

Lemma 1 motivates the generalised kernel estimator of Földes and Révész (1974):

f_{nGK}(x) = \frac{1}{n} \sum_{i=1}^{n} h_n(x; X_i).

Chaubey et al. (2012, J. Ind. Stat. Assoc.) show the following adaptation using asymmetric kernels for estimation of densities with non-negative support.

Let Q_v(x) represent a distribution on [0, \infty) with mean 1 and variance v^2; then an estimator of F(x) is given by

F_n^+(x) = 1 - \frac{1}{n} \sum_{i=1}^{n} Q_{v_n}\left(\frac{X_i}{x}\right), \qquad (2.7)

where v_n \to 0 as n \to \infty.
Obviously, this choice uses G_{x,n}(t) = Q_{v_n}(t/x), which is a decreasing function of x.

This leads to the following density estimator:

\frac{d}{dx} F_n^+(x) = \frac{1}{n x^2} \sum_{i=1}^{n} X_i\, q_{v_n}\left(\frac{X_i}{x}\right), \qquad (2.8)

where q_v(\cdot) denotes the density corresponding to the distribution function Q_v(\cdot).
However, the above estimator may not be defined at x = 0, except in cases where \lim_{x \to 0} \frac{d}{dx} F_n^+(x) exists. Moreover, this limit is typically zero, which is acceptable only when we are estimating a density f with f(0) = 0.

Thus, with a view to the more general case where 0 \le f(0) < \infty, we considered the following perturbed version of the above density estimator:

f_n^+(x) = \frac{1}{n (x + \epsilon_n)^2} \sum_{i=1}^{n} X_i\, q_{v_n}\left(\frac{X_i}{x + \epsilon_n}\right), \quad x \ge 0, \qquad (2.9)

where \epsilon_n \downarrow 0 at an appropriate (sufficiently slow) rate as n \to \infty. In the sequel, we illustrate our method by taking Q_v(\cdot) to be the Gamma(\alpha = 1/v^2, \beta = v^2) distribution function.
Remark:
Note that if we believe that the density is zero at zero, we set \epsilon_n \equiv 0; in general, however, it may be determined using cross-validation methods. For \epsilon_n > 0, this modification results in a defective distribution F_n^+(x + \epsilon_n). A corrected density estimator f_n^*(x) is therefore proposed:

f_n^*(x) = \frac{f_n^+(x)}{c_n}, \qquad (2.10)

where c_n is a constant given by

c_n = \frac{1}{n} \sum_{i=1}^{n} Q_{v_n}\left(\frac{X_i}{\epsilon_n}\right).

Note that, since \epsilon_n \to 0 as n becomes large, f_n^*(x) and f_n^+(x) are asymptotically equivalent, so we study the asymptotic properties of f_n^+(x) only.
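A small numerical sketch (my illustration) of the gamma-kernel version of Eq. (2.9), with Q_v the Gamma(α = 1/v², β = v²) distribution used in the talk; the perturbation ε is set to zero here for simplicity:

```python
import numpy as np
from math import lgamma

def gamma_kernel_density(x, data, v, eps=0.0):
    """Eq. (2.9): f_n^+(x) = (1/(n (x+eps)^2)) sum_i X_i q_v(X_i/(x+eps)),
    with q_v the Gamma(shape a = 1/v^2, scale b = v^2) density
    (mean ab = 1, variance ab^2 = v^2), evaluated on the log scale."""
    a, b = 1.0 / v**2, v**2
    data = np.asarray(data)
    t = data / (x + eps)
    log_q = (a - 1.0) * np.log(t) - t / b - lgamma(a) - a * np.log(b)
    return np.sum(data * np.exp(log_q)) / (len(data) * (x + eps) ** 2)

rng = np.random.default_rng(3)
data = rng.exponential(size=800)
grid = np.linspace(1e-3, 12.0, 600)
fhat = np.array([gamma_kernel_density(g, data, v=0.2) for g in grid])
total = np.trapz(fhat, grid)   # each summand integrates to 1 over x in (0, inf)
```

With eps = 0 the estimate vanishes as x → 0, illustrating why the perturbation is needed when f(0) > 0.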
Next we present a comparison of our approach with some existing estimators.

Kernel Estimator.
The usual kernel estimator is a special case of the representation given by Eq. (2.5), obtained by taking G_{x,n}(\cdot) as

G_{x,n}(t) = K\left(\frac{t - x}{h}\right), \qquad (2.11)

where K(\cdot) is a distribution function with mean zero and variance 1.
Transformation Estimator of Wand et al.
The well-known logarithmic transformation approach of Wand, Marron and Ruppert (1991) leads to the following density estimator:

\tilde f_n^{(L)}(x) = \frac{1}{n h_n x} \sum_{i=1}^{n} k\left(\frac{1}{h_n} \log(X_i/x)\right), \qquad (2.12)

where k(\cdot) is a density function (kernel) with mean zero and variance 1.
This is easily seen to be a special case of Eq. (2.5), taking G_{x,n} again as in Eq. (2.11) but applied to \log x. This approach, however, creates problems at the boundary, which led Marron and Ruppert (1994) to propose modifications that are computationally intensive.

Estimators of Chen and Scaillet.
Chen's (2000) estimator is of the form

\hat f_C(x) = \frac{1}{n} \sum_{i=1}^{n} g_{x,n}(X_i), \qquad (2.13)

where g_{x,n}(\cdot) is the Gamma(\alpha = a(x, b), \beta = b) density with b \to 0 and b\,a(x, b) \to x.
This also can be motivated from Eq. (2.1) as follows: take u(t) = f(t) and note that the integral \int f(t)\, g_{x,n}(t)\,dt can be estimated by n^{-1} \sum_{i=1}^{n} g_{x,n}(X_i). This approach controls the boundary bias at x = 0; however, the variance blows up at x = 0, and computation of the mean integrated squared error (MISE) is not tractable. Moreover, estimators of derivatives of the density are not easily obtainable, because of the appearance of x as an argument of the Gamma function.
Scaillet's (2004) estimators replace the Gamma kernel by inverse Gaussian (IG) and reciprocal inverse Gaussian (RIG) kernels. These estimators are more tractable than Chen's; however, the IG-kernel estimator takes the value zero at x = 0, which is not desirable when f(0) > 0, and the variances of both the IG and the RIG estimators blow up at x = 0.
Bouezmarni and Scaillet (2005), however, demonstrate good finite-sample performance of these estimators.
It is interesting to note that one can immediately define a Chen-Scaillet version of our estimator, namely,

f_{n,C}^+(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{x}\, q_{v_n}\left(\frac{X_i}{x}\right).

On the other hand, our (i.e., perturbed) version of this estimator would be

\hat f_C^+(x) = \frac{1}{n} \sum_{i=1}^{n} g_{x+\epsilon_n,n}(X_i),

which should not have the problem of the variance blowing up at x = 0.
It may also be remarked that the idea used here may be extended to the case of densities supported on an arbitrary interval [a, b], -\infty < a < b < \infty, by choosing for instance a Beta kernel (extended to the interval [a, b]) as in Chen (1999). Without loss of generality, suppose a = 0 and b = 1. Then we can choose, for instance, q_v(\cdot) as the density of Y/\mu, where Y \sim \mathrm{Beta}(\alpha, \beta), \mu = \alpha/(\alpha + \beta), such that \alpha \to \infty and \beta/\alpha \to 0, so that \mathrm{Var}(Y/\mu) \to 0.
2.2 Asymptotic Properties of the New Estimator
2.2.1 Asymptotic Properties of \tilde F_n^+(x)

Strong consistency holds in general for the estimator \tilde F_n^+(x). We can easily prove the following theorem, parallel to the strong convergence of the empirical distribution function.

Theorem:
If v_n \to 0 as n \to \infty, we have

\sup_x |\tilde F_n^+(x) - F(x)| \overset{a.s.}{\to} 0

as n \to \infty.
We can also show that for large n the smooth estimator can be arbitrarily close to the edf by a proper choice of v_n, as given in the following theorem.

Theorem: Assuming that f has a bounded derivative, and v_n = o(n^{-1/2}), then for some \delta > 0 we have, with probability one,

\sup_{x \ge 0} |\tilde F_n^+(x) - F_n(x)| = O\left(n^{-3/4} (\log n)^{1+\delta}\right).
2.2.2 Asymptotic Properties of \tilde f_n^+(x)

Under some regularity conditions, they obtained:

Theorem:
\sup_{x \ge 0} |\tilde f_n^+(x) - f(x)| \overset{a.s.}{\to} 0
as n \to \infty.

Theorem:
(a) If n v_n \to \infty, n v_n^3 \to 0 and n v_n \epsilon_n^2 \to 0 as n \to \infty, we have

\sqrt{n v_n}\,\left(f_n^+(x) - f(x)\right) \to N\left(0,\; I_2(q)\, \frac{f(x)}{x^2}\right), \quad \text{for } x > 0.

(b) If n v_n \epsilon_n^2 \to \infty and n v_n \epsilon_n^4 \to 0 as n \to \infty, we have

\sqrt{n v_n \epsilon_n^2}\,\left(f_n^+(0) - f(0)\right) \to N\left(0,\; I_2(q)\, f(0)\right).
2.3. Extensions to Non-iid Cases

We can extend the technique to non-iid cases where a version of F_n(x) is available.

Chaubey, Y.P., Dewan, I. and Li, J. (2012). Density estimation for stationary associated sequences. Comm. Statist. Simulation Comput. 41(4), 554-572. [Uses the generalised kernel approach.]
Chaubey, Y.P., Dewan, I. and Li, J. (2011). Density estimation for stationary associated sequences using Poisson weights. Statist. Probab. Lett. 81, 267-276.
Chaubey, Y.P. and Dewan, I. (2010). A review for smooth estimation of survival and density functions for stationary associated sequences: some recent developments. J. Ind. Soc. Agr. Stat. 64(2), 261-272.
Chaubey, Y.P., Laïb, N. and Sen, A. (2010). Generalised kernel smoothing for non-negative stationary ergodic processes. Journal of Nonparametric Statistics 22, 973-997.
3. Estimation of Density in Length-biased Data

In general, when the probability that an item is sampled is proportional to its size, size biased data emerge.

The density g of the size biased observation, for the underlying density f, is given by

g(x) = \frac{w(x) f(x)}{\mu_w}, \quad x > 0, \qquad (3.1)

where w(x) denotes the size measure and \mu_w = \int w(x) f(x)\,dx.

In the area of forestry, the size measure is usually proportional to either length or area (see Muttlak and McDonald, 1990).
Another important application occurs in renewal theory, where inter-event time data are of this type if they are obtained by sampling lifetimes in progress at a randomly chosen point in time (see Cox, 1969).
Here we will talk about the length biased case, where we can write

f(x) = \frac{\mu}{x}\, g(x). \qquad (3.2)

In principle, any smooth estimator of the density function g may be transformed into one of the density function f as follows:

\hat f(x) = \frac{\hat\mu}{x}\, \hat g(x), \qquad (3.3)

where \hat\mu is an estimator of \mu.

Note that 1/\mu = E_g(1/X); hence a strongly consistent estimator of \mu is given by

\hat\mu = n \left\{ \sum_{i=1}^{n} X_i^{-1} \right\}^{-1}.
Bhattacharyya et al. (1988) used this strategy in proposing the following smooth estimator of f:

\hat f_B(x) = \hat\mu\, (nx)^{-1} \sum_{i=1}^{n} k_h(x - X_i). \qquad (3.4)

Also, since F(x) = \mu\, E_g\left(X^{-1} 1_{(X \le x)}\right), Cox (1969) proposed the following estimator of the distribution function F(x):

\hat F_n(x) = \hat\mu\, \frac{1}{n} \sum_{i=1}^{n} X_i^{-1} 1_{(X_i \le x)}. \qquad (3.5)

So there are two competing strategies for density estimation for LB data. One is to estimate g(x) and then use relation (3.3) (i.e., smooth G_n, as in Bhattacharyya et al. (1988)). The other is to smooth the Cox estimator \hat F_n(x) directly and use its derivative as the smooth estimator of f(x).
Jones (1991) studied the behaviour of the estimator \hat f_B(x) in contrast to the smooth estimator obtained directly by smoothing the estimator \hat F_n(x) by the kernel method:

\hat f_J(x) = n^{-1} \hat\mu \sum_{i=1}^{n} X_i^{-1} k_h(x - X_i). \qquad (3.6)

He noted that this estimator is a proper density function when considered with support on the whole real line, whereas \hat f_B(x) may not be. He compared the two estimators based on simulations and, using asymptotic arguments, concluded that the latter estimator may be preferable in practical applications.

Also, using Jensen's inequality we find that

E_g(\hat\mu) \ge 1 \Big/ E_g\left\{\frac{1}{n} \sum_{i=1}^{n} X_i^{-1}\right\} = \mu;

hence the estimator \hat\mu may be positively biased, which would transfer into increased bias in the above density estimators.
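To make estimator (3.6) and μ̂ concrete, here is a small sketch (my illustration) with a Gaussian kernel. Note that (μ̂/n) Σᵢ Xᵢ⁻¹ = 1 by construction, so f̂_J integrates to one over the whole real line, as Jones observed:

```python
import numpy as np

def jones_lb_density(x, data, h):
    """Eq. (3.6): f_J(x) = n^{-1} mu_hat sum_i X_i^{-1} k_h(x - X_i),
    with mu_hat = n / sum_i (1/X_i) and a Gaussian kernel k_h."""
    mu_hat = len(data) / np.sum(1.0 / data)
    u = (x[None, :] - data[:, None]) / h
    kh = np.exp(-0.5 * u**2) / (np.sqrt(2.0 * np.pi) * h)
    return mu_hat * np.mean(kh / data[:, None], axis=0)

# Length-biased sampling from f = Exp(1) has g(x) = x e^{-x}, i.e. Gamma(2, 1):
rng = np.random.default_rng(4)
lb_sample = rng.gamma(shape=2.0, scale=1.0, size=2000)
mu_hat = len(lb_sample) / np.sum(1.0 / lb_sample)   # estimates mu = E_f(X) = 1

grid = np.linspace(-2.0, 15.0, 600)
fj = jones_lb_density(grid, lb_sample, h=0.25)
```

The estimate is a proper density over R, but it assigns positive values below 0, one of the defects discussed next.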
If \hat g(x)/x is integrable, the deficiency of \hat f_B(x) of not being a proper density may be corrected by considering the alternative estimator

\hat f_a(x) = \frac{\hat g(x)/x}{\int (\hat g(x)/x)\,dx}, \qquad (3.7)

and this may also eliminate the increase in bias to some extent.

However, in these situations, since X is typically a non-negative random variable, the estimator must satisfy the following two conditions:
(i) \hat g(x) = 0 for x \le 0;
(ii) \hat g(x)/x is integrable.
Neither of the estimators \hat f_B(x) and \hat f_J(x) satisfies these properties.

We have a host of alternatives, those based on smoothing G_n and those based on smoothing F_n, which we discuss next.
3.1.1 Poisson Smoothing of G_n

Here we would like to apply the weights generated by the Poisson probability mass function, as motivated in Chaubey and Sen (1996, 2000). However, a modification is necessary in the present situation, which is also outlined here.

Using Poisson smoothing, an estimator of g(x) may be given by

\tilde g_{nP}(x) = n \sum_{k=0}^{\infty} p_k(nx) \left[ G_n\left(\frac{k+1}{n}\right) - G_n\left(\frac{k}{n}\right) \right]; \qquad (3.8)

however, note that \lim_{x \to 0} \tilde g_{nP}(x) = n\,G_n(1/n), which may converge to 0 as n \to \infty, but for finite samples it may not be zero; hence the density f at x = 0 may not be defined. Furthermore, \tilde g_{nP}(x)/x is not integrable.
With a simple modification, attaching the weight p_k(nx) to G_n((k-1)/n) rather than to G_n(k/n), the above problem is avoided. This results in the following smooth estimator of G(x):

\tilde G_n(x) = \sum_{k \ge 0} p_k(nx)\, G_n\left(\frac{k-1}{n}\right). \qquad (3.9)

The basic nature of the smoothing estimator is not changed; however, this provides an alternative estimator of the density function through its derivative, given by

\tilde g_n(x) = n \sum_{k \ge 1} p_k(nx) \left[ G_n\left(\frac{k}{n}\right) - G_n\left(\frac{k-1}{n}\right) \right], \qquad (3.10)

such that \tilde g_n(0) = 0 and \tilde g_n(x)/x is integrable.
Since

\int_0^{\infty} \frac{\tilde g_n(x)}{x}\,dx
= n \sum_{k \ge 1} \left[ G_n\left(\frac{k}{n}\right) - G_n\left(\frac{k-1}{n}\right) \right] \int_0^{\infty} \frac{p_k(nx)}{x}\,dx
= n \sum_{k \ge 1} \left[ G_n\left(\frac{k}{n}\right) - G_n\left(\frac{k-1}{n}\right) \right] \frac{1}{k}
= n \sum_{k \ge 1} \frac{1}{k(k+1)}\, G_n\left(\frac{k}{n}\right),

the new smooth estimator of the length biased density f(x) is given by

\tilde f_n(x) = \frac{n \sum_{k \ge 1} \frac{1}{k}\, p_{k-1}(nx) \left[ G_n\left(\frac{k}{n}\right) - G_n\left(\frac{k-1}{n}\right) \right]}{\sum_{k \ge 1} \frac{1}{k(k+1)}\, G_n\left(\frac{k}{n}\right)}. \qquad (3.11)
The corresponding smooth estimator of the distribution function F(x) is given by

\tilde F_n(x) = \frac{\sum_{k \ge 1} \frac{1}{k}\, W_k(nx) \left[ G_n\left(\frac{k}{n}\right) - G_n\left(\frac{k-1}{n}\right) \right]}{\sum_{k \ge 1} \frac{1}{k(k+1)}\, G_n\left(\frac{k}{n}\right)}, \qquad (3.12)

where

W_k(nx) = \frac{1}{\Gamma(k)} \int_0^{nx} e^{-y} y^{k-1}\,dy = \sum_{j \ge k} p_j(nx).
An equivalent expression for the above estimator is given by

\tilde F_n(x) = \frac{\sum_{k \ge 1} G_n\left(\frac{k}{n}\right) \left[ \frac{W_k(nx)}{k} - \frac{W_{k+1}(nx)}{k+1} \right]}{\sum_{k \ge 1} \frac{1}{k(k+1)}\, G_n\left(\frac{k}{n}\right)}
= 1 + \frac{\sum_{k \ge 1} G_n\left(\frac{k}{n}\right) \left[ \frac{P_k(nx)}{k+1} - \frac{P_{k-1}(nx)}{k} \right]}{\sum_{k \ge 1} \frac{1}{k(k+1)}\, G_n\left(\frac{k}{n}\right)},

where

P_k(\lambda) = \sum_{j=0}^{k} p_j(\lambda)

denotes the cumulative probability corresponding to the Poisson(\lambda) distribution.

The properties of the above estimators can be established analogously to those in the regular case.
3.1.2 Gamma Smoothing of G_n

The smooth estimator using the log-normal density may typically have a spike at zero; the gamma density, however, may be appropriate, since it typically yields a density estimator \hat g(x) with \hat g(0) = 0, so that no perturbation is required. The smooth density estimator in this case is simply given by

g_n^+(x) = \frac{1}{n x^2} \sum_{i=1}^{n} X_i\, q_{v_n}\left(\frac{X_i}{x}\right), \qquad (3.13)

where q_v(\cdot) denotes the density corresponding to a Gamma(\alpha = 1/v^2, \beta = v^2), and the corresponding estimator of the density f is given by

f_n^+(x) = \frac{g_n^+(x)/x}{\int_0^{\infty} \left(g_n^+(t)/t\right) dt}. \qquad (3.14)
3.2.1 Poisson Smoothing of F_n

Smoothing \hat F_n directly using Poisson weights, an estimator of f(x) may be given by

\tilde f_{nP}(x) = n \sum_{k=0}^{\infty} p_k(nx) \left[ \hat F_n\left(\frac{k+1}{n}\right) - \hat F_n\left(\frac{k}{n}\right) \right]. \qquad (3.15)

No modifications are necessary.
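A sketch (my illustration) of the Poisson-weight smoother (3.15); it is written for a generic nondecreasing estimate evaluated on the lattice k/n — the plain edf in this demo — and for LB data one would plug in the Cox estimator F̂_n instead:

```python
import numpy as np
from math import lgamma

def poisson_smooth_density(x, data):
    """Eq. (3.15): f~(x) = n sum_k p_k(nx) [F_n((k+1)/n) - F_n(k/n)], with
    F_n the edf. Poisson weights are computed on the log scale and the sum
    is restricted to k within ~12 standard deviations of the mean nx."""
    n = len(data)
    xs = np.sort(data)
    lam = n * x
    if lam == 0.0:                       # p_k(0) puts all weight on k = 0
        return n * np.searchsorted(xs, 1.0 / n, side="right") / n
    lo = max(0, int(lam - 12.0 * np.sqrt(lam)) - 2)
    hi = int(lam + 12.0 * np.sqrt(lam)) + 12
    k = np.arange(lo, hi + 1)
    log_p = -lam + k * np.log(lam) - np.array([lgamma(j + 1.0) for j in k])
    p = np.exp(log_p)
    F_hi = np.searchsorted(xs, (k + 1) / n, side="right") / n
    F_lo = np.searchsorted(xs, k / n, side="right") / n
    return n * np.sum(p * (F_hi - F_lo))

rng = np.random.default_rng(5)
data = rng.exponential(size=400)
grid = np.linspace(0.0, 10.0, 501)
fhat = np.array([poisson_smooth_density(g, data) for g in grid])
```

Because the increments of the edf are nonnegative and each Poisson term integrates to 1/n in x, the estimate is nonnegative and integrates to (approximately) one.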
3.2.2 Gamma Smoothing of F_n

The gamma based smooth estimate of F(x) is given by

\tilde F_n^+(x) = 1 - \frac{\sum_{i=1}^{n} \frac{1}{X_i}\, Q_{v_n}\left(\frac{X_i}{x}\right)}{\sum_{i=1}^{n} \frac{1}{X_i}}, \qquad (3.16)

and that for the density f in this case is simply given by

\tilde f_n^+(x) = \frac{\frac{1}{(x+\epsilon_n)^2} \sum_{i=1}^{n} q_{v_n}\left(\frac{X_i}{x+\epsilon_n}\right)}{\sum_{i=1}^{n} \frac{1}{X_i}}, \qquad (3.17)

where q_v(\cdot) denotes the density corresponding to a Gamma(\alpha = 1/v^2, \beta = v^2).

Note that the above estimator is computationally intensive, as two smoothing parameters have to be computed using bivariate cross-validation.
4. A Simulation Study

Here we consider, as parent distributions to estimate, the exponential (χ²₂), χ²₆, lognormal, Weibull and mixture-of-exponentials densities.

Since the computation for obtaining the smoothing parameters is very extensive, we compute approximations to the MISE and MSE by computing

\mathrm{ISE}(f_n, f) = \int_0^{\infty} [f_n(x) - f(x)]^2\,dx

and

\mathrm{SE}(f_n(x), f(x)) = [f_n(x) - f(x)]^2

for 1000 samples.
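On a grid, the ISE approximation can be computed with the trapezoidal rule (my sketch; the talk does not spell out the numerical scheme):

```python
import numpy as np

def ise(f_hat_vals, f_vals, grid):
    """Approximate ISE(f_n, f) = integral of (f_n - f)^2 over the grid
    by the trapezoidal rule; averaging ISE over replicate samples
    approximates MISE, and SE at a point x is just (f_n(x) - f(x))^2."""
    return np.trapz((np.asarray(f_hat_vals) - np.asarray(f_vals)) ** 2, grid)

grid = np.linspace(0.0, 1.0, 101)
f_true = np.zeros_like(grid)
f_est = np.full_like(grid, 0.1)    # constant error of 0.1 everywhere
err = ise(f_est, f_true, grid)     # integral of 0.01 over [0, 1]
```

Averaging `err` over 1000 simulated samples gives the MISE approximations reported in the tables below.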
Here, the MISE gives the global performance of a density estimator. The MSE lets us see how the density estimator performs locally at points of interest; in particular, we want to know the behaviour of the density estimators near the lower boundary. We illustrate only the MISE values.

Optimal values of the smoothing parameters are obtained using either the BCV or the UCV criterion, which roughly approximate the mean integrated squared error.

For Poisson smoothing as well as for Gamma smoothing, the BCV criterion is found to be better, whereas for the Chen and Scaillet methods the BCV method is not tractable, as it requires an estimate of the derivative of the density.
The next table gives the values of the MISE for the exponential density using the new estimators, compared with Chen's and Scaillet's estimators. Note that we include the simulation results for Scaillet's estimator using the RIG kernel only.

The inverse Gaussian kernel is known not to perform well for direct data [see Kulasekera and Padgett (2006)]. Similar observations were noted for LB data.
Table: Simulated MISE for the χ²₂ distribution

                        Sample size
Estimator     30       50       100      200      300      500
Chen-1        0.13358  0.08336  0.07671  0.03900  0.03056  0.02554
Chen-2        0.11195  0.08592  0.05642  0.03990  0.03301  0.02298
RIG           0.14392  0.11268  0.07762  0.06588  0.05466  0.04734
Poisson(F)    0.04562  0.03623  0.02673  0.01888  0.01350  0.01220
Poisson(G)    0.08898  0.06653  0.04594  0.03127  0.02487  0.01885
Gamma(F)      0.06791  0.05863  0.03989  0.03135  0.02323  0.01589
Gamma*(F)     0.02821  0.01964  0.01224  0.00796  0.00609  0.00440
Gamma(G)      0.09861  0.07663  0.05168  0.03000  0.02007  0.01317
Gamma*(G)     0.02370  0.01244  0.00782  0.00537  0.00465  0.00356
Table: Simulated MSE for the χ²₂ distribution

                             x
Estimator    0       0.1     1       2       5
n = 30
I            0.1307  0.2040  0.0181  0.0044  0.0003
II           0.1187  0.2499  0.0173  0.0045  0.0012
III          0.2222  0.1823  0.0250  0.0074  0.0022
IV           0.1487  0.1001  0.0049  0.0015  0.0005
V            0.3003  0.2438  0.0286  0.0148  0.0013
VI           0.1936  0.1447  0.0117  0.0042  0.0002
VI*          0.0329  0.0286  0.0090  0.0030  9.8×10⁻⁵
VII          0.1893  0.1720  0.0209  0.0066  0.0003
VII*         0.0528  0.0410  0.0032  0.0020  8.4×10⁻⁵
n = 50
I            0.1370  0.1493  0.0121  0.0030  0.0002
II           0.1279  0.1894  0.0112  0.0032  0.0008
III          0.2193  0.1774  0.0161  0.0046  0.0046
IV           0.1393  0.0885  0.0034  0.0012  0.0003
V            0.2939  0.1924  0.0218  0.0094  0.0007
VI           0.1808  0.1365  0.0101  0.0036  0.0001
VI*          0.0196  0.0172  0.0070  0.0024  6.8×10⁻⁵
VII          0.1584  0.1440  0.0168  0.0060  0.0002
VII*         0.0322  0.0236  0.0014  0.0012  4.8×10⁻⁵

I: Chen-1; II: Chen-2; III: RIG; IV: Poisson(F); V: Poisson(G); VI: Gamma(F); VI*: Corrected Gamma(F); VII: Gamma(G); VII*: Corrected Gamma(G)
11.solution of a subclass of singular second order
 
Solution of a subclass of singular second order
Solution of a subclass of singular second orderSolution of a subclass of singular second order
Solution of a subclass of singular second order
 
Some sampling techniques for big data analysis
Some sampling techniques for big data analysisSome sampling techniques for big data analysis
Some sampling techniques for big data analysis
 
Lesson 26: The Fundamental Theorem of Calculus (Section 041 slides)
Lesson 26: The Fundamental Theorem of Calculus (Section 041 slides)Lesson 26: The Fundamental Theorem of Calculus (Section 041 slides)
Lesson 26: The Fundamental Theorem of Calculus (Section 041 slides)
 
MNAR
MNARMNAR
MNAR
 
Decision theory
Decision theoryDecision theory
Decision theory
 
Propensity albert
Propensity albertPropensity albert
Propensity albert
 
Chapter2: Likelihood-based approach
Chapter2: Likelihood-based approach Chapter2: Likelihood-based approach
Chapter2: Likelihood-based approach
 
"reflections on the probability space induced by moment conditions with impli...
"reflections on the probability space induced by moment conditions with impli..."reflections on the probability space induced by moment conditions with impli...
"reflections on the probability space induced by moment conditions with impli...
 
Fi review5
Fi review5Fi review5
Fi review5
 
Reg n corr
Reg n corrReg n corr
Reg n corr
 
The Universal Measure for General Sources and its Application to MDL/Bayesian...
The Universal Measure for General Sources and its Application to MDL/Bayesian...The Universal Measure for General Sources and its Application to MDL/Bayesian...
The Universal Measure for General Sources and its Application to MDL/Bayesian...
 
Application of stochastic lognormal diffusion model with
Application of stochastic lognormal diffusion model withApplication of stochastic lognormal diffusion model with
Application of stochastic lognormal diffusion model with
 

Viewers also liked

Bruce Damer's talk at EE380, the Stanford University Computer Systems Colloqu...
Bruce Damer's talk at EE380, the Stanford University Computer Systems Colloqu...Bruce Damer's talk at EE380, the Stanford University Computer Systems Colloqu...
Bruce Damer's talk at EE380, the Stanford University Computer Systems Colloqu...Bruce Damer
 
Teaching Matrices within Statistics
Teaching Matrices within StatisticsTeaching Matrices within Statistics
Teaching Matrices within StatisticsKimmo Vehkalahti
 
Statistics
StatisticsStatistics
Statisticspikuoec
 
Commonly Used Statistics in Medical Research Part I
Commonly Used Statistics in Medical Research Part ICommonly Used Statistics in Medical Research Part I
Commonly Used Statistics in Medical Research Part IPat Barlow
 
Role of Statistics in Scientific Research
Role of Statistics in Scientific ResearchRole of Statistics in Scientific Research
Role of Statistics in Scientific ResearchVaruna Harshana
 
Lead Generation on SlideShare: A How-to Guide
Lead Generation on SlideShare: A How-to GuideLead Generation on SlideShare: A How-to Guide
Lead Generation on SlideShare: A How-to GuideSlideShare
 

Viewers also liked (6)

Bruce Damer's talk at EE380, the Stanford University Computer Systems Colloqu...
Bruce Damer's talk at EE380, the Stanford University Computer Systems Colloqu...Bruce Damer's talk at EE380, the Stanford University Computer Systems Colloqu...
Bruce Damer's talk at EE380, the Stanford University Computer Systems Colloqu...
 
Teaching Matrices within Statistics
Teaching Matrices within StatisticsTeaching Matrices within Statistics
Teaching Matrices within Statistics
 
Statistics
StatisticsStatistics
Statistics
 
Commonly Used Statistics in Medical Research Part I
Commonly Used Statistics in Medical Research Part ICommonly Used Statistics in Medical Research Part I
Commonly Used Statistics in Medical Research Part I
 
Role of Statistics in Scientific Research
Role of Statistics in Scientific ResearchRole of Statistics in Scientific Research
Role of Statistics in Scientific Research
 
Lead Generation on SlideShare: A How-to Guide
Lead Generation on SlideShare: A How-to GuideLead Generation on SlideShare: A How-to Guide
Lead Generation on SlideShare: A How-to Guide
 

Similar to Talk slides at ISI, 2014

Talk slides imsct2016
Talk slides imsct2016Talk slides imsct2016
Talk slides imsct2016ychaubey
 
Chaubey seminarslides2017
Chaubey seminarslides2017Chaubey seminarslides2017
Chaubey seminarslides2017ychaubey
 
Talk at 2013 WSC, ISI Conference in Hong Kong, August 26, 2013
Talk at 2013 WSC, ISI Conference in Hong Kong, August 26, 2013Talk at 2013 WSC, ISI Conference in Hong Kong, August 26, 2013
Talk at 2013 WSC, ISI Conference in Hong Kong, August 26, 2013Christian Robert
 
Optimistic decision making using an
Optimistic decision making using anOptimistic decision making using an
Optimistic decision making using anijaia
 
Chapter 8 Of Rock Engineering
Chapter 8 Of  Rock  EngineeringChapter 8 Of  Rock  Engineering
Chapter 8 Of Rock EngineeringNgo Hung Long
 
Doe02 statistics
Doe02 statisticsDoe02 statistics
Doe02 statisticsArif Rahman
 
Similarity Measure Using Interval Valued Vague Sets in Multiple Criteria Deci...
Similarity Measure Using Interval Valued Vague Sets in Multiple Criteria Deci...Similarity Measure Using Interval Valued Vague Sets in Multiple Criteria Deci...
Similarity Measure Using Interval Valued Vague Sets in Multiple Criteria Deci...iosrjce
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
Statistical Inference Part II: Types of Sampling Distribution
Statistical Inference Part II: Types of Sampling DistributionStatistical Inference Part II: Types of Sampling Distribution
Statistical Inference Part II: Types of Sampling DistributionDexlab Analytics
 
Appendix 2 Probability And Statistics
Appendix 2  Probability And StatisticsAppendix 2  Probability And Statistics
Appendix 2 Probability And StatisticsSarah Morrow
 
Common fixed points of weakly reciprocally continuous maps using a gauge func...
Common fixed points of weakly reciprocally continuous maps using a gauge func...Common fixed points of weakly reciprocally continuous maps using a gauge func...
Common fixed points of weakly reciprocally continuous maps using a gauge func...Alexander Decker
 
11.common fixed points of weakly reciprocally continuous maps using a gauge f...
11.common fixed points of weakly reciprocally continuous maps using a gauge f...11.common fixed points of weakly reciprocally continuous maps using a gauge f...
11.common fixed points of weakly reciprocally continuous maps using a gauge f...Alexander Decker
 
Dimensionality Reduction Techniques In Response Surface Designs
Dimensionality Reduction Techniques In Response Surface DesignsDimensionality Reduction Techniques In Response Surface Designs
Dimensionality Reduction Techniques In Response Surface Designsinventionjournals
 
A derivative free high ordered hybrid equation solver
A derivative free high ordered hybrid equation solverA derivative free high ordered hybrid equation solver
A derivative free high ordered hybrid equation solverZac Darcy
 
Session 2 b kotpaper
Session 2 b kotpaperSession 2 b kotpaper
Session 2 b kotpaperIARIW 2014
 
Can we estimate a constant?
Can we estimate a constant?Can we estimate a constant?
Can we estimate a constant?Christian Robert
 

Similar to Talk slides at ISI, 2014 (20)

Talk slides imsct2016
Talk slides imsct2016Talk slides imsct2016
Talk slides imsct2016
 
Chaubey seminarslides2017
Chaubey seminarslides2017Chaubey seminarslides2017
Chaubey seminarslides2017
 
Talk at 2013 WSC, ISI Conference in Hong Kong, August 26, 2013
Talk at 2013 WSC, ISI Conference in Hong Kong, August 26, 2013Talk at 2013 WSC, ISI Conference in Hong Kong, August 26, 2013
Talk at 2013 WSC, ISI Conference in Hong Kong, August 26, 2013
 
Numerical method (curve fitting)
Numerical method (curve fitting)Numerical method (curve fitting)
Numerical method (curve fitting)
 
Optimistic decision making using an
Optimistic decision making using anOptimistic decision making using an
Optimistic decision making using an
 
Chapter 8 Of Rock Engineering
Chapter 8 Of  Rock  EngineeringChapter 8 Of  Rock  Engineering
Chapter 8 Of Rock Engineering
 
Doe02 statistics
Doe02 statisticsDoe02 statistics
Doe02 statistics
 
F0742328
F0742328F0742328
F0742328
 
Similarity Measure Using Interval Valued Vague Sets in Multiple Criteria Deci...
Similarity Measure Using Interval Valued Vague Sets in Multiple Criteria Deci...Similarity Measure Using Interval Valued Vague Sets in Multiple Criteria Deci...
Similarity Measure Using Interval Valued Vague Sets in Multiple Criteria Deci...
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)
 
Es272 ch5b
Es272 ch5bEs272 ch5b
Es272 ch5b
 
Statistical Inference Part II: Types of Sampling Distribution
Statistical Inference Part II: Types of Sampling DistributionStatistical Inference Part II: Types of Sampling Distribution
Statistical Inference Part II: Types of Sampling Distribution
 
Chapter6.pdf.pdf
Chapter6.pdf.pdfChapter6.pdf.pdf
Chapter6.pdf.pdf
 
Appendix 2 Probability And Statistics
Appendix 2  Probability And StatisticsAppendix 2  Probability And Statistics
Appendix 2 Probability And Statistics
 
Common fixed points of weakly reciprocally continuous maps using a gauge func...
Common fixed points of weakly reciprocally continuous maps using a gauge func...Common fixed points of weakly reciprocally continuous maps using a gauge func...
Common fixed points of weakly reciprocally continuous maps using a gauge func...
 
11.common fixed points of weakly reciprocally continuous maps using a gauge f...
11.common fixed points of weakly reciprocally continuous maps using a gauge f...11.common fixed points of weakly reciprocally continuous maps using a gauge f...
11.common fixed points of weakly reciprocally continuous maps using a gauge f...
 
Dimensionality Reduction Techniques In Response Surface Designs
Dimensionality Reduction Techniques In Response Surface DesignsDimensionality Reduction Techniques In Response Surface Designs
Dimensionality Reduction Techniques In Response Surface Designs
 
A derivative free high ordered hybrid equation solver
A derivative free high ordered hybrid equation solverA derivative free high ordered hybrid equation solver
A derivative free high ordered hybrid equation solver
 
Session 2 b kotpaper
Session 2 b kotpaperSession 2 b kotpaper
Session 2 b kotpaper
 
Can we estimate a constant?
Can we estimate a constant?Can we estimate a constant?
Can we estimate a constant?
 

Recently uploaded

Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...raviapr7
 
How to Create a Toggle Button in Odoo 17
How to Create a Toggle Button in Odoo 17How to Create a Toggle Button in Odoo 17
How to Create a Toggle Button in Odoo 17Celine George
 
How to Add Existing Field in One2Many Tree View in Odoo 17
How to Add Existing Field in One2Many Tree View in Odoo 17How to Add Existing Field in One2Many Tree View in Odoo 17
How to Add Existing Field in One2Many Tree View in Odoo 17Celine George
 
Department of Health Compounder Question ‍Solution 2022.pdf
Department of Health Compounder Question ‍Solution 2022.pdfDepartment of Health Compounder Question ‍Solution 2022.pdf
Department of Health Compounder Question ‍Solution 2022.pdfMohonDas
 
10 Topics For MBA Project Report [HR].pdf
10 Topics For MBA Project Report [HR].pdf10 Topics For MBA Project Report [HR].pdf
10 Topics For MBA Project Report [HR].pdfJayanti Pande
 
Diploma in Nursing Admission Test Question Solution 2023.pdf
Diploma in Nursing Admission Test Question Solution 2023.pdfDiploma in Nursing Admission Test Question Solution 2023.pdf
Diploma in Nursing Admission Test Question Solution 2023.pdfMohonDas
 
Slides CapTechTalks Webinar March 2024 Joshua Sinai.pptx
Slides CapTechTalks Webinar March 2024 Joshua Sinai.pptxSlides CapTechTalks Webinar March 2024 Joshua Sinai.pptx
Slides CapTechTalks Webinar March 2024 Joshua Sinai.pptxCapitolTechU
 
How to Solve Singleton Error in the Odoo 17
How to Solve Singleton Error in the  Odoo 17How to Solve Singleton Error in the  Odoo 17
How to Solve Singleton Error in the Odoo 17Celine George
 
Riddhi Kevadiya. WILLIAM SHAKESPEARE....
Riddhi Kevadiya. WILLIAM SHAKESPEARE....Riddhi Kevadiya. WILLIAM SHAKESPEARE....
Riddhi Kevadiya. WILLIAM SHAKESPEARE....Riddhi Kevadiya
 
Easter in the USA presentation by Chloe.
Easter in the USA presentation by Chloe.Easter in the USA presentation by Chloe.
Easter in the USA presentation by Chloe.EnglishCEIPdeSigeiro
 
5 charts on South Africa as a source country for international student recrui...
5 charts on South Africa as a source country for international student recrui...5 charts on South Africa as a source country for international student recrui...
5 charts on South Africa as a source country for international student recrui...CaraSkikne1
 
Vani Magazine - Quarterly Magazine of Seshadripuram Educational Trust
Vani Magazine - Quarterly Magazine of Seshadripuram Educational TrustVani Magazine - Quarterly Magazine of Seshadripuram Educational Trust
Vani Magazine - Quarterly Magazine of Seshadripuram Educational TrustSavipriya Raghavendra
 
ARTICULAR DISC OF TEMPOROMANDIBULAR JOINT
ARTICULAR DISC OF TEMPOROMANDIBULAR JOINTARTICULAR DISC OF TEMPOROMANDIBULAR JOINT
ARTICULAR DISC OF TEMPOROMANDIBULAR JOINTDR. SNEHA NAIR
 
Drug Information Services- DIC and Sources.
Drug Information Services- DIC and Sources.Drug Information Services- DIC and Sources.
Drug Information Services- DIC and Sources.raviapr7
 
Prescribed medication order and communication skills.pptx
Prescribed medication order and communication skills.pptxPrescribed medication order and communication skills.pptx
Prescribed medication order and communication skills.pptxraviapr7
 
Education and training program in the hospital APR.pptx
Education and training program in the hospital APR.pptxEducation and training program in the hospital APR.pptx
Education and training program in the hospital APR.pptxraviapr7
 
The basics of sentences session 10pptx.pptx
The basics of sentences session 10pptx.pptxThe basics of sentences session 10pptx.pptx
The basics of sentences session 10pptx.pptxheathfieldcps1
 
The Stolen Bacillus by Herbert George Wells
The Stolen Bacillus by Herbert George WellsThe Stolen Bacillus by Herbert George Wells
The Stolen Bacillus by Herbert George WellsEugene Lysak
 

Recently uploaded (20)

Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...
 
How to Create a Toggle Button in Odoo 17
How to Create a Toggle Button in Odoo 17How to Create a Toggle Button in Odoo 17
How to Create a Toggle Button in Odoo 17
 
How to Add Existing Field in One2Many Tree View in Odoo 17
How to Add Existing Field in One2Many Tree View in Odoo 17How to Add Existing Field in One2Many Tree View in Odoo 17
How to Add Existing Field in One2Many Tree View in Odoo 17
 
Department of Health Compounder Question ‍Solution 2022.pdf
Department of Health Compounder Question ‍Solution 2022.pdfDepartment of Health Compounder Question ‍Solution 2022.pdf
Department of Health Compounder Question ‍Solution 2022.pdf
 
10 Topics For MBA Project Report [HR].pdf
10 Topics For MBA Project Report [HR].pdf10 Topics For MBA Project Report [HR].pdf
10 Topics For MBA Project Report [HR].pdf
 
March 2024 Directors Meeting, Division of Student Affairs and Academic Support
March 2024 Directors Meeting, Division of Student Affairs and Academic SupportMarch 2024 Directors Meeting, Division of Student Affairs and Academic Support
March 2024 Directors Meeting, Division of Student Affairs and Academic Support
 
Diploma in Nursing Admission Test Question Solution 2023.pdf
Diploma in Nursing Admission Test Question Solution 2023.pdfDiploma in Nursing Admission Test Question Solution 2023.pdf
Diploma in Nursing Admission Test Question Solution 2023.pdf
 
Slides CapTechTalks Webinar March 2024 Joshua Sinai.pptx
Slides CapTechTalks Webinar March 2024 Joshua Sinai.pptxSlides CapTechTalks Webinar March 2024 Joshua Sinai.pptx
Slides CapTechTalks Webinar March 2024 Joshua Sinai.pptx
 
How to Solve Singleton Error in the Odoo 17
How to Solve Singleton Error in the  Odoo 17How to Solve Singleton Error in the  Odoo 17
How to Solve Singleton Error in the Odoo 17
 
Riddhi Kevadiya. WILLIAM SHAKESPEARE....
Riddhi Kevadiya. WILLIAM SHAKESPEARE....Riddhi Kevadiya. WILLIAM SHAKESPEARE....
Riddhi Kevadiya. WILLIAM SHAKESPEARE....
 
Easter in the USA presentation by Chloe.
Easter in the USA presentation by Chloe.Easter in the USA presentation by Chloe.
Easter in the USA presentation by Chloe.
 
5 charts on South Africa as a source country for international student recrui...
5 charts on South Africa as a source country for international student recrui...5 charts on South Africa as a source country for international student recrui...
5 charts on South Africa as a source country for international student recrui...
 
Vani Magazine - Quarterly Magazine of Seshadripuram Educational Trust
Vani Magazine - Quarterly Magazine of Seshadripuram Educational TrustVani Magazine - Quarterly Magazine of Seshadripuram Educational Trust
Vani Magazine - Quarterly Magazine of Seshadripuram Educational Trust
 
ARTICULAR DISC OF TEMPOROMANDIBULAR JOINT
ARTICULAR DISC OF TEMPOROMANDIBULAR JOINTARTICULAR DISC OF TEMPOROMANDIBULAR JOINT
ARTICULAR DISC OF TEMPOROMANDIBULAR JOINT
 
Drug Information Services- DIC and Sources.
Drug Information Services- DIC and Sources.Drug Information Services- DIC and Sources.
Drug Information Services- DIC and Sources.
 
Prescribed medication order and communication skills.pptx
Prescribed medication order and communication skills.pptxPrescribed medication order and communication skills.pptx
Prescribed medication order and communication skills.pptx
 
Prelims of Kant get Marx 2.0: a general politics quiz
Prelims of Kant get Marx 2.0: a general politics quizPrelims of Kant get Marx 2.0: a general politics quiz
Prelims of Kant get Marx 2.0: a general politics quiz
 
Education and training program in the hospital APR.pptx
Education and training program in the hospital APR.pptxEducation and training program in the hospital APR.pptx
Education and training program in the hospital APR.pptx
 
The basics of sentences session 10pptx.pptx
The basics of sentences session 10pptx.pptxThe basics of sentences session 10pptx.pptx
The basics of sentences session 10pptx.pptx
 
The Stolen Bacillus by Herbert George Wells
The Stolen Bacillus by Herbert George WellsThe Stolen Bacillus by Herbert George Wells
The Stolen Bacillus by Herbert George Wells
 

Talk slides at ISI, 2014

  • 4. 1. Introduction/Motivation, 1.1 Kernel Density Estimator
Consider a non-negative random variable X with density f(x) and distribution function
    F(x) = \int_0^x f(t)\,dt,  x \ge 0.   (1.1)
Such random variables occur frequently in practice, for example in life testing and reliability.
  • 5. Based on a random sample (X_1, X_2, \ldots, X_n) from a univariate density f(\cdot), the empirical distribution function (edf) is defined as
    F_n(x) = (1/n) \sum_{i=1}^n I(X_i \le x).   (1.2)
  • 6. The edf is not smooth enough to provide an estimator of f(x). Various smoothing methods exist (kernel smoothing, histogram methods, splines, orthogonal functionals); the most popular is the kernel method (Rosenblatt, 1956). [See the text Nonparametric Functional Estimation by Prakasa Rao (1983) for a theoretical treatment of the subject, or Silverman (1986).]
  • 7. The kernel density estimator is
    \hat f_n(x) = (1/n) \sum_{i=1}^n k_h(x - X_i) = (1/(nh)) \sum_{i=1}^n k((x - X_i)/h),   (1.3)
where the kernel function k(\cdot) satisfies
    (i) k(-x) = k(x),   (ii) \int_{-\infty}^{\infty} k(x)\,dx = 1,
and k_h(x) = (1/h)\,k(x/h). Here h, known as the bandwidth, is made to depend on n, i.e. h \equiv h_n, such that h_n \to 0 and n h_n \to \infty as n \to \infty. Basically, k is a symmetric probability density function on the entire real line; this may present problems in estimating the densities of non-negative random variables.
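A minimal sketch of estimator (1.3) with a Gaussian kernel (the sample, bandwidth, and function names here are illustrative choices, not from the talk); it also makes the boundary problem visible, since the estimate is positive even at negative x:

```python
import numpy as np

def kernel_density(x, data, h):
    """Estimator (1.3): f_hat(x) = (1/(n h)) sum_i k((x - X_i)/h),
    here with the standard normal kernel (symmetric, integrates to 1)."""
    x = np.atleast_1d(x).astype(float)
    u = (x[:, None] - data[None, :]) / h          # scaled residuals, shape (len(x), n)
    k = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)  # Gaussian kernel values
    return k.mean(axis=1) / h

rng = np.random.default_rng(0)
sample = rng.exponential(scale=1.0, size=500)     # a non-negative sample
grid = np.linspace(-1.0, 4.0, 6)
est = kernel_density(grid, sample, h=0.3)         # est[0] > 0 although x = -1 < 0
```

The estimate integrates to one over the whole real line, so any mass placed below zero is mass stolen from the true support.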
  • 8. Figure 1. Kernel density estimators for the suicide study data of Silverman (1986), with four bandwidth choices (Default, SJ, UCV, BCV). [Figure omitted.]
  • 9. 1.2 Smooth Estimation of Densities on R+
\hat f_n(x) may take positive values even for x \in (-\infty, 0], which is not desirable if the random variable X is non-negative. Silverman (1986) mentions some adaptations of the existing methods, through transformations and other devices, when the support of the density to be estimated is not the whole real line.
1.2.1 Bagai-Prakasa Rao Estimator
Bagai and Prakasa Rao (1996) proposed the following adaptation of the kernel density estimator for non-negative support [which does not require any transformation or corrective strategy]:
    \bar f_n(x) = (1/(n h_n)) \sum_{i=1}^n k((x - X_i)/h_n),  x \ge 0.   (1.4)
  • 10. Here k(\cdot) is a bounded density function with support (0, \infty), satisfying \int_0^\infty x^2 k(x)\,dx < \infty, and h_n is a sequence such that h_n \to 0 and n h_n \to \infty as n \to \infty. The only difference between \hat f_n(x) and \bar f_n(x) is that the former is based on a kernel whose support may extend beyond (0, \infty). One undesirable property of this estimator is that for x with X_{(r)} \le x < X_{(r+1)}, only the first r order statistics contribute to \bar f_n(x).
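A sketch of (1.4), taking the exponential density as the kernel supported on (0, \infty) (kernel choice and names are ours); it makes the quirk just noted concrete, since only observations below x enter the sum:

```python
import numpy as np

def bpr_density(x, data, h):
    """Bagai-Prakasa Rao estimator (1.4) with kernel k(u) = exp(-u) on (0, inf).
    Since k vanishes for u <= 0, only observations X_i < x contribute."""
    x = np.atleast_1d(x).astype(float)
    u = (x[:, None] - data[None, :]) / h
    k = np.zeros_like(u)
    mask = u > 0
    k[mask] = np.exp(-u[mask])    # exponential kernel, zero off (0, inf)
    return k.mean(axis=1) / h

rng = np.random.default_rng(0)
sample = rng.exponential(size=500)
est = bpr_density([0.0, 1.0], sample, h=0.3)  # est[0] is exactly 0: no X_i below 0
```

At any x below the smallest observation the estimate is identically zero, which is the behaviour criticized on the slide.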
  • 12. Figure 2. Bagai-Prakasa Rao density estimators for the suicide study data of Silverman (1986), with four bandwidth choices (Default, SJ, UCV, BCV). [Figure omitted.]
  • 13. 2.1 An Approximation Lemma
The following discussion gives a general approach to density estimation which may be specialized to the case of non-negative data. The key result for the proposal is the following lemma, given in Feller (1965, §VII.1).
Lemma 1: Let u be any bounded and continuous function, and let G_{x,n}, n = 1, 2, \ldots be a family of distributions with mean \mu_n(x) and variance h_n^2(x) such that \mu_n(x) \to x and h_n(x) \to 0. Then
    \tilde u(x) = \int_{-\infty}^{\infty} u(t)\,dG_{x,n}(t) \to u(x).   (2.1)
The convergence is uniform in every subinterval in which h_n(x) \to 0 uniformly and u is uniformly continuous.
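Lemma 1 can be checked numerically. Below is a Monte Carlo sketch (our construction, not from the talk) taking G_{x,n} to be a Gamma law with mean x and variance x^2/n, and u = sin as a bounded continuous test function; the smoothed value converges to u(x) as the variance shrinks:

```python
import numpy as np

rng = np.random.default_rng(1)
x = 2.0
u = np.sin   # a bounded, continuous test function

def smoothed(n, m=200_000):
    """Monte Carlo value of u~(x) = int u(t) dG_{x,n}(t), where G_{x,n}
    is the Gamma(shape n, scale x/n) law: mean x, variance x^2/n -> 0."""
    t = rng.gamma(shape=n, scale=x / n, size=m)
    return u(t).mean()

errors = [abs(smoothed(n) - np.sin(x)) for n in (4, 64, 1024)]  # shrinks with n
```

The error is dominated by the term (1/2) u''(x) h_n^2(x), so it falls off roughly like 1/n here.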
  • 14. This lemma may be adapted for smooth estimation of the distribution function by replacing u(x) with the empirical distribution function F_n(x), giving
    \tilde F_n(x) = \int_{-\infty}^{\infty} F_n(t)\,dG_{x,n}(t).   (2.2)
Note that F_n(x) is not continuous, as required by the lemma, so the lemma is not used directly in proposing the estimator; rather, it serves as motivation. The proposal can be viewed as a stochastic adaptation: the mathematical convergence becomes stochastic convergence, paralleling the strong convergence of the empirical distribution function, as stated in the following theorem.
  • 15. Theorem 1: Let h_n^2(x) be the variance of G_{x,n} as in Lemma 1, with h_n(x) \to 0 as n \to \infty for every fixed x. Then
    \sup_x |\tilde F_n(x) - F(x)| \to 0 a.s.   (2.3)
as n \to \infty.
  • 16. Technically, G_{x,n} can have any support, but it may be prudent to choose it to have the same support as the random variable under consideration, since this avoids the estimator assigning positive mass to the undesired region. For \tilde F_n(x) to be a proper distribution function, G_{x,n}(t) must be a decreasing function of x, as can be seen from an alternative form of \tilde F_n(x):
    \tilde F_n(x) = 1 - (1/n) \sum_{i=1}^n G_{x,n}(X_i).   (2.4)
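A sketch of the smooth distribution-function estimator (2.4). We take G_{x,n} to be a logistic law centred at x, which is our illustrative choice (any family with mean tending to x and vanishing variance qualifies, and G_{x,n}(t) is indeed decreasing in x here):

```python
import numpy as np

def smooth_cdf(x, data, s):
    """Estimator (2.4): F~_n(x) = 1 - (1/n) sum_i G_{x,n}(X_i), with
    G_{x,n}(t) = 1/(1 + exp(-(t - x)/s)), a logistic law with mean x and
    variance (pi^2/3) s^2 -> 0 as s -> 0."""
    x = np.atleast_1d(x).astype(float)
    z = np.clip((data[None, :] - x[:, None]) / s, -60.0, 60.0)  # avoid overflow
    g = 1.0 / (1.0 + np.exp(-z))
    return 1.0 - g.mean(axis=1)

rng = np.random.default_rng(0)
sample = rng.exponential(size=1000)
grid = np.linspace(0.0, 5.0, 51)
F_tilde = smooth_cdf(grid, sample, s=0.1)   # smooth, monotone estimate of F
```

Unlike the edf, this estimate is differentiable in x, which is exactly what (2.5) exploits next.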
  • 17. Equation (2.4) suggests the smooth density estimator
    \tilde f_n(x) = (d/dx)\,\tilde F_n(x) = -(1/n) \sum_{i=1}^n (d/dx)\,G_{x,n}(X_i).   (2.5)
The potential of this lemma for smooth density estimation was recognized by Gawronski (1980) in his doctoral thesis, written at Ulm. Gawronski and Stadtmüller (1980, Scand. J. Statist.) investigated the mean square error properties of the density estimator obtained when G_{x,n} puts the Poisson weights
    p_k(nx) = e^{-nx} (nx)^k / k!   (2.6)
on the lattice points k/n, k = 0, 1, 2, \ldots
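The Poisson-weight construction (2.6) applied to the edf gives a concrete smooth estimator of F (the density estimator of the slide is its derivative). A sketch, where the lattice truncation kmax and the rate parameter lam are our illustrative choices:

```python
import numpy as np
from math import lgamma

def poisson_smooth_cdf(x, sample, lam, kmax=600):
    """Hille/Poisson smoothing: F~(x) = sum_{k=0}^{kmax} F_n(k/lam) p_k(lam x),
    with p_k(m) = exp(-m) m^k / k!, i.e. G_{x,n} concentrated on the lattice k/lam."""
    data = np.sort(sample)
    x = np.atleast_1d(x).astype(float)
    k = np.arange(kmax + 1)
    lg = np.array([lgamma(i + 1.0) for i in k])          # log k!
    m = lam * x[:, None]
    logp = k[None, :] * np.log(np.maximum(m, 1e-300)) - m - lg[None, :]
    w = np.exp(logp)                                     # Poisson weights p_k(lam x)
    edf = np.searchsorted(data, k / lam, side="right") / len(data)
    return (w * edf[None, :]).sum(axis=1)

rng = np.random.default_rng(2)
sample = rng.exponential(size=300)
F_sm = poisson_smooth_cdf([0.5, 1.0, 2.0], sample, lam=50.0)
```

The weights are computed on the log scale for numerical stability; kmax must comfortably exceed lam times the largest x of interest.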
  • 18. Other developments: This lemma has been further used by Babu, Canty and Chaubey (1999) to motivate the Bernstein polynomial estimator (Vitale, 1975) for densities on [0, 1]. Gawronski (1985, Period. Math. Hungar.) investigates other lattice distributions, such as the negative binomial.
Some other developments:
- Chaubey and Sen (1996, Statist. Decisions): survival functions, though in a truncated form.
- Chaubey and Sen (1999, JSPI): mean residual life.
- Chaubey and Sen (1998a, Persp. Stat., Narosa Pub.): hazard and cumulative hazard functions.
- Chaubey and Sen (1998b): censored data.
- Chaubey and Sen (2002a, 2002b): multivariate density estimation.
- Smooth density estimation under constraints: Chaubey and Kochar (2000, 2006); Chaubey and Xu (2007, JSPI).
- Babu and Chaubey (2006): density estimation on hypercubes.
[See also Prakasa Rao (2005), Kakizawa (2011), and Bouezmarni et al. (2010, JMVA) for generalised Bernstein polynomials and Bernstein copulas.]
  • 19. A Generalised Kernel Estimator for Densities with Non-Negative Support
Lemma 1 motivates the generalised kernel estimator of Földes and Révész (1974):
    f_{nGK}(x) = (1/n) \sum_{i=1}^n h_n(x, X_i).
Chaubey et al. (2012, J. Ind. Stat. Assoc.) give the following adaptation using asymmetric kernels for estimation of densities with non-negative support. Let Q_v(x) denote a distribution on [0, \infty) with mean 1 and variance v^2; then an estimator of F(x) is given by
    F_n^+(x) = 1 - (1/n) \sum_{i=1}^n Q_{v_n}(X_i / x),   (2.7)
where v_n \to 0 as n \to \infty.
  • 20. Obviously, this choice uses G_{x,n}(t) = Q_{v_n}(t/x), which is a decreasing function of x. This leads to the density estimator
    (d/dx)\,F_n^+(x) = (1/(n x^2)) \sum_{i=1}^n X_i\, q_{v_n}(X_i / x),   (2.8)
where q_v(\cdot) denotes the density corresponding to the distribution function Q_v(\cdot).
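A sketch of (2.8) with the gamma choice of Q_v used later on the slides, namely Gamma(\alpha = 1/v^2, \beta = v^2), which has mean 1 and variance v^2 (the sample and v are our illustrative choices; the density is evaluated on the log scale to avoid overflow):

```python
import numpy as np
from math import lgamma

def gamma_kernel_density(x, data, v):
    """Estimator (2.8): (d/dx) F_n^+(x) = (1/(n x^2)) sum_i X_i q_v(X_i/x),
    with q_v the Gamma(alpha = 1/v^2, scale = v^2) density (mean 1, var v^2)."""
    a = 1.0 / v**2
    x = np.atleast_1d(x).astype(float)
    t = data[None, :] / x[:, None]                       # X_i / x, positive
    logq = (a - 1.0) * np.log(t) - t / v**2 - a * np.log(v**2) - lgamma(a)
    return (data[None, :] * np.exp(logq)).mean(axis=1) / x**2

rng = np.random.default_rng(0)
sample = rng.exponential(size=500)
grid = np.linspace(0.05, 8.0, 160)
est = gamma_kernel_density(grid, sample, v=0.2)          # defined only for x > 0
```

A change of variables s = X_i/x shows each summand integrates to one over (0, \infty), so the estimate is a bona fide density on the positive half-line with no mass below zero.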
  • 21. However, the above estimator may not be defined at x = 0, except in cases where \lim_{x \to 0} (d/dx)\,F_n^+(x) exists. Moreover, this limit is typically zero, which is acceptable only when we are estimating a density f with f(0) = 0. Thus, in view of the more general case where 0 \le f(0) < \infty, we consider the following perturbed version of the above density estimator:
    f_n^+(x) = (1/(n (x + \epsilon_n)^2)) \sum_{i=1}^n X_i\, q_{v_n}(X_i / (x + \epsilon_n)),  x \ge 0,   (2.9)
where \epsilon_n \downarrow 0 at an appropriate (sufficiently slow) rate as n \to \infty. In the sequel, we illustrate the method by taking Q_v(\cdot) to be the Gamma(\alpha = 1/v^2, \beta = v^2) distribution function.
  • 24. Remark: If we believe that the density is zero at zero, we set ε_n ≡ 0; in general, ε_n may be determined using cross-validation methods. For ε_n > 0, this modification results in a defective distribution F_n^+(x + ε_n). A corrected density estimator f_n*(x) is therefore proposed: f_n*(x) = f_n^+(x)/c_n,  (2.10) where c_n is the constant c_n = (1/n) Σ_{i=1}^n Q_{v_n}(X_i/ε_n). Note that, since ε_n → 0 as n → ∞, f_n*(x) and f_n^+(x) are asymptotically equivalent, so we study the asymptotic properties of f_n^+(x) only.
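  As a concrete illustration of Eqs. (2.9)-(2.10), the following Python sketch (function and variable names are ours; the choice Q_v = Gamma(α = 1/v², β = v²) is the one used on the slides) computes the perturbed estimator f_n^+ and its renormalised version f_n*:

```python
import numpy as np
from scipy.stats import gamma

def gamma_kernel_density(x, data, v, eps=0.0):
    """Perturbed gamma-kernel density estimate f_n^+ (Eq. 2.9) at points x.

    Q_v is the Gamma(shape = 1/v^2, scale = v^2) distribution, which has
    mean 1 and variance v^2; eps plays the role of the perturbation eps_n
    (eps > 0 is needed if x = 0 is included in the grid).
    """
    a = 1.0 / v**2
    x = np.atleast_1d(np.asarray(x, dtype=float))
    xe = x + eps                                  # x + eps_n
    t = data[None, :] / xe[:, None]               # X_i / (x + eps_n)
    f = (data[None, :] * gamma.pdf(t, a, scale=v**2)).mean(axis=1) / xe**2
    if eps > 0:                                   # renormalise by c_n (Eq. 2.10)
        f = f / gamma.cdf(data / eps, a, scale=v**2).mean()
    return f
```

  On an Exp(1) sample the resulting curve remains finite at x = 0 and integrates (approximately) to one; in practice v_n and ε_n would be chosen by cross-validation.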
  • 26. Next we present a comparison of our approach with some existing estimators. Kernel Estimator. The usual kernel estimator is a special case of the representation in Eq. (2.5), obtained by taking G_{x,n}(t) = K((t − x)/h),  (2.11) where K(·) is a distribution function with mean zero and variance 1.
  • 27. Transformation Estimator of Wand et al. The well-known logarithmic transformation approach of Wand, Marron and Ruppert (1991) leads to the density estimator f̃_n^{(L)}(x) = (1/(n h_n x)) Σ_{i=1}^n k((1/h_n) log(X_i/x)),  (2.12) where k(·) is a density function (kernel) with mean zero and variance 1.
  • 28. This is easily seen to be a special case of Eq. (2.5), taking G_{x,n} again as in Eq. (2.11) but applied to log x. This approach, however, creates problems at the boundary, which led Marron and Ruppert (1994) to propose modifications that are computationally intensive. Estimators of Chen and Scaillet. Chen's (2000) estimator is of the form f̂_C(x) = (1/n) Σ_{i=1}^n g_{x,n}(X_i),  (2.13) where g_{x,n}(·) is the Gamma(α = a(x, b), β = b) density with b → 0 and b·a(x, b) → x.
  • 31. This can also be motivated from Eq. (2.1) as follows: take u(t) = f(t) and note that the integral ∫ f(t) g_{x,n}(t) dt can be estimated by n⁻¹ Σ_{i=1}^n g_{x,n}(X_i). This approach controls the boundary bias at x = 0; however, the variance blows up at x = 0, and computation of the mean integrated squared error (MISE) is not tractable. Moreover, estimators of derivatives of the density are not easily obtainable, because x appears as an argument of the Gamma function.
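  For comparison, here is a minimal sketch of Chen's gamma-kernel estimator (Eq. 2.13), assuming the simplest choice a(x, b) = x/b, so that b·a(x, b) = x exactly; the function name is ours, and x must be strictly positive here:

```python
import numpy as np
from scipy.stats import gamma

def chen_gamma_density(x, data, b):
    """Chen-type estimator: average the Gamma(a(x,b), scale=b) density,
    with shape a(x, b) = x/b, over the observed sample (x > 0 only)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return gamma.pdf(data[None, :], (x / b)[:, None], scale=b).mean(axis=1)
```

  Note how x enters through the shape parameter of the gamma density, which is what makes derivative estimation awkward, as remarked above.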
  • 32. Scaillet's (2004) estimators replace the Gamma kernel by inverse Gaussian (IG) and reciprocal inverse Gaussian (RIG) kernels. These estimators are more tractable than Chen's; however, the IG-kernel estimator takes the value zero at x = 0, which is not desirable when f(0) > 0, and the variances of both the IG and the RIG estimators blow up at x = 0. Bouezmarni and Scaillet (2005), however, demonstrate good finite-sample performance of these estimators.
  • 34. It is interesting to note that one can immediately define a Chen-Scaillet version of our estimator, namely f_{n,C}^+(x) = (1/n) Σ_{i=1}^n (1/x) q_{v_n}(X_i/x). On the other hand, the perturbed version of this estimator, f̂_C^+(x) = (1/n) Σ_{i=1}^n g_{x+ε_n,n}(X_i), should not have the problem of the variance blowing up at x = 0.
  • 36. It may also be remarked that the idea used here extends to densities supported on an arbitrary interval [a, b], −∞ < a < b < ∞, by choosing for instance a Beta kernel (extended to the interval [a, b]) as in Chen (1999). Without loss of generality, suppose a = 0 and b = 1. Then we can choose, for instance, q_v(·) as the density of Y/μ, where Y ~ Beta(α, β) and μ = α/(α + β), with α → ∞ and β/α → 0, so that Var(Y/μ) → 0.
  • 40. 2.2 Asymptotic Properties of the New Estimator. 2.2.1 Asymptotic Properties of F̃_n^+(x). Strong consistency holds in general for the estimator F̃_n^+(x); we can easily prove the following theorem, parallel to the strong convergence of the empirical distribution function. Theorem: If v_n → 0 as n → ∞, then sup_x |F̃_n^+(x) − F(x)| → 0 a.s. as n → ∞.
  • 41. We can also show that for large n the smooth estimator can be made arbitrarily close to the edf by a proper choice of v_n, as in the following theorem. Theorem: Assuming that f has a bounded derivative and that v_n → 0 at an appropriate rate, then for some δ > 0 we have, with probability one, sup_{x≥0} |F̃_n^+(x) − F_n(x)| = O(n^{−3/4}(log n)^{1+δ}).
  • 42. 2.2.2 Asymptotic Properties of f̃_n^+(x). Under some regularity conditions we obtain: Theorem: sup_{x≥0} |f̃_n^+(x) − f(x)| → 0 a.s. as n → ∞. Theorem: (a) If n v_n → ∞, n v_n³ → 0 and n v_n ε_n² → 0 as n → ∞, then √(n v_n) (f_n^+(x) − f(x)) → N(0, I₂(q) f(x)/x²) for x > 0. (b) If n v_n ε_n² → ∞ and n v_n ε_n⁴ → 0 as n → ∞, then √(n v_n ε_n²) (f_n^+(0) − f(0)) → N(0, I₂(q) f(0)).
  • 43. 2.3 Extensions to Non-iid Cases. The technique extends to non-iid cases where a version of F_n(x) is available. Chaubey, Y.P., Dewan, I. and Li, J. (2012). Density estimation for stationary associated sequences. Comm. Stat. Simula. Computa. 41(4), 554-572 (using the generalised kernel approach). Chaubey, Y.P., Dewan, I. and Li, J. (2011). Density estimation for stationary associated sequences using Poisson weights. Statist. Probab. Lett. 81, 267-276. Chaubey, Y.P. and Dewan, I. (2010). A review of smooth estimation of survival and density functions for stationary associated sequences: some recent developments. J. Ind. Soc. Agr. Stat. 64(2), 261-272. Chaubey, Y.P., Laïb, N. and Sen, A. (2010). Generalised kernel smoothing for non-negative stationary ergodic processes. Journal of Nonparametric Statistics, 22, 973-997.
  • 44. 3. Estimation of Density for Length-Biased Data. In general, size-biased data arise when the probability that an item is sampled is proportional to its size. The density g of the size-biased observation corresponding to the underlying density f is given by g(x) = w(x) f(x)/μ_w,  x ≥ 0,  (3.1) where w(x) denotes the size measure and μ_w = ∫ w(x) f(x) dx. In forestry, the size measure is usually proportional to either length or area (see Muttlak and McDonald, 1990). Another important application occurs in renewal theory, where inter-event time data are of this type if they are obtained by sampling lifetimes in progress at a randomly chosen point in time (see Cox, 1969).
  • 45. Here we will talk about the length-biased case, w(x) = x, where we can write f(x) = μ g(x)/x.  (3.2) In principle, any smooth estimator of the density g may be transformed into one of f as follows: f̂(x) = μ̂ ĝ(x)/x,  (3.3) where μ̂ is an estimator of μ. Note that 1/μ = E_g(1/X); hence a strongly consistent estimator of μ is given by μ̂ = n {Σ_{i=1}^n X_i^{−1}}^{−1}.
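  A quick numerical check of this harmonic-mean estimator (Python sketch; names are ours). If f is the Gamma(2, 1) density x e^{−x}, the length-biased density is g = Gamma(3, 1) and μ = E_f X = 2, so μ̂ computed from a Gamma(3, 1) sample should be close to 2:

```python
import numpy as np

def mu_hat(data):
    """Harmonic-mean estimator mu_hat = n / sum(1/X_i), strongly
    consistent because 1/mu = E_g(1/X) under length-biased sampling."""
    data = np.asarray(data, dtype=float)
    return len(data) / np.sum(1.0 / data)
```

  Here a Gamma(3, 1) sample plays the role of length-biased data from the underlying Gamma(2, 1) population.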
  • 46. Bhattacharyya et al. (1988) used this strategy in proposing the smooth estimator f̂_B(x) = μ̂ (nx)^{−1} Σ_{i=1}^n k_h(x − X_i).  (3.4) Also, since F(x) = μ E_g(X^{−1} 1_{(X≤x)}), Cox (1969) proposed the following estimator of the distribution function F(x): F̂_n(x) = μ̂ (1/n) Σ_{i=1}^n X_i^{−1} 1_{(X_i≤x)}.  (3.5) So there are two competing strategies for density estimation with length-biased data. One is to estimate g(x) and then use relation (3.3) (i.e., smooth G_n as in Bhattacharyya et al., 1988). The other is to smooth the Cox estimator F̂_n(x) directly and use its derivative as the smooth estimator of f(x).
  • 47. Jones (1991) studied the behaviour of the estimator f̂_B(x) in contrast to the smooth estimator obtained directly by kernel smoothing of F̂_n(x): f̂_J(x) = n^{−1} μ̂ Σ_{i=1}^n X_i^{−1} k_h(x − X_i).  (3.6) He noted that this estimator is a proper density function when considered with support on the whole real line, whereas f̂_B(x) may not be. Comparing the two estimators through simulations and asymptotic arguments, he concluded that the latter may be preferable in practical applications. Also, using Jensen's inequality we find that E_g(μ̂) ≥ 1/E_g{(1/n) Σ_{i=1}^n X_i^{−1}} = μ; hence the estimator μ̂ may be positively biased, which would translate into increased bias in the above density estimators.
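  A sketch of Jones's estimator (Eq. 3.6) with a Gaussian kernel (our choice of kernel and names; Jones's analysis does not depend on this particular k):

```python
import numpy as np

def jones_density(x, data, h):
    """Jones (1991) estimator f_J: an ordinary kernel estimator in which
    each observation is weighted by 1/X_i, scaled by mu_hat."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    w = 1.0 / data
    mu = len(data) / w.sum()                                # mu_hat
    u = (x[:, None] - data[None, :]) / h
    kh = np.exp(-0.5 * u**2) / (np.sqrt(2.0 * np.pi) * h)   # k_h(x - X_i)
    return mu * (w[None, :] * kh).mean(axis=1)
```

  Because the weights μ̂ X_i^{−1}/n sum to one by construction of μ̂, f̂_J integrates to one over the whole real line, which is exactly Jones's point above.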
  • 49. If ĝ(x)/x is integrable, the deficiency of f̂_B(x) of not being a proper density may be corrected by considering the alternative estimator f̂_a(x) = (ĝ(x)/x) / ∫ (ĝ(x)/x) dx,  (3.7) which may also eliminate the increase in bias to some extent. However, since X is typically a non-negative random variable, the estimator should satisfy the following two conditions: (i) ĝ(x) = 0 for x ≤ 0; (ii) ĝ(x)/x is integrable. Neither f̂_B(x) nor f̂_J(x) satisfies these properties. We have a host of alternatives, those based on smoothing G_n and those based on smoothing F̂_n, that we are going to talk about next.
  • 51. 3.1.1 Poisson Smoothing of G_n. Here we apply the weights generated by the Poisson probability mass function, as motivated in Chaubey and Sen (1996, 2000); however, a modification is necessary in the present situation, which is also outlined here. Using Poisson smoothing, an estimator of g(x) may be given by g̃_{nP}(x) = n Σ_{k=0}^∞ p_k(nx) [G_n((k+1)/n) − G_n(k/n)].  (3.8) Note, however, that lim_{x→0} g̃_{nP}(x) = n G_n(1/n), which converges to 0 as n → ∞ but need not be zero in finite samples; hence the density f at x = 0 may not be defined. Furthermore, g̃_{nP}(x)/x is not integrable.
  • 56. With a modification that attaches the weight p_k(nx) to G_n((k−1)/n) rather than to G_n(k/n), the above problem is avoided. This results in the following smooth estimator of G(x): G̃_n(x) = Σ_{k≥0} p_k(nx) G_n((k−1)/n).  (3.9) The basic nature of the smoothing is not changed; however, this provides an alternative estimator of the density function through its derivative, g̃_n(x) = n Σ_{k≥1} p_k(nx) [G_n(k/n) − G_n((k−1)/n)],  (3.10) such that g̃_n(0) = 0 and g̃_n(x)/x is integrable.
  • 57. Since ∫_0^∞ g̃_n(x)/x dx = n Σ_{k≥1} [G_n(k/n) − G_n((k−1)/n)] ∫_0^∞ p_k(nx)/x dx = n Σ_{k≥1} (1/k) [G_n(k/n) − G_n((k−1)/n)] = n Σ_{k≥1} (1/(k(k+1))) G_n(k/n), the new smooth estimator of the length-biased density f(x) is given by f̃_n(x) = n Σ_{k≥1} (1/k) p_{k−1}(nx) [G_n(k/n) − G_n((k−1)/n)] / Σ_{k≥1} (1/(k(k+1))) G_n(k/n).  (3.11)
  • 58. The corresponding smooth estimator of the distribution function F(x) is given by F̃_n(x) = Σ_{k≥1} (1/k) W_k(nx) [G_n(k/n) − G_n((k−1)/n)] / Σ_{k≥1} (1/(k(k+1))) G_n(k/n),  (3.12) where W_k(nx) = (1/Γ(k)) ∫_0^{nx} e^{−y} y^{k−1} dy = Σ_{j≥k} p_j(nx).
  • 59. An equivalent expression for the above estimator is given by F̃_n(x) = Σ_{k≥1} G_n(k/n) [W_k(nx)/k − W_{k+1}(nx)/(k+1)] / Σ_{k≥1} (1/(k(k+1))) G_n(k/n) = 1 + Σ_{k≥1} G_n(k/n) [P_k(nx)/(k+1) − P_{k−1}(nx)/k] / Σ_{k≥1} (1/(k(k+1))) G_n(k/n), where P_k(λ) = Σ_{j=0}^k p_j(λ) denotes the cumulative probability corresponding to the Poisson(λ) distribution. The properties of the above estimators can be established analogously to the regular case.
  • 60. 3.1.2 Gamma Smoothing of G_n. A smooth estimator based on the log-normal density typically has a spike at zero, but the gamma density is appropriate here, since it yields a density estimator ĝ with ĝ(0) = 0, so that no perturbation is required. The smooth density estimator in this case is simply g_n^+(x) = (1/(n x²)) Σ_{i=1}^n X_i q_{v_n}(X_i/x),  (3.13) where q_v(·) denotes the density corresponding to a Gamma(α = 1/v², β = v²) distribution, and the corresponding estimator of the density f is given by f_n^+(x) = (g_n^+(x)/x) / ∫_0^∞ (g_n^+(t)/t) dt.  (3.14)
  • 62. 3.2.1 Poisson Smoothing of F̂_n. Smoothing F̂_n directly using Poisson weights, an estimator of f(x) may be given by f̃_{nP}(x) = n Σ_{k=0}^∞ p_k(nx) [F̂_n((k+1)/n) − F̂_n(k/n)].  (3.15) No modifications are necessary.
  • 64. 3.2.2 Gamma Smoothing of F̂_n. The gamma-based smooth estimator of F(x) is given by F̃_n^+(x) = 1 − Σ_{i=1}^n (1/X_i) Q_{v_n}(X_i/x) / Σ_{i=1}^n (1/X_i),  (3.16) and that for the density f in this case is simply f̃_n^+(x) = (1/(x + ε_n)²) Σ_{i=1}^n q_{v_n}(X_i/(x + ε_n)) / Σ_{i=1}^n (1/X_i),  (3.17) where q_v(·) denotes the density corresponding to a Gamma(α = 1/v², β = v²) distribution. Note that the above estimator is computationally intensive, as two smoothing parameters (v_n and ε_n) have to be computed using bivariate cross-validation.
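  A sketch of (3.17) in Python (with the gamma choice Q_v = Gamma(1/v², v²) as before; names are ours):

```python
import numpy as np
from scipy.stats import gamma

def gamma_lb_density(x, data, v, eps=0.0):
    """Gamma smoothing of the Cox estimator F_n (Eq. 3.17): each kernel
    term carries the length-bias correction weight 1/X_i."""
    a = 1.0 / v**2
    x = np.atleast_1d(np.asarray(x, dtype=float))
    xe = x + eps                                   # perturbation x + eps_n
    num = gamma.pdf(data[None, :] / xe[:, None], a, scale=v**2).sum(axis=1) / xe**2
    return num / np.sum(1.0 / data)
```

  Each summand integrates (in x) to approximately 1/X_i, so the normalisation by Σ 1/X_i makes the total mass close to one.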
  • 66. 4. A Simulation Study. The parent distributions considered are the exponential (χ²₂), χ²₆, lognormal, Weibull and a mixture of exponential densities. Since the computation of the smoothing parameters is very extensive, we approximate MISE and MSE by computing ISE(f_n, f) = ∫_0^∞ [f_n(x) − f(x)]² dx and SE(f_n(x), f(x)) = [f_n(x) − f(x)]² over 1000 samples.
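  The ISE approximation above can be sketched by simple quadrature (Python, trapezoidal rule on a finite grid; function names are ours):

```python
import numpy as np

def ise(f_hat, f_true, grid):
    """Approximate ISE(f_n, f) = integral of (f_n - f)^2 on a finite grid."""
    d = f_hat(grid) - f_true(grid)
    y = d**2
    return float(np.sum((y[1:] + y[:-1]) * np.diff(grid)) / 2.0)
```

  MISE is then approximated by averaging these ISE values over the 1000 simulated samples.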
  • 67. MISE gives the global performance of a density estimator, while MSE shows how the estimator performs locally at points of interest; in particular, we want to know the behaviour of the estimators near the lower boundary. We illustrate only the MISE values. Optimal values of the smoothing parameters are obtained using either the BCV or the UCV criterion, which roughly approximate the mean integrated squared error. For Poisson smoothing as well as for Gamma smoothing the BCV criterion is found to be better, whereas for the Chen and Scaillet methods BCV is not tractable, as it requires an estimate of the derivative of the density.
  • 68. The next table gives the MISE values for the exponential density using the new estimators, compared with Chen's and Scaillet's estimators. Note that we include the simulation results for Scaillet's estimator using the RIG kernel only; the inverse Gaussian kernel is known not to perform well for direct data [see Kulasekera and Padgett (2006)], and similar observations were noted for length-biased data.
  • 69. Table: Simulated MISE for the χ²₂ distribution

Estimator    n=30      n=50      n=100     n=200     n=300     n=500
Chen-1       0.13358   0.08336   0.07671   0.03900   0.03056   0.02554
Chen-2       0.11195   0.08592   0.05642   0.03990   0.03301   0.02298
RIG          0.14392   0.11268   0.07762   0.06588   0.05466   0.04734
Poisson(F)   0.04562   0.03623   0.02673   0.01888   0.01350   0.01220
Poisson(G)   0.08898   0.06653   0.04594   0.03127   0.02487   0.01885
Gamma(F)     0.06791   0.05863   0.03989   0.03135   0.02323   0.01589
Gamma*(F)    0.02821   0.01964   0.01224   0.00796   0.00609   0.00440
Gamma(G)     0.09861   0.07663   0.05168   0.03000   0.02007   0.01317
Gamma*(G)    0.02370   0.01244   0.00782   0.00537   0.00465   0.00356
  • 70. Table: Simulated MSE for χ²₂

n=30
Estimator  x=0      x=0.1    x=1      x=2      x=5
I          0.1307   0.2040   0.0181   0.0044   0.0003
II         0.1187   0.2499   0.0173   0.0045   0.0012
III        0.2222   0.1823   0.0250   0.0074   0.0022
IV         0.1487   0.1001   0.0049   0.0015   0.0005
V          0.3003   0.2438   0.0286   0.0148   0.0013
VI         0.1936   0.1447   0.0117   0.0042   0.0002
VI*        0.0329   0.0286   0.0090   0.0030   9.8×10⁻⁵
VII        0.1893   0.1720   0.0209   0.0066   0.0003
VII*       0.0528   0.0410   0.0032   0.0020   8.4×10⁻⁵

n=50
I          0.1370   0.1493   0.0121   0.0030   0.0002
II         0.1279   0.1894   0.0112   0.0032   0.0008
III        0.2193   0.1774   0.0161   0.0046   0.0046
IV         0.1393   0.0885   0.0034   0.0012   0.0003
V          0.2939   0.1924   0.0218   0.0094   0.0007
VI         0.1808   0.1365   0.0101   0.0036   0.0001
VI*        0.0196   0.0172   0.0070   0.0024   6.8×10⁻⁵
VII        0.1584   0.1440   0.0168   0.0060   0.0002
VII*       0.0322   0.0236   0.0014   0.0012   4.8×10⁻⁵

I: Chen-1, II: Chen-2, III: RIG, IV: Poisson(F), V: Poisson(G), VI: Gamma(F), VI*: Corrected Gamma(F), VII: Gamma(G), VII*: Corrected Gamma(G)
  • 71. Table: Simulated MSE for χ²₂ (continued)

n=100
Estimator  x=0      x=0.1    x=1      x=2      x=5
I          0.1442   0.8201   0.0070   0.0017   0.0001
II         0.1142   0.1391   0.0054   0.0019   0.0005
III        0.2151   0.1631   0.0091   0.0030   0.0020
IV         0.1335   0.0724   0.0023   0.0008   0.0002
V          0.2498   0.1267   0.0116   0.0050   0.0003
VI         0.1332   0.0823   0.0090   0.0032   0.0001
VI*        0.0105   0.0094   0.0047   0.0015   5.9×10⁻⁵
VII        0.1078   0.0980   0.0121   0.0051   0.0002
VII*       0.0280   0.0184   0.0006   0.0007   3.8×10⁻⁵

n=200
I          0.3327   0.0901   0.0046   0.0012   6.6×10⁻⁵
II         0.2111   0.0943   0.0027   0.0012   0.0003
III        0.2127   0.1896   0.0067   0.0019   0.0080
IV         0.1139   0.0545   0.0015   0.0005   0.0001
V          0.1908   0.0703   0.0056   0.0026   0.0001
VI         0.0995   0.0782   0.0065   0.0024   7.4×10⁻⁵
VI*        0.0137   0.0125   0.0031   0.0010   5.8×10⁻⁵
VII        0.0636   0.0560   0.0072   0.0038   0.0002
VII*       0.0217   0.0134   0.0002   0.0005   2.9×10⁻⁵

I: Chen-1, II: Chen-2, III: RIG, IV: Poisson(F), V: Poisson(G), VI: Gamma(F), VI*: Corrected Gamma(F), VII: Gamma(G), VII*: Corrected Gamma(G)
  • 72. Table: Simulated MISE for the χ²₆ distribution

Estimator    n=30      n=50      n=100     n=200     n=300     n=500
Chen-1       0.01592   0.01038   0.00578   0.00338   0.00246   0.00165
Chen-2       0.01419   0.00973   0.00528   0.00303   0.00224   0.00153
RIG          0.01438   0.00871   0.00482   0.00281   0.00208   0.00148
Poisson(F)   0.00827   0.00582   0.00382   0.00241   0.00178   0.00119
Poisson(G)   0.00834   0.00562   0.00356   0.00216   0.00166   0.00117
Gamma(F)     0.01109   0.00805   0.00542   0.00327   0.00249   0.00181
Gamma*(F)    0.01141   0.00844   0.00578   0.00345   0.00264   0.00193
Gamma(G)     0.01536   0.01063   0.00688   0.00398   0.00303   0.00213
Gamma*(G)    0.01536   0.01063   0.00688   0.00398   0.00303   0.00213
  • 73. Table: Simulated MSE for χ²₆

n=30
Estimator  x=0        x=0.1      x=1      x=4      x=10
I          0.0017     0.0018     0.0018   0.0019   0.0001
II         0.0018     0.0017     0.0011   0.0017   0.0002
III        5.6×10⁻⁵   6.7×10⁻⁵   0.0006   0.0017   0.0002
IV         0.0016     0.0016     0.0012   0.0012   0.0001
V          0.0000     2.6×10⁻⁵   0.0017   0.0008   0.0001
VI         0.0011     0.0010     0.0019   0.0012   8.5×10⁻⁵
VI*        0.0015     0.0021     0.0020   0.0012   7.9×10⁻⁵
VII        0.0000     3.6×10⁻⁷   0.0058   0.0008   0.0001
VII*       0.0000     3.6×10⁻⁷   0.0058   0.0008   0.0001

n=50
I          0.0012     0.0013     0.0015   0.0012   0.0001
II         0.0013     0.0012     0.0008   0.0011   0.0001
III        4.6×10⁻⁵   5.7×10⁻⁵   0.0006   0.0005   0.0001
IV         0.0011     0.0011     0.0010   0.0008   8.4×10⁻⁵
V          0.0000     6.7×10⁻⁵   0.0012   0.0005   0.0001
VI         0.0005     0.0005     0.0016   0.0008   5.5×10⁻⁵
VI*        0.0006     0.0015     0.0016   0.0008   5.3×10⁻⁵
VII        0.0000     4.3×10⁻⁶   0.0037   0.0004   7.9×10⁻⁵
VII*       0.0000     4.3×10⁻⁶   0.0037   0.0004   7.9×10⁻⁵

I: Chen-1, II: Chen-2, III: RIG, IV: Poisson(F), V: Poisson(G), VI: Gamma(F), VI*: Corrected Gamma(F), VII: Gamma(G), VII*: Corrected Gamma(G)
  • 74. For the exponential density (χ²₂), f̂_{C2} has smaller MSEs at the boundary and smaller MISEs than f̂_{C1}; that is, f̂_{C2} performs better both locally and globally. A similar result holds for direct data. The Poisson-weight estimator based on F̂_n is found to be better than that based on G_n: although the latter has relatively small MISEs, it has large MSEs at the boundary, much like the Scaillet estimator, which has huge MSEs at the boundary and the largest MISEs. The corrected Gamma estimators have MISE values similar to or smaller than those of the corresponding Poisson-weight estimators. For χ²₆, all estimators have comparable global performance; the Poisson-weight estimators based on F̂_n or G_n perform similarly and may be slightly better than the others.
  • 75. We have considered the following additional distributions for simulation as well: (i) Lognormal: f(x) = (1/(√(2π) σ x)) exp{−(log x − μ)²/(2σ²)} I{x > 0}; (ii) Weibull: f(x) = α x^{α−1} exp(−x^α) I{x > 0}; (iii) Mixture of two exponential distributions: f(x) = [p (1/θ₁) exp(−x/θ₁) + (1 − p) (1/θ₂) exp(−x/θ₂)] I{x > 0}.
  • 76. Table: Simulated MISE for the Lognormal distribution with μ = 0

Estimator    n=30      n=50      n=100     n=200     n=300      n=500
Chen-1       0.12513   0.08416   0.05109   0.03450   0.02514    0.01727
Chen-2       0.12327   0.08886   0.05200   0.03545   0.02488    0.01717
RIG          0.14371   0.09733   0.05551   0.03308   0.02330    0.01497
Poisson(F)   0.05559   0.04379   0.02767   0.01831   0.01346    0.01001
Poisson(G)   0.06952   0.04820   0.03158   0.01470   0.01474    0.01061
Gamma*(F)    0.06846   0.05614   0.03963   0.02640   0.01998    0.01470
Gamma*(G)    0.16365   0.12277   0.07568   0.04083   0.029913   0.02035
  • 77. Table: Simulated MSE for the Lognormal distribution with μ = 0

n=30
Estimator  x=0      x=0.1    x=1      x=5        x=8
I          0.1108   0.0618   0.0211   2.0×10⁻⁴   2.8×10⁻⁵
II         0.1045   0.0441   0.0196   6.0×10⁻⁴   6.4×10⁻⁵
III        0.0026   0.0494   0.0207   3.0×10⁻⁴   3.2×10⁻⁵
IV         0.1307   0.0485   0.0071   3.5×10⁻⁴   3.8×10⁻⁵
V          0.0009   0.0810   0.0126   2.2×10⁻⁴   2.6×10⁻⁵
VI*        0.0090   0.1546   0.0133   2.0×10⁻⁴   9.1×10⁻⁵
VII*       0.0007   0.7321   0.0121   8.2×10⁻⁵   9.4×10⁻⁶

n=50
I          0.1123   0.0535   0.0158   1.4×10⁻⁴   1.1×10⁻⁵
II         0.1056   0.0442   0.0110   3.7×10⁻⁴   2.7×10⁻⁵
III        0.0027   0.0436   0.0133   2.0×10⁻⁴   1.5×10⁻⁵
IV         0.1398   0.0482   0.0050   1.8×10⁻⁴   1.6×10⁻⁵
V          0.0020   0.0641   0.0090   1.3×10⁻⁴   1.3×10⁻⁵
VI*        0.0035   0.1349   0.0111   1.7×10⁻⁴   6.6×10⁻⁵
VII*       0.0000   0.5482   0.0080   4.7×10⁻⁵   4.9×10⁻⁶

I: Chen-1, II: Chen-2, III: RIG, IV: Poisson(F), V: Poisson(G), VI*: Corrected Gamma(F), VII*: Corrected Gamma(G)
  • 78. Table: Simulated MSE for the Lognormal distribution with μ = 0 (continued)

n=100
Estimator  x=0      x=0.1    x=1      x=5        x=8
I          0.1044   0.0486   0.0086   5.1×10⁻⁵   8.2×10⁻⁶
II         0.1038   0.0412   0.0059   1.5×10⁻⁴   1.3×10⁻⁵
III        0.0028   0.0383   0.0064   7.2×10⁻⁵   7.2×10⁻⁶
IV         0.1053   0.0424   0.0029   4.9×10⁻⁵   4.7×10⁻⁶
V          0.0018   0.0541   0.0060   5.7×10⁻⁵   6.8×10⁻⁶
VI*        0.0033   0.1011   0.0086   1.1×10⁻⁴   4.7×10⁻⁵
VII*       0.0000   0.3237   0.0044   1.9×10⁻⁵   2.1×10⁻⁶

n=200
I          0.0854   0.0422   0.0054   2.7×10⁻⁵   2.9×10⁻⁶
II         0.0871   0.0387   0.0036   6.5×10⁻⁵   4.6×10⁻⁶
III        0.0024   0.0318   0.0042   3.5×10⁻⁵   3.3×10⁻⁶
IV         0.0663   0.0320   0.0019   2.3×10⁻⁵   2.4×10⁻⁶
V          0.0019   0.0309   0.0027   2.0×10⁻⁵   2.5×10⁻⁶
VI*        0.0058   0.0717   0.0058   9.2×10⁻⁵   3.3×10⁻⁵
VII*       0.0015   0.1780   0.0026   1.0×10⁻⁵   1.1×10⁻⁶

I: Chen-1, II: Chen-2, III: RIG, IV: Poisson(F), V: Poisson(G), VI*: Corrected Gamma(F), VII*: Corrected Gamma(G)
  • 79. Table: Simulated MISE for the Weibull distribution with α = 2

Estimator    n=30      n=50      n=100     n=200     n=300     n=500
Chen-1       0.10495   0.06636   0.03884   0.02312   0.01700   0.01167
Chen-2       0.08651   0.05719   0.03595   0.02225   0.01611   0.01111
RIG          0.08530   0.05532   0.03227   0.01984   0.01470   0.01045
Poisson(F)   0.04993   0.03658   0.02432   0.01459   0.01179   0.00856
Poisson(G)   0.05288   0.03548   0.02268   0.01392   0.01106   0.00810
Gamma*(F)    0.08358   0.06671   0.04935   0.03169   0.02652   0.01694
Gamma*(G)    0.12482   0.08526   0.05545   0.03402   0.02731   0.02188
  • 80. Table: Simulated MSE for the Weibull distribution with α = 2

n=30
Estimator  x=0        x=0.1    x=1      x=2      x=3
I          0.0856     0.1343   0.0588   0.0030   1.9×10⁻⁴
II         0.0949     0.0555   0.0398   0.0116   1.4×10⁻⁴
III        0.0025     0.0802   0.0394   0.0095   4.6×10⁻⁴
IV         0.0844     0.0548   0.0280   0.0086   6.9×10⁻⁴
V          0.0068     0.0636   0.0186   0.0031   3.1×10⁻⁵
VI*        0.0019     0.1049   0.0682   0.0053   0.0022
VII*       0.0000     0.2852   0.0336   0.0011   1.8×10⁻⁴

n=50
I          0.0644     0.0576   0.0349   0.0020   1.0×10⁻⁴
II         0.0679     0.0431   0.0223   0.0077   7.1×10⁻⁴
III        0.0021     0.0208   0.0218   0.0063   2.2×10⁻⁴
IV         0.0682     0.0427   0.0217   0.0059   3.7×10⁻⁴
V          0.0025     0.0453   0.0138   0.0018   1.6×10⁻⁵
VI*        1.1×10⁻⁶   0.0763   0.0560   0.0048   0.0018
VII*       0.0000     0.1865   0.0251   0.0008   1.4×10⁻⁴

I: Chen-1, II: Chen-2, III: RIG, IV: Poisson(F), V: Poisson(G), VI*: Corrected Gamma(F), VII*: Corrected Gamma(G)
  • 81. Table: Simulated MISE for the mixture of two exponential distributions with p = 0.4, θ₁ = 2 and θ₂ = 1

Estimator    n=30      n=50      n=100     n=200     n=300     n=500
Chen-1       0.22876   0.17045   0.08578   0.06718   0.05523   0.03811
Chen-2       0.17564   0.15083   0.07331   0.08029   0.04931   0.03808
RIG          0.25284   0.20900   0.13843   0.10879   0.09344   0.07776
Poisson(F)   0.06838   0.05746   0.04116   0.02612   0.01896   0.01179
Poisson(G)   0.11831   0.09274   0.06863   0.05019   0.03881   0.03044
Gamma*(F)    0.04147   0.02645   0.01375   0.00758   0.00532   0.00361
Gamma*(G)    0.02534   0.01437   0.01091   0.01223   0.01132   0.00994
  • 82. Table: Simulated MSE for the mixture of two exponential distributions with p = 0.4, θ₁ = 2 and θ₂ = 1

n=30
Estimator  x=0      x=0.1    x=1      x=2      x=10
I          0.3499   0.3075   0.0249   0.0037   2.6×10⁻⁶
II         0.3190   0.3181   0.0245   0.0071   1.3×10⁻⁵
III        0.5610   0.4423   0.0564   0.0056   2.9×10⁻⁶
IV         0.3778   0.1907   0.0057   0.0027   1.7×10⁻⁶
V          0.6409   0.3237   0.0156   0.0043   2.1×10⁻⁶
VI*        0.0652   0.0549   0.0098   0.0006   1.1×10⁻⁴
VII*       0.0696   0.0539   0.0065   0.0009   1.4×10⁻⁵

n=50
I          0.3158   0.7921   0.0128   0.0023   1.1×10⁻⁶
II         0.2848   0.7600   0.0143   0.0051   2.3×10⁻⁶
III        0.5582   0.8473   0.0364   0.0041   1.3×10⁻⁶
IV         0.3840   0.1633   0.0051   0.0020   1.0×10⁻⁶
V          0.6228   0.2673   0.0121   0.0028   1.3×10⁻⁶
VI*        0.0489   0.0414   0.0066   0.0004   7.7×10⁻⁵
VII*       0.0500   0.0336   0.0030   0.0007   9.2×10⁻⁶

I: Chen-1, II: Chen-2, III: RIG, IV: Poisson(F), V: Poisson(G), VI*: Corrected Gamma(F), VII*: Corrected Gamma(G)
  • 83. The basic conclusion is that smoothing F̂_n with Poisson weights or with the corrected Gamma kernel performs similarly, and both produce better boundary correction than the Chen or Scaillet asymmetric-kernel estimators. Smoothing G_n may have large local MSE near the boundary and hence is not preferable to smoothing F̂_n. A similar message is given in Jones and Karunamuni (1997, Austral. J. Statist.).
  • 84. References. Babu, G.J., Canty, A.J. and Chaubey, Y.P. (2002). Application of Bernstein polynomials for smooth estimation of a distribution and density function. J. Statist. Plann. Inference 105(2), 377-392. Babu, G.J. and Chaubey, Y.P. (2006). Smooth estimation of a distribution and density function on a hypercube using Bernstein polynomials for dependent random vectors. Statist. Probab. Lett. 76, 959-969. Bagai, I. and Prakasa Rao, B.L.S. (1996). Kernel type density estimates for positive valued random variables. Sankhyā A 57, 56-67. Bhattacharyya, B.B., Franklin, L.A. and Richardson, G.D. (1988). A comparison of nonparametric unweighted and length biased density estimation of fibres. Comm. Statist. Theory Methods 17(11), 3629-3644.
  • 86. Bouezmarni, T., Rombouts, J.V.K. and Taamouti, A. (2010). Asymptotic properties of the Bernstein density copula for dependent data. J. Multivariate Anal. 101, 1-10. Bouezmarni, T. and Scaillet, O. (2005). Consistency of asymmetric kernel density estimators and smoothed histograms with application to income data. Econometric Theory 21, 390-412. Chaubey, Y.P. and Kochar, S. (2000). Smooth estimation of stochastically ordered survival functions. J. Indian Statist. Assoc. 38, 209-225. Chaubey, Y.P. and Kochar, S.C. (2006). Smooth estimation of uniformly stochastically ordered survival functions. J. Comb. Inform. System Sci. 31, 1-13. Chaubey, Y.P. and Sen, P.K. (1996). On smooth estimation of survival and density function. Statist. Decisions 14, 1-22.
  • 87. Chaubey, Y.P. and Sen, P.K. (1998a). On smooth estimation of hazard and cumulative hazard functions. In Frontiers of Probability and Statistics, S.P. Mukherjee et al. (eds.), Narosa, New Delhi, 91-99. Chaubey, Y.P. and Sen, P.K. (1998b). On smooth functional estimation under random censorship. In Frontiers in Reliability 4, Series on Quality, Reliability and Engineering Statistics (A.P. Basu et al., eds.), World Scientific, Singapore, 83-97. Chaubey, Y.P. and Sen, P.K. (1999). On smooth estimation of mean residual life. J. Statist. Plann. Inference 75, 223-236. Chaubey, Y.P. and Sen, P.K. (2002a). Smooth isotonic estimation of density, hazard and MRL functions. Calcutta Statist. Assoc. Bull. 52, 99-116. Chaubey, Y.P. and Sen, P.K. (2002b). Smooth estimation of multivariate survival and density functions. J. Statist. Plann. Inference 103, 361-376. Chaubey, Y.P., Sen, A. and Sen, P.K. (2007a). A new smooth density estimator for non-negative random variables. Technical Report No. 01/07, Department of Mathematics and Statistics, Concordia University, Montreal. Chaubey, Y.P. and Xu, H. (2007b). Smooth estimation
  • 89. Chaubey, Y.P. and Sen, P.K. (2009). On the selection of the smoothing parameter in Poisson smoothing of histogram estimator: computational aspects. Pak. J. Statist. 25(4), 385-401. Chaubey, Y.P., Sen, P.K. and Li, J. (2010a). Smooth density estimation for length biased data. J. Ind. Soc. Agr. Stat. 64(2), 145-155. Chaubey, Y.P., Dewan, I. and Li, J. (2010b). Smooth estimation of survival and density functions for a stationary associated process using Poisson weights. Statist. Probab. Lett. 81, 267-276. Chaubey, Y.P., Sen, A., Sen, P.K. and Li, J. (2012). A new smooth density estimator for non-negative random variables. J. Indian Statist. Assoc. 50, 83-104.
Chen, S. X. (1999). Beta kernel estimators for density functions. Computational Statistics and Data Analysis, 31, 131–145.
Chen, S. X. (2000). Probability density function estimation using gamma kernels. Annals of the Institute of Statistical Mathematics, 52, 471–480.
Cox, D. R. (1969). Some sampling problems in technology. In New Developments in Survey Sampling (N. L. Johnson and H. Smith, eds.). New York: Wiley-Interscience.
Feller, W. (1965). An Introduction to Probability Theory and Its Applications, Vol. II. New York: Wiley.
Földes, A. and Révész, P. (1974). A general method for density estimation. Studia Sci. Math. Hungar., 81–92.
Gawronski, W. (1980). Verallgemeinerte Bernsteinfunktionen und Schätzung einer Wahrscheinlichkeitsdichte. Habilitationsschrift, Universität Ulm.
Gawronski, W. (1985). Strong laws for density estimators of Bernstein type. Periodica Mathematica Hungarica, 16, 23–43.
Gawronski, W. and Stadtmüller, U. (1980). On density estimation by means of Poisson's distribution. Scandinavian Journal of Statistics, 7, 90–94.
Gawronski, W. and Stadtmüller, U. (1981). Smoothing of histograms by means of lattice and continuous distributions. Metrika, 28, 155–164.
Jones, M. C. (1991). Kernel density estimation for length biased data. Biometrika, 78, 511–519.
Jones, M. C. and Karunamuni, R. J. (1997). Fourier series estimation for length biased data. Australian Journal of Statistics, 39, 57–68.
Kakizawa, Y. (2011). A note on generalized Bernstein polynomial density estimators. Statistical Methodology, 8, 136–153.
Kulasekera, K. B. and Padgett, W. J. (2006). Bayes bandwidth selection in kernel density estimation with censored data. Journal of Nonparametric Statistics, 18, 129–143.
Marron, J. S. and Ruppert, D. (1994). Transformations to reduce boundary bias in kernel density estimation. J. Roy. Statist. Soc. Ser. B, 56, 653–671.
Muttlak, H. A. and McDonald, L. L. (1990). Ranked set sampling with size-biased sampling with applications to wildlife populations and human families. Biometrics, 46, 435–445.
Prakasa Rao, B. L. S. (1983). Nonparametric Functional Estimation. New York: Academic Press.
Prakasa Rao, B. L. S. (2005). Estimation of distributions and density functions by generalized Bernstein polynomials. Indian J. Pure and Applied Math., 36, 63–88.
Rosenblatt, M. (1956). Remarks on some nonparametric estimates of a density function. Ann. Math. Statist., 27, 832–837.
Scaillet, O. (2004). Density estimation using inverse and reciprocal inverse Gaussian kernels. Journal of Nonparametric Statistics, 16, 217–226.
Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. London: Chapman and Hall.
Vitale, R. A. (1975). A Bernstein polynomial approach to density estimation. In Statistical Inference and Related Topics (M. L. Puri, ed.), Vol. 2, 87–100. New York: Academic Press.
Wand, M. P., Marron, J. S. and Ruppert, D. (1991). Transformations in density estimation. Journal of the American Statistical Association, 86, 343–361.
Talk slides are available on SlideShare: http://www.slideshare.net/YogendraChaubey/talk-slides-isi2014
THANKS!!