THEME – 2 On Normalizing Transformations of the Coeﬃcient of Variation for a Normal Population with an Application to Evaluation of Uniformity of Plant Varieties
1.
On Normalizing Transformations of the Coeﬃcient of
Variation for a Normal Population with an Application to
Evaluation of Uniformity of Plant Varieties
Yogendra P. Chaubey∗
Department of Mathematics and Statistics
Concordia University, Montreal, Canada H3G 1M8
E-mail: yogen.chaubey@concordia.ca
∗ Joint work with M. Singh, ICARDA, Aleppo, Syria, and Debaraj Sen,
Department of Mathematics and Statistics, Concordia University, Montreal,
Canada
Talk to be presented at the International Workshop on Applied
Mathematics and Omics Technologies for Discovering Biodiversity and Genetic
Resources for Climate Change Mitigation and Adaptation to Sustain Agriculture
in Drylands, ICARDA, Rabat, Morocco
June 24-27, 2014
Yogendra P. Chaubey () Department of Mathematics & Statistics Concordia University 1 / 49
2.
Abstract
The variance stabilizing transformation (VST), formally introduced by
Bartlett (1947, Biometrics), is quite popular in statistical applications
because of its approximate normalizing property. This property arises mainly
because variance stabilizing transformations may be more symmetric than the
untransformed statistics. Chaubey and Mudholkar (1983, Technical Report,
Concordia University) developed a differential equation, analogous to
Bartlett's, for obtaining an approximately symmetrizing transformation and
illustrated its use in some common examples. In general, the transformation
may be computationally intensive, as illustrated in Chaubey, Singh and Sen
(2013, Comm. Stat. - Simul. Comput.) for the coefficient of variation from
normal samples. In this talk we review these transformations in this light
and examine some new transformations, along with an application to
evaluating the uniformity of plant varieties.
3.
Outline
1 Introduction
2 Symmetrizing and Variance Stabilizing Transformations
3 A Condition under which VST is ST
Fisher’s transformation of correlation coeﬀ.
Arcsin Transformation for the Binomial Proportion
Square root transformation for Poisson RV
Chi-square Random Variable
4 Symmetrizing transformations in Standard Cases
5 VST and ST for Coeﬃcient of Variation
Appendix: R-Codes for Computing the Symmetrizing Transformation
Small Sample Adjustment
Inverse Gaussian Distribution
6 An Application
4.
Introduction
The transformations along with the approximations are important for
both genetic resources data and climate data and appear as a
prerequisite for raw data analysis.
The earliest consideration of a transformation that stabilizes the
variance is due to Fisher (1915, 1922), who proposed $Z = \tanh^{-1} r$ and
$\sqrt{2\chi^2_\nu} - \sqrt{2\nu - 1}$ as approximately normalizing
transformations of the correlation coefficient $r$ and the $\chi^2_\nu$
variable, respectively.
Bartlett (1947) introduced variance stabilizing transformations
formally for the purpose of utilizing the usual analysis of variance in
the absence of homoscedasticity.
He showed how to derive these using a differential equation and, as
illustrations, confirmed the variance stabilizing character of $Z$ and
$\sqrt{\chi^2_\nu}$, and gave many additional examples, including the square
root of a Poisson random variable and the function $\arcsin\sqrt{p}$ of the
binomial sample proportion $p$.
5.
Introduction
Since then, these transformations have been variously studied and
refined, essentially with a view to improving normality. Thus,
Anscombe (1948) improved $\sqrt{X}$ for the Poisson variable $X$ to
$\sqrt{X + 3/8}$, and $\arcsin\sqrt{p}$ to
$\arcsin\sqrt{(p + 3/(8n))/(1 + 3/(4n))}$, and Hotelling (1953), in his
definitive study of the distribution of the correlation coefficient,
proposed numerous improvements of $Z$.
Now, we note that even though many variance stabilizing
transformations of random variables have near normal distributions
and they simplify the inference problems such as conﬁdence interval
estimation of the parameter, the stability of variance is not necessary
for normality. However, approximate symmetry is clearly a prerequisite
of any approximately normalizing transformation.
6.
Introduction
Hence, an approximately symmetrizing transformation of a random
variable may be a more eﬀective method of normalizing it than
stabilizing its variance.
Historically, this was ﬁrst illustrated by Wilson and Hilferty (1931),
who showed that the cube root of a chi square variable obtained by
them as an approximately symmetrizing power-transformation
provides a normal approximation superior to that based on Fisher’s
variance stabilizing transformation.
Their approach of constructing a skewness reducing power
transformation has now been extended to many other distributions,
e.g. to non-central chi square by Sankaran (1959), to quadratic forms
by Jensen and Solomon (1972), to sample variance from non-normal
populations and multivariate likelihood ratio statistics by Mudholkar
and Trivedi (1980, 1981a, 1981b).
7.
Introduction
In this talk, we present the results explored in Chaubey and Mudholkar
(1983) with respect to developing a diﬀerential equation analogous to
Bartlett’s, which gives an approximately symmetrizing transformation.
This paper also examines some of the standard transformations in this
light.
Next we consider the computing aspects of these transformations,
illustrated for the coefficient of variation for normal populations as
discussed in Chaubey, Singh and Sen (2013), and indicate their
adaptation to the inverse Gaussian case.
An application in the context of assessing uniformity of two plant
varieties is illustrated.
8.
Preliminaries
Let $T_n$ be a statistic based on a random sample of size $n$, constructed
to estimate a parameter $\theta$. Further, assume that
$\sqrt{n}(T_n - \theta)$ tends to follow $N(0, \sigma^2(\theta))$ as
$n \to \infty$. Denote the $j$th central moment of $T_n$ by
$$\mu_j(\theta) = E(T_n - \mu(\theta))^j, \quad j = 1, 2, \ldots$$
where
$$\mu(\theta) = E(T_n).$$
9.
Preliminaries
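A smooth function $g(T_n)$, intended for use as a transformation, can
be approximated by its Taylor expansion as
$$g(T_n) - g(\theta) \approx (T_n - \theta)\,g'(\theta) + \frac{1}{2}(T_n - \theta)^2 g''(\theta), \tag{2.1}$$
where
$$g'(\theta) = \frac{dg(\theta)}{d\theta} \quad\text{and}\quad g''(\theta) = \frac{d^2 g(\theta)}{d\theta^2}.$$
Hence, as a first approximation, we have
$$g(T_n) - E[g(T_n)] \approx (T_n - \mu(\theta))\bigl(g'(\theta) + \xi_1(\theta)\,g''(\theta)\bigr) + \frac{1}{2}\bigl[(T_n - \mu(\theta))^2 - \mu_2(\theta)\bigr] g''(\theta), \tag{2.2}$$
where $\xi_1(\theta) = \mu(\theta) - \theta$.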
10.
Preliminaries
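Define
$$R = \frac{g''(\theta)}{g'(\theta)} \quad\text{and}\quad R_1 = \frac{R}{1 + \xi_1(\theta) R}.$$
Then, from (2.2), an approximate expression for the variance $\mu_{2g}$
of $g(T_n)$ is
$$\mu_{2g} = (g'(\theta))^2 (1 + \xi_1(\theta) R)^2 \Bigl[\mu_2(\theta) + R_1 \mu_3(\theta) + \frac{1}{4} R_1^2 \bigl(\mu_4(\theta) - \mu_2^2(\theta)\bigr)\Bigr]. \tag{2.3}$$
Similarly, the third central moment $\mu_{3g}$ of $g(T_n)$ (up to order
$O(1/n^2)$) is approximately given by
$$\mu_{3g} = (g'(\theta))^3 (1 + \xi_1(\theta) R)^3 \Bigl[\mu_3(\theta) + \frac{3}{2} R_1 \bigl(\mu_4(\theta) - \mu_2^2(\theta)\bigr)\Bigr], \tag{2.4}$$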
11.
Variance Stabilizing Transformation
where we have omitted terms containing central moments of order
higher than 4 (this assumes that the third and fourth central
moments are of order $O(1/n^2)$ and that the higher-order moments are of
smaller order).
A variance stabilizing transformation (VST; see Rao (1973)) may
now be obtained using (2.3). Ignoring the last two terms, $g(\cdot)$ is an
approximate VST if $(g'(\theta))^2 \mu_2(\theta)$ is constant, or
$$g'(\theta) = \frac{C}{\sigma(\theta)},$$
where $C$ is a constant. Hence
$$g(\theta) = C \int \frac{1}{\sigma(\theta)}\, d\theta. \tag{2.5}$$
12.
Symmetrizing Transformation:
To derive the symmetrizing transformation (ST), the third central moment
of $g(T_n)$ given in (2.4) may be equated to zero. Thus, for an ST $g$,
$$\mu_3(\theta) + \frac{3}{2} R_1 \bigl(\mu_4(\theta) - \mu_2^2(\theta)\bigr) = 0, \tag{2.6}$$
which gives
$$\frac{g''(\theta)}{g'(\theta)} = -\frac{2}{3}\, \frac{\mu_3(\theta)}{\mu_4(\theta) - \mu_2^2(\theta)}, \tag{2.7}$$
where again the terms involving $\xi_1(\theta)\mu_3(\theta)$ have been ignored.
The solution of this equation can be written as (see Chaubey and
Mudholkar, 1983):
$$g(\theta) = \int e^{-a(\theta)}\, d\theta \tag{2.8}$$
13.
A Condition under which VST is ST
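where
$$a(\theta) = \frac{2}{3} \int \frac{f_1(\theta)}{f_2(\theta)}\, d\theta \tag{3.1}$$
with $f_1(\cdot)$ and $f_2(\cdot)$ being defined as
$$f_1(\theta) = \mu_3(\theta), \tag{3.2}$$
$$f_2(\theta) = \mu_4(\theta) - \mu_2^2(\theta). \tag{3.3}$$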
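The step from (2.7) to (2.8) is a direct integration. The sketch below spells it out, assuming $g'(\theta) > 0$ and taking the constants of integration to be zero; this is a convenience adopted here, since a symmetrizing transformation is only determined up to location and scale.

```latex
\begin{align*}
\frac{g''(\theta)}{g'(\theta)} = \frac{d}{d\theta}\ln g'(\theta)
  &= -\frac{2}{3}\,\frac{f_1(\theta)}{f_2(\theta)} \\
\Rightarrow\quad \ln g'(\theta)
  &= -\frac{2}{3}\int \frac{f_1(\theta)}{f_2(\theta)}\,d\theta = -a(\theta) \\
\Rightarrow\quad g(\theta)
  &= \int e^{-a(\theta)}\,d\theta .
\end{align*}
```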
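It is natural to ask if and when a VST can also be an ST. Such a condition
may be derived by requiring $\mu_{3g} = 0$ for the $g$ obtained from the VST,
using Eq. (2.7).
It is easily seen that this condition is given by the equation
$$\frac{1}{\sigma(\theta)} \Bigl\{ f_1(\theta) - \frac{3}{2} f_2(\theta)\, \frac{d \ln \sigma(\theta)}{d\theta} \Bigr\} = 0,$$
that is,
$$\frac{d \ln \sigma(\theta)}{d\theta} = \frac{2}{3}\, \frac{f_1(\theta)}{f_2(\theta)}. \tag{3.4}$$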
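For readers who want the intermediate step, condition (3.4) can be recovered as follows. This is a sketch using only (2.5) and (2.7), with $\sigma'(\theta)$ denoting $d\sigma(\theta)/d\theta$ (notation introduced here).

```latex
\begin{align*}
g'(\theta) = \frac{C}{\sigma(\theta)}
  \quad&\Rightarrow\quad
  g''(\theta) = -\frac{C\,\sigma'(\theta)}{\sigma^2(\theta)}, \\
R = \frac{g''(\theta)}{g'(\theta)}
  &= -\frac{\sigma'(\theta)}{\sigma(\theta)}
   = -\frac{d\ln\sigma(\theta)}{d\theta}, \\
\text{so (2.7) holds for the VST if and only if}\quad
  \frac{d\ln\sigma(\theta)}{d\theta}
  &= \frac{2}{3}\,\frac{f_1(\theta)}{f_2(\theta)} .
\end{align*}
```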
14.
Standard Transformations
We may examine the extent to which some standard VSTs are STs in the
light of the above condition.
Fisher's transformation of the correlation coefficient:
Using the results from Hotelling (1953), we have
$f_1(\rho) = -6\rho(1 - \rho^2)^3/n^2$, $f_2(\rho) = 2(1 - \rho^2)^4/n^2$ and
$\sigma(\rho) = 1 - \rho^2$.
It is easily seen that the condition in Eq. (3.4) is satisfied, as both sides
of the equation equal $-2\rho/(1 - \rho^2)$.
Arcsin transformation for the binomial proportion:
For the binomial proportion $\theta$, we have
$f_1(\theta) = \theta(1 - \theta)(1 - 2\theta)/n^2$,
$f_2(\theta) = 2\theta^2(1 - \theta)^2/n^2$, and
$\sigma(\theta) = \sqrt{\theta(1 - \theta)}$. In this case
$$\frac{2}{3}\, \frac{f_1(\theta)}{f_2(\theta)} = \frac{1}{3}\, \frac{1 - 2\theta}{\theta(1 - \theta)}.$$
15.
Standard Transformations
However,
$$\frac{d \ln \sigma(\theta)}{d\theta} = \frac{1}{2}\, \frac{1 - 2\theta}{\theta(1 - \theta)}.$$
Hence the condition in (3.4) is not satisfied. This implies that a
better normalizing transformation than the VST, $\arcsin\sqrt{p}$, may be
available.
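The two checks above can also be done numerically. The following is a minimal R sketch, not code from the talk: it evaluates both sides of (3.4) for the correlation coefficient and the binomial proportion, taking $\sigma(\theta) = \sqrt{\theta(1-\theta)}$ in the binomial case. The helper names (`cases`, `dlog_sigma`, `check`) and the grid of parameter values are illustrative choices.

```r
# Both sides of condition (3.4) for two of the standard cases.
# The factor 1/n^2 common to f1 and f2 cancels in the ratio, so it is omitted.
cases <- list(
  correlation = list(
    f1    = function(t) -6 * t * (1 - t^2)^3,
    f2    = function(t) 2 * (1 - t^2)^4,
    sigma = function(t) 1 - t^2
  ),
  binomial = list(
    f1    = function(t) t * (1 - t) * (1 - 2 * t),
    f2    = function(t) 2 * t^2 * (1 - t)^2,
    sigma = function(t) sqrt(t * (1 - t))
  )
)

# Central-difference approximation to d ln(sigma) / d theta
dlog_sigma <- function(sigma, t, h = 1e-6) {
  (log(sigma(t + h)) - log(sigma(t - h))) / (2 * h)
}

theta <- c(0.2, 0.3, 0.7)   # illustrative parameter values
check <- lapply(cases, function(cs) {
  data.frame(theta = theta,
             lhs   = dlog_sigma(cs$sigma, theta),
             rhs   = (2 / 3) * cs$f1(theta) / cs$f2(theta))
})
print(check)
```

The `lhs` and `rhs` columns should agree for the correlation coefficient and differ for the binomial proportion, in line with the conclusions stated above.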
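Square root transformation for Poisson RV:
In this case $f_1(\theta) = \theta$, $f_2(\theta) = \theta + 2\theta^2$ and
$\sigma(\theta) = \sqrt{\theta}$, so that
$$\frac{2}{3}\, \frac{f_1(\theta)}{f_2(\theta)} = \frac{2}{3(1 + 2\theta)}, \quad\text{whereas}\quad \frac{d \ln \sigma(\theta)}{d\theta} = \frac{1}{2\theta}.$$
Again, in this case the condition does not hold.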
16.
Standard Transformations
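Chi-square Random Variable:
Let $X$ be distributed as $\chi^2_{n\theta}$. Letting $T_n = X/n$, we have
$f_1(\theta) = 8\theta/n^2$, $f_2(\theta) = 8\theta^2/n^2 + O(1/n^3)$, and
$\sigma(\theta) = \sqrt{2\theta}$. The VST is given by $\sqrt{2T_n}$. Here
$$\frac{2}{3}\, \frac{f_1(\theta)}{f_2(\theta)} = \frac{2}{3\theta}, \quad\text{whereas}\quad \frac{d \ln \sigma(\theta)}{d\theta} = \frac{1}{2\theta},$$
and the condition is again not satisfied.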
17.
Symmetrizing transformations in Standard Cases
The above examples demonstrate that it may be possible to obtain a
better normalizing transformation than the one given by the variance
stabilizing transformation. We now use the solution (2.8) of the
differential equation (2.7) to obtain such transformations in the examples
discussed above.
Correlation Coefficient:
In this case
$$g(\rho) = \int \exp\Bigl[\int \frac{2\rho}{1 - \rho^2}\, d\rho\Bigr] d\rho = \int \frac{1}{1 - \rho^2}\, d\rho = \frac{1}{2} \ln \frac{1 + \rho}{1 - \rho}, \tag{4.1}$$
which is the well-known Fisher's Z transformation, confirming the
conclusion reached earlier (see Chaubey and Mudholkar (1984)).
18.
Symmetrizing transformations in Standard Cases
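Binomial Proportion:
In this case the ST is given by
$$g(\theta) = \int \theta^{-1/3} (1 - \theta)^{-1/3}\, d\theta. \tag{4.2}$$
This integral does not have an explicit solution; however, it can be
evaluated numerically. Later on we include a program for finding the ST
for the coefficient of variation that can be easily adapted here.
The ST may be contrasted with the VST given by
$$g_v(\theta) = \int \theta^{-1/2} (1 - \theta)^{-1/2}\, d\theta = 2 \sin^{-1}\sqrt{\theta}, \tag{4.3}$$
that is, the arcsine transformation $\sin^{-1}\sqrt{p}$ up to a constant
multiple.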
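As an illustration of the numerical evaluation of (4.2), the sketch below (not code from the talk) uses the fact that the definite integral from 0 to $\theta$ is the incomplete beta function $B(\theta; 2/3, 2/3)$, which base R exposes through `pbeta()` and `beta()`. Starting the integral at 0 is an assumption about the constant of integration, and the function names are illustrative.

```r
# Symmetrizing transformation (4.2): integral of u^(-1/3) (1 - u)^(-1/3)
# from 0 to theta, i.e. the incomplete beta function B(theta; 2/3, 2/3)
st_binomial <- function(theta) pbeta(theta, 2 / 3, 2 / 3) * beta(2 / 3, 2 / 3)

# The same quantity by direct numerical integration, as a cross-check
st_binomial_num <- function(theta) {
  sapply(theta, function(t)
    integrate(function(u) u^(-1 / 3) * (1 - u)^(-1 / 3), 0, t)$value)
}

# Variance stabilizing transformation (4.3), up to a constant multiple
vst_binomial <- function(theta) asin(sqrt(theta))

theta <- c(0.1, 0.25, 0.5, 0.9)   # illustrative values of the proportion
print(cbind(theta,
            st     = st_binomial(theta),
            st_num = st_binomial_num(theta),
            vst    = vst_binomial(theta)))
```

The `st` and `st_num` columns should agree to within the tolerance of `integrate()`; the closed form via `pbeta()` is the cheaper of the two to evaluate.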
Poisson Variable:
In this case the ST is given by
$$g(\theta) = \frac{3}{2}\, \theta^{2/3}. \tag{4.4}$$
19.
Symmetrizing transformations in Standard Cases
Thus the Poisson variable is better normalized by a power
transformation with power = 2/3 as compared to the VST with
power= 1/2.
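The effect of the two powers on skewness can be examined exactly from the Poisson probability mass function. The following R sketch is an illustration added here, not part of the talk; the function name `skew_poisson`, the truncation point `upper` and the grid of means are assumptions.

```r
# Exact skewness of g(X) for X ~ Poisson(theta), computed from the
# probability mass function truncated at `upper`
skew_poisson <- function(g, theta, upper = 500) {
  x <- 0:upper            # the tail mass beyond `upper` is negligible here
  p <- dpois(x, theta)
  y <- g(x)
  m <- sum(p * y)
  sum(p * (y - m)^3) / sum(p * (y - m)^2)^1.5
}

theta <- c(5, 10, 20, 50)   # illustrative Poisson means
print(data.frame(
  theta         = theta,
  untransformed = sapply(theta, function(t) skew_poisson(identity, t)),
  sqrt          = sapply(theta, function(t) skew_poisson(sqrt, t)),
  power_2_3     = sapply(theta, function(t) skew_poisson(function(x) x^(2 / 3), t))
))
```

For moderate means, the `power_2_3` column should be noticeably closer to zero than the `sqrt` column, which is the sense in which the 2/3 power symmetrizes while the 1/2 power stabilizes the variance.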
Chi-square Random Variable:
In the set-up considered earlier, the symmetrizing transformation is
given by
$$g(\theta) = \int e^{-(2/3) \ln \theta}\, d\theta = 3\theta^{1/3}. \tag{4.5}$$
Thus the symmetrizing transformation for the chi-square random
variable is the well-known Wilson-Hilferty cube-root transformation.
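To see what the cube root buys in practice, the sketch below compares two classical normal approximations to the chi-square distribution function with the exact value from `pchisq()`. The standardizations used (Fisher's $\sqrt{2X} - \sqrt{2\nu - 1}$ and the Wilson-Hilferty mean $1 - 2/(9\nu)$ and variance $2/(9\nu)$ for $(X/\nu)^{1/3}$) are the standard textbook forms rather than formulas taken from these slides, and the choice $\nu = 10$ is illustrative.

```r
# Normal approximations to P(X <= x) for X ~ chi-square with nu degrees of freedom
p_fisher <- function(x, nu) pnorm(sqrt(2 * x) - sqrt(2 * nu - 1))
p_wilson_hilferty <- function(x, nu) {
  pnorm(((x / nu)^(1 / 3) - (1 - 2 / (9 * nu))) / sqrt(2 / (9 * nu)))
}

nu    <- 10                                      # illustrative degrees of freedom
x     <- qchisq(seq(0.01, 0.99, by = 0.01), nu)  # grid of chi-square quantiles
exact <- pchisq(x, nu)

# Maximum absolute error of each approximation over the grid
max_err <- c(fisher          = max(abs(p_fisher(x, nu) - exact)),
             wilson_hilferty = max(abs(p_wilson_hilferty(x, nu) - exact)))
print(max_err)
```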
20.
VST and ST for Coeﬃcient of Variation
These transformations have been investigated well in the literature.
Next we report on our recent investigations concerning the VST and ST
for the coefficient of variation, $\phi = \sigma/\mu$, where $\sigma$ is the
population standard deviation and $\mu$ is the population mean, which
is assumed to be positive.
It is used in many applied areas as an alternative to the standard
deviation.
Engineering applications: signal-to-noise ratio; Kordonsky and
Gertsbakh (1997).
Agricultural research: measure of homogeneity of an experimental field,
Taye and Njuho (2008); uniformity of a plant variety for seed
acceptability, Singh, Niane and Chaubey (2010).
Biometry: measure of reproducibility of observations; Butcher and
O'Brien (1991) and Quan and Shih (1996).
Economics: a measure of income diversity; Bedeian and Mossholder
(2000).
21.
VST and ST for Coeﬃcient of Variation
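Normal Samples:
Inference on $\phi$ can be handled through inference on $\theta = 1/\phi$,
based on the estimate $\hat\theta = \bar X / S$, where $\bar X$ denotes the
sample mean and $S^2$ the sample variance based on a random sample
$X_1, \ldots, X_n$ from $N(\mu, \sigma^2)$.
Since $\sqrt{n}\, T_n \sim t_\nu(\delta)$ with $T_n = \hat\theta$, i.e. a
non-central $t$ distribution (see Johnson and Kotz 1970) with $\nu = n - 1$
degrees of freedom and non-centrality parameter $\delta = \sqrt{n}\,\theta$,
the central moments of $\hat\theta$ [using the moments of the non-central $t$
from Hogben et al. (1961)] are listed below:
$$E(\hat\theta) = c_{11}\theta, \tag{5.1}$$
$$\mu_2(\hat\theta) = E(\hat\theta - E(\hat\theta))^2 = c_{22}\theta^2 + \frac{c_{20}}{n}, \tag{5.2}$$
$$\mu_3(\hat\theta) = E(\hat\theta - E(\hat\theta))^3 = \Bigl(c_{33}\theta^2 + \frac{c_{31}}{n}\Bigr)\theta, \tag{5.3}$$
$$\mu_4(\hat\theta) = E(\hat\theta - E(\hat\theta))^4 = c_{44}\theta^4 + \frac{c_{42}}{n}\theta^2 + \frac{c_{40}}{n^2}. \tag{5.4}$$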
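The moment formulas (5.1) to (5.4) can be checked against a small Monte Carlo experiment. In the R sketch below, the coefficients `c11` to `c44` are copied from the `fsym` listing in the appendix of this talk; `c40` is not defined in that listing, so the value used here, $3\nu^2/((\nu-2)(\nu-4))$, is an assumption based on the fourth moment of the central $t$ distribution. The function name `cv_moments`, the seed, and the simulation settings are likewise illustrative.

```r
# Central moments (5.1)-(5.4) of theta_hat = xbar / s for normal samples.
# c40 is an assumption (it is missing from the original listing).
cv_moments <- function(theta, n) {
  nu  <- n - 1
  c11 <- sqrt(nu / 2) * exp(lgamma((nu - 1) / 2) - lgamma(nu / 2))
  c20 <- nu / (nu - 2)
  c22 <- c20 - c11^2
  c31 <- 3 * c11 * c20 / (nu - 3)
  c33 <- c11 * (2 * c11^2 + nu * (7 - 2 * nu) / ((nu - 2) * (nu - 3)))
  c40 <- 3 * c20 * nu / (nu - 4)
  c42 <- 6 * c20 * (nu / (nu - 4) - (nu - 1) * c11^2 / (nu - 3))
  c44 <- c20 * nu / (nu - 4) - 2 * c20 * c11^2 * (5 - nu) / (nu - 3) - 3 * c11^4
  c(mean = c11 * theta,                                     # (5.1)
    mu2  = c22 * theta^2 + c20 / n,                         # (5.2)
    mu3  = (c33 * theta^2 + c31 / n) * theta,               # (5.3)
    mu4  = c44 * theta^4 + c42 * theta^2 / n + c40 / n^2)   # (5.4)
}

# Monte Carlo check with theta = mu / sigma = 10 (CV = 0.1) and n = 30
set.seed(1)
n <- 30; mu <- 10; sigma <- 1
x <- matrix(rnorm(20000 * n, mean = mu, sd = sigma), ncol = n)
theta_hat <- rowMeans(x) / apply(x, 1, sd)
d <- theta_hat - mean(theta_hat)
print(rbind(formula   = cv_moments(mu / sigma, n),
            simulated = c(mean(theta_hat), mean(d^2), mean(d^3), mean(d^4))))
```

The higher moments are estimated less precisely than the mean and variance, so some disagreement in the `mu3` and `mu4` columns is to be expected at this number of replications.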
23.
VST and ST for Coeﬃcient of Variation
The above moments can be substituted in the formulae for the functions
$f_1(\theta)$ and $f_2(\theta)$ in equations (3.2) and (3.3) in order to
obtain the symmetrizing transformation. The integral in equation (2.8) is
too complex to obtain explicitly and therefore we evaluate it numerically
for various values of $\theta$ and a given sample size $n$. The indefinite
integral $S(x)$ of a function $s(x)$ is computed as
$$\int s(x)\, dx = S(x) = \int_0^x s(u)\, du + S(0).$$
For ease of access, and to show the reader how easy it is to obtain this
transformation, the R source code used to compute these values is given
in the appendix.
24.
R-Codes for Computing the Symmetrizing Transformation
## Symmetrizing transformation
## Name of the function: fsym
## Arguments: x  is the argument (a value of theta) at which the
##               function is computed
##            ss is the sample size (the moments used require ss > 5)
## Output: The value of the symmetrizing function
fsym <- function(x, ss) {
  ## hfun: the integrand f1(theta)/f2(theta) = mu3/(mu4 - mu2^2)
  hfun <- function(phi, ss) {
    nu <- ss - 1; d <- sqrt(ss) * phi   # d is the non-centrality parameter
    ## lgamma avoids overflow of gamma() for large nu
    c11 <- sqrt(nu/2) * exp(lgamma((nu - 1)/2) - lgamma(nu/2))
    c20 <- nu/(nu - 2); c22 <- c20 - c11^2
    c31 <- 3 * c11 * c20/(nu - 3)
    c33 <- c11 * (2 * c11^2 + nu * (7 - 2 * nu)/((nu - 2) * (nu - 3)))
    ## c40 was not defined in the original listing; this is the fourth
    ## moment of the central t distribution
    c40 <- 3 * c20 * nu/(nu - 4)
    c42 <- 6 * c20 * (nu/(nu - 4) - (nu - 1) * c11^2/(nu - 3))
    c44 <- c20 * nu/(nu - 4) - 2 * c20 * c11^2 * (5 - nu)/(nu - 3) - 3 * c11^4
    mu2 <- (c22 * d^2 + c20)/ss
    mu3 <- (c31 * d + c33 * d^3)/ss^1.5
    mu4 <- (c40 + c42 * d^2 + c44 * d^4)/ss^2
    mu3/(mu4 - mu2^2)
  }
  ## f1f2: the derivative g'(x) = exp(-(2/3) * integral of f1/f2 over (0, x))
  f1f2 <- function(x, ss) {
    fval <- integrate(hfun, 0, x, ss = ss)$value
    exp(-2 * fval/3)
  }
  ## vectorized version of f1f2, as required by integrate()
  f1f2int <- function(x, ss) sapply(x, f1f2, ss = ss)
  ## g(x) = integral of g'(u) over (0, x), taking S(0) = 0
  integrate(f1f2int, 0, x, ss = ss)$value
}
26.
Symmetrizing transformation
[Figure 1: four panels, for n = 30, 50, 100 and 200, each plotting g(1/θ) against θ for θ between 0.00 and 0.30.]
Figure 1. Symmetrizing transformation values of the coefficient of variation (θ)
for varying values of sample size
27.
Comparison of ST and VST
Chaubey, Singh and Sen (2013) carried out a large-scale simulation
comparing the VST, ST and UT (untransformed statistic) in terms of
their normalizing quality. The VST, studied in Singh (1993), is
available in an explicit form:
$$g(\theta) = \sinh^{-1}(B\theta) = \ln\bigl(B\theta + \sqrt{1 + B^2\theta^2}\bigr), \tag{5.5}$$
where $B = \bigl(1 + \frac{3}{4\nu}\bigr)\sqrt{\frac{n}{2\nu}}$.
Based on 100,000 simulations, it was concluded that the VST
reduces the skewness compared with the untransformed statistic, but
the skewness is still significant even for sample sizes as large as 200.
On the other hand, the ST reduces skewness to a considerable degree
for sample sizes as small as 30.
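The variance stabilizing behaviour of (5.5) is easy to examine by simulation. The R sketch below is an illustration, not the simulation reported in the talk: it reads the constant as $B = (1 + 3/(4\nu))\sqrt{n/(2\nu)}$, which is an assumption about the garbled formula, and the function names, seed, number of replications and the three CV values are arbitrary choices.

```r
# Variance stabilizing transformation (5.5) applied to theta_hat = xbar / s
vst_cv <- function(theta_hat, n) {
  nu <- n - 1
  B  <- (1 + 3 / (4 * nu)) * sqrt(n / (2 * nu))   # constant B as read from (5.5)
  asinh(B * theta_hat)
}

# Simulate theta_hat for normal samples with mean 1/cv and unit standard deviation
sim_theta_hat <- function(cv, n, reps = 20000) {
  x <- matrix(rnorm(reps * n, mean = 1 / cv, sd = 1), ncol = n)
  rowMeans(x) / apply(x, 1, sd)
}

set.seed(2014)
n <- 30
res <- sapply(c(0.1, 0.2, 0.3), function(cv) {
  th <- sim_theta_hat(cv, n)
  c(cv = cv, var_raw = var(th), var_vst = var(vst_cv(th, n)))
})
print(t(res))
```

The `var_raw` column should change several-fold across the three CV values, while `var_vst` should stay roughly constant; this checks variance stabilization only and says nothing about skewness.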
28.
Comparison of ST and VST
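For simulating the probability distribution of $g(\hat\theta)$ we consider
the standardized statistic
$$Z_g = \frac{g(\hat\theta) - E(g(\hat\theta))}{\sqrt{\mathrm{var}(g(\hat\theta))}},$$
where $g(\cdot)$ is any of the functions associated with the symmetrizing
transformation, the variance stabilizing transformation and no
transformation.
The expected value $E(g(\hat\theta))$, using the expansion (2.1) with
$T_n = \hat\theta$, is obtained as
$$E(g(T_n)) = g(\theta) + g'(\theta)\,\xi_1(\theta) + \frac{1}{2}\, g''(\theta)\bigl(\mu_2(\theta) + \xi_1^2(\theta)\bigr) = g(\theta) + g'(\theta)\Bigl[\xi_1(\theta) + \frac{R}{2}\bigl(\mu_2(\theta) + \xi_1^2(\theta)\bigr)\Bigr]. \tag{5.6}$$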
29.
Comparison of ST and VST
Note that, for computation of the above expectation for the ST,
$R = g''(\theta)/g'(\theta)$ is substituted from (2.7) and $g'$ is
numerically obtained from
$$g'(\theta) = \exp\Bigl\{-\frac{2}{3} \int_0^\theta \frac{f_1(u)}{f_2(u)}\, du\Bigr\}. \tag{5.7}$$
The simulated probabilities are given in Table 1. It was
noted that for sample sizes less than 50, the ST does not provide a
significant improvement over the VST. Hence, an adjustment for small
sample sizes was provided, as described next.
30.
Table 1. Probability distribution (P(Z ≤ zα))* of standardized transforms of CV

                                 α
CV    n     Transformation†   0.005   0.025   0.05    0.5     0.95    0.975   0.995
0.1   30    ST                0.003   0.021   0.046   0.514   0.939   0.965   0.990
            VST               0.002   0.016   0.039   0.517   0.944   0.969   0.991
            UT                0.000   0.006   0.025   0.547   0.937   0.961   0.985
      50    ST                0.004   0.023   0.049   0.504   0.943   0.970   0.993
            VST               0.002   0.018   0.042   0.511   0.945   0.970   0.992
            UT                0.001   0.010   0.031   0.533   0.939   0.964   0.987
      100   ST                0.005   0.025   0.051   0.502   0.946   0.972   0.994
            VST               0.003   0.020   0.046   0.509   0.946   0.971   0.993
            UT                0.001   0.015   0.038   0.523   0.941   0.966   0.990
0.2   30    ST                0.003   0.021   0.045   0.511   0.939   0.966   0.990
            VST               0.002   0.017   0.039   0.514   0.943   0.969   0.991
            UT                0.000   0.007   0.025   0.543   0.937   0.961   0.985
      50    ST                0.004   0.023   0.048   0.510   0.943   0.970   0.993
            VST               0.002   0.018   0.042   0.516   0.945   0.970   0.992
            UT                0.001   0.010   0.031   0.536   0.939   0.963   0.987
      100   ST                0.005   0.024   0.049   0.501   0.947   0.973   0.994
            VST               0.003   0.020   0.044   0.508   0.947   0.971   0.993
            UT                0.002   0.015   0.037   0.522   0.942   0.966   0.989
0.3   30    ST                0.003   0.022   0.047   0.511   0.941   0.967   0.991
            VST               0.002   0.017   0.040   0.516   0.945   0.969   0.991
            UT                0.000   0.007   0.026   0.543   0.938   0.962   0.985
      50    ST                0.004   0.025   0.050   0.505   0.943   0.969   0.993
            VST               0.002   0.020   0.043   0.512   0.944   0.969   0.992
            UT                0.001   0.012   0.033   0.532   0.938   0.962   0.987
      100   ST                0.005   0.025   0.050   0.503   0.947   0.973   0.994
            VST               0.003   0.021   0.045   0.510   0.946   0.971   0.993
            UT                0.001   0.015   0.038   0.524   0.942   0.966   0.990

† ST: symmetrizing transformation. VST: variance stabilizing transformation. UT: untransformed.
* zα is such that for Z ∼ N(0, 1), P(Z ≤ zα) = α.
31.
Small Sample Adjustment
For adjusting the normal approximation provided by the ST, the
technique suggested in Mudholkar and Chaubey (1975), which uses a
mixture approximation, was utilized.
Let $Z_{ST} = (g(T_n) - E(g(T_n)))/\sqrt{\mu_{2g}}$ denote the standardized
version of the ST. The technique models the distribution of $Z_{ST}$ as a
mixture, with weights $\lambda$ and $1 - \lambda$, of the $N(0, 1)$
distribution and the distribution of the standardized chi-square variable
$$\frac{\chi^2_\nu - \nu}{\sqrt{2\nu}}.$$
The values of $\nu$ and $\lambda$ are obtained by equating the simulated
skewness and kurtosis, denoted by $\beta_1(ST)$ and $\beta_2(ST)$
respectively, i.e.
$$\nu = \frac{8}{\beta_1(ST)} \quad\text{and}\quad \lambda = 1 - \frac{2}{3}\, \frac{\beta_2(ST) - 3}{\beta_1(ST)}. \tag{5.8}$$
32.
Small Sample Adjustment
The lower tail probabilities for $Z_{ST}$ can now be approximated as:
$$P(Z_{ST} \le x) = \lambda\,\Phi(x) + (1 - \lambda)\, P\bigl(\chi^2_\nu \le \nu + x\sqrt{2\nu}\bigr). \tag{5.9}$$
The confidence intervals are obtained using the following approximate
representation of the quantiles of a mixture distribution in terms of
those of its components.
Let $z_\alpha$ and $z^*_\alpha$ be the $\alpha$ quantiles of the standardized
distributions $N(0, 1)$ and $(\chi^2_\nu - \nu)/\sqrt{2\nu}$ respectively.
Then the $\alpha$ quantile $x_\alpha$ of the mixture distribution is
approximated as:
$$x_\alpha = \lambda z_\alpha + (1 - \lambda) z^*_\alpha, \tag{5.10}$$
where $z^*_\alpha$ is given in terms of the $\alpha$ quantile
$\chi^2_{\nu,\alpha}$ as
$$z^*_\alpha = \frac{\chi^2_{\nu,\alpha} - \nu}{\sqrt{2\nu}}. \tag{5.11}$$
33.
Small Sample Adjustment
We have used simulated values of $\beta_1$ and $\beta_2$ for the ST to
develop polynomial approximations in powers of $\phi$ and $1/n$. Here we
used multiple linear regression, including up to quadratic terms
as well as their interactions, on a grid of 105 combinations of $\phi$ and
$n$ values, which resulted in the following expressions:
$$\beta_{1ST} \approx -0.06694 + 8.51908/n + 15.42537/n^2 + (0.2456 - 14.69333/n + 155.42357/n^2)\,\phi - (0.25299 - 9.73724/n + 162.48528/n^2)\,\phi^2 \tag{5.12}$$
$$\beta_{2ST} \approx 3.02586 - 4.67269/n + 209.31385/n^2 + (0.16502 - 5.7324/n + 4.18595/n^2)\,\phi - (0.12802 - 5.69879/n + 93.2359/n^2)\,\phi^2 \tag{5.13}$$
These models were judged to be adequate on the basis of their squared
multiple correlation coefficients, which were 99.6% and 98%, respectively.
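The fitted expressions (5.12) and (5.13) translate directly into code. The R sketch below simply transcribes the coefficients from the two equations; the function names `beta1_st` and `beta2_st` and the example arguments are illustrative.

```r
# Fitted approximations (5.12) and (5.13) to the skewness and kurtosis
# measures of the symmetrizing transform, as functions of phi (the CV) and n
beta1_st <- function(phi, n) {
  -0.06694 + 8.51908 / n + 15.42537 / n^2 +
    (0.2456 - 14.69333 / n + 155.42357 / n^2) * phi -
    (0.25299 - 9.73724 / n + 162.48528 / n^2) * phi^2
}
beta2_st <- function(phi, n) {
  3.02586 - 4.67269 / n + 209.31385 / n^2 +
    (0.16502 - 5.7324 / n + 4.18595 / n^2) * phi -
    (0.12802 - 5.69879 / n + 93.2359 / n^2) * phi^2
}

# Example: modeled values for a CV of 0.1 and a sample of size 30
print(c(beta1 = beta1_st(0.1, 30), beta2 = beta2_st(0.1, 30)))
```

Since these are regression fits, they should only be trusted within the range of $\phi$ and $n$ covered by the simulation grid, which the slides do not list in full.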
34.
Small Sample Adjustment
A comparison of the probabilities obtained by the mixture approximation,
using the simulated as well as the modeled values of skewness and
kurtosis, with the corresponding probabilities obtained by simulation
(based on 100,000 runs) is presented in Table 2 for θ = 0.1, 0.2, 0.3
and n = 20, 30, 40, 50.
It may be seen from this table that the mixture approximation based
on modeled skewness (see Eq. (5.12)) and kurtosis (see Eq. (5.13))
gives values reasonably close to those based on their simulated values,
and in turn, those are close to the exact probabilities obtained by
simulation.
37.
Inverse Gaussian Distribution
The inverse Gaussian (IG) distribution is regarded as a natural choice
for modeling non-negative data in many situations; see Chhikara and
Folks (1974).
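The pdf of an IG distribution is given by
$$f(x; \mu, \lambda) = \sqrt{\frac{\lambda}{2\pi x^3}}\, \exp\Bigl\{-\frac{\lambda (x - \mu)^2}{2\mu^2 x}\Bigr\},$$
where $x, \lambda, \mu > 0$.
For this distribution
$$E(X) = \mu, \quad Var(X) = \mu^3/\lambda, \quad CV(X) = \sqrt{\mu/\lambda},$$
and therefore the ratio $\varphi = \mu/\lambda$, being the squared CV,
presents an alternative way to parametrize the distribution.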
38.
Inverse Gaussian Distribution
Based on a random sample $X_1, X_2, \ldots, X_n$ from $IG(\mu, \lambda)$,
$\varphi$ may be of interest for inference on the CV. Its unbiased estimator
is given by
$$\hat\varphi = \bar X U,$$
where
$$U = \frac{1}{n - 1} \sum_{i=1}^{n} \Bigl(\frac{1}{X_i} - \frac{1}{\bar X}\Bigr).$$
It is known that $\bar X$ and $U$ are independent and
$$\bar X \sim IG(\mu, n\lambda) \quad\text{and}\quad (n - 1)\lambda U \sim \chi^2_{(n-1)}.$$
These properties may be used to set up the VST and ST in this
situation.
The details will be communicated in a forthcoming publication.
39.
An Application
We compare the 95% confidence intervals for the CVs using data on
heights (cm) of n = 30 wheat plants of two varieties (Singh et al.
2010).
The sample values were:
Variety 1 (Entry 4): $\bar x$ = 91.7 cm, sd = 6.25 cm, CV = 0.06814.
Variety 2 (Entry 5): $\bar x$ = 115.03 cm, sd = 2.63 cm, CV = 0.0229.
For a general transformation, we have the standardized random variate
$$Z_g = \frac{g(\hat\phi) - E(g(\hat\phi))}{\sqrt{Var(g(\hat\phi))}}.$$
The $100(1 - \alpha)\%$ confidence limits are the solutions
$(\phi_L, \phi_U)$ of the following equations:
$$\frac{g(\hat\phi) - E(g(\hat\phi))}{\sqrt{Var(g(\hat\phi))}} = x_{\alpha/2},\ x_{1-\alpha/2}.$$
40.
An Application
$x_{\alpha/2}$ and $x_{1-\alpha/2}$ are obtained using the distribution of
$Z_g$ as described earlier:
$$P\Bigl(x_{\alpha/2} \le \frac{g(\hat\phi) - E(g(\hat\phi))}{\sqrt{Var(g(\hat\phi))}} \le x_{1-\alpha/2}\Bigr) = 1 - \alpha.$$
Note that the above equations involve the parameter $\phi$ (and hence
$\theta$) through non-linear functions in the expected values and variances
of all three transformations, except for the variance of the variance
stabilizing transformation; the solutions therefore need to be obtained
numerically.
In our application the uniroot function available in R was
used. For the variance stabilizing transformation and the no
transformation cases, $x_\alpha$ is the $\alpha$-quantile of the standard
normal distribution. For the symmetrizing transformation, the skewness
($\beta_1$) and kurtosis ($\beta_2$) were modeled using the equations given
in the preceding section. The constants required for the approximations are
given in Table 3.
41.
An Application
Table 3. Constants for the approximation.

Variety   n    CV         β1       β2       λ        ν
Entry 4   30   0.068157   0.2288   3.1010   0.7056   34.97
Entry 5   30   0.022864   0.2325   3.1022   0.7070   34.41

The values of $x_\alpha$ from equation (5.10) are $x_{0.025} = -1.8907$ and
$x_{0.975} = 2.0235$. The resulting 95% confidence intervals for the CV under
the various transformations are given in Table 4.
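The mixture constants and quantiles follow mechanically from (5.8), (5.10) and (5.11). The R sketch below is an illustration rather than the authors' code: it starts from the Entry 4 values of β1 and β2 in Table 3 and uses base R's `qnorm()`, `qchisq()` and `pchisq()`; the function names are illustrative, and `mixture_cdf` implements (5.9) for completeness.

```r
# Mixture parameters (5.8) from the skewness and kurtosis measures
mixture_par <- function(beta1, beta2) {
  list(nu = 8 / beta1, lambda = 1 - (2 / 3) * (beta2 - 3) / beta1)
}

# Approximate alpha-quantile (5.10), using the standardized chi-square
# quantile (5.11); qchisq() accepts non-integer degrees of freedom
mixture_quantile <- function(alpha, nu, lambda) {
  z_norm <- qnorm(alpha)
  z_chi  <- (qchisq(alpha, df = nu) - nu) / sqrt(2 * nu)
  lambda * z_norm + (1 - lambda) * z_chi
}

# Approximate lower-tail probability (5.9)
mixture_cdf <- function(x, nu, lambda) {
  lambda * pnorm(x) + (1 - lambda) * pchisq(nu + x * sqrt(2 * nu), df = nu)
}

par4 <- mixture_par(beta1 = 0.2288, beta2 = 3.1010)   # Entry 4, Table 3
x_lo <- mixture_quantile(0.025, par4$nu, par4$lambda)
x_hi <- mixture_quantile(0.975, par4$nu, par4$lambda)
print(c(nu = par4$nu, lambda = par4$lambda, x_lo = x_lo, x_hi = x_hi))
```

Up to rounding of the inputs, the printed values should be close to the ν, λ and quantiles reported for Entry 4. Note that (5.10) is itself an approximation, so `mixture_cdf(x_lo, ...)` need not return exactly 0.025.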
42.
An Application
Table 4. The 95% confidence intervals of the CV.

                        Entry 4                       Entry 5
Transformation          Lower     Upper     Width     Lower     Upper     Width
Symmetrizing            0.05425   0.09051   0.03636   0.01821   0.03031   0.01210
Variance stabilizing    0.05317   0.09037   0.03720   0.01785   0.03028   0.01242
Untransformed           0.04936   0.08704   0.03767   0.01657   0.02916   0.01259
Vangel's Approx.        0.05409   0.09106   0.03697   0.01820   0.03072   0.01252

In this example, we note that the symmetrizing transformation provides
narrower confidence intervals than the others.
43.
References
Anscombe, F.J. (1948). The transformation of Poisson, binomial and
negative-binomial data. Biometrika 35, 246-254.
Bartlett, M.S. (1947). The use of transformations. Biometrics 3(1),
39-52.
Bedeian, A.G. and Mossholder, K.W. (2000). On the use of the
coeﬃcient of variation as a measure of diversity. Organizational
Research Methods 3, 285-297.
Butcher, J.M. and O’Brien, C. (1991). The reproducibility of
biometry and keratometry measurements. Eye 5, 708-711.
Chaubey, Y.P. and Mudholkar, G.S. (1983). On the symmetrizing
transformations of random variables. Preprint, Concordia University,
Montreal. Available at http://spectrum.library.concordia.ca/973582/
45.
References
Chaubey, Y.P. and Mudholkar, G.S. (1984). On the almost symmetry
of Fisher’s Z. Metron 42(I/II), 165–169.
Chaubey, Y.P., Singh, M. and Sen, D. (2013). On symmetrizing
transformation of the sample coefficient of variation from a normal
population. Communications in Statistics - Simulation and
Computation 42, 2118-2134.
Chhikara, R.S. and Folks, J.L. (1989). The inverse Gaussian
distribution. Marcel Dekker, New York.
Fisher, R.A. (1915). Frequency distribution of the values of
correlation coefficient from an indefinitely large population.
Biometrika 10, 507-521.
Fisher, R.A. (1922). On the interpretation of χ2 from contingency
tables and calculation of ρ. J. Roy. Statist. Soc. Ser. A, 85, 87–94.
46.
References
Hogben, D., Pinkham, R.S. and Wilk, M.B. (1961). The moments of
the non-central t-distribution. Biometrika 9, 119–127.
Hotelling, H. (1953). New light on the correlation coefficient and its
transforms. J. Roy. Statist. Soc. Ser. B 15, 193-224.
Jensen, D.R. and Solomon, H. (1972). A Gaussian approximation to
the distribution of a quadratic form in normal variables. J. Amer.
Statist. Assoc. 67, 898-902.
Johnson, N.L. and Kotz, S. (1970). Distributions in statistics:
continuous univariate distributions -2, (Chapter 27), New York: John
Wiley & Sons.
Kordonsky, K.B. and Gertsbakh, I. (1997). Multiple Time Scales and
the Lifetime Coeﬃcient of Variation: Engineering Applications.
Lifetime Data Analysis 2, 139-156.
47.
References
Mudholkar, G.S. and Chaubey, Y.P. (1975). Use of logistic
distribution for approximating probabilities and percentiles of
Student’s distribution. Journal of Statistical Research 9, 1-9.
Mudholkar, G.S. and Trivedi, M.C. (1980). A normal approximation
for the distribution of the likelihood ratio statistic in multivariate
analysis of variance. Biometrika 67, 485-488.
Mudholkar, G.S. and Trivedi, M.C. (1981a). A Gaussian
approximation to the distribution of the sample variance for
nonnormal populations. Journal of the American Statistical
Association 76, 479-485.
Mudholkar, G.S. and Trivedi, M.C. (1981b). A normal approximation
for the multivariate likelihood ratio statistics. In Statistical
Distributions in Scientific Work (C. Taillie, C.P. Patil and A.A.
Baldessari, Eds.). Dordrecht: Reidel, Vol. 5, 219-230.
48.
References
Quan, H. and Shih, J. (1996). Assessing reproducibility by the
within-subject coefficient of variation with random effects models.
Biometrics 52, 1195-1203.
Rao, C.R. (1973). Linear Statistical Inference and Its Applications,
New York: John Wiley.
Singh, M. (1993). Behavior of sample coefficient of variation drawn
from several distributions. Sankhyā 55, 65-76.
Singh, M., Niane, A.A., and Chaubey, Y.P. (2010). Evaluating
uniformity of plant varieties: sample size for inference on coeﬃcient
of variation. Journal of Statistics and Applications 5, 1–13.
Sankaran, M.S. (1959). On the noncentral χ2 distribution.
Biometrika 46, 235-237.
49.
References
Taye, G. and Njuho, P. (2008). Monitoring Field Variability Using
Confidence Interval for Coefficient of Variation. Communications in
Statistics - Theory and Methods 37, 831–846.
Wilson, E.B. and Hilferty, M.M. (1931). The distribution of
Chi-square. Proc. Nat. Acad. Sc. 17, 684-688.