This document discusses predictive mean matching (PMM) imputation in survey sampling. It begins with an outline and an overview of the basic setup, assumptions, and the PMM imputation method. It then presents three main theorems: 1) the asymptotic normality of the PMM estimator when the regression parameter β* is known, 2) the asymptotic normality when β* is estimated, and 3) the asymptotic properties of nearest neighbor imputation. It also discusses variance estimation for the PMM estimator using replication methods such as the bootstrap or jackknife, and closes with a simulation study.
1. Predictive mean matching imputation in survey sampling
Shu Yang and Jae Kwang Kim
Iowa State University
June 14, 2017
2. Outline
1 Basic Setup
2 Main result
3 Nearest neighbor imputation
4 Variance estimation
5 Simulation study
Yang & Kim (ISU) Predictive Mean Matching Imputation June 14, 2017 2 / 31
3. Predictive mean matching: Basic Setup
FN = {(xi, yi, δi), i = 1, 2, ..., N}: finite population, where
δi = 1 if yi is observed and δi = 0 otherwise.
Note that the δi are defined throughout the population. This is referred
to as the reverse approach (Fay, 1992; Shao and Steel, 1999; Kim,
Navarro, and Fuller, 2006; Berg, Kim, and Skinner, 2016).
Parameter of interest: µ = N⁻¹ Σ_{i=1}^N yi.
4. Basic Setup (Cont’d)
Let A be the index set of the probability sample selected from FN.
The first-order inclusion probabilities πi are known in the sample.
We observe (xi, δi, δi yi) for i ∈ A.
The imputation estimator of µ is
µ̂ = N⁻¹ Σ_{i∈A} πi⁻¹ {δi yi + (1 − δi) y*i},
where y*i is an imputed value of yi for unit i with δi = 0.
5. Assumptions
We assume
E(yi | xi) = m(xi; β*),
where m(·) is a function of x known up to β*.
We assume MAR (missing at random) in the sense that
P(δ = 1 | x, y) = P(δ = 1 | x).
6. Regression Imputation
Two-step imputation:
1 Obtain a consistent estimator of β*:
Σ_{i∈A} πi⁻¹ δi {yi − m(xi; β)} g(xi; β) = 0
for some g(xi; β). That is, find β̂ that satisfies β̂ = β* + op(1).
2 Compute ŷi = m(xi; β̂) and use y*i = ŷi (deterministic imputation) or
y*i = ŷi + ê*i (stochastic imputation).
More rigorous theory can be found in Shao and Steel (1999) and Kim and
Rao (2009).
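As a concrete sketch, the two steps can be coded for the linear model m(x; β) = xᵀβ with g(x; β) = x (illustrative choices, not the only options); step 1 then reduces to a π-weighted least squares fit on the respondents.

```python
import numpy as np

def regression_impute(x, y, delta, pi, stochastic=False, rng=None):
    """Two-step regression imputation, sketched for m(x; b) = x @ b with
    g(x; b) = x.  x includes an intercept column; entries of y with
    delta == 0 may hold any placeholder value."""
    w = delta / pi                          # pi-weights, respondents only
    yr = np.where(delta == 1, y, 0.0)       # zero out missing y (their weight is 0)
    # Step 1: solve sum_{i in A} pi_i^{-1} delta_i {y_i - x_i b} x_i = 0
    beta_hat = np.linalg.solve((x * w[:, None]).T @ x, (x * w[:, None]).T @ yr)
    y_hat = x @ beta_hat                    # step 2: predicted values
    y_star = y_hat.copy()
    if stochastic:                          # add a donated respondent residual
        rng = np.random.default_rng() if rng is None else rng
        resid = (yr - y_hat)[delta == 1]
        y_star = y_hat + rng.choice(resid, size=len(y))
    return np.where(delta == 1, y, y_star), beta_hat
```

The deterministic branch returns ŷi for the nonrespondents; the stochastic branch adds a respondent residual drawn at random, one simple choice of ê*i.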
7. Predictive Mean Matching (PMM) Imputation
1 Obtain β̂ satisfying β̂ = β* + op(1).
2 For each unit i with δi = 0, obtain a predicted value ŷi = m(xi; β̂).
3 Find the nearest neighbor of unit i among the respondents, i.e., the
respondent j with the minimum distance between yj and ŷi. Let i(1) be
the index of the nearest neighbor of unit i, which satisfies
d(yi(1), ŷi) ≤ d(yj, ŷi)
for any j ∈ AR, where d(yi, yj) = |yi − yj| and AR is the set of
respondents.
4 Use y*i = yi(1) for δi = 0.
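The matching steps 3-4 can be sketched as follows (a minimal Python sketch, assuming the |yj − ŷi| donor rule defined above; the function name is illustrative):

```python
import numpy as np

def pmm_match(y, y_hat, delta):
    """Steps 3-4: for each nonrespondent i, donate the observed y of the
    respondent j minimizing |y_j - y_hat_i|, i.e. y*_i = y_{i(1)}."""
    resp = np.flatnonzero(delta == 1)       # index set A_R of respondents
    y_star = np.array(y, dtype=float)
    for i in np.flatnonzero(delta == 0):
        j = resp[np.argmin(np.abs(y[resp] - y_hat[i]))]  # index i(1)
        y_star[i] = y[j]
    return y_star
```

The PMM estimator then follows by applying N⁻¹ Σ_{i∈A} πi⁻¹ to the completed vector.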
8. PMM Imputation
PMM estimator:
µ̂pmm = N⁻¹ Σ_{i∈A} πi⁻¹ {δi yi + (1 − δi) yi(1)}. (1)
It is a hot-deck imputation (the imputed values are real observations).
Because it uses model information about E(y | x), it can be efficient
if the model is good.
PMM is a popular imputation method, but its asymptotic properties have
not been investigated rigorously; asymptotic properties and variance
estimation are open research problems.
9. Remark
Express the PMM estimator (1) as a function of β̂:
µ̂pmm(β) = N⁻¹ Σ_{i∈A} πi⁻¹ {δi yi + (1 − δi) yi(1)}
        = N⁻¹ [Σ_{i∈A} πi⁻¹ δi yi + Σ_{j∈A} πj⁻¹ (1 − δj) Σ_{i∈A} δi dij yi]
        = N⁻¹ Σ_{i∈A} πi⁻¹ δi (1 + κβ,i) yi,
where
κβ,i = Σ_{j∈A} πi πj⁻¹ (1 − δj) dij (2)
and dij = 1 if yj(1) = yi and dij = 0 otherwise.
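The weights κβ,i can be computed directly from the matching result; a minimal sketch (assuming the |yj − ŷi| donor rule from the PMM slide; names are illustrative):

```python
import numpy as np

def kappa_weights(y, y_hat, delta, pi):
    """kappa_{beta,i} = sum_{j in A} pi_i pi_j^{-1} (1 - delta_j) d_ij,
    where d_ij = 1 when respondent i is the donor for nonrespondent j."""
    resp = np.flatnonzero(delta == 1)
    kappa = np.zeros(len(y))
    for j in np.flatnonzero(delta == 0):
        i = resp[np.argmin(np.abs(y[resp] - y_hat[j]))]  # donor i for unit j
        kappa[i] += pi[i] / pi[j]
    return kappa
```

A useful unit check: Σ_{i∈A} πi⁻¹ δi (1 + κβ,i) yi must reproduce the PMM estimator exactly, term for term.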
10. Remark (Cont’d)
If regression imputation were used, the imputation estimator would be a
smooth function of β, and the standard linearization method could be
used to investigate the asymptotic properties of µ̂reg(β̂).
Under PMM imputation, µ̂pmm(β) is not a smooth function of β, so the
standard linearization method cannot be applied.
11. 2. Main result
Overview
Theorem 1: We first establish the asymptotic normality of
n^{1/2}{µ̂pmm(β*) − µ} when β* is known.
Theorem 2: Next, we establish the asymptotic normality of
n^{1/2}{µ̂pmm(β̂) − µ}.
Theorem 3: In addition, we discuss asymptotic properties of the nearest
neighbor imputation estimator.
12. Basic Idea
Express
µ̂pmm(β) − µ = Dn(β) + Bn(β), (3)
where
Dn(β) = N⁻¹ Σ_{i∈A} πi⁻¹ [m(xi; β) + δi (1 + κβ,i){yi − m(xi; β)}] − N⁻¹ Σ_{i=1}^N yi
and
Bn(β) = N⁻¹ Σ_{i∈A} πi⁻¹ (1 − δi){m(xi(1); β) − m(xi; β)}.
The difference m(xi(1); β) − m(xi; β) accounts for the matching
discrepancy, and Bn(β) contributes to the asymptotic bias of the
matching estimator.
13. Asymptotic bias (Abadie and Imbens, 2006)
If the matching for the nearest neighbor is based on x, then
d(xi(1), xi) = Op(n^{−1/p}),
where p is the dimension of x. Thus, for the classical nearest
neighbor imputation using x, the asymptotic bias
Bn = N⁻¹ Σ_{i∈A} πi⁻¹ (1 − δi){m(xi(1)) − m(xi)}
satisfies Bn = Op(n^{−1/p}), which is not negligible for p ≥ 2.
For PMM, the scalar function m(x) is used to find the nearest
neighbor. Thus, Bn = Op(n⁻¹) and the PMM estimator is asymptotically
unbiased.
14. Theorem 1
Suppose that m(x) = E(y | x) = m(x; β*) and σ²(x) = var(y | x). Under
the regularity conditions (skipped), we have
n^{1/2}{µ̂pmm(β*) − µ} → N(0, V1)
in distribution as n → ∞, where
V1 = V^m + V^e (4)
with
V^m = (n/N²) E[V(Σ_{i∈A} πi⁻¹ m(xi) | FN)],
V^e = E[(n/N²) Σ_{i∈A} {πi⁻¹ δi (1 + κβ*,i) − 1}² σ²(xi)],
and κβ,i is defined in (2).
15. Theorem 2
Under some additional regularity conditions, we have
n^{1/2}{µ̂pmm(β̂) − µ} → N(0, V2)
in distribution as n → ∞, with
V2 = V1 − γ2ᵀ Vs⁻¹ γ2 + γ1ᵀ τ⁻¹ Vs τ⁻¹ γ1, (5)
where V1 is defined in (4), γ1 = E{ṁ(x; β)},
γ2 = plim N⁻¹ Σ_{i=1}^N {(n/N) πi⁻¹ (1 + κβ*,i) − 1} δi ṁ(xi; β*),
and ṁ(x; β) = ∂m(x; β)/∂β.
16. Remark
1 The second term in (5),
V2 − V1 = −γ2ᵀ Vs⁻¹ γ2 + γ1ᵀ τ⁻¹ Vs τ⁻¹ γ1,
reflects the effect of using β̂ instead of β* in the PMM imputation.
2 If n/N = o(1) and m(xi; β) = β0 + β1 xi with scalar x, then γ1 = γ2.
Furthermore, under SRS with g(x; β) = ṁ(x; β)/σ²(x) in the estimating
function for β̂, we have Vs⁻¹ = τ⁻¹ Vs τ⁻¹. In this case, V2 = V1.
3 In general, V2 differs from V1, and the effect of the sampling error
of β̂ should be reflected in the variance estimation of µ̂pmm(β̂).
17. 3. Nearest neighbor imputation
Instead of a distance function on y, one may use a distance function
on x. The nearest neighbor of unit i, denoted by i(1), is then
determined to satisfy
d(xi(1), xi) ≤ d(xj, xi)
for j ∈ AR, where d(xi, xj) is the distance between xi and xj.
The NNI estimator can be written as
µ̂NNI = N⁻¹ Σ_{i∈A} πi⁻¹ {δi yi + (1 − δi) yi(1)}.
The only difference from PMM is the matching variable used to
identify i(1).
18. Asymptotic properties
We can obtain a decomposition similar to (3):
µ̂NNI − µ = Dn + Bn,
where
Dn = N⁻¹ Σ_{i∈A} πi⁻¹ [m(xi) + δi (1 + κi){yi − m(xi)}] − N⁻¹ Σ_{i=1}^N yi
and
Bn = N⁻¹ Σ_{i∈A} πi⁻¹ (1 − δi){m(xi(1)) − m(xi)}.
The asymptotic bias Bn = Op(n^{−1/p}) is not negligible for p ≥ 2.
19. Bias-corrected NNI estimator
Let m̂(x) be a (nonparametric) estimator of m(x) = E(y | x).
We can estimate Bn by
B̂n = N⁻¹ Σ_{i∈A} πi⁻¹ (1 − δi){m̂(xi(1)) − m̂(xi)}.
A bias-corrected NNI estimator of µ is
µ̂NNI,bc = N⁻¹ Σ_{i∈A} πi⁻¹ {δi yi + (1 − δi) y*i},
where y*i = m̂(xi) + yi(1) − m̂(xi(1)).
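A minimal sketch of the bias-corrected estimator, with the mean estimator m̂ passed in as any fitted function (its form, and Euclidean distance for the x-matching, are assumptions of this sketch):

```python
import numpy as np

def nni_bias_corrected(x, y, delta, pi, N, m_hat):
    """Bias-corrected NNI: match on x, then impute
    y*_i = m_hat(x_i) + y_{i(1)} - m_hat(x_{i(1)})."""
    resp = np.flatnonzero(delta == 1)
    y_star = np.array(y, dtype=float)
    for i in np.flatnonzero(delta == 0):
        d = np.linalg.norm(x[resp] - x[i], axis=1)    # Euclidean d(x_j, x_i)
        j = resp[np.argmin(d)]                        # nearest neighbor i(1)
        y_star[i] = m_hat(x[i]) + y[j] - m_hat(x[j])  # corrected donor value
    return np.sum(y_star / pi) / N
```

When m̂ is exact and y has no noise, the correction recovers y*i = m(xi) exactly, which is the unit check below.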
20. Theorem 3
Under some regularity conditions, the bias-corrected NNI estimator is
asymptotically equivalent to the PMM estimator with known β*. That is,
n^{1/2}{µ̂NNI,bc − µ̂pmm(β*)} = op(1).
Thus, we have
n^{1/2}(µ̂NNI,bc − µ) → N(0, V1)
in distribution as n → ∞, where V1 is defined in (4).
21. 4. Replication variance estimation
If there is no nonresponse, we can use
V̂rep(µ̂) = Σ_{k=1}^L ck (µ̂^(k) − µ̂)²
as a variance estimator of µ̂ = Σ_{i∈A} wi yi, where µ̂^(k) = Σ_{i∈A} wi^(k) yi.
For example, in the delete-1 jackknife method under SRS, we have
L = n, ck = (n − 1)/n, and
wi^(k) = (n − 1)⁻¹ if i ≠ k and wi^(k) = 0 if i = k.
We are interested in estimating the variance of the PMM imputation
estimator.
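For the sample mean under SRS with full response, the delete-1 jackknife above fits in a few lines; it reproduces s²/n algebraically, which makes a convenient check.

```python
import numpy as np

def jackknife_variance(y):
    """Delete-1 jackknife for mu_hat = mean(y) under SRS:
    L = n, c_k = (n - 1)/n, w_i^(k) = 1/(n - 1) for i != k and 0 for i = k."""
    n = len(y)
    mu_hat = y.mean()
    mu_rep = np.array([np.delete(y, k).mean() for k in range(n)])  # mu_hat^(k)
    return (n - 1) / n * np.sum((mu_rep - mu_hat) ** 2)
```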
22. 4. Replication variance estimation
Approach 1
Idea: apply the bootstrap (or jackknife) method and repeat the same
imputation method in each replicate.
This approach provides consistent variance estimation for regression
imputation:
µ̂reg,I^(k) = Σ_{i∈A} wi^(k) {δi yi + (1 − δi) m(xi; β̂^(k))}.
However, for stochastic regression imputation this approach does not
work, because it does not capture the random imputation part correctly.
23. 4. Replication variance estimation
Note that
µ̂reg,I2 = Σ_{i∈A} wi {δi yi + (1 − δi)(ŷi + ê*i)}
        = Σ_{i∈A} wi {ŷi + δi (1 + κi) êi},
where κi is defined in (2).
Thus, we can write
µ̂reg,I2(β) = Σ_{i∈A} wi {m(xi; β) + δi (1 + κi)(yi − m(xi; β))}
           = Σ_{i∈A} wi f(xi, yi, δi, κi; β).
24. 4. Replication variance estimation
Approach 2
Idea: if the imputation estimator can be expressed as
µ̂I = Σ_{i∈A} wi f(xi, yi, δi, κi; β̂)
for some function f of known form, we can use
µ̂I^(k) = Σ_{i∈A} wi^(k) f(xi, yi, δi, κi; β̂^(k))
to construct a replication variance estimator
V̂rep = Σ_{k=1}^L ck (µ̂I^(k) − µ̂I)².
25. Variance estimation for PMM estimator
1 Obtain the k-th replicate of β̂, denoted by β̂^(k), by solving
Σ_{i∈A} wi^(k) δi {yi − m(xi; β)} g(xi; β) = 0
for β.
2 Calculate the k-th replicate of µ̂pmm by
µ̂pmm^(k) = Σ_{i∈A} wi^(k) [m(xi; β̂^(k)) + δi (1 + κi){yi − m(xi; β̂^(k))}].
3 Compute
V̂rep = Σ_{k=1}^L ck (µ̂pmm^(k) − µ̂pmm)².
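Steps 1-3 can be sketched for the linear model m(x; β) = xᵀβ with g(x; β) = x (illustrative assumptions); `w_rep` holds the replicate weights wi^(k) row by row, and the κi are held fixed across replicates as in the recipe above.

```python
import numpy as np

def pmm_replication_variance(x, y, delta, w, w_rep, c, kappa):
    """Replication variance for the PMM estimator, sketched for
    m(x; b) = x @ b and g(x; b) = x.  w: full-sample weights;
    w_rep: (L, n) replicate weights; c: factors c_k; kappa: weights from (2)."""
    yr = np.where(delta == 1, y, 0.0)           # missing y never enters (weight 0)

    def beta(wts):                              # step 1: replicate of beta-hat
        ww = (wts * delta)[:, None]
        return np.linalg.solve((x * ww).T @ x, (x * ww).T @ yr)

    def mu(wts):                                # step 2: replicate of mu_pmm
        m = x @ beta(wts)
        return np.sum(wts * (m + delta * (1 + kappa) * (yr - m)))

    reps = np.array([mu(wk) for wk in w_rep])
    return np.sum(c * (reps - mu(w)) ** 2)      # step 3
```

With full response (all δi = 1, all κi = 0) and delete-1 jackknife weights, this collapses to the ordinary jackknife for the mean, a useful sanity check.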
26. 5. Simulation Study
We wish to answer the following questions from a simulation study:
1 Is the bias of the PMM imputation estimator asymptotically negligible?
2 Does the bias of the NNI estimator remain significant for p ≥ 2?
3 Is the PMM imputation estimator more robust than the regression
imputation estimator?
Other issues:
Efficiency comparison
Validity of variance estimation
Coverage properties of normal-based interval estimators
27. Simulation Setup 1
Population model (N = 50,000):
1 P1: y = −1 + x1 + x2 + e
2 P2: y = −1.167 + x1 + x2 + (x1 − 0.5)² + (x2 − 0.5)² + e
3 P3: y = −1.5 + x1 + x2 + x3 + x4 + x5 + x6 + e
where x1, x2, x3 ~ Uniform(0, 1) and x4, x5, x6, e ~ N(0, 1).
Sampling design:
1 SRS of size n = 400
2 PPS of size n = 400 with size measure si = log(|yi + νi| + 4), where
νi ~ N(0, 1)
Response probability: δi ~ Bernoulli{p(xi)}, where
logit{p(xi)} = 0.2 + x1i + x2i. The overall response rate is 75%.
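The three populations and the response mechanism can be generated as follows (the seed and generator choice are arbitrary; the function name is illustrative):

```python
import numpy as np

def make_population(model, N=50_000, seed=0):
    """Finite populations P1-P3 with the logistic response mechanism
    logit{p(x)} = 0.2 + x1 + x2 (overall response rate about 75%)."""
    rng = np.random.default_rng(seed)
    x1, x2, x3 = rng.uniform(0, 1, (3, N))
    x4, x5, x6, e = rng.normal(0, 1, (4, N))
    y = {
        "P1": -1 + x1 + x2 + e,
        "P2": -1.167 + x1 + x2 + (x1 - 0.5) ** 2 + (x2 - 0.5) ** 2 + e,
        "P3": -1.5 + x1 + x2 + x3 + x4 + x5 + x6 + e,
    }[model]
    p = 1 / (1 + np.exp(-(0.2 + x1 + x2)))      # response probability p(x)
    delta = rng.binomial(1, p)                  # delta_i ~ Bernoulli{p(x_i)}
    return np.column_stack([x1, x2, x3, x4, x5, x6]), y, delta
```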
28. Simulation Setup 2
Imputation methods (for P1 and P2):
1 NNI using (x1, x2)
2 PMM using ŷi = β̂0 + β̂1 x1i + β̂2 x2i
3 Stochastic regression imputation (SRI) using y*i = ŷi + ê*i, where
ŷi = β̂0 + β̂1 x1i + β̂2 x2i
Imputation methods (for P3):
1 NNI using (x1, x2, ..., x6)
2 PMM using ŷi = β̂0 + β̂1 x1i + β̂2 x2i + ... + β̂6 x6i
3 SRI using y*i = ŷi + ê*i, where ŷi = β̂0 + β̂1 x1i + β̂2 x2i + ... + β̂6 x6i
Parameter of interest: θ = E(Y)
29. Simulation Result 1
Table: Simulation results: Bias (×10²) and S.E. (×10²) of the point estimator.

                  PMM           NNI           SRI
                 Bias   S.E.   Bias   S.E.   Bias   S.E.
Simple Random Sampling
(P1)            -0.15   6.46  -0.21   6.54  -0.23   6.44
(P2)            -0.22   6.54  -0.25   6.55  -0.37   6.46
(P3)             1.90  11.85  18.59  11.06   0.11  11.17
Probability Proportional to Size Sampling
(P1)             0.05   6.46   0.13   6.37   0.18   6.53
(P2)             0.30   6.52   0.12   6.47   0.16   6.60
(P3)             1.33  10.99  17.53  10.70   0.40  11.10

PMM: predictive mean matching; NNI: nearest neighbor imputation; SRI:
stochastic regression imputation.
30. Simulation Result 2
Table: Simulation results: Relative bias of jackknife variance estimates (×10²)
and coverage rate (%) of 95% confidence intervals.

                  PMM          NNI          SRI
                 RB    CR     RB    CR     RB    CR
Simple Random Sampling
(P1)              5   95.1     3   95.1     5   95.8
(P2)              5   95.4     3   95.3     5   95.6
(P3)              4   95.2     4   63.8     4   95.5
Probability Proportional to Size Sampling
(P1)              2   95.5     3   94.8     2   94.9
(P2)              1   95.4     0   95.3     3   94.9
(P3)              7   95.8     3   65.5    -3   95.6

PMM: predictive mean matching; NNI: nearest neighbor imputation; SRI:
stochastic regression imputation.
31. 6. Discussion
The non-smooth nature of the PMM estimator makes its asymptotic
properties difficult to investigate.
Some recent econometric papers (Andreou and Werker, 2012; Abadie and
Imbens, 2016) provide key ideas for solving this problem in survey
sampling.
The proof of Theorem 2 involves the martingale central limit theorem
and Le Cam's third lemma.
Replication variance estimation can be constructed to properly capture
the variability of the PMM estimator.
Fractional imputation can be developed to address this problem; this is
a topic for future research.