A Note on Confidence Band for Linear
Regression Means
Dipak K Dey, Junfeng Liu, Nalini Ravishanker, Edwards Qiang Zhang (07-24-2015)
ABSTRACT. We are often interested in estimating the overall set of population means (e.g., a
curve or surface) defined by the corresponding set of predictor values (e.g., across certain temporal
and/or spatial domains). When the model is correctly specified, we study a simple confidence band
built upon least squares regression.
1 Introduction
We consider the linear regression model
$$y_i = x_i\beta + \epsilon_i, \quad \epsilon_i \sim N(0, \sigma^2), \quad (i = 1, \ldots, n).$$
Let $X$ (dimension $n \times p$) be the design matrix, which collects the p-dimensional subject-specific predictor
vectors ($\{x_i\}$) for sample size $n$. The ordinary least squares estimates of the p-dimensional
coefficient vector $\beta$ and the noise variance ($\sigma^2$) are denoted by $\hat\beta$ and $\hat\sigma^2_{n-p}$, respectively. It is known
that the $(1 - \alpha)$ confidence ellipsoid for $C^T\beta$ (rank$(C) = s \le p$) is
$$\{C^T\beta : (C^T\hat\beta - C^T\beta)^T [C^T (X^T X)^{-1} C]^{-1} (C^T\hat\beta - C^T\beta) \le s\,\hat\sigma^2_{n-p} F_{s,n-p,\alpha}\}.$$
Specifically, taking $C = I_p$ (so that $s = p$), the $(1 - \alpha)$ confidence ellipsoid for $\beta$ is
$$\{\beta : (\hat\beta - \beta)^T (X^T X)(\hat\beta - \beta) \le p\,\hat\sigma^2_{n-p} F_{p,n-p,\alpha}\}. \tag{1}$$
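As a sanity check on (1), the following minimal Python sketch simulates the pivotal quantity $(\hat\beta - \beta)^T X^T X (\hat\beta - \beta) / (p\,\hat\sigma^2_{n-p})$, which follows an $F_{p,n-p}$ distribution under the model. The polynomial design on $[0, 1]$, the coefficient values, the noise level and the function name are illustrative assumptions, not taken from the text; the check compares the simulated mean with the known $F_{p,n-p}$ mean $(n-p)/(n-p-2)$.

```python
import numpy as np

def ellipsoid_pivot(n=30, p=3, sigma=2.0, reps=20000, seed=0):
    """Simulate (bhat - b)' X'X (bhat - b) / (p * s2) under y = X b + eps.

    The design (polynomial in t on [0, 1]) and the coefficients are
    illustrative assumptions; the pivot's distribution does not depend on them.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n)
    X = np.vander(t, p, increasing=True)         # columns 1, t, ..., t^(p-1)
    beta = np.arange(1.0, p + 1.0)               # hypothetical true coefficients
    XtX = X.T @ X
    H = np.linalg.solve(XtX, X.T)                # (X'X)^{-1} X', shape (p, n)

    eps = rng.normal(0.0, sigma, size=(reps, n))
    Y = X @ beta + eps                           # each row is one simulated sample
    B = Y @ H.T                                  # least squares estimates, (reps, p)
    resid = Y - B @ X.T
    s2 = np.sum(resid**2, axis=1) / (n - p)      # noise variance estimate
    D = B - beta
    quad = np.einsum("ri,ij,rj->r", D, XtX, D)   # (bhat - b)' X'X (bhat - b)
    return quad / (p * s2)

pivot = ellipsoid_pivot()
print(pivot.mean(), 27 / 25)   # simulated mean vs. mean of F(3, 27)
```

Comparing the empirical $(1-\alpha)$ quantile of `pivot` with a tabulated $F_{p,n-p}$ critical value would be a natural next step.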
A number of methods on multiple comparison and/or testing are available in the literature (e.g.,
Ravishanker and Dey, 2001).
The rest of the article is organized as follows. Section 2 studies the confidence band for response
mean set estimation; Section 3 compares the confidence band and ellipsoid approaches with regard
to the power for hypothesis testing; and Section 4 concludes with future directions.
2 Confidence band for response means
Practice often calls for a statistical estimate of an overall joint set of population means
across a specified domain (e.g., a continuous temporal/spatial curve/surface), along with a
confidence band around it. We start with two special examples that highlight the relationship between
single-point coverage and multiple-point coverage of the response mean.
Example 1. For $y_i = \mu_i + \epsilon_i$ with $\mu_i = \mu$ (a constant) and $\epsilon_i \sim N(0, \sigma^2)$ independently $(i = 1, \ldots, n)$,
the design matrix ($X$) is the n-dimensional vector $1_n = (1, \ldots, 1)^T$ and the regression coefficient
($\beta$) is $\mu$. The simultaneous coverage of the n response means by the following C-scaling confidence
band amounts to individual coverage with coverage rate
$$\Pr\left(|\hat\mu - \mu| \le C\, t_{1-\alpha/2,\,n-1}\, n^{-1/2} \left(\frac{\epsilon^T (I - J_n/n)\,\epsilon}{n-1}\right)^{1/2}\right) = \Pr\left(F_{1,n-1} \le C^2 F_{1,n-1,1-\alpha}\right),$$
where $J_n = 1_n 1_n^T$ and $t_{1-\alpha/2,\,n-p}$ represents the $1 - \alpha/2$ quantile of Student's t-distribution
(degrees of freedom $= n - p$). C-specific and n-specific simultaneous coverage (screen set = sample set
with size n) rate profiles ($p = 1$, $\alpha = 10\%$) are displayed in Figure 1.
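Example 1*. We consider the model $y_i = i\beta + \epsilon_i$, $\epsilon_i \sim N(0, \sigma^2)$ $(i = 1, \ldots, n)$. With a C-scaling
confidence band derived from least squares estimation, the simultaneous coverage amounts to individual
coverage, i.e., $\forall\, 1 \le i \le n$, we have
$$\begin{aligned}
&\Pr\left(|i\hat\beta - i\beta| \le C\, t_{1-\alpha/2,\,n-1}\, \bigl(i (X^T X)^{-1} i\bigr)^{1/2} \left(\frac{\epsilon^T (I - P_X)\,\epsilon}{n-1}\right)^{1/2}\right) \\
&\quad = \Pr\left(|\hat\beta - \beta| \le C\, t_{1-\alpha/2,\,n-1}\, (X^T X)^{-1/2} \left(\frac{\epsilon^T (I - P_X)\,\epsilon}{n-1}\right)^{1/2}\right) \\
&\quad = \Pr\left(F_{1,n-1} \le C^2 F_{1,n-1,1-\alpha}\right),
\end{aligned}$$
where $P_X$ denotes the projection matrix $X (X^T X)^{-1} X^T$.
With the Z quantile (standard normal approximation), we have
$$\Pr\left(|i\hat\beta - i\beta| \le C\, Z_{1-\alpha/2}\, \bigl(i (X^T X)^{-1} i\bigr)^{1/2} \left(\frac{\epsilon^T (I - P_X)\,\epsilon}{n-1}\right)^{1/2}\right) = \Pr\left(F_{1,n-1} \le C^2 \chi^2_{1,1-\alpha}\right).$$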
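The coverage rate in Example 1 can also be checked by simulation. The sketch below is a minimal illustration using only the Python standard library; it estimates the coverage of the C-scaled interval with the Z quantile in place of the t quantile (the normal approximation above). The function name, default sample size and replication count are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist

def coverage_example1(C, n=50, alpha=0.10, reps=5000, seed=1):
    """Monte Carlo estimate of Pr(|mu_hat - mu| <= C * z * s / sqrt(n)).

    Uses mu = 0 and sigma = 1 without loss of generality, because the
    event depends only on the standardized errors.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # Z_{1 - alpha/2}
    hits = 0
    for _ in range(reps):
        eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(eps) / n
        # eps'(I - J_n/n)eps / (n - 1), i.e. the sample variance
        s2 = sum((e - mean) ** 2 for e in eps) / (n - 1)
        if abs(mean) <= C * z * math.sqrt(s2 / n):
            hits += 1
    return hits / reps

for C in (0.5, 1.0, 1.5):
    print(C, coverage_example1(C))
```

Because the same seed is reused for every C, the simulated events are nested and the estimated coverage is monotone in C, which mirrors the shape of the profiles described for Figure 1.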
In view of the link between single-point and multiple-point coverage, we proceed to study the general
case. With the prescribed target coverage rate $(1 - \alpha)$, the Bonferroni multiplicity adjustment (across
the sample space) amounts to the following discrete, n-adaptive fusion of individual confidence intervals:
$$\text{Band} = x_i\hat\beta \pm t_{1-\alpha/(2n),\,n-p}\,\hat\sigma_{n-p}\,\bigl(x_i (X^T X)^{-1} x_i^T\bigr)^{1/2}. \tag{2}$$
This Bonferroni adjustment is n-specific, and (2) is prescribed using the sample size used for modeling
(Bonferroni adjustment I). The consequence is less clear when the Bonferroni
adjustment is applied to an arbitrary number ($n^* \ge$ sample size $n$) of data points screened for claiming
overall coverage across the domain (Bonferroni adjustment II). For instance, $n^* = 1000$ leads to
$$\text{Band} = x_i\hat\beta \pm t_{1-\alpha/2000,\,n-p}\,\hat\sigma_{n-p}\,\bigl(x_i (X^T X)^{-1} x_i^T\bigr)^{1/2},$$
whose width approaches $\infty$ as $n^*$ increases to infinity. Since the Bonferroni adjustment likely leads to an
actual coverage probability $> 1 - \alpha$, we resort to a simple scaled fusion of individual confidence
intervals:
$$\text{Band} = x_i\hat\beta \pm C\, t_{1-\alpha/2,\,n-p}\,\hat\sigma_{n-p}\,\bigl(x_i (X^T X)^{-1} x_i^T\bigr)^{1/2}. \tag{3}$$
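The multipliers in (2), its $n^* = 1000$ variant, and (3) can be compared numerically. The following is a minimal sketch in pure Python: the t quantile is obtained by numerical integration and bisection so that the block has no dependencies (in practice a library routine such as `scipy.stats.t.ppf` would normally be used). The function names are hypothetical, and $C = 1.5$, $p = 3$, $\alpha = 10\%$ are used as illustrative defaults.

```python
import math

def t_cdf(x, df, steps=1000):
    """Student t CDF via Simpson's rule after the substitution
    u = sqrt(df) * tan(theta), which turns the density into a
    constant times cos(theta)^(df - 1)."""
    theta = math.atan(abs(x) / math.sqrt(df))
    h = theta / steps                                # steps is even, as Simpson requires
    total = 1.0 + math.cos(theta) ** (df - 1)        # endpoint values
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * math.cos(k * h) ** (df - 1)
    const = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(math.pi)
    half = const * total * h / 3.0                   # Pr(0 <= T <= |x|)
    return 0.5 + half if x >= 0 else 0.5 - half

def t_quantile(prob, df, tol=1e-10):
    """Quantile of Student's t for prob > 0.5, by bisection on theta."""
    lo, hi = 0.0, math.pi / 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if t_cdf(math.sqrt(df) * math.tan(mid), df) < prob:
            lo = mid
        else:
            hi = mid
    return math.sqrt(df) * math.tan(0.5 * (lo + hi))

def band_multipliers(n, p=3, alpha=0.10, C=1.5, n_star=1000):
    """Multipliers of sigma_hat * sqrt(x (X'X)^{-1} x') for the four bands."""
    df = n - p
    individual = t_quantile(1 - alpha / 2, df)
    return {
        "bonferroni_I": t_quantile(1 - alpha / (2 * n), df),        # band (2)
        "bonferroni_II": t_quantile(1 - alpha / (2 * n_star), df),  # n* screened points
        "individual": individual,
        "C_scaling": C * individual,                                # band (3)
    }

for n in (10, 20, 30, 40):
    print(n, {k: round(v, 3) for k, v in band_multipliers(n).items()})
```

Since every band shares the factor $\hat\sigma_{n-p}(x_i (X^T X)^{-1} x_i^T)^{1/2}$, comparing these multipliers is equivalent to comparing band widths at any point.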
The tuning parameter $C$ (for multiplicity adjustment) is to be determined so as to achieve the
prescribed coverage probability $(1 - \alpha)$ exactly. We are interested in the coverage probability of
the band (3) for the underlying response means ($\{x_i\beta\}$) across a continuous domain (e.g.,
$x_i = (1, t_i, t_i^2, \ldots, t_i^{p-1})$, $t_i = i \times 10^{-3}$, $1 \le i \le 10^3$). The non-coverage rate resembles the family-wise
error rate (FWER) under the multiple testing scenario, i.e., the probability of at least one false rejection
when all hypotheses are null. The confidence band width comparison among Bonferroni
adjustment I ($t_{1-\alpha/(2n),\,n-p}$), Bonferroni adjustment II ($t_{1-\alpha/2000,\,n-p}$), individual ($t_{1-\alpha/2,\,n-p}$) and C-scaling
($C\,t_{1-\alpha/2,\,n-p}$) with ($p = 3$, $C = 1.5$ and $\alpha = 10\%$) is shown in Figure 2. Since the C-scaling
band (3) achieves the prescribed coverage probability $(1 - \alpha)$ at $n = 30$, both Bonferroni
adjustments (I and II) are conservative. Conditional on $n$ and $p$, the C-scaling band (3) is
equivalent to a $C^*$-scaling Bonferroni adjustment (I), with $C^*$ determined from the corresponding $C$.

A confidence band example for model fitting using cosine basis functions is given in Figure 3. Under
the correct models (with fixed $p$), the confidence band coverage probability profiles (indexed by
$n$ and $C$) form an intricate pattern (as $n$ increases from $p + 1$), showing clusters
of n-specific profiles that converge to the normal approximation band. This profile-model ($p$)
correspondence does not depend on the basis function type (e.g., polynomial, radial, cosine, continuous,
discontinuous, etc.) under additive models (Figures 4, 5, 6, 7). From these plots, a segment with
detailed intersection points is enlarged in Figure 8. The across-the-board coverage rate profiles
[$C$, $n$ (sample size) $= n^*$ (screen size), $p$ varies] are plotted in Figure 9. After fitting the model with
sample size $n = p + 1$, we calculate the overall coverage probability by screening a certain number
of data points ($n^* = p + 1$ upward). The results are in Figure 10.
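One way to determine $C$ for band (3) is Monte Carlo simulation. The sketch below is a minimal illustration under stated assumptions, not the procedure used for the figures: it assumes a polynomial basis, fitting points equally spaced on $[0, 1]$, and screening points $t_i = i \times 10^{-3}$; the function name is hypothetical, and both quantiles are estimated by simulation rather than taken from tables. It uses the fact that the band covers all screened means exactly when the largest studentized error does not exceed $C\,t_{1-\alpha/2,\,n-p}$.

```python
import numpy as np

def calibrate_C(n=30, p=3, alpha=0.10, n_screen=1000, reps=4000, seed=0):
    """Estimate the C for which band (3) covers all screened means
    with probability 1 - alpha (design choices are assumptions)."""
    rng = np.random.default_rng(seed)
    X = np.vander(np.linspace(0.0, 1.0, n), p, increasing=True)     # fitting design
    S = np.vander(np.arange(1, n_screen + 1) / n_screen, p,
                  increasing=True)                                  # screening points
    XtX_inv = np.linalg.inv(X.T @ X)
    H = XtX_inv @ X.T                                   # maps y to beta_hat
    lev = np.sqrt(np.einsum("ij,jk,ik->i", S, XtX_inv, S))  # sqrt(x (X'X)^{-1} x')

    eps = rng.standard_normal((reps, n))                # beta = 0, sigma = 1 (pivotal)
    B = eps @ H.T                                       # beta_hat - beta
    resid = eps - B @ X.T
    s = np.sqrt(np.sum(resid**2, axis=1) / (n - p))     # sigma_hat
    T = np.abs(B @ S.T) / (s[:, None] * lev[None, :])   # studentized errors

    k_band = np.quantile(T.max(axis=1), 1 - alpha)      # multiplier for joint coverage
    # Each entry of T is marginally |t_{n-p}|, so the pooled quantile
    # estimates the individual multiplier t_{1-alpha/2, n-p}.
    k_point = np.quantile(T, 1 - alpha)
    return k_band / k_point, k_band, k_point

C, k_band, k_point = calibrate_C()
print(round(C, 2), round(k_band, 2), round(k_point, 2))
```

For comparison, the text reports $C = 1.5$ (Figure 2) and $C = 1.47$ (Section 3) for $p = 3$, $n = 30$, $\alpha = 10\%$; a simulated value need not match these exactly, since the design assumed here may differ from the one used there.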
2.1 Estimation under model mis-specification
The consequences of model over-fitting are demonstrated in Figure 11, where the basis function
is polynomial, $f(x) = \sum_{j=1}^{p} j x^{j-1}$. The consequences of model under-fitting are demonstrated
in Figures 12 and 13, where the basis function is polynomial, $f(x) = \sum_{j=1}^{p} j x^{j-1}$ and
$f(x) = \sum_{j=1}^{p} 10j\, x^{j-1}$, respectively. Over-fitting has much less serious consequences than under-fitting.
The consequences of under-fitting depend on the specific model specification. As an example,
we model and estimate a brain image (http://en.wikipedia.org/wiki/Medical_imaging) contour
(Figure 14). The left panel is a crude “3+3” partition of the top and bottom halves and the right
panel is an adaptive “3+4” partition segmented by pre-specified landmarks. This confidence band
estimation is for illustration purposes only, since some segments have only a small number of data points
available for modeling and estimation.
3 Hypothesis test
For hypothesis testing ($H_0: \beta = \beta_0$ vs. $H_a: \beta \ne \beta_0$), we compare two
approaches: the confidence ellipsoid (1) and the C-scaling confidence band. The latter
rejects $H_0$ whenever the confidence band does not cover the overall response mean curve
under $H_0$. With $\alpha = 10\%$, we study three different alternative hypotheses ($H_0: \beta_0 = (1, 2, 3)$ vs.
$H_a: \beta_a = (1, 2.1, 3), (1, 2.5, 3), (1, 3, 3)$). The powers are compared in Figure 15. $C = 1.47$ achieves
$(1 - \alpha)$ coverage under $H_0$ with sample size $n = 30$. For this example, the powers of the
two approaches are similar in each of these three cases. Note that none of these confidence band construction
and hypothesis testing procedures depends on the design matrix $X$ and/or $\sigma^2$.
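The comparison of the two tests can be sketched by simulation. The following minimal Python example makes several assumptions that are not stated in the text: a quadratic polynomial design with points equally spaced on $[0, 1]$, noise level $\sigma = 1$, and critical values for both tests calibrated by simulating under $H_0$ (standing in for $C\,t_{1-\alpha/2,\,n-p}$ and $F_{p,n-p,\alpha}$). The function names are hypothetical.

```python
import numpy as np

def power_comparison(beta_a, beta_0=(1.0, 2.0, 3.0), n=30, sigma=1.0,
                     alpha=0.10, n_screen=200, reps=4000, seed=0):
    """Return (band test power, ellipsoid test power) at beta_a."""
    rng = np.random.default_rng(seed)
    b0 = np.asarray(beta_0, dtype=float)
    p = b0.size
    X = np.vander(np.linspace(0.0, 1.0, n), p, increasing=True)        # assumed design
    S = np.vander(np.linspace(0.0, 1.0, n_screen), p, increasing=True) # screened points
    XtX = X.T @ X
    XtX_inv = np.linalg.inv(XtX)
    lev = np.sqrt(np.einsum("ij,jk,ik->i", S, XtX_inv, S))

    def simulate_stats(beta_true):
        """Band and ellipsoid statistics, both measured against beta_0."""
        mean = X @ np.asarray(beta_true, dtype=float)
        Y = mean + sigma * rng.standard_normal((reps, n))
        B = Y @ (XtX_inv @ X.T).T                        # least squares estimates
        s2 = np.sum((Y - B @ X.T) ** 2, axis=1) / (n - p)
        D = B - b0
        # largest studentized gap between fitted curve and the H0 curve
        band = np.max(np.abs(D @ S.T) / lev, axis=1) / np.sqrt(s2)
        ellip = np.einsum("ri,ij,rj->r", D, XtX, D) / (p * s2)
        return band, ellip

    band0, ellip0 = simulate_stats(b0)                   # null distributions
    crit_band = np.quantile(band0, 1 - alpha)
    crit_ellip = np.quantile(ellip0, 1 - alpha)
    band1, ellip1 = simulate_stats(beta_a)               # under the alternative
    return float(np.mean(band1 > crit_band)), float(np.mean(ellip1 > crit_ellip))

for beta_a in [(1, 2.1, 3), (1, 2.5, 3), (1, 3, 3)]:
    print(beta_a, power_comparison(beta_a))
```

The band test rejects when the band fails to cover the $H_0$ curve at any screened point, which is the same as the largest studentized gap exceeding the band multiplier. The power values depend on the assumed $\sigma$ and design, so they are illustrative only.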
4 Conclusion
Under additive models, for a specified configuration (e.g., model dimension ($p$), error rate threshold
($\alpha$), sample size ($n$)), individual confidence intervals scaled by a constant $C$ are fused into a
continuous confidence band for studying the coverage probability of the underlying overall mean function.
In the real world, with large amounts of data at hand, the fundamental motivation is to extract
decisive information from data contaminated with noise. Pursuing correct model specification (e.g.,
basis functions and dimension) requires substantial effort in effective data processing, description and
information (feature) extraction. On one hand, we should highlight subject-matter experience,
such as clear-cut specification of segments, landmarks and curve functions in image analysis
scenarios. On the other hand, we expect more sophisticated methodologies from the statistics and/or
machine learning point of view, such as training and validation, model selection and goodness-of-fit
testing, and the development of adaptive real-time modeling and prediction protocols that are tailored and
adjusted for diversified application platforms.
5 APPENDIX
References
[1] Nalini Ravishanker, Dipak K. Dey (2001). A First Course in Linear Model Theory. Chapman &
Hall/CRC, Boca Raton.
Figure 1: C-specific and n-specific simultaneous coverage (screen set = sample set with size n) rate
profiles ($p = 1$, $\alpha = 10\%$). [Two panels, model $\mu$ or $i\beta$: coverage rate versus sample size n, one profile per C (from 1/20, in steps of 1/20); coverage rate versus C, one profile per n (2 to 300).]
Figure 2: Confidence band width comparison among Bonferroni adjustment I ($t_{1-\alpha/(2n),\,n-p}$),
Bonferroni adjustment II ($t_{1-\alpha/2000,\,n-p}$), individual ($t_{1-\alpha/2,\,n-p}$) and C-scaling ($C\,t_{1-\alpha/2,\,n-p}$). $p = 3$,
$C = 1.5$ and $\alpha = 10\%$. Note that $C = 1.5$ arises from the configuration ($p = 3$, $n = 30$, $\alpha = 10\%$) to
achieve the coverage probability $1 - \alpha$ exactly. [Plot: band coefficient versus n; curves: Bonferroni (sample size), Bonferroni (n = 1000), individual, C-scaling.]
Figure 3: A confidence band example for model fitting using cosine basis functions. The raw data
points are represented by “+”. [Plot: response versus time, showing observations, true curve, fitted curve and band; n (sample) = 30, p = 3, C = 1.5, σ = 10.0, α = 10%, $f(x) = 1 + 2\cos(x) + 3\cos(2x)$.]
Figure 4: Across-the-board simultaneous coverage rate profiles ($p = 3$ with polynomial or cosine
basis: $f(x) = 1 + 2x + 3x^2$ (left panel) or $f(x) = 1 + 2\cos(x) + 3\cos(2x)$ (right panel), $x \in [0, 1]$).
[Each panel: coverage rate versus C, one profile per n (from p + 1 to 300), α = 10%.]
Figure 5: Across-the-board simultaneous coverage rate profiles ($p = 3$ with cosine basis:
$f(x) = 1 + 2\cos(x) + 3\cos(2x^2)$ (left panel) or $f(x) = 1 + 2\cos(x^3) + 3\cos(2x^6)$ (right panel), $x \in [0, 1]$).
[Each panel: coverage rate versus C, one profile per n (from p + 1 to 300), α = 10%.]
Figure 6: Across-the-board simultaneous coverage rate profiles ($p = 3$ with polynomial or cosine
basis: $f(x) = 1 + 2\sin(x - 1/2 \mid < 1/2) + 2\sin((x - 1/2)^2 \mid \ge 1/2) + 3\cos(2x^2)$ (non-differentiable, left
panel) or $f(x) = 1 + 2(x - 1/2 \mid < 1/2) + 2(x - 1 \mid \ge 1/2) + 3x^2$ (discontinuous, right panel), $x \in [0, 1]$).
[Each panel: coverage rate versus C, one profile per n (from p + 1 to 300), α = 10%.]
Figure 7: Across-the-board simultaneous coverage rate profiles ($p = 3$ with polynomial or cosine basis:
$f(x) = 1 + 2\sin(100(x - 1/2) \mid < 1/2) + 2\sin((100(x - 1/2))^2 \mid \ge 1/2) + 3\cos(2x^2)$ (non-differentiable, left
panel) or $f(x) = 1 + 2(100(x - 1/2) \mid < 1/2) + 2(100(x - 1) \mid \ge 1/2) + 3(100x)^2$ (discontinuous, right
panel), $x \in [0, 1]$). [Each panel: coverage rate versus C, one profile per n (from p + 1 to 300), α = 10%.]
Figure 8: Clusters of coverage rate profiles at the turning point (n varies, $p = 3$, $\alpha = 10\%$).
[Plot: coverage rate versus C (about 1.20 to 1.55), one profile per n (from p + 1 to 100).]
Figure 9: Across-the-board coverage rate profiles [$C$, $n$ (sample size) $= n^*$ (screen size), $p$ varies].
[Plot: coverage rate versus C, n = 30, one profile per p (1 to 10), α = 10%.]
Figure 10: Overall coverage probability with model-fitting sample size $n = p + 1$ and
the number of screen points varying ($n^* = p + 1$ upward). [Plot: coverage rate versus C; p from 2 to 10, n (screen) from p + 1 to 50, α = 10%.]
Figure 11: Coverage probability profile patterns under model over-fitting ($n = 30$, p(true model)
varies from 9 to 2 in different panels, $\alpha = 10\%$, the true model is $f(x) = \sum_{j=1}^{p} j x^{j-1}$, p(model fitting)
varies from p(true model) to p(true model) + 2). [Eight panels: coverage rate versus C, one profile per p(fit).]
Figure 12: Coverage probability profile patterns under model under-fitting ($n = 30$, p(true model)
varies from 10 to 3 in different panels, $\alpha = 10\%$, the true model is $f(x) = \sum_{j=1}^{p} j x^{j-1}$, p(model fitting)
varies from p(true model) to 2). [Eight panels, labeled “Coverage rate (A)”: coverage rate versus C, one profile per p(fit).]
Figure 13: Coverage probability profile patterns under model under-fitting ($n = 30$, p(true model)
varies from 10 to 3 in different panels, $\alpha = 10\%$, the true model is $f(x) = \sum_{j=1}^{p} 10j\, x^{j-1}$, p(model
fitting) varies from p(true model) to 2). [Eight panels, labeled “Coverage rate (B)”: coverage rate versus C, one profile per p(fit).]
Figure 14: Brain contour estimation example. The “+” represents each captured raw data point
with noise. The estimated smooth curves along with two-sided confidence bands are displayed (the
polynomial function has dimension $p = 5$, $C = 2.1$, $\alpha = 10\%$). The left panel is a crude “3+3”
partition and the right panel is a landmark-based adaptive “3+4” partition. The disconnection at
the left endpoint indicates an edge effect. [Two panels: Y versus X, raw data and fitted curve.]
Figure 15: Power (null hypothesis ($H_0$) rejection probability) profile patterns (n varies, $\alpha = 10\%$,
$H_0: f(x) = 1 + 2x + 3x^2$, $H_a: f(x) = 1 + 3x + 3x^2$). [Plot: power versus C for n = 10, 15, 20, 25, 30; plot labels list $\beta_0 = (1, 2, 3)$ and $\beta_a = (1, 2.1, 3)$, $(1, 2.5, 3)$, $(1, 3, 3)$.]
Effect of grid adaptive interpolation over depth images
 
Talk 3
Talk 3Talk 3
Talk 3
 
QMC: Transition Workshop - Density Estimation by Randomized Quasi-Monte Carlo...
QMC: Transition Workshop - Density Estimation by Randomized Quasi-Monte Carlo...QMC: Transition Workshop - Density Estimation by Randomized Quasi-Monte Carlo...
QMC: Transition Workshop - Density Estimation by Randomized Quasi-Monte Carlo...
 
Approximate Thin Plate Spline Mappings
Approximate Thin Plate Spline MappingsApproximate Thin Plate Spline Mappings
Approximate Thin Plate Spline Mappings
 
kcde
kcdekcde
kcde
 
2009 asilomar
2009 asilomar2009 asilomar
2009 asilomar
 
StatPhysPerspectives_AMALEA_Cetraro_AnnaCarbone.pdf
StatPhysPerspectives_AMALEA_Cetraro_AnnaCarbone.pdfStatPhysPerspectives_AMALEA_Cetraro_AnnaCarbone.pdf
StatPhysPerspectives_AMALEA_Cetraro_AnnaCarbone.pdf
 
Investigation on the Pattern Synthesis of Subarray Weights for Low EMI Applic...
Investigation on the Pattern Synthesis of Subarray Weights for Low EMI Applic...Investigation on the Pattern Synthesis of Subarray Weights for Low EMI Applic...
Investigation on the Pattern Synthesis of Subarray Weights for Low EMI Applic...
 
ON INCREASING OF DENSITY OF ELEMENTS IN A MULTIVIBRATOR ON BIPOLAR TRANSISTORS
ON INCREASING OF DENSITY OF ELEMENTS IN A MULTIVIBRATOR ON BIPOLAR TRANSISTORSON INCREASING OF DENSITY OF ELEMENTS IN A MULTIVIBRATOR ON BIPOLAR TRANSISTORS
ON INCREASING OF DENSITY OF ELEMENTS IN A MULTIVIBRATOR ON BIPOLAR TRANSISTORS
 
fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920
fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920
fb69b412-97cb-4e8d-8a28-574c09557d35-160618025920
 
Project Paper
Project PaperProject Paper
Project Paper
 
Double transform contoor extraction
Double transform contoor extractionDouble transform contoor extraction
Double transform contoor extraction
 
Image compression based on
Image compression based onImage compression based on
Image compression based on
 
Regularized Compression of A Noisy Blurred Image
Regularized Compression of A Noisy Blurred Image Regularized Compression of A Noisy Blurred Image
Regularized Compression of A Noisy Blurred Image
 
Chapter 8 Of Rock Engineering
Chapter 8 Of  Rock  EngineeringChapter 8 Of  Rock  Engineering
Chapter 8 Of Rock Engineering
 
Introduction to Evidential Neural Networks
Introduction to Evidential Neural NetworksIntroduction to Evidential Neural Networks
Introduction to Evidential Neural Networks
 
Sparse data formats and efficient numerical methods for uncertainties in nume...
Sparse data formats and efficient numerical methods for uncertainties in nume...Sparse data formats and efficient numerical methods for uncertainties in nume...
Sparse data formats and efficient numerical methods for uncertainties in nume...
 

A Note on Confidence Bands for Linear Regression Means-07-24-2015

  • 1. A Note on Confidence Band for Linear Regression Means
    Dipak K Dey, Junfeng Liu, Nalini Ravishanker, Edwards Qiang Zhang (07-24-2015)

    ABSTRACT. We are often interested in estimating the overall set of population means (e.g., a curve or surface) defined by the corresponding set of predictor values (e.g., across certain temporal and/or spatial domains). When the model is correctly specified, we study a simple confidence band built upon least squares regression.

    1 Introduction

    We consider the linear regression model
    $$y_i = x_i\beta + \epsilon_i, \quad \epsilon_i \sim N(0, \sigma^2), \quad i = 1, \ldots, n.$$
    Let $X$ (dimension $n \times p$) be the design matrix, which collects the $p$-dimensional subject-specific predictor vectors $\{x_i\}$ for sample size $n$. The ordinary least squares estimates of the $p$-dimensional coefficient vector $\beta$ and the noise variance $\sigma^2$ are denoted by $\hat\beta$ and $\hat\sigma^2_{n-p}$, respectively. It is known that the $(1-\alpha)$ confidence ellipsoid for $C^T\beta$ (rank$(C) = s \le p$) is
    $$\{C^T\beta : (C^T\hat\beta - C^T\beta)^T [C^T (X^T X)^{-1} C]^{-1} (C^T\hat\beta - C^T\beta) \le s\,\hat\sigma^2_{n-p} F_{s,n-p,\alpha}\}.$$
    Specifically, the $(1-\alpha)$ confidence ellipsoid for $\beta$ is
    $$\{\beta : (\hat\beta - \beta)^T (X^T X)(\hat\beta - \beta) \le p\,\hat\sigma^2_{n-p} F_{p,n-p,\alpha}\}. \quad (1)$$
    A number of methods for multiple comparison and/or testing are available in the literature (e.g., Ravishanker and Dey, 2001).

    The rest of the article is organized as follows. Section 2 studies the confidence band for response mean set estimation; Section 3 compares the confidence band and ellipsoid approaches with regard to the power for hypothesis testing; and Section 4 concludes with future directions.
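As a quick check of how ellipsoid (1) relates to the general ellipsoid for a linear combination of coefficients, the specialization can be written out. This is only a sketch: it assumes the choice $C = I_p$ (the $p \times p$ identity matrix), which the text implies but does not state explicitly.

```latex
\begin{align*}
C = I_p \;&\Rightarrow\; s = \operatorname{rank}(C) = p, \\
[C^T (X^T X)^{-1} C]^{-1} &= [(X^T X)^{-1}]^{-1} = X^T X, \\
(C^T\hat\beta - C^T\beta)^T [C^T (X^T X)^{-1} C]^{-1} (C^T\hat\beta - C^T\beta)
  &= (\hat\beta - \beta)^T (X^T X)(\hat\beta - \beta)
  \le p\,\hat\sigma^2_{n-p} F_{p,n-p,\alpha}.
\end{align*}
```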
  • 2. 2 Confidence band for response means

    Oftentimes, practice calls for a statistical estimate of an overall joint set of population means across a specified domain (e.g., a continuous temporal/spatial curve/surface), along with a confidence band around it. We start with two special examples to highlight the relationship between single-point coverage and multiple-point coverage with regard to the response mean.

    Example 1. Let $y_i = \mu_i + \epsilon_i$ with $\mu_i = \mu$ (a constant) and $\epsilon_i \sim N(0, \sigma^2)$ independently ($i = 1, \ldots, n$). The design matrix $X$ is the $n$-dimensional vector $1_n = (1, \ldots, 1)^T$ and the regression coefficient $\beta$ is $\mu$. The simultaneous coverage of the $n$ response means by the following C-scaling confidence band amounts to individual coverage, with coverage rate
    $$\Pr\left(|\hat\mu - \mu| \le C\, t_{1-\alpha/2,\,n-1}\, n^{-1/2} \left(\frac{\epsilon^T (I - J_n/n)\,\epsilon}{n-1}\right)^{1/2}\right) = \Pr\left(F_{1,n-1} \le C^2 F_{1,n-1,1-\alpha}\right),$$
    where $J_n = 1_n 1_n^T$ and $t_{1-\alpha/2,\,n-p}$ denotes the $1-\alpha/2$ quantile of Student's t-distribution with $n-p$ degrees of freedom. C-specific and n-specific simultaneous coverage (screen set = sample set with size $n$) rate profiles ($p = 1$, $\alpha = 10\%$) are displayed in Figure 1.

    Example 1*. We consider the model $y_i = i\beta + \epsilon_i$, $\epsilon_i \sim N(0, \sigma^2)$ ($i = 1, \ldots, n$). With a C-scaling confidence band derived from least squares estimation, the simultaneous coverage again amounts to individual coverage, i.e., for all $1 \le i \le n$ we have
    $$\Pr\left(|i\hat\beta - i\beta| \le C\, t_{1-\alpha/2,\,n-1} \left(i (X^T X)^{-1} i\right)^{1/2} \left(\frac{\epsilon^T (I - P_X)\,\epsilon}{n-1}\right)^{1/2}\right) = \Pr\left(|\hat\beta - \beta| \le C\, t_{1-\alpha/2,\,n-1} (X^T X)^{-1/2} \left(\frac{\epsilon^T (I - P_X)\,\epsilon}{n-1}\right)^{1/2}\right) = \Pr\left(F_{1,n-1} \le C^2 F_{1,n-1,1-\alpha}\right),$$
    where $P_X = X (X^T X)^{-1} X^T$ is the projection matrix onto the column space of $X$. Using the $Z$ quantile instead (standard normal approximation), we have
    $$\Pr\left(|i\hat\beta - i\beta| \le C\, Z_{1-\alpha/2} \left(i (X^T X)^{-1} i\right)^{1/2} \left(\frac{\epsilon^T (I - P_X)\,\epsilon}{n-1}\right)^{1/2}\right) = \Pr\left(F_{1,n-1} \le C^2 \chi^2_{1,1-\alpha}\right).$$
    In view of the link between single-point and multiple-point coverage, we proceed to study a general case.

  • 3. With the prescribed target coverage rate $(1-\alpha)$, the Bonferroni multiplicity adjustment (across the sample space) amounts to the following discrete n-adaptive fusion of individual confidence intervals:
    $$\text{Band} = x_i\hat\beta \pm t_{1-\alpha/(2n),\,n-p}\,\hat\sigma_{n-p} \left(x_i (X^T X)^{-1} x_i^T\right)^{1/2}. \quad (2)$$
    This Bonferroni adjustment is n-specific, and (2) is prescribed using the sample size under modeling (Bonferroni adjustment I). The consequence is less clear when the Bonferroni adjustment is applied to an arbitrary number ($n^* \ge n$) of data points screened for claiming overall coverage across the domain (Bonferroni adjustment II). For instance, $n^* = 1000$ leads to
    $$\text{Band} = x_i\hat\beta \pm t_{1-\alpha/2000,\,n-p}\,\hat\sigma_{n-p} \left(x_i (X^T X)^{-1} x_i^T\right)^{1/2},$$
    whose half-width grows without bound as $n^*$ increases to infinity. Since the Bonferroni adjustment likely leads to an actual coverage probability greater than $1-\alpha$, we resort to the simple scaled fusion of individual confidence intervals
    $$\text{Band} = x_i\hat\beta \pm C\, t_{1-\alpha/2,\,n-p}\,\hat\sigma_{n-p} \left(x_i (X^T X)^{-1} x_i^T\right)^{1/2}. \quad (3)$$
    The tuning parameter $C$ (for multiplicity adjustment) is to be determined so that the prescribed coverage probability $(1-\alpha)$ is achieved exactly. We are interested in the coverage probability of the band (3) for the underlying response means $\{x_i\beta\}$ across a continuous domain (e.g., $x_i = (1, t_i, t_i^2, \ldots, t_i^{p-1})$, $t_i = i \times 10^{-3}$, $1 \le i \le 10^3$). The non-coverage rate resembles the family-wise error rate (FWER) in the multiple testing scenario, i.e., the probability of at least one false rejection when all hypotheses are null.

    The confidence band width comparison among Bonferroni adjustment I ($t_{1-\alpha/(2n),\,n-p}$), Bonferroni adjustment II ($t_{1-\alpha/2000,\,n-p}$), individual ($t_{1-\alpha/2,\,n-p}$) and C-scaling ($C\,t_{1-\alpha/2,\,n-p}$) with $p = 3$, $C = 1.5$ and $\alpha = 10\%$ is shown in Figure 2. Since the C-scaling band (3) achieves the prescribed coverage probability $(1-\alpha)$ at $n = 30$, both Bonferroni adjustments (I and II) are conservative. Conditional on $n$ and $p$, the C-scaling band (3) is equivalent to a $C^*$-scaled Bonferroni adjustment (I), with $C^*$ determined from the corresponding $C$.
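The coverage rate of Example 1 and the band multipliers compared in Figure 2 lend themselves to a small numerical check. The sketch below is illustrative only: the helper names (`t_cdf`, `t_quantile`, `coverage_example1`, `band_multipliers`) are hypothetical, and the Student's t quantile is obtained by Simpson integration plus bisection, which is a convenience chosen here and not a method described by the authors. It uses the fact that $\Pr(F_{1,n-1} \le C^2 F_{1,n-1,1-\alpha}) = \Pr(|T_{n-1}| \le C\,t_{1-\alpha/2,\,n-1})$.

```python
import math


def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """CDF via composite Simpson's rule on [0, |x|] plus symmetry (steps even)."""
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def t_quantile(prob, df):
    """Quantile for prob in (0.5, 1), found by bisection on t_cdf."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < prob:
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def coverage_example1(C, n, alpha):
    """Pr(|T_{n-1}| <= C * t_{1-alpha/2, n-1}) for the constant-mean model."""
    df = n - 1
    return 2 * t_cdf(C * t_quantile(1 - alpha / 2, df), df) - 1


def band_multipliers(n, p, alpha, C, n_screen=1000):
    """Half-width multipliers of the four bands (common factor sigma-hat * leverage omitted)."""
    df = n - p
    t_ind = t_quantile(1 - alpha / 2, df)
    return {
        "individual": t_ind,
        "c_scaling": C * t_ind,
        "bonferroni_I": t_quantile(1 - alpha / (2 * n), df),
        "bonferroni_II": t_quantile(1 - alpha / (2 * n_screen), df),
    }


if __name__ == "__main__":
    print(coverage_example1(C=1.5, n=10, alpha=0.10))
    print(band_multipliers(n=30, p=3, alpha=0.10, C=1.5))
```

With `C = 1` the coverage reduces to the individual level $1-\alpha$ by construction. The numerical integration is adequate for moderate degrees of freedom such as $n - p = 27$; for very small degrees of freedom and extreme tail probabilities a dedicated statistical library would be a safer choice.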
    A confidence band example for model fitting using cosine basis functions is given in Figure 3.

  • 4. Under the correct models (with fixed $p$), the confidence band coverage probability profiles (interwoven by $n$ and $C$) integrate into an intricate pattern (as $n$ increases from $p + 1$), which shows clusters of n-specific profiles with the normal approximation band as the limit. This sort of profile-model ($p$) correspondence does not depend on the basis function type (e.g., polynomial, radial, cosine, continuous, discontinuous, etc.) under additive models (Figures 4, 5, 6, 7). Out of these plots, a segment with detailed intersection points is enlarged in Figure 8. The across-the-board coverage rate profiles [$C$, $n$ (sample size) = $n^*$ (screen size), $p$ varies] are plotted in Figure 9. After fitting the model with sample size $n = p + 1$, we calculate the overall coverage probability by screening a certain number of data points ($n^* = p + 1$ upward). The results are in Figure 10.

    2.1 Estimation under model mis-specification

    The consequences of model over-fitting are demonstrated in Figure 11, where the basis function is the polynomial $f(x) = \sum_{j=1}^{p} j x^{j-1}$. The consequences of model under-fitting are demonstrated in Figures 12 and 13, where the basis functions are the polynomials $f(x) = \sum_{j=1}^{p} j x^{j-1}$ and $f(x) = \sum_{j=1}^{p} 10 j x^{j-1}$, respectively. Over-fitting has a much less serious consequence than under-fitting. The consequence of under-fitting depends on the specific model specification.

    As an example, we model and estimate the brain image (http://en.wikipedia.org/wiki/Medical_imaging) contour (Figure 14). The left panel is a crude “3+3” partition of the top and bottom halves and the right panel is an adaptive “3+4” partition segmented by pre-specified landmarks. This confidence band estimation is for illustration only, since some segments have small numbers of data points incorporated into modeling and estimation.

    3 Hypothesis test

    We study the hypothesis test ($H_0$: $\beta = \beta_0$ vs. $H_a$: $\beta \ne \beta_0$) using two approaches: the confidence ellipsoid (1) and the C-scaling confidence band. The latter rejects $H_0$ whenever the confidence band does not cover the overall response mean curve under $H_0$.
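The band-based rejection rule just described can be sketched as a small Monte Carlo experiment. Everything specific in this sketch is an assumption made for illustration and is not taken from the paper: the equally spaced design on (0, 1], the noise level `sigma = 0.5`, the 100-point screening grid, the number of replicates, and the function names `dot`, `solve` and `rejection_rate`. The constant `1.703` approximates $t_{0.95,\,27}$ from standard tables, so it is only valid for $\alpha = 10\%$ and $n - p = 27$.

```python
import math
import random


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, m):
            f = M[r][c] / M[c][c]
            for k in range(c, m + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * m
    for r in reversed(range(m)):
        z[r] = (M[r][m] - sum(M[r][k] * z[k] for k in range(r + 1, m))) / M[r][r]
    return z


def rejection_rate(beta_true, beta0, C, n=30, sigma=0.5, t_q=1.703,
                   reps=200, seed=1):
    """Share of simulated samples whose C-scaled band misses the H0 mean curve."""
    rng = random.Random(seed)
    p = len(beta0)
    # Polynomial design rows (1, t, ..., t^(p-1)) at t = 1/n, ..., 1 (assumed design).
    X = [[((i + 1) / n) ** j for j in range(p)] for i in range(n)]
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(p)] for a in range(p)]
    # Screening grid over (0, 1] and the leverage term sqrt(x (X^T X)^{-1} x^T).
    grid = [[(k / 100) ** j for j in range(p)] for k in range(1, 101)]
    lev = [math.sqrt(dot(x, solve(XtX, x))) for x in grid]
    rejections = 0
    for _ in range(reps):
        y = [dot(row, beta_true) + rng.gauss(0.0, sigma) for row in X]
        Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(p)]
        beta_hat = solve(XtX, Xty)
        rss = sum((yi - dot(row, beta_hat)) ** 2 for row, yi in zip(X, y))
        sigma_hat = math.sqrt(rss / (n - p))
        diff = [bh - b0 for bh, b0 in zip(beta_hat, beta0)]
        # Reject H0 if the band misses x * beta0 at any screened point.
        if any(abs(dot(x, diff)) > C * t_q * sigma_hat * l
               for x, l in zip(grid, lev)):
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    print("size :", rejection_rate((1, 2, 3), (1, 2, 3), C=1.47))
    print("power:", rejection_rate((1, 3, 3), (1, 2, 3), C=1.47))
```

When `beta_true` equals `beta0`, the returned value estimates the non-coverage rate of the band, so one minus that value is a Monte Carlo estimate of the simultaneous coverage probability; with a different `beta_true` it estimates the power of the band-based test.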
    When $\alpha = 10\%$, we study three different alternative hypotheses ($H_0$: $\beta_0 = (1, 2, 3)$ vs. $H_a$: $\beta_a = (1, 2.1, 3)$, $(1, 2.5, 3)$, $(1, 3, 3)$). The powers are compared in Figure 15. $C = 1.47$ achieves $(1-\alpha)$ coverage under $H_0$ with sample size $n = 30$.

  • 5. For this example, the powers of the two approaches are similar in each of these three cases. Note that none of these confidence band construction and hypothesis testing procedures depends on the design matrix $X$ and/or $\sigma^2$.

    4 Conclusion

    Under additive models, for a specified configuration (e.g., model dimension $p$, error rate threshold $\alpha$, sample size $n$), individual confidence intervals scaled by a constant $C$ are fused into a continuous confidence band for studying the coverage probability of the underlying overall mean function. In the real world, with a large amount of data at hand, the fundamental motivation is to extract decisive information from data contaminated with noise. Pursuing correct model specification (e.g., basis functions and dimension) requires substantial effort for effective data processing, description and information (feature) extraction. On one hand, we should highlight subject-matter experience, such as clear-cut specification of segments, landmarks and curve functions in image analysis scenarios. On the other hand, we expect more sophisticated methodologies from the statistics and/or machine learning point of view, such as training and validation, model selection and goodness-of-fit tests, and the development of adaptive real-time modeling and prediction protocols tailored and adjusted for diversified application platforms.

    5 APPENDIX

    References

    [1] Nalini Ravishanker, Dipak K. Dey (2001). A First Course in Linear Model Theory. Chapman & Hall/CRC, Boca Raton.
  • 6. Figure 1: C-specific and n-specific simultaneous coverage (screen set = sample set with size n) rate profiles (p = 1, α = 10%). [Plot panels: coverage rate versus sample size n, with profiles for C from 1/20 in steps of 1/20; coverage rate versus C, with profiles for n from 2 to 300. Model = µ or (iβ).]
  • 7. Figure 2: Confidence band width comparison among Bonferroni adjustment I ($t_{1-\alpha/(2n),\,n-p}$), Bonferroni adjustment II ($t_{1-\alpha/2000,\,n-p}$), individual ($t_{1-\alpha/2,\,n-p}$) and C-scaling ($C\,t_{1-\alpha/2,\,n-p}$), with p = 3, C = 1.5 and α = 10%. Note that C = 1.5 arises from the configuration (p = 3, n = 30, α = 10%) to achieve the coverage probability 1 − α exactly. [Plot: band coefficient versus n; legend: Bonferroni (sample size), Bonferroni (n = 1000), individual, C-scaling.]
  • 8. Figure 3: A confidence band example for model fitting using cosine basis functions. The raw data points are represented by “+”. [Plot: observations, fitted curve and bands; response versus time; n (sample) = 30, p = 3, C = 1.5, σ = 10.0, α = 10%; f(x) = 1 + 2cos(x) + 3cos(2x); legend: curve (true), curve (fitted), band.]
  • 9. Figure 4: Across-the-board simultaneous coverage rate profile (p = 3 with polynomial or cosine basis: $f(x) = 1 + 2x + 3x^2$ (left panel) or $f(x) = 1 + 2\cos(x) + 3\cos(2x)$ (right panel), $x \in [0, 1]$). [Plot panels: coverage rate versus C, α = 10%, profiles for n from p + 1 to 300.]
  • 10. Figure 5: Across-the-board simultaneous coverage rate profile (p = 3 with cosine basis: $f(x) = 1 + 2\cos(x) + 3\cos(2x^2)$ (left panel) or $f(x) = 1 + 2\cos(x^3) + 3\cos(2x^6)$ (right panel), $x \in [0, 1]$). [Plot panels: coverage rate versus C, α = 10%, profiles for n from p + 1 to 300.]
  • 11. Figure 6: Across-the-board simultaneous coverage rate profile (p = 3 with polynomial or cosine basis: f(x) = 1 + 2sin(x − 1/2 | < 1/2) + 2sin((x − 1/2)^2 | ≥ 1/2) + 3cos(2x^2) (not differentiable, left panel) or f(x) = 1 + 2(x − 1/2 | < 1/2) + 2(x − 1 | ≥ 1/2) + 3x^2 (discontinuous, right panel), x ∈ [0, 1]). [Plot panels: coverage rate versus C, α = 10%, profiles for n from p + 1 to 300.]
  • 12. Figure 7: Across-the-board simultaneous coverage rate profile (p = 3 with polynomial or cosine basis: f(x) = 1 + 2sin(100(x − 1/2) | < 1/2) + 2sin((100(x − 1/2))^2 | ≥ 1/2) + 3cos(2x^2) (not differentiable, left panel) or f(x) = 1 + 2(100(x − 1/2) | < 1/2) + 2(100(x − 1) | ≥ 1/2) + 3(100x)^2 (discontinuous, right panel), x ∈ [0, 1]). [Plot panels: coverage rate versus C, α = 10%, profiles for n from p + 1 to 300.]
  • 13. Figure 8: Clusters of coverage rate profiles at the turning point (n varies, p = 3, α = 10%). [Plot: coverage rate versus C for C between 1.20 and 1.55, profiles for n from p + 1 to 100.]
  • 14. Figure 9: Across-the-board coverage rate profiles [C, n (sample size) = n* (screen size), p varies]. [Plot: coverage rate versus C, n = 30 (across the board), α = 10%, profiles for p from 1 to 10.]
  • 15. Figure 10: Overall coverage probability with model fitting sample size n = p + 1 and the number of screen points varying (n* = p + 1 upward). [Plot: coverage rate versus C, n (sample) = p + 1, α = 10%, p from 2 to 10, n (screen) from p + 1 to 50.]
  • 16. Figure 11: Coverage probability profile patterns under model overfitting (n = 30, p (true model) varies from 9 to 2 in different panels, α = 10%, the true model is $f(x) = \sum_{j=1}^{p} j x^{j-1}$, p (model fitting) varies from p (true model) to p (true model) + 2). [Eight plot panels: coverage rate versus C, one panel per p (true) = 9, 8, ..., 2.]
  • 17. Figure 12: Coverage probability profile patterns under model underfitting (n = 30, p (true model) varies from 10 to 3 in different panels, α = 10%, the true model is $f(x) = \sum_{j=1}^{p} j x^{j-1}$, p (model fitting) varies from p (true model) to 2). [Eight plot panels: coverage rate versus C, one panel per p (true) = 10, 9, ..., 3.]
  • 18. Figure 13: Coverage probability profile patterns under model underfitting (n = 30, p (true model) varies from 10 to 3 in different panels, α = 10%, the true model is $f(x) = \sum_{j=1}^{p} 10 j x^{j-1}$, p (model fitting) varies from p (true model) to 2). [Eight plot panels: coverage rate versus C, one panel per p (true) = 10, 9, ..., 3.]
  • 19. Figure 14: Brain contour estimation example. The “+” represents each captured raw data point with noise. The estimated smooth curves along with two-sided confidence bands are displayed (the polynomial function has dimension p = 5, C = 2.1, α = 10%). The left panel is a crude “3+3” partition and the right panel is a landmark-based adaptive “3+4” partition. The disconnection at the left endpoint indicates an edge effect. [Two plot panels: raw data and fitted curve, Y versus X.]
  • 20. Figure 15: Power (null hypothesis ($H_0$) rejection probability) profile patterns (n varies, α = 10%, $H_0$: $f(x) = 1 + 2x + 3x^2$, $H_a$: $f(x) = 1 + 3x + 3x^2$). [Plot panels: power versus C for $\beta_0 = (1, 2, 3)$ and $\beta_a = (1, 2.1, 3)$, $(1, 2.5, 3)$, $(1, 3, 3)$, with n = 10, 15, 20, 25, 30.]