Advanced Statistics: Introduction to Regression and Analysis of Variance
Fixed and Random Effects
Today’s Report
Random effects. One-way random effects ANOVA. Two-way mixed and random effects ANOVA. Satterthwaite’s procedure.
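Two-way ANOVA
Second generalization: more than one grouping variable.
Two-way ANOVA model: observations $(Y_{ijk})$, $1 \le i \le r$, $1 \le j \le m$, $1 \le k \le n_{ij}$, with $r$ groups in the first grouping variable, $m$ groups in the second, and $n_{ij}$ samples in the $(i,j)$-“cell”:
\[ Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}, \qquad \varepsilon_{ijk} \sim N(0, \sigma^2). \]
Constraints:
\[ \sum_{i=1}^{r} \alpha_i = 0, \qquad \sum_{j=1}^{m} \beta_j = 0, \qquad \sum_{j=1}^{m} (\alpha\beta)_{ij} = 0 \ (1 \le i \le r), \qquad \sum_{i=1}^{r} (\alpha\beta)_{ij} = 0 \ (1 \le j \le m). \]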
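As a small illustration of how the sum-to-zero constraints pin down the parameters, the sketch below decomposes a table of cell means into $\mu$, $\alpha_i$, $\beta_j$ and $(\alpha\beta)_{ij}$. The function name `decompose` and the 2x2 table are hypothetical, and the sketch assumes every cell is weighted equally (a balanced layout).

```python
def decompose(cell_means):
    """Split a table of cell means into mu, alpha, beta and interaction terms.

    cell_means[i][j] is the mean of cell (i, j); all cells are weighted
    equally, which is an assumption of this sketch.
    """
    r = len(cell_means)
    m = len(cell_means[0])
    mu = sum(sum(row) for row in cell_means) / (r * m)
    # Main effects: row and column means minus the grand mean.
    alpha = [sum(row) / m - mu for row in cell_means]
    beta = [sum(cell_means[i][j] for i in range(r)) / r - mu for j in range(m)]
    # Interaction: whatever the additive part does not explain.
    inter = [[cell_means[i][j] - mu - alpha[i] - beta[j] for j in range(m)]
             for i in range(r)]
    return mu, alpha, beta, inter


# Hypothetical 2x2 table of cell means, for illustration only.
mu, alpha, beta, inter = decompose([[2.0, 6.0], [3.0, 11.0]])
```

By construction the returned effects satisfy all four constraints: the main effects sum to zero, and every row and column of the interaction table sums to zero.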
Random and fixed effects
In the ANOVA examples we have seen so far, the categorical variables are well-defined categories: below average fitness, long duration, etc.
In some designs, the categorical variable is “subject”.
Simplest example: repeated measures, where more than one (identical) measurement is taken on the same individual.
In this case, the “group” effect $\alpha_i$ is best thought of as random because we only sample a subset of the entire population of subjects.
When to use random effects?
A “group” effect is random if we can think of the levels we observe in that group as samples from a larger population.
Example: if collecting data from different medical centers, “center” might be thought of as random.
Example: if surveying students on different campuses, “campus” may be a random effect.
Example: sodium content in beer
How much sodium is there in North American beer? How much does this vary by brand?
Observations: for 6 brands of beer, researchers recorded the sodium content of 8 bottles (12 ounces each).
Questions of interest: what is the “grand mean” sodium content? How much variability is there from brand to brand?
“Individuals” in this case are brands; the repeated measures are the 8 bottles.
One-way random effects model
Suppose we take $n$ identical measurements from each of $r$ subjects:
\[ Y_{ij} = \mu_{\cdot} + \alpha_i + \varepsilon_{ij}, \qquad 1 \le i \le r, \ 1 \le j \le n, \]
\[ \varepsilon_{ij} \sim N(0, \sigma^2), \qquad 1 \le i \le r, \ 1 \le j \le n, \]
\[ \alpha_i \sim N(0, \sigma^2_{\mu}), \qquad 1 \le i \le r. \]
We might be interested in the population mean, $\mu_{\cdot}$: CIs, is it zero? etc.
Alternatively, we might be interested in the variability across subjects, $\sigma^2_{\mu}$: CIs, is it zero?
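To make the two sources of randomness concrete, here is a minimal simulation sketch of the one-way random effects model. The function name `simulate_one_way` and all parameter values are hypothetical (they are not the beer data), and the sketch assumes the subject effects and the errors are drawn independently.

```python
import random


def simulate_one_way(r, n, mu, sigma_mu, sigma, seed=0):
    """Draw y[i][j] = mu + alpha_i + eps_ij for r subjects, n measurements each.

    sigma_mu and sigma are standard deviations (square roots of the
    variance components); effects and errors are drawn independently.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(r):
        alpha = rng.gauss(0.0, sigma_mu)  # one random effect per subject
        data.append([mu + alpha + rng.gauss(0.0, sigma) for _ in range(n)])
    return data


# Illustrative values only: 6 subjects, 8 measurements each.
y = simulate_one_way(r=6, n=8, mu=20.0, sigma_mu=3.0, sigma=1.0)
```

All measurements on one subject share the same draw of `alpha`, which is what makes observations within a subject correlated.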
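One-way random ANOVA table
\[
\begin{array}{llll}
\text{Source} & SS & df & E(MS) \\
\text{Treatments} & SSTR = \sum_{i=1}^{r} n\,(\bar Y_{i\cdot} - \bar Y_{\cdot\cdot})^2 & r-1 & \sigma^2 + n\sigma^2_{\mu} \\
\text{Error} & SSE = \sum_{i=1}^{r} \sum_{j=1}^{n} (Y_{ij} - \bar Y_{i\cdot})^2 & (n-1)r & \sigma^2
\end{array}
\]
The only change here is the expected mean square for treatments, which reflects the randomness of the $\alpha_i$’s.
The ANOVA table is still useful for setting up tests: the same $F$ statistics used for fixed effects work here.
Under $H_0: \sigma^2_{\mu} = 0$, it is easy to see that
\[ \frac{MSTR}{MSE} \sim F_{r-1,\,(n-1)r}. \]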
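The table entries can be computed directly from balanced data. The following is a minimal sketch; the function name `one_way_anova` and the toy data are hypothetical, and the sketch assumes every subject has the same number of measurements.

```python
def one_way_anova(data):
    """Return sums of squares, mean squares and the F statistic.

    data[i][j] is measurement j on subject i; all subjects must have the
    same number of measurements (balanced design).
    """
    r = len(data)
    n = len(data[0])
    subject_means = [sum(row) / n for row in data]
    grand_mean = sum(subject_means) / r
    sstr = n * sum((m - grand_mean) ** 2 for m in subject_means)
    sse = sum((y - m) ** 2 for row, m in zip(data, subject_means) for y in row)
    mstr = sstr / (r - 1)
    mse = sse / ((n - 1) * r)
    return {"SSTR": sstr, "SSE": sse, "MSTR": mstr, "MSE": mse, "F": mstr / mse}


# Toy data, for illustration only: 3 subjects, 3 measurements each.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [6, 7, 8]])
```

The statistic `table["F"]` would then be compared with an $F_{r-1,(n-1)r}$ reference distribution.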
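Inference for $\mu_{\cdot}$
We know that $E(\bar Y_{\cdot\cdot}) = \mu_{\cdot}$, and can show that
\[ \mathrm{Var}(\bar Y_{\cdot\cdot}) = \frac{n\sigma^2_{\mu} + \sigma^2}{rn}. \]
Therefore,
\[ \frac{\bar Y_{\cdot\cdot} - \mu_{\cdot}}{\sqrt{\dfrac{SSTR}{(r-1)\,rn}}} \sim t_{r-1}. \]
Why $r-1$ degrees of freedom? Imagine we could record an infinite number of observations for each individual, so that $\bar Y_{i\cdot} \to \mu_i$.
To learn anything about $\mu_{\cdot}$ we would still have only $r$ observations $(\mu_1, \dots, \mu_r)$.
Sampling more within an individual cannot narrow the CI for $\mu_{\cdot}$ beyond what these $r$ values allow.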
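The variance of the grand mean can be worked out in a few lines. This is a sketch under the additional assumption, standard for this model but not stated on the slide, that the $\alpha_i$ and $\varepsilon_{ij}$ are mutually independent.

```latex
\begin{align*}
\bar Y_{\cdot\cdot}
  &= \mu_{\cdot} + \frac{1}{r}\sum_{i=1}^{r}\alpha_i
     + \frac{1}{rn}\sum_{i=1}^{r}\sum_{j=1}^{n}\varepsilon_{ij}, \\
\mathrm{Var}(\bar Y_{\cdot\cdot})
  &= \frac{\sigma^2_{\mu}}{r} + \frac{\sigma^2}{rn}
   = \frac{n\sigma^2_{\mu} + \sigma^2}{rn}.
\end{align*}
```

Since $E(MSTR) = \sigma^2 + n\sigma^2_{\mu}$, the quantity $MSTR/(rn) = SSTR/((r-1)rn)$ is an unbiased estimate of this variance. The first term, $\sigma^2_{\mu}/r$, does not shrink as $n$ grows, which is why only the number of individuals $r$ determines the degrees of freedom.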
Estimating $\sigma^2_{\mu}$
From the ANOVA table,
\[ \sigma^2_{\mu} = \frac{E\big(SSTR/(r-1)\big) - E\big(SSE/((n-1)r)\big)}{n}. \]
Natural estimate:
\[ S^2_{\mu} = \frac{SSTR/(r-1) - SSE/((n-1)r)}{n}. \]
Problem: this estimate can be negative! This is one of the difficulties in random effects models.
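A small numerical sketch shows how the natural estimate goes negative when the subject means happen to be closer together than the within-subject noise would suggest. The function name `sigma2_mu_hat` and the toy data are hypothetical. Truncating at zero, shown in the last line, is one common convention and is not taken from the slides.

```python
def sigma2_mu_hat(data):
    """Method-of-moments estimate (MSTR - MSE) / n for balanced data[i][j]."""
    r = len(data)
    n = len(data[0])
    subject_means = [sum(row) / n for row in data]
    grand_mean = sum(subject_means) / r
    mstr = n * sum((m - grand_mean) ** 2 for m in subject_means) / (r - 1)
    mse = sum((y - m) ** 2
              for row, m in zip(data, subject_means) for y in row) / ((n - 1) * r)
    return (mstr - mse) / n


# Toy data: identical subject means but noisy measurements within subjects.
estimate = sigma2_mu_hat([[1, 5], [2, 4], [3, 3]])  # negative for this data
truncated = max(0.0, estimate)  # assumption: truncate at zero by convention
```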
Example: productivity study
Imagine a study on the productivity of employees in a large manufacturing company.
The company wants to get an idea of daily productivity, and how it depends on which machine an employee uses.
Study: take $m$ employees and $r$ machines, having each employee work on each machine for a total of $n$ days.
As these employees are not all employees, and these machines are not all machines, it makes sense to think of the effects of both machine and employee (and their interactions) as random.
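Two-way random effects model
\[ Y_{ijk} = \mu_{\cdot\cdot} + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}, \qquad 1 \le i \le r, \ 1 \le j \le m, \ 1 \le k \le n, \]
\[ \varepsilon_{ijk} \sim N(0, \sigma^2), \qquad 1 \le i \le r, \ 1 \le j \le m, \ 1 \le k \le n, \]
\[ \alpha_i \sim N(0, \sigma^2_{\alpha}), \qquad 1 \le i \le r, \]
\[ \beta_j \sim N(0, \sigma^2_{\beta}), \qquad 1 \le j \le m, \]
\[ (\alpha\beta)_{ij} \sim N(0, \sigma^2_{\alpha\beta}), \qquad 1 \le i \le r, \ 1 \le j \le m. \]
Covariance between two observations, where $\delta_{ab} = 1$ if $a = b$ and $0$ otherwise:
\[ \mathrm{Cov}(Y_{ijk}, Y_{i'j'k'}) = \delta_{ii'}\,\sigma^2_{\alpha} + \delta_{jj'}\,\sigma^2_{\beta} + \delta_{ii'}\delta_{jj'}\,\sigma^2_{\alpha\beta} + \delta_{ii'}\delta_{jj'}\delta_{kk'}\,\sigma^2. \]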
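The covariance formula is easy to read off once the indicator (Kronecker delta) terms are written as a function. The sketch below is illustrative only: the names `delta` and `cov_random` and the variance values are hypothetical.

```python
def delta(a, b):
    """Kronecker delta: 1 if the two indices agree, else 0."""
    return 1 if a == b else 0


def cov_random(idx1, idx2, s2_alpha, s2_beta, s2_ab, s2):
    """Covariance of Y[i][j][k] and Y[i2][j2][k2] in the two-way random model."""
    i, j, k = idx1
    i2, j2, k2 = idx2
    same_i, same_j, same_k = delta(i, i2), delta(j, j2), delta(k, k2)
    return (same_i * s2_alpha
            + same_j * s2_beta
            + same_i * same_j * s2_ab
            + same_i * same_j * same_k * s2)


# Hypothetical variance components, for illustration only.
params = dict(s2_alpha=4.0, s2_beta=9.0, s2_ab=1.0, s2=2.0)
variance = cov_random((1, 1, 1), (1, 1, 1), **params)      # same observation
same_machine = cov_random((1, 1, 1), (1, 2, 1), **params)  # same i, different j
unrelated = cov_random((1, 1, 1), (2, 2, 1), **params)     # nothing shared
```

Two observations are correlated exactly when they share a level of at least one random factor.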
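ANOVA table: Two-way (random)
\[
\begin{array}{lll}
SS & df & E(MS) \\
SSA = nm \sum_{i=1}^{r} (\bar Y_{i\cdot\cdot} - \bar Y_{\cdot\cdot\cdot})^2 & r-1 & \sigma^2 + nm\,\sigma^2_{\alpha} + n\,\sigma^2_{\alpha\beta} \\
SSB = nr \sum_{j=1}^{m} (\bar Y_{\cdot j\cdot} - \bar Y_{\cdot\cdot\cdot})^2 & m-1 & \sigma^2 + nr\,\sigma^2_{\beta} + n\,\sigma^2_{\alpha\beta} \\
SSAB = n \sum_{i=1}^{r} \sum_{j=1}^{m} (\bar Y_{ij\cdot} - \bar Y_{i\cdot\cdot} - \bar Y_{\cdot j\cdot} + \bar Y_{\cdot\cdot\cdot})^2 & (m-1)(r-1) & \sigma^2 + n\,\sigma^2_{\alpha\beta} \\
SSE = \sum_{i=1}^{r} \sum_{j=1}^{m} \sum_{k=1}^{n} (Y_{ijk} - \bar Y_{ij\cdot})^2 & (n-1)mr & \sigma^2
\end{array}
\]
To test $H_0: \sigma^2_{\alpha} = 0$ use SSA and SSAB ($F = MSA/MSAB$).
To test $H_0: \sigma^2_{\alpha\beta} = 0$ use SSAB and SSE ($F = MSAB/MSE$).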
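For a balanced design the mean squares, the F ratios, and moment estimates of the variance components can be computed as in the sketch below. The function names and toy data are hypothetical. The variance component estimates are obtained by equating each mean square to its expectation in the table, which is an illustrative choice rather than something the slide prescribes.

```python
def two_way_mean_squares(y):
    """Mean squares for balanced y[i][j][k]: r levels of A, m of B, n replicates."""
    r, m, n = len(y), len(y[0]), len(y[0][0])
    cell = [[sum(y[i][j]) / n for j in range(m)] for i in range(r)]
    a_mean = [sum(cell[i]) / m for i in range(r)]
    b_mean = [sum(cell[i][j] for i in range(r)) / r for j in range(m)]
    grand = sum(a_mean) / r
    ssa = n * m * sum((a - grand) ** 2 for a in a_mean)
    ssb = n * r * sum((b - grand) ** 2 for b in b_mean)
    ssab = n * sum((cell[i][j] - a_mean[i] - b_mean[j] + grand) ** 2
                   for i in range(r) for j in range(m))
    sse = sum((v - cell[i][j]) ** 2
              for i in range(r) for j in range(m) for v in y[i][j])
    return {"MSA": ssa / (r - 1), "MSB": ssb / (m - 1),
            "MSAB": ssab / ((r - 1) * (m - 1)), "MSE": sse / ((n - 1) * m * r)}


def random_effects_summary(y):
    """F ratios and moment estimates for the two-way random effects model."""
    r, m, n = len(y), len(y[0]), len(y[0][0])
    ms = two_way_mean_squares(y)
    return {
        "F_A": ms["MSA"] / ms["MSAB"],    # tests sigma^2_alpha = 0
        "F_B": ms["MSB"] / ms["MSAB"],    # tests sigma^2_beta = 0
        "F_AB": ms["MSAB"] / ms["MSE"],   # tests sigma^2_alphabeta = 0
        "sigma2_alpha_hat": (ms["MSA"] - ms["MSAB"]) / (n * m),
        "sigma2_beta_hat": (ms["MSB"] - ms["MSAB"]) / (n * r),
        "sigma2_ab_hat": (ms["MSAB"] - ms["MSE"]) / n,
    }


# Toy data, for illustration only: r = 2, m = 2, n = 2.
toy = [[[1, 3], [5, 7]], [[2, 4], [10, 12]]]
summary = random_effects_summary(toy)
```

Note that the main effects are tested against the interaction mean square, not the error mean square, because both expected mean squares contain the term $n\sigma^2_{\alpha\beta}$.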
Mixed effects model
In some studies, some factors can be thought of as fixed, others as random.
For instance, in the beer example we might have a study of the effect of a standard part of the brewing process on sodium levels.
Then we might think of a model in which we have a fixed effect for “brewing technique” and a random effect for beer.
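Two-way mixed effects model
\[ Y_{ijk} = \mu_{\cdot\cdot} + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}, \qquad 1 \le i \le r, \ 1 \le j \le m, \ 1 \le k \le n, \]
\[ \varepsilon_{ijk} \sim N(0, \sigma^2), \qquad 1 \le i \le r, \ 1 \le j \le m, \ 1 \le k \le n, \]
\[ \alpha_i \sim N(0, \sigma^2_{\alpha}), \qquad 1 \le i \le r, \]
\[ \beta_j, \ 1 \le j \le m, \ \text{are constants!} \]
\[ (\alpha\beta)_{ij} \sim N\!\left(0, \frac{(m-1)\,\sigma^2_{\alpha\beta}}{m}\right), \qquad 1 \le i \le r, \ 1 \le j \le m. \]
Constraints:
\[ \sum_{j=1}^{m} \beta_j = 0, \qquad \sum_{j=1}^{m} (\alpha\beta)_{ij} = 0, \quad 1 \le i \le r. \]
Because of the second constraint, interactions that share the same random level $i$ are negatively correlated:
\[ \mathrm{Cov}\big((\alpha\beta)_{ij}, (\alpha\beta)_{ij'}\big) = -\frac{\sigma^2_{\alpha\beta}}{m}, \qquad j \ne j'. \]
Since $\beta_j$ is fixed, only the random terms contribute to the covariance between observations, with $\delta_{ab} = 1$ if $a = b$ and $0$ otherwise:
\[ \mathrm{Cov}(Y_{ijk}, Y_{i'j'k'}) = \delta_{ii'}\left(\sigma^2_{\alpha} + \delta_{jj'}\,\frac{m-1}{m}\,\sigma^2_{\alpha\beta} - (1-\delta_{jj'})\,\frac{1}{m}\,\sigma^2_{\alpha\beta} + \delta_{jj'}\delta_{kk'}\,\sigma^2\right). \]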
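The negative covariance between interaction terms follows from the sum-to-zero constraint. The short derivation below is a sketch that assumes, for a fixed $i$, all pairs $(\alpha\beta)_{ij}, (\alpha\beta)_{ij'}$ with $j \ne j'$ share a common covariance $c$.

```latex
\begin{align*}
0 = \mathrm{Var}\Big(\sum_{j=1}^{m} (\alpha\beta)_{ij}\Big)
  &= \sum_{j=1}^{m} \mathrm{Var}\big((\alpha\beta)_{ij}\big)
     + \sum_{j \ne j'} \mathrm{Cov}\big((\alpha\beta)_{ij}, (\alpha\beta)_{ij'}\big) \\
  &= m \cdot \frac{(m-1)\,\sigma^2_{\alpha\beta}}{m} + m(m-1)\,c ,
\end{align*}
\[
\text{so that} \qquad c = -\frac{\sigma^2_{\alpha\beta}}{m}.
\]
```

The variance $(m-1)\sigma^2_{\alpha\beta}/m$ and the covariance $-\sigma^2_{\alpha\beta}/m$ are therefore two sides of the same constraint: one cannot be changed without the other.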
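ANOVA table: Two-way (mixed)
\[
\begin{array}{lll}
SS & df & E(MS) \\
SSA & r-1 & \sigma^2 + nm\,\sigma^2_{\alpha} \\
SSB & m-1 & \sigma^2 + nr\,\dfrac{\sum_{j=1}^{m} \beta_j^2}{m-1} + n\,\sigma^2_{\alpha\beta} \\
SSAB & (m-1)(r-1) & \sigma^2 + n\,\sigma^2_{\alpha\beta} \\
SSE = \sum_{i=1}^{r} \sum_{j=1}^{m} \sum_{k=1}^{n} (Y_{ijk} - \bar Y_{ij\cdot})^2 & (n-1)mr & \sigma^2
\end{array}
\]
To test $H_0: \sigma^2_{\alpha} = 0$ use SSA and SSE: with the expected mean squares above, $E(MSA)$ contains no interaction term, so MSE is the matching denominator.
To test $H_0: \beta_1 = \dots = \beta_m = 0$ use SSB and SSAB.
To test $H_0: \sigma^2_{\alpha\beta} = 0$ use SSAB and SSE.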
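Confidence intervals for variances
Consider estimating $\sigma^2_{\beta}$ in two-way random effects ANOVA.
Since $E(MSB) - E(MSAB) = nr\,\sigma^2_{\beta}$, a natural estimate is
\[ \hat\sigma^2_{\beta} = \frac{MSB - MSAB}{nr}. \]
What about a CI? The estimate is a linear combination of $\chi^2$ random variables, but it is not itself $\chi^2$ distributed.
To form a confidence interval for $\sigma^2_{\beta}$ we need to know the distribution of a linear combination of mean squares, at least approximately.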
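Satterthwaite’s procedure
Given $k$ independent mean squares $MS_1, \dots, MS_k$, let
\[ L = \sum_{i=1}^{k} c_i\,MS_i. \]
Then, approximately,
\[ \frac{df_T\,L}{E(L)} \ \text{“}\sim\text{”}\ \chi^2_{df_T}, \qquad \text{where} \qquad df_T = \frac{\left(\sum_{i=1}^{k} c_i\,MS_i\right)^2}{\sum_{i=1}^{k} (c_i\,MS_i)^2 / df_i}, \]
and $df_i$ is the degrees of freedom of the $i$-th mean square.
$(1-\alpha)\cdot 100\%$ CI for $E(L)$, where $\chi^2_{df_T;\,q}$ denotes the $q$ quantile of the $\chi^2_{df_T}$ distribution:
\[ L_L = \frac{df_T \times L}{\chi^2_{df_T;\,1-\alpha/2}}, \qquad L_U = \frac{df_T \times L}{\chi^2_{df_T;\,\alpha/2}}. \]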
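The approximate degrees of freedom are simple to compute. The sketch below uses the standard form of the denominator, $\sum_i (c_i MS_i)^2 / df_i$, and applies it to $L = (MSB - MSAB)/(nr)$, so that $c_1 = 1/(nr)$ and $c_2 = -1/(nr)$. All function names and numbers are hypothetical. The chi-square quantiles are passed in by the caller rather than computed, since the Python standard library has no chi-square quantile function.

```python
def satterthwaite_df(coefs, mean_squares, dfs):
    """Return (L, df_T) for L = sum c_i * MS_i with independent mean squares."""
    L = sum(c * ms for c, ms in zip(coefs, mean_squares))
    denom = sum((c * ms) ** 2 / df for c, ms, df in zip(coefs, mean_squares, dfs))
    return L, L ** 2 / denom


def satterthwaite_ci(L, df_t, chi2_upper, chi2_lower):
    """Approximate CI for E(L).

    chi2_upper is the 1 - alpha/2 quantile and chi2_lower the alpha/2
    quantile of a chi-square distribution with df_t degrees of freedom,
    supplied by the caller (for instance from statistical tables, or from
    scipy.stats.chi2.ppf if SciPy is available).
    """
    return df_t * L / chi2_upper, df_t * L / chi2_lower


# Hypothetical design and mean squares: n = 3, r = 4, m = 5.
n, r, m = 3, 4, 5
msb, msab = 60.0, 12.0
c = 1.0 / (n * r)
L, df_t = satterthwaite_df([c, -c], [msb, msab], [m - 1, (m - 1) * (r - 1)])
```

With a single mean square the formula returns that mean square's own degrees of freedom, which is a quick sanity check. The approximation is usually considered less reliable when some coefficients are negative, as they are here.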