Biology for Computer Engineers Course Handout.pptx
Analysis of Variance
1. Dr. Shailesh Kumar Dewangan
Assistant Professor
Department of Production Engineering,
BIT Mesra Ranchi
Analysis of Variance (ANOVA)
ANOVA
2. 2
Analysis of Variance (ANOVA)
ANOVA is a fundamental step in the Design of Experiments (DOE), a widely used statistical
approach that quantifies the effects of, and interactions between, independent variables by
varying them methodically to determine their impact on the response variables.
ANOVA requires the following assumptions:
• The treatment data must be normally distributed,
• The variance must be the same for all treatments,
• All samples are randomly selected.
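As an informal illustration of the equal-variance assumption, the sketch below compares the sample variances of each treatment. The data and the name `variance_ratio` are hypothetical, and the ratio is only a rough screening aid under these assumptions, not a formal test; any cut-off applied to it is a judgment call.

```python
from statistics import variance

# Hypothetical response data for three treatments (not taken from the handout).
treatments = {
    "A": [4.0, 5.0, 6.0],
    "B": [6.0, 7.0, 8.0],
    "C": [8.0, 9.0, 10.0],
}

def variance_ratio(groups):
    """Return the per-treatment sample variances and the largest/smallest ratio."""
    variances = {name: variance(values) for name, values in groups.items()}
    smallest = min(variances.values())
    if smallest == 0:
        raise ValueError("a treatment has zero variance; the ratio is undefined")
    return variances, max(variances.values()) / smallest

variances, ratio = variance_ratio(treatments)
print(variances)  # sample variance of each treatment
print(ratio)      # a ratio near 1 is consistent with equal variances
```

A ratio far from 1 would suggest looking more closely at the spread of each treatment before relying on the ANOVA results.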
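$$SS_{\text{Regression}} = \sum (\hat{y} - \bar{y})^2$$
$$SS_{\text{Error}} = \sum (y - \hat{y})^2$$
$$SS_{\text{Total}} = SS_{\text{Regression}} + SS_{\text{Error}}$$
where $y$ = observed response, $\hat{y}$ = fitted response, and $\bar{y}$ = mean response.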
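The decomposition above can be checked numerically. The following is a minimal sketch assuming a one-way layout, where the fitted response for each observation is the mean of its group; the data and the function name `sums_of_squares` are hypothetical.

```python
from statistics import mean

# Hypothetical observations grouped by treatment (illustrative values only).
groups = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]

def sums_of_squares(groups):
    """Return (SS Regression, SS Error, SS Total) for a one-way layout."""
    observed = [y for g in groups for y in g]
    grand_mean = mean(observed)
    # Assumption: the fitted response of an observation is its group mean.
    fitted = [mean(g) for g in groups for _ in g]
    ss_regression = sum((f - grand_mean) ** 2 for f in fitted)
    ss_error = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    ss_total = sum((y - grand_mean) ** 2 for y in observed)
    return ss_regression, ss_error, ss_total

ss_regression, ss_error, ss_total = sums_of_squares(groups)
print(ss_regression, ss_error, ss_total)
# SS Total is computed independently here, so the identity can be verified:
print(abs(ss_total - (ss_regression + ss_error)) < 1e-9)
```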
ANOVA is a statistical method that separates the observed variance in the data
into different components for use in additional tests.
3. 3
Types of ANOVA Analysis
• Dependent Variable – Analysis of variance must have a dependent variable
that is continuous. ...
• Independent Variable – ANOVA must have one or more categorical
independent variables, such as sales promotion. ...
• Null hypothesis – All means are equal.
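What is the ANOVA formula?
The ANOVA test is performed by comparing two types of variation: the variation
between the sample means and the variation within each of the samples.
The one-way ANOVA test statistic is:
$$F = \frac{MST}{MSE}$$
$$MST = \frac{SST}{p - 1}$$
where MST is the mean square due to treatment, MSE is the mean square due to error,
SST is the sum of squares due to treatment (not the total sum of squares), and p is the
number of groups being compared.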
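A short sketch of the one-way test statistic follows. The samples and the name `one_way_f` are hypothetical, and the error mean square is assumed to be MSE = SSE/(n − p), with n the total number of observations.

```python
from statistics import mean

# Hypothetical samples for p = 3 groups (illustrative values only).
samples = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]

def one_way_f(samples):
    """Return (MST, MSE, F) for a one-way ANOVA on a list of groups."""
    p = len(samples)
    n = sum(len(g) for g in samples)
    grand_mean = mean([y for g in samples for y in g])
    # SST: sum of squares due to treatment (variation between sample means).
    sst = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in samples)
    # SSE: sum of squares due to error (variation within each sample).
    sse = sum((y - mean(g)) ** 2 for g in samples for y in g)
    mst = sst / (p - 1)
    mse = sse / (n - p)  # assumption: error degrees of freedom are n - p
    return mst, mse, mst / mse

mst, mse, f_value = one_way_f(samples)
print(mst, mse, f_value)
```

For these samples the between-group variation is large relative to the within-group variation, which is what produces a large F.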
4. 4
Degrees of Freedom
Degrees of freedom (DOF) indicate the number of independent pieces of information in the
response data that are needed to calculate the sums of squares. The degrees of freedom for
each component of the model are:
$$DF_{\text{Regression}} = p - 1$$
$$DF_{\text{Error}} = n - p$$
$$DF_{\text{Total}} = n - 1$$
• where n = number of observations and p = number of terms in the model
Mean Square: In an ANOVA, the term Mean Square refers to an estimate of the
population variance based on the variability among a given set of measures. The
calculation for the mean square for the model terms is:
$$MS_{\text{Term}} = \frac{\text{Adj } SS_{\text{Term}}}{DF_{\text{Term}}}$$
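The degrees of freedom and mean squares can be tabulated from the sums of squares. This is a minimal sketch with hypothetical inputs and a hypothetical helper name `anova_table`; it assumes the supplied sums of squares are already the adjusted values for a model with a single term.

```python
def anova_table(ss_regression, ss_error, n, p):
    """Return degrees of freedom and mean squares for regression and error.

    n is the number of observations and p the number of terms in the model.
    """
    if not 1 < p < n:
        raise ValueError("need 1 < p < n so that both DF values are positive")
    df_regression = p - 1
    df_error = n - p
    df_total = n - 1
    return {
        "DF Regression": df_regression,
        "DF Error": df_error,
        "DF Total": df_total,
        "MS Regression": ss_regression / df_regression,
        "MS Error": ss_error / df_error,
    }

# Hypothetical values: 9 observations, 3 terms, SS Regression = 24, SS Error = 6.
table = anova_table(ss_regression=24.0, ss_error=6.0, n=9, p=3)
for name, value in table.items():
    print(f"{name}: {value}")
```

Note that DF Regression and DF Error always add up to DF Total, mirroring the way the sums of squares add up.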
5. 5
F-value and P-value of ANOVA
The F-value is the ratio of the variation explained by a term to the unexplained (error)
variation, so it measures how far apart the individual distributions are relative to their
spread. As the F-value goes up, the P-value goes down. The F-test determines whether the
interaction and main effects are significant. The formula for the model terms is:
$$F = \frac{MS_{\text{Term}}}{MS_{\text{Error}}}$$
The degrees of freedom for the test are:
Numerator = degrees of freedom for term, Denominator = degrees of freedom for error
Larger values of F support rejecting the null hypothesis that there is no
significant effect.
P-value is used in hypothesis tests to help you decide whether to reject or fail to reject a null
hypothesis. The p-value is the probability of obtaining a test statistic that is at least as extreme as the
actual calculated value, if the null hypothesis is true. A commonly used cut-off value for the p-value
is 0.05. For example, if the calculated p-value of a test statistic is less than 0.05, you reject the null
hypothesis.
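To make the decision rule concrete, the sketch below computes a p-value for an F statistic in one special case. When the term has exactly 2 degrees of freedom, the upper-tail probability of the F distribution has the closed form P(F ≥ f) = (d / (d + 2f))^(d/2), where d is the error degrees of freedom. The function names and input values are hypothetical; for other numerator degrees of freedom a statistics library would be needed.

```python
def f_p_value_df2(f_value, df_error):
    """Upper-tail probability P(F >= f_value) for an F(2, df_error) distribution.

    Valid only when the numerator (term) degrees of freedom equal 2.
    """
    if f_value < 0 or df_error <= 0:
        raise ValueError("f_value must be >= 0 and df_error must be > 0")
    return (df_error / (df_error + 2.0 * f_value)) ** (df_error / 2.0)

def reject_null(p_value, alpha=0.05):
    """Apply the cut-off rule: reject the null hypothesis when p_value < alpha."""
    return p_value < alpha

# Hypothetical test: F = 12 with 2 and 6 degrees of freedom.
p_value = f_p_value_df2(f_value=12.0, df_error=6)
print(p_value)
print(reject_null(p_value))
```

Here the p-value is (6/30)^3 = 0.008, which is below the 0.05 cut-off, so the null hypothesis would be rejected.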
6. 6
Model Adequacy Check
Before the conclusions from the analysis of variance are adopted, the adequacy of the underlying
model should be checked. It is always necessary to:
• Examine the fitted model to ensure that it provides an adequate approximation to the true system;
• Verify that none of the least squares regression assumptions are violated.
Several techniques are available for checking model adequacy. Before the full model ANOVA,
several R² statistics are presented. The ordinary R² is:
$$R^2 = \frac{SS_{\text{Regression}}}{SS_{\text{Total}}}$$
• R² (R-sq) is the coefficient of determination; it indicates how much of the variation in the
response is explained by the model. The higher the R², the better the model fits your data. The formula is:
$$R^2 = 1 - \frac{SS_{\text{Error}}}{SS_{\text{Total}}}$$
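The two expressions given for R² are the same quantity. A short derivation, using only the identity SS Total = SS Regression + SS Error stated earlier:

$$R^2 = \frac{SS_{\text{Regression}}}{SS_{\text{Total}}} = \frac{SS_{\text{Total}} - SS_{\text{Error}}}{SS_{\text{Total}}} = 1 - \frac{SS_{\text{Error}}}{SS_{\text{Total}}}$$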
Adjusted R² accounts for the number of factors in your model. The formula is:
$$R^2_{\text{adj}} = 1 - \frac{MS_{\text{Error}}}{SS_{\text{Total}} / DF_{\text{Total}}}$$
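Both statistics can be computed directly from the sums of squares. The sketch below uses hypothetical values and function names, and assumes MS Error = SS Error/(n − p) and DF Total = n − 1, as in the degrees-of-freedom definitions earlier in the handout.

```python
def r_squared(ss_error, ss_total):
    """Ordinary R^2 = 1 - SS Error / SS Total."""
    return 1.0 - ss_error / ss_total

def adjusted_r_squared(ss_error, ss_total, n, p):
    """Adjusted R^2 = 1 - MS Error / (SS Total / DF Total)."""
    ms_error = ss_error / (n - p)    # assumption: DF Error = n - p
    ms_total = ss_total / (n - 1)    # assumption: DF Total = n - 1
    return 1.0 - ms_error / ms_total

# Hypothetical values: SS Error = 6, SS Total = 30, 9 observations, 3 terms.
r2 = r_squared(ss_error=6.0, ss_total=30.0)
r2_adj = adjusted_r_squared(ss_error=6.0, ss_total=30.0, n=9, p=3)
print(r2)
print(r2_adj)
```

With more than one term in the model the adjusted value is lower than the ordinary R², because each additional term uses up an error degree of freedom.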