Two-way ANOVA,
multiple regression
and general linear models
ANOVA
with more predictors:
summary(aov(seedlings ~ fertilization * management, data=seedl))
                         Df Sum Sq Mean Sq F value   Pr(>F)
fertilization             1  108.3  108.30   9.404   0.0053 **
management                2  400.5  200.23  17.386 2.15e-05 ***
fertilization:management  2   34.2   17.10   1.485   0.2466
Residuals                24  276.4   11.52
seedlings management fertilization
9 abandoned fert
4 abandoned fert
12 abandoned fert
12 abandoned fert
11 abandoned fert
11 abandoned unfert
8 abandoned unfert
16 abandoned unfert
14 abandoned unfert
12 abandoned unfert
14 grazing fert
14 grazing fert
15 grazing fert
24 grazing fert
21 grazing fert
25 grazing unfert
14 grazing unfert
20 grazing unfert
20 grazing unfert
19 grazing unfert
10 mowing fert
6 mowing fert
5 mowing fert
8 mowing fert
8 mowing fert
18 mowing unfert
11 mowing unfert
14 mowing unfert
16 mowing unfert
12 mowing unfert
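The data table above can be typed into R directly. The following is a minimal sketch that rebuilds it as a data frame named `seedl` (the name used in the call above) and refits the two-way ANOVA; the object names `fit` and `tab` are only illustrative.

```r
# Seedling counts in the order of the table above:
# abandoned (fert, unfert), grazing (fert, unfert), mowing (fert, unfert)
seedl <- data.frame(
  seedlings = c( 9,  4, 12, 12, 11,   11,  8, 16, 14, 12,
                14, 14, 15, 24, 21,   25, 14, 20, 20, 19,
                10,  6,  5,  8,  8,   18, 11, 14, 16, 12),
  management    = factor(rep(c("abandoned", "grazing", "mowing"), each = 10)),
  fertilization = factor(rep(rep(c("fert", "unfert"), each = 5), times = 3))
)

# Two-way ANOVA with both main effects and their interaction
fit <- aov(seedlings ~ fertilization * management, data = seedl)
tab <- summary(fit)[[1]]   # the ANOVA table as a data frame
print(tab)
```

Because every combination of management and fertilization has five plots, the design is balanced and the order of the two factors in the formula does not change the sums of squares.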
Interaction
– test of additivity
H0: the effect of a factor
is not affected by the other factor
– in plots: lines connecting the means are parallel
[Figure: LS means of No. species for mowed*fertilized (x-axis: fertilized 0/1; lines: mowed 0, mowed 1). Current effect: F(1, 16)=0.0000, p=1.0000. Vertical bars denote 0.95 confidence intervals.]
[Figure: LS means of No. species for mowed*fertilized (x-axis: fertilized 0/1; lines: mowed 0, mowed 1). Current effect: F(1, 16)=18.000, p=.00062. Vertical bars denote 0.95 confidence intervals.]
no interaction:
additive effect
lines are parallel
the effect of mowing
is the same
regardless of fertilization
interaction:
non-additive effect
lines are not parallel
the effect of mowing
is more pronounced
in unfertilized plots
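The idea of parallel versus non-parallel lines can be expressed as a single number: the difference between the mowing effect in fertilized plots and the mowing effect in unfertilized plots. The sketch below uses hypothetical cell means (not read from the figures) and a helper `interaction_contrast` that is introduced here only for illustration.

```r
# Hypothetical 2 x 2 tables of cell means (rows: mowed 0/1, columns: fertilized 0/1)
dn <- list(mowed = c("0", "1"), fertilized = c("0", "1"))
additive    <- matrix(c(2,  4, 5, 7), nrow = 2, dimnames = dn)  # parallel lines
interactive <- matrix(c(4, 12, 3, 5), nrow = 2, dimnames = dn)  # non-parallel lines

# Effect of mowing in fertilized plots minus effect of mowing in unfertilized plots
interaction_contrast <- function(m) {
  (m["1", "1"] - m["0", "1"]) - (m["1", "0"] - m["0", "0"])
}

interaction_contrast(additive)     # zero: the mowing effect is the same everywhere
interaction_contrast(interactive)  # negative: mowing matters more in unfertilized plots
```

The interaction F test asks whether this contrast differs from zero by more than the residual variation would produce.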
Number of observations
– should be balanced in all groups
Mean mass of rats ~ diet + music
Chocolate rats heavier, music no effect
            music
diet        Rock  Folk
Paper         15    15
Chocolate     15    15
            music
diet        Rock  Folk
Paper         20    40
Chocolate     10    20
            music
diet        Rock  Folk
Paper         30    10
Chocolate     10    30
Chocolate rats heavier, Folk rats heavier
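One way to see why balance matters is to build a data set in which only diet affects mass and then look at the raw means per music group. The sketch below takes the cell counts from the last table (30/10/10/30); the masses of 300 and 350 are hypothetical values chosen for illustration.

```r
# Cell counts from the unbalanced table above
counts <- matrix(c(30, 10, 10, 30), nrow = 2,
                 dimnames = list(diet = c("Paper", "Chocolate"),
                                 music = c("Rock", "Folk")))

# One row per cell, then one row per rat
cells <- as.data.frame(as.table(counts), responseName = "n")
cells$mass <- ifelse(cells$diet == "Chocolate", 350, 300)  # hypothetical: diet effect only
rats <- cells[rep(seq_len(nrow(cells)), cells$n), c("diet", "music", "mass")]

# Raw mean mass per music group
marginal <- tapply(rats$mass, rats$music, mean)
print(marginal)
```

Folk rats come out heavier on average only because most of them were fed chocolate; with equal counts in all four cells the two music means would be identical.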
Fixed vs. random effects of predictors
– depends on how general your research question is
– fixed effect: you are interested in differences
between particular factor levels only
(fertilized / unfertilized, breed1 / breed2 / breed3)
– random effect: you want to generalize the results
to all other possible levels
(site1 / site2 / site3 / site4, breed1 / breed2 / breed3, sampling-unit identity)
[Figure: grid of fertilized (F) and unfertilized (U) plots at a single site]
your results are valid only for your site
[Figure: site 1, site 2, site 3, site 4]
fixed: fertilization
random: sites
you are not interested in the effect of site
you can generalize to all (comparable) sites
Fixed vs. random effects of predictors
– several computational approaches
– simple: use of different MS in the F test
– complicated: advanced fitting in R
Effect tested       A fixed,     A random,    A fixed,
                    B fixed      B random     B random
Factor A            MSA/MSe      MSA/MSAB     MSA/MSAB
Factor B            MSB/MSe      MSB/MSAB     MSB/MSe
A x B interaction   MSAB/MSe     MSAB/MSe     MSAB/MSe
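The table of denominators can be written as a small helper. This is only a sketch: `f_ratios` and its `model` labels are names made up here, and the mean squares in the example calls are hypothetical numbers.

```r
# F ratios for a two-way design, following the table of denominators above
f_ratios <- function(MSA, MSB, MSAB, MSe,
                     model = c("both_fixed", "both_random", "A_fixed_B_random")) {
  model <- match.arg(model)
  denom_A <- if (model == "both_fixed") MSe else MSAB   # A is tested against MSAB unless both are fixed
  denom_B <- if (model == "both_random") MSAB else MSe  # B is tested against MSAB only if both are random
  c(A = MSA / denom_A, B = MSB / denom_B, AB = MSAB / MSe)
}

# Hypothetical mean squares
f_ratios(MSA = 40, MSB = 30, MSAB = 10, MSe = 5, model = "both_fixed")
f_ratios(MSA = 40, MSB = 30, MSAB = 10, MSe = 5, model = "A_fixed_B_random")
```

With the same mean squares, the F for factor A drops from 8 to 4 once B is treated as random, which is why the choice between fixed and random has to be made before the test.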
Hierarchical (nested) design
– not all combinations of factor levels are available
– the factor with more levels
is “nested in” the factor with fewer levels
2 bedrocks:
granite: site 1, site 2, site 3
limestone: site 4, site 5, site 6
Factors:
fertilization
sites (nested in bedrock)
– it is impossible to have more than one bedrock on one site
bedrock
Fbedrock = MSbedrock / MSsite
R – it is fairly complicated to fit proper models with
nested factors
random effects
interactions
packages: lme4 (simpler) or nlme (advanced)
https://www.jaredknowles.com/journal/2013/11/25/getting-started-with-mixed-effect-models-in-r
2 bedrocks:
granite: site 1, site 2, site 3
limestone: site 4, site 5, site 6
Factors:
fertilization (fixed)
sites (random, nested in bedrock)
– it is impossible to have more than one bedrock on one site
bedrock (fixed)
fertilization:bedrock (the interaction of fixed factors is interesting, the others usually are not)
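For a balanced design like this one, the nested test of bedrock against sites can be sketched in base R with `aov()` and an `Error()` term, rather than with the lme4 or nlme packages named above. Everything in the data below is simulated and hypothetical: the effect sizes, the four plots per site, and the column names.

```r
set.seed(1)
# Hypothetical layout: 6 sites, 4 plots per site (2 fertilized, 2 unfertilized)
d <- expand.grid(plot = 1:4, site = paste0("site", 1:6))
d$bedrock <- factor(ifelse(d$site %in% paste0("site", 1:3), "granite", "limestone"))
d$fertilization <- factor(ifelse(d$plot %% 2 == 1, "F", "U"))

# Simulated response: bedrock and fertilization effects plus random site effects
site_effect <- rnorm(6, sd = 2)
d$y <- 10 + 3 * (d$bedrock == "limestone") + 2 * (d$fertilization == "F") +
  site_effect[as.integer(d$site)] + rnorm(nrow(d))

# Site is the error stratum for bedrock: F_bedrock = MS_bedrock / MS_site
fit <- aov(y ~ bedrock * fertilization + Error(site), data = d)
s <- summary(fit)
F_bedrock <- s[["Error: site"]][[1]][["F value"]][1]

# Same test done by hand on the six site means
site_means <- aggregate(y ~ site + bedrock, data = d, FUN = mean)
F_check <- anova(lm(y ~ bedrock, data = site_means))[["F value"]][1]
c(F_bedrock = F_bedrock, F_check = F_check)
```

The two F values agree because, in a balanced design, testing bedrock against the site mean square is the same as a one-way ANOVA on the site means. Fertilization and its interaction with bedrock are tested in the within-site stratum.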
Multiple regression
main effects of two predictors:
summary(lm(seedlings ~ productivity + temperature, data=seedl))
seedlings productivity temperature
9 589 7.2
4 674 4.5
12 484 7.4
12 504 5.4
11 484 5
11 572 6
8 411 6.2
16 374 7.6
14 353 8.4
12 406 4.7
14 759 6.8
14 789 4.6
15 689 4.8
24 611 7
21 456 5.8
25 386 7.6
14 538 4.8
20 350 9
20 413 4.8
19 599 6.5
10 558 6.4
6 752 4.2
5 687 6
8 667 4.4
8 662 7.6
18 479 4.8
11 561 9
14 450 8.4
16 578 4.7
12 592 6.8
Estimate Std. Error t value Pr(>|t|)
(Intercept) 21.286421 7.220427 2.948 0.00652 **
productivity -0.017128 0.007928 -2.160 0.03977 *
temperature 0.245520 0.681770 0.360 0.72156
> summary(lm(seedlings~productivity,data=seedl))
Estimate Std. Error t value Pr(>|t|)
(Intercept) 23.429505 4.024972 5.821 2.97e-06 ***
productivity -0.018256 0.007169 -2.547 0.0167 *
> summary(lm(seedlings~temperature,data=seedl))
Estimate Std. Error t value Pr(>|t|)
(Intercept) 8.2922 4.2487 1.952 0.061 .
temperature 0.8274 0.6661 1.242 0.224
Equation: $y = b_0 + b_1 x_1 + b_2 x_2 + \dots$ (plane)
Interaction: $y = b_0 + b_1 x_1 + b_2 x_2 + b_{1,2} x_1 x_2$
– curved surface (different slope in the front and in the back)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 21.286421 7.220427 2.948 0.00652 **
productivity -0.017128 0.007928 -2.160 0.03977 *
temperature 0.245520 0.681770 0.360 0.72156
...
Multiple R-squared: 0.192, Adjusted R-squared: 0.132
F-statistic: 3.207 on 2 and 27 DF, p-value: 0.0563
> anova(lm(seedlings~temperature+productivity,data=seedl))
Analysis of Variance Table
Response: seedlings
Df Sum Sq Mean Sq F value Pr(>F)
temperature 1 42.80 42.800 1.7454 0.19755
productivity 1 114.46 114.464 4.6678 0.03977 *
Residuals 27 662.10 24.522
> anova(lm(seedlings~productivity+temperature,data=seedl))
Analysis of Variance Table
Response: seedlings
Df Sum Sq Mean Sq F value Pr(>F)
productivity 1 154.08 154.084 6.2834 0.01851 *
temperature 1 3.18 3.180 0.1297 0.72156
Residuals 27 662.10 24.522
ANOVA test of whole model (F-statistic in the summary output)
ANOVA tests of each predictor (anova tables):
Mind the order of predictors
simple vs. partial effects (each predictor is tested in addition to the previous predictors)
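The effect of predictor order can be checked on the data listed earlier. This sketch re-enters the three columns as a data frame `seedl` and fits the model in both orders; the object names `a1`, `a2`, `explained1`, `explained2` and `r2` are illustrative.

```r
seedl <- data.frame(
  seedlings = c(9, 4, 12, 12, 11, 11, 8, 16, 14, 12,
                14, 14, 15, 24, 21, 25, 14, 20, 20, 19,
                10, 6, 5, 8, 8, 18, 11, 14, 16, 12),
  productivity = c(589, 674, 484, 504, 484, 572, 411, 374, 353, 406,
                   759, 789, 689, 611, 456, 386, 538, 350, 413, 599,
                   558, 752, 687, 667, 662, 479, 561, 450, 578, 592),
  temperature = c(7.2, 4.5, 7.4, 5.4, 5, 6, 6.2, 7.6, 8.4, 4.7,
                  6.8, 4.6, 4.8, 7, 5.8, 7.6, 4.8, 9, 4.8, 6.5,
                  6.4, 4.2, 6, 4.4, 7.6, 4.8, 9, 8.4, 4.7, 6.8)
)

# Sequential sums of squares: each term is tested after the terms listed before it
a1 <- anova(lm(seedlings ~ temperature + productivity, data = seedl))
a2 <- anova(lm(seedlings ~ productivity + temperature, data = seedl))

# The split between predictors depends on the order, the total explained does not
explained1 <- sum(a1[["Sum Sq"]][1:2])
explained2 <- sum(a2[["Sum Sq"]][1:2])
r2 <- explained1 / sum(a1[["Sum Sq"]])   # explained variation (R-squared)
print(a1); print(a2); print(r2)
```

Only the last predictor in each table is tested as a partial effect, which is why its F test matches the t test of that coefficient in the summary output.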
Explained variation:
(43+114)/(43+114+662)=0.192
General linear models
– ANOVA and Regression are equivalent
– same idea of testing variability explained by a model
– fitting model by least squares
Variance: mean of squared differences from mean
– get the differences
– square them
Difference from the overall mean: square of the difference = TOTAL square
Difference from the group mean: square of the difference = ERROR square
Difference of the group mean from the overall mean: square of the difference = GROUP square
Sums of squares in regression
Total square: $SS_{TOT} = \sum_i (Y_i - \bar{Y})^2$
Regression square: $SS_{REG} = \sum_i (\hat{Y}_i - \bar{Y})^2$
Error square: $SS_e = \sum_i (Y_i - \hat{Y}_i)^2$ (this square is minimized)
$Y_i$: individual values of Y
$\bar{Y}$: mean of Y, mean(Y)
$\hat{Y}_i$: individual fitted values of Y (values of Y calculated as $\hat{Y} = a + bx$)
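The three sums of squares can be computed by hand from any fitted line. The sketch below uses a tiny hypothetical data set (`x` and `y` are invented numbers) to show that the total square splits exactly into the regression and error squares.

```r
# Hypothetical data
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

fit   <- lm(y ~ x)      # least squares: minimizes the error sum of squares
y_hat <- fitted(fit)    # fitted values a + b * x

ss_tot <- sum((y - mean(y))^2)       # total
ss_reg <- sum((y_hat - mean(y))^2)   # regression
ss_e   <- sum((y - y_hat)^2)         # error

c(ss_tot = ss_tot, ss_reg = ss_reg, ss_e = ss_e, r2 = ss_reg / ss_tot)
```

The ratio `ss_reg / ss_tot` is the explained variation reported by R as Multiple R-squared.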
General linear models
– ANOVA and Regression are equivalent
– same idea of testing variability explained by a model
– fitting model by least squares
–> both types of predictors (numeric, factor)
can be combined
– you can use any wild combination of predictor types,
interactions, nestedness, random effects...
– one more semester:
(P. Šmilauer: Modern Regression Methods, KBE/785E)
– simplest case – analysis of covariance
General linear models: analysis of covariance
– 1 numeric predictor
– 1 categorical predictor
– no interaction
– model – parallel lines
> anova(lm(seedlings~productivity+management,data=seedl))
Df Sum Sq Mean Sq F value Pr(>F)
productivity 1 154.08 154.084 19.843 0.0001418 ***
management 2 463.39 231.693 29.837 1.852e-07 ***
Residuals 26 201.90 7.765
seedlings management productivity
9 abandoned 589
4 abandoned 674
12 abandoned 484
12 abandoned 504
11 abandoned 484
11 abandoned 572
8 abandoned 411
16 abandoned 374
14 abandoned 353
12 abandoned 406
14 grazing 759
14 grazing 789
15 grazing 689
24 grazing 611
21 grazing 456
25 grazing 386
14 grazing 538
20 grazing 350
20 grazing 413
19 grazing 599
10 mowing 558
6 mowing 752
5 mowing 687
8 mowing 667
8 mowing 662
18 mowing 479
11 mowing 561
14 mowing 450
16 mowing 578
12 mowing 592
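The parallel-lines model can be read off the fitted coefficients: one common slope for productivity and a separate intercept for each management type. This sketch re-enters the table above as `seedl`; the names `intercepts` and `slope` are illustrative.

```r
seedl <- data.frame(
  seedlings = c(9, 4, 12, 12, 11, 11, 8, 16, 14, 12,
                14, 14, 15, 24, 21, 25, 14, 20, 20, 19,
                10, 6, 5, 8, 8, 18, 11, 14, 16, 12),
  management = factor(rep(c("abandoned", "grazing", "mowing"), each = 10)),
  productivity = c(589, 674, 484, 504, 484, 572, 411, 374, 353, 406,
                   759, 789, 689, 611, 456, 386, 538, 350, 413, 599,
                   558, 752, 687, 667, 662, 479, 561, 450, 578, 592)
)

# Analysis of covariance: numeric + categorical predictor, no interaction
fit <- lm(seedlings ~ productivity + management, data = seedl)
b <- coef(fit)

slope <- b[["productivity"]]   # the same slope for every management type
intercepts <- c(
  abandoned = b[["(Intercept)"]],                              # reference level
  grazing   = b[["(Intercept)"]] + b[["managementgrazing"]],
  mowing    = b[["(Intercept)"]] + b[["managementmowing"]]
)
print(slope); print(intercepts)
```

Adding the term `productivity:management` to the formula would allow a different slope in each group, that is, non-parallel lines.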
Analyzing many predictors:
If you have too many predictors
(e.g. measures of everything in field observation)
do not include everything in your model!
–> fit a minimal adequate model
– backward selection
– include everything in the first model,
remove all non-significant terms
– forward selection
– start with the null model
– add individual terms
– one by one (due to collinearity)
– based on p-value or AIC
– analyze the final model
AIC - Akaike information criterion
AIC = 2 k + n log ( SSE / n ) + C
k – number of model parameters
(i.e. model df)
n – number of observations
SSE – residual sum of squares (RSS),
C – constant (can be ignored)
Quantifies the information accounted for by a predictor
– lower AIC suggests a better fit,
absolute values of AIC are not informative
– allows comparisons between models with different
number of df – penalization of complicated models
Can be combined with an F-test of significance
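For least-squares models, the AIC values that R prints in `add1()` tables are computed as n log(RSS / n) + 2k, with the constant dropped. The sketch below checks this against the RSS and AIC values shown in the forward-selection output of the body-mass example (n = 21); the helper name `aic_ls` is made up for illustration.

```r
# AIC for a least-squares fit, up to an additive constant
aic_ls <- function(rss, n, k) n * log(rss / n) + 2 * k

# RSS values copied from the forward-selection tables (n = 21 people)
aic_null       <- aic_ls(5318.7, n = 21, k = 1)  # mass ~ 1
aic_sex        <- aic_ls(1907.7, n = 21, k = 2)  # mass ~ sex
aic_sex_height <- aic_ls(1116.5, n = 21, k = 3)  # mass ~ sex + height

round(c(aic_null, aic_sex, aic_sex_height), 3)
```

Each added term lowers the RSS, but it also raises k by the number of parameters it adds; AIC drops only when the improvement in fit outweighs that penalty.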
Question: What does the mass of human body depend on?
Sampling design: 21 randomly chosen inhabitants of České Budějovice.
Minimal adequate models – Forward selection
mass sex height colour vegetarian hours
95 M 185 blonde 1 3
96 M 165 blonde 0 2
91 M 178 blonde 1 1
82 M 186 blonde 0 2
87 M 196 black 1 4
75 M 178 black 1 6
81 M 186 black 0 2
84 M 187 black 0 6
95 M 196 red 1 1
100 M 201 red 0 8
69 M 169 red 0 12
52 F 156 blonde 0 1
58 F 168 blonde 0 8
62 F 178 blonde 0 5
61 F 168 blonde 1 6
45 F 155 black 0 4
55 F 164 black 1 3
71 F 181 black 0 1
83 F 185 red 1 2
62 F 175 red 0 4
64 F 171 red 1 2
Minimal adequate models – Forward selection
For each person, the following were recorded:
– body mass,
– sex,
– body height,
– hair colour,
– whether he/she is vegetarian,
– number of hours spent weekly on physical exercise.
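To run the selection steps, the table of 21 people first has to exist as a data frame called `BM`, the name used in the commands that follow. A minimal sketch that types the data in and performs the first step (the name `step1` is illustrative):

```r
BM <- data.frame(
  mass   = c(95, 96, 91, 82, 87, 75, 81, 84, 95, 100, 69,
             52, 58, 62, 61, 45, 55, 71, 83, 62, 64),
  sex    = factor(c(rep("M", 11), rep("F", 10))),
  height = c(185, 165, 178, 186, 196, 178, 186, 187, 196, 201, 169,
             156, 168, 178, 168, 155, 164, 181, 185, 175, 171),
  colour = factor(c(rep("blonde", 4), rep("black", 4), rep("red", 3),
                    rep("blonde", 4), rep("black", 3), rep("red", 3))),
  vegetarian = c(1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0,
                 0, 0, 0, 1, 0, 1, 0, 1, 0, 1),
  hours  = c(3, 2, 1, 2, 4, 6, 2, 6, 1, 8, 12,
             1, 8, 5, 6, 4, 3, 1, 2, 4, 2)
)

# Null model (intercept only) and the first round of single-term additions
lm.0  <- lm(mass ~ 1, data = BM)
step1 <- add1(lm.0, . ~ . + sex * height * colour * vegetarian * hours, test = "F")
print(step1)
```

`add1()` respects marginality, so although the scope contains all interactions, only the five main effects are offered at this first step.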
Start with null model:
> lm.0<-lm(mass~+1, data=BM)
> add1(lm.0, .~.+sex*height*colour*vegetarian*hours, test="F")
Single term additions
Model:
mass ~ +1
Df Sum of Sq RSS AIC F value Pr(F)
<none> 5318.7 118.224
sex 1 3410.9 1907.7 98.692 33.9710 1.295e-05 ***
height 1 3194.1 2124.6 100.953 28.5649 3.704e-05 ***
colour 2 191.1 5127.6 121.455 0.3354 0.7194
vegetarian 1 224.8 5093.9 119.317 0.8384 0.3713
hours 1 98.6 5220.0 119.830 0.3591 0.5561
next step
> lm.1<-update(lm.0, .~.+sex)
> add1(lm.1, .~.+sex*height*colour*vegetarian*hours, test="F")
Single term additions
Model:
mass ~ sex
Df Sum of Sq RSS AIC F value Pr(F)
<none> 1907.7 98.692
height 1 791.27 1116.5 89.441 12.7570 0.00218 **
colour 2 297.60 1610.1 99.131 1.5711 0.23655
vegetarian 1 139.13 1768.6 99.102 1.4160 0.24952
hours 1 289.43 1618.3 97.237 3.2193 0.08959 .
next step
> lm.2<-update(lm.1, .~.+height)
> add1(lm.2, .~.+sex*height*colour*vegetarian*hours, test="F")
Single term additions
Model:
mass ~ sex + height
Df Sum of Sq RSS AIC F value Pr(F)
<none> 1116.47 89.441
colour 2 245.787 870.68 88.220 2.2583 0.13681
vegetarian 1 45.693 1070.77 90.564 0.7254 0.40620
hours 1 192.420 924.05 87.469 3.5400 0.07714 .
sex:height 1 192.466 924.00 87.468 3.5410 0.07710 .
stop here (based on p) or include interaction (based on AIC)
Minimal adequate models – Forward selection
Analysis of final model
> anova(lm.0, lm.1, lm.2, test="F")
Analysis of Variance Table
Model 1: mass ~ +1
Model 2: mass ~ sex
Model 3: mass ~ sex + height
Res.Df RSS Df Sum of Sq F Pr(>F)
1 20 5318.7
2 19 1907.7 1 3410.9 54.992 7.094e-07 ***
3 18 1116.5 1 791.3 12.757 0.00218 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> summary(lm.2)
Call:
lm(formula = mass ~ sex + height, data = BM)
Residuals:
Min 1Q Median 3Q Max
-8.559 -5.865 -2.027 3.041 20.865
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -41.8184 28.9782 -1.443 0.166170
sexM 16.9264 4.1986 4.031 0.000783 ***
height 0.6062 0.1697 3.572 0.002180 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.876 on 18 degrees of freedom
Multiple R-squared: 0.7901, Adjusted R-squared: 0.7668
F-statistic: 33.87 on 2 and 18 DF, p-value: 7.914e-07
Minimal adequate models – Forward selection
Conclusion: Body mass is
significantly dependent on
sex and body height; these
predictors have additive
effects
Men are on average heavier
than women and the mass
increases with height
But be careful here
because sex and height
are related to each other!
Minimal adequate models – Forward selection
How to plot the figure
plot(mass~height, data=BM, type="n", ylab="Body mass", xlab="Body height")
### plots an empty plot, i.e. the axes (with appropriate ranges, so the data fit in;
### THIS IS IMPORTANT) and labels; this is specified by type="n"
points(mass~height, data=BM[BM$sex=="M",], pch=16)
### adds full points for males to the empty plot
points(mass~height, data=BM[BM$sex=="F",], pch=1)
### empty points for females
### generates x values to be later used by the predict function
males.pred<-data.frame(sex="M", height=150:205)
### range of the height predictor values for which the fitted values for males should be generated
females.pred<-data.frame(sex="F", height=150:205)
### same for females
### predicts y values based on the model
lines(150:205, predict(lm.2, newdata=males.pred))
### adds a solid line to the plot, corresponding to the regression fit for males
lines(150:205, predict(lm.2, newdata=females.pred), lty=2)
### adds a dashed line to the plot, corresponding to the regression fit for females
legend(x="bottomright", legend=c("Males", "Females"), pch=c(16,1), lty=c(1,2),
inset=0.05, bty="n")
### adds a legend to the plot
Overall conclusion
Statistics:
Numbers and formulas
– summary statistics – how big and variable are data
– hypothesis testing
– p – are the relationships larger than random?
– choose test based on data type
– data arrangement
Logic of discovery
– observation vs. experiment
– statistical vs. causal relationship
– avoid all possible bias
– random selection, proper control treatment
– enough replicates
Type of dependent variable / type of predictor

Dependent variable: Continuous (e.g. 0.3, 4, 7, 5.2 etc.)
– predictor: categories –> comparison of means
  – 2 groups: t-test (paired or not)
  – >2 groups: one-way ANOVA
  – 2 or more predictors: two/more-way ANOVA
– predictor: continuous –> linear relationship
  – 2 variables, one cause and one effect: simple regression
  – 2 variables, no cause / effect: Pearson correlation
  – >2 variables, more causes and one effect: multiple regression
– more predictors of both types: General linear models

Dependent variable: Ordinal (e.g. 1=little, 2=medium, 3=a lot)
– predictor: categories
  – 2 groups (not paired): Mann-Whitney test
  – 2 groups (paired): Wilcoxon test
  – >2 groups: Kruskal-Wallis test
– predictor: continuous
  – 2 variables: Spearman correlation

Dependent variable: Categories, frequencies or percentages (e.g. germinated: 18, not germinated: 32)
– 1 grouping variable: Goodness of fit (18 : 32)
– >1 grouping variable: Contingency table
     A   B
C   18  32
D   26  24
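The two frequency examples in the overview can be run with `chisq.test()`. This is a sketch under two assumptions that are not stated in the slides: that the goodness-of-fit test compares 18 : 32 against an expected 1 : 1 ratio, and that the contingency table is tested with R's default continuity correction.

```r
# Goodness of fit: 18 germinated vs. 32 not germinated,
# tested against equal expected frequencies (assumption)
gof <- chisq.test(c(germinated = 18, not_germinated = 32))

# Contingency table with two grouping variables (rows C/D, columns A/B)
tab <- matrix(c(18, 26, 32, 24), nrow = 2,
              dimnames = list(c("C", "D"), c("A", "B")))
ct <- chisq.test(tab)

print(gof)
print(ct)
```

A different expected ratio for the goodness-of-fit test can be passed through the `p` argument, for example `p = c(0.25, 0.75)`.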
Summary statistics
How big?
Mean, median...
How variable?
Variance, quartile range,
standard deviation,
coef. of variation...
How accurate estimate?
Standard error,
confidence interval
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
 
ESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptxESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptx
 
BREEDING METHODS FOR DISEASE RESISTANCE.pptx
BREEDING METHODS FOR DISEASE RESISTANCE.pptxBREEDING METHODS FOR DISEASE RESISTANCE.pptx
BREEDING METHODS FOR DISEASE RESISTANCE.pptx
 
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
 
DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...
DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...
DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...
 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
 
Orion Air Quality Monitoring Systems - CWS
Orion Air Quality Monitoring Systems - CWSOrion Air Quality Monitoring Systems - CWS
Orion Air Quality Monitoring Systems - CWS
 
Eukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptxEukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptx
 
Nucleophilic Addition of carbonyl compounds.pptx
Nucleophilic Addition of carbonyl  compounds.pptxNucleophilic Addition of carbonyl  compounds.pptx
Nucleophilic Addition of carbonyl compounds.pptx
 

linear models.pptx

  • 2. ANOVA with more predictors:
    summary(aov(seedlings ~ fertilization * management, data=seedl))
                             Df Sum Sq Mean Sq F value   Pr(>F)
    fertilization             1  108.3  108.30   9.404   0.0053 **
    management                2  400.5  200.23  17.386 2.15e-05 ***
    fertilization:management  2   34.2   17.10   1.485   0.2466
    Residuals                24  276.4   11.52
    Data (seedl):
    seedlings management fertilization
     9 abandoned fert
     4 abandoned fert
    12 abandoned fert
    12 abandoned fert
    11 abandoned fert
    11 abandoned unfert
     8 abandoned unfert
    16 abandoned unfert
    14 abandoned unfert
    12 abandoned unfert
    14 grazing fert
    14 grazing fert
    15 grazing fert
    24 grazing fert
    21 grazing fert
    25 grazing unfert
    14 grazing unfert
    20 grazing unfert
    20 grazing unfert
    19 grazing unfert
    10 mowing fert
     6 mowing fert
     5 mowing fert
     8 mowing fert
     8 mowing fert
    18 mowing unfert
    11 mowing unfert
    14 mowing unfert
    16 mowing unfert
    12 mowing unfert
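The table on this slide is enough to rebuild the analysis. The following is a minimal sketch, assuming the data frame is laid out exactly as listed (five replicates per management by fertilization cell); the object names `fit` and `tab` are illustrative and do not come from the slides.

```r
# Rebuild the `seedl` data frame from the table on the slide
# (assumption: 5 replicates per management x fertilization cell, in slide order).
seedl <- data.frame(
  seedlings = c( 9,  4, 12, 12, 11,   # abandoned, fert
                11,  8, 16, 14, 12,   # abandoned, unfert
                14, 14, 15, 24, 21,   # grazing, fert
                25, 14, 20, 20, 19,   # grazing, unfert
                10,  6,  5,  8,  8,   # mowing, fert
                18, 11, 14, 16, 12),  # mowing, unfert
  management    = factor(rep(c("abandoned", "grazing", "mowing"), each = 10)),
  fertilization = factor(rep(rep(c("fert", "unfert"), each = 5), times = 3))
)

# Two-way ANOVA with both main effects and their interaction
fit <- aov(seedlings ~ fertilization * management, data = seedl)
tab <- summary(fit)[[1]]   # ANOVA table as a data frame
print(tab)

# Cell means: roughly parallel profiles are consistent with a weak interaction
print(with(seedl, tapply(seedlings, list(management, fertilization), mean)))
```

Because the design is balanced, the sums of squares do not depend on the order in which the two factors enter the formula.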
  • 3. Interaction – test of additivity
    H0: the effect of a factor is not affected by the other factor
    – in plots: the lines connecting the means are parallel
    [Figure, first panel: mowed*fertilized, LS means; current effect: F(1, 16)=0.0000, p=1.0000; x-axis: fertilized (0, 1); y-axis: No. species; lines: mowed 0, mowed 1; vertical bars denote 0.95 confidence intervals]
    no interaction: additive effect – lines are parallel – the effect of mowing is the same regardless of fertilization
    [Figure, second panel: mowed*fertilized, LS means; current effect: F(1, 16)=18.000, p=.00062; x-axis: fertilized (0, 1); y-axis: No. species; lines: mowed 0, mowed 1; vertical bars denote 0.95 confidence intervals]
    interaction: non-additive effect – lines are not parallel – the effect of mowing is more pronounced in unfertilized plots
  • 4. Number of observations – should be balanced in all groups
    Mean mass of rats ~ diet + music
    Chocolate rats heavier, music no effect
    diet \ music   Rock  Folk
    Paper            15    15
    Chocolate        15    15
    Number of observations
    diet \ music   Rock  Folk
    Paper            20    40
    Chocolate        10    20
    diet \ music   Rock  Folk
    Paper            30    10
    Chocolate        10    30
    Chocolate rats heavier, Folk rats heavier
  • 5. Fixed vs. random effects of predictors
    – depends on how general your research question is
    – fixed effect: you are interested in differences between particular factor levels only (fertilized / unfertilized, breed1 / breed2 / breed3)
    – random effect: you want to generalize the results to all other possible levels (site1 / site2 / site3 / site4, breed1 / breed2 / breed3, sampling-unit identity)
    [Figure: layouts of fertilized (F) and unfertilized (U) plots, for a single site and for site 1 to site 4]
    Figure annotations:
    – your results are valid only for your site
    – fixed: fertilization; random: sites
    – you are not interested in the effect of site
    – you can generalize to all (comparable) sites
  • 6. Fixed vs. random effects of predictors
    – several computational approaches
    – simple: use of different MS in the F test (table below)
    – complicated: advanced fitting in R
    Effect tested       A fixed, B fixed   A random, B random   A fixed, B random
    Factor A            MSA/MSe            MSA/MSAB             MSA/MSAB
    Factor B            MSB/MSe            MSB/MSAB             MSB/MSe
    A x B interaction   MSAB/MSe           MSAB/MSe             MSAB/MSe
    [Figure: layouts of fertilized (F) and unfertilized (U) plots, for a single site and for site 1 to site 4]
    Figure annotations:
    – your results are valid only for your site
    – fixed: fertilization; random: sites
    – you are not interested in the effect of site
    – you can generalize to all (comparable) sites
  • 7. Hierarchical (nested) design
    – not all combinations of factor levels are available
    – the factor with more levels is "nested in" the factor with fewer levels
    [Figure: 2 bedrocks; granite: site 1, site 2, site 3; limestone: site 4, site 5, site 6; each site with fertilized (F) and unfertilized (U) plots]
    Factors:
    – fertilization
    – sites (nested in bedrock) – it is impossible to have more bedrocks on one site
    – bedrock
    Fbedrock = MSbedrock / MSsite
  • 8. R – it is fairly complicated to fit proper models with nested factors, random effects and interactions
    packages: lme4 (simpler) or nlme (advanced)
    https://www.jaredknowles.com/journal/2013/11/25/getting-started-with-mixed-effect-models-in-r
    [Figure: 2 bedrocks; granite: site 1, site 2, site 3; limestone: site 4, site 5, site 6; each site with fertilized (F) and unfertilized (U) plots]
    Factors:
    – fertilization (fixed)
    – sites (random, nested in bedrock) – it is impossible to have more bedrocks on one site
    – bedrock (fixed)
    – fertilization:bedrock (the interaction of fixed factors is interesting, the others usually are not)
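One way the design on this slide could be written for lme4 is sketched below. Everything in it is an assumption made for illustration: the data are simulated, the column names (`plot`, `site`, `bedrock`, `fertilization`, `seedlings`) and effect sizes are invented, and the model (fixed fertilization, bedrock and their interaction, plus a random intercept per site) is only one reasonable reading of the slide. The `lmer` call runs only if the lme4 package is installed.

```r
# Sketch only: simulated data with hypothetical column names and effect sizes.
set.seed(1)
d <- expand.grid(plot = 1:8, site = factor(paste0("site", 1:6)))

# Sites 1-3 lie on granite, sites 4-6 on limestone (sites are nested in bedrock)
d$bedrock <- factor(ifelse(d$site %in% paste0("site", 1:3), "granite", "limestone"))

# Alternate fertilized (F) and unfertilized (U) plots within every site
d$fertilization <- factor(rep(c("F", "U"), length.out = nrow(d)))

# Response = overall level + fertilization effect + random site effect + noise
site_effect <- rnorm(6, sd = 2)[as.integer(d$site)]
d$seedlings <- 10 + 2 * (d$fertilization == "F") + site_effect + rnorm(nrow(d))

if (requireNamespace("lme4", quietly = TRUE)) {
  # Fixed: fertilization, bedrock and their interaction; random: site intercepts
  m <- lme4::lmer(seedlings ~ fertilization * bedrock + (1 | site), data = d)
  print(summary(m))
}
```

Because each site label occurs under only one bedrock, the nesting is already encoded in the data and `(1 | site)` is sufficient. With only six sites the site variance is estimated imprecisely, so lme4 may report a singular-fit message.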
  • 9. Multiple regression – main effects of two predictors:
    summary(lm(seedlings ~ productivity + temperature, data=seedl))
                  Estimate Std. Error t value Pr(>|t|)
    (Intercept)  21.286421   7.220427   2.948  0.00652 **
    productivity -0.017128   0.007928  -2.160  0.03977 *
    temperature   0.245520   0.681770   0.360  0.72156
    > summary(lm(seedlings~productivity,data=seedl))
                  Estimate Std. Error t value Pr(>|t|)
    (Intercept)  23.429505   4.024972   5.821 2.97e-06 ***
    productivity -0.018256   0.007169  -2.547   0.0167 *
    > summary(lm(seedlings~temperature,data=seedl))
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   8.2922     4.2487   1.952    0.061 .
    temperature   0.8274     0.6661   1.242    0.224
    Data (seedl):
    seedlings productivity temperature
     9 589 7.2
     4 674 4.5
    12 484 7.4
    12 504 5.4
    11 484 5
    11 572 6
     8 411 6.2
    16 374 7.6
    14 353 8.4
    12 406 4.7
    14 759 6.8
    14 789 4.6
    15 689 4.8
    24 611 7
    21 456 5.8
    25 386 7.6
    14 538 4.8
    20 350 9
    20 413 4.8
    19 599 6.5
    10 558 6.4
     6 752 4.2
     5 687 6
     8 667 4.4
     8 662 7.6
    18 479 4.8
    11 561 9
    14 450 8.4
    16 578 4.7
    12 592 6.8
  • 10. Multiple regression – main effects of two predictors:
    summary(lm(seedlings ~ productivity + temperature, data=seedl))
    (data as on slide 9)
                  Estimate Std. Error t value Pr(>|t|)
    (Intercept)  21.286421   7.220427   2.948  0.00652 **
    productivity -0.017128   0.007928  -2.160  0.03977 *
    temperature   0.245520   0.681770   0.360  0.72156
    Equation: $y = b_0 + b_1 x_1 + b_2 x_2 + \dots$ (a plane)
    Interaction: $y = b_0 + b_1 x_1 + b_2 x_2 + b_{1,2} x_1 x_2$ – a curved surface (different slope in the front and in the back)
  • 11. Multiple regression – main effects of two predictors:
    summary(lm(seedlings ~ productivity + temperature, data=seedl))
                  Estimate Std. Error t value Pr(>|t|)
    (Intercept)  21.286421   7.220427   2.948  0.00652 **
    productivity -0.017128   0.007928  -2.160  0.03977 *
    temperature   0.245520   0.681770   0.360  0.72156
    ...
    Multiple R-squared: 0.192, Adjusted R-squared: 0.132
    F-statistic: 3.207 on 2 and 27 DF, p-value: 0.0563    (ANOVA test of the whole model)
    ANOVA tests of each predictor – mind the order of predictors: simple vs. partial effects (in addition to the previous predictors)
    > anova(lm(seedlings~temperature+productivity,data=seedl))
    Analysis of Variance Table
    Response: seedlings
                 Df Sum Sq Mean Sq F value  Pr(>F)
    temperature   1  42.80  42.800  1.7454 0.19755
    productivity  1 114.46 114.464  4.6678 0.03977 *
    Residuals    27 662.10  24.522
    > anova(lm(seedlings~productivity+temperature,data=seedl))
    Analysis of Variance Table
    Response: seedlings
                 Df Sum Sq Mean Sq F value  Pr(>F)
    productivity  1 154.08 154.084  6.2834 0.01851 *
    temperature   1   3.18   3.180  0.1297 0.72156
    Residuals    27 662.10  24.522
    Explained variation: (43+114)/(43+114+662) = 0.192
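As a worked check using only the numbers printed on this slide: the sequential sums of squares change with the order of the predictors, but their total, and therefore the explained variation and the whole-model F, does not.

$$
R^2 = \frac{42.80 + 114.46}{42.80 + 114.46 + 662.10} = \frac{154.08 + 3.18}{154.08 + 3.18 + 662.10} = \frac{157.26}{819.36} \approx 0.192
$$

$$
F_{2,27} = \frac{157.26 / 2}{662.10 / 27} = \frac{78.63}{24.522} \approx 3.207
$$

Only the split of the 157.26 between the two predictors depends on the order: the predictor entered second is tested after accounting for the first (its partial effect).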
  • 12. General linear models – ANOVA and Regression are equivalent – same idea of testing variability explained by a model – fitting model by least squares
  • 13. Variance: mean of squared differences from the mean
    – get the differences
    – square them
    [Figure: individual values, group means and the overall mean]
    – difference from the overall mean; square of the difference = TOTAL square
    – difference from the group mean; square of the difference = ERROR square
    – difference of the group mean from the overall mean; square of the difference = GROUP square
  • 14. Sums of squares in regression
    Total square: $SS_{TOT} = \sum_i (Y_i - \bar{Y})^2$
    Regression square: $SS_{REG} = \sum_i (\hat{Y}_i - \bar{Y})^2$
    Error square: $SS_e = \sum_i (Y_i - \hat{Y}_i)^2$ (this square is minimized)
    $Y_i$ – individual values of Y
    $\bar{Y}$ – mean of Y
    $\hat{Y}_i$ – individual fitted values of Y (values of Y calculated as Y = a + bx)
    [Figure: fitted values and mean(Y)]
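The three sums of squares can be computed directly from a fitted model. The sketch below uses a small made-up data set (the `x` and `y` values are illustrative, not from the slides) and checks that the total square splits into the regression and error squares.

```r
# Illustrative data only (not from the slides)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)

fit   <- lm(y ~ x)     # least-squares fit: minimizes the error sum of squares
y_hat <- fitted(fit)   # fitted values a + b * x

ss_tot <- sum((y - mean(y))^2)       # total square
ss_reg <- sum((y_hat - mean(y))^2)   # regression square
ss_e   <- sum((y - y_hat)^2)         # error square

print(c(total = ss_tot, regression = ss_reg, error = ss_e))
print(all.equal(ss_tot, ss_reg + ss_e))   # the decomposition holds
print(ss_reg / ss_tot)                    # R-squared = explained share of variation
```

For these five points the fitted slope is 0.6, so the total square of 6 splits into 3.6 (regression) and 2.4 (error).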
  • 15. General linear models
    – ANOVA and regression are equivalent
    – same idea of testing the variability explained by a model
    – fitting the model by least squares
    –> both types of predictors (numeric, factor) can be combined
    – you can use any wild combination of predictor types, interactions, nestedness, random effects...
    – one more semester: (P. Šmilauer: Modern Regression Methods, KBE/785E)
    – simplest case – analysis of covariance
      – 1 numeric predictor
      – 1 categorical predictor
      – no interaction
      – model – parallel lines
  • 16. General linear models: analysis of covariance
    – 1 numeric predictor
    – 1 categorical predictor
    – no interaction
    – model – parallel lines
    > anova(lm(seedlings~productivity+management,data=seedl))
                 Df Sum Sq Mean Sq F value    Pr(>F)
    productivity  1 154.08 154.084  19.843 0.0001418 ***
    management    2 463.39 231.693  29.837 1.852e-07 ***
    Residuals    26 201.90   7.765
    Data (seedl):
    seedlings management productivity
     9 abandoned 589
     4 abandoned 674
    12 abandoned 484
    12 abandoned 504
    11 abandoned 484
    11 abandoned 572
     8 abandoned 411
    16 abandoned 374
    14 abandoned 353
    12 abandoned 406
    14 grazing 759
    14 grazing 789
    15 grazing 689
    24 grazing 611
    21 grazing 456
    25 grazing 386
    14 grazing 538
    20 grazing 350
    20 grazing 413
    19 grazing 599
    10 mowing 558
     6 mowing 752
     5 mowing 687
     8 mowing 667
     8 mowing 662
    18 mowing 479
    11 mowing 561
    14 mowing 450
    16 mowing 578
    12 mowing 592
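Written out, the no-interaction model behind the "parallel lines" description is the standard analysis-of-covariance form shown below. The symbols are notation introduced here, not taken from the slides: $i$ indexes the management level, $j$ the plot within that level, $\alpha_i$ is the offset of management level $i$, and $b_1$ is the common slope.

$$
\text{seedlings}_{ij} = b_0 + \alpha_i + b_1 \, \text{productivity}_{ij} + \varepsilon_{ij}
$$

Each management level gets its own intercept $b_0 + \alpha_i$, while the slope $b_1$ is shared, which is why the fitted lines are parallel. Adding the productivity:management interaction would replace $b_1$ with a level-specific slope $b_{1i}$.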
  • 17. Analyzing many predictors:
    If you have too many predictors (e.g. measures of everything in a field observation), do not include everything in your model!
    –> fit a minimal adequate model
    – backward selection – include everything in the first model, then remove all non-significant terms
    – forward selection – start with the null model and add individual terms
      – one by one (due to collinearity)
      – based on p-value or AIC
    – analyze the final model
  • 18. AIC - Akaike information criterion
    AIC = 2k + n log(SSE / n) + C
    k – number of model parameters (i.e. model df)
    n – number of observations
    SSE – residual sum of squares (RSS)
    C – constant (can be ignored)
    Quantifies the information accounted for by a predictor
    – lower AIC suggests a better fit; absolute values of AIC are not informative
    – allows comparisons between models with different numbers of df
    – penalizes complicated models
    Can be combined with an F-test of significance
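For least-squares models, R's `extractAIC()`, which `add1()` relies on, computes `n * log(RSS / n) + 2 * k`. The sketch below applies that expression to RSS values printed in the forward-selection output on the later slides (n = 21 people); the helper name `aic_ls` is hypothetical.

```r
# AIC of a least-squares model up to an additive constant, as used by extractAIC().
# rss: residual sum of squares, n: number of observations,
# k: number of estimated coefficients (intercept included).
aic_ls <- function(rss, n, k) n * log(rss / n) + 2 * k

n <- 21  # sample size in the body-mass example

print(aic_ls(5318.7, n, 1))  # null model mass ~ 1      (slide reports 118.224)
print(aic_ls(1907.7, n, 2))  # mass ~ sex               (slide reports  98.692)
print(aic_ls(2124.6, n, 2))  # mass ~ height            (slide reports 100.953)
```

The small differences from the printed values come only from the rounding of the RSS on the slides.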
  • 19. Minimal adequate models – Forward selection
    Question: What does the mass of the human body depend on?
    Sampling design: 21 randomly chosen inhabitants of České Budějovice.
    Variables: body mass (mass), sex (sex), body height (height), hair colour (colour), vegetarian (vegetarian), hours spent weekly on physical exercise (hours)
    mass sex height colour vegetarian hours
     95 M 185 blonde 1  3
     96 M 165 blonde 0  2
     91 M 178 blonde 1  1
     82 M 186 blonde 0  2
     87 M 196 black  1  4
     75 M 178 black  1  6
     81 M 186 black  0  2
     84 M 187 black  0  6
     95 M 196 red    1  1
    100 M 201 red    0  8
     69 M 169 red    0 12
     52 F 156 blonde 0  1
     58 F 168 blonde 0  8
     62 F 178 blonde 0  5
     61 F 168 blonde 1  6
     45 F 155 black  0  4
     55 F 164 black  1  3
     71 F 181 black  0  1
     83 F 185 red    1  2
     62 F 175 red    0  4
     64 F 171 red    1  2
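To follow the forward selection on the next slides, the table can be typed in as a data frame. This is a sketch assuming the rows are exactly as listed and that `vegetarian` is stored as a 0/1 number; the name `BM` matches the later slides, while `a1` is an illustrative name.

```r
# Body-mass data transcribed from the slide (11 men followed by 10 women)
BM <- data.frame(
  mass   = c(95, 96, 91, 82, 87, 75, 81, 84, 95, 100, 69,
             52, 58, 62, 61, 45, 55, 71, 83, 62, 64),
  sex    = factor(rep(c("M", "F"), times = c(11, 10))),
  height = c(185, 165, 178, 186, 196, 178, 186, 187, 196, 201, 169,
             156, 168, 178, 168, 155, 164, 181, 185, 175, 171),
  colour = factor(c(rep("blonde", 4), rep("black", 4), rep("red", 3),
                    rep("blonde", 4), rep("black", 3), rep("red", 3))),
  vegetarian = c(1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0,
                 0, 0, 0, 1, 0, 1, 0, 1, 0, 1),
  hours  = c(3, 2, 1, 2, 4, 6, 2, 6, 1, 8, 12,
             1, 8, 5, 6, 4, 3, 1, 2, 4, 2)
)

# Null model (intercept only) and the first forward-selection step
lm.0 <- lm(mass ~ 1, data = BM)
a1 <- add1(lm.0, . ~ . + sex * height * colour * vegetarian * hours, test = "F")
print(a1)
```

The `<none>` row of `a1` holds the residual sum of squares of the null model; each other row shows the RSS and AIC that would result from adding that single term.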
  • 20. Minimal adequate models – Forward selection
    Question: What does the mass of the human body depend on?
    Sampling design: 21 randomly chosen inhabitants of České Budějovice.
    For each person the following was recorded:
    – body mass,
    – sex,
    – body height,
    – hair colour,
    – whether he/she is vegetarian,
    – number of hours spent weekly on physical exercise.
    Start with the null model:
    > lm.0<-lm(mass~+1, data=BM)
    > add1(lm.0, .~.+sex*height*colour*vegetarian*hours, test="F")
    Single term additions
    Model: mass ~ +1
               Df Sum of Sq    RSS     AIC F value     Pr(F)
    <none>                  5318.7 118.224
    sex         1    3410.9 1907.7  98.692 33.9710 1.295e-05 ***
    height      1    3194.1 2124.6 100.953 28.5649 3.704e-05 ***
    colour      2     191.1 5127.6 121.455  0.3354    0.7194
    vegetarian  1     224.8 5093.9 119.317  0.8384    0.3713
    hours       1      98.6 5220.0 119.830  0.3591    0.5561
  • 21. Minimal adequate models – Forward selection
    next step
    > lm.1<-update(lm.0, .~.+sex)
    > add1(lm.1, .~.+sex*height*colour*vegetarian*hours, test="F")
    Single term additions
    Model: mass ~ sex
               Df Sum of Sq    RSS    AIC F value   Pr(F)
    <none>                  1907.7 98.692
    height      1    791.27 1116.5 89.441 12.7570 0.00218 **
    colour      2    297.60 1610.1 99.131  1.5711 0.23655
    vegetarian  1    139.13 1768.6 99.102  1.4160 0.24952
    hours       1    289.43 1618.3 97.237  3.2193 0.08959 .
    next step
    > lm.2<-update(lm.1, .~.+height)
    > add1(lm.2, .~.+sex*height*colour*vegetarian*hours, test="F")
    Single term additions
    Model: mass ~ sex + height
               Df Sum of Sq     RSS    AIC F value   Pr(F)
    <none>                  1116.47 89.441
    colour      2   245.787  870.68 88.220  2.2583 0.13681
    vegetarian  1    45.693 1070.77 90.564  0.7254 0.40620
    hours       1   192.420  924.05 87.469  3.5400 0.07714 .
    sex:height  1   192.466  924.00 87.468  3.5410 0.07710 .
    stop here (based on p) or include the interaction (based on AIC)
  • 22. Minimal adequate models – Forward selection
    Analysis of the final model
    > anova(lm.0, lm.1, lm.2, test="F")
    Analysis of Variance Table
    Model 1: mass ~ +1
    Model 2: mass ~ sex
    Model 3: mass ~ sex + height
      Res.Df    RSS Df Sum of Sq      F    Pr(>F)
    1     20 5318.7
    2     19 1907.7  1    3410.9 54.992 7.094e-07 ***
    3     18 1116.5  1     791.3 12.757   0.00218 **
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    > summary(lm.2)
    Call: lm(formula = mass ~ sex + height, data = BM)
    Residuals:
       Min     1Q Median    3Q    Max
    -8.559 -5.865 -2.027 3.041 20.865
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) -41.8184    28.9782  -1.443 0.166170
    sexM         16.9264     4.1986   4.031 0.000783 ***
    height        0.6062     0.1697   3.572 0.002180 **
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    Residual standard error: 7.876 on 18 degrees of freedom
    Multiple R-squared: 0.7901, Adjusted R-squared: 0.7668
    F-statistic: 33.87 on 2 and 18 DF, p-value: 7.914e-07
  • 23. Minimal adequate models – Forward selection
    Conclusion: Body mass depends significantly on sex and body height; these predictors have additive effects.
    Men are on average heavier than women, and mass increases with height.
    But be careful here, because sex and height are related to each other!
  • 24. How to plot the figure
    ### plot an empty plot, i.e. the axes (with ranges wide enough for the data to fit in; THIS IS IMPORTANT) and labels; type="n" suppresses the points
    plot(mass~height, data=BM, type="n", ylab="Body mass", xlab="Body height")
    ### add full points for males to the empty plot
    points(mass~height, data=BM[BM$sex=="M",], pch=16)
    ### add empty points for females
    points(mass~height, data=BM[BM$sex=="F",], pch=1)
    ### generate the x values to be used later by the predict function
    males.pred<-data.frame(sex="M", height=150:205) ### range of height values for which the fitted values for males are generated
    females.pred<-data.frame(sex="F", height=150:205) ### same for females
    ### predict y values based on the model and draw them
    lines(150:205, predict(lm.2, newdata=males.pred)) ### solid line: regression fit for males
    lines(150:205, predict(lm.2, newdata=females.pred), lty=2) ### dashed line: regression fit for females
    legend(x="bottomright", legend=c("Males", "Females"), pch=c(16,1), lty=c(1,2), inset=0.05, bty="n") ### add a legend to the plot
  • 25. Overall conclusion
    Statistics: numbers and formulas
    – summary statistics – how big and how variable are the data
    – hypothesis testing – p – are the relationships larger than random?
    – choose the test based on data type
    – data arrangement
    Logic of discovery
    – observation vs. experiment
    – statistical vs. causal relationship
    – avoid all possible bias – random selection, proper control treatment
    – enough replicates
  • 26. Choosing a test by type of dependent variable and type of predictor
    Types of dependent variable:
    – continuous (e.g. 0.3, 4, 7, 5.2 etc.)
    – ordinal (e.g. 1=little, 2=medium, 3=a lot)
    – categories: frequencies or percentages (e.g. germinated: 18, not germinated: 32)
    Predictor: categories –> comparison of means
    – continuous dependent variable:
      – 2 groups: t-test (paired or not)
      – >2 groups: one-way ANOVA
      – 2 or more predictors: two/more-way ANOVA
    – ordinal dependent variable:
      – 2 groups (not paired): Mann-Whitney test
      – 2 groups (paired): Wilcoxon test
      – >2 groups: Kruskal-Wallis test
    – categorical dependent variable (frequencies):
      – 1 grouping variable: goodness of fit (18 : 32)
      – >1 grouping variable: contingency table
             A   B
        C   18  32
        D   26  24
    Predictor: continuous –> linear relationship
    – continuous dependent variable:
      – 2 variables, one cause and one effect: simple regression
      – 2 variables, no cause / effect: Pearson correlation
      – >2 variables, more causes and one effect: multiple regression
    – ordinal dependent variable:
      – 2 variables: Spearman correlation
    Predictors of both types
    – more predictors of both types: general linear models
    Summary statistics
    – How big? Mean, median...
    – How variable? Variance, quartile range, standard deviation, coefficient of variation...
    – How accurate is the estimate? Standard error, confidence interval