GENERALIZED LINEAR
MODELS
Ph.D Programme in Psychology, Linguistics and Cognitive
Neurosciences
Ph.D. School - University of Milano-Bicocca Prof. Franca Crippa
https://www.vox.com/future-perfect/21504366/science-replication-crisis-peer-review-statistics
PLAN OF THE LESSON
• Part I
  - Icebreakers: review of the General Linear Model
• Part II
  - The Generalized Linear Model: extension to data that are not normally distributed
    - fractions (logistic regression),
    - counts (Poisson regression, log-linear models),
    - ordinal data (threshold models).
  - Overview of specific topics (overdispersion, (quasi-)maximum likelihood)
• Part III
  - Overview of software for GLIMs: SPSS and R; jamovi and JASP (both user-friendly and based on R). R is still the Linus’ blanket, for its breadth and its up-to-date modelling tools, even if it is a bit rough and not fluffy.
PART I
Icebreaker on the General Linear Model
GENERAL LINEAR MODELS AS MODELS
• Our idea is that the data are generated as specified in our model, plus a random error
• DATA = MODEL + ERROR
• Very general form of the model:
  $Y = f(X_1, X_2, X_3) + \varepsilon$
• Linear models are models in which $f$ is linear in the parameters:
  $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon$
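The idea DATA = MODEL + ERROR can be sketched with a small simulation in R. This is only an illustration: the sample size, the "true" coefficients and the error standard deviation are hypothetical values, not taken from the lesson.

```r
# Simulate data from a linear model and recover the coefficients with lm().
# All numbers below are hypothetical values chosen for illustration.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
b  <- c(1, 2, -1, 0.5)          # assumed "true" beta0, beta1, beta2, beta3

model_part <- b[1] + b[2] * x1 + b[3] * x2 + b[4] * x3   # MODEL
error_part <- rnorm(n, mean = 0, sd = 1)                 # ERROR
y <- model_part + error_part                             # DATA

fit <- lm(y ~ x1 + x2 + x3)     # ordinary least squares fit
estimates <- coef(fit)
print(round(estimates, 2))      # should be close to the assumed values in b
```

With a larger n or a smaller error standard deviation the estimates get closer to the values used to generate the data.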
Beware: notation may vary from one author to another, from one professor to another, from one journal to another.
Therefore:
• focus on the meaning of the symbol;
• pay attention to the requirements of the journal
HOW DO WE MODEL DATA?
• Objective
• Model structure (e.g. variables, formula, equation)
• Model assumptions
• Parameter estimates and interpretation
• Model fit (e.g. goodness-of-fit tests and statistics)
• Model selection
PSYCHOLOGISTS’ STATISTICAL WORKHORSE:
THE GENERAL LINEAR MODEL
Response: quantitative, continuous
One or more between-subjects predictors:
• Quantitative predictors: linear regression (simple or multiple)
• Categorical (qualitative) predictors: ANOVA
• Quantitative and categorical predictors: ANCOVA
At least one within-subjects predictor
PSYCHOLOGISTS’ STATISTICAL WORKHORSE:
THE GENERAL LINEAR MODEL
Response: quantitative, continuous
One or more dichotomous or continuous between-subjects predictors
One predictor
• Independent-samples t-test
• Simple regression
Two or more predictors
• Multiple regression
• Statistical control (covariates)
• Mediation
Two or more predictors plus interaction
• Interactions
• Moderated mediation
• Other types of linear model (polynomial)
THE GENERAL LINEAR MODEL
At least one within-subjects predictor
• One or more categorical within-subjects predictors: paired-samples t-test, within-subjects ANOVA
• At least one continuous within-subjects predictor: Linear Mixed-Effects Models (LMEM), with an additional random term
GENERAL LINEAR MODEL: AN OUTLOOK ON THE ASSUMPTIONS
Predictors:
1. on any scale: categorical or quantitative
2. measured without error (deterministic); the random component is expressed by the error $\varepsilon$
Response variable: (continuous) quantitative only
• errors are iid and normally distributed. For all subjects i = 1, 2, ..., n the errors $\varepsilon_i$ are:
i) identically, normally distributed with zero mean and equal variance (homoscedasticity)
ii) uncorrelated (independent)
Objective:
the response $y_i$, i = 1, ..., n, is modelled by a linear (additive) function of the predictors/explanatory variables $x_j$, j = 1, ..., p, plus an error term
The model is linear in the parameters
The general linear model makes the assumptions below. When these assumptions are met, OLS regression coefficients are MVUE (Minimum Variance Unbiased Estimators) and BLUE (Best Linear Unbiased Estimators).
1. Exact X: the IVs are assumed to be known exactly (i.e., without measurement error), deterministic.
2. Independence: residuals are independently distributed (the probability of obtaining a specific observation does not depend on other observations)
3. Normality: all residual distributions are normally distributed
4. Constant variance: all residual distributions have a constant variance
5. Linearity: all residual distributions (i.e., for each Y') are assumed to have means equal to zero
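A minimal sketch of how some of these assumptions might be inspected in R after fitting a model. The data are simulated with hypothetical values, and the checks shown (Shapiro-Wilk test, correlation between absolute residuals and fitted values) are only one possible choice of diagnostics, not the lesson's prescribed procedure.

```r
# Hypothetical data generated so that the assumptions hold
set.seed(2)
n <- 150
x <- runif(n, 0, 10)
y <- 3 + 0.8 * x + rnorm(n, sd = 2)

fit <- lm(y ~ x)
res <- residuals(fit)

mean_res     <- mean(res)                  # residual mean (zero when an intercept is fitted)
normality    <- shapiro.test(res)          # Shapiro-Wilk test of normality of residuals
spread_check <- cor(abs(res), fitted(fit)) # rough check: does the spread grow with the fit?

print(mean_res)
print(normality$p.value)
print(spread_check)
# In an interactive session, plot(fit) shows the standard diagnostic plots.
```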
ESTIMATION WHEN ASSUMPTIONS ARE MET
PARAMETER INTERPRETATION
Regression:
• b0 is the estimate of the intercept $\beta_0$
• bi is the estimate of the slope $\beta_i$, i.e. the increase in the response due to a unit increase in the i-th predictor, holding the other predictors constant
ANOVA:
• General mean
• Difference between the group mean and the general mean
Model selection: which explanatory variables to include?
• Principle of parsimony (Occam’s razor): all relevant predictors are included, no irrelevant one is.
NOT ALL MISSING DATA ARE THE SAME
Missing by design
Values are missing by definition of the population of interest
Missing completely at random (MCAR)
Missing values are randomly distributed
Missing at random (MAR)
After accounting for one or more other variables, missing values are
randomly distributed
Non-ignorable (NI)
Missing values are functions of the variables themselves
BETTER METHODS OF HANDLING MISSING DATA
Full information maximum likelihood (FIML) methods
• Valid when data are MCAR or MAR; NI data additionally require a model of the missingness mechanism
• Implemented as part of particular statistical models
• Missing data are handled during the analysis
Multiple imputation
• Also valid when data are MCAR or MAR, with the same caveat for NI data
• Simulation-based approach
• Missing data are handled separately from the analysis
RESTRICTIONS OF GENERAL LINEAR MODELS
Although they are a very useful framework, there are some situations where general linear models are not appropriate
1. The range of Y is restricted
• categorical variables (binary, ordered or unordered categories), counts
2. Other violations of assumptions
• Heteroscedasticity
• Non-normality
• Non-linearity (in the IVs and/or in the parameters)
• Variance depending on the mean
Anscombe’s quartet
MESSY DATA
Anscombe, Francis J. (1973) Graphs in statistical analysis. American
Statistician, 27, 17–21
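Anscombe's four data sets are distributed with base R as the data frame `anscombe` (columns `x1`..`x4`, `y1`..`y4`). The sketch below computes the same summary statistics for each set; the point of the quartet is that these numbers are nearly identical although the scatterplots look completely different.

```r
# anscombe ships with base R (package 'datasets')
data(anscombe)

summaries <- sapply(1:4, function(k) {
  x <- anscombe[[paste0("x", k)]]
  y <- anscombe[[paste0("y", k)]]
  fit <- lm(y ~ x)
  c(mean_x    = mean(x),
    mean_y    = mean(y),
    r         = cor(x, y),
    intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2]))
})
colnames(summaries) <- paste("set", 1:4)
print(round(summaries, 2))

# To see why plotting matters, in an interactive session:
# par(mfrow = c(2, 2)); for (k in 1:4) plot(anscombe[[k]], anscombe[[k + 4]])
```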
A GLANCE AT DECISION AND POWER (NEXT LESSON)
Researcher concludes | Reality: NO EFFECT | Reality: EFFECT EXISTS
FAIL TO REJECT (FTR) NULL; NO EFFECT | CORRECT FTR | TYPE 2 ERROR ($\beta$)
REJECT NULL; EFFECT EXISTS | TYPE 1 ERROR ($\alpha$) | CORRECT REJECT ($1-\beta$)
PART II
Generalised Linear Models (GLIMs)
EXTENSION TO GENERALIZED LINEAR MODELS (GLIM OR GLM)
GLIMs are a family of models that:
• Extend linear regression to a broader family of outcome variables, while keeping the basic structure of linear regression equations.
• Allow us to extend the linear modelling framework to variables that are not normally distributed.
• Allow us to look at models that seem different from a unifying perspective.
Two major additions to the linear function framework:
• link function: when the response has a nonlinear relationship with the predictors, a transformation of the response is expressed as a linear function of the predictors
• error structures beyond the normal, for instance binomial or Poisson.
THREE COMPONENTS OF A GLIM
1. Systematic part: the relation between the dependent variable Y and the independent variables in the model.
2. Random part: the error distribution of the outcome variable
3. Link function: a transform of the response, chosen so that the transform is expressed by the well-known linear relation
g(⋅) is the link function (identity, logit, log, ...)
How to approach GLIMs:
• Understanding the common underlying linear structure
• Exploring the reasons for the different estimation techniques
GLIM FIXED MODELS WITH RESPONSES AND PREDICTORS OF ANY TYPE
Predictors measured on any scale.
Response:
• continuous: General linear model
• dichotomous: Logit (accuracy, yes or no)
• categorical: Logistic (ordinal or nominal categorical)
• count: Poisson regression (count variables, frequencies)
GLIM AS A SOLUTION FOR SOME VIOLATIONS OF GENERAL LINEAR MODELS ASSUMPTIONS
(Independence: inaccurate standard errors, degrees of freedom and significance tests. Use linear mixed-effects models; see my colleague’s lessons)
Normality: inefficient (with large N). Use transformations, generalized linear models
Constant variance: inefficient and inaccurate standard errors. Use transformations, generalized linear models
Linearity: biased parameter estimates. Use transformations, generalized linear models
DISTRIBUTION OF ERRORS IN PROBIT AND LOGIT
MODELS
Family Default Link Function
binomial (link = "logit")
gaussian (link = "identity")
Gamma (link = "inverse")
inverse.gaussian (link = "1/mu^2")
poisson (link = "log")
quasibinomial (link = "logit")
quasipoisson (link = "log")
glm(formula, family=familytype(link=linkfunction), data=)
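As a brief illustration of the table above, each family in R is a function returning an object that stores its default link and the corresponding transformations. The sketch below inspects a few of them; the values 0.5 and 0 are arbitrary inputs chosen for illustration.

```r
# A family object bundles the link function, its inverse and the variance function
fam <- binomial(link = "logit")

logit_half <- fam$linkfun(0.5)   # logit(0.5) = ln(0.5 / 0.5)
prob_zero  <- fam$linkinv(0)     # inverse logit of 0

# Default links of the other families listed in the table
default_links <- c(
  binomial         = binomial()$link,
  gaussian         = gaussian()$link,
  Gamma            = Gamma()$link,
  inverse.gaussian = inverse.gaussian()$link,
  poisson          = poisson()$link,
  quasibinomial    = quasibinomial()$link,
  quasipoisson     = quasipoisson()$link
)
print(logit_half)
print(prob_zero)
print(default_links)
```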
LINK FUNCTION AND ERROR DISTRIBUTIONS
Model; Error distribution; Link function
Regression; Normal; $g = E(Y|X)$
Binary logistic regression; Binomial; $g = \ln\frac{p}{1-p}$
Ordinal logistic regression; Binomial; $g = \ln\frac{p}{1-p}$
Multinomial logistic regression; Multinomial; $g = \ln\frac{p}{1-p}$
Poisson regression; Poisson; $g = \ln[E(Y|X)]$
Beta regression; Beta; $g = \ln[E(Y|X)]$
Gamma regression; Gamma; $g = \ln[E(Y|X)]$
Negative binomial regression; Negative binomial; $g = \ln[E(Y|X)]$
FIRST CASE: MODELLING THE PROBABILITY FOR A DICHOTOMOUS VARIABLE
Family and default link function: 1. binomial (link = "logit")
For a binomial variable (the random component is the error, which is binomial) our interest is in the probability of ‘success’. We have two outcomes, success and failure (1 and 0). When we know the probability of success p, we derive the probability of failure as 1 − p.
Our response would be the probability, with range 0-1, so we cannot use the General Linear Model. How do we solve the problem?
- We transform the response. Instead of the probability, we consider the logit $g = \ln\frac{p}{1-p}$. The symbol g stands for our ‘transformed’ response. This transformed response is now continuous and unbounded.
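The logit transform can be tried out directly in R. In this sketch the probabilities in `p` are arbitrary illustrative values; `qlogis()` and `plogis()` are base R's logit and inverse logit.

```r
# Probabilities are confined to (0, 1); their logits cover the whole real line
p <- c(0.01, 0.30, 0.50, 0.59, 0.99)

logit  <- log(p / (1 - p))   # the link g = ln(p / (1 - p))
p_back <- plogis(logit)      # inverse transform: exp(g) / (1 + exp(g))

print(round(logit, 3))
print(isTRUE(all.equal(logit, qlogis(p))))  # qlogis() computes the same logit
print(p_back)                               # recovers the original probabilities
```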
SECOND CASE: MODELLING THE PROBABILITY FOR A CONTINUOUS VARIABLE
Family and default link function: 2. gaussian (link = "identity")
For a continuous variable with a normal (Gaussian) random component, the response can take any real value.
Our response is the variable as it stands.
- Do we need to transform the response?
- No.
- How do we express the transform g, i.e. the link function?
- As the identity function.
- Are General Linear Models part of Generalized Linear Models?
- Yes, when the link function g is the identity and the error terms are normal.
ESTIMATION: MAXIMUM LIKELIHOOD (ML)
• The likelihood function (LF) expresses the likelihood of observing the data under the model
• The LF is maximized by the best-fitting parameter estimates
• Any model estimated with ML methods will produce a deviance value for the model, which can be used to assess the fit of the model (for the special case of the linear regression model with normal errors, the deviance is equal to the residual SS).
• The deviance for a model can be used to calculate analogues of the multiple $R^2$ for GLIMs
• These notions are useful for understanding the logic of the models and of their ‘assessment’, and that is all we need them for here.
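The remark that, for a linear regression with normal errors, the deviance equals the residual sum of squares can be checked numerically. The sketch below uses simulated data with hypothetical coefficients and fits the same model with `lm()` and with `glm()` using the gaussian family.

```r
# Hypothetical data: y depends linearly on x with normal errors
set.seed(3)
x <- rnorm(100)
y <- 1 + 2 * x + rnorm(100)

fit_lm  <- lm(y ~ x)                                        # least squares
fit_glm <- glm(y ~ x, family = gaussian(link = "identity")) # ML via glm()

rss <- sum(residuals(fit_lm)^2)   # residual sum of squares
dev <- deviance(fit_glm)          # deviance of the gaussian GLM

print(c(rss = rss, deviance = dev))
print(logLik(fit_glm))            # the maximized log likelihood
```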
MODEL STRUCTURE: BACK TO THE LINEAR STRUCTURE
• The binary logistic model has the GLIM structure:
$\ln\frac{p}{1-p} = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i$
where:
p is the probability of 1 (or the proportion of 1s);
$\ln\frac{p}{1-p}$ is the logit, the link function;
$\frac{p}{1-p}$ is the odds, i.e. the probability of presence over the probability of absence of the response.
0 < p < 1, whereas the logit ranges over $(-\infty, +\infty)$
GOODNESS OF FIT
Wald test on logit regression coefficients:
· Large-sample test (Wald test), in truth a z-test: $W_j = \frac{B_j}{SE_{B_j}}$
· H0: $\beta_j = 0$, HA: $\beta_j \neq 0$
The model with intercept and predictors is compared to an intercept-only model to test overall fit:
$\chi^2 = 2[LL(B) - LL(0)]$, where LL indicates the log likelihood
Analogues of the $R^2$ value in linear regression:
Hosmer & Lemeshow: $R^2_L = 1 - \frac{-2LL(B)}{-2LL(0)}$
Cox & Snell: $R^2_{CS} = 1 - \exp\left[-\frac{2}{n}\,\big(LL(B) - LL(0)\big)\right]$
Nagelkerke: $R^2_N = \frac{R^2_{CS}}{R^2_{MAX}}$, where $R^2_{MAX} = 1 - \exp\left[2\,n^{-1}\,LL(0)\right]$
IF WE PLOT A DICHOTOMOUS RESPONSE
BINARY LOGISTIC REGRESSION
• The response variable is dichotomous.
• Predictor variables may be categorical or continuous.
• If predictors are all continuous and nicely distributed, we may use discriminant function analysis.
• If predictors are all categorical, we may use logit analysis.
PSEUDO-R MEASURES
• Hosmer & Lemeshow: not computed in SPSS
• Cox & Snell: unfortunately it cannot reach 1
• Nagelkerke: it has been adjusted so that it can reach 1, so it is the most widely used
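As a hedged sketch, the likelihood-ratio chi-square and the three pseudo-R measures named above can be computed by hand from the log likelihoods of a fitted model and of the intercept-only model. The data are simulated with hypothetical coefficients, and the formulas used are the standard definitions of these measures.

```r
# Hypothetical binary data
set.seed(4)
n <- 300
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x))

fit_B <- glm(y ~ x, family = binomial)   # model with intercept and predictor
fit_0 <- glm(y ~ 1, family = binomial)   # intercept-only model

LL_B <- as.numeric(logLik(fit_B))
LL_0 <- as.numeric(logLik(fit_0))

chi2   <- 2 * (LL_B - LL_0)                   # likelihood-ratio chi-square
R2_L   <- 1 - (-2 * LL_B) / (-2 * LL_0)       # Hosmer & Lemeshow
R2_CS  <- 1 - exp(-(2 / n) * (LL_B - LL_0))   # Cox & Snell
R2_MAX <- 1 - exp((2 / n) * LL_0)             # upper bound of Cox & Snell
R2_N   <- R2_CS / R2_MAX                      # Nagelkerke

print(round(c(chi2 = chi2, R2_L = R2_L, R2_CS = R2_CS, R2_N = R2_N), 3))
```

Because R2_MAX is smaller than 1, the Nagelkerke value is always at least as large as the Cox & Snell value for the same model.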
PARAMETERS INTERPRETATION
Holding all other predictors constant:
• $\beta = 0$ ⇒ P(Presence) is the same at each level of x
• $\beta > (<)\ 0$ ⇒ P(Presence) increases (decreases) as x increases
Interpretation in terms of probability:
$\hat{P} = \frac{e^{\alpha + \beta x}}{1 + e^{\alpha + \beta x}}$
Equivalently, on the log-odds scale:
$\ln(ODDS) = \ln\left(\frac{\hat{P}}{1 - \hat{P}}\right) = \alpha + \beta X$
Response: vote in favour of cats as research subjects
Sample size: 315
Null (empty) model
In favour: 128
Against: 187
P(in favour) = 128/315 = 40.6%
P(against) = 187/315 = 59.4%
Odds = 40.6/59.4 = .684
Exp(−.379) = .684
ADDING GENDER AS AN IV
We add gender as a predictor (IV), male = 1, female = 0
$Odds = e^{a + b\,Gender} = e^{-0.847 + 1.217\,Gender}$
female: $e^{-0.847} = 0.429$
male: $e^{-0.847 + 1.217} = 1.448$
Odds ratio: $\frac{Odds_{male}}{Odds_{female}} = \frac{1.448}{0.429} = 3.376$
Clearly these are not probabilities: note that they can be > 1! They are odds, i.e. the ratio of the chance in favour to the chance against, for females only and for males only respectively.
The odds ratio is the ratio between the two odds.
For a woman, being in favour of the research is .429 times as likely as being against it.
For a man, being in favour of continuing the research is 1.448 times as likely as being against it.
The odds of voting to continue the research, i.e. of being in favour rather than against, are 3.376 times higher for men than for women.
FROM ODDS TO PROBABILITIES
$\hat{P}_{women} = \frac{Odds}{1 + Odds} = \frac{0.429}{1.429} = 0.30$
$\hat{P}_{men} = \frac{Odds}{1 + Odds} = \frac{1.448}{2.448} = 0.59$
For a woman, the probability of voting in favour of cats in experiments is 30%.
For a man, the probability of voting in favour of cats in experiments is 59%, almost double the probability for a woman.
We can NOW draw our conclusions in terms of probability.
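The arithmetic of the cats example can be retraced in a few lines of R. The counts (128 in favour, 187 against) and the coefficients (a = −0.847, b = 1.217) are those reported in the example; the variable names are only illustrative.

```r
# Null model: odds and log-odds from the observed counts
n_favour  <- 128
n_against <- 187
odds_null    <- n_favour / n_against    # odds of being in favour
logodds_null <- log(odds_null)          # intercept of the null model

# Model with gender (male = 1, female = 0), coefficients from the example
a <- -0.847
b <-  1.217
odds_female <- exp(a)
odds_male   <- exp(a + b)
odds_ratio  <- odds_male / odds_female  # identical to exp(b)

# From odds back to probabilities
p_female <- odds_female / (1 + odds_female)
p_male   <- odds_male   / (1 + odds_male)

print(round(c(odds_null = odds_null, logodds_null = logodds_null), 3))
print(round(c(female = odds_female, male = odds_male, ratio = odds_ratio), 3))
print(round(c(p_female = p_female, p_male = p_male), 2))
```

The odds ratio computed from the unrounded odds can differ from the slide's 3.376 in the third decimal, because the slide divides the already rounded odds.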
POISSON REGRESSION
• Count response variable (frequencies) in a fixed period of time, with a Poisson distribution
• Poisson distribution: probability of 0, 1, 2, ... events; the mean of the distribution is equal to the variance
• In the Poisson regression model, predictor variables may be categorical or continuous.
• Rare events
• When the mean is > 10, the distribution is similar to the normal
MODEL STRUCTURE
The Poisson model has the structure:
$\ln y = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i$
where the link function is ln
Goodness of fit
Wald test on regression coefficients
Overall fit: $R^2_{deviance} = 1 - \frac{deviance(model)}{deviance(null)}$
Gain in prediction: $R^2_{deviance} = 1 - \frac{deviance(model)}{deviance(model\ minus\ one\ covariate)}$
INTERPRETATION OF PARAMETERS
$\ln y = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i}$
• A unit increase in $x_1$ results in a $\beta_1$ increase in ln(y)
• For a direct interpretation of the effect on the count variable, we consider the regression as:
$y = e^{\beta_0}\, e^{\beta_1 x_{1i}}\, e^{\beta_2 x_{2i}}$
• A change in the value of a predictor results in a multiplicative change in the predicted count.
• Remember that in linear regression a change in the predictor results in an additive change in the predicted value
INTERPRETATION OF PARAMETERS/2
• If $\beta = 0$, then $\exp(\beta) = 1$: Y and X are not related.
• If $\beta > 0$, then $\exp(\beta) > 1$, and for each unit increase in X the expected count is multiplied by $\exp(\beta)$, i.e. it is larger
• If $\beta < 0$, then $\exp(\beta) < 1$, and for each unit increase in X the expected count is multiplied by $\exp(\beta)$, i.e. it is smaller
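A minimal sketch of a Poisson regression in R, showing the multiplicative reading of the coefficients and the deviance-based R². The data are simulated, and the "true" values 0.5 and 0.3 are hypothetical.

```r
# Hypothetical count data: the log of the expected count is linear in x
set.seed(5)
n <- 500
x <- rnorm(n)
y <- rpois(n, lambda = exp(0.5 + 0.3 * x))

fit <- glm(y ~ x, family = poisson(link = "log"))

b          <- coef(fit)            # coefficients on the ln(y) scale (additive)
rate_ratio <- exp(b[["x"]])        # multiplicative change in the expected count
                                   # for a unit increase in x
R2_dev <- 1 - deviance(fit) / fit$null.deviance   # deviance-based R2, overall fit

print(round(b, 3))
print(round(rate_ratio, 3))
print(round(R2_dev, 3))
```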
ESTIMATION WITH ML
• The deviance for a model can be used to calculate analogues of the multiple $R^2$ of linear regression
• Equidispersion: several GLIMs have error structures based on distributions in which the variance is a function of the mean; in the Poisson distribution the variance is equal to the mean.
• Actual data are often overdispersed, i.e. their variance is larger than the model implies.
As in the comments on the estimation of the logistic regression, these comments sketch some general ideas. The subject is vast, and at this point we just need to see the logic, the analogies and the differences between extensions of models.
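Overdispersion can be illustrated with a rough check that is commonly used in practice: the sum of squared Pearson residuals divided by the residual degrees of freedom, which should be close to 1 under equidispersion. In this sketch the counts are deliberately generated from a negative binomial distribution (an assumption made only to produce overdispersed data), and the quasi-Poisson family is shown as one possible remedy.

```r
# Hypothetical overdispersed counts (negative binomial with small 'size')
set.seed(6)
n  <- 400
x  <- rnorm(n)
mu <- exp(1 + 0.4 * x)
y  <- rnbinom(n, mu = mu, size = 1)

fit_pois <- glm(y ~ x, family = poisson)

# Dispersion estimate: Pearson chi-square over residual degrees of freedom
dispersion <- sum(residuals(fit_pois, type = "pearson")^2) / df.residual(fit_pois)

# Quasi-Poisson: same coefficients, standard errors inflated by sqrt(dispersion)
fit_quasi <- glm(y ~ x, family = quasipoisson)
se_pois   <- summary(fit_pois)$coefficients[, "Std. Error"]
se_quasi  <- summary(fit_quasi)$coefficients[, "Std. Error"]

print(round(dispersion, 2))
print(round(rbind(poisson = se_pois, quasipoisson = se_quasi), 4))
```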
Other GLIM models/1
Two-part models or joint models:
• used when a single outcome variable has multiple facets that are modeled simultaneously, or when multiple outcome variables are conceptually closely related.
Hurdle regression models
• Hurdle regression models (Long, 1997; Mullahy, 1986) are often used to model human decision-making processes. They have been used in Italy in migration studies.
Other GLIM models/2
• Zero-inflated regression models
• Individuals come from two different populations: those who have no probability of displaying the behavior of interest and therefore always respond with a zero, and those who produce zeros with some probability.
• Alcohol example: zeros will come from individuals who never drink for religious, health, or other reasons and thereby produce structural zeros that must always occur.
• In practice: more zeros than expected in a Poisson (or Negative Binomial) distribution.
• Consequences: estimated parameters and SEs may be distorted
• the excessive number of zeros can cause overdispersion
• Solutions: mixture models or hurdle models
OTHER MEASURES
• Akaike Information Criterion (AIC): you can look at AIC as the counterpart of the adjusted R-squared in multiple regression. The smaller, the better
• Null deviance and residual deviance: the null deviance is calculated from the model with no features, i.e. with the intercept only. The residual deviance is calculated from the model with all the features.
• Receiver Operating Characteristic (ROC) curve
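A short sketch of how AIC and the null and residual deviances can be read from a fitted `glm` object in R. The data are simulated; `x2` is, by construction, unrelated to the response, and all numeric values are hypothetical.

```r
# Hypothetical count data: only x1 affects the response
set.seed(7)
n  <- 300
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rpois(n, lambda = exp(0.2 + 0.5 * x1))

fit_small <- glm(y ~ x1,      family = poisson)
fit_large <- glm(y ~ x1 + x2, family = poisson)

aic_table <- AIC(fit_small, fit_large)   # the smaller, the better

# AIC by hand for the Poisson model: -2 * logLik + 2 * number of parameters
aic_by_hand <- -2 * as.numeric(logLik(fit_small)) + 2 * length(coef(fit_small))

null_dev <- fit_small$null.deviance      # intercept-only model
res_dev  <- deviance(fit_small)          # model with the predictor

print(aic_table)
print(c(null = null_dev, residual = res_dev))
```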
EXAMPLES IN THE LITERATURE
Parker, M. A., & Anthony, J. C. (2019). Underage drinking, alcohol dependence, and young people starting to use prescription pain relievers extra-medically: A zero-inflated Poisson regression model. Experimental and Clinical Psychopharmacology, 27(1), 87.
DeLisi, M., Caudill, J. W., Trulson, C. R., Marquart, J. W., Vaughn, M. G., & Beaver, K. M. (2010). Angry inmates are violent inmates: A Poisson regression approach to youthful offenders. Journal of Forensic Psychology Practice, 10(5), 419-439.
EXAMPLES IN THE LITERATURE/LOGISTIC
Adwere-Boamah, J., & Hufstedler, S. (2015). Predicting Social Trust with Binary Logistic Regression. Research in Higher Education Journal, 27.
Adwere-Boamah, J. (2011). Multiple Logistic Regression Analysis of Cigarette Use among High School Students. Journal of Case Studies in Education, 1.
PART II - 1
A bridge from Generalised Linear Models to Generalised Linear Mixed
Models
GENERALIZED LINEAR MIXED MODELS - GLMM
• GLMMs are an extension of GLIMs for when the assumption of uncorrelated errors is violated.
• They are suitable for the analysis of normal and non-normal data with a clustered (in groups) structure
• Added complexity: random effects (different from random errors)
GLMM PARAMETERS
• fixed regression effects and variance component parameters, common to all clusters
• cluster-specific parameters, assumed to be randomly drawn from a population distribution
• Example: experimental psychology, where the experimental design contains within-subject variables
• the variance components of the population distribution are estimated together with the fixed effects
POWER AND RELIABILITY OF ESTIMATES
Often the limiting factor is the sample size at the highest unit of
analysis.
For example, having 500 patients from each of ten doctors would
give one a reasonable total number of observations, but not
enough to get stable estimates of doctor effects nor of the
doctor-to-doctor variation.
10 patients from each of 500 doctors (leading to the same total
number of observations) would be preferable.
CLASSES OF GENERALIZED LINEAR MODELS
• General Linear Models (linear regression, ANOVA, ANCOVA): $Y = X\beta + \varepsilon$. Responses independent.
• Generalized Linear Models (logistic regression, Poisson regression, etc.): $g(Y) = X\beta + \varepsilon$. Responses independent.
• Linear Mixed Models: $Y = X\beta + Zb + \varepsilon$. Responses correlated; correlation modeled in part by “random effects”.
• Generalized Linear Mixed Models: $g(Y|b) = X\beta + Zb + \varepsilon$. Responses correlated; correlation modeled in part by “random effects”.
PART III
An overview on software
IBM SPSS
• SPSS allows the estimation of several GLIMs
• This menu is comprehensive and a bit more complicated than the following one
IBM SPSS
Addressing GLIMs from the Regression menu is more straightforward
The model here is multinomial regression, where:
Response: categorical (nominal) with > 2 categories
Predictors: any scale
SAS - STATA
These software packages share a longstanding tradition and a widespread scientific community. Some advances are ‘translated’ directly into their procedures.
They are rather costly and not too user-friendly. Unfortunately, so far there has been a trend towards an inverse correlation between being user-friendly and being scientifically advanced.
They are included in our university’s campus software
OPEN SOURCE SOFTWARE
Open source software has been engaging the scientific community for quite some time.
Among the extremely user-friendly (and psychologist-friendly) packages developed for the psychologists’ community, we will take a glance at:
• jamovi (collaborators from our university)
• JASP
R is the open source environment. Several methodological/statistical advances are published in the literature; the corresponding R package is published in the related scientific journals and then made available to the community, with online documentation and user communities.
The strength of R, SAS and Stata is that their methods are subject to a vast open discussion in the scientific community and that they provide copious documentation to the public. See IDRE from UCLA
JAMOVI
• Some generalised linear models can be found in the two menus ANOVA and Regression.
• GAMLj: module for GLM, LME and GZLMs in jamovi
JASP
R – THE UNAVOIDABLE, THE INEVITABLE
• Why is R so important in the scientific community?
• Why is not-so-friendly software (where we need to use a programming language) so relevant for applying statistical methods?
Estimation can have a domino effect
THE GLM FUNCTION IN R
Generalized linear models can be fitted in R using the glm function.
The glm function is similar to the lm function for fitting linear models.
The arguments to a glm call are as follows
glm (formula, family = ………..)
R – THE UNAVOIDABLE, THE INEVITABLE
The formula is specified to glm as, e.g.
y ~ x1 + x2
where x1, x2 are the names of
• numeric vectors (continuous variables)
• factors (categorical variables)
All specified variables must be in the workspace or in the data frame passed to the data argument.
FAMILY ARGUMENT
The family argument takes (the name of) a family function which specifies
• the link function
• the variance function and other components, e.g. linkinv
The family (exponential family) functions available in R include
• binomial (link = "logit")
• poisson (link = "log")
• gaussian (link = "identity")
• inverse.gaussian (link = "1/mu^2")
EXTRACTOR FUNCTIONS
There are several glm or lm methods available for accessing/displaying components of the glm object, including:
• residuals()
• fitted()
• predict()
• coef()
• deviance()
• formula()
• summary()
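The extractor functions listed above can be tried on a small logistic regression. This sketch uses the `mtcars` data set that ships with base R (transmission type `am` as a dichotomous response, weight `wt` as predictor); the choice of data and the new value `wt = 3` are only illustrative.

```r
# Logistic regression on a data set distributed with R
data(mtcars)
fit <- glm(am ~ wt, family = binomial(link = "logit"), data = mtcars)

est  <- coef(fit)                          # parameter estimates (logit scale)
frm  <- formula(fit)                       # the model formula
dev  <- deviance(fit)                      # residual deviance
fv   <- fitted(fit)                        # fitted probabilities, one per car
res  <- residuals(fit, type = "deviance")  # deviance residuals

# Predicted probability for a hypothetical car weighing 3 (in 1000 lbs)
p_new <- predict(fit, newdata = data.frame(wt = 3), type = "response")

print(frm)
print(round(est, 3))
print(round(dev, 2))
print(round(unname(p_new), 3))
print(summary(fit))                        # coefficient table with Wald z-tests
```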
RSTUDIO
SOME USEFUL WEBSITES
An excellent introduction on R, with full information and
instructions:
https://stats.idre.ucla.edu/r/
Some detailed lessons on the General Linear Model and an
introduction on the Generalized Linear Model, with applications
in R (data not provided, some lessons are not freely available to
the public):
https://arc.psych.wisc.edu/courses/610-710/
Some references
Coxe, S., West, S. G., & Aiken, L. S. (2009). The analysis of count data: A gentle introduction to Poisson regression and its alternatives. Journal of Personality Assessment, 91(2), 121-136.
Coxe, S., West, S. G., & Aiken, L. S. (2013). Generalized linear models. Oxford handbook of quantitative methods, 26-51.
Dobson, A. J., & Barnett, A. (2008). An introduction to generalized linear models. CRC Press.
Fox, J. (2015). Applied regression analysis and generalized linear models. Sage Publications.
Hedeker, D., Flay, B. R., & Petraitis, J. (1996). Estimating individual influences of behavioral intentions: an application of random-effects modeling to the theory of reasoned action. Journal of Consulting and Clinical Psychology, 64, 109-120.
Hirotsu, C. (2017). Advanced Analysis of Variance (Vol. 384). John Wiley & Sons.
Osborne, J. W. (2014). Best practices in logistic regression. Sage Publications.
http://core.ecu.edu/psyc/wuenschk/MV/Multreg/Logistic-SPSS.PDF
Tuerlinckx, F., Rijmen, F., Verbeke, G., & Boeck, P. (2006). Statistical inference in generalized linear mixed models: A review. British Journal of Mathematical and Statistical Psychology, 59(2), 225-255.
Vandekerckhove, J., Matzke, D., & Wagenmakers, E. J. (2015). Model comparison and the principle of parsimony. Oxford handbook of computational and mathematical psychology, 300-319.

Data Visualization Exploring and Explaining with Data 1st Edition by Camm sol...
 
Easy and simple project file on mp online
Easy and simple project file on mp onlineEasy and simple project file on mp online
Easy and simple project file on mp online
 
AI Imagen for data-storytelling Infographics.pdf
AI Imagen for data-storytelling Infographics.pdfAI Imagen for data-storytelling Infographics.pdf
AI Imagen for data-storytelling Infographics.pdf
 
Formulas dax para power bI de microsoft.pdf
Formulas dax para power bI de microsoft.pdfFormulas dax para power bI de microsoft.pdf
Formulas dax para power bI de microsoft.pdf
 
Exploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptxExploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptx
 
一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理
 
Atlantic Grupa Case Study (Mintec Data AI)
Atlantic Grupa Case Study (Mintec Data AI)Atlantic Grupa Case Study (Mintec Data AI)
Atlantic Grupa Case Study (Mintec Data AI)
 
Pre-ProductionImproveddsfjgndflghtgg.pptx
Pre-ProductionImproveddsfjgndflghtgg.pptxPre-ProductionImproveddsfjgndflghtgg.pptx
Pre-ProductionImproveddsfjgndflghtgg.pptx
 
NO1 Best Kala Jadu Expert Specialist In Germany Kala Jadu Expert Specialist I...
NO1 Best Kala Jadu Expert Specialist In Germany Kala Jadu Expert Specialist I...NO1 Best Kala Jadu Expert Specialist In Germany Kala Jadu Expert Specialist I...
NO1 Best Kala Jadu Expert Specialist In Germany Kala Jadu Expert Specialist I...
 
一比一原版纽卡斯尔大学毕业证成绩单如何办理
一比一原版纽卡斯尔大学毕业证成绩单如何办理一比一原版纽卡斯尔大学毕业证成绩单如何办理
一比一原版纽卡斯尔大学毕业证成绩单如何办理
 
2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group Meeting2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group Meeting
 
The Significance of Transliteration Enhancing
The Significance of Transliteration EnhancingThe Significance of Transliteration Enhancing
The Significance of Transliteration Enhancing
 
Artificial_General_Intelligence__storm_gen_article.pdf
Artificial_General_Intelligence__storm_gen_article.pdfArtificial_General_Intelligence__storm_gen_article.pdf
Artificial_General_Intelligence__storm_gen_article.pdf
 
Generative AI for Trailblazers_ Unlock the Future of AI.pdf
Generative AI for Trailblazers_ Unlock the Future of AI.pdfGenerative AI for Trailblazers_ Unlock the Future of AI.pdf
Generative AI for Trailblazers_ Unlock the Future of AI.pdf
 
2024 Q1 Tableau User Group Leader Quarterly Call
2024 Q1 Tableau User Group Leader Quarterly Call2024 Q1 Tableau User Group Leader Quarterly Call
2024 Q1 Tableau User Group Leader Quarterly Call
 
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotecAbortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
 

GLM_2020_21.pptx

  • 1. GENERALIZED LINEAR MODELS
      Ph.D. Programme in Psychology, Linguistics and Cognitive Neurosciences
      Ph.D. School - University of Milano-Bicocca, Prof. Franca Crippa
      https://www.vox.com/future-perfect/21504366/science-replication-crisis-peer-review-statistics
  • 2. PLAN OF THE LESSON
      Part I
        - Icebreaker: review of the General Linear Model
      Part II
        - The Generalized Linear Model: extension to data that are not normally distributed
            - fractions (logistic regression)
            - counts (Poisson regression, log-linear models)
            - ordinal data (threshold models)
        - Overview of specific topics (overdispersion, (quasi-)maximum likelihood)
      Part III
        - Overview of software for GLIMs: SPSS and R; jamovi and JASP (both user-friendly and built on R). R is still the Linus' blanket, for its breadth and its updates on modelling, even if it is a bit rough and not fluffy.
  • 3. PART I: Icebreaker on the General Linear Model
  • 4. GENERAL LINEAR MODELS AS MODELS
      - Our idea is that data are generated as specified in our model, plus a random error:
          DATA = MODEL + ERROR
      - Very general form of the model:
          $Y = f(X_1, X_2, X_3) + \varepsilon$
      - Linear models are models of the form:
          $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon$
      Beware: notation may vary from one author to another, from one professor to another, from one journal to another. Then:
        - focus on the meaning of the symbol;
        - pay attention to the requirements of the journal.
  • 5. HOW DO WE MODEL DATA?
      - Objective
      - Model structure (e.g. variables, formula, equation)
      - Model assumptions
      - Parameter estimates and interpretation
      - Model fit (e.g. goodness-of-fit tests and statistics)
      - Model selection
  • 6. PSYCHOLOGISTS' STATISTICAL WORKHORSE: THE GENERAL LINEAR MODEL
      Response: quantitative, continuous
      Predictors:
        - quantitative: linear regression (simple or multiple)
        - qualitative: ANOVA
        - both: ANCOVA
      One or more between-subjects predictors:
        - quantitative predictors: regression
        - categorical predictors: ANOVA
        - quantitative and categorical predictors: ANCOVA
      At least one within-subjects predictor
  • 7. PSYCHOLOGISTS' STATISTICAL WORKHORSE: THE GENERAL LINEAR MODEL
      Response: quantitative, continuous
      One or more dichotomous or continuous between-subjects predictors:
        - One predictor: independent-samples t-test; simple regression
        - Two or more predictors: multiple regression; statistical control (covariates); mediation
        - Two or more predictors plus interaction: interactions; moderated mediation; other types of linear model (polynomial)
  • 8. THE GENERAL LINEAR MODEL
      At least one within-subjects predictor:
        - One or more categorical within-subjects predictors: paired-samples t-test; within-subjects ANOVA
        - At least one continuous within-subjects predictor: Linear Mixed-Effects Models (LMEM), with an additional random term
  • 9. GENERAL LINEAR MODEL: AN OUTLOOK ON THE ASSUMPTIONS
      Objective: the response $y_i$, $i = 1, \dots, n$, is modelled by a linear (additive) function of the predictors (explanatory variables) $x_j$, $j = 1, \dots, p$, plus an error term. The model is linear in the parameters.
      Predictors:
        1. on any scale: categorical or quantitative
        2. measured without error (deterministic); the random component is expressed by the error
      Response variable: (continuous) quantitative only
      Errors: iid and normally distributed. For all subjects $i = 1, 2, \dots, n$, the errors $\varepsilon_i$ are:
        i) identically and normally distributed, with zero mean and equal variance (homoscedasticity)
        ii) uncorrelated (independent)
  • 10. ESTIMATION WHEN ASSUMPTIONS ARE MET
      The general linear model makes the assumptions below. When these assumptions are met, OLS regression coefficients are MVUE (Minimum Variance Unbiased Estimators) and BLUE (Best Linear Unbiased Estimators).
        1. Exact X: the IVs are assumed to be known exactly (i.e. without measurement error), deterministic.
        2. Independence: residuals are independently distributed (the probability of obtaining a specific observation does not depend on other observations).
        3. Normality: all residual distributions are normally distributed.
        4. Constant variance: all residual distributions have a constant variance.
        5. Linearity: all residual distributions (i.e. for each Y') are assumed to have means equal to zero.
  • 11. PARAMETER INTERPRETATION
      Regression:
        - $b_0$ is the estimate of the intercept $\beta_0$
        - $b_i$ is the estimate of the slope $\beta_i$, i.e. the increase in the response due to a unit increase in the i-th predictor
      ANOVA:
        - general mean
        - difference between the group mean and the general mean
      Model selection: which explanatory variables to include?
        - Principle of parsimony (Occam's razor): all relevant predictors are included, no irrelevant one is.
  • 12. NOT ALL MISSING DATA ARE THE SAME
      - Missing by design: values are missing by definition of the population of interest
      - Missing completely at random (MCAR): missing values are randomly distributed
      - Missing at random (MAR): after accounting for one or more other variables, missing values are randomly distributed
      - Non-ignorable (NI): missing values are functions of the variables themselves
  • 13. BETTER METHODS OF HANDLING MISSING DATA
      Full information maximum likelihood (FIML) methods:
        - can handle data that are MAR and NI
        - implemented as part of particular statistical models
        - missing data are handled during the analysis
      Multiple imputation:
        - can also handle data that are MAR and NI
        - simulation-based approach
        - missing data are handled separately from the analysis
  • 14. RESTRICTIONS OF GENERAL LINEAR MODELS
      Although they are a very useful framework, general linear models are not appropriate in some situations:
        1. The range of Y is restricted: categorical variables (binary, ordered or unordered categories), counts
        2. Other violations of the assumptions:
            - heteroscedasticity
            - non-normality
            - non-linearity (in the IVs and/or in the parameters)
            - variance depending on the mean
  • 15. MESSY DATA: Anscombe's quartet
      Anscombe, Francis J. (1973). Graphs in statistical analysis. American Statistician, 27, 17-21.
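The point of Anscombe's quartet can be checked directly, since the four data sets ship with base R as the data frame `anscombe`. The sketch below is only an illustration (the object names `fits` and `coefs` are ours, not from the slides): it fits the same simple regression to each data set and collects the estimates, which turn out almost identical although the scatterplots look very different.

```r
# `anscombe` is part of base R (package datasets): columns x1..x4 and y1..y4.
fits <- lapply(1:4, function(i) {
  lm(as.formula(paste0("y", i, " ~ x", i)), data = anscombe)
})

# One row per data set: intercept and slope of the fitted line.
coefs <- t(sapply(fits, coef))
colnames(coefs) <- c("intercept", "slope")
print(round(coefs, 2))
```

Plotting each pair (for instance `plot(anscombe$x2, anscombe$y2)`) shows why the coefficients alone are not enough to judge a model.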
  • 16. A GLANCE AT DECISION AND POWER (NEXT LESSON)
      Researcher concludes: FAIL TO REJECT (FTR) NULL; NO EFFECT
        - Reality: NO EFFECT. Correct FTR
        - Reality: EFFECT EXISTS. Type 2 error ($\beta$)
      Researcher concludes: REJECT NULL; EFFECT EXISTS
        - Reality: NO EFFECT. Type 1 error ($\alpha$)
        - Reality: EFFECT EXISTS. Correct reject ($1 - \beta$)
  • 17. PART II: Generalised Linear Models (GLIMs)
  • 18. EXTENSION TO GENERALIZED LINEAR MODELS (GLIM OR GLM)
      GLIMs are a family of models that:
        - extend linear regression to a broader family of outcome variables, keeping the basic structure of linear regression equations
        - allow us to extend the linear modelling framework to variables that are not normally distributed
        - allow us to look at models that seem different from a unifying perspective
      Two major additions to the linear framework:
        - link function: when the response has a nonlinear relationship with the predictors, a transformation of the response is expressed as a linear regression
        - error structures beyond the normal, for instance binomial and Poisson
  • 19. THREE COMPONENTS OF A GLIM
      1. Systematic part: the relation between the dependent variable Y and the independent variables in the model
      2. Random part: the error distribution of the outcome variable
      3. Link function $g(\cdot)$: a transformation of the response, chosen so that the transformed response is expressed through the well-known linear relation (identity, logit, log, ...)
      How to approach GLIMs:
        - understanding the common underlying linear structure
        - exploring the reasons for different estimation techniques
  • 20. GLIM FIXED MODELS WITH RESPONSES AND PREDICTORS OF ANY TYPE
      Predictors: measured on any scale
      Response:
        - continuous: general linear model
        - dichotomous: logit (accuracy, yes or no)
        - categorical: logistic (ordinal or nominal categorical)
        - count: Poisson regression (count variables, frequencies)
  • 21. GLIM AS A SOLUTION FOR SOME VIOLATIONS OF GENERAL LINEAR MODEL ASSUMPTIONS
      - (Independence: inaccurate standard errors, degrees of freedom and significance tests. Use linear mixed-effects models; see my colleague's lessons.)
      - Normality: inefficient (with large N). Use transformations or generalized linear models.
      - Constant variance: inefficient, with inaccurate standard errors. Use transformations or generalized linear models.
      - Linearity: biased parameter estimates. Use transformations or generalized linear models.
  • 22. DISTRIBUTION OF ERRORS IN PROBIT AND LOGIT MODELS
  • 23. FAMILY AND DEFAULT LINK FUNCTION
      Family            | Default link function
      binomial          | (link = "logit")
      gaussian          | (link = "identity")
      Gamma             | (link = "inverse")
      inverse.gaussian  | (link = "1/mu^2")
      poisson           | (link = "log")
      quasibinomial     | (link = "logit")
      quasipoisson      | (link = "log")
      General call: glm(formula, family = familytype(link = linkfunction), data = )
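The defaults in this table can be read off the family objects themselves. The following is a small sketch in base R (the names `fams` and `default_links` are illustrative): each family function, called without arguments, returns an object whose `link` element stores the default link.

```r
# Family constructors from the stats package, called with their defaults.
fams <- list(binomial(), gaussian(), Gamma(), inverse.gaussian(),
             poisson(), quasibinomial(), quasipoisson())

# Extract the default link of each family and label it by the family name.
default_links <- vapply(fams, function(f) f$link, character(1))
names(default_links) <- vapply(fams, function(f) f$family, character(1))
print(default_links)
```

A non-default link is requested explicitly in the same way as on the slide, for example `binomial(link = "probit")`.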
  • 24. LINK FUNCTION AND ERROR DISTRIBUTIONS
      Model                            | Error distribution | Link function
      Regression                       | Normal             | $g = E(Y \mid X)$
      Binary logistic regression       | Binomial           | $g = \ln\frac{p}{1-p}$
      Ordinal logistic regression      | Binomial           | $g = \ln\frac{p}{1-p}$
      Multinomial logistic regression  | Multinomial        | $g = \ln\frac{p}{1-p}$
      Poisson regression               | Poisson            | $g = \ln[E(Y \mid X)]$
      Beta regression                  | Beta               | $g = \ln[E(Y \mid X)]$
      Gamma regression                 | Gamma              | $g = \ln[E(Y \mid X)]$
      Negative binomial regression     | Negative binomial  | $g = \ln[E(Y \mid X)]$
  • 25. FIRST CASE: MODELLING THE PROBABILITY FOR A DICHOTOMOUS VARIABLE
      Family and default link function: 1. binomial (link = "logit")
      In a binomial variable (the random component is the error, which is binomial) our interest is in the probability of 'success'. We have two outcomes, success and failure (1 and 0). When we know the probability of success p, we derive the probability of failure as 1 - p.
      Our response would be the probability, with range 0-1. Then we cannot use the General Linear Model. How do we solve the problem?
        - We transform the response. Instead of the probability, we consider the logit $g = \ln\frac{p}{1-p}$. The symbol g stands for our 'transformed' response. This transformed response is now continuous and unbounded.
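To see what the transformation does, here is a minimal numerical sketch in base R (the probabilities in `p` are arbitrary illustrative values). It computes the logit by hand, compares it with the built-in `qlogis()`, and maps it back to a probability.

```r
# Illustrative probabilities, all strictly between 0 and 1.
p <- c(0.01, 0.25, 0.50, 0.75, 0.99)

# Logit: log of the odds p / (1 - p). Values range over the whole real line.
g <- log(p / (1 - p))

# Inverse transformation: from the logit back to the probability.
p_back <- exp(g) / (1 + exp(g))

print(data.frame(p = p, logit = round(g, 3), p_back = p_back))
```

In base R, `qlogis()` and `plogis()` implement the same logit and inverse logit.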
  • 26. SECOND CASE: MODELLING THE PROBABILITY FOR A CONTINUOUS VARIABLE
      Family and default link function: 2. gaussian (link = "identity")
      In a continuous variable with a normal (gaussian) random component, the response has no restriction on the real numbers. Our response is the variable as it stands.
        - Do we need to transform the response? No.
        - How do we express the transform g, i.e. the link function? As the identity function.
        - Are General Linear Models part of Generalized Linear Models? Yes, when the link function, also denoted as g, is the identity and the error terms are normal.
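The last answer can be verified numerically. The sketch below assumes the built-in `cars` data (stopping distance against speed), chosen here only for illustration: the same model is fitted once with `lm()` and once with `glm()` using the gaussian family and identity link.

```r
# General linear model fitted by ordinary least squares.
fit_lm <- lm(dist ~ speed, data = cars)

# The same model expressed as a generalized linear model.
fit_glm <- glm(dist ~ speed, family = gaussian(link = "identity"), data = cars)

# The two sets of coefficient estimates coincide.
print(rbind(lm = coef(fit_lm), glm = coef(fit_glm)))
```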
  • 27. ESTIMATION: MAXIMUM LIKELIHOOD (ML)
      - The likelihood function (LF) expresses the likelihood of observing the data under the model.
      - The LF is maximized by the best-fitting parameter estimates.
      - Any model estimated with ML methods produces a deviance value for the model, which can be used to assess the fit of the model (for the special case of the linear regression model with normal errors, the deviance is equal to the residual SS).
      - The deviance for a model can be used to calculate analogues of the multiple $R^2$ for GLIMs.
      - These notions are useful to understand the logic of the models and of their 'assessment'; nothing more is needed here.
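The special case mentioned above, where the deviance equals the residual sum of squares, can be illustrated with a short sketch. It assumes the built-in `cars` data and our own object names; it also computes a deviance-based analogue of $R^2$, which in this normal case reproduces the usual $R^2$.

```r
# Gaussian GLM and the corresponding intercept-only (null) model.
fit_glm  <- glm(dist ~ speed, family = gaussian, data = cars)
null_glm <- glm(dist ~ 1, family = gaussian, data = cars)

# Residual sum of squares from the ordinary linear model.
fit_lm <- lm(dist ~ speed, data = cars)
rss <- sum(residuals(fit_lm)^2)

# Deviance-based analogue of R squared: 1 - deviance(model) / deviance(null).
r2_dev <- 1 - deviance(fit_glm) / deviance(null_glm)

print(c(deviance = deviance(fit_glm), rss = rss))
print(c(r2_dev = r2_dev, r2_lm = summary(fit_lm)$r.squared))
```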
  • 28. MODEL STRUCTURE: BACK TO THE LINEAR STRUCTURE
      The binary logistic model has the GLIM structure:
        $\ln\frac{p}{1-p} = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i$
      where:
        - p is the probability of 1 (or the proportion of 1s);
        - $\ln\frac{p}{1-p}$ is the logit, the link function;
        - $\frac{p}{1-p}$ is the odds, i.e. the probability of presence over the probability of absence of the response.
      Range: $0 < p < 1$ versus logit in $(-\infty, +\infty)$
  • 29. GOODNESS OF FIT
      Wald test on the logit regression coefficients: a large-sample test, in truth a z-test:
        $H_0: \beta_j = 0$ versus $H_A: \beta_j \neq 0$, with $W = \frac{B_j}{SE_{B_j}}$
      The model with intercept and predictors is compared with an intercept-only model through the test
        $\chi^2 = 2[LL(B) - LL(0)]$
      where LL indicates the log-likelihood.
      Analogues of the $R^2$ value in linear regression:
        - Hosmer & Lemeshow: $R^2_L = 1 - \frac{-2LL(B)}{-2LL(0)}$
        - Cox & Snell: $R^2_{CS} = 1 - \exp\left\{-\frac{2}{n}[LL(B) - LL(0)]\right\}$
        - Nagelkerke: $R^2_N = \frac{R^2_{CS}}{R^2_{MAX}}$, where $R^2_{MAX} = 1 - \exp[2 n^{-1} LL(0)]$
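As a hedged illustration of these quantities, the sketch below computes the likelihood-ratio chi-square and the three pseudo-$R^2$ measures by hand. The data set (`mtcars`, transmission type `am` against weight `wt`) and all object names are assumptions made for the example, not taken from the slides; the Hosmer & Lemeshow measure is written in its equivalent form $1 - LL(B)/LL(0)$.

```r
# Logistic model with one predictor and the intercept-only model.
fit  <- glm(am ~ wt, family = binomial, data = mtcars)
null <- glm(am ~ 1,  family = binomial, data = mtcars)

n   <- nrow(mtcars)
LLB <- as.numeric(logLik(fit))   # log-likelihood of the fitted model
LL0 <- as.numeric(logLik(null))  # log-likelihood of the null model

chi2   <- 2 * (LLB - LL0)                  # likelihood-ratio chi-square
r2_hl  <- 1 - LLB / LL0                    # Hosmer & Lemeshow
r2_cs  <- 1 - exp(-(2 / n) * (LLB - LL0))  # Cox & Snell
r2_max <- 1 - exp((2 / n) * LL0)           # upper bound of Cox & Snell
r2_n   <- r2_cs / r2_max                   # Nagelkerke

print(round(c(chi2 = chi2, HL = r2_hl, CS = r2_cs, N = r2_n), 3))
```

The Wald z-tests for the individual coefficients are reported by `summary(fit)`.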
  • 30. IF WE PLOT A DICHOTOMOUS RESPONSE
  • 31. BINARY LOGISTIC REGRESSION
      - The response variable is dichotomous.
      - Predictor variables may be categorical or continuous.
      - If the predictors are all continuous and nicely distributed, we may use discriminant function analysis.
      - If the predictors are all categorical, we may use logit analysis.
  • 32. PSEUDO-$R^2$ MEASURES
      - Hosmer & Lemeshow: not computed in SPSS
      - Cox & Snell: unfortunately, it cannot reach 1
      - Nagelkerke: adjusted so that it can reach 1, hence it is the most used
  • 33. PARAMETERS INTERPRETATION
      Holding all other predictors constant:
        - $\beta = 0$: P(Presence) is the same at each level of x
        - $\beta > 0$ ($\beta < 0$): P(Presence) increases (decreases) as x increases
      Interpretation in terms of probability:
        $P = \frac{e^{\alpha + \beta x}}{1 + e^{\alpha + \beta x}}$
        $\ln(ODDS) = \ln\left(\frac{\hat{P}}{1 - \hat{P}}\right) = \alpha + \beta X$
      Example. Response: vote in favour of cats as research subjects. Sample size: 315 (128 in favour, 187 against).
      Null (empty) model: $Exp(-.379) = 128/187 = .684$
        - P(in favour) = 128/315 = 40.6%
        - P(against) = 187/315 = 59.4%
        - Odds = 40.6/59.4 = .684
  • 34. ADDING GENDER AS A DUMMY VARIABLE (DV)
      We add gender as a dummy variable, male = 1, female = 0:
        $Odds = e^{a + b \cdot Gender} = e^{-0.847 + 1.217 \cdot Gender}$
        male: $e^{-0.847 + 1.217} = 1.448$
        female: $e^{-0.847} = 0.429$
        Odds ratio: $\frac{Odds_{male}}{Odds_{female}} = \frac{1.448}{.429} = 3.376$
      Clearly these are not probabilities: note that they can be > 1! They are odds, i.e. the ratio of the chance in favour to the chance against, for females only and for males only respectively. The odds ratio is the ratio between the two odds.
        - For a woman, being in favour of the research is .429 times as likely as being against it.
        - For a man, being in favour of continuing the research is 1.448 times as likely as being against it.
        - The odds of voting to continue the research (in favour rather than against) are 3.376 times as large for men as for women.
  • 35. FROM ODDS TO PROBABILITIES
        $\hat{P}_{women} = \frac{Odds}{1 + Odds} = \frac{0.429}{1.429} = 0.30$
        $\hat{P}_{men} = \frac{Odds}{1 + Odds} = \frac{1.448}{2.448} = 0.59$
      For a woman, the probability of voting in favour of cats in experiments is 30%. For a man, it is 59%, almost double the probability for a woman.
      NOW we can draw our conclusions in terms of probability.
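The arithmetic of the cat-research example can be retraced in a few lines of base R. The coefficients -0.847 and 1.217 and the counts 128 and 187 are the ones reported on the slides; the variable names are ours.

```r
# Null model: odds of voting in favour and the corresponding log-odds.
null_odds <- 128 / 187
null_logit <- log(null_odds)

# Model with gender (male = 1, female = 0): intercept a and slope b.
a <- -0.847
b <- 1.217

odds_female <- exp(a)          # Gender = 0
odds_male   <- exp(a + b)      # Gender = 1
odds_ratio  <- odds_male / odds_female

# From odds to probabilities: P = odds / (1 + odds).
p_female <- odds_female / (1 + odds_female)
p_male   <- odds_male / (1 + odds_male)

print(round(c(null_odds = null_odds, null_logit = null_logit,
              odds_female = odds_female, odds_male = odds_male,
              odds_ratio = odds_ratio,
              p_female = p_female, p_male = p_male), 3))
```

The odds ratio equals `exp(b)`, which is why the exponentiated slope of a logistic regression is read directly as an odds ratio.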
  • 36. POISSON REGRESSION
      - Count response variable (frequencies) in a fixed period of time, with a Poisson distribution
      - Poisson distribution: probability of 0, 1, 2, ... events; the mean of the distribution is equal to the variance
      - In the Poisson regression model, predictor variables may be categorical or continuous.
      Notes: rare events; when the mean is > 10, the distribution is similar to the normal.
  • 37. POISSON REGRESSION
  • 38. MODEL STRUCTURE
      The Poisson model has the structure:
        $\ln(y) = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i$
      where the link function is ln.
      Goodness of fit:
        - Wald test on the regression coefficients
        - Overall fit: $R^2_{deviance} = 1 - \frac{deviance(model)}{deviance(null)}$
        - Gain in prediction: $R^2_{deviance} = 1 - \frac{deviance(model)}{deviance(model\ minus\ one\ covariate)}$
  • 39. INTERPRETATION OF PARAMETERS
        $\ln(y) = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i}$
      - A unit increase in $x_1$ results in a $\beta_1$ increase in $\ln(y)$.
      - For a direct interpretation of the effect on the count variable, we write the regression as:
        $y = e^{\beta_0} e^{\beta_1 x_{1i}} e^{\beta_2 x_{2i}}$
      - A change in the value of a predictor results in a multiplicative change in the predicted count.
      - Remember that in linear regression a change in the predictor results in an additive change in the predicted value.
  • 40. INTERPRETATION OF PARAMETERS/2
      - If $\beta = 0$, then $\exp(\beta) = 1$: Y and X are not related.
      - If $\beta > 0$, then $\exp(\beta) > 1$: with a unit increase in X (e.g. X = 1 versus X = 0), the expected Y is $\exp(\beta)$ times larger.
      - If $\beta < 0$, then $\exp(\beta) < 1$: with a unit increase in X (e.g. X = 1 versus X = 0), the expected count is multiplied by $\exp(\beta)$, i.e. it becomes smaller.
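The multiplicative reading of the coefficients can be sketched with a Poisson regression on the built-in `warpbreaks` data (number of breaks per loom by wool type and tension). The data set and the object names are assumptions chosen for illustration, not the example used in the lesson.

```r
# Poisson regression with log link: two categorical predictors.
fit <- glm(breaks ~ wool + tension, family = poisson(link = "log"),
           data = warpbreaks)

# Exponentiated coefficients: multiplicative effects on the expected count.
rate_ratios <- exp(coef(fit))

# Predicted counts for wool A and wool B at the same (low) tension.
newd <- data.frame(
  wool    = factor(c("A", "B"), levels = levels(warpbreaks$wool)),
  tension = factor(c("L", "L"), levels = levels(warpbreaks$tension))
)
pred <- predict(fit, newdata = newd, type = "response")

# Deviance-based overall fit: 1 - deviance(model) / deviance(null).
r2_dev <- 1 - deviance(fit) / fit$null.deviance

print(round(rate_ratios, 3))
print(c(ratio_B_vs_A = unname(pred[2] / pred[1]), r2_dev = r2_dev))
```

The ratio of the two predicted counts equals `exp()` of the `woolB` coefficient, whatever the tension level, which is the multiplicative change described above.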
  • 41. ESTIMATION WITH ML
      - The deviance for a model can be used to calculate analogues of the multiple $R^2$ of linear regression.
      - Equidispersion: several GLIMs have error structures based on distributions in which the variance is a function of the mean.
      - Actual data are usually overdispersed.
      As for the estimation of the logistic regression, these comments sketch some general ideas. The subject is vast, and at this point we just need to see the logic, the analogies and the differences between extensions of models.
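One common way to check for overdispersion, sketched here under our own choice of data (`warpbreaks`) and names, is to divide the Pearson chi-square statistic by the residual degrees of freedom: values well above 1 suggest that the variance exceeds the mean. The quasi-Poisson family, listed among the R families earlier, estimates the same dispersion parameter.

```r
# Poisson fit, which assumes variance equal to the mean (dispersion = 1).
fit_pois <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)

# Pearson chi-square divided by the residual degrees of freedom.
dispersion <- sum(residuals(fit_pois, type = "pearson")^2) / df.residual(fit_pois)

# Quasi-Poisson fit: same coefficients, dispersion estimated from the data.
fit_quasi <- glm(breaks ~ wool + tension, family = quasipoisson,
                 data = warpbreaks)

print(c(pearson_ratio = dispersion,
        quasi_dispersion = summary(fit_quasi)$dispersion))
```

With the quasi-Poisson fit the coefficient estimates are unchanged, while the standard errors are inflated by the square root of the dispersion.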
  • 42. OTHER GLIM MODELS/1
      Two-part models or joint models:
        - used when a single outcome variable has multiple facets that are modelled simultaneously, or when multiple outcome variables are conceptually closely related
      Hurdle regression models:
        - Hurdle regression models (Long, 1997; Mullahy, 1986) are often used to model human decision-making processes. They have been used in Italy in migration studies.
  • 43. OTHER GLIM MODELS/2
      Zero-inflated regression models:
        - Individuals come from two different populations: those who have no probability of displaying the behaviour of interest and therefore always respond with a zero, and those who produce zeros with some probability.
        - Alcohol example: zeros will come from individuals who never drink for religious, health, or other reasons and thereby produce structural zeros that must always occur.
        - In practice: more zeros than expected in a Poisson (or negative binomial) distribution.
        - Consequences: estimated parameters and SEs may be distorted; the excessive number of zeros can cause overdispersion.
        - Solutions: mixture models or hurdle models.
  • 44. OTHER MEASURES
      - Akaike Information Criterion (AIC): you can look at AIC as the counterpart of the adjusted R-squared in multiple regression. The smaller, the better.
      - Null deviance and residual deviance: the null deviance is calculated from the model with no features, i.e. intercept only. The residual deviance is calculated from the model with all the features.
      - Receiver Operating Characteristic (ROC) curve
  • 45. EXAMPLES IN THE LITERATURE
      - Parker, M. A., & Anthony, J. C. (2019). Underage drinking, alcohol dependence, and young people starting to use prescription pain relievers extra-medically: A zero-inflated Poisson regression model. Experimental and Clinical Psychopharmacology, 27(1), 87.
      - DeLisi, M., Caudill, J. W., Trulson, C. R., Marquart, J. W., Vaughn, M. G., & Beaver, K. M. (2010). Angry inmates are violent inmates: A Poisson regression approach to youthful offenders. Journal of Forensic Psychology Practice, 10(5), 419-439.
  • 46. EXAMPLES IN THE LITERATURE/LOGISTIC
      - Adwere-Boamah, J., & Hufstedler, S. (2015). Predicting Social Trust with Binary Logistic Regression. Research in Higher Education Journal, 27.
      - Adwere-Boamah, J. (2011). Multiple Logistic Regression Analysis of Cigarette Use among High School Students. Journal of Case Studies in Education, 1.
  • 47. PART II - 1: A bridge from Generalised Linear Models to Generalised Linear Mixed Models
  • 48. GENERALIZED LINEAR MIXED MODELS - GLMM
      - GLMMs are an extension of GLIMs for when the assumption of uncorrelated errors is violated.
      - They are suitable for the analysis of normal and non-normal data with a clustered (grouped) structure.
      - Added complexity: random effects (different from random errors).
  • 49. GLMM PARAMETERS
      - Fixed regression effects and variance-component parameters, common to all clusters
      - Cluster-specific parameters, assumed to be randomly drawn from a population distribution
      - Example: experimental psychology, where the experimental design contains within-subject variables
      - The variance components of the population distribution are to be estimated together with the fixed effects
  • 50. POWER AND RELIABILITY OF ESTIMATES
      Often the limiting factor is the sample size at the highest unit of analysis. For example, having 500 patients from each of ten doctors would give a reasonable total number of observations, but not enough to get stable estimates of the doctor effects or of the doctor-to-doctor variation. Having 10 patients from each of 500 doctors (leading to the same total number of observations) would be preferable.
  • 51. CLASSES OF GENERALIZED LINEAR MODELS
      - General Linear Models (linear regression, ANOVA, ANCOVA): $Y = X\beta + \varepsilon$. Responses independent.
      - Generalized Linear Models (logistic regression, Poisson regression, etc.): $g(Y) = X\beta + \varepsilon$. Responses independent.
      - Linear Mixed Models: $Y = X\beta + Zb + \varepsilon$. Responses correlated; correlation modelled in part by "random effects".
      - Generalized Linear Mixed Models: $g(Y \mid b) = X\beta + Zb + \varepsilon$. Responses correlated; correlation modelled in part by "random effects".
  • 52. PART III: An overview of software
  • 53. IBM SPSS
      - SPSS allows the estimation of several GLIMs.
      - This menu is comprehensive and a bit more complicated than the following one.
  • 54. IBM SPSS
      Addressing GLIMs from the regression menu is more straightforward.
      The model here is multinomial regression, where:
        - Response: categorical (nominal) with more than 2 categories
        - Predictors: any scale
  • 55. SAS - STATA
      These packages share a long-standing tradition and a widespread scientific community. Some advances are 'translated' directly into their procedures. They are rather costly and not too user-friendly. Unfortunately, so far user-friendliness and scientific advancement have tended to be inversely correlated. Both packages are included in our university's campus software.
  • 56. OPEN SOURCE SOFTWARE
      Open source software has been engaging the scientific community for quite some time. Among the extremely user-friendly (and psychologist-friendly) packages developed for the psychology community, we will take a glance at:
        - jamovi (with collaborators from our university)
        - JASP
      R is the open source environment. Several methodological and statistical advances are published in the literature; the corresponding R package is presented in the related scientific journals and then made available to the community, with online documentation and user communities.
      The strength of R, SAS and Stata is that their methods are subject to a vast open discussion in the scientific community, and that they provide copious documentation to the public. See IDRE from UCLA.
  • 57. JAMOVI
      - Some generalised linear models can be found in the two menus ANOVA and Regression.
      - GAMLj: module for GLM, LME and GZLMs in jamovi
  • 58. JASP
  • 59. R: THE INELUDIBLE, THE INEVITABLE
      - Why is R so important in the scientific community?
      - Why is not-so-friendly software (where we need to use a programming language) so relevant for applying statistical methods?
      Estimation can have a domino effect.
  • 60. THE GLM FUNCTION IN R
      Generalized linear models can be fitted in R using the glm function. The glm function is similar to the lm function for fitting linear models. The arguments to a glm call are as follows:
        glm(formula, family = ...)
  • 61. R: THE FORMULA ARGUMENT
      The formula is specified to glm as, e.g., y ~ x1 + x2, where x1 and x2 are the names of:
        - numeric vectors (continuous variables)
        - factors (categorical variables)
      All specified variables must be in the workspace or in the data frame passed to the data argument.
  • 62. FAMILY ARGUMENT
      The family argument takes (the name of) a family function, which specifies:
        - the link function
        - the variance function
        - other components, e.g. linkinv
      The family (exponential family) functions available in R include:
        - binomial (link = "logit")
        - poisson (link = "log")
        - gaussian (link = "identity")
        - inverse.gaussian (link = "1/mu^2")
  • 63. EXTRACTOR FUNCTIONS
      There are several glm or lm methods available for accessing or displaying components of the glm object, including:
        - residuals()
        - fitted()
        - predict()
        - coef()
        - deviance()
        - formula()
        - summary()
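The pieces introduced above (formula, family and extractor functions) can be put together in one short sketch. It assumes the built-in `InsectSprays` data (insect counts under six sprays), chosen only as an example of a count response with a factor predictor; the object names are ours.

```r
# Formula with a factor predictor; Poisson family with its default log link.
fit <- glm(count ~ spray, family = poisson(link = "log"), data = InsectSprays)

# Extractor functions applied to the fitted glm object.
print(formula(fit))          # the model formula
print(coef(fit))             # estimated coefficients (log scale)
print(deviance(fit))         # residual deviance
print(head(fitted(fit)))     # fitted means on the response scale
print(head(residuals(fit)))  # deviance residuals (the default type)

# Prediction for a new observation, on the scale of the counts.
pred_C <- predict(fit, newdata = data.frame(spray = "C"), type = "response")
print(pred_C)

# Full table of coefficients, standard errors and Wald z-tests.
print(summary(fit))
```

With a single factor as predictor, the predicted count for a spray coincides with the observed mean count of that group.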
  • 64. RSTUDIO
  • 65. SOME USEFUL WEBSITES
      - An excellent introduction to R, with full information and instructions: https://stats.idre.ucla.edu/r/
      - Some detailed lessons on the General Linear Model and an introduction to the Generalized Linear Model, with applications in R (data not provided; some lessons are not freely available to the public): https://arc.psych.wisc.edu/courses/610-710/