Generalized Linear Models (GLMs)
November 9 & 12, 2018
Motivation
Benefits of generalized linear models
• The residuals don’t have to be normally distributed
• The response variable can be binary, integer,
strictly-positive, etc...
• The variance is not assumed to be constant
• Useful for manipulative experiments or observational
studies, just like linear models.
Examples
• Presence-absence studies
• Studies of survival
• Seed germination studies
Generalized linear models Logistic regression Poisson regression 2 / 34
Outline
Logistic regression
• The response variable is usually binary and modeled with a
binomial distribution
• The probability of success is usually a logit-linear model
Poisson regression
• The response variable is a non-negative integer modeled with
a Poisson distribution
• The expected count is usually modeled with a log-linear model
From linear to generalized linear
Linear model
µi = β0 + β1xi1 + β2xi2 + · · ·
yi ∼ Normal(µi, σ2)
Generalized Linear model
g(µi) = β0 + β1xi1 + β2xi2 + · · ·
yi ∼ f(µi)
where
g is a link function, such as the log or logit link
f is a probability distribution such as the binomial or Poisson
Alternative representations
This:
g(µi) = β0 + β1xi1 + β2xi2 + · · ·
yi ∼ f(µi)
Is the same as this:
µi = g−1(β0 + β1xi1 + β2xi2 + · · · )
yi ∼ f(µi)
Is the same as this:
g(µi) = Xβ
yi ∼ f(µi)
Link functions
An inverse link function (g−1) transforms values from the (−∞, ∞)
scale to the scale of interest, such as (0, 1) for probabilities
The link function (g) does the reverse
Link functions
Distribution   link name¹   link equation      inverse link equation
Binomial       logit        log(p / (1 − p))   exp(Xβ) / (1 + exp(Xβ))
Poisson        log          log(λ)             exp(Xβ)

Distribution   link name    link in R   inverse link in R
Binomial       logit        qlogis      plogis
Poisson        log          log         exp

¹ These are the most common link functions, but others are available.
Logit link example
beta0 <- 5
beta1 <- -0.08
elevation <- 100
(logit.p <- beta0 + beta1*elevation)
## [1] -3
How do we convert -3 to a probability? Use the inverse-link:
p <- exp(logit.p)/(1+exp(logit.p))
p
## [1] 0.04742587
Same as:
plogis(logit.p)
## [1] 0.04742587
To go back, use the link function itself:
log(p/(1-p))
## [1] -3
qlogis(p)
## [1] -3
Logit link example
plot(function(x) 5 + -0.08*x, from=0, to=100,
xlab="Elevation", ylab="logit(prob of occurrence)")
[Plot: logit(prob of occurrence) vs. Elevation (0–100); a straight line decreasing from 5 to −3]
Logit link example
plot(function(x) plogis(5 + -0.08*x), from=0, to=100,
xlab="Elevation", ylab="Probability of occurrence")
[Plot: Probability of occurrence vs. Elevation (0–100); a decreasing logistic curve]
Logistic Regression
Logistic regression is a specific type of GLM in which the response
variable follows a binomial distribution and the link function is the
logit
It would be better to call it “binomial regression” since other link
functions (e.g. the probit) can be used
Appropriate when the response is binary or a count with an upper
limit
Examples:
• Presence/absence studies
• Survival studies
• Disease prevalence studies
Logistic Regression
logit(pi) = β0 + β1xi1 + β2xi2 + · · ·
yi ∼ Binomial(N, pi)
where:
N is the number of “trials” (e.g. coin flips)
pi is the probability of success for sample unit i
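As a quick check of the notation, one can simulate Bernoulli data from a logit-linear model and recover the coefficients with glm. A minimal sketch (the intercept 5 and slope −0.08 are just the values from the earlier logit-link example, not estimates from real data):

```r
set.seed(1)
x <- runif(30, 0, 100)          # a covariate such as elevation
p <- plogis(5 - 0.08*x)         # inverse-logit of the linear predictor
y <- rbinom(30, size=1, prob=p) # Bernoulli response (N = 1 trial each)
glm(y ~ x, family=binomial)     # estimates should be near 5 and -0.08
```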
Binomial distribution
[Bar plot: Binomial(N = 5, p = 0.5) probability mass function; x-axis: number of 'successes' (0–5), y-axis: probability]
Binomial distribution
[Bar plot: Binomial(N = 5, p = 0.9) probability mass function; x-axis: number of 'successes' (0–5), y-axis: probability]
Binomial Distribution
Properties
• The expected value of y is Np
• The variance is Np(1 − p)
Bernoulli distribution
• The Bernoulli distribution is a binomial distribution with a
single trial (N = 1)
• Logistic regression is usually done in this context, such that
the response variable is 0/1, No/Yes, Bad/Good, etc.
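Both properties can be checked numerically from the probability mass function; a small sketch:

```r
N <- 5; p <- 0.9
y <- 0:N
pmf <- dbinom(y, size=N, prob=p)
sum(y * pmf)            # expected value: N*p = 4.5
sum((y - N*p)^2 * pmf)  # variance: N*p*(1-p) = 0.45
```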
Worked example using glm
head(frogData, n=25)
## presence abundance elevation habitat
## 1 0 0 58 Oak
## 2 0 3 191 Oak
## 3 0 0 43 Oak
## 4 1 15 374 Oak
## 5 1 7 337 Oak
## 6 0 0 64 Oak
## 7 1 1 195 Oak
## 8 0 1 263 Oak
## 9 1 3 181 Oak
## 10 1 3 59 Oak
## 11 1 60 489 Maple
## 12 1 9 317 Maple
## 13 0 0 12 Maple
## 14 1 4 245 Maple
## 15 1 38 474 Maple
## 16 0 0 83 Maple
## 17 1 42 467 Maple
## 18 1 52 485 Maple
## 19 1 12 335 Maple
## 20 1 1 20 Maple
## 21 1 31 430 Pine
## 22 0 1 223 Pine
## 23 0 0 68 Pine
## 24 1 47 483 Pine
## 25 1 0 78 Pine
First we will model the presence-absence response variable to determine if elevation and habitat affect the probability of occurrence. Then we will model abundance.
Raw data
plot(presence ~ elevation, frogData,
xlab="Elevation", ylab="Frog Occurrence")
[Scatter plot: Frog Occurrence (0/1) vs. Elevation (0–500)]
Raw data
group.prop <- tapply(frogData$presence, frogData$habitat, mean)
barplot(group.prop, ylab="Proportion of sites with frogs")
[Bar plot: proportion of sites with frogs by habitat (Oak, Maple, Pine), ranging up to about 0.8]
The function glm
fm1 <- glm(presence ~ habitat + elevation,
family=binomial(link="logit"), data=frogData)
summary(fm1)
##
## Call:
## glm(formula = presence ~ habitat + elevation, family = binomial(link = "logit"),
## data = frogData)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.6608 -0.7663 0.1610 0.5031 1.7773
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.053609 1.092854 -1.879 0.0602 .
## habitatMaple 1.220668 1.324680 0.921 0.3568
## habitatPine 0.281932 1.107228 0.255 0.7990
## elevation 0.011950 0.004774 2.503 0.0123 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 39.429 on 29 degrees of freedom
## Residual deviance: 25.577 on 26 degrees of freedom
## AIC: 33.577
##
## Number of Fisher Scoring iterations: 6
Occurrence probability and elevation
newdat <- data.frame(elevation=seq(12, 489, length=50),
habitat="Oak")
head(newdat)
## elevation habitat
## 1 12.00000 Oak
## 2 21.73469 Oak
## 3 31.46939 Oak
## 4 41.20408 Oak
## 5 50.93878 Oak
## 6 60.67347 Oak
To get confidence intervals on the (0, 1) scale, predict on the linear (link)
scale and then back-transform using the inverse link:
pred.link <- predict(fm1, newdata=newdat, se.fit=TRUE, type="link")
newdat$mu <- plogis(pred.link$fit)
newdat$lower <- plogis(pred.link$fit - 1.96*pred.link$se.fit)
newdat$upper <- plogis(pred.link$fit + 1.96*pred.link$se.fit)
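The fitted curve and interval can then be drawn over the raw data with base graphics; a sketch (assuming frogData and the newdat object built above):

```r
plot(presence ~ elevation, frogData,
     xlab="Elevation", ylab="Probability of occurrence")
lines(newdat$elevation, newdat$mu)           # fitted occurrence probability
lines(newdat$elevation, newdat$lower, lty=2) # lower 95% confidence limit
lines(newdat$elevation, newdat$upper, lty=2) # upper 95% confidence limit
```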
Occurrence probability and elevation
[Plot: fitted probability of occurrence vs. elevation (0–500) with 95% confidence band, overlaid on the observed presence/absence data]
Poisson Regression
log(λi) = β0 + β1xi1 + β2xi2 + · · ·
yi ∼ Poisson(λi)
where:
λi is the expected value of yi
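The same simulate-and-recover check works for the Poisson model; a sketch reusing the coefficients from the earlier link examples (purely illustrative values):

```r
set.seed(1)
x <- runif(30, 0, 100)
lambda <- exp(5 - 0.08*x)   # inverse of the log link
y <- rpois(30, lambda)
glm(y ~ x, family=poisson)  # estimates should be near 5 and -0.08
```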
Poisson regression
Useful for:
• Count data
• Number of events in time intervals
• Other types of integer data
Properties
• The expected value of y (λ) is equal to the variance
• This is an assumption of the Poisson model
• Like all assumptions, it can be relaxed if you have enough
data
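The mean-variance equality can be verified from the probability mass function; a small sketch:

```r
lambda <- 5
y <- 0:100                 # support truncated far into the tail
pmf <- dpois(y, lambda)
sum(y * pmf)               # mean: essentially 5
sum((y - lambda)^2 * pmf)  # variance: essentially 5
```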
Poisson distribution
[Bar plots: Poisson probability mass functions for λ = 1, λ = 5, and λ = 10; x-axis: response variable (0–25), y-axis: probability. The distribution shifts right and spreads as λ increases]
Log link example
plot(function(x) 5 + -0.08*x, from=0, to=100,
xlab="Elevation", ylab="log(Expected abundance)")
[Plot: log(Expected abundance) vs. Elevation (0–100); a straight line decreasing from 5 to −3]
Log link example
plot(function(x) exp(5 + -0.08*x), from=0, to=100,
xlab="Elevation", ylab="Expected abundance")
[Plot: Expected abundance vs. Elevation (0–100); an exponentially decreasing curve]
The function glm
fm2 <- glm(abundance ~ habitat + elevation,
family=poisson(link="log"), data=frogData)
summary(fm2)
##
## Call:
## glm(formula = abundance ~ habitat + elevation, family = poisson(link = "log"),
## data = frogData)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.8207 -0.9818 -0.1200 0.6251 2.3868
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.284839 0.267329 -4.806 1.54e-06 ***
## habitatMaple 0.262192 0.215133 1.219 0.223
## habitatPine 0.229873 0.216865 1.060 0.289
## elevation 0.010211 0.000677 15.084 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 691.975 on 29 degrees of freedom
## Residual deviance: 28.057 on 26 degrees of freedom
## AIC: 123.21
##
## Number of Fisher Scoring iterations: 5
Prediction
newdat <- data.frame(elevation=seq(12, 489, length=50),
habitat="Oak")
head(newdat)
## elevation habitat
## 1 12.00000 Oak
## 2 21.73469 Oak
## 3 31.46939 Oak
## 4 41.20408 Oak
## 5 50.93878 Oak
## 6 60.67347 Oak
To get confidence intervals on the (0, ∞) scale, predict on the linear (link)
scale and then back-transform using the inverse link:
pred.link <- predict(fm2, newdata=newdat, se.fit=TRUE, type="link")
newdat$mu <- exp(pred.link$fit)
newdat$lower <- exp(pred.link$fit - 1.96*pred.link$se.fit)
newdat$upper <- exp(pred.link$fit + 1.96*pred.link$se.fit)
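As before, the predictions can be drawn over the raw counts; a sketch (assuming frogData and the newdat object above):

```r
plot(abundance ~ elevation, frogData,
     xlab="Elevation", ylab="Abundance")
lines(newdat$elevation, newdat$mu)           # fitted expected abundance
lines(newdat$elevation, newdat$lower, lty=2) # lower 95% confidence limit
lines(newdat$elevation, newdat$upper, lty=2) # upper 95% confidence limit
```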
Prediction
[Plot: fitted abundance vs. elevation (0–500) with 95% confidence band, overlaid on the observed counts (0–100)]
Assessing model fit
The most common problem in Poisson regression is overdispersion.
Overdispersion is the situation in which there is more variability in
the data than predicted by the model.
Overdispersion cannot be assessed by simply comparing the mean
and variance of the response variable.
For example, the presence of many zeros is not necessarily
indicative of overdispersion.
Overdispersion can be assessed using a goodness-of-fit test.
Goodness-of-fit
The fit of a Poisson regression can be assessed using a χ2 test.
The test statistic is the residual deviance:
D = 2 ∑i [ yi log(yi / λ̂i) − (yi − λ̂i) ]
If the null hypothesis is true (i.e., the model fits the data), D should
follow a χ2 distribution with N − K degrees of freedom.
N <- nrow(frogData) # sample size
K <- length(coef(fm2)) # number of parameters
df.resid <- N-K # degrees-of-freedom
Dev <- deviance(fm2) # residual deviance
p.value <- 1-pchisq(Dev, df=df.resid) # p-value
p.value # fail to reject H0
## [1] 0.3556428
χ2 distribution and residual deviance
curve(dchisq(x, df=df.resid), from=0, to=50, xlab="Deviance", ylab="Density")
abline(v=Dev, lwd=3, col="red")
[Plot: χ2 density with df = 26; x-axis: deviance (0–50), y-axis: density; a vertical red line marks the residual deviance]
The red line is the residual deviance. We fail to reject the null
hypothesis, and we conclude that the Poisson model fits the data well.
What if model doesn’t fit the data?
Alternatives to the Poisson distribution
• Negative binomial
• Zero-inflated Poisson
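A negative binomial GLM can be fit with glm.nb from the MASS package; a minimal sketch reusing the frogData example (this fit is not shown in the original slides):

```r
library(MASS)  # provides glm.nb for negative binomial GLMs
fm3 <- glm.nb(abundance ~ habitat + elevation, data=frogData)
summary(fm3)   # 'theta' is the estimated dispersion parameter
```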
Negative binomial distribution
[Bar plots: NegBin(λ = 2) probability mass functions for α = 10, α = 5, and α = 0.1; x-axis: response variable (0–25), y-axis: probability. The distribution becomes more dispersed as α decreases]

Introduction to Generalized Linear Models

  • 1.
    Generalized Linear Models(GLMs) November 9 & 12, 2018
  • 2.
    Motivation Benefits of generalizedlinear models Generalized linear models Logistic regression Poisson regression 2 / 34
  • 3.
    Motivation Benefits of generalizedlinear models • The residuals don’t have to be normally distributed Generalized linear models Logistic regression Poisson regression 2 / 34
  • 4.
    Motivation Benefits of generalizedlinear models • The residuals don’t have to be normally distributed • The response variable can be binary, integer, strictly-positive, etc... Generalized linear models Logistic regression Poisson regression 2 / 34
  • 5.
    Motivation Benefits of generalizedlinear models • The residuals don’t have to be normally distributed • The response variable can be binary, integer, strictly-positive, etc... • The variance is not assumed to be constant Generalized linear models Logistic regression Poisson regression 2 / 34
  • 6.
    Motivation Benefits of generalizedlinear models • The residuals don’t have to be normally distributed • The response variable can be binary, integer, strictly-positive, etc... • The variance is not assumed to be constant • Useful for manipulative experiments or observational studies, just like linear models. Generalized linear models Logistic regression Poisson regression 2 / 34
  • 7.
    Motivation Benefits of generalizedlinear models • The residuals don’t have to be normally distributed • The response variable can be binary, integer, strictly-positive, etc... • The variance is not assumed to be constant • Useful for manipulative experiments or observational studies, just like linear models. Examples • Presence-absence studies • Studies of survival • Seed germination studies Generalized linear models Logistic regression Poisson regression 2 / 34
  • 8.
    Outline Logistic regression • Theresponse variable is usually binary and modeled with a binomial distribution • The probability of success is usually a logit-linear model Generalized linear models Logistic regression Poisson regression 3 / 34
  • 9.
    Outline Logistic regression • Theresponse variable is usually binary and modeled with a binomial distribution • The probability of success is usually a logit-linear model Poisson regression • The response variable is a non-negative integer modeled with a Poisson distribution • The expected count is usually modeled with a log-linear model Generalized linear models Logistic regression Poisson regression 3 / 34
  • 10.
    From linear togeneralized linear Linear model µi = β0 + β1xi1 + β2xi2 + · · · yi ∼ Normal(µi, σ2 ) Generalized linear models Logistic regression Poisson regression 4 / 34
  • 11.
    From linear togeneralized linear Linear model µi = β0 + β1xi1 + β2xi2 + · · · yi ∼ Normal(µi, σ2 ) Generalized Linear model g(µi) = β0 + β1xi1 + β2xi2 + · · · yi ∼ f(µi) Generalized linear models Logistic regression Poisson regression 4 / 34
  • 12.
    From linear togeneralized linear Linear model µi = β0 + β1xi1 + β2xi2 + · · · yi ∼ Normal(µi, σ2 ) Generalized Linear model g(µi) = β0 + β1xi1 + β2xi2 + · · · yi ∼ f(µi) where g is a link function, such as the log or logit link Generalized linear models Logistic regression Poisson regression 4 / 34
  • 13.
    From linear togeneralized linear Linear model µi = β0 + β1xi1 + β2xi2 + · · · yi ∼ Normal(µi, σ2 ) Generalized Linear model g(µi) = β0 + β1xi1 + β2xi2 + · · · yi ∼ f(µi) where g is a link function, such as the log or logit link f is a probability distribution such as the binomial or Poisson Generalized linear models Logistic regression Poisson regression 4 / 34
  • 14.
    Alternative representations This: g(µi) =β0 + β1xi1 + β2xi2 + · · · yi ∼ f(µi) Generalized linear models Logistic regression Poisson regression 5 / 34
  • 15.
    Alternative representations This: g(µi) =β0 + β1xi1 + β2xi2 + · · · yi ∼ f(µi) Is the same as this: µi = g−1 (β0 + β1xi1 + β2xi2 + · · · ) yi ∼ f(µi) Generalized linear models Logistic regression Poisson regression 5 / 34
  • 16.
    Alternative representations This: g(µi) =β0 + β1xi1 + β2xi2 + · · · yi ∼ f(µi) Is the same as this: µi = g−1 (β0 + β1xi1 + β2xi2 + · · · ) yi ∼ f(µi) Is the same as this: g(µi) = Xβ yi ∼ f(µi) Generalized linear models Logistic regression Poisson regression 5 / 34
  • 17.
    Link functions An inverselink function (g−1) transforms values from the (−∞, ∞) scale to the scale of interest, such as (0, 1) for probabilities Generalized linear models Logistic regression Poisson regression 6 / 34
  • 18.
    Link functions An inverselink function (g−1) transforms values from the (−∞, ∞) scale to the scale of interest, such as (0, 1) for probabilities The link function (g) does the reverse Generalized linear models Logistic regression Poisson regression 6 / 34
  • 19.
    Link functions Distribution linkname1 link equation inverse link equation Binomial logit log( p 1−p ) exp(Xβ) 1+exp(Xβ) Poisson log log(λ) exp(Xβ) 1 These are the most common link functions, but others are available Generalized linear models Logistic regression Poisson regression 7 / 34
  • 20.
    Link functions Distribution linkname1 link equation inverse link equation Binomial logit log( p 1−p ) exp(Xβ) 1+exp(Xβ) Poisson log log(λ) exp(Xβ) Distribution link name link in R inv link in R Binomial logit qlogis plogis Poisson log log exp 1 These are the most common link functions, but others are available Generalized linear models Logistic regression Poisson regression 7 / 34
  • 21.
    Logit link example beta0<- 5 beta1 <- -0.08 elevation <- 100 (logit.p <- beta0 + beta1*elevation) ## [1] -3 Generalized linear models Logistic regression Poisson regression 8 / 34
  • 22.
    Logit link example beta0<- 5 beta1 <- -0.08 elevation <- 100 (logit.p <- beta0 + beta1*elevation) ## [1] -3 How do we convert -3 to a probability? Generalized linear models Logistic regression Poisson regression 8 / 34
  • 23.
    Logit link example beta0<- 5 beta1 <- -0.08 elevation <- 100 (logit.p <- beta0 + beta1*elevation) ## [1] -3 How do we convert -3 to a probability? Use the inverse-link: p <- exp(logit.p)/(1+exp(logit.p)) p ## [1] 0.04742587 Generalized linear models Logistic regression Poisson regression 8 / 34
  • 24.
    Logit link example beta0<- 5 beta1 <- -0.08 elevation <- 100 (logit.p <- beta0 + beta1*elevation) ## [1] -3 How do we convert -3 to a probability? Use the inverse-link: p <- exp(logit.p)/(1+exp(logit.p)) p ## [1] 0.04742587 Same as: plogis(logit.p) ## [1] 0.04742587 Generalized linear models Logistic regression Poisson regression 8 / 34
  • 25.
    Logit link example beta0<- 5 beta1 <- -0.08 elevation <- 100 (logit.p <- beta0 + beta1*elevation) ## [1] -3 How do we convert -3 to a probability? Use the inverse-link: p <- exp(logit.p)/(1+exp(logit.p)) p ## [1] 0.04742587 Same as: plogis(logit.p) ## [1] 0.04742587 To go back, use the link function itself: log(p/(1-p)) ## [1] -3 qlogis(p) ## [1] -3 Generalized linear models Logistic regression Poisson regression 8 / 34
  • 26.
    Logit link example plot(function(x)5 + -0.08*x, from=0, to=100, xlab="Elevation", ylab="logit(prob of occurrence)") 0 20 40 60 80 100 −2024 Elevation logit(probofoccurrence) Generalized linear models Logistic regression Poisson regression 9 / 34
  • 27.
    Logit link example plot(function(x)plogis(5 + -0.08*x), from=0, to=100, xlab="Elevation", ylab="Probability of occurrence") 0 20 40 60 80 100 0.20.40.60.81.0 Elevation Probabilityofoccurrence Generalized linear models Logistic regression Poisson regression 10 / 34
  • 28.
    Logistic Regression Logistic regressionis a specific type of GLM in which the response variable follows a binomial distribution and the link function is the logit Generalized linear models Logistic regression Poisson regression 11 / 34
  • 29.
    Logistic Regression Logistic regressionis a specific type of GLM in which the response variable follows a binomial distribution and the link function is the logit It would be better to call it “binomial regression” since other link functions (e.g. the probit) can be used Generalized linear models Logistic regression Poisson regression 11 / 34
  • 30.
    Logistic Regression Logistic regressionis a specific type of GLM in which the response variable follows a binomial distribution and the link function is the logit It would be better to call it “binomial regression” since other link functions (e.g. the probit) can be used Appropriate when the response is binary or a count with an upper limit Generalized linear models Logistic regression Poisson regression 11 / 34
  • 31.
Logistic Regression

Logistic regression is a specific type of GLM in which the response variable follows a binomial distribution and the link function is the logit.

It would be better to call it "binomial regression" since other link functions (e.g. the probit) can be used.

Appropriate when the response is binary or a count with an upper limit.

Examples:
• Presence/absence studies
• Survival studies
• Disease prevalence studies
Logistic Regression

logit(pi) = β0 + β1 xi1 + β2 xi2 + · · ·

yi ∼ Binomial(N, pi)

where:
N is the number of "trials" (e.g. coin flips)
pi is the probability of success for sample unit i
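To make the model concrete, here is a hedged Python sketch that simulates Bernoulli (N = 1) responses from a logit-linear model. The coefficients are made up for illustration, not fitted values from any data:

```python
import random
from math import exp

random.seed(42)

def inv_logit(x):
    return 1 / (1 + exp(-x))

# Hypothetical coefficients, chosen for illustration only
beta0, beta1 = 5.0, -0.08
N = 1  # Bernoulli case: a single "trial" per sample unit

elevations = [random.uniform(0, 100) for _ in range(1000)]
y = []
for x in elevations:
    p = inv_logit(beta0 + beta1 * x)  # logit-linear success probability
    y.append(sum(random.random() < p for _ in range(N)))  # Binomial(N, p) draw

# With beta1 < 0, low-elevation sites should be occupied far more often
low = [yi for yi, x in zip(y, elevations) if x < 30]
high = [yi for yi, x in zip(y, elevations) if x > 70]
print(sum(low) / len(low), sum(high) / len(high))
```

Simulating from the model like this is also a useful way to check that you understand what the fitted coefficients imply.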
Binomial distribution

[Plot: Binomial(N=5, p=0.5) probability mass function; x-axis "Number of 'successes'" (0-5), y-axis "Probability"]
Binomial distribution

[Plot: Binomial(N=5, p=0.9) probability mass function; x-axis "Number of 'successes'" (0-5), y-axis "Probability"]
Binomial Distribution

Properties
• The expected value of y is Np
• The variance is Np(1 − p)

Bernoulli distribution
• The Bernoulli distribution is a binomial distribution with a single trial (N = 1)
• Logistic regression is usually done in this context, such that the response variable is 0/1 or No/Yes or Bad/Good, etc...
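These two moments are easy to verify numerically by summing over the binomial probability mass function. A small Python check (the helper name is illustrative):

```python
from math import comb

def binom_pmf(k, N, p):
    """P(y = k) for a Binomial(N, p) random variable."""
    return comb(N, k) * p**k * (1 - p)**(N - k)

N, p = 5, 0.9
mean = sum(k * binom_pmf(k, N, p) for k in range(N + 1))
var = sum((k - mean)**2 * binom_pmf(k, N, p) for k in range(N + 1))
print(mean, var)  # should match N*p = 4.5 and N*p*(1-p) = 0.45
```

Setting N = 1 recovers the Bernoulli case, with mean p and variance p(1 − p).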
Worked example using glm

head(frogData, n=25)
##    presence abundance elevation habitat
## 1         0         0        58     Oak
## 2         0         3       191     Oak
## 3         0         0        43     Oak
## 4         1        15       374     Oak
## 5         1         7       337     Oak
## 6         0         0        64     Oak
## 7         1         1       195     Oak
## 8         0         1       263     Oak
## 9         1         3       181     Oak
## 10        1         3        59     Oak
## 11        1        60       489   Maple
## 12        1         9       317   Maple
## 13        0         0        12   Maple
## 14        1         4       245   Maple
## 15        1        38       474   Maple
## 16        0         0        83   Maple
## 17        1        42       467   Maple
## 18        1        52       485   Maple
## 19        1        12       335   Maple
## 20        1         1        20   Maple
## 21        1        31       430    Pine
## 22        0         1       223    Pine
## 23        0         0        68    Pine
## 24        1        47       483    Pine
## 25        1         0        78    Pine

First we will model the presence-absence response variable to determine if elevation and habitat affect the probability of occurrence. Then we will model abundance.
Raw data

plot(presence ~ elevation, frogData,
     xlab="Elevation", ylab="Frog Occurrence")

[Plot: frog occurrence (0/1) plotted against elevation, 0-500]
Raw data

group.prop <- tapply(frogData$presence, frogData$habitat, mean)
barplot(group.prop, ylab="Proportion of sites with frogs")

[Barplot: proportion of occupied sites by habitat (Oak, Maple, Pine)]
The function glm

fm1 <- glm(presence ~ habitat + elevation,
           family=binomial(link="logit"), data=frogData)

summary(fm1)
## 
## Call:
## glm(formula = presence ~ habitat + elevation, family = binomial(link = "logit"), 
##     data = frogData)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.6608  -0.7663   0.1610   0.5031   1.7773  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)  
## (Intercept)  -2.053609   1.092854  -1.879   0.0602 .
## habitatMaple  1.220668   1.324680   0.921   0.3568  
## habitatPine   0.281932   1.107228   0.255   0.7990  
## elevation     0.011950   0.004774   2.503   0.0123 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 39.429  on 29  degrees of freedom
## Residual deviance: 25.577  on 26  degrees of freedom
## AIC: 33.577
## 
## Number of Fisher Scoring iterations: 6
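Because the model is linear on the logit scale, exponentiating a coefficient gives an odds ratio. A quick Python sanity check using the elevation and habitatMaple estimates from the summary above (this interpretation step is a sketch, not part of the original analysis):

```python
from math import exp

# Coefficients copied from the summary(fm1) output
beta_elev = 0.011950
beta_maple = 1.220668

# exp(beta) is the multiplicative change in the ODDS of occurrence
print(exp(beta_elev))        # odds ratio per 1 unit of elevation (~1.012)
print(exp(beta_elev * 100))  # odds ratio per 100 units of elevation (~3.3)
print(exp(beta_maple))       # odds in Maple relative to Oak (~3.4)
```

Note that these are odds ratios, not probability ratios; the effect on the probability scale depends on where you are on the curve.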
Occurrence probability and elevation

newdat <- data.frame(elevation=seq(12, 489, length=50), habitat="Oak")
head(newdat)
##   elevation habitat
## 1  12.00000     Oak
## 2  21.73469     Oak
## 3  31.46939     Oak
## 4  41.20408     Oak
## 5  50.93878     Oak
## 6  60.67347     Oak

To get confidence intervals on the (0,1) scale, predict on the linear (link) scale and then backtransform using the inverse-link:

pred.link <- predict(fm1, newdata=newdat, se.fit=TRUE, type="link")
newdat$mu <- plogis(pred.link$fit)
newdat$lower <- plogis(pred.link$fit - 1.96*pred.link$se.fit)
newdat$upper <- plogis(pred.link$fit + 1.96*pred.link$se.fit)
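The backtransform logic can be sketched in Python for a single prediction. The `fit` and `se` values below are invented, standing in for the output of predict(..., type="link"); the point is that the interval is built on the link scale, so after backtransforming it stays inside (0, 1) and is asymmetric:

```python
from math import exp

def inv_logit(x):
    return 1 / (1 + exp(-x))

# Hypothetical link-scale prediction and standard error for one new site
fit, se = -0.5, 0.4   # made-up numbers for illustration

mu    = inv_logit(fit)               # point estimate on the probability scale
lower = inv_logit(fit - 1.96 * se)   # Wald 95% CI endpoints, backtransformed
upper = inv_logit(fit + 1.96 * se)
print(round(lower, 3), round(mu, 3), round(upper, 3))
```

Computing mu ± 1.96·se directly on the probability scale could produce endpoints outside (0, 1), which is exactly what this construction avoids.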
Occurrence probability and elevation

[Plot: fitted probability of occurrence vs. elevation for Oak habitat, with 95% confidence band, overlaid on the observed 0/1 data]
Poisson Regression

log(λi) = β0 + β1 xi1 + β2 xi2 + · · ·

yi ∼ Poisson(λi)

where:
λi is the expected value of yi
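A hedged Python sketch of data arising from this model: counts are drawn from a Poisson whose mean is exp(linear predictor). The coefficients are invented for illustration, and the draw uses Knuth's classic multiplication algorithm rather than any library:

```python
import random
from math import exp

random.seed(1)

# Hypothetical coefficients for illustration only
beta0, beta1 = 1.0, 0.01

def poisson_draw(lam):
    """Knuth's algorithm for one Poisson(lam) random draw."""
    L, k, p = exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

x = 200                        # one covariate value, e.g. an elevation
lam = exp(beta0 + beta1 * x)   # log link: lambda = exp(linear predictor)
sample = [poisson_draw(lam) for _ in range(5000)]
print(lam, sum(sample) / len(sample))  # sample mean should be close to lambda
```

The log link guarantees λ > 0 for any value of the linear predictor, just as the logit guarantees p in (0, 1).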
Poisson regression

Useful for:
• Count data
• Number of events in time intervals
• Other types of integer data

Properties
• The expected value of y (λ) is equal to the variance
• This is an assumption of the Poisson model
• Like all assumptions, it can be relaxed if you have enough data
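The mean-equals-variance property can be checked directly from the Poisson probability mass function. A small Python verification, truncating the infinite sum at a point where the remaining tail mass is negligible:

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(y = k) for a Poisson(lam) random variable."""
    return lam**k * exp(-lam) / factorial(k)

lam = 5.0
ks = range(0, 60)  # tail mass beyond 60 is negligible for lam = 5
mean = sum(k * poisson_pmf(k, lam) for k in ks)
var = sum((k - mean)**2 * poisson_pmf(k, lam) for k in ks)
print(mean, var)  # both equal lambda
```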
Poisson distribution

[Plots: Poisson probability mass functions for λ = 1, λ = 5, and λ = 10; as λ increases, the distribution shifts right and becomes more symmetric]
Log link example

plot(function(x) 5 + -0.08*x, from=0, to=100,
     xlab="Elevation", ylab="log(Expected abundance)")

[Plot: log of expected abundance declining linearly with elevation, from 5 at elevation 0 to -3 at elevation 100]
Log link example

plot(function(x) exp(5 + -0.08*x), from=0, to=100,
     xlab="Elevation", ylab="Expected abundance")

[Plot: expected abundance declining exponentially with elevation, from about 148 at elevation 0 toward 0 at elevation 100]
The function glm

fm2 <- glm(abundance ~ habitat + elevation,
           family=poisson(link="log"), data=frogData)

summary(fm2)
## 
## Call:
## glm(formula = abundance ~ habitat + elevation, family = poisson(link = "log"), 
##     data = frogData)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.8207  -0.9818  -0.1200   0.6251   2.3868  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -1.284839   0.267329  -4.806 1.54e-06 ***
## habitatMaple  0.262192   0.215133   1.219    0.223    
## habitatPine   0.229873   0.216865   1.060    0.289    
## elevation     0.010211   0.000677  15.084  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 691.975  on 29  degrees of freedom
## Residual deviance:  28.057  on 26  degrees of freedom
## AIC: 123.21
## 
## Number of Fisher Scoring iterations: 5
Prediction

newdat <- data.frame(elevation=seq(12, 489, length=50), habitat="Oak")
head(newdat)
##   elevation habitat
## 1  12.00000     Oak
## 2  21.73469     Oak
## 3  31.46939     Oak
## 4  41.20408     Oak
## 5  50.93878     Oak
## 6  60.67347     Oak

To get confidence intervals on the (0,∞) scale, predict on the linear (link) scale and then backtransform using the inverse-link:

pred.link <- predict(fm2, newdata=newdat, se.fit=TRUE, type="link")
newdat$mu <- exp(pred.link$fit)
newdat$lower <- exp(pred.link$fit - 1.96*pred.link$se.fit)
newdat$upper <- exp(pred.link$fit + 1.96*pred.link$se.fit)
Prediction

[Plot: fitted expected abundance vs. elevation for Oak habitat, with 95% confidence band, overlaid on the observed counts]
Assessing model fit

The most common problem in Poisson regression is overdispersion.

Overdispersion is the situation in which there is more variability in the data than predicted by the model.

Overdispersion cannot be assessed by simply comparing the mean and variance of the response variable. For example, the presence of many zeros is not necessarily indicative of overdispersion.

Overdispersion can be assessed using a goodness-of-fit test.
Goodness-of-fit

The fit of a Poisson regression can be assessed using a χ2 test. The test statistic is the residual deviance:

D = 2 Σi [ yi log(yi / λ̂i) − (yi − λ̂i) ]

If the null hypothesis is true (i.e., the model fits the data), D should follow a χ2 distribution with N − K degrees-of-freedom.

N <- nrow(frogData)                     # sample size
K <- length(coef(fm2))                  # number of parameters
df.resid <- N-K                         # degrees-of-freedom
Dev <- deviance(fm2)                    # residual deviance
p.value <- 1-pchisq(Dev, df=df.resid)   # p-value
p.value                                 # fail to reject H0
## [1] 0.3556428
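The same test can be sketched in Python. The function below implements the deviance formula from this slide, with the usual convention that the term yi log(yi/λ̂i) is 0 when yi = 0. The counts and fitted means are toy numbers, not the frog data, and the p-value step assumes SciPy is available:

```python
from math import log
from scipy.stats import chi2

def poisson_deviance(y, lam_hat):
    """Residual deviance: D = 2 * sum[ y*log(y/lam) - (y - lam) ]."""
    D = 0.0
    for yi, li in zip(y, lam_hat):
        term = yi * log(yi / li) if yi > 0 else 0.0  # 0*log(0/l) -> 0
        D += 2 * (term - (yi - li))
    return D

# Toy counts and fitted means, invented for illustration
y       = [0, 3, 1, 7, 4, 2, 6, 0, 5, 2]
lam_hat = [1.0, 2.5, 1.5, 6.0, 4.5, 2.0, 5.5, 0.8, 4.0, 2.2]

D = poisson_deviance(y, lam_hat)
df = len(y) - 2            # N - K, pretending K = 2 parameters were estimated
p_value = chi2.sf(D, df)   # P(chi2_df > D); a large p-value means no evidence of lack of fit
print(round(D, 3), round(p_value, 3))
```

Each observation's contribution to D is nonnegative and is zero only when yi = λ̂i, so D measures total discrepancy between data and fit.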
χ2 distribution and residual deviance

curve(dchisq(x, df=df.resid), from=0, to=50,
      xlab="Deviance", ylab="Density")
abline(v=Dev, lwd=3, col="red")

[Plot: χ2 density with 26 degrees-of-freedom; the red vertical line marks the residual deviance]

The red line is the residual deviance. We fail to reject the null hypothesis, and we conclude that the Poisson model fits the data well.
What if the model doesn't fit the data?

Alternatives to the Poisson distribution
• Negative binomial
• Zero-inflated Poisson
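One way to see why the negative binomial handles overdispersion: with mean λ and dispersion parameter α, its variance is λ + λ²/α, so small α means strong extra-Poisson variation. A Python sketch using one common mapping onto SciPy's (n, p) parameterization — the mapping below is an assumption about how the slides' α is defined, not something stated in the slides:

```python
from scipy.stats import nbinom

lam = 2.0
for alpha in [10.0, 5.0, 0.1]:
    # Map (lam, alpha) onto SciPy's (n, p): n = alpha, p = alpha / (alpha + lam)
    d = nbinom(alpha, alpha / (alpha + lam))
    print(alpha, d.mean(), d.var())  # mean stays lam; variance grows as alpha shrinks
```

As α → ∞ the variance collapses to λ and the negative binomial approaches the Poisson, which is why it is a natural drop-in replacement when a goodness-of-fit test rejects the Poisson.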
Negative binomial distribution

[Plots: NegBin(λ = 2, α = 10), NegBin(λ = 2, α = 5), and NegBin(λ = 2, α = 0.1) probability mass functions; as α decreases, the distribution becomes more dispersed, with a longer right tail]