2. Offset Regression
A variant of Poisson Regression
Count data often have an exposure variable, which indicates the number
of times the event could have happened
This variable should be incorporated into a Poisson model with the use of
the offset option
3. Offset Regression
If all the students have the same exposure to the math program, their
award counts are directly comparable
But if there is variation in the exposure, it could affect the count
A count of 5 awards in 5 years is a much higher rate than a count of 1
award in 3 years
Rate of awards is count/exposure
In a model for the awards count, the exposure is moved to the right side
Then, once the logarithm is taken of both the count and the exposure, the
final model contains ln(exposure) as a term that is added to the regression
equation, with its coefficient fixed at 1
This logged variable, ln(exposure), or a similarly constructed variable is
called the offset variable
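As a rough illustration of the offset idea, the sketch below uses made-up award counts and years in the program (hypothetical numbers, not taken from any data set in these notes). It assumes the simplest case, an intercept-only model ln(mu) = ln(exposure) + b0, where the maximum-likelihood estimate reduces to total count divided by total exposure, so no fitting library is needed.

```python
import math

# Hypothetical data: awards won and years in the program (the exposure).
awards = [5, 1, 2, 0]
years = [5, 3, 4, 2]

# Observed rate for each student: count / exposure.
rates = [a / t for a, t in zip(awards, years)]

# Intercept-only Poisson model with an offset: ln(mu_i) = ln(years_i) + b0.
# Setting the score equation to zero gives exp(b0) = sum(counts) / sum(exposure).
b0 = math.log(sum(awards) / sum(years))

def expected_count(exposure, intercept):
    """Expected count under ln(mu) = ln(exposure) + intercept."""
    offset = math.log(exposure)  # enters with coefficient fixed at 1
    return math.exp(offset + intercept)

for t in (3, 5, 10):
    print(t, "years ->", round(expected_count(t, b0), 3), "expected awards")
```

Because the offset's coefficient is fixed at 1 rather than estimated, doubling the exposure doubles the expected count while the underlying rate exp(b0) stays the same.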
4. Offset Poisson Regression
lung.cancer: a data frame with 63 observations on the following 4 variables
years.smok: a factor giving the number of years smoking
cigarettes: a factor giving cigarette consumption
Time: man-years at risk
y: number of deaths
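To show how a layout like this (a death count plus man-years at risk) would feed an offset Poisson regression, here is a minimal sketch of a Newton-Raphson fit for ln(mu) = ln(exposure) + b0 + b1*x with one binary covariate. The function name fit_poisson_offset, the covariate heavy_smoker, and all numbers are invented for illustration; they are not values from lung.cancer, and in practice a statistics package would do the fitting.

```python
import math

def fit_poisson_offset(y, x, exposure, iterations=25):
    """Newton-Raphson fit of ln(mu) = ln(exposure) + b0 + b1 * x."""
    # Start from the pooled rate with no covariate effect.
    b0, b1 = math.log(sum(y) / sum(exposure)), 0.0
    for _ in range(iterations):
        # Fitted means under the current coefficients.
        mu = [t * math.exp(b0 + b1 * xi) for xi, t in zip(x, exposure)]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information, a symmetric 2x2 matrix.
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g (explicit 2x2 inverse).
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Invented illustration data (NOT values from lung.cancer).
deaths = [2, 5, 9, 14]
heavy_smoker = [0, 0, 1, 1]           # hypothetical binary covariate
man_years = [100.0, 200.0, 150.0, 250.0]

b0, b1 = fit_poisson_offset(deaths, heavy_smoker, man_years)
rate_ratio = math.exp(b1)  # death rate in group 1 relative to group 0
print("baseline rate per man-year:", math.exp(b0))
print("rate ratio:", rate_ratio)
```

With a single binary covariate the fitted rates match the group totals (deaths divided by man-years within each group), which gives a simple check on the iteration.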
5. Negative Binomial Regression
One potential drawback of Poisson regression is that it may not accurately
describe the variability of the counts
A Poisson distribution is parameterized by λ, which happens to be both its
mean and variance. While convenient to remember, it’s not often realistic.
A distribution of counts will usually have a variance that’s not equal to its
mean. When we see this happen with data that we assume (or hope) is
Poisson distributed, we say we have under- or overdispersion, depending
on whether the variance is smaller or larger than the mean.
Performing Poisson regression on count data that exhibits this behavior
results in a model that doesn’t fit well.
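A quick, informal way to look for this is to compare the sample variance of the counts with their sample mean. The sketch below does that on hypothetical counts; the helper name dispersion_ratio is made up here, and the check ignores covariates, so it is only a first look rather than a formal test.

```python
import statistics

def dispersion_ratio(counts):
    """Sample variance divided by sample mean.

    Roughly 1 for Poisson-like data, above 1 suggests overdispersion,
    below 1 suggests underdispersion.
    """
    return statistics.variance(counts) / statistics.mean(counts)

# Hypothetical counts with a few large values.
counts = [0, 1, 0, 2, 9, 0, 1, 12, 0, 3]

ratio = dispersion_ratio(counts)
if ratio > 1:
    print("variance exceeds mean: possible overdispersion")
elif ratio < 1:
    print("variance below mean: possible underdispersion")
else:
    print("variance equals mean")
```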
6. One approach that addresses this issue is Negative Binomial
Regression.
We use Negative Binomial Regression when Variance > Mean
(overdispersion)
The negative binomial distribution, like the Poisson
distribution, describes the probabilities of the occurrence of
whole numbers greater than or equal to 0.
The variance of a negative binomial distribution is a function
of its mean and of an additional parameter, k, called the
dispersion parameter.
var(Y) = μ + μ²/k
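The variance formula can be explored numerically. The sketch below evaluates var(Y) = mu + mu^2 / k for a few values of k and inverts the same formula to get a rough method-of-moments value for k from a mean and variance. The function names are hypothetical, and the moment estimate is only a crude starting point, not how a regression package estimates k.

```python
def nb_variance(mu, k):
    """Negative binomial variance: mu + mu**2 / k."""
    return mu + mu ** 2 / k

def moment_estimate_k(mean, variance):
    """Solve variance = mean + mean**2 / k for k (requires variance > mean)."""
    if variance <= mean:
        raise ValueError("no overdispersion: variance must exceed the mean")
    return mean ** 2 / (variance - mean)

mu = 4.0
for k in (0.5, 2.0, 10.0, 1000.0):
    # As k grows, the extra term mu**2 / k shrinks and the variance
    # approaches the Poisson value, which equals mu.
    print("k =", k, "variance =", nb_variance(mu, k))
```

Small k means strong overdispersion, while a very large k makes the negative binomial behave almost like a Poisson distribution.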