This document describes Poisson regression, a statistical technique for analyzing count data using regression. It compares Poisson regression to ordinary least squares regression, outlines how to perform Poisson regression in the GLIM software package, and provides an example analyzing historical apprentice migration data to Edinburgh. Key aspects include:
- Poisson regression is appropriate when the dependent variable is a count, unlike OLS regression which assumes a normal distribution.
- It models the logarithm of the mean as a linear combination of predictors rather than the mean directly.
- GLIM allows specification of the Poisson error distribution and logarithmic link function required for Poisson regression.
- An example apprentice migration data set is analyzed to demonstrate the technique.
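Since GLIM is rarely available today, the Poisson-error/log-link fit it describes can be sketched with iteratively reweighted least squares in plain NumPy. This is an illustrative sketch on simulated data, not the apprentice-migration data set:

```python
import numpy as np

# Simulated count data: the log of the mean is linear in x
# (coefficients here are made up for illustration).
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0, 2, n)
X = np.column_stack([np.ones(n), x])        # design matrix with intercept
beta_true = np.array([0.5, 1.2])
y = rng.poisson(np.exp(X @ beta_true))

# IRLS for a Poisson GLM with log link: repeatedly solve a weighted
# least-squares problem until the coefficients stop changing.
beta = np.zeros(2)
for _ in range(50):
    mu = np.exp(X @ beta)                   # current fitted means
    W = mu                                  # Poisson working weights
    z = X @ beta + (y - mu) / mu            # working response
    beta_new = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
    if np.max(np.abs(beta_new - beta)) < 1e-10:
        beta = beta_new
        break
    beta = beta_new

print(beta)  # should be close to beta_true
```

The same weighted-least-squares loop is what GLIM performs internally once the Poisson error and log link are declared.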
Marginal Regression for a Bi-variate Response with Diabetes Mellitus Study (theijes)
In this paper, we develop a bivariate response model for a diabetes mellitus study. Diabetes mellitus affects a large number of people of all social conditions throughout the world, and it continues to grow despite advances in the past few years in virtually every field of diabetes research and in patient care for improved treatment. It is sometimes accompanied by symptoms of severe thirst, profuse urination, weight loss and stupor. Using SPSS software and a sample of 200 patients, we applied logistic regression to estimate the response probability of whether a diabetic patient had high blood pressure or not.
Applied Multivariate Statistical Techniques in Agriculture and Plant Science 2 (amir rahmani)
This document provides an overview of multivariate statistical techniques that can be used in agriculture and plant science research. It discusses multiple linear regression analysis, which models the relationship between a dependent variable and one or more explanatory variables. The document explains how to determine regression coefficients and test their significance using analysis of variance. It also describes different variable selection techniques for multiple regression like backward elimination, forward selection, and stepwise regression. The goal is to help researchers identify the best predictive model and determine which variables are most important when the number of predictors increases.
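The variable selection techniques mentioned (forward selection and its relatives) can be sketched in a few lines. This is a simplified illustration on made-up data with a crude RSS-based stopping rule, not the document's own procedure:

```python
import numpy as np

# Toy data: y depends on columns 0 and 2 only (coefficients are made up).
rng = np.random.default_rng(1)
n, p = 200, 5
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(scale=0.5, size=n)

def rss(cols):
    """Residual sum of squares of an OLS fit on the given columns."""
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return r @ r

# Forward selection: greedily add the variable that reduces RSS most,
# stopping when the relative improvement becomes negligible.
selected, remaining = [], list(range(p))
current = rss(selected)
while remaining:
    best_j = min(remaining, key=lambda j: rss(selected + [j]))
    new = rss(selected + [best_j])
    if (current - new) / current < 0.05:   # crude stopping rule
        break
    selected.append(best_j)
    remaining.remove(best_j)
    current = new

print(selected)  # variables chosen by forward selection
```

In practice the stopping rule would use an F-test or an information criterion rather than a fixed relative-RSS threshold.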
- Multinomial logistic regression predicts categorical membership in a dependent variable based on multiple independent variables. It is an extension of binary logistic regression that allows for more than two categories.
- Careful data analysis including checking for outliers and multicollinearity is important. A minimum sample size of 10 cases per independent variable is recommended.
- Multinomial logistic regression does not assume normality, linearity or homoscedasticity like discriminant function analysis does, making it more flexible and commonly used. It does assume independence between dependent variable categories.
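A minimal illustration of the multinomial extension described above: softmax regression fit by gradient descent on simulated three-class data (cluster centers and all other values are made up):

```python
import numpy as np

# Three-class toy data, one cluster per category.
rng = np.random.default_rng(2)
centers = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
X = np.vstack([rng.normal(c, 0.7, size=(60, 2)) for c in centers])
y = np.repeat([0, 1, 2], 60)
Xb = np.column_stack([np.ones(len(X)), X])      # add intercept column

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)        # numerical stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

# One-hot targets and plain gradient descent on the multinomial
# log-likelihood (the direct extension of binary logistic regression).
Y = np.eye(3)[y]
W = np.zeros((3, 3))                            # (features + intercept) x classes
for _ in range(2000):
    P = softmax(Xb @ W)
    W -= 0.1 * Xb.T @ (P - Y) / len(Xb)

pred = np.argmax(softmax(Xb @ W), axis=1)
accuracy = (pred == y).mean()
print(accuracy)
```

With 180 cases and 2 predictors this toy example easily clears the 10-cases-per-predictor guideline noted above.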
Multiple Linear Regression Model with Two Parameter Doubly Truncated New Symm... (theijes)
This document presents a multiple linear regression model with errors that follow a two-parameter doubly truncated new symmetric distribution. The model extends traditional linear regression by allowing for non-Gaussian distributed errors that are finite and bounded. Properties of the doubly truncated new symmetric distribution are derived, including its probability density function and characteristic function. Maximum likelihood and ordinary least squares methods are used to estimate the model parameters. A simulation study compares the proposed model to existing models that assume Gaussian or new symmetric distributed errors.
Application of Weighted Least Squares Regression in Forecasting (paperpublications3)
Abstract: This work models the loss of properties from fire outbreaks in Ogun State using simple weighted least squares regression. The study covers secondary data on fire outbreaks and the monetary value of properties lost across the twenty (20) Local Government Areas of Ogun State for the year 2010. The data were analyzed electronically using SPSS 21.0. Results reveal a very strong, significant positive relationship between the number of fire outbreaks and the loss of properties. Fire outbreaks exert significant influence on the loss of properties, accounting for approximately 91.2% of the loss of properties in the state.
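The weighted least squares fit behind such an analysis has a simple closed form, beta = (X'WX)^{-1} X'Wy, with weights inversely proportional to the error variance. A sketch on simulated heteroscedastic data (not the Ogun State data):

```python
import numpy as np

# Weighted least squares: observations with higher error variance get
# lower weight. Data and weights here are made up for illustration.
rng = np.random.default_rng(3)
n = 100
x = np.linspace(1, 10, n)
sigma = 0.5 * x                         # error sd grows with x (heteroscedastic)
y = 2.0 + 3.0 * x + rng.normal(scale=sigma)

X = np.column_stack([np.ones(n), x])
w = 1.0 / sigma**2                      # optimal weights: inverse variance

# Closed form: beta = (X' W X)^{-1} X' W y
XtW = X.T * w
beta_wls = np.linalg.solve(XtW @ X, XtW @ y)
print(beta_wls)  # close to the true intercept 2 and slope 3
```

Ordinary least squares would still be unbiased here, but the weighted fit is more efficient because it downweights the noisy high-x observations.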
This document describes a new method called Likelihood-based Sufficient Dimension Reduction (LAD) for estimating the central subspace. LAD obtains the maximum likelihood estimator of the central subspace under the assumption of conditional normality of the predictors given the response. Analytically and in simulations, LAD is shown to often perform much better than existing methods like Sliced Inverse Regression, Sliced Average Variance Estimation, and Directional Regression. LAD also inherits useful properties from maximum likelihood theory, such as the ability to estimate the dimension of the central subspace and test conditional independence hypotheses.
This document provides an introduction to generalized linear mixed models (GLMMs). GLMMs allow for modeling of data that violates assumptions of linear mixed models, such as non-normal distributions and non-constant variance. The document discusses the components of a GLMM, including the linear predictor, inverse link function, and variance function. It also describes how to derive estimating equations for GLMMs and provides an example for a univariate logit model. Estimation of variance components is also briefly discussed.
1. Regression analysis is a statistical technique used to model relationships between variables and make predictions. It can be used to describe relationships, estimate coefficients, make predictions, and control systems.
2. Linear regression models describe straight-line relationships between variables, while non-linear models describe curved relationships. The goodness of fit of a model can be evaluated using the coefficient of determination.
3. The least squares method is used to fit regression lines by minimizing the sum of the squared vertical distances between observed and estimated y-values for a regression of y on x, or minimizing the sum of squared horizontal distances for a regression of x on y.
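The distinction in point 3 between the y-on-x line (vertical distances) and the x-on-y line (horizontal distances) can be checked numerically; a small simulated example with made-up values:

```python
import numpy as np

# The two least-squares lines differ unless the fit is perfect.
rng = np.random.default_rng(4)
x = rng.normal(size=300)
y = 0.6 * x + rng.normal(scale=0.8, size=300)

sxx = np.sum((x - x.mean())**2)
syy = np.sum((y - y.mean())**2)
sxy = np.sum((x - x.mean()) * (y - y.mean()))

slope_y_on_x = sxy / sxx          # slope of the y-on-x regression line
slope_x_on_y = syy / sxy          # slope (on the same axes) implied by the x-on-y fit

r2 = sxy**2 / (sxx * syy)         # coefficient of determination
print(slope_y_on_x, slope_x_on_y, r2)
# The two slopes agree only when r^2 = 1, i.e., a perfect linear fit.
```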
- Regression analysis is a statistical technique for modeling relationships between variables, where one variable is dependent on the others. It allows predicting the average value of the dependent variable based on the independent variables.
- The key assumptions of regression models are that the error terms are normally distributed with zero mean and constant variance, and are independent of each other.
- Linear regression specifies that the dependent variable is a linear combination of the parameters, though the independent variables need not be linearly related. In simple linear regression with one independent variable, the least squares estimates of the intercept and slope are calculated to minimize the sum of squared errors.
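The closed-form least squares estimates mentioned above (slope b1 = Sxy/Sxx, intercept b0 = ybar - b1*xbar) can be verified on made-up data against NumPy's own fitter:

```python
import numpy as np

# Simple linear regression on simulated data (true line: y = 1 + 2.5x).
rng = np.random.default_rng(5)
x = rng.uniform(0, 5, 80)
y = 1.0 + 2.5 * x + rng.normal(scale=0.6, size=80)

# Closed-form least squares estimates that minimize the sum of squared errors.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()

# np.polyfit minimizes the same sum of squared errors, so it must agree.
check = np.polyfit(x, y, 1)          # returns [slope, intercept]
print(b0, b1, check)
```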
Quantile regression is an extension of linear regression that relates specific quantiles (percentiles) of the target variable to the predictor variables rather than just the mean. It makes fewer assumptions than ordinary least squares regression about the distribution of the target variable and is more robust to outliers. Quantile regression can provide a more complete picture of the relationship between variables by examining how predictors influence different parts of the conditional distribution.
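A bare-bones sketch of the idea: fitting the 0.9 conditional quantile by subgradient descent on the pinball (check) loss. The data are simulated and this is an illustrative solver, not a production-quality one:

```python
import numpy as np

# Simulated data with a linear trend and symmetric noise.
rng = np.random.default_rng(6)
n = 2000
x = rng.uniform(0, 1, n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)
X = np.column_stack([np.ones(n), x])
tau = 0.9                             # target quantile

beta = np.zeros(2)
lr = 0.05
for _ in range(5000):
    r = y - X @ beta
    # Subgradient of the pinball loss: tau where r > 0, tau - 1 where r < 0.
    g = np.where(r > 0, tau, tau - 1.0)
    beta += lr * X.T @ g / n
    lr *= 0.999                       # slowly decaying step size

frac_below = np.mean(y <= X @ beta)
print(beta, frac_below)   # roughly 90% of points should fall below the line
```

Because the loss penalizes over- and under-prediction asymmetrically, the fitted line settles where about tau of the observations lie below it, which is exactly what makes the method robust to outliers in the response.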
Granger Causality Test: A Useful Descriptive Tool for Time Series Data (IJMER)
Interdependency of one variable on another has long been recognized, since it was discovered that one variable tends to move, or regress, toward another, following the work of Galton (1886); Pearson & Lee (1903); Kendall & Stuart (1961); Johnston and DiNardo (1997); Gujarati (2004), among others. In light of this dependency over time, the researcher uses the Granger causality test as an effective tool for predictive causality in time series, applying it to Nigerian GDP and money supply to determine the type of causality that exists between the two series and which one statistically predicts the other.
The research work aimed to test the nature of causality between GDP and money supply for the Federal Republic of Nigeria over a period of thirty years, using data sourced from the Central Bank of Nigeria Statistical Bulletin. After observing the various conditions of the Granger causality test, such as ensuring stationarity of the variables under consideration; adding enough lags to the prescribed model before estimation, since the test is sensitive to the number of lags introduced; and assuming the disturbance terms in the various models are uncorrelated, the analysis indicates a bilateral relationship between Nigerian GDP and money supply. That is, Nigerian GDP Granger-causes money supply and vice versa. Based on this result, both series can be successfully modelled using a vector autoregressive (VAR) model, since changes in one variable have a significant effect on the other.
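The lag-augmented regressions behind the test can be sketched on simulated data where one series genuinely drives the other. This is an illustrative F-test in NumPy, not the paper's Nigerian GDP analysis:

```python
import numpy as np

# Does adding lags of x improve the prediction of y beyond y's own lags?
# Simulated series in which x truly drives y with a one-period delay.
rng = np.random.default_rng(7)
T, p = 400, 2                       # series length, number of lags
x = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t-1] + 0.8 * x[t-1] + rng.normal(scale=0.5)

def rss(design, target):
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    r = target - design @ beta
    return r @ r

target = y[p:]
m = len(target)
lags_y = np.column_stack([y[p-k:T-k] for k in range(1, p+1)])
lags_x = np.column_stack([x[p-k:T-k] for k in range(1, p+1)])
ones = np.ones((m, 1))

rss_restricted = rss(np.hstack([ones, lags_y]), target)          # y lags only
rss_full = rss(np.hstack([ones, lags_y, lags_x]), target)        # plus x lags

# F statistic for the p restrictions (the lags of x are jointly zero).
k_full = 1 + 2 * p
F = ((rss_restricted - rss_full) / p) / (rss_full / (m - k_full))
print(F)   # a large F suggests x Granger-causes y
```

Running the same test with the roles of x and y swapped checks the other direction; a significant F both ways is the bilateral relationship the abstract reports.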
Assessing relative importance using rsp scoring to generate (Daniel Koh)
This document proposes a new method called Driver's Score (DS) to assess the relative importance of variables in regression models. DS combines measures of a variable's reliability, significance, and power into a single composite score. Reliability is measured using residual errors, significance uses F-ratios of residual errors, and power uses standardized regression coefficients. DS is calculated at the observation level as the geometric mean of these three scores. The document argues DS provides a more intuitive and practical understanding of variable importance than existing methods. An example using industry data demonstrates how to generate DS scores and classify variables by level of importance. The methodology aims to independently measure importance while accounting for interrelationships between variables.
Assessing Relative Importance using RSP Scoring to Generate VIF (Daniel Koh)
This document proposes a new method called Driver's Score (DS) to assess the relative importance of variables in regression models. DS combines measures of a variable's reliability, significance, and power into a single composite score. Reliability is measured using residual errors, significance uses F-ratios of residual errors, and power uses standardized regression coefficients. DS is calculated at the observation level as the geometric mean of scores for each of these three properties. The document suggests that DS provides a more intuitive and practical understanding of variable importance than existing single-measure methods. An example using industry data demonstrates how to generate DS scores and classify variables by level of importance.
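The geometric-mean combination at the heart of the Driver's Score can be illustrated abstractly. The component values below are invented, and the paper's exact definitions of reliability, significance, and power are not reproduced here:

```python
import numpy as np

# Composite score as the geometric mean of three component scores,
# in the spirit of the Driver's Score described above. The values
# for each component are hypothetical.
scores = {
    "reliability": 0.81,    # e.g., derived from residual errors
    "significance": 0.64,   # e.g., derived from F-ratios
    "power": 0.49,          # e.g., from standardized coefficients
}
values = np.array(list(scores.values()))
driver_score = values.prod() ** (1.0 / len(values))
print(driver_score)
```

A geometric mean is a natural choice for such a composite because a near-zero value in any one component pulls the overall score down sharply, so a variable must do reasonably well on all three properties to rank as important.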
Simulating Multivariate Random Normal Data using Statistical Computing Platfo... (ijtsrd)
This document describes how to simulate multivariate normal random data using the statistical computing platform R. It discusses two main decomposition methods for generating such data - Eigen decomposition and Cholesky decomposition. These methods decompose the variance-covariance matrix in different ways to simulate random normal values that match the desired mean and variance-covariance structure. The document provides code examples in R to implement both methods and compares the results. It finds that both methods accurately reproduce the target multivariate normal distribution.
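The document's examples are in R, but the Cholesky route it describes is equally short in NumPy. The mean vector and covariance matrix below are made up for illustration:

```python
import numpy as np

# Cholesky method for simulating multivariate normal data: if Sigma = L L',
# then mu + L z has covariance Sigma when z is standard normal.
rng = np.random.default_rng(8)
mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.6, 0.3],
                  [0.6, 1.0, 0.2],
                  [0.3, 0.2, 0.5]])

L = np.linalg.cholesky(Sigma)           # lower-triangular factor
z = rng.standard_normal(size=(100_000, 3))
samples = mu + z @ L.T                  # each row ~ N(mu, Sigma)

print(samples.mean(axis=0))             # close to mu
print(np.cov(samples, rowvar=False))    # close to Sigma
```

The Eigen-decomposition method mentioned in the document works the same way, substituting V diag(sqrt(lambda)) for L; both reproduce the target mean and covariance structure.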
Ecological study design multiple group study and statistical analysis (sirjana Tiwari)
This document summarizes study designs used in ecological studies. It discusses multiple group study designs which examine exposure and disease rates across population groups defined by place. It describes variables measured at the group level like aggregate measures, environmental measures, and global measures. It also discusses exploratory versus analytical ecological study designs and examples of statistical analyses used like regression modeling, empirical Bayes methods, and assessing spatial autocorrelation.
1. The document discusses modeling road traffic accident deaths in South Africa using generalized linear models. It analyzes mortality data from 2001-2006 to determine prevalence among age groups.
2. A negative binomial regression model was used instead of a Poisson regression model because the data exhibited overdispersion. The analysis found that the 35-49 age group had the highest prevalence of road traffic accident deaths at 26.6%.
3. Females had an expected death rate 65.4% lower than males. Being in the 35-49 age group multiplied the mean death rate by a factor of 0.557 compared to those over 65, a decrease of 44.3% for both genders.
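The percentages in point 3 follow directly from the multiplicative interpretation of Poisson and negative binomial coefficients: a coefficient b changes the expected count by the factor exp(b). A quick check of the quoted rate ratios:

```python
import math

# Rate ratios implied by the summary above: the female ratio is inferred
# from the quoted 65.4% reduction; the age-group factor 0.557 is quoted.
rr_female = 0.346
rr_age = 0.557

# Percent change implied by a multiplicative rate ratio.
pct_female = (1 - rr_female) * 100
pct_age = (1 - rr_age) * 100

# The regression coefficient corresponding to a rate ratio is its log.
b_age = math.log(rr_age)
print(round(pct_female, 1), round(pct_age, 1), round(b_age, 3))
```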
Regression analysis is a statistical method used to determine the relationship between variables. It allows one to predict the value of a dependent or outcome variable based on the value of one or more independent or explanatory variables. There are different types of regression including linear, polynomial, logistic, and quantile regression. Regression requires assumptions like linearity and homoscedasticity. Coefficients are calculated to obtain the regression line or equation that best models the relationship between the variables. Limitations include that it can be complex, not work for qualitative data, and assumptions of causality may not always hold.
Discriminant analysis (DA) is a statistical technique used to predict group membership when the dependent variable is categorical and the independent variables are continuous. It identifies which variables discriminate between two or more naturally occurring groups. DA develops a linear equation to predict group membership based on weighted combinations of predictor variables. It aims to maximize the distance between group means to achieve strong discriminatory power. Like regression, DA assumes variables are normally distributed, cases are randomly sampled, and groups are mutually exclusive and collectively exhaustive. It requires at least two groups with minimal overlap and similar group sizes of at least five cases. DA can classify new cases into groups based on the discriminant functions derived from existing data.
Linear regression attempts to model the relationship between two variables by fitting a linear equation to observed data. The most common method for fitting a regression line is the method of least squares, which minimizes the vertical deviations between observed data points and the fitted line. Outliers and influential observations are data points far from the regression line that can significantly impact the slope and strength of the linear relationship. Residual plots are used to investigate the validity of assuming a linear relationship between variables and to identify potential lurking variables not included in the model.
Qualitative Understanding of Flattening the Curve Term in Context of COVID 19 (ijtsrd)
The year 2020 has indeed experienced a shaking pandemic of the modern age. This paper considers linear and logarithmic plots and highlights details of various functions and their trends. A phrase, flattening the curve, has been in use these days. The paper first builds a premise and then explains the qualitative meaning of the phrase in the context of the plots. Various psychological aspects are also appended so that the reader can benefit and be ready to understand current affairs. Gourav Vivek Kulkarni, "Qualitative Understanding of Flattening the Curve Term in Context of COVID-19", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-4, Issue-4, June 2020. URL: https://www.ijtsrd.com/papers/ijtsrd30926.pdf Paper URL: https://www.ijtsrd.com/mathemetics/statistics/30926/qualitative-understanding-of-flattening-the-curve-term-in-context-of-covid19/gourav-vivek-kulkarni
The document summarizes research on nonlinear correlation coefficients on manifolds. It defines a new nonlinear correlation coefficient called SEVP, proves some of its basic properties including that it ranges from 0 to 1. It discusses how to measure nonlinear correlation between variables on a manifold and reviews common dimensionality reduction methods for manifolds. The goal is to preserve nonlinear structure as much as possible by projecting onto the orthogonal complement of tangent spaces. An optimization problem is formulated to find the linear space with the largest angle to all tangent spaces, transforming it into an eigenvalue problem to solve.
Statistica Sinica 16(2006), 847-860
PSEUDO-R2 IN LOGISTIC REGRESSION MODEL
Bo Hu, Jun Shao and Mari Palta
University of Wisconsin-Madison
Abstract: Logistic regression with binary and multinomial outcomes is commonly
used, and researchers have long searched for an interpretable measure of the strength
of a particular logistic model. This article describes the large sample properties
of some pseudo-R2 statistics for assessing the predictive strength of the logistic
regression model. We present theoretical results regarding the convergence and
asymptotic normality of pseudo-R2s. Simulation results and an example are also
presented. The behavior of the pseudo-R2s is investigated numerically across a
range of conditions to aid in practical interpretation.
Key words and phrases: Entropy, logistic regression, pseudo-R2
1. Introduction
Logistic regression for binary and multinomial outcomes is commonly used in health research. Researchers often desire a statistic ranging from zero to one to summarize the overall strength of a given model, with zero indicating a model with no predictive value and one indicating a perfect fit. The coefficient of determination R2 for the linear regression model serves as a standard for such measures (Draper and Smith (1998)). Statisticians have searched for a corresponding indicator for models with binary/multinomial outcome. Many different R2 statistics have been proposed in the past three decades (see, e.g., McFadden (1973), McKelvey and Zavoina (1975), Maddala (1983), Agresti (1986), Nagelkerke (1991), Cox and Wermuth (1992), Ash and Shwartz (1999), Zheng and Agresti (2000)). These statistics, which are usually identical to the standard R2 when applied to a linear model, generally fall into categories of entropy-based and variance-based (Mittlböck and Schemper (1996)). Entropy-based R2 statistics, also called pseudo-R2s, have gained some popularity in the social sciences (Maddala (1983), Laitila (1993) and Long (1997)). McKelvey and Zavoina (1975) proposed a pseudo-R2 based on a latent model structure, where the binary/multinomial outcome results from discretizing a continuous latent variable that is related to the predictors through a linear model. Their pseudo-R2 is defined as the proportion of the variance of the latent variable that is explained by the covariate. McFadden (1973) suggested an alternative, known as the "likelihood-ratio index", comparing a model without any predictor to a model including all predictors. It is defined as one minus the ratio of the log likelihood with intercepts only and the log likelihood with all predictors. If the slope parameters are all 0, McFadden's R2 is 0, but it is never 1. Maddala (1983) developed another pseudo-R2 that can be applied to any model estimated by the maximum likelihood method. This popular and widely used measure is expressed as
    R2_M = 1 − ( L(θ̃) / L(θ̂) )^(2/n),    (1)
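Given the two log likelihoods, both Maddala's measure (1) and McFadden's likelihood-ratio index are one-liners. The log-likelihood values below are invented for illustration, not taken from the paper:

```python
import math

# Log likelihoods of the intercept-only and full logistic models
# (values are made up for illustration).
n = 200
ll_null = -138.6     # log L(theta_tilde): intercepts only
ll_full = -101.3     # log L(theta_hat): all predictors

# Maddala's pseudo-R2, equation (1): 1 - (L_null / L_full)^(2/n),
# computed on the log scale for numerical safety.
r2_maddala = 1 - math.exp((2.0 / n) * (ll_null - ll_full))

# McFadden's likelihood-ratio index: 1 - log L_full / log L_null.
r2_mcfadden = 1 - ll_full / ll_null
print(r2_maddala, r2_mcfadden)
```

Both statistics are 0 when the predictors add nothing (the two likelihoods coincide) and grow toward 1 as the full model dominates, matching the qualitative behavior described in the introduction.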
USE OF PLS COMPONENTS TO IMPROVE CLASSIFICATION ON BUSINESS DECISION MAKING (IJDKP)
This paper presents a methodology that eliminates multicollinearity of the predictor variables in supervised classification by transforming the predictor variables into orthogonal components obtained from the application of Partial Least Squares (PLS) logistic regression. PLS logistic regression was developed by Bastien, Esposito-Vinzi, and Tenenhaus [1]. We apply supervised classification techniques to data based on the original variables and to data based on the PLS components; the error rates are calculated and the results compared. The implementation of the classification methodology rests upon computer programs written in the R language that make possible the calculation of PLS components and classification error rates. The impact of this research will be disseminated based on evidence that the PLS logistic regression methodology is fundamental when working in supervised classification with data having many predictor variables.
Linear regression (1). SPSS statistical analysis (JuandaSatriyo1)
This document discusses linear regression analysis, a statistical method used to analyze relationships between variables. It can be used to describe, estimate, and predict relationships. The document provides an overview of linear regression, including how it models relationships between dependent and independent variables using equations. It also discusses important considerations for performing and interpreting linear regression analyses correctly. Examples are provided to illustrate key points.
This document describes an analysis of count data from a study on the detection of anthelmintic resistance in gastrointestinal nematodes of small ruminants. The data consists of egg counts from 30 goats and 30 sheep that were grouped into Albendazole, Ivermectin, and control groups. The data was analyzed using Poisson and negative binomial regression models in R software. The Poisson model did not fit the data well due to overdispersion. However, the negative binomial regression model provided a better fit for the overdispersed data. Key findings from the negative binomial regression analysis are summarized.
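The overdispersion that ruled out the Poisson model can be diagnosed with a simple variance-to-mean check: for Poisson data the two should be close. The counts below are simulated from a gamma-Poisson (i.e., negative binomial) mechanism, not the study's egg counts:

```python
import numpy as np

# Heterogeneous underlying rates produce counts whose variance far
# exceeds the mean, which is what overdispersion means for count data.
rng = np.random.default_rng(9)
lam = rng.gamma(shape=2.0, scale=50.0, size=200)   # animal-specific rates
counts = rng.poisson(lam)                          # gamma-Poisson mixture

mean, var = counts.mean(), counts.var(ddof=1)
dispersion = var / mean
print(mean, var, dispersion)   # dispersion well above 1 signals overdispersion
```

When this ratio is far above 1, a negative binomial model, which adds a dispersion parameter, fits the data better than Poisson regression, which is the pattern the study reports.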
Similar to Analysis Of Count Data Using Poisson Regression
- Regression analysis is a statistical technique for modeling relationships between variables, where one variable is dependent on the others. It allows predicting the average value of the dependent variable based on the independent variables.
- The key assumptions of regression models are that the error terms are normally distributed with zero mean and constant variance, and are independent of each other.
- Linear regression specifies that the dependent variable is modeled as a linear combination of the parameters; the model need not be linear in the independent variables themselves. In simple linear regression with one independent variable, the least squares estimates of the intercept and slope are chosen to minimize the sum of squared errors.
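The least-squares estimates mentioned above have a closed form for a single predictor: the slope is the covariance of x and y divided by the variance of x, and the intercept follows from the means. A minimal sketch on illustrative data:

```python
def ols_fit(x, y):
    # Closed-form least-squares slope and intercept minimizing the
    # sum of squared errors for one predictor
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
          / sum((xi - mx) ** 2 for xi in x))
    b0 = my - b1 * mx
    return b0, b1

b0, b1 = ols_fit([1, 2, 3, 4], [3, 5, 7, 9])  # data lying exactly on y = 1 + 2x
```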
Quantile regression is an extension of linear regression that relates specific quantiles (percentiles) of the target variable to the predictor variables rather than just the mean. It makes fewer assumptions than ordinary least squares regression about the distribution of the target variable and is more robust to outliers. Quantile regression can provide a more complete picture of the relationship between variables by examining how predictors influence different parts of the conditional distribution.
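One way to see what quantile regression optimizes: minimizing the check (pinball) loss over a constant recovers the corresponding sample quantile, which is why the fit is robust to outliers. A brute-force sketch on illustrative data:

```python
def pinball(residual, tau):
    # Check (pinball) loss: tau * r for r >= 0, (tau - 1) * r otherwise
    return tau * residual if residual >= 0 else (tau - 1) * residual

def best_constant(y, tau, grid):
    # Brute-force the constant minimizing total pinball loss over a grid;
    # for tau = 0.5 this recovers the sample median
    return min(grid, key=lambda c: sum(pinball(v - c, tau) for v in y))

y = [1, 2, 3, 4, 100]          # the outlier barely moves the tau = 0.5 fit
c = best_constant(y, 0.5, y)   # search over the data points themselves
```

Full quantile regression replaces the constant with a linear function of the predictors, minimized by linear programming.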
Granger Causality Test: A Useful Descriptive Tool for Time Series DataIJMER
The interdependency of variables has long been recognized, beginning with the observation that one variable tends to move, or regress, toward another, following the work of Galton (1886); Pearson & Lee (1903); Kendall & Stuart (1961); Johnston and DiNardo (1997); Gujarati (2004); and others. In light of this dependency over time, the researcher uses the Granger causality test as an effective tool for assessing predictive causality in time series, applying it to Nigeria's GDP and money supply to determine the type of causality that exists between these two time series variables and which one statistically predicts the other.
The research work tested the nature of causality between GDP and money supply for the Federal Republic of Nigeria over a period of thirty years, using data sourced from the Central Bank of Nigeria Statistical Bulletin. The usual conditions of the Granger causality test were observed: ensuring stationarity of the variables under consideration; including enough lags in the model before estimation, since the test is sensitive to the number of lags introduced; and assuming the disturbance terms in the various models are uncorrelated. The analysis indicates a bilateral relationship between Nigeria's GDP and money supply: GDP Granger-causes money supply and vice versa. Based on this result, both series can be successfully modeled with a Vector Autoregressive model, since changes in one variable have a significant effect on the other.
Assessing relative importance using rsp scoring to generateDaniel Koh
This document proposes a new method called Driver's Score (DS) to assess the relative importance of variables in regression models. DS combines measures of a variable's reliability, significance, and power into a single composite score. Reliability is measured using residual errors, significance uses F-ratios of residual errors, and power uses standardized regression coefficients. DS is calculated at the observation level as the geometric mean of these three scores. The document argues DS provides a more intuitive and practical understanding of variable importance than existing methods. An example using industry data demonstrates how to generate DS scores and classify variables by level of importance. The methodology aims to independently measure importance while accounting for interrelationships between variables.
Assessing Relative Importance using RSP Scoring to Generate VIFDaniel Koh
This document proposes a new method called Driver's Score (DS) to assess the relative importance of variables in regression models. DS combines measures of a variable's reliability, significance, and power into a single composite score. Reliability is measured using residual errors, significance uses F-ratios of residual errors, and power uses standardized regression coefficients. DS is calculated at the observation level as the geometric mean of scores for each of these three properties. The document suggests that DS provides a more intuitive and practical understanding of variable importance than existing single-measure methods. An example using industry data demonstrates how to generate DS scores and classify variables by level of importance.
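The composite described above can be sketched once the three component scores are in hand. The exact scaling of reliability, significance, and power is defined in the paper; assuming each has already been scaled into (0, 1], the geometric-mean combination looks like this:

```python
def drivers_score(reliability, significance, power):
    # Geometric mean of the three component scores; assumes each
    # component has been pre-scaled into (0, 1] per the paper's method
    return (reliability * significance * power) ** (1 / 3)

ds = drivers_score(0.9, 0.8, 0.7)  # a weak component drags the composite down
```

A geometric mean (rather than an arithmetic one) ensures a variable must score reasonably on all three properties to be ranked important.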
Simulating Multivariate Random Normal Data using Statistical Computing Platfo...ijtsrd
This document describes how to simulate multivariate normal random data using the statistical computing platform R. It discusses two main decomposition methods for generating such data - Eigen decomposition and Cholesky decomposition. These methods decompose the variance-covariance matrix in different ways to simulate random normal values that match the desired mean and variance-covariance structure. The document provides code examples in R to implement both methods and compares the results. It finds that both methods accurately reproduce the target multivariate normal distribution.
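The Cholesky route can be sketched by hand for the bivariate case: factor the covariance matrix as Σ = LLᵀ, then transform independent standard normals z into x = μ + Lz. The covariance values below are an assumed example, not from the document.

```python
import math
import random

def cholesky2(s11, s12, s22):
    # Cholesky factor L of a 2x2 covariance matrix, Sigma = L L^T
    l11 = math.sqrt(s11)
    l21 = s12 / l11
    l22 = math.sqrt(s22 - l21 ** 2)
    return l11, l21, l22

def mvn_sample(mu, L, rng):
    # x = mu + L z, where z1, z2 are independent standard normals
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    l11, l21, l22 = L
    return mu[0] + l11 * z1, mu[1] + l21 * z1 + l22 * z2

L = cholesky2(4.0, 2.0, 3.0)               # Sigma = [[4, 2], [2, 3]]
x = mvn_sample((0.0, 0.0), L, random.Random(0))
```

The Eigen-decomposition method differs only in how Σ is factored; both reproduce the target mean and covariance structure.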
Ecological study design multiple group study and statistical analysissirjana Tiwari
This document summarizes study designs used in ecological studies. It discusses multiple group study designs which examine exposure and disease rates across population groups defined by place. It describes variables measured at the group level like aggregate measures, environmental measures, and global measures. It also discusses exploratory versus analytical ecological study designs and examples of statistical analyses used like regression modeling, empirical Bayes methods, and assessing spatial autocorrelation.
1. The document discusses modeling road traffic accident deaths in South Africa using generalized linear models. It analyzes mortality data from 2001-2006 to determine prevalence among age groups.
2. A negative binomial regression model was used instead of a Poisson regression model because the data exhibited overdispersion. The analysis found that the 35-49 age group had the highest prevalence of road traffic accident deaths at 26.6%.
3. Females had an expected death rate 65.4% lower than males. Being in the 35-49 age group changed the mean death rate by a factor of 0.557 compared to those over 65, a decrease of 44.3% for both genders.
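The rate-ratio interpretation used in these findings follows from the log link of the negative binomial model: a coefficient β multiplies the expected count by exp(β), which can be restated as a percent change. A sketch with the 0.557 factor quoted above:

```python
import math

def coef_to_pct_change(beta):
    # Under a log link, coefficient beta multiplies the expected count
    # by exp(beta); express that multiplier as a percent change
    return (math.exp(beta) - 1) * 100

pct = coef_to_pct_change(math.log(0.557))  # factor 0.557 -> ~ -44.3 percent
```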
Regression analysis is a statistical method used to determine the relationship between variables. It allows one to predict the value of a dependent or outcome variable based on the value of one or more independent or explanatory variables. There are different types of regression including linear, polynomial, logistic, and quantile regression. Regression requires assumptions like linearity and homoscedasticity. Coefficients are calculated to obtain the regression line or equation that best models the relationship between the variables. Limitations include that it can be complex, not work for qualitative data, and assumptions of causality may not always hold.
Discriminant analysis (DA) is a statistical technique used to predict group membership when the dependent variable is categorical and the independent variables are continuous. It identifies which variables discriminate between two or more naturally occurring groups. DA develops a linear equation to predict group membership based on weighted combinations of predictor variables. It aims to maximize the distance between group means to achieve strong discriminatory power. Like regression, DA assumes variables are normally distributed, cases are randomly sampled, and groups are mutually exclusive and collectively exhaustive. It requires at least two groups with minimal overlap and similar group sizes of at least five cases. DA can classify new cases into groups based on the discriminant functions derived from existing data.
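The group-separation idea can be illustrated in its simplest form: with one predictor, two groups, and equal variances and priors, the linear discriminant reduces to classifying by which group mean is closer. This is a deliberately reduced sketch, not the full multivariate procedure, and the data are invented:

```python
def lda_1d(group_a, group_b):
    # One-feature, two-group discriminant under equal-variance,
    # equal-prior assumptions: the decision boundary is the midpoint
    # of the two group means
    ma = sum(group_a) / len(group_a)
    mb = sum(group_b) / len(group_b)
    cut = (ma + mb) / 2
    def classify(x):
        return "A" if (x < cut) == (ma < mb) else "B"
    return classify

clf = lda_1d([1.0, 1.2, 0.8], [3.0, 3.2, 2.8])
```

With several predictors, the weights come from the pooled within-group covariance matrix, which is what maximizes the distance between group means relative to within-group spread.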
Linear regression attempts to model the relationship between two variables by fitting a linear equation to observed data. The most common method for fitting a regression line is the method of least squares, which minimizes the vertical deviations between observed data points and the fitted line. Outliers and influential observations are data points far from the regression line that can significantly impact the slope and strength of the linear relationship. Residual plots are used to investigate the validity of assuming a linear relationship between variables and to identify potential lurking variables not included in the model.
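The vertical deviations described above, and a simple residual-based outlier flag, can be sketched as follows; the standard-deviation cutoff is an assumption for illustration, not a rule from the text:

```python
def residuals(x, y, b0, b1):
    # Vertical deviations of each observed point from the fitted line
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

def flag_outliers(res, k=2.0):
    # Flag points whose residual lies more than k standard deviations
    # from the mean residual (an illustrative cutoff, not a fixed rule)
    n = len(res)
    m = sum(res) / n
    sd = (sum((r - m) ** 2 for r in res) / (n - 1)) ** 0.5
    return [abs(r - m) > k * sd for r in res]
```

Plotting `residuals(...)` against the fitted values is the residual plot the paragraph refers to; structure in that plot suggests the linear assumption fails or a lurking variable is missing.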
Qualitative Understanding of Flattening the Curve Term in Context of COVID 19ijtsrd
The year 2020 has indeed experienced a shaking pandemic of the modern age. This paper considers linear and logarithmic plots and highlights details of various functions and their trends. The phrase "flattening the curve" has been in wide use these days; the paper first builds a premise and then explains the qualitative meaning of the phrase in the context of the plots. Various psychological aspects are also appended so that the reader can benefit and be ready to understand current affairs. Gourav Vivek Kulkarni "Qualitative Understanding of Flattening the Curve Term in Context of COVID-19" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-4 | Issue-4 , June 2020, URL: https://www.ijtsrd.com/papers/ijtsrd30926.pdf Paper Url :https://www.ijtsrd.com/mathemetics/statistics/30926/qualitative-understanding-of-flattening-the-curve-term-in-context-of-covid19/gourav-vivek-kulkarni
The document summarizes research on nonlinear correlation coefficients on manifolds. It defines a new nonlinear correlation coefficient called SEVP, proves some of its basic properties including that it ranges from 0 to 1. It discusses how to measure nonlinear correlation between variables on a manifold and reviews common dimensionality reduction methods for manifolds. The goal is to preserve nonlinear structure as much as possible by projecting onto the orthogonal complement of tangent spaces. An optimization problem is formulated to find the linear space with the largest angle to all tangent spaces, transforming it into an eigenvalue problem to solve.
Statistica Sinica 16(2006), 847-860
PSEUDO-R² IN LOGISTIC REGRESSION MODEL
Bo Hu, Jun Shao and Mari Palta
University of Wisconsin-Madison
Abstract: Logistic regression with binary and multinomial outcomes is commonly
used, and researchers have long searched for an interpretable measure of the strength
of a particular logistic model. This article describes the large sample properties
of some pseudo-R2 statistics for assessing the predictive strength of the logistic
regression model. We present theoretical results regarding the convergence and
asymptotic normality of pseudo-R2s. Simulation results and an example are also
presented. The behavior of the pseudo-R2s is investigated numerically across a
range of conditions to aid in practical interpretation.
Key words and phrases: Entropy, logistic regression, pseudo-R2
1. Introduction
Logistic regression for binary and multinomial outcomes is commonly used
in health research. Researchers often desire a statistic ranging from zero to one
to summarize the overall strength of a given model, with zero indicating a model
with no predictive value and one indicating a perfect fit. The coefficient of deter-
mination R2 for the linear regression model serves as a standard for such measures
(Draper and Smith (1998)). Statisticians have searched for a corresponding in-
dicator for models with binary/multinomial outcome. Many different R2 statis-
tics have been proposed in the past three decades (see, e.g., McFadden (1973),
McKelvey and Zavoina (1975), Maddala (1983), Agresti (1986), Nagelkerke
(1991), Cox and Wermuth (1992), Ash and Shwartz (1999), Zheng and Agresti
(2000)). These statistics, which are usually identical to the standard R2 when
applied to a linear model, generally fall into categories of entropy-based and
variance-based (Mittlböck and Schemper (1996)). Entropy-based R2 statistics,
also called pseudo-R2s, have gained some popularity in the social sciences (Mad-
dala (1983), Laitila (1993) and Long (1997)). McKelvey and Zavoina (1975)
proposed a pseudo-R2 based on a latent model structure, where the binary/
multinomial outcome results from discretizing a continuous latent variable that
is related to the predictors through a linear model. Their pseudo-R2 is defined
as the proportion of the variance of the latent variable that is explained by the
covariate. McFadden (1973) suggested an alternative, known as “likelihood-
ratio index”, comparing a model without any predictor to a model including all
predictors. It is defined as one minus the ratio of the log likelihood with inter-
cepts only, and the log likelihood with all predictors. If the slope parameters
are all 0, McFadden’s R2 is 0, but it is never 1. Maddala (1983) developed
another pseudo-R2 that can be applied to any model estimated by the maximum
likelihood method. This popular and widely used measure is expressed as
R²_M = 1 − ( L(θ̃) / L(θ̂) )^(2/n),   (1)
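Given the two maximized likelihoods, Maddala's measure (1) is easiest to compute on the log scale, which avoids numerical underflow of the raw likelihoods. The log-likelihood values below are illustrative:

```python
import math

def maddala_r2(ll_null, ll_full, n):
    # R2_M = 1 - (L_null / L_full)^(2/n), computed from log likelihoods
    # as 1 - exp(2 * (ll_null - ll_full) / n) to avoid underflow
    return 1 - math.exp(2 * (ll_null - ll_full) / n)

r2 = maddala_r2(-120.0, -95.0, 100)  # illustrative log likelihoods, n = 100
```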
USE OF PLS COMPONENTS TO IMPROVE CLASSIFICATION ON BUSINESS DECISION MAKINGIJDKP
This paper presents a methodology that eliminates multicollinearity among the predictor variables in supervised classification by transforming them into orthogonal components obtained from Partial Least Squares (PLS) Logistic Regression, developed by Bastien, Esposito-Vinzi, and Tenenhaus [1]. We apply supervised classification techniques to data based on the original variables and to data based on the PLS components; the error rates are calculated and the results compared. The implementation of the classification methodology rests upon computer programs written in the R language, which make possible the calculation of the PLS components and the classification error rates. The impact of this research will be disseminated based on evidence that the PLS Logistic Regression methodology is fundamental when performing supervised classification on data with many predictor variables.
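To see why orthogonal components remove multicollinearity, consider that once each predictor column is replaced by its component orthogonal to the previously processed columns, no column is a linear combination of the others. The sketch below uses plain Gram-Schmidt as a simplified stand-in; the actual PLS construction builds components to also maximize covariance with the response:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthogonalize(columns):
    # Gram-Schmidt: replace each predictor column by its component
    # orthogonal to the previously processed columns, removing linear
    # dependence (multicollinearity) among the predictors
    out = []
    for col in columns:
        v = list(col)
        for q in out:
            coef = dot(v, q) / dot(q, q)
            v = [vi - coef * qi for vi, qi in zip(v, q)]
        out.append(v)
    return out

q = orthogonalize([[1, 1, 1, 1], [1, 2, 3, 4]])  # second column loses its mean component
```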
Linear regression (1). spss analiisa statistikJuandaSatriyo1
This document discusses linear regression analysis, a statistical method used to analyze relationships between variables. It can be used to describe, estimate, and predict relationships. The document provides an overview of linear regression, including how it models relationships between dependent and independent variables using equations. It also discusses important considerations for performing and interpreting linear regression analyses correctly. Examples are provided to illustrate key points.
Similar to Analysis Of Count Data Using Poisson Regression (20)
The document provides instructions for requesting writing assistance from HelpWriting.net in 6 steps: 1) Create an account with a password and email. 2) Complete a 10-minute order form providing instructions, sources, and deadline. 3) Review bids from writers and select one based on qualifications. 4) Review the completed paper and authorize payment if satisfied. 5) Request revisions to ensure satisfaction. HelpWriting.net promises original, high-quality work or a full refund.
General Water. Online assignment writing service.Amy Cernava
The document discusses the incarnation of Jesus Christ and how he can be both fully divine and fully human. It notes that while this seems paradoxical, there are clues in biblical texts that provide evidence for both the humanity and divinity of Jesus. Specifically, the Gospel of John states that "the word became flesh," indicating Jesus took on human form while also embodying God's word. Further evidence of Jesus' humanity includes displaying human emotions like crying and praying. The document states that while it is impossible to prove historical claims definitively, the biblical texts provide historic clues about Jesus that make the case for his dual divine-human nature.
Essay Websites Sample Parent Essays For Private HiAmy Cernava
The document discusses the key differences between the three Synoptic Gospels of Matthew, Mark, and Luke, and the Gospel of John in the New Testament. The Synoptic Gospels are more similar to each other in terms of content and order of events, while John's Gospel contains more unique material and presents events in a different order than the Synoptics. The Synoptics focus more on Jesus' ministry and teachings, while John's Gospel explores theological themes like Jesus being the Son of God through its retelling of Jesus' life.
How To Write About Myself Examples - CoverletterpediaAmy Cernava
The document provides instructions for creating an account on the HelpWriting.net site in order to request that a writer complete an assignment. It outlines a 5-step process: 1) Create an account with a password and email. 2) Complete an order form with instructions, sources, and deadline. 3) Review bids from writers and choose one based on qualifications. 4) Review the completed paper and authorize payment if satisfied. 5) Request revisions until fully satisfied, with the option of a refund for plagiarized work. The document promotes HelpWriting.net's writing assistance services.
The document provides instructions for creating an account on a writing assistance website and submitting a request to have a paper written. It outlines a five step process: 1) create an account, 2) complete a request form providing details of the paper needed, 3) writers will bid on the request and the client will select a writer, 4) the client will review and approve the paper, and 5) the client can request revisions until satisfied with the paper. The process is described as quick, simple, and ensuring high quality original content with the option of a full refund if plagiarized.
Essay Introductions For Kids. Online assignment writing service.Amy Cernava
The document provides a summary of the plot of Henrik Ibsen's play "A Doll's House." It discusses the main character, Nora, who lives in 19th century Norway with her husband Torvald and their children. Due to gender roles at the time, Nora is unable to obtain a loan from the bank to pay for Torvald's medical treatment, so she forges her deceased father's signature on the loan. This action leads to her being blackmailed by the banker Krogstad. The summary examines how the play explores the oppression of women during this era through Nora's situation.
Writing Creative Essays - College Homework Help AAmy Cernava
The document provides a summary of the novel A Separate Peace by John Knowles:
1. The story follows two friends, Finny and Gene, attending boarding school during World War II as they transition into adolescence.
2. Gene struggles with darkness and jealousy towards his friend Finny, who represents an idealized version of innocence and youth.
3. During a secret ritual jumping from a tree, Gene causes Finny to fall and break his leg, ending his dreams and marking the end of their friendship.
4. The novel explores the themes of transitioning out of childhood, the loss of innocence, and the impact of jealousy and inner darkness on friendship.
Free Printable Primary Paper Te. Online assignment writing service.Amy Cernava
Rehabilitation psychology focuses on applying psychological knowledge and skills to help individuals with disabilities or chronic health conditions. The purpose of rehabilitation psychologists is to maximize outcomes related to health, independence, daily functioning and social participation, while also minimizing secondary health issues. Rehabilitation psychology aims to study and assist those with disabilities in order to improve their quality of life.
Diversity Essay Sample Graduate School Which CaAmy Cernava
The document provides instructions for using an online writing service to get help with assignments. It outlines a 5-step process: 1) Create an account, 2) Submit a request with instructions and deadline, 3) Review bids from writers and choose one, 4) Review the completed paper and authorize payment, 5) Request revisions if needed and get a refund for plagiarized work. The service uses a bidding system and promises original, high-quality content.
Large Notepad - Heart Border Writing Paper PrintAmy Cernava
1. The document discusses the CenTrust Bank scandal where the bank became embroiled in risky junk bond investments in the 1980s and suffered major losses as a result.
2. It describes how CenTrust reported a $119.5 million loss for the 1989 fiscal year after writing down the value of its junk bond holdings and reserving for further losses.
3. Due to its equity shortage of $283 million, CenTrust had to sell most of its branches in an attempt to raise capital and recover from the scandal.
Personal Challenges Essay. Online assignment writing service.Amy Cernava
The document provides instructions for requesting writing assistance from HelpWriting.net, including creating an account, submitting a request form with paper details and deadline, and reviewing writer bids before selecting one and placing a deposit to start the assignment. It notes the platform uses a bidding system and guarantees original, high-quality work or a full refund, as well as free revisions to ensure customer satisfaction.
Buy College Application Essays Dos And DontAmy Cernava
The document provides instructions for buying college application essays from HelpWriting.net. It outlines a 5-step process: 1) Create an account with a password and email; 2) Complete an order form with instructions and deadline; 3) Choose a writer based on bids, qualifications, and reviews; 4) Review the paper and authorize payment or request revisions; 5) Request revisions until satisfied, with a refund option for plagiarized content. The process aims to fully meet customer needs with original, high-quality content.
The document discusses the process for requesting assistance with an assignment from the website HelpWriting.net. It involves 5 steps: 1) Creating an account, 2) Completing an order form providing instructions, sources, and deadline, 3) Reviewing bids from writers and selecting one, 4) Reviewing the completed paper and authorizing payment, 5) Requesting revisions if needed, with the guarantee of a refund for plagiarized work. The purpose is to outline the simple process for students to receive help in writing assignments.
Types Of Essay And Examples. 4 Major Types OAmy Cernava
The document provides a 5-step process for requesting writing assistance from HelpWriting.net. It involves creating an account, completing an order form with instructions and deadline, reviewing bids from writers and selecting one, making a deposit to start the work, and authorizing final payment upon approval of the completed work. Revisions are allowed and a refund is guaranteed if plagiarism is found.
026 Describe Yourself Essay Example Introduce MyselfAmy Cernava
The worksheet discusses the major developmental tasks and transitions of adolescence including biological changes, cognitive development, and identity formation. It also outlines key aspects of early, middle, and late adulthood such as common social and family roles as well as changes in cognitive and physical abilities. The developmental stages from adolescence through adulthood demonstrate important changes in how individuals think, interact with others, and view themselves and their roles in society.
Term Paper Introduction Help - How To Write An IntrAmy Cernava
The document provides instructions for getting writing assistance from HelpWriting.net. It is a 5-step process:
1. Create an account with a password and email.
2. Complete a 10-minute order form providing instructions, sources, deadline and attach a sample if wanting the writer to mimic your style.
3. Review bids from writers and choose one based on qualifications, history and feedback. Place a deposit to start the assignment.
4. Review the completed paper and authorize final payment if pleased, or request free revisions.
5. Multiple revisions can be requested to ensure satisfaction, and plagiarized work will be refunded.
Analysis Of Students Critical Thinking Skill Of Middle School Through STEM E...Amy Cernava
The document describes a study that analyzed Japanese middle school students' critical thinking skills through a STEM education project-based learning program about wastewater treatment. The study involved 160 students divided into groups who completed worksheets and designed tools to clean wastewater over six lessons integrating science, technology, engineering and math concepts. Lessons progressed from introducing concepts to designing and optimizing solutions. Students' critical thinking was assessed using a rubric and most scored as "practicing thinkers", able to critique plans and construct realistic critiques. The goal was to improve students' awareness of clean water needs and critical thinking skills through STEM and project-based learning.
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...PECB
The Professional Geographer, 41(2), 1989, pp. 190-198
© Copyright 1989 by Association of American Geographers
ANALYSIS OF COUNT DATA USING POISSON REGRESSION*

Andrew Lovett and Robin Flowerdew
University of Lancaster

In geographical research the data of interest are often in the form of counts. Standard regression analysis is inappropriate for such data, but if certain assumptions are met, a form of regression based on the Poisson distribution can be used. This paper illustrates the use of Poisson regression in the computer package GLIM with an example from historical geography. Apprentice migration to Edinburgh is regressed on a combination of categorical, count, and continuous explanatory variables. Key Words: Poisson regression, count data, migration, GLIM, Edinburgh.
Many geographers receive a basic training in statistical analysis but lack the opportunity or desire to keep up to date with developments in statistical methods that may be useful in contexts reasonably common in geographical research. This paper describes one such development, Poisson regression analysis, in sufficient detail that interested researchers with moderate knowledge of statistical applications in geography can learn to use it for themselves.
Regression analysis constructs an explanatory or predictive model of a dependent or response variable (usually denoted Y) on the basis of one or more independent or explanatory variables (denoted X1, X2, etc.). Poisson regression is appropriate when the dependent variable is a count, such as the number of times an event occurs or the number of people in a certain category. It may be particularly useful if some observations have very low values. The method is still not very widely known in geography, or indeed the social or environmental sciences generally. It does not yet appear, for example, in any textbook of statistical methods in geography. A growing number of applications exist in the journal literature, however. Dependent variables which have been analyzed using Poisson regression include the number of migrants from one city to another (Constantine and Gower 1982; Flowerdew and Aitkin 1982; Flowerdew and Lovett 1988), airline passengers from one city to another (Fotheringham and Williams 1983), origins of US college students (Pickles 1985), interregional industrial movements in Britain (Twomey 1986), radiation exposure categories (Darby et al. 1985), plant species on islands (Vincent and Haworth 1983), family planning clinics in Nigeria (Senior 1987), shopping trips (Guy 1987), children immunized against whooping cough (Gatrell 1986), heart disease mortality (Lovett et al. 1986), spina bifida (Lovett and Gatrell 1988), looted stores in New York City and national newspapers of the world (Flowerdew and Lovett 1984). Lovett (1984) has produced a detailed guide to the technique as implemented in the GLIM statistical package.

* This work was done under a grant from the Economic and Social Research Council. The manuscript was prepared while the second author was working with the Institute for Market and Social Analysis, Toronto. We would like to thank Ian Whyte for supplying the data, and present and former members of the Centre for Applied Statistics, University of Lancaster, especially Murray Aitkin, Brian Francis, John Hinde, and Andrew Wood for statistical help.
In the first section of this paper, the characteristics of Poisson regression are introduced by contrasting them with the standard ordinary least squares (OLS) form of regression analysis. An outline of the commands needed to carry out Poisson regression in GLIM follows. Finally an illustration of how these can be linked together is provided in the third section through an analysis of a data set from historical geography.
Poisson Regression and the Generalized Linear Model

Linear regression analysis assumes that the dependent variable has a linear relationship with the independent variables. If there is only one independent variable X, the predicted value of the dependent variable Y for case i is given by the equation

ŷi = β0 + β1 xi    (1)

where ŷi is the predicted value of Y for case i, xi is the value of X for case i, β0 is the intercept (the value of ŷ when X is zero), and β1 is the slope (the amount by which ŷ changes as X increases by one unit). The observed value yi corresponding to xi can be envisaged as a realization of a random variable Yi whose mean is estimated to be ŷi. In OLS regression, the Yi variables are assumed to have normal distributions and a constant variance.

OLS regression has been used to analyze count dependent variables, but its assumptions are not valid in this context. The values of the yi are counts and so must be non-negative integers; therefore the random variables generating them cannot have a normal distribution, which is continuous and can take on negative values. If the counts are large, the normal distribution may give a reasonable approximation to the true discrete distribution; but when the counts have small values, the use of OLS regression is inappropriate and the results obtained are unreliable.

A type of regression analysis more appropriate for small counts is one in which each Yi variable is assumed to have a Poisson distribution (Frome et al. 1973). The Poisson distribution describes the probability that an event occurs k times in a fixed period given that each occurrence is independent and has a constant probability. The shape of a Poisson distribution depends on the value of its mean (which is equal to its variance). If the mean is close to zero, then the distribution is strongly skewed; if the mean is larger, the peak occurs further from the vertical axis. If the mean is very large, the Poisson distribution can be approximated by the normal.

In a Poisson regression model the predicted value of the dependent variable for case i is the maximum likelihood estimate λ̂i of the mean of a Poisson-distributed variable Yi. The natural logarithm of this estimate is equal to a linear combination of the corresponding values of the independent variables; if there is only one independent variable the relationship has the form

ln(λ̂i) = β0 + β1 xi    (2)

This equation can also be written

λ̂i = exp(β0 + β1 xi)    (3)

This is the Poisson regression equivalent of the OLS regression equation (1).
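The contrast between equations (1) and (3) can be sketched numerically. The short Python sketch below uses illustrative coefficients (β0 = 1.0, β1 = 0.5, not estimates from this paper) to show that the log link of equation (3) always yields a positive predicted count, whereas the identity link of equation (1) can predict impossible negative counts:

```python
import math

b0, b1 = 1.0, 0.5  # illustrative coefficients, not taken from the paper

def ols_predict(x):
    # Equation (1): identity link; the prediction can go negative
    return b0 + b1 * x

def poisson_predict(x):
    # Equation (3): log link; the prediction is always positive
    return math.exp(b0 + b1 * x)

for x in (-10, 0, 4):
    print(x, ols_predict(x), poisson_predict(x))
```

For x = -10 the OLS prediction is -4, an impossible count, while the Poisson model predicts a small positive mean.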
Two characteristics of the Poisson regression model are worthy of further discussion. First, unlike OLS regression, the Poisson model does not assume that the data are homoscedastic. Indeed, the variance of each case is equal to the corresponding predicted value. Consequently the variances associated with the values of Yi cannot all be the same, and the Yi cannot be normally distributed. Second, the observed value of Yi is a count of independent events generated by a Poisson distribution with parameter λi. This means that the model is not appropriate where the occurrence of one event increases the probability of others. For example, modeling the incidence of a contagious disease would not be appropriate.
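The equality of mean and variance invoked above can be checked directly from the Poisson probability function P(K = k) = e^(-λ) λ^k / k!; the following stdlib sketch sums the distribution over a truncated support (the omitted tail mass is negligible for a mean of 4):

```python
import math

def poisson_pmf(k, lam):
    # P(K = k) for a Poisson distribution with mean lam
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 4.0
support = range(60)  # truncation; tail mass beyond k = 60 is negligible here
mean = sum(k * poisson_pmf(k, lam) for k in support)
var = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in support)
print(mean, var)  # both are numerically equal to lam
```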
The OLS and Poisson regression models can be considered as variants of what Nelder and Wedderburn (1972) termed generalized linear models, explained more fully in Appendix A of the GLIM User's Guide (Payne 1985) and by McCullagh and Nelder (1983). Generalized linear models can be used whenever an analyst wants to estimate the value of a dependent variable Y on the basis of a linear predictor (a linear combination of explanatory variables X1, X2, etc.). An observed value yi is regarded as the sum of a systematic component (μi) and a random (or error) component (εi):

yi = μi + εi    (4)

where yi is the observed value of Y for case i, μi is the predicted value of Y for case i, and εi is a randomly distributed error term. Types of generalized linear models are differentiated on the basis of the probability distribution that Yi is assumed to have and the link function g that associates the mean of Yi (μi) with the linear predictor

g(μi) = Σ(j=1 to m) βj xij    (5)

where m is the number of parameters, βj is the parameter for the jth independent variable, and xij is the value of the ith observation on the jth independent variable. Equation (4) can be rewritten

yi = g⁻¹(Σ(j=1 to m) βj xij) + εi    (6)
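The role of the link function g in equations (5) and (6) can be illustrated with a small sketch: each link maps the mean onto the scale of the linear predictor, and its inverse maps back. (The dictionary and function names here are ours, not GLIM's.)

```python
import math

# link functions g and their inverses g^-1, as in equations (5) and (6)
links = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "log": (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
    "reciprocal": (lambda mu: 1.0 / mu, lambda eta: 1.0 / eta),
    "sqrt": (lambda mu: math.sqrt(mu), lambda eta: eta ** 2),
}

mu = 3.7
for name, (g, g_inv) in links.items():
    eta = g(mu)  # value of the linear predictor corresponding to mean mu
    assert abs(g_inv(eta) - mu) < 1e-12  # g^-1 undoes g
    print(name, eta)
```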
A range of generalized linear models can be constructed by specifying different types of error term εi and link function g. Options for the link function include identity, logarithmic, reciprocal, and square root, among others. The error distribution can be any member of the exponential family of probability distributions, including such distributions as the normal, binomial, Poisson, and gamma.

In OLS regression the error distribution is normal. The predicted value μi is equal to Σβjxij, the linear predictor, so the link is simply the identity function. In Poisson regression, the error distribution is Poisson, but it is the natural logarithm of the predicted value, not λ̂i itself, which is equal to the linear predictor, so the link function is logarithmic. Log-linear modeling in GLIM (Bowlby and Silk 1982; Wrigley 1985) also uses the Poisson probability model and logarithmic link. Indeed, it can be regarded as a special case of Poisson regression when all the independent variables are categorical.
The concept of a generalized linear model is the basis of the computer package GLIM, which provides a relatively easy means of performing Poisson regression. (It is also available in other packages, including GENSTAT and LIMDEP.) Poisson regression is not available in most of the other major statistical packages, such as SAS, SPSS-X, SYSTAT, or BMDP. This paper, therefore, focuses on the implementation of Poisson regression in GLIM.
Poisson Regression in GLIM

GLIM was developed by the Numerical Algorithms Group in conjunction with the Working Party on Statistical Computing of the Royal Statistical Society (O'Brien 1987). In addition to the mainframe version, a microcomputer version is available relatively cheaply from the Numerical Algorithms Group in Oxford or in Downers Grove, IL. Payne (1985) provides a detailed guide, and several articles have been published describing how to use GLIM for particular purposes. Examples by geographers include Bowlby and Silk (1982) and O'Brien (1983).
GLIM allows the user to specify the error distribution and link function to be used in the regression. Once the data have been entered, any necessary transformations can be done on the variables, and the data can be plotted. The example in the next section illustrates these operations. The main process involved in fitting the model, however, is the selection of independent variables for inclusion.

The results of fitting a model can be evaluated according to the value of scaled deviance, which is a measure of the overall difference between the observed values of Y and those predicted by the model. Deviance is computed in different ways according to the probability distribution in use, and is based on the log-likelihood ratio. In OLS regression, deviance is equal to the error (or residual) sum of squares.
In Poisson regression (where it is sometimes denoted G²), the scaled deviance d is

d = 2 Σ(i=1 to n) yi ln(yi/λ̂i)    (7)

The deviance is large when the correspondence between observed and estimated values is poor, and small when it is close.
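Equation (7) is easy to compute directly; a minimal Python version (treating a zero count's term 0 · ln(0) as 0, the usual convention) is:

```python
import math

def scaled_deviance(observed, fitted):
    # Equation (7): d = 2 * sum_i y_i * ln(y_i / lambda_i),
    # with the term taken as 0 when y_i = 0
    return 2.0 * sum(
        y * math.log(y / lam) for y, lam in zip(observed, fitted) if y > 0
    )

# a perfect fit has zero deviance; a poor fit has a large one
print(scaled_deviance([5, 2, 9], [5, 2, 9]))
print(scaled_deviance([5, 2, 9], [2, 2, 2]))
```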
Because the deviance of a Poisson model has a distribution approximating to chi-square, it is possible to assess whether the estimated and observed values of Y are significantly different by comparing the calculated deviance with the critical value of chi-square for the appropriate degrees of freedom and significance level. The number of degrees of freedom is equal to the number of cases minus the number of parameters fitted. If the deviance exceeds the critical value, the null hypothesis that the model could have generated the data should be rejected, and the model should not be regarded as providing an adequate explanation (at least in statistical terms) of the variation in the values of the dependent variable. However, little is known about how good the approximation to chi-square is for small sets of data (Payne 1985, 111), and this test should be regarded only as a general guide in assessing goodness of fit.
A regression model can be evaluated from several different perspectives. One of these is goodness of fit, already mentioned. Another aspect is the significance of the parameters, which can be assessed using t values, obtained by dividing each parameter estimate by its standard error. A third consideration is interpretability, which can best be evaluated by seeing if the sign and magnitude of parameter estimates accord with theoretical expectations. Ideally a regression model (Poisson or otherwise) should have a good fit, significant parameters, and a clear interpretation. In practice, however, these are not always attainable simultaneously, and the best compromise must be chosen. Therefore it is sensible to fit models according to a systematic strategy, particularly because the significance of individual parameters is likely to vary between models depending on the other variables included (especially if they are collinear).
One strategy begins by assessing the overall variation in the dependent variable by fitting a null model, in which the only parameter is an intercept term (equal to the mean of Y). The deviance of this model can be used as a benchmark to assess the effect of adding independent variables. The importance of each variable added in turn to the null model can be judged by comparing the reduction in deviance with the critical value of chi-square. Introducing numerical variables involves the loss of only one degree of freedom, and so a decline in deviance of more than 3.84 indicates that the variable is significant at the 0.05 level. The same procedure can also be used to test the significance of factors and interaction terms, though here the degrees of freedom lost will depend on the number of categories of the factor(s) concerned.
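The 3.84 threshold quoted above can be verified with the standard library: a chi-square variable with one degree of freedom is the square of a standard normal, so its upper-tail probability is erfc(sqrt(x/2)), and a deviance drop of 3.84 corresponds to p ≈ 0.05. (The function below handles only the one-degree-of-freedom case that applies to numerical variables.)

```python
import math

def chi2_sf_1df(x):
    # upper-tail probability of a chi-square variable with 1 d.f.,
    # using the identity chi2(1) = Z^2 for standard normal Z
    return math.erfc(math.sqrt(x / 2.0))

print(chi2_sf_1df(3.84))  # approximately 0.05
```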
This procedure can be used to find which variable is the most successful in reducing the deviance. Others are then added one by one in the order defined by the magnitude of their effect on deviance. At each stage the estimates and standard errors are examined to remove any variables whose parameters are insignificant. New variables added to the model can alter the significance of others, so that variables which initially seemed significant may drop out, and others, not significant originally, may become so. This procedure is complete when a significant reduction in deviance can no longer be obtained. At this stage, those variables and interaction terms not in the model should be reintroduced to test whether they are now significant. Once this process is completed, the estimates and residuals can be displayed and the overall adequacy of the model evaluated.
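The first step of this strategy can be mimicked in a few lines. The deviances below are copied from the GLIM transcript later in the paper (null model and each single-variable model); the selection rule simply picks the variable giving the largest drop from the null deviance:

```python
# null-model and single-variable scaled deviances, as reported by GLIM
null_deviance = 1350.4
single_variable_deviance = {"LP": 1217.8, "LD": 393.96, "U": 1348.9, "R": 1054.1}

# pick the variable whose model has the smallest deviance,
# i.e. the largest reduction from the null model
best = min(single_variable_deviance, key=single_variable_deviance.get)
reduction = null_deviance - single_variable_deviance[best]
print(best, round(reduction, 2))  # LD (log distance) gives the largest drop
```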
Application of the Model

An analysis of apprentice migration to Edinburgh between 1775 and 1799 is chosen to demonstrate the discussion above with a real and reasonably small data set
TABLE 1
DATA USED IN THE APPRENTICE MIGRATION EXAMPLE

  D    A     POP     U  R  County
 21  225   56000  18.8  3  Midlothian
 24   22   18000  37.9  2  West Lothian
 33   44   30000  43.4  3  East Lothian
 33    3    7000  30.3  1  Kinross
 36   41   94000  41.3  1  Fife
 41    9    9000  29.3  3  Peebles
 41    2   11000  47.4  1  Clackmannan
 52    5    5000  41.9  3  Selkirk
 54   23  147000  68.1  2  Lanark
 56   11   31000  15.2  3  Berwick
 67    9   34000  31.8  3  Roxburgh
 71   13   51000  31.1  2  Stirling
 78   26  126000  14.4  1  Perth
 79    0   21000  27.3  2  Dunbarton
 85    5   99000  55.3  1  Angus
 86    3   55000  25.9  3  Dumfries
 92    1   78000  69.9  2  Renfrew
110    2   84000  26.4  2  Ayr
110    0   29000  11.3  3  Kirkcudbright
125    1   26000  12.3  1  Kincardine
132    0   12000  43.6  2  Bute
156    3  123000  23.1  1  Aberdeen
157    0   22000  23.2  3  Wigtown
159    4   36000  12.9  1  Banff
174    2   27000  27.6  1  Moray
175    0    8000  45.3  1  Nairn
179    1   72000  12.7  2  Argyll
234    7   74000  10.8  1  Inverness
274    4   55000  10.7  1  Ross
274    1   23000  28.3  1  Caithness
283    0   23000  10.3  1  Sutherland
366    0   29000   9.0  1  Orkney
491    1   22000   7.7  1  Shetland

D = distance, A = number of apprentices, POP = county population, U = degree of urbanization, R = categorical variable (direction of county from Edinburgh). The data set should not include column headings when it is read into GLIM. Source: Lovett 1984.
(Lovett et al. 1985; Whyte and Whyte 1986). The dependent variable (A) is the number of apprentices originating from each of the 33 Scottish counties registered in Edinburgh between 1775 and 1799. Apprentice migration from each county was expected to be positively related to its population (POP), as recorded in the 1801 census, and negatively related to its distance (D) and its degree of urbanization (U), measured by the percentage of the county population living in urban settlements. These relationships might vary regionally because of differences in the ease of travel and the nature of intervening opportunities. A three-level categorical variable (R) was therefore included, defined according to whether the county was north, west, or south of Edinburgh.
The following is an annotated transcript of a Poisson regression analysis of the data in GLIM 3.77 on a VAX 11/780 running under VMS. Input is indicated by upper case, output by lower case. The following commands read the data from a file called EDIN.DAT (Table 1) on channel (or unit) 10.

$UNITS 33 $
$DATA D A POP U R $
$FORMAT (F4.0,F4.0,F7.0,F5.1,F2.0) $
$DINPUT 10 $
File name? EDIN.DAT
Population and distance variables are logarithmically transformed, according to usual gravity model practice. The logarithmic link function used in Poisson regression makes it unnecessary to transform the number of migrants.

$CALC LP = %LOG(POP) $
$CALC LD = %LOG(D) $

The regional variable R must be specified as a factor, and A must be identified as the dependent variable.

$FACTOR R 3 $
$YVAR A $
$ERROR POISSON $
$LINK LOG $
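Because the link is logarithmic and POP and D enter as logarithms, the fitted model has the multiplicative form of a gravity model, λ̂ = exp(β0) · POP^βLP · D^βLD. A quick sketch (with made-up coefficients, purely illustrative) confirms that the two ways of writing the model agree:

```python
import math

b0, b_lp, b_ld = 0.7, 0.98, -2.06  # illustrative values only

def predict_linear(pop, d):
    # log-link prediction from the linear predictor on logged variables
    return math.exp(b0 + b_lp * math.log(pop) + b_ld * math.log(d))

def predict_gravity(pop, d):
    # the same model written in multiplicative gravity-model form
    return math.exp(b0) * pop ** b_lp * d ** b_ld

print(predict_linear(56000, 21), predict_gravity(56000, 21))
```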
(The commands $ERROR NORMAL $ and $LINK IDENTITY $ would be used at this point if OLS regression was to be done.) Now the null model can be fitted, followed by each independent variable in turn.

$FIT $
scaled deviance = 1350.4 at cycle 5
d.f. = 32

$FIT LP $
scaled deviance = 1217.8 at cycle 5
d.f. = 31
$FIT LD $
scaled deviance = 393.96 at cycle 4
d.f. = 31

$FIT U $
scaled deviance = 1348.9 at cycle 6
d.f. = 31

$FIT R $
scaled deviance = 1054.1 at cycle 5
d.f. = 30
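The null-model deviance of 1350.4 can be reproduced outside GLIM. In the intercept-only Poisson model the maximum likelihood estimate of every λ̂i is simply the mean count, so equation (7) can be applied directly to the A column of Table 1:

```python
import math

# number of apprentices (A) for the 33 counties, from Table 1
a = [225, 22, 44, 3, 41, 9, 2, 5, 23, 11, 9, 13, 26, 0, 5, 3, 1,
     2, 0, 1, 0, 3, 0, 4, 2, 0, 1, 7, 4, 1, 0, 0, 1]

mean_a = sum(a) / len(a)  # fitted value for every county under the null model

# scaled deviance, equation (7), with zero counts contributing 0
deviance = 2.0 * sum(y * math.log(y / mean_a) for y in a if y > 0)
print(round(deviance, 1))  # 1350.4, as in the GLIM transcript
```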
The cycle number indicates how many iterations were required for the solution. The null model deviance is fairly large, and log distance is the best explanatory variable, followed by region. They can be tried together.

$FIT LD + R $
scaled deviance = 329.60 at cycle 5
d.f. = 29

The deviance is reduced by 64 for the loss of two degrees of freedom, so it is worth retaining LD and R together in the model. Next the other variables can be added to the model. In the following commands, the + indicates that the new model consists of those variables in the previous model plus the additional one specified.
Next the other variables can be added to
the model. In the following commands,
the + indicates that the new model con-
sists of those variables in the previous
model plus the additional one specified.
$FIT + LP $
scaled deviance =
(change =
d.f. =
(change =
$FIT + zyxwv
U $
scaled deviance =
(change =
d.f. =
(change =
97.91 at cycle 4
-231.7)
28
-1)
80.524 at cycle 4
27
-17.390)
-1)
Both variables produce significant devi-
ance reductions, but the overall deviance
is stillwell over 40.11,which is the critical
chi-square value at 0.05 for 27 degrees of
freedom.
Before interaction terms are introduced into the model, the parameter estimates and standard errors can be examined.

$DISPLAY E $
     estimate     s.e.       parameter
1      0.7346     0.9885     1
2     -2.055      0.09349    LD
3     -0.1016     0.1726     R(2)
4      0.5044     0.1477     R(3)
5      0.9822     0.07816    LP
6     -0.01705    0.004141   U
scale parameter taken as 1.000

The low standard errors of LD, LP and U confirm their significance in accounting for variation in A. The parameter estimate for LP is near 1, as the gravity model would suggest. The value for LD is negative as expected, as is that for urbanization. The values shown for R(2) and R(3) are differences from the intercept; thus region 3 (the south) actually has the highest parameter value, with region 1 slightly in excess of region 2. Wrigley (1985) provides more details about interpreting this type of GLIM output.
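The claim about low standard errors can be made concrete by forming the t values described earlier (parameter estimate divided by standard error), using the numbers in the display above:

```python
# estimates and standard errors from the $DISPLAY E $ output above
params = {
    "LD": (-2.055, 0.09349),
    "LP": (0.9822, 0.07816),
    "U": (-0.01705, 0.004141),
    "R(2)": (-0.1016, 0.1726),
    "R(3)": (0.5044, 0.1477),
}

for name, (est, se) in params.items():
    t = est / se  # t value = estimate / standard error
    print(name, round(t, 2))
# LD, LP, and U have |t| well above 2; R(2) does not
```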
Now the interaction of region with the other variables can be investigated. Adding the interaction between a factor (R) and a continuous variable (LP) is equivalent to estimating a separate parameter for each level of the factor.

$FIT + R.LP $
scaled deviance = 76.951 at cycle 4
(change = -3.57)
d.f. = 25
(change = -2)

This is not significant, so it can be omitted from the model and the other interaction terms fitted.

$FIT - R.LP + R.LD $
scaled deviance = 51.836 at cycle 4
(change = -25.11)
d.f. = 25
(change = 0)

$FIT + R.U $
scaled deviance = 40.919 at cycle 4
(change = -10.92)
d.f. = 23
(change = -2)
TABLE 2
PARAMETER ESTIMATES AND STANDARD ERRORS DISPLAYED BY GLIM

      Estimate     SE         Parameter
 1    -1.623       1.949      1
 2    -1.549       0.2013     LD
 3    -3.035       3.837      R(2)
 4     8.657       3.164      R(3)
 5     1.032       0.1386     LP
 6    -0.02619     0.008928   U
 7    -2.049       0.8342     LD.R(2)
 8    -1.316       0.2999     LD.R(3)
 9     1.144       0.6820     LP.R(2)
10    -0.3891      0.2305     LP.R(3)
11    -0.02834     0.02682    U.R(2)
12     0.03025     0.01206    U.R(3)

Scale parameter taken as 1.000.
Both terms produce significant reductions in deviance when added to the model, and the overall value is approaching an acceptable fit (the critical value of chi-square at 0.01 is 41.64).

Although the interaction term R.LP was not significant earlier, it may be now that the other interaction terms have been added.

$FIT + R.LP $
scaled deviance = 32.642 at cycle 4
(change = -8.277)
d.f. = 21
(change = -2)
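For a two-degree-of-freedom term such as R.LP, the chi-square upper-tail probability has the simple closed form exp(-x/2), so the significance of the 8.277 drop in deviance can be checked directly:

```python
import math

def chi2_sf_2df(x):
    # upper-tail probability of a chi-square variable with 2 d.f.
    return math.exp(-x / 2.0)

p = chi2_sf_2df(8.277)  # deviance change for adding R.LP (2 d.f.)
print(round(p, 4))  # well below 0.05, so the term is significant
```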
This term is now significant, and the deviance is now below the critical value of chi-square at 0.05. This model could therefore have generated the observed data; the null hypothesis that apprentice migration is determined by these variables cannot be rejected on the basis of this data set. The current set of parameter estimates can now be requested with the $DISPLAY E $ command (Table 2). Apprentice migration to Edinburgh is positively associated with origin population, and negatively with distance and urbanization. There are also regional variations; for example, the distance-decay gradient is shallower to the north and steeper to the west.

The next step in the analysis might be to display the residuals, and to examine them to see if any observations are out of line with the others. As with other forms of regression analysis, the spatial pattern of residuals can often give the investigator ideas for other variables which may be of relevance.

In general, the analysis has been satisfactory (not always the case!). An acceptable fit has been achieved, and all the parameter values are in accordance with theoretical expectations. The $STOP command concludes the GLIM run.
Conclusion

This paper has presented the basics of Poisson regression analysis from a practical point of view. The example shows how a researcher may investigate a series of models with speed and flexibility. Not all analyses will produce a model with an adequate fit, especially if data for important explanatory variables are not available. Even if a good fit is not achieved, however, the results will allow an evaluation of which variables contribute the most to reducing the deviance. It must be remembered, of course, that any variable or any model, regardless of its goodness of fit, may have no causal link at all to the dependent variable, and may be just a surrogate for other variables.
The applicability of Poisson regression analysis depends on the assumption that the counts which constitute the dependent variable are generated by a Poisson distribution, which is true only if the individual events making up the counts are independent of one another. In our example, we assume that apprentices acted independently in making their moves to Edinburgh. This assumption is probably not entirely true, as apprentices may have been preceded or accompanied by relatives or friends from the same origin. In other cases, such as overall interurban migration (Flowerdew and Aitkin 1982), the independence assumption is certainly not true, for families are likely to migrate together. In such a case, some form of generalized or compound Poisson model may be appropriate (Flowerdew and Lovett 1989; Hinde 1982); fitting such a model is less straightforward and cannot be done in GLIM without special macro procedures. If a Poisson regression of a compound or generalized Poisson dependent variable is conducted, the deviance can be expected to be considerably greater than for a Poisson variable. However, Davies and Guy (1987) have shown that, for a large class of compound and generalized Poisson distributions, the parameters derived from Poisson regression are unbiased estimates of the parameters of the compound or generalized process.
Poisson regression analysis is a method developed fairly recently and as yet is little known in geography. Its use is appropriate in many contexts where the dependent variable is a count. The analysis is reasonably easy to apply, given suitable computer software such as GLIM, and geographers should make more use of it in the future.
Literature Cited

Bowlby, S., and J. Silk. 1982. Analysis of qualitative data using GLIM: Two examples based on shopping survey data. The Professional Geographer 34:230-90.
Constantine, A. G., and J. C. Gower. 1982. Models for the analysis of interregional migration. Environment and Planning A 14:477-97.
Darby, S., E. Nakashima, and H. Kato. 1985. A parallel analysis of cancer mortality among atomic bomb survivors and patients with ankylosing spondylitis given X-ray therapy. Journal of the National Cancer Institute 75:1-21.
Davies, R. B., and C. M. Guy. 1987. The statistical modeling of flow data when the Poisson assumption is violated. Geographical Analysis 19:300-14.
Flowerdew, R., and M. Aitkin. 1982. A method of fitting the gravity model based on the Poisson distribution. Journal of Regional Science 22:191-202.
Flowerdew, R., and A. Lovett. 1984. Poisson regression models for the analysis of geographical data. End-of-grant Report to the Economic and Social Research Council, London.
Flowerdew, R., and A. Lovett. 1988. Fitting constrained Poisson regression models to inter-urban migration flows. Geographical Analysis 20:297-307.
Flowerdew, R., and A. Lovett. 1989. Compound and generalised Poisson models for inter-urban migration. In Advances in Regional Demography: Forecasts, Information, Models, ed. P. Congdon and P. Batey, 246-56. London: Belhaven Press.
Fotheringham, A. S., and P. A. Williams. 1983. Further discussion on the Poisson interaction model. Geographical Analysis 15:343-47.
Frome, E. L., M. H. Kutner, and J. J. Beauchamp. 1973. Regression analysis of Poisson-distributed data. Journal of the American Statistical Association 68:935-40.
Gatrell, A. C. 1986. Immunization against whooping cough in Salford: A spatial analysis. Social Science and Medicine 23:1027-32.
Guy, C. M. 1987. Recent advances in spatial interaction modelling: An application to the forecasting of shopping travel. Environment and Planning A 19:173-86.
Hinde, J. 1982. Compound Poisson regression models. In GLIM 82, ed. R. Gilchrist, 109-21. New York: Springer-Verlag.
Lovett, A. 1984. Poisson regression using the GLIM package. Lancaster: Department of Geography, University of Lancaster, Computer Package Guide No. 5.
Lovett, A., C. G. Bentham, and R. Flowerdew. 1986. Analysing geographic variations in mortality using Poisson regression: The example of ischaemic heart disease in England and Wales 1969-1973. Social Science and Medicine 23:935-43.
Lovett, A., and A. C. Gatrell. 1988. The geography of spina bifida in England and Wales. Transactions, Institute of British Geographers n.s. 13:288-302.
Lovett, A., I. D. Whyte, and K. A. Whyte. 1985. Poisson regression analysis and migration fields: The example of the apprenticeship records of Edinburgh in the seventeenth and eighteenth centuries. Transactions, Institute of British Geographers n.s. 10:317-32.
McCullagh, P., and J. A. Nelder. 1983. Generalized Linear Models. London: Chapman and Hall.
Nelder, J. A., and R. W. M. Wedderburn. 1972. Generalized linear models. Journal of the Royal Statistical Society A 135:370-84.
O'Brien, L. G. 1983. Generalised linear modelling using the GLIM system. Area 15:327-35.
O'Brien, L. G. 1987. GLIM 3.77 (software review). The Professional Geographer 39:229-30.
Payne, C. D., ed. 1985. The GLIM System Release 3.77: Manual. Oxford: Numerical Algorithms Group.
Pickles, A. 1985. An Introduction to Likelihood Analysis. Concepts and Techniques in Modern Geography, No. 42. Norwich, UK: Geo Books.
Senior, M. L. 1987. The establishment of family planning clinics in south-west Nigeria by 1970: Analyses using logit and Poisson regression. Area 19:237-45.
Twomey, J. 1986. Establishment migration: An analytical framework. Environment and Planning A 18:967-79.
Vincent, P. J., and J. M. Haworth. 1983. Poisson regression models of species abundance. Journal of Biogeography 10:153-60.
Whyte, I. D., and K. A. Whyte. 1986. Patterns of migration of apprentices into Aberdeen and Inverness during the eighteenth and early nineteenth centuries. Scottish Geographical Magazine 102:81-92.
Wrigley, N. 1985. Categorical Data Analysis for Geographers and Environmental Scientists. London: Longman.
ANDREWLOVETT (B.A., University College of Wales,
Aberystwyth) is Teaching Fellow, Department of Ge-
ography, University of Lancaster, Lancaster LA1 4YB,
UK. His research interests are in medical geography
and industrial linkage studies. ROBIN FLOWERDEW
(Ph.D., Northwestern University) is Lecturer in Ge-
ography, University of Lancaster. His research inter-
ests are in migration modeling and geographical in-
formation systems.
OPEN FORUM
Professional Geographer, 41(2), 1989, pp. 198-203
© Copyright 1989 by Association of American Geographers
THE HANFORD ENVIRONMENTAL DOSE
RECONSTRUCTION PROJECT
Richard Morrill
University of Washington
A five year, $15 million study is un-
derway to estimate radiation doses that
people might have received from opera-
tions at the Hanford, WA, nuclear reser-
vation. The study is funded by the US
Department of Energy (DOE), and the re-
search is conducted by Battelle’s Pacific
Northwest Laboratory, the current main
research contractor at Hanford. The work
is directed by an independent Technical
Steering Panel (TSP) on which I serve as
the demographer. Despite the intensely
geographic nature of the problem, the
possibility of including a geographer as
such on the panel was never raised; we have
a long way to go. This report provides
some background on the issues surround-
ing the Hanford controversy, with the
hope that readers with expertise in the
area or topic will contact me.
The Hanford Reservation
Hanford is one of the nation’s major
nuclear production and research reserv-
ations. As many as nine primary produc-
tion reactors have existed at various times.
Eight water-cooled reactors along the Co-
lumbia River have been the source of large