1. The document discusses modeling road traffic accident deaths in South Africa using generalized linear models. It analyzes mortality data from 2001-2006 to determine prevalence among age groups.
2. A negative binomial regression model was used instead of a Poisson regression model because the data exhibited overdispersion. The analysis found that the 35-49 age group had the highest prevalence of road traffic accident deaths at 26.6%.
3. Females had an expected death rate that was 65.4% lower than that of males. Being in the 35-49 age group changed the mean death rate by a factor of 0.557 relative to those over 65, a decrease of 44.3% for both genders.
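The percentages quoted above follow from how a log-link count model (Poisson or negative binomial) is usually read: the exponentiated coefficient is a multiplicative rate ratio. The sketch below only reproduces that arithmetic from the two figures in the summary; the function names are hypothetical and no model is fitted here.

```python
import math


def rate_ratio_from_coef(beta):
    """In a log-link count model, exp(beta) is the multiplicative effect on the mean."""
    return math.exp(beta)


def percent_change(rate_ratio):
    """Percent change in the expected count implied by a rate ratio."""
    return (rate_ratio - 1.0) * 100.0


# Factor reported in the summary for ages 35-49 relative to those over 65.
print(round(percent_change(0.557), 1))  # -44.3

# A rate 65.4% lower for females corresponds to a rate ratio of 1 - 0.654.
female_ratio = 1.0 - 0.654
print(round(percent_change(female_ratio), 1))  # -65.4

# The matching coefficient on the log scale (an implied value, not one
# taken from the study's output).
beta_female = math.log(female_ratio)
print(round(rate_ratio_from_coef(beta_female), 3))  # 0.346
```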
This document summarizes a study that used small area estimation to analyze poverty levels across sub-districts in Demak District, Indonesia. Regression analysis was used to model the relationship between poverty (dependent variable) and the percentage of farm households, access to water taps, and population density (independent variables). Small area estimation with a non-parametric kernel approach was then applied to estimate poverty levels for each sub-district using the model and additional data from statistics surveys. The results of this poverty mapping showed that population density was the dominant factor influencing poverty levels in some sub-districts of Demak.
ESSENTIAL MODIFICATIONS ON BIOGEOGRAPHY-BASED OPTIMIZATION ALGORITHM csandit
Biogeography-based optimization (BBO) is a new population-based evolutionary algorithm
based on an old theory of island biogeography that explains the geographical distribution of
biological organisms. BBO was introduced in 2008, and many modifications have since been
made to enhance its performance. This paper proposes two modifications. First, the
probabilistic selection process of the migration and mutation stages is modified to give a
fairly randomized selection for all the features of the islands. Second, the clear duplication
process after the mutation stage is sized to avoid any corruption of the suitability index
variables. Results obtained over a wide range of test functions with different
dimensions and complexities show that BBO performance can be enhanced effectively
without using any complicated form of the immigration and emigration rates. This essential
modification should be considered an initial step for any other modification.
Gamma and inverse Gaussian frailty models: A comparative study inventionjournals
Frailty models have become very popular during the last three decades and their applications are numerous. The main goal of this manuscript is to compare two frailty models (the gamma frailty model and the inverse Gaussian frailty model), each with a log-logistic baseline hazard function. Both models are fitted to a real data set in order to compare them. The gamma frailty model is found to fit this data set best, followed by the inverse Gaussian frailty model, which in turn fits the data better than the Cox model.
ADAPTATION OF PARAMETRIC UNIFORM CROSSOVER IN GENETIC ALGORITHM cscpconf
Exploration of the search space occurs at the cost of destroying existing good solutions. This cost grows as the search progresses. The parametric uniform crossover is a general form of the uniform crossover operator. With this operator, it is possible to control the swapping probability of each locus. An adaptive method is proposed that controls the value of the exchange probability of the parametric uniform crossover. The population is diversified when its diversity decreases. Solutions are recombined with regard to their fitness distance to reduce the destruction of good solutions. The
experiments conducted show a significant improvement in the performance of the parametric uniform crossover in comparison with state-of-the-art methods.
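The operator described above can be sketched in a few lines. This is a minimal illustration of a plain parametric uniform crossover with a fixed swap probability; the adaptive rule for setting that probability is not specified in the abstract, so it is not reproduced here, and the function and parameter names are assumptions.

```python
import random


def parametric_uniform_crossover(parent_a, parent_b, swap_prob, rng=random):
    """Swap each locus between two parents independently with probability swap_prob."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    child_a, child_b = list(parent_a), list(parent_b)
    for i in range(len(child_a)):
        # rng.random() lies in [0, 1), so swap_prob=0 never swaps and
        # swap_prob=1 always swaps.
        if rng.random() < swap_prob:
            child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b


rng = random.Random(42)  # fixed seed so the run is repeatable
a = [0, 0, 0, 0, 0, 0, 0, 0]
b = [1, 1, 1, 1, 1, 1, 1, 1]
print(parametric_uniform_crossover(a, b, swap_prob=0.5, rng=rng))
```

With `swap_prob=0.5` this reduces to the ordinary uniform crossover; lower values keep the children closer to their parents, which is the lever an adaptive scheme would adjust.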
Adaptation of parametric uniform crossover in genetic algorithm csandit
Exploration of the search space occurs at the cost of destroying existing good solutions. This cost
grows as the search progresses. The parametric uniform crossover is a general form of the
uniform crossover operator. With this operator, it is possible to control the swapping
probability of each locus. An adaptive method is proposed that controls the value of the exchange
probability of the parametric uniform crossover. The population is diversified when
its diversity decreases. Solutions are recombined with
regard to their fitness distance to reduce the destruction of good solutions. The
experiments conducted show a significant improvement in the performance of the parametric
uniform crossover in comparison with state-of-the-art methods.
Lecture 01 Introduction (Traffic Engineering هندسة المرور & Dr. Usama Shahdah) Hossam Shafiq I
This document provides an overview of a traffic engineering course, including:
- Contact information for the instructor and information about the course website.
- Requirements and grading breakdown, including assignments, exams, and a term project.
- References and software used in the course.
- An introduction to traffic engineering and what traffic engineers do, such as conducting traffic studies, evaluating performance, designing facilities, and controlling traffic.
- Components of a traffic system including diverse road users and vehicles, and the complex roadway network.
- The role of traffic flow theory in modeling and analyzing traffic systems.
Diagnostic imaging in head and neck pathology Hayat Youssef
This document provides an overview of various diagnostic imaging modalities used in head and neck pathology including their history, principles, applications, advantages, and limitations. It discusses x-ray imaging techniques like conventional radiography and tomography. It also covers computed tomography, cone beam computed tomography, magnetic resonance imaging, ultrasound imaging, and nuclear imaging techniques like scintigraphy, positron emission tomography, and single photon emission tomography. Each imaging modality is described in terms of its basic principles, clinical applications in head and neck cases, benefits, and shortcomings. The document serves as a comprehensive reference for radiologists on diagnostic tools available for evaluating head and neck conditions.
Pedestrian Accident Scenario of Dhaka City and Development of a Prediction Model RafidTahmid1
Conference: International Conference on Recent Innovation in Civil Engineering for Sustainable Development (IICSD).
Year: 2015.
Place: Department of Civil Engineering, DUET - Gazipur, Bangladesh.
Type: Conference Paper.
Paper ID: TE-049.
Authors: H. M. Ahsan (1); M. H. Rahman (2).
(1) Professor, Department of Civil Engineering, BUET.
Email: hmahsan@ce.buet.ac.bd
(2) Undergraduate Student, Department of Civil Engineering, BUET.
Email: md.hasibur.rahman.buet.ce@gmail.com
Analysis Of Count Data Using Poisson Regression Amy Cernava
This document describes Poisson regression, a statistical technique for analyzing count data using regression. It compares Poisson regression to ordinary least squares regression, outlines how to perform Poisson regression in the GLIM software package, and provides an example analyzing historical apprentice migration data to Edinburgh. Key aspects include:
- Poisson regression is appropriate when the dependent variable is a count, unlike OLS regression which assumes a normal distribution.
- It models the logarithm of the mean as a linear combination of predictors rather than the mean directly.
- GLIM allows specification of the Poisson error distribution and logarithmic link function required for Poisson regression.
- An example apprentice migration data set is analyzed to demonstrate the technique.
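The log-link idea in the list above can be made concrete with a small fit. The sketch below is not GLIM and does not use the apprentice data; it is a standard-library Python illustration with one predictor, fitted by Newton-Raphson (equivalent to Fisher scoring for this model), and the counts are made up purely for demonstration.

```python
import math


def fit_poisson_regression(xs, ys, iterations=25):
    """Fit log(mu) = b0 + b1 * x by maximum likelihood with Newton-Raphson steps."""
    b0 = math.log(sum(ys) / len(ys))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(iterations):
        mus = [math.exp(b0 + b1 * x) for x in xs]
        # Score vector: derivatives of the Poisson log-likelihood.
        s0 = sum(y - mu for y, mu in zip(ys, mus))
        s1 = sum(x * (y - mu) for x, y, mu in zip(xs, ys, mus))
        # Information matrix [[i00, i01], [i01, i11]].
        i00 = sum(mus)
        i01 = sum(x * mu for x, mu in zip(xs, mus))
        i11 = sum(x * x * mu for x, mu in zip(xs, mus))
        det = i00 * i11 - i01 * i01
        # Newton step: add (information matrix)^-1 times the score.
        b0 += (i11 * s0 - i01 * s1) / det
        b1 += (i00 * s1 - i01 * s0) / det
    return b0, b1


# Hypothetical counts, for illustration only.
xs = [0, 1, 2, 3, 4, 5]
ys = [2, 3, 6, 7, 8, 9]
b0, b1 = fit_poisson_regression(xs, ys)
print("intercept:", b0, "slope:", b1)
print("multiplicative effect per unit of x:", math.exp(b1))
```

Because the model is linear on the log scale, a one-unit increase in `x` multiplies the expected count by `exp(b1)` rather than adding to it, which is the main practical difference from an OLS fit.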
Modeling of driver lane choice behavior with artificial neural networks (ann)... cseij
In parallel with economic development, the importance of road transportation has increased
significantly in Turkey. As a result, long-distance freight transportation has gained importance and
the number of heavy vehicles has increased significantly. Consequently, deformations are
observed on road surfaces, as increasing freight transportation and climatic conditions both affect the
road surface. The road surface therefore loses functionality, and drivers become more prone to
accidents because they tend to change lanes to avoid
passing through the deformation area. In this study, the lane-changing behavior of drivers was
investigated, and both Artificial Neural Network (ANN) and Linear Regression (LR) models were proposed
to simulate the lane-changing behavior of drivers approaching a specific road deformation area. The
potential of the ANN model for simulating driver behavior was evaluated by comparing its
performance with that of the LR model. Although the ANN model showed only a slight performance increase over
the LR model, ANN models can evidently play an important role in predicting
driver behavior when approaching a road surface deformation. Approach speed is an
important factor in the lane-changing behavior of a driver. This may be because drivers
with high approach speeds are more likely to pass through the deformation in order to avoid accidents while
changing lanes at high speed.
EVALUATION OF PARTICLE SWARM OPTIMIZATION ALGORITHM IN PREDICTION OF THE CAR ... ijcsa
Road traffic accidents are the most common type of accident and endanger the lives of many people around the world every year. Iran is among the countries with the highest reported incidence of, and mortality from, such accidents, which calls for identifying the underlying dimensions of the problem. As the number of car accidents grows, so does the volume of related information, and exploring it to reveal hidden dependencies takes a very long time. Traditional methods are not adequate for discovering the complex relations between the factors involved, so new techniques are needed. The main aim of this paper is to find the best relationships within this volume of information in the shortest time. To that end, we classify accidents in West Azerbaijan province in Iran by accident type (damage, injury, death) and describe them using the Particle Swarm Optimization (PSO) algorithm.
UNDERSTANDING LEAST ABSOLUTE VALUE IN REGRESSION-BASED DATA MINING IJDKP
This article advances our understanding of regression-based data mining by comparing the utility of Least
Absolute Value (LAV) and Least Squares (LS) regression methods. Using demographic variables from
U.S. state-wide data, we fit variable regression models to dependent variables of varying distributions
using both LS and LAV. Forecasts generated from the resulting equations are used to compare the
performance of the regression methods under different dependent variable distribution conditions. Initial
findings indicate that LAV procedures forecast better in data mining applications when the dependent variable
is non-normal. Our results differ from those found in prior research using simulated data.
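The contrast between the two criteria is easiest to see in the simplest possible regression, one with only an intercept: the least-squares fit is the sample mean, while the least-absolute-value fit is the sample median. The sketch below illustrates that special case with made-up numbers; it is not the article's models or data.

```python
import statistics


def sum_squared_error(values, c):
    """Least-squares (LS) criterion for a constant fit c."""
    return sum((v - c) ** 2 for v in values)


def sum_absolute_error(values, c):
    """Least-absolute-value (LAV) criterion for a constant fit c."""
    return sum(abs(v - c) for v in values)


# Hypothetical, deliberately skewed sample with one extreme value.
values = [1, 2, 3, 4, 100]

ls_fit = statistics.mean(values)     # minimizes the squared-error criterion
lav_fit = statistics.median(values)  # minimizes the absolute-error criterion

print("LS fit (mean):", ls_fit)
print("LAV fit (median):", lav_fit)
print("absolute error at LS fit:", sum_absolute_error(values, ls_fit))
print("absolute error at LAV fit:", sum_absolute_error(values, lav_fit))
```

The single extreme value pulls the LS fit far from the bulk of the data while the LAV fit barely moves, which gives some intuition for why the two methods can behave differently when the dependent variable is non-normal.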
Generalized Additive and Generalized Linear Modeling for Children Diseases QUESTJOURNAL
ABSTRACT: This paper is restricted to the application of Generalised Linear Models (GLM) and Generalised Additive Models (GAM), and is intended to give readers some measure of the power of these mathematical tools for modeling health and illness data. Illness in general, and childhood illness in particular, is among the most serious socio-economic and demographic problems in developing countries and has a great impact on future development. In this paper we focus on some frequently occurring diseases among children under fourteen years of age, using data collected from various hospitals of Jammu district from 2011 to 2016. The success of any policy or health care intervention depends on a correct understanding of the socio-economic, environmental, and cultural factors that determine the occurrence of diseases and deaths. Until recently, any morbidity information available was derived from clinics and hospitals. Information on the incidence of diseases obtained from hospitals represents only a small proportion of illness, because many cases do not seek medical attention. Thus, hospital records may not be appropriate for estimating the incidence of diseases for programme development. The use of DHS data in understanding childhood morbidity has expanded rapidly in recent years. However, few attempts have been made to address explicitly the problem of non-linear effects of metric covariates in the interpretation of results. This study shows how the GAM can be adapted to extend the GLM analysis and explain non-linear relationships with the covariates. Incorporating non-linear terms in the model improves the estimates in terms of goodness of fit.
The GLM is explicitly specified by giving a symbolic description of the linear predictor and a description of the error distribution, and the GAM is fitted using the local scoring algorithm, which iteratively fits weighted additive models by backfitting. The backfitting algorithm is a Gauss-Seidel method for fitting additive models by iteratively smoothing partial residuals. The algorithm separates the parametric from the non-parametric parts of the fit, and fits the parametric part using weighted linear least squares within the backfitting algorithm.
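The Gauss-Seidel structure of backfitting can be sketched briefly. The version below is a simplified, unweighted illustration: it uses a least-squares line as a stand-in for the smoother and omits the local scoring loop and weights described above, so it shows only the cycle of "smooth the partial residuals of one component while holding the others fixed". All names and the data are assumptions.

```python
def linear_smoother(x, r):
    """Stand-in smoother: centered least-squares line through the points (x, r)."""
    n = len(x)
    x_mean = sum(x) / n
    r_mean = sum(r) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sum((xi - x_mean) * (ri - r_mean) for xi, ri in zip(x, r)) / sxx
    return [slope * (xi - x_mean) for xi in x]


def backfit(columns, y, sweeps=20):
    """Fit y = alpha + f_1(x_1) + ... + f_p(x_p) by cycling over the components."""
    n = len(y)
    alpha = sum(y) / n
    f = [[0.0] * n for _ in columns]
    for _ in range(sweeps):
        for j, x in enumerate(columns):
            # Partial residuals: remove the intercept and every other component.
            partial = [
                y[i] - alpha - sum(f[k][i] for k in range(len(columns)) if k != j)
                for i in range(n)
            ]
            f[j] = linear_smoother(x, partial)
    fitted = [alpha + sum(fj[i] for fj in f) for i in range(n)]
    return alpha, f, fitted


# Hypothetical data generated from y = 1 + 2*x1 - 3*x2 with no noise.
x1 = [-1, 0, 1, -1, 0, 1]
x2 = [-1, -1, -1, 1, 1, 1]
y = [1 + 2 * a - 3 * b for a, b in zip(x1, x2)]
alpha, components, fitted = backfit([x1, x2], y)
print("intercept:", alpha)
print("fitted:", fitted)
```

Replacing `linear_smoother` with a non-parametric smoother (a running mean or spline, for example) is what would turn this into an additive model with non-linear terms.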
https://utilitasmathematica.com/index.php/Index
Our Journal has recently transitioned to becoming a fully open-access journal. This transition marks a significant shift in the landscape of academic publishing, aiming to provide numerous benefits to both authors and readers. Open access has the potential to democratize knowledge, enhance research impact, and foster greater collaboration within the academic community.
Utilitas Mathematica Journal is a broad-scope journal that publishes original research and review articles on all aspects of pure and applied mathematics and statistics.
Utilitas Mathematica Journal publishes original research in all areas of pure and applied mathematics, including number theory, operations research, and mathematical biology. The Utilitas Mathematica Journal commits to strengthening our professional .
https://utilitasmathematica.com/index
Our journal's transition to a fully open-access format is a significant step toward advancing the principles of open science and equitable access to knowledge. However, this transition also brings challenges, such as ensuring sustainable funding models and maintaining rigorous peer-review standards.
A linear prediction based on logarithmic space (theoretically non-linear) is proposed, which is essentially a geometric series. The number of deaths of coal miners decreases year by year in that proportion.
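The link between a straight line in logarithmic space and a geometric series can be checked numerically. In the sketch below the yearly figures are synthetic and chosen only to illustrate the idea; they are not the coal-mining data, and the function names are hypothetical.

```python
import math


def fit_log_linear(ts, ys):
    """Ordinary least-squares line through (t, log y); returns (intercept, slope)."""
    n = len(ts)
    logs = [math.log(y) for y in ys]
    t_mean = sum(ts) / n
    l_mean = sum(logs) / n
    slope = sum((t - t_mean) * (l - l_mean) for t, l in zip(ts, logs)) / sum(
        (t - t_mean) ** 2 for t in ts
    )
    intercept = l_mean - slope * t_mean
    return intercept, slope


def predict(intercept, slope, t):
    """Back-transform the linear prediction from log space to the original scale."""
    return math.exp(intercept + slope * t)


# Synthetic series that shrinks by 10% per step (illustration only).
ts = [0, 1, 2, 3, 4, 5]
ys = [1000 * 0.9 ** t for t in ts]

intercept, slope = fit_log_linear(ts, ys)
ratio = math.exp(slope)  # common ratio of the implied geometric series
print("yearly ratio:", ratio)
print("prediction for t = 6:", predict(intercept, slope, 6))
```

A constant slope on the log scale means each year's value is the previous year's value times `exp(slope)`, so the prediction is linear in log space but a geometric sequence on the original scale.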
In this paper we focus on mixed model analysis for regression models to take account of overdispersion in random effects. We present data exploration, box plots, QQ plots, analysis of variance, linear models, and linear mixed-effects models for testing the overdispersion parameter in the mixed model. A mixed model is similar in many ways to a linear model. It estimates the effects of one or more explanatory variables on a response variable. In this article, the mixed model analysis was carried out in R. The output of a mixed model gives a list of explanatory variables, estimates and confidence intervals of their effect sizes, P-values for each effect, and at least one measure of how well the model fits. The model was applied to open-source data, including a numerical illustration and real datasets.
General Linear Model is an ANOVA procedure in which the calculations are performed using the least squares regression approach to describe the statistical relationship between one or more predictors and a continuous response variable. Predictors can be factors and covariates. More information on the General Linear Model is available at http://www.transtutors.com/homework-help/statistics/general-linear-model.aspx
This document provides an introduction to generalized linear mixed models (GLMMs). GLMMs allow for modeling of data that violates assumptions of linear mixed models, such as non-normal distributions and non-constant variance. The document discusses the components of a GLMM, including the linear predictor, inverse link function, and variance function. It also describes how to derive estimating equations for GLMMs and provides an example for a univariate logit model. Estimation of variance components is also briefly discussed.
The document discusses mixed models, which contain both fixed and random effects. Fixed effects have all possible levels included in the study, while random effects are a random sample from the total population. The mixed model is represented as Y = Xβ + Zγ + ε, where β are fixed effects, X are fixed effect variables, Z are random effects, γ are random effect parameters, and ε is the error term. Mixed models can model both fixed and random effects, account for correlation in errors, and handle missing data. They provide correct standard errors compared to general linear models (GLMs). Model fitting involves likelihood ratio tests and information criteria to select the best fitting model.
The differential evolution (DE) algorithm has been applied as a powerful tool to find optimum switching angles for selective harmonic elimination pulse width modulation (SHEPWM) inverters. However, DE's performance is very dependent on its control parameters. Conventional DE generally uses either a trial-and-error mechanism or a tuning technique to determine appropriate values of the control parameters. The disadvantage of this process is that it is very time consuming. In this paper, an adaptive control parameter is proposed in order to speed up the DE algorithm in optimizing SHEPWM switching angles precisely. The proposed adaptive control parameter is shown to enhance the convergence process of the DE algorithm without requiring initial guesses. The results for both negative and positive modulation index (M) also indicate that the proposed adaptive DE is superior to the conventional DE in generating SHEPWM switching patterns.
Integration Method of Local-global SVR and Parallel Time Variant PSO in Water...TELKOMNIKA JOURNAL
Floods are a type of natural disaster that cannot be predicted; one of the main causes of flooding is continuous rain. In meteorological terms, floods are caused by high rainfall and high sea tides, which raise the water level. Analysis of rainfall and water level in each period has not yet been able to solve the existing problems. Therefore, in this study, an integration of Parallel Time Variant PSO (PTVPSO) and Local-Global Support Vector Regression (SVR) is proposed to forecast water level. The implementation uses SVR as the regression method for forecasting the water level, the Local-Global concept to minimize computing time, and PTVPSO to optimize the SVR parameters for maximum performance and higher accuracy. It is hoped that this system can support a flood early warning system under erratic weather.
In this article, 180 gastric images taken with Light Microscope help are used. Maximally Stable
Extremal Regions (MSER) features of the images for classification has been calculated. These MSER features
have been applied Discrete Fourier Transform (DFT) method. High-dimensional of these MSER-DFT feature
vectors is reduced to lower-dimensional with Local Tangent Space Alignment (LTSA) and Neighborhood
Preserving Embedding (NPE). When size reduction process was done, properties in 5, 10, 15, 20, 25, 30, 35, 40,
45, and 50 dimensions have been obtained. These low-dimensional data are classified by Random Forest (RF)
classification. Thus, MSER_DFT_LTSA-NPE_RF method for gastric histopathological images have been
developed. Classification results obtained with these methods have been compared. According to the other
methods, classification results for gastric histopathological images have been found to be higher.
Application of Semiparametric Non-Linear Model on Panel Data with Very Small ...IOSRJM
This research work investigated the behaviour of a new semiparametric non-linear (SPNL) model on
a set of panel data with very small time point (T = 1). The SPNL model incorporates the relationship between
individual independent variable and unobserved heterogeneity variable. Five different estimation techniques
namely; Least Square (LS), Generalized Method of Moments (GMM), Continuously Updating (CU), Empirical
Likelihood (EL) and Exponential Tilting (ET) Estimators were employed for the estimation; for the purpose of
modelling the metrical response variable non-linearly on a set of independent variables. The performances of
these estimators on the SPNL model were examined for different parameters in the model using the Least
Square Error (LSE), Mean Absolute Error (MAE) and Median Absolute Error (MedAE) criteria at the lowest
time point (T = 1). The results showed that the ET estimator which provided the least errors of estimation is
relatively more efficient for the proposed model than any of the other estimators considered. It is therefore
recommended that the ET estimator should be employed to estimate the SPNL model for panel data with very
small time point.
Makalah Seminar_KNM XVII_ITS
KNM XVII, 11-14 June 2014, ITS, Surabaya
MODELLING ROAD TRAFFIC ACCIDENT DEATHS IN SOUTH AFRICA USING GENERALIZED LINEAR MODELS
SHARON OGOLLA (1), SONY SUNARYO (2), IRHAMAH (3)
(1) Institut Teknologi Sepuluh Nopember Surabaya, sha.ogolla@gmail.com
(2) Institut Teknologi Sepuluh Nopember Surabaya, sony_s@statistika.its.ac.id
(3) Institut Teknologi Sepuluh Nopember Surabaya, irhamahn@yahoo.com
Abstract
The World Health Organization (WHO) reports that over 1.2 million people die annually
in road accidents, and deaths resulting from road traffic crashes have been
projected to reach 8.4 million in the year 2020. To analyze mortality data it is
necessary to consider the mortality rates of specific age groups, so that the groups in
which deaths are most prevalent can be identified. The model is developed using the
Generalized Linear Modeling (GLM) method. The data were first analyzed with
Poisson regression models. The data were found to be over-dispersed, that is, the
variance was greater than the mean. As a result, the Negative
Binomial model was used as an alternative and was found to fit the data well.
Incremental addition of relevant explanatory variables expanded the basic model
into a comprehensive model. The analysis shows that the 35-49 age group has the
highest prevalence of road traffic accident deaths, at 26.6%. Females had an expected
death rate 65.4% lower than that of males at all
ages. The effect of being in the 35–49 year age group, compared with those aged 65 and over, is to
multiply the mean death rate by $e^{\hat{\beta}} = 0.557$, that is, to decrease the mean death rate by
an estimated 44.3%, for both genders.
Keywords : Generalized Linear Models, Negative Binomial Regression, Poisson
Regression, South Africa
1. Introduction
Generalized linear models play a very important role in statistical inference. They
provide a mathematical way of quantifying the relationship between a response variable
and a set of independent variables, and they cover a general class of statistical models.
Originally introduced by Nelder and Wedderburn [1], the generalized linear model (GLM) is
an extension of the classical linear model. It includes linear regression models,
analysis of variance models, logistic regression models, Poisson regression models,
zero-inflated Poisson regression models, negative binomial regression models, and
log-linear models, as well as many other models.
There are several studies that have been conducted relating to Generalized Linear
Models to solve real problems. Umar et al. [2] carried out a study to determine the impact
of running headlights on conspicuity-related motorcycle accidents in Malaysia. The
Generalized linear model with Poisson distribution and log link was used to describe the
frequency of conspicuity-related motorcycle accidents. The explanatory variables used
consisted of: influence of time trends, changes in recording system, effect of fasting
during month of Ramazan, and Balik Kampong which is a religious holiday unique to the
multi-cultural society of Malaysia. In order to overcome the over-dispersion of data, the
quasi-likelihood technique was used. Russo et al. [3] used it in Brazil to model the
number of deaths in Santo Angelo. In health, Jahangeer et al. [4] used generalized linear
models to analyze the factors influencing exclusive breastfeeding.
Studies done worldwide by Odero et al. [5] and Balogun et al. [6] have shown that road
traffic accidents are among the leading causes of death for adolescents and young adults.
There is evidence that minimum safety standards, crashworthiness improvements in
vehicles, seat belt use laws and reduced alcohol use can substantially reduce deaths on
the road (Leon [7]). In developing countries, including South Africa, the situation
differs from that in developed countries: road traffic accidents are increasing with time, and
mortality due to road traffic accidents is also on the rise (Asogwa [8]). Peden et al. [9]
reported that, when population figures are taken into account, developing countries in
Sub-Saharan Africa have the highest frequency of various accidents worldwide.
In South Africa, 3,280,931 deaths were recorded between 2001 and 2006, of which
9.5% were due to non-natural causes [10]. Road traffic accident deaths comprised 9.3% of
non-natural deaths. Data from the National Injury Mortality Surveillance System
(NIMSS) showed that in 2005, transport-related injuries accounted for 74.3% of all
accidental (or unintentional) deaths [11]. Analysis of the injury burden in South Africa by
Norman et al. [12] showed that the age standardized road traffic injury mortality rates for
South Africa were about double the global rate for both males and females.
The benefits to be achieved from the results of this study are to provide scientific
insights concerning Generalized Linear Models and to create a platform for future studies
into modeling number of deaths by using Generalized Linear Models.
2. Literature Review
A. Generalized Linear Models
Generalized linear models are a natural generalization of classical linear models that
allow the mean of a population to depend on a linear predictor through a non-linear link
function. This allows the response probability distribution to be any member of the
exponential family of distributions.
A generalized linear model (or GLM) consists of three components:
1. A random component, which specifies the conditional distribution of the response
variable $Y_i$, given the explanatory variables.
2. A linear function of the regression variables, called the linear predictor,
$\eta_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik}$ (1)
on which the expected value $\mu_i$ of $Y_i$ depends.
3. An invertible link function, $g(\mu_i) = \eta_i$ (2)
This transforms the expectation of the response to the linear predictor. The inverse of the
link function is sometimes called the mean function
$\mu_i = g^{-1}(\eta_i)$ (3)
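The three components can be illustrated with a minimal Python sketch. It assumes a log link (the link used later in the paper); the coefficient and covariate values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def linear_predictor(beta, x):
    """eta = beta_0 + beta_1*x_1 + ... + beta_k*x_k (x excludes the constant term)."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))

def log_link(mu):
    """Link function g: maps the mean to the linear predictor scale."""
    return math.log(mu)

def inverse_log_link(eta):
    """Mean function g^{-1}: maps the linear predictor back to the mean."""
    return math.exp(eta)

# Hypothetical values, for illustration only
beta = [0.5, -0.2, 0.1]
x = [1.0, 3.0]

eta = linear_predictor(beta, x)   # 0.5 - 0.2*1.0 + 0.1*3.0
mu = inverse_log_link(eta)        # expected response under the log link
print(eta, mu)
```

Applying `log_link` to `mu` recovers `eta`, which is the sense in which the link function is invertible.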
B. Poisson Regression Model
The Poisson regression model is a specific type of GLM and is non-linear. Poisson
regression analysis is a technique used to model dependent variables that describe count
data [13]. Poisson regression model has often been applied to estimate standardized
mortality and incidence ratios in cohort studies and in ecological investigations.
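The primary equation of the model is
$P(Y_i = y_i) = \frac{e^{-\lambda_i} \lambda_i^{y_i}}{y_i!}, \quad y_i = 0, 1, 2, \ldots$ (4)
The most common formulation of this model is the log-linear specification
$\ln \lambda_i = x_i' \beta$ (5)
The expected number of events per period is given by
$E(y_i \mid x_i) = \lambda_i = e^{x_i' \beta}$ (6)
The Poisson regression model is a specific type of generalized linear model (GLM) whose
parameters can be estimated using the maximum likelihood method, with the likelihood
function given by:
$L(\beta) = \prod_{i=1}^{n} P(Y_i = y_i) = \prod_{i=1}^{n} \frac{e^{-\lambda_i} \lambda_i^{y_i}}{y_i!}$ (7)
and the ln-likelihood function equal to:
$\ln L(\beta) = -\sum_{i=1}^{n} \lambda_i + \sum_{i=1}^{n} y_i \ln \lambda_i - \sum_{i=1}^{n} \ln(y_i!)$ (8)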
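As an illustration of the ln-likelihood, the following Python sketch evaluates the standard Poisson regression log-likelihood under a log-linear mean. The design matrix, counts and coefficients are hypothetical toy values, not the South African mortality data.

```python
import math

def poisson_log_likelihood(beta, X, y):
    """Sum over i of: -lambda_i + y_i*ln(lambda_i) - ln(y_i!), with ln(lambda_i) = x_i'beta."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, x_i))  # x_i includes a leading 1 for the intercept
        lam = math.exp(eta)
        total += -lam + y_i * eta - math.lgamma(y_i + 1)  # lgamma(y+1) = ln(y!)
    return total

# Hypothetical toy data, for illustration only
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [1, 3, 4]
beta = [0.2, 0.6]

ll = poisson_log_likelihood(beta, X, y)
print(ll)
```

Working on the log scale with `math.lgamma` avoids overflow in the factorial term when counts are large.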
C. Solving For Over-dispersion In Poisson Regression
Over-dispersion may be modeled using compound Poisson distributions. With this
model the count y is Poisson distributed with mean λ, but λ is itself a random variable
which causes the variation to exceed that expected if the Poisson mean were fixed [14].
Thus suppose λ is regarded as a positive continuous random variable with probability
function g(λ). Given λ, the count is distributed as P(λ). Then the probability function of y
is
$f(y) = \int_0^{\infty} \frac{e^{-\lambda} \lambda^{y}}{y!} \, g(\lambda) \, d\lambda$ (9)
A convenient choice for g(λ) is the gamma probability function G(μ, ν), implying (9) is
NB (μ, κ) where κ = 1/ν. In other words the negative binomial arises when there are
different groups of risks, each group characterized by a separate Poisson mean, and with
the means distributed according to the gamma distribution [14].
D. Negative Binomial Regression Model
The negative binomial distribution can be derived in many ways. There are twelve known
derivations of the negative binomial distribution; among them, it can be obtained as a
Poisson-gamma mixture distribution, as a compound Poisson
distribution, as a sequence of Bernoulli trials, or as the inverse of the binomial
distribution [15].
When data are overdispersed, the common method to account for this is to use the negative
binomial model [15]. Negative binomial regression is a type of generalized linear model
in which the dependent variable Y is a count of the number of times an event occurs.
Statistical comparisons between Poisson and negative binomial regression models
confirm that in most cases the negative binomial represents observed counts better than
the Poisson [15]. Hilbe [15] gave the parameterization of the negative binomial model as
$P(Y = y) = \frac{\Gamma(y + 1/\alpha)}{\Gamma(y + 1)\,\Gamma(1/\alpha)} \left(\frac{1}{1 + \alpha\mu}\right)^{1/\alpha} \left(\frac{\alpha\mu}{1 + \alpha\mu}\right)^{y}$ (10)
where $\mu > 0$ is the mean of $Y$ and $\alpha > 0$ is the heterogeneity parameter. Hilbe [15]
derives this parameterization as a Poisson-gamma mixture, or alternatively as the
number of failures before the $(1/\alpha)$th success, though we will not require $1/\alpha$ to be an
integer. The negative binomial model is estimated using the Newton-Raphson
method.
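The Poisson-gamma mixture interpretation can be checked numerically. The sketch below assumes the mean-dispersion form of the negative binomial probability function (mean mu, heterogeneity parameter alpha) and a gamma mixing density with mean mu and shape 1/alpha; it integrates the Poisson probability against that density with a simple midpoint rule. The parameter values are hypothetical.

```python
import math

def nb_pmf(y, mu, alpha):
    """Negative binomial pmf with mean mu and heterogeneity alpha:
    Gamma(y+1/a) / (Gamma(y+1) Gamma(1/a)) * (1/(1+a*mu))^(1/a) * (a*mu/(1+a*mu))^y
    """
    inv = 1.0 / alpha
    log_p = (math.lgamma(y + inv) - math.lgamma(y + 1) - math.lgamma(inv)
             + inv * math.log(1.0 / (1.0 + alpha * mu))
             + y * math.log(alpha * mu / (1.0 + alpha * mu)))
    return math.exp(log_p)

def poisson_gamma_mixture(y, mu, alpha, upper=100.0, steps=100_000):
    """Midpoint-rule approximation of the integral of Poisson(y | lam) * g(lam) over lam,
    where g is a gamma density with shape nu = 1/alpha and mean mu (an assumption)."""
    nu = 1.0 / alpha
    h = upper / steps
    total = 0.0
    for j in range(steps):
        lam = (j + 0.5) * h
        log_pois = -lam + y * math.log(lam) - math.lgamma(y + 1)
        log_gamma = (nu * math.log(nu / mu) + (nu - 1.0) * math.log(lam)
                     - nu * lam / mu - math.lgamma(nu))
        total += math.exp(log_pois + log_gamma) * h
    return total

# Hypothetical parameter values, for illustration only
mu, alpha, y = 3.0, 0.5, 2
closed_form = nb_pmf(y, mu, alpha)
mixture = poisson_gamma_mixture(y, mu, alpha)
print(closed_form, mixture)
```

The two numbers agree to several decimal places, which is the numerical counterpart of the statement that mixing Poisson means over a gamma distribution yields the negative binomial.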
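The likelihood function of the negative binomial model is
$L(\beta, \alpha) = \prod_{i=1}^{n} \frac{\Gamma(y_i + 1/\alpha)}{\Gamma(y_i + 1)\,\Gamma(1/\alpha)} \left(\frac{1}{1 + \alpha\mu_i}\right)^{1/\alpha} \left(\frac{\alpha\mu_i}{1 + \alpha\mu_i}\right)^{y_i}$ (11)
From equation (11) the log-likelihood can then be formed:
$\ell(\beta, \alpha) = \ln L(\beta, \alpha)$ (12)
where $\mu_i = \exp\{x_i'\beta\}$.
If equation (11) is substituted into (12), then the ln-likelihood becomes
$\ell(\beta, \alpha) = \sum_{i=1}^{n} \ln\Gamma(y_i + 1/\alpha) - \sum_{i=1}^{n} \ln\Gamma(1/\alpha) - \sum_{i=1}^{n} \ln(y_i!) - \frac{1}{\alpha}\sum_{i=1}^{n} \ln(1 + \alpha\mu_i) + \sum_{i=1}^{n} y_i \ln(\alpha\mu_i) - \sum_{i=1}^{n} y_i \ln(1 + \alpha\mu_i)$ (13)
To maximize the function in the equation above, the first derivative with respect to each
regression coefficient is set to zero:
$\frac{\partial \ell}{\partial \beta_j} = \sum_{i=1}^{n} \frac{(y_i - \mu_i)\, x_{ij}}{1 + \alpha\mu_i} = 0, \quad j = 0, 1, \ldots, k, \quad x_{i0} = 1$ (14)
The next step is to calculate the second partial derivatives of the log-likelihood function
in order to form the Hessian matrix. The second partial derivatives of the log-likelihood
function with respect to the regression coefficients β are as follows:
$\frac{\partial^2 \ell}{\partial \beta_0^2} = -\sum_{i=1}^{n} \left\{ \frac{\mu_i (1 + \alpha y_i)}{(1 + \alpha\mu_i)^2} \right\}$
$\frac{\partial^2 \ell}{\partial \beta_0 \partial \beta_j} = -\sum_{i=1}^{n} \left\{ \frac{\mu_i (1 + \alpha y_i)}{(1 + \alpha\mu_i)^2}\, x_{ij} \right\}$
$\frac{\partial^2 \ell}{\partial \beta_j^2} = -\sum_{i=1}^{n} \left\{ \frac{\mu_i (1 + \alpha y_i)}{(1 + \alpha\mu_i)^2}\, x_{ij}^2 \right\}$
$\frac{\partial^2 \ell}{\partial \beta_j \partial \beta_l} = -\sum_{i=1}^{n} \left\{ \frac{\mu_i (1 + \alpha y_i)}{(1 + \alpha\mu_i)^2}\, x_{ij} x_{il} \right\}$
for $j, l = 1, \ldots, k$.
Based on the results of the second partial derivatives above, the Hessian matrix is
obtained as follows:
$H(\beta) = \begin{pmatrix} \frac{\partial^2 \ell}{\partial \beta_0^2} & \frac{\partial^2 \ell}{\partial \beta_0 \partial \beta_1} & \cdots & \frac{\partial^2 \ell}{\partial \beta_0 \partial \beta_k} \\ \frac{\partial^2 \ell}{\partial \beta_1 \partial \beta_0} & \frac{\partial^2 \ell}{\partial \beta_1^2} & \cdots & \frac{\partial^2 \ell}{\partial \beta_1 \partial \beta_k} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2 \ell}{\partial \beta_k \partial \beta_0} & \frac{\partial^2 \ell}{\partial \beta_k \partial \beta_1} & \cdots & \frac{\partial^2 \ell}{\partial \beta_k^2} \end{pmatrix}$ (15)
which is a matrix of size $(k+1) \times (k+1)$.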
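The Hessian matrix is then used in the iterative Newton-Raphson procedure to solve the
log-likelihood equations; the values at which the iterations converge are used as
estimates of the parameters. The Newton-Raphson
algorithm for the negative binomial model proceeds as follows:
1. Determine the initial parameter estimates $\hat{\beta}^{(0)}$ for iteration $m = 0$.
2. Form the gradient vector of first derivatives of the log-likelihood $\ell$,
$g(\hat{\beta}^{(m)}) = \left( \frac{\partial \ell}{\partial \beta_0}, \frac{\partial \ell}{\partial \beta_1}, \ldots, \frac{\partial \ell}{\partial \beta_k} \right)'$
3. Form the Hessian matrix $H(\hat{\beta}^{(m)})$.
4. Substitute the value $\hat{\beta}^{(0)}$ into the elements of the vector $g$ and the Hessian
matrix $H$ to obtain the vector $g(\hat{\beta}^{(0)})$ and the Hessian matrix $H(\hat{\beta}^{(0)})$.
5. Perform iterations starting from $m = 0$ using the equation
$\hat{\beta}^{(m+1)} = \hat{\beta}^{(m)} - H^{-1}(\hat{\beta}^{(m)})\, g(\hat{\beta}^{(m)})$
6. Update the iteration from $m$ to $m + 1$ until the parameter estimates converge, that is, until
$\| \hat{\beta}^{(m+1)} - \hat{\beta}^{(m)} \| \le \varepsilon$ for a small tolerance $\varepsilon > 0$.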
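The steps above can be sketched in Python for a small two-parameter model (intercept plus one covariate). This is an illustrative sketch, not the authors' implementation: it assumes the standard negative binomial score and Hessian with respect to β, holds the heterogeneity parameter `alpha` fixed rather than estimating it, and uses hypothetical toy data.

```python
import math

def score_and_hessian(beta, X, y, alpha):
    """Gradient g and Hessian H of the NB log-likelihood w.r.t. beta (alpha held fixed)."""
    p = len(beta)
    g = [0.0] * p
    H = [[0.0] * p for _ in range(p)]
    for x_i, y_i in zip(X, y):
        mu = math.exp(sum(b * v for b, v in zip(beta, x_i)))
        denom = 1.0 + alpha * mu
        resid = (y_i - mu) / denom                       # score contribution per unit x
        weight = mu * (1.0 + alpha * y_i) / denom ** 2   # curvature contribution
        for j in range(p):
            g[j] += x_i[j] * resid
            for l in range(p):
                H[j][l] -= x_i[j] * x_i[l] * weight
    return g, H

def solve_2x2(H, g):
    """Solve H s = g for a 2x2 system by Cramer's rule."""
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return [(H[1][1] * g[0] - H[0][1] * g[1]) / det,
            (H[0][0] * g[1] - H[1][0] * g[0]) / det]

def newton_raphson(X, y, alpha, beta_start, tol=1e-10, max_iter=50):
    """Iterate beta <- beta - H^{-1} g until the largest change is below tol."""
    beta = list(beta_start)
    for _ in range(max_iter):
        g, H = score_and_hessian(beta, X, y, alpha)
        step = solve_2x2(H, g)
        new_beta = [b - s for b, s in zip(beta, step)]
        if max(abs(n - b) for n, b in zip(new_beta, beta)) < tol:
            return new_beta
        beta = new_beta
    return beta

# Hypothetical toy data: intercept plus one binary covariate
X = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
y = [2, 3, 4, 6, 9, 12]
alpha = 0.5  # assumed known here

start = [math.log(sum(y) / len(y)), 0.0]  # start from the overall mean
beta_hat = newton_raphson(X, y, alpha, start)
print(beta_hat)
```

With a single binary covariate the fitted means equal the two group means (3 and 9 for this toy data), which gives a simple check on the iteration. In practice the heterogeneity parameter would be estimated jointly with β, which enlarges the gradient and Hessian accordingly.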
3. Analysis And Results
A. Descriptive Statistics
From 2001 to 2006, a total of 28,890 people were killed in road traffic accidents in
South Africa, an average of 4,815 people per year.
Figure 1 shows the age distribution of people killed in road traffic
accidents in South Africa from 2001 to 2006. The figure makes clear that
young and middle-aged people are the most prone to road traffic accident deaths, and
that males are the major victims. The highest death rate
for 2001-2006 is in the 35-49 male age
group, with 28.62 deaths per 100,000 population, followed closely
by the 25-34 male age group, with 25.69 deaths per 100,000 population. The lowest
male death rate is 4.42 deaths per 100,000
population, in the 0-14 age group.
Whilst male death rates peak in the 35–49 age group (as do death
rates for both sexes combined), female death rates show a roughly linear increase from age group
0–14 to age group 65 years and above. Thus among females, the elderly experienced the
highest death rates due to road traffic accidents. It should be noted that most people in this age
group are pensioners and retirees, who do not travel regularly. From Figure
1, it can be seen that road traffic accident deaths increase rapidly from infancy up to
ages 35-49 for males and then decrease. Thus the risk
of dying in a road traffic accident in South Africa peaks at ages 35-49.
Figure 1 Age Distribution of people killed in road traffic accidents
B. Data Analysis
The deviance of the final Poisson model was 1375.22 on 64 degrees of
freedom, and the scaled deviance divided by its degrees of freedom was 21.49, far
greater than 1, indicating over-dispersion. The Negative
Binomial distribution was therefore used to fit the model. The Negative Binomial models
fit well in all cases. For the best model, with all the variables included, the deviance of
the Negative Binomial model was 71.96 on 64 degrees of freedom, and
the scaled deviance and Pearson statistics divided by their degrees of freedom were
close to 1 (1.12 for the scaled deviance), indicating a good fit. The fit improves as
explanatory variables are added. Age, population and gender were all highly significant (p-value <0.05).
However, the individual age groups 25-34, 35-49 and 50-64 were not significant, since
their p-values were >0.05.
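The over-dispersion check amounts to dividing a goodness-of-fit statistic by its degrees of freedom. A short Python sketch, using the deviance and Pearson chi-square values reported in Tables 1 and 3 (the function name is illustrative):

```python
def dispersion_ratio(statistic, df):
    """Goodness-of-fit statistic divided by its degrees of freedom.
    Values far above 1 suggest over-dispersion; values near 1 suggest an adequate fit."""
    return statistic / df

# Values taken from Tables 1 and 3 (64 degrees of freedom in both models)
poisson_deviance_ratio = dispersion_ratio(1375.2179, 64)
poisson_pearson_ratio = dispersion_ratio(1467.7835, 64)
negbin_deviance_ratio = dispersion_ratio(71.9595, 64)
negbin_pearson_ratio = dispersion_ratio(70.1381, 64)

print(round(poisson_deviance_ratio, 4), round(negbin_deviance_ratio, 4))
```

The Poisson ratios are above 21, while the Negative Binomial ratios are about 1.1, which is the basis for preferring the Negative Binomial model.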
Likelihood ratio statistics for the Type I and Type III analyses were computed.
Table 2 shows the Type I analysis, which tests each explanatory variable sequentially, under the
assumption that the previous explanatory variables are included in the model. With the
entry of population into the model, the deviance increases by 146.2, from 301685.765 to
301831.96.
[Figure 1 data: Age Distribution of People Killed by Road Traffic Accidents; vertical axis: deaths per 100,000 population]
Age group 0-14 15-24 25-34 35-49 50-64 65>
Female 3.35 4.78 5.75 7.77 7.37 10.05
Male 4.42 14.11 25.69 28.62 21.92 19.6
Table 1 Poisson regression model Information
Distribution Poisson
Link Function Log
Dependent Variable deaths
Offset Variable l_popn
Criteria For Assessing Goodness Of Fit
Criterion DF Value Value/DF
Deviance 64 1375.2179 21.4878
Scaled Deviance 64 1375.2179 21.4878
Pearson Chi-Square 64 1467.7835 22.9341
Scaled Pearson X2 64 1467.7835 22.9341
Log Likelihood 150368.9831
Full Log Likelihood -961.1206
AIC(smaller is better) 1938.2411
AICC (smaller is better) 1940.5268
BIC (smaller is better) 1956.4545
This is highly significant (p-value <0.05) as judged against the $\chi^2_1$ distribution. In the
presence of gender in the model, the inclusion of age brings the deviance up to 301825.00, an
increase of 139.24. This indicates a much improved fit, achieved at a cost of five degrees of
freedom, since there are five parameters associated with categorical age. This statistic has a
p-value <0.0001 on the $\chi^2_5$ distribution, indicating that age is highly significant.
Table 2. LR Statistics for Type I and Type III Analysis
Source     df  Deviance    Type I ∆  p-value  Type III ∆  p-value
Intercept  -   301685.765  -         -        -           -
Age        5   301825.00   68.53     <0.0001  73.89       <0.0001
Gender     1   301756.46   70.70     <0.0001  92.16       <0.0001
Popl       1   301831.96   6.96      0.0083   6.96        0.0083
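The p-values for the one-degree-of-freedom rows of Table 2 can be reproduced from the chi-square distribution. The sketch below uses the identity that a chi-square variable with 1 degree of freedom is a squared standard normal, so its upper tail is `erfc(sqrt(x/2))`; it covers only the 1-df case (gender and population), not the 5-df test for age.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# LR statistics from Table 2 (1 degree of freedom each)
p_population = chi2_sf_1df(6.96)
p_gender_type1 = chi2_sf_1df(70.70)

print(round(p_population, 4))
print(p_gender_type1 < 0.0001)
```

The population statistic of 6.96 gives a p-value of about 0.0083, matching the table.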
The Type III analysis tests each explanatory variable under the assumption that all other
variables are included in the model. Gender, in the presence of age, has a deviance reduction of
∆ = 92.16 with p-value <0.0001. Age, in the presence of gender, has ∆ = 73.89 with p-value
<0.0001 (as for the Type I analysis). The value for population is unchanged at 6.96.
The Akaike Information Criterion (AIC) was used to select the best model. Table 4 shows that each
explanatory variable added to a model improves the fit; the best model is the one with the
smallest AIC.
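The information criteria in Tables 1 and 3 can be reproduced from the reported full log-likelihoods using the usual definitions of AIC, AICC and BIC. The parameter counts and sample size below are assumptions inferred from the tables, not stated in the paper: 8 regression parameters (intercept, five age indicators, gender, population), one extra dispersion parameter for the Negative Binomial model, and n = 72 observations (64 residual degrees of freedom plus 8 regression parameters).

```python
import math

def aic(log_lik, k):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik

def aicc(log_lik, k, n):
    """Small-sample corrected AIC."""
    return aic(log_lik, k) + 2 * k * (k + 1) / (n - k - 1)

def bic(log_lik, k, n):
    """Bayesian Information Criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_lik

N_OBS = 72  # assumption: 64 residual df + 8 regression parameters

# Full log-likelihoods from Tables 1 and 3; parameter counts k are assumptions
poisson = {"log_lik": -961.1206, "k": 8}
negbin = {"log_lik": -414.1245, "k": 9}   # one extra dispersion parameter

for name, m in (("Poisson", poisson), ("Negative Binomial", negbin)):
    print(name,
          round(aic(m["log_lik"], m["k"]), 2),
          round(aicc(m["log_lik"], m["k"], N_OBS), 2),
          round(bic(m["log_lik"], m["k"], N_OBS), 2))
```

Under these assumptions the computed values agree with the tabulated AIC, AICC and BIC to within rounding, and all three criteria favour the Negative Binomial model.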
Table 3 Negative Binomial Model Information
Distribution Negative Binomial
Link Function Log
Dependent Variable deaths
Offset Variable l_popn
Criteria For Assessing Goodness Of Fit
Criterion DF Value Value/DF
Deviance 64 71.9595 1.1244
Scaled Deviance 64 71.9595 1.1244
Pearson Chi-Square 64 70.1381 1.0959
Scaled Pearson X2 64 70.1381 1.0959
Log Likelihood 150915.9792
Full Log Likelihood -414.1245
AIC(smaller is better) 846.2490
AICC (smaller is better) 849.1522
BIC (smaller is better) 866.7389
The fit improves as the number of explanatory variables increases. Since the scaled
deviance divided by its degrees of freedom is close to 1, there is no evidence of
over-dispersion; hence the Negative Binomial model was chosen as the best model.
Table 4. Comparison between Poisson and Negative Binomial models with their
respective AIC and deviance values
Poisson Regression Model
No. Explanatory Variables AIC Scaled Deviance Value/DF
1 Age 8833.86 8274.84 125.37
2 Gender 6428.59 5877.57 83.97
3 Population 13020.65 12469.62 178.14
4 Age & Gender 2026.19 1465.17 22.54
5 Age, Gender & Population 1938.24 1375.22 21.49
Negative Binomial Regression Model
No. Explanatory Variables AIC Scaled Deviance Value/DF
1 Age 952.82 74.61 1.13
2 Gender 909.74 73.29 1.05
3 Population 980.37 76.36 1.09
4 Age & Gender 851.21 72.64 1.12
5 Age, Gender & Population 846.25 71.96 1.12
By choosing the smallest AIC, model number 5 is the best since it had an AIC value of
846.25. The fitted model was
where represent the age groups 0-14,15-24, 25-34, 35-49, 50-64
respectively, represents the female gender and represents population.
4. Conclusion
This study has shown that for over-dispersed data, the Negative Binomial model
is better than the Poisson regression model. The Poisson distribution has the
special property that its mean is equal to its variance, and over-dispersion means that
the variance is greater than the mean. The Negative Binomial regression model is more
flexible, as it allows the variance to be greater than the mean. The results also revealed
that most of the people who die in road accidents in South Africa are male.
Females had an expected death rate 65.4% lower than that of males at all ages. In
comparison with the age group 65>, the 0-14 age group had a decreased death rate of
89.6% for both genders, the 15-24 age group had a decreased death rate of 73.3% for
both genders, the 25-34 age group had a decreased death rate estimated at 54.9% for
both genders, the 35-49 age group had a decreased death rate estimated at 44.3% for
both genders and the 50-64 age group had a decreased death rate estimated at 38.8%. It
was also found that the death rate from road traffic accidents increased with the size
of the population; thus the larger the
population, the greater the number of deaths. It can also be noted that accident deaths
increase as the years go by, and thus more care and policies should be provided to
reduce road traffic accident deaths in South Africa.
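The percentages quoted above follow from the log link: a fitted coefficient β corresponds to a multiplicative rate ratio exp(β), and the percentage change in the mean death rate is (exp(β) - 1) × 100. A minimal Python sketch using the rate ratio 0.557 reported for the 35-49 age group (the function names are illustrative):

```python
import math

def percent_change(rate_ratio):
    """Percentage change in the mean rate implied by a multiplicative rate ratio."""
    return (rate_ratio - 1.0) * 100.0

def coefficient(rate_ratio):
    """Log-scale regression coefficient corresponding to a rate ratio."""
    return math.log(rate_ratio)

rr_35_49 = 0.557  # rate ratio reported for ages 35-49 versus 65 and over
print(round(percent_change(rr_35_49), 1))   # negative value means a decrease
print(round(coefficient(rr_35_49), 3))      # implied coefficient on the log scale
```

A rate ratio of 0.557 therefore corresponds to a decrease of about 44.3% in the mean death rate, and in the same way a 65.4% decrease corresponds to a rate ratio of about 0.346.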
REFERENCES
[1] Nelder, J.A. and Wedderburn, R.W.M. (1972). “Generalized linear models”. Journal
of the Royal Statistical Society, Series A, 135(3), 370-384.
[2] Radin Umar, R.S., M., Norghani, H., Hussain, B., Shahrom, and M.M, Hamdan,
1998. Research Report 1, National Road Safety Council Malaysia, Kuala Lumpur.
[3] Russo, S. Flender, D. and da Silva, G.F. (2012). “Poisson Regression Models for
Count Data: Use in the Number of Deaths in the Santo Angelo (Brazil).” Journal of
Basic & Applied Sciences, 2012, 8, 266-269.
[4] Cheika J., Naushad M.K. and Maleika H.M.K. (2009). “Analyzing the factors
influencing exclusive breastfeeding using the Generalized Poisson Regression
model”. World Academy of Science, Engineering and Technology Vol:3 2009-11-
29.
[5] Odero, W., Garner, P. and Zwi, A. (1997). “Road traffic injuries in the developing
countries: a comprehensive review of epidemiological studies”. Journal of Tropical
Medicine and International Health. 2(5), 445-460.
[6] Balogun, J.A., Abereoje, O.K. (1992). “Pattern of road traffic accident cases in a
Nigeria University teaching hospital between 1987 and 1990.” J.Trop Med Hyg;
95(1):239.
[7] Leon, S.R. (1996). “Reducing death on the road: the effects of minimum safety
standards, publicized crash tests, seat belts, and alcohol”. Am J Public Health;
86(1):31-3.
[8] Asogwa, S.E. (1992). “Road traffic accidents in Nigeria: A review and a
reappraisal”. Accident Analysis and Prevention: 23 (5), 343-35.
[9] Peden, M. (Ed), (2004), “World Report on Road Traffic Injury Prevention”. World
Health Organization, Geneva.
[10] Statistics South Africa. 2008. “Mortality and cause of death in South Africa, 2006:
Findings from death notification”. Statistics South Africa.
[11] Medical Research Council and UNISA. 2007. “A profile of fatal injuries in South
Africa 7th Annual Report of the National Injury Mortality Surveillance System
2005”. MRC/UNISA Crime, Violence and Injury Lead Programme, July 2007.
[12] Norman, R. Matzopoulos, R. Groenwald, P. and Bradshaw, D. (2007). “The high
burden of injuries in South Africa.” Bulletin of the World Health Organization.
September 2007, 85 (9). WHO. Geneva.
[13] Cameron, A.C. and Trivedi, P.K. (1998). “Regression Analysis of Count Data”.
Cambridge University Press, Cambridge, U.K.
[14] Jong, P. and Heller, G. Z. (2008). “Generalized Linear Models for Insurance
Data.” The International Series on Actuarial Science, Cambridge University Press
ISBN-13 978-0-511-38877-4.
[15] Hilbe, Joseph M. (2011). “Negative binomial regression” (2nd edition). New York:
Cambridge University Press.