GJESM MANUSCRIPT TEMPLATE
o ORIGINAL RESEARCH PAPER
o CASE STUDY
o SHORT COMMUNICATION
Forecast generation model of municipal solid waste using
multiple linear regression
ABSTRACT: The objective of this study was to develop a forecast model to determine the
generation rate of municipal solid waste in the municipalities of the Cuenca del Cañón del
Sumidero, Chiapas, Mexico. Multiple linear regression was used with social and demographic
explanatory variables. The compiled database consisted of 9 variables with 118 data points per
variable, which were analyzed using a multicollinearity test to select the most important ones.
Initially, different regression models were generated, but only 2 of them were considered
useful, because they used few predictors that were statistically significant. The most
important variables to predict the rate of waste generation in the study area were the
population of each municipality, migration and population density. Although other variables,
such as daily per capita income and average schooling, are very important, they do not seem to
have an effect on the response variable in this study. The model with the highest parsimony
resulted in an adjusted coefficient of determination of 0.975, a mean absolute percentage error
of 7.70, a mean absolute deviation of 0.16 and a root mean square error of 0.19, showing a high
influence on the phenomenon studied and a good predictive capacity.
KEYWORDS: Explanatory variables; Forecast model; Multiple linear regression; Statistical
analysis; Waste generation.
NUMBER OF REFERENCES
34
NUMBER OF FIGURES
5
NUMBER OF TABLES
6
RUNNING TITLE: Forecast generation model of municipal solid waste
INTRODUCTION
Because of its high management cost, the amount of Municipal Solid Waste (MSW) generated in
population settlements is a significant factor for the provision of public services. According to
Intharathirat et al. (2015), Keser et al. (2012) and Khan et al. (2016), the amount of MSW and its
composition vary depending on social, environmental and demographic factors. Several researchers
have developed models to predict the amount of MSW generated (Mahmood et al., 2018;
Kannangara et al., 2018; Pan et al., 2019; Soni et al., 2019), while others analyze the variables that
influence its generation and composition (Chhay et al., 2018; Grazhdani, 2016; Liu and Wu, 2010;
Liu et al., 2019; Rybová et al., 2018). Unfortunately, due to the social, economic and geographical
heterogeneity of the different regions of the world, it is difficult to make inferences or projections
with the proposed models, and therefore the models and their variables have to be adapted to the
conditions of other regions, sometimes with little success. Kumar and Samandder (2017) and Shan
(2010) report that some of the difficulties in adapting these models are related to limited
or inaccessible information (databases) in other countries. In addition, some variables are
theoretically valid, but difficult to measure. In other cases, the variables used do not provide
information leading to the explanation of the phenomenon, but have to be used because the model
incorporates them.
In Mexico, this topic has also been addressed, particularly in the center and north
of the country (Buenrostro et al., 2001; Márquez et al., 2008; Ojeda et al., 2008; Rodríguez, 2004).
However, it is evident that the models proposed are not applicable to the entire national context.
According to the OECD (2015), there are notable differences between the central, northern and
especially southern regions of Mexico; these include disparities in income, education, access to
services, dispersion of localities and other factors, which cause consumption patterns, and
therefore the amount of MSW, to vary greatly.
This study presents a model to forecast the generation
rate of MSW in the municipalities of the Cuenca del Cañón del Sumidero (CCS), Chiapas State,
Mexico. The model considers the information of the most relevant and easily accessible social and
demographic variables for the study area, which correspond to statistical data for the years
2010-2015. This model will allow the decision makers of the municipalities of the CCS to determine
the quantities of MSW generated, operate the waste management systems properly, and even acquire
infrastructure. This study was carried out in municipalities of the Cuenca del Cañón del
Sumidero, Chiapas State, Mexico, during 2010-2015.
MATERIALS AND METHODS
Description of the study area and context
The CCS is located in the State of Chiapas, in the southeast of Mexico, between the coordinates 15°
56' 55" and 16° 57' 26" north latitude, and 92° 30' 44" and 93° 44' 35" west longitude (Fig. 1). The
CCS has 24 municipalities and 2,847 localities; 2,816 localities are rural while 31 are urban. 83% of
the population of the study area lives in urban areas (INEGI, 2010). The degree of dispersion is high,
especially in the rural localities farthest from the municipal seat.
Fig. 1: Geographic location of the study area in Cuenca del Cañón de Sumidero in Mexico
Development of the model
This study uses a multiple linear regression (MLR) model to obtain the generation rates of MSW.
Because of their versatility and well-founded theory, MLR models have been widely used in various
scientific fields. Their main disadvantage is the preparation of the database (Pires et al., 2008). The
hypothesis for using MLR in this study is based on the effect of the explanatory variables (social and
demographic variables) on the response variable (generation rate of MSW). The linear function is
shown in Eq. 1.
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \varepsilon \tag{1}$$
Where Y is the response variable, Xi (i = 1, 2, 3 ... k) are the explanatory variables, β0 is the
intercept, βi (i = 1, 2, 3 ... k) are the regression coefficients and ε is the residual error.
According to Agirre et al. (2006), the MLR is based on two assumptions: i) the explanatory variables
must be independent, i.e., free of multicollinearity and ii) the dependent variable must be normally
distributed, with zero mean and constant variance. In order to determine the regression
coefficients, the least squares method was used, which is based on minimizing the sum of squared
errors (SSE) given in Eq. 2.
$$SSE = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \tag{2}$$
Where Yi is the value of each observation and Ŷi is the predicted value. Theoretically, low SSE values
reflect a better fit of the regression model (Kumar and Samandder, 2017). In order to determine the
best regression model (the most parsimonious), the statistical significance of the explanatory
variables and of the overall model was analyzed. The explanatory variables were analyzed with the
t-test, while the goodness of fit and usefulness of the proposed model were assessed with the F-test
and the value of R²adj, using Eqs. 3 and 4, respectively.
$$F = \frac{(SS_{YY} - SSE)/k}{SSE/[n-(k+1)]} \tag{3}$$
$$R^2_{adj} = 1 - \left[\frac{n-1}{n-(k+1)}\right](1 - R^2) \tag{4}$$
Where SSYY = ∑(Yi − Ȳ)² represents the sum of the squares of the differences between the observed
data (Yi) and the average of the data (Ȳ); k is the number of explanatory variables included in the
model; n is the sample size; and R² is the coefficient of determination. The value of R² was not used
to measure the explanatory power of the regression model, because its value increases when more
explanatory variables are added, and it can be a deceptive measure (Chang et al., 2007).
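As a check on Eqs. 3 and 4, the following minimal Python sketch evaluates both expressions from the sums of squares. The function names are illustrative and are not part of the original analysis, which was run in Minitab. The inputs are the values reported for model 2 in Table 3 (SSYY = 154.297, SSE = 3.748, k = 2); n = 111 is an assumption inferred from the total degrees of freedom (110 = n − 1).

```python
def f_statistic(ss_yy, sse, n, k):
    """Eq. 3: global F statistic of a regression with k explanatory variables."""
    return ((ss_yy - sse) / k) / (sse / (n - (k + 1)))


def adjusted_r2(ss_yy, sse, n, k):
    """Eq. 4: adjusted coefficient of determination."""
    r2 = 1.0 - sse / ss_yy  # ordinary R^2 from the sums of squares
    return 1.0 - ((n - 1) / (n - (k + 1))) * (1.0 - r2)


# Values taken from Table 3 (model 2); n = 111 is inferred from df_total = 110.
SS_YY, SSE, N, K = 154.297, 3.748, 111, 2

f_value = f_statistic(SS_YY, SSE, N, K)
r2_adj = adjusted_r2(SS_YY, SSE, N, K)
print(f"F = {f_value:.1f}, adjusted R2 = {r2_adj:.3f}")
```

With these rounded sums of squares the sketch gives an F value close to the 2,169.16 listed in Table 3 and an adjusted R² that rounds to 0.975; the small difference in F comes from rounding in the published table.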
Data collection
According to Beigl et al. (2008) and Kolekar et al. (2016), the methods of data collection depend on
the scale of the study. In investigations carried out at household or locality levels, information
is usually acquired through surveys or interviews, while at district or country scales the
information comes from databases registered by government agencies. This study was made
at district scale, and therefore the study area includes several municipalities. MSW generation was
obtained from SEMANH (2013) and the studies by Alvarado et al. (2009) and Araiza et al. (2015). The
social and demographic information (explanatory variables) was obtained from CONAPO (2017) and
INEGI (2010). The compiled information allowed the elaboration of a database of 9 variables, with
118 data points per variable, coming from all the municipalities of the state of Chiapas (Table 1).
The inferences of the proposed model were made on the municipalities of the CCS. This database
was analyzed with the MINITAB software version 16.
Exploratory analysis of variables
An exploratory analysis of the 9 variables was carried out to check the normality of the data.
The test used was Kolmogorov-Smirnov, with a significance level of α = 0.05. This test showed
that the variables YGen, XPop, XPd, XPbam, XHgs, XCes and XDpi did not follow a normal distribution,
because their p-value was smaller than the α value considered. In order to adjust their values, these
variables were transformed with natural logarithms. The variables XAs and XMi were not transformed
because their data followed a normal distribution (Table 2).
Table 1: Description of variables
| No. | Name of the variable | Symbol | Type of variable | Measure |
|---|---|---|---|---|
| 1 | MSW generation | YGen | Dependent | Tons/day |
| 2 | Population | XPop | Independent | Inhabitants |
| 3 | Population density | XPd | Independent | Inhabitants/km2 |
| 4 | Population born in another municipality | XPbam | Independent | Inhabitants |
| 5 | Average schooling | XAs | Independent | Years of study |
| 6 | Household with goods and services | XHgs | Independent | Percent (%) |
| 7 | Commercial establishments and services | XCes | Independent | Number of establishments |
| 8 | Daily per capita income | XDpi | Independent | Mexican pesos/day |
| 9 | Marginalization index | XMi | Independent | Percent (%) |
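The normality check and the effect of the logarithmic transformation can be illustrated with a short Python sketch. It computes only the Kolmogorov-Smirnov distance D against a normal distribution fitted to the sample; it does not reproduce the p-values that Minitab reports in Table 2. The data are hypothetical (a synthetic right-skewed sample of 118 values), and the function name `ks_statistic` is an illustrative choice.

```python
import math
from statistics import NormalDist, mean, stdev


def ks_statistic(values):
    """Kolmogorov-Smirnov distance between the sample and a normal
    distribution whose mean and standard deviation are estimated from it."""
    data = sorted(values)
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))
    d = 0.0
    for i, x in enumerate(data):
        cdf = fitted.cdf(x)
        # Compare the fitted CDF with the empirical step function on both sides.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


# Hypothetical right-skewed sample (not the study data): exponentials of
# evenly spaced standard-normal quantiles, so its logarithm is near-normal.
n = 118
quantiles = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
raw = [math.exp(q) for q in quantiles]
logged = [math.log(v) for v in raw]

d_raw = ks_statistic(raw)
d_log = ks_statistic(logged)
print(f"D raw = {d_raw:.3f}, D after ln = {d_log:.3f}")
```

The distance drops sharply after the transformation, which is the same qualitative pattern as the statistics listed for the original and transformed variables in Table 2.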
Multicollinearity analysis and variable screening
An analysis of the explanatory variables was made prior to the selection of the best MLR model.
Through a multicollinearity test, some of the variables initially considered were eliminated.
Specifically, the variance inflation factor (VIF) and the Pearson correlation coefficient (r) were
used. Similar to Keser et al. (2012), the r coefficient was used to detect bivariate association,
while the VIF was used to detect multivariate correlation. Eqs. 5 and 6 describe the tests used.
$$VIF_k = \frac{1}{1 - R_k^2} \tag{5}$$
$$r = \sqrt{1 - \frac{SSE}{SS_{YY}}} \tag{6}$$
The VIF value is calculated using the R² of an auxiliary regression equation in which the explanatory
variable denoted by k is treated as the dependent variable, while the other explanatory variables are
used as independent variables; thus, VIF is calculated for each explanatory variable k.
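A minimal Python sketch of the two screening measures follows. `vif_from_r2` is Eq. 5 as written; `pearson_r` uses the usual covariance form of the Pearson coefficient, which keeps the sign of the association (Eq. 6 gives only its magnitude). Both function names are illustrative. The special case shown assumes a model with only two predictors, where the auxiliary R² equals the squared correlation between them; the value r = 0.573 is the ln-XPop / ln-XPbam correlation reported in Table 4.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_from_r2(r2_k):
    """Eq. 5: variance inflation factor from the R^2 of predictor k
    regressed on the remaining predictors."""
    return 1.0 / (1.0 - r2_k)


# With only two predictors, R_k^2 is the squared correlation between them.
# r = 0.573 is the ln-XPop / ln-XPbam correlation reported in Table 4.
vif_two_predictors = vif_from_r2(0.573 ** 2)
print(f"VIF = {vif_two_predictors:.3f}")

# Hypothetical toy data to exercise pearson_r (perfectly linear, so r = 1).
r_toy = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
```

Under that two-predictor assumption the VIF is about 1.49, well below the cut-off of 4 used in this study.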
The cut-off value of VIF used in this study was 4. According to Ghinea et al. (2016), when VIF = 1, the
explanatory variables are not correlated; when 1 < VIF < 5, the explanatory variables are slightly
correlated; and when VIF > 5 or 10, the explanatory variables are highly correlated. The value of r
indicates the relationship between two variables (positive or negative); its value ranges between -1
and 1. There is no clearly defined cut-off in the literature. Arriaza (2006) indicates that with values
of r greater than 0.3 there may be signs of correlation, and with values greater than 0.8 there are
serious problems of multicollinearity. As in Grazhdani (2016), in this study it was considered that a
value of |r| ≥ 0.6 (positive or negative) indicates correlation between the explanatory variables. The
elimination of explanatory variables was performed in an iterative procedure, i.e., the VIF values
were initially determined for the 8 variables; subsequently, the variable with the highest VIF was
eliminated and the next iteration with 7 variables was performed. This elimination procedure ended
when all remaining VIF values were below the cut-off value of 4. Finally, other eliminations were made
based on the values of r.
Subsequently, 3 explanatory variables were used in the search for a better model (of
greater parsimony). The first variable selected was XPop, i.e., the "total population" of each
municipality, under the hypothesis that the larger the population, the greater the consumption and
thus the greater the amount of MSW generated. The second explanatory variable used was XPd,
"population density", under the premise that dispersion patterns or agglomeration of inhabitants
per unit area influence MSW generation. The third variable used was XPbam, "population born in
another municipality", which can be seen as migration, i.e., people who move to other places to
seek better living conditions. This movement of people changes the consumption
patterns of the new place of settlement. Other models that do not follow the principle of parsimony
were also created (more than 3 explanatory variables), but they should not be used to forecast
waste generation rates, since they have very low accuracy values and some of their explanatory
variables are not significant.
Table 2: Normality test and transformation of variables
| Original variable | Kolmogorov-Smirnov statistic | p-value | Transformed variable | Kolmogorov-Smirnov statistic | p-value |
|---|---|---|---|---|---|
| YGen | 0.338 | <0.010 | ln-YGen | 0.050 | >0.150 |
| XPop | 0.281 | <0.010 | ln-XPop | 0.066 | >0.150 |
| XPd | 0.271 | <0.010 | ln-XPd | 0.039 | >0.150 |
| XPbam | 0.387 | <0.010 | ln-XPbam | 0.065 | >0.150 |
| XAs | 0.053 | >0.150 | XAs | --- | --- |
| XHgs | 0.183 | <0.010 | ln-XHgs | 0.054 | >0.150 |
| XCes | 0.349 | <0.010 | ln-XCes | 0.056 | >0.150 |
| XDpi | 0.117 | <0.010 | ln-XDpi | 0.056 | >0.150 |
| XMi | 0.054 | >0.150 | XMi | --- | --- |
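The iterative VIF screening described in this section can be sketched in Python as follows. This is not the Minitab procedure used in the study: it relies on the standard identity that the VIF of each predictor is the corresponding diagonal element of the inverse of the predictors' correlation matrix, and the correlation matrix, predictor names and function names are all hypothetical.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vifs(corr):
    """VIF of each predictor = diagonal of the inverse correlation matrix."""
    inv = invert(corr)
    return [inv[i][i] for i in range(len(corr))]


def screen_by_vif(names, corr, cutoff=4.0):
    """Drop the predictor with the highest VIF until all VIFs are <= cutoff."""
    names = list(names)
    corr = [list(row) for row in corr]
    while len(names) > 1:
        values = vifs(corr)
        worst = max(range(len(names)), key=lambda i: values[i])
        if values[worst] <= cutoff:
            break
        del names[worst]
        # Remove the row and column of the eliminated predictor.
        corr = [[v for j, v in enumerate(row) if j != worst]
                for i, row in enumerate(corr) if i != worst]
    return names


# Hypothetical correlation matrix for three predictors a, b, c (not study data).
corr = [[1.00, 0.95, 0.30],
        [0.95, 1.00, 0.10],
        [0.30, 0.10, 1.00]]
kept = screen_by_vif(["a", "b", "c"], corr)
print(kept)
```

For this hypothetical matrix the first iteration removes `a`, the predictor with the highest VIF, after which the remaining two predictors fall below the cut-off of 4 and the loop stops.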
Accuracy of the model and validation
In order to determine the accuracy of the best model found, 3 widely used measures were
employed: the Mean Absolute Percentage Error (MAPE), the Mean Absolute Deviation (MAD) and
the Root Mean Square Error (RMSE) (Eqs. 7, 8 and 9, respectively). A value of these measures close
to zero indicates a high precision of the model (Azadi and Karimí, 2016).
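$$MAPE = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{A_t - F_t}{A_t}\right| \times 100 \tag{7}$$
$$MAD = \frac{1}{n}\sum_{t=1}^{n}\left|A_t - F_t\right| \tag{8}$$
$$RMSE = \sqrt{\frac{1}{n}\sum_{t=1}^{n}(A_t - F_t)^2} \tag{9}$$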
In these equations, At is the observed value, Ft is the predicted value and n is the sample size. MAPE
is expressed as a percentage error, MAD expresses the precision in the units of the data
analyzed, and RMSE indicates how concentrated the data are around the line of best fit. In order to
perform the external validation of the model, the technique called R²jackknife was used (Eq. 10). This
coefficient is calculated by systematically eliminating each observation from the data set,
re-estimating the regression equation and determining to what extent the model is able to predict the
observation that was removed.
$$R^2_{jackknife} = 1 - \frac{\sum (y_i - \hat{Y}_{(i)})^2}{\sum (y_i - \bar{Y})^2} \tag{10}$$
The R²jackknife coefficient varies between 0 and 100%, with larger values suggesting models with
greater predictive capacity; Ŷ(i) denotes the predicted value for the i-th observation obtained when
the regression model is fitted to the data with yi omitted (or removed) from the sample; and Ȳ is the
simple average of the observed data.
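The accuracy measures of Eqs. 7 to 9 and the leave-one-out logic of Eq. 10 can be sketched in Python as follows. For brevity the jackknife is shown for a regression with a single predictor fitted by closed-form least squares, which is a simplification of the multiple regression used in the study. All data values and function names are hypothetical.

```python
import math


def mape(actual, forecast):
    """Eq. 7: mean absolute percentage error (in %)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def mad(actual, forecast):
    """Eq. 8: mean absolute deviation, in the units of the data."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Eq. 9: root mean square error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


def r2_jackknife(x, y):
    """Eq. 10: refit without observation i, predict it, and compare the
    squared prediction errors with the spread around the overall mean."""
    y_bar = sum(y) / len(y)
    press = 0.0
    for i in range(len(x)):
        b0, b1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (b0 + b1 * x[i])) ** 2
    return 1.0 - press / sum((v - y_bar) ** 2 for v in y)


# Hypothetical observed/predicted values (not the study data).
observed = [10.0, 20.0, 40.0]
predicted = [12.0, 18.0, 40.0]
print(mape(observed, predicted), mad(observed, predicted), rmse(observed, predicted))

# Hypothetical single-predictor data for the leave-one-out check.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
print(r2_jackknife(x, y))
```

Because each observation is predicted by a model that never saw it, the jackknife coefficient is a more conservative indicator of forecasting capacity than the ordinary R² computed on the fitted data.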
Verification of model assumptions
The validity of the MLR models is subject to the behavior of the residual errors ε (difference
between observed and predicted values of the dependent variable), particularly their normal
distribution, their independence and homoscedasticity (Kumar and Samandder, 2017). The
verification of normality was carried out through the Kolmogorov-Smirnov test, with a
significance level of α = 0.05. In order to verify the independence of the residuals, the
Durbin-Watson test (d) was applied, looking for values close to 2, because d varies between 0 and 4
(Mendenhall and Sincich, 2012). The homoscedasticity assumption was evaluated with the plot of
residuals vs predicted values, both standardized, looking for a residual behavior that does not fit
any known pattern.
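The Durbin-Watson statistic is not written out in the text; the sketch below uses its standard definition, the sum of squared successive differences of the residuals divided by the sum of squared residuals. The residual series are hypothetical and serve only to show the two extremes of the 0 to 4 range mentioned above.

```python
def durbin_watson(residuals):
    """d = sum of squared successive differences / sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


# Hypothetical residual series (not the study residuals).
d_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])  # negative autocorrelation
d_constant = durbin_watson([1.0, 1.0, 1.0, 1.0])       # positive autocorrelation
d_mixed = durbin_watson([0.5, -0.2, -0.4, 0.3, 0.1, -0.3])
print(d_alternating, d_constant, d_mixed)
```

Values near 0 point to positive autocorrelation, values near 4 to negative autocorrelation, and values near 2 to uncorrelated residuals, which is the case sought in this study.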
RESULTS AND DISCUSSION
Statistical analysis of variables
The initial exploratory analysis was performed on the response variable YGen, which has the behavior
shown in Fig. 2. It is observed that some municipalities, which appear to be outliers, show a very
high rate of MSW generation.
Fig. 2: MSW generation rates in the state of Chiapas
These atypical values were not eliminated from the analysis because they are not errors, but rather
data that come from the most important municipalities of Chiapas, such as Tuxtla Gutiérrez,
Comitán, San Cristóbal de las Casas and Tapachula. These municipalities are regional heads;
therefore, their number of inhabitants, patterns of consumption and MSW generation differ
significantly from the rest of the studied area. The normality test of the response variable and of the
8 explanatory variables is shown in Fig. 3. The non-normality of the variables YGen, XPop, XPd, XPbam,
XHgs, XCes and XDpi can be seen. For this reason, these variables were transformed using natural
logarithms (Fig. 3a, 3b, 3c, 3d, 3f, 3g and 3h).
Fig. 3: Behavior of the variables analyzed with respect to normality
Forecast model
The coefficients of the MLR model were determined using the Minitab software. Only the
explanatory variables that fulfilled the multicollinearity criterion were used. Initially, 2
theoretically valid models were determined; the first one is shown in Eq. 11.
$$\ln Y_{Gen} = -8.91 + 1.10 \ln X_{Pop} + 0.0259 \ln X_{Pd} + 0.0688 \ln X_{Pbam} \tag{11}$$
This first model consists of 3 variables, XPop, XPd and XPbam (all transformed). The F-test associated
with the analysis of variance indicated that the model is statistically valid because the p-value <
0.05. This model can thus be used for forecast purposes. However, caution is needed because the
explanatory variable XPd is not statistically significant: the null hypothesis that its coefficient
is equal to zero (H0: βi = 0) cannot be rejected. Therefore, there is no evidence that this
explanatory variable is related to the dependent variable, i.e., its coefficient should not be
interpreted.
The second model, presented in Eq. 12, consists of 2 explanatory variables, XPop and XPbam.
As with the first model, the F-test and its p-value indicate that it is a statistically valid
model that can be used for forecasting purposes. This model is the one of greater
parsimony, because it uses only 2 variables.
$$\ln Y_{Gen} = -8.86 + 1.11 \ln X_{Pop} + 0.0658 \ln X_{Pbam} \tag{12}$$
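Because Eq. 12 is expressed on the logarithmic scale, a forecast in tons/day requires taking natural logarithms of the inputs and applying the exponential function to the result. The sketch below does this with the published coefficients; the function name and the example municipality (50,000 inhabitants, 5,000 of them born in another municipality) are hypothetical, and the plain exponential back-transformation is an assumption, since the text does not mention any retransformation correction.

```python
import math


def predict_msw_model2(population, born_elsewhere):
    """Back-transformed forecast of MSW generation (tons/day) from Eq. 12."""
    ln_y = -8.86 + 1.11 * math.log(population) + 0.0658 * math.log(born_elsewhere)
    return math.exp(ln_y)


# Hypothetical municipality: 50,000 inhabitants, 5,000 of them born elsewhere.
forecast = predict_msw_model2(50_000, 5_000)
print(f"{forecast:.1f} tons/day")
```

For these hypothetical inputs the forecast is roughly 41 tons/day, i.e., a little over 0.8 kg per inhabitant per day.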
All the information associated with the analysis of variance is presented in Table 3.
Table 3: Analysis of variance of the proposed models
| Model | Source | Degrees of freedom (df) | Sum of squares | Mean square | F | Sig. |
|---|---|---|---|---|---|---|
| 1 | Regression | 3 | 150.591 | 50.197 | 1,449.33 | 0.000 |
| | Residual | 107 | 3.706 | 0.035 | | |
| | Total | 110 | 154.297 | | | |
| 2 | Regression | 2 | 150.549 | 75.274 | 2,169.16 | 0.000 |
| | Residual | 108 | 3.748 | 0.035 | | |
| | Total | 110 | 154.297 | | | |
The verification of the assumptions of the proposed models, especially model 2, is presented in Fig. 4.
The probability-probability plot (p-p plot) (Fig. 4a) shows the values of the residuals with a linear
pattern indicating normality; additionally, the Kolmogorov-Smirnov value and its associated p-value
confirm it (p-value > 0.15). The Durbin-Watson independence test gave a value of 1.979
for model 2, which indicates that the residuals are not correlated. The homoscedasticity plot
presented in Fig. 4b shows a behavior of the residuals that does not fit any known pattern; therefore,
this assumption is adequately met.
Fig. 4: Verification of model assumptions: (a) Normality of residuals, (b) Independence of residuals
On the other hand, the R² value of the equations in both models was 0.976, which indicates that
97.6% of the variation in the generation rate of MSW YGen (transformed) can be explained by the
explanatory variables used. It is important to note the slight decrease of R²adj (0.975) with respect
to R², which is due to the adjustment for the introduction of 3 and 2 variables in models 1 and 2,
respectively. The high value of R² and R²adj in these models is due to the initial transformation of
the explanatory variables, as well as the response variable. Additionally, the data collection carried
out in this study influenced these values because the data come from a census database, and not from
an information survey through interviews. The internal validation of model 2 through MAPE, MAD
and RMSE showed values of 7.70, 0.16 and 0.19, respectively, which indicates a high precision
since the values of these measures are close to 0 (zero). The external validation by R²jackknife
gave a value of 97.44%. Therefore, model 2 also has a high forecasting capacity.
Non-significant variables
The analysis of the 8 explanatory variables using the VIF test produced the initial elimination of the
variables XAs, ln-XHgs, ln-XCes and ln-XDpi, since their VIF value was higher than the cut-off of 4. The
variables XAs and ln-XDpi have been used mainly in studies at household or locality levels (Khan et al.,
2016; Ojeda et al., 2008), but in this paper they were used at district level, and the effect of these
variables seems not to be important (low correlation with the response variable ln-YGen). The
variables ln-XHgs and ln-XCes were eliminated because they are highly correlated with XMi, since the
latter is a multidimensional indicator that measures deprivation in a population through variables
similar to those eliminated. Finally, through the r test, only XMi was eliminated, since it was highly
correlated with ln-XPbam, with a coefficient of -0.695, whose absolute value exceeds the cut-off value
of 0.6; additionally, this variable was less correlated with the ln-YGen response
variable (Table 4).
Table 4: Correlation matrix of variables
| Pearson's correlation | ln-YGen | ln-XPop | ln-XPd | ln-XPbam |
|---|---|---|---|---|
| ln-XPop | 0.985 | --- | --- | --- |
| ln-XPd | 0.161 | 0.169 | --- | --- |
| ln-XPbam | 0.638 | 0.573 | -0.059 | --- |
| XMi | -0.355 | -0.271 | -0.097 | -0.695 |
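The correlation-based screening can be reproduced directly from the values in Table 4. The Python sketch below flags predictor pairs with |r| ≥ 0.6 and, for each flagged pair, drops the member that is less correlated with ln-YGen. That tie-breaking rule is an interpretation of the reasoning given in the text for eliminating XMi, not a procedure stated explicitly by the authors.

```python
# Pearson correlations copied from Table 4.
with_response = {"ln-XPop": 0.985, "ln-XPd": 0.161, "ln-XPbam": 0.638, "XMi": -0.355}
between_predictors = {
    ("ln-XPop", "ln-XPd"): 0.169,
    ("ln-XPop", "ln-XPbam"): 0.573,
    ("ln-XPd", "ln-XPbam"): -0.059,
    ("ln-XPop", "XMi"): -0.271,
    ("ln-XPd", "XMi"): -0.097,
    ("ln-XPbam", "XMi"): -0.695,
}
CUTOFF = 0.6  # |r| >= 0.6 flags a correlated pair, as stated in the text

flagged = [pair for pair, r in between_predictors.items() if abs(r) >= CUTOFF]
# Of each flagged pair, drop the member less correlated with ln-YGen
# (assumed rule, inferred from the justification given for XMi).
dropped = [min(pair, key=lambda name: abs(with_response[name])) for pair in flagged]
print(flagged, dropped)
```

Only the pair ln-XPbam / XMi exceeds the cut-off, and XMi is the member removed, which matches the elimination reported above.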
Significant variables
The transformed variables XPop, XPd and XPbam were used in the search for the best model, since their
VIF and r values were below the cut-off values. XPop has been used in the studies of Azadi and Karimí
(2016) and Abdoli et al. (2011) as the most important explanatory variable. In this study, its Pearson
correlation with the response variable was 0.985, which indicates that it is also the variable most
related to the generation of waste, and in a positive way, i.e., a larger population corresponds to a
greater quantity of MSW. The variable XPd has been used in few publications. Bel and Mur (2009) also
use this variable to obtain the costs associated with waste management. In this study, an r-value of
0.161 was obtained, which indicates a poor correlation with the response variable. The analysis of
forecast model 1 indicated that this variable is not statistically significant, and it must be used
with caution. The XPbam variable is positively related to the response variable. Its Pearson
correlation coefficient was 0.638. This variable is important in the study area, since it suggests
that people who move from one municipality to another have different consumption
patterns that modify the amounts of MSW. Other explanatory variables mentioned in Kolekar et al.
(2016), for instance age, employment status, level of urbanization and environmental variables such
as precipitation or temperature, were not used in this study since it is difficult to find a database
with information on these variables.
Other generated models
Eqs. 13, 14 and 15 show other models generated with the variables initially considered (models 3, 4
and 5, respectively). All these models are statistically significant and could also be used for
forecasting purposes, but they incorporate explanatory variables that are not significant; therefore,
their results are not accurate (Table 5). Additionally, they have low parsimony because they
incorporate more than 3 explanatory variables. Table 6 shows the statistical behavior of the
predictors. The p-value and the VIF must be analyzed because they indicate, respectively, whether each
predictor can be interpreted within the generated model and whether there is multicollinearity between
the variables.
$$\ln Y_{Gen} = -8.48 + 1.13 \ln X_{Pop} - 0.0019 \ln X_{Pd} + 0.0315 \ln X_{Pbam} - 0.0111 X_{Mi} \tag{13}$$
$$\ln Y_{Gen} = -8.42 + 1.13 \ln X_{Pop} - 0.0008 \ln X_{Pd} + 0.0324 \ln X_{Pbam} - 0.0070 X_{As} - 0.0119 X_{Mi} \tag{14}$$
$$\ln Y_{Gen} = -8.22 + 0.995 \ln X_{Pop} + 0.0039 \ln X_{Pd} + 0.0261 \ln X_{Pbam} - 0.0006 X_{As} + 0.133 \ln X_{Hgs} - 0.00525 X_{Mi} \tag{15}$$
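Table 5: Analysis of variance of other generated models
| Model | Source | Degrees of freedom (df) | Sum of squares | Mean square | F | Sig. | R²adj |
|---|---|---|---|---|---|---|---|
| 3 | Regression | 4 | 150.900 | 37.725 | 1,177.44 | 0.000 | 0.977 |
| | Residual | 106 | 3.396 | 0.032 | | | |
| | Total | 110 | 154.297 | | | | |
| 4 | Regression | 5 | 150.902 | 30.180 | 933.42 | 0.000 | 0.977 |
| | Residual | 105 | 3.395 | 0.032 | | | |
| | Total | 110 | 154.297 | | | | |
| 5 | Regression | 6 | 151.397 | 25.233 | 905.01 | 0.000 | 0.980 |
| | Residual | 104 | 2.900 | 0.028 | | | |
| | Total | 110 | 154.297 | | | | |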
Table 6: Statistical behavior of predictors in models 3, 4 and 5
| Model | Predictor | Coefficient | p-value | VIF |
|---|---|---|---|---|
| 3 | Constant | -8.48 | 0.000 | ------- |
| | ln-XPop | 1.13 | 0.000 | 1.947 |
| | ln-XPd | -0.0019 | 0.937 | 1.300 |
| | ln-XPbam | 0.0315 | 0.053 | 3.556 |
| | XMi | -0.0111 | 0.002 | 2.405 |
| 4 | Constant | -8.42 | 0.000 | ------- |
| | ln-XPop | 1.13 | 0.000 | 1.948 |
| | ln-XPd | -0.0008 | 0.975 | 1.374 |
| | ln-XPbam | 0.0324 | 0.055 | 3.789 |
| | XAs | -0.0070 | 0.844 | 5.373 |
| | XMi | -0.0119 | 0.026 | 5.177 |
| 5 | Constant | -8.22 | 0.000 | ------- |
| | ln-XPop | 0.995 | 0.000 | 5.634 |
| | ln-XPd | 0.0039 | 0.869 | 1.377 |
| | ln-XPbam | 0.0261 | 0.096 | 3.824 |
| | XAs | -0.0006 | 0.986 | 5.384 |
| | ln-XHgs | 0.133 | 0.000 | 6.180 |
| | XMi | -0.0525 | 0.309 | 5.710 |
Comments, model 3: There is no multicollinearity among predictors, but some of them are not
statistically significant (p-value > 0.05).
Comments, model 4: There is multicollinearity between predictors (VIF > 5) and some of them are not
statistically significant (p-value > 0.05).
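The reading of Table 6 can be made explicit with a small Python sketch that applies the two criteria quoted in the table comments (p-value > 0.05 and VIF > 5) to the tabulated values of models 3 and 4. The function name `screen` and the data layout are illustrative choices.

```python
ALPHA = 0.05      # significance level used in the study
VIF_LIMIT = 5.0   # threshold quoted in the Table 6 comments

# (predictor, p-value, VIF) copied from Table 6.
model_3 = [("ln-XPop", 0.000, 1.947), ("ln-XPd", 0.937, 1.300),
           ("ln-XPbam", 0.053, 3.556), ("XMi", 0.002, 2.405)]
model_4 = [("ln-XPop", 0.000, 1.948), ("ln-XPd", 0.975, 1.374),
           ("ln-XPbam", 0.055, 3.789), ("XAs", 0.844, 5.373),
           ("XMi", 0.026, 5.177)]


def screen(model):
    """Return (non-significant predictors, predictors with VIF above the limit)."""
    not_significant = [name for name, p, _ in model if p > ALPHA]
    collinear = [name for name, _, vif in model if vif > VIF_LIMIT]
    return not_significant, collinear


print(screen(model_3))
print(screen(model_4))
```

Model 3 shows non-significant predictors but no VIF above 5, whereas model 4 shows both problems, in line with the comments attached to the table.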
Inferences about the municipalities of the study area
Based on model 2 and its statistical analysis, inferences were made to forecast the generation rate
of MSW in the municipalities of the CCS. The forecast was made with the most current data of the
variables XPop and XPbam, corresponding to the year 2015. Fig. 5 shows the MSW generation forecast and
its comparison with the original database. In most of the municipalities of the CCS, the generation
rate of MSW presented a gradual increase with respect to population growth (variable XPop), except
in Arriaga, Chiapilla, Osumacinta, Suchiapa, Teopisca, Tonalá, Venustiano Carranza and Villaflores,
because the population of these municipalities did not increase in the 2010-2015 period.
Fig. 5: Forecast of MSW generation rates in the municipalities of the study area
Currently, the study area generates 1,600 tons of MSW/day, of which 74% comes from regional
heads such as Berriozábal, Ocozocoautla de Espinosa, San Cristóbal de las Casas, Tuxtla Gutiérrez
and Villaflores.
CONCLUSION
In this study, a forecast model was developed to determine the generation of MSW in the
municipalities of the CCS, Chiapas State, Mexico. MLR was used to obtain the forecast model with
social and demographic explanatory variables. Two forecast models were presented and analyzed,
with variables that met the multicollinearity test. The most important variables to predict the rate
of MSW generation in the study area were the population of each municipality (XPop), the population
born in another municipality (XPbam) and the population density (XPd). XPop is the most influential
explanatory variable of waste generation, and it is positively related to it. XPbam is less
related to waste generation. XPd is the variable that least influences waste generation prediction; in
addition, it can present problems of correlation with other explanatory variables. Although other
variables, such as daily per capita income (XDpi) and average schooling (XAs), are very important, they
do not seem to have an effect on the response variable in this study. The user of this forecast model
should use model 2, since it is the one with the highest parsimony (it uses fewer variables); its
R²adj, MAPE, MAD and RMSE values indicated high influence on the explained phenomenon and high
forecasting capacity. Additionally, it is important to mention that when using the proposed models
for forecasting purposes, the explanatory variables must first be transformed with natural logarithms
and the predicted response must be back-transformed with the inverse of the natural logarithm (the
exponential function). The inferences made on the municipalities of the study
area showed that, except in some municipalities, the MSW generation rate usually presented a
gradual increase with respect to population growth and with respect to the number of inhabitants
that were born in another entity (migration). Finally, this study can be a solid basis for comparison
for future research in the area of study. It is possible to use different mathematical models such as
artificial neural networks, principal component analysis, time-series analysis, etc., and compare the
response variable or the predictors.
ACKNOWLEDGEMENT
This study was supported by the Project Support Program for Research and Technological
Innovation [Project UNAM DGAPA-PAPIIT IN105516]. We also appreciate the support of the Mexican
National Council for Science and Technology (CONACYT) to carry out this work.
CONFLICT OF INTEREST
The authors declare that there is no conflict of interests regarding the publication of this
manuscript. In addition, the ethical issues, including plagiarism, informed consent, misconduct, data
fabrication and/or falsification, double publication and/or submission, and redundancy have been
completely observed by the authors.
ABBREVIATIONS
α    Level of significance
At    Observed value
βi (1, 2, 3 ... k)    Regression coefficients
CCS    Cuenca del Cañón del Sumidero
d    Durbin-Watson test
Eq.    Equation
F    Fisher test
Ft    Predicted value
H0    Null hypothesis
k    Number of explanatory variables included in the model
ln-YGen    Natural logarithm of MSW generation
ln-XPop    Natural logarithm of population
ln-XPd    Natural logarithm of population density
ln-XPbam    Natural logarithm of population born in another municipality
ln-XHgs    Natural logarithm of household with goods and services
ln-XCes    Natural logarithm of commercial establishments and services
ln-XDpi    Natural logarithm of daily per capita income
MAD    Mean Absolute Deviation
MAPE    Mean Absolute Percentage Error
MLR    Multiple Linear Regression
MSW    Municipal Solid Waste
n    Sample size
p-p plot    Probability-probability plot
p-value    Probability value
r    Pearson correlation coefficient
r-value    Pearson correlation coefficient
R²    Coefficient of determination
R²adj    Adjusted coefficient of determination
R²jackknife    Jackknife coefficient of determination
RMSE    Root Mean Square Error
SSE    Sum of Squared Errors
SSYY    Sum of the squares of the difference of (Yi) and the (Ȳ)
VIF    Variance Inflation Factor
Xi (1, 2, 3 ... k)    Explanatory variables
XAs    Average schooling
XCes    Commercial establishments and services
XDpi    Daily per capita income
XHgs    Household with goods and services
XMi    Marginalization index
XPbam    Population born in another municipality
XPd    Population density
XPop    Population
Ȳ    Average of observed data
YGen    MSW generation
Yi    Value of each individual observation
Ŷi    Predicted value
REFERENCES
Abdoli, M.; Falahnezhad, M.; Behboudian, S., (2011). Multivariate econometric approach for solid
waste generation modeling: a case study of Mashhad, Iran. Environ. Eng. Sci., 28(9): 627-633
(7 pages).
https://www.liebertpub.com/doi/abs/10.1089/ees.2010.0234
Agirre, E.; Ibarra, G.; Madariaga, I., (2006). Regression and multilayer perceptron-based models to
forecast hourly O3 and NO2 levels in the Bilbao area. Environ. Modell. Software, 21(4): 430-446
(17 pages).
https://dl.acm.org/citation.cfm?id=1707340
Alvarado, H.; Nájera, H.; González, F.; Palacios, R., (2009). Study of generation and characterization
of household solid waste in the municipal seat of Chiapa de Corzo, Chiapas, Mexico.
Lacandonia J., 3: 85-92 (8 pages).
https://cuid.unicach.mx/revistas/index.php/lacandonia/article/view/156
Araiza, J.; López, C.; Ramírez, N., (2015). Municipal solid waste management: case study in Las
Margaritas, Chiapas. AIDIS J. Eng. Environ. Sci.: Res. Develop. Pract., 8(3): 299-311 (13
pages).
http://www.revistas.unam.mx/index.php/aidis/article/view/53489
Arriaza, M., (2006). Practical guide to data analysis, Junta de Andalucía, Ministry of Innovation,
Science and Business, Institute of Agricultural Research and Training and Fishing, Spain.
https://www.juntadeandalucia.es/agriculturaypesca/ifapa/servifapa/contenidoAlf?id=1c141ba8-08cc-42fc-9df7-10a632eb3194
Azadi,S.;Karimí,A., (2016). Verifyingthe performance of artificial neural networkandmultiple
linearregressioninpredictingthe meanseasonal municipalsolidwastegenerationrate:a
case studyof Fars province,Iran.Waste Manage.,48: 14-23 (10 pages).
https://www.ncbi.nlm.nih.gov/pubmed/26482809
Beigl,P.;Lebersorger,S.;Salhofer,S.,(2008).Modellingmunicipal solidwaste generation:a
review.Waste Manage.,28(1):200-214 (15 pages).
https://www.ncbi.nlm.nih.gov/pubmed/17336051
Bel,G.; Mur, M., (2009). Intermunicipalcooperation,privatizationandwaste managementcosts:
evidence fromrural municipalities.Waste Manage.,29(10):2772–2778 (7 pages).
https://www.ncbi.nlm.nih.gov/pubmed/19556117
Buenrostro,O.;Bocco, G.; Vence,J.,(2001). Forecastinggenerationof urbansolidwaste in
developingcountries - acase studyinMexico.J. AirWaste Manage.,51(1): 86-93 (8 pages).
https://www.ncbi.nlm.nih.gov/pubmed/11218430
Chang, Y.; Lin, C.; Chyan, J.; Chen, I.; Chang, J., (2007). Multiple regression models for the lower
heating value of municipal solid waste in Taiwan. J. Environ. Manage., 85(4): 891–899 (9
pages).
https://www.ncbi.nlm.nih.gov/pubmed/17234326
Chhay, L.; Reyad, M.; Suy, R.; Islam, M.; Mian, M., (2018). Municipal solid waste generation in China:
influencing factor analysis and multi-model forecasting. J. Mater. Cycles Waste Manage.,
20(3): 1761–1770 (10 pages).
https://link.springer.com/article/10.1007/s10163-018-0743-4
CONAGUA, (2009). National Water Commission. Integrated management plan for the Cuenca del
Cañón del Sumidero. Chiapas, México.
CONAPO, (2017). National Population Council. Municipal Marginalization Index.
http://www.conapo.gob.mx/es/CONAPO/Datos_Abiertos_del_Indice_de_Marginacion
Accessed 10 June 2017.
Ghinea, C.; Niculina, E.; Comanita, E.; Gavrilescu, M.; Campean, T.; Curteanu, S.; Gavrilescu, M.,
(2016). Forecasting municipal solid waste generation using prognostic tools and regression
analysis. J. Environ. Manage., 182: 80-93 (14 pages).
https://www.ncbi.nlm.nih.gov/pubmed/27454099
Grazhdani, D., (2016). Assessing the variables affecting on the rate of solid waste generation and
recycling: an empirical analysis in Prespa Park. Waste Manage., 48: 3-13 (11 pages).
https://www.ncbi.nlm.nih.gov/pubmed/26482808
INEGI, (2010). Population and housing census 2010: interactive data query. National Institute of
Statistic and Geography.
https://www.inegi.org.mx/programas/ccpv/2010/default.html
Intharathirat, R.; Salam, P.; Kumar, S.; Untong, A., (2015). Forecasting of municipal solid waste
quantity in a developing country using multivariate grey model. Waste Manage., 39: 3–14
(12 pages).
https://www.ncbi.nlm.nih.gov/pubmed/25704925
Kannangara, M.; Dua, R.; Ahmadi, L.; Bensebaa, F., (2018). Modeling and prediction of regional
municipal solid waste generation and diversion in Canada using machine learning approaches.
Waste Manage., 74: 3–15 (13 pages).
https://www.ncbi.nlm.nih.gov/pubmed/29221873
Khan, D.; Kumar, A.; Samadder, S., (2016). Impact of socioeconomic status on municipal solid
waste generation rate. Waste Manage., 49: 15-25 (11 pages).
https://www.ncbi.nlm.nih.gov/pubmed/26831564
Keser, S.; Duzgun, S.; Aksoy, A., (2012). Application of spatial and non-spatial data analysis in
determination of the factors that impact municipal solid waste generation rates in Turkey.
Waste Manage., 32(3): 359–371 (13 pages).
https://www.ncbi.nlm.nih.gov/pubmed/22104614
Kolekar, K.; Hazra, T.; Chakrabarty, S., (2016). A review on prediction of municipal solid waste
generation models. Procedia Environ. Sci., 35: 238–244 (7 pages).
https://www.sciencedirect.com/science/article/pii/S1878029616301761
Kumar, A.; Samandder, S., (2017). An empirical model for prediction of household solid waste
generation rate – a case study of Dhanbad, India. Waste Manage., 68: 3-15 (13 pages).
https://www.ncbi.nlm.nih.gov/pubmed/28757221
Liu, C.; Wu, X., (2010). Factors influencing municipal solid waste generation in China: a multiple
statistical analysis study. Waste Manage. Res., 29(4): 371-378 (8 pages).
https://www.ncbi.nlm.nih.gov/pubmed/20699292
Liu, J.; Li, Q.; Gu, W.; Wang, C., (2019). The Impact of consumption patterns on the generation of
municipal solid waste in China: evidences from provincial data. Int. J. Environ. Res. Public
Health, 16(10): 1-19 (19 pages).
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6573004/
Mahmood, S.; Sharif, F.; Rahman, A.U.; Khan, A.U., (2018). Analysis and forecasting of municipal
solid waste in Nankana City using geo-spatial techniques. Environ. Monit. Assess., 190(5):
1-14 (14 pages).
https://www.ncbi.nlm.nih.gov/pubmed/29644486
Márquez, M.; Ojeda, S.; Hidalgo, H., (2008). Identification of behavior patterns in household solid
waste generation in Mexicali’s City: study case. Resour. Conserv. Recycl., 52(11): 1299–1306
(8 pages).
https://www.sciencedirect.com/science/article/pii/S0921344908001146
Mendenhall, W.; Sincich, T., (2012). A second course in statistics: regression analysis, Prentice Hall,
United States of America.
https://www.amazon.com/Second-Course-Statistics-Regression-Analysis/dp/0321691695
OECD, (2015). Measuring well-being in Mexican States. Organization for Economic Cooperation
and Development. OECD Publishing, Paris, France.
http://www.oecd.org/regional/measuring-well-being-in-mexican-states-9789264246072-en.htm
Ojeda, S.; Lozano, G.; Morelos, R.; Armijo, C., (2008). Mathematical modeling to predict residential
solid waste generation. Waste Manage., 28(1): S7–S13 (7 pages).
https://www.ncbi.nlm.nih.gov/pubmed/18583125
Pan, A.; Yu, L.; Yang, Q., (2019). Characteristics and forecasting of municipal solid waste generation
in China. Sustainability, 11(5): 1-11 (11 pages).
https://www.mdpi.com/2071-1050/11/5/1433
Pires, J.; Martins, F.; Sousa, S.; Alvim, M.; Pereira, M., (2008). Selection and validation of parameters
in multiple linear and principal component regressions. Environ. Modell. Softw., 23(1): 50-55
(6 pages).
https://dl.acm.org/citation.cfm?id=1290535.1290558
Rodríguez, M., (2004). Design of a mathematical model of municipal solid waste generation in
Nicolás Romero, Mexico. Master's Thesis, National Polytechnic Institute, Mexico.
https://tesis.ipn.mx/handle/123456789/1617
Rybová, K.; Slavik, J.; Burcin, B.; Soukopová, J.; Kučera, T.; Černíková, A., (2018).
Socio-demographic determinants of municipal waste generation: case study of the Czech Republic.
J. Mater. Cycles Waste Manage., 20(3): 1884–1891 (8 pages).
https://link.springer.com/article/10.1007/s10163-018-0734-5
SEMAHN, Secretariat of the Environment and Natural History (2013). State program for the
prevention and integral management of municipal solid waste and special waste in the state
of Chiapas.
https://www.gob.mx/cms/uploads/attachment/file/187451/Chiapas.pdf
Shan, C., (2010). Projecting municipal solid waste: the case of Hong Kong SAR. Resour. Conserv.
Recycl., 54(11): 759–768 (10 pages).
https://www.sciencedirect.com/science/article/pii/S0921344909002730
Soni, U.; Roy, A.; Verma, A.; Jain, V., (2019). Forecasting municipal solid waste generation using
artificial intelligence models - a case study in India. SN Appl. Sci., 1(2): 1-11 (11 pages).
https://link.springer.com/article/10.1007/s42452-018-0157-x
GRAPHICAL ABSTRACT
A Graphical Abstract must be produced according to the main findings of the manuscript, summarizing
the research story visually and pictorially. Providing some selected manuscript graphs, tables,
images, diagrams or text cannot be considered a Graphical Abstract!
In order to learn more, refer to the Graphical Abstract Guideline produced by Elsevier Publisher at
the link below, or look at the GJESM published articles:
https://www.elsevier.com/authors/journal-authors/graphical-abstract
HIGHLIGHTS
The provided HIGHLIGHTS must be written scientifically according to your research output and NOVELTY, not
as your research aims and methodology, in 3 to 4 HIGHLIGHTS statements:
• The predictive model developed was constructed using multiple linear regression, with explanatory
social and demographic variables that met the multicollinearity test;
• The results of this work show that the variables most influential on waste generation are directly
related to the number of inhabitants, such as population, population density and population born in
another entity;
• The suggested predictive model has high parsimony; in addition, the adjusted coefficient of
determination and the accuracy coefficients indicated a high influence on the explained phenomenon
and a high forecasting capacity.

1-Manuscript_Template.docx

  • 1. 1 GJESM MANUSCRIPT TEMPLATE o ORIGINAL RESEARCH PAPER o CASE STUDY o SHORT COMMUNICATION Forecast generation model of municipal solid waste using multiple linear regression ABSTRACT: The objective of this study was to develop a forecast model to determine the rate of generation of municipal solid wastein the municipalities of the Cuenca del Cañón del Sumidero, Chiapas,Mexico.Multiplelinearregression was used with social and demographic explanatory variables.The compiled databaseconsisted of 9 variables with 118 specific data per variable, which were analyzed using a multicollinearity test to select the most important ones. Initially,different regression models were generated, but only 2 of them were considered useful,becausethey used few predictors that were statistically significant. The most important variables to predict the rate of waste generation in the study area were the population of each municipality,themigration and the population density.Although other variables,such as daily per capita incomeand averageschoolingarevery important,they do not seem to have an effect on the response variable in this study. The model with the highest parsimony resulted in an adjusted coefficient of 0.975, an average absolute percentage error of 7.70, an average absolute deviation of 0.16 and an average root squareerror of 0.19, showinga high influenceon the phenomenon studied and a good predictivecapacity. KEYWORDS: Explanatory variables; Forecast model; Multiple linear regression;Statisticalanalysis; Waste generation. NUMBER OF REFERENCES 34 NUMBER OF FIGURES 5 NUMBER OF TABLES 6 RUNNING TITLE: Forecastgenerationmodel of municipal solidwaste INTRODUCTION Because of its high management cost, the amount of Municipal Solid Waste (MSW) generated in population settlements is a significant factor for the provision of public services. According to Intharathirat et al. (2015); Keser et al. (2012); Khan et al. 
(2016), the amount of MSW and its compositionvary dependingon social,environmental anddemographicfactors. Severalresearchers have developed models to predict the amount of MSW generated (Mahmood et al., 2018; Kannangara et al., 2018; Pan et al., 2019; Soni et al., 2019), while othersanalyze the variablesthat influencetheirgenerationandcomposition(Chhay etal., 2018; Grazhdani, 2016; Liu and Wu, 2010; Liu et al., 2019; Rybová et al., 2018). Unfortunately,due to the social, economic and geographical
  • 2. 2 heterogeneityof the differentregionsof the world,it is difficulttomake inferencesorprojections withthe proposedmodels,and therefore, the modelsandtheirvariableshavetobe adaptedtothe conditionsof otherregions,sometimeswithlittle success. KumarandSamandder(2017) and Shan (2010) reportsthatsome of the difficultiesforthe adaptationof thesemodels are relatedtolimited or inaccessible information in other countries (databases). In addition, some variables are theoretically valid, but difficult to measure. In other cases, the variables used do not provide informationleadingto the explanationof the phenomenon,buthave tobe used,because the model incorporatesthem. Mexico,thistopichasalsobeenaddressed,particularlyinthe centerandnorth of the country (Buenrostro et al., 2001; Márquez et al., 2008; Ojedaet al., 2008; Rodríguez,2004). However, it is evident that the models proposed are not applicable to the entire national context. According to the OECD (2015), there are notable differencesbetween the central, northern and especially southern regions of Mexico; these include disparities in income, education, access to services,dispersionof localitiesandotherfactors,whichcause that the consumptionpatterns,and therefore the amountof MSW,vary greatly. Thisstudy presentsamodel toforecastthe generation rate of MSW in the municipalities of the Cuenca del Cañón del Sumidero (CCS), Chiapas State, Mexico. The model considersthe informationof the most relevantandeasilyaccessible socialand demographic variables for the studyarea, which correspond to statistical data for the years 2010- 2015. This model will allow the decision makers of the municipalities of the CCS to determine the quantities of MSWgenerated,operate properlythe waste managementsystems,and evenacquire infrastructure. This study has been carried out in municipalities of the Cuenca del Cañón del Sumidero,ChiapasState,Mexicoduring2010 - 2015. 
MATERIALS AND METHODS Description of the study area and context The CCS islocatedinthe State of Chiapas, inthe southeastof Mexico,betweenthe coordinates15° 56' 55" and 16° 57' 26" NorthLatitude, and 92° 30' 44" and 93° 44° 35" westlongitude (Fig.1).The CCS has 24 municipalitiesand2,847 localities; 2,816 localitiesare rural while 31 are urban. 83% of the populationof the studyarealivesinurbanareas (INEGI,2010). The degree of dispersionishigh, especiallyinthe rural localitiesfarthestfromthe municipal seat.
  • 3. 3 Fig. 1: Geographic location of the study area in Cuenca del Cañón de Sumidero in Mexico Developmentof the model This study uses a multiple linear regression (MLR) model to obtain the generation rates of MSW. Because of theirversatilityandwell-foundedtheory, MLRmodelshave beenwidelyusedin various scientificfields. Theirmaindisadvantage isthe preparationof the database (Piresetal.,2008). The hypothesistouse theMLRinthis study isbasedonthe effectof the explanatoryvariables(socialand demographic variables) on the response variable (generation rate of MSW). The linear function is shownin Eq. 1. 𝑌 = 𝛽0 + 𝛽1𝑋1 + 𝛽2𝑋2+.. .+𝛽𝑘𝑋𝑘 + 𝜀 (1) Where, Y is the response variable, Xi (1, 2, 3 ... k) are the explanatory variables, βi (1, 2, 3 ... k) are the regressioncoefficientsand 𝜀 isthe residual error. Accordingto Agirre (2006), the MLR is basedontwoassumptions:i) the explanatoryvariablesmust be independent, i.e., free of multicollinearity and ii) the dependent variable must be normally distributed, with zero mean and constant variance. In order to determine the regression coefficients,the least squares method, which is based on minimizing the sum of squared errors (SSE),usingEq. 2. 𝑆𝑆𝐸 = ∑ (𝑌𝑖 − 𝑌𝑖 ̂) 2 𝑛 𝑖=1 (2)
  • 4. 4 Where 𝑌𝑖 isthe value of eachobservationand 𝑌𝑖 ̂ isthe predictedvalue.Theoretically,low SSEvalues reflectabetterfitof the regressionmodel (KumarandSamandder,2017).Inorderto determine the bestregressionmodel(mostparsimony),the statisticalsignificance of theexplanatoryvariablesand the general modelwere analyzed.The analysisof the explanatoryvariableswasperformedwiththe t-test, while the degree of adjustment and usefulness of the proposed model was performed by evaluatingthe F-testandthe value of 𝑅2 𝑎𝑑𝑗 usingEqs. 3 and 4, respectively. 𝐹 = (𝑆𝑆𝑌𝑌−𝑆𝑆𝐸)/𝑘 𝑆𝑆𝐸/[𝑛−(𝑘+1)] (3) 𝑅2 𝑎𝑑𝑗 = 1 − [ (𝑛−1) 𝑛−(𝑘+1) ](1 − 𝑅2) (4) Where 𝑆𝑆𝑌𝑌 = ∑(𝑌𝑖 − 𝑌 ̅)2 representsthe sumofthe squaresof the differenceof the observeddata (𝑌𝑖) and the average of the data(𝑌 ̅);k isthe numberof explanatoryvariablesincludedinthe model; n isthe sample size;and 𝑅2 isthe coefficientof determination.The value of 𝑅2 wasnotconsidered tomeasure the explanatorypowerof theregressionmodel,becauseitsvalue increaseswhenadding more explanatoryvariables,anditcanbe a deceptive measure (Changetal.,2007). Datacollection Accordingto Beigl etal. (2008) and Kolekaretal. (2016), the methodsof data collectiondependon the scale of the study.In investigationscarriedoutathouseholdorlocalitylevels,the acquisitionof information isusuallycarriedoutthroughsurveysor interviews;while atdistrictor countryscales, the information comes from a database registered by government agencies. This study was made at districtscale and therefore the studyarea includesseveral municipalities.MSWgenerationwas obtained from SEMANH (2013), the studies by Alvarado et al. (2009) and Araiza et al. (2015). The social anddemographicinformation(explanatoryvariables) wasobtainedfrom CONAPO(2017) and INEGI (2010). The compiled informationallowedthe elaborationof a database of 9 variables,with 118 specific data per variable, coming from all the municipalities of the state of Chiapas (Table 1). 
The inferences of the proposed model were made on the municipalities of the CCS. This database was analyzed with the MINITAB software, version 16.

Exploratory analysis of variables

An exploratory analysis of the 9 variables was carried out to check the normality of the data. The test used was Kolmogorov-Smirnov, with a level of significance of α = 0.05. This test showed that the variables YGen, XPop, XPd, XPbam, XHgs, XCes and XDpi did not follow a normal distribution, because their p-value was smaller than the α value considered. In order to adjust their values, these variables were transformed with natural logarithms. The variables XAs and XMi were not transformed because their data followed a normal distribution (Table 2).

Table 1: Description of variables

| No. | Name of the variable | Symbol | Type of variable | Measure |
|---|---|---|---|---|
| 1 | MSW generation | YGen | Dependent | Tons/day |
| 2 | Population | XPop | Independent | Inhabitants |
| 3 | Population density | XPd | Independent | Inhabitants/km² |
| 4 | Population born in another municipality | XPbam | Independent | Inhabitants |
| 5 | Average schooling | XAs | Independent | Years of study |
| 6 | Household with goods and services | XHgs | Independent | Percent (%) |
| 7 | Commercial establishments and services | XCes | Independent | Number of establishments |
| 8 | Daily per capita income | XDpi | Independent | Mexican pesos/day |
| 9 | Marginalization index | XMi | Independent | Percent (%) |

Multicollinearity analysis and variable screening

An analysis of the explanatory variables was made prior to the selection of the best MLR model. Through a multicollinearity test, some of the variables initially considered were eliminated. Specifically, the variance inflation factor (VIF) and the Pearson correlation coefficient (r) were used. Similar to Keser et al. (2012), the r coefficient was used to detect bivariate association, while the VIF was used to detect multivariate correlation. Eqs. 5 and 6 describe the tests used.

$$VIF_k = \frac{1}{1 - R_k^2} \tag{5}$$

$$r = \sqrt{1 - \frac{SSE}{SS_{YY}}} \tag{6}$$

The VIF is calculated from the $R_k^2$ of a regression in which the explanatory variable k is treated as the dependent variable and the remaining explanatory variables are used as independent variables; thus, a VIF is calculated for each explanatory variable k. The cut-off value of VIF used in this study was 4. According to Ghinea et al. (2016), when VIF = 1, the explanatory variables are not correlated; when 1 < VIF < 5, the explanatory variables are slightly correlated; and when VIF > 5 or 10, the explanatory variables are highly correlated. The value of r indicates the relationship between two variables (positive or negative); its value ranges between -1 and 1. There is no clearly defined cut-off in the literature. Arriaza (2006) indicates that with values of r greater than 0.3 there may be signs of correlation, and that with values greater than 0.8 there are serious problems of multicollinearity. As in Grazhdani (2016), in this study it was considered that a value of |r| ≥ 0.6 (positive or negative) indicates correlation between the explanatory variables.
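The two screening statistics can be illustrated with a minimal sketch in plain Python. It is not the Minitab procedure used in the study: the data, the column names `a`, `b`, `c` and the helper functions `solve`, `ols_r2`, `vif` and `pearson_r` are all hypothetical, and the cut-offs of 4 and 0.6 are simply the ones quoted above. The signed Pearson coefficient is computed directly; for a regression with a single predictor, Eq. 6 gives its absolute value.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols_r2(X, y):
    """R^2 of a least-squares fit of y on the columns of X plus an intercept."""
    A = [[1.0] + list(row) for row in X]
    p = len(A[0])
    xtx = [[sum(r[i] * r[j] for r in A) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(A, y)) for i in range(p)]
    beta = solve(xtx, xty)
    y_hat = [sum(b * v for b, v in zip(beta, r)) for r in A]
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - sse / ss_yy

def vif(X):
    """Eq. 5: regress each predictor k on the others and return 1 / (1 - R_k^2)."""
    out = []
    for k in range(len(X[0])):
        target = [row[k] for row in X]
        others = [[v for j, v in enumerate(row) if j != k] for row in X]
        out.append(1.0 / (1.0 - ols_r2(others, target)))
    return out

def pearson_r(u, v):
    """Signed Pearson correlation coefficient between two variables."""
    u_bar, v_bar = sum(u) / len(u), sum(v) / len(v)
    s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    s_uu = sum((a - u_bar) ** 2 for a in u)
    s_vv = sum((b - v_bar) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)

# Hypothetical predictors: c is almost a copy of a, b is unrelated to both.
names = ["a", "b", "c"]
a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0]
c = [1.5, 1.5, 3.5, 3.5, 5.5, 5.5, 7.5, 7.5]
X = [list(row) for row in zip(a, b, c)]

vifs = vif(X)
flagged = [name for name, v in zip(names, vifs) if v > 4]  # VIF cut-off of 4
r_ac = pearson_r(a, c)
r_ab = pearson_r(a, b)

print("VIF:", dict(zip(names, vifs)))
print("above cut-off:", flagged)
print("r(a, c) =", r_ac, " r(a, b) =", r_ab)
```

In this toy dataset the nearly duplicated columns are the ones flagged by both criteria, while the unrelated column keeps a VIF of about 1.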
The elimination of explanatory variables was performed in an iterative procedure, i.e., the VIF values were initially determined for the 8 variables; subsequently, the variable with the highest VIF was eliminated and the next iteration with 7 variables was performed. This elimination procedure ended when all the remaining VIF values were below the cut-off value of 4. Finally, other eliminations were made based on the values of r.

Subsequently, 3 explanatory variables were used in the search stage for a better model (of greater parsimony). The first variable selected was XPop, i.e., the "total population" of each municipality, under the hypothesis that the larger the population, the greater the consumption and thus the greater the amount of MSW generated. The second explanatory variable used was XPd, "population density", under the premise that dispersion patterns or agglomeration of inhabitants per unit area influence MSW generation. The third variable used was XPbam, "population born in another municipality", which can be seen as migration, i.e., people who move to other places to seek better living conditions. The process of mobilization of people causes changes in the consumption patterns of the new place of settlement. Other models that do not follow the principle of parsimony were also created (more than 3 explanatory variables), but they should not be used to forecast waste generation rates, since they have very low accuracy values and some of their explanatory variables are not significant.

Table 2: Normality test and transformation of variables

| Original variable | Kolmogorov-Smirnov statistic | p-value | Transformed variable | Kolmogorov-Smirnov statistic | p-value |
|---|---|---|---|---|---|
| YGen | 0.338 | <0.010 | ln-YGen | 0.050 | >0.150 |
| XPop | 0.281 | <0.010 | ln-XPop | 0.066 | >0.150 |
| XPd | 0.271 | <0.010 | ln-XPd | 0.039 | >0.150 |
| XPbam | 0.387 | <0.010 | ln-XPbam | 0.065 | >0.150 |
| XAs | 0.053 | >0.150 | XAs | --- | --- |
| XHgs | 0.183 | <0.010 | ln-XHgs | 0.054 | >0.150 |
| XCes | 0.349 | <0.010 | ln-XCes | 0.056 | >0.150 |
| XDpi | 0.117 | <0.010 | ln-XDpi | 0.056 | >0.150 |
| XMi | 0.054 | >0.150 | XMi | --- | --- |

Accuracy of the model and validation

In order to determine the accuracy of the best model found, 3 widely used measures were employed: the Mean Absolute Percentage Error (MAPE), the Mean Absolute Deviation (MAD) and the Root Mean Square Error (RMSE) (Eqs. 7, 8 and 9, respectively). A value of these measures close to zero indicates a high precision of the model (Azadi and Karimí, 2016).

$$MAPE = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{A_t - F_t}{A_t}\right| \times 100 \tag{7}$$

$$MAD = \frac{1}{n}\sum_{t=1}^{n}\left|A_t - F_t\right| \tag{8}$$

$$RMSE = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(A_t - F_t\right)^2} \tag{9}$$

In these equations, $A_t$ is the observed value, $F_t$ is the predicted value and n is the sample size. MAPE is expressed in terms of percentage of error, MAD expresses the precision in the units of the data analyzed, and RMSE indicates how concentrated the data are around the line of best fit. In order to perform the external validation of the model, the technique called $R^2_{jackknife}$ was used (Eq. 10). This coefficient is calculated by systematically eliminating each observation from the dataset, re-estimating the regression equation and determining to what extent the model is able to predict the observation that was removed.
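The accuracy measures of Eqs. 7 to 9 and the leave-one-out procedure just described can be sketched as follows. This is only an illustration under stated assumptions: the numbers are made up, the function names are hypothetical, and the jackknife loop refits a regression with a single predictor for brevity, whereas the models in this study use two or three predictors.

```python
import math

def mape(actual, forecast):
    """Eq. 7: mean absolute percentage error, in percent."""
    return 100.0 / len(actual) * sum(abs((a - f) / a) for a, f in zip(actual, forecast))

def mad(actual, forecast):
    """Eq. 8: mean absolute deviation, in the units of the data."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    """Eq. 9: root mean square error, in the units of the data."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    return y_bar - slope * x_bar, slope

def r2_jackknife(x, y):
    """Refit without observation i, predict it, and compare with the overall mean."""
    y_bar = sum(y) / len(y)
    press = 0.0  # sum of squared leave-one-out prediction errors
    for i in range(len(y)):
        b0, b1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (b0 + b1 * x[i])) ** 2
    return 1.0 - press / sum((yi - y_bar) ** 2 for yi in y)

# Hypothetical observed and predicted values.
observed = [2.0, 4.0, 5.0, 10.0]
predicted = [2.2, 3.6, 5.5, 9.0]
mape_value = mape(observed, predicted)
mad_value = mad(observed, predicted)
rmse_value = rmse(observed, predicted)

# Hypothetical near-linear data for the leave-one-out check.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]
r2_jk = r2_jackknife(x, y)

print(mape_value, mad_value, rmse_value, r2_jk)
```

The function returns a fraction; multiplying it by 100 gives the percentage form in which the coefficient is reported.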
$$R^2_{jackknife} = 1 - \frac{\sum \left(y_i - \hat{Y}_{(i)}\right)^2}{\sum \left(y_i - \bar{Y}\right)^2} \tag{10}$$

The $R^2_{jackknife}$ coefficient varies between 0 and 100%, larger values suggesting models with greater predictive capacity; $\hat{Y}_{(i)}$ denotes the predicted value for the i-th observation obtained when the regression model fits the data with $y_i$ omitted (or removed) from the sample; and $\bar{Y}$ is the simple average of the observed data.

Verification of model assumptions

The validity of the MLR models is subject to the behavior of the residual errors $\varepsilon$ (difference between observed and predicted values of the dependent variable), particularly their normal distribution, their independence and homoscedasticity (Kumar and Samandder, 2017). The verification of normality was carried out through the Kolmogorov-Smirnov test, with a level of significance of α = 0.05. In order to verify the independence of the residuals, the Durbin-Watson test (d) was applied, looking for values close to 2, because d varies between 0 and 4 (Mendenhall and Sincich, 2012). The homoscedasticity assumption was evaluated with the plot of residuals vs predicted values, both standardized, looking for a residual behavior that does not fit any known pattern.

RESULTS AND DISCUSSION

Statistical analysis of variables

The initial exploratory analysis was performed on the response variable YGen, which has the behavior shown in Fig. 2. It is observed that some municipalities, which appear to be outliers, show a very high rate of MSW generation.

Fig. 2: MSW generation rates in the state of Chiapas

These atypical values were not eliminated from the analysis because they are not errors, but rather data that come from the most important municipalities of Chiapas, such as Tuxtla Gutiérrez, Comitán, San Cristóbal de las Casas and Tapachula. These municipalities are regional heads; therefore, their number of inhabitants, patterns of consumption and MSW generation differ significantly from the rest of the studied area. The normality test of the response variable and of the 8 explanatory variables is shown in Fig. 3. The non-normality of the variables YGen, XPop, XPd, XPbam, XHgs, XCes and XDpi can be seen. For this reason, these variables were transformed using natural logarithms (Fig. 3a, 3b, 3c, 3d, 3f, 3g and 3h).
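The effect of a logarithmic transformation on normality can be illustrated with a minimal sketch that computes only the Kolmogorov-Smirnov statistic (the quantity reported in the statistic columns of Table 2). The sample below is synthetic and hypothetical, the function name `ks_statistic` is an assumption, and no p-value is produced: because the mean and standard deviation are estimated from the sample, p-values require a correction that software such as Minitab applies and that this sketch does not reproduce.

```python
import math
from statistics import NormalDist, mean, stdev

def ks_statistic(values):
    """Largest gap between the empirical CDF and a normal CDF fitted to the sample."""
    n = len(values)
    fitted = NormalDist(mean(values), stdev(values))
    d = 0.0
    for i, v in enumerate(sorted(values), start=1):
        cdf = fitted.cdf(v)
        # Compare the fitted CDF with the empirical CDF just after and just before v.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d

# Hypothetical right-skewed sample (exact lognormal quantiles), standing in for
# a variable such as XPop whose raw values are far from normal.
n = 50
raw = [math.exp(NormalDist().inv_cdf((i + 0.5) / n)) for i in range(n)]
logged = [math.log(v) for v in raw]

d_raw = ks_statistic(raw)
d_log = ks_statistic(logged)

print("K-S statistic, raw values:", d_raw)
print("K-S statistic, ln-transformed values:", d_log)
```

A smaller statistic after the transformation is the pattern shown in Table 2, where the statistics of the transformed variables are much lower than those of the original ones.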
Fig. 3: Behavior of the variables analyzed with respect to normality

Forecast model

The coefficients of the MLR model were determined using the Minitab software. Only the explanatory variables that fulfilled the multicollinearity criterion were used. Initially, 2 theoretically valid models were determined; the first one is shown in Eq. 11.
$$\ln Y_{Gen} = -8.91 + 1.10 \ln X_{Pop} + 0.0259 \ln X_{Pd} + 0.0688 \ln X_{Pbam} \tag{11}$$

This first model consists of 3 variables, XPop, XPd and XPbam (all transformed). The F-test associated with the analysis of variance indicated that the model is statistically valid because its p-value < 0.05. This model can thus also be used for forecast purposes. However, it is important to be careful because the explanatory variable XPd is not statistically significant, since the null hypothesis that the coefficient of the variable is equal to zero (H0: βi = 0) cannot be rejected. Therefore, this explanatory variable does not appear to be related to the dependent variable, i.e., its coefficient should not be interpreted.

The second model is presented in Eq. 12 and consists of 2 explanatory variables, XPop and XPbam. As with the first model, the p-value of the F-test indicates that it is a statistically valid model that can be used for forecasting purposes. This model is the one of greater parsimony, because it uses only 2 variables.

$$\ln Y_{Gen} = -8.86 + 1.11 \ln X_{Pop} + 0.0658 \ln X_{Pbam} \tag{12}$$

All the information associated with the analysis of variance is presented in Table 3.

Table 3: Analysis of variance of the proposed models

| Model | Source | Degrees of freedom (df) | Sum of squares | Mean square | F | Sig. |
|---|---|---|---|---|---|---|
| 1 | Regression | 3 | 150.591 | 50.197 | 1,449.33 | 0.000 |
| | Residual | 107 | 3.706 | 0.035 | | |
| | Total | 110 | 154.297 | | | |
| 2 | Regression | 2 | 150.549 | 75.274 | 2,169.16 | 0.000 |
| | Residual | 108 | 3.748 | 0.035 | | |
| | Total | 110 | 154.297 | | | |

The verification of the assumptions of the proposed models, especially model 2, is presented in Fig. 4. The probability-probability plot (p-p plot) (Fig. 4a) shows the values of the residuals with a linear pattern indicating normality; additionally, the Kolmogorov-Smirnov value and its associated p-value confirm it (p-value > 0.15). The Durbin-Watson independence test gave a value of 1.979 for model 2, which indicates that the residuals are not correlated. The homoscedasticity test presented in Fig. 4b shows a behavior of the residuals that does not fit any known pattern; therefore, this assumption is adequately met.
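As a worked check, the sums of squares reported for model 2 in Table 3 can be run through Eqs. 3 and 4, and Eq. 12 can be applied to a municipality. The sketch below is illustrative only: the function names are hypothetical, the sample size n = 111 is inferred from the total of 110 degrees of freedom in Table 3, and the population figures in the forecast example are invented rather than taken from the study database. The forecast applies the inverse of the natural logarithm to return to tons/day.

```python
import math

def f_statistic(ss_yy, sse, n, k):
    """Eq. 3: F = [(SS_YY - SSE) / k] / [SSE / (n - (k + 1))]."""
    return ((ss_yy - sse) / k) / (sse / (n - (k + 1)))

def r2_adjusted(ss_yy, sse, n, k):
    """Eq. 4, using R^2 = 1 - SSE / SS_YY."""
    r2 = 1.0 - sse / ss_yy
    return 1.0 - (n - 1) / (n - (k + 1)) * (1.0 - r2)

# Model 2 in Table 3: total df = 110, so n = 111 (inferred); k = 2 predictors.
n, k = 111, 2
ss_yy, sse = 154.297, 3.748
f_model2 = f_statistic(ss_yy, sse, n, k)
r2_adj_model2 = r2_adjusted(ss_yy, sse, n, k)

def forecast_model2(x_pop, x_pbam):
    """Eq. 12 followed by the inverse log transform; returns MSW in tons/day."""
    ln_y = -8.86 + 1.11 * math.log(x_pop) + 0.0658 * math.log(x_pbam)
    return math.exp(ln_y)

# Hypothetical municipality: 50,000 inhabitants, 2,000 of them born elsewhere.
example_tons_per_day = forecast_model2(50_000, 2_000)

print("F (model 2):", round(f_model2, 2))
print("adjusted R^2 (model 2):", round(r2_adj_model2, 3))
print("forecast for the hypothetical municipality:", round(example_tons_per_day, 1), "tons/day")
```

The recomputed F differs slightly from the tabulated value because the table entries are rounded.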
Fig. 4: Verification of model assumptions: (a) Normality of residuals, (b) Independence of residuals

On the other hand, the $R^2$ value of the equations in both models was 0.976, which indicates that 97.6% of the variability of the generation rate of MSW, YGen (transformed), can be explained by the explanatory variables used. It is important to note the slight decrease of $R^2_{adj}$ (0.975) with respect to $R^2$, which is due to the adjustment for the introduction of 3 and 2 variables in models 1 and 2, respectively. The high value of $R^2$ and $R^2_{adj}$ in these models is due to the initial transformation of the explanatory variables, as well as of the response variable. Additionally, the data collection carried out in this study influenced these values because the data come from a census database, and not from an information survey through interviews. The internal validation of model 2 through MAPE, MAD and RMSE showed values of 7.70, 0.16 and 0.19, respectively, which indicates a high precision since the values of these measures are close to 0 (zero). The external validation by $R^2_{jackknife}$ presented a value of 97.44%. Therefore, model 2 also has a high forecasting capacity.

Non-significant variables

The analysis of the 8 explanatory variables using the VIF test produced the initial elimination of the variables XAs, ln-XHgs, ln-XCes and ln-XDpi, since their value was higher than the cut-off of 4. The variables XAs and ln-XDpi have been used mainly in studies at household or locality levels (Khan et al., 2016; Ojeda et al., 2008), but in this paper they were used at district level, and the effect of these variables seems not to be important (low correlation with the response variable ln-YGen). The variables ln-XHgs and ln-XCes were eliminated because they are highly correlated with XMi, since the latter is a multidimensional indicator that measures deprivation in a population through variables similar to those eliminated. Finally, through the r test, only XMi was eliminated, since it was highly correlated with ln-XPbam, with a coefficient of -0.695, i.e., beyond the cut-off value of 0.6 (positive or negative); additionally, this variable was less correlated with the ln-YGen response variable (Table 4).

Table 4: Correlation matrix of variables (Pearson's correlation)

| | ln-YGen | ln-XPop | ln-XPd | ln-XPbam |
|---|---|---|---|---|
| ln-XPop | 0.985 | --- | --- | --- |
| ln-XPd | 0.161 | 0.169 | --- | --- |
| ln-XPbam | 0.638 | 0.573 | -0.059 | --- |
| XMi | -0.355 | -0.271 | -0.097 | -0.695 |

Significant variables

The transformed variables XPop, XPd and XPbam were used in the search for the best model, since their VIF and r values were below the cut-off values. XPop has been used in the studies of Azadi and Karimí (2016) and Abdoli et al. (2011) as the most important explanatory variable. In this study, its Pearson's correlation r-value was 0.985, which indicates that it is also the variable most related to the generation of waste, in a positive way, i.e., a larger population corresponds to a greater quantity of MSW. The variable XPd has been used in few publications.
Bel and Mur (2009) also use this variable to obtain the costs associated with waste management. In this study, an r-value of 0.161 was obtained, which indicates a poor correlation with the response variable. The analysis of forecast model 1 indicated that this variable is not statistically significant, and its use must be taken with caution. The XPbam variable is positively related to the response variable. Its Pearson's correlation coefficient was 0.638. This variable is important in the study area, since it can be concluded that people who move from one municipality to another have different consumption patterns that modify the amounts of MSW. Other explanatory variables mentioned in Kolekar et al. (2016), for instance age, employment status, level of urbanization and environmental variables such as precipitation or temperature, were not used in this study since it is difficult to find a database with information on these variables.

Other generated models

Eqs. 13, 14 and 15 show other models generated with the variables initially raised (models 3, 4 and 5, respectively). All these models are statistically significant and are also useful for forecasting purposes, but they incorporate explanatory variables that are not significant; therefore, their results are not accurate (Table 5). Additionally, they have low parsimony because they incorporate more than 2 or 3 explanatory variables. Table 6 shows the statistical behavior of the predictors. The p-value and the VIF must be analyzed because they indicate multicollinearity between the variables and also their possible interpretation within the generated model.

$$\ln Y_{Gen} = -8.48 + 1.13 \ln X_{Pop} - 0.0019 \ln X_{Pd} + 0.0315 \ln X_{Pbam} - 0.0111 X_{Mi} \tag{13}$$

$$\ln Y_{Gen} = -8.42 + 1.13 \ln X_{Pop} - 0.0008 \ln X_{Pd} + 0.0324 \ln X_{Pbam} - 0.0070 X_{As} - 0.0119 X_{Mi} \tag{14}$$

$$\ln Y_{Gen} = -8.22 + 0.995 \ln X_{Pop} + 0.0039 \ln X_{Pd} + 0.0261 \ln X_{Pbam} - 0.0006 X_{As} + 0.133 \ln X_{Hgs} - 0.00525 X_{Mi} \tag{15}$$

Table 5: Analysis of variance of other generated models

| Model | Source | Degrees of freedom (df) | Sum of squares | Mean square | F | Sig. | R²adj |
|---|---|---|---|---|---|---|---|
| 3 | Regression | 4 | 150.900 | 37.725 | 1,177.44 | 0.000 | 0.977 |
| | Residual | 106 | 3.396 | 0.032 | | | |
| | Total | 110 | 154.297 | | | | |
| 4 | Regression | 5 | 150.902 | 30.180 | 933.42 | 0.000 | 0.977 |
| | Residual | 105 | 3.395 | 0.032 | | | |
| | Total | 110 | 154.297 | | | | |
| 5 | Regression | 6 | 151.397 | 25.233 | 905.01 | 0.000 | 0.980 |
| | Residual | 104 | 2.900 | 0.028 | | | |
| | Total | 110 | 154.297 | | | | |

Table 6: Statistical behavior of predictors in models 3, 4 and 5

| Model | Predictor | Coefficient | p-value | VIF | Comments |
|---|---|---|---|---|---|
| 3 | Constant | -8.48 | 0.000 | --- | There is no multicollinearity among predictors, but some of them are not statistically significant (p-value > 0.05). |
| | ln-XPop | 1.13 | 0.000 | 1.947 | |
| | ln-XPd | -0.0019 | 0.937 | 1.300 | |
| | ln-XPbam | 0.0315 | 0.053 | 3.556 | |
| | XMi | -0.0111 | 0.002 | 2.405 | |
| 4 | Constant | -8.42 | 0.000 | --- | There is multicollinearity between predictors (VIF > 5) and some of them are not statistically significant (p-value > 0.05). |
| | ln-XPop | 1.13 | 0.000 | 1.948 | |
| | ln-XPd | -0.0008 | 0.975 | 1.374 | |
| | ln-XPbam | 0.0324 | 0.055 | 3.789 | |
| | XAs | -0.0070 | 0.844 | 5.373 | |
| | XMi | -0.0119 | 0.026 | 5.177 | |
| 5 | Constant | -8.22 | 0.000 | --- | |
| | ln-XPop | 0.995 | 0.000 | 5.634 | |
| | ln-XPd | 0.0039 | 0.869 | 1.377 | |
| | ln-XPbam | 0.0261 | 0.096 | 3.824 | |
| | XAs | -0.0006 | 0.986 | 5.384 | |
| | ln-XHgs | 0.133 | 0.000 | 6.180 | |
| | XMi | -0.0525 | 0.309 | 5.710 | |

Inferences about the municipalities of the study area

Based on model 2 and its statistical analysis, inferences were made to forecast the generation rate of MSW in the municipalities of the CCS. The forecast was made with the most current data of the variables XPop and XPbam, corresponding to the year 2015. Fig. 5 shows the MSW generation forecast and its comparison with the original database. In most of the municipalities of the CCS, the generation rate of MSW presented a gradual increase with respect to population growth (variable XPop), except in Arriaga, Chiapilla, Osumacinta, Suchiapa, Teopisca, Tonalá, Venustiano Carranza and Villaflores, because the population of these municipalities did not increase in the 2010-2015 period.

Fig. 5: Forecast of MSW generation rates in the municipalities of the study area

Currently, the study area generates 1,600 tons of MSW/day, of which 74% comes from regional heads such as Berriozábal, Ocozocoautla de Espinosa, San Cristóbal de las Casas, Tuxtla Gutiérrez and Villaflores.

CONCLUSION

In this study, a forecast model was developed to determine the generation of MSW in the municipalities of the CCS, Chiapas State, Mexico. An MLR model with social and demographic explanatory variables was used to obtain the forecast. Two forecast models were presented and analyzed, with variables that met the multicollinearity test. The most important variables to predict the rate of MSW generation in the study area were the population of each municipality (XPop), the population born in another municipality (XPbam) and the population density (XPd). XPop is the most influential explanatory variable of waste generation, and it is related to it in a positive way. XPbam is less related to waste generation. XPd is the variable that least influences waste generation prediction; in addition, it can present problems of correlation with other explanatory variables.
Although other variables, such as daily per capita income (XDpi) and average schooling (XAs), are very important, they do not seem to have an effect on the response variable in this study. The user of this forecast model should use model 2, since it is the one with the highest parsimony (it uses fewer variables); its $R^2_{adj}$, MAPE, MAD and RMSE values indicated high influence on the explained phenomenon and high forecasting capacity. Additionally, it is important to mention that when using the proposed models for forecasting purposes, it is necessary to transform the explanatory variables and to back-transform the response variable (using the inverse of the natural logarithm). The inferences made on the municipalities of the study area showed that, except in some municipalities, the MSW generation rate usually presented a gradual increase with respect to population growth and with respect to the number of inhabitants that were born in another entity (migration). Finally, this study can be a solid basis for comparison for future research in the area of study. It is possible to use different mathematical models such as artificial neural networks, principal component analysis, time-series analysis, etc., and compare the response variable or the predictors.

ACKNOWLEDGEMENT

This study was supported by the Project Support Program for Research and Technological Innovation [Project UNAM DGAPA-PAPIIT IN105516]. We also appreciate the support of the Mexican National Council for Science and Technology (CONACYT) to carry out this work.

CONFLICT OF INTEREST

The authors declare that there is no conflict of interests regarding the publication of this manuscript. In addition, the ethical issues, including plagiarism, informed consent, misconduct, data fabrication and/or falsification, double publication and/or submission, and redundancy have been completely observed by the authors.

ABBREVIATIONS

α    Level of significance
$A_t$    Observed value
βi (1, 2, 3 ... k)    Regression coefficients
CCS    Cuenca del Cañón del Sumidero
d    Durbin-Watson test
Eq.    Equation
F    Fisher test
$F_t$    Predicted value
H0    Null hypothesis
k    Number of explanatory variables included in the model
ln-YGen    Natural logarithm of MSW generation
ln-XPop    Natural logarithm of population
ln-XPd    Natural logarithm of population density
ln-XPbam    Natural logarithm of population born in another municipality
ln-XHgs    Natural logarithm of household with goods and services
ln-XCes    Natural logarithm of commercial establishments and services
ln-XDpi    Natural logarithm of daily per capita income
MAD    Mean Absolute Deviation
MAPE    Mean Absolute Percentage Error
MLR    Multiple Linear Regression
MSW    Municipal Solid Waste
n    Sample size
p-p plot    Probability-probability plot
p-value    Probability value
r    Pearson correlation coefficient
r-value    Pearson correlation coefficient
$R^2$    Coefficient of determination
$R^2_{adj}$    Adjusted coefficient of determination
$R^2_{jackknife}$    Jackknife coefficient of determination
RMSE    Root Mean Square Error
SSE    Sum of Squared Errors
$SS_{YY}$    Sum of the squares of the difference of ($Y_i$) and ($\bar{Y}$)
VIF    Variance Inflation Factor
Xi (1, 2, 3 ... k)    Explanatory variables
XAs    Average schooling
XCes    Commercial establishments and services
XDpi    Daily per capita income
XHgs    Household with goods and services
XMi    Marginalization index
XPbam    Population born in another municipality
XPd    Population density
XPop    Population
$\bar{Y}$    Average of observed data
YGen    MSW generation
$Y_i$    Value of each individual observation
$\hat{Y}_i$    Predicted value

REFERENCES

Abdoli, M.; Falahnezhad, M.; Behboudian, S., (2011). Multivariate econometric approach for solid waste generation modeling: a case study of Mashhad, Iran. Environ. Eng. Sci., 28(9): 627-633 (7 pages).
https://www.liebertpub.com/doi/abs/10.1089/ees.2010.0234

Agirre, E.; Ibarra, G.; Madariaga, I., (2006). Regression and multilayer perceptron-based models to forecast hourly O3 and NO2 levels in the Bilbao area. Environ. Modell. Software, 21(4): 430-446 (17 pages).
https://dl.acm.org/citation.cfm?id=1707340

Alvarado, H.; Nájera, H.; González, F.; Palacios, R., (2009). Study of generation and characterization of household solid waste in the municipal seat of Chiapa de Corzo, Chiapas, Mexico. Lacandonia J., 3: 85-92 (8 pages).
https://cuid.unicach.mx/revistas/index.php/lacandonia/article/view/156

Araiza, J.; López, C.; Ramírez, N., (2015). Municipal solid waste management: case study in Las Margaritas, Chiapas. AIDIS J. Eng. Environ. Sci.: Res. Develop. Pract., 8(3): 299-311 (13 pages).
http://www.revistas.unam.mx/index.php/aidis/article/view/53489

Arriaza, M., (2006). Practical guide to data analysis. Junta de Andalucía, Ministry of Innovation, Science and Business, Institute of Agricultural Research and Training and Fishing, Spain.
https://www.juntadeandalucia.es/agriculturaypesca/ifapa/servifapa/contenidoAlf?id=1c141ba8-08cc-42fc-9df7-10a632eb3194

Azadi, S.; Karimí, A., (2016). Verifying the performance of artificial neural network and multiple linear regression in predicting the mean seasonal municipal solid waste generation rate: a case study of Fars province, Iran. Waste Manage., 48: 14-23 (10 pages).
https://www.ncbi.nlm.nih.gov/pubmed/26482809

Beigl, P.; Lebersorger, S.; Salhofer, S., (2008). Modelling municipal solid waste generation: a review. Waste Manage., 28(1): 200-214 (15 pages).
https://www.ncbi.nlm.nih.gov/pubmed/17336051

Bel, G.; Mur, M., (2009). Intermunicipal cooperation, privatization and waste management costs: evidence from rural municipalities. Waste Manage., 29(10): 2772-2778 (7 pages).
https://www.ncbi.nlm.nih.gov/pubmed/19556117

Buenrostro, O.; Bocco, G.; Vence, J., (2001). Forecasting generation of urban solid waste in developing countries - a case study in Mexico. J. Air Waste Manage., 51(1): 86-93 (8 pages).
https://www.ncbi.nlm.nih.gov/pubmed/11218430

Chang, Y.; Lin, C.; Chyan, J.; Chen, I.; Chang, J., (2007). Multiple regression models for the lower heating value of municipal solid waste in Taiwan. J. Environ. Manage., 85(4): 891-899 (9 pages).
https://www.ncbi.nlm.nih.gov/pubmed/17234326

Chhay, L.; Reyad, M.; Suy, R.; Islam, M.; Mian, M., (2018). Municipal solid waste generation in China: influencing factor analysis and multi-model forecasting. J. Mater. Cycles Waste Manage., 20(3): 1761-1770 (10 pages).
https://link.springer.com/article/10.1007/s10163-018-0743-4

CONAGUA, (2009). National Water Commission. Integrated management plan for the Cuenca del Cañón del Sumidero. Chiapas, México.

CONAPO, (2017). National Population Council. Municipal Marginalization Index.
http://www.conapo.gob.mx/es/CONAPO/Datos_Abiertos_del_Indice_de_Marginacion
Accessed 10 June 2017.

Ghinea, C.; Niculina, E.; Comanita, E.; Gavrilescu, M.; Campean, T.; Curteanu, S.; Gavrilescu, M., (2016). Forecasting municipal solid waste generation using prognostic tools and regression analysis. J. Environ. Manage., 182: 80-93 (14 pages).
https://www.ncbi.nlm.nih.gov/pubmed/27454099

Grazhdani, D., (2016). Assessing the variables affecting on the rate of solid waste generation and recycling: an empirical analysis in Prespa Park. Waste Manage., 48: 3-13 (11 pages).
https://www.ncbi.nlm.nih.gov/pubmed/26482808

INEGI, (2010). Population and housing census 2010: interactive data query. National Institute of Statistic and Geography.
https://www.inegi.org.mx/programas/ccpv/2010/default.html

Intharathirat, R.; Salam, P.; Kumar, S.; Untong, A., (2015). Forecasting of municipal solid waste quantity in a developing country using multivariate grey model. Waste Manage., 39: 3-14 (12 pages).
https://www.ncbi.nlm.nih.gov/pubmed/25704925

Kannangara, M.; Dua, R.; Ahmadi, L.; Bensebaa, F., (2018). Modeling and prediction of regional municipal solid waste generation and diversion in Canada using machine learning approaches. Waste Manage., 74: 3-15 (13 pages).
https://www.ncbi.nlm.nih.gov/pubmed/29221873

Khan, D.; Kumar, A.; Samadder, S., (2016). Impact of socioeconomic status on municipal solid waste generation rate. Waste Manage., 49: 15-25 (11 pages).
https://www.ncbi.nlm.nih.gov/pubmed/26831564

Keser, S.; Duzgun, S.; Aksoy, A., (2012). Application of spatial and non-spatial data analysis in determination of the factors that impact municipal solid waste generation rates in Turkey. Waste Manage., 32(3): 359-371 (13 pages).
https://www.ncbi.nlm.nih.gov/pubmed/22104614

Kolekar, K.; Hazra, T.; Chakrabarty, S., (2016). A review on prediction of municipal solid waste generation models. Procedia Environ. Sci., 35: 238-244 (7 pages).
https://www.sciencedirect.com/science/article/pii/S1878029616301761

Kumar, A.; Samandder, S., (2017). An empirical model for prediction of household solid waste generation rate – a case study of Dhanbad, India. Waste Manage., 68: 3-15 (13 pages).
https://www.ncbi.nlm.nih.gov/pubmed/28757221

Liu, C.; Wu, X., (2010). Factors influencing municipal solid waste generation in China: a multiple statistical analysis study. Waste Manage. Res., 29(4): 371-378 (8 pages).
https://www.ncbi.nlm.nih.gov/pubmed/20699292

Liu, J.; Li, Q.; Gu, W.; Wang, C., (2019). The impact of consumption patterns on the generation of municipal solid waste in China: evidences from provincial data. Int. J. Environ. Res. Public Health, 16(10): 1-19 (19 pages).
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6573004/

Mahmood, S.; Sharif, F.; Rahman, A.U.; Khan, A.U., (2018). Analysis and forecasting of municipal solid waste in Nankana City using geo-spatial techniques. Environ. Monit. Assess., 190(5): 1-14 (14 pages).
https://www.ncbi.nlm.nih.gov/pubmed/29644486

Márquez, M.; Ojeda, S.; Hidalgo, H., (2008). Identification of behavior patterns in household solid waste generation in Mexicali's City: study case. Resour. Conserv. Recycl., 52(11): 1299-1306 (8 pages).
https://www.sciencedirect.com/science/article/pii/S0921344908001146

Mendenhall, W.; Sincich, T., (2012). A second course in statistics: regression analysis. Prentice Hall, United States of America.
https://www.amazon.com/Second-Course-Statistics-Regression-Analysis/dp/0321691695

OECD, (2015). Measuring well-being in Mexican States. Organization for Economic Cooperation and Development. OECD Publishing, Paris, France.
http://www.oecd.org/regional/measuring-well-being-in-mexican-states-9789264246072-en.htm

Ojeda, S.; Lozano, G.; Morelos, R.; Armijo, C., (2008). Mathematical modeling to predict residential solid waste generation. Waste Manage., 28(1): S7-S13 (7 pages).
https://www.ncbi.nlm.nih.gov/pubmed/18583125

Pan, A.; Yu, L.; Yang, Q., (2019). Characteristics and forecasting of municipal solid waste generation in China. Sustainability, 11(5): 1-11 (11 pages).
https://www.mdpi.com/2071-1050/11/5/1433

Pires, J.; Martins, F.; Sousa, S.; Alvim, M.; Pereira, M., (2008). Selection and validation of parameters in multiple linear and principal component regressions. Environ. Modell. Softw., 23(1): 50-55 (6 pages).
https://dl.acm.org/citation.cfm?id=1290535.1290558

Rodríguez, M., (2004). Design of a mathematical model of municipal solid waste generation in Nicolás Romero, Mexico. Master's Thesis, National Polytechnic Institute, Mexico.
https://tesis.ipn.mx/handle/123456789/1617

Rybová, K.; Slavik, J.; Burcin, B.; Soukopová, J.; Kučera, T.; Černíková, A., (2018). Socio-demographic determinants of municipal waste generation: case study of the Czech Republic. J. Mater. Cycles Waste Manage., 20(3): 1884-1891 (8 pages).
https://link.springer.com/article/10.1007/s10163-018-0734-5

SEMAHN, Secretariat of the Environment and Natural History, (2013). State program for the prevention and integral management of municipal solid waste and special waste in the state of Chiapas.
https://www.gob.mx/cms/uploads/attachment/file/187451/Chiapas.pdf

Shan, C., (2010). Projecting municipal solid waste: the case of Hong Kong SAR. Resour. Conserv. Recycl., 54(11): 759-768 (10 pages).
https://www.sciencedirect.com/science/article/pii/S0921344909002730

Soni, U.; Roy, A.; Verma, A.; Jain, V., (2019). Forecasting municipal solid waste generation using artificial intelligence models – a case study in India. SN Appl. Sci., 1(2): 1-11 (11 pages).
https://link.springer.com/article/10.1007/s42452-018-0157-x
HIGHLIGHTS

• The predictive model developed was constructed using multiple linear regression, with explanatory social and demographic variables that met the multicollinearity test;
• The results of this work show that the variables most influential on waste generation are directly related to the number of inhabitants, such as population, population density and population born in another entity;
• The suggested predictive model has a high parsimony; in addition, the adjusted coefficient of determination and the accuracy coefficients indicated high influence on the explained phenomenon and a high forecasting capacity.